MODERN MATHEMATICS

TITLES AND AUTHORS

I. THE FOUNDATIONS OF GEOMETRY. By OSWALD VEBLEN, Ph.D., Professor of Mathematics in Princeton University.
II. MODERN PURE GEOMETRY. By THOMAS F. HOLGATE, Ph.D., LL.D., Professor of Mathematics in Northwestern University.
III. NON-EUCLIDEAN GEOMETRY. By FREDERICK S. WOODS, Ph.D., Professor of Mathematics in the Massachusetts Institute of Technology.
IV. THE FUNDAMENTAL PROPOSITIONS OF ALGEBRA. By EDWARD V. HUNTINGTON, Ph.D., Assistant Professor of Mathematics in Harvard University.
V. THE ALGEBRAIC EQUATION. By G. A. MILLER, Ph.D., Professor of Mathematics in the University of Illinois.
VI. THE FUNCTION CONCEPT AND THE FUNDAMENTAL NOTIONS OF THE CALCULUS. By GILBERT AMES BLISS, Ph.D., Associate Professor of Mathematics in the University of Chicago.
VII. THE THEORY OF NUMBERS. By J. W. A. YOUNG, Ph.D., Associate Professor of the Pedagogy of Mathematics in the University of Chicago.
VIII. CONSTRUCTIONS WITH RULER AND COMPASSES; REGULAR POLYGONS. By L. E. DICKSON, Ph.D., Professor of Mathematics in the University of Chicago.
IX. THE HISTORY AND TRANSCENDENCE OF π. By DAVID EUGENE SMITH, Ph.D., LL.D., Professor of Mathematics in Teachers College, Columbia University.

MONOGRAPHS ON TOPICS OF MODERN MATHEMATICS RELEVANT TO THE ELEMENTARY FIELD

EDITED BY J. W. A. YOUNG

LONGMANS, GREEN, AND CO.
FOURTH AVENUE & 30TH STREET, NEW YORK
LONDON, BOMBAY, AND CALCUTTA
1911

COPYRIGHT, 1911, BY LONGMANS, GREEN, & CO.

THE SCIENTIFIC PRESS
ROBERT DRUMMOND AND COMPANY
BROOKLYN, N. Y.

EDITOR'S PREFACE

The purpose of this collection of monographs may be indicated by the following citation from the letter that was sent to those who were requested to act as authors.

"Among the various publications on mathematics that are being made, it would seem that there is room for a serious effort to bring within reach of secondary teachers (in service or in training), college students, and others at a like stage of mathematical advancement, a scientific treatment of some of the regions of advanced mathematics that have points of contact with the elementary field. Undoubtedly one of the most crying needs of our secondary instruction in mathematics to-day, is that the scientific attainments of the teachers be enlarged and their mathematical horizon widened; and I believe that there is a large body of earnest teachers and students that are eager to extend their mathematical knowledge if the path can be made plain and feasible for them."

"A volume of monographs dealing with selected topics of higher mathematics might well be a useful contribution to the meeting of this need. Such monographs would aim to bring the reader into touch with some characteristic results and viewpoints of the topics considered, and to point out their bearing on elementary mathematics. They would therefore contain:

(1) A considerable body of results proved in full, so that the reader can materially extend his mathematical acquisitions by the reading of the monograph alone.
(2) Statement without proof of some leading methods and results, so as to give a bird's-eye view of the subject.
(3) A small number of references indicating what the reader may profitably take up after he has mastered the contents of the monograph."

Both the plan itself, and the invitation to act as author, were most cordially received; work on the monographs was promptly begun, has been carried through substantially as planned, and the results are presented herewith. The manuscripts have, whenever feasible, been read carefully by at least one collaborator other than myself, and in consequence various questions and suggestions have been submitted to the authors and acted upon by them.
Each author, however, retains sole responsibility for his monograph as it now appears. No attempt has been made to secure uniformity in style of treatment; each monograph is an independent unit, that can be read without reference to the others.

The amount of technical mathematical knowledge that is presupposed on the part of the reader varies with the different subjects. A large part of the book presupposes only knowledge of elementary geometry and algebra, together with a certain measure of mathematical maturity. On the other hand, there is much that will repay careful and detailed study by advanced students. So far as the subject-matter permits, the less difficult topics are taken up first in each monograph.

J. W. A. YOUNG.

CONTENTS

* A fuller Table of Contents precedes the Monograph itself.

I. THE FOUNDATIONS OF GEOMETRY
Introduction - The Assumption of Order - Order on a Line - The Triangle and the Plane - Regions in a Plane - Congruence of Point Pairs - Congruence of Angles - Intersections of Circles - Parallel Lines - Mensuration - Three-Dimensional Space - Conclusion.

II. MODERN PURE GEOMETRY*
Introduction - Simple Elements in Geometry - The Principle of Duality - Principle of Continuity - Points at Infinity - Fundamental Theorem - Metric Properties - Anharmonic Ratios - Elementary Geometric Forms - Correlation of Elementary Forms - Curves and Sheaves of Rays of the Second Order - Pascal's and Brianchon's Theorems - Pole and Polar Theory - Conclusion.

III. NON-EUCLIDEAN GEOMETRY
Introduction - Parallel Lines - The Euclidean Assumption - The Lobachevskian Assumption - The Riemannian Assumption - The Sum of the Angles of a Triangle - Areas - Non-Euclidean Trigonometry - Non-Euclidean Analytic Geometry - Representation of the Lobachevskian Geometry on a Euclidean Plane - Relation between Projective and Non-Euclidean Geometry - The Element of Arc.

IV. THE FUNDAMENTAL PROPOSITIONS OF ALGEBRA*
Introduction - The Addition of Angles and the Multiplication of Distances - The Abstract Theory of these Operations - Geometric Example of the Algebra of Complex Quantities: The System of Points in the Plane - The Abstract Theory of the Algebra of Complex Quantities - Appendix: Other Examples of the Algebra of Complex Quantities - Geometric Proof that Every Algebraic Equation has a Root.

V. THE ALGEBRAIC EQUATION*
General Introduction - Historical Sketch and Definitions - Equations with One Unknown and with Literal Coefficients - Equations with One Unknown and with Numerical Coefficients - Simultaneous Equations - A Few References.

VI. THE FUNCTION CONCEPT AND THE FUNDAMENTAL NOTIONS OF THE CALCULUS*
Introduction - Variables and Functions - The Fundamental Notions of the Calculus.

VII. THE THEORY OF NUMBERS*
Introduction - Factors - Diophantine Equations - Congruences - Binomial Congruences - Quadratic Congruences - Bibliography.

VIII. CONSTRUCTIONS WITH RULER AND COMPASSES; REGULAR POLYGONS
Introduction - Analytic Criterion for Constructibility - Graphical Solution of a Quadratic Equation - Domain of Rationality - Functions Involving no Irrationalities other than Square Root - Reducible and Irreducible Functions - Fundamental Theorem; Duplication of the Cube; Trisection of an Angle; Quadrature of the Circle - Connection between Regular Polygons and Roots of Unity - De Moivre's Theorem - Regular Pentagon and Decagon - Regular Polygon of 17 Sides - Construction of the Regular Polygon of 17 Sides - Gauss's Theory of Regular Polygons - Primitive Roots of Unity - Gauss's Lemma - Irreducibility of the Cyclotomic Equation - Proofs of Theorems Cited Earlier - References.

IX. THE HISTORY AND TRANSCENDENCE OF π
The Nature of the Problem - The History of the Problem - The Transcendence of e - The Transcendence of π.

I

THE FOUNDATIONS OF GEOMETRY*

By OSWALD VEBLEN

CONTENTS

I. INTRODUCTION
II. THE ASSUMPTION OF ORDER
III. ORDER ON A LINE
IV. THE TRIANGLE AND THE PLANE
V. REGIONS IN A PLANE
VI. CONGRUENCE OF POINT PAIRS
VII. CONGRUENCE OF ANGLES
VIII. INTERSECTIONS OF CIRCLES
IX. PARALLEL LINES
X. MENSURATION
XI. THREE-DIMENSIONAL SPACE
XII. CONCLUSION

I. INTRODUCTION

In connection with the foundations of geometry there arise many questions of psychology, logic and epistemology. Into these the present paper does not enter. Instead we propose to write out the preliminary pages of a geometry such as Euclid might be imagined to write to-day. The resulting treatment of geometry as a whole will not be very different from that actually written by Euclid. We shall, however, go into detail only with those parts of the subject in which the modern exposition is essentially different from the ancient. That there are such differences is not because Euclid's logical methods and purposes were different from those of the modern mathematical students of foundations. Euclid overlooked certain assumptions that entered tacitly into his arguments, but this was by mistake. His purpose was the same as that of the moderns, to prove every proposition which he could prove, and to prove it with a minimum of assumptions. This required him often to prove statements that were intuitively evident. Thus an axiom might be a self-evident truth, but certainly all self-evident truths were not axioms according to the usage of Euclid.

* This essay is based mainly on two articles in the Transactions of the American Mathematical Society. The first one is by the present writer (Vol. V [1904], pp. 343-84) and the second by Dr. R. L. Moore (Vol. IX [1908], pp. 487-512). I have modified my assumptions in accordance with a suggestion of Dr. Moore's and have also changed the form of his assumptions in some respects. The literature is too large to cite in detail. We shall be content to mention the names of the following European contributors to the subject: Pasch, Veronese, Peano, Pieri, Schur, Hilbert, Dehn; and the following works in the English language:
Hilbert (tr. by Townsend), Foundations of Geometry, Chicago.
E. H. Moore, On the Projective Axioms of Geometry, Trans. Am. Math. Soc., Vol. III (1902), pp. 142-58.
Halsted, Rational Geometry, New York.
Whitehead, The Axioms of Descriptive Geometry, Cambridge, 1907.
Coolidge, Non-Euclidean Geometry, Oxford, 1909.
Schweitzer, A Theory of Geometrical Relations, American Journal of Mathematics, Vol. XXXI (1909), pp. 365-410.

In geometry a great many technical terms are defined, and each is defined in terms of other terms. Hence at the beginning of a book on geometry at least one term must be undefined; otherwise the book would have no beginning. We shall leave undefined the term point. This implies that the reader is free to carry in his mind any image of a point which he can reconcile with what is said about it. We may try to impart a notion of our image of a point by saying it has no length, breadth, or thickness, or by like phrases, but these are no part of our book on geometry; they have nothing to do with the logical steps by which the theorems are derived.
If the propositions of geometry are arranged in logical order so that each proposition after a start has been made shall follow by deduction from its predecessors, it is clear that the first propositions of all cannot be deduced, because there are no previous propositions to deduce them from. There must therefore be assumptions. These may be stated so plausibly that no one doubts their truth,* but whether they are true or not cannot affect the correctness of the reasoning based upon them, nor the fact that they are assumptions. We shall not enter into the metaphysical question as to whether these assumptions are self-evident truths, axioms, common notions, experimental data or what not, but shall try to keep within the realm of mathematics by using the non-committal word assumptions.

* The writer is inclined to believe that the truth of a statement can be determined only by testing all its consequences, so that the real test of the validity of the hypotheses of geometry is in the validity of the theorems.

In addition to the word point, we shall take as undefined a relation among points which we indicate by saying "the points ABC are in the order {ABC}." This relation may mean anything the reader desires, provided it is consistent with the following statements. These assumptions were all used implicitly in the older geometries, as well as in most text-books of to-day, but have not been formulated explicitly as part of the foundations of geometry until very recent times.

II. THE ASSUMPTIONS OF ORDER

Assumption I. If points A, B, C are in the order {ABC} they are distinct.

[FIG. 1]

Assumption II. If points A, B, C are in the order {ABC} they are not in the order {BCA}.

[FIG. 2]

Definitions. If A and B are distinct points the line AB consists of A and B and all points, X, in one of the orders {ABX}, {AXB}, {XAB}. The points X in the order {AXB} constitute the linear segment AB, and are said to be between A and B.
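The relation {ABC} is undefined, so any reading that satisfies the assumptions will do. As an illustration only, and not as part of the deductive development, the following sketch assumes that points are pairs of integers in the coordinate plane and reads {ABC} as "B lies strictly inside the straight segment from A to C." The names `in_order`, `on_line`, and `on_segment` are hypothetical helpers introduced here; the check of Assumptions I and II runs over a small sample of points and is not a proof.

```python
from itertools import product


def cross(a, b, c):
    """Twice the signed area of triangle abc; zero exactly when a, b, c are collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_order(a, b, c):
    """Assumed model of {abc}: b lies strictly between a and c on a common straight line."""
    if cross(a, b, c) != 0:
        return False
    # The vectors from b to a and from b to c are opposite exactly when their dot product is negative.
    return (a[0] - b[0]) * (c[0] - b[0]) + (a[1] - b[1]) * (c[1] - b[1]) < 0


def on_line(a, b, x):
    """Definition of the line ab: a, b, and every x in one of the orders {abx}, {axb}, {xab}."""
    return x in (a, b) or in_order(a, b, x) or in_order(a, x, b) or in_order(x, a, b)


def on_segment(a, b, x):
    """Definition of the segment ab: the points x in the order {axb}."""
    return in_order(a, x, b)


# Brute-force check of Assumptions I and II on a 3-by-3 grid of sample points.
grid = list(product(range(3), repeat=2))
ordered = [(a, b, c) for a in grid for b in grid for c in grid if in_order(a, b, c)]
assumption_1 = all(len({a, b, c}) == 3 for a, b, c in ordered)
assumption_2 = all(not in_order(b, c, a) for a, b, c in ordered)
print(assumption_1, assumption_2)
```

In this model the point (3, 3) is on the line through (0, 0) and (2, 2) but not on the segment between them, which matches the distinction the definitions draw.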
A and B are called the ends of the segment. The segment, together with its ends, constitutes a linear interval.

Assumption III. If points C and D (C ≠ D)* are in the line AB, then A is in the line CD.

* The notation A ≠ B indicates that A and B are symbols for different objects.

Assumption IV. If A and B are two distinct points, there exists a point C such that A, B, and C are in the order {ABC}.

Assumption V. If three distinct points, A, B, and C do not lie on the same line and D and E are two points in the orders {BCD} and {CEA}, then a point F exists in the order {AFB} and such that D, E and F lie on the same line.

[FIG. 3]

Assumption VI. There exist three distinct points, A, B, C, not in any of the orders {ABC}, {BCA}, {CAB}.

Theorem 1. If points A, B, C are in the order {ABC} they are in the order {CBA}, and not in any of the orders {CAB}, {BAC}, {ACB}, {BCA}.

Proof. From the definition of a line, A is on the line BC. By Assumption I, C and A are distinct. Hence by Assumption III, B is on the line CA. This means, since B is distinct from C and A, that there is one of the orders {CAB}, {CBA}, {BCA}. But Assumption II states that {BCA} is impossible, and if we had {CAB} it would follow by Assumption II that we did not have {ABC}. Hence {ABC} implies {CBA} and excludes {BCA} and {CAB}. By what we have just proved, if we had {BAC}, we should also have {CAB}. Hence {BAC} is eliminated. Since {ACB} would imply {BCA}, {ACB} is also excluded.

Corollary 1. If A and B are distinct points the line AB is the same as the line BA, and the segment AB is the same as the segment BA.

Corollary 2. If points A, B, C are in the order {ABC} then they are all on the lines AB, BC, CA.*

* The lines AB, BC, CA are proved identical in Theorem 2.

Theorem 2.† For every two distinct points there is one and only one line containing them.

† Cf. Euclid, Postulates 1, 2.

Proof. Let A and B be two distinct points. By Theorem 1, Corollary 1, the lines AB and BA are identical. Let C be any point on the line AB distinct from A and let X be any point of the line AB distinct from C and A. Since C is on AB it follows by Assumption III that A is on the line CX, and hence that we have one of the orders {ACX}, {CAX}, {CXA}. Whichever of these three cases holds it follows by Theorem 1 and the definition of a line that X is on AC. Hence all points of the line AB are on the line AC. Let Y be any point of AC distinct from B and A. Since C is on AB, by Assumption III and Corollary 2 of Theorem 1, B is on AC. Since B and Y are on AC it follows by Assumption III that A is on BY. Hence {ABY} or {BAY} or {BYA}. Hence by Theorem 1, Y is on AB. Thus we have shown that the lines AB and AC are identical. If D is any point of AB different from C, it follows that the line CD is identical with the line CA and hence with the line AB. In other words any line containing C and D is identical with the line CD.

Corollary. Two distinct lines cannot have more than one point in common.

Proof. If there were two common points, the line determined by them would be identical with each of the given lines.

Theorem 3. If DE is any line there exists a point F not on this line.

Proof. If every point were on the line DE then this line would contain the three points A, B, C mentioned in Assumption VI. By Theorem 2, the line AB would be identical with DE. Hence the line AB would contain C, contrary to Assumption VI.

Theorem 4. If A and B are any two points there is a point F in the order {AFB}.

Proof. By Theorem 3, there is a point E not on the line AB (Fig. 3). By Assumption IV there is a point C in the order {AEC}. The point C cannot be on the line AB, for if so this line would also contain E, by Theorem 2. By Assumption IV, there is a point D in the order {BCD}. Hence by Assumption V there is a point F in the order {AFB}.
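The proof of Theorem 4 is constructive in spirit, and its steps can be traced in a concrete model. The sketch below is an illustration under stated assumptions, not part of the axiomatic argument: points are taken to be pairs of rational numbers, {ABC} is read as "B lies strictly inside the straight segment from A to C," and the helper names `extend`, `meet`, and `point_between` are hypothetical. The particular choices of E (a quarter turn of B about A) and of C and D (reflections) are merely convenient witnesses for Theorem 3 and Assumption IV.

```python
from fractions import Fraction


def cross(a, b, c):
    """Twice the signed area of triangle abc; zero exactly when a, b, c are collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_order(a, b, c):
    """Assumed model of {abc}: b lies strictly between a and c on a common straight line."""
    if cross(a, b, c) != 0:
        return False
    return (a[0] - b[0]) * (c[0] - b[0]) + (a[1] - b[1]) * (c[1] - b[1]) < 0


def extend(a, b):
    """A point c in the order {abc} (Assumption IV); here c is a reflected through b."""
    return (2 * b[0] - a[0], 2 * b[1] - a[1])


def meet(p, q, r, s):
    """Intersection of the lines pq and rs, assuming they are not parallel."""
    d1 = (q[0] - p[0], q[1] - p[1])
    d2 = (s[0] - r[0], s[1] - r[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if denom == 0:
        raise ValueError("lines are parallel or coincide")
    t = Fraction((r[0] - p[0]) * d2[1] - (r[1] - p[1]) * d2[0], denom)
    return (p[0] + t * d1[0], p[1] + t * d1[1])


def point_between(a, b):
    """Follow the proof of Theorem 4 to produce a point f in the order {afb}; a and b must be distinct."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    e = (a[0] - dy, a[1] + dx)  # Theorem 3: a point off the line ab
    c = extend(a, e)            # Assumption IV: {aec}
    d = extend(b, c)            # Assumption IV: {bcd}
    f = meet(d, e, a, b)        # Assumption V: the line de meets the line ab in f
    assert cross(a, b, e) != 0 and in_order(a, e, c) and in_order(b, c, d)
    return f


a, b = (0, 0), (4, 0)
f = point_between(a, b)
print(f, in_order(a, f, b))
```

Exact rational arithmetic is used so that the collinearity test `cross(...) == 0` is not disturbed by rounding.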
We have now the information that a line AB must always contain at least five points, namely A and B, and at least one point X₁ between A and B (Theorem 4), and at least one X₂ and one X₃ in each of the orders {ABX₂} and {X₃AB} (Assumption IV and Theorem 1). The points X₁, X₂, and X₃ are distinct by Theorem 1.

The theorems proved above are all intuitively obvious provided the reader of these lines has in mind the same set of images as the writer. It is necessary to prove them, however, in order to show that our list of assumptions is actually a characterization of the points and lines which we image to ourselves.

An obvious fact in the figure (Fig. 3), described by Assumption V, is that the points D, E, F are not only collinear as stated in that assumption, but are in the order {DEF}. This we shall now prove as a theorem. The reader will observe that most of the other assumptions are used in the argument.

Theorem 5. The points D, E, F of Assumption V are in the order {DEF}.

[FIG. 4] [FIG. 5]

Proof. Since D, E, F are on the same line, it follows by Theorem 2 that F is on the line DE. Hence they are in one of the orders {DEF}, {DFE}, {FDE}. Suppose they were in the order {DFE}. The points E, C, D are not on the same line, because if they were, Theorem 2 would require A, B, C to be on this line. Hence by Assumption V (Fig. 4) the orders {CEA} and {EFD} would imply that there is a point X in the order {DXC} and on the line AF. But B is common to the lines AF and DC. Hence, by the Corollary of Theorem 2, X = B. Hence we would have the order {DBC} as well as {BCD}, contrary to Theorem 1.

Suppose the points were in the order {FDE}. As before, the points E, F, A are not on the same line. Hence by Assumption V the orders {AFB} and {FDE} imply the existence of a point X on the line BD and in the order {EXA}. But the lines BD and EA have C in common.
Hence there would be the order {ECA} as well as {CEA}, contrary to Theorem 1.

We need also to prove the following theorem, which is intuitively quite as obvious as Assumption V. We shall use the word collinear of a set of points to indicate that they are all on the same line.

Theorem 6. If A, B, C are non-collinear points and A' is between B and C, B' between C and A, and C' between A and B, then A', B', C' are non-collinear.

[FIG. 6] [FIG. 7]

Proof. If A', B', C' were collinear we should have one of the orders {A'B'C'}, {B'A'C'}, {A'C'B'}. Consider the possibility of {A'B'C'}. The points A', C', B cannot be collinear, because their line would, by Theorem 2, also have to contain A and C. Now by Assumption V, the orders {BC'A} and {C'B'A'} imply the existence of a point X in the order {BXA'} and on the line AB'. But C is common to the lines A'B and AB'. Hence X = C and we should have both {BCA'} and {BA'C}. The proof that {B'A'C'} and {A'C'B'} are impossible is similar.

III. ORDER ON A LINE

Theorem 7. If {ABC} and {BCD}, then {ABD}.

[FIG. 8]

Proof. By Theorem 3, and Assumption IV, there exist points P and O not on the line AB, and in the order {BPO}. By Assumption V, and Theorem 5, the orders {CBA} and {BPO} imply the existence of a point Q in the orders {OQC} and {APQ}. Similarly the orders {BCD} and {CQO} imply the existence of a point R in the orders {ORB} and {DQR}. The points A, Q, D are not collinear, for, if so, P would be on AD. Hence, by Assumption V, the orders {DQR} and {QPA} imply the existence of a point X in the order {AXD} and on the line RP. But the lines RP and AD have B in common. Hence X = B and {ABD}.

Theorem 8. If {ABC} and {ABD}, C ≠ D, then either {BCD} or {BDC}.

Proof. In view of Theorem 2, it is necessary only to show that {CBD} is impossible. By Theorem 3 and Assumption IV, there exist points O and P not on the line BC and in the order {OCP}.
The orders {OCP} and {CBD} would then imply the existence of a point Q in the orders (Fig. 9) {DQO} and {PBQ}. Now A, being on the line BC, is not on the line CP. Hence the orders {OCP} and {CBA} imply the existence of a point R in the orders {ARO} and {PBR}. Thus we would have three non-collinear points, O, A, D, and three points B, Q, R, B between A and D, Q between D and O, R between O and A, and B, Q, R would all be on the line BP, contrary to Theorem 6.

[FIG. 9]

The following are corollaries of Theorems 7 and 8.

Corollary 1. If {ABC} and {ABD}, C ≠ D, then either {ACD} or {ADC}.

Proof. By Theorem 8, we have either {BCD} or {BDC}. If {BCD}, then {DCB} and {CBA} lead by Theorem 7 to {DCA}. If {BDC} then {CDB} and {DBA} imply {CDA}.

Corollary 2. If {ABD} and {ACD}, B ≠ C, then either {ABC} or {ACB}.

Proof. {BAC} with {ACD} would by Theorem 7 imply {BAD}, whereas our hypothesis is {ABD}.

Corollary 3. If {ABC} and {ACD} then {BCD}.

Proof. By Theorem 7 {CDB} would imply with {ACD} the order {ACB}, contrary to hypothesis. {CBD} with {CBA} would by Corollary 1 imply either {CDA} or {CAD}, contrary to hypothesis.

Corollary 4. If {ABC} and {ACD} then {ABD}.

Proof. By Corollary 3 we have {BCD}, which combined with {ABC} leads by Theorem 7 to {ABD}.

These propositions are all preliminary to the following theorem.

Theorem 9. If A is any point of a line AB, the points of the line exclusive of A are in two sets such that A is between any point of the first set and any point of the second set, and is not between any two points of the same set.

Proof. Let [X] be the set* of all points in the order {XAB} and let [Y] be the set including B and all points in the orders {AYB} and {ABY}. By definition, the line comprises no other points than A and [X] and [Y]. A is between any X and any Y. For we have {XAB} and either {AYB} or {ABY}. In the first case Corollary 3 gives the conclusion {YAX} and in the second case Theorem 7 yields the same result.
* We let [X] denote a class of objects, the individuals of which are denoted by X, X₁, X₂, X'', etc.

A is not between two X's, because {X₁AB} and {X₂AB} lead by Theorem 8 either to {AX₁X₂} or {AX₂X₁}. A is not between two Y's, for the possible cases are:
(a) {AY₁B} and {AY₂B}, which by Corollary 2 gives {AY₁Y₂} or {AY₂Y₁};
(b) {AY₁B} and {ABY₂}, which by Corollary 4 gives {AY₁Y₂};
(c) {ABY₁} and {ABY₂}, which by Corollary 1 gives {AY₁Y₂} or {AY₂Y₁}.

Definition. The two sets of points in Theorem 9 are called half-lines or rays; A is called the origin or the end of either half-line. If AB is any segment the ray of the line AB whose end is B and which does not contain A is called the prolongation of the segment AB beyond B. A ray whose end is A and which contains B is designated as the ray AB.

Corollary 1. If B is a point of a ray whose end is A the points of the ray exclusive of B are in two sets, the segment AB and a ray whose end is B.

Proof. The ray is by definition composed of B and the points X in the order {AXB} and the points Y in the order {ABY}.

Corollary 2. If C is a point of a segment AB, then the points of the segment, exclusive of C, are in one or the other, but not both, of the segments AC and BC.

Proof. A and B are respectively in the two rays α and β, whose common end is C. The ray α contains by definition all points [X] in the order {CXA} and β contains similarly all points [X'] in the order {CX'B}. Hence the segments CA and CB have no points in common. The other points [Y] of the ray α are in the order {CAY}. Since we have {BCA} it follows by Theorem 7 that the points Y are also in the order {BAY} and hence not on the segment AB. The ray β also contains points Y' in the order {CBY'}. These must also be in the order {ABY'} and hence not on the segment AB. Hence every point on the segment AB, except C, is on one of the segments AC and CB.

Definition. A set of n (n ≥ 3) points A₁, A₂, ..., Aₙ are in the order {A₁A₂...Aₙ} if and only if {AᵢAⱼAₖ} wherever i < j < k (i, j, k = 1, 2, ..., n). Two points A₁, A₂ are always in the orders {A₁A₂} and {A₂A₁}.

Theorem 10. To any set of n distinct points (n ≥ 2) on a line can be assigned the notation so that they are in the order {A₁A₂...Aₙ}. The other points of the line fall into n+1 sets, no two of which have a point in common. These sets are the segments A₁A₂, A₂A₃, ..., Aₙ₋₁Aₙ and the rays which are the prolongations of the segment A₁Aₙ beyond A₁ and Aₙ.

Proof. We prove the theorem first for the case n = 2. The first statement of the theorem is in this case part of the definition. Let A₁ be an arbitrary one of the two points, and let η₁ and η₂ be the two rays which it determines according to Theorem 9, η₂ being the one which contains A₂. The ray η₂ is by Corollary 1 of Theorem 9 composed of the segment A₁A₂, the point A₂, and another ray η₂' with A₂ as its end. This proves the theorem for n = 2.

We establish it for the general case by proving that if it is true for n = k then it is true for n = k+1. Consider k points in the order {A₁A₂...Aₖ}. A point Aₖ₊₁ may fall in the ray whose end is A₁ or in one of the segments A₁A₂, A₂A₃, ..., Aₖ₋₁Aₖ, or in the ray whose end is Aₖ. If it falls in one of the two rays it separates this into a segment and a ray by Corollary 1 of Theorem 9. If it falls in a segment it separates this into two segments by Corollary 2 of the same theorem. So in either case we have increased the number of segments by one and left the number of rays unaltered. Call the end of one of the rays A₁'. Let A₂' be the other end of the segment, one of whose ends is A₁'. Let A₃' be the other end of the other segment whose end is A₂'. By a finite number of steps the points A₁, ..., Aₖ, Aₖ₊₁ are exhausted and the notation has been assigned to them in such a way that A₁' and Aₖ₊₁' are ends of rays and the segments are A₁'A₂', ..., Aₖ'Aₖ₊₁'. Since none of the points A₃', ..., Aₖ₊₁' are on the segment A₁'A₂' or the ray whose end is A₁', we have the order relations {A₁'A₂'Aⱼ'} (2 < j < k+2). Similar considerations show that all the order relations exist which are implied by the symbol {A₁'A₂'A₃'...Aₖ₊₁'}.

Theorem 11. On any segment AB and on either of its prolongations there is an infinitude of points.

Proof. By Theorem 4 there is a point X₁ on the segment AB. By the same theorem there is a point X₂ on the segment AX₁. By Theorem 8, Corollary 4, X₂ is on AB. In like manner we obtain points X₃, X₄, ... on AB. By Assumption I and Corollary 2 of Theorem 9, these points are all distinct. By Assumption IV there is a point Y₁ on the prolongation of AB beyond B, a point Y₂ on the prolongation of AY₁ beyond Y₁, and so on. By Theorem 7 all these points are on the prolongation of AB beyond B.

IV. THE TRIANGLE AND THE PLANE

Definition. Three non-collinear points A, B, C, together with the segments AB, BC and CA, are called a triangle ABC. The points A, B, C are called the vertices, and the segments AB, BC, CA are called the sides of the triangle.

We shall now prove a theorem which must be carefully distinguished from Assumption V.

Theorem 12. If A, B, C are three non-collinear points and D and F exist in the orders {BCD} and {AFB} then E exists in the orders {AEC} and {DEF}.

Proof. By Assumption IV there exists a point, O (Fig. 10), in the order {ABO} which is therefore by Theorem 8, Corollaries 3 and 4, also in the orders {AFO}, {FBO}. Since we also have {BCD} it follows by Assumption V and Theorem 5, that there exists a point P in the orders {OCP} and {FPD}. By the same argument there follows from the orders {AFO} and {FPD} the existence of a point Q in the orders {OPQ} and {DQA}. The orders {OPQ} and {OCP} imply by Theorem 8, Corollary 3, the order {CPQ}. Since A, C, Q are not collinear (Theorem 2), the orders {AQD} and {QPC} imply (Assumption V) the existence of a point E on the line DF and in the order {CEA}.
By Assumption V and Theorem 2, the line DE meets the segment AB in F only, and hence by Theorem 5 we have the order {DEF}.

Assumption V and Theorem 12 may be combined in one statement which for convenience we list as Theorem 13.

Theorem 13. A line which meets one side of a triangle and a prolongation of another side meets the third side also.

Definition. If A, B, C are non-collinear, the set of all points collinear with pairs of points of the triangle ABC is the plane ABC.

Corollary. The planes ABC, ACB, BCA, etc., are identical.

Theorem 14. If O is any point of a side AB of a triangle ABC the plane ABC consists of the points of the lines joining O to points of the triangle ABC.

[FIG. 10] [FIG. 11] [FIG. 12]

Proof. By definition, all points on the lines joining O to points of the triangle are in the plane ABC. Hence we must prove that all points of the plane ABC are on such lines. It is evident in view of Theorem 13, that all points of the three lines AB, BC, CA are in the plane and also on lines joining O to points of the triangle. Let X be any point in the plane ABC, but not on one of the lines AB, BC, CA. It is therefore collinear with two points M and N of the triangle, which are not both on the same one of the intervals (see the definition in Section II) AB, BC, CA.

If one of these points, say M, coincides with A (Fig. 12), N must be on the segment CB. If X is on either prolongation of the segment AN the theorem reduces to Theorem 13, applied to the triangle ANB. If {AXN}, by Theorem 13 applied to the triangle ANB we have the existence of a point, P, in the orders {APB} and {CXP}. By the theorems of Section III we have P = O or {APOB} or {AOPB}. If P = O then the line OX meets the triangle in C. If {APO} then OX, by Theorem 13, applied to the triangle APC, meets the segment AC. Similarly, if {OPB}, the line OX meets the segment CB.
Exactly the same argument can be repeated if M = B. Thus if any point X is collinear with A or B and a point of the triangle it is collinear with O and a point of the triangle.

[FIG. 13] [FIG. 14]

If M is on the segment AB, and N on the interval CB, X is in one of the orders {MXN}, {MNX}, {XMN}. If {MXN}, by Theorem 13, applied to the triangle MNB we have the existence of a point P in the orders {AXP} and {NPB}. Since X is on the line AP, the line OX meets two points of the triangle ABC by the paragraph above.

If X is in the order {XMN} consider the triangle CNX and the orders {CNB} and {NMX}. Hence there exists a point P (Figs. 14, 15) in the orders {BMP} and {CPX}. Since O is on the segment AB it either coincides with P or is on the segment PA or is on the segment PB. In the first case the line OX meets C. In the second case consider the triangle APC, and in the third case the triangle BPC. It follows by Theorem 13, in the second case, that OX meets the segment AC and in the third case that OX meets the segment CB.

[FIG. 15] [FIG. 16]

If X is in the order {MNX} consider the triangle AMX and the orders {AMB} and {MNX}. It follows that there exists a point P in the orders {APX} and {BNP}. Since O is on the segment AB it follows by considering the triangle APB, that the line OX meets the line BC in a point Q. If Q is on the interval BC it is the required point on the triangle ABC. If not, by Theorem 13, the line OQ meets the segment AC in a point R and thus X is collinear with O and the point R of the triangle.

There now remains the case where neither M nor N is on the interval AB. Therefore one of them, say M, is on the segment AC and the other, N, is on the segment BC. Two cases arise as X is on the segment MN or not.

[FIG. 17] [FIG. 18]

If {MXN} the line AX by Theorem 13 applied to the triangle MCN meets the segment NC in a point P.
Thus X is on the line AP and the result follows from the statement italicized above.

If X is in the order {MNX}, consider the triangle AMX and the orders {AMC} and {MNX}. By Theorem 13, the segment AX is met by the line CN in a point Q. If Q is on the interval CB the point X is on a line joining A to a point of this interval and thus OX meets the triangle by the statement italicized above. If Q is on one of the prolongations of the segment CB then since O is on the segment AB the line OX meets the segment AC according to Theorem 13.

If X is in the order {XMN} we replace the triangle AMX by BNX and proceed as in the paragraph above.

Theorem 15. If A', B', C' are any three non-collinear points of a plane ABC then the planes ABC and A'B'C' are identical.

Proof. We shall prove first that if A' is a point of the line AB distinct from B the planes ABC and A'BC are identical.

[Figs. 19, 20, 21.]

If A' = A the statement is trivial. If not we have {A'AB} or {AA'B} or {ABA'}. In the first case let O be any point of the segment AB; it is therefore on the segment A'B. By Theorem 14, the plane ABC consists of all points on the lines joining O to the intervals BC and CA. But this set of lines is identical by Theorem 13, with the lines joining O to the intervals BC and CA'. Hence in this case the planes ABC and A'BC are identical. In case of the order {AA'B}, O is taken on the segment A'B and the argument is similar. In case of the order {ABA'} we have just shown that the plane ABC is identical with AA'C and the latter with A'BC.

From this it follows that if C' is any point of the plane ABC not on the line AB, the plane ABC is identical with the plane ABC'. For let O be any point on the segment AB. By Theorem 14, the line OC' meets one of the intervals CA and CB. Suppose it meets the interval CA in a point P.

[Fig. 22.]
By the paragraph above the plane ABC is identical with the plane ABP, and this is identical with PAO, and this with C'AO, and this with C'AB.

Now if A', B', C' are three non-collinear points of the plane ABC, at least one of them, say C', is not on the line AB. Hence by the argument above ABC is identical with ABC'. A' and B' are not both on the line AC'. Let B' be the one which is not, and we have that ABC' is identical with AB'C'. Since A' is not on the line B'C' the same argument shows that AB'C' is identical with A'B'C'. Hence ABC is identical with A'B'C'.

Theorem 16.* A line having two points in common with a plane lies wholly in the plane.

Proof. Let the two points be taken as A and B in defining the plane, ABC. The plane contains the line AB.

* Cf. Euclid, Definition, I, 7.

Corollary. If two planes have two points in common they have a line in common.

Theorem 17. A line of a plane which contains one and only one point of a side of a triangle whose vertices are in the plane contains one other point of the triangle.

Proof. Let the triangle be ABC and let a line l meet the segment AB in a point O. By Theorem 14, since any other point of l is in the plane ABC the line l meets the triangle in a point different from O.

V. REGIONS IN A PLANE

In this section we shall be dealing entirely with the points of a single plane.

Definition. The set of n-1 intervals A1A2, A2A3, ..., A_{n-2}A_{n-1}, A_{n-1}An determined by n points A1, A2, ..., An is called the broken line A1A2A3...An. A1 and An are called its ends, and it is said to join A1 and An. A single interval is a special case of a broken line. A region is a set of points such that (1) any two points of the set can be joined by a broken line consisting entirely of points of the set, and (2) any point of the set is on at least two non-collinear segments consisting entirely of points of the set.

The last clause excludes the possibility of a single segment being a region.
A region is said to be convex if the interval joining any two points of it is composed entirely of points of the region. It is evident that the set of all points in a plane is an example of a convex region. Further cases are developed by the theorems below.

Theorem 18. If l is any line passing through a point of a convex region R in a plane then the points of R not on l constitute two convex regions R1 and R2 such that any segment joining a point of R1 to a point of R2 contains a point of l.

[Fig. 23.]

Proof. Let O be a point of R on l. By the definition of a convex region there is a segment intersecting l in O and consisting entirely of points of R. Let A1 and A2 be two points of this segment in the order {A1OA2}.

Consider the set R1 of points of R which are joined to A1 by intervals containing no point of l. If X', X" are two such points, the segment X'X" can contain no point of l since, if it did, one of the intervals X"A1 and A1X' would contain a point of l (Theorem 17). Moreover, all points X of the segment X'X" are in R1 because if l should meet the segment A1X it would have, by Theorem 17, to meet A1X'. Hence the set R1 is a convex region.

[Fig. 24.]

Consider also the set R2 of points of R such that the segments joining them to A1 each contain points of l. A2 is evidently a point of R2. If Y' and Y" are two points of R2 the segment Y'Y" can contain no point of l, because in that case the line l would meet three sides of the triangle A1Y'Y", contrary to Theorem 6. Again, if Y is any point of the segment Y'Y", the segment A1Y contains a point of l by Theorem 17, because the segment A1Y' does and the interval Y'Y does not. Hence R2 is a convex region.

Clearly all points of R are in R1, on l, or in R2. Any segment joining a point X of R1 to a point Y of R2 meets l. This follows by Theorem 17, because the segment A1Y does and the interval A1X does not contain a point of l.
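As an illustration only, Theorem 18 may be checked in the familiar coordinate model, which the text has not assumed. In the sketch below points are pairs of integers, the convex region R is sampled by a square grid, and the line l is taken through two chosen points a and b; the helper names cross_value and crossing_point are hypothetical and belong to this sketch alone. The sign of a cross product sorts the sample into R1 and R2, and for each joining segment the point in which it meets l, if any, is computed exactly.

```python
from fractions import Fraction
from itertools import combinations, product


def cross_value(p, a, b):
    """(b - a) x (p - a): zero exactly when p lies on the line ab."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def crossing_point(p, q, a, b):
    """Point of the line ab strictly between p and q, or None if there is none.

    Assumes that neither p nor q lies on the line ab.
    """
    fp, fq = cross_value(p, a, b), cross_value(q, a, b)
    if fp == fq:
        return None  # the segment pq is parallel to the line
    # cross_value varies linearly along pq and vanishes at parameter t.
    t = Fraction(fp, fp - fq)
    if not 0 < t < 1:
        return None
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


# Assumed sample of a convex region R: a square grid of integer points.
R = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
a, b = (0, 0), (2, 1)  # assumed line l, passing through the point (0, 0) of R

R1 = [p for p in R if cross_value(p, a, b) > 0]
R2 = [p for p in R if cross_value(p, a, b) < 0]

# Every segment joining a point of R1 to a point of R2 meets l ...
opposite_ok = all(
    crossing_point(x, y, a, b) is not None for x, y in product(R1, R2)
)
# ... and no segment joining two points of R1, or two of R2, meets l.
same_ok = all(
    crossing_point(x, y, a, b) is None
    for part in (R1, R2)
    for x, y in combinations(part, 2)
)
print(opposite_ok, same_ok)
```

A finite sample of course proves nothing; the sketch only exhibits, in one model, the two properties that the proof establishes from Theorem 17.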
Since the set of all points in a plane is a convex region, we have at once the following

Corollary. Definition. The points of a plane ABC not on the line AB constitute two convex regions such that any segment joining a point of one region to a point of the other region contains a point of AB. These two regions are called the two sides of the line. Either of them is called a half plane.

Definition. A set of points [X] is said to separate two other sets of points [Y], [Z] if and only if every broken line joining a point Y to a point Z contains a point X. A set [X] is said to decompose a region R into regions R1, ..., Rn if the points in [X] and R1, ..., Rn comprise all points of R and each pair of the regions R1, ..., Rn is separated by [X] together with the points of the plane not in R.

Theorem 19. A line containing a point of a convex region decomposes it into two convex regions.

Proof. This follows from Theorem 18, as soon as we prove that a broken line joining a point of R1 to a point of R2 meets l. Let A1A2A3...An be a broken line not meeting l. Then A1 and A2 are on the same side of l by Theorem 18. In like manner A3 is on the same side as A1 and A2. By repeating this argument we find that An is on the same side as A1. Hence a broken line joining points on opposite sides meets l.

Definition. A point and two distinct rays having it as their common origin are called an angle. The origin is called the vertex and the rays are called the sides of the angle. If the rays are collinear the angle (which is identical with the line) is called a straight angle or a flat angle.

An angle is denoted by the symbol ∠ab, if a and b denote the sides. The symbol ∠ABC denotes the angle whose sides are the rays BA and BC.

Theorem 20. Definition. An angle ∠ABC not a straight angle decomposes the plane in which it lies into two regions, one of which is convex. The convex region is called the interior of the angle and the other region the exterior of the angle.
Any ray with B as origin containing a point of the interior meets the segment AC and consists entirely of interior points.

Proof. Let D be a point in the order {DBA}. Let [I] be the set of all points I of the plane such that the segment DI contains a point of the ray BC. Let [O] be the set of all points O such that the interval DO does not contain B or a point of the ray BC. Since all points of the plane ABC are on rays having D as origin, every point of the plane is in [I] or [O] or on ∠ABC.

[Fig. 25.]

Two points, O1, O2, are joined by the broken line O1DO2, which by definition contains no point of ∠ABC. Two points, I1, I2, are by definition on the same side of BA with C and on the opposite side of BC from D. Hence the segment I1I2 does not meet ∠ABC.

Moreover if IP is any interval not meeting ∠ABC, and I is in [I], then P is on the same side of AB with I and also on the same side of BC with I. The line BC meets the side DI of the triangle DIP and does not meet the interval IP. Hence it must meet the segment DP in a point Q. Since the segment DP is on the same side of the line AB with C, Q is on the ray BC. Hence P is a point of [I]. Now if IP1P2P3...Pn is a broken line not meeting ∠ABC, the argument just made shows first that P1 is in [I], then that P2 is in [I] and so on. Hence every broken line joining a point of [I] to a point of [O] meets ∠ABC.

The rays which have B as origin and are on the opposite side of the line AB from C are all composed of O-points and cannot meet the segment AC. The ray BD is composed of O-points and does not meet the segment AC. The other rays whose origin is B, aside from BA and BC, meet one of the segments DC and CA. Those meeting the segment DC evidently are composed entirely of O-points and those meeting the segment CA of I-points. Hence the set of points [I] is composed of the points on the rays whose origin is B and which meet the segment AC.

Theorem 21. Definition.
A triangle decomposes its plane into two regions one of which is convex and is called the interior. The other region is not convex and is called the exterior. A ray whose origin is an interior point meets the triangle in one and only one point, and the interior consists of all points having this property.

[Fig. 26.]

Proof. Let the triangle be ABC and let [I] be the set of all points on the segments [AX] where [X] is the segment BC. By Theorem 17, any line through I except the line IX meets the triangles AXB and AXC each in one point. But as B and C are on opposite sides of the line AX these points are one in each of the rays into which I decomposes the line. Hence every ray through I meets the triangle ABC once and only once.

A segment I1I2 cannot contain a point of the triangle for then the ray I1I2 would contain at least two points of the triangle, one on the segment I1I2 and one on its prolongation beyond I2. Hence [I] is a convex region.

A segment one of whose end points is I must be composed entirely of I-points if it does not contain a point of the triangle; for if P is any point of the segment, the single point of the triangle on the ray IP is by hypothesis on the prolongation of IP beyond P. From this it follows as in Theorem 19 that a broken line joining a point I to a point not in [I] contains at least one point of the triangle.

A point E not in [I] or on the triangle ABC must either be exterior to ∠BAC or on the prolongation of a segment AX beyond X. One of the latter points E is joined to a point E1 of the prolongation of AB beyond B by an interval which does not meet the triangle ABC because its ends are on the opposite side of BC from A. The prolongation of EE1 beyond E1 is composed of points in the exterior of ∠BAC. Since any two points exterior to ∠BAC are connected by a broken line not meeting ∠BAC, it follows that any two points E are connected by a broken line. Hence [E] is a region.
Since it contains points on the two prolongations of a segment AX it is not convex. Hence [I] and [E] satisfy the definitions respectively of the interior and the exterior of the triangle.

Corollary. Through each exterior point there pass lines which do not meet the triangle.

Theorem 22. Any ray BD in the interior region of ∠ABC decomposes it into two convex regions, the interiors of the angles ∠ABD, ∠DBC. Any ray BD in the exterior region of ∠ABC decomposes it into two regions at least one of which is convex. One of these is the interior or the exterior of ∠ABD and the other the interior or the exterior of ∠DBC.

[Fig. 27.]

Proof. To prove the first part of the theorem we observe that the ray BD meets the segment AC in a point P. Hence the points [X] on the rays joining B to the segment AP are on the opposite side of the line BD from the points [Y] on the rays joining B to the segment PC. But the sets [X] and [Y] are the interiors of the angles ∠ABD and ∠DBC respectively.

The proof of the second part is analogous to that just made. The details are left as an exercise for the reader.

Definition. If R is a region and there exists a set of points [B] not points of R such that every broken line joining a point of R to a point not in R contains a point B, then [B] is called the boundary of R.

For example, a line is the boundary of each of the half planes it determines; an angle is the boundary of its interior and also of its exterior.

Definition. Two rays a, b, are separated by an angle ∠hk if the four rays a, b, h, k have a common origin and one of a and b is interior while the other is exterior to ∠hk. A set of rays having a common origin are said to be in the order {a1 a2 a3 a4 ... a_n} if no two of the rays are separated by any of the angles ∠a1a2, ∠a2a3, ..., ∠a_{n-1}a_n, ∠a_n a1.

[Figs. 28, 29.]

Corollary 1. A set of rays in the order {a1 a2 ... a_{n-1} a_n} are also in the orders {a2 a3 ... a_n a1} and {a_n a_{n-1} ... a2 a1}.
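The definition of the order of rays, and Corollary 1, may be tried out in the coordinate model; this is a hedged sketch, not part of the axiomatic development. Rays from a common origin are represented by integer direction vectors, a ray a is taken to be interior to a non-straight angle ∠hk when a = s·h + t·k with s, t > 0, and the function names (interior, exterior, separated, in_order) are hypothetical. The sketch assumes that no two consecutive rays are opposite, since the interior of a straight angle is not defined above.

```python
from itertools import combinations


def cross(u, v):
    return u[0] * v[1] - u[1] * v[0]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def same_ray(u, v):
    """True if the direction vectors u and v determine the same ray."""
    return cross(u, v) == 0 and dot(u, v) > 0


def interior(a, h, k):
    """True if ray a is interior to the (non-straight) angle with sides h, k."""
    c = cross(h, k)
    if c == 0:
        raise ValueError("straight or degenerate angle: not handled in this sketch")
    # a = s*h + t*k gives cross(h, a) = t*c and cross(a, k) = s*c.
    return cross(h, a) * c > 0 and cross(a, k) * c > 0


def exterior(a, h, k):
    """True if ray a is neither interior to the angle nor one of its sides."""
    return not (interior(a, h, k) or same_ray(a, h) or same_ray(a, k))


def separated(a, b, h, k):
    """One of a, b interior and the other exterior to the angle hk."""
    return (interior(a, h, k) and exterior(b, h, k)) or (
        interior(b, h, k) and exterior(a, h, k)
    )


def in_order(rays):
    """No two rays are separated by any angle formed by cyclically consecutive rays."""
    n = len(rays)
    angles = [(rays[i], rays[(i + 1) % n]) for i in range(n)]
    return not any(
        separated(a, b, h, k) for h, k in angles for a, b in combinations(rays, 2)
    )


# Assumed sample: five rays listed counterclockwise around the origin.
rays = [(1, 0), (1, 2), (-1, 1), (-2, -1), (1, -3)]

print(in_order(rays))                            # the given order
print(in_order(rays[1:] + rays[:1]))             # Corollary 1: cyclic shift
print(in_order(rays[::-1]))                      # Corollary 1: reversal
print(in_order([rays[1], rays[0]] + rays[2:]))   # two neighbours interchanged
```

For this sample the first three checks succeed and the last fails, since interchanging the first two rays places the ray (1, 2) inside the angle formed by (1, 0) and (-1, 1) while (-2, -1) remains outside it.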
Corollary 2. Any two rays, a, b, having a common origin are in the orders {ab} and {ba}. Any three rays, a, b, c, having a common origin are in the orders {abc}, {bca}, {cab}, {acb}, {bac}, {cba}.

Theorem 23. To any finite number n>2 of rays having a common origin may be assigned notation so that they are in the order {a1 a2 a3 ... a_n}. They decompose the plane into n regions of which at most one is not convex.

Proof. The theorem is obvious for n=2. Hence we can prove it in general by showing that its truth for n=k implies its truth for n=k+1. To k of the given k+1 rays let us assign notation so that they are in the order {b1 b2 ... b_k}. They decompose the plane into k regions R1, R2, ..., Rk, whose boundaries are ∠b1b2, ∠b2b3, ..., ∠b_k b1. The other ray, b, lies in one of the k regions determined by b1, b2, ..., b_k. By Theorem 22 it separates this region, Ri, into two regions Ri', Ri" of which one at least is convex if Ri is not convex and both of which are convex if Ri is convex. Hence the k+1 rays decompose the plane into k+1 regions R1, R2, ..., Ri', Ri", ..., Rk of which at most one is not convex.

Suppose that the boundary of Ri is ∠b_i b_{i+1}. Then the boundaries of the two regions into which Ri is decomposed are ∠b_i b and ∠b b_{i+1}. Hence the k+1 rays are in the order {b1 b2 ... b_i b b_{i+1} ... b_k}. By calling the first of these a1, the second a2, etc., we have assigned to them the order {a1 a2 ... a_{k+1}}.

[Fig. 30.]

EXERCISE 1. A set of n distinct coplanar lines meeting in a point O decompose their plane into 2n convex regions.

EXERCISE 2. Three lines AB, BC, CA not meeting in a point decompose their plane into seven convex regions, one of which is the interior of the triangle ABC.

EXERCISE 3. A set of n lines in a plane each pair of which intersect, but no three of which pass through the same point, decompose their plane into n(n+1)/2 + 1 convex regions.

VI. CONGRUENCE OF POINT PAIRS

We now introduce a new undefined term to express a relation between point pairs. The relation is called congruence. Denoting pairs of distinct points by (A, B), (C, D), etc., we write (A, B) is congruent to (C, D). Since this phrase is undefined the reader may attach to it any meaning consistent with the assumptions below. It is intended, however, to express the common notion implied by saying that the distance from A to B as measured by a tape-line is the same as the distance from C to D.

[Fig. 31.]

Assumption VII.* If A ≠ B then on any ray whose origin is C there exists one and only one† point D such that (A, B) is congruent to (C, D).

Assumption VIII.‡ If (A, B) is congruent to (C, D) and (C, D) is congruent to (E, F) then (A, B) is congruent to (E, F).

Assumption IX.§ If (A, B) is congruent to (A', B') and (B, C) is congruent to (B', C') and {ABC} and {A'B'C'} then (A, C) is congruent to (A', C').

[Fig. 32.]

Assumption X. (A, B) is congruent to (B, A).

* Cf. Euclid, Postulate 3.
† Evidently there are two statements here, (1) the existence and (2) the uniqueness of D.
‡ Cf. Euclid, Common Notion 1.
§ Cf. Euclid, Common Notion 2.

Theorem 24. If {ABC}, and C' is a point on a ray A'B' such that (A, B) is congruent to (A', B') and (A, C) is congruent to (A', C') then {A'B'C'}.

Proof. By Assumption VII there is a point C" on the ray B'C' such that (B, C) is congruent to (B', C"). The point C" is in the order {A'B'C"}. But by Assumption IX, (A, C) is congruent to (A', C"). Hence by Assumption VII, C" = C'. Hence {A'B'C'}.

By combining this theorem with Assumption IX we have the following

Corollary. If (A, B) and (A, C) are congruent respectively to (A', B') and (A', C'), and if C is on the ray AB and C' on the ray A'B', then (if B ≠ C) (B, C) is congruent to (B', C').

Theorem 25.* (A, B) is congruent to (A, B).

Proof. By Assumption X, (A, B) is congruent to (B, A) and (B, A) is congruent to (A, B).
Hence by Assumption VIII, (A, B) is congruent to (A, B).

Theorem 26. If (A, B) is congruent to (C, D) then (C, D) is congruent to (A, B).

Proof. By Assumption VII there exists on the ray AB a point B' such that (C, D) is congruent to (A, B'). Hence by Assumption VIII (A, B) is congruent to (A, B'). Hence by Theorem 25 and Assumption VII B' = B.

Corollary.† If (A, B) is congruent to (C, D) and also to (E, F) then (C, D) is congruent to (E, F).

Proof. Since (A, B) is congruent to (C, D), (C, D) is congruent to (A, B). Since also (A, B) is congruent to (E, F) it follows by Assumption VIII that (C, D) is congruent to (E, F).

* Cf. Euclid, Common Notion 4.
† Cf. Euclid, Common Notion 1.

The word congruent was taken without definition as a relation between point-pairs. We now proceed to extend its significance by means of a definition.

Definition. A set of points [X] is congruent to a set of points [Y] if (1) every point X corresponds to one point Y in such a way that whenever (X1, X2) corresponds to (Y1, Y2), (X1, X2) is congruent to (Y1, Y2) and (2) every point Y is the correspondent of one point X.

This definition corresponds precisely to the intuitive conception of superposition. If two plane figures are represented by drawings on sheets of paper it is perfectly clear that a test for their congruence is to lay one on top of the other. The superposition with which we have to do in geometry is, however, a kind of intellectual matching of two figures together. The attention is transferred from one to the other and we try to see whether corresponding pairs of points are congruent. It would be perfectly feasible to substitute the word superposable for congruent in the definition above.

Theorem 27.* Any figure is congruent to itself. If a figure is congruent to a second figure the second figure is congruent to the first. Two figures congruent to the same figure are congruent to each other.

Theorem 28.
Any point is congruent to any other point, any line to any other line, any ray to any other ray, any straight angle to any other straight angle.

Proof. That any point, A, is congruent to any point, B, is obvious from the wording of the definition.

Let AB and LM be any two rays. Let each point Y of the ray LM correspond to that point X of the ray AB which is such that (A, X) is congruent to (L, Y). Thus every X has a Y corresponding to it. Moreover, if Y1 and Y2 correspond to X1 and X2 we have (L, Y1) and (L, Y2) congruent respectively to (A, X1) and (A, X2) and hence by the corollary of Theorem 24 (X1, X2) is congruent to (Y1, Y2). Hence the ray LM is congruent to the ray AB.

[Fig. 33.]

By applying like reasoning to the rays AB' and LM', which are the prolongations beyond A and L of the segments BA and LM respectively, we have that the straight angle BAB' is congruent to the straight angle MLM'. Hence the two lines AB and LM are also congruent.

* Cf. Euclid, Common Notions 1 and 4. By the word, figure, we mean any set of points.

Theorem 29. If (A, B) is congruent to (C, D) then the segment AB is congruent to the segment CD and the interval AB is congruent to the interval CD.

Proof. Let A correspond to C and B to D and any point X of the segment AB correspond to that point Y of the ray CD such that (A, X) is congruent to (C, Y). By Theorem 24, Y is on the segment CD and by the corollary of the same theorem, (D, Y) is congruent to (B, X). By the corollary of Theorem 24, if X1, X2 correspond to Y1, Y2 then (X1, X2) is congruent to (Y1, Y2).

VII. CONGRUENCE OF ANGLES

In order to deal with the congruence of angles and other figures in a plane, we must introduce an additional assumption.

Assumption XI.
If A, B, C are three non-collinear points and D is a point in the order {BCD}, and if A', B', C' are three non-collinear points and D' is a point in the order {B'C'D'} such that the point pairs (A, B), (B, C), (C, A), (B, D) are respectively congruent to (A', B'), (B', C'), (C', A'), (B', D'), then (A, D) is congruent to (A', D').

[Fig. 34.]

Theorem 30.* Two angles ∠BAC and ∠MON are congruent in such a way that A corresponds to O if there are two points P and Q on the rays OM and ON such that the point pairs (A, B), (A, C) and (B, C) are respectively congruent to (O, P), (O, Q) and (P, Q).

* Cf. Euclid, I, 8.

Proof. If the points P and Q exist as stated let A correspond to O, B to P and C to Q. The ray AC is congruent to the ray OQ and the ray AB to the ray OP. Hence to prove the angles congruent we need to show that if X1 is any point of the ray AC and X2 any point of the ray AB and Y1 and Y2 are the corresponding points of the rays ON and OM respectively, then (X1, X2) is congruent to (Y1, Y2).

Let B' and P' be points on the prolongations of BA and PO beyond A and O respectively such that (A, B') is congruent to (O, P'). Since (A, C), (C, B), (B, A), (B, B') are congruent respectively to (O, Q), (Q, P), (P, O), (P, P'), it follows by Assumption XI that (B', C) is congruent to (P', Q). Now if X2 and Y2 are points of the rays AB and OP respectively such that (A, X2) is congruent to (O, Y2), it follows, since (A, C), (C, B'), (B', A), (B', X2) are respectively congruent to (O, Q), (Q, P'), (P', O), (P', Y2), that (C, X2) is congruent to (Q, Y2).

[Fig. 35.]

In similar fashion we can prove that if X1 and Y1 are points of the rays AC and OQ respectively such that (A, X1) is congruent to (O, Y1), then (X1, X2) is congruent to (Y1, Y2).

Definition. If B' is on the prolongation of the segment BA beyond A, the angle ∠B'AC is said to be a supplement of ∠BAC.
If C' is a point on the prolongation of the segment CA beyond A, the angles ∠CAB and ∠C'AB' are said to be vertical.

Corollary 1. Supplements of congruent angles are congruent.

Corollary 2.* Vertical angles are congruent.

Definition. In a triangle ABC, the sides AB and BC are said to include ∠ABC. The side AC and angle ∠ABC are said to be opposite each to the other. The sides AB and BC are said to be adjacent to each other and to ∠ABC.

Theorem 31.† If the sides of one triangle are congruent respectively to the sides of another triangle, the triangles are congruent.

* Cf. Euclid, I, 15.
† Cf. Euclid, I, 8.

Proof. Let the two triangles be ABC and A'B'C' and let the segments AB, BC, CA be congruent respectively to the segments A'B', B'C', C'A'. This determines a correspondence between the two triangles in which by Theorem 30 the angles at A, B, and C correspond to congruent angles at A', B', C'. But since ∠ABC is congruent to ∠A'B'C' it follows by definition that if X and Y are any two points of the segments BA and BC respectively, and X' and Y' the corresponding points of the segments B'A' and B'C' respectively, then (X, Y) is congruent to (X', Y'). Applying the same argument to the angles ∠ACB and ∠A'C'B' and the angles ∠BAC and ∠B'A'C' we have that the two triangles are congruent.

[Fig. 36.]

The following theorems are proved similarly and are left as an exercise for the reader.

Theorem 32.* If two sides and the included angle of one triangle are congruent to two sides and the included angle of another triangle the two triangles are congruent.

Theorem 33.† If two sides of a triangle are congruent, the angles opposite them are congruent.

VIII. INTERSECTIONS OF CIRCLES

Definition. If O and X0 are two points of a plane α, then the set of points [X] of α such that (O, X) is congruent to (O, X0) is called a circle. O is called its centre and any one of the intervals OX is called a radius.
The two radii on any line through O constitute a diameter. The points, except the points [X], on radii of the circle are said to be interior to the circle. The points of α not on radii are said to be exterior to the circle.

[Fig. 37.]

It can be proved that the interior and exterior points constitute two regions into which the plane α is decomposed by the circle. Another assumption, however, is necessary before this can be done.

* Cf. Euclid, I, 4.
† Cf. Euclid, I, 5.

Assumption XII. A circle passing through a point, A, interior and a point, B, exterior to another circle in the same plane has in common with the other circle at least one point on each side of the line AB.

Definition. A triangle is said to be isosceles if two of its sides are congruent and to be equilateral if all three are congruent.

Theorem 34.* If AB is any segment there exists in any half plane of which the line AB is the boundary an equilateral triangle of which AB is a side.

Proof. Let S and T be the two circles in the given plane of which A and B are centres respectively and the interval AB is the radius. If B' is the point in the order {BAB'} such that (B', A) is congruent to (A, B) and A' the point in the order {ABA'} such that (A, B) is congruent to (B, A'), the interval B'B is a diameter of the circle S and contains all points of the interior of this circle which are on the line BB'. Hence the circle T has the point A interior and the point A' exterior to the circle S. Hence they have in common by Assumption XII, two points C and C', one on each side of the line AB. The interval AC is congruent to the interval AB because they are radii of S and the interval AB is congruent to the interval BC because they are radii of T.

[Fig. 38.]

Theorem 35. Definition.† If AB is any segment there exists one and but one of its points, O, such that (A, O) is congruent to (O, B). This point is called the mid-point of the segment or interval AB.

Proof.
Using the notation of the last theorem we have that the segment CC' meets the line AB in a point O because C and C' are on opposite sides of the line AB. Since the point-pairs (C, B), (C', B) and (C, C') are congruent respectively to the point-pairs (C, A), (C', A) and (C, C') it follows by Theorem 30 that ∠BCO is congruent to ∠ACO and hence that (O, B) is congruent to (O, A).

Suppose now there were another point O' ≠ O such that (A, O') were congruent to (O', B). O' could not be on a prolongation of the segment AB; for if it were in the order {ABO'} there would be two segments O'B and O'A on the ray O'B and congruent to the segment O'B; similarly it could not be in the order {O'AB}. If O' were on the segment AB it would be in one of the orders, {AOO'B} and {AO'OB}. In the first of these cases from the order {AOO'} and the hypotheses that (A, O) and (A, O') are congruent to (B, O) and (B, O') respectively it follows by Theorem 24 that we should have the order {BOO'}, contrary to hypothesis. The order {AO'OB} is proved impossible similarly. Hence there is only one mid-point, O, and it is on the segment AB.

[Fig. 39.]

* Cf. Euclid, I, 1.
† Cf. Euclid, I, 10.

Theorem 36.* If ABC is a triangle there is no point C' ≠ C on the same side of the line AB with C such that (A, C) and (B, C) are congruent to (A, C') and (B, C') respectively.

Proof. By Assumption VII C' cannot be on either of the lines AC or BC. If C' exists and is not on these lines we distinguish two cases according as the line CC' does or does not meet the line AB.

In case the line CC' does meet the line AB in a point, P, the point-pairs (B, C), (C, A), (A, B), (A, P) are congruent respectively to (B, C'), (C', A), (A, B), (A, P) and hence by Theorem 30 (C, P) is congruent to (C', P). This result is, however, contrary to Assumption VII.

* Cf. Euclid, I, 7.

In case the line CC' does not meet the line AB, A and B are on the same side of the line CC'.
Let O be the mid-point of the segment CC'; let P be a point in the order {AOP} and interior to ∠CBC'; then by the results of V the segment PB meets the segment CC' in a point Q.

[Figs. 40, 41.]

Since (O, C), (C, A), (A, O), (A, P) are congruent respectively to (O, C'), (C', A), (A, O), (A, P) it follows by Assumption XI that (C, P) is congruent to (C', P). Hence ∠CBP is congruent to ∠C'BP and hence (C, Q) is congruent to (C', Q). Hence Q is a mid-point of the segment CC' as well as O, contrary to Theorem 35.

Corollary. If ∠ABC is congruent to ∠ABC' in such a way that A corresponds to itself and C' is on the same side of the line AB with C then the rays BC and BC' are identical.

Theorem 37. If ABC and A'B'C' are any two planes they are congruent in such a way that B corresponds to B', the ray BC to the ray B'C' and the half plane containing A and bounded by BC to the half plane containing A' and bounded by B'C'.

[Fig. 42.]

Proof. Let P be a point (Theorem 34) in the first half plane such that (P, B), (B, C), (C, P) are congruent and let C" be a point in the ray B'C' such that (B, C) is congruent to (B', C") and let P' be a point in the second half plane such that (P', B'), (B', C"), (C", P') are congruent. Then by Theorem 31 the triangles PBC and P'B'C" are congruent and this determines a correspondence between the points of the two triangles. A correspondence between the two planes may be determined as follows:

Let O be the mid-point of the segment BC and O' the mid-point of the segment B'C". If X is any point in the first plane the line OX meets the broken line BPC in a point, S. Let S' be the point congruent to S in the triangle B'P'C" and let X' be a point of the line O'S' such that (O', X') is congruent to (O, X) and on the same side of O' with S' or not, according as X is on the same side of O with S or not.
To complete the proof of the theorem it is necessary to show that if X1 and X2 correspond in this way to X1' and X2' then (X1, X2) is congruent to (X1', X2'). This is obvious if X1 and X2 are on the same line through O. If they are on different lines, let S1 and S2 be the points in which the lines OX1 and OX2 meet the broken line BPC and let S1' and S2' be the corresponding points on the broken line B'P'C". By Theorem 31, (O, S1), (O, S2) and (S1, S2) are congruent respectively to (O', S1'), (O', S2'), and (S1', S2'). Hence by Theorem 30, ∠S1OS2 is congruent to ∠S1'O'S2' and (X1, X2) is congruent to (X1', X2').

With the aid of this theorem any plane figure may be superposed upon any other to determine whether or not they are congruent.

As an obvious corollary of the proof of this theorem we have:

Corollary. If ∠ABC is congruent to ∠A'B'C' in such a way that B corresponds to B' and D is a point in the interior of ∠ABC and B'D' a ray such that ∠ABD is congruent to ∠A'B'D', B corresponding to B', then the ray B'D' is interior to ∠A'B'C'.

Definition. An angle congruent to either of its supplementary angles in such a way that the vertex of the angle corresponds to itself is called a right angle. The two sides of the angle are said to be perpendicular, as are also the two lines containing these rays.

Corollary. The two supplementary angles and the vertical angle of a right angle are right angles.

Theorem 38. If P is any point and AB any line there is one and only one line through P and perpendicular to AB in any plane containing A, B and P.

Proof. Suppose first that P is on the line AB. Let M and N be on opposite sides of P and such that (M, P) is congruent to (P, N) and let C be the third vertex of an equilateral triangle of which MN is a side. Since (C, P), (C, N), (N, P) are congruent respectively to (C, P), (C, M), (M, P), ∠CPM is congruent to ∠CPN and hence the line CP is perpendicular to the line AB.
If some other line DP were also perpendicular, D would be on the same side of the line CP with M or with N. As the two cases are treated alike, suppose that D is on the same side with N. Then the segment MD meets the line CP in a point E. Let E' be the point on the opposite side of P from E such that (P, E) is congruent to (P, E'). Also let D' be the point on the ray ME' such that (M, D) is congruent to (M, D'). Since we have the order {MED} we also have the order {ME'D'} (Theorem 24) and hence the segment DD' meets the line MN in a point Q different from P.

FIG. 43.

By the last corollary and Theorem 30, (M, E) is congruent to (M, E'); hence (P, D) is congruent to (P, D'); hence (D, Q) is congruent to (D', Q) and (D, N) is congruent to (D', N). Since the line DP is assumed perpendicular to the line MN, (D, N) is congruent to (D, M) and it follows that (D, M), (D, N), (D', N), (D', M) are all congruent. Hence we have (D', D), (D', N) and (D, N) congruent to (D, D'), (D, M) and (D', M) respectively and thus have ∠ND'Q congruent to ∠MDQ. Hence (M, Q) is congruent to (N, Q) although Q ≠ P. This contradicts Theorem 35. Thus we have shown not only that the line PC is the only perpendicular to the line MN at P but also that if D is any point off the line CP, (D, M) is not congruent to (D, N).

Now if P is not on the line AB, let P' be a point on the opposite side of the line AB (Theorem 37), such that ∠P'AB is congruent to ∠PAB and (P, A) is congruent to (P', A). The line PP' is easily seen to be perpendicular to the line AB.

FIG. 44.

Corollary 1. Definition. If A and B are any two points, in any plane containing A and B the line through the mid-point of the segment AB perpendicular to the line AB contains all points P of the plane such that (P, A) is congruent to (P, B). This line is called the perpendicular bisector of (A, B).

Corollary 2.* All right angles are congruent to one another.

Theorem 39.
A set of three non-collinear points cannot be congruent to a set of three collinear points.

Proof. Let A, B, C be any three non-collinear points and P, Q, R three collinear points. If (A, B), (B, C), (C, A) were congruent respectively to (P, Q), (Q, R), (R, P) then by the theorems of VI there would be a point D on the line CA such that (A, B) and (C, B) were congruent to (A, D) and (C, D) respectively. Let M be the mid-point of (B, D). Then since (A, B) and (A, D) are congruent, the line AM is perpendicular to the line BD and since (B, C) and (D, C) are congruent C must be on the line AM (Theorem 38), contrary to the hypothesis that A, B and C are non-collinear.

FIG. 45.

* Cf. Euclid, Postulate 4.

Theorem 39 makes it possible to prove the converses of the theorems on angles and triangles in the last section. In that section we proved that two triangles were congruent if the three sides of one were congruent to the three sides of the other. We were not, however, prepared to say that if two triangles were congruent the vertices of one corresponded to the vertices of the other. It might have happened that the sides AB, BC of one triangle corresponded to the side PQ of another, and that the third side, CA, corresponded to the broken line QRP. This possibility is excluded by Theorem 39. We now easily see that if the triangle ABC is congruent to the triangle PQR, the point-pairs (A, B), (B, C), (C, A) are congruent to the three point-pairs (P, Q), (Q, R), (R, P). We shall take these converse theorems for granted without further proof.

It is also evident that Theorem 24, and the corollary of Theorem 37, may be generalized to read: If two planar figures are congruent, the points and lines of one figure are in the same order relations as are the corresponding points and lines of the other figure.

Theorem 40. A line, l, containing a point interior to a circle meets the circle in two and only two points.

Proof.
Let O be the centre of the circle and I the given interior point and J the point of the line l such that the line OJ is perpendicular to it. Let Q be on the opposite side of J from O and such that (O, J) is congruent to (Q, J). We shall prove first that J is interior to the circle. If I = J this is evident. If not, let I' be the point of the ray OJ such that (O, I') is congruent to (O, I). Let J' be a point of the ray OI such that OJ is congruent to OJ'. If I' were in the order {OI'J} then J' would be in the order {OIJ'}. In this case let K be the mid-point of the segment JJ'. It follows that ∠J'KO is a right angle. Now let J'S be a ray on the same side of the line JJ' with O making a right angle with JJ'. The ray J'S cannot meet the interval OK because this would imply two perpendiculars to JJ' from the point of intersection. Hence the ray J'I is interior to the right angle ∠JJ'S. On the other hand the ray JJ' is exterior to the right angle ∠OJI. But ∠JJ'I is congruent to ∠OJJ', and thus we have a contradiction with the corollary of Theorem 37. Hence we have established the order {OJI'}.

FIG. 46. FIG. 47.

Since the point of the circle on the ray OI is on a prolongation of the segment OI the point C of the circle on the ray OJ is on the prolongation of the segment OI'. Hence J is an interior point.

The circle with Q as a centre and with a radius congruent to the interval OC has a point D on the prolongation of OQ beyond Q which must be in the order {OCD}. Hence D is outside the first circle. If C' and D' are the points of the first and second circles respectively on the prolongation of QC beyond C then it follows by Theorem 24 that we must have {QD'C'}. Hence D' is interior to the first circle. Hence the two circles have two points U, V, in common. Since (O, U) and (O, V) are congruent to (Q, U) and (Q, V) respectively, it follows by Theorem 38, Corollary 1, that U and V are on the line l.
Since any point W common to the line l and the first circle would be such that OW is congruent to QW and hence be on the second circle there are, by Theorem 36, only two points common to the line and the first circle.

With the aid of this theorem there is no difficulty in proving that a circle decomposes its plane into two regions, the interior and the exterior. It is also easy to derive the earlier propositions of the first book of Euclid's Elements. In most cases Euclid's own proof can be used. It will be an excellent exercise for the reader to work through those of the first twenty-eight propositions which we have not already taken up. It will be necessary for him to define certain terms such as addition of segments and angles which are not explicitly defined by Euclid. The logical bases for these definitions will be found in our development of the elementary theory of order and congruence. The propositions in question are as follows:

2. To construct* at a given point a segment congruent to a given segment.

3. Given two non-congruent segments, to cut off from the greater a segment congruent to the less.

6. If in a triangle the two angles be congruent to one another, the sides which subtend the congruent angles will also be congruent to one another.

9. To bisect a given angle.

11. To draw a line at right angles to a given line from a given point on it.

12. To a given line from a given point not on it, to draw a perpendicular line.

13. If a ray set up on a line make angles it will either make two right angles or angles equal to two right angles.

14. If with any line two rays on opposite sides of it make the adjacent angles equal to two right angles the two rays will be in the same line with one another.

16. In any triangle, if one of the sides be produced the exterior angle is greater than either of the interior and opposite angles.

17. In any triangle two angles taken together in any manner are less than two right angles.

18. In any triangle the greater side subtends the greater angle.

19. In any triangle the greater angle is subtended by the greater side.

20. In any triangle two sides taken together in any manner are greater than the remaining one.

21. If on one of the sides of a triangle, from its ends there be constructed two lines meeting within the triangle, the segments so constructed will be less than the remaining two sides of the triangle, but will contain a greater angle.

22. Out of three intervals congruent to three given intervals to construct a triangle: thus it is necessary that two of the intervals taken together in any manner should be greater than the remaining one.

23. On a given line at a given point to construct an angle congruent to a given angle.

24. If two triangles have the two sides congruent to two sides respectively but have one of the angles contained by the congruent sides greater than the other they will also have the base greater than the base.

25. If two triangles have the two sides congruent to two sides respectively, but have the base greater than the base, they will also have the one of the angles contained by the sides greater than the other.

26. If two triangles have the two angles congruent to two angles respectively and one side congruent to one side, namely, either the side adjacent to the congruent angles, or that subtending one of the congruent angles, they will also have the remaining sides congruent to the remaining sides and the remaining angle to the remaining angle.

27. If a line falling on two lines make the alternate angles congruent to one another, the lines will not meet.

28. If a line falling on two lines make the exterior angle congruent to the interior and opposite angle on the same side, or make the sum of the interior angles on the same side two right angles, the lines will not meet.

* By means of a compass which closes when lifted from the plane.

IX.
PARALLEL LINES *

* From this point forward the essay is a mere outline, intended to suggest how the rest of the subject may be developed.

The next assumption which we shall set down is the justly famous assumption of Euclid about parallel lines. It has been stated in many different forms, of which the following is perhaps the simplest.

Assumption XIII. If A is any point and a any line not passing through A, there is not more than one line through A coplanar with a and not meeting a.

FIG. 48.

That there is at least one line through A, coplanar with a and not meeting it is easily seen by dropping a perpendicular AB from A to a, and observing that the perpendicular, b, to the line AB at the point A could not meet a without contradicting Theorem 38. The same result follows directly from Euclid, I, 27 or I, 28.

The assumption of parallels was stated by Euclid in his Postulate 5 as follows: "If a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than the two right angles."

FIG. 49.

This is a consequence of our assumption, for let the rays AC and BD be such that the sum of the angles ∠CAB and ∠ABD is less than two right angles. Let AE be the ray such that the sum of ∠EAB and ∠ABD is two right angles. Then by Euclid, I, 28, the line AE does not meet the line BD. Hence the line AC does meet the line BD. Since the sum of the angles ∠CAB and ∠ABD is less than two right angles and the sum of ∠EAB and ∠ABD is equal to two right angles, it follows that the ray AC falls within ∠EAB. Hence the ray AC is on the same side of the line AE with the line BD. It is also on the same side of the line AB with the ray BD. Therefore the point of intersection of the line AC with the line BD is on the rays AC and BD.
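The conclusion that the meeting point lies on both rays can be checked numerically in a coordinate model. The sketch below is only an illustration and is not part of the synthetic development: it assumes the ordinary Cartesian plane with A at the origin and B on the positive x-axis, and the names `meeting_point`, `alpha`, `beta` and `ab` are hypothetical. Here `alpha` stands for ∠CAB and `beta` for ∠ABD, both in radians.

```python
import math

def meeting_point(alpha, beta, ab=1.0):
    """Return (x, y, t, s) for the point where ray AC meets ray BD.

    Coordinate-model assumption: A = (0, 0), B = (ab, 0), and both rays
    point into the upper half plane. t and s are the distances of the
    meeting point from A and from B respectively.
    """
    total = alpha + beta
    if not (alpha > 0 and beta > 0 and total < math.pi):
        # Sum of interior angles not less than two right angles:
        # the hypothesis of Postulate 5 fails.
        raise ValueError("interior angles must be positive and sum to less than pi")
    # Law of sines in the triangle formed by A, B and the meeting point.
    t = ab * math.sin(beta) / math.sin(total)
    s = ab * math.sin(alpha) / math.sin(total)
    return (t * math.cos(alpha), t * math.sin(alpha), t, s)
```

Both `t` and `s` come out positive whenever the two angles sum to less than two right angles, which is the coordinate counterpart of the statement that the intersection lies on the rays AC and BD rather than on their prolongations.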
For a further discussion of the theory of parallels the reader may consult Euclid's Elements, and the memoir in this collection by Professor Woods.

X. MENSURATION

Defining the sum of two segments and a multiple of a segment (or point-pair) and the terms equality and inequality of segments in the obvious way, it is easy to prove first that if A and B are any two points and n is any whole number, there is a point C on the line AB such that n(A, B) = (A, C), and second that there is a point D such that n(A, D) = (A, B). From this it follows that if m and n are any whole numbers there exists a point E such that m(A, B) = n(A, E). Thus, with an extension of our definition, we have that (m/n)(A, B) = (A, E). Calling m/n the ratio of (A, E) to (A, B) this states that there is a point-pair having to (A, B) the same ratio as that of any two whole numbers. Two such segments are said to be commensurable. It is not hard to show that there are segments which are not commensurable and there is thus propounded the problem of extending the notion of ratio to incommensurable segments.

Euclid's method of doing this is a purely geometrical one, and similar methods have been preferred by nearly all the great geometers, the latest notable example being the Algebra of Segments of Hilbert. The method, however, which is more or less approximated to in elementary teaching, is that of defining the ratio of two incommensurable segments as an irrational number. The theory of irrational numbers is taken for granted from arithmetic and algebra. The following proposition, known as the Postulate of Archimedes, is fundamental in this method.

Assumption XIV. If A, B, C are three points in the order {ABC} and B1, B2, B3,... are points in the order {ABB1}, {AB1B2},... such that (A, B) is congruent to each of the point-pairs (B, B1), (B1, B2),... then there are not more than a finite number of the points B1, B2,... between A and C.

FIG. 50.
In other words, by laying off the segment AB a finite number of times in the way indicated a point is reached which is beyond C; that is to say, there exists a number n such that (A, C) < n(A, B). Another phrasing of this assumption would be: there exists no infinitely great interval (A, C). A direct consequence of Assumption XIV is that if D is any point of the ray AB there exists a number n such that (1/n)(A, B) < (A, D), for, if not, there would exist no number n such that (A, B) < n(A, D). This may be expressed by saying that there is no infinitely small interval.

Let A0 and A1 be any two points and let us denote by A_{m/n} that point of the ray A0A1 which is such that the ratio of the segment A0A_{m/n} to A0A1 is m/n. If B is any point of the ray such that (A0, B) is incommensurable with (A0, A1), the points [A_{m/n}] fall into two classes, those on the segment A0B, which we may call [A_x] and those on its prolongation which we may call [A_y]. The numbers, [x], associated with points in the first class, are all less than the numbers [y] associated with points in the second class.

FIG. 51.

With the aid of Assumption XIV it can be proved that B is the only point which is between every A_x and every A_y. By Dedekind's principle* of definition of the irrational numbers there exists a unique irrational number, b, greater than every x and less than every y. This number, b, we define to be the ratio of the segments A0B and A0A1. Since any segment whatever is congruent to one of the segments A0A1 which have A0 as one end, we have now established a scale of magnitudes for the comparison of segments and are in a position to develop a complete theory of proportion.

The theory of the measure (that is to say, length) of segments depends essentially on showing how to arrange segments in order of magnitude.
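The two ideas above, the Archimedean multiple and the separation of the rational points into two classes, can be illustrated with exact arithmetic. The following is a minimal sketch under stated assumptions and not part of the essay: segments are represented by positive rational lengths, the function names are hypothetical, and the incommensurable ratio is taken, purely as an example, to be that of the diagonal of a square to its side, so that m/n falls in the lower class exactly when m·m < 2·n·n.

```python
from fractions import Fraction
from math import floor

def archimedean_multiple(ab, ac):
    """Smallest whole number n with ac < n * ab (lengths as positive Fractions)."""
    if ab <= 0 or ac <= 0:
        raise ValueError("lengths must be positive")
    # floor(ac / ab) copies of ab do not pass C; one more copy does.
    return floor(ac / ab) + 1

def cut_class(m, n):
    """Class of the rational point A_{m/n} for the example ratio b with b*b = 2.

    'lower' means the point is on the segment A0B, 'upper' that it is on
    the prolongation. No positive whole numbers satisfy m*m == 2*n*n, so
    every rational point falls in exactly one class.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive whole numbers")
    return "lower" if m * m < 2 * n * n else "upper"

if __name__ == "__main__":
    print(archimedean_multiple(Fraction(3, 7), Fraction(5)))
    print([(m, n, cut_class(m, n)) for m, n in [(7, 5), (3, 2), (41, 29), (17, 12)]])
```

Only integer comparisons are used in `cut_class`, so the classification never depends on a decimal approximation of the irrational ratio.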
In like manner, the theory of the measure, that is to say, of the area, of regions in the plane depends on showing how to arrange areas in an order of magnitude. For the purpose of elementary geometry we may confine attention to convex regions. A convex region A may be said to be less than a convex region B if it is possible to decompose A into a finite set of convex regions congruent to a non-overlapping set of convex regions contained in B, and such that B contains at least one convex region not in this set. Two convex regions A and B may then be said to be equivalent if neither is less than the other.

* See Monograph IV, Appendix I.

In order to give this definition value it must be proved that two congruent regions are equivalent. This amounts to proving the following proposition: It is not possible to decompose two congruent convex regions R1, R2 into convex regions so that all the regions into which R1 is decomposed are congruent to a subset of the regions into which R2 is decomposed.

By associating with an arbitrary square the number 1, a number, called the area, can now be assigned to each region in such a way that two equivalent regions have the same area, and if one region is less than another the less region has the smaller area. The theory of volumes can be developed similarly.

It has been shown by Hilbert that a theory of the areas of polygonal regions can be developed independently of Assumption XIV, and by Dehn that a fully corresponding theory for polyhedral regions does not exist. On this subject the reader should consult the second edition of Hilbert's Grundlagen der Geometrie * and the article by Amaldi, "Sulla teoria dell' equivalenza," in Questioni riguardanti la Geometria Elementare,† edited by F. Enriques.

XI. THREE-DIMENSIONAL SPACE

Definition. If A, B, C, D are four points not all in the same plane the set of all points on and interior to the four triangles ABC, BCD, CDA, ABD, is called a tetrahedron.
The set of all points collinear with pairs of points of a tetrahedron is called a three-space.

By a discussion‡ analogous to that made in IV it is possible to prove that if A'B'C'D' are any four points of a three-space ABCD, then the three-space ABCD is identical with the three-space A'B'C'D'; that if two points of a line lie in a given three-space ABCD, then so do all points of the line; that if three points of a plane lie in a three-space, so do all points of the plane; and that if and only if two planes are in the same three-space they have a line in common.

FIG. 52.

* Leipzig, 1903.
† Bologna, 1900.
‡ Cf. Transactions of the American Mathematical Society, Volume V, page 360.

The notion of a three-dimensional region can then be defined and studied analogously to V. Congruent figures can be defined as in VI.

Assumption VI provided for the existence of a plane, but since nothing has as yet been said about the existence of points which are not coplanar, we add the following:

Assumption XV. If A, B, C are three non-collinear points, there exists a point D not in the same plane with A, B and C.

Assumption XVI. Two planes which have one point in common have two points in common.

Assumption XV provides for the existence of at least one three-space and from Assumption XVI it follows that all points are in the same three-space. All the theorems of elementary three-dimensional geometry can be developed on the basis of these assumptions. But to do so would be to write a large book.*

* A book giving a complete and rigorous treatment of elementary geometry would be a most important influence in improving the teaching of the most ancient and perfect of sciences. Such a book could rarely, if ever, be used in the classroom, but if it were in the hands of the teachers it would serve to keep before them in something like its actual form the structure of which they are trying to give their students a first glimpse.
XII. CONCLUSION

The logically important questions as to the independence and categoricalness of our assumptions must be passed over with a reference to the two papers in the Transactions on which this essay is based. The ideas of consistency, independence and categoricalness (sufficiency) are explained in the essay by Professor Huntington in this book, and the independence of Assumption XIII is established in the essay by Professor Woods. A reader who is sufficiently interested to pursue the subject further is strongly urged to go into the question of the independence of the assumptions and to try to discover for himself some of the examples which constitute the independence proofs. For convenience in this sort of study we have collected the assumptions in the following list.

I. If points A, B, C are in the order {ABC} they are distinct.

II. If points A, B, C are in the order {ABC} they are not in the order {BCA}.

Definition. If A and B are distinct points the segment AB consists of all points, X, in the order {AXB}; all points of the segment AB are said to be between A and B; the segment together with A and B is called the interval AB; the line AB consists of A and B and all points, X, in one of the orders {ABX}, {AXB}, {XAB}; and the ray AB consists of B and all points, X, in one of the orders {AXB} and {ABX}; A is called the origin of the ray AB.

III. If points C and D (C ≠ D) are on the line AB, then A is on the line CD.

IV. If A and B are two distinct points, there exists a point C such that A, B and C are in the order {ABC}.

V. If three distinct points A, B and C do not lie on the same line and D and E are two points in the orders {BCD} and {CEA}, then a point F exists in the order {AFB} and such that D, E and F lie on the same line.

FIG. 53.

VI. There exist three distinct points, A, B, C, not in any of the orders {ABC}, {BCA}, {CAB}.

Definition.
If A, B, C are three non-collinear points, the set of all points collinear with pairs of points on the intervals AB, BC, CA is called the plane ABC. The points X of the plane such that the interval AX does not contain a point of the line BC constitute, together with A itself, one side of the line BC. The other points of the plane, not on the line BC, constitute the other side of the line BC.

The notation (A, B) denotes a pair of distinct points.

VII. If A ≠ B, then on any ray whose origin is C there exists one and only one point D such that (A, B) is congruent to (C, D).

VIII. If (A, B) is congruent to (C, D) and (C, D) is congruent to (E, F) then (A, B) is congruent to (E, F).

IX. If (A, B) is congruent to (A', B') and (B, C) is congruent to (B', C') and {ABC} and {A'B'C'}, then (A, C) is congruent to (A', C').

X. (A, B) is congruent to (B, A).

XI. If A, B, C are three non-collinear points and D is a point in the order {BCD}, and if A'B'C' are three non-collinear points and D' is a point in the order {B'C'D'} such that the point-pairs (A, B), (B, C), (C, A), (B, D) are respectively congruent to (A', B'), (B', C'), (C', A'), (B', D') then (A, D) is congruent to (A', D').

FIG. 54.

Definition. If O and X0 are two points of a plane a, then the set of points [X] of a such that (O, X) is congruent to (O, X0) is called a circle. O is called its centre and any of the intervals OX is called a radius. The points, except the points [X], on radii of the circle are said to be interior to the circle. The points of a not on radii are said to be exterior to the circle.

XII. A circle passing through a point, A, interior and a point, B, exterior to another circle in the same plane has in common with the other circle at least one point on each side of the line AB.

XIII. If A is any point and a any line not passing through A, there is not more than one line through A coplanar with a and not meeting a.

XIV.
If A, B, C are three points in the order {ABC} and B1, B2, B3,... are points in the order {ABB1}, {AB1B2},... such that (A, B) is congruent to each of the point-pairs (B, B1), (B1, B2),..., then there are not more than a finite number of the points B1, B2,... between A and C.

XV. If A, B, C are three non-collinear points, there exists a point D not in the same plane with A, B and C.

XVI. Two planes which have one point in common have two points in common.

II

MODERN PURE GEOMETRY

By THOMAS F. HOLGATE

CONTENTS (BY SECTIONS)

I. INTRODUCTION: 1-4
II. SIMPLE ELEMENTS IN GEOMETRY: 5
III. THE PRINCIPLE OF DUALITY: 6-11
    6, Duality in space; 7-8, Examples of duality; 9-11, Duality in a plane.
IV. PRINCIPLE OF CONTINUITY: 12-13
    13, Imaginary intersections.
V. POINTS AT INFINITY: 14-19
    14-17, Infinitely distant elements; 18-19, Postulate of parallels.
VI. FUNDAMENTAL THEOREM: 20-26
    21, Perspective triangles; 22, Perspective quadrangles; 23-26, Harmonic points.
VII. METRIC PROPERTIES: 27-29
VIII. ANHARMONIC RATIOS: 30-33
    30, Definition; 31-33, Six anharmonic ratios.
IX. ELEMENTARY GEOMETRIC FORMS: 34
X. CORRELATION OF ELEMENTARY FORMS: 35-40
    39, Construction of corresponding elements.
XI. CURVES AND SHEAVES OF RAYS OF THE SECOND ORDER: 41-48
    48, Classification of conics.
XII. PASCAL'S AND BRIANCHON'S THEOREMS: 49-52
XIII. POLE AND POLAR THEORY: 53-58
    57, Conjugate points and lines; 58, Centre and diameters.
XIV. CONCLUSION: 59

I. INTRODUCTION

1.
In Analytical Geometry conclusions are reached through the application of algebraic processes to geometric properties and relations. By making use of certain conventions the given relations are expressed in algebraic language, then certain algebraic operations are performed and the results are reinterpreted as geometric propositions. During the process the geometric concept may be entirely lost sight of and the resulting statement may bear no apparent relation to the premises from which it was derived.

In Pure Geometry, on the other hand, the geometric concept is kept continually in mind throughout the reasoning process, and the steps by which a conclusion is reached from given conditions are readily traceable.

2. Pure geometry was cultivated by peoples of the earliest times. By them many important theorems were discovered on the relations of triangles and other rectilinear forms, on the properties of circles and spheres, and on areas, ratios, and the equality and similarity of geometric figures. The investigations of the ancient geometers were carried so far as to include the conic sections and certain curves of higher order whose principal properties were discovered, but the methods used were fragmentary and the results for the most part were disconnected.

The ancient geometry is typified most clearly by Euclid's Elements, which was in fact a collation and systematic arrangement of the geometric knowledge of his time. In it properties and relations are demonstrated each by itself, and little attention is paid to relations common to all forms of the same class. The method of Euclid has come to be known as the method of Elementary Geometry, and the subject-matter of his Elements has prescribed the field of elementary geometry.

3.
The methods of the ancient geometers were not materially modified till the period of the revival of learning early in the sixteenth century, when with the introduction of certain new concepts, and the application of well-known older ones, as, for example, infinitely distant elements, the harmonic division of a line segment, the principle of continuity, and the theory of imaginary intersections, the science began to take on a more generalized form. The renewed activity in geometric research resulted in the invention by Descartes of the analytical geometry, and for two and a half centuries investigations by purely geometric methods were for the most part pushed aside. Happily interest in pure geometry was revived toward the close of the eighteenth century through the publications of Monge, and during the first half of the nineteenth century it reached its highest development at the hands of Poncelet, Steiner, Von Staudt, and Chasles.

4. Modern pure geometry differs from the geometry of earlier times not so much in the subjects dealt with as in the processes employed and the generality of the results obtained. Much of the material is old, but by utilizing the principle of projection and the theory of transversals, facts which were thought of as in no way related prove to be simply different aspects of the same general truth. This generalizing tendency is the chief characteristic of modern geometry, and while it may perhaps be attributed largely to the influence of the analytic method, still it is true that some progress had been made in this direction before the analytic method was invented, and pure geometry has done much in recent times to enliven and heighten the interest in analysis.

II. SIMPLE ELEMENTS IN GEOMETRY

5. Points, straight lines, and planes are the simple undefined elements of pure geometry.
Each of these may be thought of as having an existence independent of the others; a plane may be thought of without considering the lines and points which lie in it; we may think of a line without considering the points which lie on it or the planes which pass through it, and of a point without considering either the lines or the planes which pass through it. In fact each of these simple elements may be the base on which rest an indefinite number of elements of either of the other kinds.

III. THE PRINCIPLE OF DUALITY

6. Duality in space. Two points will fix the identity of a straight line and three points will in general determine a plane. So also two planes intersect in a straight line and three planes in general have one point in common. If three points lie in a specialized relative position, namely, in a straight line, then many planes pass through them. Similarly, if three planes be in a specialized relative position, namely, with one line in common, then many points lie in all three. But apart from such special cases the following statements may be made:

a1. Three points determine a plane.
a2. Three planes determine a point.
b1. Two lines which have a common point determine a plane.
b2. Two lines which have a common plane determine a point.
c1. A line and a point determine a plane.
c2. A line and a plane determine a point.

In these statements taken two and two there will be noted an interchangeable relation between the elements point and plane, and between line and line. This is spoken of as a dual relation, and in accordance with it any geometric form will yield another by replacing every point in one by a plane in the other, and every line joining two points in one by a line the intersection of two planes in the other.
If in the original figure three planes meet at a point, in the dual or reciprocal figure three points will lie in a plane; or if in the original figure four lines lie in a plane, in the reciprocal four lines will meet in a point.

7. Examples of duality. A cube consists of eight vertices (points), six plane faces, and twelve edges each the intersection of two faces and joining two vertices. Its dual or reciprocal figure, therefore, consists of eight plane faces, six points (vertices), and twelve edges each joining two vertices and the intersection of two faces. In the original figure the faces meet by threes in the vertices, and also the edges meet by threes in the vertices, while four edges lie in each face. In the reciprocal figure, the vertices must lie by threes in the faces, and also the edges lie by threes in the faces, while four edges meet in each vertex. This reciprocal figure is readily seen to be an octahedron. The cube and the octahedron may thus be spoken of as dual or reciprocal figures. In the same way it will be seen that the dual of a tetrahedron is again a tetrahedron, and the dual of a dodecahedron is an icosahedron.

8. This principle, by which a theorem on points, lines, and planes may be deduced from another on planes, lines, and points, by simple interchange is called the principle of duality. It was made much use of by Poncelet, but was first announced as an independent principle by Gergonne (1826), and plays an important part in modern geometry. Its application is to purely descriptive properties and not in general to properties involving measurement.

9. Duality in a plane. If the forms under consideration are confined to a single plane, that is, if we are dealing only with plane geometry, the duality is between point and line, since in plane geometry two points determine a line and two lines determine a point.
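The counts given in section 7 for the cube and its reciprocal can be verified mechanically by treating a figure purely as an incidence structure. The sketch below is an illustration only and makes several assumptions not found in the text: the cube is modelled by the eight 0/1 coordinate triples, a face is the set of vertices it contains, an edge is taken to be a pair of vertices lying together in exactly two faces, and all function names are hypothetical. The dual is obtained simply by interchanging the roles of vertex and face.

```python
from itertools import combinations, product

def edges(points, blocks):
    """Pairs of points that lie together in exactly two blocks (faces)."""
    return [pair for pair in combinations(points, 2)
            if sum(1 for b in blocks if pair[0] in b and pair[1] in b) == 2]

def incidence_summary(points, blocks):
    """Counts of vertices, faces and edges, with edges per vertex and per face."""
    e = edges(points, blocks)
    return {
        "vertices": len(points),
        "faces": len(blocks),
        "edges": len(e),
        "edges_per_vertex": {sum(1 for pair in e if p in pair) for p in points},
        "edges_per_face": {sum(1 for a, b in e if a in blk and b in blk)
                           for blk in blocks},
    }

# The cube: vertices are 0/1 triples; each face fixes one coordinate.
cube_vertices = list(product((0, 1), repeat=3))
cube_faces = [frozenset(v for v in cube_vertices if v[axis] == value)
              for axis in range(3) for value in (0, 1)]

# The reciprocal figure: every face becomes a vertex, and every vertex
# becomes a face consisting of the (former) faces that passed through it.
dual_vertices = cube_faces
dual_faces = [frozenset(f for f in cube_faces if v in f) for v in cube_vertices]

if __name__ == "__main__":
    print(incidence_summary(cube_vertices, cube_faces))
    print(incidence_summary(dual_vertices, dual_faces))
```

Under these assumptions the second summary reports six vertices, eight faces and twelve edges, with four edges at each vertex and three in each face, which are the counts of the octahedron.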
To any number of points on a line in one of two reciprocal plane figures there will correspond in the other an equal number of lines through a point, and if three or more lines are concurrent in the one, their reciprocal points are collinear in the other.

10. As an illustration of reciprocal figures in a plane the following will serve:

Four points (vertices) A, B, C, D, of which no three are collinear, determine six lines (sides), namely, the lines joining the vertices two and two. The lines AB and CD may be called opposite sides in the figure, and similarly AC and BD are opposite sides, as are also AD and BC. The pairs of opposite sides determine three points P, Q, R (diagonal points), the vertices of what may be called the diagonal triangle. The figure so constructed is known as a complete quadrangle.

On the other hand, four lines (sides) a, b, c, d, of which no three are concurrent, determine six points (vertices), namely, the intersections of the sides two and two. The points ab and cd may be called opposite vertices in the figure, and similarly ac and bd are opposite vertices, as are also ad and bc. The pairs of opposite vertices determine three lines p, q, r (diagonals), the sides of what may be called the diagonal triangle. This figure is known as a complete quadrilateral.

[FIG. 1.] [FIG. 2.]

A complete quadrangle (Fig. 1) thus consists of four vertices, six sides, and three diagonal points; a complete quadrilateral (Fig. 2) consists of four sides, six vertices, and three diagonals. Similarly, a complete pentangle (pentagon) has five vertices and ten sides intersecting by fours in the vertices, while there may be found fifteen points in which only two sides intersect. A complete pentalateral, on the other hand, has five sides and ten vertices lying by fours on the sides, while there may be drawn fifteen lines on which only two vertices lie.

11.
As an illustration of how one theorem may be deduced from another by the principle of duality the following example will serve. The first theorem below is well known, having been stated by Pappus in the fourth century. The second is not so familiar, but follows immediately by interchange of point and line, or it may be demonstrated independently.

If three points A, C, E be chosen at random on a straight line p, and three others B, D, F be chosen at random on a straight line q, and these be joined in order AB, BC, CD, DE, EF, FA by straight lines 1, 2, 3, 4, 5, 6, then the intersections of 1 and 4, 2 and 5, 3 and 6 lie on a straight line r.

If three straight lines a, c, e be drawn at random through a point P, and three others b, d, f be drawn at random through a point Q, and the intersections of these two and two in order ab, bc, cd, de, ef, fa be denoted by 1, 2, 3, 4, 5, 6, then the lines 14, 25, 36 pass through one point R.

Other examples of duality will occur in the progress of this chapter.

IV. PRINCIPLE OF CONTINUITY

12. The principle of continuity, first assumed by Kepler and later by Desargues, asserts that a property which can be demonstrated for a particular figure will hold true if the figure should change its form in any manner subject to the conditions under which it was first constructed. This principle makes necessary an enlargement of the significance of many geometric terms so as to include what are called imaginary elements, and by the aid of these it permits the statement of general facts or theorems which otherwise would be subject to exceptions and limitations. The geometer makes no attempt to construct imaginary elements, but contents himself with the acceptance of their existence and of the principle that though by continuous change in a figure a property once proved may become unmeaning through the loss of real elements, it is still true when imaginary elements are taken into consideration.

13.
Imaginary intersections. As an illustration it may be stated that a straight line drawn through a fixed point P intersects a circle in two points. If the point P lies within the circle, the intersections are always real no matter how far the line rotates about P. If, however, the point is chosen outside the circle, the line in the first instance may cut the circle in two real points, but as it rotates about P the intersections will move so as first to fall together or become coincident, and after that they will disappear or become imaginary. To say that the line in this last phase intersects the circle may be without meaning under the ordinary conventions; yet it is assumed true, and the imaginary points of intersection play the same part in any general theorem as do the real points of intersection of the earlier phases. Thus the theorem that the product of the segments of a chord or secant of a circle remains constant while the secant rotates about a point comes to have an interpretation for all positions of the secant.

V. POINTS AT INFINITY

14. Infinitely distant elements. The introduction into geometry of the notion of infinitely distant elements has aided greatly in the process of generalization with which modern methods are chiefly concerned. Many exceptional cases which under earlier conditions would require special treatment are, by the addition of this concept, brought into conformity with a general statement.

15. Infinitely distant elements come most easily into view from the following considerations. Suppose a straight line b (Fig. 3), passing through a fixed point O, intersects the line a in a point P, and suppose the line b rotates about O as indicated by the arrow. The point of intersection P will move along the line a to the right until it is lost to view and then will immediately appear at the far left, moving along the line in the same sense as before.
The assumption is made that the two lines have not at any time ceased to intersect, and that the point P has moved continuously along the line a, disappearing at the far right and reappearing at the far left after passing through but a single position which lies outside the accessible region of the plane. In other words, it is assumed that on the line a, and so on any other straight line of the plane, or for that matter, on any straight line of the finite region, there is one and only one infinitely distant point. It is also assumed that this point makes the line continuous, so that we can pass from any one point of the line to any other point of the line by moving continuously along the line either to the right or to the left.*

* The present monograph deals only with the so-called Euclidean geometry. For the assumptions of non-Euclidean geometry, see Monograph III.

[FIG. 3.]

16. Two points will thus divide a straight line into two segments on one of which lie only finite points, while on the other lies the infinitely distant point. The first of these is sometimes called the internal segment, the second, the external segment. From this it follows that a point on a straight line cannot be separated from another point by a single third point. It requires two points to separate one point from another, just as on a ring or closed curve.

17. The assumption of a single infinitely distant point on the line a is equivalent to the assumption that through a point O there can be drawn one and only one straight line which does not meet a given line in the finite region, and that these lines do intersect in an infinitely distant point. This assumption makes possible the statement that any two straight lines of a plane intersect somewhere, if not in the finite region, then in an infinitely distant point.

Definition. Lines which intersect in an infinitely distant point are called parallel lines.

18. Postulate of parallels.
Euclid's twelfth axiom, which is more properly speaking a postulate, was his starting point for proving that through a given point one and only one line can be drawn parallel to a given line. Its assumption is consequently equivalent to that of a single infinitely distant point on a straight line. Most of the difficulty in the treatment of parallels which perplexed geometers for centuries was caused by the failure to recognize that this so-called twelfth axiom was an assumption and not a self-evident truth.

19. A single infinitely distant point on a straight line, or what is the same thing, a single line through a given point parallel to a given line, leads at once to the conclusion that all the infinitely distant points of a plane lie on one straight line and that any two parallel lines of a plane intersect in a point of this line. The following considerations will make this clear. If a line p should rotate about a point P, every point of the line would describe a continuous path in the plane, and this may be assumed also for the infinitely distant point. The infinitely distant path described by this point contains all the infinitely distant points of the plane and is such that it is cut by any straight line in only one point. It is therefore itself a straight line. From this it follows that any two parallel planes intersect in an infinitely distant straight line common to the two.

VI. FUNDAMENTAL THEOREM

20. In the development of modern geometry there have been differences among investigators as to the best mode of attack. Some geometers, Steiner and Chasles, for instance, have preferred to base their fundamental notions on certain metric properties, while others, notably Von Staudt, and after him Reye, and in a modified form Cremona, have preferred to use as starting point a purely positional relation, and thus avoid the necessity of recognizing measurement as fundamental.

21. Perspective triangles.
Following the latter method we announce as our fundamental fact the theorem on perspective triangles which was stated by Desargues early in the seventeenth century, but which was known much earlier, probably by Euclid.

If two triangles ABC and A1B1C1 are so situated that the lines AA1, BB1, and CC1 meet in a point S, then the pairs of corresponding sides c and c1, b and b1, a and a1 intersect in points of one straight line.

If two triangles abc and a1b1c1 are so situated that the sides a and a1, b and b1, c and c1 intersect in points of one straight line s, then the lines joining pairs of corresponding vertices AA1, BB1, CC1 meet in a point.

The truth of this theorem is evident if the triangles be chosen in different planes p and p1. For then the lines AA1 and BB1, meeting at S, determine a plane in which AB and A1B1 lie. These lines therefore intersect, and can meet only on the common line of the planes p and p1. Similarly AC and A1C1, also BC and B1C1, meet in points of this same straight line.

[FIG. 4.]

That the theorem is true also when the triangles lie in the same plane is seen most easily by projecting the whole figure, that is, the two given triangles, the lines joining corresponding vertices and meeting in S, and the line of intersection of the planes p and p1, from some point O, the eye for instance, thus forming a figure of ten lines and ten planes intersecting at O. In each plane will lie three of the lines and through each line will pass three of the planes. Any plane section of this projection will yield a figure consisting of two triangles so situated that the lines joining pairs of corresponding vertices intersect in one point while the pairs of corresponding sides intersect in points of one straight line.

The process of projecting a figure from some chosen centre and then taking a plane section, thus securing a new diagram, is a favorite one in modern geometry.
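The theorem on perspective triangles lying in one plane can be checked numerically. The sketch below is not the proof given in the text; it assumes homogeneous coordinates (a point (x, y) written as (x, y, 1), the join of two points and the meeting point of two lines both given by a cross product), and the particular triangles are arbitrary choices. The names `cross` and `det3` are hypothetical helpers.

```python
def cross(u, v):
    """Join of two points, or meeting point of two lines, in homogeneous form."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(p, q, r):
    """Zero exactly when three points are collinear (or three lines concurrent)."""
    c = cross(q, r)
    return p[0] * c[0] + p[1] * c[1] + p[2] * c[2]

# Centre S and two triangles whose corresponding vertices lie on rays through S.
S = (0, 0, 1)
A, A1 = (1, 0, 1), (3, 0, 1)   # on the ray y = 0
B, B1 = (0, 1, 1), (0, 4, 1)   # on the ray x = 0
C, C1 = (1, 1, 1), (5, 5, 1)   # on the ray y = x

perspective_from_S = all(det3(S, V, V1) == 0
                         for V, V1 in ((A, A1), (B, B1), (C, C1)))

# Intersections of the three pairs of corresponding sides.
X = cross(cross(A, B), cross(A1, B1))
Y = cross(cross(B, C), cross(B1, C1))
Z = cross(cross(C, A), cross(C1, A1))

print(perspective_from_S, det3(X, Y, Z) == 0)
```

Integer coordinates keep every step exact, so the vanishing determinant is an exact statement for this instance and not a rounding accident.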
The properties which remain unchanged by this process are called projective properties, and they are found to be numerous. Magnitudes are changed but, as will be seen later, certain relations among magnitudes remain unchanged, as do also properties of intersections, contact, collineation, and the like.

22. Perspective quadrangles. By repeated applications of the theorem on perspective triangles the following theorem on complete quadrangles is proved to be true:

"If two complete quadrangles are so situated that five pairs of corresponding sides intersect in points of one straight line then the sixth pair will also intersect in a point of that line."

Remembering that the reciprocal of a complete quadrangle is a complete quadrilateral made up of four lines and their six points of intersection, the dual theorem may be stated as follows:

"If two complete quadrilaterals are so situated that five of the lines joining pairs of corresponding vertices meet in one point, then the line joining the sixth pair of corresponding vertices will also pass through that point."

23. Harmonic points. Let ABCD be any complete quadrangle (Fig. 5) and let PQ be a line joining two diagonal points while R and S are the points in which the third pair of sides intersect PQ. Construct any other quadrangle such that one pair of sides will intersect in P, a second pair will intersect in Q, and a fifth side will pass through R. This is readily possible if the two sides through P be drawn at random and likewise the side through R be drawn at random, cutting the two already drawn at A' and C', respectively. Then QA' and QC' will determine the vertices D' and B'. Now, in these two quadrangles, one of which was drawn wholly at random, five pairs of sides intersect in points of the straight line PQ, hence the sixth pair BD and B'D' must also intersect on PQ.
In other words, if two points P and Q on a straight line be such that pairs of sides of a complete quadrangle intersect in them, while a fifth side passes through a third point R of this line, then the sixth side will of necessity pass through a definite point S determined by the first three.

[FIG. 5.]

These four points on the line are said to be harmonically related, or it may be said that the line segment PQ is harmonically divided at R and S, and we thus have a purely positional definition of the harmonic relation.

Definition. Four points on a straight line are harmonic when they are so situated that in two of them pairs of opposite sides of a complete quadrangle may intersect while the remaining sides pass through the other two.

In the diagram (Fig. 5) it should be noted that not only the points PQRS, but also the points ATCR fulfil the conditions specified for "harmonic points," as do also the points DTBS.

24. If the intersection of the sides AC and BD be the point T, and if PT intersect BC and AD in L and N respectively, and QT intersect AB and CD in K and M respectively, then the line KL must pass through R, since KBLT is a quadrangle of which one pair of opposite sides intersect in P, a second pair intersect in Q, while a fifth side passes through S. Similarly, NM passes through R, while KN and LM both pass through S. KLMN is thus a complete quadrangle with one pair of opposite sides intersecting in R, one pair in S, a fifth side passing through P, and the sixth side through Q. The points R and S in this modified diagram play exactly the same part as P and Q in the original diagram, while P and Q in the modified diagram play the same part as R and S in the original. Thus, if the segment PQ is harmonically divided at R and S, so also the segment RS is harmonically divided at P and Q.

[FIG. 6.]
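The quadrangle construction of the fourth harmonic point can be carried out numerically, which shows that S does not depend on the lines drawn at random. The sketch below follows the steps of sec. 23 (two sides through P, a fifth side through R cutting them in A' and C', then D' and B' from Q, and finally the sixth side B'D'). Homogeneous coordinates are an assumption of this illustration, as are the chosen points; `fourth_harmonic`, `cross` and `to_affine` are hypothetical names.

```python
from fractions import Fraction

def cross(u, v):
    """Join of two points, or meeting point of two lines, in homogeneous form."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def to_affine(p):
    """Ordinary coordinates of a finite point."""
    return (Fraction(p[0], p[2]), Fraction(p[1], p[2]))

def fourth_harmonic(P, Q, R, aux1, aux2, aux3):
    """Point S on PQ found from a quadrangle; aux1, aux2, aux3 fix the random lines."""
    base = cross(P, Q)
    p1, p2 = cross(P, aux1), cross(P, aux2)   # the two sides through P
    r = cross(R, aux3)                        # the fifth side, through R
    A_, C_ = cross(p1, r), cross(p2, r)       # A' and C'
    D_ = cross(cross(Q, A_), p2)              # D' lies on QA' and on the side through C'
    B_ = cross(cross(Q, C_), p1)              # B' lies on QC' and on the side through A'
    return cross(cross(B_, D_), base)         # the sixth side B'D' meets PQ in S

P, Q, R = (0, 0, 1), (4, 0, 1), (1, 0, 1)
S_first = to_affine(fourth_harmonic(P, Q, R, (1, 1, 1), (1, 2, 1), (0, 1, 1)))
S_second = to_affine(fourth_harmonic(P, Q, R, (2, 5, 1), (-1, 3, 1), (3, 7, 1)))
print(S_first == S_second)
```

Two unrelated choices of the auxiliary lines lead to one and the same point S, in agreement with the theorem on perspective quadrangles.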
The points P and Q are harmonic conjugates with respect to R and S, and in the same way, R and S are harmonic conjugates with respect to P and Q.

25. It is not difficult to show that the pairs of points P, Q and R, S must actually separate each other if they form a harmonic set, and that if P and Q remain fixed while R traverses the segment from Q to P internally, then S will traverse the segment from Q to P externally. From this it follows that if two pairs of points R, S and R', S' are harmonically separated by the same points P and Q, then R, S and R', S' cannot separate each other. Conversely, it is not difficult to show that if two pairs of points be chosen on a straight line so as not to separate each other, then a single pair may be found which will harmonically separate both pairs.

26. Suppose now the complete quadrangle ABCD with the harmonic points P, Q, R, S is projected from some point O outside the plane, and that a section is taken by a plane cutting the projection in a new quadrangle A'B'C'D', of which two sides intersect in a point P', two sides in Q', a fifth side passes through R', and the sixth side through S'. Then the points P'Q'R'S', any section of the rays OP, OQ, OR, OS, are harmonic. Hence if four harmonic points P, Q, R, S be projected from a centre O, any section P'Q'R'S' of the four projecting rays is a harmonic set of points. The rays OP, OQ, OR, OS are themselves also said to be harmonic.

VII. METRIC PROPERTIES

27. Thus far the harmonic relation of points on a line has been discussed from a purely positional standpoint and no question of measurement has been considered. To introduce magnitudes into our discussion let us assume the theorem that the diagonals of a parallelogram bisect each other. Then if AC be one diagonal of a parallelogram ALCN, bisected by the other diagonal LN at B, and if the pairs of opposite sides be produced to meet at the infinitely distant points K and M, respectively, KLMN may be looked upon as a complete quadrangle with one pair of sides KL and MN intersecting at A, a second pair KN and LM intersecting at C, a fifth side LN passing through B, and the sixth side KM (the infinitely distant line) intersecting AC in the infinitely distant point D. Then the segment AC is harmonically divided by the mid-point B and the infinitely distant point D. Thus any line-segment PQ is bisected at a point R when the harmonic conjugate of R with respect to P and Q is at infinity; or, the harmonic conjugate of the mid-point of a line-segment with respect to the extremities of the segment is the infinitely distant point of the line.

[FIG. 7.]

28. If a set of harmonic points ABCD be projected from any point O by rays OA, OB, OC, OD, and a section of these rays A'BC'D' be taken by a line drawn through B parallel to OD, the segment A'B will equal the segment BC', since D' is at infinity and the four points are harmonic. By similar triangles it follows at once that

$$\frac{AB}{AD} = \frac{BA'}{DO} = \frac{BC'}{DO} = \frac{BC}{CD},$$

or, interchanging the order of segments and giving attention to direction,

$$\frac{AB}{BC} = -\frac{AD}{DC}.$$

That is, the segment AC is divided internally at B and externally at D in the same ratio, a relation which is frequently taken as the definition of harmonic points.

[FIG. 8.]

It should be noted that by another interchange in the order and direction of segments, the ratio

$$\frac{AB}{BC} = -\frac{AD}{DC} \quad\text{becomes}\quad \frac{BA}{AD} = -\frac{BC}{CD},$$

which shows that not only is the segment AC divided at B and D in equal ratios, but also that the segment BD is divided at A and C in equal ratios, as has been already pointed out.

29.
From this property of harmonic points it is not difficult to demonstrate the following two:

(1) $\frac{1}{AB} + \frac{1}{AD} = \frac{2}{AC}$, from which immediately comes the identity of geometric harmonics with the algebraic harmonical progression; and

(2) $MB \cdot MD = MC^2$, where M is the mid-point of the segment AC.

VIII. ANHARMONIC RATIOS

30. Definition. If a line-segment PQ is divided by any two points R and S, and the ratios $\frac{PR}{RQ}$ and $\frac{PS}{SQ}$ be formed, and again the ratio of these two ratios be taken, we obtain the ratio

$$\frac{PR \cdot SQ}{PS \cdot RQ},$$

which is called the cross-ratio, or the anharmonic ratio, of the four points.

31. Six anharmonic ratios. For the same four points it is evident that there are six different anharmonic ratios according as PQ, PR, or PS is taken as the original segment, the other two points in each case being the division points, and as the ratio of ratios is taken in one order or the other. That there are not more than six different anharmonic ratios for the same four points, or that the segment RS, for example, with division points P and Q gives no new anharmonic ratio, is easily shown. Forming the anharmonic ratio as before, with RS as initial segment, it takes the form

$$\frac{RP \cdot QS}{RQ \cdot PS},$$

and this by reversal of segments is identical with the one previously written. Three of the anharmonic ratios of four points are reciprocals of the other three, since they are formed by taking the ratio of ratios in the reverse order. The six ratios therefore are

$$\frac{PR \cdot SQ}{PS \cdot RQ}, \qquad \frac{PQ \cdot SR}{PS \cdot QR}, \qquad \frac{PQ \cdot RS}{PR \cdot QS},$$

and their reciprocals. These six anharmonic ratios involve only the quantities $PQ \cdot RS$, $PR \cdot SQ$, $PS \cdot QR$, and their negatives. Now for any four points P, Q, R, S on a straight line it may be easily shown that

$$PQ \cdot RS + PR \cdot SQ + PS \cdot QR = 0.$$
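The identity just stated follows by writing each directed segment XY as the difference y − x of abscissas measured along the line and expanding, when all terms cancel in pairs. A short numerical check is sketched below; the use of abscissas is an assumption of this illustration, and the names `seg`, `three_products` and `cross_ratio` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def seg(x, y):
    """Directed segment XY for points given by their abscissas on the line."""
    return y - x

def three_products(p, q, r, s):
    """PQ.RS + PR.SQ + PS.QR, which should vanish for any four points."""
    return (seg(p, q) * seg(r, s)
            + seg(p, r) * seg(s, q)
            + seg(p, s) * seg(q, r))

def cross_ratio(p, q, r, s):
    """The ratio (PR.SQ)/(PS.RQ) of sec. 30, segment PQ divided by R and S."""
    return Fraction(seg(p, r) * seg(s, q), seg(p, s) * seg(r, q))

# Every ordered choice of four distinct abscissas from a small sample.
samples = list(permutations([-3, 0, 2, 5, 7], 4))
identity_holds = all(three_products(*quad) == 0 for quad in samples)

# The same ratio obtained as a ratio of two ratios, PR/RQ divided by PS/SQ.
p, q, r, s = 0, 4, 1, -2
ratio_of_ratios = Fraction(seg(p, r), seg(r, q)) / Fraction(seg(p, s), seg(s, q))
print(identity_holds, cross_ratio(p, q, r, s) == ratio_of_ratios)
```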
From this we derive

$$PQ \cdot SR = PS \cdot QR + PR \cdot SQ, \quad\text{or}\quad \frac{PQ \cdot SR}{PS \cdot QR} = 1 - \frac{PR \cdot SQ}{PS \cdot RQ}.$$

Also

$$PQ \cdot RS = PR \cdot QS + PS \cdot RQ, \quad\text{or}\quad \frac{PQ \cdot RS}{PR \cdot QS} = 1 - \frac{PS \cdot RQ}{PR \cdot SQ}.$$

Hence if the anharmonic ratio $\frac{PR \cdot SQ}{PS \cdot RQ} = k$, the ratio $\frac{PQ \cdot SR}{PS \cdot QR} = 1 - k$, and $\frac{PQ \cdot RS}{PR \cdot QS} = 1 - \frac{1}{k}$. Therefore, if one of the anharmonic ratios of four points on a straight line be equal to k, the remaining five are

$$\frac{1}{k}, \quad 1 - k, \quad \frac{1}{1 - k}, \quad \frac{k - 1}{k}, \quad\text{and}\quad \frac{k}{k - 1}.$$

In speaking of an anharmonic ratio it is clearly necessary to distinguish which of the six ratios is meant, and when the ratio has been formed in one order that order must be retained throughout the discussion in hand.

32. Take $\frac{PQ \cdot RS}{PR \cdot QS}$ as the anharmonic ratio of four given points. If two of the points, P and S, and also the other two, Q and R, be interchanged, the ratio becomes $\frac{SR \cdot QP}{SQ \cdot RP}$, which by reversal of segments is equal to the original ratio. Or if any other two, and also the remaining two, be interchanged, the ratio is unaltered. Hence

"If the anharmonic ratio of four points is formed in any order, the ratio is unchanged when we interchange two of the points and also the other two."

If the anharmonic ratio $\frac{PQ \cdot RS}{PR \cdot QS} = -1$, then $\frac{PQ}{QS} = -\frac{PR}{RS}$, or the segment PS is divided at Q and R in the same ratio and the four points are harmonic. In this case P and S must be separated by Q and R.

33. Take four points A, B, C, D on a line and project them from any centre O. Let p be the length of the perpendicular from O on the line. Now the area of the triangle OAB $= \tfrac{1}{2}\, p \cdot AB = \tfrac{1}{2}\, OA \cdot OB \sin AOB$, and there are similar relations for the other triangles. Hence the anharmonic ratio

$$\frac{AB \cdot CD}{AC \cdot BD} = \frac{\sin AOB \cdot \sin COD}{\sin AOC \cdot \sin BOD},$$

a quantity independent of p, and hence independent of the position of the line relative to the rays OA, OB, OC, OD.
Therefore

"If A', B', C', D' be a projection of the points A, B, C, D from any centre O, the anharmonic ratio of the former set of points is equal to the corresponding anharmonic ratio of the latter set; or, the anharmonic ratio of four points is unaltered by projection."

IX. ELEMENTARY GEOMETRIC FORMS

34. The whole system of points on a straight line is called a range of points, and the system of lines through a point, these lines being confined to one plane, is called a sheaf or pencil of rays. The system of planes passing through a line is a sheaf of planes. The aggregate of lines through a point, not confined to one plane, is called a bundle of rays, and the aggregate of planes through a point is a bundle of planes.

In plane geometry, the range of points and the sheaf of rays are reciprocal forms, while in three-dimensional geometry the range of points is reciprocal to the sheaf of planes and the sheaf of rays is reciprocal to itself. The bundle of rays and bundle of planes are reciprocal respectively to the rays of a plane and the points of a plane.

X. CORRELATION OF ELEMENTARY FORMS

35. Two ranges of points may be so correlated that to every point of one range there corresponds one and only one of the other. For example, two sections of the same sheaf of rays are correlated in this way if to each point of one range is correlated that point of the other which lies on the same ray. Similarly, two sheaves of rays may be correlated, one to one, if they project the same range of points. These are perhaps the simplest examples of one to one correlation, but other more complicated examples will readily occur to the reader.

Definition. When two elementary forms (ranges of points, sheaves of rays, or sheaves of planes) are so correlated that to every set of harmonic elements in one of them there corresponds a set of harmonic elements in the other, the forms are said to be related to each other projectively.
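That a projection followed by a section carries harmonic points into harmonic points, and more generally leaves the anharmonic ratio unaltered, can be illustrated numerically. The sketch below is an illustration under assumed rectangular coordinates, not the argument of the text: the first range lies on the line y = 0, the centre and the second line are arbitrary choices, and the names `cross_ratio` and `project` are hypothetical.

```python
from fractions import Fraction

def cross_ratio(a, b, c, d):
    """(AB.CD)/(AC.BD) for four points given by directed positions on a line."""
    return Fraction((b - a) * (d - c), (c - a) * (d - b))

def project(x, centre, v0, direction):
    """Project the point (x, 0) from `centre` onto the line v0 + t*direction.

    Returns the parameter t of the image, which measures directed position
    along the second line.
    """
    wx, wy = x - centre[0], -centre[1]              # direction of the projecting ray
    ex, ey = v0[0] - centre[0], v0[1] - centre[1]
    dx, dy = direction
    det = dx * wy - wx * dy                         # zero only if the ray is parallel to the line
    return Fraction(wx * ey - wy * ex, det)

# A, B, C, D at 0, 2, 3, 6 are harmonic in the sense of sec. 28: AB/BC = -AD/DC = 2.
points = [0, 2, 3, 6]
centre, v0, direction = (1, 4), (0, 1), (3, 1)
images = [project(x, centre, v0, direction) for x in points]

print(cross_ratio(*points), cross_ratio(*images))
```

The two printed ratios agree, and since the first set is harmonic the second is harmonic also, which is the condition demanded by the definition of projectively related forms.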
It is readily seen that if two forms are the first and last of a series, each of which is a projection or a section of the next preceding or next following, they fulfil the conditions of this definition, and hence are projectively related.

36. From the definition it follows without great difficulty (see Reye, Geometry of Position, §80*) that to any orderly sequence of elements in one of two projectively related forms there corresponds always an orderly sequence of elements in the other, and also that the anharmonic ratio of any set of four points in one form is equal to the anharmonic ratio of the corresponding four in the other.

* The reference is to Reye's Geometry of Position, English translation. This is now out of print, but readily accessible in libraries.

37. Two projectively related simple forms which have the same base (for example, two projective ranges of points which lie on the same straight line) may have two elements of the one which correspond to themselves in the other; but if more than two, then every element of the one corresponds to itself in the other, and the two forms are identical.

That there may be two self-corresponding elements in the superposed forms may be seen as follows: Let A, B, C, D, ... be points of a range u projected from S1 by rays a1, b1, c1, d1, ... and from S2 by rays a2, b2, c2, d2, ... Let a line v cut the rays a1, b1, c1, d1, ... in points A1, B1, C1, D1, ... and cut the rays a2, b2, c2, d2, ... in points A2, B2, C2, D2, ... The ranges A1, B1, C1, D1, ... and A2, B2, C2, D2, ..., both lying on the line v, are projectively related, since any set of harmonic points in one corresponds to a set of harmonic points in the other, and in general corresponding points are distinct. The ray S1S2, however, cutting the line u at the point S, will determine on v two corresponding points which coincide.
Also the rays of S1 and S2 which project the point of u in which that line is intersected by v determine on v two coincident corresponding points. So that, in two superposed projective forms, two self-corresponding elements are possible without requiring that all elements should be self-corresponding.

But if three elements are self-corresponding, then all elements are self-corresponding. It is readily seen that certainly in this case an indefinite number of points will coincide with their corresponding points, namely, the harmonic conjugate of each of the three given points with respect to the remaining two, and so on indefinitely. But for a proof that every point must coincide with its corresponding point the reader is referred to Reye, §84.

38. Let us apply this property to some simple examples.

(1) If two projective ranges of points A1, B1, C1, ... lying on the line u1 and A2, B2, C2, ... lying on the line u2 are so situated that the rays A1A2, B1B2, and C1C2, or any three such rays, pass through one point S, then all rays joining pairs of corresponding points will pass through S, and the common point of the two ranges must be self-corresponding. For S is the centre of two superposed projective sheaves of rays having three self-corresponding rays, hence all rays are self-corresponding, and the ray joining any point P1 to S must coincide with the ray joining P2 to S, or, the ray P1P2 must pass through S.

(2) If two projective sheaves of rays a1, b1, c1, ... with centre S1 and a2, b2, c2, ... with centre S2 are so situated that the points of intersection a1a2, b1b2, and c1c2, or any three such points of intersection, lie on one straight line s, then all points of intersection of pairs of corresponding rays will lie on s, and the common ray of the two sheaves must be self-corresponding. For s is the base of two superposed projective ranges of points having three self-corresponding points, hence all points are self-corresponding, and the intersection of any ray p1 with s must coincide with the intersection of p2 with s, or, p1 and p2 must intersect on s.

Definition. When two projective ranges of points are so situated that the lines joining pairs of corresponding points all pass through one point, they are said to be perspective to each other, or to be in perspective position, and this will happen whenever three lines joining pairs of corresponding points pass through one point.

Definition. When two sheaves of rays are so situated that the points of intersection of pairs of corresponding rays all lie on one straight line, they are said to be perspective to each other, or to be in perspective position, and this will happen whenever three points of intersection of pairs of corresponding rays lie on one line.

For brevity, the symbol ⊼ is frequently used for "is projective to" and the symbol ⩞ for "is perspective to." It should be noted that forms which are perspective to each other are also projective, but lie in a special relative position.

(3) Two projective ranges of points, A1, B1, C1, ... lying on the line u1, and A2, B2, C2, ... lying on the line u2, are perspectively related if the common point of u1 and u2 is self-corresponding. For, if A1A2 and B1B2 intersect in S and the two ranges be projected from this point, then in the two projective sheaves of rays whose centre is S there are three self-corresponding rays, namely, SA1A2, SB1B2, and SK, where K is the common point of u1 and u2. Hence all rays of S are self-corresponding and the two ranges are perspective.

(4) Two projective sheaves of rays a1, b1, c1, ... with centre S1 and a2, b2, c2, ... with centre S2 are perspectively related if the common ray of the two sheaves, S1S2, is self-corresponding. For, if s be the line joining the intersections a1a2 and b1b2, and a section of each sheaf of rays by this line be taken, then in the two projective ranges of points lying on this line there will be three self-corresponding points, namely, a1a2, b1b2, and the point where s cuts S1S2. Hence all points of these two ranges are self-corresponding, or, in other words, all pairs of corresponding rays of the two sheaves will intersect on s, and the two sheaves are perspective.

(5) Two fixed straight lines u1 and u2 intersect at O and there are two fixed points S1 and S2 collinear with O. A line v rotates about a fixed point V and intersects u1 and u2 in A1 and A2 respectively. The locus of the intersection of S1A1 and S2A2 is a straight line through O. For the line v rotating about V marks out on the lines u1 and u2 two perspective ranges of which A1 and A2 are corresponding points and O is a self-corresponding point. The sheaves S1A1 and S2A2 are therefore perspective and the locus of the intersection of pairs of corresponding rays is a straight line. That O is one point of the locus follows from the fact that S1O and S2O are corresponding rays in the sheaves S1 and S2.

39. Construction of corresponding elements. From what has been said it appears that two elementary forms may be correlated projectively as soon as there are known three elements in the one form which correspond to three given elements in the other.

Let S1 and S2 be the centres of two sheaves of rays lying in the same plane, which are to be correlated projectively. Let the rays a1, b1, c1 of the first sheaf correspond respectively to the rays a2, b2, c2 of the second sheaf. The problem is to find in the second sheaf the ray d2 which corresponds to any chosen ray d1 of the first sheaf.

If a1, a2 intersect at A, b1, b2 at B, and c1, c2 at C, and these three points lie on a straight line v, the two sheaves are perspective and any pair of corresponding rays will intersect on v. But, if A, B, and C are not collinear, we must find a sheaf of rays to which each of the given sheaves is perspective and so arrive at a correlation of them.
Through one of the points of intersection, A, draw two secants, u1 and u2, and consider the first, u1, a section of the sheaf S1, the second, u2, a section of the sheaf S2. These two ranges of points therefore will be projectively related and they are perspective since A is a self-corresponding point. If B', C' and B", C" are the points in which b1, c1 and b2, c2 are cut respectively by the lines u1 and u2, the intersection of B'B" and C'C", or S, is the centre of a sheaf of rays of which both u1 and u2 are sections. Hence the sheaves S1 and S are perspective since corresponding rays intersect on the straight line u1, and S2 and S are perspective since corresponding rays intersect on the straight line u2.

FIG. 9.

If then d1 is any ray of the sheaf S1 which cuts u1 at D', the ray SD' will cut u2 in a point D" in which the ray d2 of the sheaf S2 also cuts it. Thus the ray d2 of S2 corresponding to any ray d1 of S1 is determined and the correlation is complete.

40. On the other hand, let u1 and u2 be two ranges of points lying in the same plane which are to be correlated projectively, and let the points A1, B1, C1 of the first range correspond respectively to the points A2, B2, C2 of the second range. The problem is to find the point D2 of the second range corresponding to any chosen point D1 of the first range.

FIG. 10.

If the rays A1A2, B1B2, C1C2 should pass through one point V, the two ranges are perspective and all pairs of corresponding points in the two ranges will lie on rays through V. But if these rays do not intersect in one point we must find a range of points to which each of the given ranges is perspective, and so arrive at a correlation.

On one of the lines, as A1A2, choose two centres S1 and S2, and from these project the given ranges u1 and u2, respectively.
The two sheaves of rays S1 and S2 will thus be projective and they are perspective since S1S2 is a self-corresponding ray. If B' is the point in which S1B1 and S2B2 intersect and C' the point in which S1C1 and S2C2 intersect, then all pairs of corresponding rays of S1 and S2 will intersect on the line B'C', or u. If then D1 be any point of the range u1, S1D1 will intersect the line B'C' in the same point as does S2D2. Thus the point D2 of u2 corresponding to any point D1 of u1 is determined and the correlation is complete.

It should be noted that both the problem and the process of sec. 40 are the reciprocals or duals of those of sec. 39.

XI. CURVES AND SHEAVES OF RAYS OF THE SECOND ORDER

41. If two projective sheaves of rays lying in the same plane are neither concentric nor perspective, then of the points of intersection of pairs of corresponding rays, at most two can lie on any straight line. For if three such points lay on one line, all would, and the two sheaves would be perspective.

If two projective ranges of points lying in the same plane are neither superposed nor perspective, then of the lines joining pairs of corresponding points, at most two can pass through any one point. For if three such lines passed through one point, all would, and the two ranges would be perspective.

42. Since a continuous series of elements in one of two projective forms corresponds always to a continuous series in the other, the locus of the point of intersection of corresponding rays in two projective sheaves is a continuous series of points, or a curve, and the locus of the line joining corresponding points in two projective ranges is a continuous series of rays, or an envelope.

If the two projective sheaves of rays are not perspective, the generated curve is such that not more than two of its points lie on any straight line. Such a curve is called a curve of the second order. If the two ranges of points are not perspective, the generated envelope is such that not more than two of its rays pass through any one point.
Such an envelope is called a sheaf of rays of the second order.

A curve of the second order is generated by two projective sheaves of rays lying in the same plane, which are not perspective.

A sheaf of rays of the second order is generated by two projective ranges of points lying in the same plane, which are not perspective.

43. The centres of the sheaves generating a curve of the second order are themselves points of the curve, since the ray S1S2 of the sheaf S1 meets its corresponding ray at S2, and the ray S2S1 of the sheaf S2 meets its corresponding ray at S1. These corresponding rays have each only one point in common with the curve, namely, S2 and S1 respectively, while all other rays through these centres meet the curve at S2 or S1 and also elsewhere. These rays are consequently called tangents to the curve at these points.

The lines on which lie ranges of points generating a sheaf of rays of the second order are themselves rays of the sheaf, since each joins a point in itself to the corresponding point in the other, namely, the common point of the two lines. Through the point of u1 which corresponds to the point of u2 lying on u1, there passes but one ray of the sheaf, namely, u1 itself, while through all other points of u1 there pass two rays of the sheaf. The same is true for that point of u2 which corresponds to the point of u1 lying on u2. These points are consequently called points of contact on the two rays.

44. A curve of the second order may thus be generated from two given points S1 and S2 and three rays through each correlated to three rays through the other, in other words, from five given points of the curve. The problem of constructing a curve of the second order from five given points is just the problem of determining pairs of corresponding rays in two projective sheaves when three pairs are given.

FIG. 11.

FIG. 12.
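The statement that five points determine a curve of the second order may also be checked analytically. The sketch below is not the synthetic construction of the text: it assumes Cartesian coordinates and writes the curve as a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0, so that each given point yields one linear condition on the six coefficients. The names conic_through and on_conic are illustrative only, and the method presupposes that no four of the five points are collinear.

```python
from fractions import Fraction


def conic_through(points):
    """Coefficients (a, b, c, d, e, f) of a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0
    for the conic through five points, no four of which are collinear."""
    # One linear condition on the six coefficients per given point.
    rows = [[Fraction(v) for v in (x * x, x * y, y * y, x, y, 1)] for x, y in points]
    pivots = []
    r = 0
    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(6):
        if r == len(rows):
            break
        hit = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if hit is None:
            continue
        rows[r], rows[hit] = rows[hit], rows[r]
        lead = rows[r][col]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                k = rows[i][col]
                rows[i] = [a - k * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    if len(pivots) != 5:
        raise ValueError("the five points do not determine a single conic")
    # Exactly one column is free; set it to 1 and read off the others.
    free = next(c for c in range(6) if c not in pivots)
    coeffs = [Fraction(0)] * 6
    coeffs[free] = Fraction(1)
    for row, col in zip(rows, pivots):
        coeffs[col] = -row[free]
    return tuple(coeffs)


def on_conic(coeffs, point):
    """True when the point satisfies the equation of the conic exactly."""
    x, y = point
    a, b, c, d, e, f = coeffs
    return a * x * x + b * x * y + c * y * y + d * x + e * y + f == 0


# Five points of the unit circle, chosen with rational coordinates.
five = [(1, 0), (0, 1), (-1, 0), (0, -1), (Fraction(3, 5), Fraction(4, 5))]
circle = conic_through(five)
print(all(on_conic(circle, p) for p in five))
# A sixth point of the same circle also satisfies the equation found.
print(on_conic(circle, (Fraction(-4, 5), Fraction(3, 5))))
```

Exact fractions are used rather than floating-point numbers so that the incidence test is an equality and not an approximation.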
A sheaf of rays of the second order may be generated from two given rays u1 and u2 and three points on each correlated to three points on the other; in other words, from five given rays of the sheaf. The problem of constructing a sheaf of rays of the second order from five given rays is just the problem of determining pairs of corresponding points in two projective ranges when three pairs are given.

45. That the points S1 and S2, the centres of the generating sheaves, are not particular points of the curve, or that the curve could as well be generated with any other two of its points for centres, follows without great difficulty, but the demonstration is omitted. The same is true regarding the lines u1 and u2 in the sheaf of rays. Accepting this, it follows that:

"A curve of the second order may be projected from any two of its points by projective sheaves of rays and a sheaf of rays of the second order is cut by any two of its rays in projective ranges of points."

46. A circle is a curve of the second order, since if two points S1 and S2 on it be chosen for centres and other points A, B, C, ... be projected from these, the angle AS1B equals the angle AS2B, and so on, so that the two sheaves of rays S1 and S2 might be placed the one on the other so that pairs of corresponding rays would coincide. Hence the sheaves are projective and the points of the circle are points of intersection of pairs of corresponding rays.

FIG. 13.

Similarly the tangents to a circle form a sheaf of rays of the second order. If u1 and u2 are any two tangents to a circle, and other tangents cut these two in points A1, A2; B1, B2; ... respectively, the angle A1OB1 equals the angle A2OB2, and so on, so that the ranges of points u1 and u2 are sections of two identically equal sheaves of rays having the same centre O. Hence these two ranges of points are projective and the system of tangents is a sheaf of rays of the second order.

FIG. 14.

47.
From considerations such as these, it may be shown that every conic section is a curve of the second order and may be generated as indicated in the preceding articles, and also that the system of tangents to a conic section is a sheaf of rays of the second order. On the other hand, any curve of the second order is a conic section and any sheaf of rays of the second order is a system of tangents to a conic section.

48. Classification of conics. If one of the two projective sheaves of rays generating a conic should be placed concentric with the other without changing the directions of its rays, the two sheaves might then have two rays which coincide with their corresponding rays, or one such, or none such, but certainly two such if the correlated rays rotate about the centre in opposite senses. In the original positions of the sheaves, therefore, there may be two pairs of correlated rays parallel, in which case the generated curve is a hyperbola, having two points at infinity; or one pair, in which case the curve is a parabola; or no corresponding rays parallel, in which case the curve is an ellipse.

If the five given points from which the curve is generated lie so that one is within the quadrangle formed by the other four, the resulting curve is necessarily a hyperbola.

XII. PASCAL'S AND BRIANCHON'S THEOREMS

49. The diagram for the construction of pairs of corresponding rays in two projective sheaves (sec. 39) may be extended so far as to show that the ray S1S of the sheaf S1 cuts the line u2 in a point M of the curve determined by the two sheaves. Similarly, S2S of the sheaf S2 cuts the line u1 in a point L of the curve. Now S1, S2, A, D, L, and M are arbitrary points of the curve and lines connecting them in order, S1DS2LAM, form a hexagon inscribed in the curve of second order such that the pairs of opposite sides S1D and LA, DS2 and AM, S2L and MS1 necessarily intersect in points of a straight line D'SD". Hence

"The points of intersection of the three pairs of opposite sides of a hexagon inscribed in a conic lie on one straight line."

FIG. 15.

This is the well-known theorem of Pascal enunciated in 1640, when the author was but a lad of sixteen years.

50. On the other hand, the diagram for the location of pairs of corresponding points in two projective ranges (sec. 40) may be extended to show that the lines S2R1 and S1Q2 are rays of the sheaf of second order determined by the given ranges, R1 and Q2 being the points in which the line B'C', or u, intersects u1 and u2, respectively. Now S1, S2, R1, D1, D2, Q2 are vertices of a hexagon whose sides are arbitrary rays of the sheaf of second order, and it is such that the lines joining pairs of opposite vertices necessarily intersect in one point. Hence

"The lines joining pairs of opposite vertices of a hexagon circumscribed to a conic intersect in one point."

This is Brianchon's theorem, the exact dual or reciprocal of Pascal's theorem, but not discovered till 1806.

FIG. 16.

51. From these two theorems many important consequences follow.

(1) If in Pascal's theorem two of the vertices of the inscribed hexagon come to coincide, the intervening side thus becoming the tangent at that vertex, the theorem takes the form: A pentagon inscribed in a conic is such that the intersections of two pairs of non-adjacent sides and of the fifth side with the tangent at the opposite vertex lie on one straight line.
(2) If in Brianchon's theorem two of the sides of the circumscribed hexagon come to coincide, the intervening vertex thus becoming the point of contact of that side, the theorem takes the form: A pentagon circumscribed to a conic is such that the lines joining two pairs of non-adjacent vertices and the line joining the fifth vertex to the point of contact of the opposite side pass through one point.

(3) If further the hexagon is reduced to an inscribed quadrilateral and tangents at two opposite vertices, we have: In any quadrilateral inscribed in a conic, the intersections of pairs of opposite sides and of tangents at opposite vertices are collinear.

(4) If further the hexagon is reduced to a circumscribed quadrilateral and the points of contact on two opposite sides, we have: In any quadrilateral circumscribed to a conic the lines joining pairs of opposite vertices and pairs of points of contact in opposite sides are concurrent.

(5) For the inscribed triangle Pascal's theorem becomes: The sides of a triangle inscribed in a conic intersect the tangents at the opposite vertices in points of one straight line.

(6) For the circumscribed triangle Brianchon's theorem becomes: The lines joining the vertices of a triangle circumscribed to a conic to the points of contact of the opposite sides intersect in one point.

52. Pascal's theorem yields itself at once to the construction of a conic of which there are given five points, or four points and the tangent at one of them, or three points and the tangents at two of them. In the case of five points being given, if these are A, B, C, D, E, and they are joined in order, while an arbitrary line through A is drawn for sixth side of the inscribed hexagon, the hexagon is determined excepting only the fifth side and the sixth vertex. Of this hexagon AB and DE are opposite sides, CD and the arbitrary line through A are opposite sides, and the intersections of these determine the Pascal line.
The side BC and the side EF, where F is on the arbitrary line through A, intersect also on the Pascal line, hence the point F, an arbitrary point of the conic, is determined.

In the same way, Brianchon's theorem yields itself to the construction of tangents to a conic when there are given either five tangents, four tangents and the point of contact of one of them, or three tangents and the points of contact of two of them.

XIII. POLE AND POLAR THEORY

53. In the plane of a conic is a point P, and through it are drawn two secants of the conic as in the diagrams (Figs. 17 and 18), cutting the conic in the points A, B, and C, D. If these points are joined two and two so as to form the inscribed quadrangle ACBD, the pairs of opposite sides AC and BD, AD and BC, will intersect on a line p, on which also intersect the tangents at opposite vertices A and B, C and D (Pascal's theorem).

FIG. 17. FIG. 18.

The line p is called the polar of the point P with respect to the conic, and the point P is the pole of the line p.

If the secant PAB cuts the polar at the point P' it is readily seen that the points PAP'B are harmonic, hence P' could be found as the harmonic conjugate of P with respect to A and B. Also the tangents at A and B intersect on the polar, consequently two points of the polar can be found from a single secant. Hence the position of the polar is independent of the second secant, that is, the polar of a point is independent of the process of constructing it, and it therefore bears a fixed relation to the point and the curve.

54. On the polar of a point P with respect to a conic there will lie:

(1) The intersections of chords joining the extremities of pairs of secants through P. (By extremities we mean the points of intersection with the curve.)

(2) The intersections of tangents at the extremities of secants through P.

(3) All points harmonically separated from P by the curve.
(4) The points of contact of tangents from P.

55. If a straight line p is given and we wish to find its pole with respect to a given conic, we may choose on the line two points R and S, and from these draw tangents meeting the curve at A, B, and C, D, respectively. The intersection P of the chords AB and CD is such that its polar necessarily passes through both R and S. Hence P is the pole of the given line.

56. If a point P lies inside a conic, all points of its polar lie outside, since chords through P cut the conic, and the polar passes through all points harmonically separated from P by the curve. If P lies outside the conic, some points of the polar must lie inside and the polar necessarily cuts the curve. If P is a point of the conic its polar is the tangent at that point and, conversely, the pole of a tangent is the point of contact. This follows as a limiting case of the construction for the polar in sec. 53.

Incidentally, it may be stated by way of definition that a point lies inside of a closed curve when all straight lines through it cut the curve, and a point lies outside of a closed curve when through it straight lines can be drawn which do not cut the curve in real points.

57. Conjugate points and lines. If a point Q lies on the polar of a point P, relative to a conic, then P lies on the polar of Q.

For if P lies outside the curve, Q may be chosen either inside or outside, but in either case, the polar of P is a secant through Q, the tangents at whose extremities intersect in a point of the polar of Q. But they intersect at P, hence P is a point of the polar of Q. If P lies inside the curve, Q must lie outside and QP necessarily cuts the curve in real points. Now Q and P are harmonically separated by the curve, since Q lies on the polar of P, but for this same reason P must lie on the polar of Q. If P lies on the curve, and Q is a point of its polar, namely a point of the tangent at P, the polar of Q evidently passes through P.
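The reciprocity just proved can be illustrated with coordinates. The sketch below is an analytic restatement and not the synthetic argument of the text: it assumes homogeneous coordinates (x, y, z) and a conic given by a symmetric 3 by 3 matrix M, for which the polar of a point P is the line whose coefficients are the product of M with P. The names polar and incident, and the choice of the unit circle, are illustrative only.

```python
from fractions import Fraction


def polar(M, P):
    """Coefficients (l, m, n) of the polar line l*x + m*y + n*z = 0 of the
    homogeneous point P with respect to the conic whose symmetric matrix is M."""
    return tuple(sum(M[i][j] * P[j] for j in range(3)) for i in range(3))


def incident(line, point):
    """True when the homogeneous point lies on the line."""
    return sum(l * x for l, x in zip(line, point)) == 0


# The unit circle x^2 + y^2 - z^2 = 0, taken as the given conic.
CIRCLE = ((1, 0, 0), (0, 1, 0), (0, 0, -1))

P = (2, 0, 1)                    # a point outside the circle
p = polar(CIRCLE, P)             # its polar, here the line 2x - z = 0
Q = (Fraction(1, 2), 3, 1)       # a point chosen on p
q = polar(CIRCLE, Q)             # the polar of Q

print(incident(p, Q))            # Q lies on the polar of P
print(incident(q, P))            # and P lies on the polar of Q

T = (1, 0, 1)                    # a point of the circle itself
print(incident(polar(CIRCLE, T), T))   # its polar, the tangent, passes through it
```

Because the matrix is symmetric, the condition that Q lies on the polar of P is the same expression as the condition that P lies on the polar of Q, which is the analytic form of the theorem.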
Two points are called "polar conjugates" when one, and consequently each, lies on the polar of the other; and two lines are "polar conjugates" when one, and consequently each, passes through the pole of the other. Thus a point is conjugate to all the points of its polar and a line is conjugate to all the lines through its pole.

58. Centre and diameters. In the diagram for the construction of the polar of a point (sec. 53), if P should lie at infinity, and consequently the secants through it be parallel, the polar becomes the locus of the mid-points of a system of parallel chords, each mid-point being harmonically separated from P. The locus of the mid-points of a system of parallel chords of a conic is thus a straight line, namely the polar of the infinitely distant point at which the chords intersect, and this line is called a diameter of the curve.

The intersection of any two diameters of a conic is the pole of the infinitely distant line, and is called the centre of the conic.

The centre of an ellipse lies inside the curve since the infinitely distant line of the plane lies wholly outside and all diameters cut the curve in real points. For a parabola, the infinitely distant line is tangent to the curve, hence the centre lies on the curve at infinity, and all diameters are parallel. The infinitely distant line cuts a hyperbola in two real points, hence the centre lies outside the curve and there are some diameters which cut the curve in real points while others do not. The diameters which are tangent to the curve at infinity are called asymptotes.

XIV. CONCLUSION

59. In this brief chapter it is hoped that enough of the spirit of modern pure geometry has been exhibited to encourage the reader to continue its study. The field yields rich results in applications to more elementary subjects and is not too difficult or forbidding for the isolated reader.
By continuing the methods here indicated a complete study can be made not only of the conic sections, but of the relations of conics to each other and of many curves of higher order. The pole and polar theory in reference to a conic relates every point of the plane to a line and every line to a point in such a way as to give concreteness to the principle of duality and to make it possible so to reciprocate systems of points and lines as to yield definite sets of lines and points.

By projection from a point outside the plane all the conclusions here reached can be transferred directly to cones of the second order and their tangent planes, and many interesting theorems develop which are not simple projections and which have not more than an analogy in a plane figure.

Ruled surfaces of the second order appear from a consideration of two projectively related ranges of points which do not lie in the same plane, or from two projective sheaves of planes whose axes do not intersect.

Perhaps the easiest and most attractive approach to the study is through Reye's Geometrie der Lage, but the English translation of Part I of this work, made some years ago, is now out of print. Cremona's Projective Geometry, English translation by Leudesdorf, has always been a popular text. The classic treatises on modern geometry are Chasles's Géométrie Supérieure, Steiner's Systematische Entwicklung, etc., Poncelet's Propriétés Projectives, and Von Staudt's Geometrie der Lage. Of these the last, though perhaps the most systematic, should be read only after a considerable knowledge and comprehension of the subject has been obtained.

III

NON-EUCLIDEAN GEOMETRY

By FREDERICK S. WOODS

CONTENTS

                                                              SECTIONS
   I. INTRODUCTION ................................................ 1
  II. PARALLEL LINES .............................................. 2-5
 III. THE EUCLIDEAN ASSUMPTION .................................... 6
  IV. THE LOBACHEVSKIAN ASSUMPTION ................................ 7-12
   V. THE RIEMANNIAN ASSUMPTION ................................... 13-15
  VI. THE SUM OF THE ANGLES OF A TRIANGLE ......................... 16-20
 VII. AREAS ....................................................... 21-24
VIII. NON-EUCLIDEAN TRIGONOMETRY .................................. 25-35
  IX. NON-EUCLIDEAN ANALYTIC GEOMETRY ............................. 36-43
   X. REPRESENTATION OF THE LOBACHEVSKIAN GEOMETRY ON A
      EUCLIDEAN PLANE ............................................. 44-51
  XI. RELATION BETWEEN PROJECTIVE AND NON-EUCLIDEAN GEOMETRY ...... 52-55
 XII. THE ELEMENT OF ARC .......................................... 56

I. INTRODUCTION

1. The fifth postulate of Euclid reads as follows: "If a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two lines, if produced indefinitely, meet on that side on which are the angles less than two right angles." Under the term non-Euclidean geometry we shall understand a system of geometry which is built up without the use of this postulate. Strictly speaking, perhaps, the same name might be given to any geometry the basis of which differs in any essential particular from that of Euclid, but usage has decreed otherwise.

The conception of a non-Euclidean geometry came into being only after centuries of vain attempts to prove the truth of Euclid's postulate. There is no place here to review the history of such attempts.* It is sufficient to note that all inevitably failed. Some writers, however, especially Saccheri (1667-1733), Lambert (1728-77), and Legendre (1752-1833) made important contributions to what is now recognized as the non-Euclidean geometries, though each failed to see the true meaning of the results he obtained. Finally, nearly simultaneously though quite independently, a Russian, Lobachevsky, a Hungarian, J. Bolyai, and a German, Gauss, reached the conclusion not only that the parallel postulate could not be proved, but that a logical system of geometry could be constructed without its use.

* See, for example: Engel-Staeckel, Theorie der Parallellinien von Euklid bis auf Gauss, Leipzig, 1895. A shorter account is found in Bonola, Die nichteuklidische Geometrie, Vol. IV, of the series, Wissenschaft und Hypothese, Leipzig, 1908. See also the Historical Note, in Manning, Non-Euclidean Geometry, Boston, 1901; and Heath, The Thirteen Books of Euclid's Elements, Vol. I, p. 202, Cambridge, 1908.

The work of Gauss is only partly revealed by extracts from his correspondence and fragments of his posthumous papers. That of Lobachevsky is contained in several articles published between 1833 and 1855, and that of Bolyai in an appendix to a work of his father published in 1832-35. The system of geometry common to these three writers we shall call the Lobachevskian geometry, since Lobachevsky was the mathematician to develop it most fully.*

* English readers will find the simplest introduction to Lobachevsky's own work in the little book written in German and translated into English by G. B. Halsted under the title, "Geometrical researches on the theory of parallels." More complete is Engel's translation: Lobatschefsky, Zwei geometrische Abhandlungen aus dem Russischen übersetzt mit Anmerkungen und mit einer Biographie des Verfassers, Leipzig, 1879.

The Lobachevskian geometry remained for a time the sole type of a non-Euclidean geometry. In 1854, however, Riemann, working from the standpoint of the differential calculus, discovered a new type to which we shall give the name of the Riemannian geometry.

Besides the three types of geometry, the Euclidean, the Lobachevskian, and the Riemannian, there are also three methods by which the geometries may be developed. The first is by elementary methods similar to those of Euclid, and was used by Lobachevsky, Bolyai, and Gauss. The second is by use of Cayley's system of projective measurement and has been largely employed by Klein. The third is that of the calculus, and has been used by Riemann. We shall begin by employing the first method, but shall later make some reference to the other two.

It does not lie within the plan of this paper to examine the assumptions which must be made before any form of a parallel postulate can be introduced. This work has been done by Professor Veblen in his paper* contained in the present collection and the results of that paper will be assumed as known and freely referred to. It is believed, however, that this paper may be easily read by any reader who prefers to start from the original definitions, common notions, and postulates, stated or implied, of Euclid.

* Monograph No. I.

II. PARALLEL LINES

2. We assume Euclid's fundamentals with the exception of the parallel postulate, or make Veblen's assumptions I-XII and XIV. The first twenty-eight propositions of the first book of Euclid (Veblen, VIII) are then true. We proceed to give a definition of parallel lines more general than that of Euclid.

Let PQ (Fig. 1) be any straight line and A any point not on PQ. Through A there passes a set of lines intersecting PQ, since any point on PQ may be joined to A. It is conceivable that there may be other lines through A which do not intersect PQ. In that case, there will be lines such as AL and AK, not intersecting PQ and forming the boundaries of the set of lines which meet PQ. Such lines are said to be parallel to PQ.

FIG. 1.

Otherwise expressed: Let AB be any line through A intersecting PQ. The line AL is said to be parallel to PQ at the point A, if

(1) AL does not intersect PQ no matter how far produced.

(2) Any line through A in the angle opening BAL does intersect PQ.

It is evident that this definition considers only those portions of the lines AL and PQ which lie on the same side of AB.
In other words, the directions of the lines are important. We shall indicate the directions of parallel lines in the usual way by the order in which the letters at these extremities are named. Thus we shall say that AL is parallel to PQ and AK is parallel to QP.

The line AB may be any line through A intersecting PQ. It is often convenient, however, to use the line AH perpendicular to PQ. We may then show that

∠HAK = ∠HAL.

For if ∠HAK were greater than ∠HAL, we could draw AC meeting QP in C so that ∠HAC = ∠HAL. Now take C' on HQ so that HC' = HC and connect A and C'. By Euclid, I, 4 (Veblen, Theorem 32) the triangles HAC and HAC' are congruent, and hence

∠HAC' = ∠HAC = ∠HAL.

This is impossible, since AL is parallel to HQ. Hence ∠HAK cannot be greater than ∠HAL. In like manner, ∠HAL cannot be greater than ∠HAK. Hence

∠HAK = ∠HAL.

The angle HAL is called the angle of parallelism for the distance AH.

In the definition, the point A plays apparently a unique role. We shall show this to be unessential by the theorem of the next section.

3. A straight line maintains the property of parallelism at all its points.

Let AK (Fig. 2) be parallel to BQ at the point A and let A1 be any point on AK. We wish to show that AK is parallel to BQ at the point A1.

Connect A1 and B and draw through A1 any line A1C in the angle opening BA1K. Take D any point on A1C and connect D with A. The line AD prolonged will meet BQ at some point F since AK is parallel to BQ. Hence A1C will meet BQ in some point between B and F (Veblen, Theorem 17). That is, any line through A1 in the angle opening BA1K intersects BQ. But A1K does not intersect BQ. Hence it is parallel to BQ.

The proof also holds if A1 is taken on the backward extension of AK, but in that case D must be taken on the backward extension of A1C.

We shall now show that the property of parallelism is reciprocal.

FIG. 2. FIG. 3.

4.
If a line is parallel to another line the second line is parallel to the first.

Let LK (Fig. 3) be parallel to PQ. We wish to prove that PQ is parallel to LK.

From A draw a line perpendicular to LK. This perpendicular will meet PQ at some point B since LK is parallel to PQ. Draw through B any line BC in the angle opening QBA. Construct the two angles ABE and ABD so that

∠ABE = ∠ABD < ½ ∠QBC.

Then

BD = BE, (Euclid, I, 26)

∠BEK > ∠BDK. (Euclid, I, 16)

Hence we may draw in the angle BEK a line EF so that

∠BEF = ∠BDK,

and EF will meet PQ since LK is parallel to PQ. Now take DG = EF and draw BG. Then the two triangles BEF and BDG are congruent and therefore

∠DBG = ∠EBF.

But ∠DBE < ∠QBC.

Therefore ∠EBG > ∠EBC.

Hence the line BC meets LK at some point between E and G. But BC is any line through B in the angle opening QBA and LK and BQ do not meet. Therefore BQ is parallel to LK.

5. If two lines are parallel to a third, they are parallel to each other.

We distinguish two cases according as the third line lies between the two lines or not.

In the first case, let AK and DQ (Fig. 4) be each parallel to ML. We wish to prove that AK is parallel to DQ. Draw AC any line through A in the angle opening DAK. AC will meet ML in some point F since AK is parallel to ML. CF produced will also meet DQ, since ML and DQ are parallel. Hence any line through A in the angle opening DAK meets DQ. On the other hand AK cannot meet DQ since it cannot meet ML. Hence AK and DQ are parallel.

FIG. 4. FIG. 5.

In the second case, let AK and DQ (Fig. 5) be each parallel to ML. We wish to prove that AK is parallel to DQ. Draw through A the line AK' parallel to DQ. Then by the first case AK' is parallel to ML and hence coincides with AK.

III. THE EUCLIDEAN ASSUMPTION

6. We may replace Postulate 5 of Euclid or Assumption XIII of Veblen by the following assumption while retaining all the other assumptions of either author.
Through any point in the plane there goes one and only one line parallel to a given line.

That one parallel exists is, in fact, proved in the twenty-eighth proposition of the first book of Euclid (Veblen, VIII). To assume that only one parallel exists is equivalent to assuming that in Fig. 1 the lines AL and AK form one and the same straight line. Hence

∠HAL = ∠HAK = rt. ∠.

Take now M (Fig. 6), the middle point of AB and draw MD perpendicular to PQ and intersecting AL in C. Then as just shown, ∠DCK is a right angle. The two right triangles AMC and BMD are congruent and ∠CAB = ∠ABD. Therefore

∠QBA + ∠BAK = 2 rt. ∠s.

FIG. 6.

By our definition of parallels, any line through A in the angle BAK meets PQ. Hence our assumption is equivalent to Euclid's Postulate 5. From this follows the Euclidean geometry.

IV. THE LOBACHEVSKIAN ASSUMPTION

7. While retaining all the other assumptions of the Euclidean geometry, we will replace Postulate 5 of Euclid or Assumption XIII of Veblen by the following assumption due to Lobachevsky.

Through any point in the plane there go two lines parallel to a given line.

It follows that, in Fig. 6,

∠QBA + ∠BAK < 2 rt. ∠s.

For if the sum of the angles QBA and BAK were greater than two right angles we could draw through A in the angle BAK a line not meeting BQ by Euclid I, 28. This is contrary to our assumption that AK and BQ are parallel. On the other hand, if the sum of the angles were equal to two right angles we should have the Euclidean assumption.

8. The following theorems are of vital importance in subsequent proofs.

Theorem I. Let AB and CD be two parallel lines cut by a third line AC and let A'B' and C'D' be two other parallel lines cut by a line A'C', and let ∠DCA = ∠D'C'A'; then

(1) If A'C' = AC, ∠C'A'B' = ∠CAB.

(2) If A'C' < AC, ∠C'A'B' > ∠CAB.

(3) If A'C' > AC, ∠C'A'B' < ∠CAB.

Consider first the case A'C' = AC (Fig. 7). If ∠C'A'B' were less than ∠CAB draw AK so that ∠CAK = ∠C'A'B'.
Then AK meets CD in some point K. Take K' on C'D' so that C'K' = CK and draw A'K'. Then the triangles ACK and A'C'K' are congruent (Euclid I, 4) and $\angle C'A'K' = \angle CAK = \angle C'A'B'$. This is impossible, since A'B' does not meet C'D'. Hence $\angle C'A'B'$ cannot be less than $\angle CAB$. Similarly $\angle CAB$ cannot be less than $\angle C'A'B'$, and hence $\angle C'A'B' = \angle CAB$.

[Fig. 7. Fig. 8.]

Consider secondly the case A'C' < AC (Fig. 8). On CA take $CA''$ equal to C'A' and draw $A''B''$ parallel to CD. Then $\angle CA''B'' = \angle C'A'B'$, as just shown, and AB and $A''B''$ are parallel (sec. 5). Therefore, by sec. 7,

$$\angle B''A''A + \angle A''AB < 2 \text{ right angles}.$$

But

$$\angle B''A''A + \angle CA''B'' = 2 \text{ right angles},$$

whence $\angle A''AB < \angle CA''B''$; that is, $\angle CAB < \angle C'A'B'$.

The third case, A'C' > AC, is handled like the second case.

Theorem II. Let AB and CD be two parallel lines cut by a third line AC, and let A'B' and C'D' be two other parallel lines cut by a line A'C', and let $\angle CAB = \angle C'A'B'$ and $\angle ACD = \angle A'C'D'$; then AC = A'C'.

For each of the suppositions AC < A'C' and AC > A'C' contradicts Theorem I.

If in Theorem I we take $\angle DCA = \angle D'C'A'$ = a right angle, the angles CAB and C'A'B' are the angles of parallelism for the distances AC and A'C' respectively (sec. 2). Theorem I includes then, as a special case, the following:

Theorem III. The angle of parallelism is fixed for a fixed distance and decreases as the distance increases.

If we denote the distance AH (Fig. 1) by p, the angle of parallelism HAL is denoted in Lobachevsky's notation by $\Pi(p)$. Theorem III asserts that $\Pi(p)$ is a decreasing function of p. The exact determination of $\Pi(p)$ will be given in sec. 33. We may note, however, that $\Pi(p)$ is always less than a right angle. In other words,

Theorem IV. If two lines have a common perpendicular they neither intersect nor are parallel.

The converse of IV is also true, as we shall now show.

9.
Two straight lines which neither intersect nor are parallel have a common perpendicular.*

Let LM and EF (Fig. 9) be two straight lines which neither intersect nor are parallel. We wish to show that they have a common perpendicular. Take A and B any two points on LM and draw AH and BK perpendicular to EF. If AH = BK the existence of a common perpendicular to LM and EF follows quickly, as shown below. Suppose then that BK < AH. Draw KS parallel to LM. Place† the right angle FKB on the right angle FHA so that K falls on H, KF takes the direction HF, and KB takes the direction HA. The point B falls at B' between H and A, BM takes the position B'M', and KS the position HS', parallel to B'M'. Since $\angle FKS = \angle FHS'$, a line parallel to KS (and hence to LM) drawn through H lies in the angle opening FHS' (sec. 7). Hence HS' intersects LM, and therefore B'M' intersects LM at some point P (Veblen, Theorem 17).

[Fig. 9.]

Draw PR perpendicular to EF. Place the right angle FHB' on the right angle FKB. Then the line PR takes the position QT, where QT = PR and QT is perpendicular to EF. Take now W halfway between R and T and draw WV perpendicular to EF. Fold the figure TWV on WV. Then T falls on R, TQ coincides with RP, and $\angle WVQ$ coincides with $\angle WVP$. Hence WV is the required common perpendicular to EF and LM.

* The proofs in this and the following section are due to Hilbert, Neue Begründung der Bolyai-Lobatschefskyschen Geometrie, Math. Ann., Vol. LVII.

† Here and subsequently, we use the principle of superposition to abbreviate the proof. The theorems on congruence may of course be employed without the aid of any idea of mechanical motion.

10. Any angle is an angle of parallelism belonging to a certain distance.

Let KAE (Fig. 10) be any given angle $\alpha$. We wish to find a distance p for which $\alpha$ is the angle of parallelism. Construct $\angle LAE = \alpha$ and on AK and AL take two points B and C so that AB = AC.
Connect B and C and draw BL' parallel to AL and CK' parallel to AK. Draw also CF bisecting $\angle LCK'$ and BG bisecting $\angle KBL'$. It is evident that the figure is symmetric with respect to the line AE.

[Fig. 10.]

The lines CF and BG cannot intersect. For if they did intersect at a point T, we could draw TS parallel to AL and BL', and then, since $\angle LCT = \angle L'BT$ and CT = BT, we should have $\angle STC = \angle STB$ (sec. 8, I), which is impossible.

Also CF and BG cannot be parallel, for if they were, since $\angle LCF = \angle L'BG$ and $\angle CNL' = \angle BNF$, we should have CN = NB (sec. 8, II) and therefore $\angle NCB = \angle NBC = \angle K'CB$, which is impossible.

Since FC and BG neither intersect nor are parallel, they have (sec. 9) a common perpendicular UV, which is also, by the symmetry of the figure, perpendicular to AE at H.

We assert that UV is parallel to AK. If UV were not parallel to AK we could draw from each of the points U and V a line parallel to AK and CK'. Since CU = BV and $\angle UCK' = \angle VBK$, these two parallels would make equal angles with UV (sec. 8, I), which is impossible.

Hence the angle KAE is the angle of parallelism for the distance AH.

11. Two parallel lines approach each other continually and their distance apart eventually becomes less than any assigned quantity.

Let LK and PQ (Fig. 11) be two parallel lines, and A and B two points on LK, the point B lying from A in the direction of parallelism. From A and B draw AH and BM perpendicular to PQ. We wish to prove BM < AH. Take R halfway between H and M and draw the line RC perpendicular to PQ. The angle RCB is less than a right angle, since it is an angle of parallelism. Therefore $\angle RCB < \angle RCA$. Hence, if the quadrilateral RMBC is folded over on RC as an axis, the line MB takes the position HB', where MB = HB' < HA. Hence the lines LK and PQ continually approach each other.

[Fig. 11. Fig. 12.]

To prove the second part of the theorem, let AK and HQ (Fig. 12) be any two parallel lines and AH a perpendicular from
A to HQ. Let $\epsilon$ be any assigned quantity and lay off on AH the distance HD < $\epsilon$. Draw DL parallel to HQ and AK. Then $\angle HDL$ is less than a right angle. Hence the line DE drawn from D perpendicular to AH will meet AK in some point C. From C draw CM perpendicular to HQ. Now $\angle MCD > \angle MCK$, for $\angle MCK$ is the angle of parallelism for the distance CM, and the lines CD and MH neither intersect nor are parallel, since they have a common perpendicular (sec. 8). Hence if the quadrilateral MHCD is folded over on MC as an axis, it takes the position MH'D'C, where CK lies between CD' and MQ. Then CK meets H'D' in some point K', where H'K' < H'D' = HD. Hence H'K' < $\epsilon$.

12. If two lines are not parallel they will diverge if sufficiently far produced, and their distance apart will eventually become greater than any assigned quantity.

Consider first two intersecting straight lines AM and AN (Fig. 13). Let B and D be two points on AM such that AD > AB, and let BC and DE be drawn perpendicular to AN. We wish to prove DE > BC.

Suppose, if possible, that DE = BC. Then a line drawn perpendicular to AN at the middle point of CE would be also perpendicular to AM, which is impossible, since AM and AN intersect (sec. 8, IV).

Suppose, if possible, that DE < BC. Take AF less than each of the distances DE and AB, and draw FG perpendicular to AN. Then FG < AF < DE. But BC > DE. Hence at some point K between G and C there is a perpendicular HK such that HK = DE. But this is impossible, as just shown. Therefore DE > BC.

[Fig. 13. Fig. 14.]

To show that there is no superior limit to the length of ED, take AH (Fig. 14) so that $\angle MAN$ is the angle of parallelism for AH (sec. 10) and draw HL perpendicular to AM. Then AN and HL are parallel. Let a be any quantity, no matter how large, and take Q on HL so that HQ = 2a. Connect Q and A, and at E, a point between A and H, draw a line perpendicular to AH, intersecting AQ in R.
We can take E so near H that RE will differ from HQ by as little as we please, and certainly so that RE > a. But RE will intersect AN in a point D, since the angle of parallelism for AE is greater than $\angle HAN$ (sec. 8, III). Then DE > RE > a. Since a is any positive number, there is no superior limit to the length of DE.

Consider now two non-intersecting lines MN and PQ (Fig. 15). At A, any point on MN, draw AK parallel to PQ. Since AK and MN intersect, their distance apart eventually becomes greater than any assigned quantity. But the distance between AK and PQ eventually becomes less than any assigned quantity (sec. 11). Hence the distance between AN and PQ eventually becomes greater than any assigned quantity.

[Fig. 15.]

It is of course possible that AN and PQ approach each other for a time, but they eventually diverge. In fact the shortest distance between them may be shown to be measured by their common perpendicular.

V. THE RIEMANNIAN ASSUMPTION

13. There remains the possibility, as discovered by Riemann, of replacing Euclid's fifth postulate by the assumption:

Through a point of the plane no line can be drawn parallel to a given line.

In other words, all lines of the pencil with its vertex at A (Fig. 1) intersect PQ. This assumption contradicts proposition 28 of Euclid's first book, so that it is necessary to modify the assumptions upon which that theorem depends. Proposition 28 depends upon proposition 16, which in turn depends upon the tacit assumption that two straight lines cannot enclose a space. This assumption is satisfied when applied to objective space in the domain of experience. We will accordingly assume that the Euclidean assumptions, with the exception of the parallel postulate, are valid in a sufficiently restricted portion of space, that is, in a portion of space in which no straight line can be drawn of greater length than some fixed line of length M.

We may proceed similarly with the Veblen assumptions.
Let $[S]$ be our space, for which all the assumptions, except II, are made. And let $[S_0]$ be a subset of points of $[S]$ for which, in addition, Assumption II is made. Then in $[S_0]$ we have all the theorems proved by Veblen, and in $[S]$ those theorems which do not depend upon Assumption II.

The assumptions and theorems concerning congruence enable us to compare geometric configurations in $[S_0]$ with others which lie outside of $[S_0]$. In particular, the theorems on the congruence of triangles are independent of the positions of the triangles.

With this preparation, we may proceed to examine the results of the Riemannian assumption.

14. All lines perpendicular to the same straight line meet in a point at a constant distance from the straight line.

Let LK (Fig. 16) be any straight line and A and B any two points upon it. By the Riemannian hypothesis AO and BO, perpendicular to LK, meet in a point O. Since it is conceivable that the perpendiculars may meet more than once, we assume explicitly that the two perpendiculars have no common point on the segment AO or BO. We assume also that the triangle ABO lies in the region $[S_0]$ of sec. 13, so that in particular only one straight line can be drawn from O to any point of the segment AB.

[Fig. 16.]

Since $\angle BAO = \angle ABO$, BO = AO by Euclid, I, 6. Construct $\angle BOM = \angle AOB$. Then by the Riemannian hypothesis the line OM meets LK in a point C. The triangle BOC has two angles and an included side congruent respectively to two angles and an included side of the triangle AOB. Hence

$$\angle BCO = \angle ABO = \text{a right angle}, \qquad OC = OB = OA.$$

By repeating this demonstration, we prove that if P is a point on LK such that $AP = m \cdot AB$, where m is a positive integer, the line OP is perpendicular to LK at P and PO = AO. But only one perpendicular can be drawn at P to LK. Hence this perpendicular passes through O.

Now take D, so that $AB = n \cdot AD$, where n is a positive integer, and draw a line perpendicular to LK at D.
If this perpendicular should intersect either BO or AO at a point O' in the segments BO or AO, then, by the demonstration just finished, BO and AO would also intersect at O', contrary to hypothesis. Hence this perpendicular passes through O and DO = AO.

It follows that if P is any point on LK such that

$$AP = \frac{m}{n} \cdot AB,$$

where m and n are positive integers, the perpendicular at P to LK passes through O and PO = AO. Also, since by hypothesis only one straight line can be drawn from P to O, the line PO is perpendicular to LK.

Now let $P_1$ be a point such that $AP_1 = \lambda \cdot AB$, where $\lambda$ is an irrational number. Take P such that $AP = \frac{m}{n} \cdot AB$, draw OP and $OP_1$, and let $\frac{m}{n}$ pass through rational values approaching $\lambda$ as a limit. Then

$$\angle AP_1O = \lim \angle APO, \qquad P_1O = \lim PO.$$

But APO is always a right angle and PO is always equal to AO. Hence $\angle AP_1O$ is a right angle and $P_1O = AO$.

Our theorem is therefore proved for the line LK. If L'K' is any other line, we may take A' and B' any two points on it, and draw the perpendiculars A'O' and B'O', intersecting at O'. Take AB on LK so that AB = A'B'. The two triangles ABO and A'B'O' are congruent and A'O' = AO. The distance AO is therefore independent of the line LK or of the position of the point A on the line. We will place OA = J.

A corollary of our theorem is that all straight lines are of constant length. For, from the proof we have used, it is evident that, if P is any point on AB,

$$\frac{AP}{AB} = \frac{\angle AOP}{\angle AOB}.$$

Now if $\angle AOP = 2\pi$, the line OA coincides with OP, and AP becomes $l$, the total length of the line. Then

$$\frac{l}{AB} = \frac{2\pi}{\angle AOB}.$$

15. All lines which pass through a point O meet again in a point $O_1$ such that the distance $OO_1$ is constant.

Let O (Fig. 16) be any point and OA any line through O. Take OA = J (sec. 14) and draw LK perpendicular to AO. Let OB be any other line through O intersecting LK in B. Then OB is perpendicular to LK (sec. 14). Prolong AO to $O_1$ so that $AO_1 = AO$ and draw $O_1B$.
The triangles AOB and $AO_1B$ are congruent, since two sides and the included angle of one are congruent respectively to two sides and the included angle of the other. Hence

$$\angle ABO_1 = \angle ABO = \text{a right angle}, \qquad O_1B = OB = OA.$$

Therefore the line $OBO_1$ is a straight line and $OO_1 = 2J$.

Since all lines are of finite length (sec. 14), any line through O returns through $O_1$ to O. Two cases are usually considered.

First, the point $O_1$ may coincide with O. The total length of a straight line is then 2J, and any two lines have only one point in common.

Secondly, the point $O_1$ may be distinct from O, but the lines $OO_1$ continued through $O_1$ meet again in O. The total length of a line is then 4J, and two lines meet in two points. The Riemannian geometry in this case is the same as the geometry on the surface of a sphere.

VI. THE SUM OF THE ANGLES OF A TRIANGLE

16. Consider any triangle ABC (Fig. 17). Take E, the middle point of AB, F the middle point of AC, and draw a straight line EF. From A, B, and C draw the lines AG, BK, and CL perpendicular to EF.

[Fig. 17.]

In the right triangles AEG and EBK, EA = EB and $\angle GEA = \angle BEK$. Hence the two triangles are congruent and

$$BK = AG, \qquad \angle KBE = \angle GAE.$$

Similarly, the right triangles AGF and FLC are congruent and

$$AG = CL, \qquad \angle FCL = \angle GAF.$$

If we define equivalent figures as those which may be divided into parts which are congruent in pairs, it appears that the triangle ABC is equivalent to the quadrilateral BCLK. Also, the sum of the angles of the triangle ABC is equal to the sum of the angles KBC and LCB of the quadrilateral BCLK.

This quadrilateral BCLK has two right angles, L and K, and two equal sides, KB and LC, adjacent to the right angles and opposite to each other. Such a figure we shall call an isosceles birectangular quadrilateral.

The study of the sum of the angles and of the area of a triangle is thus reduced to the study of an equivalent isosceles birectangular quadrilateral.

17. Let ABCD (Fig.
18) be an isosceles birectangular quadrilateral with right angles at A and B. For convenience, we shall call AB the base, CD the summit, and C and D the summit angles of the quadrilateral.

Take L the middle point of the base and draw LK perpendicular to the base. Fold LBDK on LK as an axis. It is clear that the point D falls on C. Hence, the summit angles of an isosceles birectangular quadrilateral are equal. Also, LK is perpendicular to CD at its middle point K, and the quadrilateral LBDK has three right angles.

Now through H, the middle point of LK, draw EF perpendicular to LK. Fold HFDK on HF as an axis. The point D will fall at B', B, or B'' according as KD is less than, equal to, or greater than LB. In these three cases $\angle D$ is greater than, equal to, or less than $\angle B$ respectively. Hence:

[Fig. 18. Fig. 19.]

Each summit angle of an isosceles birectangular quadrilateral is less than, equal to, or greater than a right angle, according as the summit of the quadrilateral is greater than, equal to, or less than the base.

18. In the Euclidean geometry each summit angle of an isosceles birectangular quadrilateral is equal to a right angle.

This is a familiar proposition of the Euclidean geometry and need not be proved here. We shall prove, however, the following theorem:

In the Lobachevskian and Riemannian geometries, a summit angle of an isosceles birectangular quadrilateral cannot equal a right angle.

Let ABCD (Fig. 19) be an isosceles birectangular quadrilateral with right angles at A and B. If possible, suppose $\angle C = \angle D$ = a right angle. Then (sec. 17) CD = AB. Take two points M and N on AC and BD respectively so that CM = DN and draw MN. Then ABMN is an isosceles birectangular quadrilateral with right angles at A and B. MNDC is an isosceles birectangular quadrilateral with right angles at C and D. Then MN must be perpendicular to AC and BD or we should have, by sec.
17, MN greater than one and less than the other of the two equal lines AB and CD, which is absurd. Since M is any point between A and C, it appears that the segments AC and BD are equidistant. By prolonging the lines AC and BD and considering congruent segments, it appears that the lines AC and BD are equidistant throughout their extent. Since this is impossible in the Lobachevskian and Riemannian geometries (secs. 11, 12, 13), the theorem is proved.

19. Each summit angle of an isosceles birectangular quadrilateral is less than a right angle in the Lobachevskian geometry and greater than a right angle in the Riemannian geometry.

In Fig. 18, the line CK measures the distance of the line AC from the line LK at the point C. In the Lobachevskian geometry, if the line AC is taken sufficiently long, CK > AL (sec. 12). If, therefore, CK were in any position less than AL, there would exist at least one other position in which CK = AL. This is impossible (sec. 18), and hence CK is always greater than AL and the angle C less than a right angle (sec. 17).

In the Riemannian geometry the lines AC and LK eventually intersect. Hence, if AC is sufficiently long, CK < AL, and therefore CK is always less than AL and the angle C greater than a right angle.

20. In the Euclidean, Lobachevskian, and Riemannian geometries respectively the sum of the angles of a triangle is equal to, less than, and greater than two right angles.

We have seen in sec. 16 that the sum of the angles of a triangle is equal to that of the summit angles of an isosceles birectangular quadrilateral. The theorem then follows from secs. 18, 19.

VII. AREAS

21. According to the definition already given (sec. 16), two polygons are equivalent, or equal in area, if they can be divided into the same number of triangles which are congruent in pairs. We have proved (sec.
16) that a triangle is equivalent to an isosceles birectangular quadrilateral having its summit equal to one side of the triangle, and each summit angle equal to half the sum of the angles of the triangle. Now, in either the Lobachevskian or the Riemannian geometry an isosceles birectangular quadrilateral is fully determined by its summit and summit angles, for if CDEF (Fig. 20) and ABEF are two isosceles birectangular quadrilaterals with the same summit EF and the same summit angles E and F, their bases CD and AB must coincide. Otherwise, the quadrilateral ABCD would have four right angles, which is impossible (sec. 18). Hence follows the theorem:

[Fig. 20. Fig. 21.]

In the Lobachevskian and Riemannian geometries, two triangles are equivalent if a side and the sum of the angles of one are equal to a side and the sum of the angles of another.

22. A triangle may be constructed having the same area and the same angle sum as a given triangle, and having one side arbitrarily assumed within certain wide limits.

Let ABC (Fig. 21) be a given triangle and BCKL the isosceles birectangular quadrilateral constructed as in sec. 16. Let $l$ be a given length. With $\tfrac{1}{2}l$ as a radius and B as a center describe an arc of a circle cutting KL in M. Connect B and M and prolong BM to A' so that MA' = BM. Connect A' and C. Then A'BC is the required triangle, as is readily shown.

That the construction may be possible, it is necessary, on the one hand, that BM > BK, a condition which is certainly met if $l$ > AB. On the other hand, it is necessary, in the Riemannian geometry, that $l$ should be less than the constant 2J (sec. 15).

If, now, we have two triangles with the same angle sum, we may take $l$ greater than one side of each, and replace each by an equivalent one with the same angle sum and a side equal to $l$. The two new triangles are equivalent (sec. 21). Hence:

Any two triangles with the same angle sum are equivalent.

23. Consider any triangle ABC (Fig.
22) and draw from A a straight line to any point D of the base. We shall call this line a transversal and shall say that the triangle is divided transversally. Now if $s$ is the sum of the angles of the triangle ABC, and $s_1$ and $s_2$ the sums of the angles of the triangles ABD and ADC respectively, we have

$$s = s_1 + s_2 - 2 \text{ right angles}.$$

[Fig. 22.]

If we adopt such a unit of angle measure that a right angle shall have the measure $\frac{\pi}{2}$, the above equation may be written in either of the forms

$$\pi - s = (\pi - s_1) + (\pi - s_2)$$

or

$$s - \pi = (s_1 - \pi) + (s_2 - \pi).$$

In the Lobachevskian geometry, $\pi - s$ is positive (sec. 20) and is called the defect of the triangle. In the Riemannian geometry $s - \pi$ is positive and is called the excess of the triangle. Hence we may state the theorem:

If a triangle is divided transversally, the sum of the defects, or excesses, of the parts is equal to the defect, or excess, of the triangle.

The theorem evidently remains true if the triangle is further subdivided by successive transversals of the parts, as shown, for example, in Fig. 23. Further, Hilbert* has shown that any division of a triangle may be reduced to transverse divisions. We have accordingly the more general theorem:

[Fig. 23.]

In the Lobachevskian and Riemannian geometries the defect, or excess, of any triangle is equal to the sum of the defects, or excesses, of triangles which are formed from it by any system of division.

24. Since equivalent triangles may be divided into the same number of triangles congruent in pairs (sec. 21), and since obviously congruent triangles have the same defect, or excess, it follows that any two equivalent triangles have the same defect, or excess. The converse theorem has been proved in sec. 22.
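The additivity of the defect under a transversal division (sec. 23) can be checked numerically. The following is a minimal sketch in Python and is not part of the original text. It assumes the Lobachevskian case with the unit of length chosen so that $k = 1$, and it borrows the hyperbolic law of cosines, $\cosh a = \cosh b \cosh c - \sinh b \sinh c \cos A$, which is the form taken by the oblique-triangle formula of sec. 35 when $m = i/k$. The helper names and the sample side lengths are illustrative choices only.

```python
import math

def third_side(s1, s2, included_angle):
    """Side opposite the angle included between sides s1 and s2 (k = 1 assumed)."""
    return math.acosh(math.cosh(s1) * math.cosh(s2)
                      - math.sinh(s1) * math.sinh(s2) * math.cos(included_angle))

def angle_opposite(opp, s1, s2):
    """Angle opposite side `opp` in a hyperbolic triangle with other sides s1, s2."""
    x = (math.cosh(s1) * math.cosh(s2) - math.cosh(opp)) / (math.sinh(s1) * math.sinh(s2))
    return math.acos(max(-1.0, min(1.0, x)))  # clamp against rounding error

def defect(a, b, c):
    """pi minus the angle sum of the triangle with sides a, b, c."""
    angle_sum = (angle_opposite(a, b, c)
                 + angle_opposite(b, c, a)
                 + angle_opposite(c, a, b))
    return math.pi - angle_sum

# Triangle ABC given by sides b = AC, c = AB and the angle A between them.
b, c, A = 1.2, 0.9, 1.0
a = third_side(b, c, A)            # a = BC
B = angle_opposite(b, c, a)        # angle at vertex B

# Transversal AD: D lies on BC at distance t from B.
t = 0.4 * a
d = third_side(c, t, B)            # d = AD, from triangle ABD

whole = defect(a, b, c)            # defect of ABC
part1 = defect(t, d, c)            # defect of ABD (sides BD, AD, AB)
part2 = defect(a - t, b, d)        # defect of ADC (sides DC, AC, AD)

print(whole, part1 + part2)
```

Because the angles of each part are computed independently from the side lengths of that part, the agreement of the two printed numbers is a genuine check of the theorem and not a restatement of it.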
We are now enabled to take the defect, or excess, of a triangle as the measure of its area, since the essential properties of a measure of area are that two triangles with the same area have the same measure, that two triangles with the same measure have the same area, and that the measure of a whole is the sum of the measures of its parts. Hence we may say:

In the Lobachevskian geometry the area of a triangle is equal to a constant times its defect. In the Riemannian geometry, the area of a triangle is equal to a constant times its excess.

The value of the constant depends evidently upon the unit of area employed.

The area of a polygon is found by dividing it into triangles.

* Grundlagen der Geometrie, Vol. VII of Wissenschaft und Hypothese, Leipzig, 1909.

VIII. NON-EUCLIDEAN TRIGONOMETRY

25. The definitions of the trigonometric functions as given in the elementary trigonometry are evidently not available in non-Euclidean geometries, since these definitions are based upon properties of similar triangles which are true only in the Euclidean geometry. Lobachevsky met this difficulty by the construction of a limit-surface, or horisphere, on which the Euclidean geometry and trigonometry are valid at the same time that the Lobachevskian geometry is valid on the plane. By the aid of this surface and the sphere he obtained the formulas which will be found in sec. 34.

This method, however, cannot be applied to the Riemannian geometry. We shall therefore follow a more general method which has also the advantage of operating entirely in the plane. The method, however, is not as elementary as the other, and we shall be obliged to state some results without proof and to give a mere outline of other proofs.*

We start with the purely analytic definitions of the trigonometric functions. That is, $e^x$ being defined by the series

$$e^x = 1 + \frac{x}{1} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$
the trigonometric functions are defined by the equations

$$\sin x = \frac{e^{ix} - e^{-ix}}{2i}, \qquad \cos x = \frac{e^{ix} + e^{-ix}}{2}, \qquad \tan x = -i\,\frac{e^{ix} - e^{-ix}}{e^{ix} + e^{-ix}},$$

where $i = \sqrt{-1}$. These functions obey all the formulas of trigonometry, and if x is a real number they are real.

* For complete proofs and historical notes consult Coolidge, The Elements of Non-Euclidean Geometry, Oxford, 1909, especially Chapter IV, where all the requirements of rigor are met.

If x is pure imaginary, the above equations lead to the hyperbolic functions, which are defined by the following equations:

$$-i \sin ix = \frac{e^{x} - e^{-x}}{2} = \sinh x,$$
$$\cos ix = \frac{e^{x} + e^{-x}}{2} = \cosh x,$$
$$-i \tan ix = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \tanh x.$$

If x is real, the hyperbolic functions are real, and formulas for their use are readily obtained, if needed, from the trigonometric functions.

The following properties of cos x are important for us:

If $\cos x < 1$, x is real; if $\cos x > 1$, x is pure imaginary, except perhaps for multiples of the period $2\pi$, which may always be added.

If we place $\cos mx = f(x)$, $f(x)$ satisfies the functional equation

$$f(x + y) + f(x - y) = 2 f(x) f(y).$$

Conversely, if $f(x)$ is a continuous function of x satisfying the above equation, then

$$f(x) = \cos mx,$$

m being a constant, real or complex.

26. The sine and cosine of an acute angle may be defined as follows. The extension to angles of any size is then made as in the ordinary trigonometry.

Let A (Fig. 24) be any acute angle and MP the perpendicular from any point P of one side to the other side; then it may be shown that $\frac{AM}{AP}$ approaches a limit as AP approaches zero, and that $\lim \frac{AM}{AP}$ is a continuous function of A, which satisfies the functional equation of sec. 25. Hence

$$\lim \frac{AM}{AP} = \cos mA.$$

Since AM < AP, the coefficient m is real, and if we adopt a system of measurement of angle by which a right angle has the measure $\frac{\pi}{2}$, we may place m = 1. Hence, finally,

$$\lim \frac{AM}{AP} = \cos A.$$

In a similar manner

$$\lim \frac{MP}{AP} = \sin A.$$

[Fig. 24. Fig. 25.]

27. Let AC (Fig.
25) be any straight line of given length a. From A draw AB perpendicular to AC and take AB any length. At B draw BD perpendicular to AB, take BD = AC, and complete the isosceles birectangular quadrilateral ABCD. Then it may be shown that $\frac{CD}{AB}$ approaches a limit as AB approaches zero, and that this limit is a continuous function of a, satisfying the functional equation of sec. 25. Hence

$$\lim \frac{CD}{AB} = \cos ma.$$

In the Lobachevskian geometry, CD > AB and m is pure imaginary. In this case, we place $m = \frac{i}{k}$, where k is real, and have

$$\lim \frac{CD}{AB} = \cos \frac{ia}{k} = \cosh \frac{a}{k}.$$

In the Riemannian geometry, CD < AB and m is real. In this case, we place $m = \frac{1}{k}$ and have

$$\lim \frac{CD}{AB} = \cos \frac{a}{k}.$$

There appears here a striking property of the non-Euclidean geometries in the existence of functions of distances analogous to functions of angles. The constant k depends upon the unit of distance employed.

If we apply the construction of this article to the Euclidean geometry we obtain the trivial result $\lim \frac{CD}{AB} = 1$. It is worth noting that this comes out of the previous results by placing $k = \infty$.

28. We shall indicate in this section a method by which a fundamental formula connecting the sides of a right triangle may be obtained.

Let ABC (Fig. 26) be a triangle with the right angle at C and with the sides AB = c, AC = b, CB = a. Take $AA_1$ a small distance on AC and prolong AC to $C_1$ so that $AA_1 = CC_1$, and construct the triangle $A_1B_1C_1$ congruent to ABC. Let $B_2$ be the point of intersection of $A_1B_1$ and BC. Prolong $B_1A_1$ and BC so that $A_1A_2 = B_1B_2$ and $CC_2 = BB_2$, and construct the triangle $A_2B_2C_2$, which differs slightly from ABC.

[Fig. 26.]

From $B_1$ draw $B_1D_1$ perpendicular to BC, and from B draw BD perpendicular to $A_1B_1$. Also draw $HH_1$, the common perpendicular to AC and $A_2C_2$, and $EE_1$, the common perpendicular to AB and $A_2B_2$.
$EE_1$ evidently bisects $AA_1$, and $HH_1$ passes near the middle point of $A_1A_2$.

Then as $AA_1$ approaches zero as a limit it may be shown that

$$\lim \frac{BD}{EE_1} = \cos mc, \qquad \lim \frac{B_1D_1}{CC_1} = \cos ma, \qquad \lim \frac{CC_2}{HH_1} = \cos mb.$$

In fact, from the definition of sec. 27, the reader will have no difficulty in seeing that these relations are, at least, approximately true. The rigorous demonstration may be found in the book by Coolidge just cited.

We have, then,

$$\frac{\cos mc}{\cos ma \cos mb} = \lim \frac{BD}{EE_1} \cdot \frac{CC_1}{B_1D_1} \cdot \frac{HH_1}{CC_2} = \lim \frac{BD}{EE_1} \cdot \frac{AA_1}{B_1D_1} \cdot \frac{HH_1}{BB_2} = \lim \frac{BD}{BB_2} \cdot \frac{AA_1}{EE_1} \cdot \frac{HH_1}{A_1A_2} \cdot \frac{B_1B_2}{B_1D_1}.$$

Now it may be shown that

$$\sin B = \lim \frac{BD}{BB_2} = \lim \frac{B_1D_1}{B_1B_2}, \qquad \sin A = \lim \frac{HH_1}{A_1A_2} = \lim \frac{EE_1}{AA_1},$$

as may be seen approximately from Fig. 26 and the definition of sec. 26. We have, accordingly,

$$\frac{\cos mc}{\cos ma \cos mb} = \sin B \cdot \frac{1}{\sin A} \cdot \sin A \cdot \frac{1}{\sin B} = 1,$$

or

$$\cos mc = \cos ma \cos mb.$$

29. Let ABC (Fig. 27) be a triangle with right angle at C and AC = b, BC = a, AB = c. Take any point D on AC and draw BD, and DE perpendicular to AB. Let BD = l, DE = p, AE = q, AD = k. Then (sec. 28)

$$\cos ml = \cos ma \cos m(b - k) = \cos mc \cos mk + \cos ma \sin mb \sin mk,$$
$$\cos ml = \cos mp \cos m(c - q) = \cos mc \cos mk + \cos mp \sin mc \sin mq,$$

whence

$$\cos ma \sin mb \sin mk = \cos mp \sin mc \sin mq.$$

[Fig. 27.]

By use of the relations $\cos mc = \cos ma \cos mb$ and $\cos mk = \cos mp \cos mq$, we find readily

$$\frac{\tan mb}{\tan mc} = \frac{\tan mq}{\tan mk}.$$

Now as k approaches zero as a limit, q approaches zero also, and

$$\lim \frac{\tan mq}{\tan mk} = \lim \frac{q}{k} = \cos A.$$

Hence

$$\frac{\tan mb}{\tan mc} = \cos A.$$

30. From the result of sec. 29, we have

$$\sin^2 A = \frac{\tan^2 mc - \tan^2 mb}{\tan^2 mc} = \frac{\sin^2 mc - \tan^2 mb \cos^2 mc}{\sin^2 mc} = \frac{1 - (1 + \tan^2 mb)\cos^2 mc}{\sin^2 mc} = \frac{1 - \dfrac{\cos^2 mc}{\cos^2 mb}}{\sin^2 mc} = \frac{1 - \cos^2 ma}{\sin^2 mc} = \frac{\sin^2 ma}{\sin^2 mc}.$$

Since A is an acute angle, we have

$$\sin A = \frac{\sin ma}{\sin mc}.$$

Similarly

$$\sin B = \frac{\sin mb}{\sin mc}.$$

From sec. 29 and these results we have

$$\frac{\cos A}{\sin B} = \frac{\cos mc}{\cos mb} = \cos ma,$$

whence

$$\cos A = \cos ma \sin B.$$

Similarly

$$\cos B = \cos mb \sin A.$$

31.
The formulas obtained in secs. 28-30 are applicable to both the Lobachevskian and the Riemannian geometries. For the Riemannian geometry, we place $m = \frac{1}{k}$ and make the following collection of the formulas:

$$\cos \frac{c}{k} = \cos \frac{a}{k} \cos \frac{b}{k},$$
$$\sin \frac{a}{k} = \sin \frac{c}{k} \sin A, \qquad \sin \frac{b}{k} = \sin \frac{c}{k} \sin B,$$
$$\tan \frac{a}{k} = \tan \frac{c}{k} \cos B, \qquad \tan \frac{b}{k} = \tan \frac{c}{k} \cos A,$$
$$\cos A = \cos \frac{a}{k} \sin B, \qquad \cos B = \cos \frac{b}{k} \sin A.$$

32. We obtain from secs. 28-30 the formulas for the Lobachevskian geometry by placing $m = \frac{i}{k}$ and replacing the trigonometric functions by the hyperbolic ones. We have

$$\cosh \frac{c}{k} = \cosh \frac{a}{k} \cosh \frac{b}{k},$$
$$\sinh \frac{a}{k} = \sinh \frac{c}{k} \sin A, \qquad \sinh \frac{b}{k} = \sinh \frac{c}{k} \sin B,$$
$$\tanh \frac{a}{k} = \tanh \frac{c}{k} \cos B, \qquad \tanh \frac{b}{k} = \tanh \frac{c}{k} \cos A,$$
$$\cos A = \cosh \frac{a}{k} \sin B, \qquad \cos B = \cosh \frac{b}{k} \sin A.$$

It is worth noticing that the formulas for the Euclidean trigonometry come out of those in sec. 31 or sec. 32 as limit cases when $k = \infty$ (cf. also sec. 43).

33. The formulas of sec. 32 may be used to obtain an expression for the angle of parallelism $\Pi(x)$ belonging to a distance x. Let BM (Fig. 28) be parallel to CN and BC perpendicular to CN. The figure NCBM may be regarded as the limit of a right triangle ABC in which BC = x is constant, A approaches zero, and B approaches $\Pi(x)$.

[Fig. 28.]

The formula

$$\cos A = \cosh \frac{a}{k} \sin B \quad \text{(sec. 32)}$$

goes over into

$$\sin \Pi(x) = \frac{1}{\cosh \dfrac{x}{k}} = \frac{2}{e^{x/k} + e^{-x/k}},$$

whence

$$\cos \Pi(x) = \frac{e^{x/k} - e^{-x/k}}{e^{x/k} + e^{-x/k}} = \tanh \frac{x}{k}.$$

Then

$$\tan \tfrac{1}{2}\Pi(x) = \frac{\sin \Pi(x)}{1 + \cos \Pi(x)} = e^{-x/k}.$$

34. If we substitute in the formulas of sec. 32 the values of $\cosh \frac{x}{k}$ and $\tanh \frac{x}{k}$ found in sec. 33, and make certain simple reductions, the formulas of sec. 32 take the following forms:

$$\sin \Pi(c) = \sin \Pi(a) \sin \Pi(b),$$
$$\tan \Pi(c) = \tan \Pi(a) \sin A, \qquad \tan \Pi(c) = \tan \Pi(b) \sin B,$$
$$\cos \Pi(a) = \cos \Pi(c) \cos B, \qquad \cos \Pi(b) = \cos \Pi(c) \cos A,$$
$$\sin B = \sin \Pi(a) \cos A, \qquad \sin A = \sin \Pi(b) \cos B.$$
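The relations of secs. 33 and 34 lend themselves to a quick numerical check. The sketch below is an illustration only, not part of the original text. It assumes $k = 1$, takes $\Pi(x) = 2 \arctan e^{-x/k}$ from the last equation of sec. 33, and builds a right triangle from two arbitrarily chosen legs by means of the formulas of sec. 32. The function name and the sample values are hypothetical.

```python
import math

def angle_of_parallelism(x, k=1.0):
    """Pi(x), obtained from tan(Pi(x)/2) = exp(-x/k) (sec. 33)."""
    return 2.0 * math.atan(math.exp(-x / k))

k = 1.0
a, b = 0.7, 1.3                      # legs of a right triangle (sample values)

# Hypotenuse and acute angles from the formulas of sec. 32.
c = k * math.acosh(math.cosh(a / k) * math.cosh(b / k))
A = math.asin(math.sinh(a / k) / math.sinh(c / k))
B = math.asin(math.sinh(b / k) / math.sinh(c / k))

def P(x):
    return angle_of_parallelism(x, k)

# Each entry is (left side) - (right side) of one relation from secs. 33-34.
residuals = [
    math.sin(P(a)) - 1.0 / math.cosh(a / k),          # sin Pi(x) = 1 / cosh(x/k)
    math.cos(P(a)) - math.tanh(a / k),                # cos Pi(x) = tanh(x/k)
    math.sin(P(c)) - math.sin(P(a)) * math.sin(P(b)),
    math.tan(P(c)) - math.tan(P(a)) * math.sin(A),
    math.cos(P(a)) - math.cos(P(c)) * math.cos(B),
    math.sin(B) - math.sin(P(a)) * math.cos(A),
]
max_residual = max(abs(r) for r in residuals)
print(max_residual)
```

The residuals vanish to rounding error, and the function also exhibits the behavior asserted in Theorem III of sec. 8: it equals a right angle at distance zero and decreases as the distance increases.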
The formulas above, written in terms of $\Pi$, are the forms found by Lobachevsky, except that he writes $A = \Pi(\alpha)$, $B = \Pi(\beta)$, where $\alpha$ and $\beta$ are the distances corresponding to the angles of parallelism A and B respectively. We shall make no use of these equations, but have given them to facilitate comparison with Lobachevsky's own work.

35. The above formulas are for right triangles. We shall now obtain one for oblique triangles. Let ABC (Fig. 29) be any triangle with the angles A, B, and C, and the opposite sides a, b, and c, respectively. Draw BD perpendicular to AC and let BD = h, AD = k. Then
$$\cos ma = \cos mh \cos m(b - k) = \cos mc \cos mb + \sin mb \sin mk \cos mh = \cos mc \cos mb + \sin mb \tan mk \cos mc = \cos mc \cos mb + \sin mb \sin mc \cos A.$$
FIG. 29.

IX. NON-EUCLIDEAN ANALYTIC GEOMETRY

36. Let OX and OY (Fig. 30) be two axes of coordinates intersecting at right angles and MP and NP the perpendiculars from any point P to OX and OY respectively. We shall take OM = x, ON = y as the coordinates of P. To every point P corresponds a single set of coordinates (x, y) and to any set of coordinates corresponds not more than one point P. But if x and y are assumed arbitrarily there is not necessarily a corresponding point P in the Lobachevskian geometry, since the two perpendiculars at M and N may be parallel or non-intersecting.
FIG. 30. FIG. 31.
By drawing the line OP, we may take OP = r, $\angle XOP = \theta$, as the polar coordinates of P. Between the two sets of coordinates there exist, in either the Riemannian or the Lobachevskian geometry, the relations (sec. 29)
$$\tan mx = \tan mr \cos\theta, \qquad \tan my = \tan mr \sin\theta,$$
whence
$$\tan^2 mx + \tan^2 my = \tan^2 mr.$$

37. The equation of a straight line may be obtained as follows: Let LK (Fig. 31) be any straight line determined by the parameters p and $\alpha$, where p is the length of the perpendicular OD from the origin and $\alpha$ the angle made by OD with the positive direction of OX.
Let P(x, y) be any point on LK and draw OP. Then in the triangle OPD, OD = p, OP = r, $\angle POD = \theta - \alpha$, where $(r, \theta)$ are the polar coordinates of P. Hence (sec. 29)
$$\tan mr \cos(\theta - \alpha) = \tan mp,$$
whence (sec. 36)
$$\tan mx \cos\alpha + \tan my \sin\alpha = \tan mp,$$
the required equation.

38. The distance between two points may be found as follows: Let $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ (Fig. 32) be any two points with the polar coordinates $(r_1, \theta_1)$ and $(r_2, \theta_2)$ respectively. Draw $OP_1$, $OP_2$, and $P_1P_2$. Then in the triangle $OP_1P_2$, $OP_1 = r_1$, $OP_2 = r_2$, $\angle P_2OP_1 = \theta_1 - \theta_2$.
FIG. 32. FIG. 33.
Hence (sec. 35),
$$\cos m\,P_1P_2 = \cos mr_1 \cos mr_2 + \sin mr_1 \sin mr_2 \cos(\theta_1 - \theta_2) = \cos mr_1 \cos mr_2\,[1 + \tan mr_1 \tan mr_2 \cos(\theta_1 - \theta_2)].$$
By use of the formulas of sec. 36, this reduces readily to
$$\cos m\,P_1P_2 = \frac{1 + \tan mx_1 \tan mx_2 + \tan my_1 \tan my_2}{\sqrt{1 + \tan^2 mx_1 + \tan^2 my_1}\,\sqrt{1 + \tan^2 mx_2 + \tan^2 my_2}},$$
the required formula.

39. The angle between two lines may be determined as follows: Let $PL_1$ and $PL_2$ (Fig. 33) be two straight lines intersecting at P. Draw from O the two perpendiculars $OD_1$ and $OD_2$ on $PL_1$ and $PL_2$ respectively, and (as in sec. 37), let $OD_1 = p_1$, $\angle XOD_1 = \alpha_1$, $OD_2 = p_2$, $\angle XOD_2 = \alpha_2$. Draw OP and place OP = r, $\angle XOP = \theta$, $\angle OPD_1 = \beta_1$, $\angle OPD_2 = \beta_2$, and $\angle L_1PL_2 = \phi = \pi - (\beta_1 + \beta_2)$. Now from the right triangles $OPD_1$ and $OPD_2$, we have (sec. 30),
$$\sin\beta_1 = \frac{\sin mp_1}{\sin mr}, \qquad \sin\beta_2 = \frac{\sin mp_2}{\sin mr},$$
$$\cos\beta_1 = \cos mp_1 \sin(\theta - \alpha_1), \qquad \cos\beta_2 = \cos mp_2 \sin(\alpha_2 - \theta) = -\cos mp_2 \sin(\theta - \alpha_2).$$
Therefore,
$$\cos\phi = \cos mp_1 \cos mp_2 \sin(\theta - \alpha_1)\sin(\theta - \alpha_2) + \frac{\sin mp_1 \sin mp_2}{\sin^2 mr}. \qquad (1)$$
But (sec. 37)
$$\cos(\theta - \alpha_1)\tan mr = \tan mp_1, \qquad \cos(\theta - \alpha_2)\tan mr = \tan mp_2,$$
whence
$$0 = \cos mp_1 \cos mp_2 \cos(\theta - \alpha_1)\cos(\theta - \alpha_2) - \frac{\sin mp_1 \sin mp_2}{\tan^2 mr}.$$
Adding this equation to equation (1), we have
$$\cos\phi = \cos mp_1 \cos mp_2 \cos(\alpha_1 - \alpha_2) + \sin mp_1 \sin mp_2 = \frac{\cos\alpha_1\cos\alpha_2 + \sin\alpha_1\sin\alpha_2 + \tan mp_1 \tan mp_2}{\sqrt{1 + \tan^2 mp_1}\,\sqrt{1 + \tan^2 mp_2}},$$
which gives the required angle in terms of the functions which enter into the equations of the lines.

40. The formulas of secs.
36-39 apply to either the Riemannian or the Lobachevskian geometry. It is now convenient to separate the two cases. In the Riemannian geometry, where $m = \frac{1}{k}$, we will introduce, instead of x and y, the new coordinates $\xi$ and $\eta$, where
$$\xi = k\tan\frac{x}{k}, \qquad \eta = k\tan\frac{y}{k}. \qquad (1)$$
The equation of the straight line (sec. 37) becomes
$$\xi\cos\alpha + \eta\sin\alpha = k\tan\frac{p}{k}$$
or, more generally,
$$a\xi + b\eta + c = 0, \qquad (2)$$
where
$$\cos\alpha = \frac{a}{\sqrt{a^2 + b^2}}, \qquad \sin\alpha = \frac{b}{\sqrt{a^2 + b^2}}, \qquad k\tan\frac{p}{k} = \frac{-c}{\sqrt{a^2 + b^2}}. \qquad (3)$$
Conversely, any equation of form Eq. (2) represents a straight line, since $\alpha$ and p can always be obtained from Eqs. (3). In particular, the equation $\eta = c$ represents a line perpendicular to OY and intersecting OX at the point where $\xi = \infty$. But, from Eq. (1), $\xi = \infty$ when $x = \frac{\pi k}{2}$. By sec. 15, two lines perpendicular to the same line intersect at a distance $\Delta$. Hence
$$\Delta = \frac{\pi k}{2}.$$
This fixes the constant k in terms of $\Delta$. The formulas for distance (sec. 38), and angle (sec. 39), become respectively
$$\cos\frac{P_1P_2}{k} = \frac{k^2 + \xi_1\xi_2 + \eta_1\eta_2}{\sqrt{k^2 + \xi_1^2 + \eta_1^2}\,\sqrt{k^2 + \xi_2^2 + \eta_2^2}}, \qquad (4)$$
$$\cos\phi = \frac{k^2(a_1a_2 + b_1b_2) + c_1c_2}{\sqrt{k^2(a_1^2 + b_1^2) + c_1^2}\,\sqrt{k^2(a_2^2 + b_2^2) + c_2^2}}. \qquad (5)$$
In Eq. (4) let us place $\xi_1 = \xi$, $\eta_1 = \eta$, $\xi_2 = \xi + d\xi$, $\eta_2 = \eta + d\eta$. The right-hand side of the equation becomes, as far as infinitesimals of the second order are concerned,
$$1 - \frac{k^2(d\xi^2 + d\eta^2) + (\eta\,d\xi - \xi\,d\eta)^2}{2(k^2 + \xi^2 + \eta^2)^2}.$$
The left-hand side of the same equation becomes, if we place $P_1P_2 = ds$, and expand,
$$1 - \frac{(ds)^2}{2k^2}.$$
Hence
$$ds = \frac{k\sqrt{k^2(d\xi^2 + d\eta^2) + (\eta\,d\xi - \xi\,d\eta)^2}}{k^2 + \xi^2 + \eta^2}, \qquad (6)$$
which gives the element of arc of any curve. We may transform Eq. (6) to polar coordinates by placing
$$\xi = k\tan\frac{r}{k}\cos\theta, \qquad \eta = k\tan\frac{r}{k}\sin\theta.$$
It becomes
$$ds = \sqrt{dr^2 + k^2\sin^2\frac{r}{k}\,d\theta^2}.$$
Therefore the circumference, C, of the circle r = a is
$$C = k\sin\frac{a}{k}\int_0^{2\pi} d\theta = 2\pi k\sin\frac{a}{k}.$$

41. To modify the formulas of secs. 36-39 for the Lobachevskian geometry, where $m = \frac{i}{k}$, we place
$$\xi = k\tanh\frac{x}{k}, \qquad \eta = k\tanh\frac{y}{k}. \qquad (1)$$
The equation of the straight line (sec. 37) becomes
$$\xi\cos\alpha + \eta\sin\alpha = k\tanh\frac{p}{k}$$
or
$$a\xi + b\eta + c = 0, \qquad (2)$$
where
$$\cos\alpha = \frac{a}{\sqrt{a^2 + b^2}}, \qquad \sin\alpha = \frac{b}{\sqrt{a^2 + b^2}}, \qquad k\tanh\frac{p}{k} = \frac{-c}{\sqrt{a^2 + b^2}}. \qquad (3)$$
Now, if p is real, $\tanh\frac{p}{k} < 1$; hence from Eq. (3) $c^2 < k^2(a^2 + b^2)$. Conversely, Eq. (2) represents a straight line provided $c^2 < k^2(a^2 + b^2)$, for then $\alpha$ and p may be determined from Eq. (3). The formulas for distance (sec. 38) and angle (sec. 39), become respectively
$$\cosh\frac{P_1P_2}{k} = \frac{k^2 - \xi_1\xi_2 - \eta_1\eta_2}{\sqrt{k^2 - \xi_1^2 - \eta_1^2}\,\sqrt{k^2 - \xi_2^2 - \eta_2^2}}, \qquad (4)$$
$$\cos\phi = \frac{k^2(a_1a_2 + b_1b_2) - c_1c_2}{\sqrt{k^2(a_1^2 + b_1^2) - c_1^2}\,\sqrt{k^2(a_2^2 + b_2^2) - c_2^2}}. \qquad (5)$$
If in Eq. (4) we place $\xi_1 = \xi$, $\eta_1 = \eta$, $\xi_2 = \xi + d\xi$, $\eta_2 = \eta + d\eta$, $P_1P_2 = ds$, it becomes, as far as infinitesimals of the second order are concerned,
$$1 + \frac{(ds)^2}{2k^2} = 1 + \frac{k^2(d\xi^2 + d\eta^2) - (\eta\,d\xi - \xi\,d\eta)^2}{2(k^2 - \xi^2 - \eta^2)^2},$$
whence the element of arc of any curve is given by the formula
$$ds = \frac{k\sqrt{k^2(d\xi^2 + d\eta^2) - (\eta\,d\xi - \xi\,d\eta)^2}}{k^2 - \xi^2 - \eta^2}.$$
In polar coordinates, this becomes
$$ds = \sqrt{dr^2 + k^2\sinh^2\frac{r}{k}\,d\theta^2},$$
whence the circumference of the circle r = a is
$$C = k\sinh\frac{a}{k}\int_0^{2\pi} d\theta = 2\pi k\sinh\frac{a}{k} = \pi k\left(e^{a/k} - e^{-a/k}\right).$$

42. We may now complete the discussion of area given in secs. 21-24. The unit of angle being such that a right angle has the measure $\frac{\pi}{2}$, we will take the unit of area such that, $\alpha$, $\beta$, and $\gamma$ being the angles of a triangle ABC, we have in the Riemannian geometry
$$\text{Area } ABC = k^2(\alpha + \beta + \gamma - \pi),$$
and in the Lobachevskian geometry
$$\text{Area } ABC = k^2(\pi - \alpha - \beta - \gamma).$$
Consider now in the Riemannian plane a trirectangular quadrilateral (Fig. 34) formed by the axes OX and OY and the lines MP ($\xi = c_1$) and NP ($\eta = c_2$). Denote the area of OMPN by A and the angle MPN by $\phi$. Then, by dividing OMPN into two triangles,
$$A = k^2\left(\phi - \frac{\pi}{2}\right),$$
whence
$$\sin\frac{A}{k^2} = -\cos\phi.$$
FIG. 34.
Therefore, by sec. 40,
$$\sin\frac{A}{k^2} = \frac{\xi\eta}{\sqrt{k^2 + \xi^2}\,\sqrt{k^2 + \eta^2}}, \qquad (1)$$
the positive signs of the radicals being taken since $\frac{A}{k^2} < \frac{\pi}{2}$. Let us now increase $\xi$ by $d\xi$, corresponding to $MM_1$ in the figure. The corresponding differential of area, $d_\xi A$, represented by $MM_1P_1P$, is found by differentiating Eq. (1). We have
$$d_\xi A = \frac{k^3\,\eta\,d\xi}{(k^2 + \xi^2)\sqrt{k^2 + \xi^2 + \eta^2}}. \qquad (2)$$
The differential of this area caused by a change of $d\eta$ in $\eta$ is represented in the figure by $PP_1QQ_1$. We shall call this area dA and obtain it by differentiating Eq. (2) with respect to $\eta$. There results
$$dA = \frac{k^3\,d\xi\,d\eta}{(k^2 + \xi^2 + \eta^2)^{3/2}}. \qquad (3)$$
The same process applied to the Lobachevskian plane leads to the result
$$dA = \frac{k^3\,d\xi\,d\eta}{(k^2 - \xi^2 - \eta^2)^{3/2}}. \qquad (4)$$
Eq. (3) may be applied to find the area of the circle $\xi^2 + \eta^2 = k^2\tan^2\frac{a}{k}$ in the Riemannian geometry. We have*
$$A = 4k^3\int_0^{k\tan\frac{a}{k}}\int_0^{\sqrt{k^2\tan^2\frac{a}{k} - \xi^2}} \frac{d\xi\,d\eta}{(k^2 + \xi^2 + \eta^2)^{3/2}} = 4\pi k^2\sin^2\frac{a}{2k}.$$
Similarly the area of the circle $\xi^2 + \eta^2 = k^2\tanh^2\frac{a}{k}$ in the Lobachevskian geometry is found to be
$$A = 4\pi k^2\sinh^2\frac{a}{2k} = \pi k^2\left(e^{a/2k} - e^{-a/2k}\right)^2.$$

* The calculation is facilitated by changing the variables in the integral,
$$A = k^3\iint \frac{d\xi\,d\eta}{(k^2 + \xi^2 + \eta^2)^{3/2}},$$
to polar coordinates, by the methods of the calculus for such a problem. (See Hedrick's translation of Goursat's Mathematical Analysis, p. 266.) We have, in the Riemannian geometry, $A = \iint k\sin\frac{r}{k}\,dr\,d\theta$, and similarly in the Lobachevskian geometry, $A = \iint k\sinh\frac{r}{k}\,dr\,d\theta$.

43. We have noticed in sec. 32 that the formulas for the non-Euclidean trigonometry include those of the Euclidean trigonometry as a limiting case when $k = \infty$. A similar remark applies to the non-Euclidean analytic geometry. We note that as $k = \infty$
$$\lim k\sin\frac{a}{k} = \lim k\tan\frac{a}{k} = \lim k\sinh\frac{a}{k} = \lim k\tanh\frac{a}{k} = a$$
and
$$\lim \cos\frac{a}{k} = \lim \cosh\frac{a}{k} = 1.$$
The coordinates $(\xi, \eta)$ of either the Riemannian or Lobachevskian geometry become in the limit the coordinates (x, y) of the Euclidean geometry, and the formulas of secs. 40-42 reduce either to the identity 1 = 1 or to the corresponding Euclidean formula. For example, Eq. (4), sec. 40 or sec. 41, gives at first sight 1 = 1, but if we expand in powers of $\frac{1}{k}$ and consider the terms of lower order it is easy to obtain the formula
$$P_1P_2 = \sqrt{(\xi_1 - \xi_2)^2 + (\eta_1 - \eta_2)^2}.$$
On the other hand, Eq. (5), sec. 40 or sec.
41, gives at once
$$\cos\phi = \frac{a_1a_2 + b_1b_2}{\sqrt{a_1^2 + b_1^2}\,\sqrt{a_2^2 + b_2^2}}.$$
It appears that the Riemannian and Lobachevskian geometries will differ inappreciably from the Euclidean geometry, in their practical applications, provided k is very large. Therein lies the impossibility of determining by experience which of the three geometries is physically true.

X. REPRESENTATION OF THE LOBACHEVSKIAN GEOMETRY ON A EUCLIDEAN PLANE

44. Let $P(\xi, \eta)$ be any point on a Lobachevskian plane, $(r, \theta)$ its polar coordinates, where r is always positive. Then (secs. 36, 41)
$$\xi = k\tanh\frac{r}{k}\cos\theta, \qquad \eta = k\tanh\frac{r}{k}\sin\theta, \qquad (1)$$
$$\xi^2 + \eta^2 = k^2\tanh^2\frac{r}{k} < k^2.$$
We may now interpret $(\xi, \eta)$ as ordinary Cartesian coordinates upon a Euclidean plane, i.e., a plane on which the Euclidean geometry is assumed to hold. Then to P on the Lobachevskian plane corresponds a point P' on the Euclidean plane and P' lies inside the circle $\xi^2 + \eta^2 = k^2$, called the fundamental circle. Conversely, let $(\xi, \eta)$ be the coordinates of any point on the Euclidean plane. Solving Eqs. (1), we have
$$\cos\theta = \frac{\xi}{\sqrt{\xi^2 + \eta^2}}, \qquad \sin\theta = \frac{\eta}{\sqrt{\xi^2 + \eta^2}}, \qquad r = \frac{k}{2}\log\frac{k + \sqrt{\xi^2 + \eta^2}}{k - \sqrt{\xi^2 + \eta^2}}.$$
Hence $\theta$ is uniquely determined and is always real and r is uniquely determined and is real, infinite, or imaginary, according as $\xi^2 + \eta^2$ is less than, equal to, or greater than, $k^2$. We have thus a relation between the Lobachevskian and Euclidean planes by which a point on the Lobachevskian plane corresponds to one and only one point in the interior of the fundamental circle on the Euclidean plane, and conversely. The points of the fundamental circle correspond to points at infinity on the Lobachevskian plane, while points outside the circle have no corresponding points on the Lobachevskian plane.
FIG. 35.

45. Consider now a straight line on the Euclidean plane (Fig. 35) with the equation
$$a\xi + b\eta + c = 0.$$
Only that portion of AB which
is within the fundamental circle will correspond to a line on the Lobachevskian plane, the points A and B corresponding to the points at infinity on the Lobachevskian plane. Hence, unless the line AB meets the fundamental conic in two real points it will have no Lobachevskian counterpart. The criterion that $a\xi + b\eta + c = 0$ should meet $\xi^2 + \eta^2 = k^2$ in two real points is that $k^2(a^2 + b^2) - c^2 > 0$. We thus find again the condition of sec. 41.

46. The distinction between intersecting, non-intersecting, and parallel lines is very clear in the representation we are considering. For if AB (Fig. 36) is any straight line on the Euclidean plane and P any point, the lines through P which intersect AB within the fundamental circle correspond to lines intersecting AB on the Lobachevskian plane, while the lines through P intersecting AB outside the circle correspond to lines on the Lobachevskian plane which do not meet AB. Between these two types of lines are the lines PA and PB which intersect AB on the fundamental circle and correspond to the Lobachevskian parallels.
FIG. 36. FIG. 37.

47. Two straight lines
$$a_1\xi + b_1\eta + c_1 = 0, \qquad (1)$$
$$a_2\xi + b_2\eta + c_2 = 0 \qquad (2)$$
on the Lobachevskian plane are perpendicular, when (sec. 41)
$$k^2(a_1a_2 + b_1b_2) - c_1c_2 = 0. \qquad (3)$$
The geometric meaning of this condition is readily given. Note first that if $P_1(\xi_1, \eta_1)$ (Fig. 37) is a point on the Euclidean plane, its polar AB with respect to the fundamental circle is
$$\xi_1\xi + \eta_1\eta - k^2 = 0.$$
This is the line $a_1\xi + b_1\eta + c_1 = 0$ if $\xi_1 = -\frac{a_1k^2}{c_1}$, $\eta_1 = -\frac{b_1k^2}{c_1}$. That is, the point $\left(-\frac{a_1k^2}{c_1}, -\frac{b_1k^2}{c_1}\right)$ is the pole of the line Eq. (1), and similarly $\left(-\frac{a_2k^2}{c_2}, -\frac{b_2k^2}{c_2}\right)$ is the pole of the line Eq. (2). The condition Eq. (3) expresses the fact that the pole of Eq. (1) is on Eq. (2) and the pole of Eq. (2) on Eq. (1).
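The pole and polar statement of sec. 47 is easy to test on numbers. The sketch below assumes k = 1 and two arbitrarily chosen lines; the helper names are hypothetical. It draws a second line through the pole of the first and confirms that the expression in Eq. (3) vanishes and that the pole of the second line falls on the first.

```python
import math

def pole(line, k):
    """Pole of the line a*xi + b*eta + c = 0 (c != 0) with respect to
    the fundamental circle xi^2 + eta^2 = k^2 (sec. 47)."""
    a, b, c = line
    return (-a * k * k / c, -b * k * k / c)

def line_through(p, q):
    """Coefficients (a, b, c) of the Euclidean line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

def u(l1, l2, k):
    """The expression k^2 (a1 a2 + b1 b2) - c1 c2 of Eq. (3), sec. 47."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    return k * k * (a1 * a2 + b1 * b2) - c1 * c2

def on_line(point, line):
    """Value of a*xi + b*eta + c at the point; zero when the point is on the line."""
    x, y = point
    a, b, c = line
    return a * x + b * y + c

k = 1.0                           # illustrative constant (assumption)
line1 = (1.0, 0.0, -0.5)          # the line xi = 0.5, which cuts the fundamental circle
line2 = line_through(pole(line1, k), (0.1, 0.3))  # through the pole of line1 and an interior point

perp = u(line1, line2, k)                                   # Eq. (3): should vanish
cos_phi = perp / math.sqrt(u(line1, line1, k) * u(line2, line2, k))  # Eq. (5), sec. 41
pole2_on_line1 = on_line(pole(line2, k), line1)             # should vanish as well
print(perp, cos_phi, pole2_on_line1)
```

Both lines meet the fundamental circle in real points, so the radicands in Eq. (5) are positive and the Lobachevskian angle between them is a right angle.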
Hence the following theorem: Two lines on the Lobachevskian plane are perpendicular when each of the corresponding lines on the Euclidean plane passes through the pole of the other. This leads to a shorter proof of the proposition of sec. 9 that two non-intersecting straight lines have a common perpendicular. For let LM and EF (Fig. 38) be two such lines. Their point of intersection P on the Euclidean plane lies outside of the fundamental circle. The polar of P passes through the circle, therefore, and corresponds to the common perpendicular to LM and EF.
FIG. 38.

48. We shall now proceed to find the meaning on the Euclidean plane of the expression for a Lobachevskian distance (sec. 41, Eq. (4)). For convenience, place
$$P_1P_2 = d, \qquad k^2 - \xi_1^2 - \eta_1^2 = f_{11}, \qquad k^2 - \xi_2^2 - \eta_2^2 = f_{22}, \qquad k^2 - \xi_1\xi_2 - \eta_1\eta_2 = f_{12}.$$
Then Eq. (4), sec. 41, becomes
$$\frac{e^{d/k} + e^{-d/k}}{2} = \frac{f_{12}}{\sqrt{f_{11}}\,\sqrt{f_{22}}},$$
whence
$$d = k\log\frac{f_{12} \pm \sqrt{f_{12}^2 - f_{11}f_{22}}}{\sqrt{f_{11}}\,\sqrt{f_{22}}} = \pm\frac{k}{2}\log\frac{f_{12} + \sqrt{f_{12}^2 - f_{11}f_{22}}}{f_{12} - \sqrt{f_{12}^2 - f_{11}f_{22}}}. \qquad (1)$$
Now on the Euclidean plane, let $P_1$ and $P_2$ (Fig. 39) be the two points $(\xi_1, \eta_1)$ and $(\xi_2, \eta_2)$ respectively, and R and Q the points in which the line $P_1P_2$ meets the fundamental circle. Let $P(\xi, \eta)$ be any point on $P_1P_2$ such that
$$\frac{P_1P}{PP_2} = \lambda,$$
where the distances are Euclidean. Then
$$\xi = \frac{\xi_1 + \lambda\xi_2}{1 + \lambda}, \qquad \eta = \frac{\eta_1 + \lambda\eta_2}{1 + \lambda}.$$
Substituting these values in the equation of the fundamental circle, $\xi^2 + \eta^2 - k^2 = 0$, we obtain $f_{22}\lambda^2 + 2f_{12}\lambda + f_{11} = 0$, and so we shall have the values of $\lambda$ corresponding to the points Q and R; namely
$$\lambda_1 = \frac{P_1Q}{QP_2} = -\frac{f_{12} + \sqrt{f_{12}^2 - f_{11}f_{22}}}{f_{22}}, \qquad \lambda_2 = \frac{P_1R}{RP_2} = -\frac{f_{12} - \sqrt{f_{12}^2 - f_{11}f_{22}}}{f_{22}}.$$
FIG. 39.
Eq. (1) then becomes
$$d = \pm\frac{k}{2}\log\frac{\lambda_1}{\lambda_2} = \pm\frac{k}{2}\log\left(\frac{P_1Q}{QP_2} : \frac{P_1R}{RP_2}\right).$$
The Lobachevskian distance between two points is $\frac{k}{2}$ times the logarithm of the anharmonic ratio of the two given points and the two points of intersection of the fundamental circle and the line through the two given points.

49.
An analogous definition may be given to the Lobachevskian measure of angle. Place, for convenience,
$$k^2(a_1^2 + b_1^2) - c_1^2 = u_{11}, \qquad k^2(a_2^2 + b_2^2) - c_2^2 = u_{22}, \qquad k^2(a_1a_2 + b_1b_2) - c_1c_2 = u_{12}.$$
Then, from Eq. (5), sec. 41, we have
$$\phi = \pm\frac{i}{2}\log\frac{u_{12} + \sqrt{u_{12}^2 - u_{11}u_{22}}}{u_{12} - \sqrt{u_{12}^2 - u_{11}u_{22}}}.$$
Now consider the two lines $AL_1$ and $AL_2$ (Fig. 40) with the equations
$$a_1\xi + b_1\eta + c_1 = 0, \qquad a_2\xi + b_2\eta + c_2 = 0.$$
Any line through their point of intersection A has the equation
$$(a_1 + \lambda a_2)\xi + (b_1 + \lambda b_2)\eta + (c_1 + \lambda c_2) = 0$$
and this line will be one of the tangent lines AR and AQ, if
$$k^2(a_1 + \lambda a_2)^2 + k^2(b_1 + \lambda b_2)^2 - (c_1 + \lambda c_2)^2 = 0,$$
i.e., if $\lambda$ has either of the values
$$\lambda_1 = -\frac{u_{12} + \sqrt{u_{12}^2 - u_{11}u_{22}}}{u_{22}}, \qquad \lambda_2 = -\frac{u_{12} - \sqrt{u_{12}^2 - u_{11}u_{22}}}{u_{22}}.$$
FIG. 40.
Hence
$$\phi = \pm\frac{i}{2}\log\frac{\lambda_1}{\lambda_2}.$$
But $\frac{\lambda_1}{\lambda_2}$ is the anharmonic ratio of the four lines $AL_1$, $AL_2$, AR, and AQ. If A lies outside of the fundamental circle, $\lambda_1$ and $\lambda_2$ are real, and $\phi$ is imaginary. If A lies on the fundamental conic, $\lambda_1 = \lambda_2$, and $\phi = 0$. If A lies inside the fundamental conic, $\lambda_1$ and $\lambda_2$ are conjugate imaginary, and $\phi$ is real. The Lobachevskian measure of angle between two lines is $\frac{i}{2}$ times the logarithm of the anharmonic ratio of the two given lines and the two tangents to the fundamental circle from the point of intersection of the two given lines.

50. The study of the circle on the Lobachevskian plane by means of its representation on the Euclidean plane leads to interesting results. We obtain the general equation of the circle by letting $(\xi_1, \eta_1)$ in Eq. (4), sec. 41, be the fixed coordinates of the centre and letting $\xi_2 = \xi$, $\eta_2 = \eta$ be the variable coordinates of any point on the circle. The equation is then of the form
$$(\xi_1\xi + \eta_1\eta - k^2)^2 = c(\xi^2 + \eta^2 - k^2), \qquad (1)$$
where c is a constant. This is the equation of a conic on the Euclidean plane. Coordinates which satisfy this equation and that of the fundamental circle $\xi^2 + \eta^2 - k^2 = 0$ satisfy also the equation $\xi_1\xi + \eta_1\eta - k^2 = 0$, which is that of the polar of $(\xi_1, \eta_1)$.
Since the polynomial $\xi_1\xi + \eta_1\eta - k^2$ appears to the second power in Eq. (1) it follows that Eq. (1) is the equation of a conic which is tangent to the fundamental circle at the points where the latter is cut by the polar of $(\xi_1, \eta_1)$. There are therefore three cases to consider according as $(\xi_1, \eta_1)$ lies outside, on, or inside the fundamental circle.

(1) When $C(\xi_1, \eta_1)$ is inside the fundamental circle, the polar AB of C does not cut the circle in real points and hence the conic (1) lies entirely in the circle (Fig. 41). This corresponds to the ordinary circle on the Lobachevskian plane.
FIG. 41.

(2) When $C(\xi_1, \eta_1)$ lies on the fundamental circle, the polar of C is the tangent to the circle at the point $(\xi_1, \eta_1)$. The conic (1) is then tangent to the circle at the same point (Fig. 42). This corresponds on the Lobachevskian plane to the curve approached by a circle as its centre recedes to infinity and its radius becomes infinite. This curve is called a limit curve or horicycle. Its revolution about one of its infinite radii generates the limit-surface mentioned in sec. 25.

(3) When $C(\xi_1, \eta_1)$ is outside the fundamental circle, the polar of C cuts the fundamental circle in two points A and B (Fig. 43). The conic (1) is therefore tangent to the fundamental conic, and corresponds on the Lobachevskian plane to a real circle with imaginary centre and radius. The straight line AB is a special case of such a circle. Draw any line CR through C, intersecting AB in Q. This represents on the Lobachevskian plane a line perpendicular to AB (sec. 47). Now in the Lobachevskian measurement CR and CQ are constant for all positions of Q on AB. Then QR is constant. That is, the locus of R has all its points equidistant from a straight line AB. This curve is sometimes called the hypercycle.
FIG. 42. FIG. 43.

51. We may make, of course, a representation of the Riemannian geometry on the Euclidean plane with coordinates $(\xi, \eta)$.
But in this case the fundamental circle has the equation $\xi^2 + \eta^2 + k^2 = 0$ and is imaginary. The geometric properties are therefore not visible to the eye.

XI. RELATION BETWEEN PROJECTIVE AND NON-EUCLIDEAN GEOMETRY

52. We have obtained in secs. 48, 49 a special case of the system of measurement first given by Cayley and recognized by Klein as leading to the non-Euclidean geometries. The general principles can now be quickly stated. Let us take, on a plane for which the Euclidean geometry holds, $x_1 : x_2 : x_3$ as homogeneous point coordinates and assume a fundamental conic with the equation
$$a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + 2a_{12}x_1x_2 + 2a_{23}x_2x_3 + 2a_{31}x_3x_1 = 0$$
or, more compactly,
$$\sum a_{ik}x_ix_k = 0, \qquad (a_{ik} = a_{ki}). \qquad (1)$$
Let the tangential equation of the same conic be
$$\sum A_{ik}\alpha_i\alpha_k = 0, \qquad (A_{ik} = A_{ki}), \qquad (2)$$
i.e., let Eq. (2) be the condition that the straight line $\alpha_1x_1 + \alpha_2x_2 + \alpha_3x_3 = 0$ should be tangent to the conic of Eq. (1). For convenience, let us place
$$f_{xx} = \sum a_{ik}x_ix_k, \qquad f_{yy} = \sum a_{ik}y_iy_k, \qquad f_{xy} = \sum a_{ik}x_iy_k,$$
and
$$U_{\alpha\alpha} = \sum A_{ik}\alpha_i\alpha_k, \qquad U_{\beta\beta} = \sum A_{ik}\beta_i\beta_k, \qquad U_{\alpha\beta} = \sum A_{ik}\alpha_i\beta_k.$$
If $P_1$ and $P_2$ are two points on the plane, and Q and R are the points in which the line $P_1P_2$ meets the fundamental conic, and $[P_1P_2QR]$ is the anharmonic ratio of the four points $P_1$, $P_2$, Q, R, then the Cayleyan projective measure of the distance $P_1P_2$ is defined by the equation
$$P_1P_2 = M\log\,[P_1P_2QR],$$
where M is a constant. Similarly, if $AL_1$ and $AL_2$ are two lines intersecting at A, and AR and AQ are the two tangents from A to the fundamental conic, and $[L_1L_2QR]$ is the anharmonic ratio of these four lines, then the Cayleyan projective measure of the angle $\phi$ between $AL_1$ and $AL_2$ is given by the equation
$$\phi = M_1\log\,[L_1L_2QR],$$
where $M_1$ is a constant. The analytic expression of these measures is found as in secs. 48, 49. If $x_1 : x_2 : x_3$ are the coordinates of $P_1$, $y_1 : y_2 : y_3$ the coordinates of $P_2$, and $\lambda_1$, $\lambda_2$ the roots of
$$f_{xx} + 2\lambda f_{xy} + \lambda^2 f_{yy} = 0,$$
then
$$P_1P_2 = M\log\frac{\lambda_1}{\lambda_2} = M\log\frac{f_{xy} + \sqrt{f_{xy}^2 - f_{xx}f_{yy}}}{f_{xy} - \sqrt{f_{xy}^2 - f_{xx}f_{yy}}} = 2M\log\frac{f_{xy} + \sqrt{f_{xy}^2 - f_{xx}f_{yy}}}{\sqrt{f_{xx}}\,\sqrt{f_{yy}}}. \qquad (3)$$
By an easy calculation, we may deduce from this
$$\frac{1}{2}\left(e^{\frac{P_1P_2}{2M}} + e^{-\frac{P_1P_2}{2M}}\right) = \frac{f_{xy}}{\sqrt{f_{xx}}\,\sqrt{f_{yy}}}. \qquad (4)$$
Also if $\alpha_1x_1 + \alpha_2x_2 + \alpha_3x_3 = 0$ is the equation of $AL_1$, $\beta_1x_1 + \beta_2x_2 + \beta_3x_3 = 0$ the equation of $AL_2$, and $\mu_1$, $\mu_2$ the roots of the equation
$$U_{\alpha\alpha} + 2\mu U_{\alpha\beta} + \mu^2 U_{\beta\beta} = 0,$$
then
$$\phi = M_1\log\frac{\mu_1}{\mu_2} = M_1\log\frac{U_{\alpha\beta} + \sqrt{U_{\alpha\beta}^2 - U_{\alpha\alpha}U_{\beta\beta}}}{U_{\alpha\beta} - \sqrt{U_{\alpha\beta}^2 - U_{\alpha\alpha}U_{\beta\beta}}} = 2M_1\log\frac{U_{\alpha\beta} + \sqrt{U_{\alpha\beta}^2 - U_{\alpha\alpha}U_{\beta\beta}}}{\sqrt{U_{\alpha\alpha}}\,\sqrt{U_{\beta\beta}}}, \qquad (5)$$
whence
$$\frac{1}{2}\left(e^{\frac{\phi}{2M_1}} + e^{-\frac{\phi}{2M_1}}\right) = \frac{U_{\alpha\beta}}{\sqrt{U_{\alpha\alpha}}\,\sqrt{U_{\beta\beta}}}. \qquad (6)$$
We have now to consider three cases according to the nature of the fundamental conic.

53. Case I. Let the fundamental conic be a real, non-degenerate conic; i.e., either an ellipse, hyperbola, or parabola. If the points $P_1$ and $P_2$ are inside the conic, $[P_1P_2QR]$ is real and positive. Hence if the distance $P_1P_2$ is real we must take M a real constant, for example, $\frac{k}{2}$. If A is inside the conic, the tangents AR and AQ are imaginary, and $\mu_1$ and $\mu_2$ are conjugate imaginary. Let $\mu_1 = \rho e^{\theta i}$, then $\mu_2 = \rho e^{-\theta i}$ and $\log\frac{\mu_1}{\mu_2} = 2\theta i$. Hence if $\phi$ is to be real we must take $M_1$ pure imaginary. When $AL_1$ coincides with $AL_2$ we have $\phi = M_1\log 1 = 0$ or $2M_1n\pi i$, where n is an integer. If then we so choose the unit of angle that the measure of a right angle shall be $\frac{\pi}{2}$, we must place $M_1 = \frac{i}{2}$. We are thus led to the same formulas as in secs. 48, 49, except that they are referred to a general conic instead of a circle. The Lobachevskian geometry is easily built up on this foundation.

54. Case II. Let the fundamental conic be imaginary, i.e., let there be no real values of $x_1 : x_2 : x_3$ satisfying Eq. (1), sec. 52. Then $\lambda_1$ and $\lambda_2$ are conjugate imaginary, as are also $\mu_1$ and $\mu_2$. Hence to obtain real distance and angle we must take M and $M_1$ pure imaginary. As in sec. 53, we place $M_1 = \frac{i}{2}$ and will place $M = \frac{ik}{2}$. We have then from Eqs. (4) and (6), sec. 52,
$$\cos\frac{P_1P_2}{k} = \frac{f_{xy}}{\sqrt{f_{xx}}\,\sqrt{f_{yy}}}, \qquad \cos\phi = \frac{U_{\alpha\beta}}{\sqrt{U_{\alpha\alpha}}\,\sqrt{U_{\beta\beta}}},$$
which are analogous to those of the Riemannian geometry (sec. 40). Since $\cos\frac{P_1P_2}{k}$ is never infinite, all straight lines are finite in length.
Two straight lines always intersect, since two linear equations have always a solution, which cannot represent a point at infinity. The Riemannian geometry is easily built up from these foundations.

55. Case III. Let the fundamental conic degenerate. This may happen in two ways: either the point Eq. (1), sec. 52, may represent two straight lines; or the tangential Eq. (2), sec. 52, may represent two points. The latter is the most interesting case, especially when the tangential equation becomes
$$\alpha_1^2 + \alpha_2^2 = 0, \qquad (1)$$
which is satisfied by the coefficients of all straight lines which pass through one of the two points $x_1 : x_2 : x_3 = 1 : \pm i : 0$. If $x_3 = 0$ represents the line at infinity, these points are the circular points at infinity. Through each point of the plane go two straight lines satisfying Eq. (1), namely the minimum lines. The formula for angle is readily found. In fact, we have at once from Eq. (6), sec. 52, with $M_1 = \frac{i}{2}$,
$$\cos\phi = \frac{\alpha_1\beta_1 + \alpha_2\beta_2}{\sqrt{\alpha_1^2 + \alpha_2^2}\,\sqrt{\beta_1^2 + \beta_2^2}}.$$
But this is the Euclidean formula for the angle between the two lines
$$\alpha_1x + \alpha_2y + \alpha_3 = 0, \qquad \beta_1x + \beta_2y + \beta_3 = 0,$$
where we have placed $x = \frac{x_1}{x_3}$, $y = \frac{x_2}{x_3}$. Hence: The Euclidean angle between two lines is equal to $\frac{i}{2}$ times the logarithm of the anharmonic ratio of the two lines and the two minimum lines through their point of intersection.

To obtain the Euclidean formula for distance from the general formula, sec. 52, is not so simple a matter, but it may be done as follows: Let us take in place of Eq. (1) the equation
$$\alpha_1^2 + \alpha_2^2 + \epsilon\alpha_3^2 = 0, \qquad (2)$$
which goes over into Eq. (1) when $\epsilon = 0$. The corresponding point equation is
$$\epsilon(x_1^2 + x_2^2) + x_3^2 = 0. \qquad (3)$$
From Eq. (4), sec. 52, we have, if we place $M = \frac{k}{2}$,
$$\cosh\frac{P_1P_2}{k} = \frac{\epsilon(x_1y_1 + x_2y_2) + x_3y_3}{\sqrt{\epsilon(x_1^2 + x_2^2) + x_3^2}\,\sqrt{\epsilon(y_1^2 + y_2^2) + y_3^2}}.$$
We wish to show that this approaches as a limit the Euclidean formula as $\epsilon \to 0$ and $k \to \infty$. For that purpose, replace $\cosh\frac{P_1P_2}{k}$ by its approximate value $1 + \frac{1}{2}\left(\frac{P_1P_2}{k}\right)^2$ and calculate $P_1P_2$.
There results
$$P_1P_2 = ik\sqrt{\epsilon}\,\frac{\sqrt{(x_1y_3 - x_3y_1)^2 + (x_2y_3 - x_3y_2)^2 + \epsilon(x_1y_2 - x_2y_1)^2}}{\sqrt{\epsilon(x_1^2 + x_2^2) + x_3^2}\,\sqrt{\epsilon(y_1^2 + y_2^2) + y_3^2}}.$$
Now let $\epsilon \to 0$ and $k \to \infty$, in such a way that $ik\sqrt{\epsilon} = 1$. We have in the limit
$$P_1P_2 = \frac{\sqrt{(x_1y_3 - x_3y_1)^2 + (x_2y_3 - x_3y_2)^2}}{\sqrt{x_3^2}\,\sqrt{y_3^2}}.$$
Finally, let us employ non-homogeneous coordinates by placing
$$x = \frac{x_1}{x_3}, \qquad y = \frac{x_2}{x_3}, \qquad x' = \frac{y_1}{y_3}, \qquad y' = \frac{y_2}{y_3}.$$
We have, then, the usual Cartesian formula
$$P_1P_2 = \sqrt{(x - x')^2 + (y - y')^2}.$$
Hence: The Euclidean measure of distance is a limiting case of the Cayleyan projective measurement.

XII. THE ELEMENT OF ARC

56. We have found (secs. 40, 41) that in both the Riemannian and the Lobachevskian geometries the element of arc, ds, is the square root of a homogeneous quadratic function of the differentials of the coordinates which we have used. This is also true of the Euclidean geometry, where in rectangular Cartesian coordinates
$$ds = \sqrt{dx^2 + dy^2}.$$
Conversely, following the method first employed by Riemann, we may ask if these are all the types of geometries in which the element of arc is thus expressed. More precisely, the problem is as follows: Let it be assumed that a point on the plane may be determined by means of two coordinates $x_1$ and $x_2$, and that the distance between two infinitely near points $(x_1, x_2)$ and $(x_1 + dx_1, x_2 + dx_2)$ may be given by an equation of the form
$$ds = \sqrt{a_{11}\,dx_1^2 + 2a_{12}\,dx_1\,dx_2 + a_{22}\,dx_2^2},$$
where $a_{11}$, $a_{12}$, $a_{22}$ are functions of $x_1$ and $x_2$. It is required to discuss the geometry which results. An adequate discussion of this question would be altogether too long for this place.* We shall simply say that the straight line is then defined as the shortest distance between two points, its equation being the relation between $x_1$ and $x_2$ which makes the integral
$$s = \int \sqrt{a_{11}\,dx_1^2 + 2a_{12}\,dx_1\,dx_2 + a_{22}\,dx_2^2},$$
taken between constant limits, a minimum.

* Consult Woods, "Space of Constant Curvature," Annals of Mathematics, 2d series, Vol. VIII, 1901-2; Coolidge, The Elements of Non-Euclidean Geometry, Chapter XIX.
It results finally that it is possible to replace $(x_1, x_2)$ by polar coordinates $(r, \theta)$, whereby ds takes one of the forms
$$ds = \sqrt{dr^2 + r^2\,d\theta^2}, \qquad ds = \sqrt{dr^2 + k^2\sin^2\frac{r}{k}\,d\theta^2}, \qquad ds = \sqrt{dr^2 + k^2\sinh^2\frac{r}{k}\,d\theta^2},$$
where k is a constant. But these are the three forms which belong to the Euclidean, the Riemannian, and the Lobachevskian geometries respectively. We have thus the interesting result that no new type of geometry results from the new point of view. This statement, however, requires one modification. The present discussion, since it starts with the infinitely small and proceeds by the methods of the calculus, has to do only with a restricted portion of the plane. No hypothesis is made as to the behavior of straight lines when indefinitely extended, such as enters into the parallel postulates. A geometry, in fact, which agrees with the Euclidean, Riemannian, or Lobachevskian geometry respectively, in a restricted portion of the plane, may present new features when the total extent of the plane is considered. Into this subject we cannot go.*

* Consult Woods, "Forms of Non-Euclidean Space," in Boston Colloquium Lectures on Mathematics, New York, 1905.

IV
THE FUNDAMENTAL PROPOSITIONS OF ALGEBRA
By EDWARD V. HUNTINGTON

CONTENTS

I. INTRODUCTION. Sections 1-4.

II. THE ADDITION OF ANGLES AND THE MULTIPLICATION OF DISTANCES. Sections 5-12.
5, 6, 7, The addition of angles; 8, First step toward the science of this operation: selection of axioms; 9, 10, 11, The multiplication of distances; 12, First step toward the science of this operation: selection of axioms.

III. THE ABSTRACT THEORY OF THESE OPERATIONS. Sections 13-20.
13, Parallelism between these two operations; 14, Postulates for an abstract science to include them both; 15, "Consistency" of the postulates; 16, On the uses of an abstract science; 17, Examples of systems that do not satisfy the postulates; 18, "Independence" of the postulates; 19, "Sufficiency" of a set of postulates to determine a single type of system; 20, Note on the terms axiom and postulate.

IV. GEOMETRIC EXAMPLE OF THE ALGEBRA OF COMPLEX QUANTITIES: THE SYSTEM OF POINTS IN THE PLANE. Sections 21-29.
21, Points in a plane; "real" and "imaginary" points; 22, 23, Addition of points in the plane; 24, 25, Multiplication of points in the plane; 26, Solution of algebraic equations; 27, The relation of order among the real points; 28, Classification of the real points: integral, fractional, rational, irrational; 29, First step toward the science of this algebra: selection of axioms.

V. THE ABSTRACT THEORY OF THE ALGEBRA OF COMPLEX QUANTITIES. Sections 30-38.
30, A complete set of postulates for the algebra of complex quantities; 31, Consistency of the postulates; 32, 33, Sufficiency of the postulates. Examples of isomorphic systems; 34, 35, Independence of the postulates. Examples of systems that satisfy all but one of them; 36, What is algebra? 37, A complete set of postulates for the sub-algebra of all real quantities; 38, On the value of complex algebra in problems concerning real quantities.

APPENDIX
I. OTHER EXAMPLES OF THE ALGEBRA OF COMPLEX QUANTITIES. Sections 39-42.
39-41, Arithmetical systems of Dedekind and Cantor. 42, Comments on these arithmetical systems.
II. GEOMETRIC PROOF THAT EVERY ALGEBRAIC EQUATION HAS A ROOT. Sections 43-45.

I. INTRODUCTION

1. Purpose of the article.
The main object of this article is to present, in as simple a form as possible, the results of some of the modern inquiries into the logical foundations of algebra; but the article is so arranged that readers who desire merely to increase their store of information about algebraic facts, without going into the discussion of logical foundations, may find, in Part IV, a systematic introduction to the algebra of complex quantities, which may be read independently of the rest of the article.

There has been much discussion of late years over the place which logical rigor should occupy in the teaching of elementary mathematics. Some have contended that the power to understand a logically rigorous demonstration is itself the most important result to be aimed at in mathematical study. Others have attached greater importance to the use of mathematics as a practical art, and have felt that too much insistence on logical rigor serves only to deaden the pupil's interest, and thus to destroy all the value the study might have, either as a practical art or as a training in logic. It is not the purpose of the present article to discuss these pedagogical questions. It is intended merely to put before the reader a clear statement, in some detail, of what is actually involved in a strictly logical treatment of algebra, leaving to the teachers themselves the question as to how far logical rigor can be pressed in the classroom.*

2. The science of algebra vs. the science of geometry. It is a curious fact that the one striking example of rigorous mathematical reasoning with which everyone is familiar is taken from geometry rather than from algebra. Euclid's Elements have stood for 2000 years as the supreme illustration of the mathematical manner of reasoning.
Axiom, theorem; hypothesis, conclusion; proposition, demonstration, corollary; the defence of every statement by reference to a previously established truth: all the apparatus and method of mathematical reasoning call up at once in our minds a text-book in geometry, never a text-book in algebra. Even the external form of our books contributes to this result. The current treatises on algebra are not divided into Book I, Book II, etc., as are those in geometry; their theorems are not numbered in consecutive order; little distinction is made between explanation and proof; nothing is done to suggest the strict logical sequence of propositions which is so constantly emphasized in every book on geometry. Until recent years, elementary algebra has been largely a miscellaneous collection of rules for the manipulation of algebraic expressions, and is not at all the developed science that elementary geometry has long since become. In fact, if it were not for the study of plane geometry in our schools, it is doubtful whether our school children would ever derive, from their study of algebra alone, any clear notion of what is meant by a mathematical demonstration.
This fact is the more remarkable, because, on account of the simpler nature of the concepts with which it deals, algebra is better suited than geometry to serve as an illustration of what is essentially involved in mathematical reasoning. In geometry, the very concreteness and familiarity of the subject-matter is apt to obscure the logical structure of the science, while in algebra, the more abstract character of the content of the theorems makes it easier to fix the attention on their formal logical relations.
* Reference may here be made to a forthcoming book by John Wesley Young, entitled "Lectures on Fundamental Concepts of Algebra and Geometry," 1911.
The present article is intended as an introduction to the science of algebra as distinguished from the art of manipulating algebraic expressions. In what proportions the science and the art should be mingled in practical teaching is a question with which this article, as already stated, does not propose to deal.
3. The various types of algebra. Irrational and imaginary quantities. It should be mentioned at once that there is, strictly speaking, no one science of algebra, but rather a collection of closely related sciences, all of which are commonly grouped together under the general name of algebra. For example, we have the algebra of positive integers; the algebra of all integers (positive, negative, and zero); the algebra of positive rationals; the algebra of all rationals; the algebra of all real quantities (rational and irrational, positive, negative, and zero); and, finally, the algebra which in a certain sense includes all these others, and is in many respects simpler than any of them, the algebra of complex quantities. In these various algebras, many theorems are, in form, identical; but many other theorems are true in one algebra and false in another. For example, the theorem, If $a^3 = b^3$, then $a = b$, is true in the algebra of real quantities, and not true in the algebra of complex quantities. Again, the theorem that "Every quantity has at least one cube root," is true in the algebra of all complex or all real quantities, but is false in the algebra of rational quantities or the algebra of integers. The distinction between the various types of algebra is directly connected with the problem of the so-called "irrational" and "imaginary" quantities.
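Before that problem is taken up, the two theorems just cited may be checked numerically. The following Python sketch is an illustration added here, not part of the original article; the names `w`, `same_cube`, `distinct`, and `integer_cube_root_exists` are chosen only for this example, and the search range for integers is an arbitrary assumption.

```python
import cmath

# A complex cube root of unity other than 1: w**3 equals 1 up to rounding.
w = cmath.exp(2j * cmath.pi / 3)

# Two distinct complex quantities with the same cube, so that
# "if a^3 = b^3 then a = b" fails in the algebra of complex quantities.
a, b = 1 + 0j, w
same_cube = abs(a**3 - b**3) < 1e-12
distinct = abs(a - b) > 1e-12

# In the algebra of integers, 2 has no cube root: no integer y in a small
# search range satisfies y**3 == 2 (larger |y| only make |y**3| larger).
integer_cube_root_exists = any(y**3 == 2 for y in range(-10, 11))
```

The floating-point tolerance stands in for exact equality; an exact treatment would require symbolic arithmetic, which this sketch does not attempt.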
Much of the difficulty which perplexes every thoughtful student at the time when irrational and imaginary quantities are first introduced, is due to the failure to recognize the fact that he is really leaving one system of algebra, and passing to another and different system, and that the theorems established in the first system cannot be expected (without further proof) to hold in the second. It is small wonder that a boy is confused and perplexed when he is told on one page that "the square of every number is positive, and hence $\sqrt{-1}$ cannot exist," and on the next page that "the $\sqrt{-1}$ really is a number, and obeys all the laws of algebra." The fact is, of course, that the $\sqrt{-1}$ occurs only in the algebra of complex quantities, a quite different algebra from the algebra of real quantities which the boy has so far studied; and it is simply not true to state that a quantity which belongs in one of these algebras obeys all the laws which are valid in the other.
Again, the pupil is often told that we "must" introduce the number $\sqrt{-1}$ "because" the equation $x^2 = -1$ must needs, in the nature of things, have a root. But why do we not say, with equal reason, that we "must" introduce the number infinity, "because" the equation $5/x = 0$ must needs have a root? If we say that $\sqrt{-1}$ is "a number that obeys all laws of algebra," why do we not say that $\infty$, the existence of which may be claimed on the ground of precisely similar necessity, is also a "number that obeys all the laws of algebra"? Inconsistencies like this, while they do not trouble the average pupil, do present serious perplexities to those who are more critically inclined. It is not clear why a "must" that is so imperative in one case should be so ignored in a precisely similar case.
The fact is, of course, that the alleged necessity carries no compulsion with it in either case; it is merely the expression of a desire for a simpler algebra, in which every equation shall have a root; the fact that the algebra of complex quantities comes nearer than any of the other algebras to fulfilling this desire is a matter for observation, not a consequence of logical necessity. And yet what pupil in our high schools has ever had a concrete example of complex algebra presented to him upon which he could make this observation?*
In regard to this whole problem of the introduction of irrational and complex quantities into elementary algebra, the method of successive "extension of the number-concept," which was historically the method by which these quantities were discovered, seems to be of very questionable value as a method of instruction at the present day. The very terms that have come down to us, surd (meaning "absurd"), irrational, imaginary, show the doubts about the legitimacy of these new quantities which were occasioned by this method of introducing them. In the light of the modern science of algebra, these doubts simply do not occur; the whole point of view in regard to algebraic quantities has changed; the old terminology itself is retained only out of respect for the past.
* Compare the trenchant remarks on this subject by C. F. Gauss in his famous Doctor's Dissertation, 1799. Reprinted in Ostwald's Klassiker der exacten Wissenschaften, under the title: Beweise für die Zerlegung ganzer algebraischer Functionen in reelle Factoren ersten und zweiten Grades.
After clear ideas have once been reached on these subjects, one is forced to raise the question whether it is necessary to perplex all our pupils of to-day with the same vagueness and obscurity through which the earlier pioneers had to struggle.
Is it necessary to turn out hundreds of pupils, as we do, from our courses in algebra, with the conviction hopelessly fixed in their minds that some of the things with which algebra deals are, in all truth, "absurd" and "imaginary"? In the opinion of the present writer, if the irrational and imaginary quantities are to be introduced into elementary work at all, the method which is most satisfactory from the strictly scientific point of view, is also by far the simplest and most satisfactory from the point of view of the elementary student. This opinion it is hoped will be borne out by the sequel. (See especially secs. 36, 38, 39, 42, and remarks in secs. 26, 28, 29 and 30.)
4. Plan of the article. The space at our disposal does not permit the separate development of the several types of algebra, in the order in which, beginning with the algebra of the positive integers, these types would naturally be presented to the pupil. We shall confine ourselves chiefly to the algebra of complex quantities, which is the most inclusive and the most interesting type. In Part IV a geometrical example of this type of algebra is given (without the use of trigonometry) and in Part V the abstract theory of the algebra is developed, that is, the precise conditions are laid down which any system must satisfy in order to be equivalent to the algebra in question (sec. 30). In sec. 35 several examples of "pseudo-algebras" are given, that is, systems that satisfy most, but not all, of the conditions of sec. 30; for it is only by a study of what the algebra is not that we can fully understand what it is.
Parts II and III are preliminary to the main discussion. In Part II a number of geometrical facts are observed, of which use will be made in Part IV. Part III shows how this collection of geometrical facts can be reduced to an abstract science, and serves to illustrate, in this very simple case, all the steps of the reasoning which will be used in the general case in Part V.
The chief points in the article which may be unfamiliar to many readers are the following: The analysis of the fundamental concepts which occur in algebra; the notion of the "equivalence," or "isomorphism," of two algebraic systems with respect to these fundamental concepts; the notion of the "sufficiency" of a selected set of fundamental propositions to determine uniquely a particular type of algebra; and the method of establishing the "consistency" and "independence" of the propositions of such a set.

II. THE ADDITION OF ANGLES AND THE MULTIPLICATION OF DISTANCES
5. The addition of angles. We begin with a preliminary discussion of the very simple and familiar process of the addition of angles. By an angle, as in all higher mathematics, we mean an amount of rotation of a line about a fixed point $O$, in a plane. Such a rotation may be counter-clockwise or clockwise, and of any amount; as, +250°, -780°, etc. To clarify our ideas about rotations of more than 360°, it will be well to adopt Riemann's famous device, and think of the plane about the point $O$ as made up of numerous distinct sheets, joined together after the fashion of a spiral staircase; a moving radius rotating about the point $O$ winds around from one sheet to the next as if it were following the thread of a screw. Two angles like 360° and 720° are thus kept distinct; for although the terminal lines of these angles point in the same direction, they lie in different sheets of the Riemann surface.
If two angles $\alpha$ and $\beta$ are given, a third angle $\gamma$ may be derived from them by the following familiar process: starting with a given initial line as the zero angle, perform the rotation indicated by $\alpha$; then continuing from the terminal line of $\alpha$, perform a rotation equal in amount and direction to $\beta$; the final position thus reached is the terminal line of the required angle $\gamma$. This angle $\gamma$ is called the sum of the given angles $\alpha$ and $\beta$ (with respect to the chosen zero) and is denoted by $\alpha + \beta$.
6.
Concerning the addition of angles, as thus defined, the reader may easily verify the following familiar statements:
(a) If $\alpha$ and $\beta$ are any two angles (whether equal or unequal), then their sum, $\alpha + \beta$, is an angle uniquely determined by $\alpha$ and $\beta$ (with respect to the chosen zero-angle).
(b) $\alpha + \beta = \beta + \alpha$. (Commutative law for addition.)
(c) If $\alpha$, $\beta$, $\gamma$ are any three angles, $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$. (Associative law for addition.)
(d) If $\alpha + \beta = \alpha + \beta'$, then $\beta = \beta'$. ("Law of cancellation" for addition.)
If we introduce, for abbreviation, the notation $2\alpha = \alpha + \alpha$, $3\alpha = \alpha + \alpha + \alpha$, ..., $n\alpha = \alpha + \alpha + \cdots + \alpha$ to $n$ terms, where $n$ is any positive integer, we have further:
(e) If $n\alpha = n\beta$, then $\alpha = \beta$. The angle $n\alpha$ is called the $n$th multiple of the angle $\alpha$.
Three other facts of somewhat different character ("existence theorems") are the following:
(f) There is one and only one angle $x$ such that $x + x = x$; this angle $x$ is the zero angle, and is denoted by $0$.
(g) Every angle $\alpha$ determines uniquely an angle $\alpha'$ such that $\alpha + \alpha' = 0$. This angle $\alpha'$ is called the opposite of $\alpha$ and is denoted by $-\alpha$.
(h) For every angle $\alpha$ and every positive integer $n$, there is an angle $y$, uniquely determined by $\alpha$ and $n$, such that $ny = \alpha$. This angle $y$ is called the $n$th submultiple of $\alpha$, denoted by $\alpha/n$. For example, we have $\alpha/2$, $\alpha/3$, etc.
7. Among the many further facts which might be mentioned, the following are the most important for our present purpose:
(i) If $0$ is the zero angle, then for every angle $\alpha$, $\alpha + 0 = \alpha$.
(j) If $\alpha$ and $\beta$ are any given angles, there is always an angle $x$, uniquely determined by $\alpha$ and $\beta$, such that $\alpha = \beta + x$; this angle $x$ is called the remainder, $\alpha$ minus $\beta$, and is denoted by $\alpha - \beta$, and the process by which it is obtained is called subtraction. The angle $\alpha - \beta$ is the same as the angle $\alpha + (-\beta)$. Hence, to subtract an angle $\beta$ means to add the opposite of $\beta$.
(k) If $m$ and $n$ are any positive integers, the angle $m(\alpha/n)$ is equal to the angle $(m\alpha)/n$, so that either may be denoted by $(m/n)\alpha$.
All these statements, (a)-(k), may be regarded as the direct result of observation. There is no necessary logical order among them; any one may be obtained without reference to the others directly from the figure, as the reader may readily verify.
8. First step toward the science of this process. Selection of axioms. Now this miscellaneous collection of facts about angles does not constitute a science. In order to reduce it to a science, the first step is to do what Euclid did in geometry, namely, to select a small number of the given facts as axioms, and then to show that all other facts can be deduced from these axioms by the methods of formal logic.
As a convenient choice of axioms for the science of the addition of angles, we may take the propositions (a)-(h) in sec. 6; from these axioms the other propositions, (i), (j), (k), etc., can be deduced as theorems, without further reference to the definition.
For example, the proof of theorem (i) is as follows: By (f), $0 + 0 = 0$; hence, by (a), $\alpha + (0 + 0) = \alpha + 0$; whence, by (c), $(\alpha + 0) + 0 = \alpha + 0$; whence, by (b), $0 + (\alpha + 0) = 0 + \alpha$; therefore, by (d), $\alpha + 0 = \alpha$, which was to be proved.
Similarly, the proof for (j) is as follows: By (f) and (g), there is an angle $-\beta$ such that $\beta + (-\beta) = 0$. Let $x = \alpha + (-\beta)$, which is known to be an angle, by (a). Then, by the use of (b), (c), (g), and (b), and theorem (i), $\beta + x = \beta + [\alpha + (-\beta)] = \beta + [(-\beta) + \alpha] = [\beta + (-\beta)] + \alpha = 0 + \alpha = \alpha + 0 = \alpha$; that is, the angle $\alpha + (-\beta)$ is an angle which, when added to $\beta$, produces $\alpha$, as was to be proved. That this angle is uniquely determined by $\alpha$ and $\beta$ follows at once from (d).
The proof of (k) may be illustrated by a numerical case.
Let $x = 3(\alpha/2)$ and $y = (3\alpha)/2$; then by (a) and (c), $2x = [3(\alpha/2)] + [3(\alpha/2)] = [\alpha/2 + \alpha/2 + \alpha/2] + [\alpha/2 + \alpha/2 + \alpha/2] = [\alpha/2 + \alpha/2] + [\alpha/2 + \alpha/2] + [\alpha/2 + \alpha/2] = [2(\alpha/2)] + [2(\alpha/2)] + [2(\alpha/2)] = 3[2(\alpha/2)] = 3[\alpha]$, by the definition of $\alpha/2$. But also, $2y = 2[(3\alpha)/2] = (3\alpha)$, by the same definition. Therefore, $2x = 2y$, whence $x = y$, by (e). The general case for $m$ and $n$ is proved in a similar way.
It must not be supposed that proofs like these, in which every step is carefully justified by reference to one or other of the axioms, are necessary to convince us that the statements in question are true; indeed, in this particular case, the theorems proved are quite as obvious as the axioms on which the proof is based; all of them may be obtained independently, by direct observation of the figure. The fact is that a mathematical demonstration, strictly speaking, is not concerned with the truth of the proposition at all; it is concerned merely with the logical relation that exists between the given proposition and certain other propositions called the axioms; in other words, all that a mathematical demonstration tells us is that if the axioms are true, then the theorem in question will also be true, provided, of course, that our deductive reasoning is sound.
Provided that our deductive reasoning is sound: there is the difficulty. How can we be sure that each step of the deduction is logically justified? How can we be sure that no assumption is tacitly used in the proof which was not explicitly stated in the axioms? Even Euclid did not escape this danger; he often used, for example, assumptions about the motion of a rigid body which he did not include in his axioms. In fact, it is only in recent years that a really complete list of axioms for geometry has been laid down.* How can we be sure that similar errors will not creep into our reasoning in algebra?
The answer to this question involves a further refinement of the scientific method, which will be discussed in Part III.
* See Monograph I.
9. The multiplication of distances. The system studied in the preceding sections on the addition of angles is an example of the type of algebra called the "algebra of all real quantities" as far as the operation of addition is concerned. We now consider a second operation, to be called multiplication, this operation being performed not on angles, but on geometric lengths, or distances.
Suppose two distances $a$ and $b$ are given; and then, having chosen a given distance $u$ as a "unit distance," find a distance $x$ by the construction shown in Figure 1, in which $b$ is at right angles to $u$ and $a$, and the oblique lines are parallel. This distance $x$ is called the product of the given distances $a$ and $b$ (with respect to the chosen unit) and is denoted by $a \times b$, or $a \cdot b$, or simply $ab$. The process by which this product is obtained is called multiplication.
From this definition it follows that if $x = a \times b$, the area of the rectangle whose sides are $x$ and $u$ is equal to the area of the rectangle whose sides are $a$ and $b$. To see that the two rectangles, OCDU and OBEA, are equivalent, note that the part OBQU is common to both; further, the lines PQ, QR, CS, and TA are all equal to BU (being portions of parallels intercepted between parallels), so that the triangles BPQ and DCS in one rectangle are equal to the triangles UQR and ETA in the other; and, finally, the parallelograms CSQP in one rectangle and QTAR in the other are equivalent (having equal bases PQ and QR and equal altitudes).*
[FIG. 1. FIG. 2.]
10. Concerning the multiplication of distances, as thus defined, the reader may readily verify the following statements:
(a) If $a$ and $b$ are any two distances (whether equal or unequal) then their product, $a \times b$, is a distance uniquely determined by $a$ and $b$ (with respect to the chosen unit distance).
* It will be noticed that this proof does not assume the theorem that the area of a rectangle is equal to its base times its altitude, nor any theorems on ratio and proportion.
(b) $a \times b = b \times a$. (Commutative law for multiplication.)
(c) If $a$, $b$, $c$ are any three distances, then $(a \times b) \times c = a \times (b \times c)$. (Associative law for multiplication.) To see that this is true, let $a \times b = x$ and $b \times c = y$, and then $x \times c = z$ and $a \times y = z'$, so that we have $(a \times b) \times c = x \times c = z$ and $a \times (b \times c) = a \times y = z'$. Then show that the parallelopiped whose edges are $z$, $u$, $u$ and the parallelopiped whose edges are $z'$, $u$, $u$ both have the same volume as the parallelopiped whose edges are $a$, $b$, $c$. Therefore $z = z'$.
[FIG. 3.]
(d) If $a \times b = a \times b'$, then $b = b'$. ("Law of cancellation" for multiplication.)
If we introduce, for abbreviation, the notation $a^2 = a \times a$, $a^3 = a \times a \times a$, ..., $a^n = a \times a \times \cdots \times a$ to $n$ factors, where $n$ is any positive integer, we have further (see Figure 4):
(e) If $a^n = b^n$, then $a = b$. The distance $a^n$ is called the $n$th power of the distance $a$.
Three other facts of somewhat different character ("existence theorems") are the following:
(f) There is one and only one distance $x$ such that $x \times x = x$. This distance $x$ is the unit distance, and is denoted by $1$.
(g) Every distance $a$ determines uniquely a distance $a'$ such that $a \times a' = 1$. This distance $a'$ is called the reciprocal of $a$ and is denoted by $a^{-1}$ or $1/a$. For example, if $a$ is five times $1$ then $a^{-1}$ is one-fifth of $1$, etc.
(h) For every distance $a$ and every positive integer $n$, there is a distance $y$, uniquely determined by $a$ and $n$, such that $y^n = a$. This distance $y$ is called the $n$th root of $a$, denoted by $a^{1/n}$ or $\sqrt[n]{a}$. For example, if $a$ is a length nine times as long as $1$, then $\sqrt{a}$, or $a^{1/2}$, will be a length three times as long as $1$, etc.
[FIG. 4.]
11. Many other facts about the multiplication of distances
might be mentioned, of which the following will suffice for our present purpose:
(i) If $1$ is the unit distance, then for every distance $a$, $a \times 1 = a$.
(j) If $a$ and $b$ are any given distances, there is always a distance $x$, uniquely determined by $a$ and $b$, such that $a = b \times x$; this distance $x$ is called the quotient, $a$ divided by $b$, and is denoted by $a/b$; and the process by which it is obtained is called division. For example, if $a = 10(1)$ and $b = 5(1)$, then $a/b = 2(1)$; etc. The distance $a/b$ is the same as the distance $a \times (b^{-1})$. Hence, to divide by the distance $b$ means to multiply by the reciprocal of $b$.
(k) If $m$ and $n$ are any positive integers, the distance $(a^{1/n})^m$ is equal to the distance $(a^m)^{1/n}$, so that either may be denoted by $a^{m/n}$.
All these statements, (a)-(k), about the multiplication of distances, like the statements (a)-(k) in secs. 6-7 about the addition of angles, may be regarded as the direct result of observation, any one of them being obtainable immediately, without reference to the others.
12. First step toward the science of this process. In order to reduce this miscellaneous collection of facts to a science, we may take as the axioms of the science the propositions (a)-(h) in sec. 10, and proceed exactly as in sec. 8; the steps of the reasoning are precisely parallel, and need not be repeated here.
The system here studied is an example of the type of algebra called "the algebra of positive reals," as far as the operation of multiplication is concerned. We now turn to the problem (already referred to in sec. 8) of how to make more rigorous the science of these two systems.

III. THE ABSTRACT THEORY OF THESE OPERATIONS
13. Parallelism between these two operations. The parallelism between the two systems just described is too striking to have escaped attention. The propositions (a)-(h) in sec. 6 are, as far as their form is concerned, identical with the propositions (a)-(h) in sec. 10.
The meaning and content of the two sets of propositions are of course very different; the first set concerns the addition of angles, while the second set concerns the multiplication of distances; but their form is the same, since all the propositions of the second set can be obtained at once from those of the first by replacing "angle" by "distance," "sum" by "product," "zero" by "unit," "opposite" by "reciprocal," "subtraction" by "division," etc.
This duality between the two sets of propositions will of course extend through all the propositions that are deducible from them by the methods of formal logic; from every proposition concerning the addition (or subtraction) of angles, a corresponding proposition concerning the multiplication (or division) of distances can at once be obtained by merely changing the interpretation of the symbols, without changing the form of the statement.
14. Postulates for an abstract science to include them both. This duality at once suggests the possibility of developing a general theory which shall include both these theories as special cases. To do this, we proceed as follows: Consider a general class of things or "elements" denoted by $A$, $B$, $C$, etc., without specifying whether these things are to be angles ($\alpha$, $\beta$, $\gamma$, etc.) or distances ($a$, $b$, $c$, etc.), and a general rule of combination denoted by $\circ$, without specifying whether this rule of combination is to be addition ($+$) or multiplication ($\times$);* and impose upon these symbols the following conditions:
(a) If $A$ and $B$ are elements of the class, then $A \circ B$ (read: "A with B") is an element of the class, uniquely determined by $A$ and $B$.
(b) $A \circ B = B \circ A$.† (Commutative law.)
(c) $(A \circ B) \circ C = A \circ (B \circ C)$. (Associative law.)
(d) If $A \circ B = A \circ B'$, then $B = B'$. ("Law of cancellation.")
(e) If $A^{[n]} = B^{[n]}$, then $A = B$. Here $A^{[n]}$ means $A \circ A \circ \cdots \circ A$, to $n$ elements, where $n$ is a positive integer.
(f) There is an element $X$ such that $X \circ X = X$.
[It can be shown from the preceding conditions that there cannot be more than one such element. For, suppose there were two such elements, $X$ and $Y$, such that $X \circ X = X$ and $Y \circ Y = Y$; then, by (a), $(X \circ X) \circ Y = X \circ (Y \circ Y)$, whence, by (c), $X \circ (X \circ Y) = X \circ (Y \circ Y)$, whence, by (d), $X \circ Y = Y \circ Y$; therefore, by (b), $Y \circ X = Y \circ Y$, whence, by (d), $X = Y$.]
* A system composed of a class $K$ and a rule of combination $\circ$ we shall speak of as a "system $(K, \circ)$."
† The equality sign, $=$, is used to indicate that the two expressions between which it stands are interchangeable in any proposition of the theory. If desired, the laws of operation with this symbol may be formally stated as follows: (1) $A = A$; (2) if $A = B$ then $B = A$; (3) if $A = B$ and $B = C$, then $A = C$.
(g) If $X$ is the unique element such that $X \circ X = X$, then for every element $A$ there is an element $A'$ such that $A \circ A' = X$. [It follows from (d) that this element $A'$ is uniquely determined by $A$.]
(h) For every element $A$ and every positive integer $n$, there is an element $Y$ such that $Y^{[n]} = A$, where $Y^{[n]}$ means $Y \circ Y \circ \cdots \circ Y$ to $n$ elements. [It follows from (e) that this element $Y$ is uniquely determined by $A$ and $n$.]
15. Consistency of the postulates. From these eight conditions, or "postulates," as we shall call them, a long list of theorems can be deduced; for example:
(i) If $X$ is the unique element such that $X \circ X = X$, then for every element $A$, $A \circ X = A$;
moreover, any system which satisfies all these conditions (a)-(h) will satisfy also all the theorems derived therefrom.
But the first question to be asked about such a set of conditions or "postulates," is this: Are they consistent demands? In other words, does any system exist which satisfies all the conditions?
In this case the answer is, of course, affirmative: for, if the class $A$, $B$, $C$, ... is the class of angles, and the rule of combination $\circ$ is the rule of addition, then all the conditions are satisfied, as we saw in sec.
6; the elements $X$, $A'$, and $Y$, whose existence is demanded in (f), (g), and (h), are the "zero angle," the "opposite of $A$," and the "$n$th submultiple of $A$," of that system. Again, if the class $A$, $B$, $C$, ... is the class of distances, and the rule of combination $\circ$ is the rule of multiplication, then also all the conditions are satisfied, as we saw in sec. 10; the elements $X$, $A'$, and $Y$ now being called the "unit distance," the "reciprocal of $A$," and the "$n$th root of $A$." Indeed, the system of angles, under addition, and the system of distances, under multiplication, are only two examples out of many which satisfy all these eight conditions, so that we may be well assured that the conditions are consistent.
These eight postulates, (a)-(h), may therefore be taken as the fundamental propositions of an abstract science, which will exhibit, in skeleton form, the logical structure of a large class of systems, of which the systems described in Part II are examples.
This is the refinement of the scientific method, to which reference was made in sec. 8. The great advantages of the method are: first, that the essential properties of a whole class of systems are epitomized in one abstract theory; and secondly, that the liability to error in deducing one theorem from another is vastly reduced by the abstract form of statement, which includes everything that is essential and nothing that is accidental. For example, in the proof of theorem (i) in sec. 8, it was an "accident" that the symbols "$\alpha$" and "$0$" represented angles, and the symbol "$+$" addition; the essential thing was that these symbols obeyed the formal laws laid down in propositions (a)-(h). Further, if any system, consisting of a class of elements $A$, $B$, $C$, ...
and a rule of combination $\circ$, is laid before us, we have only to assure ourselves that this system satisfies the eight postulates of our abstract science, in order to be convinced that this system will also satisfy all the derived theorems, which form the body of the science.
16. On the uses of an abstract science. From this discussion it will be evident that the main interest of an abstract science centers about the logical relations between abstract propositions, rather than about the applicability of these propositions to concrete things. But many important mathematical theories have been developed as "abstract sciences," from an apparently quite arbitrary set of postulates, which have later proved to be powerful tools in applied mathematics, when important practical systems that satisfied all the postulates of these particular theories unexpectedly presented themselves.
The case of the algebra of complex quantities, the study of which will form the main part of the present article, is precisely a case in point. This algebra was developed, historically, from the purest of purely "mathematical" motives: to satisfy a scientific curiosity as to what conclusions could be drawn from certain assumed hypotheses, with no thought of application to electrical engineering or any other branch of practical science; and yet when the electrical engineers, long after, began to develop the theory of alternating currents, they found that the fundamental conditions of their problem were formally identical with the fundamental postulates of the abstract science of this algebra; consequently the whole highly developed mathematical theory, with all its ramifications, became at once an invaluable tool, ready to hand, for the work of this most practical of practical sciences.
17. Examples of systems that do not satisfy these postulates. Concerning the set of postulates (a)-(h) of sec.
14, it will be instructive to give here a few examples of systems which do not satisfy all of these postulates; for it is only by understanding what a thing is not that we can fully understand what it is. For this purpose, we shall exhibit eight systems, each of which satisfies all but one of the eight postulates.
EXAMPLE (a). Let the class $A$, $B$, $C$, ... be the class of all angles between $-10^{\circ}$ and $+10^{\circ}$, and let $A \circ B$ be $A + B$. This system fails to satisfy postulate (a), since $7^{\circ} \circ 8^{\circ} = 15^{\circ}$, for instance, is not in the class. All the other postulates are satisfied.
EXAMPLE (b). Let the class be the class of positive integral numbers; and let the rule of combination be such that $A \circ B = B$. For example, $7 \circ 8 = 8$, $15 \circ 3 = 3$, etc. This system clearly fails to satisfy the commutative law, postulate (b); but all the other postulates are satisfied. Thus, in postulate (f), any element $X$ will have the required property $X \circ X = X$; since this element $X$ is not uniquely determined, postulate (g) has nothing further to demand; this postulate is, therefore, as we say, satisfied "vacuously."* To show that postulate (h) is satisfied, take $Y = A$.
* It is not surprising that $X$ is not uniquely determined in this system, since postulate (b) was one of the postulates required for the proof of uniqueness given above.
EXAMPLE (c). Class: all angles; rule of combination: $A \circ B = (A + B)/3$. Here the associative law, (c), is not satisfied, since, for example, $(3^{\circ} \circ 6^{\circ}) \circ 12^{\circ} = 3^{\circ} \circ 12^{\circ} = 5^{\circ}$, while $3^{\circ} \circ (6^{\circ} \circ 12^{\circ}) = 3^{\circ} \circ 6^{\circ} = 3^{\circ}$. All the other postulates are satisfied. Thus, in (f), take $X$ = the zero angle; in (g), take $A' = -A$; in (h), notice first that
$A^{[2]} = \frac{2}{3}A$, $A^{[3]} = \frac{1}{3}\left(\frac{2}{3}A + A\right) = \left(\frac{1}{3} + \frac{2}{3^2}\right)A$,
so that in general, by the formula for the sum of a geometric series,
$A^{[n]} = \frac{3^{n-1} + 1}{2 \times 3^{n-1}} A$;
hence, if we take
$Y = \frac{2 \times 3^{n-1}}{3^{n-1} + 1} A$,
postulate (h) will be satisfied.
EXAMPLE (d). Class: all angles; rule of combination: if $A$ is distinct from $B$, $A \circ B$ = the zero angle; but $A \circ A = A$.
This system fails to satisfy the "law of cancellation," but satisfies all the other postulates. Postulate (g) is satisfied "vacuously," since there is no uniquely determined element X to which this condition could refer.

EXAMPLE (e). Class: all angles, congruent angles being regarded as equal;* rule of combination: AoB = that angle in the first revolution which is congruent to A + B. Here (e) is not satisfied, since, for example, $(60^\circ)^{[2]}$ = 60° o 60° = 120°, and also $(240^\circ)^{[2]}$ = 240° o 240° = 120°, while 60° and 240° are not equal angles. All the other postulates are satisfied.

* Congruent angles are those that differ only by a multiple of 360°.

EXAMPLE (f). Class: all positive distances; rule of combination: AoB = the hypotenuse of a right triangle of which A and B are the legs. Here (f) is not satisfied, since the hypotenuse of a right triangle is never equal to a leg. All the other postulates are satisfied, postulate (g) "vacuously."

EXAMPLE (g). Class: all positive angles and the zero angle; rule of combination: o = +. This system clearly does not satisfy postulate (g), since if A = 10°, for example, the opposite of A is not in the class. All the other postulates are satisfied.

EXAMPLE (h). Class: all integral numbers, positive, negative, and zero; rule of combination: o = +, where + means the ordinary "+" of arithmetic. This system fails on postulate (h), since, for example, there is no integral number y such that y + y + y = 5. It clearly satisfies all the other postulates.

18. Independence of the postulates. These examples enable us to answer a second question concerning the set of postulates (a)-(h) in sec. 14. We have already inquired whether these postulates are consistent (sec. 15); we may now ask, Are they independent? That is, are none of them merely consequences of the rest? Or, in other words, is the set of postulates free from redundancies?
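Some of the failures claimed in the examples of sec. 17 can be spot-checked mechanically. The following is only a minimal sketch in Python, not part of the original text: the helper names (`is_commutative`, `is_associative`, `keep_right`, `third_of_sum`) and the finite samples are assumptions made for illustration. A finite sample can refute a law by exhibiting a counterexample, but it can never prove that a law holds for the whole class.

```python
from fractions import Fraction
from itertools import product


def is_commutative(op, sample):
    """True if op(a, b) == op(b, a) for every pair drawn from the sample."""
    return all(op(a, b) == op(b, a) for a, b in product(sample, repeat=2))


def is_associative(op, sample):
    """True if (a o b) o c == a o (b o c) for every triple drawn from the sample."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(sample, repeat=3))


def keep_right(a, b):
    """Example (b): AoB = B."""
    return b


def third_of_sum(a, b):
    """Example (c): AoB = (A + B)/3, with angles measured in degrees."""
    return (a + b) / 3


integers = [1, 2, 3, 7, 8, 15]                      # hypothetical sample
angles = [Fraction(d) for d in (0, 3, 6, 12)]       # hypothetical sample, exact arithmetic

b_commutes = is_commutative(keep_right, integers)     # refuted on the sample
b_associates = is_associative(keep_right, integers)   # holds on the sample
c_commutes = is_commutative(third_of_sum, angles)     # holds on the sample
c_associates = is_associative(third_of_sum, angles)   # refuted on the sample

# Example (h): no integer y in the searched range satisfies y + y + y = 5.
h_has_third_of_five = any(y + y + y == 5 for y in range(-10, 11))
```

Exact fractions are used for the angles so that the comparison in Example (c) is not disturbed by rounding.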
The examples just cited show that in this case the postulates are all independent; for, if postulate (a), for example, were a consequence of the other seven postulates, then every system which satisfied the other seven would also satisfy (a); but this is not the case, as is shown by the example cited; therefore postulate (a) is not a consequence of the other seven postulates. Similarly, each one of the eight postulates is shown to be independent of the rest.

In this connection it may be noticed that the postulates (a)-(h) in sec. 14 are often simpler statements than the propositions (a)-(h) in sec. 6 or sec. 10. For example, (f) in sec. 6 is really a double statement: (1) there is at least one angle x such that x + x = x, and (2) there is not more than one such angle; in sec. 14 we see that only the first part of this statement need be assumed as postulate (f), since the second part of the statement is a consequence of (a), (b), (c), and (d). The problem of reducing every postulate to its simplest form is one of the most fascinating problems in this kind of work; if we "weaken" the statement too much, we shall not be able to deduce what we wish to from it; while if we do not weaken it enough, we shall have difficulty in proving it independent. It would, of course, not be desirable to carry this reduction too far in elementary teaching; for the farther back we drive our postulates, the longer is the logical journey we must travel in deducing from these postulates the later and more interesting propositions of the science.

19. On the sufficiency of the postulates to determine a single type of system. We turn, finally, to a third question concerning the postulates (a)-(h) in sec. 14. We have been dealing with systems consisting of a class, say K, and a rule of combination, o; and among these systems (K, o) we have found some that satisfy the conditions laid down in this set of postulates, and some that do not.
Now the question to be raised is this: Are all the systems (K, o) that satisfy these postulates essentially of the same type? By systems of the same type we mean systems which are "isomorphic" with respect to the class K and the rule of combination o; two systems (K, o) and (K', o') being called isomorphic if the elements of the class K can be paired off with (put into "one-to-one correspondence" with) the elements of the class K' in such a way that whenever A and B in the class K correspond to A' and B' in the class K', then AoB in K will correspond to A'o'B' in K'.

As an example, we have the system of angles, with the rule of combination addition (sec. 5), and the system of distances, with the rule of combination multiplication (sec. 9); these two systems are isomorphic; for, if we take any angle α, not 0, and any distance a, not 1, and pair off the angles with the distances in the manner suggested by the following scheme:

angles:    ..., $-3\alpha$, $-2\alpha$, $-\alpha$, 0, $\tfrac{1}{2}\alpha$, $\alpha$, $\tfrac{3}{2}\alpha$, $2\alpha$, $3\alpha$, $4\alpha$, ...
distances: ..., $a^{-3}$, $a^{-2}$, $a^{-1}$, 1, $a^{1/2}$, $a$, $a^{3/2}$, $a^{2}$, $a^{3}$, $a^{4}$, ...

then the conditions for isomorphism are easily seen to be satisfied.* These two systems are therefore of the same type.

It is easy, however, to find systems that satisfy all the postulates (a)-(h) and are not isomorphic with either of the systems just considered. For example, consider the system in which the class K is the class of all "rational" angles (that is, the class of all angles expressible in the form ±(m/n) 1°, where m and n are positive integers), and in which the rule of combination o is the ordinary rule of addition.
This system, like the system of all angles considered above, satisfies all the postulates, as is readily verified; but the two systems are not isomorphic; for if we attempted to set up an isomorphism between them, we should necessarily pair off first the zeros of the two systems together, and then the rational fractions of 1° in one system with the rational fractions of some angle α in the other; whereupon the one system would be already exhausted, while the other would still contain an infinite number of unpaired elements (compare sec. 28).

* Incidentally we notice that the isomorphism between the system of angles and the system of distances may be set up in an infinite number of ways, since the angle α and the distance a may be chosen at pleasure.

The answer to our third question is therefore, in this case, in the negative; not all the systems that satisfy the postulates (a)-(h) of sec. 14 are of the same type. To distinguish between the various types of systems that satisfy these postulates, further conditions would have to be added.

These facts may be expressed by saying that the postulates in question, while they are consistent (sec. 15), and independent (sec. 18), are not "sufficient," that is, not sufficient to determine any single type of system (K, o).*

* The postulates for general algebra, which are given below in sec. 30, will be found to have all three of the properties of consistency, independence, and sufficiency. A "sufficient" set of postulates is also called "categorical," this term having been introduced by Veblen in 1904. (O. Veblen, A System of Axioms for Geometry, Trans. Am. Math. Soc., Vol. V, p. 346.) The term "sufficient" was first used by E. V. Huntington in 1902 (A Complete Set of Postulates for the Theory of Absolute Continuous Magnitude, Trans. Am. Math. Soc., Vol. III, p. 264). For a criticism of these terms, see L. Couturat, Les Principes des Mathématiques, p. 169. The earliest example of a "sufficient" or "categorical" set of postulates is a set of five postulates for the algebra of positive integers given by G. Peano in 1889. (See Bull. Am. Math. Soc., ser. 2, Vol. IX, p. 41, 1902.) In this connection compare also two papers by A. Padoa, (1) Essai d'une théorie algébrique des nombres entiers, précédé d'une introduction logique à une théorie déductive quelconque; Bibliothèque du Congrès international de Philosophie, Paris, 1900, Vol. III, pp. 309-365; and (2) Un nouveau système irréductible des postulats pour l'algèbre, Compte rendu du deuxième Congrès international des Mathématiciens, Paris, 1900, pp. 249-256; and a short note by D. Hilbert, Ueber den Zahlbegriff, Jahresber. der deutschen Mathematiker-Vereinigung, Vol. VIII, 1900, pp. 157-168.

20. Note on the terms "axiom" and "postulate." We have now introduced to the reader, in connection with the very simple systems studied in Part II, all the fundamental ideas which we shall need to use in the main part of the article. Before leaving this preliminary work, however, it may be well to say a word in regard to a disputed question of terminology; namely, the question of the proper use of the term "axiom."

Some authors, particularly in Germany, have called any set of conditions adopted as the basis of an abstract science, like the conditions (a)-(h) of sec. 14, a set of axioms for that science. In the opinion of the present writer, however, the term axiom should be applied only to statements of fact, like the propositions of sec. 6 or sec. 10, never to statements of conditions to be satisfied, like the propositions of sec. 14. The propositions of sec. 6 or sec. 10 are properly called axioms, because they are obviously true statements about certain definite operations on angles or distances. The propositions of sec. 14, on the other hand, are of quite different character.
We have called them "postulates," from the Latin postulo, because they are "demands" or conditions which a given system may or may not happen to satisfy. They are logically analogous to demands or conditions set up in other fields of activity; for example, just as any man who satisfies the conditions set up for admission to the army is entitled to belong to that particular class of men, so any system (K, o) that satisfies the conditions set up in sec. 14 is entitled to belong to a certain class of systems. No one would think of calling the conditions for admission to the army "axioms"; and there is no more reason for calling the conditions of sec. 14 by that name. Indeed if the word "axiom" is preserved in its well-established meaning, the recognition of the distinction between axiom and postulate, if properly understood, may well serve to mark the transition from the older to the more modern point of view in regard to the nature of abstract mathematical reasoning.*

* Compare M. Bôcher's St. Louis Address, 1904, Bull. Amer. Math. Soc., Vol. XI, pp. 115-135, especially the first footnote on p. 129. Also J. W. A. Young, The Teaching of Mathematics, pp. 193-201.

In regard to the term "postulate," there seems to be little choice between "postulate," "assumption," and "primitive proposition," all of which are in good use. Strictly speaking, these postulates, and all the theorems deducible from them, are not propositions at all, but rather what Bertrand Russell† has called "propositional functions," which become propositions (true or false) only after particular values are assigned to the variable symbols K and o.

† The Principles of Mathematics, Vol. I, 1903; or L. Couturat, Les Principes des Mathématiques.

IV. GEOMETRICAL EXAMPLE OF THE ALGEBRA OF COMPLEX QUANTITIES: THE SYSTEM OF POINTS IN A PLANE

21.
Points in a plane; "real" and "imaginary" points.* As a first concrete example of the algebra of complex quantities, consider the class of all points that lie in a plane in which two points, O and U, are fixed. These points are divided into "real" points and "imaginary" points. The "real" points are all the points that lie on the line through O and U, this line being called the axis of reals; the "imaginary" points are all the remaining points of the plane.† A "real" point is called positive or negative according as it lies on that half of the real axis which contains U, or on the other half. The point O itself is called the zero point (see below) and is neither positive nor negative. An imaginary point is called a pure imaginary if it lies on a line through O perpendicular to the axis of reals.

The position of any point a in the plane is determined when we know: (1) the distance of a from O (the distance OU being taken as the unit of measurement); and (2) the angle which the line Oa makes with the axis OU. Two points are "equal," that is, coincident, when and only when their "distances" are equal and their "angles" equal or congruent. The notation (5, ∠120°), for example, is used to denote a point whose "distance" is 5 times OU, and whose "angle" is 120°.

[Fig. 5]

* The system of points in the plane was first studied by C. Wessel, in 1799, and by Argand, in 1806.

† The terms "real" and "imaginary" are unfortunate legacies from the eighteenth century, which have become firmly fixed in mathematical literature; the so-called imaginary points are of course no more imaginary, in the ordinary sense of the word, than any other points of the plane.

All the points whose distances equal OU are called points on the "unit circle."

Among these points in the plane, we now proceed to define certain rules of combination which we shall call "addition" and "multiplication."

22. Addition of points in the plane.
If two points a and b are given, a third point x may be derived from them by the following process: Starting from O, perform the journey from O to a; then continuing from a, perform a journey equal in length and direction to the journey from O to b; the point finally reached is the required point x.* The point x thus determined is called the sum of the given points a and b (with respect to the chosen point O) and is denoted by a + b. The + sign here used must of course not be confused with the + sign of arithmetic, because the a and b here denote not numbers, but points.

* In the cases in which a and b are not in line with O, the point x may also be described as the fourth vertex of a parallelogram whose sides are Oa and Ob.

23. Concerning the addition of points in the plane, as thus defined, the reader may easily verify the following statements:

(1) If a and b are any two points (equal or unequal) then their sum, a + b, is a point, uniquely determined by a and b; and if a and b are "real" points, then a + b is also "real."

(2) a + b = b + a. (Commutative law for addition.)

(3) (a + b) + c = a + (b + c). (Associative law for addition.)

These facts will be clear from the accompanying figures.

[Figs. 6 and 7]

(4) If a + b = a + b', then b = b'. ("Law of cancellation" for addition.)

(5) If na = nb, then a = b. Here n is any positive integer, and na means a + a + ... + a, to n terms. The point na is called the nth multiple of a. If a is not 0, the series of points a, 2a, 3a, ... will lie beyond a on the straight line through O and a. Obviously, if a is a real point, na will also be real, and positive or negative, according as a is positive or negative.

[Fig. 8]

(6) There is a unique point z such that z + z = z. This point z is called the zero point of the system, and is denoted by 0. This point 0 is the point O of the figure. Obviously, from the definition, if a is any point, a + 0 = a.
(7) Every point a determines uniquely a point a' such that a + a' = 0. This point a' is called the opposite of a, and is denoted by -a; and if a is any real point, -a will also be real. The point -a is the point symmetrical to a with respect to O.

[Fig. 9]

(8) If a point a and a positive integer n are given, there is always a point x such that nx = a. This point x is called the nth submultiple of a, and is denoted by a/n; and if a is real, a/n will also be real. If a is not 0, the series of points a, a/2, a/3, ... will lie on the straight line between a and O, the series becoming more and more crowded as it approaches the point O. Obviously, if a = 0, 0/n = 0. Further, if m and n are any positive integers, m(a/n) = (ma)/n; this point is denoted by (m/n)a.

[Fig. 10]

(9) If a and b are any two points, there is always a point x such that a = b + x. This point x, which is uniquely determined by a and b, is called the remainder, a minus b, and is denoted by a - b; and if a and b are real, then a - b is also real. To construct this point a - b, notice that it is the same as a + (-b), as is evident from the figure; hence, to subtract a point b means to add the opposite of b.

All these statements concerning the addition of points are exactly analogous to the statements in sec. 6 and sec. 7 concerning the addition of angles.

24. Multiplication of points in the plane. We now define a second operation upon these points. If a and b are any two points in the plane, a third point x may be derived from them by the following process: find the "angle" of x by taking the sum of the angles of a and b, as defined in sec. 5; find the "distance" of x by taking the product of the distances of a and b, as defined in sec. 9.

[Fig. 11: x = a × b]

The point x thus determined is called the product of the given points a and b (with respect to the fixed points O and U) and is denoted by a × b, or a · b, or simply ab.
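The rule just stated (add the angles, multiply the distances) can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: a point is represented by a hypothetical pair (distance, angle in degrees), the helper names `multiply` and `as_complex` are not from the text, and the comparison with Python's built-in complex numbers is offered as a plausibility check rather than as the author's construction.

```python
import cmath
import math


def multiply(a, b):
    """Product of two points given as (distance, angle in degrees).

    Distances are multiplied and angles are added; the angle is reduced
    modulo 360 because congruent angles count as equal (sec. 21).
    """
    (dist_a, angle_a), (dist_b, angle_b) = a, b
    return (dist_a * dist_b, (angle_a + angle_b) % 360)


def as_complex(p):
    """Assumed identification of the point (distance, angle) with a complex number."""
    distance, angle_deg = p
    return cmath.rect(distance, math.radians(angle_deg))


a = (2, 100)       # the point (2, ∠100°)
b = (1.5, 300)     # the point (1.5, ∠300°)
product_ab = multiply(a, b)

# Difference between the geometric rule and ordinary complex multiplication;
# it should be zero up to rounding.
discrepancy = abs(as_complex(product_ab) - as_complex(a) * as_complex(b))
```

With these inputs the geometric rule gives the point (3.0, ∠40°), since 100° + 300° = 400° is congruent to 40°.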
For example, if a = (2, ∠10°) and b = (3, ∠15°), then a × b = (6, ∠25°). Here again the × sign must not be confused with the × of arithmetic, since the letters a and b here denote, not numbers, but points in the plane.

25. Concerning the multiplication of points in the plane, as thus defined, the following statements hold true:

(10) If a and b are any two points (equal or unequal) then their product a × b is a point uniquely determined by a and b; and if a and b are real, then a × b is also real. In particular, if a and b are both positive, or both negative, a × b will be positive; but if one factor is positive and the other negative, then the product, as obtained by the rule, will be negative.

(11) a × b = b × a. (Commutative law for multiplication.)

(12) (a × b) × c = a × (b × c). (Associative law for multiplication.)

The truth of these statements (11) and (12) is evident from the fact that the addition of angles and the multiplication of distances are themselves commutative and associative (secs. 6, 10).

(13) If a × b = a × b', and a is not 0, then b = b'. ("Restricted law of cancellation" for multiplication.)

(14) a × (b + c) = (a × b) + (a × c). (Distributive law of multiplication with respect to addition.)

[Figs. 12 and 13: a(b + c) = ab + ac]

To see that this distributive law holds, let each of the points b, c, and b + c be multiplied by a, as in Fig. 12; it is required to show that the point a(b + c) is the sum of the points ab and ac. To show this, place the quadrilateral O, ab, ac, a(b + c), together with the parallelogram O, b, c, (b + c), in a plane perpendicular to the line OU, in the manner shown in Fig. 13, and lay off the distance Oa along that line. By the definition of the multiplication of distances, the lines U-c and a-ac, in Fig.
13, are parallel, as are also the lines U-(b + c) and a-a(b + c); therefore the planes a-ac-a(b + c) and U-c-(b + c) are parallel, and hence the lines ac-a(b + c) and c-(b + c), in which these planes intersect the given plane, are parallel. Hence ac-a(b + c) is parallel to O-ab; and similarly, ab-a(b + c) is parallel to O-ac. Therefore the quadrilateral in question is a parallelogram, and the point a(b + c) is the sum of the points ab and ac, as required.*

* The truth of the distributive law may also be inferred directly from Fig. 12, from the properties of similar triangles; but the proof given above has the advantage of not involving the theory of ratio and proportion, or the "incommensurable case."

(15) There is a unique point u, distinct from 0, such that u × u = u; this point u is called the unit point of the system, and is denoted by 1. This point u is the point U of the figure; that is, the point (1, ∠0°). Obviously, from the definition of multiplication, if a is any point, a × 1 = a. The successive multiples of the point 1 [sec. 23, (5)] are denoted, for brevity, as follows: 1 + 1 = 2(1) = 2; 1 + 1 + 1 = 3(1) = 3; etc.

[Fig. 14: the points -3, -2, -1, 0, 1, 2, 3, 4 on the axis of reals]

(16) Every point a, provided a is not 0, determines uniquely a point a' such that a × a' = 1, where 1 is the unit point. This point a' is called the reciprocal of a, and is denoted by $a^{-1}$ or 1/a. If a is a real point (not 0) then its reciprocal will also be real. To construct the point 1/a, notice that its angle is the opposite of the angle of a (sec. 6), while its distance is the reciprocal of the distance of a (sec. 10). If a is a point on the "unit circle" (sec. 21), then 1/a will also be on the unit circle; while if a is inside the circle, 1/a will be outside, and the nearer a approaches the point O, the farther off will 1/a recede.

[Fig. 15]

(17) If a and b are any points, and b not 0, then there is always a point x such that a = b × x.
This point x, which is uniquely determined by a and b, is called the quotient, a divided by b, and is denoted by a/b. Moreover, if a and b are real (and b not 0) then a/b will also be real. To construct this point a/b, notice that its angle must be the angle of a minus the angle of b (sec. 7), while its distance must be the distance of a divided by the distance of b (sec. 11). In particular, 1/1 = 1, and (m1)/(n1) = (m/n)1, where m and n are any positive integers [sec. 23, (8)]. Hence, if we denote m1 and n1 by m and n [sec. 25, (15)] then m/n = (m/n)1. For example, 2/3 = (2/3)1. Notice here that on the left-hand side 2 and 3 are points, whose quotient must be found by the rules for the division of points, while on the right-hand side 2 and 3 are numbers, indicating how often a certain operation is to be repeated.

[Fig. 16]

(18) If a is any point, and 0 is the zero point, then a × 0 = 0; and if a product a × b = 0, then at least one of the factors a and b must be 0.

In view of these propositions, (1)-(18), we notice in passing that the system of points in the plane is a system in which addition, subtraction, multiplication, and division (except division by zero) are always possible; and the same is true of the system composed of the "real" points alone.

(19) The notation $a^n$, where n is any positive integer, means a × a × a × ... × a to n factors; and the point $a^n$ is called the nth power of the point a. In particular, $a^2$ is called the square and $a^3$ the cube of the point a. Obviously, from the definition of multiplication, $1^n = 1$, and $0^n = 0$.

[Fig. 17]

To construct the point $a^n$, notice that the angle of $a^n$ is the nth multiple of the angle of a (sec. 6); while the distance of $a^n$ is the nth power of the distance of a (sec. 10). If the point a lies on the unit circle, then $a^n$ will also lie on this circle; if the point a lies outside the circle, then the series of powers, a, $a^2$, $a^3$, ...
will lie outside the circle, on a spiral curve which recedes farther and farther from it; if the point a lies inside the circle, the series a, $a^2$, $a^3$, ... will lie inside the circle, on a spiral which again recedes farther and farther from the circle, coiling up around the point O.

Of special interest are the powers of i, where i denotes the point (1, ∠90°). Referring to the figure, and applying the rule for the multiplication of points, we see that the successive powers of the point i repeat in cycles of four:

$i^1 = i,\quad i^2 = -1,\quad i^3 = -i,\quad i^4 = 1,$
$i^5 = i,\quad i^6 = -1,\quad i^7 = -i,\quad i^8 = 1,$ etc.

[Fig. 18]

A similar fact is true of the point -i, that is, the point (1, ∠270°). Hence,

(20) There are two points x such that $x^2 = -1$, where -1 is the opposite of the unit point 1. These two points are called the imaginary units of the system, and are denoted by i and -i. It will be noticed that multiplying any point by i has the effect of rotating the point through 90° about O; while multiplying it by -1 rotates it through twice that angle, or 180°.

(21) If a is any point, there are always two "real" points, x and y, such that a = x + iy, where i is one of the "imaginary units." To see this, we have only to observe, first, that any "pure imaginary" point (sec. 21) can be expressed in the form iy, where y is some real point, and, second, that any point a can be expressed as the sum of a real point and a pure imaginary.

[Fig. 19: a = x + iy]

26. Solution of algebraic equations. Suppose now that any point a and any positive integer n are given; and let us ask, Is there any point x such that $x^n = a$? An inspection of the figure will show:

(22) If n is any positive integer, and a is any point not 0, there will be n distinct points x such that $x^n = a$; each of these points is called an nth root of a. Thus, every point a, except 0, has two square roots, three cube roots, four fourth roots, and so on.*

[Figs. 20 and 21]
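Statement (22) and the four-step cycle of the powers of i can be checked numerically. The sketch below is an illustration under assumptions, not the author's construction: points are identified with Python complex numbers, the function name `nth_roots` is hypothetical, and the roots are obtained from the rule for powers in (19) by taking the nth root of the distance and dividing each of the n congruent angles θ, θ + 360°, θ + 720°, ... by n.

```python
import cmath
import math


def nth_roots(distance, angle_deg, n):
    """Return n points x with x**n equal to the point (distance, ∠angle_deg).

    Raising a point to the nth power multiplies its angle by n and raises
    its distance to the nth power, so each root has distance
    distance**(1/n) and one of the angles (angle + 360*k)/n.
    """
    radius = distance ** (1.0 / n)
    return [cmath.rect(radius, math.radians((angle_deg + 360.0 * k) / n))
            for k in range(n)]


# The three cube roots of the point (8, ∠90°), i.e. of the complex number 8i.
cube_roots = nth_roots(8, 90, 3)

# Largest error when each root is cubed again; should be zero up to rounding.
max_error = max(abs(x ** 3 - 8j) for x in cube_roots)

# Successive powers of i = (1, ∠90°): i, -1, -i, 1, and then the cycle repeats.
powers_of_i = [1j ** k for k in range(1, 9)]
```

Floating-point arithmetic makes the check approximate, so results should be compared with a small tolerance rather than tested for exact equality.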
* It will be noticed that the proposition: If $a^n = b^n$, then a = b, which we found to be true when a and b represented distances (sec. 10), is not true when a and b represent points in the plane.

To construct these points, notice first that if a = 1, the nth roots of 1 are points on the "unit circle," and divide that circle into n equal parts, beginning with the point 1; for, any one of these points, when raised to the nth power according to the rule, will produce the point 1. In general, the nth roots of any point a will lie on a circle whose radius is the nth root of the "distance" of a (sec. 10), and will divide this circle into n equal parts, beginning with the point whose "angle" is the nth submultiple of the "angle" of a (sec. 6). If any one of the nth roots is given, the rest can be obtained from it by multiplying by the nth roots of the point 1. The notation $a^{1/n}$, or $\sqrt[n]{a}$, is used to denote that particular nth root of a which has the smallest (positive) angle. Thus, $i = \sqrt{-1}$, and $-i = -\sqrt{-1}$.

If we confine ourselves to real points, the statement of the situation is more complicated. Thus, if a is real, and n is an odd number, one of the nth roots of a will be real, and will be positive or negative according as a is positive or negative. If a is a positive real, and n is even, two of the nth roots of a will be real, one positive and one negative; but if a is a negative real, and n is even, none of the nth roots of a will be real.

More generally, suppose we have any algebraic equation of the nth degree in x, that is, any equation of the form

$p_0 x^n + p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n = 0,$

where n is a given positive integer, and $p_0, p_1, p_2, \ldots, p_n$ are any given points, provided $p_0$ is not zero; and let us inquire whether there is any value of x which will satisfy this equation. If there is such a point x, it is called a root of the equation. The facts in the case are these:

(23) Every algebraic equation of the nth degree:

$p_0 x^n + p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n = 0$

can be written as the product of n linear factors:

$p_0 (x - x_1)(x - x_2) \cdots (x - x_n) = 0,$

where the points $x_1, x_2, \ldots, x_n$ are fixed points depending on the coefficients $p_0, p_1, \ldots, p_n$; each of these points $x_1, x_2, \ldots, x_n$ is a root of the equation, and there are no other roots.*

* Since the n factors $x - x_1, x - x_2, \ldots, x - x_n$ are not necessarily distinct from one another, the number of distinct roots may be any number from 1 to n, inclusive; for the sake of brevity, however, it is customary and convenient to say that an equation of the nth degree always has n roots, understanding that in special cases some or all of these roots may be coincident.

The fact thus stated may be directly verified in the case of equations of the first, second, third, and fourth degrees (called linear, quadratic, cubic, and biquadratic equations, respectively). For example, the linear equation ax + b = 0 (a not zero) has the root x = -b/a; and the quadratic equation $ax^2 + bx + c = 0$ has the roots $x_1 = (-b + \sqrt{b^2 - 4ac})/(2a)$ and $x_2 = (-b - \sqrt{b^2 - 4ac})/(2a)$; and similar solutions can be obtained for equations of the third and fourth degrees.* The proof for the general case of an equation of the nth degree is more complicated, and will be given in Appendix II.

* See Monograph V.

It is important to notice that the fact just stated concerning the number of roots of an algebraic equation (or, what comes to the same thing, the number of linear factors) is true only when we take into consideration all the points of the plane. If we confined ourselves to the points on the real axis, the corresponding statement would be much more complicated. For example, the statement that "every quadratic equation $ax^2 + bx + c = 0$ has two (real or coincident) roots" is a true statement only when we are dealing with the complete system of all the points in the plane (or with some equivalent system). If we are dealing with the real points alone, we must say: "a quadratic equation $ax^2 + bx + c = 0$ has two roots, or one root, or no root, according as $b^2 - 4ac$ is positive, zero, or negative." To state, as is often done, that in case $b^2 - 4ac$ is negative, the two roots still "exist" but have now "become imaginary" is thoroughly mischievous. The simple fact is, that if we are dealing with the real points alone, and $b^2 - 4ac$ is negative, then there is no real point x such that $ax^2 + bx + c = 0$. No juggling with words will alter this fact; and no talk of "imaginary points" can possibly have any definite meaning for the student until he has become acquainted with some actual system in which such points occur.

We now turn to a third property of the system of points in the plane, namely, the relation of order among the points on the axis of reals.

27. The relation of order among the real points. From the description of the so-called "real" points in sec. 21, it is obvious, in the first place, that

(24) The real points form a subclass within the class of all the points of the plane. In particular, the points 0 and 1 are real points.

Within this subclass of real points, if the point a precedes the point b as we progress along the axis of reals in the direction OU, then we write a < b (read: "a algebraically less than b," or, better, "a precedes b"). The same situation may also be expressed by writing b > a (read: "b algebraically greater than a," or "b follows a"). Concerning this relation of serial order among the points on the axis of reals, the following statements are evident:*

(25) If a and b are real points, and a not equal to b, then either a < b or b < a.

(26) If a < b, then a is not equal to b.

(27) If a < b and b < c, then a < c. (Law of transitivity.)

(28) If a, x, and y are real points, and x < y, then a + x < a + y.

(29) If a > 0 and b > 0, then a × b > 0. If a > 0, then a is a positive real point, and if a < 0, a negative real point (sec. 21).
Hence the statement just made can be expressed by saying that if a and b are positive, their product a × b will also be positive.

(30) If a < b, there are always real points x such that a < x and x < b. Such points x are said to lie between the points a and b.

* For a detailed elementary study of the relation of serial order, see E. V. Huntington, The Continuum as a Type of Order, reprinted from the Annals of Mathematics, 1905 (Publication Office of Harvard University).

A further fact, which is not so obvious, but which may be accepted as a geometric axiom, is the following:

(31) (Dedekind's principle.) If M is any (non-empty) subclass of real points, and if all the points of M precede a given point c, then there will be a uniquely determined point x, called the upper limit of M, having the following properties: First, every point in M precedes, or at most equals, x; Second, if x' is any real point such that x' < x, then there is at least one point of M that follows x'.

[Fig. 22]

In other words, if a subclass of real points has any "upper bound," it will have a "least upper bound," or "upper limit." Similarly, a subclass of real points that has any lower bound will have a "greatest lower bound," or "lower limit." This fact is of great importance in connection with the so-called irrational points, as explained in the next section.

Finally, we have what is known as the Principle of Archimedes:

(32) If a and b are any positive points, and a is "less than" b, it is always possible to find some multiple of a which is "greater than" b. This fact is of great importance in the theory of measurement.

28. Classification of real points. Among the real points the points 1, 2, 3, ... [sec. 25, (15)] are called the positive integral points, and the points -1, -2, -3, ... the negative integral points; all these, together with the point 0, form the subclass of "all integral points."
All real points which can be expressed in the form ±m/n, where m and n are any positive integral points [sec. 25, (17)], together with the point 0, are called the rational points. The rational points which are not integral are called fractional; the fractional points lie between the integral points. All real points which are not rational are called irrational.

That not all the real points are "rational" can be made clear by the following familiar reasoning: Consider the diagonal, D, of a square whose side is the unit distance, u; this length D cannot be expressed as a rational fraction of u; for, if D = (m/n)u, where m and n are positive integers, then, since the area of the square on D is equal to twice the area of the square on u, we should have m²/n² = 2, and this numerical relation cannot be satisfied by any integers m and n.* Hence, if we take a point x on the real axis so that its distance from 0 is equal to D, then this point x cannot be expressed in the form (m/n)1, and is therefore an irrational point.

From sec. 27, (31) it is clear that every irrational point a can be regarded as the limit of an infinite sequence of rational points, a₁, a₂, a₃, .... Of special importance are the sequences of the form illustrated by the following example:

a₀ = 6; a₁ = 4/10; a₂ = 0/10²; a₃ = 3/10³; a₄ = 1/10⁴; ...

where each numerator is one of the points 0, 1, 2, 3, ..., 9 [see sec. 25, (15)], and each denominator is a power of the point 10. A sequence of this form is called a decimal fraction, and is denoted, for brevity, as follows (taking the same example):

6; 6.4; 6.40; 6.403; 6.4031; ...
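The decimal sequence just written down can be reproduced mechanically. The following is a minimal sketch in Python, assuming (as the text goes on to say) that the sequence approximates √41; the function name `decimal_truncations` is hypothetical, and exact integer and rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import isqrt


def decimal_truncations(n: int, places: int) -> list[Fraction]:
    """Return the rational points floor(sqrt(n) * 10**k) / 10**k for k = 0..places."""
    out = []
    for k in range(places + 1):
        # isqrt(n * 100**k) is the largest integer whose square
        # does not exceed n * 10**(2k).
        out.append(Fraction(isqrt(n * 100**k), 10**k))
    return out


points = decimal_truncations(41, 4)
for k, p in enumerate(points):
    # Each term falls short of sqrt(41) by less than one unit in the k-th place.
    assert p * p < 41 < (p + Fraction(1, 10**k)) ** 2
    print(f"{float(p):.{k}f}")
```

Each printed value is a rational point; the assertion inside the loop checks that its square is below 41 while the next point in the same decimal place already overshoots.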
These points are the first terms of the sequence that would be obtained if we attempted to approximate toward the point √41 by a sequence of rational points in the decimal form; in fact, the algorithm for "extracting the square root" of the point 41 is exactly analogous to the familiar algorithm for extracting the square root of the number 41 in arithmetic; but it should be clearly understood that when we are dealing with the system of points, the point √41, like all the other irrational or rational points, is already given, from the start, in the system of points, while if we are dealing with the system of numbers, and have developed that system as far as the rational numbers, there is no rational number whose square is the number 41, and hence there is no rational number which could be denoted by √41. Before we can speak of the "number" √41 as the limit of a sequence of rational numbers, we must first define what we mean by "irrational numbers"; that is, we must point out what the objects are that we agree to call by that name, and how these objects can be "introduced" into our "number system." The ingenious manner in which this "enlargement of the number concept" has been accomplished is explained in Appendix I; but throughout the body of the article we are dealing only with the geometrical system of points.

* For, if m²/n² = 2, then m² = 2n²; in this equation, the left-hand side contains the factor 2 an even number of times, if at all, while the right-hand side contains the factor 2 either once or some other odd number of times. The equation is therefore impossible, since a whole number can be factored in only one way.

29. First step toward the science of this algebra. Selection of axioms. These 32 propositions, in secs. 23-27, might well be taken as a set of axioms for the science of algebra (compare sec.
8); they are not all "simple statements," and they are not all independent, as will be shown in the more rigorous analysis given in Part V; but they are so chosen that all the theorems which form the main body of the science can be deduced from them without undue labor. In particular, the question of the irrational and imaginary quantities becomes, as we have just seen, not a question of "introducing" newly devised elements into the system, but merely a question of classification of elements that are already known to exist in the given system.

V. THE ABSTRACT THEORY OF THE ALGEBRA OF COMPLEX QUANTITIES

30. A complete set of postulates for the algebra of complex quantities.* The system of points in the plane, studied in Part IV, is the best known and most easily understood example of the type of algebra called the algebra of complex quantities. Other examples will be given in sec. 33, and in Appendix I. We now proceed to analyze what is logically essential in this system.

The fundamental notions of the system are: the class of points in general; the class of "real" points; the operations of addition and multiplication; and the relation of order. Abstractly considered, therefore, the fundamental notions in terms of which all the propositions of the algebra can be stated are the following:

(1) A class of elements, a, b, c, ..., which we may denote by K;
(2) A class of elements which we may denote by C;
(3) A rule of combination, which we may denote by ⊕;
(4) A rule of combination, which we may denote by ⊙;
(5) A relation, which we may denote by ⧀.

* The set of postulates here given is substantially the same as that first published by the writer in Trans. Amer. Math. Soc., Vol. VI, 1905, pp. 209-229.

Any system involving these fundamental notions we shall speak of as a "system (K, C, ⊕, ⊙, ⧀)."
We now impose on these symbols the conditions expressed in postulates 1-27, below; the object being to show that every system (K, C, ⊕, ⊙, ⧀) which satisfies these twenty-seven postulates is of the same "type" as the system of points in the plane.

POSTULATE 1. If a and b are elements of K, then a⊕b is an element of K, called the sum of the elements a and b.

POSTULATE 2. a⊕b = b⊕a.

POSTULATE 3. (a⊕b)⊕c = a⊕(b⊕c).

POSTULATE 4. If a⊕b = a⊕b', then b = b'.

POSTULATE 5. There is an element z in K such that z⊕z = z.

DEFINITION. If there is only one such element z, this unique element is called the zero element of the system.

POSTULATE 6. For every element a in K there is an element a' in K, such that a⊕a' = z, where z is the zero element.*

DEFINITION. If this element a' is uniquely determined by a, it is called the opposite of a, and is denoted by −a.

Any system (K, ⊕) that satisfies these postulates 1-6 is called an Abelian group with respect to the operation ⊕.†

POSTULATE 7. If a and b are elements of K, then a⊙b is an element of K, called the product of the elements a and b.

POSTULATE 8. a⊙b = b⊙a.

POSTULATE 9. (a⊙b)⊙c = a⊙(b⊙c).

POSTULATE 10. If a⊙b = a⊙b', and a is not zero, then b = b'.

POSTULATE 11. a⊙(b⊕c) = (a⊙b)⊕(a⊙c).

* If there is no zero element in the system, postulate 6 becomes meaningless; it demands nothing. We say then that every system that contains no zero element satisfies this postulate "vacuously." A similar remark applies to several of the other postulates.

† For bibliographical references to definitions of "groups" and "fields," see Trans. Amer. Math. Soc., Vol. VI, 1905, p. 181.

POSTULATE 12. There is an element u in K, different from zero, such that u⊙u = u.

DEFINITION. If there is only one such element u, this unique element is called the unit element of the system.

POSTULATE 13. For every element a in K, provided a is not zero, there is an element a' in K, such that a⊙a' = u, where u is the unit element.

DEFINITION.
If this element a' is uniquely determined by a, it is called the reciprocal of a, and is denoted by 1/a, or a⁻¹ (provided a is not zero).

Any system (K, ⊕, ⊙) that satisfies these postulates 1-13 is called a field with respect to the operations ⊕ and ⊙.*

The following postulates concern the class C and the relation ⧀:

POSTULATE 14. If a and b are elements of C, and a is not equal to b, then either a ⧀ b or else b ⧀ a.

POSTULATE 15. If a ⧀ b, then a is not equal to b.

POSTULATE 16. If a ⧀ b and b ⧀ c, then a ⧀ c.

These three postulates, 14-16, make the class C an "ordered" class, with respect to the relation ⧀.

POSTULATE 17. (Dedekind's postulate.) If M is any (non-empty) subclass in C, and if there is an element c in C such that a ⧀ c for every element a in M, then there is an element x in C having the following properties with regard to the subclass M: (1) if a belongs to M, then a ⧀ x, or at most, a = x; (2) if x' is any element of C such that x' ⧀ x, then there is at least one element a in M such that x' ⧀ a.

DEFINITION. If this element x is uniquely determined by the subclass M, it is called the upper limit of M.

The following two postulates serve to connect the relation ⧀ with the operations ⊕ and ⊙.

POSTULATE 18. Within the class C, if x ⧀ y, then a⊕x ⧀ a⊕y.†

POSTULATE 19. Within the class C, if z ⧀ a and z ⧀ b, where z is the zero element, then z ⧀ a⊙b.

* For bibliographical references to definitions of "groups" and "fields" see Trans. Amer. Math. Soc., Vol. VI, 1905, p. 181.

† Provided a⊕x is not equal to a⊕y.

If, in these last six postulates, we replace "C" by "K," the postulates 1-19, as thus altered, form a complete set of postulates for the subalgebra of all real quantities (compare sec. 37).

The following postulates concern the class C and the operations ⊕ and ⊙:

POSTULATE 20. If a is an element of C, then a is an element of K.

POSTULATE 21. The class C contains at least two elements.

POSTULATE 22.
If a and b belong to C, and have a sum a⊕b, then a⊕b also belongs to C.

POSTULATE 23. If a belongs to C, and has an opposite, −a, then −a also belongs to C.

POSTULATE 24. If a and b belong to C, and have a product, a⊙b, then a⊙b also belongs to C.

POSTULATE 25. If a belongs to C, and has a reciprocal, 1/a, then 1/a also belongs to C.

These six postulates, 20-25, together with postulates 1-13, make the class C, like the class K, a "field" with respect to ⊕ and ⊙.

POSTULATE 26. If K is a "field" there is an element j in K such that j⊙j = −u, where −u is the opposite of the unit element.

DEFINITION. If there are two (and only two) such elements, j and −j, either of them may be called the "imaginary unit" of the system.

POSTULATE 27. If K and C are "fields" and K contains an "imaginary unit" j, then for every element a in K there are elements x and y in C, such that x⊕(j⊙y) = a.

These postulates, 1-27, form a complete set of postulates for the algebra of complex quantities.

From these twenty-seven postulates all the theorems of the algebra of complex quantities can be deduced. In particular, it is easily proved that every system that satisfies these postulates will have a unique zero element and a unique unit element; also, every element a will determine a unique opposite, −a, and (except when a is zero) a unique reciprocal, 1/a; the pair of imaginary units j and −j is uniquely determined; and every subclass of the kind described in Dedekind's postulate will have a unique upper limit.

To avoid any possible misunderstanding, it may be well to state again that these postulates are not by any means intended for use in elementary instruction. Such a set of postulates exhibits, in skeleton form, the logical structure of a particular type of algebra; but an interest in the logical structure of a science naturally does not arise in a student's mind until the facts of that science have long been familiar to him.
It must not be supposed, moreover, that the set of postulates here given is the only possible set of postulates for the algebra in question; or that the fundamental notions here mentioned are the ones that are necessarily adopted. On the contrary, a wide range of choice is possible; but any set of symbols selected as the fundamental notions for the algebra must be definable in terms of the fundamental notions here given, and any set of postulates selected as the fundamental propositions of the algebra must be deducible from the postulates here given.*

In the actual development of the algebra from these postulates, when only one system is contemplated, we of course omit the circles around the signs ⊕, ⊙, and ⧀, and replace z, u, and j by the more familiar 0, 1, and i; but when we are comparing several systems, or testing a given system to see whether it satisfies the postulates, then the more general notation is essential, if we would avoid hopeless confusion.

31. Consistency of the postulates. To establish the consistency of these twenty-seven postulates, we must exhibit at least one actual system (K, C, ⊕, ⊙, ⧀) that satisfies them all (compare sec. 15). The simplest system of this kind is the system studied in Part IV; namely:

K = the class of all points in the plane (sec. 21);
⊕ = +, as defined in sec. 22;
⊙ = ×, as defined in sec. 24;
C = the class of all points on the axis of reals (sec. 21);
⧀ = <, as defined in sec. 27.

* Considerations of this kind were first emphasized by the Italians, as Peano, Padoa, Pieri, Burali-Forti, etc., their work dating from about 1890.

That this system satisfies all the postulates of sec. 30 is evident from the facts enumerated in Part IV. Here z = 0, u = 1, and j = i.

Other such systems, built up out of purely arithmetical material, without recourse to geometric intuitions, will be mentioned in Appendix I. Any one of these systems shows that the postulates are consistent.
Still other, and very instructive, examples are given in sec. 33.

32. Sufficiency of the postulates. Further, the twenty-seven postulates of sec. 30 are sufficient to determine a definite type among the systems (K, C, ⊕, ⊙, ⧀); that is, any two systems (K, C, ⊕, ⊙, ⧀) that satisfy all these postulates will be "isomorphic" with respect to K, C, ⊕, ⊙, and ⧀ (sec. 19).

To prove this, suppose two systems (K, C, ⊕, ⊙, ⧀) and (K', C', ⊕', ⊙', ⧀') are given. First, pair the elements z and u of class K with the elements z' and u' of class K'; then pair all the rational real elements of K with the corresponding rational real elements of K'; and, further, pair the irrational real elements of K with the irrational real elements of K' by pairing the limit of every sequence of rationals in K with the limit of the corresponding sequence of rationals in K'. In this way a one-to-one correspondence is established between the subclasses C and C'. Next, taking one of the elements ±√(−u) in K as j, and one of the elements ±√(−u') in K' as j', pair these elements j and j'; and finally pair every element x⊕(j⊙y) in K with the corresponding element x'⊕'(j'⊙'y') in K', thus completing the one-to-one correspondence between the two classes. It is then easy to see that the correspondence is of such a nature that if a and b in K correspond to a' and b' in K', then a⊕b will correspond to a'⊕'b' and a⊙b to a'⊙'b'; and, furthermore, if a ⧀ b, then a' ⧀' b'. The isomorphism between the two systems is thus established.

It may be noticed that the isomorphism between the two systems can be set up in two ways, according to which of the elements ±√(−u) we take as j. It is a curious fact that there is no way of distinguishing between j and −j by any statement that can be expressed in terms of the symbols K, C, ⊕, ⊙, and ⧀; that is, any true statement involving j and expressible in terms of these symbols alone will remain a true statement when j is replaced by −j.
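The interchangeability of j and −j can be illustrated in the familiar system of complex numbers, where replacing i by −i is complex conjugation. The sketch below is only a spot check on a small sample of points with integer coordinates (a hypothetical sample, chosen so that Python's built-in complex arithmetic is exact); it is an illustration, not a proof, and the name `swap_j` is introduced here for the purpose.

```python
from itertools import product

# Sample points x + iy with small integer coordinates; arithmetic on these is exact.
sample = [complex(x, y) for x, y in product(range(-3, 4), repeat=2)]


def swap_j(a: complex) -> complex:
    """Replace j by -j, i.e. send x + jy to x - jy."""
    return a.conjugate()


for a, b in product(sample, repeat=2):
    assert swap_j(a + b) == swap_j(a) + swap_j(b)  # sums correspond to sums
    assert swap_j(a * b) == swap_j(a) * swap_j(b)  # products correspond to products

# Elements of C (the reals) are left fixed, so the order relation is untouched.
assert all(swap_j(a) == a for a in sample if a.imag == 0)

# Both j and -j satisfy the defining condition j * j = -u, with u = 1.
assert 1j * 1j == -1 and (-1j) * (-1j) == -1
```

Since the swap respects sums and products and leaves every real element where it was, any statement built from K, C, the two operations, and the order relation keeps its truth value under the swap, at least on this sample.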
All the systems that satisfy these twenty-seven postulates are therefore identical as far as properties statable in terms of K, C, ⊕, ⊙, and ⧀ are concerned: that is, every proposition statable in terms of these symbols alone will either be true for all such systems, or else be false for all of them.

33. Examples of isomorphic systems. The following examples of isomorphic systems will be instructive; in each case the symbols +, ×, and < are to be understood in the sense defined in Part IV.

(a) K = class of all points in the complex plane; a⊕b = a + b; a⊙b = 5(a × b); C = class of all points on the axis of reals; ⧀ = <. Here z = 0, u = 1/5, j = i/5.

(b) K = class of all points in the complex plane; a⊕b = (a × b)/(a + b), except that a⊕b = a + b whenever a or b or a + b is zero; a⊙b = a × b; C = class of all points on the axis of reals; (a ⧀ b) = (a < b), except that when a and b are both positive or both negative, (a ⧀ b) = (a > b). Here z = 0, u = 1, j = i.

(c) K = class of all points in the complex plane; a⊕b = a + b + 1; a⊙b = a × b + a + b; C = class of all points on the axis of reals; ⧀ = <. Here z = −1, u = 0, −u = −2, and j = i − 1.

Each of these systems satisfies all the twenty-seven postulates of sec. 30, and hence is strictly isomorphic with the system described in Part IV.* It will be noticed that the ordinary meaning of addition is preserved in Example (a), and the ordinary meaning of multiplication in Example (b). Other examples are given in Appendix I.

34. Independence of the postulates. Finally, the twenty-seven postulates of sec. 30 are all independent; that is, no one of them can be deduced from the remaining twenty-six.

To prove this, we must exhibit, in the case of each postulate, a system (K, C, ⊕, ⊙, ⧀) which satisfies all the other postulates, but not the one in question (compare sec. 18). A complete list of such "pseudo-algebras" is given in the Transactions of the American Mathematical Society, Vol. VI, 1905, pp.
227-229; a few examples from this list are given in the next paragraph, the most interesting one being the example for Postulate 18.

* Each of these systems is obtained from the ordinary complex plane by a projective transformation.

35. Selected examples of systems that satisfy all but one of the postulates.

EXAMPLE FOR 1. Let K be a class consisting of five elements, 0, 1, −1, i, −i, and C a class consisting of three of these elements, namely, 0, 1, −1, and let ⊕, ⊙, and ⧀ mean the ordinary +, ×, and <. This system does not satisfy Postulate 1, since, for example, the element 1⊕1 = 2 does not belong to the class. All the other postulates are satisfied.

EXAMPLE FOR 3. K = all complex quantities; C = all real quantities; a⊕b = (a + b)/3; ⊙ = ×; ⧀ = <. This system does not satisfy the associative law for addition (compare Example (c), sec. 17). All the other postulates are satisfied.

EXAMPLE FOR 4. The same as for 3, except that ⊕ is now defined so that a⊕b = 0 for all values of a and b. In this system, postulate 4, the "law of cancellation" for addition, is clearly not satisfied. There is a zero element z = 0, and a unit element u = 1; and Postulate 6 is satisfied. Since a⊕a' = 0, whatever the value of a', we cannot speak of "the opposite of a," since this element a' is not uniquely determined. Hence postulates like 26 and 27, which presuppose the existence of an opposite, are satisfied "vacuously."

EXAMPLE FOR 8. K = all complex quantities; C = all real quantities; ⊕ = +; a⊙b = b; ⧀ = <. This system clearly does not satisfy the commutative law for multiplication. All the other postulates are satisfied. (The system does not contain a unique unit element, and therefore all the postulates which presuppose such an element are satisfied "vacuously.")

EXAMPLE FOR 11. The usual system of complex quantities, except that ⊙ is so defined that a⊙b = a + b − 1.
Here z = 0, u = 1; since the distributive law is not satisfied, the system is not a "field," and Postulates 26 and 27 demand nothing. All the other postulates are satisfied.

EXAMPLE FOR 12. K = the class of all complex quantities x + iy, in which x and y are even integers (positive, negative, or zero); C = all the elements of this class which are real; ⊕, ⊙, and ⧀ defined as the ordinary +, ×, and <. This system contains no unit element, but satisfies all the other conditions.

EXAMPLE FOR 16. K = all complex quantities; C = all real quantities; ⊕ = +, ⊙ = ×; but ⧀ interpreted to mean "not equal to." This system satisfies all the postulates except the law of transitivity; for, with the meaning given to ⧀, we may have a ⧀ b and b ⧀ c, and yet not a ⧀ c.

EXAMPLE FOR 17. The ordinary system of complex quantities, x + iy, with x and y restricted to rational values (positive, negative, or zero).

EXAMPLE FOR 18. K = a class of nine objects, let us say nine umbrellas, marked with the labels 0, 1, 2, 3, 4, 5, 6, 7, 8; C = the subclass composed of umbrellas 0, 1, and 2, with ⧀ = <; ⊕ and ⊙ defined according to the following tables:*

  ⊕ | 0 1 2 3 4 5 6 7 8
  --+------------------
  0 | 0 1 2 3 4 5 6 7 8
  1 | 1 2 0 4 5 3 7 8 6
  2 | 2 0 1 5 3 4 8 6 7
  3 | 3 4 5 6 7 8 0 1 2
  4 | 4 5 3 7 8 6 1 2 0
  5 | 5 3 4 8 6 7 2 0 1
  6 | 6 7 8 0 1 2 3 4 5
  7 | 7 8 6 1 2 0 4 5 3
  8 | 8 6 7 2 0 1 5 3 4

  ⊙ | 0 1 2 3 4 5 6 7 8
  --+------------------
  0 | 0 0 0 0 0 0 0 0 0
  1 | 0 1 2 3 4 5 6 7 8
  2 | 0 2 1 6 8 7 3 5 4
  3 | 0 3 6 4 7 1 8 2 5
  4 | 0 4 8 7 2 3 5 6 1
  5 | 0 5 7 1 3 8 2 4 6
  6 | 0 6 3 8 5 2 4 1 7
  7 | 0 7 5 2 6 4 1 8 3
  8 | 0 8 4 5 1 6 7 3 2

For example, 3⊕7 = 1; 3⊙7 = 2. This remarkable system does not satisfy Postulate 18, as we see by taking a = 1, and x = 1, y = 2. All the other postulates can be shown to be satisfied, although the labor of a direct verification of the associative and distributive laws would be large. The zero element of the system is z = 0, the unit element is u = 1, and the imaginary units are 4 and 8. To show that Postulate 27 is satisfied, take i = 4, and build all the elements of the form x + iy, where x and y belong to C; this set of elements will be seen to exhaust the given class K.
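The direct verification that the text calls laborious is a matter of a few thousand table look-ups, which a short program can carry out. The following Python sketch transcribes the two tables of the Example for Postulate 18 as digit strings and checks the commutative, associative, and distributive laws by brute force; the helper names `add` and `mul` are introduced here for illustration only.

```python
from itertools import product

# Rows of the two tables from the Example for Postulate 18, as digit strings.
ADD = ["012345678", "120453786", "201534867", "345678012", "453786120",
       "534867201", "678012345", "786120453", "867201534"]
MUL = ["000000000", "012345678", "021687354", "036471825", "048723561",
       "057138246", "063852417", "075264183", "084516732"]

K = range(9)
C = (0, 1, 2)


def add(a: int, b: int) -> int:
    return int(ADD[a][b])


def mul(a: int, b: int) -> int:
    return int(MUL[a][b])


# The two sample entries quoted in the text.
assert add(3, 7) == 1 and mul(3, 7) == 2

# Commutative, associative, and distributive laws (Postulates 2, 3, 8, 9, 11).
for a, b, c in product(K, repeat=3):
    assert add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
    assert add(add(a, b), c) == add(a, add(b, c))
    assert mul(mul(a, b), c) == mul(a, mul(b, c))
    assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))

# Zero element, unit element, opposite of the unit, and the imaginary units.
zeros = [z for z in K if add(z, z) == z]
units = [u for u in K if u != zeros[0] and mul(u, u) == u]
minus_u = next(a for a in K if add(units[0], a) == zeros[0])
imaginary_units = [j for j in K if mul(j, j) == minus_u]

# Postulate 27: the elements x + j*y, with x and y in C, exhaust K (taking j = 4).
spanned = {add(x, mul(4, y)) for x in C for y in C}

# Postulate 18 within C: if x < y then a+x < a+y (provided a+x differs from a+y).
postulate_18_holds = all(add(a, x) < add(a, y)
                         for a in C for x in C for y in C
                         if x < y and add(a, x) != add(a, y))

print(zeros, units, imaginary_units, sorted(spanned) == list(K), postulate_18_holds)
```

The run confirms the values stated in the text (zero element 0, unit element 1, imaginary units 4 and 8) and reports that Postulate 18 fails, the case a = 1, x = 1, y = 2 being one counterexample.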
This system is a good example of the strange "pseudo-algebras" which would have to be admitted if we left out even one of the twenty-seven conditions imposed by the postulates.

* Advanced students will recognize this system as a Galois Field of order 3².

EXAMPLE FOR 20. K = all complex quantities x + iy, where x and y are restricted to rational values (positive, negative, or zero); C = all real quantities; ⊕, ⊙, and ⧀ meaning the ordinary +, ×, and <. This system satisfies all the postulates except the 20th.

EXAMPLE FOR 24. K = all complex quantities, with ⊕ = +, and ⊙ = ×; C = all pure imaginaries (sec. 21), with ⧀ defined so that ix ⧀ iy whenever x < y. Here the product of two elements of C will not (in general) belong to C, but all the other postulates are satisfied (19 and 27 vacuously).

EXAMPLE FOR 26. K = all real quantities, C = all real quantities, ⊕ = +, ⊙ = ×, ⧀ = <. This system contains no "imaginary units."

EXAMPLE FOR 27. The system employed to show the independence of Postulate 27 is a rather complicated one, as follows: K = the class of all algebraic expressions T of the form

T = Aₘtᵐ + Aₘ₊₁tᵐ⁺¹ + Aₘ₊₂tᵐ⁺² + ...,

where t is a parameter, and m any integer (positive, negative, or zero), while the A's are ordinary complex quantities. The operations ⊕ and ⊙ are defined as the ordinary + and × for such (finite or infinite) expressions. The class C is the class of all those elements T in which all the coefficients are zero except A₀, and A₀ is real; that is, C = the class of real quantities. Within this class C, ⧀ is defined as the ordinary <. This system satisfies all the postulates except Postulate 27. It is larger than the system of ordinary complex quantities, and contains that system just as the system of ordinary complex quantities contains the system of real quantities. Postulate 27 is therefore a restrictive condition.

36. What is algebra? We are now in a position to answer the question, "What is the algebra of complex quantities?"
The answer is: the algebra of complex quantities is the scientific study of that particular type of "system (K, C, ⊕, ⊙, ⧀)" which satisfies the twenty-seven postulates of sec. 30; any system (K, C, ⊕, ⊙, ⧀) that satisfies these twenty-seven conditions may be taken as a representative example of the algebra, and all the propositions which are logically deducible from these twenty-seven postulates are the propositions which form the body of the science. The system of points in a plane, described at length in Part IV, is the simplest representative example of this algebra, and is the only example which could possibly be used to advantage in elementary instruction (compare Appendix I, especially sec. 42).

Again, if one asks, "What is an imaginary quantity?" the answer is this: If any system (K, C, ⊕, ⊙, ⧀) that satisfies the twenty-seven laws of complex algebra is given, then any element of K not belonging to the subclass C is called an "imaginary" element of that system. The question "What is an irrational quantity?" may be answered in a similar way.

A striking peculiarity of the set of postulates adopted in sec. 30 is that none of the postulates presupposes any knowledge of arithmetic, not even the notion of counting.*

37. A complete set of postulates for the subalgebra of real quantities. A complete set of postulates for the algebra of all real quantities may be obtained from the list in sec. 30 as follows: Omit postulates 20-27; abandon the distinction between the classes K and C, and make postulates 14-19 apply to the whole class K.
The resulting set of nineteen postulates, 1-19, will be consistent, sufficient, and independent; and any system (K, ⊕, ⊙, ⧀) which satisfies them all will be an example of the type of algebra called the algebra of all real quantities.† Complete sets of postulates for other subalgebras, as the algebra of positive integers, the algebra of all integers, the algebra of all rationals, etc., are given in another paper by the writer.‡

* This must not be understood to imply that the postulates of sec. 30 would therefore form a suitable introduction to algebra for beginners; compare the remark near the end of sec. 30.

† For bibliographical references, see Trans. Amer. Math. Soc., Vol. III, 1902, p. 265.

‡ The Fundamental Laws of Addition and Multiplication in Elementary Algebra, reprinted from the Annals of Mathematics, 1906 (Publication Office of Harvard University).

38. On the value of complex algebra in problems concerning real quantities. As already pointed out, the rules of operation in any of the subalgebras are more complicated, that is, more subject to exceptions, than are the rules of operation in the general algebra of complex quantities. On this account, it is usually worth while to employ the algebra of complex quantities even in cases where the data of the problem, and the required answer, are all real quantities. For example, if it is required to find a real value of x that satisfies a given equation ax² + bx + c = 0, the simplest plan is first to find all the values of x that satisfy the equation, and then to pick out those, if any, that are real. Similarly, if the problem calls for a positive value (or an integral value) of x, we do not confine ourselves to the algebra of positive quantities (or the algebra of integral quantities) but proceed at once to operate in the realm of all complex quantities, and then select those results that satisfy the given conditions.
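The plan just described (solve in the complex domain, then select) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names `all_roots` and `real_roots` and the tolerance `tol` are hypothetical, and floating-point arithmetic from the standard `cmath` module stands in for exact complex quantities.

```python
import cmath


def all_roots(a: complex, b: complex, c: complex) -> tuple[complex, complex]:
    """Both complex roots of a*x**2 + b*x + c = 0 (a must not be zero)."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)


def real_roots(a: float, b: float, c: float, tol: float = 1e-12) -> list[float]:
    """Solve in the complex domain first, then keep only the roots that are real."""
    return sorted({r.real for r in all_roots(a, b, c) if abs(r.imag) <= tol})


print(real_roots(1, -3, 2))  # x**2 - 3x + 2 has two real roots
print(real_roots(1, 0, 1))   # x**2 + 1 has roots i and -i, so nothing is kept
```

The same pattern serves when a positive or an integral value is wanted: only the selection condition in `real_roots` would change, while the solving step stays in the complex domain.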
It is chiefly for reasons of this sort, if at all, that the algebra of complex quantities should be taught in the secondary schools; for elementary practical problems in which this type of algebra is directly applicable are not of frequent occurrence.

APPENDIX I. OTHER EXAMPLES OF THE ALGEBRA OF COMPLEX QUANTITIES

39. Arithmetical systems. In the latter half of the nineteenth century a large amount of effort was expended in devising definitions of the irrational and imaginary quantities which should rest on a purely arithmetical basis, independent of any geometrical intuitions. The problem, as we should now state it, was this: To construct, out of purely arithmetical material, systems that satisfy the postulates 1-27 of sec. 30, and are therefore isomorphic with the system of points in the plane.

The only use made of such systems is in the proof of the consistency of the postulates, a non-geometric system being, from certain points of view, more satisfactory for this purpose than a geometric one; but after the consistency of the postulates is once established, these arithmetical systems need not be again referred to, and from the elementary pedagogical point of view, they seem to have no value whatever. Since, however, many of the newer text-books are inclined to lay great stress on this matter, a brief account of one of these arithmetical systems will here be given. The system is built up by successive steps from the system of natural numbers, 1, 2, 3, ...; and we shall assume that the rules for adding and multiplying these numbers are known.

40. System based on Dedekind's "cuts." The best known of these arithmetical systems is one based on a very ingenious idea published by R. Dedekind in his "Stetigkeit und irrationale Zahlen," in 1872. The steps by which the system is constructed are as follows:

(a) Positive rationals, R. Consider first a class R composed of all possible pairs of numbers, m/n.
(By "number" we mean, throughout this section, a natural number, 1, 2, 3, ....) Two such pairs, m/n and m'/n', are called equal if the numbers mn' and m'n are the same; the pair (mn' + m'n)/(nn') is called the sum, and the pair (mm')/(nn') the product, of the pairs m/n and m'/n';* and the pair m/n is said to precede, or be less than, the pair m'/n' if the number mn' is less than the number m'n. If two number pairs are denoted by a and b, their sum and product, as just defined, may be denoted by a + b and a × b; and the notation b < a may be used to denote that b "precedes" a. Further, if b < a, there is always a pair x, such that b + x = a; this pair x is called the remainder a minus b, and is denoted by a − b. The system R thus defined is an example of the type of algebra called the algebra of positive rational quantities.

* The product of two equal pairs is called the "square" of that pair.

(b) Positive reals, Q. In the series R thus defined, there is an infinity of ways in which the whole series of number pairs can be divided into two parts, U and V, such that every pair in the class U "precedes" every pair in the class V. Every such method of division in the series R is called a cut (U, V). For example, the following set of instructions is a "cut": Assign to U every pair whose square "precedes" 2/1, and to V every pair whose square "follows" 2/1. If there is a pair m/n which is either the last pair in U or else the first pair in V, then this pair m/n is called the generating element of the cut, and the cut is called a rational cut; but for most cuts, no such pair will exist. We now consider a class Q composed of all possible "cuts" in the series R. Two cuts (U, V) and (U', V') are called equal if the classes U and V are the same as the classes U' and V' respectively.
A cut (X, Y) is called the sum (or product) of the cuts (U, V) and (U', V') if the class X contains every pair which is the sum (or product) of a pair in U and a pair in U', while the class Y contains every pair which is the sum (or product) of a pair in V and a pair in V'. A cut (U, V) is said to precede a cut (U', V') if there is any pair in the class V which precedes a pair in the class U'. If A and B are two cuts, then A + B means their sum, and A × B their product, as just defined, and B < A means that B precedes A. Further, if A and B represent two cuts, and B precedes A, then there is always a cut x, which, when added to B, according to the rule, will produce A; this cut x is called the remainder, A minus B, and is denoted by A − B. The system Q thus defined is an example of the algebra of positive real quantities. It is a system in which Dedekind's principle can be shown to hold; but the proof requires very close reasoning.*

* See Weber and Wellstein, Elementar-Mathematik, Vol. I, sec. 23.

(c) All reals, q. Next, we consider a still more complicated class, q, made up of three kinds of elements: (1) all symbols of the form +A, where A is any element of the class Q, and + is a distinguishing mark, read "positive"; (2) all symbols of the form −A, where A is any element of the class Q, and − is a distinguishing mark, read "negative"; (3) an extra symbol, 0, called zero. Within this class q, "sums," "products," and the relation of "precedence" are defined by the following formulas, in which A, B, ..., denote elements of the class Q, and A + B, A − B, A × B, and A < B have the meanings already defined for that class:

+A ⊕ +B = +(A + B), −A ⊕ −B = −(A + B);
+A ⊕ −B = −B ⊕ +A = +(A − B) if B < A, and = −(B − A) if A < B;
+A ⊕ 0 = 0 ⊕ +A = +A, −A ⊕ 0 = 0 ⊕ −A = −A, +A ⊕ −A = −A ⊕ +A = 0;
+A ⊙ +B = +(A × B), −A ⊙ −B = +(A × B), +A ⊙ −B = −B ⊙ +A = −(A × B);
+A ⊙ 0 = 0 ⊙ +A = 0, −A ⊙ 0 = 0 ⊙ −A = 0;
+A ⧀ +B when A < B, −A ⧀ −B when B < A; −A ⧀ +B; −A ⧀ 0, 0 ⧀ +A.
These definitions being once established, the circles may be dropped from the symbols ⊕, ⊙, and ⧀. Further, if $\alpha$ and $\beta$ are any elements of q (whether $\alpha < \beta$ or $\beta < \alpha$), there will always be an element x in q such that $\alpha = \beta + x$; this element x is called the remainder, $\alpha$ minus $\beta$, and is denoted by $\alpha - \beta$. The system q thus defined is an example of the algebra of all real quantities, for, like the system of points on the axis of reals, it can be shown to satisfy all the nineteen postulates of sec. 37; the labor of verification is in this case, however, very considerable, especially in the case of Dedekind's postulate.

(d) Finally, we construct still another class, K, by taking as elements of K all possible couples of the form $(\alpha, \beta)$, where $\alpha$ and $\beta$ are any elements of the class q, that is, any real quantities. Two such couples $(\alpha, \beta)$ and $(\alpha', \beta')$ are called equal when $\alpha = \alpha'$ and $\beta = \beta'$. The sum and product of two couples are defined by the following formulas, in which $\alpha$ and $\beta$ denote any elements of the system of real quantities, q, and $\alpha + \beta$, $\alpha - \beta$, and $\alpha \times \beta$ have the meanings defined for that system:

$$(\alpha, \beta) \oplus (\alpha', \beta') = (\alpha + \alpha',\ \beta + \beta'), \qquad (\alpha, \beta) \odot (\alpha', \beta') = (\alpha\alpha' - \beta\beta',\ \alpha\beta' + \beta\alpha').$$

Within the class K, the couples of the form $(\alpha, 0)$, in which the second element is zero, form a subclass C, and within this subclass C a couple $(\alpha, 0)$ is said to precede a couple $(\alpha', 0)$ if $\alpha$ precedes $\alpha'$ in the system q. The complete system K thus constructed can be shown to satisfy all the twenty-seven postulates of sec. 30, and is, therefore, like the system of points in the plane, an example of the algebra of complex quantities.

41. System based on Cantor's "regular sequences." Another system of the same general character can be built up by using "methods of forming infinite sequences" of a certain special kind in the series of rational quantities, instead of "methods of forming cuts" in that series.
The definitions of "sums" and "products" in this system are of course quite different, in detail, from the definitions in the system just described; but the general plan by which the system is built up, and the highly abstruse nature of the concepts involved, are the same in both cases.

42. Comments on these arithmetical systems. It will be sufficiently obvious from the above descriptions that these arithmetical systems are wholly unsuitable for use in elementary instruction. And yet it is unfortunately customary to speak of the elements of such an arithmetical system as the genuine "algebraic quantities," and to regard the points in the plane as merely "geometrical representations" of them. As a matter of fact, both the arithmetical and the geometrical systems are equally entitled to stand as representatives of the type of algebra in question; the only genuine definition of the system is embodied in the laws of operation of the system, as expressed in a set of postulates like those in sec. 30. And when we consider what the elements of the arithmetical system really are ("couples" of "methods of division" of a series of "pairs of numbers"), while the elements of the other systems are simply geometric points, it is easy to decide which of these systems is the more suitable concrete example to present to an elementary student. Moreover, the complicated nature of these arithmetical systems is not lessened by calling them systems of numbers, in an extended sense of the term number.* It has become customary, during the latter part of the nineteenth century, to speak of all the objects described in sec. 40 or sec. 41 as "numbers," and to regard algebra as the study of these "number systems"; but in actual practice the original definitions of these so-called "numbers" drop entirely out of mind, and a "number system" comes to be thought of as any system of objects which can be put into one-to-one correspondence with the system of points in the plane.
Indeed, too often a text-book will profess to "introduce" or "invent" a new "number" to correspond to some point, without vouchsafing any description whatever of the object so invented, beyond the statement that it does correspond to the point. If a "number" is thus to have no properties that a "point" does not have, it would seem unnecessary to make the distinction in terminology; for the two systems become no longer parallel, but identical! As a matter of fact, all that is really essential in either the system of points or the system of numbers is the set of formal laws which govern the operations within these systems.

* To avoid confusion, at least in elementary work, it seems preferable to reserve the word "number" for its ordinary arithmetical use, and to call the other elements "quantities," as is done, for example, in Professor Bôcher's new book on "Higher Algebra." Thus the term "complex quantity" is surely less perplexing to a beginner than "complex number."

APPENDIX II. PROOF THAT EVERY ALGEBRAIC EQUATION HAS A ROOT

43. We give here the proof, omitted in sec. 26, that every algebraic equation,

$$p_0x^n + p_1x^{n-1} + p_2x^{n-2} + \dots + p_{n-1}x + p_n = 0,$$

has at least one root. Here n is any positive integer, and $p_0, p_1, p_2, \dots, p_n$ are any given points in the complex plane ($p_0$ not zero). Numerous proofs of this important theorem have been given, the earliest rigorous demonstration being due to Gauss (1799). The proof here presented differs from those commonly given in the fact that no use is made of trigonometry, or of the method of separating a complex quantity into its real and pure imaginary parts. Throughout the proof we shall use the notation $|a|$, due to Weierstrass, to denote the distance of the point a from the zero point. It is obvious from the definition of addition of points (sec.
22) that if $x = a + b + c + \dots$, then the distance of x cannot exceed the sum of the distances of a, b, c, ...; that is,

$$|a + b + c + \dots| \le |a| + |b| + |c| + \dots;$$

it is also obvious, by the definition of the subtraction of points, that $|a - b|$ will denote the distance between the two points a and b. As a further matter of notation, we denote the left-hand side of the given equation by f(x):

$$f(x) = p_0x^n + p_1x^{n-1} + p_2x^{n-2} + \dots + p_{n-1}x + p_n;$$

the value of f(x) when x = a is then denoted by f(a), and our problem is to show that there is at least one point x = a such that f(a) = 0. The function f(x) is called a polynomial of the nth degree in x.

44. In order to simplify the proof, we first establish the following properties of the function f(x).

(1) Given any distance R, we can find a distance G such that $|f(x)| < G$ whenever $|x| < R$. For, take $G > (n+1) \cdot |p| \cdot |S^n|$, where p is the most distant of the given points $p_0, p_1, \dots, p_n$, and S is a point such that $|S| > R$ and also $|S| > |1|$. Then whenever $|x| < R$, we shall have

$$|x^k| < R^k < |S^k| \le |S^n| \qquad (k = 1, 2, \dots, n),$$

and therefore, since the polynomial has n + 1 terms,

$$|f(x)| = |p_0x^n + p_1x^{n-1} + \dots + p_{n-1}x + p_n| \le |p_0x^n| + |p_1x^{n-1}| + \dots + |p_{n-1}x| + |p_n| < |pS^n| + |pS^n| + \dots + |pS^n| + |pS^n| = (n+1) \cdot |p| \cdot |S^n|,$$

which is less than G, as required.

(2) By taking $|x|$ sufficiently large, we can make $|f(x)|$ as large as we please. That is, given any distance g, we can find a distance h such that $|f(x)| > g$ whenever $|x| > h$. For, write f(x) in the form $f(x) = x^n(p_0 + Q)$, where

$$Q = \frac{p_1}{x} + \frac{p_2}{x^2} + \dots + \frac{p_{n-1}}{x^{n-1}} + \frac{p_n}{x^n},$$

and take h larger than $|1|$, and larger than each of the distances $(2g/|p_0|)^{1/n}$ and $(2n|p|)/|p_0|$, where p is the most distant of the given points $p_0, p_1, \dots, p_n$. Then, whenever $|x| > h$,

$$|Q| \le \frac{|p_1|}{|x|} + \frac{|p_2|}{|x^2|} + \dots + \frac{|p_n|}{|x^n|} < |p|\left(\frac{1}{|h|} + \frac{1}{|h^2|} + \dots + \frac{1}{|h^n|}\right) < \frac{n|p|}{|h|} < \frac{n|p|}{(2n|p|)/|p_0|} = \frac{|p_0|}{2},$$

and therefore, as we may see from the figure, $|p_0 + Q| > |p_0|/2$.

[Fig. 23.]

Further, whenever $|x| > h$, $|x^n| > 2g/|p_0|$. Hence, whenever $|x| > h$,

$$|f(x)| = |x^n| \cdot |p_0 + Q| > g,$$

as required.
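Properties (1) and (2) lend themselves to a numerical spot check. The sketch below is an illustration only: the function names are invented for this purpose, the polynomial is given as a list of coefficients $p_0, \dots, p_n$, the unit distance is taken as 1, and the choices S = max(R, 1) + 1 and "h = 1 + the largest of the three distances" are merely convenient values satisfying the stated conditions. The bound in (1) is computed with the factor n + 1, one for each term of the polynomial.

```python
import cmath

def evaluate(coeffs, x):
    """Value of p0*x^n + ... + pn, computed by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def upper_bound(coeffs, R):
    """A distance G with |f(x)| < G whenever |x| < R, as in property (1)."""
    n = len(coeffs) - 1
    p = max(abs(c) for c in coeffs)   # distance of the most distant coefficient
    S = max(R, 1.0) + 1.0             # any S with S > R and S > 1 will do
    return (n + 1) * p * S ** n       # n + 1 terms, each below p * S^n

def growth_radius(coeffs, g):
    """A distance h with |f(x)| > g whenever |x| > h, as in property (2)."""
    n = len(coeffs) - 1
    p0 = abs(coeffs[0])
    p = max(abs(c) for c in coeffs)
    return 1.0 + max(1.0, (2.0 * g / p0) ** (1.0 / n), 2.0 * n * p / p0)

def circle(radius, steps=24):
    """Sample points on the circle of the given radius about the zero point."""
    return [cmath.rect(radius, 2 * cmath.pi * j / steps) for j in range(steps)]
```

A quick use: for the coefficient list `[1, -3+2j, 5, 1j]` and R = 2, every sampled point inside the circle of radius 2 gives a value of modulus below `upper_bound(...)`, and every sampled point outside the circle of radius `growth_radius(..., 1000.0)` gives a value of modulus above 1000.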
(3) The polynomial f(x), of the nth degree, may be written in the form

$$f(x) = f(a) + (x - a)^k F(x),$$

where a is any given point in the plane, and F(x) is another polynomial, of the (n − k)th degree, and such that F(a) is not zero.* For, we have

$$f(x) = p_0x^n + p_1x^{n-1} + \dots + p_{n-1}x + p_n, \qquad f(a) = p_0a^n + p_1a^{n-1} + \dots + p_{n-1}a + p_n,$$

whence, by subtraction,

$$f(x) - f(a) = p_0(x^n - a^n) + p_1(x^{n-1} - a^{n-1}) + \dots + p_{n-1}(x - a).$$

Now $x^m - a^m$, where m is any positive integer, is always divisible by x − a;† hence f(x) − f(a) contains the factor x − a at least once, and may be written in the form $f(x) - f(a) = (x - a)F_1(x)$, where $F_1(x)$ is a polynomial of the (n − 1)st degree. It may of course happen that it contains the factor x − a more than once, but by dividing by x − a as often as possible, say k times, we shall finally arrive at the equation $f(x) - f(a) = (x - a)^k F(x)$, where F(x) is a polynomial, of degree n − k, that is not divisible by x − a. Moreover, this polynomial F(x) will not become zero when x = a; for, as above, F(x) − F(a) is an expression containing x − a as a factor, and if F(a) were zero, then F(x) itself would contain x − a as a factor, which is not the case.

* In case k = n, F(x) reduces to a constant, not zero.

† Thus, $x^m - a^m = (x - a)(x^{m-1} + ax^{m-2} + a^2x^{m-3} + \dots + a^{m-2}x + a^{m-1})$.

(4) The polynomial f(x) is continuous at every point x = a. Roughly speaking, this means that a small change in the position of x will produce a correspondingly small change in the position of f(x). More precisely, given any radius R about the point f(a), we can always find a radius r about the point a, such that whenever $|x - a| < r$, $|f(x) - f(a)| < R$. To show that this is true, write f(x) in the form $f(x) = f(a) + (x - a)^k F(x)$, as in (3), and draw a circle of radius $|1|$ about the point a, and a more inclusive circle of radius $|a| + |1|$ about the point O. By (1), we can find a distance G such that $|F(x)| < G$ whenever x lies within the larger (and hence whenever x lies within the smaller) circle.
If now we take $r < (R/G)^{1/k}$, and also $r < |1|$, then whenever $|x - a| < r$ we shall have

$$|f(x) - f(a)| = |(x - a)^k| \cdot |F(x)| < r^k \cdot G < (R/G)G = R,$$

as required.

(5) If f(a) is not zero, we can always choose x so that $|f(x)| < |f(a)|$; that is, so that the point f(x) is nearer the zero point than f(a) is. To see that this is true, we shall write f(x) in the form $f(x) = f(a) + (x - a)^k F(x)$, as in (3), and then show that x can be so chosen that the point $(x - a)^k F(x)$ will fall within the region shaded in the figure; that is, so that the distance of $(x - a)^k F(x)$ will be less than $|f(a)|$, while its angle will lie between $\theta + 120°$ and $\theta + 240°$, where $\theta$ is the angle of the point f(a). When x is so chosen, the sum of the point $(x - a)^k F(x)$ and the fixed point f(a) will then lie within the region bounded by the dotted line in the figure, and hence will be nearer to O than f(a) is.

[Fig. 24.]

To see that the distance of $(x - a)^k F(x)$ can be made less than $|f(a)|$, we have merely to notice that the factor $|(x - a)^k|$ can be made as small as we please by taking x sufficiently near to a, while the other factor, $|F(x)|$, by (1), remains less than some finite quantity G. To see that the angle of $(x - a)^k F(x)$ can be made to lie between $\theta + 120°$ and $\theta + 240°$, notice first that this angle is equal to $k\phi + \psi$, where $\phi$ is the angle of x − a and $\psi$ is the angle of F(x). Now $\psi$, the angle of F(x), can, by (4), be made to differ by as little as we please from $\alpha$, the angle of F(a), by taking x sufficiently near to a; in particular, if we take $|x - a| < r$, where r is sufficiently small, $\psi$ will lie between, say, $\alpha - 60°$ and $\alpha + 60°$. Having thus chosen the distance of x − a, we can still vary $\phi$, the angle of x − a, at pleasure, by moving x around the circumference of its small circle of radius $|x - a|$ about a. In particular, if we take $\phi = (\theta - \alpha + 180°)/k$, then $k\phi + \alpha = \theta + 180°$, and $k\phi + \psi$ will lie between $\theta + 120°$ and $\theta + 240°$, as required.
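The factorization of (3) and the choice of angle in (5) can be followed step by step in a short program. What follows is a hedged sketch rather than a part of the proof: the function names are invented, the repeated division by x − a is carried out by Horner's (synthetic) division, exact vanishing of F(a) is replaced by a small tolerance `tol` because the arithmetic is in floating point, and the radius is simply halved until the value of $|f(x)|$ is seen to drop, in place of the explicit estimates of the text.

```python
import cmath

def divide_by_linear(coeffs, a):
    """Divide the polynomial by (x - a); return (quotient, remainder)."""
    partial, value = [], 0
    for c in coeffs:
        value = value * a + c
        partial.append(value)
    return partial[:-1], partial[-1]

def evaluate(coeffs, x):
    """Value of the polynomial at x (the remainder on division by x - a)."""
    return divide_by_linear(coeffs, x)[1]

def factor_at(coeffs, a, tol=1e-12):
    """Return (f(a), k, F) with f(x) = f(a) + (x - a)^k F(x) and F(a) != 0."""
    F, fa = divide_by_linear(coeffs, a)        # f(x) - f(a) = (x - a) F(x)
    k = 1
    while len(F) > 1 and abs(evaluate(F, a)) <= tol:
        F, _ = divide_by_linear(F, a)          # remove one more factor x - a
        k += 1
    return fa, k, F

def nearer_point(coeffs, a, max_halvings=200):
    """Return a point x with |f(x)| < |f(a)|, assuming f(a) is not zero."""
    fa, k, F = factor_at(coeffs, a)
    theta = cmath.phase(fa)                    # angle of f(a)
    alpha = cmath.phase(evaluate(F, a))        # angle of F(a)
    phi = (theta - alpha + cmath.pi) / k       # so that k*phi + alpha = theta + 180 degrees
    r = 1.0
    for _ in range(max_halvings):
        x = a + cmath.rect(r, phi)
        if abs(evaluate(coeffs, x)) < abs(fa):
            return x
        r /= 2                                 # move x nearer to a and try again
    raise ValueError("no nearer point found; is f(a) zero?")
```

For $f(x) = x^2 + 1$ and a = 0, `factor_at` gives f(a) = 1, k = 2 and F = 1, and the chosen angle of 90° leads at once to a point close to i, where the value of the polynomial is very nearly zero.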
The following general property of points in the plane will also be useful.

(6) If an infinite collection of points x all lie within a finite region of the plane, say within a square of side D, then there will be at least one point X, within or on the boundary of the square, having the following property: every circle, however small, drawn about X as a centre, includes an infinite number of points that belong to the collection. Such a point X is called a cluster point for the given collection; but it may or may not itself belong to the collection. To see that such a cluster point will always exist, we have merely to draw through each point of the collection lines parallel to the sides of the square, and consider the sets of points in which these lines cut two adjacent sides of the square. Each of these sets of points, by Dedekind's principle [sec. 27, (31)], will have at least one limit point; and the point of intersection of lines drawn through these limit points, parallel to the sides of the square, will be the cluster point required.

(7) If a collection of points {x} is given, and the corresponding points {f(x)} possess a cluster point Y, then there is at least one point X in the plane such that f(X) = Y. To prove this, pick out from the collection {f(x)} an infinite sequence of points, $f(x_1), f(x_2), \dots$, having the following properties: (a) each point of the sequence is nearer to Y than the preceding point is; and (b) every circle drawn about Y as a centre will contain an infinite number of points of the sequence. The corresponding points, $x_1, x_2, \dots$, will then all lie within a finite region of the plane, as we may see by (2), and they will, therefore, by (6), have at least one cluster point, X. This point X will have the required property, f(X) = Y. For, suppose f(X) were equal to Y', where Y' is different from Y.
Then we could draw two non-overlapping circles of radii R and R', about the points Y and Y', respectively, and deduce a contradiction, as follows: In the first place, from the nature of the sequence $f(x_1), f(x_2), \dots$, all the points of this sequence beyond a certain stage, say $f(x_k)$, will lie within the circle R about the point Y. On the other hand, since Y' = f(X), we can, by (4), draw a circle of radius r' about X so that whenever x lies within this circle r', f(x) will lie within the circle R', and hence outside the circle R. Therefore none of the points of the sequence $x_1, x_2, \dots$, beyond the stage $x_k$ can lie within the circle r', which contradicts the fact that X is a cluster point for this sequence.

45. The main proposition can now be established, as follows: Suppose the proposition is false; that is, suppose that there is a distance c, not zero, such that $|f(x)| > c$ for every point x in the plane. Then, by Dedekind's principle, the possible values of $|f(x)|$ must have a lower limit, $b \ge c$, such that $|f(x)|$ is never less than b, but can be brought as near to b as we please by properly choosing x. Two cases are now conceivable: either there is a point a such that $|f(a)| = b$, or else $|f(x)| > b$ for all values of x. The first case is impossible, since, if there were a point a such that $|f(a)| = b$, then, by (5), there would be also a point x such that $|f(x)| < b$, which is contrary to the hypothesis that b is the lower limit of $|f(x)|$. In the second case, $|f(x)|$ has the lower limit b, but never reaches it. If therefore we draw a circle of radius b about the point O, there will be an infinite number of points f(x) in a finite region just outside this circle. It is easy to see, by (6), that this collection of points f(x) will have a cluster point Y somewhere on the circumference of the circle of radius b; therefore, by (7), there is a point X for which f(X) = Y; but for this point $|f(X)| = b$, and the second case is therefore as impossible as the first.
Hence the supposition with which we started must be false, and the theorem that every algebraic equation has at least one root is thus established. The general theorem in sec. 26 (23) follows without difficulty.

V

THE ALGEBRAIC EQUATION

By G. A. MILLER

CONTENTS

I. GENERAL INTRODUCTION. Sections 1-4.
1, Aim of the monograph; 2, How it should be read; 3, Mathematics presupposed; 4, Type of questions studied.

II. HISTORICAL SKETCH AND DEFINITIONS. Sections 5-9.
5, Introduction; 6, Definitions; 7, Fundamental problems; 8, Symbols; 9, Domain of rationality.

III. EQUATIONS WITH ONE UNKNOWN AND WITH LITERAL COEFFICIENTS. Sections 10-17.
10, General statement; 11, Substitutions and substitution groups; 12, Linear equations; 13, Quadratic equations; 14, Extensions of the number concept due to the quadratic; 15, Cubic equations; 16, Biquadratic equations; 17, Equations whose degrees exceed 4.

IV. EQUATIONS WITH ONE UNKNOWN AND WITH NUMERICAL COEFFICIENTS. Sections 18-24.
18, General statement; 19, Multiple roots; 20, Sturm's theorem; 21, Rational roots; 22, Irrational roots; 23, Solutions by means of graphs and machines; 24, A few fallacies.

V. SIMULTANEOUS EQUATIONS. Sections 25-30.
25, Introduction; 26, Consistency of a system of linear equations; 27, Geometrical interpretation; 28, Consistency of two equations in one unknown; 29, Equivalent equations; 30, A few tests for equivalence of equations.

VI. A FEW REFERENCES. Sections 31-32.
31, Text-books; 32, Articles.

I. GENERAL INTRODUCTION

1. Aim of the monograph.
The present monograph aims to give a sketch of some of the most fundamental processes in which the algebraic equation occupies a central position, and thus to fix the attention more completely on the underlying thoughts and the historical setting than would be feasible in a short treatise on the theory of equations. The monograph is intended to supplement such treatises rather than to replace them. By means of the historic setting of many elementary facts it is hoped that parts of it may be useful also to those who have only such a knowledge of the equation as would naturally result from an elementary course in algebra.

2. How it should be read. The reader is advised not to insist on understanding every statement before proceeding to the next. To some readers such concepts as domain of rationality, substitution group, and p-valued rational function may be new, and our short account of them may not appear entirely satisfactory. A slight knowledge of such dominating concepts and of their applications is, however, much better than total ignorance, and if the present monograph leads to an intelligent search for knowledge along these important lines its perusal will not have been in vain.

3. Mathematics presupposed. To avoid prolixity it has seemed desirable to presuppose, in a few places, an elementary knowledge of determinants as well as a knowledge of the first derivative of a function of a single variable. As it seemed undesirable to presuppose an elementary knowledge of the Galois theory of equations, some fundamental processes could not be sketched with the completeness that would be desirable. It is hoped, however, that the viewpoint which has been adopted will tend to prepare the way for this general theory; this is especially true of the methods used to solve the cubic and the biquadratic equations.
While the common road to a knowledge of the equation leads through numerous problems, it is sometimes desirable to take a broad survey of the historic setting and of the underlying principles, and thus to gather new inspiration and a deeper insight. It is hoped that the present monograph may aid in taking such surveys.

4. Type of questions studied. Equations of the form $x^n = 1$ play an important role in the general theory of equations. Since the fundamental properties of these equations are treated in Monograph No. VII, secs. 28, 29, they are not given in the present monograph. As the roots of the equation $x^n = \pm a$ may be obtained by multiplying those of $x^n = \pm 1$ by the arithmetic nth root of the positive number a, it results that the theory of the equations of the form $x^n = 1$ is almost equivalent to that of equations with two coefficients not zero. For many purposes it is convenient to study equations from the standpoint of the number of coefficients which are supposed to differ from zero, especially when this number does not exceed 3, but in the present monograph the classification is made with respect to the degrees of the unknowns. The interesting properties which result from the assumption that the coefficients represent successively the various terms of sequences of numbers have been left untouched for want of space.

II. HISTORICAL SKETCH AND DEFINITIONS

5. Introduction. "An equation is the most serious and important thing in mathematics," says Sir Oliver Lodge.* It is also one of the oldest mathematical concepts, since the fundamental operation of counting itself is based upon the idea of a kind of equality between the things counted. Even elementary algebraic equations are very old; for such instances as "Heap, its one-seventh, its whole, it makes 19," and "Heap, its two-thirds, its one-half, its one-seventh, its whole, it makes 33," are found in the work of an Egyptian, Ahmes, written about 1700 B.C.
It is evident from these and many similar instances that the ancient Egyptians used "heap" with the same significance as our more modern x, and that the given statements are respectively equivalent to the equations

$$\frac{x}{7} + x = 19, \qquad \frac{2x}{3} + \frac{x}{2} + \frac{x}{7} + x = 33.$$

On fragments of papyri which have been deciphered more recently, but are probably older than the work of Ahmes, statements equivalent to the system of two simultaneous equations

$$x^2 + y^2 = 100, \qquad y = \tfrac{3}{4}x,$$

have been found. Even special systems of n equations involving n unknowns were solved at an early date. A Greek named Thymaridas gave a rule for solving the following system:

$$x_1 + x_2 + x_3 + \dots + x_n = s, \qquad x_1 + x_2 = a_1, \quad x_1 + x_3 = a_2, \quad \dots, \quad x_1 + x_n = a_{n-1}.$$

It is an interesting fact that the technical terms given (known) and unknown are involved in this ancient rule. A similar system was solved by a Hindu, Aryabhata, of the sixth century A.D., but the methods for solving general systems of m equations in n unknowns are of comparatively recent origin. In general, the ancient mathematicians and those of the Middle Ages sought merely numerical values of the unknowns in a system of equations, but did not give the expressions representing the unknowns in terms of the coefficients.

* Easy Mathematics, 1906, p. 127.

6. Definitions. An equation of the form

$$a_1x_1 + a_2x_2 + a_3x_3 + \dots + a_nx_n = k,$$

where $a_1, a_2, \dots, a_n, k$ are supposed to represent known numbers and $x_1, x_2, \dots, x_n$ are unknowns, is called an equation of the first degree, or a linear equation. Equations which are true only on condition that the unknowns involved have particular values are called conditional equations. If an equation is true for every set of values that may be arbitrarily assigned to the unknowns, or if it is a true relation between known numbers only, the equation is called an identical equation, or briefly, an identity. Thus

$$2 \cdot 3 + 4 \cdot 7 = 34, \qquad 3m - 2m = m^{*}$$

are identical equations, while

$$2x + 3y = 1, \qquad 5x - 2y = 12,$$

are conditional equations.
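The ancient problems quoted in sec. 5, and the pair of conditional equations just given, can be worked out with exact fractions. The sketch below is an illustration under stated assumptions: the function names are invented, the formula used for the system of Thymaridas is obtained here simply by adding the n − 1 equations $x_1 + x_j = a_{j-1}$ and subtracting the first equation (it is not quoted from the ancient rule), and the two conditional equations are solved by the familiar determinant formulas.

```python
from fractions import Fraction as Fr

def solve_heap(multiples, total):
    """Solve (sum of the given multiples of x) = total for the 'heap' x."""
    return Fr(total) / sum(multiples)

def solve_thymaridas(s, a):
    """Solve x1 + ... + xn = s, x1 + x2 = a[0], ..., x1 + xn = a[n-2] (n > 2).

    Adding the last n - 1 equations gives (n - 2)*x1 + s = a[0] + ... + a[n-2].
    """
    n = len(a) + 1
    x1 = (sum(a) - s) / Fr(n - 2)
    return [x1] + [aj - x1 for aj in a]

def solve_two_linear(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by determinants (assumed non-zero)."""
    det = Fr(a1 * b2 - a2 * b1)
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

heap_19 = solve_heap([Fr(1, 7), Fr(1)], 19)                        # x/7 + x = 19
heap_33 = solve_heap([Fr(2, 3), Fr(1, 2), Fr(1, 7), Fr(1)], 33)    # 2x/3 + x/2 + x/7 + x = 33
x, y = solve_two_linear(2, 3, 1, 5, -2, 12)                        # 2x + 3y = 1, 5x - 2y = 12
```

The two conditional equations 2x + 3y = 1 and 5x − 2y = 12 are thus satisfied together only by one pair of values, whereas an identity such as 3m − 2m = m holds for every value of m.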
If it is assumed that a sequence of numbers may be assigned to the unknowns of an equation these unknowns are called variables. Whether the letters of an equation are to represent unknown constants or variables depends upon the point of view, but the difference between these concepts should be carefully observed. By varying the meaning of the letters the equation reveals its full significance and usefulness.

* In an identical equation the symbol = is frequently replaced by ≡. This symbol was first used with the present meaning by Riemann, according to Kronecker's Vorlesungen über Zahlentheorie, 1901, p. 86. Gauss used it with a different meaning in the theory of congruences, such as 10 ≡ 3 (mod. 7) (see Monograph No. VII), at an earlier date, 1801. Hence this symbol is now used to represent both something stronger and also something weaker than what is generally implied by the symbol =. As Kronecker observed, the stronger meaning seems the more natural, as ≡ would appear to imply something more than =, but the symbol is more extensively used with the weaker meaning.

An equation involving no unknowns with fractional exponents is said to be of the nth degree if it involves at least one term in which the sum of the exponents of the unknowns is n but no term in which the sum of these exponents exceeds n. If all the terms, which are not identically zero, of an equation are of the same degree it is said to be homogeneous. For instance, $x + y = 0$, $x^2 - xy + 7y^2 = 0$ are homogeneous equations of the first and second degrees respectively. In view of ancient geometric interpretations, equations of the second and the third degree are commonly called quadratic and cubic respectively. If an equation is reduced to an identity when known numbers are substituted for the unknowns, these numbers are called roots of the equation and the roots are said to satisfy the equation. The process of determining the roots is called a solution of the equation.
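The definitions of degree, homogeneity, and root can be made concrete in a few lines. In this sketch, which is an illustration only, the left-hand member of an equation "... = 0" is stored as a dictionary that maps a tuple of exponents (one for each unknown) to the coefficient of the corresponding term; this representation and the function names are assumptions made here.

```python
def degree(poly):
    """Largest sum of exponents among the terms that are not zero."""
    return max(sum(exps) for exps, coeff in poly.items() if coeff != 0)

def is_homogeneous(poly):
    """True when all the non-zero terms have the same degree."""
    return len({sum(exps) for exps, coeff in poly.items() if coeff != 0}) == 1

def value(poly, numbers):
    """Value of the left-hand member when numbers replace the unknowns."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for number, e in zip(numbers, exps):
            term *= number ** e
        total += term
    return total

def satisfies(poly, numbers):
    """True when the numbers reduce the equation poly = 0 to an identity."""
    return value(poly, numbers) == 0

linear = {(1, 0): 1, (0, 1): 1}                   # x + y = 0
quadratic = {(2, 0): 1, (1, 1): -1, (0, 2): 7}    # x^2 - xy + 7y^2 = 0
mixed = {(2, 0): 1, (0, 1): 1}                    # x^2 + y = 0, not homogeneous
```

Here the values x = 3, y = −3 satisfy the first equation, while the third equation is of the second degree without being homogeneous.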
If the unknowns of two or more equations are supposed to have the same values the equations are said to be simultaneous equations.

7. Fundamental problems. Two fundamental problems in the theory of equations are the solution of the general equation of the nth degree in one unknown and the solution of a system of m simultaneous equations in n unknowns. Although the former of these is a special case of the latter it is of such paramount importance and difficulty that it may be regarded as a fundamental problem in the theory of equations. Only very special cases of these problems were solved by the ancient and mediaeval mathematicians, and both problems have furnished nuclei for extensive theories which are still in the process of development. An instance of the solution of special cases of the first is furnished by the "heap" problems of Ahmes, which were mentioned above. Among other instances are the following: the extraction of square roots, such as

$$\sqrt{\tfrac{25}{16}} = \tfrac{5}{4}, \qquad \sqrt{\tfrac{25}{4}} = \tfrac{5}{2},$$

found recently on an Egyptian papyrus now in the Berlin Museum; the geometrical representation of roots of equations of small degrees by the early Greeks, including Euclid, and especially by the Arabs; the finding of one positive rational root of quadratic equations by Diophantus; and the recognition of the fact that at least some numerical quadratic equations have two roots by the Hindus and the Arabs. Starting from such special cases as these, mathematicians have gradually been enabled to comprehend the fact that every equation of degree n in one unknown has exactly n roots. This elegant theorem is commonly known as the fundamental theorem of algebra.* In France it is also known as the theorem of d'Alembert,† since d'Alembert published a proof of it in 1746, which was supposed in his day to be rigorous. The first satisfactory proof was given by Gauss in 1799.
The gradual progress toward this theorem through many centuries furnishes an impressive picture of the slow pace at which our rich mathematical inheritance was developed, and of the interesting history which surrounds the fundamental theorems. The second fundamental problem mentioned above has also led to rich results in modern times. When restricted to the case in which the m simultaneous equations in n unknowns are linear, the problem has led to an important branch of mathematics known as determinants, and the theory of determinants, in turn, has thrown much light on this problem. The isolated simultaneous equations of the ancient Egyptians and of the ancient Greeks gave expression to needs of the human mind which have been largely satisfied, but which have fortunately led to a deeper sense of the need of still further developments and of hope that such developments will be forthcoming. In our brief treatment of the algebraic equation we shall devote most of our space to the consideration of equations involving only one unknown, since such equations form also the basis of the theory of a system of m simultaneous equations in n unknowns.

* For a proof of this theorem see Monograph No. IV, Appendix II.

† The noted Italian mathematician S. Pincherle also styles it the theorem of d'Alembert in his Lezioni di Algebra Complementare, 1909, p. 109.

8. Symbols. The ancient and mediaeval mathematicians commonly wrote the word equals, or its equivalent, between the two members of an equation. This was, however, not universal, and a large number of different symbols have been used to indicate equality between the two members of an equation. Even Ahmes used a symbol (<) for such an equality, Diophantus used ι as an abbreviation of the word ἴσος (equal), and the western Arabs made use of a symbol resembling a capital J for the same purpose.
During the sixteenth century the symbol æ, standing for the first two letters of aequalis, was extensively used. Our modern symbol = was introduced by Record in 1557 in his Whetstone of Witte, and the reason assigned for choosing this particular symbol was that "noe 2 thynges can be moare equalle" than two parallel lines. During the seventeenth century two parallel vertical lines were frequently used, especially in France, instead of =, since the latter was used to represent the absolute difference between two numbers. It was also a recognized abbreviation for the word est in mediaeval manuscripts.

The most important things about an equation are the unknowns. In fact these characterize a conditional equation, and the determination of the range of values of the unknowns is the main mission of the equation.* Among the various symbols that have been used for a single unknown none seems more expressive than the one employed by Ahmes, 1700 B.C.; for the term heap naturally implies that the number of the individuals is unknown. Diophantus represented the unknown by a final sigma, ς, and the Hindu Brahmagupta represented the first unknown by yavat tavat, and if more than one unknown were employed he used colors, black, blue, yellow, etc., to represent the second, third, fourth, etc., unknowns. Alkarismi, the noted Arab whose work gave rise to the term algebra, and whose name gave rise to the term algorithm, called the unknown the thing or the root, and these terms were in common use during the Middle Ages. In 1637 Descartes introduced the present custom of representing the unknowns by the last letters of the alphabet (x, y, z) and the knowns by the first letters (a, b, c, etc.).

* These remarks are based upon an elementary point of view. From another point of view, it is equally true that the known coefficients completely dominate and determine the possible values of the unknowns.

9. Domain of rationality.
One of the most useful modern concepts relating to the algebraic equations is that of the domain of rationality. If a symbol R, which obeys the ordinary laws of algebra, is combined with itself and the results of such combination, by addition, subtraction, multiplication and division (division by zero being always excluded) in every possible way, there results a certain totality of expressions, which evidently has the important property that no additional expression results from the combination of the expressions of the totality with respect to any of the four given operations. These operations are collectively called the rational operations of algebra, and the given totality is known as the domain of rationality constituted by R, and it is denoted by (R).*

* See also Monograph No. VIII, sec. 4.

If R represents the number 1 and if we operate on this number and the resulting numbers in every possible way according to the rational operations of algebra, the resulting totality is composed of all the rational numbers. The same totality would have been obtained by letting R represent any other rational number besides 0. That is, $(1) \equiv (n)$, where n is any rational number except 0. To understand the meaning of domain of rationality it is important to observe that it implies a totality which is closed as regards the rational operations of algebra. In this connection it is interesting to observe that the n nth roots of unity form a closed totality as regards multiplication and division but not as regards addition and subtraction. For instance, if we take the four fourth roots of unity 1, −1, i, −i ($i = \sqrt{-1}$) and combine them in every possible way by means of the operations of multiplication and division we obtain no additional number. Similarly, the set of eight numbers $-1,\ -2,\ 3,\ 4,\ \tfrac{1}{2},\ \tfrac{3}{2},\ \tfrac{2}{3},\ \tfrac{4}{3}$
Such closed totalities, involving either a finite or an infinite number of distinct elements, are of fundamental importance in various mathematical subjects. A totality of numbers which is closed as regards the two operations of addition and subtraction is known as a number modulus. The rational integers, for instance, form such a modulus, but they do not form a domain of rationality.*

* Numerous examples of such finite closed totalities may be found in any book dealing with groups of finite order, such as Burnside, Theory of Groups, 1897; Dickson's Linear Groups, 1901; and Cajori's Introduction to the Modern Theory of Equations, 1904. A few very elementary examples are found in the article entitled "Groups of subtraction and division," Quarterly Journal of Mathematics, Vol. XXXVII, 1906, p. 80. The totality of numbers lying on a line through the origin in the ordinary complex number plane clearly forms an infinite totality which is closed as regards addition and subtraction but not generally as regards multiplication and division. Hence this totality is also a number modulus. If it is closed as regards each of the four rational operations of algebra the line must be the totality of real numbers. Cf. American Mathematical Monthly, Vol. XV, 1908, p. 117.

In general, the domain of rationality (R) must include the domain of rational numbers, since it includes $R/R = 1$. Hence the rational numbers of elementary arithmetic constitute the smallest possible domain of rationality,† and this domain is included in every other domain. The most general expression of the domain (R) may evidently be reduced to the form

$$\frac{a_0 + a_1 R + \dots + a_n R^n}{b_0 + b_1 R + \dots + b_m R^m},$$

where $a_0, a_1, \dots, a_n$ and $b_0, b_1, \dots, b_m$ are ordinary positive or negative integers. That is, the domain of rationality (R) is composed of all the rational functions of R with integral coefficients.

† The trivial domain composed of 0 is excluded from our consideration.

It is evident that the totality of the rational integral functions, i.e., all the functions of the form

$$a_0 + a_1 R + \dots + a_n R^n,$$

where $a_0, a_1, \dots, a_n$ have the same meaning as above, has the property that no additional function arises by combining any two of them (or any one with itself) by means of any of the three operations addition, subtraction, and multiplication. This totality is known as the domain of integrity constituted by R and it is denoted by [R]. When R = 1 this domain reduces to the totality of the ordinary positive and negative integers, and [R] is always included in (R).

It was observed that (R) results when all the rational operations are successively performed upon R. This fact is generally expressed by saying that (R) is generated by R as regards the rational operations. On the contrary, [R] is not usually generated by R, as may be seen, for instance, when $R = \sqrt{2}$. There is, however, an infinity of pairs of elements in [R] which generate [R] when they are combined with respect to the rational integral operations of algebra. The simplest of these pairs is R and 1.

The totality of rational functions, with rational coefficients, of the symbols $R_1, R_2, R_3, \dots$ (where the number of symbols is finite or infinite) is called a domain of rationality and is denoted by $(R_1, R_2, R_3, \dots)$. Each of the symbols $R_1, R_2, R_3, \dots$ is called an element of the domain. Similarly, the integral domain $[R_1, R_2, R_3, \dots]$ may be defined by replacing the expressions rational functions and rational coefficients in the preceding definition by integral functions and integral coefficients respectively.

A rational integral function f(x), of x, is said to be in the domain of rationality $(R_1, R_2, R_3, \dots)$ if all its coefficients are in this domain; if all its coefficients are in the domain of integrity $[R_1, R_2, R_3, \dots]$ the function is said to be in this domain. If f(x) is the product of two rational integral functions in the same domain of rationality, f(x) is said to be reducible in this domain.
When this cannot be done in a given domain f(x) is said to be irreducible in this domain. Functions which are irreducible in one domain may be reducible in another. For instance, $x^2 + x - 1$ is irreducible in $(\sqrt{2})$ but it is reducible in $(\sqrt{5})$; on the other hand, $x^2 - 2$ is reducible in the former of these domains but not in the latter. Neither of these functions is reducible in the domain of rational numbers, although both of them are in this domain.

From the fundamental theorem of algebra it results that every rational integral function of x whose degree exceeds one is reducible in some domain of rationality. If we consider rational integral functions in two or more variables it is not possible to prove a similar theorem. For instance, the function $xy - 1$ is in the domain of rational numbers, but it can be proved that this function cannot be resolved into two factors whose coefficients are in any domain of rationality whatever.

For further developments regarding the concepts of reducibility and domain of rationality, and extensive references to the literature on these subjects, we refer to tome I, volume 2 of the Encyclopédie des Sciences Mathématiques, p. 205. A clear introduction to the subject is given in Dickson's Theory of Algebraic Equations, 1903, and in Cajori's Introduction to the Modern Theory of Equations, 1904.

III. EQUATIONS WITH ONE UNKNOWN AND WITH LITERAL COEFFICIENTS

10. General statement.

The ancient and the mediaeval mathematicians knew only five algebraic operations, viz., addition, subtraction, multiplication, division, and the extraction of roots. These operations suffice to solve every equation of the form

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n = 0,*$$

where $a_0, a_1, \dots, a_n$ are real or complex numbers and n is a positive integer, provided $n < 5$; but they are not sufficient to solve this general equation when $n > 4$.
* Unless the contrary is stated it will be assumed that $a_0 \neq 0$. When $a_0$ is real we may assume that it is positive, but when it is complex we cannot make this assumption, since the terms positive and negative cannot be directly applied to complex numbers.

The first rigorous proof of this important theorem was published in 1824 by a Norwegian mathematician named Abel,† and the theorem marks an important line of division between equations of the first four degrees and those of a higher degree.

† This proof appeared also in Crelle's Journal, Vol. I, 1826.

Numerous efforts to solve by means of these operations* the general equation of the fifth and higher degrees had preceded Abel's proof of the fact that such efforts were necessarily futile.

* Solutions confined to the use of these operations are known as solutions by radicals.

It should be emphasized that there is a vast difference between proving the existence of a root of $f(x) = 0$ and finding this root. The existence of such a root is proved by the fundamental theorem of algebra, but the finding of methods to express such a root in terms of the coefficients $a_0, a_1, \dots, a_n$ is a much more difficult problem when $n > 4$. In 1858 the noted French mathematician Hermite found a method by means of which he could express a root of the general equation of the fifth degree in terms of certain functions known as elliptic functions. More recently it has been proved that a root of the general equation of degree n may be represented in terms of the coefficients by means of certain functions called Fuchsian.†

† Cf. Tropfke, Geschichte der Elementar-Mathematik, Vol. I, 1902, p. 292.

The very important theorem that f(x) is divisible by $x - x_1$ whenever $x_1$ is a root of $f(x) = 0$ was observed by Descartes. This theorem establishes two fundamental facts, viz., (1) that the finding of the roots of $f(x) = 0$ is equivalent to resolving f(x) into its linear factors, and (2) that the proof of the existence of one root of $f(x) = 0$ proves the existence of n roots. Moreover, it is not difficult to prove this important theorem. The proof may be obtained as follows: Let

$$f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n. \qquad (1)$$

Since $x_1$ is a root of $f(x) = 0$, we have $f(x_1) = 0$, or

$$0 = a_0 x_1^n + a_1 x_1^{n-1} + \dots + a_n. \qquad (2)$$

By subtracting (2) from (1) there results

$$f(x) = a_0 (x^n - x_1^n) + a_1 (x^{n-1} - x_1^{n-1}) + \dots + a_{n-1} (x - x_1).$$

As

$$x^l - x_1^l = (x - x_1)(x^{l-1} + x_1 x^{l-2} + \dots + x_1^{l-1})$$

whenever l is any positive integer, it has thus been proved that

$$f(x) = (x - x_1) f_1(x),$$

where $f_1(x)$ is a rational integral function of x of degree $n - 1$. In other words, f(x) is divisible by $x - x_1$.

Since $f_1(x)$ is of the same general type as f(x) we may apply to it the same kind of reasoning. In particular, if the general function of the form f(x) has a root, $f_1(x)$ must also have a root ($x_2$) and hence it is divisible by $x - x_2$ with $f_2(x)$ as a quotient. As n is a finite positive integer, we must finally arrive at a linear quotient by repeating these operations and thus prove that

$$f(x) = a_0 (x - x_1)(x - x_2) \dots (x - x_n).$$

It should be emphasized that this process establishes the existence of these n linear factors only on the assumption that every such function as f(x) has at least one root. The theorem just proved is evidently a special case of the theorem that if f(x) is divided by $x - x_1$ the remainder is $f(x_1)$.

In 1629 Girard published an important work entitled Invention nouvelle en l'algèbre, in which he stated the theorem that $f(x) = 0$ has n roots and observed some general relations existing between the elementary symmetric functions of the roots and the coefficients of the equation. Special cases of these relations had been observed earlier by Cardan and Vieta. The more general relations can readily be deduced from the fact that

$$f(x) = a_0 (x - x_1)(x - x_2) \dots (x - x_n).$$

By multiplying the factors of the second member it is clear that the sum of the roots is $-a_1/a_0$, while the sum of the products of the different combinations of $\alpha$ roots is $(-1)^{\alpha} a_{\alpha}/a_0$, where $\alpha = 2, 3, \dots, n$.
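As an illustration of the divisibility theorem and of the relations between roots and coefficients, the following Python sketch divides a sample cubic by a linear factor and compares the sums of products of roots with the coefficient ratios. The cubic $2(x-1)(x-2)(x+3) = 2x^3 - 14x + 12$ and the function names are assumptions made for this example only; the division uses the usual synthetic (Horner) scheme rather than the subtraction argument of the text.

```python
from itertools import combinations
from math import prod

def divide_by_linear(coeffs, x1):
    """Divide a0*x^n + ... + an by (x - x1) with Horner's scheme.

    Returns (quotient coefficients, remainder); the remainder equals f(x1).
    """
    partial = []
    carry = 0
    for a in coeffs:
        carry = carry * x1 + a
        partial.append(carry)
    return partial[:-1], partial[-1]

def sum_of_products(roots, k):
    """Sum of the products of the roots taken k at a time."""
    return sum(prod(c) for c in combinations(roots, k))

# Sample equation (chosen for illustration): 2x^3 - 14x + 12 = 2(x-1)(x-2)(x+3)
coeffs = [2, 0, -14, 12]
roots = [1, 2, -3]

quotient, remainder = divide_by_linear(coeffs, 1)   # 1 is a root
_, value_at_5 = divide_by_linear(coeffs, 5)         # 5 is not a root

# a0 * (sum of products of k roots) should equal (-1)^k * a_k
relations_hold = all(
    coeffs[0] * sum_of_products(roots, k) == (-1) ** k * coeffs[k]
    for k in range(1, len(coeffs))
)
print(quotient, remainder, value_at_5, relations_hold)
```

When the divisor corresponds to a root the remainder vanishes and the quotient is the function $f_1(x)$ of the text; for any other value the remainder is simply the value of f at that point.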
Girard even computed the values of the following symmetric functions of the roots

$$\sum_{i=1}^{n} x_i^2, \qquad \sum_{i=1}^{n} x_i^3, \qquad \sum_{i=1}^{n} x_i^4$$

in terms of the ratios

$$\frac{a_1}{a_0}, \quad \frac{a_2}{a_0}, \quad \dots, \quad \frac{a_n}{a_0}.$$

These ratios represent separately symmetric functions of the roots, as was observed in the preceding paragraph, and these interesting symmetric functions are technically known as the elementary symmetric functions of the n roots. They are respectively of degrees 1, 2, ..., n. This work of Girard prepared the way for the beautiful theorem that every integral symmetric function of $x_1, x_2, \dots, x_n$ can be expressed in one and in only one way as an integral function of the elementary symmetric functions. A proof of this theorem may be found, among other places, in Dickson's Algebraic Equations, 1903, p. 99. A number of fundamental properties of symmetric functions are developed also in Burnside and Panton's Theory of Equations.

11. Substitutions and substitution groups.

A profound study of the algebraic equation involves not only a knowledge of the properties of symmetric functions, but also a knowledge of rational functions which are not symmetric. In his noted memoir entitled "Réflexions sur la résolution algébrique des équations," Nouveaux mémoires de l'Académie Royale des Sciences de Berlin, 1770 and 1771, Lagrange made a thorough study of the known methods which had been employed to solve equations of higher degrees, reviewing the methods employed by Cardan, Ferrari, Descartes, Tschirnhaus, Euler, and Bézout. He observed that the solution of an algebraic equation depends upon the solution of a certain other equation, since known as the resolvent, and he showed that the roots of the various resolvents are rational functions of the roots of the given equation. This led to a recognition of the fundamental fact that the problem of the solution of equations depends upon the properties of rational functions of the roots.
The fertile point of view at which Lagrange had arrived in the extensive memoir noted in the preceding paragraph called for a comprehensive study of the properties of rational functions, especially as regards the number of values assumed by such functions when their n elements are permuted in every possible manner. This study led to the theory of substitutions, called "calcul des combinaisons" by Lagrange, which has proved to be a most powerful instrument to secure a deep insight into the nature and properties of an algebraic equation. Among those who share with Lagrange the honor of having discovered the fundamental importance of the theory of substitutions along this line we mention especially Ruffini, Abel, Galois, and Jordan.

A study of the various forms which a rational function assumes when its elements or letters are permuted furnishes one of the most natural ways to secure a knowledge of the true meaning of substitutions and substitution groups. For instance, the historic function

$$x_1 x_2 + x_3 x_4$$

is evidently left unchanged by replacing $x_1$ by $x_2$ and $x_2$ by $x_1$. This fact is more briefly expressed by saying that the function $x_1 x_2 + x_3 x_4$ is transformed into itself by the substitution $(x_1 x_2)$. It is clearly also transformed into itself by the substitution $(x_3 x_4)$. The fact that the two substitutions $(x_1 x_2)$, $(x_3 x_4)$ are to be performed successively is indicated by $(x_1 x_2)(x_3 x_4)$, and this substitution is called the product of the two substitutions $(x_1 x_2)$, $(x_3 x_4)$. It is clear that if each of two substitutions transforms a function into itself their product must also transform this function into itself. A set of distinct substitutions which has the property of including the square of each of the set as well as the product of any two of them is called a substitution group.
As

$$(x_1 x_2)^2 = (x_3 x_4)^2 = [(x_1 x_2)(x_3 x_4)]^2 = 1,$$

or the identity, where 1 implies that all the elements of the function under consideration are left unchanged, it is clear that the four substitutions

$$1, \quad (x_1 x_2), \quad (x_3 x_4), \quad (x_1 x_2)(x_3 x_4)$$

form a substitution group. This group is of order 4, the order being the number of substitutions in a group. It plays a fundamental role in many mathematical considerations and is known abstractly under various names as follows: the axial group, the anharmonic ratio group, the quadratic group, the four group, the group of the rectangle, etc.

The given function is evidently also transformed into itself by the additional substitutions

$$(x_1 x_3)(x_2 x_4), \quad (x_1 x_4)(x_2 x_3), \quad (x_1 x_3 x_2 x_4), \quad (x_1 x_4 x_2 x_3),$$

where the last two substitutions indicate that the letters $x_1, x_3, x_2, x_4$; $x_1, x_4, x_2, x_3$, respectively, are permuted cyclically in the given order. It is easy to verify that the eight substitutions

$$1, \quad (x_1 x_2), \quad (x_3 x_4), \quad (x_1 x_2)(x_3 x_4), \quad (x_1 x_3)(x_2 x_4), \quad (x_1 x_4)(x_2 x_3), \quad (x_1 x_3 x_2 x_4), \quad (x_1 x_4 x_2 x_3)$$

form a group and that these are the only substitutions on these letters which transform the given function into itself.

A group that is contained in another group is called a subgroup. The first four substitutions of this group of order 8 therefore constitute a subgroup of order 4, while the first two substitutions constitute a subgroup of order 2. The reader can readily verify the fact that the given group of order 8 contains two and only two other subgroups of order 4, and four other subgroups of order 2. This group is known abstractly as the octic group or the group of the square.
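The verification left to the reader can be sketched by brute force in Python. The sketch below is an illustration, not the author's method: it evaluates the function at the sample values 2, 3, 5, 7 (an assumption, chosen so that the different pairings give different numbers) and collects the substitutions that leave the value unchanged. Substitutions are written as tuples of positions, and the helper names are hypothetical.

```python
from itertools import permutations

def f(x1, x2, x3, x4):
    return x1 * x2 + x3 * x4

# Sample values (an assumption): generic enough that the three possible
# pairings 2*3+5*7, 2*5+3*7, 2*7+3*5 are all different.
SAMPLE = (2, 3, 5, 7)

def apply(p, values):
    """Put values[p[i]] into position i."""
    return tuple(values[i] for i in p)

def compose(p, q):
    """Single substitution equal to applying p and then q."""
    return tuple(p[i] for i in q)

# All substitutions on four letters that leave the sample value unchanged.
group = [p for p in permutations(range(4))
         if f(*apply(p, SAMPLE)) == f(*SAMPLE)]

# Closure: the product of any two members is again a member.
closed = all(compose(p, q) in group for p in group for q in group)

print(len(group), closed)
```

Here the tuple `(1, 0, 2, 3)` stands for the substitution $(x_1 x_2)$, which belongs to the group, while `(2, 1, 0, 3)`, standing for $(x_1 x_3)$, does not.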
While the theory of substitutions is essential to attain an insight into what is known as the Galois theory of the algebraic equations and is very important also in other domains of mathematics, we shall make no explicit use of it in what follows, in view of the facts that the elements of this subject are not as generally known as they should be, and a proper development of the subject ab initio would demand too much space for the present monograph. It seems, however, desirable to state a few general theorems depending on this theory, whose import can be at least partially appreciated by means of the development of the preceding paragraphs.

It was observed above that the function $x_1 x_2 + x_3 x_4$ is transformed into itself by all the substitutions of a certain group of order 8, but by no other substitution on these letters. This fact is commonly expressed by saying the function $x_1 x_2 + x_3 x_4$ belongs to this group. It is easy to find other functions in these letters, for instance $(x_1 + x_2 - x_3 - x_4)^2$, which belong to the same group; and it has been proved that an infinite number of distinct rational functions belong to any given substitution group while such a function belongs to only one substitution group. That is, there is an $(\infty, 1)$ correspondence between rational functions involving certain letters, and the substitution groups on these letters, and it is an important fact that all these functions which belong to the same group are rational functions of each other.

Lagrange observed that the number of values which such a function assumes when its n elements are permuted in every possible manner is a divisor of $n!$. For instance, $x_1 x_2 + x_3 x_4$ assumes the following three values:

$$x_1 x_2 + x_3 x_4, \qquad x_1 x_3 + x_2 x_4, \qquad x_1 x_4 + x_2 x_3.$$
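The count of three values, and the statement that $(x_1 + x_2 - x_3 - x_4)^2$ belongs to the same group, can be illustrated numerically. This Python sketch is an assumption-laden check rather than a proof: it uses the sample values 2, 3, 5, 7, for which the distinct algebraic values happen to be distinct numbers, and the names `values` and `fixing` are introduced here for illustration.

```python
from itertools import permutations

def f(x1, x2, x3, x4):
    return x1 * x2 + x3 * x4

def g(x1, x2, x3, x4):
    return (x1 + x2 - x3 - x4) ** 2

SAMPLE = (2, 3, 5, 7)                    # illustrative sample values
PERMS = list(permutations(range(4)))     # all 4! = 24 substitutions

def values(func):
    """Set of numbers the function takes when the letters are permuted."""
    return {func(*(SAMPLE[i] for i in p)) for p in PERMS}

def fixing(func):
    """Substitutions that leave the sample value of the function unchanged."""
    base = func(*SAMPLE)
    return {p for p in PERMS if func(*(SAMPLE[i] for i in p)) == base}

f_values, g_values = values(f), values(g)
same_group = fixing(f) == fixing(g)
print(sorted(f_values), sorted(g_values), same_group)
```

The number of values, 3, is the quotient of 24 by the order 8 of the group to which the function belongs, in agreement with Lagrange's observation.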
This is in accord with the general theory, as 3 is a divisor of 24; but it is not possible to construct a rational function in four letters which assumes exactly five distinct values when its elements are permuted in every possible way, since 5 is not a divisor of 24.

Although the number of values which a rational function whose degree does not exceed n assumes when its letters are permuted in every possible manner is a divisor of $n!$, it does not follow that there is such a function for every divisor of $n!$. In fact, it has been proved that whenever $n > 4$ it is not possible to construct a function for every divisor of $n!$, while it is possible to construct such a function whenever $n < 5$. The fact that it is not possible to construct a rational function with five letters which assumes either 3, 4, or 8 values when these letters are permuted in every possible manner was proved by Ruffini in his Teoria generale delle equazioni, in cui si dimostra impossibile la soluzione algebraica delle equazioni generali di grado superiore al quarto, published at Bologna in 1799. This fact is equivalent to the theorem that there is no substitution group on five or a smaller number of letters having for its order one of the numbers 40, 30, 15.

12. Linear equations.

Every linear equation with one unknown can be reduced to the form

$$ax = b,$$

where a and b are known, and x is the unknown. Necessary and sufficient conditions that this equation can be solved are that either a and b are both equal to zero, or that a is not zero. If the former condition is satisfied the equation has an infinite number of solutions, as x may have any value. On the contrary, the equation has only one solution when the latter condition is satisfied. In this case the value of x is obtained by dividing b by a. As this is a rational process the root of the equation must be in the domain of rationality (a, b) constituted by a and b.
The root is, however, not necessarily in the integral domain [a, b] constituted by a and b.

If a and b are restricted to be natural numbers it is not possible to reduce every linear equation to a single form, but all such equations can, in this case, be reduced to one of the two forms

$$ax = b, \qquad ax + b = 0.$$

This is also true in case a and b are only restricted to be positive rational numbers. The ancient and the mediaeval mathematicians generally imposed the latter restriction on a and b as well as on the root.* Hence the second form was not solvable. The general solution of the linear equation as noted above therefore calls for the extension of the number concept so as to include both negative and fractional numbers in addition to the natural numbers. With this extension of the number concept the linear equation is solvable except when $a = 0$ and $b \neq 0$. The further extension of this concept so as to include the irrational and the ordinary complex numbers does not affect the given discussion of the linear equation.

* This was done by the great French algebraist Vieta (1540-1603) and even Descartes called negative roots false roots in his Géométrie.

13. Quadratic equations.

Every quadratic equation with one unknown can be reduced to the form

$$a_0 x^2 + a_1 x + a_2 = 0.$$

If we put $x = z + k$ this equation becomes

$$a_0 z^2 + (2 a_0 k + a_1) z + a_0 k^2 + a_1 k + a_2 = 0.$$

Since $a_0 \neq 0$, it results from the preceding paragraph that it is always possible to solve the following linear equation in k:

$$2 a_0 k + a_1 = 0,$$

and thus arrive at an equation of the form

$$z^2 = A,$$

which involves only the extraction of the square root of a number. The most important thing about the solution of the quadratic equation is the extraction of the square root of a number. In fact, this is the only operation which enters into the solution of the quadratic equation but not into the solution of the linear equation, as can be deduced from the general solution sketched above.
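The reduction just described translates directly into a short computation. The following Python sketch (the function name `solve_quadratic_by_shift` is chosen here for illustration) first solves the linear equation $2a_0 k + a_1 = 0$, then forms $A$ from $a_0 k^2 + a_1 k + a_2 = -a_0 A$, and finally extracts a square root with the standard `cmath.sqrt`, so that negative or complex $A$ is also handled. It works in floating point, so the results are approximations in general.

```python
import cmath

def solve_quadratic_by_shift(a0, a1, a2):
    """Solve a0*x^2 + a1*x + a2 = 0 via the substitution x = z + k."""
    if a0 == 0:
        raise ValueError("a0 must differ from zero")
    k = -a1 / (2 * a0)                      # root of 2*a0*k + a1 = 0
    A = -(a0 * k * k + a1 * k + a2) / a0    # reduced equation: z^2 = A
    z = cmath.sqrt(A)                       # the one non-rational operation
    return k + z, k - z

real_roots = solve_quadratic_by_shift(1, -3, 2)     # x^2 - 3x + 2 = 0
complex_roots = solve_quadratic_by_shift(1, 0, 1)   # x^2 + 1 = 0
print(real_roots, complex_roots)
```

Every step except the call to `cmath.sqrt` is a rational operation on the coefficients, which is the point made in the text.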
The extraction of the square root is, however, not a little thing in mathematics. It opens up the question of irrational numbers as well as that of ordinary complex numbers, two very profound and far-reaching questions. As the number A is a rational function of the coefficients $a_0, a_1, a_2$, it must lie in the domain of rationality constituted by these coefficients. Hence the quadratic equation in one unknown can always be solved provided we can extract the square root of all the numbers in the domain of rationality constituted by the coefficients of the equation. As it is known that we can extract the square root of any real number as well as of any ordinary complex number, it results, in particular, that the quadratic equation with one unknown can always be solved provided its coefficients are ordinary real or complex numbers.

While the root of a linear equation in one unknown lies in the domain of rationality constituted by its coefficients, this is not necessarily true of the roots of a quadratic equation, since the operation of extracting the square root is not a rational operation. If the coefficients of the quadratic are rational numbers the domain of rationality constituted by one root clearly includes the other root, but this is not necessarily true when the coefficients are either irrational or complex numbers. It is, however, always true that the two roots of a quadratic equation in one unknown must be in the domain of rationality constituted by the coefficients and one of the roots, since each root may be obtained by a rational process from the coefficients and the other root. In other words, the quadratic in one unknown is always reducible in the domain of rationality constituted by its coefficients and one of its roots, but in no smaller domain of rationality.

The reduction of the quadratic equation to the form $z^2 = A$ is a special case of the removal of the second term in the equation

$$a_0 x^n + a_1 x^{n-1} + \dots + a_n = 0.$$
If we substitute $z + k$ for x in this equation, there results

$$a_0 (z + k)^n + a_1 (z + k)^{n-1} + \dots + a_n = 0.$$

The coefficient of $z^{n-1}$ in this equation is $a_0 n k + a_1$. As $a_0$ and n are both different from zero (otherwise the equation would not be of degree $n > 0$) a number can always be found which when substituted for k will reduce $a_0 n k + a_1$ to zero. This number is sometimes called the zero of the function $a_0 n k + a_1$, but it is more commonly known as the root of the following linear equation in k:

$$a_0 n k + a_1 = 0.$$

Hence the solution of a linear equation suffices to determine a number by means of which the coefficient of $x^{n-1}$ can be reduced to zero. In general, the solution of an equation of degree $\alpha$ suffices to reduce the coefficient of $x^{n-\alpha}$ to zero. In particular, to reduce the absolute term $a_n$ to zero it is necessary to solve an equation of degree n.

14. Extensions of the number concept due to the quadratic.

The solution of the general quadratic calls for numbers of the form $a + b\sqrt{-1}$, where a and b are real, even when the coefficients of the equation ($a_0, a_1, a_2$) are rational numbers. It has been proved that numbers of this form likewise suffice for the solution of the equation of the nth degree in one unknown even if the coefficients are also any numbers of this form. It should, however, not be inferred that the numbers of the form $a + b\sqrt{-1}$ which are required to solve the quadratic equation with rational coefficients are coextensive with those required to solve the equations of the nth degree. In 1770 Lagrange proved that the real irrational numbers which are roots of a quadratic equation with rational coefficients have the characteristic property that they may be represented by periodic continued fractions whose elements are integers.* The quadratic equation merely opened the great problem of distinguishing the different kinds of irrational numbers, a problem which is to-day the object of important investigations.
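Lagrange's periodicity theorem can be observed in the simplest case, the square root of a positive integer that is not a perfect square, which is a root of $x^2 - N = 0$. The Python sketch below uses the classical integer recurrence for the continued fraction of $\sqrt{N}$; it is limited to this special case by assumption and does not treat a general quadratic irrational. The function name is chosen here for illustration.

```python
from math import isqrt

def sqrt_continued_fraction(n):
    """Continued fraction of sqrt(n) as (initial terms, repeating period).

    Works with integers only: the complete quotient at each step is
    (sqrt(n) + m) / d, and a repeated state (m, d) marks the period.
    """
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0
    seen = {}      # state (m, d) -> index of the term it produced
    terms = []
    while (m, d) not in seen:
        seen[(m, d)] = len(terms)
        terms.append(a)
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
    start = seen[(m, d)]
    return terms[:start], terms[start:]

print(sqrt_continued_fraction(2))   # sqrt(2) = [1; 2, 2, 2, ...]
print(sqrt_continued_fraction(7))   # sqrt(7) = [2; 1, 1, 1, 4, ...]
```

Because only finitely many integer states (m, d) can occur, a state must eventually repeat, which is the mechanism behind the periodicity in this case.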
* Cf. Cahen, Éléments de la Théorie des Nombres, 1900, p. 183. This theorem was extended by Minkowski in Göttingen Nachrichten, 1899, p. 64.

The quadratic equation opened also the question as to the number of possible roots of an equation. It should be emphasized that the answer given to this question depends upon the point of view. For Diophantus and the older mathematicians who did not admit negative, irrational, or complex numbers as roots of an equation, most of the quadratic equations did not have any root; others had one root, but no instance is known where the ancient Greeks, or the older mathematicians, observed that at least some quadratic equations may have two roots. On the contrary, Bhaskara, a Hindu mathematician of the twelfth century of our era, observed that some quadratic equations have two roots, but even for him many quadratic equations had no root whatever, since he did not use complex numbers. He gave the following interesting rule: "The square of a positive as well as that of a negative number is positive, and the square root of a positive number is double, positive and negative. There is no square root of a negative number because a negative number is not a square."

Problems leading to quadratic equations are frequently viewed so narrowly that only one root, or even no root, seems to have meaning, and the existence of two roots has often led to a more comprehensive conception of the problem. This equation has thus contributed to more accurate and deeper thought as to the real nature of the problem. It is very important to observe that the study of the nature and the properties of the roots of an equation has not only led to a clearer comprehension of the essence of an equation, but also to a deeper insight into the nature of the subject giving rise to the equation.
This point of view was taken by Poinsot in his important article entitled "Réflexions sur les principes fondamentaux de la théorie des nombres," in contradicting a view expressed earlier by d'Alembert to the effect that the additional roots, beyond those to which the problem was supposed to give rise, were an inconvenience and were not to be attributed to the richness of algebra as some had supposed.*

* Poinsot, Journal de Mathématiques, Vol. X, 1845, p. 8.

The question as to the number of roots of an equation is not entirely confined to the number domain in which the values of the unknown are supposed to be. An additional difficulty is opened by the quadratic which is a perfect square. For instance, there seems to be good reason for saying the equation

$$x^2 - 2x + 1 = 0$$

has only one root, since 1 is the only solution of this equation. On the other hand, when the first member of this equation is written in the form $(x - 1)(x - 1)$, it is evident that $x = 1$ will make each of its factors vanish, and hence we may say that 1 is a repeated root. This should, however, be regarded as merely a convention which tends toward simplicity and clearness, two of the most potent factors in shaping the development of mathematics. The statement that a quadratic equation has always two and only two roots is thus seen to be heavily laden with historical facts of great significance, and it opens up the way for harmony and brevity in the theory of the general equation.*

15. Cubic equations.

The solution of the general cubic equation requires the operation of extracting the cube root in addition to the operations involved in the solution of the quadratic. The preparation for root extraction is, however, not so evident in the case of the cubic as it is in the case of the quadratic. In fact, many mathematicians attempted these preliminary transformations in vain before an Italian, Scipione del Ferro, professor in Bologna from 1496 to 1526, finally succeeded.
* Professor E. R. Hedrick has given several reasons why the beginner should not be taught that every quadratic equation has exactly two roots, School Science and Mathematics, Vol. IX, 1909, p. 563.

Since this time, a large number of different solutions have been given and a very extensive literature on the cubic has been developed, as may be seen by consulting the "Subject Index" of the Royal Society of London Catalogue of Scientific Papers, 1800-1900, pp. 170-71. Many of these methods of solution are based upon considerations which do not apply to the general equation of the nth degree but are very elegant as regards the cubic. On the contrary, we shall first give a method which involves many very interesting general theorems but requires lengthy computations, for far-reaching thoughts are a greater desideratum for the mathematician than brief special methods.

Remove the second term of the general cubic by the method indicated in sec. 13. The equation may thus be reduced to the form

$$x^3 + qx + r = 0.$$

Let $x_1, x_2, x_3$ be the roots of this equation, and consider the functions:

$$(x_1 - x_2)(x_1 - x_3)(x_2 - x_3), \qquad (x_1 + \omega x_2 + \omega^2 x_3)^3, \qquad (x_1 + \omega^2 x_2 + \omega x_3)^3,$$

where $\omega$ is an imaginary cube root of unity. Observe that whenever any one of these functions remains unchanged under any given substitution on the roots, the other two do so also. That is, each of these functions is transformed into itself by the same substitutions on the roots. From the fact that two rational functions which are transformed into themselves by the same substitutions can be expressed rationally in terms of each other (sec. 11), it results that each of these functions can be expressed rationally in terms of each of the others.* As the square of the first of these functions is symmetric, this square can be expressed rationally in terms of the elementary symmetric functions q and r, as was noted above in sec. 11. These general theorems enable us to see how the cubic can be solved.
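The claims about the three functions can be tested on a particular cubic. The Python sketch below takes $x^3 - 7x + 6 = 0$, with roots 1, 2, -3, as an illustrative assumption; it finds, for each function, the substitutions on the roots that leave its value unchanged, and compares the square of the first function with $-4q^3 - 27r^2$, the standard discriminant of $x^3 + qx + r$. A numerical test of this kind illustrates the statements and does not replace the general argument.

```python
import cmath
from itertools import permutations

W = cmath.exp(2j * cmath.pi / 3)     # an imaginary cube root of unity

def delta(x1, x2, x3):
    return (x1 - x2) * (x1 - x3) * (x2 - x3)

def u_cubed(x1, x2, x3):
    return (x1 + W * x2 + W * W * x3) ** 3

def v_cubed(x1, x2, x3):
    return (x1 + W * W * x2 + W * x3) ** 3

def fixing(func, roots, tol=1e-9):
    """Substitutions (as position tuples) leaving func's value unchanged."""
    base = func(*roots)
    return {p for p in permutations(range(3))
            if abs(func(*(roots[i] for i in p)) - base) < tol}

# Sample cubic (an assumption): x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
q, r = -7, 6
roots = (1, 2, -3)

fixed_sets = [fixing(func, roots) for func in (delta, u_cubed, v_cubed)]
discriminant = -4 * q ** 3 - 27 * r ** 2
print(fixed_sets[0], delta(*roots) ** 2, discriminant)
```

For this example each function is left unchanged by exactly the three cyclic substitutions, and the square of the first function equals the discriminant.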
* Cf. Dickson, Introduction to the theory of algebraic equations, 1903, p. 24.

We express $(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)$ as the square root of a rational function of q and r. Then we express each of the other functions as a rational function of this square root and extract the cube root of this rational function. In this way we find the values of $x_1 + \omega x_2 + \omega^2 x_3$ and $x_1 + \omega^2 x_2 + \omega x_3$. As we know that $x_1 + x_2 + x_3 = 0$ we have three linear equations in three unknowns from which the values of the unknowns can be readily found. It should be observed that this general method enables us to see, before we do any calculating, how we may proceed to find the roots, and it illustrates an important tendency in mathematics to see things without calculating. Normally, thought should precede rather than follow calculations in pure mathematics. The calculations will come out as follows:

$$(x_1 - x_2)(x_1 - x_3)(x_2 - x_3) = \sqrt{-4q^3 - 27r^2};$$

$$x_1 + \omega x_2 + \omega^2 x_3 = \sqrt[3]{-\tfrac{27}{2} r - \tfrac{3}{2}\sqrt{12q^3 + 81r^2}};$$

$$x_1 + \omega^2 x_2 + \omega x_3 = \sqrt[3]{-\tfrac{27}{2} r + \tfrac{3}{2}\sqrt{12q^3 + 81r^2}};$$

$$x_1 + x_2 + x_3 = 0.$$

On adding the last three equations we obtain

$$x_1 = \sqrt[3]{-\frac{r}{2} + \sqrt{\frac{r^2}{4} + \frac{q^3}{27}}} + \sqrt[3]{-\frac{r}{2} - \sqrt{\frac{r^2}{4} + \frac{q^3}{27}}}.$$

This is known as Cardan's formula, because he first published it. The substance of it had been obtained by Cardan from Tartaglia under the promise of secrecy, but Cardan broke his promise and published the formula in his "Ars Magna" in 1545.*

An elegant and brief solution of the cubic was given in 1591 by Vieta, a noted French mathematician. The general equation is first reduced to the form

$$x^3 + 3ax = 2b.$$

Letting $x = \dfrac{a - y^2}{y}$ this becomes

$$y^6 + 2by^3 = a^3.$$

As this is in the form of a quadratic it is very easy to find the possible values of y, and after these are known the values of x result from $x = \dfrac{a - y^2}{y}$. Numerous other brief methods are known and may be found in works on the Theory of Equations.

16. Biquadratic equations.
We shall see that the solution of the general biquadratic equation requires no non-rational operation except the extraction of the square and the cube root. Hence the operations which enter into this solution are of the same type as those which enter into the solution of the cubic. As in the case of the cubic we shall begin with a method which is valuable on account of its perspicuity and the far-reaching thoughts which it involves, but is not the simplest from the standpoint of practical applications. For numerical equations the methods of the following section are generally to be preferred.

* For a clear statement of the extenuating circumstances the reader may consult Tropfke's Geschichte der Elementar-Mathematik, Vol. I, p. 275.

We shall suppose that the general biquadratic has been reduced to the form,
$$x^4 + qx^2 + rx + s = 0,$$
and that its roots are $x_1, x_2, x_3, x_4$. The following three functions are clearly transformed either into themselves or into each other by every substitution on the roots:
$$(x_1 + x_2 - x_3 - x_4)^2, \qquad (x_1 - x_2 + x_3 - x_4)^2, \qquad (x_1 - x_2 - x_3 + x_4)^2.$$
Hence the cubic equation which has these functions as roots must have for its coefficients symmetric functions of $x_1, x_2, x_3, x_4$. As these symmetric functions can be expressed as integral functions of the elementary symmetric functions q, r, s, it results that this cubic lies in the domain of rationality constituted by these elementary symmetric functions. As a matter of fact, this cubic is
$$y^3 + 8qy^2 + (16q^2 - 64s)y - 64r^2 = 0.$$
Solving this cubic and denoting its roots by $\theta_1, \theta_2, \theta_3$, we have the following system of four linear equations in four unknowns:
$$x_1 + x_2 + x_3 + x_4 = 0;$$
$$x_1 + x_2 - x_3 - x_4 = \sqrt{\theta_1};$$
$$x_1 - x_2 + x_3 - x_4 = \sqrt{\theta_2};$$
$$x_1 - x_2 - x_3 + x_4 = \sqrt{\theta_3}.$$
By adding these equations we observe that $x_1$ is one-fourth of the sum of the square roots of the roots of the given cubic. After $x_1$ is known it is easy to find the values of the other three roots.
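The relation between the biquadratic and its cubic can be verified on a numerical case. The sketch below makes several assumptions that are not in the text: the roots 1, 2, 3, -6 are an illustrative choice with zero sum, and the helper names `elementary` and `resolvent` are hypothetical. It starts from known roots rather than solving the cubic, so it only checks the stated relations.

```python
from itertools import combinations
from math import prod

# Illustrative roots (an assumption of this sketch); their sum is zero,
# so the biquadratic has the reduced form x^4 + q*x^2 + r*x + s = 0.
x1, x2, x3, x4 = roots = (1, 2, 3, -6)


def elementary(k):
    """k-th elementary symmetric function of the roots."""
    return sum(prod(c) for c in combinations(roots, k))


q, r, s = elementary(2), -elementary(3), elementary(4)

# The three linear functions whose squares are the roots of the cubic.
linear = (x1 + x2 - x3 - x4, x1 - x2 + x3 - x4, x1 - x2 - x3 + x4)
thetas = [v * v for v in linear]


def resolvent(y):
    """Left member of y^3 + 8q*y^2 + (16q^2 - 64s)*y - 64r^2 = 0."""
    return y**3 + 8 * q * y**2 + (16 * q**2 - 64 * s) * y - 64 * r**2


residuals = [resolvent(t) for t in thetas]

# Adding the four linear equations gives 4*x1 as the sum of the
# (suitably signed) square roots of the roots of the cubic.
recovered_x1 = sum(linear) / 4
print(q, r, s, thetas, residuals, recovered_x1)
```

In this example the three squares satisfy the cubic exactly. The signed square roots 6, 8, -10 have the product $-8r$, which indicates how the signs of the square roots would have to be selected if one started from the cubic instead of from the roots.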
The discovery of a solution of the general biquadratic is due to Ferrari, who was a pupil of Cardan. It is especially interesting since Ferrari was not yet twenty-three years old when he discovered it. In this connection, it is interesting to note that both Abel and Galois did their fundamental work on the theory of equations before they were twenty-three years old. The substance of Ferrari's solution was as follows: Write the biquadratic in the form,
$$x^4 + px^3 + qx^2 + rx + s = 0.$$
Add $(ax + b)^2$ to both members and then assume that the first member is a perfect square. That is,
$$x^4 + px^3 + (q + a^2)x^2 + (r + 2ab)x + s + b^2 = \left(x^2 + \tfrac{1}{2}px + k\right)^2.$$
By equating coefficients of the like powers of x and eliminating a and b there results a cubic in k. After finding the value of k by means of this cubic it is only necessary to factor,
$$\left(x^2 + \tfrac{1}{2}px + k\right)^2 - (ax + b)^2 = 0,$$
in order to reduce the solution of the biquadratic to that of two quadratics. The solutions of these quadratics must include the roots of the original equation.

17. Equations whose degrees exceed 4. The brilliant discoveries of the Italian mathematicians regarding the solution of the cubic and biquadratic equation led to numerous attempts to solve general equations of higher degrees by rational operations and the extraction of roots, as these were the only known algebraic operations at that time. All such efforts were destined to failure, but it required nearly three hundred years from the time when Ferrari first solved the biquadratic until Abel discovered, at the age of twenty-two, the first rigorous proof of the fact that the general quintic equation cannot be solved by these elementary operations. It is interesting to note that Abel began his scientific career by attempts to solve the quintic by radicals and he believed for some time that he had actually found a solution, but he afterward discovered his own error. His apparent success won for him the life-long friendship and support of his countryman, Hansteen.
Abel was not the first who attempted to prove that the general quintic cannot be solved by the extraction of roots. About a quarter of a century earlier Paolo Ruffini did much to develop methods which were of sufficient power to prove this fundamental fact. In particular, he gave a number of theorems on groups of substitutions, as was noted above. The difficulties which the general solution of the quintic presented have thus become a source of great riches for the later development of mathematics. Besides Ruffini, some of the most eminent among those who started these developments are: Tschirnhaus, Euler, Lagrange, Gauss, Galois, and Hermite. The work of Galois (1811-32) was especially fundamental as regards the establishment of more definite relations between the theory of equations and the theory of substitution groups, by proving that every equation belongs to a certain substitution group, and that the properties of this group give definite information as to the solvability by radicals of the equations belonging to the group. The important theorem that two rational functions of the roots of any equation may be expressed rationally in terms of each other, in the domain of rationality of the coefficients of the equations, had been proved earlier by Lagrange. For an introduction to the elegant theory of equations based upon these theorems we may refer the reader to the following works: Dickson, Introduction to the Theory of Algebraic Equations, 1903; Cajori, An Introduction to the Modern Theory of Equations, 1904; Mathews, Algebraic Equations, 1907.

IV. EQUATIONS WITH ONE UNKNOWN AND WITH NUMERICAL COEFFICIENTS

18. General statement. Although numerical algebraic equations have a prehistoric origin, the arithmetical epigrams of the Greek Anthology, among other things, support the assumption that they resulted from puzzles and word-equations.
The fully developed equations represent highways of exact thought without by-ways, and the coefficients determine the possible destinations of these highways. The ancient problems of duplicating a cube and trisecting an angle, among many others, directed attention to the need of such highways, but their construction for coefficients, which may be regarded as arbitrary, presented great difficulties. Even in the case of the cubic with three real roots (casus irreducibilis) Cardan's formula represents the real root in the form of the sum of two imaginary expressions; and it has been proved that it is impossible in this case to represent the roots of the cubic in a real form by means of radicals.* On the other hand, the great French algebraist, Vieta (1540-1603), showed how the real values of the three roots may be obtained by means of trigonometry.

* Cf. Encyklopädie der mathematischen Wissenschaften, Vol. I, p. 518. The French edition of this work, to which we have already referred, treats many subjects more completely than the German. This is especially true as regards algebra and arithmetic. Neither of these editions is completely published, but the German is considerably further advanced than the French. They constitute at present the most important mathematical works of reference.

From the preceding paragraph it results that the solution of numerical equations of a given degree may present difficulties even after a formula for the roots of the general equation of this degree is known. These difficulties, combined with those of finding such general formulas, directed attention to special methods of solution in case the coefficients are numbers. It is of especial importance to observe that for many applications of algebra only approximate values of the real roots are needed. This need has led to a vast literature which embodies some of the most beautiful results relating to algebraic equations.
As the solutions of the general linear and quadratic equations are so easily available for numerical equations, we shall assume, in what follows, that the degrees of the equations under consideration exceed 2. The solutions of numerical equations may frequently be simplified by considering the special properties of the coefficients, and hence they demand great alertness as regards details.

A large part of the theory of numerical equations confines itself to real numbers, since these are frequently the only numbers applying directly to the conditions which give rise to an equation. This is especially true as regards the coefficients of an equation. When the coefficients of the rational integral function f(x) involve complex numbers it is evidently possible to write this function in the form,
$$f(x) = \phi(x) + i\psi(x),$$
where the coefficients of the rational integral functions $\phi(x)$ and $\psi(x)$ are real numbers. After multiplying both members of this equation by the conjugate value, $\phi(x) - i\psi(x)$, we obtain a new rational integral function of x, which involves all the roots of $f(x) = 0$, but has only real coefficients. Hence it results that if we can find all the roots of every rational integral function of x with real coefficients we can also find those of such a function with complex coefficients. It is also important to observe from the given form of f(x) that any real root of $f(x) = 0$ is a common root of $\phi(x) = 0$, $\psi(x) = 0$, and hence it is a root of the highest common factor of $\phi(x)$ and $\psi(x)$. In view of these facts and for the sake of brevity and perspicuity we shall assume throughout the rest of the present section that all the coefficients of f(x) are real numbers.

19. Multiple roots. If f(x) is divisible by $(x - r)^{\alpha}$ but not by $(x - r)^{\alpha + 1}$, r is said to occur exactly $\alpha$ times as a root of the equation $f(x) = 0$; sometimes it is also called such a root or a zero of f(x). When $\alpha > 1$, r is called a multiple root of $f(x) = 0$, or multiple zero of f(x).
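The definition just given can be turned directly into a computation: divide by $x - r$ as long as the remainder vanishes and count the divisions. The following is a minimal sketch under stated assumptions: the function names and the sample polynomial $(x-1)^2(x+2) = x^3 - 3x + 2$ are illustrative and not from the text, and coefficients are listed from the highest power down.

```python
from fractions import Fraction


def divide_by_linear(coeffs, r):
    """Synthetic division of a polynomial by (x - r).

    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder).
    """
    acc = []
    carry = Fraction(0)
    for c in coeffs:
        carry = carry * r + c
        acc.append(carry)
    return acc[:-1], acc[-1]


def multiplicity(coeffs, r):
    """Largest alpha such that (x - r)**alpha divides the polynomial."""
    alpha = 0
    while len(coeffs) > 1:
        quotient, remainder = divide_by_linear(coeffs, r)
        if remainder != 0:
            break
        coeffs, alpha = quotient, alpha + 1
    return alpha


# Illustrative polynomial (an assumption): (x - 1)^2 (x + 2) = x^3 - 3x + 2
f = [1, 0, -3, 2]
print(multiplicity(f, 1), multiplicity(f, -2), multiplicity(f, 3))
```

Here 1 occurs exactly twice as a root, -2 occurs once, and 3 is not a root at all, so that only 1 is a multiple root of this equation.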
To determine the multiple roots of $f(x) = 0$ it is convenient to use the well-known property that any root which occurs exactly $\alpha$ times in $f(x) = 0$ must occur exactly $\alpha - 1$ times as a root of $f'(x) = 0$, where $f'(x)$ is the first derivative of f(x). Hence a multiple root of f(x) is also a root of the highest common factor of the two functions, $f(x)$, $f'(x)$. Since the first derivative of f(x) may be found by a rational process it results that f(x) is reducible in the domain of rationality of its coefficients whenever it has multiple roots, but the converse of this theorem is evidently not necessarily true.

From the preceding paragraph it results that the multiple roots of $f(x) = 0$ may be found by means of the highest common factor of $f(x)$ and $f'(x)$. As the multiple roots of this highest common factor may be found in a similar manner it results that whenever $f(x) = 0$ has no more than $\rho$ distinct multiple roots, all these roots may be found by rational operations and by solving equations whose degrees do not exceed $\rho$. In particular, if $f(x) = 0$ has only one multiple root it may be found by rational operations. It is frequently possible to find the rational multiple roots by inspection. Since the quotient obtained by dividing f(x) by the highest common factor of $f(x)$ and $f'(x)$ involves each root of f(x) once and only once, we may suppose in what follows that $f(x) = 0$ involves no multiple root. This hypothesis will conduce to brevity of statements.

20. Sturm's theorem.
This theorem (proved in 1829) furnishes the scientific foundation for every method of finding the approximate values of the unknown in an algebraic equation with real coefficients, as it gives definite information in regard to the number of real roots between two arbitrarily assigned numbers.* Moreover, the proof of this theorem is not difficult, being based upon the following two elementary facts: (1) The continuity of f(x), and (2) the fact that if a is a real root of $f(x) = 0$ and h is a sufficiently small positive number, then $f(a - h)$ and $f'(a - h)$ have different signs, while $f(a + h)$ and $f'(a + h)$ must have the same sign, where $f'(x)$ is the first derivative of f(x). A proof of these two facts is found in many elementary text-books; e.g., Burnside and Panton's Theory of Equations, Vol. I, 1899, pp. 9 and 161.

* Encyklopädie der Elementar-Mathematik von Weber und Wellstein, 1906, Vol. I, p. 337.

To obtain Sturm's Series we proceed exactly as in the process of finding the highest common factor of $f(x)$ and $f'(x)$ with the single exception that the sign of each remainder is changed. In this way we obtain the following relations:
$$f(x) = q_1(x)f'(x) - r_1(x),$$
$$f'(x) = q_2(x)r_1(x) - r_2(x),$$
$$r_1(x) = q_3(x)r_2(x) - r_3(x),$$
$$\dots$$
$$r_{n-2}(x) = q_n(x)r_{n-1}(x) - r_n(x),$$
where $r_n(x)$ is a constant, different from zero, since $f(x) = 0$ has no multiple root. The series,
$$f(x),\ f'(x),\ r_1(x),\ r_2(x),\ \dots,\ r_n(x),$$
has the following properties: No two adjacent functions can vanish for the same value of x; otherwise all the succeeding functions would have to vanish for this value of x, but this is impossible since $r_n(x)$ cannot be 0. When any function vanishes the two adjacent functions must have opposite signs in order to satisfy the given equations. In finding the number of changes of sign in this series as x increases continuously from the real number a to a larger real number b we need therefore not consider the vanishing of any function except the first one.
In case this vanishes a change of sign is lost, as was observed in the preceding paragraph. This proves Sturm's Theorem, which may be stated as follows: If any two real numbers a and b be substituted for x in Sturm's Series, $f(x), f'(x), r_1(x), r_2(x), \dots, r_n(x)$, the difference between the number of changes of sign in the series when a is substituted for x and the number when b is substituted for x is exactly the number of real roots of the equation $f(x) = 0$ between a and b.

The total number of real roots of $f(x) = 0$ is equal to the difference between the changes of sign in these functions when $-\infty$ is first substituted for x and then $+\infty$. The total number of positive roots may be found by first substituting 0 and then $+\infty$, and of the negative roots by first substituting $-\infty$ and then 0. This theorem is more general than Descartes' Rule of Signs, as the latter gives merely an upper limit for the number of real roots.

The disadvantage of Sturm's Theorem is that it requires considerable labor to find Sturm's Series, especially when the degree of f(x) is large, since the coefficients in the successive functions of the series may become large. It is evident that the successive remainders may be multiplied or divided by any positive number and that it is not necessary to find the exact value of $r_n(x)$, since only its sign is considered in the application of the theorem. Sturm's Series suffices to find the rational roots of an equation and to approximate the irrational roots to any desired degree of accuracy, but other methods generally require much less computation.

One of the most useful auxiliary theorems in locating the roots of an equation may be stated as follows: There must be an odd number of roots between a and b whenever f(a) and f(b) have opposite signs. This theorem results directly from the fact that f(x) is continuous and hence can change its sign between a and b only by passing through zero.
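The construction of Sturm's Series and the count of changes of sign described in this article can be sketched as follows. The sketch rests on several assumptions not found in the text: the function names are hypothetical, the coefficients are rational numbers listed from the highest power down, the equation has no multiple root, and neither a nor b is itself a root. The sample equation $x^3 - 7x + 7 = 0$ is an illustrative choice.

```python
from fractions import Fraction


def derivative(p):
    """First derivative of a polynomial given highest power first."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def remainder(a, b):
    """Remainder of the division of polynomial a by polynomial b."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)                      # the leading coefficient is now zero
    while a and a[0] == 0:            # drop any remaining leading zeros
        a.pop(0)
    return a


def sturm_series(f):
    """f, f', and the successive remainders with their signs changed."""
    f = [Fraction(c) for c in f]
    series = [f, derivative(f)]
    while True:
        rem = remainder(series[-2], series[-1])
        if not rem:
            break
        series.append([-c for c in rem])
    return series


def evaluate(p, x):
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value


def sign_changes(series, x):
    values = [v for v in (evaluate(p, x) for p in series) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)


def count_real_roots(f, a, b):
    """Number of real roots of f(x) = 0 between a and b (a < b)."""
    series = sturm_series(f)
    return sign_changes(series, a) - sign_changes(series, b)


f = [1, 0, -7, 7]   # illustrative equation x^3 - 7x + 7 = 0 (an assumption)
print(count_real_roots(f, 1, 2), count_real_roots(f, -4, -3),
      count_real_roots(f, 0, 1), count_real_roots(f, -10, 10))
```

For this equation the series shows two roots between 1 and 2, although f(1) and f(2) have the same sign, one root between -4 and -3, and none between 0 and 1. Exact rational arithmetic is used so that the signs are not affected by rounding; as the article observes, the coefficients may nevertheless become large.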
It is evident that the number of roots between a and b must be zero or even whenever f(a) and f(b) have the same sign.

21. Rational roots. Descartes observed that every integral root of
$$f(x) = a_0x^n + a_1x^{n-1} + \dots + a_n = 0,$$
the coefficients being integers, is a divisor of $a_n$. Moreover, if a root is rational and reduced to its lowest terms, its numerator is a divisor of $a_n$ and its denominator is a divisor of $a_0$, as results directly from substituting such a root $\left(\dfrac{m}{l}\right)$ in $f(x) = 0$. In fact, all the terms of
$$a_0\frac{m^n}{l} + a_1m^{n-1} + a_2m^{n-2}l + \dots + a_nl^{n-1}$$
except possibly the first are evidently integers. As the sum of all these terms is zero the first must also be an integer, so that l, being prime to m, divides $a_0$. On the other hand, since m divides all these terms except possibly the last it must also divide the last, and hence m divides $a_n$. If $f(x) = 0$ has a second rational root and this root is also reduced to its lowest terms, its numerator evidently divides $a_n \div m$ and its denominator divides $a_0 \div l$, etc. As the numerator of every rational root in its lowest terms divides $a_n$ and the denominator divides $a_0$, it results that we can find all the rational roots of $f(x) = 0$ by a finite number of trials and that the number of these trials is small when $a_n$ and $a_0$ have only a small number of factors.

22. Irrational roots. It is always possible, in accord with the preceding theory, to find two rational numbers, whose difference is less than any assigned finite number, such that one of these numbers is greater than the required irrational root while the other is less than this root. We may choose these rational numbers successively so as to differ from each other by powers of $\tfrac{1}{10}$. That is, we may first find two integers which differ by $10^0 = 1$ such that the root lies between them, then we may find two rational numbers differing by $10^{-1}$, such that the root lies between them, then we may find two rational numbers differing by $10^{-2}$ and inclosing the root, etc.
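Both the finite number of trials for rational roots and the successive decimal inclosures of an irrational root can be sketched in a few lines. What follows is only an illustration under stated assumptions: the function names are hypothetical, the coefficients are integers listed from the highest power down with a non-zero last coefficient, and the sample equation $2x^3 - x^2 - 4x + 2 = (2x - 1)(x^2 - 2) = 0$ is chosen for this sketch. The inclosing step relies on the change of sign of f(x) instead of on Sturm's Series.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots m/l: m divides the last coefficient, l the first."""
    a0, an = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * m, l)
                  for m in divisors(an)
                  for l in divisors(a0)
                  for sign in (1, -1)}
    return sorted(c for c in candidates if evaluate(coeffs, c) == 0)


def decimal_bracket(coeffs, low, digits):
    """Inclose a root between rational numbers differing by 10**-digits.

    Assumes f(low) and f(low + 1) have opposite signs.
    """
    if evaluate(coeffs, low) * evaluate(coeffs, low + 1) >= 0:
        raise ValueError("f(low) and f(low + 1) must have opposite signs")
    low, width = Fraction(low), Fraction(1)
    for _ in range(digits):
        width /= 10
        # advance by the new width until f changes sign (or vanishes)
        while evaluate(coeffs, low) * evaluate(coeffs, low + width) > 0:
            low += width
    return low, low + width


f = [2, -1, -4, 2]   # illustrative: (2x - 1)(x^2 - 2), an assumption
print(rational_roots(f), decimal_bracket(f, 1, 2))
```

For this equation the only rational root is 1/2, and the root between 1 and 2 is inclosed first between 1.4 and 1.5 and then between 1.41 and 1.42.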
The smaller of these two rational numbers is called the approximate value of the root, and the process of finding it is known as approximating the root. In practice this process is greatly modified in details so as to require much less labor.

In 1767 Lagrange published a theoretically simple method for finding the approximate value of an irrational root by means of continued fractions. The main features of this method are as follows: After finding that a root of $f(x) = 0$ lies between the integers r and r + 1 we substitute for x in $f(x) = 0$,
$$x = r + \frac{1}{y_1} = \frac{ry_1 + 1}{y_1},$$
and thus obtain another equation of degree n, $f_1(y_1) = 0$, which has the same number of real roots greater than 1, as $f(x) = 0$ has real roots between r and r + 1. We then find an integer $r_1 > 0$, such that there is a root between $r_1$ and $r_1 + 1$ and substitute in $f_1(y_1) = 0$,
$$y_1 = r_1 + \frac{1}{y_2} = \frac{r_1y_2 + 1}{y_2}.$$
In this way there results an equation of the nth degree in $y_2$ which has as many real roots greater than 1 as $f_1(y_1) = 0$ has real roots between $r_1$ and $r_1 + 1$. By continuing this process we must arrive at an equation which has only one root greater than 1, and this root may be traced as far as may be desired. The value of a root of the original equation is then given by the continued fraction,
$$x = r + \cfrac{1}{r_1 + \cfrac{1}{r_2 + \dotsb}}.$$
Although this method is perspicuous and exhibits clearly the reason for each step, it has not been used as widely as the well-known Horner's Method.

23. Solutions by means of graphs and machines. If an exact graph of $y = f(x)$ could be constructed and if it were possible to measure exactly the abscissas of the points where this graph crosses the x-axis, the numerical measures of these abscissas would furnish all the real roots of $f(x) = 0$. This method has the advantage that it exhibits the values of f(x) for all the values of x within certain limits. Its disadvantage is that a graph cannot be said to represent a function accurately on account of the imperfections in measurement and drawing.
It serves the purpose of a hypothesis by bringing unity into what might otherwise appear as disconnected, and hence it serves a very useful purpose, especially for the beginner. It is a convenient receptacle for a large number of facts whose significance might otherwise not be so clear.

The ancient Greeks used geometric constructions to solve certain problems of geometry which are equivalent to the solutions of equations of the second degree, but the present graphic methods for solving equations were developed mostly since the beginning of the nineteenth century. In many cases these methods serve only to show that certain solutions are possible and in some cases they serve as a rough check on the accuracy of the calculations, but there are a large number of cases where such solutions are sufficiently accurate for the problems on hand. As they are especially well adapted to the saving of thought as regards details they are doubtless destined to play a more and more prominent role as mathematical methods find wider and wider use in the development of science and industry.

Instead of drawing the graph of $y = f(x)$ as noted above, it is often more convenient to construct two curves such that the abscissas of the points of intersection are the roots of $f(x) = 0$. Sometimes one curve is fixed for all the equations of the same degree, while the other curve is made to vary so as to correspond to the different values of the coefficients. As early as 1637 Descartes employed a fixed parabola and a variable circle to solve equations of the third and fourth degrees, and he also solved equations of the fifth and sixth degrees by means of a certain fixed curve of the third order and a circle. The literature on graphic algebra is very extensive and is growing rapidly. Among the introductory treatises we may mention the Graphic Algebra by Phillips and Beebe.
Closely related to the graphic methods are the various machines for finding the approximate values of the roots of a numerical equation. Some of these are very ingenious, employing principles of equilibrium of forces and of hydrostatics as well as of electricity. Although the ancient Greeks solved the Delian problem, involving the solution of a cubic equation, by means of mechanical devices, the machines suitable for finding the roots of a great variety of equations are comparatively recent inventions. One of the most noted was invented in 1893 by a Spanish engineer named M. L. Torres. For a detailed description of this and other machines to solve equations and to simplify other calculations we may refer to Le Calcul simplifié par Maurice d'Ocagne, 1905. The large mathematical encyclopedias, especially the Encyclopédie des Sciences Mathématiques, tome 1, Vol. IV, contain a large amount of information on this subject.

24. A few fallacies and notes of caution. While the chief aim of mathematics is the construction of permanent and attractive highways of thought leading as directly as possible to important treasures of the intellect, it is of some interest to observe where one is led by following by-ways regardless of the danger signals. One of the most prominent of these signals is: Never divide both members of an equation by an expression whose value is zero. If it were allowable to divide by such an expression it would be easy to prove that every number is equal to zero. One such proof would be as follows: From $x = 1$ there would result successively,
$$x^2 = 1, \quad x^2 - 1 = 0, \quad x + 1 = 0, \quad x = -1, \quad 1 = -1, \quad a = -a, \quad 2a = 0.$$
As a may be so selected that 2a is an arbitrary number, it would result from this that any arbitrary number is zero.

A fallacy of a somewhat different nature results from the fact that we are so apt to forget that a number has n nth roots.
This is illustrated in the following two examples:
$$\frac{1}{-1} = \frac{-1}{1}.$$
Extracting the square root of both members gives
$$\frac{\sqrt{1}}{\sqrt{-1}} = \frac{\sqrt{-1}}{\sqrt{1}}.$$
Clearing of fractions and observing that $(\sqrt{1})^2 = 1$ and $(\sqrt{-1})^2 = -1$, if $\sqrt{\ }$ stands for a single root, there results, $1 = -1$. The danger signal here is: remember that a number has n nth roots.

The use of radicals in elementary mathematics is not as uniform as it should be. For instance, the symbol $\sqrt{\ }$ should either imply two values and hence should never be preceded by $\pm$, or we should have a slightly modified symbol to denote the arithmetic square root. If we assume that the symbol $\sqrt{\ }$ indicates merely a positive square root, such equations as
$$\sqrt{x + a} + \sqrt{x} = 1, \quad a > 1,$$
are clearly impossible. On the other hand, they are possible when this symbol indicates either of the two possible square roots, and the possible value of x may be found by clearing of radicals in the ordinary way. Such equations should therefore not be called impossible without stating that the symbol $\sqrt{\ }$ is to be given an arithmetic meaning.

The equation,
$$\left(x^{\frac{1}{q}}\right)^p = \left(x^p\right)^{\frac{1}{q}},$$
where p and q are integers, should not be regarded as an identity, as is evident from the fact that $\left(x^{\frac{1}{4}}\right)^4$ has only one value while $\left(x^4\right)^{\frac{1}{4}}$ has, in general, four distinct values. All the values of the first member of the given equation are evidently values of the second, but the converse is not true.* Such equations must therefore be used with great care. For more detailed information along this line the reader may consult Catalan, Sur un paradoxe algebrique, Nouv. Annales de Math., Vol. VIII, 1869, p. 456.

V. SIMULTANEOUS EQUATIONS

25. Introduction. In sec. 5 it was observed that simultaneous equations appear on some of the oldest mathematical papyri and that the solution of a special case of a system of n simultaneous equations was known to the ancient Greeks. A satisfactory treatment of such equations was, however, not possible until determinants had been developed.
This subject is comparatively modern, having its origin in the writings of Leibnitz (1693), and assuming a significant position in mathematical literature during the latter part of the eighteenth and the first part of the nineteenth century. In what follows we shall assume a knowledge of the elementary properties of determinants.

* Cf. Valles, Nouvelles Annales de Mathematiques, Vol. IX, 1870, p. 20.

In the case of a single equation in one or more unknowns, it is known that it can always be solved in the sense that at least one value of each of the unknowns exists which will satisfy the equation. The only exception to this rule is when all the coefficients of the unknown, or the unknowns, are equal to zero,* while the known term is not equal to zero. In the case of a system of equations, a number of other possibilities arise and one of the first questions in regard to such a system is whether it can be solved. If this can be done the system is said to be consistent.

A set of mn quantities arranged in rectangular array of m rows and n columns is called a matrix. When m = n it is called a square matrix, so that the matrix of a determinant is always a square matrix. The rank of a matrix is the order of the largest non-vanishing determinant contained in the matrix.

26. Consistency of a system of linear equations.† Consider the following system of m equations in n unknowns:
$$a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n + b_1 = 0,$$
$$a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n + b_2 = 0,$$
$$\dots$$
$$a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n + b_m = 0,$$
where m and n are any two positive integers. The three cases that can arise are: (1) The equations may have no solution and hence be inconsistent. (2) They may have only one solution. (3) They may have more than one solution.

* We consider only finite values of the unknowns in the solutions of equations.

† In this article we have, in the main, adopted the mode of presentation given in the Introduction to Higher Algebra, by Maxime Bôcher, 1907.
It will soon appear that they must have an infinite number of solutions whenever they have more than one (in fact, each unknown has none, one, or an infinite number of values), so that the possible cases are: No solution, one solution, or an infinite number of solutions. To prove this it will be convenient to consider the two matrices:
$$A = \begin{Vmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{Vmatrix}, \qquad B = \begin{Vmatrix} a_{11} & a_{12} & \dots & a_{1n} & b_1 \\ a_{21} & a_{22} & \dots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} & b_m \end{Vmatrix}.$$
The latter is obtained by adding the column of b's to the former, and hence it is called the augmented matrix of the system, while A is the matrix of the system. It is evident that the rank of B cannot be less than the rank of A and that the former cannot exceed the latter by more than unity. Hence we have the two possible cases: (1) The rank of A is equal to that of B, (2) the rank of A is one less than that of B.

Suppose that the given system of equations comes under the latter of these two possible cases. We may therefore suppose that the rank of B is r while the rank of A is r - 1. The given system of equations may be supposed to have been arranged in such a manner that the non-vanishing determinant of order r in B is in the upper right-hand corner of this matrix. Since the rank of A is r - 1 it results that the homogeneous parts $(f_1, f_2, \dots, f_r)$ of the first r equations of the given system may be multiplied by constants $(c_1, c_2, \dots, c_r)$, so that
$$c_1f_1 + c_2f_2 + \dots + c_rf_r = 0,$$
independently of the values of the unknowns, where at least one of the c's is not 0. If we represent the first members of the given system by $F_1, F_2, \dots, F_m$, so that
$$F_i = f_i + b_i, \quad (i = 1, 2, \dots, m),$$
it follows from the above that
$$c_1F_1 + c_2F_2 + \dots + c_rF_r = c_1b_1 + c_2b_2 + \dots + c_rb_r = c.$$
Since the rank of B is r it is necessary that $c \neq 0$, otherwise each of the elements in one row of the matrix of a non-vanishing determinant would be the same linear function of the corresponding elements in the other rows.
The fact that $c \neq 0$ for any possible values of the unknowns proves the inconsistency of the system, for if the equations were consistent there would be values of the unknowns which would cause each of the functions $F_1, F_2, \dots, F_r$ to vanish and hence c would be 0.

Having proved that the given system of equations is inconsistent when the rank of B is larger than the rank of A we proceed to prove that the system must be consistent when the rank of A is equal to that of B. Suppose that each of these two matrices is of rank r and that the equations are so arranged that a non-vanishing determinant of order r appears in the upper left-hand corner of each of these matrices. Since each of the determinants of order r + 1 must vanish we have the relation,
$$c_1F_1 + c_2F_2 + \dots + c_rF_r + c_{r+1}F_{r+1} = 0,$$
independently of the values of the unknowns. As $c_1F_1 + c_2F_2 + \dots + c_rF_r$ does not vanish independently of the values of the unknowns, it results that $c_{r+1} \neq 0$. Hence we may divide the given equation by $c_{r+1}$ and thus express $F_{r+1}$ in terms of $F_1, F_2, \dots, F_r$. As the same argument holds for $F_{r+2}, F_{r+3}, \dots, F_m$ it results that any solutions of the first r equations must also be solutions of all the rest.

If in the first r of the given system of equations we assign arbitrary values to $x_{r+1}, \dots, x_n$ we obtain a system which can be solved in the ordinary way by means of determinants, since the determinant of the system does not vanish. In this way we obtain one and only one value for each of the unknowns $x_1, \dots, x_r$. The preceding considerations prove the following theorem: A necessary and sufficient condition for a system of linear equations to be consistent is that the matrix of the system has the same rank as the augmented matrix. Since the values assigned to $x_{r+1}, \dots, x_n$ are arbitrary, it also follows that a system of linear equations has an infinite number of solutions whenever it has more than one solution.
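The theorem lends itself to a short computational check. In the sketch below the rank is found by elimination in exact arithmetic, which gives the same number as the order of the largest non-vanishing determinant; the function names and the three small systems are assumptions made for illustration and do not come from the text. Each system is written in the form of this article, with all terms in the first member.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, found by elimination in exact arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    found = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((i for i in range(found, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[found], m[pivot] = m[pivot], m[found]
        for i in range(found + 1, len(m)):
            factor = m[i][col] / m[found][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[found])]
        found += 1
    return found


def classify(A, b):
    """Return 'none', 'one' or 'infinite' for the system A x + b = 0."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    rank_a, rank_b = rank(A), rank(augmented)
    if rank_a != rank_b:
        return "none"                 # inconsistent system
    return "one" if rank_a == len(A[0]) else "infinite"


# Illustrative systems (assumptions of this sketch):
# x + y - 2 = 0, 2x + 2y - 5 = 0
print(classify([[1, 1], [2, 2]], [-2, -5]))
# x + y - 2 = 0, x - y = 0
print(classify([[1, 1], [1, -1]], [-2, 0]))
# x + y + z - 1 = 0, x - y = 0
print(classify([[1, 1, 1], [1, -1, 0]], [-1, 0]))
```

When the two ranks agree, the sketch distinguishes the remaining cases by comparing the common rank r with the number n of unknowns: if r = n no unknown can be assigned arbitrarily and there is one solution, while if r is less than n there is an infinite number.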
To provide very elementary illustrations of the preceding theorem we consider the following systems:
$$\text{I.}\ \begin{cases} 3x - 2y + z = 8, \\ x - 4y + 2z = 6. \end{cases} \qquad \text{II.}\ \begin{cases} 3x + 4y = 7, \\ 6x + 8y = 10. \end{cases}$$
$$\text{III.}\ \begin{cases} x + 2y = 5, \\ 2x - y = 0, \\ 4x + 3y = 10. \end{cases} \qquad \text{IV.}\ \begin{cases} x - y + 3z = 4, \\ 2x + 3y - z = 5, \\ 3x + 2y + 2z = 10. \end{cases}$$
In system I the rank of the matrix of the system is 2, since
$$\begin{vmatrix} 3 & -2 \\ 1 & -4 \end{vmatrix} \neq 0.$$
As the rank of the augmented matrix is also 2 this system is consistent and arbitrary values may be assigned to either y or z. On the other hand, the only value that x can have is 2.* The rank of the matrix of system II is 1, while the rank of the augmented matrix is 2; hence this system has no solution. In system III the rank of the matrix as well as that of the augmented matrix is 2. Hence this system has a solution and it is evident that it has only one solution, viz., x = 1, y = 2. As the matrix of system IV is of rank 2, while the augmented matrix is of rank 3, this system has no solution.

* A necessary and sufficient condition that a given unknown in a consistent system of linear equations has the same value in every possible solution of the system is that the rank of the matrix of the system is decreased when the coefficients of this unknown are omitted from the matrix of the system. Cf. American Mathematical Monthly, Vol. XVII, 1910, p. 137.

27. Geometrical interpretation. As a linear equation involving no more than three unknowns may be conveniently represented as a plane in ordinary space, clearness is often attained by thinking of the planes which represent given systems of equations. For instance, system I of the preceding paragraph represents two planes intersecting on the plane x = 2, and hence these planes are cut in parallel lines by every plane parallel to the plane x = 2, while they are cut in two intersecting lines by the planes parallel to y = 0 or z = 0. System II represents two parallel planes, while system III represents three planes through a line parallel to the z-axis.
Finally, system IV represents three planes intersecting in three parallel lines. These interpretations follow directly from solid analytic geometry, and they tend to elucidate the theory of systems of linear equations, but they do not form an essential element of this theory.

28. Consistency of two equations in one unknown. Suppose that two rational integral equations in $x$, $f_1(x)=0$, $f_2(x)=0$, have a common root. If $f_1(x)$ is of degree $m$ and $f_2(x)$ is of degree $n$, we obtain, counting the two given equations, $m+n$ equations in the $m+n-1$ unknowns $x, x^2, \ldots, x^{m+n-1}$ by multiplying $f_1(x)$ successively by $x, x^2, \ldots, x^{n-1}$, and $f_2(x)$ by $x, x^2, \ldots, x^{m-1}$. The consistency of this system of $m+n$ equations requires that the determinant of the augmented matrix of the system be equal to zero.* This determinant is known as the resultant of the equations and the method by which we obtained it is known as Sylvester's dialytic method of elimination. The resultant of the two linear equations $ax+b=0$, $a_1x+b_1=0$ is
$$\begin{vmatrix} a & b\\ a_1 & b_1 \end{vmatrix} = ab_1 - a_1b = 0,$$
and the resultant of the two quadratic equations $ax^2+bx+c=0$, $a_1x^2+b_1x+c_1=0$ is the determinant of the fourth order
$$\begin{vmatrix} a & b & c & 0\\ 0 & a & b & c\\ a_1 & b_1 & c_1 & 0\\ 0 & a_1 & b_1 & c_1 \end{vmatrix} = 0.$$
For instance, the two equations $x^2+4x-21=0$, $x^2+2x-15=0$ are consistent, since their resultant is 0. It is evident that this method may also be employed to eliminate one of the unknowns from two simultaneous equations in two unknowns.

* It has been proved that this condition is sufficient as well as necessary. The arguments here employed prove only the latter.

29. Equivalent equations. In elementary algebra two equations are generally regarded as equivalent by definition if they have all their roots in common.* Similarly, two systems of simultaneous equations are regarded as equivalent by definition if all the solutions of one system are solutions of the other, and vice versa.
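As an aside on the dialytic method of Sec. 28, the determinant there described may be formed and evaluated mechanically. The following is a minimal sketch under our own assumptions: polynomials are given as lists of coefficients with the highest power first, and the names `sylvester_matrix`, `determinant` and `resultant` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction


def sylvester_matrix(f, g):
    """Rows of the dialytic determinant for f (degree m) and g (degree n).

    f and g are coefficient lists, highest power first. The first n rows are
    shifted copies of f, the last m rows shifted copies of g.
    """
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for shift in range(n):
        rows.append([0] * shift + list(f) + [0] * (size - m - 1 - shift))
    for shift in range(m):
        rows.append([0] * shift + list(g) + [0] * (size - n - 1 - shift))
    return rows


def determinant(matrix):
    """Determinant of a square matrix by elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a whole column of zeros below the diagonal
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # interchanging two rows changes the sign
        det *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[col])]
    return det


def resultant(f, g):
    """Resultant of f and g as the determinant of the dialytic array."""
    return determinant(sylvester_matrix(f, g))


if __name__ == "__main__":
    # The pair of quadratics from Sec. 28, which share the root x = 3.
    print(resultant([1, 4, -21], [1, 2, -15]))
```

For two quadratics the rows produced by `sylvester_matrix` are arranged exactly as in the determinant of the fourth order written in Sec. 28.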
On the other hand, it is frequently desirable to define the term equivalent with regard to a certain set of transformations, and to say that two expressions, or sets of expressions, are equivalent as regards a certain set of transformations if this set includes at least one transformation which carries the first of these expressions over into the second, and also at least one which carries the second over into the first. Two expressions which are equivalent as regards one set of transformations need not be equivalent as regards another set.

* In Jordan's Traité des substitutions, p. 271, two equations of the same degree are called equivalent if the roots of the one may be represented as rational functions of the roots of the other.

In the present article we shall adopt the former of these definitions of equivalence, and we shall first inquire what effect clearing of fractions may have upon certain rational equations in one unknown. It is convenient to premise the evident theorem: A necessary and sufficient condition that the sum of the $n$ rational numerical fractions
$$\frac{a_1}{b_1},\ \frac{a_2}{b_2},\ \ldots,\ \frac{a_n}{b_n},$$
in the form
$$\frac{a_1b_2b_3\cdots b_n + a_2b_1b_3\cdots b_n + \cdots + a_nb_1b_2\cdots b_{n-1}}{b_1b_2\cdots b_n},$$
shall be in its lowest terms is that each of the $n$ given fractions shall be in its lowest terms and that the denominators $b_1, b_2, \ldots, b_n$ are relatively prime in pairs. Let
$$\frac{f_1(x)}{\phi_1(x)} + \frac{f_2(x)}{\phi_2(x)} + \cdots + \frac{f_n(x)}{\phi_n(x)} = 0 \qquad (1)$$
be an equation in which each fraction is reduced to its lowest terms and the denominators are relatively prime in pairs, $f_1, \ldots, f_n, \phi_1, \ldots, \phi_n$ representing rational integral functions of $x$, not excluding the case when some of these are constants. When cleared of fractions this equation becomes
$$f_1(x)\,\phi_2(x)\cdots\phi_n(x) + \cdots + f_n(x)\,\phi_1(x)\cdots\phi_{n-1}(x) = 0. \qquad (2)$$
Suppose that $a$ is a root of (2), and hence
$$f_1(a)\,\phi_2(a)\cdots\phi_n(a) + \cdots + f_n(a)\,\phi_1(a)\cdots\phi_{n-1}(a) = 0. \qquad (3)$$
It is easy to see that none of the $\phi$'s vanishes for $x=a$. For instance, $\phi_1(a)=0$ would cause every term of (3) after the first to vanish, and would therefore imply that $f_1(a)\,\phi_2(a)\cdots\phi_n(a)=0$. As $a$ is then not a root of any of the functions $f_1(x), \phi_2(x), \ldots, \phi_n(x)$ (the first fraction being in its lowest terms and $\phi_1$ being prime to each of the other denominators), it cannot be a root of their product. That is, $\phi_1(a) \neq 0$. Since none of the $\phi$'s is 0 we may divide Eq. (3) by $\phi_1(a)\,\phi_2(a)\cdots\phi_n(a)$, and thus obtain
$$\frac{f_1(a)}{\phi_1(a)} + \frac{f_2(a)}{\phi_2(a)} + \cdots + \frac{f_n(a)}{\phi_n(a)} = 0.$$
This proves that every root of Eq. (2) is also a root of Eq. (1) and it is evident that every root of Eq. (1) is also a root of Eq. (2), since no root can be lost by multiplying both members of an equation by a rational integral function. Hence it results that Eqs. (1) and (2) are equivalent equations.

That Eqs. (1) and (2) are not necessarily equivalent if we omit either of the conditions that the $\phi$'s are relatively prime or that the fractions are in their lowest terms results directly from the following examples: The equation
$$\frac{x}{x-1} - \frac{1}{x-1} = 0$$
has no root, since dividing by 0 is excluded, and 1 is the only number that requires consideration, but
$$x(x-1) - x + 1 = 0$$
has 1 as a repeated root. It should be observed that the equation obtained by multiplying the former of these equations by the least common multiple of the denominators is $x-1=0$, and hence this has a root which is not a root of the equation in the fractional form. On the other hand, of the two equations,
$$\frac{x-1}{x^2-1} + \frac{1}{x} = 0, \qquad 2x^2 - x - 1 = 0,$$
the latter has the root $x=1$, while the former does not have this root. It is of especial interest to observe that all the roots of the former must be roots of the latter, since the latter was obtained by multiplying both members of the former by a rational integral function of $x$.

Geometrical considerations frequently throw additional light on the subject of equivalence of equations. For instance, the two equations,
$$\frac{1}{x} + \frac{1}{y} = 2 \quad\text{and}\quad x + y = 2xy,$$
represent two loci which have every point except the origin in common.
The latter of these is a hyperbola, and if the former could be plotted accurately its graph would be so nearly like that of the latter that no microscope would reveal the difference, since such an instrument could not reveal the missing point. It may be added that the rigid exclusion of division by 0 is not followed by all mathematicians and that many of the leading mathematicians of earlier times did not completely exclude the possibility of such division.

29. A few tests for equivalence of equations. If the two members of an equation are either multiplied or divided by a rational expression of the unknowns, which cannot be zero or infinity in the domain of rationality to which the unknowns are restricted, the resulting equation is equivalent to the original. Let $A=B$ be any equation, and let $K$ be any expression which cannot be zero or infinity for any of the values of the unknowns under consideration. The equations
$$KA = KB, \qquad \frac{A}{K} = \frac{B}{K}$$
may be written as follows:
$$K(A-B) = 0, \qquad \frac{1}{K}(A-B) = 0.$$
Since $K \neq 0$, or $\infty$, these equations can be satisfied only by those values of the unknowns which make $A=B$. It results directly that two equations which are equivalent in one domain of rationality are not necessarily equivalent in another. For instance, if the values of the unknowns are confined to real numbers, $K$ could be $x^2+1$, but $K$ cannot have this value if $x$ may be any complex integer. If the two members of an equation are increased or diminished by the same expression, the resulting equation is evidently equivalent to the original. This clearly includes the transposing of any term from one member of an equation to the other as well as the changing of the sign of each term of an equation. In transforming equations it is very important to observe whether the derived equations are actually equivalent to the original.
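Such an observation can be made mechanically by substituting each root of the derived equation in the original. The sketch below does this for the pair $\frac{x-1}{x^2-1}+\frac{1}{x}=0$ and $2x^2-x-1=0$ considered earlier; the function names are hypothetical, and exact fractions are assumed so that a vanishing denominator is detected as a division by zero.

```python
from fractions import Fraction


def original_member(x):
    """Left member of (x - 1)/(x^2 - 1) + 1/x = 0; fails where a denominator vanishes."""
    return (x - 1) / (x * x - 1) + 1 / x


def cleared_member(x):
    """Left member of the equation cleared of fractions, 2x^2 - x - 1 = 0."""
    return 2 * x * x - x - 1


def satisfies_original(x):
    """True only if the original equation is defined at x and is satisfied there."""
    try:
        return original_member(x) == 0
    except ZeroDivisionError:
        return False  # division by 0 is excluded, so x is not a root


if __name__ == "__main__":
    # The two roots of the cleared equation.
    for root in (Fraction(1), Fraction(-1, 2)):
        print(root, cleared_member(root) == 0, satisfies_original(root))
```

A root of the cleared equation which fails this test, as $x=1$ does here, is one that was introduced by the clearing of fractions.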
If a derived equation contains all the roots of the original and some others it is said to be redundant; if it lacks some of the roots of the original it is defective. From what precedes it is clear that the ordinary process of clearing of fractions leads either to an equivalent or to a redundant equation.

VI. A FEW REFERENCES

30. Text-books. The algebraic equation occupies a prominent place in algebra and some of its elementary properties are developed in the text-books on algebra for the secondary schools. More extensive developments of these properties may be found in the advanced text-books on this subject, such as
(1) Chrystal, Algebra, an elementary text-book, 2 vols., 2d edition, 1900.
(2) Capelli, Istituzioni di analisi algebrica, 4th edition, 1909.
(3) Weber, Lehrbuch der Algebra, 2 vols., 2d edition, 1898-99.
(4) Serret, Cours d'algèbre supérieure, 2 vols., 6th edition, 1910.
The last two of these works include a treatment of the Galois theory of equations while the first two omit this theory, but they give an elementary introduction to the theory of substitutions. In the first this introduction is very brief and incomplete. A large number of special treatises on the general theory of the algebraic equation have appeared, beginning with the works of Vieta in the early part of the seventeenth century. Among the modern works in the English language Burnside and Panton's Theory of Equations is probably the most generally known. The first three editions of this work appeared in one volume and excluded the Galois theory, while the fourth and fifth appeared in two volumes and include an introduction to substitution groups and the Galois theory of equations. Among the other treatises on this subject we may mention
(1) Dickson, Introduction to the Theory of Algebraic Equations, 1903.
(2) Cajori, An Introduction to the Modern Theory of Equations, 1904.
(3) Mathews, Algebraic Equations, 1907.
(4) Netto-Cole, Theory of Substitutions and its Applications to Algebra, 1892.
(5) Barton, An Elementary Treatise on the Theory of Equations, 2d edition, 1903.
(6) Bianchi, Lezioni sulla teoria dei gruppi di sostituzioni e delle equazioni algebriche secondo Galois, 1900.
(7) Vogt, Leçons sur la résolution algébrique des équations, 1895.
(8) Matthiessen, Grundzüge der antiken und modernen Algebra der litteralen Gleichungen, 1878.
The last of these works contains an account of many of the ancient methods which were used to solve equations and is rich in historical material. As a result of the rapid growth of historical knowledge during recent years some of this material has been found not entirely reliable. Certain phases of the theory of equations are presented in a very instructive manner in Klein's Elementarmathematik vom höheren Standpunkte aus, Autogr., 1908-09; and also in Bôcher's Introduction to Higher Algebra, 1907. An extensive list of treatises on this and other mathematical subjects may be found in the Mathematischer Bücherschatz by Ernst Wölffing. This extensive work is supposed to give a systematic list of the principal books and monographs appearing during the nineteenth century. It appeared in the Abhandlungen zur Geschichte der Mathematischen Wissenschaften, 1903.

31. Articles.
(1) Pierpont, "Galois' theory of algebraic equations," Annals of Mathematics, Vols. I and II, 1900, pp. 113 and 22.
(2) Bôcher, "Gauss's third proof of the fundamental theorem of algebra," Bulletin of the American Mathematical Society, Vol. I, 1895, p. 205.
(3) Sylvester, "On an elementary proof and generalization of Sir Isaac Newton's hitherto undemonstrated rule for the discovery of imaginary roots," Proceedings of the London Mathematical Society, Vol. I, 1865, p. 1.
(4) Van Vleck, "A sufficient condition for the maximum number of imaginary roots of an equation of the nth degree," Annals of Mathematics, Vol. IV, 1903, p. 191.
(5) Baker, "A balance for the solution of algebraic equations," American Mathematical Monthly, Vol. II, 1904, p. 224.
(6) Emch, "Hydraulic solution of an algebraic equation of the nth degree," ibid., Vol. VIII, 1901, p. 58.
(7) Moritz, "On certain proofs of the fundamental theorem of algebra," ibid., Vol. X, 1903, p. 159.
(8) McClintock, "A method for calculating simultaneously all the roots of an equation," American Journal of Mathematics, Vol. XVII, 1895, p. 89.
(9) Tanner, "A graphical representation of the theorems of Sturm and Fourier," Messenger of Mathematics, Vol. XVIII, 1889, p. 95.
(10) Kellogg, "A necessary condition that all the roots of an algebraic equation are real," Annals of Mathematics, Vol. XI, 1908, p. 97.
(11) Lambert, "On the solution of algebraic equations in infinite series," Bulletin of the American Mathematical Society, Vol. XIV, 1908, p. 467.
(12) Allardice, "On a limit of the roots of an equation that is independent of all but two of the coefficients," ibid., Vol. XIII, 1907, p. 443.
(13) Dickson, "On the theory of equations in a modular field," ibid., Vol. XIII, 1906, p. 8.
(14) Bauer, "Ueber die verschiedenen Wurzeln einer algebraischen Gleichung," Mathematische Annalen, Vol. LII, 1899, p. 113.
(15) Dedekind, "Ueber Gleichungen mit rationalen Coefficienten," Jahresbericht der deutschen Mathematiker-Vereinigung, Vol. I, 1892, p. 33.
(16) Lucas, "Résolution électromagnétique des équations," Comptes rendus de l'Académie des Sciences, Paris, Vol. CXI, 1890, p. 965.
A very extensive list of additional references to articles may be found in the Royal Society of London Subject Index Catalogue of Scientific Papers, Vol. I, 1908, pp. 156-87. A selected list of treatises and articles is contained in Felix Müller's Führer durch die mathematische Literatur, 1909, pp. 55-62.

VI

THE FUNCTION CONCEPT AND THE FUNDAMENTAL NOTIONS OF THE CALCULUS

By GILBERT AMES BLISS

CONTENTS (the numbers refer to sections)

I. INTRODUCTION, 1-3
   1-3, The need of a unifying conception in elementary mathematics.
II. VARIABLES AND FUNCTIONS, 4-18
   4-8, Definitions and examples; 9-10, Graphs of functions; Functions with discontinuous graphs (10); 11-17, Classification of functions; The location of the elementary functions (12-14); Applications in collegiate teaching (15-17); 18, Continuity of a function.
III. THE FUNDAMENTAL NOTIONS OF THE CALCULUS, 19-32
   19-21, The derivative and its interpretations; 22-25, The anti-derivative; Relations between a function and its anti-derivative (24); 26-30, The definite integral; Geometrical and physical quantities as definite integrals (26-28); Computation of definite integrals by means of anti-derivatives (29-30); 31-32, Relations between functions and their graphs.

I. INTRODUCTION

1. Euclidean geometry a logical model. The mathematical historian tells us that the most important contribution by Euclid to mathematical science was his systematization of geometrical principles already known to the mathematicians of his day, rather than the additions which he made to the science in the form of new theorems. His development of the structure of Euclidean geometry has itself not been kept inviolate from criticism in recent years. But whatever may be the faults of his presentation from the standpoint of present-day methods, it must nevertheless be recognized that he was among the earliest exponents of a now well-established logical form for the application of mathematics to the phenomena of nature.
The structure of such an application consists of two essential parts: first, a set of postulates suggested by our intuitive interpretation of natural phenomena; and second, a collection of definitions and of theorems stated in terms of the definitions and deduced by logical processes from the initial assumptions. The postulates are the foundation, and the definitions and theorems the superstructure of the science.

2. Imperfections in the presentation of other subjects. The Euclidean theory of geometry, which was presented thus early to mankind in a form attractive alike to the intuitive or to the logical type of mind, has for centuries occupied a prominent place in educational curricula, and it is no wonder that the theory remains to the present day the gem of our elementary mathematical courses. The marvel is, on the other hand, that the characteristics of the Euclidean theory which make it seem logically so complete, and so interesting to the mind sympathetically inclined to mathematical thinking, have apparently been overlooked to a very large extent in the presentation of other elementary subjects. Especially is this true in the case of algebra. To be convinced one needs only to take a cursory glance through the table of contents of almost any college or elementary text-book on the subject, and to note the heterogeneity of the subjects presented. Topics, related perhaps inherently but with no indicated relationships, follow each other in a confusion of radicals, exponents, progressions, imaginaries, probabilities, and other algebraic conceptions, in a way which must tend to develop a very disjointed understanding on the part of the beginner. It is true that efforts have been made with considerable success, in some of the more recent text-books, to effect unity of presentation by grouping the usual elementary algebraic conceptions about the equation as a central notion.
It is true also that heterogeneity of presentation is much less marked in the cases of trigonometry and analytics, largely because the mathematical material designated by those titles is in itself more homogeneous. But very little conscious effort seems to have been made to make these subjects appear in their proper light as interrelated parts of a larger mathematical theory.

3. A remedy in the function concept. It is one of the purposes of the present paper to show how this lack of unity may be remedied with the help of a very important mathematical conception which is called a function. The notion of a function has been inserting itself into the consciousness of mathematicians in its most general guise since the time of Dirichlet, though long before present and recognized in more special forms. It is interesting to note that the definition of Dirichlet, which seems very abstract in comparison with those of the earlier mathematicians, was really devised as a result of his consideration of a practical problem involving the representation of functions by means of series, that of the flow of heat. His definition is a simple one, though at first sight it seems to be too general to serve as a basis for any extensive theory of functions or to have important applications in other branches of science. In order to explain it one must first consider what is meant by a variable, in terms of which the notion of a function is defined.

II. VARIABLES AND FUNCTIONS

4. Definition of a variable. A variable is simply a symbol, say $x$, which in a given discussion may be used to denote any one of a given set of objects. By means of a variable we are enabled to express in terms of a single statement involving $x$, a property which is common to all objects of the set. Thus we may say that for any positive integer $x$ the number $3x+2$ will have a remainder 2 when divided by 3, and we express thereby a property of each of the positive integers.
Or if we wish to say that any curve joining two given points $p$ and $q$ is longer than the straight line $pq$, we may designate by $C$ any one of the curves and state that $C$ is longer than the line $pq$. The set of objects, any one of which is represented by $x$, is called the range of the variable, and in elementary mathematics it usually consists of numbers, though the example just given shows that it may contain elements of a quite different character.

5. Example of a function. The word function was originally used to denote any power of a number, but with the introduction of the calculus it came to mean any mathematical expression involving a variable $x$, the value of which could be calculated when that of $x$ was assigned. The definition of Dirichlet is more general still, and we may understand it better perhaps by examining first some simple examples. Consider the accompanying table, in which the numbers in the first column are hours of the day, and the numbers in the second the corresponding temperatures.

Hour.    Temperature.
  8        52.2
  9        53.4
 10        61.0
 11        69.8
 12        75.7
  1        77.8
  2        78.1
  3        76.9
  4        72.55
  5        67.8
  6        66.8
  7        60.0
  8        51.1

If $x$ is a variable which may stand for any one of the hours of the day, and $y$ a variable representing the temperatures, then the table sets up a correspondence between the values of $x$ and $y$ in such a way that whenever a value is assigned to $x$ a corresponding value of $y$ is uniquely determined, and $y$ is said to be a function of $x$. Similarly the mathematical formula $y = x/(x^2-1)$ makes a unique value of $y$ correspond to any given value of $x$ with the exception of $x = \pm 1$, and $y$ is again said to be a function of $x$. The range of the variable $x$ is in this case the totality of real numbers excluding the values $\pm 1$, and it may be seen without much difficulty that $y$ takes all values between $-\infty$ and $+\infty$.

6. Definition of a function.
With these examples in mind we may now define with Dirichlet a single-valued function of a variable $x$ to be a second variable $y$ so related to $x$ that whenever a value is assigned to $x$ from the $x$-range, a corresponding value of $y$ is uniquely determined in the $y$-range. The correspondence between the objects in the $x$- and $y$-ranges need not be made, however, by means of what is usually called a mathematical expression, but may be determined in any way whatsoever, provided only that it is unique for each object which can be represented by $x$. The essentials in the definition are evidently the independent variable $x$ with its range, and the correspondence between $x$ and $y$. The range of the dependent variable $y$ is of course not necessarily exhausted by the correspondence; it may contain elements which do not correspond to any object in the $x$-range.

7. Examples of functions. To a person encountering this definition for the first time, it would no doubt seem very artificial and too general to be of any great service. It is possible, however, to develop an elaborate and important theory of functions for the very general case when the range of $x$ is left entirely arbitrary while that of $y$ consists of real numbers, as has recently been shown by Professor Moore* in his study of a field of mathematics which he has called General Analysis. The generality of the definition depends not only upon the absence of any specification as to the character of the ranges of $x$ and $y$, however, but also upon the freedom which it leaves in the choice of the functional correspondence, even when the ranges consist of real numbers. The example of the hours and temperatures above illustrates the existence of functions which do not involve a mathematical expression of any kind, and many similar examples could be found in the tabulated results of statistical observations or of physical experiments. But the correspondence may also be entirely artificial.
For let $x$ range over all the real numbers from zero to one. Corresponding to any rational values of $x$ suppose that $y$ is to have the value $+1$, and for irrational values of $x$ the value $-1$. Then $y$ is a function of $x$ over the interval from zero to one, and the range of $y$ consists only of two elements. This example suggests the definition of a constant function, that is, one for which the dependent variable $y$ takes but a single value over the whole $x$-range. A constant, in general, may be regarded as a variable for which the range consists of a single number. For the function last given the $x$-range has an infinity of values while the $y$-range consists of only two values, and a table could be arranged which would indicate this functional correspondence. But if the $y$-range contained also an infinity of different $y$-values, it might be impossible to express the correspondence by listing the values of $x$ opposite the corresponding values of $y$ in a table. For by far the majority of functions met with in mathematical theories the correspondence is specified not by means of a table, but by means of a mathematical formula which takes the place of the table, and which often implicitly defines the range of the independent variable $x$, as well as the correspondence. Thus the formula $y = x/(x^2-1)$ given above defines a function of $x$ which is well defined for all real values of $x$ except the values $x = \pm 1$, and as has been noted the $y$-range consists of all values between $-\infty$ and $+\infty$.

* E. H. Moore, the New Haven Mathematical Colloquium. The lectures in this volume were delivered by Professor Moore and others in September, 1906.

8. Functions with other than numerical ranges. The preceding examples illustrate the definition of a function when the range of the independent variable consists of numbers. It is easy to find functions for which this range has elements of a different kind.
If two points $p$ and $q$ are joined by a curve $C$, the area $pqrs$ shown in the figure is uniquely determined and depends upon the form of the curve. A functional correspondence is thus set up between the variable $C$ whose range is the totality of curves joining the points $p$ and $q$, and the variable $A$ which represents the area. The statement that a dependent variable $y$ is a function of an independent variable $x$ is usually expressed in the form of an equation $y = f(x)$, where the symbol $f(x)$ is an abbreviation for the phrase "function of the independent variable $x$."

[Fig. 1.]

If we wish, therefore, to express the fact that the area $A$ described is a function of $C$, we may represent it by the symbol $A(C)$. Similarly the length of the curve $C$ is another function of $C$ which may be denoted by $L(C)$, as is also the surface area $S(C)$ generated by the arc $C$ when the whole figure is revolved about the horizontal line as an axis. There is a famous problem from a domain of mathematics called the calculus of variations, which gives rise to a function of precisely the type which has just been considered. Suppose that it is desired to find the curve $C$ along which a marble will roll in the shortest time from $p$ to $q$. The time $T$ in this case depends upon the form of the curve $C$ and is a function $T(C)$. It is impossible to describe here the interesting controversy which arose between John Bernoulli, the proposer of the problem, and his brother James as a result of their rival solutions, or to undertake a detailed study of the methods by means of which they found the minimizing curve. Suffice it to say that in their work is found the origin of the whole subject of the calculus of variations. But it may not be uninteresting to see the result which they obtained. The minimizing curve is a cycloid, that is, the locus of a point on the circumference of a circle which rolls along a straight line.
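The function $T(C)$ may be illustrated numerically in one special case. The sketch below rests on assumptions of our own which are not in the text: the marble is idealized as a frictionless bead starting from rest under constant gravity $g = 9.81$ (its rolling is ignored), the depth $y$ is measured downward from $p$, the point $q$ is taken at $(\pi, 2)$, and the names `descent_time`, `straight_line` and `cycloid` are hypothetical. Each curve is replaced by a chain of short chords, and the speed on a chord is taken from the depth of its middle point.

```python
import math

G = 9.81  # assumed constant acceleration of gravity


def descent_time(curve, n=20000):
    """Approximate T(C) for a curve t -> (x, y), 0 <= t <= 1, with y measured downward.

    The curve is replaced by n chords; on each chord the speed sqrt(2 G y)
    is evaluated at the mean depth of its two ends.
    """
    total = 0.0
    x0, y0 = curve(0.0)
    for k in range(1, n + 1):
        x1, y1 = curve(k / n)
        chord = math.hypot(x1 - x0, y1 - y0)
        mean_depth = 0.5 * (y0 + y1)
        total += chord / math.sqrt(2.0 * G * mean_depth)
        x0, y0 = x1, y1
    return total


def straight_line(t):
    """Straight line from p = (0, 0) to q = (pi, 2)."""
    return (math.pi * t, 2.0 * t)


def cycloid(t):
    """Cycloid generated by a circle of radius 1, from p = (0, 0) to q = (pi, 2)."""
    theta = math.pi * t
    return (theta - math.sin(theta), 1.0 - math.cos(theta))


if __name__ == "__main__":
    print("straight line:", descent_time(straight_line))
    print("cycloid:      ", descent_time(cycloid))
```

Under these assumptions the time along the cycloid comes out smaller than that along the straight line, although the cycloid is the longer path.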
In the figure the cycloid is shown inverted with its steeper part as it should be near the point $p$, so that the marble accumulates a high velocity at the beginning of its fall. Similar minimum problems can be stated with respect to two others of the functions described above. For the length function $L(C)$ the curve which provides the minimum value is evidently the straight line joining $p$ and $q$; and the curve which describes the surface of minimum area when it is revolved about the horizontal line is a catenary, whose form is that of a heavy chain allowed to hang freely from two points of suspension. An example of a function whose independent variable has a range of still a different kind would be the shortest or straight-line distance $D(q)$ between a fixed point $p$ and a variable point $q$. The independent variable is now the point $q$, which may range over the whole plane, while the range of the dependent variable $D$ is the totality of numbers between 0 and $\infty$. This and the preceding examples show that there are important examples in which the elements of the range of the independent variable are real numbers, arcs of curves, or points of a plane, and many other examples could be devised to exhibit a variety of ranges which it would not be possible to enumerate here, even if the patience of the reader permitted. In all of the examples which have been given the range of the dependent variable has consisted of real numbers, but this is also not necessarily the case, as appears in the theory of the so-called integral equations and in parts of the General Analysis of Professor Moore, referred to above.

9. Graphs of functions. For functions both of whose ranges are real numbers a graphical representation was devised by Descartes, which is too familiar to need a detailed description here. But a few remarks concerning it may not be amiss. A horizontal line is taken with a zero point $O$ and a unit of measure (see Fig.
2), so that to each value of $x$ there corresponds a point of the line to the right of $O$ if $x$ is positive, and to the left if negative. In order to represent the value of a function at a point $x$, a perpendicular is erected equal in length to the number $f(x)$. When $f(x)$ is positive, the distance is measured upward; when negative, downward. The horizontal axis is called the axis of abscissas, and the vertical lines are called ordinates. It is customary to erect a perpendicular called the $y$-axis at the point $O$, but such a line is not at all essential to the representation of the function, and it is interesting to note that in some of the early editions of Descartes' Analytical Geometry the line does not appear.

[Fig. 2.]

If the range of $x$ has only a finite number of elements, as in a statistical table, then all the values of the function may be plotted in this way, the result being a picture which is much more suggestive and easy of interpretation than the table of values itself. For example, the accompanying graph shows a baby's ages measured in weeks as abscissas, and weights measured in pounds as ordinates. Even a bachelor's eye suffices to discover at a glance the unhappier periods which are suggested only after some study by the table from which the graph was made. The ordinates under the dots represent the values of the weight function in this case, the straight lines being drawn only to assist the eye in passing from one significant point to the next. On the other hand if the $x$-range contains an infinity of elements it is usually impossible to make a complete picture of the function. One must be satisfied with plotting as many points as is convenient or desirable from the nature of the problem, and these may then be joined by a continuous curve which will give an idea of the functional values and their variation as $x$ is changed. For example consider the function $y = x/(x^2-1)$.
If only the points indicated by the dots are actually calculated, the rest of the figure must be drawn free hand. (See Fig. 3.)

[Fig. 3.]

At the best the graph can only be regarded as an approximate representation of a function, the errors which occur being essentially of the following two different kinds. Since it is impossible to represent distances exactly by means of marks which have finite dimensions, a first source of error would be in the use of the drawing instruments and would depend not only upon the inadequacy of the instruments themselves, but also upon the skill of the draughtsman. A second source of error lies in our inability to plot more than a finite number of points and the consequent necessity of filling in arbitrarily by far the major portions of the graph. The magnitude of the errors due to the first cause can be estimated only by an experimental examination of the accuracy of the instruments and the personal equation of the operator. Similarly it is quite impossible without experimental evidence to say what the error will be which is due to the process of "filling in" the curve, provided that the curve joining the plotted points is drawn arbitrarily. But if the inaccuracies of the instruments are neglected, and if it is agreed to join the finite number of points which are actually plotted by straight lines, then it is possible to show that certain types of functions are fairly represented by such broken lines, and to show also that the error of representation for the functional values over a given interval can be made arbitrarily small by plotting a sufficient number of points sufficiently near together. The proof of this statement is made with the help of a property of functions called uniform continuity, and will be given later in the paper.
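The behavior of the broken-line representation may be observed experimentally before it is proved. The following sketch, offered only as an illustration, plots the function $y = x/(x^2-1)$ at equally spaced points of the interval from 2 to 4 (an interval chosen by us so as to avoid the values $\pm 1$), joins them by straight lines, and records the largest deviation found on a fine grid of trial abscissas. The names `f` and `broken_line_error` and the size of the trial grid are assumptions of the sketch.

```python
def f(x):
    """The function y = x / (x^2 - 1), defined for x different from +1 and -1."""
    return x / (x * x - 1)


def broken_line_error(func, a, b, n_points, n_check=2000):
    """Largest deviation between func and the broken line through n_points
    equally spaced plotted points on [a, b], as observed on a trial grid
    of n_check equal subdivisions."""
    xs = [a + (b - a) * i / (n_points - 1) for i in range(n_points)]
    ys = [func(x) for x in xs]
    worst = 0.0
    for j in range(n_check + 1):
        x = a + (b - a) * j / n_check
        # Index of the straight segment on which x falls.
        i = min(int((x - a) / (b - a) * (n_points - 1)), n_points - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        on_line = ys[i] + t * (ys[i + 1] - ys[i])
        worst = max(worst, abs(func(x) - on_line))
    return worst


if __name__ == "__main__":
    for n_points in (5, 9, 17, 33):
        print(n_points, broken_line_error(f, 2.0, 4.0, n_points))
```

The observed deviation diminishes as the plotted points are taken nearer together, which is the behavior asserted above for functions of this kind; the trial grid gives, of course, only an experimental estimate and not a proof.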
For the present it may be stated that all of the functions which occur in elementary mathematics can be represented with a degree of accuracy proportionate to the desire and the patience of the investigator.

10. Functions with discontinuous graphs. From what has been said it will doubtless be inferred that there exist functions which cannot properly be represented by a graph, and in fact the function referred to above which is equal to +1 for rational values of x between 0 and +1, and equal to -1 for irrational values, is of this character. For a line parallel to the x-axis representing the values of the function for rational points would, according to the usual interpretation of the graph, imply +1 as the values of the function for irrational values of x also. Nor is it true, as might be supposed, that any function for which the functional correspondence is defined by means of a mathematical formula can be represented by a curve. Professor Pierpont * has set up a number of interesting formulas which have curious geometrical interpretations, one of which represents the function having no proper graph, which has just been mentioned. He begins by considering a function which he calls signum x, or sgn x, and which is defined by the following conditions:

$$\operatorname{sgn} x = +1 \ \text{for}\ 0 < x \le 1, \qquad \operatorname{sgn} x = 0 \ \text{for}\ x = 0, \qquad \operatorname{sgn} x = -1 \ \text{for}\ -1 \le x < 0.$$

* The Theory of Functions of Real Variables, p. 202.

FIG. 4.

This function has the relatively simple formula

$$\operatorname{sgn} x = \lim_{n \to \infty} \frac{2}{\pi} \arctan nx. \tag{1}$$

For if x is positive the limit of $\arctan nx$ has the value $\pi/2$, if x is zero the value 0, and if x is negative the value $-\pi/2$. By means of the function sgn x a formula can be found for the function which takes any given value a for rational values of x, and any other number b for irrational values. For consider the function

$$g(x) = a + (b-a) \lim_{n \to \infty} \operatorname{sgn}\left(\sin^2 n!\,\pi x\right). \tag{2}$$

If x is a rational number the expression in the parenthesis becomes and remains equal to zero for a sufficiently large n, since $n!\,\pi x$ becomes a multiple of $\pi$. Hence $\operatorname{sgn}(\sin^2 n!\,\pi x) = 0$, and g(x) has the value a, as a result of the properties of sgn x. For an irrational x the product $n!\,\pi x$ is never a multiple of $\pi$, and hence $\sin^2 n!\,\pi x$ is some number between 0 and 1. For such values of x, $\operatorname{sgn}(\sin^2 n!\,\pi x) = 1$, and g(x) has the value b.

Another example which Professor Pierpont gives is the function

$$\lim_{n \to \infty} \frac{nx}{1 + nx}.$$

For any value of x different from zero this has the value unity, while for x = 0 it is equal to zero. Still more curious is the function

$$y = \lim_{n \to \infty} \frac{(1 + \sin \pi x)^n - 1}{(1 + \sin \pi x)^n + 1},$$

which has the discontinuous graph shown in the accompanying figure.

FIG. 5. FIG. 6.

If x is any integer this expression is evidently equal to zero. For any value of x between 0 and 1 the parenthesis $(1 + \sin \pi x)$ is greater than unity and its nth power approaches infinity as n increases indefinitely. Hence the limit of the fraction is +1. On the other hand, if x is between 1 and 2, then $(1 + \sin \pi x)$ is less than one and its nth power approaches zero as n increases, so that the limit of the fraction is in this case -1.

11. Classification of functions. The examples which have just been given and those which precede show clearly the necessity of some methods of classifying functions, if an intelligent study of them is to be made. There are several methods in use, each of which is important in some branch of the function theory, but one of them, which will presently be explained, is especially interesting from the standpoint of the elementary functions. On the basis of this classification some suggestions will also be made with regard to the presentation of the elementary subjects, suggestions which it is hoped will not seem too radical to be useful.
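Before going further, the limit formulas of Sec. 10 may be tried on a machine. The sketch below is illustrative only and makes several assumptions: the names `sgn_approx` and `step_approx` are invented here, a large but finite n stands in for the limit, and n is kept moderate in the second function because the nth power would otherwise exceed the range of floating-point numbers. The function (2) is deliberately omitted, since a floating-point number cannot distinguish rational from irrational values.

```python
import math


def sgn_approx(x, n):
    """Formula (1) with a finite n in place of the limit."""
    return (2 / math.pi) * math.atan(n * x)


def step_approx(x, n):
    """The fraction ((1 + sin(pi x))^n - 1) / ((1 + sin(pi x))^n + 1)
    for a finite n; its limit is +1, 0, or -1 according to x."""
    p = (1 + math.sin(math.pi * x)) ** n
    return (p - 1) / (p + 1)


for x in (-0.3, 0.0, 0.3):
    print(x, sgn_approx(x, 10**6))

for x in (0, 0.25, 0.5, 1, 1.25, 1.5):
    print(x, step_approx(x, 200))
```

For a large n the printed values lie very near -1, 0, and +1, in agreement with the limits described in the text.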
In endeavoring to introduce any pedagogical improvement the teacher is always hampered by the conservatism of the printed page. An alteration in method, in order to be successful with elementary students, must be mild enough to be adapted to the printed machinery already at hand; or if it is a radical reform, it must be accompanied by a well-written and practical text, in order to be at all effective or far reaching. The suggestions which are to be made here are of the milder sort, with the possible exception of those referring to algebra, where it seems to the writer at least that a thorough reorganization of subject-matter might lead to a very great improvement.

12. Algebraic functions. The simplest type of a function is the polynomial

$$y = a_0 x^m + a_1 x^{m-1} + \dots + a_m,$$

sometimes called a rational integral function, after which come the rational functions or quotients of two polynomials,

$$y = \frac{a_0 x^m + a_1 x^{m-1} + \dots + a_m}{b_0 x^n + b_1 x^{n-1} + \dots + b_n}. \tag{3}$$

Both these types of functions are formed with the help only of the four processes of addition, subtraction, multiplication, and division, and the next class of functions which would naturally suggest themselves would be those expressible by means of the four processes just mentioned with the addition of extraction of roots. But it is better to regard functions so constructed, as well as the polynomials and rational functions, as belonging to a larger category called algebraic functions, which are defined as follows. Suppose an equation

$$a_0(x) y^n + a_1(x) y^{n-1} + \dots + a_{n-1}(x) y + a_n(x) = 0 \tag{4}$$

is given, in which the coefficients of the powers of y are themselves polynomials in x. If any value is assigned to x, the resulting equation in y will have a certain number of roots, in general n. To any x, therefore, the equation assigns a number of values of y, and y is said to be a multiple-valued function of x.
Evidently a polynomial or a rational function is an algebraic function, the equation which y satisfies in the latter case being easily found from equation (3) by clearing of fractions. It is not so easy to show that any function which is expressible by means of radicals is algebraic, but a few examples will indicate very well how this may be the case. Take the functions

$$y = \sqrt[3]{x + \sqrt{x}}, \qquad y = \sqrt{1+x} + \sqrt{1-x}.$$

By the usual algebraic methods of rationalization the variables x and y are found to satisfy, respectively, the equations

$$y^6 - 2xy^3 + x^2 - x = 0, \qquad y^4 - 4y^2 + 4x^2 = 0,$$

and in general it can be proved that any function found by addition, subtraction, multiplication, division, and extraction of roots, satisfies an equation of the type (4).* The various values which can be assigned to the radicals account for the multiple values of the function. If it is remembered that equations of the fifth degree and higher can be solved by radicals only in special cases, it appears at once that the class of algebraic functions includes many which cannot be calculated by means of these elementary processes alone. The generalization of the properties of functions determinable in terms of radicals to the corresponding properties for algebraic functions of the most general type has furnished one of the most fruitful and interesting fields of mathematical research.

* See Monograph V, sec. 7; Monograph IV, Appendix II.

13. Transcendental functions. The trigonometric functions and the inverse trigonometric functions, the logarithm and the exponential, as well as an infinity of other functions which appear only in the higher analysis, do not satisfy any equation of the type (4) and have been given the name transcendental functions.† The values of these functions cannot be calculated analytically by a finite number of additions, subtractions, multiplications, and divisions, but depend upon an infinity of such operations indicated by means of a power series. As examples of such series may be cited the well-known ones for the sine, logarithm, and exponential:

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots,$$

$$\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots,$$

$$e^x = 1 + x + \frac{x^2}{2!} + \dots.$$

† For proof of the transcendence of the numbers e and $\pi$ see Monograph IX.

A function whose values are expressible by means of a power series is called an analytic function, and it can be shown that not only transcendental functions, but also all of the algebraic functions are expressible in this way. Even this very general category of analytic functions does not exhaust all of the possibilities for functions of a real variable, but for the purposes of the present paper it will be unnecessary to pursue the classification further. The function (2) which has so often been used as an illustration before is an example of a function which for real values of x is not expressible by means of a power series, and there are many others. The results of the classification, as far as it has been made, can be summarized most concisely in the form of the following table:

Analytic functions.
    Algebraic functions.
        Rational functions.
            Polynomials.
            Rational fractions.
        Irrational functions.
            Those expressible by means of radicals.
            Those not so expressible.
    Transcendental functions.
        The trigonometric functions and their inverses.
        The logarithm and the exponential.
        Other functions of less elementary character.
Non-analytic functions.

14. The trigonometric and exponential functions are transcendental. There is an objection to this classification from the elementary standpoint, which ought to be mentioned. It is the difficulty in proving that all functions expressible in terms of radicals are algebraic, and the necessity of proving that the transcendental functions do not have this property.
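The three power series cited in Sec. 13 can be summed to a finite number of terms and compared with the values given by a computing library. This is a small sketch offered only as an illustration: the function names and the numbers of terms are assumptions, and Python's `math` module serves merely as a reference for comparison.

```python
import math


def sin_series(x, terms):
    """Partial sum of x - x^3/3! + x^5/5! - ..."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        # Pass from the term in x^(2k+1) to the term in x^(2k+3).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def log1p_series(x, terms):
    """Partial sum of x - x^2/2 + x^3/3 - ..., for -1 < x <= 1."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))


def exp_series(x, terms):
    """Partial sum of 1 + x + x^2/2! + ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)
    return total


print(sin_series(1.0, 10), math.sin(1.0))
print(log1p_series(0.5, 40), math.log(1.5))
print(exp_series(1.0, 20), math.exp(1.0))
```

A finite number of additions, multiplications, and divisions thus gives only an approximation, which improves as more terms are taken.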
In one of the accompanying monographs * it has been shown that all the numbers expressible in terms of quadratic radicals only are the roots of an algebraic equation, and a similar proof could be made for functions of x which are so expressible. But for the radicals of higher orders the problem is a much less elementary one and cannot be undertaken here.

* Monograph VIII, secs. 5, 6.

Professor Pierpont has given a simple proof that the function y = sin x cannot satisfy an equation of the form (4). For if there were such an equation, then there would be one of the lowest degree with the same property and of the form

$$y^n + a_1(x) y^{n-1} + \dots + a_n(x) = 0, \tag{5}$$

where the coefficients are now rational fractions in x. On account of the periodicity of the sine function, the two equations

$$y^n + a_1(x + 2m\pi) y^{n-1} + \dots + a_n(x + 2m\pi) = 0,$$

$$[a_1(x + 2m\pi) - a_1(x)] y^{n-1} + \dots + [a_n(x + 2m\pi) - a_n(x)] = 0,$$

the second being found by subtracting Eq. (5) from the first, would also have to be satisfied for any integral value of m. Since a rational fraction in m can vanish for only a finite number of values, it follows that if m is properly chosen the coefficients of the last equation do not vanish. That equation is of degree n-1 in y, and hence the hypothesis that Eq. (5) is an equation of lowest degree satisfied by y is contradicted. If the function y = sin x is not algebraic, then the inverse function x = arc sin y cannot be, for an equation of the type (4) for one of these functions would determine the algebraic character of the other also. Similar proofs can be made for all of the trigonometric functions, and also for the exponential $y = e^x$ and its inverse x = log y, provided that imaginary values of the variable x are admitted. The proof given above depends upon the periodicity of the sine function, and it is known that the exponential has similarly the imaginary period $2\pi\sqrt{-1}$.

15. Applications of the function concept in collegiate teaching.
In a preceding paragraph it was suggested that the classification of functions which has been made might be helpful in relating to each other the different parts of the undergraduate curriculum. In order to see how this may be done let us first of all consider the topics which are treated in the elementary courses in their relation to the table. The subject of study in trigonometry is the group of functions which have been classified as transcendental, emphasis being laid on the trigonometric functions and their inverses. The exponential is usually considered only as an introduction to logarithms, and the logarithmic function itself only so far as is necessary to enable the student to make a successful mechanical use of logarithmic tables. It is hard to say with precision where the topics treated of in algebra belong in the table, but most of them are related to the polynomials or rational fractions, and it is proposed to show precisely how this relationship may profitably be employed. Analytic geometry, on the other hand, is concerned with the graphical representation of functions and with the properties of the elementary algebraic functions which are defined by equations

$$ay + bx + c = 0, \qquad ay^2 + (bx + c)y + (dx^2 + ex + f) = 0,$$

of the first and second degrees in x and y.

16. Objections to present methods. There are numerous objections which might be made to the way in which these subjects are usually presented, a few of which will suffice to show at least the possibility of improvement. In the first place there is one toward the removal of which much has recently been done. That is the now somewhat obsolete tendency to confine the graphical study of functions entirely to the courses in analytics.
The graphical representation of a function is a device of the utmost importance not only in the study of the conic sections and the straight line, but also in the study of all the other elementary functions, and the student cannot be made familiar with it too early in his mathematical course. A second and more justifiable objection is the lack of attention given to the exponential and logarithm. It is safe to say that none of the functions listed in our table have wider or more frequent applications, and yet there is none which the student understands with so little thoroughness at the end of his freshman year, his only study of them having been for the purpose of enabling him to attain a certain mechanical skill in the use of logarithmic tables. The lack of unity in algebra courses and the desirability of graphical methods have already been pointed out. It may also be added that the elementary notions of the calculus can be introduced with much profit at suitable points in both algebra and analytics. In the analytics, especially the process of finding the tangent line to a conic involves the calculus notion of the slope of the tangent, and yet it is a common custom of writers on the subject to avoid carefully the notions and notations of the more advanced subject. It is difficult to account for this tendency on the part of our text-book writers, except on the theory that one should never encroach on a neighbor's property, a principle which is good when applied to real estate, but hardly commendable in a scientific treatise. 17. Suggestions for improvement. What are then the conclusions and the suggestions for improvement which can be drawn from these objections?
In the first place does it not seem proper that each elementary course, since mathematics under our present collegiate mechanism must be divided into courses, should have to do with a particular class of functions, and should not the purpose of the course as a study of those functions be set clearly before the student at the beginning and re-emphasized at proper intervals until it is clearly understood? If the answer is affirmative, not only should trigonometry be concerned with the elementary transcendental functions and analytics be a study of the simple irrational algebraic functions, but the subject-matter of algebra should be related to the study of the rational algebraic functions, the polynomials and the rational fractions. What is unrelated should be relegated to its proper place in some other part of the mathematical curriculum. Furthermore, the treatment of these functions should be complete as far as possible at the stage in which the student finds himself, and illumined by a foresight on the part of the instructor of the conceptions of the calculus. There are good reasons why the differentiation of the transcendental functions should not be considered in trigonometry, for the limiting processes involved are too complicated for the elementary student, but much can be said in favor of the early consideration of derivatives and anti-derivatives of polynomials (notions which will be explained later in this paper) in a course in algebra, and in favor of the study of the derivatives of the elementary algebraic functions which occur in analytical geometry. Let us outline then a course of study for the freshman year in college, which is not to depart too radically from the present plan as usually followed, and yet which may afford a systematic treatment of the elementary functions.
The course should begin with a consideration of the function concept, by means of special examples perhaps, and with frequent applications of the graphical representation of a function which should be continued throughout the entire course. The exponential and the logarithm might well be studied next on account of their importance in numerical computation, in particular in the plotting of other functions. Their graphs can be readily drawn without the use of a table, if it is noticed that $y = a^x$ can be plotted very easily, and that the graph of the logarithm can then be found by simply rotating the plane about a line through the origin making an angle of 45° with the x-axis. After these preliminaries the usual course in trigonometry can be given with considerable economy in time on account of the familiarity which the student has already gained with graphical methods and the use of the logarithmic tables.

The course in algebra is the one in which it seems that the notion of a function can be used to effect the greatest improvement. It is perhaps not easy to see how all of the topics usually studied in algebra can be related to the study of the rational functions, and on this account a brief outline of a course which might be given is to be inserted here. Let the course begin with an explanation of the kinds of functions which are to be studied, and show by means of examples, or more generally, that any function formed with the four elementary operations only is a rational fraction. This will give plenty of opportunity to exercise the student in the reduction of complex fractions. Following this a chapter on operations with polynomials should be given, including the division equation $f(x) = g(x)q(x) + r(x)$, synthetic division, and the computation of the coefficients of a polynomial

$$a_0(x+a)^m + a_1(x+a)^{m-1} + \dots + a_{m-1}(x+a) + a_m$$

by means of synthetic division.* Then take up linear functions and study their graphs and intersections, with the aid of determinants of the second order. The theory of quadratic functions affords occasion to emphasize the notion of a root of a polynomial, and may be used to introduce two new conceptions, the slope of a curved line by means of which maxima and minima may be determined, and imaginary numbers. A short treatment of imaginary numbers and DeMoivre's theorem will not be amiss at this stage, to be followed by a graphical study of polynomials of higher degrees, including the theory of maxima and minima with the help of the derivative. The roots of polynomials should then be studied systematically with the remainder theorem as basis, the theorems upon which Horner's method is based receiving due attention. After a chapter on the numerical determination of roots, including Horner's method, take up the study of polynomials of special types. For example the polynomials $x^n - a$ lead to the theory of radicals and fractional exponents, whose properties can all and perhaps best be derived from the equation $x^n = a$; the polynomial $(a+x)^m$ suggests the binomial theorem, and the polynomials

$$a + (a+x) + (a+2x) + \dots + (a+nx), \qquad a + ax + ax^2 + \dots + ax^n$$

are progressions. When the elementary properties of polynomials have been exhausted, the graphical theory of rational fractions may be developed, followed by a study of indeterminate forms and undetermined coefficients as applied to partial fractions.

* See for example Fine's College Algebra, sec. 422.

Chapters on series, permutations and combinations, and probabilities fit less easily into the elementary function theory, although series may be regarded as a natural generalization from polynomials with a finite number of terms to those with an infinite number, and a new proof of the binomial theorem might be made the excuse for the introduction of the formulas for combinations.
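The use of synthetic division mentioned in the outline above, both for the remainder theorem and for rewriting a polynomial in powers of x + a, can be sketched in a few lines. The code is an illustration under stated assumptions rather than the procedure of any particular text-book: the names `synthetic_division` and `expand_about` are invented here, and coefficients are listed from the highest power downward.

```python
def synthetic_division(coeffs, c):
    """Divide a polynomial by (x - c).
    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder)."""
    row = [coeffs[0]]
    for a in coeffs[1:]:
        row.append(a + c * row[-1])
    return row[:-1], row[-1]


def expand_about(coeffs, a):
    """Coefficients of the same polynomial written in powers of (x + a),
    highest power first, found by dividing repeatedly by (x + a)."""
    remainders = []
    current = list(coeffs)
    while current:
        current, r = synthetic_division(current, -a)
        remainders.append(r)
    return remainders[::-1]


# Remainder theorem: the remainder on division by (x - 1) is the value at x = 1.
print(synthetic_division([2, 0, -1, 5], 1))   # ([2, 2, 1], 6)

# 2x^3 - x + 5 = 2(x+2)^3 - 12(x+2)^2 + 23(x+2) - 9
print(expand_about([2, 0, -1, 5], 2))         # [2, -12, 23, -9]
```

The successive remainders are the coefficients sought, beginning with the constant term, which is why the list is reversed at the end.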
The number of combinations of n things k at a time is $\frac{n!}{k!\,(n-k)!}$, and it can be argued that the number of terms of the form $a^{n-k}x^k$ which occur in the product $(a+x)^n$ is also equal to this number. No mention has been made of a place for probabilities or the theory of determinants. The former might well give place to topics which are more important at this stage of the student's course, and the latter really belongs in a course in the theory of equations, or else in solid analytical geometry.

The course in plane analytic geometry needs but few remarks aside from those which have been made above. It should be devoted to the theory of the simple irrational functions, including the solution of simultaneous quadratic equations, with applications to intersections of conics, and an introduction to the process of finding the derivative of an algebraic function with its interpretation in the problem of determining the slopes of tangents. The detailed study of the differentiation of transcendental functions and of algebraic functions in general must of course be left for the course in calculus, where functions are studied from a somewhat different standpoint. In the calculus the continuity, differentiation, and integration of functions hold the most prominent place in our attention, and on the basis of the behavior of functions under these operations, other classifications besides the one given in the table above can also be made, which are more important for purposes of the higher analysis.

18. Continuity of a function. A discussion of functions would not be complete without a description of what is meant by the property of continuity mentioned in the preceding paragraph. Speaking very roughly, a function is continuous when it has an unbroken graph. Thus the function $y = x/(x^2-1)$ is continuous for every value of x except the values $x = \pm 1$. The function (1), Sec. 10, is not continuous at x = 0, for its functional values jump from -1 to 0 and to +1, as x increases through this value.

FIG. 7.

Analytically a function f(x) is said to be continuous at a value x = a if a belongs to the range of x-values for which f(x) is defined, and if the difference f(x) - f(a) can be made arbitrarily small by taking x sufficiently near to a. If a function has this property it is evident that f(x) must approach f(a) as x approaches a. The definition is made still more precise by saying that f(x) is continuous provided that for any positive number $\epsilon$, however small, a second positive number $\delta$ can be found, such that f(x) - f(a) is numerically less than $\epsilon$ whenever x differs from a by less than $\delta$. Graphically interpreted, this means again that on the interval from $a - \delta$ to $a + \delta$ the difference of any pair of ordinates, f(x) and f(a), of the curve y = f(x) is less than $\epsilon$.

It will be understood readily, from their graphs, and it may be proved analytically, that polynomials are continuous functions for every value of x, and that a rational fraction is continuous at every value except those which make its denominator vanish. It is true of any other elementary functions also that they are continuous for every value of x, with the possible exception of certain isolated ones. Thus the trigonometric sine and cosine are everywhere continuous, while the tangent becomes infinite and therefore has a discontinuity for values of x which are odd multiples of $\pi/2$. But other functions may be discontinuous in a much more complicated way, as in the case of the function (2) which is discontinuous at every point. The continuity properties of the elementary functions are evidently relatively simple, and we may therefore leave them at this point in order to consider other important properties of functions which occur in the calculus.

III. THE FUNDAMENTAL NOTIONS OF THE CALCULUS

19. The three fundamental notions of the calculus.
The differential and integral calculus has to do with three fundamental notions associated with functions, to which are due most of the applications of the function theory in geometry, mechanics, and physics, as well as other branches of science. These three conceptions are called the derivative, the anti-derivative or indefinite integral, and the definite integral. All three may be interpreted geometrically and illustrated simply by means of polynomials, and it is proposed to explain them briefly here. The real difficulties of the calculus arise in applying the fundamental notions mentioned to the irrational algebraic and transcendental functions.

20. The derivative function and its interpretations. Let us agree to consider from this point on in our discussion only functions f(x) which are defined for x on the whole range of real numbers, or on a certain interval $a \le x \le b$ of that range. If x is thought of as indicating the time at any moment and increasing uniformly from the value at one end of its range to that at the other, the variable y = f(x) will simultaneously change in value. At each value of x the function will have a certain rate of change relatively to x, which may be defined in the following way. Consider an interval of x-values between $x$ and $x + \Delta x$, where $\Delta x$ is simply a symbol used to denote a quantity which is to be added to x. At the value $x + \Delta x$ the function y will have a value which may be represented by $y + \Delta y = f(x + \Delta x)$, and the difference between the values of y at the beginning and end of the interval is therefore $\Delta y = f(x + \Delta x) - f(x)$. The quotient

$$\frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{6}$$

represents then the average rate of change of the function as x varies from $x$ to $x + \Delta x$. The limit of this quotient, as $\Delta x$ decreases in size and approaches zero, is what is meant by the rate of change of the function at the value x.
Evidently if this limit exists it will be a variable which is uniquely determined at each value of x, and is therefore itself a new function usually denoted by the symbol f'(x). The function f'(x), which is called the derivative or rate of change of f(x), does not exist for every function, as might easily be shown for some of those which have already been defined. But for the elementary functions the rate of change can always be found. The manner in which it is calculated can be well illustrated by the familiar problem of the falling body. When a heavy particle falls from rest, the distance through which it has fallen in the time t is a function of t defined by the well-known formula $s = \tfrac{1}{2} g t^2$. If the distance fallen through in the time $t + \Delta t$ is denoted by $s + \Delta s$, then the equation $s + \Delta s = \tfrac{1}{2} g (t + \Delta t)^2$ holds, and the average velocity during the time $\Delta t$ is

$$\frac{\Delta s}{\Delta t} = \tfrac{1}{2} g\, \frac{(t + \Delta t)^2 - t^2}{\Delta t} = g t + \tfrac{1}{2} g\, \Delta t.$$

As $\Delta t$ approaches zero this average velocity approaches the limit $gt$, which is the actual velocity of the falling body at any given moment t.

The rate of change which has just been calculated was that of a very simple polynomial in t. The rate of change of any polynomial can readily be found by a similar method with the help of the binomial theorem. For consider first the function $y = ax^n$. By the process described above the value of the average rate of change in the interval from $x$ to $x + \Delta x$ is the quotient

$$\frac{\Delta y}{\Delta x} = a\, \frac{(x + \Delta x)^n - x^n}{\Delta x} = a n x^{n-1} + \text{terms containing powers of } \Delta x.$$

Hence the rate of change of $ax^n$ is the function $anx^{n-1}$, a formula which holds for any positive integral value of n. Similarly if y is the polynomial $y = 2x^3 - x + 5$, the average rate of change in the interval between $x$ and $x + \Delta x$ will be

$$\frac{\Delta y}{\Delta x} = 2\, \frac{(x + \Delta x)^3 - x^3}{\Delta x} - \frac{(x + \Delta x) - x}{\Delta x},$$

and the limit of this expression is the derivative function $6x^2 - 1$.
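The passage from the average rate of change to the derivative can be watched numerically. The sketch below is an illustration only: the function names, the value x = 1.5, the moment t = 2, and the value 32.2 for g in feet per second per second are assumptions chosen for the example.

```python
def average_rate(f, x, dx):
    """The quotient (f(x + dx) - f(x)) / dx of Eq. (6)."""
    return (f(x + dx) - f(x)) / dx


def y(x):
    return 2 * x**3 - x + 5


def y_prime(x):
    # Derivative found term by term from the rule for a*x^n.
    return 6 * x**2 - 1


G = 32.2  # assumed value of g


def s(t):
    # Distance fallen from rest in the time t.
    return 0.5 * G * t**2


# The average rate approaches 6x^2 - 1 as dx is made smaller.
for dx in (0.1, 0.01, 0.001, 0.0001):
    print(dx, average_rate(y, 1.5, dx), y_prime(1.5))

# For the falling body the average velocity is g*t + g*dt/2.
print(average_rate(s, 2.0, 0.01), G * 2.0 + 0.5 * G * 0.01)
```

The first column of differences shrinks in proportion to dx, as the terms containing powers of the increment would lead one to expect.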
From this last example it may be inferred that the derivative of any polynomial can be found by applying the formula for the rate of change of $ax^n$ to each term separately and adding the results.

The above definition of the derivative function as a rate of change is the one which gives this function importance in mechanical problems, but the derivative has also an interesting geometrical interpretation. Suppose that the function y = f(x) has the graph shown in the accompanying figure. At any value of x the vertical line xp has a length equal to the value of the function f(x), and at $x + \Delta x$ the corresponding ordinate from $x + \Delta x$ to q has the value $f(x + \Delta x)$. Hence, in Fig. 8, $pr = \Delta x$, $qr = f(x + \Delta x) - f(x)$, and the value of the quotient Eq. (6) is evidently the same as that of the quotient $qr/pr$, the slope of the secant pq. As $\Delta x$ approaches zero the point q approaches p and the secant pq approaches the tangent at p as a limiting position. The slope of pq must therefore simultaneously approach the slope of the tangent, so that the value of the derivative function f'(x) is numerically equal to the slope of the tangent line pt.

21. Maxima and minima of functions. Perhaps the most important application of the geometrical notion of a derivative is in the determination of the maximum and minimum values of functions. Evidently the slope of the tangent at the maximum and minimum points, a, b, c, Fig. 8, must have the value zero. If, therefore, the derivative function f'(x) can be found for a given function f(x), then the maximum or minimum values of f(x) will be determined by values of x for which f'(x) vanishes.

FIG. 8. FIG. 9.

As an example, suppose that it is required to find the dimensions of the largest box which can be made by cutting squares of side x out of a piece of tin as in Fig. 9, and then folding along the dotted lines.
If the dimensions of the tin are 4 × 6 inches the volume of the box will be a function of x defined by the equation

$$v = (6 - 2x)(4 - 2x)x = 24x - 20x^2 + 4x^3.$$

This function has the graph shown in Fig. 10. The slope of the tangent to the curve at any point is given by the derivative function

$$v' = 24 - 40x + 12x^2,$$

which must vanish at the point a where v is a maximum. The roots of the last function are

$$x = \frac{5 + \sqrt{7}}{3}, \qquad x = \frac{5 - \sqrt{7}}{3},$$

the latter being the value of x for the point a, since the former exceeds 2 and would require cutting away more than the whole width of the tin. In order, therefore, to get a box of the greatest capacity, the corners must be cut in a distance equal to $\frac{5 - \sqrt{7}}{3}$ inches, or about 0.78 inch.

FIG. 10.

If a function f(x) has everywhere the same value c, its rate of change is evidently zero, and its graph is a straight line parallel to the x-axis. Conversely it is reasonable to infer that any function f(x) whose rate of change is zero must have a graph which is a straight line parallel to the x-axis, and must therefore have the same value for every value of x. Consider now two functions f(x) and g(x) which have the same rate of change. Their difference f(x) - g(x) will be a new function of x whose rate of change is everywhere zero, since it is the difference of the rates of f(x) and g(x). But it has just been seen that such a function is always equal to a constant c, and it follows at once that if two functions have the same derivative they are always related to each other by an equation of the form f(x) = g(x) + c.

22. The anti-derivative functions. With this remark in mind we may undertake a study of the second fundamental notion of the calculus, that of the anti-derivative. It has already been seen that in general any function f(x) has associated with it a derivative function f'(x) which expresses its rate of change. But it may also be asked whether or not there exists a function of which f(x) is itself the derivative.
The answer is that in general such a function exists, and it is called the anti-derivative of f(x). It is easy to find an anti-derivative for any polynomial by inspection, if the formula for the derivative of $x^n$ is borne in mind. By an application of this formula it is seen at once that the function $\frac{ax^{n+1}}{n+1}$ has for its derivative $ax^n$, and hence is an anti-derivative of $ax^n$. The anti-derivatives of each term of a polynomial can therefore be found by adding one to the exponent of each term, and dividing the terms by the exponent so increased. The anti-derivative of the whole polynomial is then the sum of these separate anti-derivatives. For example the polynomial $7x^6 - 12x^2 + 5$ is the rate of change of the polynomial $x^7 - 4x^3 + 5x$, as may be verified easily by applying to the last polynomial the formula previously given for differentiation.

The anti-derivative of a function f(x) is unlike the derivative in that it is not uniquely determinable when f(x) is given. For convenience let us denote the anti-derivative by a(x), the letter a serving to indicate the relation between the two functions. If A(x) is any other anti-derivative of f(x), then A(x) and a(x) by definition have the same derivatives and they must be related by an equation of the form A(x) = a(x) + c. It follows that although the anti-derivative is not unique, yet if one anti-derivative a(x) is known, then all the others are found by adding constants to a(x).

23. A typical mechanical application. One of the uses of the anti-derivative is well illustrated by the problem of determining at any moment the height of a ball thrown vertically upward with a given initial velocity. Physical experiments tell us that the velocity of the ball will decrease uniformly by an amount equal to g in each second, where g is approximately 32.2 feet. In other words the rate of change of the velocity is a constant -g.
If any anti-derivative of $-g$ were known, the velocity $v$ would necessarily differ from it by a constant only. Such an anti-derivative can readily be found by means of the formula given above, the result being $-gt$, and the corresponding expression for $v$ is

$$v = -gt + c.$$

The constant $c$ may be determined in terms of the initial velocity $v_0$ at the time $t=0$ when the ball was thrown. For since the equation just written is true for all values of $t$, it will be true also when $t=0$, and it follows readily that $c = v_0$.

In a similar way a formula for the height $s$ in terms of the time can be derived by seeking an anti-derivative of its rate of change $v$. The value of the anti-derivative is $-\frac{gt^2}{2} + v_0 t$, and hence this function and $s$ must satisfy an equation of the form

$$s = -\frac{gt^2}{2} + v_0 t + d.$$

Here the constant $d$ turns out to be zero on account of the fact that $s = 0$ when $t = 0$, and the final formula for $s$ is therefore

$$s = v_0 t - \tfrac{1}{2} g t^2.$$

The problem of the thrown ball, and many others involving similar principles, show clearly the importance of having a method for finding an anti-derivative, as well as the derivative, of any given function. The integration of a function is the term applied to the process of finding an anti-derivative, and differentiation is the process of finding the derivative. One of the chief problems of the calculus is the determination of derivatives and anti-derivatives for as many different types of functions as possible.

24. Relations between a function and its anti-derivatives. The graphs of the functions $a(x)$ and $f(x)$ are related to each other by two very interesting properties, one of which follows immediately from the definition of an anti-derivative. For any value of $x$ the slope of the tangent to the anti-derivative curve, at the point $n$ in Fig. 11, is equal numerically to the number of linear units in the corresponding ordinate $xq$ of the original curve $y = f(x)$.
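This first property can be checked numerically on the thrown ball of section 23, where the height $s$ is an anti-derivative of the velocity $v$. The sketch below is an illustration only: the initial velocity of 64.4 feet per second, the step used to estimate the slope, and the function names are all assumptions, not values taken from the text.

```python
G = 32.2    # feet per second per second, the value quoted in section 23
V0 = 64.4   # hypothetical initial velocity in feet per second

def velocity(t, v0=V0, g=G):
    """v = v0 - g*t."""
    return v0 - g * t

def height(t, v0=V0, g=G):
    """s = v0*t - g*t**2/2, an anti-derivative of v with s(0) = 0."""
    return v0 * t - 0.5 * g * t * t

def slope(func, t, step=1e-6):
    """Estimate the slope of the tangent by a symmetric difference quotient."""
    return (func(t + step) - func(t - step)) / (2 * step)

# The slope of the height curve should agree with the ordinate of the velocity curve.
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, slope(height, t), velocity(t))
```

With these values the velocity vanishes at $t = 2$, which is also where the height curve has its maximum.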
Evidently when $y = a(x)$ has a maximum or minimum the curve $y = f(x)$ must intersect the $x$-axis, since the slope of the former, and therefore the ordinate to the latter, is zero at such a point.

The second relation between the curves is more interesting and more important, but in order to exhibit it we must first prove a property of the curve $y = f(x)$ itself.

[Fig. 11.]

Consider the area $A$ bounded by the two ordinates at $x_0$ and $x$, the curve itself, and the $x$-axis. For every value of $x$ the value of $A$ is uniquely determined, and according to the definition of a function, it is therefore a function $A(x)$. The derivative of this function can readily be calculated. For the difference

$$A(x+\Delta x) - A(x)$$

is the area under the curve bounded by the ordinates at $x$ and $x+\Delta x$ in Fig. 11. This latter area is less than the rectangle whose corners are $x$, $v$, $s$, $(x+\Delta x)$ in the figure, greater than the rectangle $x$, $t$, $u$, $(x+\Delta x)$, and therefore equal to some rectangle intermediate between the two, whose upper side cuts the curve in a point whose abscissa may be denoted by $x_1$. Since the altitude of this rectangle is $f(x_1)$, its area has the value $f(x_1)\,\Delta x$, and

$$A(x+\Delta x) - A(x) = f(x_1)\,\Delta x.$$

The quotient

$$\frac{A(x+\Delta x) - A(x)}{\Delta x} = f(x_1)$$

will therefore have the limit $f(x)$, since the value of $x_1$ is always between $x$ and $x+\Delta x$, and must approach $x$ as $\Delta x$ approaches zero. We have then this striking result, that the rate of change of the area $A(x)$ is numerically equal to the length of the ordinate $f(x)$ at the boundary of the area.

25. Representation of an area by a line. Consider now the two functions $A(x)$ and $a(x)$. They are both anti-derivatives of the function $f(x)$ and hence must satisfy an equation of the form

$$A(x) = a(x) + c, \tag{7}$$

where $c$ is a constant whose value may be determined by putting $x = x_0$.
The value of $A(x_0)$ is seen to be zero, so that for $x = x_0$ the last equation becomes $0 = a(x_0) + c$, and the relation (7) takes the form

$$A(x) = a(x) - a(x_0). \tag{8}$$

Interpreted geometrically this important equation means that the number of square units in the area $A(x)$ is equal to the number of linear units in the line $mn$, which is the difference of the ordinates $a(x)$ and $a(x_0)$ of the anti-derivative curve. (See Fig. 11.)

Consider for example the curves

$$y = 3x^2 = f(x), \qquad y = x^3 + 1 = a(x).$$

The area under $y = 3x^2$ between the origin and the ordinate at $x = 2$ is equal numerically to the length of the line $mn$, which in this case is $a(2) - a(0) = 9 - 1 = 8$.

The curve $y = e^x$, shown in the accompanying graph, where $e = 2.718+$, has the interesting property that it is its own derivative curve. Hence the area enclosed between the curve, the $x$-axis, and any two ordinates is equal numerically to the difference between the two ordinates.

[Fig. 12.] [Fig. 13.]

Similarly the area under any arch of the cosine curve can be calculated as soon as it is known that the sine is its anti-derivative. For by the theorem just proved this area is equal to the difference

$$\sin\frac{\pi}{2} - \sin\left(-\frac{\pi}{2}\right) = 2.$$

[Fig. 14: the curves $y = \sin x$ and $y = \cos x$.]

26. The definite integral. Fluid pressure. The relation which has just been exhibited between the derivative and anti-derivative curves is interesting geometrically, but its importance really lies in its application to the evaluation of the third fundamental notion of the calculus, the definite integral. Let us consider first some examples which lead to definite integrals.

Suppose that a cylindrical vessel full of water is at hand and that it is required to find the pressure of the water on the sides of the vessel. It is a well-known principle of physics that the horizontal pressure at any point in the liquid is the same as the pressure vertically downward.
If $w$ is the weight of a cubic unit of water, and $x$ the depth of the point in question, then the pressure per unit of area at that depth is equal to the weight $w \cdot 1 \cdot x$ of a column of water one square unit in cross section and $x$ units high. Let the cylindrical surface between the top and the bottom of the liquid be divided by planes parallel to the bottom of the vessel into $n$ horizontal rings of width $\Delta x_1, \Delta x_2, \dots, \Delta x_n$. If $r$ is the radius of the cylinder, then the area of any one of these rings will be of the form $2\pi r\,\Delta x_k$. The pressure on this area is less than the product of $2\pi r\,\Delta x_k$ by the pressure at the lower edge of the ring, greater than the product of $2\pi r\,\Delta x_k$ by the pressure at the upper edge, and therefore equal to $2\pi r\,x_k\,\Delta x_k$, where $x_k$ is some properly chosen depth in the interval $\Delta x_k$ between the two extremes, the weight $w$ being taken for simplicity as the unit of weight.

[Fig. 15.]

The sum

$$2\pi r\,\{x_1\Delta x_1 + x_2\Delta x_2 + \dots + x_n\Delta x_n\} \tag{9}$$

is the total pressure. In its present form this sum would be difficult to calculate on account of the indeterminateness of the values $x_k$, but it turns out that the limit of the sum, as the intervals $\Delta x_k$ are decreased in size, can be very easily found by a rule which will be explained a little later. Since the sum is always equal to the desired pressure its limit will have the same value.

It would be laborious to write down for many examples a detailed description of a sum such as (9) and its limit, and consequently a notation has been devised which suggests at a glance the essential steps in the process. For the example just given the limit is denoted by the symbol

$$\int_0^h 2\pi r x\,dx, \tag{10}$$

where $h$ denotes the depth of the water. In this notation the integral sign $\int$ is a metamorphosed old English letter s, and suggests that the limit of a sum has been taken; the limits $0$ and $h$ indicate the interval for which the sum has been constructed; and the "integrand" $2\pi r x\,dx$ shows the nature of the terms which have been summed.
The whole expression is called the definite integral of the function $2\pi r x$ between the limits $0$ and $h$.

27. Volumes of solids of revolution. Another simple problem which may be solved with the help of a definite integral is that of finding the volume of a cone.

[Fig. 16.]

Let the cone be generated by revolving the triangle shown in Fig. 16 about the $x$-axis. The hypotenuse of the triangle is a part of the graph of the function $y = ax/h$, since for any point of it $y$ and $x$ have the ratio $a:h$. Divide the interval from $0$ to $h$ into $n$ parts $\Delta x_k$ as before. The volume generated by the trapezoid over $\Delta x_k$ will be equal to that generated by a properly chosen rectangle with base $\Delta x_k$ and altitude $y_k = \frac{a x_k}{h}$. The volume generated by the rectangle is cylindrical and equal to the product of its base $\frac{\pi a^2 x_k^2}{h^2}$ by its altitude $\Delta x_k$. The whole volume of the cone will then be a sum of terms of the type

$$\frac{\pi a^2 x_k^2\,\Delta x_k}{h^2},$$

and according to the description of the definite integral symbol given above, the limit of this sum can be denoted by

$$\int_0^h \frac{\pi a^2}{h^2}\,x^2\,dx. \tag{11}$$

The volume of a cone of course can be calculated by the methods of elementary geometry. But the process just described enables us to find with equal ease an expression for the volume generated by revolving about the $x$-axis the area $x_0pqx$ (Fig. 11) under any arbitrary curve $y = f(x)$, a problem quite beyond the scope of the usual elementary methods. The only differences in this case are that the type of the terms to be summed is $\pi f^2(x_k)\,\Delta x_k$ instead of $\frac{\pi a^2 x_k^2\,\Delta x_k}{h^2}$, and the interval over which the sum is to be taken extends from $x_0$ to $x$ instead of from $0$ to $h$. The definite integral expressing the value of the volume has therefore the form

$$\int_{x_0}^{x} \pi f^2(x)\,dx. \tag{12}$$

28. Areas. The area $x_0pqx$ in Fig. 11 can also be expressed as a definite integral.
For the part of the area underneath the curve and over the interval $\Delta x_k$ is less than the product of $\Delta x_k$ by the highest ordinate over the interval, greater than the product of $\Delta x_k$ by the shortest ordinate, and therefore equal to $\Delta x_k$ multiplied by some intermediate ordinate $f(x_k)$. The total area is consequently a sum of terms of the type $f(x_k)\,\Delta x_k$ and is equal to the definite integral

$$\int_{x_0}^{x} f(x)\,dx,$$

which is the limit of this sum as the $\Delta x_k$ approach zero.

29. Computation of definite integrals. The fundamental theorem. The fact that the area $x_0pqx$ can be expressed as a definite integral suggests at once a formula by means of which the values of many definite integrals can be calculated with considerable ease. In discussing the relation between the curves belonging to a function and its anti-derivative, it was found that the area $x_0pqx$ for the curve $y = f(x)$ is equal to the difference of the ordinates of the anti-derivative curve at the values $x_0$ and $x$. By comparing these two results we have at once a remarkable theorem which is called the fundamental theorem of the integral calculus. According to it, the value of the definite integral

$$\int_{x_0}^{x} f(x)\,dx = \lim\,\{f(x_1)\Delta x_1 + f(x_2)\Delta x_2 + \dots + f(x_n)\Delta x_n\}$$

is given by the formula

$$\int_{x_0}^{x} f(x)\,dx = a(x) - a(x_0),$$

where the function $a(x)$ is any anti-derivative of the function $f(x)$.

30. Applications. The values which the formula would give for the definite integral if two different anti-derivatives were used are evidently the same, since the difference of the anti-derivatives is always a constant. The theorem has been derived with the help of geometrical conceptions, but the definite integral is really an analytic notion with a geometrical interpretation, and the theorem itself is essentially analytic in character. It enables us to calculate the values of any definite integral for which an anti-derivative function can be found, irrespective of its geometrical or mechanical interpretation.
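The agreement asserted by the fundamental theorem can be observed numerically by forming the sum directly and comparing it with $a(x) - a(x_0)$. The sketch below makes several assumptions that are not in the text: the interval is cut into equal parts, the intermediate value $x_k$ is taken at the middle of each part, and the cone dimensions $a = 3$, $h = 4$ are chosen purely for illustration.

```python
import math

def riemann_sum(f, x0, x, n=10_000):
    """Sum of f(x_k) * dx over n equal parts of the interval from x0 to x.

    x_k is taken at the middle of each part, which is one admissible
    choice of intermediate value among many.
    """
    dx = (x - x0) / n
    return sum(f(x0 + (k + 0.5) * dx) for k in range(n)) * dx

def by_antiderivative(a, x0, x):
    """The value a(x) - a(x0) given by the fundamental theorem."""
    return a(x) - a(x0)

# Area under y = 3x^2 from 0 to 2, with a(x) = x^3 + 1.
area_sum = riemann_sum(lambda x: 3 * x**2, 0.0, 2.0)
area_exact = by_antiderivative(lambda x: x**3 + 1, 0.0, 2.0)

# Area under one arch of the cosine curve, the sine being its anti-derivative.
arch_sum = riemann_sum(math.cos, -math.pi / 2, math.pi / 2)
arch_exact = by_antiderivative(math.sin, -math.pi / 2, math.pi / 2)

# Volume of the cone of integral (11); a = 3 and h = 4 are illustrative only.
a_cone, h_cone = 3.0, 4.0
cone_sum = riemann_sum(
    lambda x: math.pi * a_cone**2 * x**2 / h_cone**2, 0.0, h_cone)
cone_exact = by_antiderivative(
    lambda x: math.pi * a_cone**2 * x**3 / (3 * h_cone**2), 0.0, h_cone)

print(area_sum, area_exact)
print(arch_sum, arch_exact)
print(cone_sum, cone_exact)
```

In each pair the sum approaches the value furnished by the anti-derivative as the number of parts is increased.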
Thus in the first example discussed above the function under the integral sign (10) is $2\pi r x$, and an anti-derivative, formed by the usual rule for functions of the type $a x^n$, is $\pi r x^2$. The total pressure on the walls of the cylindrical vessel is therefore

$$\int_0^h 2\pi r x\,dx = \pi r h^2 - \pi r\,0^2 = \pi r h^2.$$

Similarly the anti-derivative for the integral (11) which expresses the volume of a cone is $\frac{\pi a^2 x^3}{3h^2}$, and the volume itself turns out to have the well-known value

$$\int_0^h \frac{\pi a^2}{h^2}\,x^2\,dx = \tfrac{1}{3}\pi a^2 h,$$

one-third of the product of the base by the altitude.

In a similar way the volume of a sphere can be calculated by means of the formula (12). At any point of a semicircle of radius $r$ about the origin the abscissa $x$ and ordinate $y$ satisfy the relation $x^2 + y^2 = r^2$, so that the function which is represented by the semicircle has the equation $y = \sqrt{r^2 - x^2}$. The volume generated by rotating the semicircle about the horizontal axis is that of a sphere of radius $r$. The definite integral which represents the volume, formed from the formula (12) by substituting the radical $\sqrt{r^2 - x^2}$ in place of $f(x)$, has the form

$$\int_{-r}^{+r} \pi (r^2 - x^2)\,dx,$$

and an anti-derivative of the integrand function is $\pi\left(r^2 x - \frac{x^3}{3}\right)$.

[Fig. 17.]

The volume has therefore the value

$$\int_{-r}^{+r} \pi (r^2 - x^2)\,dx = \pi\left(\frac{2r^3}{3} + \frac{2r^3}{3}\right) = \tfrac{4}{3}\pi r^3.$$

If the area under the curve $y = 3x^2$ in the figure, between the origin and the ordinate at $x = 2$, is rotated about the $x$-axis, the volume generated is easily found from the same formula. In this case $f^2(x) = 9x^4$, and the integral is

$$\int_0^2 9\pi x^4\,dx = \frac{9\pi}{5}\,\{2^5 - 0^5\} = \frac{288\pi}{5}.$$

31. Relations between functions and graphs. Let us conclude our brief study of the more important notions of the calculus with a consideration of a question which was proposed earlier in the paper with regard to the representation of a function by means of a graph.

If a function is continuous at every point of an interval $\alpha \le x \le \beta$, then the difference $f(x') - f(x'')$ for any two values $x'$ and $x''$ in the interval can be made arbitrarily small by choosing $x'$ and $x''$ sufficiently near together.
The proof that this property of a continuous function is a consequence of its continuity at the individual points between $\alpha$ and $\beta$ is somewhat complicated, and cannot well be given here. The property itself is called the "uniform continuity of $f(x)$ in the interval $\alpha \le x \le \beta$." Assuming that it is true, we can without difficulty see that any continuous function can be approximately represented by a polygon.

[Fig. 18.]

For suppose that the interval from $\alpha$ to $\beta$ has been divided into segments by a set of values so near together that the difference $f(x) - f(x_k)$ for any value $x$ between $x_k$ and $x_{k+1}$ is numerically less than an arbitrarily chosen number $\epsilon$. If the points $p$ and $q$ corresponding to any two successive values $x_k$ and $x_{k+1}$ are plotted and the ordinate to the straight line joining $p$ and $q$ is represented by $g(x)$, it follows that the difference $g(x) - f(x_k)$ will also be numerically less than $\epsilon$, since $f(x_k) - f(x_{k+1})$ is numerically less than $\epsilon$ and $g(x)$ lies between $f(x_k)$ and $f(x_{k+1})$. Since $f(x)$ and $g(x)$ both differ from $f(x_k)$ by less than $\epsilon$ it follows that their difference can itself not exceed $2\epsilon$. This result will hold for each segment $\Delta x_k$, however small the constant $\epsilon$ is taken, provided only that the points of division in the interval between $\alpha$ and $\beta$ are taken sufficiently near together. It is evident then that a continuous function can be represented with any desired degree of numerical accuracy by plotting a finite number of points sufficiently near together, and joining them by straight lines.

The numerical accuracy of the representation is not the only characteristic of the graph, however, which should be taken into consideration. The broken line represents the values of the function with some degree of fairness, but it does not in general indicate other properties satisfactorily, and a smooth curve drawn through the corners of the polygon might be equally misleading.
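The numerical side of this argument, that a polygon through sufficiently many plotted points follows a continuous function as closely as desired, can be observed with a short experiment. The following sketch rests on assumptions of its own: the function is $\sin x$ on the interval from $0$ to $\pi$, the points of division are equally spaced, and the greatest difference is only estimated by sampling.

```python
import math

def polygon(f, alpha, beta, n):
    """Return g(x), the ordinate of the broken line joining n + 1
    equally spaced points of the graph of f between alpha and beta."""
    xs = [alpha + (beta - alpha) * k / n for k in range(n + 1)]
    ys = [f(x) for x in xs]

    def g(x):
        # Index of the segment containing x (the last one when x == beta).
        k = min(int((x - alpha) / (beta - alpha) * n), n - 1)
        t = (x - xs[k]) / (xs[k + 1] - xs[k])
        return ys[k] + t * (ys[k + 1] - ys[k])

    return g

def max_gap(f, g, alpha, beta, samples=10_000):
    """Largest sampled value of |f(x) - g(x)| between alpha and beta."""
    points = (alpha + (beta - alpha) * i / samples for i in range(samples + 1))
    return max(abs(f(x) - g(x)) for x in points)

for n in (4, 16, 64):
    g = polygon(math.sin, 0.0, math.pi, n)
    print(n, max_gap(math.sin, g, 0.0, math.pi))
```

The printed differences shrink as the points of division are taken nearer together, though, as the text goes on to say, closeness of values tells nothing about slopes.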
A smooth curve, for example, suggests to the eye that at each point of the curve there is a tangent line whose direction changes continuously as the point of tangency moves along the curve, and whose slope also changes continuously. Hence the function $f(x)$ which such a curve represents should have a continuous derivative, which is not always the case. A function may in fact be continuous in an interval and yet not have a derivative at any point of it, as is shown by a classical example of such a function due to Weierstrass.*

[Fig. 19.]

The graph does not indicate as much, however, with regard to the rate of change of the slope of the tangent, which is denoted by $f''(x)$ and called the second derivative, and very little indeed concerning the rate of change of $f''(x)$ and the successive rates of change of higher orders. The second derivative is positive along an arc $ao$ convex downward, where the slope of the tangent is increasing, and negative on an arc $ob$ which is concave downward. It vanishes presumably and changes sign at $o$, though at such a point it may change abruptly, as it would if for example $ao$ and $ob$ were arcs of the two curves

$$y = x + x^2, \qquad y = x - \tfrac{1}{2}x^2.$$

Both of these curves pass through the origin $o$, and their derivatives, $1+2x$ and $1-x$, have the same value for $x = 0$, so that the two curves are tangent to each other at that point. On the other hand the second derivatives are respectively $2$ and $-1$. From this and other examples which might be constructed it follows that a curve which appears perfectly smooth to the eye may represent a function which has a discontinuous second derivative, or possibly no second derivative at all.

* Mathematische Annalen, Vol. XIX, p. 591.

32. The graph as a mathematical symbol. From the remarks which have been made it may be inferred that graphs have two distinct and important uses, the first of which is the numerical representation of the values of a function.
It has been seen that such a representation may have significance, even if the function is only continuous without having any derivatives. But a graph is most useful, in theoretical work at least, as a mathematical symbol for a function in the same way that $f(x)$ is a notation for a function or $\int f(x)\,dx$ for a definite integral. The variety of characteristics which may be suggested by a glance at a graph is, however, much greater than is suggested by the symbol $f(x)$, which indicates only functional dependence upon $x$, and its value as a symbol is proportionately enhanced. From the graph of the function

$$y = \frac{x}{x^2 - 1}$$

in Fig. 3, for example, we read that this function is continuous and has a continuous derivative except at $x = \pm 1$; that it always decreases, varying from $0$ to $-\infty$ as $x$ increases from $-\infty$ to $-1$, from $+\infty$ to $-\infty$ as $x$ increases from $-1$ to $+1$, and from $+\infty$ to $0$ as $x$ increases from $+1$ to $+\infty$; that it vanishes only once, when $x = 0$; that its derivative is negative with variations clearly indicated; that its second derivative vanishes at $x = 0$; and so on; all of these properties being much more significantly suggested by the graph than by the corresponding and somewhat clumsy description in words.

As the usefulness of any mathematical notation depends upon the sharpness with which the conception for which it is to stand is defined, so the graph attains its greatest efficiency as a symbol only when the nature of the functions which are to be represented is clearly specified in advance, as well as the properties of functions which are to be represented by special features of the curve.
As has been seen above, the characteristics of first and second derivatives seem to be particularly adapted to graphical representation, and it has been suggested that curves possess their fullest significance as symbols of functions when the functions are continuous, have only a finite number of maxima and minima in any given interval, and have continuous derivatives of the first and second orders. The elementary functions have these properties, in common with all of the other functions which have been designated as analytic. But it is not necessary that the functions represented be thus restricted in character, provided only that the correspondence between the analytical characteristics of the function on the one hand and the graphical characteristics of the curve on the other is expressly understood. In the elementary courses it is evidently impossible to discuss the niceties of the relation of graphical to analytical conceptions, and it is highly desirable that graphical methods should be used. But they should always be formulated with special reference in the mind of the instructor to the correspondence between the graphical and the analytical processes, with which the student will later be familiar.

We have now come to the end of our brief survey of the elements of the calculus, the threshold of the higher mathematics. The technical difficulties which would arise have prevented the application of the processes of differentiation and integration to any but the simplest functions, the polynomials. By means of these alone, however, it has been possible to explain the meaning of the derivative, the anti-derivative, and the definite integral, and some of their interrelations. The rest of the theory is for the most part an application in many different ways and to many different functions of these three fundamental conceptions.
It is hoped that by his perusal of these pages the reader unfamiliar with the calculus will have lost whatever awe he may have had of one at least of the more advanced mathematical subjects, and at the same time have gained an insight into the variety and importance of its relations with problems of a practical nature and with other branches of science.

VII

THE THEORY OF NUMBERS

By J. W. A. YOUNG

CONTENTS (by sections)

I. INTRODUCTION: sections 1-3.

II. FACTORS: sections 4-18.
4-5, Primes; 6, Arithmetical progressions; 7, Problems concerning primes; 8, Method of finding primes; 9, Tables of factors; 10, Factors of large numbers; 11, Relative primes; 12-13, Totient, $\varphi(m)$; 14, Sum of all factors of a number; 15-18, Perfect numbers.

III. DIOPHANTINE EQUATIONS: sections 19-22.
19, Definition; 20, The equation $x^2 + y^2 = z^2$; 21, The equation $x^n + y^n = z^n$; 22, The equation $x^2 - Dy^2 = -1$.

IV. CONGRUENCES: sections 23-38.
23-27, Introductory definitions and properties; 28-29, Fundamental properties; 30, Applications: To find remainders in division; Criteria for divisibility; 31, Roots of congruences; 32, Theoretic solution of the linear congruence in one unknown; 33, Numerical solution of the linear congruence in one unknown; 34-35, Fermat's theorem; 36-38, Wilson's theorem.

V. BINOMIAL CONGRUENCES: sections 39-50.
39-46, Definitions and theorems; 47-50, Primitive roots.

VI. QUADRATIC CONGRUENCES: sections 51-58.
51-53, Definitions and reductions; 54-57, Quadratic residues; 58, Legendre's law of reciprocity.

VII. BIBLIOGRAPHY: section 59.

I. INTRODUCTION

1.
The "Theory of Numbers" might, in a certain sense, include nearly all of the subject-matter usually treated in mathematics, since, with the exception of the non-metrical portions of Geometry, there are few domains of mathematics that are not fundamentally concerned with numbers. But the term is commonly used in a restricted, technical sense as meaning the theory of integral numbers (positive, negative, zero). Even this must be further restricted, for all numbers other than integers can be defined in terms of integers,* so that to study the whole body of theory that has been built up on integral numbers would still be tantamount to studying nearly the whole body of mathematical science.

The restriction customarily made is to regard the "theory of numbers" as concerned with integers as such; their properties and their combinations by operations that lead to integral results. The operations of addition, subtraction, and multiplication are accordingly admitted when applied to any integers, and division is admitted when applied to integers such that the quotient is integral. The process of division may also be used to obtain equations between integers. For example,

$$9385 = 62 \cdot 151 + 23.\dagger$$

In all that follows the term number shall accordingly be understood to mean integral number; and other terms, for example, factor, shall be understood to be similarly restricted in meaning.

* See Monograph IV, Appendix I.
† The dot indicates multiplication.

2. The treatment of our subject, as now delimited, might properly begin with a chapter studying the nature and genesis of the concept of integer, the fundamental definitions and postulates relating to integers and to the admitted operations thereupon, the "laws" of operation, and the like. This would be, in a measure, the treatment of the theoretic basis of elementary arithmetic.*

3.
We, however, here assume a working knowledge of elementary arithmetic, and begin with a consideration of various properties, connected with the factors of numbers, that are not ordinarily treated in that subject.

* For a treatment of the corresponding questions relative to the numbers of algebra, which include those of arithmetic, see Monograph IV. For the more strictly arithmetical theory see:
Dedekind, Was sind und was sollen die Zahlen? Braunschweig, 2d ed., 1893. English translation by Beman as the second essay of "Essays on Number," Chicago, 1901.
Stolz-Gmeiner, Theoretische Arithmetik, Part I, 2d ed., Leipzig, 1900. (This work presents the theory of the natural numbers, published by Peano, under the title, "Arithmetices principia nova methodo exposita," Turin, 1889, in a symbolic notation. A brief account of this theory is given by Huntington in the Bulletin of the American Mathematical Society, 2d Series, Vol. IX, 1902, pp. 40-46.)
Padoa, "Théorie algébrique des nombres entiers," Internat. Cong. de Philos., Paris, 1900, pp. 309-65.
Huntington, "Complete sets of postulates for the theories of positive integral and positive rational numbers," Transactions American Mathematical Society, Vol. III, 1902, pp. 280-84.
Huntington, pp. 27-29 of "The fundamental laws of addition and multiplication in elementary algebra," Annals of Math., Vol. VIII, 1906, pp. 1-44.

II. FACTORS

4. Definition. A prime number (or briefly, a prime) is a number having no other factors than itself and unity.

5. Theorem. The series of primes is endless.

Proof. It is sufficient to show that there exists a prime larger than any given prime. Let the given prime be $p$. Consider

$$N = 2 \cdot 3 \cdot 5 \cdots p + 1,$$

where the first term of $N$ is the product of all the primes not greater than $p$. Then it appears from the form of $N$, that if $N$ be divided by any one of the primes just mentioned, the remainder will be 1. Consequently, every prime factor of $N$ must be greater than $p$.
Since $N$ must have one or more prime factors, the existence of a prime greater than $p$ is thus proved. But this is by no means tantamount to the actual finding of a prime greater than a given prime $p$. No general method for doing this has as yet been discovered.

This theorem may also be stated thus: There is no largest prime number; or thus: The primes being arranged in order of increasing magnitude, after each prime there follows another; or also thus: The number of primes is infinite. The last form of statement means neither more nor less than the others.

It has been conjectured that every even number is the sum of two primes, but this has not yet been proved.

6. The theorem above was known to Euclid two thousand years ago. In the nineteenth century Dirichlet proved an elegant generalization of it, viz., There is an endless set of primes in every arithmetical progression whose first term and common difference have no common factor. Dirichlet's proof of this theorem makes use of numbers and operations not admitted in our subject* (which is often called higher arithmetic), thus furnishing an instance of a "non-arithmetical" proof of an arithmetical proposition.† It is, however, easy to prove the theorem arithmetically for certain progressions. For example, the progression

$$3,\ 7,\ 11,\ 15,\ 19,\ 23,\ \dots,\ 4n-1,\ \dots$$

contains an endless sequence of primes.

* See sec. 1.
† Such proofs abound in the development of the theory of numbers. For an introduction to this division of the subject see: Bachmann, Analytische Zahlentheorie, Leipzig, 1892 (proof of the above theorem, pp. 74-88); Kronecker-Hensel, Zahlentheorie, Leipzig, 1901 (above theorem, pp. 438 et seq.).

To prove this it is sufficient to show that for every prime $p$ there exists a larger prime of the form $4n-1$. Consider

$$N = 2(2 \cdot 3 \cdot 5 \cdot 7 \cdots p) - 1,$$

where the number in the parenthesis is the product of all primes not greater than $p$. Then it is clear from the form of $N$, that none of the primes $2, 3, \dots,$
$p$ is a factor of $N$. All the prime factors of $N$ are therefore greater than $p$. All odd primes are of the form $4n+1$ or $4n-1$. The product of two numbers of the form $4n+1$ is also of the form $4n+1$. But $N$ is of the form $4n-1$. Hence at least one of its prime factors must be of the form $4n-1$. The existence of a prime of this form larger than the given prime, $p$, is thus proved.

It can be proved quite analogously that the progression

$$5,\ 11,\ 17,\ 23,\ 29,\ 35,\ \dots,\ 6n-1,\ \dots$$

contains an unending set of primes.

7. Various important general problems have been studied relating to primes. For example:
(1) To determine the number of primes in a given interval.
(2) To determine a prime larger than a given prime.
(3) To determine the prime next larger than a given prime.
(4) To determine whether or not a given number is prime; or, more generally, to determine the factors of a given number.
No general solution of these problems has as yet been found.

8. The simplest method of finding factors is by actual trial. It is sufficient to try only primes, and of these, only those whose squares are not greater than the given number. But this method is impracticable for large numbers. For these, use is made of various results and methods that are developed in our subject.

9. Tables of the factors of all numbers up to ten millions have been published.* A manuscript in the Archives of the Academy of Vienna gives the factors of numbers from 3,000,000 to 100,000,000. (This MS. is known to contain many errors.)

* Lehmer, Factor Table for the First Ten Millions, Washington, 1909. Carr's Synopsis of Pure Mathematics, London, 1886, contains a table extending to 99,000. Still smaller tables are found in Jones' Logarithmic Tables, Ithaca, N. Y., 1889, and elsewhere.

10. Factors of particular numbers much larger than those in the tables have also been found.
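The method of actual trial described in section 8 may be sketched as follows. This is a minimal illustration, not an efficient procedure: the function names are hypothetical, and for simplicity every integer from 2 upward is tried rather than primes only, which is harmless because the smallest divisor found is necessarily prime.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of n (n >= 2), found by actual trial.

    Only trial divisors whose squares do not exceed n need be tested;
    if none divides n, then n is itself prime.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def factorize(n):
    """Prime factors of n (n >= 2) in increasing order, with repetition."""
    factors = []
    while n > 1:
        p = smallest_prime_factor(n)
        factors.append(p)
        n //= p
    return factors

# The number N of section 5 for p = 13: every prime factor exceeds 13.
N = 2 * 3 * 5 * 7 * 11 * 13 + 1
print(N, factorize(N))

# A number of ten digits is still within easy reach of this method.
print(factorize(2**32 + 1))
```

The first example also illustrates the remark of section 5 that $N$ need not itself be prime; the proof only requires that its prime factors be new.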
For example, in the theory of the construction of regular polygons* it is important to know whether or not $2^{2^5}+1$ is a prime number. It has been shown that

$$2^{2^5}+1 = 4{,}294{,}967{,}297 = 641 \cdot 6{,}700{,}417.$$

Also that $2^{2^{36}}+1$, a number of more than twenty billion places, has the prime factor 2,748,779,069,441.†

* Monograph No. VIII, sec. 26.
† Encyc. des Sciences Math., Tome I, Vol. III, p. 51.

11. Definition. Two numbers having no common factor but unity are called relatively prime. Each is said to be prime to the other.

12. Definition. The number of (positive) integers not greater than $m$ and prime to $m$ is called the totient of $m$, and denoted by $\varphi(m)$. Thus $\varphi(1)=1$; $\varphi(2)=1$; $\varphi(3)=2$; $\varphi(4)=2$; $\varphi(5)=4$; $\varphi(6)=2$; $\varphi(7)=6$; $\varphi(8)=4$. If $p$ is prime, $\varphi(p) = p-1$.

13. Problem. To determine $\varphi(m)$.

Solution. Let $m = p^a q^b r^c \cdots v^h$, where $p, q, r, \dots, v$ are different primes, and $a, b, c, \dots, h$ positive integers. If from the series of numbers

$$1,\ 2,\ 3,\ 4,\ 5,\ \dots,\ m-1,\ m,$$

we strike out all those that have as factor $p$ or $q$ or $r$, etc., the numbers that remain will be prime to $m$, and the number of such numbers is the desired totient.

First consider those having $p$ as a factor. They are:

$$p,\ 2p,\ 3p,\ \dots,\ \frac{m}{p}\,p.$$

Their number is $\frac{m}{p}$.* There are therefore $m - \frac{m}{p}$ or $m\left(1 - \frac{1}{p}\right)$ numbers of the series $1, 2, \dots, m$ that do not have the factor $p$. This may be stated generally thus:

Lemma. If $M$ has the prime factor $P$, then $M\left(1 - \frac{1}{P}\right)$ of the numbers $1, 2, 3, \dots, M$ do not have the factor $P$.

We next strike out the numbers having the factor $q$. These numbers are

$$q,\ 2q,\ 3q,\ \dots,\ \frac{m}{q}\,q.$$

Some of them may already be struck off as having the factor $p$. The number not having the factor $p$ is the number of the coefficients

$$1,\ 2,\ 3,\ \dots,\ \frac{m}{q},$$

not having the factor $p$. By the lemma this number is

$$\frac{m}{q}\left(1 - \frac{1}{p}\right).$$

The number of numbers $1, 2, 3, \dots, m$ having neither $p$ nor $q$ as factor is therefore

$$m\left(1 - \frac{1}{p}\right) - \frac{m}{q}\left(1 - \frac{1}{p}\right) \quad\text{or}\quad m\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right).$$

Similarly, the numbers

$$r,\ 2r,\ 3r,\ \dots,\ \frac{m}{r}\,r$$

* This number is an integer, since $p$ is a factor of $m$.
For similar reasons, the other numbers indicated by fractions in what follows are also integers.

have r as a factor. Some of them may have p or q as factor. The number of those that do not is the number of coefficients $1, 2, 3, \ldots, \frac{m}{r}$ that have neither p nor q as factor. By the preceding result this number is
$$\frac{m}{r}\left(1-\frac1p\right)\left(1-\frac1q\right).$$
Consequently there are in the series 1, 2, ..., m,
$$m\left(1-\frac1p\right)\left(1-\frac1q\right) - \frac{m}{r}\left(1-\frac1p\right)\left(1-\frac1q\right), \quad\text{or}\quad m\left(1-\frac1p\right)\left(1-\frac1q\right)\left(1-\frac1r\right),$$
numbers not divisible by any of p, q, r. The same reasoning may be repeated until all of the prime factors of m have been used. The numbers remaining will be prime to m, and we have thus:
$$\varphi(m) = m\left(1-\frac1p\right)\left(1-\frac1q\right)\left(1-\frac1r\right)\cdots\left(1-\frac1v\right).$$

REMARKS. (1) The repetition of the reasoning for all of the prime factors of m is formally accomplished by the process of mathematical induction; that is, we show that if a result of the above type holds for any k different prime factors of m, such a result also holds for k+1 of the prime factors of m, consisting of the k already considered and any other one. Since such results have been proved above for one, two, and three factors, it follows that a similar result holds for four factors, therefore for five factors, etc., therefore for all the factors.
(2) If the reader has any difficulty in following the reasoning above for a general m, he should first carry it through for one or more particular values (say, $60 = 2^2 \cdot 3 \cdot 5$, $48 = 2^4 \cdot 3$, $55 = 5 \cdot 11$), and then generalize. This remark applies to our whole subject, the theory of numbers. It cannot be mastered without much work with specific numbers, and recourse should always be had to particular instances whenever the general theory becomes in any way hazy.

14. Problem. To find the sum of all the factors of any number, m.
Solution. Let $m = p^a q^b r^c \cdots v^l$, where p, q, r, ..., v are different primes, and a, b, c, ..., l are positive integers. Let all the factors of m, including unity and m itself, be $d_1, d_2, d_3, \ldots,$
$d_k$, and let $d_1+d_2+d_3+\cdots+d_k = S(m)$. Every factor of m is of the form
$$d = p^{a'} q^{b'} r^{c'} \cdots v^{l'},$$
where a', b', c', ..., l' have any combination of the values
$$a' = 0, 1, 2, \ldots, a;\quad b' = 0, 1, 2, \ldots, b;\quad \ldots;\quad l' = 0, 1, 2, \ldots, l,$$
and, conversely, all expressions of this form are factors. Further, every expression of this form occurs once, and only once, as a term of the following product, and the product contains no other terms:
$$P = (1+p+p^2+\cdots+p^a)(1+q+q^2+\cdots+q^b)\cdots(1+v+v^2+\cdots+v^l).$$
Consequently P is the sum of the d's, and since each factor of P is a geometric series, we obtain:
$$S(m) = \frac{p^{a+1}-1}{p-1}\cdot\frac{q^{b+1}-1}{q-1}\cdots\frac{v^{l+1}-1}{v-1}.$$

EXAMPLES.
1. Since $25 = 5^2$, $S(25) = \frac{5^3-1}{5-1} = 31$.
2. Since $72 = 2^3 \cdot 3^2$, $S(72) = \frac{2^4-1}{2-1}\cdot\frac{3^3-1}{3-1} = 15 \cdot 13 = 195$.
3. Since $100{,}800 = 2^6 \cdot 3^2 \cdot 5^2 \cdot 7$,
$$S(100{,}800) = \frac{2^7-1}{2-1}\cdot\frac{3^3-1}{3-1}\cdot\frac{5^3-1}{5-1}\cdot\frac{7^2-1}{7-1} = 127 \cdot 13 \cdot 31 \cdot 8 = 409{,}448.$$

15. Definition. A number that is equal to the sum of all its factors, except itself, is called a perfect number. For example, 6 and 28 are perfect numbers, since $6 = 3+2+1$ and $28 = 14+7+4+2+1$.

16. Theorem. If $2^k-1$ is a prime, then $2^{k-1}(2^k-1)$ is a perfect number. (This theorem is given by Euclid.)
Proof. Let $n = 2^{k-1}(2^k-1)$ and let $p = 2^k-1$. Then $n = 2^{k-1}p$, and by sec. 14,
$$S(n) = \frac{2^{(k-1)+1}-1}{2-1}\cdot\frac{p^2-1}{p-1} = (2^k-1)(p+1) = (2^k-1)\,2^k.$$
Subtracting n from both members we have:
$$S(n)-n = (2^k-1)2^k - 2^{k-1}(2^k-1) = (2^k-1)(2^k-2^{k-1}) = (2^k-1)(2\cdot2^{k-1}-2^{k-1}) = (2^k-1)\,2^{k-1} = n.$$
That is, n is a perfect number.

17. It is not difficult to prove* that every even perfect number is of the form given above. No odd perfect number has been found, and it is not known whether or not any exists.

18. The question naturally arises as to what values of k will make $2^k-1$ a prime. It is easy to see that a first condition is that k itself must be prime. For if $k = ab$, $2^{ab}-1$ has (according to elementary algebra) the factor $2^a-1$.

* See, for example, Lucas, Théorie des Nombres, Paris, 1891, p. 375.
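The formulas of secs. 13, 14, and 16 lend themselves to a quick mechanical check. The following Python sketch is offered only as an illustration and is no part of the original argument: the names `factorize`, `totient`, `divisor_sum`, `is_prime`, and `is_perfect` are our own, and the factorization is done by the plain trial division of sec. 8, so it is suitable for small numbers only.

```python
def factorize(m):
    """Prime factorization by trial division (sec. 8): {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:  # whatever remains is itself a prime factor
        factors[m] = factors.get(m, 0) + 1
    return factors


def totient(m):
    """phi(m) = m (1 - 1/p)(1 - 1/q)... over the distinct primes of m (sec. 13)."""
    result = m
    for p in factorize(m):
        result = result // p * (p - 1)
    return result


def divisor_sum(m):
    """S(m) = product of (p^(a+1) - 1) / (p - 1) over the prime powers of m (sec. 14)."""
    total = 1
    for p, a in factorize(m).items():
        total *= (p ** (a + 1) - 1) // (p - 1)
    return total


def is_prime(n):
    return n > 1 and factorize(n) == {n: 1}


def is_perfect(n):
    """n equals the sum of all its factors except itself (sec. 15)."""
    return divisor_sum(n) - n == n


# Euclid's form 2^(k-1) (2^k - 1) for every k < 20 that makes 2^k - 1 prime (sec. 16).
euclid = [2 ** (k - 1) * (2 ** k - 1) for k in range(2, 20) if is_prime(2 ** k - 1)]
```

The list `euclid` can then be passed through `is_perfect` to confirm the theorem of sec. 16 for these small cases, and `divisor_sum` can be compared with the three worked examples of sec. 14.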
In 1644 Mersenne asserted that when p is a prime not greater than 257, $2^p-1$ is a prime if and only if p = 2, 3, 5, 7, 13, 17, 19, 31, 61, 127, 257. Numbers of the form $2^p-1$, $p \le 257$, are called Mersenne's numbers. The statement that $2^p-1$ is prime has been verified for the first 9 values of p, which, consequently, when substituted in Euclid's formula give the nine known perfect numbers. The first eight of them were known as early as the sixteenth century; the ninth (whose value is 2,658,455,991,569,831,744,654,692,615,953,842,176) was verified late in the nineteenth century. The values p = 127 and p = 257 are still in doubt. The statement that $2^p-1$ is composite for values of $p \le 257$ other than those of the list above has been verified in a large number of instances,* but not yet in all. It is believed that Mersenne knew some more powerful and general method of dealing with these questions, which his successors have not yet succeeded in rediscovering.

* For list, see Lucas, Théorie des Nombres, p. 375.

III. DIOPHANTINE EQUATIONS

19. Definition. An equation in two or more unknowns whose values are to be integral is called a Diophantine equation; also an indeterminate equation. Linear Diophantine equations are best studied in connection with another division of our subject (Congruences, secs. 31-33).

20. An interesting instance of a quadratic Diophantine equation is the equation
$$x^2 + y^2 = z^2. \qquad (1)$$
The numbers of any set x, y, z satisfying this equation are the lengths of the sides of a right triangle, so that the two problems of finding all integral solutions of the above equation and of finding all right triangles with sides of integral length are equivalent. Such triangles are called Pythagorean triangles.

A solution in which x, y, z have no common factor is called a primitive solution.
It will be sufficient to find all the primitive solutions, for every non-primitive solution can be deduced from some primitive solution by multiplying all its numbers by the proper factor.

We begin the search for the primitive solutions by showing that in any primitive solution one of the numbers x and y, say x, is even and the other, y, is odd. For (a) if x and y were both even, z would also be even; the common factor, 2, would be present and the solution would not be primitive. (b) If x and y were both odd (that is, of the form $2n+1$), $x^2$ and $y^2$ would both be of the form $4n+1$, and hence $z^2$ would be of the form $4n+2$. But this is impossible, since the square of every even number is of the form $4n$, and that of every odd number is of the form $4n+1$. Since suppositions (a) and (b) are both incorrect, one of the numbers x and y must be even, the other odd.

Let x denote the even one. Then y and z are odd. From (1):
$$x^2 = z^2 - y^2 = (z+y)(z-y).$$
Since z and y are both odd we may put
$$z+y = 2k, \qquad z-y = 2l. \qquad (2)$$
Hence $x^2 = 4kl$. Since x, y, z are relatively prime, k and l must also be relatively prime; for, from equations (2), $z = k+l$ and $y = k-l$; hence if k and l had a common factor, y and z would have that factor in common also. Since $4kl$ is a square, $kl$ is a square; and since k and l are relatively prime, it follows that k and l must each be squares. We therefore put:
$$k = m^2, \qquad l = q^2 \qquad (m,\ q\ \text{relatively prime}).$$
Consequently, in any primitive solution of equation (1), x, y, z must be of the forms:
$$x = 2mq, \qquad y = m^2 - q^2, \qquad z = m^2 + q^2. \qquad (3)$$
It is readily seen by substitution that every set of values of this form, whether primitive or not, satisfies the equation.

To pick out those solutions of form (3) that are primitive, we proceed as follows: If m and q have a common factor, then x, y, z evidently have that factor in common also. The primitive solutions will therefore all be among those obtained under the restriction that m and q shall be relatively prime.
Further, since $z+y = 2m^2$ and $z-y = 2q^2$, any common factor of z and y would be a common factor of $2m^2$ and of $2q^2$, or, if m and q are relatively prime, of 2. That is, if m and q are relatively prime, y and z can have, at most, the factor 2 in common. They do, indeed, have this common factor when m and q are both odd, and not otherwise (m and q being relatively prime). We have thus proved the following

Theorem. All the primitive solutions and no others of the equation $x^2+y^2=z^2$ are given by the formulas (3) if m and q run through all possible sets of relatively prime values such that $m > q$, and that one of the two is even, the other odd.

It is now simply a matter of substitution to prepare a table of the smaller primitive solutions.*

  m   q     x    y    z
  2   1     4    3    5
  3   2    12    5   13
  4   3    24    7   25
  4   1     8   15   17
  5   4    40    9   41
  5   2    20   21   29
  6   5    60   11   61
  6   1    12   35   37
  7   6    84   13   85
  7   4    56   33   65
  7   2    28   45   53
  8   7   112   15  113
  8   5    80   39   89
  8   3    48   55   73
  8   1    16   63   65

* A table of all primitive solutions in which z < 2500 is given by Whitworth, Proc. Lit. and Phil. Soc. of Liverpool, Vol. XXIX, 1874, p. 237.

Theorem. Of the three numbers x, y, z, one is divisible by 3, one (perhaps the same one) by 4, and one by 5.
Proof. Since either m or q is even, x is divisible by 4. If either m or q is divisible by 3 or by 5, x is divisible by 3 or by 5. If neither m nor q is divisible by 3, they are both of the form $3n \pm 1$, and their squares are of the form $3n+1$. Therefore $m^2-q^2$ is of the form $3n$. That is, y is a multiple of 3. If neither m nor q is divisible by 5, they are of one of the forms $5n \pm 1$, $5n \pm 2$, and their squares are of the form $5n \pm 1$. If both $m^2$ and $q^2$ are of the same form (either $5n+1$ or $5n-1$), $m^2-q^2$ is of the form $5n$; while if one is of the form $5n+1$ and the other of the form $5n-1$, $m^2+q^2$ is of the form $5n$. That is, in the former case y is a multiple of 5; in the latter case, z is a multiple of 5.
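Formulas (3) and the two theorems just proved can also be tried out mechanically. The short Python sketch below is an illustration only: the function name `primitive_triples` and the bound `max_m` are our own choices, and the checks at the end merely test the small cases with m not greater than 8 rather than proving anything.

```python
from math import gcd


def primitive_triples(max_m):
    """Primitive solutions (x, y, z) of x^2 + y^2 = z^2 given by formulas (3).

    m and q run through relatively prime values with m > q,
    one of the two even and the other odd.
    """
    triples = []
    for m in range(2, max_m + 1):
        for q in range(1, m):
            if gcd(m, q) == 1 and (m - q) % 2 == 1:
                triples.append((2 * m * q, m * m - q * q, m * m + q * q))
    return triples


triples = primitive_triples(8)

for x, y, z in triples:
    assert x * x + y * y == z * z          # each set satisfies equation (1)
    assert gcd(gcd(x, y), z) == 1          # and is primitive
    for d in (3, 4, 5):                    # one of x, y, z is divisible by 3, 4, 5
        assert any(t % d == 0 for t in (x, y, z))
```

With `max_m = 8` the function yields fifteen sets of values, one for each admissible pair m, q.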
All of these statements may be verified for the particular instances occurring in the table above, and they should be so verified if the reader has the slightest difficulty in understanding the general reasoning. (See note, sec. 13.)

21. As it has been easy to solve completely the equation $x^2+y^2=z^2$, it would be natural to expect corresponding success in the solution of $x^3+y^3=z^3$, but this expectation is doomed to disappointment. It has been proved* that the equation has no solution in positive integers. In other words: no cube of an integer can be the sum of two cubes of integers. This is a special case of the following more general theorem announced by Fermat:

The equation $x^n+y^n=z^n$ admits no solution in integers, if n is a positive integer greater than two.

This famous theorem is commonly known as Fermat's last theorem, and was stated without proof by Fermat in the seventeenth century. Since then the theorem has stood as a challenge to arithmeticians. For various specific instances the proof has been found, including every $n < 100$ and some others, but the general proof has not yet been made.

* Euler, Algebra, St. Petersburg, 1770.

22. Mere mention must suffice for the interesting and famous indeterminate equation
$$x^2 - Dy^2 = \pm 1,$$
generally known as the Pellian equation, though it has recently been pointed out that Pell never published anything on this equation.* A method for the solution of this equation was known to the Hindus about 600 A.D., but it was solved independently by Lagrange in the eighteenth century. The equation is treated in the works on the Theory of Numbers cited in the bibliography; these works also discuss many other indeterminate equations that cannot even be mentioned here.

IV. CONGRUENCES

23. It frequently happens that in a particular problem numbers whose difference is a multiple of a given number are equivalent.
For example:
(1) With respect to the day of the week on which the last day of a certain period falls, numbers of days counted from a fixed day are equivalent if their difference is a multiple of 7.
(2) With respect to their trigonometric functions, angles are equivalent if they differ only by multiples of 360°.
(3) With respect to their numerical value, powers of $-1$ are equivalent if their exponents differ only by multiples of 2.

24. Definitions. If $a = b + cm$, that is, if $a-b$ is a multiple of m, we say that a is congruent to b with respect to the modulus m, and write:
$$a \equiv b \pmod{m}. \qquad (1)$$
The modulus is supposed to be positive. A relation of the form (1) is called a congruence. a and b are called residues of each other, modulo m. The numbers on the two sides of the sign $\equiv$ are called the members of the congruence.

* (Note to sec. 22.) Encyc. des Sc. Math., Tome I, Vol. III, p. 27.

The following are examples of congruences; the reader will readily convince himself of their correctness.
$$15 \equiv 8 \pmod{7}, \qquad 60 \equiv 0 \pmod{12}, \qquad 37 \equiv 19 \pmod{6},$$
$$-18 \equiv 32 \pmod{10}, \qquad 1 \equiv 41 \pmod{5}, \qquad 3 \equiv -59 \pmod{31}.$$

25. Every number is congruent (mod m) to one and only one of the series
$$0,\ 1,\ 2,\ \ldots,\ m-1;$$
also to one and only one of the series
$$0,\ -1,\ -2,\ \ldots,\ -(m-1);$$
also, if m is odd, to one and only one of the series
$$0,\ \pm 1,\ \pm 2,\ \ldots,\ \pm\frac{m-1}{2};$$
and, if m is even, to one and only one of the series
$$0,\ \pm 1,\ \pm 2,\ \ldots,\ \pm\left(\frac{m}{2}-1\right),\ +\frac{m}{2}.$$
These are called respectively the series of least positive residues, least negative residues, and absolutely least residues (mod m).

26. In any congruence, multiples of the modulus may be added or subtracted at will, without disturbing the congruence. For $a \equiv b \pmod{m}$ means that a differs from b by a multiple of m. This property is not affected if a or b, or both, are altered by a multiple of m. Similarly, any factor may be increased or diminished by a multiple of the modulus without destroying the congruence. That is, if $ab \equiv c \pmod{m}$, then also $(a+dm)b \equiv c \pmod{m}$.
The reader may supply the details of this reasoning.

27. We may, therefore, in any congruence reduce all numerical terms and coefficients to values less than the modulus without destroying the congruence. Thus, the congruence $86c \equiv 7 \pmod{11}$ may be replaced by $9c \equiv 7 \pmod{11}$, and $437a + 289b \equiv 469c \pmod{27}$ may be replaced by $5a + 19b \equiv 10c \pmod{27}$.

The reader should practice with similar relations until he is quite familiar with the idea. These relations may be taken quite at random. Thus, $873 \equiv\ ? \pmod{36}$; $4729 \equiv\ ? \pmod{123}$. What congruences with coefficients smaller than the modulus are equivalent to the following? $83x \equiv 7 \pmod{13}$; $439x \equiv 3283 \pmod{20}$; $11a - 23b \equiv 36 \pmod{5}$; $4632y \equiv 367{,}832 - 439 \pmod{16}$, etc.

(But exponents may not be treated similarly. From $2^7 \equiv 3 \pmod{5}$, it does not follow that $2^2 \equiv 3 \pmod{5}$. A theorem which enables us to replace exponents larger than the modulus by smaller ones will be proved later, secs. 34, 35.)

28. Fundamental properties of congruences.

I. If $b \equiv a \pmod{m}$ and $c \equiv a \pmod{m}$, then $b \equiv c \pmod{m}$.
Proof. The given congruences mean: $b = a + dm$ and $c = a + em$. Subtracting, $b - c = (d-e)m$; therefore, by definition, $b \equiv c \pmod{m}$.

II. If $a_1 \equiv b_1 \pmod{m}$, $a_2 \equiv b_2 \pmod{m}$, ..., $a_l \equiv b_l \pmod{m}$, then
$$a_1 + a_2 + \cdots + a_l \equiv b_1 + b_2 + \cdots + b_l \pmod{m}.$$
The reader can readily supply the proof here, and in the case of the other properties of this list where the proof is omitted.

Corollary. Terms may be transposed from one member of a congruence to another; that is, they may be omitted where they stand, and inserted in the other member with their signs changed. For if t represents the term to be transposed, this is equivalent to adding the members of the congruence $-t \equiv -t \pmod{m}$ respectively to the members of the given congruence.

III. If $a \equiv b \pmod{m}$, then $ka \equiv kb \pmod{m}$, and also $ka \equiv kb \pmod{km}$.

IV. If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then $ac \equiv bd \pmod{m}$. For by III, $ac \equiv bc \pmod{m}$ and $bc \equiv bd$ (mod.
m); hence, by I, $ac \equiv bd \pmod{m}$.

Corollary 1. If $a_1 \equiv b_1 \pmod{m}$, $a_2 \equiv b_2 \pmod{m}$, ..., $a_l \equiv b_l \pmod{m}$, then
$$a_1 a_2 \cdots a_l \equiv b_1 b_2 \cdots b_l \pmod{m}.$$
Corollary 2. If $a \equiv b \pmod{m}$, then $a^r \equiv b^r \pmod{m}$.

V. If $a \equiv b \pmod{m_1}$, $a \equiv b \pmod{m_2}$, ..., $a \equiv b \pmod{m_l}$, and if M = L.C.M. of $m_1, m_2, \ldots, m_l$, then $a \equiv b \pmod{M}$.
Proof. By hypothesis $a - b = r_1 m_1$, $a - b = r_2 m_2$, ..., $a - b = r_l m_l$, and since $a-b$ is a multiple of $m_1, m_2, \ldots, m_l$, it is a multiple of their least common multiple.

29. Those of the preceding properties that relate to a single modulus m are analogous to the corresponding properties of algebraic equations; instead of "equal" we here say "congruent." These properties concern addition, subtraction, and multiplication. We consider next the inverse operation of multiplication, namely, factoring, and shall see that in this case the analogy between the properties of equations and congruences is not so close.

In equations we know that if $ab = 0$, then either $a = 0$ or $b = 0$. But we know that $4 \cdot 6 \equiv 0 \pmod{12}$ while neither $4 \equiv 0 \pmod{12}$ nor $6 \equiv 0 \pmod{12}$. That is, from $ab \equiv 0 \pmod{m}$, we may not infer that necessarily either $a \equiv 0 \pmod{m}$ or $b \equiv 0 \pmod{m}$.

More generally, if we know that $ab = ac$, and that $a \ne 0$, we know that $b = c$. But it is easy to show by an example that if $ab \equiv ac \pmod{m}$ and $a \not\equiv 0 \pmod{m}$, it does not necessarily follow that $b \equiv c \pmod{m}$. Thus: $2 \cdot 21 \equiv 2 \cdot 17 \pmod{8}$ and $2 \not\equiv 0 \pmod{8}$. But it is not true that $21 \equiv 17 \pmod{8}$. The following property states what follows from $ab \equiv ac \pmod{m}$.

VI. From $ab \equiv ac \pmod{m}$, where a and m have the highest common factor d, it follows that $b \equiv c \pmod{\frac{m}{d}}$.
Proof. The hypothesis means $ab = ac + km$, or $a(b-c) = km$. Dividing both members by d, we see that $\frac{m}{d}$ is a factor of $\frac{a}{d}(b-c)$; and since d is the largest factor of m that is a factor of a, $\frac{m}{d}$ is prime to $\frac{a}{d}$. It follows that $\frac{m}{d}$ is a factor of $b-c$. That is, $b - c = k'\frac{m}{d}$, or $b \equiv c \pmod{\frac{m}{d}}$.

Corollary.
Both members of any congruence may be divided by any factor that is prime to the modulus, but if the divisor have a factor common with the modulus, that factor must be taken out of the modulus also. Thus:
(1) From $30 \equiv 78 \pmod{12}$, it follows that $5 \equiv 13 \pmod{2}$.
(2) From $108 \equiv 192 \pmod{14}$, it follows that $9 \equiv 16 \pmod{7}$.
(3) From $224 \equiv 44 \pmod{15}$, it follows that $56 \equiv 11 \pmod{15}$.

30. Applications of the idea of congruence. The idea of congruence, together with the elementary properties that we have named, is sufficient for the solution of various interesting problems, of which we give a few examples.

I. To find the remainder when large numbers are divided by a given number.

(1) To find the remainder when $2^{40}$ is divided by 23: We know that $2^5 = 32$. Hence
$$2^5 \equiv 9 \pmod{23}.$$
Squaring, $2^{10} \equiv 81 \equiv 12 \pmod{23}$.
Squaring, $2^{20} \equiv 144 \equiv 6 \pmod{23}$.
Squaring, $2^{40} \equiv 36 \equiv 13 \pmod{23}$.
That is, if $2^{40}$ is divided by 23 the remainder is 13.

(2) To show that $2^{2^5}+1$ has the factor 641 (sec. 10): To show this it is sufficient to show that $2^{2^5}$, or $2^{32}$, has the remainder 640, or $-1$, when divided by 641. We have $2^2 = 4$, $2^4 = 16$, $2^8 = 16^2 = 256$,
$$2^{16} = 256^2 = 65{,}536 \equiv 154 \pmod{641},$$
$$2^{32} \equiv 154^2 \equiv 23{,}716 \equiv -1 \pmod{641}.$$
In all such problems the work of multiplication is reduced by taking the absolutely least residue, whether positive or negative.

(3) It is easily verified similarly that the following Mersenne's numbers (sec. 18) have the factor indicated:

  Number          Factor
  $2^{11}-1$        23
  $2^{23}-1$        47
  $2^{29}-1$       233
  $2^{37}-1$       223
  $2^{239}-1$      479
  $2^{251}-1$      503

(4) At the expense of a somewhat longer computation it can be verified in precisely the same way that $2^{97}-1$ has the factor 11,447, that $2^{223}-1$ has the factor 18,287, that $2^{2^{12}}+1$ has the factor 114,689, and even the statement of sec.
10 with respect to $2^{2^{36}}+1$ could be verified by a calculation that would indeed be tedious in itself, but that nevertheless, in view of the enormous number whose factor is verified, would be a striking example of the power of the method. It is easy to verify factors, such as the above, when once they are known, but it may be exceedingly difficult to find them.

II. Criteria for divisibility. If the digits of a number N read from right to left are a, b, c, d, e, f, g, ..., we have
$$N = a + 10b + 10^2c + 10^3d + 10^4e + 10^5f + 10^6g + \cdots$$

(1) Since $10 \equiv 1 \pmod{9}$, and hence by sec. 28, IV, Cor. 2, $10^2 \equiv 1 \pmod{9}$, $10^3 \equiv 1 \pmod{9}$, ..., we may write
$$N \equiv a + b + c + d + \cdots \pmod{9}.$$
If $a+b+c+d+\cdots$ is a multiple of 9, then N is a multiple of 9, and conversely. This is the well-known criterion: a number is a multiple of 9 if and only if the sum of its digits is a multiple of 9.

(2) Since $10 \equiv -1 \pmod{11}$ and hence $10^2 \equiv 1 \pmod{11}$, $10^3 \equiv -1 \pmod{11}$, $10^4 \equiv 1 \pmod{11}$, etc., we may write:
$$N \equiv a - b + c - d + e - f + \cdots \pmod{11}.$$
That is, a number is a multiple of 11 if and only if the sum of the digits in the odd-numbered places diminished by the sum of the digits in the even-numbered places is a multiple of 11.

(3) Since $10^3 + 1 = 7 \cdot 11 \cdot 13$, we seek to obtain criteria for divisibility by 7, 11, or 13 by taking residues of the terms of N according to the modulus $10^3+1$. Since $10^3 \equiv -1 \pmod{10^3+1}$, we obtain according to sec. 28, III, the following congruences, all taken (mod $10^3+1$):
$$10^4 \equiv -10, \quad 10^5 \equiv -10^2, \quad 10^6 \equiv -10^3 \equiv -(-1) \equiv 1, \quad 10^7 \equiv 10, \quad 10^8 \equiv 10^2, \quad \text{etc.}$$
Hence:
$$N \equiv (a + 10b + 10^2c) - (d + 10e + 10^2f) + (g + 10h + 10^2j) - \cdots \pmod{10^3+1}.$$
Consequently we may state the following criterion for divisibility by 7, 11, or 13. Beginning at the right, separate the given number into periods of three places each (the last period on the left may of course have fewer than three digits). Regard these periods as three-place numbers and add them with alternating signs.
If the algebraic sum thus obtained is divisible by 7, 11, or 13, the original number is so divisible, and otherwise not. Thus: To examine 847,963,207 as to divisibility by 7, 11, and 13, we form $207 - 963 + 847 = 91$. Since 91 is divisible by 7 and 13 but not by 11, it follows that 847,963,207 is divisible by 7 and by 13 but not by 11.

On examining the proofs above it appears that when the given divisor is not a factor of the number, the residue of the division will be furnished by the same test. Divisibility is simply the case in which the residue is zero. Thus, the residue when a number is divided by 9 is the same as the residue when the sum of its digits is divided by 9. Likewise, the number 847,963,207 has the residue 3 when divided by 11, since 91 (found as above) has the residue 3 on division by 11.

31. Roots of congruences. The congruence
$$a_0x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-2}x^2 + a_{n-1}x + a_n \equiv 0 \pmod{m},$$
where the a's are any numbers except that $a_0$ is not a multiple of m, is said to be of degree n in the unknown x. Any number $x_1$ which, when substituted for x, makes the left member congruent to the right (mod m) is said to satisfy the congruence and to be a root of the congruence. If any number $x_1$ is a root, all numbers congruent to $x_1$ (mod m) also satisfy the congruence (sec. 26). But these are not regarded as different roots. Taken modulo m, the totality of all numbers that are congruent to $x_1$ is regarded as a single value, and any number whatever of the totality may be selected to represent it; the least positive residue (mod m), for example, may be so chosen. The numbers 0, 1, 2, 3, ..., m-1 thus represent all the different values that exist (mod m); if we test a congruence for these, no other possibilities remain.

It is easy to show by special examples that the properties of equations as to existence and number of roots do not hold unmodified for congruences. Thus, the equation $ax = b$ (a not zero) always has one, and only one, root.
But we readily show by particular instances that the congruence $ax \equiv b \pmod{m}$ may have:

(1) No root at all. Example: $3x \equiv 5 \pmod{9}$. By trying the nine possible values for x, $x \equiv 0, 1, 2, 3, 4, 5, 6, 7, 8 \pmod{9}$, it appears that none satisfies the congruence. This could also be seen without trial by writing the congruence in the form $3x - 5 \equiv 0 \pmod{9}$. This means that x must be so chosen that $3x-5$ is a multiple of 9. But whatever the value of x, $3x-5$ is not even a multiple of 3, much less of 9.

(2) One root. Example: $5x \equiv 3 \pmod{9}$. By trying the nine possible values, it appears that the value 6, and no other, satisfies the congruence.

(3) More than one root. Example: $6x \equiv 3 \pmod{9}$. By trial it appears that the values 2, 5, 8, and no others, satisfy the congruence.

The roots of such congruences will be discussed in more detail in the next section. It is not difficult to prove the following theorem, which is somewhat analogous to the fundamental theorem of algebra that every equation of degree n has precisely n roots.*

Theorem. A congruence of degree n with a prime modulus cannot have more than n roots.

We omit the proof. The reader may supply it, following a line of argument analogous to that used for equations.†

* See Monograph V, secs. 7, 10, and Monograph IV, Appendix II.
† See Monograph V, sec. 10.

32. Theoretic solution of the linear congruence in one unknown. Given $ax \equiv b \pmod{m}$. It may be assumed that b is positive and less than m. If not given so, it may readily be made so by addition or subtraction of a multiple of m.

Case I. a prime to m. In ax substitute for x in turn the values 0, 1, 2, 3, ..., m-1, obtaining
$$ax = 0,\ a,\ 2a,\ 3a,\ \ldots,\ (m-1)a,$$
or, taking least positive residues (mod m),
$$ax \equiv c_0\ (=0),\ c_1,\ c_2,\ c_3,\ \ldots,\ c_{m-1} \pmod{m}.$$
Can any of the c's be equal? Suppose $c_k = c_h$, $k > h$. By definition $ka = c_k + rm$ and $ha = c_h + sm$. If $c_k = c_h$, we obtain $(k-h)a = (r-s)m$. But a is prime to m, hence $k-h$ is a multiple of m.
But $k-h$ is positive, and k is less than m, being some one of the numbers 1, 2, ..., m-1. Hence $k-h$ is less than m. Since $k-h$ is positive and less than m, it cannot be a multiple of m. Therefore the supposition $c_k = c_h$ is incorrect, and the c's are all different. Since there are m of them, and each one is some one of the m numbers 0, 1, 2, ..., m-1, the fact that they are all different has as consequence that the whole set of the c's must be the numbers 0, 1, 2, 3, ..., m-1 in some order. In the last set of numbers, the number b occurs once and only once. There is therefore exactly one c that is equal to b, or exactly one value of x such that $ax \equiv b \pmod{m}$. We have thus shown that: a linear congruence in which the coefficient of the unknown is prime to the modulus has one and only one solution.

Case II. Let a and m have the highest common factor d; $d > 1$. The congruence $ax \equiv b \pmod{m}$ means $ax = b + km$. Since a and m have the factor d, this equation cannot be true if b does not also have the factor d. That is, if $b \not\equiv 0 \pmod{d}$, our congruence has no solution.

Let $b \equiv 0 \pmod{d}$, and let
$$a = a_1 d, \qquad b = b_1 d, \qquad m = m_1 d$$
($a_1$ is prime to $m_1$, since d is the highest common factor of a and m). Then we may divide the given congruence, including the modulus, by d, obtaining
$$a_1 x \equiv b_1 \pmod{m_1}.$$
By Property III, sec. 28, every root of this congruence is a root of the given congruence. This congruence falls under the previous case, and has one and only one root. Let this root be r. Then all numbers of the form $r + km_1$ are equivalent so far as the modulus $m_1$ is concerned. All these numbers satisfy the given congruence. But are they equivalent to a single solution with respect to its modulus, m? Let $r + k_1 m_1$ and $r + k_2 m_1$ ($k_1 > k_2$) be equivalent according to the modulus m. That is:
$$r + k_1 m_1 \equiv r + k_2 m_1 \pmod{m}, \quad\text{or}\quad (k_1 - k_2)m_1 \equiv 0 \pmod{m}.$$
Hence, dividing the members of the congruence and the modulus by $m_1$, we obtain
$$k_1 - k_2 \equiv 0 \pmod{d}, \quad\text{or}\quad k_1 \equiv k_2 \pmod{d}.$$
That is, two numbers of the form $r + km_1$ are congruent (mod m) if, and only if, the values of k are congruent (mod d). Accordingly the given congruence has d solutions, obtained from the expression $r + km_1$ by giving k in turn the values 0, 1, 2, 3, ..., d-1.

EXAMPLES:
(1) $12x \equiv 6 \pmod{15}$. Here $d = 3$, $m_1 = 5$. Dividing through by d, $4x \equiv 2 \pmod{5}$. By trial, it is seen that this congruence is satisfied by $x \equiv 3 \pmod{5}$. Here $r = 3$, and $r + km_1$ becomes $3 + 5k$. By giving k the values 0, 1, 2, we obtain the three roots (mod 15): 3, 8, 13.
(2) $8x \equiv 12 \pmod{28}$. Here $d = 4$, $m_1 = 7$. Dividing through by 4, $2x \equiv 3 \pmod{7}$. By trial, this is seen to be satisfied by $x \equiv 5 \pmod{7}$. Here $r = 5$, and $r + km_1$ becomes $5 + 7k$. Giving k the values 0, 1, 2, 3, we obtain the four roots of the given congruence: 5, 12, 19, 26 (mod 28).

33. Numerical solution of the congruence $ax \equiv b \pmod{m}$. The preceding considerations merely proved the existence of one or more roots in certain cases, but provided no method other than trial for finding their numerical value. It will be sufficient to find such a method for the case of a prime to m, for we have seen above that the solution of a congruence in which a is not prime to m may be accomplished by the solution of a congruence in which a is prime to m.

We assert further that the solution of $ax \equiv b \pmod{m}$ can readily be found by means of the solution of $ax \equiv 1 \pmod{m}$. For let r be a root of the latter congruence; then $ar \equiv 1 \pmod{m}$, and, multiplying both members by b, $a(br) \equiv b \pmod{m}$. That is, br is the solution of the original congruence. The problem is then reduced to solving the congruence
$$ax \equiv 1 \pmod{m}.$$
In this congruence there are really two unknowns, x and the multiple of the modulus, call it y. That is, we seek values of x and y to satisfy the equation
$$ax = 1 + my, \quad\text{or}\quad ax - my = 1.$$
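As a check on the case analysis of sec. 32 and its examples, the procedure can be sketched in Python. This is only an illustration under our own naming (the function `solve_linear_congruence` does not come from the text), and the single root r of the reduced congruence is found by trial, exactly as in the worked examples, so it is practical only for small moduli.

```python
from math import gcd


def solve_linear_congruence(a, b, m):
    """All roots x in 0..m-1 of a*x ≡ b (mod m), following sec. 32."""
    d = gcd(a, m)
    if b % d != 0:
        return []  # Case II with b not a multiple of d: no solution
    a1, b1, m1 = a // d, b // d, m // d
    # a1 is prime to m1, so exactly one residue r (mod m1) satisfies
    # a1*x ≡ b1 (mod m1); it is found here by trial.
    r = next(x for x in range(m1) if (a1 * x - b1) % m1 == 0)
    # The d roots (mod m) are r + k*m1 for k = 0, 1, ..., d-1.
    return [r + k * m1 for k in range(d)]
```

For instance, `solve_linear_congruence(12, 6, 15)` reproduces the three roots of Example (1), and the three congruences of sec. 31 with modulus 9 give no root, one root, and three roots respectively.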
But the equation $ax - my = 1$ is familiar from the theory of continued fractions.* If the fraction $\frac{a}{m}$ is developed into a continued fraction, and if $\frac{Y}{X}$ is the last convergent before the value $\frac{a}{m}$ is reached, it is known that the relation
$$aX - mY = \pm 1$$
holds. Hence either X or $-X$ is a root of $ax \equiv 1 \pmod{m}$.

We have thus established the following rule for the computation of the root of $ax \equiv b \pmod{m}$. Develop $\frac{a}{m}$ into a continued fraction. The denominator of the last convergent before $\frac{a}{m}$ is reached will be the absolute value of the root of $ax \equiv 1 \pmod{m}$. Determine by trial which sign is to be taken; the value thus obtained multiplied by b is the root of $ax \equiv b \pmod{m}$.

* It will be recalled that expressions of the form
$$a + \cfrac{1}{b + \cfrac{1}{c + \cfrac{1}{d + \cdots}}},$$
a an integer, b, c, d, ... integers > 0, are called continued fractions. Every rational fraction can be expressed as a terminated continued fraction. Thus,
$$-\frac{29}{11} = -3 + \frac{4}{11} = -3 + \cfrac{1}{\frac{11}{4}} = -3 + \cfrac{1}{2 + \frac{3}{4}} = -3 + \cfrac{1}{2 + \cfrac{1}{\frac{4}{3}}} = -3 + \cfrac{1}{2 + \cfrac{1}{1 + \frac{1}{3}}}.$$
The fractions $\frac{a}{1}$, $a + \frac{1}{b}$, $a + \cfrac{1}{b + \frac{1}{c}}$, etc., are called the first, second, third, ..., convergents of the continued fraction. Thus the convergents of the fraction used as example above are
$$-3, \quad -3 + \frac{1}{2}, \quad -3 + \cfrac{1}{2 + \frac{1}{1}}, \quad -3 + \cfrac{1}{2 + \cfrac{1}{1 + \frac{1}{3}}},$$
or in reduced form, $-3$, $-\frac{5}{2}$, $-\frac{8}{3}$, $-\frac{29}{11}$. For proof of the property used in the main text, see works on college algebra.

For example: $49x \equiv 23 \pmod{125}$.
$$\frac{49}{125} = \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{2 + \frac{1}{2}}}}}}.$$
The last convergent before $\frac{49}{125}$ is $\frac{20}{51}$. Hence $X = 51$. By trial we find $49 \cdot 51 \equiv -1 \pmod{125}$; that is, $-51$ is a solution of $49x \equiv 1 \pmod{125}$. Multiplying $-51$ by 23 we obtain a solution of the original congruence:
$$23(-51) \equiv 77 \pmod{125}.$$
Therefore 77 is the solution of the original congruence, as may be verified by substitution.

The reader may solve and verify similarly other congruences taken at random, such as: $83x \equiv 7 \pmod{96}$; $11x \equiv 81 \pmod{85}$; $72x \equiv 27 \pmod{75}$; $75x \equiv 73 \pmod{85}$; and the like.

34. Fermat's Theorem.
If p is a prime, and a is prime to p, then $a^{p-1} \equiv 1 \pmod{p}$.*

* Announced without proof by Fermat in 1679; first proved by Euler in 1736. The Chinese are thought to have known this theorem for the case $a = 2$ as early as 500 B.C.

Proof. We have already proved (sec. 32) that the numbers
$$a,\ 2a,\ 3a,\ \ldots,\ (p-1)a$$
are congruent (mod p) to the residues
$$1,\ 2,\ 3,\ \ldots,\ p-1$$
in some order.* Multiplying these congruences together we have
$$a \cdot 2a \cdot 3a \cdots (p-1)a \equiv 1 \cdot 2 \cdot 3 \cdots (p-1) \pmod{p},$$
that is,
$$1 \cdot 2 \cdot 3 \cdots (p-1) \cdot a^{p-1} \equiv 1 \cdot 2 \cdot 3 \cdots (p-1) \pmod{p}.$$
Dividing both members by $1 \cdot 2 \cdot 3 \cdots (p-1)$, which is prime to the modulus, we obtain the desired result, $a^{p-1} \equiv 1 \pmod{p}$.

* This statement follows at once from the result in sec. 32, if we remember that $c_0 = 0$, that is, $a \cdot 0 \equiv 0 \pmod{p}$.

35. Applications.

(1) Find a congruence equivalent to the following, but of degree lower than 13:
$$x^{27} + 3x^{25} + 4x^{18} - 3x^{17} + 6x^{13} - 2x^7 + 11x - 5 \equiv 0 \pmod{13}.$$
By inspection it is evident that $x \equiv 0 \pmod{13}$ does not satisfy this congruence; we accordingly know that any root x is prime to 13, and hence that $x^{13-1} \equiv 1 \pmod{13}$. Further,
$$x^{27} = (x^{12})^2 \cdot x^3 \equiv (1)^2 x^3 \equiv x^3 \pmod{13},$$
$$3x^{25} = 3(x^{12})^2 \cdot x \equiv 3x \pmod{13},$$
$$4x^{18} = 4x^{12} \cdot x^6 \equiv 4x^6 \pmod{13},$$
$$3x^{17} \equiv 3x^5 \pmod{13},$$
$$6x^{13} \equiv 6x \pmod{13}.$$
Substituting these results in the original congruence, we obtain
$$-2x^7 + 4x^6 - 3x^5 + x^3 + 20x - 5 \equiv 0 \pmod{13}.$$

(2) To find the remainder when $47^{7385}$ is divided by 17. Dividing 7385 by $17-1$, or 16, we have $7385 = 461 \cdot 16 + 9$. Hence
$$47^{7385} = (47^{16})^{461} \cdot 47^9 \equiv (1)^{461} \cdot 47^9 \pmod{17}.$$
Now $47 = 2 \cdot 17 + 13$, or $47 \equiv 13 \pmod{17}$. Hence, by sec. 28, IV, Cor. 2, $47^9 \equiv 13^9 \pmod{17}$, or $47^{7385} \equiv 13^9 \pmod{17}$. We proceed to work out $13^9$:
$$13^2 = 169 \equiv -1 \pmod{17}.$$
Squaring, $13^4 \equiv 1 \pmod{17}$. Squaring again, $13^8 \equiv 1 \pmod{17}$. Multiplying both members by 13, $13^9 \equiv 13 \pmod{17}$. Therefore $47^{7385} \equiv 13 \pmod{17}$. The remainder is 13.

The reader may solve similarly other problems of this sort taken at random. For example, to find the remainder when 1237841 is divided by 29; when 30067489 is divided by 41, and the like.
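The reduction used in application (2) can be written out as a short Python sketch. It is meant only as an illustration of the steps above: the name `power_residue` is our own, the base is assumed to be prime to the prime modulus p (as Fermat's theorem requires), and the final multiplication loop is the plainest possible one rather than the repeated squaring of sec. 30.

```python
def power_residue(base, exponent, p):
    """Remainder of base**exponent on division by the prime p.

    Assumes p is prime and base is prime to p, so that Fermat's
    theorem, a^(p-1) ≡ 1 (mod p), may be applied.
    """
    if base % p == 0:
        raise ValueError("base must be prime to p")
    a = base % p              # sec. 27: replace the base by its residue
    e = exponent % (p - 1)    # Fermat: only the exponent modulo p-1 matters
    result = 1
    for _ in range(e):        # at most p-2 multiplications remain
        result = (result * a) % p
    return result
```

Applied to the example of the text, `power_residue(47, 7385, 17)` reduces the base to 13 and the exponent to 9, and the result agrees with Python's built-in `pow(47, 7385, 17)`.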
(3) If $n$ is any integer greater than 1, show that $n^{13} - n$ has the factor 2730. Since $2730 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 13$, it is sufficient to show that $n^{13} - n$ has each of these primes as factor.

The factor 2. $n^{13} - n = n(n^{12} - 1)$. If $n$ is even the factor 2 is present. If $n$ is odd we must show that $n^{12} - 1$ is even. This is evident at once, since any power of an odd number is odd; hence $n^{12}$ is odd and $n^{12} - 1$ is even. It also appears by Fermat's theorem thus: Since $n$ is prime to 2, $n^{2-1} \equiv 1 \pmod{2}$. $\therefore\ n^{12} = (n^{2-1})^{12} \equiv 1 \pmod{2}$, or $n^{12} - 1 \equiv 0 \pmod{2}$.

The factor 3. As above, unless $n$ is a multiple of 3, we must show that $n^{12} - 1$ is a multiple of 3. But by Fermat's theorem, $n^{3-1} \equiv 1 \pmod{3}$. $\therefore\ n^{12} = (n^{3-1})^{6} \equiv 1 \pmod{3}$, and $n^{12} - 1 \equiv 0 \pmod{3}$.

Similarly by writing our given expression in the forms
$$n[(n^{5-1})^3 - 1],\qquad n[(n^{7-1})^2 - 1],\qquad n[n^{13-1} - 1],$$
we show that it must have the factors 5, 7, and 13, and the proof is completed.

(4) Show that every prime number (except 2 and 5) is a factor of a boundless number of numbers all of whose digits are 9's. Let $p$ be a prime other than 2 or 5. Then $10^n$ is prime to $p$. Hence, by Fermat's theorem, $(10^n)^{p-1} - 1 \equiv 0 \pmod{p}$. This is true for every $n$. The number $(10^n)^{p-1} - 1$ always consists of 9's exclusively, and hence the theorem is proved.

(5) The congruence $ax \equiv b \pmod{p}$, where $a$ is prime to $p$, can be solved by multiplying both members by $a^{p-2}$ and applying Fermat's theorem, with the result $x \equiv b\,a^{p-2} \pmod{p}$.

36. Wilson's Theorem. If $p$ is a prime, $(p-1)! \equiv -1 \pmod{p}$.* For $p = 2$, the theorem is obviously true. We accordingly suppose $p > 2$, and to prove the theorem for this case, first prove the following lemma:

Lemma. The root of the congruence $ax \equiv 1 \pmod{p}$ is congruent to $a$, if, and only if, $a \equiv 1$ or $a \equiv p-1 \pmod{p}$.

Proof. By sec. 32 the congruence $ax \equiv 1 \pmod{p}$ has one root. Suppose it to be $a$. Then $a^2 \equiv 1 \pmod{p}$ or $(a-1)(a+1) \equiv 0 \pmod{p}$. But a product is a multiple of a prime $p$, if and only if one of its factors is a multiple of $p$.
Hence, either $a - 1 \equiv 0$ or $a + 1 \equiv 0 \pmod{p}$. That is, the root of the congruence $ax \equiv 1 \pmod{p}$ can be congruent to $a$ only if $a \equiv 1$ or $a \equiv p-1 \pmod{p}$. It can readily be verified that the root is congruent to $a$ in these cases, and the lemma is thus proved.

* First published, without proof, by Waring in his Meditationes Algebraicae, Cambridge, 1770, and ascribed by him to J. Wilson. It was proved by Euler in 1773, and by Gauss in his Disquisitiones Arithmeticae, 1801.

If now $a_1$ is one of the numbers $2, 3, 4, \ldots, p-2$, the root of the congruence $a_1 x \equiv 1 \pmod{p}$ will, by the lemma, be different from $a_1$; calling the root $a_2$, we have $a_1 a_2 \equiv 1 \pmod{p}$.

Consider next $a_3$, a third number of the set above. Denote the root of the congruence $a_3 x \equiv 1 \pmod{p}$ by $a_4$. Then we have $a_3 a_4 \equiv 1 \pmod{p}$, and by the lemma, $a_4$ is not congruent to $a_3$. We show further that it is not congruent to $a_2$. For, if $a_2 \equiv a_4 \pmod{p}$, then multiplying both members by $a_3$, $a_3 a_2 \equiv a_3 a_4 \pmod{p}$ or $a_3 a_2 \equiv 1 \pmod{p}$. But we know $a_1 a_2 \equiv 1 \pmod{p}$. $\therefore\ a_3 a_2 \equiv a_1 a_2 \pmod{p}$. Dividing both members by $a_2$, $a_3 \equiv a_1 \pmod{p}$. This is contrary to the choice of $a_3$ as different from $a_1$. Hence the hypothesis $a_2 \equiv a_4 \pmod{p}$ is incorrect. Similarly, it appears that $a_4 \not\equiv a_1 \pmod{p}$.

If now $a_5$ is a fifth number of the set different from the four already considered, and if $a_6$ denote the root of the congruence $a_5 x \equiv 1 \pmod{p}$, then by the same reasoning as above it appears that $a_6$ is not congruent to any one of the numbers $a_1, \ldots, a_5$. Continuing in the same way, the entire set of numbers $2, 3, \ldots, p-2$ can be grouped in pairs such that the product of the numbers in each pair is congruent to 1 (mod. $p$). That is,
$$a_1 a_2 \equiv 1,\quad a_3 a_4 \equiv 1,\quad \ldots,\quad a_{p-4}\, a_{p-3} \equiv 1 \pmod{p}.$$
We know further that $p - 1 \equiv -1 \pmod{p}$. Multiplying all these congruences member by member and remembering that the $a$'s are the numbers $2, 3, \ldots, p-2$ in some order, we obtain
$$2 \cdot 3 \cdots (p-2)(p-1) \equiv -1 \pmod{p}\quad\text{or}\quad (p-1)! \equiv -1 \pmod{p}.$$

37.
Wilson's theorem does not hold for composite moduli. For if $m$ is a composite number, and $k$ is one of its factors ($1 < k < m$), then $(m-1)!$ will have $k$ as a factor, and consequently $(m-1)! + 1$ will not be a multiple of $k$ and, therefore, not of $m$.

Accordingly, Wilson's theorem furnishes a theoretically complete criterion for determining whether or not a given number $n$ is prime. Namely, form $(n-1)!$; divide it by $n$; if the residue $-1$ can be obtained, $n$ is a prime; otherwise, $n$ is composite. But with large numbers, this method is of no practical use, on account of the enormous calculations that would be required.

38. Applications. (1) If $p$ is a prime number, the residue when $1 \cdot 2 \cdot 3 \cdots (p-1)$ is divided by $1 + 2 + 3 + \cdots + (p-1)$ is $p-1$. In symbols:
$$(p-1)! \equiv p-1 \pmod{1 + 2 + 3 + \cdots + (p-1)}.$$
By Wilson's theorem,
$$1 \cdot 2 \cdot 3 \cdots (p-1) = -1 + kp = (k-1)p + (p-1).$$
As the left member has the factor $p-1$, and the second term on the right is $p-1$, it follows that $(k-1)p$ must have the factor $p-1$, and since $p-1$ is prime to $p$, that $k-1$ has the factor $p-1$. Let $k - 1 = h(p-1)$. Substituting, we obtain
$$1 \cdot 2 \cdot 3 \cdots (p-1) = h(p-1)p + (p-1) = 2h \cdot \frac{(p-1)p}{2} + (p-1).$$
But $\frac{(p-1)p}{2} = 1 + 2 + \cdots + (p-1)$. Hence the assertion is proved.

(2) If $p$ is a prime of the form $4n+1$, then $(1 \cdot 2 \cdot 3 \cdots 2n)^2 + 1$ is a multiple of $p$. By Wilson's theorem, $1 \cdot 2 \cdot 3 \cdots (p-1) + 1 \equiv 0 \pmod{p}$. Or
$$1 \cdot 2 \cdot 3 \cdots 2n\,(2n+1) \cdots (4n-2)(4n-1)(4n) + 1 \equiv 0 \pmod{p}. \qquad (1)$$
But, since $p = 4n+1$,
$$4n \equiv -1,\quad 4n-1 \equiv -2,\quad \ldots,\quad 2n+2 \equiv -(2n-1),\quad 2n+1 \equiv -2n \pmod{p}.$$
$$\therefore\ (2n+1)(2n+2) \cdots (4n-1)(4n) \equiv (-1)^{2n}\, 1 \cdot 2 \cdot 3 \cdots 2n \pmod{p}. \qquad (2)$$
From (2) and (1), $[1 \cdot 2 \cdot 3 \cdots 2n]^2 + 1 \equiv 0 \pmod{p}$.

V. BINOMIAL CONGRUENCES

39. Definition. Congruences of the form $x^n - A \equiv 0 \pmod{m}$ are called binomial congruences. We shall consider only $x^n - 1 \equiv 0 \pmod{p}$ where $p$ is a prime. By Fermat's theorem (sec. 34) we can always make $n < p$. If $p = 2$, the congruence is linear (since $n < p$), and has already been solved.
We accordingly suppose throughout the subject of binomial congruences that $p$ is a prime greater than 2.

40. The solutions of $x^m \equiv 1 \pmod{p}$, where $m$ is any positive integer, must be prime to $p$, and are therefore, by Fermat's theorem, also solutions of $x^{p-1} \equiv 1 \pmod{p}$. Further, by sec. 28, IV, Cor. 2, every solution of $x^m \equiv 1 \pmod{p}$ will also be a solution of $x^{km} \equiv 1 \pmod{p}$, for every $k$.

41. Theorem. If $a$ is a root of $x^n \equiv 1 \pmod{p}$ and also of $x^q \equiv 1 \pmod{p}$, and if $d$ is the highest common factor of $n$ and $q$, then $a$ is a root of $x^d \equiv 1 \pmod{p}$.

Proof. Let $n = n'd$, $q = q'd$. Then $n'$ and $q'$ are relatively prime, and the congruence $n'z \equiv 1 \pmod{q'}$ admits one solution (sec. 32). That is, there exist numbers $z$ and $y$ satisfying the equation $n'z = 1 + yq'$ or $n'z - yq' = 1$; or, multiplying through by $d$, $nz - yq = d$.

By hypothesis, $a^n \equiv 1 \pmod{p}$; hence $a^{nz} \equiv 1 \pmod{p}$; and $a^q \equiv 1 \pmod{p}$; hence $a^{qy} \equiv 1 \pmod{p}$. Subtracting:
$$a^{nz} - a^{qy} \equiv 0 \pmod{p}\quad\text{or}\quad a^{qy}(a^{nz-qy} - 1) \equiv 0 \pmod{p}.$$
Since $a$ must be prime to $p$, $a^{nz-qy} - 1 \equiv 0 \pmod{p}$ or $a^d - 1 \equiv 0 \pmod{p}$. That is: $a$ is a root of $x^d \equiv 1 \pmod{p}$.

Corollary. If $d$ is the highest common factor of $n$ and $p-1$, the solutions of $x^n \equiv 1 \pmod{p}$ satisfy also $x^d \equiv 1 \pmod{p}$. It is accordingly sufficient to consider only congruences of the type $x^d \equiv 1 \pmod{p}$, $d$ a divisor of $p-1$.

42. Definition. The number $a$ is said to belong to the exponent $d$ (mod. $p$), if $a^d \equiv 1 \pmod{p}$ and if $a^y \not\equiv 1 \pmod{p}$ whenever $y < d$.

43. Theorem. If $a$ belongs to the exponent $d$ (mod. $p$), then $a^t \equiv 1 \pmod{p}$ if and only if $t$ is a multiple of $d$.

Proof. Let $a^t \equiv 1 \pmod{p}$ and let $D$ be the highest common factor of $t$ and $d$. Hence by sec. 41, $a^D \equiv 1 \pmod{p}$. If $t$ is not a multiple of $d$, $D < d$, and in this case $a$ would satisfy a congruence of degree $D$, less than $d$, the exponent to which $a$ belongs. Hence $t$ must be a multiple of $d$.

44. Theorem. If $a$ belongs to the exponent $r$, and $b$ belongs to the exponent $s$ (mod.
$p$) and if $r$ and $s$ are relatively prime, then $ab$ belongs to the exponent $rs$ (mod. $p$).

Proof. The hypotheses mean that $a^r \equiv 1 \pmod{p}$, $b^s \equiv 1 \pmod{p}$, and that $a$ and $b$ satisfy no congruences of this type of lower degree. We have to prove (i) that $(ab)^{rs} \equiv 1 \pmod{p}$, and (ii) that no lower power of $ab$ is congruent to 1 (mod. $p$).

(i) $(ab)^{rs} = a^{rs} b^{rs} = (a^r)^s (b^s)^r \equiv 1^s \cdot 1^r \equiv 1 \pmod{p}$.

(ii) Let $k$ be any exponent such that $(ab)^k \equiv 1 \pmod{p}$. Then $a^k b^k \equiv 1 \pmod{p}$. Raising both members to the power $r$, $a^{rk} b^{rk} \equiv 1 \pmod{p}$, or, since $a^r \equiv 1 \pmod{p}$, $b^{rk} \equiv 1 \pmod{p}$. Hence, since $b$ belongs to $s$, $rk$ is a multiple of $s$, by sec. 43, and therefore, since $r$ is prime to $s$, $k$ is a multiple of $s$. Quite similarly, it may be shown that $k$ is a multiple of $r$. Hence, since $r$ and $s$ are relatively prime, $k$ is a multiple of $rs$. Hence the lowest value of $k$ is $rs$ itself, and the proof that $ab$ belongs to $rs$ is completed.

45. If $r$ and $s$ are not relatively prime, and if $m$ denote their least common multiple, it can be proved in an analogous manner that a number belonging to $m$ can be determined by means of $a$ and $b$.

46. Theorem. To every divisor, $d$, of $p-1$, there belongs (mod. $p$) at least one number $a$.

Proof. (1) We take up first the case $d = q^{\alpha}$, where $q$ is a prime. Writing $p - 1 = f q^{\alpha}$, the congruence
$$x^{p-1} - 1 \equiv 0 \pmod{p} \qquad (1)$$
may be written
$$x^{f q^{\alpha}} - 1 \equiv 0 \pmod{p}; \qquad (2)$$
or
$$(x^{q^{\alpha}} - 1)\left(x^{(f-1)q^{\alpha}} + x^{(f-2)q^{\alpha}} + \cdots + x^{q^{\alpha}} + 1\right) \equiv 0 \pmod{p}. \qquad (3)$$
But by Fermat's theorem the congruence (1) is satisfied for every value of $x$ except those $\equiv 0 \pmod{p}$. Accordingly the congruence (3) has the maximum number of roots. But since neither factor of the left member of (3) can be congruent to zero for more roots than there are units in its degree, it follows that each factor is congruent to zero for as many roots as there are units in its degree. In particular:
$$x^{q^{\alpha}} - 1 \equiv 0 \pmod{p} \qquad (4)$$
has $q^{\alpha}$ roots. Some of these will also satisfy congruences of this type and of lower degree. By secs.
41, 40, all such roots will satisfy $x^{q^{\alpha-1}} - 1 \equiv 0 \pmod{p}$. But by what has just been proved this congruence has precisely $q^{\alpha-1}$ roots. Hence (sec. 40), there are precisely $q^{\alpha} - q^{\alpha-1}$ or $q^{\alpha-1}(q-1)$ roots of the congruence (4) that satisfy no congruence of lower degree. That is, there exist precisely $q^{\alpha-1}(q-1)$ incongruent numbers belonging to the exponent $q^{\alpha}$ (mod. $p$).

(2) Case, $d$ any divisor of $p-1$. Let $d = q^{\alpha} r^{\beta} s^{\gamma} \cdots$, where $q, r, s$ are different primes. Then by (1), there exists a number, call it $a$, belonging to $q^{\alpha}$; and there exists a number, call it $b$, belonging to $r^{\beta}$. Hence, by the theorem of sec. 44, $ab$ belongs to $q^{\alpha} r^{\beta}$. By (1) there exists a number, call it $c$, belonging to $s^{\gamma}$. Since $q^{\alpha} r^{\beta}$ and $s^{\gamma}$ are relatively prime, $(ab)c$ belongs to $q^{\alpha} r^{\beta} s^{\gamma}$. Continuing in this way, all the factors of $d$ are used, and the existence of a number belonging to $d$ is established.

Corollary. There exists at least one number, call it $g$, belonging to $p-1$.

47. Definition. If $g$ belongs to the exponent $p-1$, then $g$ is called a primitive root of the congruence $x^{p-1} \equiv 1 \pmod{p}$, or briefly, a primitive root of $p$.

48. Theorem. If $g$ is a primitive root of $p$, the numbers $g, g^2, g^3, \ldots, g^{p-1}$ are distinct (mod. $p$) and have the residues $1, 2, 3, \ldots, p-1$ in some order.

Proof. Suppose $g^h \equiv g^k \pmod{p}$, $p - 1 \geq h > k \geq 1$. Then $g^{h-k} \equiv 1 \pmod{p}$. But $p - 1 > h - k \geq 1$. Hence this result contradicts the hypothesis that $g$ is a primitive root of $p$. Consequently, the $p-1$ powers $g, g^2, \ldots, g^{p-1}$ all have different residues (mod. $p$), and therefore have the residues $1, 2, 3, \ldots, p-1$ in some order.

49. Theorem. If $g$ is a primitive root of $p$, and if $k$ is prime to $p-1$, $g^k$ is a primitive root of $p$.

Proof. Let $(g^k)^h \equiv 1 \pmod{p}$. Then, since $g$ belongs to the exponent $p-1$, $kh \equiv 0 \pmod{p-1}$. Hence, since $k$ is relatively prime to $p-1$, $h \equiv 0 \pmod{p-1}$. The lowest admissible value of $h$ is therefore $p-1$; that is, $g^k$ belongs to the exponent $p-1$, and is hence a primitive root of $p$.

Corollary.
There are $\varphi(p-1)$ primitive roots of $p$.

50. The actual value of a primitive root may be found by trial, if the modulus is small. Thus, for $p = 17$, we try 2:
$$\begin{aligned}
2^1 &= 2, & 2^5 &\equiv -2 \pmod{17},\\
2^2 &= 4, & 2^6 &\equiv -4 \pmod{17},\\
2^3 &= 8, & 2^7 &\equiv -8 \pmod{17},\\
2^4 &= 16 \equiv -1 \pmod{17}, & 2^8 &\equiv -16 \equiv 1 \pmod{17}.
\end{aligned}$$
That is, 2 belongs to the exponent 8, and is not a primitive root of 17. Nor can any of the residues obtained, 2, 4, 8, 16, 15 ($\equiv -2$), 13, 9, 1, be primitive roots. For they are all of the form $2^k$, and $(2^k)^8$ or $2^{8k}$ is $\equiv 1 \pmod{p}$ since $2^8$ is so. Hence all of these residues belong either to 8 or to a divisor of 8. The smallest number not in the above list is 3. Trying 3, it is found to belong to the exponent 16; that is, 3 is a primitive root of 17.

It can also be proved without trial that 3 must be a primitive root. For, since $\varphi(16) = 8$, 17 has 8 primitive roots. But there are 8 residues of $2^k$, none of which is a primitive root; consequently, each one of the 8 other non-zero residues must be a primitive root. In particular, 3 is a primitive root.

If the second trial likewise does not lead to a primitive root, the theorems above (secs. 44, 45) enable us to determine a number belonging to the least common multiple of the two exponents. If this least common multiple is $p-1$ itself, we have found a primitive root. If not, we have at least a number belonging to a much larger exponent, and all its powers are thus excluded from further consideration. In this way, systematic trial enables us to find a primitive root. For large primes, the calculations may become laborious.

Some general theorems are known as to primitive roots. For example: If a prime is of the form $2^{2n} + 1$, it has the primitive root 3. If a prime $p$ is of the form $8n+3$, and if $4n+1$ is also prime, $p$ has the primitive root 2.*

* Tschebyscheff, Theorie der Congruenzen, Berlin, 1889, p. 306, et seq., where others are given and proved.

VI. QUADRATIC CONGRUENCES

51.
The most general congruence of the second degree in one unknown is
$$ax^2 + bx + c \equiv 0 \pmod{m}.$$
We simplify the form of this as follows: Multiply both members and the modulus by $4a$,
$$4a^2x^2 + 4abx + 4ac \equiv 0 \pmod{4am}.$$
(The modulus is multiplied also, so that the inverse operation may always be possible); or:
$$(2ax + b)^2 - b^2 + 4ac \equiv 0 \pmod{4am}.$$
Putting
$$y \equiv 2ax + b \pmod{4am},\qquad d \equiv b^2 - 4ac \pmod{4am},$$
the congruence becomes $y^2 \equiv d \pmod{4am}$. From the values of $y$, we find the values of $x$ by solution of the linear congruence $y \equiv 2ax + b \pmod{4am}$.

52. If $4am = p_1^{k_1} p_2^{k_2} \cdots p_l^{k_l}$, where $p_1, p_2, \ldots, p_l$ are different primes, any number that satisfies the congruence
$$y^2 - d \equiv 0 \pmod{4am} \qquad (1)$$
will also satisfy each of the congruences
$$y^2 - d \equiv 0 \pmod{p_1^{k_1}},\quad \ldots,\quad y^2 - d \equiv 0 \pmod{p_l^{k_l}}. \qquad (2)$$
Conversely, the definition of a congruence shows that any number that satisfies each of the congruences (2) satisfies also congruence (1).

53. The solution of the general quadratic congruence is thus reduced to that of the type $x^2 - a \equiv 0 \pmod{p^k}$, where $p$ is a prime. Any solution of this congruence is also a solution of $x^2 - a \equiv 0 \pmod{p^h}$, where $h < k$, and, in particular, of
$$x^2 - a \equiv 0 \pmod{p}. \qquad (3)$$
We shall restrict further consideration to congruences of this type, and as preliminary example take the modulus 7. Forming the squares of the seven least positive residues (mod. 7) we have:
$$0^2 \equiv 0,\quad 1^2 \equiv 1,\quad 2^2 \equiv 4,\quad 3^2 \equiv 2,\quad 4^2 \equiv 2,\quad 5^2 \equiv 4,\quad 6^2 \equiv 1 \pmod{7}.$$
We see from this that the congruence $x^2 \equiv a \pmod{7}$ has a solution if $a \equiv 0, 1, 2, 4$, but has no solution if $a \equiv 3, 5, 6$. The former numbers are residues of squares according to the modulus 7, or briefly quadratic residues of 7; the latter numbers are not such residues.

54. Definition. If the congruence $x^2 \equiv a \pmod{p}$ admits a solution, the number $a$ is called a quadratic residue of $p$; otherwise it is called a quadratic non-residue of $p$. When there is no danger of misinterpretation, the word "quadratic" is often omitted for brevity.

55.
It can be proved without much difficulty that the product of two residues of $p$ is also a residue; that the product of two non-residues is a residue; and that the product of a residue and a non-residue is a non-residue.

56. These results can be stated in the form of a single equation by the use of the symbol $\left(\frac{a}{p}\right)$,* which is defined as having the value $+1$ if $a$ is a residue of $p$, and $-1$ if $a$ is a non-residue of $p$. Then we have always:
$$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right).$$
It follows that if $m = (-1)^a\, 2^b\, p_1^{c}\, p_2^{d} \cdots$, then:
$$\left(\frac{m}{p}\right) = \left(\frac{-1}{p}\right)^{a}\left(\frac{2}{p}\right)^{b}\left(\frac{p_1}{p}\right)^{c}\left(\frac{p_2}{p}\right)^{d} \cdots$$

* Introduced by Legendre, and known as "Legendre's symbol."

57. To determine whether or not any number $m$ is a residue of $p$, it is sufficient to determine whether or not $-1$, 2, and the odd prime factors of $m$ are residues of $p$. The following results may be proved:

I. $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$.

II. $\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}$.

III. $\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}$ ($p$, $q$ odd primes).

58. The last is an important theorem, known as Legendre's Law of Reciprocity, and may be stated as follows: If $p$ and $q$ are two odd primes and if at least one of them is of the form $4n+1$, then $q$ is a residue of $p$ if and only if $p$ is a residue of $q$; while if both $p$ and $q$ are of the form $4n+3$, then $q$ is a residue of $p$ when $p$ is a non-residue of $q$, and vice versa.

This theorem was discovered empirically by Euler (1783), announced in its general form by Legendre (1785), and partly proved by him. The first complete proof was, however, due to Gauss, who gave eight distinct proofs. Many others have been given down to the present time.* For further information and for a full presentation of some of the proofs, the reader is referred to the works mentioned in the Bibliography.

* A chronological list of 49 proofs, extending from the first proof, published by Gauss in 1801, to three proofs by Lange in 1896-97, is given in Bachmann, Niedere Zahlentheorie, I, pp. 203-4.

VII. BIBLIOGRAPHY

59. The classic work in our subject is the Disquisitiones Arithmeticae of C. F.
Gauss, published in 1801, when Gauss was only twenty-four years of age, and really completed a few years earlier. In this work Gauss gave a masterly presentation of the subject which has remained unequalled; unlike many masterpieces, it is written so clearly and simply that much of it is intelligible to the beginner. A German translation by Maser (Berlin, 1889), and a French translation by Poullet-Delisle (Paris, 1807), make the work more widely accessible.

The following texts also take up the subject from the beginning, reaching varying degrees of advancement:
Dirichlet-Dedekind, Zahlentheorie, Braunschweig, 4th ed., 1894.
Bachmann, Niedere Zahlentheorie, I, Leipzig, 1902.
Cahen, Théorie des Nombres, Paris, 1900.
Mathews, Theory of Numbers, I, Cambridge, 1892.
These works contain numerous references, both to the older and the contemporary literature.

An excellent sketch of the principal results and present state of our subject is given in the Encyclopädie der Mathematischen Wissenschaften, Band I, 2ter Teil, appearing with additions in the French translation, Encyclopédie des Sciences Mathématiques, Tome I, Vol. III.

The theory of numbers figures largely in the field of "Mathematical recreations." An introduction to this field may be obtained through some or all of the following:
Ball, Mathematical Recreations and Problems, 3d ed., London, 1890.
Bachet de Méziriac, Problèmes plaisants et délectables qui se font par les nombres. First published in 1612, and reprinted at Paris in 1884.
Lucas, Récréations Mathématiques, 4 vols., Paris, 1891-96.
Ahrens, Mathematische Unterhaltungen und Spiele, Leipzig, 1900.
In this connection mention may also be made of a paper by Bouton on "Nim, A Game with a Complete Mathematical Theory" (Annals of Math., ser. 2, Vol. III, pp. 35-39, 1901), recently generalized by Moore (ibid., Vol. XI, pp. 90-94, 1910).

VIII

CONSTRUCTIONS WITH RULER AND COMPASSES; REGULAR POLYGONS

By L. E. DICKSON

CONTENTS 1. Introduction. 2.
Analytic criterion for constructibility. 3. Graphical solution of a quadratic equation. 4. Domain of rationality. 5. Functions involving no irrationalities other than square root. 9. Reducible and irreducible functions. 11. Fundamental theorem; Duplication of the cube; Trisection of an angle; Quadrature of the circle. 13. Connection between regular polygons and roots of unity. 14. De Moivre's theorem. 17. Regular pentagon and decagon. 19. Regular polygon of 17 sides. 20. Construction of the regular polygon of 17 sides. 21. Gauss's theory of regular polygons. 28. Primitive roots of unity. 30. Gauss's lemma. 31. Irreducibility of the cyclotomic equation. 32. Proofs of theorems cited earlier. 39. References.

1. Introduction. The Greek geometricians discovered constructions by ruler and compasses for various elementary problems. There arose, however, certain famous problems, such as the duplication of a cube, the trisection of an angle, and the quadrature of a circle, for which the ancients vainly sought constructions by ruler and compasses. The impossibility of these constructions was proved only in recent times. As such proofs are beyond the scope of elementary geometry, recourse must be had to analytic methods, in particular to the general processes and theorems of algebra. To these analytic methods is due likewise the discovery of the possibility of certain constructions. This is the case, for instance, with the regular polygon of seventeen sides, the possibility of whose construction by ruler and compasses was not suspected during the twenty centuries from Euclid to Gauss.

2. Analytic criterion for constructibility. The first step in our consideration of a proposed construction consists in formulating the problem analytically. In some instances elementary algebra suffices for this formulation.
For example, in the ancient problem of the duplication of the cube, we are given the length $s$ of a side and seek a number $x$ such that $x^3 = 2s^3$. But usually it is convenient to employ analytic geometry; a point is determined by its coordinates $x$ and $y$ with reference to fixed axes, a straight line or circle by an equation of the first or second degree between the coordinates of the general point on it. Hence we are concerned with certain numbers, some being the coordinates of points, others being the ratios of the coefficients in equations, and others expressing lengths, areas, or volumes. We shall establish the following

Criterion. A proposed construction is possible by ruler and compasses if, and only if, the numbers which define analytically the desired geometric elements can be derived from those defining the given elements by a finite number of rational operations and extractions of real square roots.

Suppose, first, that the construction is possible. The straight lines and circles drawn in making the construction are located by means of points either initially given or obtained as the intersections of two straight lines, a straight line and a circle, or two circles. The coordinates of the intersection of two straight lines are rational functions of the coefficients of the equations of the lines. To determine the coordinates of the intersection of the straight line $y = mx + b$ with the circle
$$(x - c)^2 + (y - d)^2 = r^2,$$
we eliminate $y$ between the equations and obtain a quadratic equation for $x$. Thus $x$ (and hence $mx + b$ or $y$) involves no irrationality (in addition to those in $m, b, c, d, r$) other than the square root of a certain known expression. Finally, the intersections of the preceding circle with a second circle,
$$(x - e)^2 + (y - f)^2 = s^2,$$
are given by the intersections of one of the circles with their common chord, whose equation is obtained by subtracting the members of the equation of one circle from those of the other.
This third case has therefore been reduced to the second. The property stated in the criterion is thus proved.

Conversely, let there be no irrationalities other than real square roots. Then the construction is possible by ruler and compasses. First, a rational function of given quantities is obtained by the operations, addition, subtraction, multiplication, and division. The construction of the sum or difference of two segments is obvious. The construction, by means of parallel lines, of a segment whose length $p$ is the product $a \cdot b$ of the lengths of two given segments is shown in Fig. 1; that for the quotient $q = a/b$ in Fig. 2. Next, a segment of length $r = \sqrt{m}$ may be constructed, as in Fig. 3, by drawing a semicircle on a diameter composed of two segments of lengths 1 and $m$, then a perpendicular to the diameter.

[Fig. 1: construction of the product $p = a \cdot b$ by parallel lines. Fig. 2: construction of the quotient $q = a/b$. Fig. 3: construction of $r = \sqrt{m}$ by a semicircle.]

3. Graphical solution of a quadratic equation. The roots of $x^2 - ax + b = 0$ are $\frac{1}{2}\left(a \pm \sqrt{a^2 - 4b}\right)$. When the roots are real, the only irrationality is a real square root. The criterion for constructibility in sec. 2 is therefore satisfied. Of various methods of making the construction, the following * is especially simple: Draw a circle having as diameter the line $BQ$ joining the points $B = (0, 1)$ and $Q = (a, b)$. The abscissas $ON$ and $OM$ of the points of intersection of this circle with the $x$-axis are the roots of the quadratic $x^2 - ax + b = 0$.

* Accredited to Lill by D'Ocagne, Le Calcul Simplifié, Paris, 1905, p. 139.

First Proof. In Fig. 4, $OB = 1$, $OT = a$, $TQ = b$. The centre of the circle is thus $\left(\frac{a}{2}, \frac{b+1}{2}\right)$; its diameter is the hypotenuse of a right triangle with legs $a$ and $b - 1$. Hence the equation of the circle is
$$\left(x - \frac{a}{2}\right)^2 + \left(y - \frac{b+1}{2}\right)^2 = \left(\frac{a}{2}\right)^2 + \left(\frac{b-1}{2}\right)^2.$$
To find its intersection with the $x$-axis, we set $y = 0$, and get $x^2 - ax + b = 0$.

Second Proof. To give a proof by elementary geometry, let $OB$ meet the circle again at $C$, and let $TQ$ meet it at $D$. Join $CQ$ and $BD$.
Since $BQ$ is a diameter, angles $C$ and $D$ are right angles. Hence $OC = b$, $DT = OB$. Since parallel lines intercept equal arcs, chords $BN$ and $DM$ are equal. Thus triangles $BON$ and $DTM$ are congruent, whence $ON = MT$. Thus,
$$OM + ON = OM + MT = OT = a.$$
The product of the segments on one secant equals the product of those on another from the same point. Hence $OM \cdot ON = OC \cdot OB = b \cdot 1 = b$. Since $OM$ and $ON$ have the sum $a$ and the product $b$, they are the roots of $x^2 - ax + b = 0$.

[Fig. 4: the circle on diameter $BQ$, meeting the $x$-axis at $N$ and $M$, the line $OB$ again at $C$, and the line $TQ$ at $D$.]

4. Domain of rationality. If a set of numbers has the property that, when each of the rational operations, addition, subtraction, multiplication, and division (the divisor not being zero), is performed on any two numbers of the set, the result is one of the numbers of the given set, the set of numbers is said to form a domain of rationality.

For example, the set of all real numbers forms a domain of rationality, since the sum, difference, product, or quotient of any two real numbers is a real number. Again, the set of all rational numbers (that is, all positive and negative integers and fractions) forms a domain of rationality. But the set of all positive integers is not a domain of rationality, since the difference of two positive integers is not always a positive integer. Nor is the set of all positive and negative integers a domain of rationality, since the quotient of two integers is not always an integer.

The set of all rational functions, with integral coefficients, of assigned numbers $a, b, c, \ldots$, forms a domain of rationality; it is said to be defined by $a, b, c, \ldots$ * If, in a proposed construction, the given geometric elements are determined analytically by the numbers $a, b, c, \ldots$, the domain of rationality defined by $a, b, c, \ldots$ will be called the domain of the geometric data and designated by $D$.

5. Functions involving no irrationalities other than square roots. Let $x$ be a function derived from the numbers $a, b, c, \ldots$
of the domain $D$ by rational operations and extractions of square roots, finite in number. The purpose of investigating such functions $x$ is to deduce a condition for constructibility more easily applied than the criterion in sec. 2.

The number of superimposed radical signs in a term of $x$ is called the order of the term; the maximum order of the various terms of $x$ is denoted by $m$. For example, in
$$x = \sqrt{\sqrt{a} + b} + \sqrt{c + \sqrt{d}} + \sqrt{\sqrt{e}} + \sqrt{f} + g,$$
the first three terms are of order 2, the fourth is of order 1, the last term $g$ is of order zero; consequently, $m = 2$.

Frequently a function $x$ can be given a modified form involving fewer radicals. Thus, $\sqrt{9}$ can be replaced by 3, and $\sqrt{4 - 2\sqrt{3}}$ by $\sqrt{3} - 1$. If $r = \sqrt{3 + \sqrt{5}}$ and $r' = \sqrt{3 - \sqrt{5}}$, then $rr' = 2$, so that $2r - 7r'$, which involves two radicals of order 2, can be replaced by $2r - \frac{14}{r}$, which involves only one radical of order 2. Again, if $x$ involves $\sqrt{3}$, $\sqrt{5}$, and $\sqrt{15}$, we would replace $\sqrt{15}$ by the product $\sqrt{3}\,\sqrt{5}$. In general, if any of the various radicals of order $n$ is a rational function of the remaining radicals of order $n$ and the radicals of lower order, we assume that it is so expressed in terms of the other radicals.

* See also Monograph V, sec. 9.

Hence, after all such simplifications are made, no one of the various radicals of order $m$ is a rational function of the remaining radicals of order $m$, and the radicals of lower order occurring separately or underneath other radical signs; likewise, no radical of order $m - 1$ is a rational function of the remaining radicals of order $m - 1$ and the radicals of lower order, etc. The distinct radicals which occur in this simplified form of $x$ will therefore be said to be independent.

In case the resulting function $x$ is a sum of several fractions, we bring them to a common denominator and express $x$ as the quotient of two integral functions of the radicals. For example, if $x = \sqrt{5} + 2r - \frac{14}{r}$, where $r = \sqrt{3 + \sqrt{5}}$, we give $x$ the form $\frac{A}{r}$, where $A = r\sqrt{5} + 2(3 + \sqrt{5}) - 14$.
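The simplifications of this section can be checked numerically. The following is only a sketch, assuming Python with floating-point arithmetic; the variable names and the tolerance are illustrative choices, not part of the text. It confirms that $rr' = 2$, that $2r - 7r'$ equals $2r - 14/r$, and that $\sqrt{5} + 2r - 14/r$ equals $A/r$ for the $A$ stated above.

```python
import math

# The two radicals of order 2 used in the example (illustrative names).
r = math.sqrt(3 + math.sqrt(5))
r_prime = math.sqrt(3 - math.sqrt(5))


def close(u: float, v: float, tol: float = 1e-9) -> bool:
    """True when two floating-point values agree to within tol."""
    return abs(u - v) < tol


# r * r' = sqrt(9 - 5) = 2, so r' may be replaced by 2 / r.
product = r * r_prime

# 2r - 7r' rewritten with a single radical of order 2.
two_radicals = 2 * r - 7 * r_prime
one_radical = 2 * r - 14 / r

# x as a sum of fractions, and as the single quotient A / r.
x_direct = math.sqrt(5) + 2 * r - 14 / r
A = r * math.sqrt(5) + 2 * (3 + math.sqrt(5)) - 14
x_quotient = A / r

if __name__ == "__main__":
    print(close(product, 2.0))
    print(close(two_radicals, one_radical))
    print(close(x_direct, x_quotient))
```

A numerical agreement of this kind is of course no proof; it merely guards against slips in the algebra, since $2r^2 = 2(3 + \sqrt{5})$ is the only identity used in passing to the common denominator.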
Next, we rationalize the denominator by the following process: If the denominator contains a radical $\sqrt{k}$ of the maximum order $m$, it can be given the form $a + b\sqrt{k}$, where $a$ and $b$ do not involve $\sqrt{k}$. We then multiply the numerator and denominator by $a - b\sqrt{k}$. Similarly, we rid the denominator of each radical of order $m$, then rid it of each radical of order $m - 1$, etc. Thus, in the preceding example, $x = \frac{A}{r}$, where $r = \sqrt{3 + \sqrt{5}}$, the first step gives
$$\frac{A(-r)}{r(-r)} = \frac{-Ar}{-3 - \sqrt{5}}.$$
The next step gives
$$\frac{-Ar(-3 + \sqrt{5})}{(-3 - \sqrt{5})(-3 + \sqrt{5})} = \frac{3Ar - Ar\sqrt{5}}{4}.$$
We have now proved that $x$ can be given a normal form composed of a sum of terms each a product of radicals, having as coefficient a number of the domain $D$, and such that the distinct radicals are independent. For example, $5 + \sqrt{5 - \sqrt{7}} + 4\sqrt{5}\,\sqrt{7}$ is in normal form.

6. Let $n$ be the number of distinct radicals (including radicals occurring beneath radical signs) found in the normal form of $x$. By changing the sign of one or more of these $n$ radicals everywhere that it occurs in $x$, we obtain $2^n$ conjugate functions $x = x_1, x_2, \ldots, x_{2^n}$. For example,
$$x_1 = \sqrt{3 + 2\sqrt{5}} + \sqrt{3 - 2\sqrt{5}}$$
is one of $2^3 = 8$ conjugates, of which only 4 are distinct, namely, $x_1$,
$$x_2 = \sqrt{3 + 2\sqrt{5}} - \sqrt{3 - 2\sqrt{5}},\quad x_3 = -\sqrt{3 + 2\sqrt{5}} + \sqrt{3 - 2\sqrt{5}},\quad x_4 = -\sqrt{3 + 2\sqrt{5}} - \sqrt{3 - 2\sqrt{5}}.$$
The $2^n$ conjugate quantities $x_1, x_2, \ldots$ are the roots of the equation
$$F(x) = (x - x_1)(x - x_2) \cdots (x - x_{2^n}) = 0.$$
The expanded form of this product is
$$F(x) = x^{2^n} + k_1 x^{2^n - 1} + \cdots + k_{2^n},$$
where, as shown in the theory of equations,*
$$k_1 = -(x_1 + x_2 + \cdots + x_{2^n}),\qquad k_2 = x_1x_2 + x_2x_3 + x_1x_3 + \cdots.$$
For example, $x = 3a + 2\sqrt{b}$ and its conjugate $3a - 2\sqrt{b}$ are the roots of the equation $x^2 - 6ax + 9a^2 - 4b = 0$ with coefficients in the domain defined by $a$ and $b$.

Although the roots $x_1, x_2, \ldots$ involve radicals, the symmetrical combinations $k_i$ of the roots will be shown to equal expressions free of these radicals, and hence rational functions with integral coefficients of the given numbers $a, b, c, \ldots$ defining the domain $D$.
Indeed, suppose that $k_i$ involves one of the radicals, say $\sqrt{r}$. Then it can be put into the form $k_i = p + q\sqrt{r}$, where neither $p$ nor $q$ involves $\sqrt{r}$. When any one of the $n$ distinct radicals is changed in sign, the roots $x_1, x_2, \ldots$ are interchanged in pairs and the product $F(x)$ is unaltered. Since $k_i$ must therefore remain unaltered when $\sqrt{r}$ is changed into $-\sqrt{r}$, we have $p + q\sqrt{r} = p - q\sqrt{r}$, $q = 0$, so that $k_i = p$ is free of $\sqrt{r}$. Since $k_i$ involves no one of the $n$ radicals, it equals a number of the domain $D$. Hence, the function $x$ satisfies an equation $F(x) = 0$ of degree $2^n$ with coefficients in the domain $D$.

* See Monograph No. V, sec. 10.

7. The quantity $x_1$ satisfies various equations with coefficients in the domain $D$; for example, $M(x)\,F(x) = 0$, where $M(x)$ is any integral function with coefficients in $D$. We next prove an important property of all such equations.

Theorem. If one of the conjugate quantities $x_1, x_2, \ldots, x_{2^n}$ satisfies any equation $f(x) = 0$ with coefficients in the domain $D$, then all the quantities $x_i$ satisfy this equation.

Let $x_1 = p + q\sqrt{r}$, where $\sqrt{r}$ is a radical of the maximum order $m$, while $p$ and $q$ do not involve $\sqrt{r}$ but may contain some of the remaining radicals of order $m$ and radicals of lower order. By changing the sign of $\sqrt{r}$, we obtain another $x_i$, say $x_2 = p - q\sqrt{r}$. Now $f(x_1)$ may be given the form $A + B\sqrt{r}$, where $A$ and $B$ do not involve $\sqrt{r}$. By hypothesis, $f(x_1) = 0$; that is, $A + B\sqrt{r} = 0$. If $B \neq 0$, we would have $\sqrt{r} = -A/B$, contrary to the assumption (sec. 5) on the independence of the radicals. Hence $B = 0$ and therefore also $A = 0$. Since $f(x_2) = A - B\sqrt{r}$, we have $f(x_2) = 0$. Thus $x_2$ is a root of $f(x) = 0$.

The proof that any $x_i$ is a root of $f(x) = 0$ is based upon the same principles. To simplify the formulas, let $x_1$ contain just two radicals $\sqrt{r}$ and $\sqrt{r'}$ of the maximum order $m$. Then (end of sec. 5),
$$f(x_1) = A + B\sqrt{r} + C\sqrt{r'} + E\sqrt{r}\,\sqrt{r'},$$
In view of the independence of the radicals (sec. 5), we see as above that $A, B, C, E$ must each be zero. Let $A$ contain just three radicals $\sqrt{s}, \sqrt{s'}, \sqrt{s''}$ of order $m-1$. Then
$$A = g + h\sqrt{s} + i\sqrt{s'} + j\sqrt{s''} + k\sqrt{s}\sqrt{s'} + \dots + q\sqrt{s}\sqrt{s'}\sqrt{s''}.$$
As before, $A = 0$ requires that $g, h, \dots, q$ shall be zero. Likewise, the coefficients of $\sqrt{s}, \sqrt{s'}, \dots$, in $B, C, E$ must be zero. We may proceed similarly with the radicals of orders $m-2, \dots, 1$. Hence in the expression (end of sec. 5)
$$f(x_1) = d + e\sqrt{r} + f\sqrt{r'} + g\sqrt{s} + \dots + p\sqrt{r}\sqrt{r'} + q\sqrt{r}\sqrt{s} + \dots + t\sqrt{r}\sqrt{r'}\sqrt{s} + \cdots$$
of $f(x_1)$ as a sum of terms each a product of radicals with coefficients in the domain $D$, each coefficient $d, e, f, \dots$ is zero. Now $x_i$ can be derived from $x_1$ by changing the signs of certain of the radicals $\sqrt{r}, \sqrt{r'}, \sqrt{s}, \dots$ Thus $f(x_i)$ is derived from the preceding expression for $f(x_1)$ by the same changes. Since $d, e, f, \dots$ are zero, it follows that $f(x_i)$ is zero. Hence $x_i$ is a root of $f(x) = 0$.

8. It was shown in sec. 6 that $x_1$ satisfies an equation $F(x) = 0$ of degree $2^n$ with coefficients in the domain $D$. Of all the equations, with coefficients in $D$, which are satisfied by $x_1$, let $\phi(x) = 0$ be one of the lowest degree $l$. The coefficient of $x^l$ may be assumed to be unity. There cannot be two such equations of degree $l$, since by subtraction we would obtain an equation of degree $< l$, with its coefficients in $D$ and having the root $x_1$.

We shall prove that the function $F(x)$ is an exact power of this unique function $\phi(x)$. Divide $F(x)$ by $\phi(x)$ and let the quotient be $F_1(x)$ and the remainder be $r(x)$ of degree $< l$, where $F_1(x)$ and $r(x)$ are integral functions with coefficients in $D$. Then
$$F(x) = \phi(x)F_1(x) + r(x).$$
Let $x = x_1$; since $F(x_1) = 0$ and $\phi(x_1) = 0$, we have $r(x_1) = 0$. If $r(x)$ is not identically zero, $r(x) = 0$ is an equation, with coefficients in $D$, having the root $x_1$ and of degree $< l$, contrary to the hypothesis that $l$ is the lowest degree of such equations. Hence $r(x)$ is identically zero, so that $F(x) = \phi(x)F_1(x)$.
If $F_1(x)$ reduces to a constant, necessarily unity, $F(x)$ is the first power of $\phi(x)$ and our theorem is proved. In the contrary case, $F_1(x)$ is a factor of degree $\geq 1$ of $F(x)$ and $F_1(x) = 0$ has as a root at least one of the roots $x_i$ of $F(x) = 0$, and hence (sec. 7) has every one of the $x_i$ as roots. In particular, $F_1(x) = 0$ has the root $x_1$, so that by the above argument,
$$F_1(x) = \phi(x)F_2(x),$$
where $F_2(x)$ is an integral function with coefficients in $D$. If $F_2(x)$ reduces to a constant, necessarily unity, $F(x)$ is the square of $\phi(x)$ and the theorem is proved. In the contrary case, the same argument shows that $F_2(x) = \phi(x)F_3(x)$. Proceeding in this way, we find ultimately that
$$F(x) = [\phi(x)]^k.$$
The degree of the second member is $lk$ and the degree of $F(x)$ is $2^n$. Thus $l$ is a divisor of $2^n$. Hence the degree of $\phi(x)$ is a power of 2. We have therefore proved the following theorem:

The unique equation of lowest degree with coefficients in the domain $D$ which is satisfied by a function $x_1$ derived from numbers of $D$ by a finite number of rational operations and extractions of square roots is of degree a power of 2.

9. Reducible and irreducible functions. An integral function $f(x)$ with coefficients in a given domain $D$ is called reducible, or irreducible in the domain, according as it can or cannot be factored into two integral functions, each of degree $\geq 1$, with coefficients in $D$.

For example, $x^2 - 4$ is reducible in any domain; $x^2 - 3$ is irreducible in the domain of all rational numbers, but is reducible in the domain of all real numbers; $x^2 + 4$ is irreducible in the latter domain, but is reducible in the domain of all real and complex numbers, since
$$x^2 + 4 = (x + 2i)(x - 2i), \quad i = \sqrt{-1}.$$

10. The function $\phi(x)$, defined in sec. 8, is irreducible in the domain $D$.
For, if $\phi(x)$ be the product of two integral functions, each of degree $\geq 1$, with coefficients in $D$, one of the factors equated to zero would give an equation, with coefficients in $D$, which is satisfied by $x_1$ and is of lower degree than $\phi(x)$. But this would contradict the hypothesis concerning $\phi(x)$.

An equation $G(x) = 0$ is said to be irreducible in $D$, if the function $G(x)$ is irreducible in $D$.

The equation $\phi(x) = 0$ is the only equation irreducible in $D$, which has the root $x_1$. For, if $G(x) = 0$ is irreducible in $D$ and has the root $x_1$, the argument used in sec. 8 shows that $G(x)$ has the factor $\phi(x)$, so that $G(x)$ must be the product of $\phi(x)$ and a constant.

The theorem in sec. 8 is therefore equivalent to the following:

The unique equation, irreducible in the domain $D$, which is satisfied by a function $x_1$ derived from numbers of $D$ by a finite number of rational operations and extractions of square roots is of degree a power of 2.

11. From the last theorem and the criterion in sec. 2 we deduce the

Fundamental theorem. A proposed construction is not possible by ruler and compasses if any one of the numbers which define analytically the required geometric elements satisfies an equation irreducible in the domain of the geometric data whose degree is not a power of 2.

12. The preceding result enables us to treat the three famous problems mentioned in the Introduction.

Duplication of the cube. The problem is to construct a cube whose volume is twice the volume of a given cube. Taking as the unit of length an edge of the given cube, we see that an edge $x$ of the required cube is a root of $x^3 = 2$. The equation $x^3 - 2 = 0$ is irreducible in the domain of all rational numbers. For, if reducible, it would have a linear factor and hence a rational root. But if $a/b$ is a root, where $a$ and $b$ are integers with no common divisor except unity, then $a^3 = 2b^3$. Hence $a^3$, and therefore $a$, is even, $a = 2c$. Then $4c^3 = b^3$, so that $b$ is even.
Thus $a$ and $b$ are both even and hence have the common factor 2, contrary to hypothesis. Since the degree of the irreducible equation $x^3 = 2$ is not a power of 2, it follows from sec. 11 that the duplication of a cube is not possible by ruler and compasses.

Trisection of an angle. To prove the impossibility of the trisection of an arbitrary* angle by ruler and compasses, it is sufficient to prove it for a particular angle, for example, for 120°. The construction of the angle $\tfrac13(120°) = 40°$ is equivalent to the construction of a right-angled triangle whose hypotenuse is unity and base is $\cos 40°$. In the trigonometric identity
$$\cos 3x = 4\cos^3 x - 3\cos x,$$
take $x = 40°$. Since $\cos 120° = -\tfrac12$, we get
$$4\cos^3 40° - 3\cos 40° + \tfrac12 = 0.$$
Multiply by 2 and write $y = 2\cos 40°$. Then
$$y^3 - 3y + 1 = 0.$$
This equation is irreducible in the domain of rational numbers. For, if reducible, it would have a linear factor and hence a root $a/b$, where $a$ and $b$ are integers with no common factor except unity and $b$ is positive. In the cubic equation, set $y = a/b$ and multiply by $b^2$. We get
$$\frac{a^3}{b} - 3ab + b^2 = 0,$$
so that $a^3/b$ is an integer. If $b > 1$, $a$ and $b$ would have a common factor $> 1$. Hence $b = 1$. The integral root $y = a$ makes $a^3 - 3a$ an integral multiple of $a$, so that the constant term 1 must be a multiple of $a$. Hence $a = \pm 1$. By trial neither $+1$ nor $-1$ is a root of the cubic. It now follows from sec. 11 that the trisection of 120° is not possible by ruler and compasses. Another proof follows from the fact (sec. 27) that a regular polygon of 9 sides cannot be inscribed in a circle by ruler and compasses.

* Certain special angles like 360°, 180°, 90° can be trisected, since angles 120°, 60°, 30° can be constructed by ruler and compasses.

Quadrature of the circle. The problem is to construct by ruler and compasses a square whose area shall equal the area $\pi R^2$ of a circle of given radius $R$. The construction is impossible.
For, if it were possible, and $R$ is taken as the unit of length, the number $\pi$ of the units of area would satisfy an algebraic equation with rational coefficients (sec. 2, sec. 6). But this is not the case.*

* See Monograph No. IX.

13. Connection between regular polygons and roots of unity. Consider a regular polygon of $n$ sides (an $n$-gon) inscribed in a circle of unit radius. We employ a rectangular system of coordinates with the origin at the centre of the circle and the $x$-axis passing through a vertex of the polygon. This vertex is therefore $(1, 0)$. For $n = 4$, the remaining vertices of the square have their coordinates marked in Fig. 5. For any $n$, the remaining vertices, taken in counter-clockwise order, will be designated $(x_1, y_1), (x_2, y_2), \dots, (x_{n-1}, y_{n-1})$, as in Fig. 6.

[FIG. 5. Square inscribed in the unit circle, with the coordinates of its vertices marked.] [FIG. 6. Regular n-gon inscribed in the unit circle.]

Since a side of the $n$-gon subtends at the centre an angle whose magnitude is $360/n$ degrees, or $2\pi/n$ radians,
$$x_1 = \cos\frac{2\pi}{n}, \quad y_1 = \sin\frac{2\pi}{n}, \quad x_2 = \cos\frac{4\pi}{n}, \quad y_2 = \sin\frac{4\pi}{n}, \quad \dots$$

Each point $(x, y)$ of the plane uniquely determines a complex number $x + iy$, where $i = \sqrt{-1}$. Conversely, a complex number $x + iy$ uniquely determines a point $(x, y)$. With the vertices of our square are thus associated the four distinct complex numbers
$$1, \quad i, \quad -1, \quad -i. \tag{1}$$
With the vertices of the $n$-gon are associated the distinct complex numbers
$$1, \quad r_1 = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}, \quad r_2 = \cos\frac{4\pi}{n} + i\sin\frac{4\pi}{n}, \quad \dots, \quad r_{n-1} = \cos\frac{2(n-1)\pi}{n} + i\sin\frac{2(n-1)\pi}{n}. \tag{2}$$
Since $i^2 = -1$, the four numbers (1) are roots of
$$x^4 = 1, \tag{3}$$
and hence are called fourth roots of unity. Any one of the numbers (2) is a root of the equation
$$x^n = 1, \tag{4}$$
and is called an $n$th root of unity. Indeed, from
$$r_k = \cos\frac{2k\pi}{n} + i\sin\frac{2k\pi}{n}, \tag{2'}$$
we find by applying formula (5) below that
$$r_k^n = \cos 2k\pi + i\sin 2k\pi = 1 + i\cdot 0 = 1.$$

14. De Moivre's theorem. For any positive integer $n$, we have
$$(\cos A + i\sin A)^n = \cos nA + i\sin nA.$$
(5)

We first prove the formula
$$(\cos A + i\sin A)(\cos B + i\sin B) = \cos(A+B) + i\sin(A+B). \tag{6}$$
The product of the two numbers is $a + ib$, where
$$a = \cos A\cos B - \sin A\sin B = \cos(A+B), \quad b = \cos A\sin B + \sin A\cos B = \sin(A+B).$$
If we take $B = A$ in (6), we obtain formula (5) for the case $n = 2$; (5) is evidently true for the case $n = 1$. In general, the proof of (5) for any exponent $n$ is made by mathematical induction. Assume that it is true for the case $n = m$, that is,
$$(\cos A + i\sin A)^m = \cos mA + i\sin mA.$$
Multiply each member by $\cos A + i\sin A$. For the product on the right we employ (6) with $B = mA$. Hence we get
$$(\cos A + i\sin A)^{m+1} = \cos(m+1)A + i\sin(m+1)A,$$
so that (5) holds also for $n = m+1$. The induction is thus complete.

15. It follows from De Moivre's theorem that (2') is the $k$th power of
$$r = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}. \tag{7}$$
Hence the numbers (2) may be expressed in the form
$$1 = r^n, \quad r_1 = r, \quad r_2 = r^2, \quad \dots, \quad r_k = r^k, \quad \dots, \quad r_{n-1} = r^{n-1}.$$
Since these $n$ numbers were seen to be distinct and to be roots of (4), they give all the roots of (4). Indeed, an algebraic equation of degree $n$ cannot have more than $n$ distinct roots.*

With the successive vertices of the regular $n$-gon inscribed as in Fig. 6 in a circle of unit radius are associated the $n$ distinct numbers
$$1, \quad r, \quad r^2, \quad \dots, \quad r^k, \quad \dots, \quad r^{n-1}, \tag{8}$$
which give all the $n$th roots of unity. Here $r$ is defined by (7). For $n = 4$, these numbers are $1, i, i^2 = -1, i^3 = -i$.

* See Monograph No. V, sec. 10.

16. The complex number $C = \cos A + i\sin A$ has the reciprocal $C^{-1} = \cos A - i\sin A$, since the product of the two is $\cos^2 A + \sin^2 A = 1$. Hence their sum $C + C^{-1}$ is the real number $2\cos A$.

The inscription of a regular $n$-gon by ruler and compasses is equivalent to the construction of the angle $2\pi/n$, and hence is equivalent to the construction of a right-angled triangle with hypotenuse unity and base $\cos 2\pi/n$.
Instead of determining the complex root of unity $r$, defined by (7), it is therefore sufficient to determine $r + r^{-1} = 2\cos 2\pi/n$. Since $r^n = 1$, we have $r^{-1} = r^{n-1}$. We shall therefore seek certain real-valued combinations, such as $r + r^{n-1}$, of the roots (8).

17. Regular pentagon and decagon. For $n = 5$, we wish to determine
$$\eta_0 = r + r^4 = 2\cos\frac{2\pi}{5}, \quad \text{where} \quad r = \cos\frac{2\pi}{5} + i\sin\frac{2\pi}{5}.$$
Since $r^5 = 1$ and $r \ne 1$, we have
$$\frac{r^5 - 1}{r - 1} = r^4 + r^3 + r^2 + r + 1 = 0.$$
Hence if $r + r^4$ is found, so is also $\eta_1 = r^2 + r^3$, since $\eta_0 + \eta_1 = -1$. Two numbers can be determined when their sum and product are known. We therefore evaluate $\eta_0\eta_1$. By actual multiplication,
$$(r + r^4)(r^2 + r^3) = r^3 + r^4 + r^6 + r^7 = r^3 + r^4 + r + r^2 = -1,$$
since $r^5 = 1$. Hence $\eta_0$ and $\eta_1$ are the roots $\tfrac12(-1 \pm \sqrt5)$ of
$$x^2 - (\eta_0 + \eta_1)x + \eta_0\eta_1 = x^2 + x - 1 = 0.$$
Since the cosine of an acute angle is positive, we have
$$\eta_0 = 2\cos\frac{2\pi}{5} = \tfrac12(-1 + \sqrt5), \quad \eta_1 = \tfrac12(-1 - \sqrt5).$$

From the value of $\eta_0$, we may construct the angle $2\pi/5$. Let $AOA'$ and $BOB'$ be perpendicular diameters in a circle of radius $R$, and $M$ the middle point of $OA'$ (Fig. 7). Then
$$BM^2 = R^2 + (\tfrac12 R)^2, \quad BM = \tfrac12 R\sqrt5.$$
Let the circle with centre $M$ and radius $BM$ cut $OA$ at $C$. Let $N$ be the middle point of $OC$. Then
$$OC = \tfrac12 R(\sqrt5 - 1) = R\eta_0, \quad ON = R\cos\frac{2\pi}{5}.$$
Draw $DN$ parallel to $OB$. Then angle $DON$ equals $2\pi/5$. Hence $AD$ is the side $s_5$ of the regular inscribed pentagon.

We may omit the construction of $DN$, $DO$, $DA$, and prove that $CB = s_5$, $CO = s_{10}$, the side of a regular decagon. The latter follows from
$$s_{10} = 2R\sin 18° = 2R\cos 72° = 2R\cos\frac{2\pi}{5} = R\eta_0 = OC.$$

[FIG. 7. FIG. 8.]

Next, $\sin 18° = \cos 72° = 1 - 2\sin^2 36°$. Multiplying by $2R^2$ and replacing $2R\sin 36°$ by its value $s_5$, and $2R\sin 18°$ by $s_{10}$, we get
$$Rs_{10} = 2R^2 - s_5^2.$$
But $\eta_0$ is a root of $x^2 + x - 1 = 0$, and $R\eta_0 = s_{10}$. Hence
$$s_{10}^2 + Rs_{10} - R^2 = 0.$$
It follows that $s_{10}^2 + R^2 = s_5^2$. Since $OC = s_{10}$, $OB = R$, the hypotenuse $BC$ must equal $s_5$.
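The relations of sec. 17 can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function name `pentagon_decagon_sides` and the choice $R = 1$ are assumptions made for illustration. It follows the construction with the points $M$, $B$, $C$ and compares the lengths $OC$ and $BC$ with the trigonometric values $2R\sin 18°$ and $2R\sin 36°$.

```python
import math

def pentagon_decagon_sides(R=1.0):
    """Follow the construction of sec. 17 in a circle of radius R.

    Returns (eta0, OC, BC), where eta0 is the larger root of x^2 + x - 1 = 0,
    OC is the claimed decagon side and BC the claimed pentagon side.
    """
    eta0 = (-1 + math.sqrt(5)) / 2
    BM = math.hypot(R, R / 2)   # M is the midpoint of OA', OB is perpendicular to OA'
    OC = BM - R / 2             # circle with centre M, radius MB, cuts OA at C
    BC = math.hypot(OC, R)      # hypotenuse of the right triangle COB
    return eta0, OC, BC

eta0, OC, BC = pentagon_decagon_sides(1.0)
s10 = 2 * math.sin(math.radians(18))   # side of the inscribed regular decagon
s5 = 2 * math.sin(math.radians(36))    # side of the inscribed regular pentagon
print(abs(eta0 - 2 * math.cos(2 * math.pi / 5)) < 1e-12)
print(abs(OC - s10) < 1e-12, abs(BC - s5) < 1e-12)
```

Floating-point agreement of this kind illustrates the identities $OC = s_{10}$ and $BC = s_5$; it does not replace the proof given above.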
We have now established the following elegant construction of the regular pentagon and decagon:

If $AOA'$ and $BOB'$ are perpendicular diameters, and $M$ is the middle point of the radius $OA'$, a circle with centre $M$ and radius $MB$ will cut $OA$ at a point $C$ such that $OC$ and $BC$ are the sides $s_{10}$ and $s_5$ of the inscribed regular decagon and pentagon.

In particular, Fig. 8 exhibits the above relation
$$s_{10}^2 + s_6^2 = s_5^2$$
between the sides of the inscribed regular decagon, hexagon, and pentagon.

18. If $p$ is a prime number, a regular polygon of $p$ sides can be inscribed by ruler and compasses not merely in the well-known cases $p = 3$ and $p = 5$, but also when $p = 17$ and when $p$ has certain larger values. This important discovery was made by Gauss. For the general theorem see sec. 27.

In the treatment of the case $p = 5$ (sec. 17), we made use of the combinations $r + r^4$ and $r^2 + r^3$, called periods, of the complex fifth roots of unity. If the latter be written in the order
$$r, \quad r^2, \quad r^4, \quad r^3 = r^8,$$
so that each is the square of the preceding, we note that the periods are obtained by taking alternate terms of this series.

For another value of $p$ it may not be possible to arrange the complex $p$th roots of unity (sec. 15),
$$r, \quad r^2, \quad r^3, \quad \dots, \quad r^{p-1}, \tag{9}$$
in such an order that each term is the square of the preceding. In fact, this is not possible when $p = 7$, since the fourth term $r^8$ is now identical with the first term $r$. But when $p = 7$ the roots can be arranged so that each is the cube of the preceding, namely,
$$r, \quad r^3, \quad r^2, \quad r^6, \quad r^4, \quad r^5.$$
It is shown in Monograph No. VII, sec. 46, that for any prime $p$ there exists an integer $g$ (called a primitive root of $p$), such that the remainders obtained upon dividing $1, g, g^2, \dots, g^{p-2}$ by $p$ are in some order $1, 2, \dots, p-1$. Hence the roots
$$r, \quad r^g, \quad r^{g^2}, \quad \dots, \quad r^{g^{p-2}} \tag{10}$$
are identical in some order with the roots (9).

19. Regular polygon of 17 sides.
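For $p = 17$, we may take $g = 3$, since the remainders obtained upon dividing the successive powers $1, 3, 3^2, \dots, 3^{15}$ of 3 by 17 are
$$1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6,$$
which form a permutation of $1, 2, \dots, 16$. Taking alternate terms, we form the periods
$$\eta_0 = r + r^9 + r^{13} + r^{15} + r^{16} + r^8 + r^4 + r^2,$$
$$\eta_1 = r^3 + r^{10} + r^5 + r^{11} + r^{14} + r^7 + r^{12} + r^6.$$
Since $r^{17} = 1$ and $r \ne 1$, we have
$$\frac{r^{17} - 1}{r - 1} = r^{16} + r^{15} + \dots + r + 1 = 0.$$
Hence $\eta_0 + \eta_1 = -1$. In the 64 terms of the product $\eta_0\eta_1$ we reduce exponents by means of $r^{17} = 1$ and find that each root $r, r^2, \dots, r^{16}$ occurs exactly four times. Hence
$$\eta_0\eta_1 = 4(r + r^2 + \dots + r^{16}) = -4.$$
But $\eta_0, \eta_1$ are the roots of $x^2 - (\eta_0 + \eta_1)x + \eta_0\eta_1 = 0$. Hence $\eta_0, \eta_1$ satisfy
$$x^2 + x - 4 = 0. \tag{11}$$

By taking alternate terms in $\eta_0$ we form two periods $\eta_0'$ and $\eta_2'$; likewise, two periods from $\eta_1$:
$$\eta_0' = r + r^{13} + r^{16} + r^4, \quad \eta_1' = r^3 + r^5 + r^{14} + r^{12},$$
$$\eta_2' = r^9 + r^{15} + r^8 + r^2, \quad \eta_3' = r^{10} + r^{11} + r^7 + r^6.$$
We readily verify that $\eta_0'\eta_2' = -1 = \eta_1'\eta_3'$. Hence
$$\eta_0', \eta_2' \text{ satisfy } x^2 - \eta_0 x - 1 = 0, \tag{12}$$
$$\eta_1', \eta_3' \text{ satisfy } x^2 - \eta_1 x - 1 = 0. \tag{13}$$

In view of sec. 16, it suffices to determine $r + r^{16}$. The periods
$$\eta_0'' = r + r^{16}, \quad \eta_4'' = r^{13} + r^4,$$
have the sum $\eta_0'$ and the product $\eta_1'$. Hence
$$\eta_0'', \eta_4'' \text{ satisfy } x^2 - \eta_0' x + \eta_1' = 0. \tag{14}$$

To decide which root of (11) is $\eta_0$ and which is $\eta_1$, and the similar question in (12)-(14), we employ the formulas
$$r = \cos\frac{2\pi}{17} + i\sin\frac{2\pi}{17}, \quad r^k = \cos\frac{2k\pi}{17} + i\sin\frac{2k\pi}{17}.$$
Hence, as in sec. 16,
$$\eta_0'' = 2\cos\frac{2\pi}{17}, \quad \eta_4'' = 2\cos\frac{8\pi}{17}, \quad \eta_0'' > \eta_4'' > 0.$$
Similarly, employing $\cos\frac{10\pi}{17} = -\cos\frac{7\pi}{17}$, we get
$$\eta_0' = 2\cos\frac{2\pi}{17} + 2\cos\frac{8\pi}{17}, \quad \eta_1' = 2\cos\frac{6\pi}{17} - 2\cos\frac{7\pi}{17}.$$
Hence $\eta_0'$ and $\eta_1'$ are positive. Further,
$$\eta_1 = 2\cos\frac{6\pi}{17} - 2\cos\frac{5\pi}{17} - 2\cos\frac{7\pi}{17} - 2\cos\frac{3\pi}{17}$$
is negative, since the first cosine is less than the second. But we had $\eta_0\eta_1 = -4$. Hence $\eta_0$ is positive. Hence (11)-(14) give
$$\eta_0 = \tfrac12(\sqrt{17} - 1), \quad \eta_1 = \tfrac12(-\sqrt{17} - 1),$$
$$\eta_0' = \tfrac12\eta_0 + \sqrt{1 + \tfrac14\eta_0^2}, \quad \eta_1' = \tfrac12\eta_1 + \sqrt{1 + \tfrac14\eta_1^2},$$
$$\eta_0'' = \tfrac12\eta_0' + \sqrt{\tfrac14\eta_0'^2 - \eta_1'}.$$

20. Construction of the regular polygon of 17 sides. In a circle of radius unity construct two perpendicular diameters $AB$, $CD$, and at $A$, $D$ draw tangents which intersect at $S$.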
Determine the point $E$ so that $AE = \tfrac14 AS$ (for example, by two bisections). Then
$$AE = \tfrac14, \quad OE = \sqrt{AO^2 + AE^2} = \tfrac14\sqrt{17}.$$
Let the circle with centre $E$ and radius $OE$ cut $AS$ at $F$ and $F'$. Then
$$AF = EF - EA = OE - \tfrac14 = \tfrac12\eta_0, \quad AF' = EF' + EA = OE + \tfrac14 = -\tfrac12\eta_1,$$
$$OF = \sqrt{OA^2 + AF^2} = \sqrt{1 + \tfrac14\eta_0^2}, \quad OF' = \sqrt{1 + \tfrac14\eta_1^2}.$$
Let the circle with centre $F$ and radius $FO$ cut $AS$ at $H$; that with centre $F'$ and radius $F'O$ cut $AS$ at $H'$. Then
$$AH = AF + FH = AF + OF = \tfrac12\eta_0 + \sqrt{1 + \tfrac14\eta_0^2} = \eta_0',$$
$$AH' = F'H' - F'A = OF' - AF' = \eta_1'.$$

It remains to construct the roots of (14). This may be done by sec. 3. Draw $HTQ$ parallel to $AO$ and intersecting $OC$ produced at $T$. Make $TQ = AH'$. Draw a circle having as diameter the line $BQ$ joining $B = (0, 1)$ with $Q = (\eta_0', \eta_1')$. The abscissas $ON$ and $OM$ of the intersections of this circle with the $x$-axis $OT$ are the roots of (14). Hence the larger root $\eta_0''$ is
$$OM = 2\cos\frac{2\pi}{17}.$$

[FIG. 9.]

The perpendicular bisector $LP$ of $OM$ cuts the unit circle at $P$. Then
$$\cos LOP = OL = \cos\frac{2\pi}{17}, \quad LOP = \frac{2\pi}{17}.$$
Hence the chord $CP$ is a side of the inscribed regular 17-gon.

For an elegant construction by von Staudt which employs only straight lines and the given circle, see Bachmann, Kreistheilung, pp. 69-75. The figure cannot, however, be conveniently drawn on a single page of the size of the present book.

21. Having treated in detail the special cases $p = 5$ and $p = 17$, we proceed to develop Gauss's theory for any prime $p$.

Let $p - 1 = ef$ be any factorization of $p - 1$ into two positive integers. We separate the $p - 1$ roots (10) into $e$ sets each of $f$ roots. For the first set, we take the first root $r$, the $e$th root $r^{g^e}$ following it, then the $e$th root following the latter, etc. For the second set, we take the second root $r^g$, the $e$th root following it, etc. The exponents in the various sets are therefore
$$\begin{array}{lllll} 1, & g^e, & g^{2e}, & \dots, & g^{(f-1)e}, \\ g, & g^{e+1}, & g^{2e+1}, & \dots, & g^{(f-1)e+1}, \\ \dots & & & & \\ g^{e-1}, & g^{2e-1}, & g^{3e-1}, & \dots, & g^{fe-1}. \end{array} \tag{15}$$
The sum of the roots in any set is called a period.
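The separation into sets just described can be sketched in Python for the case $p = 17$, $g = 3$, $e = 2$ of sec. 19. This is an illustrative sketch only; the helper names `period_exponents` and `periods` are hypothetical and do not come from the text. The first function lists the exponents of array (15) row by row, and the second sums the corresponding powers of $r = \cos(2\pi/p) + i\sin(2\pi/p)$ numerically.

```python
import cmath

def period_exponents(p, g, e):
    """Rows of array (15): row k holds g^(k + j*e) mod p for j = 0, ..., f-1."""
    f = (p - 1) // e
    return [[pow(g, k + j * e, p) for j in range(f)] for k in range(e)]

def periods(p, g, e):
    """Numerical values of the e periods, each the sum of f powers of r."""
    r = cmath.exp(2j * cmath.pi / p)
    return [sum(r ** x for x in row) for row in period_exponents(p, g, e)]

sets_17 = period_exponents(17, 3, 2)
eta0, eta1 = periods(17, 3, 2)
print(sets_17[0])   # exponents occurring in eta_0
print(sets_17[1])   # exponents occurring in eta_1
print(abs(eta0 + eta1 + 1) < 1e-9, abs(eta0 * eta1 + 4) < 1e-9)
```

The two rows reproduce the exponents of $\eta_0$ and $\eta_1$ in sec. 19, and the last line checks, up to rounding, that their sum is $-1$ and their product is $-4$.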
Hence the periods are
$$\eta_k = r^{g^k} + r^{g^{e+k}} + r^{g^{2e+k}} + \dots + r^{g^{(f-1)e+k}} \quad (k = 0, 1, \dots, e-1). \tag{16}$$

Let $f = e'f'$ be any factorization of $f$ into two positive integers. Then $p - 1$ is the product of $ee'$ and $f'$. As above we have $ee'$ periods, each the sum of $f'$ roots,
$$\eta_j' = r^{g^j} + r^{g^{ee'+j}} + r^{g^{2ee'+j}} + \dots + r^{g^{(f'-1)ee'+j}} \quad (j = 0, 1, \dots, ee'-1). \tag{17}$$
Each period (16) is the sum of certain $e'$ periods (17),
$$\eta_k = \eta_k' + \eta_{e+k}' + \eta_{2e+k}' + \dots + \eta_{(e'-1)e+k}' \quad (k = 0, 1, \dots, e-1). \tag{18}$$
Indeed, the second member is seen to contain each root $r^{g^{se+k}}$ once and but once, while this is also true of $\eta_k$.

Let $f' = e''f''$ be any factorization of $f'$ into two positive integers. Then $p - 1$ is the product of $ee'e''$ and $f''$. Hence there are $ee'e''$ periods, each of $f''$ roots,
$$\eta_t'' = r^{g^t} + r^{g^{ee'e''+t}} + r^{g^{2ee'e''+t}} + \dots + r^{g^{(f''-1)ee'e''+t}} \quad (t = 0, 1, \dots, ee'e''-1). \tag{19}$$
Each period (17) is the sum of certain $e''$ periods (19),
$$\eta_j' = \eta_j'' + \eta_{ee'+j}'' + \eta_{2ee'+j}'' + \dots + \eta_{(e''-1)ee'+j}'' \quad (j = 0, 1, \dots, ee'-1). \tag{20}$$

Similarly, we may take any factorization $f'' = e'''f'''$ of $f''$, then any factorization of $f'''$, etc., until we reach $f^{(l)} = 1$. Thus, each period separates into periods of fewer terms, the final periods having a single term.

For example, if $p = 17$ we may take
$$e = 2,\ f = 8, \quad e' = 2,\ f' = 4, \quad e'' = 2,\ f'' = 2, \quad e''' = 2,\ f''' = 1,$$
and obtain the periods given in sec. 19.

The following theorems will be proved in sec. 38:

Theorem I. The periods $\eta_0, \eta_1, \dots, \eta_{e-1}$ are the roots of an equation $F(x) = 0$ of degree $e$ with integral coefficients.

Theorem II. The $e'$ periods, each of $f'$ terms,
$$\eta_k', \quad \eta_{e+k}', \quad \eta_{2e+k}', \quad \dots, \quad \eta_{(e'-1)e+k}', \tag{21}$$
whose sum is $\eta_k$, are the roots of an equation $\phi_k'(x) = 0$ of degree $e'$ whose coefficients are linear functions of $\eta_0, \eta_1, \dots, \eta_{e-1}$ with integral coefficients.

Since Theorem II relates to any factorization $p - 1 = ee'f'$ of $p - 1$, we may, by a suitable change of notation, apply it to any other factorization of $p - 1$.
Taking the factors $ee'$, $e''$, $f''$, we conclude that the $e''$ periods, each of $f''$ terms,
$$\eta_k'', \quad \eta_{ee'+k}'', \quad \eta_{2ee'+k}'', \quad \dots, \quad \eta_{(e''-1)ee'+k}'', \tag{22}$$
whose sum is $\eta_k'$, are the roots of an equation $\phi_k''(x) = 0$ of degree $e''$ whose coefficients are linear functions of $\eta_0', \eta_1', \dots, \eta_{ee'-1}'$ with integral coefficients.

Next, taking the factors $ee'e''$, $e'''$, $f'''$, we conclude that the $e'''$ periods, each of $f'''$ terms,
$$\eta_k''', \quad \eta_{ee'e''+k}''', \quad \eta_{2ee'e''+k}''', \quad \dots, \quad \eta_{(e'''-1)ee'e''+k}''', \tag{23}$$
whose sum is $\eta_k''$, are the roots of an equation $\phi_k'''(x) = 0$ of degree $e'''$ whose coefficients are linear functions of $\eta_0'', \eta_1'', \dots, \eta_{ee'e''-1}''$ with integral coefficients.

Finally, we obtain equations of degree $e^{(l)}$ satisfied by periods composed of a single term, namely, one of the roots (10).

We have now shown that if $e, e', \dots, e^{(l)}$ are any integers whose product is $p - 1$, there can be determined a series of equations,
$$F(x) = 0, \quad \phi_k'(x) = 0, \quad \phi_k''(x) = 0, \quad \dots, \quad \phi_k^{(l)}(x) = 0, \tag{24}$$
of degrees $e, e', e'', \dots, e^{(l)}$, of which the first has integral coefficients, while the coefficients of $\phi_k^{(t)}(x) = 0$ are linear functions, with integral coefficients, of the roots of $\phi_k^{(t-1)}(x) = 0$, and such that the roots of the final equations $\phi_k^{(l)}(x) = 0$ are the complex $p$th roots of unity (10).

22. If $p - 1$ is a power of 2, the numbers $e, e', \dots$ may each be taken to be 2, so that the auxiliary equations (24) are quadratics. For the application to regular polygons we may omit the final equations whose roots are the complex $p$th roots of unity, since (sec. 16) we require only the combination $r + r^{-1}$, and since $r + r^{-1}$ is a root of one of the equations (24) just preceding the final type. To prove the last statement, we note that by Monograph No. VII, sec. 47, a primitive root $g$ of $p$ satisfies the congruence
$$g^e \equiv -1 \pmod{p}, \quad \text{where } e = \tfrac12(p - 1),$$
so that $r^{g^e} = r^{-1}$. But to obtain the periods $\eta_k$ composed of only two terms we must set $f = 2$, $e = \tfrac12(p - 1)$ in (16). Then,
$$\eta_k = r^{g^k} + r^{g^{e+k}} = r^{g^k} + r^{-g^k}.$$
By the first remark in sec. 16, $\eta_k$ is a real number.
Since each period containing more than two terms is a sum of periods containing just two terms, it follows that every period is a real number, excepting only the periods containing a single term. Hence all the quadratic equations which are required to evaluate $r + r^{-1} = 2\cos 2\pi/p$ have real roots.

Hence if $p - 1$ is of the form $2^h$, the value of $2\cos 2\pi/p$ can be found by the solution of a series of quadratic equations with real roots, so that by sec. 3, the angle $2\pi/p$ can be constructed by ruler and compasses. Hence we may state the

Theorem. If $p$ is a prime number of the form $2^h + 1$, a regular polygon of $p$ sides can be inscribed by ruler and compasses.

23. We next investigate the regular polygon of $n$ sides, when $n$ has two or more distinct prime factors $p, q, \dots$, namely, $n = p^s q^t \cdots$. If we have a regular polygon of $n$ sides, we may join certain vertices and obtain a regular polygon of $p^s$ sides, or one of $q^t$ sides, .... Conversely, if the latter polygons are given, we can construct one of $n$ sides. In general, if $a$ and $b$ are any relatively prime numbers, we can derive a regular $ab$-gon from a regular $a$-gon and a regular $b$-gon. Indeed, by Monograph No. VII, sec. 32, there exist integers $c$ and $d$, such that $ca + db = 1$. Since we have the angles $2\pi/a$ and $2\pi/b$, we can construct multiples of them, add these multiples, and obtain the angle
$$d\,\frac{2\pi}{a} + c\,\frac{2\pi}{b} = \frac{2\pi}{ab}(db + ca) = \frac{2\pi}{ab},$$
and therefore construct a regular $ab$-gon. We have thus proved the

Theorem. If $n = p^s q^t \cdots$, where $p, q, \dots$ are distinct primes, a regular polygon of $n$ sides can be inscribed by ruler and compasses if, and only if, regular polygons of $p^s$ sides, $q^t$ sides, ... can be inscribed.

24. It therefore remains to consider a regular polygon the number of whose sides is a power of a prime, say $p^s$. The $p^s$th root of unity,
$$r = \cos\frac{2\pi}{p^s} + i\sin\frac{2\pi}{p^s}, \tag{25}$$
is a root of $x^{p^s} = 1$, but not a root of $x^{p^{s-1}} = 1$, as shown by De Moivre's theorem (sec. 14). Hence $r$ is a root of $\dfrac{x^{p^s} - 1}{x^{p^{s-1}} - 1} = x^{p^{s-1}(p-1)} + x^{p^{s-1}(p-2)} + \cdots$
$\cdots + x^{p^{s-1}} + 1 = 0.$ (26)

It will be shown in sec. 31 that this equation is irreducible in the domain of rational numbers.

If a regular $p^s$-gon can be inscribed by ruler and compasses, the coordinates $x_k, y_k$ of its vertices (sec. 13) involve no irrationalities other than real square roots (sec. 2). Hence, $x_k + iy_k$, where $i = \sqrt{-1}$, will involve no irrationalities other than real or imaginary square roots. In the algebraic discussion in secs. 5-10 the radicals were not restricted to real radicals. Hence, by sec. 10, the equation (26), which is irreducible in the domain of rational numbers and has the root $r = x_1 + iy_1$, must be of degree a power of 2. If $s > 1$, $p^{s-1}(p-1)$ is not a power of 2 except in the case $p = 2$. Hence we may state the

Theorem. When $p$ is a prime number $> 2$, a regular polygon of $p^s$ sides cannot be inscribed by ruler and compasses if $s > 1$, or if $s = 1$ and $p - 1$ is not of the form $2^h$.

25. In view of the theorems in secs. 22 and 24, a regular polygon of $p$ sides, where $p$ is a prime number $> 2$, can be inscribed by ruler and compasses if, and only if, $p$ is of the form $2^h + 1$. We note that $2^h + 1$ is composite if $h$ has an odd factor $2k + 1$, so that $h = (2k+1)q$, since in that case $2^h + 1$ has the factor $2^q + 1$. If a number $h$ has no odd factor it must be a power $2^t$ of 2. We therefore have the result:

A regular polygon of $p$ sides, where $p$ is a prime $> 2$, can be inscribed by ruler and compasses if, and only if, $p$ is of the form
$$2^{2^t} + 1. \tag{27}$$

26. We are thus led to ask for what values of $t$ the number (27) is a prime. For $t = 0, 1, 2, 3, 4$, the numbers are
$$3, \quad 5, \quad 17, \quad 257, \quad 65537,$$
each being prime. The famous arithmetician Fermat expressed his belief that the number (27) was a prime for every $t$, but admitted that he had no proof of his conjecture. But Euler proved in 1732 that when $t = 5$ the number is not prime,
$$2^{32} + 1 = 641 \cdot 6700417.$$
Further, the number (27) is known* to be not prime for $t = 6, 7, 8, 9, 11, 12, 18, 23, 36, 38, 73$.
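The first few statements of sec. 26 are small enough to verify directly. The sketch below is an illustration in Python, not part of the original text; the names `fermat_number` and `is_prime` are hypothetical, and the primality test is plain trial division, which is adequate only for numbers as small as these.

```python
def fermat_number(t):
    """The number (27): 2^(2^t) + 1."""
    return 2 ** (2 ** t) + 1

def is_prime(n):
    """Trial division; suitable only for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

first_five = [fermat_number(t) for t in range(5)]
print(first_five)
print(all(is_prime(n) for n in first_five))
print(fermat_number(5) == 641 * 6700417)   # Euler's factorization for t = 5
```

The larger values of $t$ quoted above are far beyond the reach of trial division and are not attempted here.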
The regular 257-gon has been discussed at length by Richelot in Crelle's Journal für Mathematik, 1832, pp. 1-26, 146-161, 209-230, 337-356; and geometrically by Affolter and Pascal in Rendiconti della R. Accademia di Napoli, 1887. The regular polygon of $2^{16} + 1 = 65{,}537$ sides has been discussed by Hermes, Göttingen Nachrichten, 1894.

* Proceedings of the London Mathematical Society, 1903, p. 175, 1905, p. xxi; Bulletin of the American Mathematical Society, 1906, p. 449; Vol. XI, p. 543; 1909, p. 1.

27. Since any angle can be bisected, a regular $2k$-gon is inscriptible if a regular $k$-gon is. Hence the results in secs. 23-25 lead to the

Theorem. A regular polygon of $n$ sides can be inscribed by ruler and compasses if, and only if, $n = 2^l p_1 p_2 \cdots$, where $p_1, p_2, \dots$ are distinct primes of the form $2^{2^t} + 1$.

The lowest primes $p_i$ are 3, 5, 17, 257, 65537. For the succeeding values 5, 6, 7, 8, 9 of $t$, the number is not prime. For the next case $t = 10$, the number has 309 digits; whether or not it is prime has not yet been determined.

The regular polygons of $n$ sides, where $n$ lies between 2 and 26, fall into the following two classes:

Inscriptible: 3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20, 24;
Not inscriptible: 7, 9, 11, 13, 14, 18, 19, 21, 22, 23, 25.

28. Primitive roots of unity. A root $r$ of $x^n = 1$ is called a primitive $n$th root of unity if it is not the root of a similar equation of lower degree, namely, $x^l = 1$, with $0 < l < n$. For example, $i = \sqrt{-1}$ is a primitive fourth root of unity, since $i^4 = 1$, while $i$, $i^2$, and $i^3$ are distinct from 1.

There exist primitive $n$th roots of unity, for instance,
$$r_1 = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n};$$
by De Moivre's theorem, $r_1^n$ is the least positive power of $r_1$ equal to 1.

If $r$ is any primitive $n$th root of unity, the powers
$$r, \quad r^2, \quad \dots, \quad r^n \tag{28}$$
give all the $n$th roots of unity. In fact, these powers are roots of $x^n = 1$ and are all distinct; furthermore, there cannot be more than $n$ distinct roots of an equation of degree $n$.
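Returning briefly to the theorem of sec. 27, the two classes listed there can be reproduced mechanically. The following Python sketch is illustrative only; the name `is_inscriptible` is hypothetical, and the code assumes that the only primes of the form $2^{2^t} + 1$ to be considered are the five named in the text, so it should not be trusted for $n$ having other prime factors of that form, should any exist.

```python
# Assumption: only the five primes of form 2^(2^t) + 1 named in the text are used.
FERMAT_PRIMES = (3, 5, 17, 257, 65537)

def is_inscriptible(n):
    """True if n = 2^l * (product of distinct primes from FERMAT_PRIMES), n >= 3."""
    if n < 3:
        return False
    while n % 2 == 0:          # strip the power of 2
        n //= 2
    for p in FERMAT_PRIMES:    # each admissible prime may divide n at most once
        if n % p == 0:
            n //= p
    return n == 1

inscriptible = [n for n in range(3, 26) if is_inscriptible(n)]
not_inscriptible = [n for n in range(3, 26) if not is_inscriptible(n)]
print(inscriptible)
print(not_inscriptible)
```

Dividing by each admissible prime only once is what rejects values such as 9 and 25, in which a prime of the required form is repeated.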
It is easy to determine which of the roots (28) are primitive $n$th roots of unity. Consider $r^k$ and let $g$ be the greatest common divisor of $k$ and $n$. Then
$$(r^k)^{n/g} = (r^n)^{k/g} = 1.$$
Hence, if $g > 1$, so that $n/g < n$, $r^k$ is not a primitive $n$th root of unity. But if $g = 1$, there exist (Monograph No. VII, sec. 32) integers $a$ and $b$ for which $ak + bn = 1$. Then if $(r^k)^l = 1$, for $0 < l < n$, we would have
$$r^l = r^{(ak+bn)l} = (r^{kl})^a (r^n)^{bl} = 1,$$
whereas $r^l \ne 1$. Hence, if $g = 1$, $r^k$ is a primitive root. We have thus proved that when $r$ is any primitive $n$th root of unity, $r^k$ is also a primitive root if, and only if, $k$ is relatively prime to $n$. For example, $i = \sqrt{-1}$ is a primitive fourth root of unity and hence also $i^3 = -i$ is, whereas $i^2 = -1$ is not a primitive root.

Another statement of the preceding theorem is the following: If $r$ is any primitive $n$th root of unity and if $1, a, b, \dots, l$ are the integers less than $n$ and relatively prime to $n$, then
$$r, \quad r^a, \quad r^b, \quad \dots, \quad r^l \tag{29}$$
give all the distinct primitive $n$th roots of unity.

29. Let $n = p^s$, where $p$ is a prime, and let $r$ be a primitive $p^s$th root of unity. Of the $n = p^s$ roots (28) of $x^{p^s} = 1$,
$$r^p, \quad r^{2p}, \quad \dots, \quad r^{p^s}$$
give the $p^{s-1}$ distinct roots of $x^{p^{s-1}} = 1$. The remaining $p^s - p^{s-1}$ roots are primitive $p^s$th roots of unity (sec. 28). Hence the roots of equation (26) give all the primitive $p^s$th roots of unity.

To complete the discussion in sec. 24, we must prove that this equation is irreducible in the domain of all rational numbers. This proof will be based upon a lemma of great importance. For the case of the special function $y^3 - 3y + 1$, this lemma states that the function is the product of two factors with rational coefficients only when these coefficients are integers. For this case a proof has been given in sec. 12.

30. Gauss's lemma. If an integral function $f(x)$ with integral coefficients, that of the highest power being unity, is the product of two integral functions $\phi(x) = x^m + b_1x^{m-1} + \dots + b_m$, $\psi(x) = x^{m'} + c_1x^{m'-1} + \cdots$
$+\; c_{m'}$ with rational coefficients, these coefficients are integers.

Let the fractions $b_1, \dots, b_m$ be brought to their least positive common denominator $\beta_0$ and set $b_i = \beta_i/\beta_0$. Hence $\beta_0, \dots, \beta_m$ have no common divisor except unity. Similarly, let $c_i = \gamma_i/\gamma_0$, where $\gamma_0, \dots, \gamma_{m'}$ are integers with no common divisor $> 1$. Multiplying $f = \phi \cdot \psi$ by $\beta_0\gamma_0$, we get
$$\beta_0\gamma_0 f(x) = \phi_1(x)\psi_1(x), \tag{30}$$
where
$$\phi_1 = \beta_0x^m + \beta_1x^{m-1} + \dots + \beta_m, \quad \psi_1 = \gamma_0x^{m'} + \gamma_1x^{m'-1} + \dots + \gamma_{m'}.$$
The lemma is proved if $\beta_0 = \gamma_0 = 1$. Suppose, however, that $\beta_0\gamma_0 > 1$. Then every term of the left member of (30) is a multiple of any prime divisor $p$ of $\beta_0\gamma_0$. Not all the $\beta$'s have a common divisor $p$. Let $\beta_i$ be the first coefficient in $\phi_1(x)$ which is not divisible by $p$. At least one of the $\gamma$'s is not divisible by $p$; let $\gamma_k$ be the first one. The total coefficient of $x^{m+m'-i-k}$ in the product $\phi_1(x)\psi_1(x)$ is
$$\beta_i\gamma_k + \beta_{i-1}\gamma_{k+1} + \beta_{i-2}\gamma_{k+2} + \dots + \beta_{i+1}\gamma_{k-1} + \beta_{i+2}\gamma_{k-2} + \cdots$$
and this sum must be divisible by $p$, since every term on the left of (30) is divisible by $p$. By hypothesis, $\beta_{i-1}, \beta_{i-2}, \dots$ and $\gamma_{k-1}, \gamma_{k-2}, \dots$ are divisible by $p$. Hence $\beta_i\gamma_k$ must be divisible by $p$, contrary to hypothesis.

31. Of various proofs of the irreducibility of equation (26), we shall reproduce Kronecker's first proof.

To prove that the function $f(x)$, defined by (26), is irreducible in the domain of rational numbers, it suffices, in view of the preceding lemma, to show that $f(x)$ is not the product of two polynomials $\phi(x)$, $\psi(x)$, with integral coefficients. Suppose that such a factorization $f(x) = \phi(x)\psi(x)$ is possible. For $x = 1$, we get
$$p = \phi(1)\,\psi(1).$$
Since $p$ is a prime, one of these integers, say $\phi(1)$, must equal $\pm 1$. Let $r$ be any primitive $p^s$th root of unity. Now all the primitive roots are given by (29), where $1, a, b, \dots, l$ denote the $t = p^s - p^{s-1}$ integers less than $p^s$ and relatively prime to $p^s$. Further, by sec. 29, these numbers (29) give all the roots of (26).
Hence the factor $\phi(x)$ must vanish when one of these numbers is substituted for $x$, so that
$$\phi(r)\,\phi(r^a)\,\phi(r^b)\cdots\phi(r^l) = 0.$$
In other words, the function
$$P(x) = \phi(x)\,\phi(x^a)\,\phi(x^b)\cdots\phi(x^l)$$
vanishes when $x$ is replaced by any primitive $p^s$th root $r$ of unity. Since $P(x)$ thus vanishes for each root of $f(x)$, and since the roots of $f(x)$ are all distinct, it follows that $P(x)$ is divisible by $f(x)$. Thus $P(x)=f(x)\,q(x)$, where $q(x)$ is a polynomial with integral coefficients. The number of factors $\phi$ in $P(x)$ is $t$. Hence, for $x=1$,
$$[\phi(1)]^t = p\cdot q(1).$$
Since $\phi(1)=\pm 1$, $p$ cannot divide $[\phi(1)]^t$. The assumption that $f(x)$ is reducible has therefore led to a contradiction.
32. The proof of the theorems stated in sec. 21 rests upon four lemmas which will now be established. While, in sec. 21, $r$ denoted the particular $p$th root of unity,
$$\cos\frac{2\pi}{p} + i\sin\frac{2\pi}{p},$$
we shall henceforth denote by $r$ any primitive $p$th root of unity. In view of sec. 28, the powers (9) continue to give all the complex $p$th roots of unity. The same is true of the powers (10). Hence when $r$ is any primitive $p$th root of unity, the various roots (10) can be separated into periods exactly as in sec. 21.
33. It is shown in elementary algebra that if an equation $F(x)=0$ with real coefficients has a complex root $a+bi$, where $i=\sqrt{-1}$ and $b\neq 0$, it has also the root $a-bi$, so that $F(x)$ has the factor
$$\phi(x) = (x-a-bi)(x-a+bi) = x^2 - 2ax + a^2 + b^2.$$
Since $\phi(x)$ has no factor $x-d$, where $d$ is real, it is irreducible in the domain of real numbers (sec. 9). This result is merely a special case of the following:
Lemma I. If $F(x)$ and $\phi(x)$ are integral functions with coefficients in the domain $D$, and $\phi(x)$ is irreducible in $D$, and if $F(x)$ vanishes for one root $x_1$ of $\phi(x)=0$, then $F(x)$ is the product of $\phi(x)$ by an integral function with coefficients in $D$.
The ordinary process for finding the greatest common divisor $g(x)$ of $F(x)$ and $\phi(x)$ involves only rational operations.
Hence $g(x)$ is an integral function with coefficients in the domain $D$. Moreover, $g(x)$ is not a constant, since $F(x)$ and $\phi(x)$ have the common factor $x-x_1$. Since $\phi(x)$ is irreducible in $D$, its factor $g(x)$ must equal $c\,\phi(x)$, where $c$ is a constant. Hence $\phi(x)$, as well as $g(x)$, is a factor of $F(x)$.
Corollary. If the degree of $F(x)$ is less than that of $\phi(x)$, then $F(x)$ has all its coefficients zero.
34. Lemma II. Any integral function $f(r)$ of a primitive $p$th root $r$ of unity can be given the normal form
$$c_0r + c_1r^{g} + c_2r^{g^2} + \dots + c_{p-2}r^{g^{p-2}}, \tag{31}$$
in which each $c_i$ is an integral function, with integral coefficients, of the coefficients of $f(x)$. If $f(r)$ has rational coefficients, it has a single normal form.
Since $r^p=1$ and $r\neq 1$, $r$ is a root of
$$\frac{x^p-1}{x-1} = x^{p-1} + x^{p-2} + \dots + x + 1 = 0. \tag{32}$$
Hence
$$r^{p-1} + \dots + r^2 + r + 1 = 0. \tag{33}$$
By employing $r^p=1$, we may give $f(r)$ the form
$$f(r) = a_0 + a_1r + a_2r^2 + \dots + a_{p-1}r^{p-1}.$$
From this we subtract $a_0$ times (33) and obtain
$$f(r) = A_1r + A_2r^2 + \dots + A_{p-1}r^{p-1}, \tag{34}$$
where $A_i=a_i-a_0$. Since the quantities (9) are identical in some order with the quantities (10), we may give (34) the normal form (31). The first part of the lemma is therefore proved.
We now make the assumption that the coefficients of the initial function $f(r)$ are rational numbers. Then the $a_i$, and hence also the $A_i$, are rational numbers. In this case $f(r)$ can be expressed in the form (34) in only one way. For, if also
$$f(r) = B_1r + B_2r^2 + \dots + B_{p-1}r^{p-1},$$
in which the $B_i$ are rational numbers, we obtain by subtraction and removal of the factor $r$ an equation,
$$0 = A_1 - B_1 + (A_2-B_2)r + \dots + (A_{p-1}-B_{p-1})r^{p-2},$$
with rational coefficients. But equation (32) is irreducible in the domain of rational numbers (sec. 31 with $s=1$). Hence by the Corollary in sec. 33, each coefficient $A_i-B_i$ is zero.
35. The periods $\eta_0, \eta_1, \dots, \eta_{e-1}$, defined by (16), have the important property that each is unaltered when $r$ is replaced by $r^{g^e}$.
In fact, $r^{g^s}$ is then replaced by $(r^{g^e})^{g^s}=r^{g^{e+s}}$, so that any term, except the last, of a period $\eta_k$ is replaced by the next succeeding term, and the last term by the first. The last statement follows from $g^{fe}=g^{p-1}\equiv 1 \pmod{p}$.
36. Lemma III. Any integral function $f(r)$ of a primitive $p$th root of unity, with integral coefficients $a_i$, and having the property that it remains unaltered when $r$ is replaced by $r^{g^e}$, equals a linear function of the periods,
$$k_0\eta_0 + k_1\eta_1 + \dots + k_{e-1}\eta_{e-1}, \tag{35}$$
where each $k_i$ is an integral function of the $a_i$ with integral coefficients. If the $a_i$ are all integers, then the $k_i$ are integers.
Let $f(r)$ be given the normal form (31), but with the powers of $r$ arranged in tabular form as in (15). Thus,
$$\begin{aligned} f(r) = {} & c_{00}\,r + c_{10}\,r^{g^{e}} + c_{20}\,r^{g^{2e}} + \dots + c_{f-1\,0}\,r^{g^{(f-1)e}} \\ & + \dots \\ & + c_{0k}\,r^{g^{k}} + c_{1k}\,r^{g^{e+k}} + c_{2k}\,r^{g^{2e+k}} + \dots + c_{f-1\,k}\,r^{g^{(f-1)e+k}} \\ & + \dots \end{aligned}$$
When $r$ is replaced by $r^{g^e}$, the powers in any row are permuted cyclically (end of sec. 35). The coefficients of the resulting function must equal those of $f(r)$, by Lemma II. Hence
$$c_{0k}=c_{1k},\quad c_{1k}=c_{2k},\quad \dots,\quad c_{f-1\,k}=c_{0k} \qquad (k=0, 1, \dots, e-1).$$
Thus the $c$'s in each row are all equal. Removing the common factor, we obtain a sum of powers of $r$ which defines a period $\eta_k$. Hence
$$f(r) = c_{00}\eta_0 + c_{01}\eta_1 + \dots + c_{0k}\eta_k + \dots + c_{0\,e-1}\eta_{e-1}.$$
37. Lemma IV. An integral function $f(r)$ of a primitive $p$th root of unity with integral coefficients, which remains unaltered when $r$ is replaced by $r^g$, equals an integer.
We apply Lemma III for the case $e=1$. Then
$$\eta_0 = r + r^{g} + r^{g^2} + \dots + r^{g^{p-2}}$$
is the only period of $f=p-1$ terms. By (33), $\eta_0=-1$. Hence, by (35), $f(r)$ equals the integer $-k_0$.
38. We are now in a position to prove Theorems I and II of sec. 21. First, $\eta_0, \dots, \eta_{e-1}$ are the roots of
$$F(x) = (x-\eta_0)(x-\eta_1)\cdots(x-\eta_{e-1}) = 0.$$
Its coefficients are symmetric functions, with integral coefficients, of the periods $\eta_i$. When $r$ is replaced by $r^g$, these periods are permuted cyclically, that is, $\eta_0$ is replaced by $\eta_1$, $\eta_1$ by $\eta_2$, ..., and $\eta_{e-1}$ by $\eta_0$.
Hence a symmetric function of these periods remains unaltered and, by Lemma IV, equals an integer. Theorem I is therefore true.
Next, the $e'$ periods (21) are the roots of
$$\phi_k(x) = (x-\eta'_k)(x-\eta'_{e+k})\cdots(x-\eta'_{(e'-1)e+k}) = 0,$$
whose coefficients are symmetric functions, with integral coefficients, of these periods (21). But the latter are permuted cyclically when $r$ is replaced by $r^{g^e}$. Hence Theorem II follows from Lemma III.
39. References. The proof that a regular polygon of $p$ sides, where $p$ is a prime of the form $2^h+1$, is geometrically inscriptible was first made by Gauss, Disquisitiones Arithmeticae, translated into German by Maser. On p. 447 of the latter, Gauss states that a regular $n$-gon is not inscriptible if $n$ contains an odd prime factor not of the form $2^h+1$, or the square of a prime $2^h+1$ (i.e., states the theorems of secs. 23 and 24 above); but no proof appears to have been published by Gauss.
References to the proof (secs. 5-11) of this impossibility may be made to
Petersen, Theorie der Algebraischen Gleichungen, Kopenhagen, 1878, p. 156.
Klein, Vorträge über ausgewählte Fragen der Elementargeometrie, Leipzig, 1895; English translation by Beman and Smith, Boston, 1897.
Enriques, Questioni Riguardanti la Geometria Elementare, Bologna, 1900, Articles 10 and 11; German edition.
The theorems may be readily proved by means of Galois's theory of algebraic equations. For the domain of rational numbers, the Galois group of equation (26), whose roots are the primitive $p^s$th roots of unity, is cyclic, so that its factors of composition are the prime factors of $p^{s-1}(p-1)$. If, and only if, these factors are all 2 will the equation be equivalent to a chain of quadratic equations.

IX
THE HISTORY AND TRANSCENDENCE OF $\pi$
By DAVID EUGENE SMITH
CONTENTS
1, The nature of the problem; 2, The history of the problem; 3, The transcendence of $e$; 4, The transcendence of $\pi$.

1. Nature of the problem.
The first areas that the world measured accurately were doubtless rectangles, and, in particular, squares. If the sides of the rectangles were commensurable with common units of linear measure, and for practical purposes they were, at least with some convenient submultiples of those units, then the problem was easily solved. The next step was probably the mensuration of the parallelogram or triangle, to be followed by that of the trapezoid, thus completing the most common rectilinear forms. In theory the measurement of these polygons offered no serious difficulties, and by means of these figures the areas of other polygons could easily be found.
When it came to finding the area of curvilinear figures, however, the problem assumed new difficulties, and in connection with the most common of these figures the effort was early made to find a square that should have an area equal to that of a given circle, the subsequent problem of measurement of the square being then a simple one. In other words, the problem was one of "squaring the circle."
Since, however, it was early seen that $a=\tfrac{1}{2}rc$, it was evident that the problem could be solved if a straight line could be found that should equal the circumference. For if this line could be found, then the formula $a=\tfrac{1}{2}rc$ would give a rectangle with the same area as the circle, and it is a simple matter to construct a square with area equal to that of a given rectangle. The problem thus reduces to "rectifying the circumference," or "rectifying the circle," as we would now say.
Furthermore, since $c=2\pi r$, if we could find the value of $\pi$, as an integer or a common fraction (or finite decimal), we could easily rectify the circle. Since we can construct $\sqrt{ab}$ by the use of the compasses and straightedge, it would also be possible to rectify the circle if we could express $\pi$ by means of a finite number of square roots.
In other words, the circle would be rectified if $\pi$ could be expressed by rational operations and by irrational operations involving a finite number of square roots.
On the other hand, every particular geometric construction effected by the straightedge and compasses reduces to the determination of the intersection of two straight lines, of one straight line and a circle, or of two circles, and is equivalent to a rational operation or the extracting of a square root. A geometric construction is therefore impossible unless it can be effected by rational operations or by the aid of a finite number of square roots.* The problem therefore finally reduces to the determining of the nature of $\pi$, whether or not it is the root of an algebraic equation that can be solved by these methods.
* See Monograph No. VIII, secs. 2, 11.
2. History of the problem. The history of the problem of "squaring the circle," or more specifically of investigating the nature of $\pi$, may be found in any of the standard histories of mathematics, and in particular in Cantor's Vorlesungen über die Geschichte der Mathematik (4 vols.). The subject has been specially treated, however, by Rudio in his work entitled Archimedes, Huygens, Lambert, Legendre, vier Abhandlungen über die Kreismessung, and in a more condensed manner in the German edition of Enriques's Fragen der Elementargeometrie, both of which works have been freely used in preparing this article.
There have been three well-defined epochs in the consideration of this problem. The first extended from the earliest times to about 1650 A.D. It is characterized by innumerable and ingenious attempts at finding a square equal in area to a given circle, or at finding the approximate value of $\pi$ by purely geometric methods, and especially by the methods now used in our elementary text-books.
The second period was about a century in length, extending from the invention of the differential and integral calculus to the year 1766, when Lambert published his work on the subject. In this period the methods of analysis replace the geometric methods of the ancients, and the names of Newton, Leibnitz, the Bernoullis, and Euler are prominently connected with the investigation. Instead of the Greek method of exhaustions, used to such advantage by Archimedes, we now find infinite series and products used to approximate the value of $\pi$, and Euler's remarkable formula, to be referred to later, is introduced into the discussion.
The third period extends from the middle of the eighteenth century to the present time, and is characterized by the efforts to discover not the approximate value of $\pi$, but the nature of this number, whether or not it is rational, or whether it is algebraic or transcendent. Since the two latter terms will enter into this discussion it should be understood that an algebraic number is a number that is the root of an equation,
$$C_0 + C_1x + C_2x^2 + \dots + C_nx^n = 0,$$
where the coefficients $C_0, C_1, \dots, C_n$ are rational numbers. A number which is not algebraic, that is, which satisfies no such equation, is called transcendent.* It should further be mentioned that if a number is the root of an algebraic equation with rational coefficients it is also the root of an algebraic equation with integral coefficients. For this reason we may restrict our equations to those in which $C_0, C_1, \dots, C_n$ are integers.
* See also Monograph No. VI, sec. 13.
The first period begins in prehistoric times, the earliest approximation for $\pi$ probably being 3, as in the Bible (I Kings vii 23, and II Chronicles iv 2). On the Babylonian cylinders there has not yet been found any definite statement as to this value, and the Hindu and Chinese records are untrustworthy for these remote times. We have, however, a valuable papyrus
in the British Museum, probably copied about 1700 B.C. from a work of some centuries before, in which an approximation for $\pi$ is given. This papyrus was copied by one Ahmes, a scribe, and states that the area of the circle is, in our symbolism, $\left(d-\tfrac{1}{9}d\right)^2$, or $\tfrac{64}{81}d^2$. Now since $a=\tfrac{1}{4}\pi d^2$, it follows that the Ahmes value of $\pi$ is $\tfrac{256}{81}$, or 3.1604...
Among the Greeks, numerous philosophers attempted to solve the problem. One of the earliest to make any progress was Hippias of Elis (c. 420 B.C.) who invented a curve known as the τετραγωνίζουσα or quadratrix, which usually bears the name of Dinostratos (c. 350 B.C.) who studied it carefully. The curve may be described as follows:
If a circle of unit radius has its centre at the origin of rectangular coordinates, and if two points $Q$ and $R$ move with uniform velocity, one upon the quadrant $AB$ and the other upon the radius $OB$, so that they start from $A$ and $O$, respectively, at the same time, and reach $B$ simultaneously, then the point of intersection, $P$, of $OQ$ and of a perpendicular to $OB$ from $R$, describes a quadratrix.
[Figure: the quadratrix, showing the axes $X$ and $Y$, the origin $O$, the quadrant $AB$, and the points $Q$, $R$, and $P$.]
It therefore follows that the ordinate $y$ is proportional to the angle $\phi$, and specifically that as we double $y$ (within the quadrant) we double $\phi$. Furthermore, since $\phi=\tfrac{\pi}{2}$ if $y=1$,
$$\phi = \frac{\pi}{2}\,y.$$
Also
$$\frac{y}{x} = \tan\phi, \qquad \phi = \tan^{-1}\frac{y}{x}, \ \text{or arc}\tan\frac{y}{x}, \qquad \frac{y}{x} = \tan\frac{\pi}{2}y,$$
and
$$x = \frac{y}{\tan\frac{\pi}{2}y}.$$
Therefore the curve meets the $x$-axis at
$$x = \lim_{y\to 0}\frac{y}{\tan\frac{\pi}{2}y} = \frac{2}{\pi}.$$
That is, if we can construct the quadratrix we shall have an abscissa exactly equal to $\tfrac{2}{\pi}$, from which $\pi$ can easily be constructed. The difficulty was at once seen, however, namely, that the construction of a quadratrix itself was as difficult as to find $\pi$, and that indeed it was practically the same problem.
Contemporary with Hippias were Antiphon and Bryson, to whom we are largely indebted for our present methods of attacking the problem in elementary geometry.
Antiphon inscribed a square (or possibly an equilateral triangle) in a circle, and by continually doubling the number of sides he approximately exhausted the difference between the polygon and the circle, thus approximating the area. Bryson, of Heraclea, a follower of the Pythagoreans, not only inscribed a regular polygon, but also circumscribed one similar to it, and then assumed that the area of the circle was the arithmetic mean between the two areas, a false assumption that led only to a fair approximation. To Antiphon, therefore, we trace one of the earliest steps in the invention of the modern calculus.
The first one to actually square a curvilinear figure, in his efforts to square the circle, was Hippocrates, of Chios (c. 450 B.C.). He proved that if semicircles be described upon the sides of an isosceles right triangle, as shown in the figure, the lune $A$ will equal the triangle $A'$. The proposition is easily generalized for scalene right triangles, but it contributed nothing to the general problem of the circle.
The greatest step among the Greeks was taken by Archimedes in his three propositions on the measurement of the circle (κύκλου μέτρησις). Substantially his method of finding the value of $\pi$ is by inscribing and circumscribing regular polygons and doubling the number of sides, quite as in elementary geometry to-day. By this means, using a polygon of 96 sides, he showed that $3\tfrac{1}{7}>\pi>3\tfrac{10}{71}$, from which fact $3\tfrac{1}{7}$ has often been called the Archimedean value of $\pi$. Since $3\tfrac{1}{7}$ is less than 0.2 per cent larger than the real value, and is such a simple number for ordinary computations, it is still in common use.
Ptolemy improved upon the values assigned by Archimedes, expressing the result in the sexagesimal system as $3^\circ\ 8'\ 30''$, i.e., $3+\tfrac{8}{60}+\tfrac{30}{60^2}$, which reduces to $3\tfrac{17}{120}=3.14166\ldots$
A somewhat similar value of $\pi$ appeared in India as early as c.
500 A.D., when Aryabhatta gave the value $\tfrac{62832}{20000}$, which equals 3.1416, a value, however, that may be due to a later writer by the same name. Brahmagupta (born 598 A.D.) gave $\sqrt{10}$ as the exact value, perhaps because of the common approximation formula, $\sqrt{a^2+r}=a+\tfrac{r}{2a+1}$, this leading to $\sqrt{10}=3+\tfrac{1}{7}$, or the common Archimedean value. This value, $\sqrt{10}$, was extensively used in mediaeval times.
The next noteworthy step in obtaining an approximate value for $\pi$ was taken by the Chinese. Chang Heng (78-139 A.D.) gave a rule that was equivalent to taking $\sqrt{10}$ for $\pi$. Wang Fan (229-267) gave $\pi=142:45$, or 3.1555..., and a contemporary writer, Lui Hui, proceeding in the same way as Antiphon, found the ratio 157:50, or 3.14. The most interesting of the Chinese discoveries, however, is that of Tsu Ch'ung-chih (fifth century A.D.), who found for the limits of $10\pi$, 31.415927 and 31.415926, from which he inferred by some reasoning not stated in his works that $\tfrac{22}{7}$ and $\tfrac{355}{113}$ were approximate values. The latter is the one usually attributed to Adriaen Anthonisz, as mentioned hereafter. Various other attempts were made by the Chinese, but no noteworthy results were obtained until after the European influence had permeated their civilization. In the Su-li Ching-yin, compiled by Imperial order in 1713, the value of $\pi$ is found to 19 figures.*
The greatest mathematical genius of the Middle Ages, Leonardo Pisano, Fibonacci, brought the limits of $\pi$ somewhat closer than Archimedes, namely to $\tfrac{1440}{458\frac{1}{5}}=3.1427\ldots$ and $\tfrac{1440}{458\frac{4}{9}}=3.1410\ldots$, taking as the mean $\tfrac{864}{275}=3.1418$.
No material improvement in methods or results was thereafter made until about the beginning of the seventeenth century. It was then that the Chinese value $\tfrac{355}{113}=3.1415929\ldots$ was rediscovered by Adriaen Anthonisz (1527-1607), being published by his son, Adriaen (1571-1635), who, from the fact that his family was originally from Metz, took the name of Metius.
This publication took place in 1625, and it appears that the father had first shown that $3\tfrac{15}{106}<\pi<3\tfrac{17}{120}$ and that he had reached this value by assuming that
$$\pi = 3\,\frac{15+17}{106+120} = 3\tfrac{16}{113} = \frac{355}{113}.$$
The value is correct through the sixth decimal place. About the same time Viète (1540-1603), following the Greek method, considered polygons of $6\cdot 2^{16}$ sides, and found the value of $\pi$ correct to nine decimal places. Adriaen van Rooman (Adrianus Romanus, 1561-1615), a Lyonese by birth, carried the computation to seventeen decimal places, and a little later Ludolph van Ceulen (1540-1610) extended it to thirty-five decimal places, a fact that was thought to be so noteworthy as to lead to $\pi$ being called the Ludolphian number, a name still used in Germany. The last noteworthy attempt by Greek methods was the improvement suggested by Christian Huygens (1629-95), by which he was enabled to find the value to nine decimal places by using only the inscribed polygon of 60 sides. With his labors the ancient methods may be said to close.
* See Mikami, in the Bibliotheca Mathematica, 1909-10, p. 1.
The second period in the solution of the problem begins in the second half of the seventeenth century. It was now that the new analysis came to the aid of the investigators, and the genius of men like Newton, Leibnitz, Fermat, Wallis, Brouncker, and the Bernoullis asserted itself. Instead of the geometric methods of Archimedes there appeared methods of a radically different nature, having for their object the expressing of $\pi$ analytically, and developing it as an infinite series or product. The first noteworthy attempts in this line were made by John Wallis (1616-1703) who proved that
$$\frac{\pi}{2} = \frac{2\cdot 2\cdot 4\cdot 4\cdot 6\cdot 6\cdot 8\cdot 8\cdots}{1\cdot 3\cdot 3\cdot 5\cdot 5\cdot 7\cdot 7\cdot 9\cdots},$$
and that
$$\frac{4}{\pi} = 1 + \cfrac{1}{2+\cfrac{9}{2+\cfrac{25}{2+\cfrac{49}{2+\cfrac{81}{2+\dots}}}}},$$
this second form, the continued fraction, having already been given to him without proof by Lord Brouncker (1620-84).
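The two expressions just quoted can be checked numerically. The following is only a minimal sketch: the function names `wallis_half_pi` and `brouncker_four_over_pi` and the cut-off parameters are assumptions made for illustration, and ordinary floating-point arithmetic is used. Both expressions approach their limits slowly, which is why the tolerances in the printed comparison are loose.

```python
import math


def wallis_half_pi(pairs: int) -> float:
    """Partial Wallis product (2/1)(2/3)(4/3)(4/5)... using `pairs` pairs of factors."""
    product = 1.0
    for k in range(1, pairs + 1):
        product *= (2 * k) / (2 * k - 1) * (2 * k) / (2 * k + 1)
    return product


def brouncker_four_over_pi(depth: int) -> float:
    """Brouncker's fraction 1 + 1/(2 + 9/(2 + 25/(2 + ...))), keeping `depth` partial numerators."""
    tail = 2.0  # innermost denominator, where the fraction is cut off
    for k in range(depth, 1, -1):
        tail = 2.0 + (2 * k - 1) ** 2 / tail
    return 1.0 + 1.0 / tail


if __name__ == "__main__":
    print("Wallis    :", 2 * wallis_half_pi(1000), "vs", math.pi)
    print("Brouncker :", 4 / brouncker_four_over_pi(1000), "vs", math.pi)
```

With a thousand factors or partial numerators each expression agrees with $\pi$ only to about three decimal places, which illustrates why these forms were of theoretical rather than computational interest.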
The most important infinite series developed at this time for the study of the circle was discovered by James Gregory (1638-75) in 1670, and independently by Leibnitz (1646-1716) in 1673. This is the series:
$$\tan^{-1}x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots$$
Gregory, moreover, recognized the necessity of considering the question of the convergency of such a series, a subject elaborated by Leibnitz a little later. Gregory also stated that in general the ratio of a sector of a circle to the area of any inscribed or circumscribed polygon cannot be expressed by a finite number of algebraic terms. He therefore concluded that the circle could not be squared, although as Mr. Ball has pointed out in his history, "it is conceivable that some particular sector might be squared, and this particular sector might be the whole circle."
In the series for $\tan^{-1}x$, if $x=1$, we have the series
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots,$$
but it converges so slowly as not to be convenient in practice. This series bears the name of Leibnitz, having been communicated by him to certain of his friends in 1674, and published by him in 1682. It was known before his time, however.
If, instead of using $x=1$, we take $x=\sqrt{\tfrac{1}{3}}$, the series for $\tan^{-1}x$ becomes
$$\frac{\pi}{6} = \sqrt{\frac{1}{3}}\left(1 - \frac{1}{3\cdot 3} + \frac{1}{3^2\cdot 5} - \frac{1}{3^3\cdot 7} + \frac{1}{3^4\cdot 9} - \dots\right),$$
which is more usable than that for $\tfrac{\pi}{4}$.
Still better than this formula is one derived from the series for $\tan^{-1}x$ by means of an addition theorem, viz.:
$$\tan^{-1}x + \tan^{-1}y = \tan^{-1}\frac{x+y}{1-xy},$$
which by repeated application leads to a formula for the sum of several antitangents or for multiples of a single antitangent. It was thus that the English mathematician Machin (1680-1752) established the relation
$$\frac{\pi}{4} = 4\tan^{-1}\frac{1}{5} - \tan^{-1}\frac{1}{239} = 4\left(\frac{1}{5} - \frac{1}{3\cdot 5^3} + \frac{1}{5\cdot 5^5} - \frac{1}{7\cdot 5^7} + \dots\right) - \left(\frac{1}{239} - \frac{1}{3\cdot 239^3} + \frac{1}{5\cdot 239^5} - \frac{1}{7\cdot 239^7} + \dots\right).$$
By these means the value of $\pi$ was computed to 100 decimal places after Abraham Sharp (1653-1742) had computed it to 72 decimal places by the help of the series for $\tfrac{\pi}{6}$.
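Machin's relation lends itself to a short computation. The sketch below is an illustration only, not a reconstruction of Machin's own working: it evaluates Gregory's series for $\tan^{-1}\tfrac{1}{5}$ and $\tan^{-1}\tfrac{1}{239}$ in scaled integer arithmetic. The function names `arctan_inverse` and `machin_pi` and the choice of ten guard digits are assumptions made here.

```python
def arctan_inverse(x: int, scale: int) -> int:
    """Approximate scale * arctan(1/x) by Gregory's series, using integers only."""
    term = scale // x          # scale / x^(2k+1), truncated at every step
    total = term
    x_squared = x * x
    divisor = 1
    sign = 1
    while term:
        term //= x_squared
        divisor += 2
        sign = -sign
        total += sign * (term // divisor)
    return total


def machin_pi(digits: int) -> str:
    """Return pi truncated to `digits` decimal places via pi/4 = 4 arctan(1/5) - arctan(1/239)."""
    guard = 10                 # extra digits that absorb the truncation errors
    scale = 10 ** (digits + guard)
    pi_scaled = 4 * (4 * arctan_inverse(5, scale) - arctan_inverse(239, scale))
    text = str(pi_scaled // 10 ** guard)
    return text[0] + "." + text[1:]


if __name__ == "__main__":
    print(machin_pi(100))
```

Because the powers of $\tfrac{1}{5}$ and $\tfrac{1}{239}$ fall off rapidly, about seventy terms of the first series suffice for 100 decimal places, in contrast with the series for $\tfrac{\pi}{4}$ obtained from $x=1$.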
The other attempts at computing the value of $\pi$ by means of series may be summarized by mentioning the names of the French mathematician Lagny (1660-1734), who carried the computation to 127 decimal places; the Austrian Georg Vega (1756-1802), 140 places; the Hamburg computer Zacharias Dase (1824-61), 200 places, using the formula $\tfrac{\pi}{4}=\tan^{-1}\tfrac{1}{2}+\tan^{-1}\tfrac{1}{5}+\tan^{-1}\tfrac{1}{8}$; Richter, who extended the value to 500 decimal places, and Shanks, who carried it to 700 decimal places.
These efforts are of value chiefly in showing the superiority of the modern over the ancient methods. Practically, as the late Professor Newcomb remarked, "ten decimals are sufficient to give the circumference of the earth to the fraction of an inch, and thirty decimals would give the circumference of the whole visible universe to a quantity imperceptible with the most powerful microscope."
The results of these extended computations revealed nothing concerning the real nature of $\pi$, nothing as to whether it is rational or irrational, and nothing as to its possible transcendental character. The foundation for the solution of the problem as to the nature of $\pi$ was furnished by Euler in connection with the formulas involving $e$, the base of the so-called Naperian logarithms, although first used as a base in the tables of John Speidell, published in London in 1619.
Starting with Maclaurin's formula,
$$f(x) = f(0) + f'(0)\cdot\frac{x}{1} + f''(0)\cdot\frac{x^2}{1\cdot 2} + f'''(0)\cdot\frac{x^3}{1\cdot 2\cdot 3} + \dots,$$
it is evident that
$$e^x = 1 + \frac{x}{1} + \frac{x^2}{1\cdot 2} + \frac{x^3}{1\cdot 2\cdot 3} + \dots,$$
$$\cos x = 1 - \frac{x^2}{1\cdot 2} + \frac{x^4}{1\cdot 2\cdot 3\cdot 4} - \dots,$$
and
$$\sin x = x - \frac{x^3}{1\cdot 2\cdot 3} + \frac{x^5}{1\cdot 2\cdot 3\cdot 4\cdot 5} - \dots,$$
all being convergent series. It was by the help of these series that Euler (1707-83) showed that
$$e^{ix} = 1 + \frac{ix}{1} - \frac{x^2}{1\cdot 2} - \frac{ix^3}{1\cdot 2\cdot 3} + \dots$$
and
$$i\sin x = ix - \frac{ix^3}{1\cdot 2\cdot 3} + \frac{ix^5}{1\cdot 2\cdot 3\cdot 4\cdot 5} - \dots,$$
whence
$$e^{ix} = \cos x + i\sin x.$$
If $x=\pi$, this reduces to the form $e^{i\pi}=-1$, whence
$$1 + e^{i\pi} = 0,$$
an expression involving perhaps the five most interesting quantities in mathematics.
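The passage from the exponential series to Euler's relation can be followed numerically. In the sketch below, offered only as an illustration, the partial sums of the series for $e^z$ are formed for an imaginary argument and compared with $\cos x + i\sin x$; the function name `exp_partial_sum` and the choice of forty terms are assumptions made here.

```python
import cmath
import math


def exp_partial_sum(z: complex, terms: int) -> complex:
    """Sum of the first `terms` terms of 1 + z/1 + z^2/(1*2) + z^3/(1*2*3) + ..."""
    total = 0j
    term = 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total


if __name__ == "__main__":
    x = 0.7
    series_value = exp_partial_sum(1j * x, 40)
    print(series_value, "vs", complex(math.cos(x), math.sin(x)))

    # The special case x = pi: the sum 1 + e^(i*pi) should be numerically zero.
    print(abs(1 + exp_partial_sum(1j * math.pi, 40)))
    # The library exponential gives the same check.
    print(abs(1 + cmath.exp(1j * math.pi)))
```

The printed magnitudes are of the order of the rounding error of the arithmetic, not exactly zero, since $\pi$ itself is held only to finite precision.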
It is by means of this equation that the transcendent nature of $\pi$ was proved about a century and a half after Euler's discovery.
Euler also gave numerous other relations between $e$ and $\pi$, and expressed in various ways the values of these numbers in infinite series and products, and as continued fractions. For example, he showed that the following relations exist:
$$\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots, \qquad \frac{\pi^3}{32} = 1 - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \frac{1}{9^3} - \dots,$$
$$\frac{\pi^2}{6} = \frac{2^2}{2^2-1}\cdot\frac{3^2}{3^2-1}\cdot\frac{5^2}{5^2-1}\cdot\frac{7^2}{7^2-1}\cdot\frac{11^2}{11^2-1}\cdots,$$
$$e = 2 + \cfrac{1}{1+\cfrac{1}{2+\cfrac{1}{1+\cfrac{1}{1+\cfrac{1}{4+\cfrac{1}{1+\dots}}}}}}, \qquad \frac{e-1}{2} = \cfrac{1}{1+\cfrac{1}{6+\cfrac{1}{10+\cfrac{1}{14+\cfrac{1}{18+\dots}}}}}.$$
The third period in the history of the study of $\pi$ begins with the work of the German mathematician Johann Heinrich Lambert (1728-77). In his treatise on the quadrature and rectification of the circle (1766) he set forth two fundamental propositions, viz.:
1. If $x$ is a rational number, not 0, then $e^x$ cannot be rational;
2. If $\tan x$ is a rational number, not 0, then $x$ cannot be rational.
He reached these conclusions by starting with Euler's expression for $\tfrac{1}{2}(e-1)$, viz.:
$$\frac{e-1}{2} = \cfrac{1}{1+\cfrac{1}{6+\cfrac{1}{10+\cfrac{1}{14+\cfrac{1}{18+\cfrac{1}{22+\dots}}}}}}.$$
He then showed that
$$\frac{e^x-1}{e^x+1} = \cfrac{1}{\cfrac{2}{x}+\cfrac{1}{\cfrac{6}{x}+\cfrac{1}{\cfrac{10}{x}+\cfrac{1}{\cfrac{14}{x}+\dots}}}}$$
and
$$\tan x = \cfrac{1}{\cfrac{1}{x}-\cfrac{1}{\cfrac{3}{x}-\cfrac{1}{\cfrac{5}{x}-\cfrac{1}{\cfrac{7}{x}-\cfrac{1}{\cfrac{9}{x}-\dots}}}}},$$
and from these continued fractions he drew the conclusions stated, the proof not being rigorous. For the special case of $x=\tfrac{\pi}{4}$ we know that $\tan\tfrac{\pi}{4}=1$, whence he asserted that $\pi$ cannot be rational.
The failure of Lambert to prove that the continued fraction
$$\cfrac{m}{n+\cfrac{m'}{n'+\cfrac{m''}{n''+\dots}}}$$
is irrational, the number of terms being infinite, $m, m', \dots$ and $n, n', \dots$ being integral, and $\tfrac{m}{n}, \tfrac{m'}{n'}, \dots$ being less than 1, was remedied by Legendre (1752-1833), who supplied the proof in his Éléments de géométrie (1794). With Legendre's work, therefore, the proof of the irrationality of $\pi$ may be said to have been settled, and to this he added a proof of the irrationality of $\pi^2$.
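Lambert's two continued fractions, as quoted above, converge rapidly and can be compared with the ordinary values of the functions. The sketch below is an illustration only; the function names `lambert_tan` and `lambert_tanh_half`, and the rule of cutting each fraction off after a fixed number of partial denominators, are assumptions made here. It says nothing about the irrationality argument itself, which concerns the infinite fractions.

```python
import math


def lambert_tan(x: float, depth: int) -> float:
    """tan x from 1/(1/x - 1/(3/x - 1/(5/x - ...))), keeping `depth` partial denominators."""
    tail = (2 * depth - 1) / x
    for k in range(depth - 1, 0, -1):
        tail = (2 * k - 1) / x - 1.0 / tail
    return 1.0 / tail


def lambert_tanh_half(x: float, depth: int) -> float:
    """(e^x - 1)/(e^x + 1) from 1/(2/x + 1/(6/x + 1/(10/x + ...)))."""
    tail = (4 * depth - 2) / x
    for k in range(depth - 1, 0, -1):
        tail = (4 * k - 2) / x + 1.0 / tail
    return 1.0 / tail


if __name__ == "__main__":
    print(lambert_tan(math.pi / 4, 10), "vs", 1.0)
    print(lambert_tan(1.0, 10), "vs", math.tan(1.0))
    print(lambert_tanh_half(1.0, 10), "vs", (math.e - 1) / (math.e + 1))
    # With x = 1 the second fraction becomes 1/(2 + 1/(6 + 1/(10 + ...))).
    # Euler's fraction for (e - 1)/2 differs only in its first denominator, 1 in place of 2.
```

Ten partial denominators already reproduce the library values to the limit of double precision for these arguments.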
The next noteworthy step was taken by Liouville (1809-82) in 1840, when he showed that $e$ cannot be the root of a quadratic equation with rational coefficients, or, in other words, that if $a$, $b$, $c$ are rational, $ae^2+be+c=0$ is impossible. This was the first successful attempt toward verifying what Legendre had stated to be probable, that $\pi$ is of such a nature that it cannot be classed among algebraic numbers, that is, that it is not the root of any algebraic equation with a finite number of terms with rational coefficients.
The question then, as it stood after the contribution of Liouville, was twofold: Of what, if any, algebraic equations with a finite number of terms with rational coefficients can $e$ and $\pi$ be roots? Is it not possible to find numbers that are not roots of an algebraic equation of this kind? Legendre was the first to express the doubt contained in the second part of this question, and the doubt became a certainty when Liouville proved, in 1844, the existence of non-algebraic numbers and justified the division of numbers into algebraic and transcendental.
As the result of a careful investigation of the exponential function Hermite succeeded in proving, in 1873, that the number $e$ is transcendental, and Lindemann, in 1882, succeeded in proving the same for $\pi$, basing his proof upon the labors of Hermite. Lindemann proved essentially that in an equation of the form
$$a_0 + a_1e^{p} + a_2e^{q} + a_3e^{r} + \dots = 0,$$
the exponents and coefficients cannot all be algebraic numbers. It therefore follows that in the Euler equation, $1+e^{i\pi}=0$, where the coefficients are algebraic, the exponent $i\pi$ is not algebraic, and hence $\pi$ is transcendental.
While we shall not follow Lindemann's proof exactly, it is nevertheless necessary, as a preliminary to considering the nature of $\pi$, to prove that $e$ is a transcendental number.
3. The transcendence of $e$. Since Hermite first proved that $e$ is transcendental others have materially simplified his treatment of the problem.
The contributions of Hilbert, Hurwitz, and Gordan were published in the Mathematische Annalen in 1893. The Gordan proof was still further simplified by Weber in his Algebra, and later in the Encyklopädie der Elementar-Mathematik (1903), and Enriques, in his Fragen der Elementargeometrie (German edition, 1907), presents it in its latest form. To the last-named work the basis of the following proof is due, but the proof has been materially simplified, chiefly through the kind assistance of Professor E. V. Huntington, of Harvard University, who planned the treatment for $e$, and who made the suggestion of using the cubic instead of the general equation, and of distinctly setting forth the three lemmas.

To prove that $e$ is a transcendental number means that it must be shown that $e$ is not a root of any algebraic equation with rational coefficients. In other words, it must be shown that it is impossible to have a general equation of the form

$$C_0 + C_1e + C_2e^2 + \cdots + C_ne^n = 0, \tag{1}$$

where $n$ is any positive integer, and where the coefficients $C_0, C_1, \ldots, C_n$ are any rational numbers, including 0, except that $C_0$ and $C_n$ cannot be 0, since this would change the degree of the equation. Multiplying such an equation through by a common denominator of its coefficients, we may suppose the $C$'s to be integers, as is done in Lemma II below.

In order to simplify the proof it is proposed to take a cubic equation instead of this general equation of the $n$th degree, and to show that it is impossible to have

$$C_0 + C_1e + C_2e^2 + C_3e^3 = 0. \tag{2}$$

The proof, however, is essentially the same as that for the equation of the $n$th degree, the gain being in the simplicity of statement. The extension of the proof to the general equation is obvious.

The proof requires us to consider two important functions which, on account of their frequent use, we shall distinguish by the symbols $f(x)$ and $F(x)$. The first of these is a rational integral function of $x$ of the $n$th degree, such that $f(0) = 0$. It is therefore of the form

$$f(x) = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n,$$

where the $a$'s are rational numbers, being the coefficients in the expansion of $f(x)$ in powers of $x$. The proof depends upon the ingenious selection of the following function:

$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!},$$

in which $p$ is a prime number to be determined later in this discussion. If $f(x)$ is put into the form $a_1x + a_2x^2 + \cdots + a_nx^n$, that is, if

$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!} = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n, \tag{3}$$

it is evident that $n = 3p + p - 1 = 4p - 1$, and that $a_{p-1}$ is the first coefficient that is not zero, since the lowest power of $x$ is $x^{p-1}$. More generally, of course, the numerator of this fraction would be $x^{p-1}[(x-1)(x-2)\cdots(x-m)]^p$, but for our present purposes the one selected is sufficient.

The second function that enters into the discussion is

$$F(x) = f'(x) + f''(x) + f'''(x) + \cdots + f^{(n)}(x), \tag{4}$$

where $f'(x), f''(x), \ldots, f^{(n)}(x)$ are the successive derivatives of $f(x)$.

In order to bring clearly to view the principal steps of the proof, unencumbered by subordinate matters, it is now proposed to state three lemmas concerning these functions, $f(x)$ and $F(x)$, relegating the proofs of these lemmas to the end of this part of the discussion.

Lemma I. If $f(x) = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n$, and if $S_n$ denotes the sum of the first $n$ terms in the series for $e^x$, so that

$$S_1 = 1,\quad S_2 = 1 + \frac{x}{1!},\quad S_3 = 1 + \frac{x}{1!} + \frac{x^2}{2!},\quad \ldots,\quad S_n = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n-1)!},$$

then, from (4),

$$F(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + 3!\,S_3a_3 + \cdots + n!\,S_na_n. \tag{5}$$

In particular,

$$F(0) = 1!\,a_1 + 2!\,a_2 + 3!\,a_3 + \cdots + n!\,a_n. \tag{6}$$

Lemma II. Again referring to

$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!}$$

and

$$F(x) = f'(x) + f''(x) + f'''(x) + \cdots + f^{(n)}(x),$$

if $p$ is any prime number, and $C_0, C_1, C_2, C_3$ are any integers, then

$$C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) = C_0(3!)^p + pQ, \tag{7}$$

where $Q$ is some integer depending upon the values of the $C$'s and $p$. (Strictly, the term $C_0(3!)^p$ carries the sign $(-1)^p$, arising from $[(-1)(-2)(-3)]^p$; this has no effect upon the question of divisibility by $p$, which is the only use made of (7).)

Lemma III. Again referring to

$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!} = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n,$$

if we let $A_1 = |a_1|$, $A_2 = |a_2|$, $\ldots$, $A_n = |a_n|$, and $X = |x|$, then

$$A_1X + A_2X^2 + A_3X^3 + \cdots + A_nX^n = \frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}. \tag{8}$$

Assuming for the time being that these lemmas have been proved, or referring to the proofs at the close of this discussion, we now proceed to consider the transcendence of $e$.

As a point of departure we take the series defining $e^x$, already considered:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \tag{9}$$

which is convergent for all values of $x$. In this we let $S_n$ stand for the sum of the first $n$ terms as in Lemma I, so that

$$S_n = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n-1)!}.$$

If, now, we multiply (9) successively by $1!, 2!, 3!, \ldots, n!$, and put

$$U_n = x^n + \frac{x^{n+1}}{n+1} + \frac{x^{n+2}}{(n+1)(n+2)} + \cdots, \tag{10}$$

we shall have

$$\begin{aligned}
1!\,e^x &= 1!\,S_1 + x + \frac{x^2}{2} + \frac{x^3}{2\cdot 3} + \cdots &&= 1!\,S_1 + U_1,\\
2!\,e^x &= 2!\,S_2 + x^2 + \frac{x^3}{3} + \frac{x^4}{3\cdot 4} + \cdots &&= 2!\,S_2 + U_2,\\
3!\,e^x &= 3!\,S_3 + x^3 + \frac{x^4}{4} + \frac{x^5}{4\cdot 5} + \cdots &&= 3!\,S_3 + U_3,\\
&\;\;\vdots\\
n!\,e^x &= n!\,S_n + x^n + \frac{x^{n+1}}{n+1} + \frac{x^{n+2}}{(n+1)(n+2)} + \cdots &&= n!\,S_n + U_n.
\end{aligned} \tag{11}$$

If we multiply both members of the successive equations of (11) by the successive coefficients $a_1, a_2, a_3, \ldots, a_n$ of (3) and add, we shall have

$$(1!\,a_1 + 2!\,a_2 + \cdots + n!\,a_n)e^x = (1!\,S_1a_1 + 2!\,S_2a_2 + \cdots + n!\,S_na_n) + (a_1U_1 + a_2U_2 + \cdots + a_nU_n).$$

But by (6) we know that $F(0) = 1!\,a_1 + 2!\,a_2 + \cdots + n!\,a_n$, and by (5) we know that $F(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + \cdots + n!\,S_na_n$. It therefore follows, from the preceding equation, that

$$F(0)e^x = F(x) + a_1U_1 + a_2U_2 + \cdots + a_nU_n.$$

For brevity we let

$$\varphi(x) = a_1U_1 + a_2U_2 + \cdots + a_nU_n, \tag{12}$$

and we have

$$F(0)e^x = F(x) + \varphi(x). \tag{13}$$

We now have an expression for $e^x$ which depends upon $F(x)$ of (4), and hence ultimately upon the choice of $p$ in (3).

We now return to the essential point of the problem, and recall that we are to prove that it is impossible that $C_0 + C_1e + C_2e^2 + C_3e^3$ should equal zero. We shall evidently have this form as a factor if, in (13), we substitute 0, 1, 2, and 3 successively for $x$, multiply the results by $C_0$, $C_1$, $C_2$, and $C_3$, and then add, thus:

$$\begin{aligned}
F(0)C_0 &= C_0F(0) + C_0\varphi(0),\\
F(0)C_1e &= C_1F(1) + C_1\varphi(1),\\
F(0)C_2e^2 &= C_2F(2) + C_2\varphi(2),\\
F(0)C_3e^3 &= C_3F(3) + C_3\varphi(3).
\end{aligned} \tag{14}$$
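The three lemmas stated above can be spot-checked mechanically before their proofs are read. The following is a minimal sketch in Python using exact rational arithmetic, for the single prime $p = 5$ (an arbitrary choice). The helper names `build_f`, `big_F`, and `lemma1_rhs` are illustrative assumptions and do not come from the text. The sketch compares (5) with the direct definition (4) at a few sample points, checks the divisibility facts behind (7), and checks the coefficient identity (8).

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def build_f(p, sign=-1):
    """Coefficients of x^(p-1) [(x-1)(x-2)(x-3)]^p / (p-1)!  when sign = -1,
    or of the same expression with (x+1)(x+2)(x+3) when sign = +1."""
    coeffs = [Fraction(0)] * (p - 1) + [Fraction(1)]      # x^(p-1)
    for root in (1, 2, 3):
        for _ in range(p):
            coeffs = poly_mul(coeffs, [Fraction(sign * root), Fraction(1)])
    return [c / factorial(p - 1) for c in coeffs]

def derivative(a):
    return [k * a[k] for k in range(1, len(a))]

def evaluate(a, x):
    return sum(c * Fraction(x) ** k for k, c in enumerate(a))

def big_F(a, x):
    """F(x) = f'(x) + f''(x) + ... + f^(n)(x), computed directly from (4)."""
    total, d = Fraction(0), derivative(a)
    while d:
        total += evaluate(d, x)
        d = derivative(d)
    return total

def lemma1_rhs(a, x):
    """Right member of (5): the sum of k! * S_k(x) * a_k."""
    total = Fraction(0)
    for k in range(1, len(a)):
        S_k = sum(Fraction(x) ** i / factorial(i) for i in range(k))
        total += factorial(k) * S_k * a[k]
    return total

p = 5
a = build_f(p)

# Lemma I: (5) agrees with (4) at several sample values of x.
lemma1_ok = all(big_F(a, x) == lemma1_rhs(a, x)
                for x in (0, 1, 2, 3, Fraction(1, 2)))

# Lemma II: F(0), ..., F(3) are integers; F(1), F(2), F(3) are multiples of p;
# F(0) differs from [(-1)(-2)(-3)]^p by a multiple of p (the text writes this
# leading term as (3!)^p, suppressing the sign).
F_vals = [big_F(a, x) for x in range(4)]
lemma2_ok = (all(v.denominator == 1 for v in F_vals)
             and all(v % p == 0 for v in F_vals[1:])
             and (F_vals[0] - (-6) ** p) % p == 0)

# Lemma III: |a_k| equals the k-th coefficient of the (x+1)(x+2)(x+3) form.
b = build_f(p, sign=+1)
lemma3_ok = all(abs(ak) == bk for ak, bk in zip(a, b))

print(lemma1_ok, lemma2_ok, lemma3_ok)
```

A check of this kind for one prime is of course no proof; it only guards against misreading the formulas. Exact fractions are used rather than floating point because the coefficients of $f(x)$ grow quickly and the divisibility tests require exact integers.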
Adding the equations of (14), we have

$$F(0)[C_0 + C_1e + C_2e^2 + C_3e^3] = [C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3)] + [C_0\varphi(0) + C_1\varphi(1) + C_2\varphi(2) + C_3\varphi(3)]. \tag{15}$$

We will now make the assumption of (2), that $C_0 + C_1e + C_2e^2 + C_3e^3 = 0$, and will show that (15) is then impossible, and hence that the assumption is absurd. In making this substitution in (15) we also recall that

$$C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) = C_0(3!)^p + pQ,$$

by (7). We therefore have

$$0 = [C_0(3!)^p + pQ] + [C_0\varphi(0) + C_1\varphi(1) + C_2\varphi(2) + C_3\varphi(3)], \tag{16}$$

where $Q$ is some integer depending upon the values of the $C$'s and $p$, and $\varphi$ is the function defined in (12).

The problem now reduces to showing that (16) is impossible, and hence that (15) is impossible under the assumption made, and hence that $e$ cannot be a root of an equation like (2). We can show that (16) is impossible if we can prove that

(1) The absolute value of the first part, $C_0(3!)^p + pQ$, is greater than or equal to 1;

(2) The absolute value of the second part, $C_0\varphi(0) + \cdots + C_3\varphi(3)$, is less than 1.

For in case we can prove this, then in the most unfavorable case we shall have $\pm 1 \pm (\text{a number less than } 1) = 0$, which is manifestly impossible, whence (16) is impossible.

As to the first part, $C_0(3!)^p + pQ$, if we take $p$ a prime number greater than 3, and not a factor of $C_0$ (such primes exist because $C_0$ is not zero), then $C_0(3!)^p$ is not divisible by $p$, but $pQ$ is divisible by $p$. The sum is therefore an integer not divisible by $p$, and hence not zero, so that

$$|C_0(3!)^p + pQ| \ge 1.$$

Consider now the second part of (16), $C_0\varphi(0) + C_1\varphi(1) + C_2\varphi(2) + C_3\varphi(3)$. In this we shall need to make use of the fact that the absolute value of a sum is less than, or at most equal to, the sum of the absolute values of the terms, as is seen in the simple case of $|2 - 2 + 2 - 2| = 0$, while $|2| + |-2| + |2| + |-2| = 8$. And since the $\varphi$ functions are defined (12) in terms of the $U$ functions (10), we consider first $U_n$. From (10),

$$U_n = x^n\left[1 + \frac{x}{n+1} + \frac{x^2}{(n+1)(n+2)} + \cdots\right].$$

Putting $X$ for $|x|$, as in (8), we have

$$|U_n| \le X^n\left[1 + \frac{X}{1} + \frac{X^2}{1\cdot 2} + \cdots\right],$$

since each denominator has here been replaced by a smaller one. Hence, from (9),

$$|U_n| \le X^ne^X. \tag{17}$$

Having now considered the $U$ function used in defining the $\varphi$ function (12), we consider the latter, i.e.,

$$\varphi(x) = a_1U_1 + a_2U_2 + \cdots + a_nU_n.$$

In this we put $A_1$ for $|a_1|$, $A_2$ for $|a_2|$, $\ldots$, and we have

$$|\varphi(x)| \le A_1|U_1| + A_2|U_2| + \cdots + A_n|U_n|,$$

as in the preliminary work of (17). Substituting the limiting value of $|U_n|$ from (17), and giving to $n$ the successive values 1, 2, 3, $\ldots$, we have

$$|\varphi(x)| \le e^X[A_1X + A_2X^2 + \cdots + A_nX^n].$$

Whence, from (8),

$$|\varphi(x)| \le e^X\,\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!},$$

whence

$$|\varphi(x)| \le e^X(X+1)(X+2)(X+3)\,\frac{[X(X+1)(X+2)(X+3)]^{p-1}}{(p-1)!}. \tag{18}$$

Now for any fixed value of $X$ we can take for $p$ a value so large that

$$\frac{[X(X+1)(X+2)(X+3)]^{p-1}}{(p-1)!}$$

shall be as small as we please, since this is of the form $y^{p-1}/(p-1)!$, and is therefore the $p$th term of the convergent exponential series for $e^y$, and therefore approaches zero as $p$ increases.

Hence, putting 0, 1, 2, and 3 successively for $x$, we see that $|\varphi(0)|$, $|\varphi(1)|$, $|\varphi(2)|$, and $|\varphi(3)|$ can all be made as small as we please by choosing $p$ sufficiently large. Hence the absolute value of the second part of (16) which we are considering, viz., $C_0\varphi(0) + C_1\varphi(1) + C_2\varphi(2) + C_3\varphi(3)$, can be made as small as we please, and hence less than 1, which was what we set out to show as the second part of the general proof.

It therefore appears that (16) cannot be true, and that therefore (15) cannot be true under the assumption made, and that therefore (2) cannot be true; in other words, that $e$ is not the root of any cubic equation with integral coefficients. And what has been shown with respect to the cubic equation can evidently be shown with respect to an equation of the $n$th degree, since no essential use has been made of the restriction $n = 3$. Hence $e$ cannot be the root of any algebraic equation with rational coefficients.

Proof of Lemma I. Lemma I asserts that if $f(x) = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n$, and if we let

$$S_1 = 1,\quad S_2 = 1 + \frac{x}{1!},\quad S_3 = 1 + \frac{x}{1!} + \frac{x^2}{2!},\quad \ldots,\quad S_n = 1 + \frac{x}{1!} + \cdots + \frac{x^{n-1}}{(n-1)!},$$

then

$$f'(x) + f''(x) + f'''(x) + \cdots + f^{(n)}(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + \cdots + n!\,S_na_n.$$

To prove this, first write $f(x)$ in this form:

$$f(x) = 1!\,a_1\frac{x}{1!} + 2!\,a_2\frac{x^2}{2!} + 3!\,a_3\frac{x^3}{3!} + \cdots + n!\,a_n\frac{x^n}{n!}.$$

Taking the successive derivatives we have

$$\begin{aligned}
f'(x) &= 1!\,a_1 + 2!\,a_2\frac{x}{1!} + 3!\,a_3\frac{x^2}{2!} + \cdots + n!\,a_n\frac{x^{n-1}}{(n-1)!},\\
f''(x) &= 2!\,a_2 + 3!\,a_3\frac{x}{1!} + \cdots + n!\,a_n\frac{x^{n-2}}{(n-2)!},\\
f'''(x) &= 3!\,a_3 + \cdots + n!\,a_n\frac{x^{n-3}}{(n-3)!},\\
&\;\;\vdots\\
f^{(n)}(x) &= n!\,a_n.
\end{aligned}$$

By adding, and substituting $S_1, S_2, \ldots, S_n$ for their respective series, we have

$$F(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + \cdots + n!\,S_na_n.$$

Proof of Lemma II. Lemma II asserts that if

$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!}$$

and $F(x) = f'(x) + f''(x) + \cdots + f^{(n)}(x)$, and $p$ is any prime number, and the $C$'s are any integers, then

$$C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) = C_0(3!)^p + pQ,$$

where $Q$ is some integer depending upon the values of the $C$'s and $p$.

Arranging $f(x)$ according to ascending powers of $x$ we have

$$f(x) = \frac{B_{p-1}x^{p-1} + B_px^p + B_{p+1}x^{p+1} + \cdots + B_{4p-1}x^{4p-1}}{(p-1)!},$$

since it is apparent from (3) that the lowest power of $x$ is $p - 1$, and the highest is $3p + p - 1 = 4p - 1$. It is evident that $B_{p-1}, B_p, \ldots, B_{4p-1}$ are integral, since they are sums of products of integers, and that

$$B_{p-1} = [(-1)(-2)(-3)]^p = (-1)^p(3!)^p.$$

The sign $(-1)^p$ has no bearing upon divisibility by $p$, and is suppressed in (7). Taking the successive derivatives, so as to determine the values of $F(0)$, $F(1)$, $F(2)$, $F(3)$, we have, after putting 0 for $x$,

$$f'(0) = 0,\quad f''(0) = 0,\quad \ldots,\quad f^{(p-2)}(0) = 0,$$

but

$$f^{(p-1)}(0) = B_{p-1},\quad f^{(p)}(0) = pB_p,\quad f^{(p+1)}(0) = p(p+1)B_{p+1},\quad \ldots$$

Hence

$$F(0) = B_{p-1} + pB_p + p(p+1)B_{p+1} + \cdots.$$

Substituting the value of $B_{p-1}$ above, we have, apart from the sign just mentioned,

$$C_0F(0) = C_0(3!)^p + \text{a set of integers in which } p \text{ is a factor}.$$

Similarly, taking the values of $f'(1), f''(1), \ldots$, we shall find that $F(1)$ equals a series of integers in which $p$ is a factor of each term, and so for $F(2)$ and $F(3)$. Hence

$$C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) = C_0(3!)^p + pQ.$$

Proof of Lemma III. Lemma III asserts that if

$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!} = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n,$$

and if $A_1 = |a_1|$, $A_2 = |a_2|$, $\ldots$, $A_n = |a_n|$, and $X = |x|$, then

$$A_1X + A_2X^2 + A_3X^3 + \cdots + A_nX^n = \frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}.$$

Referring to the second form of $f(x)$ above, we see that $f(x)$ is a function of $x$ with alternating signs.
For if we take the general case of $x^k - a_1x^{k-1} + a_2x^{k-2} - \cdots$, in which the $a$'s are positive, and multiply this by $x - b$, where $b$ is positive, the resulting function will have alternating signs, as in the case of $(x - c)(x - d)$. Furthermore, the result is the same, aside from the signs, as that obtained by multiplying $x^k + a_1x^{k-1} + a_2x^{k-2} + \cdots$ by $x + b$. Repeated application of this theorem shows that the expanded product of the general case,

$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)\cdots(x-m)]^p}{(p-1)!},$$

has the same alternating signs, and reduces, when the absolute values $A_n$ of all the coefficients $a_n$ are taken, and when we take 3 for $m$, to

$$A_1X + A_2X^2 + A_3X^3 + \cdots + A_nX^n = \frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}.$$

4. The transcendence of $\pi$.

The proof of the transcendence of $\pi$ is based upon three propositions already given, viz.:

$$F(0)e^x = F(x) + \varphi(x); \tag{13}$$

$$|\varphi(x)| \le e^X\,\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}; \tag{18}$$

$$1 + e^{i\pi} = 0, \quad \text{Euler's theorem.} \tag{19}$$

If we assume $\pi$ to be an algebraic number, then $i\pi$ is evidently an algebraic number, and therefore is the root of an algebraic equation with rational coefficients. If this equation is taken, as before, to be of the third degree (the proof being essentially the same for the general case) we may indicate its roots by $y_1$, $y_2$, and $y_3$, and among these $i\pi$ must be found. But since $1 + e^{i\pi} = 0$, we should then have

$$(1 + e^{y_1})(1 + e^{y_2})(1 + e^{y_3}) = 0,$$

whence

$$1 + (e^{y_1} + e^{y_2} + e^{y_3}) + (e^{y_1+y_2} + e^{y_2+y_3} + e^{y_3+y_1}) + e^{y_1+y_2+y_3} = 0. \tag{20}$$

It is proposed to show that this equation is impossible.

The symmetric functions of the quantities $y_1, y_2, y_3$ are, by our hypothesis, rational numbers, and hence $y_1, y_2, y_3$ are roots of the rational algebraic equation $\psi(x) = 0$. The symmetric functions of the quantities $y_1 + y_2$, $y_2 + y_3$, $y_3 + y_1$ (for example, their power sums) are also symmetric functions of the $y_k$, and are therefore rational numbers. The quantities $y_1 + y_2$, $y_2 + y_3$, $y_3 + y_1$ are therefore roots of a second algebraic equation $\psi_1(x) = 0$. Similarly, $y_1 + y_2 + y_3$ is the root of a third algebraic equation, $\psi_2(x) = 0$. Therefore

$$\psi(x)\,\psi_1(x)\,\psi_2(x) \tag{21}$$

is an integral function of $x$ which becomes 0 as soon as $x$ becomes equal to one of the numbers $y_i$, $y_i + y_k$, or $y_1 + y_2 + y_3$. Some of these numbers, say $N$ of them, may equal zero. If we place the product (21) equal to 0, and suppress the factor $x^N$, we have an equation $\theta(x) = 0$, which we may consider as being reduced to a form having integral coefficients. Since the zero roots have just been suppressed, $\theta(0)$ cannot equal 0, and hence $\theta(x)$ may be written

$$\theta(x) = ax^m + a_1x^{m-1} + a_2x^{m-2} + \cdots + a_m = 0,$$

where $a, a_1, \ldots, a_m$ are integral, and $a$ and $a_m$ are not 0, and $a$ is positive. This may easily be transformed, by multiplying by $a^{m-1}$ and putting $z$ for $ax$, into an equation with integral coefficients, of the form

$$\theta_1(z) = z^m + b_1z^{m-1} + b_2z^{m-2} + \cdots + b_m = 0, \tag{22}$$

the coefficient of the highest power being unity.

Let the roots of the equation $\theta(x) = 0$ be $x_1, x_2, x_3, \ldots$, these representing the numbers among the numbers $y_i$, $y_j + y_k$, $y_1 + y_2 + y_3$ that are not equal to 0. It is seen from (20) that they must satisfy the equation

$$K + e^{x_1} + e^{x_2} + e^{x_3} + \cdots = 0, \tag{23}$$

where $K = 1 + N$ is a positive integer, each of the $N$ vanishing exponents in (20) contributing a term $e^0 = 1$.

We now return to the fundamental equation (13),

$$F(0)e^x = F(x) + \varphi(x).$$

If we put for $x$ the numbers $x_1, x_2, x_3, \ldots$, and add the results, we shall have, with attention to (23),

$$-K\cdot F(0) = F(x_1) + F(x_2) + F(x_3) + \cdots + \varphi(x_1) + \varphi(x_2) + \varphi(x_3) + \cdots,$$

or

$$K\cdot F(0) + F(x_1) + F(x_2) + F(x_3) + \cdots + \varphi(x_1) + \varphi(x_2) + \varphi(x_3) + \cdots = 0. \tag{24}$$

We now wish to prove that when we make a suitable choice of the integral function $f(x)$, which is entirely arbitrary except for the condition that $f(0) = 0$, the equation (24) is impossible. It will then follow that our sole hypothesis, viz., that $\pi$ is an algebraic number, is incorrect. If we prove that

1. $K\cdot F(0) + F(x_1) + F(x_2) + F(x_3) + \cdots$ is integral and not 0;

2. The absolute value of $\varphi(x_1) + \varphi(x_2) + \varphi(x_3) + \cdots$ is less than 1;

then we shall have proved the impossibility of (24), for the sum of an integer that is not 0 and a number whose absolute value is less than 1 cannot be 0.
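The passage from the equation satisfied by $y_1, y_2, y_3$ to the equation satisfied by their pairwise sums can be made concrete. The sketch below is only an illustration under stated assumptions: the function name `pair_sum_cubic` is hypothetical, and instead of the general symmetric-function argument of the text it uses a shortcut available for a cubic, namely that $y_1 + y_2 = s - y_3$ (and so on), where $s = y_1 + y_2 + y_3$ is rational. Substituting $s - x$ for $x$ in the cubic therefore yields an equation with rational coefficients whose roots are the pairwise sums, without the roots themselves ever being computed.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def pair_sum_cubic(cubic):
    """Given [c0, c1, c2, c3] for c3*x^3 + c2*x^2 + c1*x + c0 with roots
    y1, y2, y3, return the coefficients (same ordering) of a cubic whose
    roots are y1+y2, y2+y3, y3+y1."""
    c0, c1, c2, c3 = (Fraction(c) for c in cubic)
    s = -c2 / c3                          # y1 + y2 + y3, a rational number
    result = [c3]
    for c in (c2, c1, c0):                # Horner's rule in the variable (s - x)
        result = poly_mul(result, [s, Fraction(-1)])
        result[0] += c
    return [-r for r in result]           # restore the leading coefficient c3

# (x-1)(x-2)(x-4) = x^3 - 7x^2 + 14x - 8; pairwise sums of the roots are 3, 5, 6.
print([str(c) for c in pair_sum_cubic([-8, 14, -7, 1])])   # ['-90', '63', '-14', '1']

# x^3 - 2 has irrational and complex roots, yet the result is again rational.
print([str(c) for c in pair_sum_cubic([-2, 0, 0, 1])])     # ['2', '0', '0', '1']
```

The first output is $(x-3)(x-5)(x-6) = x^3 - 14x^2 + 63x - 90$, as expected. For equations of higher degree the shortcut no longer applies, and the rationality of the coefficients rests on the theorem on symmetric functions, as in the text.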
We first let $p$ represent a prime number, and we take for $f(x)$ the integral function

$$f(x) = \frac{z^{p-1}[\theta_1(z)]^p}{(p-1)!} = \frac{a^{mp-1}x^{p-1}[\theta(x)]^p}{(p-1)!}, \tag{25}$$

an equation that is evident from the fact that we took $z = ax$ and multiplied $\theta(x)$ by $a^{m-1}$ when we formed $\theta_1(z)$. We arrange $[\theta_1(z)]^p$ according to ascending powers of $z$, and we have

$$[\theta_1(z)]^p = A_0 + A_1z + A_2z^2 + \cdots = A_0 + A_1ax + A_2a^2x^2 + \cdots,$$

where the $A$'s are integral, and, from (22), $A_0 = b_m^{\,p}$ and therefore not 0. Now from (25),

$$f(x) = \frac{A_0a^{p-1}x^{p-1} + A_1a^px^p + A_2a^{p+1}x^{p+1} + \cdots}{(p-1)!}. \tag{26}$$

Taking the derivatives, and letting $x = 0$, we have

$$f(0) = 0,\quad f'(0) = 0,\quad \ldots,\quad f^{(p-2)}(0) = 0,$$

$$f^{(p-1)}(0) = A_0a^{p-1} = b_m^{\,p}a^{p-1},\quad f^{(p)}(0) = pA_1a^p,\quad f^{(p+1)}(0) = p(p+1)A_2a^{p+1},\quad \ldots$$

We now select a value for $p$ greater than the greatest of the numbers $a$, $|b_m|$, $K$. Then $f^{(p-1)}(0)$ is not divisible by $p$, while all the other derived functions are either 0 or are divisible by $p$. Therefore $F(0)$, which from (4) equals $f'(0) + f''(0) + \cdots$, is an integer not divisible by $p$, and thus $K\cdot F(0)$ is also an integer not divisible by $p$, which tells us the nature of part of the first function under consideration.

In deriving (22) we used $z$ for $ax$, and we may therefore take $f(x)$ and arrange it according to ascending powers of $z - z_k$, where $z_k$ is one of the roots of (22), and we have

$$f(x) = \frac{(z - z_k)^pB_1(z_k) + (z - z_k)^{p+1}B_2(z_k) + \cdots}{(p-1)!} = \frac{a^p(x - x_k)^pB_1(z_k) + a^{p+1}(x - x_k)^{p+1}B_2(z_k) + \cdots}{(p-1)!}, \tag{27}$$

where $B_1(z_k), B_2(z_k), \ldots$ are integral functions of $z_k$ with integral coefficients. Hence, as with equation (26), we have

$$f(x_k) = 0,\quad f'(x_k) = 0,\quad f''(x_k) = 0,\quad \ldots,\quad f^{(p-1)}(x_k) = 0;$$

$$f^{(p)}(x_k) = pa^pB_1(z_k),\quad f^{(p+1)}(x_k) = p(p+1)a^{p+1}B_2(z_k),\quad \ldots$$

If now we let

$$Q(z_k) = a^pB_1(z_k) + (p+1)a^{p+1}B_2(z_k) + \cdots,$$

we have from (4),

$$F(x_k) = pQ(z_k). \tag{28}$$

Therefore

$$F(x_1) + F(x_2) + F(x_3) + \cdots = p[Q(z_1) + Q(z_2) + Q(z_3) + \cdots]. \tag{29}$$

But the second member of (29) is $p$ times an integral symmetric function of the $m$ roots of equation (22), and hence is integral and contains the factor $p$.

We have now proved that $K\cdot F(0)$ is an integer not divisible by $p$, and that the sum of the functions $F(x_k)$ is an integer that is divisible by $p$, so that

$$K\cdot F(0) + F(x_1) + F(x_2) + F(x_3) + \cdots$$

is an integer not divisible by $p$, and therefore is not 0, which was the first thing to be proved.

We now take up the second thing to be proved, that the absolute value of $\varphi(x_1) + \varphi(x_2) + \varphi(x_3) + \cdots$ is less than 1. To do this we begin with the inequality established in the derivation of (18),

$$|\varphi(x)| \le e^X[A_1X + A_2X^2 + \cdots + A_nX^n],$$

in which $X = |x|$ and the $A$'s are now the absolute values of the coefficients of the present $f(x)$ when arranged according to powers of $x$. Taking $\theta(x) = ax^m + a_1x^{m-1} + \cdots + a_m = 0$, already considered, we write this

$$\theta(x) = a(x - x_1)(x - x_2)\cdots(x - x_m). \tag{30}$$

Then, from (25) and (30) we have

$$f(x) = \frac{a^{(m+1)p-1}x^{p-1}(x - x_1)^p(x - x_2)^p\cdots(x - x_m)^p}{(p-1)!}. \tag{31}$$

Letting $X$ stand for $|x|$, and $X_k$ for $|x_k|$, it is evident that the absolute values of the coefficients in (31) are not greater than the coefficients in

$$\frac{a^{(m+1)p-1}x^{p-1}(x + X_1)^p(x + X_2)^p\cdots(x + X_m)^p}{(p-1)!}.$$

If we now place

$$P(X) = a^{m+1}X(X + X_1)(X + X_2)\cdots(X + X_m),$$

then for every positive number $X$ we have

$$A_1X + A_2X^2 + \cdots + A_nX^n \le \frac{[P(X)]^p}{aX\,(p-1)!} = \frac{P(X)}{aX}\cdot\frac{[P(X)]^{p-1}}{(p-1)!}.$$

We now proceed as with (18). For any fixed value of $X$ we can take a value of $p$ so large that

$$\frac{[P(X)]^{p-1}}{(p-1)!}$$

shall be as small as we please. We now recall that

$$|\varphi(x)| \le e^X[A_1X + A_2X^2 + \cdots + A_nX^n] \le e^X\,\frac{P(X)}{aX}\cdot\frac{[P(X)]^{p-1}}{(p-1)!}.$$

Hence $|\varphi(x)|$ may be made as small as we please for each of the numbers $x_1, x_2, x_3, \ldots$, and hence the absolute value of $\varphi(x_1) + \varphi(x_2) + \varphi(x_3) + \cdots$ may be made less than 1 by taking a suitable value of $p$, which proves the second part of the proposition.

The two points necessary to show the transcendency of $\pi$ have now been proved. In other words, $\pi$ satisfies no algebraic equation with rational coefficients, and therefore cannot be found by means of the ordinary algebraic operations, and therefore cannot be constructed geometrically by the use of the instruments of elementary geometry, nor even by the aid of higher algebraic curves.
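Both proofs close by taking the prime $p$ large enough, and for the cubic case of the proof for $e$ the size required can be estimated directly. The following is a minimal sketch with hypothetical function names and arbitrary sample coefficients. It searches for the first prime $p > 3$, not a factor of $C_0$, for which the sum of the $|C_k|$ times the right member of (18) at $X = k$ falls below 1. Because it works from the upper bound (18) and not from $\varphi$ itself, the prime found is sufficient for the argument but need not be the least one that would serve. Logarithms (`math.lgamma`) are used because $360^{\,p-1}$ and $(p-1)!$ separately overflow ordinary floating-point numbers.

```python
import math

def log_bound(X, p):
    """Natural logarithm of the right member of (18):
    e^X (X+1)(X+2)(X+3) * y^(p-1) / (p-1)!,  with y = X(X+1)(X+2)(X+3).
    Returns -inf when X = 0, where the bound itself is 0."""
    y = X * (X + 1) * (X + 2) * (X + 3)
    if y == 0:
        return -math.inf
    return (X + math.log((X + 1) * (X + 2) * (X + 3))
            + (p - 1) * math.log(y) - math.lgamma(p))   # lgamma(p) = log (p-1)!

def total_bound(C, p):
    """Upper bound for |C_0 phi(0) + C_1 phi(1) + C_2 phi(2) + C_3 phi(3)|."""
    # The cap at 700 only keeps math.exp from overflowing; such values are
    # far above 1 in any case.
    return sum(abs(c) * math.exp(min(log_bound(k, p), 700.0))
               for k, c in enumerate(C))

def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def find_prime(C):
    """First prime p > 3, not dividing C[0], with total_bound(C, p) < 1."""
    assert C[0] != 0, "C_0 must not be zero, as in (1)"
    p = 5
    while True:
        if is_prime(p) and C[0] % p != 0 and total_bound(C, p) < 1:
            return p
        p += 2

C = [2, -3, 1, 4]            # arbitrary sample integers C_0, ..., C_3
p = find_prime(C)
print(p, total_bound(C, p))
```

The term for $X = 3$ dominates, since there $y = 3\cdot 4\cdot 5\cdot 6 = 360$, and $y^{p-1}/(p-1)!$ does not begin to fall below 1 until $p - 1$ is near $360e$. The prime found is accordingly of the order of a thousand, which shows that "sufficiently large" in the proof is far from a small number even in this simplest case.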