

MODERN MATHEMATICS


TITLES AND AUTHORS
I. THE FOUNDATIONS OF GEOMETRY.
By OSWALD VEBLEN, Ph.D., Professor of Mathematics in Princeton University.
II. MODERN PURE GEOMETRY.
By THOMAS F. HOLGATE, Ph.D., LL.D., Professor of
Mathematics in Northwestern University.
III. NON-EUCLIDEAN GEOMETRY.
By FREDERICK S. WOODS, Ph.D., Professor of Mathematics in the Massachusetts Institute of Technology.
IV. THE FUNDAMENTAL PROPOSITIONS OF ALGEBRA.
By EDWARD V. HUNTINGTON, Ph.D., Assistant Professor of Mathematics in Harvard University.
V. THE ALGEBRAIC EQUATION.
By G. A. MILLER, Ph.D., Professor of Mathematics
in the University of Illinois.
VI. THE FUNCTION CONCEPT AND THE FUNDAMENTAL NOTIONS OF THE CALCULUS.
By GILBERT AMES BLISS, Ph.D., Associate Professor
of Mathematics in the University of Chicago.
VII. THE THEORY OF NUMBERS.
By J. W. A. YOUNG, Ph.D., Associate Professor of the
Pedagogy of Mathematics in the University of
Chicago.
VIII. CONSTRUCTIONS WITH RULER AND COMPASSES;
REGULAR POLYGONS.
By L. E. DICKSON, Ph.D., Professor of Mathematics
in the University of Chicago.
IX. THE HISTORY AND TRANSCENDENCE OF π.
By DAVID EUGENE SMITH, Ph.D., LL.D., Professor of
Mathematics in Teachers College, Columbia University.


MONOGRAPHS
ON TOPICS OF
MODERN MATHEMATICS
RELEVANT TO THE ELEMENTARY FIELD
EDITED BY
J. W. A. YOUNG
LONGMANS, GREEN, AND CO.
FOURTH AVENUE & 30TH STREET, NEW YORK
LONDON, BOMBAY, AND CALCUTTA
1911


COPYRIGHT, 1911,
BY
LONGMANS, GREEN, & CO.
THE SCIENTIFIC PRESS
ROBERT DRUMMOND AND COMPANY
BROOKLYN, N. Y.


EDITOR'S PREFACE


THE purpose of this collection of monographs may be indicated by the following citation from the letter that was sent
to those who were requested to act as authors.
"Among the various publications on mathematics that
are being made, it would seem that there is room for a serious
effort to bring within reach of secondary teachers (in service
or in training), college students, and others at a like stage of
mathematical advancement, a scientific treatment of some
of the regions of advanced mathematics that have points of
contact with the elementary field. Undoubtedly one of the
most crying needs of our secondary instruction in mathematics
to-day, is that the scientific attainments of the teachers be
enlarged and their mathematical horizon widened; and I
believe that there is a large body of earnest teachers and students that are eager to extend their mathematical knowledge
if the path can be made plain and feasible for them."
"A volume of monographs dealing with selected topics
of higher mathematics might well be a useful contribution to
the meeting of this need. Such monographs would aim to
bring the reader into touch with some characteristic results
and viewpoints of the topics considered, and to point out their
bearing on elementary mathematics. They would therefore contain:
(1) A considerable body of results proved in full, so that the
reader can materially extend his mathematical acquisitions by
the reading of the monograph alone.


(2) Statement without proof of some leading methods and
results, so as to give a bird's-eye view of the subject.
(3) A small number of references indicating what the reader
may profitably take up after he has mastered the contents of
the monograph."
Both the plan itself, and the invitation to act as author,
were most cordially received; work on the monographs was
promptly begun, has been carried through substantially as
planned, and the results are presented herewith.
The manuscripts have, whenever feasible, been read carefully by at least one collaborator other than myself, and in
consequence various questions and suggestions have been
submitted to the authors and acted upon by them. Each
author, however, retains sole responsibility for his monograph
as it now appears. No attempt has been made to secure
uniformity in style of treatment; each monograph is an independent unit, that can be read without reference to the others.
The amount of technical mathematical knowledge that is
presupposed on the part of the reader varies with the different
subjects. A large part of the book presupposes only knowledge
of elementary geometry and algebra, together with a certain
measure of mathematical maturity. On the other hand,
there is much that will repay careful and detailed study by
advanced students. So far as the subject-matter permits,
the less difficult topics are taken up first in each monograph.
J. W. A. YOUNG.


CONTENTS
I. THE FOUNDATIONS OF GEOMETRY
Introduction - The Assumption of Order - Order on a Line - The
Triangle and the Plane - Regions in a Plane - Congruence of
Point Pairs - Congruence of Angles - Intersections of Circles -
Parallel Lines - Mensuration - Three-Dimensional Space - Conclusion.
II. MODERN PURE GEOMETRY*
Introduction - Simple Elements in Geometry - The Principle of
Duality - Principle of Continuity - Points at Infinity - Fundamental
Theorem - Metric Properties - Anharmonic Ratios - Elementary
Geometric Forms - Correlation of Elementary Forms - Curves and
Sheaves of Rays of the Second Order - Pascal's and Brianchon's
Theorems - Pole and Polar Theory - Conclusion.
III. NON-EUCLIDEAN GEOMETRY
Introduction - Parallel Lines - The Euclidean Assumption - The
Lobachevskian Assumption - The Riemannian Assumption - The Sum
of the Angles of a Triangle - Areas - Non-Euclidean Trigonometry -
Non-Euclidean Analytic Geometry - Representation of the
Lobachevskian Geometry on a Euclidean Plane - Relation between
Projective and Non-Euclidean Geometry - The Element of Arc.
IV. THE FUNDAMENTAL PROPOSITIONS OF ALGEBRA*
Introduction - The Addition of Angles and the Multiplication
of Distances - The Abstract Theory of these Operations -
Geometric Example of the Algebra of Complex Quantities: The
System of Points in the Plane - The Abstract Theory of the
Algebra of Complex Quantities - Appendix: Other Examples
of the Algebra of Complex Quantities - Geometric Proof that
Every Algebraic Equation has a Root.


* A fuller Table of Contents precedes the Monograph itself.


V. THE ALGEBRAIC EQUATION*
General Introduction - Historical Sketch and Definitions -
Equations with One Unknown and with Literal Coefficients -
Equations with One Unknown and with Numerical Coefficients -
Simultaneous Equations - A Few References.
VI. THE FUNCTION CONCEPT AND THE FUNDAMENTAL NOTIONS OF
THE CALCULUS*
Introduction - Variables and Functions - The Fundamental
Notions of the Calculus.
VII. THE THEORY OF NUMBERS*
Introduction - Factors - Diophantine Equations - Congruences -
Binomial Congruences - Quadratic Congruences - Bibliography.
VIII. CONSTRUCTIONS WITH RULER AND COMPASSES; REGULAR
POLYGONS
Introduction - Analytic Criterion for Constructibility - Graphical
Solution of a Quadratic Equation - Domain of Rationality -
Functions Involving no Irrationalities other than Square Root -
Reducible and Irreducible Functions - Fundamental Theorem;
Duplication of the Cube; Trisection of an Angle; Quadrature
of the Circle - Connection between Regular Polygons and Roots
of Unity - De Moivre's Theorem - Regular Pentagon and Decagon -
Regular Polygon of 17 Sides - Construction of the Regular
Polygon of 17 Sides - Gauss's Theory of Regular Polygons -
Primitive Roots of Unity - Gauss's Lemma - Irreducibility of
the Cyclotomic Equation - Proofs of Theorems Cited Earlier -
References.
IX. THE HISTORY AND TRANSCENDENCE OF π
The Nature of the Problem - The History of the Problem -
The Transcendence of e - The Transcendence of π.


I
THE FOUNDATIONS OF GEOMETRY
By OSWALD VEBLEN


CONTENTS
I. INTRODUCTION
II. THE ASSUMPTION OF ORDER
III. ORDER ON A LINE
IV. THE TRIANGLE AND THE PLANE
V. REGIONS IN A PLANE
VI. CONGRUENCE OF POINT PAIRS
VII. CONGRUENCE OF ANGLES
VIII. INTERSECTIONS OF CIRCLES
IX. PARALLEL LINES
X. MENSURATION
XI. THREE-DIMENSIONAL SPACE
XII. CONCLUSION


I


THE FOUNDATIONS OF GEOMETRY*
By OSWALD VEBLEN
I. INTRODUCTION
In connection with the foundations of geometry there arise
many questions of psychology, logic and epistemology. Into
these the present paper does not enter. Instead we propose
to write out the preliminary pages of a geometry such as Euclid
might be imagined to write to-day. The resulting treatment
of geometry as a whole will not be very different from that
actually written by Euclid. We shall, however, go into detail
only with those parts of the subject in which the modern exposition is essentially different from the ancient.

* This essay is based mainly on two articles in the Transactions of the
American Mathematical Society. The first one is by the present writer
(Vol. V [1904], pp. 343-84) and the second by Dr. R. L. Moore (Vol. IX
[1908], pp. 487-512). I have modified my assumptions in accordance with
a suggestion of Dr. Moore's and have also changed the form of his assumptions in some respects. The literature is too large to cite in detail. We
shall be content to mention the names of the following European contributors to the subject: Pasch, Veronese, Peano, Pieri, Schur, Hilbert, Dehn;
and the following works in the English language:
Hilbert (tr. by Townsend), Foundations of Geometry, Chicago.
E. H. Moore, On the Projective Axioms of Geometry. Trans. Am.
Math. Soc., Vol. III (1902), pp. 142-58.
Halsted, Rational Geometry, New York.
Whitehead, The Axioms of Descriptive Geometry, Cambridge, 1907.
Coolidge, Non-Euclidean Geometry, Oxford, 1909.
Schweitzer, A Theory of Geometrical Relations, American Journal of
Mathematics, Vol. XXXI (1909), pp. 365-410.

That there are such differences is not because Euclid's logical
methods and purposes were different from those of the modern
mathematical students of foundations. Euclid overlooked
certain assumptions that entered tacitly into his arguments,
but this was by mistake. His purpose was the same as that
of the moderns, to prove every proposition which he could
prove, and to prove it with a minimum of assumptions.
This required him often to prove statements that were
intuitively evident. Thus an axiom might be a self-evident
truth, but certainly all self-evident truths were not axioms
according to the usage of Euclid.
In geometry a great many technical terms are defined, and
each is defined in terms of other terms. Hence at the beginning of a book on geometry at least one term must be undefined;
otherwise the book would have no beginning. We shall leave
undefined the term point. This implies that the reader is
free to carry in his mind any image of a point which he can
reconcile with what is said about it. We may try to impart
a notion of our image of a point by saying it has no length,
breadth, or thickness, or by like phrases, but these are no part
of our book on geometry; they have nothing to do with the
logical steps by which the theorems are derived.
If the propositions of geometry are arranged in logical order
so that each proposition after a start has been made shall
follow by deduction from its predecessors, it is clear that the
first propositions of all cannot be deduced, because there are
no previous propositions to deduce them from. There must
therefore be assumptions. These may be stated so plausibly
that no one doubts their truth,* but whether they are true or
not cannot affect the correctness of the reasoning based upon
them, nor the fact that they are assumptions. We shall not
enter into the metaphysical question as to whether these
assumptions are self-evident truths, axioms, common notions,
experimental data or what not, but shall try to keep within the
realm of mathematics by using the non-committal word assumptions.

* The writer is inclined to believe that the truth of a statement can be
determined only by testing all its consequences, so that the real test of the
validity of the hypotheses of geometry is in the validity of the theorems.

In addition to the word point, we shall take as undefined
a relation among points which we indicate by saying "the
points ABC are in the order {ABC}." This relation may
mean anything the reader desires, provided it is consistent
with the following statements.
These assumptions were all used implicitly in the older
geometries, as well as in most text-books of to-day, but have
not been formulated explicitly as part of the foundations of
geometry until very recent times.
II. THE ASSUMPTIONS OF ORDER
Assumption I. If points A, B, C are in the order {ABC}
they are distinct.
[FIG. 1.]
Assumption II. If points A, B, C are in the order {ABC}
they are not in the order {BCA}.
Definitions. If A and B are distinct points the line AB
consists of A and B and all points, X, in one of the orders
{ABX}, {AXB}, {XAB}. The points X in the order {AXB}
constitute the linear segment AB, and are said to be between
A and B. A and B are called the ends of the segment. The
segment, together with its ends, constitutes a linear interval.
[FIG. 2.]
Assumption III. If points C and D (C ≠ D)* are in the
line AB, then A is in the line CD.
Assumption IV. If A and B are two distinct points, there
exists a point C such that A, B, and C are in the order {ABC}.

* The notation A ≠ B indicates that A and B are symbols for different
objects.

Assumption V. If three distinct points, A, B, and C do not
lie on the same line and D and E are two points in the orders
{BCD} and {CEA}, then a point F exists
in the order {AFB} and such that D, E and
F lie on the same line.
[FIG. 3.]
Assumption VI. There exist three distinct points, A, B, C, not in any of the orders
{ABC}, {BCA}, {CAB}.
Theorem 1. If points A, B, C are in
the order {ABC} they are in the order
{CBA}, and not in any of the orders {CAB}, {BAC}, {ACB},
{BCA}.
Proof. From the definition of a line, A is on the line BC.
By Assumption I, C and A are distinct. Hence by Assumption
III, B is on the line CA. This means, since B is distinct from
C and A, that there is one of the orders {CAB}, {CBA},
{BCA}. But Assumption II states that {BCA} is impossible,
and if we had {CAB} it would follow by Assumption II that
we did not have {ABC}. Hence {ABC} implies {CBA}
and excludes {BCA} and {CAB}.
By what we have just proved, if we had {BAC}, we should
also have {CAB}. Hence {BAC} is eliminated. Since {ACB}
would imply {BCA}, {ACB} is also excluded.
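
The relation {ABC} is left undefined, so any interpretation that satisfies the assumptions will serve. As an illustrative sketch only, one may read points as numbers and {ABC} as "B lies strictly between A and C". The numerical model, the name in_order, and the sample of integers are assumptions introduced here and are not part of the essay. Under that reading, Assumptions I and II and Theorem 1 can be checked on a finite sample:

```python
def in_order(a, b, c):
    """Hypothetical model of {abc}: b lies strictly between a and c."""
    return a < b < c or c < b < a


def check_order_assumptions(pts):
    """Check Assumptions I, II and Theorem 1 on every triple of the sample."""
    for a in pts:
        for b in pts:
            for c in pts:
                if not in_order(a, b, c):
                    continue
                # Assumption I: points in the order {ABC} are distinct.
                if len({a, b, c}) != 3:
                    return False
                # Assumption II: {ABC} excludes {BCA}.
                if in_order(b, c, a):
                    return False
                # Theorem 1: {ABC} implies {CBA} ...
                if not in_order(c, b, a):
                    return False
                # ... and excludes {CAB}, {BAC}, {ACB}, {BCA}.
                excluded = [(c, a, b), (b, a, c), (a, c, b), (b, c, a)]
                if any(in_order(*t) for t in excluded):
                    return False
    return True


sample = list(range(-3, 4))  # a small sample of integer "points"
print(check_order_assumptions(sample))
```

A check of this kind illustrates the statements in one model; it proves nothing about the abstract system, where only the assumptions themselves may be used.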
Corollary 1. If A and B are distinct points the line AB is
the same as the line BA, and the segment AB is the same as the
segment BA.
Corollary 2. If points A, B, C are in the order {ABC} then
they are all on the lines AB, BC, CA.*
Theorem 2.† For every two distinct points there is one and
only one line containing them.
Proof. Let A and B be two distinct points. By Theorem
1, Corollary 1, the lines AB and BA are identical. Let C be
any point on the line AB distinct from A and let X be
any point of the line AB distinct from C and A. Since C is
on AB it follows by Assumption III that A is on the line CX,
and hence that we have one of the orders {ACX}, {CAX}, {CXA}.
Whichever of these three cases holds it follows by Theorem
1 and the definition of a line that X is on AC. Hence all
points of the line AB are on the line AC.

* The lines AB, BC, CA are proved identical in Theorem 2.
† Cf. Euclid, Postulates 1, 2.

Let Y be any point of AC distinct from B and A. Since
C is on AB, by Assumption III and Corollary 2 of Theorem 1,
B is on AC. Since B and Y are on AC it follows by Assumption
III that A is on BY. Hence {ABY} or {BAY} or {BYA}.
Hence by Theorem 1, Y is on AB.
Thus we have shown that the lines AB and AC are identical.
If D is any point of AB different from C, it follows that the
line CD is identical with the line CA and hence with the line
AB. In other words any line containing C and D is identical
with the line CD.
Corollary. Two distinct lines cannot have more than one
point in common.
Proof. If there were two common points, the line determined by them would be identical with each of the given lines.
Theorem 3. If DE is any line there exists a point F not on
this line.
Proof. If every point were on the line DE then this line
would contain the three points A, B, C mentioned in Assumption VI. By Theorem 2, the line AB would be identical with
DE. Hence the line AB would contain C, contrary to Assumption VI.
Theorem 4. If A and B are any two points there is a point
F in the order {AFB}.
Proof. By Theorem 3, there is a point E not on the line
AB (Fig. 3). By Assumption IV there is a point C in the
order {AEC}. The point C cannot be on the line AB, for if
so this line would also contain E, by Theorem 2. By Assumption IV, there is a point D in the order {BCD}. Hence by
Assumption V there is a point F in the order {AFB}.
We have now the information that a line AB must always
contain at least five points, namely A and B, and at least one
point X1 between A and B (Theorem 4), and at least one X2
and one X3 in each of the orders {ABX2} and {X3AB} (Assumption IV and Theorem 1). The points X1, X2, and X3 are
distinct by Theorem 1.
The theorems proved above are all intuitively obvious
provided the reader of these lines has in mind the same set of
images as the writer. It is necessary to prove them, however, in order to show that our list of assumptions is actually
a characterization of the points and lines which we image to
ourselves. An obvious fact in the figure (Fig. 3), described by
Assumption V, is that the points D, E, F are not only collinear
as stated in that assumption, but are in the order {DEF}.
This we shall now prove as a theorem. The reader will observe
that most of the other assumptions are used in the argument.
Theorem 5. The points D, E, F of Assumption V are in
the order {DEF}.
Proof. Since D, E, F are on the same line, it follows by
Theorem 2 that F is on the line DE. Hence they are in one of
the orders {DEF}, {DFE}, {FDE}.
Suppose they were in the order {DFE}. The points E, C, D
are not on the same line, because if they were, Theorem 2 would
require A, B, C to be on this line. Hence by Assumption V
(Fig. 4) the orders {CEA} and {EFD} would imply that there
is a point X in the order {DXC} and on the line AF. But B
is common to the lines AF and DC. Hence, by the Corollary
of Theorem 2, X=B. Hence we would have the order {DBC}
as well as {BCD}, contrary to Theorem 1.
[FIGS. 4, 5.]
Suppose the points were in the order {FDE}. As before,
the points E, F, A are not on the same line. Hence by
Assumption V the orders {AFB} and {FDE} imply the existence of a point X on the line BD and in the order {EXA}.
But the lines BD and EA have C in common. Hence there
would be the order {ECA} as well as {CEA}, contrary to
Theorem 1.
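
Assumption V and Theorem 5 can be illustrated in a familiar model. The sketch below is an assumption-laden illustration, not the essay's method: it takes points to be pairs of rational coordinates, reads {PQR} as "Q lies strictly inside the segment PR", and uses hypothetical helper names (between, meet). For one triangle it computes the point F where the line DE meets the line AB and checks the orders {AFB} and {DEF}.

```python
from fractions import Fraction as Fr


def between(p, q, r):
    """Coordinate model of {pqr}: q lies strictly inside the segment pr."""
    (px, py), (qx, qy), (rx, ry) = p, q, r
    # q must be collinear with p and r.
    cross = (qx - px) * (ry - py) - (qy - py) * (rx - px)
    if cross != 0:
        return False
    # q = p + t (r - p) with 0 < t < 1, tested without division.
    dot = (qx - px) * (rx - px) + (qy - py) * (ry - py)
    length2 = (rx - px) ** 2 + (ry - py) ** 2
    return 0 < dot < length2


def meet(p, q, r, s):
    """Intersection of line pq with line rs (the lines are assumed to cross)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p, q, r, s
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)


# A triangle ABC, with D in the order {BCD} and E in the order {CEA}.
A = (Fr(0), Fr(0))
B = (Fr(4), Fr(0))
C = (Fr(0), Fr(4))
D = (Fr(-4), Fr(8))   # C is the midpoint of B and D
E = (Fr(0), Fr(2))    # E is the midpoint of C and A

F = meet(D, E, A, B)
print(F, between(A, F, B), between(D, E, F))
```

Exact rational arithmetic is used so that the collinearity test is not disturbed by rounding.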
We need also to prove the following theorem, which is
intuitively quite as obvious as Assumption V. We shall use
the word collinear of a set of points to indicate that they are
all on the same line.
Theorem 6. If A, B, C are non-collinear points and A'
is between B and C, B' between C and A, and C' between A and
B, then A', B', C' are non-collinear.
[FIGS. 6, 7.]
Proof. If A', B', C' were collinear we should have one of
the orders {A'B'C'}, {B'A'C'}, {A'C'B'}. Consider the possibility of {A'B'C'}. The points A', C', B cannot be collinear,
because their line would, by Theorem 2, also have to contain
A and C. Now by Assumption V, the orders {BC'A} and
{C'B'A'} imply the existence of a point X in the order {BXA'}
and on the line AB'. But C is common to the lines A'B
and AB'. Hence X=C and we should have both {BCA'} and
{BA'C}.
The proof that {B'A'C'} and {A'C'B'} are impossible is
similar.
III. ORDER ON A LINE
Theorem 7. If {ABC} and {BCD}, then {ABD}.
Proof. By Theorem 3, and Assumption IV, there exist
points P and O not on the line AB, and in the order {BPO}.
By Assumption V, and Theorem 5, the orders {CBA} and
{BPO} imply the existence of a point Q in the orders {OQC}
and {APQ}. Similarly the orders {BCD} and {CQO} imply
the existence of a point R in the orders {ORB} and {DQR}.
The points A, Q, D are not collinear, for, if so, P would be on
AD. Hence, by Assumption V, the orders {DQR} and {QPA}
imply the existence of a point X in the order {AXD} and on
the line RP. But the lines RP and AD have B in common.
Hence X=B and {ABD}.
[FIG. 8.]
Theorem 8. If {ABC} and {ABD}, C ≠ D, then either
{BCD} or {BDC}.
Proof. In view of Theorem 2, it is necessary only to show
that {CBD} is impossible. By Theorem 3 and Assumption
IV, there exist points O and P not on the line BC and in the
order {OCP}. The orders {OCP} and {CBD} would then
imply the existence of a point Q in the orders (Fig. 9) {DQO}
and {PBQ}. Now A, being on the line BC, is not on the line
CP. Hence the orders {OCP} and {CBA} imply the existence
of a point R in the orders {ARO} and {PBR}. Thus we would
have three non-collinear points, O, A, D, and three points
B, Q, R, B between A and D, Q between D and O, R between
O and A, and B, Q, R would all be on the line BP, contrary to
Theorem 6.
[FIG. 9.]
The following are corollaries of Theorems 7 and 8.
Corollary 1. If {ABC} and {ABD}, C ≠ D, then either
{ACD} or {ADC}.
Proof. By Theorem 8, we have either {BCD} or {BDC}.
If {BCD}, then {DCB} and {CBA} lead by Theorem 7 to
{DCA}. If {BDC} then {CDB} and {DBA} imply {CDA}.
Corollary 2. If {ABD} and {ACD}, B ≠ C, then either
{ABC} or {ACB}.
Proof. {BAC} with {ACD} would by Theorem 7 imply
{BAD}, whereas our hypothesis is {ABD}.
Corollary 3. If {ABC} and {ACD} then {BCD}.
Proof. By Theorem 7 {CDB} would imply with {ACD}
the order {ACB}, contrary to hypothesis. {CBD} with {CBA}
would by Corollary 1 imply either {CDA} or {CAD}, contrary
to hypothesis.
Corollary 4. If {ABC} and {ACD} then {ABD}.
Proof. By Corollary 3 we have {BCD}, which combined
with {ABC} leads by Theorem 7 to {ABD}.
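
The six implications just stated (Theorems 7 and 8 and their four corollaries) are easy to confuse with one another, and a mechanical check in a concrete model may help to keep them apart. The following is a minimal sketch under stated assumptions: points are integers, {ABC} means that B is strictly between A and C, and the function names are hypothetical. It tests every quadruple of a small sample.

```python
from itertools import product


def o(a, b, c):
    """Hypothetical model of {abc}: b lies strictly between a and c."""
    return a < b < c or c < b < a


def implications_hold(pts):
    """Test Theorems 7, 8 and Corollaries 1-4 on all quadruples of pts."""
    for a, b, c, d in product(pts, repeat=4):
        # Theorem 7: {ABC} and {BCD} give {ABD}.
        if o(a, b, c) and o(b, c, d) and not o(a, b, d):
            return False
        if o(a, b, c) and o(a, b, d) and c != d:
            # Theorem 8: either {BCD} or {BDC}.
            if not (o(b, c, d) or o(b, d, c)):
                return False
            # Corollary 1: either {ACD} or {ADC}.
            if not (o(a, c, d) or o(a, d, c)):
                return False
        # Corollary 2: {ABD}, {ACD}, B != C give {ABC} or {ACB}.
        if o(a, b, d) and o(a, c, d) and b != c:
            if not (o(a, b, c) or o(a, c, b)):
                return False
        if o(a, b, c) and o(a, c, d):
            # Corollary 3: {BCD}.  Corollary 4: {ABD}.
            if not (o(b, c, d) and o(a, b, d)):
                return False
    return True


print(implications_hold(range(6)))
```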
These propositions are all preliminary to the following
theorem.
Theorem 9. If A is any point of a line AB, the points of
the line exclusive of A are in two sets such that A is between any
point of the first set and any point of the second set, and is not
between any two points of the same set.
Proof. Let [X] be the set* of all points in the order
{XAB} and let [Y] be the set including B and all points in
the orders {AYB} and {ABY}. By definition, the line comprises no other points than A and [X] and [Y].
A is between any X and any Y. For we have {XAB} and
either {AYB} or {ABY}. In the first case Corollary 3 gives
the conclusion {YAX} and in the second case Theorem 7
yields the same result.

* We let [X] denote a class of objects, the individuals of which are denoted
by X, X1, X2, X", etc.

A is not between two X's, because {X1AB} and {X2AB}
lead by Theorem 8 either to {AX1X2} or {AX2X1}.
A is not between two Y's, for the possible cases are: (a)
{AY1B} and {AY2B}, which by Corollary 2 gives {AY1Y2}
or {AY2Y1}; (b) {AY1B} and {ABY2} which, by Corollary
4 gives {AY1Y2}; (c) {ABY1} and {ABY2}, which by
Corollary 1 gives {AY1Y2} or {AY2Y1}.
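
The construction of the sets [X] and [Y] in this proof uses nothing but the order relation, and it can be carried out literally on a finite sample. The sketch below is illustrative only and rests on assumptions introduced here: points are integers, {ABC} is numerical betweenness, and split_line and separates are hypothetical names.

```python
def in_order(a, b, c):
    """Hypothetical model of {abc}: b lies strictly between a and c."""
    return a < b < c or c < b < a


def split_line(a, b, pts):
    """Build the sets [X] and [Y] of the proof from the order relation alone."""
    xs = [p for p in pts if in_order(p, a, b)]
    ys = [p for p in pts if p == b or in_order(a, p, b) or in_order(a, b, p)]
    return xs, ys


def separates(a, xs, ys):
    """True if a is between every X and every Y, and never between two
    points of the same set."""
    across = all(in_order(x, a, y) for x in xs for y in ys)
    within_x = any(in_order(p, a, q) for p in xs for q in xs)
    within_y = any(in_order(p, a, q) for p in ys for q in ys)
    return across and not within_x and not within_y


sample = list(range(-4, 5))
xs, ys = split_line(0, 2, sample)
print(xs, ys, separates(0, xs, ys))
```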
Definition. The two sets of points in Theorem 9 are called
half-lines or rays; A is called the origin or the end of either
half-line. If AB is any segment the ray of the line AB whose
end is B and which does not contain A is called the prolongation of the segment AB beyond B.
A ray whose end is A and which contains B is designated
as the ray AB.
Corollary 1. If B is a point of a ray whose end is A the
points of the ray exclusive of B are in two sets, the segment AB
and a ray whose end is B.
Proof. The ray is by definition composed of B and the
points X in the order {AXB} and the points Y in the order
{ABY}.
Corollary 2. If C is a point of a segment AB, then the points
of the segment, exclusive of C, are in one or the other, but not both,
of the segments AC and BC.
Proof. A and B are respectively in the two rays α and β,
whose common end is C. The ray α contains by definition all
points {X} in the order {CXA} and β contains similarly all
points {X'} in the order {CX'B}. Hence the segments CA
and CB have no points in common.
The other points {Y} of the ray α are in the order {CAY}.
Since we have {BCA} it follows by Theorem 7 that the points
Y are also in the order {BAY} and hence not on the segment
AB. The ray β also contains points Y' in the order {CBY'}.
These must also be in the order {ABY'} and hence not on the
segment AB. Hence every point on the segment AB, except
C, is on one of the segments AC and CB.
Definition. A set of n (n ≥ 3) points A1, A2, ..., An are in
the order {A1A2...An} if and only if {AiAjAk} wherever
i < j < k (i, j, k = 1, 2, ..., n). Two points A1, A2 are always
in the orders {A1A2} and {A2A1}.
Theorem 10. To any set of n distinct points (n ≥ 2) on a
line can be assigned the notation so that they are in the order
{A1A2...An}. The other points of the line fall into n+1
sets, no two of which have a point in common. These sets are
the segments A1A2, A2A3, ..., An-1An and the rays which are
the prolongations of the segment A1An beyond A1 and An.
Proof. We prove the theorem first for the case n=2.
The first statement of the theorem is in this case part of the
definition. Let A1 be an arbitrary one of the two points, and
let η1 and η2 be the two rays which it determines according
to Theorem 9, η2 being the one which contains A2. The ray
η2 is by Corollary 1 of Theorem 9 composed of the segment
A1A2, the point A2, and another ray η2' with A2 as its end.
This proves the theorem for n=2. We establish it for the
general case by proving that if it is true for n=k then it is
true for n=k+1. Consider k points in the order {A1A2...Ak}.
A point Ak+1 may fall in the ray whose end is A1 or in one
of the segments A1A2, A2A3, ..., Ak-1Ak, or in the ray whose
end is Ak.
If it falls in one of the two rays it separates this into a segment and a ray by Corollary 1 of Theorem 9. If it falls in a
segment it separates this into two segments by Corollary 2
of the same theorem. So in either case we have increased the
number of segments by one and left the number of rays unaltered.
Call the end of one of the rays A1'. Let A2' be the other
end of the segment, one of whose ends is A1'. Let A3' be
the other end of the other segment whose end is A2'. By
a finite number of steps the points A1, ..., Ak, Ak+1 are
exhausted and the notation has been assigned to them in
such a way that A1' and A'k+1 are ends of rays and the segments are A1'A2', ..., A'kA'k+1. Since none of the points
A3', ..., A'k+1 are on the segment A1'A2' or the ray whose end
is A1', we have the order relations {A1'A2'Aj'} (2 < j < k+2).
Similar considerations show that all the order relations exist
which are implied by the symbol {A1'A2'A3'...A'k+1}.
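
The content of Theorem 10 is that n marked points cut the rest of the line into n+1 disjoint sets: two rays and n-1 segments. A short sketch may make the counting concrete. It assumes, purely for illustration, that points are numbers and that the notation A1, ..., An is assigned in increasing order; region_index and regions are hypothetical names.

```python
def region_index(marked, p):
    """Index of the set containing p: 0 is the prolongation beyond A1,
    i (0 < i < n) is the segment between the i-th and (i+1)-th marked
    points, and n is the prolongation beyond An."""
    if p in marked:
        raise ValueError("p must not be one of the marked points")
    # The number of marked points preceding p fixes the set it lies in.
    return sum(1 for m in marked if m < p)


def regions(marked, others):
    """Distribute the points of `others` among the n + 1 sets."""
    sets = [[] for _ in range(len(marked) + 1)]
    for p in others:
        sets[region_index(marked, p)].append(p)
    return sets


marked = [5, 1, 3]                 # three marked points, in any order
print(regions(marked, [0, 2, 4, 6, -1, 2.5]))
```

Inserting one more marked point raises the number of non-empty candidates by exactly one, which is the induction step of the proof.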




Theorem 11. On any segment AB and on either of its prolongations there is an infinitude of points.
Proof. By Theorem 4 there is a point X1 on the segment
AB. By the same theorem there is a point X2 on the segment
AX1. By Theorem 8, Corollary 4, X2 is on AB. In like
manner we obtain points X3, X4, ... on AB. By Assumption I
and Corollary 2 of Theorem 9, these points are all distinct.
By Assumption IV there is a point Y1 on the prolongation
of AB beyond B, a point Y2 on the prolongation of AY1 beyond
Y1, and so on. By Theorem 7 all these points are on the prolongation of AB beyond B.
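
The first half of this proof is a repeated application of Theorem 4, which asserts only that some point exists between two given points. In a numerical model one particular choice is available, namely the midpoint, and with it the sequence X1, X2, ... can be written down. The sketch below makes that choice as an assumption for illustration; the function name is hypothetical.

```python
from fractions import Fraction


def points_on_segment(a, b, count):
    """Return X1, X2, ..., where X1 is between a and b and each later
    point is between a and its predecessor (midpoints are one choice)."""
    a = Fraction(a)
    right = Fraction(b)
    pts = []
    for _ in range(count):
        x = (a + right) / 2   # a point between a and the previous point
        pts.append(x)
        right = x
    return pts


xs = points_on_segment(0, 1, 5)
print(xs)
```

However many points are requested, they are all distinct and all lie between the two ends, which is what the theorem asserts.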
IV. THE TRIANGLE AND THE PLANE
Definition. Three non-collinear points A, B, C, together
with the segments AB, BC and CA, are called a triangle ABC.
The points A, B, C are called the vertices, and the segments
AB, BC, CA are called the sides of the triangle.
We shall now prove a theorem which must be carefully
distinguished from Assumption V.
Theorem 12. If A, B, C are three non-collinear points and
D and F exist in the orders {BCD} and {AFB} then E exists
in the orders {AEC} and {DEF}.
Proof. By Assumption IV there exists a point, O (Fig. 10), in
the order {ABO} which is therefore by Theorem 8, Corollaries 3
and 4, also in the orders {AFO}, {FBO}. Since we also have
{BCD} it follows by Assumption V and Theorem 5, that there
exists a point P in the orders {OCP} and {FPD}. By the same
argument there follows from the orders {AFO} and {FPD} the
existence of a point Q in the orders {OPQ} and {DQA}. The
orders {OPQ} and {OCP} imply by Theorem 8, Corollary 3,
the order {CPQ}. Since A, C, Q, are not collinear (Theorem
2), the orders {AQD} and {QPC} imply (Assumption V) the
existence of a point E on the line DF and in the order {CEA}.
By Assumption V and Theorem 2, the line DE meets the segment AB in F only, and hence by Theorem 5 we have the
order {DEF}.




Assumption V and Theorem 12, may be combined in one
statement which for convenience we list as Theorem 13.
Theorem 13. A line which meets one side of a triangle and
a prolongation of another side meets the third side also.
Definition. If A, B, C are non-collinear, the set of all
points collinear with pairs of points of the triangle ABC is
the plane ABC.
Corollary. The planes ABC, ACB, BCA, etc., are identical.
Theorem 14. If O is any point of a side AB of a triangle
ABC the plane ABC consists of the points of the lines joining
O to points of the triangle ABC.
[FIGS. 10, 11, 12.]
Proof. By definition, all points on the lines joining O
to points of the triangle are in the plane ABC. Hence we must
prove that all points of the plane ABC are on such lines.
It is evident in view of Theorem 13, that all points of the
three lines AB, BC, CA are in the plane and also on lines joining O to points of the triangle. Let X be any point in the plane
ABC, but not on one of the lines AB, BC, CA. It is therefore
collinear with two points M and N of the triangle, which are
not both on the same one of the intervals (see the definition
in Section II) AB, BC, CA.
If one of these points, say M, coincides with A (Fig. 12), N
must be on the segment CB. If X is on either prolongation of
the segment AN the theorem reduces to Theorem 13, applied to
the triangle ANB. If {AXN}, by Theorem 13 applied to the
triangle ANB we have the existence of a point, P, in the orders
{APB} and {CXP}. By the theorems of Section III we have P = O or
{APOB} or {AOPB}. If P = O then the line OX meets the
triangle in C. If {APO} then OX, by Theorem 13, applied to
the triangle APC, meets the segment AC. Similarly, if {OPB},
the line OX meets the segment CB. Exactly the same argument can be repeated if M = B. Thus if any point X is collinear
with A or B and a point of the triangle it is collinear with O and
a point of the triangle.
[FIGS. 13, 14.]
If M is on the segment AB, and N on the interval CB,
X is in one of the orders {MXN}, {MNX}, {XMN}. If
{MXN}, by Theorem 13, applied to the triangle MNB we
have the existence of a point P in the orders {AXP} and
{NPB}. Since X is on the line AP, the line OX meets two points
of the triangle ABC by the paragraph above. If X is in the
order {XMN} consider the triangle CNX and the orders {CNB}
and {NMX}. Hence there exists a point P (Figs. 14, 15) in the
orders {BMP} and {CPX}. Since O is on the segment AB it
either coincides with P or is on the segment PA or is on the
segment PB. In the first case the line OX meets C. In the
second case consider the triangle APC, and in the third case
the triangle BPC. It follows by Theorem 13, in the second
case, that OX meets the segment AC and in the third case
that OX meets the segment CB.




If X is in the order {MNX} consider the triangle AMX
and the orders {AMB} and {MNX}. It follows that there
exists a point P in the orders {APX} and {BNP}. Since
O is on the segment AB it follows by considering the triangle
APB, that the line OX meets the line BC in a point Q. If
Q is on the interval BC it is the required point on the triangle
ABC. If not, by Theorem 13, the line OQ meets the segment
AC in a point R and thus X is collinear with O and the point
R of the triangle.
FIG. 15.    FIG. 16.
There now remains the case where neither M nor N is on
the interval AB. Therefore one of them, say M, is on the
segment AC and the other, N, is on the segment BC. Two
cases arise as X is on the segment MN or not.
FIG. 17.    FIG. 18.
If {MXN} the line AX by Theorem 13 applied to the triangle
MCN meets the segment NC in a point P. Thus X is on the
line AP and the result follows from the statement italicized
above.
If X is in the order {MNX}, consider the triangle AMX
and the orders {AMC} and {MNX}. By Theorem 13, the
segment AX is met by the line CN in a point Q. If Q is on
the interval CB the point X is on a line joining A to a point of
this interval and thus OX meets the triangle by the statement italicized above. If Q is on one of the prolongations of
the segment CB then since O is on the segment AB the line
OX meets the segment AC according to Theorem 13.
If X is in the order {XMN} we replace the triangle AMX
by BNX and proceed as in the paragraph above.
Theorem 15. If A'B'C' are any three non-collinear points
of a plane ABC then the planes ABC and A'B'C' are identical.
Proof. We shall prove first that if A' is a point of the
line AB distinct from B the planes ABC and A'BC are identical.
FIG. 19.    FIG. 20.    FIG. 21.
If A' = A the statement is trivial. If not we have {A'AB}
or {AA'B} or {ABA'}. In the first case let O be any point
of the segment AB; it is therefore on the segment A'B. By
Theorem 14, the plane ABC consists of all points on the lines
joining O to the intervals BC and CA. But this set of lines
is identical by Theorem 13, with the lines joining O to the
intervals BC and CA'. Hence in this case the planes ABC
and A'BC are identical. In case of the order {AA'B}, O is
taken on the segment A'B and the argument is similar. In case
of the order {ABA'} we have just shown that the plane ABC
is identical with AA'C and the latter with A'BC.
FIG. 22.
From this it follows that if C' is any point of
the plane ABC not on the line AB, the plane
ABC is identical with the plane ABC'. For
let O be any point on the segment AB. By
Theorem 14, the line OC' meets one of the
intervals CA and CB. Suppose it meets the
interval CA in a point P. By the paragraph above the plane
ABC is identical with the plane ABP, and this is identical
with PAO, and this with C'AO, and this with C'AB.
Now if A'B'C' are three non-collinear points of the plane
ABC, at least one of them, say C', is not on the line AB. Hence
by the argument above ABC is identical with ABC'. A'
and B' are not both on the line AC'. Let B' be the one which
is not, and we have that ABC' is identical with AB'C'. Since
A' is not on the line B'C' the same argument shows that AB'C'
is identical with A'B'C'. Hence ABC is identical with A'B'C'.
Theorem 16.* A line having two points in common with a
plane lies wholly in the plane.
Proof. Let the two points be taken as A and B in defining
the plane, ABC. The plane contains the line AB.
Corollary. If two planes have two points in common they
have a line in common.
Theorem 17. A line of a plane which contains one and only
one point of a side of a triangle whose vertices are in the plane
contains one other point of the triangle.
Proof. Let the triangle be ABC and let a line l meet the
segment AB in a point O. By Theorem 14, since any other
point of l is in the plane ABC the line l meets the triangle in
a point different from O.
V. REGIONS IN A PLANE
In this section we shall be dealing entirely with the points
of a single plane.
Definition. The set of n-1 intervals A1A2, A2A3, ...,
A_{n-2}A_{n-1}, A_{n-1}An determined by n points A1, A2, ...,
An is called the broken line A1A2A3...An. A1 and An are
called its ends, and it is said to join A1 and An. A single interval
is a special case of a broken line.
A region is a set of points such that (1) any two points of
the set can be joined by a broken line consisting entirely of
points of the set, and (2) any point of the set is on at least two
non-collinear segments consisting entirely of points of the set.
* Cf. Euclid, Definition, I, 7.




The last clause excludes the possibility of a single segment
being a region. A region is said to be convex if the interval
joining any two points of it is composed entirely of points of the
region.
It is evident that the set of all points in a plane is an example of a
convex region. Further cases are developed by the theorems below.
FIG. 23.
Theorem 18. If l is any line passing through a point of a convex
region R in a plane then the points of R not on l constitute two
convex regions R1 and R2 such that any segment joining a point
of R1 to a point of R2 contains a point of l.
Proof. Let O be a point of R on l. By the definition of
a convex region there is a segment intersecting l in O and consisting entirely of points of R. Let A1 and A2 be two points
of this segment in the order {A1OA2}.
Consider the set R1 of points of R which are joined to
A1 by intervals containing no point of l. If X', X" are two
such points, the segment X'X" can contain
no point of l since, if it did, one of the intervals X"A1 and A1X' would contain a point
of l (Theorem 17). Moreover, all points X
of the segment X'X" are in R1 because if l
should meet the segment A1X it would have,
by Theorem 17, to meet A1X'. Hence the
set R1 is a convex region.
FIG. 24.
Consider also the set R2 of points of R
such that the segments joining them to A1
each contain points of l. A2 is evidently a point of R2. If
Y' and Y" are two points of R2 the segment Y'Y" can contain
no point of l, because in that case the line l would meet three
sides of the triangle A1Y'Y", contrary to Theorem 6. Again,
if Y is any point of the segment Y'Y", the segment A1Y contains a point of l by Theorem 17, because the segment A1Y'
does and the interval Y'Y does not. Hence R2 is a convex
region.
Clearly all points of R are in R1, on l, or in R2. Any
segment joining a point X of R1 to a point Y of R2 meets
l. This follows by Theorem 17, because the segment A1Y does
and the interval A1X does not contain a point of l.
Since the set of all points in a plane is a convex region,
we have at once the following
Corollary. Definition.  The points of a plane ABC not
on the line AB constitute two convex regions such that any segment joining a point of one region to a point of the other region
contains a point of AB. These two regions are called the two
sides of the line. Either of them is called a half plane.
Definition. A set of points [X] is said to separate two
other sets of points [Y], [Z] if and only if every broken line
joining a point Y to a point Z contains a point X. A set
[X] is said to decompose a region R into regions R1, ...,
Rn if the points in [X] and R1, ..., Rn comprise all points
of R and each pair of regions R1, ..., Rn is separated by [X]
together with the points of the plane not in R.
Theorem 19. A line containing a point of a convex region
decomposes it into two convex regions.
Proof. This follows from Theorem 18, as soon as we prove
that a broken line joining a point of R1 to a point of R2
meets l. Let A1A2A3...An be a broken line not meeting
l. Then A1 and A2 are on the same side of l by Theorem 18.
In like manner A3 is on the same side as A1 and A2. By repeating this argument we find that An is on the same side as A1.
Hence a broken line joining points on opposite sides meets l.
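The argument of Theorem 19 may be illustrated in the ordinary coordinate plane. This is a sketch only: the coordinate model, the cross-product test for the side of a line, and the names side and first_crossing are assumptions of the illustration and form no part of the axiomatic development.

```python
def side(a, b, p):
    """Sign of the cross product (b - a) x (p - a).

    Returns +1 or -1 according to the side of line ab on which p lies,
    and 0 when p is on the line (coordinate model, assumed here).
    """
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return (cross > 0) - (cross < 0)


def first_crossing(a, b, broken_line):
    """Index i of the first interval P[i]P[i+1] that meets line ab, or None.

    Mirrors the proof: if no interval meets the line, each vertex is on
    the same side as its predecessor, hence on the same side as the first.
    """
    for i in range(len(broken_line) - 1):
        s = side(a, b, broken_line[i])
        t = side(a, b, broken_line[i + 1])
        if s == 0 or t == 0 or s != t:
            return i
    return None
```

With the line through (0, 0) and (1, 0), a broken line whose ends lie on opposite sides always yields an index, while one whose vertices all lie on one side yields None.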
Definition. A point and two distinct rays having it as
their common origin are called an angle. The origin is called
the vertex and the rays are called the sides of the angle. If
the rays are collinear the angle (which is identical with the line)
is called a straight angle or a flat angle.
An angle is denoted by the symbol ∠ab, if a and b denote
the sides. The symbol ∠ABC denotes the angle whose sides
are the rays BA and BC.
Theorem 20. Definition. An angle ∠ABC not a straight
angle decomposes the plane in which it lies into two regions, one
of which is convex. The convex region is called the interior of
the angle and the other region the exterior of the angle. Any
ray with B as origin containing a point of the interior meets the
segment AC and consists entirely of interior points.
Proof. Let D be a point in the order {DBA}. Let [I] be
the set of all points of the plane such that the segment DI contains a point of the ray BC.
Let [O] be the set of all points
such that the interval DO does
not contain B or a point of the
ray BC. Since all points of the
plane ABC are on rays having
D as origin, every point of the
plane is in [I] or [O] or on ∠ABC.
FIG. 25.
Two points, O1, O2, are
joined by the broken line O1DO2,
which by definition contains no
point of ∠ABC. Two points,
I1, I2, are by definition on the
same side of BA with C and
on the opposite side of BC from D. Hence the segment I1I2
does not meet ∠ABC. Moreover if IP is any interval not
meeting ∠ABC, and I is in [I], then P is on the same side of
AB with I and also on the same side of BC with I. The line
BC meets the side DI of the triangle DIP and does not meet
the interval IP. Hence it must meet the segment DP in a
point Q. Since the segment DP is on the same side of the line
AB with C, Q is on the ray BC. Hence P is a point of [I].
Now if IP1P2P3...Pn is a broken line not meeting ∠ABC,
the argument just made shows first that P1 is in [I], then that
P2 is in [I] and so on. Hence every broken line joining a point
of [I] to a point of [O] meets ∠ABC.
The rays which have B as origin and are on the opposite
side of the line AB from C are all composed of O points and
cannot meet the segment AC. The ray BD is composed of O
points and does not meet the segment AC. The other rays
whose origin is B, aside from BA and BC, meet one of the segments DC and CA. Those meeting the segment DC evidently
are composed entirely of O points and those meeting the segment CA of I points. Hence the set of points [I] is composed
of the points on the rays whose origin is B and which meet
the segment AC.
Theorem 21. Definition. A triangle decomposes its plane
into two regions one of which is convex and is called the interior.
The other region is not convex
and is called the exterior. A ray
whose origin is an interior point
meets the triangle in one and only
one point, and the interior consists of all points having this
property.
FIG. 26.
Proof. Let the triangle be
ABC and let [I] be the set of
all points on the segments [AX]
where [X] is the segment BC.
By Theorem 17, any line through
I except the line IX meets the
triangles AXB and AXC each
in one point. But as B and C are on opposite sides of the
line AX these points are one in each of the rays into which
I decomposes the line. Hence every ray through I meets the
triangle ABC once and only once.
A segment I1I2 cannot contain a point of the triangle for
then the ray I1I2 would contain at least two points of the triangle, one on the segment I1I2 and one on its prolongation
beyond I2. Hence [I] is a convex region.
A segment one of whose end points is I must be composed
entirely of I-points if it does not contain a point of the triangle;
for if P is any point of the segment, the single point of the
triangle on the ray IP is by hypothesis on the prolongation
of IP beyond P. From this it follows as in Theorem 19 that
a broken line joining a point I to a point not in [I] contains at
least one point of the triangle.
A point E not in [I] or on the triangle ABC must either be
exterior to ∠BAC or on the prolongation of a segment AX
beyond X. One of the latter points E is joined to a point E1
of the prolongation of AB beyond B by an interval which does
not meet the triangle ABC because its ends are on the opposite
side of BC from A. The prolongation of EE1 beyond E1 is
composed of points in the exterior of ∠BAC. Since any two
points exterior to ∠BAC are connected by a broken line not
meeting ∠BAC, it follows that any two points E are connected
by a broken line. Hence [E] is a region. Since it contains points
on the two prolongations of a segment AX it is not convex.
Hence [I] and [E] satisfy the definitions respectively of the
interior and the exterior of the triangle.
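The characteristic property of interior points in Theorem 21 can be checked numerically in a coordinate model. The following is a sketch under stated assumptions: integer coordinates, exact rational arithmetic, and the hypothetical helper ray_hits, none of which belong to the text.

```python
from fractions import Fraction


def ray_hits(origin, direction, triangle):
    """Distinct points where the ray origin + t*direction (t > 0)
    meets the sides of the triangle.

    Coordinates are assumed to be integers so that Fraction keeps the
    arithmetic exact. A ray lying along a side is not handled in this sketch.
    """
    ox, oy = origin
    dx, dy = direction
    hits = set()
    for i in range(3):
        (px, py), (qx, qy) = triangle[i], triangle[(i + 1) % 3]
        ex, ey = qx - px, qy - py
        denom = dx * ey - dy * ex
        if denom == 0:
            continue  # ray parallel to this side
        # Solve origin + t*direction = p + s*(q - p) for t and s.
        t = Fraction((px - ox) * ey - (py - oy) * ex, denom)
        s = Fraction((px - ox) * dy - (py - oy) * dx, denom)
        if t > 0 and 0 <= s <= 1:
            hits.add((ox + t * dx, oy + t * dy))
    return hits
```

For an interior point every ray returns exactly one point, a vertex being counted once although it lies on two sides; for an exterior point a ray may return no point or two.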
Corollary. Through each exterior point there pass lines
which do not meet the triangle.
Theorem 22. Any ray BD in the interior region of ∠ABC
decomposes it into two convex regions, the interiors of the angles
∠ABD, ∠DBC. Any ray BD in the
exterior region of ∠ABC decomposes it
into two regions at least one of which
is convex. One of these is the interior or
the exterior of ∠ABD and the other the
interior or the exterior of ∠DBC.
FIG. 27.
Proof. To prove the first part of
the theorem we observe that the ray BD
meets the segment AC in a point P.
Hence the points [X] on the rays joining
B to the segment AP are on the opposite side of the line BD from the points
[Y] on the rays joining B to the segment
PC. But the sets [X] and [Y] are the
angles ∠ABD and ∠DBC respectively.
The proof of the second part is analogous to that just made.
The details are left as an exercise for the reader.




Definition. If R is a region and there exists a set of points
[B] not points of R such that every broken line joining a point
of R to a point not in R contains a point B, then [B] is
called the boundary of R.
For example, a line is the boundary of each of the half planes it determines; an angle is the boundary of its interior
and also of its exterior.
Definition. Two rays a, b, are separated by an angle ∠hk
if the four rays a, b, h, k have a common origin and one of
a and b is interior while the other is exterior to ∠hk. A set
of rays having a common origin are said to be in the order
{a1a2a3a4...an} if no two of the rays are separated by any
of the angles ∠a1a2, ∠a2a3, ..., ∠a_{n-1}an, ∠ana1.
FIG. 28.    FIG. 29.
Corollary 1. A set of rays in the order {a1a2...a_{n-1}an} are
also in the orders {a2a3...ana1} and {ana_{n-1}...a2a1}.
Corollary 2. Any two rays, a, b, having a common origin
are in the orders {ab} and {ba}. Any three rays, a, b, c, having
a common origin are in the orders {abc}, {bca}, {cab}, {acb},
{bac}, {cba}.
Theorem 23. To any finite number n > 2 of rays having
a common origin may be assigned notation so that they are in
the order {a1a2a3...an}. They decompose the plane into n
regions of which at most one is not convex.
Proof. The theorem is obvious for n = 2. Hence we can
prove it in general by showing that its truth for n = k implies
its truth for n = k + 1.
To k of the given k+1 rays let us assign notation so that
they are in the order {b1b2...bk}. They decompose the plane
into k regions R1, R2, ..., Rk, whose boundaries are ∠b1b2,
∠b2b3, ..., ∠bkb1. The other ray, b, lies in one of the k
regions determined by b1b2...bk. By Theorem 22 it separates
this region, Ri, into two regions Ri', Ri" of which one at least
is convex if Ri is not convex and both of which are convex if
Ri is convex. Hence the k+1 rays decompose the plane into
k+1 regions R1, R2, ..., Ri', Ri", ..., Rk of which at most one
is not convex.
Suppose that the boundary of Ri is ∠b_i b_{i+1}. Then the
boundaries of the two regions into which Ri is decomposed
are ∠b_i b and ∠b b_{i+1}. Hence
the k+1 rays are in the order
{b1b2...b_i b b_{i+1}...bk}. By
calling the first of these a1,
the second a2, etc., we have
assigned to them the order
{a1a2...a_{k+1}}.
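In a coordinate model the order of Theorem 23 can be produced by sorting the rays according to direction. The sketch below assumes that each ray is given by a nonzero direction vector from the common origin and that the directions are distinct; the use of math.atan2 and the names sectors, is_convex and non_convex_count are assumptions of the illustration, not the method of the proof.

```python
import math


def sectors(directions):
    """Sort ray directions counterclockwise and pair each ray with its successor.

    Each pair bounds one of the n regions into which the rays decompose the plane.
    """
    ordered = sorted(directions, key=lambda d: math.atan2(d[1], d[0]))
    n = len(ordered)
    return [(ordered[i], ordered[(i + 1) % n]) for i in range(n)]


def is_convex(sector):
    """True when the counterclockwise turn from the first ray to the second
    does not exceed a straight angle (distinct rays assumed)."""
    (ux, uy), (vx, vy) = sector
    return ux * vy - uy * vx >= 0


def non_convex_count(directions):
    """Number of non-convex regions; the theorem asserts this is at most one."""
    return sum(1 for s in sectors(directions) if not is_convex(s))
```

Since the turns of the n sectors together make one complete revolution, at most one of them can exceed a straight angle, which agrees with the statement of the theorem.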
FIG. 30.
EXERCISE 1. A set of n distinct coplanar lines meeting in a point
O decompose their plane into 2n convex regions.
EXERCISE 2. Three lines AB, BC, CA not meeting in a point decompose their plane into seven convex regions, one of which is the interior
of the triangle ABC.
EXERCISE 3. A set of n lines in a plane each pair of which intersect,
but no three of which pass through the same point, decompose their
plane into n(n+1)/2 + 1 convex regions.
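The count in Exercise 3 can be checked arithmetically. The sketch assumes the familiar observation that the k-th line, being met by the k-1 earlier lines in distinct points, is divided into k parts, each of which divides one existing region into two; the function names are hypothetical.

```python
def regions_by_recurrence(n):
    """Start with the whole plane and add one region for each of the
    k parts into which the k-th line is divided by the earlier lines."""
    regions = 1
    for k in range(1, n + 1):
        regions += k
    return regions


def regions_by_formula(n):
    """The closed form stated in Exercise 3."""
    return n * (n + 1) // 2 + 1
```

For n = 3 both functions give the seven regions of Exercise 2.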




VI. CONGRUENCE OF POINT PAIRS
We now introduce a new undefined term to express a relation between point pairs. The relation is called congruence.
Denoting pairs of distinct points by (A, B), (C, D), etc., we
write (A, B) is congruent to (C, D). Since this phrase is undefined
the reader may attach to it
any meaning consistent with
the assumptions below. It
is intended, however, to
express the common notion
implied by saying that the
distance from A to B as
measured by a tape-line is
the same as the distance from
C to D.
FIG. 31.
Assumption VII.* If A ≠ B then on any ray whose origin
is C there exists one and only one† point D such that (A, B)
is congruent to (C, D).
Assumption VIII.‡ If (A, B) is congruent to (C, D) and
(C, D) is congruent to (E, F) then
(A, B) is congruent to (E, F).
Assumption IX.§ If (A, B) is congruent to (A', B') and (B, C) is congruent
to (B', C') and {ABC} and {A'B'C'}
then (A, C) is congruent to (A', C').
FIG. 32.
Assumption X. (A, B) is congruent to (B, A).
* Cf. Euclid, Postulate 3.
† Evidently there are two statements here, (1) the existence and (2) the
uniqueness of D.
‡ Cf. Euclid, Common Notion 1.
§ Cf. Euclid, Common Notion 2.
Theorem 24. If {ABC}, and C' is a point on a ray A'B'
such that (A, B) is congruent to (A', B') and (A, C) is congruent
to (A', C') then {A'B'C'}.
Proof. By Assumption VII there is a point C" on the ray
B'C' such that (B, C) is congruent to (B', C"). The point C" is in
the order {A'B'C"}. But by Assumption IX, (A, C) is congruent
to (A', C"). Hence by Assumption VII, C" = C'. Hence {A'B'C'}.
By combining this theorem with Assumption IX we have
the following
Corollary. If (A, B) and (A, C) are congruent respectively
to (A', B') and (A', C'), and if C is on the ray AB and C' on the
ray A'B', then (if B ≠ C) (B, C) is congruent to (B', C').
Theorem 25.* (A, B) is congruent to (A, B).
Proof. By Assumption X, (A, B) is congruent to (B, A)
and (B, A) is congruent to (A, B). Hence by Assumption
VIII, (A, B) is congruent to (A, B).
Theorem 26. If (A, B) is congruent to (C, D) then (C, D)
is congruent to (A, B).
Proof. By Assumption VII there exists on the ray AB
a point B' such that (C, D) is congruent to (A, B'). Hence
by Assumption VIII (A, B) is congruent to (A, B'). Hence
by Theorem 25 and Assumption VII B' = B.
Corollary.† If (A, B) is congruent to (C, D) and also to
(E, F) then (C, D) is congruent to (E, F).
Proof. Since (A, B) is congruent to (C, D), (C, D) is congruent to (A, B). Since also (A, B) is congruent to (E, F)
it follows by Assumption VIII that (C, D) is congruent to
(E, F).
* Cf. Euclid, Common Notion 4.
† Cf. Euclid, Common Notion 1.
The word congruent was taken without definition as a
relation between point-pairs. We now proceed to extend its
significance by means of a definition.
Definition. A set of points [X] is congruent to a set of
points [Y] if (1) every point X corresponds to one point Y
in such a way that whenever (X1, X2) corresponds to (Y1, Y2),
(X1, X2) is congruent to (Y1, Y2) and (2) every point Y is the
correspondent of one point X.
This definition corresponds precisely to the intuitive conception of superposition. If two plane figures are represented
by drawings on sheets of paper it is perfectly clear that a test
for their congruence is to lay one on top of the other. The
superposition with which we have to do in geometry is, however, a kind of intellectual matching of two figures together.
The attention is transferred from one to the other and we try
to see whether corresponding pairs of points are congruent.
It would be perfectly feasible to substitute the word superposable for congruent in the definition above.
Theorem 27.* Any figure is congruent to itself. If a figure
is congruent to a second figure the second figure is congruent to
the first. Two figures congruent to the same figure are congruent
to each other.
Theorem 28. Any point is congruent to any other point,
any line to any other line, any ray to any other ray, any straight
angle to any other straight angle.
Proof. That any point, A, is congruent to any point, B,
is obvious from the wording of the definition.
Let AB and LM be any two rays. Let each point Y of the
ray LM correspond to that point X of the ray AB which is
such that (A, X) is congruent to (L, Y). Thus
every X has a Y corresponding to it. Moreover,
if Y1 and Y2 correspond
to X1 and X2 we have
(L, Y1) and (L, Y2) congruent respectively to (A, X1) and (A, X2) and hence by the
corollary of Theorem 24 (X1, X2) is congruent to (Y1, Y2).
Hence the ray LM is congruent to the ray AB.
FIG. 33.
By applying like reasoning to the rays AB' and LM', which
are the prolongations beyond A and L of the segments BA and
LM   respectively, we have that the straight angle BAB' is
congruent to the straight angle MLM'. Hence the two lines
AB and LM are also congruent.
Theorem 29. If (A, B) is congruent to (C, D) then the segment AB is congruent to the segment CD and the interval AB is
congruent to the interval CD.
* Cf. Euclid Common Notions 1 and 4. By the word, figure, we mean any
set of points.




Proof. Let A correspond to C and B to D and any point
X of the segment AB correspond to that point Y of the ray
CD such that (A, X) is congruent to (C, Y). By Theorem
24, Y is on the segment CD and by the corollary of the same
theorem, (D, Y) is congruent to (B, X). By the corollary of
Theorem 24, if X1X2 correspond to Y1Y2 then (X1, X2) is congruent to (Y1, Y2).
VII. CONGRUENCE OF ANGLES
In order to deal with the congruence of angles and other
figures in a plane, we must introduce an additional assumption.
Assumption XI. If A, B, C are three non-collinear points
and D is a point in the order
{BCD}, and if A'B'C' are three
non-collinear points and D' is a
point in the order {B'C'D'} such
that the point pairs (A, B), (B, C),
(C, A), (B, D) are respectively
congruent to (A', B'), (B', C'),
(C', A'), (B', D'), then (A, D)
is congruent to (A', D').
FIG. 34.
Theorem 30.* Two angles ∠BAC and ∠MON are congruent in such a way that A corresponds to O if there are two
points P and Q on the rays OM and ON such that the point pairs
(A, B), (A, C) and (B, C) are respectively congruent to (O, P),
(O, Q) and (P, Q).
Proof. If the points P and Q exist as stated let A correspond
to O, B to P and C to Q. The ray AC is congruent to the ray
OQ and the ray AB to the ray OP. Hence to prove the angles
congruent we need to show that if X1 is any point of the ray
AC and X2 any point of the ray AB and Y1 and Y2 are the
corresponding points of the rays ON and OM respectively,
then (X1, X2) is congruent to (Y1, Y2).
Let B' and P' be points on the prolongations of BA and PO
beyond A and O respectively such that (A, B') is congruent to
(O, P'). Since (A, C), (C, B), (B, A), (B, B') are congruent
respectively to (O, Q), (Q, P), (P, O), (P, P'), it follows by
Assumption XI that (B', C) is
congruent to (P', Q). Now if
X2 and Y2 are points of the
rays AB and OP respectively
such that (A, X2) is congruent
to (O, Y2), it follows, since
(A, C), (C, B'), (B', A), (B', X2)
are respectively congruent to
(O, Q), (Q, P'), (P', O), (P', Y2),
that (C, X2) is congruent to
(Q, Y2). In similar fashion we can prove that if X1 and Y1
are points of the rays AC and OQ respectively such that (A, X1)
is congruent to (O, Y1), then (X1, X2) is congruent to (Y1, Y2).
FIG. 35.
* Cf. Euclid, I, 8.
Definition. If B' is on the prolongation of the segment
BA beyond A, the angle ∠B'AC is said to be a supplement
of ∠BAC. If C' is a point on the prolongation of the segment
CA beyond A, the angles ∠CAB and ∠C'AB' are said to be
vertical.
Corollary 1. Supplements of congruent angles are congruent.
Corollary 2.* Vertical angles are congruent.
Definition. In a triangle ABC, the sides AB and BC are
said to include ∠ABC. The side AC and angle ∠ABC are
said to be opposite each to the other. The sides AB and BC
are said to be adjacent to each other and to ∠ABC.
Theorem 31.† If the sides of one triangle are congruent
respectively to the sides of another triangle, the triangles are congruent.
Proof. Let the two triangles be ABC and A'B'C' and let
the segments AB, BC, CA be congruent respectively to the
segments A'B', B'C', C'A'. This determines a correspondence
between the two triangles in which by Theorem 30 the angles
at A, B, and C correspond to congruent angles at A', B', C'.
But since ∠ABC is congruent to ∠A'B'C' it follows by
definition that if X and Y are any two points of the segments
BA and BC respectively, and X' and Y' the corresponding
points of the segments B'A' and B'C' respectively, then (X, Y)
is congruent to (X', Y').
Applying the same argument to the angles ∠ACB
and ∠A'C'B' and the angles ∠BAC and ∠B'A'C'
we have that the two triangles are congruent.
FIG. 36.
* Cf. Euclid, I, 15.
† Cf. Euclid, I, 8.
The following theorems are proved similarly and are left
as an exercise for the reader.
Theorem 32.* If two sides and the included angle of one
triangle are congruent to two sides and the included angle of
another triangle the two triangles are congruent.
Theorem 33.† If two sides of a triangle are congruent, the
angles opposite them are congruent.
* Cf. Euclid, I, 4.
† Cf. Euclid, I, 5.
VIII. INTERSECTIONS OF CIRCLES
Definition. If O and X0 are two points of a plane α, then
the set of points [X] of α such that (O, X) is congruent to (O, X0)
is called a circle. O is called its centre and any one of the intervals OX is called a radius.
The two radii on any line
through O constitute a
diameter. The points, except the points [X], on
radii of the circle are said
to be interior to the circle.
The points of α not on radii
are said to be exterior to
the circle.
FIG. 37.
It can be proved that the interior and exterior points constitute two regions into which the plane α is decomposed by
the circle. Another assumption, however, is necessary before
this can be done.
Assumption XII. A circle passing through a point, A,
interior and a point, B, exterior to another circle in the same plane
has in common with the other circle at least one point on each
side of the line AB.
Definition. A triangle is said to be isosceles if two of its sides
are congruent and to be equilateral if all three are congruent.
Theorem 34.* If AB is any segment there exists in any
half plane of which the line AB is the boundary an equilateral
triangle of which AB is a side.
Proof. Let S and T be the two circles in the given plane
of which A and B are centres respectively and the interval AB
is the radius. If B' is the
point in the order {BAB'}
such that (B', A) is congruent to (A, B) and A'
the point in the order
{ABA'} such that (A, B)
is congruent to (B, A'),
the interval B'B is a
diameter of the circle S and
contains all points of the
interior of this circle which are on the line BB'. Hence the
circle T has the point A interior and the point A' exterior to
the circle S. Hence they have in common by Assumption
XII, two points C and C', one on each side of the line AB. The
interval AC is congruent to the interval AB because they are
radii of S and the interval AB is congruent to the interval
BC because they are radii of T.
FIG. 38.
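In a coordinate model the two common points of the circles S and T can be written down directly. This is an illustration only: Assumption XII guarantees the existence of the points, whereas the sketch assumes ordinary Euclidean coordinates and computes them; the name equilateral_apexes is hypothetical.

```python
import math


def equilateral_apexes(a, b):
    """Two points C, C' with |AC| = |BC| = |AB|, one on each side of line AB.

    They are the common points of the circle about A through B and the
    circle about B through A (coordinate model, assumed here).
    """
    (ax, ay), (bx, by) = a, b
    mx, my = (ax + bx) / 2, (ay + by) / 2   # point of AB equidistant from A and B
    px, py = -(by - ay), bx - ax            # AB turned through a right angle
    h = math.sqrt(3) / 2                    # ratio of the height to the side AB
    return (mx + h * px, my + h * py), (mx - h * px, my - h * py)
```

For A = (0, 0) and B = (2, 0) the two points lie on opposite sides of the line AB, each at distance 2 from both A and B.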
* Cf. Euclid, I, 1.
† Cf. Euclid, I, 10.
Theorem 35. Definition.† If AB is any segment there
exists one and but one of its points, O, such that (A, O) is congruent to (O, B). This point is called the mid-point of the
segment or interval AB.
Proof. Using the notation of the last theorem we have
that the segment CC' meets the line AB in a point O because
C and C' are on opposite sides of the line AB. Since the point-pairs (C, B), (C', B) and (C, C') are congruent respectively
to the point-pairs (C, A), (C', A) and (C, C') it follows by
Theorem 30 that ∠BCO is congruent to ∠ACO and hence
that (O, B) is congruent to (O, A).
Suppose now there were another point O' ≠ O such that
AO' were congruent to O'B. O' could not be on a prolongation
of the segment AB; for if it were in the order {ABO'} there
would be two segments O'B and O'A on the ray O'B and congruent to the segment O'B; similarly it could not be in the
order {O'AB}. If O' were on the segment AB it would be
in one of the orders, {AOO'B} and {AO'OB}. In the first
of these cases from the order {AOO'} and the hypotheses that
(A, O) and (A, O') are congruent to (B, O) and (B, O') respectively it follows by Theorem 24 that we should have the order
{BOO'}, contrary to hypothesis. The order {AO'OB} is
proved impossible similarly. Hence there is only one mid-point, O, and it is on the segment AB.
FIG. 39.
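The construction of the mid-point as the meeting point of CC' with AB may be followed in coordinates. The sketch assumes A = (0, 0), B = (2, 0), for which the points C and C' of Theorem 34 are (1, √3) and (1, -√3); the helper line_intersection and the coordinate model are assumptions of the illustration.

```python
import math


def line_intersection(p1, p2, p3, p4):
    """Point common to line p1p2 and line p3p4.

    The lines are assumed not to be parallel; no check is made in this sketch.
    """
    d1 = (p2[0] - p1[0], p2[1] - p1[1])
    d2 = (p4[0] - p3[0], p4[1] - p3[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    # Parameter of the common point along line p1p2.
    t = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


a, b = (0.0, 0.0), (2.0, 0.0)
c, c_prime = (1.0, math.sqrt(3)), (1.0, -math.sqrt(3))
o = line_intersection(a, b, c, c_prime)  # the mid-point O of AB
```

The point obtained is equidistant from A and B, in agreement with the conclusion that (O, B) is congruent to (O, A).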
Theorem 36.* If ABC is a triangle there is no point C' ≠ C
on the same side of the line AB with C such that (A, C) and (B, C)
are congruent to (A, C') and (B, C') respectively.
Proof. By Assumption VII C' cannot be on either of the
lines AC or BC.  If C' exists and is not on these lines we
distinguish two cases according as the line CC' does or does
not meet the line AB.
In case the line CC' does meet the line AB in a point, P,
the point-pairs (B, C), (C, A), (A, B), (A, P) are congruent
respectively to (B, C'), (C', A), (A, B), (A, P) and hence by
Theorem 30 (C, P) is congruent to (C', P). This result is,
however, contrary to Assumption VII.


* Cf. Euclid, I, 7.




In case the line CC' does not meet the line AB, A and B
are on the same side of the line CC'. Let O be the mid-point
of the segment CC'; let P be a point in the order {AOP} and
interior to ∠CBC'; then by the results of V the segment
PB meets the segment CC' in a point Q.
FIG. 40.    FIG. 41.
Since (O, C), (C, A), (A, O), (A, P) are congruent respectively to (O, C'), (C', A), (A, O), (A, P) it follows by Assumption
XI that (C, P) is congruent to (C', P). Hence ∠CBP is congruent to ∠C'BP and hence (C, Q) is congruent to (C', Q).
Hence Q is a mid-point of the segment CC' as well as O, contrary
to Theorem 35.
Corollary. If ∠ABC is congruent to ∠ABC' in such a
way that A corresponds to itself and C' is on the same side of the
line AB with C then the rays BC and BC' are identical.
Theorem 37. If ABC and A'B'C' are any two planes they
are congruent in such a way that B corresponds to B', the ray
BC to the ray B'C' and the half plane containing A and bounded
by BC to the half plane containing A' and bounded by B'C'.
FIG. 42.
Proof. Let P be a point (Theorem 34) in the first half
plane such that (P, B), (B, C), (C, P) are congruent (Theorem
34) and let C" be a point in the ray B'C' such that (B, C) is
congruent to (B', C") and let P' be a point in the second half




plane such that (P', B'), (B', C"), (C", P') are congruent. Then by
Theorem 31 the triangles PBC and P'B'C" are congruent and
this determines a correspondence between the points of the
two triangles. A correspondence between the two planes may
be determined as follows:
Let O be the mid-point of the segment BC and O' the mid-point of the segment B'C". If X is any point in the first plane
the line OX meets the broken line BPC in a point, S. Let
S' be the point congruent to S in the triangle B'P'C" and let
X' be a point of the line O'S' such that (O', X') is congruent
to (O, X) and on the same side of O' with S' or not, according
as X is on the same side of O with S or not.
To complete the proof of the theorem it is necessary to
show that if X1 and X2 correspond in this way to X1' and X2'
then (X1, X2) is congruent to (X1', X2'). This is obvious if
X1 and X2 are on the same line through O. If they are on
different lines, let S1 and S2 be the points in which the lines
OX1 and OX2 meet the broken line BPC and let S1' and S2'
be the corresponding points on the broken line B'P'C". By
Theorem 31, (O, S1), (O, S2) and (S1, S2) are congruent respectively to (O', S1'), (O', S2'), and (S1', S2'). Hence by Theorem
30, ∠S1OS2 is congruent to ∠S1'O'S2' and (X1, X2) is congruent to (X1', X2').
With the aid of this theorem any plane figure may be superposed upon any other to determine whether or not they are
congruent.
As an obvious corollary of the proof of this theorem we
have:
Corollary. If ∠ABC is congruent to ∠A'B'C' in such a way
that B corresponds to B' and D is a point in the interior of ∠ABC
and B'D' a ray such that ∠ABD is congruent to ∠A'B'D', B
corresponding to B', then the ray B'D' is interior to ∠A'B'C'.
Definition. An angle congruent to either of its supplementary angles in such a way that the vertex of the angle corresponds to itself is called a right angle. The two sides of the
angle are said to be perpendicular, as are also the two lines
containing these rays.




Corollary. The two supplementary angles and the vertical
angle of a right angle are right angles.
Theorem 38. If P is any point and AB any line there is
one and only one line through P and perpendicular to AB in any
plane containing A, B and P.
Proof. Suppose first that P is on the line AB. Let M
and N be on opposite sides of P and such that (M, P) is congruent to (P, N) and let C be the third vertex of an equilateral
triangle of which MN is a side. Since (C, P), (C, N), (N, P)
are congruent respectively to (C, P), (C, M), (M, P), ∠CPM
is congruent to ∠CPN and hence the line CP is perpendicular
to the line AB.
FIG. 43.
If some other line DP were also perpendicular, D would be
on the same side of the line CP with M or with N. As the two
cases are treated alike, suppose that D is on the same
side with N. Then the segment MD meets the line CP
in a point E. Let E' be the point on the opposite
side of P from E such that (P, E) is congruent to (P, E').
Also let D' be the point on the ray ME' such that (M, D)
is congruent to (M, D'). Since we have the order {MED} we
also have the order {ME'D'}
(Theorem 24) and hence the segment DD' meets the line MN
in a point Q different from P. By the last corollary and
Theorem 30, (M, E) is congruent to (M, E'); hence (P, D)
is congruent to (P, D'); hence (D, Q) is congruent to (D', Q)
and (D, N) is congruent to (D', N). Since the line DP is assumed
perpendicular to the line MN, (D, N) is congruent to (D, M)
and it follows that (D, M), (D, N), (D', N), (D', M) are all
congruent. Hence we have (D', D), (D', N) and (D, N)
congruent to (D, D'), (D, M) and (D', M) respectively and
thus have ∠ND'Q congruent to ∠MDQ. Hence (M, Q)




is congruent to (N, Q) although Q ≠ P. This contradicts
Theorem 35.
Thus we have shown not only that the line PC is the only
perpendicular to the line MN at P but also that if D is any
point off the line CP, (D, M) is not congruent to (D, N).
Now if P is not on the line AB, let P' be a point on
the opposite side of the line AB (Theorem 37), such that
∠P'AB is congruent to ∠PAB and (P, A) is congruent to (P', A). The line
PP' is easily seen to be perpendicular to the line AB.
FIG. 44.
Corollary 1. Definition. If A and B are any two points,
in any plane containing A and B the line through the mid-point
of the segment AB perpendicular to the line AB contains all
points P of the plane such that (P, A) is congruent to (P, B).
This line is called the perpendicular bisector of (A, B).
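Corollary 1 may be illustrated numerically. What follows is a minimal sketch only, and it rests on an assumption not made in the text: that the plane is the ordinary coordinate plane of number-pairs, with congruence of point-pairs judged by the Pythagorean distance. The names dist and bisector_point are our own and are purely illustrative.

```python
import math

def dist(p, q):
    """Euclidean distance between two points given as (x, y) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def bisector_point(a, b, t):
    """A point of the perpendicular bisector of (a, b), at parameter t."""
    mx, my = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2   # mid-point of AB
    dx, dy = b[0] - a[0], b[1] - a[1]               # direction of AB
    return (mx - t * dy, my + t * dx)               # (-dy, dx) is perpendicular to AB

A, B = (0.0, 0.0), (4.0, 2.0)
on_bisector = [bisector_point(A, B, t) for t in (-2.0, -0.5, 0.0, 1.0, 3.0)]

# Every sampled point P of the bisector has (P, A) congruent to (P, B).
all_equidistant = all(math.isclose(dist(p, A), dist(p, B)) for p in on_bisector)

# A point off the bisector is not equidistant from A and B.
D = (1.0, 0.0)
d_equidistant = math.isclose(dist(D, A), dist(D, B))

print(all_equidistant, d_equidistant)
```

A finite sample of points proves nothing, of course; the sketch merely exhibits the property which the corollary asserts for every point of the line.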
Corollary 2.* All right angles are congruent to one another.
Theorem 39. A set of three non-collinear points cannot be
congruent to a set of three collinear points.
FIG. 45.
Proof. Let A, B, C be any three non-collinear points and
P, Q, R three collinear points.
If (A, B), (B, C), (C, A) were
congruent respectively to (P, Q),
(Q, R), (R, P) then by the theorems
of § VI there would be a point D on
the line CA such that (A, B) and
(C, B) were congruent to (A, D)
and (C, D) respectively. Let M be
the mid-point of (B, D). Then
since (A, B) and (A, D) are congruent, the line AM is perpendicular to the line BD and since (B, C) and (D, C) are


* Cf. Euclid, Postulate 4.




congruent C must be on the line AM (Theorem 38), contrary
to the hypothesis that A, B and C are non-collinear.
Theorem 39 makes it possible to prove the converses of the
theorems on angles and triangles in the last section. In that
section we proved that two triangles were congruent if the
three sides of one were congruent to the three sides of the other.
We were not, however, prepared to say that if two triangles
were congruent the vertices of one corresponded to the vertices
of the other. It might have happened that the sides AB,
BC of one triangle corresponded to the side PQ of another,
and that the third side, CA, corresponded to the broken line
QRP. This possibility is excluded by Theorem 39. We now
easily see that if the triangle ABC is congruent to the triangle
PQR, the point-pairs (A, B), (B, C), (C, A) are congruent
to the three point-pairs (P, Q), (Q, R), (R, P). We shall take
these converse theorems for granted without further proof.
It is also evident that Theorem 24, and the corollary of
Theorem 37, may be generalized to read: If two planar figures
are congruent, the points and lines of one figure are in the same
order relations as are the corresponding points and lines of the other
figure.
Theorem 40. A line, l, containing a point interior to a circle
meets the circle in two and only two points.
FIG. 46.
Proof. Let O be the centre of the circle and I the given
interior point and J the point of
the line l such that the line OJ
is perpendicular to it. Let Q
be on the opposite side of J
from O and such that (O, J) is
congruent to (Q, J). We shall
prove first that J is interior to the
circle. If I = J this is evident.
If not, let I' be the point of
the ray OJ such that (O, I') is
congruent to (O, I). Let J' be
a point of the ray OI such that
OJ is congruent to OJ'. If I' were in the order {OI'J} then J'




would be in the order {OIJ'}. In this case let K be the mid-point of the segment JJ'. It follows that ∠J'KO is a right
angle. Now let J'S be a ray on the same side of the line JJ'
with O making a right angle with JJ'. The ray J'S cannot
meet the interval OK because this would imply two perpendiculars to JJ' from the point of intersection. Hence the ray
J'I is interior to the right angle ∠JJ'S. On the other hand
the ray JJ' is exterior to the right angle ∠OJI. But ∠JJ'I
is congruent to ∠OJJ', and thus we have a contradiction with
the corollary of Theorem 37. Hence we have established the
order {OJI'}.
FIG. 47.
Since the point of the circle on the ray OI is on a prolongation of the segment OI the point C of the circle on the ray OJ
is on the prolongation of the segment OI'. Hence J is an
interior point.
The circle with Q as a centre and with a radius congruent
to the interval OC has a point D on the prolongation of OQ
beyond Q which must be in the order {OCD}. Hence D is outside the first circle. If C' and D' are the points of the first and
second circles respectively on the prolongation of QC beyond C
then it follows by Theorem 24 that we must have {QD'C'}.
Hence D' is interior to the first circle. Hence, by Assumption XII, the two circles
have two points U, V, in common. Since (O, U) and (O, V)
are congruent to (Q, U) and (Q, V) respectively, it follows by
Theorem 38, Corollary 1, that U and V are on the line l. Since
any point W common to the line l and the first circle would
be such that OW is congruent to QW and hence be on the second



circle there are, by Theorem 36, only two points common to
the line and the first circle.
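The construction used in this proof can be followed in coordinates. The sketch below is an illustration only, under the assumption (foreign to the text) that the plane is the ordinary coordinate plane with Pythagorean distance; the function names are hypothetical. It finds the foot J of the perpendicular from O to l, the point Q on the opposite side of J with (O, J) congruent to (Q, J), and the two points U, V of the line on the circle, and the test values confirm that U and V lie also on the circle of the same radius about Q.

```python
import math

def dist(p, q):
    """Euclidean distance between two points given as (x, y) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def foot_and_reflection(o, p, d):
    """Foot J of the perpendicular from o to the line through p with
    direction d, and the reflection Q of o in that line."""
    t = ((o[0] - p[0]) * d[0] + (o[1] - p[1]) * d[1]) / (d[0] ** 2 + d[1] ** 2)
    j = (p[0] + t * d[0], p[1] + t * d[1])
    q = (2 * j[0] - o[0], 2 * j[1] - o[1])
    return j, q

def line_circle_points(o, r, p, d):
    """The two points common to the circle (centre o, radius r) and the
    line through p with direction d, when the line has an interior point."""
    j, _ = foot_and_reflection(o, p, d)
    oj = dist(o, j)
    if oj >= r:
        raise ValueError("the line contains no point interior to the circle")
    h = math.sqrt(r * r - oj * oj)          # half the chord, by Pythagoras
    n = math.hypot(d[0], d[1])
    ux, uy = d[0] / n, d[1] / n             # unit vector along the line
    return (j[0] + h * ux, j[1] + h * uy), (j[0] - h * ux, j[1] - h * uy)

O, r = (0.0, 0.0), 5.0
P, d = (1.0, 3.0), (1.0, 0.0)               # the line y = 3
J, Q = foot_and_reflection(O, P, d)
U, V = line_circle_points(O, r, P, d)
print(J, Q, U, V)
```

The square root used here presupposes the real number system; the text, by contrast, obtains U and V from Assumption XII without any appeal to numbers.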
With the aid of this theorem there is no difficulty in proving that a circle decomposes its plane into two regions, the
interior and the exterior. It is also easy to derive the earlier
propositions of the first book of Euclid's Elements. In most
cases Euclid's own proof can be used.
It will be an excellent exercise for the reader to work through
those of the first twenty-eight propositions which we have not
already taken up. It will be necessary for him to define certain terms such as addition of segments and angles which are
not explicitly defined by Euclid. The logical bases for these
definitions will be found in our development of the elementary
theory of order and congruence. The propositions in question
are as follows:
2. To construct* at a given point a segment congruent to
a given segment.
3. Given two non-congruent segments, to cut off from the
greater a segment congruent to the less.
6. If in a triangle the two angles be congruent to one another,
the sides which subtend the congruent angles will also be congruent to one another.
9. To bisect a given angle.
11. To draw a line at right angles to a given line from a
given point on it.
12. To a given line from a given point not on it, to draw a
perpendicular line.
13. If a ray set up on a line make angles it will either make
two right angles or angles equal to two right angles.
14. If with any line two rays on opposite sides of it make
the adjacent angles equal to two right angles the two rays will
be in the same line with one another.
16. In any triangle, if one of the sides be produced the
exterior angle is greater than either of the interior and opposite
angles.


* By means of a compass which closes when lifted from the plane.




17. In any triangle two angles taken together in any manner
are less than two right angles.
18. In any triangle the greater side subtends the greater angle.
19. In any triangle the greater angle is subtended by the
greater side.
20. In any triangle two sides taken together in any manner
are greater than the remaining one.
21. If on one of the sides of a triangle, from its ends there
be constructed two lines meeting within the triangle, the segments so constructed will be less than the remaining two sides
of the triangle, but will contain a greater angle.
22. Out of three intervals congruent to three given intervals
to construct a triangle: thus it is necessary that two of the
intervals taken together in any manner should be greater than
the remaining one.
23. On a given line at a given point to construct an angle
congruent to a given angle.
24. If two triangles have the two sides congruent to two
sides respectively but have one of the angles contained by the
congruent sides greater than the other they will also have the
base greater than the base.
25. If two triangles have the two sides congruent to two
sides respectively, but have the base greater than the base,
they will also have the one of the angles contained by the
sides greater than the other.
26. If two triangles have the two angles congruent to two
angles respectively and one side congruent to one side, namely,
either the side adjacent to the congruent angles, or that subtending one of the congruent angles, they will also have the
remaining sides congruent to the remaining sides and the remaining angle to the remaining angle.
27. If a line falling on two lines make the alternate angles
congruent to one another, the lines will not meet.
28. If a line falling on two lines make the exterior angle
congruent to the interior and opposite angle on the same side,
or make the sum of the interior angles on the same side two
right angles, the lines will not meet.




IX. PARALLEL LINES *
The next assumption which we shall set down is the justly
famous assumption of Euclid about parallel lines. It has been
stated in many different forms, of which the following is perhaps
the simplest.
Assumption XIII. If A is any point and a any line not passing through A, there is not more
than one line through A coplanar with a and not meeting a.
FIG. 48.
That there is at least one line through A, coplanar with a
and not meeting it is easily seen by dropping a perpendicular
AB from A to a, and observing that the perpendicular, b, to
the line AB at the point A could not meet a without contradicting Theorem 38. The same result follows directly from Euclid,
I, 27 or I, 28.
The assumption of parallels was stated by Euclid in his
Postulate 5 as follows:
"If a straight line falling on two straight lines makes the
interior angles on the same side less than two right angles,
the two straight lines, if produced indefinitely, meet on that
side on which are the angles less
than the two right angles."
FIG. 49.
This is a consequence of our
assumption, for let the rays AC and
BD be such that the sum of the
angles ∠CAB and ∠ABD is less
than two right angles. Let AE be
the ray such that the sum of ∠EAB
and ∠ABD is two right angles.
Then by Euclid, I, 28, the line AE does not meet the line BD.
Hence, by Assumption XIII, the line AC does meet the line BD. Since the sum of the


* From this point forward the essay is a mere outline, intended to suggest
how the rest of the subject may be developed.




angles ∠CAB and ∠ABD is less than two right angles and the
sum of ∠EAB and ∠ABD is equal to two right angles, it follows
that the ray AC falls within ∠EAB. Hence the ray AC is on
the same side of the line AE with the line BD. It is also on the
same side of the line AB with the ray BD. Therefore
the point of intersection of the line AC with the line BD is on
the rays AC and BD.
For a further discussion of the theory of parallels the reader
may consult Euclid's Elements, and the memoir in this collection by Professor Woods.
X. MENSURATION
Defining the sum of two segments and a multiple of a segment (or point-pair) and the terms equality and inequality
of segments in the obvious way, it is easy to prove first that if
A and B are any two points and n is any whole number, there
is a point C on the line AB such that
n(A, B) = (A, C),
and second that there is a point D such that
n(A, D) = (A, B).
From this it follows that if m and n are any whole numbers
there exists a point E such that
m(A, B) = n(A, E).
Thus, with an extension of our definition, we have that
(m/n)(A, B) = (A, E).
Calling m/n the ratio of (A, E) to (A, B) this states that there is
a point-pair having to (A, B) the same ratio as that of any
two whole numbers. Two such segments are said to be commensurable.
It is not hard to show that there are segments which are
not commensurable and there is thus propounded the problem
of extending the notion of ratio to incommensurable segments.




Euclid's method of doing this is a purely geometrical one, and
similar methods have been preferred by nearly all the great
geometers, the latest notable example being the Algebra of
Segments of Hilbert.
The method, however, which is more or less approximated
to in elementary teaching, is that of defining the ratio of two
incommensurable segments as an irrational number. The
theory of irrational numbers is taken for granted from arithmetic
and algebra.
The following proposition, known as the Postulate of Archimedes, is fundamental in this method.
Assumption XIV. If A, B, C are three points in the order
{ABC} and B1, B2, B3,... are points in the order {ABB1},
{AB1B2},... such that (A, B) is congruent to each of the point-pairs (B, B1), (B1, B2),... then there are not more than a
finite number of the points B1, B2,... between A and C.
FIG. 50.
In other words, by laying off the segment AB a finite number of times in the way indicated a point is reached which is
beyond C; that is to say, there exists a number n such that
(A, C) < n(A, B).
Another phrasing of this assumption would be: there exists no
infinitely great interval (A, C).
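The content of the assumption may be illustrated by a short computation. This is a sketch under an assumption not yet available at this point of the text, namely that the two intervals have been given lengths expressed as positive rational numbers; the function name archimedean_multiple is our own. It lays off (A, B) repeatedly until the point reached is beyond C, and returns the number n of segments laid off.

```python
from fractions import Fraction

def archimedean_multiple(ab, ac):
    """Smallest whole number n with ac < n * ab, for positive lengths ab, ac."""
    if ab <= 0 or ac <= 0:
        raise ValueError("lengths must be positive")
    n = 1
    while n * ab <= ac:      # the n-th point laid off is not yet beyond C
        n += 1
    return n

AB = Fraction(3, 7)          # illustrative length of (A, B)
AC = Fraction(10)            # illustrative length of (A, C)
n = archimedean_multiple(AB, AC)
print(n)                     # prints 24, since 23 * 3/7 <= 10 < 24 * 3/7
```

That the loop terminates for every pair of positive rational lengths is precisely the Archimedean property of the rational numbers; for segments in general it is what Assumption XIV postulates.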
A direct consequence of Assumption XIV is that if D is
any point of the ray AB there exists a number n such that
(1/n)(A, B) < (A, D),
for, if not, there would exist no number n such that
(A, B) < n(A, D).
This may be expressed by saying that there is no infinitely
small interval.




Let A0 and A1 be any two points and let us denote by
A_{m/n} that point of the ray A0A1 which is such that the ratio
of the segment A0A_{m/n} to A0A1 is m/n. If B is any point of
the ray such that (A0, B) is incommensurable with (A0, A1),
the points [A_{m/n}] fall into two classes, those on the segment
A0B, which we may call [A_x] and those on its prolongation
which we may call [A_y]. The numbers, [x], associated with
points in the first class, are all less than the numbers [y]
associated with points in the second class. With the aid of
Assumption XIV it can be proved that B is the only point
which is between every A_x and every A_y.
FIG. 51.
By Dedekind's principle* of definition of the irrational
numbers there exists a unique irrational number, b, greater
than every x and less than every y. This number, b, we define
to be the ratio of the segments A0B and A0A1.
Since any segment whatever is congruent to one of the segments A0A1 which have A0 as one end, we have now established
a scale of magnitudes for the comparison of segments and are
in a position to develop a complete theory of proportion.
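The separation of the rational points into two classes, on which the definition of the ratio b rests, may be seen in a particular case. The sketch below assumes, purely for illustration, a point B for which the ratio of A0B to A0A1 is the square root of 2; this choice, and the name bracket_sqrt2, are ours and are not taken from the text. For each denominator n it returns the greatest x = m/n in the first class and the least y = (m + 1)/n in the second, using whole-number arithmetic only.

```python
from fractions import Fraction
from math import isqrt

def bracket_sqrt2(n):
    """Greatest m/n whose square is less than 2, and the least (m + 1)/n
    whose square is greater than 2, for a positive whole number n."""
    if n <= 0:
        raise ValueError("n must be a positive whole number")
    m = isqrt(2 * n * n)     # greatest m with m*m <= 2*n*n; equality cannot occur
    return Fraction(m, n), Fraction(m + 1, n)

for n in (1, 10, 1000):
    x, y = bracket_sqrt2(n)
    print(n, x, y, y - x)
```

As n increases the two classes approach one another without limit, the difference y - x being 1/n, while no rational number falls between every x and every y; the single number so separated is the ratio b.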
The theory of the measure (that is to say, length) of segments depends essentially on showing how to arrange segments
in order of magnitude. In like manner, the theory of the
measure, that is to say, of the area, of regions in the plane depends on showing how to arrange areas in an order of magnitude.
For the purpose of elementary geometry we may confine attention to convex regions. A convex region A may be said to
be less than a convex region B, if it is possible to decompose
A into a finite set of convex regions congruent to a non-overlapping set of convex regions contained in B, and such


* See Monograph IV, Appendix I.




that B contains at least one convex region not in this set.
Two convex regions A and B may then be said to be equivalent if neither is less than the other. In order to give this
definition value it must be proved that two congruent regions
are equivalent. This amounts to proving the following proposition:
It is not possible to decompose two congruent convex
regions R1, R2 into convex regions so that all the regions
into which R1 is decomposed are congruent to a subset of the
regions into which R2 is decomposed.
By associating with an arbitrary square the number 1,
a number, called the area, can now be assigned to each region
in such a way that two equivalent regions have the same area,
and if one region is less than another the less region has the
smaller area. The theory of volumes can be developed similarly.
It has been shown by Hilbert that a theory of the areas of
polygonal regions can be developed independently of Assumption XIV, and by Dehn that a fully corresponding theory for
polyhedral regions does not exist. On this subject the reader
should consult the second edition of Hilbert's Grundlagen
der Geometrie * and the article by Amaldi, "Sulla teoria dell'
equivalenza," in Questioni riguardanti la Geometria Elementare,†
edited by F. Enriques.
XI. THREE-DIMENSIONAL SPACE
Definition. If A, B, C, D are four points not all in the
same plane the set of all points on and interior to the four
triangles ABC, BCD, CDA, ABD, is called a tetrahedron.
The set of all points collinear with pairs of points of a tetrahedron is called a three-space.
By a discussion ‡ analogous to that made in § IV it is
possible to prove that if A'B'C'D' are any four points of a


* Leipzig, 1903.
† Bologna, 1900.
‡ Cf. Transactions of the American Mathematical Society, Volume V,
page 360.




three-space ABCD, then the three-space ABCD is identical with
the three-space A'B'C'D'; that if two points of a line lie in a
given three-space ABCD, then so do
all points of the line; that if three
points of a plane lie in a three-space, so
do all points of the plane; and that if
and only if two planes are in the same
three-space they have a line in common.
FIG. 52.
The notion of a three-dimensional
region can then be defined and studied
analogously to § V. Congruent figures
can be defined as in § VI.
Assumption VI provided for the existence of a plane, but
since nothing has as yet been said about the existence of points
which are not coplanar, we add the following:
Assumption XV. If A, B, C are three non-collinear points,
there exists a point D not in the same plane with A, B and C.
Assumption XVI. Two planes which have one point in common have two points in common.
Assumption XV provides for the existence of at least one
three-space and from Assumption XVI it follows that all points
are in the same three-space.
All the theorems of elementary three-dimensional geometry
can be developed on the basis of these assumptions. But
to do so would be to write a large book.*
* A book giving a complete and rigorous treatment of elementary geometry
would be a most important influence in improving the teaching of the most
ancient and perfect of sciences. Such a book could rarely, if ever, be used
in the classroom, but if it were in the hands of the teachers it would serve
to keep before them in something like its actual form the structure of which
they are trying to give their students a first glimpse.




XII. CONCLUSION
The logically important questions as to the independence
and categoricalness of our assumptions must be passed over
with a reference to the two papers in the Transactions on which
this essay is based. The ideas of consistency, independence
and categoricalness (sufficiency) are explained in the essay by
Professor Huntington in this book, and the independence of
Assumption XIII is established in the essay by Professor Woods.
A reader who is sufficiently interested to pursue the subject
further is strongly urged to go into the question of the independence of the assumptions and to try to discover for himself some
of the examples which constitute the independence proofs.
For convenience in this sort of study we have collected the
assumptions in the following list.
I. If points A, B, C are in the order {ABC} they are distinct.
II. If points A, B, C are in the order {ABC} they are not
in the order {BCA}.
Definition. If A and B are distinct points the segment
AB consists of all points, X, in the order {AXB}; all points
of the segment AB are said to be between A and B; the segment
together with A and B is called the interval AB; the line AB
consists of A and B and all points, X, in one of the orders {ABX},
{AXB}, {XAB}; and the ray AB consists of B and all points,
X in one of the orders {AXB} and {ABX}; A is called the
origin of the ray AB.
III. If points C and D (C ≠ D) are
on the line AB, then A is on the
line CD.
IV. If A and B are two distinct
points, there exists a point C such that
A, B and C are in the order {ABC}.
V. If three distinct points A, B and
C do not lie on the same line and D and
E are two points in the orders {BCD}
and {CEA}, then a point F exists in the order {AFB} and
such that D, E and F lie on the same line.
FIG. 53.




VI. There exist three distinct points, A, B, C, not in any
of the orders {ABC}, {BCA}, {CAB}.
Definition. If A, B, C are three non-collinear points, the
set of all points collinear with pairs of points on the intervals
AB, BC, CA is called the plane ABC. The points X of the
plane such that the interval AX does not contain a point of
the line BC constitute, together with A itself, one side of the
line BC. The other points of the plane, not on the line BC,
constitute the other side of the line BC.
The notation (A, B) denotes a pair of distinct points.
VII. If A ≠ B, then on any ray whose origin is C there
exists one and only one point D such that (A, B) is congruent to (C, D).
VIII. If (A, B) is congruent to (C, D) and (C, D) is congruent to (E, F) then (A, B) is congruent to (E, F).
IX. If (A, B) is congruent to (A', B') and (B, C) is congruent
to (B', C') and {ABC} and {A'B'C'}, then (A, C) is congruent
to (A', C').
X. (A, B) is congruent to (B, A).
XI. If A, B, C are three non-collinear points and D is a
point in the order {BCD}, and if A'B'C' are three
non-collinear points and D' is a point in the order
{B'C'D'} such that the point-pairs (A, B), (B, C),
(C, A), (B, D) are respectively congruent to
(A', B'), (B', C'), (C', A'), (B', D') then (A, D) is
congruent to (A', D').
FIG. 54.
Definition. If O and X0 are two points of a
plane a, then the set of points [X] of a such that
(O, X) is congruent to (O, X0) is called a circle. O is called its
centre and any of the intervals OX is called a radius. The
points, except the points [X], on radii of the circle are said
to be interior to the circle. The points of a not on radii are
said to be exterior to the circle.
XII. A circle passing through a point, A, interior and a
point, B, exterior to another circle in the same plane has in
common with the other circle at least one point on each side
of the line AB.


XIII. If A is any point and a any line not passing through
A, there is not more than one line through A coplanar with a
and not meeting a.
XIV. If A, B, C are three points in the order {ABC} and
B1, B2, B3,... are points in the order {ABB1}, {AB1B2},...
such that (A, B) is congruent to each of the point-pairs (B, B1),
(B1, B2),..., then there are not more than a finite number
of the points B1, B2,... between A and C.
XV. If A, B, C are three non-collinear points, there exists
a point D not in the same plane with A, B and C.
XVI. Two planes which have one point in common have
two points in common.


II
MODERN PURE GEOMETRY
By THOMAS F. HOLGATE


CONTENTS
SECTIONS.
I. INTRODUCTION ..... 1-4
II. SIMPLE ELEMENTS IN GEOMETRY ..... 5
III. THE PRINCIPLE OF DUALITY ..... 6-11
    6, Duality in space;
    7-8, Examples of duality;
    9-11, Duality in a plane.
IV. PRINCIPLE OF CONTINUITY ..... 12-13
    13, Imaginary intersections.
V. POINTS AT INFINITY ..... 14-19
    14-17, Infinitely distant elements;
    18-19, Postulate of parallels.
VI. FUNDAMENTAL THEOREM ..... 20-26
    21, Perspective triangles;
    22, Perspective quadrangles;
    23-26, Harmonic points.
VII. METRIC PROPERTIES ..... 27-29
VIII. ANHARMONIC RATIOS ..... 30-33
    30, Definition;
    31-33, Six anharmonic ratios.
IX. ELEMENTARY GEOMETRIC FORMS ..... 34
X. CORRELATION OF ELEMENTARY FORMS ..... 35-40
    39, Construction of corresponding elements.
XI. CURVES AND SHEAVES OF RAYS OF THE SECOND ORDER ..... 41-48
    48, Classification of conics.
XII. PASCAL'S AND BRIANCHON'S THEOREMS ..... 49-52
XIII. POLE AND POLAR THEORY ..... 53-58
    57, Conjugate points and lines;
    58, Centre and diameters.
XIV. CONCLUSION ..... 59


I. INTRODUCTION
1. In Analytical Geometry conclusions are reached through
the application of algebraic processes to geometric properties
and relations. By making use of certain conventions the given
relations are expressed in algebraic language, then certain algebraic operations are performed and the results are reinterpreted as geometric propositions. During the process the geometric concept may be entirely lost sight of and the resulting
statement may bear no apparent relation to the premises from
which it was derived. In Pure Geometry, on the other hand,
the geometric concept is kept continually in mind throughout
the reasoning process, and the steps by which a conclusion
is reached from given conditions are readily traceable.
2. Pure geometry was cultivated by peoples of the earliest
times. By them many important theorems were discovered
on the relations of triangles and other rectilinear forms, on the
properties of circles and spheres, and on areas, ratios, and the
equality and similarity of geometric figures. The investigations of the ancient geometers were carried so far as to include
the conic sections and certain curves of higher order whose
principal properties were discovered, but the methods used
were fragmentary and the results for the most part were disconnected. The ancient geometry is typified most clearly
by Euclid's Elements, which was in fact a collation and systematic arrangement of the geometric knowledge of his time.


In it properties and relations are demonstrated each by itself,
and little attention is paid to relations common to all forms
of the same class. The method of Euclid has come to be known
as the method of Elementary Geometry, and the subject-matter
of his elements has prescribed the field of elementary geometry.
3. The methods of the ancient geometers were not materially
modified till the period of the revival of learning early in the
sixteenth century, when with the introduction of certain new
concepts, and the application of well-known older ones, as,
for example, infinitely distant elements, the harmonic division
of a line segment, the principle of continuity, and the theory
of imaginary intersections, the science began to take on a more
generalized form. The renewed activity in geometric research
resulted in the invention by Descartes of the analytical geometry, and for two and a half centuries investigations by purely
geometric methods were for the most part pushed aside. Happily interest in pure geometry was revived toward the close
of the eighteenth century through the publications of Monge,
and during the first half of the nineteenth century it reached
its highest development at the hands of Poncelet, Steiner, Von
Staudt, and Chasles.
4. Modern pure geometry differs from the geometry of
earlier times not so much in the subjects dealt with as in the
processes employed and the generality of the results obtained.
Much of the material is old, but by utilizing the principle of
projection and the theory of transversals, facts which were
thought of as in no way related, prove to be simply different
aspects of the same general truth. This generalizing tendency
is the chief characteristic of modern geometry, and while it
may perhaps be attributed largely to the influence of the analytic method, still it is true that some progress had been made
in this direction before the analytic method was invented,
and pure geometry has done much in recent times to enliven
and heighten the interest in analysis.


II. SIMPLE ELEMENTS IN GEOMETRY
5. Points, straight lines, and planes are the simple undefined
elements of pure geometry. Each of these may be thought
of as having an existence independent of the others; a plane
may be thought of without considering the lines and points
which lie in it; we may think of a line without considering
the points which lie on it or the planes which pass through it,
and of a point without considering either the lines or the planes
which pass through it. In fact each of these simple elements
may be the base on which rest an indefinite number of elements
of either of the other kinds.
III. THE PRINCIPLE OF DUALITY
6. Duality in space. Two points will fix the identity of a
straight line and three points will in general determine a plane.
So also two planes intersect in a straight line and three planes
in general have one point in common. If three points lie in a
specialized relative position, namely, in a straight line, then
many planes pass through them. Similarly, if three planes be
in a specialized relative position, namely, with one line in common, then many points lie in all three. But apart from such
special cases the following statements may be made:
a1. Three points determine a plane.
a2. Three planes determine a point.
b1. Two lines which have a common point determine a plane.
b2. Two lines which have a common plane determine a point.
c1. A line and a point determine a plane.
c2. A line and a plane determine a point.
In these statements taken two and two there will be noted
an interchangeable relation between the elements point and
plane, and between line and line. This is spoken of as a dual
relation, and in accordance with it any geometric form will
yield another by replacing every point in one by a plane in the
other, and every line joining two points in one by a line the
intersection of two planes in the other. If in the original
figure three planes meet at a point, in the dual or reciprocal
figure three points will lie in a plane; or if in the original figure
four lines lie in a plane, in the reciprocal four lines will meet
in a point.
7. Examples of duality. A cube consists of eight vertices
(points), six plane faces, and twelve edges each the intersection
of two faces and joining two vertices. Its dual or reciprocal
figure, therefore, consists of eight plane faces, six points (vertices), and twelve edges each joining two vertices and the intersection of two faces. In the original figure the faces meet by
threes in the vertices, and also the edges meet by threes in the
vertices, while four edges lie in each face. In the reciprocal
figure, the vertices must lie by threes in the faces, and also the
edges lie by threes in the faces, while four edges meet in each
vertex. This reciprocal figure is readily seen to be an octahedron.
The cube and the octahedron may thus be spoken of as
dual or reciprocal figures. In the same way it will be seen that
the dual of a tetrahedron is again a tetrahedron, and the dual
of a dodecahedron is an icosahedron.
8. This principle, by which a theorem on points, lines, and
planes may be deduced from another on planes, lines, and points,
by simple interchange, is called the principle of duality. It
was made much use of by Poncelet, but was first announced
as an independent principle by Gergonne (1826), and plays
an important part in modern geometry. Its application is to
purely descriptive properties and not in general to properties
involving measurement.
9. Duality in a plane. If the forms under consideration
are confined to a single plane, that is, if we are dealing only with
plane geometry, the duality is between point and line, since in
plane geometry two points determine a line and two lines
determine a point. To any number of points on a line in one
of two reciprocal plane figures there will correspond in the
other an equal number of lines through a point, and if three or
more lines are concurrent in the one, their reciprocal points
are collinear in the other.
10. As an illustration of reciprocal figures in a plane the
following will serve:
Four points (vertices) A, B, C, D, of which no three are
collinear, determine six lines (sides), namely, the lines joining
the vertices two and two. The lines AB and CD may be called
opposite sides in the figure, and similarly AC and BD are opposite
sides, as are also AD and BC. The pairs of opposite sides
determine three points P, Q, R (the diagonal points), the vertices
of what may be called the diagonal triangle. The figure so
constructed is known as a complete quadrangle.
On the other hand, four lines (sides) a, b, c, d, of which no
three are concurrent, determine six points (vertices), namely,
the intersections of the sides two and two. The points ab and
cd may be called opposite vertices in the figure, and similarly
ac and bd are opposite vertices, as are also ad and bc. The
pairs of opposite vertices determine three lines p, q, r (the diagonals),
the sides of what may be called the diagonal triangle. This
figure is known as a complete quadrilateral.
â / â
FIG. 1.                       FIG. 2.
A complete quadrangle (Fig. 1) thus consists of four vertices, six sides, and three diagonal points; a complete quadrilateral (Fig. 2) consists of four sides, six vertices, and three
diagonals.  Similarly, a complete pentangle (pentagon) has
five vertices and ten sides intersecting by fours in the vertices,
while there may be found fifteen points in which only two sides
intersect. A complete pentalateral on the other hand, has
five sides and ten vertices lying by fours on the sides, while
there may be drawn fifteen lines on which only two vertices lie.
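The counts quoted above for the quadrangle and the pentangle can be checked by simple enumeration. The following is only a sketch, assuming the n vertices are in general position so that every pair of sides not sharing a vertex meets in its own separate point; the function name is hypothetical.

```python
from math import comb

def complete_n_point(n):
    """Counts for a complete n-point whose vertices are in general position."""
    sides = comb(n, 2)                      # one side for each pair of vertices
    sides_per_vertex = n - 1                # each vertex is joined to the others
    pairs_of_sides = comb(sides, 2)         # any two sides of a plane intersect
    pairs_at_vertices = n * comb(n - 1, 2)  # pairs of sides meeting at a vertex
    extra_points = pairs_of_sides - pairs_at_vertices
    return sides, sides_per_vertex, extra_points

for n in (4, 5):
    print(n, complete_n_point(n))
```

For n = 4 this gives six sides, three through each vertex, and three further points (the diagonal points); for n = 5 it gives ten sides, four through each vertex, and fifteen further points. Read dually, the same numbers describe the complete quadrilateral and pentalateral.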
11. As an illustration of how one theorem may be deduced
from another by the principle of duality the following example
will serve. The theorem on the left is well known, having been
stated by Pappus in the fourth century. That on the right
is not so familiar, but follows immediately by interchange
of point and line, or it may be demonstrated independently.


If three points A, C, E be chosen at random on a straight
line p, and three others B, D, F, be chosen at random on
a straight line q, and these be joined in order AB, BC,
CD, DE, EF, FA, by straight lines 1, 2, 3, 4, 5, 6, then the
intersections of 1 and 4, 2 and 5, 3 and 6, lie on a straight
line r.
If three straight lines, a, c, e, be drawn at random through
a point P, and three others b, d, f, be drawn at random
through a point Q, and the intersections of these two and
two in order ab, bc, cd, de, ef, fa, be denoted by 1, 2, 3, 4, 5, 6,
then the lines 14, 25, 36, pass through one point R.
Other examples of duality will occur in the progress of
this chapter.
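The theorem of Pappus and its dual can be tested numerically. The sketch below is not part of the pure-geometric argument: it assumes homogeneous coordinates, in which a point (x, y) is written (x, y, 1), a line ax + by + c = 0 is written (a, b, c), and one and the same cross product gives both the line joining two points and the point common to two lines. The particular points are arbitrary choices.

```python
def cross(u, v):
    """Join of two points, or meet of two lines, in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(u, v, w):
    """Zero exactly when three points are collinear (or three lines concurrent)."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]

# A, C, E on the line y = 0; B, D, F on the line y = x + 1.
A, C, E = (0, 0, 1), (2, 0, 1), (5, 0, 1)
B, D, F = (1, 2, 1), (3, 4, 1), (7, 8, 1)

# The six joining lines 1 ... 6, taken in order.
l1, l2, l3 = cross(A, B), cross(B, C), cross(C, D)
l4, l5, l6 = cross(D, E), cross(E, F), cross(F, A)

# Intersections of 1 and 4, 2 and 5, 3 and 6.
X, Y, Z = cross(l1, l4), cross(l2, l5), cross(l3, l6)
print(det3(X, Y, Z))   # vanishes when the three intersections are collinear
```

Because the same function serves for joins and meets, reading the six triples as lines instead of points turns the computation into a check of the dual theorem.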


IV. PRINCIPLE OF CONTINUITY
12. The principle of continuity, first assumed by Kepler
and later by Desargues, asserts that a property which can be
demonstrated for a particular figure will hold true if the figure
should change its form in any manner subject to the conditions
under which it was first constructed. This principle makes
necessary an enlargement of the significance of many geometric
terms so as to include what are called imaginary elements,
and by the aid of these it permits the statement of general facts
or theorems which otherwise would be subject to exceptions
and limitations. The geometer makes no attempt to construct
imaginary elements, but contents himself with the acceptance
of their existence and of the principle that though by continuous change in a figure a property once proved may become
unmeaning through the loss of real elements, it is still true when
imaginary elements are taken into consideration.
13. Imaginary intersections. As an illustration it may
be stated that a straight line drawn through a fixed point P
intersects a circle in two points. If the point P lies within
the circle, the intersections are always real no matter how far
the line rotates about P. If, however, the point is chosen outside the circle, the line in the first instance may cut the circle
in two real points, but as it rotates about P the intersections
will move so as first to fall together or become coincident,
and after that they will disappear or become imaginary. To
say that the line in this last phase intersects the circle may be
without meaning under the ordinary conventions; yet it is
assumed true, and the imaginary points of intersection play
the same part in any general theorem as do the real points of
intersection of the earlier phases. Thus the theorem that the
product of the segments of a chord or secant of a circle remains
constant while the secant rotates about a point comes to have
an interpretation for all positions of the secant.
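A short computation illustrates how the product of the segments keeps its value after the intersections have become imaginary. This is a sketch under assumptions not made in the text: the circle is centred at the origin, the secant through P is described by its angle, and the two intersections are found as the (possibly complex) roots of a quadratic in the signed distance t measured from P.

```python
import cmath
import math

def secant_parameters(P, theta, r):
    """Signed distances from P to the two intersections with the circle
    x^2 + y^2 = r^2 along the direction theta; complex when the line misses."""
    ux, uy = math.cos(theta), math.sin(theta)
    b = 2 * (P[0] * ux + P[1] * uy)
    c = P[0] ** 2 + P[1] ** 2 - r ** 2
    root = cmath.sqrt(b * b - 4 * c)
    return (-b + root) / 2, (-b - root) / 2

P, r = (3.0, 0.0), 1.0          # P lies outside the circle
products = {}
for degrees in (0, 10, 45, 90):
    t1, t2 = secant_parameters(P, math.radians(degrees), r)
    products[degrees] = t1 * t2
    print(degrees, t1, t2, t1 * t2)
```

At 0 and 10 degrees the line cuts the circle and both distances are real; at 45 and 90 degrees they are conjugate complex numbers, yet their product has the same value in every case.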
V. POINTS AT INFINITY
14. Infinitely distant elements. The introduction into
geometry of the notion of infinitely distant elements has aided
greatly in the process of generalization with which modern
methods are chiefly concerned. Many exceptional cases which
under earlier conditions would require special treatment, by
the addition of this concept are brought into conformity with
a general statement.
15. Infinitely distant elements come most easily into view
from the following considerations.
Suppose a straight line b (Fig. 3) passing through a fixed
point O, intersects the line a in a point P; and suppose the line
b rotates about O as indicated by the arrow. The point of
intersection P will move along the line a to the right until it
is lost to view and then will immediately appear at the far left,
moving along the line in the same sense as before.
FIG. 3.
The assumption is made that the two lines have not at any
time ceased to intersect, and that the point P has moved continuously along the line a,
disappearing at the far right and reappearing at the far left
after passing through but a single position which lies outside
the accessible region of the plane. In other words, it is
assumed that on the line a, and so on any other
straight line of the plane, or for that matter, on any straight
line of the finite region, there is one and only one infinitely
distant point. It is also assumed that this point makes the
line continuous, so that we can pass from any one point of the
line to any other point of the line by moving continuously along
the line either to the right or to the left.*
16. Two points will thus divide a straight line into two
segments on one of which lie only finite points, while on the
other lies the infinitely distant point. The first of these is
sometimes called the internal segment, the second, the external
segment. From this it follows that a point on a straight line
cannot be separated from another point by a single third point.
It requires two points to separate one point from another,
just as on a ring or closed curve.
17. The assumption of a single infinitely distant point on
the line a is equivalent to the assumption that through a point
O there can be drawn one and only one straight line which does
not meet a given line in the finite region, and that these lines
do intersect in an infinitely distant point. This assumption
makes possible the statement that any two straight lines of
a plane intersect somewhere, if not in the finite region, then in
an infinitely distant point.

* The present monograph deals only with the so-called Euclidean
geometry. For the assumptions of non-Euclidean geometry, see Monograph
III.

Definition. Lines which intersect in an infinitely distant
point are called parallel lines.
18. Postulate of parallels. Euclid's twelfth axiom, which
is more properly speaking a postulate, was his starting point
for proving that through a given point one and only one line
can be drawn parallel to a given line. Its assumption is consequently equivalent to that of a single infinitely distant point
on a straight line. Most of the difficulty in the treatment of
parallels which perplexed geometers for centuries was caused
by the failure to recognize that this so-called twelfth axiom
was an assumption and not a self-evident truth.
19. A single infinitely distant point on a straight line, or
what is the same thing, a single line through a given point
parallel to a given line, leads at once to the conclusion that
all the infinitely distant points of a plane lie on one straight
line and that any two parallel lines of a plane intersect in a
point of this line. The following considerations will make
this clear. If a line p should rotate about a point P, every
point of the line would describe a continuous path in the plane,
and this may be assumed also for the infinitely distant point.
The infinitely distant path described by this point contains
all the infinitely distant points of the plane and is such that it
is cut by any straight line in only one point. It is therefore
itself a straight line.
From this it follows that any two parallel planes intersect in
an infinitely distant straight line common to the two.
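These statements about infinitely distant points have a convenient arithmetic picture. The sketch below assumes homogeneous coordinates, which the monograph itself does not use: a line ax + by + c = 0 is stored as (a, b, c), a finite point (x, y) as (x, y, 1), and a point whose third coordinate is zero is read as infinitely distant. The four lines are arbitrary examples.

```python
def cross(u, v):
    """Meet of two lines, or join of two points, in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

line_1 = (1, -2, 3)    # x - 2y + 3 = 0
line_2 = (1, -2, -5)   # x - 2y - 5 = 0, parallel to line_1
line_3 = (3, 1, 0)     # 3x + y = 0
line_4 = (3, 1, 7)     # 3x + y + 7 = 0, parallel to line_3

meet_12 = cross(line_1, line_2)   # third coordinate 0: an infinitely distant point
meet_34 = cross(line_3, line_4)   # a second infinitely distant point
meet_13 = cross(line_1, line_3)   # an ordinary finite intersection

# The line joining the two infinitely distant points.
line_at_infinity = cross(meet_12, meet_34)
print(meet_12, meet_34, meet_13, line_at_infinity)
```

Each pair of parallels meets in a point with third coordinate zero, and the join of two such points is a multiple of (0, 0, 1), the one line that contains every point of that kind.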
VI. FUNDAMENTAL THEOREM
20. In the development of modern geometry there have been
differences among investigators as to the best mode of attack.
Some geometers, Steiner and Chasles, for instance, have preferred to base their fundamental notions on certain metric
properties, while others, notably Von Staudt, and after him
Reye, and in a modified form Cremona, have preferred to use
as starting point a purely positional relation, and thus avoid
the necessity of recognizing measurement as fundamental.
21. Perspective triangles. Following the latter method we
announce as our fundamental fact the theorem on perspective
triangles which was stated by Desargues early in the seventeenth
century, but which was known much earlier, probably by
Euclid.
If two triangles ABC and A1B1C1 are so situated that the lines
AA1, BB1, and CC1 meet in a point S, then the pairs of
corresponding sides c and c1, b and b1, a and a1, intersect in
points of one straight line.
If two triangles abc and a1b1c1 are so situated that the sides
a and a1, b and b1, c and c1, intersect in points of one straight
line s, then the lines joining pairs of corresponding vertices
AA1, BB1, CC1 meet in a point.
The truth of this theorem is evident if the triangles be chosen
in different planes p and p1. For then the lines AA1 and BB1
meeting at S, determine a plane in which AB and A1B1
lie. These lines therefore intersect, and can meet only
on the common line of the planes p and p1. Similarly
AC and A1C1, also BC and B1C1, meet in points of this
same straight line.
FIG. 4.
That the theorem is true also when the triangles lie in
the same plane is seen most
easily by projecting the whole figure, that is the two given
triangles, the lines joining corresponding vertices and meeting
in S, and the line of intersection of the planes p and p1, from
some point O, the eye for instance, thus forming a figure
of ten lines and ten planes intersecting at O. In each plane
will lie three of the lines and through each line will pass
three of the planes. Any plane section of this projection
will yield a figure consisting of two triangles so situated that
the lines joining pairs of corresponding vertices intersect in
one point while the pairs of corresponding sides intersect in
points of one straight line.
The process of projecting a figure from some chosen centre
and then taking a plane section, thus securing a new diagram,
is a favorite one in modern geometry. The properties which
remain unchanged by this process are called projective properties,
and they are found to be numerous. Magnitudes are changed
but, as will be seen later, certain relations among magnitudes
remain unchanged, as do also properties of intersections, contact,
collineation, and the like.
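The theorem on perspective triangles can be confirmed for a particular pair of triangles in one plane. The sketch below assumes homogeneous coordinates (points (x, y, 1), lines (a, b, c), the cross product serving for both joins and meets), and the coordinates chosen are arbitrary; it is a numerical check, not a substitute for the projective proof.

```python
def cross(u, v):
    """Join of two points, or meet of two lines, in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(u, v, w):
    """Zero exactly when three points are collinear (or three lines concurrent)."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]

S = (0, 0, 1)                                   # centre of perspective
A, B, C = (1, 0, 1), (0, 1, 1), (1, 1, 1)       # first triangle
A1, B1, C1 = (3, 0, 1), (0, 2, 1), (4, 4, 1)    # second triangle, on the same rays

# Each pair of corresponding vertices is in line with S.
in_perspective = all(det3(S, V, V1) == 0 for V, V1 in ((A, A1), (B, B1), (C, C1)))

# Intersections of the three pairs of corresponding sides.
X = cross(cross(A, B), cross(A1, B1))
Y = cross(cross(B, C), cross(B1, C1))
Z = cross(cross(C, A), cross(C1, A1))
print(in_perspective, det3(X, Y, Z))   # the determinant vanishes when X, Y, Z are collinear
```

Here the three intersections are the points (-3, 4), (-2, 1), and (1, -8), which lie on one straight line.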
22. Perspective quadrangles. By repeated applications of
the theorem on perspective triangles the following theorem on
complete quadrangles is proved to be true:
"If two complete quadrangles are so situated that five
pairs of corresponding sides intersect in points of one straight
line then the sixth pair will also intersect in a point of that
line."
Remembering that the reciprocal of a complete quadrangle
is a complete quadrilateral made up of four lines and their
six points of intersection, the dual theorem may be stated as
follows:
"If two complete quadrilaterals are so situated that five
of the lines joining pairs of corresponding vertices meet in one
point, then the line joining the sixth pair of corresponding
vertices will also pass through that point."
23. Harmonic points. Let ABCD be any complete quadrangle (Fig. 5) and let PQ be a line joining two diagonal points
while R and S are the points in which the third pair of sides
intersect PQ. Construct any other quadrangle such that one
pair of sides will intersect in P, a second pair will intersect in
Q, and a fifth side will pass through R. This is readily possible
if the two sides through P be drawn at random and likewise
the side through R be drawn at random cutting the two already


66


MODERN MATHEMATICS


drawn at- A' and C', respectively. Then QA' and QC' will
determine the vertices D' and B'.
Now, in these two quadrangles, one of which was drawn
wholly at random, five pairs of sides intersect in points of the
straight line PQ, hence the
sixth pair BD and B'D' must
B$y,^~\  ~also intersect on PQ. In other
'/ 2z)/-?words, if two points P and
Q on a straight line be such
p    at \R i      Q    s    that pairs of sides of a complete quadrangle intersect in
them, while a fifth side passes
B\^^~'   ~     through a third point R of
this line, then the sixth side
will of necessity pass through
FTa 5.            a definite point S determined
by the first three. These four
points on the line are said to be harmonically related, or it may
be said that the line segment PQ is harmonically divided at
R and S, and we thus have a purely positional definition of the
harmonic relation.
Definition. Four points on a straight line are harmonic
when they are so situated that in two of them pairs of opposite
sides of a complete quadrangle may intersect while the remaining
sides pass through the other two.
In the diagram (Fig. 5) it should be noted that not only the
points PQRS, but also the points ATCR fulfil the conditions
specified for " harmonic points," as do also the points DTBS.
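The quadrangle construction of the fourth harmonic point can be carried out numerically, and doing it twice with different auxiliary lines shows that S does not depend on them. The sketch assumes homogeneous coordinates with exact integer arithmetic; the helper points G, H, K, which fix the two sides through P and the side through R, are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction

def cross(u, v):
    """Join of two points, or meet of two lines, in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def harmonic_conjugate(P, Q, R, G, H, K):
    """Point S of the line PQ reached by the quadrangle construction.

    PG and PH are the two sides drawn at random through P, and RK is the
    side drawn at random through R; G, H, K must lie off the line PQ."""
    g, h = cross(P, G), cross(P, H)
    r = cross(R, K)
    A1, C1 = cross(r, g), cross(r, h)          # the vertices A' and C'
    D1 = cross(cross(Q, A1), h)                # QA' fixes D' on the side h
    B1 = cross(cross(Q, C1), g)                # QC' fixes B' on the side g
    return cross(cross(B1, D1), cross(P, Q))   # the sixth side B'D' meets PQ in S

P, Q, R = (0, 0, 1), (4, 0, 1), (1, 0, 1)
S_first = harmonic_conjugate(P, Q, R, (1, 1, 1), (1, 3, 1), (0, 2, 1))
S_second = harmonic_conjugate(P, Q, R, (2, 1, 1), (-1, 1, 1), (3, 5, 1))
x_first = Fraction(S_first[0], S_first[2])
x_second = Fraction(S_second[0], S_second[2])
print(x_first, x_second)   # the two constructions give the same point

# When R is the mid-point of PQ the third coordinate of S is zero.
S_mid = harmonic_conjugate(P, Q, (2, 0, 1), (1, 1, 1), (1, 3, 1), (0, 2, 1))
print(S_mid)
```

With P, Q, R at abscissas 0, 4, 1 both constructions place S at abscissa -2. In the last case S has third coordinate zero, that is, it is the infinitely distant point of the line.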
24. If the intersection of the sides AC and BD be the point
T, and if PT intersect BC and AD in L and N respectively, and
QT intersect AB and CD in K and M respectively, then the
line KL must pass through R since KBLT is a quadrangle of
which one pair of opposite sides intersect in P, a second pair
intersect in Q, while a fifth side passes through S. Similarly,
NM passes through R, while KN and LM both pass through S.
KLMN is thus a complete quadrangle with one pair of opposite
sides intersecting in R, one pair in S, a fifth side passing through
P, and the sixth side through Q. The points R and S in this
modified diagram play exactly the same part as P and Q in the
original diagram, while P and Q in the modified diagram play
the same part as R and S in the original. Thus, if the
segment PQ is harmonically divided at R and S, so also the
segment RS is harmonically divided at P and Q. The
points P and Q are harmonic conjugates with respect to R and
S, and in the same way, R and S are harmonic conjugates
with respect to P and Q.
FIG. 6.
25. It is not difficult to show that the pairs of points P,
Q and R, S must actually separate each other if they form a
harmonic set, and that if P and Q remain fixed while R traverses
the segment from Q to P internally, then S will traverse the
segment from Q to P externally. From this it follows that
if two pairs of points R, S, and R', S' are harmonically separated
by the same points P and Q, then R, S, and R', S' cannot
separate each other. Conversely, it is not difficult to show
that if two pairs of points be chosen on a straight line so as not
to separate each other, then a single pair may be found which
will harmonically separate both pairs.
26. Suppose now the complete quadrangle ABCD with the
harmonic points P, Q, R, S, is projected from some point O
outside the plane, and that a section is taken by a plane cutting
the projection in a new quadrangle A'B'C'D', of which two
sides intersect in a point P', two sides in Q', a fifth side passes
through R', and the sixth side through S'.  Then the points
P'Q'R'S', any section of the rays OP, OQ, OR, OS, are harmonic.
Hence if four harmonic points P, Q, R, S, be projected from a
centre O, any section P'Q'R'S' of the four projecting rays is
a harmonic set of points. The rays OP, OQ, OR, OS, are
themselves also said to be harmonic.
VII. METRIC PROPERTIES
27. Thus far the harmonic relation of points on a line has
been discussed from a purely positional standpoint and no
question of measurement has been considered. To introduce
magnitudes into our discussion let us assume the
theorem that the diagonals of a parallelogram bisect
each other. Then if AC be one diagonal of a parallelogram,
ALCN, bisected by the other diagonal LN at B, and if the pairs of
opposite sides be produced
to meet at the infinitely distant points K and M, respectively, KLMN may be looked upon as a complete quadrangle with one pair of sides KL and MN intersecting at A,
a second pair KN and LM intersecting at C, a fifth side LN
passing through B and the sixth side KM (the infinitely distant
line) intersecting AC in the infinitely distant point D. Then
the segment AC is harmonically divided by the mid-point B
and the infinitely distant point D.
FIG. 7.
Thus any line-segment PQ is bisected at a point R when the
harmonic conjugate of R with respect to P and Q is at infinity;
or, the harmonic conjugate of the mid-point of a line-segment
with respect to the extremities of the segment is the infinitely
distant point of the line.
28. If a set of harmonic points ABCD be projected from
any point O by rays OA, OB, OC, OD, and a section of these
rays A'BC'D' be taken by a line drawn through B parallel
to OD, the segment A'B will equal the segment BC', since D'
is at infinity and the four points are harmonic.
By similar triangles it follows at once that
\[ \frac{AB}{AD} = \frac{BA'}{DO} = \frac{BC'}{DO} = \frac{BC}{CD}, \]
or, interchanging the order of segments and giving attention
to direction,
\[ \frac{AB}{BC} = -\frac{AD}{DC}. \]
That is, the segment AC is divided internally at B and
externally at D in the same ratio, a relation which is frequently
taken as the definition of harmonic points.
FIG. 8.
It should be noted that by another interchange in the order
and direction of segments, the ratio
\[ \frac{AB}{BC} = -\frac{AD}{DC} \quad\text{becomes}\quad \frac{BA}{AD} = -\frac{BC}{CD}, \]
which shows that not only is the segment AC divided at B and
D in equal ratios, but also that the segment BD is divided at A
and C in equal ratios, as has been already pointed out.
29. From this property of harmonic points it is not difficult
to demonstrate the following two:
(1) $\frac{1}{AB} + \frac{1}{AD} = \frac{2}{AC}$, from which immediately comes the
identity of geometric harmonics with the algebraic harmonical
progression; and
(2) $MB \cdot MD = MC^2$, where M is the mid-point of the segment AC.
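Both relations are easily verified on a numerical example. The sketch below assumes that each point is represented by its abscissa on the line and that a directed segment XY is the difference y - x; the function name and the sample values are arbitrary.

```python
from fractions import Fraction

def fourth_harmonic(a, b, c):
    """Abscissa d for which AB/BC = -AD/DC, using directed segments.

    Not defined when B is the mid-point of AC, for then D is at infinity."""
    k = Fraction(b - a, c - b)
    return (a - k * c) / (1 - k)

a, b, c = 0, 1, 3
d = fourth_harmonic(a, b, c)
m = Fraction(a + c, 2)                       # mid-point M of AC

ratio_internal = Fraction(b - a, c - b)      # AB / BC
ratio_external = (d - a) / (c - d)           # AD / DC
print(d, ratio_internal, ratio_external)

relation_1 = Fraction(1, b - a) + 1 / (d - a) == Fraction(2, c - a)
relation_2 = (b - m) * (d - m) == (c - m) ** 2
print(relation_1, relation_2)
```

With A, B, C at 0, 1, 3 the point D falls at -3; the two ratios are 1/2 and -1/2, and both relations hold exactly.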


VIII. ANHARMONIC RATIOS
30. Definition. If a line-segment PQ is divided by any
two points R and S, and the ratios $\frac{PR}{RQ}$ and $\frac{PS}{SQ}$ be formed and
again the ratio of these two ratios be taken, we obtain the ratio
$\frac{PR \cdot SQ}{PS \cdot RQ}$, which is called the cross-ratio, or the anharmonic ratio,
of the four points.
31. Six anharmonic ratios. For the same four points
it is evident that there are six different anharmonic ratios
according as PQ, PR, or PS is taken as the original segment,
the other two points in each case being the division points,
and as the ratio of ratios is taken in one order or the other.
That there are not more than six different anharmonic
ratios for the same four points, or that the segment RS, for
example, with division points P and Q gives no new anharmonic
ratio is easily shown. Forming the anharmonic ratio as before,
with RS as initial segment, it takes the form RQ PS and this
RQ. PS'
by reversal of segments is identical with the one previously
written.
Three of the anharmonic ratios of four points are reciprocals
of the other three since they are formed by taking the ratio of
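ratios in the reverse order. The six ratios therefore are
\[ \frac{PR \cdot SQ}{PS \cdot RQ}, \quad \frac{PQ \cdot SR}{PS \cdot QR}, \quad \frac{PQ \cdot RS}{PR \cdot QS}, \]
and their reciprocals. These six anharmonic
ratios involve only the quantities $PQ \cdot RS$, $PR \cdot SQ$, $PS \cdot QR$,
and their negatives.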
Now for any four points, P, Q, R, S, on a straight line it
may be easily shown that $PQ \cdot RS + PR \cdot SQ + PS \cdot QR = 0$.
From this we derive
\[ PQ \cdot SR = PS \cdot QR + PR \cdot SQ, \]
or
\[ \frac{PQ \cdot SR}{PS \cdot QR} = 1 - \frac{PR \cdot SQ}{PS \cdot RQ}. \]
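Also
\[ PQ \cdot RS = PR \cdot QS + PS \cdot RQ, \]
or
\[ \frac{PQ \cdot RS}{PR \cdot QS} = 1 - \frac{PS \cdot RQ}{PR \cdot SQ}. \]
Hence if the anharmonic ratio $\frac{PR \cdot SQ}{PS \cdot RQ} = k$, the ratio
\[ \frac{PQ \cdot SR}{PS \cdot QR} = 1 - k; \quad\text{and}\quad \frac{PQ \cdot RS}{PR \cdot QS} = 1 - \frac{1}{k} = \frac{k-1}{k}. \]
Therefore, if one of the anharmonic ratios of four points
on a straight line be equal to k, the remaining five are
\[ \frac{1}{k}, \quad 1-k, \quad \frac{1}{1-k}, \quad \frac{k-1}{k}, \quad\text{and}\quad \frac{k}{k-1}. \]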
In speaking of an anharmonic ratio it is clearly necessary
to distinguish which of the six ratios is meant, and when the
ratio has been formed in one order that order must be retained
throughout the discussion in hand.
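The count of six ratios, and their expression in terms of k, can be checked by forming the ratio for every one of the twenty-four orders in which four points may be taken. The sketch assumes points given by their abscissas and directed segments XY = y - x; the four sample abscissas are arbitrary.

```python
from fractions import Fraction
from itertools import permutations

def cross_ratio(p, q, r, s):
    """(PR . SQ) / (PS . RQ), with directed segments XY = y - x."""
    return Fraction((r - p) * (q - s), (s - p) * (q - r))

p, q, r, s = 0, 1, 3, 7

# The identity PQ.RS + PR.SQ + PS.QR = 0 for four points of a line.
identity = (q - p) * (s - r) + (r - p) * (q - s) + (s - p) * (r - q)

k = cross_ratio(p, q, r, s)
values = {cross_ratio(*order) for order in permutations((p, q, r, s))}
expected = {k, 1 / k, 1 - k, 1 / (1 - k), (k - 1) / k, k / (k - 1)}
print(identity, k, sorted(values))
```

For these points k = 9/7, and the twenty-four orders yield only the six values 9/7, 7/9, -2/7, -7/2, 2/9, and 9/2.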
32. Take $\frac{PQ \cdot RS}{PR \cdot QS}$ as the anharmonic ratio of four given
points. If two of the points P and S, and also the other two
Q and R, be interchanged, the ratio becomes $\frac{SR \cdot QP}{SQ \cdot RP}$, which by
reversal of segments is equal to the original ratio. Or if any
other two, and also the remaining two, be interchanged, the
ratio is unaltered. Hence
"If the anharmonic ratio of four points is formed in any order,
the ratio is unchanged when we interchange two of the points
and also the other two."
If the anharmonic ratio $\frac{PQ \cdot RS}{PR \cdot QS} = -1$, then $\frac{PQ}{QS} = -\frac{PR}{RS}$, or
the segment PS is divided at Q and R in the same ratio and the
four points are harmonic. In this case P and S must be separated by Q and R.
33. Take four points, A, B, C, D, on a line and project them
from any centre O. Let p be the length of the perpendicular
from O on the line.
Now the area of the triangle OAB
\[ = \tfrac{1}{2}\, p \cdot AB = \tfrac{1}{2}\, OA \cdot OB \sin AOB, \]
and there are similar relations for the other triangles. Hence
the anharmonic ratio
\[ \frac{AB \cdot CD}{AC \cdot BD} = \frac{\sin AOB \, \sin COD}{\sin AOC \, \sin BOD}, \]
a quantity independent of p, and hence independent of the position of the
line relative to the rays OA, OB, OC, OD. Therefore
"If A', B', C', D', be a projection of the points A, B, C, D,
from any centre O, the anharmonic ratio of the former set of
points is equal to the corresponding anharmonic ratio of the
latter set; or, the anharmonic ratio of four points is unaltered
by projection."
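The invariance of the anharmonic ratio under projection may be illustrated with exact arithmetic. The sketch assumes homogeneous coordinates for the plane and uses the abscissa of a point as a measure of position along each line, which is legitimate here because neither line is vertical; the centre O and the line of section are arbitrary choices.

```python
from fractions import Fraction

def cross(u, v):
    """Join of two points, or meet of two lines, in homogeneous coordinates."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cross_ratio(a, b, c, d):
    """(AB . CD) / (AC . BD), with directed segments XY = y - x."""
    return Fraction((b - a) * (d - c), (c - a) * (d - b))

def abscissa(point):
    """x-coordinate of a finite point given in homogeneous coordinates."""
    return Fraction(point[0], point[2])

O = (2, 5, 1)                                   # centre of projection
section = (1, 2, -4)                            # the line x + 2y - 4 = 0
originals = [(t, 0, 1) for t in (0, 1, 3, 7)]   # A, B, C, D on the line y = 0

# Each image is the point in which the ray from O meets the line of section.
images = [cross(cross(O, X), section) for X in originals]

before = cross_ratio(*(abscissa(X) for X in originals))
after = cross_ratio(*(abscissa(X) for X in images))
print(before, after)
```

The images fall at abscissas 2/3, 14/11, 26/9, and 10, and the ratio is 2/9 both before and after projection.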
IX. ELEMENTARY GEOMETRIC FORMS
34. The whole system of points on a straight line is called
a range of points and the system of lines through a point, these
lines being confined to one plane, is called a sheaf or pencil of
rays. The system of planes passing through a line is a sheaf
of planes. The aggregate of lines through a point, not confined
to one plane, is called a bundle of rays, and the aggregate of
planes through a point is a bundle of planes.
In plane geometry, the range of points and the sheaf of
rays are reciprocal forms, while in three-dimensional geometry
the range of points is reciprocal to the sheaf of planes and the
sheaf of rays is reciprocal to itself. The bundle of rays and
bundle of planes are reciprocal respectively to the rays of a
plane and the points of a plane.
X. CORRELATION OF ELEMENTARY FORMS
35. Two ranges of points may be so correlated that to every
point of one range there corresponds one and only one of the
other. For example, two sections of the same sheaf of rays
are correlated in this way if to each point of one range is correlated that point of the other which lies on the same ray.
Similarly, two sheaves of rays may be correlated, one to one, if
they project the same range of points. These are perhaps
the simplest examples of one to one correlation, but other more
complicated examples will readily occur to the reader.
Definition. When two elementary forms (ranges of points,
sheaves of rays, or sheaves of planes) are so correlated that to
every set of harmonic elements in one of them there corresponds
a set of harmonic elements in the other, the forms are said to
be related to each other projectively.
It is readily seen that if two forms are the first and last of
a series, each of which is a projection or a section of the next
preceding or next following, they fulfil the conditions of this
definition, and hence are projectively related.
36. From the definition it follows without great difficulty
(see Reye, Geometry of Position, §80*) that to any orderly
sequence of elements in one of two projectively related forms
there corresponds always an orderly sequence of elements in the
other, and also that the anharmonic ratio of any set of four
points in one form is equal to the anharmonic ratio of the corresponding four in the other.

* The reference is to Reye's Geometry of Position. English translation.
This is now out of print, but readily accessible in libraries.

37. Two projectively related simple forms which have the
same base (for example, two projective ranges of points which
lie on the same straight line) may have two elements of the
one which correspond to themselves in the other; but if more
than two, then every element of the one corresponds to itself
in the other, and the two forms are identical.
That there may be two self-corresponding elements in the
superposed forms may be seen as follows: Let A, B, C, D, ... be
points of a range u projected from S1 by rays a1, b1, c1, d1, ...
and from S2 by rays a2, b2, c2, d2, ... Let a line v cut the rays
a1, b1, c1, d1, ... in points A1, B1, C1, D1, ... and cut the
rays a2, b2, c2, d2, ... in points A2, B2, C2, D2, ... The
ranges A1, B1, C1, D1, ... and A2, B2, C2, D2, ..., both
lying on the line v, are projectively related, since any set of
harmonic points in one corresponds to a set of harmonic points
in the other, and in general corresponding points are distinct.
The ray S1S2, however, cutting the line u at the point S, will
determine on v two corresponding points which coincide. Also
the rays of S1 and S2 which project the point of u in which that
line is intersected by v, determine on v two coincident corresponding points. So that, in two superposed projective forms,
two self-corresponding elements are possible without requiring
that all elements should be self-corresponding.
But if three elements are self-corresponding, then all elements
are self-corresponding. It is readily seen that certainly in this
case an indefinite number of points will coincide with their
corresponding points, namely, the harmonic conjugate of each
of the three given points with respect to the remaining two,
and so on indefinitely. But for a proof that every point must
coincide with its corresponding point the reader is referred to
Reye, §84.
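The statement about self-corresponding elements may be illustrated with coordinates. The sketch below assumes that each point of the common base is given by a single number and that a projectivity is fixed by three pairs of corresponding points through equality of anharmonic ratios; the function name and the sample values are hypothetical.

```python
from fractions import Fraction

def correspond(xs, ys, x):
    """Image of x under the projectivity sending xs[i] to ys[i] (i = 0, 1, 2),
    found by equating the anharmonic ratios of the two sets of four points."""
    (x1, x2, x3), (y1, y2, y3) = xs, ys
    if x in xs:                       # one of the three given points
        return Fraction(ys[xs.index(x)])
    r = Fraction((x3 - x1) * (x - x2), (x3 - x2) * (x - x1))
    k = r * Fraction(y3 - y2, y3 - y1)
    # k = 1 would mean that the image is the infinitely distant point.
    return (y2 - k * y1) / (1 - k)

# Two self-corresponding points (0 and 1): other points still move.
print(correspond((0, 1, 2), (0, 1, Fraction(4, 3)), 3))

# Three self-corresponding points: every point corresponds to itself.
print([correspond((0, 1, 2), (0, 1, 2), x) for x in (3, 5, -7)])
```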
38. Let us apply this property to some simple examples.


(1) If two projective ranges of points A1, B1, C1,... lying
on the line u1 and A2, B2, C2,... lying on the line u2 are
so situated that the rays A1A2, B1B2, and C1C2, or any three such
rays, pass through one point S, then all rays joining pairs
of corresponding points will pass through S, and the common
point of the two ranges must be self-corresponding.
For S is the centre of two superposed projective sheaves
of rays having three self-corresponding rays, hence all
rays are self-corresponding, and the ray joining any point
P1 to S must coincide with the ray joining P2 to S, or,
the ray P1P2 must pass through S.
Definition. When two projective ranges of points are so
situated that the lines joining pairs of corresponding points
all pass through one point, they are said to be perspective
to each other, or to be in perspective position, and this
will happen whenever three lines joining pairs of corresponding
points pass through one point.

(2) If two projective sheaves of rays a1, b1, c1,... with
centre S1 and a2, b2, c2,... with centre S2 are so situated that
the points of intersection a1a2, b1b2, and c1c2, or any three
such points of intersection, lie on one straight line s, then
all points of intersection of pairs of corresponding rays
will lie on s, and the common ray of the two sheaves must
be self-corresponding.
For s is the base of two superposed projective ranges
of points having three self-corresponding points, hence
all points are self-corresponding and the intersection of
any ray p1 with s must coincide with the intersection of
p2 with s, or, p1 and p2 must intersect on s.
Definition. When two sheaves of rays are so situated
that the points of intersection of pairs of corresponding rays
all lie on one straight line, they are said to be perspective
to each other, or to be in perspective position, and this
will happen whenever three points of intersection of pairs
of corresponding rays lie on one line.


For brevity, the symbol ⌅ is frequently used for "is projective
to" and the symbol ⌆ for "is perspective to." It should be noted
that forms which are perspective to each other are also projective, but lie in a special relative position.


(3) Two projective ranges of points, A1, B1, C1,... lying
on the line u1 and A2, B2, C2,... lying on the line u2, are
perspectively related if the common point of u1 and u2 is
self-corresponding.
For, if A1A2 and B1B2 intersect in S and the two ranges
be projected from this point, then in the two projective
sheaves of rays whose centre is S there are three
self-corresponding rays, namely, SA1A2, SB1B2, and SK where
K is the common point of u1 and u2. Hence all rays of S are
self-corresponding and the two ranges are perspective.

(4) Two projective sheaves of rays a1, b1, c1,... with
centre S1 and a2, b2, c2,... with centre S2 are perspectively
related if the common ray of the two sheaves, S1S2,
is self-corresponding.
For, if s be the line joining the intersections a1a2 and b1b2,
and a section of each sheaf of rays by this line be taken, then
in the two projective ranges of points lying on this line
there will be three self-corresponding points, namely,
a1a2, b1b2, and the point where s cuts S1S2. Hence all points
of these two ranges are self-corresponding, or, in other
words, all pairs of corresponding rays of the two sheaves
will intersect on s, and the two sheaves are perspective.
(5) Two fixed straight lines u1 and u2 intersect at O. A line
v rotates about a fixed point V and intersects u1 and u2 in A1
and A2 respectively, and S1 and S2 are two fixed points collinear
with V. The locus of the intersection of S1A1
and S2A2 is a straight line through O.
For the line v rotating about V marks out on the lines u1
and u2 two perspective ranges of which A1 and A2 are corresponding points and O is a self-corresponding point. The
sheaves S1A1 and S2A2 are therefore projective, and they are
perspective because their common ray S1S2, which passes through V,
corresponds to itself; the locus
of the intersection of pairs of corresponding rays is thus a straight
line. That O is one point of the locus follows from the fact that
S1O and S2O are corresponding rays in the sheaves S1 and S2.
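Example (5) lends itself to a numerical check. The sketch below assumes homogeneous coordinates and takes S1 and S2 on a line through V, the arrangement in which the locus passes through O; all coordinates and names are illustrative choices, not part of the text.

```python
def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def dot(p, q):
    """Zero exactly when the point and the line are incident."""
    return sum(a * b for a, b in zip(p, q))

O = (0, 0, 1)
u1 = (0, 1, 0)                   # the line y = 0
u2 = (1, -1, 0)                  # the line y = x, meeting u1 at O
V = (5, 1, 1)
S1, S2 = (1, 3, 1), (3, 2, 1)    # chosen collinear with V

def locus_point(t):
    """Intersection of S1A1 and S2A2 when v passes through A1 = (t, 0)."""
    A1 = (t, 0, 1)
    A2 = cross(cross(V, A1), u2)
    return cross(cross(S1, A1), cross(S2, A2))

points = [locus_point(t) for t in (1, 2, 3, 4, 6)]
locus = cross(points[0], points[1])
print(all(dot(locus, P) == 0 for P in points), dot(locus, O) == 0)
```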
39. Construction of corresponding elements. From what has
been said it appears that two elementary forms may be correlated projectively as soon as there are known three elements
in the one form which correspond to three given elements in
the other. Let S1 and S2 be the centres of two sheaves of
rays lying in the same plane, which are to be correlated projectively. Let the rays a1, b1, c1 of the first sheaf correspond
respectively to the rays a2, b2, c2 of the second sheaf. The
problem is to find in the second sheaf the ray d2 which corresponds to any chosen ray d1 of the first sheaf.
If a1, a2 intersect at A, b1, b2 at B, and c1, c2 at C, and these
three points lie on a straight line v, the two sheaves are perspective and any pair of corresponding rays will intersect on v.
But, if A, B, and C are not collinear, we must find a sheaf of
rays to which each of the given sheaves is perspective and so
arrive at a correlation of them.




Through one of the points of intersection, A, draw two
secants, u1 and u2, and consider the first, u1, a section of the
sheaf S1, the second, u2, a section of the sheaf S2. These two
ranges of points therefore will be projectively related and they
are perspective since A is a self-corresponding point. If B', C' and
B", C" are the points in which b1, c1, and b2, c2 are cut respectively
by the lines u1 and u2, the intersection of B'B" and C'C",
or S, is the centre of a sheaf of rays of which both u1 and u2
are sections. Hence the sheaves S1 and S are perspective since
corresponding rays intersect on the straight line u1, and S2 and
S are perspective since corresponding rays intersect on the
straight line u2.
FIG. 9.
If then d1 is any ray of the sheaf S1 which cuts u1 at D',
the ray SD' will cut u2 in a point D" in which the ray d2 of
the sheaf S2 also cuts it. Thus the ray d2 of S2 corresponding
to any ray d1 of S1 is determined and the correlation is complete.
40. On the other hand, let u1 and u2 be two ranges of points lying
in the same plane which are to be correlated projectively, and let the
points A1, B1, C1 of the first range correspond respectively to the
points A2, B2, C2 of the second range.
The problem is to find the point D2 of the second range corresponding to any chosen point D1 of the first range.
FIG. 10.




If the rays A1A2, B1B2, C1C2 should pass through one point
V, the two ranges are perspective and all pairs of corresponding points in the two ranges will lie on rays through V. But
if these rays do not intersect in one point we must find a range
of points to which each of the given ranges is perspective, and
so arrive at a correlation.
On one of the lines, as A1A2, choose two centres S1 and S2,
and from these project the given ranges u1 and u2, respectively.
The two sheaves of rays S1 and S2 will thus be projective and
they are perspective since S1S2 is a self-corresponding ray.
If B' is the point in which S1B1 and S2B2 intersect and C' the
point in which S1C1 and S2C2 intersect, then all pairs of corresponding rays of S1 and S2 will intersect on the line B'C'
or u.
If then D1 be any point of the range u1, S1D1 will intersect
the line B'C' in the same point as does S2D2. Thus the point
D2 of u2 corresponding to any point D1 of u1 is determined and
the correlation is complete.
It should be noted that both the problem and the process of
sec. 40 are the reciprocals or duals of those of sec. 39.
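The construction of sec. 40 can be carried out with coordinates. The sketch below assumes homogeneous coordinates, with joins and intersections computed by cross products; the sample ranges on the two axes and the centres S1, S2 on the line A1A2 are arbitrary illustrative choices.

```python
from fractions import Fraction

def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def corresponding_point(B1, C1, A2, B2, C2, S1, S2, D1):
    """Point D2 of the second range corresponding to D1 of the first.
    S1 and S2 must have been chosen on the line A1A2."""
    u2 = cross(A2, B2)
    B_ = cross(cross(S1, B1), cross(S2, B2))    # B'
    C_ = cross(cross(S1, C1), cross(S2, C2))    # C'
    u = cross(B_, C_)                           # the line B'C'
    D_ = cross(cross(S1, D1), u)                # where S1D1 meets u
    return cross(cross(S2, D_), u2)             # where S2D' meets u2

def affine(p):
    """Cartesian coordinates of a finite homogeneous point."""
    return (Fraction(p[0], p[2]), Fraction(p[1], p[2]))

A1, B1, C1 = (1, 0, 1), (2, 0, 1), (4, 0, 1)    # range u1 on the x-axis
A2, B2, C2 = (0, 1, 1), (0, 3, 1), (0, 2, 1)    # range u2 on the y-axis
S1, S2 = (3, -2, 1), (-1, 2, 1)                 # both on the line A1A2

D2 = corresponding_point(B1, C1, A2, B2, C2, S1, S2, (3, 0, 1))
print(affine(D2))
```

The given pairs are reproduced by the same function, which is a convenient check on any choice of S1 and S2.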
XI. CURVES AND SHEAVES OF RAYS OF THE SECOND ORDER


41. If two projective sheaves
of rays lying in the same plane
are neither concentric nor perspective, then of the points
of intersection of pairs of corresponding rays, at most two
can lie on any straight line.
For if three, then all and the
two sheaves must be perspective.


If two projective ranges of
points lying in the same plane
are neither superposed nor
perspective, then of the lines
joining pairs of corresponding
points, at most two can pass
through any one point.
For if three, then all and
the two ranges are perspective.


42. Since a continuous series of elements in one of two
projective forms corresponds always to a continuous series in
the other, the locus of the point of intersection of corresponding rays in two projective sheaves is a continuous series of points,
or a curve, and the locus of the line joining corresponding
points in two projective ranges is a continuous series of rays,
or an envelope.
If the two projective sheaves of rays are not perspective,
the generated curve is such that not more than two of its points
lie on any straight line. Such a curve is called a curve of the
second order.
If the two ranges of points are not perspective, the generated envelope is such that not more than two of its rays pass
through any one point. Such an envelope is called a sheaf
of rays of the second order.
A curve of the second order is generated by two projective
sheaves of rays lying in the same plane, which are not
perspective.
A sheaf of rays of the second order is generated by two
projective ranges of points lying in the same plane, which
are not perspective.
43. The centres of the sheaves generating a curve of the
second order are themselves points of the curve, since the ray
S1S2 of the sheaf S1 meets its corresponding ray at S2, and the
ray S2S1 of the sheaf S2 meets its corresponding ray at S1.
These corresponding rays have each only one point in common
with the curve, namely, S2 and S1 respectively, while all other
rays through these centres meet the curve at S2 or S1 and also
elsewhere. These rays are consequently called tangents to the
curve at these points.
The lines on which lie ranges of points generating a sheaf of
rays of the second order are themselves rays of the sheaf,
since each joins a point in itself to the corresponding point in
the other, namely, the common point of the two lines. Through
the point of u1 which corresponds to the point of u2 lying on
u1, there passes but one ray of the sheaf, namely, u1 itself,
while through all other points of u1 there pass two rays of the
sheaf. The same is true for that point of u2 which corresponds
to the point of u1 lying on u2. These points are consequently
called points of contact on the two rays.
44. A curve of the second order may thus be generated
from two given points S1 and S2 and three rays through each
correlated to three rays through the other, in other words, from
five given points of the curve. The problem of constructing a
curve of the second order from five given points is just the
problem of determining pairs of corresponding rays in two
projective sheaves when three pairs are given.
FIG. 11.
A sheaf of rays of the second order may be generated from
two given rays u1 and u2 and three points on each correlated to
three points on the other; in other words, from five given rays
of the sheaf. The problem of constructing a sheaf of rays of the
second order from five given rays is just the problem of determining pairs of corresponding points in two projective ranges
when three pairs are given.
FIG. 12.
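The generation of a curve of the second order from five points may be sketched by applying the construction of sec. 39 in coordinates. The code assumes homogeneous coordinates, takes the two secants through A to be the lines AC and AB (a convenient but arbitrary choice), and tests the result on five points of the circle x^2 + y^2 = 25; the names are hypothetical.

```python
from fractions import Fraction

def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def curve_point(S1, S2, A, B, C, T):
    """Point in which the ray d1 = S1T meets its corresponding ray d2 of S2,
    the sheaves being correlated by S1A-S2A, S1B-S2B, S1C-S2C."""
    u1, u2 = cross(A, C), cross(A, B)            # two secants through A
    b1, c1 = cross(S1, B), cross(S1, C)
    b2, c2 = cross(S2, B), cross(S2, C)
    S = cross(cross(cross(b1, u1), cross(b2, u2)),   # the line B'B"
              cross(cross(c1, u1), cross(c2, u2)))   # the line C'C"
    d1 = cross(S1, T)
    D_ = cross(d1, u1)                           # D' on u1
    D__ = cross(cross(S, D_), u2)                # D" on u2
    return cross(d1, cross(S2, D__))

S1, S2 = (5, 0, 1), (-5, 0, 1)
A, B, C = (3, 4, 1), (0, 5, 1), (-4, 3, 1)

P = curve_point(S1, S2, A, B, C, (0, 1, 1))
print(Fraction(P[0], P[2]), Fraction(P[1], P[2]))
```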
45. That the points S1 and S2, the centres of the generating
sheaves, are not particular points of the curve, or that the
curve could as well be generated with any other two of its
points for centres, follows without great difficulty, but the
demonstration is omitted. The same is true regarding the lines
u1 and u2 in the sheaf of rays. Accepting this, it follows that:
"A curve of the second order may be projected from any
two of its points by projective sheaves of rays and a sheaf of
rays of the second order is cut by any two of its rays in projective ranges of points."
46. A circle is a curve of the second order, since if two
points S1 and S2 on it be chosen for centres and other points
A, B, C,... be projected from these, the angle AS1B equals
the angle AS2B, and so on, so that the two sheaves of rays
S1 and S2 might be placed the one on the other so that pairs
of corresponding rays would coincide. Hence the sheaves are
projective and the points of the circle are points of intersection
of pairs of corresponding rays.
FIG. 13.


Similarly the tangents to a circle form a sheaf of rays of
the second order. If u1 and u2 are any two tangents to a
circle, and other tangents cut these two in points A1, A2;
B1, B2,... respectively, the angle A1OB1 equals the angle
A2OB2, and so on, so that the ranges of points u1 and
u2 are sections of two identically equal sheaves of rays
having the same centre O. Hence these two ranges of
points are projective and the system of tangents is a sheaf of
rays of the second order.
FIG. 14.


47. From considerations such as these, it may be shown
that every conic section is a curve of the second order and may
be generated as indicated in the preceding articles, and also
that the system of tangents to a conic section is a sheaf of rays
of the second order. On the other hand, any curve of the
second order is a conic section and any sheaf of rays of the
second order is a system of tangents to a conic section.
48. Classification of conics. If one of the two projective
sheaves of rays generating a conic should be placed concentric
with the other without changing the directions of its rays,
the two sheaves might then have two rays which coincide
with their corresponding rays, or one such, or none such, but
certainly two such if the correlated rays rotate about the centre
in opposite senses. In the original positions of the sheaves,
therefore, there may be two pairs of correlated rays parallel,
in which case the generated curve is a hyperbola, having two
points at infinity; or one pair, in which case the curve is a
parabola; or no corresponding rays parallel, in which case the
curve is an ellipse. If the five given points from which the
curve is generated lie so that one is within the quadrangle
formed by the other four, the resulting curve is necessarily a
hyperbola.
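In coordinates, which the text does not use, the number of infinitely distant points of a conic is read off from the terms of the second degree in its equation. The sketch below assumes a proper (non-degenerate) conic a x^2 + b xy + c y^2 + ... = 0; the function name is hypothetical.

```python
def classify(a, b, c):
    """Type of a proper conic a*x^2 + b*x*y + c*y^2 + ... = 0.
    Its points at infinity (X, Y, 0) satisfy a*X^2 + b*X*Y + c*Y^2 = 0,
    which has two real solutions, one, or none."""
    disc = b * b - 4 * a * c
    if disc > 0:
        return "hyperbola"   # two points at infinity
    if disc == 0:
        return "parabola"    # one point at infinity
    return "ellipse"         # no real point at infinity

print(classify(1, 0, 1), classify(1, 0, 0), classify(0, 1, 0))
```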




XII. PASCAL'S AND BRIANCHON'S THEOREMS
FIG. 15.
49. The diagram for the construction of pairs of corresponding rays in two projective sheaves (sec. 39), may be extended
so far as to show that the ray S1S of the sheaf S1
cuts the line u2 in a point M of the curve determined
by the two sheaves. Similarly, S2S of the sheaf S2
cuts the line u1 in a point L of the curve. Now S1,
S2, A, D, L, and M are arbitrary points of the
curve and lines connecting them in order, S1DS2LAM,
form a hexagon inscribed in the curve of second
order such that the pairs of opposite sides S1D and
LA, DS2 and AM, S2L and MS1 necessarily intersect
in points of a straight line D'SD". Hence
"The points of intersection of the three pairs of opposite
sides of a hexagon inscribed in a conic lie on one straight line."
This is the well-known theorem of Pascal enunciated in 1640,
when the author was but a lad of sixteen years.
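Pascal's theorem is easily tested on a numerical example. The sketch assumes homogeneous coordinates and takes, purely for illustration, six points with rational coordinates on the circle x^2 + y^2 = 25.

```python
def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def pascal_points(hexagon):
    """Intersections of the three pairs of opposite sides of P1 P2 ... P6."""
    sides = [cross(hexagon[i], hexagon[(i + 1) % 6]) for i in range(6)]
    return [cross(sides[i], sides[i + 3]) for i in range(3)]

def collinear(p, q, r):
    """True when three points lie on one straight line."""
    return dot(cross(p, q), r) == 0

hexagon = [(5, 0, 1), (3, 4, 1), (0, 5, 1), (-4, 3, 1), (-3, -4, 1), (4, -3, 1)]
print(collinear(*pascal_points(hexagon)))
```

Taking the same six points in another order gives a different hexagon and a different Pascal line, but the three intersections are again collinear.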
50. On the other hand, the diagram for the location of pairs
of corresponding points in two projective ranges (sec. 40) may
be extended to show that the lines S2R1 and S1Q2 are rays of
the sheaf of second order determined by the given ranges,
R1 and Q2 being the points in which the line B'C', or u, intersects u1 and u2, respectively. Now S1S2R1D1D2Q2 are vertices
of a hexagon whose sides are arbitrary rays of the sheaf of
second order, and it is such that the lines joining pairs of
opposite vertices necessarily intersect in one point. Hence


"The lines joining pairs of opposite vertices of a hexagon
circumscribed to a conic intersect in one point."
This is Brianchon's theorem, the exact dual or reciprocal
of Pascal's theorem, but not discovered till 1806.
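Brianchon's theorem admits the same kind of check. The sketch assumes homogeneous line coordinates and uses the known fact that the tangent to the circle x^2 + y^2 = 25 at the point (a, b) is the line ax + by = 25; the six points of contact are an illustrative choice.

```python
def cross(p, q):
    """Point common to two lines, or line joining two points (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def brianchon_lines(tangents):
    """Lines joining the three pairs of opposite vertices of the hexagon
    whose sides are the six given tangents, taken in order."""
    vertices = [cross(tangents[i], tangents[(i + 1) % 6]) for i in range(6)]
    return [cross(vertices[i], vertices[i + 3]) for i in range(3)]

def concurrent(l, m, n):
    """True when three lines pass through one point."""
    return dot(cross(l, m), n) == 0

contacts = [(5, 0), (3, 4), (0, 5), (-4, 3), (-3, -4), (4, -3)]
tangents = [(a, b, -25) for a, b in contacts]
print(concurrent(*brianchon_lines(tangents)))
```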


FIG. 16.


51. From these two theorems many important consequences
follow.


(1) If in Pascal's theorem two of the vertices of the inscribed
hexagon come to coincide, the intervening side thus becoming the
tangent at that vertex, the theorem takes the form: A pentagon
inscribed in a conic is such that the intersections of two pairs of
non-adjacent sides and of the fifth side with the tangent at
the opposite vertex lie on one straight line.
(2) If in Brianchon's theorem two of the sides of the circumscribed
hexagon come to coincide, the intervening vertex thus becoming the
point of contact of that side, the theorem takes the form: A
pentagon circumscribed to a conic is such that the lines
joining two pairs of non-adjacent vertices and the line
joining the fifth vertex to the point of contact of the opposite
side pass through one point.
(3) If further the hexagon is reduced to an inscribed
quadrilateral and tangents at two opposite vertices, we have:
In any quadrilateral inscribed in a conic, the intersections of
pairs of opposite sides and of tangents at opposite vertices
are collinear.
(4) If further the hexagon is reduced to a circumscribed
quadrilateral and the points of contact on two opposite
sides, we have: In any quadrilateral circumscribed to a
conic the lines joining pairs of opposite vertices and pairs
of points of contact in opposite sides are concurrent.
(5) For the inscribed triangle Pascal's theorem becomes: The
sides of a triangle inscribed in a conic intersect the tangents
at the opposite vertices in points of one straight line.
(6) For the circumscribed triangle Brianchon's theorem
becomes: The lines joining the vertices of a triangle circumscribed
to a conic to the points of contact of the opposite sides
intersect in one point.


52. Pascal's theorem yields itself at once to the construction of a conic of which there are given five points, or four
points and the tangent at one of them, or three points and the
tangents at two of them. In the case of five points being given,
if these are A, B, C, D, E, and they are joined in order, while an
arbitrary line through A is drawn for sixth side of the inscribed
hexagon, the hexagon is determined excepting only the fifth side
and the sixth vertex. Of this hexagon AB and DE are opposite
sides, CD and the arbitrary line through A are opposite sides,
and the intersections of these determine the Pascal line. The
side BC and the side EF, where F is on the arbitrary line through
A, intersect also on the Pascal line, hence the point F, an
arbitrary point of the conic, is determined.
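The steps of this construction translate directly into coordinates. The sketch below assumes homogeneous coordinates; the five points are taken on the circle x^2 + y^2 = 25 so that the result can be verified, and the arbitrary line g through A is an illustrative choice.

```python
from fractions import Fraction

def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def sixth_point(A, B, C, D, E, g):
    """Second point F in which the line g through A meets the conic ABCDE."""
    X = cross(cross(A, B), cross(D, E))   # AB and DE are opposite sides
    Y = cross(cross(C, D), g)             # CD and g (the side FA) are opposite
    pascal = cross(X, Y)                  # the Pascal line
    Z = cross(cross(B, C), pascal)        # BC meets EF on the Pascal line
    return cross(g, cross(E, Z))          # F is where EZ cuts g

A, B, C, D, E = (5, 0, 1), (3, 4, 1), (0, 5, 1), (-4, 3, 1), (-3, -4, 1)
g = cross(A, (0, 1, 1))                   # an arbitrary line through A
F = sixth_point(A, B, C, D, E, g)
print(Fraction(F[0], F[2]), Fraction(F[1], F[2]))
```

Letting g turn about A yields as many further points of the conic as may be desired.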
In the same way, Brianchon's theorem yields itself to the
construction of tangents to a conic when there are given either
five tangents, four tangents and the point of contact of one
of them, or three tangents and the points of contact of two of
them.




XIII. POLE AND POLAR THEORY
53. In the plane of a conic is a point P, and through it are
drawn two secants of the conic as in the diagrams (Figs. 17 and
18), cutting the conic in the points A, B, and C, D. If these
points are joined two and two so as to form the inscribed
quadrangle ACBD, the pairs of opposite sides AC and BD,
AD and BC, will intersect on a line p, on which also intersect
the tangents at opposite vertices A and B, C and D (Pascal's
theorem). The line p is called the polar of the point P with
respect to the conic, and the point P is the pole of the line p.
FIG. 17. FIG. 18.
If the secant PAB cuts the polar at the point P' it is readily
seen that the points PAP'B are harmonic, hence P' could be
found as the harmonic conjugate of P with respect to A and
B. Also the tangents at A and B intersect on the polar, consequently two points of the polar can be found from a single
secant. Hence the position of the polar is independent of the
second secant, that is, the polar of a point is independent of
the process of constructing it, and it therefore bears a fixed relation to the point and the curve.
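The construction of the polar may be compared with its analytic expression. The sketch below rests on an assumption foreign to the text, namely that the conic is given by a symmetric matrix M, so that the polar of P is the line MP; the circle x^2 + y^2 = 25 and the two secants are illustrative choices.

```python
def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

CONIC = ((1, 0, 0), (0, 1, 0), (0, 0, -25))     # x^2 + y^2 - 25 = 0

def polar(P, M=CONIC):
    """Polar line of P, which is also the tangent when P lies on the conic."""
    return tuple(dot(row, P) for row in M)

A, B = (5, 0, 1), (3, 4, 1)                     # first secant through P
C, D = (0, 5, 1), (-3, -4, 1)                   # second secant through P
P = cross(cross(A, B), cross(C, D))
p = polar(P)

on_polar = [
    cross(cross(A, C), cross(B, D)),            # AC meets BD
    cross(cross(A, D), cross(B, C)),            # AD meets BC
    cross(polar(A), polar(B)),                  # tangents at A and B
    cross(polar(C), polar(D)),                  # tangents at C and D
]
print([dot(p, Q) == 0 for Q in on_polar])
```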




54. On the polar of a point P with respect to a conic there
will lie:
(1) The intersections of chords joining the extremities of
pairs of secants through P. (By extremities we mean the points
of intersection with the curve.)
(2) The intersections of tangents at the extremities of
secants through P.
(3) All points harmonically separated from P by the curve.
(4) The points of contact of tangents from P.
55. If a straight line p is given and we wish to find its pole
with respect to a given conic, we may choose on the line two
points R and S, and from these draw tangents meeting the curve
at A, B, and C, D, respectively. The intersection P of the
chords AB and CD is such that its polar necessarily passes
through both R and S. Hence P is the pole of the given line.
56. If a point P lies inside a conic, all points of its polar
lie outside, since chords through P cut the conic, and the polar
passes through all points harmonically separated from P by
the curve. If P lies outside the conic, some points of the polar
must lie inside and the polar necessarily cuts the curve. If
P is a point of the conic its polar is the tangent at that point
and, conversely, the pole of a tangent is the point of contact.
This follows as a limiting case of the construction for the polar
in sec. 53.
Incidentally, it may be stated by way of definition that a
point lies inside of a closed curve when all straight lines through
it cut the curve, and a point lies outside of a closed curve when
through it straight lines can be drawn which do not cut the
curve in real points.
57. Conjugate points and lines. If a point Q lies on the
polar of a point P, relative to a conic, then P lies on the polar
of Q.
For if P lies outside the curve, Q may be chosen either
inside or outside, but in either case, the polar of P is a secant
through Q, the tangents at whose extremities intersect in a
point of the polar of Q. But they intersect at P, hence P is
a point of the polar of Q. If P lies inside the curve, Q must
lie outside and QP necessarily cuts the curve in real points.
Now Q and P are harmonically separated by the curve, since Q
lies on the polar of P, but for this same reason P must lie on
the polar of Q. If P lies on the curve, and Q is a point of its
polar, namely a point of the tangent at P, the polar of Q evidently passes through P.
Two points are called "polar conjugates" when one, and consequently each, lies on the polar of the other; and two lines are
"polar conjugates" when one, and consequently each, passes
through the pole of the other.
Thus a point is conjugate to all the points of its polar and a
line is conjugate to all the lines through its pole.
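The reciprocity of conjugate points, and the finding of a pole as in sec. 55, can be sketched analytically. It is assumed here, beyond the text, that the conic is given by a non-singular symmetric matrix M; the particular matrix and all names are hypothetical.

```python
def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def polar(M, P):
    """Polar line of the point P with respect to the conic of matrix M."""
    return tuple(dot(row, P) for row in M)

def conjugate(M, P, Q):
    """True when Q lies on the polar of P."""
    return dot(Q, polar(M, P)) == 0

def pole(M, line):
    """Pole of a line: the point common to the polars of two of its points."""
    pts = [cross(line, e) for e in ((1, 0, 0), (0, 1, 0), (0, 0, 1))]
    for R, S in ((pts[0], pts[1]), (pts[0], pts[2]), (pts[1], pts[2])):
        P = cross(polar(M, R), polar(M, S))
        if any(P):                    # R and S were distinct points
            return P

M = ((2, 1, 0), (1, -3, 4), (0, 4, 5))   # an arbitrary symmetric matrix
P = (1, 2, 3)
Q = cross(polar(M, P), (1, 0, 0))        # some point on the polar of P
print(conjugate(M, P, Q), conjugate(M, Q, P))
```

The symmetry of M is what makes the relation mutual, since Q·MP and P·MQ are then the same number.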
58. Centre and diameters. In the diagram for the construction of the polar of a point (sec. 53), if P should lie at
infinity, and consequently the secants through it be parallel,
the polar becomes the locus of the mid-points of a system of
parallel chords, each mid-point being harmonically separated
from P.
The locus of the mid-points of a system of parallel chords
of a conic is thus a straight line, namely the polar of the infinitely
distant point at which the chords intersect, and this line is
called a diameter of the curve.
The intersection of any two diameters of a conic is the pole
of the infinitely distant line, and is called the centre of the
conic.
The centre of an ellipse lies inside the curve since the infinitely distant line of the plane lies wholly outside and all
diameters cut the curve in real points. For a parabola, the
infinitely distant line is tangent to the curve, hence the centre
lies on the curve at infinity, and all diameters are parallel.
The infinitely distant line cuts a hyperbola in two real points,
hence the centre lies outside the curve and there are some diameters which cut the curve in real points while others do not.
The diameters which are tangent to the curve at infinity are
called asymptotes.
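The definitions of diameter and centre may be illustrated in coordinates. The sketch assumes, beyond the text, that the conic is given by a symmetric matrix M and that the infinitely distant line has coordinates (0, 0, 1); the circle (x - 1)^2 + (y - 2)^2 = 25 and its chords of slope 2 are illustrative choices.

```python
def cross(p, q):
    """Line joining two points, or point common to two lines (homogeneous)."""
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

M = ((1, 0, -1), (0, 1, -2), (-1, -2, -20))   # (x - 1)^2 + (y - 2)^2 = 25

def polar(P):
    return tuple(dot(row, P) for row in M)

# The centre is the pole of the line (0, 0, 1): its polar must have its
# first two coordinates zero, so it is orthogonal to the first two rows of M.
centre = cross(M[0], M[1])

def diameter(dx, dy):
    """Polar of the infinitely distant point in the direction (dx, dy)."""
    return polar((dx, dy, 0))

def midpoint(p, q):
    """Mid-point of two finite points given as (x, y, 1)."""
    return (p[0] + q[0], p[1] + q[1], 2)

chords = [((6, 2, 1), (4, -2, 1)), ((1, -3, 1), (5, 5, 1)), ((-4, 2, 1), (-2, 6, 1))]
d = diameter(1, 2)
print(centre, d, [dot(d, midpoint(p, q)) == 0 for p, q in chords])
```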




XIV. CONCLUSION
59. In this brief chapter it is hoped that enough of the
spirit of modern pure geometry has been exhibited to encourage
the reader to continue its study. The field yields rich results
in applications to more elementary subjects and is not too
difficult or forbidding for the isolated reader. By continuing
the methods here indicated a complete study can be made not
only of the conic sections, but of the relations of conics to each
other and of many curves of higher order. The pole and polar
theory in reference to a conic relates every point of the plane
to a line and every line to a point in such a way as to give
concreteness to the principle of duality and to make it possible so to reciprocate systems of points and lines as to yield
definite sets of lines and points. By projection from a point
outside the plane all the conclusions here reached can be transferred directly to cones of the second order and their tangent
planes, and many interesting theorems develop which are
not simple projections and which have not more than an analogy
in a plane figure. Ruled surfaces of the second order appear
from a consideration of two projectively related ranges of
points which do not lie in the same plane, or from two projective sheaves of planes whose axes do not intersect.
Perhaps the easiest and most attractive approach to the
study is through Reye's Geometrie der Lage, but the English
translation of Part I of this work, made some years ago, is now
out of print. Cremona's Projective Geometry, English translation by Leudesdorf, has always been a popular text. The
classic treatises on modern geometry are Chasles's Géométrie
Supérieure, Steiner's Systematische Entwicklung, etc., Poncelet's Propriétés Projectives, and Von Staudt's Geometrie der
Lage. Of these the last, though perhaps the most systematic,
should be read only after a considerable knowledge and comprehension of the subject has been obtained.


III
NON-EUCLIDEAN GEOMETRY
By FREDERICK S. WOODS


CONTENTS
                                                            SECTIONS
I. INTRODUCTION ............................................ 1
II. PARALLEL LINES ......................................... 2-5
III. THE EUCLIDEAN ASSUMPTION .............................. 6
IV. THE LOBACHEVSKIAN ASSUMPTION ........................... 7-12
V. THE RIEMANNIAN ASSUMPTION ............................... 13-15
VI. THE SUM OF THE ANGLES OF A TRIANGLE .................... 16-20
VII. AREAS ................................................. 21-24
VIII. NON-EUCLIDEAN TRIGONOMETRY ........................... 25-35
IX. NON-EUCLIDEAN ANALYTIC GEOMETRY ........................ 36-43
X. REPRESENTATION OF THE LOBACHEVSKIAN GEOMETRY ON A
   EUCLIDEAN PLANE ......................................... 44-51
XI. RELATION BETWEEN PROJECTIVE AND NON-EUCLIDEAN
   GEOMETRY ................................................ 52-55
XII. THE ELEMENT OF ARC .................................... 56


III


NON-EUCLIDEAN GEOMETRY
By FREDERICK S. WOODS
I. INTRODUCTION
1. The fifth postulate of Euclid reads as follows: "If a
straight line falling on two straight lines makes the interior
angles on the same side less than two right angles, the two lines,
if produced indefinitely, meet on that side on which are the
angles less than two right angles."
Under the term non-Euclidean geometry we shall understand a system of geometry which is built up without the use
of this postulate. Strictly speaking, perhaps, the same name
might be given to any geometry the basis of which differs in
any essential particular from that of Euclid, but usage has
decreed otherwise.
The conception of a non-Euclidean geometry came into
being only after centuries of vain attempts to prove the truth
of Euclid's postulate. There is no place here to review the
history of such attempts.* It is sufficient to note that all
inevitably failed. Some writers, however, especially Saccheri
(1667-1733), Lambert (1728-77), and Legendre (1752-1833)
made important contributions to what is now recognized as
the non-Euclidean geometries, though each failed to see the
true meaning of the results he obtained.

* See, for example: Engel-Staeckel, Theorie der Parallellinien von Euklid
bis auf Gauss, Leipzig, 1895. A shorter account is found in Bonola, Die
nichteuklidische Geometrie, Vol. IV of the series Wissenschaft und Hypothese,
Leipzig, 1908. See also the Historical Note in Manning, Non-Euclidean
Geometry, Boston, 1901; and Heath, The Thirteen Books of Euclid's Elements, Vol. I, p. 202, Cambridge, 1908.

Finally, nearly simultaneously though quite independently,
a Russian, Lobachevsky, a Hungarian, J. Bolyai, and a German,
Gauss, reached the conclusion not only that the parallel postulate
could not be proved, but that a logical system of geometry could
be constructed without its use. The work of Gauss is only
partly revealed by extracts from his correspondence and fragments of his posthumous papers. That of Lobachevsky is contained in several articles published between 1833 and 1855, and
that of Bolyai in an appendix to a work of his father published
in 1832-35. The system of geometry common to these three
writers we shall call the Lobachevskian geometry, since Lobachevsky was the mathematician to develop it most fully.*
The Lobachevskian geometry remained for a time the
sole type of a non-Euclidean geometry. In 1854, however,
Riemann, working from the standpoint of the differential calculus, discovered a new type to which we shall give the name
of the Riemannian geometry.
Besides the three types of geometry, the Euclidean, the
Lobachevskian, and the Riemannian, there are also three
methods by which the geometries may be developed. The first
is by elementary methods similar to those of Euclid, and was used
by Lobachevsky, Bolyai, and Gauss. The second is by use of
Cayley's system of projective measurement and has been largely
employed by Klein. The third is that of the calculus, and has
been used by Riemann. We shall begin by employing the first
method, but shall later make some reference to the other two.
* English readers will find the simplest introduction to Lobachevsky's
own work in the little book written in German and translated into English
by G. B. Halsted under the title, "Geometrical researches on the theory of
parallels." More complete is Engel's translation: Lobatschefsky, Zwei geometrische Abhandlungen aus dem Russischen übersetzt mit Anmerkungen
und mit einer Biographie des Verfassers, Leipzig, 1879.

It does not lie within the plan of this paper to examine the
assumptions which must be made before any form of a parallel
postulate can be introduced. This work has been done by
Professor Veblen in his paper* contained in the present collection
and the results of that paper will be assumed as known and
freely referred to. It is believed, however, that this paper
may be easily read by any reader who prefers to start from the
original definitions, common notions, and postulates, stated or
implied, of Euclid.
II. PARALLEL LINES
2. We assume Euclid's fundamentals with the exception of
the parallel postulate, or make Veblen's assumptions I-XII and
XIV. The first twenty-eight propositions of the first book of
Euclid (Veblen, VIII) are then true. We proceed to give a
definition of parallel lines more general than that of Euclid.
[FIG. 1.]
Let PQ (Fig. 1) be any straight line and A any point not
on PQ. Through A there passes a set of lines intersecting PQ,
since any point on PQ may be joined to A. It is conceivable
that there may be other lines through A which do not intersect
PQ. In that case, there will be lines such as AL and AK, not
intersecting PQ and forming the boundaries of the set of lines
which meet PQ. Such lines are said to be parallel to PQ.
Otherwise expressed: Let AB be any line through A intersecting PQ. The line AL is said to be parallel to PQ at the point
A, if
(1) AL does not intersect PQ no matter how far produced.
(2) Any line through A in the angle opening BAL does intersect PQ.

* Monograph No. I.

It is evident that this definition considers only those portions
of the lines AL and PQ which lie on the same side of AB. In
other words, the directions of the lines are important. We
shall indicate the directions of parallel lines in the usual way
by the order in which the letters at these extremities are named.
Thus we shall say that AL is parallel to PQ and AK is parallel
to QP.
The line AB may be any line through A intersecting PQ.
It is often convenient, however, to use the line AH perpendicular to PQ. We may then show that
∠HAK = ∠HAL.
For if ∠HAK were greater than ∠HAL, we could draw
AC meeting QP in C so that ∠HAC = ∠HAL. Now take C'
on HQ so that HC' = HC and connect A and C'. By Euclid, I,
4 (Veblen, Theorem 32) the triangles HAC and HAC' are congruent, and hence
∠HAC' = ∠HAC = ∠HAL.
This is impossible, since AL is parallel to HQ. Hence
∠HAK cannot be greater than ∠HAL. In like manner,
∠HAL cannot be greater than ∠HAK. Hence ∠HAK =
∠HAL.
The angle HAL is called the angle of parallelism for the
distance AH.
In the definition, the point A plays apparently a unique
role. We shall show this to be unessential by the theorem of
the next section.
3. A straight line maintains the property of parallelism at all
its points.
Let AK (Fig. 2) be parallel to BQ at the point A and let
A1 be any point on AK. We wish to show that AK is parallel
to BQ at the point A1.
Connect A1 and B and draw through A1 any line A1C in
the angle opening BA1K. Take D any point on A1C and
connect D with A. The line AD prolonged will meet BQ at
some point F since AK is parallel to BQ. Hence A1C will
meet BQ in some point between B and F (Veblen, Theorem 17).
That is, any line through A1 in the angle opening BA1K intersects BQ. But A1K does not intersect BQ. Hence it is parallel
to BQ.
The proof also holds if A1 is taken on the backward
extension of AK, but in that case D must be taken on the
backward extension of A1C.
We shall now show that the property of parallelism is reciprocal.
[FIG. 2.] [FIG. 3.]
4. If a line is parallel to another line the second line is parallel
to the first.
Let LK (Fig. 3) be parallel to PQ. We wish to prove that
PQ is parallel to LK. From A draw a line perpendicular to
LK. This perpendicular will meet PQ at some point B since LK
is parallel to PQ. Draw through B any line BC in the angle
opening QBA. Construct the two angles ABE and ABD so
that
∠ABE = ∠ABD < ½ ∠QBC.
Then            BD = BE,       (Euclid, I, 26)
∠BEK > ∠BDK.        (Euclid, I, 16)
Hence we may draw in the angle BEK a line EF so that
∠BEF = ∠BDK,
and EF will meet BQ since LK is parallel to PQ. Now take
DG=EF and draw BG. Then the two triangles BEF and
BDG are congruent and therefore
∠DBG = ∠EBF.
But                 ∠DBE < ∠QBC.
Therefore           ∠EBG > ∠EBC.
Hence the line BC meets LK at some point between E and
G. But BC is any line through B in the angle opening QBA
and LK and BQ do not meet. Therefore BQ is parallel to LK.
5. If two lines are parallel to a third, they are parallel to each
other.
We distinguish two cases according as the third line lies
between the two lines or not.
[FIG. 4.] [FIG. 5.]
In the first case, let AK and DQ (Fig. 4) be each parallel
to ML. We wish to prove that AK is parallel to DQ. Draw
AC any line through A in the angle opening DAK. AC will
meet ML in some point F since AK is parallel to ML. CF
produced will also meet DQ, since ML and DQ are parallel.
Hence any line through A in the angle opening DAK meets
DQ. On the other hand AK cannot meet DQ since it cannot
meet ML. Hence AK and DQ are parallel.
In the second case, let AK and DQ (Fig. 5) be each parallel
to ML. We wish to prove that AK is parallel to DQ. Draw
through A the line AK' parallel to DQ. Then by the first case
AK' is parallel to ML and hence coincides with AK.




III. THE EUCLIDEAN ASSUMPTION
6. We may replace Postulate 5 of Euclid or Assumption
XIII of Veblen by the following assumption while retaining
all the other assumptions of either author.
Through any point in the plane there goes one and only one
line parallel to a given line.
That one parallel exists, is, in fact, proved in the twenty-eighth proposition of the first book of Euclid (Veblen, VIII).
To assume that only one parallel exists is equivalent to assuming
that in Fig. 1 the lines AL and AK form one and the same
straight line. Hence
∠HAL = ∠HAK = rt. ∠.
[FIG. 6.]
Take now M (Fig. 6), the middle point of AB and draw
MD perpendicular to PQ and intersecting AL in C. Then as
just shown, ∠DCK is a right angle. The two right triangles
AMC and BMD are congruent and ∠CAB = ∠ABD.
Therefore
∠QBA + ∠BAK = 2 rt. ∠s.
By our definition of parallels, any line through A in the
angle BAK meets PQ.
Hence our assumption is equivalent to Euclid's Postulate 5.
From this follows the Euclidean geometry.




IV. THE LOBACHEVSKIAN ASSUMPTION
7. While retaining all the other assumptions of the Euclidean
geometry, we will replace Postulate 5 of Euclid or Assumption
XIII of Veblen by the following assumption due to Lobachevsky.
Through any point in the plane there go two lines parallel to
a given line.
It follows that, in Fig. 6,
∠QBA + ∠BAK < 2 rt. ∠s.
For if the sum of the angles QBA and BAK were greater
than two right angles we could draw through A in the angle
BAK a line not meeting BQ by Euclid I, 28. This is contrary
to our assumption that AK and BQ are parallel.
On the other hand, if the sum of the angles were equal to
two right angles we should have the Euclidean assumption.
8. The following theorems are of vital importance in subsequent proofs.
Theorem I. Let AB and CD be two parallel lines cut by a
third line AC and let A'B' and C'D' be two other parallel lines
cut by a line A'C', and let ∠DCA = ∠D'C'A'; then
(1) If A'C' = AC, ∠C'A'B' = ∠CAB
(2) If A'C' < AC, ∠C'A'B' > ∠CAB
(3) If A'C' > AC, ∠C'A'B' < ∠CAB
Consider first the case A'C'= AC (Fig. 7).
If ∠C'A'B' were less than ∠CAB draw AK so that ∠CAK
= ∠C'A'B'. Then AK meets CD in some point K. Take K' on
C'D' so that C'K' = CK and draw A'K'. Then the triangles ACK
and A'C'K' are congruent (Euclid I, 4) and ∠C'A'K' = ∠CAK
= ∠C'A'B'. This is impossible, since A'B' does not meet C'D'.
Hence ∠C'A'B' cannot be less than ∠CAB. Similarly ∠CAB
cannot be less than ∠C'A'B' and hence ∠C'A'B' = ∠CAB.
[FIG. 7.] [FIG. 8.]
Consider secondly the case A'C' < AC (Fig. 8).
On CA take CA" equal to C'A' and draw A"B" parallel
to CD. Then ∠CA"B" = ∠C'A'B', as just shown, and AB
and A"B" are parallel (sec. 5). Therefore, by sec. 7,
∠B"A"A + ∠A"AB < 2 rt. ∠s.
But           ∠B"A"A + ∠CA"B" = 2 rt. ∠s,
whence        ∠A"AB < ∠CA"B";
that is       ∠CAB < ∠C'A'B'.
The third case, A'C' > AC, is handled like the second case.
Theorem II. Let AB and CD be two parallel lines cut by a
third line AC and let A'B' and C'D' be two other parallel lines cut
by a line A'C', and let ∠CAB = ∠C'A'B' and ∠ACD = ∠A'C'D',
then AC = A'C'.
For each of the suppositions AC < A'C' and AC > A'C'
contradicts Theorem I.
If in Theorem I we take ∠DCA = ∠D'C'A' = rt. ∠, the
angles CAB and C'A'B' are the angles of parallelism for the
distance AC and A'C' respectively (sec. 2). Theorem I includes
then, as a special case, the following:
Theorem III. The angle of parallelism is fixed for a fixed
distance and decreases as the distance increases.
If we denote the distance AH (Fig. 1) by p, the angle of
parallelism HAL is denoted in Lobachevsky's notation by Π(p).
Theorem III asserts that Π(p) is a decreasing function of p.
The exact determination of Π(p) will be given in sec. 33. We
may note, however, that Π(p) is always less than a right angle.
In other words,
Theorem IV. If two lines have a common perpendicular they
neither intersect nor are parallel.
The converse of IV is also true, as we shall now show.
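
Theorem III may be illustrated numerically. The sketch below assumes the classical closed form Π(p) = 2 arctan(e^(-p/k)), which is not derived in this section (the text defers the exact determination to sec. 33). The function name `angle_of_parallelism` and the length unit `k` are illustrative assumptions, not the author's notation.

```python
import math


def angle_of_parallelism(p, k=1.0):
    """Angle of parallelism for the distance p.

    Assumes the classical formula Pi(p) = 2*atan(exp(-p/k));
    k is an assumed unit of length, not a quantity from the text.
    """
    if p < 0:
        raise ValueError("the distance p must be non-negative")
    return 2.0 * math.atan(math.exp(-p / k))


if __name__ == "__main__":
    # The printed angles decrease as p increases and stay below 90 degrees.
    for p in (0.1, 0.5, 1.0, 2.0, 5.0):
        degrees = math.degrees(angle_of_parallelism(p))
        print(f"p = {p:4.1f}   Pi(p) = {degrees:7.3f} degrees")
```

Under this assumed formula the angle tends to a right angle as p tends to zero and decreases steadily as p grows, in agreement with Theorem III.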




9. Two straight lines which neither intersect nor are parallel
have a common perpendicular.*
Let LM and EF (Fig. 9) be two straight lines which neither
intersect nor are parallel. We wish to show that they have a
common perpendicular. Take A and B any two points on LM
and draw AH and BK perpendicular to EF. If AH=BK the
existence of a common perpendicular to LM and EF follows
quickly, as shown below. Suppose then that
BK < AH.
Draw KS parallel to LM. Place † the rt. ∠ FKB on the
rt. ∠ FHA so that K falls on H, KF takes the direction HF
and KB takes the direction HA. The point B falls at B' between
H and A, BM takes the position B'M' and KS the position
HS', parallel to B'M'.
[FIG. 9.]
Since ∠FKS = ∠FHS' a line parallel to KS (and hence to
LM) drawn through H lies in the angle opening FHS' (sec. 7).
Hence HS' intersects LM and therefore B'M' intersects LM
at some point P (Veblen, Theorem 17).
Draw PR perpendicular to EF. Place the right angle FHB'
on the right angle FKB. Then the line PR takes the position
QT, where QT = PR and QT is perpendicular to EF.
Take now W halfway between R and T and draw WV
perpendicular to EF. Fold the figure TWV on WV. Then
T falls on R, TQ coincides with RP, and ∠WVQ coincides with
∠WVP. Hence WV is the required common perpendicular
to EF and LM.

* The proofs in this and the following section are due to Hilbert, Neue
Begründung der Bolyai-Lobatschefskyschen Geometrie, Math. Ann., Vol.
LVII.
† Here and subsequently, we use the principle of superposition to abbreviate the proof. The theorems on congruence may of course be employed
without the aid of any idea of mechanical motion.

10. Any angle is an angle of parallelism belonging to a certain
distance.
Let KAE (Fig. 10) be any given angle α. We wish to find a
distance p for which α is the angle of parallelism.
[FIG. 10.]
Construct ∠LAE = α and on AK and AL take two points B
and C so that AB=AC. Connect B and C and draw BL'
parallel to CL and CK' parallel to BK.
Draw also CF bisecting ∠LCK' and
BG bisecting ∠KBL'. It is evident
that the figure is symmetric with respect
to the line AE.
The lines CF and BG cannot intersect. For if they did intersect at
a point T, we could draw TS parallel
to AL and BL' and then, since
∠LCT = ∠L'BT and CT = BT, we should have ∠STC = ∠STB
(sec. 8, I) which is impossible.
Also CF and BG cannot be parallel, for if they were, since
∠LCF = ∠L'BG and ∠CNL' = ∠BNF, we should have CN =
NB (sec. 8, II) and therefore ∠NCB = ∠NBC = ∠K'CB,
which is impossible.
Since FC and BG neither intersect nor are parallel, they
have (sec. 9) a common perpendicular UV, which is also, by
the symmetry of the figure, perpendicular to AE at H. We
assert that UV is parallel to AK.
If UV were not parallel to AK we could draw from each
of the points U and V a line parallel to AK and CK'. Since
CU=BV and ∠UCK' = ∠VBK, these two parallels would
make equal angles with UV (sec. 8, I) which is impossible.
Hence the angle KAE is the angle of parallelism for the
distance AH.
11. Two parallel lines approach each other continually and their
distance apart eventually becomes less than any assigned quantity.




[FIG. 11.] [FIG. 12.]
Let LK and PQ (Fig. 11) be two parallel lines, and A and
B two points on LK, the point B lying from A in the direction
of parallelism. From A and B draw AH and BM perpendicular to PQ. We wish to prove BM < AH.
Take R half way between H and M and draw the line RC
perpendicular to PQ. The angle RCB is less than a right angle,
since it is an angle of parallelism. Therefore ∠RCB < ∠RCA.
Hence, if the quadrilateral RMBC is folded over on RC as an
axis, the line MB takes the position HB' where MB = HB' < HA.
Hence the lines LK and PQ continually approach each other.
To prove the second part of the theorem, let AK and HQ
(Fig. 12) be any two parallel lines and AH a perpendicular from
A to HQ. Let e be any assigned quantity and lay off on AH
the distance HD < e. Draw DL parallel to HQ and AK. Then
∠HDL < rt. ∠. Hence the line DE drawn from D perpendicular to AH will meet AK in some point C. From C draw
CM perpendicular to HQ. Now ∠MCD > ∠MCK, for ∠MCK
is the angle of parallelism for the distance CM, and the lines CD
and MH neither intersect nor are parallel, since they have a
common perpendicular (sec. 8).
Hence if the quadrilateral MHDC is folded over on MC as
an axis, it takes the position MH'D'C where CK lies between
CD' and MQ. Then CK meets H'D' in some point K' where
H'K' < H'D' = HD. Hence H'K' < e.
12. If two lines are not parallel they will diverge if sufficiently
far produced, and their distance apart will eventually become
greater than any assigned quantity.
Consider first two intersecting straight lines AM and AN
(Fig. 13). Let B and D be two points on AM such that AD > AB,
and let BC and DE be drawn perpendicular to AN. We wish
to prove DE > BC.
Suppose if possible that DE=BC. Then a line drawn perpendicular to AN at the middle point of CE would be also
perpendicular to AM, which is impossible, since AM and AN
intersect (sec. 8, IV).
Suppose, if possible, that DE < BC. Take AF less than
each of the distances DE and AB and draw FG perpendicular
to AN. Then FG < AF < DE. But BC > DE. Hence at some
point K between G and C there is a perpendicular HK such
that HK = DE. But this is impossible, as just shown. Therefore DE > BC.
[FIG. 13.] [FIG. 14.]
To show that there is no superior limit to the length of ED,
take AH (Fig. 14) so that ∠MAN is the angle of parallelism
for AH (sec. 10) and draw HL perpendicular to AM. Then
AN and HL are parallel. Let a be any quantity, no matter
how large, and take Q on HL so that HQ = 2a. Connect Q and
A, and at E, a point between A and H, draw a line perpendicular
to AH, intersecting AQ in R. We can take E so near H that
RE will differ from HQ by as little as we please and certainly
so that RE > a. But RE will intersect AN in a point D, since
the angle of parallelism for AE is greater than ∠HAN (sec.
8, III). Then DE > RE > a. Since a is any positive number,
there is no superior limit to the length of DE.
Consider now two non-intersecting lines MN and PQ (Fig.
15). At A, any point on MN, draw AK parallel to PQ. Since
AK and MN intersect, their distance apart eventually becomes
greater than any assigned quantity. But the distance between
AK and PQ eventually becomes less than any assigned quantity
(sec. 11). Hence the distance between AN and PQ eventually
becomes greater than any assigned quantity.
[FIG. 15.]
It is of course possible that AN and PQ approach each
other for a time, but they eventually diverge. In fact the
shortest distance between them may be shown to be measured
by their common perpendicular.
V. THE RIEMANNIAN ASSUMPTION
13. There remains the possibility, as discovered by Riemann,
of replacing Euclid's fifth postulate by the assumption:
Through a point of the plane no line can be drawn parallel
to a given line.
In other words all lines of the pencil with its vertex at A
(Fig. 1) intersect PQ.
This assumption contradicts proposition 28 of Euclid's
first book, so that it is necessary to modify the assumptions
upon which that theorem depends. Proposition 28 depends
upon proposition 16, which in turn depends upon the tacit
assumption that two straight lines cannot enclose a space.
This assumption is satisfied when applied to objective space
in the domain of experience. We will accordingly assume that
the Euclidean assumptions, with the exception of the parallel postulate, are valid in a sufficiently restricted portion of space, that is,
in a portion of space in which no straight line can be drawn of
greater length than some fixed line of length M.
We may proceed similarly with the Veblen assumptions.
Let [S] be our space, for which all the assumptions, except II,
are made. And let [So] be a subset of points of [S] for which,
in addition, Assumption II is made. Then in [So] we have all
the theorems proved by Veblen, and in [S] those theorems which
do not depend upon Assumption II. The assumptions and
theorems concerning congruence enable us to compare geometric
configurations in [So] with others which lie outside of [So]. In
particular, the theorems on the congruence of triangles are
independent of the positions of the triangles.
With this preparation, we may proceed to examine the
results of the Riemannian assumption.
14. All lines perpendicular to the same straight line meet in
a point at a constant distance from the straight line.
[FIG. 16.]
Let LK (Fig. 16) be any straight line and A and B any two
points upon it. By the Riemannian hypothesis AO and
BO, perpendicular to LK, meet
in a point O. Since it is conceivable that the perpendiculars
may meet more than once we
assume explicitly that the two
perpendiculars have no common
point on the segment AO or
BO. We assume also that the
triangle ABO lies in the region
[So] of sec. 13, so that in particular only one straight line
can be drawn from O to any point of the segment AB.
Since
∠BAO = ∠ABO,
BO = AO
by Euclid, I, 6.
Construct ∠BOM = ∠AOB. Then by the Riemannian
hypothesis the line OM meets LK in a point C. The triangle BOC
has two angles and an included side congruent respectively to
two angles and an included side of the triangle AOB. Hence
∠BCO = ∠ABO = rt. ∠
and
OC = OB = OA.
By repeating this demonstration, we prove that if P is a
point on LK such that
AP = m · AB,
where m is a positive integer, the line OP is perpendicular to
LK at P and PO=AO. But only one perpendicular can be
drawn at P to LK. Hence this perpendicular passes through O.
Now take D, so that
AB = n · AD,
where n is a positive integer, and draw a line perpendicular to
LK at D. If this perpendicular should intersect either BO or
AO at a point O' in the segments BO or AO, then, by the
demonstration just finished, BO and AO would also intersect
at O', contrary to hypothesis. Hence this perpendicular passes
through O and DO=AO.
It follows that if P is any point on LK such that
AP = (m/n) · AB,
where m and n are positive integers, the perpendicular at P
to LK passes through O and PO = AO. Also, since by hypothesis, only one straight line can be drawn from P to O, the line
PO is perpendicular to LK.
Now let P1 be a point such that
AP1 = λ · AB,
where λ is an irrational number. Take P such that AP = (m/n) · AB,
draw OP and OP1, and let m/n pass through rational values
approaching λ as a limit. Then
∠AP1O = lim ∠APO,     P1O = lim PO.
But APO is always a right angle and PO is always equal to
AO. Hence ∠AP1O = rt. ∠ and P1O = AO.
Our theorem is therefore proved for the line LK. If L'K'
is any other line, we may take A' and B' any two points on it,
and draw the perpendiculars A'O' and B'O', intersecting at O'.
Take AB on LK so that AB=A'B'. The two triangles ABO
and A'B'O' are congruent and A'O'= AO. The distance AO is
therefore independent of the line LK or of the position of the
point A on the line. We will place OA = J.
A corollary of our theorem is that all straight lines are of
constant length. For, from the proof we have used, it is evident
that, if P is any point on AB,
AP / AB = ∠AOP / ∠AOB.
Now if ∠AOP = 2π, the line OA coincides with OP, and AP
becomes l, the total length of the line. Then
l = 2π · AB / ∠AOB.
15. All lines which pass through a point O meet again in a
point O1 such that the distance OO1 is constant.
Let O (Fig. 16) be any point and OA any line through O.
Take OA = J (sec. 14) and draw LK perpendicular to AO. Let
OB be any other line through O intersecting LK in B. Then
OB is perpendicular to LK (sec. 14). Prolong AO to O1 so
that AO1 = AO and draw O1B. The triangles AOB and AO1B
are congruent, since two sides and the included angle of one
are congruent respectively to two sides and the included angle
of the other. Hence
∠ABO1 = ∠ABO = rt. ∠,
and                 O1B = OB = OA.
Therefore the line OBO1 is a straight line and
OO1 = 2J.
Since all lines are of finite length (sec. 14) any line through
O returns through O1 to O. Two cases are usually considered.
First, the point O1 may coincide with O. The total length
of a straight line is then 2J and any two lines have only one
point in common.
Secondly, the point O1 may be distinct from O, but the lines
OO1 continued through O1 meet again in O. The total length
of a line is then 4J and two lines meet in two points. The
Riemannian geometry in this case is the same as the geometry
on the surface of a sphere.
VI. THE SUM OF THE ANGLES OF A TRIANGLE
16. Consider any triangle ABC (Fig. 17). Take E, the
middle point of AB, F the middle
point of AC, and draw a straight
line EF. From A, B, and C draw
the lines AG, BK, and CL perpendicular to EF.
[FIG. 17.]
In the right triangles AEG and
EBK, EA = EB and ∠GEA = ∠BEK.
Hence the two triangles are congruent and
BK = AG, ∠KBE = ∠GAE.
Similarly, the right triangles AGF and FLC are congruent and
AG = CL, ∠FCL = ∠GAF.
If we define equivalent figures as those which may be divided
into parts which are congruent in pairs, it appears that the triangle
ABC is equivalent to the quadrilateral BCLK. Also, the sum
of the angles of the triangle ABC is equal to the sum of the
angles KBC and LCB of the quadrilateral BCLK.
This quadrilateral BCLK has two right angles, L and K,
and two equal sides, KB and LC, adjacent to the right angles
and opposite to each other. Such a figure we shall call an
isosceles birectangular quadrilateral.
The study of the sum of the angles and of the area of a
triangle is thus reduced to the study of an equivalent isosceles
birectangular quadrilateral.
17. Let ABCD (Fig. 18) be an isosceles birectangular quadrilateral with right angles at A and B. For convenience, we
shall call AB the base, CD the summit, and C and D the
summit angles of the quadrilateral.
[FIG. 18.] [FIG. 19.]
Take L the middle point of the base and draw LK perpendicular to the base. Fold LBDK on LK as an axis. It is
clear that the point D falls on C. Hence, the summit angles of
an isosceles birectangular quadrilateral are equal. Also, LK is
perpendicular to CD at its middle point K and the quadrilateral
LBDK has three right angles.
Now through H, the middle point of LK, draw EF perpendicular to LK. Fold HFDK on HF as an axis. The point D
will fall at B', B, or B" according as KD is less than, equal to,
or greater than LB. In these three cases ∠D is greater than,
equal to, or less than, ∠B respectively. Hence:
Each summit angle of an isosceles birectangular quadrilateral
is less than, equal to, or greater than, a right angle, according as
the summit of the quadrilateral is greater than, equal to, or less
than, the base.
18. In the Euclidean geometry each summit angle of an isosceles
birectangular quadrilateral is equal to a right angle.
This is a familiar proposition of the Euclidean geometry
and need not be proved here. We shall prove, however, the
following theorem:
In the Lobachevskian and Riemannian geometries, a summit angle of an isosceles birectangular quadrilateral cannot equal a
right angle.
Let ABCD (Fig. 19) be an isosceles birectangular quadrilateral with right angles at A and B. If possible, suppose
∠C = ∠D = rt. ∠. Then (sec. 17) CD = AB. Take two points
M and N on AC and BD respectively so that CM=DN and
draw MN.
Then ABMN is an isosceles birectangular quadrilateral with
right angles at A and B. MNDC is an isosceles birectangular
quadrilateral with right angles at C and D. Then MN must
be perpendicular to AC and BD or we should have, by sec. 17,
MN greater than one and less than the other of the two equal
lines AB and CD, which is absurd.
Since M is any point between A and C it appears that the
segments AC and BD are equidistant. By prolonging the lines
AC and BD and considering congruent segments, it appears that
the lines AC and BD are equidistant throughout their extent.
Since this is impossible in the Lobachevskian and Riemannian
geometries (secs. 11, 12, 13) the theorem is proved.
19. Each summit angle of an isosceles birectangular quadrilateral is less than a right angle in the Lobachevskian geometry
and greater than a right angle in the Riemannian geometry.
In Fig. 18, the line CK measures the distance of the line
AC from the line LK at the point C. In the Lobachevskian
geometry, if the line AC is taken sufficiently long, CK > AL
(sec. 12). If, therefore, CK were in any position less than AL,
there would exist at least one other position in which CK = AL.
This is impossible (sec. 18) and hence CK is always greater
than AL and the angle C less than a right angle (sec. 17).
In the Riemannian geometry the lines AC and LK eventually
intersect. Hence, if AC is sufficiently long CK < AL, and therefore CK is always less than AL and the angle C greater than a
right angle.
20. In the Euclidean, Lobachevskian, and Riemannian geometries respectively the sum of the angles of a triangle is equal to,
less than, and greater than, two right angles.
We have seen in sec. 16 that the sum of the angles of a
triangle is equal to that of the summit angles of an isosceles
birectangular quadrilateral.
The theorem then follows from secs. 18, 19.
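
The theorem of sec. 20 may be checked numerically on a triangle with given sides. The sketch below is only an illustration: it assumes the standard laws of cosines for the three geometries with the unit of length chosen so that the curvature constant is 1 (these formulas are not derived in this section), and the function name `angle_sum` and the geometry labels are hypothetical.

```python
import math


def angle_sum(a, b, c, geometry="euclidean"):
    """Sum of the angles of a triangle with sides a, b, c.

    Assumes the standard laws of cosines with unit curvature constant:
      euclidean:     c^2 = a^2 + b^2 - 2ab cos C
      lobachevskian: cosh c = cosh a cosh b - sinh a sinh b cos C
      riemannian:    cos c = cos a cos b + sin a sin b cos C
    """
    def angle(opposite, s1, s2):
        if geometry == "euclidean":
            cos_val = (s1 ** 2 + s2 ** 2 - opposite ** 2) / (2 * s1 * s2)
        elif geometry == "lobachevskian":
            cos_val = ((math.cosh(s1) * math.cosh(s2) - math.cosh(opposite))
                       / (math.sinh(s1) * math.sinh(s2)))
        elif geometry == "riemannian":
            cos_val = ((math.cos(opposite) - math.cos(s1) * math.cos(s2))
                       / (math.sin(s1) * math.sin(s2)))
        else:
            raise ValueError("unknown geometry: " + geometry)
        # Clamp against rounding error before taking the arc cosine.
        return math.acos(max(-1.0, min(1.0, cos_val)))

    return angle(a, b, c) + angle(b, c, a) + angle(c, a, b)


if __name__ == "__main__":
    for name in ("euclidean", "lobachevskian", "riemannian"):
        total = angle_sum(1.0, 1.0, 1.0, name)
        print(f"{name:14s} angle sum = {math.degrees(total):8.3f} degrees")
```

For the equilateral triangle with unit sides the three sums come out equal to, less than, and greater than two right angles respectively.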




VII. AREAS
21. According to the definition already given (sec. 16) two
polygons are equivalent, or equal in area, if they can be divided
into the same number of triangles which are congruent in pairs.
We have proved (sec. 16) that a triangle is equivalent to an
isosceles birectangular quadrilateral having its summit equal
to one side of the triangle, and each summit angle equal to half
the sum of the angles of the triangle.
Now, in either the Lobachevskian or the Riemannian geometry an isosceles birectangular quadrilateral is fully determined
by its summit and summit angles, for if CDEF (Fig. 20)
and ABEF are two isosceles birectangular quadrilaterals with
the same summit EF and the same summit angles E and F,
their bases CD and AB must coincide. Otherwise, the quadrilateral ABCD would have four right angles, which is impossible
(sec. 18). Hence follows the theorem:
[FIG. 20.] [FIG. 21.]
In the Lobachevskian and Riemannian geometries, two triangles
are equivalent if a side and the sum of the angles of one are equal
to a side and the sum of the angles of another.
22. A triangle may be constructed having the same area
and the same angle sum as a given triangle, and having one
side arbitrarily assumed within certain wide limits.
Let ABC (Fig. 21) be a given triangle and BCKL the
isosceles birectangular quadrilateral constructed as in sec. 16.
Let l be a given length. With ½ l as a radius and B as a center
describe an arc of a circle cutting KL in M. Connect B and
M and prolong BM to A' so that MA' = BM. Connect A' and
C. Then A'BC is the required triangle, as is readily shown.
That the construction may be possible, it is necessary, on
the one hand, that BM > BK, a condition which is certainly
met if l > AB.
On the other hand, it is necessary, in the Riemannian
geometry, that l should be less than the constant 2J (sec. 15).
If, now, we have two triangles with the same angle sum, we
may take l greater than one side of each, and replace each by
an equivalent one with the same angle sum and a side equal
to l. The two new triangles are equivalent (sec. 21). Hence:
Any two triangles with the same angle sum are equivalent.
23. Consider any triangle ABC (Fig. 22) and draw from A
a straight line to any point D of the base.
We shall call this line a transversal and
shall say that the triangle is divided
transversally. Now if s is the sum of the
angles of the triangle ABC, and s1 and
s2 the sums of the angles of the triangles
ABD and ADC respectively, we have
s = s1 + s2 - 2 rt. ∠s.
[FIG. 22.]
If we adopt such a unit of angle measure that a right angle
shall have the measure π/2, the above equation may be written
in either of the forms
π - s = (π - s1) + (π - s2)
or
s - π = (s1 - π) + (s2 - π).
In the Lobachevskian geometry, π - s is positive (sec. 20) and
is called the defect of the triangle. In the Riemannian geometry
s - π is positive and is called the excess of the triangle. Hence
we may state the theorem:
If a triangle is divided transversally, the sum of the defects,
or excesses, of the parts, is equal to the defect, or excess, of the
triangle.
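
The additivity of the excess under a transversal division can be verified numerically on the unit sphere, which sec. 15 identifies with one form of the Riemannian geometry. The following is a minimal sketch under that assumption; vertices are represented as unit vectors, and all function names (`excess`, `point_on_side`, and the helpers) are illustrative.

```python
import math


def _unit(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def _angle_at(a, b, c):
    """Angle at vertex a of the spherical triangle abc (unit vectors)."""
    # Tangent directions at a along the great circles toward b and toward c.
    tb = _unit(tuple(bi - _dot(a, b) * ai for ai, bi in zip(a, b)))
    tc = _unit(tuple(ci - _dot(a, c) * ai for ai, ci in zip(a, c)))
    return math.acos(max(-1.0, min(1.0, _dot(tb, tc))))


def excess(a, b, c):
    """Angle sum of the spherical triangle abc minus two right angles."""
    s = _angle_at(a, b, c) + _angle_at(b, c, a) + _angle_at(c, a, b)
    return s - math.pi


def point_on_side(b, c, t):
    """A point D on the great-circle arc BC, for 0 < t < 1."""
    return _unit(tuple((1 - t) * bi + t * ci for bi, ci in zip(b, c)))


if __name__ == "__main__":
    A, B, C = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    D = point_on_side(B, C, 0.3)          # foot of the transversal AD
    whole = excess(A, B, C)
    parts = excess(A, B, D) + excess(A, D, C)
    print("excess of ABC        :", whole)
    print("sum of parts ABD+ADC :", parts)
```

The two printed values agree to rounding error, as the theorem requires.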




The theorem evidently remains true if the triangle is further
subdivided by successive transversals of the parts, as shown,
for example, in Fig. 23. Further, Hilbert * has shown that any division of a triangle
may be reduced to transverse divisions.
[FIG. 23.]
We have accordingly the more general theorem:
In the Lobachevskian and Riemannian geometries the defect, or excess, of any triangle is equal to the sum of the defects, or
excesses, of triangles which are formed from it by any system
of division.
24. Since equivalent triangles may be divided into the same
number of triangles congruent in pairs (sec. 21), and since
obviously congruent triangles have the same defect, or excess,
it follows that any two equivalent triangles have the same defect, or
excess. The converse theorem has been proved in sec. 22.
We are now enabled to take the defect, or excess, of a triangle
as the measure of its area, since the essential properties of a
measure of area are that two triangles with the same area have
the same measure, that two triangles with the same measure
have the same area, and that the measure of a whole is the
sum of the measure of its parts. Hence we may say:
In the Lobachevskian geometry the area of a triangle is equal
to a constant times its defect. In the Riemannian geometry, the
area of a triangle is equal to a constant times its excess.
The value of the constant depends evidently upon the unit
of area employed.
The area of a polygon is found by dividing it into triangles.


* Grundlagen der Geometrie, Vol. VII of Wissenschaft und Hypothese,
Leipzig, 1909.




VIII. NON-EUCLIDEAN TRIGONOMETRY
25. The definitions of the trigonometric functions as given
in the elementary trigonometry are evidently not available
in non-Euclidean geometries, since these definitions are based
upon properties of similar triangles which are true only in the
Euclidean geometry.
Lobachevsky met this difficulty by the construction of a limit-surface, or horisphere, on which the Euclidean geometry and trigonometry are valid at the same time that the Lobachevskian geometry is valid on the plane. By the aid of this surface and the sphere he obtained the formulas which will be found in sec. 34.
This method, however, cannot be applied to the Riemannian geometry. We shall therefore follow a more general method which has also the advantage of operating entirely in the plane. The method, however, is not as elementary as the other, and we shall be obliged to state some results without proof and to give a mere outline of other proofs.*
We start with the purely analytic definitions of the trigonometric functions. That is, $e^x$ being defined by the series
$$e^x = 1 + \frac{x}{1} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$
the trigonometric functions are defined by the equations
$$\sin x = \frac{e^{ix} - e^{-ix}}{2i}, \qquad \cos x = \frac{e^{ix} + e^{-ix}}{2}, \qquad \tan x = \frac{e^{ix} - e^{-ix}}{i\,(e^{ix} + e^{-ix})},$$
where $i = \sqrt{-1}$. These functions obey all the formulas of trigonometry and if x is a real number they are real.


* For complete proofs and historical notes consult Coolidge, The Elements
of Non-Euclidean Geometry, Oxford, 1909, especially Chapter IV, where all
the requirements of rigor are met.




If x is pure imaginary, the above equations lead to the hyperbolic functions, which are defined by the following equations:
$$-i \sin ix = \frac{e^{x} - e^{-x}}{2} = \sinh x,$$
$$\cos ix = \frac{e^{x} + e^{-x}}{2} = \cosh x,$$
$$-i \tan ix = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \tanh x.$$
If x is real, the hyperbolic functions are real, and formulas for their use are readily obtained, if needed, from the trigonometric functions.
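The passage from the trigonometric to the hyperbolic functions may be verified numerically. The sketch below, in Python with the standard `cmath` module, builds the sine and cosine from the exponential definitions and compares them with the library values; the helper names `sin_exp` and `cos_exp` and the sample value of x are our own.

```python
import cmath
import math

def sin_exp(z):
    """sin z from the exponential definition (e^{iz} - e^{-iz}) / 2i."""
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j

def cos_exp(z):
    """cos z from the exponential definition (e^{iz} + e^{-iz}) / 2."""
    return (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2

x = 0.7  # arbitrary real sample value
real_case = cos_exp(x)                 # compare with math.cos(x)
cosh_case = cos_exp(1j * x)            # compare with math.cosh(x)
sinh_case = -1j * sin_exp(1j * x)      # compare with math.sinh(x)
print(real_case, cosh_case, sinh_case)
```

For a pure imaginary argument the exponentials become real, which is the reason the results reduce to cosh x and sinh x.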
The following properties of cos x are important for us:
If $\cos x < 1$, x is real; if $\cos x > 1$, x is pure imaginary, except perhaps for multiples of the period $2\pi$ which may always be added.
If we place $\cos mx = f(x)$, $f(x)$ satisfies the functional equation
$$f(x + y) + f(x - y) = 2 f(x) f(y).$$
Conversely, if $f(x)$ is a continuous function of x satisfying the above equation, then $f(x) = \cos mx$, m being a constant, real or complex.
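A short numerical check of the direct statement (not of the converse) may be helpful. This is only a sketch: the function names and the sample values of x, y, and m are ours, and the three values of m stand for the real, the pure imaginary, and the general complex case.

```python
import cmath

def f(x, m):
    """The function f(x) = cos(m x), with m real or complex."""
    return cmath.cos(m * x)

def residual(x, y, m):
    """f(x+y) + f(x-y) - 2 f(x) f(y); this vanishes when f(x) = cos(m x)."""
    return f(x + y, m) + f(x - y, m) - 2 * f(x, m) * f(y, m)

samples = [1.0, 1j / 2.0, 0.3 + 0.2j]   # real, pure imaginary, complex m
residuals = [abs(residual(0.4, 1.1, m)) for m in samples]
print(residuals)   # each entry is zero up to rounding error
```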
26. The sine and cosine of an acute angle may be defined
as follows. The extension to angles of any size is then made
as in the ordinary trigonometry.
Let A (Fig. 24) be any acute angle and MP the perpendicular from any point P of one side to the other side, then it may be shown that $\frac{AM}{AP}$ approaches a limit as AP approaches zero, and that $\lim \frac{AM}{AP}$ is a continuous function of A, which satisfies the functional equation of sec. 25. Hence
$$\lim \frac{AM}{AP} = \cos mA.$$
Since AM < AP, the coefficient m is real, and if we adopt a system of measurement of angle by which a right angle has the measure $\frac{\pi}{2}$, we may place m = 1. Hence, finally,
$$\lim \frac{AM}{AP} = \cos A.$$
In a similar manner
$$\lim \frac{MP}{AP} = \sin A.$$
FIG. 24. FIG. 25.
27. Let AC (Fig. 25), be any straight line of given length
a. From A draw AB perpendicular to AC and take AB any
length. At B draw BD perpendicular to AB, take BD=AC,
and complete the isosceles birectangular quadrilateral ABCD.
Then it may be shown that $\frac{CD}{AB}$ approaches a limit, as AB approaches zero, and that this limit is a continuous function of a, satisfying the functional equation of sec. 25. Hence
$$\lim \frac{CD}{AB} = \cos ma.$$
In the Lobachevskian geometry, CD > AB and m is pure imaginary. In this case, we place $m = \frac{i}{k}$, where k is real, and have
$$\lim \frac{CD}{AB} = \cos \frac{ia}{k} = \cosh \frac{a}{k}.$$
In the Riemannian geometry, CD < AB and m is real. In this case, we place $m = \frac{1}{k}$, and have
$$\lim \frac{CD}{AB} = \cos \frac{a}{k}.$$
There appears here a striking property of the non-Euclidean geometries in the existence of functions of distances analogous to functions of angles. The constant k depends upon the unit of distance employed.
If we apply the construction of this article to the Euclidean geometry we obtain the trivial result $\lim \frac{CD}{AB} = 1$. It is worth noting that this comes out of the previous results by placing $k = \infty$.
28. We shall indicate in this section a method by which a fundamental formula connecting the sides of a right triangle may be obtained.
Let ABC (Fig. 26), be a triangle with the right angle at C and with the sides AB = c, AC = b, CB = a. Take $AA_1$ a small distance on AC and prolong AC to $C_1$ so that
$$AA_1 = CC_1,$$
and construct the triangle $A_1B_1C_1$ congruent to ABC. Let $B_2$ be the point of intersection of $A_1B_1$ and BC. Prolong $B_1A_1$ and BC so that
$$A_1A_2 = B_1B_2,$$
and
$$CC_2 = BB_2,$$
and construct the triangle $A_2B_2C_2$ which differs slightly from ABC.
FIG. 26.
From $B_1$ draw $B_1D_1$ perpendicular to BC, and from B draw BD perpendicular to $A_1B_1$. Also draw $HH_1$, the common perpendicular to AC and $A_2C_2$, and $EE_1$ the common perpendicular to AB and $A_2B_2$. $EE_1$ evidently bisects $AA_1$, and $HH_1$ passes near the middle point of $A_1A_2$. Then as $AA_1$ approaches zero as a limit it may be shown that
$$\lim \frac{BD}{EE_1} = \cos mc,$$
$$\lim \frac{B_1D_1}{CC_1} = \cos ma,$$
$$\lim \frac{CC_2}{HH_1} = \cos mb.$$
In fact, from the definition of sec. 27, the reader will have no difficulty in seeing that these relations are, at least, approximately true. The rigorous demonstration may be found in the book by Coolidge just cited.
We have, then,
$$\frac{\cos mc}{\cos ma \cos mb} = \lim \frac{BD}{EE_1} \cdot \frac{CC_1}{B_1D_1} \cdot \frac{HH_1}{CC_2}$$
$$= \lim \frac{BD}{EE_1} \cdot \frac{AA_1}{B_1D_1} \cdot \frac{HH_1}{BB_2}$$
$$= \lim \frac{BD}{BB_2} \cdot \frac{AA_1}{EE_1} \cdot \frac{HH_1}{A_1A_2} \cdot \frac{B_1B_2}{B_1D_1}.$$
Now it may be shown that
$$\sin B = \lim \frac{BD}{BB_2} = \lim \frac{B_1D_1}{B_1B_2},$$
$$\sin A = \lim \frac{EE_1}{AA_1} = \lim \frac{HH_1}{A_1A_2},$$
as may be seen approximately from Fig. 26, and the definition of sec. 26.
We have, accordingly,
$$\frac{\cos mc}{\cos ma \cos mb} = \sin B \cdot \frac{1}{\sin A} \cdot \sin A \cdot \frac{1}{\sin B},$$
or
$$\cos mc = \cos ma \cos mb.$$
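The relation $\cos mc = \cos ma \cos mb$ takes the place of the Pythagorean theorem. As a minimal sketch (the function name, the string labels, and the numerical values are ours), the hypotenuse may be computed in Python for $m = 1/k$ and for $m = i/k$, the latter turning the cosines into hyperbolic cosines as in sec. 27.

```python
import math

def hypotenuse(a, b, k, geometry):
    """Hypotenuse c of a right triangle with legs a, b from cos(mc) = cos(ma) cos(mb).

    geometry = "riemannian" uses m = 1/k; "lobachevskian" uses m = i/k,
    for which cos(i t) = cosh(t).
    """
    if geometry == "riemannian":
        return k * math.acos(math.cos(a / k) * math.cos(b / k))
    if geometry == "lobachevskian":
        return k * math.acosh(math.cosh(a / k) * math.cosh(b / k))
    raise ValueError("unknown geometry: " + geometry)

# Legs 0.3 and 0.4 with k = 1: the Euclidean hypotenuse would be 0.5.
c_riem = hypotenuse(0.3, 0.4, 1.0, "riemannian")
c_lob = hypotenuse(0.3, 0.4, 1.0, "lobachevskian")
# Legs 3 and 4 with a large k: both values lie close to the Euclidean 5.
c_riem_large = hypotenuse(3.0, 4.0, 1000.0, "riemannian")
c_lob_large = hypotenuse(3.0, 4.0, 1000.0, "lobachevskian")
print(c_riem, c_lob, c_riem_large, c_lob_large)
```

For moderate k the Riemannian hypotenuse falls short of the Euclidean value and the Lobachevskian one exceeds it; for large k both approach it.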




29. Let ABC (Fig. 27) be a triangle with right angle at C and AC = b, BC = a, AB = c. Take any point D on AC and draw BD, and DE perpendicular to AB. Let BD = l, DE = p, AE = q, AD = k. Then (sec. 28)
$$\cos ml = \cos ma \cos m(b - k) = \cos mc \cos mk + \cos ma \sin mb \sin mk,$$
$$\cos ml = \cos mp \cos m(c - q) = \cos mc \cos mk + \cos mp \sin mc \sin mq,$$
whence
$$\cos ma \sin mb \sin mk = \cos mp \sin mc \sin mq.$$
FIG. 27.
By use of the relations $\cos mc = \cos ma \cos mb$ and $\cos mk = \cos mp \cos mq$, we find readily
$$\frac{\tan mb}{\tan mc} = \frac{\tan mq}{\tan mk}.$$
Now as k approaches zero as a limit, q approaches zero also, and
$$\lim \frac{\tan mq}{\tan mk} = \lim \frac{q}{k} = \cos A.$$
Hence
$$\frac{\tan mb}{\tan mc} = \cos A.$$
30. From the result of sec. 29, we have
$$\sin^2 A = \frac{\tan^2 mc - \tan^2 mb}{\tan^2 mc}$$
$$= \frac{\sin^2 mc - \tan^2 mb \cos^2 mc}{\sin^2 mc}$$
$$= \frac{1 - (1 + \tan^2 mb)\cos^2 mc}{\sin^2 mc}$$
$$= \frac{1 - \dfrac{\cos^2 mc}{\cos^2 mb}}{\sin^2 mc} = \frac{1 - \cos^2 ma}{\sin^2 mc}.$$
Since A is an acute angle, we have
$$\sin A = \frac{\sin ma}{\sin mc}.$$
Similarly
$$\sin B = \frac{\sin mb}{\sin mc}.$$
From sec. 29 and these results we have
$$\frac{\cos A}{\sin B} = \frac{\cos mc}{\cos mb} = \cos ma.$$
Whence
$$\cos A = \cos ma \sin B.$$
Similarly
$$\cos B = \cos mb \sin A.$$
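31. The formulas obtained in secs. 28-30 are applicable to both the Lobachevskian and the Riemannian geometries. For the Riemannian geometry, we place $m = \frac{1}{k}$ and make the following collection of the formulas:
$$\cos \frac{c}{k} = \cos \frac{a}{k} \cos \frac{b}{k},$$
$$\sin \frac{a}{k} = \sin \frac{c}{k} \sin A, \qquad \sin \frac{b}{k} = \sin \frac{c}{k} \sin B,$$
$$\tan \frac{a}{k} = \tan \frac{c}{k} \cos B, \qquad \tan \frac{b}{k} = \tan \frac{c}{k} \cos A,$$
$$\cos A = \cos \frac{a}{k} \sin B, \qquad \cos B = \cos \frac{b}{k} \sin A.$$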
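The Riemannian formulas collected here may be tested on a concrete model. The sketch below is an illustration under an assumption that is ours and not part of the text: the Riemannian plane is modelled by a sphere of radius k (scaled to unit vectors), the right angle C is placed at the pole, and the angles A and B are measured independently from tangent directions. All names and numerical values are hypothetical.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def angle_at(P, Q, R):
    """Angle at vertex P of the spherical triangle PQR (unit vectors)."""
    # Tangent directions at P toward Q and toward R.
    tq = unit(tuple(q - dot(P, Q) * p for p, q in zip(P, Q)))
    tr = unit(tuple(r - dot(P, R) * p for p, r in zip(P, R)))
    return math.acos(dot(tq, tr))

k, a, b = 2.0, 0.9, 1.3                      # sample values; a = CB, b = AC
C = (0.0, 0.0, 1.0)                          # right angle at the pole
A = (math.sin(b / k), 0.0, math.cos(b / k))  # at arc distance b from C
B = (0.0, math.sin(a / k), math.cos(a / k))  # at arc distance a from C
c = k * math.acos(dot(A, B))                 # hypotenuse AB
ang_A = angle_at(A, B, C)
ang_B = angle_at(B, A, C)

residuals = {
    "cos c/k = cos a/k cos b/k": math.cos(c / k) - math.cos(a / k) * math.cos(b / k),
    "sin a/k = sin c/k sin A": math.sin(a / k) - math.sin(c / k) * math.sin(ang_A),
    "tan b/k = tan c/k cos A": math.tan(b / k) - math.tan(c / k) * math.cos(ang_A),
    "tan a/k = tan c/k cos B": math.tan(a / k) - math.tan(c / k) * math.cos(ang_B),
    "cos A = cos a/k sin B": math.cos(ang_A) - math.cos(a / k) * math.sin(ang_B),
}
for name, value in residuals.items():
    print(name, value)   # each residual is zero up to rounding
```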




32. We obtain from secs. 28-30 the formulas for the Lobachevskian geometry by placing $m = \frac{i}{k}$ and replacing the trigonometric functions by the hyperbolic ones. We have
$$\cosh \frac{c}{k} = \cosh \frac{a}{k} \cosh \frac{b}{k},$$
$$\sinh \frac{a}{k} = \sinh \frac{c}{k} \sin A, \qquad \sinh \frac{b}{k} = \sinh \frac{c}{k} \sin B,$$
$$\tanh \frac{a}{k} = \tanh \frac{c}{k} \cos B, \qquad \tanh \frac{b}{k} = \tanh \frac{c}{k} \cos A,$$
$$\cos A = \cosh \frac{a}{k} \sin B, \qquad \cos B = \cosh \frac{b}{k} \sin A.$$
It is worth noticing that the formulas for the Euclidean trigonometry come out of those in sec. 31 or sec. 32 as limiting cases when $k = \infty$ (cf. also sec. 43).
33. The formulas of sec. 32 may be used to obtain an expression for the angle of parallelism $\Pi(x)$ belonging to a distance x.
Let BM (Fig. 28) be parallel to CN and BC perpendicular to CN. The figure NCBM may be regarded as the limit of a right triangle ABC in which BC = x is constant, A approaches zero and B approaches $\Pi(x)$. The formula
$$\cos A = \cosh \frac{a}{k} \sin B \quad (\text{sec. 32})$$
goes over into
$$\sin \Pi(x) = \frac{1}{\cosh \dfrac{x}{k}} = \frac{2}{e^{x/k} + e^{-x/k}},$$
FIG. 28.
whence
$$\cos \Pi(x) = \frac{e^{x/k} - e^{-x/k}}{e^{x/k} + e^{-x/k}} = \tanh \frac{x}{k}.$$
Then
$$\tan \tfrac{1}{2}\Pi(x) = \sqrt{\frac{1 - \cos \Pi(x)}{1 + \cos \Pi(x)}} = e^{-x/k}.$$
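The closed form $\tan \frac{1}{2}\Pi(x) = e^{-x/k}$ lends itself to direct computation. The following Python sketch (the function name and the sample values are ours) evaluates the angle of parallelism and confirms the expressions for its sine and cosine obtained in this section.

```python
import math

def angle_of_parallelism(x, k=1.0):
    """Pi(x) in radians, from tan(Pi(x)/2) = exp(-x/k)."""
    return 2.0 * math.atan(math.exp(-x / k))

x, k = 1.5, 2.0                      # arbitrary sample values
p = angle_of_parallelism(x, k)
sin_residual = math.sin(p) - 1.0 / math.cosh(x / k)
cos_residual = math.cos(p) - math.tanh(x / k)
at_zero = angle_of_parallelism(0.0, k)      # a right angle
far_away = angle_of_parallelism(50.0, k)    # very nearly zero
print(p, sin_residual, cos_residual, at_zero, far_away)
```

The angle is a right angle for x = 0 and decreases steadily toward zero as x increases, while for large k it stays near a right angle, in agreement with the Euclidean limit.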
34. If we substitute in the formulas of sec. 32 the values of $\cosh \frac{x}{k}$ and $\tanh \frac{x}{k}$ found in sec. 33, and make certain simple reductions, the formulas of sec. 32 take the following forms:
$$\sin \Pi(c) = \sin \Pi(a) \sin \Pi(b),$$
$$\tan \Pi(c) = \tan \Pi(a) \sin A,$$
$$\cos \Pi(a) = \cos \Pi(c) \cos B,$$
$$\sin B = \sin \Pi(a) \cos A,$$
$$\tan \Pi(c) = \tan \Pi(b) \sin B,$$
$$\cos \Pi(b) = \cos \Pi(c) \cos A,$$
$$\sin A = \sin \Pi(b) \cos B.$$
These are the forms found by Lobachevsky, except that he writes $A = \Pi(\alpha)$, $B = \Pi(\beta)$, where $\alpha$ and $\beta$ are the distances corresponding to the angles of parallelism A and B respectively. We shall make no use of these equations, but have given them to facilitate comparison with Lobachevsky's own work.
35. The above formulas are for right triangles. We shall now obtain one for oblique triangles.
Let ABC (Fig. 29) be any triangle with the angles A, B, and C, and the opposite sides a, b, and c, respectively. Draw BD perpendicular to AC and let BD = h, AD = k. Then
$$\cos ma = \cos mh \cos m(k - b)$$
$$= \cos mc \cos mb + \sin mb \sin mk \cos mh$$
$$= \cos mc \cos mb + \sin mb \tan mk \cos mc$$
$$= \cos mc \cos mb + \sin mb \sin mc \cos A.$$
FIG. 29.
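The oblique-triangle formula may be put to work as a rule for the third side. In the sketch below (names and sample values are ours), the Riemannian case takes $m = 1/k$ directly. The Lobachevskian case is our own substitution of $m = i/k$ in the same manner as sec. 32, under which each sine contributes a factor i and the product of sines changes sign.

```python
import math

def third_side(b, c, A, k, geometry):
    """Side a opposite the angle A, given the sides b, c that include it."""
    if geometry == "riemannian":
        cos_a = (math.cos(b / k) * math.cos(c / k)
                 + math.sin(b / k) * math.sin(c / k) * math.cos(A))
        return k * math.acos(cos_a)
    if geometry == "lobachevskian":
        # m = i/k: cos -> cosh, and sin(ib/k) sin(ic/k) = -sinh(b/k) sinh(c/k)
        cosh_a = (math.cosh(b / k) * math.cosh(c / k)
                  - math.sinh(b / k) * math.sinh(c / k) * math.cos(A))
        return k * math.acosh(cosh_a)
    raise ValueError("unknown geometry: " + geometry)

b, c, A = 3.0, 4.0, math.pi / 3
euclid = math.sqrt(b * b + c * c - 2 * b * c * math.cos(A))   # sqrt(13)
a_riem = third_side(b, c, A, 1000.0, "riemannian")
a_lob = third_side(b, c, A, 1000.0, "lobachevskian")
# With A a right angle the formula falls back on that of sec. 28.
right = third_side(0.3, 0.4, math.pi / 2, 1.0, "riemannian")
print(euclid, a_riem, a_lob, right)
```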




IX. NON-EUCLIDEAN ANALYTIC GEOMETRY
36. Let OX and OY (Fig. 30) be two axes of coordinates intersecting at right angles and MP and NP the perpendiculars from any point P to OX and OY respectively. We shall take
$$OM = x, \qquad ON = y$$
as the coordinates of P. To every point P corresponds a single set of coordinates (x, y) and to any set of coordinates corresponds not more than one point P. But if x and y are assumed arbitrarily there is not necessarily a corresponding point P in the Lobachevskian geometry, since the two perpendiculars at M and N may be parallel or non-intersecting.
FIG. 30. FIG. 31.
By drawing the line OP, we may take
$$OP = r, \qquad \angle XOP = \theta,$$
as the polar coordinates of P.
Between the two sets of coordinates there exist, in either the Riemannian or the Lobachevskian geometry, the relations (sec. 29)
$$\tan mx = \tan mr \cos \theta,$$
$$\tan my = \tan mr \sin \theta,$$
whence
$$\tan^2 mx + \tan^2 my = \tan^2 mr.$$
37. The equation of a straight line may be obtained as
follows:
Let LK (Fig. 31) be any straight line determined by the parameters p and $\alpha$, where p is the length of the perpendicular OD from the origin and $\alpha$ the angle made by OD with the positive direction of OX. Let P(x, y) be any point on LK and draw OP. Then in the triangle OPD,
$$OD = p, \qquad OP = r, \qquad \angle POD = \theta - \alpha,$$
where $(r, \theta)$ are the polar coordinates of P. Hence (sec. 29)
$$\tan mr \cos(\theta - \alpha) = \tan mp,$$
whence (sec. 36)
$$\tan mx \cos \alpha + \tan my \sin \alpha = \tan mp,$$
the required equation.
38. The distance between two points may be found as
follows:
Let $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ (Fig. 32) be any two points with the polar coordinates $(r_1, \theta_1)$ and $(r_2, \theta_2)$ respectively. Draw $OP_1$, $OP_2$, and $P_1P_2$. Then in the triangle $OP_1P_2$
$$OP_1 = r_1, \qquad OP_2 = r_2, \qquad \angle P_2OP_1 = \theta_1 - \theta_2.$$
FIG. 32. FIG. 33.
Hence (sec. 35),
$$\cos m P_1P_2 = \cos mr_1 \cos mr_2 + \sin mr_1 \sin mr_2 \cos(\theta_1 - \theta_2)$$
$$= \cos mr_1 \cos mr_2 \left[1 + \tan mr_1 \tan mr_2 \cos(\theta_1 - \theta_2)\right].$$
By use of the formulas of sec. 36, this reduces readily to
$$\cos m P_1P_2 = \frac{1 + \tan mx_1 \tan mx_2 + \tan my_1 \tan my_2}{\sqrt{1 + \tan^2 mx_1 + \tan^2 my_1}\,\sqrt{1 + \tan^2 mx_2 + \tan^2 my_2}},$$
the required formula.
39. The angle between two lines may be determined as
follows:
Let $PL_1$ and $PL_2$ (Fig. 33) be two straight lines intersecting at P. Draw from O the two perpendiculars $OD_1$ and $OD_2$ on $PL_1$ and $PL_2$ respectively, and (as in sec. 37), let
$$OD_1 = p_1, \qquad \angle XOD_1 = \alpha_1,$$
$$OD_2 = p_2, \qquad \angle XOD_2 = \alpha_2.$$
Draw OP and place OP = r, $\angle XOP = \theta$, $\angle OPD_1 = \beta_1$, $\angle OPD_2 = \beta_2$, and $\angle L_1PL_2 = \phi = \pi - (\beta_1 + \beta_2)$.
Now from the right triangles $OPD_1$ and $OPD_2$, we have (sec. 30),
$$\sin \beta_1 = \frac{\sin mp_1}{\sin mr}, \qquad \sin \beta_2 = \frac{\sin mp_2}{\sin mr},$$
$$\cos \beta_1 = \cos mp_1 \sin(\theta - \alpha_1), \qquad \cos \beta_2 = \cos mp_2 \sin(\alpha_2 - \theta) = -\cos mp_2 \sin(\theta - \alpha_2).$$
Therefore,
$$\cos \phi = \cos mp_1 \cos mp_2 \sin(\theta - \alpha_1) \sin(\theta - \alpha_2) + \frac{\sin mp_1 \sin mp_2}{\sin^2 mr}. \qquad (1)$$
But (sec. 37)
$$\cos(\theta - \alpha_1) \tan mr = \tan mp_1,$$
$$\cos(\theta - \alpha_2) \tan mr = \tan mp_2,$$
whence
$$0 = \cos mp_1 \cos mp_2 \cos(\theta - \alpha_1) \cos(\theta - \alpha_2) - \frac{\sin mp_1 \sin mp_2}{\tan^2 mr}.$$
Adding this equation to equation (1), we have
$$\cos \phi = \cos mp_1 \cos mp_2 \cos(\alpha_1 - \alpha_2) + \sin mp_1 \sin mp_2$$
$$= \frac{\cos \alpha_1 \cos \alpha_2 + \sin \alpha_1 \sin \alpha_2 + \tan mp_1 \tan mp_2}{\sqrt{1 + \tan^2 mp_1}\,\sqrt{1 + \tan^2 mp_2}},$$
which gives the required angle in terms of the functions which enter into the equations of the lines.
40. The formulas of secs. 36-39 apply to either the Riemannian or the Lobachevskian geometry. It is now convenient to
separate the two cases.
In the Riemannian geometry, where $m = \frac{1}{k}$, we will introduce, instead of x and y, the new coordinates $\xi$ and $\eta$, where
$$\xi = k \tan \frac{x}{k}, \qquad \eta = k \tan \frac{y}{k}. \qquad (1)$$
The equation of the straight line (sec. 37) becomes
$$\xi \cos \alpha + \eta \sin \alpha = k \tan \frac{p}{k},$$
or, more generally,
$$a\xi + b\eta + c = 0, \qquad (2)$$
where
$$\cos \alpha = \frac{a}{\sqrt{a^2 + b^2}}, \qquad \sin \alpha = \frac{b}{\sqrt{a^2 + b^2}}, \qquad k \tan \frac{p}{k} = \frac{-c}{\sqrt{a^2 + b^2}}. \qquad (3)$$
Conversely, any equation of the form of Eq. (2) represents a straight line, since $\alpha$ and p can always be obtained from Eqs. (3).
In particular, the equation
$$\eta = \text{const.}$$
represents a line perpendicular to OY and intersecting OX at the point where $\xi = \infty$. But, from Eq. (1), $\xi = \infty$ when $x = \frac{\pi k}{2}$. By sec. 15, two lines perpendicular to the same line intersect at a distance $\Delta$. Hence $\Delta = \frac{\pi k}{2}$. This fixes the constant k in terms of $\Delta$.
The formulas for distance (sec. 38), and angle (sec. 39), become respectively
$$\cos \frac{P_1P_2}{k} = \frac{k^2 + \xi_1\xi_2 + \eta_1\eta_2}{\sqrt{k^2 + \xi_1^2 + \eta_1^2}\,\sqrt{k^2 + \xi_2^2 + \eta_2^2}}, \qquad (4)$$
$$\cos \phi = \frac{k^2(a_1a_2 + b_1b_2) + c_1c_2}{\sqrt{k^2(a_1^2 + b_1^2) + c_1^2}\,\sqrt{k^2(a_2^2 + b_2^2) + c_2^2}}. \qquad (5)$$
In Eq. (4) let us place $\xi_1 = \xi$, $\eta_1 = \eta$, $\xi_2 = \xi + d\xi$, $\eta_2 = \eta + d\eta$. The right-hand side of the equation becomes, as far as infinitesimals of the second order are concerned,
$$1 - \frac{k^2(d\xi^2 + d\eta^2) + (\eta\,d\xi - \xi\,d\eta)^2}{2(k^2 + \xi^2 + \eta^2)^2}.$$
The left-hand side of the same equation becomes, if we place $P_1P_2 = ds$, and expand,
$$1 - \frac{1}{2}\,\frac{(ds)^2}{k^2}.$$
Hence
$$ds = \frac{k\sqrt{k^2(d\xi^2 + d\eta^2) + (\eta\,d\xi - \xi\,d\eta)^2}}{k^2 + \xi^2 + \eta^2}, \qquad (6)$$
which gives the element of arc of any curve.
We may transform Eq. (6) to polar coordinates by placing
$$\xi = k \tan \frac{r}{k} \cos \theta, \qquad \eta = k \tan \frac{r}{k} \sin \theta.$$
It becomes
$$ds = \sqrt{dr^2 + k^2 \sin^2 \frac{r}{k}\,d\theta^2}.$$
Therefore the circumference, C, of the circle r = a is
$$C = k \sin \frac{a}{k} \int_0^{2\pi} d\theta = 2\pi k \sin \frac{a}{k}.$$
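41. To modify the formulas of secs. 36-39 for the Lobachevskian geometry where $m = \frac{i}{k}$ we place
$$\xi = k \tanh \frac{x}{k}, \qquad \eta = k \tanh \frac{y}{k}. \qquad (1)$$
The equation of the straight line (sec. 37) becomes
$$\xi \cos \alpha + \eta \sin \alpha = k \tanh \frac{p}{k},$$
or
$$a\xi + b\eta + c = 0, \qquad (2)$$
where
$$\cos \alpha = \frac{a}{\sqrt{a^2 + b^2}}, \qquad \sin \alpha = \frac{b}{\sqrt{a^2 + b^2}}, \qquad k \tanh \frac{p}{k} = \frac{-c}{\sqrt{a^2 + b^2}}. \qquad (3)$$
Now, if p is real, $\tanh \frac{p}{k} < 1$; hence from Eqs. (3)
$$c^2 < k^2(a^2 + b^2).$$
Conversely, Eq. (2) represents a straight line provided $c^2 < k^2(a^2 + b^2)$, for then $\alpha$ and p may be determined from Eqs. (3).
The formulas for distance (sec. 38) and angle (sec. 39), become respectively
$$\cosh \frac{P_1P_2}{k} = \frac{k^2 - \xi_1\xi_2 - \eta_1\eta_2}{\sqrt{k^2 - \xi_1^2 - \eta_1^2}\,\sqrt{k^2 - \xi_2^2 - \eta_2^2}}, \qquad (4)$$
$$\cos \phi = \frac{k^2(a_1a_2 + b_1b_2) - c_1c_2}{\sqrt{k^2(a_1^2 + b_1^2) - c_1^2}\,\sqrt{k^2(a_2^2 + b_2^2) - c_2^2}}. \qquad (5)$$
If in Eq. (4) we place $\xi_1 = \xi$, $\eta_1 = \eta$, $\xi_2 = \xi + d\xi$, $\eta_2 = \eta + d\eta$, $P_1P_2 = ds$ it becomes, as far as infinitesimals of the second order are concerned
$$1 + \frac{(ds)^2}{2k^2} = 1 + \frac{k^2(d\xi^2 + d\eta^2) - (\eta\,d\xi - \xi\,d\eta)^2}{2(k^2 - \xi^2 - \eta^2)^2},$$
whence the element of arc of any curve is given by the formula
$$ds = \frac{k\sqrt{k^2(d\xi^2 + d\eta^2) - (\eta\,d\xi - \xi\,d\eta)^2}}{k^2 - \xi^2 - \eta^2}.$$
In polar coordinates, this becomes
$$ds = \sqrt{dr^2 + k^2 \sinh^2 \frac{r}{k}\,d\theta^2},$$
whence the circumference of the circle r = a is
$$C = \int_0^{2\pi} k \sinh \frac{a}{k}\,d\theta = 2\pi k \sinh \frac{a}{k} = \pi k\left(e^{a/k} - e^{-a/k}\right).$$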
42. We may now complete the discussion of area given in secs. 21-24. The unit of angle being such that a right angle has the measure $\frac{\pi}{2}$, we will take the unit of area such that, $\alpha$, $\beta$, and $\gamma$ being the angles of a triangle ABC, we have in the Riemannian geometry
$$\text{Area } ABC = k^2(\alpha + \beta + \gamma - \pi),$$
and in the Lobachevskian geometry
$$\text{Area } ABC = k^2(\pi - \alpha - \beta - \gamma).$$
Consider now in the Riemannian plane a trirectangular quadrilateral (Fig. 34) formed by the axes OX and OY and the lines MP ($\xi = c_1$) and NP ($\eta = c_2$). Denote the area of OMPN by A and the angle MPN by $\phi$. Then, by dividing OMPN into two triangles
$$A = k^2\left(\phi - \frac{\pi}{2}\right),$$
whence
$$\sin \frac{A}{k^2} = -\cos \phi.$$
FIG. 34.
Therefore, by sec. 40,
$$\sin \frac{A}{k^2} = \frac{\xi\eta}{\sqrt{k^2 + \xi^2}\,\sqrt{k^2 + \eta^2}}, \qquad (1)$$
the positive signs of the radicals being taken since $\frac{A}{k^2} < \frac{\pi}{2}$.
Let us now increase $\xi$ by $d\xi$, corresponding to $MM_1$ in the figure. The corresponding differential of area, $d_\xi A$, represented by $MM_1P_1P$, is found by differentiating Eq. (1). We have
$$d_\xi A = \frac{k^3\,\eta\,d\xi}{(k^2 + \xi^2)\sqrt{k^2 + \xi^2 + \eta^2}}. \qquad (2)$$
The differential of this area caused by a change of $d\eta$ in $\eta$ is represented in the figure by $PP_1QQ_1$. We shall call this area dA and obtain it by differentiating Eq. (2) with respect to $\eta$. There results
$$dA = \frac{k^3\,d\xi\,d\eta}{(k^2 + \xi^2 + \eta^2)^{3/2}}. \qquad (3)$$



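The same process applied to the Lobachevskian plane leads to the result
$$dA = \frac{k^3\,d\xi\,d\eta}{(k^2 - \xi^2 - \eta^2)^{3/2}}. \qquad (4)$$
Eq. (3) may be applied to find the area of the circle $\xi^2 + \eta^2 = k^2 \tan^2 \frac{a}{k}$ in the Riemannian geometry. We have *
$$A = 4k^3 \int_0^{k \tan \frac{a}{k}} \int_0^{\sqrt{k^2 \tan^2 \frac{a}{k} - \xi^2}} \frac{d\xi\,d\eta}{(k^2 + \xi^2 + \eta^2)^{3/2}} = 4\pi k^2 \sin^2 \frac{a}{2k}.$$
Similarly the area of the circle $\xi^2 + \eta^2 = k^2 \tanh^2 \frac{a}{k}$ in the Lobachevskian geometry is found to be
$$A = 4\pi k^2 \sinh^2 \frac{a}{2k} = \pi k^2\left(e^{a/2k} - e^{-a/2k}\right)^2.$$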
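The two closed forms for the area of a circle may be compared with a direct numerical integration. The sketch below uses the polar form of the integral, with integrand $k \sin \frac{r}{k}$ or $k \sinh \frac{r}{k}$ as given in the footnote to this section; the midpoint rule, the function names, and the sample values are our own choices and are in no way essential.

```python
import math

def disk_area_numeric(a, k, geometry, n=2000):
    """Midpoint-rule value of the integral of k sin(r/k) dr dtheta
    (Riemannian) or k sinh(r/k) dr dtheta (Lobachevskian)
    over 0 <= r <= a, 0 <= theta <= 2 pi."""
    g = math.sin if geometry == "riemannian" else math.sinh
    h = a / n
    radial = sum(k * g((j + 0.5) * h / k) for j in range(n)) * h
    return 2.0 * math.pi * radial   # the integrand is independent of theta

def disk_area_closed(a, k, geometry):
    """Closed forms 4 pi k^2 sin^2(a/2k) and 4 pi k^2 sinh^2(a/2k)."""
    g = math.sin if geometry == "riemannian" else math.sinh
    return 4.0 * math.pi * k * k * g(a / (2.0 * k)) ** 2

a, k = 1.0, 2.0   # arbitrary sample radius and constant
gaps = {
    geom: disk_area_numeric(a, k, geom) - disk_area_closed(a, k, geom)
    for geom in ("riemannian", "lobachevskian")
}
print(gaps)   # both differences are very small
```

The Riemannian area is less, and the Lobachevskian area greater, than the Euclidean value $\pi a^2$.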
43. We have noticed in sec. 32 that the formulas for the non-Euclidean trigonometry include those of the Euclidean trigonometry as a limiting case when $k = \infty$. A similar remark applies to the non-Euclidean analytic geometry. We note that as $k = \infty$
$$\lim k \sin \frac{a}{k} = \lim k \tan \frac{a}{k} = \lim k \sinh \frac{a}{k} = \lim k \tanh \frac{a}{k} = a$$
and
$$\lim \cos \frac{a}{k} = \lim \cosh \frac{a}{k} = 1.$$
* The calculation is facilitated by changing the variables in the integral,
$$A = k^3 \iint \frac{d\xi\,d\eta}{(k^2 + \xi^2 + \eta^2)^{3/2}},$$
to polar coordinates, by the methods of the calculus for such a problem. (See Hedrick's translation of Goursat's Mathematical Analysis, p. 266.) We have, in the Riemannian geometry,
$$A = \iint k \sin \frac{r}{k}\,dr\,d\theta,$$
and similarly in the Lobachevskian geometry,
$$A = \iint k \sinh \frac{r}{k}\,dr\,d\theta.$$




The coordinates ($\xi$, $\eta$) of either the Riemannian or Lobachevskian geometry become in the limit the coordinates (x, y) of the Euclidean geometry, and the formulas of secs. 40-42 reduce either to the identity 1 = 1 or to the corresponding Euclidean formula.
For example, Eq. (4), sec. 40 or sec. 41, gives at first sight 1 = 1, but if we expand in powers of $\frac{1}{k}$ and consider the terms of lower order it is easy to obtain the formula
$$P_1P_2 = \sqrt{(\xi_1 - \xi_2)^2 + (\eta_1 - \eta_2)^2}.$$
On the other hand, Eq. (5), sec. 40 or sec. 41, gives at once
$$\cos \phi = \frac{a_1a_2 + b_1b_2}{\sqrt{a_1^2 + b_1^2}\,\sqrt{a_2^2 + b_2^2}}.$$
It appears that the Riemannian and Lobachevskian geometries will differ inappreciably from the Euclidean geometry, in their practical applications, provided k is very large. Therein lies the impossibility of determining by experience which of the three geometries is physically true.
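The smallness of the difference for large k is easily exhibited on the circumference formulas of secs. 40 and 41. The following is a sketch only; the function name, the string labels, and the chosen radius and values of k are ours.

```python
import math

def circumference(a, k, geometry):
    """Circumference of the circle of radius a in each geometry."""
    if geometry == "riemannian":
        return 2.0 * math.pi * k * math.sin(a / k)
    if geometry == "lobachevskian":
        return 2.0 * math.pi * k * math.sinh(a / k)
    if geometry == "euclidean":
        return 2.0 * math.pi * a
    raise ValueError("unknown geometry: " + geometry)

a = 1.0
euclid = circumference(a, 1.0, "euclidean")
for k in (1.0, 10.0, 1000.0):
    riem = circumference(a, k, "riemannian")
    lob = circumference(a, k, "lobachevskian")
    # The deviations from the Euclidean value shrink roughly like 1/k^2.
    print(k, euclid - riem, lob - euclid)
```

A measurement of limited precision could not, therefore, separate the three cases once k is sufficiently large in comparison with the radius.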
X. REPRESENTATION OF THE LOBACHEVSKIAN GEOMETRY
ON A EUCLIDEAN PLANE
44. Let P ($\xi$, $\eta$) be any point on a Lobachevskian plane, $(r, \theta)$ its polar coordinates, where r is always positive. Then (secs. 36, 41)
$$\xi = k \tanh \frac{r}{k} \cos \theta, \qquad \eta = k \tanh \frac{r}{k} \sin \theta, \qquad (1)$$
$$\xi^2 + \eta^2 = k^2 \tanh^2 \frac{r}{k} < k^2.$$
We may now interpret ($\xi$, $\eta$) as ordinary Cartesian coordinates upon a Euclidean plane, i.e., a plane on which the Euclidean geometry is assumed to hold. Then to P on the Lobachevskian plane corresponds a point P' on the Euclidean plane and P' lies inside the circle $\xi^2 + \eta^2 = k^2$, called the fundamental circle.
Conversely, let ($\xi$, $\eta$) be the coordinates of any point on the Euclidean plane. Solving Eqs. (1), we have
$$\cos \theta = \frac{\xi}{\sqrt{\xi^2 + \eta^2}}, \qquad \sin \theta = \frac{\eta}{\sqrt{\xi^2 + \eta^2}},$$
$$r = \frac{k}{2} \log \frac{k + \sqrt{\xi^2 + \eta^2}}{k - \sqrt{\xi^2 + \eta^2}}.$$
Hence $\theta$ is uniquely determined and is always real and r is uniquely determined and is real, infinite, or imaginary, according as $\xi^2 + \eta^2$ is less than, equal to, or greater than, $k^2$.
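The correspondence and its inverse may be written out as a pair of functions. This is a minimal sketch in Python: the names `to_disk` and `from_disk` are ours, and the choice to raise an error for points on or outside the fundamental circle is merely one way of expressing that r is there infinite or imaginary.

```python
import math

def to_disk(r, theta, k=1.0):
    """Polar coordinates (r, theta) of a Lobachevskian point -> (xi, eta)."""
    rho = k * math.tanh(r / k)
    return rho * math.cos(theta), rho * math.sin(theta)

def from_disk(xi, eta, k=1.0):
    """(xi, eta) inside the fundamental circle -> Lobachevskian (r, theta)."""
    rho = math.hypot(xi, eta)
    if rho >= k:
        raise ValueError("point is not inside the fundamental circle")
    r = 0.5 * k * math.log((k + rho) / (k - rho))
    theta = math.atan2(eta, xi)
    return r, theta

# Round trip with arbitrary sample values.
xi, eta = to_disk(1.7, 2.1, k=1.5)
r_back, theta_back = from_disk(xi, eta, k=1.5)
print(xi, eta, r_back, theta_back)
```

However large r is taken, the image point remains inside the circle of radius k.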
We have thus a relation between the Lobachevskian and Euclidean planes by which a point on the Lobachevskian plane corresponds to one and only one point in the interior of the fundamental circle on the Euclidean plane, and conversely. The points of the fundamental circle correspond to points at infinity on the Lobachevskian plane, while points outside the circle have no corresponding points on the Lobachevskian plane.
45. Consider now a straight line on the Euclidean plane (Fig. 35) with the equation
$$a\xi + b\eta + c = 0.$$
FIG. 35.
Only that portion of AB which is within the fundamental circle will correspond to a line on the Lobachevskian plane, the points A and B corresponding to the points at infinity on the Lobachevskian plane.
Hence, unless the line AB meets the fundamental conic in two real points it will have no Lobachevskian counterpart. The criterion that $a\xi + b\eta + c = 0$ should meet $\xi^2 + \eta^2 = k^2$ in two real points is that
$$k^2(a^2 + b^2) - c^2 > 0.$$
We thus find again the condition of sec. 41.
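The criterion, together with the points A and B themselves, may be computed as follows. The sketch is ours: the function names are hypothetical, and the points of intersection are found by the elementary device of starting from the foot of the perpendicular from the origin and moving along the line in both directions.

```python
import math

def has_lobachevskian_counterpart(a, b, c, k=1.0):
    """True when a*xi + b*eta + c = 0 cuts the fundamental circle in two real points."""
    return k * k * (a * a + b * b) - c * c > 0

def points_at_infinity(a, b, c, k=1.0):
    """The two intersections A, B with xi^2 + eta^2 = k^2, or None if not real and distinct."""
    n2 = a * a + b * b
    disc = k * k * n2 - c * c
    if disc <= 0:
        return None
    x0, y0 = -a * c / n2, -b * c / n2    # foot of the perpendicular from O
    t = math.sqrt(disc) / n2             # step along the direction (b, -a)
    return (x0 + b * t, y0 - a * t), (x0 - b * t, y0 + a * t)

line_inside = (1.0, 0.0, -0.5)    # the line xi = 0.5
line_outside = (1.0, 0.0, -2.0)   # the line xi = 2
ends = points_at_infinity(*line_inside)
print(has_lobachevskian_counterpart(*line_inside), ends)
print(has_lobachevskian_counterpart(*line_outside), points_at_infinity(*line_outside))
```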
46. The distinction between intersecting, non-intersecting,
and parallel lines is very clear in the representation we are
considering. For if AB (Fig. 36) is any straight line on the
Euclidean plane and P any point, the lines through P which
intersect AB within the fundamental circle correspond to lines
intersecting AB on the Lobachevskian plane, while the lines
through P intersecting AB outside the circle correspond to
lines on the Lobachevskian plane which do not meet AB.
FIG. 36.                      FIG. 37.
Between these two types of lines are the lines PA and PB
which intersect AB on the fundamental circle and correspond
to the Lobachevskian parallels.
47. Two straight lines
$$a_1\xi + b_1\eta + c_1 = 0, \qquad (1)$$
$$a_2\xi + b_2\eta + c_2 = 0, \qquad (2)$$
on the Lobachevskian plane are perpendicular, when (sec. 41)
$$k^2(a_1a_2 + b_1b_2) - c_1c_2 = 0. \qquad (3)$$
The geometric meaning of this condition is readily given. Note first that if $P_1(\xi_1, \eta_1)$ (Fig. 37) is a point on the Euclidean plane, its polar AB with respect to the fundamental circle is
$$\xi_1\xi + \eta_1\eta - k^2 = 0.$$
This is the line
$$a_1\xi + b_1\eta + c_1 = 0$$
if $\xi_1 = -\frac{a_1k^2}{c_1}$, $\eta_1 = -\frac{b_1k^2}{c_1}$. That is, the point $\left(-\frac{a_1k^2}{c_1}, -\frac{b_1k^2}{c_1}\right)$ is the pole of the line Eq. (1), and similarly $\left(-\frac{a_2k^2}{c_2}, -\frac{b_2k^2}{c_2}\right)$ is the pole of the line Eq. (2). The condition Eq. (3) expresses the fact that the pole of Eq. (1) is on Eq. (2) and the pole of Eq. (2) on Eq. (1). Hence the following theorem:
Two lines on the Lobachevskian plane are perpendicular when each of the corresponding lines on the Euclidean plane passes through the pole of the other.
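The theorem may be tried on a numerical case. In the sketch below the helper names and the two sample lines are ours; the second line was chosen to pass through the pole of the first, so that the algebraic condition of perpendicularity should be found to hold.

```python
def pole(a, b, c, k=1.0):
    """Pole of the line a*xi + b*eta + c = 0 with respect to xi^2 + eta^2 = k^2 (c != 0)."""
    return (-a * k * k / c, -b * k * k / c)

def on_line(point, line, tol=1e-12):
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c) < tol

def perpendicular(line1, line2, k=1.0, tol=1e-12):
    """Condition k^2 (a1 a2 + b1 b2) - c1 c2 = 0 for Lobachevskian perpendicularity."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    return abs(k * k * (a1 * a2 + b1 * b2) - c1 * c2) < tol

line1 = (1.0, 0.0, -0.5)    # the line xi = 0.5; its pole is (2, 0)
line2 = (0.3, 2.0, -0.6)    # a line through (2, 0) and (0, 0.3)
line3 = (0.0, 1.0, -0.3)    # the line eta = 0.3, which misses the pole of line1

print(pole(*line1), pole(*line2))
print(perpendicular(line1, line2), perpendicular(line1, line3))
```

In the Euclidean picture the two lines do not meet at right angles, the Lobachevskian right angle being distorted by the representation except at the centre of the fundamental circle.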
This leads to a shorter proof of the proposition of sec. 9 that two non-intersecting straight lines have a common perpendicular. For let LM and EF (Fig. 38) be two such lines. Their point of intersection P on the Euclidean plane lies outside of the fundamental circle. The polar of P passes through the circle, therefore, and corresponds to the common perpendicular to LM and EF.
FIG. 38.
48. We shall now proceed to find the meaning on the Euclidean plane of the expression for a Lobachevskian distance (sec. 41, Eq. (4)). For convenience, place
$$P_1P_2 = d, \qquad k^2 - \xi_1^2 - \eta_1^2 = f_{11},$$
$$k^2 - \xi_2^2 - \eta_2^2 = f_{22}, \qquad k^2 - \xi_1\xi_2 - \eta_1\eta_2 = f_{12}.$$
Then Eq. (4), sec. 41, becomes
$$\frac{e^{d/k} + e^{-d/k}}{2} = \frac{f_{12}}{\sqrt{f_{11}}\,\sqrt{f_{22}}},$$
whence
$$d = k \log \frac{f_{12} \pm \sqrt{f_{12}^2 - f_{11}f_{22}}}{\sqrt{f_{11}}\,\sqrt{f_{22}}}$$
$$= \pm\frac{k}{2} \log \frac{f_{12} + \sqrt{f_{12}^2 - f_{11}f_{22}}}{f_{12} - \sqrt{f_{12}^2 - f_{11}f_{22}}}. \qquad (1)$$
Now on the Euclidean plane, let $P_1$ and $P_2$ (Fig. 39) be the two points ($\xi_1$, $\eta_1$) and ($\xi_2$, $\eta_2$) respectively, and R and Q the points in which the line $P_1P_2$ meets the fundamental circle. Let P ($\xi$, $\eta$) be any point on $P_1P_2$ such that
$$\frac{P_1P}{PP_2} = \lambda,$$
where $\lambda$ is a ratio of Euclidean distances. Then
$$\xi = \frac{\xi_1 + \lambda\xi_2}{1 + \lambda}, \qquad \eta = \frac{\eta_1 + \lambda\eta_2}{1 + \lambda}.$$
Substituting these values in the equation of the fundamental circle,
$$\xi^2 + \eta^2 - k^2 = 0,$$
we shall have the values of $\lambda$ corresponding to the points Q and R; namely
$$\lambda_1 = \frac{P_1Q}{QP_2} = -\frac{f_{12} + \sqrt{f_{12}^2 - f_{11}f_{22}}}{f_{22}},$$
$$\lambda_2 = \frac{P_1R}{RP_2} = -\frac{f_{12} - \sqrt{f_{12}^2 - f_{11}f_{22}}}{f_{22}}.$$
FIG. 39.
Eq. (1) then becomes
$$d = \pm\frac{k}{2} \log \frac{\lambda_1}{\lambda_2} = \pm\frac{k}{2} \log \left(\frac{P_1Q}{QP_2} : \frac{P_1R}{RP_2}\right).$$
The Lobachevskian distance between two points is $\frac{k}{2}$ times the logarithm of the anharmonic ratio of the two given points and the two points of intersection of the fundamental circle and the line through the two given points.
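The agreement of the two expressions for the distance may be confirmed numerically. The following Python sketch computes the distance once from Eq. (4) of sec. 41 and once from the anharmonic ratio; the function names, the guards against rounding, and the sample points are our own.

```python
import math

def _f(P1, P2, k):
    """The quantities f11, f22, f12 of this section."""
    f11 = k * k - P1[0] ** 2 - P1[1] ** 2
    f22 = k * k - P2[0] ** 2 - P2[1] ** 2
    f12 = k * k - P1[0] * P2[0] - P1[1] * P2[1]
    return f11, f22, f12

def distance_from_cosh(P1, P2, k=1.0):
    """Distance from cosh(d/k) = f12 / (sqrt(f11) sqrt(f22))."""
    f11, f22, f12 = _f(P1, P2, k)
    return k * math.acosh(max(1.0, f12 / math.sqrt(f11 * f22)))

def distance_from_cross_ratio(P1, P2, k=1.0):
    """Distance as (k/2) log(lambda1 / lambda2), the lambdas being the roots
    of f22 L^2 + 2 f12 L + f11 = 0."""
    f11, f22, f12 = _f(P1, P2, k)
    s = math.sqrt(max(0.0, f12 * f12 - f11 * f22))
    lam1 = -(f12 + s) / f22
    lam2 = -(f12 - s) / f22
    return 0.5 * k * math.log(lam1 / lam2)

P1, P2 = (0.1, 0.2), (-0.4, 0.5)    # sample points inside the unit circle
d_cosh = distance_from_cosh(P1, P2)
d_ratio = distance_from_cross_ratio(P1, P2)
from_centre = distance_from_cosh((0.0, 0.0), (0.5, 0.0))   # equals atanh(0.5)
print(d_cosh, d_ratio, from_centre)
```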
49. An analogous definition may be given to the Lobachevskian measure of angle. Place, for convenience,
$$k^2(a_1^2 + b_1^2) - c_1^2 = u_{11}, \qquad k^2(a_2^2 + b_2^2) - c_2^2 = u_{22},$$
$$k^2(a_1a_2 + b_1b_2) - c_1c_2 = u_{12}.$$
Then, from Eq. (5), sec. 41, we have
$$\phi = \pm\frac{i}{2} \log \frac{u_{12} + \sqrt{u_{12}^2 - u_{11}u_{22}}}{u_{12} - \sqrt{u_{12}^2 - u_{11}u_{22}}}.$$
Now consider the two lines $AL_1$ and $AL_2$ (Fig. 40) with the equations,
$$a_1\xi + b_1\eta + c_1 = 0,$$
$$a_2\xi + b_2\eta + c_2 = 0.$$
Any line through their point of intersection A has the equation
$$(a_1 + \lambda a_2)\xi + (b_1 + \lambda b_2)\eta + (c_1 + \lambda c_2) = 0$$
and this line will be one of the tangent lines AR and AQ, if
$$k^2(a_1 + \lambda a_2)^2 + k^2(b_1 + \lambda b_2)^2 - (c_1 + \lambda c_2)^2 = 0,$$
i.e., if $\lambda$ has either of the values
$$\lambda_1 = -\frac{u_{12} + \sqrt{u_{12}^2 - u_{11}u_{22}}}{u_{22}},$$
$$\lambda_2 = -\frac{u_{12} - \sqrt{u_{12}^2 - u_{11}u_{22}}}{u_{22}}.$$
FIG. 40.
Hence
$$\phi = \pm\frac{i}{2} \log \frac{\lambda_1}{\lambda_2}.$$
But $\frac{\lambda_1}{\lambda_2}$ is the anharmonic ratio of the four lines $AL_1$, $AL_2$, AR, and AQ. If A lies outside of the fundamental circle, $\lambda_1$ and $\lambda_2$ are real, and $\phi$ is imaginary. If A lies on the fundamental conic, $\lambda_1 = \lambda_2$, and $\phi = 0$. If A lies inside the fundamental conic, $\lambda_1$ and $\lambda_2$ are conjugate imaginary, and $\phi$ is real.
The Lobachevskian measure of angle between two lines is $\frac{i}{2}$ times the logarithm of the anharmonic ratio of the two given lines and the two tangents to the fundamental circle from the point of intersection of the two given lines.
50. The study of the circle on the Lobachevskian plane by means of its representation on the Euclidean plane leads to interesting results. We obtain the general equation of the circle by letting ($\xi_1$, $\eta_1$) in Eq. (4), sec. 41, be the fixed coordinates of the centre and letting $\xi_2 = \xi$, $\eta_2 = \eta$ be the variable coordinates of any point on the circle. The equation is then of the form
$$(\xi_1\xi + \eta_1\eta - k^2)^2 = c(\xi^2 + \eta^2 - k^2), \qquad (1)$$
where c is a constant.
This is the equation of a conic on the Euclidean plane. Coordinates which satisfy this equation and that of the fundamental circle
$$\xi^2 + \eta^2 - k^2 = 0$$
satisfy also the equation
$$\xi_1\xi + \eta_1\eta - k^2 = 0,$$
which is that of the polar of ($\xi_1$, $\eta_1$). Since the polynomial $\xi_1\xi + \eta_1\eta - k^2$ appears to the second power in Eq. (1) it follows that Eq. (1) is the equation of a conic which is tangent to the fundamental circle at the points where the latter is cut by the polar of ($\xi_1$, $\eta_1$).
There are therefore three cases to consider according as ($\xi_1$, $\eta_1$) lies outside, on, or inside the fundamental circle.
(1) When C ($\xi_1$, $\eta_1$) is inside the fundamental circle, the polar AB of C does not cut the circle in real points and hence the conic (1) lies entirely in the circle (Fig. 41). This corresponds to the ordinary circle on the Lobachevskian plane.
FIG. 41.
(2) When C ($\xi_1$, $\eta_1$) lies on the fundamental circle, the polar of C is the tangent to the circle at the point ($\xi_1$, $\eta_1$). The conic (1) is then tangent to the circle at the same point (Fig. 42). This corresponds on the Lobachevskian plane to the curve approached by a circle as its centre recedes to infinity and its radius becomes infinite. This curve is called a limit curve or horicycle. Its revolution about one of its infinite radii generates the limit-surface mentioned in sec. 25.
(3) When C ($\xi_1$, $\eta_1$) is outside the fundamental circle, the polar of C cuts the fundamental circle in two points A and B (Fig. 43). The conic (1) is therefore tangent to the fundamental conic, and corresponds on the Lobachevskian plane to a real circle with imaginary centre and radius. The straight line AB is a special case of such a circle.
FIG. 42. FIG. 43.
Draw any line CR through C, intersecting AB in Q. This represents on the Lobachevskian plane a line perpendicular to AB (sec. 47). Now in the Lobachevskian measurement CR and CQ are constant for all positions of Q on AB. Then QR is constant. That is, the locus of R has all its points equidistant from a straight line AB. This curve is sometimes called the hypercycle.
51. We may make, of course, a representation of the Riemannian geometry on the Euclidean plane with coordinates ($\xi$, $\eta$). But in this case the fundamental circle has the equation
$$\xi^2 + \eta^2 + k^2 = 0$$
and is imaginary. The geometric properties are therefore not visible to the eye.




XI. RELATION BETWEEN PROJECTIVE AND NON-EUCLIDEAN
GEOMETRY
52. We have obtained in sees. 48, 49 a special case of the
system of measurement first given by Cayley and recognized
by Klein as leading to the non-Euclidean geometries. The general principles can now be quickly stated.
Let us take, on a plane for which the Euclidean geometry
holds, x:x2:x3 as homogeneous point coordinates and assume
a fundamental conic with tho equation
a1 x2l +a2222 + a33x32 + 2al2xx2 + 2a23x2x3 + 2a31xa1 =0
or, more compactly,
Eaikxixk = O.  (aik = aki)...    (1)
Let the tangential equation of the same conic be
EAiA,;ato = O, (Aik-Aki),..... (2)
i.e., let Eq. (2) be the condition that the straight line
alxl +a2x2+ a3X3 =0 should be tangent to the conic of Eq. (1).
For convenience, let us place
fxx= EaikXiXk, f yy,,=  aikYiYk, fxy-= EaikZXik.
and
Uaa= EAikcaiak, Upp= EAikPik, Uap= EAikaiPk.
If $P_1$ and $P_2$ are two points on the plane, and Q and
R are the points in which the line $P_1P_2$ meets the fundamental
conic, and $[P_1P_2QR]$ is the anharmonic ratio of the four points
$P_1$, $P_2$, Q, R, then the Cayleyan projective measure of the distance
$P_1P_2$ is defined by the equation
$$P_1P_2 = M \log [P_1P_2QR],$$
where M is a constant.
Similarly, if $AL_1$ and $AL_2$ are two lines intersecting at A,
and AR and AQ are the two tangents from A to the fundamental
conic, and $[L_1L_2QR]$ is the anharmonic ratio of these four lines,
then the Cayleyan projective measure of the angle $\phi$ between
$AL_1$ and $AL_2$ is given by the equation
$$\phi = M_1 \log [L_1L_2QR],$$
where $M_1$ is a constant.
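As a small numerical illustration of the projective measure of distance, the following Python sketch takes the unit circle as fundamental conic and measures along a diameter, so that the line meets the conic at the coordinates +1 and -1. The cross-ratio convention used, the choice M = 1/2, and the function names are assumptions made for this example and are not taken from the text. Under these assumptions the measure is additive along the line and grows without bound as a point approaches the conic.

```python
import math

def cross_ratio(p1, p2, q, r):
    """Anharmonic ratio of four collinear points, each given by one coordinate.

    Convention assumed here: [P1 P2 Q R] = (P1Q * P2R) / (P1R * P2Q).
    """
    return ((q - p1) * (r - p2)) / ((r - p1) * (q - p2))

def cayley_distance(p1, p2, q=1.0, r=-1.0, m=0.5):
    """Signed measure M * log [P1 P2 Q R] for points strictly between r and q."""
    return m * math.log(cross_ratio(p1, p2, q, r))

a, b, c = -0.3, 0.2, 0.7
d_ab = cayley_distance(a, b)
d_bc = cayley_distance(b, c)
d_ac = cayley_distance(a, c)

# Additivity along the line: d(a, b) + d(b, c) agrees with d(a, c).
print(d_ab + d_bc, d_ac)

# A point close to the fundamental conic is very far from the centre.
print(cayley_distance(0.0, 0.999999))
```

With these choices the distance from the centre to the coordinate p reduces to the inverse hyperbolic tangent of p, which is one way to check the sketch by hand.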
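The analytic expression of these measures is found as in
secs. 48, 49.
If $x_1 : x_2 : x_3$ are the coordinates of $P_1$, $y_1 : y_2 : y_3$ the coordinates of $P_2$, and $\lambda_1$, $\lambda_2$ the roots of
$$f_{xx} + 2\lambda f_{xy} + \lambda^2 f_{yy} = 0,$$
then
$$P_1P_2 = M \log \frac{\lambda_1}{\lambda_2} = M \log \frac{f_{xy} + \sqrt{f_{xy}^2 - f_{xx}f_{yy}}}{f_{xy} - \sqrt{f_{xy}^2 - f_{xx}f_{yy}}} \qquad (3)$$
$$= 2M \log \frac{f_{xy} + \sqrt{f_{xy}^2 - f_{xx}f_{yy}}}{\sqrt{f_{xx}}\sqrt{f_{yy}}}.$$
By an easy calculation, we may deduce from this
$$\frac{1}{2}\left(e^{\frac{P_1P_2}{2M}} + e^{-\frac{P_1P_2}{2M}}\right) = \frac{f_{xy}}{\sqrt{f_{xx}}\sqrt{f_{yy}}}. \qquad (4)$$
Also if $\alpha_1x_1 + \alpha_2x_2 + \alpha_3x_3 = 0$ is the equation of $AL_1$, $\beta_1x_1 +
\beta_2x_2 + \beta_3x_3 = 0$ the equation of $AL_2$, and $\mu_1$, $\mu_2$ the roots of
the equation
$$U_{\alpha\alpha} + 2\mu U_{\alpha\beta} + \mu^2 U_{\beta\beta} = 0,$$
then
$$\phi = M_1 \log \frac{\mu_1}{\mu_2} = M_1 \log \frac{U_{\alpha\beta} + \sqrt{U_{\alpha\beta}^2 - U_{\alpha\alpha}U_{\beta\beta}}}{U_{\alpha\beta} - \sqrt{U_{\alpha\beta}^2 - U_{\alpha\alpha}U_{\beta\beta}}} \qquad (5)$$
$$= 2M_1 \log \frac{U_{\alpha\beta} + \sqrt{U_{\alpha\beta}^2 - U_{\alpha\alpha}U_{\beta\beta}}}{\sqrt{U_{\alpha\alpha}}\sqrt{U_{\beta\beta}}},$$
whence
$$\frac{1}{2}\left(e^{\frac{\phi}{2M_1}} + e^{-\frac{\phi}{2M_1}}\right) = \frac{U_{\alpha\beta}}{\sqrt{U_{\alpha\alpha}}\sqrt{U_{\beta\beta}}}. \qquad (6)$$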
We have now to consider three cases according to the nature
of the fundamental conic.
53. Case I. Let the fundamental conic be a real, nondegenerate conic; i.e., either an ellipse, hyperbola, or parabola.
If the points $P_1$ and $P_2$ are inside the conic, $[P_1P_2QR]$
is real and positive. Hence if the distance $P_1P_2$ is real we
must take M a real constant, for example, $\frac{k}{2}$.
If A is inside the conic, the tangents AR and AQ are imaginary, and $\mu_1$ and $\mu_2$ are conjugate imaginary. Let $\mu_1 = \rho e^{\theta i}$,
then $\mu_2 = \rho e^{-\theta i}$ and $\log \frac{\mu_1}{\mu_2} = 2\theta i$. Hence if $\phi$ is to be real we must
take $M_1$ pure imaginary. When $AL_1$ coincides with $AL_2$
we have
$$\phi = M_1 \log 1 = 0 \text{ or } 2M_1 n\pi i,$$
where n is an integer. If then we so choose the unit of angle that
the measure of a right angle shall be $\frac{\pi}{2}$, we must place $M_1 = \frac{i}{2}$.
We are thus led to the same formulas as in secs. 48, 49, except
that they are referred to a general conic instead of a circle.
The Lobachevskian geometry is easily built up on this
foundation.
54. Case II. Let the fundamental conic be imaginary,
i.e., let there be no real values of $x_1 : x_2 : x_3$ satisfying Eq. (1),
sec. 52. Then $\lambda_1$ and $\lambda_2$ are conjugate imaginary, as are also
$\mu_1$ and $\mu_2$. Hence to obtain real distance and angle we must
take M and $M_1$ pure imaginary. As in sec. 53, we place
$M_1 = \frac{i}{2}$ and will place $M = \frac{ik}{2}$. We have then from Eqs. (4)
and (6), sec. 52,
$$\cos \frac{P_1P_2}{k} = \frac{f_{xy}}{\sqrt{f_{xx}}\sqrt{f_{yy}}}, \qquad \cos \phi = \frac{U_{\alpha\beta}}{\sqrt{U_{\alpha\alpha}}\sqrt{U_{\beta\beta}}},$$
which are analogous to those of the Riemannian geometry
(sec. 40).
Since $\cos \frac{P_1P_2}{k}$ is never infinite, all straight lines are finite
in length.
Two straight lines always intersect, since two linear equations have always a solution, which cannot represent a point
at infinity.
The Riemannian geometry is easily built up from these
foundations.
55. Case III. Let the fundamental conic degenerate. This
may happen in two ways: either the point Eq. (1), sec. 52,
may represent two straight lines; or the tangential Eq. (2),
sec. 52, may represent two points. The latter is the most
interesting case, especially when the tangential equation becomes
$$\alpha_1^2 + \alpha_2^2 = 0, \qquad (1)$$
which is satisfied by the coefficients of all straight lines which
pass through one of the two points $x_1 : x_2 : x_3 = 1 : \pm i : 0$. If
$x_3 = 0$ represents the line at infinity, these points are the circular
points at infinity. Through each point of the plane go two
straight lines satisfying Eq. (1), namely the minimum lines.
The formula for angle is readily found. In fact, we have
at once from Eq. (6), sec. 52, with $M_1 = \frac{i}{2}$,
$$\cos \phi = \frac{\alpha_1\beta_1 + \alpha_2\beta_2}{\sqrt{\alpha_1^2 + \alpha_2^2}\sqrt{\beta_1^2 + \beta_2^2}}.$$
But this is the Euclidean formula for the angle between the two
lines
$$\alpha_1 x + \alpha_2 y + \alpha_3 = 0, \qquad \beta_1 x + \beta_2 y + \beta_3 = 0,$$
where we have placed $x = \frac{x_1}{x_3}$, $y = \frac{x_2}{x_3}$.
Hence: The Euclidean angle between two lines is equal to $\frac{i}{2}$
times the logarithm of the anharmonic ratio of the two lines and the
two minimum lines through their point of intersection.




To obtain the Euclidean formula for distance from the
general formula, sec. 52, is not so simple a matter, but it may
be done as follows:
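Let us take in place of Eq. (1) the equation
$$\alpha_1^2 + \alpha_2^2 + \epsilon\alpha_3^2 = 0, \qquad (2)$$
which goes over into Eq. (1) when $\epsilon = 0$. The corresponding
point equation is
$$\epsilon(x_1^2 + x_2^2) + x_3^2 = 0. \qquad (3)$$
From Eq. (4), sec. 52, we have if we place $M = \frac{k}{2}$,
$$\cosh \frac{P_1P_2}{k} = \frac{\epsilon(x_1y_1 + x_2y_2) + x_3y_3}{\sqrt{\epsilon(x_1^2 + x_2^2) + x_3^2}\sqrt{\epsilon(y_1^2 + y_2^2) + y_3^2}}.$$
We wish to show that this approaches as a limit the Euclidean
formula as $\epsilon \to 0$ and $k \to \infty$. For that purpose, replace $\cosh \frac{P_1P_2}{k}$
by its approximate value $1 + \frac{1}{2}\left(\frac{P_1P_2}{k}\right)^2$ and calculate $P_1P_2$.
There results
$$P_1P_2 = ik\sqrt{\epsilon}\, \frac{\sqrt{(x_1y_3 - x_3y_1)^2 + (x_2y_3 - x_3y_2)^2 + \epsilon(x_1y_2 - x_2y_1)^2}}{\sqrt{\epsilon(x_1^2 + x_2^2) + x_3^2}\sqrt{\epsilon(y_1^2 + y_2^2) + y_3^2}}.$$
Now let $\epsilon \to 0$ and $k \to \infty$, in such a way that $ik\sqrt{\epsilon} = 1$. We have
in the limit
$$P_1P_2 = \frac{\sqrt{(x_1y_3 - x_3y_1)^2 + (x_2y_3 - x_3y_2)^2}}{\sqrt{x_3^2}\sqrt{y_3^2}}.$$
Finally, let us employ non-homogeneous coordinates by
placing
$$x = \frac{x_1}{x_3}, \quad y = \frac{x_2}{x_3}, \quad x' = \frac{y_1}{y_3}, \quad y' = \frac{y_2}{y_3}.$$
We have, then, the usual Cartesian formula
$$P_1P_2 = \sqrt{(x - x')^2 + (y - y')^2}.$$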
Hence:
The Euclidean measure of distance is a limiting case of the
Cayleyan projective measurement.
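The limiting process just described can be checked numerically. The Python sketch below evaluates the hyperbolic-cosine formula for distance with the conic $\epsilon(x_1^2 + x_2^2) + x_3^2 = 0$, taking k real and $\epsilon = -1/k^2$, a choice that satisfies $ik\sqrt{\epsilon} = 1$ for one determination of the square root. The sample points, the values of k, and the function names are assumptions made for the illustration. As k grows, the computed value approaches the ordinary Cartesian distance.

```python
import math

def cayley_distance(p, q, k):
    """k * arcosh(f_xy / sqrt(f_xx * f_yy)) for the conic eps*(x1^2 + x2^2) + x3^2 = 0.

    p and q are homogeneous coordinates (x1, x2, x3); eps = -1/k^2 is assumed,
    so the fundamental conic is the real circle of radius k about the origin.
    """
    eps = -1.0 / k**2
    x1, x2, x3 = p
    y1, y2, y3 = q
    f_xy = eps * (x1 * y1 + x2 * y2) + x3 * y3
    f_xx = eps * (x1 * x1 + x2 * x2) + x3 * x3
    f_yy = eps * (y1 * y1 + y2 * y2) + y3 * y3
    ratio = f_xy / math.sqrt(f_xx * f_yy)
    # Guard against rounding that could push the ratio slightly below 1.
    return k * math.acosh(max(ratio, 1.0))

def euclidean_distance(p, q):
    """Cartesian distance after passing to non-homogeneous coordinates."""
    x, y = p[0] / p[2], p[1] / p[2]
    xp, yp = q[0] / q[2], q[1] / q[2]
    return math.hypot(x - xp, y - yp)

P1 = (1.0, 2.0, 1.0)
P2 = (4.0, 6.0, 1.0)  # Euclidean distance from P1 is 5

for k in (10.0, 100.0, 1000.0):
    print(k, cayley_distance(P1, P2, k), euclidean_distance(P1, P2))
```

Both sample points must lie inside the circle of radius k for the formula to give a real distance, which is why the smallest k used here is 10.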
XII. THE ELEMENT OF ARC
56. We have found (secs. 40, 41) that in both the Riemannian
and the Lobachevskian geometries the element of arc, ds, is
the square root of a homogeneous quadratic function of the
differentials of the coordinates which we have used.  This
is also true of the Euclidean geometry, where in rectangular
Cartesian coordinates $ds = \sqrt{dx^2 + dy^2}$.
Conversely, following the method first employed by Riemann,
we may ask if these are all the types of geometries in which the
element of arc is thus expressed. More precisely, the problem
is as follows: Let it be assumed that a point on the plane may
be determined by means of two coordinates $x_1$ and $x_2$, and that
the distance between two infinitely near points $(x_1, x_2)$ and
$(x_1 + dx_1, x_2 + dx_2)$ may be given by an equation of the form
$$ds = \sqrt{a_{11}dx_1^2 + 2a_{12}dx_1dx_2 + a_{22}dx_2^2},$$
where $a_{11}$, $a_{12}$, $a_{22}$ are functions of $x_1$ and $x_2$. It is required
to discuss the geometry which results.
An adequate discussion of this question would be altogether
too long for this place.*
We shall simply say that the straight line is then defined
as the shortest distance between two points, its equation being
the relation between $x_1$ and $x_2$ which makes the integral
$$s = \int \sqrt{a_{11}dx_1^2 + 2a_{12}dx_1dx_2 + a_{22}dx_2^2},$$
taken between constant limits, a minimum.


* Consult Woods, "Space of Constant Curvature," Annals of Mathematics, 2d series, Vol. VIII, 1901-2; Coolidge, The Elements of Non-Euclidean
Geometry, Chapter XIX.




It results finally that it is possible to replace $(x_1, x_2)$ by
polar coordinates $(r, \theta)$, whereby ds takes one of the forms
$$ds = \sqrt{dr^2 + r^2 d\theta^2},$$
$$ds = \sqrt{dr^2 + k^2 \sin^2 \frac{r}{k}\, d\theta^2},$$
$$ds = \sqrt{dr^2 + k^2 \sinh^2 \frac{r}{k}\, d\theta^2},$$
where k is a constant.
But these are the three forms which belong to the Euclidean,
the Riemannian, and the Lobachevskian geometries respectively.
We have thus the interesting result that no new type of geometry
results from the new point of view. This statement, however,
requires one modification. The present discussion, since it
starts with the infinitely small and proceeds by the methods of
the calculus, has to do only with a restricted portion of the plane.
No hypothesis is made as to the behavior of straight lines when
indefinitely extended, such as enters into the parallel postulates.
A geometry, in fact, which agrees with the Euclidean, Riemannian, or Lobachevskian geometry respectively, in a restricted
portion of the plane, may present new features when the total
extent of the plane is considered. Into this subject we cannot
go.*
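One way to get a feeling for the three forms of the element of arc in polar coordinates is to set dr = 0 and integrate over a full turn of the angle, which gives the length of a circle of radius r in each geometry. The text does not carry out this computation; the sketch below is an added illustration, and the function name and sample values are assumptions. For a fixed radius the Riemannian circle is shorter and the Lobachevskian circle longer than the Euclidean one, and all three agree as k grows large.

```python
import math

def circumference(r, geometry, k=1.0):
    """Length of a circle of radius r, from ds with dr = 0 integrated over one full turn."""
    if geometry == "euclidean":
        return 2 * math.pi * r
    if geometry == "riemannian":
        return 2 * math.pi * k * math.sin(r / k)
    if geometry == "lobachevskian":
        return 2 * math.pi * k * math.sinh(r / k)
    raise ValueError("unknown geometry: " + geometry)

r = 1.0
for k in (2.0, 1.0e6):
    print(
        k,
        circumference(r, "riemannian", k),
        circumference(r, "euclidean"),
        circumference(r, "lobachevskian", k),
    )
```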


* Consult Woods, "Forms of Non-Euclidean Space," in Boston Colloquium
Lectures on Mathematics, New York, 1905.


IV
THE FUNDAMENTAL PROPOSITIONS
OF ALGEBRA
BY EDWARD V. HUNTINGTON


CONTENTS
SECTIONS.
I. INTRODUCTION ... 1-4
II. THE ADDITION OF ANGLES AND THE MULTIPLICATION OF DISTANCES ... 5-12
5, 6, 7, The addition of angles;
8, First step toward the science of this operation: selection of axioms;
9, 10, 11, The multiplication of distances;
12, First step toward the science of this operation: selection of axioms.
III. THE ABSTRACT THEORY OF THESE OPERATIONS ... 13-20
13, Parallelism between these two operations;
14, Postulates for an abstract science to include them both;
15, "Consistency" of the postulates;
16, On the uses of an abstract science;
17, Examples of systems that do not satisfy the postulates;
18, "Independence" of the postulates;
19, "Sufficiency" of a set of postulates to determine a single type of system;
20, Note on the terms axiom and postulate.
IV. GEOMETRIC EXAMPLE OF THE ALGEBRA OF COMPLEX QUANTITIES: THE SYSTEM OF POINTS IN THE PLANE ... 21-29
21, Points in a plane; "real" and "imaginary" points;
22, 23, Addition of points in the plane;
24, 25, Multiplication of points in the plane;
26, Solution of algebraic equations;
27, The relation of order among the real points;
28, Classification of the real points: integral, fractional, rational, irrational;
29, First step toward the science of this algebra: selection of axioms.
V. THE ABSTRACT THEORY OF THE ALGEBRA OF COMPLEX QUANTITIES ... 30-38
30, A complete set of postulates for the algebra of complex quantities;
31, Consistency of the postulates;
32, 33, Sufficiency of the postulates. Examples of isomorphic systems;
34, 35, Independence of the postulates. Examples of systems that satisfy all but one of them;
36, What is algebra?
37, A complete set of postulates for the sub-algebra of all real quantities;
38, On the value of complex algebra in problems concerning real quantities.
APPENDIX
I. OTHER EXAMPLES OF THE ALGEBRA OF COMPLEX QUANTITIES ... 39-42
39-41, Arithmetical systems of Dedekind and Cantor.
42, Comments on these arithmetical systems.
II. GEOMETRIC PROOF THAT EVERY ALGEBRAIC EQUATION HAS A ROOT ... 43-45


I. INTRODUCTION
1. Purpose of the article. The main object of this article is to
present, in as simple a form as possible, the results of some of
the modern inquiries into the logical foundations of algebra; but
the article is so arranged that readers who desire merely to
increase their store of information about algebraic facts, without
going into the discussion of logical foundations, may find, in
Part IV, a systematic introduction to the algebra of complex quantities, which may be read independently of the rest of the
article.
There has been much discussion of late years over the place which
logical rigor should occupy in the teaching of elementary mathematics.
Some have contended that the power to understand a logically rigorous
demonstration is itself the most important result to be aimed at in
mathematical study. Others have attached greater importance to the
use of mathematics as a practical art, and have felt that too much
insistence on logical rigor serves only to deaden the pupil's interest,
and thus to destroy all the value the study might have, either as a
practical art or as a training in logic.
It is not the purpose of the present article to discuss these pedagogical questions. It is intended merely to put before the reader a clear
statement, in some detail, of what is actually involved in a strictly
logical treatment of algebra, leaving to the teachers themselves the
question as to how far logical rigor can be pressed in the classroom.*
2. The science of algebra vs. the science of geometry. It is
a curious fact that the one striking example of rigorous mathematical reasoning with which everyone is familiar is taken from
geometry rather than from algebra. Euclid's Elements have
stood for 2000 years as the supreme illustration of the mathematical manner of reasoning. Axiom, theorem; hypothesis,
conclusion; proposition, demonstration, corollary; the defence
of every statement by reference to a previously established
truth; in short, all the apparatus and method of mathematical reasoning
call up at once in our minds a text-book in geometry, never a
text-book in algebra. Even the external form of our books
contributes to this result. The current treatises on algebra are
not divided into Book I, Book II, etc., as are those in geometry;
their theorems are not numbered in consecutive order; little
distinction is made between explanation and proof; nothing is
done to suggest the strict logical sequence of propositions which
is so constantly emphasized in every book on geometry.
Until recent years, elementary algebra has been largely a
miscellaneous collection of rules for the manipulation of algebraic
expressions, and is not at all the developed science that elementary geometry has long since become. In fact, if it were not
for the study of plane geometry in our schools, it is doubtful
whether our school children would ever derive, from their study
of algebra alone, any clear notion of what is meant by a mathematical demonstration.
This fact is the more remarkable, because, on account of
the simpler nature of the concepts with which it deals, algebra
is better suited than geometry to serve as an illustration of
what is essentially involved in mathematical reasoning. In
geometry, the very concreteness and familiarity of the subject-matter is apt to obscure the logical structure of the science, while


* Reference may here be made to a forthcoming book by John Wesley
Young, entitled "Lectures on Fundamental Concepts of Algebra and
Geometry," 1911.




in algebra, the more abstract character of the content of the
theorems makes it easier to fix the attention on their formal
logical relations.
The present article is intended as an introduction to the
science of algebra as distinguished from the art of manipulating
algebraic expressions. In what proportions the science and the
art should be mingled in practical teaching is a question with
which this article, as already stated, does not propose to deal.
3. The various types of algebra. Irrational and imaginary
quantities. It should be mentioned at once that there is, strictly
speaking, no one science of algebra, but rather a collection of closely
related sciences, all of which are commonly grouped together under the
general name of algebra.
For example, we have the algebra of positive integers; the algebra
of all integers (positive, negative, and zero); the algebra of positive
rationals; the algebra of all rationals; the algebra of all real quantities
(rational and irrational, positive, negative, and zero); and, finally, the
algebra which in a certain sense includes all these others, and is in many
respects simpler than any of them, the algebra of complex quantities.
In these various algebras, many theorems are, in form, identical;
but many other theorems are true in one algebra and false in another.
For example, the theorem,
If $a^3 = b^3$, then $a = b$,
is true in the algebra of real quantities, and not true in the algebra of
complex quantities. Again, the theorem that "Every quantity has at
least one cube root," is true in the algebra of all complex or all real
quantities, but is false in the algebra of rational quantities or the algebra
of integers.
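The two examples above can be made concrete with a short Python sketch. The first part exhibits a complex number whose cube equals the cube of 1 although the number itself is not 1, so that the theorem on cubes fails in the algebra of complex quantities. The second part runs a bounded search for a rational cube root of 2; a finite search of this kind is only an illustration and proves nothing about all rationals. The names `omega` and `rational_cube_roots` and the search bounds are assumptions of the example.

```python
import math
from fractions import Fraction

# A complex cube root of unity: omega**3 equals 1**3, yet omega is not 1.
omega = complex(-0.5, math.sqrt(3) / 2)
print(abs(omega**3 - 1) < 1e-12)  # the cubes agree up to rounding
print(omega == 1)                 # the numbers themselves differ

def rational_cube_roots(target, max_denominator):
    """Exact search for fractions p/q in [0, 2] with q <= max_denominator whose cube is target."""
    hits = set()
    for q in range(1, max_denominator + 1):
        for p in range(0, 2 * q + 1):
            candidate = Fraction(p, q)
            if candidate**3 == target:
                hits.add(candidate)
    return hits

print(rational_cube_roots(8, 5))   # 8 has the rational cube root 2
print(rational_cube_roots(2, 50))  # no candidate in the searched range cubes to 2
```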
The distinction between the various types of algebra is directly
connected with the problem of the so-called " irrational " and " imaginary"
quantities.
Much of the difficulty which perplexes every thoughtful student at
the time when irrational and imaginary quantities are first introduced,
is due to the failure to recognize the fact that he is really leaving one
system of algebra, and passing to another and different system, and
that the theorems established in the first system cannot be expected
(without further proof) to hold in the second.
It is small wonder that a boy is confused and perplexed when he is
told on one page that "the square of every number is positive, and
hence $\sqrt{-1}$ cannot exist," and on the next page that "the $\sqrt{-1}$ really
is a number, and obeys all the laws of algebra." The fact is, of course,
that the $\sqrt{-1}$ occurs only in the algebra of complex quantities, a quite
different algebra from the algebra of real quantities which the boy has
so far studied; and it is simply not true to state that a quantity which
belongs in one of these algebras obeys all the laws which are valid in
the other.
Again, the pupil is often told that we "must" introduce the number
$\sqrt{-1}$ "because" the equation $x^2 = -1$ must needs, in the nature of things,
have a root. But why do we not say, with equal reason, that we "must"
introduce the number infinity, "because" the equation $5/x = 0$ must
needs have a root? If we say that $\sqrt{-1}$ is "a number that obeys all
laws of algebra," why do we not say that $\infty$, the existence of which
may be claimed on the ground of precisely similar necessity, is also a
"number that obeys all the laws of algebra"? Inconsistencies like
this, while they do not trouble the average pupil, do present serious
perplexities to those who are more critically inclined. It is not clear
why a "must" that is so imperative in one case should be so ignored
in a precisely similar case. The fact is, of course, that the alleged
necessity carries no compulsion with it in either case; it is merely the
expression of a desire for a simpler algebra, in which every equation
shall have a root; the fact that the algebra of complex quantities comes
nearer than any of the other algebras to fulfilling this desire is a matter
for observation, not a consequence of logical necessity. And yet what
pupil in our high schools has ever had a concrete example of complex
algebra presented to him upon which he could make this observation? *
In regard to this whole problem of the introduction of irrational and
complex quantities into elementary algebra, the method of successive
"extension of the number-concept," which was historically the method
by which these quantities were discovered, seems to be of very questionable value as a method of instruction at the present day. The very
terms that have come down to us (surd, meaning "absurd"; irrational;
imaginary) show the doubts about the legitimacy of these new quantities which were occasioned by this method of introducing them. In
the light of the modern science of algebra, these doubts simply do not
occur; the whole point of view in regard to algebraic quantities has
changed; the old terminology itself is retained only out of respect for
the past.


* Compare the trenchant remarks on this subject by C. F. Gauss in his
famous Doctor's Dissertation, 1799. Reprinted in Ostwald's Klassiker der
exacten Wissenschaften, under the title: Beweise für die Zerlegung ganzer
algebraischer Functionen in reelle Factoren ersten und zweiten Grades.




After clear ideas have once been reached on these subjects, one is
forced to raise the question whether it is necessary to perplex all our
pupils of to-day with the same vagueness and obscurity through which
the earlier pioneers had to struggle. Is it necessary to turn out hundreds of pupils, as we do, from our courses in algebra, with the conviction hopelessly fixed in their minds that some of the things with
which algebra deals are, in all truth, "absurd" and "imaginary"?
In the opinion of the present writer, if the irrational and imaginary
quantities are to be introduced into elementary work at all, the method
which is most satisfactory from the strictly scientific point of view,
is also by far the simplest and most satisfactory from the point of view
of the elementary student. This opinion it is hoped will be borne out
by the sequel. (See especially secs. 36, 38, 39, 42, and remarks in
secs. 26, 28, 29 and 30.)
4. Plan of the article. The space at our disposal does not
permit the separate development of the several types of algebra,
in the order in which, beginning with the algebra of the positive
integers, these types would naturally be presented to the pupil.
We shall confine ourselves chiefly to the algebra of complex
quantities, which is the most inclusive and the most interesting
type. In IV a geometrical example of this type of algebra is
given (without the use of trigonometry) and in V the
abstract theory of the algebra is developed, that is, the precise
conditions are laid down which any system must satisfy in order
to be equivalent to the algebra in question (sec. 30). In sec. 35
several examples of "pseudo-algebras" are given, that is, systems that satisfy most, but not all, of the conditions of sec.
30; for it is only by a study of what the algebra is not that
we can fully understand what it is.
II and III are preliminary to the main discussion.  In II a
number of geometrical facts are observed, of which use will be
made in IV. III shows how this collection of geometrical facts
can be reduced to an abstract science, and serves to illustrate,
in this very simple case, all the steps of the reasoning which
will be used in the general case in V.
The chief points in the article which may be unfamiliar to many
readers are the following: The analysis of the fundamental concepts which
occur in algebra; the notion of the "equivalence," or "isomorphism,"
of two algebraic systems with respect to these fundamental concepts;
the notion of the "sufficiency" of a selected set of fundamental propositions to determine uniquely a particular type of algebra; and the
method of establishing the "consistency" and "independence" of the
propositions of such a set.
II. THE ADDITION OF ANGLES AND THE MULTIPLICATION
OF DISTANCES
5. The addition of angles. We begin with a preliminary
discussion of the very simple and familiar process of the addition
of angles.
By an angle, as in all higher mathematics, we mean an
amount of rotation of a line about a fixed point O, in a plane.
Such a rotation may be counter-clockwise or clockwise, and of
any amount; as, +250°, -780°, etc.
To clarify our ideas about rotations of more than 360°, it
will be well to adopt Riemann's famous device, and think of
the plane about the point O as made up of numerous distinct
sheets, joined together after the fashion of a spiral staircase;
a moving radius rotating about the point O winds around from
one sheet to the next as if it were following the thread of a
screw. Two angles like 360° and 720° are thus kept distinct;
for although the terminal lines of these angles point in the
same direction, they lie in different sheets of the Riemann
surface.
If two angles $\alpha$ and $\beta$ are given, a third angle $\gamma$ may be
derived from them by the following familiar process: starting
with a given initial line as the zero angle, perform the rotation indicated by $\alpha$; then continuing from the terminal line
of $\alpha$, perform a rotation equal in amount and direction to $\beta$;
the final position thus reached is the terminal line of the
required angle $\gamma$. This angle $\gamma$ is called the sum of the given
angles $\alpha$ and $\beta$ (with respect to the chosen zero) and is denoted
by $\alpha + \beta$.
6. Concerning the addition of angles, as thus defined, the
reader may easily verify the following familiar statements:
(a) If $\alpha$ and $\beta$ are any two angles (whether equal or unequal),
then their sum, $\alpha + \beta$, is an angle uniquely determined by $\alpha$
and $\beta$ (with respect to the chosen zero-angle).
(b) $\alpha + \beta = \beta + \alpha$. (Commutative law for addition.)
(c) If $\alpha$, $\beta$, $\gamma$ are any three angles, $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$.
(Associative law for addition.)
(d) If $\alpha + \beta = \alpha + \beta'$, then $\beta = \beta'$. ("Law of cancellation"
for addition.)
If we introduce, for abbreviation, the notation $2\alpha = \alpha + \alpha$,
$3\alpha = \alpha + \alpha + \alpha$, ..., $n\alpha = \alpha + \alpha + \cdots + \alpha$ to n terms, where n is
any positive integer, we have further:
(e) If $n\alpha = n\beta$, then $\alpha = \beta$.
The angle $n\alpha$ is called the nth multiple of the angle $\alpha$.
Three other facts of somewhat different character ("existence theorems") are the following:
(f) There is one and only one angle x such that $x + x = x$;
this angle x is the zero angle, and is denoted by 0.
(g) Every angle $\alpha$ determines uniquely an angle $\alpha'$ such
that $\alpha + \alpha' = 0$. This angle $\alpha'$ is called the opposite of $\alpha$ and
is denoted by $-\alpha$.
(h) For every angle $\alpha$ and every positive integer n, there
is an angle y, uniquely determined by $\alpha$ and n, such that
$ny = \alpha$. This angle y is called the nth submultiple of $\alpha$, denoted
by $\alpha/n$. For example, we have $\alpha/2$, $\alpha/3$, etc.
7. Among the many further facts which might be mentioned,
the following are the most important for our present purpose:
(i) If 0 is the zero angle, then for every angle $\alpha$, $\alpha + 0 = \alpha$.
(j) If $\alpha$ and $\beta$ are any given angles, there is always an
angle x, uniquely determined by $\alpha$ and $\beta$, such that $\alpha = \beta + x$;
this angle x is called the remainder, $\alpha$ minus $\beta$, and is denoted
by $\alpha - \beta$, and the process by which it is obtained is called
subtraction. The angle $\alpha - \beta$ is the same as the angle $\alpha + (-\beta)$.
Hence, to subtract an angle $\beta$ means to add the opposite of $\beta$.
(k) If m and n are any positive integers, the angle $m(\alpha/n)$
is equal to the angle $(m\alpha)/n$, so that either may be denoted
by $(m/n)\alpha$.
All these statements, (a)-(k), may be regarded as the direct
result of observation. There is no necessary logical order among
them; any one may be obtained without reference to the others
directly from the figure, as the reader may readily verify.
8. First step toward the science of this process. Selection of
axioms. Now this miscellaneous collection of facts about angles
does not constitute a science. In order to reduce it to a science,
the first step is to do what Euclid did in geometry, namely, to
select a small number of the given facts as axioms, and then to
show that all other facts can be deduced from  these axioms by
the methods of formal logic.
As a convenient choice of axioms for the science of the
addition of angles, we may take the propositions (a)-(h) in
sec. 6; from these axioms the other propositions, (i), (j), (k),
etc., can be deduced as theorems, without further reference to
the definition.
For example, the proof of theorem (i) is as follows: By
(f), $0 + 0 = 0$, hence, by (a), $\alpha + (0 + 0) = \alpha + 0$, whence, by
(c), $(\alpha + 0) + 0 = \alpha + 0$, whence, by (b), $0 + (\alpha + 0) = 0 + \alpha$;
therefore, by (d), $\alpha + 0 = \alpha$, which was to be proved.
Similarly, the proof for (j) is as follows: By (f) and (g), there is an
angle $-\beta$ such that $\beta + (-\beta) = 0$. Let $x = \alpha + (-\beta)$, which is known
to be an angle, by (a). Then, by the use of (b), (c), (g), and (b), and
theorem (i),
$$\beta + x = \beta + [\alpha + (-\beta)] = \beta + [(-\beta) + \alpha] = [\beta + (-\beta)] + \alpha = 0 + \alpha = \alpha + 0 = \alpha;$$
that is, the angle $x = \alpha + (-\beta)$ is an angle which, when added to $\beta$, produces
$\alpha$, as was to be proved. That this angle is uniquely determined by $\alpha$ and
$\beta$ follows at once from (d).
The proof of (k) may be illustrated by a numerical case. Let
$x = 3(\alpha/2)$ and $y = (3\alpha)/2$; then by (a) and (c),
$$2x = [3(\alpha/2)] + [3(\alpha/2)] = [\alpha/2 + \alpha/2 + \alpha/2] + [\alpha/2 + \alpha/2 + \alpha/2]$$
$$= [\alpha/2 + \alpha/2] + [\alpha/2 + \alpha/2] + [\alpha/2 + \alpha/2] = [2(\alpha/2)] + [2(\alpha/2)] + [2(\alpha/2)]$$
$$= 3[2(\alpha/2)]$$
$$= 3[\alpha], \text{ by the definition of } \alpha/2.$$
But also, $2y = 2[(3\alpha)/2] = (3\alpha)$, by the same definition. Therefore,
$2x = 2y$, whence $x = y$, by (e). The general case for m and n is proved
in a similar way.
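The numerical case of (k) can also be checked mechanically. The Python sketch below models an angle as an exact rational number of degrees, which is an assumption of the example (the text treats angles as amounts of rotation, not as numbers), and the helper names `multiple` and `submultiple` are hypothetical. It builds the nth multiple by repeated addition, as in the definition preceding (e), and then checks theorems (i), (j), and (k) for sample values.

```python
from fractions import Fraction

ZERO = Fraction(0)  # the zero angle of (f)

def multiple(n, angle):
    """nth multiple of an angle, formed by adding the angle to itself n times."""
    total = ZERO
    for _ in range(n):
        total = total + angle
    return total

def submultiple(angle, n):
    """nth submultiple of (h): the angle y for which multiple(n, y) equals angle."""
    return angle / n

alpha = Fraction(250)   # +250 degrees
beta = Fraction(-780)   # -780 degrees

# Theorem (i): adding the zero angle changes nothing.
print(alpha + ZERO == alpha)

# Theorem (j): x = alpha + (-beta) is the angle which, added to beta, gives alpha.
x_j = alpha + (-beta)
print(beta + x_j == alpha)

# Theorem (k) with m = 3, n = 2: 3(alpha/2) and (3 alpha)/2 are the same angle.
x = multiple(3, submultiple(alpha, 2))
y = submultiple(multiple(3, alpha), 2)
print(multiple(2, x) == multiple(3, alpha), x == y)
```

A check of this kind confirms particular instances only; the deduction from the axioms is what establishes the general statement.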
It must not be supposed that proofs like these, in which
every step is carefully justified by reference to one or other
of the axioms, are necessary to convince us that the statements
in question are true; indeed, in this particular case, the theorems
proved are quite as obvious as the axioms on which the proof
is based; all of them  may be obtained independently, by
direct observation of the figure.
The fact is that a mathematical demonstration, strictly
speaking, is not concerned with the truth of the proposition
at all; it is concerned merely with the logical relation that
exists between the given proposition and certain other propositions called the axioms; in other words, all that a mathematical demonstration tells us is that if the axioms are true,
then the theorem in question will also be true, provided, of
course, that our deductive reasoning is sound.
Provided that our deductive reasoning is sound: there is
the difficulty. How can we be sure that each step of the
deduction is logically justified? How can we be sure that no
assumption is tacitly used in the proof which was not explicitly
stated in the axioms? Even Euclid did not escape this danger;
he often used, for example, assumptions about the motion of
a rigid body which he did not include in his axioms. In fact,
it is only in recent years that a really complete list of axioms
for geometry has been laid down.*  How can we be sure that
similar errors will not creep into our reasoning in algebra?
The answer to this question involves a further refinement
of the scientific method, which will be discussed in Part III.
9. The multiplication of distances. The system studied in
the preceding sections on the addition of angles is an example
of the type of algebra called the "algebra of all real quantities"
as far as the operation of addition is concerned.
We now consider a second operation, to be called multiplication, this operation being performed not on angles, but
on geometric lengths, or distances.
Suppose two distances a and b are given; and then, having
chosen a given distance u as a "unit distance," find a distance
x by the construction shown in Figure 1, in which b is at
right angles to u and a, and the oblique lines are parallel.


* See Monograph I.




This distance x is called the product of the given distances
a and b (with respect to the chosen unit) and is denoted by
$a \times b$, or $a \cdot b$, or simply ab. The process by which this product
is obtained is called multiplication.
From this definition it follows that if $x = a \times b$, the area of
the rectangle whose sides are x and u is equal to the area of the
rectangle whose sides are a and b.
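The construction can be followed in coordinates. The Python sketch below assumes a particular layout for the figure, which is not reproduced here: O is the origin, the distances u and a are laid off from O along one line to the points U and A, the distance b is laid off along the perpendicular through O to the point B, and the line through A parallel to UB meets that perpendicular at a point C, with OC taken as the product. This layout, the point names as coordinates, and the function name are assumptions of the example. Under them the construction yields x = ab/u, so that the rectangle with sides x and u has the same area as the rectangle with sides a and b.

```python
import math

def product_by_parallels(a, b, u=1.0):
    """Distance OC obtained by drawing through A a parallel to the line UB.

    Assumed layout: O = (0, 0), U = (u, 0), A = (a, 0), B = (0, b).
    """
    # Direction of the oblique line from U to B.
    dx, dy = 0.0 - u, b - 0.0
    # Walk from A along that direction until the perpendicular through O is reached:
    # a + t * dx = 0 gives t = a / u.
    t = -a / dx
    # Height of the meeting point C above O.
    return 0.0 + t * dy

a, b, u = 2.0, 3.0, 1.0
x = product_by_parallels(a, b, u)
print(x)              # product with respect to the unit u
print(x * u, a * b)   # the two rectangle areas agree

# The product depends on the chosen unit distance.
print(product_by_parallels(2.0, 3.0, u=2.0))
```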
To see that the two rectangles, OCDU and OBEA, are equivalent,
note that the part OBQU is common to both; further, the lines PQ,
QR, CS, and TA are all equal to BU (being portions of parallels intercepted between parallels), so that the triangles BPQ and DCS in one
rectangle are equal to the triangles UQR and ETA in the other; and,
finally, the parallelograms CSQP in one rectangle and QTAR in the
other are equivalent (having equal bases PQ and QR and equal altitudes).*
[FIG. 1. FIG. 2.]
10. Concerning the multiplication   of distances, as thus
defined, the reader may readily verify the following statements:
(a) If a and b are any two distances (whether equal or
unequal) then their product, $a \times b$, is a distance uniquely determined by a and b (with respect to the chosen unit distance).


* It will be noticed that this proof does not assume the theorem that the
area of a rectangle is equal to its base times its altitude, nor any theorems
on ratio and proportion.




(b) aXb=bXa.     (Commutative law for multiplication.)
(c) If a, b, c are any three distances, then
(aXb) Xc=ax (b Xc).
(Associative law for multiplication.)
To see that this is true, let axb=x and b Xc=y, and then xXc=z and
aXy=z', so that we have
(aXb) Xc=xXc=z and aX(bXc) =aXy=z'.
--. -
IC                                       a
z
FIG. 3.
Then show that the parallelopiped whose edges are z, u, u and the
parallelopiped whose edges are z', u, u both have the same volume as the
parallelopiped whose edges are a, b, c. Therefore z= z'.
(d) If a × b = a × b', then b = b'. ("Law of cancellation" for
multiplication.)
If we introduce, for abbreviation, the notation a^2 = a × a,
a^3 = a × a × a, ..., a^n = a × a × ... × a to n factors, where n is
any positive integer, we have further (see Figure 4):
(e) If a^n = b^n, then a = b.
The distance a^n is called the nth power of the distance a.
Three other facts of somewhat different character ("existence theorems") are the following:
(f) There is one and only one distance x such that x × x = x.
This distance x is the unit distance, and is denoted by 1.
(g) Every distance a determines uniquely a distance a' such
that a × a' = 1. This distance a' is called the reciprocal of a




and is denoted by a^-1 or 1/a. For example, if a is five times
1 then a^-1 is one-fifth of 1, etc.
(h) For every distance a and every positive integer n,
there is a distance y, uniquely determined by a and n, such
that y^n = a. This distance y is called the nth root of a, denoted
by a^(1/n) or ⁿ√a. For example, if a is a length nine times as
long as 1, then √a, or a^(1/2), will be a length three times as long
as 1, etc.
[FIG. 4.]
11. Many other facts about the multiplication of distances
might be mentioned, of which the following will suffice for
our present purpose:
(i) If 1 is the unit distance, then for every distance a,
a × 1 = a.
(j) If a and b are any given distances, there is always a
distance x, uniquely determined by a and b, such that a = b × x;
this distance x is called the quotient, a divided by b, and is
denoted by a/b; and the process by which it is obtained is
called division. For example, if a = 10(1) and b = 5(1), then
a/b = 2(1); etc.
The distance a/b is the same as the distance a × (b^-1).




Hence, to divide by the distance b means to multiply by the
reciprocal of b.
(k) If m and n are any positive integers, the distance
(a^(1/n))^m is equal to the distance (a^m)^(1/n), so that either may be
denoted by a^(m/n).
All these statements, (a)-(k), about the multiplication of
distances, like the statements (a)-(k) in secs. 6-7 about the
addition of angles, may be regarded as the direct result of
observation, any one of them being obtainable immediately,
without reference to the others.
12. First step toward the science of this process. In order
to reduce this miscellaneous collection of facts to a science, we
may take as the axioms of the science the propositions (a)-(h)
in sec. 10, and proceed exactly as in sec. 8; the steps of the
reasoning are precisely parallel, and need not be repeated here.
The system here studied is an example of the type of algebra
called "the algebra of positive reals," as far as the operation
of multiplication is concerned.
We now turn to the problem (already referred to in sec. 8)
of how to make more rigorous the science of these two systems.
III. THE ABSTRACT THEORY OF THESE OPERATIONS
13. Parallelism between these two operations. The parallelism between the two systems just described is too striking to
have escaped attention. The propositions (a)-(h) in sec. 6 are,
as far as their form is concerned, identical with the propositions
(a)-(h) in sec. 10. The meaning and content of the two sets
of propositions are of course very different; the first set concerns the addition of angles, while the second set concerns the
multiplication of distances; but their form is the same, since
all the propositions of the second set can be obtained at
once from those of the first by replacing "angle" by "distance," "sum" by "product," "zero" by "unit," "opposite"
by "reciprocal," "subtraction" by "division," etc. This
duality between the two sets of propositions will of course
extend through all the propositions that are deducible from




them by the methods of formal logic; from every proposition
concerning the addition (or subtraction) of angles, a corresponding proposition concerning the multiplication (or division) of
distances can at once be obtained by merely changing the
interpretation of the symbols, without changing the form of
the statement.
14. Postulates for an abstract science to include them both.
This duality at once suggests the possibility of developing a
general theory which shall include both these theories as special
cases. To do this, we proceed as follows: Consider a general
class of things or "elements" denoted by A, B, C, etc., without
specifying whether these things are to be angles (α, β, γ, etc.)
or distances (a, b, c, etc.), and a general rule of combination
denoted by ∘, without specifying whether this rule of combination is to be addition (+) or multiplication (×),* and impose
upon these symbols the following conditions:
(a) If A and B are elements of the class, then A ∘ B (read:
"A with B") is an element of the class, uniquely determined
by A and B.
(b) A ∘ B = B ∘ A.† (Commutative law.)
(c) (A ∘ B) ∘ C = A ∘ (B ∘ C). (Associative law.)
(d) If A ∘ B = A ∘ B', then B = B'. ("Law of cancellation.")
(e) If A^[n] = B^[n], then A = B.
Here A^[n] means A ∘ A ∘ ... ∘ A, to n elements, where n
is a positive integer.
(f) There is an element X such that X ∘ X = X.
[It can be shown from the preceding conditions that there cannot be
more than one such element. For, suppose there were two such elements,
X and Y, such that X ∘ X = X and Y ∘ Y = Y; then, by (a), (X ∘ X) ∘ Y =
X ∘ (Y ∘ Y), whence, by (c), X ∘ (X ∘ Y) = X ∘ (Y ∘ Y), whence, by (d),
X ∘ Y = Y ∘ Y; therefore, by (b), Y ∘ X = Y ∘ Y, whence, by (d), X = Y.]
* A system composed of a class K and a rule of combination ∘ we shall
speak of as a "system (K, ∘)."
† The equality sign, =, is used to indicate that the two expressions
between which it stands are interchangeable in any proposition of the theory.
If desired, the laws of operation with this symbol may be formally stated
as follows: (1) A = A; (2) if A = B then B = A; (3) if A = B and B = C,
then A = C.




(g) If X is the unique element such that X ∘ X = X, then for
every element A there is an element A' such that A ∘ A' = X.
[It follows from (d) that this element A' is uniquely determined by A.]
(h) For every element A and every positive integer n, there
is an element Y such that Y^[n] = A, where Y^[n] means Y ∘ Y ∘
... ∘ Y to n elements.
[It follows from (e) that this element Y is uniquely determined by
A and n.]
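The conditions just listed can be spot-checked by machine on the two concrete systems of Part II. The following is only a sketch under stated assumptions: angles and distances are represented by exact rational numbers (degrees and multiples of the unit, respectively), only a small finite sample is examined, and the helper names `power` and `spot_check` are illustrative and not part of the text. Conditions (a) and (h) are not tested, since a finite sample of rationals cannot exhibit closure or the existence of roots.

```python
from fractions import Fraction
from itertools import product
from operator import add, mul


def power(op, a, n):
    """Return a^[n] = a o a o ... o a (n elements), combined from the left."""
    result = a
    for _ in range(n - 1):
        result = op(result, a)
    return result


def spot_check(op, samples, neutral, inverse, max_n=4):
    """Check conditions (b)-(g) on a finite sample; AssertionError on failure."""
    for a, b in product(samples, repeat=2):
        assert op(a, b) == op(b, a)                    # (b) commutative law
        for n in range(1, max_n + 1):                  # (e) equal a^[n] force a = b
            assert power(op, a, n) != power(op, b, n) or a == b
    for a, b, c in product(samples, repeat=3):
        assert op(op(a, b), c) == op(a, op(b, c))      # (c) associative law
        assert op(a, b) != op(a, c) or b == c          # (d) law of cancellation
    assert op(neutral, neutral) == neutral             # (f) X o X = X
    for a in samples:
        assert op(a, inverse(a)) == neutral            # (g) A o A' = X
    return True


# Assumed samples: angles in degrees under addition,
# distances in units under multiplication.
angles = [Fraction(-45), Fraction(0), Fraction(30), Fraction(7, 2)]
distances = [Fraction(1, 2), Fraction(1), Fraction(3), Fraction(5, 4)]

angles_ok = spot_check(add, angles, Fraction(0), lambda a: -a)
distances_ok = spot_check(mul, distances, Fraction(1), lambda a: 1 / a)
```

A passing check on a sample is of course no proof; it merely illustrates that one and the same routine serves both interpretations of the symbols.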
15. Consistency of the postulates. From these eight conditions, or "postulates," as we shall call them, a long list of
theorems can be deduced; for example:
(i) If X is the unique element such that X ∘ X = X, then
for every element A,
A ∘ X = A;
moreover, any system which satisfies all these conditions (a)-(h) will
satisfy also all the theorems derived therefrom.
But the first question to be asked about such a set of conditions or "postulates," is this: Are they consistent demands?
In other words, does any system exist which satisfies all the
conditions? In this case the answer is, of course, affirmative:
for, if the class A, B, C, ... is the class of angles, and the rule
of combination ∘ is the rule of addition, then all the conditions
are satisfied, as we saw in sec. 6; the elements X, A', and Y,
whose existence is demanded in (f), (g), and (h), are the "zero
angle," the "opposite of A," and the "nth submultiple of A," of
that system. Again, if the class A, B, C, ... is the class of
distances, and the rule of combination ∘ is the rule of multiplication, then also all the conditions are satisfied, as we saw
in sec. 10; the elements X, A', and Y now being called the
"unit distance," the "reciprocal of A," and the "nth root of
A." Indeed, the system of angles, under addition, and the
system of distances, under multiplication, are only two examples
out of many which satisfy all these eight conditions, so that we
may be well assured that the conditions are consistent.
These eight postulates, (a)-(h), may therefore be taken as the




fundamental propositions of an abstract science, which will exhibit,
in skeleton form, the logical structure of a large class of systems,
of which the systems described in Part II are examples.
This is the refinement of the scientific method, to which
reference was made in sec. 8. The great advantages of the
method are: first, that the essential properties of a whole class
of systems are epitomized in one abstract theory; and secondly,
that the liability to error in deducing one theorem from another
is vastly reduced by the abstract form of statement, which
includes everything that is essential and nothing that is accidental.
For example, in the proof of theorem (i) in sec. 8, it was an "accident" that the symbols "α" and "0" represented angles, and the
symbol "+" addition; the essential thing was that these symbols
obeyed the formal laws laid down in propositions (a)-(h).
Further, if any system, consisting of a class of elements
A, B, C, ... and a rule of combination ∘, is laid before us,
we have only to assure ourselves that this system satisfies the
eight postulates of our abstract science, in order to be convinced that this system will also satisfy all the derived theorems, which form the body of the science.
16. On the uses of an abstract science. From this discussion it will be evident that the main interest of an abstract
science centers about the logical relations between abstract
propositions, rather than about the applicability of these propositions to concrete things. But many important mathematical
theories have been developed as " abstract sciences," from an
apparently quite arbitrary set of postulates, which have later
proved to be powerful tools in applied mathematics, when
important practical systems that satisfied all the postulates
of these particular theories unexpectedly presented themselves.
The case of the algebra of complex quantities, the study of which
will form the main part of the present article, is precisely a case in
point. This algebra was developed, historically, from the purest of
purely "mathematical" motives, to satisfy a scientific curiosity as to
what conclusions could be drawn from certain assumed hypotheses,
with no thought of application to electrical engineering or any other




branch of practical science; and yet when the electrical engineers, long
after, began to develop the theory of alternating currents, they found
that the fundamental conditions of their problem were formally identical
with the fundamental postulates of the abstract science of this algebra;
consequently the whole highly developed mathematical theory, with all
its ramifications, became at once an invaluable tool, ready to hand,
for the work of this most practical of practical sciences.
17. Examples of systems that do not satisfy these postulates.
Concerning the set of postulates (a)-(h) of sec. 14, it will
be instructive to give here a few examples of systems which
do not satisfy all of these postulates; for it is only by understanding what a thing is not that we can fully understand
what it is. For this purpose, we shall exhibit eight systems,
each of which satisfies all but one of the eight postulates.
EXAMPLE (a). Let the class A, B, C, ... be the class of
all angles between -10° and +10°, and let A ∘ B be A + B.
This system fails to satisfy postulate (a), since 7° ∘ 8° = 15°, for
instance, is not in the class. All the other postulates are satisfied.
EXAMPLE (b). Let the class be the class of positive integral
numbers; and let the rule of combination be such that A ∘ B = B.
For example, 7 ∘ 8 = 8, 15 ∘ 3 = 3, etc.
This system clearly fails to satisfy the commutative law, postulate
(b); but all the other postulates are satisfied. Thus, in postulate (f),
any element X will have the required property X ∘ X = X; since this
element X is not uniquely determined, postulate (g) has nothing further
to demand; this postulate is, therefore, as we say, satisfied "vacuously." *
To show that postulate (h) is satisfied, take Y = A.
EXAMPLE (c). Class: all angles; rule of combination:
A ∘ B = (A + B)/3.
Here the associative law, (c), is not satisfied, since, for example,
(3° ∘ 6°) ∘ 12° = 3° ∘ 12° = 5°, while 3° ∘ (6° ∘ 12°) = 3° ∘ 6° = 3°.
All the other postulates are satisfied. Thus, in (f), take X = the zero
angle; in (g), take A' = -A; in (h), notice first that
$$A^{[2]} = \frac{A + A}{3} = \frac{2}{3}A, \qquad A^{[3]} = \frac{1}{3}\left(\frac{2}{3}A + A\right) = \left(\frac{2}{3^{2}} + \frac{1}{3}\right)A,$$


* It is not surprising that X is not uniquely determined in this system,
since postulate (b) was one of the postulates required for the proof of
uniqueness given above.




so that in general, by the formula for the sum of a geometric series,
$$A^{[n]} = \frac{3^{n-1} + 1}{2 \times 3^{n-1}}\,A;$$
hence, if we take
$$Y = \frac{2 \times 3^{n-1}}{3^{n-1} + 1}\,A,$$
postulate (h) will be satisfied.
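The closed formula of Example (c) may be confirmed numerically. The sketch below assumes that A^[n] is built up from the left, that is, A^[k+1] = A^[k] ∘ A (the rule not being associative, the grouping matters), and represents angles by exact rationals in degrees; the function names are illustrative only.

```python
from fractions import Fraction


def combine(a, b):
    """Rule of combination of Example (c): A o B = (A + B) / 3."""
    return (a + b) / 3


def power(a, n):
    """A^[n], assumed to be grouped from the left: A^[k+1] = A^[k] o A."""
    result = a
    for _ in range(n - 1):
        result = combine(result, a)
    return result


def closed_form(a, n):
    """The value (3^(n-1) + 1) / (2 * 3^(n-1)) * A given by the geometric series."""
    return Fraction(3 ** (n - 1) + 1, 2 * 3 ** (n - 1)) * a


def root(a, n):
    """The element Y = 2 * 3^(n-1) / (3^(n-1) + 1) * A proposed for postulate (h)."""
    return Fraction(2 * 3 ** (n - 1), 3 ** (n - 1) + 1) * a


A = Fraction(12)
formula_agrees = all(power(A, n) == closed_form(A, n) for n in range(1, 8))
roots_work = all(power(root(A, n), n) == A for n in range(1, 8))

# The failure of the associative law cited in the text.
left = combine(combine(Fraction(3), Fraction(6)), Fraction(12))
right = combine(Fraction(3), combine(Fraction(6), Fraction(12)))
```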
EXAMPLE (d). Class: all angles; rule of combination: if
A is distinct from B, A ∘ B = the zero angle; but A ∘ A = A.
This system fails to satisfy the "law of cancellation," but satisfies
all the other postulates. Postulate (g) is satisfied "vacuously," since
there is no uniquely determined element X to which this condition
could refer.
EXAMPLE (e). Class: all angles; congruent angles being
regarded as equal;* rule of combination: A ∘ B = that angle in
the first revolution which is congruent to A + B.
Here (e) is not satisfied, since, for example, (60°)^[2] = 60° ∘ 60° = 120°,
and also (240°)^[2] = 240° ∘ 240° = 120°, while 60° and 240° are not equal
angles. All the other postulates are satisfied.
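Example (e) is, in modern terms, addition modulo 360°. A minimal sketch, assuming angles are given as whole numbers of degrees (the names `combine` and `power` are illustrative):

```python
def combine(a, b):
    """Example (e): the angle in the first revolution congruent to a + b (degrees)."""
    return (a + b) % 360


def power(a, n):
    """a^[n] = a o a o ... o a, to n elements."""
    result = a
    for _ in range(n - 1):
        result = combine(result, a)
    return result


# Two unequal angles whose [2]-powers coincide, so that postulate (e) fails.
same_square = power(60, 2) == power(240, 2)

# Postulate (g) still holds: every angle has an opposite within the first revolution.
has_opposite = all(combine(a, (360 - a) % 360) == 0 for a in range(360))
```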
EXAMPLE (f). Class: all positive distances; rule of combination: A ∘ B = the hypotenuse of a right triangle of which
A and B are the legs.
Here (f) is not satisfied, since the hypotenuse of a right triangle is
never equal to a leg. All the other postulates are satisfied, postulate
(g) "vacuously."
EXAMPLE (g). Class: all positive angles and the zero angle;
rule of combination: ∘ = +.
This system clearly does not satisfy postulate (g), since if A = 10°,
for example, the opposite of A is not in the class. All the other postulates
are satisfied.
EXAMPLE (h). Class: all integral numbers, positive, negative, and zero; rule of combination: ∘ = +, where + means
the ordinary "+" of arithmetic.


* Congruent angles are those that differ only by a multiple of 360°.




This system fails on postulate (h), since, for example, there is no
integral number y such that y + y + y = 5. It clearly satisfies all the
other postulates.
18. Independence of the postulates. These examples enable
us to answer a second question concerning the set of postulates
(a)-(h) in sec. 14. We have already inquired whether these
postulates are consistent (sec. 15); we may now ask, Are they
independent? That is, are none of them merely consequences
of the rest? Or, in other words, is the set of postulates free
from redundancies?
The examples just cited show that in this case the postulates are all independent; for, if postulate (a), for example, were
a consequence of the other seven postulates, then every system
which satisfied the other seven would also satisfy (a); but
this is not the case, as is shown by the example cited; therefore postulate (a) is not a consequence of the other seven postulates. Similarly, each one of the eight postulates is shown to
be independent of the rest.
In this connection it may be noticed that the postulates (a)-(h) in
sec. 14 are often simpler statements than the propositions (a)-(h) in
sec. 6 or sec. 10. For example, (f) in sec. 6 is really a double statement: (1) there is at least one angle x such that x+x=x, and (2) there
is not more than one such angle; in sec. 14 we see that only the first
part of this statement need be assumed as postulate (f), since the second
part of the statement is a consequence of (a), (b), (c), and (d).
The problem of reducing every postulate to its simplest form is one
of the most fascinating problems in this kind of work; if we "weaken"
the statement too much, we shall not be able to deduce what we wish to
from it; while if we do not weaken it enough, we shall have difficulty in
proving it independent. It would, of course, not be desirable to carry
this reduction too far in elementary teaching; for the farther back we
drive our postulates, the longer is the logical journey we must travel
in deducing from these postulates the later and more interesting propositions of the science.
19. On the sufficiency of the postulates to determine a single
type of system. We turn, finally, to a third question concerning the postulates (a)-(h) in sec. 14. We have been dealing
with systems consisting of a class, say K, and a rule of combination, ∘; and among these systems (K, ∘) we have found
some that satisfy the conditions laid down in this set of postulates, and some that do not. Now the question to be raised
is this: Are all the systems (K, ∘) that satisfy these postulates
essentially of the same type? By systems of the same type we
mean systems which are "isomorphic" with respect to the
class K and the rule of combination ∘; two systems (K, ∘)
and (K', ∘') being called isomorphic if the elements of the class
K can be paired off (put into "one-to-one correspondence" with)
the elements of the class K' in such a way that whenever A
and B in the class K correspond to A' and B' in the class K',
then A ∘ B in K will correspond to A' ∘' B' in K'.
As an example, we have the system of angles, with the rule of combination addition (sec. 5), and the system of distances, with the rule of
combination multiplication (sec. 9); these two systems are isomorphic;
for, if we take any angle α, not 0, and any distance a, not 1, and pair
off the angles with the distances in the manner suggested by the following scheme:
..., -3α, -2α, -α, 0, (1/2)α, α, (3/2)α, 2α, 3α, 4α, ...
..., a^-3, a^-2, a^-1, 1, a^(1/2), a, a^(3/2), a^2, a^3, a^4, ...
then the conditions for isomorphism are easily seen to be satisfied.*
These two systems are therefore of the same type.
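The pairing suggested by this scheme sends the angle tα to the distance a^t. The following sketch assumes angles measured in degrees and distances in units, both held as floating-point numbers, so that equality is tested only up to rounding; the name `make_pairing` is illustrative and not from the text.

```python
import math


def make_pairing(alpha, a):
    """Return the map sending the angle t*alpha to the distance a**t."""
    if alpha == 0 or a <= 0 or a == 1:
        raise ValueError("need an angle alpha other than 0 and a distance a other than 1")

    def to_distance(theta):
        return a ** (theta / alpha)

    return to_distance


# Assumed choices: alpha = 30 degrees is paired with the distance a = 2.
pair = make_pairing(alpha=30.0, a=2.0)

# Isomorphism condition: the sum of two angles is paired with
# the product of the two corresponding distances.
cases = [(30.0, 60.0), (-45.0, 15.0), (7.5, 0.0), (-90.0, 90.0)]
sums_go_to_products = all(
    math.isclose(pair(x + y), pair(x) * pair(y)) for x, y in cases
)
```

Any other admissible choice of `alpha` and `a` gives a different pairing with the same property.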
It is easy, however, to find systems that satisfy all the postulates
(a)-(h) and are not isomorphic with either of the systems just considered. For example, consider the system in which the class K is the
class of all "rational" angles (that is, the class of all angles expressible
in the form ±(m/n)·1°, where m and n are positive integers), and in which
the rule of combination ∘ is the ordinary rule of addition. This
system, like the system of all angles considered above, satisfies all the
postulates, as is readily verified; but the two systems are not isomorphic;
for if we attempted to set up an isomorphism between them, we should
necessarily pair off first the zeros of the two systems together, and then
the rational fractions of 1° in one system with the rational fractions
of some angle α in the other; whereupon the one system would be


* Incidentally we notice that this isomorphism may be set up in an
infinite number of ways, since the angle α and the distance a may be chosen
at pleasure.




already exhausted, while the other would still contain an infinite number
of unpaired elements (compare sec. 28).
The answer to our third question is therefore, in this case,
in the negative; not all the systems that satisfy the postulates
(a)-(h) of sec. 14 are of the same type. To distinguish
between the various types of systems that satisfy these postulates, further conditions would have to be added.
These facts may be expressed by saying that the postulates
in question, while they are consistent (sec. 15), and independent
(sec. 18), are not "sufficient," that is, not sufficient to determine
any single type of system (K, ∘).*
20. Note on the terms "axiom" and "postulate." We
have now introduced to the reader, in connection with the
very simple systems studied in Part II, all the fundamental
ideas which we shall need to use in the main part of the
article. Before leaving this preliminary work, however, it may
be well to say a word in regard to a disputed question of terminology; namely, the question of the proper use of the term
" axiom."
Some authors, particularly in Germany, have called any
* The postulates for general algebra, which are given below in sec. 30,
will be found to have all three of the properties of consistency, independence,
and sufficiency. A "sufficient" set of postulates is also called "categorical,"
this term having been introduced by Veblen in 1904. (O. Veblen, A System of
Axioms for Geometry, Trans. Am. Math. Soc., Vol. V, p. 346.) The term
"sufficient" was first used by E. V. Huntington in 1902 (A Complete Set of
Postulates for the Theory of Absolute Continuous Magnitude, Trans. Am.
Math. Soc., Vol. III, p. 264). For a criticism of these terms, see L. Couturat,
Les Principes des Mathématiques, p. 169.
The earliest example of a "sufficient" or "categorical" set of postulates
is a set of five postulates for the algebra of positive integers given by G.
Peano in 1889. (See Bull. Am. Math. Soc., ser. 2, Vol. IX, p. 41, 1902.)
In this connection compare also two papers by A. Padoa, (1) Essai d'une
théorie algébrique des nombres entiers, précédé d'une introduction logique
à une théorie déductive quelconque; Bibliothèque du Congrès international
de Philosophie, Paris, 1900, Vol. III, pp. 309-365; and (2) Un nouveau système irréductible des postulats pour l'algèbre, Compte rendu du deuxième
Congrès international des Mathématiciens, Paris, 1900, pp. 249-256; and a
short note by D. Hilbert, Ueber den Zahlbegriff, Jahresber. der deutschen
Mathematiker-Vereinigung, Vol. VIII, 1900, pp. 157-168.




set of conditions adopted as the basis of an abstract science, like
the conditions (a)-(h) of sec. 14, a set of axioms for that science.
In the opinion of the present writer, however, the term axiom
should be applied only to statements of fact, like the propositions of sec. 6 or sec. 10, never to statements of conditions
to be satisfied, like the propositions of sec. 14.
The propositions of sec. 6 or sec. 10 are properly called
axioms, because they are obviously true statements about certain definite operations on angles or distances. The propositions of sec. 14, on the other hand, are of quite different
character. We have called them "postulates," from the Latin
postulo, because they are "demands" or conditions which a
given system may or may not happen to satisfy. They are
logically analogous to demands or conditions set up in other
fields of activity; for example, just as any man who satisfies
the conditions set up for admission to the army is entitled to
belong to that particular class of men, so any system (K, ∘)
that satisfies the conditions set up in sec. 14 is entitled to belong
to a certain class of systems. No one would think of calling
the conditions for admission to the army "axioms"; and there is
no more reason for calling the conditions of sec. 14 by that name.
Indeed if the word " axiom " is preserved in its well-established meaning, the recognition of the distinction between axiom
and postulate, if properly understood, may well serve to mark
the transition from the older to the more modern point of
view in regard to the nature of abstract mathematical reasoning.*
In regard to the term "postulate," there seems to be little choice
between "postulate," "assumption," "primitive proposition," all of
which are in good use. Strictly speaking, these postulates, and all the
theorems deducible from them, are not propositions at all, but rather
what Bertrand Russell † has called "propositional functions," which
become propositions (true or false) only after particular values are
assigned to the variable symbols K and ∘.
* Compare M. Bocher's St. Louis Address, 1904, Bull. Amer. Math. Soc., Vol.
XI, pp. 115-135, especially the first footnote on p. 129. Also J. W. A.
Young, The Teaching of Mathematics, pp. 193-201.
† The Principles of Mathematics, Vol. I, 1903; or L. Couturat, Les Principes
des Mathématiques.




IV. GEOMETRICAL EXAMPLE OF THE ALGEBRA OF COMPLEX QUANTITIES: THE SYSTEM OF POINTS IN A PLANE
21. Points in a plane; "real" and "imaginary" points.*
As a first concrete example of the algebra of complex quantities,
consider the class of all points that lie in a plane in which two
points, O and U, are fixed.
These points are divided into "real" points and "imaginary" points. The "real" points are all the points that lie
on the line through O and U, this line being called the axis
of reals; the "imaginary" points are all the remaining points
of the plane.† A "real" point is called positive or negative
according as it lies on that half of the real axis which contains
U, or on the other half. The point O itself is called the zero
point (see below) and is neither positive nor negative.
An imaginary point is called a pure imaginary if it lies on a
line through O perpendicular to the axis of reals.
The position of any point a in the plane is determined when
we know: (1) the distance of a from
O (the distance OU being taken as
the unit of measurement); and (2)
the angle which the line Oa makes
with the axis OU. Two points are
"equal," that is, coincident, when
and only when their "distances"
are equal and their "angles" equal
or congruent.
[FIG. 5.]
The notation (5, ∠120°), for example, is used to denote a point whose "distance" is 5 times
OU, and whose "angle" is 120°.
* The system of points in the plane was first studied by C. Wessel, in
1799, and by Argand, in 1806.
† The terms "real" and "imaginary" are unfortunate legacies from the
eighteenth century, which have become firmly fixed in mathematical literature; the so-called imaginary points are of course no more imaginary, in
the ordinary sense of the word, than any other points of the plane.




All the points whose distances equal OU are called points
on the "unit circle."
Among these points in the plane, we now proceed to define
certain rules of combination which we shall call "addition"
and "multiplication."
22. Addition of points in the plane. If two points a and b
are given, a third point x may be derived from them by the
following process: Starting from O, perform the journey from
O to a; then continuing from a, perform a journey equal in
length and direction to the journey from O to b; the point
finally reached is the required point x.*
The point x thus determined is called the sum of the given
points a and b (with respect to the chosen point O), and is
denoted by a + b.
The + sign here used must of course not be confused with the +
sign of arithmetic, because the a and b here denote not numbers, but
points.
23. Concerning the addition of points in the plane, as thus
defined, the reader may easily verify the following statements:
(1) If a and b are any two points (equal or unequal) then
their sum, a + b, is a point, uniquely determined by a and b; and
if a and b are "real" points, then a + b is also "real."
(2) a + b = b + a. (Commutative law for addition.)
(3) (a + b) + c = a + (b + c). (Associative law for addition.)
These facts will be clear from the accompanying figures.
[FIG. 6 and FIG. 7.]
(4) If a + b = a + b', then b = b'. ("Law of cancellation" for
addition.)


* In the cases in which a and b are not in line with O, the point x may
also be described as the fourth vertex of a parallelogram whose sides are
Oa and Ob.




(5) If na = nb, then a = b. Here n is any positive integer,
and na means a + a + ... + a, to n terms. The point na is called
the nth multiple of a.
If a is not 0, the series of points a, 2a,
3a, ... will lie beyond a on the straight line
through O and a. Obviously, if a is a real
point, na will also be real, and positive
or negative, according as a is positive or
negative.
[FIG. 8.]


(6) There is a unique point z such that z + z = z. This point
z is called the zero point of the system, and is denoted by 0.
This point 0 is the point O of the figure. Obviously, from the definition, if a is any point, a + 0 = a.
(7) Every point a determines uniquely a point a' such that
a + a' = 0. This point a' is called the opposite of a, and is
denoted by -a; and if a is any real point,
-a will also be real.
The point -a is the point symmetrical to a with
respect to O.
[FIG. 9.]
(8) If a point a and a positive integer n are
given, there is always a point x such that nx = a. This point x
is called the nth submultiple of a, and is denoted by a/n; and if
a is real, a/n will also be real.
[FIG. 10.]
If a is not 0, the series of points a, a/2, a/3, ... will lie on the
straight line between a and O, the series becoming more and more
crowded as it approaches the point O. Obviously, if a = 0, 0/n = 0.
Further, if m and n are any positive integers, m(a/n) = (ma)/n; this
point is denoted by (m/n)a.
(9) If a and b are any two points, there is always a point x
such that a = b + x. This point x, which is uniquely determined




by a and b, is called the remainder, a minus b, and is denoted
by a - b; and if a and b are real, then a - b is also real.
To construct this point a - b, notice that it is the same as a + (-b),
as is evident from the figure; hence, to subtract a point b means to add
the opposite of b.
All these statements concerning the addition of points are
exactly analogous to the statements in sec. 6 and sec. 7 concerning the addition of angles.
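The statements (1)-(9) may be tried out on particular points. The sketch below rests on an assumption not made in the text: each point is represented by a pair of exact rational coordinates, with O at (0, 0) and the axis of reals along the first coordinate, so that the "journey" of sec. 22 becomes componentwise addition. All function names are illustrative.

```python
from fractions import Fraction

ZERO = (Fraction(0), Fraction(0))   # the zero point, placed at O


def add(a, b):
    """Journey from O to a, followed by a journey equal to that from O to b."""
    return (a[0] + b[0], a[1] + b[1])


def opposite(a):
    """The point symmetrical to a with respect to O."""
    return (-a[0], -a[1])


def multiple(n, a):
    """na = a + a + ... + a, to n terms."""
    result = a
    for _ in range(n - 1):
        result = add(result, a)
    return result


def submultiple(a, n):
    """The point a/n, whose nth multiple is a."""
    return (a[0] / n, a[1] / n)


def subtract(a, b):
    """a - b, constructed as a + (-b)."""
    return add(a, opposite(b))


a = (Fraction(3), Fraction(1))
b = (Fraction(-1), Fraction(2))
c = (Fraction(5), Fraction(0))      # a "real" point, lying on the axis of reals
```

For instance, `add(b, subtract(a, b))` returns `a` again, in agreement with statement (9).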
24. Multiplication of points in the plane. We now define a
second operation upon these points.
If a and b are any two points in the plane, a third point
x may be derived from them by the following process: find the "angle" of x by taking
the sum of the angles of a and b, as defined in
sec. 5; find the "distance" of x, by taking the
product of the distances of a and b, as defined
in sec. 9. The point x thus determined is called
the product of the given points a and b (with
respect to the fixed points O and U) and is
denoted by a × b, or a · b, or simply ab.
[FIG. 11.]
For example, if a = (2, ∠10°) and b = (3, ∠15°), then a × b = (6, ∠25°).
Here again the × sign must not be confused with the × of arithmetic,
since the letters a and b here denote, not numbers, but points in the
plane.
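The rule of sec. 24 translates directly into a short routine. In this sketch a point is assumed to be stored as a pair (distance, angle in degrees), the angle being reduced to the first revolution; the comparison with Python's built-in complex numbers is an added illustration, not part of the text.

```python
import cmath
import math


def multiply(a, b):
    """Product of two points, each given as (distance, angle in degrees)."""
    distance = a[0] * b[0]              # product of the distances (sec. 9)
    angle = (a[1] + b[1]) % 360         # sum of the angles (sec. 5), up to congruence
    return (distance, angle)


def to_complex(p):
    """The same point as a Python complex number, with O at 0 and U at 1."""
    return cmath.rect(p[0], math.radians(p[1]))


a = (2, 10)
b = (3, 15)
product_ab = multiply(a, b)             # the worked example of the text

# The rule agrees with ordinary multiplication of complex numbers.
agrees = cmath.isclose(to_complex(product_ab), to_complex(a) * to_complex(b))

# Two negative real points (angle 180 degrees) have a positive real product.
negative_times_negative = multiply((2, 180), (3, 180))
```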
25. Concerning the multiplication of points in the plane, as
thus defined, the following statements hold true:
(10) If a and b are any two points (equal or unequal) then
their product a × b is a point uniquely determined by a and b;
and if a and b are real, then a × b is also real.
In particular, if a and b are both positive, or both negative, a × b
will be positive; but if one factor is positive and the other negative,
then the product, as obtained by the rule, will be negative.
(11) a × b = b × a. (Commutative law for multiplication.)
(12) (a × b) × c = a × (b × c). (Associative law for multiplication.)




The truth of these statements (11) and (12) is evident from the fact
that the addition of angles and the multiplication of distances are
themselves commutative and associative (secs. 6, 10).
(13) If a × b = a × b', and a is not 0, then b = b'. ("Restricted
law of cancellation" for multiplication.)
(14) a × (b + c) = (a × b) + (a × c). (Distributive law of multiplication with respect to addition.)
[FIG. 12 and FIG. 13.]
To see that this distributive law holds, let each of the points b, c,
and b+c be multiplied by a, as in Fig. 12; it is required to show that
the point a(b+c) is the sum of the points ab and ac. To show this,
place the quadrilateral O, ab, ac, a(b+c), together with the parallelogram O, b, c, (b+c), in a plane perpendicular to the line OU, in the
manner shown in Fig. 13, and lay off the distance Oa along that line.
By the definition of the multiplication of distances, the lines U-c and
a-ac, in Fig. 13, are parallel, as are also the lines U-(b +c) and a-a(b+c);
therefore the planes a-ac-a(b + c) and U-c-(b + c) are parallel, and hence the
lines ac-a(b+c) and c-(b+c), in which these planes intersect the given
plane, are parallel. Hence ac-a(b+c) is parallel to O-ab; and similarly,
ab-a(b+c) is parallel to O-ac. Therefore the quadrilateral in question
is a parallelogram, and the point a(b+c) is the sum of the points ab
and ac, as required.*
(15) There is a unique point u, distinct from 0, such that
u × u = u; this point u is called the unit point of the system, and
is denoted by 1.


* The truth of the distributive law may also be inferred directly from
Fig. 12, from the properties of similar triangles; but the proof given above
has the advantage of not involving the theory of ratio and proportion, or
the "incommensurable case."




This point u is the point U of the figure; that is, the point (1, ∠0°).
Obviously, from the definition of multiplication, if a is any point, a × 1 = a.
The successive multiples of the point 1 [sec. 23, (5)] are
denoted, for brevity, as follows: 1+1=2(1) =2; 1 +1+1=3(1)
=3; etc.
-3 -2   -1   0   1   2   3   4
FIG. 14.
(16) Every point a, provided a is not 0, determines uniquely
a point a' such that a × a' = 1, where 1 is the unit point. This
point a' is called the reciprocal of a, and is denoted by $a^{-1}$ or
1/a. If a is a real point (not 0) then its reciprocal will also be real.
To construct the point 1/a, notice that its
angle is the opposite of the angle of a (sec. 6),
while its distance is the reciprocal of the distance
of a (sec. 10). If a is a point on the "unit
circle" (sec. 21), then 1/a will also be on the
unit circle; while if a is inside the circle, 1/a
will be outside, and the nearer a approaches
the point 0, the farther off will 1/a recede.
FIG. 15.
(17) If a and b are any points, and
b not 0, then there is always a point x
such that a = b × x. This point x, which is uniquely determined
by a and b, is called the quotient, a divided by b, and is denoted
by a/b. Moreover, if a and b are real (and b not 0) then a/b
will also be real.
To construct this point a/b, notice that its angle must be the angle
of a minus the angle of b (sec. 7), while its distance must be the distance
of a divided by the distance of b (sec. 11).
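The two constructions just described (reciprocal and quotient) can be sketched in code. This is an illustration only, assuming points are modeled as Python complex numbers; the function names `reciprocal` and `quotient` are hypothetical helpers, not notation from the text.

```python
import cmath
import math

def reciprocal(p):
    """Reciprocal by the construction in (16): negate the angle, invert the distance."""
    r, theta = cmath.polar(p)
    if r == 0:
        raise ZeroDivisionError("the point 0 has no reciprocal")
    return cmath.rect(1 / r, -theta)

def quotient(a, b):
    """Quotient a/b by the construction in (17): subtract angles, divide distances."""
    ra, ta = cmath.polar(a)
    rb, tb = cmath.polar(b)
    if rb == 0:
        raise ZeroDivisionError("division by the point 0 is excluded")
    return cmath.rect(ra / rb, ta - tb)

a = cmath.rect(2, math.radians(100))
b = cmath.rect(4, math.radians(30))
print(abs(a * reciprocal(a) - 1) < 1e-12)   # a × (1/a) should be the unit point
print(abs(b * quotient(a, b) - a) < 1e-12)  # b × (a/b) should give back a
```

A point inside the unit circle, such as one at distance 1/2, is sent by `reciprocal` to a point at distance 2, in line with the remark that 1/a recedes as a approaches 0.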
In particular, 1/1 = 1, and (m1)/(n1) = (m/n)1, where m
and n are any positive integers [sec. 23, (8)]. Hence, if we
denote the multiples m1 and n1 of the point 1 by m and n [sec. 25, (15)] then m/n = (m/n)1.
For example, 2/3 = (2/3)1. Notice here that on the left-hand side 2 and 3 are points,
whose quotient must be found by the rules for the division of
points, while on the right-hand side 2 and 3 are numbers, indicating how often a
certain operation is to be repeated.
FIG. 16.
(18) If a is any point, and 0 is the zero point, then a × 0 = 0;
and if a product a × b = 0, then at least one of the factors a and b
must be 0.
In view of these propositions, (1)-(18), we notice in passing
that the system of points in the plane is a system in which addition,
subtraction, multiplication, and division (except division by zero) are
always possible; and the same is true of the system composed
of the "real" points alone.
(19) The notation $a^n$, where n is any positive integer, means
a × a × a × ... × a to n factors; and the point $a^n$ is called
the nth power of the point a.
In particular, $a^2$ is called the square and $a^3$ the cube of the
point a. Obviously, from the definition of multiplication,
$1^n = 1$, and $0^n = 0$.
FIG. 17.
To construct the point $a^n$, notice that the angle of $a^n$ is the nth
multiple of the angle of a (sec. 6); while the distance of $a^n$ is the nth
power of the distance of a (sec. 10). If the point a lies on the unit
circle, then $a^n$ will also lie on this circle; if the point a lies outside
the circle, then the series of powers, $a, a^2, a^3, \ldots$ will lie outside
the circle, on a spiral curve which recedes farther and farther from
it; if the point a lies inside the circle, the series $a, a^2, a^3, \ldots$ will
lie inside the circle, on a spiral which again recedes farther and
farther from the circle, coiling up around the point 0.
Of special interest are the powers of i,
where i denotes the point (1, ∠90°). Referring to the figure, and applying the rule
for the multiplication of points, we see that
the successive powers of the point i repeat
in cycles of four:
$i^1 = i$, $i^2 = -1$, $i^3 = -i$, $i^4 = 1$,
$i^5 = i$, $i^6 = -1$, $i^7 = -i$, $i^8 = 1$, etc.
FIG. 18.
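The cycle of four can be reproduced with a short computation. This is a sketch assuming the point i is modeled by the Python complex number `complex(0, 1)`; the variable names are illustrative.

```python
i = complex(0, 1)  # stands for the point at distance 1 and angle 90 degrees

# Collect the powers i^1, i^2, ..., i^8 by repeated multiplication.
# Products of complex numbers with small integer parts are exact in
# floating point, so the results can be compared directly.
powers = []
p = complex(1, 0)
for _ in range(8):
    p = p * i
    powers.append(p)

expected_cycle = [i, -1, -i, 1]
print(all(powers[n] == expected_cycle[n % 4] for n in range(8)))

# Multiplying any point by i rotates it through a right angle about the origin.
print((3 + 2j) * i)
```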




A similar fact is true of the point -i, that is, the point (1, ∠270°).
Hence,
(20) There are two points x such that $x^2 = -1$, where -1 is
the opposite of the unit point 1. These two points are called the
imaginary units of the system, and are denoted by i and -i.
It will be noticed that multiplying any point by i has the effect of
rotating the point through 90° about O; while multiplying it by -1
rotates it through twice that angle, or 180°.
(21) If a is any point, there are always two "real" points,
x and y, such that a = x + iy, where i
is one of the "imaginary units."
To see this, we have only to observe,
first, that any "pure imaginary" point
(sec. 21) can be expressed in the form
iy, where y is some real point, and,
second, that any point a can be expressed as the sum of a real point and a pure imaginary.
FIG. 19.
26. Solution of algebraic equations. Suppose now that any
point a and any positive integer n are given; and let us ask,
Is there any point x such that $x^n = a$? An inspection of the
figure will show:
(22) If n is any positive integer, and a is any point not 0,
there will be n distinct points x such that $x^n = a$; each of these
points is called an nth root of a.
Thus, every point a, except 0, has two square roots, three
cube roots, four fourth roots, and so on.*


FIG. 20.                       FIG. 21.


* It will be noticed that the proposition: If $a^n = b^n$, then a = b, which we
found to be true when a and b represented distances (sec. 10), is not true
when a and b represent points in the plane.




To construct these points, notice first that if a = 1, the nth roots
of 1 are points on the "unit circle," and divide that circle into n equal
parts, beginning with the point 1; for, any one of these points, when
raised to the nth power according to the rule, will produce the point 1.
In general, the nth roots of any point a will lie on a circle whose radius
is the nth root of the "distance" of a (sec. 10), and will divide this
circle into n equal parts, beginning with the point whose "angle" is
the nth submultiple of the angle of a (sec. 6). If any one of the nth roots is given,
the rest can be obtained from it by multiplying by the nth roots of
the point 1. The notation $a^{1/n}$, or $\sqrt[n]{a}$, is used to denote that particular
nth root of a which has the smallest (positive) angle. Thus, $i = \sqrt{-1}$,
and $-i = -\sqrt{-1}$.
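The construction of the nth roots can be sketched as follows. This is an illustration under the assumption that points are modeled by Python complex numbers; `nth_roots` is a hypothetical helper name, and the angle of a is taken in the range from 0 up to (but not including) a full turn so that the first root returned has the smallest positive angle.

```python
import cmath
import math

def nth_roots(a, n):
    """All n distinct nth roots of a nonzero point a, smallest angle first."""
    if a == 0:
        raise ValueError("the construction assumes a is not 0")
    r, theta = cmath.polar(a)          # theta lies in (-pi, pi]
    theta %= 2 * math.pi               # measure the angle as a positive rotation
    root_r = r ** (1.0 / n)            # nth root of the distance
    # Divide the circle of radius root_r into n equal parts,
    # starting from the nth submultiple of the angle of a.
    return [cmath.rect(root_r, (theta + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-8, 3)
for x in roots:
    print(x, abs(x ** 3 - (-8)) < 1e-9)
```

Here the three cube roots of the point -8 lie on the circle of radius 2, at angles of 60, 180, and 300 degrees.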
If we confine ourselves to real points, the statement of the
situation is more complicated. Thus, if a is real, and n is an
odd number, one of the nth roots of a will be real, and will be
positive or negative according as a is positive or negative. If
a is a positive real, and n is even, two of the nth roots of a will
be real, one positive and one negative; but if a is a negative
real, and n is even, none of the nth roots of a will be real.
More generally, suppose we have any algebraic equation of
the nth degree in x, that is, any equation of the form
$p_0 x^n + p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n = 0$,
where n is a given positive integer, and $p_0, p_1, p_2, \ldots, p_n$ are
any given points, provided $p_0$ is not zero; and let us inquire
whether there is any value of x which will satisfy this equation.
If there is such a point x, it is called a root of the equation.
The facts in the case are these:
(23) Every algebraic equation of the nth degree:
$p_0 x^n + p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n = 0$
can be written as the product of n linear factors:
$p_0 (x - x_1)(x - x_2) \cdots (x - x_n) = 0$,
where the points $x_1, x_2, \ldots, x_n$ are fixed points depending on the
coefficients $p_0, p_1, \ldots, p_n$; each of these points $x_1, x_2, \ldots, x_n$ is a
root of the equation, and there are no other roots.*
* Since the n factors $x - x_1$, $x - x_2$, ..., $x - x_n$ are not necessarily distinct




The fact thus stated may be directly verified in the case of
equations of the first, second, third, and fourth degrees (called
linear, quadratic, cubic, and biquadratic equations, respectively).
For example, the linear equation ax + b = 0 (a not zero), has the
root x = -b/a; and the quadratic equation $ax^2 + bx + c = 0$ has the
roots $x_1 = (-b + \sqrt{b^2 - 4ac})/(2a)$ and $x_2 = (-b - \sqrt{b^2 - 4ac})/(2a)$; and
similar solutions can be obtained for equations of the third and fourth
degrees.*
The proof for the general case of an equation of the nth
degree is more complicated, and will be given in Appendix II.
It is important to notice that the fact just stated concerning the number of roots of an algebraic equation (or, what
comes to the same thing, the number of linear factors) is true
only when we take into consideration all the points of the
plane. If we confined ourselves to the points on the real axis,
the corresponding statement would be much more complicated.
For example, the statement that "every quadratic equation
$ax^2 + bx + c = 0$ has two (real or coincident) roots" is a true statement
only when we are dealing with the complete system of all the points
in the plane (or with some equivalent system). If we are dealing with
the real points alone, we must say: "a quadratic equation $ax^2 + bx + c = 0$
has two roots, or one root, or no root, according as $b^2 - 4ac$ is positive, zero,
or negative." To state, as is often done, that in case $b^2 - 4ac$ is negative,
the two roots still "exist" but have now "become imaginary" is
thoroughly mischievous. The simple fact is, that if we are dealing with
the real points alone, and $b^2 - 4ac$ is negative, then there is no real
point such that $ax^2 + bx + c = 0$. No juggling with words will alter
this fact; and no talk of "imaginary points" can possibly have any
definite meaning for the student until he has become acquainted with
some actual system in which such points occur.
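The contrast between the two systems can be made concrete. The sketch below assumes points of the plane are modeled by Python complex numbers and real points by ordinary real numbers; the function names `quadratic_roots` and `real_root_count` are illustrative.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0 among all points of the plane (a not 0)."""
    if a == 0:
        raise ValueError("a must not be zero")
    s = cmath.sqrt(b * b - 4 * a * c)
    return (-b + s) / (2 * a), (-b - s) / (2 * a)

def real_root_count(a, b, c):
    """Number of distinct real roots, decided by the sign of b^2 - 4ac (real a, b, c)."""
    d = b * b - 4 * a * c
    return 2 if d > 0 else 1 if d == 0 else 0

print(quadratic_roots(1, 0, 1))   # x^2 + 1 = 0 considered in the whole plane
print(real_root_count(1, 0, 1))   # the same equation considered on the real axis only
```

In the first call the equation is solved in the complete system; in the second, the count follows the three-way statement for real points alone.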
We now turn to a third property of the system of points
in the plane, namely, the relation of order among the points
on the axis of reals.


1 to n, inclusive; for the sake of brevity, however, it is customary and convenient to say that an equation of the nth degree always has n roots, understanding that in special cases some or all of these roots may be coincident.
* See Monograph V.




27. The relation of order among the real points. From the
description of the so-called "real" points in sec. 21, it is obvious,
in the first place, that
(24) The real points form a subclass within the class of all
the points of the plane. In particular, the points 0 and 1 are
real points.
Within this subclass of real points, if the point a precedes
the point b as we progress along the axis of reals in the direction
OU, then we write
a < b
(read: "a algebraically less than b," or, better, "a precedes
b"). The same situation may also be expressed by writing
b > a (read: "b algebraically greater than a," or "b follows a").
Concerning this relation of serial order among the points
on the axis of reals, the following statements are evident:*
(25) If a and b are real points, and a not equal to b, then
either a < b or b < a.
(26) If a < b, then a is not equal to b.
(27) If a < b and b < c, then a < c. (Law of transitivity.)
(28) If a, x, and y are real points, and x < y, then a + x < a + y.
(29) If a > 0 and b > 0, then a × b > 0.
If a > 0, then a is a positive real point, and if a < 0, a negative
real point (sec. 21). Hence the statement just made can be expressed
by saying that if a and b are positive, their product a × b will also be
positive.
(30) If a < b, there are always real points x such that a < x
and x < b. Such points x are said to lie between the points a
and b.
A further fact, which is not so obvious, but which may be
accepted as a geometric axiom, is the following:
(31) (Dedekind's principle.) If M is any (non-empty) subclass of real points, and if all the points of M precede a given
point c, then there will be a uniquely determined point x, called
the upper limit of M, having the following properties:
First, every point in M precedes, or at most equals, x;
Second, if x' is any real point such that x' < x, then there is
at least one point of M that follows x'.

* For a detailed elementary study of the relation of serial order, see
E. V. Huntington, The Continuum as a Type of Order, reprinted from the
Annals of Mathematics, 1905 (Publication Office of Harvard University).

In other words, if a subclass of
real points has any "upper bound,"
it will have a "least upper bound,"
or "upper limit."
FIG. 22.
Similarly, a subclass of real points
that has any lower bound, will have a "greatest lower bound,"
or "lower limit."
This fact is of great importance in connection with the
so-called irrational points, as explained in the next section.
Finally, we have what is known as the Principle of Archimedes:
(32) If a and b are any positive points, and a is "less than"
b, it is always possible to find some multiple of a which is "greater
than" b.
This fact is of great importance in the theory of measurement.
28. Classification of real points. Among the real points
the points 1, 2, 3,... [sec. 25, (15)] are called the positive integral
points, and the points -1, -2, -3,... the negative integral
points; all these, together with the point 0, form the subclass
of "all integral points."
All real points which can be expressed in the form ±m/n,
where m and n are any positive integral points [sec. 25, (17)],
together with the point 0, are called the rational points.
The rational points which are not integral are called fractional;
the fractional points lie between the integral points.
All real points which are not rational are called irrational.
That not all the real points are "rational" can be made clear by
the following familiar reasoning: Consider the diagonal, D, of a square
whose side is the unit distance, u; this length D cannot be expressed
as a rational fraction of u; for, if D = (m/n)u, where m and n are positive
integers, then, since the area of the square on D is equal to twice the
area of the square on u, we should have $m^2/n^2 = 2$, and this numerical relation cannot be satisfied by any integers m and n.* Hence, if we take a
point x on the real axis so that its distance from 0 is equal to D, then
this point x cannot be expressed in the form (m/n)1, and is therefore
an irrational point.
From sec. 27, (31) it is clear that every irrational point a
can be regarded as the limit of an infinite sequence of rational
points, $a_1, a_2, a_3, \ldots$ Of special importance are the sequences
of the form illustrated by the following example:
$a_0 = 6$; $a_1 = 4/10$; $a_2 = 0/10^2$; $a_3 = 3/10^3$; $a_4 = 1/10^4$; ...
where each numerator is one of the points 0, 1, 2, 3, ..., 9
[see sec. 25, (15)], and each denominator is a power of the point
10. A sequence of this form is called a decimal fraction, and
is denoted, for brevity, as follows (taking the same example):
6; 6.4; 6.40; 6.403; 6.4031; ...
These points are the first terms of the sequence that would be
obtained if we attempted to approximate toward the point $\sqrt{41}$ by a
sequence of rational points in the decimal form; in fact, the algorithm
for "extracting the square root" of the point 41 is exactly analogous
to the familiar algorithm for extracting the square root of the number
41 in arithmetic; but it should be clearly understood that when we
are dealing with the system of points, the point $\sqrt{41}$, like all the other
irrational or rational points, is already given, from the start, in the
system of points, while if we are dealing with the system of numbers,
and have developed that system as far as the rational numbers, there
is no rational number whose square is the number 41, and hence there
is no rational number which could be denoted by $\sqrt{41}$. Before we can
speak of the "number" $\sqrt{41}$ as the limit of a sequence of rational
numbers, we must first define what we mean by "irrational numbers";
that is, we must point out what the objects are that we agree to call
by that name, and how these objects can be "introduced" into our
"number system." The ingenious manner in which this "enlargement of the number concept" has been accomplished is explained in
Appendix I; but throughout the body of the article we are dealing
only with the geometrical system of points.

* For, if $m^2/n^2 = 2$, then $m^2 = 2n^2$; in this equation, the left-hand side
contains the factor 2 an even number of times, if at all, while the right-hand
side contains the factor 2 either once or some other odd number of times.
The equation is therefore impossible, since a whole number can be factored
in only one way.

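The decimal sequence 6; 6.4; 6.40; 6.403; 6.4031 can be reproduced with exact rational arithmetic. This is a minimal sketch, not the classical digit-by-digit root extraction: it assumes Python's `math.isqrt` (integer square root) and `fractions.Fraction`, and the function name `decimal_approximations` is illustrative.

```python
from fractions import Fraction
from math import isqrt

def decimal_approximations(n, places):
    """Rational values floor(sqrt(n) * 10^k) / 10^k for k = 0, 1, ..., places."""
    return [Fraction(isqrt(n * 10 ** (2 * k)), 10 ** k) for k in range(places + 1)]

approx = decimal_approximations(41, 4)
for k, q in enumerate(approx):
    # Show each rational approximation with k decimal places.
    print(f"{float(q):.{k}f}")

# Every term is rational and its square stays below 41.
print(all(q * q < 41 for q in approx))
```

Each term is a rational value whose square falls short of 41, which illustrates why no term of the sequence can itself be the square root.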
29. First step toward the science of this algebra. Selection of
axioms. These 32 propositions, in secs. 23-27, might well be
taken as a set of axioms for the science of algebra (compare
sec. 8); they are not all "simple statements," and they are
not all independent, as will be shown in the more rigorous
analysis given in Part V; but they are so chosen that all the
theorems which form the main body of the science can be
deduced from them without undue labor.
In particular, the question of the irrational and imaginary
quantities becomes, as we have just seen, not a question of
"introducing" newly devised elements into the system, but
merely a question of classification of elements that are already
known to exist in the given system.
V. THE ABSTRACT THEORY OF THE ALGEBRA OF
COMPLEX QUANTITIES
30. A complete set of postulates for the algebra of complex
quantities.* The system of points in the plane, studied in
Part IV, is the best known and most easily understood example
of the type of algebra called the algebra of complex quantities.
Other examples will be given in sec. 33, and in Appendix I.
We now proceed to analyze what is logically essential in this
system.
The fundamental notions of the system are: the class of
points in general; the class of "real " points; the operations
of addition and multiplication; and the relation of order.
Abstractly considered, therefore, the fundamental notions in
terms of which all the propositions of the algebra can be stated,
are the following:
(1) A class of elements, a, b, c, ..., which we may denote
by K;
(2) A class of elements which we may denote by C;
(3) A rule of combination, which we may denote by ⊕;
(4) A rule of combination, which we may denote by ⊙;
(5) A relation, which we may denote by ⧀.
Any system involving these fundamental notions we shall
speak of as a "system (K, C, ⊕, ⊙, ⧀)."

* The set of postulates here given is substantially the same as that first
published by the writer in Trans. Amer. Math. Soc., Vol. VI, 1905, pp. 209-229.

We now impose on these symbols the conditions expressed
in postulates 1-27, below; the object being to show that every
system (K, C, ⊕, ⊙, ⧀) which satisfies these twenty-seven
postulates is of the same "type" as the system of points in
the plane.
POSTULATE 1. If a and b are elements of K, then a⊕b is
an element of K, called the sum of the elements a and b.
POSTULATE 2. a⊕b = b⊕a.
POSTULATE 3. (a⊕b)⊕c = a⊕(b⊕c).
POSTULATE 4. If a⊕b = a⊕b', then b = b'.
POSTULATE 5. There is an element z in K such that z⊕z = z.
DEFINITION. If there is only one such element z, this unique
element is called the zero element of the system.
POSTULATE 6. For every element a in K there is an element
a' in K, such that a⊕a' = z, where z is the zero element.*
DEFINITION. If this element a' is uniquely determined by
a, it is called the opposite of a, and is denoted by -a.
Any system (K, ⊕) that satisfies these postulates 1-6 is called
an Abelian group with respect to the operation ⊕.†
POSTULATE 7. If a and b are elements of K, then a⊙b is an
element of K, called the product of the elements a and b.
POSTULATE 8. a⊙b = b⊙a.
POSTULATE 9. (a⊙b)⊙c = a⊙(b⊙c).
POSTULATE 10. If a⊙b = a⊙b', and a is not zero, then b = b'.
POSTULATE 11. a⊙(b⊕c) = (a⊙b)⊕(a⊙c).
* If there is no zero element in the system, postulate 6 becomes meaningless; it demands nothing. We say then that every system that contains no
zero element satisfies this postulate "vacuously." A similar remark applies
to several of the other postulates.
† For bibliographical references to definitions of "groups" and "fields,"
see Trans. Amer. Math. Soc., Vol. VI, 1905, p. 181.




POSTULATE 12. There is an element u in K, different from
zero, such that u⊙u = u.
DEFINITION. If there is only one such element u, this
unique element is called the unit element of the system.
POSTULATE 13. For every element a in K, provided a is
not zero, there is an element a' in K, such that a⊙a' = u,
where u is the unit element.
DEFINITION. If this element a' is uniquely determined by
a, it is called the reciprocal of a, and is denoted by 1/a, or $a^{-1}$
(provided a is not zero).
Any system (K, ⊕, ⊙) that satisfies these postulates 1-13 is
called a field with respect to the operations ⊕ and ⊙.*
The following postulates concern the class C and the relation ⧀:
POSTULATE 14. If a and b are elements of C, and a not
equal to b, then either a ⧀ b or else b ⧀ a.
POSTULATE 15. If a ⧀ b, then a is not equal to b.
POSTULATE 16. If a ⧀ b and b ⧀ c, then a ⧀ c.
These three postulates, 14-16, make the class C an "ordered"
class, with respect to the relation ⧀.
POSTULATE 17. (Dedekind's postulate.) If M is any (non-empty) subclass in C, and if there is an element c in C such that
a ⧀ c for every element a in M, then there is an element x in C
having the following properties with regard to the subclass M:
(1) if a belongs to M, then a ⧀ x, or at most, a = x; (2) if x'
is any element of C such that x' ⧀ x, then there is at least one
element a in M such that x' ⧀ a.
DEFINITION. If this element x is uniquely determined by
the subclass M, it is called the upper limit of M.
The following two postulates serve to connect the relation
⧀ with the operations ⊕ and ⊙.
POSTULATE 18. Within the class C, if x ⧀ y, then a⊕x ⧀ a⊕y.†
POSTULATE 19. Within the class C, if z ⧀ a and z ⧀ b, where
z is the zero element, then z ⧀ a⊙b.
* For bibliographical references to definitions of "groups" and "fields"
see Trans. Amer. Math. Soc., Vol. VI, 1905, p. 181.
† Provided a⊕x is not equal to a⊕y.




If, in these last six postulates, we replace "C" by "K," the
postulates 1-19, as thus altered, form a complete set of postulates
for the subalgebra of all real quantities (compare sec. 37).
The following postulates concern the class C and the operations ⊕ and ⊙:
POSTULATE 20. If a is an element of C, then a is an element
of K.
POSTULATE 21. The class C contains at least two elements.
POSTULATE 22. If a and b belong to C, and have a sum
a⊕b, then a⊕b also belongs to C.
POSTULATE 23. If a belongs to C, and has an opposite, -a,
then -a also belongs to C.
POSTULATE 24. If a and b belong to C, and have a product,
a⊙b, then a⊙b also belongs to C.
POSTULATE 25. If a belongs to C, and has a reciprocal, 1/a,
then 1/a also belongs to C.
These six postulates, 20-25, together with postulates 1-13, make
the class C, like the class K, a "field" with respect to ⊕ and ⊙.
POSTULATE 26. If K is a "field" there is an element j in
K such that j⊙j = -u, where -u is the opposite of the unit
element.
DEFINITION. If there are two (and only two) such elements,
j and -j, either of them may be called the "imaginary unit"
of the system.
POSTULATE 27. If K and C are "fields" and K contains an
"imaginary unit" j, then for every element a in K there are
elements x and y in C, such that x⊕(j⊙y) = a.
These postulates, 1-27, form a complete set of postulates for
the algebra of complex quantities.
From these twenty-seven postulates all the theorems of the
algebra of complex quantities can be deduced.
In particular, it is easily proved that every system that
satisfies these postulates will have a unique zero-element and a
unique unit-element; also, every element a will determine a
unique opposite, -a, and (except when a is zero) a unique
reciprocal, 1/a; the pair of imaginary units j and -j is uniquely
determined; and every subclass of the kind described in Dedekind's postulate will have a unique upper limit.
To avoid any possible misunderstanding, it may be well to state
again that these postulates are not by any means intended for use in
elementary instruction. Such a set of postulates exhibits, in skeleton
form, the logical structure of a particular type of algebra; but an
interest in the logical structure of a science naturally does not arise in
a student's mind until the facts of that science have long been familiar
to him.
It must not be supposed, moreover, that the set of postulates here
given is the only possible set of postulates for the algebra in question; or
that the fundamental notions here mentioned are the ones that are
necessarily adopted. On the contrary, a wide range of choice is possible;
but any set of symbols selected as the fundamental notions for the
algebra must be definable in terms of the fundamental notions here given,
and any set of postulates selected as the fundamental propositions
of the algebra must be deducible from the postulates here given.*
In the actual development of the algebra from these postulates, when only one system is contemplated, we of course
omit the circles around the signs ⊕, ⊙, and ⧀, and replace
z, u, and j by the more familiar 0, 1, and i; but when we are
comparing several systems, or testing a given system to see
whether it satisfies the postulates, then the more general notation is essential, if we would avoid hopeless confusion.
31. Consistency of the postulates. To establish the consistency of these twenty-seven postulates, we must exhibit at
least one actual system (K, C, ⊕, ⊙, ⧀) that satisfies them
all (compare sec. 15).
The simplest system of this kind is the system studied in
Part IV; namely:
K = the class of all points in the plane (sec. 21);
⊕ = +, as defined in sec. 22;
⊙ = ×, as defined in sec. 24;
C = the class of all points on the axis of reals (sec. 21);
⧀ = <, as defined in sec. 27.


* Considerations of this kind were first emphasized by the Italians, as
Peano, Padoa, Pieri, Burali-Forti, etc., their work dating from about 1890.




That this system satisfies all the postulates of sec. 30 is
evident from the facts enumerated in Part IV. Here z = 0,
u = 1, and j = i.
Other such systems, built up out of purely arithmetical
material, without recourse to geometric intuitions, will be mentioned in Appendix I. Any one of these systems shows that the
postulates are consistent.
Still other, and very instructive examples are given in sec. 33.
32. Sufficiency of the postulates. Further, the twenty-seven
postulates of sec. 30 are sufficient to determine a definite type
among the systems (K, C, ⊕, ⊙, ⧀); that is, any two systems
(K, C, ⊕, ⊙, ⧀) that satisfy all these postulates will be "isomorphic" with respect to K, C, ⊕, ⊙, and ⧀ (sec. 19).
To prove this, suppose two systems (K, C, ⊕, ⊙, ⧀) and (K',
C', ⊕', ⊙', ⧀') are given. First, pair the elements z and u of class
K with the elements z' and u' of class K'; then pair all the rational
real elements of K with the corresponding rational real elements of K';
and, further, pair the irrational real elements of K with the irrational
real elements of K' by pairing the limit of every sequence of rationals
in K with the limit of the corresponding sequence of rationals in K'.
In this way a one-to-one correspondence is established between the
subclasses C and C'. Next, taking one of the elements $\pm\sqrt{-u}$ in K
as j, and one of the elements $\pm\sqrt{-u'}$ in K' as j', pair these elements j
and j'; and finally pair every element x⊕(j⊙y) in K with the corresponding
element x'⊕'(j'⊙'y') in K', thus completing the one-to-one correspondence
between the two classes. It is then easy to see that the correspondence
is of such a nature that if a and b in K correspond to a' and b' in K',
then a⊕b will correspond to a'⊕'b' and a⊙b to a'⊙'b'; and, furthermore,
if a ⧀ b, then a' ⧀' b'. The isomorphism between the two systems is
thus established.
It may be noticed that the isomorphism between the two systems
can be set up in two ways, according to which of the elements $\pm\sqrt{-u}$
we take as j. It is a curious fact that there is no way of distinguishing
between j and -j by any statement that can be expressed in terms
of the symbols K, C, ⊕, ⊙, and ⧀; that is, any true statement involving
j and expressible in terms of these symbols alone, will remain a true
statement when j is replaced by -j.
All the systems that satisfy these twenty-seven postulates are
therefore identical as far as properties statable in terms of K, C,
⊕, ⊙, and ⧀ are concerned: that is, every proposition statable
in terms of these symbols alone will either be true for all such
systems, or else be false for all of them.
33. Examples of isomorphic systems. The following examples of
isomorphic systems will be instructive; in each case the symbols +,
×, and < are to be understood in the sense defined in IV.
(a) K = class of all points in the complex plane; a⊕b = a + b;
a⊙b = 5(a × b); C = class of all points on the axis of reals; ⧀ = <.
Here z = 0, u = 1/5, j = i/5.
(b) K = class of all points in the complex plane; a⊕b = (a × b)/(a + b),
except that a⊕b = a + b whenever a or b or a + b is zero; a⊙b = a × b;
C = class of all points on the axis of reals; (a ⧀ b) = (a < b), except that
when a and b are both positive or both negative, (a ⧀ b) = (a > b).
Here z = 0, u = 1, j = i.
(c) K = class of all points in the complex plane; a⊕b = a + b + 1;
a⊙b = a × b + a + b; C = class of all points on the axis of reals; ⧀ = <.
Here z = -1, u = 0, -u = -2, and j = i - 1.
Each of these systems satisfies all the twenty-seven postulates of
sec. 30, and hence is strictly isomorphic with the system described in
IV.* It will be noticed that the ordinary meaning of addition is
preserved in Example (a), and the ordinary meaning of multiplication
in Example (b). Other examples are given in Appendix I.
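The special elements claimed for Example (c) can be spot-checked by direct computation. The sketch below is an illustration only: it assumes the sum a + b + 1 and the product ab + a + b for that example, models points as Python complex numbers, and uses the map x to x - 1 (named `phi` here, a name introduced for this sketch) as a candidate isomorphism from the ordinary complex plane.

```python
import random

def add_c(a, b):
    """The sum of Example (c): a + b + 1."""
    return a + b + 1

def mul_c(a, b):
    """The product of Example (c): a*b + a + b."""
    return a * b + a + b

def phi(x):
    """Candidate isomorphism: ordinary complex number x goes to x - 1."""
    return x - 1

z, u, j = -1, 0, complex(-1, 1)   # zero element, unit element, imaginary unit i - 1

print(add_c(z, z) == z)     # Postulate 5: the zero element added to itself
print(mul_c(u, u) == u)     # Postulate 12: the unit element times itself
print(add_c(u, -2) == z)    # -2 acts as the opposite of the unit element
print(mul_c(j, j) == -2)    # Postulate 26: j times j is the opposite of u

# Spot-check that phi carries ordinary sums and products to the new ones.
# Complex numbers with small integer parts are handled exactly.
rng = random.Random(0)
for _ in range(100):
    x = complex(rng.randint(-9, 9), rng.randint(-9, 9))
    y = complex(rng.randint(-9, 9), rng.randint(-9, 9))
    assert phi(x + y) == add_c(phi(x), phi(y))
    assert phi(x * y) == mul_c(phi(x), phi(y))
```

A finite spot-check of this kind is of course no proof of isomorphism; it only confirms that the stated values of z, u, -u, and j are consistent with the definitions.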
34. Independence of the postulates. Finally, the twenty-seven postulates of sec. 30 are all independent; that is, no
one of them can be deduced from the remaining twenty-six.
To prove this, we must exhibit, in the case of each postulate,
a system (K, C, ⊕, ⊙, ⧀) which satisfies all the other postulates, but not the one in question (compare sec. 18). A complete list of such "pseudo-algebras" is given in the Transactions of the American Mathematical Society, Vol. VI, 1905,
pp. 227-229; a few examples from this list are given in the
next paragraph, the most interesting one being the example
for Postulate 18.
35. Selected examples of systems that satisfy all but one of
the postulates.
EXAMPLE FOR 1. Let K be a class consisting of five elements, 0,
1, -1, i, -i, and C a class consisting of three of these elements, namely,
0, 1, -1, and let ⊕, ⊙, and ⧀ mean the ordinary +, ×, and <.
This system does not satisfy Postulate 1, since, for example, the
element 1⊕1 = 2 does not belong to the class. All the other postulates
are satisfied.

* Each of these systems is obtained from the ordinary complex plane
by a projective transformation.

EXAMPLE FOR 3. K = all complex quantities; C = all real quantities; a⊕b = (a + b)/3; ⊙ = ×; ⧀ = <.
This system does not satisfy the associative law for addition (compare Example (c), sec. 17). All the other postulates are satisfied.
EXAMPLE FOR 4. The same as for 3, except that ~ is now defined
so that aBEb= 0 for all values of a and b.
In this system, postulate 4, the "law of cancellation" for addition,
is clearly not satisfied. There is a zero element z= 0, and a unit
element u=1; and Postulate 6 is satisfied. Since a+a'=O, whatever
the value of a', we cannot speak of "the opposite of a," since this
element a' is not uniquely determined. Hence postulates like 26 and
27, which presuppose the existence of an opposite, are satisfied "vacuously."
EXAMPLE FOR 8. K    all complex quantities; C= all real quantities;
== +; ab-,b; ~=&lt;.
This system  clearly does not satisfy the commutative law for
multiplication. All the other postulates are satisfied. (The system
does not contain a unique unit element, and therefore all the postulates
which presuppose such an element are satisfied "vacuously.")
EXAMPLE FOR 11. The usual system of complex quantities, except
that ⊙ is so defined that a ⊙ b = a + b - 1.
Here z = 0, u = 1; since the distributive law is not satisfied, the
system is not a "field," and Postulates 26 and 27 demand nothing.
All the other postulates are satisfied.
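The failures claimed in the Examples for 3, 8, and 11 can be spot-checked on particular values. The sketch below is illustrative only; the function names are hypothetical, and the rule for Example 3 is taken with the sign as it appears in the printed text (the associative law fails on these values either way).

```python
from fractions import Fraction as F

def add3(a, b):
    # Example for 3, as printed: a (+) b = -(a + b)/3.
    return -(a + b) / 3

def mul8(a, b):
    # Example for 8: a (.) b = b.
    return b

def mul11(a, b):
    # Example for 11: a (.) b = a + b - 1.
    return a + b - 1

a, b, c = F(3), F(2), F(5)

# Associative law for addition fails in Example 3.
assoc_fails = add3(add3(a, b), c) != add3(a, add3(b, c))
# Commutative law for multiplication fails in Example 8.
comm_fails = mul8(a, b) != mul8(b, a)
# Distributive law fails in Example 11 (ordinary + as addition).
distrib_fails = mul11(a, b + c) != mul11(a, b) + mul11(a, c)

print(assoc_fails, comm_fails, distrib_fails)
```

One counterexample is enough to show that a law fails; showing that all the remaining postulates hold requires the fuller argument of the paper cited in sec. 34.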
EXAMPLE FOR 12. K = the class of all complex quantities x + iy,
in which x and y are even integers (positive, negative, or zero); C = all
the elements of this class which are real; ⊕, ⊙, and ⧀ defined as the
ordinary +, ×, and <.
This system contains no unit element, but satisfies all the other
conditions.
EXAMPLE FOR 16. K = all complex quantities; C = all real quantities;
⊕ = +, ⊙ = ×; but ⧀ interpreted to mean "not equal to."
This system satisfies all the postulates except the law of transitivity;
for, with the meaning given to ⧀, we may have a ⧀ b and b ⧀ c, and
yet not a ⧀ c.
EXAMPLE FOR 17. The ordinary system     of complex quantities,
x+iy, with x and y restricted to rational values (positive, negative, or
zero).


EXAMPLE FOR 18. K = a class of nine objects, let us say nine
umbrellas, marked with the labels 0, 1, 2, 3, 4, 5, 6, 7, 8; C = the subclass composed of umbrellas 0, 1, and 2, with ⧀ = <; ⊕ and ⊙ defined
according to the following tables:*

⊕ | 0 1 2 3 4 5 6 7 8        ⊙ | 0 1 2 3 4 5 6 7 8
--+------------------        --+------------------
0 | 0 1 2 3 4 5 6 7 8        0 | 0 0 0 0 0 0 0 0 0
1 | 1 2 0 4 5 3 7 8 6        1 | 0 1 2 3 4 5 6 7 8
2 | 2 0 1 5 3 4 8 6 7        2 | 0 2 1 6 8 7 3 5 4
3 | 3 4 5 6 7 8 0 1 2        3 | 0 3 6 4 7 1 8 2 5
4 | 4 5 3 7 8 6 1 2 0        4 | 0 4 8 7 2 3 5 6 1
5 | 5 3 4 8 6 7 2 0 1        5 | 0 5 7 1 3 8 2 4 6
6 | 6 7 8 0 1 2 3 4 5        6 | 0 6 3 8 5 2 4 1 7
7 | 7 8 6 1 2 0 4 5 3        7 | 0 7 5 2 6 4 1 8 3
8 | 8 6 7 2 0 1 5 3 4        8 | 0 8 4 5 1 6 7 3 2

For example, 3 ⊕ 7 = 1; 3 ⊙ 7 = 2.
This remarkable system does not satisfy Postulate 18, as we see
by taking a = 1, and x = 1, y = 2. All the other postulates can be shown
to be satisfied, although the labor of a direct verification of the associative and distributive laws would be large. The zero element of the
system is z = 0, the unit element is u = 1, and the imaginary units are
4 and 8. To show that Postulate 27 is satisfied, take i = 4, and build
all the elements of the form x + iy, where x and y belong to C; this set
of elements will be seen to exhaust the given class K.
This system is a good example of the strange "pseudo-algebras"
which would have to be admitted if we left out even one of the twenty-seven conditions imposed by the postulates.
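The direct verification that the text calls laborious is quick by machine. The sketch below is a modern illustration, not the author's method: it transcribes the two tables of the Example for 18 as strings (the names `ADD`, `MUL`, `add`, `mul` are hypothetical) and checks the associative, commutative, and distributive laws by brute force, together with the remarks about the imaginary units and the order relation.

```python
# Rows of the two tables for the nine-element system, indexed by the labels 0..8.
ADD = ["012345678", "120453786", "201534867",
       "345678012", "453786120", "534867201",
       "678012345", "786120453", "867201534"]
MUL = ["000000000", "012345678", "021687354",
       "036471825", "048723561", "057138246",
       "063852417", "075264183", "084516732"]

def add(a, b):
    return int(ADD[a][b])

def mul(a, b):
    return int(MUL[a][b])

K = range(9)
C = (0, 1, 2)

assoc_add = all(add(add(a, b), c) == add(a, add(b, c)) for a in K for b in K for c in K)
assoc_mul = all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a in K for b in K for c in K)
distrib = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a in K for b in K for c in K)
commutative = all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a) for a in K for b in K)

# Imaginary units: elements whose square is the opposite of the unit 1.
neg_u = next(x for x in K if add(1, x) == 0)
imag_units = [j for j in K if mul(j, j) == neg_u]

# Elements of the form x (+) (4 (.) y), with x and y taken from C.
span = {add(x, mul(4, y)) for x in C for y in C}

# With a = 1, x = 1, y = 2: x < y, but a (+) x = 2 does not precede a (+) y = 0.
order_preserved = add(1, 1) < add(1, 2)

print(assoc_add, assoc_mul, distrib, commutative)
print(imag_units, sorted(span), order_preserved)
```

The last line reproduces the counterexample quoted in the text: adding 1 to both sides of 1 < 2 reverses the order.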
EXAMPLE FOR 20. K = all complex quantities x + iy, where x and y
are restricted to rational values (positive, negative, or zero); C = all
real quantities; ⊕, ⊙, and ⧀ meaning the ordinary +, ×, and <.
This system satisfies all the postulates except the 20th.
EXAMPLE FOR 24. K = all complex quantities, with ⊕ = +, and
⊙ = ×; C = all pure imaginaries (sec. 21), with ⧀ defined so that
ix ⧀ iy whenever x < y.
Here the product of two elements of C will not (in general) belong
to C, but all the other postulates are satisfied (19 and 27 vacuously).
EXAMPLE FOR 26. K = all real quantities, C = all real quantities,
⊕ = +, ⊙ = ×, ⧀ = <.
This system contains no "imaginary units."


* Advanced students will recognize this system (Example for 18) as a Galois Field of
order 3².


EXAMPLE FOR 27. The system employed to show the independence
of Postulate 27 is a rather complicated one, as follows: K = the class
of all algebraic expressions T of the form,
$$T = A_m t^m + A_{m+1} t^{m+1} + A_{m+2} t^{m+2} + \cdots,$$
where t is a parameter, and m any integer (positive, negative, or zero),
while the A's are ordinary complex quantities. The operations ⊕ and
⊙ are defined as the ordinary + and × for such (finite or infinite)
expressions. The class C is the class of all those elements T in which
all the coefficients are zero except $A_0$, and $A_0$ is real; that is, C = the
class of real quantities. Within this class C, ⧀ is defined as the ordinary <.
This system satisfies all the postulates except Postulate 27. It is
larger than the system of ordinary complex quantities, and contains
that system just as the system of ordinary complex quantities contains
the system of real quantities. Postulate 27 is therefore a restrictive
condition.
36. What is algebra? We are now in a position to answer
the question, "What is the algebra of complex quantities?"
The answer is, the algebra of complex quantities is the scientific
study of that particular type of "system (K, C, ⊕, ⊙, ⧀)" which
satisfies the twenty-seven postulates of sec. 30; any system (K,
C, ⊕, ⊙, ⧀) that satisfies these twenty-seven conditions may
be taken as a representative example of the algebra, and all the
propositions which are logically deducible from these twenty-seven postulates are the propositions which form the body of
the science.
The system of points in a plane, described at length in Part IV, is
the simplest representative example of this algebra, and is the only
example which could possibly be used to advantage in elementary
instruction (compare Appendix I, especially sec. 42).
Again, if one asks, "What is an imaginary quantity?" the
answer is this: If any system (K, C, ⊕, ⊙, ⧀) that satisfies
the twenty-seven laws of complex algebra is given, then any
element of K, not belonging to the subclass C, is called an
"imaginary" element of that system.
The question "What is an irrational quantity?" may be
answered in a similar way.


A striking peculiarity of the set of postulates adopted in
sec. 30 is that none of the postulates presupposes any knowledge
of arithmetic, not even the notion of counting.*
37. A complete set of postulates for the subalgebra of real
quantities.  A complete set of postulates for the algebra of all
real quantities may be obtained from    the list in sec. 30 as
follows: Omit postulates 20-27; abandon the distinction between
the classes K and C, and make postulates 14-19 apply to the
whole class K.
The resulting set of nineteen postulates, 1-19, will be consistent, sufficient, and independent; and any system (K, ⊕, ⊙, ⧀)
which satisfies them all will be an example of the type of algebra
called the algebra of all real quantities.†
Complete sets of postulates for other subalgebras, as the
algebra of positive integers, the algebra of all integers, the
algebra of all rationals, etc., are given in another paper by
the writer.‡

* This must not be understood to imply that the postulates of sec. 30
would therefore form a suitable introduction to algebra for beginners; compare the remark near the end of sec. 30.
† For bibliographical references, see Trans. Amer. Math. Soc., Vol. III,
1902, p. 265.
‡ The Fundamental Laws of Addition and Multiplication in Elementary
Algebra, reprinted from the Annals of Mathematics, 1906. (Publication Office
of Harvard University).

38. On the value of complex algebra in problems concerning
real quantities. As already pointed out, the rules of operation
in any of the subalgebras are more complicated, that is, more
subject to exceptions, than are the rules of operation in the
general algebra of complex quantities. On this account, it is
usually worth while to employ the algebra of complex quantities
even in cases where the data of the problem, and the required
answer, are all real quantities. For example, if it is required
to find a real value of x that satisfies a given equation ax² + bx
+ c = 0, the simplest plan is first to find all the values of x
that satisfy the equation, and then to pick out those, if any,
that are real. Similarly, if the problem calls for a positive
value (or an integral value) of x, we do not confine ourselves
to the algebra of positive quantities (or the algebra of integral
quantities) but proceed at once to operate in the realm of all
complex quantities, and then select those results that satisfy
the given conditions.
It is chiefly for reasons of this sort, if at all, that the algebra
of complex quantities should be taught in the secondary schools;
for elementary practical problems in which this type of algebra
is directly applicable are not of frequent occurrence.
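The plan described in sec. 38 (solve in the complex realm first, then select the answers of the required kind) can be sketched in a few lines. This is an illustration under stated assumptions, not part of the original text: the function name `real_roots` and the tolerance `tol` are hypothetical, the coefficients are assumed real, and a is assumed not zero.

```python
import cmath

def real_roots(a, b, c, tol=1e-12):
    """Solve a*x**2 + b*x + c = 0 over the complex quantities, then keep the real roots."""
    disc = cmath.sqrt(b * b - 4 * a * c)          # always defined in the complex realm
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    return [r.real for r in roots if abs(r.imag) <= tol]

print(real_roots(1, -3, 2))   # x**2 - 3x + 2 = 0 has two real roots
print(real_roots(1, 0, 1))    # x**2 + 1 = 0 has none
```

The same formula serves every case; no separate rule is needed for a negative discriminant, which is the economy the section describes.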
APPENDIX I. OTHER EXAMPLES OF THE ALGEBRA OF
COMPLEX QUANTITIES
39. Arithmetical systems. In the latter half of the nineteenth
century a large amount of effort was expended in devising definitions
of the irrational and imaginary quantities which should rest on a purely
arithmetical basis, independent of any geometrical intuitions. The
problem, as we should now state it, was this: To construct, out of purely
arithmetical material, systems that satisfy the postulates 1-27 of sec. 30,
and are therefore isomorphic with the system of points in the plane.
The only use made of such systems is in the proof of the consistency
of the postulates-a non-geometric system being, from certain points of
view, more satisfactory for this purpose than a geometric one; but after
the consistency of the postulates is once established, these arithmetical
systems need not be again referred to, and from the elementary pedagogical point of view, they seem to have no value whatever.
Since, however, many of the newer text-books are inclined to lay
great stress on this matter, a brief account of one of these arithmetical
systems will here be given. The system is built up by successive steps
from the system of natural numbers, 1, 2, 3, ...; and we shall assume
that the rules for adding and multiplying these numbers are known.
40. System based on Dedekind's " cuts." The best known of
these arithmetical systems is one based on a very ingenious idea published by R. Dedekind in his "Stetigkeit und irrationale Zahlen," in
1872. The steps by which the system is constructed are as follows:
(a) Positive rationals, R. Consider first a class R composed of all
possible pairs of numbers, m/n. (By "number" we mean, throughout
this section, a natural number, 1, 2, 3, ...) Two such pairs, m/n
and m'/n', are called equal if the numbers mn' and m'n are the same;
the pair (mn' + m'n)/(nn') is called the sum, and the pair (mm')/(nn')
the product, of the pairs m/n and m'/n';* and the pair m/n is said to
precede, or be less than, the pair m'/n' if the number mn' is less than
the number m'n.

* The product of two equal pairs is called the "square" of that pair.

If two number pairs are denoted by a and b, their sum and product,
as just defined, may be denoted by a + b and a × b; and the notation
b < a may be used to denote that b "precedes" a. Further, if b < a,
there is always a pair x, such that b + x = a; this pair x is called the
remainder a minus b, and is denoted by a - b.
The system R thus defined is an example of the type of algebra called
the algebra of positive rational quantities.
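The definitions in (a) use nothing but the addition, multiplication, and comparison of natural numbers, as the following sketch shows. It is a minimal illustration; representing the pair m/n as a Python tuple `(m, n)` and the function names are assumptions made here.

```python
# A pair m/n is represented as the tuple (m, n) of natural numbers.

def equal(p, q):
    (m, n), (m2, n2) = p, q
    return m * n2 == m2 * n           # mn' and m'n are the same number

def add(p, q):
    (m, n), (m2, n2) = p, q
    return (m * n2 + m2 * n, n * n2)  # (mn' + m'n)/(nn')

def mul(p, q):
    (m, n), (m2, n2) = p, q
    return (m * m2, n * n2)           # (mm')/(nn')

def precedes(p, q):
    (m, n), (m2, n2) = p, q
    return m * n2 < m2 * n            # mn' is less than m'n

half, third = (1, 2), (1, 3)
print(add(half, third))       # (5, 6)
print(mul(half, third))       # (1, 6)
print(equal((2, 4), half))    # True
print(precedes(third, half))  # True
```

Note that `equal` is needed because different pairs, such as 2/4 and 1/2, count as the same element of R.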
(b) Positive reals, Q. In the series R thus defined, there is an
infinity of ways in which the whole series of number pairs can be divided
into two parts, U and V, such that every pair in the class U "precedes"
every pair in the class V. Every such method of division in the series R
is called a cut (U, V).  For example, the following set of instructions:
Assign to U every pair whose square "precedes" 2/1, and to V every
pair whose square "follows" 2/1-is a "cut."
If there is a pair m/n which is either the last pair in U or else the
first pair in V, then this pair m/n is called the generating element of
the cut, and the cut is called a rational cut; but for most cuts, no
such pair will exist.
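The cut used as an example above can be explored on a finite sample of pairs. This is only a sketch: a cut divides the whole infinite series R, whereas the code tests a hypothetical finite `sample`; the name `in_U` and the use of Python's `Fraction` for the pairs m/n are assumptions.

```python
from fractions import Fraction

def in_U(r):
    """Lower class of the cut: pairs whose square precedes 2/1."""
    return r * r < 2

# A finite sample of positive rationals m/n with small denominators.
sample = [Fraction(m, n) for n in range(1, 8) for m in range(1, 4 * n)]
U = [r for r in sample if in_U(r)]
V = [r for r in sample if not in_U(r)]

# Every sampled pair in U precedes every sampled pair in V.
assert max(U) < min(V)
# No sampled pair has square exactly 2/1, so none of them generates the cut.
assert all(r * r != 2 for r in sample)

print(max(U), min(V))
```

Enlarging the sample brings the last element of U and the first element of V closer together without ever producing a generating element.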
We now consider a class Q composed of all possible "cuts" in the
series R. Two cuts (U, V) and (U', V') are called equal if the classes
U and V are the same as the classes U' and V' respectively. A cut
(X, Y) is called the sum (or product) of the cuts (U, V) and (U', V')
if the class X contains every pair which is the sum (or product) of a pair
in U and a pair in U', while the class Y contains every pair which is
the sum (or product) of a pair in V and a pair in V'. A cut (U, V) is
said to precede a cut (U', V'), if there is any pair in the class V which
precedes a pair in the class U'. If A and B are two cuts, then A + B
means their sum, and A × B their product, as just defined, and B < A
means that B precedes A.
Further, if A and B represent two cuts, and B precedes A, then
there is always a cut x, which, when added to B, according to the rule,
will produce A; this cut x is called the remainder, A minus B, and
is denoted by A -B.
The system Q thus defined is an example of the algebra of positive real
quantities. It is a system in which Dedekind's principle can be shown
to hold; but the proof requires very close reasoning.*

* See Weber and Wellstein, Elementar-Mathematik, Vol. I, sec. 23.

(c) All reals, q. Next, we consider a still more complicated class,
q, made up of three kinds of elements: (1) All symbols of the form +A,
where A is any element of the class Q, and + is a distinguishing mark,
read "positive"; (2) all symbols of the form -A, where A is any element
of the class Q, and - is a distinguishing mark, read "negative"; (3)
an extra symbol, 0, called zero.
Within this class q, "sums," "products," and the relation of "precedence" are defined by the following formulas, in which A, B, ...,
denote elements of the class Q, and A + B, A - B, A × B, and A < B
have the meanings already defined for that class:
+A ⊕ +B = +(A + B),   -A ⊕ -B = -(A + B);
+A ⊕ -B = -B ⊕ +A = +(A - B) if B < A, and = -(B - A) if A < B;
+A ⊕ 0 = 0 ⊕ +A = +A,   -A ⊕ 0 = 0 ⊕ -A = -A,   +A ⊕ -A = -A ⊕ +A = 0;
+A ⊙ +B = +(A × B),   -A ⊙ -B = +(A × B),   +A ⊙ -B = -B ⊙ +A = -(A × B);
+A ⊙ 0 = 0 ⊙ +A = 0,   -A ⊙ 0 = 0 ⊙ -A = 0;
+A ⧀ +B when A < B,   -A ⧀ -B when B < A;   -A ⧀ +B;   -A ⧀ 0,   0 ⧀ +A.
These definitions being once established, the circles may be dropped
from the symbols ⊕, ⊙, and ⧀.
Further, if α and β are any elements of q (whether α < β or β < α),
there will always be an element x in q such that α = β + x; this element
x is called the remainder, α minus β, and is denoted by α - β.
The system q thus defined is an example of the algebra of all real quantities, for, like the system of points on the axis of reals, it can be shown
to satisfy all the nineteen postulates of sec. 37; the labor of verification is in this case, however, very considerable, especially in case of
Dedekind's postulate.
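The case-by-case formulas for the class q translate directly into code. The sketch below is illustrative only: positive `Fraction` values stand in for the elements A, B of the class Q (an assumption, since Q properly consists of cuts), and the tagged-tuple representation and function names are hypothetical.

```python
from fractions import Fraction as Q   # stand-in for elements of the positive class Q

ZERO = ("0", None)                    # the extra symbol 0

def pos(A):
    return ("+", A)                   # the symbol +A

def neg(A):
    return ("-", A)                   # the symbol -A

def add(x, y):
    if x == ZERO:
        return y
    if y == ZERO:
        return x
    (s, A), (t, B) = x, y
    if s == t:
        return (s, A + B)             # +A (+) +B = +(A+B),  -A (+) -B = -(A+B)
    if A == B:
        return ZERO                   # +A (+) -A = 0
    # Mixed marks: the larger element of Q decides the mark of the result.
    return (s, A - B) if B < A else (t, B - A)

def mul(x, y):
    if x == ZERO or y == ZERO:
        return ZERO
    (s, A), (t, B) = x, y
    return ("+" if s == t else "-", A * B)

def precedes(x, y):
    if x == y:
        return False
    if x == ZERO:
        return y[0] == "+"            # 0 precedes every +A
    if y == ZERO:
        return x[0] == "-"            # every -A precedes 0
    (s, A), (t, B) = x, y
    if s != t:
        return s == "-"               # -A precedes +B
    return A < B if s == "+" else B < A

print(add(pos(Q(3)), neg(Q(5))))      # the symbol -2
print(mul(neg(Q(2)), neg(Q(3))))      # the symbol +6
```

Every branch uses only operations already defined within Q, including the remainder A - B, which is applied only when B precedes A.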
(d) Finally, we construct still another class, K, by taking as elements
of K all possible couples of the form (α, β), where α and β are any
elements of the class q, that is, any real quantities.
Two such couples (α, β) and (α', β') are called equal when α = α'
and β = β'.
The sum and product of two couples are defined by the following
formulas, in which α and β denote any elements of the system of real
quantities, q, and α + β, α - β, and α × β have the meanings defined
for that system:
(α, β) ⊕ (α', β') = (α + α', β + β'),
(α, β) ⊙ (α', β') = (αα' - ββ', αβ' + βα').
Within the class K, the couples of the form (α, 0), in which the second
element is zero, form a subclass C, and within this subclass C a couple
(α, 0) is said to precede a couple (α', 0) if α precedes α' in the system q.
The complete system K thus constructed can be shown to satisfy all the
twenty-seven postulates of sec. 30, and is, therefore, like the system of points
in the plane, an example of the algebra of complex quantities.
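The two rules for couples can be tried out directly. In this minimal sketch, ordinary Python integers stand in for the real quantities of the system q, and the function names are hypothetical.

```python
# A couple (alpha, beta) is represented as a tuple of two real quantities.

def add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

j = (0, 1)
print(mul(j, j))             # (-1, 0): the square of (0, 1) lies in the subclass C
print(mul((1, 2), (3, 4)))   # (-5, 10)
print(add((1, 2), (3, 4)))   # (4, 6)
```

The couple (0, 1) thus plays the part of an imaginary unit, since its square is the opposite of the unit couple (1, 0).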


41. System based on Cantor's " regular sequences." Another
system of the same general character can be built up by using "methods
of forming infinite sequences" of a certain special kind in the series
of rational quantities, instead of "methods of forming cuts" in that
series. The definitions of "sums" and "products" in this system are
of course quite different, in detail, from the definitions in the system
just described; but the general plan by which the system is built up,
and the highly abstruse nature of the concepts involved, are the same
in both cases.
42. Comments on these arithmetical systems. It will be
sufficiently obvious from the above descriptions that these arithmetical
systems are wholly unsuitable for use in elementary instruction. And
yet it is unfortunately customary to speak of the elements of such an
arithmetical system as the genuine "algebraic quantities," and to
regard the points in the plane as merely "geometrical representations"
of them. As a matter of fact, both the arithmetical and the geometrical
systems are equally entitled to stand as representatives of the type of algebra
in question-the only genuine definition of the system being embodied
in the laws of operation of the system, as expressed in a set of postulates
like those in sec. 30.
And when we consider what the elements of the arithmetical system
really are ("couples" of "methods of division" of a series of "pairs of
numbers"), while the elements of the other systems are simply geometric
points, it is easy to decide which of these systems is the more suitable
concrete example to present to an elementary student.
Moreover, the complicated nature of these arithmetical systems is
not lessened by calling them systems of numbers, in an extended sense
of the term number.*  It has become customary, during the latter part
of the nineteenth century, to speak of all the objects described in sec. 40
or sec. 41 as "numbers," and to regard algebra as the study of these
"number systems;" but in actual practice the original definitions of
these so-called "numbers" drop entirely out of mind, and a "number
system" comes to be thought of as any system of objects which can
be put into one-to-one correspondence with the system of points in the
plane. Indeed, too often a text-book will profess to "introduce" or
"invent" a new   "number" to correspond to some point, without
vouchsafing any description whatever of the object so invented, beyond
the statement that it does correspond to the point. If a "number" is
thus to have no properties that a "point" does not have, it would
seem unnecessary to make the distinction in terminology; for the two
systems become no longer parallel, but identical!
As a matter of fact, all that is really essential in either the system of
points or the system of numbers is the set of formal laws which govern the
operations within these systems.

* To avoid confusion, at least in elementary work, it seems preferable to
reserve the word "number" for its ordinary arithmetical use, and to call
the other elements "quantities" as is done, for example, in Professor Bôcher's
new book on "Higher Algebra." Thus the term "complex quantity" is surely
less perplexing to a beginner than "complex number."
APPENDIX II. PROOF THAT EVERY ALGEBRAIC EQUATION
HAS A ROOT
43. We give here the proof, omitted in sec. 26, that every
algebraic equation,
$$p_0 x^n + p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n = 0,$$
has at least one root. Here n is any positive integer, and $p_0$,
$p_1$, $p_2$, ..., $p_n$ are any given points in the complex plane ($p_0$ not
zero).
Numerous proofs of this important theorem have been given,
the earliest rigorous demonstration being due to Gauss (1799).
The proof here presented differs from those commonly given
in the fact that no use is made of trigonometry, or of the
method of separating a complex quantity into its real and pure
imaginary parts.
Throughout the proof we shall use the notation |a|, due
to Weierstrass, to denote the distance of the point a from the
zero point. It is obvious from the definition of addition of
points (sec. 22) that if x = a + b + c + ..., then the distance of x
cannot exceed the sum of the distances of a, b, c, ...; that is,
$$|a + b + c + \cdots| \le |a| + |b| + |c| + \cdots;$$
it is also obvious, by the definition of the subtraction of points,
that |a - b| will denote the distance between the two points a
and b.
As a further matter of notation, we denote the left-hand
side of the given equation by f(x):
$$f(x) = p_0 x^n + p_1 x^{n-1} + p_2 x^{n-2} + \cdots + p_{n-1} x + p_n;$$
the value of f(x) when x=a is then denoted by f(a), and our
problem is to show that there is at least one point x=a such
that f(a) =0. The function f(x) is called a polynomial of the
nth degree in x.
44. In order to simplify the proof, we first establish the
following properties of the function f(x).
(1) Given any distance R, we can find a distance G such
that |f(x)| < G whenever |x| < R.
For, take $G > (n+1)\,|p|\,|S^n|$, where p is the most distant of the given
points $p_0, p_1, \ldots, p_n$ and S is a point such that |S| > R and also > |1|.
Then whenever |x| < R, we shall have $|x^k| < R^k < |S^k| \le |S^n|$
(k = 1, 2, ..., n), and therefore, since f(x) has n + 1 terms,
$$\begin{aligned}
|f(x)| &= |p_0 x^n + p_1 x^{n-1} + \cdots + p_{n-1} x + p_n| \\
&\le |p_0 x^n| + |p_1 x^{n-1}| + \cdots + |p_{n-1} x| + |p_n| \\
&< |p S^n| + |p S^n| + \cdots + |p S^n| + |p S^n| \\
&= (n+1)\,|p|\,|S^n|,
\end{aligned}$$
which is less than G, as required.
(2) By taking |x| sufficiently large, we can make |f(x)| as
large as we please. That is, given any distance g, we can find
a distance h such that |f(x)| > g whenever |x| > h.
FIG. 23.
For, write f(x) in the form $f(x) = x^n (p_0 + Q)$, where
$$Q = \frac{p_1}{x} + \frac{p_2}{x^2} + \cdots + \frac{p_{n-1}}{x^{n-1}} + \frac{p_n}{x^n},$$
and take h larger than |1|, and larger than each of the distances
$(2g/|p_0|)^{1/n}$ and $(2n|p|)/|p_0|$, where p is the most distant of the given
points $p_0, p_1, \ldots, p_n$.
Then, whenever |x| > h,
$$|Q| \le \frac{|p_1|}{|x|} + \frac{|p_2|}{|x^2|} + \cdots + \frac{|p_n|}{|x^n|}
< \frac{|p|}{|h|} + \frac{|p|}{|h^2|} + \cdots + \frac{|p|}{|h^n|}
\le \frac{n|p|}{|h|} < \frac{n|p|}{(2n|p|)/|p_0|} = \frac{|p_0|}{2},$$
and therefore, as we may see from the figure, $|p_0 + Q| > |p_0|/2$.
Further, whenever |x| > h, $|x^n| > 2g/|p_0|$.
Hence, whenever |x| > h, $|f(x)| = |x^n| \cdot |p_0 + Q| > g$, as required.
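Properties (1) and (2) can be illustrated numerically for a particular polynomial. The sketch below is not part of the original proof: the function names are hypothetical, the bound G counts the n + 1 terms of f(x), and the radius h is taken as the largest of 1, (2g/|p_0|)^(1/n), and 2n|p|/|p_0|, which is one reading of the constants in (2).

```python
import cmath
import math

def f(coeffs, x):
    """Evaluate p0*x^n + ... + pn by Horner's rule; coeffs = [p0, ..., pn]."""
    value = 0
    for p in coeffs:
        value = value * x + p
    return value

def upper_bound(coeffs, R):
    """A distance G with |f(x)| < G whenever |x| < R, as in property (1)."""
    n = len(coeffs) - 1
    p = max(abs(c) for c in coeffs)       # distance of the most distant coefficient
    S = max(R, 1.0) + 1.0                 # any S with |S| > R and |S| > 1 will do
    return (n + 1) * p * S ** n + 1.0     # strictly greater than (n+1)|p||S^n|

def escape_radius(coeffs, g):
    """A distance h with |f(x)| > g whenever |x| > h, as in property (2)."""
    n = len(coeffs) - 1
    p = max(abs(c) for c in coeffs)
    p0 = abs(coeffs[0])
    return max(1.0, (2 * g / p0) ** (1.0 / n), 2 * n * p / p0)

coeffs = [1, -3, 2 + 1j, 5]               # a sample cubic with complex coefficients
G = upper_bound(coeffs, R=2.0)
h = escape_radius(coeffs, g=100.0)

# Test points on circles about the zero point.
circle = [cmath.rect(1.0, 2 * math.pi * t / 12) for t in range(12)]
assert all(abs(f(coeffs, 1.9 * w)) < G for w in circle)
assert all(abs(f(coeffs, 1.01 * h * w)) > 100.0 for w in circle)
print(G, h)
```

Both bounds are far from sharp; the proof needs only that some such G and h exist.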
(3) The polynomial f(x), of the nth degree, may be written in
the form
$$f(x) = f(a) + (x - a)^k F(x),$$
where a is any given point in the plane, and F(x) is another
polynomial, of the (n-k)th degree, and such that F(a) is not
zero.*
For, we have
$$f(x) = p_0 x^n + p_1 x^{n-1} + \cdots + p_{n-1} x + p_n,$$
and
$$f(a) = p_0 a^n + p_1 a^{n-1} + \cdots + p_{n-1} a + p_n,$$
whence, by subtraction,
$$f(x) - f(a) = p_0 (x^n - a^n) + p_1 (x^{n-1} - a^{n-1}) + \cdots + p_{n-1} (x - a).$$
Now $x^m - a^m$, where m is any positive integer, is always divisible by
x - a;† hence f(x) - f(a) contains the factor x - a at least once, and
may be written in the form $f(x) - f(a) = (x - a) F_1(x)$, where $F_1(x)$ is a
polynomial of the (n - 1)st degree. It may of course happen that it
contains the factor x - a more than once, but by dividing by x - a as
often as possible, say k times, we shall finally arrive at the equation
$f(x) - f(a) = (x - a)^k F(x)$, where F(x) is a polynomial, of degree n - k,
that is not divisible by x - a. Moreover, this polynomial F(x) will
not become zero when x = a; for, as above, F(x) - F(a) is an expression
containing x - a as a factor, and if F(a) were zero, then F(x) itself
would contain x - a as a factor, which is not the case.

* In case k = n, F(x) reduces to a constant, not zero.
† Thus, $x^m - a^m = (x - a)(x^{m-1} + a x^{m-2} + a^2 x^{m-3} + \cdots + a^{m-2} x + a^{m-1})$.
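The repeated division described in (3) can be carried out by synthetic division. The following is a sketch under stated assumptions: the function names are hypothetical, coefficients are listed from p_0 down to p_n, the degree n is at least 1, and exact arithmetic (integers or fractions) is assumed so that the test for a zero remainder is reliable.

```python
def horner(coeffs, a):
    """Value of the polynomial with coefficients [p0, ..., pn] at the point a."""
    value = 0
    for p in coeffs:
        value = value * a + p
    return value

def divide_by_linear(coeffs, a):
    """Synthetic division by (x - a); returns (quotient coefficients, remainder)."""
    out = []
    carry = 0
    for p in coeffs:
        carry = carry * a + p
        out.append(carry)
    return out[:-1], out[-1]

def split_at(coeffs, a):
    """Return (f(a), k, F) with f(x) = f(a) + (x - a)**k * F(x) and F(a) != 0."""
    fa = horner(coeffs, a)
    g = coeffs[:-1] + [coeffs[-1] - fa]   # coefficients of f(x) - f(a)
    k = 0
    while len(g) > 1:
        quotient, remainder = divide_by_linear(g, a)
        if remainder != 0:
            break                          # g is no longer divisible by x - a
        g, k = quotient, k + 1
    return fa, k, g

# f(x) = x^3 - 3x^2 + 3x + 4 = 5 + (x - 1)^3, so k = n and F is a constant.
print(split_at([1, -3, 3, 4], 1))   # (5, 3, [1])
# f(x) = x^2 = 1 + (x - 1)(x + 1), so k = 1 and F(x) = x + 1.
print(split_at([1, 0, 0], 1))       # (1, 1, [1, 1])
```

The first example is the case of the footnote, where k = n and F(x) reduces to a constant.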
(4) The polynomial f(x) is continuous at every point x = a.
Roughly speaking, this means that a small change in the
position of x will produce a correspondingly small change in
the position of f(x). More precisely, given any radius R about
the point f(a), we can always find a radius r about the point a,
such that whenever |x - a| < r, |f(x) - f(a)| < R.
To show that this is true, write f(x) in the form
$$f(x) = f(a) + (x - a)^k F(x),$$
as in (3), and draw a circle of radius |1| about the point a, and a more
inclusive circle of radius |a| + |1| about the point 0. By (1), we can
find a distance G such that |F(x)| < G whenever x lies within the larger
(and hence whenever x lies within the smaller) circle. If now we take
$r < (R/G)^{1/k}$, and also < |1|, then whenever |x - a| < r we shall have
$$|f(x) - f(a)| = |(x - a)^k| \cdot |F(x)| < r^k \cdot G < (R/G)\,G = R,$$
as required.
(5) If f(a) is not zero, we can always choose x so that
|f(x)| < |f(a)|; that is, so that the point f(x) is nearer the zero
point than f(a) is.
To see that this is true, we shall write f(x) in the form
$$f(x) = f(a) + (x - a)^k F(x),$$
as in (3), and then show that x can be so chosen that the point $(x - a)^k F(x)$
will fall within the region shaded in the figure; that is, so that the distance
of $(x - a)^k F(x)$ will be less than |f(a)|, while its angle will lie between θ + 120°
and θ + 240°, where θ is the angle of the point f(a). When x is so chosen, the
sum of the point $(x - a)^k F(x)$ and the fixed point f(a) will then lie within
the region bounded by the dotted line in the figure, and hence will be nearer to 0
than f(a) is.
FIG. 24.
To see that the distance of $(x - a)^k F(x)$ can be made less than
|f(a)|, we have merely to notice that the factor $|(x - a)^k|$ can be made
as small as we please by taking x sufficiently near to a, while the other
factor, |F(x)|, by (1), remains less than some finite quantity G.
To see that the angle of $(x - a)^k F(x)$ can be made to lie between
θ + 120° and θ + 240°, notice first that this angle is equal to kφ + ψ,
where φ is the angle of x - a and ψ is the angle of F(x). Now ψ, the
angle of F(x), can, by (4), be made to differ by as little as we please
from α, the angle of F(a), by taking x sufficiently near to a; in particular, if we take |x - a| < r, where r is sufficiently small, ψ will lie between,
say, α - 60° and α + 60°. Having thus chosen the distance of x - a,
we can still vary φ, the angle of x - a, at pleasure, by moving x around
the circumference of its small circle of radius |x - a| about a. In particular, if we take φ = (θ - α + 180°)/k, then kφ + α = θ + 180°, and
kφ + ψ will lie between θ + 120° and θ + 240°, as required.
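The choice of angle in (5) can be tried on a small example. The sketch below is an illustration, not the author's construction in full: the function names are hypothetical, angles are measured in radians (so 180° becomes π), the values of k and F(a) are supplied by hand from the decomposition in (3), and the radius r = 0.1 is simply assumed to be small enough.

```python
import cmath
import math

def f(coeffs, x):
    """Evaluate the polynomial with coefficients [p0, ..., pn] at x."""
    value = 0
    for p in coeffs:
        value = value * x + p
    return value

def closer_point(coeffs, a, k, Fa, r):
    """Point x at distance r from a whose angle phi satisfies k*phi + alpha = theta + pi."""
    theta = cmath.phase(f(coeffs, a))     # angle of f(a)
    alpha = cmath.phase(Fa)               # angle of F(a)
    phi = (theta - alpha + math.pi) / k   # angle chosen for x - a
    return a + cmath.rect(r, phi)

coeffs = [1, 0, 1]                        # f(x) = x^2 + 1

# At a = 2: f(x) - f(2) = (x - 2)(x + 2), so k = 1 and F(2) = 4.
x1 = closer_point(coeffs, a=2, k=1, Fa=4, r=0.1)
# At a = 0: f(x) - f(0) = x^2, so k = 2 and F(0) = 1.
x2 = closer_point(coeffs, a=0, k=2, Fa=1, r=0.1)

print(abs(f(coeffs, x1)), abs(f(coeffs, 2)))   # the first is smaller
print(abs(f(coeffs, x2)), abs(f(coeffs, 0)))   # the first is smaller
```

In the second case the improving point lies off the axis of reals: no real x brings x² + 1 nearer to zero than its value at 0, which is why the argument is carried out in the complex plane.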
The following general property of points in the plane will
also be useful.
(6) If an infinite collection of points x all lie within a finite
region of the plane, say within a square of side D, then there will
be at least one point X, within or on the boundary of the square,
having the following property: every circle, however small, drawn
about X as a centre, includes an infinite number of points that
belong to the collection.  Such a point X is called a cluster point
for the given collection; but it may or may not itself belong
to the collection.
To see that such a cluster point will always exist, we have merely to
draw through each point of the collection lines parallel to the sides of
the square, and consider the sets of points in which these lines cut two
adjacent sides of the square. Each of these sets of points, by Dedekind's principle, [sec. 27, (31)], will have at least one limit point; and
the point of intersection of lines drawn through these limit points,
parallel to the sides of the square, will be the cluster point required.
(7) If a collection of points {x} is given, and the corresponding
points {f(x)} possess a cluster point Y, then there is at least one
point X in the plane such that f(X) = Y.
To prove this, pick out from the collection {f(x)} an infinite sequence
of points, $f(x_1), f(x_2), \ldots$, having the following properties: (a) Each
point of the sequence is nearer to Y than the preceding point is; and
(b) every circle drawn about Y as a centre will contain an infinite
number of points of the sequence. The corresponding points, $x_1$,
$x_2, \ldots$, will then all lie within a finite region of the plane, as we may
see by (2), and they will, therefore, by (6), have at least one cluster
point, X. This point X will have the required property, f(X) = Y.
For, suppose f(X) were equal to Y', where Y' is different from Y.
Then we could draw two non-overlapping circles of radii R and R',
about the points Y and Y', respectively, and deduce a contradiction,
as follows: In the first place, from the nature of the sequence $f(x_1)$,
$f(x_2), \ldots$, all the points of this sequence beyond a certain stage, say,
$f(x_k)$, will lie within the circle R about the point Y. On the other
hand, since Y' = f(X), we can, by (4), draw a circle of radius r' about
X so that whenever x lies within this circle r', f(x) will lie within the
circle R', and hence outside the circle R. Therefore none of the points
of the sequence $x_1, x_2, \ldots$, beyond the stage $x_k$ can lie within the
circle r', which contradicts the fact that X is a cluster point for this
sequence.
45. The main proposition can now be established, as follows:
Suppose the proposition is false; that is, suppose that
there is a distance c, not zero, such that |f(x)| > c for every
point x in the plane. Then, by Dedekind's principle, the possible
values of |f(x)| must have a lower limit, b ≥ c, such that |f(x)|
is never less than b, but can be brought as near to b as we please
by properly choosing x.
Two cases are now conceivable: either there is a point a
such that |f(a)| = b, or else |f(x)| > b for all values of x.
The first case is impossible, since, if there were a point a
such that |f(a)| = b, then, by (5), there would be also a point
x such that |f(x)| < b, which is contrary to the hypothesis that
b is the lower limit of |f(x)|.
In the second case, |f(x)| has the lower limit b, but never
reaches it. If therefore we draw a circle of radius b about the
point 0, there will be an infinite number of points f(x) in a
finite region just outside this circle. It is easy to see, by (6),
that this collection of points f(x) will have a cluster point Y
somewhere on the circumference of the circle of radius b; therefore, by (7), there is a point X for which f(X) = Y; but for this
point |f(X)| = b, and the second case is therefore as impossible
as the first.
Hence the supposition with which we started must be false,
and the theorem that every algebraic equation has at least one
root is thus established.
The general theorem in sec. 26 (23) follows without difficulty.


V
THE ALGEBRAIC EQUATION
By G. A. MILLER


CONTENTS
I. GENERAL INTRODUCTION. Sections 1-4
1, Aim of the monograph;
2, How it should be read;
3, Mathematics presupposed;
4, Type of questions studied.
II. HISTORICAL SKETCH AND DEFINITIONS. Sections 5-9
5, Introduction;
6, Definitions;
7, Fundamental problems;
8, Symbols;
9, Domain of rationality.
III. EQUATIONS WITH ONE UNKNOWN AND WITH LITERAL COEFFICIENTS. Sections 10-17
10, General statement;
11, Substitutions and substitution groups;
12, Linear equations;
13, Quadratic equations;
14, Extensions of the number concept due to the quadratic;
15, Cubic equations;
16, Biquadratic equations;
17, Equations whose degrees exceed 4.
IV. EQUATIONS WITH ONE UNKNOWN AND WITH NUMERICAL COEFFICIENTS. Sections 18-24
18, General statement;
19, Multiple roots;
20, Sturm's theorem;
21, Rational roots;
22, Irrational roots;
23, Solutions by means of graphs and machines;
24, A few fallacies.
V. SIMULTANEOUS EQUATIONS. Sections 25-30
25, Introduction;
26, Consistency of a system of linear equations;
27, Geometrical interpretation;
28, Consistency of two equations in one unknown;
29, Equivalent equations;
30, A few tests for equivalence of equations.
VI. A FEW REFERENCES. Sections 31-32
31, Text-books;
32, Articles.


I. GENERAL INTRODUCTION
1. Aim of the monograph. The present monograph aims
to give a sketch of some of the most fundamental processes
in which the algebraic equation occupies a central position,
and thus to fix the attention more completely on the underlying thoughts and the historical setting than would be feasible
in a short treatise on the theory of equations. The monograph
is intended to supplement such treatises rather than to replace
them. By means of the historic setting of many elementary
facts it is hoped that parts of it may be useful also to those
who have only such a knowledge of the equation as would
naturally result from an elementary course in algebra.
2. How it should be read. The reader is advised not to
insist on understanding every statement before proceeding to
the next. To some readers such concepts as domain of rationality, substitution group, and p-valued rational function may
be new, and our short account of them may not appear
entirely satisfactory. A slight knowledge of such dominating
concepts and of their applications is, however, much better
than total ignorance, and if the present monograph leads to
an intelligent search for knowledge along these important
lines its perusal will not have been in vain.
3. Mathematics presupposed. To avoid prolixity it has
seemed desirable to presuppose, in a few places, an elementary
knowledge of determinants as well as a knowledge of the first
derivative of a function of a single variable.  As it seemed
undesirable to presuppose an elementary knowledge of the
Galois theory of equations some fundamental processes could
not be sketched with the completeness that would be desirable.
It is hoped, however, that the viewpoint which has been
adopted will tend to prepare the way for this general theory;
this is especially true of the methods used to solve the cubic
and the biquadratic equations. While the common road to
a knowledge of the equation leads through numerous problems,
it is sometimes desirable to take a broad survey of the historic
setting and of the underlying principles, and thus to gather
new inspiration and a deeper insight. It is hoped that the
present monograph may aid in taking such surveys.
4. Type of questions studied. Equations of the form
$x^n = 1$ play an important role in the general theory of equations.
Since the fundamental properties of these equations are treated
in Monograph No. VII, secs. 28, 29, they are not given in the
present monograph. As the roots of the equation $x^n = \pm a$
may be obtained by multiplying those of $x^n = \pm 1$ by the
arithmetic nth root of the positive number a, it results that
the theory of the equations of the form $x^n = \pm 1$ is almost
equivalent to that of equations with two coefficients not zero.
For many purposes it is convenient to study equations from
the standpoint of the number of coefficients which are supposed
to differ from zero, especially when this number does not exceed
3, but in the present monograph the classification is made
with respect to the degrees of the unknowns. The interesting
properties which result from the assumption that the coefficients
represent successively the various terms of sequences of numbers have been left untouched for want of space.
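The remark in section 4 on obtaining the roots of $x^n = \pm a$ from those of $x^n = \pm 1$ may be illustrated numerically. The following is a minimal sketch only; the function names are illustrative, and floating-point arithmetic is assumed to be accurate enough for the check.

```python
import cmath
import math

def roots_of_unity(n, sign=1):
    """Return the n complex roots of x**n = sign, where sign is +1 or -1."""
    # For sign = -1 every angle is shifted by pi/n.
    offset = 0.0 if sign == 1 else math.pi / n
    return [cmath.exp(1j * (offset + 2 * math.pi * k / n)) for k in range(n)]

def roots_of_binomial(n, a, sign=1):
    """Return the n roots of x**n = sign * a for a positive real number a."""
    scale = a ** (1.0 / n)  # the arithmetic nth root of a
    return [scale * w for w in roots_of_unity(n, sign)]

# The three roots of x^3 = 8, each checked by substitution.
for r in roots_of_binomial(3, 8.0):
    print(r, abs(r ** 3 - 8.0) < 1e-9)
```

Each root is the product of the arithmetic root 2 and one of the cube roots of unity, as the text asserts.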


II. HISTORICAL SKETCH AND DEFINITIONS
5. Introduction. "An equation is the most serious and
important thing in mathematics," says Sir Oliver Lodge.* It
is also one of the oldest mathematical concepts, since the
fundamental operation of counting itself is based upon the
idea of a kind of equality between the things counted. Even
elementary algebraic equations are very old; for, such instances
as "Heap, its one-seventh, its whole, it makes 19," and "Heap,
its two-thirds, its one-half, its one-seventh, its whole, it makes
33," are found in the work of an Egyptian, Ahmes, written
about 1700 B.C. It is evident from these and many similar
instances that the ancient Egyptians used "heap" with the
same significance as our more modern x, and that the given
statements are respectively equivalent to the equations
$$\frac{x}{7} + x = 19,$$
$$\frac{2x}{3} + \frac{x}{2} + \frac{x}{7} + x = 33.$$
On fragments of papyri which have been deciphered more
recently, but are probably older than the work of Ahmes, statements equivalent to the system of two simultaneous equations
$$x^2 + y^2 = 100, \qquad y = \tfrac{3}{4}x,$$
have been found. Even special systems of n equations involving
n unknowns were solved at an early date. A Greek named
Thymaridas gave a rule for solving the following system:
$$x_1 + x_2 + x_3 + \dots + x_n = s,$$
$$x_1 + x_2 = a_1, \quad x_1 + x_3 = a_2, \quad \dots, \quad x_1 + x_n = a_{n-1}.$$
It is an interesting fact that the technical terms given
(known) and unknown are involved in this ancient rule. A
similar system was solved by a Hindu, Aryabhata, of the sixth
century A.D., but the methods for solving general systems of
m equations in n unknowns are of comparatively recent origin.
In general, the ancient mathematicians and those of the Middle
Ages sought merely numerical values of the unknowns in a
system of equations, but did not give the expressions representing
the unknowns in terms of the coefficients.
* Easy Mathematics, 1906, p. 127.
6. Definitions. An equation of the form
$$a_1x_1 + a_2x_2 + a_3x_3 + \dots + a_nx_n = k,$$
where $a_1, a_2, \dots, a_n, k$ are supposed to represent known
numbers and $x_1, x_2, \dots, x_n$ are unknowns, is called an equation
of the first degree, or a linear equation. Equations which are
true only on condition that the unknowns involved have
particular values are called conditional equations. If an equation is true for every set of values that may be arbitrarily
assigned to the unknowns, or if it is a true relation between
known numbers only, the equation is called an identical equation, or briefly, an identity.
Thus $2 \cdot 3 + 4 \cdot 7 = 34$, $3m - 2m = m$*
are identical equations, while
2x + 3y = 1,   5x -2y = 12,
are conditional equations.
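The distinction between an identity and a conditional equation may be illustrated as follows. This is a sketch only: it assumes, for the sake of the example, that the two conditional equations above are taken simultaneously, and the helper solve_2x2 is a hypothetical name.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by determinants.

    Assumes the determinant a1*b2 - a2*b1 is not zero.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the equations do not determine a single pair (x, y)")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

# An identity between known numbers holds as it stands.
assert 2 * 3 + 4 * 7 == 34
# The identity 3m - 2m = m holds for every value of m that is tried.
assert all(3 * m - 2 * m == m for m in range(-5, 6))

# The conditional equations hold only for particular values of x and y.
x, y = solve_2x2(2, 3, 1, 5, -2, 12)
print(x, y)  # 2 -1
```

Any other pair of values, such as x = 0, y = 0, fails to satisfy both equations, which is what makes them conditional.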
If it is assumed that a sequence of numbers may be assigned
to the unknowns of an equation these unknowns are called
variables. Whether the letters of an equation are to represent
unknown constants or variables depends upon the point of
view, but the difference between these concepts should be
carefully observed. By varying the meaning of the letters
the equation reveals its full significance and usefulness.
* In an identical equation the symbol = is frequently replaced by ≡.
This symbol was first used with the present meaning by Riemann, according
to Kronecker's Vorlesungen über Zahlentheorie, 1901, p. 86. Gauss used it
with a different meaning in the theory of congruences, such as 10 ≡ 3 (mod. 7),
(see Monograph No. VII), at an earlier date, 1801. Hence this symbol is
now used to represent both something stronger and also something weaker
than what is generally implied by the symbol =. As Kronecker observed,
the stronger meaning seems the more natural, as ≡ would appear to imply
something more than =, but the symbol is more extensively used with the
weaker meaning.
An equation involving no unknowns with fractional exponents is said to be of the nth degree if it involves at least one
term in which the sum of the exponents of the unknowns is n
but no term in which the sum of these exponents exceeds n.
If all the terms, which are not identically zero, of an equation
are of the same degree it is said to be homogeneous. For
instance, $x + y = 0$, $x^2 - xy + 7y^2 = 0$ are homogeneous equations
of the first and second degrees respectively.
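The definitions of degree and homogeneity can be sketched in a few lines. The representation chosen here (a dictionary from exponent tuples to coefficients) and the function names are assumptions made for illustration.

```python
def term_degrees(poly):
    """Degrees of the terms whose coefficients are not zero.

    poly maps a tuple of exponents, one per unknown, to a coefficient.
    """
    return [sum(exps) for exps, coeff in poly.items() if coeff != 0]

def degree(poly):
    """The largest sum of exponents occurring in any term."""
    return max(term_degrees(poly))

def is_homogeneous(poly):
    """True when all the terms are of the same degree."""
    return len(set(term_degrees(poly))) == 1

# Exponents are ordered as (x, y).
linear = {(1, 0): 1, (0, 1): 1}                 # x + y = 0
quadratic = {(2, 0): 1, (1, 1): -1, (0, 2): 7}  # x^2 - xy + 7y^2 = 0
mixed = {(1, 0): 2, (0, 1): 3, (0, 0): -1}      # 2x + 3y - 1 = 0

print(degree(linear), is_homogeneous(linear))
print(degree(quadratic), is_homogeneous(quadratic))
print(degree(mixed), is_homogeneous(mixed))
```

The third example is of the first degree but is not homogeneous, since its constant term is of degree zero.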
In view of ancient geometric interpretations, equations of
the second and the third degree are commonly called quadratic
and cubic respectively. If an equation is reduced to an identity
when known numbers are substituted for the unknowns, these
numbers are called roots of the equation and the roots are
said to satisfy the equation. The process of determining the
roots is called a solution of the equation. If the unknowns
of two or more equations are supposed to have the same
values the equations are said to be simultaneous equations.
7. Fundamental problems. Two fundamental problems in
the theory of equations are the solution of the general equation
of the nth degree in one unknown and the solution of a
system of m simultaneous equations in n unknowns. Although
the former of these is a special case of the latter it is of such
paramount importance and difficulty that it may be regarded
as a fundamental problem in the theory of equations. Only
very special cases of these problems were solved by the ancient
and mediaeval mathematicians, and both problems have furnished
nuclei for extensive theories which are still in the process of
development. An instance of the solution of special cases of
the first is furnished by the "heap" problems of Ahmes,
which were mentioned above. Among other instances are
the following: the extraction of square roots, such as
$$\sqrt{\tfrac{25}{16}} = \tfrac{5}{4}, \qquad \sqrt{\tfrac{25}{4}} = \tfrac{5}{2},$$
found recently on an Egyptian papyrus now in the Berlin
Museum; the geometrical representation of roots of equations
of small degrees by the early Greeks, including Euclid, and
especially by the Arabs; the finding of one positive rational
root of quadratic equations by Diophantus; and the recognition
of the fact that at least some numerical quadratic equations
have two roots by the Hindus and the Arabs.
Starting from such special cases as these, mathematicians
have gradually been enabled to comprehend the fact that
every equation of degree n in one unknown has exactly n roots.
This elegant theorem is commonly known as the fundamental
theorem of algebra.* In France it is also known as the
theorem of d'Alembert,† since d'Alembert published a proof
of it in 1746, which was supposed in his day to be rigorous.
The first satisfactory proof was given by Gauss in 1799. The
gradual progress toward this theorem through many centuries
furnishes an impressive picture of the slow pace at which our
rich mathematical inheritance was developed, and of the
interesting history which surrounds the fundamental theorems.
The second fundamental problem mentioned above has
also led to rich results in modern times. When restricted
to the case in which the m simultaneous equations in n unknowns
are linear, the problem has led to an important branch of
mathematics known as determinants, and the theory of determinants, in turn, has thrown much light on this problem. The
isolated simultaneous equations of the ancient Egyptians and
of the ancient Greeks gave expression to needs of the human
mind which have been largely satisfied, but which have fortunately led to a deeper sense of the need of still further developments and of hope that such developments will be forthcoming.
In our brief treatment of the algebraic equation we shall
devote most of our space to the consideration of equations
involving only one unknown, since such equations form also
the basis of the theory of a system of m simultaneous equations
in n unknowns.


* For a proof of this theorem see Monograph No. IV, Appendix II.
† The noted Italian mathematician S. Pincherle also styles it the theorem
of d'Alembert in his Lezioni di Algebra Complementare, 1909, p. 109.


8. Symbols. The ancient and mediaeval mathematicians
commonly wrote the word equals, or its equivalent, between
the two members of an equation. This practice was, however, not
universal, and a large number of different symbols have been
used to indicate equality between the two members of an
equation. Even Ahmes used a symbol (<) for such an equality,
Diophantus used ι as an abbreviation of the word ἴσος (equal),
and the western Arabs made use of a symbol resembling a
capital J for the same purpose. During the sixteenth century
the symbol æ, standing for the first two letters of aequalis,
was extensively used. Our modern symbol = was introduced
by Record in 1557 in his Whetstone of Witte, and the reason
assigned for choosing this particular symbol was that "noe
2 thynges can be moare equalle" than two parallel lines.
During the seventeenth century two parallel vertical lines
were frequently used, especially in France, instead of =, since
the latter was used to represent the absolute difference between
two numbers. It was also a recognized abbreviation for the
word est in mediaeval manuscripts.
The most important things about an equation are the
unknowns. In fact these characterize a conditional equation
and the determination of the range of values of the unknowns
is the main mission of the equation.* Among the various
symbols that have been used for a single unknown none seems
more expressive than the one employed by Ahmes, 1700 B.C.;
for the term heap naturally implies that the number of the
individuals is unknown. Diophantus represented the unknown
by a final sigma, ς, and the Hindu, Brahmagupta, represented
the first unknown by yavat tavat, and if more than one unknown
were employed he used colors, black, blue, yellow, etc., to
represent the second, third, fourth, etc., unknowns. Alkarismi,
the noted Arab whose work gave rise to the term algebra,
and whose name gave rise to the term algorithm, called the
unknown the thing or the root, and these terms were in common
use during the Middle Ages. In 1637 Descartes introduced
the present custom of representing the unknowns by the last
letters of the alphabet (x, y, z) and the knowns by the first
letters (a, b, c, etc.).
* These remarks are based upon an elementary point of view. From
another point of view, it is equally true that the known coefficients completely dominate and determine the possible values of the unknowns.
9. Domain of rationality. One of the most useful modern
concepts relating to the algebraic equations is that of the
domain of rationality. If a symbol R, which obeys the ordinary
laws of algebra, is combined with itself and the results of such
combination, by addition, subtraction, multiplication and
division (division by zero being always excluded) in every
possible way, there results a certain totality of expressions,
which evidently has the important property that no additional
expression results from the combination of the expressions
of the totality with respect to any of the four given operations. These operations are collectively called the rational
operations of algebra, and the given totality is known as the
domain of rationality constituted by R, and it is denoted by
(R).* If R represents the number 1 and if we operate on
this number and the resulting numbers in every possible way
according to the rational operations of algebra, the resulting
totality is composed of all the rational numbers. The same
totality would have been obtained by letting R represent any
other rational number besides 0. That is, $(1) \equiv (n)$, where n
is any rational number except 0.
* See also Monograph No. VIII, sec. 4.
To understand the meaning of domain of rationality it
is important to observe that it implies a totality which is
closed as regards the rational operations of algebra. In this
connection it is interesting to observe that the n nth roots
of unity form a closed totality as regards multiplication and
division but not as regards addition and subtraction. For
instance, if we take the four fourth roots of unity $1, -1, i, -i$
($i = \sqrt{-1}$) and combine them in every possible way by
means of the operations of multiplication and division we
obtain no additional number. Similarly, the set of eight
numbers
$$-1, \quad -2, \quad 3, \quad 4, \quad \tfrac{1}{2}, \quad \tfrac{2}{3}, \quad \tfrac{3}{2}, \quad \tfrac{4}{3}$$
forms a closed totality as regards the operations of subtracting from 2 and dividing 2 (that is, of replacing x by $2 - x$ and by $2/x$), as may easily be verified.
Such closed totalities, involving either a finite or an infinite
number of distinct elements, are of fundamental importance
in various mathematical subjects. A totality of numbers
which is closed as regards the two operations of addition and
subtraction is known as a number modulus. The rational
integers, for instance, form such a modulus, but they do not
form a domain of rationality.*
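The closure properties just described may be verified mechanically. The sketch below is an illustration under stated assumptions: the fourth roots of unity are stored as (real, imaginary) integer pairs, the eight numbers are taken to be $-1, -2, 3, 4, \tfrac12, \tfrac23, \tfrac32, \tfrac43$, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def gauss_mul(z, w):
    """Multiply two complex integers given as (real, imaginary) pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def gauss_conj(z):
    """The conjugate of a complex integer."""
    return (z[0], -z[1])

# The four fourth roots of unity 1, -1, i, -i.
ROOTS = {(1, 0), (-1, 0), (0, 1), (0, -1)}

def roots_closed_under_mul_div():
    # For a number of absolute value 1, dividing by w equals multiplying
    # by the conjugate of w.
    return all(
        gauss_mul(z, w) in ROOTS and gauss_mul(z, gauss_conj(w)) in ROOTS
        for z, w in product(ROOTS, repeat=2)
    )

def roots_closed_under_add():
    return all(
        (z[0] + w[0], z[1] + w[1]) in ROOTS for z, w in product(ROOTS, repeat=2)
    )

EIGHT = {Fraction(-1), Fraction(-2), Fraction(3), Fraction(4),
         Fraction(1, 2), Fraction(2, 3), Fraction(3, 2), Fraction(4, 3)}

def eight_closed():
    # The two operations: subtracting from 2 and dividing 2.
    return all(2 - x in EIGHT and 2 / x in EIGHT for x in EIGHT)

print(roots_closed_under_mul_div(), roots_closed_under_add(), eight_closed())
```

The roots of unity fail the addition test because, for example, 1 + 1 = 2 is not among them.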
In general, the domain of rationality (R) must include
the domain of rational numbers, since it includes $R/R = 1$. Hence
the rational numbers of elementary arithmetic constitute the
smallest possible domain of rationality,† and this domain is
included in every other domain. The most general expression
of the domain (R) may evidently be reduced to the form
$$\frac{a_0 + a_1R + \dots + a_nR^n}{b_0 + b_1R + \dots + b_mR^m},$$
where $a_0, a_1, \dots, a_n$ and $b_0, b_1, \dots, b_m$ are ordinary positive
or negative integers. That is, the domain of rationality (R)
is composed of all the rational functions of R with integral coefficients.
* Numerous examples of such finite closed totalities may be found in
any book dealing with groups of finite order, such as Burnside, Theory of
Groups, 1897; Dickson's Linear Groups, 1901; and Cajori's Introduction to
the Modern Theory of Equations, 1904. A few very elementary examples
are found in the article entitled "Groups of subtraction and division,"
Quarterly Journal of Mathematics, Vol. XXXVII, 1906, p. 80. The totality
of numbers lying on a line through the origin in the ordinary complex
number plane clearly forms an infinite totality which is closed as regards
addition and subtraction but not generally as regards multiplication and
division. Hence this totality is also a number modulus. If it is closed
as regards each of the four rational operations of algebra the line must be
the totality of real numbers. Cf. American Mathematical Monthly, Vol. XV,
1908, p. 117.
† The trivial domain composed of 0 is excluded from our consideration.
It is evident that the totality of the rational integral
functions, i.e., all the functions of the form
$$a_0 + a_1R + \dots + a_nR^n,$$
where $a_0, a_1, \dots, a_n$ have the same meaning as above, has the
property that no additional function arises by combining any
two of them (or any one with itself) by means of any of the
three operations addition, subtraction, and multiplication.
This totality is known as the domain of integrity constituted
by R and it is denoted by [R]. When R= 1 this domain
reduces to the totality of the ordinary positive and negative
integers, and [R] is always included in (R). It was observed
that (R) results when all the rational operations are successively performed upon R. This fact is generally expressed
by saying that (R) is generated by R as regards the rational
operations. On the contrary, [R] is not usually generated by
R, as may be seen, for instance, when $R = \sqrt{2}$. There is,
however, an infinity of pairs of elements in [R] which generate
[R] when they are combined with respect to the rational integral
operations of algebra. The simplest of these pairs is R and 1.
The totality of rational functions, with rational coefficients,
of the symbols $R_1, R_2, R_3, \dots$ (where the number of symbols
is finite or infinite) is called a domain of rationality and is
denoted by $(R_1, R_2, R_3, \dots)$. Each of the symbols
$R_1, R_2, R_3, \dots$ is called an element of the domain. Similarly, the
integral domain $[R_1, R_2, R_3, \dots]$ may be defined by replacing
the expressions rational functions and rational coefficients in
the preceding definition by integral functions and integral
coefficients respectively. A rational integral function f(x), of
x, is said to be in the domain of rationality $(R_1, R_2, R_3, \dots)$
if all its coefficients are in this domain; if all its coefficients
are in the domain of integrity $[R_1, R_2, R_3, \dots]$ the function
is said to be in this domain. If f(x) is the product of two
rational integral functions in the same domain of rationality,
f(x) is said to be reducible in this domain. When this cannot
be done in a given domain f(x) is said to be irreducible in
this domain.
Functions which are irreducible in one domain may be
reducible in another. For instance, $x^2 + x - 1$ is irreducible
in $(\sqrt{2})$ but it is reducible in $(\sqrt{5})$; on the other hand, $x^2 - 2$
is reducible in the former of these domains but not in the
latter. Neither of these functions is reducible in the domain
of rational numbers, although both of them are in this domain.
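The statements about $x^2 + x - 1$ and $x^2 - 2$ can be checked by a test on the discriminant. The sketch below is not taken from the monograph; it assumes a quadratic with rational coefficients and a domain of the form $(\sqrt{d})$ with d a nonzero rational number, and the function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt

def is_rational_square(q):
    """True when the rational number q is the square of a rational number."""
    q = Fraction(q)
    if q < 0:
        return False
    n, d = q.numerator, q.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d

def quadratic_reducible(b, c, d):
    """Is x^2 + b*x + c reducible in the domain (sqrt(d))?

    The quadratic splits into linear factors exactly when its discriminant D
    is a square in the domain. Since (p + q*sqrt(d))^2 is rational only when
    p = 0 or q = 0, this holds when D or D/d is the square of a rational
    number. Taking d = 1 gives the domain of rational numbers.
    """
    disc = Fraction(b) ** 2 - 4 * Fraction(c)
    return is_rational_square(disc) or is_rational_square(disc / Fraction(d))

print(quadratic_reducible(1, -1, 2), quadratic_reducible(1, -1, 5))
print(quadratic_reducible(0, -2, 2), quadratic_reducible(0, -2, 5))
```

For $x^2 + x - 1$ the discriminant is 5, which becomes a square only after $\sqrt{5}$ is adjoined; for $x^2 - 2$ it is 8, which is $4 \cdot 2$ and hence a square in $(\sqrt{2})$.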
From the fundamental theorem it results that every rational
integral function of x whose degree exceeds one is reducible
in some domain of rationality. If we consider rational integral
functions in two or more variables it is not possible to prove
a similar theorem. For instance, the function xy-1 is in
the domain of rational numbers, but it can be proved that
this function cannot be resolved into two factors whose coefficients are in any domain of rationality whatever. For further
developments regarding the concepts of reducibility and domain
of rationality, and extensive references to the literature on
these subjects, we refer to tome I, volume 2 of the Encyclopédie des Sciences Mathématiques, p. 205. A clear introduction
to the subject is given in Dickson's Theory of Algebraic
Equations, 1903, and in Cajori's Introduction to the Modern
Theory of Equations, 1904.
III.  EQUATIONS WITH ONE UNKNOWN AND WITH
LITERAL COEFFICIENTS
10. General statement. The ancient and the mediaeval
mathematicians knew only five algebraic operations, viz.,
addition, subtraction, multiplication, division, and the extraction of roots. These operations suffice to solve every equation
of the form
$$f(x) = a_0x^n + a_1x^{n-1} + \dots + a_n = 0,$$
where $a_0, a_1, \dots, a_n$ are real or complex numbers* and n is a
positive integer, provided $n < 5$; but they are not sufficient
to solve this general equation when $n > 4$. The first rigorous
proof of this important theorem was published in 1824 by a
Norwegian mathematician named Abel,† and the theorem
marks an important line of division between equations of the
first four degrees and those of a higher degree. Numerous
efforts to solve by means of these operations * the general
equation of the fifth and higher degrees had preceded Abel's
proof of the fact that such efforts were necessarily futile.
* Unless the contrary is stated it will be assumed that $a_0 \neq 0$. When $a_0$
is real we may assume that it is positive, but when it is complex we cannot
make this assumption, since the terms positive and negative cannot be directly
applied to complex numbers.
† This proof appeared also in Crelle's Journal, Vol. I, 1826.
It should be emphasized that there is a vast difference
between proving the existence of a root of f(x)=0 and finding
this root. The existence of such a root is proved by the
fundamental theorem of algebra, but the finding of methods
to express such a root in terms of the coefficients $a_0, a_1, \dots, a_n$
is a much more difficult problem when $n > 4$. In 1858 the
noted French mathematician Hermite found a method by
means of which he could express a root of the general equation
of the fifth degree in terms of certain functions known as
elliptic functions. More recently it has been proved that a
root of the general equation of degree n may be represented
in terms of the coefficients by means of certain functions
called Fuchsian.†
* Solutions confined to the use of these operations are known as solutions
by radicals.
† Cf. Tropfke, Geschichte der Elementar-Mathematik, Vol. I, 1902, p. 292.
The very important theorem that f(x) is divisible by $x - x_1$
whenever $x_1$ is a root of f(x) = 0 was observed by Descartes.
This theorem establishes two fundamental facts, viz., (1) that
the finding of the roots of f(x) = 0 is equivalent to resolving
f(x) into its linear factors, and (2) that the proof of the
existence of one root of f(x) = 0 proves the existence of n
roots. Moreover, it is not difficult to prove this important
theorem. The proof may be obtained as follows:
Let
$$f(x) = a_0x^n + a_1x^{n-1} + \dots + a_n. \tag{1}$$
Since $x_1$ is a root of f(x) = 0, we have $f(x_1) = 0$, or
$$0 = a_0x_1^n + a_1x_1^{n-1} + \dots + a_n. \tag{2}$$
By subtracting (2) from (1) there results
$$f(x) = a_0(x^n - x_1^n) + a_1(x^{n-1} - x_1^{n-1}) + \dots + a_{n-1}(x - x_1).$$
As
$$x^l - x_1^l = (x - x_1)(x^{l-1} + x_1x^{l-2} + \dots + x_1^{l-1})$$
whenever l is any positive integer, it has thus been proved that
$$f(x) = (x - x_1)f_1(x),$$
where $f_1(x)$ is a rational integral function of x of degree n-1.
In other words, f(x) is divisible by $x - x_1$.
Since $f_1(x)$ is of the same general type as f(x) we may
apply to it the same kind of reasoning. In particular, if the
general function of the form f(x) has a root, $f_1(x)$ must also
have a root ($x_2$) and hence it is divisible by $x - x_2$ with $f_2(x)$
as a quotient. As n is a finite positive integer, we must finally
arrive at a linear quotient by repeating these operations and
thus prove that
$$f(x) = a_0(x - x_1)(x - x_2) \cdots (x - x_n).$$
It should be emphasized that this process establishes the
existence of these n linear factors only on the assumption
that every such function as f(x) has at least one root. The
theorem just proved is evidently a special case of the theorem
that if f(x) is divided by $x - x_1$ the remainder is $f(x_1)$.
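Division by a linear factor, and the remainder theorem just mentioned, may be illustrated by synthetic division. The sketch below is an illustration only; the function names and the sample cubic are assumptions, not taken from the text.

```python
from fractions import Fraction

def divide_by_linear(coeffs, x1):
    """Divide a0*x^n + a1*x^(n-1) + ... + an by (x - x1).

    coeffs lists a0, a1, ..., an. Returns (quotient coefficients, remainder),
    computed by synthetic division.
    """
    carried = []
    acc = Fraction(0)
    for a in coeffs:
        acc = acc * x1 + a
        carried.append(acc)
    return carried[:-1], carried[-1]

def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    acc = Fraction(0)
    for a in coeffs:
        acc = acc * x + a
    return acc

# f(x) = x^3 - 6x^2 + 11x - 6, which has the roots 1, 2, 3.
f = [1, -6, 11, -6]
quotient, remainder = divide_by_linear(f, 2)
print(quotient, remainder)          # quotient x^2 - 4x + 3, remainder 0
print(divide_by_linear(f, 4)[1], evaluate(f, 4))
```

When $x_1$ is a root the remainder vanishes and the quotient is the function $f_1(x)$ of the proof; when it is not, the remainder equals $f(x_1)$.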
In 1629 Girard published an important work entitled
Invention nouvelle en l'algèbre, in which he stated the theorem
that f(x)=0 has n roots and observed some general relations
existing between the elementary symmetric functions of the
roots and the coefficients of the equation. Special cases of
these relations had been observed earlier by Cardan and Vieta.
The more general relations can readily be deduced from the
fact that
$$f(x) = a_0(x - x_1)(x - x_2) \cdots (x - x_n).$$
By multiplying the factors of the second member it is clear
that the sum of the roots is $-a_1/a_0$, while the sum of the
products of the different combinations of $\alpha$ roots is $(-1)^{\alpha} a_{\alpha}/a_0$,
where $\alpha = 2, 3, \dots, n$.
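These relations between roots and coefficients may be checked on a numerical example. The following sketch assumes a sample set of roots and a leading coefficient chosen for illustration; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def expand_from_roots(a0, roots):
    """Coefficients a0, a1, ..., an of a0*(x - x1)*(x - x2)*...*(x - xn)."""
    coeffs = [Fraction(a0)]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        shifted = coeffs + [Fraction(0)]
        for i in range(1, len(shifted)):
            shifted[i] -= r * coeffs[i - 1]
        coeffs = shifted
    return coeffs

def elementary_symmetric(roots, k):
    """Sum of the products of the roots taken k at a time."""
    return sum(prod(c) for c in combinations(roots, k))

roots = [1, 2, 3]
a = expand_from_roots(2, roots)   # 2x^3 - 12x^2 + 22x - 12
for k in range(1, len(roots) + 1):
    # The sum of the products of k roots equals (-1)^k * a_k / a_0.
    print(k, elementary_symmetric(roots, k), (-1) ** k * a[k] / a[0])
```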


Girard even computed the values of the following symmetric
functions of the roots
$$\sum_{i=1}^{n} x_i^2, \quad \sum_{i=1}^{n} x_i^3, \quad \sum_{i=1}^{n} x_i^4$$
in terms of the ratios
$$\frac{a_1}{a_0}, \quad \frac{a_2}{a_0}, \quad \dots, \quad \frac{a_n}{a_0}.$$
These ratios represent separately symmetric functions of the
roots, as was observed in the preceding paragraph, and these
interesting symmetric functions are technically known as
the elementary symmetric functions of the n roots. They are
respectively of degrees 1, 2,..., n. This work of Girard
prepared the way for the beautiful theorem that every integral
symmetric function of $x_1, x_2, \dots, x_n$ can be expressed in one
and in only one way as an integral function of the elementary
symmetric functions. A proof of this theorem may be found,
among other places, in Dickson's Algebraic Equations, 1903,
p. 99. A number of fundamental properties of symmetric
functions are developed also in Burnside and Panton's Theory
of Equations.
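The sums of powers computed by Girard can be written explicitly in terms of the elementary symmetric functions. The formulas used below are the standard ones (commonly attributed to Girard and Newton) and are supplied here as an illustration, not quoted from the monograph; $e_1, \dots, e_4$ denote the sums of the products of the roots taken one, two, three, and four at a time, and the sample roots are arbitrary.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary(roots, k):
    """Sum of the products of the roots taken k at a time."""
    return sum((prod(c) for c in combinations(roots, k)), Fraction(0))

def power_sums_from_elementary(e1, e2, e3, e4):
    """Sums of the 2nd, 3rd, and 4th powers of the roots from e1, ..., e4."""
    p2 = e1**2 - 2*e2
    p3 = e1**3 - 3*e1*e2 + 3*e3
    p4 = e1**4 - 4*e1**2*e2 + 2*e2**2 + 4*e1*e3 - 4*e4
    return p2, p3, p4

roots = [1, -2, 3, 5]
e = [elementary(roots, k) for k in range(1, 5)]
direct = tuple(sum(r**k for r in roots) for k in (2, 3, 4))
print(power_sums_from_elementary(*e), direct)
```

Since $e_k = (-1)^k a_k/a_0$, these expressions give the power sums in terms of the ratios of the coefficients.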
11. Substitutions and substitution groups.  A  profound
study of the algebraic equation involves not only a knowledge
of the properties of symmetric functions, but also a knowledge
of rational functions which are not symmetric. In his noted
memoir entitled "Réflexions sur la résolution algébrique des
équations," Nouveaux mémoires de l'Académie Royale des
Sciences de Berlin, 1770 and 1771, Lagrange made a thorough
study of the known methods which had been employed to solve
equations of higher degrees, reviewing the methods employed
by Cardan, Ferrari, Descartes, Tschirnhaus, Euler, and Bézout.
He observed that the solution of an algebraic equation depends
upon the solution of a certain other equation, since known
as the resolvent, and he showed that the roots of the various
resolvents are rational functions of the roots of the given
equation. This led to a recognition of the fundamental fact
that the problem of the solution of equations depends upon
the properties of rational functions of the roots.
The fertile point of view at which Lagrange had arrived
in the extensive memoir noted in the preceding paragraph
called for a comprehensive study of the properties of rational
functions, especially as regards the number of values assumed
by such functions when their n elements are permuted in
every possible manner. This study led to the theory of substitutions, called "calcul des combinaisons" by Lagrange,
which has proved to be a most powerful instrument to secure
a deep insight into the nature and properties of an algebraic
equation. Among those who share with Lagrange the honor
of having discovered the fundamental importance of the theory
of substitutions along this line we mention especially Ruffini,
Abel, Galois, and Jordan.
A study of the various forms which a rational function
assumes when its elements or letters are permuted furnishes
one of the most natural ways to secure a knowledge of the
true meaning of substitutions and substitution groups. For
instance, the historic function
$$x_1x_2 + x_3x_4$$
is evidently left unchanged by replacing $x_1$ by $x_2$ and $x_2$ by
$x_1$. This fact is more briefly expressed by saying that the
function $x_1x_2 + x_3x_4$ is transformed into itself by the substitution $(x_1x_2)$. It is clearly also transformed into itself by the
substitution $(x_3x_4)$. The fact that the two substitutions $(x_1x_2)$,
$(x_3x_4)$ are to be performed successively is indicated by
$(x_1x_2)(x_3x_4)$, and this substitution is called the product of
the two substitutions $(x_1x_2)$, $(x_3x_4)$. It is clear that if each of
two substitutions transforms a function into itself their product
must also transform this function into itself. A set of distinct
substitutions which has the property of including the square
of each of the set as well as the product of any two of them
is called a substitution group. As $(x_1x_2)^2 = (x_3x_4)^2 = [(x_1x_2)(x_3x_4)]^2 = 1$,
or the identity, where 1 implies that all the elements
of the function under consideration are left unchanged, it is
clear that the four substitutions
$$1, \quad (x_1x_2), \quad (x_3x_4), \quad (x_1x_2)(x_3x_4)$$
form a substitution group. This group is of order 4, the order
being the number of substitutions in a group. It plays a
fundamental role in many mathematical considerations and
is known abstractly under various names as follows: the
axial group, the anharmonic ratio group, the quadratic group,
the four group, the group of the rectangle, etc.
The given function is evidently also transformed into itself
by the additional substitutions $(x_1x_3)(x_2x_4)$, $(x_1x_4)(x_2x_3)$,
$(x_1x_3x_2x_4)$, $(x_1x_4x_2x_3)$, where the last two substitutions indicate
that the letters $x_1, x_3, x_2, x_4$; $x_1, x_4, x_2, x_3$, respectively, are
permuted cyclically in the given order. It is easy to verify
that the eight substitutions
$$1, \quad (x_1x_2), \quad (x_3x_4), \quad (x_1x_2)(x_3x_4), \quad (x_1x_3)(x_2x_4),$$
$$(x_1x_4)(x_2x_3), \quad (x_1x_3x_2x_4), \quad (x_1x_4x_2x_3)$$
form a group and that these are the only substitutions on
these letters which transform the given function into itself.
A group that is contained in another group is called a subgroup. The first four substitutions of this group of order 8
therefore constitute a subgroup of order 4, while the first two
substitutions constitute a subgroup of order 2. The reader
can readily verify the fact that the given group of order 8
contains two and only two other subgroups of order 4, and
four other subgroups of order 2. This group is known
abstractly as the octic group or the group of the square.
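The verification left to the reader may be carried out by enumerating all 24 substitutions on four letters. The sketch below is an illustration under a stated assumption: the function $x_1x_2 + x_3x_4$ is recorded simply as the set of index pairs that are multiplied together, which suffices for a sum of products of two letters. All names are hypothetical.

```python
from itertools import permutations

# x1*x2 + x3*x4, recorded as the set of index pairs multiplied together.
FUNCTION = frozenset({frozenset({1, 2}), frozenset({3, 4})})
IDENTITY = {i: i for i in (1, 2, 3, 4)}

def apply(perm, function):
    """Replace each letter x_i by x_perm[i] in a sum of products of two letters."""
    return frozenset(frozenset(perm[i] for i in pair) for pair in function)

def all_substitutions():
    """All 24 substitutions on the letters 1, 2, 3, 4, as dicts i -> image of i."""
    return [dict(zip((1, 2, 3, 4), images)) for images in permutations((1, 2, 3, 4))]

def compose(p, q):
    """The substitution obtained by performing p first and q afterwards."""
    return {i: q[p[i]] for i in p}

def stabilizer(function):
    """The substitutions which transform the function into itself."""
    return [p for p in all_substitutions() if apply(p, function) == function]

def involutions(group):
    """The substitutions of the group, other than the identity, whose square is 1."""
    return [p for p in group if p != IDENTITY and compose(p, p) == IDENTITY]

G = stabilizer(FUNCTION)
closed = all(compose(p, q) in G for p in G for q in G)
print(len(G), closed, len(involutions(G)))
```

Each substitution whose square is the identity generates a subgroup of order 2, so the count of such substitutions agrees with the statement that there are four subgroups of order 2 besides the one formed by the first two substitutions.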
While the theory of substitutions is essential to attain an
insight into what is known as the Galois theory of the algebraic
equations and is very important also in other domains of
mathematics, we shall make no explicit use of it in what follows,
in view of the facts that the elements of this subject are not
as generally known as they should be, and a proper development of the subject ab initio would demand too much space
for the present monograph. It seems, however, desirable to
state a few general theorems depending on this theory, whose
import can be at least partially appreciated by means of the
development of the preceding paragraphs.
It was observed above that the function Xlx2 + x34 is
transformed into itself by all the substitutions of a certain
group of order 8, but by no other substitution on these letters.
This fact is commonly expressed by saying the function
x1x2+X3X4 belongs to this group. It is easy to find other functions in these letters, for instance (X +X2-x3-X4)2, which
belong to the same group; and it has been proved that an
infinite number of distinct rational functions belong to any
given substitution group while such a function belongs to
only one substitution group. That is, there is an $(\infty, 1)$
correspondence between rational functions involving certain
letters, and the substitution groups on these letters, and it is
an important fact that all these functions which belong to the
same group are rational functions of each other.  Lagrange
observed that the number of values which such a function
assumes when its n elements are permuted in every possible
manner is a divisor of $n!$. For instance, $x_1x_2 + x_3x_4$ assumes
the following three values:
$$x_1x_2 + x_3x_4,\quad x_1x_3 + x_2x_4,\quad x_1x_4 + x_2x_3.$$
This is in accord with the general theory, as 3 is a divisor
of 24; but it is not possible to construct a rational function
in four letters which assumes exactly five distinct values when
its elements are permuted in every possible way, since 5 is
not a divisor of 24.
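Lagrange's observation can be checked numerically for a few functions of four letters. The sketch below is illustrative and not from the original text: it evaluates each function at one arbitrarily chosen sample point (an assumption; an unlucky sample could make two formally different values coincide) and counts the distinct results over all 24 permutations.

```python
from itertools import permutations

def count_values(func, sample):
    """Number of distinct values func takes when its arguments are permuted."""
    return len({func(*p) for p in permutations(sample)})

# Sample values for x1..x4; chosen arbitrarily (an assumption of this sketch).
sample = (2, 3, 5, 11)

functions = {
    "x1*x2 + x3*x4": lambda a, b, c, d: a * b + c * d,
    "(x1 + x2 - x3 - x4)^2": lambda a, b, c, d: (a + b - c - d) ** 2,
    "x1*x2": lambda a, b, c, d: a * b,
    "x1 + x2 + x3 + x4": lambda a, b, c, d: a + b + c + d,
}

counts = {name: count_values(f, sample) for name, f in functions.items()}
for name, n in counts.items():
    # The second column reports whether the count divides 4! = 24.
    print(name, n, 24 % n == 0)
```

The first two functions belong to the same group and therefore take the same number of values.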
Although the number of values which a rational function
whose degree does not exceed n assumes when its letters are
permuted in every possible manner is a divisor of n! it does
not follow that there is such a function for every divisor of
n! In fact, it has been proved that whenever $n > 4$ it is not
possible to construct a function for every divisor of n! while
it is possible to construct such a function whenever $n < 5$.
The fact that it is not possible to construct a rational function
with five letters which assumes either 3, 4, or 8 values when
these letters are permuted in every possible manner was proved
by Ruffini in his Teoria generale delle equazioni, in cui si
dimostra impossibile la soluzione algebraica delle equazioni
generali di grado superiore al quarto, published at Bologna
in 1799. This fact is equivalent to the theorem that there
is no substitution group on five or a smaller number of letters
having for its order one of the numbers 40, 30, 15.
12. Linear equations.  Every linear equation with one
unknown can be reduced to the form
ax =b,
where a and b are known, and x is the unknown.
Necessary and sufficient conditions that this equation can
be solved are that either a and b are both equal to zero,
or that a is not zero. If the former condition is satisfied the
equation has an infinite number of solutions, as x may have
any value. On the contrary, the equation has only one solution
when the latter condition is satisfied. In this case the value
of x is obtained by dividing b by a. As this is a rational
process the root of the equation must be in the domain of
rationality (a, b) constituted by a and b. The root is, however,
not necessarily in the integral domain [a, b] constituted by
a and b.
If a and b are restricted to be natural numbers it is not
possible to reduce every linear equation to a single form, but
all such equations can, in this case, be reduced to one of the
two forms
ax=b, ax+b=0.
This is also true in case a and b are only restricted to be positive
rational numbers. The ancient and the mediaeval mathematicians generally imposed the latter restriction on a and b
as well as on the root.* Hence the second form was not
solvable. The general solution of the linear equation as noted
above therefore calls for the extension of the number concept
so as to include both negative and fractional numbers in
addition to the natural numbers. With this extension of
the number concept the linear equation is solvable except when
$a = 0$ and $b \neq 0$. The further extension of this concept so as
to include the irrational and the ordinary complex numbers
does not affect the given discussion of the linear equation.
* This was done by the great French algebraist Vieta (1540-1603) and
even Descartes called negative roots false roots in his Géométrie.
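The three cases of the linear equation ax = b can be summarized in a few lines. This is a minimal sketch, assuming rational coefficients and Python's `fractions.Fraction`; the function name `solve_linear` and its return conventions are hypothetical.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a*x = b over the rational numbers.

    Returns the unique root when a != 0, the string "any x" when
    a = b = 0, and None when a = 0 but b != 0 (no solution).
    """
    a, b = Fraction(a), Fraction(b)
    if a != 0:
        return b / a          # a rational operation on a and b
    return "any x" if b == 0 else None

print(solve_linear(2, 3), solve_linear(0, 0), solve_linear(0, 5))
```

The first call shows a root that lies in the domain of rationality of 2 and 3 but not in their integral domain.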
13. Quadratic equations.  Every quadratic equation with
one unknown can be reduced to the form
$$a_0x^2 + a_1x + a_2 = 0.$$
If we put $x = z + k$ this equation becomes
$$a_0z^2 + (2a_0k + a_1)z + a_0k^2 + a_1k + a_2 = 0.$$
Since $a_0 \neq 0$, it results from the preceding paragraph that it
is always possible to solve the following linear equation in $k$:
$$2a_0k + a_1 = 0,$$
and thus arrive at an equation of the form
$$z^2 = A,$$
which involves only the extraction of the square root of a
number. The most important thing about the solution of
the quadratic equation is the extraction of the square root of
a number. In fact, this is the only operation which enters
into the solution of the quadratic equation but not into the
solution of the linear equation, as can be deduced from the
general solution sketched above. The extraction of the square
root is, however, not a little thing in mathematics. It opens
up the question of irrational numbers as well as that of ordinary
complex numbers, two very profound and far-reaching questions.
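The reduction just described can be followed step by step in code. The sketch below is an illustration under stated assumptions: floating-point or complex coefficients, Python's `cmath.sqrt` for the square root, and a hypothetical function name `solve_quadratic`.

```python
import cmath

def solve_quadratic(a0, a1, a2):
    """Solve a0*x^2 + a1*x + a2 = 0 (a0 != 0) by the substitution x = z + k."""
    # Step 1: the linear equation 2*a0*k + a1 = 0 removes the term in z.
    k = -a1 / (2 * a0)
    # Step 2: what remains is a0*z^2 + (a0*k^2 + a1*k + a2) = 0, i.e. z^2 = A.
    A = -(a0 * k * k + a1 * k + a2) / a0
    # Step 3: the only non-rational operation is the square root.
    z = cmath.sqrt(A)
    return k + z, k - z

print(solve_quadratic(1, -3, 2))   # real roots
print(solve_quadratic(1, 0, 1))    # roots requiring complex numbers
```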
As the number A is a rational function of the coefficients
$a_0, a_1, a_2$, it must lie in the domain of rationality constituted
by these coefficients. Hence the quadratic equation in one
unknown can always be solved provided we can extract the square
root of all the numbers in the domain of rationality constituted
by the coefficients of the equation. As it is known that we can
extract the square root of any real number as well as of any
ordinary complex number, it results, in particular, that the
quadratic equation with one unknown can always be solved
provided its coefficients are ordinary real or complex numbers.
While the root of a linear equation in one unknown lies
in the domain of rationality constituted by its coefficients
this is not necessarily true of the roots of a quadratic equation,
since the operation of extracting the square root is not a
rational operation. If the coefficients of the quadratic are
rational numbers the domain of rationality constituted by
one root clearly includes the other root, but this is not necessarily true when the coefficients are either irrational or complex
numbers. It is, however, always true that the two roots of
a quadratic equation in one unknown must be in the
domain of rationality constituted by the coefficients and
one of the roots, since each root may be obtained by a
rational process from the coefficients and the other root. In
other words, the quadratic in one unknown is always reducible
in the domain of rationality constituted by its coefficients
and one of its roots, but in no smaller domain of rationality.
The reduction of the quadratic equation to the form
$$z^2 = A$$
is a special case of the removal of the second term in the
equation
$$a_0x^n + a_1x^{n-1} + \dots + a_n = 0.$$
If we substitute $z + k$ for $x$ in this equation, there results
$$a_0(z + k)^n + a_1(z + k)^{n-1} + \dots + a_n = 0.$$
The coefficient of $z^{n-1}$ in this equation is
$$a_0nk + a_1.$$
As $a_0$ and $n$ are both different from zero (otherwise the
equation would not be of degree $n > 0$) a number can always
be found which when substituted for $k$ will reduce $a_0nk + a_1$
to zero. This number is sometimes called the zero of the
function $a_0nk + a_1$, but it is more commonly known as the root
of the following linear equation in $k$:
$$a_0nk + a_1 = 0.$$
Hence the solution of a linear equation suffices to determine
a number by means of which the coefficient of $x^{n-1}$ can be
reduced to zero. In general, the solution of an equation of
degree $\alpha$ suffices to reduce the coefficient of $x^{n-\alpha}$ to zero. In
particular, to reduce the absolute term $a_n$ to zero it is necessary
to solve an equation of degree $n$.
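The removal of the second term can be carried out mechanically. The following sketch is illustrative only: it assumes rational coefficients listed from the highest power down, and it computes the coefficients of f(z + k) by repeated synthetic division, a standard technique that is not described in the text. The names `shift` and `remove_second_term` are hypothetical.

```python
from fractions import Fraction

def shift(coeffs, k):
    """Coefficients of f(z + k), highest power first, by repeated synthetic division."""
    c = [Fraction(x) for x in coeffs]
    n = len(c) - 1
    for i in range(n):
        for j in range(1, n + 1 - i):
            c[j] += k * c[j - 1]
    return c

def remove_second_term(coeffs):
    """Return (k, new coefficients) where k solves a0*n*k + a1 = 0."""
    n = len(coeffs) - 1
    k = Fraction(-coeffs[1], n * coeffs[0])
    return k, shift(coeffs, k)

# x^3 + 6x^2 + 11x + 6 becomes z^3 - z under x = z - 2.
k, reduced = remove_second_term([1, 6, 11, 6])
print(k, reduced)
```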
14. Extensions of the number concept due to the quadratic.
The solution of the general quadratic calls for numbers of
the form $a + b\sqrt{-1}$, where $a$ and $b$ are real, even when the coefficients of the equation ($a_0, a_1, a_2$) are rational numbers. It
has been proved that numbers of this form likewise suffice
for the solution of the equation of the nth degree in one unknown
even if the coefficients are also any numbers of this form.
It should, however, not be inferred that the numbers of the
form $a + b\sqrt{-1}$ which are required to solve the quadratic
equation with rational coefficients are coextensive with those
required to solve the equations of the nth degree. In 1770
Lagrange proved that the real irrational numbers which are
roots of a quadratic equation with rational coefficients have
the characteristic property that they may be represented by
periodic continued fractions whose elements are integers.*
The quadratic equation merely opened the great problem
of distinguishing the different kinds of irrational numbers, a problem which is to-day the object of important investigations.
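Lagrange's theorem on periodic continued fractions can be illustrated for the special case of the square root of a positive integer that is not a perfect square. The sketch below is not from the original text; it uses the standard integer recurrence for such expansions and relies on the known fact that the period of the expansion of the square root of n ends with the element 2·a0, where a0 is the integral part. The function name is hypothetical.

```python
from math import isqrt

def sqrt_continued_fraction(n):
    """Return (a0, period) for the continued fraction of sqrt(n).

    The period is empty when n is a perfect square.
    """
    a0 = isqrt(n)
    if a0 * a0 == n:
        return a0, []
    m, d, a = 0, 1, a0
    period = []
    # The complete quotients have the form (sqrt(n) + m) / d with integers m, d.
    while a != 2 * a0:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        period.append(a)
    return a0, period

for n in (2, 7, 13):
    print(n, sqrt_continued_fraction(n))
```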
* Cf. Cahen, Éléments de la Théorie des Nombres, 1900, p. 183. This
theorem was extended by Minkowski in Göttingen Nachrichten, 1899, p. 64.
The quadratic equation opened also the question as to the
number of possible roots of an equation. It should be emphasized that the answer given to this question depends upon
the point of view. For Diophantus and the older mathematicians who did not admit negative, irrational, or complex
numbers as roots of an equation, most of the quadratic equations did not have any root; others had one root, but no
instance is known where the ancient Greeks, or the older
mathematicians, observed that at least some quadratic equations may have two roots. On the contrary, Bhaskara, a
Hindu mathematician of the twelfth century of our era, observed
that some quadratic equations have two roots, but even for
him many quadratic equations had no root whatever, since he
did not use complex numbers. He gave the following interesting rule: "The square of a positive as well as that of a
negative number is positive, and the square root of a positive
number is double, positive and negative. There is no square
root of a negative number because a negative number is not
a square."
Problems leading to quadratic equations are frequently
viewed so narrowly that only one root, or even no root, seems
to have meaning, and the existence of two roots has often
led to a more comprehensive conception of the problem. This
equation has thus contributed to more accurate and deeper
thought as to the real nature of the problem. It is very
important to observe that the study of the nature and the
properties of the roots of an equation has not only led to a
clearer comprehension of the essence of an equation, but also
to a deeper insight into the nature of the subject giving rise
to the equation. This point of view was taken by Poinsot
in his important article entitled "Réflexions sur les principes
fondamentaux de la théorie des nombres," in contradicting a
view expressed earlier by d'Alembert to the effect that the
additional roots, beyond those to which the problem was
supposed to give rise, were an inconvenience and were not to
be attributed to the richness of algebra as some had supposed.*
* Poinsot, Journal de Mathématiques, Vol. X, 1845, p. 8.
The question as to the number of roots of an equation is
not entirely confined to the number domain in which the
values of the unknown are supposed to be. An additional
difficulty is opened by the quadratic which is a perfect square.
For instance, there seems to be good reason for saying the
equation,
$$x^2 - 2x + 1 = 0,$$
has only one root, since 1 is the only solution of this equation.
On the other hand, when the first member of this equation is
written in the form $(x - 1)(x - 1)$, it is evident that $x = 1$ will
make each of its factors vanish, and hence we may say that
1 is a repeated root. This should, however, be regarded as
merely a convention which tends toward simplicity and clearness, two of the most potent factors in shaping the development of mathematics. The statement that a quadratic equation
has always two and only two roots is thus seen to be heavily
laden with historical facts of great significance, and it opens
up the way for harmony and brevity in the theory of the
general equation.*
15. Cubic equations.  The solution of the general cubic
equation requires the operation of extracting the cube root
in addition to the operations involved in the solution of the
quadratic. The preparation for root extraction is, however,
not so evident in the case of the cubic as it is in the case of
the quadratic. In fact, many mathematicians attempted
these preliminary transformations in vain before an Italian,
Scipione del Ferro, professor in Bologna from 1496 to 1526,
finally succeeded. Since this time, a large number of different
solutions have been given and a very extensive literature on
the cubic has been developed, as may be seen by consulting
the " Subject Index " of the Royal Society of London Catalogue
of Scientific Papers, 1800-1900, pp. 170-71. Many of these
methods of solution are based upon considerations which do
not apply to the general equation of the nth degree but are
very elegant as regards the cubic. On the contrary, we shall
first give a method which involves many very interesting
general theorems but requires lengthy computations, for far-reaching thoughts are a greater desideratum for the mathematician than brief special methods.
Remove the second term of the general cubic by the
method indicated in sec. 13. The equation may thus be
reduced to the form,
$$x^3 + qx + r = 0.$$
* Professor E. R. Hedrick has given several reasons why the beginner
should not be taught that every quadratic equation has exactly two roots,
School Science and Mathematics, Vol. IX, 1909, p. 563.
Let x1, x2, x3 be the roots of this equation, and consider the
functions:
$$(x_1 - x_2)(x_1 - x_3)(x_2 - x_3),$$
$$(x_1 + \omega x_2 + \omega^2 x_3)^3,$$
$$(x_1 + \omega^2 x_2 + \omega x_3)^3,$$
where $\omega$ is an imaginary cube root of unity. Observe that
whenever any one of these functions remains unchanged under
any given substitution on the roots, the other two do so also.
That is, each of these functions is transformed into itself by
the same substitutions on the roots. From the fact that
two rational functions which are transformed into themselves
by the same substitutions can be expressed rationally in terms
of each other (sec. 11), it results that each of these functions
can be expressed rationally in terms of each of the others.*
As the square of the first of these functions is symmetric, this
square can be expressed rationally in terms of the elementary
symmetric functions q and r, as was noted above in sec. 11.
* Cf. Dickson, Introduction to the theory of algebraic equations, 1903,
p. 24.
These general theorems enable us to see how the cubic
can be solved. We express $(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)$ as the
square root of a rational function of $q$ and $r$. Then we express
each of the other functions as a rational function of this square
root and extract the cube root of this rational function. In
this way we find the values of
$$x_1 + \omega x_2 + \omega^2 x_3 \quad\text{and}\quad x_1 + \omega^2 x_2 + \omega x_3.$$
As we know that $x_1 + x_2 + x_3 = 0$ we have three linear equations in three unknowns from which the values of the unknowns
can be readily found. It should be observed that this general
method enables us to see, before we do any calculating, how
we may proceed to find the roots, and it illustrates an important
tendency in mathematics to see things without calculating.
Normally, thought should precede rather than follow calculations in pure mathematics. The calculations will come out
as follows:
$$(x_1 - x_2)(x_1 - x_3)(x_2 - x_3) = \sqrt{-4q^3 - 27r^2};$$
$$x_1 + \omega x_2 + \omega^2 x_3 = \sqrt[3]{-\tfrac{27}{2}r - \tfrac{3}{2}\sqrt{12q^3 + 81r^2}};$$
$$x_1 + \omega^2 x_2 + \omega x_3 = \sqrt[3]{-\tfrac{27}{2}r + \tfrac{3}{2}\sqrt{12q^3 + 81r^2}};$$
$$x_1 + x_2 + x_3 = 0.$$
On adding the last three equations we obtain
$$x_1 = \sqrt[3]{-\frac{r}{2} + \sqrt{\frac{r^2}{4} + \frac{q^3}{27}}} + \sqrt[3]{-\frac{r}{2} - \sqrt{\frac{r^2}{4} + \frac{q^3}{27}}}.$$
This is known as Cardan's formula, because he first published
it. The substance of it had been obtained by Cardan from
Tartaglia under the promise of secrecy, but Cardan broke his
promise and published the formula in his "Ars Magna" in
1545.*
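Cardan's formula can be tried numerically. The sketch below is an illustration, not the author's procedure: it assumes complex floating-point arithmetic, takes the principal cube root for the first radical, and fixes the second cube root by requiring that the product of the two equal -q/3 (a standard pairing condition that the text does not state explicitly). The two quantities A and B are one third of the two linear functions of the roots used above. The function name is hypothetical.

```python
import cmath

def depressed_cubic_roots(q, r):
    """Roots of x^3 + q*x + r = 0 by Cardan's formula (complex arithmetic)."""
    root_term = cmath.sqrt(r * r / 4 + q ** 3 / 27)
    cube = -r / 2 + root_term
    if cube == 0:
        cube = -r / 2 - root_term      # avoid dividing by zero below
    if cube == 0:
        return [0j, 0j, 0j]            # q = r = 0: triple root at zero
    A = cube ** (1 / 3)                # one cube root of the first radicand
    B = -q / (3 * A)                   # paired cube root, chosen so that A*B = -q/3
    w = complex(-0.5, 3 ** 0.5 / 2)    # an imaginary cube root of unity
    return [A + B, w * A + w * w * B, w * w * A + w * B]

# x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3): three real roots reached through
# complex intermediate values.
for z in depressed_cubic_roots(-7, 6):
    print(z)
```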
An elegant and brief solution of the cubic was given in
1591 by Vieta, a noted French mathematician. The general
equation is first reduced to the form,
$$x^3 + 3ax = 2b.$$
Letting $x = \frac{a - y^2}{y}$ this becomes
$$y^6 + 2by^3 = a^3.$$
As this is in the form of a quadratic in $y^3$ it is very easy to find
the possible values of $y$, and after these are known the values
of $x$ result from $x = \frac{a - y^2}{y}$. Numerous other brief methods are
known and may be found in works on the Theory of Equations.
* For a clear statement of the extenuating circumstances the reader
may consult, Tropfke's Geschichte der Elementar-Mathematik, Vol. I, p. 275.
16. Biquadratic equations. We shall see that the solution
of the general biquadratic equation requires no non-rational
operation except the extraction of the square and the cube
root. Hence the operations which enter into this solution
are of the same type as those which enter into the solution of the
cubic. As in the case of the cubic we shall begin with a
method which is valuable on account of its perspicuity and
the far-reaching thoughts which it involves, but is not the
simplest from the standpoint of practical applications. For
numerical equations the methods of the following section are
generally to be preferred.
We shall suppose that the general biquadratic has been
reduced to the form,
$$x^4 + qx^2 + rx + s = 0,$$
and that its roots are $x_1, x_2, x_3, x_4$. The following three functions are clearly transformed either into themselves or into
each other by every substitution on the roots:
$$(x_1 + x_2 - x_3 - x_4)^2,\quad (x_1 - x_2 + x_3 - x_4)^2,\quad (x_1 - x_2 - x_3 + x_4)^2.$$
Hence the cubic equation which has these functions as roots
must have for its coefficients symmetric functions of $x_1, x_2$,
$x_3, x_4$. As these symmetric functions can be expressed as
integral functions of the elementary symmetric functions
q, r, s, it results that this cubic lies in the domain of rationality
constituted by these elementary symmetric functions. As a
matter of fact, this cubic is
$$y^3 + 8qy^2 + (16q^2 - 64s)y - 64r^2 = 0.$$
Solving this cubic and denoting its roots by $\theta_1, \theta_2, \theta_3$, we
have the following system of four linear equations in four
unknowns:
$$x_1 + x_2 + x_3 + x_4 = 0;$$
$$x_1 + x_2 - x_3 - x_4 = \sqrt{\theta_1};$$
$$x_1 - x_2 + x_3 - x_4 = \sqrt{\theta_2};$$
$$x_1 - x_2 - x_3 + x_4 = \sqrt{\theta_3}.$$
By adding these equations we observe that x1 is one-fourth
of the sum of the square roots of the roots of the given cubic.
After x1 is known it is easy to find the values of the other
three roots.
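The cubic given above can be checked on a numerical example. The sketch below is illustrative and assumes a biquadratic built from four sample roots whose sum is zero (the roots 1, 2, 3, -6 are chosen arbitrarily); it confirms in exact integer arithmetic that the three squared linear functions satisfy the cubic, and that x1 is one-fourth of the sum of suitably signed square roots. Helper names are hypothetical.

```python
def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with the given roots."""
    coeffs = [1]
    for root in roots:
        # Multiply the current polynomial by (x - root).
        coeffs = [a - root * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def evaluate(coeffs, x):
    """Value of the polynomial at x (Horner's scheme)."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

x1, x2, x3, x4 = 1, 2, 3, -6           # sample roots with sum zero (an assumption)
_, p, q, r, s = poly_from_roots([x1, x2, x3, x4])

# The three linear functions of the roots and their squares.
linear_forms = [x1 + x2 - x3 - x4, x1 - x2 + x3 - x4, x1 - x2 - x3 + x4]
thetas = [v * v for v in linear_forms]

# The cubic y^3 + 8q y^2 + (16q^2 - 64s) y - 64 r^2 evaluated at each theta.
resolvent = [1, 8 * q, 16 * q * q - 64 * s, -64 * r * r]
residues = [evaluate(resolvent, t) for t in thetas]

recovered_x1 = sum(linear_forms) / 4
print(p, q, r, s, thetas, residues, recovered_x1)
```

In this example the product of the three signed square roots equals -8r. The text does not state this sign condition; it is noted here only as an observation about the example, since each square root taken alone is determined only up to sign.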
The discovery of a solution of the general biquadratic is
due to Ferrari, who was a pupil of Cardan. It is especially
interesting since Ferrari was not yet twenty-three years old
when he discovered it. In this connection, it is interesting
to note that both Abel and Galois did their fundamental work
on the theory of equations before they were twenty-three
years old. The substance of Ferrari's solution was as follows:
Write the biquadratic in the form,
$$x^4 + px^3 + qx^2 + rx + s = 0.$$
Add $(ax + b)^2$ to both members and then assume that the
first member is a perfect square. That is,
$$x^4 + px^3 + (q + a^2)x^2 + (r + 2ab)x + s + b^2 = \left(x^2 + \tfrac{1}{2}px + k\right)^2.$$
By equating coefficients of the like powers of $x$ and eliminating $a$ and $b$ there results a cubic in $k$. After finding the
value of $k$ by means of this cubic it is only necessary to factor,
$$\left(x^2 + \tfrac{1}{2}px + k\right)^2 - (ax + b)^2 = 0,$$
in order to reduce the solution of the biquadratic to that of
two quadratics. The solutions of these quadratics must
include the roots of the original equation.
17. Equations whose degrees exceed 4. The brilliant discoveries of the Italian mathematicians regarding the solution
of the cubic and biquadratic equation led to numerous attempts
to solve general equations of higher degrees by rational
operations and the extraction of roots, as these were the only
known algebraic operations at that time. All such efforts
were destined to failure, but it required nearly three hundred
years from the time when Ferrari first solved the biquadratic
until Abel discovered, at the age of twenty-two, the first
rigorous proof of the fact that the general quintic equation
cannot be solved by these elementary operations. It is
interesting to note that Abel began his scientific career by
attempts to solve the quintic by radicals and he believed for
some time that he had actually found a solution, but he
afterward discovered his own error. His apparent success
won for him the life-long friendship and support of his countryman, Hansteen.
Abel was not the first who attempted to prove that the
general quintic cannot be solved by the extraction of roots.
About a quarter of a century earlier Paolo Ruffini did much
to develop methods which were of sufficient power to prove
this fundamental fact. In particular, he gave a number of
theorems on groups of substitutions, as was noted above. The
difficulties which the general solution of the quintic presented
have thus become a source of great riches for the later development of mathematics. Besides Ruffini, some of the most eminent
among those who started these developments are: Tschirnhaus,
Euler, Lagrange, Gauss, Galois, and Hermite.
The work of Galois (1811-32) was especially fundamental
as regards the establishment of more definite relations between
the theory of equations and the theory of substitution groups,
by proving that every equation belongs to a certain substitution group, and that the properties of this group give definite
information as to the solvability by radicals of the equations
belonging to the group. The important theorem that two
rational functions of the roots of any equation which belong to the same group may be expressed
rationally in terms of each other, in the domain of rationality
of the coefficients of the equations, had been proved earlier
by Lagrange. For an introduction to the elegant theory of
equations based upon these theorems we may refer the reader
to the following works: Dickson, Introduction to the Theory
of Algebraic Equations, 1903; Cajori, An Introduction to
the Modern Theory of Equations, 1904; Mathews, Algebraic
Equations, 1907.
IV. EQUATIONS WITH ONE UNKNOWN AND WITH
NUMERICAL COEFFICIENTS
18. General statement. Although numerical algebraic
equations have a prehistoric origin, the arithmetical epigrams
of the Greek Anthology, among other things, support the
assumption that they resulted from puzzles and word-equations. The fully developed equations represent highways
of exact thought without by-ways, and the coefficients determine the possible destinations of these highways. The ancient
problems of duplicating a cube and trisecting an angle, among
many others, directed attention to the need of such highways,
but their construction for coefficients, which may be regarded
as arbitrary, presented great difficulties. Even in the case
of the cubic with three real roots (casus irreducibilis) Cardan's
formula represents the real root in the form of the sum of
two imaginary expressions; and it has been proved that it is
impossible in this case to represent the roots of the cubic in
a real form by means of radicals.* On the other hand, the
great French algebraist, Vieta (1540-1603), showed how the
real values of the three roots may be obtained by means of
trigonometry.
* Cf. Encyklopädie der Mathematischen Wissenschaften, Vol. I, p. 518.
The French edition of this work, to which we have already referred, treats
many subjects more completely than the German. This is especially true
as regards algebra and arithmetic. Neither of these editions is completely
published, but the German is considerably further advanced than the French.
They constitute at present the most important mathematical works of reference.
From the preceding paragraph it results that the solution
of numerical equations of a given degree may present difficulties even after a formula for the roots of the general equation of this degree is known. These difficulties, combined
with those of finding such general formulas, directed attention
to special methods of solution in case the coefficients are
numbers. It is of especial importance to observe that for
many applications of algebra only approximate values of the
real roots are needed. This need has led to a vast literature
which embodies some of the most beautiful results relating
to algebraic equations. As the solutions of the general linear
and quadratic equations are so easily available for numerical
equations, we shall assume, in what follows, that the degrees
of the equations under consideration exceed 2. The solutions
of numerical equations may frequently be simplified by considering the special properties of the coefficients, and hence they
demand great alertness as regards details.
A large part of the theory of numerical equations confines
itself to real numbers, since these are frequently the only
numbers applying directly to the conditions which give rise
to an equation. This is especially true as regards the coefficients of an equation. When the coefficients of the rational
integral function f(x) involve complex numbers it is evidently
possible to write this function in the form,
$$f(x) = \phi(x) + i\psi(x),$$
where the coefficients of the rational integral functions $\phi(x)$
and $\psi(x)$ are real numbers. After multiplying both members
of this equation by the conjugate value, $\phi(x) - i\psi(x)$, we
obtain a new rational integral function of x, which involves all
the roots of f(x)=0, but has only real coefficients. Hence it
results that if we can find all the roots of every rational integral
function of x with real coefficients we can also find those of
such a function with complex coefficients. It is also important
to observe from the given form of f(x) that any real root of
f(x)=0 is a common root of $\phi(x) = 0$, $\psi(x) = 0$, and hence it
is a root of the highest common factor of $\phi(x)$ and $\psi(x)$. In
view of these facts and for the sake of brevity and perspicuity
we shall assume throughout the rest of the present section
that all the coefficients of f(x) are real numbers.
19. Multiple roots. If f(x) is divisible by $(x - r)^{\alpha}$ but not
by $(x - r)^{\alpha+1}$, r is said to occur exactly $\alpha$ times as a root of
the equation f(x) =0; sometimes it is also called such a root
or a zero of f(x). When $\alpha > 1$, r is called a multiple root of
f(x) =0, or multiple zero of f(x). To determine the multiple
roots of f(x) =0 it is convenient to use the well-known property
that any root which occurs exactly $\alpha$ times in f(x) = 0 must
occur exactly $\alpha - 1$ times as a root of f'(x)=0, where f'(x) is
the first derivative of f(x). Hence a multiple root of f(x)
is also a root of the highest common factor of the two functions, f(x), f'(x). Since the first derivative of f(x) may be found
by a rational process it results that f(x) is reducible in the
domain of rationality of its coefficients whenever it has multiple
roots, but the converse of this theorem is evidently not
necessarily true.
From the preceding paragraph it results that the multiple
roots of f(x) = 0 may be found by means of the highest common
factor of f(x) and f'(x). As the multiple roots of this highest
common factor may be found in a similar manner it results
that whenever f(x)=0 has no more than $\beta$ distinct multiple
roots, all these roots may be found by rational operations and
by solving equations whose degrees do not exceed $\beta$. In particular, if f(x)=0 has only one multiple root it may be found
by rational operations. It is frequently possible to find the
rational multiple roots by inspection.
Since the quotient obtained by dividing f(x) by the highest
common factor of f(x) and f'(x) involves each root of f(x)
once and only once, we may suppose in what follows that
f(x)=0 involves no multiple root. This hypothesis will conduce to brevity of statements.
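The use of the highest common factor of f(x) and f'(x) can be sketched as follows. This is a minimal illustration, assuming rational coefficients listed from the highest power down and exact arithmetic with `fractions.Fraction`; the helper names are hypothetical.

```python
from fractions import Fraction

def strip_leading_zeros(p):
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def poly_divmod(num, den):
    """Quotient and remainder of polynomial division (highest degree first)."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        for i, c in enumerate(den):
            num[i] -= factor * c
        num.pop(0)                      # the leading term has been cancelled
    return quotient, strip_leading_zeros(num)

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def highest_common_factor(f, g):
    """Monic highest common factor by the Euclidean process."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g:
        _, rem = poly_divmod(f, g)
        f, g = g, rem
    return [c / f[0] for c in f]

f = [1, 0, -3, 2]                       # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
hcf = highest_common_factor(f, derivative(f))
simple_part, rem = poly_divmod(f, hcf)  # each root of f once and only once
print(hcf, simple_part, rem)
```

Here the highest common factor is x - 1, which exposes the double root 1 by rational operations alone, and the quotient x^2 + x - 2 has the same roots as f(x) without repetition.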
20. Sturm's theorem. This theorem (proved in 1829)
furnishes the scientific foundation for every method of finding
the approximate values of the unknown in an algebraic equation with real coefficients, as it gives definite information
in regard to the number of real roots between two arbitrarily
assigned numbers.* Moreover, the proof of this theorem is
not difficult, being based upon the following two elementary
facts: (1) The continuity of f(x), and (2) the fact that if a
is a real root of f(x)=0 and h is a sufficiently small positive
number, then f(a - h) and f'(a - h) have different signs, while
f(a + h) and f'(a + h) must have the same sign, where f'(x)
is the first derivative of f(x). A proof of these two facts is
found in many elementary text-books; e.g., Burnside and
Panton's Theory of Equations, Vol. I, 1899, pp. 9 and 161.
* Encyklopädie der Elementar-Mathematik von Weber und Wellstein,
1906, Vol. I, p. 337.
To obtain Sturm's Series we proceed exactly as in the
process of finding the highest common factor of f(x) and f'(x)
with the single exception that the sign of each remainder is
changed. In this way we obtain the following relations:
$$f(x) = q_1(x)f'(x) - r_1(x),$$
$$f'(x) = q_2(x)r_1(x) - r_2(x),$$
$$r_1(x) = q_3(x)r_2(x) - r_3(x),$$
$$\cdots$$
$$r_{n-2}(x) = q_n(x)r_{n-1}(x) - r_n(x),$$
where $r_n(x)$ is a constant, different from zero, since f(x)=0
has no multiple root. The series,
$$f(x),\ f'(x),\ r_1(x),\ r_2(x),\ \dots,\ r_n(x),$$
has the following properties: No two adjacent functions can
vanish for the same value of x; otherwise all the succeeding
functions would have to vanish for this value of x, but this
is impossible since rn(x) cannot be 0. When any function
vanishes the two adjacent functions must have opposite signs
in order to satisfy the given equations. In finding the number
of changes of sign in this series as x increases continuously
from the real number a to a larger real number b we need therefore not consider the vanishing of any function except the
first one. In case this vanishes a change of sign is lost, as was
observed in the preceding paragraph. This proves Sturm's
Theorem, which may be stated as follows:
If any two real numbers a and b be substituted for x in
Sturm's Series,
$$f(x),\ f'(x),\ r_1(x),\ r_2(x),\ \dots,\ r_n(x),$$
the difference between the number of changes of sign in the series
when a is substituted for x and the number when b is substituted
for x is exactly the number of real roots of the equation f(x) = 0
between a and b.
The total number of real roots of f(x) = 0 is equal to the
difference between the changes of sign in these functions
when $-\infty$ is first substituted for x and then $+\infty$. The total
number of positive roots may be found by first substituting 0
and then $+\infty$, and of the negative roots by first substituting
$-\infty$ and then 0. This theorem is more general than Descartes'
Rule of Signs, as the latter gives merely an upper limit for the
number of real roots. The disadvantage of Sturm's Theorem
is that it requires considerable labor to find Sturm's Series,
especially when the degree of f(x) is large, since the coefficients
in the successive functions of the series may become large.
It is evident that the successive remainders may be multiplied
or divided by any positive number and that it is not necessary
to find the exact value of rn(x), since only its sign is considered
in the application of the theorem.
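Sturm's Series and the count of sign changes can be sketched in a few lines. The fragment below is an illustration under stated assumptions: rational coefficients listed from the highest power down, exact arithmetic with `fractions.Fraction`, an equation without multiple roots, and endpoints a and b that are not themselves roots. The helper names are hypothetical.

```python
from fractions import Fraction

def strip_leading_zeros(p):
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def remainder(num, den):
    """Remainder of polynomial division (coefficients listed highest degree first)."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i, c in enumerate(den):
            num[i] -= factor * c
        num.pop(0)
    return strip_leading_zeros(num)

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def sturm_series(coeffs):
    """f, f', then each remainder with its sign changed."""
    f = [Fraction(c) for c in coeffs]
    series = [f, derivative(f)]
    while True:
        rem = remainder(series[-2], series[-1])
        if not rem:
            break
        series.append([-c for c in rem])
    return series

def evaluate(p, x):
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value

def sign_changes(series, x):
    values = [v for v in (evaluate(p, x) for p in series) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if (u > 0) != (v > 0))

def count_real_roots(coeffs, a, b):
    """Number of real roots between a and b (a < b, neither one a root)."""
    series = sturm_series(coeffs)
    return sign_changes(series, a) - sign_changes(series, b)

f = [1, 0, -7, 6]                       # x^3 - 7x + 6, with roots -3, 1, 2
print(count_real_roots(f, -4, 0), count_real_roots(f, 0, 3), count_real_roots(f, -4, 3))
```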
Sturm's Series suffices to find the rational roots of an
equation and to approximate the irrational roots to any desired
degree of accuracy, but other methods generally require much
less computation. One of the most useful auxiliary theorems
in locating the roots of an equation may be stated as follows:
There must be an odd number of roots between a and b whenever
f(a) and f(b) have opposite signs. This theorem results directly
from the fact that f(x) is continuous and hence can change its
sign between a and b only by passing through zero. It is evident
that the number of roots between a and b must be zero or
even whenever f(a) and f(b) have the same sign.
21. Rational roots. Descartes observed that every root of
f(x) = aon +aln-l +... +an=0
is a divisor of a-. Moreover, if a root is rational and reduced
to its lowest terms, its numerator is a divisor of an and its
denominator is a divisor of ao, as results directly from substituting such a root ( ) in f(x) =0. In fact, all the terms of
(maol n a, (m? n-1        n
L ao  /         ao


except possibly the first are evidently integers. As the sum
of all these terms is zero the first must also be an integer,
and hence l, being prime to m, divides $a_0$.
On the other hand, since m divides all these terms except
possibly the last it must also divide the last, and hence m divides $a_n$. If f(x) = 0 has a
second rational root and this root is also reduced to its lowest
terms, its numerator evidently divides $a_n \div m$ and its denominator divides $a_0 \div l$, etc. As the numerator of every rational
root in its lowest terms divides $a_n$ and the denominator divides
$a_0$, it results that we can find all the rational roots of f(x) = 0
by a finite number of trials and that the number of these trials
is small when $a_n$ and $a_0$ have only a small number of factors.
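The finite number of trials described above may be sketched as follows. This is an illustrative Python sketch only, assuming integral coefficients listed from the highest power downward with a non-zero leading coefficient; the names `divisors` and `rational_roots` are not taken from the text.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a non-zero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """All rational roots of a polynomial with integral coefficients."""
    coeffs = list(coeffs)
    roots = set()
    while len(coeffs) > 1 and coeffs[-1] == 0:    # x = 0 is a root
        roots.add(Fraction(0))
        coeffs.pop()
    a0, an = coeffs[0], coeffs[-1]
    for m in divisors(an):                        # numerator divides a_n
        for l in divisors(a0):                    # denominator divides a_0
            for candidate in (Fraction(m, l), Fraction(-m, l)):
                value = Fraction(0)
                for c in coeffs:                  # Horner evaluation
                    value = value * candidate + c
                if value == 0:
                    roots.add(candidate)
    return sorted(roots)

# 2x^3 - 3x^2 - 11x + 6 = (2x - 1)(x - 3)(x + 2)
print(rational_roots([2, -3, -11, 6]))
```

Each candidate is tested by direct substitution; the saving suggested in the text, of dividing out a root once found, is omitted here for brevity.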
22. Irrational roots. It is always possible, in accord with
the preceding theory, to find two rational numbers, whose
difference is less than any assigned positive number, such that
one of these numbers is greater than the required irrational
root while the other is less than this root. We may choose
these rational numbers successively so as to differ from each
other by powers of $\frac{1}{10}$. That is, we may first find two integers
which differ by $10^0 = 1$ such that the root lies between them,
then we may find two rational numbers differing by $10^{-1}$, such
that the root lies between them, then we may find two
rational numbers differing by $10^{-2}$ and inclosing the root, etc.
The smaller of these two rational numbers is called the approximate value of the root, and the process of finding it is known
as approximating the root. In practice this process is greatly
modified in details so as to require much less labor.
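The unmodified process may be illustrated by a short Python sketch. It assumes that f is supplied as a function, that f takes opposite signs at the integers `lower` and `lower + 1`, and that exact fractions are used; the name `decimal_brackets` is hypothetical.

```python
from fractions import Fraction

def decimal_brackets(f, lower, steps):
    """Successive pairs of rational numbers inclosing a root of f.

    The first pair differs by 1, the next by 1/10, the next by 1/100, etc.
    """
    width = Fraction(1)
    lo = Fraction(lower)
    brackets = [(lo, lo + width)]
    for _ in range(steps):
        width /= 10
        # Advance by tenths of the previous interval until the sign changes.
        while f(lo) * f(lo + width) > 0:
            lo += width
        brackets.append((lo, lo + width))
    return brackets

# The positive root of x^2 - 2 = 0 lies between 1 and 2.
for lo, hi in decimal_brackets(lambda x: x * x - 2, 1, 3):
    print(float(lo), float(hi))
```

The smaller number of the last pair is the approximate value of the root in the sense defined above.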
In 1767 Lagrange published a theoretically simple method
for finding the approximate value of an irrational root by
means of continued fractions. The main features of this
method are as follows: After finding that a root of f(x) = 0
lies between the integers r and r+1 we substitute for x in
f(x) = 0,
$$x = r + \frac{1}{y_1} = \frac{r y_1 + 1}{y_1},$$
and thus obtain another equation of degree n, $f_1(y_1) = 0$, which
has the same number of real roots greater than 1 as f(x) = 0
has real roots between r and r+1. We then find an integer
$r_1 > 0$, such that there is a root between $r_1$ and $r_1 + 1$, and
substitute in $f_1(y_1) = 0$,
$$y_1 = r_1 + \frac{1}{y_2} = \frac{r_1 y_2 + 1}{y_2}.$$
In this way there results an equation of the nth degree in $y_2$
which has as many real roots greater than 1 as $f_1(y_1) = 0$ has
real roots between $r_1$ and $r_1 + 1$.
By continuing this process we must arrive at an equation
which has only one root greater than 1, and this root may be
traced as far as may be desired. The value of a root of the
original equation is then given by the continued fraction,
$$x = r + \cfrac{1}{r_1 + \cfrac{1}{r_2 + \cdots}}.$$
Although this method is perspicuous and exhibits clearly
the reason for each step, it has not been used as widely as the
well-known Horner's Method.
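The successive substitutions of Lagrange's method may be sketched in Python as follows. The sketch assumes integral coefficients listed from the highest power downward, and assumes further that exactly one root lies between r and r+1 and that this root is irrational; the function names are illustrative and are not Lagrange's.

```python
from fractions import Fraction

def shift(p, r):
    """Coefficients of p(t + r), highest degree first."""
    p = list(p)
    n = len(p) - 1
    for i in range(n):
        for j in range(1, n + 1 - i):
            p[j] += r * p[j - 1]
    return p

def evaluate(p, x):
    """Horner evaluation of p at x."""
    value = 0
    for c in p:
        value = value * x + c
    return value

def partial_quotients(f, r, count):
    """Return [r, r1, r2, ...] with `count` quotients after r."""
    quotients = [r]
    p, k = list(f), r
    for _ in range(count):
        # Substituting x = k + 1/y amounts to shifting by k and then
        # reversing the order of the coefficients.
        p = shift(p, k)[::-1]
        # Find the integer k >= 1 such that the root lies between k and k + 1.
        k = 1
        while evaluate(p, k) * evaluate(p, k + 1) > 0:
            k += 1
        quotients.append(k)
    return quotients

def convergent(quotients):
    """Value of the finite continued fraction r + 1/(r1 + 1/(r2 + ...))."""
    value = Fraction(quotients[-1])
    for q in reversed(quotients[:-1]):
        value = q + 1 / value
    return value

# The root of x^2 - 2 = 0 between 1 and 2.
q = partial_quotients([1, 0, -2], 1, 4)
print(q, convergent(q[:4]))
```

Stopping the continued fraction at any quotient gives a rational approximation; the later quotients refine it.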
23. Solutions by means of graphs and machines. If an
exact graph of y=f(x) could be constructed and if it were
possible to measure exactly the abscissas of the points where
this graph crosses the x-axis, the numerical measures of these
abscissas would furnish all the real roots of f(x)=0. This
method has the advantage that it exhibits the values of f(x)
for all the values of x within certain limits. Its disadvantage
is that a graph cannot be said to represent a function accurately on account of the imperfections in measurement and
drawing. It serves the purpose of a hypothesis by bringing
unity into what might otherwise appear as disconnected, and
hence it serves a very useful purpose, especially for the beginner.
It is a convenient receptacle for a large number of facts whose
significance might otherwise not be so clear.
The ancient Greeks used geometric constructions to solve
certain problems of geometry which are equivalent to the solutions of equations of the second degree, but the present graphic
methods for solving equations were developed mostly since
the beginning of the nineteenth century. In many cases
these methods serve only to show that certain solutions are
possible and in some cases they serve as a rough check on the
accuracy of the calculations, but there are a large number
of cases where such solutions are sufficiently accurate for the
problems on hand. As they are especially well adapted to
the saving of thought as regards details they are doubtless
destined to play a more and more prominent role as mathematical methods find wider and wider use in the development
of science and industry.
Instead of drawing the graph of y=f(x) as noted above,
it is often more convenient to construct two curves such that
the abscissas of the points of intersection are the roots of
f(x)=0. Sometimes one curve is fixed for all the equations
of the same degree, while the other curve is made to vary so as
to correspond to the different values of the coefficients. As
early as 1637 Descartes employed a fixed parabola and a
variable circle to solve equations of the third and fourth degrees,
and he also solved equations of the fifth and sixth degrees
by means of a certain fixed curve of the third order and a
circle. The literature on graphic algebra is very extensive
and is growing rapidly. Among the introductory treatises
we may mention the Graphic Algebra by Phillips and Beebe.
Closely related to the graphic methods are the various
machines for finding the approximate values of the roots
of a numerical equation. Some of these are very ingenious,
employing principles of equilibrium of forces and of hydrostatics as well as of electricity. Although the ancient Greeks
solved the Delian problem, involving the solution of a cubic
equation, by means of mechanical devices, the machines
suitable for finding the roots of a great variety of equations
are comparatively recent inventions. One of the most noted
was invented in 1893 by a Spanish engineer named M. L.
Torres. For a detailed description of this and other machines
to solve equations and to simplify other calculations we may
refer to Le Calcul simplifié by Maurice d'Ocagne, 1905. The
large mathematical encyclopedias, especially the Encyclopédie
des Sciences Mathématiques, tome 1, Vol. IV, contain a large
amount of information on this subject.
24. A few fallacies and notes of caution.  While the chief
aim of mathematics is the construction of permanent and
attractive highways of thought leading as directly as possible
to important treasures of the intellect, it is of some interest
to observe where one is led by following by-ways regardless
of the danger signals. One of the most prominent of these
signals is: Never divide both members of an equation by an expression whose value is zero. If it were allowable to divide by
such an expression it would be easy to prove that every number is equal to zero. One such proof would be as follows:
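From x = 1 there would result successively,
$$x^2 = 1,\quad x^2 - 1 = 0,\quad x + 1 = 0,\quad x = -1,\quad 1 = -1,\quad a = -a,\quad 2a = 0.$$
Here the third equation is obtained from the second by dividing by x - 1, an expression whose value is zero.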
As a may be so selected that 2a is an arbitrary number, it
would result from this that any arbitrary number is zero.
A fallacy of a somewhat different nature results from the
fact that we are so apt to forget that a number has n nth
roots. This is illustrated in the following two examples:
$$\frac{1}{-1} = \frac{-1}{1}.$$
Extracting the square root of both members gives
$$\frac{\sqrt{1}}{\sqrt{-1}} = \frac{\sqrt{-1}}{\sqrt{1}}.$$
Clearing of fractions and observing that $(\sqrt{1})^2 = 1$ and
$(\sqrt{-1})^2 = -1$, if $\sqrt{\ }$ stands for a single root, there results,
$$1 = -1.$$
The danger signal here is: Remember that a number has n nth
roots.
The use of radicals in elementary mathematics is not as
uniform as it should be. For instance, the symbol $\sqrt{\ }$ should
either imply two values and hence should never be preceded
by $\pm$, or we should have a slightly modified symbol to denote
the arithmetic square root. If we assume that the symbol
$\sqrt{\ }$ indicates merely a positive square root, such equations as
$$\sqrt{x+a} + \sqrt{x} = 1, \qquad a > 1,$$
are clearly impossible. On the other hand, they are possible
when this symbol indicates either of the two possible square
roots, and the possible value of x may be found by clearing
of radicals in the ordinary way. Such equations should
therefore not be called impossible without stating that the
symbol $\sqrt{\ }$ is to be given an arithmetic meaning.
The equation,
$$\left(x^{1/q}\right)^p = \left(x^p\right)^{1/q},$$
where p and q are integers, should not be regarded as an
identity, as is evident from the fact that $\left(x^{1/4}\right)^4$ has only one
value while $\left(x^4\right)^{1/4}$ has, in general, four distinct values. All
the values of the first member of the given equation are
evidently values of the second, but the converse is not true.*
Such equations must therefore be used with great care. For
more detailed information along this line the reader may
consult Catalan, Sur un paradoxe algébrique, Nouv. Annales de
Math., Vol. VIII, 1869, p. 456.
* Cf. Valles, Nouvelles Annales de Mathématiques, Vol. IX, 1870, p. 20.

V. SIMULTANEOUS EQUATIONS
25. Introduction. In sec. 5 it was observed that simultaneous
equations appear on some of the oldest mathematical papyri
and that the solution of a special case of a system of n simultaneous equations was known to the ancient Greeks. A satisfactory treatment of such equations was, however, not possible
until determinants had been developed. This subject is comparatively modern, having its origin in the writings of Leibnitz
(1693), and assuming a significant position in mathematical
literature during the latter part of the eighteenth and the
first part of the nineteenth century. In what follows we shall
assume a knowledge of the elementary properties of determinants.
In the case of a single equation in one or more unknowns,
it is known that it can always be solved in the sense that at
least one value of each of the unknowns exists which will
satisfy the equation. The only exception to this rule is when
all the coefficients of the unknown, or the unknowns, are equal
to zero,* while the known term is not equal to zero. In the
case of a system of equations, a number of other possibilities
arise and one of the first questions in regard to such a system
is whether it can be solved. If this can be done the system
is said to be consistent.
A set of mn quantities arranged in a rectangular array of
m rows and n columns is called a matrix. When m=n it is
called a square matrix, so that the matrix of a determinant is
always a square matrix. The rank of a matrix is the order
of the largest non-vanishing determinant contained in the
matrix.
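26. Consistency of a system of linear equations.† Consider
the following system of m equations in n unknowns:
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + b_1 &= 0,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + b_2 &= 0,\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n + b_m &= 0,
\end{aligned}$$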
where m and n are any two positive integers. The three cases
that can arise are:
(1) The equations may have no solution and hence be
inconsistent.
(2) They may have only one solution.
(3) They may have more than one solution.
* We consider only finite values of the unknowns in the solutions of
equations.
† In this article we have, in the main, adopted the mode of presentation
given in the Introduction to Higher Algebra, by Maxime Bôcher, 1907.
It will soon appear that they must have an infinite number
of solutions whenever they have more than one (in fact, each
unknown has none, one, or an infinite number of values),
so that the possible cases are: No solution, one solution, or
an infinite number of solutions. To prove this it will be
convenient to consider the two matrices:
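$$A = \begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{Vmatrix}, \qquad B = \begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1\\ a_{21} & a_{22} & \cdots & a_{2n} & b_2\\ \vdots & \vdots & & \vdots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{Vmatrix}.$$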
The latter is obtained by adding the column of b's to the
former, and hence it is called the augmented matrix of the
system, while A is the matrix of the system. It is evident that
the rank of B cannot be less than the rank of A and that the
former cannot exceed the latter by more than unity. Hence
we have the two possible cases: (1) The rank of A is equal
to that of B, (2) the rank of A is one less than that of B.
Suppose that the given system of equations comes under
the latter of these two possible cases. We may therefore
suppose that the rank of B is r while the rank of A is r-1.
The given system of equations may be supposed to have been
arranged in such a manner that the non-vanishing determinant
of order r in B is in the upper right-hand corner of this matrix.
Since the rank of A is r-1 it results that the homogeneous
parts ($f_1, f_2, \ldots, f_r$) of the first r equations of the given
system may be multiplied by constants ($c_1, c_2, \ldots, c_r$), so that
$$c_1 f_1 + c_2 f_2 + \cdots + c_r f_r = 0,$$
independently of the values of the unknowns, where at least
one of the c's is not 0. If we represent the first members
of the given system by $F_1, F_2, \ldots, F_m$, so that
$$F_i \equiv f_i + b_i, \qquad (i = 1, 2, \ldots, m),$$
it follows from the above that
$$c_1 F_1 + c_2 F_2 + \cdots + c_r F_r = c_1 b_1 + c_2 b_2 + \cdots + c_r b_r = c.$$
Since the rank of B is r it is necessary that $c \neq 0$, otherwise
each of the elements in one row of the matrix of a non-vanishing
determinant would be the same linear function of the corresponding elements in the other rows.
The fact that $c \neq 0$ for any possible values of the unknowns
proves the inconsistency of the system, for if they were consistent there would be values of the unknowns which would
cause each of the functions F1, F2,..., Fr to vanish and
hence c would be 0. Having proved that the given system
of equations is inconsistent when the rank of B is larger than
the rank of A we proceed to prove that the system must be
consistent when the rank of A is equal to that of B. Suppose
that each of these two matrices is of rank r and that the
equations are so arranged that a non-vanishing determinant
of order r appears in the upper left-hand corner of each of
these matrices. Since each of the determinants of order r+1
must vanish we have the relation,
$$c_1 F_1 + c_2 F_2 + \cdots + c_r F_r + c_{r+1} F_{r+1} = 0,$$
independently of the values of the unknowns, where the c's are not all 0. As
$c_1 F_1 + c_2 F_2 + \cdots + c_r F_r$ cannot vanish independently of the values of the unknowns unless each of these c's is 0,
it results that $c_{r+1} \neq 0$. Hence we may divide the given equation by $c_{r+1}$ and thus express $F_{r+1}$ in terms of $F_1, F_2, \ldots, F_r$.
As the same argument holds for $F_{r+2}, F_{r+3}, \ldots, F_m$ it results
that any solutions of the first r equations must also be solutions
of all the rest.
If in the first r of the given system of equations we assign
arbitrary values to $x_{r+1}, \ldots, x_n$ we obtain a system which
can be solved in the ordinary way by means of determinants,
since the determinant of the system does not vanish. In
this way we obtain one and only one value for each of the
unknowns $x_1, \ldots, x_r$. The preceding considerations prove the
following theorem: A necessary and sufficient condition for a
system of linear equations to be consistent is that the matrix of
the system has the same rank as the augmented matrix. Since
the values assigned to $x_{r+1}, \ldots, x_n$ are arbitrary, it also follows
that a system of linear equations has an infinite number of
solutions whenever it has more than one solution.
To provide very elementary illustrations of the preceding
theorem we consider the following systems:
$$\text{I.}\ \begin{cases} 3x - 2y + z = 8,\\ x - 4y + 2z = 6. \end{cases} \qquad \text{II.}\ \begin{cases} 3x + 4y = 7,\\ 6x + 8y = 10. \end{cases}$$
$$\text{III.}\ \begin{cases} x + 2y = 5,\\ 2x - y = 0,\\ 4x + 3y = 10. \end{cases} \qquad \text{IV.}\ \begin{cases} x - y + 3z = 4,\\ 2x + 3y - z = 5,\\ 3x + 2y + 2z = 10. \end{cases}$$
In system I the rank of the matrix of the system is 2, since
$$\begin{vmatrix} 3 & -2\\ 1 & -4 \end{vmatrix} \neq 0.$$
As the rank of the augmented matrix is also 2 this system
is consistent and arbitrary values may be assigned to either
y or z. On the other hand, the only value that x can have
is 2.* The rank of the matrix of system II is 1, while the rank of the
augmented matrix is 2; hence this system has no solution.
In system III the rank of the matrix as well as that of the
augmented matrix is 2. Hence this system has a solution
and it is evident that it has only one solution, viz., x=1,
y=2. As the matrix of system IV is of rank 2, while the
augmented matrix is of rank 3, this system has no solution.
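The ranks quoted for systems I to IV may be checked mechanically. The following Python sketch finds the rank by elimination with exact fractions, which is an assumption of convenience and not the determinant definition used in the text; the names `rank` and `classify` are illustrative. The constants are entered as they stand in the second members, since a change of sign in the last column does not alter the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, found by elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

def classify(A, constants):
    """Compare the rank of the matrix with that of the augmented matrix."""
    B = [list(row) + [b] for row, b in zip(A, constants)]
    if rank(A) < rank(B):
        return "no solution"
    if rank(A) == len(A[0]):
        return "one solution"
    return "infinite number of solutions"

systems = {
    "I":   ([[3, -2, 1], [1, -4, 2]], [8, 6]),
    "II":  ([[3, 4], [6, 8]], [7, 10]),
    "III": ([[1, 2], [2, -1], [4, 3]], [5, 0, 10]),
    "IV":  ([[1, -1, 3], [2, 3, -1], [3, 2, 2]], [4, 5, 10]),
}
for name, (A, b) in systems.items():
    print(name, classify(A, b))
```

When the two ranks are equal, the solution is unique exactly when the common rank equals the number of unknowns, in accord with the argument of the preceding article.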
* A necessary and sufficient condition that a given unknown in a consistent system of linear equations has the same value in every possible solution
of the system is that the rank of the matrix of the system is decreased when
the coefficients of this unknown are omitted from the matrix of the system.
Cf. American Mathematical Monthly, Vol. XVII, 1910, p. 137.

27. Geometrical interpretation. As a linear equation
involving no more than three unknowns may be conveniently
represented as a plane in ordinary space, clearness is often
attained by thinking of the planes which represent given
systems of equations. For instance, system I of the preceding
paragraph represents two planes intersecting on the plane
x=2, and hence these planes are cut in parallel lines by every
plane parallel to the plane x=2, while they are cut in two
intersecting lines by the planes parallel to y = 0 or z = 0.
System II represents two parallel planes, while system III
represents three planes through a line parallel to the z-axis.
Finally, system IV represents three planes intersecting in
three parallel lines. These interpretations follow directly
from solid analytic geometry, and they tend to elucidate the
theory of systems of linear equations, but they do not form
an essential element of this theory.
28. Consistency of two equations in one unknown. Suppose
that two rational integral equations in x,
$$f_1(x) = 0, \qquad f_2(x) = 0,$$
have a common root. If $f_1(x)$ is of degree m and $f_2(x)$ is of
degree n, we obtain m + n equations in the m + n - 1 unknowns,
$x, x^2, \ldots, x^{m+n-1}$, by multiplying $f_1(x)$ successively by
$1, x, x^2, \ldots, x^{n-1}$, and $f_2(x)$ by $1, x, x^2, \ldots, x^{m-1}$. The consistency
of this system of m + n equations requires that the determinant of the augmented matrix of the system be equal to
zero.* This determinant is known as the resultant of the
equations and the method by which we obtained it is known
as Sylvester's dialytic method of elimination. The resultant of
the two linear equations,
$$ax + b = 0, \qquad a_1 x + b_1 = 0,$$
is
$$\begin{vmatrix} a & b\\ a_1 & b_1 \end{vmatrix} = ab_1 - a_1 b = 0,$$
and the resultant of the two quadratic equations,
$$ax^2 + bx + c = 0, \qquad a_1 x^2 + b_1 x + c_1 = 0,$$
is the determinant of the fourth order,
$$\begin{vmatrix} a & b & c & 0\\ 0 & a & b & c\\ a_1 & b_1 & c_1 & 0\\ 0 & a_1 & b_1 & c_1 \end{vmatrix} = 0.$$
* It has been proved that this condition is sufficient as well as necessary.
The arguments here employed prove only the latter.


For instance, the two equations,
$$x^2 + 4x - 21 = 0, \qquad x^2 + 2x - 15 = 0,$$
are consistent, since their resultant is 0. It is evident that
this method may also be employed to eliminate one of the
unknowns from two simultaneous equations in two unknowns.
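The determinant of the dialytic method may be formed and evaluated as in the following Python sketch. It assumes that each equation is given by its coefficients from the highest power downward and that the determinant is evaluated by elimination with exact fractions; the function names are illustrative only.

```python
from fractions import Fraction

def sylvester_matrix(f1, f2):
    """Rows of shifted coefficients: n rows from f1, then m rows from f2."""
    m, n = len(f1) - 1, len(f2) - 1
    size = m + n
    rows = []
    for s in range(n):
        rows.append([0] * s + list(f1) + [0] * (size - m - 1 - s))
    for s in range(m):
        rows.append([0] * s + list(f2) + [0] * (size - n - 1 - s))
    return rows

def determinant(rows):
    """Determinant of a square matrix by elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                 # interchanging two rows changes the sign
        det *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[col])]
    return det

def resultant(f1, f2):
    return determinant(sylvester_matrix(f1, f2))

# The two quadratics of the text have the common root 3.
print(resultant([1, 4, -21], [1, 2, -15]))
# x^2 - 1 = 0 and x^2 - 4 = 0 have no common root.
print(resultant([1, 0, -1], [1, 0, -4]))
```

For two quadratics the rows are arranged exactly as in the determinant of the fourth order displayed above.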
29. Equivalent equations.  In  elementary  algebra two
equations are generally regarded as equivalent by definition
if they have all their roots in common.* Similarly, two
systems of simultaneous equations are regarded as equivalent
by definition if all the solutions of one system are solutions
of the other, and vice versa. On the other hand, it is frequently desirable to define the term equivalent with regard
to a certain set of transformations, and to say that two
expressions, or sets of expressions, are equivalent as regards
a certain set of transformations if this set includes at least
one transformation which carries the first of these expressions
over into the second, and also at least one which carries the
second over into the first. Two expressions which are equivalent as regards one set of transformations need not be equivalent as regards another set.
* In Jordan's Traité des substitutions, p. 271, two equations of the same
degree are called equivalent if the roots of the one may be represented as
rational functions of the roots of the other.

In the present article we shall adopt the former of these
definitions of equivalence, and we shall first inquire what
effect clearing of fractions may have upon certain rational
equations in one unknown. It is convenient to premise the
evident theorem: A necessary and sufficient condition that
the sum of the n rational numerical fractions,
$$\frac{a_1}{b_1},\ \frac{a_2}{b_2},\ \ldots,\ \frac{a_n}{b_n},$$
in the form
$$\frac{a_1 b_2 b_3 \cdots b_n + a_2 b_1 b_3 \cdots b_n + \cdots + a_n b_1 b_2 \cdots b_{n-1}}{b_1 b_2 \cdots b_n},$$
shall be in its lowest terms is that each of the n given fractions
shall be in its lowest terms and that the denominators,
$b_1, b_2, \ldots, b_n$, are relatively prime. Let
$$\frac{f_1(x)}{\phi_1(x)} + \frac{f_2(x)}{\phi_2(x)} + \cdots + \frac{f_n(x)}{\phi_n(x)} = 0 \qquad (1)$$
be an equation in which each fraction is reduced to its lowest
terms and the denominators are relatively prime, $f_1, \ldots, f_n$,
$\phi_1, \ldots, \phi_n$ representing rational integral functions of x, not
excluding the case when some of these are constants. When
cleared of fractions this equation becomes
$$f_1(x)\phi_2(x)\cdots\phi_n(x) + \cdots + f_n(x)\phi_1(x)\cdots\phi_{n-1}(x) = 0. \qquad (2)$$
Suppose that $\alpha$ is a root of (2), and hence
$$f_1(\alpha)\phi_2(\alpha)\cdots\phi_n(\alpha) + \cdots + f_n(\alpha)\phi_1(\alpha)\cdots\phi_{n-1}(\alpha) = 0. \qquad (3)$$
It is easy to see that none of the $\phi$'s is equal to 0 when $x = \alpha$. For
instance, $\phi_1(\alpha) = 0$ would imply that
$$f_1(\alpha)\phi_2(\alpha)\cdots\phi_n(\alpha) = 0,$$
since every other term of (3) contains the factor $\phi_1(\alpha)$. As a root of $\phi_1(x)$ is not a root of any of the functions $f_1(x), \phi_2(x), \ldots,
\phi_n(x)$, the first fraction being in its lowest terms and the denominators being relatively prime, it cannot be a root of their product. That is, $\phi_1(\alpha) \neq 0$.
Since none of the $\phi$'s is 0 we may divide Eq. (3) by
$$\phi_1(\alpha)\phi_2(\alpha)\cdots\phi_n(\alpha),$$
and thus obtain
$$\frac{f_1(\alpha)}{\phi_1(\alpha)} + \frac{f_2(\alpha)}{\phi_2(\alpha)} + \cdots + \frac{f_n(\alpha)}{\phi_n(\alpha)} = 0.$$
This proves that every root of Eq. (2) is also a root of
Eq. (1) and it is evident that every root of Eq. (1) is also
a root of Eq. (2), since no root can be lost by multiplying both
members of an equation by a rational integral function. Hence
it results that Eqs. (1) and (2) are equivalent equations.
That Eqs. (1) and (2) are not necessarily equivalent if we
omit either of the conditions that the $\phi$'s are relatively prime
or that the fractions are in their lowest terms results directly
from the following examples:
The equation
$$\frac{x}{x-1} = \frac{1}{x-1}$$
has no root, since dividing by 0 is excluded, and 1 is the only
number that requires consideration, but
$$x(x-1) - x + 1 = 0$$
has 1 as a repeated root. It should be observed that the
equation obtained by multiplying the former of these equations
by the least common multiple of the denominators is
$$x - 1 = 0,$$
and hence this has a root which is not a root of the equation
in the fractional form. On the other hand, of the two equations,
$$\frac{x-1}{x^2-1} + \frac{1}{x} = 0, \qquad 2x^2 - x - 1 = 0,$$
the latter has the root x = 1, while the former does not have
this root. It is of especial interest to observe that all the
roots of the former must be roots of the latter, since the latter
was obtained by multiplying both members of the former by
a rational integral function of x.
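The two examples may be verified by substituting each root of the cleared equation in the fractional form. The following Python sketch assumes that every fraction is given as a pair of coefficient lists (numerator, denominator), highest power first, and that the sum of the fractions is set equal to 0; the name `check_roots` and the three verdicts are illustrative only.

```python
from fractions import Fraction

def evaluate(p, x):
    """Horner evaluation; coefficients are listed highest degree first."""
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value

def check_roots(terms, candidates):
    """Test roots of the cleared equation against the fractional equation."""
    verdicts = {}
    for x in candidates:
        x = Fraction(x)
        if any(evaluate(den, x) == 0 for _, den in terms):
            verdicts[x] = "extraneous"          # a denominator vanishes
        elif sum(evaluate(num, x) / evaluate(den, x) for num, den in terms) == 0:
            verdicts[x] = "root"
        else:
            verdicts[x] = "not a root"
    return verdicts

# x/(x-1) - 1/(x-1) = 0; the cleared equation has the root 1.
first = [([1, 0], [1, -1]), ([-1], [1, -1])]
print(check_roots(first, [1]))

# (x-1)/(x^2-1) + 1/x = 0; the cleared equation has the roots 1 and -1/2.
second = [([1, -1], [1, 0, -1]), ([1], [1, 0])]
print(check_roots(second, [1, Fraction(-1, 2)]))
```

A root marked extraneous is one introduced by clearing of fractions, so that the cleared equation is redundant with respect to the original.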
Geometrical considerations frequently throw additional
light on the subject of equivalence of equations. For instance,
the two equations,
$$\frac{1}{x} + \frac{1}{y} = 2 \quad\text{and}\quad x + y = 2xy,$$
represent two loci which have every point except the origin
in common. The latter of these is a hyperbola, and if the
former could be plotted accurately its graph would be so
nearly like that of the latter that no microscope would reveal
the difference, since such an instrument could not reveal the
missing point. It may be added that the rigid exclusion of
division by 0 is not followed by all mathematicians and that
many of the leading mathematicians of earlier times did not
completely exclude the possibility of such division.
29. A few tests for equivalence of equations. If the two
members of an equation are either multiplied or divided by
a rational expression of the unknowns, which cannot be zero
or infinity in the domain of rationality to which the unknowns
are restricted, the resulting equation is equivalent to the
original. Let A=B be any equation, and let K be any
expression which cannot be zero or infinity for any of the
values of the unknowns under consideration. The equations,
$$KA = KB, \qquad \frac{A}{K} = \frac{B}{K},$$
may be written as follows:
$$K(A - B) = 0, \qquad \frac{1}{K}(A - B) = 0.$$
Since $K \neq 0$ or $\infty$, these equations can be satisfied only by
those values of the unknowns which make A=B. It results
directly that two equations which are equivalent in one domain
of rationality are not necessarily equivalent in another. For
instance, if the values of the unknowns are confined to real
numbers, K could be x2+1, but K cannot have this value
if x may be any complex integer.
If the two members of an equation are increased or
diminished by the same expression, the resulting equation is
evidently equivalent to the original. This clearly includes the
transposing of any term from one member of an equation to
the other as well as the changing of the sign of each term
of an equation. In transforming equations it is very important to observe whether the derived equations are actually
equivalent to the original. If a derived equation contains all
the roots of the original and some others it is said to be redundant,
if it lacks some of the roots of the original it is defective. From
what precedes it is clear that the ordinary process of clearing
of fractions leads either to an equivalent or to a redundant
equation.
VI. A FEW REFERENCES
30. Text-books. The algebraic equation occupies a prominent place
in algebra and some of its elementary properties are developed in the
text-books on algebra for the secondary schools. More extensive developments of these properties may be found in the advanced text-books on
this subject, such as
(1) Chrystal, Algebra, an elementary text-book, 2 vols., 2d edition,
1900.
(2) Capelli, Istituzioni di analisi algebrica, 4th edition, 1909.
(3) Weber, Lehrbuch der Algebra, 2 vols., 2d edition, 1898-99.
(4) Serret, Cours d'algèbre supérieure, 2 vols., 6th edition, 1910.
The last two of these works include a treatment of the Galois theory
of equations while the first two omit this theory, but they give an
elementary introduction to the theory of substitutions. In the first
this introduction is very brief and incomplete.
A large number of special treatises on the general theory of the
algebraic equation have appeared, beginning with the works of Vieta
in the early part of the seventeenth century. Among the modern works
in the English language Burnside and Panton's Theory of Equations
is probably the most generally known. The first three editions of this
work appeared in one volume and excluded the Galois theory, while
the fourth and fifth appeared in two volumes and include an introduction to substitution groups and the Galois theory of equations.
Among the other treatises on this subject we may mention
(1) Dickson, Introduction to the Theory of Algebraic Equations,
1903.
(2) Cajori, An Introduction to the Modern Theory of Equations, 1904.
(3) Mathews, Algebraic Equations, 1907.
(4) Netto-Cole, Theory of Substitutions and its Applications to
Algebra, 1892.
(5) Barton, An Elementary Treatise on the Theory of Equations, 2d
edition, 1903.
(6) Bianchi, Lezioni sulla teoria dei gruppi di sostituzioni e delle
equazioni algebriche secondo Galois, 1900.
(7) Vogt, Leçons sur la résolution algébrique des équations, 1895.
(8) Matthiessen, Grundzüge der antiken und modernen Algebra der
litteralen Gleichungen, 1878.


The last of these works contains an account of many of the ancient
methods which were used to solve equations and is rich in historical
material. As a result of the rapid growth of historical knowledge during
recent years some of this material has been found not entirely reliable.
Certain phases of the theory of equations are presented in a very
instructive manner in Klein's Elementarmathematik vom höheren
Standpunkte aus, Autogr., 1908-09; and also in Bôcher's Introduction to
Higher Algebra, 1907. An extensive list of treatises on this and other
mathematical subjects may be found in the Mathematischer Bücherschatz
by Ernst Wölfing. This extensive work is supposed to give a systematic
list of the principal books and monographs appearing during the nineteenth century. It appeared in the Abhandlungen zur Geschichte der
Mathematischen Wissenschaften, 1903.
31. Articles. (1) Pierpont, "Galois' theory of algebraic equations,"
Annals of Mathematics, Vols. I and II, 1900, pp. 113 and 22.
(2) Bôcher, "Gauss's third proof of the fundamental theorem of
algebra," Bulletin of the American Mathematical Society, Vol. I, 1895,
p. 205.
(3) Sylvester, "On an elementary proof and generalization of Sir
Isaac Newton's hitherto undemonstrated rule for the discovery of imaginary roots," Proceedings of the London Mathematical Society, Vol. I,
1865, p. 1.
(4) Van Vleck, "A sufficient condition for the maximum number of
imaginary roots of an equation of the nth degree," Annals of Mathematics, Vol. IV, 1903, p. 191.
(5) Baker, "A balance for the solution of algebraic equations,"
American Mathematical Monthly, Vol. XI, 1904, p. 224.
(6) Emch, "Hydraulic solution of an algebraic equation of the nth
degree," ibid., Vol. VIII, 1901, p. 58.
(7) Moritz, "On certain proofs of the fundamental theorem of algebra,"
ibid., Vol. X, 1903, p. 159.
(8) McClintock, "A method for calculating simultaneously all the
roots of an equation," American Journal of Mathematics, Vol. XVII,
1895, p. 89.
(9) Tanner, "A graphical representation of the theorems of Sturm
and Fourier," Messenger of Mathematics, Vol. XVIII, 1889, p. 95.
(10) Kellogg, "A necessary condition that all the roots of an algebraic
equation are real," Annals of Mathematics, Vol. XI, 1908, p. 97.
(11) Lambert, "On the solution of algebraic equations in infinite
series," Bulletin of the American Mathematical Society, Vol. XIV, 1908,
p. 467.
(12) Allardice, "On a limit of the roots of an equation that is independent of all but two of the coefficients," ibid., Vol. XIII, 1907, p. 443.
(13) Dickson, "On the theory of equations in a modular field,"
ibid., Vol. XIII, 1906, p. 8.
(14) Bauer, "Ueber die verschiedenen Wurzeln einer algebraischen
Gleichung," Mathematische Annalen, Vol. LII, 1899, p. 113.
(15) Dedekind, "Ueber Gleichungen mit rationalen Coefficienten,"
Jahresbericht der deutschen Mathematiker-Vereinigung, Vol. I, 1892, p. 33.
(16) Lucas, "Résolution électromagnétique des équations," Comptes
rendus de l'Académie des Sciences, Paris, Vol. CXI, 1890, p. 965.
A very extensive list of additional references to articles may be
found in the Royal Society of London Subject Index Catalogue of Scientific Papers, Vol. I, 1908, pp. 156-87. A selected list of treatises and
articles is contained in Felix Müller's Führer durch die mathematische
Literatur, 1909, pp. 055-62.


VI
THE FUNCTION CONCEPT AND THE FUNDAMENTAL NOTIONS OF THE CALCULUS
By GILBERT AMES BLISS


CONTENTS
SECS.
I. INTRODUCTION .......... 1-3
1-3, The need of a unifying conception in elementary mathematics.
II. VARIABLES AND FUNCTIONS .......... 4-18
4-8, Definitions and examples;
9-10, Graphs of functions;
Functions with discontinuous graphs (10);
11-17, Classification of functions;
The location of the elementary functions (12-14);
Applications in collegiate teaching (15-17);
18, Continuity of a function.
III. THE FUNDAMENTAL NOTIONS OF THE CALCULUS .......... 19-32
19-21, The derivative and its interpretations;
22-25, The anti-derivative;
Relations between a function and its anti-derivative (24);
26-30, The definite integral;
Geometrical and physical quantities as definite
integrals (26-28);
Computation of definite integrals by means of anti-derivatives (29-30);
31-32, Relations between functions and their graphs.
I. INTRODUCTION
1. Euclidean geometry a logical model. The mathematical
historian tells us that the most important contribution by
Euclid to mathematical science was his systematization of
geometrical principles already known to the mathematicians
of his day, rather than the additions which he made to the
science in the form of new theorems. His development of
the structure of Euclidean geometry has itself not been kept
inviolate from criticism in recent years. But whatever may
be the faults of his presentation from the standpoint of present-day methods, it must nevertheless be recognized that he was
among the earliest exponents of a now well-established logical
form for the application of mathematics to the phenomena
of nature. The structure of such an application consists of
two essential parts: first, a set of postulates suggested by our
intuitive interpretation of natural phenomena; and second, a
collection of definitions and of theorems stated in terms of the
definitions and deduced by logical processes from the initial
assumptions. The postulates are the foundation, and the
definitions and theorems the superstructure of the science.
2. Imperfections in the presentation of other subjects. The
Euclidean theory of geometry, which was presented thus early
to mankind in a form attractive alike to the intuitive or to the
logical type of mind, has for centuries occupied a prominent
place in educational curricula, and it is no wonder that the
theory remains to the present day the gem of our elementary
mathematical courses. The marvel is, on the other hand, that
the characteristics of the Euclidean theory which make it
seem logically so complete, and so interesting to the mind
sympathetically inclined to mathematical thinking, have
apparently been overlooked to a very large extent in the presentation of other elementary subjects. Especially is this true
in the case of algebra. To be convinced one needs only to
take a cursory glance through the table of contents of almost
any college or elementary text-book on the subject, and to
note the heterogeneity of the subjects presented. Topics,
related perhaps inherently but with no indicated relationships,
follow each other in a confusion of radicals, exponents, progressions, imaginaries, probabilities, and other algebraic conceptions, in a way which must tend to develop a very disjointed
understanding on the part of the beginner. It is true that
efforts have been made with considerable success, in some of
the more recent text-books, to effect unity of presentation by
grouping the usual elementary algebraic conceptions about
the equation as a central notion. It is true also that heterogeneity of presentation is much less marked in the cases of trigonometry and analytics, largely because the mathematical
material designated by those titles is in itself more homogeneous.
But very little conscious effort seems to have been made to
make these subjects appear in their proper light as interrelated parts of a larger mathematical theory.
3. A remedy in the function concept. It is one of the purposes of the present paper to show how this lack of unity may
be remedied with the help of a very important mathematical
conception which is called a function. The notion of a function has been inserting itself into the consciousness of mathematicians in its most general guise since the time of Dirichlet,
though long before present and recognized in more special
forms. It is interesting to note that the definition of Dirichlet,
which seems very abstract in comparison with those of the
earlier mathematicians, was really devised as a result of his
consideration of a practical problem involving the representation of functions by means of series, that of the flow of heat.
His definition is a simple one, though at first sight it seems
to be too general to serve as a basis for any extensive theory
of functions or to have important applications in other branches
of science. In order to explain it one must first consider what
is meant by a variable in terms of which the notion of a function is defined.
II. VARIABLES AND FUNCTIONS
4. Definition of a variable. A variable is simply a symbol,
say x, which in a given discussion may be used to denote any
one of a given set of objects. By means of a variable we are
enabled to express in terms of a single statement involving x,
a property which is common to all objects of the set. Thus
we may say that for any positive integer x the number 3x + 2
will have a remainder 2 when divided by 3, and we express
thereby a property of each of the positive integers. Or if
we wish to say that any curve joining two given points p and
q is longer than the straight line pq, we may designate by C
any one of the curves and state that C is longer than the line
pq. The set of objects, any one of which is represented by x,
is called the range of the variable, and in elementary mathematics it usually consists of numbers, though the example
just given shows that it may contain elements of a quite different character.
5. Example of a function. The word function was originally used to denote any power of a number, but with the
introduction of the calculus it came to mean any mathematical
expression involving a variable x, the value of which could be
calculated when that of x was assigned. The definition of
Dirichlet is more general still, and we may understand it
better perhaps by examining first some simple examples.
Consider the accompanying table, in which the numbers in
the first column are hours of the day, and the numbers in
the second the corresponding temperatures.

Hour.    Temperature.
8        52.2
9        53.4
10       61.0
11       69.8
12       75.7
1        77.8
2        78.1
3        76.9
4        72.55
5        67.8
6        66.8
7        60.0
8        51.1

If x is a variable which may stand for any one of the hours of the day, and
y a variable representing the temperatures, then the table sets up a correspondence between the values of x and y in such
a way that whenever a value is assigned
to x a corresponding value of y is
uniquely determined, and y is said to be
a function of x. Similarly the mathematical formula $y = x/(x^2-1)$ makes a
unique value of y correspond to any
given value of x with the exception of
$x = \pm 1$, and y is again said to be a
function of x. The range of the variable
x is in this case the totality of real
numbers excluding the values $\pm 1$, and
it may be seen without much difficulty that y takes all values
between $-\infty$ and $+\infty$.
6. Definition of a function. With these examples in mind
we may now define with Dirichlet a single-valued function of a
variable x to be a second variable y so related to x that whenever a
value is assigned to x from the x-range, a corresponding value
of y is uniquely determined in the y-range. The correspondence between the objects in the x- and y-ranges need not
be made, however, by means of what is usually called
a mathematical expression, but may be determined in any
way whatsoever, provided only that it is unique for each object
which can be represented by x. The essentials in the definition
are evidently the independent variable x with its range, and
the correspondence between x and y. The range of the dependent variable y is of course not necessarily exhausted by the
correspondence; it may contain elements which do not correspond to any object in the x-range.
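The two examples of sec. 5 may be set side by side in a short sketch, one function being given only by a table and the other by a formula. The names `temperature_table` and `f` are illustrative and do not occur in the text, and the hour labels rest on the assumption that the table runs from 8 a.m. to 8 p.m., since the source lists the hour 8 twice without comment.

```python
# Tabulated function: hour of the day -> temperature (values from the table in sec. 5).
# Assumption: the table runs from 8 a.m. to 8 p.m., so the hours are written on a
# 24-hour clock to keep every x-value distinct.
temperature_table = {
    8: 52.2, 9: 53.4, 10: 61.0, 11: 69.8, 12: 75.7, 13: 77.8, 14: 78.1,
    15: 76.9, 16: 72.55, 17: 67.8, 18: 66.8, 19: 60.0, 20: 51.1,
}


def f(x):
    """Function given by the formula y = x/(x^2 - 1); undefined at x = +1 and x = -1."""
    if x * x == 1:
        raise ValueError("x = +1 and x = -1 lie outside the x-range")
    return x / (x * x - 1)


# In both cases a single value of y is determined by each x of the range.
print(temperature_table[12])  # the temperature recorded at noon
print(f(2))                   # the value of 2/(4 - 1)
```

The table and the formula play the same part here: each assigns exactly one y to every admissible x, which is all that the definition requires.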
7. Examples of functions. To a person encountering this
definition for the first time, it would no doubt seem very artificial
and too general to be of any great service. It is possible, however, to develop an elaborate and important theory of functions
for the very general case when the range of x is left entirely arbitrary while that of y consists of real numbers, as has recently
been shown by Professor Moore* in his study of a field of mathematics which he has called General Analysis. The generality of
the definition depends not only upon the absence of any specification as to the character of the ranges of x and y, however, but
also upon the freedom which it leaves in the choice of the functional correspondence, even when the ranges consist of real
numbers.
The example of the hours and temperatures above illustrates
the existence of functions which do not involve a mathematical
expression of any kind, and many similar examples could be
found in the tabulated results of statistical observations or of
physical experiments. But the correspondence may also be
entirely artificial. For let x range over all the real numbers
from zero to one. Corresponding to any rational values of
x suppose that y is to have the value +1, and for irrational
values of x the value -1. Then y is a function of x over the
interval from zero to one, and the range of y consists only of
two elements. This example suggests the definition of a constant function, that is, one for which the dependent variable
y takes but a single value over the whole x-range. A constant, in general, may be regarded as a variable for which the
range consists of a single number.
For the function last given the x-range has an infinity of
values while the y-range consists of only two values, and a
table could be arranged which would indicate this functional
correspondence. But if the y-range contained also an infinity
of different y-values, it might be impossible to express the correspondence by listing the values of x opposite the corresponding
values of y in a table. For by far the majority of functions
met with in mathematical theories the correspondence is specified not by means of a table, but by means of a mathematical
formula which takes the place of the table, and which often
implicitly defines the range of the independent variable x,
as well as the correspondence. Thus the formula $y = x/(x^2-1)$
given above defines a function of x which is well defined for
all real values of x except the values $x = \pm 1$, and as has been
noted the y-range consists of all values between $-\infty$ and $+\infty$.

* E. H. Moore, the New Haven Mathematical Colloquium. The lectures
in this volume were delivered by Professor Moore and others in September,
1906.

8. Functions with other than numerical ranges. The
preceding examples illustrate the definition of a function when
the range of the independent variable consists of numbers.
It is easy to find functions for which this range has elements
of a different kind. If two points p and q are joined by a curve
C, the area pqrs shown in the figure is uniquely determined
and depends upon the form of the curve. A functional correspondence is thus set up between the variable C whose range
is the totality of curves joining the points p and q, and the
variable A which represents the area. The statement that
a dependent variable y is a function of an independent variable
x, is usually expressed in the form of an equation y=f(x),
where the symbol f(x) is an abbreviation for the phrase "function of the independent variable x." If we wish, therefore,
to express the fact that the area A described is a function of
C, we may represent it by the symbol A(C). Similarly the
length of the curve C is another function of C which may be
denoted by L(C), as is also the surface area S(C) generated
by the arc C when the whole figure is revolved about the horizontal line as an axis.
FIG. 1.
There is a famous problem from a domain of mathematics
called the calculus of variations, which gives rise to a function
of precisely the type which has just been considered. Suppose
that it is desired to find the curve C along which a marble will
roll in the shortest time from p to q. The time T in this case
depends upon the form of the curve C and is a function T(C).
It is impossible to describe here the interesting controversy
which arose between John Bernoulli, the proposer of the problem,
and his brother James as a result of their rival solutions, or
to undertake a detailed study of the methods by means of which
they found the minimizing curve. Suffice it to say that in
their work is found the origin of the whole subject of the calculus of variations. But it may not be uninteresting to see
the result which they obtained. The minimizing curve is a
cycloid, that is, the locus of a point on the circumference of
a circle which rolls along a straight line. In the figure the
cycloid is shown inverted with its steeper part as it should be
near the point p, so that the marble accumulates a high velocity
at the beginning of its fall.
Similar minimum problems can be stated with respect to
two others of the functions described above. For the length
function L(C) the curve which provides the minimum value
is evidently the straight line joining p and q; and the curve
which describes the surface of minimum area when it is revolved
about the horizontal line is a catenary, whose form is that of
a heavy chain allowed to hang freely from two points of suspension.
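The functions L(C) and T(C) can be tried numerically by replacing each curve with a many-sided broken line. The following is only a sketch under stated assumptions: the value of `G`, the unit radius of the rolling circle, the choice of end points, and the function names are all illustrative and are not taken from the text; friction is neglected and y is measured downward from p.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (an assumed value)


def length(points):
    """L(C): length of the broken line through the given (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def descent_time(points):
    """T(C): time for a particle starting at rest to slide without friction
    along the broken line; y is measured downward from the first point."""
    y0 = points[0][1]
    total = 0.0
    for p, q in zip(points, points[1:]):
        v_p = math.sqrt(2 * G * (p[1] - y0))  # speed from conservation of energy
        v_q = math.sqrt(2 * G * (q[1] - y0))
        # On a straight segment the acceleration is constant, so the
        # average speed is exactly (v_p + v_q) / 2.
        total += math.dist(p, q) / ((v_p + v_q) / 2)
    return total


N = 2000
# Half an arch of the cycloid generated by a circle of radius 1,
# running from p = (0, 0) to q = (pi, 2).
cycloid = [(t - math.sin(t), 1 - math.cos(t))
           for t in (math.pi * k / N for k in range(N + 1))]
line = [cycloid[0], cycloid[-1]]

print("lengths:", length(line), length(cycloid))
print("times:  ", descent_time(line), descent_time(cycloid))
```

Under these assumptions the straight line gives the smaller value of L(C), while the cycloid, though longer, gives the smaller value of T(C).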
An example of a function whose independent variable has
a range of still a different kind would be the shortest or straight-line distance D(q) between a fixed point p and a variable point
q. The independent variable is now the point q, which may
range over the whole plane, while the range of the dependent
variable D is the totality of numbers between 0 and $\infty$. This
and the preceding examples show that there are important
examples in which the elements of the range of the independent
variable are real numbers, arcs of curves, or points of a plane,
and many other examples could be devised to exhibit a variety
of ranges which it would not be possible to enumerate here,
even if the patience of the reader permitted. In all of the
examples which have been given the range of the dependent
variable has consisted of real numbers, but this is also not
necessarily the case, as appears in the theory of the so-called
integral equations and in parts of the General Analysis of
Professor Moore, referred to above.
9. Graphs of functions. For functions both of whose
ranges are real numbers a graphical representation was devised
by Descartes, which is too familiar to need a detailed description here. But a few remarks concerning it may not be amiss.
A horizontal line is taken with a zero point 0 and a unit of
measure (see Fig. 2), so that to each value of x there corresponds a point of the line to the right of 0 if x is positive, and
to the left if negative.  In order to represent the value of
a function at a point x, a perpendicular is erected equal in
length to the number f(x). When f(x) is positive, the distance
is measured upward; when negative, downward. The horizontal axis is called the axis of abscissas, and the vertical
lines are called ordinates. It is customary to erect a
perpendicular called the y-axis at the point 0, but such
a line is not at all essential to the representation of the
function, and it is interesting to note that in some of the
early editions of Descartes' Analytical Geometry the line does
not appear.
FIG. 2.
If the range of x has only a finite number of elements, as
in a statistical table, then all the values of the function may be
plotted in this way, the result being a picture which is much
more suggestive and easy of interpretation than the table of
values itself. For example, the accompanying graph shows
a baby's ages measured in weeks as abscissas, and weights
measured in pounds as ordinates. Even a bachelor's eye
suffices to discover at a glance the unhappier periods which
are suggested only after some study by the table from which
the graph was made. The ordinates under the dots represent
the values of the weight function in this case, the straight
lines being drawn only to assist the eye in passing from one
significant point to the next.
On the other hand if the x-range contains an infinity of
elements it is usually impossible to make a complete picture
of the function. One must be satisfied with plotting as many
points as is convenient or desirable from the nature of the
problem, and these may then be joined by a continuous curve
which will give an idea of the functional values and their variation as x is changed. For example consider the function
$y = x/(x^2-1)$. If only the points indicated by the dots are
actually calculated, the rest of the figure must be drawn freehand. (See Fig. 3.)
FIG. 3.
At the best the graph can only be regarded as an approximate representation of a function, the errors which occur
being essentially of the following two different kinds. Since it
is impossible to represent distances exactly by means of marks
which have finite dimensions, a first source of error would be
in the use of the drawing instruments and would depend not
only upon the inadequacy of the instruments themselves, but
also upon the skill of the draughtsman. A second source of
error lies in our inability to plot more than a finite number of
points and the consequent necessity of filling in arbitrarily
by far the major portions of the graph. The magnitude of
the errors due to the first cause can be estimated only by an
experimental examination of the accuracy of the instruments
and the personal equation of the operator. Similarly it is quite
impossible without experimental evidence to say what the error
will be which is due to the process of " filling in " the curve,
provided that the curve joining the plotted points is drawn
arbitrarily. But if the inaccuracies of the instruments are
neglected, and if it is agreed to join the finite number of points
which are actually plotted by straight lines, then it is possible
to show that certain types of functions are fairly represented
by such broken lines, and to show also that the error of representation for the functional values over a given interval can
be made arbitrarily small by plotting a sufficient number of
points sufficiently near together. The proof of this statement
is made with the help of a property of functions called uniform
continuity, and will be given later in the paper. For the
present it may be stated that all of the functions which occur
in elementary mathematics can be represented with a degree
of accuracy proportionate to the desire and the patience of the
investigator.
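The claim that a broken line represents a function with an error that shrinks as more points are plotted can be tried on $y = x/(x^2-1)$. The sketch below is illustrative only: the interval from 2 to 4 (chosen to avoid $x = \pm 1$), the numbers of plotted points, and the helper name `broken_line_error` are assumptions and not part of the text, and the error is merely sampled at a finite number of places rather than bounded rigorously.

```python
def f(x):
    return x / (x * x - 1)


def broken_line_error(func, a, b, n, checks=50):
    """Largest observed gap between func and the broken line joining
    n + 1 equally spaced plotted points on the interval from a to b."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    worst = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        y0, y1 = func(x0), func(x1)
        for j in range(1, checks):
            t = j / checks
            x = x0 + t * (x1 - x0)
            chord = y0 + t * (y1 - y0)  # ordinate of the straight segment at x
            worst = max(worst, abs(func(x) - chord))
    return worst


errors = [broken_line_error(f, 2.0, 4.0, n) for n in (4, 16, 64)]
print(errors)
```

On this interval each fourfold increase in the number of plotted points reduces the observed error markedly, in keeping with the statement made above.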
10. Functions with discontinuous graphs. From what
has been said it will doubtless be inferred that there exist
functions which cannot properly be represented by a graph,
and in fact the function referred to above which is equal to
+1 for rational values of x between 0 and +1, and equal to
-1 for irrational values, is of this character. For a line parallel to the x-axis representing the values of the function for
rational points would, according to the usual interpretation
of the graph, imply + 1 as the values of the function for irrational
values of x also. Nor is it true, as might be supposed, that
any function for which the functional correspondence is defined
by means of a mathematical formula, can be represented by
a curve. Professor Pierpont * has set up a number of interesting formulas which have curious geometrical interpretations, one of which represents the function having no proper
graph, which has just been mentioned. He begins by considering a function which he calls signum x, or sgn x, and which
is defined by the following conditions:
$$\operatorname{sgn} x = +1 \quad \text{for} \quad 0 < x < 1,$$
$$\operatorname{sgn} x = 0 \quad \text{for} \quad x = 0,$$
$$\operatorname{sgn} x = -1 \quad \text{for} \quad -1 < x < 0.$$
FIG. 4.
This function has the relatively simple formula:
$$\operatorname{sgn} x = \frac{2}{\pi} \lim_{n \to \infty} \arctan nx. \qquad (1)$$
For if x is positive the limit has the value $\pi/2$, if x is zero the
value 0, and if x is negative the value $-\pi/2$.

* The Theory of Functions of Real Variables, p. 202.
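The approach of the expression $\frac{2}{\pi}\arctan nx$ to the values +1, 0, and -1 can be watched numerically. This is a small sketch, with the function name `sgn_approx` and the sample values of x and n chosen only for illustration; a finite n can of course only suggest the limit, not attain it.

```python
import math


def sgn_approx(x, n):
    """The expression (2/pi) * arctan(n x) of formula (1), before the limit is taken."""
    return 2 / math.pi * math.atan(n * x)


for x in (0.5, 0.0, -0.5):
    print(x, [round(sgn_approx(x, n), 4) for n in (1, 10, 1000, 10**6)])
```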
By means of the function sgn x a formula can be found for
the function which takes any given value a for rational values
of x, and any other number b for irrational values. For consider
the function
$$g(x) = a + (b - a) \lim_{n \to \infty} \operatorname{sgn}(\sin^2 n!\,\pi x). \qquad (2)$$
If x is a rational number the expression in the parenthesis
becomes and remains equal to zero for a sufficiently large n,
since $n!\,\pi x$ becomes a multiple of $\pi$. Hence
$$\operatorname{sgn}(\sin^2 n!\,\pi x) = 0,$$
and g(x) has the value a, as a result of the properties of sgn x.
For an irrational x the product $n!\,\pi x$ is never a multiple of $\pi$,
and hence $\sin^2 n!\,\pi x$ is some number between 0 and 1. For
such values of x
$$\operatorname{sgn}(\sin^2 n!\,\pi x) = 1,$$
and g(x) has the value b.
Another example which Professor Pierpont gives is the
function
$$\lim_{n \to \infty} \frac{x}{\frac{1}{n} + x}.$$
For any value of x different from zero this has the value unity,
while for x=0 it is equal to zero.
Still more curious is the function
$$y = \lim_{n \to \infty} \frac{(1 + \sin \pi x)^n - 1}{(1 + \sin \pi x)^n + 1},$$
which has the discontinuous graph shown in the accompanying
figure. If x is any integer this expression is evidently equal
to zero. For any value of x between 0 and 1 the parenthesis
$(1 + \sin \pi x)$ is greater than unity and its nth power approaches
infinity as n increases indefinitely. Hence the limit of the
fraction is +1. On the other hand, if x is between 1 and 2,
then $(1 + \sin \pi x)$ is less than one and its nth power approaches zero as n
increases, so that the limit of the fraction is in this case -1.
FIG. 5. FIG. 6.
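The behavior of the last fraction for a large but finite n can be tabulated as follows. The sketch assumes that the expression is the fraction with numerator (1 + sin πx)^n - 1 and denominator (1 + sin πx)^n + 1; the name `y_approx`, the value n = 200, and the sample points are illustrative choices, and floating-point rounding means that the values at the integers come out only approximately zero.

```python
import math


def y_approx(x, n):
    """The fraction ((1 + sin(pi x))^n - 1) / ((1 + sin(pi x))^n + 1) for a finite n."""
    p = (1 + math.sin(math.pi * x)) ** n
    return (p - 1) / (p + 1)


for x in (0.0, 0.25, 0.5, 1.0, 1.25, 1.5, 2.0):
    print(x, round(y_approx(x, 200), 6))
```

The printed values fall into three groups according as x is an integer, lies between 0 and 1, or lies between 1 and 2, which is the discontinuous behavior described above.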
11. Classification of functions. The examples which have
just been given and those which precede show clearly the
necessity of some methods of classifying functions, if an intelligent study of them is to be made. There are several methods
in use, each of which is important in some branch of the function
theory, but one of them, which will presently be explained,
is especially interesting from the standpoint of the elementary
functions. On the basis of this classification some suggestions
will also be made with regard to the presentation of the elementary subjects, suggestions which it is hoped will not seem
too radical to be useful. In endeavoring to introduce any
pedagogical improvement the teacher is always hampered
by the conservatism of the printed page. An alteration in
method, in order to be successful with elementary students,
must be mild enough to be adapted to the printed machinery
already at hand; or if it is a radical reform, it must be accompanied by a well-written and practical text, in order to be at
all effective or far reaching. The suggestions which are to be
made here are of the milder sort, with the possible exception
of those referring to algebra, where it seems to the writer at
least that a thorough reorganization of subject-matter might
lead to a very great improvement.
12. Algebraic functions. The simplest type of a function is
the polynomial
$$y = a_0 x^m + a_1 x^{m-1} + \dots + a_m,$$
sometimes called a rational integral function, after which come
the rational functions or quotients of two polynomials,
$$y = \frac{a_0 x^m + a_1 x^{m-1} + \dots + a_m}{b_0 x^n + b_1 x^{n-1} + \dots + b_n}. \qquad (3)$$
Both these types of functions are formed with the help only
of the four processes of addition, subtraction, multiplication,
and division, and the next class of functions which would
naturally suggest themselves would be those expressible by
means of the four processes just mentioned with the addition
of extraction of roots. But it is better to regard functions so
constructed as well as the polynomials and rational functions
as belonging to a larger category called algebraic functions
which are defined as follows. Suppose an equation
$$a_0(x)\,y^n + a_1(x)\,y^{n-1} + \dots + a_{n-1}(x)\,y + a_n(x) = 0 \qquad (4)$$
is given, in which the coefficients of the powers of y are themselves polynomials in x. If any value is assigned to x, the
resulting equation in y will have a certain number of roots,
in general n. To any x, therefore, the equation assigns a
number of values of y, and y is said to be a multiple-valued


276


MODERN MATHEMATICS


function of x. Evidently a polynomial or a rational function
is an algebraic function, the equation which y satisfies in the
latter case being easily found from equation (3) by making a
common denominator. It is not so easy to show that any
function which is expressible by means of radicals is algebraic,
but a few examples will indicate very well how this may be
the case. Take the functions
$$y = \sqrt[3]{x + \sqrt{x}}, \qquad y = \sqrt{1+x} + \sqrt{1-x}.$$
By the usual algebraic methods of rationalization the variables
x and y are found to satisfy, respectively, the equations
$$y^6 - 2xy^3 + x^2 - x = 0, \qquad y^4 - 4y^2 + 4x^2 = 0,$$
and in general it can be proved that any function found by
addition, subtraction, multiplication, division, and extraction
of roots, satisfies an equation of the type (4).* The various
values which can be assigned to the radicals account for the
multiple values of the function. If it is remembered that
equations of the fifth degree and higher can be solved by radicals only in special cases, it appears at once that the class of
algebraic functions includes many which cannot be calculated
by means of these elementary processes alone. The generalization of the properties of functions determinable in terms
of radicals to the corresponding properties for algebraic functions
of the most general type, has furnished one of the most fruitful
and interesting fields of mathematical research.
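That a function built from radicals satisfies a polynomial equation in x and y can be checked numerically for the two examples of this section, taken here to be the cube root of x + √x and the sum √(1+x) + √(1-x). The sketch is only a spot check at a few sample values and not a proof; the function names are illustrative, and only the principal real values of the radicals are used.

```python
import math


def y1(x):
    """y = cube root of (x + sqrt(x)), for x >= 0."""
    return (x + math.sqrt(x)) ** (1 / 3)


def y2(x):
    """y = sqrt(1 + x) + sqrt(1 - x), for -1 <= x <= 1."""
    return math.sqrt(1 + x) + math.sqrt(1 - x)


def residual1(x):
    """Left member of y^6 - 2 x y^3 + x^2 - x = 0 evaluated at y = y1(x)."""
    y = y1(x)
    return y**6 - 2 * x * y**3 + x**2 - x


def residual2(x):
    """Left member of y^4 - 4 y^2 + 4 x^2 = 0 evaluated at y = y2(x)."""
    y = y2(x)
    return y**4 - 4 * y**2 + 4 * x**2


for x in (0.1, 0.5, 0.9):
    print(x, residual1(x), residual2(x))
```

The residuals differ from zero only by rounding error. The other determinations of the radicals, which account for the remaining roots of each equation, are not examined here.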
13. Transcendental functions. The trigonometric functions
and the inverse trigonometric functions, the logarithm and the
exponential, as well as an infinity of other functions which
appear only in the higher analysis, do not satisfy any equation
of the type (4) and have been given the name transcendental
functions.† The values of these functions cannot be calculated
analytically by a finite number of additions, subtractions,
multiplications, and divisions, but depend upon an infinity of
such operations indicated by means of a power series. As
examples of such series may be cited the well-known ones for
the sine, logarithm, and exponential:
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots,$$
$$\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots,$$
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots.$$
A function whose values are expressible by means of a power
series is called an analytic function, and it can be shown that
not only transcendental functions, but also all of the algebraic
functions are expressible in this way. Even this very general
category of analytic functions does not exhaust all of the possibilities for functions of a real variable, but for the purposes of
the present paper it will be unnecessary to pursue the classification further. The function (2) which has so often been
used as an illustration before, is an example of a function which
for real values of x is not expressible by means of a power
series, and there are many others. The results of the classification, as far as it has been made, can be summarized most
concisely in the form of the following table:

Analytic functions.
    Algebraic functions.
        Rational functions.
            Polynomials.
            Rational fractions.
        Irrational functions.
            Those expressible by means of radicals.
            Those not so expressible.
    Transcendental functions.
        The trigonometric functions and their inverses.
        The logarithm and the exponential.
        Other functions of less elementary character.
Non-analytic functions.

* See Monograph V, sec. 7; Monograph IV, Appendix II.
† For proof of the transcendence of the numbers e and π see Monograph IX.
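The way in which the values of the sine, logarithm, and exponential depend upon an infinity of elementary operations may be seen by summing more and more terms of their familiar power series. The following is a sketch with illustrative function names and an arbitrarily chosen x = 0.5; the library values from `math` serve only as a standard of comparison.

```python
import math


def sin_series(x, terms):
    """Partial sum of x - x^3/3! + x^5/5! - ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))


def log1p_series(x, terms):
    """Partial sum of x - x^2/2 + x^3/3 - ...; the series converges only for -1 < x <= 1."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))


def exp_series(x, terms):
    """Partial sum of 1 + x + x^2/2! + x^3/3! + ..."""
    return sum(x ** k / math.factorial(k) for k in range(terms))


x = 0.5
for terms in (2, 5, 10, 20):
    print(terms,
          sin_series(x, terms) - math.sin(x),
          log1p_series(x, terms) - math.log(1 + x),
          exp_series(x, terms) - math.exp(x))
```

Each partial sum uses only a finite number of additions, multiplications, and divisions, and the differences printed shrink as further terms are taken.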


14. The trigonometric and exponential functions are transcendental. There is an objection to this classification from
the elementary standpoint, which ought to be mentioned. It
is the difficulty in proving that all functions expressible in
terms of radicals are algebraic, and the necessity of proving
that the transcendental functions do not have this property.
In one of the accompanying monographs * it has been shown
that all the numbers expressible in terms of quadratic radicals
only are the roots of an algebraic equation, and a similar proof
could be made for functions of x, which are so expressible.
But for the radicals of higher orders the problem is a much less
elementary one and cannot be undertaken here. Professor
Pierpont has given a simple proof that the function y=sin x
cannot satisfy an equation of the form (4). For if there
were such an equation, then there would be one of the lowest
degree with the same property and of the form
$$y^n + a_1(x)\,y^{n-1} + \dots + a_n(x) = 0, \qquad (5)$$
where the coefficients are now rational fractions in x. On
account of the periodicity of the sine function, the two
equations
$$y^n + a_1(x + 2m\pi)\,y^{n-1} + \dots + a_n(x + 2m\pi) = 0,$$
$$[a_1(x + 2m\pi) - a_1(x)]\,y^{n-1} + \dots + [a_n(x + 2m\pi) - a_n(x)] = 0,$$
would also have to be satisfied for any integral value of m.
Since a rational fraction in m can vanish for only a finite number
of values, it follows that if m is properly chosen the coefficients
of the last equation do not vanish, and hence the hypothesis
that Eq. (5) is an equation of lowest degree satisfied by y is
contradicted. If the function y=sin x is not algebraic, then
the inverse function x=arc sin y cannot be, for an equation
of the type (4) for one of these functions would determine
the algebraic character of the other also. Similar proofs can
be made for all of the trigonometric functions, and also for
the exponential $y = e^x$ and its inverse $x = \log y$ provided that
imaginary values of the variable x are admitted. The proof
given above depends upon the periodicity of the sine function,
and it is known that the exponential has the similarly imaginary
period $2m\pi\sqrt{-1}$.

* Monograph VIII, secs. 5, 6.
15. Applications of the function concept in collegiate teaching.
In a preceding paragraph it was suggested that the classification
of functions which has been made might be helpful in relating to
each other the different parts of the undergraduate curriculum.
In order to see how this may be done let us first of all consider
the topics which are treated in the elementary courses in their
relation to the table. The subject of study in trigonometry
is the group of functions which have been classified as transcendental, emphasis being laid on the trigonometric functions
and their inverses. The exponential is usually considered
only as an introduction to logarithms, and the logarithmic
function itself only so far as is necessary to enable the student
to make a successful mechanical use of logarithmic tables.
It is hard to say with precision where the topics treated of
in algebra belong in the table, but most of them are related
to the polynomials or rational fractions, and it is proposed to
show precisely how this relationship may profitably be employed.
Analytic geometry, on the other hand, is concerned with the
graphical representation of functions and with the properties
of the elementary algebraic functions which are defined by
equations
$$ay + bx + c = 0,$$
$$ay^2 + (bx + c)y + (dx^2 + ex + f) = 0,$$
of the first and second degrees in x and y.
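An equation of the second degree in x and y defines y as a function of x with at most two values, which may be computed by the ordinary quadratic formula. The sketch below is only an illustration: the function name `y_values` and the sample circle and straight line are assumptions not found in the text, and only real values of y are returned.

```python
import math


def y_values(a, b, c, d, e, f, x):
    """Real values of y satisfying a y^2 + (b x + c) y + (d x^2 + e x + f) = 0."""
    B = b * x + c
    C = d * x * x + e * x + f
    if a == 0:  # the y^2 term is absent, so at most one value of y results
        return [-C / B] if B != 0 else []
    disc = B * B - 4 * a * C
    if disc < 0:
        return []  # no real point of the curve lies above this x
    root = math.sqrt(disc)
    return sorted({(-B - root) / (2 * a), (-B + root) / (2 * a)})


# Circle x^2 + y^2 - 1 = 0: a = 1, b = c = 0, d = 1, e = 0, f = -1.
print(y_values(1, 0, 0, 1, 0, -1, 0.6))
# Straight line 2y + x - 4 = 0: a = 0, b = 0, c = 2, d = 0, e = 1, f = -4.
print(y_values(0, 0, 2, 0, 1, -4, 1.0))
```

For the circle two values of y correspond to each x between -1 and +1, so that y is a two-valued algebraic function of x in the sense of sec. 12.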
16. Objections to present methods. There are numerous
objections which might be made to the way in which these
subjects are usually presented, a few of which will suffice to
show at least the possibility of improvement. In the first
place there is one toward the removal of which much has
recently been done. That is the now somewhat obsolete tendency to confine the graphical study of functions entirely to the
courses in analytics. The graphical representation of a function
is a device of the utmost importance not only in the study of
the conic sections and the straight line, but also in the study
of all the other elementary functions, and the student cannot
be made familiar with it too early in his mathematical course.
A second and more justifiable objection is the lack of attention
given to the exponential and logarithm. It is safe to say that
none of the functions listed in our table have wider or more
frequent applications, and yet there is none which the student
understands with so little thoroughness at the end of his freshman
year, his only study of them having been for the purpose of
enabling him to attain a certain mechanical skill in the use of
logarithmic tables. The lack of unity in algebra courses and
the desirability of graphical methods have already been pointed
out. It may also be added that the elementary notions of the
calculus can be introduced with much profit at suitable points
in both algebra and analytics. In analytics especially,
the process of finding the tangent line to a conic involves the
calculus notion of the slope of the tangent, and yet it is a common custom of writers on the subject to avoid carefully the
notions and notations of the more advanced subject. It is difficult to account for this tendency on the part of our text-book
writers, except on the theory that one should never encroach
on a neighbor's property, a principle which is good when applied
to real estate, but hardly commendable in a scientific treatise.
17. Suggestions for improvement. What are then the
conclusions and the suggestions for improvement which can
be drawn from these objections? In the first place does it
not seem proper that each elementary course, since mathematics
under our present collegiate mechanism must be divided into
courses, should have to do with a particular class of functions,
and should not the purpose of the course as a study of those
functions be set clearly before the student at the beginning
and re-emphasized at proper intervals until it is clearly understood? If the answer is affirmative, not only should trigonometry
be concerned with the elementary transcendental functions
and analytics a study of the simple irrational algebraic functions,
but the subject-matter of algebra should be related to the study
of the rational algebraic functions, the polynomials and the
rational fractions. What is unrelated should be relegated
to its proper place in some other part of the mathematical
curriculum. Furthermore, the treatment of these functions
should be complete as far as possible at the stage in which the
student finds himself, and illumined by a foresight on the part
of the instructor of the conceptions of the calculus. There are
good reasons why the differentiation of the transcendental
functions should not be considered in trigonometry, for the
limiting processes involved are too complicated for the elementary student, but much can be said in favor of the early consideration of derivatives and anti-derivatives of polynomials (notions which will be explained later in this paper) in a course
in algebra, and in favor of the study of the derivatives of the
elementary algebraic functions which occur in analytical
geometry.
Let us outline then a course of study for the freshman year
in college, which is not to depart too radically from the present
plan as usually followed, and yet which may afford a systematic
treatment of the elementary functions. The course should
begin with a consideration of the function concept, by means
of special examples perhaps, and with frequent applications
of the graphical representation of a function which should be
continued throughout the entire course. The exponential
and the logarithm might well be studied next on account of
their importance in numerical computation, in particular in
the plotting of other functions. Their graphs can be readily
drawn without the use of a table, if it is noticed that $y = a^x$
can be plotted very easily, and that the graph of the logarithm
can then be found by simply rotating the plane about a line
through the origin making an angle of 45° with the x-axis.
After these preliminaries the usual course in trigonometry can
be given with considerable economy in time on account of the
familiarity which the student has already gained with graphical
methods and the use of the logarithmic tables.
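The remark about plotting the exponential and the logarithm without a table may be illustrated by a minimal sketch. It assumes the base 2 and integral values of x, and the function names are chosen here for illustration only. Turning the plane over about the line through the origin at 45° to the x-axis merely exchanges the two coordinates of every point.

```python
def exp_points(base, xs):
    """Return the points (x, base**x) of the graph y = base**x."""
    return [(x, base ** x) for x in xs]


def reflect_in_diagonal(points):
    """Reflect points in the line y = x by exchanging their coordinates."""
    return [(y, x) for (x, y) in points]


# For integral x no table is needed: 2**x comes from repeated doubling or halving.
exp_pts = exp_points(2, [-2, -1, 0, 1, 2, 3])
log_pts = reflect_in_diagonal(exp_pts)

for x, y in log_pts:
    print(f"log2({x}) = {y}")
```

Each reflected point (x, y) satisfies $2^y = x$, so the reflected points lie on the graph of the logarithm to the base 2.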
The course in algebra is the one in which it seems that the
notion of a function can be used to effect the greatest improvement. It is perhaps not easy to see how all of the topics usually
studied in algebra can be related to the study of the rational
functions, and on this account a brief outline of a course which
might be given is to be inserted here.
Let the course begin with an explanation of the kinds of
functions which are to be studied, and show by means of
examples, or more generally, that any function formed with
the four elementary operations only is a rational fraction.
This will give plenty of opportunity to exercise the student in
the reduction of complex fractions. Following this a chapter
on operations with polynomials should be given, including
the division equation
$$f(x) = g(x)\,q(x) + r(x),$$
synthetic division, and the computation of the coefficients of
a polynomial
$$a_0(x+a)^m + a_1(x+a)^{m-1} + \dots + a_{m-1}(x+a) + a_m$$
by means of synthetic division.* Then take up linear functions
and study their graphs and intersections, with the aid of determinants of the second order. The theory of quadratic functions affords occasion to emphasize the notion of a root of a
polynomial, and may be used to introduce two new conceptions,
the slope of a curved line by means of which maxima and
minima may be determined, and imaginary numbers. A short
treatment of imaginary numbers and DeMoivre's theorem will
not be amiss at this stage, to be followed by a graphical study
of polynomials of higher degrees, including the theory of maxima
and minima with the help of the derivative. The roots of
polynomials should then be studied systematically with the
remainder theorem as basis, the theorems upon which Horner's
method is based receiving due attention. After a chapter on
the numerical determination of roots, including Horner's
method, take up the study of polynomials of special types.
For example the polynomials $x^n - a$ lead to the theory of radicals and fractional exponents, whose properties can all and
perhaps best be derived from the equation $x^n = a$; the polynomial $(a+x)^m$ suggests the binomial theorem, and the polynomials
$$a + (a+x) + (a+2x) + \dots + (a+nx),$$
$$a + ax + ax^2 + \dots + ax^n$$
are progressions. When the elementary properties of polynomials have been exhausted, the graphical theory of rational
fractions may be developed, followed by a study of indeterminate forms and undetermined coefficients as applied to
partial fractions. Chapters on series, permutations and combinations, and probabilities fit less easily into the elementary
function theory, although series may be regarded as a natural
generalization from polynomials with a finite number of terms
to those with an infinite number, and a new proof of the binomial
theorem might be made the excuse for the introduction of the
formulas for combinations. The number of combinations
of n things k at a time is $\frac{n!}{k!\,(n-k)!}$, and it can be argued that
the number of terms of the form $a^{n-k}x^k$ which occur in the
product $(a+x)^n$ is also equal to this number. No mention
has been made of a place for probabilities or the theory of
determinants. The former might well give place to topics
which are more important at this stage of the student's course,
and the latter really belongs in a course in the theory of equations, or else in solid analytical geometry.

* See for example Fine's College Algebra, sec. 422.
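The use of synthetic division proposed in this outline, both for the division equation and for rearranging a polynomial in powers of a binomial, may be sketched as follows. The function names and the sample polynomial $2x^3 - x + 5$ are chosen here for illustration; coefficients are listed from the highest power downward, and powers of (x + a) correspond to taking c = -a.

```python
def synthetic_division(coeffs, c):
    """Divide a polynomial by (x - c); return (quotient coefficients, remainder)."""
    row = [coeffs[0]]
    for a in coeffs[1:]:
        # Each new entry is the next coefficient plus c times the previous entry.
        row.append(a + c * row[-1])
    return row[:-1], row[-1]


def expand_in_powers(coeffs, c):
    """Coefficients of the same polynomial arranged in powers of (x - c)."""
    remainders = []
    quotient = list(coeffs)
    while quotient:
        # The successive remainders are the coefficients, lowest power first.
        quotient, remainder = synthetic_division(quotient, c)
        remainders.append(remainder)
    return remainders[::-1]


p = [2, 0, -1, 5]                      # 2x^3 - x + 5
print(synthetic_division(p, 1))        # quotient 2x^2 + 2x + 1, remainder p(1)
print(expand_in_powers(p, 1))          # coefficients in powers of (x - 1)
```

The remainder of the first division is the value of the polynomial at x = c, which is the remainder theorem mentioned above.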
The course in plane analytic geometry needs but few remarks
aside from those which have been made above. It should
be devoted to the theory of the simple irrational functions,
including the solution of simultaneous quadratic equations,
with applications to intersections of conics, and an introduction
to the process of finding the derivative of an algebraic function
with its interpretation in the problem of determining the
slopes of tangents.
The detailed study of the differentiation of transcendental
functions and of algebraic functions in general must of course
be left for the course in calculus, where functions are studied
from a somewhat different standpoint. In the calculus the
continuity, differentiation, and integration of functions hold
the most prominent place in our attention, and as the basis
of the behavior of functions under these operations, other
classifications besides the one given in the table above can also
be made, which are more important for purposes of the higher
analysis.
18. Continuity of a function.  A discussion of functions
would not be complete without a description of what is meant
by the property of continuity mentioned in the preceding
paragraph. Speaking very roughly, a function is continuous
when it has an unbroken graph. Thus the function $y = \frac{1}{x^2 - 1}$
is continuous for every value of x except the values $x = \pm 1$.
The function (1), Sec. 10, is not continuous at $x = 0$, for its functional values jump from $-1$ to $0$
and to $+1$, as x increases through
this value. Analytically a function
f(x) is said to be continuous at a
value x = a if a belongs to the range
of x-values for which f(x) is defined,
and if the difference $f(x) - f(a)$ can
be made arbitrarily small by taking
x sufficiently near to a. If a function
has this property it is evident that
f(x) must approach f(a) as x approaches a. The definition is
made still more precise by saying that f(x) is continuous at x = a provided that for any positive number $\epsilon$, however small, a second
positive number $\delta$ can be found, such that $f(x) - f(a)$ is
numerically less than $\epsilon$ whenever x differs from a by less than $\delta$.
Graphically interpreted, this means again that on the interval
from $a - \delta$ to $a + \delta$ the difference of any pair of ordinates, f(x)
and f(a), of the curve y = f(x) is less than $\epsilon$.

[FIG. 7. The interval from $a - \delta$ to $a + \delta$ about the value a.]
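The precise definition can be tried numerically. The following is a minimal sketch, assuming the function $f(x) = x^2$, the value a = 1, and the choice $\delta = \min(1, \epsilon/(2|a| + 1))$, none of which comes from the text. Sampling the interval is only a check and not a proof.

```python
def within_epsilon(f, a, delta, eps, samples=1000):
    """Check at sampled x with |x - a| < delta whether |f(x) - f(a)| < eps."""
    for i in range(1, samples):
        x = a - delta + 2 * delta * i / samples
        if abs(f(x) - f(a)) >= eps:
            return False
    return True


def square(x):
    return x * x


a, eps = 1.0, 0.1
# Since |x^2 - a^2| = |x - a| |x + a|, and |x + a| < 2|a| + 1 when |x - a| < 1,
# this delta is small enough for the given eps.
delta = min(1.0, eps / (2 * abs(a) + 1))

print(within_epsilon(square, a, delta, eps))   # the chosen delta suffices
print(within_epsilon(square, a, 0.5, eps))     # a careless delta does not
```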
It will be understood readily, from their graphs, and it may
be proved analytically, that polynomials are continuous functions for every value of x, and that a rational fraction is continuous at every value except those which make its denominator
vanish. It is true of the other elementary functions also that
they are continuous for every value of x, with the possible
exception of certain isolated ones. Thus the trigonometric
sine and cosine are everywhere continuous, while the tangent
becomes infinite and therefore has a discontinuity for values
of x which are odd multiples of $\frac{\pi}{2}$. But other functions may
be discontinuous in a much more complicated way, as in the
case of the function (2) which is discontinuous at every point.
The continuity properties of the elementary functions are
evidently relatively simple, and we may therefore leave them
at this point in order to consider other important properties
of functions which occur in the calculus.
III. THE FUNDAMENTAL NOTIONS OF THE CALCULUS
19. The three fundamental notions of the calculus. The
differential and integral calculus has to do with three fundamental notions associated with functions, to which are due
most of the applications of the function theory in geometry,
mechanics, and physics, as well as other branches of science.
These three conceptions are called the derivative, the antiderivative or indefinite integral, and the definite integral. All
three may be interpreted geometrically and illustrated simply
by means of polynomials, and it is proposed to explain them
briefly here. The real difficulties of the calculus arise in applying the fundamental notions mentioned to the irrational algebraic and transcendental functions.
20. The derivative function and its interpretations. Let
us agree to consider from this point on in our discussion only
functions f(x) which are defined for x on the whole range of
real numbers, or on a certain interval $a \le x \le b$ of that range.
If x is thought of as indicating the time at any moment and
increasing uniformly from the value at one end of its range
to that at the other, the variable y=f(x) will simultaneously
change in value. At each value of x the function will have a
certain rate of change relatively to x, which may be defined
in the following way. Consider an interval of x-values
between $x$ and $x + \Delta x$, where $\Delta x$ is simply a symbol used to
denote a quantity which is to be added to x. At the value
$x + \Delta x$ the function y will have a value which may be represented
by $y + \Delta y = f(x + \Delta x)$, and the difference between the values
of y at the beginning and end of the interval is therefore
$$\Delta y = f(x + \Delta x) - f(x).$$
The quotient
$$\frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x} \qquad (6)$$
represents then the average rate of change of the function as
x varies from $x$ to $x + \Delta x$. The limit of this quotient, as $\Delta x$
decreases in size and approaches zero, is what is meant by the
rate of change of the function at the value x. Evidently if
this limit exists it will be a variable which is uniquely determined
at each value of x, and is therefore itself a new function usually
denoted by the symbol $f'(x)$.
The function f'(x), which is called the derivative or rate
of change of f (x), does not exist for every function, as might
easily be shown for some of those which have already been
defined. But for the elementary functions the rate of change
can always be found. The manner in which it is calculated
can be well illustrated by the familiar problem of the falling
body. When a heavy particle falls from rest, the distance
through which it has fallen in the time t is a function of t defined
by the well-known formula
$$s = \tfrac{1}{2} g t^2.$$
If the distance fallen through in the time $t + \Delta t$ is denoted by
$s + \Delta s$, then the equation
$$s + \Delta s = \tfrac{1}{2} g (t + \Delta t)^2$$
holds, and the average velocity during the time $\Delta t$ is
$$\frac{\Delta s}{\Delta t} = \tfrac{1}{2} g\,\frac{(t + \Delta t)^2 - t^2}{\Delta t} = g t + \tfrac{1}{2} g\,\Delta t.$$
As $\Delta t$ approaches zero this average velocity approaches the
limit gt, which is the actual velocity of the falling body at any
given moment t.
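The approach of the average velocity to the limit gt can be watched numerically. This is a small sketch only; the value g = 32.2 and the moment t = 2 are assumed for illustration.

```python
G = 32.2  # feet per second per second, an assumed value of g


def distance(t):
    """Distance fallen from rest in the time t."""
    return 0.5 * G * t * t


def average_velocity(t, dt):
    """Average velocity over the interval from t to t + dt."""
    return (distance(t + dt) - distance(t)) / dt


t = 2.0
for dt in (1.0, 0.1, 0.01, 0.001):
    # Each value exceeds G * t by (1/2) * G * dt, which shrinks with dt.
    print(dt, average_velocity(t, dt))
```

The printed averages differ from gt by half of g times the interval, in agreement with the quotient computed in the text.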
The rate of change which has just been calculated was that
of a very simple polynomial in t. The rate of change of any
polynomial can readily be found by a similar method with the
help of the binomial theorem. For consider first the function
$y = ax^n$. By the process described above the value of the average
rate of change in the interval from $x$ to $x + \Delta x$ is the quotient
$$\frac{\Delta y}{\Delta x} = \frac{a\{(x + \Delta x)^n - x^n\}}{\Delta x} = a n x^{n-1} + \text{terms containing powers of } \Delta x.$$
Hence the rate of change of $ax^n$ is the function $a n x^{n-1}$, a formula
which holds for any positive integral value of n. Similarly
if y is the polynomial
$$y = 2x^3 - x + 5,$$
the average rate of change in the interval between $x$ and $x + \Delta x$
will be
$$\frac{\Delta y}{\Delta x} = 2\,\frac{(x + \Delta x)^3 - x^3}{\Delta x} - \frac{(x + \Delta x) - x}{\Delta x},$$
and the limit of this expression is the derivative function $6x^2 - 1$.
From this last example it may be inferred that the derivative of
any polynomial can be found by applying the formula for the
rate of change of $ax^n$ to each term separately and adding the results.
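The term-by-term rule lends itself to a short sketch. The representation of a polynomial by the list of its coefficients, lowest power first, and the function names are assumptions made here for illustration.

```python
def derivative(coeffs):
    """Apply the rule a*x**n -> a*n*x**(n-1) to each term; coeffs[k] multiplies x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


p = [5, -1, 0, 2]            # 2x^3 - x + 5
dp = derivative(p)           # coefficients of 6x^2 - 1
print(dp)

# Compare with the average rate of change over a small interval.
x, dx = 1.5, 1e-6
quotient = (evaluate(p, x + dx) - evaluate(p, x)) / dx
print(quotient, evaluate(dp, x))
```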
The above definition of the derivative function as a rate
of change is the one which gives this function importance in
mechanical problems, but the derivative has also an interesting
geometrical interpretation. Suppose that the function y=f (x)
has the graph shown in the accompanying figure. At any
value of x the vertical line xp has a length equal to the value
of the function f(x), and at $x + \Delta x$ the corresponding ordinate
from $x + \Delta x$ to q has the value $f(x + \Delta x)$. Hence, in Fig. 8,
$$pr = \Delta x, \qquad qr = f(x + \Delta x) - f(x),$$
and the value of the quotient Eq. (6) is evidently the same
as that of the quotient $\frac{qr}{pr}$, the slope of the secant pq. As $\Delta x$
approaches zero the point q approaches p and the secant pq
approaches the tangent at p as a limiting position. The slope
of pq must therefore simultaneously approach the slope of
the tangent, so that the value of the derivative function f'(x) is
numerically equal to the slope of the tangent line pt.
21. Maxima and minima of functions. Perhaps the most
important application of the geometrical notion of a derivative
is in the determination of the maximum and minimum values
of functions. Evidently the slope of the tangent at the maximum and minimum points, a, b, c, Fig. 8, must have the value
zero. If, therefore, the derivative function f'(x) can be found
for a given function f(x), then the maximum or minimum
values of f(x) will be determined by values of x for which f'(x)
vanishes.

[FIG. 8. Graph of y = f(x), with the secant pq and the tangent pt.]
[FIG. 9. The piece of tin with squares of side x cut from its corners.]
As an example, suppose that it is required to find the dimensions of the largest box which can be made by cutting squares
of side x out of a piece of tin as in Fig. 9, and then folding along
the dotted lines. If the dimensions of the tin are 4×6 inches
the volume of the box will be a function of x defined by the
equation
$$v = (6 - 2x)(4 - 2x)x = 24x - 20x^2 + 4x^3.$$
This function has the graph shown in Fig. 10. The slope of
the tangent to the curve at any point is given by the derivative function
$$v' = 24 - 40x + 12x^2,$$
which must vanish at the point a where v is a maximum. The
roots of the last function are
$$x = \frac{5 + \sqrt{7}}{3}, \qquad x = \frac{5 - \sqrt{7}}{3},$$
the latter being the value of x for the point a. In order, therefore, to get a box of the greatest capacity, the corners must
be cut in a distance equal to $\frac{5 - \sqrt{7}}{3}$ inches.
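The computation may be checked with a brief sketch based on the volume formula $v = (6 - 2x)(4 - 2x)x$ given above. The variable names are illustrative, and the derivative $24 - 40x + 12x^2$ is obtained from that formula by the rule for polynomials.

```python
import math


def volume(x):
    """Volume of the box when squares of side x are cut from the corners."""
    return (6 - 2 * x) * (4 - 2 * x) * x


# Roots of the derivative 12x^2 - 40x + 24 by the quadratic formula.
a, b, c = 12.0, -40.0, 24.0
disc = b * b - 4 * a * c
roots = [(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)]

best = roots[0]              # the smaller root, (5 - sqrt(7)) / 3
print(best, volume(best))
```

The larger root exceeds 2, half the shorter side of the tin, and so cannot be a depth of cut; only the smaller root gives a box.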
[FIG. 10. Graph of the volume v, with its maximum at the point a.]

If a function f(x) has everywhere the same value c, its rate
of change is evidently zero, and its graph is a straight line
parallel to the x-axis. Conversely it is reasonable to infer
that any function f (x) whose rate of change is zero must have
a graph which is a straight line parallel to the x-axis, and must
therefore have the same value for every value of x. Consider
now two functions f (x) and g(x) which have the same rate of
change. Their difference f(x)-g(x) will be a new function
of x whose rate of change is everywhere zero, since it is the
difference of the rates of f (x) and g(x). But it has just been
seen that such a function is always equal to a constant c, and
it follows at once that if two functions have the same derivative
they are always related to each other by an equation of the form
f (x) =g(x) + c.
22. The Anti-derivative functions.  With this remark in
mind we may undertake a study of the second fundamental
notion of the calculus, that of the anti-derivative. It has
already been seen that in general any function f(x) has associated with it a derivative function f'(x) which expresses
its rate of change. But it may also be asked whether or not
there exists a function of which f(x) is itself the derivative.
The answer is that in general such a function exists, and it is
called the anti-derivative of f(x). It is easy to find an anti-derivative for any polynomial by inspection, if the formula
for the derivative of $x^n$ is borne in mind. By an application
of this formula it is seen at once that the function $\frac{a x^{n+1}}{n+1}$ has
for its derivative $a x^n$, and hence is an anti-derivative of $a x^n$.
The anti-derivatives of each term of a polynomial can therefore
be found by adding one to the exponent of each term, and dividing
the terms by the exponent so increased. The anti-derivative of
the whole polynomial is then the sum of these separate anti-derivatives. For example the polynomial
$$7x^6 - 12x^2 + 5$$
is the rate of change of the polynomial
$$x^7 - 4x^3 + 5x,$$
as may be verified easily by applying to the last polynomial
the formula previously given for differentiation.
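The rule of adding one to each exponent and dividing by the exponent so increased can be sketched in a few lines, and verified by differentiating the result. The coefficient-list representation, lowest power first, and the function names are assumptions made for illustration; exact fractions are used so that the division introduces no rounding.

```python
from fractions import Fraction


def antiderivative(coeffs):
    """Raise each exponent by one and divide by the new exponent (constant taken as 0)."""
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(coeffs)]


def derivative(coeffs):
    """Term-by-term derivative; coeffs[k] multiplies x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


p = [5, 0, -12, 0, 0, 0, 7]          # 7x^6 - 12x^2 + 5
a = antiderivative(p)                # x^7 - 4x^3 + 5x
print([str(c) for c in a])
print(derivative(a) == p)            # differentiating recovers the polynomial
```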
The anti-derivative of a function f (x) is unlike the derivative
in that it is not uniquely determinable when f (x) is given.
For convenience let us denote the anti-derivative by a(x),
the letter a serving to indicate the relation between the two
functions. If A(x) is any other anti-derivative of f (x), then
A(x) and a(x) by definition have the same derivatives and they
must be related by an equation of the form
A(x) =a(x) +c.
It follows that although the anti-derivative is not unique, yet
if one anti-derivative a(x) is known, then all the others are found
by adding constants to a(x).
23. A typical mechanical application. One of the uses of
the anti-derivative is well illustrated by the problem of determining at any moment the height of a ball thrown vertically
upward with a given initial velocity. Physical experiments
tell us that the velocity of the ball will decrease uniformly
by an amount equal to g in each second, where g is approximately 32.2 feet per second. In other words the rate of change of the
velocity is a constant -g. If any anti-derivative of -g were
known, the velocity v would necessarily differ from it by a
constant only. Such an anti-derivative can readily be found
by means of the formula given above, the result being $-gt$,
and the corresponding expression for v is
$$v = -gt + c.$$
The constant c may be determined in terms of the initial
velocity $v_0$ at the time $t = 0$ when the ball was thrown. For
since the equation just written is true for all values of t, it will
be true also when $t = 0$, and it follows readily that $c = v_0$. In
a similar way a formula for the height s in terms of the time
can be derived by seeking an anti-derivative of its rate of change
v. The value of the anti-derivative is $-\tfrac{1}{2}gt^2 + v_0 t$, and hence
this function and s must satisfy an equation of the form
$$s = -\tfrac{1}{2}gt^2 + v_0 t + d.$$
Here the constant d turns out to be zero on account of the fact
that $s = 0$ when $t = 0$, and the final formula for s is therefore
$$s = v_0 t - \tfrac{1}{2}gt^2.$$
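The two formulas just obtained may be tried on a particular throw. In this sketch the initial velocity of 64.4 feet per second is an assumed value, chosen so that the ball is at rest after exactly two seconds.

```python
G = 32.2  # feet per second per second, as in the text


def velocity(t, v0):
    """Velocity v = v0 - g t of the ball at the time t."""
    return v0 - G * t


def height(t, v0):
    """Height s = v0 t - (1/2) g t^2 of the ball at the time t."""
    return v0 * t - 0.5 * G * t * t


v0 = 64.4                    # assumed initial velocity
t_top = v0 / G               # the velocity vanishes at the highest point
print(t_top, height(t_top, v0))

# The rate of change of the height agrees with the velocity formula.
t, dt = 0.5, 1e-6
print((height(t + dt, v0) - height(t, v0)) / dt, velocity(t, v0))
```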
The problem of the thrown ball, and many others involving
similar principles, show clearly the importance of having a
method for finding an anti-derivative, as well as the derivative, of any given function.  The integration of a function
is the term applied to the process of finding an anti-derivative,
and differentiation is the process of finding the derivative.
One of the chief problems of the calculus is the determination
of derivatives and anti-derivatives for as many different types of
functions as possible.
24. Relations between a function and its anti-derivatives.
The graphs of the functions a(x) and f (x) are related to each
other by two very interesting properties, one of which follows
immediately from  the definition of an anti-derivative. For
any value of x the slope of the tangent to the anti-derivative
curve, at the point n in Fig. 11, is equal numerically to the
number of linear units in the corresponding ordinate xq of the
original curve y=f (x). Evidently when y=a(x) has a maximum or minimum the curve y =f (x) must intersect the x-axis,
since the slope of the former, and therefore the ordinate to the
latter, is zero at such a point.
[FIG. 11. The curve y = f(x) and the anti-derivative curve y = a(x), with the ordinates at $x_0$, $x$, and $x + \Delta x$.]

The second relation between the curves is more interesting
and more important, but in order to exhibit it we must first
prove a property of the curve y = f(x) itself. Consider the
area A bounded by the two ordinates at $x_0$ and x, the curve
itself, and the x-axis. For every value of x the value of A is
uniquely determined, and according to the definition of a
function, it is therefore a function A (x). The derivative of
this function can readily be calculated. For the difference
$A(x + \Delta x) - A(x)$ is the area under the curve bounded by the
ordinates at $x$ and $x + \Delta x$ in Fig. 11. This latter area is less
than the rectangle whose corners are x, v, s, $(x + \Delta x)$ in
the figure, greater than the rectangle x, t, u, $(x + \Delta x)$, and
therefore equal to some rectangle intermediate between the
two whose upper side cuts the curve in a point v with an
abscissa which may be denoted by $x_1$.
Since the altitude of this rectangle is $f(x_1)$, its area has the
value $f(x_1)\,\Delta x$, and
$$A(x + \Delta x) - A(x) = f(x_1)\,\Delta x.$$
The quotient
$$\frac{A(x + \Delta x) - A(x)}{\Delta x} = f(x_1)$$
will therefore have the limit f(x), since the value of $x_1$ is always
between $x$ and $x + \Delta x$, and must approach x as $\Delta x$ approaches
zero. We have then this striking result that the rate of change
of the area A(x) is numerically equal to the length of the ordinate
f (x) at the boundary of the area.
25. Representation of an area by a line. Consider now
the two functions A(x) and a(x). They are both anti-derivatives of the function f (x) and hence must satisfy an equation
of the form
$$A(x) = a(x) + c, \qquad (7)$$
where c is a constant whose value may be determined by putting
$x = x_0$. The value of $A(x_0)$ is seen to be zero, so that for $x = x_0$
the last equation becomes
$$0 = a(x_0) + c,$$
and the relation (7) takes the form
$$A(x) = a(x) - a(x_0). \qquad (8)$$
Interpreted geometrically this important equation means
that the number of square units in the area A(x) is equal to the
number of linear units in the line mn, which is the difference of
the ordinates a(x) and $a(x_0)$ of the anti-derivative curve. (See
Fig. 11.)
Consider for example the curves
$$y = 3x^2 = f(x), \qquad y = x^3 + 1 = a(x).$$
The area under $y = 3x^2$ between the origin and the ordinate
at $x = 2$ is equal numerically to the length of the line mn, which
in this case is
$$a(2) - a(0) = 9 - 1 = 8.$$
The curve $y = e^x$, shown in the accompanying graph, where
$e = 2.718\ldots$, has the interesting property that it is its own
derivative curve. Hence the area enclosed between the curve,
the x-axis, and any two ordinates is equal numerically to the
difference between the two ordinates.

[FIG. 12. The curve $y = 3x^2$ between x = 0 and x = 2.]
[FIG. 13. The curve $y = e^x$.]
Similarly the area under any arch of the cosine curve can
be calculated as soon as it is known that the sine is its anti-derivative. For by the theorem just proved this area is equal
to the difference
$$\sin\frac{\pi}{2} - \sin\left(-\frac{\pi}{2}\right) = 2.$$

[FIG. 14. The curves $y = \sin x$ and $y = \cos x$.]
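The three areas of this section can be compared with direct approximations by thin rectangles. The sketch below assumes rectangles of equal width whose heights are taken at the middle of each interval, a choice made here for convenience and not prescribed by the text.

```python
import math


def riemann_sum(f, lo, hi, n=100000):
    """Sum of n rectangles of equal width, heights taken at the midpoints."""
    dx = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * dx) for k in range(n)) * dx


# Area under y = 3x^2 from 0 to 2, against a(2) - a(0) with a(x) = x^3 + 1.
print(riemann_sum(lambda x: 3 * x * x, 0.0, 2.0), (2 ** 3 + 1) - (0 ** 3 + 1))

# Area under y = e^x from 0 to 1, against the difference of the two ordinates.
print(riemann_sum(math.exp, 0.0, 1.0), math.exp(1.0) - math.exp(0.0))

# Area under one arch of the cosine curve, against sin(pi/2) - sin(-pi/2).
print(riemann_sum(math.cos, -math.pi / 2, math.pi / 2), 2.0)
```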
26. The definite integral. Fluid pressure.  The relation
which has just been exhibited between the derivative and antiderivative curves is interesting geometrically, but its importance
really lies in its application to the evaluation of the third


THE FUNCTION CONCEPT AND THE CALCULUS


295


fundamental notion of the calculus, the definite integral. Let
us consider first some examples which lead to definite integrals.
Suppose that a cylindrical vessel full of water is at hand and
that it is required to find the pressure of the water on the sides
of the vessel. It is a well-known principle of physics that the
horizontal pressure at any point in the liquid is the same as
the pressure vertically downward. If w is the weight of a
cubic unit of water, and x the depth of the point in question,
then the pressure per unit of area at that depth is equal to the
weight $w \cdot x$ of a column of water one square unit in cross
section and x units high. Let the cylindrical surface between
the top and the bottom of the liquid
be divided by planes parallel to the
bottom of the vessel into n horizontal
rings of width $\Delta x_1, \Delta x_2, \ldots, \Delta x_n$. If r is
the radius of the cylinder, then the area of
any one of these rings will be of the form
$2\pi r\,\Delta x_k$. The pressure on this area is less
than the product of $2\pi r\,\Delta x_k$ by the pressure
at the lower edge of the ring, greater
than the product of $2\pi r\,\Delta x_k$ by the pressure
at the upper edge, and therefore, the unit of weight being so chosen that $w = 1$, equal
to $2\pi r x_k\,\Delta x_k$, where $x_k$ is some properly
chosen depth in the interval $\Delta x_k$ between the two extremes.
The sum
$$2\pi r\{x_1\,\Delta x_1 + x_2\,\Delta x_2 + \dots + x_n\,\Delta x_n\} \qquad (9)$$
is the total pressure. In its present form this sum would be
difficult to calculate on account of the indeterminateness of
the values $x_k$, but it turns out that the limit of the sum as the
intervals $\Delta x_k$ are decreased in size, can be very easily found by
a rule which will be explained a little later. Since the sum
is always equal to the desired pressure its limit will have the
same value.

[FIG. 15. The cylindrical vessel of radius r and depth h, with a ring of width $\Delta x_k$ at the depth x.]
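Although the depths $x_k$ are not known exactly, the sum (9) can be enclosed between two sums that are easily computed, one using the depth at the upper edge of each ring and one the depth at the lower edge. The sketch below assumes rings of equal width and a unit of weight for which w = 1, as the formulas of the text appear to suppose; the numerical values of r and h are illustrative.

```python
import math


def pressure_bounds(r, h, n):
    """Lower and upper sums for the total pressure, using n rings of equal width."""
    dx = h / n
    lower = sum(2 * math.pi * r * (k * dx) * dx for k in range(n))        # upper-edge depths
    upper = sum(2 * math.pi * r * ((k + 1) * dx) * dx for k in range(n))  # lower-edge depths
    return lower, upper


r, h = 1.5, 4.0
for n in (10, 100, 1000):
    lower, upper = pressure_bounds(r, h, n)
    print(n, lower, upper)

# Both sums close in on the same limit as the rings are made thinner.
print(math.pi * r * h * h)
```

The difference of the two sums is $2\pi r h^2 / n$, so that any admissible choice of the depths $x_k$ gives a sum with the same limit.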
It would be laborious to write down for many examples
a detailed description of a sum such as (9) and its limit,
and consequently a notation has been devised which suggests
at a glance the essential steps in the process. For the example
just given the limit is denoted by the symbol
$$\int_0^h 2\pi r x\,dx, \qquad (10)$$
where h denotes the depth of the water. In this notation the
integral sign $\int$ is a metamorphosed old English letter s, and
suggests that the limit of a sum has been taken; the limits 0
and h indicate the interval for which the sum has been constructed; and the "integrand" $2\pi r x\,dx$ shows the nature of
the terms which have been summed. The whole expression
is called the definite integral of the function $2\pi r x$ between the limits
0 and h.
27. Volumes of solids of revolution. Another simple problem which may be solved with the help of a definite integral
is that of finding the volume of a cone. Let the cone be
generated by revolving the triangle shown in Fig. 16 about
the x-axis. The hypotenuse of the triangle is a part of the
graph of the function $y = ax/h$, since for any point of it y and
x have the ratio a:h. Divide the interval from 0 to h into n
parts $\Delta x_k$ as before. The volume generated by the trapezoid
over $\Delta x_k$ will be equal to that generated by a properly chosen
rectangle with base $\Delta x_k$ and altitude $y_k = \frac{a x_k}{h}$. The volume
generated by the rectangle is cylindrical and equal to the product of its base $\frac{\pi a^2 x_k^2}{h^2}$ by its altitude $\Delta x_k$. The whole volume
of the cone will then be a sum of terms of the type $\frac{\pi a^2 x_k^2\,\Delta x_k}{h^2}$,
and according to the description of the definite integral symbol
given above, the limit of this sum can be denoted by
$$\frac{\pi a^2}{h^2}\int_0^h x^2\,dx. \qquad (11)$$

[FIG. 16. The triangle with base h along the x-axis and altitude a, with a strip over the interval $\Delta x_k$.]
The volume of a cone of course can be calculated by the
methods of elementary geometry. But the process just described
enables us to find with equal ease an expression for the volume
generated by revolving about the x-axis the area $x_0 p q x$ (Fig.
11) under any arbitrary curve y = f(x), a problem quite beyond
the scope of the usual elementary methods. The only differences
in this case are that the type of the terms to be summed is
$\pi f^2(x_k)\,\Delta x_k$ instead of $\frac{\pi a^2 x_k^2\,\Delta x_k}{h^2}$, and the interval over which
the sum is to be taken extends from $x_0$ to x instead of from
0 to h. The definite integral expressing the value of the volume
has therefore the form
$$\pi \int_{x_0}^{x} f^2(x)\,dx. \qquad (12)$$
28. Areas. The area $x_0 p q x$ in Fig. 11 can also be expressed
as a definite integral. For the part of the area underneath
the curve and over the interval $\Delta x_k$ is less than the product
of $\Delta x_k$ by the highest ordinate over the interval, greater than the
product of $\Delta x_k$ by the shortest ordinate, and therefore equal
to $\Delta x_k$ multiplied by some intermediate ordinate $f(x_k)$. The
total area is consequently a sum of terms of the type $f(x_k)\,\Delta x_k$ and is equal to the definite integral
$$\int_{x_0}^{x} f(x)\,dx,$$
which is the limit of this sum as the $\Delta x_k$ approach zero.
29. Computation of definite integrals. The fundamental
theorem. The fact that the area $x_0 p q x$ can be expressed as
a definite integral suggests at once a formula by means of which
the values of many definite integrals can be calculated with considerable ease. In discussing the relation between the curves
belonging to a function and its anti-derivative, it was found that
the area $x_0 p q x$ for the curve y = f(x) is equal to the difference
of the ordinates of the anti-derivative curve at the values
$x_0$ and $x$. By comparing these two results we have at once a
remarkable theorem which is called the fundamental theorem
of the integral calculus. According to it, the value of the definite
integral
$$\int_{x_0}^{x} f(x)\,dx = \lim\,\{f(x_1)\,\Delta x_1 + f(x_2)\,\Delta x_2 + \dots + f(x_n)\,\Delta x_n\}$$
is given by the formula
$$\int_{x_0}^{x} f(x)\,dx = a(x) - a(x_0),$$
where the function a(x) is any anti-derivative of the function f(x).
30. Applications. The values which  the formula would
give for the definite integral if two different anti-derivatives
were used are evidently the same, since the difference of the
anti-derivatives is always a constant. The theorem has been
derived with the help of geometrical conceptions, but the
definite integral is really an analytic notion with a geometrical
interpretation, and the theorem itself is essentially analytic
in character. It enables us to calculate the values of any
definite integral for which an anti-derivative function can be
found, irrespective of its geometrical or mechanical interpretation. Thus in the first example discussed above the function
under the integral sign (10) is $2\pi r x$, and an anti-derivative,
formed by the usual rule for functions of the type $ax^n$, is $\pi r x^2$.
The total pressure on the walls of the cylindrical vessel is
therefore
$$\int_0^h 2\pi r x\,dx = \pi r h^2 - \pi r \cdot 0^2 = \pi r h^2.$$
Similarly the anti-derivative for the integral (11) which
expresses the volume of a cone is $\pi a^2 x^3 / 3h^2$, and the volume itself
turns out to have the well-known value
$$\frac{\pi a^2}{h^2} \int_0^h x^2\,dx = \tfrac{1}{3}\pi a^2 h,$$
one-third of the product of the base by the altitude.
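The two results just obtained can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the dimensions r, h and a are arbitrary values assumed for illustration, a is taken to be the radius of the base of the cone, and riemann_sum is a hypothetical helper that uses the middle ordinate of each segment as the intermediate ordinate.

```python
import math

def riemann_sum(f, lower, upper, n=10_000):
    """Sum of f(x_k) * dx over n equal segments, using midpoints as the x_k."""
    dx = (upper - lower) / n
    return sum(f(lower + (k + 0.5) * dx) for k in range(n)) * dx

# Illustrative dimensions (assumptions, not taken from the text).
r, h, a = 2.0, 3.0, 1.5

# Pressure integral (10): integrand 2*pi*r*x, anti-derivative pi*r*x**2.
pressure_by_sum = riemann_sum(lambda x: 2 * math.pi * r * x, 0.0, h)
pressure_by_theorem = math.pi * r * h**2 - math.pi * r * 0.0**2

# Cone integral (11): integrand pi*(a/h)**2 * x**2, anti-derivative pi*a**2*x**3/(3*h**2).
cone_by_sum = riemann_sum(lambda x: math.pi * (a / h) ** 2 * x**2, 0.0, h)
cone_by_theorem = math.pi * a**2 * h / 3

print(abs(pressure_by_sum - pressure_by_theorem) < 1e-6)
print(abs(cone_by_sum - cone_by_theorem) < 1e-6)
```

The sum and the difference of anti-derivative values agree to within the error of the finite subdivision, which shrinks as n is increased.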




In a similar way the volume of a sphere can be calculated
by means of the formula (12). At any point of a semicircle
of radius $r$ about the origin the abscissa $x$ and ordinate $y$
satisfy the relation $x^2 + y^2 = r^2$, so that the function which is
represented by the semicircle has the equation
$$y = \sqrt{r^2 - x^2}.$$
The volume generated by rotating the semicircle about the
horizontal axis is that of a sphere of radius $r$. The definite
integral which represents the volume, formed from the formula
(12) by substituting the radical $\sqrt{r^2 - x^2}$ in place of $f(x)$, has
the form
$$\pi \int_{-r}^{+r} (r^2 - x^2)\,dx,$$
and an anti-derivative of the integrand function is $r^2 x - \tfrac{1}{3}x^3$.
The volume has therefore the value
$$\pi \int_{-r}^{+r} (r^2 - x^2)\,dx = \pi \left( 2r^3 - \tfrac{2}{3} r^3 \right) = \tfrac{4}{3}\pi r^3.$$
[FIG. 17. The semicircle of radius $r$ about the origin, extending from $x = -r$ to $x = +r$.]
If the area under the curve y = 3x2 in figure is rotated about
the x-axis, the volume generated is easily found from the same
formula. In this case the integral is
$$\pi \int_0^2 3x^2\,dx = \pi \{ 2^3 - 0^3 \} = 8\pi.$$
31. Relations between functions and graphs. Let us conclude our brief study of the more important notions of the
calculus with a consideration of a question which was proposed
earlier in the paper with regard to the representation of a function by means of a graph. If a function is continuous at every
point of an interval $\alpha \le x \le \beta$, then the difference $f(x') - f(x'')$
for any two values $x'$ and $x''$ in the interval can be made arbitrarily small by choosing $x'$ and $x''$ sufficiently near together.
The proof that this property of a continuous function is a
consequence of its continuity at the individual points between
$\alpha$ and $\beta$ is somewhat complicated, and cannot well be given
here. The property itself is called the "uniform continuity
of $f(x)$ in the interval $\alpha \le x \le \beta$." Assuming that it is true,
we can without difficulty see that any continuous function
can be approximately represented by a polygon.
[FIG. 18. The ordinates $f(x_k)$ and $f(x_{k+1})$ at two successive points of division $x_k$ and $x_{k+1}$.]
For suppose that the interval from $\alpha$ to $\beta$ has been divided into
segments by a set of values so near together that the difference
$f(x) - f(x_k)$ for any value $x$ between $x_k$ and $x_{k+1}$ is less than an
arbitrarily chosen number $\epsilon$. If the points $p$ and $q$ corresponding to any two successive values $x_k$ and $x_{k+1}$ are plotted and
the ordinate to the straight line joining $p$ and $q$ is represented
by $g(x)$, it follows that the difference $g(x) - f(x_k)$ will also be less
than $\epsilon$, since $f(x_k) - f(x_{k+1})$ is less than $\epsilon$ and $g(x)$ lies between
$f(x_k)$ and $f(x_{k+1})$. Since $f(x)$ and $g(x)$ both differ from $f(x_k)$
by less than $\epsilon$ it follows that their difference can itself not
exceed $2\epsilon$. This result will hold for each segment $\Delta x_k$, however
small the constant $\epsilon$ is taken, provided only that the points
of division in the interval between $\alpha$ and $\beta$ are taken sufficiently
near together. It is evident then that a continuous function
can be represented with any desired degree of numerical
accuracy by plotting a finite number of points sufficiently
near together, and joining them by straight lines.
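The approximation of a continuous function by a polygon can be tried numerically. The following is a minimal sketch in Python, not part of the original text: the function sin x on the interval from 0 to π is an assumed example, and polygon and largest_gap are hypothetical helper names. The second helper only samples the difference at finitely many points, so it gives an estimate rather than a proof.

```python
import math

def polygon(f, a, b, n):
    """Return g(x), the broken line through n + 1 equally spaced points of f on [a, b]."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    ys = [f(x) for x in xs]

    def g(x):
        k = min(int((x - a) / (b - a) * n), n - 1)  # index of the segment holding x
        t = (x - xs[k]) / (xs[k + 1] - xs[k])       # position of x within that segment
        return ys[k] + t * (ys[k + 1] - ys[k])

    return g

def largest_gap(f, g, a, b, samples=2_000):
    """Largest sampled value of |f(x) - g(x)| on [a, b]."""
    points = (a + (b - a) * i / samples for i in range(samples + 1))
    return max(abs(f(x) - g(x)) for x in points)

for n in (4, 16, 64):
    g = polygon(math.sin, 0.0, math.pi, n)
    print(n, largest_gap(math.sin, g, 0.0, math.pi))
```

As the points of division are taken nearer together, the printed gap between the function and the broken line decreases.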
The numerical accuracy of the representation is not the
only characteristic of the graph, however, which should be
taken into consideration. The broken line represents the
values of the function with some degree of fairness, but it does
not in general indicate other properties satisfactorily, and a
smooth curve drawn through the corners of the polygon might
be equally misleading. A smooth curve, for example, suggests
to the eye that at each point of the curve there is a tangent
line whose direction changes continuously as the point of
tangency moves along the curve, and whose slope also changes
continuously. Hence the function $f(x)$ which such a curve
represents should have a continuous derivative, which is
not always the case. A function may in fact be continuous
in an interval and yet not have a derivative at any point of
it, as is shown by a classical example of such a function due
to Weierstrass.*
[FIG. 19. A smooth curve made up of two arcs ao and ob meeting at the point o.]
The graph does not indicate as much, however, with regard to the rate of change of the slope of the
tangent which is denoted by f"(x) and called the second
derivative, and very little indeed concerning the rate of change
of f"(x) and the successive rates of change of higher orders.
The second derivative is positive along an arc ao convex downward where the slope of the tangent is increasing, and negative
on an arc ob which is concave downward. It vanishes
presumably and changes sign at o, though at such a point it
may change abruptly, as it would if for example ao and ob
were arcs of the two curves
$$y = x + x^2, \qquad y = x - \tfrac{1}{2}x^2.$$


* Mathematische Annalen, Vol. XIX, p. 591.

Both of these curves pass through the origin o, and their derivatives, $1 + 2x$ and $1 - x$, have the same value for $x = 0$, so that
the two curves are tangent to each other at that point. On
the other hand the second derivatives are respectively 2 and
-1. From this and other examples which might be constructed it follows that a curve which appears perfectly smooth
to the eye may represent a function which has a discontinuous
second derivative, or possibly no second derivative at all.
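The example may be written out as a single function. The following worked equations are a sketch under one assumption that the text does not state explicitly, namely that the arc ao lies to the left of the origin and the arc ob to the right, the second curve being the one whose derivative is $1 - x$:

```latex
\[
f(x) =
\begin{cases}
x + x^2, & x \le 0,\\[2pt]
x - \tfrac{1}{2}x^2, & x \ge 0,
\end{cases}
\qquad
f'(x) =
\begin{cases}
1 + 2x, & x \le 0,\\[2pt]
1 - x, & x \ge 0,
\end{cases}
\qquad
f''(x) =
\begin{cases}
2, & x < 0,\\[2pt]
-1, & x > 0.
\end{cases}
\]
```

Both branches give $f(0) = 0$ and $f'(0) = 1$, so that $f$ and $f'$ are continuous at the origin, while $f''$ passes abruptly from 2 to $-1$ there.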
32. The graph as a mathematical symbol. From the remarks
which have been made it may be inferred that graphs have
two distinct and important uses, the first of which is the numerical representation of the values of a function. It has been
seen that such a representation may have significance, even
if the function is only continuous without having any of the
derivatives. But a graph is most useful, in theoretical work
at least, as a mathematical symbol for a function in the same
way that $f(x)$ is a notation for a function or $\int_{x_0}^{x} f(x)\,dx$ for
a definite integral. The variety of characteristics which may
be suggested by a glance at a graph is, however, much greater
than is suggested by the symbol f (x) which indicates only
functional dependence upon x, and its value as a symbol is
proportionately enhanced. From the graph of the function
$y = \dfrac{x}{x^2 - 1}$ in Fig. 3, for example, we read that this function is
continuous and has a continuous derivative except at $x = \pm 1$;
that it always decreases, varying from $0$ to $-\infty$ as $x$ increases
from $-\infty$ to $-1$, from $+\infty$ to $-\infty$ as $x$ increases from $-1$ to
$+1$, and from $+\infty$ to $0$ as $x$ increases from $+1$ to $+\infty$; that it vanishes only once, when $x = 0$; that its derivative is negative
with variations clearly indicated; that its second derivative
vanishes at $x = 0$; and so on; all of these properties being
much more significantly suggested by the graph than by
the corresponding and somewhat clumsy description in
words.
As the usefulness of any mathematical notation depends
upon the sharpness with which the conception for which it


THE FUNCTION CONCEPT AND THE CALCULUS


303


is to stand is defined, so the graph attains its greatest efficiency
as a symbol only when the nature of the functions which are
to be represented is clearly specified in advance, as well as the
properties of functions which are to be represented by special
features of the curve. As has been seen above, the characteristics of first and second derivatives seem to be particularly
adapted to graphical representation, and it has been suggested
that curves possess their fullest significance as symbols of
functions when the functions are continuous, have only a
finite number of maxima and minima in any given interval,
and have continuous derivatives of the first and second orders.
The elementary functions have these properties, in common
with all of the other functions which have been designated
as analytic. But it is not necessary that the functions represented be thus restricted in character, provided only that the
correspondence between the analytical characteristics of the
function on the one hand and the graphical characteristics
of the curve on the other, is expressly understood. In the
elementary courses it is evidently impossible to discuss the
niceties of the relation of graphical to analytical conceptions,
and it is highly desirable that graphical methods should be
used. But they should always be formulated with special
reference in the mind of the instructor to the correspondence
between the graphical and the analytical processes, with which
the student will later be familiar.
We have now come to the end of our brief survey of the
elements of the calculus, the threshold of the higher mathematics. The technical difficulties which would arise have
prevented the application of the processes of differentiation
and integration to any but the simplest functions, the polynomials. By means of these alone, however, it has been possible
to explain the meaning of the derivative, the anti-derivative,
and the definite integral, and some of their interrelations among
themselves. The rest of the theory is for the most part an
application in many different ways and to many different
functions of these three fundamental conceptions. It is hoped
that by his perusal of these pages the reader unfamiliar with
the calculus will have lost whatever awe he may have had
of one at least of the more advanced mathematical subjects,
and at the same time have gained an insight into the variety
and importance of its relations with problems of a practical
nature and with other branches of science.


VII
THE THEORY OF NUMBERS
By J. W. A. YOUNG


CONTENTS
SECTIONS.
I. INTRODUCTION (secs. 1-3)
II. FACTORS (secs. 4-18)
4-5, Primes;
6, Arithmetical progressions;
7, Problems concerning primes;
8, Method of finding primes;
9, Tables of factors;
10, Factors of large numbers;
11, Relative primes;
12-13, Totient, $\varphi(m)$;
14, Sum of all factors of a number;
15-18, Perfect numbers.
III. DIOPHANTINE EQUATIONS (secs. 19-22)
19, Definition;
20, The equation $x^2 + y^2 = z^2$;
21, The equation $x^n + y^n = z^n$;
22, The equation $x^2 - Dy^2 = \pm 1$.
IV. CONGRUENCES (secs. 23-38)
23-27, Introductory definitions and properties;
28-29, Fundamental properties;
30, Applications: To find remainders in division; Criteria for
divisibility;
31, Roots of congruences;
32, Theoretic solution of the linear congruence in one unknown;
33, Numerical solution of the linear congruence in one unknown;
34-35, Fermat's theorem;
36-38, Wilson's theorem.
V. BINOMIAL CONGRUENCES (secs. 39-50)
39-46, Definitions and theorems;
47-50, Primitive roots.
VI. QUADRATIC CONGRUENCES (secs. 51-58)
51-53, Definitions and reductions;
54-57, Quadratic residues;
58, Legendre's law of reciprocity.
VII. BIBLIOGRAPHY (sec. 59)


I. INTRODUCTION
1. The "Theory of Numbers" might, in a certain sense,
include nearly all of the subject-matter usually treated in mathematics, since, with the exception of the non-metrical portions of
Geometry, there are few domains of mathematics that are not
fundamentally concerned with numbers. But the term is commonly used in a restricted, technical sense as meaning the theory
of integral numbers (positive, negative, zero). Even this must be
further restricted, for all numbers other than integers can be
defined in terms of integers,* so that to study the whole body of
theory that has been built up on integral numbers would still be
tantamount to studying nearly the whole body of mathematical
science. The restriction customarily made is to regard the
"theory of numbers" as concerned with integers as such; their
properties and their combinations by operations that lead to
integral results. The operations of addition, subtraction, and
multiplication are accordingly admitted when applied to any
integers, and division is admitted when applied to integers such
that the quotient is integral. The process of division may also
be used to obtain equations between integers. For example,
$9385 = 62 \cdot 151 + 23$.†
In all that follows the term number shall accordingly be
understood to mean integral number; and other terms, for
example, factor, shall be understood to be similarly restricted
in meaning.
* See Monograph IV, Appendix I.
† The dot indicates multiplication.
2. The treatment of our subject, as now delimited, might
properly begin with a chapter studying the nature and genesis
of the concept of integer, the fundamental definitions and
postulates relating to integers and to the admitted operations
thereupon, the "laws" of operation, and the like. This would
be, in a measure, the treatment of the theoretic basis of elementary arithmetic.*
3. We, however, here assume a working knowledge of elementary arithmetic, and begin with a consideration of various
properties, connected with the factors of numbers, that are not
ordinarily treated in that subject.
* For a treatment of the corresponding questions relative to the numbers
of algebra, which include those of arithmetic, see Monograph IV.
For the more strictly arithmetical theory see:
Dedekind, Was sind und was sollen die Zahlen? Braunschweig, 2d ed.,
1893. English translation by Beman as the second essay of "Essays on
Number," Chicago, 1901. Stolz-Gmeiner, Theoretische Arithmetik, Part I,
2d ed., Leipzig, 1900. (This work presents the theory of the natural numbers,
published by Peano, under the title, "Arithmetices principia nova methodo
exposita," Turin, 1889, in a symbolic notation. A brief account of this
theory is given by Huntington in the Bulletin of the American Mathematical
Society, 2d Series, Vol. IX, 1902, pp. 40-46.) Padoa, "Théorie algébrique
des nombres entiers," Internat. Cong. de Philos., Paris, 1900, pp. 309-65.
Huntington, "Complete sets of postulates for the theories of positive integral
and positive rational numbers," Transactions American Mathematical Society, Vol. III, 1902, pp. 280-84. Huntington, pp. 27-29 of "The fundamental laws of addition and multiplication in elementary algebra," Annals
of Math., Vol. VIII, 1906, pp. 1-44.
II. FACTORS
4. Definition. A prime number (or briefly, a prime) is a
number having no other factors than itself and unity.
5. Theorem. The series of primes is endless.
Proof. It is sufficient to show that there exists a prime
larger than any given prime. Let the given prime be $p$.
Consider $N = 2 \cdot 3 \cdot 5 \cdots p + 1$,
where the first term of $N$ is the product of all the primes not
greater than $p$.
Then it appears from the form of N, that if N be divided by
any one of the primes just mentioned, the remainder will be 1.
Consequently, every prime factor of N must be greater than
p. Since N must have one or more prime factors, the existence
of a prime greater than p is thus proved. But this is by no
means tantamount to the actual finding of a prime greater than
a given prime p. No general method for doing this has as yet
been discovered.
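The construction used in the proof can be carried out for small primes. The following is a minimal sketch in Python, not part of the original text; the function names are hypothetical, and trial division is used only because the numbers involved are small.

```python
import math

def primes_up_to(p):
    """All primes not greater than p, found by trial division."""
    return [n for n in range(2, p + 1)
            if all(n % d for d in range(2, math.isqrt(n) + 1))]

def smallest_prime_factor(n):
    """Smallest prime factor of n (n itself when n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def euclid_number(p):
    """N = 2 * 3 * 5 * ... * p + 1, the number considered in the proof."""
    return math.prod(primes_up_to(p)) + 1

for p in (2, 3, 5, 7, 11, 13):
    N = euclid_number(p)
    print(p, N, smallest_prime_factor(N))
```

For p = 13 the number N = 30031 is not itself prime, but its smallest prime factor, 59, is greater than 13, as the proof requires.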
This theorem may also be stated thus: There is no largest prime
number; or thus: The primes being arranged in order of increasing
magnitude, after each prime there follows another; or also thus: The
number of primes is infinite. The last form of statement means neither
more nor less than the others.
It has been conjectured that every even number is the sum
of two primes, but this has not yet been proved.
6. The theorem above was known to Euclid two thousand
years ago. In the nineteenth century Dirichlet proved an
elegant generalization of it, viz., There is an endless set of primes
in every arithmetical progression whose first term and common
difference have no common factor.
Dirichlet's proof of this theorem makes use of numbers and
operations not admitted in our subject* (which is often called
higher arithmetic) thus furnishing an instance of a "non-arithmetical" proof of an arithmetical proposition.†
It is, however, easy to prove the theorem arithmetically
for certain progressions.
For example, the progression
$3, 7, 11, 15, 19, 23, \ldots, 4n - 1, \ldots$
contains an endless sequence of primes.
* See sec. 1.
† Such proofs abound in the development of the theory of numbers.
For an introduction to this division of the subject see: Bachmann, Analytische
Zahlentheorie, Leipzig, 1892 (proof of the above theorem, pp. 74-88);
Kronecker-Hensel, Zahlentheorie, Leipzig, 1901 (above theorem, pp. 438
et seq.).


To prove this it is sufficient to show that for every prime p
there exists a larger prime of the form 4n-1.
Consider $N = 2(2 \cdot 3 \cdot 5 \cdot 7 \cdots p) - 1$,
where the number in the parenthesis is the product of all primes
not greater than p.
Then it is clear from the form of N, that none of the primes
2, 3... p is a factor of N. All the prime factors of N are
therefore greater than p.
All odd primes are of the form $4n + 1$ or $4n - 1$. The product
of two numbers of the form $4n + 1$ is also of the form $4n + 1$.
But N is of the form 4n-1. Hence at least one of its prime
factors must be of the form 4n-1. The existence of a prime
of this form larger than the given prime, p, is thus proved.
It can be proved quite analogously that the progression,
$5, 11, 17, 23, 29, 35, \ldots, 6n - 1, \ldots$
contains an unending set of primes.
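The argument for the progression 4n - 1 can be followed on small cases. The following is a minimal sketch in Python, not part of the original text; the function names are hypothetical, and N is taken as twice the product of the primes up to p, less one, so that it is of the form 4n - 1 as the argument requires.

```python
import math

def primes_up_to(p):
    """All primes not greater than p, found by trial division."""
    return [n for n in range(2, p + 1)
            if all(n % d for d in range(2, math.isqrt(n) + 1))]

def prime_factors(n):
    """Prime factors of n, with repetition, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def larger_prime_4n_minus_1(p):
    """A prime of the form 4n - 1 exceeding p, taken from the factors of N."""
    N = 2 * math.prod(primes_up_to(p)) - 1
    return min(q for q in prime_factors(N) if q % 4 == 3)

for p in (3, 5, 7, 11, 13):
    print(p, larger_prime_4n_minus_1(p))
```

For p = 11 the number N = 4619 = 31 · 149 has one factor of each form, and the factor 31 is the prime of the form 4n - 1 whose existence the proof asserts.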
7. Various important general problems have been studied
relating to primes. For example:
(1) To determine the number of primes in a given interval.
(2) To determine a prime larger than a given prime.
(3) To determine the prime next larger than a given prime.
(4) To determine whether or not a given number is prime;
or, more generally, to determine the factors of a given number.
No general solution of these problems has as yet been found.
8. The simplest method of finding factors is by actual trial.
It is sufficient to try only primes, and of these, only those whose
squares are smaller than the given number. But this method
is impracticable for large numbers. For these, use is made of
various results and methods that are developed in our subject.
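The method of actual trial may be sketched as follows. This is a minimal illustration in Python, not part of the original text, with a hypothetical function name; it tries every integer d while d · d does not exceed what remains of the number, and only primes ever succeed because the prime factors of any composite d have already been divided out.

```python
def trial_factor(n):
    """Prime factors of n, with repetition, found by actual trial."""
    factors, d = [], 2
    while d * d <= n:          # divisors beyond the square root need not be tried
        while n % d == 0:      # divide out d as often as it occurs
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # what remains is a single prime factor
        factors.append(n)
    return factors

print(trial_factor(9385))
print(trial_factor(100800))
```

The number of trials grows with the square root of the number to be factored, which is why the method becomes impracticable for large numbers.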
9. Tables of the factors of all numbers up to ten millions
have been published.* A manuscript in the Archives of the
Academy of Vienna gives the factors of numbers from 3,000,000
to 100,000,000. (This MS. is known to contain many errors.)
* Lehmer, Factor Table for the First Ten Millions, Washington, 1909.
Carr's Synopsis of Pure Mathematics, London, 1886, contains a table extending to 99,000. Still smaller tables are found in Jones' Logarithmic
Tables, Ithaca, N. Y., 1889, and elsewhere.
10. Factors of particular numbers much larger than those
in the tables have also been found. For example, in the theory
of the construction of regular polygons* it is important to
know whether or not $2^{2^5} + 1$ is a prime number. It has been
shown that
$$2^{2^5} + 1 = 4{,}294{,}967{,}297 = 641 \cdot 6{,}700{,}417.$$
Also that $2^{2^{36}} + 1$, a number of more than twenty billion places,
has the prime factor:
2,748,779,069,441.†
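Both statements can be checked without writing out the large number. The following is a minimal sketch in Python, not part of the original text; it assumes that the second number is $2^{2^{36}} + 1$ and works only with remainders, using the built-in three-argument pow.

```python
import math

F5 = 2 ** (2 ** 5) + 1
print(F5, F5 == 641 * 6_700_417)

q = 2_748_779_069_441
# 2**(2**36) + 1 is divisible by q exactly when 2**(2**36) leaves the remainder q - 1.
print(pow(2, 2 ** 36, q) == q - 1)

# Number of decimal places of 2**(2**36) + 1, estimated from the logarithm.
places = math.floor(2 ** 36 * math.log10(2)) + 1
print(places)
```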
11. Definition. Two numbers having no common factor but
unity, are called relatively prime. Each is said to be prime to
the other.
12. Definition. The number of (positive) integers not
greater than $m$ and prime to $m$ is called the totient of $m$, and
denoted by $\varphi(m)$.
Thus $\varphi(1) = 1$; $\varphi(2) = 1$; $\varphi(3) = 2$; $\varphi(4) = 2$;
$\varphi(5) = 4$; $\varphi(6) = 2$; $\varphi(7) = 6$; $\varphi(8) = 4$.
If $p$ is prime, $\varphi(p) = p - 1$.
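The definition translates directly into a count. The following is a minimal sketch in Python, not part of the original text, with a hypothetical function name; it uses the greatest common divisor to decide whether two numbers are relatively prime.

```python
from math import gcd

def totient(m):
    """Count the integers k with 1 <= k <= m that are prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

print([totient(m) for m in range(1, 9)])  # [1, 1, 2, 2, 4, 2, 6, 4]
```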
13. Problem. To determine $\varphi(m)$.
Solution. Let $m = p^a q^b r^c \cdots v^h$, where $p, q, r, \ldots, v$ are
different primes, $a, b, c, \ldots, h$ positive integers.
If from the series of numbers
$$1, 2, 3, 4, 5, \ldots, m - 1, m,$$
we strike out all those that have as factor $p$ or $q$, or $r$, etc.,
the numbers that remain will be prime to $m$, and the number
of such numbers is the desired totient.
* Monograph No. VIII, sec. 26.
† Encyc. des Sciences Math., Tome I, Vol. III, p. 5.1.


First consider those having $p$ as a factor.
They are:
$$p, 2p, 3p, \ldots, \frac{m}{p}\,p.$$
Their number is $\frac{m}{p}$.* There are therefore $m - \frac{m}{p}$, or $m\left(1 - \frac{1}{p}\right)$,
numbers of the series $1, 2, \ldots, m$ that do not have the factor
$p$. This may be stated generally thus:
Lemma. If $M$ has the prime factor $P$, then $M\left(1 - \frac{1}{P}\right)$
of the numbers $1, 2, 3, \ldots, M$ do not have the factor $P$.
We next strike out the numbers having the factor $q$.
These numbers are
$$q, 2q, 3q, \ldots, \frac{m}{q}\,q.$$
Some of them may already be struck off as having the
factor $p$. The number not having the factor $p$ is the number
of the coefficients
$$1, 2, 3, \ldots, \frac{m}{q}$$
not having the factor $p$. By the lemma this number is
$$\frac{m}{q}\left(1 - \frac{1}{p}\right).$$
The number of numbers $1, 2, 3, \ldots, m$ having neither $p$
nor $q$ as factor is therefore
$$m\left(1 - \frac{1}{p}\right) - \frac{m}{q}\left(1 - \frac{1}{p}\right),$$
or
$$m\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right).$$
Similarly, the numbers
$$r, 2r, 3r, \ldots, \frac{m}{r}\,r$$
have $r$ as a factor. Some of them may have $p$ or $q$ as factor.
The number of those that do not is the number of coefficients
$$1, 2, 3, \ldots, \frac{m}{r}$$
that have neither $p$ nor $q$ as factor.
By the preceding result this number is
$$\frac{m}{r}\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right).$$
Consequently there are in the series $1, 2, \ldots, m$,
$$m\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right) - \frac{m}{r}\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right),$$
or
$$m\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right)\left(1 - \frac{1}{r}\right),$$
numbers not divisible by either $p$, $q$, $r$. The same reasoning
may be repeated until all of the prime factors of $m$ have been
used. The numbers remaining will be prime to $m$, and we have
thus:
$$\varphi(m) = m\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right)\left(1 - \frac{1}{r}\right) \cdots \left(1 - \frac{1}{v}\right).$$
* This number is an integer, since $p$ is a factor of $m$. For similar reasons,
the other numbers indicated by fractions in what follows, are also integers.
REMARKS. (1) The repetition of the reasoning for all of the prime
factors of m is formally accomplished by the process of mathematical
induction, that is, we show that if a result of the above type holds for any
k different prime factors of m, such a result also holds for k +1 of the
prime factors of m, consisting of the k already considered and any other
one. Since such results have been proved above for one, two, and
three factors, it would follow that a similar result holds for four factors,
therefore for five factors, etc., therefore for all the factors.
(2) If the reader has any difficulty in following the reasoning above
for a general m, he should first carry it through for one or more particular values (say, $60 = 2^2 \cdot 3 \cdot 5$, $48 = 2^4 \cdot 3$, $55 = 5 \cdot 11$), and then generalize.
This remark applies to our whole subject-the theory of numbers. It
cannot be mastered without much work with specific numbers, and
recourse should always be had to particular instances, whenever the
general theory becomes in any way hazy.
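In the spirit of the second remark, the striking-out process and the resulting product formula can be compared on the particular values suggested. The following is a minimal sketch in Python, not part of the original text; the function names are hypothetical, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def distinct_prime_factors(m):
    """The different primes p, q, r, ... dividing m."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes

def totient_by_striking_out(m):
    """Strike from 1, 2, ..., m every multiple of each prime factor; count the rest."""
    remaining = set(range(1, m + 1))
    for p in distinct_prime_factors(m):
        remaining -= set(range(p, m + 1, p))
    return len(remaining)

def totient_by_formula(m):
    """m (1 - 1/p)(1 - 1/q)(1 - 1/r) ... taken over the different prime factors."""
    value = Fraction(m)
    for p in distinct_prime_factors(m):
        value *= 1 - Fraction(1, p)
    return int(value)

for m in (60, 48, 55):
    print(m, totient_by_striking_out(m), totient_by_formula(m))
```

For 60, 48 and 55 both methods give 16, 16 and 40 respectively.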




14. Problem. To find the sum of all the factors of any number, m.
Solution. Let $m = p^a q^b r^c \cdots v^l$, where $p, q, r, \ldots, v$ are
different primes, and $a, b, c, \ldots, l$ are positive integers. Let
all the factors of $m$, including unity and $m$ itself, be $d_1, d_2,
d_3, \ldots, d_k$, and let $d_1 + d_2 + d_3 + \cdots + d_k = S(m)$.
Every factor of $m$ is of the form:
$$d = p^{a'} q^{b'} r^{c'} \cdots v^{l'}$$
where $a', b', c', \ldots, l'$ have any combination of the values:
$$a' = 0, 1, 2, \ldots, a$$
$$\cdots$$
$$l' = 0, 1, 2, \ldots, l$$
and, conversely, all expressions of this form are factors.
Further, every expression of this form occurs once, and only
once, as a term of the following product, and the product contains no other terms:
$$P = (1 + p + p^2 + \cdots + p^a)(1 + q + q^2 + \cdots + q^b) \cdots (1 + v + v^2 + \cdots + v^l).$$
Consequently $P$ is the sum of the $d$'s, and since each factor of
$P$ is a geometric series, we obtain:
$$S(m) = \frac{p^{a+1} - 1}{p - 1} \cdot \frac{q^{b+1} - 1}{q - 1} \cdots \frac{v^{l+1} - 1}{v - 1}.$$
EXAMPLES. 1. Since $25 = 5^2$,
$$S(25) = \frac{5^3 - 1}{5 - 1} = 31.$$
2. Since $72 = 2^3 \cdot 3^2$,
$$S(72) = \frac{2^4 - 1}{2 - 1} \cdot \frac{3^3 - 1}{3 - 1} = 15 \cdot 13 = 195.$$
3. Since $100{,}800 = 2^6 \cdot 3^2 \cdot 5^2 \cdot 7$,
$$S(100{,}800) = \frac{2^7 - 1}{2 - 1} \cdot \frac{3^3 - 1}{3 - 1} \cdot \frac{5^3 - 1}{5 - 1} \cdot \frac{7^2 - 1}{7 - 1} = 127 \cdot 13 \cdot 31 \cdot 8 = 409{,}448.$$
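The formula for S(m) can be compared with a direct addition of the factors. The following is a minimal sketch in Python, not part of the original text, with hypothetical function names.

```python
def factorization(m):
    """Return {prime: exponent} for m, by trial division."""
    exponents, d = {}, 2
    while d * d <= m:
        while m % d == 0:
            exponents[d] = exponents.get(d, 0) + 1
            m //= d
        d += 1
    if m > 1:
        exponents[m] = exponents.get(m, 0) + 1
    return exponents

def sum_of_factors(m):
    """S(m) as the product of the geometric-series quotients (p**(a+1) - 1) / (p - 1)."""
    total = 1
    for p, a in factorization(m).items():
        total *= (p ** (a + 1) - 1) // (p - 1)
    return total

def sum_of_factors_direct(m):
    """S(m) by adding every factor of m, unity and m included."""
    return sum(d for d in range(1, m + 1) if m % d == 0)

for m in (25, 72, 100800):
    print(m, sum_of_factors(m), sum_of_factors_direct(m))
```

The three examples of the text give 31, 195 and 409,448 by either method.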




15. Definition. A number that is equal to the sum of all
its factors, except itself, is called a perfect number.
For example, 6 and 28 are perfect numbers, since
6= 3+2+1,
28=14+7+4+2+1.
16. Theorem. If $2^k - 1$ is a prime, then $2^{k-1}(2^k - 1)$ is a
perfect number. (This theorem is given by Euclid.)
Proof. Let $n = 2^{k-1}(2^k - 1)$
and let $p = 2^k - 1$.
Then $n = 2^{k-1} p$.
And by sec. 14,
$$S(n) = \frac{2^{(k-1)+1} - 1}{2 - 1} \cdot \frac{p^2 - 1}{p - 1} = (2^k - 1)(p + 1) = (2^k - 1)\,2^k.$$
Subtracting $n$ from both members we have:
$$S(n) - n = (2^k - 1)\,2^k - 2^{k-1}(2^k - 1)$$
$$= (2^k - 1)(2^k - 2^{k-1})$$
$$= (2^k - 1)(2 \cdot 2^{k-1} - 2^{k-1})$$
$$= (2^k - 1)\,2^{k-1}$$
$$= n.$$
That is, n is a perfect number.
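Euclid's formula can be tried for small values of k. The following is a minimal sketch in Python, not part of the original text; the function names are hypothetical, and primality is decided by trial division, which suffices only for the small cases shown.

```python
def is_prime(n):
    """Primality by trial division (adequate for small n only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def proper_factor_sum(n):
    """Sum of all the factors of n except n itself, i.e. S(n) - n."""
    total, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:      # add the complementary factor once
                total += n // d
        d += 1
    return total - n

def euclid_perfect(k):
    """The number 2**(k-1) * (2**k - 1) of the theorem."""
    return 2 ** (k - 1) * (2 ** k - 1)

for k in range(2, 14):
    if is_prime(2 ** k - 1):
        n = euclid_perfect(k)
        print(k, n, proper_factor_sum(n) == n)
```

For k = 4 the number $2^4 - 1 = 15$ is not prime, and the corresponding product 120 is not perfect, so the hypothesis of the theorem cannot simply be dropped.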
17. It is not difficult to prove * that every even perfect
number is of the form given above. No odd perfect number
has been found, and it is not known whether or not any exists.
18. The question naturally arises as to what values of $k$
will make $2^k - 1$ a prime. It is easy to see that a first condition
is that $k$ itself must be prime. For if $k = ab$, $2^{ab} - 1$ has (according to elementary algebra) the factor $2^a - 1$.


* See, for example, Lucas, Théorie des Nombres, Paris, 1891, p. 375.


In 1644 Mersenne asserted that when $p$ is a prime not
greater than 257, $2^p - 1$ is a prime if and only if
$$p = 2, 3, 5, 7, 13, 17, 19, 31, 61, 127, 257.$$
Numbers of the form $2^p - 1$, $p \le 257$, are called Mersenne's
numbers.
The statement that $2^p - 1$ is prime has been verified for the
first 9 values of $p$, which, consequently, when substituted in
Euclid's formula give the nine known perfect numbers. The
first eight of them were known as early as the sixteenth century,
the ninth (whose value is
2,658,455,991,569,831,744,654,692,615,953,842,176)
was verified late in the nineteenth century. The values $p = 127$
and $p = 257$ are still in doubt. The statement that $2^p - 1$ is
composite for values of $p \le 257$ other than those of the list
above has been verified in a large number of instances,* but
not yet in all. It is believed that Mersenne knew some more
powerful and general method of dealing with these questions,
which his successors have not yet succeeded in rediscovering.
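The smaller cases of Mersenne's statement are within reach of direct trial. The following is a minimal sketch in Python, not part of the original text; it tests by trial division only the prime exponents up to 31, leaving the larger entries of the list untested, and it recomputes the ninth perfect number from Euclid's formula with the exponent 61.

```python
def is_prime(n):
    """Primality by trial division (adequate up to about 2**31 only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Prime exponents p <= 31 for which 2**p - 1 is itself prime.
small_exponents = [p for p in range(2, 32) if is_prime(p) and is_prime(2 ** p - 1)]
print(small_exponents)  # [2, 3, 5, 7, 13, 17, 19, 31]

# Euclid's formula with k = 61.
ninth = 2 ** 60 * (2 ** 61 - 1)
print(ninth)
```

The value printed for the exponent 61 agrees digit for digit with the ninth perfect number quoted above.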
III. DIOPHANTINE EQUATIONS
19. Definition. An equation in two or more unknowns
whose values are to be integral is called a Diophantine equation;
also an indeterminate equation.
Linear Diophantine equations are best studied in connection
with another division of our subject (Congruences, secs. 31-33).
20. An interesting instance of a quadratic Diophantine
equation is the equation
$$x^2 + y^2 = z^2. \qquad (1)$$
The numbers of any set $x, y, z$ satisfying this equation are
the lengths of the sides of a right triangle. So that the two
problems of finding all integral solutions of the above equation
and of finding all right triangles with sides of integral length
are equivalent. Such triangles are called Pythagorean triangles.
* For list, see Lucas, Théorie des Nombres, p. 375.


A solution in which x, y, z have no common factor is called
a primitive solution. It will be sufficient to find all the primitive
solutions, for every non-primitive solution can be deduced from
some primitive solution by multiplying all its numbers by the
proper factor.
We begin the search for the primitive solutions by showing
that in any primitive solution one of the numbers $x$ and $y$,
say $x$, is even and the other, $y$, is odd. For (a) if $x$ and $y$ were
both even, $z$ would also be even; the common factor, 2, would be
present and the solution would not be primitive. (b) If $x$ and
$y$ were both odd (that is, of the form $2n + 1$), $x^2$ and $y^2$ would
both be of the form $4n + 1$; and hence $z^2$ would be of the form
$4n + 2$. But this is impossible, since the square of every even
number is of the form $4n$, and that of every odd number is of
the form $4n + 1$. Since suppositions (a) and (b) are both incorrect, one of the numbers $x$ and $y$ must be even, the other, odd.
Let $x$ denote the even one. Then $y$ and $z$ are odd.
From (1):
$$x^2 = z^2 - y^2 = (z + y)(z - y).$$
Since $z$ and $y$ are both odd we may put,
$$z + y = 2k, \qquad z - y = 2l. \qquad (2)$$
Hence,
$$x^2 = 4kl.$$
Since $x, y, z$ are relatively prime, $k$ and $l$ must also be
relatively prime; for, from equations (2),
$$z = k + l \quad \text{and} \quad y = k - l;$$
hence if $k$ and $l$ had a common factor $y$ and $z$ would have that
factor in common also.
Since $4kl$ is a square and $k$ and $l$ are relatively prime, it follows that $k$ and $l$ must each be a square.
We therefore put:
$$k = m^2, \qquad l = q^2 \qquad (m, q \text{ relatively prime}).$$


Consequently, in any primitive solution of equation (1),
x, y, z must be of the forms:
$$x = 2mq, \qquad y = m^2 - q^2, \qquad z = m^2 + q^2. \qquad (3)$$
It is readily seen by substitution that every set of values of
this form, whether primitive or not, satisfies the equation. To
pick out those solutions of form (3) that are primitive, we proceed
as follows:
If m and q have a common factor, then x, y, z evidently have
that factor in common also. The primitive solutions will therefore all be among those obtained under the restriction that m
and q shall be relatively prime.
Further, since $z + y = 2m^2$, and $z - y = 2q^2$, any common
factor of $z$ and $y$ would be a common factor of $2m^2$ and of $2q^2$,
or, if $m$ and $q$ are relatively prime, of 2. That is, if $m$ and $q$
are relatively prime, $y$ and $z$ can have, at most, the factor 2
in common. They do, indeed, have this common factor when
$m$ and $q$ are both odd, and not otherwise ($m$ and $q$ being relatively prime). We have thus proved the following
Theorem. All the primitive solutions and no others of the
equation $x^2 + y^2 = z^2$ are given by the formulas (3) if $m$ and $q$ run
through all possible sets of relatively prime values such that $m > q$,
and that one of the two is even, the other odd.
It is now simply a matter of substitution to prepare a table
of the smaller primitive solutions.*
  m    q      x     y     z
  2    1      4     3     5
  3    2     12     5    13
  4    3     24     7    25
  4    1      8    15    17
  5    4     40     9    41
  5    2     20    21    29
  6    5     60    11    61
  6    1     12    35    37
  7    6     84    13    85
  7    4     56    33    65
  7    2     28    45    53
  8    7    112    15   113
  8    5     80    39    89
  8    3     48    55    73
  8    1     16    63    65
* A table of all primitive solutions in which z < 2500 is given by Whitworth, Proc. Lit. and Phil. Soc. of Liverpool, Vol. XXIX, 1874, p. 237.
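The preparation of such a table by substitution in the formulas (3) may be sketched as follows. This is a minimal illustration in Python, not part of the original text, with a hypothetical function name; m and q run through relatively prime values with m greater than q and with one of the two even, the other odd.

```python
from math import gcd

def primitive_triples(max_m):
    """Rows (m, q, x, y, z) from x = 2mq, y = m*m - q*q, z = m*m + q*q."""
    rows = []
    for m in range(2, max_m + 1):
        for q in range(m - 1, 0, -1):
            # m and q relatively prime, one even and the other odd
            if gcd(m, q) == 1 and (m - q) % 2 == 1:
                rows.append((m, q, 2 * m * q, m * m - q * q, m * m + q * q))
    return rows

for row in primitive_triples(8):
    print(*row)
```

With m not greater than 8 the sketch yields fifteen rows, the same number as the table, each satisfying the equation and having no common factor.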




Theorem. Of the three numbers, x, y, z, one is divisible by
3, one (perhaps the same one) by 4 and one by 5.
Proof. Since either m or q is even, x is divisible by 4. If
either m or q is divisible by 3 or by 5, x is divisible by 3
or by 5.
If neither $m$ nor $q$ is divisible by 3, they are both of the
form $3n \pm 1$, and their squares are of the form $3n + 1$. Therefore
$m^2 - q^2$ is of the form $3n$. That is, $y$ is a multiple of 3.
If neither $m$ nor $q$ is divisible by 5 they are of one of the
forms $5n \pm 1$, $5n \pm 2$, and their squares are of the form $5n \pm 1$.
If both $m^2$ and $q^2$ are of the same form (either $5n + 1$ or $5n - 1$)
$m^2 - q^2$ is of the form $5n$; while if one is of the form $5n + 1$ and
the other of the form $5n - 1$, $m^2 + q^2$ is of the form $5n$. That is,
in the former case $y$ is a multiple of 5, in the latter case, $z$ is a
multiple of 5.
All of these statements may be verified for the particular instances
occurring in the table above, and they should be so verified if the reader
has the slightest difficulty in understanding the general reasoning.
(See note, sec. 13.)
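The verification recommended here can be extended well beyond the table. The following is a minimal sketch in Python, not part of the original text; the function names are hypothetical and the bound of 60 on m is an arbitrary choice.

```python
from math import gcd

def primitive_triples(max_m):
    """Primitive solutions (x, y, z) given by the formulas (3) for m <= max_m."""
    for m in range(2, max_m + 1):
        for q in range(1, m):
            if gcd(m, q) == 1 and (m - q) % 2 == 1:
                yield 2 * m * q, m * m - q * q, m * m + q * q

def divisibility_holds(x, y, z):
    """True when some one of x, y, z is divisible by 3, one by 4, and one by 5."""
    return all(any(n % d == 0 for n in (x, y, z)) for d in (3, 4, 5))

print(all(divisibility_holds(*t) for t in primitive_triples(60)))
```

A check of this kind confirms particular instances only; the general statement rests on the proof given above.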
21. As it has been easy to solve completely the equation
$x^2 + y^2 = z^2$, it would be natural to expect corresponding success
in the solution of $x^3 + y^3 = z^3$, but this expectation is doomed to
disappointment. It has been proved* that the equation has no
solution in positive integers. In other words: no cube of an integer can be the
sum of two cubes of integers. This is a special case of the
following more general theorem announced by Fermat:
The equation $x^n + y^n = z^n$ admits no solution in integers, if $n$
is a positive integer greater than two.
This famous theorem is commonly known as Fermat's last
theorem, and was stated without proof by Fermat in the seventeenth century. Since then the theorem has stood as a standing
challenge to arithmeticians. For various specific instances the
proof has been found, including every $n < 100$ and some others,
but the general proof has not yet been made.


* Euler, Algebra, St. Petersburg, 1770.


22. Mere mention must suffice for the interesting and famous
indeterminate equation:
$x^2 - Dy^2 = \pm 1$,
generally known as the Pellian equation, though it has recently
been pointed out that Pell never published anything on this
equation.* A method for the solution of this equation was
known to the Hindus about 600 A.D., but it was solved independently by Lagrange in the eighteenth century. The equation
is treated in the works on the Theory of Numbers cited in the
bibliography; these works also discuss many other indeterminate
equations that cannot even be mentioned here.
IV. CONGRUENCES
23. It frequently happens that in a particular problem numbers whose difference is a multiple of a given number, are
equivalent.
For example:
(1) With respect to the day of the week on which the last day of
a certain period falls, numbers of days counted from a fixed day are
equivalent if their difference is a multiple of 7.
(2) With respect to their trigonometric functions, angles are equivalent if they differ only by multiples of 360°.
(3) With respect to their numerical value, powers of $-1$ are equivalent if their exponents differ only by multiples of 2.
24. Definitions. If $a = b + cm$, that is, if $a-b$ is a multiple
of $m$, we say that $a$ is congruent to $b$ with respect to the
modulus $m$, and write:
$a \equiv b$ (mod. $m$).                   (1)
The modulus is supposed to be positive.
A relation of the form (1) is called a congruence; $a$ and $b$
are called residues of each other, modulo $m$.
The numbers on the two sides of the sign $\equiv$ are called the
members of the congruence.
* Encyc. des Sc. Math., Tome I, Vol. III, p. 27.


The following are examples of congruences; the reader will readily
convince himself of their correctness.
$15 \equiv 8$ (mod. 7)       $60 \equiv 0$ (mod. 12)
$37 \equiv 19$ (mod. 6)      $-18 \equiv 32$ (mod. 10)
$1 \equiv 41$ (mod. 5)       $3 \equiv -59$ (mod. 31)
25. Every number is congruent (mod. $m$) to one and only
one of the series:
$0, 1, 2, \ldots, m-1$;
also to one and only one of the series:
$0, -1, -2, \ldots, -(m-1)$;
also, if $m$ is odd, to one and only one of the series:
$0, \pm 1, \pm 2, \ldots, \pm\tfrac{m-1}{2}$;
and, if $m$ is even, to one and only one of the series:
$0, \pm 1, \pm 2, \ldots, \pm\left(\tfrac{m}{2}-1\right), +\tfrac{m}{2}$.
These are called respectively the series of least positive
residues, least negative residues, and absolutely least residues
(mod. $m$).
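The three series of residues can be computed directly from the definition of sec. 24. The sketch below is illustrative only; the function names are hypothetical, and for an even modulus the value $+\tfrac{m}{2}$ (rather than $-\tfrac{m}{2}$) is taken as the representative.

```python
def congruent(a, b, m):
    """True if a - b is a multiple of the positive modulus m."""
    return (a - b) % m == 0

def least_positive(a, m):
    """Residue of a in the series 0, 1, 2, ..., m-1."""
    return a % m

def least_negative(a, m):
    """Residue of a in the series 0, -1, -2, ..., -(m-1)."""
    r = a % m
    return r - m if r else 0

def absolutely_least(a, m):
    """Residue of a of smallest absolute value (+m/2 chosen when m is even)."""
    r = a % m
    return r - m if 2 * r > m else r

print(least_positive(-18, 10), least_negative(37, 6), absolutely_least(7, 12))
```

Each of the three values returned is congruent to the given number, as `congruent` confirms.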
26. In any congruence, multiples of the modulus may be
added or subtracted at will, without disturbing the congruence.
For $a \equiv b$ (mod. $m$) means that $a$ differs from $b$ by a multiple
of $m$. This property is not affected if $a$ or $b$, or both, are altered
by a multiple of $m$.
Similarly, any factor may be increased or diminished by a
multiple of the modulus without destroying the congruence.
That is, if
$ab \equiv c$ (mod. $m$),
then also       $(a + dm)b \equiv c$ (mod. $m$).
The reader may supply the details of this reasoning.
27. We may, therefore, in any congruence reduce all numerical terms and coefficients to values less than the modulus
without destroying the congruence.
Thus, the congruence $86c \equiv 7$ (mod. 11) may be replaced by
$9c \equiv 7$ (mod. 11), and $437a + 289b \equiv 469c$ (mod. 27) may be replaced by
$5a + 19b \equiv 10c$ (mod. 27).
The reader should practice with similar relations until he is quite
familiar with the idea. These relations may be taken quite at random.
Thus, $873 \equiv\ ?$ (mod. 36); $4729 \equiv\ ?$ (mod. 123). What congruences with
coefficients smaller than the modulus are equivalent to the following?
$83x \equiv 7$ (mod. 13);  $439x \equiv 3283$ (mod. 20);
$11a - 23b \equiv 36$ (mod. 5);   $4632y \equiv 367,832 - 439$ (mod. 16),
etc.
(But exponents may not be treated similarly. From $2^7 \equiv 3$ (mod. 5),
it does not follow that $2^2 \equiv 3$ (mod. 5). A theorem which enables us to
replace exponents larger than the modulus by smaller ones will be
proved later; see secs. 34, 35.)
28. Fundamental properties of congruences.
I. If $b \equiv a$ (mod. $m$) and $c \equiv a$ (mod. $m$), then $b \equiv c$ (mod. $m$).
Proof. The given congruences mean:
$b = a + dm$,
$c = a + em$.
Subtracting,        $b - c = (d-e)m$. $\therefore$ by definition $b \equiv c$ (mod. $m$).
II. If                $a_1 \equiv b_1$ (mod. $m$),
$a_2 \equiv b_2$ (mod. $m$),
. . . . .
$a_l \equiv b_l$ (mod. $m$),
then:       $a_1 + a_2 + \cdots + a_l \equiv b_1 + b_2 + \cdots + b_l$ (mod. $m$).
The reader can readily supply the proof here, and in the case of the
other properties of this list where the proof is omitted.
Corollary. Terms may be transposed from one member of
a congruence to another; that is, they may be omitted where
they stand, and inserted in the other member with their signs
changed.
For if $t$ represents the term to be transposed, this is equivalent to adding the members of the congruence $-t \equiv -t$ (mod. $m$)
respectively to the members of the given congruence.
III. If            $a \equiv b$ (mod. $m$),
then                 $ka \equiv kb$ (mod. $m$),
and also             $ka \equiv kb$ (mod. $km$).
IV. If             $a \equiv b$ (mod. $m$)
and                   $c \equiv d$ (mod. $m$),
then                 $ac \equiv bd$ (mod. $m$).
For by III,       $ac \equiv bc$ (mod. $m$)
and                  $bc \equiv bd$ (mod. $m$);
$\therefore$ by I               $ac \equiv bd$ (mod. $m$).
Corollary 1. If   $a_1 \equiv b_1$ (mod. $m$),
$a_2 \equiv b_2$ (mod. $m$),
. . . . .
$a_l \equiv b_l$ (mod. $m$),
then         $a_1 a_2 \cdots a_l \equiv b_1 b_2 \cdots b_l$ (mod. $m$).
Corollary 2. If   $a \equiv b$ (mod. $m$),
then                 $a^r \equiv b^r$ (mod. $m$).
V. If             $a \equiv b$ (mod. $m_1$),
$a \equiv b$ (mod. $m_2$),
. . . . .
$a \equiv b$ (mod. $m_l$),
and if              $M$ = L. C. M. of $m_1, m_2, \ldots, m_l$,
then                $a \equiv b$ (mod. $M$).
Proof. By hypothesis $a - b = r_1 m_1$,
$a - b = r_2 m_2$,
. . . . .
$a - b = r_l m_l$,
and since $a-b$ is a multiple of $m_1, m_2, \ldots, m_l$, it is a multiple
of their least common multiple.
29. Those of the preceding properties that relate to a single
modulus $m$ are analogous to the corresponding properties of
algebraic equations; instead of "equal" we here say "congruent." These properties concern addition, subtraction and
multiplication. We consider next the inverse operation of
multiplication, namely, factoring, and shall see that in this
case the analogy between the properties of equations and congruences is not so close.
In equations we know that if $ab = 0$, then either $a = 0$ or $b = 0$.
But we know that $4 \cdot 6 \equiv 0$ (mod. 12) while neither $4 \equiv 0$ (mod. 12)
nor $6 \equiv 0$ (mod. 12). That is, from $ab \equiv 0$ (mod. $m$), we may not
infer that necessarily either $a \equiv 0$ (mod. $m$) or $b \equiv 0$ (mod. $m$).
More generally, if we know that $ab = ac$, and that $a \neq 0$, we
know that $b = c$.
But it is easy to show by an example that if $ab \equiv ac$ (mod. $m$)
and $a \not\equiv 0$ (mod. $m$) it does not necessarily follow that $b \equiv c$
(mod. $m$).
Thus: $2 \cdot 21 \equiv 2 \cdot 17$ (mod. 8) and $2 \not\equiv 0$ (mod. 8). But it is not true
that $21 \equiv 17$ (mod. 8).
The following property states what follows from $ab \equiv ac$
(mod. $m$).
VI. From $ab \equiv ac$ (mod. $m$), where $a$ and $m$ have the highest
common factor $d$, it follows that
$b \equiv c$ (mod. $\tfrac{m}{d}$).
Proof. The hypothesis means
$ab = ac + km$,
or                    $a(b-c) = km$.
Since $m$ is a factor of the left member and $d$ is the largest
factor of $m$ that is a factor of $a$, it follows that $\tfrac{m}{d}$ is a factor of
$b-c$. That is,
$b - c \equiv 0$ (mod. $\tfrac{m}{d}$),
or                       $b \equiv c$ (mod. $\tfrac{m}{d}$).
Corollary. Both members of any congruence may be
divided by any factor that is prime to the modulus, but if the
divisor have a factor common with the modulus, that factor
must be taken out of the modulus also.
Thus:
(1) From            $30 \equiv 78$ (mod. 12),
it follows that         $5 \equiv 13$ (mod. 2).
(2) From           $108 \equiv 192$ (mod. 14),
it follows that         $9 \equiv 16$ (mod. 7).
(3) From           $224 \equiv 44$ (mod. 15),
it follows that        $56 \equiv 11$ (mod. 15).
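The rule of Property VI and its corollary may be put in computational form. The following sketch is offered only as an illustration; the function `cancel` is a hypothetical helper that divides both members by a common factor $k$ and removes from the modulus the highest common factor of $k$ and the modulus.

```python
from math import gcd

def cancel(k, lhs, rhs, m):
    """Divide both members of lhs ≡ rhs (mod m) by their common factor k.

    Returns the new members and the new modulus m / gcd(k, m).
    """
    # The congruence must hold and k must divide both members.
    assert (lhs - rhs) % m == 0 and lhs % k == 0 and rhs % k == 0
    return lhs // k, rhs // k, m // gcd(k, m)

print(cancel(6, 30, 78, 12))     # the first example of the text
print(cancel(12, 108, 192, 14))  # the second example
print(cancel(4, 224, 44, 15))    # the third example; 4 is prime to 15
```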
30. Applications of the idea of congruence. The idea of
congruence, together with the elementary properties that we
have named, is sufficient for the solution of various interesting
problems, of which we give a few examples.
I. To find the remainder when large numbers are divided by
a given number.
(1) To find the remainder when $2^{40}$ is divided by 23:
We know that
$2^5 = 32$.
Hence               $2^5 \equiv 9$ (mod. 23).
Squaring,           $2^{10} \equiv 81 \equiv 12$ (mod. 23).
Squaring,           $2^{20} \equiv 144 \equiv 6$ (mod. 23).
Squaring,           $2^{40} \equiv 36 \equiv 13$ (mod. 23).
That is, if $2^{40}$ is divided by 23 the remainder is 13.
(2) To show that $2^{2^5}+1$ has the factor 641 (sec. 10):
To show this it is sufficient to show that $2^{2^5}$ or $2^{32}$ has the remainder
640, or $-1$, when divided by 641.
We have
$2^2 = 4$,
$2^4 = 16$,
$2^8 = (16)^2 = 256$,
$2^{16} = (256)^2 = 65,536 \equiv 154$ (mod. 641),
$2^{32} \equiv (154)^2 \equiv 23,716 \equiv -1$ (mod. 641).
In all such problems the work of multiplication is reduced by taking
the absolutely least residue whether positive or negative.
(3) It is easily verified similarly that the following Mersenne's numbers
(sec. 18), have the factor indicated:
Number          Factor
$2^{11}-1$          23
$2^{23}-1$          47
$2^{29}-1$         233
$2^{37}-1$         223
$2^{239}-1$        479
$2^{251}-1$        503
(4) At the expense of a somewhat longer computation it can be
verified in precisely the same way that $2^{97}-1$ has the factor 11,447,
that $2^{223}-1$ has the factor 18,287, that $2^{2^{12}}+1$ has the factor 114,689,
and even the statement of sec. 10 with respect to the number of the form $2^{2^n}+1$ considered there could be
verified by a calculation that would indeed be tedious in itself, but that
nevertheless, in view of the enormous number whose factor is verified,
would be a striking example of the power of the method.
It is easy to verify factors, such as the above, when once they
are known, but it may be exceedingly difficult to find them.
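The process of successive squaring used in examples (1) and (2) is readily mechanized. The sketch below is one possible arrangement, not the only one; the name `power_residue` is hypothetical, and it returns the least positive residue (Python's built-in three-argument `pow` performs the same computation).

```python
def power_residue(base, exponent, m):
    """Least positive residue of base**exponent (mod m) by repeated squaring."""
    result, square = 1, base % m
    while exponent:
        if exponent & 1:                   # this power of two occurs in the exponent
            result = result * square % m
        square = square * square % m       # next squaring, reduced at once
        exponent >>= 1
    return result

print(power_residue(2, 40, 23))    # remainder of 2^40 on division by 23
print(power_residue(2, 32, 641))   # 640, that is, -1 (mod. 641)
for e, f in [(11, 23), (23, 47), (29, 233), (37, 223), (239, 479), (251, 503)]:
    print(e, f, power_residue(2, e, f) == 1)   # True when f divides 2^e - 1
```

Reducing after every multiplication keeps all intermediate numbers below the square of the modulus, which is the point of the method.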
II. Criteria for divisibility.
If the digits of a number $N$ read from right to left are
$a, b, c, d, e, f, g, \ldots$, we have
$N = a + 10b + 10^2c + 10^3d + 10^4e + 10^5f + 10^6g + \cdots$
(1) Since $10 \equiv 1$ (mod. 9), and hence by sec. 28, IV, Cor. 2,
$10^2 \equiv 1$ (mod. 9), $10^3 \equiv 1$ (mod. 9), ..., we may write
$N \equiv a + b + c + d + \cdots$ (mod. 9).
If $a + b + c + d + \cdots$ is a multiple of 9, then $N$ is a multiple
of 9. This is the well-known criterion: a number is a multiple
of 9 if and only if the sum of its digits is a multiple of 9.
(2) Since $10 \equiv -1$ (mod. 11) and hence,
$10^2 \equiv 1$ (mod. 11), $10^3 \equiv -1$ (mod. 11), $10^4 \equiv 1$ (mod. 11), etc.,
we may write:
$N \equiv a - b + c - d + e - f + \cdots$ (mod. 11).
That is, a number is a multiple of 11 if and only if the sum of
the digits in the odd-numbered places diminished by the sum of
the digits in the even-numbered places is a multiple of 11.
(3) Since $10^3 + 1 = 7 \cdot 11 \cdot 13$ we seek to obtain criteria for
divisibility by 7, 11, or 13, by taking residues of the terms of
$N$ according to the modulus $10^3 + 1$.
Since $10^3 \equiv -1$ (mod. $10^3+1$), we obtain according to sec.
28, III, the following congruences (mod. $10^3+1$):
$10^4 \equiv -10$,
$10^5 \equiv -10^2$,
$10^6 \equiv -10^3 \equiv -(-1) \equiv 1$,
$10^7 \equiv 10$,
$10^8 \equiv 10^2$,
etc.
Hence:
$N \equiv (a + 10b + 10^2c) - (d + 10e + 10^2f) + (g + 10h + 10^2j) - \cdots$ (mod. $10^3+1$).
Consequently we may state the following criterion for divisibility
by 7, 11, or 13.
Beginning at the right, separate the given number into periods
of three places each (the last period on the left may of course
have fewer than three digits). Regard these periods as three-place
numbers and add them with alternating signs. If the algebraic
sum thus obtained is divisible by 7, 11, or 13, the original number
is so divisible, and otherwise not.
Thus: To examine 847,963,207 as to divisibility by 7, 11, and 13,
we form
$207 - 963 + 847 = 91$.
Since 91 is divisible by 7 and 13 but not by 11, it follows that
847,963,207 is divisible by 7 and by 13 but not by 11.
On examining the proofs above it appears that when the
given divisor is not a factor of the number, the residue of the
division will be furnished by the same test. Divisibility is
simply the case in which the residue is zero.
Thus, the residue when a number is divided by 9 is the same as the
residue when the sum of its digits is divided by 9.
Likewise, the number 847,963,207 has the residue 3, when divided
by 11, since 91 (found as above) has the residue 3 on division by 11.
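The three criteria can be tried on any number by a short program. What follows is a sketch under the assumption that the number is a positive integer written in the decimal scale; the function names are chosen here for illustration.

```python
def residue_mod_9(n):
    """Residue (mod. 9) obtained from the sum of the digits."""
    return sum(int(ch) for ch in str(n)) % 9

def residue_mod_11(n):
    """Residue (mod. 11) from the alternating digit sum, units digit taken positive."""
    digits = [int(ch) for ch in reversed(str(n))]
    return sum(d if i % 2 == 0 else -d for i, d in enumerate(digits)) % 11

def alternating_period_sum(n):
    """Algebraic sum of the three-place periods, taken with alternating signs."""
    total, sign = 0, 1
    while n:
        n, period = divmod(n, 1000)   # split off the right-hand period
        total += sign * period
        sign = -sign
    return total

s = alternating_period_sum(847963207)
print(s, s % 7, s % 11, s % 13)       # 91 and its residues for 7, 11, 13
```

The residues of the sum for 7, 11, and 13 are the residues of the original number, in accordance with the remark that the same test furnishes the residue when the division is not exact.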
31. Roots of congruences.
The congruence
$a_0x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-2}x^2 + a_{n-1}x + a_n \equiv 0$ (mod. $m$),
where the $a$'s are any numbers except that $a_0$ is not a multiple
of $m$, is said to be of degree $n$ in the unknown $x$.
Any number $x_1$, which, when substituted for $x$, makes the
left member congruent to the right (mod. $m$) is said to satisfy
the congruence and to be a root of the congruence.
If any number, $x_1$, is a root, all numbers congruent to $x_1$ (mod. $m$)
also satisfy the congruence (sec. 26). But these are not regarded as
different roots. Taken modulo $m$, the totality of all numbers that are
congruent to $x_1$ are regarded as a single value, and any number whatever of the totality may be selected to represent it; the least positive
residue (mod. $m$), for example, may be so chosen. The numbers 0, 1,
2, 3, ..., $m-1$, thus represent all the different values that exist
(mod. $m$); if we test a congruence for these, no other possibilities remain.
It is easy to show by special examples that the properties
of equations as to existence and number of roots do not hold
unmodified for congruences.
Thus, the equation $ax = b$, with $a \neq 0$, always has one, and only one,
root.
But we readily show by particular instances that the congruence
$ax \equiv b$ (mod. $m$)
may have:
(1) No root at all.
Example: $3x \equiv 5$ (mod. 9).
By trying the nine possible values for $x$,
$x \equiv 0, 1, 2, 3, 4, 5, 6, 7, 8$ (mod. 9),
it appears that none satisfies the congruence. This could also be seen
without trial by writing the congruence in the form:
$3x - 5 \equiv 0$ (mod. 9).
This means that $x$ must be so chosen that $3x-5$ is a multiple of 9. But
whatever the value of $x$, $3x-5$ is not even a multiple of 3, much less of 9.
(2) One root.
Example: $5x \equiv 3$ (mod. 9).
By trying the nine possible values, it appears that the value 6, and
no other, satisfies the congruence.
(3) More than one root.
Example: $6x \equiv 3$ (mod. 9).
By trial it appears that the values 2, 5, 8, and no others satisfy the
congruence.
The roots of such congruences will be discussed in more
detail in the next section.
It is not difficult to prove the following theorem, which is
somewhat analogous to the fundamental theorem of algebra
that every equation of degree $n$ has precisely $n$ roots.*
Theorem. A congruence of degree $n$, and with a prime modulus,
cannot have more than $n$ roots.
We omit the proof. The reader may supply it, following
a line of argument analogous to that used for equations.†
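The method of trial described in this section, testing each of the values 0, 1, 2, ..., $m-1$ in turn, can be sketched as follows. The function `roots` and its argument convention (coefficients listed from $a_0$ down to $a_n$) are assumptions made for this illustration.

```python
def roots(coefficients, m):
    """Roots (mod. m) of a0*x^n + a1*x^(n-1) + ... + an ≡ 0, found by trial.

    `coefficients` lists a0, a1, ..., an, highest power first.
    """
    found = []
    for x in range(m):                    # the m different values (mod. m)
        value = 0
        for c in coefficients:            # evaluate the left member, reducing (mod. m)
            value = (value * x + c) % m
        if value == 0:
            found.append(x)
    return found

print(roots([3, -5], 9))      # 3x ≡ 5 (mod. 9)
print(roots([5, -3], 9))      # 5x ≡ 3 (mod. 9)
print(roots([6, -3], 9))      # 6x ≡ 3 (mod. 9)
print(roots([1, 0, -1], 8))   # x^2 ≡ 1 (mod. 8), a composite modulus
```

The last line shows a congruence of the second degree with four roots; the modulus 8 is composite, so the theorem above is not contradicted.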
32. Theoretic solution of the linear congruence in one
unknown.
Given $ax \equiv b$ (mod. $m$).
It may be assumed that $b$ is positive and less than $m$. If not given
so it may readily be made so by addition or subtraction of a multiple
of $m$.
Case I. $a$ prime to $m$.
In $ax$ substitute for $x$ in turn the values 0, 1, 2, 3, ..., $m-1$,
obtaining
$ax = 0, a, 2a, 3a, \ldots, (m-1)a$,
or taking least positive residues (mod. $m$)
$ax \equiv c_0 (= 0), c_1, c_2, c_3, \ldots, c_{m-1}$ (mod. $m$).
* See Monograph V, secs. 7, 10, and Monograph IV, Appendix II.
† See Monograph V, sec. 10.


Can any of the $c$'s be equal? Suppose
$c_k = c_h$,  $k > h$.
By definition       $ka = c_k + rm$
and                    $ha = c_h + sm$.
If $c_k = c_h$, we obtain
$(k-h)a = (r-s)m$.
But $a$ is prime to $m$, hence $k-h$ is a multiple of $m$. But
$k-h$ is positive, and $k$ is less than $m$, being some one of the
numbers 1, 2, ..., $m-1$.
Hence $k-h$ is less than $m$. Since $k-h$ is positive and less
than $m$, it cannot be a multiple of $m$. Therefore the supposition
$c_k = c_h$ is incorrect, and the $c$'s are all different. Since there
are $m$ of them, and each one is some one of the $m$ numbers
0, 1, 2, ..., $m-1$, the fact that they are all different has as
consequence that the whole set of the $c$'s must be the numbers
0, 1, 2, 3, ..., $m-1$ in some order.
In the last set of numbers, the number $b$ occurs once and
only once. There is therefore exactly one $c$ that is equal to $b$,
or exactly one value of $x$ such that $ax \equiv b$ (mod. $m$).
We have thus shown that: a linear congruence in which
the coefficient of the unknown is prime to the modulus has one
and only one solution.
Case II. Let $a$ and $m$ have the highest common factor
$d$; $d > 1$.
The congruence       $ax \equiv b$ (mod. $m$)
means                $ax = b + km$.
Since $a$ and $m$ have the factor $d$, this equation cannot be
true if $b$ does not also have the factor $d$. That is, if $b \not\equiv 0$
(mod. $d$), our congruence has no solution.
Let                $b \equiv 0$ (mod. $d$),
and let              $a = a_1d$,
$b = b_1d$,
$m = m_1d$
($a_1$ is prime to $m_1$, since $d$ is the highest common factor of $a$
and $m$).
Then we may divide the given congruence, including the
modulus, by $d$, obtaining
$a_1x \equiv b_1$ (mod. $m_1$).
By Property III, sec. 28, every root of this congruence is a
root of the given congruence.
This congruence falls under the previous case, and has one
and only one root. Let this root be $r$. Then all numbers of
the form $r + km_1$ are equivalent so far as the modulus $m_1$ is
concerned. All these numbers satisfy the given congruence.
But are they equivalent to a single solution with respect to
its modulus, $m$?
Let $r + k_1m_1$ and $r + k_2m_1$ ($k_1 > k_2$) be equivalent according
to the modulus $m$. That is:
$r + k_1m_1 \equiv r + k_2m_1$ (mod. $m$)
or           $(k_1 - k_2)m_1 \equiv 0$ (mod. $m$).
Hence, dividing the members of the congruence and the modulus
by $m_1$, we obtain
$k_1 - k_2 \equiv 0$ (mod. $d$)
or                  $k_1 \equiv k_2$ (mod. $d$).
That is, two numbers of the form $r + km_1$ are congruent
(mod. $m$) if, and only if, the values of $k$ are congruent (mod. $d$).
Accordingly the given congruence has $d$ solutions, obtained
from the expression $r + km_1$ by giving $k$ in turn the values
0, 1, 2, 3, ..., $d-1$.
EXAMPLES:
(1)               $12x \equiv 6$ (mod. 15).
Here               $d = 3$, $m_1 = 5$.
Dividing through by $d$,
$4x \equiv 2$ (mod. 5).
By trial, it is seen that this congruence is satisfied by $x \equiv 3$ (mod. 5).
Here $r = 3$, and $r + km_1$ becomes $3 + 5k$. By giving $k$ the values 0, 1, 2,
we obtain the three roots (mod. 15): 3, 8, 13.
(2)                $8x \equiv 12$ (mod. 28).
Here                $d = 4$, $m_1 = 7$.
Dividing through by 4,
$2x \equiv 3$ (mod. 7).
By trial, this is seen to be satisfied by $x \equiv 5$ (mod. 7). Here
$r = 5$, and $r + km_1$ becomes $5 + 7k$. Giving $k$ the values 0, 1, 2, 3, we
obtain the four roots of the given congruence: 5, 12, 19, 26 (mod. 28).
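The two cases just treated may be combined in a single procedure. The sketch below follows the argument of the text step by step, finding the root $r$ of the reduced congruence by trial; `solve_linear` is a name adopted for this illustration.

```python
from math import gcd

def solve_linear(a, b, m):
    """All incongruent roots (mod. m) of a*x ≡ b, in increasing order."""
    d = gcd(a, m)
    if b % d:                          # b not a multiple of d: no solution
        return []
    a1, b1, m1 = a // d, b // d, m // d
    # Case I applies to the reduced congruence: exactly one root r (mod. m1).
    r = next(x for x in range(m1) if (a1 * x - b1) % m1 == 0)
    # The d roots (mod. m) are r + k*m1 for k = 0, 1, ..., d-1.
    return [r + k * m1 for k in range(d)]

print(solve_linear(12, 6, 15))   # example (1)
print(solve_linear(8, 12, 28))   # example (2)
```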
33. Numerical solution of the congruence $ax \equiv b$ (mod. $m$).
The preceding considerations merely proved the existence of
one or more roots in certain cases, but provided no method
other than trial for finding their numerical value. It will be
sufficient to find such a method for the case, $a$ prime to $m$, for
we have seen above that the solution of a congruence in which
$a$ is not prime to $m$, may be accomplished by the solution of a
congruence in which $a$ is prime to $m$.
We assert further that the solution of $ax \equiv b$ (mod. $m$) can
readily be found by means of the solution of $ax \equiv 1$ (mod. $m$).
For let $r$ be a root of the latter congruence; then
$ar \equiv 1$ (mod. $m$),
and, multiplying both members by $b$,
$a(br) \equiv b$ (mod. $m$).
That is, $br$ is the solution of the original congruence. The
problem is then reduced to solving the congruence:
$ax \equiv 1$ (mod. $m$).
In this congruence there are really two unknowns, $x$ and the
multiple of the modulus, call it $y$. That is, we seek values of
$x$ and $y$ to satisfy the equation:
$ax = 1 + my$
or
$ax - my = 1$.
But the last equation is familiar from the theory of continued
fractions.* If the fraction $\tfrac{a}{m}$ is developed into a continued
fraction, and if $\tfrac{Y}{X}$ is the last convergent before the value $\tfrac{a}{m}$
is reached, it is known that the relation $aX - mY = \pm 1$ holds.
Hence either $X$ or $-X$ is a root of $ax \equiv 1$ (mod. $m$).
We have thus established the following rule for the computation of the root of $ax \equiv b$ (mod. $m$).
Develop $\tfrac{a}{m}$ into a continued fraction. The denominator of the
last convergent before $\tfrac{a}{m}$ is reached will be the absolute value of
the root of $ax \equiv 1$ (mod. $m$). Determine by trial which sign is to
be taken; the value thus obtained multiplied by $b$ is the root of
$ax \equiv b$ (mod. $m$).
* It will be recalled that expressions of the form
$a + \cfrac{1}{b + \cfrac{1}{c + \cfrac{1}{d + \cdots}}}$, $a$ an integer, $b, c, d, \ldots$, integers $> 0$, are
called continued fractions.
Every rational fraction can be expressed as a terminated continued
fraction. Thus,
$-\dfrac{29}{11} = -3 + \dfrac{4}{11} = -3 + \cfrac{1}{\tfrac{11}{4}} = -3 + \cfrac{1}{2 + \tfrac{3}{4}} = -3 + \cfrac{1}{2 + \cfrac{1}{\tfrac{4}{3}}} = -3 + \cfrac{1}{2 + \cfrac{1}{1 + \tfrac{1}{3}}}$.
The fractions $\dfrac{a}{1}$, $a + \dfrac{1}{b}$, $a + \cfrac{1}{b + \cfrac{1}{c}}$, etc., are called the first, second, third,
..., convergents of the continued fraction. Thus the convergents of the
fraction used as example above are $-3$, $-3 + \dfrac{1}{2}$, $-3 + \cfrac{1}{2 + \cfrac{1}{1}}$, $-3 + \cfrac{1}{2 + \cfrac{1}{1 + \tfrac{1}{3}}}$,
or in reduced form, $-3$, $-\dfrac{5}{2}$, $-\dfrac{8}{3}$, $-\dfrac{29}{11}$.
For proof of the property used in the main text, see works on college
algebra.


For example:
$49x \equiv 23$ (mod. 125).
$\dfrac{49}{125} = \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{2 + \cfrac{1}{2}}}}}}$
The last convergent before the value itself is $\dfrac{20}{51}$.
Hence $X = 51$. By trial we find,
$49 \cdot 51 \equiv -1$ (mod. 125),
that is, $-51$ is a solution of $49x \equiv 1$ (mod. 125). Multiplying $-51$ by
23 we obtain a solution of the original congruence:
$23(-51) \equiv 77$ (mod. 125).
$\therefore$ 77 is the solution of the original congruence, as may be verified
by substitution.
The reader may solve and verify similarly other congruences
taken at random, such as:
$83x \equiv 7$ (mod. 96);  $11x \equiv 81$ (mod. 85);
$72x \equiv 27$ (mod. 75);  $75x \equiv 73$ (mod. 85);
and the like.
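The rule of this section lends itself to mechanical computation. The following is a sketch only, assuming $a$ prime to $m$ and $m > 1$; the convergents are formed by the usual recurrence, and the names `convergents` and `solve_by_continued_fraction` are hypothetical.

```python
def convergents(a, m):
    """Successive convergents (numerator, denominator) of the fraction a/m."""
    result = []
    h_prev, h = 0, 1          # numerators of the two preceding convergents
    k_prev, k = 1, 0          # denominators of the two preceding convergents
    while m:
        quotient, remainder = divmod(a, m)
        h_prev, h = h, quotient * h + h_prev
        k_prev, k = k, quotient * k + k_prev
        result.append((h, k))
        a, m = m, remainder
    return result

def solve_by_continued_fraction(a, b, m):
    """Least positive root of a*x ≡ b (mod. m); a is assumed prime to m, m > 1."""
    X = convergents(a, m)[-2][1]          # denominator of the last convergent before a/m
    r = X if a * X % m == 1 else -X       # the sign is determined by trial
    return b * r % m

print(convergents(49, 125))
print(solve_by_continued_fraction(49, 23, 125))
```

For the congruences proposed to the reader, the same function applies whenever the coefficient is prime to the modulus; the others must first be treated as in sec. 32, Case II.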
34. Fermat's Theorem. If $p$ is a prime, and $a$ is prime to
$p$, then $a^{p-1} \equiv 1$ (mod. $p$).*
Proof. We have already proved (sec. 32) that the numbers
$a, 2a, 3a, \ldots, (p-1)a$
are congruent (mod. $p$) to the residues
$1, 2, 3, \ldots, p-1$

* Announced without proof by Fermat in 1679; first proved by Euler in
1736. The Chinese are thought to have known this theorem for the case
$a = 2$, as early as 500 B.C.


in some order.* Multiplying these congruences together we
have
$a \cdot 2a \cdot 3a \cdots (p-1)a \equiv 1 \cdot 2 \cdot 3 \cdots (p-1)$ (mod. $p$),
that is,
$1 \cdot 2 \cdot 3 \cdots (p-1) \cdot a^{p-1} \equiv 1 \cdot 2 \cdot 3 \cdots (p-1)$ (mod. $p$).
Dividing both members by $1 \cdot 2 \cdot 3 \cdots (p-1)$, which is prime
to the modulus, we obtain the desired result,
$a^{p-1} \equiv 1$ (mod. $p$).
35. Applications. (1) Find a congruence equivalent to the following, but of degree lower than 13:
$x^{27} + 3x^{25} + 4x^{18} - 3x^{17} + 6x^{13} - 2x^7 + 11x - 5 \equiv 0$ (mod. 13).
By inspection it is evident that $x \equiv 0$ (mod. 13) does not satisfy
this congruence; we accordingly know that any root $x$ is prime to 13,
and hence that $x^{13-1} \equiv 1$ (mod. 13). Further,
$x^{27} = (x^{12})^2 \cdot x^3 \equiv (1)^2 x^3 \equiv x^3$ (mod. 13),
$3x^{25} = 3(x^{12})^2 \cdot x \equiv 3x$ (mod. 13),
$4x^{18} = 4x^{12} \cdot x^6 \equiv 4x^6$ (mod. 13),
$3x^{17} \equiv 3x^5$ (mod. 13),
$6x^{13} \equiv 6x$ (mod. 13).
Substituting these results in the original congruence, we obtain,
$-2x^7 + 4x^6 - 3x^5 + x^3 + 20x - 5 \equiv 0$ (mod. 13).
(2) To find the remainder when $47^{7385}$ is divided by 17.
Dividing 7385 by $17-1$ or 16, we have,
$7385 = 461 \cdot 16 + 9$.
$47^{7385} = (47^{16})^{461} \cdot 47^9 \equiv (1)^{461} \cdot 47^9$ (mod. 17).
$47 = 2 \cdot 17 + 13$   or   $47 \equiv 13$ (mod. 17).
Hence, by sec. 28, IV, Cor. 2,
$(47)^9 \equiv 13^9$ (mod. 17),
or,                  $47^{7385} \equiv 13^9$ (mod. 17).
* This statement follows at once from the result in sec. 32, if we remember
that $0 \cdot a \equiv 0$ (mod. $p$).


We proceed to work out $13^9$:
$13^2 = 169 \equiv -1$ (mod. 17).
Squaring,             $13^4 \equiv 1$ (mod. 17).
Squaring again,       $13^8 \equiv 1$ (mod. 17).
Multiplying both members by 13,
$13^9 \equiv 13$ (mod. 17).
$\therefore 47^{7385} \equiv 13$ (mod. 17).
The remainder is 13.
The reader may solve similarly other problems of this sort taken
at random. For example, to find the remainder when 1237841 is divided
by 29; when 30067489 is divided by 41, and the like.
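The reduction of the exponent used in example (2) can be written out as a short routine. This is a sketch under the hypotheses of Fermat's theorem (the modulus $p$ a prime, the base prime to $p$), which the routine itself does not check; the name `residue_by_fermat` is hypothetical.

```python
def residue_by_fermat(base, exponent, p):
    """Residue of base**exponent (mod. p), p prime and base prime to p.

    The exponent is first replaced by its remainder on division by p - 1,
    as in the text, where 7385 = 461*16 + 9 leaves the exponent 9.
    """
    reduced = exponent % (p - 1)
    result = 1
    for _ in range(reduced):          # at most p - 2 multiplications remain
        result = result * base % p
    return result

print(residue_by_fermat(47, 7385, 17))   # the remainder found in example (2)
```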
(3) If $n$ is any integer greater than 1, show that $n^{13}-n$ has the
factor 2730.
Since $2730 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 13$, it is sufficient to show that $n^{13}-n$ has
each of these primes as factor.
The factor 2. $n^{13}-n = n(n^{12}-1)$.
If $n$ is even the factor 2 is present. If $n$ is odd we must show that
$n^{12}-1$ is even. This is evident at once, since any power of an odd number is odd; hence $n^{12}$ is odd and $n^{12}-1$ is even.
It also appears by Fermat's theorem thus:
Since $n$ is prime to 2,
$n^{2-1} \equiv 1$ (mod. 2).
$\therefore n^{12} = (n^{2-1})^{12} \equiv 1$ (mod. 2),
or,                   $n^{12}-1 \equiv 0$ (mod. 2).
The factor 3. As above, unless $n$ is a multiple of 3, we must show
that $n^{12}-1$ is a multiple of 3.
But by Fermat's theorem,
$n^{3-1} \equiv 1$ (mod. 3).
$\therefore n^{12} = (n^{3-1})^6 \equiv 1$ (mod. 3),
and                $n^{12}-1 \equiv 0$ (mod. 3).
Similarly by writing our given expression in the forms,
$n[(n^{5-1})^3-1]$,  $n[(n^{7-1})^2-1]$,  $n[n^{13-1}-1]$,
we show that it must have the factors 5, 7, and 13, and the proof is
completed.
(4) Show that every prime number (except 2 and 5) is a factor
of a boundless number of numbers all of whose digits are 9's.
Let $p$ be a prime other than 2 or 5. Then $10^n$ is prime to $p$. Hence,
by Fermat's theorem,
$(10^n)^{p-1} - 1 \equiv 0$ (mod. $p$).
This is true for every $n$.
The number $(10^n)^{p-1}-1$ always consists of 9's exclusively, and
hence the theorem is proved.
(5) The congruence $ax \equiv b$ (mod. $p$), where $a$ is prime to $p$, can be
solved by multiplying both members by $a^{p-2}$ and applying Fermat's
theorem, with the result, $x \equiv ba^{p-2}$ (mod. $p$).
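Applications (3), (4), and (5) may each be tried numerically. The sketch below is illustrative; `nines_number` and `solve_mod_prime` are hypothetical names, and Python's built-in `pow(a, p - 2, p)` is used to form the residue of $a^{p-2}$.

```python
def nines_number(p, n):
    """The number (10^n)^(p-1) - 1 of application (4)."""
    return (10 ** n) ** (p - 1) - 1

def solve_mod_prime(a, b, p):
    """Root of a*x ≡ b (mod. p) as b*a^(p-2), p prime and a prime to p."""
    return b * pow(a, p - 2, p) % p

# Application (3): n^13 - n has the factor 2730 for each n tried.
print(all((n ** 13 - n) % 2730 == 0 for n in range(2, 100)))
# Application (4): a number written with 9's only, and a multiple of 7.
print(nines_number(7, 1), nines_number(7, 1) % 7)
# Application (5): the root of 5x ≡ 3 (mod. 7).
print(solve_mod_prime(5, 3, 7))
```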
36. Wilson's Theorem. If $p$ is a prime, $(p-1)! \equiv -1$
(mod. $p$).*
For $p = 2$, the theorem is obviously true.
We accordingly suppose $p > 2$, and to prove the theorem
for this case, first prove the following lemma:
Lemma. The root of the congruence $ax \equiv 1$ (mod. $p$) is congruent to $a$, if, and only if, $a \equiv 1$ or $a \equiv p-1$ (mod. $p$).
Proof. By sec. 32 the congruence
$ax \equiv 1$ (mod. $p$)
has one root. Suppose it to be $a$. Then
$a^2 \equiv 1$ (mod. $p$)
or            $(a-1)(a+1) \equiv 0$ (mod. $p$).
But a product is a multiple of a prime $p$, if and only if one of
its factors is a multiple of $p$. Hence, either $a-1 \equiv 0$ or $a+1 \equiv 0$
(mod. $p$). That is, the root of the congruence $ax \equiv 1$ (mod. $p$)
can be congruent to $a$ only if $a \equiv 1$, or $a \equiv p-1$ (mod. $p$). It can
readily be verified that the root is congruent to $a$ in these cases,
and the lemma is thus proved.
* First published, without proof, by Waring in his Meditationes Algebraicae, Cambridge, 1770, and ascribed by him to J. Wilson. It was proved
by Euler in 1773, and by Gauss in his Disquisitiones Arithmeticae, 1801.


If now $a_1$ is one of the numbers 2, 3, 4, ..., $p-2$, the root
of the congruence $a_1x \equiv 1$ (mod. $p$) will, by the lemma, be different from $a_1$; calling the root $a_2$, we have $a_1a_2 \equiv 1$ (mod. $p$).
Consider next $a_3$, a third number of the set above. Denote
the root of the congruence $a_3x \equiv 1$ (mod. $p$) by $a_4$. Then we
have $a_3a_4 \equiv 1$ (mod. $p$), and by the lemma, $a_4$ is not congruent
to $a_3$. We show further that it is not congruent to $a_2$. For,
if $a_2 \equiv a_4$ (mod. $p$), then multiplying both members by $a_3$,
$a_3a_2 \equiv a_3a_4$ (mod. $p$)
or                 $a_3a_2 \equiv 1$ (mod. $p$).
But we know        $a_1a_2 \equiv 1$ (mod. $p$). $\therefore a_3a_2 \equiv a_1a_2$ (mod. $p$).
Dividing both members by $a_2$,
$a_3 \equiv a_1$ (mod. $p$).
This is contrary to the choice of $a_3$ as different from $a_1$. Hence
the hypothesis $a_2 \equiv a_4$ (mod. $p$) is incorrect. Similarly, it
appears that $a_4 \not\equiv a_1$ (mod. $p$).
If now $a_5$ is a fifth number of the set different from the four
already considered, and if $a_6$ denote the root of the congruence
$a_5x \equiv 1$ (mod. $p$), then by the same reasoning as above it appears
that $a_6$ is not congruent to any one of the numbers $a_1, \ldots, a_5$.
Continuing in the same way, the entire set of numbers 2,
3, ..., $p-2$ can be grouped in pairs such that the product of
the numbers in each pair is congruent to 1 (mod. $p$).
That is,          $a_1a_2 \equiv 1$,
$a_3a_4 \equiv 1$,
. . . . .
$a_{p-4}a_{p-3} \equiv 1$ (mod. $p$).
We know further that $p-1 \equiv -1$ (mod. $p$).
Multiplying all these congruences member by member and
remembering that the $a$'s are the numbers 2, 3, ..., $p-2$ in
some order, we obtain
$2 \cdot 3 \cdots (p-2)(p-1) \equiv -1$ (mod. $p$)
or                         $(p-1)! \equiv -1$ (mod. $p$).
37. Wilson's theorem does not hold for composite moduli.
For if $m$ is a composite number, and $k$ is one of its factors
($1 < k < m$), then $(m-1)!$ will have $k$ as a factor, and consequently
$(m-1)!+1$ will not be a multiple of $k$ and, therefore, not of $m$.
Accordingly, Wilson's theorem furnishes a theoretically complete criterion for determining whether or not a given number
$n$ is prime. Namely, form $(n-1)!$; divide it by $n$; if the residue
$-1$ can be obtained, $n$ is a prime; otherwise, $n$ is composite.
But with large numbers, this method is of no practical use, on
account of the enormous calculations that would be required.
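For small numbers the criterion of sec. 37 is easily applied. The following sketch, with the hypothetical name `is_prime_by_wilson`, forms $(n-1)!$ factor by factor, reducing (mod. $n$) at each step; it still requires about $n$ multiplications and so illustrates the criterion without removing the objection just stated.

```python
def is_prime_by_wilson(n):
    """Wilson's criterion for n > 1: n is prime iff (n-1)! ≡ -1 (mod. n)."""
    factorial = 1
    for k in range(2, n):
        factorial = factorial * k % n     # residue of k! (mod. n)
    return factorial == n - 1             # n - 1 is the least positive residue of -1

print([n for n in range(2, 30) if is_prime_by_wilson(n)])
```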
38. Applications. (1) If $p$ is a prime number, the residue when
$1 \cdot 2 \cdot 3 \cdots (p-1)$ is divided by $1 + 2 + 3 + \cdots + (p-1)$ is $p-1$. In
symbols:
$(p-1)! \equiv p-1$ (mod. $1 + 2 + 3 + \cdots + (p-1)$).
By Wilson's theorem,
$1 \cdot 2 \cdot 3 \cdots (p-1) = -1 + kp = (k-1)p + (p-1)$.
As the left member has the factor $p-1$, and the second term on the
right is $p-1$, it follows that $(k-1)p$ must have the factor $p-1$, and
since $p-1$ is prime to $p$, that $k-1$ has the factor $p-1$.
Let                   $k-1 = h(p-1)$.
Substituting, we obtain,
$1 \cdot 2 \cdot 3 \cdots (p-1) = h(p-1)p + p - 1 = 2h \cdot \dfrac{(p-1)p}{2} + p - 1$.
But $\dfrac{(p-1)p}{2} = 1 + 2 + \cdots + (p-1)$.
Hence the assertion is proved.
(2) If $p$ is a prime of the form $4n+1$, then $(1 \cdot 2 \cdot 3 \cdots 2n)^2 + 1$ is a multiple of $p$.
By Wilson's theorem,
$1 \cdot 2 \cdot 3 \cdots (p-1) + 1 \equiv 0$ (mod. $p$),
or $1 \cdot 2 \cdot 3 \cdots 2n(2n+1) \cdots (4n-2)(4n-1)(4n) + 1 \equiv 0$ (mod. $p$). (1)
But, since $p = 4n+1$,
$4n \equiv -1$ (mod. $p$),
$4n-1 \equiv -2$ (mod. $p$),
. . . . .
$2n+2 \equiv -(2n-1)$ (mod. $p$),
$2n+1 \equiv -2n$ (mod. $p$).
$\therefore (2n+1)(2n+2) \cdots (4n-1)(4n) \equiv (-1)^{2n} \cdot 1 \cdot 2 \cdot 3 \cdots 2n$ (mod. $p$). (2)
From (2) and (1),
$[1 \cdot 2 \cdot 3 \cdots 2n]^2 + 1 \equiv 0$ (mod. $p$).
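Both applications may be checked for particular primes. The sketch below is illustrative only; the function names are hypothetical, and the first application is tried for odd primes, the sum $1 + 2 + \cdots + (p-1)$ being taken in its closed form $\tfrac{(p-1)p}{2}$.

```python
from math import factorial

def application_1(p):
    """Residue of (p-1)! on division by 1 + 2 + ... + (p-1)."""
    return factorial(p - 1) % (p * (p - 1) // 2)

def application_2(p):
    """Residue (mod. p) of (1*2*3*...*2n)^2 + 1, where p = 4n + 1."""
    n = (p - 1) // 4
    return (factorial(2 * n) ** 2 + 1) % p

for p in (3, 5, 7, 11, 13):
    print(p, application_1(p))      # the residue p - 1 in each case
for p in (5, 13, 17, 29):
    print(p, application_2(p))      # zero for primes of the form 4n + 1
```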
V. BINOMIAL CONGRUENCES
39. Definition. Congruences of the form $x^n - A \equiv 0$ (mod. $m$)
are called binomial congruences.
We shall consider only $x^n - 1 \equiv 0$ (mod. $p$) where $p$ is a
prime. By Fermat's theorem (sec. 34) we can always make
$n < p$. If $p = 2$, the congruence is linear (since $n < p$), and has
already been solved. We accordingly suppose throughout the
subject of binomial congruences that $p$ is a prime greater than 2.
40. The solutions of $x^m \equiv 1$ (mod. $p$) where $m$ is any positive
integer, must be prime to $p$, and are therefore, by Fermat's
theorem, also solutions of:
$x^{p-1} \equiv 1$ (mod. $p$).
Further, by sec. 28, IV, Cor. 2, every solution of
$x^m \equiv 1$ (mod. $p$)
will also be a solution of
$x^{km} \equiv 1$ (mod. $p$),
for every $k$.
41. Theorem. If $a$ is a root of $x^n \equiv 1$ (mod. $p$) and also of
$x^q \equiv 1$ (mod. $p$) and if $d$ is the highest common factor of $n$ and $q$,
then $a$ is a root of $x^d \equiv 1$ (mod. $p$).
Proof. Let $n = n'd$, $q = q'd$. Then $n'$ and $q'$ are relatively
prime, and the congruence:
$n'z \equiv 1$ (mod. $q'$)
admits one solution (sec. 32). That is, there exist numbers
$z$ and $y$ satisfying the equation
$n'z = 1 + yq'$
or                $n'z - yq' = 1$;
or, multiplying through by $d$,
$nz - yq = d$.
By hypothesis: $a^n \equiv 1$ (mod. $p$); hence $a^{nz} \equiv 1$ (mod. $p$); and
$a^q \equiv 1$ (mod. $p$); hence $a^{qy} \equiv 1$ (mod. $p$).
Subtracting:
$a^{nz} - a^{qy} \equiv 0$ (mod. $p$)
or             $a^{qy}(a^{nz-qy} - 1) \equiv 0$ (mod. $p$).
Since $a$ must be prime to $p$,
$a^{nz-qy} - 1 \equiv 0$ (mod. $p$)
or                    $a^d - 1 \equiv 0$ (mod. $p$).
That is:
$a$ is a root of $x^d \equiv 1$ (mod. $p$).
Corollary. If $d$ is the highest common factor of $n$ and $p-1$,
the solutions of $x^n \equiv 1$ (mod. $p$) satisfy also $x^d \equiv 1$ (mod. $p$).
It is accordingly sufficient to consider only congruences of
the type:
$x^d \equiv 1$ (mod. $p$),  $d$ a divisor of $p-1$.
42. Definition. The number $a$ is said to belong to the
exponent $d$ (mod. $p$), if $a^d \equiv 1$ (mod. $p$) and if $a^y \not\equiv 1$ (mod. $p$),
whenever $0 < y < d$.


342


MODERN MATHEMATICS


43. Theorem. If a belongs to the exponent d (mod. p), then
at 1 (mod. p) if and only if t is a multiple of d.
Proof. Let          at=l (mod. p)
and let D be the highest common factor of t and d. Hence
by sec. 41, aD-1 (mod. p). If t is not a multiple of d, D&lt; d,
and in this case a would satisfy a congruence of degree
D, less than d, the exponent to which a belongs. Hence t must
be a multiple of d.
44. Theorem. If a belongs to the exponent r, and b belongs
to the exponent s (mod. p) and if r and s are relatively prime,
then ab belongs to the exponent rs (niod. p).
Proof. The hypotheses mean that:
$$a^r \equiv 1 \pmod{p}, \qquad b^s \equiv 1 \pmod{p},$$
and that $a$ and $b$ satisfy no congruences of this type of lower
degree. We have to prove (i) that $(ab)^{rs} \equiv 1 \pmod{p}$, and
(ii) that no lower power of $ab$ is congruent to 1 (mod. $p$).
(i)
$$(ab)^{rs} = a^{rs} b^{rs} = (a^r)^s (b^s)^r \equiv 1^s \cdot 1^r \equiv 1 \pmod{p}.$$
(ii) Let $k$ be any exponent such that
$$(ab)^k \equiv 1 \pmod{p}.$$
Then
$$a^k b^k \equiv 1 \pmod{p}.$$
Raising both members to the power $r$,
$$a^{rk} b^{rk} \equiv 1 \pmod{p},$$
or, since $a^r \equiv 1 \pmod{p}$,
$$b^{rk} \equiv 1 \pmod{p}.$$
Hence, since $b$ belongs to $s$, $rk$ is a multiple of $s$, by sec.
43, and therefore since $r$ is prime to $s$, $k$ is a multiple of $s$.
Quite similarly, it may be shown that $k$ is a multiple of $r$.
Hence, since $r$ and $s$ are relatively prime, $k$ is a multiple of $rs$.
Hence the lowest value of $k$ is $rs$ itself, and the proof that $ab$
belongs to $rs$ is completed.
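A small numerical instance may make the theorem of sec. 44 concrete. The values $p = 31$, $a = 2$, $b = 5$ and the helper `order_mod` are assumptions chosen for this sketch.

```python
from math import gcd

def order_mod(a, p):
    """Exponent to which a belongs (mod p); assumes a is prime to p."""
    d, power = 1, a % p
    while power != 1:
        power = power * a % p
        d += 1
    return d

p, a, b = 31, 2, 5                 # sample values, assumed for illustration
r, s = order_mod(a, p), order_mod(b, p)
assert gcd(r, s) == 1              # hypothesis of the theorem

order_ab = order_mod(a * b, p)
assert order_ab == r * s           # ab belongs to the exponent rs
```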
45. If r and s are not relatively prime, and if m denote
their least common multiple, it can be proved in an analogous
manner that a number belonging to m can be determined by
means of a and b.
46. Theorem. To every divisor, d, of p-1, there belongs
(mod. p) at least one number a.
Proof. (1) We take up first the case: $d = q^{\alpha}$, where $q$ is a
prime. Write $p - 1 = f q^{\alpha}$.
Then the congruence:
$$x^{p-1} - 1 \equiv 0 \pmod{p} \tag{1}$$
may be written
$$x^{f q^{\alpha}} - 1 \equiv 0 \pmod{p}; \tag{2}$$
or
$$\left(x^{q^{\alpha}} - 1\right)\left(x^{(f-1)q^{\alpha}} + x^{(f-2)q^{\alpha}} + \cdots + x^{q^{\alpha}} + 1\right) \equiv 0 \pmod{p}. \tag{3}$$
But by Fermat's theorem the congruence (1) is satisfied
for every value of $x$ except those $\equiv 0 \pmod{p}$. Accordingly
the congruence (3) has the maximum number of roots. But
since neither factor of the left member of (3) can be congruent
zero for more roots than there are units in its degree, it follows
that each factor is congruent zero for as many roots as there
are units in its degree. In particular:
$$x^{q^{\alpha}} - 1 \equiv 0 \pmod{p} \tag{4}$$
has $q^{\alpha}$ roots. Some of these will also satisfy congruences of
this type and of lower degree. By secs. 41, 40, all such roots
will satisfy $x^{q^{\alpha-1}} - 1 \equiv 0 \pmod{p}$. But by what has just been
proved this congruence has precisely $q^{\alpha-1}$ roots. Hence (sec.
40), there are precisely $q^{\alpha} - q^{\alpha-1}$ or $q^{\alpha-1}(q-1)$ roots of the
congruence (4) that satisfy no congruence of lower degree.
That is, there exist precisely $q^{\alpha-1}(q-1)$ incongruent numbers
belonging to the exponent $q^{\alpha}$ (mod. $p$).




(2) Case, $d$ any divisor of $p-1$.
Let $d = q^{\alpha} r^{\beta} s^{\gamma} \cdots$, where $q$, $r$, $s$ are different primes. Then
by (1), there exists a number, call it $a$, belonging to $q^{\alpha}$; and
there exists a number, call it $b$, belonging to $r^{\beta}$. Hence, by
the theorem of sec. 44, $ab$ belongs to $q^{\alpha} r^{\beta}$.
By (1) there exists a number, call it $c$, belonging to $s^{\gamma}$.
Since $q^{\alpha} r^{\beta}$ and $s^{\gamma}$ are relatively prime, $(ab)c$ belongs to $q^{\alpha} r^{\beta} s^{\gamma}$.
Continuing in this way, all the factors of $d$ are used, and the
existence of a number belonging to $d$ is established.
Corollary. There exists at least one number, call it g,
belonging to p-1.
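The counts obtained in part (1) of the proof, and the existence statement of the theorem, can be tabulated for a small prime. The sketch below assumes the sample prime $p = 37$ (so that $p - 1 = 2^2 \cdot 3^2$) and a helper `order_mod`; both are illustrative choices, not taken from the text.

```python
from collections import Counter

def order_mod(a, p):
    """Exponent to which a belongs (mod p); assumes a is prime to p."""
    d, power = 1, a % p
    while power != 1:
        power = power * a % p
        d += 1
    return d

p = 37                                        # assumed sample prime
counts = Counter(order_mod(a, p) for a in range(1, p))
divisors = [d for d in range(1, p) if (p - 1) % d == 0]

# To every divisor d of p - 1 there belongs at least one number.
assert all(counts[d] >= 1 for d in divisors)

# For a prime power q^alpha the count is q^(alpha-1) * (q - 1).
assert counts[2 ** 2] == 2 ** 1 * (2 - 1)
assert counts[3 ** 2] == 3 ** 1 * (3 - 1)
```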
47. Definition. If $g$ belongs to the exponent $p-1$, then
$g$ is called a primitive root of the congruence
$$x^{p-1} \equiv 1 \pmod{p},$$
or briefly, a primitive root of $p$.
48. Theorem. If $g$ is a primitive root of $p$, the numbers
$g, g^2, g^3, \ldots, g^{p-1}$ are distinct (mod. $p$) and have the residues
$1, 2, 3, \ldots, p-1$ in some order.
Proof. Suppose $g^h \equiv g^k \pmod{p}$, $p-1 \ge h > k \ge 1$.
Then $g^{h-k} \equiv 1 \pmod{p}$.
But $p-1 > h-k \ge 1$. Hence this result contradicts the
hypothesis that $g$ is a primitive root of $p$. Consequently, the
$p-1$ powers, $g, g^2, \ldots, g^{p-1}$, all have different residues (mod. $p$),
and therefore have the residues $1, 2, 3, \ldots, p-1$ in some order.
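As a quick illustration of sec. 48, the powers of a primitive root may be listed and compared with the full set of non-zero residues. The sketch takes $p = 17$ and $g = 3$ (shown in sec. 50 to be a primitive root of 17); the variable names are illustrative.

```python
p, g = 17, 3
powers = [pow(g, k, p) for k in range(1, p)]   # g, g^2, ..., g^(p-1) reduced mod p

# The p - 1 powers are distinct and exhaust the residues 1, 2, ..., p - 1.
assert sorted(powers) == list(range(1, p))
```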
49. Theorem. If $g$ is a primitive root of $p$, and if $k$ is prime
to $p-1$, $g^k$ is a primitive root of $p$.
Proof. Let $(g^k)^h \equiv 1 \pmod{p}$.
Then, since $g$ belongs to exponent $p-1$,
$$kh \equiv 0 \pmod{p-1}.$$
Hence, since $k$ is relatively prime to $p-1$,
$$h \equiv 0 \pmod{p-1}.$$
The lowest admissible value of $h$ is therefore $p-1$; that
is, $g^k$ belongs to the exponent $p-1$, and is hence a primitive
root of $p$.
Corollary. There are $\varphi(p-1)$ primitive roots of $p$.
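The theorem and its corollary can be confirmed for a small modulus. In this sketch the helpers `order_mod` and `phi` and the sample values $p = 17$, $g = 3$ are assumptions for illustration; `phi` counts the integers prime to its argument directly.

```python
from math import gcd

def order_mod(a, p):
    """Exponent to which a belongs (mod p); assumes a is prime to p."""
    d, power = 1, a % p
    while power != 1:
        power = power * a % p
        d += 1
    return d

def phi(n):
    """Number of integers in 1..n that are prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

p, g = 17, 3
primitive_roots = [a for a in range(1, p) if order_mod(a, p) == p - 1]

# Powers g^k with k prime to p - 1 give exactly the primitive roots.
from_powers = sorted(pow(g, k, p) for k in range(1, p) if gcd(k, p - 1) == 1)
assert from_powers == primitive_roots
assert len(primitive_roots) == phi(p - 1)
```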
50. The actual value of a primitive root may be found by
trial, if the modulus is small.
Thus, for $p = 17$, we try 2:
$$\begin{aligned}
2^1 &= 2, & 2^5 &\equiv -2 \pmod{17},\\
2^2 &= 4, & 2^6 &\equiv -4 \pmod{17},\\
2^3 &= 8, & 2^7 &\equiv -8 \pmod{17},\\
2^4 &= 16 \equiv -1 \pmod{17}, & 2^8 &\equiv -16 \equiv 1 \pmod{17}.
\end{aligned}$$
That is, 2 belongs to the exponent 8, and is not a primitive root
of 17. Nor can any of the residues obtained, 2, 4, 8, 16, 15 ($\equiv -2$),
13, 9, 1, be primitive roots. For they are all of the form $2^k$, and $(2^k)^8$
or $2^{8k}$ is $\equiv 1 \pmod{p}$ since $2^8$ is so. Hence all of these residues belong
either to 8 or to a divisor of 8.
The smallest number not in the above list is 3. Trying 3 it is found
to belong to the exponent 16; that is, 3 is a primitive root of 17.
It can also be proved without trial that 3 must be a primitive root.
For, since $\varphi(16) = 8$, 17 has 8 primitive roots. But there are 8 residues
of $2^k$, none of which is a primitive root; consequently, each one of the
8 other non-zero residues must be a primitive root. In particular, 3
is a primitive root.
If the second trial likewise does not lead to a primitive root, the
theorems above (secs. 44, 45) enable us to determine a number belonging to the least common multiple of the two exponents. If this least
common multiple is p-1 itself, we have found a primitive root. If
not, we have at least a number belonging to a much larger exponent,
and all its powers are thus excluded from further consideration. In
this way, systematic trial enables us to find a primitive root.
For large primes, the calculations may become laborious. Some general theorems are known as to primitive roots. For example: If a prime
is of the form $2^{2n} + 1$, it has the primitive root 3.
If a prime $p$ is of the form $8n+3$, and if $4n+1$ is also prime, $p$ has
the primitive root 2.*
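The search by trial described in sec. 50, and the two general theorems just quoted, can be tried out mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the test used in `is_primitive_root` (that $g^{(p-1)/q} \not\equiv 1$ for every prime $q$ dividing $p-1$) is a shortcut that follows from sec. 43 rather than the step-by-step exclusion described in the text.

```python
def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    factors, q = [], 2
    while q * q <= n:
        if n % q == 0:
            factors.append(q)
            while n % q == 0:
                n //= q
        q += 1
    if n > 1:
        factors.append(n)
    return factors

def is_primitive_root(g, p):
    """True if g belongs to the exponent p - 1 (mod p); p is assumed an odd prime."""
    if g % p == 0:
        return False
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

def least_primitive_root(p):
    """Smallest primitive root of the odd prime p, found by trial."""
    return next(g for g in range(2, p) if is_primitive_root(g, p))

# The example of the text: 2 fails for p = 17, and 3 succeeds.
assert not is_primitive_root(2, 17)
assert least_primitive_root(17) == 3

# Primes of the form 2^(2n) + 1 (here 5, 17, 257, 65537) have the primitive root 3.
assert all(is_primitive_root(3, p) for p in (5, 17, 257, 65537))

# Primes p = 8n + 3 with 4n + 1 also prime (n = 1, 7, 10) have the primitive root 2.
assert all(is_primitive_root(2, p) for p in (11, 59, 83))
```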


* Tschebyscheff, Theorie der Congruenzen, Berlin, 1889, p. 306, et seq.,
where others are given and proved.




VI. QUADRATIC CONGRUENCES
51. The most general congruence of the second degree in
one unknown is:
$$ax^2 + bx + c \equiv 0 \pmod{m}.$$
We simplify the form of this as follows: Multiply both members
and the modulus by $4a$,
$$4a^2x^2 + 4abx + 4ac \equiv 0 \pmod{4am}.$$
(The modulus is multiplied also, so that the inverse operation
may always be possible);
or:
$$(2ax + b)^2 - b^2 + 4ac \equiv 0 \pmod{4am}.$$
Putting
$$y \equiv 2ax + b \pmod{4am}, \qquad d \equiv b^2 - 4ac \pmod{4am},$$
the congruence becomes:
$$y^2 \equiv d \pmod{4am}.$$
From the values of $y$, we find the values of $x$, by solution of the
linear congruence:
$$y \equiv 2ax + b \pmod{4am}.$$
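The reduction of sec. 51 can be followed step by step in a short program. This is a sketch under stated assumptions: the function names are hypothetical, $a$ and $m$ are taken positive, and exhaustive searches stand in for the proper methods of solving $y^2 \equiv d$ and the linear congruence.

```python
def solve_quadratic_congruence(a, b, c, m):
    """Roots x (mod m) of a x^2 + b x + c ≡ 0 (mod m), via y^2 ≡ d (mod 4am)."""
    M = 4 * a * m                                  # enlarged modulus 4am
    d = (b * b - 4 * a * c) % M
    ys = [y for y in range(M) if (y * y - d) % M == 0]
    roots = set()
    for y in ys:
        # Linear congruence 2ax + b ≡ y (mod 4am); 2ax repeats with period 2m.
        for x in range(2 * m):
            if (2 * a * x + b - y) % M == 0:
                roots.add(x % m)
    return sorted(roots)

def brute_force(a, b, c, m):
    """Direct substitution of every residue, for comparison."""
    return [x for x in range(m) if (a * x * x + b * x + c) % m == 0]

# Example (assumed): 3x^2 + 5x + 2 ≡ 0 (mod 7).
assert solve_quadratic_congruence(3, 5, 2, 7) == brute_force(3, 5, 2, 7)
```

Because the modulus is multiplied along with both members, the two routes agree for composite moduli as well.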
52. If $4am = p_1^{k_1} p_2^{k_2} \cdots p_l^{k_l}$, where $p_1, p_2, \ldots, p_l$ are
different primes, any number that satisfies the congruence:
$$y^2 - d \equiv 0 \pmod{4am} \tag{1}$$
will also satisfy each of the congruences:
$$y^2 - d \equiv 0 \pmod{p_1^{k_1}}, \quad \ldots, \quad y^2 - d \equiv 0 \pmod{p_l^{k_l}}. \tag{2}$$
Conversely, the definition of a congruence shows that any
number that satisfies each of the congruences (2), satisfies also
congruence (1).




53. The solution of the general quadratic congruence is thus
reduced to that of the type:
$$x^2 - a \equiv 0 \pmod{p^k}, \qquad \text{where } p \text{ is a prime.}$$
Any solution of this congruence is also a solution of:
$$x^2 - a \equiv 0 \pmod{p^h}, \qquad \text{where } h < k,$$
and, in particular, of:
$$x^2 - a \equiv 0 \pmod{p}. \tag{3}$$
We shall restrict further consideration to congruences of
this type, and as preliminary example take the modulus 7.
Forming the squares of the seven least positive residues (mod. 7)
we have:
$$0^2 = 0, \quad 1^2 = 1, \quad 2^2 = 4, \quad 3^2 \equiv 2, \quad 4^2 \equiv 2, \quad 5^2 \equiv 4, \quad 6^2 \equiv 1 \pmod{7}.$$
We see from this that the congruence
$$x^2 \equiv a \pmod{7}$$
has a solution if $a \equiv 0, 1, 2, 4$, but has no solution if $a \equiv 3, 5, 6$.
The former numbers are residues of squares according to
the modulus 7, or briefly quadratic residues of 7; the latter
numbers are not such residues.
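The table of squares for the modulus 7 can be reproduced in a few lines. The function name `quadratic_residues` is an assumption made for this sketch.

```python
def quadratic_residues(p):
    """Residues of the squares 0^2, 1^2, ..., (p-1)^2 modulo p."""
    return sorted({x * x % p for x in range(p)})

residues = quadratic_residues(7)
non_residues = [a for a in range(7) if a not in residues]
```

Here `residues` holds the values of $a$ for which $x^2 \equiv a \pmod{7}$ has a solution, and `non_residues` the remaining ones.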
54. Definition. If the congruence $x^2 \equiv a \pmod{p}$ admits a
solution, the number $a$ is called a quadratic residue of $p$; otherwise it is called a quadratic non-residue of $p$. When there is
no danger of misinterpretation, the word "quadratic" is often
omitted for brevity.
55. It can be proved without much difficulty that the
product of two residues of p is also a residue; that the product
of two non-residues is a residue; and that the product of a
residue and a non-residue is a non-residue.
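56. These results can be stated in the form of a single
equation by the use of the symbol $\left(\frac{a}{p}\right)$,* which is defined as
having the value $+1$, if $a$ is a residue of $p$, and $-1$, if $a$ is a
non-residue of $p$. Then we have always:
$$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right).$$
It follows that if
$$m = (-1)^a\, 2^b\, p_1^{c}\, p_2^{d} \cdots$$
then:
$$\left(\frac{m}{p}\right) = \left(\frac{-1}{p}\right)^{a} \left(\frac{2}{p}\right)^{b} \left(\frac{p_1}{p}\right)^{c} \left(\frac{p_2}{p}\right)^{d} \cdots$$
* Introduced by Legendre, and known as "Legendre's symbol."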
57. To determine whether or not any number m is a residue
of p, it is sufficient to determine whether or not -1, 2, and
the odd prime factors of m are residues of p.
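The following results may be proved:
I. $\left(\dfrac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$.
II. $\left(\dfrac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}$.
III. $\left(\dfrac{p}{q}\right)\left(\dfrac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}$ ($p$, $q$ odd primes).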
58. The last is an important theorem, known as Legendre's
Law of Reciprocity, and may be stated as follows: If p and q
are two odd primes and if at least one of them is of the form 4n + 1,
then q is a residue of p, if and only if p is a residue of q, while
if both p and q are of the form 4n + 3, then q is a residue of p when
p is a non-residue of q, and vice versa.
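The multiplicative property of sec. 56 and the results I, II, III of sec. 57 can be verified for small primes. In this sketch the function `legendre` evaluates the symbol straight from its definition by listing the squares; the name and the list of sample primes are assumptions for illustration.

```python
def legendre(a, p):
    """Legendre's symbol (a/p) for an odd prime p and a prime to p, from the definition."""
    squares = {x * x % p for x in range(1, p)}
    return 1 if a % p in squares else -1

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29]     # sample primes (assumed)

for p in odd_primes:
    # Product rule: (a/p)(b/p) = (ab/p).
    assert all(legendre(a, p) * legendre(b, p) == legendre(a * b, p)
               for a in range(1, p) for b in range(1, p))
    assert legendre(-1, p) == (-1) ** ((p - 1) // 2)            # result I
    assert legendre(2, p) == (-1) ** ((p * p - 1) // 8)         # result II
    for q in odd_primes:
        if q != p:                                              # result III
            sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
            assert legendre(p, q) * legendre(q, p) == sign
```

For instance, 3 and 7 are both of the form $4n+3$, and the two symbols $\left(\frac{3}{7}\right)$ and $\left(\frac{7}{3}\right)$ come out with opposite signs, as the law of reciprocity requires.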
This theorem was discovered empirically by Euler (1783),
announced in its general form by Legendre (1785), and partly
proved by him. The first complete proof was, however, due
to Gauss, who gave eight distinct proofs. Many others have
been given down to the present time.* For further information and for a full presentation of some of the proofs, the reader
is referred to the works mentioned in the Bibliography.
* A chronological list of 49 proofs, extending from the first proof, published by Gauss in 1801, to three proofs by Lange in 1896-97, is given in
Bachmann, Niedere Zahlentheorie, I, pp. 203-4.




VII. BIBLIOGRAPHY
59. The classic work in our subject is the Disquisitiones Arithmeticae
of C. F. Gauss, published in 1801, when Gauss was only twenty-four years
of age, and really completed a few years earlier. In this work Gauss
gave a masterly presentation of the subject which has remained unequalled; unlike many masterpieces, it is written so clearly and simply
that much of it is intelligible to the beginner. A German translation
by Maser (Berlin, 1889), and a French translation by Poullet-Delisle
(Paris, 1807), make the work more widely accessible.
The following texts also take up the subject from the beginning,
reaching varying degrees of advancement:
Dirichlet-Dedekind, Zahlentheorie, Braunschweig, 4th ed., 1894.
Bachmann, Niedere Zahlentheorie, I, Leipzig, 1902.
Cahen, Théorie des Nombres, Paris, 1900.
Mathews, Theory of Numbers, I, Cambridge, 1892.
These works contain numerous references, both to the older and
the contemporary literature. An excellent sketch of the principal
results and present state of our subject is given in the Encyklopädie der
Mathematischen Wissenschaften, Band I, 2ter Teil, appearing with
additions in the French translation, Encyclopédie des Sciences Mathématiques, Tome I, Vol. III.
The theory of numbers figures largely in the field of "Mathematical
recreations." An introduction to this field may be obtained through
some or all of the following:
Ball, Mathematical Recreations and Problems, 3d ed., London, 1890.
Bachet de Méziriac, Problèmes plaisants et délectables qui se font
par les nombres. First published in 1612, and reprinted at Paris in
1884.
Lucas, Récréations Mathématiques, 4 vols., Paris, 1891-96.
Ahrens, Mathematische Unterhaltungen und Spiele, Leipzig, 1900.
In this connection mention may also be made of a paper by Bouton
on "Nim, A Game with a Complete Mathematical Theory" (Annals of
Math., ser. 2, Vol. III, pp. 35-39, 1901), recently generalized by Moore
(ibid., Vol. XI, pp. 90-94, 1910).


VIII
CONSTRUCTIONS WITH RULER AND COMPASSES;
REGULAR POLYGONS
By L. E. DICKSON


CONTENTS


1. Introduction.
2. Analytic criterion for constructibility.
3. Graphical solution of a quadratic equation.
4. Domain of rationality.
5. Functions involving no irrationalities other than square root.
9. Reducible and irreducible functions.
11. Fundamental theorem; Duplication of the cube; Trisection
of an angle; Quadrature of the circle.
13. Connection between regular polygons and roots of unity.
14. De Moivre's theorem.
17. Regular pentagon and decagon.
19. Regular polygon of 17 sides.
20. Construction of the regular polygon of 17 sides.
21. Gauss's theory of regular polygons.
28. Primitive roots of unity.
30. Gauss's lemma.
31. Irreducibility of the cyclotomic equation.
32. Proofs of theorems cited earlier.
39. References.


1. Introduction. The Greek geometricians discovered constructions by ruler and compasses for various elementary
problems. There arose, however, certain famous problems,
such as the duplication of a cube, the trisection of an angle, and
the quadrature of a circle, for which the ancients vainly sought
constructions by ruler and compasses. The impossibility of
these constructions was proved only in recent times. As such
proofs are beyond the scope of elementary geometry, recourse
must be had to analytic methods, in particular to the general
processes and theorems of algebra. To these analytic methods
is due likewise the discovery of the possibility of certain constructions. This is the case, for instance, with the regular
polygon of seventeen sides, the possibility of whose construction by ruler and compasses was not suspected during the
twenty centuries from Euclid to Gauss.
2. Analytic criterion for constructibility. The first step in
our consideration of a proposed construction consists in formulating the problem analytically. In some instances elementary algebra suffices for this formulation. For example,
in the ancient problem of the duplication of the cube, we
are given the length s of a side and seek a number x such
that $x^3 = 2s^3$. But usually it is convenient to employ analytic
geometry; a point is determined by its coordinates $x$ and $y$
with reference to fixed axes, a straight line or circle by an
equation of the first or second degree between the coordinates
of the general point on it. Hence we are concerned with
certain numbers, some being the coordinates of points, others
being the ratios of the coefficients in equations, and others
expressing lengths, areas, or volumes. We shall establish the
following
Criterion. A proposed construction is possible by ruler and
compasses if, and only if, the numbers which define analytically
the desired geometric elements can be derived from those defining
the given elements by a finite number of rational operations and
extractions of real square roots.
Suppose, first, that the construction is possible. The
straight lines and circles drawn in making the construction
are located by means of points either initially given or obtained
as the intersections of two straight lines, a straight line and a
circle, or two circles. The coordinates of the intersection of
two straight lines are rational functions of the coefficients of
the equations of the lines. To determine the coordinates of
the intersection of the straight line $y = mx + b$ with the circle
$$(x - c)^2 + (y - d)^2 = r^2$$
we eliminate $y$ between the equations and obtain a quadratic
equation for $x$. Thus $x$ (and hence $mx + b$ or $y$) involves no
irrationality (in addition to those in $m$, $b$, $c$, $d$, $r$) other than
the square root of a certain known expression. Finally, the
intersections of the preceding circle with a second circle,
$$(x - e)^2 + (y - f)^2 = s^2,$$
are given by the intersections of one of the circles with their
common chord, whose equation is obtained by subtracting
the members of the equation of one circle from those of the
other. This third case has therefore been reduced to the
second. The property stated in the criterion is thus proved.
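The two reductions just described (a line with a circle, and two circles by way of their common chord) may be sketched numerically. The function names and the sample circles are assumptions for illustration, and the chord is written in the form $y = mx + b$, which presupposes $f \neq d$.

```python
import math

def line_circle(m, b, c, d, r):
    """Intersections of y = m x + b with (x - c)^2 + (y - d)^2 = r^2."""
    # Eliminating y leaves the quadratic A x^2 + B x + C = 0.
    A = 1 + m * m
    B = 2 * (m * (b - d) - c)
    C = c * c + (b - d) ** 2 - r * r
    disc = B * B - 4 * A * C
    if disc < 0:
        return []                                  # no real intersection
    root = math.sqrt(disc)                         # the only new irrationality
    xs = [(-B - root) / (2 * A), (-B + root) / (2 * A)]
    return [(x, m * x + b) for x in xs]

def common_chord(c, d, r, e, f, s):
    """Slope and intercept of the common chord, from subtracting the two circle equations."""
    m = -(e - c) / (f - d)                         # assumes f != d
    b = ((e * e + f * f - s * s) - (c * c + d * d - r * r)) / (2 * (f - d))
    return m, b

# Sample circles (assumed): centre (0, 0) radius 5, and centre (7, 7) radius 5.
m, b = common_chord(0, 0, 5, 7, 7, 5)
points = line_circle(m, b, 0, 0, 5)
```

With these sample circles the chord is $x + y = 7$ and the two intersections are $(3, 4)$ and $(4, 3)$.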
Conversely, let there be no irrationalities other than real
square roots. Then the construction is possible by ruler and
compasses. First, a rational function of given quantities is
obtained by the operations, addition, subtraction, multiplication, and division. The construction of the sum or difference of two segments is obvious. The construction, by means
of parallel lines, of a segment whose length $p$ is the product
$a \cdot b$ of the lengths of two given segments is shown in Fig. 1; that for
the quotient $q = a/b$ in Fig. 2.
FIG. 1.                      FIG. 2.
Next, a segment of length $r = \sqrt{m}$
may be constructed, as in Fig. 3, by
drawing a semicircle on a diameter
composed of two segments of lengths
1 and $m$, then a perpendicular to the diameter.
FIG. 3.
3. Graphical solution of a quadratic equation. The roots of
$$x^2 - ax + b = 0$$
are $\tfrac{1}{2}\left(a \pm \sqrt{a^2 - 4b}\right)$. When the roots are real, the only irrationality is a real square root. The criterion for constructibility in sec. 2 is therefore satisfied. Of various methods of
making the construction, the following * is especially simple:
Draw a circle having as diameter the line $BQ$ joining the
points $B = (0, 1)$ and $Q = (a, b)$. The abscissas $ON$ and $OM$
of the points of intersection of this circle with the x-axis are the
roots of the quadratic $x^2 - ax + b = 0$.
* Accredited to Lill by D'Ocagne, Le Calcul Simplifié, Paris, 1905, p. 139.
First Proof. In Fig. 4, $OB = 1$, $OT = a$, $TQ = b$. The
centre of circle is thus $\left(\frac{a}{2}, \frac{b+1}{2}\right)$; its diameter is the hypotenuse of a right triangle with legs $a$ and $b-1$. Hence the
equation of the circle is
$$\left(x - \frac{a}{2}\right)^2 + \left(y - \frac{b+1}{2}\right)^2 = \left(\frac{a}{2}\right)^2 + \left(\frac{b-1}{2}\right)^2.$$
To find its intersection with the x-axis, we set $y = 0$, and get
$$x^2 - ax + b = 0.$$
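The circle construction can be imitated numerically: one computes the centre and radius of the circle on $BQ$ and reads off where it meets the x-axis. The function name `lill_roots` and the sample coefficients $a = 5$, $b = 6$ are assumptions for this sketch, which also assumes that the roots are real.

```python
import math

def lill_roots(a, b):
    """x-intercepts of the circle on the diameter B = (0, 1), Q = (a, b)."""
    cx, cy = a / 2, (b + 1) / 2                    # centre of the circle
    r2 = (a / 2) ** 2 + ((b - 1) / 2) ** 2         # square of the radius
    half_chord = math.sqrt(r2 - cy * cy)           # real when a^2 >= 4b
    return cx - half_chord, cx + half_chord

on, om = lill_roots(5, 6)                          # x^2 - 5x + 6 = 0

# The two abscissas have the sum a and the product b.
assert math.isclose(on + om, 5)
assert math.isclose(on * om, 6)
```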
Second Proof. To give a proof by elementary geometry,
let $OB$ meet the circle again at $C$, and let $TQ$ meet it at $D$. Join
$CQ$ and $BD$. Since $BQ$ is a diameter,
angles $C$ and $D$ are right angles.
Hence $OC = b$, $DT = OB$. Since parallel lines intercept equal arcs, chords
$BN$ and $DM$ are equal. Thus triangles $BON$ and $DTM$ are congruent,
whence $ON = MT$. Thus,
$$OM + ON = OM + MT = OT = a.$$
FIG. 4.
The product of the segments on
one secant equals the product of those on another from the
same point. Hence
$$OM \cdot ON = OC \cdot OB = b \cdot 1 = b.$$
Since $OM$ and $ON$ have the sum $a$ and the product $b$, they
are the roots of $x^2 - ax + b = 0$.
4. Domain of rationality. If a set of numbers has the
property that, when each of the rational operations, addition,
subtraction, multiplication, and division (the divisor not
being zero), is performed on any two numbers of the set, the
result is one of the numbers of the given set, the set of
numbers is said to form a domain of rationality.
For example, the set of all real numbers forms a domain of
rationality since the sum, difference, product, or quotient of
any two real numbers is a real number. Again, the set of all
rational numbers (that is, all positive and negative integers
and fractions) forms a domain of rationality. But the set
of all positive integers is not a domain of rationality, since the
difference of two positive integers is not always a positive
integer. Nor is the set of all positive and negative integers
a domain of rationality, since the quotient of two integers is
not always an integer.
The set of all rational functions, with integral coefficients,
of assigned numbers $a, b, c, \ldots$, forms a domain of rationality;
it is said to be defined by $a, b, c, \ldots$ *
If, in a proposed construction, the given geometric elements
are determined analytically by the numbers $a, b, c, \ldots$, the
domain of rationality defined by $a, b, c, \ldots$ will be called the
domain of the geometric data and designated by $D$.
* See also Monograph V, sec. 9.
5. Functions involving no irrationalities other than square
roots. Let $x$ be a function derived from the numbers $a, b,
c, \ldots$ of the domain $D$ by rational operations and extractions
of square roots, finite in number. The purpose of investigating such functions $x$ is to deduce a condition for constructibility more easily applied than the criterion in sec. 2.
The number of superimposed radical signs in a term of $x$
is called the order of the term; the maximum order of the
various terms of $x$ is denoted by $m$. For example, in
$$x = \sqrt{\sqrt{a} + b} + \sqrt{c + \sqrt{d}} + \sqrt{\sqrt{e}} + \sqrt{f} + g,$$
the first three terms are of order 2, the fourth is of order 1,
the last term $g$ is of order zero; consequently, $m = 2$.
Frequently a function $x$ can be given a modified form
involving fewer radicals. Thus, $\sqrt{9}$ can be replaced by 3,
and $\sqrt{4 - 2\sqrt{3}}$ by $\sqrt{3} - 1$. If $r = \sqrt{3 + \sqrt{5}}$ and $r' = \sqrt{3 - \sqrt{5}}$,
then $rr' = 2$, so that $2r - 7r'$, which involves two radicals of
order 2, can be replaced by $2r - \dfrac{14}{r}$, which involves only one
radical of order 2. Again, if $x$ involves $\sqrt{3}$, $\sqrt{5}$, and $\sqrt{15}$,
we would replace $\sqrt{15}$ by the product $\sqrt{3}\,\sqrt{5}$. In general,
if any of the various radicals of order $n$ is a rational function
of the remaining radicals of order $n$ and the radicals of lower
order, we assume that it is so expressed in terms of the other
radicals. Hence, after all such simplifications are made, no
one of the various radicals of order $m$ is a rational function of
the remaining radicals of order $m$, and the radicals of lower order
occurring separately or underneath other radical signs; likewise,
no radical of order $m-1$ is a rational function of the remaining
radicals of order $m-1$ and the radicals of lower order, etc. The
distinct radicals which occur in this simplified form of $x$ will
therefore be said to be independent.
In case the resulting function $x$ is a sum of several fractions,
we bring them to a common denominator and express $x$ as
the quotient of two integral functions of the radicals. For
example, if $x = \sqrt{5} + 2r - 14/r$, where $r = \sqrt{3 + \sqrt{5}}$, we give $x$
the form $\dfrac{A}{r}$, where $A = r\sqrt{5} + 2(3 + \sqrt{5}) - 14$.
Next, we rationalize the denominator by the following
process: If the denominator contains a radical $\sqrt{k}$ of the
maximum order $m$, it can be given the form $a + b\sqrt{k}$, where
$a$ and $b$ do not involve $\sqrt{k}$. We then multiply the numerator
and denominator by $a - b\sqrt{k}$. Similarly, we rid the denominator
of each radical of order $m$, then rid it of each radical of order
$m-1$, etc. Thus, in the preceding example, $x = \dfrac{A}{r}$, where
$r = \sqrt{3 + \sqrt{5}}$, the first step gives
$$\frac{A(-r)}{r(-r)} = \frac{-Ar}{-3 - \sqrt{5}}.$$
The next step gives
$$\frac{-Ar(-3 + \sqrt{5})}{(-3 - \sqrt{5})(-3 + \sqrt{5})} = \frac{3Ar - Ar\sqrt{5}}{4}.$$
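The two rationalizing steps can be confirmed with floating-point arithmetic. The sketch below is a numerical sanity check only, with variable names chosen for illustration; it does not manipulate the radicals symbolically.

```python
import math

sqrt5 = math.sqrt(5)
r = math.sqrt(3 + sqrt5)                  # r  = sqrt(3 + sqrt 5)
r_conj = math.sqrt(3 - sqrt5)             # r' = sqrt(3 - sqrt 5)

# r r' = 2, so that 7 r' may be replaced by 14 / r.
assert math.isclose(r * r_conj, 2)

x = sqrt5 + 2 * r - 14 / r                # the example of the text
A = r * sqrt5 + 2 * (3 + sqrt5) - 14      # numerator after clearing fractions
assert math.isclose(x, A / r)

# After both rationalizing steps the denominator is the number 4.
x_rationalized = (3 * A * r - A * r * sqrt5) / 4
assert math.isclose(x, x_rationalized)
```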
We have now proved that x can be given a normal form
composed of a sum of terms each a product of radicals, having
as coefficient a number of the domain D, and such that the distinct
radicals are independent.
For example, $5 + \sqrt{5} - \sqrt{7} + 4\sqrt{5}\,\sqrt{7}$ is in normal form.




6. Let $n$ be the number of distinct radicals (including
radicals occurring beneath radical signs) found in the normal
form of $x$. By changing the sign of one or more of these $n$
radicals everywhere that it occurs in $x$, we obtain $2^n$ conjugate
functions $x = x_1, x_2, \ldots, x_{2^n}$.
For example,
$$x_1 = \sqrt{3 + 2\sqrt{5}} + \sqrt{3 - 2\sqrt{5}}$$
is one of $2^3 = 8$ conjugates, of which only 4 are distinct, namely,
$x_1$, $x_2 = \sqrt{3 + 2\sqrt{5}} - \sqrt{3 - 2\sqrt{5}}$, $x_3 = -\sqrt{3 + 2\sqrt{5}} + \sqrt{3 - 2\sqrt{5}}$,
$x_4 = -\sqrt{3 + 2\sqrt{5}} - \sqrt{3 - 2\sqrt{5}}$.
The $2^n$ conjugate quantities $x_1, x_2, \ldots$ are the roots of the
equation
$$F(x) = (x - x_1)(x - x_2) \cdots (x - x_{2^n}) = 0.$$
The expanded form of this product is
$$F(x) = x^{2^n} + k_1 x^{2^n - 1} + \cdots + k_{2^n},$$
where, as shown in the theory of equations,*
$$k_1 = -(x_1 + x_2 + \cdots + x_{2^n}), \qquad k_2 = x_1 x_2 + x_2 x_3 + x_1 x_3 + \cdots$$
For example, $x = 3a + 2\sqrt{b}$ and its conjugate $3a - 2\sqrt{b}$ are
the roots of the equation $x^2 - 6ax + 9a^2 - 4b = 0$ with coefficients
in the domain defined by $a$ and $b$.
Although the roots $x_1, x_2, \ldots$ involve radicals, the symmetrical combinations $k_i$ of the roots will be shown to equal
expressions free of these radicals, and hence rational functions
with integral coefficients of the given numbers $a, b, c, \ldots$
defining the domain $D$. Indeed, suppose that $k_i$ involves one
of the radicals, say $\sqrt{r}$. Then it can be put into the form
$$k_i = p + q\sqrt{r},$$
where neither $p$ nor $q$ involves $\sqrt{r}$. When any one of the $n$
distinct radicals is changed in sign, the roots $x_1, x_2, \ldots$ are
interchanged in pairs and the product $F(x)$ is unaltered.
Since $k_i$ must therefore remain unaltered when $\sqrt{r}$ is changed
into $-\sqrt{r}$, we have
$$p + q\sqrt{r} = p - q\sqrt{r}, \qquad q = 0,$$
so that $k_i = p$ is free of $\sqrt{r}$. Since $k_i$ involves no one of the $n$
radicals, it equals a number of the domain $D$. Hence, the
function $x$ satisfies an equation $F(x) = 0$ of degree $2^n$ with
coefficients in the domain $D$.
* See Monograph No. V, sec. 10.
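The disappearance of the radicals from the coefficients $k_i$ can be observed numerically by multiplying out the product over all sign changes. The sketch below uses the sample quantity $\sqrt{2} + \sqrt{3}$, which is an assumption chosen for illustration (it is not an example from the text), and works with floating-point approximations; the function names are hypothetical.

```python
from itertools import product
import math

def times_linear(coeffs, root):
    """Multiply a polynomial (coefficients, highest power first) by (x - root)."""
    out = coeffs + [0.0]
    for i, c in enumerate(coeffs):
        out[i + 1] -= root * c
    return out

def conjugate_polynomial(radicands):
    """Coefficients of F(x) = product of (x - x_i) over all sign choices
    of x = ±sqrt(r1) ± sqrt(r2) ± ... (independent radicals of order 1 assumed)."""
    coeffs = [1.0]
    for signs in product((1, -1), repeat=len(radicands)):
        conjugate = sum(s * math.sqrt(r) for s, r in zip(signs, radicands))
        coeffs = times_linear(coeffs, conjugate)
    return coeffs

# Four conjugates of sqrt(2) + sqrt(3); F has degree 2^2 = 4.
F = conjugate_polynomial([2, 3])
```

Up to rounding, `F` holds the coefficients of $x^4 - 10x^2 + 1$, all of which lie in the domain of rational numbers.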
7. The quantity $x_1$ satisfies various equations with coefficients in the domain $D$; for example, $M(x)\,F(x) = 0$, where
$M(x)$ is any integral function with coefficients in $D$. We
next prove an important property of all such equations.
Theorem. If one of the conjugate quantities $x_1, x_2, \ldots, x_{2^n}$
satisfies any equation $f(x) = 0$ with coefficients in the domain $D$,
then all the quantities $x_i$ satisfy this equation.
Let $x_1 = p + q\sqrt{r}$, where $\sqrt{r}$ is a radical of the maximum
order $m$, while $p$ and $q$ do not involve $\sqrt{r}$ but may contain
some of the remaining radicals of order $m$ and radicals of lower
order. By changing the sign of $\sqrt{r}$, we obtain another $x_i$, say
$$x_2 = p - q\sqrt{r}.$$
Now $f(x_1)$ may be given the form $A + B\sqrt{r}$, where $A$ and
$B$ do not involve $\sqrt{r}$. By hypothesis, $f(x_1) = 0$; that is,
$A + B\sqrt{r} = 0$. If $B \neq 0$, we would have $\sqrt{r} = -A/B$, contrary to the assumption (sec. 5) on the independence of the
radicals. Hence $B = 0$ and therefore also $A = 0$. Since
$$f(x_2) = A - B\sqrt{r},$$
we have $f(x_2) = 0$. Thus $x_2$ is a root of $f(x) = 0$.
The proof that any $x_i$ is a root of $f(x) = 0$ is based upon
the same principles. To simplify the formulas, let $x_1$ contain
just two radicals $\sqrt{r}$ and $\sqrt{r'}$ of the maximum order $m$. Then
(end of sec. 5),
$$f(x_1) = A + B\sqrt{r} + C\sqrt{r'} + E\sqrt{r}\,\sqrt{r'},$$
where $A$, $B$, $C$, $E$ involve only radicals of orders $< m$. In
view of the independence of the radicals (sec. 5), we see as
above that $A$, $B$, $C$, $E$ must each be zero. Let $A$ contain
just three radicals $\sqrt{s}$, $\sqrt{s'}$, $\sqrt{s''}$ of order $m-1$. Then
$$A = g + h\sqrt{s} + i\sqrt{s'} + j\sqrt{s''} + k\sqrt{s}\,\sqrt{s'} + \cdots + q\sqrt{s}\,\sqrt{s'}\,\sqrt{s''}.$$
As before, $A = 0$ requires that $g, h, \ldots, q$ shall be zero. Likewise, the coefficients of $\sqrt{s}$, $\sqrt{s'}$, $\ldots$, in $B$, $C$, $E$ must be zero.
We may proceed similarly with the radicals of orders $m-2, \ldots, 1$. Hence in the expression (end of sec. 5)
$$f(x_1) = d + e\sqrt{r} + f\sqrt{r'} + g\sqrt{s} + \cdots + p\sqrt{r}\,\sqrt{r'} + q\sqrt{r}\,\sqrt{s} + \cdots + t\sqrt{r}\,\sqrt{r'}\,\sqrt{s} + \cdots$$
of $f(x_1)$ as a sum of terms each a product of radicals with
coefficients in the domain $D$, each coefficient $d, e, f, \ldots$ is
zero. Now $x_i$ can be derived from $x_1$ by changing the signs
of certain of the radicals $\sqrt{r}$, $\sqrt{r'}$, $\sqrt{s}$, $\ldots$ Thus $f(x_i)$ is
derived from the preceding expression for $f(x_1)$ by the same
changes. Since $d, e, f, \ldots$ are zero, it follows that $f(x_i)$ is
zero. Hence $x_i$ is a root of $f(x) = 0$.
8. It was shown in sec. 6 that $x_1$ satisfies an equation
$F(x) = 0$ of degree $2^n$ with coefficients in the domain $D$. Of
all the equations, with coefficients in $D$, which are satisfied by
$x_1$, let $\phi(x) = 0$ be one of the lowest degree $l$. The coefficient
of $x^l$ may be assumed to be unity. There cannot be two such
equations of degree $l$, since by subtraction we would obtain an
equation of degree $< l$, with its coefficients in $D$ and having
the root $x_1$.
We shall prove that the function $F(x)$ is an exact power
of this unique function $\phi(x)$. Divide $F(x)$ by $\phi(x)$ and let
the quotient be $F_1(x)$ and the remainder be $r(x)$ of degree
$< l$, where $F_1(x)$ and $r(x)$ are integral functions with coefficients in $D$. Then
$$F(x) = \phi(x)\,F_1(x) + r(x).$$
Let $x = x_1$; since $F(x_1) = 0$ and $\phi(x_1) = 0$, we have $r(x_1) = 0$.
If $r(x)$ is not identically zero, $r(x) = 0$ is an equation, with
coefficients in $D$, having the root $x_1$ and of degree $< l$, contrary to the hypothesis that $l$ is the lowest degree of such
equations. Hence $r(x)$ is identically zero, so that
$$F(x) = \phi(x)\,F_1(x).$$
If $F_1(x)$ reduces to a constant, necessarily unity, $F(x)$ is
the first power of $\phi(x)$ and our theorem is proved. In the
contrary case, $F_1(x)$ is a factor of degree $\ge 1$ of $F(x)$ and
$F_1(x) = 0$ has as a root at least one of the roots $x_i$ of $F(x) = 0$,
and hence (sec. 7) has every one of the $x_i$ as roots. In
particular, $F_1(x) = 0$ has the root $x_1$, so that by the above
argument,
$$F_1(x) = \phi(x)\,F_2(x),$$
where $F_2(x)$ is an integral function with coefficients in $D$.
If $F_2(x)$ reduces to a constant, necessarily unity, $F(x)$ is the
square of $\phi(x)$ and the theorem is proved. In the contrary
case, the same argument shows that
$$F_2(x) = \phi(x)\,F_3(x).$$
Proceeding in this way, we find ultimately that
$$F(x) = [\phi(x)]^k.$$
The degree of the second member is $lk$ and the degree of
$F(x)$ is $2^n$. Thus $l$ is a divisor of $2^n$. Hence the degree of
$\phi(x)$ is a power of 2. We have therefore proved the following
theorem:
The unique equation of lowest degree with coefficients in the
domain D which is satisfied by a function x1 derived from numbers
of D by a finite number of rational operations and extractions of
square roots is of degree a power of 2.
9. Reducible and irreducible functions. An integral function
$f(x)$ with coefficients in a given domain $D$ is called reducible,
or irreducible in the domain, according as it can or cannot
be factored into two integral functions, each of degree $\ge 1$,
with coefficients in $D$.
For example, $x^2 - 4$ is reducible in any domain; $x^2 - 3$ is
irreducible in the domain of all rational numbers, but is
reducible in the domain of all real numbers; $x^2 + 4$ is irreducible
in the latter domain, but is reducible in the domain of all
real and complex numbers, since
$$x^2 + 4 = (x + 2i)(x - 2i), \qquad i = \sqrt{-1}.$$
10. The function $\phi(x)$, defined in sec. 8, is irreducible in
the domain $D$. For, if $\phi(x)$ be the product of two integral
functions, each of degree $\ge 1$, with coefficients in $D$, one of
the factors equated to zero would give an equation, with
coefficients in $D$, which is satisfied by $x_1$ and is of lower degree
than $\phi(x)$. But this would contradict the hypothesis concerning $\phi(x)$.
An equation $G(x) = 0$ is said to be irreducible in $D$, if the
function $G(x)$ is irreducible in $D$. The equation $\phi(x) = 0$
is the only equation irreducible in $D$, which has the root $x_1$.
For, if $G(x) = 0$ is irreducible in $D$ and has the root $x_1$, the
argument used in sec. 8 shows that $G(x)$ has the factor $\phi(x)$,
so that $G(x)$ must be the product of $\phi(x)$ and a constant. The
theorem in sec. 8 is therefore equivalent to the following:
The unique equation, irreducible in the domain D, which is
satisfied by a function xl derived from numbers of D by a finite
number of rational operations and extractions of square roots
is of degree a power of 2.
11. From the last theorem and the criterion in sec. 2 we
deduce the
Fundamental theorem.    A  proposed construction is not
possible by ruler and compasses if any one of the numbers which
define analytically the required geometric elements satisfies an
equation irreducible in the domain of the geometric data whose
degree is not a power of 2.
12. The preceding result enables us to treat the three
famous problems mentioned in the Introduction.
Duplication of the cube. The problem is to construct a
cube whose volume is twice the volume of a given cube. Taking
as the unit of length an edge of the given cube, we see that an
edge $x$ of the required cube is a root of $x^3 = 2$. The equation
$x^3 - 2 = 0$ is irreducible in the domain of all rational numbers.
For, if reducible, it would have a linear factor and hence a
rational root. But if $a/b$ is a root, where $a$ and $b$ are integers
with no common divisor except unity, then $a^3 = 2b^3$. Hence
$a^3$, and therefore $a$, is even, $a = 2c$. Then $4c^3 = b^3$, so that $b$
is even. Thus $a$ and $b$ are both even and hence have the common factor 2, contrary to hypothesis. Since the degree of
the irreducible equation $x^3 = 2$ is not a power of 2, it follows
from sec. 11 that the duplication of a cube is not possible
by ruler and compasses.
Trisection of an angle. To prove the impossibility of the
trisection of an arbitrary* angle by ruler and compasses, it is
sufficient to prove it for a particular angle, for example, for
120°. The construction of the angle $\tfrac{1}{3}(120°) = 40°$ is equivalent to the construction of a right-angled triangle whose
hypotenuse is unity and base is $\cos 40°$. In the trigonometric identity $\cos 3x = 4\cos^3 x - 3\cos x$, take $x = 40°$. Since
$\cos 120° = -\tfrac{1}{2}$, we get
$$4\cos^3 40° - 3\cos 40° + \tfrac{1}{2} = 0.$$
Multiply by 2 and write $y = 2\cos 40°$. Then
$$y^3 - 3y + 1 = 0.$$
This equation is irreducible in the domain of rational numbers.
For, if reducible, it would have a linear factor and hence a root
$a/b$, where $a$ and $b$ are integers with no common factor except
unity and $b$ is positive. In the cubic equation, set $y = a/b$
and multiply by $b^2$. We get
$$\frac{a^3}{b} - 3ab + b^2 = 0,$$
so that $a^3/b$ is an integer. If $b > 1$, $a$ and $b$ would have a
common factor $> 1$. Hence $b = 1$. The integral root $y = a$
makes $a^3 - 3a$ an integral multiple of $a$, so that the constant
term 1 must be a multiple of $a$. Hence $a = \pm 1$. By trial
neither $+1$ nor $-1$ is a root of the cubic. It now follows
from sec. 11 that the trisection of 120° is not possible by ruler
and compasses.
* Certain special angles like 360°, 180°, 90° can be trisected, since angles
120°, 60°, 30° can be constructed by ruler and compasses.
Another proof follows from the fact (sec. 27) that a regular
polygon of 9 sides cannot be inscribed in a circle by ruler and
compasses.
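Both irreducibility arguments above come down to showing that a cubic with integral coefficients has no rational root. The following is a minimal sketch of that check, assuming the usual rational-root test (any root a/b in lowest terms has a dividing the constant term and b dividing the leading coefficient); the function name `rational_roots` is illustrative and not part of the text.

```python
from fractions import Fraction

def rational_roots(coeffs):
    """Rational roots of a polynomial with integer coefficients.

    coeffs lists the coefficients from the highest power down, e.g.
    [1, 0, -3, 1] for y^3 - 3y + 1. The constant term is assumed non-zero.
    """
    lead, const = coeffs[0], coeffs[-1]

    def divisors(n):
        n = abs(n)
        return [d for d in range(1, n + 1) if n % d == 0]

    def value(x):
        # Horner evaluation in exact rational arithmetic.
        acc = Fraction(0)
        for c in coeffs:
            acc = acc * x + c
        return acc

    candidates = {Fraction(sign * a, b)
                  for a in divisors(const)
                  for b in divisors(lead)
                  for sign in (1, -1)}
    return sorted(x for x in candidates if value(x) == 0)

print(rational_roots([1, 0, 0, -2]))   # x^3 - 2
print(rational_roots([1, 0, -3, 1]))   # y^3 - 3y + 1
```

An empty list for a cubic means that it has no linear factor with rational coefficients, which is the step the two proofs rely on.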
Quadrature of the circle. The problem is to construct by
ruler and compasses a square whose area shall equal the area
$\pi R^2$ of a circle of given radius R. The construction is impossible. For, if it were possible, and R is taken as the unit of
length, the number $\pi$ of the units of area would satisfy an
algebraic equation with rational coefficients (sec. 2, sec. 6).
But this is not the case.*
13. Connection between regular polygons and roots of unity.
Consider a regular polygon of n sides (an n-gon) inscribed in a
circle of unit radius. We employ a rectangular system of
coordinates with the origin at the centre of the circle and
the x-axis passing through a vertex of the polygon. This
vertex is therefore (1, 0). For n=4, the remaining vertices
of the square have their coordinates marked in Fig. 5. For
any n, the remaining vertices, taken in counter-clockwise order,
will be designated $(x_1, y_1), (x_2, y_2), \ldots, (x_{n-1}, y_{n-1})$, as in
Fig. 6.
[FIG. 5: the inscribed square with vertices (1, 0), (0, 1), (-1, 0), (0, -1). FIG. 6: the inscribed regular n-gon.]
Since a side of the n-gon subtends at the centre an
angle whose magnitude is 360/n degrees, or $2\pi/n$ radians,
$x_1=\cos\frac{2\pi}{n},\ y_1=\sin\frac{2\pi}{n},\quad x_2=\cos\frac{4\pi}{n},\ y_2=\sin\frac{4\pi}{n},\ \ldots$


* See Monograph No. IX.

Each point (x, y) of the plane uniquely determines a complex number x+iy, where $i=\sqrt{-1}$. Conversely, a complex
number x+iy uniquely determines a point (x, y). With the
vertices of our square are thus associated the four distinct
complex numbers
$1,\quad i,\quad -1,\quad -i.$   (1)
With the vertices of the n-gon are associated the distinct
complex numbers
$1,\quad r_1=\cos\frac{2\pi}{n}+i\sin\frac{2\pi}{n},\quad r_2=\cos\frac{4\pi}{n}+i\sin\frac{4\pi}{n},\quad \ldots,$
$r_{n-1}=\cos\frac{2(n-1)\pi}{n}+i\sin\frac{2(n-1)\pi}{n}.$   (2)
Since $i^2=-1$, the four numbers (1) are roots of
$x^4=1,$   (3)
and hence are called fourth roots of unity.
Any one of the numbers (2) is a root of the equation
$x^n=1,$   (4)
and is called an nth root of unity. Indeed, from
$r_k=\cos\frac{2k\pi}{n}+i\sin\frac{2k\pi}{n},$   (2')
we find by applying formula (5) below that
$r_k^n=\cos 2k\pi+i\sin 2k\pi=1+i\cdot 0=1.$
14. De Moivre's theorem. For any positive integer n, we
have
$(\cos A+i\sin A)^n=\cos nA+i\sin nA.$   (5)
We first prove the formula
$(\cos A+i\sin A)(\cos B+i\sin B)=\cos(A+B)+i\sin(A+B).$   (6)
The product of the two numbers is a+ib, where
$a=\cos A\cos B-\sin A\sin B=\cos(A+B),$
$b=\cos A\sin B+\sin A\cos B=\sin(A+B).$
If we take B=A in (6), we obtain formula (5) for the case
n=2; (5) is evidently true for the case n=1. In general, the
proof of (5) for any exponent n is made by mathematical
induction. Assume that it is true for the case n=m, that is,
$(\cos A+i\sin A)^m=\cos mA+i\sin mA.$
Multiply each member by $\cos A+i\sin A$. For the product
on the right we employ (6) with B=mA. Hence we get
$(\cos A+i\sin A)^{m+1}=\cos(m+1)A+i\sin(m+1)A,$
so that (5) holds also for n=m+1. The induction is thus
complete.
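Formula (5) can also be checked numerically. The sketch below is only an illustration in floating-point arithmetic, not a proof; the function name `de_moivre_gap` and the sample angle 0.7 are assumptions made for the example.

```python
import math

def de_moivre_gap(angle, n):
    """Absolute difference between the two members of formula (5)."""
    left = complex(math.cos(angle), math.sin(angle)) ** n
    right = complex(math.cos(n * angle), math.sin(n * angle))
    return abs(left - right)

# The gap should be of the order of rounding error for each exponent.
for n in range(1, 8):
    print(n, de_moivre_gap(0.7, n))
```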
15. It follows from De Moivre's theorem that (2') is the
kth power of
$r=\cos\frac{2\pi}{n}+i\sin\frac{2\pi}{n}.$   (7)
Hence the numbers (2) may be expressed in the form
$1=r^n,\quad r_1=r,\quad r_2=r^2,\quad \ldots,\quad r_k=r^k,\quad \ldots,\quad r_{n-1}=r^{n-1}.$
Since these n numbers were seen to be distinct and to be
roots of (4), they give all the roots of (4). Indeed, an algebraic
equation of degree n cannot have more than n distinct roots.*
With the successive vertices of the regular n-gon inscribed as in
Fig. 6 in a circle of unit radius are associated the n distinct
numbers
$1,\quad r,\quad r^2,\quad \ldots,\quad r^k,\quad \ldots,\quad r^{n-1},$   (8)
which give all the nth roots of unity. Here r is defined by (7).
For n=4, these numbers are 1, i, $i^2=-1$, $i^3=-i$.
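The list (8) can be generated directly as the successive powers of r. The following is a minimal numerical sketch, assuming r as defined by (7); the name `roots_of_unity` is chosen for illustration.

```python
import cmath
import math

def roots_of_unity(n):
    """The n numbers 1, r, r^2, ..., r^(n-1) of (8), with r as in (7)."""
    r = cmath.exp(2j * math.pi / n)   # cos(2*pi/n) + i*sin(2*pi/n)
    return [r ** k for k in range(n)]

# For n = 4 the values agree, up to rounding, with 1, i, -1, -i.
for z in roots_of_unity(4):
    print(z)
```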
16. The complex number $C=\cos A+i\sin A$ has the reciprocal
$C^{-1}=\cos A-i\sin A$, since the product of the two is
$\cos^2 A+\sin^2 A=1$. Hence their sum $C+C^{-1}$ is the real
number $2\cos A$.
The inscription of a regular n-gon by ruler and compasses
is equivalent to the construction of the angle $2\pi/n$, and hence
is equivalent to the construction of a right-angled triangle
with hypotenuse unity and base $\cos 2\pi/n$. Instead of determining
the complex root of unity r, defined by (7), it is
therefore sufficient to determine
$r+r^{-1}=2\cos\frac{2\pi}{n}.$
Since $r^n=1$, we have $r^{-1}=r^{n-1}$. We shall therefore seek
certain real-valued combinations, such as $r+r^{n-1}$, of the roots
(8).

* See Monograph No. V, sec. 10.

17. Regular pentagon and decagon. For n=5, we wish
to determine
$\eta_0=r+r^4=2\cos\frac{2\pi}{5},$
where $r=\cos\frac{2\pi}{5}+i\sin\frac{2\pi}{5}$.
Since $r^5=1$ and $r\neq 1$, we have
$\frac{r^5-1}{r-1}=r^4+r^3+r^2+r+1=0.$
Hence if $\eta_0=r+r^4$ is found, so is also $\eta_1=r^2+r^3$, since
$\eta_0+\eta_1=-1.$
Two numbers can be determined when their sum and product
are known. We therefore evaluate $\eta_0\eta_1$. By actual multiplication,
$(r+r^4)(r^2+r^3)=r^3+r^4+r^6+r^7=r^3+r^4+r+r^2=-1,$
since $r^5=1$. Hence $\eta_0$ and $\eta_1$ are the roots $\frac{1}{2}(-1\pm\sqrt{5})$ of
$x^2-(\eta_0+\eta_1)x+\eta_0\eta_1=x^2+x-1=0.$
Since the cosine of an acute angle is positive, we have
$\eta_0=2\cos\frac{2\pi}{5}=\frac{1}{2}(-1+\sqrt{5}),\quad \eta_1=\frac{1}{2}(-1-\sqrt{5}).$
From the value of $\eta_0$, we may construct the angle $2\pi/5$.
Let AOA' and BOB' be perpendicular diameters in a circle of
radius R, and M the middle point of OA' (Fig. 7). Then
$BM^2=R^2+(\frac{1}{2}R)^2,\quad BM=\frac{1}{2}R\sqrt{5}.$
Let the circle with centre M and radius BM cut OA at C.
Let N be the middle point of OC. Then
$OC=\frac{1}{2}R(\sqrt{5}-1)=R\eta_0,\quad ON=R\cos\frac{2\pi}{5}.$
Draw DN parallel to OB. Then angle DON equals $2\pi/5$.
Hence AD is the side $s_5$ of the regular inscribed pentagon.
We may omit the construction of DN, DO, DA, and prove
that $CB=s_5$, $CO=s_{10}$, the side of a regular decagon. The
latter follows from
$s_{10}=2R\sin 18^\circ=2R\cos 72^\circ=2R\cos\frac{2\pi}{5}=R\eta_0=OC.$
[FIG. 7. FIG. 8. Both figures show the circle with the diameters AOA', BOB'; Fig. 8 marks the segments $s_{10}$, $s_6$, $s_5$.]
Next, $\sin 18^\circ=\cos 72^\circ=1-2\sin^2 36^\circ$. Multiplying by $2R^2$
and replacing $2R\sin 36^\circ$ by its value $s_5$, and $2R\sin 18^\circ$ by $s_{10}$,
we get
$Rs_{10}=2R^2-s_5^2.$
But $\eta_0$ is a root of $x^2+x-1=0$, and $R\eta_0=s_{10}$. Hence
$s_{10}^2+Rs_{10}-R^2=0.$
It follows that
$s_{10}^2+R^2=s_5^2.$
Since $OC=s_{10}$, $OB=R$, the hypotenuse BC must equal $s_5$.
We have now established the following elegant construction
of the regular pentagon and decagon:
If AOA' and BOB' are perpendicular diameters, and M is
the middle point of the radius OA', a circle with centre M and
radius MB will cut OA at a point C such that OC and BC are
the sides $s_{10}$ and $s_5$ of the inscribed regular decagon and pentagon.
In particular, Fig. 8 exhibits the above relation $s_{10}^2+s_6^2=s_5^2$
between the sides of the inscribed regular decagon, hexagon,
and pentagon.
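The construction just stated can be followed in coordinates. The sketch below assumes a frame that is not fixed by the text (O at the origin, A on the positive x-axis, B on the positive y-axis) and compares OC and BC with the trigonometric values of $s_{10}$ and $s_5$; the function name is illustrative.

```python
import math

def pentagon_decagon_sides(R=1.0):
    """Follow the construction: return (OC, BC) for a circle of radius R."""
    # Assumed coordinates: O = (0, 0), A = (R, 0), A' = (-R, 0), B = (0, R).
    mx = -R / 2                      # M, the middle point of OA'
    bm = math.hypot(mx, R)           # BM = (R/2) * sqrt(5)
    cx = mx + bm                     # C, where the circle about M meets OA
    oc = cx                          # OC, claimed to be s10
    bc = math.hypot(cx, R)           # BC, claimed to be s5
    return oc, bc

R = 2.0
oc, bc = pentagon_decagon_sides(R)
s10 = 2 * R * math.sin(math.radians(18))   # side of the inscribed decagon
s5 = 2 * R * math.sin(math.radians(36))    # side of the inscribed pentagon
print(oc - s10, bc - s5)                   # both differences should be near 0
```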
18. If p is a prime number, a regular polygon of p sides
can be inscribed by ruler and compasses not merely in the
well-known cases p=3 and p=5, but also when p=17 and
when p has certain larger values. This important discovery
was made by Gauss. For the general theorem see sec. 27.
In the treatment of the case p=5 (sec. 17), we made use
of the combinations r+r4 and r2+r3, called periods, of the
complex fifth roots of unity. If the latter be written in the
order
$r,\quad r^2,\quad r^4,\quad r^3=r^8,$
so that each is the square of the preceding, we note that the
periods are obtained by taking alternate terms of this series.
For another value of p it may not be possible to arrange
the complex pth roots of unity (sec. 15),
$r,\quad r^2,\quad r^3,\quad \ldots,\quad r^{p-1},$   (9)
in such an order that each term is the square of the preceding.
In fact, this is not possible when p=7, since the fourth term
$r^8$ is now identical with the first term r. But when p=7 the
roots can be arranged so that each is the cube of the preceding,
namely,
$r,\quad r^3,\quad r^2,\quad r^6,\quad r^4,\quad r^5.$
It is shown in Monograph No. VII, sec. 46, that for any
prime p there exists an integer g (called a primitive root of p),
such that the remainders obtained upon dividing
$1,\quad g,\quad g^2,\quad \ldots,\quad g^{p-2}$
by p are in some order 1, 2,..., p-1. Hence the roots
$r,\quad r^g,\quad r^{g^2},\quad \ldots,\quad r^{g^{p-2}}$   (10)
are identical in some order with the roots (9).
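The order (10) is easy to tabulate for small primes. The following sketch finds the smallest primitive root by direct search and lists the remainders of the successive powers of g; the function names are illustrative and the search is only meant for small p.

```python
def primitive_root(p):
    """Smallest primitive root of the prime p (p is assumed prime, p > 2)."""
    for g in range(2, p):
        powers = {pow(g, k, p) for k in range(p - 1)}
        if len(powers) == p - 1:      # the powers run through 1, ..., p-1
            return g
    raise ValueError("no primitive root found; is p prime?")

def exponent_order(p, g):
    """Remainders of 1, g, g^2, ..., g^(p-2) on division by p."""
    return [pow(g, k, p) for k in range(p - 1)]

print(primitive_root(7), exponent_order(7, 3))
print(primitive_root(17), exponent_order(17, 3))
```

For p=7 and g=3 the exponents come out in the order of the arrangement by cubes given above.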


19. Regular polygon of 17 sides. For p=17, we may
take g=3, since the remainders obtained upon dividing the
successive powers $1, 3, 3^2, \ldots, 3^{15}$ of 3 by 17 are
1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6,
which form a permutation of 1, 2,..., 16. Taking alternate terms, we form the periods
$\eta_0=r+r^9+r^{13}+r^{15}+r^{16}+r^8+r^4+r^2,$
$\eta_1=r^3+r^{10}+r^5+r^{11}+r^{14}+r^7+r^{12}+r^6.$
Since $r^{17}=1$ and $r\neq 1$, we have
$\frac{r^{17}-1}{r-1}=r^{16}+r^{15}+\ldots+r+1=0.$
Hence $\eta_0+\eta_1=-1$. In the 64 terms of the product $\eta_0\eta_1$
we reduce exponents by means of $r^{17}=1$ and find that each
root $r, r^2, \ldots, r^{16}$ occurs exactly four times. Hence
$\eta_0\eta_1=4(r+r^2+\ldots+r^{16})=-4.$
But $\eta_0$, $\eta_1$ are the roots of $x^2-(\eta_0+\eta_1)x+\eta_0\eta_1=0$. Hence
$\eta_0,\ \eta_1$ satisfy $x^2+x-4=0.$   (11)
By taking alternate terms in $\eta_0$ we form two periods $\eta_0'$
and $\eta_2'$; likewise, two periods $\eta_1'$, $\eta_3'$ from $\eta_1$:
$\eta_0'=r+r^{13}+r^{16}+r^4,\quad \eta_1'=r^3+r^5+r^{14}+r^{12},$
$\eta_2'=r^9+r^{15}+r^8+r^2,\quad \eta_3'=r^{10}+r^{11}+r^7+r^6.$
We readily verify that $\eta_0'\eta_2'=-1=\eta_1'\eta_3'$. Hence
$\eta_0',\ \eta_2'$ satisfy $x^2-\eta_0x-1=0,$   (12)
$\eta_1',\ \eta_3'$ satisfy $x^2-\eta_1x-1=0.$   (13)
In view of sec. 16, it suffices to determine $r+r^{16}$. The
periods
$\eta_0''=r+r^{16},\quad \eta_4''=r^{13}+r^4,$
have the sum $\eta_0'$ and the product $\eta_1'$. Hence
$\eta_0'',\ \eta_4''$ satisfy $x^2-\eta_0'x+\eta_1'=0.$   (14)
To decide which root of (11) is $\eta_0$ and which is $\eta_1$, and the
similar question in (12)-(14), we employ the formulas
$r=\cos\frac{2\pi}{17}+i\sin\frac{2\pi}{17},\quad r^k=\cos\frac{2k\pi}{17}+i\sin\frac{2k\pi}{17}.$
Hence, as in sec. 16,
$\eta_0''=2\cos\frac{2\pi}{17},\quad \eta_4''=2\cos\frac{8\pi}{17},\quad \eta_0''>\eta_4''>0.$
Similarly, employing $\cos\frac{10\pi}{17}=-\cos\frac{7\pi}{17}$, we get
$\eta_0'=2\cos\frac{2\pi}{17}+2\cos\frac{8\pi}{17},\quad \eta_1'=2\cos\frac{6\pi}{17}-2\cos\frac{7\pi}{17}.$
Hence $\eta_0'$ and $\eta_1'$ are positive. Further,
$\eta_1=2\cos\frac{6\pi}{17}-2\cos\frac{5\pi}{17}-2\cos\frac{7\pi}{17}-2\cos\frac{3\pi}{17}$
is negative, since the first cosine is less than the second. But
we had $\eta_0\eta_1=-4$. Hence $\eta_0$ is positive. Hence (11)-(14) give
$\eta_0=\frac{1}{2}(\sqrt{17}-1),\quad \eta_1=\frac{1}{2}(-\sqrt{17}-1),$
$\eta_0'=\frac{1}{2}\eta_0+\sqrt{1+\frac{1}{4}\eta_0^2},\quad \eta_1'=\frac{1}{2}\eta_1+\sqrt{1+\frac{1}{4}\eta_1^2},$
$2\cos\frac{2\pi}{17}=\eta_0''=\frac{1}{2}\eta_0'+\sqrt{\frac{1}{4}\eta_0'^2-\eta_1'}.$
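The chain of quadratics (11)-(14) can be evaluated numerically. The sketch below chooses the root of each quadratic according to the sign discussion of sec. 19 and compares the result with the library value of the cosine; the function and variable names are assumptions made for the example.

```python
import math

def cos_2pi_over_17():
    """Evaluate cos(2*pi/17) through the quadratics (11)-(14)."""
    # (11): x^2 + x - 4 = 0, eta0 being the positive root.
    eta0 = (-1 + math.sqrt(17)) / 2
    eta1 = (-1 - math.sqrt(17)) / 2
    # (12) and (13): x^2 - eta*x - 1 = 0, taking the positive root each time.
    eta0p = eta0 / 2 + math.sqrt(1 + eta0 ** 2 / 4)
    eta1p = eta1 / 2 + math.sqrt(1 + eta1 ** 2 / 4)
    # (14): x^2 - eta0'*x + eta1' = 0, taking the larger root r + r^16.
    eta0pp = eta0p / 2 + math.sqrt(eta0p ** 2 / 4 - eta1p)
    return eta0pp / 2

print(cos_2pi_over_17(), math.cos(2 * math.pi / 17))
```

Only square roots of real, positive quantities occur, which is the property that makes the ruler-and-compass construction of sec. 20 possible.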
20. Construction of the regular polygon of 17 sides. In a
circle of radius unity construct two perpendicular diameters
AB, CD, and at A, D draw tangents which intersect at S.
Determine the point E so that $AE=\frac{1}{4}AS$ (for example, by
two bisections). Then
$AE=\frac{1}{4},\quad OE=\sqrt{AO^2+AE^2}=\frac{1}{4}\sqrt{17}.$
Let the circle with centre E and radius OE cut AS at F
and F'. Then
$AF=EF-EA=OE-\frac{1}{4}=\frac{1}{2}\eta_0,$
$AF'=EF'+EA=OE+\frac{1}{4}=-\frac{1}{2}\eta_1,$
$OF=\sqrt{OA^2+AF^2}=\sqrt{1+\frac{1}{4}\eta_0^2},\quad OF'=\sqrt{1+\frac{1}{4}\eta_1^2}.$
Let the circle with centre F and radius FO cut AS at H;
that with centre F' and radius F'O cut AS at H'. Then
$AH=AF+FH=AF+OF=\frac{1}{2}\eta_0+\sqrt{1+\frac{1}{4}\eta_0^2}=\eta_0',$
$AH'=F'H'-F'A=OF'-AF'=\eta_1'.$
It remains to construct the roots of (14). This may be
done by sec. 3. Draw HTQ parallel to AO and intersecting
OC produced at T. Make TQ=AH'. Draw a circle having
as diameter the line BQ joining B=(0, 1) with $Q=(\eta_0', \eta_1')$.
The abscissas ON and OM of the intersections of this circle
with the x-axis OT are the roots of (14). Hence the larger
root $\eta_0''$ is
$OM=2\cos\frac{2\pi}{17}.$
[FIG. 9: the construction of the regular polygon of 17 sides.]
The perpendicular bisector LP of OM cuts the unit circle
at P. Then $\cos LOP=OL=\cos\frac{2\pi}{17}$, $LOP=\frac{2\pi}{17}$. Hence the
chord CP is a side of the inscribed regular 17-gon.
For an elegant construction by von Staudt which employs
only straight lines and the given circle, see Bachmann, Kreistheilung, pp. 69-75. The figure cannot, however, be conveniently drawn on a single page of the size of the present book.
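The steps of sec. 20 can be traced in coordinates as a numerical check. The sketch below assumes a frame inferred from the statements B=(0, 1) and "the x-axis OT": O at the origin, C=(1, 0), A=(0, -1), D=(-1, 0), so that the tangents at A and D meet at S=(-1, -1). These coordinates, and the function name, are assumptions and not part of the text.

```python
import math

def seventeen_gon_vertex():
    """Trace the construction in coordinates; return the point P."""
    # Assumed frame: O = (0, 0), C = (1, 0), B = (0, 1), A = (0, -1),
    # D = (-1, 0), so the tangents at A and D meet at S = (-1, -1).
    # Points on the tangent line AS are recorded by their abscissas.
    e = -0.25                              # E, with AE = AS / 4
    oe = math.hypot(e, 1.0)                # OE = sqrt(17) / 4
    f, f2 = e + oe, e - oe                 # F and F' on the line AS
    h = f + math.hypot(f, 1.0)             # H:  AH  = AF + OF
    h2 = f2 + math.hypot(f2, 1.0)          # H': AH' = OF' - AF'
    # The circle on the diameter BQ, with B = (0, 1) and Q = (h, h2), meets
    # the x-axis where x^2 - h*x + h2 = 0; OM is the larger root.
    om = h / 2 + math.sqrt(h * h / 4 - h2)
    ol = om / 2                            # L, the middle point of OM
    return ol, math.sqrt(1 - ol * ol)      # P on the unit circle above L

px, py = seventeen_gon_vertex()
print(math.atan2(py, px), 2 * math.pi / 17)   # the angle LOP and 2*pi/17
```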
21. Having treated in detail the special cases p=5 and
p=17, we proceed to develop Gauss's theory for any prime p.
Let p-1=ef be any factorization of p-1 into two positive
integers. We separate the p-1 roots (10) into e sets each
of f roots. For the first set, we take the first root r, the eth
root $r^{g^e}$ following it, then the eth root following the latter,
etc. For the second set, we take the second root $r^g$, the eth
root following it, etc. The exponents in the various sets
are therefore
$1,\quad g^e,\quad g^{2e},\quad \ldots,\quad g^{(f-1)e},$
$g,\quad g^{e+1},\quad g^{2e+1},\quad \ldots,\quad g^{(f-1)e+1},$
. . . . . . . . . . .   (15)
$g^{e-1},\quad g^{2e-1},\quad g^{3e-1},\quad \ldots,\quad g^{fe-1}.$
The sum of the roots in any set is called a period. Hence
the periods are
$\eta_k=r^{g^k}+r^{g^{e+k}}+r^{g^{2e+k}}+\ldots+r^{g^{(f-1)e+k}}\quad (k=0, 1, \ldots, e-1).$   (16)
Let f=e'f' be any factorization of f into two positive
integers. Then p-1 is the product of ee' and f'. As above
we have ee' periods, each the sum of f' roots,
$\eta_j'=r^{g^j}+r^{g^{ee'+j}}+r^{g^{2ee'+j}}+\ldots+r^{g^{(f'-1)ee'+j}}\quad (j=0, 1, \ldots, ee'-1).$   (17)
Each period (16) is the sum of certain e' periods (17),
$\eta_k=\eta_k'+\eta_{e+k}'+\eta_{2e+k}'+\ldots+\eta_{(e'-1)e+k}'\quad (k=0, 1, \ldots, e-1).$   (18)
Indeed, the second member is seen to contain each root
$r^{g^{se+k}}$
once and but once, while this is also true of $\eta_k$.
Let f'=e''f'' be any factorization of f' into two positive
integers. Then p-1 is the product of ee'e'' and f''. Hence
there are ee'e'' periods, each of f'' roots,
$\eta_t''=r^{g^t}+r^{g^{ee'e''+t}}+r^{g^{2ee'e''+t}}+\ldots+r^{g^{(f''-1)ee'e''+t}}$
$(t=0, 1, \ldots, ee'e''-1).$   (19)
Each period (17) is the sum of certain e'' periods (19),
$\eta_j'=\eta_j''+\eta_{ee'+j}''+\eta_{2ee'+j}''+\ldots+\eta_{(e''-1)ee'+j}''$
$(j=0, 1, \ldots, ee'-1).$   (20)
Similarly, we may take any factorization f''=e'''f''' of f'',
then any factorization of f''', etc., until we reach $f^{(l)}=1$. Thus,
each period separates into periods of fewer terms, the final
periods having a single term.
For example, if p=17 we may take e=2, f=8, e'=2, f'=4,
e''=2, f''=2, e'''=2, f'''=1, and obtain the periods given
in sec. 19.
The following theorems will be proved in sec. 38:
Theorem I. The periods $\eta_0, \eta_1, \ldots, \eta_{e-1}$ are the roots of
an equation F(x)=0 of degree e with integral coefficients.
Theorem II. The e' periods, each of f' terms,
$\eta_k',\quad \eta_{e+k}',\quad \eta_{2e+k}',\quad \ldots,\quad \eta_{(e'-1)e+k}',$   (21)
whose sum is $\eta_k$, are the roots of an equation $\phi_k'(x)=0$ of degree
e' whose coefficients are linear functions of $\eta_0, \eta_1, \ldots, \eta_{e-1}$ with integral coefficients.
Since Theorem II relates to any factorization $p-1=e\cdot e'f'$
of p-1, we may, by a suitable change of notation, apply it
to any other factorization of p-1. Taking the factors ee',
e'', f'', we conclude that the e'' periods, each of f'' terms,
$\eta_k'',\quad \eta_{ee'+k}'',\quad \eta_{2ee'+k}'',\quad \ldots,\quad \eta_{(e''-1)ee'+k}'',$   (22)
whose sum is $\eta_k'$, are the roots of an equation $\phi_k''(x)=0$ of
degree e'' whose coefficients are linear functions of $\eta_0', \eta_1', \ldots,
\eta_{ee'-1}'$ with integral coefficients. Next, taking the factors
ee'e'', e''', f''', we conclude that the e''' periods, each of f'''
terms,
$\eta_k''',\quad \eta_{ee'e''+k}''',\quad \eta_{2ee'e''+k}''',\quad \ldots,\quad \eta_{(e'''-1)ee'e''+k}''',$   (23)
whose sum is $\eta_k''$, are the roots of an equation $\phi_k'''(x)=0$
of degree e''' whose coefficients are linear functions of $\eta_0'',
\eta_1'', \ldots, \eta_{ee'e''-1}''$ with integral coefficients. Finally, we
obtain equations of degree $e^{(l)}$ satisfied by periods composed
of a single term, namely, one of the roots (10). We have
now shown that if $e, e', \ldots, e^{(l)}$ are any integers whose product
is p-1, there can be determined a series of equations,
$F(x)=0,\quad \phi_k'(x)=0,\quad \phi_k''(x)=0,\quad \ldots,\quad \phi_k^{(l)}(x)=0,$   (24)
of degrees $e, e', e'', \ldots, e^{(l)}$, of which the first has integral
coefficients, while the coefficients of $\phi_k^{(t)}(x)=0$ are linear
functions, with integral coefficients, of the roots of $\phi_k^{(t-1)}(x)=0$,
and such that the roots of the final equations $\phi_k^{(l)}(x)=0$ are
the complex pth roots of unity (10).
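The periods (16) can be computed numerically for any prime p, primitive root g, and divisor e of p-1. The sketch below is a floating-point illustration, with r taken as in (7) for n=p; it recovers for p=17 the sum and product that lead to equation (11) of sec. 19. The function name `periods` is illustrative.

```python
import cmath
import math

def periods(p, g, e):
    """The e periods of f = (p - 1) / e terms each, as defined in (16)."""
    f = (p - 1) // e
    r = cmath.exp(2j * math.pi / p)        # the root r of (7) with n = p
    # Exponents are reduced modulo p, which is permitted since r^p = 1.
    return [sum(r ** pow(g, s * e + k, p) for s in range(f))
            for k in range(e)]

eta0, eta1 = periods(17, 3, 2)
print(eta0 + eta1)      # sum of the roots of (11)
print(eta0 * eta1)      # product of the roots of (11)
```

Calling `periods(17, 3, 4)` and `periods(17, 3, 8)` in the same way gives the periods of four and of two terms used in sec. 19.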
22. If p-1 is a power of 2, the numbers e, e',..., may
each be taken to be 2, so that the auxiliary equations (24) are
quadratics. For the application to regular polygons we may
omit the final equations whose roots are the complex pth
roots of unity, since (sec. 16) we require only the combination
$r+r^{-1}$, and since $r+r^{-1}$ is a root of one of the equations (24)
just preceding the final type. To prove the last statement,
we note that by Monograph No. VII, sec. 47, a primitive root
g of p satisfies the congruence $g^e\equiv -1 \pmod{p}$, where $e=\frac{1}{2}(p-1)$,
so that $r^{g^e}=r^{-1}$. But to obtain the periods $\eta_k$ composed of
only two terms we must set f=2, $e=\frac{1}{2}(p-1)$ in (16). Then,
$\eta_k=r^{g^k}+r^{g^{e+k}}=r^{g^k}+r^{-g^k}.$
By the first remark in sec. 16, $\eta_k$ is a real number. Since each
period containing more than two terms is a sum of periods
containing just two terms, it follows that every period is a
real number, excepting only the periods containing a single
term. Hence all the quadratic equations which are required
to evaluate $r+r^{-1}=2\cos 2\pi/p$ have real roots. Hence if
p-1 is of the form $2^h$, the value of $2\cos 2\pi/p$ can be found
by the solution of a series of quadratic equations with real
roots, so that by sec. 3, the angle $2\pi/p$ can be constructed by
ruler and compasses. Hence we may state the
Theorem. If p is a prime number of the form $2^h+1$, a regular
polygon of p sides can be inscribed by ruler and compasses.
23. We next investigate the regular polygon of n sides,
when n has two or more distinct prime factors p, q,...,
namely,
$n=p^sq^t\ldots$
If we have a regular polygon of n sides, we may join certain
vertices and obtain a regular polygon of $p^s$ sides, or one of
$q^t$ sides,.... Conversely, if the latter polygons are given,
we can construct one of n sides. In general, if a and b are
any relatively prime numbers, we can derive a regular ab-gon
from a regular a-gon and a regular b-gon. Indeed, by Monograph No. VII, sec. 32, there exist integers c and d, such that
ca+db=1. Since we have the angles $2\pi/a$ and $2\pi/b$, we can
construct multiples of them, add these multiples, and obtain
the angle
$d\cdot\frac{2\pi}{a}+c\cdot\frac{2\pi}{b}=\frac{2\pi}{ab}(db+ca)=\frac{2\pi}{ab},$
and therefore construct a regular ab-gon. We have thus
proved the
Theorem. If $n=p^sq^t\ldots$, where p, q,... are distinct
primes, a regular polygon of n sides can be inscribed by ruler
and compasses if, and only if, regular polygons of $p^s$ sides, $q^t$
sides,... can be inscribed.
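The integers c and d of sec. 23 can be found by the extended Euclidean algorithm. The sketch below assumes that standard algorithm (the text only cites their existence) and checks the resulting angle for a=3, b=5; the function name `bezout` is illustrative.

```python
import math

def bezout(a, b):
    """Return integers (c, d) with c*a + d*b == gcd(a, b)."""
    old_r, r = a, b
    old_c, c = 1, 0
    old_d, d = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_c, c = c, old_c - q * c
        old_d, d = d, old_d - q * d
    return old_c, old_d

a, b = 3, 5                       # a regular triangle and a regular pentagon
c, d = bezout(a, b)
angle = d * (2 * math.pi / a) + c * (2 * math.pi / b)
print(c, d, angle, 2 * math.pi / (a * b))
```

A negative multiplier simply means that the corresponding multiple of the angle is subtracted rather than added.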
24. It therefore remains to consider a regular polygon
the number of whose sides is a power of a prime, say $p^s$. The
$p^s$th root of unity,
$r=\cos\frac{2\pi}{p^s}+i\sin\frac{2\pi}{p^s},$   (25)
is a root of $x^{p^s}=1$, but not a root of $x^{p^{s-1}}=1$, as shown by
De Moivre's theorem (sec. 14). Hence r is a root of
$\frac{x^{p^s}-1}{x^{p^{s-1}}-1}=x^{p^{s-1}(p-1)}+x^{p^{s-1}(p-2)}+\ldots+x^{p^{s-1}}+1=0.$   (26)
It will be shown in sec. 31 that this equation is irreducible
in the domain of rational numbers. If a regular $p^s$-gon can
be inscribed by ruler and compasses, the coordinates $x_k$, $y_k$
of its vertices (sec. 13) involve no irrationalities other than
real square roots (sec. 2). Hence, $x_k+iy_k$, where $i=\sqrt{-1}$,
will involve no irrationalities other than real or imaginary
square roots. In the algebraic discussion in secs. 5-10 the
radicals were not restricted to real radicals. Hence, by sec. 10,
the equation (26), which is irreducible in the domain of rational
numbers and has the root $r=x_1+iy_1$, must be of degree a power
of 2. If s > 1, $p^{s-1}(p-1)$ is not a power of 2 except in the
case p=2. Hence we may state the
Theorem. When p is a prime number > 2, a regular polygon
of $p^s$ sides cannot be inscribed by ruler and compasses if s > 1,
or if s=1 and p-1 is not of the form $2^h$.
25. In view of the theorems in secs. 22 and 24, a regular
polygon of p sides, where p is a prime number > 2, can be
inscribed by ruler and compasses if, and only if, p is of the
form $2^h+1$. We note that $2^h+1$ is composite if h has an
odd factor 2k+1 greater than unity, so that h=(2k+1)q, since in that case
$2^h+1$ has the factor $2^q+1$. If a number h has no odd factor
greater than unity, it must be a power $2^t$ of 2. We therefore have the result:
A regular polygon of p sides, where p is a prime > 2, can be
inscribed by ruler and compasses if, and only if, p is of the form
$2^{2^t}+1.$   (27)
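The divisibility used in sec. 25 can be confirmed by direct computation. The sketch below takes for 2k+1 the largest odd factor of h, which is one admissible choice among several; the function name is illustrative.

```python
def divisor_from_odd_factor(h):
    """If h = (2k+1)*q with 2k+1 > 1, return the factor 2^q + 1 of 2^h + 1.

    Returns None when h is a power of 2, where the argument does not apply.
    """
    odd = h
    while odd % 2 == 0:
        odd //= 2                 # strip the factors 2 from h
    if odd == 1:
        return None               # h is a power of 2
    q = h // odd                  # h = odd * q, with odd = 2k + 1 > 1
    return 2 ** q + 1

for h in range(1, 13):
    print(h, 2 ** h + 1, divisor_from_odd_factor(h))
```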
26. We are thus led to ask for what values of t the number
(27) is a prime. For t=0, 1, 2, 3, 4, the numbers are 3, 5,
17, 257, 65537, each being prime. The famous arithmetician
Fermat expressed his belief that the number (27) was a prime
for every t, but admitted that he had no proof of his conjecture.
But Euler proved in 1732 that when t=5 the number is not
prime,
$2^{32}+1=641\cdot 6700417.$
Further, the number (27) is known * to be not prime for
t=6, 7, 8, 9, 11, 12, 18, 23, 36, 38, 73.
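The first cases of (27) are small enough to test by trial division. The sketch below does so for t=0,..., 5; it is a brute-force check suited only to such small numbers, and the function name is illustrative.

```python
def smallest_factor(n):
    """Smallest prime factor of n > 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

for t in range(6):
    fermat = 2 ** (2 ** t) + 1
    factor = smallest_factor(fermat)
    print(t, fermat, "prime" if factor == fermat else f"divisible by {factor}")
```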
The regular 257-gon has been discussed at length by Richelot in Crelle's Journal für Mathematik, 1832, pp. 1-26, 146-161, 209-230, 337-356; and geometrically by Affolter and
Pascal in Rendiconti della R. Accademia di Napoli, 1887.
The regular polygon of $2^{16}+1=65,537$ sides has been
discussed by Hermes, Göttingen Nachrichten, 1894.


* Proceedings of the London Mathematical Society, 1903, p. 175, 1905,
p. xxi; Bulletin of the American Mathematical Society, 1906, p. 449; Vol. XI,
p. 543; 1909, p. 1.

27. Since any angle can be bisected, a regular 2k-gon is
inscriptible if a regular k-gon is. Hence the results in secs. 23-25
lead to the
Theorem. A regular polygon of n sides can be inscribed by
ruler and compasses if, and only if, $n=2^lp_1p_2\ldots$, where
$p_1, p_2, \ldots$ are distinct primes of the form $2^{2^t}+1$.
The lowest primes $p_i$ are 3, 5, 17, 257, 65537. For the
succeeding values 5, 6, 7, 8, 9 of t, the number is not prime.
For the next case t=10, the number has 309 digits; whether
or not it is prime has not yet been determined.
The regular polygons of n sides, where n lies between 2
and 26, fall into the following two classes:
Inscriptible:    3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20, 24;
Not inscriptible: 7, 9, 11, 13, 14, 18, 19, 21, 22, 23, 25.
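The classification above can be reproduced from the theorem of sec. 27. The sketch below uses only the five primes of the form (27) named in the text, so it is reliable only for n not divisible by some larger prime of that form, should one exist; the names are illustrative.

```python
# The primes of the form (27) named in the text.
FERMAT_PRIMES = (3, 5, 17, 257, 65537)

def inscriptible(n):
    """True if n = 2^l * (a product of distinct primes from FERMAT_PRIMES)."""
    if n < 3:
        return False
    while n % 2 == 0:
        n //= 2                   # any power of 2 is allowed
    for p in FERMAT_PRIMES:
        if n % p == 0:
            n //= p               # each such prime may occur only once
    return n == 1

yes = [n for n in range(3, 26) if inscriptible(n)]
no = [n for n in range(3, 26) if not inscriptible(n)]
print(yes)
print(no)
```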
28. Primitive roots of unity. A root r of $x^n=1$ is called
a primitive nth root of unity if it is not the root of a similar
equation of lower degree, namely, $x^l=1$, with 0 < l < n.
For example, $i=\sqrt{-1}$ is a primitive fourth root of unity,
since $i^4=1$, while i, $i^2$, and $i^3$ are distinct from 1.
There exist primitive nth roots of unity, for instance,
$r_1=\cos\frac{2\pi}{n}+i\sin\frac{2\pi}{n};$
by De Moivre's theorem, $r_1^n$ is the least positive power of $r_1$
equal to 1.
If r is any primitive nth root of unity, the powers
$r,\quad r^2,\quad \ldots,\quad r^n$   (28)
give all the nth roots of unity. In fact, these powers are
roots of $x^n=1$ and are all distinct; furthermore, there cannot
be more than n distinct roots of an equation of degree n.
It is easy to determine which of the roots (28) are primitive
nth roots of unity. Consider $r^k$ and let g be the greatest
common divisor of k and n. Then
$(r^k)^{n/g}=(r^n)^{k/g}=1.$
Hence, if g > 1, so that n/g < n, $r^k$ is not a primitive nth root
of unity. But if g=1, there exist (Monograph No. VII,
sec. 32) integers a and b for which
ak+bn=1.
Then if $(r^k)^l=1$, for 0 < l < n, we would have
$r^l=r^{(ak+bn)l}=(r^{kl})^a(r^n)^{bl}=1,$
whereas $r^l\neq 1$. Hence, if g=1, $r^k$ is a primitive root. We
have thus proved that when r is any primitive nth root of unity,
$r^k$ is also a primitive root if, and only if, k is relatively prime
to n.
For example, $i=\sqrt{-1}$ is a primitive fourth root of unity
and hence also $i^3=-i$ is, whereas $i^2=-1$ is not a primitive root.
Another statement of the preceding theorem is the following:
If r is any primitive nth root of unity and if 1, a, b,..., l
are the integers less than n and relatively prime to n, then
$r,\quad r^a,\quad r^b,\quad \ldots,\quad r^l$   (29)
give all the distinct primitive nth roots of unity.
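The criterion just proved can be compared with a direct computation of the least power of $r^k$ equal to 1. The sketch below does this numerically for n=12; the function names and the tolerance are assumptions of the example.

```python
import cmath
import math

def primitive_exponents(n):
    """Exponents k in 1..n with r^k a primitive nth root, by the gcd test."""
    return [k for k in range(1, n + 1) if math.gcd(k, n) == 1]

def order(z, n):
    """Least positive l <= n with z^l = 1, tested numerically."""
    for l in range(1, n + 1):
        if abs(z ** l - 1) < 1e-9:
            return l
    raise ValueError("z is not an nth root of unity")

n = 12
r = cmath.exp(2j * math.pi / n)
by_order = [k for k in range(1, n + 1) if order(r ** k, n) == n]
print(primitive_exponents(n), by_order)   # the two lists should agree
```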
29. Let $n=p^s$, where p is a prime, and let r be a primitive
$p^s$th root of unity. Of the $n=p^s$ roots (28) of $x^{p^s}=1$, $r^p$,
$r^{2p}, \ldots, r^{p^s}$ give the $p^{s-1}$ distinct roots of $x^{p^{s-1}}=1$. The
remaining $p^s-p^{s-1}$ roots are primitive $p^s$th roots of unity
(sec. 28). Hence the roots of equation (26) give all the primitive
$p^s$th roots of unity.
To complete the discussion in sec. 24, we must prove that
this equation is irreducible in the domain of all rational
numbers. This proof will be based upon a lemma of great
importance. For the case of the special function $y^3-3y+1$, this
lemma states that the function is the product of two factors
with rational coefficients only when these coefficients are
integers. For this case a proof has been given in sec. 12.
30. Gauss's lemma. If an integral function f(x) with
integral coefficients, that of the highest power being unity, is the
product of two integral functions
$\phi(x)=x^m+b_1x^{m-1}+\ldots+b_m,\quad \psi(x)=x^{m'}+c_1x^{m'-1}+\ldots+c_{m'}$
with rational coefficients, these coefficients are integers.
Let the fractions $b_1, \ldots, b_m$ be brought to their least
positive common denominator $\beta_0$ and set $b_i=\beta_i/\beta_0$. Hence
$\beta_0, \ldots, \beta_m$ have no common divisor except unity. Similarly,
let $c_i=\gamma_i/\gamma_0$, where $\gamma_0, \ldots, \gamma_{m'}$ are integers with no common
divisor > 1. Multiplying $f=\phi\cdot\psi$ by $\beta_0\gamma_0$, we get
$\beta_0\gamma_0f(x)=\phi_1(x)\cdot\psi_1(x),$   (30)
where
$\phi_1=\beta_0x^m+\beta_1x^{m-1}+\ldots+\beta_m,\quad \psi_1=\gamma_0x^{m'}+\gamma_1x^{m'-1}+\ldots+\gamma_{m'}.$
The lemma is proved if $\beta_0=\gamma_0=1$. Suppose, however, that
$\beta_0\gamma_0>1$. Then every term of the left member of (30) is a
multiple of any prime divisor p of $\beta_0\gamma_0$. Not all the $\beta$'s have
a common divisor p. Let $\beta_i$ be the first coefficient in $\phi_1(x)$
which is not divisible by p. At least one of the $\gamma$'s is not
divisible by p; let $\gamma_k$ be the first one. The total coefficient of
$x^{m+m'-i-k}$ in the product $\phi_1(x)\cdot\psi_1(x)$ is
$\beta_i\gamma_k+\beta_{i-1}\gamma_{k+1}+\beta_{i-2}\gamma_{k+2}+\ldots$
$+\beta_{i+1}\gamma_{k-1}+\beta_{i+2}\gamma_{k-2}+\ldots,$
and this sum must be divisible by p, since every term on the
left of (30) is divisible by p. By hypothesis, $\beta_{i-1}, \beta_{i-2}, \ldots$
and $\gamma_{k-1}, \gamma_{k-2}, \ldots$ are divisible by p. Hence $\beta_i\gamma_k$ must be
divisible by p, contrary to hypothesis.
31. Of various proofs of the irreducibility of equation (26),
we shall reproduce Kronecker's first proof. To prove that the
function f(x), defined by (26), is irreducible in the domain
of rational numbers, it suffices, in view of the preceding lemma,
to show that f(x) is not the product of two polynomials $\phi(x)$,
$\psi(x)$, with integral coefficients. Suppose that such a factorization
$f(x)=\phi(x)\cdot\psi(x)$
is possible. For x=1, we get
$p=\phi(1)\cdot\psi(1).$
Since p is a prime, one of these integers, say $\phi(1)$, must
equal $\pm 1$. Let r be any primitive $p^s$th root of unity. Now
all the primitive roots are given by (29), where 1, a, b,..., l
denote the $t=p^s-p^{s-1}$ integers less than $p^s$ and relatively
prime to $p^s$. Further, by sec. 29, these numbers (29) give all
the roots of (26). Hence the factor $\phi(x)$ must vanish when one
of these numbers is substituted for x, so that
$\phi(r)\cdot\phi(r^a)\cdot\phi(r^b)\ldots\phi(r^l)=0.$
In other words, the function
$P(x)=\phi(x)\cdot\phi(x^a)\cdot\phi(x^b)\ldots\phi(x^l)$
vanishes when x is replaced by any primitive $p^s$th root r of
unity. Since P(x) thus vanishes for each root of f(x), and
since the roots of f(x) are all distinct, it follows that P(x) is
divisible by f(x). Thus
$P(x)=f(x)\cdot q(x),$
where q(x) is a polynomial with integral coefficients. The
number of factors $\phi$ in P(x) is t. Hence, for x=1,
$[\phi(1)]^t=p\cdot q(1).$
Since $\phi(1)=\pm 1$, p cannot divide $[\phi(1)]^t$. The assumption
that f(x) is reducible has therefore led to a contradiction.
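The proof turns on the value f(1)=p of the left member of (26). The sketch below builds the coefficients of that polynomial from the expansion given in (26) and sums them, which is the value at x=1; the function name is illustrative.

```python
def prime_power_cyclotomic(p, s):
    """Coefficients of the left member of (26), lowest power first."""
    step = p ** (s - 1)
    coeffs = [0] * (step * (p - 1) + 1)
    for j in range(p):
        coeffs[j * step] = 1       # the term x^(p^(s-1) * j)
    return coeffs

f = prime_power_cyclotomic(3, 2)   # x^6 + x^3 + 1
print(f, sum(f))                   # sum(f) is the value f(1)
```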
32. The proof of the theorems stated in sec. 21 rests upon
four lemmas which will now be established.
While, in sec. 21, r denoted the particular pth root of unity,
$\cos\frac{2\pi}{p}+i\sin\frac{2\pi}{p},$
we shall henceforth denote by r any primitive pth root of
unity. In view of sec. 28, the powers (9) continue to give
all the complex pth roots of unity. The same is true of the
powers (10). Hence when r is any primitive pth root of unity,
the various roots (10) can be separated into periods exactly
as in sec. 21.
33. It is shown in elementary algebra that if an equation
F(x)=0 with real coefficients has a complex root a+bi, where
$i=\sqrt{-1}$ and $b\neq 0$, it has also the root a-bi, so that F(x)
has the factor
$\phi(x)=(x-a-bi)(x-a+bi)=x^2-2ax+a^2+b^2.$
Since $\phi(x)$ has no factor x-d, where d is real, it is irreducible
in the domain of real numbers (sec. 9). This result is merely
a special case of the following:
Lemma I. If F(x) and $\phi(x)$ are integral functions with
coefficients in the domain D, and $\phi(x)$ is irreducible in D, and
if F(x) vanishes for one root $x_1$ of $\phi(x)=0$, then F(x) is the
product of $\phi(x)$ by an integral function with coefficients in D.
The ordinary process for finding the greatest common
divisor g(x) of F(x) and $\phi(x)$ involves only rational operations. Hence g(x) is an integral function with coefficients
in the domain D. Moreover, g(x) is not a constant, since F(x)
and $\phi(x)$ have the common factor $x-x_1$. Since $\phi(x)$ is irreducible in D, its factor g(x) must equal $c\phi(x)$, where c is a constant. Hence $\phi(x)$, as well as g(x), is a factor of F(x).
Corollary. If the degree of F(x) is less than that of $\phi(x)$,
then F(x) has all its coefficients zero.
34. Lemma II. Any integral function f(r) of a primitive
pth root r of unity can be given the normal form
$c_0r+c_1r^g+c_2r^{g^2}+\ldots+c_{p-2}r^{g^{p-2}},$   (31)
in which each $c_i$ is an integral function, with integral coefficients,
of the coefficients of f(x). If f(r) has rational coefficients, it has
a single normal form.
Since $r^p=1$ and $r\neq 1$, r is a root of
$\frac{x^p-1}{x-1}=x^{p-1}+x^{p-2}+\ldots+x+1=0.$   (32)
Hence
$r^{p-1}+\ldots+r^2+r+1=0.$   (33)
By employing $r^p=1$, we may give f(r) the form
$f(r)=a_0+a_1r+a_2r^2+\ldots+a_{p-1}r^{p-1}.$
From this we subtract $a_0$ times (33) and obtain
$f(r)=A_1r+A_2r^2+\ldots+A_{p-1}r^{p-1},$   (34)
where $A_i=a_i-a_0$. Since the quantities (9) are identical in
some order with the quantities (10), we may give (34) the
normal form (31). The first part of the lemma is therefore
proved.
We now make the assumption that the coefficients of the
initial function f(r) are rational numbers. Then the $a_i$, and
hence also the $A_i$, are rational numbers. In this case f(r)
can be expressed in the form (34) in only one way. For, if
also
$f(r)=B_1r+B_2r^2+\ldots+B_{p-1}r^{p-1},$
in which the $B_i$ are rational numbers, we obtain by subtraction
and removal of the factor r an equation,
$0=A_1-B_1+(A_2-B_2)r+\ldots+(A_{p-1}-B_{p-1})r^{p-2},$
with rational coefficients. But equation (32) is irreducible in
the domain of rational numbers (sec. 31 with s=1). Hence
by the Corollary in sec. 33, each coefficient $A_i-B_i$ is zero.
35. The periods $\eta_0, \eta_1, \ldots, \eta_{e-1}$, defined by (16), have the
important property that each is unaltered when r is replaced
by $r^{g^e}$. In fact, $r^{g^s}$ is then replaced by
$(r^{g^e})^{g^s}=r^{g^{e+s}},$
so that any term, except the last, of a period $\eta_k$ is replaced
by the next succeeding term, and the last term by the first.
The last statement follows from
$g^{fe}=g^{p-1}\equiv 1 \pmod{p}.$
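The invariance stated in sec. 35 can be observed numerically, together with the cyclic permutation of the periods under the replacement of r by $r^g$ that is used in sec. 38. The sketch below takes p=17, g=3, e=4 as sample values; the function name is illustrative and the computation is in floating point.

```python
import cmath
import math

def periods_of(r, p, g, e):
    """Periods (16) built on the pth root of unity r."""
    f = (p - 1) // e
    return [sum(r ** pow(g, s * e + k, p) for s in range(f))
            for k in range(e)]

p, g, e = 17, 3, 4
r = cmath.exp(2j * math.pi / p)
base = periods_of(r, p, g, e)
same = periods_of(r ** pow(g, e, p), p, g, e)   # r replaced by r^(g^e)
shifted = periods_of(r ** g, p, g, e)           # r replaced by r^g
# Both printed values should be of the order of rounding error.
print(max(abs(a - b) for a, b in zip(base, same)))
print(max(abs(shifted[k] - base[(k + 1) % e]) for k in range(e)))
```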
36. Lemma III. Any integral function f(r) of a primitive
pth root of unity, with integral coefficients $a_i$, and having the
property that it remains unaltered when r is replaced by $r^{g^e}$,
equals a linear function of the periods,
$$k_0\eta_0 + k_1\eta_1 + \dots + k_{e-1}\eta_{e-1}, \qquad (35)$$
where each $k_i$ is an integral function of the $a_i$ with integral
coefficients. If the $a_i$ are all integers, then the $k_i$ are integers.


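Let f(r) be given the normal form (31), but with the powers
of r arranged in tabular form as in (15). Thus,
$$\begin{aligned}
f(r) ={}& c_{00}\, r + c_{10}\, r^{g^{e}} + c_{20}\, r^{g^{2e}} + \dots + c_{f-1,0}\, r^{g^{(f-1)e}} \\
&+ \dots \\
&+ c_{0k}\, r^{g^{k}} + c_{1k}\, r^{g^{e+k}} + c_{2k}\, r^{g^{2e+k}} + \dots + c_{f-1,k}\, r^{g^{(f-1)e+k}} \\
&+ \dots
\end{aligned}$$
When r is replaced by $r^{g^e}$, the powers in any row are permuted cyclically (end of sec. 35). The coefficients of the
resulting function must equal those of f(r), by Lemma II.
Hence
$$c_{0k} = c_{1k},\quad c_{1k} = c_{2k},\quad \dots,\quad c_{f-1,k} = c_{0k} \qquad (k = 0, 1, \dots, e-1).$$
Thus the c's in each row are all equal. Removing the common
factor, we obtain a sum of powers of r which defines a period
$\eta_k$. Hence
$$f(r) = c_{00}\eta_0 + c_{01}\eta_1 + \dots + c_{0k}\eta_k + \dots + c_{0,e-1}\eta_{e-1}.$$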
37. Lemma IV. An integral function f(r) of a primitive
pth root of unity with integral coefficients, which remains unaltered
when r is replaced by $r^g$, equals an integer.
We apply Lemma III for the case e = 1. Then
$$\eta_0 = r + r^{g} + r^{g^2} + \dots + r^{g^{p-2}}$$
is the only period of f = p-1 terms. By (33), $\eta_0 = -1$. Hence,
by (35), f(r) equals the integer $-k_0$.
38. We are now in a position to prove Theorems I and II
of sec. 21. First, $\eta_0, \dots, \eta_{e-1}$ are the roots of
$$F(x) = (x - \eta_0)(x - \eta_1) \cdots (x - \eta_{e-1}) = 0.$$
Its coefficients are symmetric functions, with integral coefficients, of the periods $\eta_i$. When r is replaced by $r^g$, these periods
are permuted cyclically, that is, $\eta_0$ is replaced by $\eta_1$, $\eta_1$ by
$\eta_2$, ..., and $\eta_{e-1}$ by $\eta_0$. Hence a symmetric function of these
periods remains unaltered and, by Lemma IV, equals an
integer. Theorem I is therefore true.
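As a small numerical illustration of Theorem I, one may take p = 17 with e = 2; these values, the primitive root g = 3, and the names used below are assumptions chosen for the sketch. The two periods should then be roots of a quadratic with integral coefficients, and replacing r by $r^g$ should interchange them.

```python
import cmath

# Illustrative parameters (assumptions): p = 17, primitive root g = 3, e = 2 periods.
p, g, e = 17, 3, 2
f = (p - 1) // e
r = cmath.exp(2j * cmath.pi / p)

def period(k, root):
    """Period number k built from the given root of unity."""
    return sum(root ** pow(g, k + j * e, p) for j in range(f))

eta = [period(k, r) for k in range(e)]
coeff_sum = eta[0] + eta[1]        # F(x) = x^2 - (sum) x + (product)
coeff_prod = eta[0] * eta[1]
swapped = [period(k, r ** g) for k in range(e)]   # r replaced by r^g

print(coeff_sum, coeff_prod)       # both should be integers up to rounding
```

For these values the symmetric functions come out as -1 and -4 up to rounding, so that F(x) = x^2 + x - 4.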


Next, the e' periods (21) are the roots of
$$\phi_k(x) = (x - \eta'_k)(x - \eta'_{e+k}) \cdots (x - \eta'_{(e'-1)e+k}) = 0,$$
whose coefficients are symmetric functions, with integral
coefficients, of these periods (21). But the latter are permuted cyclically when r is replaced by $r^{g^e}$. Hence Theorem II
follows from Lemma III.
39. References. The proof that a regular polygon of p
sides, where p is a prime of the form $2^h+1$, is geometrically
inscriptible was first made by Gauss, Disquisitiones Arithmeticae, translated into German by Maser. On p. 447 of the
latter, Gauss states that a regular n-gon is not inscriptible
if n contains an odd prime factor not of the form $2^h+1$, or
the square of a prime $2^h+1$ (i.e., states the theorems of secs. 23
and 24 above); but no proof appears to have been published by
Gauss. References to the proof (secs. 5-11) of this impossibility
may be made to
Petersen, Theorie der algebraischen Gleichungen, Kopenhagen, 1878, p. 156.
Klein, Vorträge über ausgewählte Fragen der Elementargeometrie, Leipzig, 1895; English translation by Beman and
Smith, Boston, 1897.
Enriques, Questioni riguardanti la geometria elementare,
Bologna, 1900, Articles 10 and 11; German edition.
The theorems may be readily proved by means of Galois's
theory of algebraic equations. For the domain of rational
numbers, the Galois group of equation (26), whose roots are
the primitive $p^s$th roots of unity, is cyclic, so that its factors
of composition are the prime factors of $p^{s-1}(p-1)$. If, and
only if, these factors are all 2 will the equation be equivalent
to a chain of quadratic equations.


IX
THE HISTORY AND TRANSCENDENCE OF π
By DAVID EUGENE SMITH


CONTENTS
1. The nature of the problem;
2. The history of the problem;
3. The transcendence of e;
4. The transcendence of π.


1. Nature of the problem. The first areas that the world
measured accurately were doubtless rectangles, and, in particular, squares. If the sides of the rectangles were commensurable with common units of linear measure, and for
practical purposes they were, at least with some convenient
submultiples of those units, then the problem was easily
solved. The next step was probably the mensuration of the
parallelogram or triangle, to be followed by that of the
trapezoid, thus completing the most common rectilinear forms.
In theory the measurement of these polygons offered no
serious difficulties, and by means of these figures the areas
of other polygons could easily be found. When it came to
finding the area of curvilinear figures, however, the problem
assumed new difficulties, and in connection with the most
common of these figures the effort was early made to find a
square that should have an area equal to that of a given circle,
the subsequent problem of measurement of the square being
then a simple one. In other words, the problem was one of
"squaring the circle." Since, however, it was early seen that
$a = \tfrac{1}{2}rc$, it was evident that the problem could be solved if a
straight line could be found that should equal the circumference. For if this line could be found, then the formula
$a = \tfrac{1}{2}rc$ would give a rectangle with the same area as the circle,
and it is a simple matter to construct a square with area
equal to that of a given rectangle. The problem thus reduces
to "rectifying the circumference," or "rectifying the circle,"
as we would now say. Furthermore, since $c = 2\pi r$, if we could
find the value of $\pi$ as an integer or a common fraction (or
finite decimal), we could easily rectify the circle. Since we
can construct $\sqrt{ab}$ by the use of the compasses and straightedge, it would also be possible to rectify the circle if we could
express $\pi$ by means of a finite number of square roots. In
other words, the circle would be rectified if $\pi$ could be expressed
by rational operations and by irrational operations involving
a finite number of square roots. On the other hand, every
particular geometric construction effected by the straightedge
and compasses reduces to the determination of the intersection of two straight lines, of one straight line and a circle,
or of two circles, and is equivalent to a rational operation or
the extracting of a square root. A geometric construction is
therefore impossible unless it can be effected by rational
operations or by the aid of a finite number of square roots.*
The problem therefore finally reduces to the determining of
the nature of $\pi$, whether or not it is the root of an algebraic
equation that can be solved by these methods.
2. History of the problem. The history of the problem
of "squaring the circle," or more specifically of investigating
the nature of $\pi$, may be found in any of the standard histories
of mathematics, and in particular in Cantor's Vorlesungen über
die Geschichte der Mathematik (4 vols.). The subject has
been specially treated, however, by Rudio in his work entitled
Archimedes, Huygens, Lambert, Legendre, vier Abhandlungen
über die Kreismessung, and in a more condensed manner in
the German edition of Enriques's Fragen der Elementargeometrie, both of which works have been freely used in preparing this article.
There have been three well-defined epochs in the consideration of this problem. The first extended from the
earliest times to about 1650 A.D. It is characterized by innumerable and ingenious attempts at finding a square equal in


* See Monograph No. VIII, secs. 2, 11.


area to a given circle, or at finding the approximate value
of $\pi$ by purely geometric methods, and especially by the
methods now used in our elementary text-books.
The second period was about a century in length, extending from the invention of the differential and integral calculus
to the year 1766, when Lambert published his work on the
subject. In this period the methods of analysis replace the
geometric methods of the ancients, and the names of Newton,
Leibnitz, the Bernoullis, and Euler are prominently connected
with the investigation. Instead of the Greek method of
exhaustions, used to such advantage by Archimedes, we now
find infinite series and products used to approximate the
value of $\pi$, and Euler's remarkable formula, to be referred
to later, is introduced into the discussion.
The third period extends from the middle of the eighteenth
century to the present time, and is characterized by the efforts
to discover not the approximate value of $\pi$, but the nature
of this number, whether or not it is rational, or whether it is
algebraic or transcendent. Since the two latter terms will
enter into this discussion it should be understood that an
algebraic number is a number that is the root of an equation,
$C_0 + C_1x + C_2x^2 + \dots + C_nx^n = 0$, where the coefficients $C_0$,
$C_1, \dots, C_n$ are rational numbers. A number which is not
algebraic, that is, which satisfies no such equation, is called
transcendent.* It should further be mentioned that if a
number is the root of an algebraic equation with rational
coefficients it is also the root of an algebraic equation with
integral coefficients. For this reason we may restrict our
equations to those in which $C_0, C_1, \dots, C_n$ are integers.
The first period begins in prehistoric times, the earliest
approximation for $\pi$ probably being 3, as in the Bible (I Kings
vii 23, and II Chronicles iv 2). On the Babylonian cylinders
there has not yet been found any definite statement as to this
value, and the Hindu and Chinese records are untrustworthy
for these remote times. We have, however, a valuable papyrus


* See also Monograph No. VI, sec. 13.


in the British Museum, probably copied about 1700 B.C. from
a work of some centuries before, in which an approximation for
$\pi$ is given. This papyrus was copied by one Ahmes, a scribe,
and states that the area of the circle is, in our symbolism,
$(\tfrac{8}{9}d)^2$, or $\tfrac{64}{81}d^2$. Now since $a = \tfrac{1}{4}\pi d^2$, it follows that the Ahmes
value of $\pi$ is $\tfrac{256}{81}$, or 3.1604...
Among the Greeks, numerous philosophers attempted to
solve the problem. One of the earliest to make any progress
was Hippias of Elis (c. 420 B.C.) who invented a curve known
as the τετραγωνίζουσα or quadratrix, which usually bears the
name of Dinostratos (c. 350 B.C.) who studied it carefully.
The curve may be described as follows: If a circle of unit
radius has its centre at the origin of rectangular coordinates,
and if two points Q and R move with uniform velocity, one upon the quadrant AB
and the other upon the radius OB, so
that they start from A and O, respectively, at the same time, and reach B
simultaneously, then the point of intersection, P, of OQ and of a perpendicular
to OB from R, describes a quadratrix.
[Figure: the quadratrix in the unit quadrant, with centre O, axes X and Y, and the points A, B, Q, R.]
It therefore follows that the ordinate y is proportional to the
angle $\phi$, and specifically that as we double y (within the
quadrant) we double $\phi$. Furthermore, since if y = 1, $\phi = \tfrac{\pi}{2}$, we have
$$\phi = \frac{\pi}{2}\, y.$$
Also
$$\phi = \tan^{-1}\frac{y}{x}, \text{ or } \operatorname{arc\,tan}\frac{y}{x}, \qquad \frac{y}{x} = \tan\frac{\pi}{2}y,$$
and
$$x = \frac{y}{\tan\frac{\pi}{2}y}.$$


Therefore the curve meets the x-axis at
$$x = \lim_{y \to 0} \frac{y}{\tan\frac{\pi}{2}y} = \frac{2}{\pi}.$$
That is, if we can construct the quadratrix we shall have an
abscissa exactly equal to $\tfrac{2}{\pi}$, from which $\pi$ can easily be constructed. The difficulty was at once seen, however, namely,
that the construction of a quadratrix itself was as difficult
as to find $\pi$, and that indeed it was practically the same
problem.
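The limiting abscissa of the quadratrix can be illustrated numerically. The sketch below assumes the relation x = y / tan(πy/2) for the curve in the unit quadrant; the function name `quadratrix_x` is hypothetical.

```python
import math

def quadratrix_x(y):
    """Abscissa of the quadratrix point with ordinate y (0 < y < 1), unit circle assumed."""
    return y / math.tan(math.pi * y / 2)

# As y shrinks toward 0 the abscissa should approach 2/pi.
samples = [(y, quadratrix_x(y)) for y in (0.5, 0.1, 0.001, 1e-6)]
for y, x in samples:
    print(y, x)
print("2/pi =", 2 / math.pi)
```

The computation of course presupposes a value of π, which is exactly the circularity noted in the text.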
Contemporary with Hippias were Antiphon and Bryson,
to whom we are largely indebted for our present methods of
attacking the problem in elementary geometry. Antiphon
inscribed a square (or possibly an equilateral triangle) in a
circle, and by continually doubling the number of sides he
approximately exhausted the difference between the polygon
and the circle, thus approximating the area. Bryson, of
Heraclea, a follower of the Pythagoreans, not only inscribed
a regular polygon, but also circumscribed one similar to it,
and then assumed that the area of the circle was the arithmetic
mean between the two areas, a false assumption that led only
to a fair approximation. To Antiphon, therefore, we trace
one of the earliest steps in the invention of the modern calculus.
The first one to actually square a curvilinear figure, in
his efforts to square the circle, was
Hippocrates, of Chios (c. 450 B.C.).
He proved that if semicircles be
described upon  the sides of an
isosceles right triangle, as shown in 
the figure, the lune A will equal the
triangle A'. The proposition is easily generalized for scalene
right triangles, but it contributed nothing to the general problem of the circle.
The greatest step among the Greeks was taken by Archimedes in his three propositions on the measurement of the


circle (κύκλου μέτρησις). Substantially his method of finding
the value of $\pi$ is by inscribing and circumscribing regular polygons and doubling the number of sides, quite as in elementary
geometry to-day. By this means, using a polygon of 96
sides, he showed that $3\tfrac{1}{7} > \pi > 3\tfrac{10}{71}$, from which fact $3\tfrac{1}{7}$ has often
been called the Archimedean value of $\pi$. Since $3\tfrac{1}{7}$ is less than
0.2 per cent larger than the real value, and is such a simple
number for ordinary computations, it is still in common
use.
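The doubling process may be sketched as follows. This is a modern restatement rather than Archimedes' own computation: it assumes a circle of unit diameter, starts from hexagons, and uses the harmonic-mean and geometric-mean recurrences for the perimeters; the function name is hypothetical, and floating-point square roots replace his rational bounds.

```python
import math
from fractions import Fraction

def archimedes_bounds(doublings):
    """Return (sides, inscribed perimeter, circumscribed perimeter) for a unit-diameter circle."""
    circumscribed = 2 * math.sqrt(3)   # perimeter of the circumscribed hexagon
    inscribed = 3.0                    # perimeter of the inscribed hexagon
    sides = 6
    for _ in range(doublings):
        # Doubling the sides: harmonic mean, then geometric mean.
        circumscribed = 2 * circumscribed * inscribed / (circumscribed + inscribed)
        inscribed = math.sqrt(circumscribed * inscribed)
        sides *= 2
    return sides, inscribed, circumscribed

sides, lower, upper = archimedes_bounds(4)   # 6 -> 12 -> 24 -> 48 -> 96 sides
print(sides, lower, upper)
print(float(Fraction(223, 71)), float(Fraction(22, 7)))
```

The 96-sided perimeters lie inside the interval from 3 10/71 to 3 1/7, consistent with the slightly looser rational bounds stated above.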
Ptolemy improved upon the values assigned by Archimedes,
expressing the result in the sexagesimal system as 3 8' 30'',
i.e., $3 + \tfrac{8}{60} + \tfrac{30}{3600}$, which reduces to $3\tfrac{17}{120} = 3.14166\ldots$
A somewhat similar value of $\pi$ appeared in India as early
as c. 500 A.D., when Aryabhatta gave the value $\tfrac{62832}{20000}$, which
equals 3.1416, a value, however, that may be due to a later writer
by the same name. Brahmagupta (born 598 A.D.) gave $\sqrt{10}$ as
the exact value, perhaps because of the common approximation
formula, $\sqrt{a^2 + r} = a + \frac{r}{2a+1}$, this leading to $\sqrt{10} = 3 + \tfrac{1}{7}$, or the
common Archimedean value. This value, $\sqrt{10}$, was extensively
used in mediaeval times.
The next noteworthy step in obtaining an approximate
value for $\pi$ was taken by the Chinese. Chang Heng (78-139 A.D.)
gave a rule that was equivalent to taking $\sqrt{10}$ for $\pi$. Wang
Fan (229-267) gave $\pi = 142:45$, or 3.1555..., and a contemporary writer, Liu Hui, proceeding in the same way as
Antiphon, found the ratio 157:50, or 3.14. The most interesting of the Chinese discoveries, however, is that of Tsu Ch'ung-chih (fifth century A.D.), who found for the limits of $10\pi$,
31.415927 and 31.415926, from which he inferred by some
reasoning not stated in his works that $\tfrac{22}{7}$ and $\tfrac{355}{113}$ were approximate values. The latter is the one usually attributed to
Adriaen Anthonisz, as mentioned hereafter. Various other
attempts were made by the Chinese, but no noteworthy results
were obtained until after the European influence had permeated their civilization. In the Su-li Ching-yin, compiled


by Imperial order in 1713, the value of $\pi$ is found to 19
figures.*
The greatest mathematical genius of the Middle Ages,
Leonardo Pisano, Fibonacci, brought the limits of $\pi$ somewhat closer than Archimedes, namely to $\frac{1440}{458\frac{1}{5}} = 3.1427\ldots$
and $\frac{1440}{458\frac{4}{9}} = 3.1410\ldots$, taking as the mean $\frac{864}{275} = 3.1418\ldots$ No
material improvement in methods or results was thereafter
made until about the beginning of the seventeenth century.
It was then that the Chinese value $\tfrac{355}{113} = 3.1415929\ldots$ was
rediscovered by Adriaen Anthonisz (1527-1607), being published by his son, Adriaen (1571-1635), who, from the fact that
his family was originally from Metz, took the name of Metius.
This publication took place in 1625, and it appears that the
father had first shown that $3\tfrac{15}{106} < \pi < 3\tfrac{17}{120}$ and that he had
reached this value by assuming that
$$\pi = 3\,\frac{15+17}{106+120} = 3\,\frac{16}{113} = \frac{355}{113}.$$
The value is correct through the sixth decimal place. About
the same time Viète (1540-1603), following the Greek method,
considered polygons of $6 \cdot 2^{16}$ sides, and found the value of $\pi$
correct to nine decimal places. Adriaen van Rooman (Adrianus
Romanus, 1561-1615), a Lyonese by birth, carried the computation to seventeen decimal places, and a little later Ludolph
van Ceulen (1540-1610) extended it to thirty-five decimal
places, a fact that was thought to be so noteworthy as to lead
to $\pi$ being called the Ludolphian number, a name still used
in Germany.
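The step attributed to Anthonisz, taking the mean of the numerators and of the denominators of the two bounds, can be reproduced with exact fractions. This is only a sketch; the bounds 3 15/106 and 3 17/120 are used as assumed inputs and the variable names are illustrative.

```python
import math
from fractions import Fraction

lower = 3 + Fraction(15, 106)    # 333/106
upper = 3 + Fraction(17, 120)    # 377/120

# Add numerators and denominators of the fractional parts, as in 3 (15+17)/(106+120).
anthonisz = 3 + Fraction(15 + 17, 106 + 120)

print(anthonisz)                          # reduces to lowest terms automatically
print(float(anthonisz) - math.pi)         # size of the error
```

The intermediate fraction lies between the two bounds, which is why it is a natural candidate, though nothing in the construction guarantees it to be as close as it happens to be.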
The last noteworthy attempt by Greek methods was the
improvement suggested by Christian Huygens (1629-95), by
which he was enabled to find the value to nine decimal places
by using only the inscribed polygon of 60 sides. With his
labors the ancient methods may be said to close.
The second period in the solution of the problem begins in
the second half of the seventeenth century. It was now that
the new analysis came to the aid of the investigators, and the
genius of men like Newton, Leibnitz, Fermat, Wallis, Brouncker,
* See Mikami, in the Bibliotheca Mathematica, 1909-10, p. 1.


and the Bernoullis asserted itself. Instead of the geometric
methods of Archimedes there appeared methods of a radically
different nature, having for their object the expressing of $\pi$
analytically, and developing it as an infinite series or product.
The first noteworthy attempts in this line were made by John
Wallis (1616-1703) who proved that
$$\frac{\pi}{2} = \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdot 8 \cdot 8 \cdots}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdot 7 \cdot 9 \cdots},$$
and that
$$\frac{4}{\pi} = 1 + \cfrac{1}{2 + \cfrac{9}{2 + \cfrac{25}{2 + \cfrac{49}{2 + \cfrac{81}{2 + \dots}}}}},$$
this second form, the continued fraction, having already been
given to him without proof by Lord Brouncker (1620-84).
The most important infinite series developed at this time
for the study of the circle was discovered by James Gregory
(1638-75) in 1670, and independently by Leibnitz (1646-1716)
in 1673. This is the series:
$$\tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots$$
Gregory, moreover, recognized the necessity of considering the
question of the convergency of such a series, a subject elaborated by Leibnitz a little later. Gregory also stated that in
general the ratio of a sector of a circle to the area of any
inscribed or circumscribed polygon cannot be expressed by a
finite number of algebraic terms. He therefore concluded that
the circle could not be squared, although as Mr. Ball has
pointed out in his history, "it is conceivable that some particular sector might be squared, and this particular sector
might be the whole circle."
In the series for $\tan^{-1} x$, if x = 1, we have the series
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots,$$


but it converges so slowly as not to be convenient in practice.
This series bears the name of Leibnitz, having been communicated by him to certain of his friends in 1674, and published
by him in 1682. It was known before his time, however.
If, instead of using x = 1, we take $x = \sqrt{\tfrac{1}{3}}$, the series for
$\tan^{-1} x$ becomes
$$\frac{\pi}{6} = \sqrt{\tfrac{1}{3}} \left( 1 - \frac{1}{3 \cdot 3} + \frac{1}{3^2 \cdot 5} - \frac{1}{3^3 \cdot 7} + \frac{1}{3^4 \cdot 9} - \dots \right),$$
which is more usable than that for $\tfrac{\pi}{4}$.
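The difference in usefulness between the two series can be seen by summing the same number of terms of each. The sketch below assumes the series for π/4 (with x = 1) and for π/6 (with x equal to the square root of one third); the function names are hypothetical.

```python
import math

def pi_from_series_at_one(terms):
    """4 * (1 - 1/3 + 1/5 - ...), the arctangent series with x = 1."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))

def pi_from_series_at_root_third(terms):
    """6 * sqrt(1/3) * (1 - 1/(3*3) + 1/(3^2*5) - ...), the series with x = sqrt(1/3)."""
    partial = sum((-1) ** k / (3 ** k * (2 * k + 1)) for k in range(terms))
    return 6 * math.sqrt(1 / 3) * partial

for terms in (5, 10, 20):
    print(terms,
          abs(pi_from_series_at_one(terms) - math.pi),
          abs(pi_from_series_at_root_third(terms) - math.pi))
```

With twenty terms the first series is still wrong in the second decimal place, while the second is already correct to about ten places.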
Still better than this formula is one derived from the series
for $\tan^{-1} x$ by means of an addition theorem, viz.:
$$\tan^{-1} x + \tan^{-1} y = \tan^{-1} \frac{x + y}{1 - xy},$$
which by repeated application leads to a formula for the sum
of several antitangents or for multiples of a single antitangent.
It was thus that the English mathematician Machin (1680-1752) established the relation
$$\begin{aligned}
\frac{\pi}{4} &= 4 \tan^{-1}\frac{1}{5} - \tan^{-1}\frac{1}{239} \\
&= 4\left( \frac{1}{5} - \frac{1}{3 \cdot 5^3} + \frac{1}{5 \cdot 5^5} - \frac{1}{7 \cdot 5^7} + \dots \right)
 - \left( \frac{1}{239} - \frac{1}{3 \cdot 239^3} + \frac{1}{5 \cdot 239^5} - \frac{1}{7 \cdot 239^7} + \dots \right).
\end{aligned}$$
By these means the value of $\pi$ was computed to 100 decimal
places after Abraham Sharp (1653-1742) had computed it to
72 decimal places by the help of the series for $\tfrac{\pi}{6}$.
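Machin's relation lends itself to exact integer arithmetic. The following is a minimal sketch, not a reconstruction of Machin's own working: the function names, the scaling by a power of ten, and the ten guard digits are assumptions made for the example.

```python
def arctan_inverse(n, digits):
    """arctan(1/n) scaled by 10**(digits + 10), using only integer arithmetic."""
    scale = 10 ** (digits + 10)      # ten guard digits absorb truncation error
    power = scale // n               # holds scale / n^(2k+1), truncated
    total = power
    n_squared = n * n
    k = 1
    sign = -1
    while power:
        power //= n_squared
        k += 2
        total += sign * (power // k)
        sign = -sign
    return total

def machin_pi(digits):
    """Integer formed by the digits of pi: 3 followed by the requested decimal places."""
    scaled = 4 * (4 * arctan_inverse(5, digits) - arctan_inverse(239, digits))
    return scaled // 10 ** 10        # drop the guard digits

print(machin_pi(50))
```

Because the powers of 1/5 and 1/239 shrink rapidly, only a few dozen terms are needed for fifty places, which is the practical advantage over the series with x = 1.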
The other attempts at computing the value of $\pi$ by means
of series may be summarized by mentioning the names of the
French mathematician Lagny (1660-1734), who carried the
computation to 127 decimal places; the Austrian Georg Vega
(1756-1802), 140 places; the Hamburg computer Zacharias


Dase (1824-61), 200 places, using the formula $\frac{\pi}{4} = \tan^{-1}\frac{1}{2} + \tan^{-1}\frac{1}{5} + \tan^{-1}\frac{1}{8}$; Richter, who extended the value to 500 decimal
places, and Shanks, who carried it to 700 decimal places.
These efforts are of value chiefly in showing the superiority
of the modern over the ancient methods. Practically, as the
late Professor Newcomb remarked, " ten decimals are sufficient
to give the circumference of the earth to the fraction of an
inch, and thirty decimals would give the circumference of the
whole visible universe to a quantity imperceptible with the
most powerful microscope." The results of these extended
computations revealed nothing concerning the real nature of
$\pi$, nothing as to whether it is rational or irrational, and nothing
as to its possible transcendental character.
The foundation for the solution of the problem as to the
nature of $\pi$ was furnished by Euler in connection with the
formulas involving e, the base of the so-called Naperian logarithms, although first used as a base in the tables of John
Speidell, published in London in 1619. Starting with Maclaurin's formula,
$$f(x) = f(0) + f'(0)\, x + f''(0)\, \frac{x^2}{1 \cdot 2} + f'''(0)\, \frac{x^3}{1 \cdot 2 \cdot 3} + \dots,$$
it is evident that
$$e^x = 1 + \frac{x}{1} + \frac{x^2}{1 \cdot 2} + \frac{x^3}{1 \cdot 2 \cdot 3} + \dots,$$
$$\cos x = 1 - \frac{x^2}{1 \cdot 2} + \frac{x^4}{1 \cdot 2 \cdot 3 \cdot 4} - \dots,$$
and
$$\sin x = x - \frac{x^3}{1 \cdot 2 \cdot 3} + \frac{x^5}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} - \dots,$$
all being convergent series. It was by the help of these series
that Euler (1707-83) showed that
$$e^{ix} = 1 + \frac{ix}{1} - \frac{x^2}{1 \cdot 2} - \frac{ix^3}{1 \cdot 2 \cdot 3} + \dots$$
and
$$i \sin x = ix - \frac{ix^3}{1 \cdot 2 \cdot 3} + \frac{ix^5}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} - \dots,$$
whence
$$e^{ix} = \cos x + i \sin x.$$
If $x = \pi$, this reduces to the form
$$e^{i\pi} = -1,$$
whence
$$1 + e^{i\pi} = 0,$$
an expression involving perhaps the five most interesting quantities in mathematics. It is by means of this equation that
the transcendent nature of $\pi$ was proved about a century
and a half after Euler's discovery.
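Euler also gave numerous other relations between e and $\pi$,
and expressed in various ways the values of these numbers in
infinite series and products, and as continued fractions. For
example, he showed that the following relations exist:
$$\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots$$
$$\frac{\pi^3}{32} = \frac{1}{1^3} - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \frac{1}{9^3} - \dots$$
$$\frac{\pi^2}{6} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdot \frac{7^2}{7^2 - 1} \cdot \frac{11^2}{11^2 - 1} \cdots$$
$$e = 2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{1 + \dots}}}}}}$$
$$\frac{e - 1}{2} = \cfrac{1}{1 + \cfrac{1}{6 + \cfrac{1}{10 + \cfrac{1}{14 + \cfrac{1}{18 + \dots}}}}}$$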
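Continued fractions of this kind converge very quickly, as a short numerical check suggests. The sketch assumes the simple continued fraction of e with partial denominators 1, 2, 1, 1, 4, 1, 1, 6, ... and that of (e - 1)/2 with partial denominators 1, 6, 10, 14, 18, ...; the helper name is hypothetical.

```python
import math

def simple_continued_fraction(partials):
    """Evaluate a0 + 1/(a1 + 1/(a2 + ...)) for a finite list of partial denominators."""
    value = 0.0
    for a in reversed(partials[1:]):
        value = 1.0 / (a + value)
    return partials[0] + value

# e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, ...]
e_partials = [2] + [term for k in range(1, 8) for term in (1, 2 * k, 1)]
# (e - 1)/2 = [0; 1, 6, 10, 14, 18, ...]
half_partials = [0, 1] + [4 * k + 2 for k in range(1, 8)]

e_value = simple_continued_fraction(e_partials)
half_value = simple_continued_fraction(half_partials)
print(e_value, math.e)
print(half_value, (math.e - 1) / 2)
```

A handful of partial denominators already exhausts double precision in both cases.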


The third period in the history of the study of $\pi$ begins
with the work of the German mathematician Johann Heinrich
Lambert (1728-77). In his treatise on the quadrature and
rectification of the circle (1766) he set forth two fundamental
propositions, viz.:
1. If x is a rational number, not 0, then $e^x$ cannot be
rational;
2. If $e^x$ is a rational number, not 0, then x cannot be
rational.
He reached these conclusions by starting with Euler's
expression for $\frac{e-1}{2}$, viz.:
$$\frac{e - 1}{2} = \cfrac{1}{1 + \cfrac{1}{6 + \cfrac{1}{10 + \cfrac{1}{14 + \cfrac{1}{18 + \cfrac{1}{22 + \dots}}}}}}.$$
He then showed that
$$\frac{e^x - 1}{e^x + 1} = \cfrac{1}{\cfrac{2}{x} + \cfrac{1}{\cfrac{6}{x} + \cfrac{1}{\cfrac{10}{x} + \cfrac{1}{\cfrac{14}{x} + \dots}}}}$$
and
$$\tan x = \cfrac{1}{\cfrac{1}{x} - \cfrac{1}{\cfrac{3}{x} - \cfrac{1}{\cfrac{5}{x} - \cfrac{1}{\cfrac{7}{x} - \cfrac{1}{\cfrac{9}{x} - \dots}}}}}$$
and from these continued fractions he drew the conclusions
stated, the proof not being rigorous. For the special case of


$x = \frac{\pi}{4}$ we know that $\tan\frac{\pi}{4} = 1$, whence he asserted that $\pi$ cannot be rational. The failure of Lambert to prove that the continued fraction
$$\cfrac{m}{n + \cfrac{m'}{n' + \cfrac{m''}{n'' + \dots}}}$$
is irrational, the number of terms being infinite, $m, m', \dots$
and $n, n', \dots$ being integral, and $\frac{m}{n}, \frac{m'}{n'}, \dots$ being less than
1, was remedied by Legendre (1752-1833), who supplied the
proof in his Éléments de géométrie (1794). With Legendre's
work, therefore, the proof of the irrationality of $\pi$ may be
said to have been settled, and to this he added a proof of the
irrationality of $\pi^2$.
The next noteworthy step was taken by Liouville (1809-82)
in 1840, when he showed that e cannot be the root of a quadratic equation with rational coefficients, or, in other words,
that if a, b, c are rational, $ae^2 + be + c = 0$ is impossible. This
was the first successful attempt toward verifying what Legendre
had stated to be probable, that $\pi$ is of such a nature that
it cannot be classed among algebraic numbers, that is, that
it is not the root of any algebraic equation with a finite number
of terms with rational coefficients. The question then, as it
stood after the contribution of Liouville, was twofold: Of what,
if any, algebraic equations with a finite number of terms with
rational coefficients can e and $\pi$ be roots? Is it not possible
to find numbers that are not roots of an algebraic equation
of this kind? Legendre was the first to express the doubt
contained in the second part of this question, and the doubt
became a certainty when Liouville proved, in 1844, the existence
of non-algebraic numbers and justified the division of numbers
into algebraic and transcendental.
As the result of a careful investigation of the exponential
function Hermite succeeded in proving, in 1873, that the


number e is transcendental, and Lindemann, in 1882, succeeded
in proving the same for $\pi$, basing his proof upon the labors
of Hermite. Lindemann proved essentially that in an equation
of the form $a_0 + a_1e^p + a_2e^q + a_3e^r + \dots = 0$, the exponents and
coefficients cannot all be algebraic numbers. It therefore
follows that in the Euler equation, $1 + e^{i\pi} = 0$, where the coefficients are algebraic, the exponent $i\pi$ is not algebraic, and
hence $\pi$ is transcendental. While we shall not follow Lindemann's proof exactly, it is nevertheless necessary, as a preliminary to considering the nature of $\pi$, to prove that e is a
transcendental number.
3. The transcendence of e. Since Hermite first proved that
e is transcendental others have materially simplified his treatment of the problem. The contributions of Hilbert, Hurwitz,
and Gordan were published in the Mathematische Annalen in
1893. The Gordan proof was still further simplified by Weber
in his Algebra, and later in the Encyklopädie der Elementar-Mathematik (1903), and Enriques, in his Fragen der Elementargeometrie (German edition, 1907), presents it in its latest
form. To the last-named work the basis of the following proof
is due, but the proof has been materially simplified, chiefly
through the kind assistance of Professor E. V. Huntington,
of Harvard University, who planned the treatment for e, and
who made the suggestion of using the cubic instead of the
general equation, and of distinctly setting forth the three
lemmas.
To prove that e is a transcendental number means that it
must be shown that e is not a root of any algebraic equation
with rational coefficients. In other words, it must be shown
that it is impossible to have a general equation of the form
$$C_0 + C_1e + C_2e^2 + \dots + C_ne^n = 0, \qquad (1)$$
where n is any positive integer, and where the coefficients $C_0$,
$C_1, \dots$, are any rational numbers, including 0, except that
$C_0$ and $C_n$ cannot be 0, since this would change the degree of
the equation.
In order to simplify the proof it is proposed to take a cubic


equation instead of this general equation of the nth degree,
and to show that it is impossible to have
$$C_0 + C_1e + C_2e^2 + C_3e^3 = 0. \qquad (2)$$
The proof, however, is essentially the same as that for the
equation of the nth degree, the gain being in the simplicity
of statement. The extension of the proof to the general
equation is obvious.
The proof requires us to consider two important functions
which, on account of their frequent use, we shall distinguish
by the symbols f(x) and F(x). The first of these is a rational
integral function of x of the nth degree, such that f(0)=0.
It is therefore of the form
$$f(x) = a_1x + a_2x^2 + a_3x^3 + \dots + a_nx^n,$$
where the a's are rational numbers, being the coefficients in the
expansion of f(x) in powers of x. The proof depends upon the
ingenious selection of the following function:
$$f(x) = \frac{x^{p-1}\left[(x-1)(x-2)(x-3)\right]^p}{(p-1)!},$$
in which p is a prime number to be determined later in this
discussion. If f(x) is put into the form $a_1x + a_2x^2 + \dots + a_nx^n$,
that is, if
$$f(x) = \frac{x^{p-1}\left[(x-1)(x-2)(x-3)\right]^p}{(p-1)!} = a_1x + a_2x^2 + a_3x^3 + \dots + a_nx^n, \qquad (3)$$
it is evident that n = 3p + p - 1, and that $a_{p-1}$ is the first
coefficient that is not zero, since the lowest power of x is
$x^{p-1}$. More generally, of course, the numerator of this fraction
would be $x^{p-1}[(x-1)(x-2)\cdots(x-m)]^p$, but for our present
purposes the one selected is sufficient.
The second function that enters into the discussion is
$$F(x) = f'(x) + f''(x) + f'''(x) + \dots + f^{(n)}(x), \qquad (4)$$
where $f'(x), f''(x), \dots, f^{(n)}(x)$ are the successive derivatives
of f(x).


In order to bring clearly to view the principal steps of the
proof, unencumbered by subordinate matters, it is now proposed to state three lemmas concerning these functions, f(x)
and F(x), relegating the proofs of these lemmas to the end
of this part of the discussion.
Lemma I. If $f(x) = a_1x + a_2x^2 + a_3x^3 + \dots + a_nx^n$, and if $S_n$
denotes the sum of the first n terms in the series $e^x$, so that
$$S_1 = 1,\quad S_2 = 1 + \frac{x}{1!},\quad S_3 = 1 + \frac{x}{1!} + \frac{x^2}{2!},\quad \dots,\quad S_n = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^{n-1}}{(n-1)!},$$
then, from (4),
$$F(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + 3!\,S_3a_3 + \dots + n!\,S_na_n. \qquad (5)$$
In particular,
$$F(0) = 1!\,a_1 + 2!\,a_2 + 3!\,a_3 + \dots + n!\,a_n. \qquad (6)$$
Lemma II. Again referring to
$$f(x) = \frac{x^{p-1}\left[(x-1)(x-2)(x-3)\right]^p}{(p-1)!}$$
and $F(x) = f'(x) + f''(x) + f'''(x) + \dots + f^{(n)}(x)$, if p is any prime
number and n is any positive integer, and $C_0, C_1, C_2, C_3$ are
any integers, then
$$C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) = C_0(3!)^p + pQ, \qquad (7)$$
where Q is some integer depending upon the values of the
C's and p.
Lemma III. Again referring to
$$f(x) = a_1x + a_2x^2 + a_3x^3 + \dots + a_nx^n = \frac{x^{p-1}\left[(x-1)(x-2)(x-3)\right]^p}{(p-1)!},$$
if we let $A_1 = |a_1|$, $A_2 = |a_2|$, ..., $A_n = |a_n|$, and $X = |x|$, then
$$A_1X + A_2X^2 + A_3X^3 + \dots + A_nX^n \le \frac{X^{p-1}\left[(X+1)(X+2)(X+3)\right]^p}{(p-1)!}. \qquad (8)$$

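Assuming for the time being that these lemmas have been
proved, or referring to the proofs at the close of this discussion,
we now proceed to consider the transcendence of e.
As a point of departure we take the series defining $e^x$, already
considered:
$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots, \qquad (9)$$
which is convergent for all values of x. In this we let $S_n$ stand
for the sum of the first n terms as in Lemma I, so that
$$S_n = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^{n-1}}{(n-1)!}.$$
If, now, we multiply (9) successively by 1!, 2!, 3!, ..., n!,
and put
$$U_n = x^n\left[1 + \frac{x}{n+1} + \frac{x^2}{(n+1)(n+2)} + \dots\right], \qquad (10)$$
we shall have
$$\begin{aligned}
1!\,e^x &= 1!\,S_1 + x + \frac{x^2}{2} + \frac{x^3}{2 \cdot 3} + \dots = 1!\,S_1 + U_1, \\
2!\,e^x &= 2!\,S_2 + x^2 + \frac{x^3}{3} + \frac{x^4}{3 \cdot 4} + \dots = 2!\,S_2 + U_2, \\
3!\,e^x &= 3!\,S_3 + x^3 + \frac{x^4}{4} + \frac{x^5}{4 \cdot 5} + \dots = 3!\,S_3 + U_3, \\
&\ \ \vdots \\
n!\,e^x &= n!\,S_n + x^n + \frac{x^{n+1}}{n+1} + \frac{x^{n+2}}{(n+1)(n+2)} + \dots = n!\,S_n + U_n.
\end{aligned} \qquad (11)$$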


If we multiply both members of the successive equations
of (11) by the successive coefficients $a_1, a_2, a_3, \dots, a_n$ of (3)
and add, we shall have
$$(1!\,a_1 + 2!\,a_2 + \dots + n!\,a_n)\,e^x = (1!\,S_1a_1 + 2!\,S_2a_2 + \dots + n!\,S_na_n) + (a_1U_1 + a_2U_2 + \dots + a_nU_n).$$
But by (6) we know that
$$F(0) = 1!\,a_1 + 2!\,a_2 + \dots + n!\,a_n,$$
and by (5) we know that
$$F(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + \dots + n!\,S_na_n.$$
It therefore follows, from the preceding equation, that
$$F(0)\,e^x = F(x) + a_1U_1 + a_2U_2 + \dots + a_nU_n.$$
For brevity we let
$$\phi(x) = a_1U_1 + a_2U_2 + \dots + a_nU_n, \qquad (12)$$
and we have
$$F(0)\,e^x = F(x) + \phi(x). \qquad (13)$$
We now have an expression for $e^x$ which depends upon F(x)
of (4), and hence ultimately upon the choice of p in (3).
We now return to the essential point of the problem, and
recall that we are to prove that it is impossible that $C_0 + C_1e
+ C_2e^2 + C_3e^3$ should equal zero. We shall evidently have this
form as a factor if, in (13), we substitute 0, 1, 2, and 3, successively for x, multiply the results by $C_0$, $C_1$, $C_2$, and $C_3$, and
then add, thus:
$$\begin{aligned}
F(0)\,C_0 &= C_0F(0) + C_0\phi(0), \\
F(0)\,C_1e &= C_1F(1) + C_1\phi(1), \\
F(0)\,C_2e^2 &= C_2F(2) + C_2\phi(2), \\
F(0)\,C_3e^3 &= C_3F(3) + C_3\phi(3).
\end{aligned} \qquad (14)$$
Adding,
$$\begin{aligned}
F(0)\left[C_0 + C_1e + C_2e^2 + C_3e^3\right] ={}& C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) \\
&+ C_0\phi(0) + C_1\phi(1) + C_2\phi(2) + C_3\phi(3).
\end{aligned} \qquad (15)$$


We will now make the assumption of (2), that Co+Cle+
C2e2 + C3e3 =0, and will show that (15) is impossible, and hence
that the assumption is absurd. In making this substitution
in (15) we also recall that
CoF(0) +C1F(1) +CF F(2) +C3F(3) = Co(3!)P+pQ,
by (7). We therefore have
0 =[Co(3!)p +pQ]+[Co1(O) +C1 (1) +C2(0(2) +C30(3)],  (16)
where Q is some integer depending upon the values of C and
p, and 0 is the function defined in (12).
The problem now reduces to showing that (16) is impossible,
and hence that (15) is impossible, and hence that e cannot
be a root of an equation like (2). We can show that (16)
is impossible if we can prove that
(1) The absolute value of the first part, $C_0(3!)^p + pQ$, is
greater than or equal to 1;
(2) The absolute value of the second part, $C_0\phi(0) + \cdots + C_3\phi(3)$, is less than 1.
For in case we can prove this, then in the most unfavorable
case we shall have $\pm 1 \pm (\text{a number less than } 1) = 0$, which is
manifestly impossible, whence (16) is impossible.
As to the first part, $C_0(3!)^p + pQ$, if we take p a prime
number greater than 3, and not a factor of $C_0$, then $C_0(3!)^p$ is not
divisible by p, but pQ is divisible by p. Therefore, because $C_0$
is not zero, the sum is an integer not divisible by p, hence not
zero, and we have $|C_0(3!)^p + pQ| \ge 1$.
Consider now the second part of (16),
$C_0\phi(0) + C_1\phi(1) + C_2\phi(2) + C_3\phi(3)$. In this we shall need to make use of the
fact that the absolute value of a sum is less than, or at most
equal to, the sum of the absolute values of the terms, as is
seen in the simple case of $|2 - 2 + 2 - 2| = 0$, while
$|2| + |-2| + |2| + |-2| = 8$. And since the $\phi$ functions are defined (12) in
terms of the U functions (10), we consider first $U_n$.
From (10),
$$U_n = x^n\left[1 + \frac{x}{n+1} + \frac{x^2}{(n+1)(n+2)} + \cdots\right].$$


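Putting X for |x|, as in (8), we have
$$|U_n| \le X^n\left[1 + \frac{X}{n+1} + \frac{X^2}{(n+1)(n+2)} + \cdots\right],$$
$$|U_n| < X^n\left[1 + \frac{X}{1} + \frac{X^2}{1\cdot 2} + \cdots\right],$$
since each denominator has here been replaced by a smaller
one. Hence, from (9),
$$|U_n| < X^ne^X. \tag{17}$$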
Having now considered the U function used in defining the
$\phi$ function (12), we consider the latter, i.e.,
$$\phi(x) = a_1U_1 + a_2U_2 + \cdots + a_nU_n.$$
In this we put $A_1$ for $|a_1|$, $A_2$ for $|a_2|$, ..., and we have
$$|\phi(x)| \le A_1|U_1| + A_2|U_2| + \cdots + A_n|U_n|,$$
as in the preliminary work of (17). Substituting the limiting
value of $|U_n|$, from (17), and giving to n the successive values
1, 2, 3, ..., we have
$$|\phi(x)| < e^X[A_1X + A_2X^2 + \cdots + A_nX^n].$$
Whence, from (8),
$$|\phi(x)| < e^X\,\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!},$$
whence
$$|\phi(x)| < e^X(X+1)(X+2)(X+3)\,\frac{[X(X+1)(X+2)(X+3)]^{p-1}}{(p-1)!}. \tag{18}$$
Now for any fixed value of X we can take for p a value
so large that
$$\frac{[X(X+1)(X+2)(X+3)]^{p-1}}{(p-1)!}$$
shall be as small as we please, since this is of the form $y^n/n!$,
and is therefore the pth term of the convergent exponential
series and therefore approaches zero as p increases.
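How large p must be taken can be seen by evaluating the bound (18) for a fixed X. The sketch below is an illustration under stated assumptions: it works with the natural logarithm of the bound, to avoid overflow, and it steps through all integers p rather than primes only. The function names are invented for the example.

```python
from math import lgamma, log

def log_bound(X, p):
    """Natural log of e^X (X+1)(X+2)(X+3) * M^(p-1) / (p-1)!, with M = X(X+1)(X+2)(X+3).

    Requires X > 0. lgamma(p) equals log((p-1)!).
    """
    tail = (X + 1) * (X + 2) * (X + 3)
    M = X * tail
    return X + log(tail) + (p - 1) * log(M) - lgamma(p)

def first_p_below_one(X):
    """Smallest integer p >= 2 for which the bound is less than 1 (its log is negative)."""
    p = 2
    while log_bound(X, p) >= 0:
        p += 1
    return p

for p in (11, 101, 1009, 2003):
    print(p, log_bound(3, p))
print("bound first drops below 1 at p =", first_p_below_one(3))
```

For X = 3 the quantity M is 360, and the bound at first grows with p; it begins to fall only when p - 1 exceeds M, which is why a very large p is needed before the factorial prevails.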


Hence, putting 0, 1, 2, and 3 successively for x, we see
that $|\phi(0)|$, $|\phi(1)|$, $|\phi(2)|$, and $|\phi(3)|$ can all be made as small
as we please by choosing p sufficiently large. Hence the
absolute value of the second part of (16) which we are considering,
viz.: $C_0\phi(0) + C_1\phi(1) + C_2\phi(2) + C_3\phi(3)$, can be made
as small as we please, and hence less than 1, which was what
we set out to show as the second part of the general proof.
It therefore appears that (16) cannot be true, and that
therefore (15) cannot be true, and that therefore (2) cannot be
true; in other words, that e is not the root of any cubic equation
with integral coefficients. And what has been shown with
respect to the cubic equation can evidently be shown with
respect to an equation of the nth degree, since no essential use
has been made of the restriction n = 3. Hence e cannot be
the root of any algebraic equation.
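Proof of Lemma I. Lemma I asserts that if
$$f(x) = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n,$$
and if we let
$$S_1 = 1,\quad S_2 = 1 + \frac{x}{1!},\quad S_3 = 1 + \frac{x}{1!} + \frac{x^2}{2!},\quad \ldots,\quad S_n = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n-1)!},$$
then
$$f'(x) + f''(x) + f'''(x) + \cdots + f^{(n)}(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + \cdots + n!\,S_na_n.$$
To prove this, first write f(x) in this form:
$$f(x) = 1!\,a_1\frac{x}{1!} + 2!\,a_2\frac{x^2}{2!} + 3!\,a_3\frac{x^3}{3!} + \cdots + n!\,a_n\frac{x^n}{n!}.$$
Taking the successive derivatives we have
$$\begin{aligned}
f'(x) &= 1!\,a_1 + 2!\,a_2\frac{x}{1!} + 3!\,a_3\frac{x^2}{2!} + \cdots + n!\,a_n\frac{x^{n-1}}{(n-1)!},\\
f''(x) &= 2!\,a_2 + 3!\,a_3\frac{x}{1!} + \cdots + n!\,a_n\frac{x^{n-2}}{(n-2)!},\\
f'''(x) &= 3!\,a_3 + \cdots + n!\,a_n\frac{x^{n-3}}{(n-3)!},\\
&\;\;\vdots\\
f^{(n)}(x) &= n!\,a_n.
\end{aligned}$$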
By adding, and substituting $S_1, S_2, \ldots, S_n$ for their
respective series, we have
$$f'(x) + f''(x) + \cdots + f^{(n)}(x) = 1!\,S_1a_1 + 2!\,S_2a_2 + \cdots + n!\,S_na_n.$$
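Lemma I lends itself to an exact spot check in rational arithmetic. The sketch below is illustrative only; the coefficient list and the value of x are arbitrary choices, and the function names are invented for the example. The list `a` holds the coefficient of $x^k$ at index k, the entry at index 0 playing no part.

```python
from fractions import Fraction
from math import factorial

def S(k, x):
    """S_k = 1 + x/1! + x^2/2! + ... + x^(k-1)/(k-1)!."""
    return sum(Fraction(x) ** j / factorial(j) for j in range(k))

def sum_of_derivatives(a, x):
    """f'(x) + f''(x) + ... + f^(n)(x) for f(x) = a[1] x + ... + a[n] x^n."""
    total, d = Fraction(0), list(a)
    while len(d) > 1:
        d = [k * d[k] for k in range(1, len(d))]   # next derivative
        total += sum(c * Fraction(x) ** k for k, c in enumerate(d))
    return total

def lemma_rhs(a, x):
    """1! S_1 a_1 + 2! S_2 a_2 + ... + n! S_n a_n."""
    return sum(factorial(k) * S(k, x) * a[k] for k in range(1, len(a)))

a = [0, 2, -3, 0, 5]            # f(x) = 2x - 3x^2 + 5x^4, chosen arbitrarily
x = Fraction(7, 3)
print(sum_of_derivatives(a, x) == lemma_rhs(a, x))
```

Because every quantity is a `Fraction`, the comparison is an exact equality and not a floating-point approximation.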
Proof of Lemma II. Lemma II asserts that if
$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!}$$
and
$$F(x) = f'(x) + f''(x) + \cdots + f^{(n)}(x),$$
and p is any prime number, and n is any positive integer, and
the C's are any integers, then
$$C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) = C_0(3!)^p + pQ,$$
where Q is some integer depending upon the values of the C's
and p.
Arranging f(x) according to ascending powers of x we have
$$f(x) = \frac{B_{p-1}x^{p-1} + B_px^p + B_{p+1}x^{p+1} + \cdots + B_{4p-1}x^{4p-1}}{(p-1)!},$$
since it is apparent from (3) that the lowest power of x is
p - 1, and the highest is 3p + p - 1 = 4p - 1. It is evident that
$B_{p-1}, B_p, \ldots, B_{4p-1}$ are integral, since they are products of
integers, and that, apart from sign,
$$B_{p-1} = (3!)^p.$$
Taking the successive derivatives, so as to determine the
values of F(0), F(1), F(2), F(3), we have, after putting 0 for x,
$$f'(0) = 0,\quad f''(0) = 0,\quad \ldots,\quad f^{(p-2)}(0) = 0,$$
but
$$f^{(p-1)}(0) = B_{p-1},\quad f^{(p)}(0) = pB_p,\quad f^{(p+1)}(0) = p(p+1)B_{p+1},\quad \ldots$$
Hence
$$F(0) = B_{p-1} + pB_p + p(p+1)B_{p+1} + \cdots.$$
Substituting the value of $B_{p-1}$, above, we have
$$C_0F(0) = C_0(3!)^p + \text{a set of integers in which } p \text{ is a factor}.$$


Similarly, taking the values of $f'(1), f''(1), \ldots$, we shall
find that F(1) equals a series of integers in which p is a factor
of each term, and so for F(2) and F(3). Hence
$$C_0F(0) + C_1F(1) + C_2F(2) + C_3F(3) = C_0(3!)^p + pQ.$$
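The divisibility statements of Lemma II can be confirmed in exact arithmetic for a few small primes. The sketch below is an illustration, with invented function names and an arbitrary choice of the integers C. One point of sign should be noted: with f(x) written with the factors (x-1)(x-2)(x-3), the term $f^{(p-1)}(0)$ comes out as $(-1)^{3p}(3!)^p$, that is, $-(3!)^p$ for odd p, so the check compares against $(-6)^p$. The sign plays no part in the argument, which needs only that this term be not divisible by p.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def f_coeffs(p):
    """Coefficients of f(x) = x^(p-1) [(x-1)(x-2)(x-3)]^p / (p-1)!, lowest power first."""
    cubic = poly_mul(poly_mul([-1, 1], [-2, 1]), [-3, 1])
    num = [0] * (p - 1) + [1]
    for _ in range(p):
        num = poly_mul(num, cubic)
    return [Fraction(c, factorial(p - 1)) for c in num]

def F(a, x):
    """F(x) = f'(x) + f''(x) + ... + f^(n)(x), computed exactly."""
    total, d = Fraction(0), list(a)
    while len(d) > 1:
        d = [k * d[k] for k in range(1, len(d))]
        total += sum(c * x**k for k, c in enumerate(d))
    return total

def check_lemma_2(p, C):
    """Return (C_0 F(0) + ... + C_3 F(3)) mod p and (C_0 (-6)^p) mod p."""
    a = f_coeffs(p)
    total = sum(c * F(a, j) for j, c in enumerate(C))
    assert total.denominator == 1          # the sum is an integer
    return int(total) % p, (C[0] * (-6) ** p) % p

for p in (5, 7, 11):
    print(p, check_lemma_2(p, (2, -1, 4, 3)))
```

The two residues printed for each prime agree, and neither is zero, since each prime used here exceeds 3 and does not divide $C_0 = 2$.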
Proof of Lemma III. Lemma III asserts that if
$$f(x) = a_1x + a_2x^2 + a_3x^3 + \cdots + a_nx^n = \frac{x^{p-1}[(x-1)(x-2)(x-3)]^p}{(p-1)!},$$
and if
$$A_1 = |a_1|,\quad A_2 = |a_2|,\quad \ldots,\quad A_n = |a_n|,\quad \text{and } X = |x|,$$
then
$$A_1X + A_2X^2 + A_3X^3 + \cdots + A_nX^n = \frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}.$$
Referring to the second form of f(x) above, we see that
f(x) is a function of x with alternating signs. For if we take
the general case of $x^k - a_1x^{k-1} + a_2x^{k-2} - \cdots$, and multiply this
by x - b, the resulting function will have alternating signs, as
in the case of (x - c)(x - d). Furthermore, the result is the
same, aside from the signs, as that obtained by multiplying
$x^k + a_1x^{k-1} + a_2x^{k-2} + \cdots$ by x + b. Repeated application of this
theorem shows that the expanded product of the general case,
$$f(x) = \frac{x^{p-1}[(x-1)(x-2)(x-3)\cdots(x-m)]^p}{(p-1)!},$$
has the same alternating signs, and reduces, when the absolute
values $A_n$ of all the coefficients $a_n$ are taken, and when we
take 3 for m, to
$$A_1X + A_2X^2 + A_3X^3 + \cdots + A_nX^n = \frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}.$$
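Lemma III amounts to a statement about coefficients: the absolute values of the coefficients of $x^{p-1}[(x-1)(x-2)(x-3)]^p$ are the coefficients of $x^{p-1}[(x+1)(x+2)(x+3)]^p$. The sketch below checks this for a few values of p. It is an illustration only; the common divisor (p-1)! is omitted because it affects both sides alike, and the function names are invented for the example.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def numerator(p, sign):
    """Coefficients of x^(p-1) [(x + sign*1)(x + sign*2)(x + sign*3)]^p, lowest power first."""
    cubic = [1]
    for r in (1, 2, 3):
        cubic = poly_mul(cubic, [sign * r, 1])
    out = [0] * (p - 1) + [1]              # the factor x^(p-1)
    for _ in range(p):
        out = poly_mul(out, cubic)
    return out

for p in (2, 3, 5, 7):
    with_minus = numerator(p, -1)          # factors (x-1)(x-2)(x-3)
    with_plus = numerator(p, +1)           # factors (x+1)(x+2)(x+3)
    print(p, [abs(c) for c in with_minus] == with_plus)
```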
4. The transcendence of $\pi$. The proof of the transcendence
of $\pi$ is based upon three propositions already given, viz.:
$$F(0)e^x = F(x) + \phi(x); \tag{13}$$
$$|\phi(x)| < e^X\,\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}; \tag{18}$$
$$1 + e^{i\pi} = 0, \quad \text{Euler's theorem.} \tag{19}$$


If we assume $\pi$ to be an algebraic number, then $i\pi$ is evidently
an algebraic number, and therefore is the root of an
algebraic equation with rational coefficients.
If this equation is taken, as before, to be of the third
degree (the proof being essentially the same for the general
case) we may indicate its roots by $y_1$, $y_2$, and $y_3$, and among
these $i\pi$ must be found. But since $1 + e^{i\pi} = 0$, we should then
have
$$(1 + e^{y_1})(1 + e^{y_2})(1 + e^{y_3}) = 0,$$
whence
$$1 + (e^{y_1} + e^{y_2} + e^{y_3}) + (e^{y_1+y_2} + e^{y_2+y_3} + e^{y_3+y_1}) + e^{y_1+y_2+y_3} = 0. \tag{20}$$
It is proposed to show that this equation is impossible.
The symmetric functions of the quantities $y_1, y_2, y_3$ are,
by our hypothesis (1), rational numbers, and hence $y_1, y_2, y_3$ are
roots of the rational algebraic equation
$$\psi(x) = 0.$$
The symmetric functions of the quantities $y_1+y_2$, $y_2+y_3$,
$y_3+y_1$ (for example, their power sum) are also symmetric
functions of $y_k$, and are therefore rational numbers. The
quantities $y_1+y_2$, $y_2+y_3$, $y_3+y_1$ are therefore roots of a second
algebraic equation
$$\psi_1(x) = 0.$$
Similarly, $y_1+y_2+y_3$ is the root of a third algebraic equation,
$$\psi_2(x) = 0.$$
Therefore
$$\psi(x)\,\psi_1(x)\,\psi_2(x) \tag{21}$$
is an integral function of x which becomes 0 as soon as x becomes
equal to one of the numbers $y_j$, $y_j+y_k$, or $y_1+y_2+y_3$. Some
of these numbers, say, N of them, may equal zero. If we
place the product (21) equal to 0, and suppress the factor $x^N$,
we have an equation $\theta(x) = 0$, which we may consider as being
reduced to a form having integral coefficients. Since the zero
roots have just been suppressed, $\theta(0)$ cannot equal 0, and hence
$\theta(x)$ may be written
$$\theta(x) = ax^m + a_1x^{m-1} + a_2x^{m-2} + \cdots + a_m = 0,$$
where $a, a_1, \ldots, a_m$ are integral, and a and $a_m$ are not 0, and a is
positive.
This may easily be transformed, by multiplying by $a^{m-1}$ and
putting z for ax, into an equation with integral coefficients, of
the form
$$\theta_1(z) = z^m + b_1z^{m-1} + b_2z^{m-2} + \cdots + b_m = 0, \tag{22}$$
the coefficient of the highest power being unity. Let the
roots of the equation $\theta(x) = 0$ be $x_1, x_2, x_3, \ldots$, these
representing the numbers among the numbers $y_j$, $y_j+y_k$, $y_1+y_2+y_3$
that are not equal to 0. It is seen from (20) that they must
satisfy the equation,
$$K + e^{x_1} + e^{x_2} + e^{x_3} + \cdots = 0, \tag{23}$$
where K = N + 1 is a positive integer, each of the N zero exponents
in (20) having contributed a term $e^0 = 1$.
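The passage from (20) to (23) is a matter of bookkeeping, and may be illustrated numerically. The sketch below is a hypothetical example and not part of the proof: the three values $y_1 = i\pi$, $y_2 = 1$, $y_3 = -1$ are chosen only because one of them is $i\pi$ and one of their sums vanishes; they are not the roots of any cubic with rational coefficients. Here K is taken as 1 plus the number of vanishing exponents, and the function names are invented for the example.

```python
import cmath
from itertools import combinations

def exponent_sums(ys):
    """All sums over non-empty subsets of ys: y_j, y_j + y_k, ..., y_1 + ... + y_n."""
    return [sum(c) for r in range(1, len(ys) + 1) for c in combinations(ys, r)]

def split_zero_exponents(ys, tol=1e-12):
    """Return (K, xs): K = 1 + number of zero exponents, xs = the non-zero exponents."""
    sums = exponent_sums(ys)
    nonzero = [s for s in sums if abs(s) > tol]
    return 1 + (len(sums) - len(nonzero)), nonzero

ys = [1j * cmath.pi, 1, -1]                 # illustrative values only
K, xs = split_zero_exponents(ys)
total = K + sum(cmath.exp(x) for x in xs)   # left member of (23)
print(K, len(xs), abs(total))
```

The seven exponents of (20) here comprise one zero, namely $y_2 + y_3$, so that K = 2 and six exponents remain; the left member of (23) then vanishes to within rounding, as it must whenever one of the y's equals $i\pi$.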
We now return to the fundamental equation (13)
$$F(0)e^x = F(x) + \phi(x).$$
If we put for x the numbers $x_1, x_2, x_3, \ldots$, and add the
results, we shall have, with attention to (23),
$$-K\cdot F(0) = F(x_1) + F(x_2) + F(x_3) + \cdots + \phi(x_1) + \phi(x_2) + \phi(x_3) + \cdots,$$
or
$$K\cdot F(0) + F(x_1) + F(x_2) + F(x_3) + \cdots + \phi(x_1) + \phi(x_2) + \phi(x_3) + \cdots = 0. \tag{24}$$
We now wish to prove that when we make a suitable choice
of the integral function f(x), which is entirely arbitrary except
for the condition that f(0) = 0, the equation (24) is impossible.
It will then follow that our sole hypothesis, viz., that $\pi$ is an
algebraic number, is incorrect.
If we prove that
1. $K\cdot F(0) + F(x_1) + F(x_2) + F(x_3) + \cdots$ is integral and not 0;
2. The absolute value of $\phi(x_1) + \phi(x_2) + \phi(x_3) + \cdots$ is less than 1;
then we shall have proved the impossibility of (24), for the
sum of an integer and a number whose absolute value is less
than 1 cannot be 0.
We first let p represent a prime number, and we take for
f(x) the integral function
$$f(x) = \frac{z^{p-1}[\theta_1(z)]^p}{(p-1)!} = \frac{a^{mp-1}x^{p-1}[\theta(x)]^p}{(p-1)!}, \tag{25}$$
an equation that is evident from the fact that we took z = ax
and multiplied $\theta(x)$ by $a^{m-1}$ when we formed $\theta_1(z)$.
We arrange $[\theta_1(z)]^p$ according to ascending powers of z,
and we have
$$[\theta_1(z)]^p = A_0 + A_1z + A_2z^2 + \cdots = A_0 + A_1ax + A_2a^2x^2 + \cdots,$$
where the A's are integral, and, from (22), $A_0 = b_m^p$ and therefore
not 0. Now from (25),
$$f(x) = \frac{A_0a^{p-1}x^{p-1} + A_1a^px^p + A_2a^{p+1}x^{p+1} + \cdots}{(p-1)!}. \tag{26}$$
Taking the derivatives, and letting x = 0, we have
$$\begin{aligned}
&f(0) = 0,\quad f'(0) = 0,\quad \ldots,\quad f^{(p-2)}(0) = 0,\\
&f^{(p-1)}(0) = A_0a^{p-1} = b_m^pa^{p-1},\\
&f^{(p)}(0) = pA_1a^p,\\
&f^{(p+1)}(0) = p(p+1)A_2a^{p+1},\ \ldots
\end{aligned}$$
We now select a value for p greater than the greatest of the
numbers a, $b_m$, K. Then $f^{(p-1)}(0)$ is not divisible by p, while
all the other derived functions are either 0 or are divisible by p.
Therefore F(0), which from (4) equals $f'(0) + f''(0) + \cdots$,
is an integer not divisible by p, and thus $K\cdot F(0)$ is also an
integer not divisible by p, which tells us the nature of part
of the first function under consideration.
In deriving (22) we used z for ax, and we may therefore
take f(x) and arrange it according to ascending powers of
$z - z_k$, where $z_k$ is one of the roots of (22), and we have
$$f(x) = \frac{(z-z_k)^pB_1(z_k) + (z-z_k)^{p+1}B_2(z_k) + \cdots}{(p-1)!} = \frac{a^p(x-x_k)^pB_1(z_k) + a^{p+1}(x-x_k)^{p+1}B_2(z_k) + \cdots}{(p-1)!}, \tag{27}$$
where $B_1(z_k), B_2(z_k), \ldots$ are integral functions of $z_k$ with
rational coefficients. Hence, as with equation (26), we have
$$f(x_k) = 0,\quad f'(x_k) = 0,\quad f''(x_k) = 0,\quad \ldots,\quad f^{(p-1)}(x_k) = 0;$$
$$f^{(p)}(x_k) = pa^pB_1(z_k),\quad f^{(p+1)}(x_k) = p(p+1)a^{p+1}B_2(z_k),\quad \ldots$$
If now we let
$$Q(z_k) = a^pB_1(z_k) + (p+1)a^{p+1}B_2(z_k) + \cdots,$$
we have from (4),
$$F(x_k) = pQ(z_k). \tag{28}$$
Therefore
$$F(x_1) + F(x_2) + F(x_3) + \cdots = p[Q(z_1) + Q(z_2) + Q(z_3) + \cdots]. \tag{29}$$
But the second member of (29) is an integral symmetric
function of the m roots of equation (22), and hence is integral
and contains the factor p. We have now proved that $K\cdot F(0)$
is an integer not divisible by p, and that the sum of the functions
$F(x_k)$ is an integer that is divisible by p, so that
$K\cdot F(0) + F(x_1) + F(x_2) + F(x_3) + \cdots$ is an integer and not
divisible by p, and therefore is not 0, which was the first thing
to be proved.
We now take up the second thing to be proved, that the
absolute value of $\phi(x_1) + \phi(x_2) + \phi(x_3) + \cdots$ is less than 1.
To do this we begin with
$$|\phi(x)| < e^X\,\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}. \tag{18}$$
Taking $\theta(x) = ax^m + a_1x^{m-1} + \cdots + a_m = 0$, already considered,
we write this
$$\theta(x) = a(x-x_1)(x-x_2)\cdots(x-x_m). \tag{30}$$
Then, from (25) and (30) we have
$$f(x) = \frac{a^{(m+1)p-1}x^{p-1}(x-x_1)^p(x-x_2)^p\cdots(x-x_m)^p}{(p-1)!}. \tag{31}$$


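Letting X stand for |x|, and $X_k$ for $|x_k|$, it is evident that
the absolute values of the coefficients in (31) are not greater than
the coefficients in
$$\frac{a^{(m+1)p-1}x^{p-1}(x+X_1)^p(x+X_2)^p\cdots(x+X_m)^p}{(p-1)!}.$$
If we now place
$$P(X) = a^{m+1}X(X+X_1)(X+X_2)\cdots(X+X_m),$$
then for every positive number X the expression which here takes
the place of
$$\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}$$
in (18) is not greater than
$$\frac{[P(X)]^p}{aX(p-1)!} = \frac{P(X)}{aX}\cdot\frac{[P(X)]^{p-1}}{(p-1)!}.$$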
We now proceed as with (18). For any fixed value of X
we can take a value of p so large that
$$\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}$$
shall be as small as we please.
We now recall that
$$|\phi(x)| < e^X\,\frac{X^{p-1}[(X+1)(X+2)(X+3)]^p}{(p-1)!}. \tag{18}$$
Hence $\phi(x)$ may be made as small as we please, and hence
the absolute value of $\phi(x_1) + \phi(x_2) + \phi(x_3) + \cdots$ may be made
less than 1 by taking a suitable value of p, which proves the
second part of the proposition.
The two points necessary to show the transcendency of $\pi$
have now been proved. In other words, $\pi$ satisfies no algebraic
equation with rational coefficients, and therefore cannot be
found by means of the ordinary algebraic operations, and
therefore cannot be constructed geometrically by the use of
the instruments of elementary geometry, nor even by the aid
of higher algebraic curves.