Seismological QJ^ervatory Weston College Weston Massachusetts THE CATHERINE B. O'CONNOR LIBRARV WESTON OBSERVATORY WESTON, MASSACHUSETTS 02193 fc^^ Digitized by the Internet Archive in 2010 with funding from Boston Library Consortium Member Libraries http://www.archive.org/details/firstcourseintheOOdick FIRST COURSE IN THE THEORY OF EQUATIONS FIRST COURSE IN THE THEORY OF EQUATIONS QAM LEONARD EUGENE DICKSON, Ph.D. CORRESPONDANT DE L INSTITUT DE FRANCE PROFESSOR OF MATHEMATICS IN THE UNIVERSITY OF CHICAGO 1??1 LIBRARY SEROLOGICAL OBSERVATORY "*''*■'"'-•* NEW YORK JOHN WILEY & SONS, Inc. London: CHAPMAN & HALL, Limited THE CATHERINE B. O'CONNOR LIBRARY WESTON OBSERVATORY WESTON, MASSACHUSETTS 02193 499 Copyright, 1922, BY LEONARD EUGENE DICKSON All Rights Reserved This book or any part thereof must not be reproduced in any form without the written permission of the publisher. .Printed in U. S. A 5/32 PRESS OF BR/.UNWORTH &. CO., INC. BOOK MANUFACTURERS BROOKLYN, NEW YORK THE CATHERINE B. O'CONNOR LIBRARY WESTON OBSERVATORY WESTON, MASSACHUSETTS 02193 PREFACE The theory of equations is not only a necessity in the subsequent mathematical courses and their applications, but furnishes an illumina- ting sequel to geometry, algebra and analytic geometry. Moreover, it develops anew and in greater detail various fundamental ideas of calculus for the simple, but important, case of polynomials. The theory of equations therefore affords a useful supplement to differential calculus whether taken subsequently or simultaneously. It was to meet the numerous needs of the student in regard to his earlier and future mathematical courses that the present book was planned with great care and after wide consultation. It differs essentially from the author's Elementary Theory of Equations, both in regard to omissions and additions, and since it is addressed to younger students and may be used parallel with a course in differential calculus. Simpler and more detailed proofs are now employed. The exercises are simpler, more numerous, of greater variety, and involve more practical applications. This book throws important light on various elementary topics. For example, an alert student of geometry who has learned how to bisect any angle is apt to ask if every angle can be trisected with ruler and compasses and if not, why not. After learning how to construct regular polygons of 3, 4, 5, 6, 8 and 10 sides, he will be inquisitive about the missing ones of 7 and 9 sides. The teacher will be in a comfortable position if he knows the facts and what is involved in the simplest discussion to date of these questions, as given in Chapter III. Other chapters throw needed light on various topics of algebra. In particular, the theory of graphs is presented in Chapter V in a more scientific and practical manner than was possible in algebra and analytic geometry. There is developed a method of computing a real root of an equation with minimum labor and with certainty as to the accuracy of all the decimals obtained. We first find by Horner's method successive trans- iv PREFACE formed equations whose number is half of the desired number of significant figures of the root. The final equation is reduced to a linear equation by applying to the constant term the correction computed from the omitted terms of the second and higher degrees, and the work is completed bjr abridged division. The method combines speed with control of accuracy. Newton's method, which is presented from both the graphical and the numerical standpoints, has the advantage of being applicable also to equations which are not algebraic; it is applied in detail to various such equations. In order to locate or isolate the real roots of an equation we may employ a graph, provided it be constructed scientifically, or the theorems of Descartes, Sturm, and Budan, which are usually neither stated, nor proved, correctly. The long chapter on determinants is independent of the earlier chap- ters. The theory of a general system of linear equations is here pre- sented also from the standpoint of matrices. For valuable suggestions made after reading the preliminary manu- script of this book, the author is greatly indebted to Professor Bussey of the University of Minnesota, Professor Roever of Washington Uni- versity, Professor Kempner of the University of Illinois, and Professor Young of the University of Chicago. The revised manuscript was much improved after it was read critically by Professor Curtiss of Northwestern University. The author's thanks are due also to Professor Dresden of the University of Wisconsin for various useful suggestions on the proof-sheets. Chicago, 1921. CONTENTS Numbers refer to pages. i CHAPTER I Complex Numbers Square roots, 1. Addition, multiplication and division of complex num- bers, 2. Cube roots of unity, 3. Geometrical representation, 3. Product and quotient, 4. De Moivre's theorem, 5. Cube roots, 5. nth roots, 7. Roots of unity, 8, Primitive roots of unity, 9. CHAPTER II Elementary Theorems on the Roots of an Equation Quadratic equation, 11. Remainder theorem, 12. Synthetic division, 13. Factored form of a polynomial, 15. Multiple roots, 16. Identical poly- nomials, 16. Relations between the roots and the coefficients, 17. Imaginary roots occur in pairs, 19. Upper limit to the real roots, 21. Integral roots, 24. Rational roots, 27. CHAPTER III Constructions with Ruler and Compasses Graphical solution of a quadratic equation, 29. Analytic criterion for constructibility, 30. Cubic equations with a constructible root, 32. Tri- section of an angle, 34. Duplication of a cube, 35. Regular polygons of 7, 9, 17, and n sides, 35-44. Reciprocal equations, 37. CHAPTER IV Cubic and Quartic Equations Algebraic solution of a cubic, 45. Discriminant, 47. Number of real roots of a cubic, 48. Trigonometric solution of a cubic, 49. Ferrari's and Descartes' solutions of a quartic, 50. Resolvent cubic, 51 . Discriminant of a quartic, 51 . CHAPTER V The Graph of an Equation Use of graphs, 55. Caution in plotting, 55. Bend points, 56. Derivatives, 58. Horizontal tangents, 60. Multiple roots, 60. Ordinary and inflexion v vi CONTENTS tangents, 62. Real roots of a cubic equation, 65. Continuity, 66. Con- dition for a root between a and b, 67. Sign of a polynomial at infinity, 68. Rolle's theorem, 69. CHAPTER VI Isolation of the Real Roots Descartes' rule of signs, 71. Sturm's method, 75. Sturm's functions for the general quartic equation, 80. Budan's theorem, 83. CHAPTER VII Solution of Numerical Equations Horner's method, 86. Newton's method, algebraic and graphical discus- sion, systematic computation, also for functions not polynomials, 90. Imagi- nary roots, 98. CHAPTER VIII Determinants; Systems of Linear Equations Solution of 2 or 3 linear equations by determinants, 101. Even and odd arrangements, 103. Definition of a determinant of order n, 105. Interchange of rows and columns, 106. Interchange of two columns or two rows, 107. Minors, 109. Expansion, 109. Removal of factors, 111. Sum of deter- minants, 112. Addition of columns or rows, 113. Rank, 116. System of n linear equations in n unknowns, 114, 116. Homogeneous equations, 119. System of m linear equations in n unknowns, matrix and augmented matrix, 120. Complementary minors, 122. Laplace's development, 122. Product of determinants, 124. CHAPTER IX Symmetric Functions Sigma functions, 128. Elementary symmetric functions, 128. Funda- mental theorem, 129. Rational functions symmetric in all but one of the roots, 132. Sums of like powers of the roots, Newton's identities, 134. Waring's formula, 136. Computation of symmetric functions, 141. CHAPTER X Elimination, Resultants and Discriminants Methods of Sylvester, Euler, and Bezout, 143. Discriminants, 152. APPENDIX The Fundamental Theorem of Algebra Answers 159 Index 167 First Course in The Theory of Equations CHAPTER I Complex Numbers 1. Square Roots. If p is a positive real number, the symbol Vp is used to denote the positive square root of p. It is most easily computed by logarithms. We shall express the square roots of negative numbers in terms of the symbol i such that the relation i 2 = — 1 holds. Consequently we denote the roots of x 2 = — 1 by i and —i. The roots of x 2 = — 4 are written in the form ±2i in preference to ± V — 4. In general, if p is positive, the roots of x 2 — — p are written in the form ±Vp i in preference to ± v — p. The square of either root is thus (V p)H 2 — — p. Had we used the less desirable notation ± v — P for the roots of x 2 = — p, we might be tempted to find the square of either root by multiplying together the values under the radical sign and conclude erroneously that V-p V-p = Vp 2 =+p. To prevent such errors we use V p i and not V —p. 2. Complex Numbers. If a and 6 are any two real numbers and 4?=—l,a-\-bi is called a complex number 1 and a — 6z its conjugate. Either is said to be zero if a = b = 0. Two complex numbers a-\-bi and c+d« are said to be equal if and only if a = c and b = d. In particular, a+fo' = 1 Complex numbers are essentially couples of real numbers. For a treatment from this standpoint and a treatment based upon vectors, see the author's Elementary Theory oj Equations, p. 21, p. 18. 2 COMPLEX NUMBERS [Ch. I if and only if a = b = 0. If 6^0, a-\-bi is said to be imaginary. In partic- ular, bi is called a pure imaginary. Addition of complex numbers is defined by (a+bi) + {c+di) = (a+c) + (b+d)i. The inverse operation to addition is called subtraction, and consists in finding a complex number z such that (c-\-di)-\-z = a-\-bi. In notation and value, z is (a-\-bi) - {c+di) = {a-c) + {b-d)i. Multiplication is defined by (a-\-bi) (c-\-di) —ac — bd-\-{ad-{-bc)i, and hence is performed as in formal algebra with a subsequent reduction by means of i 2 = — 1. For example, (a+bi) (a-bi)=a 2 -b 2 i 2 = a 2 +b 2 . Division is defined as the operation which is inverse to multiplication, and consists in finding a complex number q such that (a+bi)q = e+fi. Multiplying each member by a — bi, we find that q is, in notation and value, e -\-ji _ (e +fi) {a — bi) _ ae + bf , af—be. a+bi a 2 +b 2 a 2 +b 2 ' a 2 -\-b 2 Since a 2 +6 2 = implies a = b = when a and b are real, we conclude that division except by zero is possible and unique. EXERCISES Express as complex numbers 1. V^9. 2. VI. 3. (V25+V-25) V^16- 4. -f . /- 3+V-5 „ 3+5i a+bi 5. 8+2V3. 6. ==. 7. — - . 8. — - . 2-|-V — 1 2— 6% a— 01 9. Prove that the sum of two conjugate complex numbers is real and that their difference is a pure imaginary. 10. Prove that the conjugate of the sum of two complex numbers is equal to the sum of their conjugates. Does the result hold true if each word sum is replaced by the word difference? §4] GEOMETRICAL REPRESENTATION 11. Prove that the conjugate of the product (or quotient) of two complex numbers is equal to the product (or quotient) of their conjugates. 12. Prove that, if the product of two complex numbers is zero, at least one of them is zero. 13. Find two pairs of real numbers x, y for which (x+yi) 2 = -7+24t. As in Ex. 13, express as complex numbers the square roots of 14. -11+60 i. 15.5-121 16. ±cd+(2c 2 -2d 2 )i. 3. Cube Roots of Unity. Any complex number x whose cube is equal to unity is called a cube root of unity. Since x 3 -l = (x-l) (x 2 +x+l), the roots of x 3 = 1 are 1 and the two numbers x for which x 2 +x+l = 0, (z+i) 2 =-f, z+i=±aV3z\ Hence there are three cube roots of unity, viz., 1, a=-$+$VZi, a/=-!-iV3;. In view of the origin of a>, we have the important relations W 2 + CO+1=0, co 3 = l. Since coa/ = l and co 3 = l, it follows that a/ = co 2 , co = w' 2 . 4. Geometrical Representation of Complex Numbers. Using rect- angular axes of coordinates, OX and OY, we represent the complex number a-\-bi by the point A having the coordinates a, b (Fig. 1). The positive number r=Va 2 +b 2 giving the length of OA is called the modulus (or absolute value) of a -\-bi. The angle 6 = XOA, measured counter-clockwise from OX to OA, is called the amplitude (or argument) of a-{-bi. Thus cos 6 = a/r, sin 6 = b/r, whence (1) a+fo' = r(cos 0-H sin 6). Fig. 1 The second member is called the trigonometric form of a-\-bi. For the amplitude we may select, instead of 6, any of the angles 0±36O°, 0±72O°. etc. COMPLEX NUMBERS [Ch. I Fig. 2 Two complex numbers are equal if and only if their moduli are equal and an amplitude of the one is equal to an amplitude of the other. For example, the cube roots of unity are 1 and = cos 120° +i sin 120°, co 2 =-i-iV3 i = cos240°+isin240°, and are represented by the points marked 1, u, a 2 at the vertices of an equilateral triangle inscribed in a circle of radius unity and center at the origin O (Fig. 2). The indicated amplitudes of a> and w 2 are 120° and 240° respectively, while the modulus of each is 1. The modulus of —3 is 3 and its amplitude is 180° or 180° plus or minus the product of 360° by any positive whole number. 5. Product of Complex Numbers. By actual multiplication, [r(cos d-\-i sin 6)} [r'(cos a-\-i sin a)] = rr'[(cos 6 cos a — sin 6 sina)-H(sin 6 cos a+cos 6 sin a)] = rr'[cos {6-\-a)-\-i sin (0+a)], by trigonometry. Hence the modulus of the product of two complex numbers is equal to the product of their moduli, while the amplitude of the product is equal to the sum of their amplitudes. For example, the square of w = cos 120°+£ sin 120° has the modulus 1 and the ampli- tude 120°+120° and hence is « 2 = cos 240°+i sin 240°. Again, the product of w and co 2 has the modulus 1 and the amplitude 120°+240° and hence is cos 360°+* sin 360°, which reduces to 1. This agrees with the known fact that w 3 = l. Taking r = r' = 1 in the above relation, we obtain the useful formula (2) (cos d+i sin 6) (cos a+i sin a) =cos (d-\-a)+i sin (d+a). 6. Quotient of Complex Numbers. Taking a = /3 — 6 in (2) and divid- ing the members of the resulting equation by cos 6-\-i sin 6, we get cos /3+i sin /3 . ' x . . . ,_ ', . . . . ^ = cos G3-0)+* sin 03-0). cos d-\-i sin 6 § 7] DE MOIVRE'S THEOREM 5 Hence the amplitude of the quotient of R(cos /3+t sin (3) by r (cos d-\-i sin 0) zs egitaZ to the difference (3— 9 of their amplitudes, while the modulus of the quotient is equal to the quotient R/r of their moduli. The case /3 = gives the useful formula , . . — ^ = cos 6 — i sin 6. cos d-\-i sin 6 7. De Moivre's Theorem. // n is any positive whole number, (3) (cos d-\-i sin 0) ra = cos nd-\-i sin nd. This relation is evidently true when n = l, and when n = 2 it follows from formula (2) with a = 6. To proceed by mathematical induction, suppose that our relation has been established for the values 1,2, ... ,m of n. We can then prove that it holds also for the next value ra+1 of n. For, by hypothesis, we have (cos 0-\-i sin 0) OT = cos md-\-i sin md. Multiply each member by cos 0-\-i sin 6, and for the product on the right substitute its value from (2) with a = md. Thus (cos 6-\-i sin d) m+1 = (cos 6-\-i sin 6) (cos md-\-i sin md), = cos (6-\-md)-\-ism (d-\-md), which proves (3) when w = m+l. Hence the induction is complete. Examples are furnished by the results at the end of § 5 : (cos 120° +i sin 120°) 2 = cos 240° +i sin 240°, (cos 120°+i sin 120°) 3 = cos 360°+i sin 360°. 8. Cube Roots. To find the cube roots of a complex number, we first express the number in its trigonometric form. For example, 4V2+4V2 i = 8(cos 45°+t sin 45°). If it has a cube root which is a complex number, the latter is expressible in the trigonometric form (4) r(cos d-\-i sin 6). The cube of the latter, which is found by means of (3), must be equal to the proposed number, so that r 3 (cos Z6+i sin 30) =8(cos 45°+t sin 45°). 6 COMPLEX NUMBERS [Ch. I The moduli r 3 and 8 must be equal, so that the positive real number r is equal to 2. Furthermore, 30 and 45° have equal cosines and equal sines, and hence differ by an integral mulitple of 360°. Hence 30 = 45°+ k -360°, or 0=15°+fc-12O°, where k is an integer. 1 Substituting this value of and the value 2 of r in (4), we get the desired cube roots. The values 0, 1, 2 of k give the distinct results #i=2(cos 15°+*' sin 15°), i? 2 = 2(cos 135°+i sin 135°), #3 = 2(cos255°+;sin255 ). Each new integral value of k leads to a result which is equal to Ri, R2 or R3. In fact, from A; = 3 we obtain R\, from fc = 4 we obtain R2, from k = 5 we obtain R3, from /b = 6 we obtain Ri again, and so on periodically. EXERCISES 1. Verify that R 2 = uRi, R i = ui' l Ri. Verify that Ri is a cube root of 8 (cos 45° + i sin 45°) by cubing Ri and applying De Moivre's theorem. Why are the new expressions for R2 and R 3 evidently also cube roots? 2. Find the three cube roots of —27; those of — i; those of u. 3. Find the two square roots of i; those of — i; those of w. 4. Prove that the numbers cos d+i sin and no others are represented by points on the circle of radius unity whose center is the origin. 5. If a+bi and c+di are represented by the points A and C in Fig. 3, prove that their sum is represented by the fourth vertex S of the parallelogram two of whose sides are OA and OC. Hence show that the modulus of the sum of two complex numbers is equal to or less than the sum of their moduli, and is equal to or greater than the dif- ference of their moduli. 1 Here, as elsewhere when the contrary is not specified, zero and negative as well as positive whole numbers are included under the term "integer." § 9] ROOTS OF COMPLEX NUMBERS 7 6. Let r and r' be the moduli and 6 and a the amplitudes of two complex numbers represented by the points A and C in Fig. 4. Let U be the point on the x-axis one unit to the right of the origin O. Construct triangle OCP similar to triangle OUA and similarly placed, so that corresponding sides are OC and OU, CP and UA, OP and OA, while the vertices 0, C, P are in the same order (clockwise or counter-clockwise) as the corresponding vertices O, U, A. Prove that P represents the product ( § 5) of the complex numbers represented by A and C. 7. If a+bi and e+fi are represented by the points A and S in Fig. 3, prove that the complex number obtained by subtracting a-\-bi from e-\-fi is represented by the point C. Hence show that the absolute value of the difference of two complex numbers is equal to or less than the sum of their absolute values, and is equal to or greater than the difference of their absolute values. 8. By modifying Ex. 6, show how to construct geometrically the quotient of two complex numbers, 9. nth Roots. As illustrated in § 8, it is evident that the nth roots of any complex number p (cos A-\- i sin A) a^e the products of the nth roots of cos A -\-i ^ sin A by the positive real nth root of the positive real number p (which may be found by logarithms). Let an nth root of cos A-\-i sin A be of the form (4) r(cos 6+i sin &). Then, by De Moivre's theorem, r n (cos nd-\-i sin nd) = cos A -\-i sin A. The moduli r n and 1 must be equal, so that the positive real number r is equal to 1. Since nd and A have equal sines and equal cosines, they differ by an integral multiple of 360°. Hence nd = A-\-k-3Q0°, where k is an integer. Substituting the resulting value of 6 and the value 1 of r in (4), we get (5) cos I ) +i sin For each integral value of k, (5) is an answer since its nth power reduces to cos A-\-i sin A by DeMoivre's theorem. Next, the value n of k gives the same answer as the value of k; the value n+1 of k gives the same answer as the value 1 of A;; and in general the value n+m of k gives the same answer as the value m of i Hence we may restrict attention to the values 0, 1, . . . , n— 1 of k. Finally, the answers (5) given by these Observatory Weston College Weston Massachusetts 8 COMPLEX NUMBERS [Ch. I values 0, 1, . . . , n— 1 of k are all distinct, since they are represented by points whose distance from the origin is the modulus 1 and whose ampli- tudes are A A 360^ A 2-360° i « ~t~ } • n n A 360 c n n A (n-l)360 e n n so that these n points are equally spaced points on a circle of radius unity. Special cases are noted at the end of § 10. Hence any complex number different from zero has exactly n distinct complex nth roots. 10. Roots of Unity. The ti'igonometric form of 1 is cos 0°-H sin 0°. Hence by § 9 with A = 0, the n distinct nth roots of unity are (6) 2/C7T . . 2/C7T n . cos \-i sin (/c = 0, 1, . . . , n— 1), n n where now the angles are measured in radians (an angle of 180 degrees being equal to -k radians, where tt = 3.1416, approximately). For k = 0, (6) reduces to 1, which is an evident nth. root of unity. For k=l, (6) is (7) it = cos \-i sin — . n n By De Moivre's theorem, the general number (6) is equal to the fcth power of R. Hence the n distinct nth roots of unity are (8) R,R 2 ,R 3 , . . . ,R n -\R n =h As a special case of the final remark in § 9, the n complex numbers (6), and therefore the numbers (8), are represented geometrically by the vertices of a regular polygon of n sides inscribed in the circle of radius unity and center at the origin with one vertex on the positive rc-axis. -l / / \X f / v \ 1/ \\ ° / 1 /y For n = 3, the numbers (8) are u>, w 2 , 1, which are repre- sented in Fig. 2 by the vertices of an equilateral triangle. For w = 4, R = cos ir/2 +i sin tt/2 = i. The four fourth roots of unity (8) are i, i 2 =— 1, i 3 =— i, i l = l, which are repre- sented by the vertices of a square inscribed in a circle of radius unity and center at the origin (Fig, 5), Fig. 5 § 11] PRIMITIVE ROOTS OF UNITY EXERCISES 1. Simplify the trigonometric forms (6) of the four fourth roots of unity. Check the result by factoring x 4 — 1 . 2. For n = 6, show that R= — « 2 . The sixth roots of unity are the three cube roots of unity and their negatives. Check by factoring x 6 — 1. 3. From the point representing a+bi, how do you obtain that representing — (a+bi)? Hence derive from Fig. 2 and Ex. 2 the points representing the six sixth roots of unity. Obtain this result another way. 4. Find the five fifth roots of —1. 5. Obtain the trigonometric forms of the nine ninth roots of unity. Which of them are cube roots of unity? 6. Which powers of a ninth root (7) of unity are cube roots of unity? 11. Primitive nth Roots of Unity. An nth root of unity is called primitive if n is the smallest positive integral exponent of a power of it that is equal to unity. Thus p is a primitive nth root of unity if and only if p n =l and p l ^l for all positive integers l1 in common with 4. But the second and fourth are not primitive fourth roots of unity (since the square of —1 and the first power of 1 are equal to unity), and their exponents 2 and 4 have the divisor 2 in common with n = 4. These facts illustrate and prove the next theorem for the case 71=4. Theorem. The primitive nth roots of unity are those of the numbers (8) whose exponents are relatively prime to n. Proof. If k and n have a common divisor d (d> 1), R* is not a primitive nth root of unity, since (R*fi=(R n y=i, and the exponent n/d is a positive integer less than n. 10 COMPLEX NUMBERS [Ch. I But if k and n are relatively prime, i.e., have no common divisor >1, R k is a primitive nth. root of unity. To prove this, we must show that (J? ft )Ml if Z is a positive integer 1 0>2 ... Cbn-l o,n 1 c cbo cbi ... cb n -2 cb„-i b bi 62 ... 6«_i, r In the second space below ao we write 60 (which is equal to ao). We multiply bo by c and enter the product directly under a\, add and write the sum 61 below it. Next we multiply 61 by c and enter the product directly under a 2 , add and write the sum 62 below it; etc. EXERCISES Work each of the following exercises by synthetic division. 1. Divide x 3 +3x 2 -2x-5 by x-2. 2. Divide 2x 5 -x 3 +2x-l by x+2. 3. Divide x 3 +6x 2 + 10x-l by x-0.09. 4. Find the quotient of x 3 — 5x 2 — 2x+24 by x— 4, and then divide the quotient by x - 3 . What are the roots of x 3 - 5x 2 - 2x +24 = 0? 5. Given that x 4 — 2x 3 — 7x 2 +8x+12 = has the roots —1 and 2, find the quadratic equation whose roots are the remaining two roots of the given equation, and find these roots. 6. If x 4 — 2x 3 — 12x 2 + 10x+3 = has the roots 1 and —3, find the remaining two roots. 7. Find the quotient of 2x 4 -x 3 -6x 2 +4x-8 by x 2 -4. 8. Find the quotient of x 4 -3x 3 +3x 2 -3x+2 by x 2 -3x+2. 9. Solve Exercises 1, 2, 3, 6, 7 of § 14 by synthetic division. 16. Factored Form of a Polynomial. Consider a polynominal /(z)=c z n +ciz B - 1 + . . . +c (Co 5*0), whose leading coefficient c is not zero. If f(x) =0 has the root a\, which may be any complex number, the Factor Theorem shows that f(x) has the factor x—ai, so that f(x) = (x- ai )Q(x), Q(x)=w?- 1 +ci'a?- a + . . . +c' a _i. If Q(x) =0 has the root 0:2, then Q(x) = (x-a 2 )Qi(x), f(x) = (x — ai) (x-a 2 )Qi(x). If Qi(x) =0 has the root 0:3, etc., we finally get (5) f(x)=C (x—ai) (x — a 2 ) . . . (x-a n ). We shall deduce several important conclusions from the preceding discussion. First, suppose that the equation fix) = of degree n is known 16 THEOREMS ON ROOTS OF EQUATIONS [Ch. II to have n distinct roots a\, . . . , a n . In f(x) = (x— ai)Q(x) take x =02; then 0=(ct2— ai) Q(«2), whence $(0:2) =0 and $(#)=() has the root 0:2. Similarly, Qi(rc) = has the root «3, etc. Thus all of the assumptions (each introduced by an " if ") made in the above discussion have been justified and we have the conclusion (5). Hence if an. equation f(x) = of degree n has n distinct roots a\, . . . , a n , f(x) can be expressed in the factored form (5). It follows readily that the equation can not have a root a different from at, . . . , a n . For, if it did, the left member of (5) is zero when x=a and hence one of the factors of the right member must then be zero, say a— ctj = 0, whence the root a is equal to aj. We have now proved the following important result. Theorem. An equation of degree n cannot have more than n distinct roots. 17. Multiple Roots. 1 Equalities may occur among the a's in (5). Suppose that exactly mi of the a's (including ai) are equal to a\) that 0), the first negative coefficient is preceded by k coefficients which are positive or zero, and if G denotes the greatest of the numerical values of the negative coefficients, then each real root is less than l + -y/(r/ao. For example, in x 5 +4x 4 — 7x 2 — 40a; +1=0, G = 40 and k = 3 since we must supply the coefficient zero to the missing power z 3 . Thus the theorem asserts that each root is less than 1 + V40 and therefore less than 4.42. Hence 4.42 is an upper limit to the roots. Proof. For positive values of x, f{x) will be reduced in value or remain unchanged if we omit the terms a\x n ~ l , . . . , a k -ix n ~ k+1 (which are positive or zero), and if we change each later coefficient a k , . . . , a n to — G. Hence f(x)^a x n -G(x n - k -\-x n - k - 1 -{- . . . +x+l). But, by Ex. 7 of § 14, z w -*+ • •• +X+l=- — -, x— 1 ifz^l. Furthermore, a ° x ~ G \ x -i )=- -^=r~ Hence, if x>l, x n - k+1 {a x k -\x-l)-G} f{x)> x—l x n - k+1 {a (x-l) k -G} x-1 Thus, for x>l, f(x)>0 and x is not a root if a (x—l) k — G^.0, which is true if x|i l + ^G'/ao. THE CATHERINE B. O'CONNOR LIBRARY ! 03SL WESTON, MASSACHUSETTS 02193 22 THEOREMS ON ROOTS OF EQUATIONS [Ch. II 23. Another Upper Limit to the Roots. Theokem II. //, in a real algebraic equation in which the coefficient of the highest power of the unknown is positive, the numerical value of each negative coefficient be divided by the sum of all the positive coefficients which precede it, the greatest quotient so obtained increased by unity is an upper limit to the roots. For the example in § 22, the quotients are 7/(1+4) and 40/5, so that Theorem II asserts that 1+8 or 9 is an upper limit to the roots. Theorem I gave the better upper limit 4.42. But for x 3 +8x 2 — 9x+c 2 = 0, Theorem I gives the upper limit 4, while Theorem II gives the better upper limit 2. We first give the proof for the case of the equation /(x) =p 4 x i —p 3 x 3 +p2X 2 —p 1 x+po = in which each p t is positive. In view of the identities x 4 = (x-l) (x 3 +x 2 +x+l)+l, x 2 = (x-l) (x + l)+l, / (x) is equal to the sum of the terms Pi(x — l)x 3 +pi(x — l)x 2 +pi(x — l)x +pi(x — 1) +p 4 , — Pzx z +p 2 (x — l)x+p 2 (x — 1)+P2, —pix +p - If x>l, negative terms occur only in the first and third columns, while the sum of the terms in each of these two columns will be ^L if P4 (x-1)-P8^0, (p 4 +p 2 ) (z-D-pi^O. Hence f(x) > and x is not a root if »>!+& *>1+- Pl Pi Pi + P2 This proves the theorem for the present equation. Next, let/(x) be modified by changing its constant term to — p . We modify the above proof by employing the sum (pi-\-pi)x — p of all the terms in the corresponding last two columns. This sum will be >0 if x>p /(P4+P2), which is true if *>l+-i^ P4 + P2 To extend this method of proof to the general case f(x)=a n x n + . . . +a (an>0), we have only to employ suitable general notations. Let the negative coefficients be a tl , . . . , a tt , where &i>/c2> • • • >fa- For each posi- § 23] UPPER LIMIT TO REAL ROOTS 23 tive integer m which is £ n and distinct from fci, . . . , k t} we replace x m by the equal value d(x m ~ l +x m - 2 + . . . +a;+l)+l where d=z— 1. Let F(x) denote the polynomial in x, with coefficients involving d, which is obtained from f(x) by these replacements. Let x>l, so that d is positive. Thus the terms a^xh are the only negative quantities occurring in F(x). If k t >0, the terms of F(x) which involve explicitly the power xh are a^xh and the a m dxh for the various positive coefficients am which precede a 4< . The sum of these terms will be ^.0 if a^+dSa^^O, i.e., if —a t . There is an additional case if k t = 0, i.e., if a is negative. Then the terms of F(x) not involving x explicitly are a and the a m (d-{-l) for the various positive coefficients a m . Their sum, ao-\-x2,a m , will be >0 if — «o which is true if EXERCISES Apply the methods of both § 22 and § 23 to find an upper limit to the roots of 1. 4x 5 -8x 4 +22x 3 +98.r 2 -73a;+5=0. 2. z 4 -5x 3 +7x 2 -8x+l=0. 3. 2 7 +3x 6 -4x 5 +5x 4 -6x 3 -7x 2 -8=0. 4. x 7 +2x 5 +4x 4 -8x 2 -32=0. 5. A lower limit to the negative roots of /(x)=0 may be found by applying our theorems to/(— x) =0, i.e., to the equation derived from/(x) =0 by replacing x by — x. Find a lower limit to the negative roots in Exs. 2, 3, 4. 6. Prove that every real root of a real equation /(x) =0 is less than \-\-g/a Q if a >0, where g denotes the greatest of the numerical values of a\, . . . , a n . Hint: if x>0, a x n +a 1 x n - 1 + . . .^a x n -g(x n - 1 + . . . +x+l). Proceed as in § 22 with k = 1 . 7. Prove that l+gr^|ac| is an upper limit for the moduli of all complex roots of any equation /(x)=0 with complex coefficients, where g is the greatest of the values |oi|, . . . , \a n \, and \a\ denotes the modulus of a. Hint: use Ex. 5 of § 8. 24 THEOREMS ON ROOTS OF EQUATIONS [Ch. II 24. Integral Roots. For an equation all of whose coefficients are integers, any integral root is an exact divisor of the constant term. For, if x is an integer such that (12) a x n + . . . +a n -ix+a n = 0, where the a's are all integers, then, by transposing terms, we obtain x{— a Q x n ~ l — . . . —a n -i)=a n . Thus x is an exact divisor of a n since the quotient is the integer given by the quantity in parenthesis. Example 1. Find all the integral roots of x 3 +x 2 -3x+9=0. Solution. The exact divisors of the constant term 9 are ±1, ±3, ±9. By trial, no one of ±1, 3 is a root. Next, we find that —3 is a root by synthetic division (§ 15) : 1 1-3 9 1-3 -3 6 -9 1-230 Hence the quotient is z 2 -2x+3, which is zero for x = l±V -2. Thus —3 is the only integral root. When the constant term has numerous exact divisors, some device may simplify the application of the theorem. Example 2. 1 Find all the integral roots of 2/ 3 +12y 2 -32y -256 = 0. Solution. Since all the terms except y 3 are divisible by 2, an integral root y must be divisible by 2. Since all the terms except y 3 are now divisible by 2 4 , we have y = 4z, where z is an integer. Removing the factor 2 6 from the equation in z, we obtain z 3 +3z 2 -2z-4=0. An integral root must divide the constant term 4. Hence, if there are any integral roots, they occur among the numbers ±1, ±2, ±4. By trial, —1 is found to be a root: 1 3-2-4 1 -1 -1-2 4 12-40 1 This problem is needed for the solution (§ 48) of a certain quartic equation. § 24] INTEGRAL ROOTS 25 Hence the quotient is z 2 +2z— 4, which is zero for z=-l±V5. Thus y=4z=— 4 is the only integral root of the proposed equation. EXERCISES Find all the integral roots of 1. x 3 +8x 2 +13x+6 = 0. 2. x 3 -5x 2 -2x+24 = 0. 3. x 3 -10x 2 +27x-18=0. 4. x 4 +4x 3 +8x+32=0. 5. The equation in Ex. 4 of § 23. 25. Newton's Method for Integral Roots. In § 24 we proved that an integral root x of equation (12) having integral coefficients must be an exact divisor of a n . Similarly, if we transpose all but the last two terms of (12), we see that a n -\x-\-a n must be divisible by x 2 , and hence a n -\-\-a n /x divisible by x. By transposing all but the last three terms of (12), we see that their sum must be divisible by x s , and hence a„_2+ (a n -i-\-a n /x)/x divisible by x. We thus obtain a series of conditions of divisibility which an integral root must satisfy. The final sum a -{-ai/x-\- . . . must not merely be divisible by x, but be actually zero, since it is the quotient of the function (12) by x 11 . In practice, we must test in turn the various divisors x of a n . If a chosen x is not a root, that fact will be disclosed by one of the conditions mentioned. Newton's method is quicker than synthetic division since it usually detects early and throws out wrong guesses as to a root, whereas in synthetic division the decision comes only at the final step. For example, the divisor —3 of the constant term of (13) /(x)=x 4 -9x 3 +24x 2 -23x + 15=0 is not a root since — 23+15/(— 3) = —28 is not divisible by —3. To show that none of the tests fails for 3, so that 3 is a root, we may arrange the work systematically as f ollows : 3 1 -9 24 -23 15 (14) -16-6 5 (divisor) -3 18 -18 First we divide the final coefficient 15 by 3, place the quotient 5 directly under the coef- ficient —23, and add. Next, we divide this sum —18 by 3, place the quotient —6 directly under the coefficient 24, and add. After two more such steps we obtain the sum zero, so that 3 is a root. It is instructive to obtain the preceding process by suitably modifying synthetic division. First, we replace x by l/y in (13), multiply each term by y 4 , and obtain 15?/ 4 -23y 3 +24?/ 2 -9?/ + l =0. 26 THEOREMS ON ROOTS OF EQUATIONS [Ch. II We may test this for the root y = \, which corresponds to the root x = 3 of (13), by ordinary synthetic division: 15 -23 24 -9 1 i 5 —6 6—1 (multiplier) 15 -18 18 -3 The coefficients in the last two lines (after omitting 15) are the same as those of the last two fines in (14) read in reverse order. This should be the case since we have here multiplied the same numbers by ^ that we divided by 3 in (14). The numbers in the present third line are the coefficients of the quotient (§ 15). Since we equate the quotient to zero for the applications, we may replace these coefficients by the numbers in the second line which are the products of the former numbers by -§-. The numbers in the second line of (14) are the negatives of the coefficients of the quotient of f(x) by x-Z. Example. Find all the integral roots of equation (13). Solution. For a negative value of x, each term is positive. Hence all the real roots are positive. By § 23, 10 is an upper limit to the roots. By § 24, any integral root is an exact divisor of the constant term 15. Hence the integral roots, if any, occur among the numbers 1, 3, 5. Since /(l) =8, 1 is not a root. By (14), 3 is a root. Pro- ceeding similarly with the quotient by a;— 3, whose coefficients are the negatives of the numbers in the second line of (14), we find that 5 is a root. EXERCISES 1. Solve Exs. 1-4 of § 24 by Newton's method. 2. Prove that, in extending the process (14) to the general equation (12), we may employ the final equations in § 15 with r = and write c (divisor) flo Q>i O2 • • • 0, n — 2 ®"n — 1 ' — b —hi —6 2 . . . —b n - 2 —b n -i — cb —cbi . . . — cb„—2 —cb n —2 Here the quotient, —b n —i, of a n by c is placed directly under a n — 1; and added to it to yield the sum —cb n - 2 , etc. 26. Another Method for Integral Roots. An integral divisor d of the constant term is not a root if d—m is not a divisor of f(m), where m is any chosen integer. For, if d is a root of f(x) = 0, then f(x)m(z-d)Q(x), where Q(x) is a polynomial having integral coefficients (§ 15). Hence f(m) = (m — d) q, where q is the integer Q(ra). § 27] RATIONAL ROOTS 27 In the example of § 25, take d = 15, m = 1. Since /(l) = 8 is not divisible by 15 — 1 = 14, 15 is not an integral root. Consider the more difficult example fix) =x 3 -20x 2 +164z-400 = 0, whose constant term has many divisors. There is evidently no negative root, while 21 is an upper limit to the roots. The positive divisors less than 21 of 400 = 2 4 5 2 are d = 1, 2, 4, 8, 16, 5, 10, 20. First, take m = 1 and note that/(l) = -255 = -3 • 5 • 17. The corresponding values of d — 1 are 0, 1, 3, 7, 15, 4, 9, 19; of these, 7, 4, 9, 19 are not divisors of /(l), so that d = S, 5, 10 and 20 are not roots. Next, take m = 2 and note that /(2) = — 144 is not divisible by 16—2 = 14. Hence 16 is not a root. Incidentally, d = l and d = 2 were excluded since /(d) ^0. There remains only d = 4, which is a root. In case there are numerous divisors within the limits to the roots, it is usually a waste of time to list all these divisors. For, if a divisor is found to be a root, it is preferable to employ henceforth the quotient, as was done in the example in § 25. EXERCISES Find all the integral roots of 1. z 4 -2z 3 -21a; 2 +22a;+40 = 0. 2. y 3 -9y 2 -24^+216 = 0. 3. x"-23x 3 +187a; 2 -653x+936 = 0. 4. z 5 +47x 4 +423x 3 + 140x 2 +1213x-420 = 0. 5. x 5 -34a; 3 +29x 2 +212x -300 = 0. 27. Rational Roots. If an equation with integral coefficients (15) c a; M +cix"- 1 + . . . +c„_ia;+c„ = has the rational root a/b, where a and b are integers without a common divisor > 1, then a is an exact divisor of c n , and b is an exact divisor of Cq. Insert the value a/b of x and multiply all terms of the equation by b n . We obtain c a n +c 1 a n ~ 1 b+ . . . -\-c n - 1 ab n - 1 +c n b n = 0. Since a divides all the terms preceding the last term, it divides that term. But a has no divisor in common with b n ; hence a divides c n . Similarly, b divides all the terms after the first term and hence divides c . Example. Find all the rational roots of 2z 3 -7x 2 +10x-6=0. 28 THEOREMS ON ROOTS OF EQUATIONS [Ch. II Solution. By the theorem, the denominator of any rational root £ is a divisor of 2. Hence y = 2x is an integer. Multiplying the terms of our equation by 4, we obtain y 3 -7y 2 +20y-24 = 0. There is evidently no negative root. By either of the tests in §§ 22, 23, an upper limit to the positive roots of our equation in x is 1+7/2, so that t/<9. Hence the only possible values of an integral root y are 1, 2, 3, 4, 6, 8. Since 1 and 2 are not roots, we try 3: 1 -7 20 -24 |_3_ -1 4 -8 -3 12 Hence 3 is a root and the remaining roots satisfy the equation y 2 — 4y+8 = and are 2±2£. Thus the only rational root of the proposed equation is x = 3/2. If c = l, then 6= ±1 and a/b is an integer. Hence we have the Corollary. Any rational root of an equation with integral coefficients, that of the highest power of the unknown being unity, is an integer. Given any equation with integral coefficients a Q y n -^-a 1 y n - 1 -\- . . . +a w = 0, we multiply each term by a Q n ~ l , write a y = x, and obtain an equation (15) with integral coefficients, in which the coefficient c of x n is now unity. By the Corollary, each rational root x is an integer. Hence we need only find all the integral roots x and divide them by ao to obtain all the rational roots y of the proposed equation. Frequently it is sufficient (and of course simpler) to set ky = x, where A; is a suitably chosen integer less than a . EXERCISES Find all of the rational roots of 1- V'- L i / 2 /3+J-|o^-40 2 /+9 = 0. 2. 6y 3 -ll2/ 2 +62/-l=0. 3. 108?/ 3 -270?/ 2 -42 2 / + l=0. [Use/b = 6.] 4. 32i/ 3 -6f/-l=0. [Use the least k.} 5. 96?/ 3 -16?/ 2 -62/+l=0. 6. 247/ 3 -2?/ 2 -52/ + l =0. 7. y 3 -±y 2 -2y+l=0. < 8. y*-%y*+3y -2=0. 9. Solve Exs. 2-0 by replacing y by 1/x. Find the equations whose roots are the products of by the roots of 10. 2 / 2 -2y-L = 0. 11. 2/ 3 —b 2 -!2/+i = 0. CHAPTER III Constructions with Ruler and Compasses 28. Impossible Constructions. We shall prove that it is not possible, by the methods of Euclidean geometry, to trisect all angles, or to con- struct a regular polygon of 7 or 9 sides. The proof, which is beyond the scope of elementary geometry, is based on principles of the theory of equations. Moreover, the discussion will show that a regular polygon of 17 sides can be constructed with ruler and compasses, a fact not suspected during the twenty centuries from Euclid to Gauss. 29. Graphical Solution of a Quadratic Equation. structible, and If a and b are con- (1) x 2 -ax+b = has real coefficients and real roots, the roots can be constructed with ruler and compasses as follows. Draw a circle having as a diam- eter the line BQ joining the points B = (0, 1) and Q=(a, b) in Fig. 6. Then the abscissas ON and OM of the points of inter- section of this circle with the x-axis are the roots of (1). For, the center of the circle is (a/2, (6+l)/2); the square of BQ is a 2 +(6— l) 2 ; hence the equation of the circle is Fig. 6 hh^j- a 2 +(b-lf 2 / 4 This is found to reduce to (1) when y = 0, which proves the theorem. When the circle is tangent to the x-axis, so that M and N coincide, the two roots are equal. When the circle does not cut the rc-axis, or when Q coincides with B, the roots are imaginary. Another construction follows from § 30. 29 30 CONSTRUCTIONS WITH RULER AND COMPASSES [Ch. Ill EXERCISES Solve graphically: 1. x 2 -5x+4 = 0. 2. x 2 +5x+4 = 0. 3. z 2 +5z-4 = 0. 4. x 2 -5a;-4 = 0. 5. a; 2 -4x+4 = 0. 6. x 2 -3x+4 = 0. 30. Analytic Criterion for Constructibility. The first step in our consideration of a problem proposed for construction consists in formu- lating the problem analytically. In some instances elementary algebra suffices for this formulation. For example, in the ancient problem of the duplication of a cube, we take as a unit of length a side of the given cube, and seek the length x of a side of another cube whose volume is double that of the given cube; hence (2) x 3 = 2. But usually it is convenient to employ analytic geometry as in § 29; a point is determined by its coordinates x and y with reference to fixed rectangular axes; a straight line is determined by an equation of the first degree, a circle by one of the second degree, in the coordinates of the general point on it. Hence we are concerned with certain numbers, some being the coordinates of points, others being the coefficients of equa- tions, and still others expressing lengths, areas or volumes. These num- bers may be said to define analytically the various geometric elements involved. Criterion. A proposed construction is possible by ruler and com- passes if and only if the numbers which define analytically the desired geo- metric elements can be derived from those defining the given elements by a finite number of rational operations and extractions of real square roots. In § 29 we were given the numbers a and b, and constructed lines of lengths ^(a±Va 2 -46). Proof. First, we grant the condition stated in the criterion and prove that the construction is possible with ruler and compasses. For, a rational function of given quantities is obtained from them by additions, sub- tractions, multiplications, and divisions. The construction of the sum or difference of two segments is obvious. The construction, by means of parallel fines, of a segment whose length p is equal to the product a-b of the lengths of two given segments is shown in Fig. 7; that for the quo- 30] ANALYTIC CRITERION FOR CONSTRUCTIBILITY 31 tient q = a/b in Fig. 8. Finally, a segment of length s = Vn may be con- structed, as in Fig. 9, by drawing a semicircle on a diameter composed of two segments of lengths 1 and n, and then drawing a perpendicular to the diameter at the point which separates the two segments. Or we may construct a root of x 2 — n = by § 29. Second, suppose that the proposed construction is possible with ruler and compasses. The straight lines and circles drawn in making the con- struction are located by means of points either initially given or obtained as the intersections of two straight lines, a straight line and a circle, or two circles. Since the axes of coordinates are at our choice, we may assume that the y-axis is not parallel to any of the straight lines employed in the construction. Then the equation of any one of our lines is (3) y = mx-\-b. Let y = m'x-\-b' be the equation of another of our lines which inter- sects (3). The coordinates of their point of intersection are x = b'-b m—m y- mb' — m'b m—m' ' which are rational functions of the coefficients of the equations of the two lines. Suppose that a line (3) intersects the circle (x-c) 2 +(y-d) 2 = r 2 , with the center (c, d) and radius r. To find the coordinates of the points of intersection, we eliminate y between the equations and obtain a quad- ratic equation for x. Thus x (and hence also mx-\-b or y) involves no 32 CONSTRUCTIONS WITH RULER AND COMPASSES [Ch. Ill irrationality other than a real square root, besides real irrationalities present in m, b, c, d, r. Finally, the intersections of two circles are given by the intersections of one of them with their common chord, so that this case reduces to the preceding. .For example, a side of a regular pentagon inscribed in a circle of radius unity is (Ex. 2 of § 37) (4) s = l\/lO-2V5, which is a number of the type mentioned in the criterion. Hence a regular pentagon can be constructed by ruler and compasses (see the example above quoted) . 31. Cubic Equations with a Constructive Root. We saw that the problem of the duplication of a cube led to a cubic equation (2). We shall later show that each of the problems, to trisect an angle, and to con- struct regular polygons of 7 and 9 sides with ruler and compasses, leads to a cubic equation. We shall be in a position to treat all of these problems as soon as we have proved the following general result. Theokem. It is not possible to construct with ruler and compasses a line whose length is a root or the negative of a root of a cubic equation with rational coefficients having no rational root. Suppose that x\ is a root of (5) x 3 +ax 2 -\-px+y = (a, )3, y rational) such that a line of length xi or — Xi can be constructed with ruler and com- passes; we shall prove that one of the roots of (5) is rational. We have only to discuss the case in which xi is irrational. By the criterion in § 30, since the given numbers in this problem are a, /3, 7, all rational, x\ can be obtained by a finite number of rational operations and extractions of real square roots, performed upon rational numbers or numbers derived from them by such operations. Thus x\ involves one or more real square roots, but no further irrationalities. As in the case of (4), there may be superimposed radicals. Such a two-story radical which is not expressible as a rational function, with rational coefficients, of a finite number of square roots of positive rational numbers is said to be a radical of order 2. In general, an n-story radical is said to be of order n if it is not expressible as a rational function, with rational coefficients, of radicals each with fewer than n superimposed radicals, the innermost ones affecting positive rational numbers. §31] CUBIC EQUATIONS WITH A CONSTRUCTIBLE ROOT 33 We agree to simplify x\ by making all possible replacements of certain types that are sufficiently illustrated by the following numerical examples. If xi involves V3, V5, and Vl5, we agree to replace Vl5 by VZ-VK If xi = s — It, where s is given by (4) and so that si=v5, we agree to write x\ in the form s — 7\^5/s, which involves a single radical of ord er 2 and no new radical of lower order. Finally, we agree to replace V 4 — 2V3 by its simpler form V3— 1. After all possible simplifications of these types have been made, the resulting expressions have the following properties (to be cited as our agreements): no one of the radicals of highest order n in x\ is equal to a rational function, with rational coefficients, of the remaining radicals of order n and the radicals of lower orders, while no one of the radicals of order n— 1 is equal to a rational function of the remaining radicals of order n—\ and the radicals of lower orders, etc. Let Vk be a radical of highest order n in x\. Then = a+bVk Xl ~c+dVW where a, b, c, d do not involve Vfc, but may involve other radicals. If d = 0, then c^-0 and we write e for a/c, f for b/c, and get (6) X!=e+fVh, C/VO) where neither e nor/ involves Vk. If d^O, we derive (6) by multiplying the numerator and denominator of the fraction for x\ by c—dVk, which is not zero since 'Vk = c/d would contradict our above agreements. By hypothesis, (6) is a root of equation (5). After expanding the powers and replacing the square of V& by k, we see that (7) (e+/V^)3+«( e +/V^)2 +/3 ( e+ /V/7)+ 7 = A+ J BVfc, where A and B are certain polynomials in e, f, k and the rational numbers a, j8, 7. Thus A+BVk = 0. If B^O, Vk= -A/B is a rational function, with rational coefficients, of the radicals, other than v/b, in x\, contrary to our agreements. Hence B = and therefore A = 0. When e— /Vfc is substituted for x in the cubic function (5), the result 34 CONSTRUCTIONS WITH RULER AND COMPASSES [Ch. Ill is the left member of (7) with V/b replaced by — V/c, and hence the result is A - BVk. But A = B = 0. This shows that (8) x 2 = e-fVk is a new root of our cubic equation. Since the sum of the three roots is equal to —a by § 20, the third root is (9) X3= — a— X\ — X2= — a — 2e. Now a is rational. If also e is rational, xs is a rational root and we have reached our goal. We next make the assumption that e is irrational and show that it leads to a contradiction. Since e is a component part of the constructible root (6), its only irrationalities are square roots. Let Vs be one of the radicals of highest order in e. By the argument which led to (6), we may write e = e'+/'Vs, whence, by (9), (9') xz^g+hVs, (h9*0) where neither g nor h involves Vs. Then by the argument which led to (8), g — AVs is a root, different from xz, of our cubic equation, and hence is equal to x\ or X2 since there are only three roots (§ 16). Thus g-hVs = e±fVk. By definition, Vs is one of the radicals occurring in e. Also, by (9'), every radical occurring in g or h occurs in x% and hence in e = %{— a— £3), by (9), a being rational. Hence V/c is expressible rationally in terms of the remaining radicals occurring in e and /, and hence in x\, whose value is given by (6). But this contradicts one of our agreements. 32. Trisection of an Angle. For a given angle A, we can construct with ruler and compasses a line of length cos A or —cos A, namely the adjacent leg of a right triangle, with hypotenuse unity, formed by dropping a perpendicular from a point in one side of A to the other, produced if necessary. If it were possible to trisect angle A, i.e., construct the angle A/3 with ruler and compasses, we could as before construct a line whose length is ±cos (A/3). Hence if we show that this last cannot be done when the only given geometric elements are the angle A and a line of unit length, we shall have proved that the angle A cannot be trisected. We shall give the proof for A = 120°. We employ the trigonometric identity cos A =4 cos 3 -5 — 3 cos -5-. o o § 32] TRISECTION OF AN ANGLE 35 Multiply each term by 2 and write x for 2 cos (A/3). Thus (10) z 3 -3:r = 2cosA. For A = 120°, cos A = — £ and (10) becomes (11) x?-Sx+l=0. Any rational root is an integer (§ 27) which is an exact divisor of the constant term (§24). By trial, neither +1 nor —1 is a root. Hence (11) has no rational root. Hence (§31) it is not possible to trisect all angles with ruler and compasses. Certain angles, like 90°, 180°, can be trisected. When A = 180°, the equation (10) becomes x 3 — 3x=— 2 and has the rational root x = l. It is the rationality of a root which accounts for the possibility of trisecting this special angle 180°. 33. Regular Polygon of 9 Sides, Duplication of a Cube. Since angle 120° cannot be trisected with ruler and compasses (§ 32), angle 40° can- not be so constructed in terms of angle 120° and the line of unit length as the given geometric elements. Since the former of these elements and its cosine are constructible when the latter is given, we may take the line of unit length as the only given element. In a regular polygon of 9 sides, the angle subtended at the center by one side is ^-360° = 40°. Hence a regular polygon of 9 sides cannot be constructed with ruler and com- passes. Here, as in similar subsequent statements where the given elements are not specified, the only such element is the line of unit length. A rational root of x? = 2 is an integer (§ 27) which is an exact divisor of 2. The cubes of ±1 and ±2 are distinct from 2. Hence there is no rational root. Hence (§§ 30, 31) it is not possible to duplicate a cube with ruler and compasses. 34. Regular Polygon of 7 Sides. If we could construct with ruler and compasses an angle 5 containing 360/7 degrees, we could so con- struct a line of length re = 2 cos 5. Since 7 5 = 360°, cos3 5 = cos4 5. But 2 cos SB = 2(4 cos 3 5 - 3 cos 5) = x* - 3x, 2 cos 45 = 2(2 cos 2 25-1) =4(2 cos 2 5-l) 2 -2= (z 2 -2) 2 -2. Hence = x 4 - 4x 2 + 2 - (x 3 - 3x) = (x - 2) (x 3 +x 2 - 2x - 1) . But x = 2 would give cos 5=1, whereas 5 is acute. Hence (12) x 3 +x 2 -2x-l=0. 36 CONSTRUCTIONS WITH RULER AND COMPASSES [Ch. Ill Since this has no rational root, it is impossible to construct a regular polygon of 7 sides with ruler and compasses. 35. Regular Polygon of 7 Sides and Roots of Unity. If _ 2tt, .', 2tt R = cos-=-+t siny, we saw in § 10 that R, R 2 , R 3 , R 4 , R 5 , R 6 , R 7 = l give all the roots of y 7 = 1 and are complex numbers represented by the vertices of a regular polygon of 7 sides inscribed in a circle of radius unity and center at the origin of coordinates. By § 6, 1 2tt . . 2tt d . 1 o 2tt ^ = cos^ — tsm-y, #+-^ = 2cosy. We saw in § 34 that 2 cos (27r/7) is one of the roots of the cubic equation (12). This equation can be derived in a new manner by utilizing the preceding remarks on 7th roots of unity. Our purpose is not primarily to derive (12) again, but to illustrate some principles necessary in the general theory of the construction of regular polygons. Removing from y 7 — 1 the factor y—1, we get (13) y 6 +y»+t+y z +y 2J ry+i=Q, whose roots are R, R 2 , . . . , R 6 . Since we know that R-\-l/R is one of the roots of the cubic equation (12), it is a natural step to make the sub- stitution (14) y+± = x in (13). After dividing its terms by y s , we have (i3o {y 3+ $+{y 2+ $ + { y+ l) +l=0 - By squaring and cubing the members of (14), we see that (15) l) are integers without a common factor, and q is not divisible by a cube. Prove algebraically that it is possible, with ruler and compasses: 13. To trisect an angle whose cosine is (4a 3 — 3ab 2 )/fe 3 , where the integer a is numer- ically less than the integer b; for example, cos -1 ll/16ifa=— 1, b=4. 14. To construct the legs of a right triangle, given its area and hypotenuse. 15. To construct the third side of a triangle, given two sides and its area. 16. To locate the point P on the side BC = 1 of a given square ABCD such that tne straight line AP cuts DC produced at a point Q for which the length of PQ is a given number g. Show that y = BP is a root of a reciprocal quartic equation, and solve it when o = 10, 38. The Periods of Roots of Unity. Before taking up the regular polygon of 17 sides, we first explain another method of finding the pairs of imaginary seventh roots of unity R and R 6 , R 2 and R 5 , R 3 and R 4 , employed in (16). To this end we seek a positive integer g such that the six roots can be arranged in the order (20) R, R°, R g2 , R g3 , R gi , R 9 \ where each term is the gth power of its predecessor. Trying g = 2, we find that the fourth term would then be R 8 = R. Hence g^2. Trying # = 3, we obtain (21) R, R 3 , R 2 , R 6 , R 4 , R 5 , where each term is the cube of its predecessor. § 39] REGULAR POLYGON OF 17 SIDES 41 To define three periods, each of two terms, (16') R+R 6 , R 2 +R 5 , R 3 +R 4 , we select the first term R of (21) and the third term R 6 after it and add them, then the second term it! 3 and the third term R 4 after it, and finally R 2 and the third term R 5 after it. We may also define two periods, each of three terms, z 1 = R-{-R 2 -{-R 4 , z 2 = R z +R & +R 5 , by taking alternate terms in (21). Since Zi+Z2 = — 1, ZiZ 2 = 3+R-\- . . . -\-R 6 = 2, zi and z 2 are the roots of z 2 +z+2=0. Then R, R 2 , R 4 are the roots of w 3 — ZiW 2 -\-z 2 w — 1 = 0. 39. Regular Polygon of 17 Sides. Let R be a root ^1 of x 17 = l. Then **^^=R 16 +R 15 + . . . +£+1=0. K — 1 As in § 38, we may take g = 3 and arrange the roots R, . . . , R 16 so that each is the cube of its predecessor: R, R 3 ,R 9 , R 10 , R 13 , R 5 , R 15 , R n , R 16 , R 14 , R 8 , R 7 , R 4 , R 12 , R 2 , R 6 . Taking alternate terms, we get the two periods, each of eight terms, yi =R+R 9 +R l3 +R 15 +R 16 +R 8 +R 4 -\-R 2 , y 2 = R 3 +R 10 +R 5 +R n +R 14 +R 7 +R 12 +R 6 . Hence yi+y 2 = -1. We find that yiy 2 = 4: (R+ . . . +R l6 ) = -4. Thus (22) 2/1, y 2 satisfy y 2 +y-4: = 0. Taking alternate terms in yi, we obtain the two periods zi= J R+# 13 +# 16 +£ 4 , z 2 = R g +R 15 +R 8 +R 2 . Taking alternate terms in y 2) we get the two periods Wl = R 3 +R*+R l4 +R 12 , w 2 ^R 10 +R n +R 7 +R (i . Thus z\ +z 2 = y\, wi-\-w 2 — y 2 . We find that 2i22 = wiW2= —1. Hence (23) z\, 22 satisfy z 2 — y\Z — 1 = 0, (24) w\, w 2 satisfy w 2 — y 2 w — 1 = 0. 42 CONSTRUCTIONS WITH RULER AND COMPASSES [Ch. Ill Taking alternate terms in z\, we obtain the periods Vl = R+R 16 , v 2 = R 13 +R*. Now, vi-\-V2 = Zi, viV2 = wi. Hence (25) vi, V2 satisfy v 2 — ziv-\-wi=0, (26) R, R 16 satisfy P 2 -v 1P + 1 = 0. Hence we can find R by solving a series of quadratic equations. Which of the sixteen values of R we shall thus obtain depends upon which root of (22) is called y\ and which y 2 , and similarly in (23)-(26). We shall now show what choice is to be made in each such case in order that we shall finally get the value of the particular root D 2t . . 2?r R = cos yj+i sin t=. Then 1 2tt . . 2tt d . 1 _ 2tt -^ = cos-=-ismj=, vi = R +p = 2cos y=, J R 4 = cos-y+«sm-w, V2 = /t 4 +Q4 = 2cos-=. Hence vi > V2 > 0, and therefore z\ = yi+V2 > 0. Similarly, tvj i 1 i 7->r; , 1 « 67T , _ lOx _ 07T _ '7T _ Wl =# 3 +-^3+# 5 +^5 = 2 COS Yy + 2 COS ^y- = 2 COS Jj" 2 C °S JJ>°» 6-n- IOt 12-n- \4lT 2/2 = 2 cos T^+2 cos Ty-+ 2 c °s "77~+ 2 cos TT ' since only the first cosine in y% is positive and it is numerically less than the third. But yiy 2 = -4. Hence j/i>0. Thus (22)-(24) give yi =h(VV7-i), i/ 2 = |(-Vl7-l), We may readily construct segments of these lengths. Evidently Vl7 is the length of the hypotenuse of a right triangle whose legs are of lengths 1 and 4, while for the radical in z\ we employ legs of lengths 1 and \y\. We thus obtain segments representing the coefficients of the 40] REGULAR POLYGONS 43 quadratic equation (25). larger root is Its roots may be constructed as in § 29. 2-k The v\ = 2 cos 17' Hence we can construct angle 2-71-/17 with ruler and compasses, and there- fore a regular polygon of 17 sides. 40. Construction of a Regular Polygon of 17 Sides. In a circle of radius unity, construct two per- pendicular diameters AB, CD, and draw tangents at A, D, which intersect at S (Fig. 11). Find the point E in AS for which AE — \AS, by means of two bi- sections. Then AE=h OE = lVl7. Let the circle with center E and radius OE cut AS at F and F'. Then F' S AF = EF-EA = OE-\ = \ yi , AF' = EF'+EA = OE+\ = -±y 2 , OF=VOA 2 +AF 2 = Vl+l Vl 2 , OF' = Vl+ly 2 2 . Let the circle with center F and radius FO cut AS at H, outside of F'F; that with center F' and radius F'O cut AS at H' between F' and F. Then AH=AF+FH=AF+0F=i yi +Vl+l yi 2 =zi, AH'=F'H'-F'A=OF'-AF'=wi. It remains to construct the roots of equation (25). This will be done as in § 29. Draw HTQ parallel to AO and intersecting OC produced at T. Make TQ = AH'. Draw a circle having as diameter the line BQ joining B— (0,1) with Q=(zi,wi). The abscissas ON and OM of the inter- sections of this circle with the rc-axis OT are the roots of (25). Hence the larger root v\ is 0M = 2 cos(27r/17). 44 CONSTRUCTIONS WITH RULER AND COMPASSES [Ch. in Let the perpendicular bisector LP of OM cut the initial circle of unit radius at P. Then cos LOP = 0L = cos ~, LOP = ^. Hence the chord CP is a side of the inscribed regular polygon of 17 sides, constructed with ruler and compasses. 41. Regular Polygon of n Sides. If n be a prime such that n — 1 is a power 2 h of 2 (as is the case when n = 3, 5, 17), the n— 1 imaginary nth roots of unity can be separated into 2 sets each of 2 h ~ 1 roots, each of these sets subdivided into 2 sets each of 2 h ~ 2 roots, etc., until we reach the pairs R, 1/R and R 2 , l/R 2 , etc., and in fact 1 in such a manner that we have a series of quadratic equations, the coefficients of any one of which depend only upon the roots of quadratic equations preceding it in the series. Note that this was the case for n = 17 and for n = 5. It is in this manner that it can be proved that the roots of x n = l can be found in terms of square roots, so that a regular polygon of n sides can be inscribed by ruler and compasses, provided n be a prime of the form 2 ft +l. If n be a product of distinct primes of this form, or 2 fc times such a product (for example, n = 15, 30 or 6), or if n = 2 m (m> 1), it follows readily (see Ex. 1 below) that we can inscribe with ruler and compasses a regular polygon of n sides. But this is impossible for all other values of n. EXERCISES 1. If os and b are relatively prime numbers, so that their greatest common divisor is unity, we can find integers c and d such that ac-\-bd = \. Show that, if regular poly- gons of a and b sides can be constructed and hence angles 2w/a and 2-ir/b, a regular polygon of a-b sides can be derived. 2. If p = 2 h +l is a prime, h is a power of 2. For h = 2°, 2 1 , 2 2 , 2 3 , the values of p are 3, 5, 17, 257 and are primes. [Show that h cannot have an odd factor other than unity.] 3. For 13th roots of unity find the least g (§38), write out the three periods each of four terms, and find the cubic equation having them as roots. 4. For the primitive ninth roots of unity find the least g and write out the three periods each of two terms. Solve the following reciprocal equations: 5. y 4 +4?/ 3 -3?/ 2 +% + l=0. 6. y 5 -4y 4 +y 3 +y 2 -4y + l=0. 7. 2y«-5y\+4y i -4y 2 +5y-2 = 0. 8. y & + l =31(?/ + 1) 5 . 1 See the author's article " Constructions with ruler and compasses; regular poly- gons," in Monographs on Topics of Modern Mathematics, Longmans, Green and Co., 1911, p. 374. CHAPTER IV Solution of Cubic and Quartic Equations; Their Discriminants 42. Reduced Cubic Equation. If, in the general cubic equation (1) x 3 +bx 2 +cx+d = 0, we set x = y—b/3, we obtain the reduced cubic equation (2) y 3 +py+q = 0, lacking the square of the unknown y, where (3) p = c-j, 9 = d— 3+27- After finding the roots yi, yz, ys of (2), we shall know the roots of (1): ,.. b b b (4) xi=yi—g, x 2 = y 2 -y x 3 = y 3 --. 43. Algebraic Solution of the Reduced Cubic Equation. We shall employ the method which is essentially the same as that given by Vieta in 1591. We make the substitution (5) _„-.-£ in (2) and obtain '■ff-0, 27z 3 since the terms in z cancel, and likewise the terms in 1/z. Thus (6) . z 6 +gz 3 -^ = 0. Solving this as a quadratic equation for z 3 , we obtain 2 (7) ,3 = _| ±V K, B-.g)V 8 46 CUBIC AND QUARTIC EQUATIONS [Ch. IV By § 8, any number has three cube roots, two of which are the products of the remaining one by the imaginary cube roots of unity: (8) w=-!+§Va;, co 2 =-i-iV3;. We can choose particular cube roots (9) = ff^. B ^-|-VS, such that AB = — p/3, since the product of the numbers under the cube root radicals is equal to ( — p/3) 3 . Hence the six values of z are A, coA, o> 2 A, B, uB, u 2 B. These can be paired so that the product of the two in each pair is — p/3 • Hence with any root z is paired a root equal to —p/(3z). By (5), the sum of the two is a value of y. Hence the three values of y are (10) !/i= A+B, y 2 = o>A + u 2 B, y 3 = u 2 A+a:B. It is easy to verify that these numbers are actually roots of (2), For example, since co 3 = 1, the cube of y 2 is A 3 +B 3 +3uA 2 B+3co 2 AB 2 = -q- V {uA + u 2 B) = -q-py 2 , by (9) and AB=-

2 +5co give the three roots. EXERCISES Solve the equations: '1. i/ 3 -18?/ +35=0. 2. x 3 +6x 2 +3a;+18=0. 3. y 3 -2y+4 = 0. 4. 28z 3 +9:e 2 -l=0. § 44] DISCRIMINANT OF A CUBIC 47 44. Discriminant. The product of the squares of the differences of the roots of any equation in which the coefficient of the highest power of the unknown is unity shall be called the discriminant of the equation. For the reduced cubic (2), the discriminant is (11) (y i - y2) 2 (y i - ys) 2 (y 2 -ys) 2 =-4:p s - 27 q 2 , a result which should be memorized in view of its important applications. It is proved by means of (10) and co 3 = l, co 2 +to + l =0, as follows: yi-y 2 = (l-o>)(A-^B), yi -yz = {l-^)(A-uB), 2/2-2/3 = (co-co 2 ) (A.-B), (l-co)(l-co 2 )=3, co-co 2 = V3f. Since 1, co, co 2 are the cube roots of unity, (x — 1) (x — co) (x — co 2 ) = x 3 — 1, identically in x. Taking x = A/B, we see that (A -B)(A- co£) (A - u 2 B) = A 3 -B 3 = 2Vr, by (9). Hence (yi - 2/2) (yi - 2/3) (2/2 - 2/3) = 6 VsVr i. Squaring, we get (11), since — 108it!= — 4p 3 — 27 q 2 by (7). For later use, we note that the discriminant of the reduced cubic is equal to — 108 R. The discriminant A of the general cubic (1) is equal to the discriminant of the corresponding reduced cubic (2). For, by (4), xi-x 2 = yi-y2, x i -xz = y 1 -y 3 , X2 -x 3 = y 2 -y3- Inserting in (11) the values of p and q given by (3), we get (12) A = 18bcd-±b 3 d+b 2 c 2 -4:c 3 - 27 d 2 . It is sometimes convenient to employ a cubic equation (13) ax 3 +bx 2 +cx+d = (a^O), in which the coefficient of x 3 has not been made unity by division. The product P of the squares of the differences of its roots is evidently derived from (12) by replacing b, c, d by b/a, c/a, d/a. Hence (14) a 4 P = 18 abcd-4b 3 d+b 2 c 2 -0. When the roots of a real cubic equation are all real, i.e., if R is negative, they can be computed simultaneously by means of a table of cosines with much less labor than required by Cardan's formulas. To this end we write the trigonometric identity cos 3A = 4 cos 3 A — 3 cos A in the form z 3 — \z — I cos 3A=0 (2 = cosA). In the given cubic y 3 +py-\-q = take y = nz; then *3 + ^ + 4 = 0, which will be identical with the former equation in z if n = V-±p, cos3A = -|g-=-\/-p 3 /27. Since R = p 3 /27-\-q 2 /4 is negative, p must be negative, so that n is real and the value of cos 2>A is real and numerically less than unity. Hence we can find 3A from a table of cosines. The three values of z are then cos A, cos(A + 120°), cos(A+240°). Multiplying these by n, we obtain the three roots y correct to a number of decimal places which depends on the tables used. EXERCISES /l. For ^-2^-1=0, show that n 2 = 8/3, cos 3^1 = V27/32 3A=23°17' 0", cos A= 0.99084, cos (A +120°) = -0.61237, cos (A +240°) = -0,37847, and that the roots y are 1.61804, -1, -0.61804. 2. Solve Exs. 1, 3, 4, 5, 6 of § 46 by trigonometry. 50 CUBIC AND QUARTIC EQUATIONS [Ch. IV 48. Ferrari's Solution of the Quartic Equation. The general quartic equation (15) x 4 +bx 3 +ex 2 +dx+e = 0, or equation of degree four, becomes after transposition of terms rc 4 +foc 3 = — ex 2 — dx— e. The left member contains two of the terms of the square of x 2 +\bx. Hence by completing the square, we get (x 2 + %bz) 2 = {\b 2 - c)x 2 -dx-e. Adding (x 2j r%bx)y-\-\y 2 to each member, we obtain (16) (x 2 +±bx+hy) 2 =(lb 2 -c+y)x 2 +(hby-d)x+ly 2 -e. The second member is a perfect square of a linear function of x if and only if its discriminant is zero (§ 12): (hby-d) 2 -±£b 2 -c+y) (ly 2 -e)=0, which may be written in the form (17) y 3 -cy 2 +(bd-4:e)y-b 2 e+4ce-d 2 = 0. Choose any root y of this resolvent cubic equation (17). Then the right member of (16) is the square of a linear function, say mx-\-n. Thus (18) x 2j r\bx-\-\y = mx-\-n or x 2j r\bx-\-\y= —mx — n. The roots of these quadratic equations are the four roots of (16) and hence of the equivalent equation (15). This method of solution is due to Ferrari (1522-1565). Example. Solve x 4 +2x 3 -12a; 2 -10:c+3=0. Solution. Here b = 2, c= — 12, d= — 10, e = 3. Hence (17) becomes y 3 + 12y 2 -32y- 256 = 0, which by Ex. 2 of § 24 has the root y = —4. Our quartic may be written in the form (,x 1 +x)* = 13x* + 10z-3. Adding (x 2 +x) (—4) +4 to each member, we get (x 2 +x-2y = Qx 2 +Gx + l = (3x+l) 2 , z 2 +x-2 = ±(3x+l), x 2 -2x-3 = 0ora; 2 +4z-l=0, whose roots are 3, —1, — 2± V 5. As a check, note that the sum of the roots is —2. § 50] DISCRIMINANT OF A QUARTIC 51 EXERCISES ■1. Solve x 4 -8x 3 +9x 2 4-8x- 10 = 0. Note that (17) is (y-9) (y 2 -24)=0. 2. Solve x 4 -2x 3 -7x 2 +8x+12 = 0. Since the right member of (16) is (S+y) (x 2 -x) +b 2 -12, use 7/= -8. 3. Solve x 4 -3x 2 +6x-2 = 0. 4. Solve x 4 -2x 2 -8x -3=0. ^5. Solve x 4 -10x 2 -20x -16 = 0. 49. Roots of the Resolvent Cubic Equation. Let 2/1 be the root y which was employed in § 48. Let X\ and X2 be the roots of the first quadratic equation (18), and x% and x± the roots of the second. Then xix 2 = %yi-n, X3Xi = iyi+n, xiX2+X3X±=yi. If, instead of y\, another root 2/2 or 2/3 of the resolvent cubic (17) had been employed in § 48, quadratic equations different from (18) would have been obtained, such, however, that their four roots are x\, x 2 , X3, £4, paired in a new manner. The root which is paired with x\ is x 2 or X3 or x±. It is now plausible that the values of the three y's are (19) yi=XiX 2 +X 3 X4;, y 2 =XiX3+X 2 X4, y3=XiX4:+X 2 X3. To give a more formal proof that the y's given by (19) are the roots of (17), we employ (§ 20) Xi+X 2 -\-X3+X±= —b, XiX 2 X3+XiX2X± + XiX3Xi-\-X 2 X3X±= ~d, X\X 2 -\-XiX3-\-XiX±-\-X 2 X3-{-X 2 X4-\-X3X4 L = C, X\X 2 X3X± = e. From these four relations we conclude that 2/i+2/2+2/3 = c, 2/12/2+2/12/3 + 2/21/3= (Xi+X 2 + X3+X±)(XiX 2 X3-\r . . . +X 2 X3X4) — ±X X X 2 X3X± = bd — 4e, yiy2y3=(XlX 2 X3+ . . . ) 2 -\-XiX 2 X3X 4i {{Xi+ . . . ) 2 ~4:(XiX 2 + . . . )} = d 2 +e(6 2 -4c). Hence (§ 20) 2/1, 2/2, 2/3 are the roots of the cubic equation (17). 50. Discriminant. The discriminant A of the quartic equation (15) is defined to be the product of the squares of the differences of its roots: A=(Xi-X 2 ) 2 (Xi-X3) 2 (Xi-X4) 2 (x 2 -X3) 2 (x 2 -X±) 2 (X3-X4) 2 . 52 CUBIC AND QUARTIC EQUATIONS [Ch. IV The fact that A is equal to the discriminant of the resolvent cubic equation (17) follows at once from (19), by which 2/i — 2/2 = (xi—x±) (x2—xz), yi — y3 = (xi-X3)(x 2 -X4:), y2-y3 = (xi-x 2 )(x 3 -x±), (yi-y2) 2 (yi-y3) 2 (y2-y3) 2 = &. Hence (§44) A is equal to the discriminant — 4p 3 — 27 q 2 of the reduced cubic F 3 +pF+g = 0, obtained from (17) by setting y=Y-\-c/S. Thus (20) y = bd-4e-\c 2 , q= -tfe+lbcd+^ce-d 2 -^. Theorem. The discriminant of any quartic equation (15) is equal to the discriminant of its resolvent cubic equation and therefore is equal to the discriminant — 4p 3 — 27q 2 of the corresponding reduced cubic Y z -\-pY-\-q = 0, whose coefficients have the values (20). EXERCISES 1. Find the discriminant of a; 4 — 3x i +x 2 +3x — 2 = and show that the equation has a multiple root. 2. Show by its discriminant that x 4 — 8x 3 +22x 2 — 24z+9 = has a multiple root. 3. If a real quartic equation has two pairs of conjugate imaginary roots, show that its discriminant A is positive. Hence prove that, if A<0, there are exactly two real roots. 4. Hence show that x i — 3x 3 -\-3x 2 — Sx+2 = has two real and two imaginary roots. 51. Descartes' Solution of the Quartic Equation. Replacing x by 2 — 6/4 in the general quartic (15), we obtain the reduced quartic equation (21) z 4 +qz 2 +rz+s = 0, lacking the term with z 8 . We shall prove that we can express the left member of (21) as the product of two quadratic factors (z 2 +2kz+l)(z 2 -2kz+m)=z 4 +(l+m-4:k 2 )z 2 +2k(m-l)z+lm. The conditions are l-\-m — 4k 2 — q, 2k(m — l)=r, lm = s. If k^O, the first two give 2l = q+4k 2 -^, 2m = q+4k 2 +^. Inserting these values in 2l-2m = 4s, we obtain (22) 64/c 6 +32gfc 4 +4(g 2 -4s)/c 2 -r 2 = 0. § 51] DESCARTES' SOLUTION OF A QUARTIC 53 The latter may be solved as a cubic equation for k 2 . Any root fc 2 ^0 gives a pair of quadratic factors of (21) : (23) z 2 ±2kz+±q+2k 2 ^F^. The four roots of these two quadratic functions are the four roots of (21). This method of Descartes (1596-1650) therefore succeeds unless every root of (22) is zero, whence q = s = r = 0, so that (18.) is the trivial equation 24 = 0. * ! For example, consider z 4 — Sz 2 +6z — 2 = 0. Then (22) becomes 64& 6 -3-32/c 4 +4-17A; 2 -36 = 0. The value k 2 = l gives the factors z 2 -\-2z — 1, z 2 — 2z-\-2. Equating these to zero, we find the four roots -1±V2, 1±V-1. 52. Symmetrical Form of Descartes' Solution. To obtain this sym- metrical form, we use all three roots k\ 2 , k 2 2 , k 3 2 of (22). Then k{ 2 +k 2 2 +kf = ~k, h 2 k 2 2 ks 2 =^. It is at our choice as to which square root of k\ 2 is denoted by -\-k\ and which by —ki, and likewise as to ±k 2 , ±&3- For our purposes any choice of these signs is suitable provided the choice give (24) fci*afcs=-g. Let fci^O. The quadratic function (23) is zero for k = k\ if (z±k 1 r=-^-k 1 2 ±~=k 2 2 +k 3 2 ^ 8J ^^=(k 2 -Fk 3 ) 2 . Hence the four roots of the quartic equation (21) are (25) ki+k 2 +k 3j ki — k 2 — k3, —ki+k 2 — kz, —ki — k 2 +k 3 . EXERCISES 1. Solve Exs. 4, 5 of § 48 by the method of Descartes. 2. By writing y h y 2 , ys for the roots h 2 , k 2 2 , k 3 2 of (26) 64?/ 3 +32gt/ 2 +4(g 2 -4s)7/-r 2 = 0, show that the four roots of (21) are the values of (27) z = Vy~ 1 + Vy 2 + Vy~ i 54 CUBIC AND QUARTIC EQUATIONS [Ch. IV for all combinations of the square roots for which (28) Vyl-V^-Vy^- 1 -. 3. Euler (1707-1783) solved (21) by assuming that it has a root of the form (27). Square (27), transpose the terms free of radicals, square again, replace the last factor of 8 Vy 1 y 2 y3 (v 2/i+ v y 2 + v 2/s) by z, and identify the resulting quartic in 2 with (21). Show that j/i, 2/2, 2/3 are the roots of (26) and that relation (28) holds. 4. Find the six differences of the roots (25) and verify that the discriminant A of (21) is equal to the quotient of the discriminant of (26) by 4 6 . 5. In the theory of the inflexion points of a plane cubic curve there occurs the equation z i -Sz 2 -^Tz—^S 2 = 0. Show that (26) now becomes »-f '-« HW-(I and that the roots of the quartic equation are ± y/\s + = 0. Its discriminant is zero if d = 3\ 3, i' = 27; find h. 5. Find the admissible values of h in Ex. 4 when d = 12, v = 332.5. 6. Find a necessary and sufficient condition that quartic equation (15) shall have one root the negative of another root. Hint: (xi+x 2 ) (x 3 +x 4 ) =q — yi- Hence substitute q for y in (17). 7. In the study of parabolic orbits occurs the equation tan 2^ + 3 tan 3 ^v = t. Prove that there is a single real root and that it has the same sign as t. 8. In the problem of three astronomical bodies occurs the equation x 3 +az+2 = 0. Prove that it has three real roots if and only if a'S. —3. CHAPTER V The Graph of an Equation 53. Use of Graphs in the Theory of Equations. To find geometrically the real roots of a real equation f(x) = 0, we construct a graph of y =f(x) and measure the distances from the origin to the intersections of the graph and the rc-axis, whose equation is y = 0. For example to find geometrically the real roots of (1) 6x-3 = 0, we equate the left member to y and make a graph of (10 y = x 2 — 6# — 3. ClM Y f(7,a Q / „ \ 4 (0,-3): 11 *(6 r 3> 0,-8) i(5,-8) ( 2- 11)\ p; 1X4,-11) We obtain the parabola in Fig. 12. Of the points shown, P has the abscissa x = OQ = 4: and the ordinate y=— QP=— 11. From the points of intersection of t/ = (the rr-axis OX) with the parabola, we obtain the approximate (3.-12) values 6.46 and -0.46 of the roots of (1). Fig. 12 EXERCISES /\. Find graphically the real roots of x 2 — 6x+7 = 0. Hint: For each x, y = x 2 —6x+7 exceeds the y in (1') by 10, so that the new graph is obtained by shifting the parabola in Fig. 12 upward 10 units, leaving the axes OX and OY unchanged. What amounts to the same thing, but is simpler to do, we leave the parabola and OY unchanged, and move the axis OX downward 10 units. 2. Discuss graphically the reality of the roots of x 2 — 6x + 12 = 0. 3. Find graphically the roots of x 2 — 62 +9 = 0. 54. Caution in Plotting. If the example set were (2) y = %x±-Ux*-§x 2 +Ux-2, one might use successive integral values of x, obtain the points ( — 2, 180), 55 56 GRAPHS [Ch. V (-1, 0), (0, -2), (1, -6), (2, 0), (3, 220), all but the first and last of which are shown (by crosses) in Fig. 13, and be tempted to conclude that the graph is a U-shaped curve approximately like that in Fig. 12 and that there are just two real roots, — 1 and 2, of (2') 8x 4 -14a; 3 -9x 2 +llx-2-0. But both of these conclusions would be false. In fact, the graph is a W-shaped curve (Fig. 13) and the additional real roots are \ and ^. This example shows that it is often necessary to employ also values of x which are not integers. The purpose of the example was, however, not to point out this obvious fact, but rather to emphasize the chance of serious error in sketching a curve through a number of points, however numerous. The true curve between two points below the z-axis may not cross the z-axis, or may have a peak and actually cross the z-axis twice, or may be an M-shaped curve crossing it four times, etc. For example, the graph (Fig. 14) of (3) y = £ 3 +4z 2 -ll crosses the z-axis only once; but this fact can- not be established by a graph located by a num- ber of points, however numerous, whose abscissas are chosen at random. We shall find that correct conclusions regard- ing the number of real roots may be deduced from a graph whose bend points (§ 55) have been located. 55. Bend Points. A point (like M or M' in Fig. 14) is called a bend point of the graph of y =f(x) if the tangent to the graph at that point is horizontal and if all of the adjacent points of the graph lie below the tangent or all above the _ . Fig. 14 tangent. The first, but not the second, condi- Fig. 13 55] BEND POINTS 57 tion is satisfied by the point of the graph of y = x s given in Fig. 15 (see § 57). In the language of the calculus, f(x) has a (relative) maximum or minimum value at the abscissa of a bend point on the graph of y=f(x). Fig. 15 Fig. 16 Let P=(x,y) and Q = (x-\-h, Y) be two points on the graph, sketched in Fig. 16, of y—f{x). By the slope of a straight line is meant the tangent of the angle between the line and the rc-axis, measured counter-clockwise from the latter. In Fig. 16, the slope of the straight line PQ is (4) Y-y f( x +h)-f(x) h h For equation (3) , f(x) = x 3 +4z 2 — 11. Hence f(x+h) = (x+hf+4;(x+K) 2 - 11 = rc 3 +4z 2 -ll + (3z 2 +8x) ^+(3x4-4) h 2 +h* The slope (4) of the secant PQ is therefore here 3z 2 +8z+(3z+4)/i+/i 2 . Now let the point Q move along the graph toward P. Then In approaches the value zero and the secant PQ approaches the tangent at P. The slope of the tangent at P is therefore the corresponding limit 3z 2 +8z of the preceding expression. We call 3a: 2 +8:r the derivative of x 3 +4x 2 — 1 1 . 58 GRAPHS [Ch. V In particular, if P is a bend point, the slope of the (horizontal) tangent at P is zero, whence 3x 2 -{-8x = 0, x = or x=— f. Equation (3) gives the corresponding values of y. The resulting points M= (0,-11), M' = (--|,-f^ are easily shown to be bend points. Indeed, for :r.>0 and for x between — 4 and 0, x 2 (x+4) is positive, and hence f(x)> —11 for such values of x, so that the function (3) has a relative minimum at x = 0. Similarly, there is a relative maximum at x= — f. We may also employ the general method of § 59 to show that M and M' are bend points. Since these bend points are both below the x-axis we are now certain that the graph crosses the x-axis only once. The use of the bend points insures greater accuracy to the graph than the use of dozens of points whose abscissas are taken at random. 56. Derivatives. We shall now find the slope of the tangent to the graph of y=f(x), where /(re) is any polynomial (5) f(x)=a x n +a 1 x n - 1 + . . . -f-On-iS+a,,. We need the expansion of f(x-\-h) in powers of x. By the binomial theorem, a (x+h) n = aox n +na x n - 1 h + n ^~ 1) a , g n - 2 ft 2 + . . . , ai(x+h) n - 1 =a 1 x n - 1 + (n-l)a 1 x n - 2 h+ ( ™~ 1) !; n ~ 2 \ ix n - 3 h 2 + . . . , a n _ 2 (x+h) 2 = a n _ 2 x 2j r2a n _ 2 xh +a B _ 2 /i 2 , a a - 1 (x+h)=a n _ 1 x +a n ^ 1 h, The sum of the left members is evidently f(x-\-h). On the right, the sum of the first terms (i.e., those free of h) is f(x). The sum of the coef- ficients of h is denoted by f(x), the sum of the coefficients of \ln? is denoted by /"(#), • • • j the sum of the coefficients of h k 1-2 ... k § 56] DERIVATIVES, TAYLOR'S THEOREM 59 is denoted by / (i) (x) . Thus (6) f(x)=na x n - 1 + (n-l)a 1 x n - 2 + . . . +2a w _ 2 rc+o M _ l5 (7) f"(x)=n(n-l)aox n - 2 +(n-l) (n-2)arx n - 3 + . . . +2a n _ 2 , etc. Hence we have (8) Kx+h)=f(x)+f(x)h+r(^+r"(x 1-2 ,J K y l-2-3 h n r! n! + . , +/«(*£+ • • • +/ (ra) (*S where r! is the symbol, read r factorial, for the product 1-2-3 . . . (r—l)r. Here r is a positive integer, but we include the case r = by the definition, 0!=1. This formula (8) is known as Taylor's theorem for the present case of a polynomial /(a;) of degree n. We call fix) the (first) derivative of fix), and f"(x) the second derivative of f(x), etc. Concerning the fact that f"{x) is equal to the first derivative of f'(x) and that, in general, the kih derivative f a) (x) of f(x) is equal to the first derivative of / ( * _1) (a:)> see Exs. 6-9 of the next set. In view of (8), the limit of (4) as h approaches zero is f'(x). Hence f{x) is the slope of the tangent to the graph of y=f(x) at the point (x, y). In (5) and (6), let every a be zero except ao. Thus the derivative of aox n is naoX n ~ l , and hence is obtained by multiplying the given term by its exponent n and then diminishing its exponent by unity. For example, the derivative of 2x 3 is Qx 2 . Moreover, the derivative of f(x) is equal to the sum of the derivatives of its separate terms. Thus the derivative of x 3 + 4z 2 — 11 is 3x 2 -\-8x, as found also in § 55. EXERCISES 1. Show that the slope of the tangent to y = 8x 3 — 22x 2 + 13x—2 at (x, y) is 24x 2 - 44x+13, and that the bend points are (0.37, 0.203), (1.46, —5.03), approximately. Draw the graph. 2. Prove that the bend points of y = x 3 -2z-5 are (.82, -6.09), (-.82, -3.91), approximately. Draw the graph and locate the real roots. 3. Find the bend points of y = x 3 +6z 2 +8x+8. Locate the real roots. 4. Locate the real roots of f(x) =x 4 +x 3 — x — 2=0. Hints: The abscissas of the bend points are the roots of /'(x) =4x 3 +3x 2 — 1 =0. The bend points of y=f'(x) are (0, — 1) and (— \, — f ), so that ?{£) =0 has a single real root (it is just less than \). The single bend point of y=f(x) is (|, — f-g-), approxi- mately. 60 GRAPHS [Ch. V 5. Locate the real roots of x 6 — 7x i — 3x 2 +7 = 0. 6. Prove that/"(x), given by (7), is equal to the first derivative of f'(x). 7. If f(x) =fi(x) +fo(x) , prove that the kth. derivative of / is equal to the sum of the kth derivatives of /i and/ 2 . Use (8). 8. Prove that / (S) 0) is equal to the first derivative of f (k ~ 1) (x). Hint: prove this for f = ax m ; then prove that it is true for / =/i +/ 2 if true for/i and/ 2 . 9. Find the third derivative of x 6 +5x 4 by forming successive first derivatives; also that of 2x 5 — 7x 3 +x. 10. Prove that if g and k are polynomials in x,the derivative of gk is g'k+gk'. Hint: multiply the members of g(x+h)=g(x)+g'(x)h+ . . .and k(x+h) =k(x)+k'(x)h+ . . . and use (8) for f=gk. 57. Horizontal Tangents. If (x, y) is a bend point of the graph of y=f(x), then, by definition, the slope of the tangent at (x, y) is zero. Hence (§56), the abscissa a; is a root of f(x)=0. In Exs. 1-5 of the preceding set, it was true that, conversely, any real root of f(x)=Q is the abscissa of a bend point. However, this is not always the case. We shall now consider in detail an example illustrating this fact. The example is the one merely mentioned in § 55 to indicate the need of the second requirement made in our definition of a bend point. The graph (Fig. 15) of y = x? has no bend point since x 3 increases when x increases. Nevertheless, the derivative 3a; 2 of x 3 is zero for the real value x = 0. The tangent to the curve at (0, 0) is the horizontal lrae y = 0. It may be thought of as the limiting position of a secant through which meets the curve in two further points, seen to be equidistant from 0. When one, and hence also the other, of the latter points approaches 0, the secant approaches the position of tangency. In this sense the tangent at is said to meet the curve in three coincident points, their abscissas being the three coinciding roots of x 3 = 0. In the language of § 17, x 3 = has the triple root x = 0. The subject of bend points, to which we recur in § 59, has thus led us to a digression on the important subject of multiple roots. 58. Multiple Roots. In (8) replace x by a, and h by x-a. Then (9) ftx) =/(«)+/Xa)(^-«)+/^«)^^ 2 +/ /// («)^^ 3 + . . . (ra— 1)! m! By definition (§ 17) a is a root of /(re) =0 of multiplicity m ii fix) is exactly § 58] MULTIPLE ROOTS 61 divisible by (re — a) m , but not by (re — a) m+1 . Hence a is a root of multi- plicity m of /(re) =0 if and only if (10) f{a) = 0, f(a)=0, f"(a)=0, ..., /<—»(«) =0, f m \a)^0. For example, x i -\-2x 3 = has the triple root x = since is a root, and since the first and second derivatives 4x 3 +6x 2 and 12x 2 -)-12x are zero for x = 0, while the third derivative 24x+12 is not zero for x = 0. If in (9) we replace / by /' and hence / (4) by f a+1) , or if we differentiate every term with respect to x, we see by either method that (x-a) m - 2 (to- 2)! (11) f f (x) =/'(«)+/"(«)(*-«) + . . . +/<-» ^ w (to-1)! ^ Let /(re) and/'(rr) have the common factor (re— a) m ~ 1 , but not the com- mon factor (x—a) m , where m>l. Since (11) has the factor (x-a) m_1 , we have f'(a)=0, . . . , f (m ~ 1) (a)=0. Since also /(re) has the factor x — a, evidently /(a)=0. Then, by (9), fix) has the factor (x-a) m , which, by hypothesis, is not also a factor of /'(re). Hence, in (11), / (to) (q;)^0. Thus, by (10), a is a root of /(re) =0 of multiplicity to. Conversely, let a be a root of /(re) = of multiplicity to. Then rela- tions (10) hold, and hence, by (11), /'(re) is divisible by (x—a)" 1-1 , but not by (re— a) m . Thus /(re) and /'(re) have the common factor (re— a)" 1-1 , but not the common factor (re— a) m . We have now proved the following useful result. Theorem. If /(re) and f'(x) have a greatest common divisor g(x) involving x, a root of gf(rc)=0 of multiplicity to— 1 is a root of /(rc)=0 of multiplicity to, and conversely any root of /(re) = of multiplicity m is a root °f g( x ) = Q °f multiplicity to— 1. In view of this theorem, the problem of finding all the multiple roots of /(re) = and the multiplicity of each multiple root is reduced to the problem of finding the roots of g(x) =0 and the multiplicity of each. For example, let/(x) =x 3 — 2x 2 — 4x+8. Then fix) =3x 2 -4x-4, 9/(.r) =/'(x)(3x-2) -32(x-2). Since x — 2 is a factor of fix), it may be taken to be the greatest common divisor of fix) and fix), the choice of the constant factor c in c(x— 2) being here immaterial. Hence 2 is a double root of /(x) =0, while the remaining root —2 is a simple root. 62 GRAPHS [Ch. V EXERCISES 1. Prove that x 3 — 7x 2 +15x — 9 = has a double root. 2. Show that x 4 — 8x 2 +16 = has two double roots. 3. Prove that x 4 — 6x 2 — 8x — 3 = has a triple root. 4. Test x 4 -8x 3 +22x 2 -24x+9 = for multiple roots. 5. Test x 3 — 6x 2 +llx — 6 = for multiple roots. 6. Test x 4 -9x 3 +9x 2 +81x- 162=0 for multiple roots. 59. Ordinary and Inflexion Tangents. The equation of the straight line through the point (a, /3) with the slope s is y — (3 = s(x — a). The slope of the tangent to the graph of y=f(x) at the point (a, /3) on it is s =/'(«) by § 56. Also, /3 =/(«). Hence the equation of the tangent is (12) y=f(a)+f'(a)(x-a). By subtracting the members of this equation from the corresponding members of equation (9), we see that the abscissas x of the points of inter- section of the graph of y =f(x) with its tangent satisfy the equation J w 2 , -rj w 3! -i-....-rv w ( m -l)! ^ v y m! = 0. Here the term containing / (m_1) (a) must evidently be suppressed if ra = 2, since the term containing f (m) (a) then coincides with the first term. If a: is a root of multiplicity m of this equation, i.e., if the left member is divisible by (x—a) m , but not by (x—a) m+1 , the point (a, /3) is counted as m coincident points of intersection of the curve with its tangent (just as in the case of y = x 3 and its tangent y = in § 57). This will be the case if and only if (13) f"(a)=0, /'"(a)=0, . . . , /(— D(a)=0, f m) (a)^0, in which m>l and, as explained above, only the final relation f"(a)?^0 is retained if m = 2. If ra = 3, the conditions are f'(a) =0, f (3) (a)^0. For example, if /(x) =x 4 and a=0, then /"(0) =/"' (0) =0, / c4) (0) = 24^0, so that m = 4. The graph of y = x 4 is a [/-shaped curve, whose intersection with the tangent (the x-axis) at (0, 0) is counted as four coincident points of intersection. Given f(x) and a, we can find, as in the preceding example, the value of m for which relations (13) hold. We then apply the 59] ORDINARY AND INFLEXION TANGENTS 63 Theorem. If m is even (m>0), the points of the curve in the vicinity of the point of tangency (a, /3) are all on the same side of the tangent, which is then called an ordinary tangent. But if m is odd (ra> 1), the curve crosses the tangent at the point of tangency (a, /3), and this point is called an inflexion point, while the tangent is called an inflexion tangent. For example, in Fig. 15, OX is an inflexion tangent, while the tangent at any point except is an ordinary tangent. In Figs. 18, 19, 20, the tangents at the points marked by crosses are ordinary tangents, but the tangent at the point midway between them and on the y-axis is an inflexion tangent. To simplify the proof, we first take as new axes lines parallel to the old axes and intersecting at (a, /3). In other words, we set x — a = X, y — j3=Y, where X, Y are the coordinates of {x, y) referred to the new axes. Since j3 =/(«), the tangent (12) becomes Y=f(a)X, while, by (9), y = f( x ) = fi J rf'( a ) (x—a)-\- . . . becomes (14) Y=f(a)X+f"(*)¥-+ ml iX,Y) after omitting terms which are zero by (13). To simplify further the algebraic work, we pass to oblique axes, 1 the new ?/-axis coinciding with the F-axis, while the new z-axis is the tangent, the angle between which ajad the X-axis is designated by d. Then tan 6=f'(a). By Fig. 17, X = xcos6>, Y-y=f'{a)X. Hence when expressed in terms of the new coordinates x, y, the tangent is y = 0, while the equation (14) of the curve becomes f m) (a) cos" y=cx m +dx m+1 + 9*0. For x sufficiently small numerically, whether positive or negative, the sum of the terms after cx m is insignificant in comparison with cx m , 1 Since the earlier x, y do not occur in (14) and the new equation of the tangent, we shall designate the final coordinates by x, y without confusion. 64 GRAPHS [Ch. V so that y has the same sign as cx m (§ 64). Hence, if m is even, the points of the curve in the vicinity of the origin and on both sides of it are all on the same side of the x-axis, i.e., the tangent. But, if m is odd, the points with small positive abscissas x lie on one side of the rc-axis and those with numerically small negative abscissas lie on the opposite side. Our transformations of coordinates changed the equations of the curve and of its tangent, but did not change the curve itself and its tangent. Hence our theorem is proved. By our theorem, a is the abscissa of an inflexion point of the graph of y=f(x) if and only if conditions (13) hold with m odd (m>l). These conditions include neither f(a)=0 nor f'(a)=0, in contrast with (10). In the theory of equations we are primarily interested in the abscissas a of only those points of inflexion whose inflexion tangents are horizontal, and are interested in them, because we must exclude such roots a of f{x) =0 when seeking the abscissas of bend points, which are the important points for our purposes. A point on the graph at which the tangent is both horizontal and an ordinary tangent is a bend point by the definition in §55. Hence if we apply our theorem to the special case f'(a)=0, we obtain the following Criterion. Any root a of f'(x)=0 is the abscissa of a bend point of the graph of y =f(x) or of a point with a horizontal inflexion tangent according as the value of m for which relations (13) hold is even or odd. For example, if f(x) =x i , then a = and to =4, so that (0, 0) is a bend point of the [/-shaped graph of y = x i . If f(x)=x 3 , then a = and to = 3, so that (0, 0) is a point with a horizontal inflexion tangent (OX in Fig. 15) of the graph of y = x 3 . EXERCISES 1. If /(x)=3x 5 +5z 3 +4, the only real root of f'(x)=0 is x = 0. Show that (0, 4) is an inflexion point, and thus that there is no bend point and hence that f(x) = has a single real root. 2. Prove that x 3 — 3x 2 -\-3x-\-c = has an inflexion point, but no bend point. 3. Show that x b — ICte 3 — 20x 2 — 15x+c = has two bend points and no horizontal inflexion tangents. 4. Prove that 3x 5 — 40rr 3 +240x+c = has no bend point, but has two horizontal inflexion tangents. 5. Prove that any function x 3 —3ax 2 + ... of the third degree can be written in the form/(x) = (x— a) 3 +ax+b. The straight line having the equation y = ax-\rb meets the graph of y =f{x) in three coincident points with the abscissa a and hence is an inflexion tangent. If we take new axes of coordinates parallel to the old and inter- secting at the new oriein (a, 0), i.e., if we make the transformation x = X+a, y=?Y, 60] REAL ROOTS OF A CUBIC 65 of coordinates, we see that the equation /(x)=0 becomes a 1 educed cubic equation X 3 +pX+g=0(§42). 6. Find the inflexion tangent to y=x 3 +Qx 2 — 3x+l and transform x 3 +Qx 2 — 3x + 1=0 into a reduced cubic equation. 60. Real Roots of a Real Cubic Equation. It suffices to consider f{x)=x 3 -?>lx+q (1^0), in view of Ex. 5 above. Then /' = 3 (x 2 -l), f" = 6x. If l<0, there is no bend point and the cubic equation f(x) =0 has a single real root. If l>0, there are two bend points (VT, q-2lVl), (-VT, q+2lVT), which are shown by crosses in Figs. 18-20 for the graph of y=f(x) in the -2Z vT-=g-=2Z /I Fig. 20 three possible cases specified by the inequalities shown below the figures. For a large positive x, the term x 3 in f(x) predominates, so that the graph contains a point high up in the first quad- rant, thence extends downward to the right-hand bend point, then ascends to the left-hand bend point, and finally de- scends. As a check, the graph contains a point far down in the third quadrant, since for x negative, but sufficiently large numerically, the term a; 3 predominates and the sign of y is negative. If the equality sign holds in Fig. 18 or Fig. 19, a necessary and sufficient condition for which is q 2 = 4l 3 , one of the bend points is on the z-axis, and the cubic equation has a double root. The inequalities in Fig. 20 hold if and only if q 2 <4l 3 , which implies that 7>0. Hence x 3 — 3lx-\-q = Q has three distinct real roots if and only if q 2 <4l 3 , a single real root if and only if q 2 ~>4l 3 , a double root (necessarily real) if and only if q 2 — 4l 3 and 1^0, and a triple root if q 2 = 4l 3 — 0. 66 GRAPHS [Ch. V EXERCISES Find the bend points, sketch the graph, and find the number of real roots of /L. x 3 +2z-4 = 0. 2. x 3 -7x+7=0. 3. z 3 -2x-l=0. 4. x 3 +Qx 2 -dx+l=0. 5. Prove that the inflexion point of y = x 3 — 3lx+q is (0, q). 6. Show that the theorem in the text is equivalent to that in § 45. 7. Prove that, if m and n are positive odd integers and m>n, x m +px n +q = has no bend point and hence has a single real root if p>0; but, if p<0, it has just two bend points which are on the same side or opposite sides of the z-axis according as np\ m I nq m / \m—'i is positive or negative, so that the number of real roots is 1 or 3 in the respective cases. 8 . Draw the graph oiy = x i —x 2 . By finding its intersections with the line y = mx + b, solve x 4 — x 2 — mx — 6 = 0. 9. Prove that, if p and q are positive, x 2m — px 2n +q = has four distinct real roots, two pairs of equal roots, or no real root, according as np\ m ( nq \ m ~ n n J >0, =0, or <0. (x)=(x— Xi) . . . (X — X r ), then (a) and (b) have opposite signs. Thus a—xt and b—x% have opposite signs for at least one real root x%. (Lagrange.) 64. Sign of a Polynomial. Given a polynomial f{x)=a x n +a 1 x n - x + ... +a n (a ^0) with real coefficients, we can find a positive number P such that f(x) has the same sign as oqx 71 when x>P. In fact, /Cr)=z»(ao+4>), 0-^+*+...+* By the result in § 62, the numerical value of is less than that of a® when 1/x is positive and less than a sufficiently small positive number, say 1/P, and hence when x>P. Then ao+<£ has the same sign as ao, and hence f{x) the same sign as aoX n . The last result holds also when rr is a negative number sufficiently large numerically. For, if we set x=— X, the former case shows that /( — X) has the same sign as (— l) n aoX n when X is a sufficiently large positive number. § 65] ROLLE'S THEOREM 69 We shall therefore say briefly that, for re = +<*>, f(x) has the same sign as ao; while, for x— — oo, f(x) has the same sign as ao if n is even, but the sign opposite to ao if n is odd. EXERCISES 1. Prove that x z -\-ax i -\-bx— 4 = has a positive real root [use a: = and x = + co ]. 2. Prove that x 3 +ax 2 +bx+4i = has a negative real root [use x = and x = — co ]. 3. Prove that if a > and n is odd, a x n + . . . +a n = has a real root of sign opposite to the sign of a n [use x = — oo , 0, + oo ]. 4. Prove that x i +ax 3 +bx 2 +cx— 4 = has a positive and a negative root. 5. Show that any equation of even degree n in which the coefficient of x n and the constant term are of opposite signs has a positive and a negative root. 65. Rolle's Theorem. Between two consecutive real roots a and b of f(x) = 0, there is an odd number of real roots of fix) = 0, a root of multiplicity m being counted as m roots. Let f(x) = (x-a) T (x-b) s Q(x), a0 for re = 6, and hence vanishes an odd number of times between a and b (§63). But, in the left member, (x—a) (x — b) and/(:r) remain of constant sign between a and b, since f(x) = has no root between a and 6. Hence f'(x) vanishes an odd number of times. Corollary. Between two consective real roots a and /3 of f'(x)=0 there occurs at most one real root of f(x) =0. For, if there were two such real roots a and b of f(x) = 0, the theorem shows that f'(x) =0 would have a real root between a and b and hence be- tween a and /3, contrary to hypothesis. Applying also § 63 we obtain the Criterion. If a and /3 are consecutive real roots of f'(x) = 0, then f(x) = has a single real root between a and /3 if f(a) and f(/3) have opposite signs, but no root if they have like signs. At most one real root of f(x) =0 is greater than the greatest real root of f'(x) =0, and at most one real root of f(x) =0 is less than the least real root of f'(x) =0. 70 GRAPHS [Ch. V If f(a)=0 for our root a of f{x)=Q, a is a multiple root of f(x)=0 and it would be removed before the criterion is applied. Example. For/(x) = 3x 5 -25x 3 +60x-20, I L/'( a; )= X 4_5 x 2 + 4 = ( a .2_ 1 ) (£2_4) t Hence the roots of f'(x) =0 are ±1, ±2. Now f(_oo) = -oo, /(-2) = -36, /(-l) = -58, /(1)=18, /(2) = -4, /(+«)) = + oo. Hence there is a single real root in each of the intervals . (-1, 1), (1, 2), (2, + co), and two imaginary roots. The three real roots are positive. EXERCISES 1. Prove that x 5 — 5x+2=0 has 1 negative, 2 positive and 2 imaginary roots. 2. Prove that x 6 +x — 1=0 has 1 negative, 1 positive and 4 imaginary roots. 3. Show that x 5 — 3x 3 +2x 2 — 5 = has two imaginary roots, and a real root in each of the intervals (-2, -1.5), (-1.5, -1), (1, 2). 4. Prove that 4x 5 — 3x 4 — 2x 2 +4x — 10 = has a single real root. 5. Show that, if / (S) (x) =0 has imaginary roots, /(x) =0 has imaginary roots. 6. Derive Rolle's theorem from the fact that there is an odd number of bend points between a and b, the abscissa of each being a root of /'(x) =0 of odd multiplicity, while the abscissa of an inflexion point with a horizontal tangent is a root of /'(x) =0 of even multiplicity, CHAPTER VI Isolation of the Real Roots of a Real Equation 66. Purpose and Methods of Isolating the Real Roots. In the next chapter we shall explain processes of computing the real roots of a given real equation to any assigned number of decimal places. Each such method requires some preliminary information concerning the root to be computed. For example, it would be sufficient to know that the root is between 4 and 5, provided there be no other root between the same limits. But in the contrary case, narrower limits are necessary, such as 4 and 4.3, with the further fact that only one root is between these new limits. Then that root is said to be isolated. If an equation has a single positive root and a single negative root, the real roots are isolated, since there is a single root between — oo and 0, and a single one between and + oo . However, for the practical purpose of their computation, we shall need narrower limits, sufficient to fix the first significant figure of each root, for example -40 and -30, or 20 and 30. We may isolate the real roots of f(x) = by means of the graph of y=f(x). But to obtain a reliable graph, we saw in Chapter V that we must employ the bend points, whose abscissas occur among the roots f(x)=0. Since the latter equation is of degree n—l when f(x)=0 is of degree n, this method is usually impracticable when n exceeds 3. The method based on Rolle's theorem (§ 65) is open to the same objection. The most effective method is that due to Sturm (§68). We shall, however, begin with Descartes' rule of signs since it is so easily applied. Unfortunately it rarely tells us the exact number of real roots. 67. Descartes' Rule of Signs. Two consecutive terms of a real poly- nomial or equation are said to present a variation of sign if their coefficients have unlike signs. By the variations of sign of a real polynomial or equa- tion we mean all the variations presented by consecutive terms. Thus, in x 5 — 2x 3 — 4x 2 +3 = 0, the first two terms present a variation of sign, and likewise the last two terms. The number of variations of sign of the equation is two. 71 72 ISOLATION OF REAL ROOTS [Ch. VI Descartes' Rule. The number of positive real roots of an equation with real coefficients is either equal to the number of its variations of sign or is less than that number by a positive even integer. A root of multiplicity m is here counted as m roots. For example, x G —Sx 2 +x-\-l =0 has either twoior no positive roots, the exact number not being found. But 3a; 3 — x — 1=0 has exactly one positive root, which is a simple root. Descartes' rule will be derived in § 73 as a corollary to Budan's theorem. The following elementary proof 1 was communicated to the author by Professor D. R. Curtiss. Consider any real polynomial f(x) = a x n +a 1 x n - 1 + . . . +a l x n ~ % (oo^O, a,3*0). Let r be a positive real number. By actual multiplication, F{x)^{x-r)f{x)=A Q x n+l +Aix n + . . . +A l+l x n ~ l , where Ao = ao, Ai=ai—rao, A 2 = a 2 — rai, . . . , A l = ai—ra l ^ l , A l+1 = —ra l . In f(x) let a tl be the first non-vanishing coefficient of different sign from oq, let a i2 be the first non-vanishing coefficient following a tl and of the same sign as ao, etc., the last such term, a kv , being either a t or of the same sign as a % . Evidently v is the number of variations of sign of fix). For example, if/(x) =2x 6 +3x 5 — 4x 4 — 6x 3 -r-7x, we have y = 2, at, x =a%=* —4, at 2 =ah =7. Note that 04 = since x 2 is absent. The numbers Ao, A tv . . . , A^, A l+ i are all different from zero and have the same signs as ao, a 4l , . . . , a tv , —a h respectively. This is obviously true for Aq = oo and A i+i = —ra h Next, A^ is the sum of the non- vanishing number a ti and the number — ra^x, which is either zero or else of the same sign as a ti since a^-i is either zero or of opposite sign to a ti . Hence the sum A ti is not zero and has the same sign as a ki . By hypothesis, each of the numbers ao, a tv . . . , a kv after the first is of opposite sign to its predecessor, while — a } is of opposite sign to a t() . Hence each term after the first in the sequence Ao, A tl , . . . , A tv , A l+1 is of opposite sign to its predecessor. Thus these terms present v-\-l variations of sign. We conclude that F(x) has at least one more vari- ation of sign than f{x). But we may go further and prove the following 1 The proofs given in college algebras are mere verifications of special cases. § 67] DESCARTES' RULE OF SIGNS 73 Lemma. The number of variations of sign of F(x) is equal to that of f(x) increased by some positive odd integer. For, the sequence Aq, A\, . . . , A kl has an odd number of variations of sign since its first and last terms are of opposite sign; and similarly for the v sequences Ai v A tl+ i, . . . , A i2 ; A iv , At v+1 , . . . , A l+1 . The total number of variations of sign of the entire sequence Aq, A\, . . . , A l+1 is evidently the sum of the numbers of variations of sign for the y+1 partial sequences indicated above, and is thus the sum of v+1 posi- tive odd integers. Since each such odd integer may be expressed as 1 plus or a positive even integer, the sum mentioned is equal to v+1 plus or a positive even integer, i.e., to v plus a positive odd integer. To prove Descartes' rule of signs, consider first the case in which f(x) = has no positive real roots, i.e., no real root between and +oo. Then /(0) and /(oo) are of the same sign (§ 63), and hence the first and last coefficients of f(x) are of the same sign. 1 Thus f{x) has either no vari- ations of sign or an even number of them, as Descartes' rule requires. Next, let f{x) = have the positive real roots n, . . . , r fc and no others. A root of multiplicity m occurs here m times, so that the r's need not be distinct. Then f(x) = (x-n) . . . (x-r k )(p(x), where (x) is a polynomial with real coefficients such that 4>(x)=0 has no positive real roots. We saw in the preceding paragraph that 4>(x) has either no variations of sign or an even number of them. By the Lemma, the product (x — r k )4>(x) has as the number of its variations of sign the number for (x) increased by a positive odd integer. Similarly when we introduce each new factor x — r t . Hence the number of varia- tions of sign of the final product f(x) is equal to that of (a;) increased by k positive odd integers, i.e., by k plus or a positive even integer. Since (x) has either no variations of sign or an even number of them, the number of variations of sign of f(x) is k plus or a positive even integer, a result equivalent to our statement of Descartes' rule. 1 In case f(x) has a factor x n ~ l , we use the polynomial f(x)/x n ~ instead of f(x) in this argument. 74 ISOLATION OF REAL ROOTS [Ch. VI If — p is a negative root of f(x) = 0, then p is a positive root of /( — x) = 0. Hence we obtain the Corollary. The number of negative roots of f(x) = is either equal to the number of variations of sign of f(—x) or is less than that number by a positive even integer. For example, x 4 +3x 3 +x — 1=0 has a single negative root, which is a simple root, since x 4 — 3x 3 — x — 1 =0 has a single positive root. As indicated in Exs. 10, 11 below, Descartes' rule may be used to isolate the roots. EXERCISES Prove by Descartes' rule the statements in Exs. 1-8, 12, 15. 1. An equation all of whose coefficients are of like sign has no positive root. Why is this self-evident? 2. There is no negative root of an equation, like x 5 — 2x 4 — 3x 2 +7x — 5 = 0, in which the coefficients of the odd powers of x are of like sign, and the coefficients of the even powers (including the constant term) are of the opposite sign. Verify by taking x = — p, where p is positive. 3. x 3 +a 2 x+i> 2 = has two imaginary roots if b^O. 4. For n even, x n — 1 =0 has only two real roots. 5. For n odd, x n — 1 =0 has only one real root. 6. For n even, x n +l =0 has no real root; for n odd, only one. 7. x 4 +12x 2 +5x — 9 = has just two imaginary roots. 8. x i -\-a 2 x 2 +b 2 x — c 2 = (c^O) has just two imaginary roots. 9. Descartes' rule enables us to find the exact number of positive roots only when all the coefficients are of like sign or when f(x)=X U + piX n ~ 1 + . . . +p n - s X S — p n - s + iX S ~ 1 - . . . —p n = 0, each pi being ^0. Without using that rule, show that the latter equation has one and only one positive root r. Hints: There is a positive root r by § 63 (a = 0, b= oo ). Denote by P(x) the quotient of the sum of the positive terms by x s , and by —N(x) that of the negative terms. Then N(x) is a sum of powers of 1/x with positive coef- ficients. If x>r, P(x)>P(r), N(x)0; If xN(r), /(x)<0. 10. Prove that we obtain an upper limit to the number of real roots of /(x)=0 between a and b, if we set a-\-by / x — a '■y l+y \ b—xj multiply by (l+y) n , and apply Descartes' rule to the resulting equation in y. § 68] STURM'S METHOD 75 11. Show by the method of Ex. 10 that there is a single root between 2 and 4 of x 3 +z 2 -17x+15 = 0. Here we have 27y 3 +Zy 2 -23y-7 = 0. 12. In the astronomical problem of three bodies occurs the equation r 5 + (3-M)r 4 +(3-2 M )r 3 -Mr 2 -2 M r-M = 0, where 0= 1 by insert- ing the computed values of /, /', /* for x = 1 . Experience shows that most students make some error in finding / 2 , / 3 , . . . , so that checking is essential. 2 The notation /i instead of the usual /', and similarly /o instead of /, is used to reg- ularize the notation of all the f's, and enables us to write any one of the equations (1) in the single notation (3) . 3 If the division process did not yield ultimately a constant remainder 5^0, / and /» would have a common factor involving x, and hence f(x) =0 a multiple root. §*69] STURM'S THEOREM 77 a root of f{x) = 0, the number of real roots of f{x) =0 between a and b is equal to the excess of the number of variations of sign of (2) /Or), fx(x), f 2 (x), ...-, /,_!(*), /. for x = a over the number of variations of sign for x = b. Terms which vanish are to be dropped out before counting the variations of sign. For brevity, let V x denote the number of variations of sign of the numbers (2) when x is a particular real number not a root of f(x) = 0. First, if x\ and X2 are real numbers such that no one of the continuous functions (2) vanishes for a va.iue of x between x\ and X2 or for x = x\ or x = x%, the values of any one of these functions for x = xi and x = X2 are both positive or both negative (§ 63). and therefore V Xl =V X2 . Second, let p be a root of f t (x) = 0, where l£i)Fi+i(p), c i+1 >0, k i+1 ( P )>0. Thus F t _i and F i+l have opposite signs for x = p. We proceed as in § 69. Example 1. If f{x) =x 3 +6x — 10, /i = 3(a; 2 +2) is always positive. Hence we may employ / and Fi = 1 . For x = — , there is one variation of sign ; for x = + oo, no variation. Hence there is a single real root; it lies between 1 and 2. Example 2. If f(x) = 2x i — 13x 2 — lOx — 19, we may take /i=4a; 3 -13x-5. Then 2/=x/ 1 -/ 2 , / 2 = 13x 2 + 15x+38 = 13(x+if) 2 +iffi Since / 2 is always positive, we need go no further (we may take F 2 — l). For x= — oo , the signs are -\ 1-; forx = + oo,+ + +. Hence there are two real roots. The signs for x = are — h Hence one real root is positive and the other negative. EXERCISES Isolate by Sturm's theorem the real roots of 1. x 3 +3x 2 -2x-5 = 0. 2. x 4 + 12x 2 +5x-9 = 0. 3. x 3 -7x-7=0. 4. 3x 4 -6x 2 +8x-3=0. 5. x 6 +6x 5 -30x 2 -12x-9=0 [stop with / 2 ] . 6. x 4 -8x 3 +25x 2 -36x+8 = 0. 7. For/ = x 3 +px+g (p^0), show that/i = 3x 2 +p, f 2 = -2px-3g, 4p 2 /i = ( - Gpx +9g)/, -fa / 3 = - 4p 3 - 27g 2 , so that fz is the discriminant A (§ 44). Let [p] denote the sign of p. Then the signs of /, /i, ft, /s are - + +[p] [A] forx=-oo, + + -[p] [A] forx= + ^o. For A negative there is a single real root. For A positive and therefore p negative, there are three distinct real roots. For A = 0, f% is a divisor of /i and/, so that x = — 3q/ (2p) is a double root. 80 ISOLATION OF REAL ROOTS [Ch. VI 8. Prove that if one of Sturm's functions has p imaginary roots, the initial equation has at least p imaginary roots. 9. State Sturm's theorem so as to include the possibility of a, or b, or both a and b being roots of f(x) =0. 71. Sturm's Functions for a Quartic Equation. For the reduced quar- tic equation f(z) = 0, ' / = z 4L -{-qz 2 -\-rz-\-s, (5) \ fi=4z*+2qz+r, . J2= — 2qz 2 — 3rz— 4s. Let q9^0 and divide q 2 fi by/2. The negative of the remainder is (6) f 3 = Lz-12rs-rq 2 , L = Sqs-2q 3 -9r 2 . Let L=^0. Then f± is a constant which is zero if and only if /=0 has multiple roots, i.e., if its discriminant A is zero. We therefore desire /t expressed as a multiple of A. By § 50, (7) A=-4P 3 -27Q 2 , P=-4s-^, Q = ^ qs -r 2 -^. We may employ P and Q to eliminate (8) 4s=-P-| 2 , r 2 =-Q-%qP-^ T q s . We divide L 2 J2 by (9) f 3 = Lz+3rP, L = 9Q+4gP. The negative of the remainder x is (10) 18r 2 qP 2 - 9r 2 LP+4sL 2 = q 2 A. The left member is easily reduced to q 2 A. Inserting the values (8) and replacing L 2 by L(9Q+4gP), we get - 18gQP 2 - 12q 2 P 3 - i£q 4 P 2 +2qP 2 L+±q 3 PL - 3q 2 QL. Replacing L by its value (9), we get q 2 A. Hence we may take (11) / 4 = A. Hence if gLA^O, we may take (5), (9), (11) as Sturm's functions. 1 Found directly by the Remainder Theorem (§ 14) by inserting the root 2= —SrP/L of / 3 = into L 2 / 2 . § 71] STURM'S FUNCTIONS FOR A QUARTIC 81 Denote the sign of q by [q\. The signs of Sturm's functions are + - -[q] -[L] [A] forx=- oo, + + -k] [L] [A] iorx=+oo. First, let A>0. If q is negative and L is positive, the signs are H 1 h and + + + + + , so that there are four real roots. In each of the remaining three cases for q and L, there are two variations of sign in either of the two series and hence there is no real root. Next, let A<0. In each of the three cases in which q and L are not both positive, there are three variations of sign in the first series and one variation in the second, and hence just two real roots. If q and L are both positive, the number of variations is 1 in the first series and 3 in the second, so that this case is excluded by the Corollary to Sturm's theorem. To give a direct proof, note that, by the value of L in (6), L>0, q>0 imply 4s>3 2 , i.e., s>0, and hence, by (7), P is negative, so that each term of (10) is 2:0, whenceA>0. Hence, if gLA^O, there are four distinct real roots if and only if A and L are positive, and q negative; two distinct real and two imaginary roots if and only if A is negative. Combining this result with that in Ex. 4 below, we obtain the Theorem. If the discriminant A of zt+qz 2 -\-rz-\-s = is negative, there are two distinct real roots and two imaginary roots; if A>0, q<0, L>0, four distinct real roots; if A > and either qhO or L £ 0, no real roots. Here L = 8qs-2q 3 -9r 2 . Our discussion furnished also the series of Sturm functions, which may be used in isolating the roots. EXERCISES 1. If qA^O, L = 0, then / 3 = 3rP is not zero (there being no multiple root) and its sign is immaterial in determining the number of real roots. Prove that there are just two real roots if q<0, and none if q>0. By (10), q has the same sign as A. 2. If r A 5^0, q = 0, obtain — f 3 by substituting z = — 4s/(3r) in /i. Show that we may take/ 3 = rA and that there are just two real roots if A<0, and no real roots if A>0. 3. If A=^0, 0. Since A=256s 3 , check by solving z 4 -|-.s = 0. 4. If A 7^0, qL = 0, there are just two real roots if A<0, and no real roots if A>0. [Combine the results in Exs. 1-3.] 5. Apply the theorem to Exs. 2, 4, 6 of § 70. 6. Isolate the real roots of Exs. 3, 4, 5 of § 48. 82 ISOLATION OF REAL ROOTS [Ch. VI 72. Sturm's Theorem for the Case of Multiple Roots. We might remove the multiple roots by dividing /(z) by 1 f n (x), the greatest com- mon divisor of fix) and f\ =f'(x) ; but this would involve considerable work, besides wasting the valuable information in hand. As before, we suppose /(a) and /(&) different from zero. We have equations (1) in which /„ is now not a constant. The difference V a —V b is the number of real roots between a and b, each multiple root being counted only once. (i [^ If p is a root of f t (x) = 0, but not a multiple root of f(x)—0, then /<_i(p) is not zero. For, if it were zero, x — p would by (1) be a common factor of / and /i. We may now proceed as in the second case in § 69. The third case requires a modified proof only when r is a multiple root. Let r be a root of multiplicity m, ra^.2. Then./(r), f(r), . . . , f {m ~ l) (r) are zero and, by Taylor's theorem, f'(r-\-v) = f (m) 0) + J v^W 1-2. . . (m-iy K) ^ These have like signs if p is a positive number so small that the signs of the polynomials are those of their first terms. Similarly, f(r — p) and f(r—p) have opposite signs. Hence / and f\ show one more variation of sign for x = r — p than for x^r+p. Now (x— r) w_1 is a factor of / and /i and hence, by (1), of f 2 , . . . , /„. Let their quotients by this factor be , <£i, . . • , 4> n - Then equations (1) hold after the/'s are replaced by the 4>'s. Taking p so small that 4>iix)=0 has no root between r — p and r-\-p, we see by the first and second cases in § 69 that 0i, . . . , 4> n show the same number of variations of sign for x = r— p as for x = r-\-p. The same is true for /i, . . . , /„ since the products of i, . . . , ^„ by (x — r) m_1 have for a given x the same signs as 0i . . . , ^or the same signs as —fa, ... , — M . Hence (4) is proved and consequently the present theorem. 1 The degree of f(x) is not n, nor was it necessarily n in § 69. 73] BUDAN'S THEOREM 83 EXERCISES 1. For/ = x 4 -8x 2 4-16, prove that F 1 = x 3 -4x, F 2 = x 2 -4, F x =xF 2 . Hence n = 2. Verify that V-x—2, Foo=0, and that there are just two real roots, each a double root. Discuss similarly the following equations. 2. x 4 -5x 3 +9x 2 -7x+2 = 0. 3. x 4 +2x 3 -3x 2 -4x+4 = 0. 4. x 4 -x 2 -2x+2 = 0. 73. Budan's Theorem. Let a and b be real numbers, ao and = b. This inclusive theorem has been proved, by means of Rolle's theorem, by A. Hurwitz, M athematische Annalen, Vol. 71, 1912, p. 584, who extended Budan's theorem from the case of a polynomial to a function /(x) which is real and regular f or a ^ x < b . 84 ISOLATION OF REAL ROOTS [Ch. VI the first derivative f (i+1) (x) of f {i) (x) is not zero for x = r. As in the third step (now actually the case 4 = 0) in § 69, f (i) (x) and f ii+1) (x) show one more variation of sign for x — r — p than for x = r-\-p, where p is a sufficiently small positive number. If i>0, f (i) is preceded by a term / (i_1) in (12). By hypothesis, / w_1) (x)^0 for x — r and hence has the same sign for x = r — p and x = r-\-p when p is sufficiently small. For these values of x, f li) (x) has opposite signs. Hence / (i_1) and / (i) show one more or one less variation of sign for x = r — p than for x = r-\-p, so that/ (i_1) , f H) , / (i+1) show two more variations or the same number of variations of sign. Next, let no term of the series (12) vanish for x = a or for x = 6, but let several sucessive terms (13) /»(*), f« +1 Kx), ." . • , /»+'-»(*) all vanish for a value r of x between a and 6, while f {i+j) {r) is not zero, but is say positive. 1 Let I\ be the interval between r — p and r, and I2 the interval between r and r-\-p. Let the positive number p be so small that no one of the functions (13) or f (i+j) (x) is zero in these intervals, so that the last function remains positive. Hence f (i+} ~ l) {x) increases with x (since its derivative is positive) and is therefore negative in 7i and positive in I2. Thus f ii+j ~ 2) (x) decreases in 7i and increases in I2 and hence is positive in each interval. In this manner we may verify the signs in the following table: f(i) /(* + !> f(f+2) ... f(i+J-3) f(i+j-2) JH+j-l) f{i+j) h I (->• (-)'- 1 (~>- 2 ••• + + h I + + + ...+ + + + Hence these functions show j variations of sign in 7i and none in I2. If i>0, the first term of (13) is preceded by a function / (i_1) (x) which is not zero for x = r, and hence not zero in 7i or 72 if p is sufficiently small. If j is even, the signs of / w_1) and / (i) are H — h or h in both 7i and 72, showing no loss in the number of variations of sign. If j is odd, their signs are 7i + - or 7 2 + + - + so that there is a loss or gain of a single variation of sign. Hence f«-D fa) / (1+1) , • f- i+J) 1 If negative, all signs in the table below are to be changed; but the conclusion holds § 73] BU DAN'S AND DESCARTES' THEOREMS 85 show a loss of j variations of sign if j is even, and a loss of j± 1 if j is odd, and hence always a loss of an even number h of variations of sign. If i = 0, f m =f has r as a j-f old root and the functions in the table show j more variations of sign for x = r — p than for x = r-\-y. Thus, when no one of the functions (12) vanishes for x = a or for x = b, the theorem follows as at the end of § 69 (with unity replaced by the multiplicity of a root). Finally, let one of the functions (12), other than f(x) itself, vanish for x = a or for x = 6. If 8 is a sufficiently small positive number, all of the N roots of f(x)=0 between a and b lie between a-\-8 and b—8, and for the latter values no one of the functions (12) is zero. By the above proof, V a+5 -V J) - 5 = N+2t, V a -V a+S =2j, V b - S -V b =2s, where t, j, s are integers 2: 0. Hence V a — V b = N-{-2(t-\-j-\-s). Descartes' rule of signs (§ 67) is a corollary to Budan's theorem. Con- sider any equation with real coefficients f(x)=aox n +aiX n ~ 1 + . . . -\-a n _ 1 x+a n = 0, having a„^0. For x — the functions (12) have the same signs as 0-n, O n -l) • • • } a i> a o- Hence Vq is equal to the number V of variations of sign of f(x) . For x = + oo , the functions all have the same sign, which is that of ao. Thus Fo — F^oo = ^ is either the number of positive roots or exceeds that number by a positive even integer. Finally, Descartes' rule holds if a„ = 0, as shown by removing the factors x. EXERCISES Isolate by Budan's theorem the real roots of 1. x 3 -x 2 -2x + l=0. 2. x 3 +3x 2 -2a;-5=0. 3. Prove that if f(a) ?^0, V a equals the number of real roots >a or exceeds that number by an even integer. 4. Prove that there is no root greater than a number making each of the functions (12) positive, if the leading coefficient of /(x) is positive. (Newton.) 5. Hence verify that x i — 4x 3 — 3x+23 =0 has no root >4. 6. Show that x 4 — 4x 3 +x 2 +6x+2 = has no root >3. CHAPTER VII Solution of Numerical Equations 74. Horner's Method. 1 After we have isolated a real root of a real equation by one of the methods in Chapter VI, we can compute the root to any desired number of decimal places either by Horner's method, which is available only for polynomial equations, or by Newton's method (§75), which is applicable also to logarithmic, trigonometric, and other equations. To find the root between 2 and 3 of (1) x 3 - 2a:- 5 = 0, set x = 2-\-p. Direct substitution gives the transformed equation for p: (2) p 3 +6p 2 +10p-l = 0. The method just used is laborious especially for equations of high degree. We next explain a simpler method. Since p = x — 2, x 3 -2x-5 = (x-2) 3 +6(x-2) 2 + 10(x-2)-l, identically in x. Hence —1 is the remainder obtained when the given polynomial x 3 — 2x — 5 is divided by x — 2. By inspection, the quotient Q is equal to O-2) 2 +6(:r-2) + 10. Hence 10 is the remainder obtained when Q is divided by x — 2. The new quotient is equal to (x — 2)-\-G, and another division gives the remainder 6. Hence to find the coefficients 6, 10, — 1 of the terms follow- ing p 3 in the transformed equation (2), we have only to divide the given polynomial x 3 — 2x— 5 by z — 2, the quotient Q by x—2, etc., and take the remainders in reverse order. However, when this work is performed by synthetic division (§ 15) as tabulated below, no reversal of order is 1 W. G. Horner, London Philosophical Transactions, 1819. Earlier (1804) by P. Ruffini. See Bulletin American Math. Society, May, 1911. 86 74] HORNER'S METHOD 87 necessary, since the coefficients then appear on the page in their desired order. 10-2-5 |2 2 4 4 1 2 2 -1 2 8 1 4 2 10 Thus 1, 6, 10, — 1 are the coefficients of the desired equation (2). To obtain an approximation to the decimal p, we ignore for the moment the terms involving p 3 and p 2 ; then by lOp— 1 = 0, p = 0.1. But this value is too large since the terms ignored are all positive. For p = 0.09, the polynomial in (2) is found to be negative, while for p = 0.1 it was just seen to be positive. Hence p = 0.09-\-h, where h is of the denomination thousandths. The coefficients 1, 6.27, ... of the transformed equation for h appear in heavy type just under the first zigzag line in the following scheme: 10.09 0.05 11.1 = 0.004 1 6 0.09 10 0.5481 -1 0.949329 1 6.09 0.09 10.5481 0.5562 -0.050671 1 6.18 0.09 11.1043 0.025096 0.044517584 1 6.27 0.00 4 1 6.274 0.004 11.129396 0.025112 -0.006153416 1 6.278 0.004 11 . 154508 1 6.28 2 Hence z = 2.094+^ where t is a root of * 3 +6.282Z 2 +ll. 154508*- 0.006153416 = 0. By the last two terms, t is between 0.0005 and 0.0006. Then the value 88 SOLUTION OF NUMERICAL EQUATIONS [Ch. VII of C=t 3 +6.282t 2 is found to lie between 0.00000157 and 0.00000227. Hence we may ignore C provided the constant term be reduced by an amount between these limits. Whichever of the two limits we use, we obtain the same dividend below correct to 6 decimal places. 11.154508 | 0.006151 | 0.000551 =1 5577 574 558 16 11 Since the quotient is 0.0005 +, only two decimal places of the divisor are used, except to see by inspection how much is to be carried when making the first multiplication. Hence we mark a cross above the figure 5 in the hundredths place of the divisor and use only 11.15. Before making the multiplication by the second significant figure 5 of the quotient t, we mark a cross over the figure 1 in the tenths place of the divisor and hence use only 11.1. Thus x = 2.0945514+ , with doubt only as to whether the last figure should be 4 or 5. If we require a greater number of decimal places, it is not necessary to go back and construct a new transformed equation from the equation in t. We have only to revise our preceding dividend on the basis of our present better value of t. We now know that t is between 0.000551 and 0.000552. To compute the new value of the correction C, in which we may evidently ignore t s , we use logarithms. log 5.51 = .74115 log 5.52 = .74194 /. log 5 . 5 1 2 = 1 . 48230 .'. log 5 . 52 2 = 1 . 48388 log 6.282= .79810 log 6.282= .79810 log 190.72 =2.28040 log 191.42 =2.28198 Hence C is between 0.000001907 and 0.000001915. Whichever of the two limits we use, we obtain the same new dividend below correct to 8 decimal places. §74] HORNER'S METHOD X X XX X 11.154508 0.00615150 557725 57425 55773 1652 1115 537 446 91 89 0.00055148 LIBRARY SEISM0L0GIC OBSERVATORY Hence, finally, x = 2.094551482, with doubt only as to the last figure. EXERCISES (The number of transformations made by synthetic division should be about half the number of significant figures desired for a root.) By one of the methods in Chapter VI, isolate each real root of the following equa- tions, and compute each real root to 5 decimal places. 1. x 3 +2x4-20 = 0. 3. x 3 +x 2 -2x-l=0. 5. x 4 - ll,727x+40,385 = =0. 2. x 3 +3x 2 -2x-5 = 0. 4. x 4 +4x 3 -17.5x 2 -l8x-f-58.5=0. 6. x 3 = 10. Find to 7 decimal places all the real roots of 7. x 3 +4x 2 -7 = 0. 8. x 3 -7x-7 = 0. Find to 8 decimal places 9. The root between 2 and 3 of x 3 — x— 9 = (make only 3 transformations). 10. The real cube root of 7.976. 11. The abscissa of the real point of intersection of the conies y = x 2 , xy+x+Sy — 6 = 0. 12. Find to 3 decimal places the abscissas of the points of intersection of x 2 +y 2 = 9, y = x 2 — x. 13. A sphere two feet in diameter is formed of a kind of wood a cubic foot of which weighs two-thirds as much as a cubic foot of water (i.e., the specific gravity of the wood is 2/3). Find to four significant figures the depth h to which the floating sphere will sink in water. Hints: The volume of a sphere of radius r is ^vr 3 . Hence our sphere whose radius 90 SOLUTION OF NUMERICAL EQUATIONS [Ch. VII is 1 foot weighs as much as ■§*'"§■ cubic feet of water. The volume of the submerged portion of the sphere is Trh 2 (r—^h) cubic feet. Since this is also the volume of the dis- placed water, its value for r = \ must equal -g-n--f Hence h 3 — 3/i 2 +§ = 0. 14. If the specific gravity of cork is 1/4, find to four significant figures how far a cork sphere two feet in diameter will sink in water. 15. Compute cos 20° to four decimal places by use of cos SA =4 cosM.— 3 cos A, cos 60° = -g-. 16. Three intersecting edges of a rectangular parallelopiped are of lengths 6, 8, and 10 feet. If the volume is increased by 300 cubic feet by equal elongations of the edges, find the elongation to three decimal places. 17. Given that the volume of a right circular cylinder is aw and the total area of its surface is 2j3tt, prove that the radius r of its base is a root of r 3 — /3r+a = 0. If a = 56, (3 = 28, find to four decimal places the two positive roots r. The corresponding altitude is a/r 2 . 18. What rate of interest is implied in an offer to sell a house for $2700 cash, or in annual installments each of $1000 payable 1, 2, and 3 years from date? Hint: The amount of $2700 with interest for 3 years should be equal to the sum of the first payment with interest for 2 years, the amount of the second payment with interest for 1 year, and the third payment. Hence if r is the rate of interest and we write x for 1 +r, we have 2700 x 3 = 1000 x 2 +1000 a; +1000. 19. Find the rate of interest implied in an offer to sell a house for $3500 cash, or in annual installments each of $1000 payable 1, 2, 3, and 4 years from date. 20. Find the rate of interest implied in an offer to sell a house for $3500 cash, or $4000 payable in annual installments each of $1000, the first payable now. 75. Newton's Method. Prior to 1676, Newton 1 had already found the root between 2 and 3 of equation (1). He replaced x by 2+p and obtained (2). Since p is a decimal, he neglected the terms in p z and p 2 , and hence obtained p = 0.1, approximately. Replacing p by 0.1+g in (2), he obtained g 3 +6.3g 2 +11.23g+0.061 = 0. Dividing —0.061 by 11.23, he obtained —0.0054 as the approximate value of q. Neglecting q 3 and replacing q by — 0.0054+r, he obtained 6.3r 2 + 11. 16196r+0.000541708 = 0. Dropping 6.3r 2 , he found r and hence x = 2+0. 1 - 0.0054 - 0.00004853 = 2.09455147, 1 Isaac Newton, Opuscula, I, 1794, p. 10, p. 37. 75] NEWTON'S METHOD 91 of which all figures but the last are correct (§ 74). But the method will not often lead so quickly to so accurate a value of the root. Newton used the close approximation 0.1 to p, in spite of the fact that this value exceeds the root p and hence led to a negative correction at the next step. This is in contrast with Horner's method in which each correction is positive, so that each approximation must be chosen less than the root, as 0.09 for p. Newton's method may be presented in the following general form, which is applicable to any equation f(x) = 0, whether f(x) is a polynomial or not. Given an approximate value a of a real root, we can usually find a closer approximation a+h to the root by neglecting the powers h 2 , h 3 , . . . of the small number h in Taylor's formula (§ 56) f(a+h) =f(a)+f(a)h+f"(a)^+ and hence by taking f(a)+f'(a)h = 0, h = -f(fl) /'(a) ' We then repeat the process with ai = a-\-h in place of the former a. Thus in Newton's example, f(x) =x z — 2x — 5, we have, for a = 2, -/(2)_1 /'(2) 10' h- ai = a-\-h = 2.l, hi = -/(2.1) -0.061 /'(2-1) 11.23 = -0.0054 76. Graphical Discussion of Newton's Method. Using rectangular coordinates, consider the graph of y=f(x) and the point P on it with the abscissa OQ = a (Fig. 22). Let the tangent at P meet the a>axis at T Fig. 22 Fig. 23 92 SOLUTION OF NUMERICAL EQUATIONS [Ch. VII and let the graph meet the z-axis at S. Take h = QT, the subtangent. Then ~f(a) QP=f(a), /'(a)=tanX7T = - h ' h = -f(a) f'(a) " In the graph in Fig. 22, OT = a-\-h is a better approximation to the root OS than OQ = a. The next step (indicated by dotted lines) gives a still better approximation OT\. If, however, we had begun with the abscissa a of a point Pi in Fig. 22 near a bend point, the subtangent would be very large and the method would probably fail to give a better approximation. Failure is certain if we use a point P2 such that a single bend point lies between it and S. We are concerned with the approximation to a root previously isolated as the only real root between two given numbers a and 8. These should be chosen so nearly equal that f'(x) =0 has no real root between a and 8, and hence/(a;) = y has no bend point between a and 8. Further, if f"(x) = has a root between our limits, our graph will have an inflexion point with an abscissa between a and 8, and the method will likely fail (Fig. 23). Let, therefore, neither f'(x) nor f"(x) vanish between a and /3. Since /" preserves its sign in the interval from a to /3, while / changes in sign, /" and / will have the same sign for one end point. According as the abscissa of this point is a or (3, we take a = a or a = 8 for the first step of Newton's process. In fact, the tangent at one of the end points meets the x-axis at a point T with an abscissa within the interval from a. to 8. If f'(x) is positive in the interval, so that the tangent makes an acute angle with the z-axis, we have Fig. 24 or Fig. 25; if f is negative, Fig. 26 or Fig. 22. « T Fig. 24 Fig. 25 Fig. 26 § 76] GRAPHICAL DISCUSSION OF NEWTON'S METHOD 93 In Newton's example, the graph between the points with the abscissas a = 2 and /3 = 3 is of the type in Fig. 24, but more nearly like a vertical straight line. In view of this feature of the graph, we may safely take a=a, as did Newton, although our general procedure would be to take a =/3. The next step, however, accords with our present process; we have a = 2, = 2.1 in Fig. 24 and hence we now take a = /3, getting 0.061 = 0.0054 11 23 as the sub tangent, and hence 2.1 —0.0054 as the approximate root. If we have secured (as in Fig. 24 or Fig. 26) a better upper limit to the root than /3, we may take the abscissa c of the intersection of the chord A B with the a>axis as a better lower limit than a. By similar triangles, -f(a) :c-a=m : (3-c, whence a/(/3)-/3/(a) (3) /G8)-/(«) This method of finding the value of c intermediate to a and /3 is called the method of interpolation {regula falsi) . In Newton's example, a = 2, /3 = 2.1, /(«) = -l, /(,3) =0.061, c = 2.0942. The advantage of having c at each step is that we know a close limit of the error made in the approximation to the root. We may combine the various possible cases discussed into one: If f(x) = has a single real root between a and /3, and f'{x) =0, f"{x) = have no real root between a and /3, and if we designate by /3 that one of the numbers a and fi for which /(/3) and f"{0) have the same sign, then the root lies in the narrower interval from c to 0— f(0)/f'(0), where c is given by (3). It is possible to prove l this theorem algebraically and to show that by repeated applications of it we can obtain two limits a', {$' between which the root lies, such that a'—fi' is numerically less than any assigned posi- tive number. Hence the root can be found in this manner to any desired accuracy. Examine. f(x) =x 3 -2x 2 -2, a = 2\, P=2%. Then f(a) = -U, /(£)=!■ ^Weber's Algebra, 2d ed., I, pp. 380-382; Kleines Lehrbuch der Algebra, 1912, p. 163. 94 SOLUTION OF NUMERICAL EQUATIONS [Ch. VII Neither of the roots 0, 4/3 of fix) = lies between a and ft so that /(a;) =0 has a single real root between these limits (§ 65) . Nor is the root f of fix) = within these limits. The conditions of the theorem are therefore satisfied. For a dr4 r " (2) are in reverse order the coefficients of the transformed equation d 4 +9d 3 +27d 2 +31d+6 = 0, obtained in the Example in the text, and printed in heavy type. 5. The method commonly used to find the positive square root of n by a computing machine consists in dividing n by an assumed approximate value a of the square root and taking half the sum of a and the quotient as a better approximation. Show that the latter agrees with the value of a+h given by applying Newton's method to f(x) =x 2 —n. 78. Newton's Method for Functions not Polynomials. Example 1 . Find the angle x at the center of a circle subtended by a chord which cuts off a segment whose area is one-eighth of that of the circle. Solution. If x is measured in radians and if r is the radius, the area of the segment is equal to the left member of ^r 2 (x — sin x) = -girr 2 , whence x — sin £ = -j7r. § 78] NEWTON'S METHOD FOR FUNCTIONS NOT POLYNOMIALS 97 By means of a graph of y = smx and the straight line represented by y=x — £n-, we see that the abscissa of their point of intersection is approximately 1.78 radians or 102°. Thus a = 102° is a first approximation to the root of /(x) =x— sin x— ^tt = 0. By Newton's method a better approximation is a-\-h, where * —f(a) — a+sina+T 7r h = f'(a) 1— cos a sin 102° = 0.9781 cos 102° = -0.2079 £(3.1416)= 0.7854 1- cos 102°= 1.2079 1.7635 -0.0167 h = = -0.0138 102 ° = 1 .7802 radians 1 .2079 -0.0167 Oi = o+ft = 1.7664 L -/(ai) -1.7664+0.9809+0.7854 /li = = = — U .UUU1 . /'(a:) 1.1944 Hence x = ai+hi = 1.7663 radians, or 101° 12' Example 2. 2 Solve x— log x = 7, the logarithm being to base 10. Solution. Evidently x exceeds 7 by a positive decimal which is the value of log x. Hence in a table of common logarithms, we seek a number x between 7 and 8 whose logarithm coincides approximately with the decimal part of x. We read off the values in the second column. 7.897 7.898 log x 0.89746 . 89752 x— log X 6.99954 7.00048 By the final column the ratio of interpolation is 46/94. Hence a; = 7.8975 to four decimal places. 1 The derivative of sin x is cos x. We need the limit of sin (x+2k) —sin x _2 cos ^(2x+2k) sin -g-(2/c) _cos (x+k) sin k 2k 2k k as 2k approaches zero. Since the ratio of sin k to k approaches 1, the limit is cos x. 2 This Ex. 2, which should be contrasted with Ex. 3, is solved by interpolation since that method is simpler than Newton's method in this special case. 98 SOLUTION OF NUMERICAL EQUATIONS (Ch. Vll Example 3. Solve 2x — log x = 7 f the logarithm being to base 10. Solution. Evidently x is a little less than 4. A table of common logarithms shows at once that a fair approximation to x is a = 3.8. Write /(x)=2x-logx-7, log x=ilf loge x, M = 0.4343. By calculus, the derivative of loge x is 1/x. Hence f(x)=2-— , /'(a) =2-0.1143 = 1.8857, x /(a) =0.6-log 3.8 = 0.6-0.57978 = 0.02022, _/j =4^=0.0107, ai = a+ft = 3.7893, /(ai) =0.000041, /(3.7892) = -0.000148. — X0.0001= 0.000078, x=3.789278. 189 All figures of x are correct as shown by Vega's table of logarithms to 10 places. EXERCISES Fine? the angle x at the center of a circle subtended by a chord which cuts off a seg- ment whose ratio to the circle is 1. i- 2. f. When the logarithms are to base 10, 3. Solve2x-logx = 9. 4. Solve 3x-log x = 9. 5. Find the angle just > 15° for which ^ sin x +sin 2x = 0.64. 6. Find the angle just > 72° for which x— | sin x = \ir. 7. Find all solutions of Ex. 5 by replacing sin 2x by 2 sin a; cos x, squaring, and solving the quartic equation for cos x. 8. Solve similarly sin x+sin 2x = 1.2. 9. Find x to 6 decimal places in sin x = x — 2. 10. Find x to 5 decimal places in x = 3 log e x. 79. Imaginary Roots. To find the imaginary roots x-\-yi of an equa- tion /(V)=0 with real coefficients, expand f(x-\-yi) by Taylor's theorem; we get /(x)+fd)!/t-/"W 1 4-/'"W I | L 3+ • • • =0. 79] IMAGINARY ROOTS 99 Since x and y are to be real, and y^O, (4) /(^)-/ ,/ (^)^+r ,, w r ^4- . . . =o, f( X )-r\z)j3^f<*(x)£- ... =o. In the Example and Exercises below, f(z) is of degree 4 or less. Then the second equation (4) is linear in y 2 . Substituting the resulting value of y 2 in the first equation (4), we obtain an equation E(x) = 0, whose real roots may be found by one of the preceding methods. If the degree of f(z) exceeds 4, we may find E(x) — by eliminating y 2 between the two equations (4) by one of the methods to be explained in Chapter X. Example. For f(z) = z 4 — 2+1, equations (4) are z 4 -x+l-6xy+2/ 4 = 0, 4x 3 -l-4:n/ 2 =0. Thus 2/2=3.2-— _4a.6_l_3.2_L. = Q. Q.X 10 The cubic equation in x 2 has the single real root x 2 = 0.528727, x = ±0.72714. Then y 2 = 0.184912 or 0.87254, and 2 = x+_yi-0.72714±0.43001i, -0.72714±0.93409i. EXERCISES Find the imaginary roots of 1. 2 3 -2z-5=0. 2. 28z 3 +92 2 -l=0. 3. z 4 -3z 2 -62 = 2. 4. z 4 -4s 3 +llz 2 -14z+10<-0. 5. z 4 -4z 3 +9z 2 -16z+20 = 0. Hint: #(x)=x(x-2)(16x 4 -64a- 3 + 136:c 2 -144x+65)=0, and the last factor becomes (w 2 -\-l) (w 2 -\-9) for 2x = w+2. Note. If we know a real root r of a cubic equation /(z) =0, we may remove the factor z—r and solve the resulting quadratic equation. When, as usual, r involves several decimal places, this method is laborious and unsatisfactory. But we may utilize a device, explained in the author's Elementary Theory of Equations, pp. 119-121, §§ 6, 7. As there explained, a simila. device may be used when we know two real roots of a quartic equation. 100 SOLUTION OF NUMERICAL EQUATIONS [Ch. VII MISCELLANEOUS EXERCISES (Give answers to 6 decimal places, unless the contrary is stated.) 1. What arc of a circle is double its chord? 2. What arc of a circle is double the distance from the center of the circle to the chord of the arc? 3. If A and B are the points of contact of two tangents to a circle of radius unity from a point P without it, and if arc AB is equal to PA, find the length of the arc. 4. Find the angle at the center of a circle of a sector which is bisected by its chord. 5. Find the radius of the smallest hollow iron sphere, with air exhausted, which will float in water if its shell is 1 inch thick and the specific gravity of iron is 7.5. 6. From one end of a diameter of a circle draw a chord which bisects the semicircle. 7. The equation xtan:r=c occurs in the theory of vibrating strings. Its approxi- mate solutions may be found from the graphs of y = cot x, y = x/c. Find x when c = 1. 8. The equation tan x=x occurs in the study of the vibrations of air in a spherical cavity. From an approximate solution Xi = 1.5ir, we obtain successively better approxi- mations a; 2 =tan- 1 a;i = 1.4334 7T, x 3 = tan~ 1 x 2 , .... Find the first three solutions to 4 decimal places. 9. Find to 3 decimal places the first five solutions of 2x tan x = 2-x 2 ' which occurs in the theory of vibrations in a conical pipe. 10. 4tx 3 — (3x — 1) 2 = arises in the study of the isothermals of a gas. Find its roots when (i) T = 0.002 and (ii) t=0.99. 11. Solve x x = 100. 12. Solve x = 10 log x. 13. Solve x +log x = x log a;. 14. Solve Kepler's equation M = x-e sin x when M = 332° 28' 54.8", e = 14° 3' 20". 15. In what time would a sum of money at 6% interest compounded annually amount to as much as the same sum at simple interest at 8%? 16. In a semicircle of diameter x is inscribed a quadrilateral with sides a, b, c, x; then x*-(a 2 +b 2 +c 2 )x-2abc = (I. Newton). Given a = 2, 6 = 3, c = 4, find x. 17. What rate of interest is implied in an offer to sell a house for $9000 cash, or $1000 down and $3000 at the end of each year for three years? CHAPTER VIII Determinants; Systems of Linear Equations 80. Solution of Two Linear Equations by Determinants of Order 2. Assume that there is a pair of numbers x and y for which (1) aix+biy = ki, a2X+b 2 y = k 2 . Multiply the members of the first equation by b 2 and those of the second equation by —61, and add the resulting equations. We get (ai&2 — a 2 b\)X — &1&2 — k 2 b\. Employing the respective multipliers — a 2 and a\, we get (a\b 2 — o 2 bi)y = a\k 2 — a 2 k\. The common multiplier of x and y is (2) a\b 2 — a 2 b\, and is denoted by the symbol (2') a 2 b 2 which is called a determinant of the second order, and also called the deter- minant of the coefficients of x and y in equations (1). The results above may now be written in the form (3) a\ bx a 2 62 k\ b\ k 2 b 2 a\ b\ a 2 b 2 y- a\ k\ a 2 k 2 We shall call k\ and k 2 the known terms of our equations (1). Hence, if D is the determinant of the coefficients of the unknowns, the product of D by any one of the unknowns is equal to the determinant obtained from D by substituting the known terms in place of the coefficients of that unknown. 101 102 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch. VIII If D 5^0, relations (3) uniquely determine values of x and y: x = kib2 — k2bi U a\k,2 — a,2ki D D and these values satisfy equations (1); for example, aix+biy= y> = *i- Hence our equations (1) have been solved by determinants when D^O. We shall treat in § 96 the more troublesome case in which D = 0. Example. For 2x — 3y= — 4, 2 -3 -4 -3 6 -2 X — 2 -2 1 4y = 2 -4 6 2 6x— 2y = 2, we have 14a; = 14, = 28, y=2. x = l, EXERCISES Solve by determinants the following systems of equations: 1. 8x-2/ = 34, 2. 3z+4t/ = 10, 3. az+fo/ = a 2 , x+8y = 53. 4x+ y= 9. bx — ay = ab. 81. Solution of Three Linear Equations by Determinants of Order 3. Consider a system of three linear equations aix+biy-\-ciz = ki, (4) a,2X-\-b2y-\-C2z = k2, azx-\-bzy- i rC2,z = kz. Multiply the members of the first, second and third equations by (5) &2C3 — bzC2, &3C1 — 61C3, b\C2 — bza, respectively, and add the resulting equations. We obtain an equation in which the coefficients of y and z are found to be zero, while the coeffi- cient of x is (6) ai&2C3 — ai&3C2 + «2&3Cl— a2&lC3 + a3&lC2 — a3&2Cl. 82] SIGNS OF TERMS OF A DETERMINANT 103 Such an expression is called a determinant of the third order and denoted by the symbol I cii b\ c\ (6') | 0,2 62 C2 I «3 03 C3 The nine numbers a-i, . . . , C3 are called the elements of the determi- nant. In the symbol these elements lie in three (horizontal) rows, and also in three (vertical) columns. Thus 0,2, 62, C2 are the elements of the second row, while the three c's are the elements of the third column. The equation (free of y and z), obtained above, may now be written as a,\ b\ c\ fci b\ C\ &2 02 C2 x — Ji2 62 C2 03 63 C3 &3 63 C3 since the right member was the sum of the products of the expressions (5) by k\, /c2, &3, and hence may be derived from (6) by replacing the a's by the /c's. Thus the theorem of § 80 holds here as regards the unknown x. We shall later prove, without the laborious computations just employed, that the theorem holds for all three unknowns. 82. The Signs of the Terms of a Determinant of Order 3. In the six terms of our determinant (6), the letters a, b, c were always written in this sequence, while the subscripts are the six possible arrangements of the numbers 1, 2, 3. The first term ai&2C3 shall be called the diagonal term, since it is the product of the elements in the main diagonal running from the upper left-hand corner to the lower right-hand corner of the symbol (6') for the determinant. The subscripts in the term —a\bzC2 are derived from those of the diagonal term by interchanging 2 and 3, and the minus sign is to be associated with the fact that an odd number (here one) of interchanges of subscripts were used. To obtain the arrange- ment 2, 3, 1 of the subscripts in the term -\-a2b3C1 from the natural order 1, 2, 3 (in the diagonal term), we may first interchange 1 and 2, obtaining the arrangement 2, 1, 3, and then interchange 1 and 3; an even number (two) of interchanges of subscripts were used and the sign of the term is plus. While the arrangement 1, 3, 2 was obtained from 1, 2, 3 by one inter- change (2, 3), we may obtain it by applying in succession the three inter- 104 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch. VIII changes (1, 2), (1, 3), (1, 2), and in many new ways. To show that the number of interchanges which will produce the final arrangement 1, 3, 2 is odd in every case, note that each of the three possible interchanges, viz., (1, 2), (1, 3), and (2, 3), changes the sign of the product P=(xi — x 2 ) (xi - xs.) (x 2 — x 3 ) , where the x's are arbitrary variables. Thus a succession of k interchanges yields P or — P according as k is even or odd. Starting with the arrange- ment 1, 2, 3 and applying k successive interchanges, suppose that we obtain the final arrangement 1, 3, 2. But if in P we replace the subscripts 1, 2, 3 by 1, 3, 2, respectively, i.e., if we interchange 2 and 3, we obtain —P. Hence k is odd. We have therefore proved the following rule of signs: Although the arrangement r, s, t of the subscripts in any term ±ajb s c t of the determinant may be obtained from the arrangement 1, 2, 3 by various successions of interchanges, the number of these interchanges is either always an even number and then the sign of the term is plus or always an odd num- ber and then the sign of the term is minus. EXERCISES Apply the rule of signs to all terms of 1. Determinant (6). 2. Determinant oib 2 — ct2?>i. 83. Number of Interchanges always Even or always Odd. We now extend the result in § 82 to the case of n variables xi, . . . , x tt . The product of all of their differences Xi — Xj(i a-i h c 3 -d, a-i b 2 c 2 +di a 2 bi c 2 a\ hi Ct 0,4 bi Ci ^ bi Ci a 3 6 3 c 3 (10) D = enEn-e2iE 2 i+e3 1 E 31 - . . . +(-l) w - 1 e„ 1 # r , so that D may be expanded according to the elements of its first column. By (9) the terms of D having the factor en are of the form (-l) 4 ene* 2 2. . . e v , where 1, 12, • ■ ■ , in is an arrangement of 1, 2, . . . , n, obtained from the latter by i interchanges, so that 12, • • • , i n is an arrangement of 2, ... , n, derived from the latter by i interchanges. After removing from each term the common factor en and adding the quotients, we obtain a sum which, by definition, is the value of the determinant En of order n—1. Hence the terms of D having the factor en may all be combined into en En, which is the first part of (10). We next prove that the terms of D having the factor 621 may be com- bined into — 621-^21, which is the second part of (10). For, if A be the determinant obtained from D by interchanging its first and second rows, the result just proved shows that the terms of A having the factor 621 may be combined into the product of e2i by the minor 612 ei3 ^32 633 esn of 621 in A. Now this minor is identical with the minor #21 of e2i in D. But A= — D (§ 87). Hence the terms of D having the factor e2i may be 91] REMOVAL OF FACTORS 111 combined into — 621-^21- Similarly, the terms of D having the factor 631 may be combined into e^iE^i, etc., as in (10). (ii) We next prove that D may be expanded according to the elements of its kth. column (fc>l): (11) d=s (-iy+h jb E jt . Consider the determinant 8 derived from D by moving the kth column over the earlier columns until it becomes the new first column. Since this may be done by k — 1 interchanges of adjacent columns, 5 = (— l) i_1 D. The minors of the elements e lk , . . . , e nb in the first column of 8 are evidently the minors E lk , . . . , E nk of e lk , . . . , e nk in D. Hence, by (10), 8 = e lk E lk -e 2k E 2k + +(-l) n - 1 e nk E nk = t (-iy +l e jk E jk . 3=1 Thus D=(-l) k - 1 8 has the desired value (11). (m) Finally, D may be expanded according to the elements of its kth row: D=t (-iy+%jE kJ . In fact, by Case (ii) , the latter is the expansion of the equal determinant D' in § 85 according to the elements of its &th column. 91. Removal of Factors. A common factor of all of the elements of the same row or same column of a determinant may be divided out of the elements and placed as a factor before the new determinant. In other words, if all of the elements of a row or column are divided by n, the value of the determinant is divided by n. For example, na\ nb\ — n ai 61 02 62 ar = 0. 1. 3o 36 3c 2. 2r I 3r 5a 5b 5c = 0. 2s m 3s d e f 2t n 3t by the shortest method and evaluate 3. 2 7 3 4. 5 7 5 9 8 6 8 3 3 9 4 a b c d a 2 b* c 2 d 2 a 3 b 3 c 3 d 3 a 4 b 4 c 4 d 4 = abcd(a — 6) (a— c)(a— d)(6— c)(&— d)(c— a"). 92. Sum of Determinants. A determinant having ai+gi, 02+^2, ... as f/ie elements of a column is equal to the sum of the determinant having a\, a2, ... as the elements of the corresponding column and the determinant having q\, q2, ... as the elements of that column, while the elements of the remaining columns of each determinant are the same as in the given determi- nant. For example, ai+qi 61 c\ «2 + ?2 b 2 C2 a3~\-q3 03 C3 To prove the theorem we have only to expand the three determinants according to the elements of the column in question (the first column in the example) and note that the minors are the same for all three determi- nants. Hence ai+gi is multiplied by the same minor that a\ and q\ are multiplied by separately, and similarly for £12+02, etc. The similar theorem concerning the splitting of the elements of any row into two parts is proved by expanding the three determinants accord- ing to the elements of the row in question. For example, a\ 61 c\ a2 62 C2 + CI3 &3 C3 QX 61 C\ Q2 02 C2 C/3 63 C3 a-\-r b+s d a b r s = + c d c d 93] ADDITION OF COLUMNS OR ROWS 113 93. Addition of Columns or Rows. A determinant is not changed in value if we add to the elements of any column the products of the correspond- ing elements of another column by the same arbitrary number. Let ai, a,2, . . . be the elements to which we add the products of the elements &i, &2, ... by n. We apply §92 with qi=nb\, q2 = nb2, .... Thus the modified determinant is equal to the sum of the initial determi- nant and a determinant having 61, 62, ... in one column and nb\, nb2, ... in another column. But (§ 91) the latter determinant is equal to the product of n by a determinant with two columns alike and hence is zero (§ 88). For example, ai-\-nbi 61 c\ a\ b\ c\ hi b\ c\ <22 + n&2 &2 C2 = 02 62 C2 -\-n 62 62 C2 a3+nb s 63 c 3 as 63 c 3 &3 63 C3 and the last determinant is zero. Similarly, a determinant is not changed in value if we add to the elements of any row the products of the corresponding elements of another row by the same arbitrary number. For example, -\-nc b+nd a b +n c d a b c d c d c d c d Example. Evaluate the first determinant below. 1 -2 1 1 2 3 = 6 4 3 1 1 1 8 3 = 6 10 3 1 -2 8 1 2 8 3 = 3 10 1 3 10 3 -44. Solution. First we add to the elements of the second column the products of the elements of the last column by 2. In the resulting second determinant, we add to the elements of the first column the products of the elements of the third column by — 1 . Finally, we expand the resulting third determinant according to the elements of its first row, 1. Prove that EXERCISES b -\-c c +a a -j-b bi+ci ci+ai ai+bi 62+C2 C2+32 a 2 +b 2 a b c Oi h Ci di 62 Cl 114 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch. VIII By reducing to a determinant of order 3, etc., prove that 2. 3. 1 1 1 1 a b c d a 2 b 2 c 2 d 2 a 3 ¥ c 3 d 3 2 -1 3 -2 1 7 1 -1 3 5 - -5 3 4 -3 2 -1 = (a-b)(a-c)(a-d)(b-c)(b-d)(c-d). -42. 1 1 1 2 1 3 1 4 1 1 3 4 6 10 10 20 = 1. 94. System of n Linear Equations in n Unknowns with ZM0. In a\iXi-\-ai2X2+ . . . +a ln x n = ki, (12) d n \X\-\- tt n 2X2 ~T • • • "r^mj^Ti — ™»> let Z) denote the determinant of the coefficients of the n unknowns : an ai2 . . . a ln D = Then Dxi a nl a n2 anxi ai2 . . d\ n anxi+ai 2 a;2+ . • i^ln^ny ai2 • . dj (l n lXl 0, n 2 • • • d nn a nl xi-\-a n2 x 2 + . • ~\^nn^ni a n2 • • a„ where the second determinant was derived from the first by adding to the elements of the first column the products of the corresponding elements of the second column by X2, etc., and finally the products of the elements of the last column by x n . The elements of the new first column are equal to fci, . . . , fc„ by (12). In this manner, we find that (13) Dxi=Ki, Dx 2 = K 2 , ..., Dx n = K n , in which K t is derived from D by substituting k\, . . . , k n for the elements a u , ..., a ni of the ^th column of D, whence k\ a\2 . . . a ln Ki = , . . . , K n = k n a„ ■ a„ an ... ai n _! ki a nl . . . a nn _! k n 94] EQUATIONS WITH DETERMINANT NDT ZERO 115 If ZMO, the unique values of xi, . . . , x n determined by division from (13) actually satisfy equations (12). For instance, the first equation is satisfied since ki an a\2 • . • a-, k\D — a\\K\ — CI12K2 — — a\ n K n — fci k 2 an a2i «12 «22 a ln a 2n K n a f{ <(, and as shown by expansion according to the elements of the first row; the determinant is zero, having two rows alike. Theorem. If D denotes the determinant of the coefficients of the n unknowns in a system of n linear equations, the product of D by any one of the unknowns is equal to the determinant obtained from D by substituting the known terms in place of the coefficients of that unknown. If D^O, we obtain the unique values of the unknowns by division by D. We have therefore given a complete proof of the results stated and illustrated in § 80, § 81. Another proof is suggested in Ex. 7 below. The theorem was discovered by induction in 1750 by G. Cramer. EXERCISES Solve by determinants the following systems of equations (reducing each deter- minant to one having zero as the value of every element but one in a row or column, as in the example in § 93) . 1. x + y+ 3 = 11, 2x — Qy— 2 = 0, 3z+42/+2z = 0. 3. x-2y+ z = 12, x +2y +32 = 48, 6x+4?/+3z = 84. 5. x+ y-\- 2+ w= 1, x+2y+ 32+ 4u> = ll, x+Zy+ 62+10w = 26, x+4?/+102+20w; = 47. 2. x+ y+ 2 = 0, x+2y+3z=-l, x+3y+Gz = 0. 4. Bx-2y = 7, 3y-2z = 6, 3z-2x=-l. 6. 2x- y+3z-2w = i, x+7y+ 2— w — 2, 3x+5j/-53+3to=0, 4x — 3y+2z— w = 5. 7. Prove the first relation (13) by multiplying the members of the first equation (12) by An, those of the second equation by — A 2 i, . . . , those of the nth equation by ( — l) n-1 A„i, and adding, where Ay denotes the minor of a y in D. Hint: The resulting coefficient of x 2 is the expansion, according to the elements of its first column, of a deter- minant derived from D by replacing an by a i2) . . . , o nl by a n2 . 116 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch.VIII 95. Rank of a Determinant. If we erase from a determinant D of order n all but r rows and all but r columns, we obtain a determinant of order r called an r-rowed minor of D. In particular, any element is regarded as a one-rowed minor, and D itself is regarded as an n-rowed minor. If a determinant D of order n is not zero, it is said to be of rank n. If, for 0 of D by replacing the elements of any column by the corresponding known terms ki are not all zero, the equations are incon- sistent. But if these determinants K are all zero, the r equations involving the elements of a non-vanishing r-rowed minor of D determine uniquely r of the unknowns as linear functions of the remaining n — r unknowns, which are independent variables, and the expressions for these r unknowns satisfy also the remaining n — r equations. 96] EQUATIONS WITH DETERMINANT ZERO 117 Consider for example the three equations (4) in the unknowns x, y, z. Five cases arise: (a) D of rank 3, i.e., LMO. (/3) D of rank 2 {i.e., Z> = 0, but some two-rowed minor ^0), and Kv hi bi c\ k 2 b 2 oi h b 3 c 3 K 2 di k\ C\ a 2 k 2 Ci a 3 k 3 c 3 K 3 = ai 6i ki Oj &2 fe G&3 &3 &3 a^ h bi fCf Ci h Oj kj > bj kj » cj kj not all zero. (7) D of rank 2 and K h K 2 , K 3 all zero. (5) D of rank 1 (i.e., every two-rowed minor = 0, but some element ?*0), and (i, j chosen from 1, 2, 3) not all zero; there are nine such determinants K. (e) D of rank 1, and all nine of the two-rowed determinants K zero. In case (a) the equations have a single set of solutions (§94). In cases (/3) and (5) there is no set of solutions. For (/3) the proof follows from (13). In case (7) one of the equations is a linear combination of the other two; for example, if a\b 2 — a 2 6i^0, the first two equations determine x and y as linear functions of z (as shown by trans- posing the terms in z and solving the resulting equations for x and y), and the resulting values of x and y satisfy the third equation identically as to z. Finally, in case (e), two of the equations are obtained by multiplying the remaining one by constants. The reader acquainted with the elements of solid analytic geometry will see that the planes represented by the three equations have the following relations: (a) The three planes intersect in a single point. ()3) Two of the planes intersect in a line parallel to the third plane. (7) The three planes intersect in a common line. (5) The three planes are parallel and not all coincident. (e) The three planes coincide. The remarks preceding our theorem furnish an illustration (the case r=n— 1) of the following Lemma 1. If every (r+l)-rowed minor M formed from certain r-f-1 rows of D is zero, the corresponding r+1 equations (12) are inconsistent provided there is a non-vanishing determinant K formed from any M by replacing the elements of any column by the corresponding known terms k t . For concreteness, 1 let the rows in question be the first r+1 and let 1 All other cases may be reduced to this one by rearranging the n equations and relabelling the unknowns (replacing x 3 by the new X\, for example) 118 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch.VIII K an a lr &7-+11 h+lr hi k T+ i 5^0. Let di, . . . , d r+ i be the minors of ki, . . . , k r+l in K. Multiply the first r+1 equations (12) by di, —d 2 , . . . , (-l) r d r+ i, respectively, and add. The right member of the resulting equation is the expansion of ±K. The coefficient of x s is the expansion of an . . . a lr a,i s ± l r+ll r, and having two columns identical if s£r. Hence 0= ±K. Thus if K^O, the equations are inconsistent. Lemma 2. If all of the determinants M and K in Lemma 1 are zero, but an r-rowed minor of an M is not zero, one of the corresponding r+ 1 equations is a linear combination of the remaining r equations. As before let the r+1 rows in question be the first r+1. Let the non-vanishing r-rowed minor be (14) dr+l — an air 9*0. a rl . Let the functions obtained by transposing the terms k t in (12) be L i = a n xi+a i2 X2 J r . . . +a in x n — k t . By the multiplication made in the proof of Lemma 1, diLi-d 2 L 2 + . . . +(-l) r d r+1 L r+1 ==FK = 0. Hence L r+i is a linear combination of Li, . . . , L r . The first part of the theorem is true by Lemma 1. The second part is readily proved by means of Lemma 2. Let (14) be the non-vanishing r-rowed minor of D. For s>r, the sth equation is a linear combination of the first r equations, and hence is satisfied by any set of solutions of the latter. In the latter transpose the terms involving x T+1 , . . . , x n . Since the determinant of the coefficients of xi, . . . , x T is not zero, § 94 shows that xi, . . . , x T are uniquely determined linear functions of x T+1 , . . . , x n (which enter from the new right members). 97] HOMOGENEOUS EQUATIONS 119 EXERCISES Apply the theorem to the following four systems of equations and check the con- clusions: 1. 2x+ y+3z = l, 4x+2y- z=-3, 2x+ y—^z = —4. 3. x- 3y+ 4z = l, 4x-12t/ + 162 = 3, 3x- 9y + 12z = 3. 2. 2x+ y+Sz = l, 4:x+2y- 2 = 3, 2x+ y — 42 = 4. 4. x— 3y4- 42 = 1, 4x- 12j/ + I62 = 4, 3s- 9y + 12z = 3. 5 Discuss the system ax+ 7/+ z = a — 3, x+ay+ z= —2, x+ y+az=-2, when (i) a = l; (u) o=— 2; {Hi) a^l,— 2, obtaining the simplest forms of the unknowns. 6. Discuss the system x+ y+ 2 = 1, ax+ by-\- cz = k, a 2 x-\-b 2 y-\-c 2 z = k 2 , when (i) a, b, c are distinct; (u) a = b^c; (m) a = b = c. 97. Homogeneous Linear Equations. When the known terms fci, . . . , k n in (12) are all zero, the equations are called homogeneous. The determi- nants K are now all zero, so that the n homogeneous equations are never inconsistent. This is also evident from the fact that they have the set of solutions £i = 0, . . . , x n = 0. By (13), there is no further set of solu- tions if ZMO. If D = 0, there are further sets of solutions. This is shown by the theorem of § 96 which now takes the following simpler form. // the determinant D of the coefficients of n linear homogeneous equations in n unknowns is of rank r, r = 0, 5. 2x4- Zy- 4z + 5w = 0, x+2y+3z-48w = 0, 3x+ 5y- z + 2m; = 0, x-2y+ z-12w = 0, 7x+lly- 9z+12w = 0, 4a; +42/- z-24w = 0. 3x + 4y-llz+13u> = 0. 98. System of m Linear Equations in n Unknowns. The case mn, we select any n of the equations and apply to them the theorems of §§ 94, 96. If they are found to be inconsistent, the entire system is evidently inconsist- ent. But if the n equations are consistent, and if r is the rank of the determinant of their coefficients, we obtain r of the unknowns expressed as linear functions of the remaining n—r unknowns. Substituting these values of these r unknowns in the remaining equations, we obtain a system of m— n linear equations in n — r unknowns. Treating this sys- tem in the same manner, we ultimately either find that the proposed m equations are consistent and obtain the general set of solutions, or find that they are inconsistent. To decide in advance whether the former or latter of these cases will arise, we have only to find the maxi- mum order r of a non-vanishing r-rowed determinant formed from the coefficients of the unknowns, taken in the regular order in which they occur in the equations, and ascertain whether or not the corresponding (r+l)-rowed determinants K, formed as in § 96, are all zero. The last result may be expressed simply by employing the terminology of matrices. The system of coefficients of the unknowns in any set of linear equations an a;i+ • • • -+a ln x n =ki, (15) a^jl -El I • • • \0'mn %n >^mj arranged as they occur in the equations, is called the matrix of the coeffi- cients, and is denoted by / an ai2 . . . a lw \ A= §98] m EQUATIONS IN n UNKNOWNS; MATRIX 121 By annexing the column composed of the known terms k t , we obtain the so-called augmented matrix I an ai2 . . . a in k\ \ *- • \ Q'm\ Q"m2 • • • "mn K>m I The definitions of an r-rowed minor (determinant) of a matrix and of the rank of a matrix are entirely analogous to the definitions in § 95. In view of Lemma 1 in § 96, our equations (15) are inconsistent if B is of rank r+1 and A is of rank£r. By Lemma 2, if A and B are both of rank r, all of our equations are linear combinations of r of them. Noting also that the rank r of A cannot exceed the rank of B, since every minor of A is a minor of B, and hence a non-vanishing r-rowed minor of A is a minor of B, so that the rank of B is not less than r, we have the following Theorem. A system of m linear equations in n unknowns is consistent if and only if the rank of the matrix of the coefficients of the unknowns is equal to the rank of the augmented matrix. If the rank of both matrices is r, certain r of the equations determine uniquely r of the unknowns as linear functions of the remaining n — r unknowns, which are independent vari- ables, and the expressions for these r unknowns satisfy also the remaining m — r equations. When m = n-\-l, B has an m-rowed minor called the determinant of the square matrix B. If this determinant is not zero, B is of rank m. Since A has no m-rowed minor, its rank is less than m. Hence we obtain the Corollary. Any system of n-\-l linear equations in n unknowns is inconsistent if the determinant of the augmented matrix is not zero. EXERCISES Discuss the following systems of equations: 1. 2x+ y+3z=l, 2. 2x- y+Bz = 2, 3. 4x- y+ z = 5, 4. 4x-5y = 2, 4x+2y- z= -3, x+7y+ s = l, 2x-3y+5z = l, 2x+3y = 12, 2x+ y-\z=-\, 3x+5y-5z = a, x+ y-2z = 2, 10x-7y = lQ. 10x+5t/-6z=-10. 4x-By+2z = l. 5x - z = 2. 5. Prove the Corollary by multiplying the known terms by x n +i = l and applying § 97 with n replaced by n + 1. 6. Prove that if the matrix of the coefficients of any system of linear homogeneous equations in n unknowns is of rank r. the values of certain n— r of the unknowns mav be 122 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch.VIII assigned at pleasure and the others will then be uniquely determined and satisfy all of the equations. 99. Complementary Minors. The determinant a\ 61 c\ d\ (16) D = 0,2 &2 C2 (1% a% 63 C3 ds a± &4 C4 d± is said to have the two-rowed complementary minors M = a$ 63 M' = C2 d 2 C4 d\ since either is obtained by erasing from D all the rows and columns having an element which occurs in the other. In general, if we erase from a determinant D of order n all but r rows and all but r columns, we obtain a determinant M of order r called an r-rowed minor of D. But if we had erased from D the r rows and r columns previously kept, we would have obtained an (n — r) -rowed minor of D called the minor complementary to M. In particular, any element is regarded as a one-rowed minor and is complementary to its minor (of order n— 1). 100. Laplace's Development by Columns. Any determinant D is equal to the sum of all the signed products ±MM', where M is an r-rowed minor having its elements in the first r columns of D, and M' is the minor complementary to M, while the sign is + or — according as an even or odd number of interchanges of rows of D will bring M into the position occupied by the minor M\ whose elements lie in the first r rows and first r columns ofD. For r = l, this development becomes the known expansion of D according to the elements of the first column (§ 90); here Mi = e n . If r = 2 and D is the determinant (16), D = a 2 b 2 Cz dz C4 ci\ - ai bi a 3 b 3 c 2 d 2 Ci di + ai bi ai bi Ci di cz di + Oi b 2 a 3 h ' ci di Ci di - G&2 bi 04 bi Ci di cz d z + 03 63 ai bi Ci di c 2 di ] §100] LAPLACE'S DEVELOPMENT 123 The first product in the development is M1M1'; the second product is —MM' (in the notations of § 99) , and the sign is minus since the interchange of the second and third rows of D brings this M into the position of Mi. The sign of the third product in the development is plus since two interchanges of rows of D bring the first factor into the position of Mi. If D is the determinant (8) , then Mr en Mi' = ■r + 1 r+l e r+ln &n t+1 Any term of the product M \M \ is of the type (17) (-l)Vi e i2 2 . . . e irT - (-l)\ +ir +i . . . e v , where i\, . . . , i T is an arrangement of 1, . . . , r derived from 1, . . . , r by i interchanges, while i r+1 , . . . , i n is an arrangement of r+l, . . . , n derived by j interchanges. Hence i\, . . . , i n is an arrangement of 1, . . . , n derived by i-\-j interchanges, so that the product (17) is a term of D with the proper sign. It now follows from § 87 that any term of any of the products ±MM' mentioned in the theorem is a term of D. Clearly we do not obtain twice in this manner the same term of D. Conversely, any term t of D occurs in one of the products ±MM'. Indeed, t contains as factors r elements from the first r columns of D, no two being in the same row, and the product of these is, except per- haps as to sign, a term of some minor M. Thus t is a term of MM' or of —MM'. In view of the earlier discussion, the sign of t is that of the corresponding term in ±MM', where the latter sign is given by the theorem. 101. Laplace's Development by Rows. There is a Laplace develop- ment of D in which the r-rowed minors M have their elements in the first r rows of D, instead of in the first r columns as in § 100. To prove this, we have only to apply § 100 to the equal determinant obtained by inter • changing the rows and columns of D. There are more general (but less used) Laplace developments in which the r-rowed minors M have their elements in any chosen r columns (or rows) of D. It is simpler to apply the earlier developments to the determi- nant zfc-D having the elements of the chosen r columns (or rows) in the new first r columns (or rows) . 124 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch. VIII EXERCISES 1. Prove that abed e f g h a b j k j k e f I m I m 2. By employing 2-rowed minors from the first two rows, show that abed e f g h abed = a b e f c d g h 1 a c e g b d f h + e f g h a d e h b c f 9 = 0. 3. By employing 2-rowed minors from the first two columns of the 4-rowed deter- minant in Ex. 2, show that the products in Laplace's development cancel. 102. Product of Determinants. The product of two determinants of the same order is equal to a determinant of like order in which the element of the r th row and c th column is the sum of the products of the elements of the r th row of the first determinant by the corresponding elements of the c th column of the second determinant. For example, (18) a b e f c d g h ae-\-bg ce+dg af+bh cf+dh While for brevity we shall give the proof for determinants of order 3, the method is seen to apply to determinants of any order. By Laplace's development with r = 3 (§ 101), we have (19) a\ bi C\ tt2 b 2 C-2 az bs Oi -1 ei fi 9i - -1 62 J2 92 - -1 63 h 93 a\ &i ci ei h 01 a2 b% C2 &2 J2 92 a3 63 C3 63 /3 93 102] PRODUCT OF DETERMINANTS 125 In the determinant of order 6, add to the elements of the fourth, fifth, and sixth columns the products of the elements of the first column by ei, /i, gi, respectively (and hence introduce zeros in place of the former elements ei, /i, g{). Next, add to the elements of the fourth, fifth, and sixth columns the products of the elements of the second column by 62, hi 92, respectively. Finally, add to the elements of the fourth, fifth, and sixth columns the products of the elements of the third column by 63, /3, #3, respectively. The new determinant is aiei+&i? 2 +cie 3 ai/i+&i/ 2 +ei/ 3 ai0i+&igr 2 + ci03 ^261 + 6262 + 0263 «2/l+&2/2 + C2/3 a2#l + &202 + C203 a3ei+63e2+C3e 3 03/1+63/2 + 63/3 03^1 J rb 3 g 2 +63^3 ai 61 61 Ct2 62 62 Ct3 &3 63 1 -1 -1 By Laplace's development (or by expansion according to the elements of the last row, etc.), this is equal to the 3-rowed minor whose elements are the long sums. Hence this minor is equal to the product in the right member of (19). EXERCISES 1. Prove (18) by means of § 92. 2. Prove that, if 8 i =a i +p t +y i , 111 a /3 7 a 2 /3 2 7 2 1 a a 2 3 Si S2 1 P = Si S2 S3 1 Y 7 2 s 2 S3 Si 3. If Ai, Bi, Ci are the minors of a*, bi, c% in the determinant D defined by the second factor below, prove that Ai -A, A, a\ bi Ci D -Bi B, -B z a-i b-i c-i = D & -c 2 C 3 a% b 3 ^ D Hence the first factor is equal to D 2 if D ?^0. 126 DETERMINANTS; SYSTEMS OF LINEAR EQUATIONS [Ch.VIII 4. Express (a 2 +b 2 +c 2 -\-d 2 ) (e 2 +/ 2 -\-g 2 -\-h 2 ) as a sum of four squares by writing a-\-bi c-\-di -c-\-di a — bi e+fi g+hi ■g+hi e—fi as a determinant of order 2 similar to each factor. Hint: If k' denotes the conjugate of the complex number k, each of the three determinants is of the form k I -V k' MISCELLANEOUS EXERCISES 1. Solve ax-\- by-\- cz = k, a 2 x-\-b 2 y-\-c 2 z = k 2 , a 4 x+b i y+c i z = k i by determinants for x, treating all cases. 2. In three linear homogeneous equations in four unknowns, prove that the values of the unknowns are proportional to four determinants of order 3 formed from the coefficients. Factor the following determinants: a be b ca ab c b a 4. X x 2 yz X 2 X 3 1 y y 2 xz = y 2 y 3 l z z 2 xy z 2 z 3 1 = (a+6+c)(a+&G)+c« 2 )(a+frw 2 +c«), where a> is an imaginary cube root of unity. (). a b c d b a d c c d a b d c b a 7. a b d a c d b c 8. If the points (xi, i/i), d a . , (x 4 , y^) lie on a circle, prove that xi 2 +yi 2 xi ?/i 1 X 4 2 + 2/4 2 Xi Vi {). §102] MISCELLANEOUS EXERCISES 127 9. Prove that aa' -\-W +cc' ae'+bf'+cg' + ea'+fb'+gc' ee'+ff'+gg' b + f g 10. Prove that the cubic equation a—x b c D(x) = b f-x g c g h—x = has only real roots. Hints: a 2 +b 2 + c 2 — x 2 ab+bf+ eg ac+bg-\- eh D(x)-D(-x) = ab+bf +cg b 2 + f 2 + g 2 -x 2 be +fg+gh ac-\-bg-\-c h be -\-fg-\-gh c 2 +g 2 + h 2 -x* = -x*+x i (a 2 +P+h 2 +2b 2 +2c 2 +2g 2 )-x 2 (D 1 +D 2 +D 3 )+D 2 (0), where D z denotes the first determinant in Ex. 9 with all accents removed and with e = b, while A and Z) 2 are analogous minors of elements in the main diagonal of the present determinant of order 3 with x = 0. Hence the coefficient of — x 2 is a sum of squares. Since the function of degree 6 is not zero for a negative value of x 2 , D(x) =0 has no purely imaginary root. If it had an imaginary root r+si, then D(x+r) =0 would have a purely imaginary root si. But D(x+r) is of the form D(x) with a, f, h replaced by a—r, f—r, h—r. Hence D(x) =0 has only real roots. The method is applicable to such determinants of order n. 11. If oi, . . . , a n are distinct, solve the system of equations Xi x 2 : — a-i h Wa ^— ^ X^a s -^W a/3 ^a g V-= Va/3- V--V^ ^-Va 2 ^w" P ^Wa 2 ^a 1 / — (qr 2 — 2q 2 s — prs +4s 2 ) . s 2 a ^Q 7 _ 3r-pg ^Wa/3 s 104. Fundamental Theorem on Symmetric Functions. Am/ pofo/- nomial symmetric in x\, . . . , x n is equal to an integral rational function, with integral coefficients, of the elementary symmetric functions (2) Ei = Xx h E 2 = 2xix 2 , E 3 = '2xiX2X3, . . . , E n = x 1 x 2 . . . x n and the coefficients of the given polynomial. In particular, any symmetric polynomial with integral coefficients is equal to a polynomial in the elementary symmetric functions with integral coefficients. For example, if n = 2, rx 1 2 +rx 2 2 +sx l +sx 2 =r(E l 2 -2E 2 )+sE l . In case r and s are integers, the resulting polynomial in E\ and E 2 has integral coeffi- cients. The theorem is most frequently used in the equivalent form: 130 SYMMETRIC FUNCTIONS [Ch. IX Any polynomial symmetric in the roots of an equation, x n -E 1 x n - 1 +E 2 x n - 2 - . . . + (-l) n E n = 0, is equal to an integral rational function, with integral coefficients, of the coeffi- cients of the equation and the coefficients of the polynomial. It is this precise theorem that is required in all parts of modern algebra and the theory of numbers, where attention to the nature of the coefficients is vital, rather than the inadequate, oft-quoted, theorem that any sym- metric function of the roots is expressible (rationally) in terms of the coefficients. It suffices to prove the theorem for any homogeneous symmetric poly- nomial S, i.e., one expressible as a sum of terms h = ax\^X2 ki • . • Xn*" of constant total degree k = ki+k2+ • • • +k n in the x's. Evidently we may assume that no two terms of S have the same set of exponents fci, . . . , k n (since such terms may be combined into a single one). We shall say that h is higher than the term bx\ h X2 h . . . x n ln if k\>l\, or if ki = h, k2>h, or if ki = h, k2 = h, kz>h, . . . , so that the first one of the differences k\ — l\, k2 — h, k^—h, . . . which is not zero is positive. We first prove that, if the above term h is the highest term of S, then k\~t.k2^kz . . . ^k n . For, if ki 4, we have a = E 2 E% =*S+32xi 2 X2X 3 X4+10 221X2X3X4X6, Si=S— 0= —3 Sxi 2 x 2 x 3 X4 — 10 2x1X2X3X4X5, o\= — 3 E\Ei= —3 (2xi 2 x 2 x 3 X4+5 2x1X2X3X4X5), Sz =Si — 4, o- = ^ 1 2 ^3=-Ei(2xi 2 x 2 X3+4 2x1X2X3X4) = 2xi 3 x 2 x 3 +2 2xi 2 x 2 2 x 3 +3 2xi 2 x 2 X3X4 +4 (2Xi 2 X 2 X3X4 + 5 2X1X2X3X4X5), Si=S—4, o2xi 2 x 2 2 x 3 +b2xi 3 x 2 X3 = bEJEt - (3a +6)^4+ (a - 2b)E 2 E 3 +5(a +b)E & . 105. Rational Functions Symmetric in all but One of the Roots. If P is a rational function of the roots of an equation f(x) = of degree n and if P is symmetric in n—\ of the roots, then P is equal to a rational function, with integral coefficients, of the remaining root and the coefficients of f(x) and P. For example, P = rai+a 2 2 +a3 2 + . . . +a w 2 is symmetric in a 2 , ... , %, and P = ra 1 + 2o;i 2 -ai 2 = Ci 2 -2c2+rai-ai 2 , if on, . . . , ocn are the roots of equation (1). Since x any symmetric rational function is the quotient of two symmet- ric polynomials, the above theorem will follow if proved for the case in which the words rational function are in both places replaced by polynomial. If a\ is the remaining root, the polynomial P is symmetric in the roots «2, . . . , a n of f(x)/(x— ai) =0, an equation of degree n—1 whose coeffi- cients are polynomials in a\, c\, . . . , c n with integral coefficients. Hence (§ 104), P is equal to a polynomial, with integral coefficients, inai, ci, . . . , c n and the coefficients of P. 1 If N/D is symmetric in «i, « 2 , and the polynomials N and D have no common factor, while TV becomes N' and D becomes D' when a h a 2 are interchanged, then ND' =DN'. Thus N divides N' and both are of the same degree. Hence N' = cN, D'=cD, where c is a constant. By again interchanging a\, a 2 , we obtain N from N', whence N = cN' = c 2 N, c 2 = l. If c=— 1, we take ai=a 2 and see that N = N'=—N, N = 0, whence N has the factor a„— a 2 . Similarly, D has the same factor, contrary to hypothesis. Hence c= +1 and N and D are each symmetric in a h a 2 . § 105] FUNCTIONS SYMMETRIC IN ALL BUT ONE ROOT 133 Example. If a, 13, y are the roots of f(x) =x s +px 2 +qx-\-r = 0, find ^-4 a- a ! jf_a4f a 2 +7 2 /3 2 +7 2 i+P ~ a+j3 a+y /3+ T ' Solution. Sincel3 2 -\-y 2 = p 2 — 2q—a 2 , j3+y=—p—a, ^ a 2 +(3 2 _^p p 2 -2q-a 2 _^p / 2g\_ yj_ But a+p, /3+p, 7+p are the roots y h y%, y 3 of the cubic equation obtained from /(x)=0 by the substitution x-\-p = y, i.e., x = y — p. The resulting equation is y 3 — 2py 2 + (p 2 +q)y+r-pq = 0. Since we desire the sum of the reciprocals of yi, y 2 , yz, we set y = l/z and find the sum of the roots Z\, z 2 , Zz of 1 —2pz-\- (p 2 -\-q)z 2 -\- (r — pq)z 3 = . Hence lQ ,2 +/3 2 2q 2 -2p 2 q+4:pr ^A 1 _^1_^ p 2 +g -ST^o: 2 <^-*a-\-p ^^ y\ -*-^ pq — r -^-W a +/3 pq—r EXERCISES [In Exs. 1-12, a, 13, y are the roots of f(x) = x 3 +px 2 +qx+r = 0.] Using (3y -\-a((3 -\-y) = ^p3/37-2a 2 /3+T ' ^ /3+7-«' 3. Why would the use of 0y = —r/a complicate Exs. 1, 2? Verify that — r f(a)—r @y = = =a 2 +pa+q. a a >V^/3 2 +7 2 4. Why would vou use fiy= —r/a in finding / ? *—1 $y +c 5. Find ^2\{&+y) 2 . 6. Find ^\{a+p-y)\ 7. Find ^(^^J- 8. Find a necessary and sufficient condition on the coefficients that the roots, in 112 — 3r some order, shall be in harmonic progression. Hint: If — | — = -, then — /S = 0, a 7 /3 q and conversely. Hence the condition is 134 SYMMETRIC FUNCTIONS [Ch. IX 1 1 1 TT . 9. Find the cubic equation with the roots /3y — —, ay — -, a/3 — -. Hint: since a /3 y these are (— r — l)/a, etc., make the substitution (— r— l)/x=y. Find the substitution which replaces the given cubic equation by one with the roots 10. afi+ay, n, we multiply (1) by x k ~ n , take x = a\, ... , x = a n in turn, and add the resulting equations. We get (11) s fc +cis fe _ 1 +c 2 s fc _ 2 + . . . +c n s 2fe _„ = (k>n). Instead of memorizing this formula, it is preferable to deduce it for the particular equation for which it is needed, thus avoiding errors of substi- tution as well as confusion with (10). Example. Find sj for x n — 1 = 0. Solution. Comparing our equation with (1), we have Ci=0, . . . , c n _i=0, c n = — 1. Hence in (10) for kn, multiply our equation by x l ~ n . In the resulting equation x l — x l ~ n = we substitute each root, add, and obtain sj — sj_ re = 0. Hence from si we obtain an equal s by subtracting n from I. After repeated subtrac- tions, we reach a value k for which 1 £ k = n. Since st = or n according as ky <$>y <$>=a k+ \l-Py)+$ k +\l-ay). (l-ay)(l-Py) l+py+qy 2 Hence (15) T ^-+ T ^- r = s 1 +s 2 y+ . . . +s k y k - 1 + .. 4>y . 2 , I- ay 1-fiy y * 1+py+qy 2 where the exact expression for 4> is immaterial. Next, we seek an expansion of the fraction in the left member of (13). Its denominator will be identical with that in (14) if we choose r= — py — qy 2 . Evidently (14) may be written in the compact form -i t - 1 -k 1 — r t = 1—r Hence it becomes . , , 2 = S (-iY(py+qy 2 Y+ , , yy , 2 , 1+py+qy 2 , =0 ™ ™ 1+py+qy 2 138 SYMMETRIC FUNCTIONS [Ch. IX where ^=( — p — qy) lc , although no use will be made of the particular form of the polynomial \p. By the binomial theorem, (g+h)) (py+qy 2 Y= ^~j£r(py) e ( g\h\ where the summation extends over all sets of integers g and h, each |:0, for which g+h = t, while g\ denotes the product of 1. 2, . . . , g if ghl, but denotes unity if g = 0. Hence (16) i+lylZ 2 = (y+2 ^ )s( ~ 1)0+n+lk gW r 3 E _ (-p-^qy)^y k i+py+qy 2 ' where the summation extends over all sets of integers g and h, each 2:0, for which g-\-h^.k—l. Since the left members of (15) and (16) are identically equal by (13), their right members must be identical, so that the coefficients of y*' 1 in them must be equal. 1 Hence the coefficient s k of y k ~ l in (15) is equal to the coefficient of y k ~ 1 in (16), which is made up of two parts, correspond- ing to the two terms of the factor p-\-2qy. When we use the constant term p, we must employ from 2 in (16) the terms in which the exponent of y is equal to k—1. But when we use the other term 2qy, we must employ from 2 the terms in which the exponent of y is equal to k — 2, in order to obtain the combined exponent k— 1 of y. Hence s k is equal to the sum of the following two parts: p2(-l)" +A+1 ^pW (g+2h = k-l), 2 g 2(-l) g+ft+l (g "|" /t / | ) W (g+2h = k-2). In the upper sum, write i for g-\-l, and j for h. In the lower sum, write i for g, and j for h-\-l. Hence 1 In fact, the (k — l)th derivatives of the two right members are identical, and we obtain the indicated result by substituting y = in these two derivatives and equating the results. Note that the final terms in both (15) and (16) have y as a factor of their (k — l)th derivatives. § 107] WARING'S FORMULA 139 where now each summation extends over all sets of integers i and j, each 2: 0, for which (17) i+2j=k. Finally, we may combine our two sums. Multiply the numerator and denominator of the first fraction by i, and those of the second fraction by j. Thus (18) ft = fcZ(-l)«+' ^"7 * ; w , (»+■;•- PL W since the present fraction occurred first multiplied by i and second multi- plied by 2j, and, by (17), the sum of these multipliers is equal to k. Our final result is (18), where the summation extends over all sets of integers i and j, each hO, satisfying (17). If we replace i by its value k—2j, and change the sign of p, we obtain from (18) the result that the sum of the k th powers of the roots of x 2 —px+q = is equal to (19) ss = *f (-1)^^V- 2 V k , fc-2 i k(k — 3) k _ i 2 k(k— 4)(/c— 5) j._ 6 3 = P~kp 1+^^p q r ^— p 5 +.... where K is the largest integer not exceeding k/2. The product of the roots is equal to q. Hence if x denotes one root, the second root is q/x. Thus Sk = x -\-(q/x) . Again, the sum of the roots is x-\-q/x = p. Regard q as given and p as unknown. Hence, if c is an arbitrary constant, the equation (20) p -kqp H T~2~ q V ~ ■ ■ ■ = c is transformed by the substitution p = x+q/x into ^&- Hence equation (20) may be solved for p by radicals by the method employed in § 43 for a cubic equation. 140 SYMMETRIC FUNCTIONS [Ch. IX The above proof applies 1 without essential change to any equation x n+ ClX n-i_^_ + c„ = and leads to the following formula for the sum of the kth powers of its roots: (21) * = kZ(-l)i+ ■ ■ ■ +* (ri+ n !'.'. + rir 1)! cin - * • ^ where the summation extends over all sets of integers n, . . . , r n , each ^0, for which ri+2r 2 +3r 3 + . . • +nr n = k. This result (21) is known as Waring' s formula and was published by him in 1762. Example. Letn = 3, fc=4. Then r,.+2?- 2 +3r 3 = 4 and (ri, r 2 , n) = (4, 0, 0), (2, 1, 0), (1, 0, 1), (0, 2, 0), /3! 2! 1! 1! S4 = 4 ( v ^ ! Cl4 -2m Cl2C2 + l!l! ClC3+ 2! C2: = ci 4 -4ci 2 c 2 +4c 1 c 3 +2c 2 2 . EXERCISES 1. For the quadratic z 2 — pz+g = write out the expressions for s 2 , s 3 , s 4 , s 5 given by (19), and compare with those obtained from Newton's identities (Ex. 3, § 106). 2. Find s 4 for a quartic equation by Waring's formula. 3. For k = 5, (20) becomes De Moivre's quintic p s — 5qp 3 +5q 2 p = c. Solve it by radicals for p. 4. Solve (20) by radicals when k = 7. 108. S-functions Expressed in Terms of the Functions s k . Since we have learned two methods of expressing the s k in terms of the coeffi- cients, it is desirable to learn how to express any S-polynomial (and hence any symmetric function) in terms of the s t . By performing the indicated multiplication, we find that s a s b = 2ai a • W = 2ai a+6 +m2aiW, where m=l if a^b, m = 2 if a = b. Transposing the first term, which is equal to s a+b , and dividing by m, we obtain (22) 2ai a a 2 b = -(s a s b -s a+b ). 1 See the author's Elementary Theory of Equations, pp 72-74, where there is given also a shorter proof by means of infinite series. § 108] COMPUTATION OF SIGMA FUNCTIONS 141 In order to compute Sai 4 o!2 3 «3 2 in terms of the s k , we form the product Sai 4 • Sai 3 o:2 2 = 2ai 'o;2 2 + 2q:i 6 q;2 3 + 2ai 4 a2 3 a:3 2 . Making three applications of (22), we get S4 (S3S2 — S5) = (S7S2 — Sg) + (s 6 S3 — Sg) + Sai 4 a2 3 o:3 2 . Hence So:i 4 a2 3 o:3 2 = S2S3S4 — S2S7 — S3S6 — S4S5+2S9. EXERCISES For a quartic equation, express in terms of the si and ultimately in terms of the coefficients Ci, . . . , c 4 : 1. Sai 2 a2 2 . 2. Sai 3 a2. 3. '2ai 2 a.2fXa. 4. 2ai 2 a2 2 «3 2 . 5. If a^.b > c>0, prove that Sai°a 2 a3 <: = — (s a S6S c — S a S 6+c — S6S 4. C — S c Sa+6+2s 4-{,4. c ), m where m — \ if a>&, m=2ifa = fe. 6. Sai% 6 a3 6 =i(s a s 6 2 -s a S26— 2s 6 s a+5 +2s a+2& ), a>&>0. 7. Sa!% a a 3 a =-l-(s 3 -3s a S2 a +2s3 a ), a>0. 109. Computation of Symmetric Functions. The method last explained is practicable when a term of the S-function involves only a few distinct roots, the largeness of the exponents not introducing a difficulty in the initial work of expressing the 2-f unction in terms of the s k . But when a term of the S-function involves a large number of roots with small exponents, we resort to a method suggested by § 104, which tells us which auxiliary simpler symmetric fuctions should be multiplied together to produce our S-function along with simpler ones. For example, to find Zxi 2 x 2 x 3 X4, when n>4, we employ E1E4 = 2xi • 2x1X2X3X4 = 2xi 2 x 2 x 3 X4+5 2x1X2X3X4X5, 2xi 2 x 2 x 3 X4 = .Z?ii?4— 5 E b . To find 2xi 2 X2 2 x 3 2 x 4 , employ ^3^4 = 2x1X2X3-2x1X2X3X4. When many such products of 2-functions are to be computed, it will save time in the long run to learn and apply the " method of leaders " explained in the author's Elementary Theory of Equations, pp. 64-65. 142 SYMMETRIC FUNCTIONS [Ch. IX MISCELLANEOUS EXERCISES Express in terms of the coefficients ci, . . . , c n : 1. 2ai 2 a 2 «3. 2. 2ai 2 a2 2 «3. 3. Sai 2 a 2 2 o;3a4. 4. Sai 2 a 2 2 a3 2 . If a, /3, 7 are the roots of x 3 +px 2 +qx+r = 0, find a cubic equation with the roots 2 2 2 5. a 2 , /3 2 , T 2 - 6. a/3, ay, 0y. 7. -, -, -. a p 7 8. a 2 +/3 2 , a 2 +7 2 , /3 2 +7 2 . 9. a 2 +a/3+0 2 , etc. If a, /3, 7, 5 are the roots of x 4 +px 3 +qx 2 +rx+s = 0, find io. y^ythrL«.y-z^,_ 4 _ p yi. jL^ta 2 ' ' -^Wo: ^- /a ^-*a 2 -^Wa ^Wa/3' 12. Express SaiWasW* in terms of the st when (t) o>6>c>d>0, and (ii) when a = & = c=d. 13. By solving the first k of Newton's identities (10) as a system of linear equations, find an expression in the form of a determinant (i) for st in terms of c h . . . , a, and (ii) for C£ in terms of s h . . . , sj. 14. One set of n numbers is a mere rearrangement of another set if S\, . . . , s n have the same values for each set. CHAPTER X Elimination, Resultants and Discriminants 110. Elimination. If the two equations ax+b = 0, cx+d = (a 9*0, c=*0) are simultaneous, i.e., if x has the same value in each, then x= — = — , R=ad — bc = 0, a c and conversely. Hence a necessary and sufficient condition that the equations have a common root is R = 0. We call R the resultant (or eliminant) of the two equations. The result of eliminating x between the two equations might equally well have been written in the form be— ad = 0. But the arbitrary selec- tion of R as the resultant, rather than the product of R by some constant, as — 1, is a matter of more importance than is apparent at first sight. For, we seek a definite function of the coefficients a, b, c, d of the functions ax-\-b, cx-\-d, and not merely a property R = or Ry*0 of the correspond- ing equations. Accordingly, we shall lay down the definition in § 111, which, as the reader may verify, leads to R in our present example. Methods of elimination which seem plausible often yield not R itself, but the product of R by an extraneous function of the coefficients. This point (illustrated in § 114) indicates that the subject demands a more careful treatment than is often given. 111. Resultant of Two Polynomials in x. Let f(x)=a x m +a 1 x m - 1 + . . . +a m («o^O), 1 g(x)=b x n +b 1 x n - 1 + . . . +b n (6 ^0) be two polynomials of degrees m and n. Let ai, . . . , a m be the roots of f(x)=0. Since a\ is a root of g(x)=0 only when g(ai)=0, the two equations have a root in common if and only if the product g(ai)g(a 2 ) . . . g{a m ) 143 144 ELIMINATION, RESULTANTS, DISCRIMINANTS [Ch, X is zero. This symmetric function of the roots of f(x)=0 is of degree n in any one root and hence is expressible as a polynomial of degree n in the elementary symmetric functions (§ 104), which are equal to —a\/ao, G^/ao, .... To be rid of the denominators ao, it therefore suffices to multiply our polynomial by ao 11 . We therefore define (2) R(f,g)=a n gMg(a 2 ) . . . g(a m ) to be the resultant of / and g. It equals an integral rational function of ao, . . . , a m , bo, . . . , b n with integral coefficients. EXERCISES 1. If w = l, n = 2, R(f, g)=b a 1 2 —bia ai-{-b2ao 2 . 2. If m = 2, n = l, R(f, g) =a (6o«i+bi)(bo«2+W =a 6i 2 — ai&o&i+o^&o 2 , since ao(ai-\-a 2 ) = — Oi, aoaia 2 =02. 3. If /3i, . . . , ft are the roots of g(x) = 0, so that g(a i )=&o(a 1 - J 8i) (aj-ft) . . . (a<-ft»), then #(/, g)=a n bo m (f*i-Pi) (oa-ft . . . («i-|8„) •(as -ft) (aj-ft) . . . (a 2 -ft,) '(«!»— ft) (am— ft) . . . (a m — ft,). Multiplying together the differences in each column, we see that #(/, ? ) = (-l) CTre &o m /(ft)/(ft) . . .f(t3 n ) = (-l) mn R(g,f). 4. If m = 2, n = l, R(g,f) = bo 2 /( — &1/&0) = a &i 2 — a 1 &o&i+a 2 feo 2 , which is equal to R(f, g) by Ex. 2. This illustrates the final result in Ex. 3. 5. If m=n = 2, R(f, g)=ao 2 b 2 ai 2 a2 2 +ao 2 bobiaia 2 (a 1 -\-a2) +a 2 bob2(ai 2 +a 2 2 ) +ao 2 &i 2 «ia2+a 2 &ife2(«i+a2) +a 2 b 2 2 = b 2 O2 2 — bobiaia2+bob 2 (ai 2 — 2 a a2)+6i 2 a a2 — bAooai+ao 2 ^ 2 . This equals i2(gr, /), since it is unaltered when the a's and 6's are interchanged. 6. Prove by (2) that R is homogeneous and of total degree m in 6 , . . . , b n ; and by Ex. 3, that R is homogeneous and of total degree n in a , . . . , a m . Show that R has the terms a n b n m and (-l) mn 6 m a m n . 7. R(S, gig*)=R . , x, 1, in turn, and the second by x m_1 , x m ~ 2 , . . . , x, 1, in turn. We obtain n-\-m equations which are linear and homogeneous in the m-\-n quantities x m+n ~ 1 , . . . , «, 1. Hence the determinant do oi fl2 . . . a m ao ci a2 . . . a m a a\ a 2 ... a m ... (5) F = . bo bi b r , m rows is zero. It may be shown to be equal to the resultant R(f, g), whether mn is even or odd, by the method employed in the above case m = 3, n = 2. We may also prove as follows that if F = the equations /=0 and <7 = have a common root. Since F was obtained as the determinant of the coefficients of x n ~ 1 f, . . . , xf,f, x m ~ l g, . . . , xg, g, F = implies, by § 96, Lemma 2, the existence of a linear relation B x n - l f+ . . . +B n _ 2 xf-\-B n _ l f+A x m - l g+ . . . +A m _ 2 xg+A, tf^O, §112] SYLVESTER'S METHOD OF ELIMINATION 147 identically in x, with constant coefficients Bo, . In other words, /3/+ag = 0, where , A CT _! not all zero. (6) a=A x m - 1 + . . . +A m - 2 x+A m _ 1 , ^=B x n - 1 + . . . +B n -&+B n - 1 . Neither a nor /3 is identically zero. For, if «=0, for example, then /3/==0 and /3 = 0, whereas the A t and B t are not all zero. Consider the factored forms of /, g, a, /3. Suppose that / and g have no common linear factor. The highest power of each linear factor occur- ring in / divides ag= —fif and hence divides a. Thus / divides a, whereas / is of higher degree than a. Hence our assumption that /=0 and g = have no common root has led to a contradiction. A similar idea is involved in the method of elimination due to Euler (1707-1783). If /=0 and <7 = have a common root c, then f = (x— c)a, —g = (x—c)p, identically in x, where a and /3 are polynomials in x of degrees m — 1 and n — 1, respectively. Give them the notations (6). In the identity j3f+ag=0, the coefficient of each power of x is zero. Hence aoBo aiBo+aoBi +b A +b i Ao+b A 1 =0 =0 OmB n -2-\-a m -iBn-i &mtin — 1 +b n A m - 2 +b n - iA m - 1 =0 +6„A m _ 1 =0. Since these m-\-n linear homogeneous equations in the unknowns B , . . . , 2? n _ i, A , . . . ■ A m -i have a set of solutions not all zero, the determinant of the coefficients is zero, By interchanging the rows and columns, we obtain the determinant (5) EXERCISES 1. For m=n=2, show that the resultant is do CLi 02 R = ao ai 02 b 61 62 b bi b 2 Interchange the second and third rows, apply Laplace's development, and prove that R = (a b 2 ) 2 — (a bi) (aib 2 ) , where (0062) denotes aob 2 — a 2 b a , etc. 148 ELIMINATION, RESULTANTS, DISCRIMINANTS [Ch.X a ai a 3 a 3 a a x a2 03 a ai a 2 a 3 b &i 2 63 Oo ai 02 a 3 Oo Ol a2 o 3 bo &i b 2 &3 b &1 62 &3 bo h fe 2 &3 a Ol «2 03 h 6i &2 bz &0 Ol 62 h 2. For ?n=n = S, write down the resultant R and, by interchanges of rows, derive the second determinant in R To the second determinant apply Laplace's development, selecting minors from the first two rows, and to the complementary minors apply a similar development. This may be done by inspection and the following value of — R will be obtained: (ao&i) { (aib 2 ) (a 2 6 3 ) — (aib 3 ) 2 + (a 2 b 3 ) (a b 3 ) } — (a 6 2 ) { (a 6 2 ) (0263) — (a b 3 ) (aib 3 ) } + (a b 3 ) { (ooMfeos) -(0063) 2 } • The third term of the first line and the first term of the last line are alike. Hence, changing the signs, R = (0063) 3 — 2 (o 6i) (aob 3 ) (a 2 b 3 ) — (o 6 2 ) (a O 3 ) (oi6 3 ) + (00&2) 2 (a 2 b 3 ) + (oobi) (oi0 3 ) 2 — (a 6i) (a x b 2 ) (a 2 b 3 ) . Other methods of simplifying Sylvester's determinant (5) are given in § 113. 113. Bezout's Method of Elimination. When the two equations are of the same degree, the method published by Bezout in 1764 will be clear from the example f=a x 3 -{-aiX 2 -{-a 2 x+a3 = 0, g = b x 3 -\-bix 2 -\-b 2 x-Jrb3 = 0. Then (7) aog-bof, (a Q x-\-a{)g — (b x+bi)f, (aox 2j t-aix+a2)g—(box 2 +bix+b 2 )f are equal respectively to (a bi)x 2 J r(aob 2 )x +((1063) =0, (8) (aob 2 )x 2 + {(ao& 3 ) + (ai&2)}z+(ai&3)=0, (a &3)z 2 +(ciib 3 )x +(a 2 b 3 )=0, §113] BEZOUT'S METHOD OF ELIMINATION 149 where (ao&i)=ao&i — aibo, etc. The determinant of the coefficients is the negative of the resultant R(f, g). Indeed, the negative of the determi- nant is easily verified to have the expansion given at the end of Ex. 2 just above. To give a more instructive proof of the last fact, note that, by (7), equations (8) are linear combinations of scy=0, xf=0, /=0, x 2 g = 0, xg = 0, g = 0, the latter being the equations used in Sylvester's method of elimination. The deter- minant of the coefficients in these six equations is the first determinant R in Ex. 2 just above. The operations carried out to obtain equations (8) are seen to correspond step for step to the following operations on determinants. To the products of the ele- ments of the fourth row by a add the products of the elements of the 1st, 2nd, 3rd, 5th, 6th rows by — b , — &i, — b 2 , a.\, 02 respectively [corresponding to the formation of the third function (7)]. To the products of the elements of the fifth row by a add the products of the elements of the 2nd, 3rd, 6th rows by — bo, — &i, «i respectively [cor- responding to the second function (7)]. Finally, to the products of the elements of the sixth row by ao add the products of the elements of the third row by — b Q [corresponding to a g — b f\. Hence a ai 02 a 3 a 3 R = so that R is equal to the 3-rowed minor enclosed by the dots. The method of Bezout therefore suggests a definite process for the reduction of Sylvester's determinant of order 2n (when m=n) to one of order n. Next, for equations of different degrees, consider the example a ai «2 a 3 a eti ch 03 (a 6 3 ) (0163) (a 2 b 3 ) (00&2) (ao&s) + (ai&2) (01&3) (aobi) (ao&2) (a b 3 ) Then f=aoX iJ ra\X 3 -\-a i x' 2J raiX- i rai ) g = & £ 2 4-&i£+&2- a o x 2 g—b f, (a x+ai)x 2 g — (boX+b^f are equal respectively to (a bi)x 3 + (aob2)x 2 — a 3 boX— aj} , (a b 2 )x 3 + { (aib 2 ) —a 3 b } x 2 — { a 3 &i +a 4 &o } * — a 4 bi . The determinant of the coefficients of x 3 , x 2 , x, 1 in these two functions and xg, g, after the first and second rows are interchanged, is the determinant of order 4 enclosed by flots in the second determinant below. Hence it is the resultant R(f, g) . 150 ELIMINATION, RESULTANTS, DISCRIMINANTS [Ch. X Clo «1 Ch a 3 4 ao ai CH a 3 a 4 &o by b 2 bo hi b 2 bo h b 2 bo bi &2 As in the former example, we shall indicate the corresponding operations on Syl- vester's determinant R Multiply the elements of the third and fourth rows by ao. In the resulting determinant a 2 R, add to the elements of the third row the products of the elements of the first, second and fourth rows by — bo, — bi, ai/a respectively. Add to the elements of the fourth row the products of those of the second by — b . We get a 2 R = a ai Ch a 3 04 ao ai a 2 a 3 a^ (a & 2 ) (aib 2 ) — a 3 bo — a 3 6i— aib —aj)i (a bi) (a &2) — a 3 b — a 4 6o 6o &i h bo 6i h Hence R is equal to the minor enclosed by dots. EXERCISES 1. For to = 3, n = 2, apply to Sylvester's determinant R exactly the same operations as used in the last case in § 113 and obtain R- {ciob 2 ) (aib 2 ) — a 3 6o — a 3 bi (a &i) (ciob 2 ) — a 3 b b bi b 2 2. For TO = n=4, reduce Sylvester's R (as in the first case in § 113) to (a bi) (a b 2 ) (aoh) (aob 4 ) (a b 2 ) (a b 3 ) + (aib 2 ) (a b 4 ) + (aib 3 ) (aib 4 ) (a b 3 ) (a b 4 )-r-(aib 3 ) (aib 4 ) + (a 2 b 3 ) {a 2 b^) (aoh) (aib 4 ) (aiib 4 ) (a 3 bi) §114] GENERAL THEOREM ON ELIMINATION 151 114. General Theorem on Elimination. If any method of eliminating x between two equations in x leads to a relation F = 0, where F is a polynomial in the coefficients, then F has as a factor the true resultant of the equations. Some of the preceding proofs become simpler if this theorem is applied. For example, determinant (3) is divisible by the resultant R. Since the diagonal term of (3) is a term ao 2 b 2 3 of R (Ex. 6, § 111), i^ is identical with R. The preceding general theorem is proved in the author's Elementary Theory of Equations, pp. 152-4. We shall here merely verify the theorem in an instructive special case. Let f=aox 3 +aix 2 +a2X+a3 = 0, g=box 3 -\-biX 2 -\-b 2 x-\-b 3 = have a common root x^O. Then — bof-\-aog = (a 6i)x 2 + (a b 2 )x+ (0063), (63/- a 3 g)/x = (aob 3 )x 2 + (aib 3 )x+ (0263) . By Ex. 1 of § 112, the resultant of these two quadratic functions is F = (a 6 3 ) (ao&i) 2 _ (aob 3 ) (ao&i) (ai&s) (a b 2 ) (a 2 6 3 ) (a fr 3 ) (0163) (ao&2) (02^3) (0063) This is, however, not the resultant R of the cubic functions /, g. To show that (aob 3 ) is an extraneous factor, note that the terms of F not having this factor explicitly are (ao&i) (a 2 &3) { (ao&i) (a2&3) — (aob 2 ) (0163) } . The quantity in brackets is equal to — (ao&3)(ai& 2 ), since, as in Ex. 2 of § 101, 0=4 ao a\ a 2 as bo 61 62 b 3 ao a\ «2 as bo bi b 2 b 3 t/zo&i) (a 2 b 3 ) - (0062) (ai6 3 ) + (0063) (ai& 2 ) . We now see that F=(a b 3 )R, where R is given in Ex. 2 of § 112. This method of elimination therefore introduces an extraneous factor (0063). The student should employ only methods of elimination (such as those due to Sylvester, Euler, and Bezout) which have been proved to lead to the true resultant. 152 ELIMINATION, RESULTANTS AND DISCRIMINANTS [Ch. X EXERCISES Find the result of eliminating x and hence find all sets of common solutions of 1. x 2 -y 2 = 9, xy = 5y. 2. x 2 +y 2 = 25, x 2 +3(c-l)x+c(y*-25) =0. 3. When x 2 +ax+b=0 has a double root, what 3 rowed determinant is zero? 4. Find the roots of x 6 +3x 4 +32z 3 +67£ 2 +32x+65 = by § 79. 115. Discriminants. Let a\, . . . , a m be the roots of (9) f(x) =a x m +a l x m - l + . . . +a„, = (a ^0), so that (10) f(x) =a (x-a 1 )(x-a 2 ) . . . (x-a m ). As in § 44, we define the discriminant of (9) to be D = ao 2m ~ 2 (ai-a 2 ) 2 (a]_-a3) 2 . . . (ai-a m ) 2 (a 2 -a3) 2 . . . (a m _i— a m ) 2 . Evidently D is unaltered by the interchange of any two roots. Since the degree in any root is 2{m—l); the symmetric function D is equal to a polynomial in ao, . . . , a m . Indeed, ao 2m ~ 2 is the lowest power of ao sufficient to cancel the denominators introduced by replacing 2ai by — ai/ao, . . . , a\a 2 ... am by ±a m /ao. By differentiating (10), we see that f(ai)=ao(ai— 0L2) (0:1—0:3) • • • (ai—a m ), f'{a 2 )=ao{a 2 —ai){(x 2 —az) . . . (a 2 —a m ), f'(a3)=ao(a3—ai) (0:3—0:2) (0:3— 0:4) ■ • ■ (0:3— a m ), etc. Hence oo m -7'(«i) • • ./ , W=ao 2m - 1 ("l) 1+2+ • • ' +ro - 1 («i-« 2 ) 2 • • • (owi-og 2 m(m — 1) = (-1) 2 aoD. By (2), the left member is the resultant of f(x), f'(x). Henc'e m(m — 1) 1 (id d=(-d 2 -R(f,n ao §115] DISCRIMINANTS 153 EXERCISES 1. Show that the discriminant of f=y 3 +py-\-q = is — 4p 3 — 27 g 2 by evaluating the determinant of order five for R(J, /'). 2. Prove that the discriminant of the product of two functions is equal to the prod- uct of their discriminants multiplied by the square of their resultant. Hint: use the expressions in terms of the differences of the roots. 3. For 00 = 1, show that the discriminant is equal to So Sm-l s m 1 a m a m ' . . . am' S w _i S m S m+ i . . . S2 m -2 where Si= ai l + . . . +a m \ See Ex. 4, § 88; Ex. 2, § 102. 4. Hence verify that the discriminant of x 3 +px+g = is equal to 3 -2p -2p -3q =-4p 3 -27g 2 . -2p -Sq 2p 2 5. By means of Ex. 1, § 113, show that the discriminant of a x 3 +aiX 2 +a2X-\-a 3 = is 2ao«2 ai02+3ffloa3 2ai03 ai 2a2 3o3 3ao 2ai a^ = 18a aia2O3 — 4a O2 3 — 4ai 3 a 3 +ai 2 a 2 2 — 27a 2 a 3 2 . MISCELLANEOUS EXERCISES 1. Find the equation whose roots are the abscissas of the points of intersection of two general conies. 2. Find a necessary and sufficient condition that f(x) =x 4 -\-px 3 -\-qx 2 -\-rx-\-s = shall have one root the negative of another root. When this condition is satisfied, what are the quadratic factors of /(x)? Apply to Ex. 4, § 74. Hint: add and subtract f{x) and/(— x). 3. Solve /(x) =x 4 — 6x 3 +13z 2 — 14x+6 = 0, given that two roots a and are such that 2a+/3 = 5. Hint: /(x) and/(5— 2x) have a common factor. 4. Solve x 3 +px+q = by eliminating x between it and x 2 -\-vx-\-w = y by the greatest common divisor process, and choosing v and w so that in the resulting cubic equation for y the coefficients of y and y 2 are zero. The next to the last step of the elimination 154 ELIMINATION, RESULTANTS AND DISCRIMINANTS [Ch. X gives x as a rational function of y. (Tschirnhausen, Acta Erudit., Lipsiae, II, 1683, p. 204.) 5. Find the preceding y-cubic as follows. Multiply x 2 -\-vx-\-w = y by x and replace x 3 by — px — q; then multiply the resulting quadratic equation in x by x and replace x 3 by its value. The determinant of the coefficients of x 2 , x, 1 must vanish. 6. Eliminate y between y 3 =v, x — ry-^-sy 2 , and get x 3 — Zrsvx — {r 3 v-\-s 3 v 2 ) =0. Take s = l and chose r and v so that this equation shall be identical with x 3 +px+q = 0, and hence solve the latter. (Euler, 1764.) 7. Eliminate y between y 3 = v, x=f+ey-\-y 2 and get 1 e f—x e f—x v f—x v ev = 0. This cubic equation in x may be identified with the general cubic equation by choice of e, f, v, Hence solve the latter. 8. Determine r, s and v so that the resultant of x-\-r y 3 =v, y=—- y+s shall be identical with x 3 -\-px+q = 0. (Bezout, 1762.) 9. Show that the reduction of a cubic equation in x to the form y 3 = v by the sub- stitution r+sy x= 1+1/ is not essentially different from the method of Ex. 7. [Multiply the numerator and denominator of x by 1— y+y 2 .] 10. Prove that the equation whose roots are the n(n — 1) differences xj—xt of the roots of f(x) =0 may be obtained by ehminating x between the latter and f{x-\-y) =0 and deleting from the eliminant the factor y n (arising from y = xj—Xj = 0). The equation free of this factor may be obtained by eliminating x between f(x) = and n-l {f(x+y)-f(x)} /y=f(x)+f"{x) ^ +. . . +f n) (x)^ -=0. This eliminant involves only even powers of y, so that if we set y 2 = z we obtain an equation in z having as its roots the squares of the differences of the roots of fix) = 0. (Lagrange Resolution des equations, 1798, § 8.) 11. Compute by Ex. 10 the z-equation when/(x) =x 3 +px+q. APPENDIX THE FUNDAMENTAL THEOREM OF ALGEBRA Theorem. An equation of degree n with any complex coefficients f{z)^z n +a 1 z n ~ 1 + . . . -ra„ = has a complex (real or imaginary) root. Write z = x-\-iy where x and y are real, and similarly a\ = ci-\-id\, etc. By means of the binomial theorem, we may express any power of z in the form X+iY. Hence (1) f(z) = (x,y)+ixP(x,y), where 4> and \p are polynomials with real coefficients. The first proof of the fundamental theorem was given by Gauss in 1799 and simplified by him in 1849. This simplified proof consists in showing that the two curves represented by cj>(x, y)=0 and \l/(x, t/)=0 have at least one point {x\, y\) in common, so that zx=xi-\-iyi is a root of f(z)=0. This proof is given in Chapter V of the author's Elementary Theory of Equations. We here give a shorter proof, the initial idea of which was suggested, but not fully developed, by Cauchy. 1 Lemma 1. aih-\-a2h 2 -\- . . . -\-a n h n is less in absolute value than any assigned positive number p for all complex values of h sufficiently small in absolute value. The proof differs from that of the auxiliary theorem in § 62 only in reading " in absolute value " for " numerically." We shall employ the notation \z\ for the absolute value + Vx 2 +y 2 of z = x-\-iy. 1 For a history of the fundamental theorem, see Encyclopedic des sciences mathe- maliques, tome I, vol. II, pp. 189-205. 155 156 APPENDIX Lemma 2. Given any positive number P, we can find a positive number R such that \ f(z) \>Pif\z\^R. The proof is analogous to that in § 64. We have f(z)=z n (l+D), Z)-=ai0)+ . . . +a»0)" Since (Ex. 5, § 8) the absolute value of a sum of two complex numbers is equal to or greater than the difference of their absolute values, we have \f(z)\^\z\ n [l-\D\}. Let p be any assigned positive number <1. Applying Lemma 1 with h replaced by 1/z, we see that | D \

p\l- V )>P if p n ^P/(l-p), which is true if *fe iR. V This proves Lemma 2. Lemma 3. Given a complex number a such that f(a) 3^0, we can find a complex number z for which \ f(z) \ < | /(a) | . Write z = a+h. By Taylor's theorem (8) of § 56, h r h n ,^-H...+/w(a).% r! n! f(a+h)=f(a)+f'W+ - ■ ■ +/ (r) («)-^+ • • • +f w (fl)- Not all of the values f'(a), /"(a), ... are zero since f in) (a)=nl Let / (r) (a) be the first one of these values which is not zero. Then /(a+ft) , , / w (a) V , , / (w) (q) £ /(a) "^/(a) "rl-" "*"/(<») nV Writing the second member in the simpler notation g (h) = l+bh r +ch r+1 + . . . +lh n , b^O, we shall prove that a complex value of h may be found such that \ g(h) | < 1. Then the absolute value of f(z)/f (a) will be <1 and Lemma 3 proved. To find such a value of h, write h and b in their trigonometric forms (§ 4) h = p(cos d+isin 6), b=\ b | (cos /3+t sin /3). FUNDAMENTAL THEOREM OF ALGEBRA 157 Then by § 5, § 7, bh r =\b\p T { cos (B+rd)+ism (/3+r0)}. Since h is at our choice, p and angle 6 are at our choice. We choose 8 so that /3+r0=18O°. Then the quantity in brackets reduces to — 1, whence 0(70 = (1-| b \p T )+h r (ch+ . . . +lh n ~ T ). By Lemma 1, we may choose p so small that \ch+ . . . +lh n ~ r \<\b\. By taking p still smaller if necessary, we may assume at the same time that | 6 |p r 2 (x,y) + e(x,y), which, by (1), is the square of | f(z) \. As in the elements of solid analytic geometry, consider the surface represented by Z = G(x, y) and the right circular cylinder x 2 -\-y 2 = R 2 . Of the points on the first surface and on or within their curve of intersection there is a lowest point or there are several equally low lowest points, possibly an infinite number of them. Expressed arithmetically, among all the pairs of real numbers x, y for 158 APPENDIX which x 2 -\-y 2 ^.R 2 , there is 1 at least one pair x\, y\ for which the polynomial G(x, y) takes a minimum value G(x±, y{), i.e., for which G(xi, y\) =G{x, y) for all pairs of real numbers x, y for which x 2 -\-y 2 SR 2 . Proof of the Fundamental Theorem. Let z' denote any complex number for which f(z') ^0. Let P denote any positive number exceeding | f(z') \. Determine R as in Lemma 2. In it the condition | z \ hR may be interpreted geometrically to imply that the point (x, y) representing z = x-\-iy is outside or on the circle C having the equation x 2 -\-y 2 = R 2 . Lemma 2 thus states that, if z is represented by any point outside or on the circle C, then \f(z) \>P. In other words, if |/(z) \=P, the point representing z is inside circle C. In particular, the point representing z' is inside circle C. In view of the preceding section on minimum value, we have G(x h Vl ) < G(x, y) for all pairs of real numbers x, y for which x 2 -\-y 2 ^.R 2 , where x\, y\ is one such pair. Write z\ for x\-\-iy\. Since | f(z) \ 2 = G(x, y), we have for all z's represented by points on or within circle C. Since z' is repre- sented by such a point, (2) \f(.zi)\^\m\, -3w 2 ; i, coi; w 2 i; fl = cos40°-H sin 40°, wR, w 2 R. 3. ±(l+t)/V2; ±(l-i)/V2; ± co 2 . Page 9 4. -1,'cos A+i sin A (A =36°, 108°, 252°, 324°). 6. R*, R«, R*. Page 10 5. p(p-l). 6. (p-l)(g-l)(r-l) if n=pqr. Page 13 1. 51. 2. 13. Page 15 1. Rem. 11, quot. x 2 +5x+8. 2. -61, 2x"-4x 3 +7a: 2 -14a:+30. 3. -0.050671, x 2 +6.09x + 10.5481. 4. z 2 -x-6, z+2; 4,3, -2. 5. x 2 -z-6=0, 3, -2. 6. 2±V5. 7. 2x 2 -x+2. 8. jc«+1. Page 17 1. x 3 -3x 2 +2a;=0. 2. x 4 -5x 2 +4 =0. 3. x 4 -18x 2 +81 =0. 4. x 4 -5x 3 +92 2 -7:c+2=0. 5. 6 2 =4ac. 7. By theorem in §18. 159 160 ANSWERS Page 19 1. x 3 -6x 2 +llx-6=0. 2. x 4 -8x 2 +16=0. 3. 1, 2. 5. 4, f, -f . 6. 1, 3, 5. 7. 1, 1, 1, 3. 8. 2, -6, 18. 9- -3, 1, 5. 10. 5, 2, -1, -4. 11. 2/2-(p 2 -2g)y+g 2 =0. 12. y 2 -(p 3 -3pq)y+q 3 =0. 13. ft) 2/ 2 -Z/(P 3 -3pg)/g+?=0. 14. P 3 r=q 3 . (u) y2- q (p2-2q)y+q*=0. 15. 2,4, -6. (m) 2/ 2 -(p+p/?)2/+2+g+l/g=0. Page 20 1. 5, - 1 ± V^3. 2. 1 ±i, 1 ±V2. 3. x 3 -7x 2 +19a;-13=0. 4- 4, 1 - V-5, x 3 -6x 2 +14x-24=0. 6. ±1, 2±V3. 7. V3, 2±i. 9- x 3 -fx 2 -fx+-|=0. 10. 2 + V3, x 2 +2x+2=0. 11,12. Not necessarily. 13. No. Page 23 1. 19J, 3. 2. 6. 3. 2. 4. 3. 5- 0, -7, -£. Page 25 1. -1, -1, -6. 2. -2, 3, 4. 3. 1, 3, 6. 4. -2, -4. 5- None. Page 27 1. 2, -1, -4, 5. 2. 9. 3- 8, 9. 4. -12, -35. 5. 2, 2, -3. Page 28 I. 1, 6, », 3. 2. 1, 2, 3- 3- 6- 4" 2) 4> 4- £>• 4, 4) 6- "• 2> 3} 4- 7. a, 8. f. 10. x 2 -12x-12=0. n. x 3 -3x 2 -12x+54=0. Page 30 i.l,4. 2. -1,-4. 3.0.7,-5.7. 4. -0.7,5.7. 5-2,2. 6. Imaginary. Page 40 5. x*+x*-4x 3 -3x 2 +3x+l=0. 6. -|(l±V-3), £(7±V45). 10. See (11), §32. 11. Edges roots of x 3 -7x 2 + 12x-t> =0, all real (§45) and irra tional. 14. A = area, c=hypotenuse, squares of legs §(c 2 ± Vc 4 -16A 2 ) . 15. A area, a, & given sides, square third side is a 2 +6 2 ±2\ / a 2 & 2 — 4A 2 . 16. y i -2y 3 + (2-g 2 )y 2 -2y+l=0 ) pos. roots 0.09125, 10.95862. *r -4q 3 -2pqr -9r 2 (r-pq) 2 10. y=q+r/x. 4x 2 +px+q (5p 2 -12q)(p 2 -4q) _13 4(p 3 -4pq+8r) ~~Z P ' 6. 24r-p 3 . 8. 27r 2 -9pqr +2q* = 0. 11. x = 12. ?/=- — 3x-p , see §112. 13. 2+2y- 2q(p 3 +2pq—r) p 2 q—pr+s -5p, see Ex. 17. Page 136 3. s 2 =p 2 -2q, s 3 =p 3 -3pq, s t =p 4 -4:p 2 q+2q 2 , s 5 =p 5 -5 p 3 q+5 pq 2 . 4. S5„=5-3 n , st=0 if k is not divisible by 5. 5. All zero. Page 140 2. See Ex. 2, p. 136. 3. e^ic + VQ + e 5 --'^- VQ, Q = fc 2 -? 6 (j=0, 1, 2, 3, 4). 4. e^c + VQ + e 7 -^- VQ, Q = \c 2 -qi (j=0, 1, . . . , 6). 1. c 2 2 —2cic 3 +2ci. 3. dc 3 -4c 4 . Page 141 2. C! 2 c 2 — 2c 2 2 — CiC 3 +4c 4 . 4. c 3 2 — 2c 2 c 4 . Page 142 1. C!C 3 — 4c 4 if n> 3, C1C3 if n = 3. 2. 3ciC 4 — c 2 c 3 — 5c 6 . 3. c 2 c 4 -4cic 5 +9c6. 4. c 3 2 -2c 2 c 4 +2cic 5 -2c6. 5. y 3 -(p 2 -2q)y 2 + (q 2 -2pr)y -r 2 = 0. 6. y 3 -qy 2 +pry-r 2 =0. 7. n/ 3 +2gt/ 2 +4pz/+8 =0. 8. Eliminate x by y= s 2 - x 2 . 9. Use p 2 -q+px=y. 10. -4+pr/s. 11. (rs-pr 2 +2pgs)/s 2 . 12. (i) s a s b s c s d -'2s a s b Sc+d+2'2s a Sb +c+d + '2s a - H) s c+ a-6s a+b+e+d . (u) -^(V -6s a 2 s 2a +8s a s 3a +3s 2a 2 -6s 4a ). 166 ANSWERS 13. (0 st = 1 . . Ci Ci 1 . . 2c 2 c 2 Ci 1 . . 3c 3 C3 C2 Ci . . 4c 4 Ci- 1 Cfc-2 Cfc-3 . . . d &Ci s 3 = - 1 ci Ci 1 2c 2 c 2 Ci 3c 3 where all but the last term in the main diagonal is 1, and all terms above the diagonal are zero except those in the last column. If k>n, we must take Cj=0 (j>n). (u) k ! Cjt = — 1 . Si Si 2 . s 2 S 2 Si 3 . s 3 S1-1 Sj— 2 Si- 3 • • Si s* , 3!c 3 =- 1 Si Si 2 s 2 s 2 Si S3 Page 152 1. y 2 {l%-y 2 ); y=0,x = ±3; y = ±4,x = +5. 2. (c-l)% 2 -25)(?/ 2 -16). Ifc^l, ?/ = ±5,x=0; y = ±4,a; = +3. 3- =46 -a 2 . 4. 2±3i, -2±i, ±t. Pages 153, 154 2. pqr—p 2 s—r 2 =0, x 2 -\-r/p, x 2 +px+ps/r. 3. 1, 3, l±i. 11. See Ex. 15, p. 134 1 a b 2 a 2 a INDEX Numbers refer to pages. Abscissa, 55 Absolute value, 3 Amplitude, 3 Argument, 3 Arithmetical progression, 19 Bend point, 56, 64 Bezout's eliminant, 148 Budan's theorem, 83 Cardan's formulas, 46, 48 Complex number, 1 geometrical representation, 3, 6, 7 trigonometric form, 3 Compound interest, 13, 90, 100 Conjugate, 1 Continuity, 66, 157 Cube root, 5, 48 of unity, 3, 4 Cubic equation, 32, 40, 45, 127, 134, 153-4 graph of, 65 number real roots, 48, 65, 79 reduced, 45, 64-5 trigonometric solution, 49 De Moivre's quintic, 140 theorem, 5 Derivative, 57-60, 69, 83, 97, 135 Descartes' rule of signs, 71, 85 Determinants, 101-27, 146-53 addition of columns, 113 columns, 103 complementary minors, 122 diagonal term, 103 elements, 103 expansion, 109 interchanges, 106, 107 Laplace's development, 122-3, 147-9 minors, 109, 116 of Vandermonde, 108 product of, 124 rank, 116, 121 Determinants, removal of factor, 111 rows, 103 signs of terms, 103-6 skew symmetric, 108 sum of, 112 Discriminant, 152-3 of cubic, 47, 65, 134 of quadratic, 11, 12 of quartic, 51, 81 Double root, 16 (see Discriminant) Duplication of cube, 35 Elementary symmetric function, 128 Elimination, 143-154 extraneous factor, 151 Equation for differences of roots, 154 squares of differences, 134, 154 Euler's eliminant, 147 Factor theorem, 12 Factored form, 11, 15 Fundamental theorem of algebra, 17, 155- Geometrical construction, 29-44 progression, 13, 19 Graphs, 55-70 Greatest common divisor, 61, 75 Horner's method, 86 Identical polynomials, 16 Identity, 11 Imaginary, 2 roots, 19, 98 Inflexion, 62-64 Integral rational function, 12, 17, 20 roots, 24-27 Interpolation, 93, 97 Interval, 78 Irreducible case, 48 Isolation of roots, 71 167 168 INDEX Linear equations, system, 101-3, 114-21 homogeneous, 119, 121 Linear factors, 11, 17 Lower limit to roots, 23 Matrix, 120 augmented, 121 Maximum, 57 Minimum, 57, 157 Modulus, 3 Multiple roots, 16, 60, 82 Multiplicity of root, 16, 20, 61 Newton's identities, 136 method of solution, 90-98 Number of roots, 16, 17, 48, 52, 69, 72-85 of negative roots, 74 Order of radical, 32 Ordinate, 55 Plotting, 55 Polynomial, 12, 66 sign of, 68 Primitive root of unity, 9 Product of roots, 18 Pure imaginary, 2 Quadratic equation, 11 graphical solution, 29, 55 sum of powers of roots, 139 Quadratic function a square, 12 Quartic equation, 50-54, 80-81 Quotient by synthetic division, 14 Rational roots, 27 Real equation, 12, 20 Reciprocal equation, 37, 44 Regula falsi, 93 Regular polygon, 8 7 sides, 35-36 9 sides, 35, 39 17 sides, 41-44 n sides, 44 Regular decagon, 39 pentagon, 39 Relations between roots and coefficients, 17 Relatively prime, 9, 10 Remainder theorem, 12 Resolvent cubic, 50, 51 Resultant, 143-154 Rolle's theorem, 69 Root between a and b, 67 Roots of unity, 8, 36, 39, 44, 136 periods of, 40 Roots, nth, 7 Sigma function, 128-142 Sign of polynomial, 68 Simple root, 16 Slope, 57, 59 Solution of numerical equations, 86-100 Specific gravity, 89 Square roots, 1, 30, 31, 96 Sturm's functions, 75-82 Sum of four squares, 126 like powers of roots, 134-142 products of roots, 18 roots, 18 Surd roots in pairs, 20 Sylvester's eliminant, 145, 149, 150 Symbols, 11; f(x), 12; \a\, 23, 155; r!, 59' /W(x),59; S, 128; s k , 134; R(f, g) 144 Symmetric functions, 128-142 in all but one root, 132-4 Synthetic division, 13, 86-95 Tangents, 60, 62 Taylor's theorem, 59 Transformed equation, 28, 86 Triple root, 16 Trisection of angle, 34, 40 Upper limit to roots, 21-23 Variation of sign, 71 Waring's formula, 136 BOSTON COLLEGE 3 9031 027 49857 5 QC ZJ> ZD as 3Q UJ 7T. DC .U t: