[Plate: TYPICAL ERROR CURVES.]

Fig. A. The Method of Least Squares. Velocity of light, 66 observations. x unit = 0.000000001 second. (Velocity of Light in Air and Refracting Media.)

Fig. B. Thread intervals, Madison Meridian Circle, 180 observations. x unit = 0.01 second of time. (Washburn Observatory, unpublished observations.)

ERRATUM. Fig. B. The positive part of the axis of x should pass through the lowest plotted point. The relation of the plotted points to the curve is correctly represented.

Fig. C. Photometric measures of Iapetus, 303 observations. y = 9.1 e^(-0.026 x²). x unit = 0.1 stellar magnitude. (Annals of Harvard College Observatory, Vol. II, p. 252.)

Fig. D. Declinations of α Lyrae, 1866-67, 106 observations. y = 8.1 e^(-0.0221 x²). x unit = 0.05 second of arc. (Constant of Aberration, A. Hall, Lynn, 1888.)

AN ELEMENTARY TREATISE UPON THE METHOD OF LEAST SQUARES, WITH NUMERICAL EXAMPLES OF ITS APPLICATIONS.

BY GEORGE C. COMSTOCK, PROFESSOR OF ASTRONOMY IN THE UNIVERSITY OF WISCONSIN, AND DIRECTOR OF THE WASHBURN OBSERVATORY.

BOSTON, U.S.A.: PUBLISHED BY GINN & COMPANY. 1890.

COPYRIGHT, 1889, BY GEORGE C. COMSTOCK. ALL RIGHTS RESERVED. TYPOGRAPHY BY J. S. CUSHING & CO., BOSTON, U.S.A. PRESSWORK BY GINN & CO., BOSTON, U.S.A.

PREFACE.

THE following elementary treatment of the Method of Least Squares has grown out of my attempts to so present the subject to students of physics, astronomy, and engineering, that a working knowledge based upon an appreciation of its principles might be acquired with a moderate expenditure of time and labor.
Conceiving that the ultimate warrant for the legitimacy of the method itself is to be found in the agreement between the observed distribution of residuals and the distribution represented by the error curve, I have not scrupled to abandon altogether the analytical demonstrations of the equation of this curve and to present it as an empirical formula, representing the generalized experience of observers. The evidence in support of a formula of this kind is necessarily cumulative, and the few curves which are presented in illustration of the law of error are to be considered as samples of the kind of evidence which exists in great abundance. By abandoning the theoretical demonstrations, the student is freed from the embarrassments which are usually encountered at the threshold of the subject, and which in many cases cause it to appear as a mathematical puzzle whose analytical difficulties absorb the attention of the tyro to the complete exclusion of the purposes for which the analysis is conducted. I have sought to give prominence to the distinction between accidental and systematic errors, and to insist upon the limitations which result from the difference between these two classes of error. To illustrate the principles of the text, I have made free use of numerical data and have arranged the computations in forms which experience has shown to be convenient for the purpose, with a view to their subsequent use by the student as models for his own computations. In the preparation of these pages, I have consulted many, if not most, of the standard treatises upon the subject, but my indebtedness for suggestions and methods of treatment is principally to

FAYE, Cours d'Astronomie de l'École Polytechnique.
OPPOLZER, Lehrbuch der Bahnbestimmung.
WRIGHT, Treatise on the Adjustment of Observations.

G. C. C.

CONTENTS.

SECTION                                                    PAGE
 1. ILLUSTRATIVE PROBLEM . . . . . . . . . . . . . . . . . .  1
 2. ERRORS AND RESIDUALS . . . . . . . . . . . . . . . . . .  3
 3. THE DISTRIBUTION OF RESIDUALS  . . . . . . . . . . . . .  5
 4. THE ERROR CURVE  . . . . . . . . . . . . . . . . . . . .  8
 5. THE PRINCIPLE OF LEAST SQUARES . . . . . . . . . . . . . 12
 6. WEIGHTS  . . . . . . . . . . . . . . . . . . . . . . . . 16
 7. NORMAL EQUATIONS . . . . . . . . . . . . . . . . . . . . 19
 8. NON-LINEAR OBSERVATION EQUATIONS . . . . . . . . . . . . 21
 9. FORMATION AND SOLUTION OF NORMAL EQUATIONS . . . . . . . 23
10. NUMERICAL EXAMPLE  . . . . . . . . . . . . . . . . . . . 29
11. CONDITIONED OBSERVATIONS . . . . . . . . . . . . . . . . 38
12. PROBABLE ERRORS  . . . . . . . . . . . . . . . . . . . . 45
13. PROBABLE ERROR OF A FUNCTION OF OBSERVED QUANTITIES  . . 51
14. ASSIGNMENT OF WEIGHTS; REJECTION OF OBSERVATIONS . . . . 54
15. EMPIRICAL OR INTERPOLATION FORMULÆ . . . . . . . . . . . 58
16. APPROXIMATE SOLUTIONS  . . . . . . . . . . . . . . . . . 64
    INDEX TO FORMULÆ . . . . . . . . . . . . . . . . . . . . 68

THE METHOD OF LEAST SQUARES.

§ 1. Problem. To determine the coefficient of linear expansion of a certain bar of metal, its length was determined at different temperatures by comparison with a standard of known length. The data furnished by the measures are (Kohlrausch, Leitfaden der Physik):

    Temperature.    Observed Length.
                          mm.
       20° C.           1000.22
       40               1000.65
       50               1000.90
       60               1001.05

It is required to determine from these observations the amount of the expansion of the bar per degree Centigrade. If c denote the required expansion, and l0 the length of the bar when its temperature is 0° C., its length, l, at any other temperature, t, may be represented by the equation

    l = l0 + c t

By means of this equation the four observations recorded above are transformed into the following observation equations:

    (1)  l0 + 20 c = 1000.22
    (2)  l0 + 40 c = 1000.65
    (3)  l0 + 50 c = 1000.90
    (4)  l0 + 60 c = 1001.05

Any two of these equations are sufficient to determine the values of l0 and c, but the values derived from different pairs of equations will be different. Thus we may find from

    Equations.         l0           c
                       mm.          mm.
    (1) and (2)      999.79      +0.0215
    (1) and (4)      999.80       .0208
    (2) and (3)      999.65       .0250
    (3) and (4)     1000.15       .0150
        etc.           etc.        etc.

We are here presented with a problem of constant recurrence in the investigations and applications of physical science.
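As a modern aside (no such computation appears in the original text, and the function name is our own), the pair-by-pair solutions in the table above are easy to reproduce:

```python
# Illustrative sketch: solve l0 + c*t = l exactly from two observations,
# reproducing the table of pair-by-pair results above.

def solve_pair(t1, l1, t2, l2):
    """Return (l0, c) satisfying l0 + c*t = l for both observations."""
    c = (l2 - l1) / (t2 - t1)
    l0 = l1 - c * t1
    return l0, c

# the four observations: (temperature in deg C, observed length in mm)
obs = [(20, 1000.22), (40, 1000.65), (50, 1000.90), (60, 1001.05)]

for i, j in [(0, 1), (0, 3), (1, 2), (2, 3)]:
    l0, c = solve_pair(*obs[i], *obs[j])
    print("(%d) and (%d): l0 = %.2f, c = %.4f" % (i + 1, j + 1, l0, c))
```

Each pair of equations yields a different (l0, c), which is the inconsistency the text goes on to discuss.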
In order to determine the values of certain quantities with a high degree of precision more measures or observations are made than are absolutely necessary, and these observations prove to be inconsistent among themselves, so that the resulting values of the unknown quantities depend upon the manner in which the data are combined. It is evident that all of the values above found for l0 and c cannot be correct, and it is doubtful if any absolutely correct value can be derived from the data; but it is also apparent that the observations are not worthless and that any of the values above derived may be considered as approximations, more or less close, to the true values of the required quantities. If we assume that the relation between the length of a bar and its temperature can be expressed by an equation of the form employed above, we must suppose that the discordances in the results are due to errors in the observations, and the problem then becomes: To find from the observed data a set of results which shall be affected as little as possible by the errors of the data, or in more technical language, to find the most probable values of the unknown quantities. We may establish in advance of any formal investigation of this problem certain principles to which its solution must conform. Thus,

(A) The adopted values of the quantities which are to be determined must be based upon all the data available. Only in exceptional cases, which will be considered hereafter, is it proper to omit or reject any observation or any known relation among the quantities.

(B) The adopted values must satisfy the observation equations as nearly as possible.

§ 2. Errors and Residuals. The expression error of an observation has been freely used in the preceding section, but it should be recognized that the amount of this error can rarely, if ever, be known, since this would imply an exact knowledge of the unknown quantities.
We may, however, obtain approximate values of these errors from the adopted values of the quantities which were to be determined. Thus, if the values l0 = 999.79, c = +0.0215 be substituted in equations (1), these become

    (1)  1000.22 = 1000.22        (3)  1000.87 = 1000.90
    (2)  1000.65 = 1000.65        (4)  1001.08 = 1001.05

The difference between the first and second members of any one of these equations is called the residual of that equation, and is approximately the error of the corresponding observation. The residuals which correspond to the several values of l0 and c derived in § 1 are given below in tabular form.

    l0 =   999.79    999.80    999.65   1000.15
    c  =  +0.0215   +0.0208   +0.0250   +0.0150

    v  =     0.00      0.00     +0.07     -0.23
              .00      +.02       .00      -.10
             -.03      +.06       .00       .00
             +.03       .00      -.10       .00

We may thus, for any assumed values of the unknown quantities, find a corresponding set of residuals, and the smaller these residuals are the closer is the probable approximation of the assumed, to the true values. Principle (B). This statement, however, requires an important qualification to which we now proceed. The errors with which any given series of observations is affected may be divided into two classes: Accidental Errors, or those whose law of recurrence is such that in the long run they are as often positive as negative and whose effect upon the mean of a great number of observations therefore differs but little from zero; and Systematic Errors, or those which in the given series of observations do not thus tend to be eliminated from the mean.
In the observations considered in § 1, an error of judgment by which the observer in a given case read the thermometer 0°.1 too high would probably be an accidental error, since it may be presumed that in the long run he would read it as often too low as too high; but if through a fixed habit of observing, the thermometer were always read too high, this would be a systematic error, and the number of observations might be indefinitely increased without in the least diminishing its effect. If the standard of length with which the bar was compared were an erroneous standard (e.g. 0.01 mm. too long), all of the observations would be affected with a systematic error due to this source, and the residuals would furnish no trace of this error, since they show only discordances among the observations, and not errors affecting all alike. The smallness of the residuals in any case, therefore, furnishes no guaranty that the observations and the results derived from them have not been vitiated by systematic errors. The presence of errors of this class constitutes the greatest obstacle to the accurate determination of any set of quantities whose values are sought, and the ingenuity and skill of the observer or experimenter cannot be better employed than in avoiding or overcoming the effect of such errors. It therefore deserves especial notice that systematic errors can often be transformed into accidental errors by varying the methods of observation or the conditions under which the observations are made. Thus the possible systematic error of judgment in reading a thermometer, to which allusion was made above, may be transformed into an accidental error if several different persons take part in the observations, since it is hardly probable that they will all have a common, persistent error of judgment.
The error due to using an erroneous standard of length may be changed into accidental error by employing a number of different standards, since it is not probable that these, constructed at different times and by different makers, will all have a common error of length. Considerations of this character serve to illustrate the great practical importance of varying the methods of determining any quantity whose value is desired with great precision. Multiplying observations by the same method and under similar circumstances serves only to diminish the effect of accidental errors and is useless beyond a certain limit, while varying the methods and the circumstances under which observations are made tends to eliminate errors of both kinds. The principles here considered find their appropriate application in the selection of the methods by which any given set of unknown quantities is to be determined; but after the observations have been made, since they can, in general, furnish but little, if any, information in regard to their own systematic errors, these must be neglected and the reduction and discussion of the observations directed toward eliminating the effect of the accidental errors.

§ 3. The Distribution of Residuals. Gauss, a German mathematician, has shown by a course of analysis based upon the theory of probabilities that in any given series comprising a very large number of observations affected with accidental errors, the number of errors of a given magnitude, x, is a function of that magnitude.
Thus, if x' and x'' denote any two errors, and y' and y'' the number of observations having the errors x' and x'' respectively, then

    y' : y'' :: f(x') : f(x'')

The analytical expression for f(x) obtained by Gauss is

    f(x) = (h / √π) e^(-h²x²)        (2)

where e = base of the Naperian system of logarithms, π = ratio of the circumference to the diameter of a circle, h = a number whose value must be derived for each series of observations, but is constant for all the observations of that series. The same expression for f(x) has been derived by other mathematicians through different courses of analysis, but against all of these investigations objections of a theoretical character have been urged. Experience, however, shows that the actual distribution of residuals does follow this law, not with absolute accuracy, but to a remarkable degree of approximation. An excellent illustration of this distribution in the case of a comparatively small number of observations is afforded by a series of 66 determinations of the velocity of light made at Washington, in the year 1882.* By means of a revolving mirror the time required for the passage of a ray of sunlight from one terrestrial point to another was measured. The mean of the 66 determinations of this time interval was 24.827 millionths of a second. By subtracting this mean from each single determination a series of residuals will be obtained, and the number of residuals whose magnitude equals 1, 2, 3, etc. units may then be counted.
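In modern terms the counting step just described can be sketched as follows; the data here are invented for illustration and are not the Washington series.

```python
# Sketch: subtract the mean from each determination, then count the
# residuals by whole-unit magnitude.
from collections import Counter

def count_by_magnitude(values):
    mean = sum(values) / len(values)
    return Counter(round(v - mean) for v in values)

# hypothetical determinations, in thousand-millionths of a second
sample = [24827, 24825, 24829, 24827, 24824, 24830, 24827]
print(count_by_magnitude(sample))
```

The tally that results is exactly the raw material plotted in the figures which follow.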
In this way a fair approximation to the distribution of residuals represented by Gauss's law of error will be found; but as this law purports to represent the average distribution of a great number of errors, we shall obtain a better comparison between it and the actual distribution by the following device, to which we resort in order to increase the number of available residuals: Let it be assumed that in any given set of observations the number of residuals of magnitude x is proportional to the number of residuals occurring between the limits x - a and x + a, where a is a quantity which in strictness ought to be an infinitesimal, but which may be made a small finite quantity without appreciable error. In the present case we adopt as the unit in which the residuals are to be expressed, the thousand-millionth part of a second (0.000000001), and put a equal to two such units. Thus, from a series of 66 observations are derived the following numbers which represent the distribution of residuals which might be expected to occur in a much longer series.

* Velocity of Light in Air and Refracting Media. Bureau of Navigation, Navy Department, 1885, p. 187.

       Residual.        No.    %         Residual.           No.    %
    Less than -13.5      2    0.8     Greater than +13.5      0    0.0
    Equal to  -13.5      0    0.0     Equal to    +13.5       2    0.8
              -12.5      2    0.8                 +12.5       2    0.8
              -11.5      2    0.8                 +11.5       3    1.2
              -10.5      2    0.8                 +10.5       6    2.3
               -9.5      3    1.2                  +9.5       5    1.9
               -8.5      2    0.8                  +8.5       6    2.3
               -7.5      4    1.6                  +7.5       7    2.7
               -6.5      6    2.3                  +6.5       8    3.1
               -5.5      8    3.1                  +5.5      10    3.9
               -4.5     12    4.7                  +4.5      12    4.7
               -3.5     15    5.8                  +3.5      15    5.8
               -2.5     18    7.0                  +2.5      17    6.6
               -1.5     21    8.2                  +1.5      21    8.2
               -0.5     23    8.9                  +0.5      23    8.9

The column headed % represents the number of residuals differing not more than half a unit from the magnitude given in the first column, expressed as a percentage of the whole number of residuals. Fig.
A furnishes a graphical representation of this distribution, each percentage in the above table being represented by a point whose abscissa is the magnitude of the residual and whose ordinate is the percentage itself. The curve whose equation is

    y = (100 h / √π) e^(-h²x²)        h = 0.158

is shown in the same figure, and a simple inspection of the curve shows that its ordinates represent very approximately the percentage of residuals of each magnitude. The coefficient h appears multiplied by the factor 100 in order that the ordinates may be represented as percentages. Figs. B, C, D represent the distribution of residuals in three other series of observations of different kinds, made at different places, by different observers, but all following the same law. The unit in which the residuals are expressed, unit of x, is stated with each figure, and the unit of y is in every case one per cent of the whole number of residuals. The equations of the several curves shown in the figures are almost identical, but the feature to which the student's attention is called is that the algebraic form of the equation is in each case

    y = (h / √π) e^(-h²x²)

and not that h has approximately the same value in each curve. The numerical value of h depends upon the unit adopted for x, and these units having been chosen with reference to a convenient graphical representation of the residuals, the agreement in the several values of h must be regarded as purely artificial. The series of observations represented in Fig. D is known to be affected with small systematic errors, and it will be noted that the distribution of the residuals is more irregular in this case than in any of the others. In each of the series represented in Figs. A and C, there are two residuals whose magnitudes are too great to be represented in the figures; and it is quite generally found that the actual number of very large residuals is slightly greater than the number given by the error curve.
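In modern notation the percentage curve of Fig. A can be evaluated directly; the code below is our own sketch, with h = 0.158 taken from the figure.

```python
# The percentage curve of Fig. A: y = (100*h/sqrt(pi)) * exp(-(h*x)^2).
import math

def curve_percent(x, h=0.158):
    return 100.0 * h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)

# maximum ordinate at x = 0 is about 8.9 per cent, agreeing with the
# largest entries (8.9) in the table of observed percentages
print(round(curve_percent(0.0), 1))
```

The maximum ordinate 100h/√π and the rapid fall-off for large x are the two features the text appeals to in the next sections.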
The illustrations here given are typical cases, and may serve to exemplify the statement made at the beginning of this section, that the actual distribution of residuals is found to follow Gauss's law of error, and in the following sections this law will be assumed as experimentally demonstrated, and from it will be derived the method of combining and discussing observations. The student will find it an instructive exercise to treat in a manner similar to that pursued above any series of observations to which he may have access, particularly his own observations, and thus lend additional weight to the experimental evidence which is here presented for his consideration.

§ 4. The Error Curve. From the manner in which the ordinates of the points plotted in Figs. A, B, C, and D were derived, it will be apparent that these ordinates represent the number of residuals falling within certain chosen limits of error. Thus in Fig. A, 8.9 per cent of all the residuals lie between the limits 0 and +1, 8.2 per cent between +1 and +2, etc., the interval within which the residuals are enumerated being in every case one unit. It is also evident that the number of residuals falling within any other interval, Δx, will depend upon the magnitude of this interval as well as upon the ordinate corresponding to it, and if Δx is taken sufficiently small the number of residuals will be proportional to the product y Δx. Geometrically considered, this product is the area included between the axis of x, the curve, and the two ordinates drawn through the extremities of Δx, and the number of residuals falling within the limits of Δx is therefore proportional to this area.
We may, if we choose, make Δx an infinitesimal, and the area y Δx and the corresponding number of residuals will then become indefinitely small; but by taking the sum of all the infinitesimal areas included between the limits x = a and x = b, where a and b have any values whatever, we obtain the area of that part of the curve included between ordinates drawn at these limits. By a similar process of summation we obtain the number of residuals lying between a and b, and the number of residuals thus found must be proportional to the area, since this proportionality is true in every infinitesimal element included in the area. In the following table, the function, A, represents the area of that part of the error curve included between ordinates whose abscissas are 0 and x, the argument of the table being the values of x for the particular error curve in whose equation h = 1; but the area included between 0 and x in the curve corresponding to any other value of h may be found from the same table, by using as the argument hx instead of x. The area of that part of the curve lying between the limits a and b is represented by

    A = ∫[a to b] y dx = (h / √π) ∫[a to b] e^(-h²x²) dx        (3)

Let the variable in this expression be changed by putting hx = t, and the expression becomes

    A = (1 / √π) ∫[ha to hb] e^(-t²) dt

These expressions for A become identical if h = 1; hence, if the value of A be computed from the second integral for h = 1 and tabulated, we may find from this table the value of A corresponding to any other value of h by changing the limits a, b into ha and hb. A remarkable property of the curve, which will be of use hereafter, may be readily obtained from the expression here found for A. If we make a = 0 and b = ∞, the limits of the integral become 0 and ∞ for all values of h; hence the area of that part of the curve included between x = 0 and x = ∞ is the same for all values of h, i.e., for every series of observations.
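The invariance just stated is easy to verify numerically; the following sketch (modern code, not from the text) integrates the error curve from 0 outward for two quite different values of h.

```python
# Midpoint-rule check that the area under y = (h/sqrt(pi)) * exp(-h^2 x^2)
# between x = 0 and x = infinity is 1/2 for every h.
import math

def half_area(h, upper=60.0, steps=120000):
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx                     # midpoint of the strip
        total += h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2) * dx
    return total

print(round(half_area(0.5), 3), round(half_area(2.0), 3))
```

Both integrals come out at one half, so the whole curve (from -∞ to +∞) always has unit area, as the table below also shows.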
TABLE OF AREAS OF THE ERROR CURVE BETWEEN THE LIMITS 0 AND hx.

    hx     A     Diff.      hx     A     Diff.      hx      A       Diff.
    0.0  0.000    56        1.0  0.421    19        2.0  0.49766     85
    0.1   .056    55        1.1   .440    15        2.1   .49851     56
    0.2   .111    53        1.2   .455    12        2.2   .49907     36
    0.3   .164    50        1.3   .467     9        2.3   .49943     23
    0.4   .214    46        1.4   .476     7        2.4   .49966     14
    0.5   .260    42        1.5   .483     5        2.5   .49980      9
    0.6   .302    37        1.6   .488     4        2.6   .49989      4
    0.7   .339    32        1.7   .492     3        2.7   .49993      3
    0.8   .371    27        1.8   .495     1        2.8   .49996      2
    0.9   .398    23        1.9   .496     2        2.9   .49998      1
    1.0   .421              2.0   .498              3.0   .49999
                                                     ∞    .50000

If in any series of observations n]ab denote the number of residuals whose magnitudes are included between the limits a and b, n the whole number of residuals in the series, and Aa, Ab the values of A obtained from the table with the arguments ha, hb, then

    n]ab = n (Ab ∓ Aa)

since the ratio of n]ab to n is equal to the ratio of the area of that part of the error curve which lies between the limits a and b, to the area of the whole curve, and this latter area is seen from the table to be always unity. The − sign in this equation is to be used when a and b have like signs, and the + when they have unlike signs. If the percentage of residuals between the limits a and b is required, it may be found by substituting 100 in place of n as the coefficient of (Ab ∓ Aa). Thus from Fig. A we find for the series of observations there represented, h² = 0.025 and h = 0.158. To find the distribution of residuals between the limits -∞ ... -5, -5 ... -2, -2 ... +1, +1 ... +4, +4 ... +∞, we proceed as follows:

     b      hb       Ab     Ab ∓ Aa   n]ab   Per cent.   Obs.      n = 66
    -∞      -∞     0.500
    -5    -0.790   0.368     0.132      9       13.2        9
    -2    -0.316   0.172     0.196     13       19.6       11
    +1    +0.158   0.088     0.260     17       26.0       17
    +4    +0.632   0.314     0.226     15       22.6       13
    +∞      +∞     0.500     0.186     12       18.6       16
                             -----     --      -----       --
                             1.000     66      100.0       66

The numbers in the column "Per cent" may be compared with the percentages given on page 7. The column "Obs."
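In modern libraries the tabulated area is available directly, since A = erf(hx)/2. The sketch below (our own naming, not the author's) reproduces an expected count of the kind computed above.

```python
# A(hx) = erf(h*x)/2; the expected number of residuals between limits
# a and b is n*(A_b - A_a) when a, b have like signs, and n*(A_b + A_a)
# when the signs are unlike.
import math

def A(hx):
    return math.erf(hx) / 2.0

def expected(n, h, a, b):
    Aa, Ab = A(h * abs(a)), A(h * abs(b))
    if a * b >= 0:                    # like signs: subtract the areas
        return n * abs(Ab - Aa)
    return n * (Ab + Aa)              # unlike signs: add them

# the Fig. A series: n = 66, h = 0.158, limits -2 to +1
print(round(expected(66, 0.158, -2.0, 1.0), 1))  # → 17.2
```

This agrees with the 17 residuals tabulated for those limits, to the accuracy of the three-place table.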
gives the actual number of residuals which occur in the given series between the limits here considered, and these numbers should be compared with the column "n]ab". By the use of this table, the distribution of residuals in any series of observations for which the value of h is known may be compared with the theoretical distribution much more readily than by plotting a curve, and the student should in this way examine several series of observations. The method of determining h for any given series is contained in § 12.

§ 5. The Principle of Least Squares. The quantity h which appears in the equation of the error curve deserves especial attention. If in the equation

    y = (h / √π) e^(-h²x²)

x be put equal to zero, the resulting value of y is h / √π. This is the maximum ordinate of the curve, and the value of this maximum ordinate varies directly as h. If those parts of the curve remote from the axis of y be considered, it will be found that the larger is h, the smaller are the values of y, since when x is a large quantity e^(-h²x²) diminishes much more rapidly for increasing values of h than h itself increases. These relations between y and h correspond exactly to the criteria by which we estimate the precision of observations. If we compare two series of observations, I. and II., and find that in series I. the small errors are relatively more numerous (large values of y for small x's), and the large errors less numerous (small values of y for large x's), than in series II., we shall without hesitation call the observations of series I. more precise or accurate than those of series II.; and if required to assign definite meanings to the terms "more precise" and "less precise," we shall find difficulty in defining them in any other manner than by reference to the magnitude of the residuals.
We therefore adopt as the measure of precision of any series of observations the value of h in the equation of its error curve; and having thus defined the term "precision," we are able to state two principles which are of general application in the discussion of observations. Let the data furnished by each observation be expressed in the form of an observation equation (Equations 1, § 1); then: the best attainable values of the unknown quantities are those which,

(1) Distribute the residuals in accordance with the law of error,

    y = (h / √π) e^(-h²x²)

and which,

(2) Make the value of h in the equation of the resulting error curve a maximum.

The first of these principles is indeed involved in the second, since if the residuals are not distributed in accordance with the law

    y = (h / √π) e^(-h²x²)

there can be no value of h to be made a maximum. It is, however, advantageous to state (1) as a separate principle, since it affords a test of the presence of systematic errors in the data, which, though far from being a perfect criterion, is often convenient and is sometimes the only test available. To justify the statement of (2), we resort to the following considerations: In accordance with (A), § 1, we suppose that all of the data available is contained in the observation equations, and, (B), § 1, we seek to satisfy all of these equations as nearly as possible. If the observations are free from systematic error, a supposition which must here be made, since we have no means of taking into account the effect of such errors, we may obtain by substituting in the observation equations any set of values which approximately satisfy them, a corresponding set of residuals which will be the errors of the observations, on the supposition that the substituted values were the true values of the unknown quantities.
If these residuals are plotted in an error curve, they will furnish a numerical measure of the precision, h, assigned to the observations by this set of values, and out of all possible sets of values of the unknown quantities that set which assigns the maximum precision to the observations will be entitled to the greatest degree of confidence; for if it were otherwise, we should have no reason for preferring a set of values which exactly satisfied all of the equations to a set which did not satisfy them. It is, of course, true that subsequent observations may furnish a better determination of the unknowns, and that the values thus found will not assign to the earlier observations as high a degree of precision as did the erroneous values obtained from these observations alone, but this subsequent determination is based upon additional evidence, and the problem with which we are concerned is not to obtain the best possible values of the unknown quantities, but the best values which can be derived from the data in our possession. Assuming, then, the validity of (2), we proceed to transform it into an expression more convenient for practical use, and for this purpose we resort to the following property of the error curve, which may be approximately verified by actual measurement from any plotted curves, Figs. A, B, C, D. If the error curve be divided into a great number of parts by drawing equidistant ordinates throughout its whole extent, and the areas of the several parts into which the curve is thus divided be each multiplied by the square of the abscissa of its middle point, the sum of all these products will equal 1 / (2h²). The analytical expression for the process above described is

    ∫[-∞ to +∞] x² y dx   or   (h / √π) ∫[-∞ to +∞] x² e^(-h²x²) dx

Put hx = t, and this integral becomes

    (1 / (h²√π)) ∫[-∞ to +∞] t² e^(-t²) dt = 1 / (2h²)

For the method of obtaining the value of the last integral, see Newcomb's Calculus, Articles 169, 176. The area of each of the parts into which the curve was divided is proportional to the number of residuals occurring between the limiting ordinates of the part; thus, let A denote the area of the part, N the corresponding number of residuals, and n and a the whole number of residuals and the whole area of the curve respectively; then

    A : N :: a : n

but from the table in § 4, a = 1, whence

    A = N / n   and   A x² = N x² / n
Since N denotes the number of x's falling within the given infinitesimal part, A, of the curve, N x² is equal to the sum of the squares of the x's (residuals) whose magnitudes fall between the limiting ordinates of A, and taking the sum of all the A x²'s we obtain

    Σ A x² = (1 / n) Σ x²

i.e., the sum is equal to the mean of the squares of all the residuals. It is customary to represent the sum of the squares of the residuals by the symbol [vv], v standing for any residual, and the [ ] denoting the sum of all quantities of the kind written within them. Comparing this result with the one obtained above, we have

    [vv] / n = 1 / (2h²)        (4)

from which it appears that the relation between h and the sum of the squares of the residuals is such that when h is a maximum, [vv] is a minimum, and principle (2) may be restated as follows: The most probable values of the unknown quantities are those which make the sum of the squares of the residuals a minimum. From this principle has been derived the name Method of Least Squares, which is commonly applied to that body of principles which treats of the combination and discussion of observed data. We have arrived at this principle from a consideration of that class of cases in which the quantity observed is a function of two or more unknown quantities whose values are to be obtained from the observations. This obviously includes the case of a single quantity, x, whose value is directly measured; and it will be advantageous to apply the principle of least squares to this case.
The observation equations are here of the simplest possible form,

    x = m1
    x = m2
    x = m3
    etc.

where m denotes an observed value of x. If x0 denote any assumed value of x, the residuals obtained by substituting it in these equations will be

    v1 = m1 - x0
    v2 = m2 - x0
    v3 = m3 - x0
    ......
    vn = mn - x0

and

    [vv] = (m1 - x0)² + (m2 - x0)² + (m3 - x0)² + ... + (mn - x0)²

The value of x0 which will make [vv] a minimum is found from

    d[vv] / dx0 = 0 = -2 (m1 - x0) - 2 (m2 - x0) - 2 (m3 - x0) ... - 2 (mn - x0)

but this equation is equivalent to

    x0 = (m1 + m2 + m3 + ... + mn) / n = [m] / n

and it thus appears that the universal practice of taking the arithmetical mean of all the measures of a single quantity as the best value of that quantity, is a particular case under the more general method of least squares.

§ 6. Weights. It frequently happens that the circumstances under which an observation was made lead the observer to distrust its accuracy, while other causes give him increased confidence in another observation. Observations which thus differ in quality are said to have different weights, the weight being a numerical measure of the quality, and these weights should be taken into account in combining the observations. Let us suppose two series of observations made upon the same unknown quantity, in one of which the observations are of different quality and entitled to different degrees of confidence, while in the other the observations are all equally good, but each of them entitled to less confidence than the poorest observation of the first series. By taking the mean of a number of observations of this second series, a more reliable value of the unknown quantity may be obtained than any single observation of the series can furnish, and by properly choosing the number of observations to be included in the mean, a value entitled to as much confidence as any observation of the first series may be found.
The number of observations of the second series whose mean is entitled to as much confidence as a single observation of the first series is called the weight of the equivalent observation in the first series; and, obviously, the better an observation, the greater is its weight. These weights furnish no information about the absolute precision of the observations, but express only their relative excellence as compared with each other; hence, if p1, p2, p3, etc., be the weights of any observations, kp1, kp2, kp3, etc., where k is any constant, will express these weights equally well, since it is the ratios of the weights, and not their absolute values, which are of importance. To exhibit the manner in which these weights are to be employed, let us recur to the data of § 1, and suppose that those observations were made under such conditions that the first one has a weight 1, the second 2, the third 3, and the fourth 4. In accordance with the definition of weights, this is equivalent to supposing a second series of observations of uniform excellence, such that the first of the actual observations can be replaced by one observation of this series, which must of course be numerically the same as the observation which it replaces; the second real observation may be replaced by two numerically equal observations of the second series; the third by three, etc. Each of these substituted observations will furnish an equation precisely like those given in § 1, and when the sum of the squares of the residuals is formed, we shall obtain

v1^2 + (v2^2 + v2^2) + (v3^2 + v3^2 + v3^2) + (v4^2 + v4^2 + v4^2 + v4^2) = [pvv]

The symbol [pvv], which is adopted as an abbreviation for this expression, is equivalent, numerically, to the sum obtained by multiplying the square of each actual residual by the weight of the corresponding observation and adding the products, and it is evident that this [pvv] bears the same relation to the
substituted observations that [vv] bore to the actual observations in the case of equal weights, which was considered in the preceding section. The principle there obtained may therefore be generalized as follows: The most probable values of the unknown quantities are those which make the sum of the weighted squares of the residuals, [pvv], a minimum. Let the student show, as in the preceding section, that when this principle is applied to the case of observations of unequal weight made upon a single unknown quantity, it gives as the most probable value of that quantity

x0 = [pm]/[p]

As an example of the application of weights, we select the following observations of the time of ending of the transit of Mercury of May 6, 1878, which were observed by different observers in the city of Washington. These observers were provided with telescopes of different sizes and magnifying powers, and differed among themselves in point of experience and skill, so that their observed times of last contact are not entitled to equal confidence. The weights assigned to the several observations represent the judgment of the computer with respect to their relative excellence. (Washington Observations, 1876, Appendix II., page 55.)

Observed Time.     p     pm
5h 38m 23s         1     23
   37 55           0      0
   38 10           1     10
   38 26           3     78
   38 21           2     42
   38 18           2     36
   38 19           3     57
   38 21           2     42
   38 15           2     30

[p] = 16    [pm] = 318    [pm]/[p] = 19.9

Weighted mean of the observations = 5h 38m 19s.9.

§ 7. Normal Equations.

We have now to show how the principle of least squares is to be applied in determining the values of a set of unknown quantities, and in order to fix the ideas as definitely as possible, let it be supposed that there are three of these quantities, x, y, z, which are connected with each one of a set of observed quantities, n, by the relation

ax + by + cz = n

where a, b, and c are numerical coefficients whose values are supposed known in each equation.
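Before taking up the normal equations, the weighted mean just found can be checked mechanically; a short Python sketch using the tabulated seconds of each observed time (the observation of weight 0 contributes nothing to either sum):

```python
# Seconds of the observed times of last contact, with the assigned weights.
seconds = [23, 55, 10, 26, 21, 18, 19, 21, 15]
weights = [1, 0, 1, 3, 2, 2, 3, 2, 2]

pm = sum(p * m for p, m in zip(weights, seconds))   # [pm]
p_total = sum(weights)                              # [p]
x0 = pm / p_total                                   # weighted mean, in seconds
print(pm, p_total, round(x0, 1))  # 318 16 19.9
```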
From a series of more than three observed values of n, the most probable values of x, y, z are to be obtained by means of the relation [pvv] = a minimum. It is not to be presumed that these values when found will exactly satisfy all the equations and make [pvv] = 0, but we shall find from each equation a residual v, so that strictly the observation equations should be written

a1 x + b1 y + c1 z - n1 = v1
a2 x + b2 y + c2 z - n2 = v2    (1)
a3 x + b3 y + c3 z - n3 = v3
etc.          etc.          etc.

The symbols p1, p2, p3 represent the weights assigned to the observed values n1, n2, n3, etc. By the ordinary rule for determining a minimum of a function of several variables, the condition [pvv] = a minimum furnishes the three equations

d[pvv]/dx = 0,   d[pvv]/dy = 0,   d[pvv]/dz = 0

and in order to obtain these derivatives we form from the observation equations

[pvv] = (a1 √p1 x + b1 √p1 y + c1 √p1 z - √p1 n1)^2
      + (a2 √p2 x + b2 √p2 y + c2 √p2 z - √p2 n2)^2    (5)
      + (a3 √p3 x + b3 √p3 y + c3 √p3 z - √p3 n3)^2
      + etc.

The derivative of this expression with respect to x is

d[pvv]/dx = 2 a1 √p1 (a1 √p1 x + b1 √p1 y + c1 √p1 z - √p1 n1)
          + 2 a2 √p2 (a2 √p2 x + b2 √p2 y + c2 √p2 z - √p2 n2)    (6)
          + 2 a3 √p3 (a3 √p3 x + b3 √p3 y + c3 √p3 z - √p3 n3)
          + etc.  = 0

Let this expression be expanded, divided by 2, and simplified by the introduction of [ ] to denote the sum of all terms like those placed within them (all terms standing in the same vertical column), and it becomes the first of the following group of

NORMAL EQUATIONS.
[paa] x + [pab] y + [pac] z - [pan] = 0
[pab] x + [pbb] y + [pbc] z - [pbn] = 0    (7)
[pac] x + [pbc] y + [pcc] z - [pcn] = 0

The second and third of these equations are derived in precisely the same manner as the first from the conditions

d[pvv]/dy = 0,   d[pvv]/dz = 0

These equations are equal in number to the unknown quantities, and their solution will in general furnish a determinate set of values for these quantities, which will be the most probable values, since the normal equations include all of the data furnished by the observations and have been so derived as to satisfy the principle of least squares. Equations (6) furnish a rule which is frequently given for the formation of normal equations: To obtain the first normal equation, multiply each observation equation by the product of its weight into the coefficient of x which occurs in it, and take the sum of all the resulting equations. The other normals are similarly obtained from the weights and the coefficients of y, z, etc., due regard being had to the algebraic signs of the quantities in the several multiplications and additions. This method is occasionally convenient, but in general the method of forming normal equations given in § 9 will be found less laborious. The symmetrical manner in which the coefficients of the normal equations are disposed should be especially noted, since this considerably diminishes the labor of their formation. The first coefficient in the second equation is the same as the second coefficient in the first equation, and generally the mth coefficient in the nth equation is the same as the nth coefficient in the mth equation. Let the student form normal equations from the observation equations contained in § 1, assuming that those equations have equal weights.

§ 8. Non-Linear Observation Equations.
In all of the preceding investigation it has been tacitly assumed that the relation of the observed to the unknown quantities can be expressed by an equation of the first degree; but cases in which this relation is of a much more complicated character are not uncommon, and a method of applying the principle of least squares to these cases is required. For the sake of simplicity, this method will be derived for the case of two unknown quantities, but the process is perfectly general and can readily be extended to any other number of unknowns. Let x and y be any two quantities which have not been directly measured but which are connected with an observed quantity, m, by the relation

f(x, y, m) = 0

which represents any equation whatever existing between x, y, and m. Let x0 and y0 denote approximate values of x and y, such that

x = x0 + Δx    y = y0 + Δy

Δx and Δy being the corrections which must be added to x0 and y0 in order to obtain the most probable values of x and y. We may, for the present, suppose that x0 and y0 are mere guesses at the values of x and y, and we may test their correctness by substituting their numerical values in the equation f(x, y, m) = 0 which corresponds to each observed value of m. If every such equation were exactly satisfied by these values, we should infer that x0 and y0 were the most probable values of x and y. It cannot be expected that this perfect agreement will ever be found in practice, but from each observation equation a residual, v, will be found, due partly to the errors of the observations and partly to Δx and Δy. If in the equation f(x, y, m) = 0 we substitute for x and y the values x0 + Δx, y0 + Δy, and develop the expression by Taylor's Formula, remembering that f(x0, y0, m) is the residual found by substituting numerical values of x0, y0 in the several observation equations, we have

f(x0 + Δx, y0 + Δy, m) = v + (df/dx0) Δx + (df/dy0) Δy = 0

If numerical values of v, df/dx0, df/dy0 be introduced into this equation, it becomes

a Δx + b Δy + n = 0

Each observation equation may thus be made to furnish a linear equation involving Δx and Δy, and these equations may be treated by the method of § 7. It must, however, be remembered that in the above development by Taylor's Formula we have retained only the first three terms of an infinite series, and if the approximate values x0, y0 are not so nearly the most probable values that the squares and higher powers of Δx and Δy are inappreciable, the development and the solution based upon it are inaccurate. On this account, it is seldom advantageous to make a least square solution for the unknown quantities until very approximate values of them have been found. These values will usually be obtained from the solution of a small number of the observation equations. The transformation of the observation equations by the introduction of corrections to assumed values of the unknowns is often advantageous even when the original equations are of the first degree, especially if the original quantities were of very different magnitudes. Thus, in the problem of § 1, the observation equations are of the form

f(l0, c, m) = l0 + tc - m = 0

in which c is a very small quantity while l0 is approximately 1000.
If we put

l0 = 1000 + Δl0    c = 0 + Δc

we have

v = f(1000, 0, m) = 1000 - m

and the equations are transformed into

Δl0 + 20 Δc - 0.22 = 0
Δl0 + 40 Δc - 0.65 = 0
Δl0 + 50 Δc - 0.90 = 0
Δl0 + 60 Δc - 1.05 = 0

By this transformation the numerical operations involved in forming and solving the normal equations are much simplified through the substitution of small numbers in the place of large ones.

§ 9. Formation and Solution of the Normal Equations.

If the number of unknown quantities is greater than two, and especially if the number of observations is large, the numerical computation of a set of normal equations is a laborious process, and one in which errors are almost certain to occur unless special precautions are taken to guard against them. The method of forming these equations presented in this section has been developed with special reference to facilitating the numerical operations and obtaining the normals with the least expenditure of labor consistent with the requisite accuracy, and although some of the processes may seem at first sight unnecessary and cumbrous, a little experience in their use, or in their neglect, will convince the student that they are in the long run labor-saving devices. Let each observation equation be written out and arranged in tabular form, as in the following example. In order that these equations should furnish a good determination of the unknowns, x, y, z, it is necessary that the coefficients of these quantities should present a considerable range in their values in the several equations. Thus, if all the coefficients of x were alike, all the coefficients of y equal each to each, etc., the equations would be absolutely indeterminate, since we should have several unknown quantities involved in a single equation many times repeated; and if the coefficients approximate to this equality the equations will be approximately indeterminate, and will furnish unreliable values of the unknowns.
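The four transformed equations of § 8 can be carried through the whole process numerically; a sketch in Python (equal weights assumed, the weights of § 1 not being restated here) that forms the two normal equations of § 7 for the unknowns Δl0, Δc and solves them:

```python
# Observation equations: dl0 + t*dc - m = 0 for the four values of t below.
t = [20.0, 40.0, 50.0, 60.0]
m = [0.22, 0.65, 0.90, 1.05]

# Normal equations (the coefficients of dl0 are all unity):
#   [aa] dl0 + [ab] dc - [am] = 0
#   [ab] dl0 + [bb] dc - [bm] = 0
aa = float(len(t))
ab = sum(t)
bb = sum(ti * ti for ti in t)
am = sum(m)
bm = sum(ti * mi for ti, mi in zip(t, m))

det = aa * bb - ab * ab
dl0 = (bb * am - ab * bm) / det
dc = (aa * bm - ab * am) / det
print(round(dl0, 3), round(dc, 4))  # about -0.196 and 0.0212
```

The corrections imply l0 = 1000 + Δl0, about 999.8, and a small positive c; the point of the sketch is only the mechanism, which § 10 works out in full for a harder case.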
If, therefore, several observations have been made under similar conditions, and furnish equations which are nearly identical, these will be nearly equivalent to a repetition of the same equation, and it will be permissible to take their sum, having regard to their respective weights, and treat it as a single observation equation with a weight equal to the sum of the weights of the observations. Having thus reduced the number of equations as far as possible, each equation should be multiplied by the square root of its weight, as was done, § 7, in obtaining the form of the normal equations. By this multiplication the weights will be completely taken into account and will require no further attention. Let the weighted equations thus obtained be represented by

a1 x + b1 y + c1 z + n1 = 0
a2 x + b2 y + c2 z + n2 = 0
a3 x + b3 y + c3 z + n3 = 0
etc.          etc.          etc.

It will usually facilitate the formation of the normals to so transform these equations that no number greater than 1 shall occur in any of them. This can always be done by introducing new unknown quantities and dividing each equation by some constant number, usually some power of 10. Thus in the case of the two equations

5 x + 71 y - 63 = 0
0.9 x - 193 y + 93 = 0

let each equation be divided by 100 and put

x/20 = u    1.93 y = w

the equations are thus transformed into

1.000 u + 0.368 w - 0.630 = 0
0.180 u - 1.000 w + 0.930 = 0

The solution of these equations will furnish values of u and w from which x and y may be found by the relations

x = 20 u    y = w/1.93

The purpose of this transformation is to simplify the subsequent numerical work by reducing the numbers involved to an approximate equality. Every coefficient which appears in the normal equations is the sum of a series of products of two quantities; thus

[aa] = a1 a1 + a2 a2 + a3 a3 + ...
[ab] = a1 b1 + a2 b2 + a3 b3 + ...
[bn] = b1 n1 + b2 n2 + b3 n3 + ...

These products may be formed by the aid of Crelle's multiplication tables* supplemented by a table of squares of numbers for the [aa], [bb], etc. In case Crelle's tables are not available, the products may be formed by logarithms, or much more rapidly by the following method due to Bessel. Form for each equation the sums a + b, a + c, b + c, etc., for every pair of numbers contained in the equation; then since

ab = (1/2) [(a + b)^2 - aa - bb]

we have

[ab] = (1/2) { [(a + b)^2] - [aa] - [bb] }
[bc] = (1/2) { [(b + c)^2] - [bb] - [cc] }    (8)
etc.          etc.          etc.

The [aa], [bb], [cc] are coefficients in the normal equations and must be computed in any case, and the formation of [ab], [bc], etc., therefore requires for each coefficient only the single additional quantity [(a + b)^2], [(b + c)^2], and presents the very great advantage that these quantities can be obtained from a table of squares, and being all positive numbers no attention need be paid to the signs after the sums a + b, b + c, etc., have been formed. No method of computation can furnish a guaranty against the commission of numerical errors, and it is therefore desirable to test the computation from time to time to ascertain if such errors have occurred. To secure such a test or "check," as it is called, we introduce the following auxiliary quantities, one for each observation equation:

s1 = a1 + b1 + c1 + ... + n1
s2 = a2 + b2 + c2 + ... + n2    (9)
etc.          etc.          etc.

and form the quantities [as], [bs], ..., [ns]. It will appear from the mode in which the coefficients of the normal equations are formed that

[aa] + [ab] + [ac] + ... + [an] = [as]
[ab] + [bb] + [bc] + ... + [bn] = [bs]    (9a)
etc.          etc.          etc.

* Crelle, Rechentafeln, Berlin. These tables give the products of all numbers up to 1000 × 1000, and are of very general utility.
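Bessel's device (8) and the s-column check (9a) can both be verified mechanically; a Python sketch using the a, b, c, n columns of the worked example of § 10 below:

```python
# Columns a, b, c, n of the seven homogeneous weighted observation equations
# of section 10, and the check column s = a + b + c + n.
a = [0.796, 0.839, 0.777, 0.895, 1.000, 0.975, 0.938]
b = [-1.000, -0.733, -0.482, -0.403, -0.327, -0.236, -0.167]
c = [0.000, 0.563, 0.741, 0.931, 1.000, 0.903, 0.768]
n = [0.530, 0.320, 0.210, 0.280, 0.420, 0.380, -0.020]
s = [ai + bi + ci + ni for ai, bi, ci, ni in zip(a, b, c, n)]

def dot(u, v):
    """Gauss's bracket sum, e.g. dot(a, b) = [ab]."""
    return sum(x * y for x, y in zip(u, v))

# Bessel's relation (8): [ab] from a table of squares alone.
a_plus_b = [x + y for x, y in zip(a, b)]
ab_squares = (dot(a_plus_b, a_plus_b) - dot(a, a) - dot(b, b)) / 2.0
assert abs(ab_squares - dot(a, b)) < 1e-9

# Check relation (9a): [aa] + [ab] + [ac] + [an] = [as].
lhs = dot(a, a) + dot(a, b) + dot(a, c) + dot(a, n)
assert abs(lhs - dot(a, s)) < 1e-9

print(round(dot(a, b), 3), round(dot(a, s), 3))
```

The printed values differ from the book's -2.861 and +9.071 only in the last place, the book having worked from squares rounded to three decimals.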
The [as], [bs], etc., are formed in precisely the same manner as [ab], [ac], etc., and the check relations above given must be satisfied by the computed values of these quantities. Where only two unknown quantities are involved in the normal equations, the solution of the equations may be conveniently made by any of the methods of elementary algebra; but if the number of unknowns is greater than two, the simple and elegant method of successive substitutions proposed by Gauss may be employed with advantage. The normal equations in the case of three unknown quantities are:

[aa] x + [ab] y + [ac] z + [an] = 0
[ab] x + [bb] y + [bc] z + [bn] = 0    (10)
[ac] x + [bc] y + [cc] z + [cn] = 0

and from the first of these

x = - ([ab]/[aa]) y - ([ac]/[aa]) z - [an]/[aa]

This value of x substituted in the second and third equations transforms them into

[bb.1] y + [bc.1] z + [bn.1] = 0    (11)
[bc.1] y + [cc.1] z + [cn.1] = 0

in which

[bb.1] = [bb] - ([ab]/[aa]) [ab]        [cc.1] = [cc] - ([ac]/[aa]) [ac]
[bc.1] = [bc] - ([ab]/[aa]) [ac]                                          (12)
[bn.1] = [bn] - ([ab]/[aa]) [an]        [cn.1] = [cn] - ([ac]/[aa]) [an]

These equations constitute a new set of normals, from which one unknown quantity has been eliminated. The correctness of the numerical work of this elimination may be tested by a continuation of the checks used in forming the original normals. We introduce an auxiliary quantity

[bs.1] = [bb.1] + [bc.1] + [bn.1]

and inquire its relation to [as], [bs], etc. If we substitute in the expression for [bs.1] the values of [bb.1], [bc.1], [bn.1] in terms of the original coefficients, having regard to the relations

[aa] + [ab] + [ac] + [an] = [as]    (13)
[ab] + [bb] + [bc] + [bn] = [bs]

we find

[bs.1] = [bs] - ([ab]/[aa]) [as]    (14)

and similarly

[cs.1] = [cs] - ([ac]/[aa]) [as]

We may therefore obtain a complete check upon the accuracy of the numerical work involved in the elimination of x, by
forming the quantities [bs.1], [cs.1] in the same manner as [bb.1], [bc.1], [bn.1], etc., and comparing the actual sums of these latter quantities with the computed check quantities. By a repetition of the process of elimination we obtain

[cc.2] z + [cn.2] = 0    Check: [cs.2]

where

[cc.2] = [cc.1] - ([bc.1]/[bb.1]) [bc.1]
[cn.2] = [cn.1] - ([bc.1]/[bb.1]) [bn.1]    (15)
[cs.2] = [cs.1] - ([bc.1]/[bb.1]) [bs.1]
[cs.2] = [cc.2] + [cn.2]

and we are enabled to write the following equivalents for the original normal equations.

ELIMINATION EQUATIONS.

x + ([ab]/[aa]) y + ([ac]/[aa]) z + [an]/[aa] = 0
y + ([bc.1]/[bb.1]) z + [bn.1]/[bb.1] = 0    (16)
z + [cn.2]/[cc.2] = 0

The last of these equations gives the value of z directly, the second furnishes y as soon as z is known, and the first gives the value of x. The whole solution is therefore reduced to finding the values of the coefficients and absolute terms in these elimination equations. A convenient arrangement of the computation by which these quantities are obtained is given in the following example, in which the actual computation is exhibited together with a schedule correspondingly arranged showing the analytical equivalent of each number contained in the computation. In making the multiplications of [ab], [ac], [an], [as] by the constant factor [ab]/[aa], the logarithm of this factor is written on the edge of a slip of paper, and being held successively adjacent to the logarithms of [ab], [ac], [an], [as], the sum of the two logarithms is taken mentally, the corresponding number looked out from a logarithmic table and written in its proper place under [bb], [bc], [bn], [bs]; a subtraction then gives the value of [bb.1], [bc.1], [bn.1], [bs.1], and a similar process is followed for each other derived coefficient.

§ 10. Example.
To illustrate the principles contained in the preceding sections, and to exhibit in detail the process of deriving the most probable values of several unknown quantities which are connected with the observed quantity by a rather complicated relation, we select from Vol. III., Part 1, of the Memoirs of the National Academy of Sciences, page 58, the following series of experiments made with a 10-gauge Colt gun, loaded with uniform charges of four drams of powder and 1J ounces of shot, the shot ranging in fineness from No. 10 up to No. 1 Buck. The purpose of the experiments was to determine the relation existing between the size (fineness) of the shot and its average velocity over a range of 30 yards. The following table contains the results of the experiments, each velocity being the mean result of from three to six discharges of the gun. The weight of a pellet of No. 10 shot is taken as the unit of weight, and the velocities are expressed in feet per second.

Size.          Weight.   Observed Velocity.
No. 10            1           848
8                 2           920
6                 4           966
3                 8           989
BB.              16          1000
FF.              32          1017
No. 1 Buck       64          1067

By plotting these results in a curve with the weights of the shot as abscissas and the observed velocities as ordinates, the experimenter reached the conclusion that the relation between the weight W and the velocity V is expressed by an equation of the form

V/l = sec^-1 (W^n / m)

in which l, m, n are constants whose values are to be determined from the observations. It will be found upon trial that l0 = 700, m0 = 0.28, n0 = 0.42, in connection with the observed values of V and W, will approximately satisfy this equation, and we therefore adopt these approximate values and proceed (§ 8) to determine the corrections Δl, Δm, Δn, which when added to l0, m0, n0 will furnish the most probable values of l, m, n.
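The trial constants can be tested at once. A Python sketch, assuming the relation V = l sec^-1(W^n / m) as printed above, with sec^-1 evaluated as the arccosine of the reciprocal; it reproduces, to within a unit (the book working with 4-place logarithms), the residuals quoted later in this section for both the trial constants and the adjusted ones:

```python
import math

def residuals(l, m, n):
    """Observed minus computed velocity for each size of shot."""
    W = [1, 2, 4, 8, 16, 32, 64]                 # pellet weights (No. 10 = 1)
    V = [848, 920, 966, 989, 1000, 1017, 1067]   # observed velocities, ft/s
    # computed V = l * arcsec(W^n / m), the arc being in parts of the radius
    return [v - l * math.acos(m / w ** n) for w, v in zip(W, V)]

print([round(r) for r in residuals(700.0, 0.28, 0.42)])      # trial constants
print([round(r) for r in residuals(698.2, 0.3402, 0.3653)])  # adjusted values
```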
The several differential coefficients of the observation equation f(l, m, n, V) = 0 are

dV/dl0 = V/l0
dV/dm0 = - (l0/m0) cot (V/l0)
dV/dn0 = (l0/M) log W cot (V/l0)

in which M denotes the modulus of the common system of logarithms, M = 0.43429. In the factor cot (V/l0), V/l0 is the ratio of two numbers, and must be construed as representing a certain arc expressed in parts of the radius; the corresponding arc expressed in degrees is 57°.29578 (V/l0). The form of the observation equation with which we are here concerned is

(V/l0) Δl - (l0/m0) cot (V/l0) Δm + (l0/M) log W cot (V/l0) Δn + (l0 sec^-1 (W^n0 / m0) - V) = 0

and introducing into this equation the numerical values of l0, m0, n0, W, V, M, we find the following

OBSERVATION EQUATIONS.

(1)  1.29 Δl - 729 Δm +   0 Δn + 53 = 0    p = 1.0
(2)  1.36     - 535   + 104    + 32 = 0        1.0
(3)  1.41     - 396   + 154    + 24 = 0        0.8
(4)  1.45     - 294   + 172    + 28 = 0        1.0
(5)  1.48     - 219   + 170    + 38 = 0        1.2
(6)  1.50     - 164   + 159    + 37 = 0        1.1
(7)  1.52     - 122   + 142    -  2 = 0        1.0

The absolute terms of these equations are residuals obtained by substituting in the original equation

V/l - sec^-1 (W^n / m) = 0

the assumed values of l0, m0, and n0, and the smallness of these residuals compared with the values of V shows that the assumed quantities are approximately correct values of l, m, n. The memoir from which our data are taken contains no indication of the weights to be assigned to the several determinations of V, and in the absence of such information they should all be treated as equally precise and given the weight 1; but for the sake of illustration a slightly different set of weights, indicated above by p, has been assigned to them, and by multiplying each equation by the square root of its weight we obtain the following

WEIGHTED OBSERVATION EQUATIONS.

1.29 Δl - 729 Δm +   0 Δn + 53 = 0
1.36     - 535   + 104    + 32 = 0
1.26     - 352   + 137    + 21 = 0
1.45     - 294   + 172    + 28 = 0
1.62     - 239   + 185    + 42 = 0
1.58     - 172   + 167    + 38 = 0
1.52     - 122   + 142    -  2 = 0
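The tabulated coefficients can be spot-checked against the differential coefficients just derived; a Python sketch for equations (1) and (5), again assuming the reconstructed relation V/l = sec^-1(W^n / m):

```python
import math

l0, m0, n0 = 700.0, 0.28, 0.42
M = 0.43429  # modulus of common logarithms

def equation(W, V_obs):
    """Coefficients of dl, dm, dn and the absolute term for one observation."""
    theta = math.acos(m0 / W ** n0)          # V0/l0, the arc in radian measure
    cot = 1.0 / math.tan(theta)
    a = theta                                # dV/dl0 = V/l0
    b = -(l0 / m0) * cot                     # dV/dm0
    c = (l0 / M) * math.log10(W) * cot       # dV/dn0
    term = l0 * theta - V_obs                # computed minus observed velocity
    return a, b, c, term

print([round(v, 2) for v in equation(1, 848)])    # nearly (1.29, -729, 0, 53)
print([round(v, 2) for v in equation(16, 1000)])  # nearly (1.48, -219, 170, 38)
```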
The coefficients and absolute terms in these equations are of very different magnitudes, and to simplify the subsequent numerical work we divide each equation through by 100 and put

0.0162 Δl = x    7.29 Δm = y    1.85 Δn = z

and introduce x, y, z into the equations in place of Δl, Δm, Δn. This step, which is frequently called rendering the equations homogeneous, furnishes the following

HOMOGENEOUS WEIGHTED OBSERVATION EQUATIONS.

0.796 x - 1.000 y + 0.000 z + 0.530 = 0    s = + 0.326
0.839   - 0.733   + 0.563   + 0.320 = 0        0.989
0.777   - 0.482   + 0.741   + 0.210 = 0        1.246
0.895   - 0.403   + 0.931   + 0.280 = 0        1.703
1.000   - 0.327   + 1.000   + 0.420 = 0        2.093
0.975   - 0.236   + 0.903   + 0.380 = 0        2.022
0.938   - 0.167   + 0.768   - 0.020 = 0        1.519

The values of s = a + b + c + n, which are to be used as a check in the formation of the normal equations, are derived from these equations. The formation of the coefficients of the normal equations by the use of a table of squares, Bessel's method, is represented in the following tables:

SUMS OF THE COEFFICIENTS.

Equation    a+b      a+c     a+n     a+s      b+c      b+n      b+s     c+n     c+s
   1      -0.204   0.796   1.326   1.122   -1.000   -0.470   -0.674   0.530   0.326
   2       0.106   1.402   1.159   1.828   -0.170   -0.413    0.256   0.883   1.552
   3       0.295   1.518   0.987   2.023    0.259   -0.272    0.764   0.951   1.987
   4       0.492   1.826   1.175   2.598    0.528   -0.123    1.300   1.211   2.634
   5       0.673   2.000   1.420   3.093    0.673    0.093    1.766   1.420   3.093
   6       0.739   1.878   1.355   2.997    0.667    0.144    1.786   1.283   2.925
   7       0.771   1.706   0.918   2.457    0.601   -0.187    1.352   0.748   2.287
SQUARES.

Equation   a^2   (a+b)^2  (a+c)^2  (a+n)^2  (a+s)^2   b^2   (b+c)^2  (b+n)^2  (b+s)^2   c^2   (c+n)^2  (c+s)^2   n^2     s^2
   1      0.634   0.042    0.634    1.758    1.259   1.000   1.000    0.221    0.454   0.000   0.281    0.106   0.281   0.106
   2      0.704   0.011    1.966    1.343    3.342   0.537   0.029    0.171    0.066   0.317   0.780    2.409   0.102   0.978
   3      0.604   0.087    2.304    0.974    4.094   0.232   0.067    0.074    0.584   0.549   0.904    3.948   0.044   1.552
   4      0.801   0.242    3.334    1.381    6.750   0.162   0.279    0.015    1.690   0.867   1.466    6.938   0.078   2.900
   5      1.000   0.453    4.000    2.016    9.567   0.107   0.453    0.009    3.119   1.000   2.022    9.567   0.176   4.381
   6      0.951   0.546    3.527    1.836    8.982   0.056   0.445    0.021    3.190   0.815   1.646    8.556   0.144   4.088
   7      0.880   0.594    2.910    0.843    6.037   0.028   0.361    0.035    1.828   0.590   0.560    5.230   0.000   2.307

Sums      5.576   1.975   18.675   10.151   40.031   2.122   2.634    0.546   10.931   4.138   7.659   36.754   0.825  16.312

the sums of the a^2, b^2, c^2, n^2, and s^2 columns being [aa], [bb], [cc], [nn], and [ss]. From the sums of the squares contained in the several columns of this table the coefficients [ab], [ac], etc., are computed at the foot of the columns by the relations [ab] = (1/2) { [(a+b)^2] - ([aa] + [bb]) }, etc.; thus

[ab] = (1/2) (1.975 - 5.576 - 2.122) = - 2.861      [bc] = (1/2) (2.634 - 2.122 - 4.138) = - 1.813
[ac] = (1/2) (18.675 - 5.576 - 4.138) = + 4.480     [bn] = (1/2) (0.546 - 2.122 - 0.825) = - 1.200
[an] = (1/2) (10.151 - 5.576 - 0.825) = + 1.875     [bs] = (1/2) (10.931 - 2.122 - 16.312) = - 3.751
[as] = (1/2) (40.031 - 5.576 - 16.312) = + 9.071    [cn] = (1/2) (7.659 - 4.138 - 0.825) = + 1.348
                                                    [cs] = (1/2) (36.754 - 4.138 - 16.312) = + 8.152

The check quantity [as] is compared with [aa] + [ab] + [ac] + [an] = + 9.070, whose value is written immediately under [as], and which must agree with [as] within two or three units of the last decimal place; similarly [ab] + [bb] + [bc] + [bn] = - 3.752 is compared with [bs] = - 3.751, and [ac] + [bc] + [cc] + [cn] = + 8.153 with [cs] = + 8.152. Every coefficient of the normal equations enters into one or more of these sums, which therefore furnish a complete test of the accuracy of the work in passing from the homogeneous observation equations to the normal equations. We now write the

NORMAL EQUATIONS.
+ 5.576 x - 2.861 y + 4.480 z + 1.875 = 0
- 2.861 x + 2.122 y - 1.813 z - 1.200 = 0
+ 4.480 x - 1.813 y + 4.138 z + 1.348 = 0

It may be seen from an inspection of these equations that the data upon which they are based will not furnish a good determination of the values of all the unknowns; for if the first equation be divided by - 2 the quotient will be very like the second equation, and if it be multiplied by a suitable fraction the product will be very like the third equation. We proceed, however, with the solution by Gauss's method, which will furnish the best results that the data can be made to yield.

SOLUTION OF THE NORMAL EQUATIONS.

[aa]      + 5.576    [ab]      - 2.861    [ac]      + 4.480    [an]      + 1.875    [as]      + 9.070
log [aa]    0.7464   log [ab]    0.4566 n log [ac]    0.6513   log [an]    0.2730   log [as]    0.9576

log ([ab]/[aa]) = 0.4566 n - 0.7464 = 9.7102 n        log ([ac]/[aa]) = 0.6513 - 0.7464 = 9.9049

[bb]             + 2.122   [bc]             - 1.813   [bn]             - 1.200   [bs]             - 3.752
([ab]/[aa])[ab]  + 1.468   ([ab]/[aa])[ac]  - 2.299   ([ab]/[aa])[an]  - 0.962   ([ab]/[aa])[as]  - 4.654
[bb.1]           + 0.654   [bc.1]           + 0.486   [bn.1]           - 0.238   [bs.1]           + 0.902
log [bb.1]  9.8156         log [bc.1]  9.6866         log [bn.1]  9.3766 n       Check sum        + 0.902

[cc]             + 4.138   [cn]             + 1.348   [cs]             + 8.153
([ac]/[aa])[ac]  + 3.599   ([ac]/[aa])[an]  + 1.507   ([ac]/[aa])[as]  + 7.286
[cc.1]           + 0.539   [cn.1]           - 0.159   [cs.1]           + 0.867
                                                      Check sum        + 0.866

log ([bc.1]/[bb.1]) = 9.6866 - 9.8156 = 9.8710

([bc.1]/[bb.1])[bc.1]  + 0.361   ([bc.1]/[bb.1])[bn.1]  - 0.177   ([bc.1]/[bb.1])[bs.1]  + 0.670
[cc.2]                 + 0.178   [cn.2]                 + 0.018   [cs.2]                 + 0.197
log [cc.2]  9.2504               log [cn.2]  8.2553               Check sum              + 0.196

log ([cn.2]/[cc.2]) = 8.2553 - 9.2504 = 9.0049

The course of the computation after the formation of the elimination equations is sufficiently indicated above.

ELIMINATION EQUATIONS.
x - 0.513 y + 0.803 z + 0.336 = 0
y + 0.743 z - 0.364 = 0
z + 0.101 = 0

whence

z = - 0.101    y = + 0.439    x = - 0.030

and

log x       8.4771 n       log y      9.6425        log z      9.0049 n
log 0.0162  8.2095         log 7.29   0.8627        log 1.85   0.2672
Δl   - 1.8                 Δm  + 0.0602             Δn  - 0.0547
l0  + 700.0                m0  + 0.2800             n0  + 0.4200
l   + 698.2                m   + 0.3402             n   + 0.3653

If with the values of l, m, n thus obtained the corresponding velocities be computed by means of the original equation

V = l sec^-1 (W^n / m)

the resulting residuals should be smaller than those derived from the substitution of l0, m0, n0, i.e., the absolute terms of the observation equations. The following comparison of these residuals shows a much better representation of the observed values of V, especially if the sums of the squares, [vv], be compared.

Observed - Computed V.

Weight of Shot:     1     2     4     8    16    32    64
f(l0, m0, n0)     -53,  -32,  -24,  -28,  -38,  -37,  + 2
f(l, m, n)        - 6,  +10,  +13,  + 4,  -10,  -13,  +22

Not only are the residuals diminished in magnitude, but their distribution is much more nearly in agreement with the law of error. The values thus obtained for l, m, n ought not to be considered the best attainable, since the corrections Δm, Δn are relatively large fractions of m0 and n0, and it is probable that the neglected terms containing Δm^2, Δn^2, etc., have an appreciable influence upon the solution. To secure the utmost accuracy these values of l, m, n should be treated as new approximations and another set of corrections Δl, Δm, Δn derived. This resolution is recommended to the student as a valuable exercise. Let the student also derive from the data of § 1 the most probable values of l0 and c, assigning unequal weights to the several equations.

§ 11. Conditioned Observations.

There is a class of cases in which the application of the principle of least squares seems to produce absurd results.
Thus if each angle of a plane triangle be measured many times in order to obtain an accurate set of values for the angles, the application of the principle that [pvv] must be made a minimum will furnish as the most probable value of each angle the weighted mean of the measures of that angle; but the sum of these weighted means will usually differ slightly from 180°, and since the sum of the angles of every plane triangle must equal 180° it appears that the most probable values above derived are impossible values. It must, however, be noted that the method of treatment above outlined is itself a violation of Principle A, § 1, since the knowledge that the sum of the angles must equal 180° furnishes a relation among those angles which may be used and ought to be used in determining their most probable values; and the apparent absurdity above found is produced by neglecting this part of the data. A relation such as the above which must be exactly satisfied by a set of observed quantities is called a rigorous condition, the equation by which the relation is expressed is called an equation of condition, and observations of such quantities are known as conditioned observations. The number of rigorous conditions is, of course, always less than the number of unknown quantities, since if it were equal to the number of such quantities the values of the latter would be determined by the conditions alone, independently of any observations.
In order to develop a convenient method of treating rigorous conditions, let x, y, z be three unknown quantities which are to be determined from observation, but whose values are required to satisfy the equations of condition

φ(x, y, z) = 0        ψ(x, y, z) = 0

Let the measurements or observations for the determination of the unknown quantities be represented by observation equations of the form

f1(x, m) = 0        f2(y, n) = 0        f3(z, q) = 0

m, n, and q being the quantities directly measured, and the measures for the determination of x being quite independent of those for y, z, etc. In accordance with the principles of least squares the values of the unknown quantities are to be so determined that [pvv] shall be made a minimum in each series of observations above represented, and therefore the sum of all the weighted squares of the residuals must also be a minimum. Owing to the conditions φ(x, y, z) = 0, ψ(x, y, z) = 0 it will not in general be possible to assign to the unknown quantities values which will give to [pvv] its least possible value, and the problem becomes one of conditioned or relative minima; i.e. out of all the sets of values of x, y, z which will exactly satisfy the equations of condition it is required to find that set which assigns to [pvv] its least value consistent with those equations. The method of determining relative minima is as follows (Jordan, Cours d'Analyse, Vol. I., § 205): Multiply each equation of condition by an undetermined constant factor, and add the products to the function which is to be made a minimum. The derivative of the new function with respect to each unknown quantity must be placed equal to 0, and the equations thus formed, together with the equations of condition, will be just sufficient to determine the unknown quantities and the constant multipliers.
Thus, in the present case, representing the multipliers by -2k1 and -2k2, we have for the new function

w = [pvv] - 2k1 φ(x, y, z) - 2k2 ψ(x, y, z)   (17)

and

dw/dx = 0,   dw/dy = 0,   dw/dz = 0   (18)

which, together with φ(x, y, z) = 0, ψ(x, y, z) = 0, will determine k1, k2, x, y, and z. It was shown in § 7 that in general for three unknown quantities

d[pvv]/dx = 2[paa]x + 2[pab]y + 2[pac]z + 2[pan]

but in the case here considered those observation equations which contain x do not contain either y or z, and, therefore, the b and c coefficients in those equations are to be considered zero, all the products ab, ac are also zero, and

d[pvv]/dx = 2[paa]x + 2[pan]

with similar expressions for the y and z derivatives. Denoting for the sake of brevity φ(x, y, z) and ψ(x, y, z) by φ and ψ respectively, we obtain by differentiating w, and dropping the common factor 2,

[paa]x + [pan] - k1 dφ/dx - k2 dψ/dx = 0
[pbb]y + [pbn] - k1 dφ/dy - k2 dψ/dy = 0
[pcc]z + [pcn] - k1 dφ/dz - k2 dψ/dz = 0

from which

x = -[pan]/[paa] + (dφ/dx) k1/[paa] + (dψ/dx) k2/[paa]
y = -[pbn]/[pbb] + (dφ/dy) k1/[pbb] + (dψ/dy) k2/[pbb]
z = -[pcn]/[pcc] + (dφ/dz) k1/[pcc] + (dψ/dz) k2/[pcc]

These equations determine the values of x, y, z when k1 and k2 are known, and it should be observed that the first terms of the second members of the equations are the values of x, y, z which would be obtained by treating the observations as if these quantities were entirely independent of each other, e.g. in the case of direct observations of the quantities they are the weighted means of the observations. If we represent the values thus obtained by x0, y0, z0, and represent by v1, v2, v3 the corrections which must be added to these quantities in order to obtain the most probable values of x, y, z, i.e. put

x = x0 + v1        y = y0 + v2        z = z0 + v3

we shall have

v1 = (dφ/dx) k1/[paa] + (dψ/dx) k2/[paa]
v2 = (dφ/dy) k1/[pbb] + (dψ/dy) k2/[pbb]   (19)
v3 = (dφ/dz) k1/[pcc] + (dψ/dz) k2/[pcc]
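A minimal numerical sketch of equations (17)-(19) applied to the triangle problem proposed at the end of this section (the function name and the sample angles are mine, not from the text; the single condition φ = x + y + z - 180° = 0 gives v_i = k/p_i, and the condition itself fixes the correlate k):

```python
# Single rigorous condition phi = x + y + z - 180 = 0 among three
# measured angles with weights p: equations (19) give v_i = k / p_i.
# (Illustrative values, not from the text.)
def adjust_triangle(angles, weights):
    e = sum(angles) - 180.0                  # closing error phi(x0, y0, z0)
    k = -e / sum(1.0 / p for p in weights)   # correlate
    return [a + k / p for a, p in zip(angles, weights)]

adj = adjust_triangle([60.01, 59.98, 60.04], [1, 2, 1])
print([round(a, 3) for a in adj], round(sum(adj), 3))
```

The adjusted angles sum exactly to 180°, and each correction is inversely proportional to the weight of its angle, as the PROBLEM below asks the student to prove.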
The quantities k1 and k2 are called correlates, and from the manner in which they were introduced it appears that the number of correlates is equal to the number of rigorous conditions to which the observed quantities are subject. To determine the values of the correlates let x0 + v1, y0 + v2, z0 + v3 be substituted for x, y, z in the equations of condition, and the equations developed by Taylor's Formula, giving for the first of them

φ(x0, y0, z0) + (dφ/dx) v1 + (dφ/dy) v2 + (dφ/dz) v3 + etc. = 0

and a similar expression for ψ(x, y, z). Let the values of v1, v2, v3 in terms of k1 and k2 be substituted in these equations, and put

(dφ/dx)/√[paa] = a1,   (dφ/dy)/√[pbb] = a2,   (dφ/dz)/√[pcc] = a3
(dψ/dx)/√[paa] = β1,   (dψ/dy)/√[pbb] = β2,   (dψ/dz)/√[pcc] = β3

and the equations become

[aa] k1 + [aβ] k2 + φ(x0, y0, z0) = 0
[aβ] k1 + [ββ] k2 + ψ(x0, y0, z0) = 0   (20)

from which the values of k1 and k2 may be obtained, and thus the values of v1, v2, v3 from equations (19). The method by which the above equations have been derived for the case of three unknown quantities connected by two equations of condition is perfectly general and may be extended to any other number of quantities whose values are to be obtained from independent observations. In the cases which actually arise in practice the observation equations and equations of condition are usually of simple form, the differential coefficients and the quantities a, b, c, etc., being usually equal to either 1 or 0.

PROBLEM. Let the student show by the method of correlates that if the sum of the measured angles of a plane triangle exceed 180° by a quantity e, the angles must be corrected by distributing e among them in such a manner that the correction to each angle is inversely proportional to the weight of the angle.

To illustrate the application of the principles of the present section to a numerical problem, we select from the U. S. C. & G.
Survey Report for 1884, pages 409 et seq., the following telegraphic determinations of longitude, and seek to adjust them so that they shall be mutually consistent. Each difference of longitude between two stations was directly observed, so that the observation equations are all of the form x = m1, x = m2, etc., and the values given below are the weighted means of the individual observations of each series. The probable error of each determination (see § 12) is placed immediately after the quantity itself, and the weights of the determinations are assumed to be inversely proportional to the squares of the probable errors.

Stations.                               Symbol.   Observed Difference of Longitude.   1/√p    1/p
Cambridge, Mass., to Washington, D.C.     x       23m 41s.041 ± 0s.018                0.18    0.032
Cambridge, Mass., to Cleveland, O.        y       42  14.875  ±  0.038                0.38    0.144
Cambridge, Mass., to Columbus, O.         z       47  27.713  ±  0.035                0.35    0.122
Washington, D.C., to Columbus, O.         u       23  46.816  ±  0.038                0.38    0.144
Cleveland, O., to Columbus, O.            w        5  12.929  ±  0.045                0.45    0.202

The five observed differences of longitude give rise to two rigorous conditions represented by the following equations of condition:

φ = u + x - z = 0        ψ = w + y - z = 0

The coefficients in the observation equations being all equal to unity, [paa] = p1, [pbb] = p2, etc., and

a1 = (dφ/dx)/√p1,   a2 = (dφ/dy)/√p2,   β2 = (dψ/dy)/√p2,   etc.,

and from these expressions are derived the following values of the coefficients, together with the sums s1 = a1 + β1, s2 = a2 + β2, etc., which are to be employed as a check upon the formation of the normal equations for determining the correlates.

COEFFICIENTS.
Subscripts.     1.       2.       3.       4.       5.
a             +0.18     0.00    -0.35    +0.38     0.00
β              0.00    +0.38    -0.35     0.00    +0.45
s             +0.18    +0.38    -0.70    +0.38    +0.45

FORMATION OF THE CORRELATE EQUATIONS.
  aa        aβ        as        ββ        βs
+0.0324   0.0000   +0.0324    0.0000    0.0000
 0.0000   0.0000    0.0000   +0.1444   +0.1444
+0.1225  +0.1225   +0.2450   +0.1225   +0.2450
+0.1444   0.0000   +0.1444    0.0000    0.0000
 0.0000   0.0000    0.0000   +0.2025   +0.2025
+0.2993  +0.1225   +0.4218   +0.4694   +0.5919

Check: [aa] + [aβ] = 0.4218;   [aβ] + [ββ] = 0.5919.

CORRELATE NORMAL EQUATIONS.

+0.2993 k1 + 0.1225 k2 + 0.144 = 0
+0.1225 k1 + 0.4694 k2 + 0.091 = 0
k1 = -0s.449        k2 = -0s.078

The absolute terms of the correlate equations are obtained by substituting the observed values x0, y0, z0, u0, w0 in the equations of condition, and the values of k1, k2 may be found from the correlate equations, either by Gauss's method of substitution or by any of the ordinary algebraic processes of elimination. The corrections to x0, y0, z0, etc., and the adopted values of the unknown quantities, are now found from

v1 = +0.032 k1 + 0.000 k2 = -0s.014        x = 23m 41s.027
v2 = +0.000 k1 + 0.144 k2 = -0.011         y = 42  14.864
v3 = -0.122 k1 - 0.122 k2 = +0.064         z = 47  27.777
v4 = +0.144 k1 + 0.000 k2 = -0.065         u = 23  46.751
v5 = +0.000 k1 + 0.202 k2 = -0.016         w =  5  12.913

The values thus obtained satisfy the rigorous conditions of the problem, and are the most probable values which can be obtained from the data given above.

§ 12. The Probable Error.

Every intelligent observer desires to know something of the quality of his observations, how good or how bad they are; the computer who has to combine the results of different series of observations should have some knowledge of their relative accuracy in order to assign to each series its proper weight; and the investigator engaged in a complicated series of experiments desires some criterion by which to estimate the relative errors of the several parts of his work, in order to apportion his care properly among them, giving the maximum attention where the greatest errors are to be feared.
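The correlate arithmetic of the longitude adjustment above can be reproduced in a few lines (a sketch; the variable names are mine, the reciprocal weights are the squares of the tabulated 1/√p, and the closing errors 0.144 and 0.091 are the absolute terms of the correlate equations). The computed correlates, about -0.450 and -0.076, agree with the text's -0.449 and -0.078 to the precision of the logarithmic work:

```python
# Correlate solution of the longitude net: two conditions,
# phi = u + x - z and psi = w + y - z, among five observed differences.
invp = [0.0324, 0.1444, 0.1225, 0.1444, 0.2025]  # 1/p for x, y, z, u, w
A = [1, 0, -1, 1, 0]          # d(phi)/d(x, y, z, u, w)
B = [0, 1, -1, 0, 1]          # d(psi)/d(x, y, z, u, w)
e1, e2 = 0.144, 0.091         # closing errors of the observed values

aa = sum(a * a * q for a, q in zip(A, invp))        # [aa]  = 0.2993
ab = sum(a * b * q for a, b, q in zip(A, B, invp))  # [a.b] = 0.1225
bb = sum(b * b * q for b, q in zip(B, invp))        # [b.b] = 0.4694

det = aa * bb - ab * ab                   # solve the two normal equations
k1 = (ab * e2 - bb * e1) / det
k2 = (ab * e1 - aa * e2) / det
v = [(a * k1 + b * k2) * q for a, b, q in zip(A, B, invp)]
print(round(k1, 3), round(k2, 3), [round(c, 3) for c in v])
```

The corrections v reproduce the text's -0.014, -0.011, +0.064, -0.065, -0.016, and the two conditions close exactly after adjustment.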
It is evident from the nature of the case that no absolute criterion of this kind can be furnished, since any series of observations may be affected with systematic errors which seriously impair the accuracy of its results but furnish no indication of their presence. Both observer and computer do, however, estimate the accuracy of observations by their agreement among themselves, and that within certain limits this procedure is correct follows from Gauss's law of error. If we suppose a very long series of observations affected only by accidental errors, the values of the unknown quantities obtained from the series will differ but little from the true values (if the series is infinitely long they will be the true values), the residuals which they furnish will be very nearly the errors of observation, and the value of h in the equation of the error curve will furnish a measure of the precision of the observations as well as a measure of the smallness of the residuals. On the other hand, if the student attempts to construct the error curve corresponding to any short series of residuals, e.g. those of § 10, he will find that while they give him some information in regard to the curve there will be much that is arbitrary in its actual construction, and that many curves can be drawn which will appear to fit the residuals equally well; i.e. the amount of data in this case is insufficient to determine more than a rough approximation to the measure of precision of the observations. If the observations are affected with systematic errors, the residuals may be very different from the errors of the observations, and will then furnish no indication of their accuracy. It thus appears that any conclusions in regard to the accuracy of a given set of observations must be treated with caution if they are based solely on the residuals furnished by the observations.
Such conclusions are, in fact, valid only within certain limits whose general nature is indicated above; but within these limits the information thus furnished may be of much value, and it is frequently employed for the purposes indicated at the beginning of this section. The measure of precision, h, seems to be indicated by its name as the appropriate means of expressing the average accuracy of a set of observations, but in practice it is not so used, another function of the residuals being found more convenient. If in a very long series of observations the residuals be arranged in the order of their numerical magnitude (without regard to sign), that residual which occupies the middle place in the series will have as many residuals greater than it as there are less than it, and in any future series of observations of the same degree of precision as that here considered, it will be an even chance that any given residual will be greater than, or less than, the middle one above selected. This middle residual is usually denoted by r, and is rather inappropriately called the probable error of the series, the adjective having reference to the equal probabilities of the occurrence of residuals (errors) greater than, or less than, r. It is apparent that the greater the precision of any set of observations, the smaller will be the corresponding probable error; but the exact relation which exists between h and r must be derived from the equation of the error curve. The symmetry of this curve with respect to the axis of y shows that the same law of distribution holds for both positive and negative errors, and that in a very long series of residuals the probable error r will occupy the middle place among the positive errors and among the negative errors considered separately, as well as among all the errors taken without regard to sign.
Since we are concerned only with the numerical magnitude of r we may confine our attention to the positive residuals, and find the relation between r and h from that half of the error curve which lies to the right of the axis of y. Since the probable error is a residual, it must be represented by the abscissa of some point on the axis of x, and we may determine this point from the condition that the ordinate drawn through it bisects the area of that half of the curve under consideration, since (from the relation between areas and the number of residuals of a given magnitude developed in § 4) this is the geometrical equivalent of the statement that the number of residuals greater than r is equal to the number less than r. By interpolation from the table in § 4, the value of the argument corresponding to A = 0.25 is found to be hx = hr = 0.477, whence the relation between the probable error and the measure of precision is

r = 0.477 / h   (22)

The student will observe that in the definition of the probable error reference is made to a very long series of observations, and in a series of infinite length the value of r might be found immediately from its definition; but in any ordinary set of observations it is better to assume that the residuals are distributed in accordance with the law of error, and to determine the value of r from the relation between h and the sum of the squares of the residuals, § 5, which gives

r = ± 0.477 √2 · √([vv]/n) = ± 0.674 √([vv]/n)

We here encounter a difficulty arising from the attempt to apply to a short series of residuals principles which are rigorously true only when the series is of infinite length. Suppose the above expression for r applied to a series of three observations involving three unknown quantities whose values are derived from the resulting observation equations.
These values will exactly satisfy the equations, no matter what the errors of the observations may be, and the residuals being all zero, there will be found r = 0 and h = ∞, which is absurd. The observations in this case furnish no data from which to estimate their precision, and in every such case where the number of observations is equal to the number of unknown quantities, the expression for the probable error ought to become indeterminate. It is therefore customary to put

r = ± 0.674 √( [vv] / (n - μ) )   (23)

in which μ denotes the number of quantities whose values have been derived from the observations. This equation, which is known as Bessel's expression for the probable error of a single observation, being only an approximate one, we may usually put 2/3 in place of the coefficient 0.674. Among German physicists and astronomers it is quite customary to suppress this coefficient altogether, and to use the "mean error"

ε = ± √( [vv] / (n - μ) )   (24)

for the comparison of observations. Geometrically considered, ε denotes the abscissa of the point of inflexion of the error curve. A simpler expression for the probable error may be obtained by substituting in the equation r = 0.477/h a value of h derived as follows: Let each member of the equation of the error curve be multiplied by x dx and integrated between the limits -∞ and +∞, giving

∫ x y dx = (h/√π) ∫ x e^(-h²x²) dx

The value of the first integral in this equation is obviously 0, since as we pass along the error curve from -∞ to +∞ every value of y occurs once associated with a negative value of x, and again with a numerically equal positive value, and for every negative element x y dx in the integral there occurs an equal positive x y dx, so that the entire sum is 0.
If, however, we agree to neglect the sign of x and to consider only its numerical value, we shall find

∫ x y dx = 2 ∫₀^∞ x y dx

and by a course of reasoning precisely similar to that applied in § 5 to the quantity ∫ x²y dx, it may be shown that 2 ∫₀^∞ x y dx is equal to the mean of all the residuals taken without regard to sign. We may therefore write

[+v]/n = (2h/√π) ∫₀^∞ x e^(-h²x²) dx

where the + inside the brackets denotes that all of the residuals are to be treated as positive quantities. Putting hx = t in the second member of this equation, and remembering that here also we are concerned only with numerical values of x without regard to sign, we obtain

[+v]/n = (2/(h√π)) ∫₀^∞ t e^(-t²) dt = -(1/(h√π)) [e^(-t²)]₀^∞

Introducing the limits into the integrated expression there results

[+v]/n = 1/(h√π)

and

r = 0.477 √π · [+v]/n = 0.845 [+v]/n   (25)

This formula is rigorously correct only when the number of observations is infinite, and it must be transformed so as to become indeterminate when the number of observation equations is just sufficient to determine the unknown quantities, i.e. when n = μ. This might be accomplished by writing n - μ in place of n, as was done in equation (23), but it is customary to substitute √(n(n - μ)), which also renders r indeterminate when n = μ, and gives values of r more nearly in agreement with equation (23). Making this substitution, we have

r = 0.845 [+v] / √(n(n - μ))   (26)

which is known as Peters' formula for probable errors. This formula is very convenient for the numerical computation of probable errors; but where the number of observations is small the results furnished by equation (23) are considered more reliable, and neither formula can furnish a good determination of probable errors from a small number of observations. The numerical application of these formulae may be illustrated by the following short series of sextant observations for the determination of latitude.
Observations.        v.       vv.
43° 4' 46"         +19       361
43° 4' 24"          -3         9
43° 4'  7"         -20       400
43° 4' 28"          +1         1
43° 4' 59"         +32      1024
43° 4' 39"         +12       144
43° 4' 52"         +25       625
43° 4' 52"         +25       625
43° 3' 47"         -40      1600
43° 4' 15"         -12       144
43° 3' 36"         -51      2601
43° 4' 40"         +13       169

Mean = 43° 4' 27"        [+v] = 253        [vv] = 7703
n = 12,   μ = 1.

By Peters' formula (26):
log [+v] = 2.403
a.c. log √(n(n-1)) = 8.940 - 10
log 0.845 = 9.926 - 10
log r = 1.269,   r = ± 18".6

By equation (23):
log [vv] = 3.886
log (n-1) = 1.041
log ([vv]/(n-1)) = 2.845
log √([vv]/(n-1)) = 1.422
log 0.674 = 9.829 - 10
log r = 1.251,   r = ± 17".8

The difference between the values of r found from the first and second powers of the residuals is small compared with the uncertainty of each arising from the small number of observations. In so far as these observations can be considered as furnishing a value of r, they indicate that in a future series of similar and equally precise observations there should be as many observations furnishing residuals (errors) greater than 18" as there are observations giving residuals less than 18". The ± which is commonly prefixed to the numerical value of r denotes that the observed quantity is as apt to err in excess as in defect. Let the student derive from the residuals given in § 10 a determination of the probable error of an observed V, noting that in this case μ = 3.

§ 13. Probable Error of a Function of Observed Quantities.

Let x', x'', x''' denote quantities which have been determined from observation, and let r', r'', r''' be their probable errors. Let u be a quantity whose value has been computed from the values of x', x'', x''' by means of the relation

f(x', x'', x''', u) = 0

It is evident that the precision with which u is determined depends upon the precision of x', x'', x''', and by a slight extension of the term "probable error" we may consider the precision of u to be represented by a probable error, r, and may inquire the relation of r to r', r'', r'''.
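The sextant computation above is easily checked; a sketch, taking the residuals of the table in seconds of arc:

```python
import math

# Probable error of one observation by Bessel's formula (23) and by
# Peters' formula (26), applied to the twelve sextant residuals above.
v = [19, -3, -20, 1, 32, 12, 25, 25, -40, -12, -51, 13]
n, mu = len(v), 1
vv = sum(x * x for x in v)                         # [vv] = 7703
av = sum(abs(x) for x in v)                        # [+v] = 253
r_bessel = 0.674 * math.sqrt(vv / (n - mu))        # about 17".8
r_peters = 0.845 * av / math.sqrt(n * (n - mu))    # about 18".6
print(round(r_bessel, 1), round(r_peters, 1))
```

Both results agree with the logarithmic work of the text to the tenth of a second of arc.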
Since a probable error is one of the residuals or errors of a very long series, we may obtain the desired relation between r, r', r'', r''' from a consideration of the general relation of any set of errors v', v'', v''' in x', x'', x''' to the corresponding error, v, in u. This relation is (see § 8)

v = (df/dx') v' + (df/dx'') v'' + (df/dx''') v''' + etc.

To avoid the necessity for considering the signs of v', v'', v''', let this equation be squared, giving

v² = (df/dx')² v'² + (df/dx'')² v''² + (df/dx''')² v'''² + etc.

from which all terms involving the products v'v'', v'v''', v''v''', etc., have been dropped, for the reason that the probable error of u depends upon the average magnitude of v, and in the long run any pair of residuals v', v'' will have opposite signs as often as they have like signs, and will therefore produce an equal number of positive and negative terms whose effect upon the mean value of v² will be very small compared with the terms containing v'², v''², v'''², which are always positive. Replacing these actual errors by the corresponding probable errors, we obtain

r² = (df/dx')² r'² + (df/dx'')² r''² + (df/dx''')² r'''² + etc.   (27)

and an equation of similar form will express the relation of the probable error of the function to the probable errors of the quantities upon which it depends, whatever the number of these quantities may be. We proceed to apply this relation to a few simple cases of frequent occurrence in practice.

(a) The probable error of the sum of n observed quantities. In this case

u = x' + x'' + x''' + ... + x^(n)

and each of the differential coefficients df/dx', df/dx'', etc., equals 1; whence

r² = r'² + r''² + r'''² + ... + r^(n)²   (28)

(b) The probable error of the mean of n observed quantities. In this case

u = (1/n)(x' + x'' + x''' + ... + x^(n))
df/dx' = df/dx'' = etc. = 1/n
r² = (1/n²)(r'² + r''² + r'''² + ... + r^(n)²)

We have here to distinguish two cases.
If the x's are all of equal precision, the r's are equal, and may be represented by a common symbol r'; whence

r = r' / √n   (29)

If the observations are of unequal precision represented by weights p', p'', p''', ..., we have

u = (p'x' + p''x'' + p'''x''' + ... + p^(n)x^(n)) / [p]
df/dx' = p'/[p],   df/dx'' = p''/[p],   etc.
r² = (p'²r'² + p''²r''² + ... + p^(n)²r^(n)²) / [p]² = r1² / [p]   (30)

where r1 denotes the probable error of an observation whose weight is 1. The relations here derived between the probable error of a single determination of a quantity and the probable error of the mean of n determinations may be employed, in connection with equation (23), to determine the probable error of an adopted value based upon several determinations of a quantity. Thus, in the general case of observations of unequal weight, if r1 represent the probable error of an observation of weight 1, and r the probable error of the weighted mean, we have from equation (23)

r1 = 0.674 √( [pvv] / (n - μ) ),        r = r1 / √[p]   (31)

Let the student show that when the observations are of equal precision and

u = a(x' - x'' - x'''),        r = ± a√3 · r'
u = sin (x'/a),                r = ± (r'/a) cos (x'/a)
u = log x',                    r = ± 0.434 r'/x'

§ 14. Assignment of Weights. Rejection of Observations.

The term weight has been employed in the preceding sections as a measure of the quality of an observation, but its use is by no means limited to the case of single observations. Thus, from an Investigation of the Distance of the Sun, etc., by S. Newcomb, we select the following determinations of the solar parallax.

Method by which determined.                  Parallax.   Weight.
Meridian observations of Mars, 1862.          8".855       25
Micrometric observations of Mars, 1862.       8 .842        6
Parallactic inequality of the Moon.           8 .838       16
Lunar equation of the Earth.                  8 .809        3
Transit of Venus, 1769.                       8 .860        6

Each value of the parallax here given is the final result of an elaborate discussion of many observations, and the weights indicate the relative excellence attributed to these results by the author of the investigation.
If π denote any one of these values of the parallax, p its weight, and π0 the most probable value of the parallax, we shall have (§ 6)

π0 = [pπ] / [p] = 8".847

It is to be noted that this value depends upon the weights assigned to the individual determinations, and that by properly selecting the weights π0 may be made to assume any value whatever between the least and the greatest single determination. Thus if the weight 100 be assigned to the value 8".860, and to each of the other values the weight 1, we shall find π0 = 8".859, while a weight 100 for the value 8".809 with a weight 1 for each of the others makes π0 = 8".811. Between these limits the value of π0 depends upon the judgment of the computer in assigning weights, and this determination of weights is one of the most delicate questions that arise in the application of the method of least squares. A relation between weights and probable errors may easily be established, which is frequently of service in that it enables the problem of weights to be stated in a different form. Let x denote an observation whose probable error is r and whose weight is 1, and denote by x', x'', x''', etc., observations or combinations of observations of the same quantity, whose weights and probable errors are represented by p', p'', p''', r', r'', r''', etc. In accordance with the definition of weights, x' is the equivalent of p' observations of the same quality as x, and from the equations derived in the preceding section we have

r' = r / √p'

with a similar expression for each of the other quantities; whence

p' r'² = p'' r''² = p''' r'''² = r²,   and   p' = r²/r'²,   p'' = r²/r''²,   etc.   (32)

and, in general, the weights are inversely proportional to the squares of the probable errors.
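The weighted means quoted above in the discussion of the parallax determinations can be verified directly (a sketch; the function name is mine):

```python
# Weighted mean pi0 = [p.pi]/[p] for Newcomb's parallax determinations,
# and the two extreme weightings discussed in the text.
def weighted_mean(values, weights):
    return sum(v * p for v, p in zip(values, weights)) / sum(weights)

parallax = [8.855, 8.842, 8.838, 8.809, 8.860]
weights  = [25, 6, 16, 3, 6]
print(round(weighted_mean(parallax, weights), 3))            # 8.847
print(round(weighted_mean(parallax, [1, 1, 1, 1, 100]), 3))  # 8.859
print(round(weighted_mean(parallax, [1, 1, 1, 100, 1]), 3))  # 8.811
```

The three results reproduce the text's 8".847, 8".859, and 8".811, showing how far the adopted value is at the mercy of the assigned weights.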
It has been sufficiently shown that probable errors derived from the residuals furnished by a series of observations represent only the effects of accidental errors of observation; but we may extend the significance of the term so as to include an estimate of the effect upon x', x'', x''' of systematic errors in the observations. Let r1 and r2 represent those parts of the probable error which come from these two sources respectively, and from § 13 we find for their combined effect

r² = r1² + r2²

and the expression for the weights becomes

p = r0² / (r1² + r2²)   (33)

By this device the determination of weights is reduced to an estimate of the combined effect of accidental and systematic errors of observation upon the quantity whose weight is desired, and it was from an estimate of this character that the weights of the parallaxes given above were derived. If r' denote the probable accidental error of a single observation, and the quantity whose weight is p is the mean of n such observations, we shall have

p = 1 / (r'²/n + r2²)   (34)

from which the constant multiplier r0² has been dropped, since only relative values of p are ever required. It appears from this equation that if the systematic errors, r2, are very small compared with the accidental errors, r', the weight increases rapidly as the number of observations is increased; but if the systematic errors are large, the weight is but little affected by the number of observations; a relation to be considered in deciding how many observations shall be made to determine an unknown quantity.
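The behaviour of equation (34) can be seen numerically (a sketch; the argument names and the sample errors are mine, not from the text):

```python
# Relative weight of the mean of n observations, equation (34): the
# accidental part r' shrinks as n grows, but the systematic part r2 does not.
def weight(n, r_acc, r_sys):
    return 1.0 / (r_acc ** 2 / n + r_sys ** 2)

# small systematic error: quadrupling n nearly quadruples the weight
print(round(weight(4, 1.0, 0.1) / weight(1, 1.0, 0.1), 2))
# large systematic error: quadrupling n gains very little
print(round(weight(4, 1.0, 2.0) / weight(1, 1.0, 2.0), 2))
```

The contrast between the two ratios is the relation the text asks the observer to weigh in deciding how many observations to make.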
In some cases it may be impossible to form any reliable estimate of the effect of systematic errors, and results which have been derived by different methods, or under different circumstances, may then be given equal weights on the supposition that they are affected by different systematic errors which it is equally important to eliminate; but this is equivalent to putting r2 = ∞ in the equation for the weights, and it will rarely happen that this is the best estimate which can be made for the amount of the systematic errors. It frequently happens that in a series of otherwise accordant observations, one or two will be found which differ widely from the others, and which, if included in the final result, will furnish large residuals. What shall be done with observations of this kind has long been a vexed question. To reject them is equivalent to assigning to them the weight 0, and is the expression of the computer's judgment that they can contribute nothing to the accuracy of the result which he seeks to obtain. In an infinitely long series of observations, errors of any finite magnitude may be permitted without impairing the accuracy of the final result, and the existence of such errors seems contemplated by the theory which we have adopted, since the equation of the error curve gives finite values of y for all values of x between the limits -∞ and +∞. But in the actual case which arises in practice, where a result must be obtained from a comparatively small number of observations, a single one of these, if affected with a large error, may make the final result farther from the truth than any one of the other observations. On the other hand, cases are by no means unknown in which a single discordant observation in a series proves to be nearest to the true value of the quantity sought, the others having all been vitiated by some common cause; and between these extremes an infinite variety of cases may be found.
It must in general remain a matter of doubt whether a given discordant observation should or should not be rejected, and the decision made by the computer must be his judgment based upon all the data available as to whether more will be gained by rejecting than by retaining it. A knowledge of the way in which observations are made, of the circumstances attending the particular observation in question, and of the magnitude of the errors which may reasonably be expected with the given observer and apparatus, or instrument, are elements which should be included in this judgment; and the observer will greatly facilitate its formation by making copious notes at the time of observation of all circumstances which in his opinion may affect the quality of his work, and particularly by noting any abnormal circumstances affecting a single observation or a part of the observations. A doubtful observation should be rejected if it is the computer's deliberate judgment that its retention will hurt more than it will help his final result, but it is never legitimate for the computer to suppress an observation. A rejected observation should be included in the statement of his data, and may properly be accompanied by an explanation of the reasons for its rejection, in order that any person interested in the result may form his own judgment of the data and the manner in which they have been discussed, and may, if necessary, rediscuss the observations in accordance with that judgment. The conclusion of the whole matter of assigning weights to numerical data may be summed up in the statement that no mathematical expression will suffice for this purpose, but the weights must be determined by an exercise of personal judgment, and the wider the knowledge upon which this judgment is based, the greater confidence will the weights and the resulting values of the unknown quantities command.

§ 15. Empirical or Interpolation Formulae.
In the preceding sections attention has been directed to that class of problems in which the theoretical relation between the observed quantities and those whose values are to be determined is known; that is, an equation of known form exists between them, and the problem has been to determine the values of the constants which appear in the equation. But a very different class of cases now demands a passing notice. A series of observations is sometimes found to be affected with errors too great to be explained as the result of unavoidable and fortuitous causes, and it becomes apparent that the law of recurrence of these errors must be determined before the observations can be made to yield any valuable results. The American parties which were sent out in 1874 to observe the transit of Venus were provided with instruments for the determination of their local time, of such a character that the accidental error of a determination from a single star might fairly be estimated at 0s.05 or 0s.06, but results obtained from observations of different stars varied among themselves by more than ten times this amount. An inspection of the discrepancies having shown that they depended in some way upon the distance of the observed star from the zenith, it was found by trial that the error at any zenith distance, z, could be represented by the expression

E = a cos z - b sin z

where a and b are constants whose values were found from the observations themselves. The physical cause of these errors was subsequently found to be the bending of the instrument under its own weight; but it is to be noted that the above law of recurrence of the errors was determined first, the cause afterwards. Expressions of this kind are sometimes called interpolation formulae and sometimes empirical equations; the one term having reference to their use, the other to their derivation.
They are of very general use in all branches of physical science, since they may be made to serve as a convenient summary of a vast amount of numerical data, and one of the most important applications of the method of least squares is in determining the values of the constants which enter into such expressions. The problems treated in §§ 1 and 10 both belong to this class, and the following expression for the magnetic declination at Washington, D.C., derived by Mr. C. A. Schott * from a series of observations extending over ninety years, may serve as a further illustration:

Mag. Dec. = 2°.47 + 2°.50 sin [1°.40 (T - 1850) - 14°.6]

where T denotes the year for which the declination is required. When the cause whose effects are to be represented by an equation is known, the form of this equation can usually be derived by mathematical analysis; but where empirical formulae are employed other methods must be resorted to. The simplest of these is a graphical representation of the errors or other data under consideration. For this purpose let the errors represent ordinates, and the values of any variable upon which they are supposed to depend, the corresponding abscissas. Let points be plotted with these ordinates and abscissas, as was done in obtaining the form of the error curve, Figs. A, B, C, D, and let a smooth curve be drawn through these points either free-hand or by the aid of a draughtsman's "irregular curve." The distance of each plotted point from the curve, measured along an ordinate, is the residual corresponding to the point, and in accordance with the principle of least squares the curve should be so drawn as to make the sum of the squares of these residuals as small as possible, without unduly complicating the curve.

* U. S. C. & G. S. Report, 1882, p. 258.
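Formulae of this kind are easily evaluated mechanically. The sketch below transcribes Schott's expression into Python (a hedged transcription, not part of the original text) and checks it against the equivalent cosine form given later in this section, using the identity sin u = cos (u - 90°).

```python
import math

def mag_dec_sin(T):
    """Schott's formula for the magnetic declination at Washington,
    sine form; angles in degrees, T the year."""
    return 2.47 + 2.50 * math.sin(math.radians(1.40 * (T - 1850) - 14.6))

def mag_dec_cos(T):
    """The equivalent cosine form, since sin u = cos (u - 90 deg)."""
    return 2.47 + 2.50 * math.cos(math.radians(1.40 * (T - 1850) - 104.6))

# The two forms agree for any year.
for year in (1800, 1850, 1900, 1950):
    assert abs(mag_dec_sin(year) - mag_dec_cos(year)) < 1e-12
```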
If the variable has been properly chosen it will in many cases be found possible to draw a simple curve which shall represent the data within the limits of the accidental errors of observation, and as this curve is the graphical representation of the law required, its equation, y = f(x), is the analytical representation of that law. In this manner the form of the equation treated in § 10 was obtained. In other cases it will be possible to draw a smooth and simple curve which shall not represent the data within the limits of accidental error of the observations, but about which the points will be grouped, alternating from one side to the other in a systematic manner. Let the excess of the ordinate of any point over the corresponding ordinate of the curve be plotted with the given abscissa in a new curve. The two curves thus constructed will together form the graphical representation of the law of the data, and the analytic expression of that law will be

y = f(x) + φ(x)

if y = f(x) and y = φ(x) are the equations of the two curves respectively. In some cases the curves themselves will be a sufficient representation of the data, and it will be unnecessary to determine their equations, since the value of y corresponding to any given value of x may be obtained by direct measurement. In other cases the curve will be chiefly serviceable in suggesting the probable form of an equation between the observed quantity and a variable upon which it is supposed to depend, or in showing that no simple relation exists between them. Two forms of equation are of such frequent use in this connection that they deserve especial notice. If the plotted curve does not differ very greatly from a straight line, the relation of the variable x to its function y may be represented by

y = a + bx + cx² + dx³ + etc.   (35)
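Fitting the coefficients of an equation of the form (35) to observed pairs (x, y) is a routine least-squares computation. The Python sketch below forms and solves the normal equations for a polynomial of given degree; the bar-length data are invented for illustration, in the spirit of the problem of § 1.

```python
def solve(A, v):
    """Solve the square system A x = v by Gaussian elimination with
    partial pivoting (sufficient for the small systems used here)."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def polyfit(xs, ys, degree):
    """Least-squares coefficients a0, a1, ... of y = a0 + a1 x + ...
    found from the normal equations."""
    cols = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(cols)]
         for i in range(cols)]
    v = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(cols)]
    return solve(A, v)

# Invented bar-length data: length = 1000.000 + 0.012 * temperature,
# analogous to the two-term fit assumed in section 1.
temps = [0, 10, 20, 30, 40]
lengths = [1000.000 + 0.012 * t for t in temps]
a0, a1 = polyfit(temps, lengths, 1)
```

As the text cautions, more than three or four terms of the series are rarely advisable; the normal equations of a high-degree polynomial fit rapidly become ill-conditioned.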
This equation contains the first few terms of an infinite series by which a limited arc of any continuous curve can be represented, and since the actual relation between y and x could be represented by a curve, if its mathematical expression were known, it follows that the above equation can be made to represent this relation over a certain range of values of x, by assigning proper values to the coefficients. The number of terms of this series which should be taken into account, and the limits of x beyond which the equation is not applicable, depend upon the actual relation between y and x, and are therefore unknown; but, in general, it is not well to attempt to use this equation for large values of x, or when more than three or at most four terms are required. Its application in a simple case is illustrated in the problem of § 1, where, y and x being replaced by the length of the bar and its temperature, it is assumed that their relation can be expressed, within the range of temperature over which the observations extend, by the first two terms of the series. The second type of equation above referred to is

y = a0 + a1 cos (x/m) + a2 cos (2x/m) + a3 cos (3x/m) + etc.
       + b1 sin (x/m) + b2 sin (2x/m) + b3 sin (3x/m) + etc.   (36)

in which m is an undetermined constant expressed in the same unit as x; x/m is therefore a ratio, or absolute number, which in the application of the equation to numerical data must be transformed into circular measure by multiplying it by 180/π = 57°.29578. This form of equation may be made to represent any relation whatever between finite values of y and x, including those cases in which y is a discontinuous function, but it is especially advantageous when y is a periodic function of x, i.e., one in which the same values of y recur for values of x separated by a constant interval, T, called the period of x, so that

f(x) = f(x + T) = f(x + 2T) = ... = f(x + nT).
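A truncated series of the form (36) is evaluated directly once m and the coefficients are known. The sketch below, with invented coefficients, verifies the periodicity property: with m = T/2π the series repeats with period T.

```python
import math

def trig_series(x, m, a, b):
    """Evaluate y = a[0] + sum_k a[k] cos(kx/m) + b[k] sin(kx/m);
    x and m in the same unit, so that x/m is in radians."""
    y = a[0]
    for k in range(1, len(a)):
        y += a[k] * math.cos(k * x / m) + b[k] * math.sin(k * x / m)
    return y

# Illustrative coefficients (invented).  With m = T / (2 pi), each term
# cos(kx/m) advances by a whole number of circumferences when x grows by T.
T = 360.0
m = T / (2 * math.pi)
a = [1.0, 0.5, -0.2]   # a0, a1, a2
b = [0.0, 0.3, 0.1]    # placeholder, b1, b2

for x in (0.0, 10.0, 123.4):
    assert abs(trig_series(x, m, a, b) - trig_series(x + T, m, a, b)) < 1e-9
```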
The simplest type of such a function is y = sin x, the period in this case being T = 360°; e.g., sin 10° = sin (10° + 360°) = sin (10° + 720°) = etc. When y is such a function the constant m should be put equal to the period divided by 2π, m = T/2π; in other cases the value of m should be so taken that the greatest value of x/m included among the data shall not exceed π. The application of this formula may often be facilitated by noting that if the relation between y and x is such that f(x) = f(-x), the sine terms all disappear, and the equation reduces to

y = a0 + a1 cos (x/m) + a2 cos (2x/m) + etc.   (37)

while if f(x) = -f(-x), the cosine terms vanish, and the equation becomes

y = b1 sin (x/m) + b2 sin (2x/m) + etc.   (38)

The several forms above given to this type of equation are those most convenient for use when the values of the coefficients a, b, etc., are to be determined, but after their numerical values have been found it is advantageous to transform the equation as follows: Introduce the auxiliary quantities n0, n1, n2, N1, N2, etc., defined by the relations

n0 = a0        n1 cos N1 = a1        n2 cos N2 = a2
               n1 sin N1 = b1        n2 sin N2 = b2

and the equation becomes

y = n0 + n1 cos (x/m - N1) + n2 cos (2x/m - N2) + etc.   (39)

each pair of terms of the original equation being here replaced by a single term. The expression for the magnetic declination at Washington given above is of this type, as may be seen by writing it in the equivalent form

Mag. Dec. = 2°.47 + 2°.50 cos [1°.40 (T - 1850) - 104°.6]

The mode of applying this form of equation may be illustrated by means of the following data, selected from the series of observations whose residuals are plotted in Fig. C. The observed quantity, B, is the difference of stellar magnitude (brightness) between the planet Saturn and his satellite Iapetus.
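The passage from the coefficients a_k, b_k to the amplitude-and-phase form of equation (39) is the familiar reduction a cos t + b sin t = n cos (t - N), which a short Python sketch can carry out and verify:

```python
import math

def to_amplitude_phase(a_k, b_k):
    """Return n_k and N_k (degrees) such that n_k cos N_k = a_k and
    n_k sin N_k = b_k, whence a_k cos t + b_k sin t = n_k cos(t - N_k)."""
    return math.hypot(a_k, b_k), math.degrees(math.atan2(b_k, a_k))

# First-harmonic constants of the Iapetus example in the text.
n1, N1 = to_amplitude_phase(0.15, 0.74)

# Spot-check the identity at an arbitrary angle.
t = math.radians(33.0)
lhs = 0.15 * math.cos(t) + 0.74 * math.sin(t)
rhs = n1 * math.cos(t - math.radians(N1))
assert abs(lhs - rhs) < 1e-12
```

With the constants above this gives n1 ≈ 0.76 and N1 ≈ 78°.5, agreeing with the values quoted in the text.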
The quantity l given with each observed B fixes the position of the satellite in its orbit at the time of observation, and is analogous to the variable angle θ in a system of polar coordinates.

  l       B     Residual       l       B     Residual
 10°    10.82    -0.28       200°    10.66    +0.28
 70°    11.81    +0.22       230°     9.87    -0.21
110°    11.69    -0.02       270°    10.43    +0.21
140°    11.42    -0.03       310°    10.48    -0.15

B is here seen to run through a complete cycle of values between the limits 9.87 and 11.81, while l varies from 0° to 360°. We shall therefore endeavor to represent B as a periodic function of l whose period is 360°. In accordance with this assumption we put

m = 360°/2π,   whence x/m = l,   and y = B

and, taking into account the first five terms of the series, the several observations furnish the following

OBSERVATION EQUATIONS.

10.82 = a0 + 0.98 a1 + 0.17 b1 + 0.94 a2 + 0.34 b2
11.81 = a0 + 0.34 a1 + 0.94 b1 - 0.77 a2 + 0.64 b2
11.69 = a0 - 0.34 a1 + 0.94 b1 - 0.77 a2 - 0.64 b2
11.42 = a0 - 0.77 a1 + 0.64 b1 + 0.17 a2 - 0.98 b2
10.66 = a0 - 0.94 a1 - 0.34 b1 + 0.77 a2 + 0.64 b2
 9.87 = a0 - 0.64 a1 - 0.77 b1 - 0.17 a2 + 0.98 b2
10.43 = a0 + 0.00 a1 - 1.00 b1 - 1.00 a2 + 0.00 b2
10.48 = a0 + 0.64 a1 - 0.77 b1 - 0.17 a2 - 0.98 b2

The solution of these equations will be found in the following section. The values obtained for the constants are

a0 = 10.92,   a1 = +0.15,   b1 = +0.74,   a2 = -0.04,   b2 = -0.18

Introducing the constants n, N, and determining their values from the relations

n0 = 10.92        n1 cos N1 = +0.15        n2 cos N2 = -0.04
                  n1 sin N1 = +0.74        n2 sin N2 = -0.18

the equation becomes

B = 10.92 + 0.76 cos (l - 78°.5) - 0.18 cos (2l - 77°.5)

The residuals obtained by comparing the values of B computed from this formula with the observed values are given above with the data. Abundant data for exercise in deriving empirical formulae of this kind may be found in the United States Coast and Geodetic Survey Report for 1882, pp. 218-257.

§ 16. Approximate Solutions.
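For comparison, the same eight observation equations can be subjected to a full least-squares treatment. The Python sketch below forms and solves the normal equations; since the constants quoted in the text were obtained by the approximate method of § 16, the values found here need only agree with them closely, not exactly.

```python
import math

def solve(A, v):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Iapetus data from the text: orbital angle l (degrees) and magnitude B.
l_deg = [10, 70, 110, 140, 200, 230, 270, 310]
B = [10.82, 11.81, 11.69, 11.42, 10.66, 9.87, 10.43, 10.48]

# One row per observation equation:
#   B = a0 + a1 cos l + b1 sin l + a2 cos 2l + b2 sin 2l
rows = []
for l in l_deg:
    t = math.radians(l)
    rows.append([1.0, math.cos(t), math.sin(t),
                 math.cos(2 * t), math.sin(2 * t)])

# Normal equations (R^T R) x = R^T B.
m = 5
N = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
rhs = [sum(r[i] * b for r, b in zip(rows, B)) for i in range(m)]
a0, a1, b1, a2, b2 = solve(N, rhs)

resid = [b - sum(c * u for c, u in zip(r, (a0, a1, b1, a2, b2)))
         for r, b in zip(rows, B)]
```

The residuals come out comparable in size to those tabulated with the data above.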
It is often desirable to obtain from a series of observations, as rapidly and with as little labor as possible, a set of values of the unknown quantities involved which shall be fair approximations to their most probable values, but in which the highest degree of accuracy is not required. In cases of this kind the least square treatment of the observation equations as illustrated in § 10 is too long and laborious, and the following method may often be substituted for it with advantage. Let there be any number, e.g. three, of unknown quantities involved in a set of observation equations of the form

a x + b y + c z + n = 0        Weight = p

and let each of these equations be multiplied by its weight, giving the group of equations,

a1 x + b1 y + c1 z + n1 = 0        k1
a2 x + b2 y + c2 z + n2 = 0        k2     (40)
etc.                               etc.

Multiply each of these equations by the undetermined constant k placed opposite it, and let the sum of all the resulting equations be formed. By the use of the summation symbol, [ ], this sum may be written

[ka] x + [kb] y + [kc] z + [kn] = 0

Since the several values of k which enter into this equation are entirely arbitrary, it would be permissible to assign to them such values that [kb] and [kc] should each equal 0, which would give at once

x = - [kn] / [ka]   (41)

This, however, is not practically advantageous on account of the labor involved in determining the values of k. We therefore put

x = - [kn]/[ka] - ([kb]/[ka]) y - ([kc]/[ka]) z   (42)

and, limiting the values of k to +1 and -1, assign them in such a manner that [ka] shall be made as great, and [kb], [kc] as small, as possible. In this manner the coefficients of y and z may often be made so small that if approximate values of y and z are substituted in equation (42), they will furnish a sufficiently accurate value of x; since the effect of the errors of these approximations will be much diminished by the small coefficients by which they are multiplied.
The value of y may be found in the same manner by selecting a set of k's which shall make [kb] large and [ka], [kc] small, and similarly for z. Two or three trials may be required before sufficiently close approximations to the values of x, y, z are obtained, but these trials are rapidly and easily made; and, if necessary, in exceptional cases the summation equations may be written in the form

[k' a] x + [k' b] y + [k' c] z + [k' n] = 0
[k'' a] x + [k'' b] y + [k'' c] z + [k'' n] = 0     (43)
[k''' a] x + [k''' b] y + [k''' c] z + [k''' n] = 0

and the equations solved by any of the methods of elementary algebra; but in every case the values of k, +1 and -1, should be so chosen that in the first equation the coefficient of x, in the second equation the coefficient of y, and in the third equation the coefficient of z, shall be made as large, and all of the other coefficients as small, as possible. By this mode of solution each observation with its proper weight is included in the determination of the unknowns, but since the principle of least squares has not been taken into account, it cannot be expected that the resulting values will be the best that the observations can be made to yield. To illustrate the mode of solution we recur to the observation equations contained in the preceding section, and, putting a0 = 10.00 + a, write them as follows, placing the several values of k at the right of each equation.
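The device of choosing k = ±1 so as to magnify one coefficient sum while suppressing the others can be mechanized. In the Python sketch below the observation equations are invented and exactly consistent, so solving the three summation equations of the form (43) recovers the assumed values exactly; with real, discordant observations the result would only be approximate.

```python
def solve(A, v):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Invented, exactly consistent observation equations a x + b y + c z + n = 0,
# generated from an assumed "true" solution.
X, Y, Z = 1.5, -2.0, 0.25
coeffs = [(0.9, 0.2, -0.5), (-0.4, 1.0, 0.3), (0.7, -0.8, 0.1),
          (0.2, 0.6, 0.9), (-1.0, -0.3, 0.4), (0.5, 0.5, -0.7)]
eqs = [(a, b, c, -(a * X + b * Y + c * Z)) for a, b, c in coeffs]

def summation(eqs, col):
    """Multiply each equation by k = +1 or -1, the sign of its coefficient
    in the chosen column, so that column's sum is made large; then add."""
    ks = [1 if e[col] >= 0 else -1 for e in eqs]
    return [sum(k * e[j] for k, e in zip(ks, eqs)) for j in range(4)]

# One summation equation per unknown, as in (43), then an ordinary solve.
S = [summation(eqs, col) for col in range(3)]
x, y, z = solve([row[:3] for row in S], [-row[3] for row in S])
```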
          cos l      sin l      cos 2l     sin 2l          k'  k''  k''' k^iv k^v
 0.82 = a + 0.98 a1 + 0.17 b1 + 0.94 a2 + 0.34 b2         +1  +1  +1  +1  +1
 1.81 = a + 0.34 a1 + 0.94 b1 - 0.77 a2 + 0.64 b2         +1  +1  +1  -1  +1
 1.69 = a - 0.34 a1 + 0.94 b1 - 0.77 a2 - 0.64 b2         +1  -1  +1  -1  -1
 1.42 = a - 0.77 a1 + 0.64 b1 + 0.17 a2 - 0.98 b2         +1  -1  +1  +1  -1
 0.66 = a - 0.94 a1 - 0.34 b1 + 0.77 a2 + 0.64 b2         +1  -1  -1  +1  +1
-0.13 = a - 0.64 a1 - 0.77 b1 - 0.17 a2 + 0.98 b2         +1  -1  -1  -1  +1
 0.43 = a + 0.00 a1 - 1.00 b1 - 1.00 a2 + 0.00 b2         +1  +1  -1  -1  -1
 0.48 = a + 0.64 a1 - 0.77 b1 - 0.17 a2 - 0.98 b2         +1  +1  -1  +1  -1

The summation equations obtained from this group are:

+7.18 = 8a - 0.73 a1 - 0.19 b1 - 1.00 a2 + 0.00 b2
-0.10 = 0a + 4.65 a1 - 1.13 b1 - 1.00 a2 + 0.00 b2
+4.30 = 0a + 1.15 a1 + 5.57 b1 + 0.14 a2 + 0.00 b2     (44)
-0.42 = 0a + 0.53 a1 - 0.41 b1 + 4.42 a2 + 0.00 b2
-0.86 = 0a + 0.21 a1 + 0.19 b1 + 2.54 a2 + 5.20 b2

and these correspond to the normal equations of a least square solution. To apply the method of approximations to the solution of this group of equations, we write them in the form:

a  = +0.897 + 0.091 a1 + 0.024 b1 + 0.125 a2
a1 = -0.022 + 0.243 b1 + 0.215 a2
b1 = +0.772 - 0.206 a1 - 0.025 a2     (45)
a2 = -0.095 - 0.120 a1 + 0.093 b1
b2 = -0.166 - 0.040 a1 - 0.036 b1 - 0.488 a2

The divisions required in making this transformation were made by the use of Crelle's Rechentafeln. By operations which can be performed mentally, we obtain the following sets of approximations to the values of the unknowns:

        I.      II.     III.
a1     0.0    +0.2    +0.15
b1    +0.8    +0.7    +0.74
a2    -0.1    -0.0    -0.04

and substituting in equations (45) the values given under III. we find the adopted values

a0 = 10 + a = 10.92     a1 = +0.15     a2 = -0.04
b1 = +0.74              b2 = -0.18

which were employed in the preceding section.
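The "operations which can be performed mentally" amount to a fixed-point iteration on equations (45). The Python sketch below repeats that iteration from the same rough starting values; the coefficients are those of (45), and the results agree with the adopted values to within the rounding of the text.

```python
def step(a1, b1, a2):
    """One sweep of equations (45) for the three coupled unknowns."""
    a1n = -0.022 + 0.243 * b1 + 0.215 * a2
    b1n = +0.772 - 0.206 * a1 - 0.025 * a2
    a2n = -0.095 - 0.120 * a1 + 0.093 * b1
    return a1n, b1n, a2n

a1, b1, a2 = 0.0, +0.8, -0.1      # first rough approximations
for _ in range(20):               # converges after only a few sweeps
    a1, b1, a2 = step(a1, b1, a2)

# The remaining unknowns follow directly from the last two of (45).
b2 = -0.166 - 0.040 * a1 - 0.036 * b1 - 0.488 * a2
a0 = 10.0 + (+0.897 + 0.091 * a1 + 0.024 * b1 + 0.125 * a2)
```

The iteration converges because in each equation of (45) the coefficients of the other unknowns are small, which is precisely what the choice of the k's was designed to secure.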
INDEX TO FORMULAE.

y = (h/√π) e^(-h²x²)   §§ 3, 4
[aa] x + [ab] y + [ac] z + [an] = 0   § 7
[as] = [aa] + [ab] + [ac] + ... + [an]   § 9
[bb.1] = [bb] - ([ab]/[aa]) [ab]     [cn.2] = [cn.1] - ([bc.1]/[bb.1]) [bn.1]   § 9
r = 0.6745 √([vv]/(n - 1)) = 0.8453 [v]/√(n(n - 1))   § 12
r0 = r/√n     r0 = r/√[p]   § 13
p1 : p2 = r2² : r1²   § 14
y = a + bx + cx² + etc.   § 15
y = n0 + n1 cos (x/m - N1) + n2 cos (2x/m - N2) + etc.   § 15
x = - ([kn] + [kb] y + [kc] z)/[ka],   k = ±1   § 16