[Plate: TYPICAL ERROR CURVES.]

Fig. A. The Method of Least Squares. Velocity of light, 66 observations. x unit = 0.000000001 second. (Velocity of Light in Air and Refracting Media.)

Fig. B. Thread intervals, Madison Meridian Circle, 180 observations. x unit = 0.01 second of time. (Washburn Observatory, unpublished observations.)

ERRATUM. Fig. B. The positive part of the axis of x should pass through the lowest plotted point. The relation of the plotted points to the curve is correctly represented.

Fig. C. Photometric measures of Iapetus, 303 observations. y = 9.1 e^(-0.026 x²). x unit = 0.1 stellar magnitude. (Annals of Harvard College Observatory, Vol. II, p. 252.)

Fig. D. Declinations of α Lyrae, 1866-67, 106 observations. y = 8.1 e^(-0.0221 x²). x unit = 0.05 second of arc. (Constant of Aberration, A. Hall, Lynn, 1888.)

AN ELEMENTARY TREATISE UPON THE METHOD OF LEAST SQUARES, WITH NUMERICAL EXAMPLES OF ITS APPLICATIONS.

BY GEORGE C. COMSTOCK, PROFESSOR OF ASTRONOMY IN THE UNIVERSITY OF WISCONSIN, AND DIRECTOR OF THE WASHBURN OBSERVATORY.

BOSTON, U.S.A.: PUBLISHED BY GINN & COMPANY. 1890.

COPYRIGHT, 1889, BY GEORGE C. COMSTOCK. ALL RIGHTS RESERVED. TYPOGRAPHY BY J. S. CUSHING & CO., BOSTON, U.S.A. PRESSWORK BY GINN & CO., BOSTON, U.S.A.

PREFACE.

THE following elementary treatment of the Method of Least Squares has grown out of my attempts to so present the subject to students of physics, astronomy, and engineering, that a working knowledge based upon an appreciation of its principles might be acquired with a moderate expenditure of time and labor.
Conceiving that the ultimate warrant for the legitimacy of the method itself is to be found in the agreement between the observed distribution of residuals and the distribution represented by the error curve, I have not scrupled to abandon altogether the analytical demonstrations of the equation of this curve and to present it as an empirical formula, representing the generalized experience of observers. The evidence in support of a formula of this kind is necessarily cumulative, and the few curves which are presented in illustration of the law of error are to be considered as samples of the kind of evidence which exists in great abundance. By abandoning the theoretical demonstrations, the student is freed from the embarrassments which are usually encountered at the threshold of the subject, and which in many cases cause it to appear as a mathematical puzzle whose analytical difficulties absorb the attention of the tyro to the complete exclusion of the purposes for which the analysis is conducted. I have sought to give prominence to the distinction between accidental and systematic errors, and to insist upon the limitations which result from the difference between these two classes of error. To illustrate the principles of the text, I have made free use of numerical data and have arranged the computations in forms which experience has shown to be convenient for the purpose, with a view to their subsequent use by the student as models for his own computations. In the preparation of these pages, I have consulted many, if not most, of the standard treatises upon the subject, but my indebtedness for suggestions and methods of treatment is principally to

FAYE, Cours d'Astronomie de l'École Polytechnique.
OPPOLZER, Lehrbuch der Bahnbestimmung.
WRIGHT, Treatise on the Adjustment of Observations.

G. C. C.

CONTENTS.

SECTION                                                    PAGE
 1. ILLUSTRATIVE PROBLEM . . . . . . . . . . . . . . . . . .  1
 2. ERRORS AND RESIDUALS . . . . . . . . . . . . . . . . . .  3
 3. THE DISTRIBUTION OF RESIDUALS  . . . . . . . . . . . . .  5
 4. THE ERROR CURVE  . . . . . . . . . . . . . . . . . . . .  8
 5. THE PRINCIPLE OF LEAST SQUARES . . . . . . . . . . . . . 12
 6. WEIGHTS  . . . . . . . . . . . . . . . . . . . . . . . . 16
 7. NORMAL EQUATIONS . . . . . . . . . . . . . . . . . . . . 19
 8. NON-LINEAR OBSERVATION EQUATIONS . . . . . . . . . . . . 21
 9. FORMATION AND SOLUTION OF NORMAL EQUATIONS . . . . . . . 23
10. NUMERICAL EXAMPLE  . . . . . . . . . . . . . . . . . . . 29
11. CONDITIONED OBSERVATIONS . . . . . . . . . . . . . . . . 38
12. PROBABLE ERRORS  . . . . . . . . . . . . . . . . . . . . 45
13. PROBABLE ERROR OF A FUNCTION OF OBSERVED QUANTITIES  . . 51
14. ASSIGNMENT OF WEIGHTS; REJECTION OF OBSERVATIONS . . . . 54
15. EMPIRICAL OR INTERPOLATION FORMULÆ . . . . . . . . . . . 58
16. APPROXIMATE SOLUTIONS  . . . . . . . . . . . . . . . . . 64
    INDEX TO FORMULÆ . . . . . . . . . . . . . . . . . . . . 68

THE METHOD OF LEAST SQUARES.

§ 1. Problem. To determine the coefficient of linear expansion of a certain bar of metal, its length was determined at different temperatures by comparison with a standard of known length. The data furnished by the measures are (Kohlrausch, Leitfaden der Physik):

    Temperature.    Observed Length.
                          mm.
       20° C.           1000.22
       40               1000.65
       50               1000.90
       60               1001.05

It is required to determine from these observations the amount of the expansion of the bar per degree Centigrade. If c denote the required expansion, and l0 the length of the bar when its temperature is 0° C., its length, l, at any other temperature, t, may be represented by the equation

    l = l0 + c t

By means of this equation the four observations recorded above are transformed into the following observation equations:

    (1)  l0 + 20 c = 1000.22
    (2)  l0 + 40 c = 1000.65
    (3)  l0 + 50 c = 1000.90
    (4)  l0 + 60 c = 1001.05

Any two of these equations are sufficient to determine the values of l0 and c, but the values derived from different pairs of equations will be different. Thus we may find from

    Equations.         l0           c
                       mm.          mm.
    (1) and (2)      999.79      +0.0215
    (1) and (4)      999.80       .0208
    (2) and (3)      999.65       .0250
    (3) and (4)     1000.15       .0150
        etc.           etc.        etc.

We are here presented with a problem of constant recurrence in the investigations and applications of physical science.
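As a modern aside (no such computation appears in the original text, and the function name is our own), the pair-by-pair solutions in the table above are easy to reproduce:

```python
# Illustrative sketch: solve l0 + c*t = l exactly from two observations,
# reproducing the table of pair-by-pair results above.

def solve_pair(t1, l1, t2, l2):
    """Return (l0, c) satisfying l0 + c*t = l for both observations."""
    c = (l2 - l1) / (t2 - t1)
    l0 = l1 - c * t1
    return l0, c

# the four observations: (temperature in deg C, observed length in mm)
obs = [(20, 1000.22), (40, 1000.65), (50, 1000.90), (60, 1001.05)]

for i, j in [(0, 1), (0, 3), (1, 2), (2, 3)]:
    l0, c = solve_pair(*obs[i], *obs[j])
    print("(%d) and (%d): l0 = %.2f, c = %.4f" % (i + 1, j + 1, l0, c))
```

Each pair of equations yields a different (l0, c), which is the inconsistency the text goes on to discuss.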
In order to determine the values of certain quantities with a high degree of precision more measures or observations are made than are absolutely necessary, and these observations prove to be inconsistent among themselves, so that the resulting values of the unknown quantities depend upon the manner in which the data are combined. It is evident that all of the values above found for l0 and c cannot be correct, and it is doubtful if any absolutely correct value can be derived from the data; but it is also apparent that the observations are not worthless and that any of the values above derived may be considered as approximations, more or less close, to the true values of the required quantities. If we assume that the relation between the length of a bar and its temperature can be expressed by an equation of the form employed above, we must suppose that the discordances in the results are due to errors in the observations, and the problem then becomes: To find from the observed data a set of results which shall be affected as little as possible by the errors of the data, or in more technical language, to find the most probable values of the unknown quantities. We may establish in advance of any formal investigation of this problem certain principles to which its solution must conform. Thus,

(A) The adopted values of the quantities which are to be determined must be based upon all the data available. Only in exceptional cases, which will be considered hereafter, is it proper to omit or reject any observation or any known relation among the quantities.

(B) The adopted values must satisfy the observation equations as nearly as possible.

§ 2. Errors and Residuals. The expression error of an observation has been freely used in the preceding section, but it should be recognized that the amount of this error can rarely, if ever, be known, since this would imply an exact knowledge of the unknown quantities.
We may, however, obtain approximate values of these errors from the adopted values of the quantities which were to be determined. Thus, if the values l0 = 999.79, c = +0.0215 be substituted in equations (1), these become

    (1)  1000.22 = 1000.22        (3)  1000.87 = 1000.90
    (2)  1000.65 = 1000.65        (4)  1001.08 = 1001.05

The difference between the first and second members of any one of these equations is called the residual of that equation, and is approximately the error of the corresponding observation. The residuals which correspond to the several values of l0 and c derived in § 1 are given below in tabular form.

    l0 =   999.79    999.80    999.65   1000.15
    c  =  +0.0215   +0.0208   +0.0250   +0.0150

    v  =     0.00      0.00     +0.07     -0.23
              .00      +.02       .00      -.10
             -.03      +.06       .00       .00
             +.03       .00      -.10       .00

We may thus, for any assumed values of the unknown quantities, find a corresponding set of residuals, and the smaller these residuals are the closer is the probable approximation of the assumed, to the true values. Principle (B). This statement, however, requires an important qualification to which we now proceed. The errors with which any given series of observations is affected may be divided into two classes: Accidental Errors, or those whose law of recurrence is such that in the long run they are as often positive as negative and whose effect upon the mean of a great number of observations therefore differs but little from zero; and Systematic Errors, or those which in the given series of observations do not thus tend to be eliminated from the mean.
In the observations considered in § 1, an error of judgment by which the observer in a given case read the thermometer 0°.1 too high would probably be an accidental error, since it may be presumed that in the long run he would read it as often too low as too high; but if through a fixed habit of observing, the thermometer were always read too high, this would be a systematic error, and the number of observations might be indefinitely increased without in the least diminishing its effect. If the standard of length with which the bar was compared were an erroneous standard (e.g. 0.01 mm. too long), all of the observations would be affected with a systematic error due to this source, and the residuals would furnish no trace of this error, since they show only discordances among the observations, and not errors affecting all alike. The smallness of the residuals in any case, therefore, furnishes no guaranty that the observations and the results derived from them have not been vitiated by systematic errors. The presence of errors of this class constitutes the greatest obstacle to the accurate determination of any set of quantities whose values are sought, and the ingenuity and skill of the observer or experimenter cannot be better employed than in avoiding or overcoming the effect of such errors. It therefore deserves especial notice that systematic errors can often be transformed into accidental errors by varying the methods of observation or the conditions under which the observations are made. Thus the possible systematic error of judgment in reading a thermometer, to which allusion was made above, may be transformed into an accidental error if several different persons take part in the observations, since it is hardly probable that they will all have a common, persistent error of judgment.
The error due to using an erroneous standard of length may be changed into accidental error by employing a number of different standards, since it is not probable that these, constructed at different times and by different makers, will all have a common error of length. Considerations of this character serve to illustrate the great practical importance of varying the methods of determining any quantity whose value is desired with great precision. Multiplying observations by the same method and under similar circumstances serves only to diminish the effect of accidental errors and is useless beyond a certain limit, while varying the methods and the circumstances under which observations are made tends to eliminate errors of both kinds. The principles here considered find their appropriate application in the selection of the methods by which any given set of unknown quantities is to be determined; but after the observations have been made, since they can, in general, furnish but little, if any, information in regard to their own systematic errors, these must be neglected and the reduction and discussion of the observations directed toward eliminating the effect of the accidental errors.

§ 3. The Distribution of Residuals. Gauss, a German mathematician, has shown by a course of analysis based upon the theory of probabilities that in any given series comprising a very large number of observations affected with accidental errors, the number of errors of a given magnitude, x, is a function of that magnitude.
Thus, if x' and x'' denote any two errors, and y' and y'' the number of observations having the errors x' and x'' respectively, then

    y' : y'' :: f(x') : f(x'')

The analytical expression for f(x) obtained by Gauss is

    f(x) = (h / √π) e^(-h²x²)        (2)

where e = base of the Naperian system of logarithms, π = ratio of the circumference to the diameter of a circle, h = a number whose value must be derived for each series of observations, but is constant for all the observations of that series. The same expression for f(x) has been derived by other mathematicians through different courses of analysis, but against all of these investigations objections of a theoretical character have been urged. Experience, however, shows that the actual distribution of residuals does follow this law, not with absolute accuracy, but to a remarkable degree of approximation. An excellent illustration of this distribution in the case of a comparatively small number of observations is afforded by a series of 66 determinations of the velocity of light made at Washington, in the year 1882.* By means of a revolving mirror the time required for the passage of a ray of sunlight from one terrestrial point to another was measured. The mean of the 66 determinations of this time interval was 24.827 millionths of a second. By subtracting this mean from each single determination a series of residuals will be obtained, and the number of residuals whose magnitude equals 1, 2, 3, etc. units may then be counted.
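In modern terms the counting step just described can be sketched as follows; the data here are invented for illustration and are not the Washington series.

```python
# Sketch: subtract the mean from each determination, then count the
# residuals by whole-unit magnitude.
from collections import Counter

def count_by_magnitude(values):
    mean = sum(values) / len(values)
    return Counter(round(v - mean) for v in values)

# hypothetical determinations, in thousand-millionths of a second
sample = [24827, 24825, 24829, 24827, 24824, 24830, 24827]
print(count_by_magnitude(sample))
```

The tally that results is exactly the raw material plotted in the figures which follow.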
In this way a fair approximation to the distribution of residuals represented by Gauss's law of error will be found; but as this law purports to represent the average distribution of a great number of errors, we shall obtain a better comparison between it and the actual distribution by the following device, to which we resort in order to increase the number of available residuals: Let it be assumed that in any given set of observations the number of residuals of magnitude x is proportional to the number of residuals occurring between the limits x - a and x + a, where a is a quantity which in strictness ought to be an infinitesimal, but which may be made a small finite quantity without appreciable error. In the present case we adopt as the unit in which the residuals are to be expressed, the thousand-millionth part of a second (0.000000001), and put a equal to two such units. Thus, from a series of 66 observations are derived the following numbers which represent the distribution of residuals which might be expected to occur in a much longer series.

* Velocity of Light in Air and Refracting Media. Bureau of Navigation, Navy Department, 1885, p. 187.

       Residual.        No.    %         Residual.           No.    %
    Less than -13.5      2    0.8     Greater than +13.5      0    0.0
    Equal to  -13.5      0    0.0     Equal to    +13.5       2    0.8
              -12.5      2    0.8                 +12.5       2    0.8
              -11.5      2    0.8                 +11.5       3    1.2
              -10.5      2    0.8                 +10.5       6    2.3
               -9.5      3    1.2                  +9.5       5    1.9
               -8.5      2    0.8                  +8.5       6    2.3
               -7.5      4    1.6                  +7.5       7    2.7
               -6.5      6    2.3                  +6.5       8    3.1
               -5.5      8    3.1                  +5.5      10    3.9
               -4.5     12    4.7                  +4.5      12    4.7
               -3.5     15    5.8                  +3.5      15    5.8
               -2.5     18    7.0                  +2.5      17    6.6
               -1.5     21    8.2                  +1.5      21    8.2
               -0.5     23    8.9                  +0.5      23    8.9

The column headed % represents the number of residuals differing not more than half a unit from the magnitude given in the first column, expressed as a percentage of the whole number of residuals. Fig.
A furnishes a graphical representation of this distribution, each percentage in the above table being represented by a point whose abscissa is the magnitude of the residual and whose ordinate is the percentage itself. The curve whose equation is

    y = (100 h / √π) e^(-h²x²)        h = 0.158

is shown in the same figure, and a simple inspection of the curve shows that its ordinates represent very approximately the percentage of residuals of each magnitude. The coefficient h appears multiplied by the factor 100 in order that the ordinates may be represented as percentages. Figs. B, C, D represent the distribution of residuals in three other series of observations of different kinds, made at different places, by different observers, but all following the same law. The unit in which the residuals are expressed, unit of x, is stated with each figure, and the unit of y is in every case one per cent of the whole number of residuals. The equations of the several curves shown in the figures are almost identical, but the feature to which the student's attention is called is that the algebraic form of the equation is in each case

    y = (h / √π) e^(-h²x²)

and not that h has approximately the same value in each curve. The numerical value of h depends upon the unit adopted for x, and these units having been chosen with reference to a convenient graphical representation of the residuals, the agreement in the several values of h must be regarded as purely artificial. The series of observations represented in Fig. D is known to be affected with small systematic errors, and it will be noted that the distribution of the residuals is more irregular in this case than in any of the others. In each of the series represented in Figs. A and C, there are two residuals whose magnitudes are too great to be represented in the figures; and it is quite generally found that the actual number of very large residuals is slightly greater than the number given by the error curve.
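In modern notation the percentage curve of Fig. A can be evaluated directly; the code below is our own sketch, with h = 0.158 taken from the figure.

```python
# The percentage curve of Fig. A: y = (100*h/sqrt(pi)) * exp(-(h*x)^2).
import math

def curve_percent(x, h=0.158):
    return 100.0 * h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)

# maximum ordinate at x = 0 is about 8.9 per cent, agreeing with the
# largest entries (8.9) in the table of observed percentages
print(round(curve_percent(0.0), 1))
```

The maximum ordinate 100h/√π and the rapid fall-off for large x are the two features the text appeals to in the next sections.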
The illustrations here given are typical cases, and may serve to exemplify the statement made at the beginning of this section, that the actual distribution of residuals is found to follow Gauss's law of error, and in the following sections this law will be assumed as experimentally demonstrated, and from it will be derived the method of combining and discussing observations. The student will find it an instructive exercise to treat in a manner similar to that pursued above any series of observations to which he may have access, particularly his own observations, and thus lend additional weight to the experimental evidence which is here presented for his consideration.

§ 4. The Error Curve. From the manner in which the ordinates of the points plotted in Figs. A, B, C, and D were derived, it will be apparent that these ordinates represent the number of residuals falling within certain chosen limits of error. Thus in Fig. A, 8.9 per cent of all the residuals lie between the limits 0 and +1, 8.2 per cent between +1 and +2, etc., the interval within which the residuals are enumerated being in every case one unit. It is also evident that the number of residuals falling within any other interval, Δx, will depend upon the magnitude of this interval as well as upon the ordinate corresponding to it, and if Δx is taken sufficiently small the number of residuals will be proportional to the product y Δx. Geometrically considered, this product is the area included between the axis of x, the curve, and the two ordinates drawn through the extremities of Δx, and the number of residuals falling within the limits of Δx is therefore proportional to this area.
We may, if we choose, make Δx an infinitesimal, and the area y Δx and the corresponding number of residuals will then become indefinitely small; but by taking the sum of all the infinitesimal areas included between the limits x = a and x = b, where a and b have any values whatever, we obtain the area of that part of the curve included between ordinates drawn at these limits. By a similar process of summation we obtain the number of residuals lying between a and b, and the number of residuals thus found must be proportional to the area, since this proportionality is true in every infinitesimal element included in the area. In the following table, the function, A, represents the area of that part of the error curve included between ordinates whose abscissas are 0 and x, the argument of the table being the values of x for the particular error curve in whose equation h = 1; but the area included between 0 and x in the curve corresponding to any other value of h may be found from the same table, by using as the argument hx instead of x. The area of that part of the curve lying between the limits a and b is represented by

    A = ∫[a to b] y dx = (h / √π) ∫[a to b] e^(-h²x²) dx        (3)

Let the variable in this expression be changed by putting hx = t, and the expression becomes

    A = (1 / √π) ∫[ha to hb] e^(-t²) dt

These expressions for A become identical if h = 1; hence, if the value of A be computed from the second integral for h = 1 and tabulated, we may find from this table the value of A corresponding to any other value of h by changing the limits a, b into ha and hb. A remarkable property of the curve, which will be of use hereafter, may be readily obtained from the expression here found for A. If we make a = 0 and b = ∞, the limits of the integral become 0 and ∞ for all values of h; hence the area of that part of the curve included between x = 0 and x = ∞ is the same for all values of h, i.e., for every series of observations.
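The invariance just stated is easy to verify numerically; the following sketch (modern code, not from the text) integrates the error curve from 0 outward for two quite different values of h.

```python
# Midpoint-rule check that the area under y = (h/sqrt(pi)) * exp(-h^2 x^2)
# between x = 0 and x = infinity is 1/2 for every h.
import math

def half_area(h, upper=60.0, steps=120000):
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx                     # midpoint of the strip
        total += h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2) * dx
    return total

print(round(half_area(0.5), 3), round(half_area(2.0), 3))
```

Both integrals come out at one half, so the whole curve (from -∞ to +∞) always has unit area, as the table below also shows.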
TABLE OF AREAS OF THE ERROR CURVE BETWEEN THE LIMITS 0 AND hx.

    hx     A     Diff.      hx     A     Diff.      hx      A       Diff.
    0.0  0.000    56        1.0  0.421    19        2.0  0.49766     85
    0.1   .056    55        1.1   .440    15        2.1   .49851     56
    0.2   .111    53        1.2   .455    12        2.2   .49907     36
    0.3   .164    50        1.3   .467     9        2.3   .49943     23
    0.4   .214    46        1.4   .476     7        2.4   .49966     14
    0.5   .260    42        1.5   .483     5        2.5   .49980      9
    0.6   .302    37        1.6   .488     4        2.6   .49989      4
    0.7   .339    32        1.7   .492     3        2.7   .49993      3
    0.8   .371    27        1.8   .495     1        2.8   .49996      2
    0.9   .398    23        1.9   .496     2        2.9   .49998      1
    1.0   .421              2.0   .498              3.0   .49999
                                                     ∞    .50000

If in any series of observations n]ab denote the number of residuals whose magnitudes are included between the limits a and b, n the whole number of residuals in the series, and Aa, Ab the values of A obtained from the table with the arguments ha, hb, then

    n]ab = n (Ab ∓ Aa)

since the ratio of n]ab to n is equal to the ratio of the area of that part of the error curve which lies between the limits a and b, to the area of the whole curve, and this latter area is seen from the table to be always unity. The − sign in this equation is to be used when a and b have like signs, and the + when they have unlike signs. If the percentage of residuals between the limits a and b is required, it may be found by substituting 100 in place of n as the coefficient of (Ab ∓ Aa). Thus from Fig. A we find for the series of observations there represented, h² = 0.025 and h = 0.158. To find the distribution of residuals between the limits -∞ ... -5, -5 ... -2, -2 ... +1, +1 ... +4, +4 ... +∞, we proceed as follows:

     b      hb       Ab     Ab ∓ Aa   n]ab   Per cent.   Obs.      n = 66
    -∞      -∞     0.500
    -5    -0.790   0.368     0.132      9       13.2        9
    -2    -0.316   0.172     0.196     13       19.6       11
    +1    +0.158   0.088     0.260     17       26.0       17
    +4    +0.632   0.314     0.226     15       22.6       13
    +∞      +∞     0.500     0.186     12       18.6       16
                             -----     --      -----       --
                             1.000     66      100.0       66

The numbers in the column "Per cent" may be compared with the percentages given on page 7. The column "Obs."
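In modern libraries the tabulated area is available directly, since A = erf(hx)/2. The sketch below (our own naming, not the author's) reproduces an expected count of the kind computed above.

```python
# A(hx) = erf(h*x)/2; the expected number of residuals between limits
# a and b is n*(A_b - A_a) when a, b have like signs, and n*(A_b + A_a)
# when the signs are unlike.
import math

def A(hx):
    return math.erf(hx) / 2.0

def expected(n, h, a, b):
    Aa, Ab = A(h * abs(a)), A(h * abs(b))
    if a * b >= 0:                    # like signs: subtract the areas
        return n * abs(Ab - Aa)
    return n * (Ab + Aa)              # unlike signs: add them

# the Fig. A series: n = 66, h = 0.158, limits -2 to +1
print(round(expected(66, 0.158, -2.0, 1.0), 1))  # → 17.2
```

This agrees with the 17 residuals tabulated for those limits, to the accuracy of the three-place table.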
gives the actual number of residuals which occur in the given series between the limits here considered, and these numbers should be compared with the column "n]ab". By the use of this table, the distribution of residuals in any series of observations for which the value of h is known may be compared with the theoretical distribution much more readily than by plotting a curve, and the student should in this way examine several series of observations. The method of determining h for any given series is contained in § 12.

§ 5. The Principle of Least Squares. The quantity h which appears in the equation of the error curve deserves especial attention. If in the equation

    y = (h / √π) e^(-h²x²)

x be put equal to zero, the resulting value of y is h / √π. This is the maximum ordinate of the curve, and the value of this maximum ordinate varies directly as h. If those parts of the curve remote from the axis of y be considered, it will be found that the larger is h, the smaller are the values of y, since when x is a large quantity e^(-h²x²) diminishes much more rapidly for increasing values of h than h itself increases. These relations between y and h correspond exactly to the criteria by which we estimate the precision of observations. If we compare two series of observations, I. and II., and find that in series I. the small errors are relatively more numerous (large values of y for small x's), and the large errors less numerous (small values of y for large x's), than in series II., we shall without hesitation call the observations of series I. more precise or accurate than those of series II.; and if required to assign definite meanings to the terms "more precise" and "less precise," we shall find difficulty in defining them in any other manner than by reference to the magnitude of the residuals.
We therefore adopt as the measure of precision of any series of observations the value of h in the equation of its error curve; and having thus defined the term "precision," we are able to state two principles which are of general application in the discussion of observations. Let the data furnished by each observation be expressed in the form of an observation equation (Equations 1, § 1); then: the best attainable values of the unknown quantities are those which,

(1) Distribute the residuals in accordance with the law of error,

    y = (h / √π) e^(-h²x²)

and which,

(2) Make the value of h in the equation of the resulting error curve a maximum.

The first of these principles is indeed involved in the second, since if the residuals are not distributed in accordance with the law

    y = (h / √π) e^(-h²x²)

there can be no value of h to be made a maximum. It is, however, advantageous to state (1) as a separate principle, since it affords a test of the presence of systematic errors in the data, which, though far from being a perfect criterion, is often convenient and is sometimes the only test available. To justify the statement of (2), we resort to the following considerations: In accordance with (A), § 1, we suppose that all of the data available is contained in the observation equations, and, (B), § 1, we seek to satisfy all of these equations as nearly as possible. If the observations are free from systematic error, a supposition which must here be made, since we have no means of taking into account the effect of such errors, we may obtain by substituting in the observation equations any set of values which approximately satisfy them, a corresponding set of residuals which will be the errors of the observations, on the supposition that the substituted values were the true values of the unknown quantities.
If these residuals are plotted in an error curve, they will furnish a numerical measure of the precision, h, assigned to the observations by this set of values, and out of all possible sets of values of the unknown quantities that set which assigns the maximum precision to the observations will be entitled to the greatest degree of confidence; for if it were otherwise, we should have no reason for preferring a set of values which exactly satisfied all of the equations to a set which did not satisfy them. It is, of course, true that subsequent observations may furnish a better determination of the unknowns, and that the values thus found will not assign to the earlier observations as high a degree of precision as did the erroneous values obtained from these observations alone, but this subsequent determination is based upon additional evidence, and the problem with which we are concerned is not to obtain the best possible values of the unknown quantities, but the best values which can be derived from the data in our possession. Assuming, then, the validity of (2), we proceed to transform it into an expression more convenient for practical use, and for this purpose we resort to the following property of the error curve, which may be approximately verified by actual measurement from any plotted curves, Figs. A, B, C, D. If the error curve be divided into a great number of parts by drawing equidistant ordinates throughout its whole extent, and the areas of the several parts into which the curve is thus divided be each multiplied by the square of the abscissa of its middle point, the sum of all these products will equal 1 / (2h²). The analytical expression for the process above described is

    ∫[-∞ to +∞] x² y dx   or   (h / √π) ∫[-∞ to +∞] x² e^(-h²x²) dx

Put hx = t, and this integral becomes

    (1 / (h²√π)) ∫[-∞ to +∞] t² e^(-t²) dt = 1 / (2h²)

For the method of obtaining the value of the last integral, see Newcomb's Calculus, Articles 169, 176. The area of each of the parts into which the curve was divided is proportional to the number of residuals occurring between the limiting ordinates of the part; thus, let A denote the area of the part, N the corresponding number of residuals, and n and a the whole number of residuals and the whole area of the curve respectively; then

    A : N :: a : n

but from the table in § 4, a = 1, whence

    A = N / n   and   A x² = N x² / n
Since N denotes the number of x's falling within the given infinitesimal part, A, of the curve, N x² is equal to the sum of the squares of the x's (residuals) whose magnitudes fall between the limiting ordinates of A, and taking the sum of all the A x²'s we obtain

    Σ A x² = (1 / n) Σ x²

i.e., the sum is equal to the mean of the squares of all the residuals. It is customary to represent the sum of the squares of the residuals by the symbol [vv], v standing for any residual, and the [ ] denoting the sum of all quantities of the kind written within them. Comparing this result with the one obtained above, we have

    [vv] / n = 1 / (2h²)        (4)

from which it appears that the relation between h and the sum of the squares of the residuals is such that when h is a maximum, [vv] is a minimum, and principle (2) may be restated as follows: The most probable values of the unknown quantities are those which make the sum of the squares of the residuals a minimum. From this principle has been derived the name Method of Least Squares, which is commonly applied to that body of principles which treats of the combination and discussion of observed data. We have arrived at this principle from a consideration of that class of cases in which the quantity observed is a function of two or more unknown quantities whose values are to be obtained from the observations. This obviously includes the case of a single quantity, x, whose value is directly measured; and it will be advantageous to apply the principle of least squares to this case.
The observation equations are here of the simplest possible form,

    x = m1
    x = m2
    x = m3
    etc.

where m denotes an observed value of x. If x0 denote any assumed value of x, the residuals obtained by substituting it in these equations will be

    v1 = m1 - x0
    v2 = m2 - x0
    v3 = m3 - x0
    ......
    vn = mn - x0

and

    [vv] = (m1 - x0)² + (m2 - x0)² + (m3 - x0)² + ... + (mn - x0)²

The value of x0 which will make [vv] a minimum is found from

    d[vv] / dx0 = 0 = -2 (m1 - x0) - 2 (m2 - x0) - 2 (m3 - x0) ... - 2 (mn - x0)

but this equation is equivalent to

    x0 = (m1 + m2 + m3 + ... + mn) / n = [m] / n

and it thus appears that the universal practice of taking the arithmetical mean of all the measures of a single quantity as the best value of that quantity, is a particular case under the more general method of least squares.

§ 6. Weights. It frequently happens that the circumstances under which an observation was made lead the observer to distrust its accuracy, while other causes give him increased confidence in another observation. Observations which thus differ in quality are said to have different weights, the weight being a numerical measure of the quality, and these weights should be taken into account in combining the observations. Let us suppose two series of observations made upon the same unknown quantity, in one of which the observations are of different quality and entitled to different degrees of confidence, while in the other the observations are all equally good, but each of them entitled to less confidence than the poorest observation of the first series. By taking the mean of a number of observations of this second series, a more reliable value of the unknown quantity may be obtained than any single observation of the series can furnish, and by properly choosing the number of observations to be included in the mean, a value entitled to as much confidence as any observation of the first series may be found.
The number of observations of the second series whose mean is entitled to as much confidence as a single observation of the first series is called the weight of the equivalent observation in the first series; and, obviously, the better an observation, the greater is its weight. These weights furnish no information about the absolute precision of the observations, but express only their relative excellence as compared with each other; hence, if p1, p2, p3, etc., be the weights of any observations, kp1, kp2, kp3, etc., where k is any constant, will express these weights equally well, since it is the ratios of the weights, and not their absolute values, which are of importance. To exhibit the manner in which these weights are to be employed, let us recur to the data of § 1, and suppose that those observations were made under such conditions that the first one has a weight 1, the second 2, the third 3, and the fourth 4. In accordance with the definition of weights, this is equivalent to supposing a second series of observations of uniform excellence, such that the first of the actual observations can be replaced by one observation of this series, which must of course be numerically the same as the observation which it replaces; the second real observation may be replaced by two numerically equal observations of the second series; the third by three, etc. Each of these substituted observations will furnish an equation precisely like those given in § 1, and when the sum of the squares of the residuals is formed, we shall obtain

v1^2 + (v2^2 + v2^2) + (v3^2 + v3^2 + v3^2) + (v4^2 + v4^2 + v4^2 + v4^2) = [pvv]

The symbol [pvv], which is adopted as an abbreviation for this expression, is equivalent, numerically, to the sum obtained by multiplying the square of each actual residual by the weight of the corresponding observation and adding the products, and it is evident that this [pvv] bears the same relation to the
substituted observations that [vv] bore to the actual observations in the case of equal weights, which was considered in the preceding section. The principle there obtained may therefore be generalized as follows: The most probable values of the unknown quantities are those which make the sum of the weighted squares of the residuals, [pvv], a minimum. Let the student show, as in the preceding section, that when this principle is applied to the case of observations of unequal weight made upon a single unknown quantity, it gives as the most probable value of that quantity

x0 = [pm]/[p]

As an example of the application of weights, we select the following observations of the time of ending of the transit of Mercury of May 6, 1878, which were observed by different observers in the city of Washington. These observers were provided with telescopes of different sizes and magnifying powers, and differed among themselves in point of experience and skill, so that their observed times of last contact are not entitled to equal confidence. The weights assigned to the several observations represent the judgment of the computer with respect to their relative excellence. (Washington Observations, 1876, Appendix II., page 55.)

Observed Time.     p     pm
5h 38m 23s         1     23
   37 55           0      0
   38 10           1     10
   38 26           3     78
   38 21           2     42
   38 18           2     36
   38 19           3     57
   38 21           2     42
   38 15           2     30

[p] = 16    [pm] = 318    [pm]/[p] = 19.9

Weighted mean of the observations = 5h 38m 19s.9.

§ 7. Normal Equations.

We have now to show how the principle of least squares is to be applied in determining the values of a set of unknown quantities, and in order to fix the ideas as definitely as possible, let it be supposed that there are three of these quantities, x, y, z, which are connected with each one of a set of observed quantities, n, by the relation

ax + by + cz = n

where a, b, and c are numerical coefficients whose values are supposed known in each equation.
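Before taking up the normal equations, the weighted mean just found can be checked mechanically; a short Python sketch using the tabulated seconds of each observed time (the observation of weight 0 contributes nothing to either sum):

```python
# Seconds of the observed times of last contact, with the assigned weights.
seconds = [23, 55, 10, 26, 21, 18, 19, 21, 15]
weights = [1, 0, 1, 3, 2, 2, 3, 2, 2]

pm = sum(p * m for p, m in zip(weights, seconds))   # [pm]
p_total = sum(weights)                              # [p]
x0 = pm / p_total                                   # weighted mean, in seconds
print(pm, p_total, round(x0, 1))  # 318 16 19.9
```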
From a series of more than three observed values of n, the most probable values of x, y, z are to be obtained by means of the relation [pvv] = a minimum. It is not to be presumed that these values when found will exactly satisfy all the equations and make [pvv] = 0, but we shall find from each equation a residual v, so that strictly the observation equations should be written

a1 x + b1 y + c1 z - n1 = v1
a2 x + b2 y + c2 z - n2 = v2    (1)
a3 x + b3 y + c3 z - n3 = v3
etc.          etc.          etc.

The symbols p1, p2, p3 represent the weights assigned to the observed values n1, n2, n3, etc. By the ordinary rule for determining a minimum of a function of several variables, the condition [pvv] = a minimum furnishes the three equations

d[pvv]/dx = 0,   d[pvv]/dy = 0,   d[pvv]/dz = 0

and in order to obtain these derivatives we form from the observation equations

[pvv] = (a1 √p1 x + b1 √p1 y + c1 √p1 z - √p1 n1)^2
      + (a2 √p2 x + b2 √p2 y + c2 √p2 z - √p2 n2)^2    (5)
      + (a3 √p3 x + b3 √p3 y + c3 √p3 z - √p3 n3)^2
      + etc.

The derivative of this expression with respect to x is

d[pvv]/dx = 2 a1 √p1 (a1 √p1 x + b1 √p1 y + c1 √p1 z - √p1 n1)
          + 2 a2 √p2 (a2 √p2 x + b2 √p2 y + c2 √p2 z - √p2 n2)    (6)
          + 2 a3 √p3 (a3 √p3 x + b3 √p3 y + c3 √p3 z - √p3 n3)
          + etc.  = 0

Let this expression be expanded, divided by 2, and simplified by the introduction of [ ] to denote the sum of all terms like those placed within them (all terms standing in the same vertical column), and it becomes the first of the following group of

NORMAL EQUATIONS.
[paa] x + [pab] y + [pac] z - [pan] = 0
[pab] x + [pbb] y + [pbc] z - [pbn] = 0    (7)
[pac] x + [pbc] y + [pcc] z - [pcn] = 0

The second and third of these equations are derived in precisely the same manner as the first from the conditions

d[pvv]/dy = 0,   d[pvv]/dz = 0

These equations are equal in number to the unknown quantities, and their solution will in general furnish a determinate set of values for these quantities, which will be the most probable values, since the normal equations include all of the data furnished by the observations and have been so derived as to satisfy the principle of least squares. Equations (6) furnish a rule which is frequently given for the formation of normal equations: To obtain the first normal equation, multiply each observation equation by the product of its weight into the coefficient of x which occurs in it, and take the sum of all the resulting equations. The other normals are similarly obtained from the weights and the coefficients of y, z, etc., due regard being had to the algebraic signs of the quantities in the several multiplications and additions. This method is occasionally convenient, but in general the method of forming normal equations given in § 9 will be found less laborious. The symmetrical manner in which the coefficients of the normal equations are disposed should be especially noted, since this considerably diminishes the labor of their formation. The first coefficient in the second equation is the same as the second coefficient in the first equation, and generally the mth coefficient in the nth equation is the same as the nth coefficient in the mth equation. Let the student form normal equations from the observation equations contained in § 1, assuming that those equations have equal weights.

§ 8. Non-Linear Observation Equations.
In all of the preceding investigation it has been tacitly assumed that the relation of the observed to the unknown quantities can be expressed by an equation of the first degree; but cases in which this relation is of a much more complicated character are not uncommon, and a method of applying the principle of least squares to these cases is required. For the sake of simplicity, this method will be derived for the case of two unknown quantities, but the process is perfectly general and can readily be extended to any other number of unknowns. Let x and y be any two quantities which have not been directly measured but which are connected with an observed quantity, m, by the relation

f(x, y, m) = 0

which represents any equation whatever existing between x, y, and m. Let x0 and y0 denote approximate values of x and y, such that

x = x0 + Δx    y = y0 + Δy

Δx and Δy being the corrections which must be added to x0 and y0 in order to obtain the most probable values of x and y. We may, for the present, suppose that x0 and y0 are mere guesses at the values of x and y, and we may test their correctness by substituting their numerical values in the equation f(x, y, m) = 0 which corresponds to each observed value of m. If every such equation were exactly satisfied by these values, we should infer that x0 and y0 were the most probable values of x and y. It cannot be expected that this perfect agreement will ever be found in practice, but from each observation equation a residual, v, will be found, due partly to the errors of the observations and partly to Δx and Δy. If in the equation f(x, y, m) = 0 we substitute for x and y the values x0 + Δx, y0 + Δy, and develop the expression by Taylor's Formula, remembering that f(x0, y0, m) is the residual found by substituting numerical values of x0, y0 in the several observation equations, we have

f(x0 + Δx, y0 + Δy, m) = v + (df/dx0) Δx + (df/dy0) Δy = 0

If numerical values of v, df/dx0, df/dy0 be introduced into this equation, it becomes

a Δx + b Δy + n = 0

Each observation equation may thus be made to furnish a linear equation involving Δx and Δy, and these equations may be treated by the method of § 7. It must, however, be remembered that in the above development by Taylor's Formula we have retained only the first three terms of an infinite series, and if the approximate values x0, y0 are not so nearly the most probable values that the squares and higher powers of Δx and Δy are inappreciable, the development and the solution based upon it are inaccurate. On this account, it is seldom advantageous to make a least square solution for the unknown quantities until very approximate values of them have been found. These values will usually be obtained from the solution of a small number of the observation equations. The transformation of the observation equations by the introduction of corrections to assumed values of the unknowns is often advantageous even when the original equations are of the first degree, especially if the original quantities were of very different magnitudes. Thus, in the problem of § 1, the observation equations are of the form

f(l0, c, m) = l0 + tc - m = 0

in which c is a very small quantity while l0 is approximately 1000.
If we put

l0 = 1000 + Δl0    c = 0 + Δc

we have

v = f(1000, 0, m) = 1000 - m

and the equations are transformed into

Δl0 + 20 Δc - 0.22 = 0
Δl0 + 40 Δc - 0.65 = 0
Δl0 + 50 Δc - 0.90 = 0
Δl0 + 60 Δc - 1.05 = 0

By this transformation the numerical operations involved in forming and solving the normal equations are much simplified through the substitution of small numbers in the place of large ones.

§ 9. Formation and Solution of the Normal Equations.

If the number of unknown quantities is greater than two, and especially if the number of observations is large, the numerical computation of a set of normal equations is a laborious process, and one in which errors are almost certain to occur unless special precautions are taken to guard against them. The method of forming these equations presented in this section has been developed with special reference to facilitating the numerical operations and obtaining the normals with the least expenditure of labor consistent with the requisite accuracy, and although some of the processes may seem at first sight unnecessary and cumbrous, a little experience in their use, or in their neglect, will convince the student that they are in the long run labor-saving devices. Let each observation equation be written out and arranged in tabular form, as in the following example. In order that these equations should furnish a good determination of the unknowns, x, y, z, it is necessary that the coefficients of these quantities should present a considerable range in their values in the several equations. Thus, if all the coefficients of x were alike, all the coefficients of y equal each to each, etc., the equations would be absolutely indeterminate, since we should have several unknown quantities involved in a single equation many times repeated; and if the coefficients approximate to this equality the equations will be approximately indeterminate, and will furnish unreliable values of the unknowns.
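The four transformed equations of § 8 can be carried through the whole process numerically; a sketch in Python (equal weights assumed, the weights of § 1 not being restated here) that forms the two normal equations of § 7 for the unknowns Δl0, Δc and solves them:

```python
# Observation equations: dl0 + t*dc - m = 0 for the four values of t below.
t = [20.0, 40.0, 50.0, 60.0]
m = [0.22, 0.65, 0.90, 1.05]

# Normal equations (the coefficients of dl0 are all unity):
#   [aa] dl0 + [ab] dc - [am] = 0
#   [ab] dl0 + [bb] dc - [bm] = 0
aa = float(len(t))
ab = sum(t)
bb = sum(ti * ti for ti in t)
am = sum(m)
bm = sum(ti * mi for ti, mi in zip(t, m))

det = aa * bb - ab * ab
dl0 = (bb * am - ab * bm) / det
dc = (aa * bm - ab * am) / det
print(round(dl0, 3), round(dc, 4))  # about -0.196 and 0.0212
```

The corrections imply l0 = 1000 + Δl0, about 999.8, and a small positive c; the point of the sketch is only the mechanism, which § 10 works out in full for a harder case.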
If, therefore, several observations have been made under similar conditions, and furnish equations which are nearly identical, these will be nearly equivalent to a repetition of the same equation, and it will be permissible to take their sum, having regard to their respective weights, and treat it as a single observation equation with a weight equal to the sum of the weights of the observations. Having thus reduced the number of equations as far as possible, each equation should be multiplied by the square root of its weight, as was done, § 7, in obtaining the form of the normal equations. By this multiplication the weights will be completely taken into account and will require no further attention. Let the weighted equations thus obtained be represented by

a1 x + b1 y + c1 z + n1 = 0
a2 x + b2 y + c2 z + n2 = 0
a3 x + b3 y + c3 z + n3 = 0
etc.          etc.          etc.

It will usually facilitate the formation of the normals to so transform these equations that no number greater than 1 shall occur in any of them. This can always be done by introducing new unknown quantities and dividing each equation by some constant number, usually some power of 10. Thus in the case of the two equations

5 x + 71 y - 63 = 0
0.9 x - 193 y + 93 = 0

let each equation be divided by 100 and put

x/20 = u    1.93 y = w

the equations are thus transformed into

1.000 u + 0.368 w - 0.630 = 0
0.180 u - 1.000 w + 0.930 = 0

The solution of these equations will furnish values of u and w from which x and y may be found by the relations

x = 20 u    y = w/1.93

The purpose of this transformation is to simplify the subsequent numerical work by reducing the numbers involved to an approximate equality. Every coefficient which appears in the normal equations is the sum of a series of products of two quantities; thus

[aa] = a1 a1 + a2 a2 + a3 a3 + ...
[ab] = a1 b1 + a2 b2 + a3 b3 + ...
[bn] = b1 n1 + b2 n2 + b3 n3 + ...

These products may be formed by the aid of Crelle's multiplication tables* supplemented by a table of squares of numbers for the [aa], [bb], etc. In case Crelle's tables are not available, the products may be formed by logarithms, or much more rapidly by the following method due to Bessel. Form for each equation the sums a + b, a + c, b + c, etc., for every pair of numbers contained in the equation; then since

ab = (1/2) [(a + b)^2 - aa - bb]

we have

[ab] = (1/2) { [(a + b)^2] - [aa] - [bb] }
[bc] = (1/2) { [(b + c)^2] - [bb] - [cc] }    (8)
etc.          etc.          etc.

The [aa], [bb], [cc] are coefficients in the normal equations and must be computed in any case, and the formation of [ab], [bc], etc., therefore requires for each coefficient only the single additional quantity [(a + b)^2], [(b + c)^2], and presents the very great advantage that these quantities can be obtained from a table of squares, and being all positive numbers no attention need be paid to the signs after the sums a + b, b + c, etc., have been formed. No method of computation can furnish a guaranty against the commission of numerical errors, and it is therefore desirable to test the computation from time to time to ascertain if such errors have occurred. To secure such a test or "check," as it is called, we introduce the following auxiliary quantities, one for each observation equation:

s1 = a1 + b1 + c1 + ... + n1
s2 = a2 + b2 + c2 + ... + n2    (9)
etc.          etc.          etc.

and form the quantities [as], [bs], ..., [ns]. It will appear from the mode in which the coefficients of the normal equations are formed that

[aa] + [ab] + [ac] + ... + [an] = [as]
[ab] + [bb] + [bc] + ... + [bn] = [bs]    (9a)
etc.          etc.          etc.

* Crelle, Rechentafeln, Berlin. These tables give the products of all numbers up to 1000 × 1000, and are of very general utility.
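Bessel's device (8) and the s-column check (9a) can both be verified mechanically; a Python sketch using the a, b, c, n columns of the worked example of § 10 below:

```python
# Columns a, b, c, n of the seven homogeneous weighted observation equations
# of section 10, and the check column s = a + b + c + n.
a = [0.796, 0.839, 0.777, 0.895, 1.000, 0.975, 0.938]
b = [-1.000, -0.733, -0.482, -0.403, -0.327, -0.236, -0.167]
c = [0.000, 0.563, 0.741, 0.931, 1.000, 0.903, 0.768]
n = [0.530, 0.320, 0.210, 0.280, 0.420, 0.380, -0.020]
s = [ai + bi + ci + ni for ai, bi, ci, ni in zip(a, b, c, n)]

def dot(u, v):
    """Gauss's bracket sum, e.g. dot(a, b) = [ab]."""
    return sum(x * y for x, y in zip(u, v))

# Bessel's relation (8): [ab] from a table of squares alone.
a_plus_b = [x + y for x, y in zip(a, b)]
ab_squares = (dot(a_plus_b, a_plus_b) - dot(a, a) - dot(b, b)) / 2.0
assert abs(ab_squares - dot(a, b)) < 1e-9

# Check relation (9a): [aa] + [ab] + [ac] + [an] = [as].
lhs = dot(a, a) + dot(a, b) + dot(a, c) + dot(a, n)
assert abs(lhs - dot(a, s)) < 1e-9

print(round(dot(a, b), 3), round(dot(a, s), 3))
```

The printed values differ from the book's -2.861 and +9.071 only in the last place, the book having worked from squares rounded to three decimals.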
The [as], [bs], etc., are formed in precisely the same manner as [ab], [ac], etc., and the check relations above given must be satisfied by the computed values of these quantities. Where only two unknown quantities are involved in the normal equations, the solution of the equations may be conveniently made by any of the methods of elementary algebra; but if the number of unknowns is greater than two, the simple and elegant method of successive substitutions proposed by Gauss may be employed with advantage. The normal equations in the case of three unknown quantities are:

[aa] x + [ab] y + [ac] z + [an] = 0
[ab] x + [bb] y + [bc] z + [bn] = 0    (10)
[ac] x + [bc] y + [cc] z + [cn] = 0

and from the first of these

x = - ([ab]/[aa]) y - ([ac]/[aa]) z - [an]/[aa]

This value of x substituted in the second and third equations transforms them into

[bb.1] y + [bc.1] z + [bn.1] = 0    (11)
[bc.1] y + [cc.1] z + [cn.1] = 0

in which

[bb.1] = [bb] - ([ab]/[aa]) [ab]        [cc.1] = [cc] - ([ac]/[aa]) [ac]
[bc.1] = [bc] - ([ab]/[aa]) [ac]                                          (12)
[bn.1] = [bn] - ([ab]/[aa]) [an]        [cn.1] = [cn] - ([ac]/[aa]) [an]

These equations constitute a new set of normals, from which one unknown quantity has been eliminated. The correctness of the numerical work of this elimination may be tested by a continuation of the checks used in forming the original normals. We introduce an auxiliary quantity

[bs.1] = [bb.1] + [bc.1] + [bn.1]

and inquire its relation to [as], [bs], etc. If we substitute in the expression for [bs.1] the values of [bb.1], [bc.1], [bn.1] in terms of the original coefficients, having regard to the relations

[aa] + [ab] + [ac] + [an] = [as]    (13)
[ab] + [bb] + [bc] + [bn] = [bs]

we find

[bs.1] = [bs] - ([ab]/[aa]) [as]    (14)

and similarly

[cs.1] = [cs] - ([ac]/[aa]) [as]

We may therefore obtain a complete check upon the accuracy of the numerical work involved in the elimination of x, by
forming the quantities [bs.1], [cs.1] in the same manner as [bb.1], [bc.1], [bn.1], etc., and comparing the actual sums of these latter quantities with the computed check quantities. By a repetition of the process of elimination we obtain

[cc.2] z + [cn.2] = 0    Check: [cs.2]

where

[cc.2] = [cc.1] - ([bc.1]/[bb.1]) [bc.1]
[cn.2] = [cn.1] - ([bc.1]/[bb.1]) [bn.1]    (15)
[cs.2] = [cs.1] - ([bc.1]/[bb.1]) [bs.1]
[cs.2] = [cc.2] + [cn.2]

and we are enabled to write the following equivalents for the original normal equations.

ELIMINATION EQUATIONS.

x + ([ab]/[aa]) y + ([ac]/[aa]) z + [an]/[aa] = 0
y + ([bc.1]/[bb.1]) z + [bn.1]/[bb.1] = 0    (16)
z + [cn.2]/[cc.2] = 0

The last of these equations gives the value of z directly, the second furnishes y as soon as z is known, and the first gives the value of x. The whole solution is therefore reduced to finding the values of the coefficients and absolute terms in these elimination equations. A convenient arrangement of the computation by which these quantities are obtained is given in the following example, in which the actual computation is exhibited together with a schedule correspondingly arranged showing the analytical equivalent of each number contained in the computation. In making the multiplications of [ab], [ac], [an], [as] by the constant factor [ab]/[aa], the logarithm of this factor is written on the edge of a slip of paper, and being held successively adjacent to the logarithms of [ab], [ac], [an], [as], the sum of the two logarithms is taken mentally, the corresponding number looked out from a logarithmic table and written in its proper place under [bb], [bc], [bn], [bs]; a subtraction then gives the value of [bb.1], [bc.1], [bn.1], [bs.1], and a similar process is followed for each other derived coefficient.

§ 10. Example.
To illustrate the principles contained in the preceding sections, and to exhibit in detail the process of deriving the most probable values of several unknown quantities which are connected with the observed quantity by a rather complicated relation, we select from Vol. III., Part 1, of the Memoirs of the National Academy of Sciences, page 58, the following series of experiments made with a 10-gauge Colt gun, loaded with uniform charges of four drams of powder and 1J ounces of shot, the shot ranging in fineness from No. 10 up to No. 1 Buck. The purpose of the experiments was to determine the relation existing between the size (fineness) of the shot and its average velocity over a range of 30 yards. The following table contains the results of the experiments, each velocity being the mean result of from three to six discharges of the gun. The weight of a pellet of No. 10 shot is taken as the unit of weight, and the velocities are expressed in feet per second.

Size.          Weight.   Observed Velocity.
No. 10            1           848
8                 2           920
6                 4           966
3                 8           989
BB.              16          1000
FF.              32          1017
No. 1 Buck       64          1067

By plotting these results in a curve with the weights of the shot as abscissas and the observed velocities as ordinates, the experimenter reached the conclusion that the relation between the weight W and the velocity V is expressed by an equation of the form

V/l = sec^-1 (W^n / m)

in which l, m, n are constants whose values are to be determined from the observations. It will be found upon trial that l0 = 700, m0 = 0.28, n0 = 0.42, in connection with the observed values of V and W, will approximately satisfy this equation, and we therefore adopt these approximate values and proceed (§ 8) to determine the corrections Δl, Δm, Δn, which when added to l0, m0, n0 will furnish the most probable values of l, m, n.
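The trial constants can be tested at once. A Python sketch, assuming the relation V = l sec^-1(W^n / m) as printed above, with sec^-1 evaluated as the arccosine of the reciprocal; it reproduces, to within a unit (the book working with 4-place logarithms), the residuals quoted later in this section for both the trial constants and the adjusted ones:

```python
import math

def residuals(l, m, n):
    """Observed minus computed velocity for each size of shot."""
    W = [1, 2, 4, 8, 16, 32, 64]                 # pellet weights (No. 10 = 1)
    V = [848, 920, 966, 989, 1000, 1017, 1067]   # observed velocities, ft/s
    # computed V = l * arcsec(W^n / m), the arc being in parts of the radius
    return [v - l * math.acos(m / w ** n) for w, v in zip(W, V)]

print([round(r) for r in residuals(700.0, 0.28, 0.42)])      # trial constants
print([round(r) for r in residuals(698.2, 0.3402, 0.3653)])  # adjusted values
```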
The several differential coefficients of the observation equation f(l, m, n, V) = 0 are

dV/dl0 = V/l0
dV/dm0 = - (l0/m0) cot (V/l0)
dV/dn0 = (l0/M) log W cot (V/l0)

in which M denotes the modulus of the common system of logarithms, M = 0.43429. In the factor cot (V/l0), V/l0 is the ratio of two numbers, and must be construed as representing a certain arc expressed in parts of the radius; the corresponding arc expressed in degrees is 57°.29578 (V/l0). The form of the observation equation with which we are here concerned is

(V/l0) Δl - (l0/m0) cot (V/l0) Δm + (l0/M) log W cot (V/l0) Δn + (l0 sec^-1 (W^n0 / m0) - V) = 0

and introducing into this equation the numerical values of l0, m0, n0, W, V, M, we find the following

OBSERVATION EQUATIONS.

(1)  1.29 Δl - 729 Δm +   0 Δn + 53 = 0    p = 1.0
(2)  1.36     - 535   + 104    + 32 = 0        1.0
(3)  1.41     - 396   + 154    + 24 = 0        0.8
(4)  1.45     - 294   + 172    + 28 = 0        1.0
(5)  1.48     - 219   + 170    + 38 = 0        1.2
(6)  1.50     - 164   + 159    + 37 = 0        1.1
(7)  1.52     - 122   + 142    -  2 = 0        1.0

The absolute terms of these equations are residuals obtained by substituting in the original equation

V/l - sec^-1 (W^n / m) = 0

the assumed values of l0, m0, and n0, and the smallness of these residuals compared with the values of V shows that the assumed quantities are approximately correct values of l, m, n. The memoir from which our data are taken contains no indication of the weights to be assigned to the several determinations of V, and in the absence of such information they should all be treated as equally precise and given the weight 1; but for the sake of illustration a slightly different set of weights, indicated above by p, has been assigned to them, and by multiplying each equation by the square root of its weight we obtain the following

WEIGHTED OBSERVATION EQUATIONS.

1.29 Δl - 729 Δm +   0 Δn + 53 = 0
1.36     - 535   + 104    + 32 = 0
1.26     - 352   + 137    + 21 = 0
1.45     - 294   + 172    + 28 = 0
1.62     - 239   + 185    + 42 = 0
1.58     - 172   + 167    + 38 = 0
1.52     - 122   + 142    -  2 = 0
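The tabulated coefficients can be spot-checked against the differential coefficients just derived; a Python sketch for equations (1) and (5), again assuming the reconstructed relation V/l = sec^-1(W^n / m):

```python
import math

l0, m0, n0 = 700.0, 0.28, 0.42
M = 0.43429  # modulus of common logarithms

def equation(W, V_obs):
    """Coefficients of dl, dm, dn and the absolute term for one observation."""
    theta = math.acos(m0 / W ** n0)          # V0/l0, the arc in radian measure
    cot = 1.0 / math.tan(theta)
    a = theta                                # dV/dl0 = V/l0
    b = -(l0 / m0) * cot                     # dV/dm0
    c = (l0 / M) * math.log10(W) * cot       # dV/dn0
    term = l0 * theta - V_obs                # computed minus observed velocity
    return a, b, c, term

print([round(v, 2) for v in equation(1, 848)])    # nearly (1.29, -729, 0, 53)
print([round(v, 2) for v in equation(16, 1000)])  # nearly (1.48, -219, 170, 38)
```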
The coefficients and absolute terms in these equations are of very different magnitudes, and to simplify the subsequent numerical work we divide each equation through by 100 and put

0.0162 Δl = x    7.29 Δm = y    1.85 Δn = z

and introduce x, y, z into the equations in place of Δl, Δm, Δn. This step, which is frequently called rendering the equations homogeneous, furnishes the following

HOMOGENEOUS WEIGHTED OBSERVATION EQUATIONS.

0.796 x - 1.000 y + 0.000 z + 0.530 = 0    s = + 0.326
0.839   - 0.733   + 0.563   + 0.320 = 0        0.989
0.777   - 0.482   + 0.741   + 0.210 = 0        1.246
0.895   - 0.403   + 0.931   + 0.280 = 0        1.703
1.000   - 0.327   + 1.000   + 0.420 = 0        2.093
0.975   - 0.236   + 0.903   + 0.380 = 0        2.022
0.938   - 0.167   + 0.768   - 0.020 = 0        1.519

The values of s = a + b + c + n, which are to be used as a check in the formation of the normal equations, are derived from these equations. The formation of the coefficients of the normal equations by the use of a table of squares, Bessel's method, is represented in the following tables:

SUMS OF THE COEFFICIENTS.

Equation    a+b      a+c     a+n     a+s      b+c      b+n      b+s     c+n     c+s
   1      -0.204   0.796   1.326   1.122   -1.000   -0.470   -0.674   0.530   0.326
   2       0.106   1.402   1.159   1.828   -0.170   -0.413    0.256   0.883   1.552
   3       0.295   1.518   0.987   2.023    0.259   -0.272    0.764   0.951   1.987
   4       0.492   1.826   1.175   2.598    0.528   -0.123    1.300   1.211   2.634
   5       0.673   2.000   1.420   3.093    0.673    0.093    1.766   1.420   3.093
   6       0.739   1.878   1.355   2.997    0.667    0.144    1.786   1.283   2.925
   7       0.771   1.706   0.918   2.457    0.601   -0.187    1.352   0.748   2.287
SQUARES.

Equation   a^2   (a+b)^2  (a+c)^2  (a+n)^2  (a+s)^2   b^2   (b+c)^2  (b+n)^2  (b+s)^2   c^2   (c+n)^2  (c+s)^2   n^2     s^2
   1      0.634   0.042    0.634    1.758    1.259   1.000   1.000    0.221    0.454   0.000   0.281    0.106   0.281   0.106
   2      0.704   0.011    1.966    1.343    3.342   0.537   0.029    0.171    0.066   0.317   0.780    2.409   0.102   0.978
   3      0.604   0.087    2.304    0.974    4.094   0.232   0.067    0.074    0.584   0.549   0.904    3.948   0.044   1.552
   4      0.801   0.242    3.334    1.381    6.750   0.162   0.279    0.015    1.690   0.867   1.466    6.938   0.078   2.900
   5      1.000   0.453    4.000    2.016    9.567   0.107   0.453    0.009    3.119   1.000   2.022    9.567   0.176   4.381
   6      0.951   0.546    3.527    1.836    8.982   0.056   0.445    0.021    3.190   0.815   1.646    8.556   0.144   4.088
   7      0.880   0.594    2.910    0.843    6.037   0.028   0.361    0.035    1.828   0.590   0.560    5.230   0.000   2.307

Sums      5.576   1.975   18.675   10.151   40.031   2.122   2.634    0.546   10.931   4.138   7.659   36.754   0.825  16.312

the sums of the a^2, b^2, c^2, n^2, and s^2 columns being [aa], [bb], [cc], [nn], and [ss]. From the sums of the squares contained in the several columns of this table the coefficients [ab], [ac], etc., are computed at the foot of the columns by the relations [ab] = (1/2) { [(a+b)^2] - ([aa] + [bb]) }, etc.; thus

[ab] = (1/2) (1.975 - 5.576 - 2.122) = - 2.861      [bc] = (1/2) (2.634 - 2.122 - 4.138) = - 1.813
[ac] = (1/2) (18.675 - 5.576 - 4.138) = + 4.480     [bn] = (1/2) (0.546 - 2.122 - 0.825) = - 1.200
[an] = (1/2) (10.151 - 5.576 - 0.825) = + 1.875     [bs] = (1/2) (10.931 - 2.122 - 16.312) = - 3.751
[as] = (1/2) (40.031 - 5.576 - 16.312) = + 9.071    [cn] = (1/2) (7.659 - 4.138 - 0.825) = + 1.348
                                                    [cs] = (1/2) (36.754 - 4.138 - 16.312) = + 8.152

The check quantity [as] is compared with [aa] + [ab] + [ac] + [an] = + 9.070, whose value is written immediately under [as], and which must agree with [as] within two or three units of the last decimal place; similarly [ab] + [bb] + [bc] + [bn] = - 3.752 is compared with [bs] = - 3.751, and [ac] + [bc] + [cc] + [cn] = + 8.153 with [cs] = + 8.152. Every coefficient of the normal equations enters into one or more of these sums, which therefore furnish a complete test of the accuracy of the work in passing from the homogeneous observation equations to the normal equations. We now write the

NORMAL EQUATIONS.
+ 5.576 x - 2.861 y + 4.480 z + 1.875 = 0
- 2.861 x + 2.122 y - 1.813 z - 1.200 = 0
+ 4.480 x - 1.813 y + 4.138 z + 1.348 = 0

It may be seen from an inspection of these equations that the data upon which they are based will not furnish a good determination of the values of all the unknowns; for if the first equation be divided by - 2 the quotient will be very like the second equation, and if it be multiplied by a suitable fraction the product will be very like the third equation. We proceed, however, with the solution by Gauss's method, which will furnish the best results that the data can be made to yield.

SOLUTION OF THE NORMAL EQUATIONS.

[aa]      + 5.576    [ab]      - 2.861    [ac]      + 4.480    [an]      + 1.875    [as]      + 9.070
log [aa]    0.7464   log [ab]    0.4566 n log [ac]    0.6513   log [an]    0.2730   log [as]    0.9576

log ([ab]/[aa]) = 0.4566 n - 0.7464 = 9.7102 n        log ([ac]/[aa]) = 0.6513 - 0.7464 = 9.9049

[bb]             + 2.122   [bc]             - 1.813   [bn]             - 1.200   [bs]             - 3.752
([ab]/[aa])[ab]  + 1.468   ([ab]/[aa])[ac]  - 2.299   ([ab]/[aa])[an]  - 0.962   ([ab]/[aa])[as]  - 4.654
[bb.1]           + 0.654   [bc.1]           + 0.486   [bn.1]           - 0.238   [bs.1]           + 0.902
log [bb.1]  9.8156         log [bc.1]  9.6866         log [bn.1]  9.3766 n       Check sum        + 0.902

[cc]             + 4.138   [cn]             + 1.348   [cs]             + 8.153
([ac]/[aa])[ac]  + 3.599   ([ac]/[aa])[an]  + 1.507   ([ac]/[aa])[as]  + 7.286
[cc.1]           + 0.539   [cn.1]           - 0.159   [cs.1]           + 0.867
                                                      Check sum        + 0.866

log ([bc.1]/[bb.1]) = 9.6866 - 9.8156 = 9.8710

([bc.1]/[bb.1])[bc.1]  + 0.361   ([bc.1]/[bb.1])[bn.1]  - 0.177   ([bc.1]/[bb.1])[bs.1]  + 0.670
[cc.2]                 + 0.178   [cn.2]                 + 0.018   [cs.2]                 + 0.197
log [cc.2]  9.2504               log [cn.2]  8.2553               Check sum              + 0.196

log ([cn.2]/[cc.2]) = 8.2553 - 9.2504 = 9.0049

The course of the computation after the formation of the elimination equations is sufficiently indicated above.

ELIMINATION EQUATIONS.
x - 0.513 y + 0.803 z + 0.336 = 0
y + 0.743 z - 0.364 = 0
z + 0.101 = 0

whence

z = - 0.101    y = + 0.439    x = - 0.030

and

log x       8.4771 n       log y      9.6425        log z      9.0049 n
log 0.0162  8.2095         log 7.29   0.8627        log 1.85   0.2672
Δl   - 1.8                 Δm  + 0.0602             Δn  - 0.0547
l0  + 700.0                m0  + 0.2800             n0  + 0.4200
l   + 698.2                m   + 0.3402             n   + 0.3653

If with the values of l, m, n thus obtained the corresponding velocities be computed by means of the original equation

V = l sec^-1 (W^n / m)

the resulting residuals should be smaller than those derived from the substitution of l0, m0, n0, i.e., the absolute terms of the observation equations. The following comparison of these residuals shows a much better representation of the observed values of V, especially if the sums of the squares, [vv], be compared.

Observed - Computed V.

Weight of Shot:     1     2     4     8    16    32    64
f(l0, m0, n0)     -53,  -32,  -24,  -28,  -38,  -37,  + 2
f(l, m, n)        - 6,  +10,  +13,  + 4,  -10,  -13,  +22

Not only are the residuals diminished in magnitude, but their distribution is much more nearly in agreement with the law of error. The values thus obtained for l, m, n ought not to be considered the best attainable, since the corrections Δm, Δn are relatively large fractions of m0 and n0, and it is probable that the neglected terms containing Δm^2, Δn^2, etc., have an appreciable influence upon the solution. To secure the utmost accuracy these values of l, m, n should be treated as new approximations and another set of corrections Δl, Δm, Δn derived. This resolution is recommended to the student as a valuable exercise. Let the student also derive from the data of § 1 the most probable values of l0 and c, assigning unequal weights to the several equations.

§ 11. Conditioned Observations.

There is a class of cases in which the application of the principle of least squares seems to produce absurd results.
Thus if each angle of a plane triangle be measured many times in order to obtain an accurate set of values for the angles, the application of the principle that [pvv] must be made a minimum will furnish as the most probable value of each angle the weighted mean of the measures of that angle; but the sum of these weighted means will usually differ slightly from 180°, and since the sum of the angles of every plane triangle must equal 180° it appears that the most probable values above derived are impossible values. It must, however, be noted that the method of treatment above outlined is itself a violation of Principle A, § 1, since the knowledge that the sum of the angles must equal 180° furnishes a relation among those angles which may be used and ought to be used in determining their most probable values; and the apparent absurdity above found is produced by neglecting this part of the data. A relation such as the above which must be exactly satisfied by a set of observed quantities is called a rigorous condition, the equation by which the relation is expressed is called an equation of condition, and observations of such quantities are known as conditioned observations. The number of rigorous conditions is, of course, always less than the number of unknown quantities, since if it were equal to the number of such quantities the values of the latter would be determined by the conditions alone, independently of any observations.
In order to develop a convenient method of treating rigorous conditions, let x, y, z be three unknown quantities which are to be determined from observation, but whose values are required to satisfy the equations of condition

φ(x, y, z) = 0        ψ(x, y, z) = 0

Let the measurements or observations for the determination of the unknown quantities be represented by observation equations of the form

f1(x, m) = 0        f2(y, n) = 0        f3(z, q) = 0

m, n, and q being the quantities directly measured, and the measures for the determination of x being quite independent of those for y, z, etc. In accordance with the principles of least squares the values of the unknown quantities are to be so determined that [pvv] shall be made a minimum in each series of observations above represented, and therefore the sum of all the weighted squares of the residuals must also be a minimum. Owing to the conditions φ(x, y, z) = 0, ψ(x, y, z) = 0 it will not in general be possible to assign to the unknown quantities values which will give to [pvv] its least possible value, and the problem becomes one of conditioned or relative minima; i.e. out of all the sets of values of x, y, z which will exactly satisfy the equations of condition it is required to find that set which assigns to [pvv] its least value consistent with those equations. The method of determining relative minima is as follows (Jordan, Cours d'Analyse, Vol. I., § 205): Multiply each equation of condition by an undetermined constant factor, and add the products to the function which is to be made a minimum. The derivative of the new function with respect to each unknown quantity must be placed equal to 0, and the equations thus formed, together with the equations of condition, will be just sufficient to determine the unknown quantities and the constant multipliers.
Thus, in the present case, representing the multipliers by -2k1 and -2k2, we have for the new function

w = [pvv] - 2k1 φ(x, y, z) - 2k2 ψ(x, y, z)   (17)

and

dw/dx = 0,   dw/dy = 0,   dw/dz = 0   (18)

which, together with φ(x, y, z) = 0, ψ(x, y, z) = 0, will determine k1, k2, x, y, and z. It was shown in § 7 that in general for three unknown quantities

d[pvv]/dx = 2[paa]x + 2[pab]y + 2[pac]z + 2[pan]

but in the case here considered those observation equations which contain x do not contain either y or z, and, therefore, the b and c coefficients in those equations are to be considered zero, all the products ab, ac are also zero, and

d[pvv]/dx = 2[paa]x + 2[pan]

with similar expressions for the y and z derivatives. Denoting for the sake of brevity φ(x, y, z) and ψ(x, y, z) by φ and ψ respectively, we obtain by differentiating w, and dropping the common factor 2,

[paa]x + [pan] - k1 dφ/dx - k2 dψ/dx = 0
[pbb]y + [pbn] - k1 dφ/dy - k2 dψ/dy = 0
[pcc]z + [pcn] - k1 dφ/dz - k2 dψ/dz = 0

from which

x = -[pan]/[paa] + (dφ/dx) k1/[paa] + (dψ/dx) k2/[paa]
y = -[pbn]/[pbb] + (dφ/dy) k1/[pbb] + (dψ/dy) k2/[pbb]
z = -[pcn]/[pcc] + (dφ/dz) k1/[pcc] + (dψ/dz) k2/[pcc]

These equations determine the values of x, y, z when k1 and k2 are known, and it should be observed that the first terms of the second members of the equations are the values of x, y, z which would be obtained by treating the observations as if these quantities were entirely independent of each other, e.g. in the case of direct observations of the quantities they are the weighted means of the observations. If we represent the values thus obtained by x0, y0, z0, and represent by v1, v2, v3 the corrections which must be added to these quantities in order to obtain the most probable values of x, y, z, i.e. put

x = x0 + v1        y = y0 + v2        z = z0 + v3

we shall have

v1 = (dφ/dx) k1/[paa] + (dψ/dx) k2/[paa]
v2 = (dφ/dy) k1/[pbb] + (dψ/dy) k2/[pbb]   (19)
v3 = (dφ/dz) k1/[pcc] + (dψ/dz) k2/[pcc]
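A minimal numerical sketch of equations (17)-(19) applied to the triangle problem proposed at the end of this section (the function name and the sample angles are mine, not from the text; the single condition φ = x + y + z - 180° = 0 gives v_i = k/p_i, and the condition itself fixes the correlate k):

```python
# Single rigorous condition phi = x + y + z - 180 = 0 among three
# measured angles with weights p: equations (19) give v_i = k / p_i.
# (Illustrative values, not from the text.)
def adjust_triangle(angles, weights):
    e = sum(angles) - 180.0                  # closing error phi(x0, y0, z0)
    k = -e / sum(1.0 / p for p in weights)   # correlate
    return [a + k / p for a, p in zip(angles, weights)]

adj = adjust_triangle([60.01, 59.98, 60.04], [1, 2, 1])
print([round(a, 3) for a in adj], round(sum(adj), 3))
```

The adjusted angles sum exactly to 180°, and each correction is inversely proportional to the weight of its angle, as the PROBLEM below asks the student to prove.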
The quantities k1 and k2 are called correlates, and from the manner in which they were introduced it appears that the number of correlates is equal to the number of rigorous conditions to which the observed quantities are subject. To determine the values of the correlates let x0 + v1, y0 + v2, z0 + v3 be substituted for x, y, z in the equations of condition, and the equations developed by Taylor's Formula, giving for the first of them

φ(x0, y0, z0) + (dφ/dx) v1 + (dφ/dy) v2 + (dφ/dz) v3 + etc. = 0

and a similar expression for ψ(x, y, z). Let the values of v1, v2, v3 in terms of k1 and k2 be substituted in these equations, and put

(dφ/dx)/√[paa] = a1,   (dφ/dy)/√[pbb] = a2,   (dφ/dz)/√[pcc] = a3
(dψ/dx)/√[paa] = β1,   (dψ/dy)/√[pbb] = β2,   (dψ/dz)/√[pcc] = β3

and the equations become

[aa] k1 + [aβ] k2 + φ(x0, y0, z0) = 0
[aβ] k1 + [ββ] k2 + ψ(x0, y0, z0) = 0   (20)

from which the values of k1 and k2 may be obtained, and thus the values of v1, v2, v3 from equations (19). The method by which the above equations have been derived for the case of three unknown quantities connected by two equations of condition is perfectly general and may be extended to any other number of quantities whose values are to be obtained from independent observations. In the cases which actually arise in practice the observation equations and equations of condition are usually of simple form, the differential coefficients and the quantities a, b, c, etc., being usually equal to either 1 or 0.

PROBLEM. Let the student show by the method of correlates that if the sum of the measured angles of a plane triangle exceed 180° by a quantity e, the angles must be corrected by distributing e among them in such a manner that the correction to each angle is inversely proportional to the weight of the angle.

To illustrate the application of the principles of the present section to a numerical problem, we select from the U. S. C. & G.
Survey Report for 1884, pages 409 et seq., the following telegraphic determinations of longitude, and seek to adjust them so that they shall be mutually consistent. Each difference of longitude between two stations was directly observed, so that the observation equations are all of the form x = m1, x = m2, etc., and the values given below are the weighted means of the individual observations of each series. The probable error of each determination (see § 12) is placed immediately after the quantity itself, and the weights of the determinations are assumed to be inversely proportional to the squares of the probable errors.

Stations.                               Symbol.   Observed Difference of Longitude.   1/√p    1/p
Cambridge, Mass., to Washington, D.C.     x       23m 41s.041 ± 0s.018                0.18    0.032
Cambridge, Mass., to Cleveland, O.        y       42  14.875  ±  0.038                0.38    0.144
Cambridge, Mass., to Columbus, O.         z       47  27.713  ±  0.035                0.35    0.122
Washington, D.C., to Columbus, O.         u       23  46.816  ±  0.038                0.38    0.144
Cleveland, O., to Columbus, O.            w        5  12.929  ±  0.045                0.45    0.202

The five observed differences of longitude give rise to two rigorous conditions represented by the following equations of condition:

φ = u + x - z = 0        ψ = w + y - z = 0

The coefficients in the observation equations being all equal to unity, [paa] = p1, [pbb] = p2, etc., and

a1 = (dφ/dx)/√p1,   a2 = (dφ/dy)/√p2,   β2 = (dψ/dy)/√p2,   etc.,

and from these expressions are derived the following values of the coefficients, together with the sums s1 = a1 + β1, s2 = a2 + β2, etc., which are to be employed as a check upon the formation of the normal equations for determining the correlates.

COEFFICIENTS.
Subscripts.     1.       2.       3.       4.       5.
a             +0.18     0.00    -0.35    +0.38     0.00
β              0.00    +0.38    -0.35     0.00    +0.45
s             +0.18    +0.38    -0.70    +0.38    +0.45

FORMATION OF THE CORRELATE EQUATIONS.
  aa        aβ        as        ββ        βs
+0.0324   0.0000   +0.0324    0.0000    0.0000
 0.0000   0.0000    0.0000   +0.1444   +0.1444
+0.1225  +0.1225   +0.2450   +0.1225   +0.2450
+0.1444   0.0000   +0.1444    0.0000    0.0000
 0.0000   0.0000    0.0000   +0.2025   +0.2025
+0.2993  +0.1225   +0.4218   +0.4694   +0.5919

Check: [aa] + [aβ] = 0.4218;   [aβ] + [ββ] = 0.5919.

CORRELATE NORMAL EQUATIONS.

+0.2993 k1 + 0.1225 k2 + 0.144 = 0
+0.1225 k1 + 0.4694 k2 + 0.091 = 0
k1 = -0s.449        k2 = -0s.078

The absolute terms of the correlate equations are obtained by substituting the observed values x0, y0, z0, u0, w0 in the equations of condition, and the values of k1, k2 may be found from the correlate equations, either by Gauss's method of substitution or by any of the ordinary algebraic processes of elimination. The corrections to x0, y0, z0, etc., and the adopted values of the unknown quantities, are now found from

v1 = +0.032 k1 + 0.000 k2 = -0s.014        x = 23m 41s.027
v2 = +0.000 k1 + 0.144 k2 = -0.011         y = 42  14.864
v3 = -0.122 k1 - 0.122 k2 = +0.064         z = 47  27.777
v4 = +0.144 k1 + 0.000 k2 = -0.065         u = 23  46.751
v5 = +0.000 k1 + 0.202 k2 = -0.016         w =  5  12.913

The values thus obtained satisfy the rigorous conditions of the problem, and are the most probable values which can be obtained from the data given above.

§ 12. The Probable Error.

Every intelligent observer desires to know something of the quality of his observations, how good or how bad they are; the computer who has to combine the results of different series of observations should have some knowledge of their relative accuracy in order to assign to each series its proper weight; and the investigator engaged in a complicated series of experiments desires some criterion by which to estimate the relative errors of the several parts of his work, in order to apportion his care properly among them, giving the maximum attention where the greatest errors are to be feared.
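The correlate arithmetic of the longitude adjustment above can be reproduced in a few lines (a sketch; the variable names are mine, the reciprocal weights are the squares of the tabulated 1/√p, and the closing errors 0.144 and 0.091 are the absolute terms of the correlate equations). The computed correlates, about -0.450 and -0.076, agree with the text's -0.449 and -0.078 to the precision of the logarithmic work:

```python
# Correlate solution of the longitude net: two conditions,
# phi = u + x - z and psi = w + y - z, among five observed differences.
invp = [0.0324, 0.1444, 0.1225, 0.1444, 0.2025]  # 1/p for x, y, z, u, w
A = [1, 0, -1, 1, 0]          # d(phi)/d(x, y, z, u, w)
B = [0, 1, -1, 0, 1]          # d(psi)/d(x, y, z, u, w)
e1, e2 = 0.144, 0.091         # closing errors of the observed values

aa = sum(a * a * q for a, q in zip(A, invp))        # [aa]  = 0.2993
ab = sum(a * b * q for a, b, q in zip(A, B, invp))  # [a.b] = 0.1225
bb = sum(b * b * q for b, q in zip(B, invp))        # [b.b] = 0.4694

det = aa * bb - ab * ab                   # solve the two normal equations
k1 = (ab * e2 - bb * e1) / det
k2 = (ab * e1 - aa * e2) / det
v = [(a * k1 + b * k2) * q for a, b, q in zip(A, B, invp)]
print(round(k1, 3), round(k2, 3), [round(c, 3) for c in v])
```

The corrections v reproduce the text's -0.014, -0.011, +0.064, -0.065, -0.016, and the two conditions close exactly after adjustment.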
It is evident from the nature of the case that no absolute criterion of this kind can be furnished, since any series of observations may be affected with systematic errors which seriously impair the accuracy of its results but furnish no indication of their presence. Both observer and computer do, however, estimate the accuracy of observations by their agreement among themselves, and that within certain limits this procedure is correct follows from Gauss's law of error. If we suppose a very long series of observations affected only by accidental errors, the values of the unknown quantities obtained from the series will differ but little from the true values (if the series is infinitely long they will be the true values), the residuals which they furnish will be very nearly the errors of observation, and the value of h in the equation of the error curve will furnish a measure of the precision of the observations as well as a measure of the smallness of the residuals. On the other hand, if the student attempts to construct the error curve corresponding to any short series of residuals, e.g. those of § 10, he will find that while they give him some information in regard to the curve there will be much that is arbitrary in its actual construction, and that many curves can be drawn which will appear to fit the residuals equally well; i.e. the amount of data in this case is insufficient to determine more than a rough approximation to the measure of precision of the observations. If the observations are affected with systematic errors, the residuals may be very different from the errors of the observations, and will then furnish no indication of their accuracy. It thus appears that any conclusions in regard to the accuracy of a given set of observations must be treated with caution if they are based solely on the residuals furnished by the observations.
Such conclusions are, in fact, valid only within certain limits whose general nature is indicated above; but within these limits the information thus furnished may be of much value, and it is frequently employed for the purposes indicated at the beginning of this section. The measure of precision, h, seems to be indicated by its name as the appropriate means of expressing the average accuracy of a set of observations, but in practice it is not so used, another function of the residuals being found more convenient. If in a very long series of observations the residuals be arranged in the order of their numerical magnitude (without regard to sign), that residual which occupies the middle place in the series will have as many residuals greater than it as there are less than it, and in any future series of observations of the same degree of precision as that here considered, it will be an even chance that any given residual will be greater than, or less than, the middle one above selected. This middle residual is usually denoted by r, and is rather inappropriately called the probable error of the series, the adjective having reference to the equal probabilities of the occurrence of residuals (errors) greater than, or less than, r. It is apparent that the greater the precision of any set of observations, the smaller will be the corresponding probable error; but the exact relation which exists between h and r must be derived from the equation of the error curve. The symmetry of this curve with respect to the axis of y shows that the same law of distribution holds for both positive and negative errors, and that in a very long series of residuals the probable error r will occupy the middle place among the positive errors and among the negative errors considered separately, as well as among all the errors taken without regard to sign.
Since we are concerned only with the numerical magnitude of r we may confine our attention to the positive residuals, and find the relation between r and h from that half of the error curve which lies to the right of the axis of y. Since the probable error is a residual, it must be represented by the abscissa of some point on the axis of x, and we may determine this point from the condition that the ordinate drawn through it bisects the area of that half of the curve under consideration, since (from the relation between areas and the number of residuals of a given magnitude developed in § 4) this is the geometrical equivalent of the statement that the number of residuals greater than r is equal to the number less than r. By interpolation from the table in § 4, the value of the argument corresponding to A = 0.25 is found to be hx = hr = 0.477, whence the relation between the probable error and the measure of precision is

r = 0.477 / h   (22)

The student will observe that in the definition of the probable error reference is made to a very long series of observations, and in a series of infinite length the value of r might be found immediately from its definition; but in any ordinary set of observations it is better to assume that the residuals are distributed in accordance with the law of error, and to determine the value of r from the relation between h and the sum of the squares of the residuals, § 5, which gives

r = ± 0.477 √2 · √([vv]/n) = ± 0.674 √([vv]/n)

We here encounter a difficulty arising from the attempt to apply to a short series of residuals principles which are rigorously true only when the series is of infinite length. Suppose the above expression for r applied to a series of three observations involving three unknown quantities whose values are derived from the resulting observation equations.
These values will exactly satisfy the equations, no matter what the errors of the observations may be, and the residuals being all zero, there will be found r = 0 and h = ∞, which is absurd. The observations in this case furnish no data from which to estimate their precision, and in every such case where the number of observations is equal to the number of unknown quantities, the expression for the probable error ought to become indeterminate. It is therefore customary to put

r = ± 0.674 √( [vv] / (n - μ) )   (23)

in which μ denotes the number of quantities whose values have been derived from the observations. This equation, which is known as Bessel's expression for the probable error of a single observation, being only an approximate one, we may usually put 2/3 in place of the coefficient 0.674. Among German physicists and astronomers it is quite customary to suppress this coefficient altogether, and to use the "mean error"

ε = ± √( [vv] / (n - μ) )   (24)

for the comparison of observations. Geometrically considered, ε denotes the abscissa of the point of inflexion of the error curve. A simpler expression for the probable error may be obtained by substituting in the equation r = 0.477/h a value of h derived as follows: Let each member of the equation of the error curve be multiplied by x dx and integrated between the limits -∞ and +∞, giving

∫ x y dx = (h/√π) ∫ x e^(-h²x²) dx

The value of the first integral in this equation is obviously 0, since as we pass along the error curve from -∞ to +∞ every value of y occurs once associated with a negative value of x, and again with a numerically equal positive value, and for every negative element x y dx in the integral there occurs an equal positive x y dx, so that the entire sum is 0.
If, however, we agree to neglect the sign of x and to consider only its numerical value, we shall find

∫ x y dx = 2 ∫₀^∞ x y dx

and by a course of reasoning precisely similar to that applied in § 5 to the quantity ∫ x²y dx, it may be shown that 2 ∫₀^∞ x y dx is equal to the mean of all the residuals taken without regard to sign. We may therefore write

[+v]/n = (2h/√π) ∫₀^∞ x e^(-h²x²) dx

where the + inside the brackets denotes that all of the residuals are to be treated as positive quantities. Putting hx = t in the second member of this equation, and remembering that here also we are concerned only with numerical values of x without regard to sign, we obtain

[+v]/n = (2/(h√π)) ∫₀^∞ t e^(-t²) dt = -(1/(h√π)) [e^(-t²)]₀^∞

Introducing the limits into the integrated expression there results

[+v]/n = 1/(h√π)

and

r = 0.477 √π · [+v]/n = 0.845 [+v]/n   (25)

This formula is rigorously correct only when the number of observations is infinite, and it must be transformed so as to become indeterminate when the number of observation equations is just sufficient to determine the unknown quantities, i.e. when n = μ. This might be accomplished by writing n - μ in place of n, as was done in equation (23), but it is customary to substitute √(n(n - μ)), which also renders r indeterminate when n = μ, and gives values of r more nearly in agreement with equation (23). Making this substitution, we have

r = 0.845 [+v] / √(n(n - μ))   (26)

which is known as Peters' formula for probable errors. This formula is very convenient for the numerical computation of probable errors; but where the number of observations is small the results furnished by equation (23) are considered more reliable, and neither formula can furnish a good determination of probable errors from a small number of observations. The numerical application of these formulae may be illustrated by the following short series of sextant observations for the determination of latitude.
Observations.        v.       vv.
43° 4' 46"         +19       361
43° 4' 24"          -3         9
43° 4'  7"         -20       400
43° 4' 28"          +1         1
43° 4' 59"         +32      1024
43° 4' 39"         +12       144
43° 4' 52"         +25       625
43° 4' 52"         +25       625
43° 3' 47"         -40      1600
43° 4' 15"         -12       144
43° 3' 36"         -51      2601
43° 4' 40"         +13       169

Mean = 43° 4' 27"        [+v] = 253        [vv] = 7703
n = 12,   μ = 1.

By Peters' formula (26):
log [+v] = 2.403
a.c. log √(n(n-1)) = 8.940 - 10
log 0.845 = 9.926 - 10
log r = 1.269,   r = ± 18".6

By equation (23):
log [vv] = 3.886
log (n-1) = 1.041
log ([vv]/(n-1)) = 2.845
log √([vv]/(n-1)) = 1.422
log 0.674 = 9.829 - 10
log r = 1.251,   r = ± 17".8

The difference between the values of r found from the first and second powers of the residuals is small compared with the uncertainty of each arising from the small number of observations. In so far as these observations can be considered as furnishing a value of r, they indicate that in a future series of similar and equally precise observations there should be as many observations furnishing residuals (errors) greater than 18" as there are observations giving residuals less than 18". The ± which is commonly prefixed to the numerical value of r denotes that the observed quantity is as apt to err in excess as in defect. Let the student derive from the residuals given in § 10 a determination of the probable error of an observed V, noting that in this case μ = 3.

§ 13. Probable Error of a Function of Observed Quantities.

Let x', x'', x''' denote quantities which have been determined from observation, and let r', r'', r''' be their probable errors. Let u be a quantity whose value has been computed from the values of x', x'', x''' by means of the relation

f(x', x'', x''', u) = 0

It is evident that the precision with which u is determined depends upon the precision of x', x'', x''', and by a slight extension of the term "probable error" we may consider the precision of u to be represented by a probable error, r, and may inquire the relation of r to r', r'', r'''.
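The sextant computation above is easily checked; a sketch, taking the residuals of the table in seconds of arc:

```python
import math

# Probable error of one observation by Bessel's formula (23) and by
# Peters' formula (26), applied to the twelve sextant residuals above.
v = [19, -3, -20, 1, 32, 12, 25, 25, -40, -12, -51, 13]
n, mu = len(v), 1
vv = sum(x * x for x in v)                         # [vv] = 7703
av = sum(abs(x) for x in v)                        # [+v] = 253
r_bessel = 0.674 * math.sqrt(vv / (n - mu))        # about 17".8
r_peters = 0.845 * av / math.sqrt(n * (n - mu))    # about 18".6
print(round(r_bessel, 1), round(r_peters, 1))
```

Both results agree with the logarithmic work of the text to the tenth of a second of arc.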
Since a probable error is one of the residuals or errors of a very long series, we may obtain the desired relation between r, r', r'', r''' from a consideration of the general relation of any set of errors v', v'', v''' in x', x'', x''' to the corresponding error, v, in u. This relation is (see § 8)

v = (df/dx') v' + (df/dx'') v'' + (df/dx''') v''' + etc.

To avoid the necessity for considering the signs of v', v'', v''', let this equation be squared, giving

v² = (df/dx')² v'² + (df/dx'')² v''² + (df/dx''')² v'''² + etc.

from which all terms involving the products v'v'', v'v''', v''v''', etc., have been dropped, for the reason that the probable error of u depends upon the average magnitude of v, and in the long run any pair of residuals v', v'' will have opposite signs as often as they have like signs, and will therefore produce an equal number of positive and negative terms whose effect upon the mean value of v² will be very small compared with the terms containing v'², v''², v'''², which are always positive. Replacing these actual errors by the corresponding probable errors, we obtain

r² = (df/dx')² r'² + (df/dx'')² r''² + (df/dx''')² r'''² + etc.   (27)

and an equation of similar form will express the relation of the probable error of the function to the probable errors of the quantities upon which it depends, whatever the number of these quantities may be. We proceed to apply this relation to a few simple cases of frequent occurrence in practice.

(a) The probable error of the sum of n observed quantities. In this case

u = x' + x'' + x''' + ... + x^(n)

and each of the differential coefficients df/dx', df/dx'', etc., equals 1; whence

r² = r'² + r''² + r'''² + ... + r^(n)²   (28)

(b) The probable error of the mean of n observed quantities. In this case

u = (1/n)(x' + x'' + x''' + ... + x^(n))
df/dx' = df/dx'' = etc. = 1/n
r² = (1/n²)(r'² + r''² + r'''² + ... + r^(n)²)

We have here to distinguish two cases.
If the x's are all of equal precision, the r's are equal, and may be represented by a common symbol r'; whence

r = r' / √n   (29)

If the observations are of unequal precision represented by weights p', p'', p''', ..., we have

u = (p'x' + p''x'' + p'''x''' + ... + p^(n)x^(n)) / [p]
df/dx' = p'/[p],   df/dx'' = p''/[p],   etc.
r² = (p'²r'² + p''²r''² + ... + p^(n)²r^(n)²) / [p]² = r1² / [p]   (30)

where r1 denotes the probable error of an observation whose weight is 1. The relations here derived between the probable error of a single determination of a quantity and the probable error of the mean of n determinations may be employed, in connection with equation (23), to determine the probable error of an adopted value based upon several determinations of a quantity. Thus, in the general case of observations of unequal weight, if r1 represent the probable error of an observation of weight 1, and r the probable error of the weighted mean, we have from equation (23)

r1 = 0.674 √( [pvv] / (n - μ) ),        r = r1 / √[p]   (31)

Let the student show that when the observations are of equal precision and

u = a(x' - x'' - x'''),        r = ± a√3 · r'
u = sin (x'/a),                r = ± (r'/a) cos (x'/a)
u = log x',                    r = ± 0.434 r'/x'

§ 14. Assignment of Weights. Rejection of Observations.

The term weight has been employed in the preceding sections as a measure of the quality of an observation, but its use is by no means limited to the case of single observations. Thus, from an Investigation of the Distance of the Sun, etc., by S. Newcomb, we select the following determinations of the solar parallax.

Method by which determined.                  Parallax.   Weight.
Meridian observations of Mars, 1862.          8".855       25
Micrometric observations of Mars, 1862.       8 .842        6
Parallactic inequality of the Moon.           8 .838       16
Lunar equation of the Earth.                  8 .809        3
Transit of Venus, 1769.                       8 .860        6

Each value of the parallax here given is the final result of an elaborate discussion of many observations, and the weights indicate the relative excellence attributed to these results by the author of the investigation.
If π denote any one of these values of the parallax, p its weight, and π0 the most probable value of the parallax, we shall have (§ 6)

π0 = [pπ] / [p] = 8".847

It is to be noted that this value depends upon the weights assigned to the individual determinations, and that by properly selecting the weights π0 may be made to assume any value whatever between the least and the greatest single determination. Thus if the weight 100 be assigned to the value 8".860, and to each of the other values the weight 1, we shall find π0 = 8".859, while a weight 100 for the value 8".809 with a weight 1 for each of the others makes π0 = 8".811. Between these limits the value of π0 depends upon the judgment of the computer in assigning weights, and this determination of weights is one of the most delicate questions that arise in the application of the method of least squares. A relation between weights and probable errors may easily be established, which is frequently of service in that it enables the problem of weights to be stated in a different form. Let x denote an observation whose probable error is r and whose weight is 1, and denote by x', x'', x''', etc., observations or combinations of observations of the same quantity, whose weights and probable errors are represented by p', p'', p''', r', r'', r''', etc. In accordance with the definition of weights, x' is the equivalent of p' observations of the same quality as x, and from the equations derived in the preceding section we have

r' = r / √p'

with a similar expression for each of the other quantities; whence

p' r'² = p'' r''² = p''' r'''² = r²,   and   p' = r²/r'²,   p'' = r²/r''²,   etc.   (32)

and, in general, the weights are inversely proportional to the squares of the probable errors.
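The weighted means quoted above in the discussion of the parallax determinations can be verified directly (a sketch; the function name is mine):

```python
# Weighted mean pi0 = [p.pi]/[p] for Newcomb's parallax determinations,
# and the two extreme weightings discussed in the text.
def weighted_mean(values, weights):
    return sum(v * p for v, p in zip(values, weights)) / sum(weights)

parallax = [8.855, 8.842, 8.838, 8.809, 8.860]
weights  = [25, 6, 16, 3, 6]
print(round(weighted_mean(parallax, weights), 3))            # 8.847
print(round(weighted_mean(parallax, [1, 1, 1, 1, 100]), 3))  # 8.859
print(round(weighted_mean(parallax, [1, 1, 1, 100, 1]), 3))  # 8.811
```

The three results reproduce the text's 8".847, 8".859, and 8".811, showing how far the adopted value is at the mercy of the assigned weights.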
It has been sufficiently shown that probable errors derived from the residuals furnished by a series of observations represent only the effects of accidental errors of observation; but we may extend the significance of the term so as to include an estimate of the effect upon x', x'', x''' of systematic errors in the observations. Let r1 and r2 represent those parts of the probable error which come from these two sources respectively, and from § 13 we find for their combined effect

r² = r1² + r2²

and the expression for the weights becomes

p = r0² / (r1² + r2²)   (33)

By this device the determination of weights is reduced to an estimate of the combined effect of accidental and systematic errors of observation upon the quantity whose weight is desired, and it was from an estimate of this character that the weights of the parallaxes given above were derived. If r' denote the probable accidental error of a single observation, and the quantity whose weight is p is the mean of n such observations, we shall have

p = 1 / (r'²/n + r2²)   (34)

from which the constant multiplier r0² has been dropped, since only relative values of p are ever required. It appears from this equation that if the systematic errors, r2, are very small compared with the accidental errors, r', the weight increases rapidly as the number of observations is increased; but if the systematic errors are large, the weight is but little affected by the number of observations; a relation to be considered in deciding how many observations shall be made to determine an unknown quantity.
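The behaviour of equation (34) can be seen numerically (a sketch; the argument names and the sample errors are mine, not from the text):

```python
# Relative weight of the mean of n observations, equation (34): the
# accidental part r' shrinks as n grows, but the systematic part r2 does not.
def weight(n, r_acc, r_sys):
    return 1.0 / (r_acc ** 2 / n + r_sys ** 2)

# small systematic error: quadrupling n nearly quadruples the weight
print(round(weight(4, 1.0, 0.1) / weight(1, 1.0, 0.1), 2))
# large systematic error: quadrupling n gains very little
print(round(weight(4, 1.0, 2.0) / weight(1, 1.0, 2.0), 2))
```

The contrast between the two ratios is the relation the text asks the observer to weigh in deciding how many observations to make.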
In some cases it may be impossible to form any reliable estimate of the effect of systematic errors, and results which have been derived by different methods, or under different circumstances, may then be given equal weights on the supposition that they are affected by different systematic errors which it is equally important to eliminate; but this is equivalent to putting r2 = ∞ in the equation for the weights, and it will rarely happen that this is the best estimate which can be made for the amount of the systematic errors. It frequently happens that in a series of otherwise accordant observations, one or two will be found which differ widely from the others, and which, if included in the final result, will furnish large residuals. What shall be done with observations of this kind has long been a vexed question. To reject them is equivalent to assigning to them the weight 0, and is the expression of the computer's judgment that they can contribute nothing to the accuracy of the result which he seeks to obtain. In an infinitely long series of observations, errors of any finite magnitude may be permitted without impairing the accuracy of the final result, and the existence of such errors seems contemplated by the theory which we have adopted, since the equation of the error curve gives finite values of y for all values of x between the limits -∞ and +∞. But in the actual case which arises in practice, where a result must be obtained from a comparatively small number of observations, a single one of these, if affected with a large error, may make the final result farther from the truth than any one of the other observations. On the other hand, cases are by no means unknown in which a single discordant observation in a series proves to be nearest to the true value of the quantity sought, the others having all been vitiated by some common cause; and between these extremes an infinite variety of cases may be found.
It must in general remain a matter of doubt whether a given discordant observation should or should not be rejected, and the decision made by the computer must be his judgment based upon all the data available as to whether more will be gained by rejecting than by retaining it. A knowledge of the way in which observations are made, of the circumstances attending the particular observation in question, and of the magnitude of the errors which may reasonably be expected with the given observer and apparatus, or instrument, are elements which should be included in this judgment; and the observer will greatly facilitate its formation by making copious notes at the time of observation of all circumstances which in his opinion may affect the quality of his work, and particularly by noting any abnormal circumstances affecting a single observation or a part of the observations. A doubtful observation should be rejected if it is the computer's deliberate judgment that its retention will hurt more than it will help his final result, but it is never legitimate for the computer to suppress an observation. A rejected observation should be included in the statement of his data, and may properly be accompanied by an explanation of the reasons for its rejection, in order that any person interested in the result may form his own judgment of the data and the manner in which they have been discussed, and may, if necessary, rediscuss the observations in accordance with that judgment. The conclusion of the whole matter of assigning weights to numerical data may be summed up in the statement that no mathematical expression will suffice for this purpose, but the weights must be determined by an exercise of personal judgment, and the wider the knowledge upon which this judgment is based, the greater confidence will the weights and the resulting values of the unknown quantities command.

§ 15. Empirical or Interpolation Formulae.
In the preceding sections attention has been directed to that class of problems in which the theoretical relation between the observed quantities and those whose values are to be determined is known; that is, an equation of known form exists between them, and the problem has been to determine the values of the constants which appear in the equation. But a very different class of cases now demands a passing notice. A series of observations is sometimes found to be affected with errors too great to be explained as the result of unavoidable and fortuitous causes, and it becomes apparent that the law of recurrence of these errors must be determined before the observations can be made to yield any valuable results. The American parties which were sent out in 1874 to observe the transit of Venus were provided with instruments for the determination of their local time, of such a character that the accidental error of a determination from a single star might fairly be estimated at 0s.05 or 0s.06, but results obtained from observations of different stars varied among themselves by more than ten times this amount. An inspection of the discrepancies having shown that they depended in some way upon the distance of the observed star from the zenith, it was found by trial that the error at any zenith distance, z, could be represented by the expression

E = a cos z - b sin z

where a and b are constants whose values were found from the observations themselves. The physical cause of these errors was subsequently found to be the bending of the instrument under its own weight; but it is to be noted that the above law of recurrence of the errors was determined first, the cause afterwards. Expressions of this kind are sometimes called interpolation formulae and sometimes empirical equations; the one term having reference to their use, the other to their derivation.
They are of very general use in all branches of physical science, since they may be made to serve as a convenient summary of a vast amount of numerical data, and one of the most important applications of the method of least squares is in determining the values of the constants which enter into such expressions. The problems treated in §§ 1 and 10 both belong to this class, and the following expression for the magnetic declination at Washington, D.C., derived by Mr. C. A. Schott * from a series of observations extending over ninety years, may serve as a further illustration:

Mag. Dec. = 2°.47 + 2°.50 sin [1°.40 (T - 1850) - 14°.6]

where T denotes the year for which the declination is required. When the cause whose effects are to be represented by an equation is known, the form of this equation can usually be derived by mathematical analysis; but where empirical formulae are employed other methods must be resorted to. The simplest of these is a graphical representation of the errors or other data under consideration. For this purpose let the errors represent ordinates, and the values of any variable upon which they are supposed to depend, the corresponding abscissas. Let points be plotted with these ordinates and abscissas, as was done in obtaining the form of the error curve, Figs. A, B, C, D, and let a smooth curve be drawn through these points either free-hand or by the aid of a draughtsman's "irregular curve." The distance of each plotted point from the curve, measured along an ordinate, is the residual corresponding to the point, and in accordance with the principle of least squares the curve should be so drawn as to make the sum of the squares of these residuals as small as possible, without unduly complicating the curve.

* U. S. C. & G. S. Report, 1882, p. 258.
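Formulae of this kind are easily evaluated mechanically. The sketch below transcribes Schott's expression into Python (a hedged transcription, not part of the original text) and checks it against the equivalent cosine form given later in this section, using the identity sin u = cos (u - 90°).

```python
import math

def mag_dec_sin(T):
    """Schott's formula for the magnetic declination at Washington,
    sine form; angles in degrees, T the year."""
    return 2.47 + 2.50 * math.sin(math.radians(1.40 * (T - 1850) - 14.6))

def mag_dec_cos(T):
    """The equivalent cosine form, since sin u = cos (u - 90 deg)."""
    return 2.47 + 2.50 * math.cos(math.radians(1.40 * (T - 1850) - 104.6))

# The two forms agree for any year.
for year in (1800, 1850, 1900, 1950):
    assert abs(mag_dec_sin(year) - mag_dec_cos(year)) < 1e-12
```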
If the variable has been properly chosen it will in many cases be found possible to draw a simple curve which shall represent the data within the limits of the accidental errors of observation, and as this curve is the graphical representation of the law required, its equation, y = f(x), is the analytical representation of that law. In this manner the form of the equation treated in § 10 was obtained. In other cases it will be possible to draw a smooth and simple curve which shall not represent the data within the limits of accidental error of the observations, but about which the points will be grouped, alternating from one side to the other in a systematic manner. Let the excess of the ordinate of any point over the corresponding ordinate of the curve be plotted with the given abscissa in a new curve. The two curves thus constructed will together form the graphical representation of the law of the data, and the analytic expression of that law will be

y = f(x) + φ(x)

if y = f(x) and y = φ(x) are the equations of the two curves respectively. In some cases the curves themselves will be a sufficient representation of the data, and it will be unnecessary to determine their equations, since the value of y corresponding to any given value of x may be obtained by direct measurement. In other cases the curve will be chiefly serviceable in suggesting the probable form of an equation between the observed quantity and a variable upon which it is supposed to depend, or in showing that no simple relation exists between them. Two forms of equation are of such frequent use in this connection that they deserve especial notice. If the plotted curve does not differ very greatly from a straight line, the relation of the variable x to its function y may be represented by

y = a + bx + cx² + dx³ + etc.   (35)
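Fitting the coefficients of an equation of the form (35) to observed pairs (x, y) is a routine least-squares computation. The Python sketch below forms and solves the normal equations for a polynomial of given degree; the bar-length data are invented for illustration, in the spirit of the problem of § 1.

```python
def solve(A, v):
    """Solve the square system A x = v by Gaussian elimination with
    partial pivoting (sufficient for the small systems used here)."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def polyfit(xs, ys, degree):
    """Least-squares coefficients a0, a1, ... of y = a0 + a1 x + ...
    found from the normal equations."""
    cols = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(cols)]
         for i in range(cols)]
    v = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(cols)]
    return solve(A, v)

# Invented bar-length data: length = 1000.000 + 0.012 * temperature,
# analogous to the two-term fit assumed in section 1.
temps = [0, 10, 20, 30, 40]
lengths = [1000.000 + 0.012 * t for t in temps]
a0, a1 = polyfit(temps, lengths, 1)
```

As the text cautions, more than three or four terms of the series are rarely advisable; the normal equations of a high-degree polynomial fit rapidly become ill-conditioned.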
This equation contains the first few terms of an infinite series by which a limited arc of any continuous curve can be represented, and since the actual relation between y and x could be represented by a curve, if its mathematical expression were known, it follows that the above equation can be made to represent this relation over a certain range of values of x, by assigning proper values to the coefficients. The number of terms of this series which should be taken into account, and the limits of x beyond which the equation is not applicable, depend upon the actual relation between y and x, and are therefore unknown; but, in general, it is not well to attempt to use this equation for large values of x, or when more than three or at most four terms are required. Its application in a simple case is illustrated in the problem of § 1, where, y and x being replaced by the length of the bar and its temperature, it is assumed that their relation can be expressed, within the range of temperature over which the observations extend, by the first two terms of the series. The second type of equation above referred to is

y = a0 + a1 cos (x/m) + a2 cos (2x/m) + a3 cos (3x/m) + etc.
       + b1 sin (x/m) + b2 sin (2x/m) + b3 sin (3x/m) + etc.   (36)

in which m is an undetermined constant expressed in the same unit as x; x/m is therefore a ratio, or absolute number, which in the application of the equation to numerical data must be transformed into circular measure by multiplying it by 180/π = 57°.29578. This form of equation may be made to represent any relation whatever between finite values of y and x, including those cases in which y is a discontinuous function, but it is especially advantageous when y is a periodic function of x, i.e., one in which the same values of y recur for values of x separated by a constant interval, T, called the period of x, so that

f(x) = f(x + T) = f(x + 2T) = ... = f(x + nT).
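A truncated series of the form (36) is evaluated directly once m and the coefficients are known. The sketch below, with invented coefficients, verifies the periodicity property: with m = T/2π the series repeats with period T.

```python
import math

def trig_series(x, m, a, b):
    """Evaluate y = a[0] + sum_k a[k] cos(kx/m) + b[k] sin(kx/m);
    x and m in the same unit, so that x/m is in radians."""
    y = a[0]
    for k in range(1, len(a)):
        y += a[k] * math.cos(k * x / m) + b[k] * math.sin(k * x / m)
    return y

# Illustrative coefficients (invented).  With m = T / (2 pi), each term
# cos(kx/m) advances by a whole number of circumferences when x grows by T.
T = 360.0
m = T / (2 * math.pi)
a = [1.0, 0.5, -0.2]   # a0, a1, a2
b = [0.0, 0.3, 0.1]    # placeholder, b1, b2

for x in (0.0, 10.0, 123.4):
    assert abs(trig_series(x, m, a, b) - trig_series(x + T, m, a, b)) < 1e-9
```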
The simplest type of such a function is y = sin x, the period in this case being T = 360°; e.g., sin 10° = sin (10° + 360°) = sin (10° + 720°) = etc. When y is such a function the constant m should be put equal to the period divided by 2π, m = T/2π; in other cases the value of m should be so taken that the greatest value of x/m included among the data shall not exceed π. The application of this formula may often be facilitated by noting that if the relation between y and x is such that f(x) = f(-x), the sine terms all disappear, and the equation reduces to

y = a0 + a1 cos (x/m) + a2 cos (2x/m) + etc.   (37)

while if f(x) = -f(-x), the cosine terms vanish, and the equation becomes

y = b1 sin (x/m) + b2 sin (2x/m) + etc.   (38)

The several forms above given to this type of equation are those most convenient for use when the values of the coefficients a, b, etc., are to be determined, but after their numerical values have been found it is advantageous to transform the equation as follows: Introduce the auxiliary quantities n0, n1, n2, N1, N2, etc., defined by the relations

n0 = a0        n1 cos N1 = a1        n2 cos N2 = a2
               n1 sin N1 = b1        n2 sin N2 = b2

and the equation becomes

y = n0 + n1 cos (x/m - N1) + n2 cos (2x/m - N2) + etc.   (39)

each pair of terms of the original equation being here replaced by a single term. The expression for the magnetic declination at Washington given above is of this type, as may be seen by writing it in the equivalent form

Mag. Dec. = 2°.47 + 2°.50 cos [1°.40 (T - 1850) - 104°.6]

The mode of applying this form of equation may be illustrated by means of the following data, selected from the series of observations whose residuals are plotted in Fig. C. The observed quantity, B, is the difference of stellar magnitude (brightness) between the planet Saturn and his satellite Iapetus.
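The passage from the coefficients a_k, b_k to the amplitude-and-phase form of equation (39) is the familiar reduction a cos t + b sin t = n cos (t - N), which a short Python sketch can carry out and verify:

```python
import math

def to_amplitude_phase(a_k, b_k):
    """Return n_k and N_k (degrees) such that n_k cos N_k = a_k and
    n_k sin N_k = b_k, whence a_k cos t + b_k sin t = n_k cos(t - N_k)."""
    return math.hypot(a_k, b_k), math.degrees(math.atan2(b_k, a_k))

# First-harmonic constants of the Iapetus example in the text.
n1, N1 = to_amplitude_phase(0.15, 0.74)

# Spot-check the identity at an arbitrary angle.
t = math.radians(33.0)
lhs = 0.15 * math.cos(t) + 0.74 * math.sin(t)
rhs = n1 * math.cos(t - math.radians(N1))
assert abs(lhs - rhs) < 1e-12
```

With the constants above this gives n1 ≈ 0.76 and N1 ≈ 78°.5, agreeing with the values quoted in the text.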
The quantity l given with each observed B fixes the position of the satellite in its orbit at the time of observation, and is analogous to the variable angle θ in a system of polar coordinates.

  l       B     Residual       l       B     Residual
 10°    10.82    -0.28       200°    10.66    +0.28
 70°    11.81    +0.22       230°     9.87    -0.21
110°    11.69    -0.02       270°    10.43    +0.21
140°    11.42    -0.03       310°    10.48    -0.15

B is here seen to run through a complete cycle of values between the limits 9.87 and 11.81, while l varies from 0° to 360°. We shall therefore endeavor to represent B as a periodic function of l whose period is 360°. In accordance with this assumption we put

m = 360°/2π,   whence x/m = l,   and y = B

and, taking into account the first five terms of the series, the several observations furnish the following

OBSERVATION EQUATIONS.

10.82 = a0 + 0.98 a1 + 0.17 b1 + 0.94 a2 + 0.34 b2
11.81 = a0 + 0.34 a1 + 0.94 b1 - 0.77 a2 + 0.64 b2
11.69 = a0 - 0.34 a1 + 0.94 b1 - 0.77 a2 - 0.64 b2
11.42 = a0 - 0.77 a1 + 0.64 b1 + 0.17 a2 - 0.98 b2
10.66 = a0 - 0.94 a1 - 0.34 b1 + 0.77 a2 + 0.64 b2
 9.87 = a0 - 0.64 a1 - 0.77 b1 - 0.17 a2 + 0.98 b2
10.43 = a0 + 0.00 a1 - 1.00 b1 - 1.00 a2 + 0.00 b2
10.48 = a0 + 0.64 a1 - 0.77 b1 - 0.17 a2 - 0.98 b2

The solution of these equations will be found in the following section. The values obtained for the constants are

a0 = 10.92,   a1 = +0.15,   b1 = +0.74,   a2 = -0.04,   b2 = -0.18

Introducing the constants n, N, and determining their values from the relations

n0 = 10.92        n1 cos N1 = +0.15        n2 cos N2 = -0.04
                  n1 sin N1 = +0.74        n2 sin N2 = -0.18

the equation becomes

B = 10.92 + 0.76 cos (l - 78°.5) - 0.18 cos (2l - 77°.5)

The residuals obtained by comparing the values of B computed from this formula with the observed values are given above with the data. Abundant data for exercise in deriving empirical formulae of this kind may be found in the United States Coast and Geodetic Survey Report for 1882, pp. 218-257.

§ 16. Approximate Solutions.
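For comparison, the same eight observation equations can be subjected to a full least-squares treatment. The Python sketch below forms and solves the normal equations; since the constants quoted in the text were obtained by the approximate method of § 16, the values found here need only agree with them closely, not exactly.

```python
import math

def solve(A, v):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Iapetus data from the text: orbital angle l (degrees) and magnitude B.
l_deg = [10, 70, 110, 140, 200, 230, 270, 310]
B = [10.82, 11.81, 11.69, 11.42, 10.66, 9.87, 10.43, 10.48]

# One row per observation equation:
#   B = a0 + a1 cos l + b1 sin l + a2 cos 2l + b2 sin 2l
rows = []
for l in l_deg:
    t = math.radians(l)
    rows.append([1.0, math.cos(t), math.sin(t),
                 math.cos(2 * t), math.sin(2 * t)])

# Normal equations (R^T R) x = R^T B.
m = 5
N = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
rhs = [sum(r[i] * b for r, b in zip(rows, B)) for i in range(m)]
a0, a1, b1, a2, b2 = solve(N, rhs)

resid = [b - sum(c * u for c, u in zip(r, (a0, a1, b1, a2, b2)))
         for r, b in zip(rows, B)]
```

The residuals come out comparable in size to those tabulated with the data above.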
It is often desirable to obtain from a series of observations, as rapidly and with as little labor as possible, a set of values of the unknown quantities involved which shall be fair approximations to their most probable values, but in which the highest degree of accuracy is not required. In cases of this kind the least square treatment of the observation equations as illustrated in § 10 is too long and laborious, and the following method may often be substituted for it with advantage. Let there be any number, e.g. three, of unknown quantities involved in a set of observation equations of the form

a x + b y + c z + n = 0        Weight = p

and let each of these equations be multiplied by its weight, giving the group of equations,

a1 x + b1 y + c1 z + n1 = 0        k1
a2 x + b2 y + c2 z + n2 = 0        k2     (40)
etc.                               etc.

Multiply each of these equations by the undetermined constant k placed opposite it, and let the sum of all the resulting equations be formed. By the use of the summation symbol, [ ], this sum may be written

[ka] x + [kb] y + [kc] z + [kn] = 0

Since the several values of k which enter into this equation are entirely arbitrary, it would be permissible to assign to them such values that [kb] and [kc] should each equal 0, which would give at once

x = - [kn] / [ka]   (41)

This, however, is not practically advantageous on account of the labor involved in determining the values of k. We therefore put

x = - [kn]/[ka] - ([kb]/[ka]) y - ([kc]/[ka]) z   (42)

and, limiting the values of k to +1 and -1, assign them in such a manner that [ka] shall be made as great, and [kb], [kc] as small, as possible. In this manner the coefficients of y and z may often be made so small that if approximate values of y and z are substituted in equation (42), they will furnish a sufficiently accurate value of x; since the effect of the errors of these approximations will be much diminished by the small coefficients by which they are multiplied.
The value of y may be found in the same manner by selecting a set of k's which shall make [kb] large and [ka], [kc] small, and similarly for z. Two or three trials may be required before sufficiently close approximations to the values of x, y, z are obtained, but these trials are rapidly and easily made; and, if necessary, in exceptional cases the summation equations may be written in the form

[k' a] x + [k' b] y + [k' c] z + [k' n] = 0
[k'' a] x + [k'' b] y + [k'' c] z + [k'' n] = 0     (43)
[k''' a] x + [k''' b] y + [k''' c] z + [k''' n] = 0

and the equations solved by any of the methods of elementary algebra; but in every case the values of k, +1 and -1, should be so chosen that in the first equation the coefficient of x, in the second equation the coefficient of y, and in the third equation the coefficient of z, shall be made as large, and all of the other coefficients as small, as possible. By this mode of solution each observation with its proper weight is included in the determination of the unknowns, but since the principle of least squares has not been taken into account, it cannot be expected that the resulting values will be the best that the observations can be made to yield. To illustrate the mode of solution we recur to the observation equations contained in the preceding section, and, putting a0 = 10.00 + a, write them as follows, placing the several values of k at the right of each equation.
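The device of choosing k = ±1 so as to magnify one coefficient sum while suppressing the others can be mechanized. In the Python sketch below the observation equations are invented and exactly consistent, so solving the three summation equations of the form (43) recovers the assumed values exactly; with real, discordant observations the result would only be approximate.

```python
def solve(A, v):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Invented, exactly consistent observation equations a x + b y + c z + n = 0,
# generated from an assumed "true" solution.
X, Y, Z = 1.5, -2.0, 0.25
coeffs = [(0.9, 0.2, -0.5), (-0.4, 1.0, 0.3), (0.7, -0.8, 0.1),
          (0.2, 0.6, 0.9), (-1.0, -0.3, 0.4), (0.5, 0.5, -0.7)]
eqs = [(a, b, c, -(a * X + b * Y + c * Z)) for a, b, c in coeffs]

def summation(eqs, col):
    """Multiply each equation by k = +1 or -1, the sign of its coefficient
    in the chosen column, so that column's sum is made large; then add."""
    ks = [1 if e[col] >= 0 else -1 for e in eqs]
    return [sum(k * e[j] for k, e in zip(ks, eqs)) for j in range(4)]

# One summation equation per unknown, as in (43), then an ordinary solve.
S = [summation(eqs, col) for col in range(3)]
x, y, z = solve([row[:3] for row in S], [-row[3] for row in S])
```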
          cos l      sin l      cos 2l     sin 2l          k'  k''  k''' k^iv k^v
 0.82 = a + 0.98 a1 + 0.17 b1 + 0.94 a2 + 0.34 b2         +1  +1  +1  +1  +1
 1.81 = a + 0.34 a1 + 0.94 b1 - 0.77 a2 + 0.64 b2         +1  +1  +1  -1  +1
 1.69 = a - 0.34 a1 + 0.94 b1 - 0.77 a2 - 0.64 b2         +1  -1  +1  -1  -1
 1.42 = a - 0.77 a1 + 0.64 b1 + 0.17 a2 - 0.98 b2         +1  -1  +1  +1  -1
 0.66 = a - 0.94 a1 - 0.34 b1 + 0.77 a2 + 0.64 b2         +1  -1  -1  +1  +1
-0.13 = a - 0.64 a1 - 0.77 b1 - 0.17 a2 + 0.98 b2         +1  -1  -1  -1  +1
 0.43 = a + 0.00 a1 - 1.00 b1 - 1.00 a2 + 0.00 b2         +1  +1  -1  -1  -1
 0.48 = a + 0.64 a1 - 0.77 b1 - 0.17 a2 - 0.98 b2         +1  +1  -1  +1  -1

The summation equations obtained from this group are:

+7.18 = 8a - 0.73 a1 - 0.19 b1 - 1.00 a2 + 0.00 b2
-0.10 = 0a + 4.65 a1 - 1.13 b1 - 1.00 a2 + 0.00 b2
+4.30 = 0a + 1.15 a1 + 5.57 b1 + 0.14 a2 + 0.00 b2     (44)
-0.42 = 0a + 0.53 a1 - 0.41 b1 + 4.42 a2 + 0.00 b2
-0.86 = 0a + 0.21 a1 + 0.19 b1 + 2.54 a2 + 5.20 b2

and these correspond to the normal equations of a least square solution. To apply the method of approximations to the solution of this group of equations, we write them in the form:

a  = +0.897 + 0.091 a1 + 0.024 b1 + 0.125 a2
a1 = -0.022 + 0.243 b1 + 0.215 a2
b1 = +0.772 - 0.206 a1 - 0.025 a2     (45)
a2 = -0.095 - 0.120 a1 + 0.093 b1
b2 = -0.166 - 0.040 a1 - 0.036 b1 - 0.488 a2

The divisions required in making this transformation were made by the use of Crelle's Rechentafeln. By operations which can be performed mentally, we obtain the following sets of approximations to the values of the unknowns:

        I.      II.     III.
a1     0.0    +0.2    +0.15
b1    +0.8    +0.7    +0.74
a2    -0.1    -0.0    -0.04

and substituting in equations (45) the values given under III. we find the adopted values

a0 = 10 + a = 10.92     a1 = +0.15     a2 = -0.04
b1 = +0.74              b2 = -0.18

which were employed in the preceding section.
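The "operations which can be performed mentally" amount to a fixed-point iteration on equations (45). The Python sketch below repeats that iteration from the same rough starting values; the coefficients are those of (45), and the results agree with the adopted values to within the rounding of the text.

```python
def step(a1, b1, a2):
    """One sweep of equations (45) for the three coupled unknowns."""
    a1n = -0.022 + 0.243 * b1 + 0.215 * a2
    b1n = +0.772 - 0.206 * a1 - 0.025 * a2
    a2n = -0.095 - 0.120 * a1 + 0.093 * b1
    return a1n, b1n, a2n

a1, b1, a2 = 0.0, +0.8, -0.1      # first rough approximations
for _ in range(20):               # converges after only a few sweeps
    a1, b1, a2 = step(a1, b1, a2)

# The remaining unknowns follow directly from the last two of (45).
b2 = -0.166 - 0.040 * a1 - 0.036 * b1 - 0.488 * a2
a0 = 10.0 + (+0.897 + 0.091 * a1 + 0.024 * b1 + 0.125 * a2)
```

The iteration converges because in each equation of (45) the coefficients of the other unknowns are small, which is precisely what the choice of the k's was designed to secure.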
INDEX TO FORMULAE.

y = (h/√π) e^(-h²x²)   §§ 3, 4
[aa] x + [ab] y + [ac] z + [an] = 0   § 7
[as] = [aa] + [ab] + [ac] + ... + [an]   § 9
[bb.1] = [bb] - ([ab]/[aa]) [ab]     [cn.2] = [cn.1] - ([bc.1]/[bb.1]) [bn.1]   § 9
r = 0.6745 √([vv]/(n - 1)) = 0.8453 [v]/√(n(n - 1))   § 12
r0 = r/√n     r0 = r/√[p]   § 13
p1 : p2 = r2² : r1²   § 14
y = a + bx + cx² + etc.   § 15
y = n0 + n1 cos (x/m - N1) + n2 cos (2x/m - N2) + etc.   § 15
x = - ([kn] + [kb] y + [kc] z)/[ka],   k = ±1   § 16