Cornell University 
 Library 
 
 The original of this book is in
 the Cornell University Library.
 
 There are no known copyright restrictions in 
 the United States on the use of the text. 
 
 http://www.archive.org/details/cu31924014110435 
 
BY THE SAME AUTHOR 
 
 INDIAN CURRENCY AND FINANCE 
 
 8vo. Pp. viii + 263. 1913. 
 7s. 6d. net. 
 
 THE ECONOMIC CONSEQUENCES 
 OF THE PEACE 
 
 8vo. Pp. vii + 279. 1919.
 8s. 6d. net. 
 
A TREATISE ON PROBABILITY 
 
MACMILLAN AND CO., Limited 
 
 LONDON • BOMBAY • CALCUTTA • MADRAS
 
 MELBOURNE 
 
 THE MACMILLAN COMPANY 
 
 NEW YORK • BOSTON • CHICAGO
 DALLAS • SAN FRANCISCO
 
 THE MACMILLAN CO. OF CANADA, Ltd. 
 TORONTO 
 
A TREATISE 
 ON PROBABILITY 
 
 BY 
 
 JOHN MAYNARD KEYNES 
 
 FELLOW OF KING'S COLLEGE, CAMBRIDGE
 
 MACMILLAN AND CO., LIMITED 
 
 ST. MARTIN'S STREET, LONDON 
 
 1921 
 
COPYRIGHT 
 
PREFACE 
 
 The subject matter of this book was first broached in the brain 
 of Leibniz, who, in the dissertation, written in his twenty-third 
 year, on the mode of electing the kings of Poland, conceived 
 of Probability as a branch of Logic. A few years before, "un
 problème," in the words of Poisson, "proposé à un austère
 janséniste par un homme du monde, a été l'origine du calcul
 des probabilités." In the intervening centuries the algebraical
 exercises, in which the Chevalier de la Méré interested Pascal,
 have so far predominated in the learned world over the pro- 
 founder enquiries of the philosopher into those processes of 
 human faculty which, by determining reasonable preference, 
 guide our choice, that Probability is oftener reckoned with Mathe- 
 matics than with Logic. There is much here, therefore, which is 
 novel, and, being novel, unsifted, inaccurate, or deficient. I 
 propound my systematic conception of this subject for criticism 
 and enlargement at the hand of others, doubtful whether I 
 myself am likely to get much further, by waiting longer, 
 with a work, which, beginning as a Fellowship Dissertation, 
 and interrupted by the war, has already extended over 
 many years. 
 
 It may be perceived that I have been much influenced by 
 W. E. Johnson, G. E. Moore, and Bertrand Russell, that is
 to say, by Cambridge, which, with great debts to the writers 
 of Continental Europe, yet continues in direct succession 
 the English tradition of Locke and Berkeley and Hume, of 
 Mill and Sidgwick, who, in spite of their divergences of
 doctrine, are united in a preference for what is matter of 
 fact, and have conceived their subject as a branch rather of 
 science than of the creative imagination, prose writers, hoping 
 to be understood. 
 
 J. M. KEYNES.

 King's College, Cambridge,
 May 1, 1920. 
 
CONTENTS

 PART I

 FUNDAMENTAL IDEAS

 CHAPTER I
 The Meaning of Probability

 CHAPTER II
 Probability in Relation to the Theory of Knowledge

 CHAPTER III
 The Measurement of Probabilities

 CHAPTER IV
 The Principle of Indifference

 CHAPTER V
 Other Methods of Determining Probabilities

 CHAPTER VI
 The Weight of Arguments

 CHAPTER VII
 Historical Retrospect

 CHAPTER VIII
 The Frequency Theory of Probability

 CHAPTER IX
 The Constructive Theory of Part I. summarised

 PART II

 FUNDAMENTAL THEOREMS

 CHAPTER X
 Introductory

 CHAPTER XI
 The Theory of Groups, with special reference to Logical
 Consistence, Inference, and Logical Priority

 CHAPTER XII
 The Definitions and Axioms of Inference and Probability

 CHAPTER XIII
 The Fundamental Theorems of Necessary Inference

 CHAPTER XIV
 The Fundamental Theorems of Probable Inference

 CHAPTER XV
 Numerical Measurement and Approximation of Probabilities

 CHAPTER XVI
 Observations on the Theorems of Chapter XIV., and
 their Developments, including Testimony

 CHAPTER XVII
 Some Problems in Inverse Probability, including Averages

 PART III

 INDUCTION AND ANALOGY

 CHAPTER XVIII
 Introduction

 CHAPTER XIX
 The Nature of Argument by Analogy

 CHAPTER XX
 The Value of Multiplication of Instances, or Pure Induction

 CHAPTER XXI
 The Nature of Inductive Argument continued

 CHAPTER XXII
 The Justification of these Methods

 CHAPTER XXIII
 Some Historical Notes on Induction

 Notes on Part III.

 PART IV

 SOME PHILOSOPHICAL APPLICATIONS OF PROBABILITY

 CHAPTER XXIV
 The Meanings of Objective Chance, and of Randomness

 CHAPTER XXV
 Some Problems arising out of the Discussion of Chance

 CHAPTER XXVI
 The Application of Probability to Conduct

 PART V

 THE FOUNDATIONS OF STATISTICAL INFERENCE

 CHAPTER XXVII
 The Nature of Statistical Inference

 CHAPTER XXVIII
 The Law of Great Numbers

 CHAPTER XXIX
 The Use of à priori Probabilities for the Prediction of
 Statistical Frequency: the Theorems of Bernoulli,
 Poisson, and Tchebycheff

 CHAPTER XXX
 The Mathematical use of Statistical Frequencies for the
 Determination of Probability à posteriori: the Methods
 of Laplace

 CHAPTER XXXI
 The Inversion of Bernoulli's Theorem

 CHAPTER XXXII
 The Inductive Use of Statistical Frequencies for the
 Determination of Probability à posteriori: the Methods
 of Lexis

 CHAPTER XXXIII
 Outline of a Constructive Theory

 BIBLIOGRAPHY

 INDEX
 
PART I 
 FUNDAMENTAL IDEAS 
 
CHAPTER I

 THE MEANING OF PROBABILITY

 "J'ai dit plus d'une fois qu'il faudrait une nouvelle espèce de logique, qui
 traiteroit des degrés de Probabilité." (Leibniz.)
 
 1. Part of our knowledge we obtain direct ; and part by 
 argument. The Theory of Probability is concerned with that 
 part which we obtain by argument, and it treats of the different
 degrees in which the results so obtained are conclusive or in- 
 conclusive. 
 
 In most branches of academic logic, such as the theory of the 
 syllogism or the geometry of ideal space, all the arguments aim 
 at demonstrative certainty. They claim to be conclusive. But 
 many other arguments are rational and claim some weight with- 
 out pretending to be certain. In Metaphysics, in Science, and in 
 Conduct, most of the arguments, upon which we habitually base 
 our rational beliefs, are admitted to be inconclusive in a greater 
 or less degree. Thus for a philosophical treatment of these 
 branches of knowledge, the study of probability is required. 
 
 The course which the history of thought has led Logic to follow 
 has encouraged the view that doubtful arguments are not within 
 its scope. But in the actual exercise of reason we do not wait 
 on certainty, or deem it irrational to depend on a doubtful 
 argument. If logic investigates the general principles of valid 
 thought, the study of arguments, to which it is rational to attach 
 some weight, is as much a part of it as the study of those which 
 are demonstrative. 
 
 2. The terms certain and probable describe the various degrees 
 of rational belief about a proposition which different amounts of 
 knowledge authorise us to entertain. All propositions are true 
 or false, but the knowledge we have of them depends on our 
 circumstances; and while it is often convenient to speak of
 propositions as certain or probable, this expresses strictly a 
 relationship in which they stand to a corpus of knowledge, actual or 
 hypothetical, and not a characteristic of the propositions in them- 
 selves. A proposition is capable at the same time of varying degrees 
 of this relationship, depending upon the knowledge to which it is 
 related, so that it is without significance to call a proposition prob- 
 able unless we specify the knowledge to which we are relating it. 
 
 To this extent, therefore, probability may be called
 subjective. But in the sense important to logic, probability is not
 subjective. It is not, that is to say, subject to human caprice. 
 A proposition is not probable because we think it so. When once 
 the facts are given which determine our knowledge, what is
 probable or improbable in these circumstances has been fixed 
 objectively, and is independent of our opinion. The Theory of 
 Probability is logical, therefore, because it is concerned with the 
 degree of belief which it is rational to entertain in given conditions, 
 and not merely with the actual beliefs of particular individuals, 
 which may or may not be rational. 
 
 Given the body of direct knowledge which constitutes our
 ultimate premisses, this theory tells us what further rational
 beliefs, certain or probable, can be derived by valid argument 
 from our direct knowledge. This involves purely logical relations
 between the propositions which embody our direct knowledge
 and the propositions about which we seek indirect knowledge.
 What particular propositions we select as the premisses
 of our argument naturally depends on subjective factors peculiar
 to ourselves; but the relations, in which other propositions stand
 to these, and which entitle us to probable beliefs, are objective 
 and logical. 
 
 3. Let our premisses consist of any set of propositions h, and
 our conclusion consist of any set of propositions a, then, if a
 knowledge of h justifies a rational belief in a of degree α, we say
 that there is a probability-relation of degree α between a and h.¹
 
 In ordinary speech we often describe the conclusion as being 
 doubtful, uncertain, or only probable. But, strictly, these terms 
 ought to be applied, either to the degree of our rational belief in 
 the conclusion, or to the relation or argument between two sets 
 of propositions, knowledge of which would afford grounds for a
 corresponding degree of rational belief.²
 
 ¹ This will be written a/h = α.
 ² See also Chapter II. § 5.
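
 The notation of § 3 can be set out as a short worked formula. The first display is the notation given in the footnote; the second is only an assumed modern analogue, the conditional-probability form, which presupposes that the degree α is a number, something the text has not asserted at this point.

```latex
% h      : the premisses (a set of propositions)
% a      : the conclusion (a set of propositions)
% \alpha : the degree of rational belief in a which a knowledge of h justifies
\[
  a/h = \alpha
\]
% Assumed modern analogue (not notation from the text):
\[
  P(a \mid h) = \alpha
\]
```

 On either reading the degree α belongs to the pair (a, h) and not to a alone, which is the point developed in the sections that follow.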
 
 4. With the term "event," which has taken hitherto so important
 a place in the phraseology of the subject, I shall dispense
 altogether.¹ Writers on Probability have generally dealt
 with what they term the "happening" of "events." In the
 problems which they first studied this did not involve much
 departure from common usage. But these expressions are now
 used in a way which is vague and ambiguous; and it will be
 more than a verbal improvement to discuss the truth and the
 probability of propositions instead of the occurrence and the
 probability of events.²
 
 5. These general ideas are not likely to provoke much 
 criticism. In the ordinary course of thought and argument, 
 we are constantly assuming that knowledge of one statement, 
 while not proving the truth of a second, yields nevertheless
 some ground for believing it. We assert that we ought on the 
 evidence to prefer such and such a belief. We claim rational 
 grounds for assertions which are not conclusively demonstrated. 
 We allow, in fact, that statements may be unproved, without, for
 that reason, being unfounded. And it does not seem on reflection 
 that the information we convey by these expressions is wholly 
 subjective. When we argue that Darwin gives valid grounds
 for our accepting his theory of natural selection, we do not simply
 mean that we are psychologically inclined to agree with him;
 it is certain that we also intend to convey our belief that
 we are acting rationally in regarding his theory as probable.
 We believe that there is some real objective relation
 between Darwin's evidence and his conclusions, which is independent
 of the mere fact of our belief, and which is just as real
 and objective, though of a different degree, as that which would 
 exist if the argument were as demonstrative as a syllogism. 
 We are claiming, in fact, to cognise correctly a logical connection 
 between one set of propositions which we call our evidence and 
 which we suppose ourselves to know, and another set which we 
 call our conclusions, and to which we attach more or less weight
 according to the grounds supplied by the first. It is this type
 of objective relation between sets of propositions (the type
 which we claim to be correctly perceiving when we make such
 assertions as these) to which the reader's attention must be
 directed.

 ¹ Except in those chapters (Chap. XVII., for example) where I am dealing
 chiefly with the work of others.

 ² The first writer I know of to notice this was Ancillon in Doutes sur les
 bases du calcul des probabilités (1794): "Dire qu'un fait passé, présent ou à
 venir est probable, c'est dire qu'une proposition est probable." The point was
 emphasised by Boole, Laws of Thought, pp. 7 and 167. See also Czuber,
 Wahrscheinlichkeitsrechnung, vol. i. p. 5, and Stumpf, Über den Begriff der
 mathematischen Wahrscheinlichkeit.
 
 6. It is not straining the use of words to speak of this as the
 relation of probability. It is true that mathematicians have
 employed the term in a narrower sense ; for they have often 
 confined it to the limited class of instances in which the relation 
 is adapted to an algebraical treatment. But in common usage 
 the word has never received this limitation. 
 
 Students of probability in the sense which is meant by the 
 authors of typical treatises on Wahrscheinlichkeitsrechnung or
 Calcul des probabilités, will find that I do eventually reach topics
 with which they are familiar. But in making a serious attempt
 to deal with the fundamental difficulties with which all students
 of mathematical probabilities have met and which are notoriously
 unsolved, we must begin at the beginning (or almost at the
 beginning) and treat our subject widely. As soon as mathe- 
 matical probability ceases to be the merest algebra or pretends 
 to guide our decisions, it immediately meets with problems 
 against which its own weapons are quite powerless. And even 
 if we wish later on to use probability in a narrow sense, it will 
 be well to know first what it means in the widest. 
 
 7. Between two sets of propositions, therefore, there exists 
 a relation, in virtue of which, if we know the first, we can attach 
 to the latter some degree of rational belief. This relation is the 
 subject-matter of the logic of probability. 
 
 A great deal of confusion and error has arisen out of a
 failure to take due account of this relational aspect of probability.
 From the premisses "a implies b" and "a is true," we
 can conclude something about b, namely that b is true, which
 does not involve a. But, if a is so related to b, that a knowledge
 of it renders a probable belief in b rational, we cannot conclude
 anything whatever about b which has not reference to a; and it
 is not true that every set of self-consistent premisses which
 includes a has this same relation to b. It is as useless, therefore,
 to say "b is probable" as it would be to say "b is equal,"
 or "b is greater than," and as unwarranted to conclude that,
 because a makes b probable, therefore a and c together make b
 probable, as to argue that because a is less than b, therefore a
 and c together are less than b.
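
 The claim that a may make b probable while a and c together do not can be illustrated with a small numerical sketch. It assumes, purely for illustration, that degrees of probability can be represented as proportions among six equally weighted cases; the text makes no such assumption at this stage, and the names WORLDS and probability are hypothetical.

```python
from fractions import Fraction

# Hypothetical toy model: six equally weighted cases, each recording the
# truth values of three propositions (a, b, c). The cases and their equal
# weighting are illustrative assumptions, not taken from the text.
WORLDS = [
    (True, True, False),
    (True, True, False),
    (True, True, False),
    (True, False, True),
    (False, True, True),
    (False, False, False),
]


def probability(conclusion, evidence):
    """Proportion of the cases admitted by `evidence` in which `conclusion` holds."""
    admitted = [w for w in WORLDS if evidence(w)]
    if not admitted:
        raise ValueError("the evidence is satisfied by no case")
    favourable = [w for w in admitted if conclusion(w)]
    return Fraction(len(favourable), len(admitted))


def a(w):
    return w[0]


def b(w):
    return w[1]


def a_and_c(w):
    return w[0] and w[2]


# The same conclusion b, related to two different bodies of evidence.
b_given_a = probability(b, a)
b_given_a_and_c = probability(b, a_and_c)
```

 Under these assumed cases b has probability 3/4 relative to a alone and 0 relative to a and c together, so the added premiss c overturns the support which a gave to b.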
 
 Thus, when in ordinary speech we name some opinion as 
 probable without further qualification, the phrase is generally 
 elliptical. We mean that it is probable when certain considerations,
 implicitly or explicitly present to our minds at the moment,
 are taken into account. We use the word for the sake of short- 
 ness, just as we speak of a place as being three miles distant, 
 when we mean three miles distant from where we are then situated, 
 or from some starting-point to which we tacitly refer. No 
 proposition is in itself either probable or improbable, just as no 
 place can be intrinsically distant; and the probability of the
 same statement varies with the evidence presented, which is, 
 as it were, its origin of reference. We may fix our attention 
 on our own knowledge and, treating this as our origin, consider 
 the probabilities of all other suppositions, according to the
 usual practice which leads to the elliptical form of common
 speech ; or we may, equally well, fix it on a proposed conclusion 
 and consider what degree of probability this would derive from 
 various sets of assumptions, which might constitute the corpus of 
 knowledge of ourselves or others, or which are merely 
 hypotheses. 
 
 Reflection will show that this account harmonises with
 familiar experience. There is nothing novel in the supposition 
 that the probability of a theory turns upon the evidence by which 
 it is supported ; and it is common to assert that an opinion was 
 probable on the evidence at first to hand, but on further informa- 
 tion was untenable. As our knowledge or our hypothesis changes, 
 our conclusions have new probabilities, not in themselves, but 
 relatively to these new premisses. New logical relations have 
 now become important, namely those between the conclusions 
 which we are investigating and our new assumptions ; but the 
 old relations between the conclusions and the former assumptions 
 still exist and are just as real as these new ones. It would be 
 as absurd to deny that an opinion was probable, when at a later 
 stage certain objections have come to light, as to deny, when 
 we have reached our destination, that it was ever three miles
 distant; and the opinion still is probable in relation to the old
 hypotheses, just as the destination is still three miles distant
 from our starting-point.

 8. A definition of probability is not possible, unless it contents
 us to define degrees of the probability-relation by reference to 
 degrees of rational belief. We cannot analyse the probability-relation
 in terms of simpler ideas. As soon as we have passed
 from the logic of implication and the categories of truth and 
 falsehood to the logic of probability and the categories of know- 
 ledge, ignorance, and rational belief, we are paying attention to 
 a new logical relation in which, although it is logical, we were 
 not previously interested, and which cannot be explained or
 defined in terms of our previous notions. 
 
 This opinion is, from the nature of the case, incapable of positive
 proof. The presumption in its favour must arise partly
 out of our failure to find a definition, and partly because the 
 notion presents itself to the mind as something new and inde- 
 pendent. If the statement that an opinion was probable on the 
 evidence at first to hand, but became untenable on further in- 
 formation, is not solely concerned with psychological belief, I 
 do not know how the element of logical doubt is to be defined, 
 or how its substance is to be stated, in terms of the other 
 indefinables of formal logic. The attempts at definition, which 
 have been made hitherto, will be criticised in later chapters.
 I do not believe that any of them accurately represent that particular
 logical relation which we have in our minds when we
 speak of the probability of an argument. 
 
 In the great majority of cases the term " probable " seems to 
 be used consistently by different persons to describe the same 
 concept. Differences of opinion have not been due, I think, to 
 a radical ambiguity of language. In any case a desire to reduce 
 the indefinables of logic can easily be carried too far. Even if 
 a definition is discoverable in the end, there is no harm in postponing
 it until our enquiry into the object of definition is far
 advanced. In the case of "probability" the object before the
 mind is so familiar that the danger of misdescribing its qualities
 through lack of a definition is less than if it were a highly abstract 
 entity far removed from the normal channels of thought. 
 
 9. This chapter has served briefly to indicate, though not 
 to define, the subject matter of the book. Its object has 
 been to emphasise the existence of a logical relation between two 
 sets of propositions in cases where it is not possible to argue 
 demonstratively from one to the other. This is a contention
 of a most fundamental character. It is not entirely novel, but 
 has seldom received due emphasis, is often overlooked, and 
 sometimes denied. The view, that probability arises out of 
 the existence of a specific relation between premiss and conclusion, 
 depends for its acceptance upon a reflective judgment on the 
 true character of the concept. It will be our object to discuss, 
 under the title of Probability, the principal properties of this
 relation. First, however, we must digress in order to consider 
 briefly what we mean by knowledge, rational belief, and argument. 
 
CHAPTER II 
 
 PROBABILITY IN RELATION TO THE THEORY OF KNOWLEDGE

 1. I do not wish to become involved in questions of epistemology
 to which I do not know the answer ; and I am anxious to reach 
 as soon as possible the particular part of philosophy or logic 
 which is the subject of this book. But some explanation is 
 necessary if the reader is to be put in a position to understand 
 the point of view from which the author sets out; I will, therefore,
 expand some part of what has been outlined or assumed
 in the first chapter. 
 
 2. There is, first of all, the distinction between that part of 
 our belief which is rational and that part which is not. If a 
 man believes something for a reason which is preposterous or 
 for no reason at all, and what he believes turns out to be true for
 some reason not known to him, he cannot be said to believe it 
 rationally, although he believes it and it is in fact true. On the 
 other hand, a man may rationally believe a proposition to be 
 probable, when it is in fact false. The distinction between 
 rational belief and mere belief, therefore, is not the same as the
 distinction between true beliefs and false beliefs. The highest 
 degree of rational belief, which is termed certain rational belief, 
 corresponds to knowledge. We may be said to know a thing 
 when we have a certain rational belief in it, and vice versa. For 
 reasons which will appear from our account of probable degrees 
 of rational belief in the following paragraph, it is preferable to 
 regard knowledge as fundamental and to define rational belief by 
 reference to it. 
 
 3. We come next to the distinction between that part of our 
 rational belief which is certain and that part which is only 
 probable. Belief, whether rational or not, is capable of degree. 
 The highest degree of rational belief, or rational certainty of
 belief, and its relation to knowledge have been introduced above.
 What, however, is the relation to knowledge of probable degrees
 of rational belief?
 
 The proposition (say, q) that we know in this case is not the
 same as the proposition (say, p) in which we have a probable
 degree (say, α) of rational belief. If the evidence upon which
 we base our belief is h, then what we know, namely q, is that
 the proposition p bears the probability-relation of degree α to
 the set of propositions h; and this knowledge of ours justifies
 us in a rational belief of degree α in the proposition p. It will
 be convenient to call propositions such as p, which do not contain 
 assertions about probability-relations, "primary propositions";
 and propositions such as q, which assert the existence of a
 probability-relation, "secondary propositions."¹
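
 The distinction can be set out compactly in the notation of Chapter I., in which a/h = α records a probability-relation of degree α. The display below is only a sketch of the paragraph above; the colon, read as "is the proposition that," is a convention assumed here and not the author's own.

```latex
% p      : a primary proposition (contains no assertion about probability-relations)
% h      : the evidence, a set of propositions
% \alpha : the degree of the probability-relation between p and h
% q      : the secondary proposition, asserting that this relation holds
\[
  q \;:\; p/h = \alpha
\]
% What is known is q; what is believed, to degree \alpha, is p.
```

 Knowledge attaches to q, while rational belief of degree α attaches to p, so that p may be false although q is true and known.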
 
 4. Thus knowledge of a proposition always corresponds to 
 certainty of rational belief in it and at the same time to actual 
 truth in the proposition itself. We cannot know a proposition 
 unless it is in fact true. A probable degree of rational belief 
 in a proposition, on the other hand, arises out of knowledge of 
 some corresponding secondary proposition. A man may ration- 
 ally believe a proposition to be probable when it is in fact false, 
 if the secondary proposition on which he depends is true and 
 certain ; while a man cannot rationally believe a proposition 
 to be probable even when it is in fact true, if the secondary 
 proposition on which he depends is not true. Thus rational 
 belief of whatever degree can only arise out of knowledge, 
 although the knowledge may be of a proposition secondary, in 
 the above sense, to the proposition in which the rational degree 
 of belief is entertained.
 
 5. At this point it is desirable to colligate the three senses
 in which the term probability has been so far employed. In its
 most fundamental sense, I think, it refers to the logical relation
 between two sets of propositions, which in § 4 of Chapter I. I
 have termed the probability-relation. It is with this that I shall
 be mainly concerned in the greater part of this Treatise. Derivative
 from this sense, we have the sense in which, as above, the
 term probable is applied to the degrees of rational belief arising
 out of knowledge of secondary propositions which assert the
 existence of probability-relations in the fundamental logical sense.
 Further it is often convenient, and not necessarily misleading,
 to apply the term probable to the proposition which is the object
 of the probable degree of rational belief, and which bears the
 probability-relation in question to the propositions comprising
 the evidence.

 ¹ This classification of "primary" and "secondary" propositions was
 suggested to me by Mr. W. E. Johnson.
 
 6. I turn now to the distinction between direct and indirect 
 knowledge, between that part of our rational belief which we
 know directly and that part which we know by argument. 
 
 We start from things, of various classes, with which we have, 
 what I choose to call without reference to other uses of this term, 
 direct acquaintance. Acquaintance with such things does not in 
 itself constitute knowledge, although knowledge arises out of 
 acquaintance with them. The most important classes of things 
 with which we have direct acquaintance are our own sensations, 
 which we may be said to experience, the ideas or meanings, about 
 which we have thoughts and which we may be said to understand,
 and facts or characteristics or relations of sense-data or meanings, 
 which we may be said to perceive ; — experience, understanding, 
 and perception being three forms of direct acquaintance. 
 
 The objects of knowledge and belief (as opposed to the
 objects of direct acquaintance which I term sensations, meanings,
 and perceptions) I shall term propositions.

 Now our knowledge of propositions seems to be obtained in
 two ways: directly, as the result of contemplating the objects
 of acquaintance; and indirectly, by argument, through perceiving
 the probability-relation of the proposition, about which we seek 
 knowledge, to other propositions. In the second case, at any 
 rate at first, what we know is not the proposition itself but a
 secondary proposition involving it. When we know a secondary 
 proposition involving the proposition p as subject, we may be 
 said to have indirect knowledge about p. 
 
 Indirect knowledge about p may in suitable conditions lead 
 to rational belief in p of an appropriate degree. If this degree 
 is that of certainty, then we have not merely indirect knowledge 
 about p, but indirect knowledge of p. 
 
 7. Let us take examples of direct knowledge. From acquaintance
 with a sensation of yellow I can pass directly to a
 knowledge of the proposition "I have a sensation of yellow."
 From acquaintance with a sensation of yellow and with the
 meanings of "yellow," "colour," "existence," I may be able
 to pass to a direct knowledge of the propositions " I understand 
 the meaning of yellow," " my sensation of yellow exists," " yellow 
 is a colour." Thus, by some mental process of which it is 
 difficult to give an account, we are able to pass from direct 
 acquaintance with things to a knowledge of propositions about 
 the things of which we have sensations or understand the 
 meaning. 
 
 Next, by the contemplation of propositions of which we have 
 direct knowledge, we are able to pass indirectly to knowledge of or 
 about other propositions. The mental process by which we pass 
 from direct knowledge to indirect knowledge is in some cases and 
 in some degree capable of analysis. We pass from a knowledge 
 of the proposition a to a knowledge about the proposition b by per- 
 ceiving a logical relation between them. With this logical rela- 
 tion we have direct acquaintance. The logic of knowledge is 
 mainly occupied with a study of the logical relations, direct 
 acquaintance with which permits direct knowledge of the 
 secondary proposition asserting the probability-relation, and so 
 to indirect knowledge about, and in some cases of, the primary
 proposition. 
 
 It is not always possible, however, to analyse the mental 
 process in the case of indirect knowledge, or to say by the per- 
 ception of what logical relation we have passed from the know- 
 ledge of one proposition to knowledge about another. But 
 although in some cases we seem to pass directly from one pro- 
 position to another, I am inclined to believe that in all legitimate 
 transitions of this kind some logical relation of the proper kind 
 must exist between the propositions, even when we are not 
 explicitly aware of it. In any case, whenever we pass to 
 knowledge about one proposition by the contemplation of it in 
 relation to another proposition of which we have knowledge — 
 even when the process is unanalysable — I call it an argument. 
 The knowledge, such as we have in ordinary thought by passing 
 from one proposition to another without being able to say what 
 logical relations, if any, we have perceived between them, may 
 be termed uncompleted knowledge. And knowledge, which 
 results from a distinct apprehension of the relevant logical 
 relations, may be termed knowledge proper. 
 
 8. In this way, therefore, I distinguish between direct and
 indirect knowledge, between that part of our rational belief which 
 is based on direct knowledge and that part which is based on 
 argument. About what kinds of things we are capable of know- 
 ing propositions directly, it is not easy to say. About our 
 own existence, our own sense-data, some logical ideas, and some 
 logical relations, it is usually agreed that we have direct know- 
 ledge. Of the law of gravity, of the appearance of the other 
 side of the moon, of the cure for phthisis, of the contents of 
 Bradshaw, it is usually agreed that we do not have direct knowledge.
 But many questions are in doubt. Of which logical
 ideas and relations we have direct acquaintance, as to whether
 we can ever know directly the existence of other people, and as
 to when we are knowing propositions about sense-data directly
 and when we are interpreting them, it is not possible to give
 a clear answer. Moreover, there is another and peculiar kind
 of derivative knowledge: by memory.
 
 At a given moment there is a great deal of our knowledge 
 which we know neither directly nor by argument; we remember
 it. We may remember it as knowledge, but forget how we originally
 knew it. What we once knew and now consciously remember,
 can fairly be called knowledge. But it is not easy to
 draw the line between conscious memory, unconscious memory 
 or habit, and pure instinct or irrational associations of ideas 
 (acquired or inherited)— rthe last of which cannot fairly be called 
 knowledge, for unhke the first two it did not even arise (in us at 
 least) out of knowledge. Especially in such a case as that of 
 what our eyes tell us, it is difficult to distinguish between the 
 different ways in which our behefs have arisen. We cannot 
 always tell, therefore, what is remembered knowledge and what is 
 not knowledge at all ; and when knowledge is remembered, we 
 do not always remember at the same time whether, originally, it 
 was direct or indirect. 
 
 Although it is with knowledge by argument that I shall be 
 mainly concerned in this book there is one kind of direct know- 
 ledge, namely of secondary propositions, with which I cannot 
 help but be involved. In the case of every argument, it is only 
 directly that we can know the secondary proposition which makes 
 the argument itself valid and rational. When we know something
 by argument this must be through direct acquaintance
 with some logical relation between the conclusion and the premiss. 
 
 In all knowledge, therefore, there is some direct element ; and 
 logic can never be made purely mechanical. All it can do is
 so to arrange the reasoning that the logical relations, which
 have to be perceived directly, are made explicit and are of a
 simple kind. 
 
 9. It must be added that the term certainty is sometimes used 
 in a merely psychological sense to describe a state of mind 
 without reference to the logical grounds of the belief. With 
 this sense I am not concerned. It is also used to describe the 
 highest degree of rational belief ; and this is the sense relevant 
 to our present purpose. The peculiarity of certainty is that 
 knowledge of a secondary proposition involving certainty, 
 together with knowledge of what stands in this secondary 
 proposition in the position of evidence, leads to knowledge of, 
 and not merely about, the corresponding primary proposition. 
 Knowledge, on the other hand, of a secondary proposition involving
 a degree of probability lower than certainty, together
 with knowledge of the premiss of the secondary proposition,
 leads only to a rational belief of the appropriate degree in the
 primary proposition. The knowledge present in this latter case 
 I have called knowledge about the primary proposition or con- 
 clusion of the argument, as distinct from knowledge of it. 
 
 Of probability we can say no more than that it is a lower degree 
 of rational belief than certainty ; and we may say, if we like,
 that it deals with degrees of certainty.^ Or we may make 
 probability the more fundamental of the two and regard certainty 
 as a special case of probability, as being, in fact, the maximum
 probability. Speaking somewhat loosely we may say that, if 
 our premisses make the conclusion certain, then it follows from 
 the premisses ; and if they make it very probable, then it very 
 nearly follows from them. 
 
 It is sometimes useful to use the term " impossibility " as 
 the negative correlative of " certainty," although the former 
 sometimes has a different set of associations. If a is certain, 
 then the contradictory of a is impossible. If a knowledge of a 
 makes b certain, then a knowledge of a makes the contradictory 
 
 ^ This view has often been taken, e.g., by Bernoulli and, incidentally, by
 Laplace ; also by Fries (see Czuber, Entwicklung, p. 12). The view, occasion- 
 ally held, that probability is concerned with degrees of truth, arises out of a 
 confusion between certainty and truth. Perhaps the Aristotelian doctrine 
 that future events are neither true nor false arose in this way. 
 
 of b impossible. Thus a proposition is impossible with respect 
 to a given premiss, if it is disproved by the premiss ; and the 
 relation of impossibility is the relation of minimum probability.^
 
 10. We have distinguished between rational belief and irrational
 belief and also between rational beliefs which are certain in degree 
 and those which are only probable. Knowledge has been 
 distinguished according as it is direct or indirect, according as it 
 is of primary or secondary propositions, and according as it is 
 of or merely about its object. 
 
 In order that we may have a rational belief in a proposition p 
 of the degree of certainty, it is necessary that one of two con- 
 ditions should be fulfilled — (i.) that we know p directly ; or (ii.) 
 that we know a set of propositions h, and also know some secondary 
 proposition q asserting a certainty-relation between p and h. 
 In the latter case h may include secondary as well as primary 
 propositions, but it is a necessary condition that all the pro- 
 positions h should be known. In order that we may have rational 
 belief in p of a lower degree of probability than certainty, it is
 necessary that we know a set of propositions h, and also know 
 some secondary proposition q asserting a probability-relation 
 between p and h. 
 
 In the above account one possibility has been ruled out. It 
 is assumed that we cannot have a rational belief in p of a degree
 less than certainty except through knowing a secondary proposition
 of the prescribed type. Such belief can only arise, that
 is to say, by means of the perception of some probability-relation.
 To employ a common use of terms (though one inconsistent with
 the use adopted above), I have assumed that all direct knowledge
 is certain. All knowledge, that is to say, which is obtained in a
 manner strictly direct by contemplation of the objects of acquaintance
 and without any admixture whatever of argument and the
 contemplation of the logical bearing of any other knowledge on
 this, corresponds to certain rational belief and not to a merely
 probable degree of rational belief. It is true that there do seem 
 to be degrees of knowledge and rational belief, when the source of 
 
 ^ Necessity and Impossibility, in the senses in which these terms are used
 in the theory of Modality, seem to correspond to the relations of Certainty and
 Impossibility in the theory of probability, the other modals, which comprise
 the intermediate degrees of possibility, corresponding to the intermediate
 degrees of probability. Almost up to the end of the seventeenth century
 the traditional treatment of modals is, in fact, a primitive attempt to bring
 the relations of probability within the scope of formal logic. 
 
 the belief is solely in acquaintance, as there are when its source 
 is in argument. But I think that this appearance arises partly 
 out of the difficulty of distinguishing direct from indirect know- 
 ledge, and partly out of a confusion between probable know- 
 ledge and vague knowledge. I cannot attempt here to analyse 
 the meaning of vague knowledge. It is certainly not the same 
 thing as knowledge proper, whether certain or probable, and 
 it does not seem likely that it is susceptible of strict logical 
 treatment. At any rate I do not know how to deal with it, 
 and in spite of its importance I will not complicate a difficult 
 subject by endeavouring to treat adequately the theory of vague 
 knowledge. 
 
 I assume then that only true propositions can be known, 
 that the term " probable knowledge " ought to be replaced by 
 the term " probable degree of rational belief," and that a probable 
 degree of rational belief cannot arise directly but only as the 
 result of an argument, out of the knowledge, that is to say, of 
 a secondary proposition asserting some logical probability- 
 relation in which the object of the belief stands to some known 
 proposition. With arguments, if they exist, the ultimate pre- 
 misses of which are known in some other manner than that 
 described above, such as might be called " probable knowledge," 
 my theory is not adequate to deal without modification.^ 
 
 For the objects of certain belief which is based on direct 
 knowledge, as opposed to certain belief arising indirectly, there 
 is a well-established expression ; propositions, in which our 
 rational belief is both certain and direct, are said to be 
 self-evident. 
 
 11. In conclusion, the relativity of knowledge to the individual 
 may be briefly touched on. Some part of knowledge — knowledge
 of our own existence or of our own sensations — is clearly relative
 to individual experience. We cannot speak of knowledge
 absolutely — only of the knowledge of a particular person. Other
 parts of knowledge — knowledge of the axioms of logic, for example
 — may seem more objective. But we must admit, I think,
 that this too is relative to the constitution of the human mind, 
 and that the constitution of the human mind may vary in some 
 degree from man to man. What is self-evident to me and what 
 
 ^ I do not mean to imply, however, at any rate at present, that the ultimate 
 premisses of an argument need always be primary propositions. 
 
 
 
 I really know, may be only a probable belief to you, or may form 
 no part of your rational beliefs at all. And this may be true
 not only of such things as my existence, but of some logical axioms
 also. Some men — indeed it is obviously the case — may have a
 greater power of logical intuition than others. Further, the
 difference between some kinds of propositions over which human
 intuition seems to have power, and some over which it has none,
 may depend wholly upon the constitution of our minds and
 have no significance for a perfectly objective logic. We can no 
 more assume that all true secondary propositions are or ought 
 to be universally known than that all true primary propositions 
 are known. The perceptions of some relations of probability 
 may be outside the powers of some or all of us. 
 
 What we know and what probability we can attribute to our 
 rational beliefs is, therefore, subjective in the sense of being 
 relative to the individual. But given the body of premisses which 
 our subjective powers and circumstances supply to us, and given 
 the kinds of logical relations, upon which arguments can be based 
 and which we have the capacity to perceive, the conclusions, 
 which it is rational for us to draw, stand to these premisses in an 
 objective and wholly logical relation. Our logic is concerned 
 with drawing conclusions by a series of steps of certain specified 
 kinds from a limited body of premisses. 
 
 With these brief indications as to the relation of Probability, 
 as I understand it, to the Theory of Knowledge, I pass from 
 problems of ultimate analysis and definition, which are not the 
 primary subject matter of this book, to the logical theory and 
 superstructure, which occupies an intermediate position between 
 the ultimate problems and the applications of the theory, whether 
 such applications take a generalised mathematical form or a 
 concrete and particular one. For this purpose it would only 
 encumber the exposition, without adding to its clearness or its 
 accuracy, if I were to employ the perfectly exact terminology 
 and minute refinements of language, which are necessary for the 
 avoidance of error in very fundamental enquiries. While taking
 pains, therefore, to avoid any divergence between the substance 
 of this chapter and of those which succeed it, and to employ only 
 such periphrases as could be translated, if desired, into perfectly 
 exact language, I shall not cut myself off from the convenient, 
 but looser, expressions, which have been habitually employed
 by previous writers and have the advantage of being, in a general 
 way at least, immediately intelligible to the reader.^
 
 ^ This question, which faces all contemporary writers on logical and philo- 
 sophical subjects, is in my opinion much more a question of style — and therefore 
 to be settled on the same sort of considerations as other such questions — ^than 
 is generally supposed. There are occasions for very exact methods of state- 
 ment, such as are employed in Mr. Russell's Principia Mathematica. But there 
 are advantages also in writing the English of Hume. Mr. Moore has developed 
 in Principia Ethica an intermediate style which in his hands has force and
 beauty. But those writers, who strain after exaggerated precision without 
 going the whole hog with Mr. Russell, are sometimes merely pedantic. They 
 lose the reader's attention, and the repetitious complication of their phrases 
 eludes his comprehension, without their really attaining, to compensate, 
 a complete precision. Confusion of thought is not always best avoided by 
 technical and unaccustomed expressions, to which the mind has no immediate 
 reaction of understanding ; it is possible, under cover of a careful formalism,
 to make statements, which, if expressed in plain language, the mind would 
 immediately repudiate. There is much to be said, therefore, in favour of 
 understanding the substance of what you are saying all the time, and of never 
 reducing the substantives of your argument to the mental status of an x or y. 
 
CHAPTER III 
 
 THE MEASUREMENT OF PROBABILITIES 
 
 1. I HAVE spoken of probability as being concerned with degrees 
 of rational belief. This phrase implies that it is in some sense 
 quantitative and perhaps capable of measurement. The theory 
 of probable arguments must be much occupied, therefore, with 
 comparisons of the respective weights which attach to different
 arguments. With this question we will now concern ourselves. 
 
 It has been assumed hitherto as a matter of course that 
 probability is, in the full and literal sense of the word, measurable.
 I shall have to limit, not extend, the popular doctrine. But,
 keeping my own theories in the background for the moment, I
 will begin by discussing some existing opinions on the subject.
 
 2. It has been sometimes supposed that a numerical comparison 
 between the degrees of any pair of probabilities is not only con- 
 ceivable but is actually within our power. Bentham, for instance, 
 in his Rationale of Judicial Evidence^ proposed a scale on which 
 witnesses might mark the degree of their certainty ; and others 
 have suggested seriously a ' barometer of probability.' ^ 
 
 That such comparison is theoretically possible, whether or not 
 we are actually competent in every case to make the comparison, 
 has been the generally accepted opinion. The following quota- 
 tion ^ puts this point of view very well : 
 
 " I do not see on what ground it can be doubted that every 
 
 1 Book i chap vi. (referred to by Venn). 
 
 * The reader may be reminded of Gibbon's proposal that : — " A Theological 
 Barometer might be formed, of which the Cardinal (Baronius) and our country- 
 man. Dr. Middleton, should constitute the opposite and remote extremities, 
 as the former sank to the lowest degree of credulity, which was compatible with 
 learning, and the latter rose to the highest pitch of scepticism, in any wise 
 consistent with Religion." 
 
 3 W. F. Donkin, Phil. Mag., 1851. He is replying to an article by J. D. 
 Forbes (Phil. Mag., Aug. 1849) which had cast doubt upon this opinion.
 
 definite state of belief concerning a proposed hypothesis is in 
 itself capable of being represented by a numerical expression, 
 however difficult or impracticable it may be to ascertain its 
 actual value. It would be very difficult to estimate in numbers 
 the vis viva of all of the particles of a human body at any instant ; 
 but no one doubts that it is capable of numerical expression. I 
 mention this because I am not sure that Professor Forbes has 
 distinguished the difficulty of ascertaining numbers in certain 
 cases from a supposed difficulty of expression by means of numbers. 
 The former difficulty is real, but merely relative to our knowledge 
 and skill ; the latter, if real, would be absolute and inherent in 
 the subject-matter, which I conceive is not the case." 
 
 De Morgan held the same opinion on the ground that, wherever 
 we have differences of degree, numerical comparison must be 
 theoretically possible.^ He assumes, that is to say, that all 
 probabilities can be placed in an order of magnitude, and argues
 from this that they must be measurable. Philosophers, however, 
 who are mathematicians, would no longer agree that, even if the 
 premiss is sound, the conclusion follows from it. Objects can 
 be arranged in an order, which we can reasonably call one of 
 degree or magnitude, without its being possible to conceive a 
 system of measurement of the differences between the individuals. 
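
 The point that an order of degree does not by itself yield a measure can be illustrated with a small sketch. The Python below is not part of the original argument; the class name `Degree` and its three labels are hypothetical. It defines objects that can be compared as greater or less, and therefore sorted, while offering no operation for the difference or ratio between them.

```python
from functools import total_ordering


@total_ordering
class Degree:
    """A degree that can be ranked but not measured (hypothetical example)."""

    # The order is stipulated; no distances between the labels are defined.
    _ORDER = ("slight", "moderate", "strong")

    def __init__(self, label):
        if label not in self._ORDER:
            raise ValueError(f"unknown degree: {label!r}")
        self.label = label

    def _rank(self):
        # Position in the stipulated order, used for comparison only.
        return self._ORDER.index(self.label)

    def __eq__(self, other):
        if not isinstance(other, Degree):
            return NotImplemented
        return self.label == other.label

    def __lt__(self, other):
        if not isinstance(other, Degree):
            return NotImplemented
        return self._rank() < other._rank()

    def __hash__(self):
        return hash(self.label)

    def __repr__(self):
        return f"Degree({self.label!r})"


degrees = [Degree("strong"), Degree("slight"), Degree("moderate")]
print(sorted(degrees))                       # an order of magnitude exists
print(Degree("slight") < Degree("strong"))   # greater and less apply

try:
    Degree("strong") - Degree("slight")      # but no difference is defined
except TypeError as err:
    print("no measure of the difference:", err)
```

 The comparison succeeds and the subtraction fails, which is all the sketch is meant to show: "greater" and "less" can apply where "twice" and "thrice" have no meaning.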
 
 This opinion may also have been held by others, if not by 
 De Morgan, in part because of the narrow associations which 
 Probability has had for them. The Calculus of Probability has
 received far more attention than its logic, and mathematicians,
 under no compulsion to deal with the whole of the subject, have
 naturally confined their attention to those special cases, the existence
 of which will be demonstrated at a later stage, where
 algebraical representation is possible. Probability has become
 associated, therefore, in the minds of theorists with those problems
 in which we are presented with a number of exclusive and exhaustive
 alternatives of equal probability ; and the principles, which
 are readily applicable in such circumstances, have been supposed,
 without much further enquiry, to possess general validity.
 
 3. It is also the case that theories of probability have been 
 
 • " Whenever the terms greater and less can be applied, there twice, thrice, 
 etc., can be conceived, though not perhaps measured by us." — " Theory of Probabilities,"
 Encyclopaedia Metropolitana, p. 395. He is a little more guarded in
 his Formal Logic, pp. 174, 175 ; but arrives at the same conclusion so far as 
 probability is concerned. 
 
 propounded and widely accepted, according to which its numerical 
 character is necessarily involved in the definition. It is often 
 said, for instance, that probability is the ratio of the number of
 " favourable cases " to the total number of " cases." If this
 definition is accurate, it follows that every probability can be
 properly represented by a number and in fact is a number ; for
 a ratio is not a quantity at all. In the case also of definitions
 based upon statistical frequency, there must be by definition a
 numerical ratio corresponding to every probability. These
 definitions and the theories based on them will be discussed in
 Chapter VIII. ; they are connected with fundamental differences
 of opinion with which it is not necessary to burden the present 
 argument. 
 
 4. If we pass from the opinions of theorists to the experience 
 of practical men, it might perhaps be held that a presumption 
 in favour of the numerical valuation of all probabilities can be
 based on the practice of underwriters and the willingness of
 Lloyd's to insure against practically any risk. Underwriters are 
 actually willing, it might be urged, to name a numerical measure 
 in every case, and to back their opinion with money. But this 
 practice shows no more than that many probabilities are greater 
 or less than some numerical measure, not that they themselves 
 are numerically definite. It is sufficient for the underwriter if 
 the premium he names exceeds the probable risk. But, apart 
 from this, I doubt whether in extreme cases the process of thought, 
 through which he goes before naming a premium, is wholly
 rational and determinate ; or that two equally intelligent brokers
 acting on the same evidence would always arrive at the same 
 result. In the case, for instance, of insurances effected before 
 a Budget, the figures quoted must be partly arbitrary. There is 
 in them an element of caprice, and the broker's state of mind, 
 when he quotes a figure, is like a bookmaker's when he names 
 odds. Whilst he may be able to make sure of a profit, on the
 principles of the bookmaker, yet the individual figures that make 
 up the book are, within certain limits, arbitrary. He may be 
 almost certain, that is to say, that there will not be new taxes on 
 more than one of the articles tea, sugar, and whisky ; there 
 may be an opinion abroad, reasonable or unreasonable, that the 
 likelihood is in the order — whisky, tea, sugar ; and he may, 
 therefore, be able to effect insurances for equal amounts in each
 at 30 per cent, 40 per cent, and 45 per cent. He has thus made 
 sure of a profit of 15 per cent, however absurd and arbitrary his 
 quotations may be. It is not necessary for the success of underwriting
 on these lines that the probabilities of these new taxes
 are really measurable by the figures 3/10, 4/10, and 45/100 ; it is sufficient
 that there should be merchants willing to insure at these rates.
 These merchants, moreover, may be wise to insure even if the 
 quotations are partly arbitrary ; for they may run the risk of in- 
 solvency unless their possible loss is thus limited. That the 
 transaction is in principle one of bookmaking is shown by the 
 fact that, if there is a specially large demand for insurance against 
 one of the possibilities, the rate rises ; — the probability has not
 changed, but the " book " is in danger of being upset. A Presi- 
 dential election in the United States supplies a more precise 
 example. On August 23, 1912, 60 per cent was quoted at Lloyd's 
 to pay a total loss should Dr. Woodrow Wilson be elected, 30 per 
 cent should Mr. Taft be elected, and 20 per cent should Mr. 
 Roosevelt be elected. A broker, who could effect insurances 
 in equal amounts against the election of each candidate, would be 
 certain at these rates of a profit of 10 per cent. Subsequent 
 modifications of these terms would largely depend upon the 
 number of applicants for each kind of policy. Is it possible to
 maintain that these figures in any way represent reasoned
 numerical estimates of probability ?
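
 The arithmetic of the two "books" described above can be checked with a short sketch. The Python below is an illustration under stated assumptions, not the underwriter's actual procedure: the function name `certain_margin` is hypothetical, one unit is insured against each event in the book, and at most one of the insured events occurs.

```python
from fractions import Fraction


def certain_margin(rates_percent):
    """Premiums received minus the largest possible claim, per unit insured.

    Assumptions (made for illustration): one unit is insured against every
    event in the book, and at most one of those events actually occurs.
    """
    premiums = sum(Fraction(rate, 100) for rate in rates_percent)
    largest_claim = Fraction(1)  # a total loss on one policy only
    return premiums - largest_claim


budget_book = [30, 40, 45]     # rates quoted on the three articles
election_book = [60, 30, 20]   # Wilson, Taft, Roosevelt

for name, book in [("Budget", budget_book), ("Election", election_book)]:
    total = sum(Fraction(rate, 100) for rate in book)
    # Read as probabilities of mutually exclusive events, the rates would
    # have to sum to at most 1; here they sum to more.
    print(name, "rates sum to", total, "certain margin", certain_margin(book))
```

 The margins come out at 3/20 and 1/10, the 15 per cent and 10 per cent of the text. In both books the quoted rates sum to more than unity, so they cannot all be read as the probabilities of mutually exclusive alternatives, whatever else they may represent.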
 
 In some insurances the arbitrary element seems even greater. 
 Consider, for instance, the reinsurance rates for the Waratah,
 a vessel which disappeared in South African waters. The 
 lapse of time made rates rise ; the departure of ships in search of 
 her made them fall ; some nameless wreckage is found and they 
 rise ; it is remembered that in similar circumstances thirty 
 years ago a vessel floated, helpless but not seriously damaged, 
 for two months, and they fall. Can it be pretended that the 
 figures which were quoted from day to day — 75 per cent, 83 per 
 cent, 78 per cent — were rationally determinate, or that the 
 actual figure was not within wide limits arbitrary and due to
 the caprice of individuals ? In fact underwriters themselves
 distinguish between risks which are properly insurable, either
 because their probability can be estimated between comparatively
 narrow numerical limits or because it is possible to make a " book "
 which covers all possibilities, and other risks which cannot be
 dealt with in this way and which cannot form the basis of a regular 
 business of insurance, — although an occasional gamble may be 
 indulged in. I believe, therefore, that the practice of under- 
 writers weakens rather than supports the contention that all 
 probabilities can be measured and estimated numerically. 
 
 5. Another set of practical men, the lawyers, have been more 
 subtle in this matter than the philosophers.^ A distinction, 
 interesting for our present purpose, between probabilities, which 
 can be estimated within somewhat narrow limits, and those which
 cannot, has arisen in a series of judicial decisions respecting 
 damages. The following extract^ from the Times Law Reports 
 seems to me to deal very clearly in a mixture of popular and legal 
 phraseology, with the logical point at issue : 
 
 This was an action brought by a breeder of racehorses to 
 recover damages for breach of a contract. The contract was 
 that Cyllene, a racehorse owned by the defendant, should in the 
 season of the year 1909 serve one of the plaintiff's brood
 mares. In the summer of 1908 the defendant, without the con- 
 sent of the plaintiff, sold Cyllene for £30,000 to go to South 
 America. The plaintiff claimed a sum equal to the average 
 profit he had made through having a mare served by Cyllene 
 during the past four years. During those four years he had 
 had four colts which had sold at £3300. Upon that basis his 
 loss came to 700 guineas. 
 
 Mr. Justice Jelf said that he was desirous, if he properly 
 could, to find some mode of legally making the defendant com- 
 pensate the plaintiff ; but the question of damages presented 
 formidable and, to his mind, insuperable difficulties. The 
 damages, if any, recoverable here must be either the estimated 
 loss of profit or else nominal damages. The estimate could only 
 be based on a succession of contingencies. Thus it was assumed 
 that (inter alia) Cyllene would be alive and well at the time of the
 intended service ; that the mare sent would be well bred and not
 barren ; that she would not slip her foal ; and that the foal would
 be born alive and healthy. In a case of this kind he could only
 
 1 Leibniz notes the subtle distinctions made by Jurisconsults between
 degrees of probability ; and in the preface to a work, projected but unfinished, 
 which was to have been entitled Ad stateram juris de gradibus probationum et 
 probabilitatum he recommends them as models of logic in contingent questions 
 (Couturat, Logique de Leibniz, p. 240). 
 
 ^ I have considerably compressed the original report (Sapwell v. Bass).
 
 rely on the weighing of chances ; and the law generally regarded 
 damages which depended on the weighing of chances as too 
 remote, and therefore irrecoverable. It was drawing the line
 between an estimate of damage based on probabilities, as in
 " Simpson v. L. and N.W. Railway Co." (1, Q.B.D., 274), where
 Cockburn, C.J., said : " To some extent, no doubt, the damage 
 must be a matter of speculation, but that is no reason for not 
 awarding any damages at all," and a claim for damages of a 
 totally problematical character. He (Mr. Justice Jelf) thought 
 the present case was well over the line. Having referred to
 " Mayne on Damages " (8th ed., p. 70), he pointed out that
 in " Watson v. Ambergate Railway Co." (15, Jur., 448) Patteson, J.,
 seemed to think that the chance of a prize might be taken into 
 account in estimating the damages for breach of a contract to 
 send a machine for loading barges by railway too late for a show ; 
 but Erle, J., appeared to think such damage was too remote.
 In his Lordship's view the chance of winning a prize was not of 
 sufficiently ascertainable value at the time the contract was made 
 to be within the contemplation of the parties. Further, in the 
 present case, the contingencies were far more numerous and 
 uncertain. He would enter judgment for the plaintiff for nominal
 damages, which were all he was entitled to. They would be
 assessed at 1s.
 
 One other similar case may be quoted in further elucidation 
 of the same point, and because it also illustrates another point — 
 the importance of making clear the assumptions relative to which 
 the probability is calculated. This case ^ arose out of an offer of
 a Beauty Prize ^ by the Daily Express. Out of 6000 photographs
 submitted, a number were to be selected and published in the 
 newspaper in the following manner : 
 
 The United Kingdom was to be divided into districts and the 
 photographs of the selected candidates living in each district were
 to be submitted to the readers of the paper in the district, who 
 were to select by their votes those whom they considered the 
 most beautiful, and a Mr. Seymour Hicks was then to make an 
 appointment with the 50 ladies obtaining the greatest number 
 of votes and himself select 12 of them. The plaintiff, who came 
 
 1 Chaplin v. Hicks (1911).
 
 ^ The prize was to be a theatrical engagement and, according to the article, 
 the probability of subsequent marriage into the peerage. 
 
 out head of one of the districts, submitted that she had not been 
 given a reasonable opportunity of keeping an appointment, that 
 she had thereby lost the value of her chance of one of the 12 
 prizes, and claimed damages accordingly. The jury found that 
 the defendant had not taken reasonable means to give the 
 plaintiff an opportunity of presenting herself for selection, and 
 assessed the damages, provided they were capable of assessment, 
 at £100, the question of the possibility of assessment being postponed.
 This was argued before Mr. Justice Pickford, and subsequently
 in the Court of Appeal before Lord Justices Vaughan
 Williams, Fletcher Moulton, and Farwell. Two questions arose
 — relative to what evidence ought the probability to be calculated,
 and was it numerically measurable ? Counsel for the
 defendant contended that, " if the value of the plaintiff's chance 
 was to be considered, it must be the value as it stood at the begin- 
 ning of the competition, not as it stood after she had been selected 
 as one of the 50. As 6000 photographs had been sent in, and there 
 was also the personal taste of the defendant as final arbiter to 
 be considered, the value of the chance of success was really in- 
 calculable." The first contention that she ought to be considered 
 as one of 6000 not as one of 50 was plainly preposterous and did 
 not hoodwink the court. But the other point, the personal 
 taste of the arbiter, presented more difficulty. In estimating 
 the chance, ought the Court to receive and take account of 
 evidence respecting the arbiter's preferences in types of beauty ? 
 Mr. Justice Pickford, without illuminating the question, held that 
 the damages were capable of estimation. Lord Justice Vaughan 
 Williams in giving judgment in the Court of Appeal argued as
 follows : 
 
 As he understood it, there were some 50 competitors, and 
 there were 12 prizes of equal value, so that the average chance 
 of success was about one in four. It was then said that the 
 questions which might arise in the minds of the persons who had 
 to give the decisions were so numerous that it was impossible to 
 apply the doctrine of averages. He did not agree. Then it 
 was said that if precision and certainty were impossible in any 
 case it would be right to describe the damages as unassessable. 
 He agreed that there might be damages so unassessable that the
 doctrine of averages was not possible of application because the
 figures necessary to be applied were not forthcoming. Several
 cases were to be found in the reports where it had been so held, 
 but he denied the proposition that because precision and certainty 
 had not been arrived at, the jury had no function or duty to 
 determine the damages. ... He (the Lord Justice) denied that 
 the mere fact that you could not assess with precision and cer- 
 tainty relieved a wrongdoer from paying damages for his breach of 
 duty. He would not lay down that in every case it could be left 
 to the jury to assess the damages ; there were cases where the
 loss was so dependent on the mere unrestricted volition of another
 person that it was impossible to arrive at any assessable loss 
 from the breach. It was true that there was no market here ; 
 the right to compete was personal and could not be transferred. 
 He could not admit that a competitor who found herself one of 
 50 could have gone into the market and sold her right to compete. 
 At the same time the jury might reasonably have asked them- 
 selves the question whether, if there was a right to compete, it 
 could have been transferred, and at what price. Under these 
 circumstances he thought the matter was one for the jury. 
 
 The attitude of the Lord Justice is clear. The plaintiff had 
 evidently suffered damage, and justice required that she should 
 be compensated. But it was equally evident, that, relative to 
 the completest information available and account being taken of 
 the arbiter's personal taste, the probability could be by no means 
 estimated with numerical precision. Further, it was impossible 
 to say how much weight ought to be attached to the fact that 
 the plaintiff had been head of her district (there were fewer than 
 50 districts) ; yet it was plain that it made her chance better than 
 the chances of those of the 50 left in, who were not head of their 
 districts. Let rough justice be done, therefore. Let the case 
 be simplified by ignoring some part of the evidence. The 
 " doctrine of averages " is then applicable, or, in other words, 
 the plaintiff's loss may be assessed at twelve-fiftieths of the 
 value of the prize.¹ 
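
 The arithmetic behind the " doctrine of averages " in this case can be 
 sketched as follows. This is only an illustration : the function names 
 are assumptions introduced here, while the figures (12 prizes, 50 
 competitors, damages of £100) are those given in the text and its footnote. 

```python
from fractions import Fraction


def average_chance(prizes: int, competitors: int) -> Fraction:
    """Chance of success when every competitor is treated alike."""
    return Fraction(prizes, competitors)


def assessed_loss(prize_value: Fraction, prizes: int, competitors: int) -> Fraction:
    """Loss assessed as (average chance) x (value of one prize)."""
    return average_chance(prizes, competitors) * prize_value


def implied_prize_value(damages: Fraction, prizes: int, competitors: int) -> Fraction:
    """Prize value that a given award would imply under the same rule."""
    return damages / average_chance(prizes, competitors)


if __name__ == "__main__":
    chance = average_chance(12, 50)
    print(chance)         # twelve-fiftieths, reduced to lowest terms
    print(float(chance))  # "about one in four"
    print(float(implied_prize_value(Fraction(100), 12, 50)))
```

 On these figures an award of £100 would imply a prize worth rather more 
 than £400, which is the discrepancy the footnote remarks upon. 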
 
 ¹ The jury in assessing the damages at £100, however, cannot have argued 
 so subtly as this ; for the average value of a prize (I have omitted the details 
 bearing on their value) could not have been fairly estimated so high as £400. 
 
 6. How does the matter stand, then ? Whether or not such 
 a thing is theoretically conceivable, no exercise of the practical 
 judgment is possible, by which a numerical value can actually 
 be given to the probability of every argument. So far from 
 our being able to measure them, it is not even clear that we are 
 always able to place them in an order of magnitude. Nor has 
 any theoretical rule for their evaluation ever been suggested. 
 
 The doubt, in view of these facts, whether any two probabilities 
 are in every case even theoretically capable of comparison 
 in terms of numbers, has not, however, received serious consideration. 
 There seem to me to be exceedingly strong reasons for 
 entertaining the doubt. Let us examine a few more instances. 
 
 7. Consider an induction or a generalisation. It is usually 
 held that each additional instance increases the generalisation's 
 probability. A conclusion, which is based on three experiments 
 in which the unessential conditions are varied, is more trustworthy 
 than if it were based on two. But what reason or 
 principle can be adduced for attributing a numerical measure to 
 the increase ?¹ 
 
 Or, to take another class of instances, we may sometimes 
 have some reason for supposing that one object belongs to a 
 certain category if it has points of similarity to other known 
 members of the category (e.g. if we are considering whether 
 a certain picture should be ascribed to a certain painter), and 
 the greater the similarity the greater the probability of our 
 conclusion. But we cannot in these cases measure the increase ; 
 we can say that the presence of certain peculiar marks in a 
 picture increases the probability that the artist of whom those 
 marks are known to be characteristic painted it, but we cannot 
 say that the presence of these marks makes it two or three or 
 any other number of times more probable than it would have 
 been without them. We can say that one thing is more like a 
 second object than it is like a third ; but there will very seldom be 
 any meaning in saying that it is twice as like. Probability is, so 
 far as measurement is concerned, closely analogous to similarity.² 
 
 ¹ It is true that Laplace and others (even amongst contemporary writers) 
 have believed that the probability of an induction is measurable by means of a 
 formula known as the rule of succession, according to which the probability of an 
 induction based on n instances is (n + 1)/(n + 2). Those who have been convinced by 
 the reasoning employed to establish this rule must be asked to postpone judgment 
 until it has been examined in Chapter XXX. But we may point out here 
 the absurdity of supposing that the odds are 2 to 1 in favour of a generalisation 
 based on a single instance, a conclusion which this formula would seem to 
 justify. 
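
 The consequence objected to in this footnote is easily checked. The 
 sketch below assumes only the formula (n + 1)/(n + 2) quoted above ; the 
 function names are illustrative, and nothing here bears on whether the 
 rule itself is sound. 

```python
from fractions import Fraction


def rule_of_succession(n: int) -> Fraction:
    """Probability (n + 1)/(n + 2) assigned after n favourable instances."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return Fraction(n + 1, n + 2)


def odds_in_favour(p: Fraction) -> Fraction:
    """Convert a probability p < 1 into odds in favour, p / (1 - p)."""
    return p / (1 - p)


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        p = rule_of_succession(n)
        print(n, p, odds_in_favour(p))
```

 For a single instance the formula gives two-thirds, that is, odds of 
 2 to 1, which is the result described above as absurd. 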
 
 ² There are very few writers on probability who have explicitly admitted 
 that probabilities, though in some sense quantitative, may be incapable of 
 numerical comparison. Edgeworth, " Philosophy of Chance " (Mind, 1884, p. 
 225), admitted that " there may well be important quantitative, although not 
 numerical, estimates " of probabilities. Goldschmidt (Wahrscheinlichkeitsrechnung, 
 p. 43) may also be cited as holding a somewhat similar opinion. He 
 maintains that a lack of comparability in the grounds often stands in the way 
 of the measurability of the probable in ordinary usage, and that there are not 
 necessarily good reasons for measuring the value of one argument against 
 that of another. On the other hand, a numerical statement for the degree of the 
 probable, although generally impossible, is not in itself contradictory to the 
 notion ; and of three statements, relating to the same circumstances, we can 
 well say that one is more probable than another, and that one is the most 
 probable of the three. 
 
 Or consider the ordinary circumstances of life. We are out 
 for a walk. What is the probability that we shall reach home 
 alive ? Has this always a numerical measure ? If a thunderstorm 
 bursts upon us, the probability is less than it was before ; 
 but is it changed by some definite numerical amount ? There 
 might, of course, be data which would make these probabilities 
 numerically comparable ; it might be argued that a knowledge 
 of the statistics of death by lightning would make such a comparison 
 possible. But if such information is not included within 
 the knowledge to which the probability is referred, this fact is 
 not relevant to the probability actually in question and cannot 
 affect its value. In some cases, moreover, where general statistics 
 are available, the numerical probability which might be derived 
 from them is inapplicable because of the presence of additional 
 knowledge with regard to the particular case. Gibbon calculated 
 his prospects of life from the volumes of vital statistics 
 and the calculations of actuaries. But if a doctor had been called 
 to his assistance the nice precision of these calculations would 
 have become useless ; Gibbon's prospects would have been better 
 or worse than before, but he would no longer have been able to 
 calculate to within a day or week the period for which he then 
 possessed an even chance of survival. 
 
 In these instances we can, perhaps, arrange the probabilities 
 in an order of magnitude and assert that the new datum 
 strengthens or weakens the argument, although there is no 
 basis for an estimate how much stronger or weaker the new 
 argument is than the old. But in another class of instances is 
 it even possible to arrange the probabilities in an order of magnitude, 
 or to say that one is the greater and the other less ? 
 
 8. Consider three sets of experiments, each directed towards 
 establishing a generalisation. The first set is more numerous ; 
 in the second set the irrelevant conditions have been more 
 carefully varied ; in the third case the generalisation in view 
 is wider in scope than in the others. Which of these generalisations 
 is on such evidence the most probable ? There is, surely, 
 no answer ; there is neither equality nor inequality between 
 them. We cannot always weigh the analogy against the induction, 
 or the scope of the generalisation against the bulk of the 
 evidence in support of it. If we have more grounds than 
 before, comparison is possible ; but, if the grounds in the two 
 cases are quite different, even a comparison of more and less, 
 let alone numerical measurement, may be impossible. 
 
 This leads up to a contention, which I have heard supported, 
 that, although not all measurements and not all comparisons of 
 probability are within our power, yet we can say in the case of 
 every argument whether it is more or less likely than not. Is our 
 expectation of rain, when we start out for a walk, always more 
 likely than not, or less likely than not, or as likely as not ? I am 
 prepared to argue that on some occasions none of these alternatives 
 hold, and that it will be an arbitrary matter to decide for or 
 against the umbrella. If the barometer is high, but the clouds are 
 black, it is not always rational that one should prevail over the 
 other in our minds, or even that we should balance them, 
 though it will be rational to allow caprice to determine us and 
 to waste no time on the debate. 
 
 9. Some cases, therefore, there certainly are in which no 
 rational basis has been discovered for numerical comparison. It 
 is not the case here that the method of calculation, prescribed 
 by theory, is beyond our powers or too laborious for actual 
 application. No method of calculation, however impracticable, 
 has been suggested. Nor have we any prima facie indications of 
 the existence of a common unit to which the magnitudes of all 
 probabilities are naturally referrible. A degree of probability 
 is not composed of some homogeneous material, and is not 
 apparently divisible into parts of like character with one 
 another. An assertion, that the magnitude of a given probability 
 is in a numerical ratio to the magnitude of every 
 other, seems, therefore, unless it is based on one of the current 
 definitions of probability, with which I shall deal separately 
 in later chapters, to be altogether devoid of the kind of support, 
 which can usually be supplied in the case of quantities of which 
 the mensurability is not open to denial. It will be worth 
 while, however, to pursue the argument a little further. 
 
 10. There appear to be four alternatives. Either in some 
 cases there is no probability at all ; or probabilities do not all 
 belong to a single set of magnitudes measurable in terms of a 
 common unit ; or these measures always exist, but in many 
 cases are, and must remain, unknown ; or probabilities do 
 belong to such a set and their measures are capable of being 
 determined by us, although we are not always able so to 
 determine them in practice. 
 
 11. Laplace and his followers excluded the first two alternatives. 
 They argued that every conclusion has its place in 
 the numerical range of probabilities from 0 to 1, if only we knew 
 it, and they developed their theory of unknown probabilities. 
 
 In dealing with this contention, we must be clear as to what 
 we mean by saying that a probability is unknown. Do we mean 
 unknown through lack of skill in arguing from given evidence, 
 or unknown through lack of evidence ? The first is alone 
 admissible, for new evidence would give us a new probability, 
 not a fuller knowledge of the old one ; we have not discovered 
 the probability of a statement on given evidence, by determining 
 its probability in relation to quite different evidence. We must 
 not allow the theory of unknown probabilities to gain plausibility 
 from the second sense. A relation of probability does not yield 
 us, as a rule, information of much value, unless it invests the 
 conclusion with a probability which lies between narrow numerical 
 limits. In ordinary practice, therefore, we do not always regard 
 ourselves as knowing the probability of a conclusion, unless we 
 can estimate it numerically. We are apt, that is to say, to 
 restrict the use of the expression probable to these numerical 
 examples, and to allege in other cases that the probability is 
 unknown. We might say, for example, that we do not know, 
 when we go on a railway journey, the probability of death in a 
 railway accident, unless we are told the statistics of accidents 
 in former years ; or that we do not know our chances in a lottery, 
 unless we are told the number of the tickets. But it must be 
 clear upon reflection that if we use the term in this sense, which 
 is no doubt a perfectly legitimate sense, we ought to say that 
 in the case of some arguments a relation of probability does not 
 exist, and not that it is unknown. For it is not this probability 
 that we have discovered, when the accession of new evidence 
 makes it possible to frame a numerical estimate. 
 
 Possibly this theory of unknown probabilities may also gain 
 strength from our practice of estimating arguments, which, as 
 I maintain, have no numerical value, by reference to those that 
 have. We frame two ideal arguments, that is to say, in which 
 the general character of the evidence largely resembles what is 
 actually within our knowledge, but which is so constituted as 
 to yield a numerical value, and we judge that the probability of 
 the actual argument lies between these two. Since our standards, 
 therefore, are referred to numerical measures in many cases 
 where actual measurement is impossible, and since the probability 
 lies between two numerical measures, we come to believe that it 
 must also, if only we knew it, possess such a measure itself. 
 
 12. To say, then, that a probability is unknown ought to 
 mean that it is unknown to us through our lack of skill in arguing 
 from given evidence. The evidence justifies a certain degree of 
 knowledge, but the weakness of our reasoning power prevents our 
 knowing what this degree is. At the best, in such cases, we only 
 know vaguely with what degree of probability the premisses invest 
 the conclusion. That probabilities can be unknown in this sense 
 or known with less distinctness than the argument justifies, 
 is clearly the case. We can through stupidity fail to make any 
 estimate of a probability at all, just as we may through the 
 same cause estimate a probability wrongly. As soon as we 
 distinguish between the degree of belief which it is rational to 
 entertain and the degree of belief actually entertained, we have 
 in effect admitted that the true probability is not known to 
 everybody. 
 
 But this admission must not be allowed to carry us too far. 
 Probability is, vide Chapter II. (§ 12), relative in a sense to the 
 principles of human reason. The degree of probability, which 
 it is rational for us to entertain, does not presume perfect logical 
 insight, and is relative in part to the secondary propositions 
 which we in fact know ; and it is not dependent upon whether 
 more perfect logical insight is or is not conceivable. It is the 
 degree of probability to which those logical processes lead, of 
 which our minds are capable ; or, in the language of Chapter II., 
 which those secondary propositions justify, which we in fact know. 
 If we do not take this view of probability, if we do not limit it 
 in this way and make it, to this extent, relative to human 
 powers, we are altogether adrift in the unknown ; for we cannot 
 ever know what degree of probability would be justified by the 
 perception of logical relations which we are, and must always be, 
 incapable of comprehending. 
 
 13. Those who have maintained that, where we cannot assign 
 a numerical probability, this is not because there is none, but 
 simply because we do not know it, have really meant, I feel 
 sure, that with some addition to our knowledge a numerical 
 value would be assignable, that is to say that our conclusions 
 would have a numerical probability relative to slightly different 
 premisses. Unless, therefore, the reader clings to the opinion 
 that, in every one of the instances I have cited in the earlier 
 paragraphs of this chapter, it is theoretically possible on that 
 evidence to assign a numerical value to the probability, we are 
 left with the first two of the alternatives of § 10, which were 
 as follows : either in some cases there is no probability at all ; 
 or probabilities do not all belong to a single set of magnitudes 
 measurable in terms of a common unit. It would be difficult to 
 maintain that there is no logical relation whatever between 
 our premiss and our conclusion in those cases where we cannot 
 assign a numerical value to the probability ; and if this is so, 
 it is really a question of whether the logical relation has characteristics, 
 other than mensurability, of a kind to justify us in 
 calling it a probability-relation. Which of the two we favour is, 
 therefore, partly a matter of definition. We might, that is to 
 say, pick out from probabilities (in the widest sense) a set, if there 
 is one, all of which are measurable in terms of a common unit, 
 and call the members of this set, and them only, probabilities (in 
 the narrow sense). To restrict the term ' probability ' in this 
 way would be, I think, very inconvenient. For it is possible, 
 as I shall show, to find several sets, the members of each of 
 which are measurable in terms of a unit common to all the 
 members of that set ; so that it would be in some degree 
 arbitrary¹ which we chose. Further, the distinction between 
 probabilities, which would be thus measurable and those which 
 would not, is not fundamental. 
 
 ¹ Not altogether ; for it would be natural to select the set to which the 
 relation of certainty belongs. 
 
 At any rate I aim here at dealing with probability in its 
 widest sense, and am averse to confining its scope to a limited 
 type of argument. If the opinion that not all probabilities can 
 be measured seems paradoxical, it may be due to this divergence 
 from a usage which the reader may expect. Common usage, 
 even if it involves, as a rule, a flavour of numerical measurement, 
 does not consistently exclude those probabilities which are incapable 
 of it. The confused attempts, which have been made, 
 to deal with numerically indeterminate probabilities under the 
 title of unknown probabilities, show how difficult it is to 
 confine the discussion within the intended limits, if the original 
 definition is too narrow. 
 
 14. I maintain, then, in what follows, that there are some pairs 
 of probabilities between the members of which no comparison 
 of magnitude is possible ; that we can say, nevertheless, of some 
 pairs of relations of probability that the one is greater and the 
 other less, although it is not possible to measure the difference 
 between them ; and that in a very special type of case, to be 
 dealt with later, a meaning can be given to a numerical comparison 
 of magnitude. I think that the results of observation, of which 
 examples have been given earlier in this chapter, are consistent 
 with this account. 
 
 By saying that not all probabilities are measurable, I mean 
 that it is not possible to say of every pair of conclusions, about 
 which we have some knowledge, that the degree of our rational 
 belief in one bears any numerical relation to the degree of our 
 rational belief in the other ; and by saying that not all proba- 
 bilities are comparable in respect of more and less, I mean that 
 it is not always possible to say that the degree of our rational 
 belief in one conclusion is either equal to, greater than, or less 
 than the degree of our belief in another. 
 
 We must now examine a philosophical theory of the quantitative 
 properties of probability, which would explain and 
 justify the conclusions, which reflection discovers, if the preceding 
 discussion is correct, in the practice of ordinary argument. We 
 must bear in mind that our theory must apply to all probabilities 
 and not to a limited class only, and that, as we do not adopt a 
 definition of probability which presupposes its numerical mensurability, 
 we cannot directly argue from differences in degree 
 to a numerical measurement of these differences. The problem 
 is subtle and difficult, and the following solution is, therefore, 
 proposed with hesitation ; but I am strongly convinced that 
 something resembling the conclusion here set forth is true. 
 
 15. The so-called magnitudes or degrees of knowledge or 
 probability, in virtue of which one is greater and another less, 
 really arise out of an order in which it is possible to place them. 
 Certainty, impossibility, and a probability, which has an intermediate 
 value, for example, constitute an ordered series in which 
 the probability lies between certainty and impossibility. In the 
 same way there may exist a second probability which lies between 
 certainty and the first probability. When, therefore, we say that 
 one probability is greater than another, this precisely means that 
 the degree of our rational belief in the first case lies between 
 certainty and the degree of our rational belief in the second case. 
 
 On this theory it is easy to see why comparisons of more 
 and less are not always possible. They exist between two probabilities, 
 only when they and certainty all lie on the same ordered 
 series. But if more than one distinct series of probabilities 
 exist, then it is clear that only those, which belong to the same 
 series, can be compared. If the attribute ' greater ' as applied 
 to one of two terms arises solely out of the relative order of the 
 terms in a series, then comparisons of greater and less must 
 always be possible between terms which are members of the 
 same series, and can never be possible between two terms which 
 are not members of the same series. Some probabilities are not 
 comparable in respect of more and less, because there exists 
 more than one path, so to speak, between proof and disproof, 
 between certainty and impossibility ; and neither of two probabilities, 
 which lie on independent paths, bears to the other and 
 to certainty the relation of ' between ' which is necessary for 
 quantitative comparison. 
 
 If we are comparing the probabilities of two arguments, 
 where the conclusion is the same in both and the evidence of 
 one exceeds the evidence of the other by the inclusion of some 
 fact which is favourably relevant, in such a case a relation seems 
 clearly to exist between the two in virtue of which one lies 
 nearer to certainty than the other. Several types of argument 
 can be instanced in which the existence of such a relation is 
 equally apparent. But we cannot assume its presence in every 
 case or in comparing in respect of more and less the probabilities 
 of every pair of arguments. 
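
 The account of comparison given in this section may be pictured in a 
 small sketch. It assumes, purely for illustration, that each ordered 
 series is a finite list running from impossibility to certainty ; the 
 labels P, Q and R and the function name are hypothetical and do not 
 come from the text. 

```python
from typing import Optional, Sequence

# Two hypothetical series (paths), each listed from impossibility "O"
# to certainty "I". The intermediate labels are illustrative only.
SERIES: list[Sequence[str]] = [
    ("O", "P", "Q", "I"),
    ("O", "R", "I"),
]


def compare(a: str, b: str, series=SERIES) -> Optional[str]:
    """Return 'greater', 'less' or 'equal' when a and b share a series.

    Returns None when no single series contains both, i.e. when the two
    probabilities are not comparable in respect of more and less.
    """
    if a == b:
        return "equal"
    for path in series:
        if a in path and b in path:
            # a is greater when it lies between b and certainty on this path.
            return "greater" if path.index(a) > path.index(b) else "less"
    return None


if __name__ == "__main__":
    print(compare("Q", "P"))  # same series, so an order exists
    print(compare("P", "R"))  # independent paths, so no order
```

 Here P and R each lie between impossibility and certainty, yet neither 
 is greater than the other, because no one series contains them both. 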
 
 16. Analogous instances are by no means rare, in which, by a 
 convenient looseness, the phraseology of quantity is misapplied 
 in the same manner as in the case of probability. The simplest 
 example is that of colour. When we describe the colour of 
 one object as bluer than that of another, or say that it has more 
 green in it, we do not mean that there are quantities blue and 
 green of which the object's colour possesses more or less ; we 
 mean that the colour has a certain position in an order of colours 
 and that it is nearer some standard colour than is the colour 
 with which we compare it. 
 
 Another example is afforded by the cardinal numbers. We 
 say that the number three is greater than the number two, but 
 we do not mean that these numbers are quantities one of which 
 possesses a greater magnitude than the other. The one is 
 greater than the other by reason of its position in the order of 
 numbers ; it is further distant from the origin zero. One number 
 is greater than another if the second number lies between zero 
 and the first. 
 
 But the closest analogy is that of similarity. When we say 
 of three objects A, B, and C that B is more like A than C is, we 
 mean, not that there is any respect in which B is in itself quan- 
 titatively greater than C, but that, if the three objects are placed 
 in an order of similarity, B is nearer to A than C is. There are 
 also, as in the case of probability, different orders of similarity. 
 For instance, a book bound in blue morocco is more like a book 
 bound in red morocco than if it were bound in blue calf ; and a 
 book bound in red calf is more like the book in red morocco than 
 if it were in blue calf. But there may be no comparison between 
 the degree of similarity which exists between books bound in 
 red morocco and blue morocco, and that which exists between 
 books bound in red morocco and red calf. This illustration 
 deserves special attention, as the analogy between orders of 
 similarity and probability is so great that its apprehension will 
 greatly assist that of the ideas I wish to convey. We say 
 that one argument is more probable than another (i.e. nearer to 
 certainty) in the same kind of way as we can describe one object 
 as more like than another to a standard object of comparison. 
 
 17. Nothing has been said up to this point which bears on 
 the question whether probabilities are ever capable of numerical 
 comparison. It is true of some types of ordered series that 
 there are measurable relations of distance between their members 
 as well as order, and that the relation of one of its members 
 to an ' origin ' can be numerically compared with the relation 
 of another member to the same origin. But the legitimacy of 
 such comparisons must be matter for special enquiry in each 
 case. 
 
 It will not be possible to explain in detail how and in what 
 sense a meaning can sometimes be given to the numerical measurement 
 of probabilities until Part II. is reached. But this chapter 
 will be more complete if I indicate briefly the conclusions at 
 which we shall arrive later. It will be shown that a process 
 of compounding probabilities can be defined with such properties 
 that it can be conveniently called a process of addition. It will 
 sometimes be the case, therefore, that we can say that one 
 probability C is equal to the sum of two other probabilities A 
 and B, i.e. C = A + B. If in such a case A and B are equal, then 
 we may write this C = 2A and say that C is double A. Similarly 
 if D = C + A, we may write D = 3A, and so on. We can attach a 
 meaning, therefore, to the equation P = n.A, where P and A are 
 relations of probability, and n is a number. The relation of 
 certainty has been commonly taken as the unit of such conventional 
 measurements. Hence if P represents certainty, 
 we should say, in ordinary language, that the magnitude of the 
 probability A is 1/n. It will be shown also that we can define a 
 process, applicable to probabilities, which has the properties of 
 arithmetical multiplication. Where numerical measurement is 
 possible, we can in consequence perform algebraical operations 
 of considerable complexity. The attention, out of proportion 
 to their real importance, which has been paid, on account of the 
 opportunities of mathematical manipulation, which they afford, 
 to the limited class of numerical probabilities, seems to be 
 a part explanation of the belief, which it is the principal object 
 of this chapter to prove erroneous, that all probabilities must 
 belong to it. 
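
 The conventional measurement just described may be illustrated for the 
 numerical case alone. The sketch assumes that ordinary addition of 
 rational numbers can stand in for the process of addition which is only 
 defined in Part II. ; the names used are illustrative. 

```python
from fractions import Fraction

CERTAINTY = Fraction(1)  # certainty taken as the conventional unit


def measure_of_equal_part(n: int) -> Fraction:
    """Measure of A when certainty P satisfies P = n.A."""
    if n <= 0:
        raise ValueError("n must be a positive whole number")
    return CERTAINTY / n


def add(a: Fraction, b: Fraction) -> Fraction:
    """Stand-in for the compounding process called addition."""
    return a + b


if __name__ == "__main__":
    A = measure_of_equal_part(3)  # P = 3.A, so A has magnitude 1/3
    C = add(A, A)                 # C = A + B with B equal to A, i.e. C = 2A
    D = add(C, A)                 # D = C + A, i.e. D = 3A
    print(A, C, D)
```

 With n = 3 the third sum D coincides with certainty, which is all that 
 the statement P = n.A asserts in this case. 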
 
 18. We must look, then, at the quantitative characteristics of 
 probability in the following way. Some sets of probabilities 
 we can place in an ordered series, in which we can say of any 
 pair that one is nearer than the other to certainty, that the 
 argument in one case is nearer proof than in the other, and that 
 there is more reason for one conclusion than for the other. But 
 we can only build up these ordered series in special cases. If we 
 are given two distinct arguments, there is no general presumption 
 that their two probabilities and certainty can be placed 
 in an order. The burden of establishing the existence of such 
 an order lies on us in each separate case. An endeavour will 
 be made later to explain in a systematic way how and in 
 what circumstances such orders can be established. The 
 argument for the theory here proposed will then be strengthened. 
 For the present it has been shown to be agreeable to common 
 sense to suppose that an order exists in some cases and not in 
 others. 
 
 19. Some of the principal properties of ordered series of 
 probabilities are as follows : 
 
 (i.) Every probability lies on a path between impossibility 
 and certainty ; it is always true to say of a degree 
 of probability, which is not identical either with 
 impossibility or with certainty, that it lies between 
 them. Thus certainty, impossibility and any other 
 degree of probability form an ordered series. This 
 is the same thing as to say that every argument 
 amounts to proof, or disproof, or occupies an intermediate 
 position. 
 
 (ii.) A path or series, composed of degrees of probability, 
 is not in general compact. It is not necessarily true, 
 that is to say, that any pair of probabilities in the 
 same series have a probability between them. 
 
 (iii.) The same degree of probability can lie on more than 
 one path (i.e. can belong to more than one series). 
 Hence, if B lies between A and C, and also lies between 
 A' and C, it does not follow that of A and A' either lies 
 between the other and certainty. The fact, that the 
 same probability can belong to more than one distinct 
 series, has its analogy in the case of similarity. 
 
 (iv.) If ABC forms an ordered series, B lying between A 
 and C, and BCD forms an ordered series, C lying between 
 B and D, then ABCD forms an ordered series, B lying 
 between A and D. 
 
 20. The different series of probabilities and their mutual relations 
 can be most easily pictured by means of a diagram. Let us 
 represent an ordered series by points lying upon a path, all the 
 points on a given path belonging to the same series. It follows 
 from (i.) that the points O and I, representing the relations of 
 impossibility and certainty, lie on every path, and that all paths 
 lie wholly between these points. It follows from (iii.) that the 
 same point can lie on more than one path. It is possible, therefore, 
 for paths to intersect and cross. It follows from (iv.) that 
 the probability represented by a given point is greater than that 
 represented by any other point which can be reached by passing 
 along a path with a motion constantly towards the point of 
 impossibility, and less than that represented by any point which 
 can be reached by moving along a path towards the point of 
 certainty. As there are independent paths there will be some 
 pairs of points representing relations of probability such that we 
 cannot reach one by moving from the other along a path always 
 in the same direction. 
 
 These properties are illustrated in the annexed diagram. O
 represents impossibility, I certainty, and A a numerically
 measurable probability intermediate between O and I; U,
 V, W, X, Y, Z are non-numerical probabilities, of which, however,
 V is less than the numerical probability A, and is also less
 than W, X, and Y. X and Y are both greater than W, and greater
 than V, but are not comparable with one another, or with A.
 V and Z are both less than W, X, and Y, but are not comparable
 with one another; U is not quantitatively comparable with any of
 the probabilities V, W, X, Y, Z. Probabilities which are numerically
 comparable will all belong to one series, and the path of this
 series, which we may call the numerical path or strand, will be
 represented by OAI.
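
 The relations just enumerated may be sketched as a partial ordering.
 The encoding below is only an illustration: the set STEPS is a
 hypothetical reading of the diagram, chosen so as to reproduce the
 comparisons stated in the text, and the names less and comparable are
 introduced here for convenience. Pairs about which the text is silent
 (for example A and W) are left incomparable by assumption.

```python
from itertools import combinations

# Hypothetical encoding of the diagram: each pair (p, q) is one step
# along a path, read as "p is less than q".
STEPS = {
    ("O", "A"), ("A", "I"),            # the numerical strand OAI
    ("O", "V"), ("V", "A"), ("V", "W"),
    ("O", "Z"), ("Z", "W"),
    ("W", "X"), ("W", "Y"),
    ("X", "I"), ("Y", "I"),
    ("O", "U"), ("U", "I"),
}
POINTS = {p for step in STEPS for p in step}


def less(p, q):
    """True if q can be reached from p by moving always towards certainty."""
    frontier, seen = [p], set()
    while frontier:
        current = frontier.pop()
        for lo, hi in STEPS:
            if lo == current and hi not in seen:
                if hi == q:
                    return True
                seen.add(hi)
                frontier.append(hi)
    return False


def comparable(p, q):
    """Two points are comparable only if one path joins them in one direction."""
    return p == q or less(p, q) or less(q, p)


# Pairs of probabilities between which no relation of greater or less holds.
incomparable = sorted(
    tuple(sorted(pair))
    for pair in combinations(POINTS, 2)
    if not comparable(*pair)
)
print(incomparable)
```

 On this encoding X and Y are each greater than V and W, yet neither
 can be reached from the other, nor from A, by a motion always in the
 same direction.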
 
 21. The chief results which have been reached so far are 
 collected together below, and expressed with precision : — 
 
 (i.) There are amongst degrees of probability or rational
 belief various sets, each set composing an ordered
 series. These series are ordered by virtue of a relation
 of 'between.' If B is 'between' A and C, ABC form a
 series.
 (ii.) There are two degrees of probability O and I between
 which all other probabilities lie. If, that is to say, A
 is a probability, OAI form a series. O represents
 impossibility and I certainty.
 (iii.) If A lies between O and B, we may write this AB,
 so that OA and AI are true for all probabilities.
 (iv.) If AB, the probability B is said to be greater than
 the probability A, and this can be expressed by B > A.
 (v.) If the conclusion a bears the relation of probability
 P to the premiss h, or if, in other words, the hypothesis
 h invests the conclusion a with probability P, this may
 be written aPh. It may also be written a/h = P.
 This latter expression, which proves to be the more useful of the
 two for most purposes, is of fundamental importance. If aPh
 and a′Ph′, i.e. if the probability of a relative to h is the
 same as the probability of a′ relative to h′, this may be written
 a/h = a′/h′. The value of the symbol a/h, which represents
 what is called by other writers 'the probability of a,' lies in
 the fact that it contains explicit reference to the data to which
 the probability relates the conclusion, and avoids the numerous
 errors which have arisen out of the omission of this reference.
 
CHAPTER IV

 THE PRINCIPLE OF INDIFFERENCE

 ABSOLUTE. 'Sure, Sir, this is not very reasonable, to summon my affection
 for a lady I know nothing of.'
 SIR ANTHONY. 'I am sure, Sir, 'tis more unreasonable in you to object
 to a lady you know nothing of.' ¹
 
 1. In the last chapter, it was assumed that in some cases the
 probabilities of two arguments may be equal. It was also argued
 that there are other cases in which one probability is, in some
 sense, greater than another. But so far there has been nothing
 to show how we are to know when two probabilities are equal or
 unequal. The recognition of equality, when it exists, will be
 dealt with in this chapter, and the recognition of inequality in
 the next. An historical account of the various theories about
 this problem, which have been held from time to time, will be
 given in Chapter VII.
 
 2. The determination of equality between probabilities has
 received hitherto much more attention than the determination
 of inequality. This has been due to the stress which has been
 laid on the mathematical side of the subject. In order that
 numerical measurement may be possible, we must be given a
 number of equally probable alternatives. The discovery of a
 rule, by which equiprobability could be established, was, therefore,
 essential. A rule, adequate to the purpose, introduced by
 James Bernoulli, who was the real founder of mathematical
 probability,² has been widely adopted, generally under the
 title of The Principle of Non-Sufficient Reason, down to the
 present time. This description is clumsy and unsatisfactory,
 and, if it is justifiable to break away from tradition, I prefer to
 call it The Principle of Indifference.
 
 ¹ Quoted by Mr. Bosanquet with reference to the Principle of Non-Sufficient
 Reason.
 ² See also Chap. VII.
 
 
 The Principle of Indifference asserts that if there is no known
 reason for predicating of our subject one rather than another of 
 several alternatives, then relatively to such knowledge the 
 assertions of each of these alternatives have an equal probability. 
 Thus equal probabilities must be assigned to each of several 
 arguments, if there is an absence of positive ground for assigning 
 unequal ones. 
 
 This rule, as it stands, may lead to paradoxical and even 
 contradictory conclusions. I propose to criticise it in detail, 
 and then to consider whether any valid modification of it is 
 discoverable. For several of the criticisms which follow I am 
 much indebted to Von Kries's Die Principien der
 Wahrscheinlichkeit.¹
 
 3. If every probability was necessarily either greater than, 
 equal to, or less than any other, the Principle of Indifference 
 would be plausible. For if the evidence affords no ground for
 attributing unequal probabilities to the alternative predications, 
 it seems to follow that they must be equal. If, on the other hand, 
 there need be neither equality nor inequality between prob- 
 abilities, this method of reasoning fails. Apart, however, from 
 this objection, which is based on the arguments of Chapter III., 
 the plausibility of the principle will be most easily shaken by an 
 exhibition of the contradictions which it involves. These fall 
 under three or four distinct heads. In §§ 4-9 my criticism will 
 be purely destructive, and I shall not attempt in these paragraphs 
 to indicate my own way out of the difficulties.
 
 4. Consider a proposition, about the subject of which we know
 only the meaning, and about the truth of which, as applied to
 this subject, we possess no external relevant evidence. It has
 been held that there are here two exhaustive and exclusive
 alternatives, the truth of the proposition and the truth of its
 contradictory, while our knowledge of the subject affords no
 ground for preferring one to the other. Thus if a and ā are
 contradictories, about the subject of which we have no outside
 knowledge, it is inferred that the probability of each is ½.² In
 the same way the probabilities of two other propositions, b and c,
 having the same subject as a, may be each ½. But without
 having any evidence bearing on the subject of these propositions
 we may know that the predicates are contraries amongst themselves,
 and, therefore, exclusive alternatives, a supposition which
 leads by means of the same principle to values inconsistent with
 those just obtained. If, for instance, having no evidence relevant
 to the colour of this book, we could conclude that ½ is the
 probability of 'This book is red,' we could conclude equally that the
 probability of each of the propositions 'This book is black' and
 'This book is blue' is also ½. So that we are faced with the
 impossible case of three exclusive alternatives all as likely as not.
 A defender of the Principle of Indifference might rejoin that we
 are assuming knowledge of the proposition: 'Two different
 colours cannot be predicated of the same subject at the same
 time'; and that, if we know this, it constitutes relevant outside
 evidence. But such evidence is about the predicate, not
 about the subject. Thus the defender of the Principle will be
 driven on, either to confine it to cases where we know nothing
 about either the subject or the predicate, which would be to
 emasculate it for all practical purposes, or else to revise and
 amplify it, which is what we propose to do ourselves.

 ¹ Published in 1886. A brief account of Von Kries's principal conclusions
 will be given on p. 87. A useful summary of his book will be found in a review
 by Meinong, published in the Göttingische gelehrte Anzeigen for 1890 (pp. 66-75).

 ² Cf. (e.g.) the well-known passage in Jevons's Principles of Science, vol. i.
 p. 243, in which he assigns the probability ½ to the proposition "A Platythliptic
 Coefficient is positive." Jevons points out, by way of proof, that no other
 probability could reasonably be given. This, of course, involves the assumption
 that every proposition must have some numerical probability. Such a contention
 was first criticised, so far as I am aware, by Bishop Terrot in the Edin.
 Phil. Trans. for 1856. It was deliberately rejected by Boole in his last published
 work on probability: "It is a plain consequence," he says (Edin. Phil.
 Trans. vol. xxi. p. 624), "of the logical theory of probabilities, that the state
 of expectation which accompanies entire ignorance of an event is properly
 represented, not by the fraction ½, but by the indefinite form 0/0." Jevons's
 particular example, however, is also open to the objection that we do not even
 know the meaning of the subject of the proposition. Would he maintain that
 there is any sense in saying that for those who know no Arabic the probability
 of every statement expressed in Arabic is even? How far has he been
 influenced in the choice of his example by known characteristics of the predicate
 'positive'? Would he have assigned the probability ½ to the proposition
 'A Platythliptic Coefficient is a perfect cube'? What about the proposition
 'A Platythliptic Coefficient is allogeneous'?

 The difficulty cannot be met by saying that we must know
 and take account of the number of possible contraries. For the
 number of contraries to any proposition on any evidence is always
 infinite; āb is contrary to a for all values of b. The same point
 can be put in a form which does not involve contraries or
 contradictories. For example, a/h = ½ and ab/h = ½, if h is
 irrelevant both to a and to b, in the sense required by the crude
 Principle of Indifference.¹ It follows from this that, if a is true,
 b must be true also. If it follows from the absence of positive
 data that 'A is a red book' has a probability of ½, and that the
 probability of 'A is red' is also ½, then we may deduce that, if
 A is red, it must certainly be a book.
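
 The arithmetic of the two objections of this paragraph may be set out
 as a minimal sketch. The helper crude_probability is hypothetical and
 merely stands for the unqualified rule under criticism; the step from
 ab/h and a/h to the probability of b given a assumes the ordinary rule
 for the probability of a conjunction, which has not yet been
 established at this point of the argument.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def crude_probability(proposition):
    """Crude rule: with no evidence about the subject, every proposition
    is as likely as its contradictory, whatever the predicate may be."""
    return HALF


# Three mutually exclusive colour predicates asserted of the same book.
colours = ["red", "black", "blue"]
total = sum(crude_probability(f"This book is {c}") for c in colours)
print(total)            # 3/2, which exceeds certainty

# The conjunction form: a = 'A is red', ab = 'A is a red book'.
p_a = crude_probability("A is red")
p_ab = crude_probability("A is a red book")
p_b_given_a = p_ab / p_a   # assumed product rule: ab/h = (a/h) * (b/ah)
print(p_b_given_a)      # 1, i.e. if A is red it is certainly a book
```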
 
 We may take it, then, that the probability of a proposition, 
 about the subject of which we have no extraneous evidence, is 
 not necessarily ½. Whether or not this conclusion discredits the
 Principle of Indifference, it is important on its own account, and 
 will help later on to confute some famous conclusions of Laplace's 
 school. 
 
 5. Objection can now be made in a somewhat different shape.
 Let us suppose as before that there is no positive evidence relating
 to the subjects of the propositions under examination which
 would lead us to discriminate in any way between certain
 alternative predicates. If, to take an example, we have no
 information whatever as to the area or population of the
 countries of the world, a man is as likely to be an inhabitant
 of Great Britain as of France, there being no reason to prefer
 one alternative to the other.² He is also as likely to be an
 inhabitant of Ireland as of France. And on the same principle
 he is as likely to be an inhabitant of the British Isles as of
 France. And yet these conclusions are plainly inconsistent.
 For our first two propositions together yield the conclusion
 that he is twice as likely to be an inhabitant of the British
 Isles as of France.
 
 Unless we argue, as I do not think we can, that the knowledge 
 that the British Isles are composed of Great Britain and Ireland 
 is a ground for supposing that a man is more likely to inhabit 
 them than France, there is no way out of the contradiction. It 
 is not plausible to maintain, when we are considering the relative 
 populations of different areas, that the number of names of subdivisions
 which are within our knowledge, is, in the absence of
 any evidence as to their size, a piece of relevant evidence. 
 
 ¹ a/h stands for 'the probability of a on hypothesis h.'

 ² This example raises a difficulty similar to that raised by Von Kries's
 example of the meteor. Stumpf has propounded an invalid solution of Von
 Kries's difficulty. Against the example proposed here, Stumpf's solution has
 less plausibility than against Von Kries's.

 At any rate, many other similar examples could be invented,
 which would require a special explanation in each case; for the
 above is an instance of a perfectly general difficulty. The
 possible alternatives may be a, b, c, and d, and there may be no
 means of discriminating between them; but equally there may
 be no means of discriminating between (a or b), c, and d.
 This difficulty could be made striking in a variety of ways, but
 it will be better to criticise the principle further from a somewhat
 different side.
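
 The general difficulty may be put in figures. The following is a
 sketch only; the function indifference is a hypothetical name for the
 unguarded rule, applied first to the alternatives a, b, c, d and then
 to the alternatives (a or b), c, d.

```python
from fractions import Fraction


def indifference(alternatives):
    """Assign equal probability to each listed alternative (crude rule)."""
    share = Fraction(1, len(alternatives))
    return {alt: share for alt in alternatives}


# The same possibilities described in two ways.
fine = indifference(["a", "b", "c", "d"])
coarse = indifference(["a or b", "c", "d"])

p_fine = fine["a"] + fine["b"]      # a and b are exclusive, so their shares add
p_coarse = coarse["a or b"]
print(p_fine, p_coarse)             # 1/2 1/3
```

 The same proposition, that one of a or b is the case, thus receives
 two different values from the same data, according to the manner in
 which the alternatives happen to be enumerated.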
 
 6. Consider the specific volume of a given substance.¹ Let us
 suppose that we know the specific volume to lie between 1 and 3,
 but that we have no information as to whereabouts in this interval
 its exact value is to be found. The Principle of Indifference
 would allow us to assume that it is as likely to lie between 1 and
 2 as between 2 and 3; for there is no reason for supposing that it
 lies in one interval rather than in the other. But now consider
 the specific density. The specific density is the reciprocal of
 the specific volume, so that if the latter is v the former is 1/v.
 Our data remaining as before, we know that the specific density
 must lie between 1 and ⅓, and, by the same use of the Principle
 of Indifference as before, that it is as likely to lie between
 1 and ⅔ as between ⅔ and ⅓. But the specific volume being
 a determinate function of the specific density, if the latter lies
 between 1 and ⅔, the former lies between 1 and 1½, and if the
 latter lies between ⅔ and ⅓, the former lies between 1½ and 3.
 It follows, therefore, that the specific volume is as likely to lie
 between 1 and 1½ as between 1½ and 3; whereas we have already
 proved, relatively to precisely the same data, that it is as likely
 to lie between 1 and 2 as between 2 and 3. Moreover, any other
 function of the specific volume would have suited our purpose
 equally well, and by a suitable choice of this function we might
 have proved in a similar manner that any division whatever
 of the interval 1 to 3 yields sub-intervals of equal probability.
 Specific volume and specific density are simply alternative
 methods of measuring the same objective quantity; and there
 are many methods which might be adopted, each yielding on the
 application of the Principle of Indifference a different probability
 for a given objective variation in the quantity.²
 
 ¹ This example is taken from Von Kries, op. cit. p. 24. Von Kries does
 not seem to me to explain correctly how the contradiction arises.

 ² A. Nitsche ("Die Dimensionen der Wahrscheinlichkeit und die Evidenz der
 Ungewissheit," Vierteljahrsschr. f. wissensch. Philos. vol. xvi. p. 29, 1892), in
 criticising Von Kries, argues that the alternatives to which the principle must
 be applied are the smallest physically distinguishable intervals, and that the
 probability of the specific volume's lying within a certain range of values turns
 on the number of such distinguishable intervals in the range. This procedure
 might conceivably provide the correct method of computation, but it does not
 therefore restore the credit of the Principle of Indifference. For it is argued,
 not that the results of applying the principle are always wrong, but that it does
 not lead unambiguously to the correct procedure. If we do not know the
 number of distinguishable intervals we have no reason for supposing that the
 specific volume lies between 1 and 2 rather than 2 and 3, and the principle can
 therefore be applied as it has been applied above. And even if we do know
 the number and reckon intervals as equal which contain an equal number of
 'physically distinguishable' parts, is it certain that this does not simply
 provide us with a new system of measurement, which has the same conventional
 basis as the methods of specific volume and specific density, and is no
 more the one correct measure than these are?

 The arbitrary nature of particular methods of measurement
 of this and of many other physical quantities is easily explained.
 The objective quality measured may not, strictly speaking, possess
 numerical quantitativeness, although it has the properties necessary
 for measurement by means of correlation with numbers.
 The values which it can assume may be capable of being
 ranged in an order, and it will sometimes happen that the series
 which is thus formed is continuous, so that a value can always
 be found whose order in the series is between any two selected
 values; but it does not follow from this that there is any meaning
 in the assertion that one value is twice another value. The
 relations of continuous order can exist between the terms of a
 series of values, without the relations of numerical quantitativeness
 necessarily existing also, and in such cases we can adopt a
 largely arbitrary measure of the successive terms, which yields
 results which may be satisfactory for many purposes, those,
 for instance, of mathematical physics, though not for those of
 probability. This method is to select some other series of
 quantities or numbers, each of the terms of which corresponds
 in order to one and only one of the terms of the series which
 we wish to measure. For instance, the series of characteristics,
 differing in degree, which are measured by specific
 volume, have this relation to the series of numerical ratios
 between the volumes of equal masses of the substances, the
 specific volumes of which are in question, and of water. They
 have it also to the corresponding ratios which give rise to the
 measure of specific density. But these only yield conventional
 measurements, and the numbers with which we correlate the
 terms which we wish to measure can be selected in a variety of
 ways. It follows that equal intervals between the numbers
 which represent the ratios do not necessarily correspond to equal
 intervals between the qualities under measurement; for these
 numerical differences depend upon which convention of measurement
 we have selected.
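
 The conflict between the two conventions of measurement in § 6 can be
 verified by direct computation. This is a sketch under the assumption
 that the Principle is taken to mean equal probability for equal
 lengths on whichever scale is chosen; uniform_probability is a name
 introduced here for that reading.

```python
from fractions import Fraction


def uniform_probability(lo, hi, a, b):
    """Probability of [lo, hi] when equal lengths of [a, b] are equally likely."""
    return Fraction(hi - lo) / Fraction(b - a)


third, half, two_thirds, one = (Fraction(1, 3), Fraction(1, 2),
                                Fraction(2, 3), Fraction(1))

# Indifference applied to the specific volume v, known to lie in [1, 3].
p_volume_scale = uniform_probability(1, 2, 1, 3)

# Indifference applied to the specific density 1/v, which lies in [1/3, 1].
# "v between 1 and 2" is the same event as "density between 1/2 and 1".
p_density_scale = uniform_probability(half, one, third, one)

# On the density scale the even division falls at density 2/3, i.e. v = 1 1/2.
p_v_below_one_and_a_half = uniform_probability(two_thirds, one, third, one)

print(p_volume_scale, p_density_scale, p_v_below_one_and_a_half)  # 1/2 3/4 1/2
```

 The proposition that the specific volume lies between 1 and 2 thus
 receives the value ½ on the one scale and ¾ on the other, the data
 being the same in both cases.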
 
 7. A somewhat analogous difficulty arises in connection with
 the problems of what is known as 'geometrical' or 'local'
 probability.¹ In these problems we are concerned with the position
 of a point or infinitesimal area or volume within a continuum.²
 The number of cases here is indefinite, but the Principle
 of Indifference has been held to justify the supposition that equal
 lengths or areas or volumes of the continuum are, in the absence
 of discriminating evidence, equally likely to contain the point.
 It has long been known that this assumption leads in numerous
 cases to contradictory conclusions. If, for instance, two points
 A and A′ are taken at random on the surface of a sphere, and we
 seek the probability that the lesser of the two arcs of the great
 circle AA′ is less than α, we get one result by assuming that the
 probability of a point's lying on a given portion of the sphere's
 surface is proportional to the area of that portion, and another
 result by assuming that, if a point lies on a given great circle, the
 probability of its lying on a given arc of that circle is proportional
 to the length of the arc, each of these assumptions being equally
 justified by the Principle of Indifference.
 
 ¹ The best accounts of this subject are to be found in Czuber, Geometrische
 Wahrscheinlichkeiten und Mittelwerte; Czuber, Wahrscheinlichkeitsrechnung,
 vol. i. pp. 75-109; Crofton, Encycl. Brit. (9th edit.), article 'Probability';
 Borel, Éléments de la théorie des probabilités, chaps. vi.-viii.; a few other
 references are given in the following pages, and a number of discussions of
 individual problems will be found in the mathematical volumes of the
 Educational Times. The interest of the subject is primarily mathematical,
 and no discussion of its principal problems will be attempted here.

 ² As Czuber points out (Wahrscheinlichkeitsrechnung, vol. i. p. 84), all
 problems, whether geometrical or arithmetical, which deal with a continuum
 and with non-enumerable aggregates, are commonly discussed under the name of
 'geometrical probability.' See also Lämmel, Untersuchungen.

 Or consider the following problem: if a chord in a circle is
 drawn at random, what is the probability that it will be less
 than the side of the inscribed equilateral triangle? One can
 argue:

 (a) It is indifferent at what point one end of the chord lies.
 If we suppose this end fixed, the direction is then
 chosen at random. In this case the answer is easily
 shown to be ⅔.
 (b) It is indifferent in what direction we suppose the chord
 to lie. Beginning with this apparently not less justifiable
 assumption, we find that the answer is ½.
 (c) To choose a chord at random, one must choose its
 middle point at random. If the chord is to be less
 than the side of the inscribed equilateral triangle, the
 middle point must be at a greater distance from the
 centre than half the radius. But the area at a
 greater distance than this is ¾ of the whole. Hence
 our answer is ¾.¹

 In general, if x and f(x) are both continuous variables, varying
 always in the same or in the opposite sense, and x must lie
 between a and b, then the probability that x lies between c
 and d, where a < c < d < b, seems to be $\frac{d-c}{b-a}$, and the probability
 that f(x) lies between f(c) and f(d) to be $\frac{f(d)-f(c)}{f(b)-f(a)}$. These
 expressions, which represent the probabilities of necessarily
 concordant conclusions, are not, as they ought to be, equal.²
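
 The three answers to the problem of the chord may be checked by a
 rough simulation. What follows is a sketch only: the circle is taken
 to be of unit radius, the three sampling rules are the customary
 readings of arguments (a), (b), and (c) and are to that extent
 assumptions, and the names chord_a, chord_b, chord_c, and estimate are
 introduced here for illustration.

```python
import math
import random

SIDE = math.sqrt(3.0)   # side of the equilateral triangle inscribed in a unit circle


def chord_a(rng):
    # (a) one end fixed; the chord's angle to the tangent is uniform on (0, pi)
    theta = rng.uniform(0.0, math.pi)
    return 2.0 * math.sin(theta)


def chord_b(rng):
    # (b) direction indifferent; distance of the chord from the centre
    # is uniform on (0, 1) along the perpendicular radius
    d = rng.uniform(0.0, 1.0)
    return 2.0 * math.sqrt(1.0 - d * d)


def chord_c(rng):
    # (c) middle point uniform over the area of the circle, so that the
    # squared distance of the middle point from the centre is uniform on (0, 1)
    r_squared = rng.uniform(0.0, 1.0)
    return 2.0 * math.sqrt(1.0 - r_squared)


def estimate(sampler, trials=200_000, seed=0):
    """Fraction of sampled chords shorter than the side of the triangle."""
    rng = random.Random(seed)
    short = sum(sampler(rng) < SIDE for _ in range(trials))
    return short / trials


estimates = {name: estimate(sampler) for name, sampler in
             [("a", chord_a), ("b", chord_b), ("c", chord_c)]}
print(estimates)
```

 The estimates should fall near ⅔, ½, and ¾ respectively, each rule
 being a different way of taking a chord 'at random.'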
 
 8. More than one attempt has been made to separate the
 cases in which the Principle of Indifference can be legitimately
 applied to examples of geometrical probability from those in
 which it cannot. M. Borel argues that the mathematician can
 define the geometrical probability that a point M lies on a certain
 segment PQ of AD as proportional to the length of the segment,
 but that this definition is conventional until its consequences
 have been confirmed à posteriori by their conformity with the
 results of empirical observation. He points out that in actual
 cases there are generally some considerations present which
 lead us to prefer one of the possible assumptions to the others.
 Whether or not this is so, the proposed procedure amounts to
 an abandonment of the Principle of Indifference as a valid
 criterion, and leaves our choice undetermined when further
 evidence is not forthcoming.
 
 ¹ Bertrand, Calcul des probabilités, p. 5.
 ² See (e.g.) Borel, Éléments de la théorie des probabilités, p. 85.

 M. Poincaré, who also held that judgments of equiprobability
 in such cases depend upon a 'convention,' endeavoured to minimise
 the importance of the arbitrary element by showing that,
 under certain conditions, the result is independent of the particular
 convention which is chosen. Instead of assuming that the
 point is equally likely to lie in every infinitesimal interval dx
 we may represent the probability of its lying in this interval by
 the function φ(x)dx. M. Poincaré showed that, in the game of
 rouge et noir, for instance, where we have a number of compartments
 arranged in a circle coloured alternately black and white,
 if we can assume that φ(x) is a regular function, continuous and
 with continuous differential coefficients, then, whatever the
 particular form of the function, the probability of black is
 approximately equal to that of white.¹
 
 Whether or not investigations on these lines prove to have
 a practical value, they have not, I think, any theoretical importance.
 If, as I maintain, the probability φ(x) is not necessarily
 numerical, it is not a generally justifiable assumption to
 take its continuity for granted. We have, in the particular
 example quoted, a number of alternatives, half of which lead to
 black and half to white; the assumption of continuity amounts
 to the assumption that for every white alternative there is a
 black alternative whose probability is very nearly equal to that
 of the white. Naturally in such a case we can get an approximately
 equal probability for the whites as a whole and for the
 blacks as a whole, without assuming equal probability for each
 alternative individually. But this fact has no bearing on the
 theoretical difficulties which we are discussing.
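
 M. Poincaré's result may be illustrated numerically. The sketch below
 assumes a particular and purely hypothetical law, φ(x) = 3x² for x
 between 0 and 1, the wheel being unrolled into that interval; this law
 is regular within the interval, though it does not join up at the two
 ends. Nothing in the text prescribes this choice.

```python
from fractions import Fraction


def cumulative(x):
    # Hypothetical non-uniform law: phi(x) = 3x^2 on [0, 1), so that the
    # probability of the point lying below x is x^3.
    return x ** 3


def probability_of_black(compartments):
    """Total probability of the even-numbered (black) compartments."""
    n = compartments
    return sum(
        cumulative(Fraction(k + 1, n)) - cumulative(Fraction(k, n))
        for k in range(0, n, 2)
    )


for n in (2, 4, 38, 200):
    print(n, float(probability_of_black(n)))
```

 For this law the probability of black works out at ½ − 3/(4n) for n
 compartments, so that it approaches ½ as the compartments grow
 narrower, although the individual compartments remain far from
 equally probable.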
 
 M. Bertrand is so much impressed by the contradictions of
 geometrical probability that he wishes to exclude all examples
 in which the number of alternatives is infinite.² It will be argued
 in the sequel that something resembling this is true. The discussion
 of this question will be resumed in §§ 21-25.
 
 ¹ Poincaré, Calcul des probabilités, pp. 126 et seq.

 ² Bertrand, Calcul des probabilités, p. 4: "L'infini n'est pas un nombre;
 on ne doit pas, sans explication, l'introduire dans les raisonnements. La
 précision illusoire des mots pourrait faire naître des contradictions. Choisir
 au hasard, entre un nombre infini de cas possibles, n'est pas une indication
 suffisante."

 9. There is yet another group of cases, distinct in character
 from those considered so far, in which the principle does not
 seem to provide us with unambiguous guidance. The typical
 example is that of an urn containing black and white balls in an
 unknown proportion.¹ The Principle of Indifference can be
 claimed to support the most usual hypothesis, namely, that all
 possible numerical ratios of black and white are equally probable.
 But we might equally well assume that all possible constitutions ²
 of the system of balls are equally probable, so that each individual
 ball is assumed equally likely to be black or white. It would
 follow from this that an approximately equal number of black
 and white balls is more probable than a large excess of one colour.
 On this hypothesis, moreover, the drawing of one ball and the
 resulting knowledge of its colour leaves unaltered the probabilities
 of the various possible constitutions of the rest of the bag;
 whereas on the first hypothesis knowledge of the colour of one
 ball, drawn and not replaced, manifestly alters the probability
 of the colour of the next ball to be drawn. Either of these hypotheses
 seems to satisfy the Principle of Indifference, and a believer
 in the absolute validity of the principle will doubtless adopt that
 one which enters his mind first.³
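
 The divergence of the two hypotheses can be exhibited by enumeration
 for a small urn. This is a sketch only: the number of balls N is
 chosen arbitrarily, the first and second positions of each listed
 constitution are taken to be the first and second balls drawn, and the
 names used are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

N = 4  # hypothetical number of balls in the urn


def second_white_given_first_white(weighted_constitutions):
    """P(second ball white | first white), drawing in the listed order."""
    first = sum(w for balls, w in weighted_constitutions if balls[0] == "W")
    both = sum(w for balls, w in weighted_constitutions
               if balls[0] == "W" and balls[1] == "W")
    return both / first


constitutions = list(product("WB", repeat=N))

# 'Constitution' hypothesis: every assignment of colours to the individual
# balls is equally probable.
by_constitution = [(c, Fraction(1, len(constitutions))) for c in constitutions]


# 'Ratio' hypothesis: every number of white balls from 0 to N is equally
# probable, the weight of each ratio being shared among its constitutions.
def ratio_weight(c):
    same_ratio = sum(1 for other in constitutions
                     if other.count("W") == c.count("W"))
    return Fraction(1, N + 1) / same_ratio


by_ratio = [(c, ratio_weight(c)) for c in constitutions]

print(second_white_given_first_white(by_constitution))  # 1/2
print(second_white_given_first_white(by_ratio))         # 2/3
```

 On the 'constitution' hypothesis the first drawing teaches nothing
 about the second; on the 'ratio' hypothesis it raises the probability
 of a second white from ½ to ⅔.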
 
 ¹ The difficulty in question was first pointed out by Boole, Laws of Thought,
 pp. 369-370. After discussing the Law of Succession, Boole proceeds to show
 that "there are other hypotheses, as strictly involving the principle of the
 'equal distribution of knowledge or ignorance' which would also conduct to
 conflicting results." See also Von Kries, op. cit. pp. 31-34, 59, and Stumpf,
 Über den Begriff der mathematischen Wahrscheinlichkeit, Bavarian Academy,
 1892, pp. 64-68.

 ² If A and B are two balls, A white, B black, and A black, B white, are
 different 'constitutions.' But if we consider different numerical ratios, these
 two cases are indistinguishable, and count as one only.

 ³ C. S. Peirce in his Theory of Probable Inference (Johns Hopkins Studies in
 Logic), pp. 172, 173, argues that the 'constitution' hypothesis is alone valid,
 on the ground that, of the two hypotheses, only this one is consistent with itself.
 I agree with his conclusion, and shall give at the close of the chapter the
 fundamental considerations which lead to the rejection of the 'ratio' hypothesis.
 Stumpf points out that the probability of drawing a white ball is, in any
 case, ½. This is true; but the probability of a second white clearly depends
 upon which of the two hypotheses has been preferred. Nitsche (loc. cit. p. 31)
 seems to miss the point of the difficulty in the same way.

 The same point is very clearly illustrated by an example
 which I take from Von Kries. Two cards, chosen from different
 packs, are placed face downwards on the table; one is taken
 up and found to be of a black suit: what is the chance that the
 other is black also? One would naturally reply that the
 chance is even. But this is based on the supposition, relatively
 unpopular with writers on the subject, that every 'constitution'
 is equally probable, i.e. that each individual card is as likely
 to be black as red. If we prefer this assumption, we must relinquish
 the text-book theory that the drawing of a black ball from
 an urn, containing black and white balls in unknown proportions,
 affects our knowledge as to the proportion of black and white
 amongst the remaining balls.

 The alternative, or text-book, theory assumes that there
 are three equal possibilities: one of each colour, both black, both
 red. If both cards are black, we are twice as likely to turn up
 a black card as if only one is black. After we have turned up
 a black, the probability that the other is black is, therefore, twice
 as great as the probability that it is red. The chance of the
 second's being black is therefore ⅔.¹ The Principle of Indifference
 has nothing to say against either solution. Until some further
 criterion has been proposed we seem compelled to agree with
 Poincaré that a preference for either hypothesis is wholly arbitrary.
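
 The two solutions of the problem of the cards follow from the same
 calculation applied to different initial assignments. The sketch below
 assumes the usual rule of inverse probability; the function
 other_is_black and the labels of the three states are introduced here
 for illustration.

```python
from fractions import Fraction


def other_is_black(prior):
    """P(other card black | the card taken up at random is black)."""
    # Chance of turning up a black card under each state of the pair.
    likelihood = {"both black": Fraction(1), "one of each": Fraction(1, 2),
                  "both red": Fraction(0)}
    evidence = sum(prior[s] * likelihood[s] for s in prior)
    return prior["both black"] * likelihood["both black"] / evidence


# Text-book theory: the three states are equally possible.
text_book = {s: Fraction(1, 3)
             for s in ("both black", "one of each", "both red")}

# 'Constitution' theory: each card separately is as likely black as red.
constitution = {"both black": Fraction(1, 4), "one of each": Fraction(1, 2),
                "both red": Fraction(1, 4)}

print(other_is_black(text_book), other_is_black(constitution))  # 2/3 1/2
```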
 
 10. Such, then, are the kinds of result to which an unguarded 
 use of the Principle of Indifference may lead us. The difficulties, 
 to which attention has been drawn, have been noticed before ; 
 but the discredit has not been emphatically thrown on the 
 original source of error. Yet the principle certainly remains as
 a negative criterion ; two propositions cannot be equally probable, 
 so long as there is any ground for discriminating between them. 
 The principle is a necessary, but not, as it seems, a sufficient 
 condition. 
 
 The enunciation of some sufficient rule is certainly essential if
 we are to make any progress in the subject. But the difficulty 
 of discovering a correct principle is considerable. This difficulty 
 is partly responsible, I think, for the doubts which philosophers 
 and many others have often felt regarding any practical applica- 
 tion of the Calculus. Many candid persons, when confronted 
 with the results of Probability, feel a strong sense of the uncertainty
 of the logical basis upon which it seems to rest. It is
 difficult to find an intelligible account of the meaning of 'probability,'
 or of how we are ever to determine the probability of any
 particular proposition ; and yet treatises on the subject profess 
 to arrive at complicated results of the greatest precision and the 
 most profound practical importance. 
 
 ¹ This is Poisson's solution, Recherches, p. 96.

 The incautious methods and exaggerated claims of the school
 of Laplace have undoubtedly contributed towards the existence
 of these sentiments. But the general scepticism, which I believe
 to be much more widely spread than the literature of the subject
 admits, is more fundamental. In this matter Hume need not
 have felt "affrighted and confounded with that forelorn solitude,
 in which I am placed in my philosophy," or have fancied himself
 "some strange uncouth monster, who not being able to mingle
 and unite in society, has been expell'd all human commerce,
 and left utterly abandon'd and disconsolate." In his views on
 probability, he stands for the plain man against the sophisms
 and ingenuities of "metaphysicians, logicians, mathematicians,
 and even theologians."
 
 Yet such scepticism goes too far. The judgments of proba- 
 bility, upon which we depend for almost all our beliefs in matters 
 of experience, undoubtedly depend on a strong psychological 
 propensity in us to consider objects in a particular light. But 
 this is no ground for supposing that they are nothing more than 
 " lively imaginations." The same is true of the judgments in 
 virtue of which we assent to other logical arguments ; and yet 
 in such cases we believe that there may be present some element 
 of objective validity, transcending the psychological impulsion, 
 with which primarily we are presented. So also in the case of 
 probability, we may believe that our judgments can penetrate 
 into the real world, even though their credentials are subjective. 
 
 11. We must now inquire how far it is possible to rehabilitate 
 the Principle of Indifference or find a substitute for it. There 
 are several distinct difficulties which need attention in a dis- 
 cussion of the problems raised in the preceding paragraphs. 
 Our first object must be to make the Principle itself more precise 
 by disclosing how far its application is mechanical and how far 
 it involves an appeal to logical intuition. 
 
 12. Without compromising the objective character of relations 
 of probability, we must nevertheless admit that there is little 
 likelihood of our discovering a method of recognising particular 
 probabilities, without any assistance whatever from intuition or 
 direct judgment. Inasmuch as it is always assumed that we can 
 sometimes judge directly that a conclusion follows from a premiss, 
 it is no great extension of this assumption to suppose that we 
 can sometimes recognise that a conclusion partially follows from, 
 or stands in a relation of probability to, a premiss. Moreover, 
 the failure to explain or define ' probability ' in terms of other 
 logical notions, creates a presumption that particular relations 
 of probability must be, in the first instance, directly recognised 
 as such, and cannot be evolved by rule out of data which them- 
 selves contain no statements of probability. 
 
 On the other hand, although we cannot exclude every element 
 of direct judgment, these judgments may be limited and con- 
 trolled, perhaps, by logical rules and principles which possess a 
 general application. While we may possess a faculty of direct 
 recognition of many relations of probability, as in the case of 
 many other logical relations, yet some may be much more 
 easily recognisable than others. The object of a logical system 
 of probability is to enable us to know the relations, which 
 cannot be easily perceived, by means of other relations which 
 we can recognise more distinctly, and so to convert, in fact, vague 
 knowledge into more distinct knowledge.¹ 
 
 1 As it is the aim of trigonometry to determine the position of an object, 
 which is in a sense visible, not by a direct observation of it, but by observing 
 some other object together with certain relations, so an indirect method of this 
 kind is the aim of all logical system. If the truth of some propositions, and the 
 validity of some arguments, could not be recognised directly, we could make no 
 progress. We may have, moreover, some power of direct recognition where it 
 is not necessary in our logical system that we should make use of it. In these 
 cases the method of logical proof increases the certainty of knowledge, which 
 we might be able to possess in a more doubtful manner without it. In other 
 cases, that, for instance, of a complicated mathematical theorem, it enables 
 us to know propositions to be true, which are altogether beyond the reach of 
 our direct insight ; just as we can often obtain knowledge about the position 
 of a partially visible or even invisible object by starting with observations of 
 other objects. 

 13. Let us seek to distinguish between the element of direct 
 judgment and the element of mechanical rule in the Principle 
 of Indifference. The enunciation of this principle, as it is 
 ordinarily expressed, cloaks, but does not avoid, the former 
 element. It is in part a formula and in part an appeal to direct 
 inspection ; but in addition to the obscurity and ambiguity of 
 the formula, the appeal to intuition is not as explicit as it should 
 be. The principle states that ' there must be no known 
 reason for preferring one of a set of alternatives to any other.' 
 What does this mean ? What are ' reasons,' and how are 
 we to know whether they do or do not justify us in preferring 
 one alternative to another ? I do not know any discussion 
 of Probability in which this question has been so much as 
 asked. If, for example, we are considering the probability 
 of drawing a black ball from an urn containing balls which are 
 black and white, we assume that the difference of colour between 
 the balls is not a reason for preferring either alternative. 
 But how do we know this, unless by a judgment that, on the 
 evidence in hand, our knowledge of the colours of the balls is 
 irrelevant to the probability in question ? We know of some 
 respects in which the alternatives differ ; but we judge that a 
 knowledge of these differences is not relevant. If, on the other 
 hand, we were taking the balls out of the urn with a magnet, 
 and knew that the black balls were of iron and the white of tin, 
 we might regard the fact, that a ball was iron and not tin, as 
 very important in determining the probability of its being 
 drawn. Before, then, we can begin to apply the Principle of 
 Indifference, we must have made a number of direct judgments 
 to the effect that the probabilities under consideration are un- 
 affected by the inclusion in the evidence of certain particular 
 details. We have no right to say of any known difference 
 between the two alternatives that it is ' no reason ' for preferring 
 one of them, unless we have judged that a knowledge of this 
 difference is irrelevant to the probability in question. 
 
 14. A brief digression is now necessary, in order to introduce 
 some new terms. There are in general two principal types of 
 probabilities, the magnitudes of which we seek to compare, — 
 those in which the evidence is the same and the conclusions 
 different, and those in which the evidence is different but the 
 conclusion the same. Other types of comparison may be re- 
 quired, but these two are by far the commonest. In the first 
 we compare the likelihood of two conclusions on given evidence ; 
 in the second we consider what difference a change of evidence 
 makes to the likelihood of a given conclusion. In symbolic 
 language we may wish to compare $x/h$ with $y/h$, or $x/h$ with 
 $x/h_1h$. We may call the first type judgments of preference, or, 
 when there is equality between $x/h$ and $y/h$, of indifference ; and 
 the second type we may call judgments of relevance, or, when there 
 is equality between $x/h$ and $x/h_1h$, of irrelevance. In the first 
 we consider whether or not $x$ is to be preferred to $y$ on evidence $h$ ; 
 in the second we consider whether the addition of $h_1$ to evidence 
 $h$ is relevant to $x$. 
 
 The Principle of Indifference endeavours to formulate a rule 
 which will justify judgments of indifference. But the rule that 
 there must be no ground for preferring one alternative to another, 
 involves, if it is to be a guiding rule at all, and not a petitio 
 principii, an appeal to judgments of irrelevance. 
 
 The simplest definition of Irrelevance is as follows : $h_1$ is 
 irrelevant to $x$ on evidence $h$, if the probability of $x$ on evidence $h_1h$ 
 is the same as its probability on evidence $h$.¹ But for a reason 
 which will appear in Chapter VI., a stricter and more complicated 
 definition, as follows, is theoretically preferable : $h_1$ is irrelevant 
 to $x$ on evidence $h$, if there is no proposition, inferrible from $h_1h$ 
 but not from $h$, such that its addition to evidence $h$ affects the 
 probability of $x$.² Any proposition which is irrelevant in the 
 strict sense is, of course, also irrelevant in the simpler sense ; 
 but if we were to adopt the simpler definition, it would sometimes 
 occur that a part of evidence would be relevant, which taken as 
 a whole was irrelevant. The more elaborate definition by avoiding 
 this proves in the sequel more convenient. If the condition 
 $x/h_1h = x/h$ alone is satisfied, we may say that the evidence $h_1$ 
 is ' irrelevant as a whole.' ³ 
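
 The difference between the two definitions can be illustrated by a small 
 numerical sketch. It assumes, purely for illustration, that probabilities are 
 numerical ratios over a finite set of weighted cases, which the argument of 
 this chapter does not require. The weights, the propositions c and d, and the 
 helper prob are hypothetical and are not taken from the text. Here the 
 evidence "c and d" is irrelevant as a whole to x, although its part c, which 
 is inferrible from it, is relevant. 

```python
from fractions import Fraction

# Hypothetical weighted cases (x, c, d) -> weight. Illustrative assumption only.
WORLDS = {
    (True, True, True): 1,
    (False, True, True): 1,
    (True, True, False): 2,
    (False, False, False): 2,
}

def prob(event, given=lambda w: True):
    """Exact conditional probability of `event` given `given`."""
    total = sum(wt for w, wt in WORLDS.items() if given(w))
    hits = sum(wt for w, wt in WORLDS.items() if given(w) and event(w))
    return Fraction(hits, total)

x = lambda w: w[0]
c = lambda w: w[1]
c_and_d = lambda w: w[1] and w[2]

p_x = prob(x)                 # x/h, with h the background knowledge
p_x_whole = prob(x, c_and_d)  # x/h1h, where h1 is "c and d"
p_x_part = prob(x, c)         # x/ch, where c is inferrible from h1

# h1 leaves the probability of x unchanged, but its part c does not.
print(p_x, p_x_whole, p_x_part)
```

 On the simpler definition "c and d" would count as irrelevant ; on the 
 stricter definition it does not, because c alone alters the probability of x. 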
 
 It will be convenient to define also two other phrases. $h_1$ 
 and $h_2$ are independent and complementary parts of the evidence, 
 if between them they make up $h$ and neither can be inferred from 
 the other. If $x$ is the conclusion, and $h_1$ and $h_2$ are independent 
 and complementary parts of the evidence, then $h_1$ is relevant if 
 the addition of it to $h_2$ affects the probability of $x$.⁴ 
 
 Some propositions regarding irrelevance will be proved in 
 Part II. If $\bar{h}_1$ is the contradictory of $h_1$ and $x/h_1h = x/h$, then 
 $x/\bar{h}_1h = x/h$. Thus the contradictory of irrelevant evidence is 
 also irrelevant. Also, if $x/yh = x/h$, it follows that $y/xh = y/h$. 
 Hence if, on initial evidence $h$, $y$ is irrelevant to $x$, then, on the 
 same initial evidence, $x$ is irrelevant to $y$, i.e. if in a given state 
 of knowledge one occurrence has no bearing on another, then 
 equally the second has no bearing on the first. 
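
 The two propositions just stated can be checked in the special case of 
 numerical probabilities over a finite set of weighted cases. This is a sketch 
 under that assumption only ; the weights and the helper prob are 
 illustrative and not part of the text, and the proofs themselves belong to 
 Part II. 

```python
from fractions import Fraction

# Hypothetical weighted cases (x, y) -> weight, chosen so that y is
# irrelevant to x. Illustrative assumption only.
WORLDS = {
    (True, True): 1,
    (True, False): 3,
    (False, True): 2,
    (False, False): 6,
}

def prob(event, given=lambda w: True):
    """Exact conditional probability of `event` given `given`."""
    total = sum(wt for w, wt in WORLDS.items() if given(w))
    hits = sum(wt for w, wt in WORLDS.items() if given(w) and event(w))
    return Fraction(hits, total)

x = lambda w: w[0]
y = lambda w: w[1]
not_y = lambda w: not w[1]

p_x = prob(x)                  # x/h
p_x_given_y = prob(x, y)       # x/yh
p_x_given_not_y = prob(x, not_y)  # x with the contradictory of y added
p_y = prob(y)                  # y/h
p_y_given_x = prob(y, x)       # y/xh

print(p_x, p_x_given_y, p_x_given_not_y, p_y, p_y_given_x)
```

 In this example the addition of y, or of its contradictory, leaves the 
 probability of x unchanged, and the addition of x leaves the probability of 
 y unchanged. 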
 
 1 That is to say, $h_1$ is irrelevant to $x/h$ if $x/h_1h = x/h$. 

 2 That is to say, $h_1$ is irrelevant to $x/h$, if there is no proposition $h_1'$ such that 
 $h_1'/h_1h = 1$, $h_1'/h \neq 1$, and $x/h_1'h \neq x/h$. 

 3 Where no misunderstanding can arise, the qualification ' as a whole ' will 
 be sometimes omitted. 

 4 I.e. (in symbolism) $h_1$ and $h_2$ are independent and complementary parts of 
 $h$ if $h_1h_2 = h$, $h_1/h_2 \neq 1$, and $h_2/h_1 \neq 1$. Also $h_1$ is relevant if $x/h_1h_2 \neq x/h_2$. 

 15. This distinction enables us to formulate the Principle of 
 Indifference at any rate more precisely. There must be no 
 relevant evidence relating to one alternative, unless there is 
 corresponding evidence relating to the other ; our relevant 
 evidence, that is to say, must be symmetrical with regard to the 
 alternatives, and must be applicable to each in the same manner. 
 This is the rule at which the Principle of Indifference somewhat 
 obscurely aims. We must first determine what parts of our 
 evidence are relevant on the whole by a series of judgments of rele- 
 vance, not easily reduced to rule, of the type described above. 
 If this relevant evidence is of the same form for both alternatives, 
 then the Principle authorises a judgment of indifference. 
 
 16. This rule can be expressed more precisely in symbolic 
 language. Let us assume, to begin with, that the alternative 
 conclusions are expressible in the forms $\phi(a)$ and $\phi(b)$, where 
 $\phi(x)$ is a propositional function.¹ The difference between them, 
 that is to say, can be represented in terms of a single variable. 
 
 The Principle of Indifference is applicable to the alternatives 
 $\phi(a)$ and $\phi(b)$, when the evidence $h$ is so constituted that, if $f(a)$ 
 is an independent part of $h$ (see § 14) which is relevant to $\phi(a)$, 
 and does not contain any independent parts which are irrelevant 
 to $\phi(a)$, then $h$ includes $f(b)$ also. 
 
 The rule can be extended by successive steps to cases in 
 which we have more than one variable. We can, if the necessary 
 conditions are fulfilled, successively compare the probabilities 
 of $\phi(a_1a_2)$ and $\phi(b_1a_2)$, and of $\phi(b_1a_2)$ and $\phi(b_1b_2)$, and establish 
 equality between $\phi(a_1a_2)$ and $\phi(b_1b_2)$. 
 
 This elucidation is suited to most of the cases to which the 
 Principle of Indifference is ordinarily applied. Thus in the 
 favourite examples in which balls are drawn from urns, we can 
 infer from our evidence no relevant proposition about white balls, 
 such that we cannot infer a corresponding proposition about 
 black balls. Most of the examples, to which the mathematical 
 theory of chances has been applied, and which depend upon the 
 Principle of Indifference, can be arranged, I think, in the forms 
 which the rule requires as formulated above. 
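
 The symmetry which the rule demands can be sketched as a check on a list of 
 evidence. The representation below is an assumption made for illustration : 
 each item of evidence is a pair (form, subject), and the judgment of relevance, 
 which the text insists is a direct judgment and not a mechanical rule, is 
 supplied from outside as a function. The names indifference_applicable, 
 EVIDENCE, relevant_by_hand and relevant_with_magnet are hypothetical. 

```python
def indifference_applicable(evidence, a, b, relevant):
    """True if the relevant evidence is of the same form for a and for b.

    evidence: set of (form, subject) pairs, e.g. ("is made of iron", "black")
    relevant: function form -> bool, standing in for a direct judgment
    """
    forms_a = {form for form, subject in evidence if subject == a and relevant(form)}
    forms_b = {form for form, subject in evidence if subject == b and relevant(form)}
    return forms_a == forms_b

# The urn of section 13: black balls of iron, white balls of tin.
EVIDENCE = {
    ("is a ball in the urn", "black"),
    ("is a ball in the urn", "white"),
    ("is made of iron", "black"),
    ("is made of tin", "white"),
}

# Drawing by hand: the material is judged irrelevant.
relevant_by_hand = lambda form: form == "is a ball in the urn"
# Drawing with a magnet: the material is judged relevant.
relevant_with_magnet = lambda form: True

by_hand = indifference_applicable(EVIDENCE, "black", "white", relevant_by_hand)
with_magnet = indifference_applicable(EVIDENCE, "black", "white", relevant_with_magnet)
print(by_hand, with_magnet)
```

 The check itself is mechanical ; everything that matters is carried by the 
 judgments of relevance passed in, which is the point the rule is intended to 
 make explicit. 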
 
 1 If $\phi(a)$, $\phi(b)$, etc., are propositions, and $x$ is a variable, capable of taking 
 the values $a$, $b$, etc., then $\phi(x)$ is a propositional function. 

 17. We can now clear up the difficulties which arose over the 
 group of cases dealt with in § 9, the typical example of which was 
 the problem of the urn containing black and white balls in an 
 unknown proportion. This more precise enunciation of the 
 Principle enables us to show that of the two solutions the equiprobability 
 of each ' constitution ' is alone legitimate, and the 
 equiprobability of each numerical ratio erroneous. Let us write 
 the alternative ' The proportion of black balls is $x$ ' $= \phi(x)$, and 
 the datum ' There are $n$ balls in the bag, with regard to none 
 of which it is known whether they are black or white ' $= h$. 
 On the ' ratio ' hypothesis it is argued that the Principle of 
 Indifference justifies the judgment of indifference, $\phi(x)/h = \phi(y)/h$. 
 In order that this may be valid, it must be possible to 
 state the relevant evidence in the form $f(x)$ $f(y)$. But this is 
 not the case. If $x = \frac{1}{2}$ and $y = \frac{1}{4}$, we have relevant knowledge 
 about the way in which a proportion of black balls of one half 
 can arise, which is not identical with our knowledge of the way 
 in which a proportion of one quarter can arise. If there are four 
 balls, A, B, C, D, one half are black, if A, B or A, C or A, D or 
 B, C or B, D or C, D are black ; and one quarter are black, 
 if A or B or C or D are black. These propositions are not identical 
 in form, and only by a false judgment of irrelevance can we 
 ignore them. On the ' constitution ' hypothesis, however, 
 where A, B black and A, C black are treated as distinct alter- 
 natives, this want of symmetry in our relevant evidence cannot 
 arise. 
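
 The want of symmetry between the ratios can be made concrete by counting. 
 The sketch below enumerates every constitution of an urn of n balls, treats 
 the constitutions as equally probable, and derives the probability of each 
 proportion of black balls. The function name ratio_probabilities is an 
 illustrative choice, not something taken from the text. 

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def ratio_probabilities(n):
    """Probability of each proportion of black balls among n balls,
    when all 2**n constitutions are treated as equally probable."""
    # Each constitution is a tuple of 0 (white) or 1 (black), one entry per ball.
    counts = Counter(sum(constitution) for constitution in product((0, 1), repeat=n))
    total = 2 ** n
    return {Fraction(k, n): Fraction(ways, total) for k, ways in counts.items()}

probs = ratio_probabilities(4)
for ratio in sorted(probs):
    print(ratio, probs[ratio])
```

 With four balls there are six constitutions in which one half are black and 
 four in which one quarter are black, so that the two ratios are not equally 
 probable on the ' constitution ' hypothesis. 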
 
 18. We can also deal with the point which was illustrated by 
 the difficulty raised in § 4. We considered there the probabilities 
 of $a$ and its contradictory $\bar{a}$ when there is no external evidence 
 relevant to either. What exactly do we mean by saying that 
 there is no relevant evidence ? Is the addition of the word 
 external significant ? If a represents a particular proposition, 
 we must know something about it, namely, its meaning. May 
 not the apprehension of its meaning afford us some relevant 
 evidence ? If so, such evidence must not be excluded. If, then, 
 we say that there is no relevant evidence, we must mean no 
 evidence beyond what arises from the mere apprehension of the 
 meaning of the symbol a. If we attach no meaning to the 
 symbol, it is useless to discuss the value of the probability ; for 
 the probability, which belongs to a proposition as an object of 
 knowledge, not as a form of words, cannot in such a case exist. 
 
 What exactly does the symbol a stand for in the above ? 
 Does it stand for any proposition of which we know no more 
 than that it is a proposition ? Or does it stand for a particular 
 proposition which we understand but of which we know no more 
 than is involved in understanding it ? In the former case we 
 cannot extend our result to a proposition of which we know even 
 the meaning ; for we should then know more than that it is a 
 proposition ; and in the latter case we cannot say what the 
 probability of a is as compared with that of its contradictory, 
 until we know what particular proposition it stands for ; for, as 
 we have seen, the proposition itself may supply relevant evidence. 
 
 This suggests that a source of much confusion may lie in the 
 use of symbols and the notion of variables in probability. In 
 the logic of implication, which deals not with probability but 
 with truth, what is true of a variable must be equally true of all 
 instances of the variable. In Probability, on the other hand, 
 we must be on our guard wherever a variable occurs. In Implication 
 we may conclude that $\psi$ is true of anything of which 
 $\phi$ is true. In Probability we may conclude no more than that 
 $\psi$ is probable of anything of which we only know that $\phi$ is true of 
 it. If $x$ stands for anything of which $\phi(x)$ is true, as soon as 
 we substitute in probability any particular value, whose meaning 
 we know, for $x$, the value of the probability may be affected ; 
 for knowledge, which was irrelevant before, may now become 
 relevant. Take the following example : Does $\phi(a)/\psi(a) = \phi(b)/\psi(b)$ ? 
 That is to say, is the probability of $\phi$'s being true 
 of $a$, given only that $\psi$ is true of $a$, equal to the probability of 
 $\phi$'s being true of $b$, given only that $\psi$ is true of $b$ ? If this simply 
 means that the probability of an object's satisfying $\phi$ about 
 which nothing is known except that it satisfies $\psi$ is equal to 
 ditto ditto, the equation is an identity. For in this case $\phi(a)/\psi(a)$ 
 means the same as $\phi(b)/\psi(b)$, i.e. we know nothing about $x$ and $y$ 
 except that they satisfy $\psi$, and there is nothing whatever by 
 which we can distinguish $a$ from $b$. But if $a$ and $b$ represent 
 specific entities, which we can distinguish, then the equality 
 does not necessarily hold. If, for instance, $\phi(x)$ stands for ' $x$ is 
 Socrates,' then it is plainly false that $\phi(a)/\psi(a) = \phi(b)/\psi(b)$, where 
 $a$ stands for Socrates and $b$ does not. 
 
 19. Bearing this danger in mind, we can now give further 
 precision to the enunciation of the Principle of Indifference given 
 in § 16. Our knowledge of the meaning of a must be taken 
 account of so far as it is relevant ; and the Principle is only satisfied 
 if we have corresponding knowledge about the meaning of $b$. 
 Thus $\phi(a)/h = \phi(b)/h$ may be true for one pair of values $a$, $b$, and 
 not true for another pair of values $a'$, $b'$. 
 
 This makes it possible to explain in part the contradiction 
 discussed in § 4. Even if it were true that the probability of $a$ is 
 $\frac{1}{2}$, when we know nothing except that $a$ is a proposition, it does 
 not follow that the probability of ' This book is red ' is $\frac{1}{2}$, when 
 we know the meanings of ' book ' and ' red,' even if we know no 
 more than this. Knowledge arising directly out of acquaintance 
 with the meaning of ' red ' may be sufficient to enable us to infer 
 that ' red ' and ' not-red ' are not satisfactory alternatives to 
 which to apply the Principle of Indifference. How this may 
 come about will be discussed in §§ 20, 21. 
 
 But the contradictions are not yet really solved ; for some 
 of the difficulties discussed in § 4 can arise even when we know 
 no more of a and b than that they are different propositions. In 
 fact, although we have now stated more clearly than before how 
 the Principle should be enunciated, it is not yet possible to explain 
 or to avoid all the contradictions to which it led us in §§ 4 to 7. 
 For this purpose we must proceed to a further qualification. 
 
 20. The examples, in which the Principle of Indifference 
 broke down, had a great deal in common. We broke up the 
 field of possibility, as we may term it, into a number of areas 
 by a series of disjunctive judgments. But the alternative areas 
 were not ultimate. They were capable of further subdivision 
 into other areas similar in kind to the former. The paradoxes 
 and contradictions arose, in each case, when the alternatives, 
 which the Principle of Indifference treated as equivalent, actually 
 contained or might contain a different or an indefinite number of 
 more elementary units. 
 
 In the type of cases in which the Principle of Indifference 
 seemed to permit the assertion that, in the absence of relevant 
 evidence, a proposition is as likely as its contradictory, its contradictory 
 is not an ultimate and indivisible alternative (in the 
 sense to be explained in § 21 below), even if the proposition itself 
 satisfies this condition. For its contradictory can be disjunctively 
 resolved into an indefinite number of sets of contraries to 
 the proposition. It was out of this that our difficulties first arose. 
 ' This book is not red ' includes amongst others the alternatives 
 ' This book is black ' and ' This book is blue.' It is not, there- 
 fore, an ultimate alternative. 
 
 In the same way the contradiction of § 5 arose out of the possibility 
 of splitting the alternative ' He inhabits the British 
 Isles ' into the sub-alternatives ' He inhabits Ireland or he 
 inhabits Great Britain.' And in the third type of case, to 
 which the example of specific volume and density belongs, the 
 alternative ' $v$ lies in the interval 1 to 2 ' can be broken up into 
 the sub-alternatives ' $v$ lies in the interval 1 to 1½ or 1½ to 2.' 
 
 21. This, then, seems to point the way to the qualification of 
 which we are in search. We must enunciate some formal rule 
 which will exclude those cases, in which one of the alternatives 
 involved is itself a disjunction of sub-alternatives of the same 
 form. For this purpose the following condition is proposed. 
 
 Let the alternatives, the equiprobability of which we seek to 
 establish by means of the Principle of Indifference, be $\phi(a_1)$, 
 $\phi(a_2)$ . . . $\phi(a_r)$,¹ and let the evidence be $h$. Then it is a necessary 
 condition for the application of the principle, that these 
 should be, relatively to the evidence, indivisible alternatives of 
 the form $\phi(x)$. We may define a divisible alternative in the 
 following manner : 
 
 An alternative $\phi(a_r)$ is divisible if 

 (i.) $[\phi(a_r) \equiv \phi(a_{r'}) + \phi(a_{r''})]/h = 1$, 
 (ii.) $\phi(a_{r'}) \cdot \phi(a_{r''})/h = 0$, 
 (iii.) $\phi(a_{r'})/h \neq 0$ and $\phi(a_{r''})/h \neq 0$. 
 
 The condition that the sub-alternatives must be of the same 
 form as the original alternatives, i.e. expressible by means of the 
 same propositional function $\phi(x)$, deserves attention. It might 
 be the case that the original alternatives had nothing substantial 
 in common ; i.e. $\phi(x) = x$ is the only propositional function 
 common to all of them, the alternatives being $a_1, a_2, \ldots, a_r$. In 
 these circumstances the condition in question cannot be satisfied. 
 For the proposition $a_r$ can always be resolved into the disjunction 
 $a_r b + a_r \bar{b}$, where $b$ is any proposition and $\bar{b}$ its contradictory. If, 
 on the other hand, the alternatives which we are comparing can 
 be expressed in the forms $\phi(a_1)$ and $\phi(a_2)$, where the function 
 $\phi(x)$ is distinct from $x$, it is not necessarily the case that either 
 of these can be resolved into a disjunctive combination of terms 
 which can be expressed in their turn in the same form. 
 
 1 The more complicated cases in which the propositional function, of which 
 the alternatives are instances, involves more than one variable (see § 16), can be 
 dealt with in a similar manner mutatis mutandis. 

 Dispensing with symbolism, we can express these conditions 
 as follows : Our knowledge must not enable us to split up the 
 alternative $\phi(a_r)$ into a disjunction of two sub-alternatives, (i.) 
 which are themselves expressible in the same form $\phi$, (ii.) which 
 are mutually exclusive, and (iii.) which, on the evidence, are 
 possible. 
 
 In short, the Principle of Indifference is not applicable to a 
 pair of alternatives, if we know that either of them is capable of 
 being further split up into a pair of possible but incompatible 
 alternatives of the same form as the original pair. 
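
 The three conditions can be sketched for the example of § 5 by modelling each 
 alternative of the form ' he inhabits region x ' as a set of elementary 
 regions. This set representation, and the names find_division, REGIONS and 
 POSSIBLE, are assumptions made for illustration ; ' possible on the evidence ' 
 is modelled simply as membership of the set POSSIBLE. 

```python
from itertools import combinations

def find_division(name, alternatives, possible):
    """Return a pair of alternatives of the same form that divides `name`
    relative to the evidence, or None if it is indivisible."""
    target = alternatives[name] & possible
    others = [n for n in alternatives if n != name]
    for p, q in combinations(others, 2):
        s = alternatives[p] & possible
        t = alternatives[q] & possible
        exhaustive = (s | t) == target        # condition (i.)
        exclusive = not (s & t)               # condition (ii.)
        both_possible = bool(s) and bool(t)   # condition (iii.)
        if exhaustive and exclusive and both_possible:
            return (p, q)
    return None

REGIONS = {
    "British Isles": frozenset({"Ireland", "Great Britain"}),
    "Ireland": frozenset({"Ireland"}),
    "Great Britain": frozenset({"Great Britain"}),
    "France": frozenset({"France"}),
}
POSSIBLE = {"Ireland", "Great Britain", "France"}

division_of_isles = find_division("British Isles", REGIONS, POSSIBLE)
division_of_france = find_division("France", REGIONS, POSSIBLE)
# If the evidence excluded Ireland, the same alternative would be indivisible.
division_without_ireland = find_division(
    "British Isles", REGIONS, {"Great Britain", "France"}
)
print(division_of_isles, division_of_france, division_without_ireland)
```

 The last call shows why divisibility is defined relatively to the evidence : 
 a sub-alternative which the evidence excludes fails condition (iii.) and so 
 does not divide the original alternative. 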
 
 22. This rule commends itself to common sense. If we 
 know that the two alternatives are compounded of a different 
 number or of an indefinite number of sub-alternatives which are 
 in other respects similar, so far as our evidence goes, to the 
 original alternatives, then this is a relevant fact of which we 
 must take account. And as it affects the two alternatives in 
 differing and unsymmetrical ways, it breaks down the funda- 
 mental condition for the valid application of the Principle of 
 Indifference. 
 
 Neither this consideration nor that discussed in §§ 18 and 19 
 substantially modify the Principle of Indifference as enunciated 
 in § 16. They have only served to make explicit what was 
 always implicit in the Principle, by explaining the manner in 
 which our knowledge of the form and meaning of the alternatives 
 may be a relevant part of the evidence. The apparent con- 
 tradictions arose from paying attention to what we may term 
 the extraneous evidence only, to the neglect of such part of the 
 evidence as bore upon the form and meaning of the alternatives. 
 
 23. The application of this result to the examples cited in § 18 
 is not difficult. It excludes the class of cases in which a pro- 
 position and its contradictory constitute the alternatives. For 
 if $b$ is the proposition and $\bar{b}$ its contradictory, we cannot find 
 a propositional function $\phi(x)$ which will satisfy the necessary 
 conditions. It deals also with the type of contradiction which 
 arose in considering the probability that an individual taken at 
 random was an inhabitant of a given region. If, on the other 
 hand, the term ' country ' is so defined that one country cannot 
 include two countries, then an individual is, relatively to suitable 
 hypotheses, as likely to be an inhabitant of one as of another. 
 For the function $\phi(x)$, where $\phi(x)$ = ' the individual is an inhabitant 
 of country $x$,' satisfies the conditions. And it deals 
 with the example of ranges of specific volume and specific density, 
 because there is no range which does not contain within itself two 
 similar ranges. As there are in this case no definite units by 
 which we can define equal ranges, the device, which will be referred 
 to in § 25 for dealing with geometrical probabilities, is not avail- 
 able. 
 
 24. It is worth while to add that the qualification of § 21 is 
 fatal to the practical utility of the Principle of Indifference in 
 those cases only in which it is possible to find no ultimate alter- 
 natives which satisfy the conditions. For if the original alterna- 
 tives each comprise a definite number of indivisible and indifferent 
 sub-alternatives, we can compute their probabilities. It is often 
 the case, however, that we cannot by any process of finite sub- 
 division arrive at indivisible sub-alternatives, or that, if we can, 
 they are not on the evidence indifferent. In the examples given 
 above, for instance, where $\phi(x) = x$, or where $x$ is a part of unspecified 
 magnitude in a continuum, there are no indivisible 
 sub-alternatives. The first type comprises all cases, amongst 
 others, in which we weigh the probabilities of a proposition and 
 its contradictory ; and the second includes a great number of 
 cases in which physical or geometrical quantities are involved. 
 
 25. We can now return to the numerous paradoxes which 
 arise in the study of geometrical probability (see §§ 7, 8). The 
 qualification of § 21 enables us, I think, to discover the source 
 of the confusion. Our alternatives in these problems relate to 
 certain areas or segments or arcs, and however small the elements 
 are which we adopt as our alternatives, they are made up of yet 
 smaller elements which would also serve as alternatives. Our 
 rule, therefore, is not satisfied, and, as long as we enunciate them 
 in this shape, we cannot employ the Principle of Indifference. 
 But it is easy in most cases to discover another set of alternatives 
 which do satisfy the condition, and which will often serve our 
 purpose equally well. Suppose, for instance, that a point lies 
 on a line of length $m.l$, we may write the alternative ' the interval 
 of length $l$ on which the point lies is the $x$th interval of that 
 length as we move along the line from left to right ' $= \phi(x)$ ; and 
 the Principle of Indifference can then be applied safely to the $m$ 
 alternatives $\phi(1)$, $\phi(2)$ . . . $\phi(m)$, the number $m$ increasing as the 
 length $l$ of the intervals is diminished. There is no reason why 
 $l$ should not be of any definite length however small. 
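
 A minimal sketch of this device, assuming numerical probabilities : the line 
 is cut into m intervals of a definite length, each interval is given the 
 probability 1/m, and the probability that the point lies within the first k 
 intervals is obtained by addition. The function names are illustrative. 

```python
from fractions import Fraction

def interval_alternatives(m):
    """Equal probabilities for the m alternatives phi(1), ..., phi(m)."""
    return {x: Fraction(1, m) for x in range(1, m + 1)}

def prob_in_first(k, m):
    """Probability that the point lies in one of the first k of m intervals."""
    return sum(p for x, p in interval_alternatives(m).items() if x <= k)

coarse = prob_in_first(1, 4)    # left quarter of the line, 4 intervals
fine = prob_in_first(10, 40)    # the same stretch, intervals ten times shorter
print(coarse, fine)
```

 Diminishing the length of the intervals increases m but leaves the 
 probability of a given stretch of the line unchanged, so long as the stretch 
 is made up of whole intervals. 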
 
 If we deal with the problems of geometrical probability in 
 this way, we shall avoid the contradictory conclusions, which 
 arise from confusing together distinct elementary areas. In the 
 problem, for instance, of the chord drawn at random in a circle, 
 which is discussed in § 7, the chord is regarded, not as a one- 
 dimensional line, but as the limit of an area, the shape of which 
 is different in each of the variant solutions. In the first solution 
 it is the limit of a triangle, the length of the base of which tends 
 to zero ; in the second solution it is the limit of a quadrilateral, 
 two of the sides of which are parallel and at a distance apart 
 which tends to zero ; and in the third solution the area is defined 
 by the limiting position of a central section of undefined shape. 
 These distinct hypotheses lead inevitably to different results. If 
 we were dealing with a strictly linear chord, the Principle of 
 Indifference would yield us no result, as we could not enunciate 
 the alternatives in the required form ; and if the chord is an 
 elementary area, we must know the shape of the area of which 
 it is the limit. So long as we are careful to enunciate the alter- 
 natives in a form to which the Principle of Indifference can be 
 applied unambiguously, we shall be prevented from confusing 
 together distinct problems, and shall be able to reach conclusions 
 in geometrical probability which are unambiguously valid. 
 
 The substance of this explanation can be put in a slightly 
 different way by saying that it is not a matter of indifference in 
 these cases in what manner we proceed to the limit. We must 
 assign the probabilities before proceeding to the limit, which 
 we can do unambiguously. But if the problem in hand does 
 not stop at small finite lengths, areas, or volumes, and we 
 have to proceed to the limit, then the final result depends upon 
 the shape in which the body approaches the limit. Mathematicians 
 will recognise an analogy between this case and the determination 
 of potential at points within a conductor. Its value 
 depends upon the shape of the area which in the limit represents 
 the point. 
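
 The dependence of the result upon the manner of approaching the limit can be 
 seen in a simulation of the chord problem. The sketch below is an 
 illustration under assumptions, not a reconstruction of § 7 : it takes a unit 
 circle, asks how often a chord is longer than the side of the inscribed 
 equilateral triangle, and generates chords in three ways which are assumed 
 here to correspond to the three solutions mentioned above. The method names, 
 the number of trials and the seed are arbitrary choices. 

```python
import math
import random

SIDE = math.sqrt(3)  # side of the equilateral triangle inscribed in a unit circle

def chord_longer_than_side(method, rng):
    """Draw one chord of the unit circle by `method`; True if longer than SIDE."""
    if method == "endpoints":
        # Fix one end; the other end is uniform on the circumference.
        angle = rng.uniform(0, 2 * math.pi)
        length = 2 * math.sin(angle / 2)
    elif method == "radius":
        # Distance of the chord from the centre is uniform along a radius.
        d = rng.uniform(0, 1)
        length = 2 * math.sqrt(1 - d * d)
    elif method == "midpoint":
        # Midpoint of the chord is uniform over the area of the circle.
        d = math.sqrt(rng.uniform(0, 1))
        length = 2 * math.sqrt(1 - d * d)
    else:
        raise ValueError(f"unknown method: {method}")
    return length > SIDE

def estimate(method, trials=20000, seed=0):
    """Relative frequency of long chords over `trials` random draws."""
    rng = random.Random(seed)
    hits = sum(chord_longer_than_side(method, rng) for _ in range(trials))
    return hits / trials

for method in ("endpoints", "radius", "midpoint"):
    print(method, estimate(method))
```

 The three estimates cluster about one third, one half and one quarter 
 respectively. Each method is unambiguous in itself ; the disagreement arises 
 only when they are taken to answer the same question. 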
 
 26. The positive contributions of this chapter to the determination 
 of valid judgments of equiprobability are two. In the 
 first place we have stated the Principle of Indifference in a more 
 accurate form, by displaying its necessary dependence upon 
 judgments of relevance and so bringing out the hidden element 
 of direct judgment or intuition, which it has always involved. 
 It has been shown that the Principle lays down a rule by which 
 direct judgments of relevance and irrelevance can lead on to 
 judgments of preference and indifference. In the second place, 
 some types of consideration, which are in fact relevant, but which 
 are in danger of being overlooked, have been brought into promi- 
 nence. By this means it has been possible to avoid the various 
 types of doubtful and contradictory conclusions to which the 
 Principle seemed to lead, so long as we applied it without due 
 qualification. 
 
CHAPTER V 
 
 OTHER METHODS OF DETERMINING PROBABILITIES 
 
 1. The recognition of the fact, that not all probabilities are 
 numerical, limits the scope of the Principle of Indifference. It 
 has always been agreed that a numerical measure can actually 
 be obtained in those cases only in which a reduction to a set of 
 exclusive and exhaustive equiprobable alternatives is practicable. 
 Our previous conclusion that numerical measurement is often 
 impossible agrees very well, therefore, with the argument of the 
 preceding chapter that the rules, in virtue of which we can assert 
 equiprobability, are somewhat limited in their field of application. 
 
 But the recognition of this same fact makes it more necessary 
 to discuss the principles which will justify comparisons of more 
 and less between probabilities, where numerical measurement is 
 theoretically, as well as practically, impossible. We must, for 
 the reasons given in the preceding chapter, rely in the last resort 
 on direct judgment. The object of the following rules and 
 principles is to reduce the judgments of preference and relevance, 
 which we are compelled to make, to a few relatively simple types.* 
 
 2. We will enquire first in what circumstances we can expect 
 a comparison of more and less to be theoretically possible. I 
 am inclined to think that this is a matter about which, rather 
 unexpectedly perhaps, we are able to lay down definite rules. 
 We are able, I think, always to compare a pair of probabilities 
 which are 
 
 (i.) of the type ab/h and a/h, 
 or (ii.) of the type a/hh₁ and a/h, 
 
 provided the additional evidence h₁ contains only one independent 
 piece of relevant information. 
 
 * Parts of Chap. XV. are closely connected with the topics of the follow- 
 ing paragraphs, and the discussion which is commenced here is concluded there. 
 
 (i.) The propositions of Part II. will enable us to prove that 
 
 ab/h < a/h unless b/ah = 1 ; 
 
 that is to say, the probability of our conclusion is diminished by 
 the addition to it of something, which on the hypothesis of our 
 argument cannot be inferred from it. This proposition will be 
 self-evident to the reader. The rule, that the probability of two 
 propositions jointly is, in general, less than that of either of them 
 separately, includes the rule that the attribution of a more 
 specialised concept is less probable than the attribution of a less 
 specialised concept. 
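 
 The rule can be checked numerically in any case where the probabilities happen to be measurable. The following is a minimal sketch, assuming a hypothetical model of two fair dice in which every outcome is equally likely on the evidence h; the events a and b are illustrative choices and are not taken from the text.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite model: two fair dice, every outcome equally likely given h.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """Probability of `event` relative to `given`, by counting equally likely outcomes."""
    pool = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in pool if event(o)), len(pool))

a = lambda o: o[0] % 2 == 0        # illustrative conclusion: the first die is even
b = lambda o: o[0] + o[1] >= 8     # illustrative addition: the total is at least 8
ab = lambda o: a(o) and b(o)       # the joint proposition ab

p_a = prob(a)                      # a/h   = 1/2
p_ab = prob(ab)                    # ab/h  = 1/4
p_b_given_a = prob(b, given=a)     # b/ah  = 1/2

# ab/h is the product of a/h and b/ah, so it falls short of a/h unless b/ah = 1.
print(p_ab, "<", p_a, "because b/ah =", p_b_given_a)
```

 Here b cannot be inferred from a on the hypothesis, so b/ah is less than 1 and the joint proposition is strictly less probable than a alone.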
 
 (ii.) This condition requires a little more explanation. It 
 states that the probability a/hh₁ is always greater than, equal to, 
 or less than the probability a/h, if h₁ contains no pair of complementary 
 and independent parts ¹ both relevant to a/h. If h₁ 
 is favourable, a/hh₁ > a/h. Similarly, if h₂ is favourable to a/hh₁, 
 a/hh₁h₂ > a/hh₁. The reverse holds if h₁ and h₂ are unfavourable. 
 Thus we can compare a/hh' and a/h, in every case in which the 
 relevant independent parts of the additional evidence h' are 
 either all favourable, or all unfavourable. In cases in which our 
 additional evidence is equivocal, part taken by itself being favourable 
 and part unfavourable, comparison is not necessarily possible. 
 In ordinary language we may assert that, according to our rule, 
 the addition to our evidence of a single fact always has a definite 
 bearing on our conclusion. It either leaves its probability un- 
 affected and is irrelevant, or it has a definitely favourable or 
 unfavourable bearing, being favourably or unfavourably relevant. 
 It cannot affect the conclusion in an indefinite way, which allows 
 no comparison between the two probabilities. But if the addition 
 of one fact is favourable, and the addition of a second is unfavour- 
 able, it is not necessarily possible to compare the probability of 
 our original argument with its probability when it has been 
 modified by the addition of both the new facts. 
 
 Other comparisons are possible by a combination of these 
 two principles with the Principle of Indifference. We may 
 find, for instance, that a/hh₁ > a/h, that a/h = b/h, that b/h > b/hh₂, 
 and that, therefore, a/hh₁ > b/hh₂. We have thus obtained a 
 comparison between a pair of probabilities, which are not 
 of the types discussed above, but without the introduction 
 of any fresh principle. We may denote comparisons of this 
 type by (iii.). 
 
 1 See Chap. IV. § 14 for the meaning of these terms. 
 
 3. Whether any comparisons are possible which do not fall 
 within any of the categories (i.), (ii.), or (iii.), I do not feel certain. 
 We undoubtedly make a number of direct comparisons which 
 do not seem to be covered by them. We judge it more probable, 
 for instance, that Caesar invaded Britain than that Romulus 
 founded Rome. But even in such cases as this, where a reduction 
 into the regular form is not obvious, it might prove possible if 
 we could clearly analyse the real grounds of our judgment. We 
 might argue in this instance that, whereas Romulus's founding of 
 Rome rests solely on tradition, we have in addition evidence of 
 another kind for Caesar's invasion of Britain, and that, in so 
 far as our belief in Caesar's invasion rests on tradition, we have 
 reasons of a precisely similar kind as for our belief in Romulus 
 without the additional doubt involved in the maintenance of a 
 tradition between the times of Romulus and Caesar. By some 
 such analysis as this our judgment of comparison might be 
 brought within the above categories. 
 
 The process of reaching a judgment of comparison in this way 
 may be called ' schematisation.' ¹ We take initially an ideal 
 scheme which falls within the categories of comparison. Let 
 us represent ' the historical tradition x has been handed down 
 from a date many years previous to the time of Caesar ' by 
 ψ₁(x) ; ' the historical tradition x has been handed down from 
 the time of Caesar ' by ψ₂(x) ; ' the historical tradition x has 
 extra-traditional support ' by ψ₃(x) ; and the two traditions, 
 the Romulus tradition and the Caesar tradition respectively, 
 by a and b. Then if our relevant evidence h were of the form 
 ψ₁(a)ψ₂(b)ψ₃(b), it is easily seen that the comparison a/h < b/h 
 could be justified on the lines laid down above.² A further judgment, 
 that our actual evidence presented no relevant divergence 
 from this schematic form, would then establish the practical 
 conclusion. As I am not aware of any plausible judgment of 
 comparison which we make in common practice, but which is 
 clearly incapable of reduction to some schematic form, and as 
 I see no logical basis for such a comparison, I feel justified in 
 doubting the possibility of comparing the probabilities of arguments 
 dissimilar in form and incapable of schematic reduction. 
 But the point must remain very doubtful until this part of the 
 subject has received a more prolonged consideration. 
 
 1 This phrase is used by Von Kries, op. cit. p. 179, in a somewhat similar 
 connection. 
 
 2 For a/ψ₂(a) = b/ψ₂(b) ; a/ψ₁(a) < a/ψ₂(a) ; b/ψ₂(b) < b/ψ₂(b)ψ₃(b) ; 
 a/ψ₁(a) = a/h ; and b/ψ₂(b)ψ₃(b) = b/h. 
 
 4. Category (ii.) is very wide, and evidently covers a great 
 variety of cases. If we are to establish general principles of argu- 
 ment and so avoid excessive dependence on direct individual 
 judgments of relevance, we must discover some new and more 
 particular principles included within it. Two of these, those 
 of Analogy and of Induction — are excessively important, and 
 will be the subject of Part III. of this book. In addition to these 
 a few criteria will be examined and established in Chapter XIV., 
 §§ 4 and 8 (49.1). We must be content here (pending the 
 symbolic developments of Part II.) with the two observations 
 following : 
 
 (1) The addition of new ¹ evidence h₁ to a doubtful ² argument 
 a/h is favourably relevant, if either of the following conditions 
 is fulfilled : (a) if a/hh̄₁ = 0 ; (b) if a/hh₁ = 1. Divested of symbolism, 
 this merely amounts to a statement that a piece of 
 evidence is favourable if, in conjunction with the previous 
 evidence, it is either a necessary or a sufficient condition for the 
 truth of our conclusion. 
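 
 Both conditions can be illustrated in a case where the probabilities are numerical. The sketch below assumes a hypothetical model of a single fair die; the propositions chosen for a and h₁ are illustrative only and do not come from the text.

```python
from fractions import Fraction

faces = range(1, 7)  # hypothetical model: one fair die, each face equally likely given h

def prob(event, given=lambda f: True):
    """Probability of `event` relative to `given`, by counting equally likely faces."""
    pool = [f for f in faces if given(f)]
    return Fraction(sum(1 for f in pool if event(f)), len(pool))

# (a) h1 is a necessary condition of a: a is impossible when h1 is false.
six = lambda f: f == 6                  # conclusion a
even = lambda f: f % 2 == 0             # new evidence h1
not_even = lambda f: not even(f)        # contradictory of h1
necessary = prob(six, given=not_even) == 0
favourable_when_necessary = prob(six, given=even) > prob(six)             # 1/3 > 1/6

# (b) h1 is a sufficient condition of a: a is certain when h1 is true.
at_least_three = lambda f: f >= 3       # conclusion a
is_five = lambda f: f == 5              # new evidence h1
sufficient = prob(at_least_three, given=is_five) == 1
favourable_when_sufficient = prob(at_least_three, given=is_five) > prob(at_least_three)  # 1 > 2/3

print(necessary, favourable_when_necessary, sufficient, favourable_when_sufficient)
```

 In each case the evidence is new and the argument doubtful in the senses required, and the probability of the conclusion rises when the evidence is added.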
 
 (2) It might plausibly be supposed that evidence would be 
 favourable to our conclusion which is favourable to favourable 
 evidence, i.e. that, if h₁ is favourable to x/h and x is favourable to 
 a/h, h₁ is favourable to a/h. Whilst, however, this argument 
 is frequently employed under conditions, which, if explicitly 
 stated, would justify it, there are also conditions in which this is 
 not so, so that it is not necessarily valid. For the very deceptive 
 fallacy involved in the above supposition, Mr. Johnson has 
 suggested to me the name of the Fallacy of the Middle Term. The 
 general question (If h₁ is favourable to x/h and x is favourable to 
 a/h, in what conditions is h₁ favourable to a/h ?) will be examined 
 in Chapter XIV. §§ 4 and 8 (49.1). In the meantime, the intuition 
 of the reader towards the fallacy may be assisted by the 
 following observations, which are due to Mr. Johnson : 
 
 Let x, x', x" . . . be exclusive and exhaustive alternatives 
 under datum h. Let h₁ and a be concordant in regard to each of 
 these alternatives : i.e. any hypothesis which is strengthened by 
 h₁ will strengthen a, and any hypothesis which is weakened by 
 h₁ will weaken a. It is obvious that, if h₁ strengthens some of 
 the hypotheses x, x', x" . . ., it will weaken others. This fact 
 helps us to see why we cannot consider the concordance of h₁ 
 and a in regard to one single alternative, but must be able to 
 assert their concordance with regard to every one of the exclusive 
 and exhaustive alternatives, including the particular one taken. 
 But a further condition is needed, which (as we shall show) is 
 obviously satisfied in two typical problems at least. This further 
 condition is that, for each hypothesis x, x', x" . . ., it shall hold 
 that, were this hypothesis known to be true, the knowledge of 
 h₁ would not weaken the probability of a. 
 
 1 h₁ is new evidence so long as h₁/h ≠ 1. 
 2 The argument is doubtful so long as a/h is neither certain nor impossible. 
 
 These two conditions are sufficient to ensure that h₁ shall 
 strengthen a (independently of knowledge of x, x', x" . . .) ; 
 and, in a sense, they appear to be necessary ; for, unless they are 
 satisfied, the dependence of h₁ upon a would be (so to speak) 
 accidental as regards the ' middle terms,' (x, x', x" . . .). 
 
 The necessity for reference to all the alternatives x, x', x" . . . 
 is analogous to the requirement of distribution of the middle 
 term in ordinary syllogism. Thus, from premises " All P is x, 
 all S is x," the conclusion that " S's are P " does not formally 
 follow ; but given " all P is x and all S is x' " it does follow that 
 " no S are P ", where x' is any contrary to x. The two conditions 
 taken together would be analogous to the argument : all x S is 
 P ; all x' S is P ; all x" S is P ; . . . therefore all S is P. 
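 
 A small numerical case may help to show how the Fallacy of the Middle Term arises. The sketch below is only an illustration under stated assumptions: eight equally likely cases, with the sets x, h1 and a chosen by hand so that h1 favours x and x favours a while h1 is unfavourable to a. None of these sets comes from the text.

```python
from fractions import Fraction

universe = set(range(1, 9))  # hypothetical: eight equally likely cases given h

def prob(event, given=universe):
    """Probability of `event` relative to `given`, by counting equally likely cases."""
    return Fraction(len(event & given), len(given))

x = {1, 2, 3, 4}     # the middle term
not_x = universe - x
h1 = {1, 2, 5}       # the new evidence
a = {3, 4, 6}        # the conclusion

h1_favours_x = prob(x, h1) > prob(x)   # 2/3 > 1/2
x_favours_a = prob(a, x) > prob(a)     # 1/2 > 3/8
h1_favours_a = prob(a, h1) > prob(a)   # 0 > 3/8 is false

# Concordance holds for both alternatives x and not-x ...
concordant = (
    prob(x, h1) > prob(x) and prob(a, x) > prob(a)
    and prob(not_x, h1) < prob(not_x) and prob(a, not_x) < prob(a)
)
# ... but the further condition fails: with x known, h1 weakens a.
further_condition = all(prob(a, alt & h1) >= prob(a, alt) for alt in (x, not_x))

print(h1_favours_x, x_favours_a, h1_favours_a, concordant, further_condition)
```

 In this constructed case the first condition is satisfied and the second is not, and the conclusion that h1 strengthens a accordingly fails.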
 
 First Typical Problem. An urn contains an unknown proportion 
 of differently coloured balls. A ball is drawn and replaced. 
 Then x, x', x" . . . stand for the various possible proportions. 
 Let h₁ mean " a white ball has been drawn " ; and let a mean 
 " a white ball will be again drawn." Then any hypothesis which 
 is strengthened by h₁ will strengthen a ; and any hypothesis 
 which is weakened by h₁ will weaken a. Moreover, were any 
 one of these hypotheses known to be true, the knowledge of h₁ 
 would not weaken the probability of a. Hence, in the absence 
 of definite knowledge as regards x, x', x" . . ., the knowledge 
 of h₁ would strengthen the probability of a. 
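 
 The first typical problem can be worked through numerically. The sketch below assumes a hypothetical urn of four balls and, as a further assumption not made in the text, that every possible number of white balls is equally probable beforehand. It computes the probability of a second white ball with and without the knowledge that a white ball has been drawn, and checks concordance for every alternative.

```python
from fractions import Fraction

N = 4  # hypothetical urn size
# Assumed prior: each possible number k of white balls is equally probable.
prior = {k: Fraction(1, N + 1) for k in range(N + 1)}

def p_white(k):
    """Chance of drawing white when k of the N balls are white (draws are replaced)."""
    return Fraction(k, N)

p_a = sum(prior[k] * p_white(k) for k in prior)               # a/h = 1/2
p_h1 = p_a                                                    # h1/h, the same by symmetry
p_a_and_h1 = sum(prior[k] * p_white(k) ** 2 for k in prior)   # ah1/h = 3/8
p_a_given_h1 = p_a_and_h1 / p_h1                              # a/hh1 = 3/4

# Probability of each proportion once a white ball is known to have been drawn.
posterior = {k: prior[k] * p_white(k) / p_h1 for k in prior}

# Concordance: a proportion is strengthened (weakened) by h1 exactly when
# knowledge of that proportion would strengthen (weaken) a.
concordant = all(
    (posterior[k] > prior[k]) == (p_white(k) > p_a)
    and (posterior[k] < prior[k]) == (p_white(k) < p_a)
    for k in prior
)

print(p_a, p_a_given_h1, concordant)
```

 Given any one proportion, the second draw is independent of the first, so the further condition is satisfied as well, and the knowledge of h₁ raises the probability of a from 1/2 to 3/4 on these assumptions.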
 
 Second Typical Problem. Let a certain event have taken 
 place ; which may have been x, x', x" or . . . Let h₁ mean that 
 A reports so and so ; and let a mean that B reports similarly or 
 identically. The phrase similarly merely indicates that any 
 hypothesis as to the actual fact, which would be strengthened by 
 A's report, would be strengthened by B's report. Of course, 
 even if the reports were verbally identical, A's evidence would not 
 necessarily strengthen the hypothesis in an equal degree with 
 B's ; because A and B may be unequally expert or intelligent. 
 Now, in such cases, we may further affirm (in general), that, were 
 the actual nature of the event known, the knowledge of A's report 
 on it would not weaken (though it also need not strengthen) the 
 probability that B would give a similar report. Hence, in the 
 absence of such knowledge, the knowledge of h₁ would strengthen 
 the probability of a. 
 
 5. Before leaving this part of the argument we must emphasise 
 the part played by direct judgment in the theory here presented. 
 The rules for the determination of equality and inequality between 
 probabilities all depend upon it at some point. This seems to 
 me quite unavoidable. But I do not feel that we should regard 
 it as a weakness. For we have seen that most, and perhaps all, 
 cases can be determined by the application of general principles 
 to one simple type of direct judgment. No more is asked of the 
 intuitive power applied to particular cases than to determine 
 whether a new piece of evidence tells, on the whole, for or against 
 a given conclusion. The application of the rules involves no 
 wider assumptions than those of other branches of logic. 
 
 While it is important, in establishing a control of direct 
 judgment by general principles, not to conceal its presence, yet 
 the fact that we ultimately depend upon an intuition need not 
 lead us to suppose that our conclusions have, therefore, no basis 
 in reason, or that they are as subjective in validity as they are 
 in origin. It is reasonable to maintain with the logicians of the 
 Port Royal that we may draw a conclusion which is truly probable 
 by paying attention to all the circumstances which accompany 
 the case, and we must admit with as little concern as possible 
 Hume's taunt that " when we give the preference to one set of 
 arguments above another, we do nothing but decide from our 
 feeling concerning the superiority of their influence." 
 
CHAPTER VI 
 
 THE WEIGHT OF ARGUMENTS 
 
 1. The question to be raised in this chapter is somewhat novel ; 
 after much consideration I remain uncertain as to how much 
 importance to attach to it. The magnitude of the probability 
 of an argument, in the sense discussed in Chapter III., depends 
 upon a balance between what may be termed the favourable and 
 the unfavourable evidence ; a new piece of evidence which leaves 
 this balance unchanged, also leaves the probability of the argu- 
 ment unchanged. But it seems that there may be another 
 respect in which some kind of quantitative comparison between 
 arguments is possible. This comparison turns upon a balance, 
 not between the favourable and the unfavourable evidence, but 
 between the absolute amounts of relevant knowledge and of 
 relevant ignorance respectively. 
 
 As the relevant evidence at our disposal increases, the magni- 
 tude of the probability of the argument may either decrease or 
 increase, according as the new knowledge strengthens the un- 
 favourable or the favourable evidence ; but something seems to 
 have increased in either case, — we have a more substantial basis 
 upon which to rest our conclusion. I express this by saying that 
 an accession of new evidence increases the weight of an argu- 
 ment. New evidence will sometimes decrease the probability of 
 an argument, but it will always increase its ' weight.' 
 
 2. The measurement of evidential weight presents similar 
 difficulties to those with which we met in the measurement of 
 probability. Only in a restricted class of cases can we compare 
 the weights of two arguments in respect of more and less. But 
 this must always be possible where the conclusion of the two 
 arguments is the same, and the relevant evidence in the one includes 
 and exceeds the evidence in the other. If the new evidence 
 is ' irrelevant,' in the more precise of the two senses defined in § 14 
 of Chapter IV., the weight is left unchanged. If any part of the 
 new evidence is relevant, then the value is increased. 
 
 The reason for our stricter definition of ' relevance ' is now 
 apparent. If we are to be able to treat ' weight ' and ' relevance ' 
 as correlative terms, we must regard evidence as relevant, part 
 of which is favourable and part unfavourable, even if, taken as 
 a whole, it leaves the probability unchanged. With this defini- 
 tion, to say that a new piece of evidence is ' relevant ' is the same 
 thing as to say that it increases the ' weight ' of the argument. 
 
 A proposition cannot be the subject of an argument, unless 
 we at least attach some meaning to it, and this meaning, even if 
 it only relates to the form of the proposition, may be relevant 
 in some arguments relating to it. But there may be no other 
 relevant evidence ; and it is sometimes convenient to term the 
 probability of such an argument an à priori probability. In 
 this case the weight of the argument is at its lowest. Starting, 
 therefore, with minimum weight, corresponding to à priori 
 probability, the evidential weight of an argument rises, though 
 its probability may either rise or fall, with every accession of 
 relevant evidence. 
 
 3. Where the conclusions of two arguments are different, or 
 where the evidence for the one does not overlap the evidence 
 for the other, it will often be impossible to compare their weights, 
 just as it may be impossible to compare their probabilities. Some 
 rules of comparison, however, exist, and there seems to be a close, 
 though not a complete, correspondence between the conditions 
 under which pairs of arguments are comparable in respect of 
 probability and of weight respectively. We found that there were 
 three principal types in which comparison of probability was 
 possible, other comparisons being based on a combination of 
 these : — 
 
 (i.) Those based on the Principle of Indifference, subject 
 to certain conditions, and of the form φa/ψa.h₁ = φb/ψb.h₂, 
 where h₁ and h₂ are irrelevant to the arguments. 
 
 (ii.) a/hh₁ ≷ a/h, where h₁ is a single unit of information, 
 containing no independent parts which are relevant. 
 
 (iii.) ab/h ≤ a/h. 
 
 Let us represent the evidential weight of the argument, 
 whose probability is a/h, by V(a/h). Then, corresponding to 
 the above, we find that the following comparisons of weight are 
 possible : 
 
 (i.) V(φa/ψa.h₁) = V(φb/ψb.h₂), where h₁ and h₂ are irrelevant 
 in the strict sense. Arguments, that is to say, to which the 
 Principle of Indifference is applicable, have equal evidential 
 weights. 
 
 (ii.) V(a/hh₁) > V(a/h), unless h₁ is irrelevant, in which case 
 V(a/hh₁) = V(a/h). The restriction on the composition of h₁, 
 which is necessary in the case of comparisons of magnitude, is 
 not necessary in the case of weight. 
 
 There is, however, no rule for comparisons of weight corresponding 
 to (iii.) above. It might be thought that V(ab/h) < V(a/h), 
 on the ground that the more complicated an argument is, relative 
 to given premisses, the less is its evidential weight. But this 
 is invalid. The argument ab/h is further off proof than was the 
 argument a/h ; but it is nearer disproof. For example, if ab/h = 0 
 and a/h > 0, then V(ab/h) > V(a/h). In fact it would seem to 
 be the case that the weight of the argument a/h is always 
 equal to that of ā/h, where ā is the contradictory of a ; i.e., 
 V(a/h) = V(ā/h). For an argument is always as near proving or 
 disproving a proposition, as it is to disproving or proving its 
 contradictory. 
 
 4. It may be pointed out that if a/h = b/h, it does not necessarily 
 follow that V(a/h) = V(b/h). It has been asserted already 
 that if the first equality follows directly from a single application of 
 the Principle of Indifference, the second equality also holds. But 
 the first equality can exist in other cases also. If, for instance, 
 a and b are members respectively of different sets of three equally 
 probable exclusive and exhaustive alternatives, then a/h = b/h ; but 
 these arguments may have very different weights. If, however, 
 a and b can each, relatively to h, be inferred from the other, i.e. if 
 a/bh = 1 and b/ah = 1, then V(a/h) = V(b/h). For in proving or disproving 
 one, we are necessarily proving or disproving the other. 
 
 Further principles could, no doubt, be arrived at. The above 
 can be combined to reach results in cases upon which unaided 
 common-sense might feel itself unable to pronounce with con- 
 fidence. Suppose, for instance, that we have three exclusive 
 and exhaustive alternatives, a, b, and c, and that a/h = b/h 
 in virtue of the Principle of Indifference, then we have 
 V(a/h) = V(b/h) and V(a/h) = V(ā/h), so that V(b/h) = V(ā/h). It is 
 also true, since ā/(b + c)h = 1 and (b + c)/āh = 1, that V(ā/h) = 
 V((b + c)/h). Hence V(b/h) = V((b + c)/h). 
 
 5. The preceding paragraphs will have made it clear that the 
 weighing of the amount of evidence is quite a separate process 
 from the balancing of the evidence for and against. In so far, 
 however, as the question of weight has been discussed at all, 
 attempts have been made, as a rule, to explain the former in 
 terms of the latter. If x/h₁h₂ = 2/3 and x/h₁ = 3/4, it has sometimes 
 been supposed that it is more probable that x/h₁h₂ really is 2/3 than 
 that x/h₁ really is 3/4. According to this view, an increase in the 
 amount of evidence strengthens the probability of the probability, 
 or, as De Morgan would say, the presumption of the 
 probability. A little reflection will show that such a theory is 
 untenable. For the probability of x on hypothesis h₁ is independent 
 of whether as a matter of fact x is or is not true, and if 
 we find out subsequently that x is true, this does not make it 
 false to say that on hypothesis h₁ the probability of x is 3/4. Similarly 
 the fact that x/h₁h₂ is 2/3 does not impugn the conclusion that 
 x/h₁ is 3/4, and unless we have made a mistake in our judgment or 
 our calculation on the evidence, the two probabilities are 2/3 and 3/4 
 respectively. 
 
 6. A second method, by which it might be thought, perhaps, 
 that the question of weight has been treated, is the method of 
 probable error. But while probable error is sometimes connected 
 with weight, it is primarily concerned with quite a different ques- 
 tion. ' Probable error,' it should be explained, is the name 
 given, rather inconveniently perhaps, to an expression which 
 arises when we consider the probability that a given quantity is 
 measured by one of a number of different magnitudes. Our 
 data may tell us that one of these magnitudes is the most probable 
 measure of the quantity ; but in some cases it will also tell 
 us how probable each of the other possible magnitudes of the 
 quantity is. In such cases we can determine the probability 
 that the quantity will have a magnitude which does not differ 
 from the most probable by more than a specified amount. The 
 amount, which the difference between the actual value of the 
 quantity and its most probable value is as likely as not to exceed, 
 is the ' probable error.' In many practical questions the exist- 
 ence of a small probable error is of the greatest importance, 
 if our conclusions are to prove valuable. The probability that 
 the quantity has any particular magnitude may be very small ; 
 but this may matter very little, if there is a high probability 
 that it lies within a certain range. 
 
 Now it is obvious that the determination of probable error 
 is intrinsically a different problem from the determination of 
 weight. The method of probable error is simply a summation of 
 a number of alternative and exclusive probabilities. If we say 
 that the most probable magnitude is x and the probable error y, 
 this is a way, convenient for many purposes, of summing up a 
 number of probable conclusions regarding a variety of magni- 
 tudes other than x which, on the evidence, the quantity may 
 possess. The connection between probable error and weight, such 
 as it is, is due to the fact that in scientific problems a large 
 probable error is not uncommonly due to a great lack of evidence, 
 and that as the available evidence increases there is a tendency 
 for the probable error to diminish. In these cases the probable 
 error may conceivably be a good practical measure of the weight. 
 
 It is necessary, however, in a theoretical discussion, to point 
 out that the connection is casual, and only exists in a limited 
 class of cases. This is easily shown by an example. We may 
 have data on which the probability of x = 5 is 1/3, of x = 6 is 1/4, 
 of x = 7 is 1/5, of x = 8 is 1/6, and of x = 9 is 1/20. Additional evidence 
 might show that x must either be 5 or 8 or 9, the probabilities of 
 each of these conclusions being 7/16, 5/16, 4/16. The evidential weight 
 of the latter argument is greater than that of the former, but the 
 probable error, so far from being diminished, has been increased. 
 There is, in fact, no reason whatever for supposing that the 
 probable error must necessarily diminish, as the weight of the 
 argument is increased. 
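 
 The example can be checked by direct computation. The sketch below is an illustration under two assumptions: the fractions of the example are read as 1/3, 1/4, 1/5, 1/6, 1/20 before the additional evidence and 7/16, 5/16, 4/16 after it, and, since the magnitudes are discrete, the probable error is taken as the smallest deviation from the most probable value which the actual value is at least as likely as not to fall within. The function name is hypothetical.

```python
from fractions import Fraction as F

def probable_error(dist):
    """Smallest deviation d from the most probable value such that the
    probability of lying within d of it is at least one half (an assumed
    discrete reading of 'as likely as not to exceed')."""
    mode = max(dist, key=dist.get)
    for d in sorted({abs(v - mode) for v in dist}):
        within = sum(p for v, p in dist.items() if abs(v - mode) <= d)
        if within >= F(1, 2):
            return d

# Assumed reading of the fractions in the example.
before = {5: F(1, 3), 6: F(1, 4), 7: F(1, 5), 8: F(1, 6), 9: F(1, 20)}
after = {5: F(7, 16), 8: F(5, 16), 9: F(4, 16)}

print(probable_error(before), probable_error(after))  # 1 3
```

 On this reading the most probable value is 5 in both cases, while the probable error rises from 1 to 3 although the evidence has been enlarged.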
 
 The typical case, in which there may be a practical connection 
 between weight and probable error, may be illustrated by the 
 two cases following of balls drawn from an urn. In each case we 
 require the probability of drawing a white ball ; in the first case 
 we know that the urn contains black and white in equal propor- 
 tions ; in the second case the proportion of each colour is unknown, 
 and each ball is as likely to be black as white. It is evident that 
 in either case the probability of drawing a white ball is 1/2, but 
 that the weight of the argument in favour of this conclusion is 
 greater in the first case. When we consider the most probable 
 proportion in which balls will be drawn in the long run, if after 
 each withdrawal they are replaced, the question of probable 
 error enters in, and we find that the greater evidential weight of 
 the argument on the first hypothesis is accompanied by the 
 smaller probable error. 
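 
 The two urns can be compared by exact calculation. The sketch below assumes a hypothetical urn of 10 balls and 20 draws with replacement, and uses the variance of the proportion of white balls drawn as a stand-in for the probable error (for an approximately normal distribution the probable error is about 0.6745 times the standard deviation). The figures and names are illustrative assumptions.

```python
from fractions import Fraction as F
from math import comb

N = 10  # hypothetical number of balls in the urn
n = 20  # hypothetical number of draws, each ball being replaced

def draw_stats(composition):
    """composition maps a possible number of white balls to its probability.
    Returns (probability of white on one draw,
             variance of the proportion of white in n draws)."""
    p_white = sum(w * F(k, N) for k, w in composition.items())
    mean = n * p_white
    # For a fixed proportion p the count is binomial: E[count^2] = n p (1 - p) + n^2 p^2.
    second = sum(
        w * (n * F(k, N) * (1 - F(k, N)) + n**2 * F(k, N) ** 2)
        for k, w in composition.items()
    )
    return p_white, (second - mean**2) / n**2

# First case: the urn is known to hold black and white in equal proportions.
known = {N // 2: F(1)}
# Second case: each ball is, independently, as likely to be black as white.
unknown = {k: F(comb(N, k), 2**N) for k in range(N + 1)}

p_known, var_known = draw_stats(known)
p_unknown, var_unknown = draw_stats(unknown)
print(p_known, p_unknown)      # 1/2 1/2
print(var_known, var_unknown)  # 1/80 29/800
```

 The probability of white on a single draw is 1/2 in both cases, but the long-run proportion is nearly three times as dispersed in the second case, where less is known of the urn.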
 
 This conventionalised example is typical of many scientific 
 problems. The more we know about any phenomenon, the less 
 likely, as a rule, is our opinion to be modified by each additional 
 item of experience. In such problems, therefore, an argument 
 of high weight concerning some phenomenon is likely to be accom- 
 panied by a low probable error, when the character of a series 
 of similar phenomena is under consideration. 
 
 7. Weight cannot, then, be explained in terms of probability. 
 An argument of high weight is not ' more likely to be right ' than 
 one of low weight ; for the probabilities of these arguments only 
 state relations between premiss and conclusion, and these re- 
 lations are stated with equal accuracy in either case. Nor is an 
 argument of high weight one in which the probable error is small ; 
 for a small probable error only means that magnitudes in the 
 neighbourhood of the most probable magnitude have a relatively 
 high probability, and an increase of evidence does not necessarily 
 involve an increase in these probabilities. 
 
 The conclusion, that the ' weight ' and the ' probability ' of an 
 argument are independent properties, may possibly introduce a 
 difficulty into the discussion of the application of probability 
 to practice.¹ For in deciding on a course of action, it seems 
 plausible to suppose that we ought to take account of the weight 
 as well as the probability of different expectations. But it is 
 difficult to think of any clear example of this, and I do not 
 feel sure that the theory of ' evidential weight ' has much 
 practical significance. 
 
 Bernoulli's second maxim, that we must take into account all 
 the information we have, amounts to an injunction that we should 
 be guided by the probability of that argument, amongst those of 
 which we know the premisses, of which the evidential weight is 
 the greatest. But should not this be re-enforced by a further 
 maxim, that we ought to make the weight of our arguments as 
 great as possible by getting all the information we can ?² It is 
 difficult to see, however, to what point the strengthening of an 
 argument's weight by increasing the evidence ought to be pushed. 
 We may argue that, when our knowledge is slight but capable of 
 increase, the course of action, which will, relative to such knowledge, 
 probably produce the greatest amount of good, will often 
 consist in the acquisition of more knowledge. But there clearly 
 comes a point when it is no longer worth while to spend trouble, 
 before acting, in the acquisition of further information, and there 
 is no evident principle by which to determine how far we ought 
 to carry our maxim of strengthening the weight of our argument. 
 A little reflection will probably convince the reader that this is 
 a very confusing problem. 
 
 1 See also Chapter XXVI. § 7. 
 
 2 Cf. Locke, Essay concerning Human Understanding, book ii. chap. xxi. § 67: 
 " He that judges without informing himself to the utmost that he is capable, 
 cannot acquit himself of judging amiss." 
 
 8. The fundamental distinction of this chapter may be briefly 
 repeated. One argument has more weight than another if it is 
 based upon a greater amount of relevant evidence ; but it is not 
 always, or even generally, possible to say of two sets of propositions 
 that one set embodies more evidence than the other. It has 
 a greater probability than another if the balance in its favour, 
 of what evidence there is, is greater than the balance in favour 
 of the argument with which we compare it ; but it is not always, 
 or even generally, possible to say that the balance in the one case 
 is greater than the balance in the other. The weight, to speak 
 metaphorically, measures the sum of the favourable and unfavourable 
 evidence, the probability measures the difference. 
 
 9. The phenomenon of ' weight ' can be described from the 
 point of view of other theories of probability than that which is 
 adopted here. If we follow certain German logicians in regarding 
 probability as being based on the disjunctive judgment, we may 
 say that the weight is increased when the number of alternatives 
 is reduced, although the ratio of the number of favourable to 
 the number of unfavourable alternatives may not have been 
 disturbed ; or, to adopt the phraseology of another German 
 school, we may say that the weight of the probability is increased, 
 as the field of possibility is contracted. 
 
 The same distinction may be explained in the language of the 
 frequency theory.¹ We should then say that the weight is increased 
 if we are able to employ as the class of reference a class 
 which is contained in the original class of reference. 
 
 1 See Chap. VIII. 
 
 10. The subject of this chapter has not usually been discussed 
 by writers on probability, and I know of only two by whom the 
 question has been explicitly raised :¹ Meinong, who threw out a 
 suggestion at the conclusion of his review of Von Kries' " Principien," 
 published in the Göttingische gelehrte Anzeigen for 1890 
 (see especially pp. 70-74), and A. Nitsche, who took up Meinong's 
 suggestion in an article in the Vierteljahrsschrift für wissenschaftliche 
 Philosophie, 1892, vol. xvi. pp. 20-35, entitled " Die Dimensionen 
 der Wahrscheinlichkeit und die Evidenz der Ungewissheit." 
 Meinong, who does not develop the point in any detail, distinguishes 
 probability and weight as ' Intensität ' and ' Qualität,' 
 and is inclined to regard them as two independent dimensions in 
 which the judgment is free to move ; they are the two dimensions 
 of the ' Urteils-Continuum.' Nitsche regards the weight as being 
 the measure of the reliability (Sicherheit) of the probability, and 
 holds that the probability continually approximates to its true 
 magnitude (reale Geltung) as the weight increases. His treatment 
 is too brief for it to be possible to understand very clearly what 
 he means, but his view seems to resemble the theory already 
 discussed that an argument of high weight is ' more likely to be 
 right ' than one of low weight. 
 
 ¹ There are also some remarks by Czuber (Wahrscheinlichkeitsrechnung,
 vol. i. p. 202) on the Erkenntnisswert of probabilities obtained by different
 methods, which may have been intended to have some bearing on it. 
 
CHAPTER VII

 HISTORICAL RETROSPECT
 
 1. The characteristic features of our Philosophy of Probability
 must be determined by the solutions which we offer to the
 problems attacked in Chapters III. and IV. Whilst a great part
 of the logical calculus, which will be developed in Part II., would
 be applicable with slight modification to several distinct theories
 of the subject, the ultimate problems of establishing the premisses
 of the calculus bring into the light every fundamental difference
 of opinion. 
 
 These problems are often, for this reason perhaps, left on one 
 side by writers whose interest chiefly lies in the more formal parts 
 of the subject. But Probability is not yet on so sound a basis 
 that the formal or mathematical side of it can be safely developed 
 in isolation, and some attempts have naturally been made to 
 solve the problem which Bishop Butler sets to the logician in the 
 concluding words of the brief discussion on probability with 
 which he prefaces the Analogy.¹
 
 In this chapter, therefore, we will review in their historical 
 order the answers of Philosophy to the questions, how we know 
 relations of probability, what ground we have for our judgments, 
 and by what method we can advance our knowledge. 
 
 2. The natural man is disposed to the opinion that probability 
 is essentially connected with the inductions of experience and, 
 if he is a little more sophisticated, with the Laws of Causation 
 
 ¹ " It is not my design to inquire further into the nature, the foundation and
 measure of probability ; or whence it proceeds that likeness should beget that
 presumption, opinion and full conviction, which the human mind is formed
 to receive from it, and which it does necessarily produce in every one ; or to 
 guard against the errors to which reasoning from analogy is liable. This 
 belongs to the subject of logic, and is a part of that subject which has not yet 
 been thoroughly considered." 
 
 and of the Uniformity of Nature. As Aristotle says, " the 
 probable is that which usually happens." Events do not always 
 occur in accordance with the expectations of experience ; but 
 the laws of experience afford us a good ground for supposing
 that they usually will. The occasional disappointment of these 
 expectations prevents our predictions from being more than 
 probable ; but the ground of their probability must be sought in 
 this experience, and in this experience only. 
 
 This is, in substance, the argument of the authors of the Port 
 Royal Logic (1662), who were the first to deal with the logic 
 of probability in the modern manner : " In order for me to
 judge of the truth of an event, and to be determined to believe
 it or not believe it, it is not necessary to consider it abstractly,
 and in itself, as we should consider a proposition in geometry ;
 but it is necessary to pay attention to all the circumstances
 which accompany it, internal as well as external. I call internal
 circumstances those which belong to the fact itself, and external
 those which belong to the persons by whose testimony we are led
 to believe it. This being done, if all the circumstances are
 such that it never or rarely happens that the like circumstances
 are the concomitants of falsehood, our mind is led, naturally,
 to believe that it is true." ¹ Locke follows the Port Royal
 Logicians very closely : " Probability is likeliness to be true. . . .
 The grounds of it are, in short, these two following. First, the
 conformity of anything with our own knowledge, observation,
 and experience. Secondly, the testimony of others, vouching
 their observation and experience " ; ² and essentially the same
 opinion is maintained by Bishop Butler : " When we determine
 a thing to be probably true, suppose that an event has or will
 come to pass, it is from the mind's remarking in it a likeness to
 some other event, which we have observed has come to pass.
 And this observation forms, in numberless instances, a presumption,
 opinion, or full conviction that such event has or will
 come to pass." ³
 
 Against this view of the subject the criticisms of Hume were 
 directed : " The idea of cause and effect is derived from experience,
 which informs us, that such particular objects, in all past
 
 ¹ Eng. Trans., p. 353.

 ² An Essay concerning Human Understanding, book iv. " Of Knowledge and
 Opinion."

 ³ Introduction to the Analogy.
 
 instances, have been constantly conjoined with each other. . . . 
 According to this account of things . . . probability is founded
 on the presumption of a resemblance betwixt those objects, of
 which we have had experience, and those, of which we have had
 none ; and therefore 'tis impossible this presumption can arise
 from probability." ¹ " When we are accustomed to see two impressions
 conjoined together, the appearance or idea of the one immediately
 carries us to the idea of the other. . . . Thus all probable
 reasoning is nothing but a species of sensation. 'Tis not
 solely in poetry and music, we must follow our taste and sentiment,
 but likewise in philosophy. When I am convinced of any
 principle, 'tis only an idea, which strikes more strongly upon me.
 When I give the preference to one set of arguments above another,
 I do nothing but decide from my feeling concerning the superiority
 of their influence." ² Hume, in fact, points out that, while
 it is true that past experience gives rise to a psychological anticipation
 of some events rather than of others, no ground has been
 given for the validity of this superior anticipation.
 
 3. But in the meantime the subject had fallen into the hands 
 of the mathematicians, and an entirely new method of approach
 was in course of development. It had become obvious that
 many of the judgments of probability which we in fact make
 do not depend upon past experience in a way which satisfies the
 canons laid down by the Port Royal Logicians or by Locke. In
 particular, alternatives are judged equally probable, without
 there being necessarily any actual experience of their approximately
 equal frequency of occurrence in the past. And, apart
 from this, it is evident that judgments based on a somewhat
 indefinite experience of the past do not easily lend themselves
 to precise numerical appraisement. Accordingly James
 Bernoulli,³ the real founder of the classical school of mathematical
 probability, while not repudiating the old test of experience, had
 based many of his conclusions on a quite different criterion, the
 rule which I have named the Principle of Indifference. The
 traditional method of the mathematical school essentially
 depends upon reducing all the possible conclusions to a number
 of ' equi-probable cases.' And, according to the Principle of 
 
 ¹ Treatise of Human Nature, p. 391 (Green's edition).
 ² Op. cit. p. 403.

 ³ See especially Ars Conjectandi, p. 224. Cf. Laplace, Théorie analytique,
 p. 178. 
 
 Indifference, ' cases ' are held to be equi-probable when there
 is no reason for preferring any one to any other, when there is
 nothing, as with Buridan's ass, to determine the mind in any one
 of the several possible directions. To take Czuber's example
 of dice,¹ this principle permits us to assume that each face is
 equally likely to fall, if there is no reason to suppose any particular 
 irregularity, and it does not require that we should know that the 
 construction is regular, or that each face has, as a matter of fact, 
 fallen equally often in the past. 
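
 The rule just described can be put in a few lines of code. What follows is only an illustrative sketch and is not drawn from any of the writers discussed: the function name `indifference` and the choice of Python are assumptions made for the example. It gives the same exact fraction to each of a finite list of alternatives, on the sole supposition that nothing is known which favours one of them over another.

```python
from fractions import Fraction


def indifference(alternatives):
    """Give each alternative the same probability, 1/n (hypothetical helper)."""
    cases = list(alternatives)
    if not cases:
        raise ValueError("at least one alternative is required")
    share = Fraction(1, len(cases))
    return {case: share for case in cases}


# The six faces of a die about which no irregularity is suspected.
die = indifference(range(1, 7))
print(die[3])             # 1/6
print(sum(die.values()))  # 1
```

 The sketch makes no use of past throws or of the construction of the die, which is precisely the feature of the principle that the critics discussed later in this chapter call in question.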
 
 On this Principle, extended by Bernoulli beyond those 
 problems of gaming in which by its tacit assumption Pascal 
 and Huyghens had worked out a few simple exercises, the whole 
 fabric of mathematical probability was soon allowed to rest. 
 The older criterion of experience, never repudiated, was soon 
 subsumed under the new doctrine. First, in virtue of Bernoulli's 
 famous Law of Great Numbers, the fractions representing the 
 probabilities of events were thought to represent also the actual
 proportion of their occurrences, so that experience, if it were
 considerable, could be translated into the cyphers of arithmetic.
 And next, by the aid of the Principle of Indifference, Laplace
 established his Law of Succession by which the influence of any
 experience, however limited, could be numerically measured, and
 which purported to prove that, if B has been seen to accompany
 A twice, it is two to one that B will again accompany A on A's
 next appearance. No other formula in the alchemy of logic
 has exerted more astonishing powers. For it has established
 the existence of God from the premiss of total ignorance ; and it
 has measured with numerical precision the probability that the
 sun will rise to-morrow.
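
 The arithmetic of the Law of Succession is easily exhibited. The sketch below assumes the formula in its usual statement, by which an event observed m times in n trials is given the probability (m + 1)/(n + 2) of occurring at the next trial; the function names are hypothetical and chosen only for illustration. On that statement a single observed conjunction yields odds of two to one, and two unbroken conjunctions yield three to one.

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """Probability of a further success, (m + 1) / (n + 2), in the usual statement."""
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


def odds_in_favour(probability):
    """Convert a probability p (less than 1) into odds p : (1 - p)."""
    return probability / (1 - probability)


for n in (1, 2, 10):
    p = rule_of_succession(n, n)  # B has accompanied A on all n occasions
    print(n, p, odds_in_favour(p))
# 1 2/3 2
# 2 3/4 3
# 10 11/12 11
```

 With no observations at all the same function returns 1/2, which is the Principle of Indifference applied to the two alternatives of occurrence and non-occurrence.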
 
 Yet the new principles did not win acceptance without 
 opposition. D'Alembert,² Hume, and Ancillon ³ stand out as
 the sceptical critics of probability, against the credulity of 
 
 ¹ Wahrscheinlichkeitsrechnung, p. 9.
 
 ² D'Alembert's scepticism was directed towards the current mathematical
 theory only, and was not, like Hume's, fundamental and far-reaching. His
 opposition to the received opinions was, perhaps, more splendid than discriminating.
 
 ³ Ancillon's communication to the Berlin Academy in 1794, entitled Doutes
 sur les bases du calcul des probabilités, is not as well known as it deserves to
 be. He writes as a follower of Hume, but adds much that is original and
 interesting. An historian, who also wrote on a variety of philosophical subjects,
 Ancillon was, at one time, the Prussian Minister of Foreign Affairs.
 
 eighteenth-century philosophers who were ready to swallow 
 without too many questions the conclusions of a science which 
 claimed and seemed to bring an entire new field within the 
 dominion of Reason.¹
 
 The first effective criticism came from Hume, who was also
 the first to distinguish the method of Locke and the philosophers
 from the method of Bernoulli and the mathematicians. " Probability,"
 he says, " or reasoning from conjecture, may be divided
 into two kinds, viz. that which is founded on chance and that which
 arises from causes." ² By these two kinds he evidently means the
 mathematical method of counting the equal chances based on 
 Indifference, and the inductive method based on the experience 
 of uniformity. He argues that ' chance ' alone can be the 
 foundation of nothing, and " that there must always be a mixture 
 of causes among the chances, in order to be the foundation of 
 any reasoning." ³ His previous argument against probabilities,
 which were based on an assumption of cause, is thus extended 
 to the mathematical method also. 
 
 But the great prestige of Laplace and the ' verifications ' 
 of his principles which his more famous results were supposed 
 to supply had, by the beginning of the nineteenth century, 
 established the science on the Principle of Indifference in an
 almost unquestioned position. It may be noted, however, that 
 De Morgan, the principal student of the subject in England, 
 seems to have regarded the method of actual experiment and 
 the method of counting cases, which were equally probable 
 on grounds of Indifference, as alternative methods of equal
 validity.
 
 4. The reaction against the traditional teaching during the 
 past hundred years has not possessed sufficient force to displace 
 
 ¹ French philosophy of the latter half of the eighteenth century was profoundly
 affected by the supposed conquests of the Calculus of Probability in
 all fields of thought. Nothing seemed beyond its powers of prediction, and
 it almost succeeded in men's minds to the place previously occupied by
 Revelation. It was under these influences that Condorcet evolved his doctrine
 of the perfectibility of the human race. The continuity and oneness of
 modern European thought may be illustrated, if such things amuse the
 reader, by the reflection that Condorcet derived from Bernoulli, that Godwin
 was inspired by Condorcet, that Malthus was stimulated by Godwin's folly
 into stating his famous doctrine, and that from the reading of Malthus
 on Population Darwin received his earliest impulse.
 
 ² Treatise of Human Nature, p. 424 (Green's edition).

 ³ Op. cit. p. 425.
 
 the established doctrine, and the Principle of Indifference is
 still very widely accepted in an unqualified form. Criticism
 has proceeded along two distinct lines ; the one, originated by
 Leslie Ellis, and developed by Dr. Venn, Professor Edgeworth,
 and Professor Karl Pearson, has been almost entirely confined
 in its influence to England ; the other, of which the beginnings
 are to be seen in Boole's Laws of Thought, has been developed
 in Germany, where its ablest exponent has been Von Kries.
 France has remained uninfluenced by either, and faithful, on
 the whole, to the tradition of Laplace. Even Henri Poincaré,
 who had his doubts, and described the Principle of Indifference
 as " very vague and very elastic," regarded it as our only
 guide in the choice of that convention, " which has always
 something arbitrary about it," but upon which calculation in
 probability invariably rests.¹
 
 5. Before following up in detail these two lines of development,
 I will summarise again the earlier doctrine with which the
 leaders of the new schools found themselves confronted. 
 
 The earlier philosophers had in mind in dealing with probability
 the application to the future of the inductions of experience,
 to the almost complete exclusion of other problems. For the
 data of probability, therefore, they looked only to their own
 experience and to the recorded experiences of others ; their 
 principal refinement was to distinguish these two grounds, and 
 they did not attempt to make a numerical estimate of the chances. 
 The mathematicians, on the other hand, setting out from the 
 simple problems presented by dice and playing cards, and 
 
 ¹ Poincaré's opinions on Probability are to be found in his Calcul des Probabilités
 and in his Science et Hypothèse. Neither of these books appears
 to me to be in all respects a considered work, but his view is sufficiently novel
 to be worth a reference. Briefly, he shows that the current mathematical
 definition is circular, and argues from this that the choice of the particular
 probabilities, which we are to regard as initially equal before the application of
 our mathematics, is entirely a matter of ' convention.' Much epigram is,
 therefore, expended in pointing out that the study of probability is no more
 than a polite exercise, and he concludes : " Le calcul des probabilités offre une
 contradiction dans les termes mêmes qui servent à le désigner, et, si je ne craignais
 de rappeler ici un mot trop souvent répété, je dirais qu'il nous enseigne
 surtout une chose ; c'est de savoir que nous ne savons rien." On the other
 hand, the greater part of his book is devoted to working out instances of practical
 application, and he speaks of ' metaphysics ' legitimising particular conventions.
 How this comes about is not explained. He seems to endeavour to
 save his reputation as a philosopher by the surrender of probability as a valid
 conception, without at the same time forfeiting his claim as a mathematician
 to work out probable formulae of practical importance.
 
 requiring for the application of their methods a basis of numerical 
 measurement, dwelt on the negative rather than the positive 
 side of their evidence, and found it easier to measure equal 
 degrees of ignorance than equivalent quantities of experience. 
 This led to the explicit introduction of the Principle of Indifference,
 or, as it was then termed, the Principle of Non-Sufficient Reason.
 The great achievement of the eighteenth century was, in the eyes
 of the early nineteenth, the reconciliation of the two points of 
 view and the measurement of probabilities, which were grounded 
 on experience, by a method whose logical basis was the Principle 
 of Non-Sufficient Reason. This would indeed have been a very 
 astonishing discovery, and would, as its authors declared, have 
 gradually brought almost every phase of human activity within 
 the power of the most refined mathematical analysis. 
 
 But it was not long before more sceptical persons began to 
 suspect that this theory proved too much. Its calculations, it 
 is true, were constructed from the data of experience, but the 
 more simple and the less complex the experience the better satis- 
 fied was the theory. What was required was not a wide experi- 
 ence or detailed information, but a completeness of symmetry in 
 the little information there might be. It seemed to follow from 
 the Laplacian doctrine that the primary qualification for one
 who would be well informed was an equally balanced ignorance. 
 
 6. The obvious reaction from a teaching, which seemed to 
 derive from abstractions results relevant to experience, was into 
 the arms of empiricism ; and in the state of philosophy at that 
 time England was the natural home of this reaction. The first 
 protest, of which I am aware, came from Leslie Ellis in 1842.¹
 At the conclusion of his Remarks on an alleged proof of the Method
 of least squares,² " Mere ignorance," he says, " is no ground
 for any inference whatever. Ex nihilo nihil." In Venn's
 Logic of Chance Ellis's suggestions are developed into a complete
 theory :³ " Experience is our sole guide. If we want to discover
 what is in reality a series of things, not a series of our own conceptions,
 we must appeal to the things themselves to obtain it, for
 we cannot find much help elsewhere." Professor Edgeworth ⁴
 was an early disciple of the same school : " The probability," he 
 
 ¹ On the Foundations of the Theory of Probabilities.
 ² Republished in Miscellaneous Writings.
 ³ Logic of Chance, p. 74.
 ⁴ Metretike, p. 4.
 
 says, " of head occurring n times if the coin is of the ordinary 
 make is approximately at least (½)ⁿ. This value is rigidly deducible
 from positive experience, the observations made by gamesters, 
 the experiments recorded by Jevons and De Morgan." 
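
 The value which Edgeworth cites, one half raised to the power n, can be checked by direct enumeration. The following is a minimal sketch under the assumption that the two faces are equally likely and the throws independent; the function names are hypothetical. It shows the calculation only, and leaves open the question at issue, namely whether the equality of the two faces rests on experience or on indifference.

```python
from fractions import Fraction
from itertools import product


def all_heads_probability(n):
    """(1/2)**n, assuming a fair coin and independent throws."""
    return Fraction(1, 2) ** n


def all_heads_by_enumeration(n):
    """Count the sequences of n throws that consist wholly of heads."""
    sequences = list(product("HT", repeat=n))
    favourable = sum(1 for s in sequences if all(face == "H" for face in s))
    return Fraction(favourable, len(sequences))


for n in (1, 2, 5):
    print(n, all_heads_probability(n), all_heads_by_enumeration(n))
# 1 1/2 1/2
# 2 1/4 1/4
# 5 1/32 1/32
```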
 
 The doctrines of the empirical school will be examined in
 Chapter VIII., and I postpone my detailed criticism to that
 chapter. Venn rejects the applications of Bernoulli's theorem,
 which he describes as " one of the last remaining relics of Realism,"
 as well as the later Laplacian Law of Succession, thus destroying
 the link between the empirical and the à priori methods. But,
 apart from this, his view that statements of probability are 
 simply a particular class of statements about the actual world 
 of phenomena, would have led him to a closer dependence on 
 actual experience. He holds that the probability of an event's 
 having a certain attribute is simply the fraction expressing the 
 proportion of cases in which, as a matter of actual fact, this 
 attribute is present. Our knowledge, however, of this proportion 
 is often reached inductively, and shares the uncertainty to which 
 all inductions are liable. And, besides, in referring an event to
 a series we do not postulate that all the members of the series
 should be identical, but only that they should not be known to
 differ in a relevant manner. Even on this theory, therefore, we
 are not solely determined by positive knowledge and the direct 
 data of experience. 
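
 Venn's definition, as reported above, amounts to a simple proportion. The sketch below is illustrative only: the series of throws is invented for the example, and the function name is hypothetical. It computes the fraction of the members of a series in which a given attribute is present.

```python
from fractions import Fraction


def relative_frequency(series, has_attribute):
    """Proportion of the members of a series possessing the attribute."""
    members = list(series)
    if not members:
        raise ValueError("the series must not be empty")
    present = sum(1 for member in members if has_attribute(member))
    return Fraction(present, len(members))


# A hypothetical record of ten throws of a die (not real data).
throws = [1, 4, 6, 2, 6, 3, 5, 6, 2, 1]
print(relative_frequency(throws, lambda face: face == 6))  # 3/10
```

 The proportion found in a recorded series of this kind is itself only evidence of the proportion in the series at large, which is the inductive step referred to in the text.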
 
 7. The Empirical School in their reaction against the preten- 
 tious results, which the Laplacian theory affected to develop 
 out of nothing, have gone too far in the opposite direction. If 
 our experience and our knowledge were complete, we should 
 be beyond the need of the Calculus of Probability. And where 
 our experience is incomplete, we cannot hope to derive from it 
 judgments of probability without the aid either of intuition or of 
 some further à priori principle. Experience, as opposed to intuition,
 cannot possibly afford us a criterion by which to judge
 whether on given evidence the probabilities of two propositions 
 are or are not equal. 
 
 However essential the data of experience may be, they cannot 
 by themselves, it seems, supply us with what we want. Czuber,¹
 who prefers what he calls the Principle of Compelling Reason
 (das Prinzip des zwingenden Grundes), and holds that Probability
 
 ¹ Wahrscheinlichkeitsrechnung, p. 11.
 
 has an objective and not merely formal interpretation only when 
 it is grounded on definite knowledge, is rightly compelled to 
 admit that we cannot get on altogether without the Principle of 
 Non-Sufficient Reason. On the grounds both of its own intuitive 
 plausibility and of that of some of the conclusions for which it 
 is necessary, we are inevitably led towards this principle as a 
 necessary basis for judgments of probability. In some sense, 
 judgments of probability do seem to be based on equally balanced 
 degrees of ignorance. 
 
 8. It is from this starting-point that the German logicians 
 have set out. They have perceived that there are few judgments 
 of probability which are altogether independent of some principle 
 resembling that of Non-Sufficient Reason. But they also appre- 
 hend, with Boole, that this may be a very arbitrary method of 
 procedure. 
 
 It was pointed out in § 18 of Chapter IV. that the cases, in 
 which the Principle of Indifference (or Non-Sufficient Reason) 
 breaks down, have a great deal in common, and that we break 
 up the field of possibility into a number of areas, actually unequal, 
 but indistinguishable on the evidence. Several German logicians, 
 therefore, have endeavoured to determine some rule by which 
 it might be possible to postulate actual equality of area for the
 fields of the various possibilities. 
 
 By far the most complete and closely reasoned solution on 
 these lines is that of Von Kries.¹ He is primarily anxious to discover
 a proper basis for the numerical measurement of probabilities,
 and he is thus led to examine with care the grounds of valid
 judgments of equiprobability. His criticisms of the Principle 
 of Non-Sufficient Reason are searching, and, to meet them, he 
 elaborates a number of qualifying conditions which are, he 
 argues, necessary and sufficient. The value of his book, however, 
 lies, in the opinion of the present writer, in the critical rather 
 than in the constructive parts. The manner in which his qualifying
 conditions are expressed is often, to an English reader at any
 rate, somewhat obscure, and he seems sometimes to cover up
 difficulties, rather than solve them, by the invention of new
 technical terms. These characteristics render it difficult to
 expound him adequately in a summary, and the reader must be
 
 ¹ Die Principien der Wahrscheinlichkeitsrechnung, Eine logische Untersuchung.
 Freiburg, 1886.
 
 referred to the original for a proper exposition of the Doctrine of 
 Spielräume. Briefly, but not very intelligibly perhaps, he may
 be said to hold that the hypotheses for the probabilities of which
 we wish to obtain a numerical comparison, must refer to ' fields '
 (Spielräume) which are ' indifferent,' ' comparable ' in magnitude,
 and ' original ' (ursprünglich). Two fields are ' indifferent ' if
 they are equal before the Principle of Non-Sufficient Reason ;
 they are ' comparable ' if it is true that the fields are actually
 of equal extent ; and they are ' original ' or ultimate if they are
 not derived from some other field. The last condition is exceedingly
 obscure, but it seems to mean that the objects with which
 we are ultimately dealing must be directly represented by the
 ' fields ' of our hypotheses, and there must not be merely correlation
 between these objects and the objects of the fields. The
 qualification of comparability is intended to deal with difficulties
 such as that connected with the population of different areas of
 unknown extent ; and the qualification of originality with those
 arising from indirect measurement, as in the case of specific 
 density. 
 
 Von Kries's solution is highly suggestive, but it does not seem, 
 so far as I understand it, to supply an unambiguous criterion 
 for all cases. His discussion of the philosophical character of 
 probability is brief and inadequate, and the fundamental error 
 in his treatment of the subject is the physical, rather than logical, 
 bias which seems to direct the formulation of his conditions. 
 The condition of Ursprünglichkeit, for instance, seems to depend
 upon physical rather than logical criteria, and is, as a result,
 much more restricted in its applicability than a condition, which
 will really meet the difficulties of the case, ought to be. But, 
 although I differ from him in his philosophical conception of 
 probability, the treatment of the Principle of Indifference, which 
 fills the greater part of his book, is, I think, along fruitful lines, 
 and I have been deeply indebted to it in formulating my own 
 conditions in Chapter IV.
 
 Of less closely reasoned and less detailed treatments, which 
 aim at the same kind of result, those of Sigwart and Lotze are 
 worth noticing. Sigwart's¹ position is sufficiently explained by
 the following extract : " The possibility of a mathematical treatment
 lies primarily in the fact that in the disjunctive judgment

 ¹ Sigwart, Logic (Eng. edition), vol. ii. p. 220.
 
 the number of terms in the disjunction plays a decisive part. 
 Inasmuch as a limited number of mutually exclusive possi- 
 bilities is presented, of which one alone is actual, the element 
 of number forms an essential part of our knowledge. . . . Our 
 knowledge must enable us to assume that the particular terms of 
 the disjunction are so far equivalent that they express an equal 
 degree of specialisation of a general concept, or that they cover 
 equal parts of the whole extension of the concept. . . . This 
 equivalence is most intuitable where we are dealing with equal
 parts of a spatial area, or equal parts of a period of time. . . .
 But even where this obvious quality is not forthcoming, we may
 ground our expectations upon a hypothetical equivalence, where
 we see no reason for considering the extent of one possibility to
 be greater than that of the others. . . ." 
 
 In the beginning of this passage Sigwart seems to be aware 
 of the fundamental difficulty, although exception may be taken 
 to the vagueness of the phrase " equal degree of specialisation of
 a general concept." But in the last sentence quoted he surrenders
 the advantages he has gained in the earlier part of his explanation,
 and, instead of insisting on a knowledge of an equal degree
 of specialisation, he is satisfied with an absence of any knowledge
 to the contrary. Hence, in spite of his initial qualifications, he
 ends unrestrainedly in the arms of Non-Sufficient Reason.¹
 
 Lotze,² in a brief discussion of the subject, throws out some
 remarks well worth quoting : " We disclaim all knowledge of 
 the circumstances which condition the real issue, so that when 
 we talk of equally possible cases we can only mean coordinated as 
 equivalent species in the compass of an universal case ; that is to 
 say, if we enumerate the special forms, which the genus can 
 assume, we get a disjunctive judgment of the form : if the condition
 B is fulfilled, one of the kinds f₁ f₂ f₃ ... of the universal
 consequent F will occur to the exclusion of the rest. Which of
 all those different consequents will, in fact, occur, depends in all
 cases on the special form b₁ b₂ b₃ ... in which that universal
 condition is fulfilled. ... A coordinated case is a case which
 answers to one and only one of the mutually exclusive values
 b₁ b₂ ... of the condition B, and these rival values may occur in
 
 ¹ Sigwart's treatment of the subject of probability is curiously inaccurate.
 Of his four fundamental rules of probability, for instance, three are, as he states 
 them, certainly false. 
 
 ² Lotze, Logic (Eng. edition), pp. 364, 365.
 
 reality ; it does not answer to a more general form B, of this 
 condition, which can never exist in reality, because it embraces 
 several of the particular values b₁ b₂ . . . ."
 
 This certainly meets some of the difficulties, and its resemblance
 to the conditions formulated in Chapter IV. will be evident
 to the careful reader. But it is not very precise, and not easily
 applicable to all cases, to those, for instance, of the measurement
 of continuous quantity. By combining the suggestions of
 Von Kries, Sigwart, and Lotze, we might, perhaps, patch up a
 fairly comprehensive rule. We might say, for instance, that if
 b₁ and b₂ are classes, their members must be finite in number and
 enumerable or they must compose stretches ; that, if they are
 finite in number, they must be equal in number ; and that, if
 their members compose stretches, the stretches must be equal
 stretches ; and that if b₁ and b₂ are concepts, they must represent
 concepts of an equal degree of specialisation. But qualifications
 so worded would raise almost as many difficulties as they solved.
 How, for instance, are we to know when concepts are of an equal
 degree of specialisation ?
 
 9. That probability is a relation has often received incidental 
 recognition from logicians, in spite of the general failure to place
 proper emphasis on it. The earliest writer, with whom I am
 acquainted, explicitly to notice this, is Kahle in his Elementa
 logicae Probabilium methodo mathematica in usum Scientiarum
 et Vitae adornata published at Halle in 1735.¹ Amongst more
 recent writers casual statements are common to the effect that
 the probability of a conclusion is relative to the grounds upon
 which it is based. Take Boole ² for instance : " It is implied in
 the definition that probability is always relative to our actual 
 
 ¹ This work, which seems to have soon fallen into complete neglect and is
 now extremely rare, is full of interest and original thought. The following
 quotations will show the fundamental position taken up : " Est cognitio probabilis,
 si desunt quaedam requisita ad veritatem demonstrativam (p. 15).
 Propositio probabilis esse potest falsa, et improbabilis esse potest vera ; ergo
 cognitio hodie possibilis, crastina luce mutari potest improbabilem, si accedunt
 reliqua requisita omnia, in certitudinem (p. 26). . . . Certitudo est terminus
 relativus : considerare potest ratione representationum in intellectu nostro.
 . . . Incerta nobis dependent a defectu cognitionis (p. 35). . . . Actionem
 imprudenter et contra regulas probabilitatis susceptam eventus felix sequi
 potest. Ergo prudentia actionum ex successu solo non est aestimanda (p. 62).
 . . . Logica probabilium est scientia dijudicandi gradum certitudinis eorum,
 quibus desunt requisita ad veritatem demonstrativam (p. 94)."
 
 2 " On a General Method in the Theory of Probabilities," Phil. Mag., 4th
 Series, viii., 1854. See also, " On the Application of the Theory of Probabilities
 
 
 state of information and varies with that state of information."
 Or Bradley : ¹ " Probability tells us what we ought to believe,
 what we ought to believe on certain data . . . Probability is no
 more ' relative ' and ' subjective ' than is any other act of
 logical inference from hypothetical premises. It is relative to
 the data with which it has to deal, and is not relative in any other
 sense." Or even Laplace, when he is explaining the diversity
 of human opinions : " Dans les choses qui ne sont que vraisemblables,
 la différence des données que chaque homme a sur elles,
 est une des causes principales de la diversité des opinions que
 l'on voit régner sur les mêmes objets . . . c'est ainsi que le
 même fait, récité devant une nombreuse assemblée, obtient divers
 degrés de croyance, suivant l'étendue des connaissances des
 auditeurs." ²
 
 10. Here we may leave this account of the various directions 
 in which progress has seemed possible, with the hope that it may 
 assist the reader, who is dissatisfied with the solution proposed in 
 Chapter IV., to determine the line of argument along which he 
 is likeliest to discover the solution of a difficult problem.
 
 to the Question of the Combination of Testimonies or Judgments " (Edin. Phil.
 Trans. xxi. p. 600) : " Our estimate of the probability of an event varies not
 absolutely with the circumstances which actually affect its occurrence, but with
 our knowledge of those circumstances."
 
 1 The Principles of Logic, p. 208.

 2 Essai philosophique, p. 7.
 
CHAPTER VIII

 THE FREQUENCY THEORY OF PROBABILITY
 
 1. The theory of probability, outlined in the preceding chapters, 
 has serious difficulties to overcome. There is a theoretical, as
 well as a practical, difficulty in measuring or comparing degrees
 of probability, and a further difficulty in determining them
 à priori. We must now examine an alternative theory which is
 much freer from these troubles, and is widely held at the present 
 time. 
 
 2. The theory is in its essence a very old one. Aristotle 
 foreshadowed it when he held that " the probable is that which 
 for the most part happens " ; ¹ and, as we have seen in Chapter
 VII., an opinion not unlike this was entertained by those
 philosophers of the seventeenth and eighteenth centuries who approached
 the problems of probability uninfluenced by the work of
 mathematicians. But the underlying conception of earlier writers
 received at the hands of some English logicians during the latter
 half of the nineteenth century a new and much more complicated
 form.

 The theory in question, which I shall call the Frequency
 Theory of Probability, first appears ² as the basis of a proposed
 logical scheme in a brief essay by Leslie Ellis On the Foundations
 of the Theory of Probabilities, and is somewhat further developed
 in his Remarks on the Fundamental Principles of the Theory of 
 
 1 Rhet. i. 2, 1357 a 34.

 2 I give Ellis the priority because his paper, published in 1843, was read on
 Feb. 14, 1842. The same conception, however, is to be found in Cournot's
 Exposition, also published in 1843 : " La théorie des probabilités a pour objet
 certains rapports numériques qui prendraient des valeurs fixes et complètement
 déterminées, si l'on pouvait répéter à l'infini les épreuves des mêmes hasards,
 et qui, pour un nombre fini d'épreuves, oscillent entre des limites d'autant plus
 resserrées, d'autant plus voisines des valeurs finales, que le nombre des épreuves
 est plus grand."
 
 
 Probabilities.¹ " If the probability of a given event be correctly
 determined," he says, " the event will on a long run of trials tend
 to recur with frequency proportional to their probability. This
 is generally proved mathematically. It seems to me to be true
 à priori. . . . I have been unable to sever the judgment that
 one event is more likely to happen than another from the belief
 that in the long run it will occur more frequently." Ellis
 explicitly introduces the conception that probability is essentially
 concerned with a group or series.
 
 Although the priority of invention must be allowed to Leslie 
 Ellis, the theory is commonly associated with the name of Venn.
 In his Logic of Chance ² it first received elaborate and systematic
 treatment, and, in spite of his having attracted a number of
 followers, there has been no other comprehensive attempt to
 meet the theory's special difficulties or the criticisms directed
 against it. I shall begin, therefore, by examining it in the form
 in which Venn has expounded it. Venn's exposition is much
 coloured by an empirical view of logic, which is not perhaps as
 necessary to the essential part of his doctrine as he himself
 implies, and is not shared by all of those who must be classed as
 in general agreement with him about probability. It will be 
 necessary, therefore, to supplement a criticism of Venn by an 
 account of a more general frequency theory of probability, 
 divested of the empiricism with which he has clothed it. 
 
 3. The following quotations from Venn's Logic of Chance will 
 show the general drift of his argument : The fundamental con- 
 ception is that of a series (p. 4). The series is of events which 
 have a certain number of features or attributes in common (p. 10). 
 The characteristic distinctive of probability is this : the occasional
 attributes, as distinguished from the permanent, are found
 on an examination to tend to exist in a certain definite proportion 
 of the whole number of cases (p. 11). We require that there should 
 be in nature large classes of objects, throughout all the individual 
 members of which a general resemblance extends. For this 
 
 1 These essays were published in the Transactions of the Camb. Phil. Soc., the
 first in 1843 (vol. viii.), and the second in 1854 (vol. ix.). Both were reprinted
 in Mathematical and other Writings (1863), together with three other brief
 papers on Probability and the Method of Least Squares. All five are full of
 spirit and originality, and are not now so well known as they deserve to be. 
 
 2 The first edition appeared in 1866. Revised editions were issued in 1876 
 and 1888. References are given to the third edition of 1888. 
 
 
 purpose the existence of natural kinds or groups is necessary 
 (p. 55). The distinctive characteristics of probability prevail 
 principally in the properties of natural kinds, both in the ultimate 
 and in the derivative or accidental properties (p. 63). The same 
 peculiarity prevails again in the force and frequency of most 
 natural agencies (p. 64). There seems reason to believe that it
 is in such things only, as distinguished from things artificial, that
 the property in question is to be found (p. 65). How, in any
 particular case, are we to establish the existence of a probability
 series ? Experience is our sole guide. If we want to discover
 what is in reality a series of things, not a series of our own
 conceptions, we must appeal to the things themselves to obtain it,
 for we cannot find much help elsewhere (p. 174). When probability
 is divorced from direct reference to objects, as it substantially
 is by not being founded upon experience, it simply resolves
 itself into the common algebraical doctrine of Permutations
 and Combinations (p. 87). By assigning an expectation in
 reference to the individual, we mean nothing more than to make
 a statement about the average of his class (p. 151). When we say
 of a conclusion within the strict province of probability, that it
 is not certain, all that we mean is that in some proportion of
 cases only will such conclusion be right, in the other cases it will 
 be wrong (p. 210). 
 
 The essence of this theory can be expressed in a few words. 
 To say, that the probability of an event's having a certain
 characteristic is x/y, is to mean that the event is one of a number of events,
 a proportion x/y of which have the characteristic in question ; and
 the fact, that there is such a series of events possessing this 
 frequency in respect of the characteristic, is purely a matter of 
 experience to be determined in the same manner as any other 
 question of fact. That such series do exist happens to be a 
 characteristic of the real world as we know it, and from this 
 the practical importance of the calculation of probabilities is 
 derived. 
 
 Such a theory possesses manifest advantages. There is no 
 mystery about it : no new indefinables, no appeals to intuition.
 Measurement leads to no difficulties ; our probabilities or fre- 
 quencies are ordinary numbers, upon which the arithmetical 
 apparatus can be safely brought to bear. And at the same time it 
 
 
 seems to crystallise in a clear, explicit shape the floating opinion 
 of common sense that an event is or is not probable in certain 
 supposed circumstances according as it is or is not usual as a 
 matter of fact and experience. 
 
 The two principal tenets, then, of Venn's system are these :
 that probability is concerned with series or groups of events,
 and that all the requisite facts must be determined empirically,
 a statement in probability merely summing up in a convenient
 way a group of experiences. Aggregate regularity combined 
 with individual difference happens, he says, to be characteristic 
 of many events in the real world. It will often be the case, 
 therefore, that we can make statements regarding the average 
 of a certain class, or regarding its characteristics in the long run, 
 which we cannot make about any of its individual members 
 without great risk of error. As our knowledge regarding the 
 class as a whole may give us valuable guidance in dealing with an 
 individual instance, we require a convenient way of saying that 
 an individual belongs to a class in which certain characteristics 
 appear on the average with a known frequency ; and this the 
 conventional language of probability gives us. The importance
 of probability depends solely upon the actual existence of such
 groups or real kinds in the world of experience, and a judgment
 of probability must necessarily depend for its validity upon our
 empirical knowledge of them.
 
 4. It is the obvious, as well as the correct, criticism of such a
 theory, that the identification of probability with statistical
 frequency is a very grave departure from the established use of
 words ; for it clearly excludes a great number of judgments
 which are generally believed to deal with probability. Venn
 himself was well aware of this, and cannot be accused of supposing
 that all beliefs, which are commonly called probable, are really
 concerned with statistical frequency. But some of his followers,
 to judge from their published work, have not always seen, so
 clearly as he did, that his theory is not concerned with the same 
 subject as that with which other writers have dealt under the 
 same title. Venn justifies his procedure by arguing that no other 
 meaning, of which it is possible to take strict logical cognisance, 
 can reasonably be given to the term, and that the other meanings, 
 with which it has been used, have not enough in common to 
 permit their reduction to a single logical scheme. It is useless, 
 
 
 therefore, for a critic of Venn to point out that many supposed
 judgments of probability are not concerned with statistical 
 frequency ; for, as I understand the Logic of Chance, he admits 
 it ; and the critic must show that the sense different from Venn's 
 in which the term probability is often employed has an important 
 logical interpretation about which we can generalise. This 
 position I seek to establish. It is, in my opinion, this other sense 
 alone which has importance ; Venn's theory by itself has few 
 practical applications, and if we allow it to hold the field, we must 
 admit that probability is not the guide of life, and that in following 
 it we are not acting according to reason. 
 
 5. Part of the plausibility of Venn's theory is derived, I
 think, from a failure to recognise the narrow limits of its
 applicability, or to notice his own admissions regarding this. " In
 every case," he says (p. 124), " in which we extend our inferences
 by Induction or Analogy, or depend upon the witness of others,
 or trust to our own memory of the past, or come to a conclusion
 through conflicting arguments, or even make a long and
 complicated deduction by mathematics or logic, we have a result of
 which we can scarcely feel as certain as of the premisses from
 which it was obtained. In all these cases, then, we are conscious
 of varying quantities of belief, but are the laws according to which
 the belief is produced and varied the same ? If they cannot be
 reduced to one harmonious scheme, if, in fact, they can at best be
 brought to nothing but a number of different schemes, each with
 its own body of laws and rules, then it is vain to endeavour to
 force them into one science." All these cases, therefore, in which
 we are ' not certain,' Venn explicitly excludes from what he
 chooses to call the science of probability, and he pays no further
 attention to them. The science of probability is, according to
 him, no more than a method which enables us to express in a
 convenient form statistical statements of frequency. " The
 province of probability," he says again on page 160, " is not so
 extensive as that over which variation of belief might be observed.
 Probability only considers the case in which this variation is
 brought about in a certain definite statistical way." ¹ He points
 
 1 Edgeworth uses the term ' probability ' widely, as I do ; but he makes
 a distinction corresponding to Venn's by limiting the subject-matter of the
 Calculus of Probabilities. He writes (' Philosophy of Chance,' Mind, 1884,
 p. 223) : " The Calculus of Probabilities is concerned with the estimation of
 degrees of probability ; not every species of estimate, but that which is founded
 
 
 out on p. 194 that for the purposes of probability we must take 
 the statistical frequency from which we start ready made and 
 ask no questions about the process or completeness of its manu- 
 facture : " It may be obtained by any of the numerous rules 
 furnished by Induction, or it may be inferred deductively, or 
 given by our own observation ; its value may be diminished by 
 its depending upon the testimony of witnesses, or its being 
 recalled by our own memory. Its real value may be influenced 
 by these causes or any combinations of them ; but all these are 
 preliminary questions with which we have nothing directly to do. 
 We assume our statistical proposition to be true, neglecting the 
 diminution of its value by the processes of attainment." 
 
 It must be recognised, therefore, that Venn has deliberately
 excluded from his survey almost all the cases in which we regard
 our judgments as ' only probable ' ; and, whatever the value or 
 consistency of his own scheme, he has left untouched a wide 
 field of study for others. 
 
 6. The main grounds, which have induced Venn to regard
 judgments based on statistical frequency as the only cases of
 probability which possess logical importance, seem to be two :
 (i.) that other cases are mainly subjective, and (ii.) that they
 are incapable of accurate measurement.

 With regard to the first it must be admitted that there are
 many instances in which variation of belief is occasioned by purely
 psychological causes, and that his argument is valid against those
 who have defined probability as measuring the degree of
 subjective belief. But this has not been the usual way of
 looking at the subject. Probability is the study of the
 grounds which lead us to entertain a rational preference for
 one belief over another. That there are rational grounds other
 than statistical frequency, for such preferences, Venn does
 not deny ; he admits in the quotation given above that the
 ' real value ' of our conclusion is influenced by many other
 considerations than that of statistical frequency. Venn's theory,
 therefore, cannot be fairly propounded by his disciples as
 alternative to such a theory as is propounded here. For my Treatise is
 concerned with the general theory of arguments from premisses
 leading to conclusions which are reasonable but not certain ;
 and this is a subject which Venn has, deliberately, not treated
 in the Logic of Chance.

 on a particular standard. That standard is the phenomenon of statistical
 uniformity : the fact that a genus can very frequently be subdivided into species
 such that the number of individuals in each species bears an approximately
 constant ratio to the number of individuals in the genus." This use of terms is
 legitimate, though it is not easy to follow it consistently. But, like Venn's,
 it leaves aside the most important questions. The Calculus of Probabilities,
 thus interpreted, is no guide by itself as to which opinion we ought
 to follow, and is not a measure of the weight we should attach to conflicting
 arguments.
 
 7. Apart from two circumstances, it would scarcely be neces- 
 sary to say anything further ; but in the first place some writers 
 have believed that Venn has propounded a complete theory 
 of probability, failing to realise that he is not at all concerned
 with the sense in which we may say that one induction or analogy,
 or testimony, or memory, or train of argument is more probable
 than another ; and in the second place he himself has not always
 kept within the narrow limits, which he has himself laid down 
 as proper to his theory. 
 
 For he has not remained content with defining a probability 
 as identical with a statistical frequency, but has often spoken 
 as if his theory told us which alternatives it is reasonable to prefer.
 When he states, for instance, that modality ought to be banished
 from Logic and relegated to Probability (p. 296), he forgets his
 own dictum that of premisses, the distinctive characteristic of
 which is their lack of certainty, Probability takes account of
 one class only, Induction concerning itself with another class, and
 so forth (p. 321). He forgets also that, when he comes to consider
 the practical use of statistical frequencies, he has to admit that 
 an event may possess more than one frequency, and that we must 
 decide which of these to prefer on extraneous grounds (p. 213). 
 The device, he says, must be to a great extent arbitrary, and there 
 are no logical grounds of decision ; but would he deny that it is 
 often reasonable to found our probability on one statistical 
 frequency rather than on another ? And if our grounds are 
 reasonable, are they not in an important sense logical ? 
 
 Even in those cases, therefore, in which we derive our prefer- 
 ence for one alternative over another from a knowledge of statis- 
 tical frequencies, a statistical frequency by itself is insufficient
 to determine us. We may call a statistical frequency a
 probability, if we choose ; but the fundamental problem of determining
 which of several alternatives is logically preferable still awaits 
 solution. We cannot be content with the only counsel Venn 
 
 
 can offer, that we should choose a frequency which is derived 
 from a series neither too large nor too small. 
 
 The same difficulty, that a probability in Venn's sense is
 insufficient to determine which alternative is logically preferable,
 arises in another connection. In most cases the statistical
 frequency is not given in experience for certain, but is arrived
 at by a process of induction, and inductions, he admits, are not
 certain. If, in the past, three infants out of every ten have
 died in their first four years, induction may base on this the
 doubtful assertion, All infants die in that proportion. But we
 cannot assert on this ground, as Venn wishes to do, that the
 probability of the death of an infant in its first four years is 3/10ths.
 We can say no more than that it is probable (in my sense) that
 there is such a probability (in his sense). For the purpose of
 coming to a decision we cannot compare the value of this
 conclusion with that of others until we know the probability
 (in my sense) that the statistical frequency really is 3/10ths.
 The cases in which we can determine the logical value of a 
 conclusion entirely on grounds of statistical frequency would 
 seem to be extremely few in number. 
 
 8. The second main reason which led Venn to develop his
 theory is to be found in his belief that probabilities which are
 based on statistical frequencies are alone capable of accurate
 measurement. The term ' probabilities,' he argues, is properly
 confined to the case of chances which can be calculated, and all
 calculable chances can be made to depend upon statistical
 frequency. In attempting to establish this latter contention
 he is involved in some paradoxical opinions. " In many cases,"
 he admits, " it is undoubtedly true that we do not resort to direct
 experience at all. If I want to know what is my chance of
 holding ten trumps in a game of whist, I do not enquire how
 often such a thing has occurred before. . . . In practice, à priori
 determination is often easy, whilst à posteriori appeal to
 experience would be not merely tedious but utterly impracticable."
 But these cases which are usually based on the Principle of
 Indifference can, he maintains, be justified on statistical grounds.
 In the case of coin tossing there is a considerable experience of
 the equally frequent occurrence of heads and tails ; the
 experience gained in this simple case is to be extended to the complex
 cases by " Induction and Analogy." In one simple case the
 
 
 result to which the Principle of Indifference would lead is that
 which experience recommends. Therefore in complex cases,
 where there is no basis of experiment at all, we may assume that
 Experience, if experience there was, would speak with the same
 voice as Indifference. This is to assert that, because in one case,
 where there is no known reason to the contrary, there actually 
 is none, therefore in other cases incapable of verification the 
 absence of known reason to the contrary proves that actually 
 there is none. 
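
 The kind of à priori determination which Venn mentions in the case of whist
 may be illustrated by a short calculation. What follows is a minimal sketch
 only, and is not taken from the Logic of Chance. It assumes a pack of 52
 cards of which 13 are trumps, a hand of 13 cards, and every possible hand
 equally likely (that is, it assumes the Principle of Indifference); it reads
 " ten trumps " as exactly ten, and it ignores the turned-up card of actual
 play. The name chance_of_trumps and its parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def chance_of_trumps(k, hand=13, trumps=13, deck=52):
    """Chance that a hand of `hand` cards holds exactly `k` trumps.

    Assumes every hand is equally likely, so the chance is a ratio of
    counts of hands (a hypergeometric proportion).
    """
    favourable = comb(trumps, k) * comb(deck - trumps, hand - k)
    return Fraction(favourable, comb(deck, hand))


# Exactly ten trumps in a hand of thirteen, as an exact fraction and a decimal.
p = chance_of_trumps(10)
print(p, float(p))
```

 No record of past hands enters into the calculation ; the whole result rests
 on the assumption of equally likely hands, which is the point at issue in the
 text.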
 
 The attempt to justify the rules of inverse probability on 
 statistical grounds I have failed to understand ; and after a care- 
 ful reading, I am unable to produce an intelligible account of 
 the argument involved in the latter part of chapter vii. of the 
 Logic of Chance.¹ I am doubtful whether Venn should not have
 excluded à posteriori arguments in probability from his scheme
 as well as inductive arguments. The attempt to include them 
 may have been induced by a desire to deal with all cases 
 in which numerical calculation has been commonly thought 
 possible. 
 
 9. The argument so far has been solely concerned with the 
 case for the frequency theory developed in the Logic of Chance. 
 The criticisms which follow will be directed against a more
 general form of the same theory which may conceivably have 
 recommended itself to some readers. It is unfortunate that no 
 adherent of the doctrine, with the exception of Venn, has at- 
 tempted to present the theory of it in detail. Professor Karl 
 Pearson, for instance, probably agrees with Venn in a general 
 way only, and it is very likely that many of the foregoing remarks 
 do not apply to his view of probability ; but while I generally 
 disagree with the fundamental premisses upon which his work 
 in probability and statistics seems to rest, I am not clearly 
 aware of the nature of the philosophical theory from which he 
 thinks that he derives them and which makes them appear to 
 him to be satisfactory. A careful exposition of his logical
 presuppositions would greatly add to the completeness of his work.
 In the meantime it is only possible to raise general objections to
 
 1 Let the reader, who is acquainted with this chapter, consider what precise
 assumption Venn's reasoning requires on p. 187 in the example which seeks to
 show the efficacy of Lord Lister's antiseptic treatment à posteriori. What is
 the ' inevitable assumption about the bags ' when it is translated into the
 language of this example ? 
 
 
 any theory of probability which seeks to found itself upon the 
 conception of statistical frequency. 
 
 The generalised frequency theory which I propose to put 
 forward, as perhaps representative of what adherents of this 
 doctrine have in mind, differs from Venn's in several important 
 respects.¹ In the first place, it does not regard probability as
 being identical with statistical frequency, although it holds that
 all probabilities must be based on statements of frequency, and
 can be defined in terms of them. It accepts the theory that
 propositions rather than events should be taken as the
 subject-matter of probability ; and it adopts the comprehensive view
 of the subject according to which it includes induction and all
 other cases in which we believe that there are logical grounds for
 preferring one alternative out of a set none of which are certain. 
 Nor does it follow Venn in supposing any special connection to 
 exist between a frequency theory of probability and logical 
 empiricism. 
 
 10. A proposition can be a member of many distinct classes 
 of propositions, the classes being merely constituted by the 
 existence of particular resemblances between their members 
 or in some such way. We may know of a given proposition that 
 it is one of a particular class of propositions, and we may also 
 know, precisely or within defined limits, what proportion of this 
 class are true, without our being aware whether or not the given 
 proposition is true. Let us, therefore, call the actual proportion 
 of true propositions in a class the truth-frequency ² of the class,
 and define the measure of the probability of a proposition relative 
 to a class, of which it is a member, as being equal to the truth- 
 frequency of the class. 
 
 The fundamental tenet of a frequency theory of probability 
 is, then, that the probability of a proposition always depends
 upon referring it to some class whose truth-frequency is known 
 within wide or narrow limits. 
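
 The definition just given can be put into a few lines of Python. This is a
 sketch under stated assumptions, not part of the theory itself : a class of
 propositions is represented, hypothetically, by a list of the truth values of
 its members, and the names truth_frequency and reference_class are invented
 for the illustration.

```python
from fractions import Fraction


def truth_frequency(truth_values):
    """Proportion of true propositions in a class.

    `truth_values` holds one boolean per member of the class.
    """
    values = list(truth_values)
    if not values:
        raise ValueError("a class of reference must have at least one member")
    return Fraction(sum(values), len(values))


# Hypothetical class of ten propositions, three of which happen to be true.
reference_class = [True, False, False, True, False,
                   False, False, True, False, False]

# On the frequency theory, the probability of any one member relative to
# this class is measured by the truth-frequency of the class.
probability_relative_to_class = truth_frequency(reference_class)
print(probability_relative_to_class)
```

 The measure belongs to the class and not to the member : every proposition
 in the list receives the same number, whether it is in fact true or false.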
 
 1 In what follows I am much indebted for some suggestions in favour of the
 frequency theory communicated to me by Dr. Whitehead ; but it is not to be
 supposed that the exposition which follows represents his own opinion.

 2 This is Dr. Whitehead's phrase.

 Such a theory possesses most of the advantages of Venn's,
 but escapes his narrowness. There is nothing in it so far which
 could not be easily expressed with complete precision in the
 terms of ordinary logic. Nor is it necessarily confined to
 probabilities which are numerical. In some cases we may know the
 exact number which expresses the truth-frequency of our class ;
 but a less precise knowledge is not without value, and we may
 say that one probability is greater than another, without knowing
 how much greater, and that it is large or small or negligible, if
 we have knowledge of corresponding accuracy about the
 truth-frequencies of the classes to which the probabilities refer. The
 magnitudes of some pairs of probabilities we shall be able to
 compare numerically, others in respect of more and less only,
 and others not at all. A great deal, therefore, of what has been
 said in Chapter III. would apply equally to the present theory,
 with this difference that the probabilities would, as a matter of
 fact, have numerical values in all cases, and the less complete
 comparisons would only hold the field in cases where the real
 probabilities were partially unknown. On the frequency theory,
 therefore, there is an important sense in which probabilities can
 be unknown, and the relative vagueness of the probabilities
 employed in ordinary reasoning is explained as belonging not
 to the probabilities themselves but only to our knowledge of
 them. For the probabilities are relative, not to our knowledge,
 but to some objective class, possessing a perfectly definite
 truth-frequency, to which we have chosen to refer them.
 
 The frequency theory expounded in this manner cannot easily 
 avoid mention of the relativity of probabilities which is implicit
 here, as it is in Venn's. Whether or not the probability of a 
 proposition is relative to given data, it is clearly relative to the 
 particular class or series to which we choose to refer it. A given 
 proposition has a great variety of different probabilities corre- 
 sponding to each of the various distinct classes of which it is a 
 member ; and before an intelligible meaning can be given to a 
 statement that the probability of a proposition is so-and-so, the 
 class must be specified to which the proposition is being referred. 
 Most adherents of the frequency theory would probably go 
 further, and agree that the class of reference must be determined 
 in any particular case by the data at our disposal. Here, then, 
 is another point on which it is not necessary for the frequency 
 theory to diverge from the theory of this Treatise. It should, 
 I think, be generally agreed by every school of thought that the 
 probability of a conclusion is in an important sense relative to 
 given premisses. On this issue and also on the point that our 
 
 
 knowledge of many probabilities is not numerically definite, 
 there might well be for the future an end of disagreement, and 
 disputation might be reserved for the philosophical interpretation 
 of these settled facts, which it is unreasonable to deny, however 
 we may explain them. 
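
 The dependence of the probability upon the class of reference may be made
 concrete by a small sketch. The records below are invented for the purpose
 and carry no statistical authority ; the attributes (an age band and an
 occupation) and the name truth_frequency are assumptions of the illustration.
 Each record stands for a proposition of the form " this man will survive ten
 years," and the third field says whether that proposition is true.

```python
from fractions import Fraction

# Hypothetical records: (age_band, occupation, proposition_is_true).
records = [
    ("40s", "miner", False), ("40s", "miner", True),
    ("40s", "clerk", True), ("40s", "clerk", True),
    ("50s", "miner", False), ("50s", "miner", False),
    ("50s", "clerk", True), ("50s", "clerk", False),
]


def truth_frequency(members):
    """Proportion of the given records whose proposition is true."""
    members = list(members)
    return Fraction(sum(1 for m in members if m[2]), len(members))


# One and the same proposition, about a miner in his forties, referred to
# three different classes of which it is a member.
by_age = truth_frequency(r for r in records if r[0] == "40s")
by_job = truth_frequency(r for r in records if r[1] == "miner")
by_both = truth_frequency(
    r for r in records if r[0] == "40s" and r[1] == "miner"
)
print(by_age, by_job, by_both)
```

 The three numbers differ, and nothing in the arithmetic says which of them is
 to be called the probability of the proposition ; that must be settled by
 the data at our disposal.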
 
 11. I now proceed to those contentions upon which my 
 fundamental criticism of the frequency theory is founded. The 
 first of these relates to the method by which the class of reference 
 is to be determined. The magnitude of a probability is always 
 to be measured by the truth-frequency of some class ; and this 
 class, it is allowed, must be determined by reference to the 
 premisses, on which the probability of the conclusion is to be 
 determined. But, as a given proposition belongs to innumerable 
 different classes, how are we to know which class the premisses
 indicate as appropriate ? What substitute has the frequency
 theory to offer for judgments of relevance and indifference ?
 And without something of this kind, what principle is there for
 uniquely determining the class, the truth-frequency of which is
 to measure the probability of the argument ? Indeed the
 difficulties of showing how given premisses determine the class
 of reference, by means of rules expressed in terms of previous
 ideas, and without the introduction of any notion, which is new
 and peculiar to probability, appear to me insurmountable.
 
 Whilst no general criterion of choice seems to exist, where of 
 two alternative classes neither includes the other, it might be 
 thought that where one does include the other, the obvious 
 course would be to take the narrowest and most specialised class. 
 This procedure was examined and rejected by Venn : though the 
 objection to it is due, not, as he supposed, to the lack of sufficient 
 statistics in such cases upon which to found a generalisation, 
 but to the inclusion in the class-concept of marks characteristic 
 of the proposition in question, but nevertheless not relevant 
 to the matter in hand. If the process of narrowing the class 
 were to be carried to its furthest point, we should generally be 
 left with a class whose only member is the proposition in question, 
 for we generally know something about it which is true of no 
 other proposition. We cannot, therefore, define the class of 
 reference as being the class of propositions of which everything 
 is true which is known to be true of the proposition whose probability 
 we seek to determine. And, indeed, in those examples 
 for which the frequency theory possesses the greatest prima facie 
 plausibility, the class of reference is selected by taking account 
 of some only of the known characteristics of the quaesitum, those 
 characteristics, namely, which are relevant in the circumstances. 
 In those cases in which one can admit that the probability can be 
 measured by reference to a known truth-frequency, the class of 
 reference is formed of propositions about which our relevant 
 knowledge is the same as about the proposition under consideration. 
 In these special cases we get the same result from the 
 frequency theory as from the Principle of Indifference. But 
 this does not serve to rehabilitate the frequency theory as a 
 general explanation of probability, and goes rather to show that 
 the theory of this Treatise is the generalised theory, comprehending 
 within it such applications of the idea of statistical 
 truth-frequency as have validity. 
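
 The narrowing argument of this paragraph can be illustrated by a small 
 sketch in Python. The records, the attribute names, and the choice of 
 which attributes count as relevant are all invented for the illustration ; 
 only the shape of the argument is taken from the text. 

```python
from fractions import Fraction

# Hypothetical records: each is (attributes, outcome). All values are invented.
RECORDS = [
    ({"age": 40, "smoker": True,  "name": "A"}, True),
    ({"age": 40, "smoker": True,  "name": "B"}, False),
    ({"age": 40, "smoker": True,  "name": "C"}, True),
    ({"age": 40, "smoker": False, "name": "D"}, False),
    ({"age": 55, "smoker": True,  "name": "E"}, True),
]

def reference_class(records, case, keys):
    """Records agreeing with `case` on every attribute named in `keys`."""
    return [(attrs, out) for attrs, out in records
            if all(attrs[k] == case[k] for k in keys)]

def truth_frequency(cls):
    """Proportion of members whose outcome is True, or None for an empty class."""
    if not cls:
        return None
    return Fraction(sum(1 for _, out in cls if out), len(cls))

case = {"age": 40, "smoker": True, "name": "A"}

# Narrowest class: agree on everything known, which leaves only the case itself.
narrowest = reference_class(RECORDS, case, keys=case.keys())

# Class chosen by a prior judgment of relevance (assumed here: age and smoking).
relevant = reference_class(RECORDS, case, keys=["age", "smoker"])

print(len(narrowest), truth_frequency(narrowest))
print(len(relevant), truth_frequency(relevant))
```

 The list of keys passed to reference_class stands in for the judgment of 
 relevance : the frequency computation cannot supply it, and must be told it. 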
 
 ' Relevance ' is an important term in probability, of which 
 the meaning is readily intelligible. I have given my own definition 
 of it already. But I do not know how it is to be explained 
 in terms of the frequency theory. Whether supporters of this 
 theory have fully appreciated the difficulty I much doubt. It is 
 a fundamental issue involving the essence of the peculiarity of 
 probability, which prevents its being explained away in terms 
 of statistical frequency or anything else. 
 
 12. Yet perhaps a modified view of the frequency theory 
 could be evolved which would avoid this difficulty, and I proceed, 
 therefore, to some further criticisms. It might be agreed that a 
 novel element must be admitted at this point, and that relevancy 
 must be determined in some such manner as has been explained 
 in earlier chapters. With this admission, it might be argued, the 
 theory would still stand, divested, it is true, of some of its original 
 simplicity, but nevertheless a substantial theory differing in 
 important respects, although not quite so fundamentally as 
 before, from alternative schemes. 
 
 The next important objection, then, is concerned with the 
 manner in which the principal theorems of probability are to be 
 established on a theory of frequency. This will involve an 
 anticipation in some part of later arguments ; and the reader 
 may be well advised to return to the following paragraph after 
 he has finished Part II. 
 
 13. Let us begin by a consideration of the ' Addition Theorem.' 
 If $a/h$ denotes the probability of $a$ on hypothesis $h$, this theorem 
 may be written $(a+b)/h = a/h + b/h - ab/h$, and may be read 
 ' On hypothesis $h$ the probability of " $a$ or $b$ " is equal to the 
 probability of $a$ + the probability of $b$ - the probability of 
 " both $a$ and $b$." ' This theorem, interpreted in some way or 
 other, is universally assumed ; and we must, therefore, inquire 
 what proof of it the frequency theory can afford. A little 
 symbolism will assist the argument : Let $\alpha_f$ represent the 
 truth-frequency of any class $\alpha$, and let $a_\alpha/h$ stand for ' the probability 
 of $a$ on hypothesis $h$, $\alpha$ being the class of reference determined 
 by this hypothesis.' ¹ We then have $a_\alpha/h = \alpha_f$, and we require to 
 prove a proposition, for values of $\gamma$ and $\delta$ not yet determined, 
 which will be of the form : 

 $$(a+b)_\delta/h = a_\alpha/h + b_\beta/h - ab_\gamma/h.$$ 

 Now if $\delta'$ is the class of propositions $(a+b)$ such that $a$ is an 
 $\alpha$ and $b$ a $\beta$, it is easily shown by the ordinary arithmetic of classes 
 that $\delta'_f = \alpha_f + \beta_f - \alpha\beta_f$ where $\alpha\beta$ is the class of propositions which 
 are members of both $\alpha$ and $\beta$. In the case, therefore, where 
 $\delta = \delta'$ and $\gamma = \alpha\beta$, an addition theorem of the required kind has 
 been established. 
 
 But it does not follow by any reasonable rule that, if $h$ determines 
 $\alpha$ and $\beta$ as the appropriate classes of reference for $a$ and $b$, 
 $h$ must necessarily determine $\delta'$ and $\alpha\beta$ as the appropriate classes 
 of reference for $(a+b)$ and $ab$ ; it may, for instance, be the case 
 that $h$, while it renders $\alpha$ and $\beta$ determinate, yields no information 
 whatever regarding $\alpha\beta$, and points to some quite different 
 class $\mu$, as the suitable class of reference for $ab$. On the frequency 
 theory, therefore, we cannot maintain that the addition theorem 
 is true in general, but only in those special cases where it happens 
 that $\delta = \delta'$ and $\gamma = \alpha\beta$. 
 
 ¹ The question, previously at issue, as to how the class of reference is 
 determined by the hypothesis, is now ignored. 

 The following is a good example : We are given 
 that the proportion of black-haired men in the population 
 is $p_1/q$ and the proportion of colour-blind men $p_2/q$, and there is no 
 known connection between black hair and colour-blindness : 
 what is the probability that a man, about whom nothing special 
 is known, is ¹ either black-haired or colour-blind ? If we represent 
 the hypotheses by $h$ and the alternatives by $a$ and $b$, it would 
 usually be held that, colour-blindness and black hair being 
 independent for knowledge ² relative to the given data, $ab/h = p_1 p_2/q^2$, 
 and that, therefore, by the addition theorem, 

 $$(a+b)/h = \frac{p_1}{q} + \frac{p_2}{q} - \frac{p_1 p_2}{q^2}.$$ 

 But, on the frequency theory, this result might be 
 invalid ; for $\alpha\beta_f = p_1 p_2/q^2$, only if this is the actual proportion in fact 
 of persons who are both colour-blind and black-haired, and that 
 this is the actual proportion cannot possibly be inferred from 
 the independence for knowledge of the characters in question.³ 
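
 The point of this example can be checked numerically. The sketch below, 
 in Python, uses an invented population of twenty men ; the particular 
 numbers and the helper name frequency are assumptions made for the 
 illustration and do not come from the text. 

```python
from fractions import Fraction

def frequency(members, population):
    """Truth-frequency of a class: the share of the population it contains."""
    return Fraction(len(members & population), len(population))

# Hypothetical population of 20 men, labelled 0..19 (invented for illustration).
population = set(range(20))
black_haired = set(range(0, 10))   # p1/q = 10/20
colour_blind = set(range(7, 11))   # p2/q = 4/20

f_a = frequency(black_haired, population)
f_b = frequency(colour_blind, population)
f_ab_actual = frequency(black_haired & colour_blind, population)
f_ab_assumed = f_a * f_b           # value suggested by independence for knowledge

# The arithmetic of classes always holds when the actual overlap is used.
f_union = frequency(black_haired | colour_blind, population)
assert f_union == f_a + f_b - f_ab_actual

# With the assumed product, the result need not equal the actual frequency.
by_addition_theorem = f_a + f_b - f_ab_assumed
print(f_union, by_addition_theorem)
```

 In this invented population the actual overlap is 3/20 while the product 
 of the two proportions is 1/10, so the two results differ ; the frequencies 
 agree only when the actual overlap happens to equal the product. 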
 
 Precisely the same difficulty arises in connection with the 
 multiplication theorem $ab/h = a/bh \cdot b/h$.⁴ In the frequency notation, 
 which is proposed above, the corresponding theorem will 
 be of the form $ab_\delta/h = a_\gamma/bh \cdot b_\beta/h$. For this equation to be satisfied 
 it is easily seen that $\delta$ must be the class of propositions $xy$ such 
 that $x$ is a member of $\alpha$ and $y$ of $\beta$, and $\gamma$ the class of propositions 
 $xb$ such that $x$ is a member of $\alpha$ ; and, as in the case of the addition 
 theorem, we have no guarantee that these classes $\gamma$ and $\delta$ will be 
 those which the hypotheses $bh$ and $h$ will respectively determine 
 as the appropriate classes of reference for $a$ and $ab$. 
 
 In the case of the theorem of inverse probability ⁵ 

 $$\frac{b/ah}{c/ah} = \frac{a/bh}{a/ch} \cdot \frac{b/h}{c/h}$$ 

 the same difficulty again arises, with an additional one when 
 practical applications are considered. For the relative probabilities 
 of our à priori hypotheses, $b$ and $c$, will scarcely ever be 
 capable of determination by means of known frequencies, and in 
 the most legitimate instances of the inverse principle's operation 
 we depend either upon an inductive argument or upon the 
 Principle of Indifference. It is hard to think of an example in 
 which the frequency conditions are even approximately satisfied. 

 ¹ In the course of the present discussion the disjunctive $a+b$ is never 
 interpreted so as to exclude the conjunctive $ab$. 

 ² For a discussion of this term see Chapter XVI. § 2. 

 ³ Venn argues (Logic of Chance, pp. 173, 174) that there is an inductive 
 ground for making this inference. The question of extending the fundamental 
 theorems of a frequency theory of probability by means of induction is discussed 
 in § 14 below. 

 ⁴ Vide Chapter XII. § 6, and Chapter XIV. § 4. 

 ⁵ Vide Chapter XIV. § 5. 
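
 The ratio form of the inverse theorem quoted above can be sketched in 
 Python as follows. All four input values are invented for the illustration, 
 and the function name posterior_ratio is likewise an assumption ; in 
 particular the two prior values are simply set equal, as the Principle of 
 Indifference would direct, and are not derived from any frequency. 

```python
from fractions import Fraction

def posterior_ratio(a_given_bh, a_given_ch, b_given_h, c_given_h):
    """Ratio b/ah : c/ah, computed as (a/bh)/(a/ch) * (b/h)/(c/h)."""
    return (a_given_bh / a_given_ch) * (b_given_h / c_given_h)

# Hypothetical values, invented for illustration only.
likelihood_b = Fraction(3, 4)   # a/bh
likelihood_c = Fraction(1, 4)   # a/ch
prior_b = Fraction(1, 2)        # b/h, set equal to c/h by indifference
prior_c = Fraction(1, 2)        # c/h

ratio = posterior_ratio(likelihood_b, likelihood_c, prior_b, prior_c)
print(ratio)
```

 The two prior arguments are the quantities which, on the argument of the 
 text, will scarcely ever be supplied by known frequencies. 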
 
 Thus an important class of case, in which arguments in proba- 
 bility, generally accepted as satisfactory, do not satisfy the 
 frequency conditions given above, are those in which the notion 
 is introduced of two propositions being, on certain data, inde- 
 pendent for knowledge. The meaning and definition of this 
 expression is discussed more fully in Part II. ; but I do not see 
 what interpretation the frequency theory can put upon it. Yet 
 if the conception of ' independence for knowledge ' is discarded, 
 we shall be brought to a standstill in the vast majority of problems, 
 which are ordinarily considered to be problems in probability, 
 simply from the lack of sufficiently detailed data. Thus the 
 frequency theory is not adequate to explain the processes of 
 reasoning which it sets out to explain. If the theory restricts its 
 operation, as would seem necessary, to those cases in which we 
 know precisely how far the true members of $\alpha$ and $\beta$ overlap, 
 the vast majority of arguments in which probability has been 
 employed must be rejected. 
 
 14. An appeal to some further principle is, therefore, required 
 before the ordinary apparatus of probable inference can be estab- 
 lished on considerations of statistical frequency ; and it may 
 have occurred to some readers that assistance may be obtained 
 from the principles of induction. Here also it will be necessary 
 to anticipate a subsequent discussion. If the argument of Part 
 III. is correct, nothing is more fatal than Induction to the theory 
 now under criticism. For, so far from Induction's lending 
 support to the fundamental rules of probability, it is itself 
 dependent on them. In any case, it is generally agreed that 
 an inductive conclusion is only probable, and that its probability 
 increases with the number of instances upon which it is founded. 
 According to the frequency theory, this belief is only justified if 
 the majority of inductive conclusions actually are true, and it 
 will be false, even on our existing data, that any of them are even 
 probable, if the acknowledged possibility that a majority are 
 false is an actuality. Yet what possible reason can the frequency 
 theory offer, which does not beg the question, for supposing that 
 a majority are true ? And failing this, what ground have we 
 for believing the inductive process to be reasonable ? Yet we 
 invariably assume that with our existing knowledge it is logically 
 reasonable to attach some weight to the inductive method, even 
 if future experience shows that not one of its conclusions is verified 
 in fact. The frequency theory, therefore, in its present form at 
 any rate, entirely fails to explain or justify the most important 
 source of the most usual arguments in the field of probable 
 inference. 
 
 15. The failure of the frequency theory to explain or justify 
 arguments from induction or analogy suggests some remarks of a 
 more general kind. While it is undoubtedly the case that many 
 valuable judgments in probability are partly based on a know- 
 ledge of statistical frequencies, and that many more can be held, 
 with some plausibility, to be indirectly derived from them, there 
 remains a great mass of probable argument which it would be 
 paradoxical to justify in the same manner. It is not sufficient, 
 therefore, even if it is possible, to show that the theory can be 
 developed in a self-consistent manner ; it must also be shown 
 how the body of probable argument, upon which the greater 
 part of our generally accepted knowledge seems to rest, can 
 be explained in terms of it ; for it is certain that much of 
 it does not appear to be derived from premisses of statistical 
 frequency. 
 
 Take, for instance, the intricate network of arguments upon 
 which the conclusions of The Origin of Species are founded : 
 how impossible it would be to transform them into a shape in 
 which they would be seen to rest upon statistical frequency ! 
 Many individual arguments, of course, are explicitly founded 
 upon such considerations ; but this only serves to differentiate 
 them more clearly from those which are not. Darwin's own 
 account of the nature of the argument may be quoted : " The 
 belief in Natural Selection must at present be grounded entirely 
 on general considerations : (1) on its being a vera causa, from 
 the struggle for existence and the certain geological fact that 
 species do somehow change ; (2) from the analogy of change 
 under domestication by man's selection ; (3) and chiefly from 
 this view connecting under an intelligible point of view a host 
 of facts. When we descend to details ... we cannot prove that 
 a single species has changed ; nor can we prove that the supposed 
 changes are beneficial, which is the groundwork of the theory ; 
 nor can we explain why some species have changed and others 
 have not." ¹ Not only in the main argument, but in many of the 
 subsidiary discussions,² an elaborate combination of induction 
 and analogy is superimposed upon a narrow and limited know- 
 ledge of statistical frequency. And this is equally the case in 
 almost all everyday arguments of any degree of complexity. 
 The class of judgments, which a theory of statistical frequency 
 can comprehend, is too narrow to justify its claim to present a 
 complete theory of probability. 
 
 16. Before concluding this chapter, we should not overlook 
 the element of truth which the frequency theory embodies and 
 which provides its plausibility. In the first place, it gives a 
 true account, so long as it does not argue that probability and 
 frequency are identical, of a large number of the most precise 
 arguments in probability, and of those to which mathematical 
 treatment is easily applicable. It is this characteristic which 
 has recommended it to statisticians, and explains the large 
 measure of its acceptance in England at the present time ; for 
 the popularity in this country of an opinion, which has, so far 
 as I know, no thorough supporters abroad, may reasonably be 
 attributed to the chance which has led most of the English 
 writers, who have paid much attention to probability in recent 
 years, to approach the subject from the statistical side. 
 
 In the second place, the statement that the probability of an 
 event is measured by its actual frequency of occurrence ' in the 
 long run ' has a very close connection with a valid conclusion 
 which can be derived, in certain cases, from Bernoulli's theorem. 
 This theorem and its connection with the theory of frequency will 
 be the subject of Chapter XXIX. 
 
 17. The absence of a recent exposition of the logical basis of 
 the frequency theory by any of its adherents has been a great 
 disadvantage to me in criticising it. It is possible that some 
 of the opinions, which I have examined at length, are now held 
 by no one ; nor am I absolutely certain, at the present stage of 
 the inquiry, that a partial rehabilitation of the theory may not 
 be possible. But I am sure that the objections which I have 
 raised cannot be met without a great complication of the theory, 
 and without robbing it of the simplicity which is its greatest 
 preliminary recommendation. Until the theory has been given 
 new foundations, its logical basis is not so secure as to permit 
 controversial applications of it in practice. A good deal of 
 modern statistical work may be based, I think, upon an inconsistent 
 logical scheme, which, avowedly founded upon a theory 
 of frequency, introduces principles which this theory has no 
 power to justify. 

 ¹ Letter to G. Bentham, Life and Letters, vol. iii. p. 25. 

 ² E.g. in the discussion on the relative effect of disuse and selection in 
 reducing unnecessary organs to a rudimentary condition. 
 
CHAPTER IX 
 
 THE CONSTRUCTIVE THEORY OF PART I. SUMMARISED 
 
 1. That part of our knowledge which we obtain directly, 
 supplies the premisses of that part which we obtain by argument. 
 From these premisses we seek to justify some degree of rational 
 belief about all sorts of conclusions. We do this by perceiv- 
 ing certain logical relations between the premisses and the 
 conclusions. The kind of rational belief which we infer in 
 this manner is termed probable (or in the limit certain), and the 
 logical relations, by the perception of which it is obtained, we 
 term relations of probability. 
 
 The probability of a conclusion a derived from premisses h 
 we write a/h ; and this symbol is of fundamental importance. 
 
 2. The object of the Theory or Logic of Probability is to 
 systematise such processes of inference. In particular it aims 
 at elucidating rules by means of which the probabilities of different 
 arguments can be compared. It is of great practical importance 
 to determine which of two conclusions is on the evidence the 
 more probable. 
 
 The most important of these rules is the Principle of 
 Indifference. According to this Principle we must rely upon 
 direct judgment for discriminating between the relevant and 
 the irrelevant parts of the evidence. We can only discard 
 those parts of the evidence which are irrelevant by seeing that 
 they have no logical bearing on the conclusion. The irrelevant 
 evidence being thus discarded, the Principle lays it down that 
 if the evidence for either conclusion is the same (i.e. symmetrical), 
 then their probabilities also are the same (i.e. equal). 

 If, on the other hand, there is additional evidence (i.e. in 
 addition to the symmetrical evidence) for one of the conclusions, 
 and this evidence is favourably relevant, then that conclusion is 
 the more probable. Certain rules have been given by which to 
 judge whether or not evidence is favourably relevant. And by 
 combinations of these judgments of preference with the judgments 
 of indifference warranted by the Principle of Indifference 
 more complicated comparisons are possible. 
 
 3. There are, however, many cases in which these rules 
 furnish no means of comparison ; and in which it is certain that 
 it is not actually within our power to make the comparison. It 
 has been argued that in these cases the probabilities are, in fact, 
 not comparable. As in the example of similarity, where there 
 are different orders of increasing and diminishing similarity, but 
 where it is not possible to say of every pair of objects which of 
 them is on the whole the more like a third object, so there are 
 different orders of probability, and probabilities, which are not 
 of the same order, cannot be compared. 
 
 4. It is sometimes of practical importance, when, for example, 
 we wish to evaluate a chance or to determine the amount of 
 our expectation, to say not only that one probability is greater 
 than another, but by how much it is greater. We wish, that is 
 to say, to have a numerical measure of degrees of probability. 

 This is only occasionally possible. A rule can be given for 
 numerical measurement when the conclusion is one of a number 
 of equiprobable, exclusive, and exhaustive alternatives, but not 
 otherwise. 
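
 The rule for numerical measurement can be sketched in Python. The 
 function name and the die example are assumptions made for the 
 illustration ; the judgment that the alternatives are equiprobable, 
 exclusive, and exhaustive is taken as given and is not something the code 
 can establish. 

```python
from fractions import Fraction

def numerical_probability(favourable, alternatives):
    """m/n for a conclusion covering m of n equiprobable, exclusive,
    exhaustive alternatives (that judgment is assumed, not checked)."""
    alternatives = set(alternatives)
    favourable = set(favourable)
    if not alternatives or not favourable <= alternatives:
        raise ValueError("favourable cases must come from a non-empty set of alternatives")
    return Fraction(len(favourable), len(alternatives))

# Hypothetical example: six faces of a die judged indifferent on the evidence.
faces = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
print(numerical_probability(even, faces))
```

 Where no such set of alternatives is available, the function has nothing 
 to count, which corresponds to the cases in which no numerical measure 
 can be given. 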
 
 5. In Part II. I proceed to a symbolic treatment of the 
 subject, and to the greater systematisation, by symbolic methods 
 on the basis of certain axioms, of the rules of probable argument. 
 
 In Parts III., IV., and V. the nature of certain very important 
 types of probable argument of a complex kind will be treated 
 in detail ; in Part III. the methods of Induction and Analogy, 
 in Part IV. certain semi-philosophical problems, and in Part V. 
 the logical foundations of the methods of inference now com- 
 monly known as statistical. 
 
PART II 
 FUNDAMENTAL THEOREMS 
 
CHAPTER X 
 
 INTRODUCTORY 
 
 1. In Part I. we have been occupied with the epistemology of our 
 subject, that is to say, with what we know about the characteristics 
 and the justification of probable Knowledge. In Part II. I pass 
 to its Formal Logic. I am not certain of how much positive value 
 this Part will prove to the reader. My object in it is to show 
 that, starting from the philosophical ideas of Part I., we can 
 deduce by rigorous methods out of simple and precise definitions 
 the usually accepted results, such as the theorems of the addition 
 and multiplication of probabilities and of inverse probability. 
 The reader will readily perceive that this Part would never have 
 been written except under the influence of Mr. Russell's Principia 
 Mathematica. But I am sensible that it may suffer from the 
 over-elaboration and artificiality of this method without the 
 justification which its grandeur of scale affords to that great work. 
 In common, however, with other examples of formal method, 
 this attempt has had the negative advantage of compelling the 
 author to make his ideas precise and of discovering fallacies and 
 mistakes. It is a part of the spade-work which a conscientious 
 author has to undertake ; though the process of doing it may 
 be of greater value to him than the results can be to the reader, 
 who is concerned to know, as a safeguard of the reliability of the 
 rest of the construction, that the thing can be done, rather than 
 to examine the architectural plans in detail. In the development 
 of my own thought, the following chapters have been of great 
 importance. For it was through trying to prove the fundamental 
 theorems of the subject on the hypothesis that Probability was 
 a relation that I first worked my way into the subject ; and the 
 rest of this Treatise has arisen out of attempts to solve the 
 successive questions to which the ambition to treat Probability 
 as a branch of Formal Logic first gave rise. 
 
 A further occasion of diffidence and apology in introducing 
 this Part of my Treatise arises out of the extent of my debt to 
 Mr. W. E. Johnson. I worked out the first scheme in complete 
 independence of his work and ignorant of the fact that he had 
 thought, more profoundly than I had, along the same lines ; I 
 have also given the exposition its final shape with my own hands. 
 But there was an intermediate stage, at which I submitted what 
 I had done for his criticism, and received the benefit not only of 
 criticism but of his own constructive exercises. The result is 
 that in its final form it is difficult to indicate the exact extent of 
 my indebtedness to him. When the following pages were first 
 in proof, there seemed little likelihood of the appearance of any 
 work on Probability from his own pen, and I do not now proceed 
 to publication with so good a conscience, when he is announcing 
 the approaching completion of a work on Logic which will include 
 " Problematic Inference." 
 
 I propose to give here a brief summary of the five chapters 
 following, without attempting to be rigorous or precise. I shall 
 then be free to write technically in Chapters XI.-XV., inviting 
 the reader, who is not specially interested in the details of this 
 sort of technique, to pass them by. 
 
 2. Probability is concerned with arguments, that is to say, 
 with the " bearing " of one set of propositions upon another set. 
 If we are to deal formally with a generalised treatment of this 
 subject, we must be prepared to consider relations of probability 
 between any pair of sets of propositions, and not only between 
 sets which are actually the subject of knowledge. But we soon 
 find that some limitation must be put on the character of sets of 
 propositions which we can consider as the hypothetical subject 
 of an argument, namely, that they must be possible subjects of 
 knowledge. We cannot, that is to say, conveniently apply our 
 theorems to premisses which are self-contradictory and formally 
 inconsistent with themselves. 
 
 For the purpose of this limitation we have to make a distinction 
 between a set of propositions which is merely false in fact 
 and a set which is formally inconsistent with itself.¹ This leads 
 us to the conception of a group of propositions, which is defined 
 as a set of propositions such that : (i.) if a logical principle 
 belongs to it, all propositions which are instances of that logical 
 principle also belong to it ; (ii.) if the proposition p and the 
 proposition ' not-p or q ' both belong to it, then the proposition 
 q also belongs to it ; (iii.) if any proposition p belongs to it, then 
 the contradictory of p is excluded from it. If the group defined 
 by one part of a set of propositions excludes a proposition which 
 belongs to a group defined by another part of the set, then the 
 set taken as a whole is inconsistent with itself and is incapable of 
 forming the premiss of an argument. 

 ¹ Spinoza had in mind, I think, the distinction between Truth and 
 Probability in his treatment of Necessity, Contingence, and Possibility. Res 
 enim omnes ex data Dei natura necessario sequutae sunt, et ex necessitate naturae 
 Dei determinatae sunt ad certo modo existendum et operandum (Ethices i. 33). 
 That is to say, everything is, without qualification, true or false. At res 
 aliqua nulla alia de causa contingens dicitur, nisi respectu defectus nostrae 
 cognitionis (Ethices i. 33, scholium). That is to say, Contingence, or, as I 
 term it, Probability, solely arises out of the limitations of our knowledge. 
 Contingence in this wide sense, which includes every proposition which, in 
 relation to our knowledge, is only probable (this term covering all intermediate 
 degrees of probability), may be further divided into Contingence in the strict 
 sense, which corresponds to an à priori or formal probability exceeding zero, 
 and Possibility ; that is to say, into formal possibility and empirical possibility. 
 Res singulares voco contingentes, quatenus, dum ad earum solam essentiam 
 attendimus, nihil invenimus, quod earum existentiam necessario ponat, vel 
 quod ipsam necessario secludat. Easdem res singulares voco possibiles, quatenus, 
 dum ad causas, ex quibus produci debent, attendimus, nescimus, an ipsae 
 determinatae sint ad easdem producendum (Ethices iv. Def. 3, 4). 

 The conception of a group leads on to a precise definition of 
 one proposition requiring another (which in the realm of assertion 
 corresponds to relevance in the realm of probability), and of logical 
 priority as being an order of propositions arising out of their 
 relation to those special groups, or real groups, which are in fact 
 the subject of knowledge. Logical priority has no absolute 
 signification, but is relative to a specific body of knowledge, or, 
 as it has been termed in the traditional logic, to the Universe of 
 Reference. 

 It also enables us to reach a definition of inference distinct from 
 implication, as defined by Mr. Russell. This is a matter of very 
 great importance. Readers who are acquainted with the work 
 of Mr. Russell and his followers will probably have noticed that 
 the contrast between his work and that of the traditional logic 
 is by no means wholly due to the greater precision and more 
 mathematical character of his technique. There is a difference 
 also in the design. His object is to discover what assumptions 
 are required in order that the formal propositions generally 
 accepted by mathematicians and logicians may be obtainable 
 as the result of successive steps or substitutions of a few very 
 simple types, and to lay bare by this means any inconsistencies 
 which may exist in received results. But beyond the fact that 
 the conclusions to which he seeks to lead up are those of common 
 sense, and that the uniform type of argument, upon the validity 
 of which each step of his system depends, is of a specially obvious 
 kind, he is not concerned with analysing the methods of valid 
 reasoning which we actually employ. He concludes with 
 familiar results, but he reaches them from premisses, which have 
 never occurred to us before, and by an argument so elaborate that 
 our minds have difficulty in following it. As a method of setting 
 forth the system of formal truth, which shall possess beauty, 
 inter-dependence, and completeness, his is vastly superior to 
 any which has preceded it. But it gives rise to questions about 
 the relation in which ordinary reasoning stands to this ordered 
 system, and, in particular, as to the precise connection between 
 the process of inference, in which the older logicians were principally 
 interested but which he ignores, and the relation of implication 
 on which his scheme depends. 
 
 ' p implies q ' is, according to his definition, exactly equivalent 
 to the disjunction ' q is true or p is false.' If q is true, ' p implies 
 q ' holds for all values of p ; and similarly if p is false, the implication 
 holds for all values of q. This is not what we mean 
 when we say that q can be inferred or follows from p. For whatever 
 the exact meaning of inference may be, it certainly does not 
 hold between all pairs of true propositions, and is not of such a 
 character that every proposition follows from a false one. It is 
 not true that ' A male now rules over England ' follows or can be 
 inferred from ' A male now rules over France ' ; or ' A female now 
 rules over England ' from ' A female now rules over France ' ; 
 whereas, on Mr. Russell's definition, the corresponding implications 
 hold simply in virtue of the facts that ' A male now rules 
 over England ' is true and ' A female now rules over France ' 
 is false. 
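
 The definition of implication just described can be checked mechanically. 
 The following Python sketch is illustrative only ; the function name 
 implies and the variable names are assumptions, and the two truth-values 
 used are those stated in the text. 

```python
def implies(p, q):
    """Material implication as defined in the text: 'q is true or p is false'."""
    return q or not p

male_rules_england = True     # stated in the text to be true
female_rules_france = False   # stated in the text to be false

# Whatever the truth-value of the other proposition, both implications hold.
for unknown in (True, False):
    assert implies(unknown, male_rules_england)    # true consequent
    assert implies(female_rules_france, unknown)   # false antecedent

# The only case in which the implication fails.
print(implies(True, False))
```

 The loop never consults any connection between the propositions, which is 
 the sense in which implication so defined falls short of inference. 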
 
 The distinction between the Relatival Logic of Inference and 
 Probability, and Mr. Russell's Universal Logic of Implication, 
 seems to be that the former is concerned with the relations of 
 propositions in general to a particular limited group. Inference 
 and Probability depend for their importance upon the fact that 
 in actual reasoning the limitation of our knowledge presents us 
 with a particular set of propositions, to which we must relate any 
 other proposition about which we seek knowledge. The course 
 of an argument and the results of reasoning depend, not simply 
 on what is true, but on the particular body of knowledge from 
 which we have set out. Ultimately, indeed, Mr. Russell cannot 
 avoid concerning himself with groups. For his aim is to discover 
 the smallest set of propositions which specify our formal know- 
 ledge, and then to show that they do in fact specify it. In this 
 enterprise, being human, he must confine himself to that part of 
 formal truth which we know, and the question, how far his 
 axioms comprehend all formal truth, must remain insoluble. 
 But his object, nevertheless, is to establish a train of implications 
 between formal truths ; and the character and the justification of 
 rational argument as such is not his subject. 
 
 3. Passing on from these preliminary reflections, our first 
 task is to establish the axioms and definitions which are to make 
 operative our symbolical processes. These processes are almost 
 entirely a development of the idea of representing a probability 
 by the symbol a/h, where h is the premiss of an argument and a 
 its conclusion. It might have been a notation more in accordance 
 with our fundamental ideas, to have employed the symbol 
 a/h to designate the argument from h to a, and to have represented 
 the probability of the argument, or rather the degree of rational 
 belief about a which the argument authorises, by the symbol 
 P(a/h). This would correspond to the symbol V(a/h) which has 
 been employed in Chapter VI. for the evidential value of the 
 argument as distinct from its probability. But in a section 
 where we are only concerned with probabilities, the use of P(a/h) 
 would have been unnecessarily cumbrous, and it is, therefore, 
 convenient to drop the prefix P and to denote the probability 
 itself by a/h. 
 
 The discovery of a convenient symbol, like that of an essential 
 word, has often proved of more than verbal importance. Clear 
 thinking on the subject of Probability is not possible without a 
 symbol which takes an explicit account of the premiss of the
 argument as well as of its conclusion ; and endless confusion has 
 arisen through discussions about the probability of a conclusion 
 without reference to the argument as a whole. I claim, therefore, 
 the introduction of the symbol a/h as an essential step towards 
 any progress in the subject. 
 
 4. Inasmuch as relations of Probability cannot be assumed
 to possess the properties of numbers, the terms addition and
 multiplication of probabilities have to be given appropriate 
 meanings by definition. It is convenient to employ these 
 familiar expressions, rather than to invent new ones, because the 
 properties which arise out of our definitions of addition and 
 multiplication in Probability are analogous to those of addition
 and multiplication in Arithmetic. But the process of establishing 
 these properties is a little complicated and occupies the greater 
 part of Chapter XII. 
 
 The most important of the definitions of Chapter XII. are the 
 following (the numbers referring to the numbers of Chapter 
 XII.) : 
 
 II. The Definition of Certainty : a/h = 1.

 III. The Definition of Impossibility : a/h = 0.

 VI. The Definition of Inconsistency : ah is inconsistent if
 a/h = 0.

 VII. The Definition of a Group : the class of propositions a
 such that a/h = 1 is the group h.

 VIII. The Definition of Equivalence : if b/ah = 1 and a/bh = 1,
 (a ≡ b)/h = 1.

 IX. The Definition of Addition : $ab/h + a\bar{b}/h = a/h$.¹

 X. The Definition of Multiplication : ab/h = a/bh . b/h =
 b/ah . a/h. The symbolical development of the subject largely
 proceeds out of these definitions of Addition and Multiplication.
 It is to be observed that they give a meaning, not to the addition
 and multiplication of any pairs of probabilities, but only to pairs
 which satisfy a certain form. The definition of Multiplication
 may be read : ' the probability of both a and b given h is equal
 to the probability of a given bh, multiplied by the probability of
 b given h.'

 XI. The Definition of Independence : if $a_1/a_2h = a_1/h$ and
 $a_2/a_1h = a_2/h$, $a_1/h$ and $a_2/h$ are independent.

 XII. The Definition of Irrelevance : if $a_1/a_2h = a_1/h$, $a_2$ is
 irrelevant to $a_1/h$.
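 The definitions of Addition and Multiplication can be checked on a small
 numerical model. The sketch below is an illustration only and rests on
 assumptions the text does not make: it treats every probability as a ratio
 of weights over a finite set of hypothetical "worlds", whereas the text
 expressly declines to assume that relations of probability are numbers.
 The names WORLDS, WEIGHT, prob, both and neg are invented for the example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite model: each "world" fixes the truth of three atoms (a, b, h).
WORLDS = list(product([False, True], repeat=3))
# Assumed weights, arbitrary but positive, so that no premiss used below is inconsistent.
WEIGHT = {world: Fraction(i + 1) for i, world in enumerate(WORLDS)}


def prob(conclusion, premiss):
    """Numerical stand-in for conclusion/premiss (the text's a/h)."""
    premiss_worlds = [w for w in WORLDS if premiss(w)]
    if not premiss_worlds:
        raise ValueError("inconsistent premiss: no relation of probability is defined")
    total = sum((WEIGHT[w] for w in premiss_worlds), Fraction(0))
    favourable = sum((WEIGHT[w] for w in premiss_worlds if conclusion(w)), Fraction(0))
    return favourable / total


def both(x, y):
    """Conjunction xy."""
    return lambda w: x(w) and y(w)


def neg(x):
    """Contradictory of x."""
    return lambda w: not x(w)


a = lambda w: w[0]
b = lambda w: w[1]
h = lambda w: w[2]

# Definition IX: ab/h + a(not-b)/h = a/h
addition_holds = prob(both(a, b), h) + prob(both(a, neg(b)), h) == prob(a, h)

# Definition X: ab/h = a/bh . b/h = b/ah . a/h
multiplication_holds = (
    prob(both(a, b), h)
    == prob(a, both(b, h)) * prob(b, h)
    == prob(b, both(a, h)) * prob(a, h)
)

print(addition_holds, multiplication_holds)
```

 Exact fractions are used so that the two equalities are tested without
 rounding. In this model both hold for any choice of positive weights, which
 is the sense in which the defined operations are analogous to those of
 Arithmetic.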
 
 1 $\bar{b}$ stands for the contradictory of b.

 5. In Chapter XIII. these definitions, supplemented by a few
 axioms, are employed to demonstrate the fundamental theorems
 of Certain or Necessary Inference. The interest of this chiefly
 lies in the fact that these theorems include those which the
 traditional Logic has termed the Laws of Thought, as for example
 the Law of Contradiction and the Law of Excluded Middle.
 These are here exhibited as a part of the generalised theory
 of Inference or Rational Argument, which includes probable
 Inference as well as certain Inference. The object of this chapter
 is to show that the ordinarily accepted rules of Inference can in
 fact be deduced from the definitions and axioms of Chapter XII.
 
 6. In Chapter XIV. I proceed to the fundamental Theorems 
 of Probable Inference, of which the following are the most 
 interesting : 
 
 Addition Theorem : (a + b)/h = a/h + b/h − ab/h, which reduces
 to (a + b)/h = a/h + b/h, where a and b are mutually exclusive ;
 and, if $p_1 p_2 \ldots p_n$ form, relative to h, a set of exclusive and
 exhaustive alternatives, $a/h = \sum_{r=1}^{n} p_r a/h$.

 Theorem of Irrelevance : If $a/h_1h_2 = a/h_1$, then $a/h_1\bar{h}_2 = a/h_1$ ;
 i.e. if a proposition is irrelevant, its contradictory also is irrelevant.

 Theorem of Independence : If $a_2/a_1h = a_2/h$, $a_1/a_2h = a_1/h$ ; i.e.
 if $a_1$ is irrelevant to $a_2/h$, it follows that $a_2$ is irrelevant to $a_1/h$
 and that $a_1/h$ and $a_2/h$ are independent.

 Multiplication Theorem : If $a_1/h$ and $a_2/h$ are independent,
 $a_1a_2/h = a_1/h \cdot a_2/h$.

 Theorem of Inverse Probability :
 $\frac{a_1/bh}{a_2/bh} = \frac{b/a_1h}{b/a_2h} \cdot \frac{a_1/h}{a_2/h}$. Further,
 if $a_1/h = p_1$, $a_2/h = p_2$, $b/a_1h = q_1$, $b/a_2h = q_2$, and $a_1/bh + a_2/bh = 1$,
 then $a_1/bh = \frac{p_1q_1}{p_1q_1 + p_2q_2}$ ; and if $a_1/h = a_2/h$, $a_1/bh = \frac{q_1}{q_1 + q_2}$, which
 is equivalent to the statement that the probability of $a_1$ when
 we know b is equal to $\frac{q_1}{q_1 + q_2}$, where $q_1$ is the probability of b when
 we know $a_1$ and $q_2$ its probability when we know $a_2$. This
 theorem enunciated with varying degrees of inaccuracy appears
 in all Treatises on Probability, but is not generally proved.
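 The second form of the Theorem of Inverse Probability lends itself to a
 short numerical illustration. The sketch below assumes, as the theorem
 itself requires, that the alternatives are exclusive and exhaustive given
 bh, and it assumes further that all the probabilities concerned are
 numerical. The function name inverse_probability and the particular values
 of p and q are invented for the example.

```python
from fractions import Fraction


def inverse_probability(priors, likelihoods):
    """Return the list of a_i/bh, given p_i = a_i/h and q_i = b/a_i h.

    Assumes the a_i are exclusive and exhaustive relative to bh,
    so that the results must sum to unity.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("one likelihood is needed for each alternative")
    joint = [p * q for p, q in zip(priors, likelihoods)]
    total = sum(joint, Fraction(0))
    if total == 0:
        raise ValueError("b is impossible on every alternative")
    return [j / total for j in joint]


# Hypothetical values: p1 = a1/h, p2 = a2/h, q1 = b/a1h, q2 = b/a2h.
p1, p2 = Fraction(1, 4), Fraction(3, 4)
q1, q2 = Fraction(2, 3), Fraction(1, 3)

posterior = inverse_probability([p1, p2], [q1, q2])
print(posterior[0])  # a1/bh = p1 q1 / (p1 q1 + p2 q2)

# With equal prior probabilities the result reduces to q1 / (q1 + q2).
equal = inverse_probability([Fraction(1, 2), Fraction(1, 2)], [q1, q2])
print(equal[0])
```

 With these values p1 q1 = 1/6 and p2 q2 = 1/4, so that a1/bh = 2/5 ; and
 when the prior probabilities are equal the result is q1/(q1 + q2) = 2/3.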
 
 Chapter XIV. concludes with some elaborate theorems on the 
 combination of premisses based on a technical symbolic device, 
 known as the Cumulative Formula, which is the work of Mr. W. E. 
 Johnson. 
 
 7. In Chapter XV. I bring the non-numerical theory of 
 probability developed in the preceding chapters into connection 
 with the usual numerical conception of it, and demonstrate how
 and in what class of cases a meaning can be given to a numerical 
 measure of a relation of probability. This leads on to what 
 may be termed numerical approximation, that is to say, the 
 relating of probabilities, which are not themselves numerical, 
 to probabilities, which are numerical, by means of greater and less, 
 by which in some cases numerical limits may be ascribed to 
 probabilities which are not capable of numerical measures. 
 
CHAPTER XI 
 
 THE THEORY OF GROUPS, WITH SPECIAL REFERENCE TO 
 LOGICAL CONSISTENCE, INFERENCE, AND LOGICAL PRIORITY 
 
 1. The Theory of Probability deals with the relation between 
 two sets of propositions, such that, if the first set is known to be 
 true, the second can be known with the appropriate degree of 
 probability by argument from the first.¹ The relation, however,
 also exists when the first set is not known to be true and is hypo- 
 thetical. 
 
 In a symbolical treatment of the subject it is important 
 that we should be free to consider hypothetical premisses, and 
 to take account of relations of probability as existing between
 any pair of sets of propositions, whether or not the premiss is
 actually part of knowledge. But in acting thus we must be
 careful to avoid two possible sources of error. 
 
 2. The first is that which is liable to arise wherever variables
 are concerned. This was mentioned in passing in § 18 of Chapter
 IV. We must remember that whenever we substitute for a
 variable some particular value of it, this may so affect the relevant
 evidence as to modify the probability. This danger is always 
 present except where, as in the first half of Chapter XIII., the 
 conclusions respecting the variable are certain. 
 
 3. The second difficulty is of a different character. Our 
 premisses may be hypothetical and not actually the subject of 
 knowledge. But must they not be possible subjects of know- 
 ledge ? How are we to deal with hypothetical premisses which 
 are self-contradictory or formally inconsistent with themselves, 
 and which cannot be the subject of rational belief of any degree ?
 
 1 Or more strictly, " perception of which, together with knowledge of the 
 first set, justifies an appropriate degree of rational belief about the second." 
 
 Whether or not a relation of probability can be held to exist 
 between a conclusion and a self-inconsistent premiss, it will be 
 convenient to exclude such relations from our scheme, so as to 
 avoid having to provide for anomalies which can have no interest 
 in an account of the actual processes of valid reasoning. Where 
 a premiss is inconsistent with itself it cannot be required. 
 
 4. Let us term the collection of propositions, which are 
 logically involved in the premisses in the sense that they follow 
 from them, or, in other words, stand to them in the relation of 
 certainty,¹ the group specified by the premisses. That is to say,
 we define a group as containing all the propositions logically
 involved in any of the premisses or in any conjunction of them ;
 and as excluding all the propositions the contradictories of which
 are logically involved in any of the premisses or in any conjunction
 of them.² To say, therefore, that a proposition follows
 from a premiss, is the same thing as to say that it belongs to the 
 group which the premiss specifies. 
 
 The idea of a ' group ' will then enable us to define ' logical
 consistency.' If any part of the premisses specifies a group 
 containing a proposition, the contradictory of which is contained 
 in a group specified by some other part, the premisses are logically 
 inconsistent ; otherwise they are logically consistent. In short, 
 premisses are inconsistent if a proposition ' follows from ' one
 part of them, and its contradictory from another part. 
 
 1 ' a can be inferred from b,' ' a follows from b,' ' a is certain in relation to
 b,' ' a is logically involved in b,' I regard as equivalent expressions, the precise
 meaning of which will be defined in succeeding paragraphs. ' a is implied by b,'
 I use in a different sense, namely, in Mr. Russell's sense, as the equivalent of
 ' b or not-a.'

 2 For the conception of a group, and for many other notions and definitions
 in the course of this chapter (those, for example, of a real group and of
 logical priority) I am largely indebted to Mr. W. E. Johnson. The origination
 of the theory of groups is due to him.

 5. We have still, however, to make precise what we mean in
 this definition by one proposition following from or being logically
 involved in the truth of another. We seem to intend by these
 expressions some kind of transition by means of a logical principle.
 A logical principle cannot be better defined, I think, than in terms
 of what in Mr. Russell's Logic of Implication is termed a formal
 implication. ' p implies q ' is a formal implication if ' not-p or q '
 is formally true ; and a proposition is formally true, if it is a value
 of a propositional function, in which all the constituents other
 than the arguments are logical constants, and of which all the
 values are true.
 
 We might define a group in such a way that all logical principles
 belonged to every group. In this case all formally true proposi- 
 tions would belong to every group. This definition is logically 
 precise and would lead to a coherent theory. But it possesses 
 the defect of not closely corresponding to the methods of reasoning 
 we actually employ, because all logical principles are not in fact 
 known to us. And even in the case of those which we do know, 
 there seems to be a logical order (to which on the above definition 
 we cannot give a sense) amongst propositions, which are about 
 logical constants and are formally true, just as there is amongst 
 propositions which are not formally true. Thus, if we were to
 assume the premisses in every argument to include all formally
 true propositions, the sphere of probable argument would be 
 limited to what (in contradistinction to formally true propositions) 
 we may term empirical propositions. 
 
 6. For this reason, therefore, I prefer a narrower definition,
 which shall correspond more exactly to what we seem to mean 
 when we say that one proposition follows from another. Let us 
 define a group of propositions as a set of propositions such that : 
 
 (i.) if the proposition ' p is formally true' belongs to the group, 
 all propositions which are instances of the same formal propositional
 function also belong to it ;
 
 (ii.) if the proposition p and the proposition ' p implies q ' 
 both belong to it, then the proposition q also belongs to it ; 
 
 (iii.) if any proposition p belongs to it, then the contradictory 
 of p is excluded from it. 
 
 According to this definition all processes of certain inference 
 are wholly composed of steps each of which is of one of two simple 
 types (and if we like we might perhaps regard the first as com- 
 prehending the other). I do not feel certain that these conditions 
 may not be narrower than what we mean when we say that one 
 proposition follows from another. But it is not necessary for the 
 purpose of defining a group, to dogmatise as to whether any other 
 additional methods of inference are, or are not, open to us. If 
 we define a group as the propositions logically involved in the 
 premisses in the above sense, and prescribe that the premisses of 
 an argument in probability must specify a group not less extensive 
 than this, we are placing the minimum amount of restriction upon
 the form of our premisses. If, sometimes or as a rule, our
 premisses in fact include some more powerful principle of argu- 
 ment, so much the better. 
 
 In the formal rules of probability which follow, it will be 
 postulated that the set of propositions, which form the premiss 
 of any argument, must not be inconsistent. The premiss must, 
 that is to say, specify a ' group ' in the sense that no part of the 
 premiss must exclude a proposition which follows from another 
 part. But for this purpose we do not need to dogmatise as to 
 what the criterion is of inference or certainty. 
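 Conditions (ii.) and (iii.) of the definition in § 6 can be sketched in a
 few lines. The sketch is a simplification under stated assumptions :
 propositions are plain strings, the contradictory of ' p ' is written
 ' ~p ', each pair (p, q) stands for the proposition ' p implies q ' taken
 as part of the premisses, and condition (i.), which concerns formally true
 propositions, is not modelled at all. The names contradictory,
 specify_group and relation are invented for the example.

```python
def contradictory(p):
    """Return the contradictory of a proposition written as 'p' or '~p'."""
    return p[1:] if p.startswith("~") else "~" + p


def specify_group(premisses, implications):
    """Close the premisses under condition (ii.) and enforce condition (iii.).

    `implications` is a collection of (p, q) pairs, each standing for the
    proposition ' p implies q ' taken as belonging to the premisses.
    """
    group = set(premisses)
    changed = True
    while changed:  # repeat until no new proposition can be detached
        changed = False
        for antecedent, consequent in implications:
            if antecedent in group and consequent not in group:
                group.add(consequent)
                changed = True
    for p in group:
        if contradictory(p) in group:
            raise ValueError(
                f"inconsistent premisses: both {p} and {contradictory(p)} follow"
            )
    return group


def relation(a, group):
    """Classify a relative to the group: certain, impossible, or intermediate."""
    if a in group:
        return "certain"        # a/h = 1
    if contradictory(a) in group:
        return "impossible"     # a/h = 0
    return "intermediate"


# Hypothetical premisses, chosen only to exercise the functions.
rules = [("rain", "wet"), ("wet", "~dry")]
group = specify_group({"rain"}, rules)
print(sorted(group))
print(relation("wet", group), relation("dry", group), relation("cold", group))
```

 A premiss such as {"rain", "dry"} under the same rules is rejected, since
 ' ~dry ' follows from one part of it while ' dry ' is asserted by another ;
 this corresponds to the postulate that the premiss of an argument must not
 be inconsistent.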
 
 7. It will be convenient at this point to define a term which 
 expresses the relation converse to that which exists between a 
 set of propositions and the group which they specify. The propositions
 $p_1 p_2 \ldots p_n$ are said to be fundamental to the group
 h if (i.) they themselves belong to the group (which involves their
 being consistent with one another) ; (ii.) if between them they
 completely specify the group ; and (iii.) if none of them belong
 to the group specified by the rest (for if $p_r$ belongs to the group
 specified by the rest, this term is redundant).
 
 When the fundamental set is uniquely determined, a group h' 
 is a sub-group to the group h, if the set fundamental to h' is 
 included in the set fundamental to h. 
 
 Logically there can be more than one distinct set of proposi- 
 tions fundamental to a given group ; and some extra-logical test 
 must be applied before the fundamental set is determined uniquely.
 On the other hand, a group is completely determined when the 
 constituent propositions of the fundamental set are given. 
 Further, any consistent set of propositions evidently specifies 
 some group, although such a set may contain propositions 
 additional to those which are fundamental to the group it specifies. 
 It is clear also that only one group can be specified by a given 
 set of consistent propositions. The members of a group are, 
 we may say, rationally bound up with the set of propositions 
 fundamental to it. 
 
 8. If Mr. Bertrand Russell is right, the whole of pure 
 mathematics and of formal logic follows, in the sense defined 
 above, from a small number of primitive propositions. The 
 group, therefore, which is specified by these primitive pro- 
 positions, includes the most remote deductions not only amongst 
 those known to mathematicians, but amongst those which time
 and skill have not yet served to solve. If we define certainty
 in a logical and not a psychological sense, it seems necessary, 
 if our premisses include the essential axioms, to regard as 
 certain all propositions which follow from these, whether or 
 not they are known to us. Yet it seems as if there must 
 be some logical sense in which unproved mathematical 
 theorems (some of those, for instance, which deal with the
 theory of numbers) can be likely or unlikely, and in which a
 proposition of this kind, which has been suggested to us by 
 analogy or supported by induction, can possess an intermediate 
 degree of probability. 
 
 There can be no doubt, I think, that the logical relation of 
 certainty does exist in these cases in which lack of skill or insight 
 prevents our apprehending it, in spite of the fact that sufficient 
 premisses, including sufficient logical principles, are known to us. 
 In these cases we must say, what we are not permitted to say 
 when the indeterminacy arises from lack of premisses, that the 
 probability is unknown. There is still a sense, however, in which 
 in such a case the knowledge we actually possess can be, in a 
 logical sense, only probable. While the relation of certainty 
 exists between the fundamental axioms and every mathematical 
 hypothesis (or its contradictory), there are other data in relation 
 to which these hypotheses possess intermediate degrees of 
 probability. If we are unable through lack of skill to discover
 the relation of probability which an hypothesis does in fact bear
 towards one set of data, this set is practically useless, and we must
 fix our attention on some other set in relation to which the probability
 is not unknown. When Newton held that the binomial
 theorem possessed for empirical reasons sufficient probability
 to warrant a further investigation of it, it was not in relation to
 the axioms of mathematics, whether he knew them or not, that
 the probability existed, but in relation to his empirical evidence
 combined, perhaps, with some of the axioms. There is, in short,
 an exception to the rule that we must always consider the probability
 of any conclusion in relation to the whole of the data in
 our possession. When the relation of the conclusion to the whole 
 of our evidence cannot be known, then we must be guided by 
 its relation to some part of the evidence. When, therefore, in 
 later chapters I speak of a formal proposition as possessing an 
 intermediate degree of probability, this will always be in relation
 to evidence from which the proposition does not logically follow
 in the sense defined in § 6. 
 
 9. It follows from the preceding definitions that a proposition 
 is certain in relation to a given premiss, or, in other words, follows
 from this premiss if it is included in the group which that premiss
 specifies. It is impossible if it is excluded from the group, that is
 to say, if its contradictory follows from the premiss. We
 often say, somewhat loosely, that two propositions are contradictory
 to one another, when they are inconsistent in the sense
 that, relative to our evidence, they cannot belong to the same
 group. On the other hand, a proposition, which is not itself
 included in the group specified by the premiss and whose contra- 
 dictory is not included either, has in relation to the premiss an 
 intermediate degree of probability. 
 
 If a follows from h and is, therefore, included in the group
 specified by h, this is denoted by a/h = 1. The relation of certainty,
 that is to say, is denoted by the symbol of unity. The reason
 why this notation is useful and has been adopted by common
 consent will appear when the meaning of the product of a pair
 of relations of probability has been explained. If we represent
 the relation of certainty by $\gamma$ and any other probability by
 $\alpha$, the product $\alpha \cdot \gamma = \alpha$. Similarly, if a is excluded from the
 group specified by h and is impossible in relation to it, this is
 denoted by a/h = 0. The use of the symbol zero to denote
 impossibility arises out of the fact that, if $\omega$ denotes impossibility
 and $\alpha$ any other relation of probability, then, in the senses of
 multiplication and addition to be defined later, the product
 $\alpha \cdot \omega = \omega$, and the sum $\alpha + \omega = \alpha$. Lastly, if a is not included
 in the group specified by h, this is written a/h ≠ 1 or a/h < 1 ;
 and if it is not excluded, this is written a/h ≠ 0 or a/h > 0.
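 The reason for choosing unity and zero can be set out side by side with
 the arithmetical identities which they imitate. In the summary below
 $\gamma$ stands for certainty, $\omega$ for impossibility and $\alpha$ for any other
 relation of probability ; x is an ordinary number, introduced here only
 for comparison. The products and sums on the left are those still to be
 defined, so that this is a restatement of the paragraph above and not a
 proof.

```latex
\begin{align*}
\alpha \cdot \gamma &= \alpha & &\text{compare } x \cdot 1 = x,\\
\alpha \cdot \omega &= \omega & &\text{compare } x \cdot 0 = 0,\\
\alpha + \omega     &= \alpha & &\text{compare } x + 0 = x.
\end{align*}
```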
 
 10. The theory of groups now enables us to give an account, 
 with the aid of some further conceptions, of logical priority and 
 of the true nature of inference. The groups, to which we refer 
 the arguments by which we actually reason, are not arbitrarily 
 chosen. They are determined by those propositions of which 
 we have direct knowledge. Our group of reference is specified 
 by those direct judgments in which we personally rationally
 certify the truth of some propositions and the falsity of others. 
 So long as it is undetermined, or not determined uniquely, 
 which propositions are fundamental, it is not possible to discover
 a necessary order amongst propositions or to show in what way 
 a true proposition ' follows from ' one true premiss rather than 
 another. But when we have determined what propositions are 
 fundamental, by selecting those which we know directly to be true,
 or in some other way, then a meaning can be attached to priority 
 and to the distinction between inference and implication. When 
 the propositions which we know directly are given, there is a 
 logical order amongst those other propositions which we know 
 indirectly and by argument. 
 
 11. It will be useful to distinguish between those groups which 
 are hypothetical and those of which the fundamental set is known 
 to be true. We will term the former hypothetical groups, and the 
 latter real groups. To the real group, which contains all the 
 propositions which are known to be true, we may assign the old 
 logical term Universe of Reference. While knowledge is here 
 taken as the criterion of a real group, what follows will be equally 
 valid whatever criterion is taken, so long as the fundamental set 
 is in some manner or other determined uniquely. 
 
 If it is impossible for us to know a proposition p except by 
 inference from a knowledge of q, so that we cannot know p to be 
 true unless we already know q, this may be expressed by saying 
 that ' p requires q.' More precisely requirement is defined as 
 follows : 
 
 p does not require q if there is some real group to which p 
 belongs and q does not belong, i.e. if there is a real group h 
 such that p/h = 1, q/h ≠ 1 ; hence
 
 p requires q if there is no real group to which p belongs 
 and q does not belong. 
 
 p does not require q within the group h, if the group h, to which
 p belongs, contains a subgroup¹ h' to which p belongs and q does
 not belong ; i.e. if there is a group h' such that h'/h = 1, p/h' = 1,
 q/h' ≠ 1. This reduces to the proposition next but one above
 if h is the Universe of Reference. In § 13 these definitions
 will be generalised to cover intermediate degrees of prob- 
 ability. 
 
 1 Subgroups have only been defined, it must be noticed (see § 7 above) when
 the fundamental set of the group has been, in some way, uniquely determined.

 12. Inference and logical priority can be defined in terms of
 requirement and real groups. It is convenient to distinguish
 two types of inference corresponding to hypothetical and real
 groups, i.e. to cases where the argument is only hypothetical,
 and cases where the conclusion can be asserted :
 
 Hypothetical Inference : ' If p, q,' which may also be read
 ' q is hypothetically inferrible from p,' means that there is a
 real group h such that q/ph = 1, and q/h ≠ 1. In order that this
 may be the case, ph must specify a group ; i.e. p/h ≠ 0, or in
 other words p must not be excluded from h. Hypothetical
 inference is also equivalent to : ' p implies q,' and ' p implies
 q ' does not require ' q.' In other words, q is hypothetically 
 inferrible from p, if we know that q is true or p is false and if 
 we can know this without first knowing either that q is true or 
 that p is false. 
 
 Assertoric Inference : ' p ∴ q,' which may be read ' p therefore
 q ' or ' q may be asserted by inference from p,' means that ' If p, q '
 is true, and in addition ' p ' belongs to a real group ; i.e. there
 are proper groups h and h' such that p/h = 1, q/ph' = 1, q/h' ≠ 1,
 and p/h' ≠ 0.
 
 p is prior to q when p does not require q, and q requires p, 
 when, that is to say, we can know p without knowing q, but 
 not q unless we first know p. 
 
 p is prior to q within the group h when p does not require q 
 within the group, and q does require p within the group. 
 
 It follows from this and from the preceding definitions that, 
 if a proposition is fundamental in the sense that we can only 
 know it directly, there is no proposition prior to it ; and, more 
 generally, that, if a proposition is fundamental to a given 
 group, there is no proposition prior to it within the group. 
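 The definitions of requirement (§ 11) and of priority can be tried out on
 a toy collection of real groups. The sketch assumes that each real group is
 simply given as the set of propositions it contains ; how such groups are
 determined is exactly what the text leaves to direct knowledge, and is not
 modelled. The names requires, prior and REAL_GROUPS are invented for the
 example.

```python
def requires(p, q, real_groups):
    """p requires q if there is no real group to which p belongs and q does not."""
    return not any(p in g and q not in g for g in real_groups)


def prior(p, q, real_groups):
    """p is prior to q when p does not require q, and q requires p."""
    return not requires(p, q, real_groups) and requires(q, p, real_groups)


# Hypothetical real groups, each listed as the set of propositions it contains.
REAL_GROUPS = [
    {"axiom"},
    {"axiom", "lemma"},
    {"axiom", "lemma", "theorem"},
]

print(requires("theorem", "lemma", REAL_GROUPS))   # no group holds theorem without lemma
print(requires("lemma", "theorem", REAL_GROUPS))   # the second group holds lemma alone
print(prior("axiom", "theorem", REAL_GROUPS))
print(any(prior(x, "axiom", REAL_GROUPS) for x in ("lemma", "theorem")))
```

 In this collection nothing is prior to ' axiom ', which agrees with the
 remark that no proposition is prior to one which is fundamental. It should
 be noticed that, on the definition as coded, a proposition belonging to no
 real group requires every proposition vacuously.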
 
 13. We can now apply the conception of requirement to 
 intermediate degrees of probability. The notation adopted is, 
 it will be remembered, as follows : 
 
 p/h = a means that the proposition p has the probable relation 
 of degree a to the proposition h ; while it is postulated that h is 
 self-consistent and therefore specifies a group. 
 
 p/h = 1 means that p follows from h and is, therefore, included
 in the group specified by h.

 p/h = 0 means that p is excluded from the group specified by h.

 If h specifies the Universe of Reference, i.e. if its group comprehends
 the whole of our knowledge, p/h is called the absolute
 probability of p, or (for short) the probability of p ; and if p/h = 1
 and h specifies any real group, p is said to be absolutely certain
 or (for short) certain. Thus p is ' certain ' if it is a member of a
 real group, and a ' certain ' proposition is one which we know 
 to be true. Similarly if p/h=0 under the same conditions, p is 
 absolutely impossible, or (for short) impossible. Thus an ' im- 
 possible ' proposition is one which we know to be false. 
 
 The definition of requirement, when it is generalised so as to 
 take account of intermediate degrees of probability, becomes, it
 will be seen, equivalent to that of relevance : 
 
 The probability of p does not require q within the group h, if 
 there is a subgroup h' such that, for every subgroup h'' which
 includes h' and is included in h (i.e. h'/h'' = 1, h''/h = 1), p/h'' = p/h',
 and q/h' ≠ q/h.
 
 When p is included in the group h, this definition reduces to 
 the definition of requirement given in § 11. 
 
 14. The importance of the theory of groups arises as soon as 
 we admit that there are some propositions which we take for 
 granted without argument, and that all arguments, whether 
 demonstrative or probable, consist in the relating of other con- 
 clusions to these as premisses. 
 
 The particular propositions, which are in fact fundamental 
 to the Universe of Reference, vary from time to time and from 
 person to person. Our theory must also be applicable to hypo- 
 thetical Universes. Although a particular Universe of Reference 
 may be defined by considerations which are partly psychological, 
 when once the Universe is given, our theory of the relation in 
 which other propositions stand towards it is entirely logical. 
 
 The formal development of the theory of argument from 
 imposed and limited premisses, which is attempted in the following 
 chapters, resembles in its general method other parts of formal 
 logic. We seek to establish implications between our primitive 
 axioms and the derivative propositions, without specific reference 
 to what particular propositions are fundamental in our actual 
 Universe of Reference. 
 
 It will be seen more clearly in the following chapters that the 
 laws of inference are the laws of probability, and that the former 
 is a particular case of the latter. The relation of a proposition to 
 a group depends upon the relevance to it of the group, and a 
 group is relevant in so far as it contains a necessary or sufficient 
 condition of the proposition, or a necessary or sufficient condition 
 of a necessary or sufficient condition, and so on ; a condition
 being necessary if every hypothetical group, which includes the 
 proposition together with the Universe of Reference, includes 
 the condition, and sufficient if every hypothetical group, which
 includes the condition together with the Universe of Reference, 
 includes the proposition. 
 
CHAPTER XII 
 
 THE DEFINITIONS AND AXIOMS OF INFERENCE AND
 PROBABILITY 
 
 1. It is not necessary for the validity of what follows to decide 
 in what manner the set of propositions is determined, which is 
 fundamental to our Universe of Reference, or to make definite
 assumptions as to what propositions are included in the group 
 which is specified by the data. When we are investigating an 
 empirical problem, it will be natural to include the whole of 
 our logical apparatus, the whole body, that is to say, of 
 formal truths which are known to us, together with that part 
 of our empirical knowledge which is relevant. But in the 
 following formal developments, which are designed to display 
 the logical rules of probability, we need only assume that our data 
 always include those logical rules, of which the steps of our 
 proofs are instances, together with the axioms relating to prob- 
 ability which we shall enunciate. 
 
 The object of this and the chapters immediately following is 
 to show that all the usually assumed conclusions in the funda- 
 mental logic of inference and probability follow rigorously from 
 a few axioms, in accordance with the fundamental conceptions 
 expounded in Part I. This body of axioms and theorems 
 corresponds, I think, to what logicians have termed the Laws of 
 Thought, when they have meant by this something narrower than 
 the whole system of formal truth. But it goes beyond what has 
 been usual, in dealing at the same time with the laws of probable, 
 as well as of necessary, inference. 
 
 2. This and the following chapters of Part II. are largely 
 independent of many of the more controversial issues raised in 
 the preceding chapters. They do not prejudge the question as
 to whether or not all probabilities are theoretically measurable ; 
 and they are not dependent on our theories as to the part played 
 by direct judgment in establishing relations of probability or 
 inference between particular propositions. Their premisses are 
 all hypothetical. Given the existence of certain relations of 
 probability, others are inferred. Of the conclusions of Chapter 
 III., of the criteria of equiprobability and of inequality discussed
 in Chapters IV. and V., and of the criteria of inference discussed
 in §§ 5, 6 of Chapter XI., they are, I think, wholly independent.
 They deal with a different part of the subject, not so closely
 connected with epistemology.
 
 3. In this chapter I confine myself to Definitions and Axioms. 
 Propositions will be denoted by small letters, and relations
 by capital letters. In accordance with common usage, a disjunctive
 combination of propositions is represented by the sign
 of addition, and a conjunctive combination by simple juxtaposition
 (or, where it is necessary for clearness, by the sign of
 multiplication) : e.g. ' a or b or c ' is written ' a + b + c,' and ' a
 and b and c ' is written ' abc.' ' a + b ' is not so interpreted as to
 exclude ' a and b.' The contradictory of a is written $\bar{a}$.
 
 4. Preliminary Definitions : 
 
 I. If there exists a relation of probability P between the 
 proposition a and the premiss h 
 
 a/h = P. Def.
 
 II. If P is the relation of certainty¹

 P = 1. Def.

 III. If P is the relation of impossibility¹

 P = 0. Def.

 IV. If P is a relation of probability, but not the relation of
 certainty P < 1. Def.

 V. If P is a relation of probability, but not the relation of
 impossibility P > 0. Def.
 
 VI. If a/h=0, the conjunction ah is inconsistent. Def. 
 
 VII. The class of propositions a such that a/A = l is the 
 group specified by h ox (for short) the group h. Def. 
 
 VIII. If b/ah = 1 and a/bh = 1, {a^b)/h = 1 . Def. 
 This may be regarded as the definition of Equivalence. Thus 
 
 we see that equivalence is relative to a premiss h. a is equivalent 
 to b, given h,iib follows from ah, and a from bh. 
 
 ¹ These symbols were first employed by Leibnitz. See p. 155 below.
 
 
 5. Preliminary Axioms : 
 
 We shall assume that there is included in every premiss with 
 which we are concerned the formal implications which allow us 
 to assert the following axioms : 
 
 (i.) Provided that $a$ and $h$ are propositions or conjunctions
 of propositions or disjunctions of propositions, and that $h$ is not
 an inconsistent conjunction, there exists one and only one relation
 of probability $P$ between $a$ as conclusion and $h$ as premiss.
 Thus any conclusion $a$ bears to any consistent premiss $h$ one and
 only one relation of probability.

 (ii.) If $(a \equiv b)/h = 1$, and $x$ is a proposition, $x/ah = x/bh$. This
 is the Axiom of Equivalence.

 (iii.) $(a + b \equiv \overline{\bar{a}\bar{b}})/h = 1$,

 $(aa \equiv a)/h = 1$,

 $(\bar{\bar{a}} \equiv a)/h = 1$,

 $(ab + \bar{a}b \equiv b)/h = 1$.

 If $a/h = 1$, $ah \equiv h$. That is to say,
 if $a$ is included in the group specified by $h$, $h$ and $ah$ are
 equivalent.
 
 6. Addition and Multiplication. If we were to assume that
 probabilities are numbers or ratios, these operations could be
 given their usual arithmetical signification. In adding or
 multiplying probabilities we should be simply adding or multiplying
 numbers. But in the absence of such an assumption, it
 is necessary to give a meaning by definition to these processes.
 I shall define the addition and multiplication of relations of
 probabilities only for certain types of such relations. But it
 will be shown later that the limitation thus placed on our operations
 is not of practical importance.

 We define the sum of the probable relations $ab/h$ and $a\bar{b}/h$
 as being the probable relation $a/h$ ; and the product of the probable
 relations $a/bh$ and $b/h$ as being the probable relation $ab/h$. That
 is to say :

 IX. $ab/h + a\bar{b}/h = a/h$. Def.

 X. $ab/h = a/bh \cdot b/h = b/ah \cdot a/h$. Def.
 Before we proceed to the axioms which will make these symbols
 operative, the definitions may be restated in more familiar
 language. IX. may be read : " The sum of the probabilities
 of ' both $a$ and $b$ ' and of ' $a$ but not $b$,' relative to the same
 hypothesis, is equal to the probability of ' $a$ ' relative to this
 hypothesis." X. may be read : " The probability of ' both $a$ and $b$,'
 assuming $h$, is equal to the product of the probability of $b$, assuming
 $h$, and the probability of $a$, assuming both $b$ and $h$." Or in
 the current terminology ¹ we should have : " The probability
 that both of two events will occur is equal to the probability of
 the first multiplied by the probability of the second, assuming
 the occurrence of the first." It is, in fact, the ordinary rule for
 the multiplication of the probabilities of events which are not
 ' independent.' It has, however, a much more central position
 in the development of the theory than has been usually recognised.
 Subtraction and division are, of course, defined as the inverse
 operations of addition and multiplication :
 
 XI. If $PQ = R$, $P = \frac{R}{Q}$. Def.

 XII. If $P + Q = R$, $P = R - Q$. Def.

 Thus we have to introduce as definitions what would be axioms
 if the meaning of addition and multiplication were already defined.
 In this latter case we should have been able to apply the ordinary
 processes of addition and multiplication without any further
 axioms. As it is, we need axioms in order to make these symbols,
 to which we have given our own meaning, operative. When
 certain properties are associated, it is often more or less arbitrary
 which we take as defining properties and which we associate
 with these by means of axioms. In this case I have found it
 more convenient, for the purposes of formal development, to
 reverse the arrangement which would come most natural to
 commonsense, full of preconceptions as to the meaning of addition
 and multiplication. I define these processes, for the theory of
 probability, by reference to a comparatively unfamiliar property,
 and associate the more familiar properties with this one by means
 of axioms. These axioms are as follows :
 
 (iv.) If $P$, $Q$, $R$ are relations of probability such that the
 products $PQ$, $PR$ and the sums $P + Q$, $P + R$ exist, then :

 (iv. a) If $PQ$ exists, $QP$ exists, and $PQ = QP$. If $P + Q$ exists,
 $Q + P$ exists and $P + Q = Q + P$.

 (iv. b) $PQ < P$ unless $Q = 1$ or $P = 0$ ; $P + Q > P$ unless $Q = 0$.
 $PQ = P$ if $Q = 1$ or $P = 0$ ; $P + Q = P$ if $Q = 0$.

 (iv. c) If $PQ \gtreqless PR$, then $Q \gtreqless R$ unless $P = 0$. If $P + Q \gtreqless P + R$,
 then $Q \gtreqless R$, and conversely.
 
 ¹ E.g. Bertrand, Calcul des probabilités, p. 26.
 
 
 A meaning has not been given, it is important to notice, to
 the signs of addition and multiplication between probabilities
 in all cases. According to the definitions we have given, $P + Q$
 and $PQ$ have not an interpretation whenever $P$ and $Q$ are
 relations of probability, but in certain conditions only. Furthermore,
 if $P + Q = R$ and $Q = S + T$, it does not follow that
 $P + S + T = R$, since no meaning has been assigned to such an
 expression as $P + S + T$. The equation must be written $P + (S + T) = R$,
 and we cannot infer from the foregoing axioms that
 $(P + S) + T = R$. The following axioms allow us to make this
 and other inferences in cases in which the sum $P + S$ exists, i.e.
 when $P + S = A$ and $A$ is a relation of probability.
 
 (v.) $[\pm P \pm Q] + [\pm R \pm S] = [\pm P \pm R] - [\mp Q \mp S] = [\pm P \pm R] + [\pm Q \pm S] = [\pm P \pm Q] - [\mp R \mp S]$
 in every case in which the probabilities $[\pm P \pm Q]$, $[\pm R \pm S]$,
 $[\pm P \pm R]$, etc., exist, i.e., in which these sums satisfy the conditions
 necessary in order that a meaning may be given to them
 in the terms of our definition.
 
 (vi.) P(R±S)=PR±PS, if the sum R±S and the products 
 PR and PS exist as probabilities. 
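
 Definitions IX. and X. can be checked in a small numerical model. The sketch below is an illustration only and not part of the text's argument: it assumes, contrary to the generality the text is careful to preserve, that probabilities are numbers, and it represents a proposition as the set of hypothetical equally weighted "worlds" in which it holds. The names `WORLDS`, `prob` and `neg` are invented for the example.

```python
from fractions import Fraction

# Hypothetical finite model: a proposition is the set of "worlds" in which it holds.
WORLDS = frozenset(range(12))


def prob(a, h):
    """a/h in the text's notation: the share of h-worlds in which a also holds."""
    if not h:
        raise ValueError("the premiss h is inconsistent (it holds in no world)")
    return Fraction(len(a & h), len(h))


def neg(a):
    """The contradictory of a."""
    return WORLDS - a


a = frozenset({0, 1, 2, 3, 4, 5})
b = frozenset({3, 4, 5, 6, 7})
h = frozenset(range(10))

# Definition IX: ab/h + a(not-b)/h = a/h
sum_ix = prob(a & b, h) + prob(a & neg(b), h)

# Definition X: ab/h = a/bh . b/h = b/ah . a/h
prod_x_first = prob(a, b & h) * prob(b, h)
prod_x_second = prob(b, a & h) * prob(a, h)

print(sum_ix == prob(a, h))
print(prod_x_first == prob(a & b, h) == prod_x_second)
```

 In such a model the sums and products always exist, so the restrictions of axioms (iv.) to (vi.) are satisfied trivially; the model says nothing about cases in which probabilities are not numerically measurable.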
 
 7. From these axioms it is possible to derive a number of
 propositions respecting the addition and multiplication of probabilities.
 They enable us to prove, for instance, that if $P + Q = R + S$
 then $P - R = S - Q$, provided that the differences $P - R$
 and $S - Q$ exist ; and that $(P + Q)(R + S) = (P + Q)R + (P + Q)S =
 [PR + QR] + [PS + QS] = [PR + QS] + [QR + PS]$, provided that
 the sums and products in question exist. In general any rearrangement
 which would be legitimate in an equation between
 arithmetic quantities is also legitimate in an equation between 
 probabilities, provided that our initial equation and the equation 
 which finally results from our symbolic operations can both be 
 expressed in a form which contains only products and sums which 
 have an interpretation as probabilities in accordance with the
 definitions. If, therefore, this condition is observed, we need not 
 complicate our operations by the insertion of brackets at every 
 stage, and no result can be obtained as a result of leaving them 
 out, if it is of the form prescribed above, which could not be 
 obtained if they had been rigorously inserted throughout. We 
 can only be interested in our final results when they deal with
 actually existent and intelligible probabilities (for our object is,
 always, to compare one probability with another), and we are
 not incommoded, therefore, in our symbolic operations by the
 circumstance that sums and products do not exist between
 every pair of probabilities.
 8. Independence : 
 
 XIII. If $a_1/a_2h = a_1/h$ and $a_2/a_1h = a_2/h$, the probabilities
 $a_1/h$ and $a_2/h$ are independent. Def.

 Thus the probabilities of two arguments having the same
 premisses are independent, if the addition to the premisses of the
 conclusion of either leaves them unaffected.

 Irrelevance : ¹

 XIV. If $a_1/a_2h = a_1/h$, $a_2$ is irrelevant on the whole, or, for
 short, irrelevant to $a_1/h$. Def.
 
 ¹ This is repeated for convenience of reference from Chapter IV. § 14. It is
 only necessary here to take account of irrelevance on the whole, not of the more 
 precise sense. 
 
CHAPTER XIII
 
 THE FUNDAMENTAL THEOREMS OF NECESSARY INFERENCE 
 
 1. In this chapter we shall be mainly concerned with deducing
 the existence of relations of certainty or impossibility, given other
 relations of certainty or impossibility, with the rules, that is to
 say, of Certain or, as De Morgan termed it, of Necessary Inference.
 But it will be convenient to include here a few theorems dealing
 with intermediate degrees of probability. Except in one or two
 important cases I shall not trouble to translate these theorems
 from the symbolism in which they are expressed, since their
 interpretation presents no difficulty.
 
 2. (1) $a/h + \bar{a}/h = 1$.

 For $ab/h + \bar{a}b/h = b/h$ by IX.,

 $a/bh \cdot b/h + \bar{a}/bh \cdot b/h = b/h$ by X.

 Put $b/h = 1$, then $a/bh + \bar{a}/bh = 1$ by (iv. b),

 since $b/h = 1$, $bh \equiv h$ by (iii.).

 Thus $a/h + \bar{a}/h = 1$ by (ii.).

 (1.1) If $a/h = 1$, $\bar{a}/h = 0$.

 $a/h + \bar{a}/h = 1$ by (1),

 $\therefore a/h + \bar{a}/h = a/h = a/h + 0$ by (iv. b),

 $\therefore \bar{a}/h = 0$ by (iv. c).

 (1.2) Similarly, if $\bar{a}/h = 1$, $a/h = 0$.

 (1.3) If $a/h = 0$, $\bar{a}/h = 1$.

 $a/h + \bar{a}/h = 1$ by (1),

 $\therefore 0 + \bar{a}/h = 0 + 1$ by (iv. b),

 $\therefore \bar{a}/h = 1$ by (iv. c).

 (1.4) Similarly, if $\bar{a}/h = 0$, $a/h = 1$.

 (2) $a/h < 1$ or $a/h = 1$ by IV.

 (3) $a/h > 0$ or $a/h = 0$ by V.,
 i.e. there are no negative probabilities.
 
 
 (4) $ab/h < b/h$ or $ab/h = b/h$ by X. and (iv. b).

 (5) If $P$ and $Q$ are relations of probability and $P + Q = 0$,
 then $P = 0$ and $Q = 0$.

 $P + Q > P$ unless $Q = 0$ by (iv. b),

 and $P > 0$ unless $P = 0$ by V.

 $\therefore P + Q > 0$ unless $Q = 0$.
 Hence, if $P + Q = 0$, $Q = 0$ and similarly $P = 0$.

 (6) If $PQ = 0$, $P = 0$ or $Q = 0$.

 $Q > 0$ unless $Q = 0$ by V.

 Hence $PQ > P \cdot 0$ unless $Q = 0$ or $P = 0$ by (iv. c),

 i.e. $PQ > 0$ unless $Q = 0$ or $P = 0$ by (iv. b).

 Whence, if $PQ = 0$, the result follows.

 (7) If $PQ = 1$, $P = 1$ and $Q = 1$.

 $PQ < P$ unless $P = 0$ or $Q = 1$ by (iv. b),

 $PQ = P$ if $P = 0$ or $Q = 1$ by (iv. b),

 and $P < 1$ unless $P = 1$ by IV.,

 $\therefore PQ < 1$ unless $P = 1$.
 Hence $P = 1$ ; similarly $Q = 1$.

 (8) If $a/h = 0$, $ab/h = 0$ and $a/bh = 0$ if $bh$ is not inconsistent.

 For $ab/h = b/ah \cdot a/h = a/bh \cdot b/h$ by X.,

 and since $a/h = 0$, $b/ah \cdot a/h = 0$ by (iv. b),

 $\therefore ab/h = 0$ and $a/bh \cdot b/h = 0$,
 $\therefore$ unless $b/h = 0$, $a/bh = 0$ by (6),

 whence the result by VI.

 Thus, if a conclusion is impossible, we may add to the conclusion
 or add consistently to the premisses without affecting the
 argument.

 (9) If $a/h = 1$, $a/bh = 1$ if $bh$ is not inconsistent.

 Since $a/h = 1$, $\bar{a}/h = 0$ by (1.1),

 $\therefore \bar{a}/bh = 0$ by (8) if $bh$ is not inconsistent,

 whence $a/bh = 1$ by (1.4).

 Thus we may add to premisses, which make a conclusion
 certain, any other premisses not inconsistent with them, without
 affecting the result.

 (10) If $a/h = 1$, $ab/h = b/ah = b/h$.

 $ab/h = b/ah \cdot a/h = a/bh \cdot b/h$ by X.

 Since $a/h = 1$, $a/bh = 1$ by (9) unless $b/h = 0$,

 $\therefore b/ah \cdot a/h = b/ah$ and $a/bh \cdot b/h = b/h$ by (iv. b),
 whence the result, unless $b/h = 0$.
 
 If $b/h = 0$, the result follows from (8).

 (11) If $ab/h = 1$, $a/h = 1$.

 For $ab/h = b/ah \cdot a/h$ by X.,

 $\therefore a/h = 1$ by (7).

 (12) If $(a \equiv b)/h = 1$, $a/h = b/h$.

 $b/ah \cdot a/h = a/bh \cdot b/h$ by X.

 and $b/ah = 1$, $a/bh = 1$ by VIII.,

 $\therefore a/h = b/h$ by (iv. b).

 (12.1) If $(a \equiv b)/h = 1$ and $hx$ is not inconsistent,
 $a/hx = b/hx$.
 $a/hx \cdot x/h = x/ah \cdot a/h$,
 and $b/hx \cdot x/h = x/bh \cdot b/h$ by X.,

 $x/ah = x/bh$ by (ii.),

 and $a/h = b/h$ by (12),

 $\therefore a/hx = b/hx$ unless $x/h = 0$.
 This is the principle of equivalence. In virtue of it and of
 axiom (ii.), if $(a \equiv b)/h = 1$, we can substitute $a$ for $b$ and vice versa,
 wherever they occur in a probability whose premisses include $h$.
 
 (13) $a/a = 1$, unless $a$ is inconsistent.

 For $a/a = aa/a = a/aa \cdot a/a$ by (iii.), (12), and X.,

 whence $a/aa = 1$ by (ii.), unless $a/a = 0$,

 i.e. $a/a = 1$, unless $a$ is inconsistent by (iii.), (12), and VI.

 (13.1) $\bar{a}/a = 0$, unless $a$ is inconsistent. This follows from
 (13) and (1.1).

 (13.2) $a/\bar{a} = 0$, unless $\bar{a}$ is inconsistent. This follows from
 (iii.) by writing $\bar{a}$ for $a$ in (13.1).

 (14) If $a/b = 0$ and $a$ is not inconsistent, $b/a = 0$.

 Let $f$ be the group of assumptions, common to $a$ and $b$, which
 we have supposed to be included in every real group ;
 then $a/b = a/bf$ and $b/a = b/af$ by (iii.) and (12),

 and $a/bf \cdot b/f = b/af \cdot a/f$ by X.

 Since $a/bf = 0$ by hypothesis,

 and $a/f \neq 0$, since $a$ is not inconsistent,

 $\therefore b/af = 0$,
 whence $b/a = 0$.

 Thus, if $a$ is impossible given $b$, then $b$ is impossible given $a$.

 (15) If $h_1/h_2 = 0$, $h_1h_2/h = 0$.

 $h_1h_2/h = h_1/h_2h \cdot h_2/h$ by X.,

 and since $h_1/h_2 = 0$, $h_1/h_2h = 0$ by (8), unless $h/h_2 = 0$, whence
 the result by (iv. b), unless $h/h_2 = 0$.

 If $h/h_2 = 0$, $h_2/h = 0$ by (14),

 since we assume that $h$ is not inconsistent, and hence

 $h_1h_2/h = 0$ by (8).

 Thus, if $h_1$ is impossible given $h_2$, $h_1h_2$ is always impossible and is
 excluded from every group.
 
 (15.1) If $h_1h_2/h = 0$ and $h_2h$ is not inconsistent, $h_1/h_2h = 0$.
 This, which is the converse of (15), follows from X. and (6).

 (16) If $h_1/h_2 = 1$, $(h_1 + \bar{h}_2)/h = 1$.

 $\bar{h}_1/h_2 = 0$ by (1),

 $\therefore \bar{h}_1h_2/h = 0$ by (15),

 $\therefore \overline{\bar{h}_1h_2}/h = 1$ by (1.3),

 $\therefore (h_1 + \bar{h}_2)/h = 1$ by (12) and (iii.).

 (16.1) We may write (16) :

 If $h_1/h_2 = 1$, $(h_2 \supset h_1)/h = 1$, where ' $\supset$ ' symbolises ' implies.'
 Thus if $h_1$ follows from $h_2$, then it is always certain that
 $h_2$ implies $h_1$.

 (16.2) If $(h_1 + \bar{h}_2)/h = 1$ and $h_2h$ is not inconsistent,
 $h_1/h_2h = 1$.

 $\bar{h}_1h_2/h = 0$, as in (16),
 $\therefore \bar{h}_1/h_2h = 0$ by (15.1), since $h_2h$ is not inconsistent,

 $\therefore h_1/h_2h = 1$ by (1.4).

 This is the converse of (16).

 (16.3) We may write (16.2) :

 If $(h_2 \supset h_1)/h = 1$ and $h_2h$ is not inconsistent, $h_1/h_2h = 1$.
 Thus, if we define a ' group ' as a set of propositions, which follow
 from and are certain relatively to the proposition which specifies
 them, this proposition proves that, if $h_2 \supset h_1$ and $h_2$ belong to a
 group $h$, then $h_1$ also belongs to this group.

 (17) If $(h_1 \supset : a \equiv b)/h = 1$ and $h_1h$ is not inconsistent, $a/h_1h
 = b/h_1h$. This follows from (16.3) and (12).

 (18) $a/a = 1$ or $\bar{a}/\bar{a} = 1$.

 $a/a = 1$, unless $a$ is inconsistent, by (13).
 If $a$ is inconsistent, $a/h = 0$, where $h$ is not inconsistent, and
 therefore $\bar{a}/h = 1$ by (1.3).

 Thus, if $a$ is inconsistent, $\bar{a}$ is not inconsistent, and therefore

 $\bar{a}/\bar{a} = 1$ by (13).

 (19) $a\bar{a}/h = 0$.

 $a/a = 1$ or $\bar{a}/\bar{a} = 1$ by (18) ;

 $\therefore \bar{a}/a = 0$ or $a/\bar{a} = 0$ by (1.1) and (1.2).
 In either case $a\bar{a}/h = 0$ by (15).
 
 Thus it is impossible that both $a$ and its contradictory
 should be true. This is the Law of Contradiction.

 (20) $(a + \bar{a})/h = 1$.

 Since $(a\bar{a} \equiv \overline{a + \bar{a}})/h = 1$ by (iii.),

 $\overline{a + \bar{a}}/h = 0$ by (19) and (12),

 $\therefore (a + \bar{a})/h = 1$ by (1.3).

 Thus it is certain that either $a$ or its contradictory is true. This
 is the Law of Excluded Middle.

 (21) If $a/h_1 = 1$ and $a/h_2 = 0$, $h_1h_2/h = 0$.
 For $a/h_1h_2 \cdot h_1/h_2 = h_1/ah_2 \cdot a/h_2$,

 and $\bar{a}/h_1h_2 \cdot h_2/h_1 = h_2/\bar{a}h_1 \cdot \bar{a}/h_1$ by X.,

 $\therefore a/h_1h_2 \cdot h_1/h_2 = 0$ and $\bar{a}/h_1h_2 \cdot h_2/h_1 = 0$,
 since, by hypothesis and (1), $\bar{a}/h_1 = 0$ and $a/h_2 = 0$,

 $\therefore a/h_1h_2 = 0$ or $h_1/h_2 = 0$,
 and $a/h_1h_2 = 1$ or $h_2/h_1 = 0$,

 $\therefore h_1/h_2 = 0$ or $h_2/h_1 = 0$.
 In either case $h_1h_2/h = 0$ by (15).

 Thus, if a proposition is certain relatively to one set of
 premisses, and impossible relatively to another set, the two sets
 are incompatible.

 (22) If $a/h_1 = 0$ and $h_1/h = 1$, $a/h = 0$.

 $ah_1/h = 0$ by (15), $\therefore h_1/ah \cdot a/h = 0$,

 $h_1/ah = 1$ by (9), unless $a/h = 0$.

 $\therefore$ in any case $a/h = 0$.

 (23) If $b/a = 0$ and $b/\bar{a} = 0$, $b/h = 0$.

 $ab/h = 0$ and $\bar{a}b/h = 0$ by (15),
 $\therefore a/bh = 0$ or $b/h = 0$,

 and $\bar{a}/bh = 0$ or $b/h = 0$ by X. and (iv.),

 whence $b/h = 0$ by (1.4).
 
CHAPTER XIV 
 
 THE FUNDAMENTAL THEOREMS OF PROBABLE INFERENCE
 
 1. I SHALL give proofs in this chapter of most of the fundamental 
 theorems of Probability, with very little comment. The bearing 
 of some of them will be discussed more fully in Chapter XVI. 
 
 2. The Addition Theorems : 
 
 (24) $(a + b)/h = a/h + b/h - ab/h$.

 In IX. write $(a + b)$ for $a$, and $\bar{a}b$ for $b$.

 Then $(a + b)\bar{a}b/h + (a + b)\overline{\bar{a}b}/h = (a + b)/h$,

 whence $\bar{a}b/h + (a + b)(a + \bar{b})/h = (a + b)/h$ by (iii.),

 $\bar{a}/bh \cdot b/h + a/h = (a + b)/h$ by (iii.) and IX.
 That is to say, $(a + b)/h = a/h + (1 - a/bh) \cdot b/h
 = a/h + b/h - ab/h$.
 In accordance with the principles of Chapter XII. § 6, this
 should be written, strictly, in the form $a/h + (b/h - ab/h)$, or in
 the form $b/h + (a/h - ab/h)$. The argument is valid, since the
 probability $(b/h - ab/h)$ is equal to $\bar{a}b/h$, as appears from the
 preceding proof, and, therefore, exists. This important theorem
 gives the probability of ' $a$ or $b$ ' relative to a given hypothesis
 in terms of the probabilities of ' $a$,' ' $b$,' and ' $a$ and $b$ ' relative to
 the same hypothesis.

 (24.1) If $ab/h = 0$, i.e. if $a$ and $b$ are exclusive alternatives
 relative to the hypothesis, then

 $(a + b)/h = a/h + b/h$.
 This is the ordinary rule for the addition of the probabilities of
 exclusive alternatives.

 (24.2) $ab/h + \bar{a}b/h = b/h$,

 since $ab + \bar{a}b \equiv b$ by (iii.),

 and $a\bar{a}b/h = 0$ by (19) and (8).

 (24.3) $(a + b)/h = a/h + b\bar{a}/h$. This follows from (24) and
 (24.2).
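
 The addition theorems can be verified on a finite example. The following sketch is offered only as a numerical illustration under an assumption the text does not make, namely that probabilities are numbers obtained by counting hypothetical equally weighted cases; `prob` and the particular sets are invented for the purpose.

```python
from fractions import Fraction


def prob(a, h):
    """a/h: the fraction of the cases in h in which a also holds."""
    return Fraction(len(a & h), len(h))


h = frozenset(range(20))                 # the hypothesis: twenty possible cases
a = frozenset({0, 1, 2, 3, 4, 5, 6})
b = frozenset({5, 6, 7, 8, 9, 10})
c = frozenset({0, 6, 10, 11})
not_a = h - a                            # the contradictory of a, within h

# (24): (a + b)/h = a/h + b/h - ab/h
lhs_24 = prob(a | b, h)
rhs_24 = prob(a, h) + prob(b, h) - prob(a & b, h)

# (24.3): (a + b)/h = a/h + b(not-a)/h
rhs_24_3 = prob(a, h) + prob(b & not_a, h)

# (24.4): the same rule for three alternatives
lhs_24_4 = prob(a | b | c, h)
rhs_24_4 = (prob(a, h) + prob(b, h) + prob(c, h)
            - prob(a & b, h) - prob(b & c, h) - prob(c & a, h)
            + prob(a & b & c, h))

print(lhs_24 == rhs_24 == rhs_24_3)
print(lhs_24_4 == rhs_24_4)
```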
 
 (24.4) $(a + b + c)/h = (a + b)/h + c/h - (ac + bc)/h$

 $= a/h + b/h + c/h - ab/h - bc/h - ca/h + abc/h$.

 (24.5) And in general

 $(p_1 + p_2 + \dots + p_n)/h = \sum p_r/h - \sum p_sp_t/h + \sum p_sp_tp_u/h - \dots + (-1)^{n-1} p_1p_2 \dots p_n/h$.

 (24.6) If $p_sp_t/h = 0$ for all pairs of values of $s$ and $t$, it follows
 by repeated application of X. that

 $(p_1 + p_2 + \dots + p_n)/h = \sum_{1}^{n} p_r/h$.

 (24.7) If $p_sp_t/h = 0$, etc., and $(p_1 + p_2 + \dots + p_n)/h = 1$, i.e.
 if $p_1, p_2, \dots p_n$ form, relatively to $h$, a set of exclusive and
 exhaustive alternatives, then

 $\sum_{1}^{n} p_r/h = 1$.

 (25) If $p_1, p_2, \dots p_n$ form, relative to $h$, a set of exclusive
 and exhaustive alternatives,

 $a/h = \sum_{1}^{n} p_ra/h$.

 Since $(p_1 + p_2 + \dots + p_n)/h = 1$ by hypothesis,

 $\therefore (p_1 + p_2 + \dots + p_n)/ah = 1$ by (9) if $ah$ is not inconsistent ;
 and since $p_sp_t/h = 0$ by hypothesis,

 $\therefore p_sp_t/ah = 0$ by (8), if $ah$ is not inconsistent.

 Hence $\sum_{1}^{n} p_r/ah = (p_1 + p_2 + \dots + p_n)/ah$ by (24.6)

 $= 1$.

 Also $p_ra/h = p_r/ah \cdot a/h$.

 Summing, $\sum_{1}^{n} p_ra/h = a/h \cdot \sum_{1}^{n} p_r/ah$,

 $\therefore a/h = \sum_{1}^{n} p_ra/h$, if $ah$ is not inconsistent.

 If $ah$ is inconsistent, i.e. if $a/h = 0$ (for $h$ is by hypothesis consistent),
 the result follows at once by (8).

 (25.1) If $p_ra/h = \lambda_r$, the above may be written

 $p_r/ah = \frac{\lambda_r}{\sum_{1}^{n} \lambda_r}$.
 
 (26) $a/h = (a + \bar{h})/h$.

 For $(a + \bar{h})/h = a/h + \bar{h}/h - a\bar{h}/h$ by (24),

 $= a/h$ by (13.1) and (8).

 (26.1) This may be written

 $a/h = (h \supset a)/h$.

 (27) If $(a + b)/h = 0$, $a/h = 0$.

 $a/h + [b/h - ab/h] = 0$, by (24) and hypothesis,
 $\therefore a/h = 0$ by (5).

 (27.1) If $a/h = 0$ and $b/h = 0$, $(a + b)/h = 0$. This follows
 from (24).

 (28) If $a/h = 1$, $(a + \bar{b})/h = 1$.

 $(a + \bar{b})/h = a/h + \bar{b}\bar{a}/h$ by (24.3),

 whence $(a + \bar{b})/h = a/h = 1$ by (1.1) and (8), together with the
 hypothesis. That is to say, a certain proposition is implied by
 every proposition.

 (28.1) If $a/h = 0$, $(\bar{a} + b)/h = 1$ by substituting $\bar{a}$ for $a$ and $b$
 for $\bar{b}$ in (28). That is to say, a certainly false proposition
 implies every proposition.

 (29) If $a/(h_1 + h_2) = 1$, $a/h_1 = 1$.

 $\bar{a}/(h_1 + h_2) = 0$,
 and $\therefore \bar{a}(h_1 + h_2)/h_1 = 0$ by (15).

 Hence $\bar{a}h_1/h_1 = 0$ by (27),

 whence the result.

 (29.1) If $a/h_1 = 1$ and $a/h_2 = 1$, $a/(h_1 + h_2) = 1$.
 As in (29), $\bar{a}h_1/(h_1 + h_2) = 0$ and $\bar{a}h_2/(h_1 + h_2) = 0$.

 Hence $\bar{a}(h_1 + h_2)/(h_1 + h_2) = 0$ by (27.1),

 whence the result.

 (29.2) If $a/(h_1 + h_2) = 0$, $a/h_1 = 0$. This follows from (29).

 (29.3) If $a/h_1 = 0$ and $a/h_2 = 0$, $a/(h_1 + h_2) = 0$. This follows
 from (29.1).
 
 3. Irrelevance and Independence : 
 
 (30) If $a/h_1h_2 = a/h_1$, then $a/h_1\bar{h}_2 = a/h_1$, if $h_1\bar{h}_2$ is not
 inconsistent.

 $a/h_1 = ah_2/h_1 + a\bar{h}_2/h_1$ by (24.2),

 $= a/h_1h_2 \cdot h_2/h_1 + a/h_1\bar{h}_2 \cdot \bar{h}_2/h_1$,
 $= a/h_1 \cdot h_2/h_1 + a/h_1\bar{h}_2 \cdot \bar{h}_2/h_1$,
 $\therefore a/h_1 \cdot \bar{h}_2/h_1 = a/h_1\bar{h}_2 \cdot \bar{h}_2/h_1$,

 whence $a/h_1 = a/h_1\bar{h}_2$, unless $\bar{h}_2/h_1 = 0$, i.e. if $h_1\bar{h}_2$ is not
 inconsistent.

 Thus, if a proposition is irrelevant to an argument, then the
 contradictory of the proposition is also irrelevant.

 (31) If $a_2/a_1h = a_2/h$ and $a_2h$ is not inconsistent, $a_1/a_2h = a_1/h$.
 This follows by (iv. c), since $a_2/a_1h \cdot a_1/h = a_1/a_2h \cdot a_2/h$ by X.

 If, that is to say, $a_1$ is irrelevant to the argument $a_2/h$ (see
 XIV.), and $a_2$ is not inconsistent with $h$ : then $a_2$ is irrelevant
 to the argument $a_1/h$ ; and $a_1/h$ and $a_2/h$ are independent
 (see XIII.).
 
 4. Theorems of Relevance : 
 
 (32) If $a/hh_1 > a/h$, $h_1/ah > h_1/h$.

 $ah$ is consistent since, otherwise, $a/hh_1 = a/h = 0$.
 Therefore $a/h \cdot h_1/ah = a/hh_1 \cdot h_1/h$ by X.,

 $> a/h \cdot h_1/h$ by hypothesis ;

 so that $h_1/ah > h_1/h$.

 Thus if $h_1$ is favourably relevant to the argument $a/h$, $a$ is
 favourably relevant to the argument $h_1/h$.

 This constitutes a formal demonstration of the generally
 accepted principle that if a hypothesis helps to explain a
 phenomenon, the fact of the phenomenon supports the reality
 of the hypothesis.
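
 Theorem (32) may be illustrated numerically. The sketch below assumes numerical probabilities and an invented table of joint weights for the conclusion and the additional premiss, all taken relative to the general hypothesis; the helper `p` and the lambdas `is_a` and `is_h1` are hypothetical names introduced for the example.

```python
from fractions import Fraction

# Hypothetical joint weights for (a holds, h1 holds), all relative to the hypothesis h.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def p(pred, given=lambda w: True):
    """Probability of `pred` among the cases satisfying `given`."""
    den = sum(wt for w, wt in joint.items() if given(w))
    num = sum(wt for w, wt in joint.items() if given(w) and pred(w))
    return num / den


is_a = lambda w: w[0]
is_h1 = lambda w: w[1]

a_h = p(is_a)             # a/h
a_hh1 = p(is_a, is_h1)    # a/hh1
h1_h = p(is_h1)           # h1/h
h1_ah = p(is_h1, is_a)    # h1/ah

# h1 is favourable to a/h, and accordingly a is favourable to h1/h.
print(a_hh1 > a_h, h1_ah > h1_h)
# The two ratios of increase coincide, which is what X. implies.
print(a_hh1 / a_h == h1_ah / h1_h)
```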
 
 In the following theorems $p$ will be said to be more
 favourable to $a/h$, than $q$ is to $b/h$, if $\frac{a/ph}{a/h} > \frac{b/qh}{b/h}$, i.e. if, in the
 language of § 8 below, the coefficient of influence of $p$ on $a/h$
 is greater than the coefficient of influence of $q$ on $b/h$.

 (33) If $x$ is favourable to $a/h$, and $h_1$ is not less favourable
 to $a/hx$ than $x$ is to $a/hh_1$, then $h_1$ is favourable to $a/h$.

 For $a/hh_1 = a/h \cdot \frac{a/hx}{a/h} \cdot \frac{a/hh_1x}{a/hx} \cdot \frac{a/hh_1}{a/hh_1x}$ ; and by hypothesis the
 second term on the right is greater than unity and the product
 of the third and fourth terms is greater than or equal
 to unity.

 (33.1) A fortiori, if $x$ is favourable to $a/h$ and not favourable
 to $a/hh_1$, and if $h_1$ is not unfavourable to $a/hx$, then $h_1$ is
 favourable to $a/h$.

 (34) If $x$ is favourable to $a/h$, and $h_1$ is not less favourable
 to $x/ha$ than $x$ is to $h_1/ha$, then $h_1$ is favourable to $a/h$.
 
 This follows by the same reasoning as (33), since by an 
 application of the Multiplication Theorem 
 
 
 a/hk^x a/hhj^ aj/M^a hjha 
 ajhx a/hhjX x/ha ' hjhax 
 
 (35) If $x$ is favourable to $a/h$, but not more favourable to it
 than $h_1x$ is, and not less favourable to it than to $a/hh_1$, then
 $h_1$ is favourable to $a/h$.

 For $a/hh_1 = a/h \cdot \left\{ \frac{a/h}{a/hx} \cdot \frac{a/hh_1x}{a/h} \right\} \cdot \left\{ \frac{a/hx}{a/h} \cdot \frac{a/hh_1}{a/hh_1x} \right\}$.

 This result is a little more substantial than the two
 preceding. By judging the influence of $x$ and $h_1x$ on the
 arguments $a/h$ and $a/hh_1$, we can infer the influence of $h_1$ by
 itself on the argument $a/h$.
 
 5. The Multiplication Theorems : 
 
 (36) If $a_1/h$ and $a_2/h$ are independent, $a_1a_2/h = a_1/h \cdot a_2/h$.
 For $a_1a_2/h = a_1/a_2h \cdot a_2/h = a_2/a_1h \cdot a_1/h$ by X.,
 and since $a_1/h$ and $a_2/h$ are independent,

 $a_1/a_2h = a_1/h$ and $a_2/a_1h = a_2/h$ by XIII.

 Therefore $a_1a_2/h = a_1/h \cdot a_2/h$.

 Hence, when $a_1/h$ and $a_2/h$ are independent, we can arrive at the
 probability of $a_1$ and $a_2$ jointly on the same hypothesis by simple
 multiplication of the probabilities $a_1/h$ and $a_2/h$ taken separately.

 (37) If $p_1/h = p_2/p_1h = p_3/p_1p_2h = \dots$,

 $p_1p_2p_3 \dots p_n/h = (p_1/h)^n$.
 For $p_1p_2p_3 \dots /h = p_1/h \cdot p_2/p_1h \cdot p_3/p_1p_2h \dots$ by repeated
 applications of X.
 
 6. The Inverse Principle : 
 
 (38) $\frac{a_1/bh}{a_2/bh} = \frac{b/a_1h}{b/a_2h} \cdot \frac{a_1/h}{a_2/h}$, provided $bh$, $a_1h$, and $a_2h$ are
 each consistent.

 For $a_1/bh \cdot b/h = b/a_1h \cdot a_1/h$,

 and $a_2/bh \cdot b/h = b/a_2h \cdot a_2/h$ by X.,

 whence the result follows, since $b/h \neq 0$, unless $bh$ is
 inconsistent.

 (38.1) If $a_1/h = p_1$, $a_2/h = p_2$, $b/a_1h = q_1$, $b/a_2h = q_2$, and
 $a_1/bh + a_2/bh = 1$, then it easily follows that

 $a_1/bh = \frac{p_1q_1}{p_1q_1 + p_2q_2}$

 and $a_2/bh = \frac{p_2q_2}{p_1q_1 + p_2q_2}$.

 (38.2) If $a_1/h = a_2/h$ the above reduces to

 $a_1/bh = \frac{q_1}{q_1 + q_2}$

 and $a_2/bh = \frac{q_2}{q_1 + q_2}$,

 since $a_1/h \neq 0$, unless $a_1h$ is inconsistent.

 The proposition is easily extended to the cases in which the
 number of $a$'s is greater than two.

 It will be worth while to translate this theorem into familiar
 language. Let $b$ represent the occurrence of an event B, $a_1$
 and $a_2$ the hypotheses of the existence of two possible causes
 A₁ and A₂ of B, and $h$ the general data of the problem. Then $p_1$
 and $p_2$ are the à priori probabilities of the existence of A₁ and A₂
 respectively, when it is not known whether or not the event B
 has occurred ; $q_1$ and $q_2$ the probabilities that each of the causes
 A₁ and A₂, if it exists, will be followed by the event B. Then
 $\frac{p_1q_1}{p_1q_1 + p_2q_2}$ and $\frac{p_2q_2}{p_1q_1 + p_2q_2}$ are the probabilities of the existence
 of A₁ and A₂ respectively after the event, i.e. when, in addition
 to our other data, we know that the event B has occurred. The
 initial condition, that $bh$ must not be inconsistent, simply ensures
 that the problem is a possible one, i.e. that the occurrence of the
 event B is on the initial data at least possible.

 The reason why this theorem has generally been known as
 the Inverse Principle of Probability is obvious. The causal
 problems to which the Calculus of Probability has been applied
 are naturally divided into two classes : the direct in which, given
 the cause, we deduce the effect ; the indirect or inverse in which,
 given the effect, we investigate the cause. The Inverse Principle
 has been usually employed to deal with the latter class of
 problem.
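
 A short computation shows the Inverse Principle at work in the form (38.1). The sketch assumes that the à priori probabilities and the probabilities of the event under each cause are given as numbers, and that the causes are exclusive and exhaustive once the event is known; the function name `inverse_principle` and the figures are invented for illustration.

```python
from fractions import Fraction


def inverse_principle(priors, likelihoods):
    """Probabilities of each cause after the event, as in (38.1).

    priors[i]      plays the part of p_i, the prior probability of cause A_i;
    likelihoods[i] plays the part of q_i, the probability of the event given A_i.
    The causes are assumed exclusive and exhaustive, given the event.
    """
    joint = [p * q for p, q in zip(priors, likelihoods)]
    total = sum(joint)
    if total == 0:
        raise ValueError("the event is impossible on the data (bh is inconsistent)")
    return [j / total for j in joint]


# Two possible causes with unequal prior probabilities.
posterior = inverse_principle(
    priors=[Fraction(1, 4), Fraction(3, 4)],
    likelihoods=[Fraction(4, 5), Fraction(1, 5)],
)
print(posterior)

# With equal priors the result reduces to q_i / (q_1 + q_2), as in (38.2).
posterior_equal = inverse_principle(
    priors=[Fraction(1, 2), Fraction(1, 2)],
    likelihoods=[Fraction(4, 5), Fraction(1, 5)],
)
print(posterior_equal)
```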
 
 7. Theorems on the Combination of Premisses : 
 The Multiplication Theorems given above deal with the combination
 of conclusions ; given $a_1/h$ and $a_2/h$ we considered the
 relation of $a_1a_2/h$ to these probabilities. In this paragraph the
 corresponding problem of the combination of premisses will be
 treated ; given $a/h_1$ and $a/h_2$ we shall consider the relation of
 $a/h_1h_2$ to these probabilities.

 (39) $a/h_1h_2h = \frac{ah_1h_2/h}{h_1h_2/h} = \frac{ah_1h_2/h}{ah_1h_2/h + \bar{a}h_1h_2/h}$ by X. and (24.2)

 $= \frac{u}{u + v}$,

 where $u$ is the à priori probability of the conclusion $a$ and both
 hypotheses $h_1$ and $h_2$ jointly, and $v$ is the à priori probability
 of the contradictory of the conclusion and both hypotheses $h_1$
 and $h_2$ jointly.

 (40) $a/h_1h_2 = \frac{ah_1/h_2}{ah_1/h_2 + \bar{a}h_1/h_2} = \frac{h_1/ah_2 \cdot q}{h_1/ah_2 \cdot q + h_1/\bar{a}h_2 \cdot (1 - q)}$

 $= \frac{h_2/ah_1 \cdot p}{h_2/ah_1 \cdot p + h_2/\bar{a}h_1 \cdot (1 - p)}$,

 where $p = a/h_1$ and $q = a/h_2$.

 (40.1) If $p = \frac{1}{2}$, $a/h_1h_2 = \frac{h_2/ah_1}{h_2/ah_1 + h_2/\bar{a}h_1}$,

 and increases with $\frac{h_2/ah_1}{h_2/\bar{a}h_1}$.

 These results are not very valuable and show the need of an
 original method of reduction. This is supplied by Mr. W. E.
 Johnson's Cumulative Formula, which is at present unpublished
 but which I have his permission to print below.¹
 
 8. It is first of all necessary to introduce a new symbol. Let 
 us write 
 
 XV. $a/bh = \{a^{h}b\} \cdot a/h$. Def.
 We may call $\{a^{h}b\}$ the coefficient of influence of $b$ upon $a$ on
 hypothesis $h$.

 XVI. $\{a^{h}b\} \cdot \{ab^{h}c\} = \{a^{h}b^{h}c\}$. Def.
 and similarly $\{a^{h}b\} \cdot \{ab^{h}c^{h}d\} = \{a^{h}b^{h}c^{h}d\}$.

 These coefficients thus belong by definition to a general class of
 operators, which we may call separative factors.

 (41) $ab/h = \{a^{h}b\} \cdot a/h \cdot b/h$,

 since $ab/h = a/bh \cdot b/h$.
 
 ¹ The substance of propositions (41) to (49) below is derived in its entirety
 from his notes ; the exposition only is mine.
 
 Thus we may also call $\{a^{h}b\}$ the coefficient of dependence between
 $a$ and $b$ on hypothesis $h$.
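
 The coefficient of influence of $b$ upon $a$ on hypothesis $h$, that is, $a/bh$ divided by $a/h$, is easily computed in a finite model. The sketch below assumes numerical probabilities obtained by counting hypothetical equally weighted cases; `prob` and `influence` are invented names, and the sets are arbitrary.

```python
from fractions import Fraction


def prob(a, h):
    """a/h: the fraction of the cases in h in which a also holds."""
    return Fraction(len(a & h), len(h))


def influence(a, b, h):
    """Coefficient of influence of b upon a on hypothesis h: (a/bh) / (a/h)."""
    return prob(a, b & h) / prob(a, h)


h = frozenset(range(12))
a = frozenset({0, 1, 2, 3, 4, 5})
b = frozenset({3, 4, 5, 6})

coeff_ab = influence(a, b, h)
coeff_ba = influence(b, a, h)

# The coefficient is symmetrical in a and b ...
print(coeff_ab == coeff_ba)
# ... and it corrects the simple product rule: ab/h = coeff . a/h . b/h
print(prob(a & b, h) == coeff_ab * prob(a, h) * prob(b, h))
```

 A coefficient equal to unity corresponds to independence of the two arguments; here it exceeds unity, so each proposition is favourable to the other.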
 
 (41.1) $abc/h = \{a^{h}b^{h}c\} \cdot a/h \cdot b/h \cdot c/h$.

 For $abc/h = \{ab^{h}c\} \cdot ab/h \cdot c/h$ by (41),

 $= \{ab^{h}c\} \cdot \{a^{h}b\} \cdot a/h \cdot b/h \cdot c/h$ by (41).
 (41.2) And in general

 $abcd \dots /h = \{a^{h}b^{h}c^{h}d \dots\} \cdot a/h \cdot b/h \cdot c/h \cdot d/h \dots$

 (42) $\{a^{h}b\} = \{b^{h}a\}$,
 since $a/bh \cdot b/h = b/ah \cdot a/h$.

 (42.1) $\{a^{h}b^{h}c\} = \{a^{h}c^{h}b\}$,
 since $a/h \cdot b/h \cdot c/h = a/h \cdot c/h \cdot b/h$.

 (42.2) And in general we have a commutative rule, by which
 the order of the terms may be always commuted,
 
 e.g. {aHc^ defg] = [bc^aY def} 
 
 [a^bc" defg] = [ahb^fed "g} . 
 
 (43) As a multiplier the separative factor operates so as to
 separate the terms that may be associated (or joined) in the
 multiplicand.

 Thus $\{ab^{h}cd^{h}e\} \cdot \{a^{h}b\} = \{a^{h}b^{h}cd^{h}e\}$,

 for $abcde/h = \{ab^{h}cd^{h}e\} \cdot ab/h \cdot cd/h \cdot e/h$

 $= \{ab^{h}cd^{h}e\} \cdot \{a^{h}b\} \cdot a/h \cdot b/h \cdot cd/h \cdot e/h$,
 and also $abcde/h = \{a^{h}b^{h}cd^{h}e\} \cdot a/h \cdot b/h \cdot cd/h \cdot e/h$.
 Similarly (for example)

 $\{abc^{h}d^{h}ef\} \cdot \{ab^{h}c\} \cdot \{a^{h}b\} = \{a^{h}b^{h}c^{h}d^{h}ef\}$.

 (44) $\{a^{h}b\} \cdot \{ab\} = \{a^{h}b\}$.
 For $ab/h = \{ab\} \cdot ab/h$.

 By a symbolic convention, therefore, we may put $\{ab\} = 1$.

 (44.1) If $\{a^{h}b\} = 1$, it follows that $a/h$ and $b/h$ are independent
 arguments ; and conversely.

 (45) Rule of Repetition $\{aa^{h}b\} = \{a^{h}b\}$.

 For $aab/h = ab/h$ by (vi.) and (12).
 
 (46) The Cumulative Formula :
 $x/ah : x'/ah : x''/ah : \dots$

 $= x/h \cdot a/xh : x'/h \cdot a/x'h : x''/h \cdot a/x''h : \dots$ by (38).
 Take $n + 1$ propositions $a, b, c \dots$ Then by repetition
 $x/ah \cdot x/bh \cdot x/ch \dots : x'/ah \cdot x'/bh \cdot x'/ch \dots : x''/ah \cdot x''/bh \cdot x''/ch \dots : \dots$

 $= (x/h)^{n+1} \, a/xh \cdot b/xh \dots : (x'/h)^{n+1} \, a/x'h \cdot b/x'h \dots : (x''/h)^{n+1} \, a/x''h \cdot b/x''h \dots$
 which may be written

 $\prod^{n+1} x/ah : \prod^{n+1} x'/ah : \prod^{n+1} x''/ah : \dots$

 $= (x/h)^{n+1} \prod^{n+1} a/xh : (x'/h)^{n+1} \prod^{n+1} a/x'h : \dots$
 Now
 $x/habc \dots : x'/habc \dots : x''/habc \dots$

 $= x/h \cdot (abc \dots)/xh : x'/h \cdot (abc \dots)/x'h : \dots$ by (38),
 and

 $abc \dots /xh = \{a^{xh}b^{xh}c \dots\} \prod a/xh$ by (41.2),
 $\therefore (x/h)^{n} \cdot x/habc \dots : (x'/h)^{n} \cdot x'/habc \dots : (x''/h)^{n} \cdot x''/habc \dots : \dots$

 $= \{a^{xh}b^{xh}c \dots\} \, x/ah \cdot x/bh \cdot x/ch \dots : \{a^{x'h}b^{x'h}c \dots\} \, x'/ah \cdot x'/bh \cdot x'/ch \dots : \dots$

 which may be written

 $(x/h)^{n} \cdot x/habc \dots \propto \{a^{xh}b^{xh}c \dots\} \cdot x/ah \cdot x/bh \cdot x/ch \dots$
 where variations of $x$ are involved.
 
 The cumulative formula is to be applied when, having accumu- 
 lated the evidence a,b,c . . ., we desire to know the comparative 
 probabilities of the various possible inferences x, x' . . . which 
 may be drawn, and already know determinately the force of 
 each of the items a,b,c... separately as evidence for x, x'. . . . 
 
 Besides the factors $x/ah$, $x/bh$, etc., we require to know two
 other sets of values, viz. : (1) $x/h$, etc., i.e. the à priori
 probabilities of $x$, etc., and (2) $\{a^{xh}b^{xh}c \dots\}$, etc., i.e. the
 coefficients of dependence between $a$, $b$, and $c \dots$ on hypotheses
 $xh$, etc. It may be remarked that the values $\{a^{xh}b^{xh}c \dots\}$,
 $\{a^{x'h}b^{x'h}c \dots\} \dots$ are not in any way related, even when $x' \equiv \bar{x}$.

 What corresponds to the cumulative formula has been employed,
 sometimes, by mathematicians in a simplified form
 which is, except under special conditions, incorrect. First, it
 has been tacitly assumed that $\{a^{xh}b^{xh}c \dots\}$, $\{a^{x'h}b^{x'h}c \dots\} \dots$
 are all unity : so that

 $(x/h)^{n} \, x/habc \dots \propto x/ah \cdot x/bh \cdot x/ch \dots$
 Secondly, the factor $(x/h)^{n}$ has been omitted, so that
 $x/habc \dots \propto x/ah \cdot x/bh \cdot x/ch \dots$

 It is this second incorrect statement of the formula which
 leads to the fallacious rule for the combination of the testimonies
 of independent witnesses ordinarily given in the text-books.¹
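
 The difference between the cumulative formula and its fallacious simplification can be seen in a small numerical case with two items of evidence (so that $n = 1$). The sketch assumes numerical probabilities and an invented table of weights for two alternative inferences and two data; `W`, `p`, `term` and `normalise` are hypothetical names, and the figures carry no significance beyond the illustration.

```python
from fractions import Fraction

# Hypothetical weights for (inference x, a holds, b holds), all relative to h.
W = {
    (0, True, True): 8, (0, True, False): 4, (0, False, True): 2, (0, False, False): 6,
    (1, True, True): 4, (1, True, False): 12, (1, False, True): 16, (1, False, False): 8,
}


def p(pred, given=lambda w: True):
    """Probability of `pred` among the cases satisfying `given`."""
    den = sum(n for w, n in W.items() if given(w))
    num = sum(n for w, n in W.items() if given(w) and pred(w))
    return Fraction(num, den)


A = lambda w: w[1]
B = lambda w: w[2]
AB = lambda w: w[1] and w[2]


def term(x, corrected=True):
    """Unnormalised weight of inference x, given both a and b."""
    is_x = lambda w: w[0] == x
    simple = p(is_x, A) * p(is_x, B)            # x/ah . x/bh
    if not corrected:
        return simple                            # the fallacious rule
    n = 1                                        # n + 1 = 2 items of evidence
    dependence = p(AB, is_x) / (p(A, is_x) * p(B, is_x))
    return p(is_x) ** -n * dependence * simple   # the cumulative formula


def normalise(terms):
    total = sum(terms)
    return [t / total for t in terms]


cumulative = normalise([term(0), term(1)])
fallacious = normalise([term(0, corrected=False), term(1, corrected=False)])
direct = p(lambda w: w[0] == 0, AB)              # computed directly from the table

print(cumulative[0], direct, fallacious[0])
```

 In this example the cumulative formula reproduces the directly computed value, whereas the simplified rule, which omits both the prior factor and the coefficients of dependence, gives a different result.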
 
 (46.1) If $abc \dots /xh = \{a^{xh}b^{xh}c \dots\} \, a/xh \cdot b/xh \cdot c/xh \dots$
 then $x/habc \dots \propto (x/h)^{-n} \{a^{xh}b^{xh}c \dots\} \, x/ah \cdot x/bh \cdot x/ch \dots$
 
 ¹ See p. 180 below.
 
 
 This result is exceedingly interesting. Mr. Johnson is the first to
 arrive at the simple relation, expressed above, between the direct
 and the inverse formulae : viz. that the same coefficient is required
 for correcting the simple formulae of multiplication in
 both cases. As he remarks, however, while the direct formula
 gives the required probability directly by multiplication, the
 inverse formula gives only the comparative probability.
 
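 (46.2) If $x, x', x'' \dots$ are exclusive and exhaustive alternatives,

 $x/habc \dots = \frac{(x/h)^{-n} \cdot \{a^{xh}b^{xh}c \dots\} \prod x/ah}{\sum \left[ (x'/h)^{-n} \cdot \{a^{x'h}b^{x'h}c \dots\} \prod x'/ah \right]}$

 since $x/habc \dots \propto (x/h)^{-n} \{a^{xh}b^{xh}c \dots\} \prod^{n+1} x/ah$,

 and $\sum x'/habc \dots = 1$ by (24.7).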
 
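 (47).
 $$\frac{x/habc\ldots}{x/h} = \frac{a/h \cdot b/h \cdot c/h \ldots}{abc\ldots/h} \cdot \frac{abc\ldots/xh}{a/xh \cdot b/xh \cdot c/xh \ldots} \cdot \frac{x/ah}{x/h} \cdot \frac{x/bh}{x/h} \cdots$$

 For $abc\ldots x/h = x/h \cdot abc\ldots/xh$.

 $$\frac{abc\ldots x/h}{abc\ldots/h \cdot x/h} = \frac{abc\ldots/xh}{abc\ldots/h} = \frac{a/h \cdot b/h \cdot c/h \ldots}{abc\ldots/h} \cdot \frac{abc\ldots/xh}{a/xh \cdot b/xh \cdot c/xh \ldots} \cdot \frac{a/xh}{a/h} \cdot \frac{b/xh}{b/h} \cdots$$

 whence the result, since $\frac{a/xh}{a/h} = \frac{x/ah}{x/h}$, etc.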
 
 (47.1) The above formula may be written in the condensed
 form

 $$\{(abc\ldots)^{h}x\} = \frac{\{a^{xh}b^{xh}c^{xh}\ldots\}}{\{a^{h}b^{h}c^{h}\ldots\}} \cdot \{a^{h}x\} \cdot \{b^{h}x\} \cdot \{c^{h}x\} \ldots$$

 (48)
 $$\frac{(x/h)^{n-1}\; x/habc\ldots}{(\bar{x}/h)^{n-1}\; \bar{x}/habc\ldots} = \frac{\{a^{xh}b^{xh}c^{xh}\ldots\}\; x/ah \cdot x/bh \cdot x/ch \ldots}{\{a^{\bar{x}h}b^{\bar{x}h}c^{\bar{x}h}\ldots\}\; \bar{x}/ah \cdot \bar{x}/bh \cdot \bar{x}/ch \ldots}$$

 This follows at once from (46.2), since x and x̄ are exclusive and
 exhaustive alternatives. (It is assumed that xh, x̄h, and ah,
 etc., are not inconsistent.)
 
 
 This formula gives x/habc ... in terms of x/ah, x/bh, etc.,
 together with the three values x/h, $\{a^{xh}b^{xh}c^{xh}\ldots\}$, and
 $\{a^{\bar{x}h}b^{\bar{x}h}c^{\bar{x}h}\ldots\}$.
 
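 (48.1)
 $$\frac{x/habcd\ldots}{\bar{x}/habcd\ldots} = \frac{\{a^{xh}bcd\ldots\} \cdot x/ah}{\{a^{\bar{x}h}bcd\ldots\} \cdot \bar{x}/ah} \cdot \frac{\bar{x}/h}{x/h} \cdot \frac{x/hbcd\ldots}{\bar{x}/hbcd\ldots}$$

 This gives the effect on the odds (prob. x : prob. x̄) of the extra
 knowledge a.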
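
The relation in (48.1) says that the odds on x after the datum a is added equal the odds before, multiplied by a factor built from the two coefficients of dependence, the probabilities of x and its contradictory given a alone, and the à priori probabilities. A minimal numerical sketch follows, assuming a single further datum b in place of bcd... and a hypothetical table of weights for the eight cases; none of the names or numbers come from the text.

```python
from fractions import Fraction as F

# Hypothetical weights for the eight cases (x, a, b); 1 = true, 0 = false.
weights = {
    (1, 1, 1): 6, (1, 1, 0): 3, (1, 0, 1): 2, (1, 0, 0): 1,
    (0, 1, 1): 1, (0, 1, 0): 2, (0, 0, 1): 4, (0, 0, 0): 5,
}

def prob(event, given=lambda case: True):
    """Probability of `event` relative to `given`, by exact counting of weights."""
    favourable = sum(n for case, n in weights.items() if given(case) and event(case))
    possible = sum(n for case, n in weights.items() if given(case))
    return F(favourable, possible)

x = lambda c: c[0] == 1
not_x = lambda c: c[0] == 0
a = lambda c: c[1] == 1
b = lambda c: c[2] == 1
both = lambda p, q: (lambda c: p(c) and q(c))

def coefficient(hyp):
    """Coefficient of dependence between a and b on `hyp`: ab/hyp divided by a/hyp * b/hyp."""
    return prob(both(a, b), hyp) / (prob(a, hyp) * prob(b, hyp))

odds_after = prob(x, both(a, b)) / prob(not_x, both(a, b))   # odds on x given a and b
odds_before = prob(x, b) / prob(not_x, b)                    # odds on x given b only
factor = (coefficient(x) * prob(x, a) * prob(not_x)) / (
    coefficient(not_x) * prob(not_x, a) * prob(x)
)
```

On these assumed weights the odds given b alone are 8/5, the factor is 15/4, and their product, 6, agrees with the odds computed directly from a and b together.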
 
 (49) When several data co-operate as evidence in favour of a
 proposition, they continually strengthen their own mutual
 probabilities, on the assumption that when the proposition
 is known to be true or to be false the data jointly are not
 counterdependent.

 I.e. if $\{a^{xh}b^{xh}c\ldots\}$ and $\{a^{\bar{x}h}b^{\bar{x}h}c\ldots\}$ are not less than
 unity, and x/kh > x/h where k is any of the data a, b, c ..., then
 $\{a^{h}b^{h}c^{h}d\ldots\}$, beginning with unity, continually increases, as
 the number of its terms is increased.

 $$\begin{aligned}
 abc\ldots/h &= xabc\ldots/h + \bar{x}abc\ldots/h \quad \text{by (24.2)} \\
 &= x/h \cdot abc\ldots/xh + \bar{x}/h \cdot abc\ldots/\bar{x}h \\
 &\geq x/h \cdot \prod[a/xh \cdot b/xh \ldots] + \bar{x}/h \cdot \prod[a/\bar{x}h \cdot b/\bar{x}h \ldots]
 \end{aligned}$$
 (since $\{a^{xh}b^{xh}c\ldots\}$ and $\{a^{\bar{x}h}b^{\bar{x}h}c\ldots\}$ are not less than unity),
 $$\geq x/h \cdot \prod\left[\frac{ax/h}{x/h} \cdot \frac{bx/h}{x/h} \ldots\right] + \bar{x}/h \cdot \prod\left[\frac{a\bar{x}/h}{\bar{x}/h} \cdot \frac{b\bar{x}/h}{\bar{x}/h} \ldots\right]$$

 $$\therefore\; \frac{abc\ldots/h}{\prod[a/h \cdot b/h \ldots]} \geq x/h \cdot \prod\left[\frac{x/ah}{x/h} \cdot \frac{x/bh}{x/h} \ldots\right] + \bar{x}/h \cdot \prod\left[\frac{\bar{x}/ah}{\bar{x}/h} \cdot \frac{\bar{x}/bh}{\bar{x}/h} \ldots\right]$$

 We can show that each additional piece of evidence a, b, c ...
 increases the value of this expression. For let x/h · G + x̄/h · G' be
 its value when all the evidence up to k exclusive is taken, so that

 x/kh · G + x̄/kh · G'

 is its value when k is taken. Now G > G' since x/ah > x/h, etc.,
 and x̄/ah < x̄/h, etc., by the hypothesis that the evidence
 favours x; and for the same reason x/kh - x/h, which is equal
 to x̄/h - x̄/kh, is positive.

 ∴ G (x/kh - x/h) > G' (x̄/h - x̄/kh),
 i.e. x/kh · G + x̄/kh · G' > x/h · G + x̄/h · G',

 whence the result.
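
The expression whose growth is asserted in (49), namely x/h multiplied by the product of the ratios x/kh : x/h, plus the corresponding term for the contradictory of x, can be tabulated for a simple case. The sketch below is illustrative only; it assumes an à priori probability of 1/2 and three data each of which separately raises the probability of x to 3/4, all hypothetical figures.

```python
from fractions import Fraction as F

def lower_limit_of_joint_coefficient(prior, posteriors):
    """x/h * prod(x/kh / x/h) + not-x/h * prod(not-x/kh / not-x/h) over the data taken so far."""
    g = g_bar = F(1)
    for post in posteriors:
        g *= post / prior                    # ratio x/kh : x/h
        g_bar *= (1 - post) / (1 - prior)    # ratio not-x/kh : not-x/h
    return prior * g + (1 - prior) * g_bar

prior = F(1, 2)                              # hypothetical x/h
posteriors = [F(3, 4), F(3, 4), F(3, 4)]     # hypothetical x/ah, x/bh, x/ch
values = [lower_limit_of_joint_coefficient(prior, posteriors[:n]) for n in (1, 2, 3)]
```

The values are 1, 5/4 and 7/4: the expression begins with unity for a single datum and increases with each further datum that favours x.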
 
 
 (49.1) The above proposition can be generalised for the
 case of exclusive alternatives x, x', x'' ... (in place of x, x̄).
 For
 $$\begin{aligned}
 \{a^{h}b^{h}c\ldots\} &= x/h \cdot \{a^{xh}b^{xh}c\ldots\}\,\{a^{h}x\}\{b^{h}x\}\{c^{h}x\}\ldots \\
 &+ x'/h \cdot \{a^{x'h}b^{x'h}c\ldots\}\,\{a^{h}x'\}\{b^{h}x'\}\{c^{h}x'\}\ldots \\
 &+ x''/h \cdot \{a^{x''h}b^{x''h}c\ldots\}\,\{a^{h}x''\}\{b^{h}x''\}\{c^{h}x''\}\ldots + \ldots;
 \end{aligned}$$
 from which it follows that, if $\{a^{xh}b^{xh}c\ldots\}$ etc. $\nless 1$, and if
 $\{a^{h}x\} - 1$, $\{b^{h}x\} - 1$, $\{c^{h}x\} - 1$, etc., have the same sign, then
 $\{a^{h}b^{h}c\ldots\}$ is increasing (with the number of letters) from unity.
 Mr. Johnson describes this result as a generalisation of
 the corrected "middle term fallacy" (see Chap. V. § 4).
 
 APPENDIX 
 
 ON SYMBOLIC TREATMENTS OF PROBABILITY 
 
 The use of the symbol 0 for impossibility and 1 for certainty was
 first introduced by Leibnitz in a very early pamphlet, entitled
 Specimen certitudinis seu demonstrationum in jure, exhibitum in
 doctrina conditionum, published in 1665 (vide Couturat, Logique de
 Leibnitz, p. 553). Leibnitz represented intermediate degrees of
 probability by the sign ½, meaning, however, by this symbol a
 variable between 0 and 1.
 
 Several modern writers have made some attempt at a symbolic
 treatment of Probability. But with the exception of Boole, whose
 methods I have discussed in detail in Chapters XV., XVI., and
 XVII., no one has worked out anything very elaborate.
 
 Mr. McColl published a number of brief notes on Probability of
 considerable interest: see especially his Symbolic Logic, Sixth Paper
 on the Calculus of Equivalent Statements, and On the Growth and Use
 of a Symbolical Language. The conception of probability as a relation
 between propositions underlies his symbolism, as it does mine.¹ The
 probability of a, relative to the à priori premiss h, he writes $\frac{a}{\epsilon}$; and
 the probability, given b in addition to the à priori premiss, he writes
 $\frac{a}{b}$. Thus $\frac{a}{\epsilon} = a/h$, and $\frac{a}{b} = a/bh$. The difference $\frac{a}{b} - \frac{a}{\epsilon}$, i.e. the change
 in the probability of a brought about by the addition of b to the
 evidence, he calls 'the dependence of the statement a upon the
 statement b,' and denotes it by $\delta\frac{a}{b}$. Thus $\delta\frac{a}{b} = 0$, where, in my
 terminology, b is irrelevant to a on evidence h. The multiplication and
 addition formulae he gives as follows:

 $$\frac{ab}{\epsilon} = \frac{a}{\epsilon} \cdot \frac{b}{a} = \frac{b}{\epsilon} \cdot \frac{a}{b}, \qquad \frac{a+b}{\epsilon} = \frac{a}{\epsilon} + \frac{b}{\epsilon} - \frac{ab}{\epsilon}.$$

 Also $\delta\frac{a}{b} = \frac{A}{B}\,\delta\frac{b}{a}$, where $A = \frac{a}{\epsilon}$ and $B = \frac{b}{\epsilon}$.

 It is surprising how little use he succeeds in making of these good
 results. He arrives, however, at the inverse formula in the shape

 $$\frac{c_r}{v} = \frac{\dfrac{c_r}{\epsilon} \cdot \dfrac{v}{c_r}}{\sum\limits_{r=1}^{r=n} \dfrac{c_r}{\epsilon} \cdot \dfrac{v}{c_r}}$$

 where c₁ ... cₙ are a series of mutually exclusive causes of the event
 v and include all possible causes of it; reaching it as a generalisation
 of the proposition

 $$\frac{a}{b} = \frac{\dfrac{a}{\epsilon} \cdot \dfrac{b}{a}}{\dfrac{a}{\epsilon} \cdot \dfrac{b}{a} + \dfrac{a'}{\epsilon} \cdot \dfrac{b}{a'}}$$

 ¹ I did not come across these notes until my own method was considerably
 developed. Mr. McColl has been the first to use the fundamental symbol of
 Probability.
 
 In a paper entitled "Operations in Relative Number with Applications
 to the Theory of Probabilities,"¹ Mr. B. I. Gilman attempted
 a symbolic treatment based on a frequency theory similar to Venn's,
 but made more precise and more consistent with itself: "Probability
 has to do, not with individual events, but with classes of events; and
 not with one class, but with a pair of classes, the one containing,
 the other contained. The latter being the one with which we are
 principally concerned, we speak, by an ellipsis, of its probability
 without mentioning the containing class; but in reality probability
 is a ratio, and to define it we must have both correlates given." But
 Mr. Gilman's symbolic treatment leads to very little. More recently
 R. Laemmel, in his Untersuchungen über die Ermittlung von
 Wahrscheinlichkeiten, made a beginning on somewhat similar lines; but
 in his case also the symbolic treatment leads to no substantial results.
 
 Apart from the writers mentioned above, there are a few who
 have incidentally made use of a probability symbol. It will be
 sufficient to cite Czuber.² He denotes the probability of an event
 E by W(E), and the probability of the event E given the occurrence of
 an event F by $W_F(E)$. He uses this symbol to give $W_F(E) = W_{\bar{F}}(E)$
 as the criterion of the independence of the events E and F ($\bar{F}$ denoting
 the non-occurrence of F); $W_F(E) = 1$, as the expression of the fact
 that E is a necessary consequence of F; and one or two other similar
 results.

 ¹ Published in the volume of Johns Hopkins Studies in Logic.
 ² Wahrscheinlichkeitsrechnung, vol. i. pp. 43-48.
 
 Finally there is in the Bulletin of the Physico-mathematical Society
 of Kazan for 1887 a memoir in Russian by Platon S. Poretzki entitled
 "A Solution of the General Problem of the Theory of Probability by
 Means of Mathematical Logic." I have seen it stated that Schröder
 intended to publish ultimately a symbolic treatment of Probability.
 Whether he had prepared any manuscript on the subject before his
 death I do not know.
 
CHAPTER XV 
 
 NUMERICAL MEASUREMENT AND APPROXIMATION OF 
 PROBABILITIES 
 
 1. The possibility of numerical measurement, mentioned at
 the close of Chapter III., arises out of the Addition Theorem
 (24.1). In introducing the definitions and the axiom, which are
 required in order to make the convention of numerical measurement
 operative, we may appear, as in the case of the original
 definitions of Addition and Multiplication, to be arguing in an
 artificial way. This appearance is due, here as in Chapter XII.,
 to our having given the names of addition and multiplication to
 certain processes of compounding probabilities in advance of
 postulating that the processes in question have the properties
 commonly associated with these names. As common sense is
 hasty to impute the properties as soon as it hears the names, it
 may overlook the necessity of formally introducing them.
 
 2. The definitions and the axiom which are needed in order
 to give a meaning to numerical measurement are the following:

 XVII. a/h + {a/h + [a/h + (a/h + ... r terms)]} = r · a/h. Def.

 XVIII. If r · a/h = b/f, then $a/h = \frac{1}{r} \cdot b/f$. Def.

 XIX. If b/f = q · c/g, then $\frac{1}{r} \cdot b/f = \frac{q}{r} \cdot c/g$. Def.

 Thus if b/h = a/h + a/h + ... to r terms, then the probability
 b/h is said to be r times the probability a/h; hence if ab/h = 0 and
 a/h = b/h, the probability (a + b)/h is twice the probability a/h.
 If a and b are exhaustive as well as exclusive alternatives
 relatively to h, so that (a + b)/h = 1, since we take the relation of
 certainty as our unit, then $a/h = b/h = \frac{1}{2}$.
 
 We also need the following axiom postulating the existence of
 relations of probability corresponding to all proper fractions:

 (vii.) If q and r are any finite integers and q < r, there exists
 a relation of probability which can be expressed, by means of the
 convention of the foregoing definitions, as $\frac{q}{r}$.
 
 3. From these axioms and definitions combined with those
 of Chapter XII., it is easy to show (certainty being represented
 by unity and impossibility by zero) that we can manipulate
 according to the ordinary laws of arithmetic the "numbers"
 which by means of a special convention we have thus introduced
 to represent probabilities. Of the kind of proofs necessary
 for the complete demonstration of this the following is given as
 an example:

 (50) If $a/f = \frac{1}{m}$ and $b/h = \frac{1}{n}$, then $a/f + b/h = \frac{m+n}{mn}$.

 Let the probability $\frac{1}{mn} = P$, which exists by (vii.),

 then $n \cdot P = \frac{1}{m} = a/f$ by (XIX.),
 and $m \cdot P = \frac{1}{n} = b/h$,
 ∴ a/f + b/h = n · P + m · P, if this probability exists,
 = P + P ... to n terms + P + P ... to m terms,
 = P + P ... to m + n terms,
 $= (m + n) \cdot P = \frac{m+n}{mn}$ by (XIX.).

 This probability exists in virtue of (vii.).
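
The steps of the proof of (50) can be followed with exact rational arithmetic. The sketch below is only an illustration of the argument, not part of it: it takes the ordinary arithmetic of fractions for granted (which is precisely what the proof is establishing for probabilities), and the function name and the values m = 3, n = 4 are hypothetical.

```python
from fractions import Fraction

def add_unit_fractions_by_repetition(m, n):
    """Build 1/m and 1/n as repeated sums of P = 1/(mn), then add them as m + n terms of P."""
    P = Fraction(1, m * n)
    a_over_f = sum([P] * n, Fraction(0))        # n terms of P, i.e. 1/m
    b_over_h = sum([P] * m, Fraction(0))        # m terms of P, i.e. 1/n
    total = sum([P] * (m + n), Fraction(0))     # m + n terms of P
    return a_over_f, b_over_h, total

a_over_f, b_over_h, total = add_unit_fractions_by_repetition(3, 4)
```

Here P = 1/12, four terms of P make 1/3, three terms make 1/4, and seven terms make 7/12, which is (m + n)/mn.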
 
 4. Many probabilities, in fact all those which are equal to
 the probability of some other argument which has the same
 premiss and of which the conclusion is incompatible with that
 of the original argument, are numerically measurable in the
 sense that there is some other probability with which they are
 comparable in the manner described above. But they are not
 numerically measurable in the most usual sense, unless the
 probability with which they are thus comparable is the relation
 of certainty. The conditions under which a probability a/h is
 numerically measurable and equal to $\frac{q}{r}$ are easily seen. It
 is necessary that there should exist probabilities $a_1/h_1$, $a_2/h_2$ ...,
 $a_q/h_q$ ... $a_r/h_r$ such that

 $$a_1/h_1 = a_2/h_2 = \ldots = a_q/h_q = \ldots = a_r/h_r,$$

 $$a/h = \sum_{s=1}^{q} a_s/h_s, \quad \text{and} \quad \sum_{s=1}^{r} a_s/h_s = 1.$$

 If $a/h = \frac{q}{r}$ and $b/h = \frac{q'}{r'}$, it follows from (32) that $ab/h = \frac{qq'}{rr'}$
 only if a/h and b/h are independent arguments. Unless, therefore,
 we are dealing with independent arguments, we cannot
 apply detailed mathematical reasoning even when the individual
 probabilities are numerically measurable. The greater part of
 mathematical probability, therefore, is concerned with arguments
 which are both independent and numerically measurable.
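
That numerically measurable probabilities may be multiplied only when the arguments are independent is easily illustrated. The following sketch assumes, as a hypothetical case not taken from the text, evidence h that a fair six-sided die is thrown once, so that the Principle of Indifference gives each face the probability 1/6.

```python
from fractions import Fraction

FACES = range(1, 7)   # hypothetical h: one throw of a fair six-sided die

def prob(event, given=lambda face: True):
    """Probability of `event` among the equally probable faces satisfying `given`."""
    possible = [f for f in FACES if given(f)]
    return Fraction(sum(1 for f in possible if event(f)), len(possible))

even = lambda f: f % 2 == 0
at_most_two = lambda f: f <= 2
at_most_three = lambda f: f <= 3
both = lambda e1, e2: (lambda f: e1(f) and e2(f))

# (probability of the conjunction, product of the separate probabilities)
independent_pair = (prob(both(even, at_most_two)), prob(even) * prob(at_most_two))
dependent_pair = (prob(both(even, at_most_three)), prob(even) * prob(at_most_three))
```

In the first pair the conjunction has probability 1/6, which is the product of 1/2 and 1/3. In the second the conjunction still has probability 1/6, but the product of the separate probabilities is 1/4, because a throw of at most three is relevant to the throw's being even.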
 
 5. It is evident that the cases in which exact numerical
 measurement is possible are a very limited class, generally
 dependent on evidence which warrants a judgment of
 equiprobability by an application of the Principle of Indifference.
 The fuller the evidence upon which we rely, the less likely is it to
 be perfectly symmetrical in its bearing on the various alternatives,
 and the more likely is it to contain some piece of relevant
 information favouring one of them. In actual reasoning, therefore,
 perfectly equal probabilities, and hence exact numerical measures,
 will occur comparatively seldom.
 
 The sphere of inexact numerical comparison is not, however, 
 quite so limited. Many probabilities, which are incapable of 
 numerical measurement, can be placed nevertheless between 
 numerical limits. And by taking particular non-numerical 
 probabilities as standards a great number of comparisons or 
 approximate measurements become possible. If we can place 
 a probability in an order of magnitude with some standard prob- 
 ability, we can obtain its approximate measure by comparison. 
 
 This method is frequently adopted in common discourse.
 When we ask how probable something is, we often put our question
 in the form, "Is it more or less probable than so and so?",
 where 'so and so' is some comparable and better known probability.
 We may thus obtain information in cases where it would
 be impossible to ascribe any number to the probability in question.
 Darwin was giving a numerical limit to a non-numerical probability
 when he said of a conversation with Lyell that he thought
 it no more likely that he should be right in nearly all points than
 that he should toss up a penny and get heads twenty times
 running.¹ Similar cases and others also, where the probability
 which is taken as the standard of comparison is itself
 non-numerical and not, as in Darwin's instance, a numerical one,
 will readily occur to the reader.
 
 A specially important case of approximate comparison is that 
 of ' practical certainty.' This differs from logical certainty since 
 its contradictory is not impossible, but we are in practice com- 
 pletely satisfied with any probability which approaches such 
 a limit. The phrase has naturally not been used with complete
 precision; but in its most useful sense it is essentially
 non-numerical: we cannot measure practical certainty in terms of
 logical certainty. We can only explain how great practical
 certainty is by giving instances. We may say, for instance, that 
 it is measured by the probability of the sun's rising to-morrow. 
 The type which we shall be most likely to take will be that of a 
 well-verified induction. 
 
 6. Most of such comparisons must be based on the principles 
 of Chapter V. It is possible, however, to develop a systematic 
 method of approximation which may be occasionally useful. 
 The theorems given below are chiefly suggested by some work 
 of Boole's. His theorems were introduced for a different purpose,
 and he does not seem to have realised this interesting
 application of them; but analytically his problem is identical
 with that of approximation.² This method of approximation
 is also substantially the same analytically as that dealt with by
 Mr. Yule under the heading of Consistence.³
 
 ¹ Life and Letters, vol. ii. p. 240.

 ² In Boole's Calculus we are apt to be left with an equation of the second
 or of an even higher degree from which to derive the probability of the
 conclusion; and Boole introduced these methods in order to determine which of the
 several roots of his equation should be taken as giving the true solution of the
 problem in probability. In each case he shows that that root must be chosen
 which lies between certain limits, and that only one root satisfies this condition.
 The general theory to be applied in such cases is expounded by him in Chapter
 XIX. of The Laws of Thought, which is entitled "On Statistical Conditions."
 But the solution given in that chapter is awkward and unsatisfactory, and he
 subsequently published a much better method in the Philosophical Magazine
 for 1854 (4th series, vol. viii.) under the title "On the Conditions by which the
 Solutions of Questions in the Theory of Probabilities are limited."

 ³ Theory of Statistics, chap. ii.
 
 
 (51) xy/h always lies between¹ x/h and x/h + y/h - 1 and
 between y/h and x/h + y/h - 1.

 For $xy/h = x/h - x\bar{y}/h$ by (24.2),
 $= x/h - \bar{y}/h \cdot x/\bar{y}h$ by X.

 Now $x/\bar{y}h$ lies between 0 and 1 by (2) and (3),

 ∴ xy/h lies between x/h and $x/h - \bar{y}/h$,
 i.e. between x/h and x/h + y/h - 1.
 As $xy/h \nless 0$, the above limits may be replaced by x/h and 0, if
 x/h + y/h - 1 < 0.

 We thus have limits for xy/h, close enough sometimes to be
 useful, which are available whether or not x/h and y/h are
 independent arguments. For instance, if y/h is nearly certain, xy/h
 = x/h nearly, quite independently of whether or not x and y are
 independent. This is obvious; but it is useful to have a simple
 and general formula for all such cases.
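
The limits of (51) can be put into a short routine. The sketch below is an illustration only; the function name and the numerical values are hypothetical. It returns the greatest lower limit and the least upper limit of xy/h from x/h and y/h alone, without any assumption of independence.

```python
from fractions import Fraction as F

def conjunction_limits(px, py):
    """Limits of xy/h given x/h = px and y/h = py, whether or not x and y are independent."""
    lower = max(F(0), px + py - 1)   # x/h + y/h - 1, replaced by 0 when negative
    upper = min(px, py)              # xy/h cannot exceed either x/h or y/h
    return lower, upper

nearly_certain_y = conjunction_limits(F(7, 10), F(19, 20))
negative_lower_limit = conjunction_limits(F(3, 10), F(2, 5))
```

With x/h = 7/10 and y/h = 19/20 the conjunction lies between 13/20 and 7/10, so that xy/h is nearly x/h, as the text remarks. With x/h = 3/10 and y/h = 2/5 the expression x/h + y/h - 1 is negative and the lower limit becomes 0.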
 
 (52) $x_1x_2 \ldots x_{n+1}/h$ is always greater than $\sum_{r=1}^{n+1} x_r/h - n$.

 For by (51) $x_1x_2 \ldots x_{n+1}/h > x_1x_2 \ldots x_n/h + x_{n+1}/h - 1$
 $> x_1x_2 \ldots x_{n-1}/h + x_n/h + x_{n+1}/h - 2$,
 and so on.
 
 (53) $xy/h + \bar{x}\bar{y}/h$ is always less than x/h - y/h + 1, and less
 than y/h - x/h + 1.

 For as in (51) $xy/h = x/h - x\bar{y}/h$
 and $\bar{x}\bar{y}/h = \bar{y}/h - x\bar{y}/h$,
 ∴ $xy/h + \bar{x}\bar{y}/h = x/h - y/h + 1 - 2x\bar{y}/h$,
 whence the required result.

 (54) $xy/h - \bar{x}\bar{y}/h = x/h + y/h - 1$.
 
 This proposition, which follows immediately from the above, 
 is really out of place here. But its close connection with con- 
 clusions (51) and (53) is obvious. It is slightly unexpected, 
 perhaps, that the difference of the probabilities that both of two 
 events will occur and that neither of them will, is independent of 
 whether or not the events themselves are independent. 
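
Propositions (51), (53) and (54) hold for every distribution of probability among the four cases: both x and y, x only, y only, neither. The following sketch, which is an illustration and not a proof, checks them exhaustively on all distributions whose four probabilities are multiples of a chosen fraction. The function name and the step of 1/6 are hypothetical choices, and 'between' and 'less than' are taken to include the limits, as in the footnote to (51).

```python
from fractions import Fraction as F
from itertools import product

def check_two_events(steps=6):
    """Check (51), (53) and (54) on every joint distribution of x and y whose
    four cell probabilities are multiples of 1/steps; return how many were checked."""
    checked = 0
    for n_both, n_x_only, n_y_only in product(range(steps + 1), repeat=3):
        n_neither = steps - n_both - n_x_only - n_y_only
        if n_neither < 0:
            continue                                  # not a distribution
        both, x_only, y_only, neither = (
            F(n, steps) for n in (n_both, n_x_only, n_y_only, n_neither)
        )
        px, py = both + x_only, both + y_only         # x/h and y/h
        assert max(0, px + py - 1) <= both <= min(px, py)      # (51)
        assert both + neither <= min(px - py, py - px) + 1     # (53)
        assert both - neither == px + py - 1                   # (54)
        checked += 1
    return checked

cases_checked = check_two_events()
```

With a step of 1/6 there are 84 such distributions, dependent and independent alike, and the three relations hold in all of them.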
 
 7. It is not worth while to work out more of these results here. 
 Some less systematic approximations of the same kind are given 
 in the course of the solutions in Chapter XVII. 
 
 In seeking to compare the degree of one probability with that 
 of another we may desire to get rid of one of the terms, on account 
 
 ¹ In this and the following theorems the term 'between' includes the
 limits.
 
 
 of its not being comparable with any of our standard probabilities.
 Thus our object in general is to eliminate a given symbol of
 quantity from a set of equations or inequations. If, for instance,
 we are to obtain numerical limits within which our probability
 must lie, we must eliminate from the result those probabilities
 which are non-numerical. This is the general problem for
 solution.
 
 (55) A general method of solving these problems when we
 can throw our equations into a linear shape so far as all symbols
 of probability are concerned, is best shown in the following
 example:

 Suppose we have
 $$\begin{aligned}
 \lambda + \nu &= a &&\text{(i.)} \\
 \lambda + \sigma &= b &&\text{(ii.)} \\
 \lambda + \nu + \sigma &= c &&\text{(iii.)} \\
 \lambda + \mu + \nu + \rho &= d &&\text{(iv.)} \\
 \lambda + \mu + \sigma + \tau &= e &&\text{(v.)} \\
 \lambda + \mu + \nu + \rho + \sigma + \tau + \upsilon &= 1 &&\text{(vi.)}
 \end{aligned}$$
 where λ, μ, ν, ρ, σ, τ, υ represent probabilities which are to be
 eliminated, and limits are to be found for c in terms of the
 standard probabilities a, b, d, e, and 1.

 λ, μ, etc., must all lie between 0 and 1.

 From (i.) and (iii.) σ = c - a; from (ii.) and (iii.) ν = c - b.

 From (i.), (ii.), and (iii.) λ = a + b - c,
 whence c - a ≥ 0, c - b ≥ 0, a + b - c ≥ 0;

 substituting for σ, ν, λ in (iv.), (v.), and (vi.)

 μ + ρ = d - a, μ + τ = e - b, μ + ρ + τ + υ = 1 - c,
 whence ρ = d - a - μ, τ = e - b - μ, υ = 1 - c - d + a - e + b + μ,
 ∴ d - a - μ ≥ 0, e - b - μ ≥ 0, 1 - c - d + a - e + b + μ ≥ 0.
 We have still to eliminate μ. μ ≤ d - a, μ ≤ e - b,
 μ ≥ c + d + e - a - b - 1,
 ∴ d - a ≥ c + d + e - a - b - 1 and e - b ≥ c + d + e - a - b - 1.
 Hence we have:

 Upper limits of c: b + 1 - e, a + 1 - d, a + b (whichever is least);

 Lower limits of c: a, b (whichever is greatest).
 
 This example, which is only slightly modified from one given 
 by Boole, represents the actual conditions of a well-known 
 problem in probability. 
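
The elimination in (55) can be checked by asking, for each trial value of c, whether non-negative values of the seven eliminated probabilities exist. The sketch below is an illustration under stated assumptions: the data a, b, d, e are hypothetical numbers, chosen so that d is not less than a and e is not less than b (without which no value of c is admissible), and the function names are not from the text.

```python
from fractions import Fraction as F

def limits_for_c(a, b, d, e):
    """Limits stated in (55): greatest of the lower limits, least of the upper limits."""
    lower = max(a, b)
    upper = min(b + 1 - e, a + 1 - d, a + b)
    return lower, upper

def is_consistent(a, b, c, d, e):
    """True when non-negative lambda, mu, nu, rho, sigma, tau, upsilon satisfy (i.)-(vi.)."""
    sigma, nu, lam = c - a, c - b, a + b - c         # fixed by (i.), (ii.), (iii.)
    if min(sigma, nu, lam) < 0:
        return False
    mu_low = max(F(0), c + d + e - a - b - 1)        # from mu >= 0 and upsilon >= 0
    mu_high = min(d - a, e - b)                      # from rho >= 0 and tau >= 0
    return mu_low <= mu_high                         # some admissible mu exists

a, b, d, e = F(3, 10), F(2, 5), F(1, 2), F(7, 10)    # hypothetical standard probabilities
lower, upper = limits_for_c(a, b, d, e)
agreement = all(
    is_consistent(a, b, F(k, 20), d, e) == (lower <= F(k, 20) <= upper)
    for k in range(21)
)
```

For these figures the limits are 2/5 and 7/10, and every trial value of c in steps of 1/20 is admissible exactly when it lies between them.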
 
CHAPTER XVI
 
 OBSERVATIONS ON THE THEOREMS OF CHAPTER XIV. AND THEIR 
 DEVELOPMENTS, INCLUDING TESTIMONY 
 
 1. In Definition XIII. of Chapter XII. a meaning was given to
 the statement that a₁/h and a₂/h are independent arguments.
 In Theorem (33) of Chapter XIV. it was shown that, if a₁/h and
 a₂/h are independent, a₁a₂/h = a₁/h · a₂/h. Thus where on given
 evidence there is independence between a₁ and a₂, the probability
 on this evidence of a₁a₂ jointly is the product of the probabilities
 of a₁ and a₂ separately. It is difficult to apply mathematical
 reasoning to the Calculus of Probabilities unless this condition
 is fulfilled; and the fulfilment of the condition has often been
 assumed too lightly. A good many of the most misleading
 fallacies in the theory of Probability have been due to a use of
 the Multiplication Theorem in its simplified form in cases where
 this is illegitimate.
 
 2. These fallacies have been partly due to the absence of 
 a clear understanding as to what is meant by Independence. 
 Students of Probability have thought of the independence of 
 events, rather than of the independence of arguments or pro- 
 positions. The one phraseology is, perhaps, as legitimate as the 
 other ; but when we speak of the dependence of events, we are 
 led to believe that the question is one of direct causal dependence, 
 two events being dependent if the occurrence of one is a part 
 cause or a possible part cause of the occurrence of the other. In 
 this sense the result of tossing a coin is dependent on the existence 
 of bias in the coin or in the method of tossing it, but it is inde- 
 pendent of the actual results of other tosses ; immunity from 
 smallpox is dependent on vaccination, but is independent of 
 statistical returns relating to immunity ; while the testimonies 
 of two witnesses about the same occurrence are independent, 
 so long as there is no collusion between them. 
 
 
 This sense, which it is not easy to define quite precisely, is
 at any rate not the sense with which we are concerned when we
 deal with independent probabilities. We are concerned, not with
 direct causation of the kind described above, but with 'dependence
 for knowledge,' with the question whether the knowledge of
 one fact or event affords any rational ground for expecting the
 existence of the other. The dependence for knowledge of two
 events usually arises, no doubt, out of causal connection, or what
 we term such, of some kind. But two events are not independent
 for knowledge merely because there is an absence of direct causal 
 connection between them ; nor, on the other hand, are they 
 necessarily dependent because there is in fact a causal train which 
 brings them into an indirect connection. The question is whether 
 there is any known probable connection, direct or indirect. A 
 knowledge of the results of other tossings of a coin may be hardly 
 less relevant than a knowledge of the bias of the coin ; for a 
 knowledge of these results may be a ground for a probable know- 
 ledge of the bias. There is a similar connection between the 
 statistics of immunity from smallpox and the causal relations 
 between vaccination and smallpox. The truthful testimonies 
 of two witnesses about the same occurrence have a common 
 cause, namely the occurrence, however independent (in the legal 
 sense of the absence of collusion) the witnesses may be. For the 
 purposes of probability two facts are only independent if the 
 existence of one is no indication of anything which might be a 
 part cause of the other. 
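
The case of the coin may be put in figures. The sketch below is an illustration only. It assumes, hypothetically, that the evidence leaves it equally probable that the coin is biased three to one in favour of heads or three to one in favour of tails; the tosses have no causal influence on one another, yet the result of one is relevant to the next.

```python
from fractions import Fraction as F

# Hypothetical evidence h: chance of heads on each toss -> a priori weight of that bias.
biases = {F(3, 4): F(1, 2), F(1, 4): F(1, 2)}

def prob_run_of_heads(n):
    """Probability, on h, that the first n tosses all give heads."""
    return sum(weight * bias ** n for bias, weight in biases.items())

first_head = prob_run_of_heads(1)
second_head_given_first = prob_run_of_heads(2) / prob_run_of_heads(1)
```

The probability of heads on the first toss is 1/2, but the probability of heads on the second, given heads on the first, is 5/8. The first result is a ground for a probable knowledge of the bias, and so the two tosses are not independent for knowledge, although neither is a cause of the other.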
 
 3. While dependence and independence may be thus con- 
 nected with the conception of causality, it is not convenient to 
 found our definition of independence upon this connection. A 
 partial or possible cause involves ideas which are still obscure, and
 I have preferred to define independence by reference to the conception
 of relevance, which has been already discussed. Whether
 there really are material external causal laws, how far causal
 connection is distinct from logical connection, and other such 
 questions, are profoundly associated with the ultimate problems 
 of logic and probability and with many of the topics, especially 
 those of Part III., of this treatise. But I have nothing useful to 
 say about them. Nearly everything with which I deal can be 
 expressed in terms of logical relevance. And the relations be- 
 tween logical relevance and material cause must be left doubtful. 
 
 
 4. It will be useful to give a few examples out of writers who, 
 as I conceive, have been led into mistakes through misappre- 
 hending the significance of Independence. 
 
 Cournot,¹ in his work on Probability, which after a long period
 of neglect has come into high favour with a modern school of
 thought in France, distinguishes between 'subjective probability'
 based on ignorance and 'objective probability' based on the
 calculation of 'objective possibilities,' an 'objective possibility'
 being a chance event brought about by the combination or
 convergence of phenomena belonging to independent series. The
 existence of objectively chance events depends on his doctrine
 that, as there are series of phenomena causally dependent, so
 there are others between the causal developments of which there
 is independence. These objective possibilities of Cournot's,
 whether they be real or fantastic, can have, however, small
 importance for the theory of probability. For it is not known
 to us what series of phenomena are thus independent. If we had
 to wait until we knew phenomena to be independent in this sense
 before we could use the simplified multiplication theorem, most
 mathematical applications of probability would remain
 hypothetical.
 
 5. Cournot's ' objective probability,' depending wholly on 
 objective fact, bears some resemblances to the conception in the 
 minds of those who adopt the frequency theory of probability. 
 The proper definition of independence on this theory has been
 given most clearly by Mr. Yule² as follows:
 
 " Two attributes A and B are usually defined to be inde- 
 pendent, within any given field of observation or ' universe,' 
 when the chance of finding them together is the product of the 
 chances of finding either of them separately. The physical 
 meaning of the definition seems rather clearer in a different 
 form of statement, viz. if we define A and B to be independent
 when the proportion of A's amongst the B's of the given universe is
 the same as in that universe at large. If, for instance, the question
 were put, 'What is the test for independence of smallpox attack
 and vaccination?' the natural reply would be, 'The percentage
 of vaccinated amongst the attacked should be the same as in
 the general population.' . . ."
 
 ¹ For some account of Cournot, see Chapter XXIV. § 3.
 ² "Notes on the Theory of Association of Attributes in Statistics,"
 Biometrika, vol. ii. p. 125.
 
 
 This definition is consistent with the rest of the theory
 to which it belongs, but is, at the same time, open to the
 general objections to it.¹ Mr. Yule admits that A and B may be
 independent in the world at large but not in the world of C's.
 The question therefore arises as to what world given evidence
 specifies, and whether any step forward is possible when, as is
 generally the case, we do not know for certain what the
 proportions in a given world actually are. As in the case of Cournot's
 independent series, it is in general impossible that we should
 know whether A and B are or are not independent in this sense.
 The logical independence for knowledge which justifies our
 reasoning in a certain way must be something different from
 either of these objective forms of independence.
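
Mr. Yule's admission, that A and B may be independent in the world at large but not in the world of C's, is easily exhibited by a table of counts. The figures in the sketch below are hypothetical and serve only to illustrate the definition quoted above; the function names are not from the text.

```python
from fractions import Fraction as F

# Hypothetical counts of individuals classified by (has A, has B, has C).
counts = {
    (True, True, True): 3, (True, False, True): 1,
    (False, True, True): 1, (False, False, True): 3,
    (True, True, False): 1, (True, False, False): 3,
    (False, True, False): 3, (False, False, False): 1,
}

def proportion_of_A(universe):
    """Proportion of A's among the individuals selected by `universe`."""
    selected = {k: n for k, n in counts.items() if universe(k)}
    return F(sum(n for k, n in selected.items() if k[0]), sum(selected.values()))

def independent_in(universe):
    """Yule's test: the proportion of A's amongst the B's equals that in the universe."""
    return proportion_of_A(lambda k: universe(k) and k[1]) == proportion_of_A(universe)

at_large = independent_in(lambda k: True)    # the whole universe
within_C = independent_in(lambda k: k[2])    # the world of C's only
```

In the whole universe the proportion of A's is 1/2 both amongst the B's and at large, so that A and B are independent by the test. Within the C's the proportion of A's amongst the B's is 3/4 against 1/2 in the C's at large. Which verdict applies therefore depends on what world the evidence is taken to specify.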
 
 6. I come now to Boole's treatment of this subject. The
 central error in his system of probability arises out of his giving
 two inconsistent definitions of 'independence.'² He first wins
 the reader's acquiescence by giving a perfectly correct
 definition: "Two events are said to be independent when the
 probability of the happening of either of them is unaffected by
 our expectation of the occurrence or failure of the other."³ But
 a moment later he interprets the term in quite a different sense;
 for, according to Boole's second definition, we must regard the
 events as independent unless we are told either that they must
 concur or that they cannot concur. That is to say, they are
 independent unless we know for certain that there is, in fact, an
 invariable connection between them. "The simple events, x, y, z,
 will be said to be conditioned when they are not free to occur in
 every possible combination; in other words, when some
 compound event depending upon them is precluded from occurring.
 
 ¹ See Chapter VIII.

 ² Boole's mistake was pointed out, accurately though somewhat obscurely,
 by H. Wilbraham in his review "On the Theory of Chances developed in Professor
 Boole's Laws of Thought" (Phil. Mag. 4th series, vol. vii., 1854). Boole
 failed to understand the point of Wilbraham's criticism, and replied hotly,
 challenging him to impugn any individual results ("Reply to some Observations
 published by Mr. Wilbraham," Phil. Mag. 4th series, vol. viii., 1854). He
 returned to the same question in a paper entitled "On a General Method in
 the Theory of Probabilities," Phil. Mag. 4th series, vol. viii., 1854, where he
 endeavours to support his theory by an appeal to the Principle of Indifference.
 McColl, in his "Sixth Paper on Calculus of Equivalent Statements," saw
 that Boole's fallacy turned on his definition of Independence; but I do
 not think he understood, at least he does not explain, where precisely Boole's
 mistake lay.

 ³ Laws of Thought, p. 255. The italics in this quotation are mine.
 
 
 . . . Simple unconditioned events are by definition independent."¹
 In fact as long as xz is possible, x and z are independent. This is
 plainly inconsistent with Boole's first definition, with which he
 makes no attempt to reconcile it. The consequences of his
 employing the term independence in a double sense are far-reaching.
 For he uses a method of reduction which is only valid when the
 arguments to which it is applied are independent in the first
 sense, and assumes that it is valid if they are independent in the
 second sense. While his theorems are true if all the propositions
 or events involved are independent in the first sense, they are not
 true, as he supposes them to be, if the events are independent
 only in the second sense. In some cases this mistake involves
 him in results so paradoxical that they might have led him
 to detect his fundamental error.² Boole was almost certainly
 led into this error through supposing that the data of a
 problem can be of the form, "Prob. x = p," i.e. that it is
 sufficient to state that the probability of a proposition is such
 and such, without stating to what premisses this probability is
 referred.³
 
 It is interesting that De Morgan should have given,
 incidentally, a definition of independence almost identical
 with Boole's second definition: "Two events are independent
 if the latter might have existed without the former, or the
 
 1 Op. cit. p. 258. 
 
 2 There is an excellent instance of this, Laws of Thought, p. 286. Boole
 discusses the problem : Given the probability p of the disjunction 'either Y
 is true, or X and Y are false,' required the probability of the conditional
 proposition, 'If X is true, Y is true.' The two propositions are formally
 equivalent ; but Boole, through the error pointed out above, arrives at the
 result $cp/(1 - p + cp)$,
 where c is the probability of 'If either Y is true, or X and Y false, X is true.'
 His explanation of the paradox amounts to an assertion that, so long as two
 propositions, which are formally equivalent when true, are only probable, they
 are not necessarily equivalent.
 
 3 In studying and criticising Boole's work on Probability, it is very important
 to take into account the various articles which he contributed to the
 Philosophical Magazine during 1854, in which the methods of The Laws of
 Thought are considerably improved and modified. His last and most considered
 contribution to Probability is his paper "On the application of the Theory of
 Probabilities to the question of the combination of testimonies or judgments,"
 to be found in the Edin. Phil. Trans. vol. xxi., 1857. This memoir contains a
 simplification and general summary of the method originally proposed in The
 Laws of Thought, and should be regarded as superseding the exposition of that
 book. In spite of the error already alluded to, which vitiates many of his
 conclusions, the memoir is as full as are his other writings of genius and
 originality.
 
 former without the latter, for anything that we know to the 
 contrary." ¹
 
 7. In many other cases errors have arisen, not through a
 misapprehension of the meaning of independence, but merely
 through careless assumptions of it, or through enunciating the
 Theorem of Multiplication without its qualifying condition.
 Mathematicians have been too eager to assume the legitimacy
 of those complicated processes of multiplying probabilities, for
 which the greater part of the mathematics of probability is
 engaged in supplying simplifications and approximate solutions.
 Even De Morgan was careless enough in one of his writings ²
 to enunciate the Multiplication Theorem in the following form :
 "The probability of the happening of two, three, or more events
 is the product of the probabilities of their happening separately
 (p. 398). . . . Knowing the probability of a compound event,
 and that of one of its components, we find the probability
 of the other by dividing the first by the second. This is a
 mathematical result of the last too obvious to require further
 proof (p. 401)."
 
 An excellent and classic instance of the danger of wrongful 
 assumptions of independence is given by the problem of deter- 
 mining the probability of throwing heads twice in two consecutive 
 tosses of a coin. The plain man generally assumes without
 hesitation that the chance is $(\tfrac{1}{2})^2$. For the à priori chance of
 heads at the first toss is $\tfrac{1}{2}$, and we might naturally suppose that
 the two events are independent, since the mere fact of heads
 having appeared once can have no influence on the next toss.
 But this is not the case unless we know for certain that the coin
 is free from bias. If we do not know whether there is bias, or
 which way the bias lies, then it is reasonable to put the probability
 somewhat higher than $(\tfrac{1}{2})^2$. The fact of heads having appeared
 at the first toss is not the cause of heads appearing at the second 
 also, but the knowledge, that the coin has fallen heads already, 
 affects our forecast of its falling thus in the future, since heads in 
 the past may have been due to a cause which will favour heads 
 in the future. The possibility of bias in a coin, it may be noticed, 
 
 1 " Essay on Probabilities " in the Cabinet Encyclopaedia, p. 26. De Morgan 
 is not very consistent with himself in his various distinct treatises on this 
 subject, and other definitions may be found elsewhere. Boole's second definition
 of Independence is also adopted by Macfarlane, Algebra of Logic, p. 21.

 2 "Theory of Probabilities" in the Encyclopaedia Metropolitana.
 
 always favours ' runs ' ; this possibility increases the probability 
 both of ' runs ' of heads and of ' runs ' of tails. 
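 The effect of possible bias can be checked numerically. What follows is a
 minimal sketch in Python and no part of the original argument : the two
 candidate chances of heads (2/5 and 3/5) and the equal weight given to each
 are assumptions chosen purely for illustration, and the function names are
 hypothetical.

```python
from fractions import Fraction


def prob_heads(bias_weights):
    """Chance of heads at a single toss when the coin's bias is uncertain.

    bias_weights maps each candidate chance of heads to the weight we
    attach to it; the weights are expected to sum to 1.
    """
    return sum(weight * p for p, weight in bias_weights.items())


def prob_two_heads(bias_weights):
    """Chance of heads at two consecutive tosses under the same uncertainty."""
    return sum(weight * p * p for p, weight in bias_weights.items())


# Assumed state of knowledge: the coin leans one way or the other by the
# same amount, and we have no idea which way.
unknown_bias = {Fraction(2, 5): Fraction(1, 2), Fraction(3, 5): Fraction(1, 2)}
fair_coin = {Fraction(1, 2): Fraction(1)}

first_toss = prob_heads(unknown_bias)        # 1/2: the first toss is still even
twice = prob_two_heads(unknown_bias)         # 13/50, a little above 1/4
second_given_first = twice / first_toss      # 13/25: heads once favours heads again
```

 Under these assumed weights the chance of heads at the first toss remains 1/2,
 while the chance of heads twice is 13/50 ; by symmetry the same holds of
 tails twice, so that runs of either kind are favoured.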
 
 This point is discussed at some length in Chapter XXIX. and 
 further examples will be given there. In this chapter, therefore,
 I will do no more than refer to an investigation by Laplace and to
 one real and one supposed fallacy of Independence of a type with 
 which we shall not be concerned in Chapter XXIX. 
 
 8. Laplace, in so far as he took account at all of the considerations
 explained in § 7, discussed them under the heading of Des
 inégalités inconnues qui peuvent exister entre les chances que l'on
 suppose égales.¹ In the case, that is to say, of the coin with
 unknown bias, he held that the true probability of heads even
 at the first toss differed from $\tfrac{1}{2}$ by an amount unknown. But
 this is not the correct way of looking at the matter. In the 
 supposed circumstances the initial chances for heads and tails 
 respectively at the first toss really are equal. What is not true 
 is that the initial probability of ' heads twice ' is equal to the 
 probability of ' heads once ' squared. 
 
 Let us write 'heads at first toss' $= h_1$ ; 'heads at second toss'
 $= h_2$. Then $h_1/h = h_2/h = \tfrac{1}{2}$, and $h_1h_2/h = h_2/h_1h \cdot h_1/h$. Hence
 $h_1h_2/h = \{h_1/h\}^2$ only if $h_2/h_1h = h_2/h$, i.e. if the knowledge that
 heads has fallen at the first toss does not affect in the least the
 probability of its falling at the second. In general, it is true that
 $h_2/h_1h$ will not differ greatly from $h_2/h$ (for relative to most hypotheses
 heads at the first toss will not much influence our expectation
 of heads at the second), and $\tfrac{1}{4}$ will, therefore, give a good approximation
 to the required probability. Laplace suggests an ingenious
 method by which the divergence may be diminished. If we
 throw two coins and define ' heads ' at any toss as the face thrown 
 by the second coin, he discusses the probability of ' heads twice 
 running ' with the first coin. The solution of this problem 
 involves, of course, particular assumptions, but they are of a kind 
 more likely to be realised in practice than the complete absence 
 of bias. As Laplace does not state them, and as his proof is 
 incomplete, it may be worth while to give a proof in detail. 
 
 Let $h_1, t_1, h_2, t_2$ denote heads and tails respectively with
 the first and second coins respectively at the first toss, and
 $h_1', t_1', h_2', t_2'$ the corresponding events at the second toss, then
 
 1 Essai philosophique, p. 49. See also "Mémoire sur les Probabilités," Mém.
 de l'Acad. p. 228, and cp. D'Alembert, "Sur le calcul des probabilités,"
 Opuscules mathématiques (1780), vol. vii.
 
 the probability (with the above convention) of 'heads twice running,'
 i.e. agreement between the two coins twice running, is

 $$(h_1h_2 + t_1t_2)(h_1'h_2' + t_1't_2')/h = (h_1'h_2' + t_1't_2')/(h_1h_2 + t_1t_2,\ h) \cdot (h_1h_2 + t_1t_2)/h.$$

 Since $h_1'h_2'/(h_1h_2 + t_1t_2,\ h) = t_1't_2'/(h_1h_2 + t_1t_2,\ h)$ by the Principle
 of Indifference, and $h_1'h_2't_1't_2'/h = 0$,

 $$\therefore\ (h_1'h_2' + t_1't_2')/(h_1h_2 + t_1t_2,\ h) = 2 \cdot h_1'h_2'/(h_1h_2 + t_1t_2,\ h) \quad \text{by (24.1)}.$$

 Similarly $(h_1h_2 + t_1t_2)/h = 2\,h_1h_2/h$.

 We may assume that $h_1/h_2h = h_1/h$, i.e. that heads with one
 coin is irrelevant to the probability of heads with the other : and
 $h_1/h = h_2/h = \tfrac{1}{2}$ by the Principle of Indifference, so that

 $$(h_1h_2 + t_1t_2)/h = 2(\tfrac{1}{2})^2 = \tfrac{1}{2}.$$

 $$\begin{aligned}
 \therefore\ (h_1h_2 + t_1t_2)(h_1'h_2' + t_1't_2')/h &= h_1'h_2'/(h_1h_2 + t_1t_2,\ h) \\
 &= h_1'/(h_2',\ h_1h_2 + t_1t_2,\ h) \cdot h_2'/(h_1h_2 + t_1t_2,\ h) \\
 &= \tfrac{1}{2}\, h_1'/(h_2',\ h_1h_2 + t_1t_2,\ h),
 \end{aligned}$$

 since, $(h_1h_2 + t_1t_2)$ being irrelevant to $h_2'/h$, $h_2'/(h_1h_2 + t_1t_2,\ h) = \tfrac{1}{2}$.

 Now $h_1'/(h_2',\ h_1h_2 + t_1t_2,\ h)$ is greater than $\tfrac{1}{2}$, since the fact of
 the coins having agreed once may be some reason for supposing
 they will agree again. But it is less than $h_1'/h_1h$ : for we may
 assume that $h_1'/(h_2',\ h_1h_2 + t_1t_2,\ h)$ is less than $h_1'/(h_2',\ h_1h_2,\ h)$,
 and also that $h_1'/(h_2',\ h_1h_2,\ h) = h_1'/h_1h$, i.e. that heads twice
 running with one coin does not increase the probability of heads
 twice running with a different coin. Laplace's method of tossing,
 therefore, yields with these assumptions, more or less legitimate
 according to the content of h, a probability nearer to $\tfrac{1}{4}$ than is
 $h_1h_1'/h$. If $h_1'/(h_2',\ h_1h_2 + t_1t_2,\ h) = \tfrac{1}{2}$, then the probability is
 exactly $\tfrac{1}{4}$.
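 The conclusion may be illustrated by a small exact computation. The sketch
 below is an assumption-laden toy and not Laplace's own calculation : each
 coin is supposed, independently of the other, to have a chance of heads of
 either 2/5 or 3/5 with equal weight, and the function names are hypothetical.
 It compares 'heads twice running' with a single coin against agreement
 between the two coins twice running.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
# Assumed for illustration: each coin's chance of heads is 2/5 or 3/5,
# the two cases being equally likely and independent between the coins.
CANDIDATE_BIASES = [Fraction(2, 5), Fraction(3, 5)]


def heads_twice_one_coin():
    """Chance of heads twice running with a single coin of unknown bias."""
    return sum(HALF * p * p for p in CANDIDATE_BIASES)


def agree_twice_two_coins():
    """Chance that the two coins show the same face at both tosses."""
    total = Fraction(0)
    for p, q in product(CANDIDATE_BIASES, repeat=2):
        agree_once = p * q + (1 - p) * (1 - q)
        # Each pair of biases carries weight 1/4; tosses are independent
        # once the biases are fixed.
        total += HALF * HALF * agree_once ** 2
    return total


single = heads_twice_one_coin()      # 13/50
paired = agree_twice_two_coins()     # 313/1250
```

 On these assumptions both values exceed 1/4, but the two-coin value
 (313/1250) lies much nearer to 1/4 than the single-coin value (13/50).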
 
 9. Two other examples will complete this rather discursive 
 commentary. It has been supposed that by the Principle of 
 Indifference the probability of the existence of iron upon Sirius
 is $\tfrac{1}{2}$, and that similarly the probability of the existence there of
 any other element is also $\tfrac{1}{2}$. The probability, therefore, that
 not one of the 68 terrestrial elements will be found on Sirius
 is $(\tfrac{1}{2})^{68}$, and that at least one will be found there is $1 - (\tfrac{1}{2})^{68}$ or
 approximately certain. This argument, or a similar one, has
 been seriously advanced. It would seem to prove also, amongst 
 
 many other things, that at least one college exactly resembling 
 some college at either Oxford or Cambridge will almost certainly 
 be found on Sirius. The fallacy is partly due, as has been pointed 
 out by Von Kries and others, to an illegitimate use of the Principle
 of Indifference. The probability of iron on Sirius is not $\tfrac{1}{2}$. But
 the result is also due to the fallacy of false independence.
 It is assumed that the known existence of 67 terrestrial
 elements on Sirius would not increase the probability of the
 sixty-eighth's being found there also, and that their known
 absence would not decrease the sixty-eighth's probability.¹
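 How much of the result rests on the assumed independence can be seen from a
 short calculation. This is only a sketch : the 'linked' alternative below is
 a hypothetical extreme invented for contrast, in which the elements are found
 all together or not at all, and it is not offered as the true state of the
 case.

```python
from fractions import Fraction

N_ELEMENTS = 68
HALF = Fraction(1, 2)

# The criticised argument: each element is given the chance 1/2 and the
# 68 elements are treated as mutually independent.
p_none_independent = HALF ** N_ELEMENTS
p_some_independent = 1 - p_none_independent

# Hypothetical extreme of dependence (an assumption for contrast only):
# with chance 1/2 all 68 elements are present, otherwise none is.
# Taken singly, each element still has the chance 1/2.
p_none_linked = HALF
p_some_linked = 1 - p_none_linked
```

 With the same chance of 1/2 for each element taken singly, the chance of at
 least one element being present is all but certain in the first case and only
 1/2 in the second ; the near-certainty is produced by the assumption of
 independence and not by the single probabilities.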
 
 10. The other example is that of Maxwell's classic mistake in
 the theory of gases.² According to this theory molecules of gas
 move with great velocity in every direction. Both the directions
 and velocities are unknown, but the probability that a molecule
 has a given velocity is a function of that velocity and is independent
 of the direction. The maximum velocity and the mean
 velocity vary with the temperature. Maxwell seeks to 
 determine, on these conditions alone, the probability that a 
 molecule has a given velocity. His argument is as follows : 
 
 If $\phi(x)$ represents the probability that the component of
 velocity parallel to the axis of X is x, the probability that the
 velocity has components x, y, z parallel to the three axes is
 $\phi(x)\phi(y)\phi(z)$. Thus if $F(v)$ represents the probability of a total
 velocity v, we have $\phi(x)\phi(y)\phi(z) = F(v)$, where $v^2 = x^2 + y^2 + z^2$.
 It is not difficult to deduce from this (assuming that the 
 
 1 See Von Kries, Die Principien der Wahrscheinlichkeitsrechnung, p. 10.
 Stumpf (Über den Begriff der mathem. Wahrscheinlichkeit, pp. 71-74) argues that
 the fallacy results from not taking into account the fact that there might be as
 many metals as atomic weights, and that therefore the chance of iron is $1/z$, where
 z is the number of possible atomic weights. A. Nitsche (Vierteljschr. f. wissensch.
 Philos., 1892) thinks that the real alternatives are 0, or only 1, or only 2 ... or
 68 terrestrial elements on Sirius, and that these are equally probable, the chance
 of each being $\tfrac{1}{69}$.
 
 2 I take the statement of this from Bertrand's Calcul des probabilités, p. 30.
 Let me here quote a precocious passage on Probability regarded as a branch of 
 Logic, from a letter written by Maxwell in his nineteenth year (1850), before 
 he came up to Cambridge : " They say that Understanding ought to work 
 by the rules of right reason. These rules are, or ought to be, contained in 
 Logic ; but the actual science of logic is conversant at present only with things 
 either certain, impossible, or entirely doubtful, none of which (fortunately) 
 we have to reason on. Therefore the true logic for this world is the calculus 
 of Probabilities, which takes account of the magnitude of the probability 
 which is, or ought to be, in a reasonable man's mind" (Life, page 143).
 
 functions are analytical) that $\phi(x)$ must be of the form $Ge^{-k^2x^2}$.
 
 It is generally agreed at the present time that this result is 
 erroneous. But the nature of the error is, I think, quite different
 from what it is commonly supposed to be. 
 
 Bertrand,¹ Poincaré,² and Von Kries,³ all cite this argument of
 Maxwell's as an illustration of the fallacy of Independence ; and
 argue that $\phi(x)$, $\phi(y)$, and $\phi(z)$ cannot, as he assumes, represent
 independent probabilities, if, as he also assumes, the probability
 of a velocity is a function of that velocity. But it is not in this
 way that the error in the result really arises. If we do not know
 what function of the velocity the probability of that velocity is,
 a knowledge of the velocity parallel to the axes of x and y tells
 us nothing about the velocity parallel to the axis of z. Maxwell
 was, I think, quite right to hold that a mere assumption that the
 probability of a velocity is some function of that velocity, does
 not interfere with the mutual independence of statements as to
 the velocity parallel to each of the three axes. Let us denote
 the proposition, 'the velocity parallel to the axis of X is x' by
 $X(x)$, the corresponding propositions relative to the axes of Y
 and Z by $Y(y)$ and $Z(z)$, and the proposition 'the total
 velocity is v' by $V(v)$ ; and let h represent our à priori data.
 Then if $X(x)/h = \phi(x)$ it is a justifiable inference from the
 Principle of Indifference that $Y(y)/h = \phi(y)$ and $Z(z)/h = \phi(z)$.
 Maxwell infers from this that $X(x)Y(y)Z(z)/h = \phi(x)\phi(y)\phi(z)$.
 That is to say, he assumes that $Y(y)/X(x).h = Y(y)/h$ and
 that $Z(z)/Y(y).X(x).h = Z(z)/h$. I do not agree with the
 authorities cited above that this is illegitimate. So long as
 we do not know what function of the total velocity the probability
 of that velocity is, a knowledge of the velocities parallel
 to the axes of x and y has no bearing on the probability of a given
 velocity parallel to the axis of z. But Maxwell goes on to infer
 that $X(x)Y(y)Z(z)/h = V(v)/h$, where $v^2 = x^2 + y^2 + z^2$. It is here,
 and in a very elementary way, that the error creeps in. The
 propositions $X(x)Y(y)Z(z)$ and $V(v)$ are not equivalent. The
 latter follows from the former, but the former does not follow
 from the latter. There is more than one set of values x, y, z,
 
 1 Calcul des probabilités, p. 30.

 2 Calcul des probabilités (2nd ed.), pp. 41-44.

 3 Wahrscheinlichkeitsrechnung, p. 199.
 
 which will yield the same value v. Thus the probability $V(v)/h$
 is much greater than the probability $X(x)Y(y)Z(z)/h$. As we do
 not know the direction of the total velocity v, there are many
 ways, not inconsistent with our data, of resolving it into components
 parallel to the axes. Indeed I think it is a legitimate
 extension of the preceding argument to put $V(v)/h = \phi(v)$ ; for
 there is no reason for thinking differently about the direction
 V from what we think about the direction X.
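 That one total velocity answers to many sets of components is easily
 exhibited in a discrete toy case. The sketch below assumes, purely for
 illustration, that the components take whole-number values ; the function
 name and the particular numbers are hypothetical. It lists every triple
 (x, y, z) yielding the same value of v.

```python
from itertools import product


def resolutions(v_squared, max_component):
    """All integer triples (x, y, z) with x^2 + y^2 + z^2 equal to v_squared."""
    axis = range(-max_component, max_component + 1)
    return [
        (x, y, z)
        for x, y, z in product(axis, repeat=3)
        if x * x + y * y + z * z == v_squared
    ]


# Total velocity v = 3, components restricted to the integers -3 .. 3.
triples = resolutions(9, 3)
number_of_resolutions = len(triples)   # 30
```

 The proposition 'the total velocity is 3' is here the disjunction of thirty
 mutually exclusive propositions about the components, so that its probability
 is the sum of thirty terms and not any single one of them.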
 
 A difficulty analogous to this occurs in discussing the problem 
 of the dispersion of bullets over a target, a subject round which,
 on account of a curiosity which it seems to have raised in the
 minds of many students of probability, a literature has grown up 
 of a bulk disproportionate to its importance. 
 
 11. I now pass to the Principle of Inverse Probability, a 
 theorem of great importance in the history of the subject. With 
 various arguments which have been based upon it I shall deal 
 in Chapter XXX. But it will be convenient to discuss here the 
 history of the Principle itself and of attempts at proving it. 
 
 It first makes its appearance somewhat late in the history of 
 the subject. Not until 1763, when Bayes's theorem was communicated
 to the Royal Society,¹ was a rule for the determination
 of inverse probabilities explicitly enunciated. It is true that
 solutions to inductive problems requiring an implicit and more
 or less fallacious use of the inverse principle had already been
 propounded, notably by Daniel Bernoulli in his investigations
 into the statistical evidence in favour of inoculation.² But the
 appearance of Bayes's Memoir marks the beginning of a new
 stage of development. It was followed in 1767 by a contribution
 from Michell ^ to the Philosophical Transactions on the distribu- 
 
 1 Published in the Phil. Trans, vol. liii., 1763, pp. 376-398. This Memoir 
 was communicated by Price after Bayes's death ; there was a second Memoir 
 in the following year (vol. liv. pp. 298-310), to which Price himself made some 
 contributions. See Todhunter's History, pp. 299 et seq. Thomas Bayes was 
 a dissenting minister of Tunbridge Wells, who was a Fellow of the Royal Society 
 from 1741 until his death in 1761. A German edition of his contributions to 
 Probability has been edited by Timerding. 
 
 2 "Essai d'une nouvelle analyse de la mortalité causée par la petite vérole,
 et des avantages de l'inoculation pour la prévenir," Hist. de l'Acad., Paris, 1760
 (published 1766). Bernoulli argued that the recorded results of inoculation
 rendered it a probable cause of immunity. This is an inverse argument, though
 Bayes's theorem is not used in the course of it. See also D. Bernoulli's Memoir
 on the Inclinations of the Planetary Orbits.

 3 Michell's argument owes more, perhaps, to Daniel Bernoulli than to
 Bayes.
 
 tion of the stars, to which further reference will be made in 
 Chapter XXV. And in 1774 the rule was clearly, though not 
 quite accurately, enunciated by Laplace in his "Mémoire sur
 la probabilité des causes par les évènemens" (Mémoires
 présentés à l'Académie des Sciences, vol. vi., 1774). He states
 the principle as follows (p. 623) :

 "Si un évènement peut être produit par un nombre n de
 causes différentes, les probabilités de l'existence de ces causes
 prises de l'évènement sont entre elles comme les probabilités de
 l'évènement prises de ces causes ; et la probabilité de l'existence
 de chacune d'elles est égale à la probabilité de l'évènement prise
 de cette cause, divisée par la somme de toutes les probabilités
 de l'évènement prises de chacune de ces causes."
 
 He speaks as if he intended to prove this principle, but he only
 gives explanations and instances without proof. The principle is
 not strictly true in the form in which he enunciates it, as will be
 seen on reference to theorem (38) of Chapter XIV. ; and the
 omission of the necessary qualification has led to a number of 
 fallacious arguments, some of which will be considered in Chapter 
 XXX. 
 
 12. The value and originality of Bayes's Memoir are considerable,
 and Laplace's method probably owes much more to
 it than is generally recognised or than was acknowledged by 
 Laplace. The principle, often called by Bayes's name, does not 
 appear in his Memoir in the shape given it by Laplace and 
 usually adopted since ; but Bayes's enunciation is strictly correct 
 and his method of arriving at it shows its true logical connection 
 with more fundamental principles, whereas Laplace's enuncia- 
 tion gives it the appearance of a new principle specially introduced 
 for the solution of causal problems. The following passage ¹
 gives, in my opinion, a right method of approaching the
 problem : "If there be two subsequent events, the probability
 of the second $\frac{b}{N}$ and the probability of both together $\frac{P}{N}$, and, it
 being first discovered that the second event has happened, from
 hence I guess that the first event has also happened, the probability
 I am in the right is $\frac{P}{b}$." If the occurrence of the first event
 
 1 Quoted by Todhunter, op. cit. p. 296. Todhunter underrates the import- 
 ance of this passage, which he finds unoriginal, yet obscure. 
 
 is denoted by a and of the second by b, this corresponds to
 $ab/h = a/bh \cdot b/h$ and therefore $a/bh = \dfrac{ab/h}{b/h}$ ; for $ab/h = \frac{P}{N}$, $b/h = \frac{b}{N}$,
 $a/bh = \frac{P}{b}$. The direct and indeed fundamental dependence of the
 inverse principle on the rule for compound probabilities was not 
 appreciated by Laplace. 
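 The rule in the form just given lends itself to a direct numerical check.
 The following is a minimal sketch, with hypothetical numbers chosen only for
 illustration (N = 20 equally likely cases, the second event occurring in
 b = 8 of them and both events together in P = 6) ; the function name is
 likewise an assumption.

```python
from fractions import Fraction


def inverse_probability(p_both, p_second):
    """Inverse probability by the compound rule: a/bh = (ab/h) / (b/h)."""
    if p_second == 0:
        raise ValueError("the second event must have a non-zero probability")
    return Fraction(p_both) / Fraction(p_second)


# Hypothetical data: N equally likely cases, b favourable to the second
# event, P favourable to both events together.
N, b, P = 20, 8, 6
guess_is_right = inverse_probability(Fraction(P, N), Fraction(b, N))   # P/b = 3/4
```

 The number N of cases cancels, and the result depends only on the ratio
 of P to b, as in the passage quoted.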
 
 13. A number of proofs of the theorem have been attempted 
 since Laplace's time, but most of them are not very satisfactory, 
 and are generally couched in such a form that they do no more
 than recommend the plausibility of their thesis. Mr. McColl ¹ gave
 a symbolic proof, closely resembling theorem (38) when differences
 of symbolism are allowed for ; and a very similar proof
 has also been given by A. A. Markoff.² I am not acquainted with
 any other rigorous discussion of it.
 
 Von Kries ³ presents the most interesting and careful example
 of a type of proof which has been put forward in one shape or
 another by a number of writers. We have initially, according to
 this view, a certain number of hypothetical possibilities, all
 equally probable, some favourable and some unfavourable to our
 conclusion. Experience, or rather knowledge that the event
 has happened, rules out a number of these alternatives, and we
 are left with a field of possibilities narrower than that with which
 we started. Only part of the original field or Spielraum of
 possibility is now admissible (zulässig). Causes have à posteriori
 probabilities which are proportional to the extent of their occurrence
 in the now restricted field of possibility.
 
 There is much in this which seems to be true, but it hardly
 amounts to a proof. The whole discussion is in reality an
 appeal to intuition. For how do we know that the possibilities
 admissible à posteriori are still, as they were assumed to be à
 priori, equal possibilities ? Von Kries himself notices that there
 is a difficulty ; and I do not see how he is to avoid it, except by
 the introduction of an axiom.

 This was in fact the course taken by Professor Donkin in 1851,
 in an article which aroused some interest in the Philosophical 
 
 1 "Sixth Paper on the Calculus of Equivalent Statements," Proc. Lond.
 Math. Soc., 1897, vol. xxviii. p. 567. See also p. 153 above.

 2 Wahrscheinlichkeitsrechnung, p. 178.

 3 Die Principien der Wahrscheinlichkeitsrechnung, pp. 117-121. The above
 account of Von Kries's argument is much condensed.
 
 Magazine at the time, but which has since been forgotten.
 Donkin's theory is, however, of considerable interest. He laid
 down as one of the fundamental principles of probability the
 following : ¹

 "If there be any number of mutually exclusive hypotheses
 $h_1, h_2, h_3, \ldots$ of which the probabilities relative to a particular state
 of information are $p_1, p_2, p_3, \ldots$, and if new information be gained
 which changes the probabilities of some of them, suppose of
 $h_{m+1}$ and all that follow, without having otherwise any reference
 to the rest, then the probabilities of these latter have the same
 ratios to one another, after the new information, that they had
 before." ²
 
 Donkin goes on to say that the most important case is where 
 the new information consists in the knowledge that some of the 
 hypotheses must be rejected, without any further information 
 as to those of the original set which are retained. This is the 
 proposition which Von Kries requires. 
 
 As it stands, the phrase " without having otherwise any 
 reference to the rest " obviously lacks precision. An interpreta- 
 tion, however, can be put upon it, with which the principle is 
 true. If, given the old information and the truth of one of the
 hypotheses $h_1 \ldots h_m$ to the exclusion of the rest, the probability
 of what is conveyed by the new information is the same whichever
 of the hypotheses $h_1 \ldots h_m$ has been taken, then Donkin's
 principle is valid. For let a be the old information, a' the new,
 and let $h_r/a = p_r$, $h_r/aa' = p_r'$ ; then

 $$p_r' = h_r/aa' = \frac{h_r a'/a}{a'/a} = \frac{a'/h_r a \cdot p_r}{a'/a},$$

 $\therefore\ \dfrac{p_r'}{p_r} = \dfrac{p_s'}{p_s}$, etc., if $a'/h_r a = a'/h_s a$, which is the condition already
 explained.
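 The condition can be seen at work in a small example. This is a sketch
 under stated assumptions : the three hypotheses are taken to be exhaustive
 as well as mutually exclusive (so that the probability of the new
 information can be summed over them), and the numbers and the function name
 are hypothetical.

```python
from fractions import Fraction


def update(priors, likelihoods):
    """New probabilities p_r' = (a'/h_r a . p_r) / (a'/a).

    priors[r] is p_r = h_r/a and likelihoods[r] is a'/h_r a. The
    hypotheses are assumed exclusive and exhaustive, so a'/a is the sum
    of the products.
    """
    evidence = sum(p * l for p, l in zip(priors, likelihoods))   # a'/a
    return [p * l / evidence for p, l in zip(priors, likelihoods)]


priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]        # h_1, h_2, h_3
# The new information is equally probable on h_1 and on h_2, but less
# probable on h_3.
likelihoods = [Fraction(3, 4), Fraction(3, 4), Fraction(1, 4)]
posteriors = update(priors, likelihoods)   # 9/16, 3/8, 1/16
```

 The ratio of the first two probabilities is 3 : 2 both before and after, since
 the new information is equally probable on either hypothesis ; the ratio of
 the first to the third changes from 3 : 1 to 9 : 1.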
 
 14. Difficulties connected with the Inverse Principle have
 arisen, however, not so much in attempts to prove the principle 
 as in those to enunciate it — though it may have been the lack 
 
 1 "On certain Questions relating to the Theory of Probabilities," Phil. Mag.
 4th series, vol. i., 1851.

 2 It is interesting to notice that an axiom, practically equivalent to this,
 has been laid down more lately by A. A. Markoff (Wahrscheinlichkeitsrechnung,
 p. 8) under the title 'Unabhängigkeitsaxiom.'
 
 of a rigorous proof that has been responsible for the frequent 
 enunciation of an inaccurate principle. 
 
 It will be noticed that in the formula (38.2) the à priori
 probabilities of the hypotheses $a_1$ and $a_2$ drop out if $p_1 = p_2$, and
 the results can then be expressed in a much simpler shape. This
 is the shape in which the principle is enunciated by Laplace for
 the general case,¹ and represents the uninstructed view expressed
 with great clearness by De Morgan : ² "Causes are likely or unlikely,
 just in the same proportion that it is likely or unlikely
 that observed events should follow from them. The most
 probable cause is that from which the observed event could most 
 easily have arisen." If this were true the principle of Inverse 
 Probability would certainly be a most powerful weapon of proof, 
 even equal, perhaps, to the heavy burdens which have been laid 
 on it. But the proof given in Chapter XIV. makes plain the 
 necessity in general of taking into account the a priori prob- 
 abilities of the possible causes. Apart from formal proof this 
 necessity commends itself to careful reflection. If a cause is 
 very improbable in itself, the occurrence of an event, which 
 might very easily follow from it, is not necessarily, so long as 
 there are other possible causes, strong evidence in its favour. 
 Amongst the many writers who, forgetting the theoretic qualifica- 
 tion, have been led into actual error, are philosophers as diverse 
 as Laplace, De Morgan, Jevons, and Sigwart, Jevons ³ going
 so far as to maintain that the fallacious principle he enunciates
 is " that which common sense leads us to adopt almost in- 
 stinctively." 
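 The difference which the à priori probabilities make may be shown in
 figures. The sketch below uses hypothetical numbers (a rare cause from which
 the event follows easily, and a common cause from which it follows with
 difficulty) and assumes that these two causes are exclusive and exhaustive ;
 the function names are illustrative only.

```python
from fractions import Fraction


def posterior_with_priors(priors, likelihoods):
    """Inverse probability with the a priori probabilities taken into account."""
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]


def posterior_ignoring_priors(likelihoods):
    """The simplified rule: causes weighted only by how easily the event follows."""
    total = sum(likelihoods)
    return [l / total for l in likelihoods]


# Hypothetical causes: the first is very improbable in itself but would
# easily produce the event; the second is common but seldom produces it.
priors = [Fraction(1, 100), Fraction(99, 100)]
likelihoods = [Fraction(9, 10), Fraction(1, 10)]

simplified = posterior_ignoring_priors(likelihoods)      # 9/10 and 1/10
full = posterior_with_priors(priors, likelihoods)        # 1/12 and 11/12
```

 The simplified rule makes the first cause nine times as probable as the
 second, whereas with the assumed à priori probabilities it is eleven times
 less probable.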
 
 15. The theory of the combination of premisses dealt with 
 in §§ 7, 8 of Chapter XIV. has not often been discussed, and the 
 history of it is meagre. Archbishop Whately ⁴ was led astray
 
 1 See the passage quoted above, p. 175.

 2 "Essay on Probabilities," in the Cabinet Encyclopaedia, p. 27.

 3 Principles of Science, vol. i. p. 280.

 4 Logic, 8th ed. p. 211 : "As in the case of two probable premisses, the
 conclusion is not established except upon the supposition of their being both 
 true, so in the case of two distinct and independent indications of the truth 
 of some proposition, unless both of them fail, the proposition must be true : 
 we therefore multiply together the fractions indicating the probability of the 
 failure of each, the chances against it, and, the result being the total chances
 against the establishment of the conclusion by these arguments, this fraction 
 being deducted from unity, the remainder gives the probability for it. E.g. a 
 certain book is conjectured to be by such and such an author, partly, 1st, from 
 its resemblance in style to his known works ; partly, 2nd, from its being attri- 
 
 by a superficial error, and De Morgan, adopting the same mistaken
 rule, pushed it to the point of absurdity.¹ Bishop Terrot ²
 approached the question more critically. Boole's ³ last and
 most considered contribution to the subject of probability dealt 
 with the same topic. I know of no discussion of it during the 
 past sixty years. 
 
 Boole's treatment is full and detailed. He states the problem 
 as follows : "Required the probability of an event z, when two
 circumstances x and y are known to be present, the probability
 of the event z, when we know only of the existence of the circumstance
 x, being p, and the probability, when we only know of
 the existence of y, being q." ⁴ His solution, however, is vitiated
 by the fundamental error examined in § 6 above. Two of his 
 conclusions may be mentioned for their plausibility, but neither 
 is valid. 
 
 " If the causes in operation, or the testimonies borne," he 
 
 buted to him by some one likely to be pretty well informed. Let the probability
 of the conclusion, as deduced from one of these arguments by itself, be supposed
 $\tfrac{2}{5}$, and in the other case $\tfrac{3}{7}$ ; then the opposite probabilities will be $\tfrac{3}{5}$ and $\tfrac{4}{7}$, which
 multiplied together give $\tfrac{12}{35}$ as the probability against the conclusion. . . ."
 
 The Archbishop's error, in that a negative can always be turned into an
 affirmative by a change of verbal expression, was first pointed out by a mere
 diocesan, Bishop Terrot, in the Edin. Phil. Trans. vol. xxi. The mistake is well
 explained by Boole in the same volume of the Edin. Phil. Trans. : " A confusion 
 may here be noted between the probability that a conclusion is proved, and the 
 probability in favour of a conclusion furnished by evidence which does not prove 
 it. In the proof and statement of his rule, Archbishop Whately adopts the
 former view of the nature of the probabilities concerned in the data. In the 
 exemplification of it, he adopts the latter." 
 
 1 "Theory of Probabilities," Encyclopaedia Metropolitana, p. 400. He shows
 by means of it that "if any assertion appear neither likely nor unlikely in 
 itself, then any logical argument in favour of it, however weak the premisses, 
 makes it in some degree more likely than not — a theorem which will be readily 
 admitted on its own evidence." He then gives an example : "à priori
 vegetation on the planets is neither likely nor unlikely ; suppose argument 
 from analogy makes it ^\ ; then the total probability is J+i . i\ or ^ J." De 
 Morgan seems to accept without hesitation the conclusion to be derived from 
 this, that everything which is not impossible is as probable as not. 
 
 2 "On the Possibility of combining two or more Probabilities of the same
 Event, so as to form one definite Probability," Edin. Phil. Trans., 1856, vol. xxi.

 3 "On the Application of the Theory of Probabilities to the Question of the
 Combination of Testimonies or Judgments," Edin. Phil. Trans., 1857, vol. xxi.

 4 Loc. cit. p. 631. Boole's principle (loc. cit. p. 620) that "the mean strength
 of any probabilities of an event which are founded upon different judgments
 or observations is to be measured by that supposed probability of the event
 à priori which those judgments or observations following thereupon would not
 tend to alter," is not correct if it means more than that the mean strength of
 $z/x$ and $z/y$ is to be measured by $z/xy$.
 
 argues, "are, separately, such as to leave the mind in a state of
 equipoise as respects the event whose probability is sought,
 united they will but produce the same effect." If, that is to say,
 $a/h_1 = \tfrac{1}{2}$ and $a/h_2 = \tfrac{1}{2}$, he concludes that $a/h_1h_2 = \tfrac{1}{2}$. The plausibility
 of this is superficial. Consider, for example, the following
 instance : $h_1$ = A is black and B is black or white, $h_2$ = B is black
 and A is black or white, a = both A and B are black. Boole also
 concluded without valid reason that $a/h_1h_2$ increases, the greater
 the à priori improbability of the combination $h_1h_2$.
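 The instance can be worked out by simple enumeration. The sketch below
 assumes, as the Principle of Indifference would suggest in the absence of
 other data, that the four colourings of A and B are equally probable ; the
 helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

COLOURS = ("black", "white")
# Assumption: the four colourings of (A, B) are equally probable.
WORLDS = list(product(COLOURS, repeat=2))


def prob(conclusion, premiss):
    """Probability of `conclusion` relative to `premiss`, by counting cases."""
    admitted = [w for w in WORLDS if premiss(w)]
    favourable = [w for w in admitted if conclusion(w)]
    return Fraction(len(favourable), len(admitted))


def h1(world):          # A is black and B is black or white
    return world[0] == "black"


def h2(world):          # B is black and A is black or white
    return world[1] == "black"


def a(world):           # both A and B are black
    return world == ("black", "black")


def h1_and_h2(world):
    return h1(world) and h2(world)


separately = (prob(a, h1), prob(a, h2))   # 1/2 and 1/2
united = prob(a, h1_and_h2)               # 1
```

 Each premiss taken separately leaves the conclusion at 1/2, yet the two
 united render it certain, contrary to the rule which Boole lays down.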
 
 16. The theory of " Testimony " itself, the theory, that is to 
 say, of the combination of the evidence of witnesses, has occupied 
 so considerable a space in the traditional treatment of Probability 
 that it will be worth while to examine it briefly. It may, however,
 be safely said that the principal conclusions on the subject set 
 out by Condorcet, Laplace, Poisson, Cournot, and Boole, are 
 demonstrably false. The interest of the discussion is chiefly due 
 to the memory of these distinguished failures. 
 
It seems to have been generally believed by these and other 
logicians and mathematicians ¹ that the probability of two 
witnesses speaking the truth, who are independent in the sense 
that there is no collusion between them, is always the product 
of the probabilities that each of them separately will speak the 
truth.² On this basis conclusions such as the following, for 
example, are arrived at : 
 
X and Y are independent witnesses (i.e. there is no collusion 
between them). The probability that X will speak the truth is 
$x$, that Y will speak the truth is $y$. X and Y agree in a particular 
statement. The chance that this statement is true is 

$$\frac{xy}{xy + (1-x)(1-y)}.$$

For the chance that they both speak the truth is $xy$, and the 
chance that they both speak falsely is $(1-x)(1-y)$. As, in this 
 
¹ Perhaps M. Bertrand should be registered as an honourable exception. 
At least he points out a precisely analogous fallacy in an example where two 
meteorologists prophesy the weather, Calcul des Probabilités, p. 31. 
² E.g., Boole, Laws of Thought, p. 279. 
De Morgan, Formal Logic, p. 195. 
Condorcet, Essai, p. 4. 
Lacroix, Traité, p. 248. 
Cournot, Exposition, p. 354. 
Poisson, Recherches, p. 323. 
This list could be greatly extended. 
 
CH. XVI FUNDAMENTAL THEOREMS 181 
 
 case, our hypothesis is that they agree, these two alternatives 
 are exhaustive ; whence the above result, which may be found 
 in almost every discussion of the subject. 
 
The fallacy of such reasoning is easily exposed by a more 
exact statement of the problem. For let $a_1$ stand for " $X_1$ asserts 
$a$," and let $a/a_1h = x_1$, where $h$, our general data, is by itself 
irrelevant to $a$, i.e., $x_1$ is the probability that a statement is true 
of which we only know that $X_1$ has asserted it. Similarly let us 
write $b/b_2h = x_2$, where $b_2$ stands for " $X_2$ asserts $b$." The above 
argument then assumes that, if $X_1$ and $X_2$ are witnesses who are 
causally independent in the sense that there is no collusion between 
them direct or indirect, $ab/a_1b_2h = a/a_1h \cdot b/b_2h = x_1x_2$. 

But $ab/a_1b_2h = a/a_1bb_2h \cdot b/a_1b_2h$, and this is not equal to $x_1x_2$ 
unless $a/a_1bb_2h = a/a_1h$ and $b/a_1b_2h = b/b_2h$. It is not a sufficient 
condition for this, as seems usually to be supposed, that $X_1$ and $X_2$ 
should be witnesses causally independent of one another. It is 
also necessary that $a$ and $b$, i.e. the propositions asserted by the 
witnesses, should be irrelevant to one another and also each of 
them irrelevant to the fact of the assertion of the other by a 
witness. If a knowledge of $a$ affects the probability either of 
$b$ or of $b_2$, it is evident that the formula breaks down. In the one 
extreme case, where the assertions of the two contradict one 
another, $ab/a_1b_2h = 0$. In the other extreme, where the two agree 
in the same assertion, i.e. where $a = b$, $a/a_1bb_2h = 1$ and not $= a/a_1h$. 
 
17. The special problem of the agreement of witnesses, who 
make the same statement, can be best attacked as follows, a 
certain amount of simplification being introduced. Let the 
general data $h$ of the problem include the hypothesis that $X_1$ and 
$X_2$ are each asked and reply to a question to which there is only 
one correct answer. Let $a_i$ = " $X_i$ asserts $a$ in reply to the question," 
and $m_i$ = " $X_i$ gives the correct answer to the question." Then 

$$m_1/a_1h = x_1 \quad\text{and}\quad m_2/a_2h = x_2,$$

$x_1$ and $x_2$ being, in the conventional language of this problem, 
the " credibilities " of the witnesses. We have, since the witnesses 
agree and since $a$ follows from $m_1a_1$ and $m_1$ follows from $aa_1$, 

$$a/a_1h = m_1/a_1h ; \qquad a/a_1m_1h = 1 ; \qquad m_1/aa_1h = 1.$$

Also, since the witnesses are, in the ordinary sense, " independent " 
 
 
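witnesses, $a_2/a_1ah = a_2/ah$ and $a_2/a_1\bar{a}h = a_2/\bar{a}h$ ; that is to say, the 
probability of $X_2$'s asserting $a$ is independent of the fact of $X_1$'s 
having asserted $a$, given we know that $a$ is, in fact, true or false, 
as the case may be. 

The probability that, if the witnesses agree, their assertion is 
true is 

$$a/a_1a_2h = \frac{a_2a/a_1h}{a_2a/a_1h + a_2\bar{a}/a_1h} = \frac{a_2/a_1ah \cdot x_1}{a_2/a_1ah \cdot x_1 + a_2/a_1\bar{a}h \cdot (1 - x_1)}.$$

If this is to be equal to $\dfrac{x_1x_2}{x_1x_2 + (1-x_1)(1-x_2)}$, we must have 

$$\frac{a_2/a_1ah}{a_2/a_1\bar{a}h} = \frac{x_2}{1 - x_2}.$$

Now, by the hypothesis of " independence," 

$$\frac{a_2/a_1ah}{a_2/a_1\bar{a}h} = \frac{a_2/ah}{a_2/\bar{a}h} = \frac{aa_2/h}{\bar{a}a_2/h} \cdot \frac{\bar{a}/h}{a/h} = \frac{a/a_2h}{\bar{a}/a_2h} \cdot \frac{\bar{a}/h}{a/h} = \frac{x_2}{1 - x_2} \cdot \frac{\bar{a}/h}{a/h}.$$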
 
This then is the assumption which has tacitly slipped into the 
conventional formula, that $a/h = \bar{a}/h = \tfrac{1}{2}$. It is assumed, that 
is to say, that any proposition taken at random is as likely as 
not to be true, so that any answer to a given question is, à priori, 
as likely as not to be correct. Thus the conventional formula 
ought to be employed only in those cases where the answer 
which the " independent " witnesses agree in giving is, à priori 
and apart from their agreement, as likely as not. 
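
The dependence on the à priori probability can be made concrete. The following is a minimal sketch under the independence hypothesis of § 17; the function names and the parameter `prior` (standing for $a/h$) are assumptions introduced here for illustration, and the numerical credibilities are hypothetical.

```python
from fractions import Fraction


def conventional_agreement(x1, x2):
    """The traditional formula x1*x2 / (x1*x2 + (1 - x1)*(1 - x2))."""
    return x1 * x2 / (x1 * x2 + (1 - x1) * (1 - x2))


def agreement_with_prior(x1, x2, prior):
    """a/a1a2h under the independence hypothesis, where prior stands for a/h.

    The ratio (a2/a1ah) / (a2/a1(not-a)h) equals
    x2/(1 - x2) * (1 - prior)/prior, as in the derivation of section 17.
    Requires x2 < 1 and prior > 0.
    """
    ratio = x2 / (1 - x2) * (1 - prior) / prior
    return ratio * x1 / (ratio * x1 + (1 - x1))


x1 = x2 = Fraction(9, 10)  # hypothetical credibilities
# With a/h = 1/2 the general expression reduces to the conventional formula.
print(agreement_with_prior(x1, x2, Fraction(1, 2)) == conventional_agreement(x1, x2))
# With any other prior the two expressions differ.
print(agreement_with_prior(x1, x2, Fraction(1, 10)), conventional_agreement(x1, x2))
```

Exact fractions are used so that the reduction to the conventional formula at $a/h = \tfrac{1}{2}$ can be tested by equality rather than by approximate comparison.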
 
18. A somewhat similar confusion has led to the controversy 
as to whether and in what manner the à priori improbability 
of a statement modifies its credibility in the mouth of a witness 
whose degree of reliability is known. The fallacy of attaching 
the same weight to a testimony regardless of the character of 
what is asserted, is pointed out, of course, by Hume in the Essay 
on Miracles, and his argument, that the great à priori improbability 
of some assertions outweighs the force of testimony 
otherwise reliable, depends on the avoidance of it. The correct 
view is also taken by Laplace in his Essai philosophique (pp. 
 
 
 98-102), where he argues that a witness is less to be believed 
 when he asserts an extraordinary fact, declaring the opposite 
 view (taken by Diderot in the article on " Certitude " in the 
Encyclopédie) to be inconceivable before " le simple bon sens." 
 
 The manner in which the resultant probability is affected 
 depends upon the precise meaning we attach to " degree of re- 
 liability " or " coefficient of credibility." If a witness's credi- 
 bility is represented by x, do we mean that, if a is the true answer, 
 the probability of his giving it is x, or do we mean that if he 
answers $a$ the probability of $a$'s being true is $x$ ? These two things 
 are not equivalent. 
 
Let $a_1$ stand for " $a$ is asserted by the witness " ; $h_1$ for our 
evidence bearing on the witness's veracity ; and $h_2$ for other 
evidence bearing on the truth of $a$. Let $a/h_1h_2$, i.e. the à priori 
probability of $a$ apart from our knowledge of the fact that the 
witness has asserted it, be represented by $p$. 

Let $a/a_1h_1 = x_1$ and $a_1/ah_1 = x_2$ ; so that $x_1 = \dfrac{a/h_1}{a_1/h_1} \cdot x_2$. In 
general $a/h_1 \neq a_1/h_1$. Do we mean by the witness's credibility 
$x_1$ or $x_2$ ? 

We require $a/a_1h_1h_2$. 

Let $a_1/\bar{a}h_1 = r$, i.e. the probability, apart from our special 
knowledge concerning $a$, that, if $a$ is false, the witness will hit on 
that particular falsehood. 

$$a/a_1h_1h_2 = \frac{x_2p}{x_2p + a_1/\bar{a}h_1h_2 \cdot (1 - p)} = \frac{x_2p}{x_2p + r(1 - p)},$$

for $a_1/ah_1h_2 = a_1/ah_1$ and $a_1/\bar{a}h_1h_2 = a_1/\bar{a}h_1$, since, given certain 
knowledge concerning $a$, $h_2$ is irrelevant to the probability of $a_1$. 
 19. Generally speaking, all problems, in regard to the com- 
 bination of testimonies or to the combination of evidence derived 
 from testimony with evidence derived from other sources, may 
 be treated as special instances of the general problem of the 
 combination of arguments. Beyond pointing out the above 
 plausible fallacies, there is little to add. Mr. W. E. Johnson, 
 however, has proposed a method of defining credibility, which 
 is sometimes valuable, because it regards the witness's credibility 
not absolutely, but with reference to a given type of question, 
 
 
so that it enables us to measure the force of the witness's testimony 
under special circumstances. If $a$ represents the fact of A's 
testimony regarding $x$, then we may define A's credibility for $x$ 
as $\alpha$, where $\alpha$ is given by the equation 

$$x/ah = x/h + \alpha\sqrt{\bar{x}/h \cdot x/h} ;$$

so that $\alpha\sqrt{\bar{x}/h \cdot x/h}$ measures the amount by which A's assertion 
of $x$ increases its probability. 
 
 20. One of the most ancient problems in probability is con- 
 cerned with the gradual diminution of the probability of a past 
 event, as the length of the tradition increases by which it is 
 established. Perhaps the most famous solution of it is that 
propounded by Craig in his Theologiae Christianae Principia 
Mathematica, published in 1699.¹ He proves that suspicions of 
any history vary in the duplicate ratio of the times taken from 
the beginning of the history in a manner which has been described 
as a kind of parody of Newton's Principia. " Craig," says 
Todhunter, " concluded that faith in the Gospel so far as it 
depended on oral tradition expired about the year 880, and that 
so far as it depended on written tradition it would expire in the 
year 3150. Peterson by adopting a different law of diminution 
concluded that faith would expire in 1789." ² About the same 
time Locke raised the matter in chap. xvi. bk. iv. of the 
Essay Concerning Human Understanding : " Traditional testimonies 
the farther removed, the less their proof. ... No 
Probability can rise higher than its first original." This is 
 evidently intended to combat the view that the long acceptance 
 by the human race of a reputed fact is an additional argument 
 
¹ See Todhunter's History, p. 64. It has been suggested that the anonymous 
essay in the Phil. Trans. for 1699 entitled " A Calculation of the Credibility 
of Human Testimony " is due to Craig. In this it is argued that, if the 
credibilities of a set of witnesses are $p_1 \ldots p_n$, then if they are 
successive the resulting probability is the product $p_1p_2 \ldots p_n$ ; if they are 
concurrent, it is $1 - (1 - p_1)(1 - p_2) \ldots (1 - p_n)$. 

This last result follows from the supposition that the first witness leaves an 
amount of doubt represented by $1 - p_1$ ; of this the second removes the fraction 
$p_2$, and so on. See also Lacroix, Traité élémentaire, p. 262. The above theory 
was actually adopted by Bicquilley. 
 
² In the Budget of Paradoxes De Morgan quotes Lee, the Cambridge Orientalist, 
 to the effect that Mahometan writers, in reply to the argument that the Koran 
 has not the evidence derived from Christian miracles, contend that, as evidence 
 of Christian miracles is daily weaker, a time must at last arrive when it will 
 fail of affording assurance that they were miracles at all : whence the necessity 
 of another prophet and other miracles. 
 
 
 in its favour and that a long tradition increases rather than 
 diminishes the strength of an assertion. " This is certain," says 
 Locke, " that what in one age was affirmed upon slight grounds, 
can never after come to be more valid in future ages, by being 
often repeated." In this connection he calls attention to " a 
rule observed in the law of England, which is, that though the 
attested copy of a record be good proof, yet the copy of a copy 
never so well attested, and by never so credible witnesses, will 
not be admitted as a proof in Judicature." If this is still a good 
rule of law, it seems to indicate an excessive subservience to the 
 principle of the decay of evidence. 
 
But, although Locke affirms sound maxims, he gives no theory 
that can afford a basis for calculation. Craig, however, was the 
more typical professor of probability, and in attempting an 
algebraic formula he was the first of a considerable family. The 
last grand discussion of the problem took place in the columns 
of the Educational Times.¹ Macfarlane ² mentions that four 
different solutions have been put forward by mathematicians 
of the problem : " A says that B says that a certain event took 
place ; required the probability that the event did take place, 
$p_1$ and $p_2$ being A's and B's respective probabilities of speaking 
the truth." Of these solutions only Cayley's is correct. 
 
¹ Reprinted in Mathematics from the Educational Times, vol. xxvii. 

² Algebra of Logic, p. 151. Macfarlane attempts a solution of the general 
problem without success. Its solution is not difficult, if enough unknowns are 
introduced, but of very little interest. 
 
CHAPTER XVII 
 
 SOME PROBLEMS IN INVERSE PROBABILITY, INCLUDING AVERAGES 
 
1. The present chapter deals with ' problems,' that is to 
say, with applications to particular abstract questions of some of 
the fundamental theorems demonstrated in Chapter XIV. It 
is without philosophical interest and should probably be omitted 
by most readers. I introduce it here in order to show the 
analytical power of the method developed above and its advantage 
in ease and especially in accuracy over other methods which 
have been employed.¹ § 2 is mainly based upon some problems 
discussed by Boole. §§ 3-7 deal with the fundamental theory 
connecting averages and laws of error. §§ 8-11 treat discursively 
the Arithmetic Average, the Method of Least Squares, and 
Weighting. 
 
2. In the following paragraph solutions are given of some 
problems posed by Boole in chapter xx. of his Laws of Thought. 
Boole's own method of solving them is constantly erroneous,² 
and the difficulty of his method is so great that I do not know 
of any one but himself who has ever attempted to use it. The 
term ' cause ' is frequently used in these examples where it might 
have been better to use the term ' hypothesis.' For by a possible 
cause of an event no more is here meant than an antecedent 
occurrence, the knowledge of which is relevant to our anticipation 
of the event ; it does not mean an antecedent from which the 
event in question must follow. 
 
(56) The à priori probabilities of two causes $A_1$ and $A_2$ 
are $c_1$ and $c_2$ respectively. The probability that if the cause $A_1$ 
¹ Such examples as these might sometimes be set to test the wits of students. 
The problems on Probability usually given are simply problems on mathematical 
combinations. These, on the other hand, are really problems in logic. 

² For the reason given in § 6 of Chapter XVI. The solutions of problems 
I.-VI., for example, in the Laws of Thought, chap. xx., are all erroneous. 
 
 
CH. XVII FUNDAMENTAL THEOREMS 187 
 
occur, an event E will accompany it (whether as a consequence 
of $A_1$ or not), is $p_1$, and the probability that E will accompany $A_2$, 
if $A_2$ present itself, is $p_2$. Moreover, the event E cannot appear 
in the absence of both the causes $A_1$ and $A_2$. Required the 
probability of the event E. 
 
This problem is of great historical interest and has been called 
Boole's ' Challenge Problem.' Boole originally proposed it for 
solution to mathematicians in 1851 in the Cambridge and Dublin 
Mathematical Journal. A result was given by Cayley ¹ in the 
Philosophical Magazine, which Boole declared to be erroneous.² 
He then entered the field with his own solution.³ " Several 
attempts at its solution," he says, " have been forwarded to me, 
all of them by mathematicians of great eminence, all of them 
admitting of particular verification, yet differing from each other 
and from the truth." ⁴ After calculations of considerable length 
and great difficulty he arrives at the conclusion that $u$ is the 
probability of the event E where $u$ is that root of the equation 

$$\frac{[1 - c_1(1 - p_1) - u]\,[1 - c_2(1 - p_2) - u]}{1 - u} = \frac{(u - c_1p_1)(u - c_2p_2)}{c_1p_1 + c_2p_2 - u}$$

which is not less than $c_1p_1$ and $c_2p_2$ and not greater than 
$1 - c_1(1 - p_1)$, $1 - c_2(1 - p_2)$, or $c_1p_1 + c_2p_2$. 

This solution can easily be seen to be wrong. For in the 
case where $A_1$ and $A_2$ cannot both occur, the solution is 
$u = c_1p_1 + c_2p_2$ ; whereas Boole's equations do not reduce to 
 
¹ Phil. Mag. 4th series, vol. vi. 

² Cayley's solution was defended against Boole by Dedekind (Crelle's Journal, 
vol. l. p. 268). The difference arises out of the extreme ambiguity as to the 
meaning of the terms as employed by Cayley. 

³ " Solution of a Question in the Theory of Probabilities," Phil. Mag. 4th 
series, vol. vii., 1854. This solution is the same as that printed by Boole 
shortly afterwards in the Laws of Thought, pp. 321-326. In the Phil. Mag. 
Wilbraham gave as the solution $u = c_1p_1 + c_2p_2 - z$, where $z$ is necessarily less 
than either $c_1p_1$ or $c_2p_2$. This solution is correct so far as it goes, but is not 
complete. The problem is also discussed by Macfarlane, Algebra of Logic, 
p. 154. 

⁴ In proposing the problem Boole had said : " The motives which have 
led me, after much consideration, to adopt, with reference to this question, a 
course unusual in the present day, and not upon slight grounds to be revived, 
are the following : First, I propose the question as a test of the sufficiency of 
received methods. Secondly, I anticipate that its discussion will in some 
measure add to our knowledge of an important branch of pure analysis." 
When printing his own solution in the Laws of Thought, he adds, that the 
above " led to some interesting private correspondence, but did not elicit a 
solution." 
 
 
 this simplified form. The mistake which Boole has made is 
the one general to his system, referred to in Chapter XVI., § 6.¹ 
 
The correct solution, which is very simple, can be reached as 
follows : 

Let $a_1$, $a_2$, $e$ assert the occurrences of the two causes and the 
event respectively, and let $h$ be the data of the problem. 

Then we have $a_1/h = c_1$, $a_2/h = c_2$, $e/a_1h = p_1$, $e/a_2h = p_2$ : we 
require $e/h$. Let $e/h = u$, and let $a_1a_2/eh = z$. Since the event 
cannot occur in the absence of both the causes, 

$$e/\bar{a}_1\bar{a}_2h = 0.$$

It follows from this that $\bar{a}_1\bar{a}_2/eh = 0$, unless $e/h = 0$, 

i.e. $(a_1 + a_2)/eh = 1$, 

whence $a_1/eh + a_2/eh = 1 + a_1a_2/eh$ by (24). 

Now $a_1/eh = \dfrac{c_1p_1}{u}$ and $a_2/eh = \dfrac{c_2p_2}{u}$, 

$$\therefore\; u = \frac{c_1p_1 + c_2p_2}{1 + z},$$

where $z$ is the probability after the event that both the causes were 
present. 

If we write $ea_1a_2/h = y$, 

$$y = a_1a_2/eh \cdot e/h = uz,$$

so that $u = (c_1p_1 + c_2p_2) - y$. 

Boole's solution fails by attempting to be independent of 
$y$ or $z$. 
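
The two forms of the solution can be checked on any joint distribution in which the event never occurs without at least one of the causes. The sketch below uses hypothetical weights chosen only for illustration; the names `joint` and `pr` are assumptions, not part of the text.

```python
from fractions import Fraction as F

# Hypothetical joint weights for the states (a1, a2, e). The only constraint
# taken from the problem is that e has weight 0 when both causes are absent.
joint = {
    (1, 1, 1): F(2, 20), (1, 1, 0): F(1, 20),
    (1, 0, 1): F(3, 20), (1, 0, 0): F(2, 20),
    (0, 1, 1): F(4, 20), (0, 1, 0): F(3, 20),
    (0, 0, 1): F(0),     (0, 0, 0): F(5, 20),
}


def pr(pred):
    """Total weight of the states (a1, a2, e) satisfying pred."""
    return sum(w for state, w in joint.items() if pred(*state))


c1 = pr(lambda a1, a2, e: a1)              # a1/h
c2 = pr(lambda a1, a2, e: a2)              # a2/h
p1 = pr(lambda a1, a2, e: a1 and e) / c1   # e/a1h
p2 = pr(lambda a1, a2, e: a2 and e) / c2   # e/a2h
u = pr(lambda a1, a2, e: e)                # e/h
y = pr(lambda a1, a2, e: a1 and a2 and e)  # e a1 a2 / h
z = y / u                                  # a1 a2 / eh

# The direct value of e/h and the two expressions of the text agree.
print(u, c1 * p1 + c2 * p2 - y, (c1 * p1 + c2 * p2) / (1 + z))
```

Changing the weights, while keeping the weight of (0, 0, 1) at zero and the total at unity, changes $y$ and therefore $u$ even when $c_1$, $c_2$, $p_1$, $p_2$ are held fixed, which illustrates why no solution independent of $y$ or $z$ can be right.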
 
(56.1). Suppose that we wish to find limits for the solution 
which are independent of $y$ and $z$ : then, since $y \geq 0$, 

$$u \leq c_1p_1 + c_2p_2.$$

Again 

$$e/h = e\bar{a}_1/h + ea_1/h \leq \bar{a}_1/h + ea_1/h \leq 1 - c_1 + c_1p_1 \quad\text{by (24.2) and (4).}$$

Similarly $e/h \leq 1 - c_2 + c_2p_2$. From the same equations it appears 
that $e/h \geq c_1p_1$ and $\geq c_2p_2$. 
 
¹ Boole's error is pointed out and a correct solution given in Mr. McColl's 
" Sixth Article on the Calculus of Equivalent Statements " (Proc. Lond. Math. 
Soc. vol. xxviii. p. 562). 
 
 
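∴ $u$ lies between 

$$\text{the greatest of } \begin{cases} c_1p_1 \\ c_2p_2 \end{cases} \text{ and the least of } \begin{cases} c_1p_1 + c_2p_2 \\ 1 - c_1(1 - p_1) \\ 1 - c_2(1 - p_2). \end{cases}$$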
 
 It will be seen that these numerical limits are the same as the 
 limits obtained by Boole for the roots of his equations. 
 
(56.2) Suppose that the à priori probabilities of the causes $c_1$ 
and $c_2$ are to be eliminated. The only limit we then have is 

$$u < p_1 + p_2.$$

(56.3) Suppose that one of the à priori probabilities $c_2$ is to be 
eliminated. We then have limits $c_1p_1 \leq u \leq 1 - c_1 + c_1p_1$. If, 
therefore, $c_1$ is large, $u$ does not differ widely from $c_1p_1$. 
 
 (56.4) Suppose Pi is to be eliminated. We then have 
 
 If therefore c^ is large or Cg small, u does not difier widely 
 from CiPj. 
 
(56.5) If $a_1/a_2h = a_1/h$, i.e. if our knowledge of each of the 
causes is independent, we have a closer approximation. For 

$$y = ea_1a_2/h = e/a_1a_2h \cdot a_1/a_2h \cdot a_2/h = e/a_1a_2h \cdot c_1c_2,$$
$$\therefore\; u = c_1p_1 + c_2p_2 - c_1c_2 \cdot e/a_1a_2h,$$
$$\therefore\; u \geq c_1p_1 + c_2p_2 - c_1c_2.$$
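
The limits of (56.1) and the closer lower limit of (56.5) are easily computed. The following is a minimal sketch; the function name `challenge_bounds`, the flag `independent_causes` and the sample values are hypothetical.

```python
from fractions import Fraction as F


def challenge_bounds(c1, p1, c2, p2, independent_causes=False):
    """Limits for u = e/h that do not involve y or z.

    Lower: the greatest of c1*p1 and c2*p2, and, if our knowledge of the
    causes is independent as in (56.5), also c1*p1 + c2*p2 - c1*c2.
    Upper: the least of c1*p1 + c2*p2, 1 - c1*(1 - p1), 1 - c2*(1 - p2).
    """
    lower = max(c1 * p1, c2 * p2)
    if independent_causes:
        lower = max(lower, c1 * p1 + c2 * p2 - c1 * c2)
    upper = min(c1 * p1 + c2 * p2, 1 - c1 * (1 - p1), 1 - c2 * (1 - p2))
    return lower, upper


# Hypothetical data of the problem.
c1, p1, c2, p2 = F(2, 5), F(5, 8), F(1, 2), F(3, 5)
print(challenge_bounds(c1, p1, c2, p2))
print(challenge_bounds(c1, p1, c2, p2, independent_causes=True))
```

The independence of the causes narrows the interval from below only; the upper limit is unaffected.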
 
(57) We may now generalise (56) and discuss the case of $n$ 
causes. If an event can only happen as a consequence of one 
or more of certain causes $A_1, A_2, \ldots A_n$, and if $c_r$ is the à priori 
probability of the cause $A_r$ and $p_r$ the probability that, if the 
cause $A_r$ be known to exist, the event E will occur : required the 
probability of E. 

This is Boole's problem VI. (Laws of Thought, p. 336). As 
the result of ten pages of mathematics, he finds the solution to be 
the root lying between certain limits of an equation of the $n$th 
degree which he cannot solve. I know no other discussion of the 
problem. The solution is as follows : 
 
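$$e/h = e\bar{a}_1/h + ea_1/h = e\bar{a}_1/h + e/a_1h \cdot a_1/h = e\bar{a}_1/h + c_1p_1 \qquad\text{(i.)}$$
$$e\bar{a}_1/h = e\bar{a}_1\bar{a}_2/h + e\bar{a}_1/a_2h \cdot a_2/h = e\bar{a}_1\bar{a}_2/h + c_2 \cdot e\bar{a}_1/a_2h,$$
$$e\bar{a}_1/a_2h = e/a_2h - ea_1/a_2h = p_2 - \frac{1}{c_2} \cdot ea_1a_2/h,$$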
 
 
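and, writing $\overline{\bar{a}_1\bar{a}_2}$ for the contradictory of $\bar{a}_1\bar{a}_2$ (i.e. for the 
presence of at least one of the first two causes), 

$$e\bar{a}_1\bar{a}_2a_3/h = e\bar{a}_1\bar{a}_2/a_3h \cdot c_3 = c_3\{e/a_3h - e\,\overline{\bar{a}_1\bar{a}_2}/a_3h\} = c_3p_3 - e\,\overline{\bar{a}_1\bar{a}_2}\,a_3/h,$$
$$\therefore\; e/h = e\bar{a}_1\bar{a}_2\bar{a}_3/h + c_1p_1 + c_2p_2 + c_3p_3 - ea_1a_2/h - e\,\overline{\bar{a}_1\bar{a}_2}\,a_3/h.$$

In general 

$$\begin{aligned}
e\bar{a}_1\bar{a}_2 \ldots \bar{a}_{r-1}/h &= e\bar{a}_1\bar{a}_2 \ldots \bar{a}_{r-1}\bar{a}_r/h + e\bar{a}_1\bar{a}_2 \ldots \bar{a}_{r-1}a_r/h \\
&= e\bar{a}_1 \ldots \bar{a}_r/h + e\bar{a}_1 \ldots \bar{a}_{r-1}/a_rh \cdot c_r \\
&= e\bar{a}_1 \ldots \bar{a}_r/h + c_r\{e/a_rh - e\,\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}/a_rh\} \\
&= e\bar{a}_1 \ldots \bar{a}_r/h + c_rp_r - e\,\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_r/h,
\end{aligned}$$

∴ finally we have $e/h = e\bar{a}_1 \ldots \bar{a}_n/h + \sum_1^n c_rp_r - \sum_2^n e\,\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_r/h$. 

But since the $n$ causes are supposed to be exhaustive 

$$e\bar{a}_1 \ldots \bar{a}_n/h = 0,$$
$$\therefore\; e/h = \sum_1^n c_rp_r - \sum_2^n e\,\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_r/h \qquad\text{(ii.)}$$

Let $e\,\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_r/h = n_r$ ; 

$$\text{then}\quad e/h = \sum_1^n c_rp_r - \sum_2^n n_r \qquad\text{(iii.)}$$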
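
Equation (iii.) can be verified by brute force for any number of causes. The sketch below builds a hypothetical joint distribution over the causes and the event, subject only to the condition that the event is impossible when no cause is present; `random_joint` and `check_formula_iii` are names introduced here for illustration.

```python
from fractions import Fraction as F
from itertools import product
import random


def random_joint(n, seed=0):
    """Hypothetical joint weights over (a_1, ..., a_n, e).

    The state with e present and every cause absent receives weight 0,
    i.e. the n causes are exhaustive.
    """
    rng = random.Random(seed)
    weights = {}
    for state in product((0, 1), repeat=n + 1):
        causes, e = state[:n], state[n]
        impossible = e == 1 and not any(causes)
        weights[state] = F(0) if impossible else F(rng.randint(1, 9))
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def check_formula_iii(n, seed=0):
    """Return e/h computed directly and by equation (iii.)."""
    joint = random_joint(n, seed)

    def pr(pred):
        return sum(w for s, w in joint.items() if pred(s[:n], s[n]))

    e_h = pr(lambda a, e: e)
    # c_r * p_r = a_r/h * e/a_r h, i.e. the probability of e together with a_r.
    sum_cp = sum(pr(lambda a, e, r=r: e and a[r]) for r in range(n))
    # n_r: e together with a_r and at least one earlier cause, for r = 2, ..., n.
    sum_n = sum(pr(lambda a, e, r=r: e and a[r] and any(a[:r])) for r in range(1, n))
    return e_h, sum_cp - sum_n


print(check_formula_iii(3))
print(check_formula_iii(4, seed=7))
```

The two returned values coincide for every seed, since (iii.) is an identity and does not depend on any independence among the causes.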
 
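(57.1) If our knowledge of the several causes is independent, 
if, that is to say, our knowledge of the existence of any one of 
them is not relevant to the probability of the existence of any 
other, so that $a_r/a_sh = a_r/h = c_r$, then 

$$\begin{aligned}
e\,\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_r/h &= e\,\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}/a_rh \cdot c_r \\
&= c_r \cdot e/\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_rh \cdot [1 - \bar{a}_1 \ldots \bar{a}_{r-1}/a_rh] \\
&= c_r[1 - (1 - c_1) \ldots (1 - c_{r-1})] \cdot e/\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_rh.
\end{aligned}$$

Let $e/\overline{\bar{a}_1 \ldots \bar{a}_{r-1}}\,a_rh = m_r$, 

$$\text{then}\quad e/h = \sum_{r=1}^{n} c_rp_r - \sum_{r=2}^{n} c_r\Bigl[1 - \prod_{s=1}^{r-1}(1 - c_s)\Bigr]m_r.$$

These results do not look very promising as they stand, but 
they lead to some useful approximations on the elimination of 
$m_r$ and $n_r$ and to some interesting special cases. 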
 
 
(57.2) From equation (i.) it follows that $e/h \geq c_1p_1$ and 
$e/h \leq 1 - c_1(1 - p_1)$ ; and from equation (ii.) that $e/h \leq \sum_1^n c_rp_r$ ; 
∴ $e/h$ lies between 

$$\text{the greatest of } \begin{cases} c_1p_1 \\ \;\vdots \\ c_np_n \end{cases} \text{ and the least of } \begin{cases} \sum_1^n c_rp_r \\ 1 - c_1(1 - p_1) \\ \;\vdots \\ 1 - c_n(1 - p_n). \end{cases}$$

(57.3) Further, if the causes are independent it follows from 
(57.1) that 

$$e/h \geq \sum_1^n c_rp_r - \sum_2^n c_r\Bigl[1 - \prod_1^{r-1}(1 - c_s)\Bigr],$$

so that $e/h$ lies between 

$$\text{the greatest of } \begin{cases} \sum_1^n c_rp_r - \sum_2^n c_r\bigl[1 - \prod_1^{r-1}(1 - c_s)\bigr] \\ c_1p_1 \\ \;\vdots \\ c_np_n \end{cases} \text{ and the least of } \begin{cases} \sum_1^n c_rp_r \\ 1 - c_1(1 - p_1) \\ \;\vdots \\ 1 - c_n(1 - p_n). \end{cases}$$

(57.4) Now consider the case in which $p_1 = p_2 = \ldots = p_n = 1$, 
i.e. in which any of the causes would be sufficient, and in which 
the causes are independent. Then $m_r = 1$ ; so that 

$$e/h = \sum_{r=1}^{n} c_r - \sum_{r=2}^{n} c_r\Bigl[1 - \prod_{s=1}^{r-1}(1 - c_s)\Bigr] = 1 - (1 - c_1)(1 - c_2) \ldots (1 - c_n).$$

(57.5) Let $c_1, c_2 \ldots c_n$ be small quantities so that their 
squares and products may be neglected. 

$$\text{Then}\quad e/h = \sum_1^n c_rp_r,$$

i.e. the smaller the probabilities of the causes the more do they 
approach the condition of being mutually exclusive.¹ 

¹ Boole arrives at this result, Laws of Thought, p. 345, but I doubt his proof. 

(57.6) The à posteriori probability of a particular cause $a_r$ 
after the event has been observed is 

$$a_r/eh = \frac{e/a_rh \cdot a_r/h}{e/h} = \frac{c_rp_r}{e/h}.$$

(This is Boole's problem IX., p. 357). 
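
The special case (57.4) and the à posteriori probability of (57.6) lend themselves to a short numerical check. The sketch below assumes independent causes each of which is sufficient; the function names and the sample probabilities are hypothetical.

```python
from fractions import Fraction as F
from math import prod


def event_probability_sufficient_causes(c):
    """(57.4): e/h = 1 - (1 - c_1)(1 - c_2)...(1 - c_n) for independent, sufficient causes."""
    return 1 - prod(1 - cr for cr in c)


def event_probability_by_series(c):
    """The same quantity from sum c_r - sum_{r>=2} c_r [1 - prod_{s<r} (1 - c_s)]."""
    total = sum(c)
    for r in range(1, len(c)):
        total -= c[r] * (1 - prod(1 - cs for cs in c[:r]))
    return total


def posterior_of_cause(c, p, r, e_h):
    """(57.6): a_r/eh = c_r * p_r / (e/h), with r counted from 0."""
    return c[r] * p[r] / e_h


c = [F(1, 2), F(1, 3), F(1, 4)]  # hypothetical a priori probabilities
p = [F(1), F(1), F(1)]           # each cause sufficient
e_h = event_probability_sufficient_causes(c)
print(e_h, event_probability_by_series(c), posterior_of_cause(c, p, 0, e_h))
```

The closed product and the series give the same value, and the posterior probabilities of the several causes need not sum to unity, since more than one cause may be present together.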
 
 
(58) The probability of the occurrence of a certain natural 
phenomenon under given circumstances is $p$. There is also a 
probability $a$ of a permanent cause of the phenomenon, i.e. of a 
cause which would always produce the event under the circumstances 
supposed. What is the probability that the phenomenon, 
being observed $n$ times, will occur the $n+1$th ? 

This is Boole's problem X. (Laws of Thought, p. 358). Boole 
arrives by his own method at the same result as that given below. 
It is necessary first of all to state the assumption somewhat 
more precisely. If $x_r$ asserts the occurrence of the event at the 
$r$th trial and $t$ the existence of the ' permanent cause ' we have 

$$x_r/h = p, \qquad t/h = a, \qquad x_r/th = 1,$$

and we require $x_{n+1}/x_1 \ldots x_nh = y_{n+1}$. 

It is also assumed that if there is no permanent cause the probability 
of $x_s$ is not affected by the observations $x_r$, etc., i.e. 

$$x_s/x_r \ldots \bar{t}h = x_s/\bar{t}h,^1$$

$$x_s/\bar{t}h = \frac{x_s\bar{t}/h}{\bar{t}/h} = \frac{x_s/h - x_st/h}{\bar{t}/h} = \frac{p - a}{1 - a}.$$

$$\begin{aligned}
x_r/x_1 \ldots x_{r-1}h &= x_rt/x_1 \ldots x_{r-1}h + x_r\bar{t}/x_1 \ldots x_{r-1}h \\
&= t/x_1 \ldots x_{r-1}h + x_r/\bar{t}x_1 \ldots x_{r-1}h \cdot \bar{t}/x_1 \ldots x_{r-1}h \\
&= \frac{x_1 \ldots x_{r-1}/th \cdot t/h}{x_1 \ldots x_{r-1}/h} + \frac{p - a}{1 - a} \cdot \frac{x_1 \ldots x_{r-1}/\bar{t}h \cdot \bar{t}/h}{x_1 \ldots x_{r-1}/h} \\
&= \frac{a}{y_1y_2 \ldots y_{r-1}} + \frac{p - a}{1 - a} \cdot \frac{\left(\frac{p - a}{1 - a}\right)^{r-1}(1 - a)}{y_1y_2 \ldots y_{r-1}},
\end{aligned}$$

$$\text{i.e.}\quad y_r = \frac{a + (p - a)\left(\frac{p - a}{1 - a}\right)^{r-1}}{y_1y_2y_3 \ldots y_{r-1}}.$$

Also $y_1 = p$ and $y_2 = \dfrac{a + (p - a)\frac{p - a}{1 - a}}{y_1}$, 

¹ This assumption, which is tacitly introduced by Boole, is not generally 
justifiable. I use it here, as my main purpose is to illustrate a method. The 
same problem, without this assumption, will be discussed in dealing with Pure 
Induction. 
 
 
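so that 

$$y_{n+1} = \frac{a + (p - a)\left(\frac{p - a}{1 - a}\right)^{n}}{a + (p - a)\left(\frac{p - a}{1 - a}\right)^{n-1}}.$$

(58.1) If $p = a$, $y_n = 1$ ; for if an event can only occur as the 
result of a permanent cause, a single occurrence makes future 
occurrences certain under similar conditions. 

(58.2) 

$$y_{n+1} - y_n = \frac{a(p - a)\left(\frac{p - a}{1 - a}\right)^{n-2}\left[1 - \frac{p - a}{1 - a}\right]^2}{\left[a + (p - a)\left(\frac{p - a}{1 - a}\right)^{n-1}\right]\left[a + (p - a)\left(\frac{p - a}{1 - a}\right)^{n-2}\right]}$$

(by easy algebra) ; 

and $p$ is always $> a$ and $< 1$. 

So that $(p - a)\left(\frac{p - a}{1 - a}\right)^{r}$ is positive and decreases as $r$ increases. 

As $n$ increases $y_n = 1 - \epsilon$, where 

$$\epsilon = \frac{(p - a)\left[1 - \frac{p - a}{1 - a}\right]\left(\frac{p - a}{1 - a}\right)^{n-2}}{a + (p - a)\left(\frac{p - a}{1 - a}\right)^{n-2}},$$

so that for any value of $\eta$ however small a value of $n$ can be 
found such that $\epsilon < \eta$ so long as $a$ is not zero. 

(58.3) $t_n$, the à posteriori probability of a permanent cause 
after $n$ successful observations, is 

$$t/x_1 \ldots x_nh = \frac{x_1 \ldots x_n/th \cdot t/h}{x_1 \ldots x_n/h} = \frac{a}{y_1y_2 \ldots y_n} = \frac{a}{a + (p - a)\left(\frac{p - a}{1 - a}\right)^{n-1}} ;$$

$$t_n = 1 - \epsilon', \quad\text{where}\quad \epsilon' = \frac{(p - a)\left(\frac{p - a}{1 - a}\right)^{n-1}}{a + (p - a)\left(\frac{p - a}{1 - a}\right)^{n-1}}.$$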
 
 
So that $t_n$ approaches the limit unity as $n$ increases, so long as $a$ 
is not zero. 
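
The behaviour of $y_{n+1}$ and $t_n$ in problem (58) may be illustrated numerically. The sketch below adopts the same assumption as the text, namely that in the absence of a permanent cause the trials are independent of one another; the function names and the values of $p$ and $a$ are hypothetical.

```python
from fractions import Fraction as F


def next_occurrence(p, a, n):
    """y_{n+1}: probability of occurrence at trial n+1 after n occurrences (n >= 1)."""
    q = (p - a) / (1 - a)  # chance of the event at any trial if there is no permanent cause
    return (a + (p - a) * q**n) / (a + (p - a) * q ** (n - 1))


def permanent_cause_posterior(p, a, n):
    """t_n: probability of a permanent cause after n occurrences (n >= 1)."""
    q = (p - a) / (1 - a)
    return a / (a + (p - a) * q ** (n - 1))


def next_occurrence_direct(p, a, n):
    """The same y_{n+1} from the mixture directly.

    With probability a every trial succeeds; otherwise the trials succeed
    independently with probability q.
    """
    q = (p - a) / (1 - a)

    def run(k):  # probability of k occurrences in succession
        return a + (1 - a) * q**k

    return run(n + 1) / run(n)


p, a = F(1, 2), F(1, 10)  # hypothetical data
for n in (1, 2, 5, 20):
    print(n, float(next_occurrence(p, a, n)), float(permanent_cause_posterior(p, a, n)))
```

Both quantities rise towards unity as $n$ increases, provided that $a$ is not zero, and the closed form agrees with the direct computation from the mixture.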
 
3. The following is a common type of statistical problem.¹ 
We are given a series of measurements, or observations, or 
estimates of the true value of a given quantity ; and we wish to 
determine what function of these measurements will yield us 
the most probable value of the quantity, on the basis of this 
evidence. The problem is not determinate unless we have some 
good ground for making an assumption as to how likely we are 
in each case to make errors of given magnitudes. But such an 
assumption, with or without justification, is frequently made. 

The functions of the original measurements which we commonly 
employ, in order to yield us approximations to the most 
probable value of the quantity measured, are the various kinds 
of means or averages, the arithmetic mean, for example, or 
the median. The relation, which we assume, between errors of 
 different magnitudes and the probabilities that we have made 
 errors of those magnitudes, is called a law of error. Corresponding 
 to each law of error which we might assume, there is some function 
 of the measurements which represents the most probable value 
 of the quantity. The object of the following paragraphs is to 
 discover what laws of error, if we assume them, correspond to 
 each of the simple types of average, and to discover this by means 
 of a systematic method. 
 
4. Let us assume that the real value of the quantity is either 
$b_1, \ldots b_r \ldots b_n$, and let $a_r$ represent the conclusion that the 
value is, in fact, $b_r$. Further let $x_r$ represent the evidence that 
a measurement has been made of magnitude $y_r$. 

If a measurement $y_p$ has been made, what is the probability 
that the real value is $b_s$ ? The application of the theorem of 
inverse probability yields the following result : 

$$a_s/x_ph = \frac{x_p/a_sh \cdot a_s/h}{\sum_{r=1}^{n} x_p/a_rh \cdot a_r/h}$$

(the number of possible values of the quantity being $n$), where 
$h$ stands for any other relevant evidence which we may have, 
in addition to the fact that a measurement $y_p$ has been made. 

Next, let us suppose that a number of measurements $y_1 \ldots y_m$ 

¹ The substance of §§ 3-7 has been printed in the Journal of the Royal 
Statistical Society, vol. lxxiv. p. 323 (February 1911). 
 
 
have been made ; what is now the probability that the real value 
is $b_s$ ? We require the value of $a_s/x_1x_2 \ldots x_mh$. As before, 

$$a_s/x_1x_2 \ldots x_mh = \frac{x_1 \ldots x_m/a_sh \cdot a_s/h}{\sum_{r=1}^{n} x_1 \ldots x_m/a_rh \cdot a_r/h}.$$
 
At this point we must introduce the simplifying assumption 
that, if we knew the real value of the quantity, the different 
measurements of it would be independent, in the sense that a 
knowledge of what errors have actually been made in some of 
the measurements would not affect in any way our estimate of 
what errors are likely to be made in the others. We assume, 
in fact, that $x_r/x_p \ldots x_qa_sh = x_r/a_sh$. This assumption is 
exceedingly important. It is tantamount to the assumption that 
our law of error is unchanged throughout the series of observations 
in question. The general evidence $h$, that is to say, which justifies 
our assumption of the particular law of error which we do assume, 
is of such a character that a knowledge of the actual errors made 
in a number of measurements, not more numerous than those 
in question, is absolutely or approximately irrelevant to the 
question of what form of law we ought to assume. The law 
of error which we assume will be based, presumably, on an 
experience of the relative frequency with which errors of different 
magnitudes have been made under analogous circumstances in 
the past. The above assumption will not be justified if the 
additional experience, which a knowledge of the errors in the new 
measurements would supply, is sufficiently comprehensive, relatively 
to our former experience, to be capable of modifying our 
assumption as to the shape of the law of error, or if it suggests 
that the circumstances, in which the measurements are being 
carried out, are not so closely analogous as was originally supposed. 
 
With this assumption, i.e. that $x_1$, etc., are independent of 
one another relatively to evidence $a_sh$, etc., it follows from the 
ordinary rule for the multiplication of independent probabilities 
that 

$$x_1 \ldots x_m/a_sh = \prod_{q=1}^{q=m} x_q/a_sh.$$

$$\text{Hence}\quad a_s/x_1x_2 \ldots x_mh = \frac{a_s/h \cdot \prod_{q=1}^{q=m} x_q/a_sh}{\sum_{r=1}^{r=n}\Bigl[\prod_{q=1}^{q=m} x_q/a_rh \cdot a_r/h\Bigr]}.$$
 
 
The most probable value of the quantity under measurement, 
given the $m$ measurements $y_1$, etc. (which is our quaesitum), is 
therefore that value which makes the above expression a maximum. 
Since the denominator is the same for all values of $b_s$, 
we must find the value which makes the numerator a maximum. 
Let us assume that $a_1/h = a_2/h = \ldots = a_n/h$. We assume, that 
is to say, that we have no reason à priori (i.e. before any measurements 
have been made) for thinking any one of the possible 
values of the quantity more likely than any other. We require, 
therefore, the value of $b_s$ which makes the expression $\prod_{q=1}^{q=m} x_q/a_sh$ 
a maximum. Let us denote this value by $y$. 
 
We can make no further progress without a further assumption. 
Let us assume that $x_q/a_sh$, namely, the probability of a 
measurement $y_q$ assuming the real value to be $b_s$, is an algebraic 
function $f$ of $y_q$ and $b_s$, the same function for all values of $y_q$ and 
$b_s$ within the limits of the problem.¹ We assume, that is to say, 
$x_q/a_sh = f(y_q, b_s)$, and we have to find the value of $b_s$, namely $y$, 
which makes $\prod_{q=1}^{q=m} f(y_q, y)$ a maximum. Equating to zero the 
differential coefficient of this expression with respect to $y$, we 
have $\sum_{q=1}^{q=m} \dfrac{f'(y_q, y)}{f(y_q, y)} = 0$,² where $f' = \dfrac{df}{dy}$. This equation may be 
written for brevity in the form $\sum \dfrac{f'_q}{f_q} = 0$. 

If we solve this equation for $y$, the result gives us the value of 
the quantity under observation, which is most probable relatively 
to the measurements we have made. 
 
 The act of differentiation assumes that the possible values of y 
 are so numerous and so uniformly distributed within the range 
 in question, that we may, without sensible error, regard them as 
 continuous. 
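
 The procedure of this paragraph can be tried numerically. The
 following is only a sketch under stated assumptions: the function
 names, the sample measurements and the grid of candidate values are
 illustrative and are not taken from the text. It maximises the sum
 of log f(y_q, y), which is equivalent to maximising the product,
 and uses the normal law as the example of f.

```python
def most_probable_value(measurements, log_f, candidates):
    """Return the candidate y that maximises sum(log f(y_q, y)).

    Maximising the sum of logarithms is the same as maximising the
    product of the f(y_q, y), since log is increasing and no
    f(y_q, y) vanishes.
    """
    return max(candidates,
               key=lambda y: sum(log_f(yq, y) for yq in measurements))


def log_normal_law(yq, y, k=1.0):
    # Logarithm of A * exp(-k^2 (y - y_q)^2), with the constant log A dropped.
    return -(k ** 2) * (y - yq) ** 2


# Hypothetical measurements and a grid standing in for the
# "continuous" range of possible values.
measurements = [9.8, 10.1, 10.4, 9.9]
grid = [9.0 + i / 1000 for i in range(2001)]   # 9.000 ... 11.000

best = most_probable_value(measurements, log_normal_law, grid)
arithmetic_mean = sum(measurements) / len(measurements)
print(best, arithmetic_mean)
```

 With the normal law the grid maximum falls on the arithmetic mean of
 the measurements, to within the spacing of the grid.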
 
 5. This completes the prolegomena of the inquiry. We are 
 
 ^1 Gauss, in obtaining the normal law of error, made, in effect, the more
 special assumption that $x_q / a_s h$ is a function of $e_q$ only, where $e_q$ is
 the error and $e_q = b_s - y_q$. We shall find in the sequel that all symmetrical
 laws of error, such that positive and negative errors of the same absolute
 magnitude are equally likely, satisfy this condition: the normal law, for
 example, and the simplest median law. But other laws, such as those which
 lead to the geometric mean, do not satisfy it.

 ^2 Since none of the measurements actually made can be impossible, none of
 the expressions $f(y_q, y)$ can vanish.
 
CH. XVII FUNDAMENTAL THEOREMS 197
 
 now in a position to discover what laws of error correspond to
 given assumptions respecting the algebraic relation between the
 measurements and the most probable value of the quantity, and
 vice versa. For the law of error determines the form of $f(y_q, y)$.
 And the form of $f(y_q, y)$ determines the algebraic relation
 $\sum \frac{f'_q}{f_q} = 0$ between the measurements and the most probable
 value. It may be well to repeat that $f(y_q, y)$ denotes the probability to
 us that an observer will make a measurement $y_q$ in observing a
 quantity whose true value we know to be y. A law of error tells
 us what this probability is for all possible values of $y_q$ and y
 within the limits of the problem.
 
 (i.) If the most probable value of the quantity is equal to the
 arithmetic mean of the measurements, what law of error does this
 imply?
 
 $\sum \frac{f'_q}{f_q} = 0$ must be equivalent to $\sum (y - y_q) = 0$, since the
 most probable value y must equal $\frac{1}{m} \sum_{q=1}^{q=m} y_q$.

 $\therefore \frac{f'_q}{f_q} = \phi''(y)(y - y_q)$, where $\phi''(y)$ is some function which
 is not zero and is independent of $y_q$.

 Integrating,
 $$\log f_q = \int \phi''(y)(y - y_q)\,dy + \psi(y_q) = \phi'(y)(y - y_q) - \phi(y) + \psi(y_q),$$
 where $\psi(y_q)$ is some function independent of y.

 So that $f_q = e^{\phi'(y)(y - y_q) - \phi(y) + \psi(y_q)}$.

 Any law of error of this type, therefore, leads to the arithmetic
 mean of the measurements as the most probable value of the
 quantity measured.

 If we put $\phi(y) = -k^2 y^2$ and $\psi(y_q) = -k^2 y_q^2 + \log A$, we obtain
 $f_q = Ae^{-k^2 (y - y_q)^2}$, the form normally assumed,
 $= Ae^{-k^2 z_q^2}$, where $z_q$ is the absolute magnitude of the error in
 the measurement $y_q$.
 
 This is, clearly, only one amongst a number of possible solu- 
 tions. But with one additional assumption we can prove that 
 this is the only law of error which leads to the arithmetic mean. 
 
 
 Let us assume that negative and positive errors of the same
 absolute amount are equally likely.

 In this case $f_q$ must be of the form $Be^{\theta (y - y_q)^2}$,
 $$\therefore \phi'(y)(y - y_q) - \phi(y) + \psi(y_q) = \theta (y - y_q)^2.$$

 Differentiating with respect to y,
 $$\phi''(y) = 2 \frac{d}{d(y - y_q)^2} \theta (y - y_q)^2.$$

 But $\phi''(y)$ is, by hypothesis, independent of $y_q$.

 $\therefore \frac{d}{d(y - y_q)^2} \theta (y - y_q)^2 = -k^2$, where k is constant ; integrating,
 $\theta (y - y_q)^2 = -k^2 (y - y_q)^2 + \log C$, and we have $f_q = Ae^{-k^2 (y - y_q)^2}$ (where
 A = BC).
 
 (ii.) What is the law of error, if the geometric mean of the
 measurements leads to the most probable value of the quantity?

 In this case $\sum \frac{f'_q}{f_q} = 0$ must be equivalent to $\prod_{q=1}^{q=m} y_q = y^m$, i.e. to
 $\sum \log \frac{y_q}{y} = 0$.

 Proceeding as before, we find that the law of error is
 $$f_q = Ae^{\phi'(y) \log \frac{y_q}{y} + \int \frac{\phi'(y)}{y} dy + \psi(y_q)}.$$
 There is no solution of this which satisfies the condition that
 negative and positive errors of the same absolute magnitude are
 equally likely. For we must have
 $$\phi'(y) \log \frac{y_q}{y} + \int \frac{\phi'(y)}{y} dy + \psi(y_q) = \theta (y - y_q)^2$$
 or
 $$\phi''(y) \log \frac{y_q}{y} = \frac{d}{dy} \theta (y - y_q)^2,$$
 which is impossible.

 The simplest law of error, which leads to the geometric mean,
 seems to be obtained by putting $\phi'(y) = -ky$, $\psi(y_q) = 0$. This
 gives $f_q = A \left( \frac{y}{y_q} \right)^{ky} e^{-ky}$.
 
 A law of error, which leads to the geometric mean of the 
 observations as the most probable value of the quantity, has been 
 previously discussed by Sir Donald McAlister (Proceedings of the
 Royal Society, vol. xxix. (1879) p. 365). His investigation depends
 upon the obvious fact that, if the geometric mean of the
 
 
 observations yields the most probable value of the quantity, the 
 arithmetic mean of the logarithms of the observations must yield 
 the most probable value of the logarithm of the quantity. Hence, 
 if we suppose that the logarithms of the observations obey the
 normal law of error (which leads to their arithmetic mean as the 
 most probable value of the logarithms of the quantity), we can 
 by substitution find a law of error for the observations themselves 
 which must lead to the geometric mean of them as the most 
 probable value of the quantity itself. 
 
 If, as before, the observations are denoted by $y_q$, etc., and the
 quantity by y, let their logarithms be denoted by $l_q$, etc., and by
 l. Then, if $l_q$, etc., obey the normal law of error,
 $f(l_q, l) = Ae^{-k^2 (l_q - l)^2}$.
 Hence the law of error for $y_q$, etc., is determined by
 $$f(y_q, y) = Ae^{-k^2 (\log y_q - \log y)^2} = Ae^{-k^2 \left( \log \frac{y_q}{y} \right)^2},$$
 and the most probable value of y must, clearly, be the geometric
 mean of $y_q$, etc.
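
 The substitution can be checked numerically. What follows is a
 minimal sketch, with hypothetical observations and an assumed grid
 of candidate values; the helper names are illustrative only. It
 computes the geometric mean as the exponential of the arithmetic
 mean of the logarithms, and confirms that the same value maximises
 the product of the laws $Ae^{-k^2 (\log (y_q / y))^2}$.

```python
import math


def geometric_mean(values):
    # Exponential of the arithmetic mean of the logarithms.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def log_mcalister_law(yq, y, k=1.0):
    # log of A * exp(-k^2 (log(y_q / y))^2), with the constant log A dropped.
    return -(k ** 2) * math.log(yq / y) ** 2


observations = [2.0, 4.0, 8.0]                      # hypothetical values
candidates = [1.0 + i / 1000 for i in range(9001)]  # 1.000 ... 10.000

best = max(candidates,
           key=lambda y: sum(log_mcalister_law(yq, y) for yq in observations))
print(best, geometric_mean(observations))
```

 For these observations the arithmetic mean is 14/3, whereas the grid
 maximum sits at the geometric mean, 4.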
 
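 This is the law of error which was arrived at by Sir Donald
 McAlister. It can easily be shown that it is a special case of the
 generalised form which I have given above of all laws of error
 leading to the geometric mean. For if we put $\psi(y_q) = -k^2 (\log y_q)^2$,
 and $\phi'(y) = 2k^2 \log y$, we have
 $$f_q = Ae^{2k^2 \log y \log \frac{y_q}{y} + k^2 (\log y)^2 - k^2 (\log y_q)^2} = Ae^{-k^2 \left( \log \frac{y_q}{y} \right)^2}.$$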
 
 A similar result has been obtained by Professor J. C. Kapteyn.^1
 But he is investigating frequency curves, not laws of error, and
 this result is merely incidental to his main discussion. His
 method, however, is not unlike a more generalised form of Sir
 Donald McAlister's. In order to discover the frequency curve
 of certain quantities y, he supposes that there are certain other
 quantities z, functions of the quantities y, which are given by
 z = F(y), and that the frequency curve of these quantities z is
 normal. By this device he is enabled in the investigation of a 
 type of skew frequency curve, which is likely to be met with 
 often, to utilise certain statistical constants corresponding to 
 
 ^1 Skew Frequency Curves, p. 22, published by the Astronomical Laboratory
 at Groningen (1903).
 
 
 those which have been already calculated for the normal
 curve.
 
 In fact the main advantage both of Sir Donald McAlister's 
 law of error and of Professor Kapteyn's frequency curves lies in 
 the possibility of adapting without much trouble to unsymmetrical 
 phenomena numerous expressions which have been already 
 calculated for the normal law of error and the normal curve of 
 frequency.^ 
 
 This method of proceeding from arithmetic to geometric laws
 of error is clearly capable of generalisation. We have dealt with
 the geometric law which can be derived from the normal arithmetic
 law. Similarly if we start from the simplest geometric
 law of error, namely, $f_q = A \left( \frac{y}{y_q} \right)^{ky} e^{-ky}$, we can easily find, by
 writing $\log y = l$ and $\log y_q = l_q$, the corresponding arithmetic
 law, namely, $f_q = Ae^{k e^l (l - l_q) - k e^l}$, which is obtained from the
 generalised arithmetic law by putting $\phi(l) = k e^l$ and $\psi(l_q) = 0$.
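 And, in general, corresponding to the arithmetic law
 $$f_q = Ae^{\phi'(y)(y - y_q) - \phi(y) + \psi(y_q)}$$
 we have the geometric law
 $$f_q = Ae^{\phi_1'(z) \log \frac{z_q}{z} + \int \frac{\phi_1'(z)}{z} dz + \psi_1(z_q)},$$
 where
 $y = \log z$, $y_q = \log z_q$, $\int \frac{\phi_1'(z)}{z} dz = -\phi(\log z)$ and $\psi_1(z_q) = \psi(\log z_q)$.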
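 (iii.) What law of error does the harmonic mean imply?

 In this case, $\sum \frac{f'_q}{f_q} = 0$ must be equivalent to $\sum \left( \frac{1}{y_q} - \frac{1}{y} \right) = 0$.

 Proceeding as before, we find that
 $$f_q = Ae^{\phi'(y) \left( \frac{1}{y_q} - \frac{1}{y} \right) - \int \frac{\phi'(y)}{y^2} dy + \psi(y_q)}.$$
 A simple form of this is obtained by putting $\phi'(y) = -k^2 y^2$ and
 $\psi(y_q) = -k^2 y_q$. Then $f_q = Ae^{-k^2 \frac{(y - y_q)^2}{y_q}} = Ae^{-k^2 \frac{z_q^2}{y_q}}$. With this law,
 positive and negative errors of the same absolute magnitude are
 not equally likely.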
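
 A small numerical check of this simple harmonic law may be useful.
 It is only a sketch: the measurements and the grid are hypothetical,
 and the helper names are not from the text. It confirms that the
 maximum of the product of the $f_q$ falls at the harmonic mean, and
 that equal positive and negative errors receive unequal
 probabilities.

```python
def harmonic_mean(values):
    # Number of values divided by the sum of their reciprocals.
    return len(values) / sum(1.0 / v for v in values)


def log_harmonic_law(yq, y, k=1.0):
    # log of A * exp(-k^2 (y - y_q)^2 / y_q), with the constant log A dropped.
    return -(k ** 2) * (y - yq) ** 2 / yq


measurements = [1.0, 2.0, 4.0]                        # hypothetical values
candidates = [0.5 + i / 10000 for i in range(40001)]  # 0.5000 ... 4.5000

best = max(candidates,
           key=lambda y: sum(log_harmonic_law(yq, y) for yq in measurements))

# Errors of the same absolute magnitude (1) about a true value of 2:
too_high = log_harmonic_law(3.0, 2.0)   # measurement exceeds the true value
too_low = log_harmonic_law(1.0, 2.0)    # measurement falls short of it
print(best, harmonic_mean(measurements), too_high, too_low)
```

 Here the harmonic mean is 12/7, and a measurement falling short of
 the true value by a given amount is less probable under this law
 than one exceeding it by the same amount.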
 
 (iv.) If the most probable value of the quantity is equal to the 
 median of the measurements, what is the law of error ? 
 
 The median is usually defined as the measurement which 
 
 ^ It may be added that Professor Kapteyn's monograph brings forward
 considerations which would be extremely valuable in determining the types of
 phenomena to which geometric laws of error are likely to be applicable.
 
 
 occupies the middle position when the measurements are ranged
 in order of magnitude. If the number of measurements m is odd,
 the most probable value of the quantity is the $\frac{m+1}{2}$th, and, if the
 number is even, all values between the $\frac{m}{2}$th and the $\left( \frac{m}{2} + 1 \right)$th are
 equally probable amongst themselves and more probable than
 any other. For the present purpose, however, it is necessary to
 make use of another property of the median, which was known
 to Fechner (who first introduced the median into use) but which
 seldom receives as much attention as it deserves. If y is the
 median of a number of magnitudes, the sum of the absolute differences
 (i.e. the difference always reckoned positive) between y and each of
 the magnitudes is a minimum. The median y of $y_1, y_2 \ldots y_m$ is
 found, that is to say, by making $\sum_{1}^{m} |y_q - y|$ a minimum, where
 $|y_q - y|$ is the difference always reckoned positive between $y_q$
 and y.
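
 This minimum property is easily illustrated. The sketch below uses
 hypothetical magnitudes and an assumed grid of candidate values; it
 shows the odd case, where the minimum is unique, and the even case,
 where every value between the two middle magnitudes gives the same
 sum.

```python
import statistics


def sum_abs_diff(values, y):
    # Sum of the differences between y and each value, always reckoned positive.
    return sum(abs(v - y) for v in values)


# Odd number of magnitudes: the minimum is at the middle one.
odd = [3.0, 1.0, 10.0, 4.0, 7.0]
candidates = [i / 100 for i in range(1201)]   # 0.00 ... 12.00
best = min(candidates, key=lambda y: sum_abs_diff(odd, y))
print(best, statistics.median(odd))

# Even number: all values between the two middle magnitudes tie.
even = [1.0, 3.0, 4.0, 7.0]
print([sum_abs_diff(even, y) for y in (3.0, 3.5, 4.0)])
```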
 
 We can now return to the investigation of the law of error
 corresponding to the median.

 Write $|y - y_q| = z_q$. Then since $\sum_{1}^{m} z_q$ is to be a minimum we
 must have $\sum_{1}^{m} \frac{y - y_q}{z_q} = 0$. Whence, proceeding as before, we have
 $$f_q = Ae^{\int \frac{y - y_q}{z_q} \phi''(y)\,dy + \psi(y_q)}.$$
 The simplest case of this is obtained by putting
 $$\phi''(y) = -k^2, \qquad \psi(y_q) = \frac{y - y_q}{z_q} k^2 y_q,$$
 whence $f_q = Ae^{-k^2 |y - y_q|} = Ae^{-k^2 z_q}$.
 
 This satisfies the additional condition that positive and negative
 errors of equal magnitude are equally likely. Thus in this
 important respect the median is as satisfactory as the arithmetic
 mean, and the law of error which leads to it is as simple. It also
 resembles the normal law in that it is a function of the error only,
 and not of the magnitude of the measurement as well.

 The median law of error, $f_q = Ae^{-k^2 z_q}$, where $z_q$ is the absolute
 amount of the error always reckoned positive, is of some historical
 
 
 interest, because it was the earliest law of error to be formulated. 
 The first attempt to bring the doctrine of averages into definite 
 relation with the theory of probability and with laws of error was 
 published by Laplace in 1774 in a memoir "sur la probabilité des
 causes par les événemens."^1 This memoir was not subsequently
 incorporated in his Théorie analytique, and does not represent his
 more mature view. In the Théorie he drops altogether the law
 tentatively adopted in the memoir, and lays down the main lines 
 of investigation for the next hundred years by the introduction 
 of the normal law of error. The popularity of the normal law, 
 with the arithmetic mean and the method of least squares as its 
 corollaries, has been very largely due to its overwhelming ad- 
 vantages, in comparison with all other laws of error, for the pur- 
 poses of mathematical development and manipulation. And in 
 addition to these technical advantages, it is probably applicable 
 as a first approximation to a larger and more manageable group 
 of phenomena than any other single law. So powerful a hold 
 indeed did the normal law obtain on the minds of statisticians, 
 that until quite recent times only a few pioneers have seriously 
 considered the possibility of preferring in certain circumstances 
 other means to the arithmetic and other laws of error to the 
 normal. Laplace's earlier memoir fell, therefore, out of remem- 
 brance. But it remains interesting, if only for the fact that a 
 law of error there makes its appearance for the first time. 
 
 Laplace sets himself the problem in a somewhat simplified
 form : "Déterminer le milieu que l'on doit prendre entre trois
 observations données d'un même phénomène." He begins by
 assuming a law of error $z = \phi(y)$, where z is the probability of an
 error y ; and finally, by means of a number of somewhat arbitrary
 assumptions, arrives at the result $\phi(y) = \frac{m}{2} e^{-my}$. If this formula
 is to follow from his arguments, y must denote the absolute error,
 always taken positive. It is not unlikely that Laplace was led
 to this result by considerations other than those by which he
 attempts to justify it.
 
 Laplace, however, did not notice that his law of error led to
 the median. For, instead of finding the most probable value,
 which would have led him straight to it, he seeks the "mean of
 error," the value, that is to say, which the true value is as likely
 
 ^1 Mémoires présentés à l'Académie des Sciences, vol. vi.
 
 
 to fall short of as to exceed. This value is, for the median law, 
 laborious to find and awkward in the result. Laplace works it 
 out correctly for the case where the observations are no more 
 than three. 
 
 6. I do not think that it is possible to find by this method a
 law of error which leads to the mode. But the following general
 formulae are easily obtained :

 (v.) If $\sum \theta(y_q, y) = 0$ is the law of relation between the
 measurements and the most probable value of the quantity, then the law
 of error $f_q(y_q, y)$ is given by $f_q = Ae^{\int \theta(y_q, y) \phi''(y)\,dy + \psi(y_q)}$. Since $f_q$ lies
 between 0 and 1, $\int \theta(y_q, y) \phi''(y)\,dy + \psi(y_q) + \log A$ must be negative
 for all values of $y_q$ and y that are physically possible ; and, since
 the values of $y_q$ are between them exhaustive,
 $$\sum Ae^{\int \theta(y_q, y) \phi''(y)\,dy + \psi(y_q)} = 1,$$
 where the summation is for all terms that can be formed by giving
 $y_q$ every value à priori possible.
 
 (vi.) The most general form of the law of error, when it is
 assumed that positive and negative errors of the same magnitude
 are equally probable, is $Ae^{-k^2 f\left( (y - y_q)^2 \right)}$, where the most probable
 value of the quantity is given by the equation
 $$\sum (y - y_q) f'\left( (y - y_q)^2 \right) = 0, \quad \text{where } f'\left( (y - y_q)^2 \right) = \frac{d}{d(y - y_q)^2} f\left( (y - y_q)^2 \right).$$
 The arithmetic mean is a special case of this obtained by putting
 $f\left( (y - y_q)^2 \right) = (y - y_q)^2$ ; and the median is a special case obtained
 by putting $f\left( (y - y_q)^2 \right) = +\sqrt{(y - y_q)^2}$.

 We can obtain other special cases by putting
 $f\left( (y - y_q)^2 \right) = (y - y_q)^4$,
 when the law of error is $Ae^{-k^2 (y - y_q)^4}$ and the most probable values
 are the roots of $m y^3 - 3y^2 \sum y_q + 3y \sum y_q^2 - \sum y_q^3 = 0$ ; and by putting
 $f\left( (y - y_q)^2 \right) = \log (y - y_q)^2$, when the law of error is $\frac{A}{(y - y_q)^{2k^2}}$ and
 the most probable values the roots of $\sum \frac{1}{y - y_q} = 0$. In all these
 cases the law is a function of the error only.
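
 The fourth-power case can be worked through numerically. The
 following sketch is an illustration only: the measurements are
 hypothetical and the bisection routine is one convenient way,
 assumed here, of locating the real root. The cubic is the expansion
 of $\sum (y - y_q)^3 = 0$, whose left-hand side increases with y and
 so has a single real root lying between the least and the greatest
 measurement.

```python
def cubic_condition(measurements, y):
    # sum((y - y_q)^3), which vanishes at the most probable value
    # under the law A * exp(-k^2 (y - y_q)^4).
    return sum((y - yq) ** 3 for yq in measurements)


def expanded_cubic(measurements, y):
    # The same condition in expanded form:
    # m*y^3 - 3*y^2*sum(y_q) + 3*y*sum(y_q^2) - sum(y_q^3).
    m = len(measurements)
    s1 = sum(measurements)
    s2 = sum(yq ** 2 for yq in measurements)
    s3 = sum(yq ** 3 for yq in measurements)
    return m * y ** 3 - 3 * y ** 2 * s1 + 3 * y * s2 - s3


def most_probable_value_quartic(measurements, tol=1e-12):
    # Bisection between the smallest and largest measurement.
    lo, hi = min(measurements), max(measurements)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cubic_condition(measurements, mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


measurements = [0.0, 0.0, 3.0]   # hypothetical values
root = most_probable_value_quartic(measurements)
mean = sum(measurements) / len(measurements)
print(root, mean)
```

 For these measurements the condition reduces to $2y^3 + (y - 3)^3 = 0$,
 so that the most probable value is $3 / (1 + 2^{1/3})$, which lies
 further towards the outlying measurement than the arithmetic mean
 does.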
 
 7. These results may be summarised thus. We have 
 assumed : 
 
 (a) That we have no reason, before making measurements, for 
 
 
 supposing that the quantity we measure is more likely to have 
 any one of its possible values than any other. 
 
 (b) That the errors are independent, in the sense that a
 knowledge of how great an error has been made in one case does
 not affect our expectation of the probable magnitude of the error
 in the next.

 (c) That the probability of a measurement of given magnitude,
 when in addition to the à priori evidence the real value of the
 quantity is supposed known, is an algebraic function of this
 given magnitude of the measurement and of the real value of the
 quantity.

 (d) That we may regard the series of possible values as continuous,
 without sensible error.

 (e) That the à priori evidence permits us to assume a law of
 error of the type specified in (c) ; i.e. that the algebraic function
 referred to in (c) is known to us à priori.
 
 Subject to these assumptions, we have reached the following 
 conclusions : 
 
 (1) The most general form of the law of error is
 $f_q = Ae^{\int \theta(y_q, y) \phi''(y)\,dy + \psi(y_q)}$,
 leading to the equation $\sum \theta(y_q, y) = 0$, connecting the most probable
 value and the actual measurements, where y is the most probable
 value and $y_q$, etc., the measurements.

 (2) Assuming that positive and negative errors of the same
 absolute magnitude are equally likely, the most general form is
 $f_q = Ae^{-k^2 f\left( (y - y_q)^2 \right)}$, leading to the equation
 $\sum (y - y_q) f'\left( (y - y_q)^2 \right) = 0$,
 where $f'z = \frac{d}{dz} f z$. Of the special cases to which this form gives
 rise, the most interesting were

 (3) $f_q = Ae^{-k^2 (y - y_q)^2} = Ae^{-k^2 z_q^2}$, where $z_q = |y - y_q|$, leading to
 the arithmetic mean of the measurements as the most probable
 value of the quantity ; and

 (4) $f_q = Ae^{-k^2 z_q}$, leading to the median.

 (5) The most general form leading to the arithmetic mean is
 $f_q = Ae^{\phi'(y)(y - y_q) - \phi(y) + \psi(y_q)}$, with the special cases (3), and

 (6) $f_q = Ae^{k e^y (y - y_q) - k e^y}$.

 (7) The most general form leading to the geometric mean is
 $f_q = Ae^{\phi'(y) \log \frac{y_q}{y} + \int \frac{\phi'(y)}{y} dy + \psi(y_q)}$, with the special cases :

 (8) $f_q = A \left( \frac{y}{y_q} \right)^{ky} e^{-ky}$, and

 (9) $f_q = Ae^{-k^2 \left( \log \frac{y_q}{y} \right)^2}$.

 (10) The most general form leading to the harmonic mean is
 $f_q = Ae^{\phi'(y) \left( \frac{1}{y_q} - \frac{1}{y} \right) - \int \frac{\phi'(y)}{y^2} dy + \psi(y_q)}$, with the special
 case

 (11) $f_q = Ae^{-k^2 \frac{(y - y_q)^2}{y_q}} = Ae^{-k^2 \frac{z_q^2}{y_q}}$.

 (12) The most general form leading to the median is
 $f_q = Ae^{\int \frac{y - y_q}{z_q} \phi''(y)\,dy + \psi(y_q)}$,
 with the special case (4).

 In each of these expressions, $f_q$ is the probability of a measurement
 $y_q$, given that the true value is y.
 
 8. The doctrine of Means and the allied theory of Least 
 Squares comprise so extensive a subject-matter that they cannot 
 be adequately treated except in a volume primarily devoted to 
 them. As, however, they are one of the important practical 
 applications of the theory of probability, I am unwilling to pass
 them by entirely ; and the following discursive observations, 
 chiefly relating to the Normal Law of Error, will serve, taken in 
 conjunction with the paragraphs immediately preceding, to 
 illustrate the connection between the theories of this treatise 
 and the general treatment of averages. 
 
 9. The Claims of the Arithmetic Average. By definition the
 arithmetic average of a number of quantities is nothing more 
 than their arithmetic sum divided by their number. But the 
 utility of an average generally consists in our supposed right to 
 substitute, in certain cases, this single measure for the varying 
 measures of which it is a function. Sometimes this requires no 
 justification ; the word " average " is in these cases used for 
 the sake of shortness, and merely to summarise a set of facts : 
 as, for instance, when we say that the birth-rate in England is 
 greater than the birth-rate in France. 
 
 But there are other cases in which the average makes a more 
 substantial claim to add to our knowledge. After a number of 
 examiners of equal capacity have given varying marks to a 
 candidate for the same paper, it may be thought fair to allow 
 the candidate the average of the different marks allotted : and 
 in general if several estimates of a magnitude have been made,
 
 
 between the accuracy of which we have no reason to discriminate,
 we often think it reasonable to act as if the true magnitude were
 the average of the several measurements. Perhaps De Witt, in
 his report on Annuities to the States General in 1671,^1 was the
 first to use it scientifically. But as Leibniz points out : " Our 
 peasants have made use of it for a long time according to their 
 natural mathematics. For example, when some inheritance or 
 land is to be sold, they form three bodies of appraisers ; these 
 bodies are called Schurzen in Low Saxon, and each body makes 
 an estimate of the property in question. Suppose, then, that 
 the first estimates its value to be 1000 crowns, the second, 1400, 
 the third, 1500 ; the sum of these three estimates is taken, viz.
 3900, and because they were three bodies, the third, i.e. 1300, is
 taken as the mean value asked for. This is the axiom : aequalibus
 aequalia, equal suppositions must have equal consideration."^2
 
 But this is a very inadequate axiom. Equal suppositions 
 would have equal consideration, if the three estimates had been 
 multiplied together instead of being added. The truth is that 
 at all times the arithmetic mean has had simplicity to recommend 
 it. It is always easier to add than to multiply. But simplicity 
 is a dangerous criterion : "La nature," says Fresnel, "ne s'est
 pas embarrassée des difficultés d'analyse, elle n'a évité que la
 complication des moyens."
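
 The point may be seen in Leibniz's own figures. The sketch below
 takes the three estimates quoted above and compares the peasants'
 rule with the alternative of multiplying the estimates together;
 the variable names are illustrative only. Both rules are unaffected
 by the order in which the estimates are taken, so that both give
 equal suppositions equal consideration.

```python
import math

estimates = [1000, 1400, 1500]   # crowns, as quoted from Leibniz

# Rule of the appraisers: add the estimates and take the third part.
arithmetic = sum(estimates) / len(estimates)

# Alternative: multiply the estimates and take the cube root.
geometric = math.prod(estimates) ** (1 / len(estimates))

print(arithmetic, geometric)
```

 The two rules treat the three bodies of appraisers alike, yet the
 second gives a value somewhat below 1300 crowns; the axiom alone
 does not decide between them.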
 
 With Laplace and Gauss there began a series of attempts to
 prove the worth of the arithmetic mean. It was discovered that
 its use involved the assumption of a particular type of law of
 error for the à priori probabilities of given errors. It was also
 found that the assumption of this law led on to a more complicated
 rule, known as the Method of Least Squares, for combining
 the results of observations which contain more than one
 doubtful quantity. In spite of a popular belief that, whilst the
 Arithmetic Mean is intuitively obvious, the Method of Least
 Squares depends upon doubtful and arbitrary assumptions, it
 can be demonstrated that the two stand and fall together.^3
 
 ^1 De vardye van de lif-renten na proportie van de los-renten. The Hague, 1671.

 ^2 Nouveaux Essais. Engl. transl. p. 540.

 ^3 Venn (Logic of Chance, p. 40) thinks that the Normal Law of Error and
 the Method of Least Squares "are not only totally distinct things, but they have
 scarcely even any necessary connection with each other. The Law of Error
 is the statement of a physical fact. . . . The Method of Least Squares, on the
 other hand, is not a law at all in the scientific sense of the term. It is simply
 a rule or direction. . . ."
 
 
 The analytical theorems of Laplace and Gauss are complicated,
 but the special assumptions upon which they are based are easily
 stated.^1 Gauss supposes (a) that the probability of a given error
 is a function of the error only and not also of the magnitude of
 the observation, (b) that the errors are so small that their cubes
 and higher powers may be neglected. Assumption (a) is arbitrary,^2
 and Gauss did not state it explicitly. These two assumptions,
 together with certain others, lead us to the result. For
 let $\phi(z)$ be the law of error where z is the error, and let us assume,
 as it always is assumed in these proofs, that $\phi(z)$ can be expanded
 by Maclaurin's Theorem. Then
 $$\phi(z) = \phi(0) + z\phi'(0) + \frac{z^2}{2!}\phi''(0) + \frac{z^3}{3!}\phi'''(0) + \ldots$$
 It is also supposed that positive and negative
 errors are equally probable, i.e. $\phi(z) = \phi(-z)$, so that $\phi'(0)$ and
 $\phi'''(0)$ vanish. Since we may neglect $z^4$ in comparison with $z^2$,
 $\phi(z) = \phi(0) + \frac{1}{2} z^2 \phi''(0)$. But (neglecting $z^4$ and higher powers)
 $a + bz^2 = ae^{\frac{b}{a} z^2}$, so that $\phi(z) = ae^{\frac{b}{a} z^2}$.

 Gauss's proof looks much more complicated than this, but he
 obtains the form $ae^{\frac{b}{a} z^2}$ by neglecting higher powers of z, so that
 this expression is really equivalent to $a + bz^2$. By this approximation
 he has reduced all the possible laws to an equivalent
 form.^3 It is true, therefore, that the normal law of error is, to
 the second power of the error, equivalent to any law of error,
 which is a function of the error only, and for which positive and
 negative errors are equally probable. Laplace also introduces
 assumptions equivalent to these.
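
 The equivalence to the second power of the error can be checked
 numerically. The following is a sketch with hypothetical constants
 a and b (chosen so that the law falls off as the error grows); the
 function names are illustrative. The difference between the two
 forms should behave like $\frac{b^2}{2a} z^4$, and so should shrink about
 sixteenfold when z is halved.

```python
import math


def quadratic_law(z, a, b):
    # The law of error truncated at the second power of the error.
    return a + b * z ** 2


def normal_form(z, a, b):
    # The exponential form obtained by neglecting higher powers of z.
    return a * math.exp(b * z ** 2 / a)


a, b = 1.0, -0.5   # hypothetical constants, not taken from the text

gap_large = normal_form(0.1, a, b) - quadratic_law(0.1, a, b)
gap_small = normal_form(0.05, a, b) - quadratic_law(0.05, a, b)
ratio = gap_large / gap_small
print(gap_large, gap_small, ratio)
```

 The two expressions thus agree in their constant and second-power
 terms and differ only in terms of the fourth power and above.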
 
 While mathematicians have endeavoured to establish the
 normal law of error and the arithmetic mean as a law of logic,
 
 ^1 For an account of the three principal methods of arriving at the Method
 of Least Squares and the Arithmetic Mean, see Ellis, Least Squares. Gauss's
 first method is in the Theoria Motus, and his second in the Theoria
 Combinationis Observationum. Laplace's investigations are in chap. iv. of the
 second Book of the Théorie analytique. Laplace's method was improved by
 Poisson in the Connaissance des temps for 1827 and 1832.

 ^2 It does not follow, as G. Hagen argues (Grundzüge der
 Wahrscheinlichkeitsrechnung, p. 29), that, because a larger error is less
 probable than a smaller, therefore the probability of a given error is a
 function of its magnitude only.

 ^3 This is pointed out by Bertrand, Calcul des probabilités, p. 267.
 
 
 others have claimed for it the testimony of experience and have
 deemed it a law of nature.^1
 
 That this cannot be so, is evident. For suppose that $x_1, x_2 \ldots x_n$
 are a set of observations of an unknown quantity x. Then, by
 this principle, $x = \frac{1}{n} \sum x_r$ gives the most probable value of x. But
 suppose we had wished to determine $x^2$, our observations, assuming
 that we can multiply correctly, would be $x_1^2, x_2^2 \ldots x_n^2$,
 and the most probable value of $x^2 = \frac{1}{n} \sum x_r^2$. But
 $\left( \frac{1}{n} \sum x_r \right)^2 \neq \frac{1}{n} \sum x_r^2$.
 And in general, $\frac{1}{n} \sum f(x_r) \neq f\left( \frac{1}{n} \sum x_r \right)$. Nor is this a consideration
 which can safely be ignored in practice. For our "observations"
 are often the result of some manipulation, and the particular
 shape in which we get them is not necessarily fixed for us. It is
 not easy to say what the direct observation is. In particular if
 any such law of sensation, as that enunciated by Fechner, is true
 (i.e. that sensation varies as the logarithm of the stimulus), the
 arithmetic mean must break down as a practical rule in all cases
 where human sensation is part of the instrument by means of
 which the observations are recorded.^2
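
 A short computation makes the inconsistency concrete. The
 observations below are hypothetical and the names are illustrative
 only. The sketch compares the two routes to the square of the
 quantity, and also the two routes which arise if, as on Fechner's
 law, it is the logarithm of the stimulus that is recorded.

```python
import math

observations = [2.0, 3.0, 4.0]   # hypothetical observations of x
n = len(observations)

mean = sum(observations) / n

# Two routes to the most probable value of x^2.
square_of_mean = mean ** 2
mean_of_squares = sum(x ** 2 for x in observations) / n

# Two routes to the most probable value of log x.
log_of_mean = math.log(mean)
mean_of_logs = sum(math.log(x) for x in observations) / n

print(square_of_mean, mean_of_squares)
print(log_of_mean, mean_of_logs)
```

 Averaging the observations and then transforming does not agree with
 transforming and then averaging, so that the arithmetic mean cannot
 be the most probable value in both shapes at once.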
 
 Apart, however, from theoretical refutations, statisticians now 
 recognise that the arithmetic mean and the normal law of error 
 can only be applied to certain special classes of phenomena. 
 Quetelet^3 was, I think, the first to point this out. In England,
 Galton drew attention to the fact many years ago, and Professor
 Pearson^4 has shown "that the Gaussian-Laplace normal distribution
 is very far from being a general law of frequency
 distribution either for errors of observation or for the distribution
 of deviations from type such as occur in organic populations. . . .
 It is not even approximately correct, for example, in the distribu- 
 tion of barometric variations, of grades of fertility and incidence 
 of disease." 
 
 ^1 This is, of course, a very common point of view indeed. Cf. Bertrand,
 op. cit. p. 183: "Malgré les objections précédentes, la formule de Gauss doit
 être adoptée. L'observation la confirme : cela doit suffire dans les applications."

 ^2 This was noticed by Galton.

 ^3 E.g. Letters on the Theory of Probabilities, p. 114.

 ^4 On "Errors of Judgment, etc.," Phil. Trans. A, vol. cxcviii. pp. 235-299.
 The following quotation is from his memoir On the General Theory of Skew
 Correlation and Nonlinear Regression, where further references are given.
 
 
 The Arithmetic Mean occupies, therefore, no unique position ; 
 and it is worth while, from the point of view of probability, to 
 discuss the properties of other possible means and laws of error, 
 as, for example, on the lines indicated in the earlier part of this 
 chapter. 
 
 10. The Method of Least Squares. — The problem, to which this 
 method is applied, is no more than the application of the same 
 considerations, as those which we have just been discussing, to 
 cases where the relation between the observed measurements and 
 the quantity whose most probable value we require, involves 
 more than one unknown. 
 
 Owing to the surprising character of its conclusions, if they
 could be accepted as universally valid, and to the obscurity of 
 the mathematical fabric that has been reared on and about it, 
 this method has been surrounded by an unnecessary air of 
 mystery. It is true that in recent times scepticism has grown 
 at the expense of mystery. It is also true that just views have 
 been held by individuals for sixty years past, notably by Leslie 
 Ellis. But the old mistakes are not always corrected in the 
 current text-books, and even so useful and generally used a 
 treatise on Least Squares, as Professor Mansfield Merriman's, 
 opens with a series of very fallacious statements. 
 
 The controversial side of the Method of Least Squares is 
 purely logical ; in the later developments there is much elaborate 
 mathematics of whose correctness no one is in doubt. What it 
 is important to state with the utmost possible clearness is the 
 precise assumptions on which the mathematics is based ; when 
 these assumptions have been set forth, it remains to determine 
 their applicability in particular cases. 
 
 In dealing with averages we supposed ourselves to be pre- 
 sented with a number of direct observations of some quantity 
 which it is desired to determine. But it is obvious that direct 
 observations will be in many cases either impracticable or
 inconvenient ; and our natural course will be to measure certain
 other quantities which we know to bear fixed and invariable
 relations to the unknowns we wish to determine. In surveying,
 for instance, or in astronomy, we constantly prefer to take
 measurements of angles or distances in which we are not interested
 for their own sakes, but which bear known geometrical relation- 
 ships to the set of ultimate unknowns. 
 
 
 
 If we wish to determine the most probable values of a set of
 unknowns $x_1, x_2, x_3 \ldots x_r$, instead of obtaining a number of
 sets of direct observations of each, we may obtain a number of
 equations of observation of the following type :
 $$a_1 x_1 + a_2 x_2 + \ldots + a_r x_r = V_1,$$
 $$b_1 x_1 + b_2 x_2 + \ldots + b_r x_r = V_2,$$
 $$\ldots$$
 $$k_1 x_1 + k_2 x_2 + \ldots + k_r x_r = V_n,$$
 where $V_1$, etc., are the quantities directly observed, and the a's,
 b's, etc., are supposed known (n > r).

 We have in such a case n equations to determine r unknowns,
 and since the observations are likely to be inexact, there may be
 no precise solution whatever. In these circumstances we wish to
 know the most probable set of values of the x's warranted by
 these observations.
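
 A small instance of such discrepant equations may make the problem
 concrete. The sketch below assumes the normal law of error, under
 which the most probable values are those making the sum of the
 squared discrepancies a minimum; the coefficients, the observed
 quantities and the function names are all hypothetical, and the
 case is restricted to two unknowns so that the resulting pair of
 normal equations can be solved directly.

```python
def least_squares_two_unknowns(rows, observed):
    """Most probable x1, x2 for equations c1*x1 + c2*x2 = V.

    rows holds the known coefficients (c1, c2) of each equation of
    observation, observed the corresponding quantities V. The values
    returned minimise the sum of squared discrepancies.
    """
    s11 = sum(c1 * c1 for c1, _ in rows)
    s12 = sum(c1 * c2 for c1, c2 in rows)
    s22 = sum(c2 * c2 for _, c2 in rows)
    t1 = sum(c1 * v for (c1, _), v in zip(rows, observed))
    t2 = sum(c2 * v for (_, c2), v in zip(rows, observed))
    det = s11 * s22 - s12 * s12
    x1 = (t1 * s22 - t2 * s12) / det
    x2 = (s11 * t2 - s12 * t1) / det
    return x1, x2


def sum_of_squares(rows, observed, x1, x2):
    return sum((c1 * x1 + c2 * x2 - v) ** 2
               for (c1, c2), v in zip(rows, observed))


# Three equations, two unknowns: x1 = 1.0, x2 = 2.0, x1 + x2 = 3.3.
rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
observed = [1.0, 2.0, 3.3]

x1, x2 = least_squares_two_unknowns(rows, observed)
print(x1, x2, sum_of_squares(rows, observed, x1, x2))
```

 No pair of values satisfies all three equations exactly; the values
 found spread the discrepancy of 0.3 equally over the three
 observations.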
 
 The problem is precisely similar in kind to that dealt with 
 by averages and differs only in the degree of its complexity. It
 is the problem of finding the most probable solution of such a set 
 of discrepant equations of observation that the Method of Least 
 Squares claims to solve. 
 
 By 1750 the astronomers were obtaining such equations of 
 observation in the course of their investigations, and the question 
 arose as to the proper manner of their solution. Boscovich in 
 Italy, Mayer and Lambert in Germany, Laplace in France, Euler 
 in Russia, and Simpson in England proposed different methods
 of solution. Simpson, in 1757, was the first to introduce, by way 
 of simplification, the assumption or axiom that positive and 
 negative errors are equally probable.^ The Method of Least 
 Squares was first definitely stated by Legendre in 1805, who 
 proposed it as an advantageous method of adjusting observations. 
 This was soon followed by the ' proofs ' of Laplace and Gauss. 
 But it is easily shown that these proofs involve the normal law 
 of error $y = ke^{-h^2 x^2}$, and the theory of Least Squares simply
 develops the mathematical results of applying to equations of 
 observation, which involve more than one unknown, that law 
 
 ^ See Merriman's Method of Least Squares, p. 181, for an historical sketch,
 from which the above is taken. In 1877 Merriman published in the Trans- 
 actions of the Connecticut Academy a list of writings relating to the Method of 
 Least Squares and the theory of accidental errors of observation, which com- 
 prised 408 titles — classified as 313 memoirs, 72 books, 23 parts of books. 
 
 
 of error which leads to the Arithmetic Mean in the case of a single 
 unknown. 
 
 11. The Weighting of Averages. — It is necessary to recur to 
 the distinction made at the beginning of § 9 between the two 
 types to which our average, or, as it is generally termed in social 
 inquiries, our index number, may belong. The average or index 
 number may simply summarise a set of facts and give us the 
 actual value of a composite quantity, as, for example, the index 
 number of the cost of living. In such cases the composite 
 quantity, in which we are interested, need not contain precisely 
 the same number of units of each of the elementary quantities of 
 which it is composed, so that the ' weights,' which denote the 
 numbers of each elementary quantity appropriate to the com- 
 posite quantity, are part of the definition of the composite 
 quantity, and can no more be dispensed with than the magnitudes 
 of the elementary quantities themselves. Nor in such cases is 
 the rejection of discordant observations permissible ; if, that is 
 to say, some of the elementary quantities are subject to much 
 wider variation, or to variations of a different type than the 
 majority, that is no reason for rejecting them. 
 
 On the other hand, the individual items, out of which the 
 average is composed, may each be indications or approximate 
 estimates of some one single quantity ; and the average, instead 
 of representing the measure of a composite quantity, may be 
 selected as furnishing the most probable value of the single 
 quantity, given, as evidence of its magnitude, the values of the 
 various terms which make up the average. 
 
 If this is the character of our average, the problem of weighting 
 depends upon what we know about the individual observations 
 or samples or indications, out of which our average is to be built 
 up. The units in question may be known to differ in respects 
 relevant to the probable value of the quaesitum. Thus there 
 may be reasons, quite apart from the actual results of the indi- 
 vidual observations or samples, for trusting some of them more 
 than others. Our knowledge may indicate to us, in fact, that 
 the constants of the laws of error appropriate to the several 
 instances, even if the type of the law can be assumed to be 
 constant, should be varied according to the data we possess about 
 each. It may also indicate to us that the condition of independ- 
 ence between the instances, which the method of averages 
 presumes, is imperfectly satisfied, and consequently that our 
 mode of combining the instances in an average must be modified 
 accordingly. 
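
 One common way of letting the constants of the law of error vary between observations is sketched below in Python: each observation is weighted by the inverse square of its assumed standard error. This is the most probable value only under the added assumptions of independent observations and a normal law of error for each; the function name `weighted_mean` and the figures are hypothetical.

```python
def weighted_mean(values, standard_errors):
    """Average the values, trusting each in inverse proportion to its squared error.

    Assumes independent observations, each following a normal law of error
    with the given standard error (an assumption, not something the data prove).
    """
    weights = [1.0 / (s * s) for s in standard_errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical case: two measurements of one quantity, the second from an
# instrument believed, on prior grounds, to be half as precise as the first.
estimate = weighted_mean([10.0, 12.0], [1.0, 2.0])
unweighted = sum([10.0, 12.0]) / 2
```

 The weights here come from knowledge held before looking at the results, which is the case the text has in view; they are not derived from how far each observation lies from the rest.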
 
 Some modern statisticians, who, really influenced perhaps by 
 practical considerations, have been inclined to deprecate the 
 importance of weighting on theoretical grounds, have not always 
 been quite clear what kind of average they supposed themselves 
 to be dealing with. In particular, discussions of the question of 
 weighting in connection with index numbers of the value of 
 money have suffered from this confusion. It has not been clear 
 whether such index numbers really represent measures of a 
 composite quantity or whether they are probable estimates of 
 the value of a single quantity formed by combining a number of 
 independent approximations towards the value of this quantity. 
 The original Jevonian conception of an index number of the 
 value of money was decidedly of the latter type. Modern work 
 on the subject has been increasingly dominated by the other 
 conception. A discussion of where the truth lies would lead me 
 too far into the field of a subject-matter alien to that of this 
 treatise. 
 
 Theoretical arguments against weighting have sometimes 
 been based on the fact that to weight the items of the average 
 in an irrelevant manner, or, as it is generally expressed, in a 
 random manner, is not likely, provided the variations between 
 the weights are small compared with the variations between the 
 items, to affect the result very much. But why should any one 
 wish to weight an average " at random " ? Such observations 
 overlook the real meaning and significance of weights. They are 
 probably inspired by the fact that a superficial treatment of 
 statistics would sometimes lead to the introduction of weights 
 which are irrelevant. In drawing a conclusion, for example, 
 from the vital statistics of various towns, the figures of population 
 for the different towns may or may not be relevant to our con- 
 clusion. It depends on the character of the argument. If they 
 are relevant, it may be right to employ them as weights. If they 
 are irrelevant, it must be wrong and unnecessary to do so. The 
 fact that wheat is a more important article of consumption than 
 pins may, on certain assumptions, be irrelevant to the usefulness 
 of variations in the price of each article as indications of variation 
 in the value of money. With other assumptions, it may be 
 extremely relevant. Or again, we may know that observations 
 with a particular instrument tend to be too large and must, 
 therefore, be weighted down. It is contrary both to theory and 
 to common sense to suppose that the possession of information 
 as to the relative reliability of different statistics is not useful. 
 There is no place, therefore, in my judgment, for a generalised 
 argument as to the propriety or impropriety of weighting an 
 average. 
 
 It should be added that, where we seek to build up an index 
 number of a conception, which is quantitative but is not itself 
 numerically measurable in any defined or unambiguous sense, by 
 combining a number of numerical quantities, which, while they 
 do not measure our quaesitum, are nevertheless indications of its 
 quantitative variations and tend to fluctuate in the same sense, 
 as, for example, by means of what are sometimes called economic 
 barometers of the state of business, or the prosperity of the country 
 or the like, some very confusing questions can arise both as to 
 what sort of a thing our resulting index really is, and as to the 
 mode of compilation appropriate to it. 
 
 These confusing questions always arise when, instead of 
 measuring a quantity directly, we seek an index to fluctuations 
 in its magnitude by combining in an average the fluctuations of 
 a series of magnitudes, which are, each of them in a different way, 
 to some extent (but only to some extent), correlated with fluctua- 
 tions in our quaesitum. I must not burden this book with a 
 discussion of the problems of Index Numbers. But I venture to 
 think that they would be sooner cleared up if the natures and 
 purposes of differing index numbers were more sharply distin- 
 guished — those, namely, which are simply descriptive of a composite 
 commodity, those which seek to combine results differing from 
 one another in a way analogous to the variations of an instrument 
 of precision, and those which combine results, not of the quaesitum 
 itself, but of various other quantities, variations in which are 
 partly due to variations in the quaesitum, but which we well 
 know to be also due to other distinguishable influences. Index 
 numbers of the third type are often treated by methods and 
 arguments only appropriate to those of the second type. 
 
 12. The Rejection of Discordant Observations. — This differs 
 from the problem just discussed, because we have supposed so 
 far that our system of weighting is determined by data which we 
 possess prior to and apart from our knowledge of the actual 
 magnitude of the items of our average. The principle of the 
 rejection of discordant observations comes in when it is argued 
 that, if one or more of our observations show great discrepancies 
 from the results of the greater number, these ought to be partly 
 or entirely neglected in striking the average, even if there is no 
 reason, except their discrepancy from the rest, for attributing 
 less weight to them than to the others. By some this practice 
 has been thought to be in accordance with the dictates of common 
 sense ; by others it is denounced as savouring even of forgery.¹ 
 
 This controversy, like so many others in Probability, is due 
 to a failure to understand the meaning of ' independence.' The 
 mathematics of the orthodox theory of Averages and Least 
 Squares depend, as we have seen, upon the assumption that the 
 observations are ' independent ' ; but this has sometimes been 
 interpreted to mean a physical independence. In point of fact, 
 the theory requires that the observations shall be independent, 
 in the sense that a knowledge of the result of some does not affect 
 the probability that the others, when known, involve given 
 errors. 
 
 Clearly there may be initial data in relation to which this 
 supposition is entirely or approximately accurate. But in many 
 cases the assumption will be inadmissible. A knowledge of the 
 results of a number of observations may lead us to modify our 
 opinion as to the relative reliabilities of others. 
 
 The question, whether or not discordant observations should 
 be specially weighted down, turns, therefore, upon the nature of 
 the preliminary data by which we have been guided in initially 
 adopting a particular law of error as appropriate to the observa- 
 tions. If the observations are, relevant to these data, strictly 
 ' independent,' in the sense required for probability, then rejection 
 is not permissible. But if this condition is not fulfilled, a bias 
 against discordant observations may be well justified. 
 
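 ¹ E.g. G. Hagen's Grundzüge der Wahrscheinlichkeitsrechnung, p. 63 : " Die 
 Täuschung, die man durch Verschweigen von Messungen begeht, lässt sich 
 eben so wenig entschuldigen, als wenn man Messungen fälschen oder fingiren 
 wollte " [The deception committed by suppressing measurements is as little 
 excusable as if one were to falsify or invent measurements]. 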
 
PART III 
 INDUCTION AND ANALOGY 
 
CHAPTER XVIII 
 
 INTRODUCTION 
 
 Nothing so like as eggs ; yet no one, on account of this apparent similarity, 
 expects the same taste and relish in all of them. 'Tis only after a long course 
 of uniform experiments in any kind, that we attain a firm reliance and security 
 with regard to a particular event. Now where is that process of reasoning, 
 which from one instance draws a conclusion, so different from that which it 
 infers from a hundred instances, that are no way different from that single 
 instance ? This question I propose as much for the sake of information, as 
 with any intention of raising difficulties. I cannot find, I cannot imagine any 
 such reasoning. But I keep my mind still open to instruction, if any one will 
 vouchsafe to bestow it on me. — HUME.¹ 
 
 1. I HAVE described Probability as comprising that part of 
 logic which deals with arguments which are rational but not 
 conclusive. By far the most important types of such arguments 
 are those which are based on the methods of Induction and 
 Analogy. Almost all empirical science rests on these. And the 
 decisions dictated by experience in the ordinary conduct of life 
 generally depend on them. To the analysis and logical justifica- 
 tion of these methods the following chapters are directed. 
 
 Inductive processes have formed, of course, at all times a 
 vital, habitual part of the mind's machinery. Whenever we learn 
 by experience, we are using them. But in the logic of the schools 
 they have taken their proper place slowly. No clear or satis- 
 factory account of them is to be found anywhere. Within and 
 yet beyond the scope of formal logic, on the line, apparently, 
 between mental and natural philosophy, Induction has been 
 admitted into the organon of scientific proof, without much help 
 from the logicians, no one quite knows when. 
 
 2. What are its distinguishing characteristics ? What are 
 the qualities which in ordinary discourse seem to afford strength 
 to an inductive argument ? 
 
 ¹ Philosophical Essays concerning Human Understanding. 
 
 I shall try to answer these questions before I proceed to 
 the more fundamental problem — What ground have we for re- 
 garding such arguments as rational ? 
 
 Let the reader remember, therefore, that in the first of the 
 succeeding chapters my main purpose is no more than to state 
 in precise language what elements are commonly regarded as 
 adding weight to an empirical or inductive argument. This 
 requires some patience and a good deal of definition and special 
 terminology. But I do not think that the work is controversial. 
 At any rate, I am satisfied myself that the analysis of Chapter 
 XIX. is fairly adequate. 
 
 In the next section, Chapters XX. and XXI., I continue in 
 part the same task, but also try to elucidate what sort of assump- 
 tions, if we could adopt them, lie behind and are required by the 
 methods just analysed. In Chapter XXII. the nature of these 
 assumptions is discussed further, and their possible justification 
 is debated. 
 
 3. The passage quoted from Hume at the head of this chapter 
 is a good introduction to our subject. Nothing so like as eggs, 
 and after a long course of uniform experiments we can expect 
 with a firm reliance and security the same taste and relish in all 
 of them. The eggs must be like eggs, and we must have tasted 
 many of them. This argument is based partly upon Analogy 
 and partly upon what may be termed Pure Induction. We argue 
 from Analogy in so far as we depend upon the likeness of the eggs, 
 and from Pure Induction when we trust the number of the ex- 
 periments. 
 
 It will be useful to call arguments inductive which depend 
 in any way on the methods of Analogy and Pure Induction. But 
 I do not mean to suggest by the use of the term inductive that these 
 methods are necessarily confined to the objects of phenomenal 
 experience and to what are sometimes called empirical questions ; 
 or to preclude from the outset the possibility of their use in 
 abstract and metaphysical inquiries. While the term inductive 
 will be employed in this general sense, the expression Pure 
 Induction must be kept for that part of the argument which 
 arises out of the repetition of instances. 
 
 4. Hume's account, however, is incomplete. His argument 
 could have been improved. His experiments should not have 
 been too uniform, and ought to have differed from one another 
 as much as possible in all respects save that of the likeness of the 
 eggs. He should have tried eggs in the town and in the country, 
 in January and in June. He might then have discovered that 
 eggs could be good or bad, however like they looked. 
 
 This principle of varying those of the characteristics of the 
 instances, which we regard in the conditions of our generalisation 
 as non-essential, may be termed Negative Analogy. 
 
 It will be argued later on that an increase in the number of 
 experiments is only valuable in so far as, by increasing, or possibly 
 increasing, the variety found amongst the non-essential char- 
 acteristics of the instances, it strengthens the Negative Analogy. 
 If Hume's experiments had been absolutely uniform, he would 
 have been right to raise doubts about the conclusion. There is 
 no process of reasoning, which from one instance draws a con- 
 clusion different from that which it infers from a hundred in- 
 stances, if the latter are known to be in no way different from 
 the former. Hume has unconsciously misrepresented the typical 
 inductive argument. 
 
 When our control of the experiments is fairly complete, and 
 the conditions in which they take place are well known, there is 
 not much room for assistance from Pure Induction. If the 
 Negative Analogies are known, there is no need to count the 
 instances. But where our control is incomplete, and we do not 
 know accurately in what ways the instances differ from one 
 another, then an increase in the mere number of the instances 
 helps the argument. For unless we know for certain that the 
 instances are perfectly uniform, each new instance may possibly 
 add to the Negative Analogy. 
 
 Hume might also have weakened his argument. He expects 
 no more than the same taste and relish from his eggs. He 
 attempts no conclusion as to whether his stomach will always 
 draw from them the same nourishment. He has conserved the 
 force of his generalisation by keeping it narrow. 
 
 5. In an inductive argument, therefore, we start with a 
 number of instances similar in some respects AB, dissimilar in 
 others C. We pick out one or more respects A in which the 
 instances are similar, and argue that some of the other respects 
 B in which they are also similar are likely to be associated with 
 the characteristics A in other unexamined cases. The more 
 comprehensive the essential characteristics A, the greater the 
 variety amongst the non-essential characteristics C, and the less 
 comprehensive the characteristics B which we seek to associate 
 with A, the stronger is the likelihood or probability of the general- 
 isation we seek to establish. 
 
 These are the three ultimate logical elements on which the 
 probability of an empirical argument depends, — the Positive 
 and the Negative Analogies and the scope of the generalisation. 
 
 6. Amongst the generalisations arising out of empirical 
 argument we can distinguish two separate types. The first of 
 these may be termed universal induction. Although such in- 
 ductions are themselves susceptible of any degree of probability, 
 they affirm invariable relations. The generalisations which they 
 assert, that is to say, claim universality, and are upset if a 
 single exception to them can be discovered. Only in the more 
 exact sciences, however, do we aim at establishing universal 
 inductions. In the majority of cases we are content with that 
 other kind of induction which leads up to laws upon which 
 we can generally depend, but which does not claim, however 
 adequately established, to assert a law of more than probable 
 connection.¹ This second type may be termed Inductive Correla- 
 tion. If, for instance, we base upon the data, that this and that 
 and those swans are white, the conclusion that all swans are white, 
 we are endeavouring to establish a universal induction. But if 
 we base upon the data that this and those swans are white and 
 that swan is black, the conclusion that most swans are white, 
 or that the probability of a swan's being white is such and such, 
 then we are establishing an inductive correlation. 
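
 The difference between the two types can be put in a few lines of Python, offered only as an illustrative sketch with hypothetical names and data. A universal induction is upset by one contrary instance, whereas an inductive correlation tolerates exceptions. The observed proportion computed here is merely a summary of the data and should not be mistaken for the probability that the text leaves as " such and such."

```python
def universal_survives(colours, claimed="white"):
    """A universal generalisation stands only if no observed instance contradicts it."""
    return all(colour == claimed for colour in colours)


def observed_proportion(colours, claimed="white"):
    """Share of observed instances agreeing with the claim (a descriptive summary only)."""
    return sum(colour == claimed for colour in colours) / len(colours)


# Hypothetical observations of swans.
all_white = ["white", "white", "white", "white"]
one_black = ["white", "white", "black", "white"]

survives_first = universal_survives(all_white)    # no exception observed
survives_second = universal_survives(one_black)   # upset by the single black swan
share_white = observed_proportion(one_black)      # still supports "most swans are white"
```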
 
 Of these two types, the former — universal induction — pre- 
 sents both the simpler and the more fundamental problem. In 
 this part of my treatise I shall confine myself to it almost entirely. 
 In Part V., on the Foundations of Statistical Inference, I shall 
 discuss, so far as I can, the logical basis of inductive correlation. 
 
 7. The fundamental connection between Inductive Method 
 and Probability deserves all the emphasis I can give it. Many 
 writers, it is true, have recognised that the conclusions which we 
 reach by inductive argument are probable and inconclusive. 
 Jevons, for instance, endeavoured to justify inductive processes 
 by means of the principles of inverse probability. And it is true 
 also that much of the work of Laplace and his followers was 
 directed to the solution of essentially inductive problems. But 
 it has been seldom apprehended clearly, either by these writers 
 or by others, that the validity of every induction, strictly inter- 
 preted, depends, not on a matter of fact, but on the existence of 
 a relation of probability. An inductive argument affirms, not 
 that a certain matter of fact is so, but that relative to certain 
 evidence there is a probability in its favour. The validity of the 
 induction, relative to the original evidence, is not upset, therefore, 
 if, as a fact, the truth turns out to be otherwise. 
 
 ¹ What Mill calls ' approximate generalisations.' 
 
 The clear apprehension of this truth profoundly modifies 
 our attitude towards the solution of the inductive problem. The 
 validity of the inductive method does not depend on the success 
 of its predictions. Its repeated failure in the past may, of course, 
 supply us with new evidence, the inclusion of which will modify 
 the force of subsequent inductions. But the force of the old 
 induction relative to the old evidence is untouched. The evidence 
 with which our experience has supplied us in the past may have 
 proved misleading, but this is entirely irrelevant to the 
 question of what conclusion we ought reasonably to have 
 drawn from the evidence then before us. The validity and 
 reasonable nature of inductive generalisation is, therefore, a 
 question of logic and not of experience, of formal and not of 
 material laws. The actual constitution of the phenomenal 
 universe determines the character of our evidence ; but it cannot 
 determine what conclusions given evidence rationally supports. 
 
CHAPTER XIX 
 
 THE NATURE OF ARGUMENT BY ANALOGY 
 
 All kinds of reasoning from causes or effects are founded on two particulars, 
 viz. the constant conjunction of any two objects in all past experience, and the 
 resemblance of a present object to any of them. Without some degree of 
 resemblance, as well as union, 'tis impossible there can be any reasoning. — HUME.¹ 
 
 1. Hume rightly maintains that some degree of resemblance 
 must always exist between the various instances upon which a 
 generalisation is based. For they must have this, at least, in 
 common, that they are instances of the proposition which 
 generalises them. Some element of analogy must, therefore, 
 lie at the base of every inductive argument. In this chapter I 
 shall try to explain with precision the meaning of Analogy, and 
 to analyse the reasons, for which, rightly or wrongly, we usually 
 regard analogies as strong or weak, without considering at present 
 whether it is possible to find a good reason for our instinctive 
 principle that likeness breeds the expectation of likeness. 
 
 2. There are a few technical terms to be defined. We mean 
 by a generalisation a statement that all of a certain definable class 
 of propositions are true. It is convenient to specify this class 
 in the following way. If $f(x)$ is true for all those values of $x$ for 
 which $\phi(x)$ is true, then we have a generalisation about $\phi$ and $f$ 
 which we may write $g(\phi, f)$. If, for example, we are dealing with 
 the generalisation, " All swans are white," this is equivalent to 
 the statement, " ' $x$ is white ' is true for all those values of $x$ for 
 which ' $x$ is a swan ' is true." The proposition $\phi(a) . f(a)$ is an 
 instance of the generalisation $g(\phi, f)$. 
 
 By thus defining a generalisation in terms of propositional 
 functions, it becomes possible to deal with all kinds of generalisa- 
 tions in a uniform way ; and also to bring generalisation into 
 convenient connection with our definition of Analogy. 
 
 ¹ A Treatise of Human Nature. 
 
 If some one thing is true about both of two objects, if, that is 
 to say, they both satisfy the same propositional function, then to 
 this extent there is an analogy between them. Every generalisa- 
 tion $g(\phi, f)$, therefore, asserts that one analogy is always accom- 
 panied by another, namely, that between all objects having the 
 analogy $\phi$ there is also the analogy $f$. The set of propositional 
 functions, which are satisfied by both of the two objects, con- 
 stitute the positive analogy. The analogies, which would be 
 disclosed by complete knowledge, may be termed the total positive 
 analogy ; those which are relative to partial knowledge, the 
 known positive analogy. 
 
 As the positive analogy measures the resemblances, so the 
 negative analogy measures the differences between the two objects. 
 The set of functions, such that each is satisfied by one and not 
 by the other of the objects, constitutes the negative analogy. 
 We have, as before, the distinction between the total negative 
 analogy and the known negative analogy. 
 
 This set of definitions is soon extended to the cases in which 
 the number of instances exceeds two. The functions which are 
 true of all of the instances constitute the positive analogy of the 
 set of instances, and those which are true of some only, and are 
 false of others, constitute the negative analogy. It is clear that 
 a function, which represents positive analogy for a group of 
 instances taken out of the set, may be a negative analogy for the 
 set as a whole. Analogies of this kind, which are positive for 
 a sub-class of the instances, but negative for the whole class, we 
 may term sub-analogies. By this it is meant that there are 
 resemblances which are common to some of the instances, but 
 not to all. 
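
 These definitions translate directly into operations on sets. The Python sketch below assumes, for illustration only, that each instance is described by the finite set of characteristics known to be true of it; the function names and the egg data are hypothetical. The positive analogy is then the intersection of those sets, the negative analogy is whatever is true of some instances but not of all, and a sub-analogy is what a sub-class shares without the whole set sharing it.

```python
def analogies(instances):
    """Return (positive, negative) analogy for a dict of name -> set of characteristics."""
    described = list(instances.values())
    positive = set.intersection(*described)        # true of every instance
    negative = set.union(*described) - positive    # true of some, false of others
    return positive, negative


def sub_analogies(instances, sub_class):
    """Characteristics common to the sub-class but not to the whole set."""
    positive, _ = analogies(instances)
    shared_by_sub_class = set.intersection(*(instances[name] for name in sub_class))
    return shared_by_sub_class - positive


# Hypothetical known characteristics of three eggs.
eggs = {
    "a": {"egg", "white shell", "bought in town"},
    "b": {"egg", "white shell", "bought in country"},
    "c": {"egg", "brown shell", "bought in country"},
}
positive, negative = analogies(eggs)
sub = sub_analogies(eggs, ["a", "b"])
```

 Because the sets record only what is known, the results correspond to the known positive and negative analogies; the total analogies would require complete knowledge of the instances.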
 
 A simple notation, in accordance with these definitions, will 
 be useful. If there is a positive analogy $\phi$ between a set of in- 
 stances $a_1 \ldots a_n$, whether or not this is the total analogy 
 between them, let us write this — 
 
 $$\underset{a_1 \ldots a_n}{A}(\phi).$$ ¹ 
 
 And if there is a negative analogy $\phi'$, let us write this — 
 
 $$\underset{a_1 \ldots a_n}{\bar{A}}(\phi').$$ ² 
 
 Thus $\underset{a_1 \ldots a_n}{A}(\phi)$ expresses the fact that there is a set of 
 characteristics $\phi$ which are common to all the instances, and 
 $\underset{a_1 \ldots a_n}{\bar{A}}(\phi')$ that there is a set of characteristics $\phi'$ which is 
 true of at least one of the instances and false of at least one. 
 
 ¹ Hence $\underset{a_1 \ldots a_n}{A}(\phi) \equiv \phi(a_1) . \phi(a_2) \ldots \phi(a_n) \equiv \prod_{x=a_1}^{x=a_n} \phi(x)$. 
 
 3. In the typical argument from analogy we wish to generalise 
 from one part to another of the total analogy which experience 
 has shown to exist between certain selected instances. In all the 
 cases where one characteristic $\phi$ has been found to exist, another 
 characteristic $f$ has been found to be associated with it. We argue 
 from this that any instance, which is known to share the first 
 analogy $\phi$, is likely to share also the second analogy $f$. We have 
 found in certain cases, that is to say, that both $\phi$ and $f$ are true 
 of them ; and we wish to assert $f$ as true of other cases in which 
 we have only observed $\phi$. We seek to establish the generalisation 
 $g(\phi, f)$, on the ground that $\phi$ and $f$ constitute between them an 
 observed positive analogy in a given set of experiences. 
 
 But while the argument is of this character, the grounds, upon 
 which we attribute more or less weight to it, are often rather 
 complex ; and we must discuss them, therefore, in a systematic 
 manner. 
 
 4. According to the view suggested in the last chapter, the 
 value of such an argument depends partly upon the nature of the 
 conclusion which we seek to draw, partly upon the evidence 
 which supports it. If Hume had expected the same degree of 
 nourishment as well as the same taste and relish from all of the 
 eggs, he would have drawn a conclusion of weaker probability. 
 Let us consider, then, this dependence of the probability upon the 
 scope of the generalisation $g(\phi, f)$, — upon the comprehensiveness, 
 that is to say, of the condition $\phi$ and the conclusion $f$ respectively. 
 
 The more comprehensive the condition $\phi$ and the less com- 
 prehensive the conclusion $f$, the greater à priori probability do 
 we attribute to the generalisation $g$. With every increase in $\phi$ 
 this probability increases, and with every increase in $f$ it will 
 diminish. 
 
 ² Hence $\underset{a_1 \ldots a_n}{\bar{A}}(\phi') \equiv \sum_{x=a_r} \phi'(x) . \sum_{x=a_{r'}} \bar{\phi}'(x)$. 
 
 The condition $\phi (\equiv \phi_1\phi_2)$ is more comprehensive than the 
 condition $\phi_1$, relative to the general evidence $h$, if $\phi_2$ is a condition 
 independent of $\phi_1$ relative to $h$, $\phi_2$ being independent of $\phi_1$, if 
 $g(\phi_1, \phi_2)/h \neq 1$, i.e. if, relative to $h$, the satisfaction of $\phi_2$ is not 
 inferrible from that of $\phi_1$. 
 
 Similarly the conclusion $f (\equiv f_1f_2)$ is more comprehensive than 
 the conclusion $f_1$, relative to the general evidence $h$, if $f_2$ is a con- 
 clusion independent of $f_1$, relative to $h$, i.e. if $g(f_1, f_2)/h \neq 1$. 
 
 If $\phi \equiv \phi_1\phi_2$ and $f \equiv f_1f_2$, where $\phi_1$ and $\phi_2$ are independent and 
 $f_1$ and $f_2$ are independent relative to $h$, we have — 
 
 $$g(\phi_1, f)/h = g(\phi_1\phi_2, f) . g(\phi_1\bar{\phi}_2, f)/h \leq g(\phi, f)/h,$$
 
 and 
 
 $$g(\phi, f)/h = g(\phi, f_1f_2)/h = g(\phi f_1, f_2)/h . g(\phi, f_1)/h,$$
 
 so that 
 
 $$g(\phi, f_1)/h \geq g(\phi, f)/h \geq g(\phi_1, f)/h.$$
 
 This proves the statement made above. It will be noticed 
 that we cannot necessarily compare the à priori probabilities 
 of two generalisations in respect of more and less, unless the con- 
 dition of the first is included in the condition of the second, and 
 the conclusion of the second is included in that of the first. 
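
 The ordering just proved can be checked by brute force in a toy model, sketched below in Python. The model is an assumption made purely for illustration: two objects, four yes-or-no characteristics named `phi1`, `phi2`, `f1`, `f2`, and every combination treated as one possible state of affairs. The sketch counts the states in which each generalisation holds. Since every state satisfying $g(\phi_1, f)$ also satisfies $g(\phi, f)$, and every state satisfying $g(\phi, f)$ also satisfies $g(\phi, f_1)$, the counts must be ordered in the same way as the probabilities, whatever weights the states are given.

```python
from itertools import product

ATTRIBUTES = ("phi1", "phi2", "f1", "f2")


def holds(state, condition, conclusion):
    """True if every object satisfying the whole condition satisfies the whole conclusion."""
    return all(
        all(obj[name] for name in conclusion)
        for obj in state
        if all(obj[name] for name in condition)
    )


def count_states(condition, conclusion, n_objects=2):
    """Number of states of the toy model in which g(condition, conclusion) holds."""
    descriptions = [
        dict(zip(ATTRIBUTES, bits))
        for bits in product([False, True], repeat=len(ATTRIBUTES))
    ]
    return sum(
        holds(state, condition, conclusion)
        for state in product(descriptions, repeat=n_objects)
    )


narrow_conclusion = count_states(("phi1", "phi2"), ("f1",))       # g(phi, f1): 196 states
full = count_states(("phi1", "phi2"), ("f1", "f2"))               # g(phi, f):  169 states
narrow_condition = count_states(("phi1",), ("f1", "f2"))          # g(phi1, f): 100 states
```

 Out of 256 states, the generalisation with the fuller condition and the slighter conclusion holds in the most, which is the sense in which it stands initially in a stronger position.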
 
 We see, therefore, that some generalisations stand initially 
 in a stronger position than others. In order to attain a given 
 degree of probability, generalisations require, according to their 
 scope, different amounts of favourable evidence to support them. 
 
 5. Let us now pass from the character of the generalisation 
 à priori to the evidence by which we support it. Since, when- 
 ever the conclusion $f$ is complex, i.e. resolvable into the form 
 $f_1f_2$ where $g(f_1, f_2)/h \neq 1$, we can express the probability of the 
 generalisation $g(\phi, f)$ as the product of the probabilities of the 
 two generalisations $g(\phi f_1, f_2)$ and $g(\phi, f_1)$, we may assume in what 
 follows, that the conclusion $f$ is simple and not capable of further 
 analysis, without diminishing the generality of our argument. 
 
 We will begin with the simplest case, namely, that which 
 arises in the following conditions. First, let us assume that our 
 knowledge of the examined instances is complete, so that we know 
 of every statement, which is about the examined instances, 
 whether it is true or false of each.¹ Second, let us assume that 
 all the instances which are known to satisfy the condition $\phi$, 
 are also known to satisfy the conclusion $f$ of the generalisation. 
 And third let us assume that there is nothing which is true of 
 all the examined instances and yet not included either in $\phi$ or 
 in $f$, i.e. that the positive analogy between the instances is 
 exactly co-extensive with the analogy $\phi f$ which is covered by the 
 generalisation. 
 
 ¹ If $\phi(a)$ is a proposition and $\phi(a) \equiv h . \theta(a)$, where $h$ is a proposition not 
 involving $a$, then we must regard $\theta(a)$, not $\phi(a)$ as the statement about $a$. 
 
 Such evidence as this constitutes what we may term a perfect 
 analogy. The argument in favour of the generalisation cannot 
 be further improved by a knowledge of additional instances. 
 Since the positive analogy between the instances is exactly 
 coextensive with the analogy covered by the generalisation, and 
 since our knowledge of the examined instances is complete, there 
 is no need to take account of the negative analogy. 
 
 An analogy of this kind, however, is not likely to have much 
 practical utility ; for if the analogy covered by the generalisa- 
 tion covers the whole of the positive analogy between the instances, 
 it is difficult to see to what other instances the generalisation can 
 be applicable. Any instance, about which everything is true 
 which is true of all of a set of instances, must be identical with 
 one of them. Indeed, an argument from perfect analogy can 
 only have practical utility, if, as will be argued later on, there are 
 some distinctions between instances which are irrelevant for the 
 purposes of analogy, and if, in a perfect analogy, the positive 
 analogy, of which we must take account, need cover only those 
 distinctions which are relevant. In this case a generalisation 
 based on perfect analogy might cover instances numerically 
 distinct from those of the original set. 
 
 The law of the Uniformity of Nature appears to me to amount 
 to an assertion that an analogy which is perfect, except that mere 
 differences of position in time and space are treated as irrelevant, 
 is a valid basis for a generalisation, two total causes being regarded
 as the same if they differ only in their positions in time
 or space. This, I think, is the whole of the importance which
 this law has for the theory of inductive argument. It involves 
 the assertion of a generalised judgment of irrelevance, namely, 
 of the irrelevance of mere position in time and space to generalisa- 
 tions which have no reference to particular positions in time 
 and space. It is in respect of such position in time or space that 
 'nature' is supposed 'uniform.' The significance of the law
 and the nature of its justification, if any, are further discussed
 in Chapter XXII.
 
 6. Let us now pass to the type which is next in order of 
 simplicity. We will relax the first condition and no longer assume 
 that the whole of the positive analogy between the instances is 
 covered by the generalisation, though retaining the assumption 
 that our knowledge of the examined instances is complete. We 
 know, that is to say, that there are some respects in which the 
 examined instances are all alike, and yet which are not covered
 by the generalisation. If $\phi_1$ is the part of the positive analogy
 between the instances which is not covered by the generalisation,
 then the probability of this type of argument from analogy can
 be written:

 $$g(\phi, f) \Big/ A_{a_1 \ldots a_n}(\phi\phi_1 f).$$
 
 The value of this probability turns on the comprehensiveness
 of $\phi_1$. There are some characteristics $\phi_1$ common to all the
 instances, which the generalisation treats as unessential, but
 the less comprehensive these are the better. $\phi_1$ stands for the
 characteristics in which all the instances resemble one another
 outside those covered by the generalisation. To reduce these
 resemblances between the instances is the same thing as to
 increase the differences between them. And hence any increase
 in the Negative Analogy involves a reduction in the comprehensiveness
 of $\phi_1$. When, however, our knowledge of the
 instances is complete, it is not necessary to make separate
 mention of the negative analogy $\bar{A}_{a_1 \ldots a_n}(\phi')$ in the above formula.
 For $\phi'$ simply includes all those functions about the instances,
 which are not included in $\phi\phi_1 f$, and of which the contradictories
 are not included in them; so that in stating $A_{a_1 \ldots a_n}(\phi\phi_1 f)$, we
 state by implication $\bar{A}_{a_1 \ldots a_n}(\phi')$ also.
 
 The whole process of strengthening the argument in favour
 of the generalisation $g(\phi, f)$ by the accumulation of further experience
 appears to me to consist in making the argument
 approximate as nearly as possible to the conditions of a perfect
 analogy, by steadily reducing the comprehensiveness of those
 resemblances $\phi_1$ between the instances which our generalisation
 disregards. Thus the advantage of additional instances, derived
 from experience, arises not out of their number as such, but out
 of their tendency to limit and reduce the comprehensiveness of
 $\phi_1$, or, in other words, out of their tendency to increase the negative
 analogy $\phi'$, since $\phi_1\phi'$ comprise between them whatever is not
 covered by $\phi f$. The more numerous the instances, the less comprehensive
 are their superfluous resemblances likely to be. But
 a single additional instance which greatly reduced $\phi_1$ would increase
 the probability of the argument more than a large number
 of instances which affected $\phi_1$ less.
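
 The reduction of superfluous resemblances can be pictured with a
 minimal sketch. It assumes, purely for illustration, that each instance
 is modelled as a Python set of characteristics; the function names and
 the swan attributes are hypothetical and appear nowhere in the text.
 Under that assumption $\phi_1$ is whatever every instance shares beyond
 $\phi$ and $f$.

```python
from functools import reduce


def positive_analogy(instances):
    """Characteristics common to every instance (set intersection)."""
    return reduce(lambda a, b: a & b, instances)


def superfluous_resemblances(instances, phi, f):
    """phi_1: what all instances share beyond the generalisation's phi and f."""
    return positive_analogy(instances) - (phi | f)


# Hypothetical generalisation g(phi, f): "whatever is a swan is white".
phi = {"is_swan"}
f = {"is_white"}

instances = [
    {"is_swan", "is_white", "european", "lives_on_river", "adult"},
    {"is_swan", "is_white", "european", "lives_on_river", "juvenile"},
]
before = superfluous_resemblances(instances, phi, f)

# One further instance, differing in habitat, removes a shared characteristic.
instances.append({"is_swan", "is_white", "european", "lives_on_lake", "adult"})
after = superfluous_resemblances(instances, phi, f)

print(sorted(before))
print(sorted(after))
```

 In this toy case the third instance strikes `lives_on_river` out of the
 superfluous resemblances, which is the sense in which the variety of the
 instances, rather than their mere number, is held to matter.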
 
 7. The nature of the argument examined so far is, then, that 
 the instances all have some characteristics in common which 
 we have ignored in framing our generalisation ; but it is still 
 assumed that our knowledge about the examined instances is 
 complete. We will next dispense with this latter assumption, and 
 deal with the case in which our knowledge of the characteristics 
 of the examined instances themselves is or may be incomplete.
 
 It is now necessary to take explicit account of the known
 negative analogy. For when the known positive analogy falls 
 short of the total positive analogy, it is not possible to infer the 
 negative analogy from it. Differences may be known between the 
 instances which cannot be inferred from the known positive 
 analogy. The probability of the argument must, therefore, be
 written:

 $$g(\phi, f) \Big/ A_{a_1 \ldots a_n}(\phi\phi_1 f)\; \bar{A}_{a_1 \ldots a_n}(\phi'),$$

 where $\phi\phi_1 f$ stands for the characteristics in which all $n$ instances
 $a_1 \ldots a_n$ are known to be alike, and $\phi'$ stands for the characteristics
 in which they are known to differ.
 
 This argument is strengthened by any additional instance or 
 by any additional knowledge about the former instances which 
 diminishes the known superfluous resemblances $\phi_1$ or increases the
 negative analogy $\phi'$. The object of the accumulation of further
 experience is still the same as before, namely, to make the form
 of the argument approximate more and more closely to that of
 perfect analogy. Now, however, that our knowledge of the
 instances is no longer assumed to be complete, we must take 
 account of the mere number n of the instances, as well as of our 
 specific knowledge in regard to them ; for the more numerous 
 the instances are, the greater the opportunity for the total 
 negative analogy to exceed the known negative analogy. But
 the more complete our knowledge of the instances, the less
 attention need we pay to their mere number, and the more 
 imperfect our knowledge the greater the stress which must be 
 laid upon the argument from number. This part of the argu- 
 ment will be discussed in detail in the following chapter on 
 Pure Induction. 
 
 8. When our knowledge of the instances is incomplete, there
 may exist analogies which are known to be true of some of the
 instances and are not known to be false of any. These sub-analogies
 (see § 2) are not so dangerous as the positive analogies $\phi_1$,
 which are known to be true of all the instances, but their existence
 is, evidently, an element of weakness, which we must endeavour
 to eliminate by the growth of knowledge and the multiplication
 of instances. A sub-analogy of this kind between the instances
 $a_r \ldots a_s$ may be written $A_{a_r \ldots a_s}(\psi_k)$; and the formula, if it
 is to take account of all the relevant information, ought, therefore,
 to be written:

 $$g(\phi, f) \Big/ A_{a_1 \ldots a_n}(\phi\phi_1 f)\; \bar{A}_{a_1 \ldots a_n}(\phi')\; \prod \Big\{ A_{a_r \ldots a_s}(\psi_k) \Big\},$$

 where the terms of $\prod \{ A_{a_r \ldots a_s}(\psi_k) \}$ stand for the various
 sub-analogies between sub-classes of the instances, which are not
 included in $\phi\phi_1 f$ or in $\phi'$.
 
 9. There is now another complexity to be introduced. We
 must dispense with the assumption that the whole of the analogy
 covered by the generalisation is known to exist in all the instances.
 For there may be some instances within our experience, about
 which our knowledge is incomplete, but which show part of the
 analogy required by the generalisation and nothing which contradicts
 it; and such instances afford some support to the
 generalisation. Suppose that ${}_b\phi$ and ${}_b f$ are part of $\phi$ and $f$
 respectively, then we may have a set of instances $b_1 \ldots b_m$ which
 show the following analogies:

 $$A_{b_1 \ldots b_m}({}_b\phi\,{}_b\phi_1\,{}_b f)\; \bar{A}_{b_1 \ldots b_m}({}_b\phi')\; \prod \Big\{ A_{b_r \ldots b_s}({}_b\psi_k) \Big\},$$

 where ${}_b\phi_1$ is the analogy not covered by the generalisation, and
 so on, as before.
 
 The formula, therefore, is now as follows:

 $$g(\phi, f) \Big/ \prod_{a, b, \ldots} \Big\{ A_{a_1 \ldots a_n}({}_a\phi\,{}_a\phi_1\,{}_a f)\; \bar{A}_{a_1 \ldots a_n}({}_a\phi') \Big\}\; \prod \Big\{ A_{a_r \ldots b_s \ldots}(\psi_k) \Big\}.$$

 In this expression ${}_a\phi$, ${}_a f$ are the whole or part of $\phi$, $f$; the product
 $\prod_{a, b, \ldots}$ is composed of the positive and negative analogies for each
 of the sets of instances $a_1 \ldots a_n$, $b_1 \ldots b_m$, etc.; and the
 product $\prod$ contains the various sub-analogies of different sub-classes
 of all the instances $a_1 \ldots a_n$, $b_1 \ldots b_m$, etc., regarded as
 one set.¹
 
 10. This completes our classification of the positive evidence
 which supports a generalisation; but the probability may also
 be affected by a consideration of the negative evidence. We
 have taken account so far of that part of the evidence only which
 shows the whole or part of the analogy we require, and we have
 neglected those instances of which $\phi$, the condition of the generalisation,
 or $f$, its conclusion, or part of $\phi$ or of $f$ is known to be false.
 If there are instances of which $\phi$ is true and $f$ false, it
 is clear that the generalisation is ruined. But cases in which we
 know part of $\phi$ to be true and $f$ to be false, and are ignorant as
 to the truth or falsity of the rest of $\phi$, weaken it to some extent.
 We must take account, therefore, of analogies

 $$A_{a'_1 \ldots a'_n}({}_{a'}\phi\,{}_{a'}\bar{f}),$$

 where ${}_{a'}\phi$, part of $\phi$, is true of all the set, and ${}_{a'}f$, part of $f$, is
 false of all the set, while the truth or falsity of some part of $\phi$ and
 $f$ is unknown. The negative evidence, however, can strengthen
 as well as weaken the evidence. We deem instances favourably
 relevant in which $\phi$ and $f$ are both false together.²
 
 ¹ Even if we want to distinguish between the sub-analogies of the $a$ set and
 the sub-analogies of the $b$ set, this information can be gathered from the
 product $\prod$.

 ² I am disposed to think that we need not pay attention to instances for
 which part of $\phi$ is known to be false, and part of $f$ to be true. But the
 question is a little perplexing.

 Our final formula, therefore, must include terms, similar to
 those in the formula which concludes § 9, not only for sets of
 instances which show analogies ${}_a\phi\,{}_a f$, where ${}_a\phi$ and ${}_a f$ are parts
 of $\phi$ and $f$, but also for sets which show analogies ${}_{a'}\phi\,{}_{a'}\bar{f}$
 or analogies ${}_{a''}\bar{\phi}\,{}_{a''}\bar{f}$, where ${}_{a''}\phi$ and ${}_{a''}f$ are the whole or part of $\phi$
 and $f$, and $\bar{\phi}$, $\bar{f}$ are the contradictories of $\phi$ and $f$.¹
 
 It should be added, perhaps, that the theoretical classifica- 
 tion of most empirical arguments in daily use is complicated by 
 the account which we reasonably take of generalisations previ- 
 ously established. We often take account indirectly, therefore, 
 of evidence which supports in some degree other generalisations 
 than that which we are concerned to establish or refute at the 
 moment, but the probability of which is relevant to the problem 
 under investigation. 
 
 11. The argument will be rendered unnecessarily complex,
 without much benefit to its theoretical interest, if we deal with
 the most general case of all. What follows, therefore, will deal
 with the formula of the third degree of generality, namely:

 $$g(\phi, f) \Big/ A_{a_1 \ldots a_n}(\phi\phi_1 f)\; \bar{A}_{a_1 \ldots a_n}(\phi')\; \prod \Big\{ A_{a_r \ldots a_s}(\psi_k) \Big\},$$

 in which no partial instances occur, i.e. no instances in which part
 only of the analogy, required by the generalisation, is known to
 exist. In this third degree of generality, it will be remembered,
 our knowledge of the characteristics of the instances is incomplete,
 there is more analogy between the instances than is
 covered by the generalisation, and there are some sub-analogies
 to be reckoned with. In the above formula the incompleteness
 of our knowledge is implicitly recognised in that $\phi\phi_1 f\phi'$ are
 not between them entirely comprehensive. It is also supposed
 that all the evidence we have is positive; no knowledge is
 assumed, that is to say, of instances characterised by the conjunctions
 ${}_a\phi\,{}_a\bar{f}$, ${}_a\bar{\phi}\,{}_a f$, or ${}_a\bar{\phi}\,{}_a\bar{f}$, where ${}_a\phi$ and ${}_a f$ are part of $\phi$ and $f$.

 An argument, therefore, from experience, in which, on the
 basis of examined instances, we establish a generalisation applicable
 beyond these instances, can be strengthened, if we restrict our
 attention to the simpler type of case, by the following means:
 
 (1) By reducing the resemblances $\phi_1$ known to be common to
 all the instances, but ignored as unessential by the generalisation.

 (2) By increasing the differences $\phi'$ known to exist between
 the instances.

 (3) By diminishing the sub-analogies or unessential resemblances
 $\psi_k$ known to be common to some of the instances and not
 known to be false of any.

 ¹ Where the conclusion $f$ is simple and not complex (see § 5), some of these
 complications cannot, of course, arise.
 
 These results can generally be obtained in two ways, either by 
 increasing the number of our instances or by increasing our know- 
 ledge of those we have. 
 
 The reasons why these methods seem to common sense to
 strengthen the argument are fairly obvious. The object of (1) is to
 avoid the possibility that $\phi_1$ as well as $\phi$ is a necessary condition
 of $f$. The object of (2) is to avoid the possibility that there may
 be some resemblances additional to $\phi$, common to all the instances,
 which have escaped our notice. The object of (3) is to get rid
 of indications that the total value of $\phi_1$ may be greater than the
 known value. When $\phi\phi_1 f$ is the total positive analogy between
 the instances, so that the known value of $\phi_1$ is its total value, it
 is (1) which is fundamental; and we need take account of (2)
 and (3) only when our knowledge of the instances is incomplete.
 But when our knowledge of the instances is incomplete, so that
 $\phi_1$ falls short of its total value and we cannot infer $\phi'$ from it,
 it is better to regard (2) as fundamental; in any case every
 reduction of $\phi_1$ must increase $\phi'$.
 
 12. I have now attempted to analyse the various ways in
 which common practice seems to assume that considerations 
 of Analogy can yield us presumptive evidence in favour of a 
 generalisation. 
 
 It has been my object, in making a classification of empirical 
 arguments, not so much to put my results in forms closely similar 
 to those in which problems of generalisation commonly present 
 themselves to scientific investigators, as to inquire whether 
 ultimate uniformities of method can be found beneath the 
 innumerable modes, superficially differing from one another, in
 which we do in fact argue. 
 
 I have not yet attempted to justify this way of arguing. 
 After turning aside to discuss in more detail the method of Pure 
 Induction, I shall make this attempt ; or rather I shall try to see 
 what sort of assumptions are capable of justifying empirical
 reasoning of this kind. 
 
CHAPTER XX 
 
 THE VALUE OF MULTIPLICATION OF INSTANCES, OR PURE 
 INDUCTION 
 
 1. It has often been thought that the essence of inductive argument
 lies in the multiplication of instances. "Where is that
 process of reasoning," Hume inquired, "which from one instance
 draws a conclusion, so different from that which it infers from
 a hundred instances, that are no way different from that single
 instance?" I repeat that by emphasising the number of the instances
 Hume obscured the real object of the method. If it
 were strictly true that the hundred instances are no way different
 from the single instance, Hume would be right to wonder in what
 manner they can strengthen the argument. The object of increasing
 the number of instances arises out of the fact that we
 are nearly always aware of some difference between the instances, 
 and that even where the known difference is insignificant we may 
 suspect, especially when our knowledge of the instances is very 
 incomplete, that there may be more. Every new instance may 
 diminish the unessential resemblances between the instances and 
 by introducing a new difference increase the Negative Analogy. 
 For this reason, and for this reason only, new instances are 
 valuable. 
 
 If our premisses comprise the body of memory and tradition 
 which has been originally derived from direct experience, and 
 the conclusion which we seek to establish is the Newtonian theory 
 of the Solar System, our argument is one of Pure Induction, in 
 so far as we support the Newtonian theory by pointing to the 
 great number of consequences which it has in common with the 
 facts of experience. The predictions of the Nautical Almanack 
 are a consequence of the Newtonian theory, and these predictions 
 are verified many thousand times a day. But even here the
 force of the argument largely depends, not on the mere number
 of these predictions, but on the knowledge that the circumstances 
 in which they are fulfilled differ widely from one another in a 
 vast number of important respects. The variety of the circum- 
 stances, in which the Newtonian generalisation is fulfilled, rather 
 than the number of them, is what seems to impress our reasonable 
 faculties. 
 
 2. I hold, then, that our object is always to increase the 
 Negative Analogy, or, which is the same thing, to diminish the
 characteristics common to all the examined instances and yet not
 taken account of by our generalisation. Our method, however,
 may be one which certainly achieves this object, or it may be one
 which possibly achieves it. The former of these, which is obviously
 the more satisfactory, may consist either in increasing our
 definite knowledge respecting instances examined already, or in
 finding additional instances respecting which definite knowledge 
 is obtainable. The second of them consists in finding additional 
 instances of the generalisation, about which, however, our de- 
 finite knowledge may be meagre ; such further instances, if our 
 knowledge about them were more complete, would either increase 
 or leave unchanged the Negative Analogy ; in the former case 
 they would strengthen the argument and in the latter case they 
 would not weaken it ; and they must, therefore, be allowed some 
 weight. The two methods are not entirely distinct, because 
 new instances, about which we have some knowledge but not 
 much, may be known to increase the Negative Analogy a little 
 by the first method, and suspected of increasing it further by the 
 second. 
 
 It is characteristic of advanced scientific method to depend 
 on the former, and of the crude unregulated induction of ordinary
 experience to depend on the latter. It is when our definite
 knowledge about the instances is limited, that we must pay
 attention to their number rather than to the specific differences 
 between them, and must fall back on what I term Pure Induction. 
 
 In this chapter I investigate the conditions and the manner 
 in which the mere repetition of instances can add to the force 
 of the argument. The chief value of the chapter, in my judgment,
 is negative, and consists in showing that a line of advance,
 which might have seemed promising, turns out to be a blind
 alley, and that we are thrown back on known Analogy. Pure
 Induction will not give us any very substantial assistance in
 getting to the bottom of the general inductive problem. 
 
 3. The problem of generalisation¹ by Pure Induction can be
 stated in the following symbolic form:

 Let $h$ represent the general à priori data of the investigation;
 let $g$ represent the generalisation which we seek to establish;
 let $x_1, x_2, \ldots, x_n$ represent instances of $g$.

 Then $x_1/gh = 1$, $x_2/gh = 1$, ..., $x_n/gh = 1$; given $g$, that is to
 say, the truth of each of its instances follows. The problem is
 to determine the probability $g/hx_1x_2 \ldots x_n$, i.e. the probability
 of the generalisation when $n$ instances of it are given. Our
 analysis will be simplified, and nothing of fundamental importance
 will be lost, if we introduce the assumption that there is nothing
 in our à priori data which leads us to distinguish between the
 à priori likelihood of the different instances; we assume, that is
 to say, that there is no reason à priori for expecting the occurrence
 of any one instance with greater reliance than any other, i.e.

 $$x_1/h = x_2/h = \ldots = x_n/h.$$

 Write $g/hx_1x_2 \ldots x_n = p_n$ and $x_{n+1}/hx_1x_2 \ldots x_n = y_{n+1}$; then

 $$\frac{p_n}{p_{n-1}} = \frac{g/hx_1 \ldots x_n}{g/hx_1 \ldots x_{n-1}} = \frac{gx_n/hx_1 \ldots x_{n-1}}{g/hx_1 \ldots x_{n-1} \cdot x_n/hx_1 \ldots x_{n-1}} = \frac{x_n/ghx_1 \ldots x_{n-1}}{x_n/hx_1 \ldots x_{n-1}} = \frac{1}{y_n},$$

 and hence $p_n = \dfrac{1}{y_1 y_2 \ldots y_n}\, p_0$, where $p_0 = g/h$, i.e. $p_0$
 is the à priori probability of the generalisation.
 
 ¹ In the most general sense we can regard any proposition as the generalisation
 of all the propositions which follow from it. For if $h$ is any proposition,
 and we put $\phi(x) =$ '$x$ can be inferred from $h$' and $f(x) = x$, then $g(\phi, f) = h$. Since
 Pure Induction consists in finding as many instances of a generalisation as
 possible, it is, in the widest sense, the process of strengthening the probability
 of any proposition by adducing numerous instances of known truths which
 follow from it. The argument is one of Pure Induction, therefore, in so far as
 the probability of a conclusion is based upon the number of independent consequences
 which the conclusion and the premisses have in common.
 
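 It follows, therefore, that $p_n > p_{n-1}$ so long as $y_n \neq 1$.
 Further,

 $$\begin{aligned}
 x_1x_2 \ldots x_n/h &= x_n/hx_1x_2 \ldots x_{n-1} \cdot x_1x_2 \ldots x_{n-1}/h \\
 &= y_n \cdot x_1x_2 \ldots x_{n-1}/h \\
 &= y_n y_{n-1} \ldots y_1.
 \end{aligned}$$

 $$\begin{aligned}
 \therefore\; p_n = \frac{p_0}{y_1 y_2 \ldots y_n} &= \frac{p_0}{x_1x_2 \ldots x_n/h} \\
 &= \frac{p_0}{x_1x_2 \ldots x_n g/h + x_1x_2 \ldots x_n\bar{g}/h} \\
 &= \frac{p_0}{g/h + x_1x_2 \ldots x_n/\bar{g}h \cdot \bar{g}/h} \\
 &= \frac{p_0}{p_0 + x_1x_2 \ldots x_n/\bar{g}h\,(1 - p_0)}.
 \end{aligned}$$

 This approaches unity as a limit, if $x_1x_2 \ldots x_n/\bar{g}h \cdot \dfrac{1}{p_0}$
 approaches zero as a limit, when $n$ increases.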
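
 The closing formula can be checked numerically. The sketch below
 assumes, purely for illustration, that the probabilities concerned are
 numerical and are supplied as ordinary floating-point numbers; the
 function name `posterior`, the list `cond_not_g` (standing for the
 successive values of $x_r/x_1 \ldots x_{r-1}\bar{g}h$) and the figures
 0.01 and 0.5 are hypothetical. It evaluates
 $p_n = p_0 / (p_0 + q_n(1 - p_0))$, where $q_n = x_1x_2 \ldots x_n/\bar{g}h$
 is the product of those values.

```python
from math import prod


def posterior(p0, cond_not_g):
    """p_n = p0 / (p0 + q_n * (1 - p0)).

    cond_not_g[r-1] is the assumed probability of the r-th instance given
    the earlier instances and the falsehood of the generalisation; their
    product q_n is the probability of all n instances on that supposition.
    """
    q_n = prod(cond_not_g)  # prod([]) == 1, so n = 0 returns p0 itself
    return p0 / (p0 + q_n * (1.0 - p0))


p0 = 0.01  # hypothetical a priori probability of the generalisation
# Hypothetical case: each instance is as likely as not if g is false.
path = [posterior(p0, [0.5] * n) for n in range(0, 11)]

for n, p in enumerate(path):
    print(n, round(p, 4))
```

 With these assumed figures $q_n$ halves at every step, so the computed
 probability rises with each instance; a sequence in which $q_n$ did not
 tend to zero would behave differently.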
 
 4. We may now stop to consider how much this argument has 
 proved. We have shown that if each of the instances necessarily 
 follows from the generalisation, then each additional instance 
 increases the probability of the generalisation, so long as the new 
 instance could not have been predicted with certainty from a 
 knowledge of the former instances.¹ This condition is the same
 as that which came to light when we were discussing Analogy.
 If the new instance were identical with one of the former instances,
 a knowledge of the latter would enable us to predict it.
 If it differs or may differ in analogy, then the condition required
 above is satisfied. 
 
 The common notion, that each successive verification of a 
 doubtful principle strengthens it, is formally proved, therefore, 
 without any appeal to conceptions of law or of causality. But 
 we have not proved that this probability approaches certainty as 
 a limit, or even that our conclusion becomes more likely than not,
 as the number of verifications or instances is indefinitely increased. 
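
 That a probability may increase at every step and yet never become
 more likely than not can be seen in a small numerical sketch. It assumes
 numerical probabilities and a hypothetical sequence, chosen only for
 illustration, in which the $r$-th instance has probability
 $1 - 1/(r+1)^2$ given its predecessors and the falsehood of the
 generalisation. The product of these terms is $(n+2)/(2(n+1))$, which
 tends to $1/2$ and not to zero.

```python
from math import prod


def posterior(p0, cond_not_g):
    """p_n = p0 / (p0 + q_n * (1 - p0)), with q_n the product of cond_not_g."""
    q_n = prod(cond_not_g)
    return p0 / (p0 + q_n * (1.0 - p0))


def slow_sequence(n):
    """Hypothetical conditional probabilities 1 - 1/(r+1)^2 for r = 1..n."""
    return [1.0 - 1.0 / (r + 1) ** 2 for r in range(1, n + 1)]


p0 = 0.01  # hypothetical a priori probability of the generalisation
values = [posterior(p0, slow_sequence(n)) for n in (1, 10, 100, 10_000)]

# Value that p_n approaches from below, since q_n tends to 1/2.
limit = p0 / (p0 + 0.5 * (1.0 - p0))

print(values)
print(limit)
```

 Every added instance raises the computed value, but under these assumed
 figures it remains below the limiting value, which is itself far short
 of one half.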
 
 ¹ Since $p_n > p_{n-1}$ so long as $y_n \neq 1$.

 5. What are the conditions which must be satisfied in order
 that the rate, at which the probability of the generalisation
 increases, may be such that it will approach certainty as a
 limit when the number of independent instances of it is indefinitely
 increased? We have already shown, as a basis for
 this investigation, that $p_n$ approaches the limit of certainty for
 a generalisation $g$, if, as $n$ increases, $x_1x_2 \ldots x_n/\bar{g}h$ becomes
 small compared with $p_0$, i.e. if the à priori probability of so many
 instances, assuming the falsehood of the generalisation, is small
 compared with the generalisation's à priori probability. It
 follows, therefore, that the probability of an induction tends
 towards certainty as a limit, when the number of instances is
 increased, provided that

 $$x_r/x_1x_2 \ldots x_{r-1}\bar{g}h < 1 - \epsilon$$

 for all values of $r$, and $p_0 > \eta$, where $\epsilon$ and $\eta$ are finite probabilities,
 separated, that is to say, from impossibility by a value
 of some finite amount, however small. These conditions appear
 simple, but the meaning of a 'finite probability' requires a
 word of explanation.¹
 
 I argued in Chapter III. that not all probabilities have an
 exact numerical value, and that, in the case of some, one can say 
 no more about their relation to certainty and impossibility than 
 that they fall short of the former and exceed the latter. There 
 is one class of probabilities, however, which I called the numerical 
 class, the ratio of each of whose members to certainty can be 
 expressed by some number less than unity ; and we can sometimes 
 compare a non-numerical probability in respect of more and less 
 with one of these numerical probabilities. This enables us to 
 give a definition of ' finite probability ' which is capable of applica- 
 tion to non-numerical as well as to numerical probabilities. I 
 define a ' finite probability ' as one which exceeds some numerical 
 probability, the ratio of which to certainty can be expressed by 
 a finite number.² The principal method, in which a probability
 can be proved finite by a process of argument, arises either when 
 
 ¹ The proof of these conditions, which is obvious, is as follows:

 $$x_1x_2 \ldots x_n/\bar{g}h = x_n/x_1x_2 \ldots x_{n-1}\bar{g}h \cdot x_1x_2 \ldots x_{n-1}/\bar{g}h < (1 - \epsilon)^n,$$

 where $\epsilon$ is finite and $p_0 > \eta$ where $\eta$ is finite. There is always, under these
 conditions, some finite value of $n$ such that both $(1 - \epsilon)^n$ and $\dfrac{(1 - \epsilon)^n}{\eta}$ are less
 than any given finite quantity, however small.

 ² Hence a series of probabilities $p_1 p_2 \ldots p_r$ approaches a limit $L$, if, given
 any positive finite number $\epsilon$ however small, a positive integer $n$ can always be
 found such that for all values of $r$ greater than $n$ the difference between $L$ and $p_r$
 is less than $\epsilon \cdot \gamma$, where $\gamma$ is the measure of certainty.
 
 its conclusion can be shown to be one of a finite number of alternatives,
 which are between them exhaustive or, at any rate, have
 a finite probability, and to which the Principle of Indifference
 is applicable; or (more usually), when its conclusion is more
 probable than some hypothesis which satisfies this first condition.
 
 6. The conditions, which we have now established in order
 that the probability of a pure induction may tend towards
 certainty as the number of instances is increased, are (1) that
 $x_r/x_1x_2 \ldots x_{r-1}\bar{g}h$ falls short of certainty by a finite amount
 for all values of $r$, and (2) that $p_0$, the à priori probability of our
 generalisation, exceeds impossibility by a finite amount. It is
 easy to see that we can show by an exactly similar argument that
 the following more general conditions are equally satisfactory:

 (1) That $x_r/x_1x_2 \ldots x_{r-1}\bar{g}h$ falls short of certainty by a finite
 amount for all values of $r$ beyond a specified value $s$.

 (2) That $p_s$, the probability of the generalisation relative to
 a knowledge of these first $s$ instances, exceeds impossibility by
 a finite amount.
 
 In other words Pure Induction can be usefully employed to 
 strengthen an argument if, after a certain number of instances 
 have been examined, we have, from some other source, a finite
 probability in favour of the generalisation, and, assuming the 
 generalisation is false, a finite uncertainty as to its conclusion 
 being satisfied by the next hitherto unexamined instance which 
 satisfies its premiss. To take an example, Pure Induction can
 be used to support the generalisation that the sun will rise every 
 morning for the next million years, provided that with the ex- 
 perience we have actually had there are finite probabilities, 
 however small, derived from some other source, first, in favour of 
 the generalisation, and, second, in favour of the sun's not rising 
 to-morrow assuming the generalisation to be false. Given these 
 finite probabilities, obtained otherwise, however small, then the
 probability can be strengthened and can tend to increase towards
 certainty by the mere multiplication of instances provided
 that these instances are so far distinct that they are not
 inferable one from another.
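
 The two conditions yield a simple bound which can be sketched
 numerically. Since $1 - p_n = q_n(1 - p_0)/(p_0 + q_n(1 - p_0))$, where
 $q_n = x_1x_2 \ldots x_n/\bar{g}h$, the shortfall from certainty is less
 than $q_n/p_0$, and therefore less than $(1 - \epsilon)^n/\eta$ when both
 conditions hold. The code below assumes numerical probabilities; the
 function names and the figures chosen for $\epsilon$, $\eta$ and the
 target shortfall `delta` are hypothetical and serve only as an
 illustration.

```python
from math import ceil, log


def certainty_shortfall_bound(eps, eta, n):
    """Upper bound (1 - eps)^n / eta on 1 - p_n under the two conditions."""
    return (1.0 - eps) ** n / eta


def instances_needed(eps, eta, delta):
    """Smallest n for which the bound on 1 - p_n is at most delta.

    Assumes 0 < eps < 1, 0 < eta <= 1 and delta > 0.
    """
    n = max(0, ceil(log(delta * eta) / log(1.0 - eps)))
    # Guard against floating-point rounding in the logarithms.
    while certainty_shortfall_bound(eps, eta, n) > delta:
        n += 1
    return n


# Hypothetical figures: a 1% chance of failure at each step if g is false,
# and a one-in-a-million probability for g from some other source.
eps, eta, delta = 0.01, 1e-6, 1e-3
n_required = instances_needed(eps, eta, delta)

print(n_required)
print(certainty_shortfall_bound(eps, eta, n_required))
```

 However small the two finite probabilities are taken to be, some finite
 number of instances suffices on these assumptions; what the sketch cannot
 supply is the ground for supposing the probabilities finite in the first
 place.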
 
 7. Those supposed proofs of the Inductive Principle, which
 are based openly or implicitly on an argument in inverse probability,
 are all vitiated by unjustifiable assumptions relating
 to the magnitude of the à priori probability $p_0$. Jevons, for
 instance, avowedly assumes that we may, in the absence of special
 information, suppose any unexamined hypothesis to be as likely
 as not. It is difficult to see how such a belief, if even its most
 immediate implications had been properly apprehended, could
 have remained plausible to a mind of so sound a practical judgment
 as his. The arguments against it and the contradictions
 to which it leads have been dealt with in Chapter IV. The
 demonstration of Laplace, which depends upon the Rule of
 Succession, will be discussed in Chapter XXX.
 
 8. The prior probability, which must always be found, before 
 the method of pure induction can be usefully employed to support 
 a substantial argument, is derived, I think, in most ordinary
 cases (with what justification it remains to discuss) from considerations
 of Analogy. But the conditions of valid induction
 as they have been enunciated above, are quite independent of 
 analogy, and might be applicable to other types of argument. 
 In certain cases we might feel justified in assuming directly that 
 the necessary conditions are satisfied. 
 
 Our belief, for instance, in the validity of a logical scheme is
 based partly upon inductive grounds (on the number of conclusions,
 each seemingly true on its own account, which can be
 derived from the axioms) and partly on a degree of self-evidence
 in the axioms themselves sufficient to give them the initial
 probability upon which induction can build. We depend upon
 the initial presumption that, if a proposition appears to us to 
 be true, this is by itself, in the absence of opposing evidence, 
 some reason for its being as well as appearing true. We cannot 
 deny that what appears true is sometimes false, but, unless we 
 can assume some substantial relation of probability between
 the appearance and the reality of truth, the possibility of 
 even probable knowledge is at an end. 
 
 The conception of our having some reason, though not a 
 conclusive one, for certain beliefs, arising out of direct inspection, 
 may prove important to the theory of epistemology. The old 
 metaphysics has been greatly hindered by reason of its having 
 always demanded demonstrative certainty. Much of the cogency 
 of Hume's criticism arises out of the assumption of methods 
 of certainty on the part of those systems against which it was 
 directed. The earlier realists were hampered by their not perceiving
 that lesser claims in the beginning might yield them
 what they wanted in the end. And transcendental philosophy
 has partly arisen, I believe, through the belief that there is no 
 knowledge on these matters short of certain knowledge, being 
 combined with the belief that such certain knowledge of meta- 
 physical questions is beyond the power of ordinary methods. 
 
 When we allow that probable knowledge is, nevertheless, real, 
 a new method of argument can be introduced into metaphysical 
 discussions. The demonstrative method can be laid on one side, 
 and we may attempt to advance the argument by taking account 
 of circumstances which seem to give some reason for preferring 
 one alternative to another. Great progress may follow if the 
 nature and reality of objects of perception,¹ for instance, can be
 usefully investigated by methods not altogether dissimilar from 
 those employed in science and with the prospect of obtaining as 
 high a degree of certainty as that which belongs to some scientific 
 conclusions; and it may conceivably be shown that a belief in
 the conclusions of science, enunciated in any reasonable manner 
 however restricted, involves a preference for some metaphysical 
 conclusions over others. 
 
 9. Apart from analysis, careful reflection would hardly lead 
 us to expect that a conclusion which is based on no other than 
 grounds of pure induction, defined as I have defined them as
 consisting of repetition of instances merely, could attain in this 
 way to a high degree of probability. To this extent we ought 
 all of us to agree with Hume. We have found that the sugges- 
 tions of common sense are supported by more precise methods. 
 Moreover, we constantly distinguish between arguments, which 
 we call inductive, upon other grounds than the number of in- 
 stances upon which they are based ; and under certain conditions 
 we regard as crucial an insignificant number of experiments. The 
 method of pure induction may be a useful means of strengthening 
 a probability based on some other ground. In the case, however, 
 of most scientific arguments, which would commonly be called 
 inductive, the probability that we are right, when we make 
 predictions on the basis of past experience, depends not so 
 much on the number of past experiences upon which we rely, 
 as on the degree in which the circumstances of these experiences
 resemble the known circumstances in which the prediction is
 to take effect. Scientific method, indeed, is mainly devoted to
 discovering means of so heightening the known analogy that
 we may dispense as far as possible with the methods of pure
 induction.

 ¹ A paper by Mr. G. E. Moore entitled "The Nature and Reality of Objects
 of Perception," which was published in the Proceedings of the Aristotelian Society
 for 1906, seems to me to apply for the first time a method somewhat resembling
 that which is described above.
 
 When, therefore, our previous knowledge is considerable 
 and the analogy is good, the purely inductive part of the argu- 
 ment may take a very subsidiary place. But when our knowledge 
 of the instances is slight, we may have to depend upon pure 
 induction a good deal. In an advanced science it is a last resort, 
 the least satisfactory of the methods. But sometimes it must
 be our first resort, the method upon which we must depend in 
 the dawn of knowledge and in fundamental inquiries where 
 we must presuppose nothing. 
 
CHAPTER XXI 
 
 THE NATURE OF INDUCTIVE ARGUMENT CONTINUED 
 
 1. In the enunciation, given in the two preceding chapters, of the
 Principles of Analogy and Pure Induction there has been no
 reference to experience or causality or law. So far, the argument
 has been perfectly formal and might relate to a set of proposi- 
 tions of any type. But these methods are most commonly 
 employed in physical arguments where material objects or 
 experiences are the terms of the generalisation. We must con- 
 sider, therefore, whether there is any good ground, as some 
 logicians seem to have supposed, for restricting them to this 
 kind of inquiry. 
 
 I am inclined to think that, whether reasonably or not, we
 naturally apply them to all kinds of argument alike, including
 formal arguments as, for example, about numbers. When we
 are told that Fermat's formula for a prime, namely, 2^{2^a} + 1 for
 all values of a, has been verified in every case in which
 verification is not excessively laborious, namely, for a = 1, 2, 3,
 and 4, we feel that this is some reason for accepting it, or, at
 least, that it raises a sufficient presumption to justify a
 further examination of the formula.¹ Yet there can be no refer-
 ence here to the uniformity of nature or physical causation. If
 inductive methods are limited to natural objects, there can no
 more be an appreciable ground for thinking that 2^{2^a} + 1 is a true
 formula for primes, because empirical methods show that it
 yields primes up to a = 4, or even if they showed that it yielded
 primes for every number up to a million million, than there is
 to think that any formula which I may choose to write down
 at random is a true source of primes. To maintain that there is
 no appreciable ground in such a case is paradoxical. If, on the
 other hand, a partial verification does raise some just appreciable
 presumption in the formula's favour, then we must include
 numbers, at any rate, as well as material objects amongst the
 proper subjects of the inductive method. The conclusion of
 the previous chapter indicates, however, that, if arguments of
 this kind have force, it can only be in virtue of there being
 some finite à priori probability for the formula based on other
 than inductive grounds.

 ¹ This formula has, in fact, been disproved in recent times, e.g. 2^{2^5} + 1 =
 4,294,967,297 = 641 × 6,700,417. Thus it is no longer so good an illustration
 as it would have been a hundred years ago.
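
 The partial verification described above is easy to repeat by machine. What follows is a minimal sketch in Python, not part of the original text; the helper names is_prime and fermat are illustrative assumptions. It tests 2^{2^a} + 1 for a = 1 to 5 by simple trial division.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for numbers of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def fermat(a: int) -> int:
    """Fermat's formula 2^(2^a) + 1."""
    return 2 ** (2 ** a) + 1


# Repeat the empirical check for a = 1..4, then go one step further.
for a in range(1, 6):
    print(a, fermat(a), is_prime(fermat(a)))
```

 The four cases a = 1, 2, 3, 4 pass and the case a = 5 fails, in agreement with the factorisation 641 × 6,700,417 quoted in the footnote.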
 
 There are some illustrations in Jevons's Principles of Science,²
 which are relevant to this discussion. We find it to be true of 
 the following six numbers : 
 
 5, 15, 35, 45, 65, 95 
 
 that they all end in five, and are all divisible by five without re- 
 mainder. Would this fact, by itself, raise any kind of presump- 
 tion that all numbers ending in five are divisible by five without 
 remainder ? Let us also consider the six numbers, 
 
 7, 17, 37, 47, 67, 97. 
 
 They all end in seven and also agree in being primes. Would
 this raise a presumption in favour of the generalisation that all
 numbers are prime, which end in seven ? We might be prejudiced
 in favour of the first argument, because it would lead us to a
 true conclusion ; but we ought not to be prejudiced against the
 second because it would lead us to a false one ; for the validity
 of empirical arguments as the foundation of a probability cannot
 be affected by the actual truth or falsity of their conclusions. 
 If, on the evidence, the analogy is similar and equal, and if the 
 scope of the generalisation and its conclusion is similar, then the 
 value of the two arguments must be equal also. 
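
 The two illustrations can be set side by side in a short sketch (Python; the variable names and the helper is_prime are assumptions introduced only for this example). It confirms that the six instances support each generalisation equally well, and then searches the numbers below 100 for exceptions to each.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for two-digit numbers."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


fives = [5, 15, 35, 45, 65, 95]
sevens = [7, 17, 37, 47, 67, 97]

# The evidence: every listed instance agrees with its generalisation.
print(all(n % 5 == 0 for n in fives))
print(all(is_prime(n) for n in sevens))

# The conclusions: look for exceptions among all numbers below 100
# that end in five or in seven respectively.
exceptions_five = [n for n in range(5, 100, 10) if n % 5 != 0]
exceptions_seven = [n for n in range(7, 100, 10) if not is_prime(n)]
print(exceptions_five)
print(exceptions_seven)
```

 The first generalisation meets no exception; the second fails at 27, 57, 77, and 87, although the evidence offered for the two was of the same form.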
 
 Whether or not the use of empirical argument appears plausible
 to us in these particular examples, it is certainly true that many 
 mathematical theorems have actually been discovered by such 
 methods. Generalisations have been suggested nearly as often, 
 perhaps, in the logical and mathematical sciences, as in the
 physical, by the recognition of particular instances, even where
 formal proof has been forthcoming subsequently. Yet if the
 suggestions of analogy have no appreciable probability in the
 formal sciences, and should be permitted only in the material, it
 must be unreasonable for us to pursue them. If no finite prob-
 ability exists that a formula, for which we have empirical verifica-
 tion, is in fact universally true, Newton was acting fortunately,
 but not reasonably, when he hit on the Binomial Theorem by
 methods of empiricism.³

 ² Pp. 229-231 (one volume edition). Jevons uses these illustrations, not
 for the purpose to which I am here putting them, but to demonstrate the
 fallibility of empirical laws.
 
 2. I am inclined to believe, therefore, that, if we trust the
 promptings of common sense, we have the same kind of ground
 for trusting analogy in mathematics that we have in physics,
 and that we ought to be able to apply any justification of the
 method, which suits the latter case, to the former also. This
 does not mean that the à priori probabilities, from some other
 source than induction, which the inductive method requires as 
 its foundation, may not be sought and found differently in the 
 two types of inquiry. A reason why it has been thought 
 that analogy ought to be confined to natural laws may be, 
 perhaps, that in most of those cases, in which we could 
 support a mathematical theorem by a very strong analogy, the 
 existence of a formal proof has done away with the necessity 
 for the limping methods of empiricism ; and because in most 
 mathematical investigations, while in our earliest thoughts 
 we are not ashamed to consult analogy, our later work will be 
 more profitably spent in searching for a formal proof than in 
 establishing analogies which must, at the best, be relatively weak. 
 As the modern scientist discards, as a rule, the method of pure 
 induction, in favour of experimental analogy, where, if he 
 takes account of his previous knowledge, one or two cases may 
 prove immensely significant ; so the modern mathematician
 prefers the resources of his analysis, which may yield him 
 certainty, to the doubtful promises of empiricism. 
 
 3. The main reason, however, why it has often been held that 
 we ought to limit inductive methods to the content of the particu- 
 lar material universe in which we live, is, most probably, the 
 fact that we can easily imagine a universe so constructed that 
 such methods would be useless. This suggests that analogy and 
 induction, while they happen to be useful to us in this world,
 cannot be universal principles of logic, on the same footing, for
 instance, as the syllogism.

 ³ See Jevons, loc. cit. p. 231.
 
 In one sense this opinion may be well founded. I do not deny 
 or affirm at present that it may be necessary to confine inductive 
 methods to arguments about certain kinds of objects or certain 
 kinds of experiences. It may be true that in every useful argu- 
 ment from analogy our premisses must contain fundamental 
 assumptions, obtained directly and not inductively, which some 
 possible experiences might preclude. Moreover, the success of 
 induction in the past can certainly affect its probable usefulness 
 for the future. We may discover something about the nature
 of the universe (we may even discover it by means of induction
 itself) the knowledge of which has the effect of destroying the
 further utility of induction. I shall argue later on that the 
 confidence with which we ourselves use the method does in 
 fact depend upon the nature of our past experience. 
 
 But this empirical attitude towards induction may, on the 
 other hand, arise out of either one of two possible confusions. 
 It may confuse, first, the reasonable character of arguments 
 with their practical usefulness. The usefulness of induction 
 depends, no doubt, upon the actual content of experience. If 
 there were no repetition of detail in the universe, induction 
 would have no utility. If there were only a single object in the 
 universe, the laws of addition would have no utility. But the
 processes of induction and addition would remain reasonable.
 It may confuse, secondly, the validity of attributing probability
 to the conclusion of an argument with the question of the actual 
 truth of the conclusion. Induction tells us that, on the basis of 
 certain evidence, a certain conclusion is reasonable, not that it is 
 true. If the sun does not rise to-morrow, if Queen Anne still 
 lives, this will not prove that it was foolish or unreasonable of us 
 to have believed the contrary. 
 
 4. It will be worth while to say a little more in this connection
 about the not infrequent failure to distinguish the rational from
 the true. The excessive ridicule, which this mistake has visited
 on the supposed irrationality of barbarous and primitive peoples,
 affords some good examples. "Reflection and enquiry should
 satisfy us," says Dr. Frazer in the Golden Bough, "that to our
 predecessors we are indebted for much of what we thought most 
 our own, and that their errors were not wilful extravagances
 or the ravings of insanity, but simply hypotheses, justifiable as
 such at the time when they were propounded, but which a fuller
 experience has proved to be inadequate. . . . Therefore, in 
 reviewing the opinions and practices of ruder ages and races we 
 shall do well to look with leniency upon their errors as inevitable 
 slips made in the search for truth. . . ." The first introduction of 
 iron ploughshares into Poland, he tells in another passage, having 
 been followed by a succession of bad harvests, the farmers attri- 
 buted the badness of the crops to the iron ploughshares, and dis- 
 carded them for the old wooden ones. The method of reasoning 
 of the farmers is not different from that of science, and may,
 surely, have had for them some appreciable probability in its
 favour. "It is a curious superstition," says a recent pioneer in
 Borneo, "this of the Dusuns, to attribute anything, whether
 good or bad, lucky or unlucky, that happens to them to
 something novel which has arrived in their country. For instance,
 my living in Kindram has caused the intensely hot weather we
 have experienced of late."⁴ What is this curious superstition
 but the Method of Difference ?
 
 The following passage from Jevons's Principles of Science well 
 illustrates the tendency, to which he himself yielded, to depreci- 
 ate the favourite analogies of one age, because the experience of 
 their successors has confuted them. Between things which are 
 the same in number, he points out, there is a certain resemblance, 
 namely in number ; and in the infancy of science men could not 
 be persuaded that there was not a deeper resemblance implied 
 in that of number. "Seven days are mentioned in Genesis ;
 infants acquire their teeth at the end of seven months ; they
 change them at the end of seven years ; seven feet was the limit
 of man's height ; every seventh year was a climacteric or critical 
 year, at which a change of disposition took place. In natural 
 science there were not only the seven planets, and the seven 
 metals, but also the seven primitive colours, and the seven tones 
 of music. So deep a hold did this doctrine take that we still have 
 its results in many customs, not only in the seven days of the 
 week, but the seven years' apprenticeship, puberty at fourteen 
 years, the second climacteric, and legal majority at twenty-one
 years, the third climacteric." Religious systems from Pythagoras
 to Comte have sought to derive strength from the virtue of seven.
 "And even in scientific matters the loftiest intellects have occa-
 sionally yielded, as when Newton was misled by the analogy
 between the seven tones of music and the seven colours of his
 spectrum. . . . Even the genius of Huyghens did not prevent
 him from inferring that but one satellite could belong to Saturn,
 because, with those of Jupiter and the earth, it completed the
 perfect number of six." But is it certain that Newton and
 Huyghens were only reasonable when their theories were true,
 and that their mistakes were the fruit of a disordered fancy ?
 Or that the savages, from whom we have inherited the most
 fundamental inductions of our knowledge, were always super-
 stitious when they believed what we now know to be
 preposterous ?

 ⁴ Golden Bough, p. 174.
 
 It is important to understand that the common sense of the 
 race has been impressed by very weak analogies and has attri- 
 buted to them an appreciable probability, and that a logical 
 theory, which is to justify common sense, need not be afraid of 
 including these marginal cases. Even our belief in the real 
 existence of other people, which we all hold to be well estab-
 lished, may require for its justification the combination of
 experience with a just appreciable à priori possibility for
 Animism generally.⁵ If we actually possess evidence which
 renders some conclusion absurd, it is very difficult for us to
 appreciate the relation of this conclusion to data which are
 different and less complete ; but it is essential that we should
 realise arguments from analogy as relative to premisses, if we are 
 to approach the logical theory of Induction without prejudice. 
 
 5. While we depreciate the former probability of beliefs 
 which we no longer hold, we tend, I think, to exaggerate the 
 present degree of certainty of what we still believe. The preceding 
 paragraph is not intended to deny that savages often greatly
 overestimate the value of their crude inductions, and are to this
 extent irrational. It is not easy to distinguish between a belief's
 being the most reasonable of those which it is open to us to
 believe, and its being more probable than not. In the same way
 we, perhaps, put an excessive confidence in those conclusions
 (the existence of other people, for instance, the law of gravity, or
 to-morrow's sunrise) of which, in comparison with many other
 beliefs, we are very well assured. We may sometimes confuse
 the practical certainty, attaching to the class of beliefs upon which
 it is rational to act with the utmost confidence, with the more
 wholly objective certainty of logic. We might rashly assert, for
 instance, that to-morrow's sunrise is as likely to us as failure,
 and the special virtue of the number seven as unlikely, even to
 Pythagoras, as success, in an attempt to throw heads a hundred
 times in succession with an unbiassed coin.⁶

 ⁵ "This is animism, or that sense of something in Nature which to the
 enlightened or civilised man is not there, and in the civilised man's child, if it
 be admitted that he has it at all, is but a faint survival of a phase of the
 primitive mind. And by animism I do not mean the theory of a soul in
 nature, but the tendency or impulse or instinct, in which all myth originates,
 to animate all things ; the projection of ourselves into nature ; the sense and
 apprehension of an intelligence like our own, but more powerful in all visible
 things" (Hudson, Far Away and Long Ago, pp. 224-5). This 'tendency or
 impulse or instinct,' refined by reason and enlarged by experience, may be
 required, in the shape of an intuitive à priori probability, if some of those
 universal conclusions of common sense, which the most sceptical do not kick
 away, are to be supported with rational foundations.
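
 The magnitude involved in the coin illustration can be made concrete. The sketch below (Python) is an editorial illustration and not part of the original text. It computes the chance of a hundred heads in succession with an unbiassed coin, and then a rough waiting time. The world population of 1,600 million, the year of 365.25 days, and the convention that every toss counts as the end of a possible run of a hundred are all assumptions made for this example; they are not taken from Grimsehl's calculation.

```python
from fractions import Fraction

# Exact probability that 100 tosses of an unbiassed coin all show heads.
p_run = Fraction(1, 2) ** 100
print(p_run)

# Assumptions for the rough waiting time (not from the source):
population = 1_600_000_000           # assumed number of inhabitants
seconds_per_year = 365.25 * 24 * 3600

# Each inhabitant tosses once a second; each toss is treated as the
# end of one window of 100 tosses, so windows overlap.
windows_per_year = population * seconds_per_year
expected_years = 1 / (float(p_run) * windows_per_year)
print(f"{expected_years:.2e} years")
```

 Under these assumptions the result is of the order of 10^13 years. That is the same order as the figure of twenty billion years cited in the footnote, if "billion" is read in the older British sense of a million million.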
 
 6. As it has often been held upon various grounds, with 
 reason or without, that the validity of Induction and Analogy 
 depends in some way upon the character of the actual world, 
 logicians have sought for material laws upon which these methods 
 can be founded. The Laws of Universal Causation and the 
 Uniformity of Nature, namely, that all events have some cause 
 and that the same total cause always produces the same effect,
 are those which commonly do service. But these principles 
 merely assert that there are some data, from which events posterior 
 to them in time could be inferred. They do not seem to yield us 
 much assistance in solving the inductive problem proper, or in 
 determining how we can infer with probability from partial data. 
 It has been suggested in the previous chapter that the Principle 
 of the Uniformity of Nature amounts to an assertion that an 
 argument from perfect analogy (defined as I have defined it) is 
 valid when applied to events only differing in their positions in 
 time or space.⁷ It has also been pointed out that ordinary
 inductive arguments appear to be strengthened by any evidence
 which makes them approximate more closely in character to a 
 perfect analogy. But this, I think, is the whole extent to which 
 this principle, even if its truth could be assumed, would help us.
 States of the universe, identical in every particular, may never
 recur, and, even if identical states were to recur, we should not
 know it.

 ⁶ Yet if every inhabitant of the world, Grimsehl has calculated, were to toss
 a coin every second, day and night, this latter event would only occur once on
 the average in every twenty billion years.

 ⁷ Is this interpretation of the Principle of the Uniformity of Nature affected
 by the Doctrine of Relativity?
 
 The kind of fundamental assumption about the character of 
 material laws, on which scientists appear commonly to act, 
 seems to me to be much less simple than the bare principle of 
 Uniformity. They appear to assume something much more like 
 what mathematicians call the principle of the superposition of 
 small effects, or, as I prefer to call it, in this connection, the 
 atomic character of natural law. The system of the material 
 universe must consist, if this kind of assumption is warranted, 
 of bodies which we may term (without any implication as to 
 their size being conveyed thereby) legal atoms, such that each of 
 them exercises its own separate, independent, and invariable 
 effect, a change of the total state being compounded of a number
 of separate changes each of which is solely due to a separate
 portion of the preceding state. We do not have an invariable
 relation between particular bodies, but nevertheless each has on
 the others its own separate and invariable effect, which does not
 change with changing circumstances, although, of course, the 
 total effect may be changed to almost any extent if all the other 
 accompanying causes are different. Each atom can, accord- 
 ing to this theory, be treated as a separate cause and does 
 not enter into different organic combinations in each of which 
 it is regulated by different laws. 
 
 Perhaps it has not always been realised that this atomic 
 uniformity is in no way implied by the principle of the
 Uniformity of Nature. Yet there might well be quite different 
 laws for wholes of different degrees of complexity, and laws of 
 connection between complexes which could not be stated in 
 terms of laws connecting individual parts. In this case 
 natural law would be organic and not, as it is generally 
 supposed, atomic. If every configuration of the Universe were 
 subject to a separate and independent law, or if very small 
 differences between bodies — in their shape or size, for instance, — 
 led to their obeying quite different laws, prediction would be 
 impossible and the inductive method useless. Yet nature might 
 still be uniform, causation sovereign, and laws timeless and 
 absolute. 
 
 The scientist wishes, in fact, to assume that the occurrence
 of a phenomenon which has appeared as part of a more complex
 phenomenon, may be some reason for expecting it to be associated
 on another occasion with part of the same complex. Yet if
 different wholes were subject to different laws quâ wholes and
 not simply on account of and in proportion to the differences of 
 their parts, knowledge of a part could not lead, it would seem, 
 even to presumptive or probable knowledge as to its association 
 with other parts. Given, on the other hand, a number of legally
 atomic units and the laws connecting them, it would be possible
 to deduce their effects pro tanto without an exhaustive knowledge
 of all the coexisting circumstances.
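
 The contrast between atomic and organic law may be illustrated by a toy model. The sketch below (Python) is an editorial illustration only. The three "legal atoms" a, b, c, their numerical effects, and the table organic_law are hypothetical values invented for the example. Under the atomic law the effect of a whole is the sum of the separate effects of its parts, so that knowledge of the parts yields knowledge of the whole. Under the organic law each whole has its own effect, and the same inference fails.

```python
# Hypothetical separate and invariable effects of three legal atoms.
atomic_effect = {"a": 2.0, "b": -1.0, "c": 0.5}


def atomic_law(state):
    """Total change compounded of the separate effects of the parts."""
    return sum(atomic_effect[x] for x in state)


# A hypothetical organic law: each whole is governed by its own rule,
# which cannot be stated in terms of rules for the individual parts.
organic_law = {
    frozenset("a"): 2.0, frozenset("b"): -1.0, frozenset("c"): 0.5,
    frozenset("ab"): 7.0, frozenset("ac"): -3.0, frozenset("bc"): 4.0,
    frozenset("abc"): 0.0,
}

# Observe each part in isolation, then predict the unobserved whole.
observed = {x: atomic_law({x}) for x in "abc"}
predicted = sum(observed[x] for x in "abc")

print(predicted == atomic_law(set("abc")))         # prediction holds
print(predicted == organic_law[frozenset("abc")])  # prediction fails
```

 Both laws are uniform in the sense that the same whole always produces the same effect; only the atomic one allows the effect of a whole to be deduced from a knowledge of its parts.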
 
 We do habitually assume, I think, that the size of the atomic 
 unit is for mental events an individual consciousness, and for 
 material events an object small in relation to our perceptions. 
 These considerations do not show us a way by which we can 
 justify Induction. But they help to elucidate the kind of assump- 
 tions which we do actually make, and may serve as an introduction 
 to what follows. 
 
CHAPTER XXII 
 
 THE JUSTIFICATION OF THESE METHODS 
 
 1. The general line of thought to be followed in this chapter may
 be indicated, briefly, at the outset. 
 
 A system of facts or propositions, as we ordinarily conceive 
 it, may comprise an indefinite number of members. But the 
 ultimate constituents or indefinables of the system, which all 
 the members of it are about, are less in number than these 
 members themselves. Further, there are certain laws of necessary 
 connection between the members, by which it is meant (I do not 
 stop to consider whether more than this is meant) that the truth 
 or falsity of every member can be inferred from a knowledge of 
 the laws of necessary connection together with a knowledge of the 
 truth or falsity of some (but not all) of the members.
 
 The ultimate constituents together with the laws of necessary 
 connection make up what I shall term the independent variety 
 of the system. The more numerous the ultimate constituents 
 and the necessary laws, the greater is the system's independent 
 variety. It is not necessary for my present purpose, which is 
 merely to bring before the reader's mind the sort of conception 
 which is in mine, that I should attempt a complete definition 
 of what I mean by a system. 
 
 Now it is characteristic of a system, as distinguished from 
 a collection of heterogeneous and independent facts or proposi- 
 tions, that the number of its premisses, or, in other words, the 
 amount of independent variety in it, should be less than the 
 number of its members. But it is not an obviously essential 
 characteristic of a system that its premisses or its independent
 variety should be actually finite. We must distinguish,
 therefore, between systems which may be termed finite and
 infinite respectively, the terms finite and infinite referring not to
 the number of members in the system but to the amount of
 independent variety in it.
 
 The purpose of the discussion, which occupies the greater 
 part of this chapter, is to maintain that, if the premisses of our 
 argument permit us to assume that the facts or propositions, 
 with which the argument is concerned, belong to a finite system, 
 then probable knowledge can be validly obtained by means of
 an inductive argument. I now proceed to approach the question
 from a slightly different standpoint, the controlling idea, however,
 being that which is outlined above. 
 
 2. What is our actual course of procedure in an inductive
 argument ? We have before us, let us suppose, a set of n in-
 stances which have r known qualities, a_1 a_2 ... a_r in common,
 these r qualities constituting the known positive analogy. From
 these qualities three (say) are picked out, namely, a_1, a_2, a_3, and
 we inquire with what probability all objects having these three
 qualities have also certain other qualities which we have picked
 out, namely, a_{r-1}, a_r. We wish to determine, that is to say,
 whether the qualities a_{r-1}, a_r are bound up with the qualities
 a_1, a_2, a_3. In thus approaching this question we seem to
 suppose that the qualities of an object are bound together in
 a limited number of groups, a sub-class of each group being an
 infallible symptom of the coexistence of certain other members
 of it also.
 
 Three possibilities are open, any of which would prove
 destructive to our generalisation. It may be the case (1) that
 a_{r-1} or a_r is independent of all the other qualities of the instances
 (they may not overlap, that is to say, with any other groups) ;
 or (2) that a_1 a_2 a_3 do not belong to the same groups as a_{r-1} a_r ;
 or (3) that a_1 a_2 a_3, while they belong to the same group as a_{r-1} a_r,
 are not sufficient to specify this group uniquely (they belong,
 that is to say, to other groups also which do not include a_{r-1} and
 a_r). The precautions we take are directed towards reducing the
 likelihood, so far as we can, of each of these possibilities. We
 distrust the generalisation if the terms typified by a_{r-1} a_r are
 numerous and comprehensive, because this increases the likeli-
 hood that some at least of them fall under heading (1), and also
 because it increases the likelihood of (3). We trust it if the
 terms typified by a_1 a_2 a_3 are numerous and comprehensive,
 because this decreases the likelihood both of (2) and of (3). If
 we find a new instance which agrees with the former instances in
 a_1 a_2 a_3 a_{r-1} a_r but not in a_4, we welcome it, because this disposes of
 the possibility that it is a_4, alone or in combination, that is bound
 up with a_{r-1} a_r. We desire to increase our knowledge of the
 properties, lest there be some positive analogy which is escaping us,
 and when our knowledge is incomplete we multiply instances,
 which we do not know to increase the negative analogy for
 certain, in the hope that they may do so.
 
 If we sum up the various methods of Analogy, we find, I 
 think, that they are all capable of arising out of an underlying 
 assumption, that if we find two sets of qualities in coexistence
 there is a finite probability that they belong to the same group, 
 and a finite probability also that the first set specifies this group 
 uniquely. Starting from this assumption, the object of the 
 methods is to increase the finite probability and make it large. 
 Whether or not anything of this sort is explicitly present to our 
 minds when we reason scientifically, it seems clear to me that we 
 do act exactly as we should act, if this were the assumption from 
 which we set out. 
 
 In most cases, of course, the field is greatly simplified from 
 the first by the use of our pre-existing knowledge. Of the
 properties before us we generally have good reason, derived 
 from prior analogies, for supposing some to belong to the same 
 group and others to belong to different groups. But this does 
 not affect the theoretical problem confronting us. 
 
 3. What kind of ground could justify us in assuming the 
 existence of these finite probabilities which we seem to require ? 
 If we are to obtain them, not directly, but by means of argument, 
 we must somehow base them upon a finite number of exhaustive 
 alternatives. 
 
 The following line of argument seems to me to represent, on 
 the whole, the kind of assumption which is obscurely present to 
 our minds. We suppose, I think, that the almost innumerable
 apparent properties of any given object all arise out of a finite
 number of generator properties, which we may call φ_1 φ_2 φ_3 . . .
 Some arise out of φ_1 alone, some out of φ_1 in conjunction with φ_2,
 and so on. The properties which arise out of φ_1 alone form one
 group ; those which arise out of φ_1 φ_2 in conjunction form another
 group, and so on. Since the number of generator properties is
 finite, the number of groups also is finite. If a set of apparent
 properties arise (say) out of three generator properties φ_1 φ_2 φ_3,
 then this set of properties may be said to specify the group
 φ_1 φ_2 φ_3. Since the total number of apparent properties is assumed
 to be greater than that of the generator properties, and since the
 number of groups is finite, it follows that, if two sets of apparent
 properties are taken, there is, in the absence of evidence to the
 contrary, a finite probability that the second set will belong
 to the group specified by the first set.
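
 The counting step in this argument can be sketched as follows (Python). The sketch is an editorial illustration under two assumptions that are not in the text: that every non-empty combination of generators forms exactly one group, and that, purely for illustration, equal weight is given to each group. The names groups, phi1, phi2, phi3 are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def groups(generators):
    """Every non-empty combination of generator properties (assumed
    here to correspond to exactly one group of apparent properties)."""
    return [
        frozenset(c)
        for r in range(1, len(generators) + 1)
        for c in combinations(generators, r)
    ]


gens = ["phi1", "phi2", "phi3"]
gs = groups(gens)

# A finite number of generators gives a finite number of groups.
print(len(gs))

# With equal weights (an assumption made only for illustration), the
# chance that a second set of properties falls in one named group is
# a finite fraction, however small.
print(Fraction(1, len(gs)))
```

 With k generators there are 2^k - 1 such groups, so that the fraction, though it shrinks rapidly as k grows, remains finite so long as k is finite.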
 
 There is, however, the possibility of a plurality of generators. 
 The first set of apparent properties may specify more than one 
 group, — there is more than one group of generators, that is to 
 say, which are competent to produce it ; and some only of these 
 groups may contain the second set of properties. Let us, for 
 the moment, rule out this possibility. 
 
 When we argue from an analogy, and the instances have
 two groups of characters in common, namely φ and f, either f
 belongs to the group φ or it arises out of generators partly distinct
 from those out of which φ arises. For the reason already ex-
 plained there is a finite probability that f and φ belong to the
 same group. If this is the case, i.e. if the generalisation g(φ, f)
 is valid, then f will certainly be true of all other cases in which
 φ is true ; if this is not the case, then f will not always be true
 when φ is true. We have, therefore, the preliminary conditions
 necessary for the application of pure induction. If x_1, etc., are
 the instances,

 g/h = p_0, where p_0 is finite,
 x_1/gh = 1, etc.,

 and x_n/x_1 x_2 ... x_{n-1} ḡ h = 1 - ε, where ε is finite.
 
 And hence, by the argument of Chapter XX., the probability of a 
 generalisation, based on such evidence as this, is capable, under 
 suitable conditions, of tending towards certainty as a limit, when 
 the number of instances is increased. 
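
 The tendency towards certainty can be seen numerically. The sketch below (Python) is an editorial illustration and makes a simplifying assumption not in the text: that the probability of each further instance, supposing the generalisation false, is exactly 1 - ε at every step. With an initial probability p_0 for the generalisation, the rule of inverse probability then gives p_n = p_0 / (p_0 + (1 - p_0)(1 - ε)^n) after n favourable instances. The function name posterior and the numerical values are hypothetical.

```python
def posterior(p0: float, eps: float, n: int) -> float:
    """Probability of the generalisation after n favourable instances,
    assuming each instance has probability 1 if it is true and
    1 - eps if it is false (a simplifying assumption)."""
    return p0 / (p0 + (1 - p0) * (1 - eps) ** n)


# A small but finite initial probability grows towards certainty.
for n in (0, 10, 100, 1000):
    print(n, posterior(0.01, 0.05, n))

# If the initial probability is not finite, or if eps is zero,
# multiplying instances achieves nothing.
print(posterior(0.0, 0.05, 1000))
print(posterior(0.01, 0.0, 1000))
```

 The last two lines show why both conditions are needed: with p_0 = 0 the probability stays at zero, and with ε = 0 it stays at its initial value, whatever the number of instances.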
 
 If φ is complex and includes a number of characters which
 are not always found together, it must include a number of
 separate generator properties and specify a large group ; hence
 the initial probability that f belongs to this group is relatively
 large. If, on the other hand, f is complex, there will be, for the
 same reasons mutatis mutandis, a relatively smaller initial prob-
 ability than otherwise that f belongs to any other given group.
 
CH. XXII INDUCTION AND ANALOGY 255
 
 When the argument is mainly by analogy, we endeavour to 
 obtain evidence which makes the initial probability $p_0$ relatively
 high; when the analogy is weak and the argument depends for
 its strength upon pure induction, $p_0$ is small and $p_m$, which is
 based upon numerous instances, depends for its magnitude upon
 their number. But an argument from induction must always 
 involve some element of analogy, and, on the other hand, few 
 arguments from analogy can afford to ignore altogether the 
 strengthening influence of pure induction. 
 
 4. Let us consider the manner in which the methods of 
 analogy increase the initial likelihood that two characters belong
 to the same group. The numerous characters of an object which
 are known to us may be represented by $a_1 a_2 \ldots a_n$. We select
 two sets of these, $a_r$ and $a_s$, and seek to determine whether $a_s$
 always belongs to the group specified by $a_r$. Our previous knowledge
 will enable us, in general, to rule out many of the object's
 characters as being irrelevant to the groups specified by $a_r$ and $a_s$,
 although this will not be possible in the most fundamental inquiries.
 We may also know that certain characters are always
 associated with $a_r$ or with $a_s$. But there will be left a residuum
 of whose connection with $a_r$ or $a_s$ we are ignorant. These
 characters, whose relevance is in doubt, may be represented by
 $a_{r+1} \ldots a_{s-1}$. If the analogy is perfect, these characters are
 eliminated altogether. Otherwise, the argument is weakened
 in proportion to the comprehensiveness of these doubtful characters.
 For it may be the case that some of $a_{r+1} \ldots a_{s-1}$ are
 necessary as well as $a_r$ in order to specify all the generators
 which are required to produce $a_s$.
 
 5. We may possibly be justified in neglecting certain of the 
 characters $a_{r+1} \ldots a_{s-1}$ by direct judgments of irrelevance.
 There are certain properties of objects which we rule out from 
 the beginning as wholly or largely independent and irrelevant to 
 all, or to some, other properties. The principal judgments of 
 this kind, and those alone about which we seem to feel much 
 confidence, are concerned with absolute position in time and 
 space, this class of judgments of irrelevance being summed up, 
 I have suggested, in the Principle of the Uniformity of Nature. 
 We judge that mere position in time and space cannot possibly 
 affect, as a determining cause, any other characters ; and this 
 belief appears so strong and certain, although it is hard to see
 how it can be based on experience, that the judgment by which
 we arrive at it seems perhaps to be direct. A further type of 
 instance in which some philosophers seem to have trusted direct 
 judgments of relevance in these matters arises out of the relation 
 between mind and matter. They have believed that no mental 
 event can possibly be a necessary condition for the occurrence of 
 a material event. 
 
 The Principle of the Uniformity of Nature, as I interpret it, 
 supplies the answer, if it is correct, to the criticism that the 
 instances, on which generalisations are based, are all alike in
 being past, and that any generalisation, which is applicable to
 the future, must be based, for this reason, upon imperfect analogy.
 We judge directly that the resemblance between instances, which
 consists in their being past, is in itself irrelevant, and does not
 supply a valid ground for impugning a generalisation. 
 
 But these judgments of irrelevance are not free from difficulty, 
 and we must be suspicious of using them. When I say that position
 is irrelevant, I do not mean to deny that a generalisation, the
 premiss of which specifies position, may be true, and that the 
 same generalisation without this limitation might be false. But 
 this is because the generalisation is incompletely stated ; it 
 happens that objects so specified have the required characters, 
 and hence their position supplies a sufficient criterion. Position 
 may be relevant as a sufficient condition but never as a necessary 
 condition, and the inclusion of it can only affect the truth of a 
 generalisation when we have left out some other essential condition.
 A generalisation which is true of one instance must be
 true of another which only differs from the former by reason of 
 its position in time or space. 
 
 6. Excluding, therefore, the possibility of a plurality of 
 generators, we can justify the method of perfect analogy, and 
 other inductive methods in so far as they can be made to 
 approximate to this, by means of the assumption that the 
 objects in the field, over which our generalisations extend, do
 not have an infinite number of independent qualities; that, in
 other words, their characteristics, however numerous, cohere
 together in groups of invariable connection, which are finite
 in number. This does not limit the number of entities which
 are only numerically distinct. In the language used at the
 beginning of this chapter, the use of inductive methods can be
 justified if they are applied to what we have reason to suppose
 a finite system.¹
 
 7. Let us now take account of a possible plurality of 
 generators. I mean by this the possibility that a given char- 
 acter can arise in more than one way, can belong to more than 
 one distinct group, and can arise out of more than one generator. 
 $\phi$ might, for instance, be sometimes due to a generator $a_1$, and
 $a_1$ might invariably produce $f$. But we could not generalise
 from $\phi$ to $f$, if $\phi$ might be due in other cases to a different
 generator $a_2$ which would not be competent to produce $f$.
 
 If we were dealing with inductive correlation, where we do 
 not claim universality for our conclusions, it would be sufficient 
 for us to assume that the number of distinct generators, to which 
 a given property $\phi$ can be due, is always finite. To obtain validity
 for universal generalisations it seems necessary to make the more 
 comprehensive and less plausible assumption that a finite prob- 
 ability always exists that there is not, in any given case, a plurality 
 of causes. With this assumption we have a valid argument from 
 pure induction on the same lines, nearly, as before. 
 
 8. We have thus two distinct difficulties to deal with, and we 
 require for the solution of each a separate assumption. The 
 point may be illustrated by an example in which only one of the 
 difficulties is present. There are few arguments from analogy of 
 which we are better assured than the existence of other people. 
 We feel indeed so well assured of their existence that it has been 
 thought sometimes that our knowledge of them must be in some 
 way direct. But analogy does not seem to me unequal to the 
 proof. We have numerous experiences in our own person of 
 acts which are associated with states of consciousness, and we 
 infer that similar acts in others are likely to be associated with 
 similar states of consciousness. But this argument from analogy 
 is superior in one respect to nearly all other empirical argu- 
 ments, and this superiority may possibly explain the great con- 
 fidence which we feel in it. We do seem in this case to have 
 direct knowledge, such as we have in no other case, that our 
 states of consciousness are, sometimes at least, causally connected
 with some of our acts. We do not, as in other cases,
 merely observe invariable sequence or coexistence between consciousness
 and act; and we do believe it to be vastly improbable
 in the case of some at least of our own physical acts that they
 could have occurred without a mental act to support them.
 Thus, we seem to have a special assurance of a kind not usually
 available for believing that there is sometimes a necessary connection
 between the conclusion and the condition of the
 generalisation; we doubt it only from the possibility of a
 plurality of causes.

 ¹ Mr. C. D. Broad, in two articles "On the Relation between Induction and
 Probability" (Mind, 1918 and 1920), has been following a similar line of
 thought.
 
 The objection to this argument on the ground that the analogy 
 is always imperfect, in that all the observed connections of 
 consciousness and act are alike in being mine, seems to me to be 
 invalid on the same ground as that on which I have put on one 
 side objections to future generalisations, which are based on the 
 fact that the instances which support them are all alike in being 
 past. If direct judgments of irrelevance are ever permissible, 
 there seems some ground for admitting one here. 
 
 9. As a logical foundation for Analogy, therefore, we seem to
 need some such assumption as that the amount of variety in the
 universe is limited in such a way that there is no one object so
 complex that its qualities fall into an infinite number of independent
 groups (i.e. groups which might exist independently
 as well as in conjunction) ; or rather that none of the objects 
 about which we generalise are as complex as this ; or at least 
 that, though some objects may be infinitely complex, we some- 
 times have a finite probability that an object about which we 
 seek to generalise is not infinitely complex. 
 
 To meet a possible plurality of causes some further assumption 
 is necessary. If we were content with Inductive Correlations 
 and sought to prove merely that there was a probability in favour
 of any instance of the generalisation in question, without inquiring
 whether there was a probability in favour of every instance,
 it would be sufficient to suppose that, while there may be more 
 than one sufficient cause of a character, there is not an infinite 
 number of distinct causes competent to produce it. And this 
 involves no new assumption ; for if the aggregate variety of the 
 system is finite, the possible plurality of causes must also be finite. 
 If, however, our generalisation is to be universal, so that it breaks 
 down if there is a single exception to it, we must obtain, by some 
 means or other, a finite probability that the set of characters,
 which condition the generalisation, are not the possible effect of
 more than one distinct set of fundamental properties. I do not
 know upon what ground we could establish a finite probability
 to this effect. The necessity for this seemingly arbitrary hypothesis
 strongly suggests that our conclusions should be in the
 form of inductive correlations, rather than of universal generalisations.
 Perhaps our generalisations should always run: 'It is
 probable that any given $\phi$ is $f$,' rather than, 'It is probable that
 all $\phi$ are $f$.' Certainly, what we commonly seem to hold with conviction
 is the belief that the sun will rise to-morrow, rather than
 the belief that the sun will always rise so long as the conditions 
 explicitly known to us are fulfilled. This will be matter for 
 further discussion in Part V., when Inductive Correlation is 
 specifically dealt with. 
 
 10. There is a vagueness, it may be noticed, in the number of 
 instances, which would be required on the above assumptions 
 to establish a given numerical degree of probability, which
 corresponds to the vagueness in the degree of probability which
 we do actually attach to inductive conclusions. We assume 
 that the necessary number of instances is finite, but we do not 
 know what the number is. We know that the probability of a 
 well-established induction is great, but, when we are asked to 
 name its degree, we cannot. Common sense tells us that some 
 inductive arguments are stronger than others, and that some 
 are very strong. But how much stronger or how strong we 
 cannot express. The probability of an induction is only 
 numerically definite when we are able to make definite assump- 
 tions about the number of independent equiprobable influences 
 at work. Otherwise, it is non-numerical, though bearing relations 
 of greater and less to numerical probabilities according to the 
 approximate limits within which our assumption as to the possible 
 number of these causes lies. 
 
 11. Up to this point I have supposed, for the sake of simplicity, 
 that it is necessary to make our assumptions as to the limitation 
 of independent variety in an absolute form, to assume, that is to 
 say, the finiteness of the system, to which the argument is applied,
 for certain. But we need not in fact go so far as this.
 
 If our conclusion is C and our empirical evidence is E, then,
 in order to justify inductive methods, our premisses must include,
 in addition to E, a general hypothesis H such that C/H, the
 à priori probability of our conclusion, has a finite value. The
 effect of E is to increase the probability of C above its initial
 à priori value, C/HE being greater than C/H. But the method
 of strengthening C/H by the addition of evidence E is valid quite
 apart from the particular content of H. If, therefore, we have
 another general hypothesis H' and other evidence E', such that
 H/H' has a finite value, we can, without being guilty of a circular
 argument, use evidence E' by the same method as before to
 strengthen the probability H/H'. If we call H, namely, the
 absolute assertion of the finiteness of the system under consideration,
 the inductive hypothesis, and the process of strengthening
 C/H by the addition of E the inductive method, it is not circular to
 use the inductive method to strengthen the inductive hypothesis
 itself, relative to some more primitive and less far-reaching assumption.
 If, therefore, we have any reason (H') for attributing
 à priori a finite probability to the Inductive Hypothesis (H), then
 the actual conformity of experience à posteriori with expectations
 based on the assumption of H can be utilised by the inductive
 method to attribute an enhanced value to the probability of H.
 To this extent, therefore, we can support the Inductive Hypothesis
 by experience. In dealing with any particular question we can
 take the Inductive Hypothesis, not at its à priori value, but at
 the value to which experience in general has raised it. What
 we require à priori, therefore, is not the certainty of the Inductive
 Hypothesis, but a finite probability in its favour.¹
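
 The absence of circularity can be made explicit by writing the same
 inverse step one level higher. What follows is only a sketch in the
 notation of this section; the condition that E' is more probable on the
 Inductive Hypothesis than without it, written E'/HH' > E'/H', is an
 assumption supplied for the illustration and is not stated in the text.

```latex
% Requires amsmath. H' is the more primitive hypothesis, E' the further
% evidence; E'/H' is assumed to be greater than zero.
\begin{align*}
H/H'E' &= (H/H') \cdot \frac{E'/HH'}{E'/H'} , \\
\text{so that}\quad H/H'E' &> H/H'
  \quad\text{whenever } H/H' > 0 \text{ and } E'/HH' > E'/H' .
\end{align*}
```

 If H/H' were zero, the right-hand side would be zero whatever E' might
 be, which is why a finite à priori probability, and not certainty, is the
 requirement.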
 
 Our assumption, in its most limited form, then, amounts to 
 this, that we have a finite à priori probability in favour of
 the Inductive Hypothesis as to there being some limitation 
 of independent variety (to express shortly what I have already 
 explained in detail) in the objects of our generalisation. Our 
 experience might have been such as to diminish this probability 
 à posteriori. It has, in fact, been such as to increase it. It is
 because there has been so much repetition and uniformity in our 
 experience that we place great confidence in it. To this extent 
 the popular opinion that Induction depends upon experience for 
 its validity is justified and does not involve a circular argument. 
 
 ¹ I have implicitly assumed in the above argument that if H' supports H, it
 strengthens an argument which H would strengthen. This is not necessarily 
 the case for the reasons given on pp. 68 and 147. In these passages the 
 necessary conditions for the above are elucidated. I am, therefore, assuming 
 that in the case now in question these conditions actually are fulfilled. 
 
 
 12. I think that this assumption is adequate to its purpose 
 and would justify our ordinary methods of procedure in inductive 
 argument. It was suggested in the previous chapter that our 
 theory of Analogy ought to be as applicable to mathematical 
 as to material generalisations, if it is to justify common sense. 
 The above assumptions of the limitation of independent variety 
 sufficiently satisfy this condition. There is nothing in these 
 assumptions which gives them a peculiar reference to material 
 objects. We believe, in fact, that all the properties of numbers
 can be derived from a limited number of laws, and that the same 
 set of laws governs all numbers. To apply empirical methods to 
 such things as numbers renders it necessary, it is true, to make 
 an assumption about the nature of numbers. But it is the same 
 kind of assumption as we have to make about material objects,
 and has just about as much, or as little, plausibility. There is 
 no new difficulty. 
 
 The assumption, also, that the system of Nature is finite is 
 in accordance with the analysis of the underlying assumption of
 scientists, given at the close of the previous chapter. The 
 hypothesis of atomic uniformity, as I have called it, while not 
 formally equivalent to the hypothesis of the limitation of independent
 variety, amounts to very much the same thing. If the
 fundamental laws of connection changed altogether with variations,
 for instance, in the shape or size of bodies, or if the laws
 governing the behaviour of a complex had no relation whatever 
 to the laws governing the behaviour of its parts when belonging 
 to other complexes, there could hardly be a limitation of inde- 
 pendent variety in the sense in which this has been defined. And, 
 on the other hand, a limitation of independent variety seems 
 necessarily to carry with it some degree of atomic uniformity. 
 The underlying conception as to the character of the System of 
 Nature is in each case the same. 
 
 13. We have now reached the last and most difficult stage of
 the discussion. The logical part of our inquiry is complete, and 
 it has left us, as it is its business to leave us, with a question of 
 epistemology. Such is the premiss or assumption which our 
 logical processes need to work upon. What right have we to 
 make it ? It is no sufficient answer in philosophy to plead that 
 the assumption is after all a very little one. 
 
 I do not believe that any conclusive or perfectly satisfactory
 answer to this question can be given, so long as our knowledge
 of the subject of epistemology is in so disordered and undeveloped 
 a condition as it is in at present. No proper answer has yet been 
 given to the inquiry — of what sorts of things are we capable of 
 direct knowledge ? The logician, therefore, is in a weak position, 
 when he leaves his own subject and attempts to solve a particular 
 instance of this general problem. He needs guidance as to what 
 kind of reason we could have for such an assumption as the use 
 of inductive argument appears to require. 
 
 On the one hand, the assumption may be absolutely à priori
 in the sense that it would be equally applicable to all possible 
 objects. On the other hand, it may be seen to be applicable to 
 some classes of objects only. In this case it can only arise out 
 of some degree of particular knowledge as to the nature of the 
 objects in question, and is to this extent dependent on experience.
 But if it is experience which in this sense enables us to know the 
 assumption as true of certain amongst the objects of experience, 
 it must enable us to know it in some manner which we may term 
 direct and not as the result of an inference. 
 
 Now an assumption, that all systems of fact are finite (in the 
 sense in which I have defined this term), cannot, it seems perfectly 
 plain, be regarded as having absolute, universal validity in the 
 sense that such an assumption is self-evidently applicable to every
 kind of object and to all possible experiences. It is not, therefore,
 in quite the same position as a self-evident logical axiom, and does
 not appeal to the mind in the same way. The most which can 
 be maintained is that this assumption is true of some systems of 
 fact, and, further, that there are some objects about which, as 
 soon as we understand their nature, the mind is able to apprehend 
 directly that the assumption in question is true. 
 
 In Chapter II. § 7, I wrote : " By some mental process of 
 which it is difficult to give an account, we are able to pass from 
 direct acquaintance with things to a knowledge of propositions 
 about the things of which we have sensations or understand the
 meaning." Knowledge, so obtained, I termed direct knowledge.
 From a sensation of yellow and from an understanding of the
 meaning of 'yellow' and of 'colour,' we could, I suggested,
 have direct knowledge of the fact or proposition 'yellow is a
 colour'; we might also know that colour cannot exist without
 extension, or that two colours cannot be perceived at the same
 time in the same place. Other philosophers might use terms
 differently and express themselves otherwise; but the substance
 of what I was there trying to say is not very disputable. But 
 when we come to the question as to what kinds of propositions 
 we can come to know in this manner, we enter upon an unexplored
 field where no certain opinion is discoverable.
 
 In the case of logical terms, it seems to be generally agreed 
 that if we understand their meaning we can know directly pro- 
 positions about them which go far beyond a mere expression of 
 this meaning; — propositions of the kind which some philo- 
 sophers have termed synthetic. In the case of non-logical or 
 empirical entities, it seems sometimes to be assumed that our 
 direct knowledge must be confined to what may be regarded as 
 an expression or description of the meaning or sensation appre- 
 hended by us. If this view is correct the Inductive Hypothesis 
 is not the kind of thing about which we can have direct know- 
 ledge as a result of our acquaintance with objects. 
 
 I suggest, however, that this view is incorrect, and that we 
 are capable of direct knowledge about empirical entities which 
 goes beyond a mere expression of our understanding or sensation 
 of them. It may be useful to give the reader two examples, more 
 familiar than the Inductive Hypothesis, where, as it appears to 
 me, such knowledge is commonly assumed. The first is that of the
 causal irrelevance of mere position in time and space, commonly 
 called the Uniformity of Nature. We do believe, and yet have 
 no adequate inductive reason whatever for believing, that mere
 position in time and space cannot make any difference. This 
 belief arises directly, I think, out of our acquaintance with 
 the objects of experience and our understanding of the concepts 
 of ' time ' and ' space.' The second is that of the Law of 
 Causation. We believe that every object in time has a 'necessary'
 connection¹ with some set of objects at a previous time.
 This belief also, I think, arises in the same way. It is to be
 noticed that neither of these beliefs clearly arises, in spite of the 
 directness which may be claimed for them, out of any one single 
 experience. In a way analogous to these, the validity of assuming 
 the Inductive Hypothesis, as applied to a particular class of 
 objects, appears to me to be justified. 
 
 ¹ I do not propose to define the meaning of this.

 Our justification for using inductive methods in an argument
 about numbers arises out of our perceiving directly, when we
 understand the meaning of a number, that they are of the required
 character.¹ And when we perceive the nature of our
 phenomenal experiences, we have a direct assurance that in their 
 case also the assumption is legitimate. We are capable, that 
 is to say, of direct synthetic knowledge about the nature 
 of the objects of our experience. On the other hand, there 
 may be some kinds of objects, about which we have no such 
 assurance and to which inductive methods are not reasonably 
 applicable. It may be the case that some metaphysical questions 
 are of this character and that those philosophers have been right 
 who have refused to apply empirical methods to them. 
 
 14. I do not pretend that I have given any perfectly adequate 
 reason for accepting the theory I have expounded, or any such 
 theory. The Inductive Hypothesis stands in a peculiar position 
 in that it seems to be neither a self-evident logical axiom nor an 
 object of direct acquaintance ; and yet it is just as difficult, as 
 though the inductive hypothesis were either of these, to remove 
 from the organon of thought the inductive method which can 
 only be based on it or on something like it.
 
 As long as the theory of knowledge is so imperfectly 
 understood as now, and leaves us so uncertain about the grounds 
 of many of our firmest convictions, it would be absurd to 
 confess to a special scepticism about this one. I do not think 
 that the foregoing argument has disclosed a reason for such 
 scepticism. We need not lay aside the belief that this conviction
 gets its invincible certainty from some valid principle darkly
 present to our minds, even though it still eludes the peering
 eyes of philosophy. 
 
 ¹ Since numbers are logical entities, it may be thought less unorthodox to
 make such an assumption in their case. 
 
CHAPTER XXIII 
 
 SOME HISTORICAL NOTES ON INDUCTION
 
 1. The number of books, which deal with inductive¹ theory, is
 extraordinarily small. It is usual to associate the subject with 
 the names of Bacon, Hume, and Mill. In spite of the modern 
 tendency to depreciate the first and the last of these, they are the 
 principal names, I think, with which the history of induction 
 ought to be associated. The next place is held by Laplace and 
 Jevons. Amongst contemporary logicians there is an almost 
 complete absence of constructive theory, and they content 
 themselves for the most part with the easy task of criticising 
 Mill, or with the more difficult one of following him.
 
 That the inductive theories of Bacon and of Mill are full of 
 errors and even of absurdities, is, of course, a commonplace of 
 criticism. But when we ignore details, it becomes clear that they
 were really attempting to disentangle the essential issues. We
 depreciate them partly, perhaps, as a reaction from the view once 
 held that they helped the progress of scientific discovery. For 
 it is not plausible to suppose that Newton owed anything to Bacon, 
 or Darwin to Mill. But with the logical problem their minds 
 were truly occupied, and in the history of logical theory they 
 should always be important. 
 
 It is true, nevertheless, that the advancement of science was 
 the main object which Bacon himself, though not Mill, believed 
 that his philosophy would promote. The Great Instauration was 
 intended to promulgate an actual method of discovery entirely 
 different from any which had been previously known.² It did
 not do this, and against such pretensions Macaulay's well-known
 essay was not unjustly directed. Mill, however, expressly disclaimed
 in his preface any other object than to classify and
 generalise the practices "conformed to by accurate thinkers in
 their scientific inquiries." Whereas Bacon offered rules and
 demonstrations, hitherto unknown, with which any man could
 solve all the problems of science by taking pains, Mill admitted
 that "in the existing state of the cultivation of the sciences,
 there would be a very strong presumption against any one
 who should imagine that he had effected a revolution in the
 theory of the investigation of truth, or added any fundamentally
 new process to the practice of it."

 ¹ See note at the end of this chapter on "The Use of the Term Induction."
 ² He speaks of himself as being "in hac re plane protopirus, et vestigia
 nullius sequutus"; and in the Praefatio Generalis he compares his method to
 the mariner's compass, until the discovery of which no wide sea could be
 crossed (see Spedding and Ellis, vol. i. p. 24).
 
 2. The theories of both seem to me to have been injured, 
 though in different degrees, by a failure to keep quite distinct 
 the three objects : (1) of helping the scientist, (2) of explaining 
 and analysing his practice, and (3) of justifying it. Bacon was 
 really interested in the second as well as in the first, and was 
 led to some of his methods by reflecting upon what distinguished
 good arguments from bad in actual investigations. To logicians
 his methods were as new as he claimed, but they had their
 origin, nevertheless, in the commonest inferences of science and
 daily life. But his main preoccupation was with the first, which
 did injury to his treatment of the third. He himself became
 aware as the work progressed that, in his anxiety to provide
 an infallible mode of discovery, he had put forth more than he
 would ever be able to justify.¹ His own mind grew doubtful,
 and the most critical parts of the description of the new method 
 were never written. No one who has reflected much upon In- 
 duction need find it difficult to understand the progress and 
 development of Bacon's thoughts. To the philosopher who first 
 distinguished some of the complexities of empirical proof in a 
 generalised, and not merely a particular, form, the prospects of 
 systematising these methods must have seemed extraordinarily 
 hopeful. The first investigator could not have anticipated that 
 Induction, in spite of its apparent certainty, would prove so 
 elusive to analysis. 
 
 Mill also was led, in a not dissimilar way, to attempt a too
 simple treatment, and, in seeking for ease and certainty, to
 treat far too lightly the problem of justifying what he had
 claimed. Mill shirks, almost openly, the difficulties; and scarcely
 attempts to disguise from himself or his readers that he grounds
 induction upon a circular argument.

 ¹ This view is taken in the edition of James Spedding and Leslie Ellis.
 Their introductions to Bacon's philosophical works seem to me to be very greatly
 superior to the accounts to be found elsewhere. They make intelligible, what
 seems, according to other commentaries, fanciful and without sense or reason.
 
 3. Some of the most characteristic errors both of Bacon and 
 of Mill arise, I think, out of a misapprehension, which it has been
 a principal object of this book to correct. Both believed, without
 hesitation it seems, that induction is capable of establishing a
 conclusion which is absolutely certain, and that an argument
 is invalid if the generalisation, which it supports, admits of
 exceptions in fact. "Absolute certainty," says Leslie Ellis,¹ "is
 one of the distinguishing characters of the Baconian induction."
 It was, in this respect, mainly that it improved upon the older
 induction per enumerationem simplicem. "The induction which
 the logicians speak of," Bacon argues in the Advancement of
 Learning, "is utterly vicious and incompetent. . . . For to conclude
 upon an enumeration of particulars, without instance
 contradictory, is no conclusion but a conjecture." The conclusions
 of the new method, unlike those of the old, are not liable to be
 upset by further experience. In the attempt to justify these 
 claims and to obtain demonstrative methods, it was necessary 
 to introduce assumptions for which there was no warrant. 
 
 Precisely similar claims were made by Mill, although there
 are passages in which he abates them,² for his own rules of
 procedure. An induction has no validity, according to him as
 according to Bacon, unless it is absolutely certain. The following
 passage³ is significant of the spirit in which the subject
 was approached by him: "Let us compare a few cases
 of incorrect inductions with others which are acknowledged
 to be legitimate. Some, we know, which were believed for
 centuries to be correct, were nevertheless incorrect. That all
 swans are white, cannot have been a good induction, since the
 conclusion has turned out erroneous. The experience, however, on
 which the conclusion rested was genuine." Mill has not justly
 apprehended the relativity of all inductive arguments to the
 evidence, nor the element of uncertainty which is present, more
 
 1 Op. cit. vol. i. p. 23.

 2 When he deals with Plurality of Causes, for instance.
 3 Bk. iii. chap. iii. § 3 (the italics are mine).
 
 
 or less, in all the generalisations which they support.¹ Mill's
 methods would yield certainty, if they were correct, just as 
 Bacon's would. It is the necessity, to which Mill had subjected 
 himself, of obtaining certainty that occasions their want of 
 reality. Bacon and Mill both assume that experiment can
 shape and analyse the evidence in a manner and to an extent 
 which is not in fact possible. In the aims and expectations with 
 which they attempt to solve the inductive problem, there is on 
 fundamental points an unexpectedly close resemblance between
 them. 
 
 4. Turning from these general criticisms to points of greater 
 detail, we find that the line of thought pursued by Mill was 
 essentially the same as that which had been pursued by Bacon, 
 and, also, that the argument of the preceding chapters is, in 
 spite of some real differences, a development of the same funda- 
 mental ideas which underlie, as it seems to me, the theories of 
 Mill and Bacon alike. 
 
 We have seen that all empirical arguments require an initial 
 probability derived from analogy, and that this initial probability 
 may be raised towards certainty by means of pure induction 
 or the multiplication of instances. In some arguments we depend 
 mainly upon analogy, and the initial probability obtained by 
 means of it (with the assistance, as a rule, of previous knowledge) 
 is so large that numerous instances are not required. In other 
 arguments pure induction predominates. As science advances 
 and the body of pre-existing knowledge is increased, we depend 
 increasingly upon analogy ; and only at the earlier stages of our 
 investigations is it necessary to rely, for the greater part of our 
 support, upon the multiplication of instances. Bacon's great 
 achievement, in the history of logical theory, lay in his being the 
 first logician to recognise the importance of methodical analogy 
 to scientific argument and the dependence upon it of most well- 
 established conclusions. The Novum Organum is mainly con- 
 cerned with explaining methodical ways of increasing what I 
 have termed the Positive and Negative Analogies, and of avoiding 
 false Analogies. The use of exclusions and rejections, to which 
 
 1 This misapprehension may be connected with Mill's complete failure to
 grasp with any kind of thoroughness the nature and importance of the theory of 
 probability. The treatment of this topic in the System of Logic is exceedingly 
 bad. His understanding of the subject was, indeed, markedly inferior to the 
 best thought of his own time. 
 
 
 Bacon attached supreme importance, and which he held to
 constitute the essential superiority of his method over those which
 preceded it, entirely consists in the determination of what
 characters (or natures as he would call them) belong to the positive
 and negative analogies respectively. The first two tables with
 which the investigation begins are, first, the table essentiae et
 praesentiae, which contains all known instances in which the
 given nature is present, and, second, the table declinationis sive
 absentiae in proximo, which contains instances corresponding in
 each case to those of the first table, but in which, notwithstanding
 this correspondence, the given nature is absent.¹ The doctrine
 of prerogative instances is concerned no less plainly with the
 methodical determination of Analogy. And the doctrine of
 idols is expounded for the avoidance of false analogies, standing,
 he says, in the same relation to the interpretation of Nature, as
 the doctrine of fallacies to ordinary logic.² Bacon's error lay
 in supposing that, because these methods were new to logic, they 
 were therefore new to practice. He exaggerated also their pre- 
 cision and their certainty ; and he underestimated the import- 
 ance of pure induction. But there was, at bottom, nothing about 
 his rules impracticable or fantastic, or indeed unusual. 
 
 5. Almost the whole of the preceding paragraph is equally 
 applicable to Mill. He agreed with Bacon in depreciating the 
 part played in scientific inquiry by pure induction, and in 
 emphasising the importance of analogy to all systematic investi- 
 gators. But he saw further than Bacon in allowing for the 
 Plurality of Causes, and in admitting that an element of pure
 induction was therefore made necessary. "The Plurality of
 Causes," he says,³ "is the only reason why mere number of
 instances is of any importance in inductive inquiry. The tendency
 of unscientific inquirers is to rely too much on number, without
 analysing the instances. . . . Most people hold their conclusions 
 with a degree of assurance proportioned to the mere mass of the 
 experience on which they appear to rest ; not considering that 
 by the addition of instances to instances, all of the same kind, 
 that is, differing from one another only in points already recog- 
 nised as immaterial, nothing whatever is added to the evidence of 
 
 1 Ellis, vol. i. p. 33. 
 
 2 Ellis, vol. i. p. 89. 
 
 3 Book iv. chap. x. § 2.
 
 
 the conclusion. A single instance eliminating some antecedent 
 which existed in all the other cases, is of more value than the
 greatest multitude of instances which are reckoned by their 
 number alone." Mill did not see, however, that our knowledge 
 of the instances is seldom complete, and that new instances, which 
 are not known to differ from the former in material respects, may 
 add, nevertheless, to the negative analogy, and that the multi- 
 plication of them may, for this reason, strengthen the evidence. 
 It is easy to see that his methods of Agreement and Difference 
 closely resemble Bacon's, and aim, like Bacon's, at the deter- 
 mination of the Positive and Negative Analogies. By allowing 
 for Plurality of Causes Mill advanced beyond Bacon. But he 
 was pursuing the same line of thought which alike led to Bacon's 
 rules and has been developed in the chapters of this book. 
 Like Bacon, however, he exaggerated the precision with which
 his canons of inquiry could be used in practice. 
 
 6. No more need be said respecting method and analysis. 
 But in both writers the exposition of method is closely inter- 
 mingled with attempts to justify it. There is nothing in Bacon 
 which at all corresponds to Mill's appeals to Causation or to the 
 Uniformity of Nature, and, when they seek for the ground of 
 induction, there is much that is peculiar to each writer. It is 
 my purpose, however, to consider in this place the details common 
 to both, which seem to me to be important and which exemplify
 the only line of investigation which seems likely to be fruitful ; 
 and I shall pursue no further, therefore, their numerous points 
 of difference. 
 
 The attempt, which I have made to justify the initial prob- 
 ability which Analogy seems to supply, primarily depends upon 
 a certain limitation of independent variety and upon the
 derivation of all the properties of any given object from a limited
 number of primary characters. In the same way I have supposed 
 that the number of primary characters which are capable of 
 producing a given property is also limited. And I have argued 
 that it is not easy to see how a finite probability is to be obtained 
 unless we have in each case some such limitation in the number
 of the ultimate alternatives. 
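
 The dependence of the argument on a finite initial probability can be
 illustrated numerically. What follows is only a minimal sketch, under
 assumptions which are not in the text: the ultimate alternatives are
 given equal initial weight; the generalisation, if true, makes every
 observed instance certain; and, if it is false, each instance has an
 independent chance q (less than 1) of agreeing with it. The function
 names initial_probability and raised_probability are hypothetical.

```python
from fractions import Fraction

def initial_probability(n_alternatives):
    # Assumption: a limited number of ultimate alternatives, equally weighted.
    return Fraction(1, n_alternatives)

def raised_probability(p0, q, n):
    """Probability of the generalisation after n agreeing instances (Bayes' rule).

    p0: initial probability (here supplied by analogy, by assumption)
    q:  assumed chance that an instance agrees if the generalisation is false
    """
    return p0 / (p0 + (1 - p0) * q ** n)

p0 = initial_probability(10)
after_ten = raised_probability(p0, Fraction(1, 2), 10)
no_start = raised_probability(Fraction(0), Fraction(1, 2), 10)
```

 On these assumed figures ten agreeing instances raise the probability
 from 1/10 to 1024/1033, whereas a zero initial probability is left at
 zero by any number of instances.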
 
 It was in a manner which bears fundamental resemblances
 to this that Bacon endeavoured to demonstrate the cogency of 
 his method. He considers, he says, "the simple forms or difference
 of things which are few in number, and the degrees and
 co-ordinations whereof make all this variety." And in Valerius
 Terminus he argues "that every particular that worketh any 
 effect is a thing compounded more or less of diverse single natures, 
 more manifest and more obscure, and that it appeareth not to 
 which of the natures the effect is to be ascribed."¹ It is indeed
 essential to the method of exclusions that the matter to which it 
 is applied should be somehow resolvable into a finite number of 
 elements. But this assumption is not peculiar, I think, to 
 Bacon's method, and is involved, in some form or other, in every 
 argument from Analogy. In making it Bacon was initiating, 
 perhaps obscurely, the modern conception of a finite number of 
 laws of nature out of the combinations of which the almost bound- 
 less variety of experience ultimately arises. Bacon's error was 
 double and lay in supposing, first, that these distinct elements 
 lie upon the surface and consist in visible characters, and second, 
 that their natures are, or easily can be, known to us, although 
 the part of the Instauration, in which the manner of conceiving 
 simple natures was to be explained, he never wrote. These 
 beliefs falsely simplified the problem as he saw it, and led him 
 to exaggerate the ease, certainty, and fruitfulness of the new 
 method. But the view that it is possible to reduce all the 
 phenomena of the universe to combinations of a limited number
 of simple elements (which is, according to Ellis,² the central
 point of Bacon's whole system) was a real contribution to
 philosophy.
 
 7. The assumption that every event can be analysed into a 
 limited number of ultimate elements, is never, so far as I am 
 aware, explicitly avowed by Mill. But he makes it in almost
 every chapter, and it underlies, throughout, his mode of procedure. 
 His methods and arguments would fail immediately, if we were 
 to suppose that phenomena of infinite complexity, due to an 
 infinite number of independent elements, were in question, or 
 if an infinite plurality of causes had to be allowed for. 
 
 In distinguishing, therefore, analogy from pure induction, 
 and in justifying it by the assumption of a limited complexity in 
 the problems which we investigate, I am, I think, pursuing, with 
 numerous differences, the line of thought which Bacon first
 
 1 Quoted by Ellis, vol. i. p. 41. 
 2 Vol. i. p. 28. 
 
 
 pursued and which Mill popularised. The method of treatment 
 is dissimilar, but the subject-matter and the underlying beliefs 
 are, in each case, the same. 
 
 8. Between Bacon and Mill came Hume. Hume's sceptical
 criticisms are usually associated with causality; but argument
 by induction (inference from past particulars to future
 generalisations) was the real object of his attack. Hume showed, not that
 inductive methods were false, but that their validity had never
 been established and that all possible lines of proof seemed
 equally unpromising. The full force of Hume's attack and the
 nature of the difficulties which it brought to light were never 
 appreciated by Mill, and he makes no adequate attempt to 
 deal with them. Hume's statement of the case against induction 
 has never been improved upon ; and the successive attempts 
 of philosophers, led by Kant, to discover a transcendental solu- 
 tion have prevented them from meeting the hostile arguments on 
 their own ground and from finding a solution along lines which 
 might, conceivably, have satisfied Hume himself. 
 
 9. It would not be just here to pass by entirely the name 
 of the great Leibniz, who, wiser in correspondence and frag- 
 mentary projects than in completed discourses, has left to us 
 sufficient indications that his private reflections on this subject 
 were much in advance of his contemporaries'. He distinguished 
 three degrees of conviction amongst opinions, logical certainty 
 (or, as we should say, propositions known to be formally true), 
 physical certainty which is only logical probability, of which a 
 well-established induction, as that man is a biped, is the type, 
 and physical probability (or, as we should say, an inductive 
 correlation), as for example that the south is a rainy quarter.¹
 He condemned generalisations based on mere repetition of
 instances, which he declared to be without logical value, and he 
 insisted on the importance of Analogy as the basis of a valid 
 induction.² He regarded a hypothesis as more probable in
 proportion to its simplicity and its power, that is to say, to the 
 number of the phenomena it would explain and the fewness of 
 the assumptions it involved. In particular a power of accurate
 prediction and of explaining phenomena or experiments pre- 
 
 1 Couturat, Opuscules et fragments inédits de Leibniz, p. 232.
 2 Couturat, La Logique de Leibniz d'après des documents inédits, pp.
 262, 267.
 
 
 viously untried is a just ground of secure confidence, of which 
 he cites as a nearly perfect example the key to a cryptogram.¹
 
 10. Whewell and Jevons furnished logicians with a store- 
 house of examples derived from the practice of scientists. 
 Jevons, partly anticipated by Laplace, made an important 
 advance when he emphasised the close relation between 
 Induction and Probability. Combining insight and error, he 
 spoilt brilliant suggestions by erratic and atrocious arguments. 
 His application of Inverse Probability to the inductive problem 
 is crude and fallacious, but the idea which underlies it is 
 substantially good. He, too, made explicit the element of 
 Analogy, which Mill, though he constantly employed it, had 
 seldom called by its right name. There are few books, so
 superficial in argument yet suggesting so much truth, as Jevons's 
 Principles of Science. 
 
 11. Modern text-books on Logic all contain their chapters on 
 Induction, but contribute little to the subject. Their recognition
 of Mill's inadequacy renders their exposition, which, in spite
 of criticisms, is generally along his lines, nerveless and confused. 
 Where Mill is clear and offers a solution, they, confusedly 
 criticising, must withhold one. The best of them, Sigwart and 
 Venn, contain criticism and discussion which is interesting, but 
 constructive theory is lacking. Hitherto Hume has been master, 
 only to be refuted in the manner of Diogenes or Dr. Johnson. 
 
 1 Letter to Conring, 19th March 1678. 
 
NOTES ON PART III
 
 (i.) On the Use of the Term Induction
 
 1. Induction is in origin a translation of the Aristotelian ἐπαγωγή.
 This term was used by Aristotle in two quite distinct senses: first,
 and principally, for the process by which the observation of particular 
 instances, in which an abstract notion is exemplified, enables us to 
 realise and comprehend the abstraction itself ; secondly, for the type
 of argument in which we generalise after the complete enumeration 
 and assertion of all the particulars which the generalisation embraces. 
 From this second sense it was sometimes extended to cases in which 
 we generalise after an incomplete enumeration. In post-Aristotelian 
 writers the induction per enumerationem simplicem approximates to
 induction in Aristotle's second sense, as the number of instances is 
 increased. To Bacon, therefore, "the induction of which the logicians
 speak " meant a method of argument by multiplication of instances. 
 He himself deliberately extended the use of the term so as to cover 
 all the systematic processes of empirical generalisation. But he 
 also used it, in a manner closely corresponding to Aristotle's first use,
 for the process of forming scientific conceptions and correct notions
 of "simple natures."¹
 
 2. The modern use of the term is derived from Bacon's. Mill 
 defines it as "the operation of discovering and proving general
 propositions." His philosophical system required that he should 
 define it as widely as this ; but the term has really been used, both 
 by him and by other logicians, in a narrower sense, so as to cover 
 those methods of proving general propositions, which we call empiri- 
 cal, and so as to exclude generalisations, such as those of mathematics, 
 which have been proved formally. Jevons was led, partly by the 
 linguistic resemblance, partly because in the one case we proceed 
 from the particular to the general and in the other from the general 
 to the particular, to define Induction as the inverse process of 
 Deduction. In contemporary logic Mill's use prevails ; but there 
 
 1 See Ellis's edition of Bacon's Works, vol. i. p. 37. On the first occasion
 on which Induction is mentioned in the Novum Organum, it is used in this 
 secondary sense. 
 
 
 is, at the same time, a suggestion (arising from earlier usage, and
 because Bacon and Mill never quite freed themselves from it) of
 argument by mere multiplication of instances. I have thought it 
 best, therefore, to use the term pure induction to describe arguments 
 which are based upon the number of instances, and to use induction 
 itself for all those types of arguments which combine, in one form or
 another, pure induction with analogy. 
 
 (ii.) On the Use of the Term Cause
 
 1. Throughout the preceding argument, as well as in Part II., 
 I have been able to avoid the metaphysical difficulties which surround 
 the true meaning of cause. It was not necessary that I should 
 inquire whether I meant by causal connection an invariable con- 
 nection in fact merely, or whether some more intimate relation was 
 involved. It has also been convenient to speak of causal relations 
 between objects which do not strictly stand in the position of cause 
 and effect, and even to speak of a probable cause, where there is no
 implication of necessity and where the antecedents will sometimes
 lead to particular consequents and sometimes will not. In making 
 this use of the term, I have followed a practice not uncommon amongst 
 writers on probability, who constantly use the term cause, where 
 hypothesis might seem more appropriate.¹
 
 One is led, almost inevitably, to use 'cause' more widely than
 'sufficient cause' or than 'necessary cause,' because, the necessary
 causation of particulars by particulars being rarely apparent to us, 
 the strict sense of the term has little utility. Those antecedent 
 circumstances, which we are usually content to accept as causes, are 
 only so in strictness under a favourable conjunction of innumerable 
 other influences. 
 
 2. As our knowledge is partial, there is constantly, in our use 
 of the term cause, some reference implied or expressed to a limited
 body of knowledge. It is clear that, whether or not, as Cournot²
 maintains, there are such things as independent series in the order 
 of causation, there is often a sense in which we may hold that there 
 is a closer intimacy between some series than between others. This 
 intimacy is relative, I think, to particular information, which is 
 actually known to us, or which is within our reach. It will be useful, 
 therefore, to give precise definitions of these wider senses in which 
 it is often convenient to use the expression cause. 
 
 1 Cf. Czuber, Wahrscheinlichkeitsrechnung, p. 139. In dealing with Inverse
 Probability Czuber explains that he means by possible cause the various Be- 
 dingungskomplexe from which the cause can result. 
 
 2 See Chapter XXIV. §3. 
 
 
 We must first distinguish between assertions of law and assertions
 of fact, or, in the terminology of Von Kries,¹ between nomologic and
 ontologic knowledge. It may be convenient in dealing with some 
 questions to frame this distinction with reference to the special 
 circumstances. But the distinction generally applicable is between 
 propositions which contain no reference to particular moments of
 time, and existential propositions which cannot be stated without 
 reference to specific points in the time series. The Principle of the 
 Uniformity of Nature amounts to the assertion that natural laws 
 are all, in this sense, timeless. We may, therefore, divide our data
 into two portions k and l, such that k denotes our formal and
 nomologic evidence, consisting of propositions whose predication
 does not involve a particular time reference, and l denotes the
 existential or ontologic propositions. 
 
 3. Let us now suppose that we are investigating two existential 
 propositions a and b, which refer two events A and B to particular 
 moments of time, and that A is referred to moments which are all 
 prior to those at which B occurred. What various meanings can we 
 give to the assertion that A and B are causally connected ? 
 
 (i.) If b/ak = 1, A is a sufficient cause of B. In this case A is a
 cause of B in the strictest sense; b can be inferred from a, and no
 additional knowledge consistent with k can invalidate this.

 (ii.) If b/āk = 0, A is a necessary cause of B.

 (iii.) If k includes all the laws of the existent universe, then A
 is not a sufficient cause of B unless b/ak = 1. The Law of Causation,
 therefore, which states that every existent has to some other previous
 existent the relation of effect to sufficient cause, is equivalent to the
 proposition that, if k is the body of natural law, then, if b is true,
 there is always another true proposition a, which asserts existences
 prior to B, such that b/ak = 1. No use has been made so far of our
 existential knowledge l, which is irrelevant to the definitions
 preceding.

 (iv.) If b/akl = 1 and b/kl ≠ 1, A is a sufficient cause of B under
 conditions l.

 (v.) If b/ākl = 0 and b/kl ≠ 0, A is a necessary cause of B under
 conditions l.

 (vi.) If there is any existential proposition h such that b/ahk = 1
 and b/hk ≠ 1, A is, relative to k, a possible sufficient cause of B.

 (vii.) If there is an existential proposition h such that b/āhk = 0
 and b/hk ≠ 0, A is, relative to k, a possible necessary cause of B.

 (viii.) If b/ahkl = 1, b/hk ≠ 1, and h/akl ≠ 0, A is, relative to k,
 a possible sufficient cause of B under conditions l.

 (ix.) If b/āhkl = 0, b/hkl ≠ 0, h/akl ≠ 0, and h/ākl ≠ 0, A is,
 relative to k, a possible necessary cause of B under conditions l.
 
 1 Die Principien der Wahrscheinlichkeitsrechnung, p. 86.
 
 
 Thus an event is a possible necessary cause of another, relative to 
 given nomologic data, if circumstances can arise, not inconsistent 
 with our existential data, in which the first event will be indispensable 
 if the second is to occur. 
 
 (x.) Two events are causally independent if no part of either is, 
 relative to our nomologic data, a possible cause of any part of the 
 other under the conditions of our existential knowledge. The greater 
 the scope of our existential knowledge, the greater is the likelihood 
 of our being able to pronounce events causally dependent or
 independent.
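
 Definitions (i.), (ii.), (iv.) and (v.) can be tried out on a small
 finite model. What follows is only a rough sketch and rests on
 assumptions which are not in the text: a probability such as b/akl is
 replaced by a proportion among equally weighted possible worlds, and
 the body of law k is a single invented rule, namely that B occurs
 exactly when A occurs together with the circumstances l. The names
 WORLDS, prob and the predicate functions are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy universe of worlds (a, l, b):
#   a: event A occurred, l: circumstances l hold, b: event B occurred.
# Assumed law k (illustrative only): B occurs exactly when A and l both hold.
WORLDS = [(a, l, b) for a, l, b in product([False, True], repeat=3)
          if b == (a and l)]

def prob(conclusion, premiss):
    """Toy stand-in for a probability such as b/ak: the share of admissible
    worlds satisfying the premiss in which the conclusion also holds
    (worlds are weighted equally, by assumption)."""
    pool = [w for w in WORLDS if premiss(w)]
    if not pool:
        raise ValueError("premiss is inconsistent with the assumed laws")
    return Fraction(sum(1 for w in pool if conclusion(w)), len(pool))

def is_a(w): return w[0]
def is_l(w): return w[1]
def is_b(w): return w[2]
def not_a(w): return not w[0]
def a_and_l(w): return w[0] and w[1]
def not_a_and_l(w): return (not w[0]) and w[1]

# (i.)  b/ak = 1
sufficient = prob(is_b, is_a) == 1
# (ii.) b/āk = 0
necessary = prob(is_b, not_a) == 0
# (iv.) b/akl = 1 and b/kl ≠ 1
sufficient_under_l = prob(is_b, a_and_l) == 1 and prob(is_b, is_l) != 1
# (v.)  b/ākl = 0 and b/kl ≠ 0
necessary_under_l = prob(is_b, not_a_and_l) == 0 and prob(is_b, is_l) != 0
```

 In this invented model A is a necessary cause of B but not a
 sufficient one, and it becomes a sufficient cause only under
 conditions l. Proportions among equally weighted worlds are a
 convenience of the sketch, not a rendering of the logical relation
 which the definitions intend.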
 
 4. These definitions preserve the distinction between 'causally
 independent' and 'independent for probability,' the distinction
 between causa essendi and causa cognoscendi. If b/akl ≠ b/ākl,
 where a and b may be any propositions whatever and are not limited
 as they were in the causal definitions, we have 'dependence for
 probability,' and a is a causa cognoscendi for b, relative to data kl.
 If a and b are causally dependent, according to definition (x.), b is a
 possible causa essendi, relative to data kl.
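
 That dependence for probability is relative to the data, and need not
 involve a causal link between the two events themselves, can be shown
 on invented figures. In the sketch below, which is an assumption-laden
 illustration and not part of the argument, a asserts that a barometer
 falls and b that a storm follows; each depends only on whether the
 pressure is low. The weights and the names weight, prob and
 storm_occurs are hypothetical.

```python
from fractions import Fraction
from itertools import product

def weight(low, falls, storm):
    """Assumed joint weight of a world; the barometer and the storm
    each depend only on the pressure, not on one another."""
    p_low = Fraction(1, 4) if low else Fraction(3, 4)
    p_falls = Fraction(9, 10) if low else Fraction(1, 10)
    p_storm = Fraction(4, 5) if low else Fraction(1, 10)
    return (p_low
            * (p_falls if falls else 1 - p_falls)
            * (p_storm if storm else 1 - p_storm))

WORLDS = list(product([False, True], repeat=3))

def prob(conclusion, premiss):
    # Weighted share of the worlds satisfying the premiss in which the
    # conclusion also holds.
    den = sum(weight(*w) for w in WORLDS if premiss(*w))
    num = sum(weight(*w) for w in WORLDS if premiss(*w) and conclusion(*w))
    return num / den

def storm_occurs(low, falls, storm):
    return storm

# Data without knowledge of the pressure: compare b/a with b/ā.
given_a = prob(storm_occurs, lambda low, falls, storm: falls)
given_not_a = prob(storm_occurs, lambda low, falls, storm: not falls)

# Data which include the fact that the pressure is low.
given_a_low = prob(storm_occurs, lambda low, falls, storm: falls and low)
given_not_a_low = prob(storm_occurs, lambda low, falls, storm: (not falls) and low)
```

 On these figures the storm has probability 5/8 when the barometer
 falls and 1/8 when it does not, so that a is a causa cognoscendi for
 b; once the low pressure is included in the data, both values are 4/5
 and the dependence for probability disappears.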
 
 But, after all, the essential relation is that of 'independence for
 probability.' We wish to know whether knowledge of one fact
 throws light of any kind upon the likelihood of another. The theory 
 of causality is only important because it is thought that by means of 
 its assumptions light can be thrown by the experience of one pheno- 
 menon upon the expectation of another. 
 
PART IV 
 
 SOME PHILOSOPHICAL APPLICATIONS OF 
 PROBABILITY
 
 
CHAPTER XXIV 
 
 THE MEANINGS OF OBJECTIVE CHANCE, AND OF RANDOMNESS 
 
 1. Many important differences of opinion in the treatment of 
 Probability have been due to confusion or vagueness as to 
 what is meant by Randomness and by Objective Chance, as
 distinguished from what, for the purposes of this chapter, may be 
 termed Subjective Probability. It is agreed that there is a sort 
 of Probability which depends upon knowledge and ignorance, and 
 is relative, in some manner, to the mind of the subject ; but it is 
 supposed that there is also a more objective Probability which 
 is not thus dependent, or less completely so, though precisely 
 what this conception stands for is not plain. The relation of 
 Randomness to the other concepts is also obscure. The problem 
 of clearing up these distinctions is of importance if we are to 
 criticise certain schools of opinion intelligently, as well as to the 
 treatment of the foundations of Statistical Inference which is to
 be attempted in Part V. 
 
 There are at least three distinct issues to be kept apart. There 
 is the antithesis between knowledge and ignorance, between 
 events, that is to say, which we have some reason to expect, and 
 events which we have no reason to expect, which gives rise to 
 the theory of subjective probability and subjective chance ; and, 
 connected with this, the distinction between 'random' selection
 and 'biassed' selection. There are next objective probability and
 objective chance, which are as yet obscure, but which are
 commonly held to arise out of the antithesis between 'cause' and
 'chance,' between events, that is to say, which are causally
 connected and events which are not causally connected. And there
 is, lastly, the antithesis between chance and design, between
 'blind causes' and 'final causes,' where we oppose a 'chance'
 
 
 event to one, part of whose cause is a volition following on a 
 conscious desire for the event.¹
 
 2. The method of this treatise has been to regard subjective 
 probability as fundamental and to treat all other relevant con- 
 ceptions as derivative from this. That there is such a thing as 
 probability in this sense has been admitted by all sensible
 philosophers since the middle of the eighteenth century at least.² But
 there is also, many writers have supposed, something else which 
 may be fitly described as objective probability ; and there is, 
 besides, a long tradition in favour of the view that it is this
 (whatever it may be) which is logically and philosophically important,
 subjective probability being a vague and mainly psychological 
 conception about which there is very little to be said. 
 
 The distinction exists already in Hume: "Probability is of
 two kinds, either when the object is really in itself uncertain, 
 and to be determined by chance ; or when, though the object be 
 already certain, yet 'tis uncertain to our judgment, which finds 
 a number of proofs on each side of the question."³ But the
 distinction is not elucidated, and one can only infer from other 
 passages that Hume did not intend to imply in this passage the 
 existence of objective chance in a sense contradictory to a deter- 
 minist theory of the Universe. In Condorcet all is confused ; and 
 in Laplace nearly all. In the nineteenth century the distinction 
 begins to grow explicit in the writings of Cournot. "Les explications
 que j'ai données . . . ," he writes in the preface to his
 Exposition, "sur le double sens du mot de probabilité, qui
 tantôt se rapporte à une certaine mesure de nos connaissances, et
 tantôt à une mesure de la possibilité des choses, indépendamment
 de la connaissance que nous en avons : ces explications, dis-je,
 me semblent propres à résoudre les difficultés qui ont rendu
 jusqu'ici suspecte à de bons esprits toute la théorie de la
 probabilité mathématique." It will be worth while to pause for a
 moment to consider the ideas of Cournot. 
 
 1 This is discussed in Chapter XXV. § 4. 
 
 2 D'Alembert, collecting (largely from Hume, many passages being
 translated almost verbatim) in the Encyclopédie méthodique the most up-to-date
 commonplaces of the subject, found it natural to write: "Il n'y a point de
 hasard à proprement parler ; mais il y a son équivalent : l'ignorance, où nous
 sommes des vraies causes des événemens, a sur notre esprit l'influence qu'on
 suppose au hasard." Compare also the sentences from Spinoza quoted on
 p. 117 above. 
 
 3 A Treatise of Human Nature, Book ii. part iii. section ix.
 
 
 3. Cournot, while admitting that there is such a thing as sub- 
 jective chance, was concerned to dispute the opinion that chance 
 is merely the offspring of ignorance, saying that in this case
 "le calcul des chances" is merely "un calcul des illusions."
 The chance, upon which "le calcul des chances" is based, is
 something different, and depends, according to him, on the com- 
 bination or convergence of phenomena belonging to independent 
 series. By "independent series" he means series of phenomena
 which develop as parallel or successive series without any causal
 interdependence or link of solidarity whatever.¹ No one, he
 says by way of example, seriously believes that in striking the 
 ground with his foot he puts out the navigator in the Antipodes, 
 or disturbs the system of Jupiter's satellites. Separate trains of 
 events, that is to say, have been set going by distinct initial acts of 
 creation, so to speak.² Every event is causally connected with
 previous events belonging to its own series, but it cannot be 
 modified by contact with events belonging to a different series.
 A 'chance' event is a complex due to the concurrence in time
 or place of events belonging to causally independent series. 
 
 This theory, as it stands, is evidently unsatisfactory. Even 
 if there are series of phenomena which are independent in Cournot's 
 sense, it is not clear how we can know which they are, or how we 
 can set up a calculus which presumes an acquaintance with them. 
 Just as it is likely that we are all cousins if we go back far enough, 
 so there may be, after all, remote relationships between ourselves 
 and Jupiter. A remote connection or a reaction quantitatively 
 small is a matter of degree and not by any means the same thing 
 as absolute independence. Nevertheless Cournot has contri- 
 buted something, I think, to the stock of our ideas. He has 
 
 1 "Le mot hasard," Cournot writes in his Essai sur les fondements de nos
 connaissances, "n'indique pas une cause substantielle, mais une idée : cette idée
 est celle de la combinaison entre plusieurs séries de causes ou de faits qui se
 développent chacun dans sa série propre, indépendamment les uns des autres."
 This is very like the definition given by Jean de la Placette in his Traité des jeux
 de hasard, to which Cournot refers: "Pour moi, je suis persuadé que le hasard
 renferme quelque chose de réel et de positif, savoir un concours de deux ou
 plusieurs événements contingents, chacun desquels a ses causes, mais en sorte
 que leur concours n'en a aucune que l'on connaisse."
 
 2 Essai sur les fondements de nos connaissances, i. 134 : " La nature ne se 
 gouverne pas par une loi unique ... ses lois ne sont pas toutes dérivées les 
 unes des autres, ou dérivées toutes d'une loi supérieure par une nécessité purement 
 logique . . . nous devons les concevoir au contraire comme ayant pu 
 être décrétées séparément d'une infinité de manières." 
 
 
 hinted at, even if he has not disentangled, one of the elements 
 in a common conception of chance ; and of the notion, which he 
 seems to have in his mind, we must in due course take account.¹ 
 
 4. In the writings of Condorcet, I have said above, all is confused. 
 But in Bertrand's criticism of him a relevant distinction, 
 though not elucidated, is brought before the mind. " The 
 motives for believing," wrote Condorcet, " that, from ten million 
 white balls mixed with one black, it will not be the black ball 
 which I shall draw at the first attempt is of the same kind as the 
 motive for believing that the sun will not fail to rise to-morrow." 
 " The assimilation of the two cases," Bertrand writes in criticism 
 of the above,² " is not legitimate : one of the probabilities is 
 objective, the other subjective. The probability of drawing 
 the black ball at the first attempt is 1/10,000,000, neither more nor 
 less. Whoever evaluates it otherwise makes a mistake. The 
 probability that the sun will rise varies from one mind to another. 
 A scientist might hold on the basis of a false theory, without being 
 utterly irrational, that the sun will soon be extinguished ; he 
 would be within his rights, just as Condorcet is within his ; both 
 would exceed their rights in accusing of error those who think 
 differently." Before commenting on this distinction, let us have 
 before us also some interesting passages by Poincaré. 
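
The arithmetic behind Bertrand's figure can be checked directly. The sketch below assumes an urn holding exactly ten million white balls and one black, each ball being equally likely to be drawn; the variable names are illustrative and form no part of the original text. On that assumption the exact chance is 1/10,000,001, from which the round figure of one in ten million differs by one part in ten million.

```python
from fractions import Fraction

white_balls = 10_000_000
black_balls = 1

# Exact chance of drawing the black ball at the first attempt,
# assuming every ball is equally likely to be drawn.
p_black_exact = Fraction(black_balls, white_balls + black_balls)

# The round figure of one in ten million.
p_black_round = Fraction(1, 10_000_000)

# Relative difference between the round figure and the exact one.
relative_gap = (p_black_round - p_black_exact) / p_black_exact

print(p_black_exact)
print(float(relative_gap))
```

The difference is immaterial to Bertrand's argument, which turns on the figure being the same for every mind and not on its exact value.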
 
 5. We certainly do not use the term ' chance,' Poincaré points 
 out, as the ancients used it, in opposition to determinism. For 
 us therefore the natural interpretation of ' chance ' is subjective, 
 — " Chance is only the measure of our ignorance. Fortuitous 
 phenomena are, by definition, those, of the laws of which we are 
 
 1 Cournot's work on Probability has been highly praised by authorities as 
 diverse and distinguished as Boole and Von Kries, and has been made the 
 foundation of a school by some recent French philosophers (see the special 
 number of the Revue de métaphysique et de morale, devoted to Cournot and published 
 in 1905, and the bibliography at the end of the present volume passim). 
 The best account with which I am acquainted, of Cournot's theory of probability, 
 is to be found in A. Darbon's Le Concept du hasard. Cournot's philosophy of 
 the subject is developed, not so much in his Exposition de la théorie des chances, 
 as in later works, especially in his Essai sur les fondements de nos connaissances. 
 Cournot never touched any subject without contributing something to it, but, 
 on the whole, his work on Probability is, in my opinion, disappointing. No 
 doubt his Exposition is superior to other French text-books of the period, of 
 which there is so large a variety, and his work, both here and elsewhere, is not 
 without illuminating ideas : but the philosophical treatment is so confused and 
 indefinite that it is difficult to make much of it beyond the one specific point 
 treated above. 
 
 2 Calcul des probabilités, p. xix. 
 
 
 ignorant." But Poincaré immediately adds : " Is this definition 
 very satisfactory ? When the first Chaldaean shepherds followed 
 with their eyes the movements of the stars, they did not yet 
 know the laws of astronomy, but would they have dreamed of 
 saying that the stars move by chance ? If a modern physicist 
 is studying a new phenomenon, and if he discovers its law on 
 Tuesday, would he have said on Monday that the phenomenon 
 was fortuitous ? " ¹ 
 
 There is also another type of case in which " chance must be 
 something more than the name we give to our ignorance." Among 
 the phenomena, of the causes of which we are ignorant, there are 
 some, such as those dealt with by the manager of a life insurance 
 company, about which the calculus of probabilities can give real 
 information. Surely it cannot be thanks to our ignorance, 
 Poincaré urges, that we are able to arrive at valuable conclusions. 
 If it were, it would be necessary to answer an inquirer thus : 
 " You ask me to predict the phenomena that will be produced. 
 If I had the misfortune to know the laws of these phenomena, I 
 could not succeed except by inextricable calculations, and I should 
 have to give up the attempt to answer you ; but since I am 
 fortunate enough to be ignorant of them, I will give you an answer 
 at once. And, what is more extraordinary still, my answer will 
 be right." The ignorance of the manager of the life insurance 
 company as to the prospects of life of his individual policy- 
 holders does not prevent his being able to pay dividends to his 
 shareholders. 
 
 Both these distinctions seem to be real ones, and Poincaré 
 proceeds to examine further instances in which we seem to 
 distinguish objectively between events according as they are or 
 are not due to ' chance.' He takes the case of a cone balanced 
 upon its tip ; we know for certain that it will fall, but not on 
 which side ; chance will determine. " A very small cause which 
 escapes our notice determines a considerable effect that we cannot 
 fail to see, and then we say that that effect is due to chance." 
 The weather, and the distribution of the minor planets on the 
 Zodiac, are analogous instances. And what we term ' games of 
 chance ' afford, it has always been recognised, an almost perfect 
 
 1 Calcul des probabilités (2nd edition), p. 2. This passage also appears in an 
 article in the Revue du mois for 1907 and in the author's Science et méthode, of 
 the English translation of which I have made use above, at the cost of doing 
 incomplete justice to Poincaré's most admirable style. 
 
 
 example. " It may happen that small differences in the initial 
 conditions produce very great ones in the final phenomena. A 
 small error in the former will produce an enormous error in the 
 latter. Prediction becomes impossible, and we have the fortuitous 
 phenomenon." " The greatest chance is the birth of a great 
 man. It is only by chance that the meeting occurs of two genital 
 cells of different sex that contain precisely, each on its side, the 
 mysterious elements, the mutual reaction of which is destined 
 to produce genius. . . . How little it would have taken to make 
 the spermatozoid which carried them deviate from its course. 
 It would have been enough to deflect it a hundredth part of an 
 inch, and Napoleon would not have been born and the destinies 
 of a continent changed. No example can give a better compre- 
 hension of the true character of chance." 
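
Poincaré's remark that small differences in the initial conditions may produce very great ones in the final phenomena can be illustrated numerically. The sketch below uses the logistic map x → 4x(1 − x), a standard toy system which is an assumption of this example and does not appear in the text; the starting values and the function name are likewise hypothetical.

```python
def logistic_orbit(x0, steps):
    """Iterate the logistic map x -> 4x(1 - x) and return every value visited."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(4.0 * x * (1.0 - x))
    return orbit


# Two initial conditions differing by about one part in ten thousand million.
first = logistic_orbit(0.2, 60)
second = logistic_orbit(0.2 + 1e-10, 60)

# Absolute difference between the two orbits at each step.
gaps = [abs(a - b) for a, b in zip(first, second)]

for step in (0, 10, 20, 30, 40):
    print(step, gaps[step])
```

The law of the system is completely known here, yet prediction fails unless the initial state is known with a precision out of all proportion to that demanded of the result. This is the feature Poincaré treats as characteristic of the fortuitous.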
 
 Poincaré calls attention next to another class of events, which 
 we commonly assign to ' chance,' the distinguishing characteristic 
 of which seems to be that their causes are very numerous and 
 complex : the motions of molecules of gas, the distribution of 
 drops of rain, the shuffling of a pack of cards, or the errors of 
 observation. Thirdly there is the type, usually connected with 
 one of the first two, and specially emphasised, as we have seen 
 above, by Cournot, in which something comes about through 
 the concurrence of events which we regard as belonging to distinct 
 causal trains, — a man is walking along the street and is killed by 
 the fall of a tile. 
 
 6. When we attribute such events, as those illustrated by 
 Poincaré, to chance, we certainly do not mean merely to assert 
 that we do not know how they arose or that we had no special 
 reason for anticipating them à priori. So far from this being the 
 case, we mean to make a definite assertion as to the kind of way 
 in which they arose ; though exactly what we mean to assert 
 about them it is extremely difficult to say. 
 
 Now a careful examination of all the cases in which various 
 writers claim to detect the presence of ' objective chance ' confirms 
 the view that ' subjective chance,' which is concerned with 
 knowledge and ignorance, is fundamental, and that so-called 
 ' objective chance,' however important it may turn out to be 
 from the practical or scientific point of view, is really a special 
 kind of ' subjective chance ' and a derivative type of the latter. 
 For none of the adherents of ' objective chance ' wish to question 
 
 
 the determinist character of natural order ; and the possibility 
 of this objective chance of theirs seems always to depend on the 
 possibility that a particular kind of knowledge either is ours or 
 is within our powers and capacity. Let me try to distinguish as 
 exactly as I can the criterion of objective chance. 
 
 7. When we say that an event has happened by chance, we 
 do not mean that previous to its occurrence the event was, on 
 the available evidence, very improbable ; this may or may not 
 have been the case. We say, for example, that if a coin falls heads 
 it is ' by chance,' whereas its falling heads is not at all improbable. 
 The term ' by chance ' has reference rather to the state of our 
 information about the concurrence of the event considered and 
 the event premised. The fall of the coin is a chance event if 
 our knowledge of the circumstances of the throw is irrelevant 
 to our expectation of the possible alternative results. If the 
 number of alternatives is very large, then the occurrence of 
 the event is not only subject to chance but is also very im- 
 probable. In general two events may be said to have a chance 
 connection, in the subjective sense, when knowledge of the 
 first is irrelevant to our expectation of the second, and produces 
 no additional presumption for or against it ; when, that is to 
 say, the probabilities of the propositions asserting them are 
 independent in the sense defined in Chapter XII. § 8. 
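
The condition of irrelevance just stated can be checked mechanically once numerical probabilities are supposed to be available, which is itself an assumption, since the text does not require probabilities to be numerical. The sketch below takes a hypothetical table of joint probabilities for the circumstances of a throw and its result, and tests whether knowledge of the first leaves the expectation of the second unchanged. The function name and the figures are illustrative only.

```python
from fractions import Fraction
from itertools import product


def is_chance_connection(joint):
    """Return True when knowledge of the first event is irrelevant to the second.

    `joint` maps (first, second) pairs to probabilities that sum to 1.
    Irrelevance is tested as P(first and second) == P(first) * P(second).
    """
    firsts = {a for a, _ in joint}
    seconds = {b for _, b in joint}
    p_first = {a: sum(joint.get((a, b), 0) for b in seconds) for a in firsts}
    p_second = {b: sum(joint.get((a, b), 0) for a in firsts) for b in seconds}
    return all(
        joint.get((a, b), 0) == p_first[a] * p_second[b]
        for a, b in product(firsts, seconds)
    )


quarter = Fraction(1, 4)

# Hypothetical case 1: the manner of the throw tells us nothing about the fall.
fair_throw = {
    ("strong flick", "heads"): quarter,
    ("strong flick", "tails"): quarter,
    ("weak flick", "heads"): quarter,
    ("weak flick", "tails"): quarter,
}

# Hypothetical case 2: the manner of the throw favours one face.
tell_tale_throw = {
    ("strong flick", "heads"): Fraction(3, 8),
    ("strong flick", "tails"): Fraction(1, 8),
    ("weak flick", "heads"): Fraction(1, 8),
    ("weak flick", "tails"): Fraction(3, 8),
}

print(is_chance_connection(fair_throw))
print(is_chance_connection(tell_tale_throw))
```

In both tables heads is no more probable than tails, which agrees with the remark above that an event may fall out by chance without being at all improbable; only in the first is the result a chance event relative to the circumstances of the throw.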
 
 The above definition deals with chance in the widest sense. 
 What is the differentia of the narrower group of cases to which 
 it is desired to apply the term ' objective chance ' ? The occur- 
 rence of an event may be said to be subject to objective chance, 
 I think, when it is not only a chance event in the above sense, 
 but when we also have good reason to suppose that the addition 
 of further knowledge of a given kind, if it were procurable, would 
 not affect its chance character. We must consider, that is to say, 
 the probability which is relative not to actual knowledge but to 
 the whole of a certain kind of knowledge. We may be able to 
 infer from our evidence that, even with certain kinds of 
 additions to our knowledge, the connections between the events 
 would still be subject to chance in the sense just defined, and 
 we may be able to infer this without actually having the additional 
 information in question. If, however complete our 
 knowledge of certain kinds of things might be, there would still 
 exist independence between the propositions, the conjunction 
 
 
 of which we are investigating, then we may say there is an 
 objective sense in which the actual conjunction of these pro- 
 positions is due to chance. 
 
 8. This is, I think, the right line of inquiry. It remains to 
 decide, what kinds of information must be irrelevant to the 
 connection, in order that the presence of objective chance may 
 be established. 
 
 When we attribute a coincidence to objective chance, we 
 mean not only that we do not actually know a law of connection, 
 but, speaking roughly, that there is no law of connection to be 
 known. And when we say that the occurrence of one alterna- 
 tive rather than another is due to chance, we mean not only 
 that we know no principle by which to choose between the 
 alternatives, but also that no such principle is knowable. This 
 use of the term closely corresponds to what Venn means by the 
 term ' casual ' : " We call a coincidence casual, I apprehend, 
 when we mean to imply that no knowledge of one of the two 
 elements, which we can suppose to be practically attainable, 
 would enable us to expect the other." ¹ 
 
 To make this more precise, we must revive our distinction,² 
 between nomologic knowledge and ontologic knowledge, 
 between knowledge of laws and knowledge of facts or existence. 
 Given certain facts f(a) about a and certain laws of connection, L, 
 we can infer certainly or probably other facts φ(a) about a. If 
 a complete knowledge of laws of connection together with f(a) 
 yields no appreciable probability for preferring φ(a) to other 
 alternatives, then I suggest that an actual connection between φ 
 and f in a particular instance may be said to be due to chance in 
 a sense which usage justifies us in calling objective. We do 
 not, in fact, when we speak of objective chance, always use it 
 in so strict a sense as this, but this is, I think, the underlying 
 conception to which current usage approximates. Current 
 usage diverges from this sense mainly for two reasons. We 
 speak of objective chance if in the above conditions our 
 grounds for preference, though appreciable, are very small ; and 
 we are not insistent to assert the rule of chance if a comparatively 
 slight addition to our ontologic knowledge would render the 
 probability or the grounds for preference appreciable. 
 
 1 Logic of Chance, p. 245. 
 2 See Part III. Note (ii.) § 2, p. 275. 
 
 
 To sum up the above, an event is due to objective chance if 
 in order to predict it, or to prefer it to alternatives, at present 
 equi-probable, with any high degree of probability, it would be 
 necessary to know a great many more facts of existence about 
 it than we actually do know, and if the addition of a wide 
 knowledge of general principles would be little use. 
 
 It must be added that we make a distinction between facts of 
 existence which are highly variable from case to case and those 
 which are constant or nearly constant over a certain field of 
 observation or experience. Within the limits of this field we 
 regard the permanent facts of existence as being, from the stand- 
 point of chance, in nearly the same position as laws. A connec- 
 tion is not due to chance, therefore, if a knowledge of the per- 
 manent facts of existence could lead to their prediction. 
 
 To sum up again therefore : if within a given field of observation 
 or experience a knowledge of those facts of existence which are 
 permanent or invariable within that field, together with a knowledge 
 of all the relevant fundamental causal laws or general 
 principles, and of a few other facts of existence, would not 
 permit us, given f(a), to attribute an appreciable probability to 
 φ(a) (or an appreciable probability to the alternative φ₁(a) 
 rather than φ₂(a)) ; then the conjunction of φ(a) (or of φ₁(a) 
 rather than φ₂(a)) with f(a) is due to objective chance. 
 
 9. If we return to the examples of Poincaré, the above definition 
 appears to conform satisfactorily with the usages of common 
 sense. It is when an exact knowledge of fact, as distinguished 
 from principle, is required for even approximate prediction that 
 the expression 'objective chance' seems applicable. But 
 neither our definition nor usage is precise as to the amount of 
 knowledge of fact which must be required for prediction, in 
 order that, in the absence of it, the event may be regarded as 
 subject to objective chance. 
 
 It may be added that the expression ' chance ' can be used 
 with reference to general statements as well as to particular facts. 
 We say, for example, that it is a matter of chance if a man dies 
 on his birthday, meaning that, as a general principle and in the 
 absence of special information bearing on a particular case, there 
 is no presumption whatever in favour of his dying on his birthday 
 rather than on any other day. If as a general rule there were cele- 
 brations on such a day such as would be not unlikely to accelerate 
 
 
 death, we should say that a man's dying on his birthday was not 
 altogether a matter of chance. If we knew no such general rule 
 but did not know enough about birthdays to be assured that there 
 was no such rule, we could not call the chance ' objective ' ; we 
 could only speak of it thus, if on the evidence before us there was a 
 strong presumption against the existence of any such general rule. 
 
 10. The philosophical and scientific importance of objective 
 chance as defined above cannot be made plain, until Part V., on 
 the Foundations of Statistical Inference, has been reached. There 
 it will appear in more than one connection, but chiefly in connec- 
 tion with the application of Bernoulli's formula. In cases where 
 the use of this formula is valid, important inferences can be drawn; 
 and it will be shown that, when the conditions for objective chance 
 are approximately satisfied, it is probable that the conditions 
 for the application of Bernoulli's formula will be approximately 
 satisfied also. 
 
 11. The term random has been used, it is well recognised, in 
 several distinct senses. Venn ¹ and other adherents of the 
 ' frequency ' theory have given to it a precise meaning, but one 
 which has avowedly very little relation to popular usage. A 
 random sample, says Peirce,² is one " taken according to a precept 
 or method, which, being applied over and over again indefinitely, 
 would in the long run result in the drawing of any one set of instances 
 as often as any other set of the same number." The 
 same fundamental idea has been expressed with greater precision 
 by Professor Edgeworth in connection with his investigations 
 into the law of error.³ It is a fatal objection, in my opinion, to 
 this mode of defining randomness, that in general we can only 
 know whether or not we have a random sample when our know- 
 ledge is nearly complete. Its divergence from ordinary usage is 
 well illustrated by the fact that there would be perfect randomness 
 in the distribution of stars in the heavens, as Venn explicitly points 
 out, if they were disposed in an exact and symmetrical pattern.⁴ 
 
 1 Logic of Chance, chap. v., " The Conception Randomness and its Scientific 
 Treatment." 
 
 2 " A Theory of Probable Inference " (published in Johns Hopkins Studies in 
 Logic), p. 152. 
 
 3 " Law of Error," Camb. Phil. Trans., 1904, p. 128. 
 
 4 But it may be added that this seems inconsistent with Venn's conception 
 of randomness as that of aggregate order and individual irregularity ; nor is it 
 concordant with Venn's typically random diagram (p. 118). His usage, therefore, 
 is sometimes nearer than his definition to the popular usage. 
 
 
 I do not believe, therefore, that this kind of definition is a 
 useful one. The term must be defined with reference to probability, 
 not to what will happen " in the long run " ; though 
 there may be two senses of it, corresponding to subjective and 
 objective probability respectively. 
 
 The most important phrase in which the term is used is that 
 of ' a random selection ' or ' taken at random.' When we apply 
 this term to a particular member of a series or collection of 
 objects, we may mean one of two things. We may mean that 
 our knowledge of the method of choosing the particular member 
 is such that à priori the member chosen is as likely to be any 
 one member of the series as any other. We may also mean, 
 not that we have no knowledge as to which particular member 
 is in question, but that such knowledge as we have respecting 
 the particular member, as distinguished from other members of 
 the series, is irrelevant to the question as to whether or not 
 this member has the characteristic under examination. In the 
 first case the particular member is a random member of the 
 series for all characteristics ; in the second case it is a random 
 member for some only. As the second case is the more general, 
 we had better take that for the purpose of defining ' random 
 selection.' 
 
 The point will be brought out further if we discuss the 
 more difficult use of the term. What exactly do we mean by 
 the statement : " Any number, taken at random, is equally 
 likely to be odd or even " ? According to the frequency theory, 
 this simply means that there are as many odd numbers as there 
 are even. Taking it in a sense corresponding to subjective 
 chance (and to the explanations given above), I propose as 
 a definition the following : a is taken at random from the 
 class S for the purposes of the propositional function S(x) . φ(x), 
 relative to evidence h, if ' x is a ' is irrelevant to the probability 
 φ(x)/S(x) . h. Thus ' the number of the inhabitants of France is 
 odd ' is, relative to my knowledge, a random instance of the 
 propositional function ' x is an odd number,' since ' a is the 
 number of the inhabitants of France ' is irrelevant to the probability 
 of ' a is odd.' ¹ Thus to say that a number taken at 
 random is as likely to be odd as even, means that there is a 
 
 1 In the above S(x) stands for ' x is a number,' φ(x) stands for ' x is odd,' 
 a stands for ' the number of inhabitants of France.' 
 
 
 probability 1/2 that any instance taken at random of the 
 generalisation ' all numbers are odd ' (or of the corresponding 
 generalisation ' all numbers are even ') is true ; an instance being 
 taken at random in respect of evenness or oddness, if our 
 knowledge about it satisfies the conditions defined above. 
 Whether or not a given instance is taken at random, depends, 
 therefore, upon what generalisation is in question. 
 
 12. We may or may not have reason to believe that, if we take 
 a series of random selections, the proportionate number of 
 occurrences of one particular type of result will very probably 
 lie within certain limits. For reasons to be explained in Chapter 
 XXIX., random selection relative to such information may 
 conveniently be termed ' random selection under Bernoullian 
 conditions.' It is this kind of random selection which is scientific- 
 ally and statistically important. But, as this corresponds to 
 ' objective chance,' it is convenient to have a wider definition 
 of ' random selection ' unqualified, corresponding to ' subjective 
 chance ' ; and it is this wider definition which is given above. 
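
What it means for the proportionate number of occurrences to lie very probably within certain limits may be seen in a simple case. The sketch below assumes that the selections are independent and that each has the same known probability p of yielding the result in question. These are illustrative assumptions standing in for the conditions which the text reserves for Chapter XXIX. The function name and the figures chosen are hypothetical.

```python
from math import comb


def prob_successes_between(n, p, k_low, k_high):
    """Chance that the number of successes in n trials lies in [k_low, k_high].

    Assumes independent trials, each succeeding with the same probability p.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_low, k_high + 1)
    )


# Chance that the proportion of successes lies between 40 and 60 per cent.
for n in (10, 100, 1000):
    k_low, k_high = (4 * n) // 10, (6 * n) // 10
    print(n, prob_successes_between(n, 0.5, k_low, k_high))
```

On these assumptions the chance of the proportion falling within fixed limits on either side of p rises rapidly as the series of selections is lengthened.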
 
 The term opposite to ' random selection ' in ordinary usage 
 is ' biassed selection.' When I use this phrase without qualifica- 
 tion I shall use it as the opposite of ' random selection ' in the 
 wider unqualified sense. 
 
CHAPTER XXV 
 
 SOME PROBLEMS ARISING OUT OF THE DISCUSSION OF CHANCE 
 
 1. There are two classical problems in which attempts have been 
 made to attribute certain astronomical phenomena to a specific 
 cause, rather than to objective chance in some such sense as has 
 been defined in the preceding chapter. 
 
 The first of these is concerned with the inclinations to the 
 ecliptic of the orbits of the planets of the solar system. This 
 problem has a long history, but it will be sufficient to take De 
 Morgan's statement of it.¹ If we suppose that each of the orbits 
 might have any inclination, we obtain a vast number of combinations 
 of which only a small number are such that their sum is as 
 small or smaller than the sum of those of the actual system. 
 But the very existence of ourselves and our world can be shown 
 to imply that one of this small number has been selected, and 
 De Morgan derives from this an enormous presumption that 
 " there was a necessary cause in the formation of the solar system 
 for the inclinations being what they are." 
 
 The answer to this was pointed out by D'Alembert ² in criticising 
 
 1 Article on Probabilities in Encyclopaedia Metropolitana, p. 412, § 46. De 
 Morgan takes this without acknowledgment from Laplace, Théorie analytique 
 des probabilités (1st edition), pp. 257, 258. Laplace also allows for the fact 
 that all the planets move in the same sense as the earth. He concludes : " On 
 verra que l'existence d'une cause commune qui a dirigé tous ces mouvemens 
 dans le sens de la rotation du soleil, et sur des plans peu inclinés à celui de son 
 équateur, est indiquée avec une probabilité bien supérieure à celle du plus 
 grand nombre des faits historiques sur lesquels on ne se permet aucun doute." 
 Laplace had in his turn borrowed the example, also without acknowledgment, 
 from Daniel Bernoulli. See also D'Alembert, Opuscules mathématiques, vol. iv., 
 1768, pp. 89 and 292. 
 
 2 Op. cit. p. 292. " Il y a certainement d'infini contre un à parier que les 
 Planètes ne devraient pas se trouver dans le même plan ; ce n'est pas une raison 
 pour en conclure que cette disposition, si elle avoit lieu, auroit nécessairement 
 d'autre cause que le hasard ; car il y auroit de même l'infini contre un à parier 
 
 Daniel Bernoulli. De Morgan could have reached a similar 
 result whatever the configuration might have happened to be. 
 Any arbitrary disposition over the celestial sphere is vastly 
 improbable à priori, that is to say in the absence of known laws 
 tending to favour particular arrangements. It does not follow 
 from this, as De Morgan argues, that any actual disposition 
 possesses à posteriori a peculiar significance. 
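
The point made against De Morgan can be put in figures. The sketch below replaces the planetary orbits by a hypothetical run of twenty tosses of a fair coin, an assumption made purely for illustration; a strikingly regular arrangement and an irregular one are each exactly as improbable beforehand as the other.

```python
from fractions import Fraction


def prob_of_exact_sequence(sequence):
    """Prior probability of one fully specified run of fair-coin tosses."""
    return Fraction(1, 2) ** len(sequence)


regular = "H" * 20                    # twenty heads in succession
irregular = "HTTHHTHTTTHHTHTHHTTH"    # an arbitrary-looking run of twenty

print(prob_of_exact_sequence(regular))
print(prob_of_exact_sequence(irregular))
```

The smallness of the figure attaches to every arrangement alike, and so cannot by itself mark out the one actually observed as requiring a special cause.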
 
 2. The second of these problems is known as Michell's problem 
 of binary stars. Michell's Memoir was published in the Philosophical 
 Transactions for 1767.¹ It deals with the question as to 
 whether stars which are optically double, i.e. which are so situated 
 as to appear close together to an observer on the earth, are also 
 physically so " either by an original act of the Creator, or in con- 
 sequence of some general law, such perhaps as gravity." He 
 argues that if the stars " were scattered by mere chance as it 
 might happen ... it is manifest . . . that every star being 
 as likely to be in any one situation as another, the probability that 
 any one particular star should happen to be within a certain 
 distance (as, for example, one degree) of any other given star 
 would be represented ... by a fraction whose numerator would 
 be to its denominator as a circle of one degree radius to a circle 
 whose radius is the diameter of a great circle . . . that is, about 
 1 in 13131." From this beginning he derives an immense pre- 
 sumption against the scattering of the several contiguous stars 
 that may be observed " by mere chance as it might happen." 
 And he goes on to argue that, if there are causal laws directly 
 tending to produce the observed proximities, we may reasonably 
 suppose that the proximities are actual, and not merely optical 
 and apparent. The fact that Michell's induction was confirmed 
 by the later investigations of Herschel adds interest to the 
 speculation. But apart from this the argument is evidently 
 
 que les Planètes pourroient n'avoir pas une certaine disposition déterminée à 
 volonté. . . ." 
 
 D'Alembert is employing the instance for his own purposes, in order to build 
 up an ad hominem argument in favour of his theory concerning ' runs ' against 
 D. Bernoulli (see also p. 317). 
 
 1 See also Todhunter's History, pp. 332-4 ; Venn, Logic of Chance, p. 260 ; 
 Forbes, " On the Alleged Evidence for a Physical Connexion between Stars 
 forming Binary or Multiple Groups, deduced from the Doctrine of Chances," 
 Phil. Mag., 1850, and Boole, " On the Theory of Probabilities and in par- 
 ticular on Michell's Problem of the Distribution of the Fixed Stars," Phil. 
 Mag., 1851. 
 
 
 subtler than in the first example. Michell argues that there are 
 more stars optically contiguous, than would be likely if there 
 were no special cause acting towards this end, and further that, 
 if such a cause is in operation, it must be real, and not merely 
 optical, contiguity that results from it. 
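
Michell's figure of about 1 in 13131 can be recovered from the ratio he describes. The sketch below takes the sphere to have unit radius, so that the diameter of a great circle is 2, and treats the small circle of one degree radius as flat; the exact spherical cap is computed alongside for comparison. The function names are illustrative, and the reading of Michell's construction is an interpretation of the quoted words, not his own working.

```python
import math


def michell_fraction(radius_deg):
    """Area of a flat circle of the given angular radius, divided by the area
    of a circle whose radius is the diameter (2) of a unit great circle."""
    r = math.radians(radius_deg)
    return (math.pi * r**2) / (math.pi * 2.0**2)


def cap_fraction(radius_deg):
    """Exact fraction of a unit sphere covered by a cap of that angular radius."""
    return (1.0 - math.cos(math.radians(radius_deg))) / 2.0


odds_flat = 1.0 / michell_fraction(1.0)
odds_cap = 1.0 / cap_fraction(1.0)

print(odds_flat)
print(odds_cap)
```

A circle of radius 2 has area 4π, which is the whole surface of the unit sphere, so the fraction is simply the share of the heavens lying within one degree of the given star.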
 
 Let us analyse the argument more closely. By " mere chance 
 as it might happen " Michell cannot be supposed to mean " un- 
 caused." He is thinking of objective chance in the sense in 
 which I have defined this in the preceding chapter. We 
 speak of a chance occurrence when it is brought about by the 
 coincidence of forces and circumstances so numerous and complex 
 that knowledge sufficient for its prediction is of a kind altogether 
 out of our reach. Michell uses the term vaguely but means, I 
 think, something of this kind : An event is due to mere chance 
 when it can only occur if a large number of independent ¹ conditions 
 are fulfilled simultaneously. The alternatives which 
 Michell is discussing are therefore these : Are binary stars merely 
 due to the interaction of a vast variety of stellar laws and positions 
 or are they the result of a few fundamental tendencies, 
 which might be the subject of knowledge and which would lead 
 us to expect such stars in relative profusion ? 
 
 The existence of numerous binary stars may give a real 
 inductive argument in favour of their arising out of the inter- 
 action of a relatively small number of independent causes. But 
 it is not possible to arrive at such precise results as Michell's. 
 If there is some finite probability à priori that binary stars, 
 when they arise, do arise in this way, then, since the frequent 
 coincidence of a given set of independent causes relatively few 
 in number is more likely than that of a set relatively numerous, 
 the observation of binary stars will raise this probability à posteriori 
 to an extent which depends upon the relative profusion 
 in which such stars appear. If, in short, the first of the two 
 alternatives proposed above is assumed, there is no greater 
 presumption for a distribution, covering a part of the heavens, 
 in which binary stars appear, than for any other distribution ; 
 if the second is assumed, there is a greater presumption. The 
 observation of numerous distributions in which binary stars 
 appear increases, therefore, by the inverse principle, any à priori 
 probability which may exist in favour of the second hypothesis. 
 
 1 See § 3 of Note (ii.) to Part III. 
 
 
 But more than this the argument cannot justify. That Michell's 
 argument is, as it stands, no more valid than De Morgan's, 
 becomes plain when we notice that he would still have a high 
 probability for his conclusion even if only one binary star had 
 been observed. The valuable part of the argument must clearly 
 turn upon the observation of numerous binary stars. 
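
The working of the inverse principle in this argument may be sketched with invented figures. All the numbers below are assumptions made for illustration: a prior probability of 1/100 for the hypothesis of a few fundamental tendencies, and chances of 3/10 and 1/10 that a surveyed region of the heavens shows binary stars under that hypothesis and under its rival respectively. The regions are further assumed to be independent given either hypothesis, and the function name is hypothetical.

```python
from fractions import Fraction


def posterior_for_few_causes(prior, like_few, like_many, regions):
    """Posterior probability of the 'few causes' hypothesis.

    prior      -- prior probability of the hypothesis (assumed)
    like_few   -- chance that a region shows binary stars if it is true
    like_many  -- the same chance if binaries arise from many conditions
    regions    -- number of surveyed regions, all of them showing binaries
    """
    odds = prior / (1 - prior) * (like_few / like_many) ** regions
    return odds / (1 + odds)


prior = Fraction(1, 100)
like_few, like_many = Fraction(3, 10), Fraction(1, 10)

for regions in (0, 1, 5, 10):
    print(regions, posterior_for_few_causes(prior, like_few, like_many, regions))
```

On these figures a single observation leaves the hypothesis very improbable, and it is only the accumulation of instances that raises it; the result depends throughout on the prior probability assumed, which is the element Michell's calculation leaves out of account.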
 
 Let us now turn to Michell's second step. He argues that, 
 if binary stars arise out of the interaction of a small number of 
 independent forces, they must be physically and not merely 
 optically double. The force of this argument seems to depend 
 upon our possessing previous knowledge as to the nature of the 
 principal natural laws, and upon an assumption, arising out of 
 this, that there are not likely to be forces tending to arrange 
 stars, in reality at great distances from one another, so as to 
 appear double from this particular planet. But Michell, in 
 arguing thus, was neglecting the possibility that the optical 
 connection between the stars might be due to the observer and 
 his means of observation. It was not impossible that there should 
 be a law, connected with the transmission of light for example, 
 which would cause stars to appear to an observer to be much 
 nearer together than they really are. 
 
 While, therefore, a relative profusion of binary stars constitutes 
 evidence favourably relevant to Michell's conclusion, the argu- 
 ment is more complex and much less conclusive than he seems to 
 have supposed. This is a criticism which is applicable to many 
 such arguments. The simplicity of the evidence, which arises 
 out of the lack of much relevant information, is liable, unless we 
 are careful, to lead us into deceptive calculations and into asser- 
 tions of high numerical probabilities, upon which we should never 
 venture in cases where the evidence is full and complicated, but 
 where, in fact, the conclusion is established far more strongly. 
 The enormously high probability in favour of his conclusion, to 
 which Michell's calculations led him, should itself have caused 
 him to suspect the accuracy of the reasoning by which he 
 reached it. 
 
 3. Some more recent problems of this type seem, however, so 
 far as I am acquainted with them, to follow safer lines of argu- 
 ment. The most important are concerned with the existence 
 of star drifts. It seems to me not at all impossible to possess 
 data on which a valid argument can be constructed from the
 observation of optically apparent star drifts to the probability 
 of a real uniformity of motion amongst certain sets of stars 
 relatively to others. 
 
 Another problem, somewhat analogous to the preceding, has 
 been recently discussed by Professor Karl Pearson.¹ The title
 might prove a little misleading, perhaps, until the explanation 
 has been reached of the sense in which the term ' random ' is 
 used in it. But Professor Pearson uses the term in a perfectly 
 precise sense. He defines a random distribution as one in which 
 spherical shells of equal volume about the sun as centre contain
 the same number of stars.² He argues that the observed facts
 render probable the following disjunction : Either the distribution
 of stars is not random in the sense defined above, or there is
 a correlation between their distance and their brilliancy, such as 
 might be produced, for example, by the absorption of light in its 
 transmission through space, or the space within which they all 
 lie is limited in volume and not spherical in form.³ But it is
 useless to employ the term random in this sense in such inquiries 
 as Michell's. For there is no reason to suppose that a non- 
 random distribution is more likely than a random distribution 
 to depend upon the interaction of a small number of independent 
 forces, and there might even exist a presumption the other way. 
 This arbitrary interpretation of randomness does not help us to 
 the solution of any interesting problem. 
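 The arbitrariness of this definition can be seen in a minimal sketch. The
 catalogue below is hypothetical: every star is placed on a single ray from
 the sun, so the stars are massed in one quarter of the heavens as completely
 as they can be, yet each shell of equal volume holds the same number. The
 function name, the star count and the radius are illustrative assumptions,
 not anything taken from Professor Pearson's paper.

```python
def shell_counts(radii, n_shells, r_max):
    """Count stars in n_shells concentric shells of equal volume about the sun."""
    counts = [0] * n_shells
    for r in radii:
        # The fraction of the total volume lying inside radius r is (r / r_max) ** 3.
        idx = min(int((r / r_max) ** 3 * n_shells), n_shells - 1)
        counts[idx] += 1
    return counts


N_STARS, N_SHELLS, R_MAX = 1000, 10, 100.0  # hypothetical figures

# Hypothetical catalogue: all stars lie on ONE ray, so only distance is recorded.
radii = [R_MAX * ((i + 0.5) / N_STARS) ** (1 / 3) for i in range(N_STARS)]

print(shell_counts(radii, N_SHELLS, R_MAX))
```

 Direction never enters the computation, which is the point of the footnote
 on the definition: a distribution may be 'random' in this sense however
 unevenly the stars are spread over the sky.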
 
 4. The discussion of final causes and of the argument from
 design has suffered confusion from its supposed connection with 
 theology. But the logical problem is plain and can be determined 
 upon formal and abstract considerations. The argument is in all 
 cases simply this : an event has occurred and has been observed
 which would be very improbable a priori if we did not know that 
 it had actually happened ; on the other hand, the event is of such 
 a character that it might have been not unreasonably predicted 
 if we had assumed the existence of a conscious agent whose 
 motives are of a certain kind and whose powers are sufficient. 
 
 1 "On the Improbability of a Random Distribution of the Stars in Space,"
 Proceedings of Royal Society, series A, vol. 84, pp. 47-70, 1910. 
 
 2 It is, therefore, independent of direction, and the distribution is random 
 even if the stars are massed in particular quarters of the heavens. The defini- 
 tion is, therefore, exceedingly arbitrary. 
 
 3 This should run more correctly, I think, "not a sphere with the sun as
 centre." 
 
 Symbolically : Let $h$ be our original data, $a$ the occurrence
 of the event, $b$ the existence of the supposed conscious agent.
 Then $a/h$ is assumed very small in comparison with $a/bh$ ; and
 we require $b/ah$, the probability, that is to say, of $b$ after $a$ is
 known. The inverse principle of probability already demonstrated
 shows that

 $$b/ah = a/bh \cdot \frac{b/h}{a/h},$$

 and $b/ah$ is therefore not determinate in terms of $a/bh$ and $a/h$
 alone. Thus we cannot measure the probability of the conscious
 agent's existence after the event, unless we can measure its
 probability before the event.
 And it is our ignorance of this, as a rule, that we are endeavouring 
 to remedy. The argument tells us that the existence of the 
 hypothetical agent is more likely after the event than before 
 it ; but, as in the case of the general inductive problem dealt 
 with in Part III., unless there is an appreciable probability first, 
 there cannot be an appreciable probability afterwards. No 
 conclusion, therefore, which is worth having, can be based on the 
 argument from design alone ; like induction, this type of argu- 
 ment can only strengthen the probability of conclusions, for 
 which there is something to be said on other grounds. We cannot 
 say, for example, that the human eye is due to design more 
 probably than not, unless we have some reason, apart from the 
 nature of its construction, for suspecting conscious workmanship. 
 But the necessary à priori probability, derived from some other
 source, may sometimes be forthcoming. The man who upon a 
 desert island picks up a watch, or who sees the symbol John 
 Smith traced upon the sand, can use with reason the argument 
 from design. For he has other grounds for supposing that 
 beings, capable of designing such objects, do exist, and that 
 their presence on the island, now or formerly, is appreciably 
 possible. 
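 The dependence of the conclusion on the initial probability can be sketched
 numerically. The figures below are purely hypothetical, and it is doubtful
 whether such probabilities are numerically measurable at all ; the sketch is
 offered only to show the shape of the inverse principle. The function name
 and the expansion of the probability of the event over the two alternatives,
 agent and no agent, are assumptions of the illustration.

```python
def posterior(prior_b, like_a_given_b, like_a_given_not_b):
    """Inverse principle: b/ah = a/bh * (b/h) / (a/h)."""
    # a/h expanded over the two exclusive alternatives, b and not-b.
    p_a = like_a_given_b * prior_b + like_a_given_not_b * (1.0 - prior_b)
    return like_a_given_b * prior_b / p_a


# Hypothetical figures: the event is 1000 times likelier if the agent exists.
LIKE_B, LIKE_NOT_B = 0.5, 0.0005

for prior in (1e-9, 1e-6, 1e-3, 0.1):
    after = posterior(prior, LIKE_B, LIKE_NOT_B)
    print(f"b/h = {prior:<8g} ->  b/ah = {after:.6g}")
```

 In every row the probability after the event exceeds the probability before
 it, but where the initial probability is inappreciable the result remains
 inappreciable, however favourable the event.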
 
 5. The most important problems at the present day, in which 
 arguments of this kind are employed, are those which arise in 
 connection with psychical research.¹ The analysis of the

 1 The probability that a remarkable success in naming playing cards is due
 to psychic agency, was discussed by Professor Edgeworth in Metretike. This
 was, I think, the first application of probabilities to these questions. See also
 Proceedings of the Society for Psychical Research, Parts VIII. and X. ; Professor
 Edgeworth's article on Psychical Research and Statistical Method, Stat. Journ.
 vol. lxxxii. (1919) p. 222 ; and Experiments in Psychical Research at Leland
 Stanford Junior University, by J. Coover.

 ' cross-correspondences,' which have played so large a part in recent
 discussions, presents many points of difficulty which are not 
 dissimilar to those which arise in other scientific inquiries of
 great complexity in which our initial knowledge is small. An
 important part of the logical problem, therefore, is to distinguish 
 the peculiarity of psychical problems and to discover what special 
 evidence they demand beyond what is required when we deal with 
 other questions. There is a certain tendency, I think, arising out 
 of the belief that psychical problems are in some way peculiar, 
 to raise sceptical doubts against them, which are equally valid 
 against all scientific proofs. Without entering into any questions 
 of detail, let us endeavour to separate those difficulties which 
 seem peculiar to psychical research from those which, however 
 great, are not different from the difficulties which confront 
 students of heredity, for instance, and which are not less likely 
 than these to yield ultimately to the patience and the insight of 
 investigators. 
 
 For this purpose it is necessary to recur, briefly, to the analysis 
 of Part III. It was argued there that the methods of empirical 
 proof, by which we strengthen the probability of our conclusions, 
 are not at all dissimilar, when we apply them to the discovery
 of formal truth, and when we apply them to the discovery of the 
 laws which relate material objects, and that they may possibly 
 prove useful even in the case of metaphysics ; but that the 
 initial probability which we strengthen by these means is differ- 
 ently obtained in each class of problem. In logic it arises out 
 of the postulate that apparent self-evidence invests what seems 
 self-evident with some degree of probability ; and in physical 
 science, out of the postulate that there is a limitation to the 
 amount of independent variety amongst the qualities of material 
 objects. But both in logic and in physical science we may wish 
 to consider hypotheses which it is not possible to invest with any 
 à priori probability and which we entertain solely on account of
 the known truth of many of their consequences. An axiom 
 which has no self-evidence, but which it seems necessary to com- 
 bine with other axioms which are self-evident in order to deduce 
 the generally accepted body of formal truth, stands in this 
 category. A scientific entity, such as the ether or the electron, 
 whose qualities have never been observed but whose existence we
 postulate for purposes of explanation, stands in it also. If the
 analysis of Part III. is correct, we can never attribute a finite 
 probability¹ to the truth of such axioms or to the existence of
 such scientific entities, however many of their consequences 
 we find to be true. They may be convenient hypotheses, because, 
 if we confine ourselves to certain classes of their consequences, 
 we are not likely to be led into error ; but they stand, neverthe- 
 less, in a position altogether different from that of such generalis- 
 ations as we have reason to invest with an initial probability. 
 
 Let us now apply these distinctions to the problems of psychical 
 research. In the case of some of them we can obtain the initial 
 probability, I think, by the same kind of postulates as in physical 
 science, and our conclusions need not be open to a greater degree 
 of doubt than these. In the case of others we cannot ; and these 
 must remain, unless some method is open to us peculiar to 
 psychical research, as tentative unproved hypotheses in the 
 same category as the ether. 
 
 The best example of the first class is afforded by telepathy. 
 We know that the consciousnesses which, if our hypothesis is 
 correct, act upon one another, do exist ; and I see no logical differ- 
 ence between the problem of establishing a law of telepathy and 
 that of establishing the law of gravitation. There is at present a 
 practical difference on account of the much narrower scope of our 
 knowledge, in the case of telepathy, of cognate matters. We can, 
 therefore, be much less certain ; but there seems no reason why 
 we should necessarily remain less certain after more evidence 
 has been accumulated. It is important to remember that, in 
 the case of telepathy, we are merely discovering a relation be- 
 tween objects which we already know to exist. 
 
 The best example of the other class is afforded by attempts 
 to attribute psychic phenomena to the agency of ' spirits ' other 
 than human beings. Such arguments are weakened at present 
 by the fact that no phenomena are known, so far as I am aware, 
 which cannot be explained, though improbably in some cases, 
 in other ways. But even if phenomena were to be observed of 
 
 1 I am assuming that there is no argument, arising either from self-evidence
 or analogy, in addition to the argument arising from the truth of their con- 
 sequences, in favour of the truth of such axioms or the existence of such objects ; 
 but I daresay that this may not certainly be the case. The reader may be re- 
 minded also that, when I deny a finite probability this is not the same thing as 
 to affirm that the probability is infinitely small. I mean simply that it is not 
 greater than some numerically measurable probability. 
 
 which no known agency could afford even an improbable ex- 
 planation, the hypothesis of ' spirits ' would still lie in the same 
 logical limbo as the hypothesis of the ' ether,' in which they 
 might be supposed not inappropriately to move.
 
 Such an hypothesis as the existence of ' spirits ' could only 
 become substantial if some peculiar method of knowledge were 
 within our power which would yield us the initial probability 
 which is demanded. That such a method exists, it is not in- 
 frequently claimed. If we can directly perceive these ' spirits,' 
 as many of those who are described in James's Varieties of 
 Religious Experience think they can, the problem is, logically,
 altogether changed. We have, in fact, very much the same kind 
 of reason, though it may be with less probability, that we have 
 for believing in the existence of other people. The preceding 
 paragraph applies only to attempts at proving the existence of 
 ' spirits ' from such evidence as is discussed by the Society for 
 Psychical Research.
 
 In between these two extremes comes a class of cases, with 
 regard to which it is extremely difficult to come to a decision — 
 that of attempts to attribute psychic phenomena to the conscious 
 agency of the dead. I wish to discuss here, not the nature of the 
 existing evidence, but the question whether it is possible for 
 any evidence to be convincing. In this case the object whose 
 existence we are endeavouring to demonstrate resembles in 
 many respects objects which we know to exist. The question 
 of epistemology, which is before us, is this : Is it necessary, in 
 order that we may have an initial probability, that the object of 
 our hypothesis should resemble in every relevant particular
 some one object which we know to exist, or is it sufficient that we
 should know instances of all its supposed qualities, though never
 in combination ? It is clear that some qualities may be irrelevant
 (position in time and space, for example) and that ' every
 relevant particular ' need not include these. But can the initial
 probability exist if our hypothesis assumes qualities, which have
 plainly some degree of relevance, in new combinations ? If we
 have no knowledge of consciousness existing apart from a living
 body, can indirect evidence of whatever character afford us any
 probability of such a thing ? Could any evidence, for example,
 persuade us that a tree felt the emotion of amusement, even if
 it laughed repeatedly when we made jokes ? Yet the analogy
 which we demand seems to be a matter of degree ; for it does not 
 seem unreasonable to attribute consciousness to dogs, although
 this constitutes a combination of qualities unlike in many respects 
 to any which we know to exist. 
 
 This discussion, however, is wandering from the subject of
 probability to that of epistemology, and it will not be solved until 
 we possess a more comprehensive account of this latter subject 
 than we have at present. I wish only to distinguish between those 
 cases in which we obtain the initial probability in the same 
 manner as in physical science from those in which we must get 
 it, if at all, in some other way. The distinctions I have made 
 are sufficiently summarised by a recapitulation of the following 
 comparisons : We compared the proof of telepathy to the proof 
 of gravitation, the proof of non-human ' spirits ' to the proof
 of the ether, and, much less closely, the proof of the consciousness 
 of the dead to the proof of the consciousness of trees, or, perhaps, 
 of dogs. 
 
 Before passing to the next of the rather miscellaneous topics 
 of this chapter, it may be worth while to add that we should be 
 very chary of applying to problems of psychical research the 
 calculus of probabilities. The alternatives seldom satisfy the 
 conditions for the application of the Principle of Indifference, 
 and the initial probabilities are not capable of being measured 
 numerically. If, therefore, we endeavour to calculate the prob- 
 ability that some phenomenon is due to ' abnormal ' causes, 
 our mathematics will be apt to lead us into unjustifiable
 conclusions. 
 
 6. Uninstructed common sense seems to be specially unre- 
 liable in dealing with what are termed ' remarkable occurrences.' 
 Unless a ' remarkable occurrence ' is simply one which produces 
 on us a particular psychological effect, that of surprise, we can 
 only define it as an event which before its occurrence is very im- 
 probable on the available evidence. But it will often occur — 
 whenever, in fact, our data leave open the possibility of a large
 number of alternatives and show no preference for any of them 
 — that every possibility is exceedingly improbable a priori. It 
 follows, therefore, that what actually occurs does not derive any 
 peculiar significance merely from the fact of its being 'remarkable ' 
 in the above sense. Something further is required before we 
 can build with success. Yet Michell's argument and the argument
 from design derive a good deal of their plausibility, I think,
 from the ' remarkable ' character of the actual constitution
 whether of the heavens or of the universe, in forgetfulness of the
 fact that it is impossible to propound any constitution which
 would if it existed be other than ' remarkable.' It is supposed
 that a remarkable occurrence is specially in need of an explanation,
 and that any sufficient explanation has a high probability
 in its favour. That an explanation is particularly required,
 possesses a measure of truth ; for it is likely that our original
 data were much lacking in completeness, and the occurrence of
 the extraordinary event brings to light this deficiency. But
 that we are not justified in adopting with confidence any sufficient
 explanation, has been shown already.
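 A small illustration, under the assumption that every hand of thirteen cards
 dealt from a pack of fifty-two is equally likely, may make the point
 concrete. The names used are hypothetical ; the only substantive figure is
 the number of possible hands.

```python
from fractions import Fraction
from math import comb

# Number of distinct 13-card hands that can be dealt from a 52-card pack.
HANDS = comb(52, 13)


def prob_of_named_hand():
    """Probability of any one fully specified hand, all hands being indifferent."""
    return Fraction(1, HANDS)


all_spades = prob_of_named_hand()   # the 'remarkable' hand
nondescript = prob_of_named_hand()  # any one particular unremarkable hand

print(HANDS)
print(all_spades == nondescript)
```

 Whatever hand is dealt was, before the deal, improbable to exactly the same
 degree, so that improbability alone cannot mark out the hand of thirteen
 spades as specially significant.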
 
 Such arguments, however, get a part of their plausibility from 
 a quite different source. There is a general supposition that some 
 kinds of occurrences are more likely than others to be susceptible 
 of an explanation by us ; and, therefore, any explanation which
 deals with such cases falls in prepared soil. Results which,
 judging from ourselves, conscious agents would be likely to produce
 fall into this category. Results which would be probable,
 supposing a direct and predominant causal dependence between 
 the elements whose concomitance is remarked, belong to it also. 
 There is, in fact, a sort of argument from analogy as to whether 
 certain sorts of phenomena are or are not likely to be due to 
 ' chance.' This may explain, for example, why the particular
 concurrence of atoms that go to compose the human eye, why a 
 series of correct guesses in naming playing cards, why special
 symmetry or special asymmetry amongst the stars, seem to 
 require explanation in no ordinary degree. Prior to an explana- 
 tion these particular concurrences or series or distributions are 
 no more improbable than any other. But the causes of such 
 conjunctions as these are more likely to be discoverable by the 
 human mind than are the causes of others, and the attempt to
 explain them deserves, therefore, to be more carefully considered. 
 This supposition, derived by analogy or induction from those
 cases in which we believe the causes to be known to us, has, per- 
 haps, some weight. But the direct application of the Calculus 
 of Probabilities can do no more in these cases than suggest matter 
 for investigation. The fact that a man has made a long series 
 of correct guesses in cases where he is cut off from the ordinary
 channels of communication, is a fact worthy of investigation, 
 because it is more likely to be susceptible of a simple causal ex- 
 planation, which may have many applications, than a case in 
 which false and true guesses follow one another with no apparent 
 regularity. 
 
 7. In the case of empirical laws, such as Bode's law, which have 
 no more than a very slight connection with the general body of 
 scientific knowledge, it is sometimes thought that the law is more 
 probable if it is proposed before the examination of some or all of
 the available instances than if it is proposed after their examination.
 Supposing, for example, that Bode's law is accurately
 true for seven planets, it is held that the law would be more 
 probable if it was suggested after the examination of six and 
 was confirmed by the subsequent discovery of the seventh, than 
 it would be if it had not been propounded until after all seven 
 had been observed. The arguments in favour of such a conclusion
 are well put by Peirce :¹ "All the qualities of objects may be
 conceived to result from variations of a number of continuous 
 variables ; hence any lot of objects possesses some character in 
 common, not possessed by any other." Hence if the common 
 character is not predesignate we can conclude nothing. Cases 
 must not be used to prove a generalisation which has only been 
 suggested by the cases themselves. He takes the first five poets 
 from a biographical dictionary with their ages at death : 
 
 Aagard . . . . 48
 Abeille . . . 76
 Abulola . . . 84
 Abunowas . . . 48
 Accords . . . 45

 "These five ages have the following characters in common :

 "1. The difference of the two digits composing the number,
 divided by three, leaves a remainder of one.

 "2. The first digit raised to the power indicated by the second,
 and then divided by three, leaves a remainder of one.

 "3. The sum of the prime factors of each age, including one as
 a prime factor, is divisible by three."
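 The three characters quoted above can be checked mechanically. The sketch
 below rests on two assumptions about the wording : that the 'difference of
 the two digits' is taken without regard to sign, and that the prime factors
 are counted with repetition (so that 48 gives 2, 2, 2, 2, 3). The function
 names are illustrative only.

```python
AGES = {"Aagard": 48, "Abeille": 76, "Abulola": 84, "Abunowas": 48, "Accords": 45}


def prime_factors(n):
    """Prime factors of n counted with repetition, e.g. 48 -> [2, 2, 2, 2, 3]."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def character_1(age):
    first, second = divmod(age, 10)
    return abs(first - second) % 3 == 1


def character_2(age):
    first, second = divmod(age, 10)
    return first ** second % 3 == 1


def character_3(age):
    # "including one as a prime factor": add 1 to the sum of the factors.
    return (1 + sum(prime_factors(age))) % 3 == 0


for name, age in AGES.items():
    print(name, age, character_1(age), character_2(age), character_3(age))
```

 On these readings all five ages satisfy all three characters, whereas an age
 such as 50 already fails the first.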
 He compares a generalisation regarding the ages of poets based 
 
 1 C. S. Peirce, A Theory of Probable Inference, pp. 162-167 ; published in
 Johns Hopkins Studies in Logic, 1883. 
 
 on this evidence to Dr. Lyon Playfair's argument about the 
 specific gravities of the three allotropic forms of carbon : 
 
 Diamond . . . $3.48 = \sqrt{12}$
 Graphite . . . $2.29 = \sqrt[3]{12}$
 Charcoal . . . $1.88 = \sqrt[4]{12}$
 
 approximately, the atomic weight of carbon being 12. Dr. 
 Playfair thinks that the above renders it probable that the specific 
 gravities of the allotropic forms of other elements would, if we 
 knew them, be found to equal the different roots of their atomic 
 weight. 
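 The closeness of the fit can be checked directly. The sketch assumes that
 the three figures are to be compared with the square, cube and fourth roots
 of 12 respectively, which is how the table above is read here ; the names
 are illustrative.

```python
ATOMIC_WEIGHT = 12  # atomic weight of carbon, as given in the passage

# Specific gravity as quoted, and the order of the root it is compared with.
OBSERVED = {"Diamond": (3.48, 2), "Graphite": (2.29, 3), "Charcoal": (1.88, 4)}


def discrepancy(gravity, k, weight=ATOMIC_WEIGHT):
    """Quoted specific gravity minus the k-th root of the atomic weight."""
    return gravity - weight ** (1.0 / k)


for name, (gravity, k) in OBSERVED.items():
    root = ATOMIC_WEIGHT ** (1.0 / k)
    print(f"{name}: quoted {gravity:.2f}, root {root:.3f}, "
          f"difference {discrepancy(gravity, k):+.3f}")
```

 Each quoted figure lies within about two hundredths of the corresponding
 root, which shows how striking an agreement may be obtained from three
 instances and a freely chosen rule.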
 
 The weakness of these arguments, however, has a different
 explanation. These inductions are very improbable, because they 
 are out of relation to the rest of our knowledge and are based on 
 a very small number of instances. The apparent absurdity, 
 moreover, of the inductive law of Poets' Ages is increased by the 
 fact that we take account of the knowledge we actually possess 
 that the ages of poets are not in fact connected by any such law. 
 If we knew nothing whatever about poets' ages except what is 
 stated above, the induction would be as valid as any other which 
 is based on a very weak analogy and a very small number of 
 instances and is unsupported by indirect evidence. 
 
 The peculiar virtue of prediction or predesignation is altogether 
 imaginary. The number of instances examined and the analogy 
 between them are the essential points, and the question as to 
 whether a particular hypothesis happens to be propounded before 
 or after their examination is quite irrelevant. If all our in- 
 ductions had to be thought of before we examined the cases to 
 which we apply them, we should, doubtless, make fewer induc- 
 tions ; but there is no reason to think that the few we should make 
 would be any better than the many from which we should be 
 precluded. The plausibility of the argument is derived from a 
 different source. If an hypothesis is proposed à priori, this
 commonly means that there is some ground for it, arising out of 
 our previous knowledge, apart from the purely inductive ground, 
 and if such is the case the hypothesis is clearly stronger than one 
 which reposes on inductive grounds only. But if it is a mere 
 guess, the lucky fact of its preceding some or all of the cases which 
 verify it adds nothing whatever to its value. It is the union of
 prior knowledge, with the inductive grounds which arise out of 
 the immediate instances, that lends weight to an hypothesis, and 
 not the occasion on which the hypothesis is first proposed. It is 
 sometimes said, to give another example, that the daily fulfilment 
 of the predictions of the Nautical Almanack constitutes the most 
 cogent proof of the laws of dynamics. But here the essence of 
 the verification lies in the variety of cases which can be brought
 accurately under our notice by means of the Almanack, and in 
 the fact that they have all been obtained on a uniform principle, 
 not in the fact that the verification is preceded by a prediction. 
 
 The same point arises not uncommonly in statistical inquiries. 
 If a theory is first proposed and is then confirmed by the examina- 
 tion of statistics, we are inclined to attach more weight to it than 
 to a theory which is constructed in order to suit the statistics. 
 But the fact that the theory which precedes the statistics is more 
 likely than the other to be supported by general considerations
 (for it has not, presumably, been adopted for no reason at all)
 constitutes the only valid ground for this preference. If it does
 not receive more support than the other from general considera- 
 tions, then the circumstances of its origin are no argument in its 
 favour. The opposite view, which the unreliability of some
 statisticians has brought into existence, that it is a positive
 advantage to approach statistical evidence without preconceptions
 based on general grounds, because the temptation to ' cook '
 the evidence will prove otherwise to be irresistible, has no
 logical basis and need only be considered when the impartiality of 
 an investigator is in doubt. 
 
CHAPTER XXVI

 THE APPLICATION OF PROBABILITY TO CONDUCT
 
 1. Given as our basis what knowledge we actually have, the 
 probable, I have said, is that which it is rational for us to believe. 
 This is not a definition. For it is not rational for us to believe 
 that the probable is true ; it is only rational to have a probable 
 belief in it or to believe it in preference to alternative beliefs. To 
 believe one thing in preference to another, as distinct from believing 
 the first true or more probable and the second false or less probable, 
 must have reference to action and must be a loose way of ex- 
 pressing the propriety of acting on one hypothesis rather than 
 on another. We might put it, therefore, that the probable is 
 the hypothesis on which it is rational for us to act. It is, however, 
 not so simple as this, for the obvious reason that of two hypotheses 
 it may be rational to act on the less probable if it leads to the 
 greater good. We cannot say more at present than that the 
 probability of a hypothesis is one of the things to be determined 
 and taken account of before acting on it. 
 
 2. I do not know of passages in the ancient philosophers which 
 explicitly point out the dependence of the duty of pursuing 
 goods on the reasonable or probable expectation of attaining 
 them relative to the agent's knowledge. This means only that 
 analysis had not disentangled the various elements in rational 
 action, not that common sense neglected them. Herodotus 
 puts the point quite plainly. " There is nothing more profitable 
 for a man," he says, " than to take good counsel with himself ; 
 for even if the event turns out contrary to one's hope, still one's 
 decision was right, even though fortune has made it of no effect : 
 whereas if a man acts contrary to good counsel, although by luck 
 he gets what he had no right to expect, his decision was not any 
 the less foolish."¹
 
 1 Herod, vii. 10. 
 
 3. The first contact of theories of probability with modern 
 ethics appears in the Jesuit doctrine of probabilism. According
 to this doctrine one is justified in doing an action for which there
 is any probability, however small, of its results being the best 
 possible. Thus, if any priest is willing to permit an action, that 
 fact affords some probability in its favour, and one will not be 
 damned for performing it, however many other priests denounce
 it.¹ It may be suspected, however, that the object of this
 doctrine was not so much duty as safety. The priest who per- 
 mitted you so to act assumed thereby the responsibility. The 
 correct application of probability to conduct naturally escaped 
 the authors of a juridical ethics, which was more interested in 
 the fixing of responsibility for definite acts, and in the various 
 specified means by which responsibility might be disposed of, 
 than in the greatest possible sum-total of resultant good. 
 
 A more correct doctrine was brought to light by the efforts of 
 the philosophers of the Port Royal to expose the fallacies of prob- 
 abilism. " In order to judge," they say, " of what we ought to 
 do in order to obtain a good and to avoid an evil, it is necessary 
 to consider not only the good and evil in themselves, but also 
 the probability of their happening and not happening, and to 
 regard geometrically the proportion which all these things have, 
 taken together."² Locke perceived the same point, although
 not so clearly.³ By Leibniz this theory is advanced more
 explicitly ; in such judgments, he says, " as in other estimates 
 disparate and heterogeneous and, so to speak, of more than one 
 dimension, the greatness of that which is discussed is in reason
 composed of both estimates (i.e. of goodness and of probability),
 and is like a rectangle, in which there are two considerations, 
 viz. that of length and that of breadth. . . . Thus we should 
 
 1 Compare with this doctrine the following curious passage from Jeremy
 Taylor : "We being the persons that are to be persuaded, we must see that
 we be persuaded reasonably. And it is unreasonable to assent to a lesser
 evidence when a greater and clearer is propounded : but of that every man for
 himself is to take cognisance, if he be able to judge ; if he be not, he is not
 bound under the tie of necessity to know anything of it. That that is
 necessary shall be certainly conveyed to him : God, that best can, will certainly
 take care for that ; for if he does not, it becomes to be not necessary ; or if it
 should still remain necessary, and he be damned for not knowing it, and yet to
 know it be not in his power, then who can help it ! There can be no further 
 care in this business." 
 
 2 The Port Royal Logic (1662), Eng. Trans. p. 367.
 
 3 Essay concerning Human Understanding; book ii. chap. xxi. § 66. 
 
 still need the art of thinking and that of estimating probabilities,
 besides the knowledge of the value of goods and evils, in order 
 properly to employ the art of consequences."¹
 
 In his preface to the Analogy Butler insists on " the absolute 
 and formal obligation " under which even a low probability, 
 if it is the greatest, may lay us : "To us probability is the very 
 guide of life." 
 
 4. With the development of a utilitarian ethics largely con- 
 cerned with the summing up of consequences, the place of prob- 
 ability in ethical theory has become much more explicit. But 
 although the general outlines of the problem are now clear, there 
 are some elements of confusion not yet dispersed. I will deal with 
 some of them. 
 
 In his Principia Ethica (p. 152) Dr. Moore argues that " the 
 first difficulty in the way of establishing a probability that one 
 course of action will give a better total result than another, lies 
 in the fact that we have to take account of the effects of both 
 throughout an infinite future. . . . We can certainly only pretend
 to calculate the effects of actions within what may be called an 
 ' immediate future.' . . . We must, therefore, certainly have 
 some reason to believe that no consequences of our action in a 
 further future will generally be such as to reverse the balance of 
 good that is probable in the future which we can foresee. This 
 large postulate must be made, if we are ever to assert that the 
 results of one action will be even probably better than those of 
 another. Our utter ignorance of the far future gives us no justi- 
 fication for saying that it is even probably right to choose the 
 greater good within the region over which a probable forecast 
 may extend." 
 
 This argument seems to me to be invalid and to depend on 
 a wrong philosophical interpretation of probability. Mr. Moore's 
 reasoning endeavours to show that there is not even a probability 
 by showing that there is not a certainty. We must not, of course, 
 have reason to believe that remote consequences will generally 
 be such as to reverse the balance of immediate good. But we 
 need not be certain that the opposite is the case. If good is 
 additive, if we have reason to think that of two actions one pro- 
 duces more good than the other in the near future, and if we have 
 no means of discriminating between their results in the distant 
 
 1 Nouveaux Essais, book ii. chap. xxi.
 
310 A TREATISE ON PROBABILITY PT. IV
 
 future, then by what seems a legitimate application of the 
 Principle of Indifference we may suppose that there is a prob- 
 ability in favour of the former action. Mr. Moore's argument 
 must be derived from the empirical or frequency theory of 
 probability, according to which we must know for certain what 
 will happen generally (whatever that may mean) before we can
 assert a probability. 
 
 The results of our endeavours are very uncertain, but we have 
 a genuine probability, even when the evidence upon which it is 
 founded is slight. The matter is truly stated by Bishop Butler : 
 " From our short views it is greatly uncertain whether this 
 endeavour will, in particular instances, produce an overbalance
 of happiness upon the whole ; since so many and distant things
 must come into the account. And that which makes it our duty
 is that there is some appearance that it will, and no positive
 appearance to balance this, on the contrary side. . . ." ^
 
 The difficulties which exist are not chiefly due, I think, to our 
 ignorance of the remote future. The possibility of our knowing 
 that one thing rather than another is our duty depends upon the 
 assumption that a greater goodness in any part makes, in the 
 absence of evidence to the contrary, a greater goodness in the 
 whole more probable than would the lesser goodness of the part. 
 We assume that the goodness of a part is favourably relevant to 
 the goodness of the whole. Without this assumption we have no 
 reason, not even a probable one, for preferring one action to any 
 other on the whole. If we suppose that goodness is always 
 organic, whether the whole is composed of simultaneous or
 successive parts, such an assumption is not easily justified. The 
 case is parallel to the question, whether physical law is organic or 
 atomic, discussed in Chapter XXI. § 6. 
 
 Nevertheless we can admit that goodness is partly organic 
 and still allow ourselves to draw probable conclusions. For the
 alternatives, that either the goodness of the whole universe 
 throughout time is organic or the goodness of the universe is the 
 arithmetic sum of the goodnesses of infinitely numerous and 
 infinitely divided parts, are not exhaustive. We may suppose 
 that the goodness of conscious persons is organic for each distinct 
 
 1 This passage is from the Analogy. The Bishop adds : " ... and also 
 that such benevolent endeavour is a cultivation of that most excellent of all
 virtuous principles, the active principle of benevolence." 
 
 
 and individual personality. Or we may suppose that, when
 conscious units are in conscious relationship, then the whole
 which we must treat as organic includes both units. These are 
 only examples. We must suppose, in general, that the units 
 whose goodness we must regard as organic and indivisible are 
 not always larger than those the goodness of which we can 
 perceive and judge directly. 
 
 5. The difficulties, however, which are most fundamental 
 from the standpoint of the student of probability, are of a different 
 kind. Normal ethical theory at the present day, if there can be
 said to be any such, makes two assumptions : first, that degrees
 of goodness are numerically measurable and arithmetically
 additive, and second, that degrees of probability also are numeric-
 ally measurable. This theory goes on to maintain that what
 we ought to add together, when, in order to decide between two
 courses of action, we sum up the results of each, are the ' mathe- 
 matical expectations ' of the several results. ' Mathematical 
 expectation ' is a technical expression originally derived from the 
 scientific study of gambling and games of chance, and stands for 
 the product of the possible gain with the probability of attaining 
 it.^ In order to obtain, therefore, a measure of what ought to 
 be our preference in regard to various alternative courses of action, 
 we must sum for each course of action a series of terms made 
 up of the amounts of good which may attach to each of its 
 possible consequences, each multiplied by its appropriate prob- 
 ability. 
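
 The rule just described can be put as a short sketch. The course names and the amounts of good below are hypothetical figures chosen only for illustration, and the sketch simply grants the two assumptions under discussion (numerically measurable goods and probabilities).

```python
from fractions import Fraction

def mathematical_expectation(consequences):
    """Sum of good * probability over (good, probability) pairs."""
    total_probability = sum(p for _, p in consequences)
    if total_probability != 1:
        raise ValueError("probabilities of the possible consequences must sum to 1")
    return sum(good * p for good, p in consequences)

# Hypothetical courses of action: (amount of good, probability) for each consequence.
courses = {
    "cautious": [(Fraction(4), Fraction(1))],
    "speculative": [(Fraction(10), Fraction(1, 2)), (Fraction(0), Fraction(1, 2))],
}

expectations = {name: mathematical_expectation(c) for name, c in courses.items()}
# The ordinary theory prefers whichever course has the larger sum.
preferred = max(expectations, key=expectations.get)
```

 On these invented figures the rule prefers the speculative course (5 against 4), which is exactly the kind of verdict the later paragraphs on weight and risk call into question.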
 
 The first assumption, that quantities of goodness are duly 
 subject to the laws of arithmetic, appears to me to be open to a 
 certain amount of doubt. But it would take me too far from
 my proper subject to discuss it here, and I shall allow, for the 
 purposes of further argument, that in some sense and to some 
 extent this assumption can be justified. The second assumption, 
 however, that degrees of probability are wholly subject to the 
 laws of arithmetic, runs directly counter to the view which has 
 
 1 Priority in the conception of mathematical expectation can, I think, be
 claimed by Leibniz, De incerti aestimatione, 1678 (Couturat, Logique de Leibniz,
 p. 248). In a letter to Placcius, 1687 (Dutens, vi. i. 36 and Couturat, op. cit.
 p. 246) Leibniz proposed an application of the same principle to juris-
 prudence, by virtue of which, if two litigants lay claim to a sum of money,
 and if the claim of the one is twice as probable as that of the other, the sum
 should be divided between them in that proportion. The doctrine seems
 sensible, but I am not aware that it has ever been acted on.
 
 
 been advocated in Part I. of this treatise. Lastly, if both these 
 points be waived, the doctrine that the ' mathematical expecta- 
 tions ' of alternative courses of action are the proper measures of 
 our degrees of preference is open to doubt on two grounds — first, 
 because it ignores what I have termed in Part I. the ' weights ' 
 of the arguments, namely, the amount of evidence upon which 
 each probability is founded ; and second, because it ignores the 
 element of ' risk ' and assumes that an even chance of heaven
 or hell is precisely as much to be desired as the certain attain-
 ment of a state of mediocrity. Putting on one side the first of
 these grounds of doubt, I will treat each of the others in turn.
 
 6. In Chapter III. of Part I. I have argued that only in a 
 strictly limited class of cases are degrees of probability numeric- 
 ally measurable. It follows from this that the ' mathematical 
 expectations ' of goods or advantages are not always numerically 
 measurable ; and hence, that even if a meaning can be given to 
 the sum of a series of non-numerical ' mathematical expectations,' 
 not every pair of such sums are numerically comparable in respect 
 of more and less. Thus even if we know the degree of advantage 
 which might be obtained from each of a series of alternative 
 courses of actions and know also the probability in each case of 
 obtaining the advantage in question, it is not always possible by
 a mere process of arithmetic to determine which of the alternatives
 ought to be chosen. If, therefore, the question of right action is
 under all circumstances a determinate problem, it must be in
 virtue of an intuitive judgment directed to the situation as a
 whole, and not in virtue of an arithmetical deduction derived
 from a series of separate judgments directed to the individual 
 alternatives each treated in isolation. 
 
 We must accept the conclusion that, if one good is greater 
 than another, but the probability of attaining the first less than 
 that of attaining the second, the question of which it is our duty
 to pursue may be indeterminate, unless we suppose it to be
 within our power to make direct quantitative judgments of prob-
 ability and goodness jointly. It may be remarked, further,
 that the difficulty exists, whether the numerical indeterminate-
 ness of the probability is intrinsic or whether its numerical value
 is, as it is according to the Frequency Theory and most other 
 theories, simply unknown. 
 
 7. The second difficulty, to which attention is called above, 
 
 
 is the neglect of the ' weights ' of arguments in the conception 
 of ' mathematical expectation.' In Chapter VI. of Part I. the 
 significance of ' weight ' has been discussed. In the present 
 connection the question comes to this — if two probabilities are
 equal in degree, ought we, in choosing our course of action, to 
 prefer that one which is based on a greater body of knowledge ? 
 
 The question appears to me to be highly perplexing, and it is 
 difficult to say much that is useful about it. But the degree of 
 completeness of the information upon which a probability is 
 based does seem to be relevant, as well as the actual magnitude 
 of the probability, in making practical decisions. Bernoulli's 
 maxim,^ that in reckoning a probability we must take into account 
 all the information which we have, even when reinforced by
 Locke's maxim that we must get all the information we can,^
 does not seem completely to meet the case. If, for one alternative,
 the available information is necessarily small, that does not seem
 to be a consideration which ought to be left out of account 
 altogether. 
 
 8. The last difficulty concerns the question whether, the 
 former difficulties being waived, the ' mathematical expectation ' 
 of different courses of action accurately measures what our
 preferences ought to be — whether, that is to say, the undesir-
 ability of a given course of action increases in direct proportion
 to any increase in the uncertainty of its attaining its object, or
 whether some allowance ought to be made for ' risk,' its undesir-
 ability increasing more than in proportion to its uncertainty.
 
 In fact the meaning of the judgment, that we ought to act in 
 such a way as to produce most probably the greatest sum of 
 goodness, is not perfectly plain. Does this mean that we 
 ought so to act as to make the sum of the goodnesses of each of 
 the possible consequences of our action multiplied by its prob- 
 ability a maximum ? Those who rely on the conception of 
 ' mathematical expectation ' must hold that this is an indisput- 
 able proposition. The justifications for this view most commonly 
 advanced resemble that given by Condorcet in his " Réflexions
 
 1 Ars Conjectandi, p. 215 : " Non sufficit expendere unum alterumve argu-
 mentum, sed conquirenda sunt omnia, quae in cognitionem nostram venire
 possunt, atque ullo modo ad probationem rei facere videntur."

 2 Essay concerning Human Understanding, book ii. chap. xxi. § 67 : " He
 that judges without informing himself to the utmost that he is capable, cannot
 acquit himself of judging amiss."
 
 
 sur la règle générale, qui prescrit de prendre pour valeur d'un
 événement incertain, la probabilité de cet événement multipliée
 par la valeur de l'événement en lui-même," ^ where he argues
 from Bernoulli's theorem that such a rule will lead to satisfactory
 results if a very large number of trials be made. As, however,
 it will be shown in Chapter XXIX. of Part V. that Bernoulli's
 theorem is not applicable in by any means every case, this 
 argument is inadequate as a general justification. 
 
 In the history of the subject, nevertheless, the theory of 
 ' mathematical expectation ' has been very seldom disputed. 
 As D'Alembert has been almost alone in casting serious doubts 
 upon it (though he only brought himself into disrepute by doing
 so), it will be worth while to quote the main passage in which he
 declares his scepticism : " Il me sembloit " (in reading Bernoulli's
 Ars Conjectandi) " que cette matière avoit besoin d'être traitée
 d'une manière plus claire ; je voyois bien que l'espérance étoit
 plus grande, 1° que la somme espérée étoit plus grande, 2° que
 la probabilité de gagner l'étoit aussi. Mais je ne voyois pas avec
 la même évidence, et je ne le vois pas encore, 1° que la probabilité
 soit estimée exactement par les méthodes usitées ; 2° que quand
 elle le seroit, l'espérance doive être proportionnelle à cette proba-
 bilité simple, plutôt qu'à une puissance ou même à une fonction
 de cette probabilité ; 3° que quand il y a plusieurs combinaisons
 qui donnent différens avantages ou différens risques (qu'on
 regarde comme des avantages négatifs) il faille se contenter
 d'ajouter simplement ensemble toutes les espérances pour avoir
 l'espérance totale." ^
 
 In extreme cases it seems difficult to deny some force to 
 D'Alembert's objection ; and it was with reference to extreme 
 cases that he himself raised it. Is it certain that a larger good, 
 which is extremely improbable, is precisely equivalent ethically 
 to a smaller good which is proportionately more probable ? We
 may doubt whether the moral value of speculative and cautious 
 action respectively can be weighed against one another in a 
 simple arithmetical way, just as we have already doubted whether 
 a good whose probability can only be determined on a slight 
 basis of evidence can be compared by means merely of the 
 
 1 Hist. de l'Acad., Paris, 1781.

 2 Opuscules mathématiques, vol. iv., 1768 (extraits de lettres), pp. 284, 285.
 See also p. 88 of the same volume. 
 
 
 magnitude of this probability with another good whose likelihood 
 is based on completer knowledge. 
 
 There seems, at any rate, a good deal to be said for the con-
 clusion that, other things being equal, that course of action is
 preferable which involves least risk, and about the results of
 which we have the most complete knowledge. In marginal cases,
 therefore, the coefficients of weight and risk as well as that
 of probability are relevant to our conclusion. It seems natural
 to suppose that they should exert some influence in other cases
 also, the only difficulty in this being the lack of any principle for
 the calculation of the degree of their influence. A high weight
 and the absence of risk increase pro tanto the desirability of the
 action to which they refer, but we cannot measure the amount 
 of the increase. 
 
 The ' risk ' may be defined in some such way as follows. If
 $A$ is the amount of good which may result, $p$ its probability
 ($p + q = 1$), and $E$ the value of the ' mathematical expectation,'
 so that $E = pA$, then the ' risk ' is $R$, where
 $R = p(A - E) = p(1 - p)A = pqA = qE$. This may be put in another way : $E$
 measures the net immediate sacrifice which should be made in the
 hope of obtaining $A$ ; $q$ is the probability that this sacrifice will
 be made in vain ; so that $qE$ is the ' risk.' ^ The ordinary theory
 supposes that the ethical value of an expectation is a function
 of $E$ only and is entirely independent of $R$.
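
 A small numerical sketch may make the definition concrete. It assumes the reading E = pA and R = p(A - E) = qE, and it also sums the chain of successive reinsurances mentioned in the footnote on Risiko; the figures (A = 100, p = 1/4) are hypothetical.

```python
from fractions import Fraction

def expectation_and_risk(A, p):
    """Return (E, R) with E = p*A and R = p*(A - E)."""
    E = p * A
    R = p * (A - E)
    return E, R

def reinsured_total(A, p, rounds):
    """Partial sum E + qE + q^2 E + ... + q^rounds E of the reinsurance series."""
    q = 1 - p
    E = p * A
    return sum(E * q**k for k in range(rounds + 1))

A, p = Fraction(100), Fraction(1, 4)  # hypothetical good and probability
q = 1 - p
E, R = expectation_and_risk(A, p)
# The three equivalent forms of the risk agree exactly.
forms_agree = (R == q * E == p * q * A)
# With enough reinsurances the total approaches A itself.
total_after_200 = reinsured_total(A, p, 200)
```

 Exact fractions are used so that the equalities between the different forms of R hold without rounding error.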
 
 We could, if we liked, define a conventional coefficient $c$ of
 weight and risk, such as $c = \frac{2pw}{(1+q)(1+w)}$, where $w$ measures the
 ' weight,' which is equal to unity when $p = 1$ and $w = 1$, and
 to zero when $p = 0$ or $w = 0$, and has an intermediate value
 in other cases.^ But if doubts as to the sufficiency of the
 conception of ' mathematical expectation ' be sustained, it is not
 likely that the solution will lie, as D'Alembert suggests, and as
 has been exemplified above, in the discovery of some more
 
 1 The theory of Risiko is briefly dealt with by Czuber, Wahrscheinlichkeits-
 rechnung, vol. ii. pp. 219 et seq. If $R$ measures the first insurance, this leads to a
 Risiko of the second order, $R_1 = qR = q^2 E$. This again may be insured against,
 and by a sufficient number of such reinsurances the risk can be completely
 shifted : $E + R + R_1 + R_2 + \dots = E(1 + q + q^2 + \dots) = \frac{E}{1-q} = \frac{E}{p} = A$.

 2 If $pA = p'A'$, $w > w'$, and $q = q'$, then $cA > c'A'$ ; if $pA = p'A'$, $w = w'$, and
 $q < q'$, then $cA > c'A'$ ; if $pA = p'A'$, $w > w'$, and $q < q'$, then $cA > c'A'$ ; but if
 $pA = p'A'$, $w \gtrless w'$, and $q \gtrless q'$, we cannot in general compare $cA$ and $c'A'$.
 
 
 complicated function of the probability wherewith to compound
 the proposed good. The judgment of goodness and the judgment 
 of probability both involve somewhere an element of direct 
 apprehension, and both are quantitative. We have raised a 
 doubt as to whether the magnitude of the ' oughtness ' of an 
 action can be in all cases directly determined by simply multi-
 plying together the magnitudes obtained in the two direct judg-
 ments ; and a new direct judgment may be required, respecting
 the magnitude of the ' oughtness ' of an action under given 
 circumstances, which need not bear any simple and necessary 
 relation to the two former. 
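
 For illustration only, the conventional coefficient of weight and risk mentioned above can be evaluated numerically. The sketch assumes the reading c = 2pw / ((1 + q)(1 + w)) with q = 1 - p, which agrees with the stated boundary values, and it treats the weight w as a number between 0 and 1; the compared cases are hypothetical.

```python
from fractions import Fraction

def coefficient(p, w):
    """Conventional coefficient c = 2pw / ((1 + q)(1 + w)), with q = 1 - p."""
    q = 1 - p
    return 2 * p * w / ((1 + q) * (1 + w))

# Boundary values stated in the text.
unity = coefficient(Fraction(1), Fraction(1))
zero_when_p_vanishes = coefficient(Fraction(0), Fraction(1, 2))
zero_when_w_vanishes = coefficient(Fraction(1, 2), Fraction(0))

# Hypothetical pair with the same expectation (pA = 10) and the same risk,
# differing only in the weight of the evidence.
better_founded = coefficient(Fraction(1, 2), Fraction(3, 4)) * 20
worse_founded = coefficient(Fraction(1, 2), Fraction(1, 4)) * 20
```

 The better-founded alternative receives the larger value, as the footnote's first case requires; nothing in the sketch settles whether any such coefficient is the right measure, which is the very point in doubt.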
 
 The hope, which sustained many investigators in the course 
 of the nineteenth century, of gradually bringing the moral sciences 
 under the sway of mathematical reasoning, steadily recedes — if
 we mean, as they meant, by mathematics the introduction of 
 precise numerical methods. The old assumptions, that all 
 quantity is numerical and that all quantitative characteristics 
 are additive, can be no longer sustained. Mathematical reasoning 
 now appears as an aid in its symbolic rather than in its numerical 
 character. I, at any rate, have not the same lively hope as 
 Condorcet, or even as Edgeworth, " éclairer les Sciences morales
 et politiques par le flambeau de l'Algèbre." In the present case,
 even if we are able to range goods in order of magnitude, and also 
 their probabilities in order of magnitude, yet it does not follow 
 that we can range the products composed of each good and its 
 corresponding probability in this order. 
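
 The closing remark can be checked with a small sketch. Two hypothetical numerical assignments are given which agree on the order of the goods and on the order of the probabilities, yet disagree on the order of the products; the magnitudes are invented purely for the demonstration.

```python
from fractions import Fraction

def larger_product(goods, probabilities):
    """Say which of two good-times-probability products is the larger."""
    first = goods[0] * probabilities[0]
    second = goods[1] * probabilities[1]
    if first == second:
        return "equal"
    return "first" if first > second else "second"

# In both cases good 1 < good 2 and probability 1 > probability 2;
# only the invented magnitudes differ.
case_a = larger_product((2, 3), (Fraction(9, 10), Fraction(1, 10)))
case_b = larger_product((2, 30), (Fraction(6, 10), Fraction(5, 10)))
```

 Since the same two orderings are compatible with either verdict, a merely ordinal knowledge of goods and probabilities cannot by itself rank the products.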
 
 9. Discussions of the doctrine of Mathematical Expectation, 
 apart from its directly ethical bearing, have chiefly centred 
 round the classic Petersburg Paradox,^ which has been treated by 
 almost all the more notable writers, and has been explained by 
 them in a great variety of ways. The Petersburg Paradox arises 
 out of a game in which Peter engages to pay Paul one shilling 
 if a head appears at the first toss of a coin, two shillings if it does 
 not appear until the second, and, in general, $2^{r-1}$ shillings if no
 head appears until the $r$th toss. What is the value of Paul's
 expectation, and what sum must he hand over to Peter before 
 the game commences, if the conditions are to be fair ? 
 
 1 For the history of this paradox see Todhunter. The name is due, he says,
 to its having first appeared in a memoir by Daniel Bernoulli in the Commentarii 
 of the Petersburg Academy. 
 
 
 The mathematical answer is $\sum_{r=1}^{n} (\tfrac{1}{2})^{r} 2^{r-1}$, if the number of tosses
 is not in any case to exceed $n$ in all, and $\sum_{r=1}^{\infty} (\tfrac{1}{2})^{r} 2^{r-1}$ if this restriction
 is removed. That is to say, Paul should pay $\frac{n}{2}$ shillings in the
 first case, and an infinite sum in the second. Nothing, it is said,
 could be more paradoxical, and no sane Paul would engage on
 these terms even with an honest Peter.
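
 The arithmetic behind this answer is easily verified. The sketch below assumes the game as described (a prize of 2^(r-1) shillings with probability (1/2)^r) and sums the expectation exactly for a limited number of tosses; the function name is illustrative only.

```python
from fractions import Fraction

def petersburg_expectation(max_tosses):
    """Exact expectation, in shillings, when play stops after max_tosses tosses."""
    return sum(
        Fraction(1, 2**r) * 2 ** (r - 1)  # probability of toss r times its prize
        for r in range(1, max_tosses + 1)
    )

# Each admissible toss contributes exactly half a shilling,
# so the sum grows without limit as the restriction is relaxed.
values = {n: petersburg_expectation(n) for n in (1, 10, 100)}
```

 Every term of the series equals one half, which is why the restricted game is worth n/2 shillings and the unrestricted series has no finite sum.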
 
 Many of the solutions which have been offered will occur at
 once to the reader. The conditions of the game imply contra-
 diction, say Poisson and Condorcet ; Peter has undertaken
 engagements which he cannot fulfil ; if the appearance of heads 
 is deferred even to the 100th toss, he will owe a mass of silver 
 greater in bulk than the sun. But this is no answer. Peter has 
 promised much and a belief in his solvency will strain our imagina- 
 tion ; but it is imaginable. And in any case, as Bertrand points 
 out, we may suppose the stakes to be, not shillings, but grains of 
 sand or molecules of hydrogen. 
 
 D'Alembert's principal explanations are, first, that true ex- 
 pectation is not necessarily the product of probability and 
 profit (a view which has been discussed above), and second, that 
 very long runs are not only very improbable, but do not occur 
 at all. 
 
 The next type of solution is due, in the first instance, to Daniel 
 Bernoulli, and turns on the fact that no one but a miser regards 
 the desirability of different sums of money as directly proportional 
 to their amount ; as Buffon says, " L'avare est comme le
 mathématicien : tous deux estiment l'argent par sa quantité
 numérique." Daniel Bernoulli deduced a formula from the
 assumption that the importance of an increment is inversely 
 proportional to the size of the fortune to which it is added. 
 Thus, if $x$ is the ' physical ' fortune and $y$ the ' moral ' fortune,

 $dy = k \frac{dx}{x}$,

 or $y = k \log \frac{x}{a}$, where $k$ and $a$ are constants.
 
 On the basis of this formula of Bernoulli's a considerable 
 
 
 theory has been built up both by Bernoulli^ himself and by
 Laplace.^ It leads easily to the further formula —

 $x = (a + x_1)^{p_1} (a + x_2)^{p_2} \dots$,

 where $a$ is the initial ' physical ' fortune, $p_1$, etc., the probabilities
 of obtaining increments $x_1$, etc., to $a$, and $x$ the ' physical ' fortune
 whose present possession would yield the same ' moral ' fortune
 as does the expectation of the various increments $x_1$, etc. By
 means of this formula Bernoulli shows that a man whose fortune 
 is £1000 may reasonably pay a £6 stake in order to play the 
 Petersburg game with £1 units. Bernoulli also mentions two 
 solutions proposed by Cramer. In the first all sums greater
 than $2^{24}$ (16,777,216) are regarded as ' morally ' equal ; this
 leads to £13 as the fair stake. According to the other formula 
 the pleasure derivable from a sum of money varies as the square 
 root of the sum ; this leads to £2 : 9s. as the fair stake. But 
 little object is served by following out these arbitrary hypotheses. 
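
 These figures can be roughly reproduced. The sketch below is only an approximation under stated assumptions: the prize at toss r is 2^(r-1) units with probability (1/2)^r, the series is cut off after 200 tosses, the effect of paying the stake on the player's fortune is ignored, and Cramer's cap is read as 2^24. The function names are illustrative and are not Bernoulli's or Cramer's own.

```python
import math
from fractions import Fraction

MAX_TOSSES = 200  # truncation point; the neglected tail is far below float precision

def bernoulli_equivalent_gain(initial_fortune):
    """x - a, where x is the product of (a + 2^(r-1))^(1/2^r) over the tosses r."""
    log_x = sum(
        0.5**r * math.log(initial_fortune + 2.0 ** (r - 1))
        for r in range(1, MAX_TOSSES + 1)
    )
    return math.exp(log_x) - initial_fortune

def cramer_capped_value(cap_exponent=24):
    """Expectation when every prize above 2^cap_exponent counts as 2^cap_exponent."""
    cap = 2**cap_exponent
    return sum(
        Fraction(min(2 ** (r - 1), cap), 2**r) for r in range(1, MAX_TOSSES + 1)
    )

def cramer_square_root_value():
    """Sum of money whose square root equals the expected square root of the prize."""
    expected_root = sum(
        0.5**r * math.sqrt(2.0 ** (r - 1)) for r in range(1, MAX_TOSSES + 1)
    )
    return expected_root**2

gain_for_1000 = bernoulli_equivalent_gain(1000.0)   # close to 6 units
capped = float(cramer_capped_value())               # close to 13 units
square_root_rule = cramer_square_root_value()       # a little under 3 units
```

 On these assumptions the logarithmic rule gives about 6 units for a fortune of 1000, the capped rule gives 13, and the square-root rule gives a little under 3; the exact values shift with the hypotheses chosen, which illustrates how arbitrary they are.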
 
 As a solution of the Petersburg problem this line of thought 
 is only partially successful : if increases of ' physical ' fortune 
 beyond a certain finite limit can be regarded as ' morally ' 
 negligible, Peter's claim for an infinite initial stake from Paul is,
 it is true, no longer equitable, but with any reasonable law of 
 diminution for successive increments Paul's stake will still remain 
 paradoxically large. Daniel Bernoulli's suggestion is, however, 
 of considerable historical interest as being the first explicit 
 attempt to take account of the important conception known to 
 modern economists as the diminishing marginal utility of money, 
 — a conception on which many important arguments are founded 
 relating to taxation and the ideal distribution of wealth. 
 
 Each of the above solutions probably contains a part of the 
 psychological explanation. We are unwilling to be Paul, partly
 because we do not believe Peter will pay us if we have good 
 fortune in the tossing, partly because we do not know what we 
 should do with so much money or sand or hydrogen if we won it, 
 partly because we do not believe we ever should win it, and 
 partly because we do not think it would be a rational act to risk 
 
 1 " Specimen Theoriae Novae de Mensura Sortis," Comm. Acad. Petrop.
 vol. v. for 1730 and 1731, pp. 175-192 (published 1738). See Todhunter, pp.
 213 et seq.

 2 Théorie analytique, chap. x. " De l'espérance morale," pp. 432-445.
 
 
 an infinite sum or even a very large finite sum for an infinitely 
 larger one, whose attainment is infinitely unlikely. 
 
 When we have made the proper hypotheses and have elimin-
 ated these elements of psychological doubt, the theoretic dispersal
 of what element of paradox remains must be brought about, I
 think, by a development of the theory of risk. It is primarily
 the great risk of the wager which deters us. Even in the case
 where the number of tosses is in no case to exceed a finite number,
 the risk $R$, as already defined, may be very great, and the relative
 risk $\frac{R}{E}$ will be almost unity. Where there is no limit to the
 number of tosses, the risk is infinite. A relative risk, which
 approaches unity, may, it has been already suggested, be a factor
 which must be taken into account in ethical calculation.
 
 10. In establishing the doctrine, that all private gambling
 must be with certainty a losing game, precisely contrary argu-
 ments are employed to those which do service in the Petersburg
 problem. The argument that " you must lose if only you go on
 long enough " is well known. It is succinctly put by Laurent : ^
 Two players A and B have $a$ and $b$ francs respectively. $f(a)$ is
 the chance that A will be ruined. Thus $f(a) = \frac{b}{a+b}$,^ so that
 the poorer a gambler is, relatively to his opponent, the more
 likely he is to be ruined. But further, if $b = \infty$, $f(a) = 1$, i.e. ruin
 is certain. The infinitely rich gambler is the public. It is against 
 the public that the professional gambler plays, and his ruin is 
 therefore certain. 
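
 The formula can be checked under assumptions which the argument leaves unstated: a fair game, unit stakes, and play continued until one side is ruined. The sketch verifies that b / (a + b) satisfies the recurrence for such a game and compares it with a seeded simulation; the capitals 3 and 7 are hypothetical.

```python
import random
from fractions import Fraction

def ruin_probability(a, b):
    """f(a) = b / (a + b): chance that A, holding a against b, is ruined."""
    return Fraction(b, a + b)

def satisfies_fair_game_recurrence(total):
    """Check u(i) = (total - i)/total obeys u(i) = (u(i-1) + u(i+1))/2, u(0)=1, u(total)=0."""
    u = [Fraction(total - i, total) for i in range(total + 1)]
    interior_ok = all(u[i] == (u[i - 1] + u[i + 1]) / 2 for i in range(1, total))
    return u[0] == 1 and u[total] == 0 and interior_ok

def simulate_ruin(a, b, trials, seed=0):
    """Fraction of simulated games, played to the end, in which A loses everything."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(trials):
        capital = a
        while 0 < capital < a + b:
            capital += 1 if rng.random() < 0.5 else -1
        ruined += capital == 0
    return ruined / trials

exact = ruin_probability(3, 7)
estimate = simulate_ruin(3, 7, trials=20000)
```

 The simulated frequency should lie near the exact value of 7/10; as b is increased with a fixed, the exact value approaches unity, which is the limiting case the argument relies upon.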
 
 Might not Poisson and Condorcet reply, The conditions of 
 the game imply contradiction, for no gambler plays, as this argu-
 ment supposes, for ever ? ^ At the end of any finite quantity of
 play, the player, even if he is not the public, may finish with 
 winnings of any finite size. The gambler is in a worse position if 
 his capital is smaller than his opponents' — at poker, for instance, 
 or on the Stock Exchange. This is clear. But our desire for 
 moral improvement outstrips our logic if we tell him that he 
 must lose. Besides it is paradoxical to say that everybody 
 
 1 Calcul des probabilités, p. 129.

 2 This would possibly follow from the theorem of Daniel Bernoulli. The
 reasoning by which Laurent obtains it seems to be the result of a mistake.

 3 Cf. also Mr. Bradley, Logic, p. 217.
 
 
 individually must lose and that everybody collectively must win.
 For every individual gambler who loses there is an individual 
 gambler or syndicate of gamblers who win. The true moral is 
 this, that poor men should not gamble and that millionaires 
 should do nothing else. But millionaires gain nothing by gam-
 bling with one another, and until the poor man departs from the
 path of prudence the millionaire does not find his opportunity.
 If it be replied that in fact most millionaires are men originally 
 poor who departed from the path of prudence, it must be 
 admitted that the poor man is not doomed with certainty. 
 Thus the philosopher must draw what comfort he can from the 
 conclusion with which his theory furnishes him, that million- 
 aires are often fortunate fools who have thriven on unfortunate 
 ones.^ 
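
 The reply that no gambler plays for ever can be given a numerical form. The sketch below assumes fair unit bets against an opponent who cannot be exhausted, and computes exactly the chance of ruin within a limited number of rounds; the starting capital of one unit is hypothetical.

```python
from fractions import Fraction

def ruin_within(capital, rounds):
    """Exact chance of losing the whole capital within `rounds` fair unit bets."""
    # position maps current capital -> probability of holding it, not yet ruined
    position = {capital: Fraction(1)}
    ruined = Fraction(0)
    for _ in range(rounds):
        following = {}
        for held, chance in position.items():
            for step in (-1, 1):
                if held + step == 0:
                    ruined += chance / 2
                else:
                    following[held + step] = (
                        following.get(held + step, Fraction(0)) + chance / 2
                    )
        position = following
    return ruined

after_one = ruin_within(1, 1)
after_three = ruin_within(1, 3)
after_fifty = ruin_within(1, 50)
```

 Even for the poorest player the chance of ruin after any finite quantity of play falls short of certainty, although it creeps towards unity as the play is prolonged.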
 
 11. In conclusion we may discuss a little further the concep- 
 tion of ' moral ' risk, raised in § 8 and at the end of § 9. Bernoulli's 
 formula crystallises the undoubted truth that the value of a sum 
 of money to a man varies according to the amount he already 
 possesses. But does the value of an amount of goodness also 
 vary in this way ? May it not be true that the addition of a given 
 good to a man who already enjoys much good is less good than 
 its bestowal on a man who has little ? If this is the case, it 
 follows that a smaller but relatively certain good is better than
 a greater but proportionately more uncertain good. 
 
 In order to assert this, we have only to accept a particular 
 theory of organic goodness, applications of which are common 
 enough in the mouths of political philosophers. It is at the root 
 of all principles of equality, which do not arise out of an assumed
 diminishing marginal utility of money. It is behind the numerous 
 arguments that an equal distribution of benefits is better than a 
 very unequal distribution. If this is the case, it follows that, the 
 sum of the goods of all parts of a community taken together 
 being fixed, the organic good of the whole is greater the more
 equally the benefits are divided amongst the individuals. If the 
 doctrine is to be accepted, moral risks, like financial risks, must 
 not be undertaken unless they promise a profit actuarially. 
 
 1 From the social point of view, however, this moral against gambling may
 be drawn — that those who start with the largest initial fortunes are most likely 
 to win, and that a given increment to the wealth of these benefits them, on the 
 assumption of a diminishing marginal utility of money, less than it injures those 
 from whom it is taken. 
 
 
 There is a great deal which could be said concerning such a 
 doctrine, but it would lead too far from what is relevant to the 
 study of Probability. One or two instances of its use, however, 
 may be taken from the literature of Probability. In his essay,
 " Sur l'application du calcul des probabilités à l'inoculation de
 la petite vérole," ^ D'Alembert points out that the community
 would gain on the average if, by sacrificing the lives of one in five 
 of its citizens, it could ensure the health of the rest, but he argues 
 that no legislator could have the right to order such a sacrifice. 
 Galton, in his Probability, the Foundation of Eugenics, employed 
 an argument which depends essentially on the same point. 
 Suppose that the members of a certain class cause an average 
 detriment M to society, and that the mischiefs done by the 
 several individuals differ more or less from M by amounts whose
 average is D, so that D is the average amount of the individual 
 deviations, all regarded as positive, from M ; then, Galton argued, 
 the smaller D is, the stronger is the justification for taking such 
 drastic measures against the propagation of the class as would 
 be consonant to the feelings, if it were known that each individual 
 member caused a detriment M. The use of such arguments 
 seems to involve a qualification of the simple ethical doctrine 
 that right action should make the sum of the benefits of the 
 several individual consequences, each multiplied by its prob- 
 ability, a maximum. 
 
 On the other hand, the opposite view is taken in the Port Royal 
 Logic and by Butler, when they argue that everything ought to 
 be sacrificed for the hope of heaven, even if its attainment be 
 thought infinitely improbable, since " the smallest degree of 
 facility for the attainment of salvation is of higher value than 
 all the blessings of the world put together." ^ The argument is, 
 that we ought to follow a course of conduct which may with the
 slightest probability lead to an infinite good, until it is logically
 disproved that such a result of our action is impossible. The
 Emperor who embraced the Roman Catholic religion, not because
 
 ¹ Opuscules mathématiques, vol. ii.

 ² Port Royal Logic (Eng. trans.), p. 369 : "It belongs to infinite things alone,
 as eternity and salvation, that they cannot be equalled by any temporal advan- 
 tage ; and thus we ought never to place them in the balance with any of the 
 things of the world. This is why the smallest degree of facility for the attain- 
 ment of salvation is of higher value than all the blessings of the world put 
 together. . . ." 
 
 he believed it, but because it offered insurance against a disaster
 whose future occurrence, however improbable, he could not 
 certainly disprove, may not have considered, however, whether 
 the product of an infinitesimal probability and an infinite good 
 might not lead to a finite or infinitesimal result. In any case the 
 argument does not enable us to choose between different courses 
 of conduct, unless we have reason to suppose that one path is 
 more likely than another to lead to infinite good. 
 
 12. In estimating the risk, 'moral' or 'physical,' it must be
 remembered that we cannot necessarily apply to individual
 cases results drawn from the observation of a long series
 resembling them in some particular. I am thinking of such
 arguments as Buffon's when he names 1/10,000 as the limit, beyond
 which probability is negligible, on the ground that, being the
 chance that a man of fifty-six taken at random will die within a
 day, it is practically disregarded by a man of fifty-six who knows
 his health to be good. "If a public lottery," Gibbon truly pointed
 out, "were drawn for the choice of an immediate victim, and if
 our name were inscribed on one of the ten thousand tickets,
 should we be perfectly easy?"
 
 Bernoulli's second axiom,¹ that in reckoning a probability
 we must take everything into account, is easily forgotten in these 
 cases of statistical probabilities. The statistical result is so 
 attractive in its definiteness that it leads us to forget the more 
 vague though more important considerations which may be, in a 
 given particular case, within our knowledge. To a stranger the 
 probability that I shall send a letter to the post unstamped may 
 be derived from the statistics of the Post Office ; for me those 
 figures would have but the slightest bearing upon the question. 
 
 13. It has been pointed out already that no knowledge of 
 probabilities, less in degree than certainty, helps us to know what 
 conclusions are true, and that there is no direct relation between 
 the truth of a proposition and its probability. Probability begins 
 and ends with probability. That a scientific investigation 
 pursued on account of its probability will generally lead to truth,
 rather than falsehood, is at the best only probable. The pro- 
 position that a course of action guided by the most probable 
 considerations will generally lead to success, is not certainly true 
 and has nothing to recommend it but its probability. 
 
 ¹ See p. 76.
 
 The importance of probability can only be derived from the 
 judgment that it is rational to be guided by it in action ; and a 
 practical dependence on it can only be justified by a judgment 
 that in action we ought to act to take some account of it. It is 
 for this reason that probability is to us the " guide of life," since 
 to us, as Locke says, " in the greatest part of our concernment, 
 God has afforded only the Twilight, as I may so say, of
 Probability, suitable, I presume, to that state of Mediocrity and
 Probationership He has been pleased to place us in here." 
 
PART V 
 
 THE FOUNDATIONS OF STATISTICAL 
 INFERENCE 
 
CHAPTER XXVII 
 
 THE NATURE OF STATISTICAL INFERENCE
 
 1. The Theory of Statistics, as it is now understood,¹ can be
 divided into two parts which are for many purposes better kept
 distinct. The first function of the theory is purely descriptive.
 It devises numerical and diagrammatic methods by which certain 
 salient characteristics of large groups of phenomena can be briefly 
 described ; and it provides formulae by the aid of which we can 
 measure or summarise the variations in some particular character 
 which we have observed over a long series of events or instances. 
 The second function of the theory is inductive. It seeks to extend
 its description of certain characteristics of observed events to 
 the corresponding characteristics of other events which have not 
 been observed. This part of the subject may be called the 
 Theory of Statistical Inference ; and it is this which is closely 
 bound up with the theory of probability. 
 
 2. The union of these two distinct theories in a single science 
 is natural. If, as is generally the case, the development of 
 some inductive conclusion which shall go beyond the actually 
 observed instances is our ultimate object, we naturally choose 
 those modes of description, while we are engaged in our pre- 
 liminary investigation, which are most capable of extension 
 beyond the particular instances which they primarily describe. 
 But this union is also the occasion of a great deal of confusion. The 
 statistician, who is mainly interested in the technical methods of 
 his science, is less concerned to discover the precise conditions in 
 which a description can be legitimately extended by induction. 
 He slips somewhat easily from one to the other, and having 
 found a complete and satisfactory mode of description he 
 
 ¹ See Yule, Introduction to Statistics, pp. 1-5, for a very interesting account
 of the evolution of the meaning of the term statistics.
 
 may take less pains over the transitional argument, which is 
 to permit him to use this description for the purposes of 
 generalisation. 
 
 One or two examples will show how easy it is to slip from 
 description into generalisation. Suppose that we have a series 
 of similar objects one of the characteristics of which is under
 observation ; — a number of persons, for example, whose age at 
 death has been recorded. We note the proportion who die at 
 each age, and plot a diagram which displays these facts graphic- 
 ally. We then determine by some method of curve fitting a 
 mathematical frequency curve which passes with close approxima- 
 tion through the points of our diagram. If we are given the 
 equation to this curve, the number of persons who are comprised 
 in the statistical series, and the degree of approximation (whether 
 to the nearest year or month) with which the actual age has been 
 recorded, we have a very complete and succinct account of one 
 particular characteristic of what may constitute a very large 
 mass of individual records. In providing this comprehensive 
 description the statistician has fulfilled his first function. But in 
 determining the accuracy with which this frequency curve can be 
 employed to determine the probability of death at a given age 
 in the population at large, he must pay attention to a new class 
 of considerations and must display a different kind of capacity. 
 He must take account of whatever extraneous knowledge may be 
 available regarding the sample of the population which came 
 under observation, and of the mode and conditions of the observa- 
 tions themselves. Much of this may be of a vague kind, and most 
 of it will be necessarily incapable of exact, numerical, or statistical 
 treatment. He is faced, in fact, with the normal problems of 
 inductive science, one of the data, which must be taken into 
 account, being given in a convenient and manageable form by
 the methods of descriptive statistics. 
 
 Or suppose, again, that we are given, over a series of years, 
 the marriage rate and the output of the harvest in a certain area 
 of population. We wish to determine whether there is any 
 apparent degree of correspondence between the variations of the 
 two within this field of observation. It is technically difficult to
 measure such degree of correspondence as may appear to exist 
 between the variations in two series, the terms of which are in 
 some manner associated in couples, by coincidence, in this case,
 
 of time and place. By the method of correlation tables and 
 correlation coefficients the descriptive statistician is able to effect 
 this object, and to present the inductive scientist with a highly 
 significant part of his data in a compact and instructive form. 
 But the statistician has not, in calculating these coefficients of 
 observed correlation, covered the whole ground of which the in- 
 ductive scientist must take cognisance. He has recorded the 
 results of the observations in circumstances where they cannot
 be recorded so clearly without the aid of technical methods ; but 
 the precise nature of the conditions in which the observations 
 took place and the numerous other considerations of one sort or 
 another, of which we must take account when we wish to 
 generalise, are not usually susceptible of numerical or statistical 
 expression. 
 
 The truth of this is obvious ; yet, not unnaturally, the more 
 complicated and technical the preliminary statistical investigations 
 become, the more prone inquirers are to mistake the statistical 
 description for an inductive generalisation.¹ This tendency,
 which has existed in some degree, as, I think, the whole history of
 the subject shows, from the eighteenth century down to the 
 present time, has been further encouraged by the terminology in 
 ordinary use. For several statistical coefficients are given the 
 same name when they are used for purely descriptive purposes, 
 as when corresponding coefficients are used to measure the force 
 or the precision of an induction. The term ' probable error,' 
 for example, is used both for the purpose of supplementing
 and improving a statistical description, and for the
 purpose of indicating the precision of some generalisation. 
 The term 'correlation' itself is used both to describe an
 observed characteristic of particular phenomena and in the 
 enunciation of an inductive law which relates to phenomena 
 in general. 
 
 3. I have been at pains to enforce this contrast between
 statistical description and statistical induction, because the
 chapters which follow are to be entirely about the latter, whereas
 nearly all statistical treatises are mainly concerned with the
 former. My object will be to analyse, so far as I can, the logical
 
 ¹ Cf. Whitehead, Introduction to Mathematics, p. 27 : "There is no more
 common error than to assume that, because prolonged and accurate mathe- 
 matical calculations have been made, the application of the result to some fact 
 of nature is absolutely certain." 
 
 basis of statistical modes of argument. This involves a double 
 task. To mark down those which are invalid amongst arguments
 having the support of authority is relatively easy.
 The other branch of our investigation, namely, to analyse 
 the ground of validity in the case of those arguments the 
 force of which all of us do in fact admit, presents the same 
 kind of fundamental difficulties as we met with in the case 
 of Induction. 
 
 4. The arguments with which we have to deal fall into three 
 main classes : 
 
 (i.) Given the probability relative to certain evidence of each 
 of a series of events, what are the probabilities, relative to the 
 same evidence, of various proportionate frequencies of occurrence 
 for the events over the whole series ? Or more briefly, how often 
 may we expect an event to happen over a series of occasions, given 
 its probability on each occasion ? 
 
 (ii.) Given the frequency with which an event has occurred 
 on a series of occasions, with what probability may we expect it 
 on a further occasion ? 
 
 (iii.) Given the frequency with which an event has occurred 
 on a series of occasions, with what frequency may we probably 
 expect it on a further series of occasions ? 
 
 In the first type of argument we seek to infer an unknown
 statistical frequency from an à priori probability. In the second
 type we are engaged on the inverse operation, and seek to base 
 the calculation of a probability on an observed statistical fre- 
 quency. In the third type we seek to pass from an observed 
 statistical frequency, not merely to the probability of an individual 
 occurrence, but to the probable values of other unknown statistical 
 frequencies. 
 
 Each of these types of argument can be further complicated
 by being applied not simply to the occurrence of a simple event
 but to the concurrence under given conditions of two or more
 events. When this two or more dimensional classification re- 
 places the one dimensional, the theory becomes what is some- 
 times termed Correlation, as distinguished from simple Statis- 
 tical Frequency. 
 
 5. In Chapter XXVIII. I touch briefly on the observed 
 phenomena which have given rise to the so-called Law of 
 Great Numbers, and the discovery of which first set statistical 
 
 investigation going. In Chapter XXIX. the first type of argument,
 as classified above, is analysed, and the conditions which
 are required for its validity are stated. The crucial problem
 of attacking the second and third types of argument is the 
 subject of my concluding chapters. 
 
CHAPTER XXVIII
 
 THE LAW OF GREAT NUMBERS 
 
 Natura quidem suas habet consuetudines, natas ex reditu causarum, sed non
 nisi ὡς ἐπὶ τὸ πολύ. Novi morbi inundant subinde humanum genus, quodsi
 ergo de mortibus quotcunque experimenta feceris, non ideo naturae rerum limites
 posuisti, ut pro futuro variare non possit.
 LEIBNIZ, in a letter to Bernoulli, December 3, 1703.
 
 1. It has always been known that, while some sets of events 
 invariably happen together, other sets generally happen together.
 That experience shows one thing, while not always a sign of 
 another, to be a usual or probable sign of it, must have been one 
 of the earliest and most primitive forms of knowledge. If a dog 
 is generally given scraps at table, that is sufficient for him to judge
 it reasonable to be there. But this kind of knowledge was slow
 to be made precise. Numerous experiments must be carefully 
 recorded before we can know at all accurately how usual the 
 association is. It would take a dog a long time to find out that 
 he was given scraps except on fast days, and that there was the 
 same number of these in every year. 
 
 The necessary kind of knowledge began to be accumulated 
 during the seventeenth and eighteenth centuries by the early 
 statisticians. Halley and others began to construct mortality 
 tables ; the proportion of the births of each sex were tabulated ; 
 and so forth. These investigations brought to light a new fact
 which had not been suspected previously: namely, that in certain
 cases of partial association the degree of association, i.e. the pro- 
 portion of instances in which it existed, shows a very surprising 
 regularity, and that this regularity becomes more marked the 
 greater the number of the instances under consideration. It was
 found, for example, not merely that boys and girls are born on 
 the whole in about equal proportions, but that the proportion, 
 
 which is not one of complete equality, tends everywhere, when
 the number of recorded instances becomes large, to approximate 
 towards a certain definite figure. 
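
 The kind of regularity described here can be imitated numerically. What
 follows is only a minimal sketch and records no real statistics of births:
 it assumes a constant chance `P_BOY = 0.512` (a hypothetical figure chosen
 purely for illustration) and independent trials, which are exactly the
 assumptions that the later chapters subject to criticism. The function name
 `running_proportions` and the seed are likewise assumptions of the sketch.

```python
import random

P_BOY = 0.512  # hypothetical, assumed-constant chance of a male birth


def running_proportions(p, checkpoints, seed=0):
    """Simulate independent trials of chance p and return the observed
    proportion of successes at each checkpoint (an ascending list of counts)."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    successes = 0
    done = 0
    result = {}
    for n in checkpoints:
        while done < n:
            successes += rng.random() < p  # True counts as 1
            done += 1
        result[n] = successes / n
    return result


if __name__ == "__main__":
    table = running_proportions(P_BOY, [100, 10_000, 1_000_000])
    for n, freq in table.items():
        print(f"{n:>9} instances: observed proportion {freq:.4f}")
```

 On such assumptions the observed proportion settles towards the assumed
 chance as the instances multiply; whether actual births satisfy the
 assumptions is a separate question which the simulation cannot answer.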
 
 During the eighteenth century matters were not pushed much 
 further than this, that in certain cases, of which comparatively 
 few were known, there was this surprising regularity, increasing 
 in degree as the instances became more numerous. Bernoulli, 
 however, took the first step towards giving it a theoretical basis 
 by showing that, if the à priori probability is known throughout,
 then (subject to certain conditions which he himself did not make
 clear) in the long run a certain determinate frequency of occurrence
 is to be expected. Süssmilch (Die göttliche Ordnung in den
 Veränderungen des menschlichen Geschlechts, 1741) discovered a
 theological interest in these regularities. Such ideas had become
 sufficiently familiar for Gibbon to characterise the results of
 probability as "so true in general, so fallacious in particular."
 Kant found in them (as many later writers have done) some
 bearing on the problem of Free Will.¹
 
 But with the nineteenth century came bolder theoretical 
 methods and a wider knowledge of facts. After proving his 
 extension of Bernoulli's Theorem,² Poisson applied it to the
 observed facts, and gave to the principle underlying these
 regularities the title of the Law of Great Numbers. "Les choses
 de toutes natures," he wrote,³ "sont soumises à une loi universelle
 qu'on peut appeler la loi des grands nombres. . . . De ces
 exemples de toutes natures, il résulte que la loi universelle des
 grands nombres est déjà pour nous un fait général et incontestable,
 résultant d'expériences qui ne se démentent jamais." This is
 the language of exaggeration ; it is also extremely vague. But 
 
 ¹ In Idee zu einer allgemeinen Geschichte in weltbürgerlicher Absicht, 1784. For
 a discussion of this passage and for the connection between Kant and Süssmilch,
 see Lottin's Quetelet, pp. 367, 368.

 ² See p. 345.

 ³ Recherches, pp. 7-12. Von Bortkiewicz (Kritische Betrachtungen, 1st part,
 pp. 655-660) has maintained that Poisson intended to state his principle in a
 less general way than that in which it has been generally taken, and that he was
 misunderstood by Quetelet and others. If we attend only to Poisson's
 contributions to Comptes Rendus in 1835 and 1836 and to the examples he gives
 there, it is possible to make out a good case for thinking that he intended his 
 law to extend only to cases where certain strict conditions were fulfilled. But 
 this is not the spirit of his more popular writings or of the passage quoted above. 
 At any rate, it is the fashion, in which Poisson influenced his contemporaries, 
 that is historically interesting ; and this is certainly not represented by Von 
 Bortkiewicz's interpretation. 
 
 it is exciting ; it seems to open up a whole new field to scientific 
 investigation ; and it has had a great influence on subsequent 
 thought. Poisson seems to claim that, in the whole field of chance 
 and variable occurrence, there really exists, amidst the apparent 
 disorder, a discoverable system. Constant causes are always 
 at work and assert themselves in the long run, so that each class 
 of event does eventually occur in a definite proportion of cases. 
 It is not clear how far Poisson's result is due to à priori reasoning,
 and how far it is a natural law based on experience ; but it is
 represented as displaying a certain harmony between natural
 law and the à priori reasoning of probabilities.
 
 Poisson's conception was mainly popularised through the 
 writings of Quetelet. In 1823 Quetelet visited Paris on an 
 astronomical errand, where he was introduced to Laplace and 
 came into touch with "la grande école française." "Ma jeunesse
 et mon zèle," he wrote in later years, "ne tardèrent pas à me
 mettre en rapport avec les hommes les plus distingués de cette
 époque ; qu'on me permette de citer Fourier, Poisson, Lacroix,
 spécialement connus, comme Laplace, par leurs excellents écrits
 sur la théorie mathématique des probabilités. . . . C'est donc
 au milieu des savants, statisticiens, et économistes de ce temps
 que j'ai commencé mes travaux."¹ Shortly afterwards began
 his long series of papers, extending down to 1873, on the application
 of Probability to social statistics. He wrote a text-book
 on Probability in the form of letters for the instruction of the
 Prince Consort. 
 
 Before accepting in 1815 at the age of nineteen (with a view to 
 a livelihood) a professorship of mathematics, Quetelet had studied 
 as an art student and written poetry ; a year later an opera, of
 which he was part-author, was produced at Ghent. The character 
 of his scientific work is in keeping with these beginnings. There 
 is scarcely any permanent, accurate contribution to knowledge 
 which can be associated with his name. But suggestions, pro- 
 jects, far-reaching ideas he could both conceive and express, and 
 he has a very fair claim, I think, to be regarded as the parent of 
 modern statistical method. 
 
 Quetelet very much increased the number of instances of the 
 
 ¹ For the details of the life of Quetelet and for a very full discussion of his
 writings with special reference to Probability, see Lottin's Quetelet, statisticien et
 
 Law of Great Numbers, and also brought into prominence a 
 slightly variant type of it, of which a characteristic example is 
 the law of height, according to which the heights of any consider- 
 able sample taken from any population tend to group themselves 
 according to a certain well-known curve. His instances were 
 chiefly drawn from social statistics, and many of them were of a 
 kind well calculated to strike the imagination: the regularity of
 the number of suicides, "l'effrayante exactitude avec laquelle
 les crimes se reproduisent," and so forth. Quetelet writes 
 with an almost religious awe of these mysterious laws, and 
 certainly makes the mistake of treating them as being as 
 adequate and complete in themselves as the laws of physics, 
 and as little needing any further analysis or explanation.¹
 Quetelet's sensational language may have given a considerable 
 impetus to the collection of social statistics, but it also involved 
 statistics in a slight element of suspicion in the minds of some 
 who, like Comte, regarded the application of the mathematical 
 calculus of probability to social science as "purement chimérique
 et, par conséquent, tout à fait vicieuse." The suspicion of
 quackery has not yet disappeared. Quetelet belongs, it must be 
 admitted, to the long line of brilliant writers, not yet extinct, who 
 have prevented Probability from becoming, in the scientific salon, 
 perfectly respectable. There is still about it for scientists a 
 smack of astrology, of alchemy. 
 
 The progress of the conception since the time of Quetelet has 
 been steady and uneventful ; and long strides towards this perfect 
 respectability have been taken. Instances have been multiplied 
 and the conditions necessary for the existence of statistical 
 stability have been to some extent analysed. While the most 
 fruitful applications of these methods have still been perhaps,
 as at first, in social statistics and in errors of observation, a
 number of uses for them have been discovered in quite recent
 times in the other sciences ; and the principles of Mendelism
 have opened out for them a great field of application throughout 
 biology. 
 
 ¹ Compare, for instance, the following passage from Recherches sur le penchant
 au crime : "Il me semble que ce qui se rattache à l'espèce humaine, considérée
 en masse, est de l'ordre des faits physiques ; plus le nombre des individus est
 grand, plus la volonté individuelle s'efface et laisse prédominer la série des faits
 généraux qui dépendent des causes générales. . . . Ce sont ces causes qu'il
 s'agit de saisir, et dès qu'on les connaîtra, on en déterminera les effets pour la
 société comme on détermine les effets par les causes dans les sciences physiques."
 
 2. The existence of numerous instances of the Law of Great 
 Numbers, or of something of the kind, is absolutely essential for 
 the importance of Statistical Induction. Apart from this the more 
 precise parts of statistics, the collection of facts for the prediction 
 of future frequencies and associations, would be nearly useless. 
 But the ' Law of Great Numbers ' is not at all a good name for the 
 principle which underlies Statistical Induction. The ' Stability 
 of Statistical Frequencies ' would be a much better name for it. 
 The former suggests, as perhaps Poisson intended to suggest, but 
 what is certainly false, that every class of event shows statistical 
 regularity of occurrence if only one takes a sufficient number of 
 instances of it. It also encourages the method of procedure, by 
 which it is thought legitimate to take any observed degree of 
 frequency or association, which is shown in a fairly numerous 
 set of statistics, and to assume with insufficient investigation 
 that, because the statistics are numerous, the observed degree of
 frequency is therefore stable. Observation shows that some
 statistical frequencies are, within narrower or wider limits, stable. 
 But stable frequencies are not very common, and cannot be 
 assumed lightly. 
 
 The gradual discovery, that there are certain classes of 
 phenomena, in which, though it is impossible to predict what will 
 happen in each individual case, there is nevertheless a regularity 
 of occurrence if the phenomena be considered together in successive
 sets, gives the clue to the abstract inquiry upon which we
 are about to embark. 
 
CHAPTER XXIX 
 
 THE USE OF À PRIORI PROBABILITIES FOR THE PREDICTION OF
 STATISTICAL FREQUENCY: THE THEOREMS OF BERNOULLI,
 POISSON, AND TCHEBYCHEFF
 
 Hoc igitur est illud Problema, quod evulgandum hoc loco proposui,
 postquam jam per vicennium pressi, et cujus tum novitas, tum summa utilitas cum
 pari conjuncta difficultate omnibus reliquis hujus doctrinae capitibus pondus
 et pretium superaddere potest. BERNOULLI.¹
 
 1. Bernoulli's Theorem is generally regarded as the central 
 theorem of statistical probability. It embodies the first attempt 
 to deduce the measures of statistical frequencies from the measures 
 of individual probabilities, and it is a sufficient fruit of the twenty 
 years which Bernoulli alleges that he spent in reaching his result, 
 if out of it the conception first arose of general laws amongst 
 masses of phenomena, in spite of the uncertainty of each particular
 case. But, as we shall see, the theorem is only valid subject
 to stricter qualifications, than have always been remembered, 
 and in conditions which are the exception, not the rule. 
 
 The problem, to be discussed in this chapter, is as follows : 
 Given a series of occasions, the probability² of the occurrence
 of a certain event at each of which is known relative to certain 
 initial data h, on what proportion of these occasions may we 
 reasonably anticipate the occurrence of the event ? Given, that 
 is to say, the individual probability of each of a series of events 
 à priori, what statistical frequency of occurrence of these events
 is to be anticipated over the whole series ? Beginning with 
 Bernoulli's Theorem, we will consider the various solutions of 
 this problem which have been propounded, and endeavour to 
 
 ¹ Ars Conjectandi, p. 227.

 ² In the simplest cases, dealt with by Bernoulli, these probabilities are all
 supposed equal. 
 
 determine the proper limits within which each method has
 validity. 
 
 2. Bernoulli's Theorem in its simplest form is as follows : If
 the probability of an event's occurrence under certain conditions
 is p, then, if these conditions are present on m occasions, the most
 probable number of the event's occurrences is mp (or the nearest
 integer to this), i.e. the most probable proportion of its occurrences
 to the total number of occasions is p ; further, the probability
 that the proportion of the event's occurrences will diverge from
 the most probable proportion p by less than a given amount b,
 increases as m increases, the value of this probability being
 calculable by a process of approximation.
 
 The probability of the event's occurring n times and failing
 m - n times out of the m occasions is (subject to certain conditions
 to be elucidated later) $p^{n} q^{m-n}$ multiplied by the coefficient of
 this expression in the expansion of $(p + q)^{m}$, where p + q = 1. If
 we write n = mp - h, this term is

 $$\frac{m!}{(mp-h)!\,(mq+h)!}\; p^{mp-h}\, q^{mq+h}.$$

 It is easily shown that this is a maximum when h = 0, i.e. when n = mp
 (or the nearest integer to this, where mp is not integral). This
 result constitutes the first part of Bernoulli's Theorem.
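
 The first part of the theorem is easily checked by direct computation
 for small cases. The sketch below is an illustration only; the function
 names and the values of m and p are assumptions introduced here, and the
 independence of the occasions is taken for granted. It evaluates every
 term of the expansion of $(p + q)^{m}$ and picks out the greatest.

```python
from math import comb


def binomial_term(m, n, p):
    """Probability of exactly n occurrences on m occasions, each of chance p."""
    return comb(m, n) * p**n * (1 - p) ** (m - n)


def most_probable_count(m, p):
    """Return the n in 0..m whose binomial term is the greatest."""
    return max(range(m + 1), key=lambda n: binomial_term(m, n, p))


if __name__ == "__main__":
    # mp = 15 is integral in the first case; mp = 3.3 is not in the second.
    for m, p in ((50, 0.3), (10, 0.33)):
        print(f"m={m}, p={p}, mp={m * p:.2f}, "
              f"most probable n={most_probable_count(m, p)}")
```

 When mp is not integral, the exact maximising value is the integer part
 of (m + 1)p, which never differs from mp by as much as a unit.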
 
 For the second part of the theorem some method of approximation
 is required. Provided that m is large, we can simplify
 the expression

 $$\frac{m!}{(mp-h)!\,(mq+h)!}\; p^{mp-h}\, q^{mq+h}$$

 by means of Stirling's Theorem, and obtain as its approximate value

 $$\frac{1}{\sqrt{2\pi mpq}}\; e^{-\frac{h^{2}}{2mpq}}.$$

 As before, this is a maximum when h = 0, i.e. when n = mp.
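
 The closeness of the approximation may be judged by setting the exact
 term beside the value $e^{-h^{2}/2mpq}/\sqrt{2\pi mpq}$. The following is a
 sketch under stated assumptions: mp is taken to be integral, the figures
 m = 1000 and p = 1/2 are chosen merely for illustration, and the function
 names are introduced here.

```python
from math import comb, exp, pi, sqrt


def exact_term(m, p, h):
    """Exact probability of n = mp - h occurrences (mp assumed integral)."""
    n = round(m * p) - h
    return comb(m, n) * p**n * (1 - p) ** (m - n)


def approximate_term(m, p, h):
    """The approximation exp(-h^2 / 2mpq) / sqrt(2 pi mpq)."""
    mpq = m * p * (1 - p)
    return exp(-h * h / (2 * mpq)) / sqrt(2 * pi * mpq)


if __name__ == "__main__":
    m, p = 1000, 0.5  # illustrative values; here mpq = 250
    for h in (0, 10, 20):
        e, a = exact_term(m, p, h), approximate_term(m, p, h)
        print(f"h={h:>2}  exact={e:.6f}  approximate={a:.6f}")
```

 The exact term is computed with ordinary floating-point numbers, so the
 sketch is suited to values of m up to about a thousand.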
 
 It is possible, of course, by more complicated formulae to 
 obtain closer approximations than this.¹ But there is an objection,
 which can be raised to this approximation, quite distinct
 from the fact that it does not furnish a result correct to as many 
 places of decimals as it might. This is, that the approximation 
 is independent of the sign of h, whereas the original expression 
 is not thus independent. That is to say, the approximation 
 implies a symmetrical distribution for different values of h about 
 
 ¹ See, e.g., Bowley, Elements of Statistics, p. 298. The objection about to
 be raised does not apply to these closer approximations.
 
 the value for h = 0 ; while the expression under approximation
 is unsymmetrical. It is easily seen that this want of symmetry
 is appreciable unless mpq is large. We ought, therefore, to have
 laid it down as a condition of our approximation, not only that
 m must be large, but also that mpq must be large. Unlike most
 of my criticisms, this is a mathematical, rather than a logical
 point. I recur to it in § 15.
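
 The want of symmetry can be exhibited by comparing the exact terms for
 equal divergences on either side of mp. In the sketch below (an
 illustration only; the function names and figures are assumptions, and mp
 is taken to be integral) m is the same in both cases, but mpq is about 10
 in the first and 210 in the second.

```python
from math import comb


def term(m, p, n):
    """Exact probability of n occurrences on m occasions of chance p."""
    return comb(m, n) * p**n * (1 - p) ** (m - n)


def asymmetry_ratio(m, p, h):
    """Ratio of the exact terms at n = mp - h and n = mp + h.
    A symmetrical approximation would make this ratio exactly 1."""
    centre = round(m * p)
    return term(m, p, centre - h) / term(m, p, centre + h)


if __name__ == "__main__":
    for m, p in ((1000, 0.01), (1000, 0.3)):
        mpq = m * p * (1 - p)
        print(f"m={m}, p={p}, mpq={mpq:.1f}, "
              f"ratio={asymmetry_ratio(m, p, 5):.4f}")
```

 The ratio stands appreciably further from unity in the first case than in
 the second, although m is equally large in both.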
 
 "Par une fiction qui rendra les calculs plus faciles" (to quote
 Bertrand), we now replace the integer h by a continuous variable
 z and argue that the probability that the amount of the divergence
 from the most probable value mp will lie between z and z + dz,
 is

 $$\frac{1}{\sqrt{2\pi mpq}}\; e^{-\frac{z^{2}}{2mpq}}\, dz.$$ ¹
 
 This 'fiction' will do no harm so long as it is remembered that we
 are now dealing with a particular kind of approximation. The
 probability that the divergence h from the most probable value
 mp will be less than some given quantity a is, therefore,

 $$\frac{1}{\sqrt{2\pi mpq}} \int_{-a}^{+a} e^{-\frac{z^{2}}{2mpq}}\, dz.$$

 If we put $\frac{z}{\sqrt{2mpq}} = t$, this is equal to

 $$\frac{2}{\sqrt{\pi}} \int_{0}^{\frac{a}{\sqrt{2mpq}}} e^{-t^{2}}\, dt.$$
 
 Thus, if we write $a = \sqrt{2mpq}\,\gamma$, the probability² that the
 number of occurrences will lie between

 $$mp + \sqrt{2mpq}\,\gamma \quad\text{and}\quad mp - \sqrt{2mpq}\,\gamma$$

 is measured by $\frac{2}{\sqrt{\pi}} \int_{0}^{\gamma} e^{-t^{2}}\, dt$. This same expression measures
 
 ¹ The replacement of the integer h by the continuous variable z may render
 the formula rather deceptive. It is certain, for example, that the error does not
 lie between h and h+1.
 
 ² The above proof follows the general lines of Bertrand's (Calcul des
 probabilités, chap. iv.). Some writers, using rather more precision, give the result as

 $$\frac{2}{\sqrt{\pi}} \int_{0}^{\gamma} e^{-t^{2}}\, dt + \frac{e^{-\gamma^{2}}}{\sqrt{2\pi mpq}}$$

 (e.g. Laplace, by the use of Euler's Theorem, and more recently Czuber,
 
 the probability that the proportion of occurrences will lie
 between

 $$p + \sqrt{\frac{2pq}{m}}\,\gamma \quad\text{and}\quad p - \sqrt{\frac{2pq}{m}}\,\gamma.$$

 The different values of the integral $\frac{2}{\sqrt{\pi}} \int_{0}^{t} e^{-t^{2}}\, dt = \Theta(t)$ are given
 in tables.¹
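
 The integral $\Theta(\gamma) = \frac{2}{\sqrt{\pi}}\int_{0}^{\gamma} e^{-t^{2}}\,dt$ is the
 function now commonly called the error function, so that its values need
 no longer be sought in printed tables. The sketch below, which is an
 illustration under stated assumptions and no part of the proof, compares
 the approximate probability $\Theta(a/\sqrt{2mpq})$ with the exact sum of
 the terms for which the divergence from mp is less than a. The function
 names and the figures m = 1000, p = 1/2, a = 30 are assumptions.

```python
from math import comb, erf, sqrt


def exact_probability(m, p, a):
    """Exact chance that the number of occurrences differs from mp by less than a."""
    centre = m * p
    return sum(
        comb(m, n) * p**n * (1 - p) ** (m - n)
        for n in range(m + 1)
        if abs(n - centre) < a
    )


def theta(gamma):
    """Theta(gamma) = (2 / sqrt(pi)) * integral from 0 to gamma of exp(-t^2) dt."""
    return erf(gamma)


def approximate_probability(m, p, a):
    """The approximation Theta(a / sqrt(2mpq)) for the same chance."""
    return theta(a / sqrt(2 * m * p * (1 - p)))


if __name__ == "__main__":
    m, p, a = 1000, 0.5, 30  # illustrative values only
    print("exact      ", exact_probability(m, p, a))
    print("approximate", approximate_probability(m, p, a))
```

 The two values differ by a small amount which arises from replacing the
 integer divergence by a continuous variable.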
 
 The probability that the proportion of occurrences will lie
 between given limits varies with the magnitude of $\sqrt{pq/m}$, and
 this expression is sometimes used, therefore, to measure the
 'precision' of the series. Given the à priori probabilities, the
 precision varies inversely with the square root of the number of
 instances. Thus, while the probability that the absolute divergence
 will be less than a given amount a decreases, the probability
 that the corresponding proportionate divergence (i.e. the absolute
 divergence divided by the number of instances) will be less than
 a given amount b, increases, as the number of instances increases.
 This completes the second part of Bernoulli's Theorem.
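
 The condition that mpq, and not merely m, must be large can be checked
 numerically. The following is only an illustrative sketch: the function
 names and the two sets of figures are assumptions chosen for the purpose,
 and the exact value is taken for a divergence not exceeding a, whereas the
 text speaks of a divergence less than a.

```python
import math

def exact_within(m, p, a):
    """Exact binomial probability that the number of occurrences in m
    trials differs from m*p by at most a."""
    q = 1.0 - p
    lo = max(0, math.ceil(m * p - a - 1e-9))
    hi = min(m, math.floor(m * p + a + 1e-9))
    return sum(math.comb(m, k) * p**k * q**(m - k) for k in range(lo, hi + 1))

def approx_within(m, p, a):
    """The approximation of the text: Theta(gamma) = erf(gamma),
    with gamma = a / sqrt(2 m p q)."""
    q = 1.0 - p
    return math.erf(a / math.sqrt(2.0 * m * p * q))

if __name__ == "__main__":
    # First case: m and mpq both large. Second case: m large, mpq about 2.
    for m, p, a in [(1000, 0.5, 30), (1000, 0.002, 1)]:
        print(m, p, a, m * p * (1 - p),
              exact_within(m, p, a), approx_within(m, p, a))
```

 In the first case the two values agree closely; in the second, where m is
 the same but mpq is small, they differ widely.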
 
 3. Bernoulli himself was not acquainted with Stirling's
 theorem, and his proof differs a good deal from the proof outlined
 in § 2. His final enunciation of the theorem is as follows: If in
 each of a given series of experiments there are r contingencies
 favourable to a given event out of a total number of contingencies
 t, so that $\frac{r}{t}$ is the probability of the event at each experiment,
 then, given any degree of probability c, it is possible to make such
 a number of experiments that the probability, that the proportionate
 number of the event's occurrences will lie between
 $\frac{r+1}{t}$ and $\frac{r-1}{t}$, is greater than c.²
 
 Wahrscheinlichkeitsrechnung, vol. i. p. 121). As the whole formula is
 approximate, the simpler expression given in the text is probably not less
 satisfactory in practice. See also Czuber, Entwicklung, pp. 76, 77, and
 Eggenberger, Beiträge zur Darstellung des Bernoullischen Theorems.

 ¹ A list of the principal tables is given by Czuber, loc. cit. vol. i. p. 122.

 ² Ars Conjectandi, p. 236 (I have translated freely). There is a brief account
 of Bernoulli's proof in Todhunter's History, pp. 71, 72. The problem is dealt
 with by Laplace, Théorie analytique, livre ii. chap. iii. For an account of
 Laplace's proof see Todhunter's History, pp. 548-553.
 
 
 4. We seem, therefore, to have proved that, if the à priori
 probability of an event under certain conditions is p, the proportion
 of times most probable à priori for the event's occurrence
 on a series of occasions where the conditions are satisfied is also
 p, and that if the series is a long one the proportion is very unlikely
 to differ widely from p. This amounts to the principle
 which Ellis ¹ and Venn have employed as the defining axiom of
 probability, save that if the series is 'long enough' the proportion,
 according to them, will certainly be p. Laplace ² believed that the
 theorem afforded a demonstration of a general law of nature, and
 in his second edition published in 1814 he replaces ³ the eloquent
 dedication, A Napoléon-le-Grand, which prefaces the edition of
 1812, by an explanation that Bernoulli's Theorem must always
 bring about the eventual downfall of a great power which, drunk
 with the love of conquest, aspires to a universal domination,
 "c'est encore un résultat du calcul des probabilités, confirmé
 par de nombreuses et funestes expériences."
 
 5. Such is the famous Theorem of Bernoulli which some have
 believed ⁴ to have a universal validity and to be applicable to all
 'properly calculated' probabilities. Yet the theorem exhibits
 algebraical rather than logical insight. And, for reasons about
 to be given, it will have to be conceded that it is only true of a
 special class of cases and requires conditions, before it can be
 legitimately applied, of which the fulfilment is rather the exception
 than the rule. For consider the case of a coin of which
 it is given that the two faces are either both heads or both tails:
 at every toss, provided that the results of the other tosses are
 unknown, the probability of heads is ½ and the probability of
 tails is ½; yet the probability of m heads and m tails in 2m tosses

 ¹ On the Foundation of the Theory of Probabilities: "If the probability of a
 given event be correctly determined, the event will on a long run of trials tend
 to recur with frequency proportional to this probability. This is generally
 proved mathematically. It seems to me to be true à priori. ... I have been
 unable to sever the judgment that one event is more likely to happen than
 another from the belief that in the long run it will occur more frequently."

 ² Essai philosophique, p. 53: "On peut tirer du théorème précédent cette
 conséquence qui doit être regardée comme une loi générale, savoir, que les
 rapports des effets de la nature, sont à fort peu près constans, quand ces effets
 sont considérés en grand nombre."

 ³ Introduction, pp. liii, liv.

 ⁴ Even by Mr. Bradley, Principles of Logic, p. 214. After criticising Venn's
 view he adds: "It is false that the chances must be realised in a series. It is,
 however, true that they most probably will be, and true again that this probability
 is increased, the greater the length we give to our series."
 
 
 is zero, and it is certain à priori that there will be either 2m
 heads or none. Clearly Bernoulli's Theorem is inapplicable to
 such a case. And this is but an extreme case of a normal
 condition.
 
 For the first stage in the proof of the theorem assumes that,
 if p is the probability of one occurrence, $p^r$ is the probability of r
 occurrences running. Our discussion of the theorems of multiplication
 will have shown how considerable an assumption this
 involves. It assumes that a knowledge of the fact that the event
 has occurred on every one of the first r - 1 occasions does not in
 any degree affect the probability of its occurrence on the rth.
 Thus Bernoulli's Theorem is only valid if our initial data are of
 such a character that additional knowledge, as to the proportion
 of failures and successes in one part of a series of cases is altogether
 irrelevant to our expectation as to the proportion in another
 part. If, for example, the initial probability of the occurrence
 of an event under certain circumstances is one in a million, we
 may only apply Bernoulli's Theorem to evaluate our expectation
 over a million trials, if our original data are of such a character
 that, even after the occurrence of the event in every one of the
 first million trials, the probability in the light of this additional
 knowledge that the event will occur on the next occasion is still
 no more than one in a million.
 
 Such a condition is very seldom fulfilled. If our initial probability
 is partly founded upon experience, it is clear that it is
 liable to modification in the light of further experience. It is,
 in fact, difficult to give a concrete instance of a case in which the
 conditions for the application of Bernoulli's Theorem are completely
 fulfilled. At the best we are dealing in practice with a
 good approximation, and can assert that no realised series of
 moderate length can much affect our initial probability. If we
 wish to employ the expression $\frac{2}{\sqrt{\pi}} \int_0^{\gamma} e^{-t^2}\, dt$ we are in a worse
 position. For this is an approximate formula which requires for
 its validity that the series should be long; whilst it is precisely
 in this event, as we have seen above, that the use of Bernoulli's
 Theorem is more than usually likely to be illegitimate.
 
 6. The conditions, which have been described above, can be 
 expressed precisely as follows : 
 
 
 Let ${}_m x_n$ represent the statement that the event has occurred
 on m out of n occasions and has not occurred on the others; and
 let ${}_1 x_1/h = p$, where h represents our à priori data, so that p is the
 à priori probability of the event in question. Bernoulli's Theorem
 then requires a series of conditions, of which the following is
 typical: ${}_{m+1} x_{n+1} / {}_m x_n \,.\, h = {}_1 x_1/h$, i.e. the probability of the event
 on the (n + 1)th occasion must be unaffected by our knowledge of
 its proportionate frequency on the first n occasions, and must be
 exactly equal to its à priori probability before the first occasion.
 
 Let us select one of these conditions for closer consideration.
 If $y_r$ represents the statement that the event has occurred on each
 of r successive occasions, $y_r/h = y_r/y_{r-1}h \cdot y_{r-1}/h$ and so on, so
 that

 $$y_r/h = \prod_{s=1}^{s=r} y_s/y_{s-1}h.$$

 Hence if we are to have $y_r/h = p^r$, we
 must have $y_s/y_{s-1}h = p$ for all values of s from 1 to r. But in
 many particular examples $y_s/y_{s-1}h$ increases with s, so that
 $y_r/h > p^r$. Bernoulli's Theorem, that is to say, tends, if it is
 carelessly applied, to exaggerate the rate at which the probability
 of a given divergence from the most probable decreases as the
 divergence increases. If we are given a penny of which we have
 no reason to doubt the regularity, the probability of heads at
 the first toss is ½; but if heads fall at every one of the first 999
 tosses, it becomes reasonable to estimate the probability of heads
 at the thousandth toss at much more than ½. For the à priori
 probability of its being a conjurer's penny, or otherwise biassed
 so as to fall heads almost invariably, is not usually so infinitesimally
 small as $(\tfrac{1}{2})^{999}$. We can only apply Bernoulli's Theorem
 with rigour for a prediction as to the penny's behaviour over a
 series of a thousand tosses, if we have à priori such exhaustive
 knowledge of the penny's constitution and of the other conditions
 of the problem that 999 heads running would not cause
 us to modify in any respect our prediction à priori.
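
 The argument about the penny can be put into figures. The sketch below
 rests on assumptions which are not in the text: the penny is taken to be
 either fair or a conjurer's penny with two like faces, the two kinds of
 conjurer's penny being equally likely, and the à priori probability eps
 of a conjurer's penny (one in a thousand million) is purely hypothetical.

```python
from fractions import Fraction

def prob_heads_next(eps, k):
    """Probability of heads at the next toss after k heads running.

    eps : assumed a priori probability that the penny has two like faces
          (half of it to two heads, half to two tails).
    k   : number of tosses so far, all of which fell heads.
    """
    w_heads = eps / 2                                # two-headed penny
    w_tails = eps / 2 if k == 0 else Fraction(0)     # excluded by any head
    w_fair = (1 - eps) * Fraction(1, 2) ** k         # fair penny
    total = w_heads + w_tails + w_fair
    return (w_heads + w_fair * Fraction(1, 2)) / total

if __name__ == "__main__":
    eps = Fraction(1, 10**9)   # hypothetical figure
    for k in (0, 20, 40, 999):
        print(k, float(prob_heads_next(eps, k)))
```

 On these assumptions the probability at the first toss is exactly ½, as
 Bernoulli's Theorem supposes, but after a long run of heads it approaches
 certainty, because even so small a value of eps is enormous in comparison
 with $(\tfrac{1}{2})^{999}$.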
 
 7. It seldom happens, therefore, that we can apply Bernoulli's
 Theorem with reference to a long series of natural events. For
 in such cases we seldom possess the exhaustive knowledge which
 is necessary. Even where the series is short, the perfectly
 rigorous application of the Theorem is not likely to be legitimate,
 and some degree of approximation will be involved in
 utilising its results.
 
 Not so infrequently, however, artificial series can be devised 
 
 
 in which the assumptions of Bernoulli's Theorem are relatively
 legitimate.¹ Given, that is to say, a proposition $a_1$, some series
 $a_1 a_2 \ldots$ can be found, which satisfies the conditions:

 (i.) $a_1/h = a_2/h \ldots = a_r/h$.
 (ii.) $a_r / a_s \ldots a_t \ldots h = a_r/h$.

 Adherents of the Frequency Theory of Probability, who use the
 principal conclusion of Bernoulli's Theorem as the defining property
 of all probabilities, sometimes seem to mean no more than
 that, relative to given evidence, every proposition belongs to
 some series, to the members of which Bernoulli's Theorem is
 rigorously applicable. But the natural series, the series, for
 example, in which we are most often interested, where the a's
 are alike in being accompanied by certain specified conditions c,
 is not, as a rule, rigorously subject to the Theorem. Thus 'the
 probability of a in certain conditions c is |' is not in general
 equivalent, as has sometimes been supposed, to 'It is 500 to 1
 that in 90,000 occurrences of c, a will not occur more than 20,200
 times, and 500 to 1 that it will not occur less than 19,800 times.'
 
 8. Bernoulli's Theorem supplies the simplest formula by
 which we can attempt to pass from the à priori probabilities of
 each of a series of events to a prediction of the statistical frequency
 of their occurrence over the whole series. We have seen that
 Bernoulli's Theorem involves two assumptions, one (in the form
 in which it is usually enunciated) tacit and the other explicit.
 It is assumed, first, that a knowledge of what has occurred at
 some of the trials would not affect the probability of what may
 occur at any of the others; and it is assumed, secondly, that these
 probabilities are all equal à priori. It is assumed, that is to say,
 that the probability of the event's occurrence at the rth trial is
 equal à priori to its probability at the nth trial, and, further, that
 it is unaffected by a knowledge of what may actually have
 occurred at the nth trial.

 A formula, which dispenses with the explicit assumption of
 equal à priori probabilities at every trial, was proposed by
 Poisson,² and is usually known by his name. It does not dispense,
 
 ¹ In the discussion in Chapter XVI., p. 170, of the probability of a divergence
 from an equality of heads and tails in coin-tossing, an example has been
 given of the construction of an artificial series in which the application of
 Bernoulli's Theorem is more legitimate than in the natural series.

 ² Recherches, pp. 246 et seq.
 
 
 however, with the other inexplicit assumption. The difference
 between Poisson's Theorem and Bernoulli's is best shown by
 reference to the ideal case of balls drawn from an urn. The
 typical example for the valid application of Bernoulli's Theorem
 is that of balls drawn from a single urn, containing black and
 white balls in a known proportion, and replaced after each drawing,
 or of balls drawn from a series of urns, each containing black
 and white balls in the same known proportion. The typical
 example for Poisson's Theorem is that of balls drawn from a series
 of urns, each containing black and white balls in different known
 proportions.

 Poisson's Theorem may be enunciated as follows:¹ Let s
 trials be made, and at the λth trial (λ = 1, 2 . . . s) let the probabilities
 for the occurrence and non-occurrence of the event be
 $p_\lambda$, $q_\lambda$ respectively. Then, if $\frac{\sum p_\lambda}{s} = p$, the probability that the
 number of occurrences m of the event in the s trials will lie
 between the limits $sp \pm l$, is given by

 $$P = \frac{2}{k\sqrt{\pi s}} \int_0^{l} e^{-\frac{x^2}{k^2 s}}\, dx + \frac{e^{-\frac{l^2}{k^2 s}}}{k\sqrt{\pi s}}, \quad \text{where } k = \sqrt{\frac{2\sum p_\lambda q_\lambda}{s}}.$$

 By substituting $\frac{x}{k\sqrt{s}} = t$ and $\frac{l}{k\sqrt{s}} = \gamma$, this may be written
 in a form corresponding to that of Bernoulli's Theorem,² namely:

 The probability that the number of occurrences of the event
 will lie between $sp \pm \gamma k\sqrt{s}$ is given by

 $$\frac{2}{\sqrt{\pi}} \int_0^{\gamma} e^{-t^2}\, dt + \frac{e^{-\gamma^2}}{k\sqrt{\pi s}}.$$
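
 Poisson's formula can be compared with an exact calculation for a series of
 urns of differing composition. The sketch below is illustrative only: it
 assumes that k stands for $\sqrt{2\sum p_\lambda q_\lambda / s}$, that the trials are
 independent in the sense required by the Theorem, and that the integral is
 the error function erf; the function names and the particular series of
 probabilities are hypothetical.

```python
import math

def exact_distribution(ps):
    """dist[m] = exact probability of m occurrences in independent trials
    whose probabilities of occurrence are the entries of ps."""
    dist = [1.0]
    for p in ps:
        nxt = [0.0] * (len(dist) + 1)
        for m, w in enumerate(dist):
            nxt[m] += w * (1.0 - p)      # non-occurrence at this trial
            nxt[m + 1] += w * p          # occurrence at this trial
        dist = nxt
    return dist

def exact_within(ps, l):
    """Exact probability that the number of occurrences lies in sp +/- l."""
    centre = sum(ps)
    dist = exact_distribution(ps)
    return sum(w for m, w in enumerate(dist) if abs(m - centre) <= l + 1e-9)

def poisson_approx(ps, l):
    """Poisson's approximation in the second form given in the text."""
    s = len(ps)
    k = math.sqrt(2.0 * sum(p * (1.0 - p) for p in ps) / s)
    gamma = l / (k * math.sqrt(s))
    return math.erf(gamma) + math.exp(-gamma**2) / (k * math.sqrt(math.pi * s))

if __name__ == "__main__":
    # 400 urns whose proportions of black balls run evenly from 0.2 to 0.8.
    ps = [0.2 + 0.6 * i / 399 for i in range(400)]
    print(exact_within(ps, 10), poisson_approx(ps, 10))
```

 When every entry of ps is the same, k reduces to $\sqrt{2pq}$ and the
 expression becomes the more precise form of Bernoulli's Theorem.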
 
 9. This is a highly ingenious theorem and extends the application
 of Bernoulli's results to some important types of cases. It
 embraces, for example, the case in which the successive terms of
 a series are drawn from distinct populations known to be characterised
 by differing statistical frequencies; no further complication
 being necessary beyond the calculation of two simple
 functions of these frequencies and of the number of terms in the
 series. But it is important not to exaggerate the degree to which
 Poisson's method has extended the application of Bernoulli's
 results. Poisson's Theorem leaves untouched all those cases in
 which the probabilities of some of the terms in the series of events
 can be influenced by a knowledge of how some of the other terms
 in the series have turned out.

 ¹ For the proof see Poisson, Recherches, loc. cit., or Czuber,
 Wahrscheinlichkeitsrechnung, vol. i. pp. 153-159.

 ² For the analogous form of Bernoulli's Theorem see p. 339 (footnote).
 
 Amongst these cases two types can be distinguished. In the
 first type such knowledge would lead us to discriminate between
 the conditions to which the different instances are subject. If,
 for example, balls are drawn from a bag, containing black and
 white balls in known proportions, and not replaced, the knowledge
 whether or not the first ball drawn was black affects the
 probability of the second ball's being black because it tells us
 how the conditions in which the second ball is drawn differ
 from those in which the first ball was drawn. In the second type
 such knowledge does not lead us to discriminate between the
 conditions to which the different instances are subject, but it leads
 us to modify our opinion as to the nature of the conditions which
 apply to all the terms alike. If, for instance, balls are drawn
 from a bag, which is one, but it is not certainly known which, out
 of a number of bags containing black and white balls in differing
 proportions, the knowledge of the colour of the first ball drawn
 affects the probabilities at the second drawing, because it throws
 some light upon the question as to which bag is being drawn from.
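
 The second type may be illustrated by a small exact calculation. The
 following sketch makes assumptions beyond the text: there are two bags,
 equally likely à priori, containing black balls in the proportions ¼ and
 ¾, and each ball is replaced after drawing, so that the material
 conditions are the same at every drawing. The names are hypothetical.

```python
from fractions import Fraction

def prob_black(bags):
    """Probability that the next ball is black, where bags is a list of
    (probability that this is the bag in use, proportion of black balls)."""
    return sum(weight * black for weight, black in bags)

def update(bags, drew_black):
    """Revise the probabilities of the several bags after one drawing
    (with replacement) whose colour is known."""
    revised = [(weight * (black if drew_black else 1 - black), black)
               for weight, black in bags]
    total = sum(weight for weight, _ in revised)
    return [(weight / total, black) for weight, black in revised]

if __name__ == "__main__":
    bags = [(Fraction(1, 2), Fraction(1, 4)), (Fraction(1, 2), Fraction(3, 4))]
    print(prob_black(bags))                  # before any drawing
    print(prob_black(update(bags, True)))    # after a black ball is seen
    print(prob_black(update(bags, False)))   # after a white ball is seen
```

 Nothing in the bag has changed between the two drawings; only the
 knowledge as to which bag is in use has changed.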
 
 This last type is that to which most instances conform which
 are drawn from the real world. A knowledge of the characteristics
 of some members of a population may give us a clue to the
 general character of the population in question. Yet it is this
 type, where there is a change in knowledge but no change in the
 material conditions from one instance to the next, which is most
 frequently overlooked.¹ It will be worth while to say something
 further about each of these two types.²
 
 ¹ Numerous instances could be quoted. To take a recent English example,
 reference may be made to Yule, Introduction to the Theory of Statistics,
 p. 251. Mr. Yule thinks that the condition of independence is satisfied if "the
 result of any one throw or toss does not affect, and is unaffected by, the results
 of the preceding and following tosses," and does not allow for the cases in which
 knowledge of the result is relevant apart from any change in the physical
 conditions.

 ² The types which I distinguish under four heads (the Bernoullian, the
 
 
 10. For problems of the first type, where there is physical
 or material dependence between the successive trials, it is not
 possible, I think, to propose any general solution; since the
 probabilities of the successive trials may be modified in all kinds
 of different ways. But for particular problems, if the conditions
 are precise enough, solutions can be devised. The problem, for
 instance, of an urn, containing black and white balls in known
 proportions, from which balls are drawn successively and not
 replaced,¹ is ingeniously solved by Czuber ² with the aid of
 Stirling's Theorem. If σ is the number of balls and s the number
 of drawings, he reaches the interesting conclusion (assuming that
 σ, s and σ - s are all large) that the probability of the number of
 black balls lying within given limits is the same as it would be
 if the balls were replaced after each drawing and the number
 of drawings were $\frac{\sigma - s}{\sigma}\, s$ instead of s.
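
 Czuber's conclusion can be tested on the dispersion of the number of black
 balls. The sketch below is an interpretation, not Czuber's own method: it
 reads the reduced number of drawings as $\frac{\sigma - s}{\sigma}\, s$ and compares
 the exact variance when the balls are not replaced with the variance
 $s'pq$ when they are replaced and the number of drawings is s'. The names
 and figures are hypothetical.

```python
import math

def variance_without_replacement(sigma, black, s):
    """Exact variance of the number of black balls in s drawings, without
    replacement, from sigma balls of which `black` are black."""
    total = math.comb(sigma, s)
    lowest = max(0, s - (sigma - black))
    highest = min(s, black)
    probs = {k: math.comb(black, k) * math.comb(sigma - black, s - k) / total
             for k in range(lowest, highest + 1)}
    mean = sum(k * w for k, w in probs.items())
    return sum((k - mean) ** 2 * w for k, w in probs.items())

def variance_reduced_with_replacement(sigma, black, s):
    """Variance of the number of black balls when the balls are replaced
    and the number of drawings is reduced to s * (sigma - s) / sigma."""
    p = black / sigma
    reduced = s * (sigma - s) / sigma
    return reduced * p * (1.0 - p)

if __name__ == "__main__":
    sigma, black, s = 2000, 800, 500
    print(variance_without_replacement(sigma, black, s),
          variance_reduced_with_replacement(sigma, black, s))
```

 The two variances stand in the ratio of σ to σ - 1, so that they agree
 closely when σ is large.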
 
 In addition to the assumptions already stated, Professor
 Czuber's solution applies only to those cases where the limits, for
 which we wish to determine the probability, are narrow compared
 with the total number of black balls pσ. Professor Pearson ³ has
 worked out the same problem in a much more general manner,
 so as to deal with the whole range, i.e. the frequency or probability
 of all possible ratios of black balls, even where s > pσ. The
 various forms of curve, which result, according to the different
 relations existing between p, s, and σ, supply examples of each
 of the different types of frequency curve which arise out of a
 
 Poissonian, and the two described above) Bachelier (Calcul des probabilités,
 p. 155) classifies as follows:

 (i.) When the conditions are identical throughout, the problem has
 uniformité;

 (ii.) When they vary from stage to stage, but according to a law given from
 the beginning and in a manner which does not depend upon what has happened
 at the earlier stages, it has indépendance;

 (iii.) When they vary in a manner which depends upon what has happened
 at the earlier stages, it has connexité.

 Bachelier gives solutions for each type on the assumption that the number of
 trials is very great, and that the number of successes or failures can be regarded
 as a continuous variable. This is the same kind of assumption as that made
 in the proof of Bernoulli's Theorem given in § 2, and is open to the same
 objections, or rather the value of the results is limited in the same way.

 ¹ It is of no consequence whether the balls are drawn successively and not
 replaced, or are drawn simultaneously.

 ² Loc. cit. vol. i. pp. 163, 164.

 ³ "Skew Variation in Homogeneous Material," Phil. Trans. (1895), p. 360.
 
 
 classification according to (i.) skewness or symmetry, (ii.) limitation
 of range in one, both or neither direction; and he designates,
 therefore, the curves which are thus obtained as generalised probability
 curves. His discussion of the properties of these curves is
 interesting, however, to the student of descriptive statistics
 rather than to the student of probability. The most generalised
 and, mathematically, by far the most elegant treatment of this
 problem, with which I am acquainted, is due to Professor
 Tschuprow.¹

 Poisson, in attempting a somewhat similar problem,² arrives
 at a result, which seems obviously contrary to good sense, by a
 curious, but characteristic, misapprehension of the meaning of
 'independence' in probability. His problem is as follows:
 If l balls be taken out from an urn, containing c black and white
 balls in known proportions, and not replaced, and if a further
 number of balls μ be then taken out, the probability that a given
 proportion of these μ balls will be black is independent of
 the number and the colour of the l balls originally drawn out. For,
 he argues, if l + μ balls are drawn out, the probability of a combination,
 which is made up of l black and white balls in given
 proportions followed by μ balls, of which m are white and n black,
 must be the same as that of a similar combination in which the
 μ balls precede the l balls. Hence the probability of m white
 balls in μ drawings, given that the l balls have already been
 drawn out, must be equal to the probability of the same result,
 when no balls have been previously drawn out. The reader will
 perceive that Poisson, thinking only of physical dependence, has
 been led to his paradoxical conclusion by a failure to distinguish
 between the cases where the proportion of black and white balls
 amongst the l balls originally drawn is known and where it is not.
 The fact of their having been drawn in certain proportions, provided
 that only the total number drawn is known and the proportions
 are unknown, does not influence the probability. Poisson
 states in his conclusion that the probability is independent of the
 number and colour of the l balls originally drawn. If he had
 added, as he ought, 'provided the number of each colour is

 ¹ "Zur Theorie der Stabilität statistischer Reihen," p. 216, published in
 the Skandinavisk Aktuarietidskrift for 1919.

 ² Loc. cit. pp. 231, 232.
 
 
 unknown,' the air of paradox disappears. This is an exceedingly
 good example of the failure to perceive that a probability cannot
 be influenced by the occurrence of a material event but only by
 such knowledge, as we may have, respecting the occurrence of the
 event.¹
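
 The distinction on which Poisson's paradox turns can be shown in a small
 exact calculation. The sketch below simplifies his problem, as an
 assumption made for brevity, to the colour of a single further ball drawn
 after l balls have been removed; the function names and the composition
 of the urn are hypothetical.

```python
from fractions import Fraction
from math import comb

def next_black_given(b, w, l, j):
    """Probability that the next ball is black when l balls have been
    removed and it is KNOWN that j of them were black (urn: b black, w white)."""
    return Fraction(b - j, b + w - l)

def next_black_unknown(b, w, l):
    """Probability that the next ball is black when l balls have been
    removed and only their NUMBER is known, not their colours."""
    total = comb(b + w, l)
    return sum(Fraction(comb(b, j) * comb(w, l - j), total)
               * next_black_given(b, w, l, j)
               for j in range(max(0, l - w), min(l, b) + 1))

if __name__ == "__main__":
    b, w, l = 6, 4, 3
    print(next_black_unknown(b, w, l))       # equals b / (b + w)
    print(next_black_given(b, w, l, 3))      # three black balls known removed
    print(next_black_given(b, w, l, 0))      # three white balls known removed
```

 While the colours of the balls removed are unknown the probability remains
 what it was at the outset; it alters only when their colours are known.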
 
 11. For problems of the second type, where knowledge of the
 result of one trial is capable of influencing the probability at the
 next apart from any change in the material conditions, there is,
 likewise, no general solution. The following artificial example,
 however, will illustrate the sort of considerations which are involved.

 In the cases where Bernoulli's Theorem is applied to practical
 questions, the à priori probability is generally obtained empirically
 by reference to the statistical frequency of each alternative
 in past experience under apparently similar conditions. Thus
 the à priori probability of a male birth is estimated by reference
 to the recorded proportion of male births in the past.² The
 validity of estimating probabilities in this manner will be discussed
 later. But for the purposes of this example let us assume
 that the à priori probability has been calculated on this basis.
 Thus the à priori probability p ($= \frac{r}{s}$) of an event is based on
 the observation of its occurrence r times out of s occasions on
 which the given conditions were present. Now, according to
 Bernoulli's Theorem directly applied, the probability of the
 event's occurring n times running is $p^n$ or $\left(\frac{r}{s}\right)^n$. But, if the
 event occurs at the first trial, the probability at the second
 
 ¹ For an attempt to solve other problems of this type see Bachelier, Calcul
 des probabilités, chap. ix. (Probabilités connexes). I think, however, that the
 solutions of this chapter are vitiated by his assuming in the course of them
 both that certain quantities are very large, and also, at a later stage, that the
 same quantities are infinitesimal. On this account, for example, his solution
 of the following difficult problem breaks down: Given an urn A with m white
 and n black balls and an urn B with m' white and n' black balls, if at each move
 a ball is taken from A and put into B, and at the same time a ball is taken from
 B and put into A, what is the probability after x moves that the urns A and B
 shall have a given composition?

 ² Cf. Yule, Theory of Statistics, p. 258: "We are not able to assign an
 à priori value to the chance p (i.e. of a male birth) as in the case of dice-throwing,
 but it is quite sufficiently accurate for practical purposes to use the proportion
 of male births actually observed if that proportion be based on a moderately
 large number of observations."
 
 
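 becomes $\frac{r+1}{s+1}$, and so on. Hence the probability P, properly
 calculated, of n successive occurrences is

 $$P = \frac{r}{s}\cdot\frac{r+1}{s+1}\cdot\frac{r+2}{s+2}\cdots\frac{r+n-1}{s+n-1} = \frac{(r+n-1)!}{(s+n-1)!}\cdot\frac{(s-1)!}{(r-1)!}$$

 $$\approx \left(\frac{r}{s}\right)^{n} \frac{\left(1+\frac{n}{r}\right)^{r+n-\frac{1}{2}}}{\left(1+\frac{n}{s}\right)^{s+n-\frac{1}{2}}}$$

 by Stirling's Theorem, provided that r and s are large. Hence

 $$P = p^{n}Q^{n}, \quad \text{where } Q = \frac{\left(1+\frac{n}{r}\right)^{\frac{r-\frac{1}{2}}{n}+1}}{\left(1+\frac{n}{s}\right)^{\frac{s-\frac{1}{2}}{n}+1}}.$$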
 
 Thus, in this case, the assumption of Bernoulli's Theorem is
 approximately correct, only if Q is nearly unity. This condition
 is not satisfied unless n is small both compared with r and compared
 with s. It is very important to notice that two conditions
 are involved. Not only must the experience, upon which the
 à priori probability is based, be extensive in comparison with the
 number of instances to which we apply our prediction; but also
 the number of previous instances multiplied by the probability
 based upon them, i.e. sp (= r), must be large in comparison with
 the number of new instances. Thus, even where the prior experience,
 upon which we found the initial probability p, is very
 extensive, we must not, if p is very small, say that the probability
 of n successive occurrences is approximately $p^n$, unless n is also
 small. Similarly if we wish to determine, by the methods of
 Bernoulli, the probability of n occurrences and m failures on
 m + n occasions, it is necessary that we should have m and n small
 compared with s, n small compared with r, and m small compared
 with s - r.¹
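
 The two conditions can be seen separately in an exact calculation of the
 product for P. In the sketch below Q is taken to be the nth root of the
 ratio of P to $p^n$, so that $P = p^n Q^n$; the function names and the two
 sets of figures are hypothetical.

```python
from fractions import Fraction

def run_probability(r, s, n):
    """P: probability of n successive occurrences when, after x further
    occurrences in x further trials, the probability is (r + x) / (s + x)."""
    prob = Fraction(1)
    for i in range(n):
        prob *= Fraction(r + i, s + i)
    return prob

def bernoulli_run(r, s, n):
    """p^n with p = r / s, as given by Bernoulli's Theorem directly applied."""
    return Fraction(r, s) ** n

def q_factor(r, s, n):
    """Q such that P = p^n * Q^n."""
    return float(run_probability(r, s, n) / bernoulli_run(r, s, n)) ** (1.0 / n)

if __name__ == "__main__":
    # n small compared with both r and s: Q is very near unity.
    print(q_factor(5000, 10000, 10))
    # s as large as before, but r = sp no larger than n: Q departs widely.
    print(q_factor(10, 10000, 10))
```

 In the second case the prior experience is as extensive as in the first,
 yet $p^n$ understates P many times over, because r is not large in
 comparison with n.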
 
 The case solved above is the simplest possible. The general
 problem is as follows: If an event has occurred x times in the
 first y trials, its probability at the (y + 1)th is $\frac{r+x}{s+y}$; determine the
 à priori probability of the event's occurring p times in q trials.
 If the à priori probability in question is represented by $\phi(p, q)$, we
 have

 $$\phi(p,q) = \frac{r+p-1}{s+q-1}\,\phi(p-1,q-1) + \frac{s+q-1-r-p}{s+q-1}\,\phi(p,q-1).$$

 I know of no solution of this, even approximate. But we may
 say that the conditions are those of supernormal dispersion as
 compared with Bernoulli's conditions. That is to say, the probability
 of a proportion differing widely from $\frac{r}{s}$ is greater than
 in Bernoullian conditions; for when the proportion begins to
 diverge it becomes more probable that it will continue to diverge
 in the same direction. If, on the other hand, the conditions of
 the problem had been such, that when the proportion begins to
 diverge it becomes more probable that it will recover itself and
 tend back towards $\frac{r}{s}$ (as when we draw balls without replacing
 them from a bag of known composition), we should have subnormal
 dispersion.²
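
 For particular numbers the equation for $\phi(p, q)$ can at least be worked
 out step by step. The sketch below applies the recurrence with exact
 fractions and compares the resulting dispersion with that of Bernoullian
 conditions; the function names and the figures r = 5, s = 10, q = 20 are
 hypothetical.

```python
from fractions import Fraction

def phi_table(r, s, q):
    """Return [phi(0, q), phi(1, q), ..., phi(q, q)] from the recurrence
    phi(p,q) = (r+p-1)/(s+q-1) * phi(p-1,q-1)
             + (s+q-1-r-p)/(s+q-1) * phi(p,q-1)."""
    row = [Fraction(1)]                       # phi(0, 0) = 1
    for trials in range(1, q + 1):
        new = [Fraction(0)] * (trials + 1)
        for p in range(trials + 1):
            if p >= 1:                        # occurrence at this trial
                new[p] += Fraction(r + p - 1, s + trials - 1) * row[p - 1]
            if p <= trials - 1:               # failure at this trial
                new[p] += Fraction(s + trials - 1 - r - p, s + trials - 1) * row[p]
        row = new
    return row

def mean_and_variance(row):
    """Mean and variance of the number of occurrences."""
    mean = sum(k * w for k, w in enumerate(row))
    variance = sum((k - mean) ** 2 * w for k, w in enumerate(row))
    return mean, variance

if __name__ == "__main__":
    r, s, q = 5, 10, 20
    mean, variance = mean_and_variance(phi_table(r, s, q))
    bernoullian = q * Fraction(r, s) * (1 - Fraction(r, s))
    print(mean, float(variance), float(bernoullian))
```

 The mean is the same as in Bernoullian conditions, but the variance is
 greater, which is the supernormal dispersion described above.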
 
 12. The condition elucidated in the preceding paragraph is
 frequently overlooked by statisticians. The following example
 from Czuber ³ will be sufficient for the purpose of illustration.
 Czuber's argument is as follows:

 In the period 1866-1877 there were registered in Austria

 m = 4,311,076 male births
 n = 4,052,193 female births

 s = 8,363,269;
 
 ¹ This paragraph is concerned with a different point from that dealt with
 in Professor Pearson's article "On the Influence of Past Experience on Future
 Expectation," to which it bears a superficial resemblance. Professor Pearson's
 article which deals, not with Bernoulli's Theorem, but with Laplace's "Rule of
 Succession," will be referred to in § 16 of this chapter and in § 12 of the next.

 ² Bachelier (Calcul des probabilités, p. 201) classifies these two kinds of
 conditions as conditions accélératrices and conditions retardatrices.

 ³ Loc. cit. vol. ii. p. 15. I choose my example from Professor Czuber because
 he is usually so careful an exponent of theoretical statistics.
 
 
 for the succeeding period, 1877-1899, we are given only 
 
 m' = 6,533,961 male births;

 what conclusion can we draw as to the number n' of female
 births? We can conclude, according to Czuber, that the most
 probable value

 $n_0' = \frac{n m'}{m}$ = 6,141,587,

 and that there is a probability P = .9999779 that n' will lie
 between the limits 6,118,361 and 6,164,813.

 It seems in plain opposition to good sense that on
 such evidence we should be able with practical certainty
 $\left(P = .9999779 = 1 - \frac{1}{45250}\right)$ to estimate the number of female
 births within such narrow limits. And we see that the conditions
 laid down in § 11 have been flagrantly neglected. The
 number of cases, over which the prediction based on Bernoulli's
 Theorem is to extend, actually exceeds the number of cases upon
 which the à priori probability has been based. It may be added
 that for the period, 1877-1894, the actual value of n' did lie
 between the estimated limits, but that for the period, 1895-1905,
 it lay outside limits to which the same method had
 attributed practical certainty.
 
 That Professor Czuber should have thought his own argument
 plausible, is to be explained, I think, by his tacitly taking account
 in his own mind of evidence not stated in the problem. He was
 relying upon the fact that there is a great mass of evidence for
 believing that the ratio of male to female births is peculiarly
 stable. But he has not brought this into the argument, and he
 has not used as his à priori probability and as his coefficient of
 dispersion the values which the whole mass of this evidence would
 have led him to adopt. Would not the argument have seemed
 very preposterous if m had been the number of males called
 George, and n the number of females called Mary? Would it not
 have seemed rather preposterous if m had been the number of
 legitimate births and n the number of illegitimate births? Clearly
 we must take account of other considerations than the mere
 numerical values of m and n in estimating our à priori probability.
 But this question belongs to the subject-matter of later chapters;
 and, quite apart from the manner of calculation of the à priori
 probability, the argument is invalidated by the fact that an
 à priori probability founded on 8,363,269 instances, without
 corroborative evidence of a non-statistical character, cannot
 be assumed stable through a calculation which extends over
 12,700,000 instances.
 
13. Before we leave the theorems of Bernoulli and Poisson,
it is necessary to call attention to a very remarkable theorem by
Tchebycheff, from which both of the above theorems can be
derived as special cases. This result is reached rigorously and
without approximation, by means of simple algebra and without
the aid of the differential calculus. Apart from the beauty
and simplicity of the proof, the theorem is so valuable and so
little known that it will be worth while to quote it in full:¹
 
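Let $x, y, z, \ldots$ represent certain magnitudes, of which $x$
can take the values $x_1, x_2, \ldots, x_k$ with probabilities
$p_1, p_2, \ldots, p_k$ respectively, $y$ the values
$y_1, y_2, \ldots, y_l$ with probabilities $q_1, q_2, \ldots, q_l$,
$z$ the values $z_1, z_2, \ldots, z_m$ with probabilities
$r_1, r_2, \ldots, r_m$, and so on, so that

$$\sum_{1}^{k} p = 1, \quad \sum_{1}^{l} q = 1, \quad \sum_{1}^{m} r = 1, \quad \text{etc.}$$

Write

$$\sum_{1}^{k} p_\kappa x_\kappa = a, \quad \sum_{1}^{l} q_\lambda y_\lambda = b, \quad \sum_{1}^{m} r_\mu z_\mu = c, \quad \text{etc.,}$$

and

$$\sum_{1}^{k} p_\kappa x_\kappa^2 = a_1, \quad \sum_{1}^{l} q_\lambda y_\lambda^2 = b_1, \quad \sum_{1}^{m} r_\mu z_\mu^2 = c_1, \quad \text{etc.,}$$

so that we can describe $a$ as the mathematical expectation or
average value of $x$, and $a_1$ as the mathematical expectation or
average value of $x^2$, etc.

The probability that the sum $x + y + z + \ldots$ will have for
its value $x_\kappa + y_\lambda + z_\mu + \ldots$ is
$p_\kappa q_\lambda r_\mu \ldots$ (provided that the
values of $x, y, z, \ldots$ are independent). Hence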
 
¹ From Journ. Liouville (2), xii., 1867, "Des valeurs moyennes," an article
translated from the Russian of Tchebycheff. This proof is also quoted by
Czuber, loc. cit. p. 212, through whom I first became acquainted with it. Most
of Tchebycheff's work was published previous to 1870 and appeared originally
in Russian. It was not easily accessible, therefore, until the publication at
Petrograd in 1907 of the collected edition of his works in French. His
theorems are, consequently, not nearly so well known as they deserve to be,
although his most important theorems were reproduced from time to time in
the journals of Euler and Liouville. For full references see the Bibliography.
 
 
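$$\sum (x_\kappa + y_\lambda + z_\mu + \ldots - a - b - c - \ldots)^2 \, p_\kappa q_\lambda r_\mu \ldots$$

summed for all values of $\kappa, \lambda, \mu$ is the average expectation for

$$(x_\kappa + y_\lambda + z_\mu + \ldots - a - b - c - \ldots)^2.$$

Now

$$\sum_{1}^{k} (x_\kappa^2 - 2a x_\kappa + a^2) p_\kappa = \sum p_\kappa x_\kappa^2 - 2a \sum p_\kappa x_\kappa + a^2 \sum p_\kappa = a_1 - 2a^2 + a^2 = a_1 - a^2.$$

Also $\sum q_\lambda r_\mu \ldots = 1$, summed for all values of
$\lambda, \mu, \ldots$, and

$$\sum_{1}^{k} \sum_{1}^{l} 2 (x_\kappa - a)(y_\lambda - b) p_\kappa q_\lambda = \sum_{1}^{k} \sum_{1}^{l} 2 (x_\kappa y_\lambda - b x_\kappa - a y_\lambda + ab) p_\kappa q_\lambda$$

$$= 2 \left( \sum x_\kappa y_\lambda p_\kappa q_\lambda - b \sum x_\kappa p_\kappa q_\lambda - a \sum y_\lambda p_\kappa q_\lambda + ab \sum p_\kappa q_\lambda \right) = 2 (ab - ab - ab + ab) = 0.$$

Therefore

$$\sum (x_\kappa + y_\lambda + z_\mu + \ldots - a - b - c - \ldots)^2 \, p_\kappa q_\lambda r_\mu \ldots = a_1 + b_1 + c_1 + \ldots - a^2 - b^2 - c^2 - \ldots,$$

whence

$$\sum \frac{(x_\kappa + y_\lambda + z_\mu + \ldots - a - b - c - \ldots)^2}{\alpha^2 (a_1 + b_1 + c_1 + \ldots - a^2 - b^2 - c^2 - \ldots)} \, p_\kappa q_\lambda r_\mu \ldots = \frac{1}{\alpha^2},$$

where the summation extends over all values of $\kappa, \lambda, \mu, \ldots$ and
$\alpha$ is some arbitrary number greater than unity.

If we omit those terms of the sum on the left-hand side of
the above equation for which

$$\frac{(x_\kappa + y_\lambda + z_\mu + \ldots - a - b - c - \ldots)^2}{\alpha^2 (a_1 + b_1 + c_1 + \ldots - a^2 - b^2 - c^2 - \ldots)} < 1,$$

and write unity for this expression in the remaining terms, both
these processes diminish the magnitude of the left-hand side.

Hence $\sum p_\kappa q_\lambda r_\mu \ldots < \dfrac{1}{\alpha^2}$, where the
summation covers those sets of values only for which

$$\frac{(x_\kappa + y_\lambda + z_\mu + \ldots - a - b - c - \ldots)^2}{\alpha^2 (a_1 + b_1 + c_1 + \ldots - a^2 - b^2 - c^2 - \ldots)} \geq 1.$$

If P is the probability that

$$\frac{(x_\kappa + y_\lambda + z_\mu + \ldots - a - b - c - \ldots)^2}{\alpha^2 (a_1 + b_1 + c_1 + \ldots - a^2 - b^2 - c^2 - \ldots)}$$

is equal to or less than unity, it follows that

$$1 - P < \frac{1}{\alpha^2}, \quad \text{i.e.} \quad P > 1 - \frac{1}{\alpha^2}.$$

Hence the probability that the sum $x_\kappa + y_\lambda + z_\mu + \ldots$
lies between the limits

$$a + b + c + \ldots - \alpha \sqrt{a_1 + b_1 + c_1 + \ldots - a^2 - b^2 - c^2 - \ldots}$$

and

$$a + b + c + \ldots + \alpha \sqrt{a_1 + b_1 + c_1 + \ldots - a^2 - b^2 - c^2 - \ldots}$$

is greater than $1 - \dfrac{1}{\alpha^2}$, where $\alpha$ is some number
greater than unity.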
 
This result constitutes Tchebycheff's Theorem. It may also
be written in the following form:

Let $n$ be the number of the magnitudes $x, y, z, \ldots$, and
write $\alpha = \dfrac{\sqrt{n}}{t}$; then the probability that the arithmetic
mean $\dfrac{x + y + z + \ldots}{n}$ lies between the limits

$$\frac{a + b + c + \ldots}{n} \pm \frac{1}{t} \sqrt{\frac{a_1 + b_1 + c_1 + \ldots}{n} - \frac{a^2 + b^2 + c^2 + \ldots}{n}}$$

is greater than $1 - \dfrac{t^2}{n}$.
 
It is also easy to show¹ as a deduction from Tchebycheff's
Theorem that, if an amount A is won when an event of probability
$p$ $[= 1 - q]$ occurs and an amount B lost when it fails, then in
$s$ trials the probability that the total winnings (or losses) will lie
between the limits

$$s(pA - qB) \pm \alpha (A + B) \sqrt{spq}$$

is greater than $1 - \dfrac{1}{\alpha^2}$.
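The inequality can be checked numerically on a small case. The following is only a sketch, not part of the original argument: the function name `tchebycheff_check` and the choice of two dice and a coin as the independent magnitudes are illustrative assumptions. It enumerates every combination of values exactly and compares the probability of falling within the limits with the bound $1 - 1/\alpha^2$.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def tchebycheff_check(magnitudes, alpha):
    """Return (exact probability inside the limits, bound 1 - 1/alpha^2).

    `magnitudes` is a list of independent magnitudes, each given as a
    list of (value, probability) pairs; `alpha` should exceed 1.
    (Hypothetical helper, written for illustration only.)
    """
    # a + b + c + ... : the sum of the mathematical expectations.
    mean = sum(sum(p * v for v, p in m) for m in magnitudes)
    # a1 + b1 + c1 + ... - a^2 - b^2 - c^2 - ...
    spread = sum(
        sum(p * v * v for v, p in m) - sum(p * v for v, p in m) ** 2
        for m in magnitudes
    )
    half_width = alpha * sqrt(spread)

    inside = Fraction(0)
    for combo in product(*magnitudes):
        total = sum(v for v, _ in combo)
        prob = Fraction(1)
        for _, p in combo:
            prob *= p  # independence: probabilities multiply
        if abs(total - mean) <= half_width:
            inside += prob
    return inside, 1 - Fraction(1) / Fraction(alpha) ** 2


# Illustrative magnitudes: two fair dice and one fair coin scored 0 or 1.
die = [(face, Fraction(1, 6)) for face in range(1, 7)]
coin = [(0, Fraction(1, 2)), (1, Fraction(1, 2))]
exact, bound = tchebycheff_check([die, die, coin], alpha=2)
print(exact, bound)
```

In small cases of this kind the exact probability is usually well above the bound, which is what one would expect of limits that hold for every distribution and every number of magnitudes.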
 
 14. From this very general result for the probable limits of 
 a sum composed of a number of independently varying magni- 
 tudes, Bernoulli's Theorem is easily derived. For let there be 
 
¹ For a proof see Czuber, loc. cit. vol. i. p. 210.
 
 
$s$ observations or trials, and $s$ magnitudes $x_1, x_2, \ldots, x_s$
corresponding, such that $x = 1$ when the event under consideration
occurs, and $x = 0$ when it fails. If the probability of the event's
occurrence is $p$, we have $a = p$, $b = p$, etc., and $a_1 = p$,
$b_1 = p$, etc. Hence the probability P that the number of the event's
occurrences will lie between the limits $sp \pm \alpha \sqrt{sp - sp^2}$,
i.e. between the limits $sp \pm \alpha \sqrt{spq}$ where $q = 1 - p$, is
$> 1 - \dfrac{1}{\alpha^2}$. If we compare this formula with the formula
for Bernoulli's Theorem already given, we find that, where this formula
gives $P > 1 - \dfrac{1}{\alpha^2}$, Bernoulli's Theorem with greater
precision gives $P = \Theta\!\left(\dfrac{\alpha}{\sqrt{2}}\right)$. The degree
of superiority in the matter of precision supplied by the latter can be
illustrated by the following table:

    $\alpha^2$    $\Theta(\alpha/\sqrt{2})$    $1 - 1/\alpha^2$

    1.5           .7788                        .333
    2             .8427                        .5
    4.5           .9661                        .7778
    8             .9953                        .875
    12.5          .9996                        .92
    18            .99998                       .9445
 
Thus when the limits are narrow and $\alpha$ is small, Bernoulli's
formula gives a value of P very much in excess of $1 - \dfrac{1}{\alpha^2}$. But
Bernoulli's formula involves a process of approximation which is
only valid when $s$ is large. Tchebycheff's formula involves no
such process and is equally valid for all values of $s$. We have
seen in § 11 that there are numerous cases in which for a
different reason Bernoulli's formula exaggerates the results,
and, therefore, Tchebycheff's more cautious limits may sometimes
prove useful.
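The comparison between the two formulae can be recomputed directly. The sketch below assumes that $\Theta$ denotes the ordinary error function (Python's `math.erf`), an identification which the tabulated values for $\alpha^2 = 2$, 4.5 and 8 support; the function name `bernoulli_vs_tchebycheff` is illustrative only.

```python
from math import erf, sqrt


def bernoulli_vs_tchebycheff(alpha_squared):
    """Return (Theta(alpha / sqrt 2), 1 - 1 / alpha^2) for a given alpha^2."""
    # Assumption: Theta is the error function, so Theta(alpha / sqrt 2)
    # equals erf(sqrt(alpha^2 / 2)).
    theta = erf(sqrt(alpha_squared / 2))
    bound = 1 - 1 / alpha_squared
    return theta, bound


for a2 in (1.5, 2, 4.5, 8, 12.5, 18):
    theta, bound = bernoulli_vs_tchebycheff(a2)
    print(f"{a2:>5}  {theta:.5f}  {bound:.4f}")
```

Run over the same values of $\alpha^2$, this should agree with the printed table to about three decimal places, and it makes plain how slowly Tchebycheff's bound approaches unity compared with the Bernoullian value.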
 
The deduction of a corresponding form of Poisson's Theorem
from Tchebycheff's general formula obviously follows on similar
lines. For we put¹ $a = p_1$, $b = p_2$, etc., and $a_1 = p_1$, $b_1 = p_2$, etc.,

¹ I am using the same notation as that used for Poisson's Theorem in § 8.
 
 
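and find that the probability that the number of the event's
occurrences will lie between the limits

$$\sum_{1}^{s} p_\kappa \pm \alpha \sqrt{\sum_{1}^{s} p_\kappa - \sum_{1}^{s} p_\kappa^2},$$

i.e. between the limits $sp \pm \alpha \sqrt{\sum p_\kappa q_\kappa}$,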
i.e. between the limits $sp \pm \dfrac{\alpha}{\sqrt{2}} k \sqrt{s}$,
is greater than $1 - \dfrac{1}{\alpha^2}$.
 
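In Crelle's Journal¹ Tchebycheff proves Poisson's Theorem
directly by a method similar to his general method, and also
obtains several supplementary results such as the following:

I. If the chances of an event E in $\mu$ consecutive trials are
$p_1, p_2, \ldots, p_\mu$ respectively, and their sum is $s$, the probability that
E will occur at least $m$ times is less than

$$\frac{1}{2(m - s)} \sqrt{\frac{m(\mu - m)}{\mu}} \left(\frac{s}{m}\right)^{m} \left(\frac{\mu - s}{\mu - m}\right)^{\mu - m + 1},$$

provided that $m > s + 1$;

II. and the probability that E will not occur more than $n$ times
is less than

$$\frac{1}{2(s - n)} \sqrt{\frac{n(\mu - n)}{\mu}} \left(\frac{\mu - s}{\mu - n}\right)^{\mu - n} \left(\frac{s}{n}\right)^{n + 1},$$

provided that $n < s - 1$.

III. Hence the probability that E will occur less than $m$ times
and more than $n$ is greater than

$$1 - \frac{1}{2(m - s)} \sqrt{\frac{m(\mu - m)}{\mu}} \left(\frac{s}{m}\right)^{m} \left(\frac{\mu - s}{\mu - m}\right)^{\mu - m + 1} - \frac{1}{2(s - n)} \sqrt{\frac{n(\mu - n)}{\mu}} \left(\frac{\mu - s}{\mu - n}\right)^{\mu - n} \left(\frac{s}{n}\right)^{n + 1},$$

provided $m > s + 1$, $n < s - 1$.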
 
15. Tchebycheff's methods have been set out and his results
admirably extended by A. A. Markoff.² And some developments
along the same lines by Tschuprow ("Zur Theorie der
Stabilität statistischer Reihen," Skandinavisk Aktuarietidskrift,
1919) have convinced me that Tchebycheff's discovery is far
more than a technical device for solving a special problem, and
points the way to the fundamental method for attacking these
questions on the mathematical side. The Laplacian mathematics,
although it still holds the field in most text-books, is
really obsolete, and ought to be replaced by the very beautiful
work which we owe to these three Russians.

¹ Vol. 33 (1846), Démonstration élémentaire d'une proposition générale de la
théorie des probabilités.

² The reader is referred to Markoff's Wahrscheinlichkeitsrechnung, and
particularly to p. 67, for a striking development, along mathematical lines, of
 
16. There is one other investigation relating to Bernoulli's
Theorem which deserves remark. I have already pointed out,
in § 2, that the dispersion about the most probable value, even
when the conditions for the applicability of Bernoulli's Theorem
in its non-approximate form are strictly fulfilled, is unsymmetrical.
The fact, that the usual approximation for the probability
of a divergence $h$ from the most probable number of
occurrences (the notation is that of § 2 above) takes the form
$\dfrac{1}{\sqrt{2\pi mpq}} e^{-\frac{h^2}{2mpq}}$, which is the same for $+h$ as for $-h$, has led
to this want of symmetry being very generally overlooked;
and it is not uncommon to assume that the probability of a
given divergence less than $pm$ is equal to that of the same divergence
in excess of $pm$, and, in general, that the probability of
the frequency's exceeding $pm$ in a set of $m$ trials is equal to that
of its falling short of $pm$.
 
That this is not strictly the case is obvious. If a die is cast
60 times, the most probable number of appearances of the ace
is 10; but the ace is more likely to appear 9 times than 11 times;
and much more likely (about 5 times as likely) not to appear at
all than to appear exactly 20 times. That this must be so will
be clear to the reader (without his requiring to trouble himself
with the algebra), when he reflects that the ace cannot appear
less often than not at all, whereas it may well appear more than
20 times, so that the smallness of the possible divergence in
defect from the most probable value 10, as compared with the
possible divergence in excess, must be made up for by the greater
 
Tchebycheff's leading idea. Further references to later memoirs, which, being
in the Russian language, are inaccessible to me, will be found in the
Bibliography.
 
 
frequency of any given defection as compared with the corresponding
excess. Thus the actual frequency in a series of trials
of an event, of which the probability at each trial is less than ½,
is likely to fall short of its most probable value more often than
it exceeds it. What is in fact true is that the mathematical
expectation of deficiency is equal to the mathematical expectation
of excess, i.e. that the sum of the possible deficiencies each
multiplied by its probability is equal to the sum of the possible
excesses each multiplied by its probability.
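For the die cast 60 times, these statements can be examined with exact arithmetic. The following is a minimal sketch, assuming strictly Bernoullian conditions with probability 1/6 at each cast; the helper `pmf` and the variable names are illustrative, not taken from the text. It compares the chances of 9 and of 11 aces, the chances of falling short of and of exceeding 10, and the two mathematical expectations.

```python
from fractions import Fraction
from math import comb


def pmf(k, trials, p):
    """Exact Bernoullian probability of k occurrences in `trials` trials."""
    return comb(trials, k) * p**k * (1 - p) ** (trials - k)


trials, p = 60, Fraction(1, 6)
mode = 10  # most probable number of aces; here it coincides with the mean

# Probability of falling short of, and of exceeding, the most probable value.
p_short = sum(pmf(k, trials, p) for k in range(0, mode))
p_excess = sum(pmf(k, trials, p) for k in range(mode + 1, trials + 1))

# Mathematical expectation of deficiency and of excess about the value 10.
exp_deficiency = sum((mode - k) * pmf(k, trials, p) for k in range(0, mode))
exp_excess = sum(
    (k - mode) * pmf(k, trials, p) for k in range(mode + 1, trials + 1)
)

print(float(pmf(9, trials, p)), float(pmf(11, trials, p)))
print(float(p_short), float(p_excess))
print(exp_deficiency == exp_excess)
```

Because the probabilities are held as exact fractions, the equality of the two expectations is tested without rounding error; the same helper can be used to evaluate any other particular frequency.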
 
The actual measurement of this want of symmetry and the
determination of the conditions, in which it can be safely
neglected, involves laborious mathematics, of which I am only
acquainted with one direct investigation, that published in the
Proceedings of the London Mathematical Society by Mr. T. C.
Simmons.¹
 
For the details of the proof I must refer the reader to Mr.
Simmons's article. His principal theorem² is as follows: If
$\dfrac{1}{a + 1}$ is the probability of the event at each trial and $n(a + 1)$ the
number of trials, $n$ and $a$ being integers,³ the probability that the
frequency of occurrence will fall short of $n$ is always greater than
the probability that it will exceed $n$; the difference between the
two probabilities being a maximum when $n = 1$, constantly
diminishing as $n$ increases, lying always between
$\dfrac{1}{3} \dfrac{a - 1}{a + 1}$ times the greatest term in
$\left(\dfrac{a}{a + 1} + \dfrac{1}{a + 1}\right)^{n(a + 1)}$ and
$\dfrac{1}{3} \dfrac{a - 1}{a + 1}$ times the greatest term in
$\left(\dfrac{a}{a + 1} + \dfrac{1}{a + 1}\right)^{(n + 1)(a + 1)}$, and being
approximately equal, when $n$ is very large, to
$\dfrac{1}{3} \dfrac{a - 1}{\sqrt{2\pi n a (a + 1)}}$.

¹ "A New Theorem in Probability." Mr. Simmons claimed novelty for
his investigation, and so far as I know this claim is justified; but recent
investigations obtaining closer approximations to Bernoulli's Theorem by means
of the Method of Moments are essentially directed towards the same problem.

A somewhat analogous point has, however, been raised by Professor Pearson
in his article (Phil. Mag., 1907) on "The Influence of Past Experience on Future
Expectation." He brings out an exactly similar want of symmetry in the
probabilities of the various possible frequencies about the most probable
frequency, when the calculation is based, not on Bernoulli's Theorem as in Mr.
Simmons's investigation, but on Laplace's rule of succession (see next chapter).
The want of symmetry has also been pointed out by Professor Lexis
(Abhandlungen, p. 120).

² I am not giving his own enunciation of it.

³ Mr. Simmons does not seem to have been able to remove this restriction
on the generality of his theorem, but there does not seem much reason to doubt
that it can be removed.
 
The following table gives the value of the excess $\Delta$ of the
probability of a frequency less than $pm$ over the probability of
a frequency greater than $pm$ for various values of $p$ the probability
and $m$ the number of trials, where $p = \dfrac{1}{a + 1}$, $m = n(a + 1)$,
as calculated by Mr. Simmons:

    p         m       Δ

    1/3       3       .037037
    1/3       15      .02243662
    1/3       24      .0182706
    1/4       4       .054687
    1/4       20      .03201413
    1/10      10      .084777
    1/10      20      .068673713
    1/100     100     .101813
    1/100     200     .081324387
    1/1000    1000    .103454
 
Thus unless not only $m$ but $mp$ also is large the want of symmetry
is likely to be appreciable. Thus it is easily found that in 100
sets of 4 trials each, where $p = \frac{1}{4}$, the actual frequency is likely to
exceed the most probable 26 times and to fall short of it 31 times;
and in 100 sets of 10 trials each, where $p = \frac{1}{10}$, to exceed 26 times
and to fall short 34 times.
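The excess $\Delta$ can be recomputed exactly for the smaller cases. This is a sketch under the Bernoullian assumptions of the text; the function name `asymmetry` is an illustrative choice, and the three pairs of $p$ and $m$ are taken from the cases discussed above.

```python
from fractions import Fraction
from math import comb


def asymmetry(p, m):
    """Excess of P(frequency < pm) over P(frequency > pm) in m trials."""
    centre = p * m
    below = above = Fraction(0)
    for k in range(m + 1):
        prob = comb(m, k) * p**k * (1 - p) ** (m - k)
        if k < centre:
            below += prob
        elif k > centre:
            above += prob
    return below - above


for p, m in [(Fraction(1, 3), 3), (Fraction(1, 4), 4), (Fraction(1, 10), 10)]:
    print(p, m, float(asymmetry(p, m)))
```

Multiplying the separate probabilities `below` and `above` by 100 gives the expected numbers of sets, out of 100, in which the frequency falls short of or exceeds its most probable value.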
 
Mr. Simmons was first directed to this investigation through
noticing in the examination of sets of random digits that "each
digit presented itself, with unexpected frequency, less than 1/10 of
the number of times. For instance, in 100 sets of 150 digits each,
I found that a digit presented itself in a set more frequently under
15 times than over 15 times; similarly in the case of 80 sets each
of 250 digits, and also in other aggregations." Its possible
bearing on such experiments with dice and roulette, as are
described at the end of this chapter, is clear. But apart from
these artificial experiments, it is sometimes worth the statistician's
while to bear in mind this appreciable want of symmetry
in the distribution about the mode or most probable value in
many even of those cases in which Bernoullian conditions are
strictly fulfilled.
 
17. I will conclude this chapter by an account of some of the
attempts which have been made to verify à posteriori the conclusions
of Bernoulli's Theorem. These attempts are nearly
useless, first, because we can seldom be certain à priori that the
conditions assumed in Bernoulli's Theorem are fulfilled, and,
secondly, because the theorem predicts not what will happen
but only what is, on certain evidence, likely to happen. Thus
even where our results do not verify Bernoulli's Theorem, the
theorem is not thereby discredited. The results have bearing
on the conditions in which the experiments took place, rather
than upon the truth of the theorem. In spite, therefore, of the
not unimportant place which these attempts have in the history
of probability, their scientific value is very small. I record them,
because they have a good deal of historical and psychological
interest, and because they satisfy a certain idle curiosity from
which few students of probability are altogether free.¹
 
18. The data for these investigations have been principally
drawn from four sources: coin-tossing, the throw of dice, lotteries,
and roulette; for in such cases as these the conditions for
Bernoulli's Theorem seem to be fulfilled most nearly. The earliest
recorded experiment was carried out by Buffon,² who, assisted
 
¹ Mr. Yule (Introduction to Statistics, p. 254) recommends its indulgence:
"The student is strongly recommended to carry out a few series of such
experiments personally, in order to acquire confidence in the use of the theory."
Mr. Yule himself has indulged moderately.

² Essai d'arithmétique morale (see Bibliography), published 1777, said to
have been composed about 1760.
 
 
by a child tossing a coin into the air, played 2048 partis of the
Petersburg game, in which a coin is thrown successively until
the parti is brought to an end by the appearance of heads. The
same experiment was repeated by a young pupil of De Morgan's
'for his own satisfaction.'¹ In Buffon's trials there were 1992
tails to 2048 heads; in Mr. H.'s (De Morgan's pupil) 2044 tails to
2048 heads. A further experiment, due to Buffon's example,
was carried out by Quetelet² in 1837. He drew 4096 balls from
an urn, replacing them each time, and recorded the result at
different stages, in order to show that the precision of the result
tended to increase with the number of the experiments. He
drew altogether 2066 white balls and 2030 black balls. Following
in this same tradition is the experiment of Jevons,³ who made
2048 throws of ten coins at a time, recording the proportion of
heads at each throw and the proportion of heads altogether. In
the whole number of 20,480 single throws, he obtained heads
10,353 times. More recently Weldon⁴ threw twelve dice 4096
times, recording the proportion of dice at each throw which
showed a number greater than three.
 
All these experiments, however, are thrown completely into
the shade by the enormously extensive investigations of the Swiss
astronomer Wolf, the earliest of which were published in 1850
and the latest in 1893.⁵ In his first set of experiments Wolf
completed 1000 sets of tosses with two dice, each set continuing
until every one of the 21 possible combinations had occurred at
least once. This involved altogether 97,899 tosses, and he then
completed a total of 100,000. These data enabled him to work
out a great number of calculations, of which Czuber quotes the
following, namely a proportion of .83533 of unlike pairs, as against
the theoretical value .83333, i.e. 5/6. In his second set of experiments
Wolf used two dice, one white and one red (in the first set
the dice were indistinguishable), and completed 20,000 tosses, the
details of each result being recorded in the Vierteljahrsschrift der
Naturforschenden Gesellschaft in Zürich. He studied particularly
the number of sequences with each die, and the relative frequency
of each of the 36 possible combinations of the two dice. The
sequences were somewhat fewer than they ought to have been,
and the relative frequency of the different combinations very
different indeed from what theory would predict.¹ The explanation
of this is easily found; for the records of the relative
frequency of each face show that the dice must have been very
irregular, the six face of the white die, for example, falling 38
per cent more often than the four face of the same die. This,
then, is the sole conclusion of these immensely laborious experiments,
that Wolf's dice were very ill made. Indeed the experiments
could have had no bearing except upon the accuracy
of his dice. But ten years later Wolf embarked upon one more
series of experiments, using four distinguishable dice (white,
yellow, red, and blue) and tossing this set of four 10,000 times.
Wolf recorded altogether, therefore, in the course of his life
280,000 results of tossing individual dice. It is not clear that
Wolf had any well-defined object in view in making these
records, which are published in curious conjunction with various
astronomical results, and they afford a wonderful example of the
pure love of experiment and observation.²

¹ Formal Logic, p. 185, published 1847. De Morgan gives Buffon's results,
as well as his pupil's, in full. Buffon's results are also investigated by Poisson,
Recherches, pp. 132-135.

² Letters on the Theory of Probabilities (Eng. trans.), p. 37.

³ Principles of Science (2nd ed.), p. 208.

⁴ Quoted by Edgeworth, "Law of Error" (Ency. Brit. 10th ed.), and by
Yule, Introduction to Statistics, p. 254.

⁵ See Bibliography. Of the earliest of these investigations I have no
first-hand knowledge and have relied upon the account given by Czuber, loc. cit.
vol. i. p. 149. For a general account of empirical verifications of Bernoulli's
Theorem reference may be made to Czuber, Wahrscheinlichkeitsrechnung, vol. i.
pp. 139-152, and Czuber, Entwicklung der Wahrscheinlichkeitstheorie, pp. 88-91.
 
19. Another series of calculations have been based upon the
ready-made data provided by the published results of lotteries
and roulette.³

¹ Czuber quotes the principal results (loc. cit. vol. i. pp. 149-151). The
frequencies of only 4, instead of 18, out of the 36 combinations lay within the
probable limits, and the standard deviation was 76.8 instead of 23.2.

² The latest experiment of the kind, of which I am aware, is that of Otto
Meissner ("Würfelversuche," Zeitschrift für Math. und Phys. vol. 62 (1913), pp.
149-156), who recorded 24 series of 180 throws each with four distinguishable
dice.

³ For the publication of such returns there has always been a sufficient
demand on the part of gamblers. An Almanach romain sur la loterie royale de
France was published at Paris in 1830, which contained all the drawings of the
French lottery (two or three a month) from 1758 to 1830. Players at Monte
Carlo are provided with cards and pins with which to record the results of
successive coups, and the results at the tables are regularly published in Le
Monaco. Gamblers study these returns on account of the belief, which they
usually hold, that as the number of cases is increased the absolute deviation from
the most probable proportion becomes less, whereas at the best Bernoulli's
 
 
Czuber¹ has made calculations based on the lotteries
of Prague (2854 drawings) and Brünn (2703 drawings) between
the years 1754 and 1886, in which the actual results agree
very well with theoretical predictions. Fechner² employed the
lists of the ten State lotteries of Saxony between the years 1843
and 1852. Of a rather more interesting character are Professor
Karl Pearson's investigations³ into the results of Monte Carlo
Roulette as recorded in Le Monaco in the course of eight weeks.
Applying Bernoulli's Theorem, on the hypothesis of the
equi-probability of all the compartments throughout the investigation,
he found that the actually recorded proportions of red and black
were not unexpected, but that alternations and long runs were
so much in excess that, on the assumption of the exact accuracy
of the tables, the à priori odds were at least a thousand millions
to one against some of the recorded deviations. Professor
Pearson concluded, therefore, that Monte Carlo Roulette is not
objectively a game of chance in the sense that the tables on which
it is played are absolutely devoid of bias. Here also, as in the
case of Wolf's dice, the conclusion is solely relevant, not to the
theory or philosophy of Chance, but to the material shapes of
the tools of the experiment.

Professor Pearson's investigations into Roulette, which dealt
with 33,000 Monte Carlo coups, have been overshadowed, just
Theorem shows that the proportionate deviation decreases while the absolute
deviation increases. Cf. Houdin's Les Tricheries des Grecs dévoilées: "In a
game of chance, the oftener the same combination has occurred in succession, the
nearer we are to the certainty that it will not recur at the next cast or turn-up.
This is the most elementary of the theories on probabilities; it is termed the
maturity of the chances." Laplace (Essai philosophique, p. 142) quotes an
amusing instance of the same belief not drawn from the annals of gambling:
"J'ai vu des hommes désirant ardemment d'avoir un fils, n'apprendre qu'avec
peine les naissances des garçons dans le mois où ils allaient devenir pères.
S'imaginant que le rapport de ces naissances à celles des filles devait être le
même à la fin de chaque mois, ils jugeaient que les garçons déjà nés rendaient
plus probables les naissances prochaines des filles."

The literature of gambling is very extensive, but, so far as I am acquainted
with it, excessively lacking in variety, the maturity of the chances and the
martingale continually recurring in one form or another. The curious reader
will find tolerable accounts of such topics in Proctor's Chance and Luck, and
Sir Hiram Maxim's Monte Carlo Facts and Fallacies.

¹ Zum Gesetz der grossen Zahlen. The results are summarised in his
Wahrscheinlichkeitsrechnung, vol. i. p. 139.

² Kollektivmasslehre, p. 229. These results also are summarised by Czuber,
loc. cit.

³ The Chances of Death, vol. i.
 
 
as all other tosses of coins and dice have been outdone by Wolf,
by Dr. Karl Marbe,¹ who has examined 80,000 coups from Monte
Carlo and elsewhere. Dr. Marbe arrived at exactly opposite
conclusions; for he claims to have shown that long runs, so far
from being in excess, were greatly in defect. Dr. Marbe introduces
this experimental result in support of his thesis that the
world is so constituted that long runs do not as a matter of fact
occur in it.² Not merely are long runs very improbable. They
do not, according to him, occur at all. But we may doubt
whether roulette can tell us very much either of the laws of logic
or of the constitution of the universe.

Dr. Marbe's main thesis is identical, as he himself recognises,
with one of the heterodox contentions of D'Alembert.³ But this
principle of variety, precisely opposite to the usual principle of
Induction, can have no claim to be accepted à priori and, as a
general principle, there is no adequate evidence to support it from
experience. Its origin is to be found, perhaps, in the fact that
 
¹ Naturphilosophische Untersuchungen zur Wahrscheinlichkeitstheorie.

² Dr. Marbe's monograph has given rise in Germany to a good deal of
discussion, not directed towards showing what a preposterous method this is for
demonstrating a natural law, but because the experimental result itself does not
really follow from the data and is due to a somewhat subtle error in Marbe's
reasoning, by which he has been led into an incorrect calculation of the probable
proportions à priori of the various sequences. The problem is discussed by
Von Bortkiewicz, Brömse, Bruns, Grimsehl, and Grünbaum (for exact references
to these see the Bibliography), and by Lexis (Abhandlungen, pp. 222-226) and
Czuber (Wahrscheinlichkeitsrechnung, vol. i. pp. 144-149). Largely as a result
of this controversy, Von Bortkiewicz has lately devoted a complete treatise
(Die Iterationen) to the mathematics of 'runs.' Dr. Marbe has been given
far more attention by his colleagues in Germany than he conceivably deserves.

³ D'Alembert's principal contributions to Probability are most accessible in
the volumes of his Opuscules mathématiques (1761). Works on Probability
usually contain some reference to D'Alembert, but his sceptical opinions,
rejected rather than answered by the orthodox school of Laplace, have not always
received full justice. D'Alembert has three main contentions to which in his
various papers he constantly recurs:

(1) That a probability very small mathematically is really zero;

(2) That the probabilities of two successive throws with a die are not
independent;

(3) That 'mathematical expectation' is not properly measured by the
product of the probability and the prize.

The first and third of these were partly advanced in explanation of the
Petersburg paradox (see p. 316). The second is connected with the first, and
was also used to support his incorrect evaluation of the probability of heads
twice running; but D'Alembert, in spite of many of his results being wrong,
does not altogether deserve the ridicule which he has suffered at the hands of
writers, who accepted without sceptical doubts the hardly less incorrect
conclusions of the orthodox theory of that time.
 
 
in a certain class of cases, especially where conscious human
agency comes in, it may contain some element of truth. The
fact of an act's having been done in a particular way once may
be a special reason for thinking that it will not be performed on
the next occasion in precisely the same manner. Thus in many
so-called random events some slight degree of causal and material
dependence between successive occurrences may, nevertheless,
exist. In these cases 'runs' may be fewer and shorter than those
which we should predict, if a complete absence of such dependence
is assumed. If, for example, a pack of cards be dealt, collected,
and shuffled, to the extent that card-players do as a rule shuffle,
there may be a greater presumption against the second hand's
being identical with the first than against any other particular
distribution. In the case of croupiers long experience might
possibly suggest some psychological generalisation: that they
are very mechanical, giving an excess of numbers belonging to a
particular section of the wheel, or, on the other hand, that when
a croupier sees a run beginning, he tends to vary his spin more than
usual, thus bringing runs to an end sooner than he ought.¹ At any
rate, it is worth emphasising once more that from such experiments
as these this is the only kind of knowledge which we can
hope to obtain: knowledge of the material construction of a
die or of the psychology of a croupier.

¹ A good roulette table is, however, so delicate an instrument that no
probable degree of regularity of habit on the part of the spinner could be
sufficient to produce regularity in the result.
 
CHAPTER XXX 
 
THE MATHEMATICAL USE OF STATISTICAL FREQUENCIES FOR
THE DETERMINATION OF PROBABILITY A POSTERIORI: THE
METHODS OF LAPLACE

Utilissima est aestimatio probabilitatum, quanquam in exemplis juridicis
politicisque plerumque non tam subtili calculo opus est, quam accurata
omnium circumstantiarum enumeratione. (Leibniz.)
 
 1. In the preceding chapter we have assumed that the probability 
 of an event at each of a series of trials is given, and have considered 
 how to infer from this the probabihties of the various possible 
 frequencies of the event over the whole series, without discussing 
 in detail by what method the initial probability had been deter- 
 mined. In statistical inquiries it is generally the case that this 
 initial probability is based, not upon the Principle of In- 
 difEerence, but upon the statistical frequencies of similar events 
 which have been observed previously. In this chapter, therefore, 
 we must commence the complementary part of our inquiry, — 
 namely, into the method of deriving a measure of probability 
 from an observed statistical frequency. 
 
 I do not myself believe that there is any direct and simple 
 method by which we can make the transition from an observed 
 numerical frequency to a numerical measure of probability. 
 The problem, as I view it, is part of the general problem of founding
 judgments of probability upon experience, and can only be
 dealt with by the general methods of induction expounded in 
 Part III. The nature of the problem precludes any other method, 
 and direct mathematical devices can all be shown to depend 
 upon insupportable assumptions. In the next chapters we will 
 consider the applicability of general inductive methods to this 
 problem, and in this we will endeavour to discredit the mathe- 
 matical charlatanry by which, for a hundred years past, the basis 
 of theoretical statistics has been greatly undermined.
 
 
 2. Two direct methods have been commonly employed, 
 theoretically inconsistent with one another, though not in every
 case noticeably discrepant in practice. The first and simplest of 
 these may be termed the Inversion of Bernoulli's Theorem, and 
 the other Laplace's Rule of Succession. 
 
 The earliest discussion of this problem is to be found in the 
 Correspondence of Leibniz and Jac. Bernoulli,¹ and its true
 nature cannot be better indicated than by some account of the 
 manner in which it presented itself to these very illustrious 
 philosophers. The problem is tentatively proposed by Bernoulli 
 in a letter addressed to Leibniz in the year 1703. We can determine
 from à priori considerations, he points out, by how much it
 is more probable that we shall throw 7 rather than 8 with two dice,
 but we cannot determine by such means the probability that a
 young man of twenty will outlive an old man of sixty. Yet is it
 not possible that we might obtain this knowledge à posteriori
 from the observation of a great number of similar couples, each 
 consisting of an old man and a young man ? Suppose that the 
 young man was the survivor in 1000 cases and the old man in 500 
 cases, might we not conclude that the young man is twice as likely 
 as the old man to be the survivor ? For the most ignorant 
 persons seem to reason in this way by a sort of natural instinct, 
 and feel that the risk of error is diminished as the number of 
 observations is increased. Might not the solution tend asymptotically
 to some determinate degree of probability with the
 increase of observations ? Nescio, Vir Amplissime, an speculationibus
 istis soliditatis aliquid inesse Tibi videatur.
 
 Leibniz's reply goes to the root of the difficulty. The calcula- 
 tion of probabilities is of the utmost value, he says, but in statisti- 
 cal inquiries there is need not so much of mathematical subtlety 
 as of a precise statement of all the circumstances. The possible 
 contingencies are too numerous to be covered by a finite number
 of experiments, and exact calculation is, therefore, out of the 
 question. Although nature has her habits, due to the recurrence
 of causes, they are general, not invariable. Yet empirical calculation,
 although it is inexact, may be adequate in affairs of practice.²
 
 1 For the exact references see Bibliography.
 
 2 Leibniz's actual expressions (in a letter to Bernoulli, December 3, 1703) are
 as follows : Utilissima est aestimatio probabilitatum, quanquam in exemplis
 juridicis politicisque plerumque non tam subtili calculo opus est, quam accurata
 omnium circumstantiarum enumeratione. Cum empirice aestimamus probabilitates
 per experimenta successuum, quaeris an ea via tandem aestimatio
 perfecte obtineri possit. Idque a Te repertum scribis. Difficultas in eo mihi
 inesse videtur, quod contingentia seu quae infinitis pendent circumstantiis, per
 finita experimenta determinari non possunt ; natura quidem suas habet
 consuetudines, natas ex reditu causarum, sed non nisi ὡς ἐπὶ τὸ πολύ. Novi morbi
 inundant subinde humanum genus, quodsi ergo de mortibus quotcunque
 experimenta feceris, non ideo naturae rerum limites posuisti, ut pro futuro variare
 non possit. Etsi autem empirice non posset haberi perfecta aestimatio, non
 ideo minus empirica aestimatio in praxi utilis et sufficiens foret.

 Bernoulli in his answer fell back upon the analogy of balls
 drawn from an urn, and maintained that without estimating
 each separate contingency we might determine within narrow
 limits the proportion favouring each alternative. If the true
 proportion were 2 : 1, we might estimate it with moral certainty
 à posteriori as lying between 201 : 100 and 199 : 100. "Certus
 sum," he concluded the controversy, "Tibi placituram demonstrationem,
 cum publicavero." But whether he was impressed by
 the just caution of Leibniz, or whether death intercepted him,
 he advances matters no further in the Ars Conjectandi. After
 dealing with some of Leibniz's objections ¹ and seeming to
 promise some mode of estimating probabilities à posteriori by an
 inversion of his theorem, he proves the direct theorem only and
 the book is suddenly at an end.

 1 The relevant passages are on pp. 224-227 of the Ars Conjectandi.

 3. In dealing with the correspondence of Leibniz and Bernoulli,
 I have not been mainly influenced by the historical interest
 of it. The view of Leibniz, dwelling mainly on considerations
 of analogy, and demanding "not so much mathematical subtlety
 as a precise statement of all the circumstances," is, substantially,
 the view which will be supported in the following chapters.
 The desire of Bernoulli for an exact formula, which would derive
 from the numerical frequency of the experimental results a
 numerical measure of their probability, preludes the exact
 formulas of later and less cautious mathematicians, which will be
 examined immediately.

 4. During the greater part of the eighteenth century there is
 no trace, I think, of the explicit use of the Inversion of Bernoulli's
 Theorem. The investigations carried out by D'Alembert, Daniel
 Bernoulli, and others relied upon the type of argument examined
 in Chapter XXV. They showed, that is to say, that certain
 observed series of events would have been very improbable, if
 we had supposed independence between some two factors or if
 some occurrence had been assumed to be as likely as not, and they
 inferred from this that there was in fact a measure of dependence
 or that the occurrence had probability in its favour. But they
 did not endeavour to pass from the observed frequency of occurrence
 to an exact measure of the probability. With the advent
 of Laplace more ambitious methods took the field.
 
 Laplace began by assuming without proof a direct inversion
 of Bernoulli's Theorem. Bernoulli's Theorem, in the form in
 which Laplace proved it, states that, if $p$ is the probability à
 priori (and $q = 1 - p$), there is a probability $P$ that the proportion
 of times $\frac{m}{\mu}$ of the event's occurrence in $\mu\,(= m + n)$ trials
 will lie between

 $$p \pm \gamma\sqrt{\frac{2pq}{\mu}}, \quad\text{where}\quad P = \frac{2}{\sqrt{\pi}}\int_0^{\gamma} e^{-t^2}\,dt + \frac{1}{\sqrt{2\pi\mu pq}}\,e^{-\gamma^2}.$$

 The inversion of the theorem, which he assumes without proof,
 states that, if the event is observed to happen $m$ times
 in $\mu$ trials, there is a probability $P$ that the probability
 of the event $p$ will lie between

 $$\frac{m}{\mu} \pm \gamma\sqrt{\frac{2mn}{\mu^3}}, \quad\text{where}\quad P = \frac{2}{\sqrt{\pi}}\int_0^{\gamma} e^{-t^2}\,dt + \frac{1}{\sqrt{2\pi mn/\mu}}\,e^{-\gamma^2}.$$

 The same result is also given by Poisson.¹ Thus, given the
 frequency of occurrence in $\mu$ trials, these writers infer the
 probability of occurrence at subsequent trials within certain
 limits, just as, given the à priori probability, Bernoulli's Theorem
 would enable them to predict the frequency of occurrence in $\mu$
 trials within corresponding limits.
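
 The two formulas can be compared numerically. What follows is a minimal
 sketch added for illustration, not anything given by Laplace or Poisson: the
 function names `laplace_P`, `exact_P` and `inverted_limits` are hypothetical,
 and the check against an exact binomial sum is an assumption about how one
 might test the approximation. It presupposes independent trials with a
 constant probability, which is exactly what a bare frequency cannot guarantee.

```python
import math

def laplace_P(gamma, mu, p):
    """Laplace's approximation to the probability that the proportion of
    occurrences in mu trials lies within p +/- gamma*sqrt(2pq/mu)."""
    q = 1.0 - p
    # math.erf(gamma) equals (2/sqrt(pi)) * integral of exp(-t^2) from 0 to gamma.
    return math.erf(gamma) + math.exp(-gamma ** 2) / math.sqrt(2 * math.pi * mu * p * q)

def exact_P(gamma, mu, p):
    """Exact binomial probability of the same event, summed in log space
    so that a large number of trials does not underflow."""
    q = 1.0 - p
    half_width = gamma * math.sqrt(2 * p * q / mu)
    lo = max(0, math.ceil(mu * (p - half_width) - 1e-9))
    hi = min(mu, math.floor(mu * (p + half_width) + 1e-9))
    total = 0.0
    for k in range(lo, hi + 1):
        log_term = (math.lgamma(mu + 1) - math.lgamma(k + 1) - math.lgamma(mu - k + 1)
                    + k * math.log(p) + (mu - k) * math.log(q))
        total += math.exp(log_term)
    return total

def inverted_limits(m, n, gamma):
    """Limits asserted for the probability by the inverted theorem,
    given m occurrences and n failures."""
    mu = m + n
    half_width = gamma * math.sqrt(2 * m * n / mu ** 3)
    return m / mu - half_width, m / mu + half_width

if __name__ == "__main__":
    # Bernoulli's illustration: the young man survives in 1000 of 1500 couples.
    print(laplace_P(1.0, 1500, 2 / 3), exact_P(1.0, 1500, 2 / 3))
    print(inverted_limits(1000, 500, 1.0))
```

 The direct functions take the probability as given and bound the frequency;
 `inverted_limits` merely substitutes the observed frequency for the
 probability, which is the step assumed without proof.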
 
 1 For an account of the treatments of this topic both by Laplace and by 
 Poisson, see Todhunter's History, pp. 554-557. Both of them also obtain a
 formula slightly different from that given above by a method analogous to the
 first part of the proof of Laplace's Rule of Succession ; i.e. by an application of
 the inverse principle of probability to the assumption that the probability of 
 the probability's lying within any interval is proportional to the length of the 
 interval. This discrepancy has given rise to some discussion. See Todhunter, 
 loc. cit. ; De Morgan, On a Question in the Theory of Probabilities ; Monro, On the 
 Inversion of Bernoulli's Theorem in Probabilities ; and Czuber, Entwicklung, 
 pp. 83, 84. But this is not the important distinction between the two mathematical
 methods by which this question has been approached, and this minor
 point, which is of historical interest mainly, I forbear to enter into. 
 
 
 If the number of trials is at all numerous, these limits are
 narrow and the purport of the inversion of Bernoulli's Theorem
 may therefore be put briefly as follows. By the direct theorem,
 if $p$ measures the probability, $p$ also measures the most probable
 value of the frequency ; by the inversion of the theorem, if
 $\frac{m}{m+n}$ measures the frequency, $\frac{m}{m+n}$ also measures the most probable
 value of the probability. The simplicity of the process has recommended
 it, since the time of Laplace, to a great number of
 writers. Czuber's argument, criticised on p. 351, with reference
 to the proportions of male and female births in Austria, is based
 upon an unqualified use of it. But examples abound throughout
 the literature of the subject, in which the theorem is employed in
 circumstances of greater or less validity.
 
 The theorem was originally given without proof, and is indeed 
 incapable of it, unless some illegitimate assumption has been 
 introduced. But, apart from this, there are some obvious objec- 
 tions. We have seen in the preceding chapter that Bernoulli's 
 Theorem itself cannot be applied to all kinds of data indiscriminately,
 but only when certain rather stringent conditions are fulfilled.
 Corresponding conditions are required equally for the
 inversion of the theorem, and it cannot possibly be inferred from 
 a statement of the number of trials and the frequency of occur- 
 rence merely, that these have been satisfied. We must know, 
 for instance, that the examined instances are similar in the main 
 relevant particulars, both to one another and to the unexamined
 instances to which we intend our conclusion to be applicable.
 An unanalysed statement of frequency cannot tell us this.
 
 This method of passing from statistical frequencies to probabilities
 is not, however, like the method to be discussed in a
 moment, radically false. With due qualifications it has its place
 in the solution of this problem. The conditions in which an
 inversion of Bernoulli's Theorem is legitimate will be elucidated
 in Chapter XXXI. In the meantime we will pass on to Laplace's
 second method, which is more powerful than the first and has
 obtained a wider currency. The more extreme applications of
 it are no longer ventured upon, but the theory which underlies
 it is still widely adopted, especially by French writers upon
 probability, and seldom repudiated.
 
 
 5. The formula in question, which Venn ¹ has called the Rule
 of Succession, declares that, if we know no more than that an
 event has occurred $m$ times and failed $n$ times under given conditions,
 then the probability of its occurrence when those conditions
 are next fulfilled is $\frac{m+1}{m+n+2}$. It is necessary, however,
 before we examine the proof of this formula, to discuss in detail
 the reasoning which leads up to it.
 
 This preliminary reasoning involves the Laplacian theory of
 'unknown probabilities.' The postulate, upon which it depends,
 is introduced to supplement the Principle of Indifference, and
 is in fact the extension of this principle from the probabilities
 of arguments, when we know nothing about the arguments, to the
 probabilities that the probabilities of arguments have certain
 values, when we know nothing about the probabilities. Laplace's
 enunciation is as follows : "Quand la probabilité d'un événement
 simple est inconnue, on peut lui supposer également toutes les
 valeurs depuis zéro jusqu'à l'unité. La probabilité de chacune
 de ces hypothèses tirée de l'événement observé est . . . une
 fraction dont le numérateur est la probabilité de l'événement dans
 cette hypothèse, et dont le dénominateur est la somme des
 probabilités semblables relatives à toutes les hypothèses. . . ." ²
 
 Thus when the probability of an event is unknown, we may
 suppose all possible values of the probability between 0 and 1 to
 be equally likely à priori. The probability, after the event has
 occurred, that the probability à priori was $\frac{1}{r}$ (say), is measured
 by a fraction of which $\frac{1}{r}$ is the numerator and the sum of all the
 possible à priori values the denominator. The origin of this rule
 is evident. If we consider the problem in which a ball is drawn
 from a bag containing an infinite number of black and white balls
 in unknown proportions, we have hypotheses, corresponding to
 each of the possible constitutions of the bag, the assumption of
 which yields in turn every value between 0 and 1 as the à priori
 probability of drawing a white ball. If we could assume that
 these constitutions are equally probable à priori, we should
 obtain probabilities for each of them à posteriori according to
 Laplace's rule.
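
 Laplace's rule, as just stated, can be put in a few lines. The sketch below
 is an added illustration under stated assumptions: the bag is given a finite,
 hypothetical number of balls (four) so that the hypotheses can be listed, and
 the function name `posterior_after_one_occurrence` is invented for the
 purpose. The equal likelihood of the constitutions is simply assumed here, as
 it is by Laplace.

```python
from fractions import Fraction

def posterior_after_one_occurrence(hypotheses):
    """Laplace's rule after a single occurrence of the event: each
    hypothetical a priori value divided by the sum of all of them.
    The hypotheses are assumed to be equally likely a priori."""
    total = sum(hypotheses)
    return [x / total for x in hypotheses]

if __name__ == "__main__":
    # Hypothetical bag of 4 balls, of which 0, 1, 2, 3 or 4 are white.
    constitutions = [Fraction(r, 4) for r in range(5)]
    for x, post in zip(constitutions, posterior_after_one_occurrence(constitutions)):
        print("a priori value", x, "-> a posteriori probability", post)
```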
 
 1 Logic of Chance, p. 190.
 2 Essai philosophique, p. 16.
 
 
 On the analogy of this Laplace assumes in general that, where
 everything is unknown, we may suppose an infinite number of
 possibilities, each of which is equally likely, and each of which
 leads to the event in question with a different degree of probability,
 so that for every value between 0 and 1 there is one and only one
 hypothetical constitution of things, the assumption of which
 invests the event with a probability of that value.
 
 6. It might be an almost sufficient criticism of the above to 
 point out that these assumptions are entirely baseless. But the 
 theory has taken so important a place in the development of 
 probability that it deserves a detailed treatment. 
 
 What, in the first place, does Laplace mean by an unknown
 probability ? He does not mean a probability, whose value is in
 fact unknown to us, because we are unable to draw conclusions
 which could be drawn from the data ; and he seems to apply the
 term to any probability whose value, according to the argument
 of Chapter III., is numerically indeterminate. Thus he assumes
 that every probability has a numerical value and that, in those
 cases where there seems to be no numerical value, this value is
 not non-existent but unknown ; and he proceeds to argue that
 where the numerical value is unknown, or as I should say where
 there is no such value, every value between 0 and 1 is equally
 probable. With the possible interpretations of the term 'unknown
 probability,' and with the theory that every probability
 can be measured by one of the real numbers between 0 and 1,
 I have dealt, as carefully as I can, in Chapter III. If the view
 taken there is correct, Laplace's theory breaks down immediately.
 But even if we were to answer these questions, not as they have
 been answered in Chapter III., but in a manner favourable to
 Laplace's theory, it remains doubtful whether we could legitimately
 attribute a value to the probability of an unknown probability's
 having such and such a value. If a probability is
 unknown, surely the probability, relative to the same evidence,
 that this probability has a given value, is also unknown ; and we
 are involved in an infinite regress.
 
 7. This point leads on to the second objection ; Laplace's
 theory requires the employment of both of two inconsistent
 methods. Let us consider a number of alternatives $a_1$, $a_2$, etc.,
 having probabilities $p_1$, $p_2$, etc. ; if we do not know anything
 about $a_1$, we do not know the value of its probability $p_1$, and we
 must consider the various possible values of $p_1$, namely $b_1$, $b_2$, etc.,
 the probabilities of these possible values being $q_1$, $q_2$, etc.
 respectively. There is no reason why this process should ever stop.
 For as we do not know anything about $b_1$, we do not know the
 value of its probability $q_1$, and we must consider the various
 possible values of $q_1$, namely $c_1$, $c_2$, etc., the probabilities of these
 possible values being $r_1$, $r_2$, etc. respectively ; and so on. This
 method consists in supposing that, when we do not know anything
 about an alternative, we must consider all the possible values of
 the probability of the alternative ; these possible values can form
 in their turn a set of alternatives, and so on. But this method
 by itself can lead to no final conclusion. Laplace superimposes
 on it, therefore, his other method of determining the probabilities
 of alternatives about which we know nothing, namely, the
 Principle of Indifference. According to this method, when
 we know nothing about a set of alternatives, we suppose the
 probabilities of each of them to be equal. In some parts of
 his writings (and this is true also of most of his followers) he
 applies this method from the beginning. If, that is to say, we
 know nothing about $a_1$, since $a_1$ and its contradictory form a pair
 of exhaustive alternatives two in number, the probability of these
 alternatives is equal and each is $\frac{1}{2}$. But in the reasoning which
 leads up to the Law of Succession he chooses to apply this method
 at the second stage, having used the other method at the first
 stage. If, that is to say, we know nothing about $a_1$, its probability
 $p_1$ may have any of the values $b_1$, $b_2$, etc. where $b_r$ is any
 fraction between 0 and 1 ; and, as we know nothing about the
 probabilities $q_1$, $q_2$, etc. of these alternatives $b_1$, $b_2$, etc., we may
 by the Principle of Indifference suppose them to be equal. This
 account may seem rather confused ; but it is not easy to give
 a lucid account of so confused a doctrine.
 
 8. Turning aside from these considerations, let us examine
 the theory, for a moment, from another side. When we reach the
 Rule of Succession, it will be seen that the hypothetical à priori
 probabilities are treated as if they were possible causes of the
 event. It is assumed, that is to say, that the number of possible
 sets of antecedent conditions is proportional to the number of
 real numbers between 0 and 1 ; and that these fall into equal
 groups, each group corresponding to one of the real numbers
 between 0 and 1, this number measuring the degree of probability
 with which we could predict the event, if we knew that an antecedent
 condition belonging to that group was fulfilled. It is
 then assumed that all of these possible antecedent conditions are
 à priori equally likely. The argument has arisen by false analogy
 from the problem in which a ball is drawn from an urn containing
 an infinite number of black and white balls. But for the assumption
 that we have in general the kind of knowledge which is
 necessary about the possible antecedents, no reasonable foundation
 has been suggested.
 
 De Morgan endeavoured to deal with the difficulty in much 
 the same way in the following passage : ¹ "In determining the
 chance which exists (under known circumstances) for the happening
 of an event a number of times which lies between certain
 limits, we are involved in a consideration of some difficulty, 
 namely, the probability of a probability, or, as we have called it, 
 the presumption of a probability. To make this idea more clear, 
 remember that any state of probability may be immediately 
 made the expression of the result of a set of circumstances, which 
 being introduced into the question, the difficulty disappears. 
 The word presumption refers distinctly to an act of the mind, or a 
 state of the mind, while in the word probability we feel disposed 
 rather to think of the external arrangements on the knowledge 
 of which the strength of our presumption ought to depend, than 
 of the presumption itself." The point of this explanation lies 
 in the assumption that " any state of probability may be imme- 
 diately made the expression of the result of a set of circumstances." 
 It cannot be allowed that this is generally true ; ² and even in
 those cases in which it is true we are thrown back on the à priori
 probabilities of the various sets of circumstances which need not 
 be, as De Morgan assumes, either equal or exhaustive alternatives. 
 
 1 Cabinet Encyclopaedia, p. 87.

 2 For instance, it is not true even in the standard instance of balls drawn from
 an urn containing black and white in unknown proportions, unless the number
 of balls is infinite.

 9. The proof of the Rule of Succession, which is based upon
 this theory of unknown probabilities, is, briefly, as follows :

 If $x$ stands for the à priori probability of an event in given
 conditions, then the probability that the event will occur $m$
 times and fail $n$ times in these conditions is $x^m(1-x)^n$. If,
 however, $x$ is unknown, all values of it between 0 and
 1 are à priori equally probable. It follows from these two
 sets of considerations that, if the event has been observed
 to occur $m$ times out of $m+n$, the probability à posteriori that
 $x$ lies between $x$ and $x+dx$ is proportional to $x^m(1-x)^n\,dx$,
 and is equal, therefore, to $A\,x^m(1-x)^n\,dx$ where $A$ is a constant.
 Since the event has in fact occurred, and since $x$ must have
 one of its possible values, $A$ is determined by the equation

 $$A\int_0^1 x^m(1-x)^n\,dx = 1, \qquad \therefore\ A = \frac{\Gamma(m+n+2)}{\Gamma(m+1)\,\Gamma(n+1)}.$$

 Hence the probability that the event will occur at the $(m+n+1)$th
 trial, when we know that it has occurred $m$ times in $m+n$
 trials, is

 $$A\int_0^1 x^{m+1}(1-x)^n\,dx.$$

 If we substitute the value of $A$ found above, this is equal to
 $\frac{m+1}{m+n+2}$.¹
 1 The theorem is sometimes enunciated by contemporary writers in a much
 more guarded form, e.g. by Czuber, Wahrscheinlichkeitsrechnung, vol. i. p. 197,
 and by Bachelier, Calcul des probabilités, p. 487. Bachelier, instead of assuming
 that the à priori probabilities of all possible values of the probability of the
 event are equal, writes $\omega(y)\,dy$ as the à priori probability that the probability is
 $y$, so that after $m$ occurrences in $m+n$ trials the probability that the probability
 lies between $y$ and $y+dy$ is

 $$\frac{y^m(1-y)^n\,\omega(y)\,dy}{\int_0^1 y^m(1-y)^n\,\omega(y)\,dy}.$$

 If we have no idea of $\omega$ à priori, he suggests that the simplest hypothesis
 is to put $\omega = 1$, which leads, as above, to Laplace's Law of Succession.
 He also proposes the hypothesis $\omega(y) = a_0 + a_1 y + a_2 y^2 + \dots$, in which
 case the denominator is a series of Eulerian integrals. There is a discussion
 of the Law of Succession, and of the contradictions and paradoxes to which
 it leads, by E. T. Whittaker and others in Part VI. vol. viii. (1920) of the
 Transactions of the Faculty of Actuaries in Scotland.

 The class of problem to which the theorem is supposed to
 apply is the following : There are certain conditions such that we
 are ignorant à priori as to whether they do or do not lead to the
 occurrence of a particular event ; on $m$ out of $m+n$ occasions,
 however, on which these conditions have been observed, the
 event has occurred ; what is the probability in the light of this
 experience that the event will occur on the next occasion ? The
 answer to all such problems is $\frac{m+1}{m+n+2}$. In the cases where
 $n = 0$, i.e. when the event has invariably occurred, the formula
 yields the result $\frac{m+1}{m+2}$. In the case where the conditions have
 been observed once only and the event has occurred on that
 occasion, the result is $\frac{2}{3}$. If the conditions have never been met
 with at all, the probability of the event is $\frac{1}{2}$. And even in the
 case where on the only occasion on which the conditions were
 observed, the event did not occur, the probability is $\frac{1}{3}$.
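
 The steps of the proof, and the special cases just listed, can be reproduced
 with exact fractions. This is a sketch added for illustration only; the names
 `beta_integral` and `rule_of_succession` are hypothetical. It takes the
 disputed assumption of equally probable values of $x$ as given and uses the
 standard evaluation of the Eulerian integral for whole-number exponents.

```python
from fractions import Fraction
from math import factorial

def beta_integral(a, b):
    """Integral of x^a (1-x)^b over [0, 1] for non-negative integers a, b,
    which equals a! b! / (a + b + 1)!."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def rule_of_succession(m, n):
    """Probability of occurrence at the next trial after m occurrences and
    n failures, following the proof: A times the integral of x^(m+1)(1-x)^n."""
    A = 1 / beta_integral(m, n)          # normalising constant of the proof
    return A * beta_integral(m + 1, n)

if __name__ == "__main__":
    for m, n in [(1, 0), (0, 0), (0, 1), (5, 0)]:
        print(m, n, rule_of_succession(m, n))
```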
 
 Some of the flaws in this proof have been already explained.
 One minor objection may be pointed out in addition. It is
 assumed that, if $x$ is the à priori probability of the event's happening
 once, then $x^n$ is the à priori probability of its happening $n$
 times in succession, whereas by the theorem's own showing the
 knowledge that the event has happened once modifies the probability
 of its happening a second time ; its successive occurrences
 are not, therefore, independent. If the à priori probability of the
 event is $\frac{1}{2}$, and if, after it has been observed once, the probability
 that it will occur a second time is $\frac{2}{3}$, then it follows that the à
 priori probability of its occurring twice is not $\frac{1}{2}\times\frac{1}{2}$, but $\frac{1}{2}\times\frac{2}{3}$,
 i.e. $\frac{1}{3}$ ; and in general the à priori probability of its happening
 $n$ times in succession is not $\left(\frac{1}{2}\right)^n$ but $\frac{1}{n+1}$.
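
 The arithmetic of this objection is easily checked. The sketch below is an
 added illustration; the function name `prob_run_of_occurrences` is
 hypothetical. It multiplies together the successive probabilities which the
 Rule of Succession itself assigns when the event has so far never failed.

```python
from fractions import Fraction

def prob_run_of_occurrences(n):
    """A priori probability of n occurrences in succession, built up from
    the rule's own successive values (k + 1)/(k + 2) after k occurrences."""
    prob = Fraction(1)
    for k in range(n):
        prob *= Fraction(k + 1, k + 2)
    return prob

if __name__ == "__main__":
    for n in range(1, 6):
        # Compare with the value (1/2)^n that independence would require.
        print(n, prob_run_of_occurrences(n), Fraction(1, 2) ** n)
```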
 
 10. But refinements of disproof are hardly needed. The
 principle's conclusion is inconsistent with its premisses. We
 begin with the assumption that the à priori probability of an event,
 about which we have no information and no experience, is unknown,
 and that all values between 0 and 1 are equally probable.
 We end with the conclusion that the à priori probability of
 such an event is $\frac{1}{2}$. It has been pointed out in § 7 that this
 contradiction was latent, as soon as the Principle of Indifference
 was superimposed on the principle of unknown probabilities.

 The theorem's conclusions, moreover, are a reductio ad
 absurdum of the reasoning upon which it is based. Who could
 suppose that the probability of a purely hypothetical event, of
 whatever complexity, in favour of which no positive argument
 exists, the like of which has never been observed, and which has
 failed to occur on the one occasion on which the hypothetical
 conditions were fulfilled, is no less than $\frac{1}{3}$ ? Or if we do suppose it,
 we are involved in contradictions, for it is easy to imagine more
 than three incompatible events which satisfy these conditions.
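
 The contradiction can be exhibited in a line or two of arithmetic. The sketch
 below is an added illustration with a hypothetical function name; it assumes
 only that incompatible events must have probabilities summing to no more
 than 1, and that each event has failed on the single occasion observed.

```python
from fractions import Fraction

def total_probability(k):
    """Sum of the probabilities which the rule assigns to k mutually
    incompatible events, each of which has failed once and never occurred."""
    each = Fraction(0 + 1, 0 + 1 + 2)   # (m + 1)/(m + n + 2) with m = 0, n = 1
    return k * each

if __name__ == "__main__":
    for k in range(1, 6):
        print(k, total_probability(k), total_probability(k) > 1)
```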
 
 11. The theorem was first suggested by the problem of the urn 
 which contains black and white balls in unknown proportions :
 m white and n black balls have been successively drawn and 
 replaced ; what is the probability that the next draw will yield 
 a white ball ? It is supposed that all compositions of the urn are 
 equally probable, and the proof then proceeds precisely as in the 
 case of the more general rule of succession. The rule of succession 
 has been, sometimes, directly deduced from the case of the urn, 
 by assimilating the occurrence of the event to the drawing of a 
 white ball and its non-occurrence to the drawing of a black ball. 
 
 On the hypothesis that all compositions of the urn are equally 
 probable, an hypothesis to which in general there is nothing corre- 
 sponding, and on the further hypothesis that the number of balls 
 is infinite, this solution is correct.¹ But the rule of succession
 does not apply, as it is easy to demonstrate, even to the case of
 balls drawn from an urn, if the number of balls is finite.²
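
 The dependence on the number of balls can be seen numerically. The sketch
 below is an added illustration, assuming (as the accompanying note does) that
 each possible ratio of black to white is equally likely and that draws are
 replaced; the function name `finite_urn_next_black` is hypothetical. For an
 urn of $n$ balls, after $p$ black and $q$ white draws, it evaluates
 $\frac{1}{n}\sum_r r^{p+1}(n-r)^q \big/ \sum_r r^p(n-r)^q$, the sums running
 from $r = 0$ to $r = n$.

```python
from fractions import Fraction

def finite_urn_next_black(n_balls, p, q=0):
    """Chance that the next ball is black after p black and q white balls
    have been drawn and replaced, every composition r = 0..n_balls of black
    balls being taken as equally likely a priori."""
    num = sum(r ** (p + 1) * (n_balls - r) ** q for r in range(n_balls + 1))
    den = sum(r ** p * (n_balls - r) ** q for r in range(n_balls + 1))
    return Fraction(num, n_balls * den)

if __name__ == "__main__":
    # After one black draw the usual rule gives 2/3; a small urn does not.
    for n_balls in (2, 10, 100, 10000):
        print(n_balls, float(finite_urn_next_black(n_balls, 1)))
```

 With only two balls the value after a single black draw is 5/6, and it
 approaches 2/3 only as the number of balls grows.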
 
 1 This second condition is often omitted (e.g. Bertrand, Calcul des
 probabilités, p. 172).

 2 The correct solution for the case of a finite number of balls, on the
 hypothesis that each possible ratio is equally likely, is as follows : The probability
 of a black ball at a further trial, after black balls have been successively
 withdrawn and replaced $p$ times, is $\frac{1}{n}\,\frac{s_{p+1}}{s_p}$, where there are $n$ balls and $s_r$ represents
 the sum of the $r$th powers of the first $n$ natural numbers. This reduces to
 $\frac{p+1}{p+2}$, the solution usually given, when $n$ is infinite. More generally, if
 $p$ black balls and $q$ white balls have been drawn and replaced, the chance
 that the next ball will be black is

 $$\frac{1}{n}\,\frac{\sum_{r=0}^{r=n} r^{p+1}(n-r)^q}{\sum_{r=0}^{r=n} r^{p}(n-r)^q}.$$

 12. If the Rule of Succession is to be adopted by adherents of
 the Frequency Theory of Probability,³ it is necessary that they
 should make some modification in the preliminary reasoning on
 which it is based. By Dr. Venn, however, the rule has been
 explicitly rejected on the ground that it does not accord with
 experience.¹ But Professor Karl Pearson, who accepts it, has
 made the necessary restatement,² and it will be worth while to
 examine the reasoning when it is put in this form. Professor

 3 See Chapter VIII.
 Pearson's proof of the Rule of Succession is as follows :

 "I start, as most mathematical writers have done, with 'the
 equal distribution of ignorance,' or I assume the truth of Bayes'
 Theorem. I hold this theorem not as rigidly demonstrated, but
 I think with Edgeworth ³ that the hypothesis of the equal distribution
 of ignorance is, within the limits of practical life, justified
 by our experience of statistical ratios, which à priori are
 unknown, i.e. such ratios do not tend to cluster markedly round
 any particular value. 'Chances' lie between 0 and 1, but our
 experience does not indicate any tendency of actual chances to
 cluster round any particular value in this range. The ultimate
 basis of the theory of statistics is thus not mathematical but
 observational. Those who do not accept the hypothesis of the
 equal distribution of ignorance and its justification in observation
 are compelled to produce definite evidence of the clustering of
 chances, or to drop all application of past experience to the judgment
 of probable future statistical ratios. . . .

 "Let the chance of a given event occurring be supposed to lie
 between $x$ and $x+dx$, then if on $n = p+q$ trials an event has been
 observed to occur $p$ times and fail $q$ times, the probability that
 the true chance lies between $x$ and $x+dx$ is, on the equal
 distribution of our ignorance,

 $$\frac{x^p(1-x)^q\,dx}{\int_0^1 x^p(1-x)^q\,dx}.$$

 "This is Bayes' Theorem. . . .⁴
 
 1 Logic of Chance, p. 197.

 2 "On the Influence of Past Experience on Future
 Expectation," Phil. Mag., 1907, pp. 365-378. The quotations given below are
 taken from this article.

 3 This reference is, no doubt, to Edgeworth's "Philosophy of Chance"
 (Mind, 1884, p. 230), when he wrote : "The assumption that any probability-
 constant about which we know nothing in particular is as likely to have one value
 as another is grounded upon the rough but solid experience that such constants
 do, as a matter of fact, as often have one value as another." See also Chapter
 VII. § 6, above.

 4 Professor Pearson's use of this title for the above formula is not, I think,
 historically correct. Bayes' Theorem is the Inverse Principle of Probability
 itself, and not this extension of it.
 
 "Now suppose that a second trial of $m = r+s$ instances be
 made, then the probability that the given event will occur $r$ times
 and fail $s$, is on the à priori chance being between $x$ and $x+dx$

 $$\frac{m!}{r!\,s!}\,x^r(1-x)^s \cdot \frac{x^p(1-x)^q\,dx}{\int_0^1 x^p(1-x)^q\,dx},$$

 and accordingly the total chance $C_r$, whatever $x$ may be of the
 event occurring $r$ times in the second series, is

 $$C_r = \frac{m!}{r!\,s!}\,\frac{\int_0^1 x^{p+r}(1-x)^{q+s}\,dx}{\int_0^1 x^p(1-x)^q\,dx}.$$

 This is, with a slight correction, Laplace's extension of Bayes'
 Theorem." ¹
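
 Professor Pearson's total chance $C_r$ can be evaluated exactly for whole
 numbers. The sketch below is an added illustration, not taken from his
 article: the names `beta_integral` and `pearson_C` are hypothetical, and the
 equal distribution of ignorance is assumed throughout. It also confirms that
 a second series of a single trial gives back the Rule of Succession.

```python
from fractions import Fraction
from math import comb, factorial

def beta_integral(a, b):
    """Integral of x^a (1-x)^b over [0, 1] for non-negative integers a, b."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def pearson_C(r, s, p, q):
    """Total chance of r occurrences and s failures in a second series,
    after p occurrences and q failures in the first."""
    return comb(r + s, r) * beta_integral(p + r, q + s) / beta_integral(p, q)

if __name__ == "__main__":
    p, q, m = 3, 2, 5
    chances = [pearson_C(r, m - r, p, q) for r in range(m + 1)]
    print(chances, sum(chances))
    # A second series of one trial reduces to (p + 1)/(p + q + 2).
    print(pearson_C(1, 0, p, q))
```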
 
 13. This argument can be restated as follows. Of all the
 objects which satisfy $\phi(x)$, let us suppose that a proportion $p$
 also satisfy $f(x)$. In this case $p$ measures the probability that
 any object, of which we know only that it is $\phi$, is in fact also $f$.
 Now if we do not know the value of $p$ and have no relevant information
 which bears upon it, we can assume à priori that all
 values of $p$ between 0 and 1 are equally likely. This assumption,
 which is termed the ' equal distribution of ignorance,' is justified 
 by our experience of statistical ratios. Our experience, that is 
 to say, leads us to suppose that of all the theories, which could be 
 propounded, there are just as many which are always true as 
 there are which are always false, just as many which are true once 
 in fifty times as there are which are true once in three times, and 
 so on. Professor Pearson challenges those who do not accept 
 this assumption to produce definite evidence to the contrary. 
 
 The challenge is easily met. It would not be difficult to pro- 
 duce 10,000 positive theories which are always false corresponding 
 to every one which is always true, and 10,000 correlations of posi- 
 
 ¹ The rest of the article is concerned with the determination of the probable 
 error when Laplace's Rule of Succession is used not simply to yield the prob- 
 ability of a single additional occurrence, but to predict the probable limits within 
 which the frequency will lie in a considerable series of additional trials. Pro- 
 fessor Pearson's method applies more rigorous methods of approximation to 
 the fundamental formulae given above than have been sometimes used. As 
 my main purpose in this chapter is to dispute the general validity of the funda- 
 mental formulae, it is not worth while to consider these further developments 
 here. If the validity of the fundamental formula were to be granted, Professor 
 Pearson's methods of approximation would, I think, be satisfactory. 
 
CH. XXX STATISTICAL INFERENCE 381 
 
 tive qualities which hold less often than once in three times for 
 every one we can name which holds more often than once in three 
 times. And the converse is the case for negative theories and 
 correlations between negative qualities ; for corresponding to 
 every positive theory which is true there is a negative theory 
 which is false, and so on. Thus experience, if it shows anything, 
 shows that there is a very marked clustering of statistical ratios 
 in the neighbourhoods of zero and unity : of those for positive 
 theories and for correlations between positive qualities in the 
 neighbourhood of zero, and of those for negative theories and for 
 correlations between negative qualities in the neighbourhood of 
 unity. Moreover, we are seldom in so complete a state of ignor- 
 ance regarding the nature of the theory or correlation under 
 investigation as not to know whether or not it is a positive theory 
 or a correlation between positive qualities. In general, therefore, 
 whenever our investigation is a practical one, experience, if it 
 tells us anything, tells us not only that the statistical ratios cluster 
 in the neighbourhood of zero and unity, but in which of these two 
 neighbourhoods the ratio in this particular case is most likely 
 à priori to be found. If we seek to discover what proportion of 
 the population suffer from a certain disease, or have red hair, or 
 are called Jones, it is preposterous to suppose that the proportion 
 is as likely à priori to exceed as to fall short of (say) fifty per cent. 
 As Professor Pearson applies this method to investigations where 
 it is plain that the qualities involved are positive, he seems to 
 maintain that experience shows that there are as many positive 
 attributes which are shared by more than half of any population 
 as there are which are shared by less than half. 
 
 It is also worth while to point out that it is formally impossible 
 that it should be true of all characters, simple and complex, that 
 they are as likely to have any one frequency as any other. For let 
 us take a character c which is compound of two characters a and 
 b, between which there is no association, and let us suppose that 
 a has a frequency x in the population in question and that b has 
 a frequency y, so that, in the absence of association, the frequency 
 z of c is equal to xy. Then it is easy to show that, if all values of 
 x and y between 0 and 1 are equally probable, all values of z 
 between 0 and 1 are not equally probable. For the value - 
 
 is more probable than any other, and the possible values of 
 
 
 z become increasingly improbable as they differ more widely 
 from -• 
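 The formal point, that z = xy cannot be spread evenly over the interval from 0 to 1 when x and y are, can be checked by direct enumeration. The following is a minimal sketch in Python; the grid of equally weighted midpoints is an assumption introduced here as a discrete stand-in for "equally probable" values, and the function name is illustrative. It checks only the non-uniformity of z.

```python
from fractions import Fraction


def decile_shares(n: int = 200) -> list[Fraction]:
    """Share of products z = x*y falling in each tenth of [0, 1], with x
    and y each ranging over the n equally weighted midpoints (2i-1)/(2n)."""
    counts = [0] * 10
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # z = (2i-1)(2j-1) / (4 n^2); integer arithmetic avoids rounding.
            decile = (10 * (2 * i - 1) * (2 * j - 1)) // (4 * n * n)
            counts[min(decile, 9)] += 1
    return [Fraction(c, n * n) for c in counts]


if __name__ == "__main__":
    for k, share in enumerate(decile_shares()):
        print(f"z in [{k / 10:.1f}, {(k + 1) / 10:.1f}): {float(share):.3f}")
```

 Were all values of z equally probable, each tenth would receive a share of one in ten; under the stated assumptions the shares are far from equal.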
 
 It may be added that the conclusions, which Professor 
 Pearson himself derives from this method, provide a reductio 
 ad absurdum of the arguments upon which they rest. He con- 
 siders, for example, the following problem : A sample of 100 of a 
 population shows 10 per cent affected with a certain disease. 
 What percentage may be reasonably expected in a second sample 
 of 100 ? By approximation he reaches the conclusion that the 
 percentage of the character in the second sample is as likely to 
 fall inside as outside the limits, 7.85 and 13.71. Apart from the 
 preceding criticisms of the reasoning upon which this depends, 
 it does not seem reasonable upon general grounds that we should 
 be able on so little evidence to reach so certain a conclusion. The 
 argument does not require, for example, that we have any know- 
 ledge of the manner in which the samples are chosen, of the 
 positive and negative analogies between the individuals, or indeed 
 anything at all beyond what is given in the above statement. 
 The method is, in fact, much too powerful. It invests any posi- 
 tive conclusion, which it is employed to support, with far too high 
 a degree of probability. Indeed this is so foolish a theorem 
 that to entertain it is discreditable. 
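 The limits 7.85 and 13.71 appear to be recoverable from the fundamental formulae alone, which illustrates how little evidence the method requires. The following is a minimal sketch in Python, assuming a uniform prior for the unknown chance and a "probable error" of 0.6745 standard deviations as for a normal curve. Whether Professor Pearson's own approximation ran exactly in this way is an assumption, and the function name is illustrative.

```python
from fractions import Fraction
from math import sqrt


def predictive_mean_and_sd(p: int, q: int, m: int) -> tuple[float, float]:
    """Mean and standard deviation of the number of occurrences in a second
    sample of m, after p occurrences and q failures, when the unknown chance
    x has a uniform prior (so that x follows a Beta(p+1, q+1) law)."""
    a, b = p + 1, q + 1
    mean_x = Fraction(a, a + b)
    var_x = Fraction(a * b, (a + b) ** 2 * (a + b + 1))
    mean_x_times_1_minus_x = Fraction(a * b, (a + b) * (a + b + 1))
    # Total variance: binomial scatter for fixed x, plus the scatter of x.
    variance = m * mean_x_times_1_minus_x + m * m * var_x
    return float(m * mean_x), sqrt(float(variance))


if __name__ == "__main__":
    mean, sd = predictive_mean_and_sd(p=10, q=90, m=100)
    lower, upper = mean - 0.6745 * sd, mean + 0.6745 * sd
    print(f"{lower:.2f} to {upper:.2f}")  # prints "7.85 to 13.71"
```

 Nothing enters the calculation except the three numbers 10, 90 and 100.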
 
 14. The Rule of Succession has played a very important part 
 in the development of the theory of probability. It is true that 
 it has been rejected by Boole ¹ on the ground that the hypotheses 
 on which it is based are arbitrary, by Venn ² on the ground that it 
 does not accord with experience, by Bertrand ³ because it is 
 ridiculous, and doubtless by others also. But it has been very 
 widely accepted, by De Morgan,⁴ by Jevons,⁵ by Lotze,⁶ by 
 Czuber,⁷ and by Professor Pearson,⁸ to name some representative 
 writers of successive schools and periods. And, in any case, it 

 ¹ Laws of Thought, p. 369. 

 ² Logic of Chance, p. 197. 

 ³ Calcul des probabilités, p. 174. 

 ⁴ Article in Cabinet Encyclopaedia, p. 64. 

 ⁵ Principles of Science, p. 297. 

 ⁶ Logic, pp. 373, 374 ; Lotze propounds a " simple deduction " " as convin- 
 cing " to him " as the more obscure analysis, by which it is usually obtained." 
 The proof is among the worst ever conceived, and may be commended to those 
 who seek instances of the profound credulity of even considerable thinkers. 

 ⁷ Wahrscheinlichkeitsrechnung, vol. i. p. 199, though much more guardedly 
 and with more qualifications than in the form discussed above. 

 ⁸ Loc. cit. 
 
 
 is of interest as being one of the most characteristic results of a 
 way of thinking in probability introduced by Laplace, and never 
 thoroughly discarded to this day. Even amongst those writers 
 who have rejected or avoided it, this rejection has been due 
 more to a distrust of the particular applications of which the law 
 is susceptible than to fundamental objections against almost 
 every step and every presumption upon which its proof depends. 
 Some of these particular applications have certainly been 
 surprising. The law, as is evident, provides a numerical measure 
 of the probability of any simple induction, provided only that our 
 ignorance of its conditions is sufficiently complete, and, although, 
 when the number of cases dealt with is small, its results are in- 
 credible, there is, when the number dealt with is large, a certain 
 plausibility in the results it gives. But even in these cases 
 paradoxical conclusions are not far out of sight. When Laplace 
 proves that, account being taken of the experience of the human 
 race, the probability of the sun's rising to-morrow is 1,826,214 to 1, 
 this large number may seem in a kind of way to represent our 
 state of mind about the matter. But an ingenious German, 
 Professor Bobek,¹ has pushed the argument a degree further, and 
 proves by means of these same principles that the probability of 
 the sun's rising every day for the next 4000 years, is not more, 
 approximately, than two-thirds, a result less dear to our natural 
 prejudices. 

 ¹ Lehrbuch der Wahrscheinlichkeitsrechnung, p. 208. 
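 Both figures follow from the same expression: after n occurrences without a failure, the chance of m further occurrences without a failure is, on the equal distribution of ignorance, (n + 1)/(n + m + 1). The following is a minimal sketch in Python. The count of days is the one implied by the odds quoted from Laplace; the number of days in 4000 years is an assumption made here, so the second figure is only an illustration and not a reconstruction of Professor Bobek's own arithmetic.

```python
from fractions import Fraction


def unbroken_run(n: int, m: int) -> Fraction:
    """Chance of m further occurrences without failure, after n unbroken
    occurrences, under a uniform prior: (n + 1) / (n + m + 1)."""
    return Fraction(n + 1, n + m + 1)


if __name__ == "__main__":
    days_observed = 1_826_213  # implied by odds of 1,826,214 to 1
    tomorrow = unbroken_run(days_observed, 1)
    print(tomorrow / (1 - tomorrow))  # odds in favour: prints 1826214

    days_ahead = 4000 * 365 + 1000  # 4000 years at 365.25 days (assumed)
    print(float(unbroken_run(days_observed, days_ahead)))
```

 On these assumptions the second value comes out a little above one-half, which is consistent with the statement that it is not more than two-thirds.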
 
CHAPTER XXXI 
 
 THE INVERSION OF BERNOULLI'S THEOREM 
 
 1. I CONCLUDE, then, that the application of the mathematical 
 methods, discussed in the preceding chapter, to the general 
 problem of statistical inference is invalid. Our state of know- 
 ledge about our material must be positive, not negative, before 
 we can proceed to such definite conclusions as they purport to 
 justify. To apply these methods to material, unanalysed in 
 respect of the circumstances of its origin, and without reference 
 to our general body of knowledge, merely on the basis of arith- 
 metic and of those of the characteristics of our material with 
 which the methods of descriptive statistics are competent to 
 deal, can only lead to error and to delusion. 
 
 But I go further than this in my opposition to them. Not 
 only are they the children of loose thinking, and the parents of 
 charlatanry. Even when they are employed by wise and com- 
 petent hands, I doubt whether they represent the most fruitful 
 form in which to apply technical and mathematical methods to 
 statistical problems, except in a limited class of special cases. 
 The methods associated with the names of Lexis, Von Bortkiewicz, 
 and Tschuprow (of whom the last named forms a link, to some 
 extent, between the two schools), which will be briefly described 
 in the next chapter, seem to me to be much more clearly con- 
 sonant with the principles of sound induction. 
 
 2. Nevertheless it is natural to suppose that the fundamental 
 ideas, from which these methods have sprung, are not wholly 
 égarés. It is reasonable to presume that, subject to suitable con- 
 ditions and qualifications, an inversion of Bernoulli's Theorem 
 must have validity. If we knew that our material could be 
 likened to a game of chance, we might expect to infer chances 
 from frequencies, with the same sort of confidence as that with 
 
 
 which we infer frequencies from chances. This part of our 
 inquiry will not be complete, therefore, until we have endeavoured 
 to elucidate the conditions for the validity of an Inversion of 
 Bernoulli's Theorem. 
 
 3. The problem is usually discussed in terms of the happening 
 of an event under certain conditions, that is to say, of the co- 
 existence of the conditions, as affecting a particular event, with 
 that event. The same problem can be dealt with more generally 
 and more conveniently as an investigation of the correlation 
 between two characters A(x) and B(x), which, as in Part III., 
 are propositional functions which may be said to concur or co- 
 exist when they are both true of the same argument x. Given 
 that, within the field of our knowledge, B(x) is true for a certain 
 proportion of the values of x for which A(x) is true, what is the 
 probability for a further value a of x that, if A(a) holds, B(a) will 
 hold also ? 
 
 Let us suppose that the occurrence of an instance of A(x) is a 
 sign of one of the events e_1(x), e_2(x) . . . or e_m(x), and that these 
 are exhaustive, exclusive, and ultimate alternatives. By ex- 
 haustive it is meant that, whenever there is an instance of A(x), 
 one of the e's is present ; by exclusive, that the presence of one 
 of the e's is not a sign of the presence of any other, but not that 
 the concurrence of two or more of the e's is in fact impossible ; 
 by ultimate, that no one of the e's is a disjunction of two or more 
 alternatives which might themselves be members of the e's. 
 Let us assume that these alternatives are initially and throughout 
 the argument equally probable, which, subject to the above con- 
 ditions, is justified by the Principle of Indifference. We have no 
 reason, that is to say, and no part of our evidence ever gives us 
 one, for thinking that A(a) is more likely to be a sign of one of the 
 e's than of any other, or even for thinking that some e's, although 
 we do not know which, are more likely to occur than others. 
 Let us also assume that, out of e_1(x), e_2(x) . . . e_m(x), the set 
 e_1(x), e_2(x) . . . e_l(x), and these only, are signs or occasions of 
 B(x) ; and further that we have no evidence bearing on the actual 
 magnitude of the integers l and m, so that the ratio l/m is the 
 only factor of which the probability varies as the evidence 
 accumulates. Let us assume, lastly, that our knowledge of the 
 several instances of B(x) is adequate to establish a perfect analogy 
 between them ; the instances a, etc., of B(x), that is to say, must 
 
 
 not have anything in common except B, unless we have reason 
 to know that the additional resemblances are immaterial. Even 
 by these considerable simplifications not every difficulty has 
 been avoided. But a development along the usual lines with 
 the assistance of Bernoulli's Theorem is now possible. 
 
 Let l/m = q. If the value of q were known, the problem would 
 be solved. For this numerical ratio would represent the prob- 
 ability that A is, in any random instance, a sign of B ; and no 
 further evidence, which satisfies the conditions of the preceding 
 hypothesis, can possibly modify it. But in the inverse problem 
 q is not known ; and our problem is to determine whether evidence 
 can be forthcoming of such a kind, that, as this evidence is in- 
 creased in quantity, the probability that A will be in any instance 
 a sign of B, tends to a limit which lies between two determinate 
 ratios, just as the probability of an inductive generalisation may 
 tend towards certainty, when the evidence is increased in a 
 manner satisfying given conditions. 
 
 Let f(q) represent the proposition that q is the true value of 
 l/m. Let q' represent the ratio of the number of instances actually 
 before us in which A has been accompanied by B to that of the 
 instances in which A has not been accompanied by B ; and let 
 f'(q') be the proposition which asserts this. Now if the ratio q 
 is known, then, subject to the assumptions already stated, the 
 number q must also represent the à priori probability in any 
 instance, both before and after the results of other instances are 
 known, that A, if it occurs, will be accompanied by B. We have, 
 in fact, the conditions as set forth in Chapter XXIX., in which 
 Bernoulli's Theorem can be validly applied, so that this theorem 
 enables us to give a numerical value, for all numerical values of 
 q and q', to the probability f'(q')/hf(q), which expression repre- 
 sents the likelihood à priori of the frequency q', given q. 

 An application of the inverse formula allows us to infer from 
 the above the à posteriori probability of q, given q', namely : 

 $$\frac{f(q)/h \cdot f'(q')/hf(q)}{\sum f(q)/h \cdot f'(q')/hf(q)}$$

 where the summation in the denominator covers all possible 
 values of q. In rough applications of this inverse of Bernoulli's 
 Theorem it has been usual to suppose that f(q)/h is constant for 
 all values of q ; that, in other words, all possible values of the 
 
 
 ratio q are à priori equally likely. If this supposition were 
 legitimate, the formula could be reduced to the algebraical ex- 
 pression 

 $$\frac{f'(q')/hf(q)}{\sum f'(q')/hf(q)}$$

 all the terms of which can be determined numerically by Ber- 
 noulli's Theorem. It is easy to show that it is a maximum when 
 q = q', i.e. that q' is the most probable value of l/m, and that, 
 when the instances are very numerous, it is very improbable that 
 l/m differs from q' widely. If, therefore, the number of instances 
 is increased in such a manner that the ratio continues in the 
 neighbourhood of q', the probability that the true value of l/m 
 is nearly q' tends to certainty ; and, consequently, the prob- 
 ability, that A is in any instance a sign of B, also tends to a 
 magnitude which is measured by q'. 
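 The inverse formula on a finite set of candidate ratios can be written out directly. The following is a minimal sketch in Python, in which the terms are taken to be binomial chances supplied by Bernoulli's Theorem and q' is read as the observed frequency of B among the instances of A. The number of alternatives, the observed counts and the function name are hypothetical values chosen for illustration.

```python
from fractions import Fraction
from math import comb


def posterior(prior: dict[Fraction, Fraction], hits: int, trials: int) -> dict[Fraction, Fraction]:
    """Inverse formula over a finite set of candidate ratios q:
    posterior(q) = prior(q) * likelihood(q) / (sum of the same over all q)."""

    def likelihood(q: Fraction) -> Fraction:
        # Chance of `hits` occurrences in `trials` when the true ratio is q.
        return comb(trials, hits) * q**hits * (1 - q) ** (trials - hits)

    weights = {q: w * likelihood(q) for q, w in prior.items()}
    total = sum(weights.values())
    return {q: w / total for q, w in weights.items()}


if __name__ == "__main__":
    m = 10  # hypothetical number of alternatives
    candidates = [Fraction(l, m) for l in range(m + 1)]
    uniform = {q: Fraction(1, m + 1) for q in candidates}
    post = posterior(uniform, hits=30, trials=100)  # observed frequency 3/10
    print(max(post, key=post.get))  # most probable value of q
```

 With the prior taken as constant, the most probable candidate is the one equal to the observed frequency, provided that frequency is among the candidates.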
 
 I see, however, no justification for the assumption that all 
 possible values of the ratio q are à priori equally likely. It is 
 not even equivalent to the assumptions that all integral values 
 of l and m respectively are equally probable. I am not satisfied 
 either that different values of q, or that different values of m, 
 satisfy the conditions which have been laid down in Part I. for 
 alternatives which are equal before the Principle of Indifference. 
 There seem, for instance, to be relevant differences between the 
 statement that A can arise in exactly two ways and the state- 
 ment that it can arise in exactly a thousand ways. We must, 
 therefore, be content with some lesser assumption and with a 
 less precise form for our final conclusion. 
 
 4. Since, in accordance with our hypothesis, m cannot exceed 
 some finite number, and since l must necessarily be less than m, 
 the possible values of m, and therefore of q, are finite in number. 
 Perhaps we can assume, therefore, as one of our fundamental 
 assumptions, that there is à priori a finite probability in favour 
 of each of these possible values. Let μ be the finite number 
 which m cannot exceed. Then there is a finite probability for 
 each of the intervals ¹ 

 $$\frac{1}{\mu}\ \text{to}\ \frac{2}{\mu},\quad \frac{2}{\mu}\ \text{to}\ \frac{3}{\mu},\quad \ldots\quad \frac{\mu-1}{\mu}\ \text{to}\ 1$$
 
 1 The intervals are supposed to include their lower but not their upper 
 limit. 
 
 
 that q lies in this interval ; but we cannot assume that there is 
 an equal probability for each interval. 

 We must now return to the formula 

 $$\frac{f(q)/h \cdot f'(q')/hf(q)}{\sum f(q)/h \cdot f'(q')/hf(q)}$$

 which represents the à posteriori probability of q, given q'. Since 
 by sufficiently increasing the number of instances, the sum of 
 terms f'(q')/hf(q) for possible values of q within a certain finite 
 interval in the neighbourhood of q' can be made to exceed the 
 other terms by any required amount, and since the sum of the 
 values of f(q)/h for possible values of q within this interval is 
 finite, it clearly follows that a finite number of instances can 
 make the probability, that q lies in an interval of magnitude 
 1/μ in the neighbourhood of q', to differ from certainty by less 
 than any finite amount however small. 
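 The weaker conclusion needs only that every possible value of q has some finite prior probability, not that the probabilities are equal. The following is a minimal sketch in Python which takes a deliberately unequal prior and follows the posterior probability that q lies near q' as the instances are multiplied. The value of μ, the shape of the prior, the observed frequency and the function name are all hypothetical.

```python
from fractions import Fraction


def mass_near(prior, q_prime, trials, half_width):
    """Posterior probability that q lies within half_width of q_prime, for a
    prior giving every candidate ratio a finite (possibly unequal) weight."""
    hits = q_prime * trials
    assert hits.denominator == 1, "q_prime * trials must be a whole number"
    hits = int(hits)
    # The binomial coefficient is common to every term and cancels.
    weights = {q: w * q**hits * (1 - q) ** (trials - hits) for q, w in prior.items()}
    total = sum(weights.values())
    near = sum(w for q, w in weights.items() if abs(q - q_prime) <= half_width)
    return near / total


if __name__ == "__main__":
    mu = 20  # hypothetical upper limit on m
    # Unnormalised prior weights, heavily favouring large ratios.
    prior = {Fraction(l, mu): Fraction(1 + 100 * l) for l in range(mu + 1)}
    for trials in (10, 100, 1000):
        print(trials, float(mass_near(prior, Fraction(3, 10), trials, Fraction(1, mu))))
```

 The sketch shows the probability rising towards certainty for this one prior; it does not say how many instances would be wanted for another.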
 
 5. We have, therefore, reached the main part of the conclusion 
 after which we set out, namely, that as the number of instances 
 is increased the probability, that q is in the neighbourhood of 
 q', tends towards certainty ; and hence that, subject to certain 
 specified conditions, if the frequency with which B accompanies 
 A is found to be q' in a great number of instances, then the 
 probability that A will be accompanied by B in any further 
 instance is also approximately q'. But we are left with the same 
 vagueness, as in the case of generalisation, respecting the value 
 of μ and the number of instances that we require. We know 
 that we can get as near certainty as we choose by a finite number 
 of instances, but what this number is we do not know. This is 
 not very satisfactory, but it accords very well, I think, with 
 what common sense tells us. It would be very surprising, in 
 fact, if logic could tell us exactly how many instances we want, 
 to yield us a given degree of certainty in empirical arguments. 
 
 Nobody supposes that we can measure exactly the probability 
 of an induction. Yet many persons seem to believe that in the 
 weaker and much more difficult type of argument, where the 
 association under examination has been in our experience, not 
 invariable, but merely in a certain proportion, we can attribute 
 a definite measure to our future expectations and can claim 
 practical certainty for the results of predictions which lie within 
 relatively narrow limits. Coolly considered, this is a preposter- 
 ous claim, which would have been universally rejected long ago, 
 if those who made it had not so successfully concealed them- 
 selves from the eyes of common sense in a maze of mathematics. 

 6. Meantime we are in danger of forgetting that, in order to 
 reach even our modified conclusion, material assumptions have 
 been introduced. In the first place, we are faced with exactly 
 the same difficulties as in the case of universal induction dealt 
 with in Part III., and our original starting-point must be the 
 same. We have the same difficulty as to how our initial prob- 
 ability is to be obtained ; and I have no better suggestion to offer 
 in this than in the former case, namely, the supposed principle 
 of a limitation of independent variety in experience. We have 
 to suppose that if A and B occur together (i.e. are true of the 
 same object), this is some just appreciable reason for supposing 
 that in this instance they have a common cause ; and that, if 
 A occurs again, this is a just appreciable reason for supposing 
 that it is due to the same cause as on the former occasion. But 
 in addition to the usual inductive hypothesis, the argument has 
 rested on two particularly important assumptions, first, that we 
 have no reason for supposing that some of the events of which 
 A may be a sign are more likely to be exemplified in some of the 
 particular instances than in others, and secondly, that the analogy 
 amongst the examined B's is perfect. The first assumption 
 amounts, in the language of statisticians, to an assumption of 
 random sampling from amongst the A's. The second assumption 
 corresponds precisely to the similar condition which we discussed 
 fully in connection with inductive generalisation. The instances 
 of A(x) may be the result of random sampling, and yet it may 
 still be the case that there are material circumstances, common 
 to all the examined instances of B(x), yet not covered by the 
 statement A(x)B(x). In so far as these two assumptions are not 
 justified, an element of doubt and vagueness, which is not easily 
 measured, assails the argument. It is an element of doubt 
 precisely similar to that which exists in the case of generalisa- 
 tion. But we are most likely to forget it. For having overcome 
 the difficulties peculiar to correlation,¹ it is, possibly, not un- 
 natural for a statistician to feel as if he had overcome all the 
 difficulties. 

 ¹ I am here using this term in distinction to generalisation ; that is to say, 
 I call the statement that A(x) is always accompanied by B(x) a generalisation, 
 and the statement that A(x) is accompanied by B(x) in a certain proportion 
 of cases a correlation. This is not quite identical with its use by modern 
 statisticians. 
 
 In practice, however, our knowledge, in cases of correlation 
 just as in cases of generalisation, will seldom justify the assump- 
 tion of perfect analogy between the B's ; and we shall be faced 
 by precisely the same problems of analysing and improving our 
 knowledge of the instances, as in the general case of induction 
 already examined. If B has invariably accompanied A in 100 
 cases, we have all kinds of difficulties about the exact character 
 of our evidence before we can found on this experience a valid 
 generalisation. If B has accompanied A, not invariably, but 
 only 50 times in the 100 cases, clearly we have just the same 
 kind of difficulties to face, and more too, before we can announce 
 a valid correlation. Out of the mere unanalysed statement that B 
 has accompanied A as often as not in 100 cases, without precise 
 particulars of the cases, or even if there were 1,000,000 cases 
 instead of 100, we can conclude very little indeed. 
 
CHAPTER XXXII 
 
 THE INDUCTIVE USE OF STATISTICAL FREQUENCIES FOR THE 
 DETERMINATION OF PROBABILITY A POSTERIORI — THE 
 METHODS OF LEXIS 
 
 1. No one supposes that a good induction can be arrived at 
 merely by counting cases. The business of strengthening the 
 argument chiefly consists in determining whether the alleged 
 association is stable, when the accompanying conditions are 
 varied. This process of improving the Analogy, as I have called 
 it in Part III., is, both logically and practically, of the essence of 
 the argument. 
 
 Now in statistical reasoning (or inductive correlation) that 
 part of the argument, which corresponds to counting the cases 
 in inductive generalisation, may present considerable technical 
 difficulty. This is especially so in the particularly complex cases 
 of what in the next chapter (§ 9) I shall term Quantitative Cor- 
 relation, which have greatly occupied the attention of English 
 statisticians in recent years. But clearly it would be an error to 
 suppose that, when we have successfully overcome the mathe- 
 matical or other technical difficulties, we have made any greater 
 progress towards establishing our conclusion than when, in the 
 case of inductive generalisation, we have counted the cases but 
 have not yet analysed or compared the descriptive and non- 
 numerical differences and resemblances. In order to get a good 
 scientific argument we still have to pursue precisely the same 
 scientific methods of experiment, analysis, comparison, and 
 differentiation as are recognised to be necessary to establish any 
 scientific generalisation. These methods are not reducible to a 
 precise mathematical form for the reasons examined in Part III. 
 of this treatise. But that is no reason for ignoring them, or for 
 pretending that the calculation of a probability, which takes into 
 
 
 account nothing whatever except the numbers of the instances, 
 is a rational proceeding. The passage already quoted from 
 Leibniz (In exemplis juridicis politicisque plerumque non tam 
 subtili calculo opus est, quam accurata omnium circumstantiarum 
 enumeratione) is as applicable to scientific as to political inquiries. 
 Generally speaking, therefore, I think that the business of 
 statistical technique ought to be regarded as strictly limited to 
 preparing the numerical aspects of our material in an intelligible 
 form, so as to be ready for the application of the usual inductive 
 methods. Statistical technique tells us how to ' count the cases ' 
 when we are presented with complex material. It must not 
 proceed also, except in the exceptional case where our evidence 
 furnishes us from the outset with data of a particular kind, to 
 turn its results into probabilities ; not, at any rate, if we mean 
 by probability a measure of rational belief. 
 
 2. There is, however, one type of technical, statistical investi- 
 gation not yet discussed, which seems to me to be a valuable 
 aid to inductive correlation. This method consists in breaking 
 up a statistical series, according to appropriate principles, into 
 a number of sub-series, with a view to analysing and measuring, 
 not merely the frequency of a given character over the aggregate 
 series, but the stability of this frequency amongst the sub- 
 series ; that is to say, the series as a whole is divided up by some 
 principle of classification into a set of sub-series, and the fluctua- 
 tion of the statistical frequency under examination between the 
 various sub-series is then examined. It is, in fact, a technical 
 method of increasing the Analogy between the instances, in the 
 sense given to this process in Part III. 
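 The procedure of breaking a series into sub-series and examining the fluctuation of the frequency between them can be sketched numerically. The following is a minimal sketch in Python. The particular measure used, the ratio of the observed scatter of the sub-series frequencies to the scatter that chance drawing at a constant probability would give, is an assumption introduced here (a measure of this kind is commonly associated with the name of Lexis, but the passage itself gives no formula), and the figures are invented for illustration.

```python
from math import sqrt


def subseries_stability(hits: list[int], sizes: list[int]) -> tuple[list[float], float]:
    """Frequency of a character in each sub-series, and the ratio of the
    observed scatter of those frequencies to the scatter expected from
    chance drawing at the overall frequency."""
    overall = sum(hits) / sum(sizes)
    freqs = [h / n for h, n in zip(hits, sizes)]
    k = len(freqs)
    observed_var = sum((f - overall) ** 2 for f in freqs) / (k - 1)
    # Binomial variance of a frequency, averaged over the sub-series sizes.
    chance_var = sum(overall * (1 - overall) / n for n in sizes) / k
    return freqs, sqrt(observed_var / chance_var)


if __name__ == "__main__":
    sizes = [100] * 5
    _, steady = subseries_stability([51, 49, 50, 52, 48], sizes)
    _, unsteady = subseries_stability([20, 80, 35, 65, 50], sizes)
    print(round(steady, 2), round(unsteady, 2))
```

 Both invented series show the same frequency of one-half over the aggregate; only the division into sub-series distinguishes them.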
 
 3. The method of analysing statistical series, as opposed to 
 the Laplacian or mathematical method, one might designate the 
 inductive method. Independently of the investigations of 
 Bernoulli or Laplace, practical statisticians began at least as early 
 as the end of the seventeenth century ¹ to pay attention to the 
 stability of statistical series when analysed in this manner. 
 Throughout the eighteenth century, students of mortality 
 statistics, and of the ratio of male to female births, (including 
 Laplace himself), paid attention to the degree of constancy of the 
 
 ¹ Graunt in his Natural and Political Observations upon the Bills of Mortality 
 has been quoted as one of the earliest statisticians to pay attention to these 
 considerations. 
 
 
 ratios over different parts of their series of instances as well as 
 to their average value over the whole series. And in the early 
 part of the nineteenth century, Quetelet, as we have already 
 noticed, widely popularised the notion of the stability of various 
 social statistics from year to year. Quetelet, however, sometimes 
 asserted the existence of stability on insufficient evidence, and 
 involved himself in theoretical errors through imitating the 
 methods of Laplace too closely ; and it was not until the last 
 quarter of the nineteenth century that a school of statistical 
 theory was founded, which gave to this way of approaching the 
 problem the system and technique which it had hitherto lacked, 
 and at the same time made explicit the contrast between this 
 analytical or inductive method and the prevailing mathematical 
 theory. The sole founder of this school was the German econo- 
 mist, Wilhelm Lexis, whose theories were expounded in a series 
 of articles and monographs published between the years 1875 
 and 1879. For some years Lexis's fundamental ideas did not 
 attract much notice, and he himself seems to have turned his 
 attention in other directions. But more recently a considerable 
 literature has grown up round them in Germany, and their full 
 purport has been expressed with more clearness than by Lexis 
 himself, although no one, with the exception of Ladislaus von 
 Bortkiewicz, has been able to make additions to them of any 
 great significance.¹ Lexis devised his theory with an immediate 
 view to its practical application to the problems of sex ratio and 
 mortality. The fact that his general theory is so closely inter- 
 mingled with these particular applications of it is, probably, a 
 part explanation of the long interval which elapsed before the 
 general theoretical importance of his ideas was widely realised. 
 I cannot help doubting how fully Lexis himself realised it in the 
 first instance. It would certainly be easy to read his earlier 
 contributions to the question without appreciating their general- 
 ised significance. After 1879 Lexis added nothing substantial to 
 his earlier work, and later developments are mainly due to Von 
 
 ¹ A list of Lexis's principal writings on these topics will be found in the 
 Bibliography. There is little of first-rate importance which is not contained 
 either in the volume, Zur Theorie der Massenerscheinungen in der menschlichen 
 Gesellschaft, or in the Abhandlungen zur Theorie der Bevölkerungs- und Moral- 
 Statistik. In this latter volume the two important articles on " Die Theorie der 
 Stabilität statistischer Reihen " and on " Das Geschlechtsverhältnis der 
 Geborenen und die Wahrscheinlichkeitsrechnung," originally published in Con- 
 rad's Jahrbücher, are reprinted. 
 
 
 Bortkiewicz. Those of the latter's writings, which have an 
 important bearing on the relation between probability and 
 statistics, are given in the Bibliography.¹ 
 
 On the logic and philosophy of Probability writers of the
 school of Lexis are in general agreement with Von Kries ; but this
 seems to be due rather to the reaction which is common both to
 him and to them against the Laplacian tradition, than to any
 very intimate theoretical connection between Von Kries's main
 contributions to Probability and those of Lexis, though it is true
 that both show a tendency to find the ultimate basis of Probability
 in physical rather than in logical considerations. I am not
 acquainted with much work, which has been appreciably influ- 
 enced by Lexis, written in other languages than German (including 
 with Germans, that is to say, those Russians, Austrians, and Dutch 
 who usually write in German, and are in habitual connection with 
 the German scientific world). In France Dormoy ² published
 independently and at about the same time as Lexis some not
 dissimilar theories, but subsequent French writers have paid
 little attention to the work of either. Such typical French
 treatises as that of Bertrand, or, more recently, that of Borel,
 contain no reference to them.³ In Italy there has been some
 discussion recently on the work of Von Bortkiewicz. Among
 Englishmen Professor Edgeworth has shown a close acquaintance
 with the work of the German school,⁴ he providing for nearly forty
 years past, on this as on other matters where the realms of 
 
 ¹ The reader may be specially referred to the Kritische Betrachtungen zur
 theoretischen Statistik (first instalment, the later instalments being of less
 interest to the student of Probability), the Anwendungen der
 Wahrscheinlichkeitsrechnung auf Statistik, and Homogeneität und Stabilität in
 der Statistik. Of other German and Russian writers it will be sufficient to
 mention here Tschuprow, who in "Die Aufgaben der Theorie der Statistik"
 (Schmoller's Jahrbuch, 1905) and "Zur Theorie der Stabilität statistischer
 Reihen" (Skandinavisk Aktuarietidskrift) gives by far the best and most lucid
 general accounts that are available of the doctrines of the school, he alone
 amongst these authors writing in a style from which the foreign reader can
 derive pleasure, and Czuber, who in his Wahrscheinlichkeitsrechnung (vol. ii.
 part iv. section 1) supplies a useful mathematical commentary.

 ² Journal des actuaires français, 1874, and Théorie mathématique des assurances
 sur la vie, 1878 ; on the question of priority see Lexis, Abhandlungen, p. 130.

 ³ Though both these writers touch on closely cognate matters, where Lexis's
 investigations would be highly relevant: Bertrand, Calcul, pp. 312-314 ; Borel,
 Éléments, p. 160.

 ⁴ See especially his "Methods of Statistics" in the Jubilee Volume of the
 Stat. Journ., 1885, and "Application of the Calculus of Probabilities to
 Statistics," International Statistical Institute Bulletin, 1910.
 
 
 Statistics and Probability overlap, almost the only connecting 
 link between English and continental thought. 
 
 Nevertheless, an account in English of the main doctrines of
 this school is still lacking. It would be outside the plan of the
 present treatise to attempt such an account here. But it may
 be useful to give a short summary of Lexis's fundamental ideas.
 After giving this account I shall find it convenient, in proceeding
 to my own incomplete observations on the matter, to approach 
 it from a rather different standpoint from that of Lexis or of 
 Von Bortkiewicz, though not for that reason the less influenced 
 or illuminated by their eminent contributions to this problem. 
 
 4. It will be clearer to begin with some analysis due to Von
 Bortkiewicz,¹ and then to proceed to the method of Lexis himself,
 although the latter came first in point of time.
 
 A group of observations may be made up of a number of sub-groups,
 to which different frequencies for the character under
 investigation are properly applicable. That is to say, a proportion
 $z_1/z$ of the observations may belong to a group, for which, given
 the frequency, the à priori probability of the character under
 observation in a particular instance would be $p_1$, a proportion
 $z_2/z$ may belong to a second group for which $p_2$ is the
 probability, and so on. In this case, given the frequencies for the
 sub-groups, the probability p for the group as a whole would be made
 up as follows :

 $$p = \frac{z_1}{z}\,p_1 + \frac{z_2}{z}\,p_2 + \ldots$$

 We may call p a general probability, and $p_1$, etc., special
 probabilities. But the special probabilities may in their turn be
 general probabilities, so that there may be more than one way
 of resolving a general probability into special probabilities.
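
 The composition of a general probability out of special probabilities
 is a weighted mean, the weights being the shares of the sub-groups in
 the whole. A minimal sketch in Python follows; the function name, the
 group sizes, and the special probabilities are illustrative assumptions
 and are not taken from Bortkiewicz.

```python
from fractions import Fraction

def general_probability(sizes, special_probs):
    """Weighted mean of the special probabilities.

    sizes[i] is the number of observations in sub-group i (hypothetical),
    special_probs[i] the special probability applicable to that sub-group.
    """
    total = sum(sizes)
    return sum(Fraction(z, total) * p for z, p in zip(sizes, special_probs))

# Hypothetical resolution into two sub-groups of 300 and 700 observations.
p = general_probability([300, 700], [Fraction(1, 2), Fraction(3, 10)])
print(p)  # 9/25

# When all the special probabilities are equal, the general probability
# coincides with them, whatever the group sizes.
same = general_probability([1, 2, 3], [Fraction(1, 4)] * 3)
print(same)  # 1/4
```

 Exact fractions are used so that the equality of the two sides can be
 checked without rounding.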
 
 If $p_1 = p_2 = \ldots = p$, then p, for that particular way of
 resolving the total group into partial groups, is, in Bortkiewicz's
 terminology, indifferent. If p is indifferent for all conceivable
 resolutions into partial groups,² then, borrowing a phrase from Von Kries,
 Bortkiewicz says of it that it has a definitive interpretation. In
 
 ¹ What follows is a free rendering of some passages in his Kritische
 Betrachtungen.

 ² This is clearly a very loose statement of what Bortkiewicz really means.
 
 
 dealing with à priori probabilities, we can resolve a total prob-
 ability until we reach the special probabilities of each individual
 case ; and if we find that all these special probabilities are equal,
 then, clearly, the general probability satisfies the condition for
 definitive interpretation.

 So far we have been dealing with à priori probabilities. But
 the object of the analysis has been to throw light on the inverse 
 problem. We want to discover in what conditions we can regard 
 an observed frequency as being an adequate approximation to a 
 definitive general probability. 
 
 If p' is the empirical value of p (or, as I should prefer to call
 it, the frequency) given by a series of n observations, we may
 have

 $$p' = \frac{n_1}{n}\,p_1' + \frac{n_2}{n}\,p_2' + \ldots$$

 Even if this particular way of resolving the series of observations
 is indifferent, the actually observed frequencies $p_1'$, $p_2'$, etc., may
 nevertheless be unequal, since they may fluctuate round the
 norm p' through the operation of 'chance' influences. If,
 however, $n_1$, $n_2$, etc., are large, we can apply the usual Bernoullian
 formula to discover whether, if there was a norm p', the divergences
 of $p_1'$, $p_2'$, etc., from it are within the limits reasonably
 attributable on Bernoullian hypotheses to 'chance' influences. We
 can, however, only base a sound argument in favour of the
 existence of a 'definitive' probability p' by resolving our
 aggregate of instances into sub-series in a great variety of ways, 
 and applying the above calculations each time. Even so, some 
 measure of doubt must remain, just as in the case of other 
 inductive arguments. 
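
 The comparison of sub-series frequencies with a supposed norm can be
 sketched as follows. This is only an illustration under stated
 assumptions: the pooled frequency is taken as the norm, each divergence
 is expressed in units of the Bernoullian standard deviation
 sqrt(p(1-p)/n_i), and the counts are hypothetical.

```python
import math

def standardised_divergences(successes, sizes):
    """Divergence of each sub-series frequency from the pooled frequency,
    measured in Bernoullian standard deviations sqrt(p * (1 - p) / n_i)."""
    p = sum(successes) / sum(sizes)          # pooled frequency, used as norm
    result = []
    for k, n in zip(successes, sizes):
        sd = math.sqrt(p * (1 - p) / n)      # standard deviation for n trials
        result.append((k / n - p) / sd)
    return result

# Hypothetical sub-series: three groups of 1000 observations each.
z = standardised_divergences([510, 495, 530], [1000, 1000, 1000])
print([round(v, 2) for v in z])
```

 Divergences of one or two units are of the size which chance alone
 would commonly produce; the same calculation would have to be repeated
 for many different resolutions of the material before any weight could
 be placed on it.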
 
 Bortkiewicz goes on to say that probabilities having definitive
 interpretation (definitive Bedeutung) may be designated elementary
 probabilities (Elementarwahrscheinlichkeiten). But the
 probabilities which usually arise in statistical inquiries are not
 of this type, and may be termed average probabilities
 (Durchschnittswahrscheinlichkeiten). That is to say, a series of observed
 frequencies (or, as he calls them, empirical probabilities) does not, 
 as a rule, group itself as it would if the series was in fact subject 
 to an elementary probability. 
 
 5. This exposition is based on a philosophy of Probability 
 different from mine ; but the underlying ideas are capable of 
 
 
 translation. Suppose that one is endeavouring to establish an 
 inductive correlation, e.g. that the chance of a male birth is m. 
 The conclusion, which we are seeking to establish, takes no 
 account of the place or date of birth or the race of the parents, 
 and assumes that these influences are irrelevant. Now, if we had 
 statistics of birth ratios for all parts of the world throughout the 
 nineteenth century, and added them all up and found that the 
 average frequency of male births was m, we should not be justified 
 in arguing from this that the frequency of male births in England
 next year is very unlikely to diverge widely from m. For this
 would involve the unwarranted assumption, in Bortkiewicz's
 terminology, that the empirical probability m is elementary for
 any resolution dependent on time or place, and is not an average
 probability compounded out of a series of groups, relating to
 different times or places, to each of which a distinct special
 probability is applicable. And, in my terminology, it would 
 assume that variations of time and place were irrelevant to the 
 correlation, without any attempt having been made to employ 
 the methods of positive and negative Analogy to establish this. 
 
 We must, therefore, break up our statistical material into
 groups by date, place, and any other characteristic which our
 generalisation proposes to treat as irrelevant. By this means
 we shall obtain a number of frequencies $m_1'$, $m_2'$, $m_3'$, ...,
 $m_1''$, $m_2''$, $m_3''$, ..., etc., which are distributed round the
 average frequency m. For simplicity let us consider the series of
 frequencies $m_1'$, $m_2'$, $m_3'$, ... obtained by breaking up our
 material according to the date of the birth. If the observed
 divergences of these frequencies from their mean are not significant,
 we have the beginnings of an inductive argument for
 regarding date as being in this connection irrelevant.
 
 6. At this point Lexis's fundamental contribution to the 
 problem must be introduced. He concentrated his attention on 
 the nature of the dispersion of the frequencies $m_1'$, $m_2'$, $m_3'$, ...
 round their mean value m ; and he sought to devise a technical
 method for measuring the degree of stability displayed by the 
 series of sub-frequencies, which are yielded by the various possible 
 criteria for resolving the aggregate statistical material into a 
 number of constituent groups. 
 
 For this purpose he classified the various types of dispersion 
 which could occur. It may be the case that some of the sub- 
 
 
 frequencies show such wide and discordant variations from the
 mean as to suggest that some significant Analogy has been over- 
 looked. In this event the lack of symmetry, which characterises 
 the oscillations, may be taken to indicate that some of the sub- 
 groups are subject to a relevant influence, of which we must take 
 account in our generalisation, to which some of the other sub- 
 groups are not subject. 
 
 But amongst the various types of dispersion Lexis found one 
 class clearly distinguishable from all the others, the peculiarity 
 of which is that the individual values fluctuate in a ' purely 
 chance ' manner about a constant fundamental value. This 
 type he called typical (typische) dispersion. He meant by this
 that the dispersion conformed approximately to the distribution 
 which would be given by some normal law of error. 
 
 The next stage of Lexis's argument ¹ was to point out that
 series of frequencies which are typical in character may have as
 their foundation either a constant probability,² or one which is
 itself subject to chance variations about a mean. The first case 
 is typified by the example of a series of sets of drawings of balls, 
 each set being drawn from a similar urn ; the second case by the 
 example of a series of sets of drawings, the urns from which each 
 set is drawn being not similar, but with constitutions which vary 
 in a chance manner about a mean. 
 
 As his measure of dispersion Lexis introduces a formula, which
 is evidently in part conventional (as is the case with so many
 other statistical formulae, the particular shape of which is often
 determined by mathematical convenience rather than by any
 more fundamental criterion). He expresses himself as follows.
 Where the underlying probability is constant, the probable error
 r in a particular frequency à priori is

 $$r = \rho\sqrt{\frac{2v(1-v)}{g}},$$

 where $\rho = 0.4769$, v is the underlying probability, and g is the
 number of instances to which the frequency refers. This follows from
 the usual Bernoullian assumptions. Now let R be the corresponding
 expression derived a posteriori by reference to the actual deviations
 of a series of observed frequencies from their mean, so that
 
 ¹ I am here following fairly closely his paper, "Über die Theorie der Stabilität
 statistischer Reihen," reprinted in his Abhandlungen zur Theorie der Bevölkerungs-
 und Moral-Statistik, pp. 170-212.

 ² This mode of expression, which is not in accurate conformity with my
 philosophy of Probability, is Lexis's, not mine. His meaning is intelligible.
 
 
 $$R = \rho\sqrt{\frac{2[\delta^2]}{n-1}},$$

 where $[\delta^2]$ is the sum of the squares of the deviations
 of the individual frequencies from their mean and n is their
 number. Now, if the observed facts are due to merely chance
 variations about a constant v, we must have approximately
 R = r, though, if g is small, comparatively wide deviations
 between R and r will not be significant. If, on the other hand, v
 itself is not constant but is subject to chance variations, the case
 stands differently. For the fluctuations of the observed
 frequencies are now due to two components. The one, which would
 be present even if the underlying probability were constant,
 Lexis terms the ordinary or unessential component ; the other
 he terms the physical component. If p is the probable deviation
 of the various values of v from their mean, then, on the same
 assumptions and as a deduction from the same theory as before,
 R will tend to equal not r but $\sqrt{r^2 + p^2}$. In this event R cannot
 be less than r. If, therefore, R < r, one must suppose that the
 individual instances of each several series on which each frequency
 is based are not independent of one another. Such a series
 Lexis terms an organic or dependent (gebundene) series, and
 explains that it cannot be handled by purely statistical methods.
 Since, therefore, we have three types of series, differing
 fundamentally from one another according as R = r, R > r, or R < r,
 Lexis puts $R/r = Q$, and takes Q as his measure of dispersion.¹ If
 Q = 1, we have normal dispersion ; if Q > 1, we have supernormal
 dispersion ; and if Q < 1, we have subnormal dispersion, which is
 an indication that the series is 'organic.'
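
 The calculation of Lexis's measure may be sketched in Python. The
 sketch assumes that every frequency rests on the same number g of
 instances and that the mean of the observed frequencies may stand in
 for the underlying probability v; the function name and the sample
 frequencies are hypothetical.

```python
import math

RHO = 0.4769  # the constant quoted in the text for the probable error

def lexis_quotient(frequencies, g):
    """Return (R, r, Q) for a series of frequencies, each based on g instances.

    R is computed from the observed deviations, r from the Bernoullian
    formula, and Q is their ratio.
    """
    n = len(frequencies)
    v = sum(frequencies) / n                         # assumed stand-in for v
    sum_sq = sum((f - v) ** 2 for f in frequencies)  # sum of squared deviations
    R = RHO * math.sqrt(2 * sum_sq / (n - 1))
    r = RHO * math.sqrt(2 * v * (1 - v) / g)
    return R, r, R / r

# Hypothetical series: four frequencies, each from 100 instances.
R, r, Q = lexis_quotient([0.48, 0.50, 0.52, 0.50], 100)
print(round(Q, 3))  # well below 1, i.e. subnormal dispersion
```

 Since the constant enters R and r alike, it cancels in Q ; it is kept
 here only so that R and r agree in form with the expressions in the
 text.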
 
 If the number of instances on which the frequencies are based
 is very great, r becomes negligible in comparison with p (the
 physical component), and, therefore, $R = \sqrt{r^2 + p^2}$ becomes
 approximately R = p. On the other hand, if p is not very large
 and the base number of instances is small, p becomes negligible
 
 ¹ In Tschuprow's notation (Die Aufgaben der Theorie der Statistik, p. 45),
 Q = P/C, where P (the Physical modulus) $= \sqrt{\frac{2\sum (p_k - p)^2}{n-1}}$
 and C (the Combinatorial modulus) $= \sqrt{\frac{2p(1-p)}{M}}$, M being the
 number of instances in each set, n the number of sets, $p_k$ the frequency
 for set k, and p the mean of the n frequencies.
 
 
 in comparison with r, and we have a delusive appearance of 
 normal dispersion.¹ Lexis well illustrates the former point by
 the example that the statistics of the ratio of male to female 
 births for the forty-five registration districts of England over the 
 years 1859-1871 approximately satisfy the relation R=r. But 
 if we take the figures for all England over those thirteen years, 
 although the extreme limits of the fluctuation of the ratio about 
 its mean 1.042 are 1.035 and 1.047, nevertheless R = 2.6 and r = 1.6,
 so that Q = 1.625 ; the explanation being that the base number
 of instances, namely 730,000, is so large that r is very small, with
 the result that it is swamped by the physical component p. And
 he illustrates the latter point by the assertion that, if in 20 or 30 
 series each of 100 draws from an urn containing black and white 
 balls equally, the number of black balls drawn each time were 
 only to vary between 49 and 51, he would have confidence that 
 the game was in some way falsified and that the draws were not 
 independent. That is to say, undue regularity is as fatal to the 
 assumption of Bernoullian conditions as is undue dispersion.
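
 Both illustrations can be checked by a short calculation. The sketch
 below takes the values of R and r from the text, and then computes,
 on the assumption of independent draws from an urn holding black and
 white balls equally, the chance that a set of 100 draws yields between
 49 and 51 black balls. The choice of 20 sets (the lower of the two
 figures mentioned) is an assumption made for the example.

```python
from math import comb

# Quotient for the all-England figures quoted in the text.
Q_england = 2.6 / 1.6
print(round(Q_england, 3))

# Chance that one set of 100 independent draws gives 49, 50 or 51 black.
p_one = sum(comb(100, k) for k in (49, 50, 51)) / 2 ** 100
print(round(p_one, 4))

# Chance that 20 independent sets all fall within that narrow band.
p_all_20 = p_one ** 20
print(p_all_20)
```

 The chance for a single set is somewhat under one in four, so that the
 chance of twenty such sets in succession is extremely small, which is
 the ground of Lexis's suspicion.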
 
 7. In a characteristic passage ² Professor Edgeworth has applied
 these theories to the frequency of dactyls in successive extracts
 from the Aeneid. The mean for the line is 1.6, exclusive of the
 fifth foot, thus sharply distinguishing the Virgilian line from the
 Ovidian, for which the corresponding figure is 2.2. But there is
 also a marked stability. "That the Mean of any five lines
 should differ from the general Mean by a whole dactyl is proved
 to be an exceptional phenomenon, about as rare as an Englishman
 measuring 5 feet, or 6 feet 3 inches. An excess of two dactyls
 in the Mean of five lines would be as exceptional as an Englishman
 measuring 6 feet 10 inches." But not only so: the stability is
 excessive, and the fluctuation is less "than that which is obtained
 upon the hypothesis of pure sortition. If we could imagine
 dactyls and spondees to be mixed up in the poet's brain in the
 proportion of 16 to 24 and shaken out at random, the modulus
 in the number of dactyls would be 1.38, whereas we have
 constantly obtained a smaller number, on an average (the square
 root of the average fluctuation) 1.2." On Lexian principles
 these statistical results would support the hypothesis that the
 
 ¹ This is part of the explanation of Bortkiewicz's Law of Small Numbers.
 See also p. 401.

 ² "On Methods of Statistics," Jubilee Volume of the Royal Statistical Society,
 p. 211.
 
 
 series under investigation is ' organic ' and not subject to 
 Bernoullian conditions, an hypothesis in accordance with our
 ideas of poetry. That Edgeworth should have put forward
 this example in criticism of Lexis's conclusions, and that Lexis ¹
 should have retorted that the explanation was to be found in
 Edgeworth's series not consisting of an adequate number of
 separate observations, indicates, if I do not misapprehend them, 
 that these authorities are at fault in the principles, if not of 
 Probability, of Poetry. 
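
 Edgeworth's figure of 1.38 can be recovered on two assumptions which
 are mine and not stated in the passage quoted: that four feet of each
 line are free to be dactyl or spondee, and that the modulus is the
 square root of twice the Bernoullian variance.

```python
import math

FEET = 4                      # assumed number of variable feet in a line
P_DACTYL = 16 / (16 + 24)     # proportion of dactyls quoted by Edgeworth

mean_dactyls = FEET * P_DACTYL
# Modulus taken as sqrt(2 * n * p * q), an assumption about Edgeworth's usage.
modulus = math.sqrt(2 * FEET * P_DACTYL * (1 - P_DACTYL))

print(round(mean_dactyls, 2), round(modulus, 3))
```

 On these assumptions the mean agrees with the 1.6 dactyls to the line
 and the modulus with the 1.38 of the quotation, against the observed
 1.2.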
 
 The dactyls of the Virgilian hexameter are, in fact, a very
 good example of what has been termed connexité, leading to
 subnormal dispersion. The quantities of the successive feet are not
 independent, and the appearance of a dactyl in one foot diminishes
 the probability of another dactyl in that line. It is like the case
 of drawing black and white balls out of an urn, where the balls
 are not replaced. But Lexis is wrong if he supposes that a
 supernormal dispersion cannot also arise out of connexité, or organic
 connection between the successive terms. It might have been
 the case that the appearance of a dactyl in one foot increased
 the probability of another dactyl in that line. He should, I
 think, have contemplated the result R > r as possibly indicating
 a non-typical, organic series, and should not have assumed that,
 where R is greater than r, it is of the form $\sqrt{r^2 + p^2}$.
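
 The two directions in which dependence may act can be illustrated by
 the variance of the number of black balls in a set of draws. The urn
 without replacement is the case named in the text ; the urn in which
 each ball drawn is returned together with another of its colour (a
 Polya urn) is an assumption introduced here merely to exhibit positive
 dependence. The urn contents are hypothetical.

```python
from fractions import Fraction

def var_independent(n, black, total):
    """Variance of the number of black balls in n draws with replacement."""
    p = Fraction(black, total)
    return n * p * (1 - p)

def var_not_replaced(n, black, total):
    """Variance when drawn balls are not replaced (hypergeometric)."""
    return var_independent(n, black, total) * Fraction(total - n, total - 1)

def var_reinforced(n, black, total):
    """Variance when each drawn ball is returned with one more of its
    colour (Polya urn), an illustrative model of positive dependence."""
    return var_independent(n, black, total) * Fraction(total + n, total + 1)

# Hypothetical urn: 20 black and 20 white balls, 10 draws to the set.
base = var_independent(10, 20, 40)
print(var_not_replaced(10, 20, 40) / base)  # 10/13, below unity
print(var_reinforced(10, 20, 40) / base)    # 50/41, above unity
```

 A ratio below unity corresponds to subnormal and a ratio above unity
 to supernormal dispersion, although in both cases the draws are
 dependent.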
 
 In short, Lexis has not pushed his analysis far enough, and he
 has not fully comprehended the character of the underlying
 conditions. But this does not affect the fact that it was he who 
 made the vital advance of taking as the unit, not the single 
 observation, but the frequency in given conditions, and of con- 
 ceiving the nature of statistical induction as consisting in the 
 examination, and if possible the measurement, of the stability 
 of the frequency when the conditions are varied. 
 
 8. There is one special piece of work illustrative of the above 
 methods, due to Von Bortkiewicz, which must not be overlooked,
 and which it is convenient to introduce in this place: the
 so-called Law of Small Numbers.²
 
 Quetelet, as we have seen in Chapter XXVIII. , called attention 
 
 ¹ "Über die Wahrscheinlichkeitsrechnung," p. 444 (see Bibliography).

 ² There are numerous references to this phenomenon in periodical literature ;
 but it is sufficient to refer the reader to Von Bortkiewicz's Das Gesetz der
 kleinen Zahlen.
 
 
 to the remarkable regularity of comparatively rare events. Von 
 Bortkiewicz has enlarged Quetelet's catalogue with modern
 instances out of the statistical records of bureaucratic Germany. 
 The classic instance, perhaps, is the number of Prussian cavalry- 
 men killed each year by the kick of a horse. The table is worth 
 giving as a statistical curiosity. (The period is from 1875 to 
 1894 ; G stands for the Corps of Guards, and I.-XV. for the 
 15 Army Corps.) 
 
 
 Corps  75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94
 G.      0  2  2  1  0  0  1  1  0  3  0  2  1  0  0  1  0  1  0  1
 I.      0  0  0  2  0  3  0  2  0  0  0  1  1  1  0  2  0  3  1  0
 II.     0  0  0  2  0  2  0  0  1  1  0  0  2  1  1  0  0  2  0  0
 III.    0  0  0  1  1  1  2  0  2  0  0  0  1  0  1  2  1  0  0  0
 IV.     0  1  0  1  1  1  1  0  0  0  0  1  0  0  0  0  1  1  0  0
 V.      0  0  0  0  2  1  0  0  1  0  0  1  0  1  1  1  1  1  1  0
 VI.     0  0  1  0  2  0  0  1  2  0  1  1  3  1  1  1  0  3  0  0
 VII.    1  0  1  0  0  0  1  0  1  1  0  0  2  0  0  2  1  0  2  0
 VIII.   1  0  0  0  1  0  0  1  0  0  0  0  1  0  0  0  1  1  0  1
 IX.     0  0  0  0  0  2  1  1  1  0  2  1  1  0  1  2  0  1  0  0
 X.      0  0  1  1  0  1  0  2  0  2  0  0  0  0  2  1  3  0  1  1
 XI.     0  0  0  0  2  4  0  1  3  0  1  1  1  1  2  1  3  1  3  1
 XIV.    1  1  2  1  1  3  0  4  0  1  0  3  2  1  0  2  1  1  0  0
 XV.     0  1  0  0  0  0  0  1  0  1  1  0  0  0  2  2  0  0  0  0
 
 The agreement of this table with the theoretical results of a
 random distribution of the total number of casualties is
 remarkably close : ¹
 
 Casualties      Number of Occasions on which the Annual Casualties
 in a Year.      in a Corps reach the Figure in Column 1.

                 Actual.      Theoretical.
 0                 144          143.1
 1                  91           92.1
 2                  32           33.3
 3                  11            8.9
 4                   2            2.0
 5 and more          0            0.6
 
 Other instances are furnished by the numbers of child suicides 
 in Prussia, and the like. 
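
 The column of actual occasions can be recounted from the table of
 casualties, and a simple random distribution set beside it. The sketch
 below makes two assumptions which should be kept in view: the casualty
 figures are entered as the table is commonly reproduced, and the
 comparison uses a single Poisson distribution with the pooled mean,
 which is not asserted to be the calculation behind the Theoretical
 column and yields somewhat different values.

```python
import math
from collections import Counter

# Annual casualties, 1875 to 1894, for each corps (as commonly reproduced).
DEATHS = {
    "G":    [0, 2, 2, 1, 0, 0, 1, 1, 0, 3, 0, 2, 1, 0, 0, 1, 0, 1, 0, 1],
    "I":    [0, 0, 0, 2, 0, 3, 0, 2, 0, 0, 0, 1, 1, 1, 0, 2, 0, 3, 1, 0],
    "II":   [0, 0, 0, 2, 0, 2, 0, 0, 1, 1, 0, 0, 2, 1, 1, 0, 0, 2, 0, 0],
    "III":  [0, 0, 0, 1, 1, 1, 2, 0, 2, 0, 0, 0, 1, 0, 1, 2, 1, 0, 0, 0],
    "IV":   [0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0],
    "V":    [0, 0, 0, 0, 2, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0],
    "VI":   [0, 0, 1, 0, 2, 0, 0, 1, 2, 0, 1, 1, 3, 1, 1, 1, 0, 3, 0, 0],
    "VII":  [1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 2, 0, 0, 2, 1, 0, 2, 0],
    "VIII": [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1],
    "IX":   [0, 0, 0, 0, 0, 2, 1, 1, 1, 0, 2, 1, 1, 0, 1, 2, 0, 1, 0, 0],
    "X":    [0, 0, 1, 1, 0, 1, 0, 2, 0, 2, 0, 0, 0, 0, 2, 1, 3, 0, 1, 1],
    "XI":   [0, 0, 0, 0, 2, 4, 0, 1, 3, 0, 1, 1, 1, 1, 2, 1, 3, 1, 3, 1],
    "XIV":  [1, 1, 2, 1, 1, 3, 0, 4, 0, 1, 0, 3, 2, 1, 0, 2, 1, 1, 0, 0],
    "XV":   [0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 2, 2, 0, 0, 0, 0],
}

# Number of corps-years showing 0, 1, 2, ... casualties.
counts = Counter(x for row in DEATHS.values() for x in row)
corps_years = sum(counts.values())
mean = sum(k * c for k, c in counts.items()) / corps_years

def poisson_expected(k):
    """Expected number of corps-years with k casualties under one Poisson law."""
    return corps_years * math.exp(-mean) * mean ** k / math.factorial(k)

for k in range(5):
    print(k, counts[k], round(poisson_expected(k), 1))
```

 The recount reproduces the Actual column ; the single Poisson law gives
 figures of the same order as the Theoretical column without coinciding
 with it.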
 
 It is Von Bortkiewicz's thesis that these observed regularities

 ¹ Bortkiewicz, op. cit. p. 24.

 
 have a good theoretical explanation behind them, which he 
 dignifies with the name of the Law of Small Numbers. 
 
 The reader will recall that, according to the theory of Lexis,
 his measure of stability, Q, is, in the more general case, made up
 of two components r and p, combined in the expression $\sqrt{r^2 + p^2}$,
 of which one is due to fluctuations from the average of the
 conditions governing all the members of a series, which furnishes us
 with one of our observed frequencies, and of which the other is
 due to fluctuations in the individual members of the series about
 the true norm of the series. Bortkiewicz carries the same
 analysis a little further, and shows that Lexis's Q is of the form
 $\sqrt{1 + (n-1)c^2}$, where n is the number of times that the event
 occurs in each series.¹ That is to say, Q increases with n, and,
 when n is small, Q is likely to exceed unity to a less extent than
 when n is large. To postulate that n is small, is, when we are
 dealing with observations drawn from a wide field, the same
 thing as to say that the event we are looking for is a comparatively
 rare one. This, in brief, is the mathematical basis of the Law
 of Small Numbers.
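
 The dependence of Q upon n in Bortkiewicz's expression may be seen from
 a few values. The value given to c below is purely hypothetical, since
 its interpretation is left to the original.

```python
import math

def bortkiewicz_q(n, c):
    """Q in the form sqrt(1 + (n - 1) * c**2) quoted in the text."""
    return math.sqrt(1 + (n - 1) * c ** 2)

C = 0.05  # hypothetical value of c, for illustration only
for n in (1, 10, 100, 10_000):
    print(n, round(bortkiewicz_q(n, C), 3))
```

 With c fixed, Q scarcely departs from unity while n is small and grows
 without limit as n increases, so that the appearance of normal
 dispersion amongst rare events follows from the form of the measure.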
 
 In his latest published work on these topics,² Von Bortkiewicz
 builds his mathematical structure considerably higher, without,
 however, any further underpinning of the logical foundations
 of it. He has there worked out further statistical constants,
 arising out of the conceptions on which Lexis's Q is based (the
 precise bearing of which is not made any clearer by his calling
 them coefficients of syndromy), which are explicitly dependent
 on the value of n ; and he elaborately compares the theoretical 
 value of the coefficients with the observed value in certain actual 
 statistical material. He concludes with the thesis, that Homo- 
 geneity and Stability (defined as he defines them) are opposed 
 conceptions, and that it is not correct to premise, that the larger 
 statistical mass is as a rule more stable than the smaller, unless 
 
 ¹ I refer the reader to the original, op. cit. pp. 29-31, for the interpretation
 of c (which is a function of the mean square errors arising in the course of the
 investigation) and for the mathematical argument by which the above result
 is justified.

 ² "Homogeneität und Stabilität in der Statistik," published in the
 Skandinavisk Aktuarietidskrift, 1918. Those readers, who look up my references,
 will, I think, agree with me that Von Bortkiewicz does not get any less
 obscure as he goes on. The mathematical argument is right enough, and
 often brilliant. But what it is all really about, what it all really amounts to,
 and what the premisses are, it becomes increasingly perplexing to decide.
 
 
 we also assume that the larger mass is less homogeneous. At this 
 point, it would have helped, if Von Bortkiewicz, excluding from
 his vocabulary homogeneity, paradromy, 7'ji, and the like, had 
 stopped to tell in plain language where his mathematics had led
 him, and also whence they had started. But like many other
 students of Probability he is eccentric, preferring algebra to earth. 
 
 9. Where, then, though an admirer, do I criticise all this ? I
 think that the argument has proceeded so far from the premisses, 
 that it has lost sight of them. If the limitations prescribed by 
 the premisses are kept in mind, I do not contest the mathematical 
 accuracy of the results. But many technical terms have been 
 introduced, the precise signification and true limitations of which 
 will be misunderstood if the conclusion of the argument is allowed 
 to detach itself from the premisses and to stand by itself. I will 
 illustrate what I mean by two examples from the work of Von 
 Bortkiewicz described above. 
 
 Von Bortkiewicz enunciates the seeming paradox that the 
 larger statistical mass is only, as a rule, more stable if it is less
 homogeneous. But an illustration which he himself gives shows
 how misleading his aphorism is. The opposition between
 stability and homogeneity is borne out, he says, by the judgment
 of practical men. For actuaries have always maintained that
 their results average out better, if their cases are drawn from a 
 wide field subject to variable conditions of risk, whilst they are 
 chary of accepting too much insurance drawn from a single 
 homogeneous area which means a concentration of risk. But 
 this is really an instance of Von Bortkiewicz's own distinction
 between a general probability p and special probabilities $p_1$, etc.,
 where

 $$p = \frac{z_1}{z}\,p_1 + \frac{z_2}{z}\,p_2 + \ldots$$

 If we are basing our calculations on p and do not know $p_1$, $p_2$,
 etc., then these calculations are more likely to be borne out by
 the result if the instances are selected by a method which spreads
 them over all the groups 1, 2, etc., than if they are selected by a
 method which concentrates them on group 1. In other words,
 the actuary does not like an undue proportion of his cases to be
 drawn from a group which may be subject to a common relevant
 influence for which he has not allowed. If the à priori calculations
 are based on the average over a field which is not homogeneous
 
 
 in all its parts, greater stability of result will be obtained if the 
 instances are drawn from all parts of the non-homogeneous
 total field, than if they are drawn now from one homogeneous
 sub-field and now from another. This is not at all paradoxical.
 Yet I believe, though with hesitation, that this is all that Von
 Bortkiewicz's elaborately supported mathematical conclusion 
 really amounts to. 
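
 The actuary's preference can be put into figures. The sketch below
 compares the mean squared divergence of the observed frequency from the
 general probability p under two methods of selection: instances spread
 at random over all the groups, and instances all drawn from a single
 group. The shares, the special probabilities, and the number of
 instances are hypothetical, and the function names are my own.

```python
def general_p(shares, probs):
    """General probability as the share-weighted mean of special ones."""
    return sum(w * q for w, q in zip(shares, probs))

def msd_spread(shares, probs, n):
    """Mean squared divergence from p when each of the n instances is
    drawn independently, falling in group i with chance shares[i]."""
    p = general_p(shares, probs)
    return p * (1 - p) / n

def msd_concentrated(shares, probs, n):
    """Mean squared divergence from p when all n instances come from one
    group, that group being group i with chance shares[i]."""
    p = general_p(shares, probs)
    return sum(w * (q * (1 - q) / n + (q - p) ** 2)
               for w, q in zip(shares, probs))

# Hypothetical field: two equal groups with risks of 1 and 3 per cent.
shares, probs, n = [0.5, 0.5], [0.01, 0.03], 10_000
print(msd_spread(shares, probs, n))
print(msd_concentrated(shares, probs, n))
```

 The second figure exceeds the first because the divergence of the
 special probability from p does not diminish as the instances are
 multiplied, whereas the chance component does.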
 
 My second example is that of the Law of Small Numbers. 
 Here also we are presented with an apparent paradox in the 
 statement that the regularity of occurrence of rare events is more 
 stable than that of commoner events. Here, I suspect, the 
 paradoxical result is really latent in the particular measure of 
 stability which has been selected. If we look back at the figures, 
 which I have quoted above, of Prussian cavalrymen killed by 
 the kick of a horse, it is evident that a measure of stability could 
 be chosen according to which exceptional instability would be
 displayed by this particular material ; for the frequency varies
 from 0 to 4 round a mean somewhat less than unity, which is a
 very great percentage fluctuation. In fact, the particular measure 
 of stability which Von Bortkiewicz has adopted from Lexis has 
 about it, however useful and convenient it may be, especially for 
 mathematical manipulation, a great deal that is arbitrary and 
 conventional. It is only one out of a great many possible 
 formulae which might be employed for the numerical measure- 
 ment of the conception of stability, which, quantitatively at 
 least, is not a perfectly precise one. The so-called Law of Small
 Numbers is, therefore, little more than a demonstration that, 
 where rare events are concerned, the Lexian measure of stability 
 does not lead to satisfactory results. Like some other formulae 
 which involve a use of Bernoullian methods in an approximative
 form, it does not lead to reliable results in all circumstances.
 I should add that there is one other element which may contribute 
 to the total psychological reaction of the reader's mind to the 
 Law of Small Numbers, namely, the surprising and piquant 
 examples which are cited in support of it. It is startling and 
 even amusing to be told that horses kick cavalrymen with the 
 same sort of regularity as characterises the rainfall. But our
 surprise at this particular example's fulfilling the Law of Great 
 Numbers has little or nothing to do with the exceptional stability 
 about which the Law of Small Numbers purports to concern itself. 
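
 One alternative measure of stability, chosen here merely as an example
 and not proposed in the text, is the ratio of the standard deviation to
 the mean. Applied to the actual occasions recorded for the cavalrymen,
 it may be computed as follows.

```python
import math

# Actual number of corps-years with 0, 1, 2, 3, 4 casualties (from the text).
ACTUAL = {0: 144, 1: 91, 2: 32, 3: 11, 4: 2}

cells = sum(ACTUAL.values())
mean = sum(k * c for k, c in ACTUAL.items()) / cells
variance = sum(c * (k - mean) ** 2 for k, c in ACTUAL.items()) / cells
relative_fluctuation = math.sqrt(variance) / mean   # illustrative measure

print(round(mean, 2), round(variance, 2), round(relative_fluctuation, 2))
```

 By this measure the annual casualties fluctuate by more than their own
 mean, which is the very great percentage fluctuation referred to above.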
 
CHAPTER XXXIII 
 
 OUTLINE OF A CONSTRUCTIVE THEORY 
 
 1. There is a great difference between the proposition " It is 
 probable that every instance of this generalisation is true " and 
 the proposition " It is probable of any instance of this generalisa- 
 tion taken at random that it is true." The latter proposition 
 may remain valid, even if it is certain that some instances of the 
 generalisation are false. It is more likely than not, for example, 
 that any number will be divisible either by two or by three, but
 it is not more likely than not that all numbers are divisible either 
 by two or by three. 
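
 The arithmetic of this example is easily verified. The sketch below
 counts over the first 600 whole numbers, the limit being an arbitrary
 multiple of six chosen for the illustration.

```python
from fractions import Fraction

def divisible_by_2_or_3(k):
    return k % 2 == 0 or k % 3 == 0

LIMIT = 600  # arbitrary multiple of six, chosen for illustration
numbers = range(1, LIMIT + 1)

share = Fraction(sum(1 for k in numbers if divisible_by_2_or_3(k)), LIMIT)
every_one = all(divisible_by_2_or_3(k) for k in numbers)

print(share)      # 2/3
print(every_one)  # False
```

 Two numbers in three satisfy the condition, so that it is more likely
 than not of a number taken at random ; yet the universal proposition is
 false, as the number one already shows.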
 
 The first type of proposition has been discussed in Part III. 
 under the name of Universal Induction. The latter belongs to 
 Inductive Correlation or Statistical Induction, an attempt at the
 logical analysis of which must be my final task. 
 
 2. What advocates of the Frequency Theory of Probability 
 wrongly believe to be characteristic of all probabilities, namely, 
 that they are essentially concerned not with single instances but 
 with series of instances, is, I think, a true characteristic of 
 statistical induction. A statistical induction either asserts the 
 probability of an instance selected at random from a series of 
 propositions, or else it assigns the probability of the assertion, 
 that the truth frequency of a series of propositions (i.e. the
 proportion of true propositions in the series) is in the neighbourhood
 of a given value. In either case it is asserting a characteristic
 of a series of propositions, rather than of a particular
 proposition.
 
 Whilst, therefore, our unit in the case of Universal Induction 
 is a single instance which satisfies both the condition and the 
 conclusion of our generalisation, our unit in the case of Statistical
 Induction is not a single instance, but a set or series of instances, 
 all of which satisfy the condition of our generalisation but
 which satisfy the conclusion only in a certain proportion of cases. 
 And whilst in Universal Induction we build up our argument by 
 examining the known positive and negative Analogy shown in a 
 series of single instances, the corresponding task in Statistical 
 Induction consists in examining the Analogy shown in a series of 
 series of instances. 
 
 3. We are presented, in problems of Statistical Induction, with 
 a set of instances all of which satisfy the conditions of our generalisation,
 and a proportion f of which satisfy its conclusion; and
 we seek to generalise as to the probable proportion in which 
 further instances will satisfy the conclusion. 
 
 Now it is useless merely to pay attention to the proportion (or 
 frequency) f discovered in the aggregate of the instances. For
 any collection whatever, comprising a definite number of objects, 
 must, if the objects be classified with reference to the presence 
 or absence of any specified characteristic whatever, show some 
 definite proportion or statistical frequency of occurrence ; so that 
 a mere knowledge of what this frequency is can have no appreci- 
 able bearing on what the corresponding frequency will be for 
 some other collection of objects, or on the probability of finding 
 the characteristic in an object which does not belong to the 
 original collection. We should be arguing in the same sort of 
 way as if we were to base a universal induction as to the 
 concurrence of two characteristics on a single observation of this 
 concurrence, and without any analysis of the accompanying 
 circumstances. 
 
 Let the reader be clear about this. To argue from the mere 
 fact that a given event has occurred invariably in a thousand 
 instances under observation, without any analysis of the circum- 
 stances accompanying the individual instances, that it is likely 
 to occur invariably in future instances, is a feeble inductive 
 argument, because it takes no account of the Analogy. Neverthe- 
 less an argument of this kind is not entirely worthless, as we have 
 seen in Part III. But to argue, without analysis of the instances, 
 from the mere fact that a given event has a frequency of 10 per 
 cent in the thousand instances under observation, or even in a 
 million instances, that its probability is 1/10 for the next instance,
 or that it is likely to have a frequency near to 1/10 in a further
 set of observations, is a far feebler argument ; indeed it is hardly 
 an argument at all. Yet a good deal of statistical argument is not
 free from this reproach; though persons of common sense often
 conclude better than they argue, that is to say, they select for 
 credence, from amongst arguments similar in form, those in 
 favour of which there is in fact other evidence tacitly known to 
 them though not explicit in the premisses as stated. 
 
 4. The analysis of statistical induction is not fundamentally 
 different from that of universal induction already attempted in 
 Part III. But it is much more intricate ; and I have experienced 
 exceptional difficulty, as the reader may discover for himself in 
 the following pages, both in clearing up my own mind about it 
 and in expounding my conclusions precisely and intelligibly. I 
 propose to begin with a few examples of what commonly impresses 
 us as good arguments in this field, and also of the attendant 
 circumstances which, if they were known to exist, might be held
 to justify such a mode of reasoning ; and, having thus attempted 
 to bring before the reader's mind the character of the subject- 
 matter, to proceed to an abstract analysis. 
 
 Example One. — Let us investigate the generalisation that the 
 proportion of male to female births is m. The fact that the 
 aggregate statistics for England during the nineteenth century 
 yield the proportion m would go no way at all towards justifying 
 the statement that the proportion of male births in Cambridge 
 next year is likely to approximate to m. Our argument would 
 be no better if our statistics, instead of relating to England during 
 the nineteenth century, covered all the descendants of Adam. 
 But if we were able to break up our aggregate series of instances 
 into a series of sub-series, classified according to a great variety 
 of principles, as for example by date, by season, by locality, by 
 the class of the parents, by the sex of previous children, and so 
 forth, and if the proportion of male births throughout these sub- 
 series showed a significant stability in the neighbourhood of m, 
 then indeed we have an argument worth something. Otherwise 
 we must either abandon our generalisation, amplify its conditions, 
 or modify its conclusion.
 
 Example Two. — Let us take a series of objects s all alike in 
 some specified respect, this resemblance constituting membership 
 of the class F ; let us determine of how many members of the 
 series a certain property φ is true, the frequency of which is to be
 the subject of our generalisation; and if a proportion f of the
 series s have the property φ, we may say that the series s has a
 frequency f for the property φ.
 
 Now if the whole field F has a finite number of constituents, 
 it must have some determinate frequency p, and if, therefore, 
 we increase the comprehensiveness of s until eventually it 
 includes the whole field, f must come in the end to be equal
 to p. This is obvious and without interest and not what we 
 mean by the law of great numbers and the stability of statistical 
 frequency. 
 
 Let us now divide up the field F, according to some determinate
 principle of division D, into subfields F₁, F₂, etc.; and
 let the series s₁ be taken from F₁, s₂ from F₂, and so on. Where
 F₁, F₂, etc., have a finite number of constituents, s₁, s₂, etc., may
 possibly coincide with them; if s₁, s₂, etc., do not coincide with
 F₁, F₂, etc., but are chosen from them, let us suppose that they are
 chosen according to some principle of random or unbiassed
 selection: s₁, that is to say, will be a random sample from F₁.
 Now it may happen that the frequencies f₁, f₂, etc., of the series
 s₁, s₂, etc., thus selected cluster round some mean frequency f. If
 the frequencies show this characteristic (the measurement and precise
 determination of which I am not now considering), then the
 series of series s₁, s₂, etc., has a stable frequency for the classification
 D. 'Great numbers' only come in because it is difficult to
 ascertain the existence of stable frequency unless the series s₁, s₂,
 etc., are themselves numerous and unless each of these comprises
 numerous individual instances.
 
 Let us then apply a different principle of division D', leading
 to series s₁', s₂', etc., and to frequencies f₁', f₂', etc.; and then again
 a third principle of division D" leading to frequencies f₁", f₂", etc.;
 and so on, to the full extent that our knowledge of the differences
 between the individual instances permits us. If the frequencies
 f₁, f₂, etc., f₁', f₂', etc., f₁", f₂", etc., and so on are all stable about f,
 we have an inductive ground of some weight for asserting a
 statistical generalisation.
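
 The procedure just described, dividing one field by several principles and inspecting the sub-series frequencies under each, can be sketched in code. This is an illustration only and not part of the original text: the records, the field names `district`, `year` and `died`, and the tolerance of 0.02 are all hypothetical, and the crude tolerance test stands in for a proper measure of stability, which the text leaves undetermined at this point.

```python
from collections import defaultdict

def subseries_frequencies(instances, key, has_property):
    """Divide the instances by the principle `key` and return the
    frequency of the property within each resulting sub-series."""
    groups = defaultdict(list)
    for inst in instances:
        groups[key(inst)].append(has_property(inst))
    return {name: sum(flags) / len(flags) for name, flags in groups.items()}

def is_stable(freqs, centre, tolerance):
    """Crude check (an assumption of this sketch): every sub-series
    frequency lies within `tolerance` of `centre`."""
    return all(abs(f - centre) <= tolerance for f in freqs.values())

# Hypothetical field F: 100 records per (district, year) cell, with the
# stated number of deaths in each cell.
cells = {("A", 1901): 10, ("A", 1902): 12, ("B", 1901): 9, ("B", 1902): 9}
records = [
    {"district": d, "year": y, "died": i < deaths}
    for (d, y), deaths in cells.items()
    for i in range(100)
]

# Two principles of division, D and D'.
principles = {
    "district": lambda r: r["district"],
    "year": lambda r: r["year"],
}

overall = sum(r["died"] for r in records) / len(records)
report = {
    name: subseries_frequencies(records, key, lambda r: r["died"])
    for name, key in principles.items()
}
stable = {name: is_stable(freqs, overall, 0.02) for name, freqs in report.items()}
```

 With these made-up figures the frequencies cluster about 0.1 under both divisions. A division under which they scattered widely would, on the argument of the text, weaken the generalisation or require its conditions to be amended.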
 
 Let the field F, for example, comprise all Englishmen in their
 sixtieth year, and let the property φ, about the frequency of
 which we are generalising, be their death in that year of their age.
 Now the field F can be divided into subfields F₁, F₂, etc., on innumerable
 different principles. F₁ might represent Englishmen
 in their sixtieth year in 1901, F₂ in 1902, and so on; or we might
 classify them according to the districts in which they live ; or 
 according to the amount of income tax they pay ; or according as 
 they are in workhouses, in hospitals, in asylums, in prisons, or at 
 large. Let us take the second of these classifications and let the 
 subfields F₁, F₂, etc., be constituted by the districts in which they
 live. If we take large random selections s₁, s₂, etc., from F₁, F₂,
 etc., respectively, and find that the frequencies f₁, f₂, etc., fluctuate
 closely round a mean value f, this can be expressed by the
 statement that there is a stable frequency f for death in the
 sixtieth year in different English districts. We might also find 
 a similar stability for all the other classifications. On the other 
 hand, for the third and fourth classifications we might find no 
 stability at all, and for the first a greater or less degree of stability
 than for the second. In the latter case the form of our statistical 
 generalisation must be modified or the argument in its favour 
 weakened. 
 
 Example Three. — Let us return to the example given in Chapter 
 XXVII. of the dog which is fed sometimes by scraps at table
 and so judges it reasonable to be there. From one year to another, 
 let us assume, the dog gets scraps on a proportion of days more 
 or less stable. What sorts of explanation might there be of
 this? First, it might be the case that he was fed on the movable
 feasts of the Church ; there would be the same number of these 
 in each year, but it would not be easy for any one who had not 
 the clue to discover any regularity in the occasions of their 
 individual occurrence. Second, it might be the case that he 
 was given scraps whenever he looked thin, and that the scraps 
 were withheld whenever he looked fat, so that if he was given 
 scraps on one day, this diminished the likelihood of his getting 
 scraps on the next day, whilst if they were withheld this would 
 increase the likelihood ; the dog's constitution remaining constant, 
 the number of days for scraps would tend to fluctuate from 
 year to year about a stable value. Third, it might be the case 
 that the company at table varied greatly from day to day, and 
 that some days people were there of the kind who give dogs 
 scraps and other days not ; if the set of people from whom 
 the company was drawn remained more or less the same from 
 year to year, and it was a matter of chance (in the objective sense 
 defined in § 8 of Chapter XXIV. above) which of them were
 there from day to day, the proportion of days for scraps might 
 again show some degree of stability from year to year. Lastly,
 a combination between the first and third type of circumstance 
 gives rise to a variant deserving separate mention. It might be 
 the case that the dog was only given scraps by his master, that 
 his master generally went away for Saturday and Sunday, and 
 was at home the rest of the week unless something happened 
 to the contrary, and that " chance " causes would sometimes 
 intervene to keep him at home for the week-end and away in 
 the week ; in this case the frequency of days for scraps would 
 probably fluctuate in the neighbourhood of five-sevenths. In 
 circumstances of this third type, however, the degree of stability 
 would probably be less than in circumstances of the first two 
 types; and in order to get a really stable frequency it might
 be necessary to take a longer period than a year as the basis 
 for each series of observations, or even to take the average for 
 a number of dogs placed in like circumstances instead of one
 dog only. 
 
 It has been assumed so far that we have an opportunity of 
 observing what happens on every day of the year. If this is
 not the case and we have knowledge only of a random sample 
 from the days of each year, then the stability, though it will be 
 less in degree, may be nevertheless observable, and will increase 
 as the number of days included in each sample is increased. 
 This applies equally to each of the three types.
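
 The remark about random samples of days can be illustrated by a small simulation. It is a sketch under stated assumptions and not part of the original text: the figure of 260 scrap-days in a year of 365, the sample sizes of 20 and 200 days, the 200 simulated years, and the fixed seed are all hypothetical choices.

```python
import random
import statistics

def yearly_sample_frequencies(days_with_scraps, sample_size, years, rng):
    """For each year, observe a random sample of days (without
    replacement) and record the proportion of sampled days on which
    the dog got scraps."""
    days = [True] * days_with_scraps + [False] * (365 - days_with_scraps)
    return [
        sum(rng.sample(days, sample_size)) / sample_size
        for _ in range(years)
    ]

rng = random.Random(0)  # fixed seed so that the sketch is repeatable

# Same underlying yearly frequency, observed through small and large samples.
small = yearly_sample_frequencies(260, 20, 200, rng)
large = yearly_sample_frequencies(260, 200, 200, rng)

# Year-to-year spread of the observed frequency under each sample size.
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
```

 Comparing `spread_small` with `spread_large` shows the year-to-year fluctuation of the observed frequency shrinking as more days are included in each sample, although the underlying frequency is identical in both cases.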
 
 5. What is the correct logical analysis of this sort of reasoning ? 
 If an inductive generalisation is a true one, the conclusion which 
 it asserts about the instance under inquiry is, so far as it goes, 
 definite and final, and cannot be modified by the acquisition of 
 more detailed knowledge about the particular instance. But a 
 statistical induction, when applied to a particular instance, is 
 not like this ; for the acquisition of further knowledge might 
 render the statistical induction, though not in itself less probable 
 than before, inapplicable to that particular instance.
 
 This is due to the fact that a statistical induction is not really 
 about the particular instance at all, but has as its subject, about
 which it generalises, a series; and it is only applicable to the
 particular instance, in so far as the instance is, relative to our
 knowledge, a random member of the series. If the acquisition of
 new knowledge affords us additional relevant information about
 the particular instance, so that it ceases to be a random member
 of the series, then the statistical induction ceases to be applicable ; 
 but the statistical induction does not for that reason become 
 any less probable than it was — it is simply no longer indicated 
 by our data as being the statistical generalisation appropriate 
 to the instance under inquiry. The point is illustrated by the
 familiar example that the probability of an unknown individual 
 posting a letter unaddressed can be based on the statistics of 
 the Post Office, but my expectation that I shall act thus, cannot 
 be so determined. 
 
 Thus a statistical generalisation is always of the form: 'The
 probability, that an instance taken at random from the series
 S will have the characteristic φ, is p;' or, more precisely, if a is
 a random member of S(x), the probability of φ(a) is p.
 
 It will be convenient to recapitulate from Chapter XXIV. § 11
 the definition of 'an instance taken at random': Let φ(x)
 stand for 'x has the characteristic φ' and S(x) for 'x is a member
 of the class S'; then, on evidence h, a is a random member
 of the class S for characteristic φ, if 'x is a' is irrelevant to
 φ(x)/S(x) . h,¹ i.e. if we have no information about a relevant
 to φ(a) except S(a).
 
 Or alternatively we might express our definition as follows:
 Consider a particular instance a, where the object of our inquiry
 is the probability of φ(a) relative to evidence h. Let us discard
 that part of our knowledge h(a) which is irrelevant to φ(a),
 leaving us with relevant knowledge h'(a). Let the class of
 instances a₁, a₂, etc., which satisfy h'(x) be designated by S. Then,
 relative to evidence h, a is a random member of the class or
 series S for the characteristic φ.
 
 Let us denote the proposition 'x is, on evidence h, a random
 member of S for characteristic φ' by R(x, S, φ, h); then our
 statistical generalisation is of the form φ(x)/R(x, S, φ, h) . h = p.

 If R(a, S, φ, h) holds, then, on evidence h, S is the appropriate
 statistical series to which to refer a for the purposes of the characteristic
 φ.
 
 It is not always the case that the evidence indicates any
 series at all as 'appropriate' in the above sense. In particular,
 if evidence h indicates S as the appropriate series, and evidence
 h' indicates S' as the appropriate series, then relative to evidence
 hh' (assuming these to be not incompatible), it may be the case
 that no determinate series is indicated as appropriate. In this
 case the method of statistical induction fails us as a means of
 determining the probability under inquiry.

 ¹ The use of variables in probability, as has been pointed out on p. 58, is
 very dangerous. It might therefore be better to enunciate the above: a is a
 random member of S for characteristic φ, if φ(a)/S(a) . h = φ(b)/S(b) . h, where
 S(b) . h contains no information about b, except that b is a member of S.
 
 6. We can now remove our attention from the individual 
 instance a to the properties of the series S. What sort of evidence 
 is capable of justifying the conclusion that p is the probability
 that a random member of the series S will have the characteristic
 φ?
 
 In the simplest case, S is a finite series of which we know the
 truth frequency for the characteristic φ, namely f.¹ Then by a
 straightforward application of the Principle of Indifference we
 have p = f, so that φ(x)/R(x, S, φ, h) . h = f.
 
 In another important type S is a series, with an indefinite 
 number of members which, however, group themselves in such 
 a way that for every member of which φ(x) is true, there corresponds
 a determinate number of members of which φ(x) is
 false. The series, that is to say, contains an indefinite number
 of atoms, but each atom is made up of a set of molecules of
 which φ(x) is true and false respectively in fixed and determinate
 proportions. If this determinate proportion is known to be f, we
 have, as before, p = f. The typical instance of this type is afforded
 by games of chance. Every possible state of affairs which might 
 lead to a divergence in one direction is balanced by another 
 probability leading in the opposite direction ; and these alterna- 
 tive possibilities are of a kind to which the Principle of Indifference 
 is applicable. Thus for every poise of the dice box which leads 
 to the fall of the six-face, there is a corresponding poise which 
 leads to the fall of each of the other faces ; so that if S is the 
 series of possible poises, we may equate p to 1/6 where φ is the fall
 of the six-face. It is not necessary, in order to obtain this
 result, to assert that S is a finite series with an actual determinate
 frequency f for the fall of each face.
 
 So far no inductive element enters in. But in general we do
 not know the constitution of S for certain, and can only infer it
 inductively from its resemblance to other series of which we know
 the constitution. This presents a normal inductive problem:
 the determination by an analysis of the positive and negative
 analogies as to whether the respects in which S differs or may
 differ from the other series is or is not relevant in the particular
 context; and it involves the same sort of considerations as
 those discussed in Part III.

 ¹ I.e. if f is the proportion of the members of the series for which φ(x) is true.
 
 There is, however, a further difficulty to be introduced before 
 we have reached the typical statistical problem. In the case 
 now to be considered our actual data do not consist of positive 
 knowledge of the constitutions either of S itself or of other series 
 more or less resembling S, but only of the frequency of the 
 characteristic in actually observed sets of selections, great or 
 small, either from S itself or from other series more or less 
 resembling S. 
 
 Thus in the most general case our inquiry falls into two parts. 
 We are given the observed frequency in statistical sets selected 
 from S₁, S₂, etc., respectively. The first part of our inquiry is
 the problem of arguing from these observed frequencies to the
 probable constitutions of S₁, S₂, etc., i.e. of determining the values
 of φ(x)/R(x, S₁, φ, h) . h, etc.; we may call this part the statistical
 problem. The second part of our inquiry is the problem of
 arguing from the probable constitutions of S₁, S₂, etc., to the
 probable constitution of S, where S, S₁, S₂ resemble one another
 more or less, and we have to determine whether the differences 
 are or are not relevant to our inquiry ; we may call this part the 
 inductive problem. 
 
 Now if the observed statistical sets are made up of random
 instances of S₁, S₂, etc., we can argue in certain conditions from
 the observed frequencies to the probable constitutions of the
 series, out of which the random selections have been made, by
 an inverse application of Bernoulli's Theorem on the lines explained
 in Chapter XXXI. Moreover, if the series S₁, S₂, etc.,
 are finite series and the observed selections cover a great part
 of their members, we can reach an at least approximate conclusion
 without raising all the theoretical difficulties or satisfying
 all the conditions of Chapter XXXI. The commonly received
 opinions as to the bearing of the observed frequencies in a
 random sample on the constitution of the universe out of which
 the sample is drawn, though generally stated too precisely and
 without sufficient insistence on the assumptions they involve,
 our actual evidence not warranting in general more than an
 approximate result, are not, I think, fundamentally erroneous.
 The most usual error in modern method consists in treating too
 lightly what I have termed above the inductive problem, i.e.
 the problem of passing from the series S₁, S₂, etc., of which we
 have observed samples, to the series S of which we have not
 observed samples.
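
 The kind of commonly received calculation referred to here can be sketched as follows. This is the ordinary textbook normal approximation with a finite-population correction, offered only as an assumption-laden illustration; it is not the treatment of Chapter XXXI, and the function name, the multiplier z = 2, and the numerical figures are hypothetical.

```python
import math

def approximate_frequency_interval(f, n, population=None, z=2.0):
    """Rough limits for the frequency in the series from which a random
    sample of size n, with observed frequency f, was drawn.

    Uses the normal approximation f +/- z * sqrt(f(1-f)/n). When a finite
    `population` size N is given, the correction sqrt((N - n)/(N - 1))
    narrows the limits as the sample approaches the whole series.
    """
    se = math.sqrt(f * (1 - f) / n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return max(0.0, f - z * se), min(1.0, f + z * se)

# A sample of 1000 from a series of unknown or very great extent.
wide = approximate_frequency_interval(0.10, 1000)

# The same sample when it covers a great part of a finite series.
narrow = approximate_frequency_interval(0.10, 1000, population=1200)

# The sample is the whole series: the frequency is simply known.
exact = approximate_frequency_interval(0.10, 1000, population=1000)
```

 The limits say nothing about any series other than the one sampled. The passage from S₁, S₂, etc., to an unexamined series S is the inductive problem, which no calculation of this sort addresses.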
 
 Let us, then, assume that we have ascertained p₁, p₂, etc., with
 more or less exactness, by examining either all the instances of
 the series S₁, S₂, etc., or random selections from them, i.e.
 φ(x)/R(x, S₁, φ, h) . h = p₁, etc. This can be expressed for short by saying
 that the series S₁, S₂, etc., are subject to probable-frequencies
 p₁, p₂, etc., for the characteristic φ. Our problem is to infer from
 this the probable-frequency p of the unexamined series S. The
 class characteristics of the series S₁, S₂, etc., will be partly the same
 and partly different. Using the terminology of Part III. we 
 may term the class characteristics which are common to all of 
 them the Positive Analogy, and the class characteristics which 
 are not common to all of them the Negative Analogy. 
 
 Now, if the observed or inferred probable-frequencies of
 the series S₁, S₂, etc., are to form the basis of a statistical induction,
 they must show a stable value; that is to say, either we must
 have p₁ = p₂ = etc., or at least p₁, p₂, etc., must be stably grouped
 about their mean value. Our next task, therefore, must be
 to discover whether the probable-frequencies p₁, p₂, etc., display
 a significant stability. It is the great merit of Lexis that he was
 the first to investigate the problem of stability and to attempt its
 measurement. For, until a prima facie case has been established
 for the existence of a stable probable-frequency, we have but
 a flimsy basis for any statistical induction at all; indeed we are
 limited to the class of case where the instance under inquiry is
 a member of identically the same series as that from which our
 samples were drawn, i.e. where S = S₁, which in social and scientific
 inquiries is seldom the case.
 
 What is the meaning of the assertion that p₁, p₂, etc., are
 stably grouped about their mean value? The answer is not
 simple and not perfectly precise. We could propound various
 formulae for the measurement of stability and dispersion, respectively,
 and the problem of translating the conception of stability,
 which is not quantitatively precise, into a numerical formula
 involves an arbitrary or approximative element. For practical
 purposes, however, I doubt if it is possible to improve on Lexis's
 measure of stability Q, the mathematical definition of which
 has been given above on p. 399. Lexis describes the stability
 as subnormal, normal, or supernormal according as Q is less than,
 equal to, or greater than 1. This is too precise, and it is better
 perhaps to say that the stability about the mean is normal if
 the dispersion is such as would not be improbable à priori, if
 we had assumed that the members of S₁, S₂, etc., were obtained
 by random selection out of a single universe U, that it is subnormal
 if the dispersion is less than one would have expected on
 the same hypothesis, and that it is supernormal if the dispersion
 is greater than one would have expected.
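
 A measure of this kind can be sketched numerically. The definition given on p. 399 is not reproduced in this section, so the form used below is an assumption: the common textbook version of the Lexis ratio, namely the observed standard deviation of the sub-series frequencies divided by the standard deviation expected if each sub-series were n random draws from a single universe. The function names, the equal sub-series size n, the band of 0.25 about 1, and the figures are all hypothetical. The labels follow the usage of the paragraph above, under which 'subnormal' means less dispersion than expected.

```python
import math

def lexis_ratio(frequencies, n):
    """Observed dispersion of the sub-series frequencies relative to the
    dispersion expected for n random draws from one universe whose
    frequency is taken to be the mean of the observations."""
    k = len(frequencies)
    mean = sum(frequencies) / k
    observed_var = sum((f - mean) ** 2 for f in frequencies) / (k - 1)
    expected_var = mean * (1 - mean) / n
    return math.sqrt(observed_var / expected_var)

def describe_stability(q, band=0.25):
    """Classify Q with a tolerance band about 1 rather than by exact
    equality; the width of the band is an arbitrary assumption."""
    if q < 1 - band:
        return "subnormal"
    if q > 1 + band:
        return "supernormal"
    return "normal"

# Hypothetical frequencies in five sub-series of 400 instances each.
observed = [0.48, 0.52, 0.50, 0.47, 0.53]
q = lexis_ratio(observed, n=400)
label = describe_stability(q)
```

 For these figures Q is about 1.02, so the dispersion is roughly what random selection from one universe would lead us to expect. The use of a band rather than the exact value 1 reflects the objection in the text that Lexis's threefold division is too precise.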
 
 Let us suppose that we find that on this definition p₁, p₂, etc.,
 are stable about p, and let us postpone consideration of the cases
 of subnormal or supernormal dispersion. This is equivalent to
 saying that the frequencies of S₁, S₂, etc., are within limits which
 we should expect à priori, if we had knowledge relative to which
 their members were chosen at random from a universe U of which
 the frequency was p for the characteristic under inquiry. We
 next seek to extend this result to the unexamined series S and to 
 justify anticipations about it on the basis of the members of S 
 also being chosen at random from the universe U. This leads us 
 to the strictly inductive part of our inquiry. 
 
 The class characteristics of the several series S₁, S₂, etc., will be
 partly the same and partly different, those that are the same 
 constituting the positive analogy and those that are different 
 constituting the negative analogy, as stated above. The series 
 S will share part of the positive analogy. The argument for 
 assimilating the properties of S, in relation to the characteristic
 under inquiry, to the properties of S₁, S₂, etc., in relation to this
 characteristic depends on the differences between S, S₁, S₂, etc.,
 being irrelevant in this particular connection. The method of 
 strengthening this argument seems to me to be the same as the 
 general inductive method discussed in Part III. and to present 
 the same, but not greater, difficulties.
 
 In general this inductive part of our inquiry will be best
 advanced by classifying the aggregate series of instances with
 which we are presented in such a way as to analyse most clearly
 the significant positive and negative analogies, to group them,
 that is to say, into sub-series S₁, S₂, etc., which show the most
 marked and definite class characteristics. Our knowledge of the
 differences between the particular observed instances which
 constitute our original data will suggest to us one or more
 principles of classification, such that the members of each sub-series
 all have in common some set of positive or negative characteristics,
 not all of which are shared in common by all the
 members of any of the other sub-series. That is to say, we
 classify our whole set of instances into a series of series S₁, S₂, etc.,
 which have frequencies f₁, f₂, etc., for the characteristic under
 inquiry; and then again we classify them by another principle or
 criterion of classification into a second series of series S₁', S₂', etc.,
 with frequencies f₁', f₂', etc.; and so on, so far as our knowledge of
 the possible relevant differences between the instances extends;
 the whole result being then summed up in a statement of the
 positive and negative analogies of the series of series. If we then
 find that all the frequencies f₁, f₂, etc., f₁', f₂', etc., are stable about
 a value p, and if, on the basis of the above positive and negative
 analogies, we have a normal inductive argument for assimilating
 the unexamined series S to the examined series S₁, S₂, etc., S₁', S₂',
 etc., in respect of the characteristic under inquiry, in this case we
 have, not conclusive grounds, but grounds of some weight for
 asserting the probability p, that an instance taken at random
 from S will have the characteristic in question.
 
 Let me recapitulate the two essential stages of the argu- 
 ment. We first find that the observed frequencies in a set of 
 series are such as would have been not improbable à priori if,
 relative to our knowledge, these series had all been made up of 
 random members of the same universe U ; and we next argue 
 that the positive and negative analogies of this set of series 
 furnish an inductive argument of some weight for supposing that 
 a further unexamined series S resembles the former series in 
 having a frequency for the characteristic under inquiry such as 
 would have been not improbable à priori if, relative to our knowledge,
 S was also made up of random members of the hypothetical
 universe U.
 
 7. It is very perplexing to decide how far an argument of 
 this character involves any new and theoretically distinct 
 difficulties or assumptions, beyond those already admitted 
 as inherent in Universal Induction. I believe that the foregoing
 analysis is along the right lines and that it carries the
 inquiry a good deal further than it has been carried hitherto.
 But it is not conclusive, and I must leave to others its more 
 exact elucidation. 
 
 There is, however, a little more to be said about the half-felt
 reasons which, in my judgment, recommend to common sense 
 some at least of the scientific (or semi-scientific) arguments 
 which run along the above lines. In expressing these reasons I 
 shall be content to use language which is not always as precise as 
 it ought to be. 
 
 I gave in Chapter XXIV. §§ 7-9 an interpretation of what is
 meant by an ' objectively chance ' occurrence, in the sense in 
 which the results of a game, such as roulette, may be said to be 
 governed by ' objective chance.' This interpretation was as 
 follows : " An event is due to objective chance if in order to 
 predict it, or to prefer it to alternatives, at present equi-probable, 
 with any high degree of probability, it would be necessary to 
 know a great many more facts of existence about it than we 
 actually do know, and if the addition of a wide knowledge of 
 general principles would be little use." The ideal instance of 
 this is the game of chance ; but there are other examples afforded 
 by science in which these conditions are fulfilled with more or 
 less perfection. Now the field of statistical induction is the class 
 of phenomena which are due to the combination of two sets of 
 influences, one of them constant and the other liable to vary in 
 accordance with the expectations of objective chance, — Quetelet's 
 ' permanent causes ' modified by ' accidental causes.' In social 
 and physical statistics the ultimate alternatives are not as a rule 
 so perfectly fixed, nor the selection from them so purely random, 
 as in the ideal game of chance. But where, for example, we find 
 stability in the statistics of crime, we could explain this by
 supposing that the population itself is stably constituted, that 
 persons of different temperaments are alive in proportions more 
 or less the same from year to year, that the motives for crime are 
 similar, and that those who come to be influenced by these 
 motives are selected from the population at large in the same 
 kind of way. Thus we have stable causes at work leading to the 
 several alternatives in fixed proportions, and these are modified
 by random influences. Generally speaking, for large classes of 
 social statistics we have a more or less stable population including 
 different kinds of persons in certain proportions and on the other
 hand sets of environments; the proportions of the different
 kinds of persons, the proportions of the different kinds of environ- 
 ments, and the manner of allotting the environments to the 
 persons vary in a random manner from year to year (or, it may be, 
 from district to district). In all such cases as these, however, 
 prediction beyond what has been observed is clearly open to 
 sources of error which can be neglected in considering, for 
 example, games of chance ; our so-called ' permanent ' causes
 are always changing a little and are liable at any moment to 
 radical alteration. 
 
 Thus the more closely that we find the conditions in scientific 
 examples assimilated to those in games of chance, the more 
 confidently does common sense recommend this method. The 
 rather surprising frequency with which we find apparent stability 
 in human statistics may possibly be explained, therefore, if the 
 biological theory of Mendelism can be established. According to 
 this theory the qualities apparent in any generation of a given 
 race appear in proportions which are determined by methods 
 very closely analogous to those of a game of chance. To take a 
 specific example (I am giving not the correct theory of sex but an 
 artificially simplified form of it), suppose there are two kinds of 
 spermatozoa and two kinds of ova and of the four possible kinds 
 of union two produce males and two females, then if the kinds of
 spermatozoa and ova exist in equal numbers and their union is 
 determined by random considerations in precisely the same sense 
 in which a game of chance such as roulette depends upon random 
 considerations, we should expect the observed proportions to 
 vary from equality, as indeed they do, in the same manner as 
 variations from equality of red and black occur at roulette.¹ If
 the sphere of influence of Mendelian considerations is wide, we
 have both an explanation in part of what we observe and also a 
 large opportunity in future of using with profit the methods of 
 statistical analysis. 
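
The comparison with roulette can be illustrated numerically. The following Python sketch is not part of the original argument: it assumes a hypothetical chance of 0.51 for a male birth, hypothetical batch sizes, and illustrative function names. It simulates independent batches of births and compares the observed dispersion of the batch frequencies with the dispersion to be expected from independent trials, in the spirit of the criterion attributed here to Lexis.

```python
import random
import statistics

def batch_frequencies(p, batch_size, n_batches, rng):
    """Simulate n_batches independent batches of trials and return the
    frequency of 'success' (here, a male birth) in each batch."""
    freqs = []
    for _ in range(n_batches):
        successes = sum(1 for _ in range(batch_size) if rng.random() < p)
        freqs.append(successes / batch_size)
    return freqs

def dispersion_ratio(freqs, batch_size):
    """Observed standard deviation of the batch frequencies divided by the
    standard deviation expected if every trial were an independent draw
    with the pooled probability."""
    pooled = statistics.fmean(freqs)
    expected_sd = (pooled * (1 - pooled) / batch_size) ** 0.5
    observed_sd = statistics.pstdev(freqs)
    return observed_sd / expected_sd

rng = random.Random(1921)  # fixed seed so that the run is repeatable
freqs = batch_frequencies(p=0.51, batch_size=2000, n_batches=200, rng=rng)
ratio = dispersion_ratio(freqs, batch_size=2000)
print(round(ratio, 2))
```

A ratio near 1 corresponds to what the text calls normal stability. On this reading, a ratio well above 1 would point to subnormal stability and a ratio well below 1 to supernormal stability, though a single simulated run settles nothing about real statistics.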
 
 This is all familiar. This is the way in which in fact we do
 think and argue. The inquiry as to how far it is covered by the
 abstract analysis of the preceding paragraphs, and by what
 logical principle the use of this analysis can be justified as rational,
 I have pushed as far as I can. It deserves a profounder study
 than logicians have given it in the past.

 ¹ The fluctuations in the proportion of the sexes which, as is well known,
 is not in fact one of equality, correspond, as Lexis has shown, to what one
 would expect in a game of chance with an astonishing exactitude. But
 it is difficult to find any other example, amongst natural or social phenomena,
 in which his criteria of stability are by any means as equally well satisfied.
 
 8. Two subsidiary questions remain to be mentioned. The 
 first of these relates to the character of series which, in the 
 terminology of Lexis, show a subnormal or supernormal stability ; 
 for I have pressed on to the conclusion of the argument on the 
 assumption that the stabilities are normal. Subnormal stability 
 conceals two types : the one in which there is really no stability 
 at all and the results are in fact chaotic ; and the other in which 
 there is mutual dependence between the successive instances of 
 such a kind that they tend to resemble one another so that any 
 divergence from the normal tends to accentuate itself. Super- 
 normal stability corresponds in the other direction to the second 
 of these two types ; that is to say, there is mutual dependence of
 a regulative kind between the successive instances which tends 
 to prevent the frequency from swinging away from its mean 
 value. The case, where the dog was fed with scraps when he 
 looked thin and not fed when he looked fat, illustrated this. 
 The typical example of this type is where balls are drawn from 
 urns, containing black and white balls in certain proportions and 
 not replaced ; so that every time a black ball is drawn the next 
 ball is more likely than before to be white, and there is a tendency 
 to redress any excess of either colour beyond the proper propor- 
 tions. Possibly the aggregate annual rainfall may afford a 
 further illustration. 
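
The urn example can be checked by direct calculation. The sketch below is an illustration only, with hypothetical numbers (100 balls, 50 of them black, 40 draws) and illustrative function names. It computes the exact variance of the number of black balls drawn without replacement and compares it with the variance for the same number of independent draws with replacement.

```python
from math import comb

def variance_without_replacement(total, black, draws):
    """Exact variance of the number of black balls in `draws` draws made
    WITHOUT replacement, summed directly over the possible outcomes."""
    ways = comb(total, draws)
    lowest = max(0, draws - (total - black))
    highest = min(black, draws)
    probs = {k: comb(black, k) * comb(total - black, draws - k) / ways
             for k in range(lowest, highest + 1)}
    mean = sum(k * p for k, p in probs.items())
    return sum((k - mean) ** 2 * p for k, p in probs.items())

def variance_with_replacement(total, black, draws):
    """Variance of the number of black balls when each ball is replaced,
    so that the draws are independent."""
    p = black / total
    return draws * p * (1 - p)

total, black, draws = 100, 50, 40
v_without = variance_without_replacement(total, black, draws)
v_with = variance_with_replacement(total, black, draws)
print(v_without, v_with)
```

With these numbers the variance without replacement is the independent-draw variance multiplied by (100 - 40)/(100 - 1), so the frequency swings less widely about its mean than independent draws would allow. This is the regulative dependence described above as supernormal stability.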
 
 Where there is no stability at all and the frequencies are chaotic, 
 the resulting series can be described as ' non-statistical.' Amongst 
 ' statistical series,' we may term ' independent series ' those of 
 which the instances are independent and the stability normal, 
 and ' organic series,' those of which the instances are mutually 
 dependent and the stability abnormal, whether in excess or in
 defect. ' Organic series ' have been incidentally discussed elsewhere
 in this volume. I shall not pursue them further now,
 because I do not think that they introduce any new theoretical
 difficulty into the general problem of statistical inference ;
 although the problem of fitting them into the general theoretical
 scheme is not easy.¹
 
 ¹ The following more precise definitions bring these ideas into line with what
 has gone before : consider the terms $a_1, a_2, \ldots, a_n$ of a series $s(x)$ ;
 let ' $a_r$ is $g$ ' $= g_r$, and let $g_r/h = p_r$, where $h$ is our data. Then, if
 $g_r/g_s \ldots g_t \ldots h = p_r$ for all values of $r, s, \ldots, t, \ldots$, the terms
 of the series are independent relative to $h$. If $p_1 = p_2 = \ldots = p$ the terms
 are uniform. If the terms are both independent and uniform, the series may be
 called an independent Bernoullian series, subject to a Bernoullian probability
 $p$. If the terms are independent but not uniform, the series may be called an
 independent compound series, subject to a compounded probability
 $\frac{1}{n}\sum p_r$. If the terms are not independent, the series is an organic
 series.

 The same terminology can then be applied to the series $s_1, s_2, \ldots, s_n$,
 regarded as members of the series of series $S(x)$. Let the frequencies of the
 series for the characteristic under inquiry be $x_1, x_2, \ldots, x_n$, and let
 $x_1/h = \theta_1(x_1)$, i.e. $\theta_1(x_1)$ is the probability of a frequency $x_1$
 in the first series. Then if $x_r/x_s \ldots h = \theta_r(x_r)$ for all values of
 $r$, $s$, etc., the frequencies are independent ; and if
 $\theta_1(x) = \theta_2(x) = \ldots = \theta(x)$, the frequencies are stable. If the
 frequencies are stable and independent, the series of series may be called
 Gaussian. If the frequencies are stable and independent, and if in addition
 each individual series is subject to a Bernoullian probability, the probable
 dispersion of the frequency is normal and symmetrical. If the individual
 series are organic, the dispersion of the frequencies may be normal,
 subnormal, or supernormal. If the series of series is Gaussian, and the
 individual series Bernoullian, we have the type of the perfect statistical
 series.

 9. The second question is concerned with the relation between
 the Inductive Correlation, which has been the subject-matter of
 this chapter, and the Correlation Coefficient, or, as I should prefer
 to call it, the Quantitative Correlation, with which recent English
 statistical theory has chiefly occupied itself. I do not propose
 to discuss this theory in detail, because I suspect that it is much
 more concerned, at any rate in its present form, with statistical
 description than with statistical induction. The transition from
 defining the ' correlation coefficient ' as an algebraical expression
 to its employment for purposes of inference is very far from
 clear even in the work of the best and most systematic writers
 on the subject, such as Mr. Yule and Professor Bowley.

 In the notation employed in the earlier part of this chapter I
 have classified each examined instance $a$ according as it did or
 did not possess the characteristic $\phi$, i.e. satisfy the propositional
 function $\phi(x)$, or, in other words, according as $\phi(a)$ was true or
 false. Thus only two possible alternatives were contemplated,
 and $\phi$ was not considered as a quantitative characteristic which
 the instance could satisfy in greater or less degree. Equally the
 common element in all the instances, required to constitute them
 as instances for the purpose of our statistical generalisation (or,
 as I have sometimes put it, required to satisfy the condition of the
 generalisation), was regarded as definite and unique and not
 capable of quantitative variation. That is to say, all the instances
 satisfied a function $\psi(x)$, and the question was, what proportion
 of them also satisfied the function $\phi(x)$. A typical example was
 that of sex-ratio, $\psi(x)$ being the birth of a child and $\phi(x)$ its
 sex, where there is no question of degree in either $\psi(x)$ or $\phi(x)$.
 
 It might be the case, however, that the characteristics under
 examination were capable of degree or quantitative variation ;
 for example $\psi(x)$ might be the age of the mother and $\phi(x)$ the
 weight of the child at birth. In this case we should have a series
 $\psi_1(x)$, $\psi_2(x)$, etc., corresponding to the various age-periods of the
 mothers, and a series $\phi_1(x)$, $\phi_2(x)$, etc., corresponding to the various
 weights of the children. Now if we concentrated our attention
 on $\psi_1(x)$ and $\phi_1(x)$ alone, i.e. on mothers of a particular age and
 the proportions of their children which had a particular weight
 at birth, we have a one-dimensional problem of the same kind as
 before ; out of all the instances which satisfy $\psi_1(x)$ a certain
 proportion satisfy $\phi_1(x)$ also. But clearly we can push our
 observations further and we can take note what proportion of the
 instances which satisfy $\psi_1(x)$ satisfy $\phi_2(x)$, $\phi_3(x)$, and so on,
 respectively ; and then we can do the same as regards the instances
 which satisfy $\psi_2(x)$, $\psi_3(x)$, etc. The total results of this
 two-dimensional set of observations can then be tabulated in what is
 called a twofold correlation table. Thus if $f_{rs}$ is the proportion
 of instances satisfying $\psi_s(x)$ which also satisfy $\phi_r(x)$ we have a
 table as follows :

               $\psi_1(x)$   $\psi_2(x)$   $\psi_3(x)$

 $\phi_1(x)$   $f_{11}$      $f_{12}$      $f_{13}$
 $\phi_2(x)$   $f_{21}$      $f_{22}$      $f_{23}$
 $\phi_3(x)$   $f_{31}$      $f_{32}$      $f_{33}$
 
 We could, further, increase the complexity and completeness
 of our observations to any required degree. For example we
 might take account also of $\theta(x)$, the age of the father, and
 construct a threefold table where $f_{rst}$ is the proportion of instances
 satisfying $\phi_r(x)$, $\psi_s(x)$, $\theta_t(x)$ ; and so on up to an n-fold table.
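
The construction of the twofold table can be sketched in a few lines of Python. The instances below are hypothetical, as are the function and variable names; the only point is to show each entry as a proportion taken within one class of the conditioning characteristic (the mother's age-period), so that every column of the table sums to unity.

```python
from collections import Counter

def twofold_table(instances):
    """Build the twofold table f[(r, s)]: among the instances falling in
    psi-class s, the proportion that also fall in phi-class r."""
    joint = Counter(instances)                      # count of each (r, s) pair
    psi_totals = Counter(s for _, s in instances)   # instances per psi-class
    return {(r, s): count / psi_totals[s] for (r, s), count in joint.items()}

# Hypothetical instances: (weight class r of the child, age class s of the mother).
instances = [(1, 1), (1, 1), (2, 1),
             (2, 2), (2, 2), (3, 2),
             (3, 3), (3, 3), (2, 3), (1, 3)]
table = twofold_table(instances)

# Print the table row by row; a missing pair has proportion zero.
for r in (1, 2, 3):
    print([round(table.get((r, s), 0.0), 2) for s in (1, 2, 3)])
```

A threefold table would follow the same pattern with triples in place of pairs, the proportions being taken within each combination of the conditioning classes.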
 
 Clearly it is not necessary for the construction of tables of
 this kind that $\phi(x)$ and $\psi(x)$ should stand for degrees of the same
 quantitative characteristic ; they might be any set of exclusive
 alternatives ; for example, $\psi(x)$ might be the colour of the baby's
 eyes, and $\phi(x)$ its Christian name.
 
 But in order that the correlation table may be of any
 practical interest for the purposes of inference, it is necessary
 (and this, I think, is one of the critical assumptions of correlation)
 that $\phi_1(x)$, $\phi_2(x)$ . . . and also $\psi_1(x)$, $\psi_2(x)$ . . . should
 be arranged in an order that is significant, i.e. such that we have
 some à priori reason for expecting some connection to exist
 between the order of the $\phi$'s and the order of the $\psi$'s. The point
 of this will be illustrated by concentrating our attention on the
 simplest type of case where $\psi(x)$ and $\phi(x)$ are quantitative
 characteristics arranged in order of magnitude. Now suppose
 it were the case that the younger mothers tended to bear heavier
 babies, then, if $\psi_1(x)$, $\psi_2(x)$ are the ages increasing upwards and
 $\phi_1(x)$, $\phi_2(x)$ the weights diminishing downwards, $f_{11}$ would probably
 be the greatest of the $f_{r1}$'s and, generally speaking, $f_{r1}$ would be
 greater than $f_{r+1\,1}$ ; also $f_{22}$ might be the greatest of the $f_{r2}$'s, and
 so on ; so that the frequencies lying on the diagonal of the table
 would be the greatest and the frequencies would tend to be less
 the farther they lay from the diagonal. If we had some reason
 à priori (i.e. based on our pre-existing knowledge), if only a
 slight one, for supposing that there might be some connection
 between the age of the mother and the weight of the baby, then,
 if in a particular set of instances the frequencies were grouped
 about the diagonal as suggested above, this might be taken as
 affording some inductive support for the hypothesis.
 
 Now the theory of correlation, as it is expounded in the
 text-books, is almost entirely concerned with measuring how
 nearly the observed frequencies are grouped about the diagonal
 of the table (though the complete theory is not, of course, so
 restricted as this). The ' coefficient of correlation ' is an algebraical
 formula which may be regarded as measuring this phenomenon
 in a way that is sufficiently satisfactory for all ordinary purposes.
 If it is defined thus, it is simply a statistical description of a
 particular set of observations arranged in a particular order.
 How can we make use of this coefficient for the purposes of
 inference ?
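
The descriptive character of the coefficient can be seen by computing it from a table. The text does not print the formula; the sketch below assumes the usual product-moment definition, and the two tables of counts, the class magnitudes, and the function name are all hypothetical.

```python
from math import sqrt

def correlation_from_table(counts, row_values, col_values):
    """Product-moment correlation for a grouped table.
    counts[i][j] is the number of instances in row class i and column
    class j; row_values[i] and col_values[j] are the magnitudes assigned
    to those classes."""
    n = sum(sum(row) for row in counts)
    mean_r = sum(counts[i][j] * rv
                 for i, rv in enumerate(row_values)
                 for j in range(len(col_values))) / n
    mean_c = sum(counts[i][j] * cv
                 for i in range(len(row_values))
                 for j, cv in enumerate(col_values)) / n
    cov = var_r = var_c = 0.0
    for i, rv in enumerate(row_values):
        for j, cv in enumerate(col_values):
            w = counts[i][j]
            cov += w * (rv - mean_r) * (cv - mean_c)
            var_r += w * (rv - mean_r) ** 2
            var_c += w * (cv - mean_c) ** 2
    return cov / sqrt(var_r * var_c)

grouped_on_diagonal = [[30, 5, 0], [5, 30, 5], [0, 5, 30]]
evenly_scattered = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
magnitudes = [1, 2, 3]

r_diag = correlation_from_table(grouped_on_diagonal, magnitudes, magnitudes)
r_flat = correlation_from_table(evenly_scattered, magnitudes, magnitudes)
print(round(r_diag, 3), round(r_flat, 3))
```

The first table, with its counts massed on the diagonal, yields a coefficient of 60/70; the second, evenly scattered, yields zero. Nothing in the calculation goes beyond a summary of the particular counts supplied and the particular order in which the classes are arranged.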
 
 Dr. Bowley faces this problem a little more definitely than do
 most statistical writers. Mr. Yule warns the student that the
 problem exists,¹ but he does not himself attack it systematically
 or do more than apply common sense to particular problems.
 So much greater emphasis, however, has been laid hitherto on
 the mathematical complications, that many statistical students
 hazily float from defining the correlation coefficient as a statistical
 description to employing it as a measure of the probability of a
 statistical generalisation as to the association between quantitative
 variations of $\phi(x)$ and $\psi(x)$ respectively. If, for example,
 it is found in a particular set of observations of
 mothers' ages and babies' weights that the frequencies are
 closely ranged about the diagonal, this is considered a sufficiently
 good reason for attributing probability to a generalisation as to
 the ' correlation ' (i.e. tendency to quantitative correspondence)
 between the age of the mother and the weight of the baby.
 
 Dr. Bowley's line of thought is as follows. He begins by
 defining the correlation coefficient r merely as a statistical
 description (Elements of Statistics, p. 354). He then shows (p. 355),
 as an illustration of the nature of r, that if x and y are two
 variable quantities which depend (more strictly, are known to
 depend) on other variables U, V, W in such a way that

 $$X_t = {}_1U_t + {}_2U_t + \ldots + {}_nU_t + {}_1V_t + {}_2V_t + \ldots + {}_mV_t$$
 $$Y_t = {}_1U_t + {}_2U_t + \ldots + {}_nU_t + {}_1W_t + {}_2W_t + \ldots + {}_mW_t$$

 where ${}_1U_t$, ${}_2U_t$ . . . ${}_1V_t$, ${}_2V_t$ . . . ${}_1W_t$, ${}_2W_t$ . . . are selected
 at random each from an independent group of quantities (more
 strictly, are relative to our data, random members of independent
 groups) ; then, if we know à priori certain statistical coefficients
 descriptive of the constitution of these groups, the value of r
 will probably tend towards a certain value. So far we are on
 fairly safe, but not very fruitful, ground. We have no basis
 for arguing backwards from the observed value of r ; but,
 provided we have rather extensive and peculiar knowledge
 à priori as to how $X_t$ and $Y_t$ are constituted, then we have
 calculable expectations as to the limits within which the value
 of r, namely the correlation coefficient between X and Y, will
 probably turn out to lie, when we have observed it.

 ¹ Introduction to the Theory of Statistics, p. 191 : " The coefficient of
 correlation, like an average or a measure of dispersion, only exhibits in a summary
 and comprehensible form one particular aspect of the facts on which it is based,
 and the real difficulties arise in the interpretation of the coefficient when
 obtained."

 Dr. Bowley's next move is more dubious. If the constitutions
 of the independent groups are similar in a certain statistical
 respect (i.e. if they have the same standard deviations), then,
 Dr. Bowley concludes, $r = \frac{n}{n+m}$, which " expressed in words
 shows that the correlation coefficient tends to be the ratio of
 the number of causes common in the genesis of two variables
 to the whole number of independent causes on which each
 depends." By this time the student's mind, unless anchored
 by a more than ordinary scepticism, will have been well launched
 into a vague, fallacious sea.
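
The forward calculation on which the dictum rests can be reproduced by simulation, provided its strong assumptions are granted. The sketch below is an illustration under assumptions that are mine and not the text's: every cause is taken to be an independent normal variate with the same standard deviation, causes combine by simple addition, and the names `n_common` and `n_separate` are hypothetical. It shows only what value of r to expect when the constitution of the variables is already known; it offers no warrant for reasoning in the reverse direction.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Product-moment correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simulate_r(n_common, n_separate, n_obs, rng):
    """Each observation: X and Y share n_common causes, and each has
    n_separate further causes of its own, all drawn independently."""
    xs, ys = [], []
    for _ in range(n_obs):
        shared = sum(rng.gauss(0, 1) for _ in range(n_common))
        xs.append(shared + sum(rng.gauss(0, 1) for _ in range(n_separate)))
        ys.append(shared + sum(rng.gauss(0, 1) for _ in range(n_separate)))
    return pearson(xs, ys)

rng = random.Random(355)  # fixed seed so that the run is repeatable
r = simulate_r(n_common=3, n_separate=1, n_obs=20000, rng=rng)
expected = 3 / (3 + 1)    # common causes over all causes of each variable
print(round(r, 2), expected)
```

Under these assumptions the simulated coefficient falls close to the ratio of common causes to all causes. Relax the equality of the standard deviations or the additive form and the ratio no longer describes r, which is one reason for treating the verbal dictum with caution.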
 
 Neglecting, however, the dictum just quoted, we find that the
 second stage of the argument consists in showing that, if we
 have a certain sort of knowledge à priori as to how our variables
 are constituted, then the various possible values for the coefficients
 of correlation, which would be yielded by actual sets of observations
 made in prescribed conditions, will have, à priori, and
 before the observations have been made, calculable probabilities,
 certain ranges of values being probable and others improbable.
 
 As a rule, however, we are not arguing from knowledge about 
 the variables to anticipations about their correlation coefficient ; 
 but the other way round, that is from observations of their 
 correlation coefficients to theories about the nature of the vari- 
 ables. Dr. Bowley perceives that this involves a third stage 
 of the argument, and appeals accordingly (p. 409) to " the 
 difficult and elusive theory of inverse probability." He appre- 
 hends the difficulty but he does not pursue it ; and, like Mr. 
 Yule, he really falls back for practical purposes on the criteria 
 of common sense, an expedient well enough in his case, but not 
 a universal safeguard. 
 
 The general argument from inverse probability to which Dr. 
 Bowley makes his vague appeal is doubtless on the following 
 lines : If there is no causal connection between the two sets of 
 quantities, then a close grouping of the frequencies about the 
 diagonal would be a priori improbable (and the greater the 
 number of the individual observations, the greater the improba- 
 bility since, if the quantities are independent, there is, then, all 
 the more opportunity for ' averaging out ') ; therefore, inversely,
 if the frequencies do group themselves about the diagonal, we
 have a presumption in favour of a causal connection between 
 the two sets of quantities. 
 
 But if the reader recalls our discussion of the principle of
 inverse probability, he will remember that this conclusion cannot
 be reached unless à priori, and quite apart from the observations
 in question, we have some reason for thinking that there may be 
 such a causal connection between the quantities. The argu- 
 ment can only strengthen a pre-existing presumption ; it cannot 
 create one. And in the absence of reasons peculiar to the 
 particular inquiry, we have no choice but to fall back on the 
 general methods and the general presumptions of induction. 
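
The point may be set out in the modern form of the inverse principle. The notation is an assumption adopted for illustration and is not that of this volume, which writes probabilities in the form a/h : let H stand for the hypothesis of a causal connection, and E for the observed grouping of the frequencies about the diagonal.

```latex
\[
P(H \mid E) \;=\;
\frac{P(E \mid H)\,P(H)}
     {P(E \mid H)\,P(H) \;+\; P(E \mid \bar{H})\,\bigl(1 - P(H)\bigr)}
\]
```

If P(H) is zero, the right-hand side is zero however small P(E | not-H) may be. A close grouping about the diagonal raises the probability of H only when some finite probability has already been assigned to it on other grounds.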
 
 It is apparent that, where the correlation argument seems 
 plausible, some tacit assumption must have slipped in, if we return
 to the case where our correlation table relates to the weights of 
 the babies and their Christian names. Either by accident or 
 because we had arranged the order of the Christian names to 
 suit, it might happen with a particular set of observations, even 
 a fairly numerous set, that the correlation coefficient was large. 
 Yet on that evidence alone we should hardly assert a generalisation 
 connecting the weights of babies with their Christian names. 
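
How far the coefficient depends on the order chosen for the names can be shown with a handful of invented observations. Everything in the sketch below is hypothetical: the names, the weight classes, and the function names. It computes the coefficient for every possible ordering of the names and reports the largest and smallest values obtained.

```python
from itertools import permutations
from math import sqrt

def pearson(xs, ys):
    """Product-moment correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical observations: (Christian name, weight class of the baby).
observations = [("Ann", 3), ("Ann", 3), ("Bob", 1), ("Bob", 2),
                ("Cyd", 2), ("Cyd", 3), ("Dee", 1), ("Dee", 1)]
names = sorted({name for name, _ in observations})
weights = [w for _, w in observations]

def r_for_order(order):
    """Coefficient obtained when the names are ranked in the given order."""
    position = {name: i for i, name in enumerate(order)}
    return pearson([position[name] for name, _ in observations], weights)

coefficients = [r_for_order(order) for order in permutations(names)]
largest, smallest = max(coefficients), min(coefficients)
print(round(largest, 2), round(smallest, 2))
```

With these invented figures the coefficient ranges from about -0.90 to about +0.90 according to the order adopted, the observations themselves being unchanged throughout. A large value obtained by arranging the names to suit is therefore no evidence of any connection.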
 
 The truth is that sensible investigators only employ the 
 correlation coefficient to test or confirm conclusions at which 
 they have arrived on other grounds. But that does not validate
 the crude way in which the argument is sometimes presented, 
 or prevent it from misleading the unwary, — since not all investi- 
 gators are sensible. 
 
 If we abandon the method of inverse probability in favour of 
 the less precise but better founded processes of induction,
 ' quantitative correlation,' as I should like to term this particular 
 branch of statistical induction, is more complicated than, but not 
 theoretically distinct from, the kind of arguments which have 
 occupied the earlier paragraphs of this chapter. The character 
 of the additional complication can be described by saying that 
 we are presented with a two-dimensional problem instead of a 
 one-dimensional problem. The mere existence of a particular 
 correlation coefficient as descriptive of a group of observations, 
 even of a large group, is not in itself a more conclusive or significant 
 argument than the mere existence of a particular frequency 
 coefficient would be. Of course if we have a considerable body
 of pre-existing knowledge relevant to the particular inquiry, the
 calculation of a small number of correlation coefficients may be 
 crucial. But otherwise we must proceed as in the case of fre- 
 quency coefficients ; that is to say we must have before us, in 
 order to found a satisfactory argument, many sets of observa- 
 tions, of which the correlation coefficients display a significant 
 stability in the midst of variation in the non-essential class 
 characteristics (i.e. those class characteristics which our general- 
 isation proposes to neglect) of the different sets of observations. 
 
 10. I am now at the conclusion of an inquiry in which, 
 beginning with fundamental questions of logic, I have endeavoured 
 to push forward to the analysis of some of the actual arguments 
 which impress us as rational in the progress of knowledge and the 
 practice of empirical science. In writing a book of this kind the 
 author must, if he is to put his point of view clearly, pretend some- 
 times to a little more conviction than he feels. He must give 
 his own argument a chance, so to speak, nor be too ready to 
 depress its vitality with a wet cloud of doubt. It is a heavy task 
 to write on these problems ; and the reader will perhaps excuse
 me if I have sometimes pressed on a little faster than the difficulties
 were overcome, and with decidedly more confidence than
 I have always felt. 
 
 In laying the foundations of the subject of Probability, I have 
 departed a good deal from the conception of it which governed 
 the minds of Laplace and Quetelet and has dominated through 
 their influence the thought of the past century, though I believe
 that Leibniz and Hume might have read what I have written with 
 sympathy. But in taking leave of Probability, I should like to 
 say that, in my judgment, the practical usefulness of those modes 
 of inference, here termed Universal and Statistical Induction, 
 on the validity of which the boasted knowledge of modern science
 depends, can only exist (and I do not now pause to inquire
 again whether such an argument must be circular) if the universe
 of phenomena does in fact present those peculiar characteristics 
 of atomism and limited variety which appear more and more 
 clearly as the ultimate result to which material science is tending : 
 
 fateare necessest
 materiem quoque finitis differre figuris.

 The physicists of the nineteenth century have reduced matter to
 the collisions and arrangements of particles, between which the
 ultimate qualitative differences are very few ; and the Mendelian 
 biologists are deriving the various qualities of men from the 
 collisions and arrangements of chromosomes. In both cases the 
 analogy with the perfect game of chance is really present ; and
 the validity of some current modes of inference may depend on the 
 assumption that it is to material of this kind that we are applying 
 them. Here, though I have complained sometimes at their want 
 of logic, I am in fundamental sympathy with the deep underlying 
 conceptions of the statistical theory of the day. If the contem- 
 porary doctrines of Biology and Physics remain tenable, we may 
 have a remarkable, if undeserved, justification of some of the 
 methods of the traditional Calculus of Probabilities. Professors 
 of probability have been often and justly derided for arguing as 
 if nature were an urn containing black and white balls in fixed 
 proportions. Quetelet once declared in so many words : " l'urne
 que nous interrogeons, c'est la nature." But again in the
 history of science the methods of astrology may prove useful to
 the astronomer ; and it may turn out to be true, reversing
 Quetelet's expression, that " La nature que nous interrogeons,
 c'est une urne."
 
BIBLIOGRAPHY

 INTRODUCTION
 
 There is no opinion, however absurd or incredible, which has not been 
 maintained by some one of our philosophers.
                                             DESCARTES.
 
 The following Bibliography does not pretend to be complete, 
 but it contains a much longer list of what has been written 
 about Probability than can be found elsewhere. I have
 hesitated a little before burdening this volume with the titles
 of many works, so few of which are still valuable. But I was
 myself much hampered, when first I embarked on the study of
 this subject, by the absence of guide-posts to the scattered but
 extensive literature of the subject ; and a list which I drew up
 for my own convenience, without much attention to biblio- 
 graphical nicety or to exact uniformity in the style of entry, 
 may be useful to others. 
 
 It is rather an arbitrary matter to decide what to include 
 and what to exclude. Probability overlaps many other topics,
 and some of the most important references to it are to be 
 found in books, the main topic of which is something else. On 
 the other hand it would be absurd to include every casual 
 reference ; and no useful purpose would have been served by 
 cataloguing the very numerous volumes dealing with Insurance, 
 Games of Chance, Statistics, Errors of Observation, and Least 
 Squares, which treat in detail these various applications of the 
 Theory of Probability. It has been a matter of some difficulty,
 therefore, to know precisely where to draw the line. Where
 the main subject of a book or paper is Probability proper, I
 have included it, nearly regardless of my own view as to its
 importance, and have not attempted to act as censor ; but
 where Probability is not the main subject or where an application
 of Probability is concerned, the chief interest of which is
 solely in the application itself, I have only included the entry
 where I think it important, intrinsically or historically or 
 from the celebrity of the author. In particular, the existence 
 of Professor Mansfield Merriman's very extensive bibliography, 
 published in the Transactions of the Connecticut Academy for 
 1877, has made it possible to deal very lightly (and to the 
 extent of but few entries) with the inordinately large literature
 of Least Squares. This list comprises 408 titles of writings 
 relating to the Method of Least Squares and the theory of 
 accidental errors of observation, and is sufficiently exhaustive 
 so far as relates to memoirs on this topic published before
 1877. 
 
 Of bibliographical sources for Probability proper, Todhunter's
 History of the Mathematical Theory of Probability
 and Laurent's Calcul des probabilités are alone important. Of
 mathematical works published before the time of Laplace,
 Todhunter's list, and also his commentary and analysis, are
 complete and exact, a work of true learning, beyond criticism.
 The bibliographical catalogue at the conclusion of Laurent's
 Calcul (published in 1873) is the longest list published hitherto
 of general works on Probability. But it is unduly swollen by
 the inclusion of numerous items on Insurance and Errors of
 Observation, the bearing of which on Probability is very
 slight ;¹ it is chiefly mathematical in bias ; and it is now
 nearly fifty years old.
 
 I have not read all these books myself, but I have read
 more of them than it would be good for any one to read again.
 There are here enumerated many dead treatises and ghostly
 memoirs. The list is too long, and I have not always successfully
 resisted the impulse to add to it in the spirit of a
 collector. There are not above a hundred of these which it
 would be worth while to preserve, if only it were securely
 ascertained which these hundred are. At present a bibliographer
 takes pride in numerous entries ; but he would be a
 more useful fellow, and the labours of research would be
 lightened, if he could practise deletion and bring into existence
 an accredited Index Expurgatorius. But this can only be
 accomplished by the slow mills of the collective judgment of
 the learned ; and I have already indicated my own favourite
 authors in copious footnotes to the main body of the text.

 ¹ Laurent's list contains 310 titles, of which I have excluded 174 from my
 list as being insufficiently relevant.
 
 The list is long ; yet there is, perhaps, no subject of equal 
 importance and of equal fascination to men's minds on which
 so little has been written. It is now fifty-five years since
 Dr. Venn, still an accustomed figure in the streets and courts
 of Cambridge, first published his Logic of Chance ; yet amongst
 systematic works in the English language on the logical founda- 
 tions of Probability my Treatise is next to his in chronological 
 order. 
 
 The student will find many famous names here recorded. 
 The subject has preserved its mystery, and has thus attracted 
 the notice, profound or, more often, casual, of most speculative 
 minds. Leibniz, Pascal, Arnauld, Huygens, Spinoza, Jacques 
 and Daniel Bernoulli, Hume, D'Alembert, Condorcet, Euler, 
 Laplace, Poisson, Cournot, Quetelet, Gauss, Mill, Boole,
 Tchebychef, Lexis, and Poincaré, to name those only who are
 dead, are catalogued below. 
 
 Abbott, T. K. " On the Probability of Testimony and Arguments." PhU. 
 
 Mag. (4). vol. 27, 1864. 
 Adbain, R. " Research concerning the Probabilities of the Errors which 
 happen in making Observations." The Analyst or Math. Museum, vol. 1, 
 pp. 93-109, 1808. 
 
 [This paper, which contains the first deduction of the normal law of 
 error, was partly reprinted by Abbe with historical notes in Amer. Joum. 
 Soi. vol. i. pp. 411-415, 1871.] 
 Ammon, O. " Some Social AppUoations of the Doctrine of Probability." 
 
 Joum. Pol. Econ. vol. 7, 1899. 
 AmpJire. Considerations sur la th^orie math6matique du jeu. Pp. 63. 4to. 
 
 Lyon, 1802. 
 ANciiiLON. " Doutes sur les bases du caloul des probabilit^s." Mem. Ac. 
 
 Berlin, pp. 3-32, 1794-5. 
 Abbtjthnot, J. Of the Laws of Chance, or a Method of Calculation of the 
 Hazards of Game plainly Demonstrated. 16mo. London, 1692. 
 [Contains a translation of Huygens, De ratiooiniis in ludo aleae.] 
 4th edition revised by John Hans. By whom is added a demonstration of 
 the gain of the banker in any circumstance of the game call'd Pharaon, etc. 
 Sm. 8vo. London, 1738. 
 
 [For a fuU account of this book and discussion of the authorship, see 
 Todhunter's Histoly, pp. 48-53.] 
 
 " An Argument for Divine Providence, taken from the constant Regular- 
 ity observ'd in the Births of both Sexes." Phil. Trans, vol. 27, pp. 186- 
 190 (1710-12). 
 
 [Ajgues that the excess of male births is so invariable, that we may con- 
 clude that it is not an even chance whether a male or female be bom.] 
 
 2f 
 
434 A TREATISE ON PROBABILITY 
 
 Abistotlb. Anal. Prior, ii. 27, 70* 3. 
 
 Bhet. i. 2, 1357 a 34. [See Zeller's Aristotle for further references.] 
 
 Arnauld. (The Port Royal Logic.) La Logique ou I'Art de penser. 12mo. 
 Paris, 1662. Another ed. C. Jourdain, Hachette, 1846. Transl. into 
 Eng. with introduction by T. S. Baynes. London, 1851. xlvii + 430. 
 See especially pp. 351-370. 
 
 Babbage, C. An Examination of some Questions connected with Games of 
 Chance. 4to. 25 pp. Trans. R. Soc. Edin., 1820. 
 Bachelier, Louis. Calcul des probabilités. Tome i. 4to. Pp. vii + 517. 
 Paris, 1912. 
 Le Jeu, la chance, et le hasard. Pp. 320. Paris, 1914. 
 [Bailey, Samuel.] Essays on the pursuit of truth, on the progress of 
 knowledge and on the fundamental principle of all evidence and expectation. 
 Pp. xii + 302. London, 1829. 
 Baldwin. Dictionary of Philosophy. Bibliographical volumes ; s.v. 
 "Probability." 
 Baniol, A. "Le Hasard." Revue Internationale de Sociologie. Pp. 16. 
 1912. 
 Barbeyrac. Traité du jeu. 1st ed. 1709. 2nd ed. 1744. 
 
 [Todhunter states (p. 196) that Barbeyrac is said to have published a 
 discourse "Sur la nature du sort."] 
 Bayes, Thomas. An Essay towards solving a Problem in the Doctrine of 
 Chances. Phil. Trans. vol. liii. pp. 370-418, 1763. A demonstration, 
 etc. Phil. Trans. vol. liv. pp. 296-325, 1764. 
 
 [Both the above were communicated by the Rev. Richard Price, and 
 the second is partly due to him.] 
 
 German transl. Versuch zur Lösung eines Problems der 
 Wahrscheinlichkeitsrechnung. Herausgegeben von H. E. Timerding. Sm. 8vo. 
 Leipzig, 1908. Pp. 57. 
 Béguelin. "Sur les suites ou séquences dans la loterie de Gênes." Hist. de 
 l'Acad. Pp. 231-280. Berlin, 1765. 
 
 "Sur l'usage du principe de la raison suffisante dans le calcul des 
 probabilités." Hist. de l'Acad. Pp. 382-412. Berlin, 1767. (Publ. 1769.) 
 
 Bellavitis. "Osservazioni sulla theoria delle probabilità." Atti del Instituto 
 Veneto di Scienze, Lettere, ed Arti, Venice, 1857. 
 Benard. "Note sur une question de probabilité." Journal de l'École 
 royale polytechnique. Vol. 15, Paris, 1855. 
 Bentham, J. Rationale of Judicial Evidence. 
 
 See Introductory View, chap, xii., and Bk. i. chaps, v., vi., vii. 
 Bernoulli, Daniel. "Specimen theoriae novae de mensura sortis." Comm. 
 Acad. Sci. Imp. Pet. vol. v. pp. 175-192, 1738. 
 
 Germ. transl. 1896, by A. Pringsheim : Die Grundlage der modernen 
 Wertlehre. Versuch einer neuen Theorie der Wertbestimmung von 
 Glücksfällen (Einleitung von Ludwig Fick). Pp. 60. Leipzig, 1896. 
 
 "Recueil des pièces qui ont remporté le prix de l'Académie Royale des 
 Sciences." 1734. iii. pp. 95-144. 
 
 [On "La cause physique de l'inclinaison des plans des orbites des planètes 
 par rapport au plan de l'équateur de la révolution du soleil autour de son 
 axe."] 
 
 "Essai d'une nouvelle analyse de la mortalité causée par la petite 
 vérole." Hist. de l'Acad. pp. 1-45. Paris, 1760. 
 
 De usu algorithmi infinitesimalis in arte conjectandi specimen. Novi 
 Comm. Petrop., 1766. xii. pp. 87-98. A 2nd memoir. Petrop., 1766. 
 xii. pp. 99-126. See a criticism by Trembley, Mém. de l'Acad., Berlin, 
 1799. 
 
 
 Disquisitiones analyticae de novo problemate conjecturali. Novi 
 Comm. Petrop. xiv. pp. 1-25, 1769. A 2nd memoir, Petrop. xiv. pp. 
 26-45, 1769. 
 
 "Dijudicatio maxime probabilis plurium observationum discrepantium 
 atque verisimillima inductio inde formanda." Acta Acad., pp. 3-23. 
 Petrop., 1777. Crit. by Euler, pp. 24-33. 
 Bernoulli, Jac. Ars conjectandi, opus posthumum. Pp. ii + 306 + 35. 
 Sm. 4to, Basileae, 1713. 
 
 [Published by N. Bernoulli eight years after Jac. Bernoulli's death.] 
 
 Part I. Reprint with notes and additions of Huygens, De ratiociniis in 
 ludo aleae. 
 
 Part II. Doctrina de permutationibus et combinationibus. 
 
 Part III. Explicans usum praecedentis doctrinae in variis sortitionibus 
 et ludis aleae. [Twenty-four problems.] 
 
 Part IV. Tradens usum et applicationem praecedentis doctrinae in 
 civilibus, moralibus et oeconomicis. 
 
 Tractatus de seriebus infinitis. [Not connected with the subject of 
 Probability.] 
 
 Lettre à un amy, sur les partis du jeu de paume. 
 
 [The most important sections, including Bernoulli's Theorem, are in 
 Part IV. For a very full account of the whole volume see Todhunter's 
 History, chap. vii.] 
 
 Engl. Transl. of Part II. only, vide Maseres. 
 
 Fr. transl. of Part I. only, vide Vastel. 
 
 Germ. transl. : Wahrscheinlichkeitsrechnung. 4 Teile mit dem Anhange : 
 Brief an einen Freund über das Ballspiel, übers. u. hrsg. v. R. Haussner. 
 2 vols. Sm. 8vo. 1899. 
 
 [See also Leibniz.] 
 Bernoulli, John. De alea, sive arte conjectandi, problemata quaedam. 
 Collected ed. vol. iv. pp. 28-33. 1742. 
 Bernoulli, John (grandson). "Sur les suites ou séquences dans la loterie de 
 Gênes." Hist. de l'Acad. pp. 234-253. Berlin, 1769. 
 
 "Mémoire sur un problème de la doctrine du hasard." Hist. de l'Acad., 
 pp. 384-408. Berlin, 1768. 
 Bernoulli, Nicholas. Specimina artis conjectandi, ad quaestiones juris 
 applicatae. Basel, 1709. Repr. Act. Erud. Suppl., pp. 159-170, 1711. 
 Bertrand, J. Calcul des probabilités. Pp. lvii + 332. Paris, 1889. 
 
 "Sur l'application du calcul des probabilités à la théorie des jugements." 
 Comptes rendus, 1887. 
 
 "Les Lois du hasard." Rev. des Deux Mondes, p. 758. Avril 1884. 
 Bessel. "Untersuchung über die Wahrscheinlichkeit der Beobachtungsfehler." 
 Astr. Nachrichten, vol. xv. pp. 369-404, 1838. 
 
 Also Abhandl. von Bessel, vol. ii. pp. 372-391. Leipzig, 1875. 
 Bicquilley, C. F. de. Du calcul des probabilités. 164 pp., 1783. 2nd ed. 
 1805. 
 
 Germ. transl. by C. F. Rüdiger. Leipzig, 1788. 
 Bienaymé, J. "Sur un principe que Poisson avait cru découvrir et qu'il avait 
 appelé loi des grands nombres." Comptes rendus de l'Acad. des Sciences 
 morales, 1855. 
 
 [Reprinted in Journal de la Soc. de Statistiques de Paris, pp. 199-204, 
 1876.] 
 
 "Probabilité de la constance des causes conclue des effets observés." 
 Procès-verbaux de la Soc. Philomathique, 1840. 
 
 "Sur la probabilité des résultats moyens des observations, etc." Sav. 
 Étrangers. v., 1838. 
 
 "Théorème sur la probabilité des résultats moyens des observations." 
 Procès-verbaux de la Soc. Philomathique, 1839. 
 
 "Considérations à l'appui de la découverte de Laplace sur la loi de 
 probabilité dans la méthode des moindres carrés." Comptes rendus des 
 séances de l'Académie des Sciences, vol. xxxvii., 1853. 
 
 [Reprinted in Journal de Liouville, 2nd series, vol. xii., 1867, pp. 158-176.] 
 
 "Remarques sur les différences qui distinguent l'interpolation de Cauchy 
 de la méthode des moindres carrés." Comptes rendus, 1853. 
 
 "Probabilité des erreurs dans la méthode des moindres carrés." Journ. 
 Liouville, vol. xvii., 1852. 
 Binet. "Recherches sur une question de probabilité" (Poisson's Theorem). 
 Comptes rendus, 1844. 
 Blaschke, E. Vorlesungen über mathematische Statistik. Pp. viii + 268. 
 Leipzig, 1906. 
 Bobek, K. J. Lehrbuch der Wahrscheinlichkeitsrechnung. Nach System 
 Kleyer. Pp. 296. Stuttgart, 1891. 
 Bohlmann, G. "Die Grundbegriffe der Wahrscheinlichkeitsrechnung in ihrer 
 Anwendung auf die Lebensversicherung." Atti del IV Congr. intern. 
 dei matematici, Rome, 1909. 
 Boole, G. Investigations of Laws of Thought on which are founded the 
 Mathematical Theories of Logic and Probabilities. Pp. ix + 424. London, 
 1854. 
 
 "Proposed Questions in the Theory of Probabilities." Cambridge and 
 Dublin Math. Journal, 1852. 
 
 "On the Theory of Probabilities, and in particular on Michell's Problem 
 of the Distribution of the Fixed Stars." Phil. Mag., 1851. 
 
 "On a General Method in the Theory of Probabilities." Phil. Mag., 
 1852. 
 
 "On the Solution of a Question in the Theory of Probabilities." Phil. 
 Mag., 1854. 
 
 "Reply to some Observations published by Mr. Wilbraham in the Phil. 
 Mag. vii. p. 465, on Boole's 'Laws of Thought.'" Phil. Mag., 1854. 
 
 "Further Observations in reply to Mr. Wilbraham." Phil. Mag., 1854. 
 
 "On the Conditions by which the Solutions of Questions in the Theory 
 of Probabilities are limited." Phil. Mag., 1854. 
 
 "On certain Propositions in Algebra connected with the Theory of 
 Probabilities." Phil. Mag., 1855. 
 
 "On the Application of the Theory of Probabilities to the Question of 
 the Combination of Testimonies or Judgments." Edin. Phil. Trans. vol. 
 xxi. pp. 597-652, 1857. 
 
 "On the Theory of Probabilities." Roy. Soc. Proc. vol. xii. pp. 179- 
 184, 1862-1863. 
 Borchardt, B. Einführung in die Wahrscheinlichkeitslehre. vi + 86. 
 Berlin, 1889. 
 Bordoni, A. Sulle probabilità. 4to. Giorn. dell' I. R. Instit. Lombardo di 
 Scienze. T. iv. Nuova Serie. Milano, 1852. 
 Borel, E. Éléments de la théorie des probabilités. 8vo, pp. vii + 191. 
 Paris, 1909. 2nd ed. 1910. 
 
 Le Hasard. Pp. iv + 312. Paris, 1914. 
 
 "Le Calcul des probabilités et la méthode des majorités." L'Année 
 psychologique, vol. 14, pp. 125-151. Paris, 1908. 
 
 "Les Probabilités dénombrables et leurs applications arithmétiques." 
 Rendiconti del Circolo matematico di Palermo, 1909. 
 
 "Le Calcul des probabilités et la mentalité individualiste." Revue du 
 Mois, vol. 6, pp. 641-650, 1908. 
 
 "La Valeur pratique du calcul des probabilités." Revue du Mois, vol. 
 1, pp. 424-437, 1906. 
 
 "Les Probabilités et M. le Dantec." Revue du Mois, vol. 12, pp. 77-91, 
 1911. 
 Bortkiewicz, L. von. Das Gesetz der kleinen Zahlen. 8vo, pp. viii + 52. 
 Leipzig, 1898. 
 
 "Anwendungen der Wahrscheinlichkeitsrechnung auf Statistik." 
 Encyklopädie der mathematischen Wissenschaften, Band 1, Heft 6. 
 
 "Wahrscheinlichkeitstheorie und Erfahrung." Zeitschrift für 
 Philosophie und philosophische Kritik, vol. 121, pp. 71-81. Leipzig, 1903. 
 
 [With reference to Marbe, Bromse, and Grimsehl, q.v.] 
 
 "Kritische Betrachtungen zur theoretischen Statistik." Jahrb. f. 
 Nationalök. u. Stat. (3), vol. 8, pp. 641-680, 1894 ; vol. 10, pp. 321-360, 
 1895 ; vol. 11, pp. 671-705, 1896. 
 
 "Die erkenntnistheoretischen Grundlagen der 
 Wahrscheinlichkeitsrechnung." Jahrb. f. Nationalök. u. Stat. (3), vol. 17, 
 pp. 230-244, 1899. 
 
 [Criticised by Stumpf, q.v., who is answered by Bortkiewicz, loc. cit. 
 vol. 18, pp. 239-242, 1899.] 
 
 "Zur Verteidigung des Gesetzes der kleinen Zahlen." Jahrb. f. 
 Nationalök. u. Stat. (3), vol. 39, pp. 218-236, 1910. 
 
 [The literature of this topic is not fully dealt with in this Bibliography, 
 but very full references to it will be found in the above article.] 
 
 "Über den Präzisionsgrad des Divergenzkoeffizienten." Mitteil. des 
 Verbandes der österr. und ungar. Versicherungstechniker, vol. 5. 
 
 "Realismus und Formalismus in der mathematischen Statistik." Allg. 
 Stat. Archiv, vol. ix. pp. 225-256. Munich, 1915. 
 
 Die Iterationen : ein Beitrag zur Wahrscheinlichkeitstheorie. Pp. 
 xii + 205. Berlin, 1917. 
 
 Die radioaktive Strahlung als Gegenstand 
 wahrscheinlichkeitstheoretischer Untersuchungen. Pp. 84. Berlin, 1913. 
 
 "Wahrscheinlichkeitstheoretische Untersuchungen über die Knabenquote 
 bei Zwillingsgeburten." Sitzungsber. der Berliner Math. Ges., vol. 
 xvii. pp. 8-14, 1918. 
 
 Homogeneität und Stabilität in der Statistik. Pp. 81. (Extracted from 
 the Skandinavisk Aktuarietidskrift.) Uppsala, 1918. 
 Bostwick, A. E. "The Theory of Probabilities." Science, iii., 1896, 
 p. 66. 
 Boutroux, Pierre. "Les Origines du calcul des probabilités." Revue du 
 Mois, vol. 5, pp. 641-654, 1908. 
 Bowley, A. L. Elements of Statistics. Pp. xi + 459. 4th ed. London, 
 1920. 
 Bradley, F. H. The Principles of Logic. Bk. i. chap. 8, §§ 32-63, pp. 
 201-20. London, 1883. 
 Bravais. "Analyse mathématique sur les probabilités des erreurs de 
 situation d'un point." Mém. Sav. vol. 9, pp. 255-332, Paris, 1846. 
 Brendel. Wahrscheinlichkeitsrechnung mit Einschluss der Anwendungen. 
 Göttingen, 1907. 
 Broad, C. D. "The Relation between Induction and Probability." Mind, 
 vol. xxvii. (1918). Pp. 389-404, and vol. xxix. (1920) pp. 11-45. 
 Bromse, H. Untersuchungen zur Wahrscheinlichkeitslehre. (Mit besonderer 
 Beziehung auf Marbes Schrift (q.v.).) 
 
 Zeitschrift für Philosophie und philosophische Kritik. Band 118. 
 Leipzig, 1901. Pp. 145-153. 
 
 (See also Marbe, Grimsehl, and v. Bortkiewicz.) 
 
 
 Brunn, Dr. Hermann. "Über ein Paradoxon der Wahrscheinlichkeitsrechnung." 
 Sitzungsberichte der philos.-philol. Klasse der K. bayrische 
 Akademie, pp. 692-712, 1892. 
 
 Bruns, H. Wahrscheinlichkeitsrechnung und Kollektivmasslehre. 8vo. Pp. 
 viii + 310 + 18. Leipzig, 1906. 
 
 "Das Gruppenschema für zufällige Ereignisse." Abhandl. d. Leipz. 
 Ges. d. Wissensch. vol. xxix. pp. 579-628, 1906. 
 
 Bryant, Sophie. "On the Failure of the Attempt to deduce inductive 
 Principles from the Mathematical Theory of Probabilities." Phil. Mag. S. 5, 
 No. 109, Suppl. vol. 17. 
 
 Buffon. "Essai d'arithmétique morale." Supplément à l'Histoire Naturelle, 
 vol. 4, 103 pp. 4to. 1777. Hist. Ac. Par. pp. 43-45, 1733. 
 
 Bunyakovski. Osnovaniya, etc. (Principles of the Mathematical Theory of 
 Probabilities.) Petersburg, 1846. 
 
 Burbury, S. H. "On the Law of Probability for a System of correlated 
 variables." Phil. Mag. (6), vol. 17, pp. 1-28, 1909. 
 
 Campbell, R. "On a Test for ascertaining whether an observed Degree of 
 Uniformity, or the reverse, in tables of Statistics is to be looked upon as 
 remarkable." Phil. Mag., 1859. 
 
 "On the Stability of Results based upon average Calculations." Journ. 
 Inst. Act. vol. 9, p. 216. 
 
 A popular Introduction to the Theory of Probabilities. Pp. 16, 
 Edinburgh, 1865. 
 Cantelli, F. P. "Sulla applicazione delle probabilità parziali alla statistica." 
 Giornale di Matematica finanziaria, vol. i. (1919), pp. 30-44. 
 Cantor, G. Historische Notizen über die Wahrscheinlichkeitsrechnung. 4to. 
 8 pp. Halle, 1874. 
 Cantor, M. Politische Arithmetik oder die Arithmetik des täglichen Lebens. 
 Pp. x + 155. Leipzig, 1898, 2nd ed. 1903. 
 Canz, E. C. Tractatio synoptica de probabilitate juridica sive de praesumtione. 
 4to. Tübingen, 1751. 
 Caramuel, John. Kybeia, quae combinatoriae genus est, de alea, et ludis 
 fortunae serio disputans. 1670. [Includes a reprint of Huygens, which 
 is attributed to Longomontanus.] 
 Cardan. De ludo aleae. fo., 15 pp. 1663. [Cardan ob. 1576.] 
 Carvello, E. Le Calcul des probabilités et ses applications. 8vo. Pp. ix + 
 169. Paris, 1912. 
 Castelnuovo, Guido. Calcolo delle probabilità. Large 8vo. Pp. xxiii + 
 373. Rome, 1919. 
 Catalan, E. "Solution d'un problème de probabilité, relatif au jeu de 
 rencontre." Journ. Liouville, vol. ii., 1837. 
 
 "Deux problèmes de probabilités." Journ. Liouville, vol. vi. 
 Problèmes et théorèmes de probabilités. 4to. 1884. 
 Cauchy. Sur le système de valeurs qu'il faut attribuer à divers éléments 
 déterminés par un grand nombre d'observations. 4to. Paris, 1814. 
 Cayley, A. "On a Question in the Theory of Probabilities." Phil. Mag., 1853. 
 Cesàro, E. "Considerazioni sul concetto di probabilità." Periodico di 
 Matematica, vi., 1891. 
 Charlier, C. V. L. Researches into the Theory of Probability. Publ. in 
 Engl. in Meddelanden from Lund's Astronom. Observatorium, Series ii., 
 No. 24. 4to. 51 pp. Lund, 1906. 
 
 "Contributions to the Mathematical Theory of Statistics," Arkiv for 
 matematik, astronomi och fysik, vols. 7, 8, 9, passim. 
 
 Vorlesungen über die Grundzüge der mathematischen Statistik. Sm. 
 4to. Pp. 125. Lund, 1920. 
 
 
 Charpentier, T. V. "Sur la nécessité d'instituer la logique du probable." 
 Comptes rendus de l'Acad. des Sciences morales, vol. i. p. 103, 1875. 
 
 "La Logique du probable." Rev. phil. vol. vi. pp. 23-38, 146-163, 1878. 
 Chrystal, G. On some Fundamental Principles in the Theory of Probability. 
 London, 1891. 
 Clark, Samuel. The Laws of Chance : or a Mathematical Investigation of 
 the Probability arising from any proposed Circumstance of Play, etc. 
 Pp. ii + 204, 1758. 
 Cohen, J. Chance : A Comparison of Facts with the Theory of Probabilities. 
 
 Pp. 47. London, 1905. 
 Condorcet, Marquis de. Essai sur l'application de l'analyse à la probabilité 
 des décisions rendues à la pluralité des voix. 4to. Pp. cxci + 304. 
 Paris, 1785. Another edition, 1804. 
 
 "Sur les événements futurs." Acad. des Sc., 1803. 
 Memoir on Probabilities in six parts : 
 
 1. "Réflexions sur la règle générale qui prescrit de prendre pour valeur 
 d'un événement incertain la probabilité de cet événement, multipliée par 
 la valeur de l'événement en lui-même." Hist. de l'Acad. pp. 707-728. 
 Paris, 1781. 
 
 2. "Application de l'analyse à cette question : Déterminer la 
 probabilité qu'un arrangement régulier est l'effet d'une intention de le 
 produire." Hist. de l'Acad., Paris, 1781. With Part i. 
 
 3. Sur l'évaluation des droits éventuels. 1782, pp. 674-691. 
 
 4. Réflexions sur la méthode de déterminer la probabilité des 
 événements futurs, d'après l'observation des événements passés. 1783, pp. 
 539-559. 
 
 5. Sur la probabilité des faits extraordinaires. 1783, with Part 4. 
 
 6. Application des principes de l'article précédent à quelques questions 
 de critique. 1784, pp. 454-468. 
 
 CoovER, J. Experiments in Psychical Research at Leland Stanford Junior 
 University. Pp. 641. Stanford University, California, 1917. 
 
 [See Psychical Research and Statistical Method by F. Y. Edgeworth, 
 Stat. Jl., vol. lxxxii. (1919), p. 222.] 
 Corbaux, F. Essais métaphysiques et mathématiques sur le hasard. 8vo. 
 Paris, 1812. 
 Costa. Probabilité du tir. 8vo. Paris, 1825. 
 
 "Question de probabilité applicable aux décisions rendues par les 
 jurés." Liouv. J. (1), vii., 1842. 
 Courcy, Alph. de. Essai sur les lois du hasard suivi d'étendus sur les 
 assurances. 8vo. Paris, 1862. 
 Cournot, A. Revue de Métaphysique et de Morale, May 1905. Numéro 
 spécialement consacré à Cournot. See especially : 
 
 F. Faure : "Les Idées de Cournot sur la statistique," pp. 395-411. 
 D. Parodi : "Le Criticisme de Cournot," pp. 451-484. 
 F. Mentré : "Les Racines historiques du probabilisme rationnel de 
 Cournot," pp. 485-508. 
 Art. "Probabilités." Dictionnaire de Franck. 
 
 "Sur la probabilité des jugements et la statistique." Journal de 
 Liouville, t. iii. p. 257. 
 
 "Mémoire sur les applications du calcul des chances à la statistique 
 judiciaire." Liouv. J. (1) iii., 1838. 
 
 Exposition de la théorie des chances et des probabilités. Pp. viii + 448. 
 Paris, 1843. 
 
 German translation by C. H. Schnuse. 8vo. Braunschweig, 1849. 
 Couturat, L. La Logique de Leibniz d'après des documents inédits. Pp. 
 xiv. + 608. Paris, 1901. 
 
 
 [See especially chap. vi. for references to Leibniz's views on Probability.] 
 
 Opuscules et fragments inédits de Leibniz. Paris, 1903. 
 Craig. Theologiae Christianae principia mathematica. 4to. London, 1699. 
 Reprinted Leipzig, 1755. 
 [Craig (?).] "A Calculation of the Credibility of Human Testimony." Phil. 
 Trans. vol. xxi. pp. 359-365, 1699. 
 
 [Also attributed to Halley.] 
 Crakanthorpe, R. Logica. 1st ed. London, 1622. 2nd ed. London, 1641 
 (auctior et emendatior). 3rd ed. Oxon., 1677. 
 
 [Book V. "De syllogismo probabili."] 
 Crofton, M. W. "On the Theory of Local Probability, applied to Straight 
 Lines drawn at random in a Plane." Phil. Trans. vol. 158, pp. 181-199, 
 1869. 
 
 [Summarised in Proc. Lond. Math. Soc. vol. 2, pp. 55-57, 1868.] 
 
 "Probability." Encycl. Brit. 9th ed., 1885. 
 
 "Geometrical Theorems relating to Mean Values." Proc. Lond. Math. 
 Soc. vol. 8, pp. 304-309, 1877. 
 Czuber, E. Zum Gesetz der grossen Zahlen. Prag, 1889. 
 
 Geometrische Wahrscheinlichkeiten und Mittelwerte. Pp. vii + 244. 
 Leipzig, 1884. 
 
 Theorie der Beobachtungsfehler. Pp. xiv + 418. Leipzig, 1891. 
 
 Die Entwicklung der Wahrscheinlichkeitstheorie und ihrer Anwendungen. 
 Pp. viii + 279. Leipzig, 1899. 
 
 Wahrscheinlichkeitsrechnung und ihre Anwendung auf 
 Fehlerausgleichung, Statistik und Lebensversicherung. Leipzig, 1903. 
 
 Ditto. 2 vols. 8vo. x + 410 + x + 470. Leipzig, 1908-10. Second 
 edition, revised and enlarged. Vol. i. Wahrscheinlichkeitstheorie, 
 Fehlerausgleichung, Kollektivmasslehre, 1908. Vol. ii. Mathematische 
 Statistik, mathematische Grundlagen der Lebensversicherung, 1910. 
 
 D'Alembert. Opuscules mathématiques : Paris, 1761-1780. 
 
 [Réflexions sur le calcul des probabilités, ii. pp. 1-25, 1761. 
 Sur l'application du c. des p. à l'inoculation, ii. pp. 26-95. 
 Sur le calcul des probabilités, etc., iv. pp. 73-105 ; iv. pp. 283-341 ; 
 v. pp. 228-231 ; v. pp. 508-510 ; vii. pp. 39-60.] 
 Mélanges de littérature, d'histoire et de philosophie, Amsterdam, 1770. 
 [Doutes et questions sur le calcul des probabilités, vol. v. pp. 223-246. 
 Réflexions sur l'inoculation. Vol. v. (These two papers were reprinted 
 in the first volume of D'Alembert's collected works published at Paris in 
 1821 (pp. 451-514).)] 
 Articles in Encyclopédie ou Dictionnaire raisonné : 
 "Croix ou Pile," 1754. 
 "Gageure," 1757. 
 
 Article in Encyclopédie méthodique : "Cartes." 
 D'Anières. "Réflexions sur les jeux de hasard." Mém. de l'Acad. pp. 391- 
 398. Berlin, 1784. 
 Dantec, Félix Le. "Le Hasard et la question d'échelle." Revue du Mois, 
 vol. 4, pp. 257-288, 1907. 
 
 Le Chaos et l'harmonie universelle. Paris, 1911. 
 Darbishire, A. D. Some Talks illustrating Statistical Correlation. 
 (Reprinted from Memoirs of the Manchester Literary and Philosophical Society.) 
 21 pp. and plates. 8vo. 1907. 
 Darbon, A. Le Concept du hasard dans la philosophie de Cournot. Étude 
 critique. Pp. 60. Paris, 1911. 
 Davenport, C. B. Statistical Methods. 1904. 
 
 
 De Moivre, A. "De mensura sortis, seu, de probabilitate eventuum in ludis 
 a casu fortuito pendentibus." Phil. Trans. vol. xxvii. pp. 213-264, 1711. 
 
 Doctrine of Chances, or A Method of Calculating the Probabilities of 
 Events in Play. 1st ed. 4to. Pp. xiv + 175. 1718. 2nd ed. Large 
 4to. Pp. xiv + 258. 1738. 3rd ed. Large 4to. Pp. xii + 348. 1756. 
 
 La dottrina d. azzardi applic. ai problemi d. probabilità di vita, di pensi, 
 ecc., trad. da R. Gaeta e G. Fontana. Milan, 1776. 
 
 Miscellanea analytica de seriebus et quadraturis. 4to. Pp. 250 + 22. 
 London, 1730. 
 De Morgan, A. Essay on Probabilities and their Application to Life 
 Contingencies and Insurance Offices. 1838. 
 
 Formal Logic : or the Calculus of Inference Necessary and Probable. 
 1847. 
 Theory of Probabilities. 4to. 1849. 
 [From the Encyclopaedia Metropolitana.] 
 
 On the Structure of the Syllogism and on the Application of the Theory 
 of Probabilities to Questions of Argument and Authority. 4to. Camb. 
 Phil. Soc. pp. 393-405, 1847 (read Nov. 9, 1846). 
 
 On the Symbols of Logic, the Theory of the Syllogism, and in particular 
 of the Copula, and the Application of the Theory of Probabilities to some 
 Questions of Evidence. 4to. Camb. Phil. Soc. vol. ix. pp. 116-125, 1851. 
 
 De Witt, John. De vardye van de lif-renten na proportie van de los-renten. 
 
 La Haye, 1671. 
 
 English transl. : Contributions to the History of Insurance, by Frederick 
 Hendriks in the Assurance Magazine, vol. 2, p. 231 (1852). 
 
 [For an abstract see N. Struyck, Inleiding tot het algemeine geography, 
 etc. 4to. Amsterdam, 1740. P. 345.] 
 Dedekind, R. Bemerkungen zu einer Aufgabe der 
 Wahrscheinlichkeitsrechnung. Pp. 268-271. Crelle J. vol. l., 1855. 
 Degen, C. F. Tabularum ad faciliorem probabilitatis computationem utilem 
 Enneas. Kiobenhavn, 1824. 
 Diderot. Art. "Probabilité" in the Encyclopédie. 
 
 Didion, J. Calcul des probabilités appliqué au tir des projectiles. 8vo. 1858. 
 Dodson, James. Mathematical Repository. 3 vols. 1753. Vol. ii. pp. 
 82-136. 
 Donkin, W. F. "Sur la théorie de la combinaison des observations." Liouv. 
 J. (1), vol. xv. 1850. 
 
 "On Certain Questions relating to the Theory of Probabilities." Phil. 
 Mag., May 1851. 
 Dormoy, E. Théorie mathématique des assurances sur la vie. 2 vols. Paris, 
 1878. 
 Drobisch, A. "Über die nach der Wahrscheinlichkeitsrechnung zu erwartende 
 Dauer der Ehen." Berichte über die Verhandlungen der Königl. 
 Sächsischen Gesellschaft der Wissenschaften mathem.-physik. 1880. 
 Drobisch, M. W. Neue Darstellung der Logik. 2nd ed. Leipzig, 1851. 3rd 
 ed. 1863. 4th ed. 1875. 5th ed. 1887. 
 
 [Probability, pp. 181-209, §§ 145-157 (references to 4th ed.).] 
 
 Edgeworth, F. Y. "Calculus of Probability applied to Psychical Research." 
 Proceedings of Soc. for Psych. Res. Parts viii. and x. 
 
 "On the Method of ascertaining a Change in the Value of Gold." Roy. 
 Stat. Soc. J. xlvi. pp. 714-718. 1883. 
 
 " Law of Error." Phil. Mag. (5) vol. xvi. pp. 300-309, 1883. 
 
 " Method of least Squares." Phil. Mag. (5) vol. xvi. pp. 360-375, 1883. 
 
 "Physical Basis of Probability." Phil. Mag. vol. xvi. pp. 433-435, 
 1883. 
 
 
 " Chance and Law." Hermathene (Dublin), 1884. 
 
 " On the Reduction of Observations." Phil. Mag. (5) vol. xvii. pp. 
 135-141, 1884. 
 
 "Philosophy of Chance." Mind, April 1884. 
 
 "A priori Probabilities." Phil. Mag. (5) vol. xviii. pp. 209-210, 1884. 
 
 "On Methods of Statistics." Stat. Journ. Jub. vol. pp. 181-217, 1885. 
 
 [Criticised by Bortkiewicz and defended by Edgeworth, Jahrb. f. 
 Nat. Ök. u. Stat. (3), vol. 10, pp. 343-347 ; vol. 11, pp. 274-277, 701-705, 
 1896.] 
 
 "Observations and Statistics." Phil. Soc. 1885. 
 
 "Law of Error and Elimination of Chance." Phil. Mag., 1886, vol. 
 xxi. pp. 308-324. 
 
 "Problems in Probabilities." Phil. Mag., 1886, vol. xxii. pp. 371-384, 
 and 1890, vol. xxx. pp. 171-188. 
 
 Metretike : or the Method of Measuring Probability and Utility. 8vo. 
 1887. 
 
 "On Discordant Observations." Phil. Mag. (5) vol. xxiii. pp. 1887. 
 
 "The Empirical Proof of the Law of Error." Phil. Mag. (5) vol. xxiv. 
 pp. 330-342, 1887. 
 
 "The Element of Chance in Competitive Examinations." Roy. Stat. 
 Soc. Journ. liii. pp. 460-475 and 644-663, 1890. 
 
 "The Law of Error and Correlated Averages." Phil. Mag. (5) vol. 
 xxxv. pp. 63-64, 1893. 
 
 "Statistical Correlation between Social Phenomena." Roy. Stat. Soc. 
 Journ. lvi. pp. 670-675, 1893. 
 
 "The Asymmetrical Probability-Curve." 1896. Phil. Mag. vol. xli. 
 pp. 90-99. 
 
 "Miscellaneous Applications of the Calculus of Probabilities." Roy. 
 Stat. Soc. Journ. lx. pp. 681-698, 1897 ; lxi. pp. 119-131 and 534-544, 
 1898. 
 
 "Law of Error." Phil. Trans. vol. xx. 
 
 "The Generalised Law of Error." Stat. Journ. vol. lxix., 1906. 
 
 "On the Probable Errors of Frequency-Constants." Stat. Journ. vol. 
 lxxi. pp. 381-397, 499-512, 651-678, 1908 ; and vol. lxxii. pp. 81-90, 1909. 
 
 "On the Application of the Calculus of Probabilities to Statistics." 
 Bulletin xviii. of the International Statistical Institute, Paris, 1910, 32 pp. 
 
 "Applications of Probabilities to Economics." Economic Journal, vol. 
 xx. pp. 284-304, 441-465, 1910. 
 
 "Probability." Encyclopaedia Britannica, 11th ed. vol. 22, pp. 376- 
 403, 1911. 
 
 "On the Application of Probabilities to the Movement of Gas-Molecules." 
 Phil. Mag., vol. xl., pp. 249-272, 1920. 
 
 "Molecular Statistics." Roy. Stat. Soc. Journ., vol. lxxxiv. pp. 71-89, 
 1921. 
 
 Eggenberger, J. "Beiträge zur Darstellung des bernoullischen Theorems." 
 Berner Mitth. vol. 50 (1894) ; and Zeitschr. f. Math. u. Ph. 45 (1900), p. 43. 
 
 Elderton, W. P. Frequency-Curves and Correlation. 8vo. London, 1907. 
 xiii + 172. 
 
 [Contains a useful list of papers on Correlation, p. 163.] 
 Ellis, R. L. "On the Foundations of the Theory of Probability." 4to. 
 Camb. Phil. Soc. vol. viii., 1843. 
 
 [Reprinted in "Mathematical and other Writings," 1863.] 
 
 "On a Question in the Theory of Probabilities." Camb. Math. Journ. 
 No. xxi. vol. iv., 1844. 
 
 [Reprinted in "Mathematical and other Writings," 1863.] 
 
 "On the Method of Least Squares." Trans. Camb. Phil. Soc. vol. viii., 
 1844. 
 
 [Reprinted in "Mathematical and other Writings," 1863.] 
 "Remarks on an alleged Proof of the 'Method of Least Squares.'" 
 Phil. Mag. (3) vol. xxxvii., 1850. 
 
 [Reprinted in "Mathematical and other Writings," 1863.] 
 "Remarks on the Fundamental Principle of the Theory of Probabilities." 
 Trans. Camb. Phil. Soc. vol. ix., 1854. 
 
 [Reprinted in "Mathematical and other Writings," 1863.] 
 Elsas, A. "Kritische Betrachtungen über die Wahrscheinlichkeitsrechnung." 
 Philos. Monatssch. vol. xxv. pp. 557-584, 1889. 
 Emerson, William. Miscellanies, 1776. [See espec. pp. 1-48.] 
 Encke, J. F. Methode der kleinsten Quadrate. Fehler theoret. 
 Untersuchungen. Berlin, 1888. 
 Engel, G. "Über Möglichkeit und Wirklichkeit." Philos. Monatssch. vol. 
 v. pp. 241-271, 1875. 
 Ermakoff, W. P. Wahrscheinlichkeitslehre (in Russian). 
 Euler. "Calcul de la probabilité dans le jeu de rencontre." Hist. Ac. Berl. 
 (1751), pp. 255-270, 1753. 
 
 "Sur l'avantage du banquier au jeu de pharaon." Hist. Ac. Berl. 
 (1764), pp. 144-164, 1766. 
 
 "Sur la probabilité des séquences dans la loterie génoise." Hist. Ac. 
 Berl. (1765), pp. 191-230, 1767. 
 
 "Solution d'une question très difficile dans le calcul des probabilités." 
 Hist. Ac. Berl. (1769), pp. 285-302, 1771. 
 
 "Solutio quarundam quaestionum difficiliorum in calculo probabilium." 
 Opuscula analytica, vol. ii. pp. 331-346, 1785. 
 
 "Solutio quaestionis ad calculum probabilitatis pertinentis : Quantum 
 duo conjuges persolvere debeant, ut suis haeredibus post utriusque mortem 
 certa argenti summa persolvatur." Opuscula analytica, vol. ii., pp. 315- 
 330, 1785. 
 
 "Wahrscheinlichkeitsrechnung." Opera omnia, ser. 1, A, vol. iv. 
 Leipzig. 
 
 Fahlbeck. "La Régularité dans les choses humaines, ou les types statistiques 
 et leurs variations." Journ. Soc. Stat. de Paris, pp. 188-200, 1900. 
 Fechner, G. Th. Kollektivmasslehre. (Edited by G. F. Lipps.) 1897. 
 Fick, A. Philosophischer Versuch über die Wahrscheinlichkeiten. Pp. 46. 
 Würzburg, 1883. 
 Fisher, A. The Mathematical Theory of Probabilities. Translated from the 
 Danish. Pp. xx + 171. New York, 1915. 
 Forbes, J. D. "On the alleged Evidence for a Physical Connexion between 
 Stars forming Binary or Multiple Groups, deduced from the Doctrine of 
 Chances." Phil. Mag., Dec. 1850. (See also Phil. Mag., Aug. 1849.) 
 Formey. The Logic of Probabilities. Transl. from the French. 8vo. 
 London, n.d. (? 1760.) 
 Förster, W. Wahrheit und Wahrscheinlichkeit. Pp. 40. Berlin, 1875. 
 Fries, J. J. Versuch einer Kritik der Principien der 
 Wahrscheinlichkeitsrechnung. Braunschweig, 1842. 
 Frömmichen. Über Lehre der Wahrscheinlichkeit. 4to. Braunschweig, 
 1773. 
 Fuss, N. "Recherches sur un problème du calcul des probabilités." Act. Ac. 
 Petr. (1779), pars posterior, pp. 81-92, 1783. 
 
 "Supplément au mémoire sur un problème du calcul des probabilités." 
 Act. Ac. Petr. (1780), pars posterior, pp. 91-96, 1784. 
 
 
 Galileo, 6. " Considerazioni sopra il giuoci dei dadi." Opere, vol. iii. pp. 
 U9-121, 1718. Also, Opera, vol. xiv. pp. 293-296. Firenze, 1855. 
 
 " Letters intomo le stima di un eavaUo." Opere, vol. xiv. pp. 231-284. 
 Krenze, 1856. 
 Galloway, T. A Treatise on Probability. 8vo. Edinburgh, 1839. (Prom 
 
 the 7th edition of the Encyclopaedia Britannioa.) 
 Galton, p. " Correlations and their Measurement." Proc. Roy. Soc, vol. 
 xiv. pp. 136-145. 
 
 Probability, the Foundation of Eugenics. Herbert Spencer Lecture,
 1907. (Reprinted in Essays in Eugenics. 8vo. ii + 109 pp. London, 1909.)
 Gardon, C. Antipathies des 90 nombres, probabilités, et observations
 comparatives, sur les loteries de France et de Bruxelles. 8vo. Paris, 1801.
 Traité élémentaire des probabilités, etc. Paris, 1805.
 L'investigateur des chances . . . pour obtenir souvent des succès aux
 loteries impériales de France. Paris.
 Garve, C. De nonnullis quae pertinent ad logicam probabilium. 4to. Halae,
 1766.
 Gataker, T. On the Nature and Use of Lots. 4to. 1619.
 Gauss, C. F. Theoria motus corporum coelestium. 4to. Hamburg, 1809.

 "Theoria combinationis observationum erroribus minimis obnoxiae."
 Comm. Soc. Göttingen, vol. v. pp. 33-90. 1823.

 Méthode des moindres carrés. Traduit en français par J. Bertrand.
 8vo. 1855.

 [A translation of part of the above.]

 Wahrscheinlichkeitsrechnung. Werke, vol. iv. pp. 1-53. 4to. Göttingen,
 1873.
 Geisenheimer, L. Über Wahrscheinlichkeitsrechnung. 8vo. Berlin, 1880.
 Gilman, B. I. "Operations in Relative Number with Applications to Theory of
 Probability." Johns Hopkins Studies in Logic, 1883.
 Gladstone, W. E. "Probability as a Guide to Conduct." Nineteenth Cent.
 vol. v. pp. 908-934, 1879 ; and in "Gleanings," vol. ii. pp. 153-200.
 Glaisher, J. W. L. " On the Rejection of Discordant Observations." Monthly 
 Notices R. Astr. S. vol. xxiii., 1873. 
 
 " On the Law of Facility of Errors of Observation, and on the Method 
 of Least Squares." Mem. R. Astr. S. vol. xxxix., 1872. 
 Goldschmidt, L. "Wahrscheinlichkeit und Versicherung." Bull. du Comité
 permanent des Congrès Internationaux d'Actuaires, 1897.

 Die Wahrscheinlichkeitsrechnung : Versuch einer Kritik. Pp. 279.
 Hamb., 1897.

 [Cf. Zeitschr. f. Philos. u. phil. Kr., cxiv., pp. 116-119.]
 Gonzalez, T. Fundamentum theologiae moralis, id est tractatus theologicus
 de recto usu opinionum probabilium. 4to. Dillingen, 1689. Naples, 1694.
 [An abridgement entitled : Synopsis tract. theol. de recto usu opin.
 prob., concinnata a theologo quodam Soc. Jesu : cui accessit logistica
 probabilitatum. 3rd ed. 8vo. Venice, 1696. See Migne, Theol. Cur.
 Compl., vol. xi., p. 1397.]
 Gouraud, Ch. Histoire du calcul des probabilités depuis ses origines jusqu'à
 nos jours. 8vo. Paris, 1848, 148 pp.

 [His history seems to be a portion of a very extensive essay in 3 folio
 volumes containing 1929 pp., written when he was very young, in
 competition for a prize proposed by the Fr. Acad. on a subject entitled "Théorie
 de la certitude" ; see Séances et Travaux de l'Académie des Sciences
 morales et politiques, vol. x. pp. 372, 382, vol. xi. p. 137. See Todhunter.]
 
 Gravesande, W. J. 's. Introductio ad philosophiam, metaphysicam et logicam
 continens. 8vo. Venetiis, 1737.
 
 Œuvres philosophiques et mathématiques. 4to. Amsterdam, 1774,
 2 vols. 4to. ii. pp. 82-93, 221-248.
 Grelling, K. "Die philosophischen Grundlagen der Wahrscheinlichkeitsrechnung."
 Abhandlungen der Fries'schen Schule, N.F., vol. iii., 1910.
 Grimsehl, E. "Untersuchungen zur Wahrscheinlichkeitslehre. (Mit besonderer
 Beziehung auf Marbes Schrift (q.v.).)" Zeitschrift für Philosophie und
 philosophische Kritik. Band 118, pp. 154-167. Leipzig, 1901.
 [See also Brömse, Marbe, and v. Bortkiewicz.]
 Grolous. "Sur une question de probabilité appliquée à la théorie des
 nombres." Journal de l'Institut, 1872.
 Groschius, J. A. Logica probabilium in artium practicarum subsidium
 adornata. Sm. 8vo. Halae, 1764. Pp. xvi + 352.
 Grünbaum, H. Isolierte und reine Gruppen und die Marbesche Zahl "p."
 Würzburg, 1904.
 Guibert, A. "Solution d'une question relative à la probabilité des jugements
 rendus à une majorité quelconque." Liouv. J. (1) vol. iii., 1838.
 
 Hack. Wahrscheinlichkeitsrechnung. Leipzig, 1911.

 Hagen, G. F. Meditationes philosophicae de methodo mathematico.
 Norimbergae, 1734.
 Fortsetzung einiger aus der Mathematic abgenommenen Regeln, nach
 welchen sich der menschliche Verstand bei Erfindung der Wahrheiten
 richtet. Halle, 1737.
 Hagen, G. Grundzüge der Wahrscheinlichkeitsrechnung. Berlin, 1837.
 (2nd ed. 1867, 3rd ed. 1882.)

 Der constante wahrscheinliche Fehler : Nachtrag zur 3ten Auflage der
 Grundzüge der Wahrscheinlichkeitsrechnung. 38 pp. Berlin, 1884.
 Halley. See Craig.
 Ham, John. See J. Arbuthnot.
 Hausdorff, F. "Beiträge zur Wahrscheinlichkeitsrechnung." Leipz. Ber.,
 vol. 53, pp. 152-178, 1901.

 "Das Risiko bei Zufallsspielen." Leipz. Ber., vol. 49, pp. 497-548, 1897.
 Hansen, P. A. "Über die Anwendung der Wahrscheinlichkeitsrechnung auf
 geodätische Vermessungen." Astr. N. vol. ix. 1831.
 Hartmann, E. von. "Die Grundlage der Wahrscheinlichkeitsurteils."
 Vierteljahrsschr. f. wiss. Phil. u. Soz., vol. xxviii., 1904.
 Hauteserve, Gauthier d'. Traité élémentaire sur les probabilités. Paris,
 1834.
 Application de l'algèbre élémentaire au calcul des probabilités. Paris,
 1840.
 Hélie. Mémoire sur la probabilité du tir. 8vo. 1854.
 Helm. "Eine Anwendung der Theorie des Tauschwerthes auf die
 Wahrscheinlichkeitsrechnung." Zeitschr. f. Math. u. Phys., vol. 38, pp. 374-376.
 Leipzig, 1893.

 "Die Wahrscheinlichkeitslehre als Theorie der Kollektiv-begriffe."
 Annalen der Naturphilosophie, vol. 1.
 Henry, Charles. La Loi des petits nombres. Recherches sur le sens de
 l'écart probable dans les chances simples à la roulette, au trente-et-quarante
 etc., en général dans les phénomènes dépendant de causes purement
 accidentelles. 72 pp. 8vo. Paris, 1908.
 Herschel, W. "On the Theory of Probabilities." Journal of Actuaries,
 1869.

 "Quetelet on Probabilities." Edin. Rev., 1860.
 [Reprinted in Quetelet's Physique Sociale, vol. i. pp. 1-89, 1869.]
 "On an Application of the Rule of Succession." Edin. Rev., 1850.
 
 Herz, N. Wahrscheinlichkeits- und Ausgleichungsrechnung. Pp. iv + 381.
 Leipzig, 1900.
 Hibben, J. G. Inductive Logic. London, 1896.
 [See chaps. xv., xvi.]
 Hobhouse, L. T. Theory of Knowledge.
 [See Part II., chaps. x., xi.]
 HoYLE. An Essay towards making the Doctrine of Chances easy to those who 
 
 understand vulgar Arithmetic only. Pp. viii + 73, 1754, 1758, 1764. 
 Huberdt, A. Die Principien der Wahrscheinlichkeitsrechnung. 4to. Berlin,
 1845.
 Hume, David. Treatise on Human Nature. 1st ed. 1739.
 [See especially Part III.] 
 
 An Enquiry concerning Human Understanding. 
 [See specially Section vi.] 
 
 Essays, Part I., XIV. On the Rise and Progress of the Arts and Sciences,
 pp. 115, 116, 1742.
 Huygens, Ch. "De ratiociniis in ludo aleae." Schooten's Exercitat. math.
 pp. 519-534. 4to. Lugd. Bat., 1657.

 [Written by Huygens in Dutch and translated into Latin by Schooten.]
 Engl. transl. by W. Browne. Sm. 8vo, pp. 24. London, 1714.
 [See also Jac. Bernoulli, Arbuthnot (Engl. Transl.), and Vastel (Fr.
 Transl.).]
 
 Jahn, G. A. Die Wahrscheinlichkeitsrechnung und ihre Anwendung auf das
 wissenschaftliche und praktische Leben. Leipzig, 1839.

 Janet. La Morale. Paris, 1874. [See Bk. iii. chap. 3 for Probabilism.]

 Engl. transl. The Theory of Morals. New York, 1883, pp. 292-308.
 
 Jevons, W. S. Principles of Science. 2 vols. 1874.

 Jordan, C. "De quelques formules de probabilité (sur les causes)." Comptes
 rendus, 1867.

 Jourdain, P. E. B. "Causality, Induction, and Probability." Mind, vol.
 xxviii. pp. 162-179, 1919.
 
 Kahle, L. M. Elementa logicae probabilium methodo mathematica, in usum
 scientiarum et vitae adornata. Pp. 10 + xxii + 245. Sm. 8vo. Halae,
 1735.
 Kanner, M. "Allgemeine Probleme der Wahrscheinlichkeitsrechnung und ihre
 Anwendung auf Fragen der Statistik." Journ. des Collegiums für
 Lebens-Versicherungs-Wissenschaft. Berlin, 1870.
 Kaufmann, Al. Theorie und Methoden der Statistik. [Translated from the
 Russian.] Pp. xii + 540. Tübingen, 1913.
 Kepler, J. "De stella nova in pede serpentarii." 1606. See J. Kepler's
 Astr. Op. Omn. edidit Frisch, ii. pp. 714-716.
 Kirchmann, J. H. von. Über die Wahrscheinlichkeit. Pp. 60. Leipzig,
 1878.
 Knapp. "Quetelet als Theoretiker." Jahrb. f. nat. Ök. und Stat. (New Series),
 vol. xviii.
 Kozák, Josef. Grundlehren der Wahrscheinlichkeitsrechnung als Vorstufe für
 das Studium der Fehlerausgleichung, Schiesstheorie, und Statistik.
 Vienna, 1912.

 Theorie des Schiesswesens auf Grundlage der Wahrscheinlichkeitsrechnung
 und Fehlertheorie. Vienna, 1908.
 Kries, J. von. Die Principien der Wahrscheinlichkeitsrechnung. Eine
 logische Untersuchung. Pp. 298. 8vo. Freiburg, 1886.
 [See also Lexis, Meinong and Sigwart.]
 
 Lacroix, S. F. Traité élémentaire du calcul des probabilités. Pp. viii + 299.
 8vo. Paris, 1816.

 [2nde éd., revue et augmentée, 1822 ; 4th ed. 1864.]

 [Translated into German : E. S. Unger, Erfurt, 1818.]
 Lagrange. "Mémoire sur l'utilité de la méthode de prendre le milieu entre
 les résultats de plusieurs observations, dans lequel on examine les
 avantages de cette méthode par le calcul des probabilités, et où l'on résout
 différents problèmes relatifs à cette matière." Misc. Taurinensia, vol. 5,
 pp. 167-232, 1770-1773. Œuvres complètes, vol. 2, Paris, 1867-1877.

 "Recherches sur les suites récurrentes . . . et sur l'usage de ces équations
 dans la théorie des hasards." Nouv. Mém. Ac. Berl. (1775), pp. 183-272,
 1777. Œuvres complètes, vol. 4. Paris, 1867-1877.
 Laisant, C. A. Algèbre. Théorie des nombres, probabilités, géométrie de
 situation. Paris, 1895.
 Lambert, J. H. "Examen d'une espèce de superstition ramenée au calcul
 des probabilités." Nouv. Mém. Ac. Berl., 1771, pp. 411-420.
 Lämmel, R. Untersuchungen über die Ermittelung von Wahrscheinlichkeiten.
 (Inaug.-Dissert.) Pp. 80. Zürich, 1904.
 Lampe, E. "Über eine Aufgabe aus der Wahrscheinlichkeitsrechnung." Grun.
 Arch., vol. 70, 1884.
 Lange, F. A. Logische Studien.
 
 Laplace. Essai philosophique sur les probabilités. (Printed as introduction
 to Théorie analytique des probabilités, from 2nd ed. of the latter onwards.)
 4to. Paris, 1814.

 German translation by Tönnies. Heidelberg, 1819. German translation
 by N. Schwaiger. Leipzig, 1886.

 A Philosophical Essay on Probabilities : transl. from the 6th French
 ed. by F. W. Truscott and F. L. Emory. 8vo. New York, 1902, 196 pp.
 
 Théorie analytique des probabilités.

 1st ed. 4to. Paris, 1812. 1st and 2nd Suppl., 1812-1820. 2nd ed.
 4to. cxi + 506 + 2, Paris, 1814. 3rd Suppl. 1820. 3rd ed. Paris, 1820.
 4th Suppl. after 1820. Œuvres complètes, vol. 7, pp. cxcv + 691, Paris,
 1847. Œuvres complètes, vol. 7, pp. 832, Paris, 1886.

 "Recherches sur l'intégration des équations différentielles aux
 différences finies, et sur leur usage dans la théorie des hasards." Mém. prés.
 à l'Acad. des Sc., pp. 113-163, 1773.

 "Mémoire sur les suites récurro-récurrentes et sur leurs usages dans la
 théorie des hasards." Mém. prés. à l'Acad. des Sc., vol. 6, pp. 353-371,
 1774.

 "Mémoire sur la probabilité des causes par les événements." Mém.
 prés. à l'Acad. des Sc., vol. 6, pp. 621-656, 1774.

 "Mémoire sur les probabilités." Mém. prés. à l'Acad. des Sc., pp. 227-
 332, 1780.

 "Mémoire sur les approximations des formules qui sont fonctions de
 très grands nombres, et sur leurs applications aux probabilités." Mém.
 de l'Inst., pp. 353-415, 539-565, 1810.

 "Mémoire sur les intégrales définies, et leur application aux probabilités."
 Mém. de l'Inst., pp. 279-347, 1810.

 [The above memoirs are reprinted in Œuvres complètes, vols. 8, 9, and
 12, Paris, 1891-1898.]

 Sur l'application du calcul des probabilités appliqué à la philosophie
 naturelle. Conn. des temps. Œuvres complètes, vol. 13. Paris, 1904.

 "Applications du calcul des probabilités aux observations et
 spécialement aux opérations du nivellement." Annales de Chimie. Œuvres
 complètes, vol. 14, Paris, 1913.
 La Placette, J. Traité des jeux de hasard. 18mo. 1714.
 
 Laurent, H. Traité du calcul des probabilités. Paris, 1873.

 [À la fin une liste des principaux ouvrages (320) ou mémoires publiés
 sur le calcul des probabilités.]

 "Application du calcul des probabilités à la vérification des
 répartitions." Journ. des Actuaires français, vol. i.

 "Sur le théorème de J. Bernoulli." Journ. des Actuaires français, vol. i.
 Lechalas, G. "Le Hasard." Rev. Néo-scolastique, 1903.

 "À propos de Cournot : hasard et déterminisme." Rev. de Mét. et
 de Mor., 1906.
 Legendre. "Méthode des moindres carrés." Mém. de l'Inst., 1810, 1811.

 Nouvelles méthodes pour la détermination des orbites des comètes.
 Paris, 1805-6.
 Lehr. "Zur Frage der Wahrscheinlichkeit von weiblichen Geburten und von
 Totgeburten." Zeitschrift f. des ges. Staatsw., vol. 45, p. 172, and p.
 524, 1889.
 Leibniz. Nouveaux Essais. Liv. ii. chap. xxi. ; liv. iv. chaps. ii. § 14, xv.,
 xvi., xviii., xx.

 Opera omnia, ed. Dutens, v. 17, 22, 28, 29, 203, 206 ; vi. pt. i., 271,
 304, 36, 217 ; iv. pt. iii. 264.

 Correspondence between Leibnitz and Jac. Bernoulli. L.'s Gesammelte
 Werke (ed. Pertz and Gerhardt), vol. 3, pp. 71-97, passim. Halle, 1855.

 [These letters were written between 1703 and 1705.]

 See also s.v. Couturat.
 Lemoine, E. "Solution d'un problème sur les probabilités." Bulletin de la
 Soc. math. de Paris, 1873.

 Questions de probabilités et valeurs relatives des pièces du jeu des
 échecs. 8vo. 1880.

 "Quelques questions de probabilités résolues géométriquement." Bull.
 de la Soc. math. de France, 1883.

 "Divers problèmes de probabilité." Ass. française pour l'Avancement
 des Sciences, 1885.
 Lexis, W. Abhandlungen zur Theorie der Bevölkerungs- und Moral-statistik.
 Pp. 253. Jena, 1903.

 Zur Theorie der Massenerscheinungen in der menschlichen Gesellschaft.
 Pp. 95. Freiburg, 1877.

 "Über die Wahrscheinlichkeitsrechnung und deren Anwendung auf die
 Statistik." Jahrb. f. nat. Ök. u. Stat. (2), vol. 13, pp. 433-450, 1886.

 [Contains a review of v. Kries's "Principien."]

 "Über die Theorie der Stabilität statistischer Reihen." Jahrb. f. nat.
 Ök. u. Stat. (1), vol. 32, p. 604, 1879.

 [Reprinted in Abhandlungen.]

 "Das Geschlechtsverhältnis der Geborenen und die
 Wahrscheinlichkeitsrechnung." Jahrb. f. nat. Ök. u. Stat. (1), vol. 27, p. 209, 1876.

 [Reprinted in Abhandlungen.]

 Einleitung in die Theorie der Bevölkerungsstatistik. Strassburg, 1875.
 Liagre, J. B. J. Calcul des probabilités et théorie des erreurs avec des
 applications aux sciences d'observation en général et à la géodésie en
 particulier. 416 pp. Brussels, 1852. 2nd ed. 8vo. 1879.

 "Sur la probabilité d'une cause d'erreur régulière, etc." Bull. de l'Acad.
 de Belgique, 1855.
 Liapounoff, A. "Sur une proposition de la théorie des probabilités." Bull.
 de l'Acad. des Sc. de Saint-Pét., v. série, vol. xiii.

 "Nouvelle Forme du théorème sur la limite de probabilité." Mém. de
 l'Acad. des Sc. de Saint-Pét., viii. série, vol. xiii. (1901).
 Liebermeister, C. "Über Wahrscheinlichkeitsrechnung in Anwendung auf
 therapeutische Statistik." Sammlung klinische Vorträge, Nr. 110. 1877.
 
 Lilienfeld, J. "Versuch einer strengen Fassung des Begriffs der
 mathematischen Wahrscheinlichkeit." Zeitschr. f. Philos. u. phil. Kr., vol.
 cxx, pp. 58-65, 1902.
 Lipps, G. F. Kollectivmasslehre. 1897.
 Littrow, J. J. Die Wahrscheinlichkeitsrechnung in ihrer Anwendung auf das
 wissenschaftliche und praktische Leben. 8vo. Wien, 1833.
 Lobatchewsky, N. J. "Probabilité des résultats moyens tirés d'observations
 répétées." Crelle J. 1824.

 Reprinted. Liouv. J. vol. 24. 1842.
 Lottin, J. Le Calcul des probabilités et les régularités statistiques. 32 pp.
 8vo. Louvain, 1910. (Originally published in the Revue Néo-scolastique,
 Feb. 1910.)

 Quetelet, statisticien et sociologue. Louvain, 1912. Pp. xxx + 564.
 [Contains a very full discussion of Quetelet's Work on Probability.]
 LOTZE, H. Logik. 1st ed. 1874, 2nd ed. 1880. 
 
 Engl, transl. by B. Bosanquet. Oxford, 1884. 
 
 [See Bk. ii. chap. ix. : Determination of Single Facts and Calculus of 
 Chances.] 
 Lourié, S. Die Prinzipien der Wahrscheinlichkeitsrechnung. Tübingen, 1910.
 Lubbock, J. W., and Drinkwater. Treatise on Probability. [Library of
 Useful Knowledge.]

 [Often wrongly ascribed to De Morgan.]
 
 Macalister, Donald. The Law of the Geometric Mean. Phil. Trans., 1879.
 McColl, Hugh. Symbolic Logic. 1906. [Especially chaps. xvii., xviii.]

 The Calculus of Equivalent Statements. Proc. Lond. Math. Soc. Six
 papers.

 [See particularly 1877, vol. ix. pp. 9-20 ; 1880, xi. 113-121, 4th paper ;
 1897, xxviii. p. 556, 6th paper.]

 "Growth and Use of a Symbolical Language." Memoirs Manchester
 Lit. Phil. Soc. series iii. vol. 7, 1881.

 "Symbolical or Abbreviated Language with an Application to
 Mathematical Probability." Math. Questions, vol. 28, pp. 20-23.

 Various Papers in Mathematical Questions from the Journal of
 Education, vols. 29, 33, etc.

 "A Note on Prof. C. S. Peirce's Probability Notation of 1867." Proc.
 Lond. Math. Soc. vol. xii. p. 102.
 Macfarlane, Alexander. Principles of the Algebra of Logic.

 [See especially chaps. ii., iii., v., xx., xxi., xxii., xxiii., and the examples.]
 Various Papers in Mathematical Questions from the Journal of
 Education, vols. 32, 36, etc.
 MacMahon, P. A. "On the Probability that the Successful Candidate at an
 Election by Ballot may never at any time have fewer Votes than the one
 who is unsuccessful, etc." Phil. Trans. (A), vol. 209, pp. 153-175, 1909.
 Maldidier, Jules. "Le Hasard." Rev. Philos. xliii., 1897, pp. 561-588.
 Malfatti, G. F. "Esame critico di un problema di probabilità del Sig. Daniele
 Bernoulli, e soluzione d'un altro problema analogo al bernulliano." Memorie
 di Matematica e Fisica della Società Italiana, vol. 1, pp. 768-824, 1782.
 Mallet. "Sur le calcul des probabilités." Act. Helv. Basileae, 1772,
 vii. pp. 133-163.
 Mansion, P. "Sur la portée objective du calcul des probabilités." Bulletin
 de l'Académie de Belgique (Classe des sciences), pp. 1235-1294, 1903.
 Marbe, Dr. Karl. Naturphilosophische Untersuchungen zur
 Wahrscheinlichkeitslehre. 50 pp. Leipzig, 1899.

 Die Gleichförmigkeit in der Welt. Munich, 1916.
 
 Markoff, A. A. "Über die Wahrscheinlichkeit a posteriori" (in Russian).
 Mitteilungen der Charkower Math. Gesell. 2 Serie, vol. iii. 1900.

 "Untersuchung eines wichtigen Falles abhängiger Proben" (in Russian).
 Abh. der K. Russ. Ak. d. W., 1907.

 "Über einige Fälle der Theoreme vom Grenzwert der mathematischen
 Hoffnungen und vom Grenzwert der Wahrscheinlichkeiten" (in Russian).
 Abh. der K. Russ. Ak. d. W., 1907.

 "Erweiterung des Gesetzes der grossen Zahlen auf von einander
 abhängige Grössen" (in Russian). Mitt. d. phys.-math. Ges. Kazan, 1907.

 "Über einige Fälle des Theorems vom Grenzwert der
 Wahrscheinlichkeiten" (in Russian). Abh. der K. Russ. Ak. d. W., 1908.

 "Erweiterung gewisser Sätze der Wahrscheinlichkeitsrechnung auf eine
 Summe verketteter Grössen" (in Russian). Abh. der K. Russ. Ak. d. W.,
 1908.

 "Untersuchung des allgemeinen Falles verketteter Ereignisse" (in
 Russian). Abh. der K. Russ. Ak. d. W., 1910.

 "Über einen Fall von Versuchen, die eine komplizierte
 zusammenhängende Kette bilden," and "Über zusammenhängende Grössen, die
 keine echte Kette bilden" (both in Russian). Bull. de l'Acad. des Sciences.
 Petersburg, 1911.

 Wahrscheinlichkeitsrechnung. Transl. from 2nd Russian edition by H.
 Liebmann. Leipzig, 1912. Pp. vii + 318.

 Démonstration du second théorème-limite du calcul des probabilités par
 la méthode des moments. Saint-Pétersbourg, 1913. Pp. 66.

 [Supplement to the 3rd Russian edition of Wahrscheinlichkeitsrechnung,
 in honour of the bicentenary of the Law of Great Numbers, with a Portrait
 of Jacques Bernoulli.]
 Masaryk, T. G. David Hume's Skepsis und die Wahrscheinlichkeitsrechnung.
 Wien, 1884.
 Maseres, F. The Doctrine of Permutations and Combinations, being an
 Essential and Fundamental Part of the Doctrine of Chances : As it is
 delivered by Mr. James Bernoulli, in his excellent Treatise on the Doctrine
 of Chances, intitled, Ars conjectandi . . . 8vo. London, 1795.
 Meinong, A. Review of Von Kries's "Die Principien der
 Wahrscheinlichkeitsrechnung." Göttingische Gelehrte Anzeigen, vol. 2, pp. 56-75, 1890.

 Über Möglichkeit und Wahrscheinlichkeit : Beiträge zur
 Gegenstandstheorie und Erkenntnistheorie. Pp. xvi. + 760. Leipzig, 1915.
 Meissner (Otto). Wahrscheinlichkeitsrechnung : I. Grundlehren ; II.
 Anwendungen. Leipzig, 1912 ; 2nd ed., 1919. Pp. 56 + 52.

 [An elementary primer.]
 Mendelssohn, Moses. Philos. Schriften, 2 Tle. 12mo. Pp. xxii + 278 + 283.
 Berlin, 1771. (Vide especially vol. ii. pp. 243-283, entitled "Ueber die
 Wahrscheinlichkeit.")
 Mentré, F. "Rôle du hasard dans les inventions et découvertes." Rev. de
 Phil., 1904.

 "Les Racines historiques du probabilisme rationnel de Cournot." Rev.
 de Métaphysique et de Morale, pp. 485-508, May 1905.

 Cournot et la renaissance du probabilisme au xixe siècle. Paris, 1908.
 Merriman, M. A Text-book of the Method of Least Squares. New York,
 1884. Pp. vii + 198. 6th ed., 1894.

 "List of Writings relating to the Method of Least Squares, with Historical
 and Critical Notes." Trans. Connecticut Acad. vol. 4, pp. 151-232, 1877.
 Mertz. Die Wahrscheinlichkeitsrechnung und ihre Anwendung, etc.
 Frankfort, 1854.
 Messina, I. "Intorno a un nuovo teorema di calcolo delle probabilità." 20 pp.
 4to. Giornale di Matematiche di Battaglini, vol. lvi. (1918). Naples.
 
 [Described Stat. Jl. vol. lxxxii. (1919), p. 612.]

 "Su di un nuovo teorema di calcolo delle probabilità, sul teorema di
 Bernoulli e sui postulati empirici per la loro applicazione." Boll. del Lavoro
 et della Presidenza, vol. xxxiii. (1920).
 Meyer, A. Essai sur une exposition nouvelle de la théorie analytique des
 probabilités à posteriori. 4to. Pp. 122. Liège, 1857.

 Cours de calcul des probabilités fait à l'université de Liège de 1849 à 1857.
 Publié sur les mss. de l'auteur par F. Folie. Bruxelles, 1874.
 Vorlesungen über Wahrscheinlichkeitsrechnung. (Translation of the
 above by E. Czuber.) Pp. xii + 654. Leipzig, 1879.
 Michell. "An Inquiry into the Probable Parallax and Magnitude of the Fixed
 Stars, from the Quantity of Light which they afford us, and the particular
 Circumstances of their Situation." Phil. Trans. vol. 57, pp. 234-264,
 1767.
 Milhaud, G. "Le Hasard chez Aristote et chez Cournot." Revue de Méta.
 et de Mor. vol. x. pp. 667-681, 1902.
 Mill, J. S. System of Logic. Bk. iii. chaps. 18, 23. 
 Mondésir. "Solution d'une question qui se présente dans le calcul des
 probabilités." Liouville Journ. vol. ii.
 Monro, C. J. "Note on the Inversion of Bernoulli's Theorem in Probabilities."
 Proc. Lond. Math. Soc. vol. 5, pp. 74-78 and 145, 1874.
 Montessus, R. de. Leçons élémentaires sur le calcul des probabilités. Pp.
 191. Paris, 1908. (Reviewed Stat. Journ., 1909, p. 113.)
 "Le Hasard." Rev. du Mois, March 1907.
 Montessus, R. de, and Lechalas, G. "Un Paradoxe du calcul des
 probabilités." Nouv. Ann. iv. (3), 1903.
 Montmort, P. de. Essai d'analyse sur les jeux de hasard. 4to. Pp.
 xxiv + 189. Paris, 1708.
 
 Ditto. 4to. pp. 414. Paris, 1714. (The 2nd ed. is increased by a
 treatise on Combinations, and the correspondence between M. and Nicholas
 Bernoulli.)
 Montucla, J. T. Histoire des mathématiques. 4 vols. 4to. Paris, 1799-1802.
 Vol. iii. pp. 380-426.
 
 Newcomb, Simon. A Statistical Inquiry into the Probability of Causes of the
 Production of Sex in Human Offspring. (Published by the Carnegie
 Institution of Washington.) Pp. 34. 8vo. Washington, 1904.
 Nicole, F. "Examen et résolution de quelques questions sur les jeux." Hist.
 Ac. Par. pp. 45-56, 331-344, 1730.
 Nieuport, C. F. de. Un peu de tout ou amusemens d'un sexagénaire. 8vo.
 Bruxelles, 1818. Containing "Conversations sur la théorie des
 probabilités."
 Nitsche, A. "Die Dimensionen der Wahrscheinlichkeit und die Evidenz der
 Ungewissheit." Vierteljahrsschr. f. wissensch. Philos. vol. 16, pp. 20-35,
 1892.
 Nixon, J. W. "An Experimental Test of the Normal Law of Error." Stat.
 Journ. vol. 76, pp. 702-706, 1913.
 
 Oettinger, L. Die Wahrscheinlichkeitslehre. 4to. Berlin, 1852.

 [Reprinted from Crelle, J., vols. 26, 30, 34, 36, under the title,
 Untersuchungen über Wahrscheinlichkeitsrechnung.]
 Ostrogradsky. "Probabilité des jugements." Acad. de St-Pétersbourg, 1834.

 "Sur la probabilité des hypothèses." Mélanges math. et astr., 1859.
 
 Pagano, F. Logica dei probabili. Napoli, 1806. 
 
 Parisot, S. A. Traité du calcul conjectural ou l'art de raisonner sur les choses
 futures et inconnues. 4to. Paris, 1810.

 Pascal, B. "Letters to Fermat." Varia opera mathematica D. Petri de
 Fermat. pp. 179-188, Toulouse, 1678.
 Œuvres, vol. 4, pp. 360-388, Paris, 1819.

 Patavio. Probabilismus methodo mathematico demonstratus. 1840.

 Paulhan, Fr. "L'erreur et la sélection." Rev. Philos. vol. viii. pp. 72-86,
 179-190, 290-306, 1879.

 Peabody, A. P. "Religious Aspect of the Logic of Chance and Probability."
 Princeton Rev. vol. v. pp. 303-320, 1880.

 Pearson, K. "On a Form of Spurious Correlation which may arise when
 Indices are used, etc." Proc. Roy. Soc. vol. lx. pp. 489-498.
 
 " On the Criterion that a given System of Deviations from the Probable 
 in the case of a Correlated System of Variables is such that it can be 
 reasonably supposed to have arisen from Random Sampling." Phil. Mag. 
 (5), vol. 50, pp. 157-160, 1900. 
 
 " On some Applications of the Theory of Chance to Racial Differentia- 
 tion." Phil. Mag. (6), vol. 1, pp. 110-124, 1901. 
 
 Contributions to the Mathematical Theory of Evolution. 
 [The main interest of the twelve elaborate memoirs published in the
 Phil. Trans. under the above title is in every case statistical. References
 are given below to those of them which have most reference to the theory 
 of Probability and in which Professor Pearson's general theory is mainly 
 developed.] 
 
 II. " Skew Variation in Homogeneous Material." Phil. Trans. (A), 
 vol. 186, Part i. pp. 343-414, 1895. 
 
 III. "Regression, Heredity, and Panmixia." Phil. Trans. (A), vol. 
 187, pp. 253-318, 1897. 
 
 IV. " On the Probable Errors of Frequency Constants and on the 
 Influence of Random Selection on Variation and Correlation." Phil. 
 Trans. (A), vol. 191, pp. 229-311, 1898. (With L. N. G. Filon.) 
 
 VII. "On the Correlation of Characters not quantitatively measurable."
 Phil. Trans. (A), vol. 195, pp. 1-47, 1901.

 "Mathematical Contributions to the Theory of Evolution." Roy. Stat.
 Soc. Journ. lvi., 1893, pp. 675-679 ; lix., 1896, pp. 398-402 ; lx., 1897, pp.
 440-449.
 
 " On the Mathematical Theory of Errors of Judgment, with special 
 reference to the Personal Equation." Phil. Trans. (A), vol. 198, pp. 235- 
 299, 1902. 
 
 On the Theory of Contingency and its relation to Association and 
 Normal Correlation. Pp. 35. 4to. London, 1904. 
 
 On the General Theory of Skew Correlation and Non-linear Regression. 
 Pp. 54. 4to. London, 1905. 
 
 On further Methods of determining Correlation. London, 1907.
 (Reviewed by G. U. Yule, Journ. Roy. Stat. Soc., Dec. 1907.)

 "On the Influence of Past Experience on Future Expectation." Phil.
 Mag. (6), vol. 13, pp. 365-378, 1907.

 "The Fundamental Problem of Practical Statistics." Biometrika, vol.
 xiii. pp. 1-16, 1920.

 [On Inverse Probability.]

 "Notes on the History of Correlation." Biometrika, vol. xiii. pp.
 25-45, 1920.
 
 "The Chances of Death" and other essays. 2 vols. 8vo, London, 
 1897. 
 
 The Grammar of Science. London, 1892. 
 
 Peirce, C. S. "A Theory of Probable Inference." Johns Hopkins "Studies in
 Logic," 1883.

 "On an Improvement in Boole's Calculus of Logic." Proc. Amer. Acad.
 Arts and Sci. vol. vii. pp. 250-261, 1867. Pp. 62. Camb., 1870.
 Perozzo. "Nuove applicazioni del calcolo delle probabilità allo studio dei
 fenomeni statistici." Proceedings of Academia dei Lincei, 1881-82.

 Germ. transl. by O. Elb. Neue Anwendungen der
 Wahrscheinlichkeitsrechnung in der Statistik. Pp. 33. 4to. Dresden, 1888.
 Piéron, H. "Essai sur le hasard. La Psychologie d'un concept." Rev. de
 Méta. et de Mor. vol. x. pp. 682-693, 1902.
 Pinard, H. "Sur la Convergence des Probabilités." Rev. Néo-Schol. de
 Phil. No. 84 (1919) and No. 85 (1920).
 Pincherle, S. "Il calcolo delle probabilità e l'intuizione." Scientia, vol.
 xix. pp. 417-426, 1916.
 Pizzetti, P. I fondamenti matematici per la critica dei risultati sperimentali.
 Atti della R. Univ., Genova, 1892.
 Plaats, J. D. van der. Over de toepassing der waarschijnlijkheidsrekening
 op medische statistiek. 1895.
 Plana, G. "Mémoire sur divers problèmes de probabilité." Mémoires de
 l'Académie de Turin for 1811-12, vol. xx. pp. 355-408, 1813.
 Poincaré, H. Calcul des probabilités. Pp. 274. Paris, 1896.
 2nd edition (with additions). Pp. 333. Paris, 1912.
 Science et hypothèse. Paris.
 Engl. transl., London, 1905.

 Science et méthode. Paris. (Includes a chapter on "Le Hasard.")
 Eng. transl. (by F. Maitland). Pp. 288. London, 1914.
 "Le Hasard." Rev. du Mois, March 1907.
 Poisson, S. D. Recherches sur la probabilité des jugements en matière
 criminelle et en matière civile, précédées des règles générales du calcul des
 probabilités. 4to. Pp. ix + 415. Paris, 1837.

 Lehrbuch der Wahrscheinlichkeitsrechnung. German translation of the
 above by H. Schnuse. Braunschweig. 8vo. 1841.

 "Sur la probabilité des résultats moyens des observations." Conn.
 des Temps. Pp. 273-302, 1827. Pp. 3-22, 1832.

 "Formules relatives aux probabilités qui dépendent de très grands
 nombres." Compt. Rend., Acad. Paris, vol. 2, pp. 603-613, 1836.
 "Sur le jeu de trente et quarante." Annal. de Gergonne, xv.
 "Solution d'un problème de probabilité." Liouv. J. (1), vol. 2, 1837.
 "Mémoire sur la proportion des naissances des filles et des garçons."
 Mém. Acad. Paris, vol. 9, pp. 239-308, 1830.
 Pondra et Hossard. Question de probabilité résolue par la géométrie. 8vo.
 Paris, 1819.
 Poretzki, Platon S. "Solution of the general Problem of the Theory of
 Probability by means of Mathematical Logic." (In Russian.) Bull. of
 the physico-mathematical Academy of Kasan, 1887.
 Prevost, P. "Sur les principes de la théorie des gains fortuits." Nouv. Mém.
 Pp. 430-472. Berlin, 1780.
 Prevost, P., and Lhuilier, S. A. "Sur les probabilités." Mém. Ac. Berl.
 (1796), pp. 117-142, 1799.

 "Sur l'art d'estimer la probabilité des causes par les effets." Mém. Ac.
 Berl. (1796), pp. 3-24, 1799.

 "Remarques sur l'utilité et l'étendue du principe par lequel on estime
 la probabilité des causes." Mém. Ac. Berl. (1796), pp. 25-41, 1799.
 Note on last. Mém. Ac. Berl. (1797), p. 152, 1800.
 "Mémoire sur l'application du calcul des probabilités à la valeur du
 témoignage." Mém. Ac. Berl. (1797), pp. 120-151, 1800.
 
 Pbioe, K. See Bates. 
 
 PsiirasEEiM, A. See Dasieii BuBisroirLij:. 
 
 " Welteres zur Gtesohichte des Petersburger Problems." Grunert, Archiv, 
 77, 1881. 
 
 Proctor, R. A. Chance and Luck. A Discussion of the Laws of Luck,
 Coincidences, Wagers, Lotteries, and the Fallacies of Gambling ; with Notes on
 Poker and Martingales. Pp. vii + 263. London, 1887.
 
 Protimalethes. Miracle versus Nature : being an Application of Certain
 Propositions in the Theory of Chances to the Christian Miracles. 8vo.
 Cambridge, 1847.
 
 Quetelet, A. Instructions populaires sur le calcul des probabilités. 12mo.
 Bruxelles, 1828.

 Engl. transl. : Popular Instructions on the Calculation of Probabilities,
 transl. with notes by R. Beamish. 1839.

 Dutch transl. by H. Strootman. Breda, 1834.

 Lettres sur la théorie des probabilités appliquée aux sciences morales
 et politiques. Bruxelles, 1846.

 Engl. transl. : Letters on the Theory of Probabilities as applied to the
 Moral and Political Sciences, transl. by O. G. Downes. 8vo. 1849.

 " Sur la possibilité de mesurer l'influence des causes qui modifient les
 élémens sociaux." Corresp. mathém. et phys. vol. vii. pp. 321-346.
 Bruxelles, 1832.

 " Sur la constance qu'on observe dans le nombre des crimes qui se
 commettent." Corresp. mathém. et phys. vol. vi. pp. 214-217. Brussels, 1830.

 " Théorie des probabilités." (In the Encycl. populaire.) Brussels, 1853.

 " Sur le calcul des probabilités appliqué à la science de l'homme." Bull.
 de l'Acad. roy. vol. xxvi. pp. 19-32. Brussels, 1873.

 [For a full bibliography and discussion of Quetelet's writings on these
 topics see Lottin's Quetelet.]
 
 Rayleigh, Lord. " On James Bernoulli's Theorem in Probabilities." Phil.
 Mag. (5), vol. 47, pp. 246-251, 1899.
 Regnault. Calcul des chances et philosophie de la bourse. 8vo. Paris,
 1863.
 Renouvier, Ch. L'Homme : la raison, la passion, la liberté, la certitude, la
 probabilité morale. 8vo. 1859.
 Revel, P. Camille. Esquisse d'un système de la nature fondé sur la loi du
 hasard. 1890. 2nd ed. (corrigée), 1892.

 Le Hasard, sa loi et ses conséquences dans les sciences et en philosophie.
 Paris, 1905. 2nd ed. (corrigée et augmentée). Pp. 249. Paris, 1909.
 Rizzetti, J. " Ludorum scientia, sive artis conjectandi elementa ad alias
 applicata." Act. Erud. Suppl. vol. 9, pp. 215-229, 295-307. Leipzig,
 1729.
 Roberts, Hon. Francis. " An Arithmetical Paradox concerning the Chances
 of Lotteries." Phil. Trans. vol. xvii. pp. 677-681, 1693.
 Roger. " Solution d'un problème de probabilité." Liouv. J. (1), vol. 17,
 1852.
 Rouse, W. Doctrine of Chances, or the Theory of Gaming made easy to every
 Person - Lotteries, Cards, Horse-Racing, Dice, etc. 1814.
 Rüdiger, Andreas. De sensu falsi et veri libri iv. [Lib. i. cap. xii. et lib.
 iii.] Editio Altera. 4to. Lipsiae, 1722.
 Ruffini. Critical Reflexions on the Essai philosophique of Laplace (in
 Italian). Modena, 1821.
 
 Sabudski-Eberhard. Die Wahrscheinlichkeitsrechnung, ihre Anwendung auf
 das Schiessen und auf die Theorie des Einschiessens. Stuttgart, 1906.
 Sawitsch, A. Die Anwendung der Wahrscheinlichkeitstheorie auf die
 Berechnung der Beobachtungen und geodätischen Messungen oder die Methode
 der kleinsten Quadrate. (Translated into German from the Russian by
 Lais.) Leipzig, 1863.
 Schell, W. Über Wahrscheinlichkeit. 8vo.
 Schnuse, H. Vid. Poisson.

 Schweigger, F. Berechnung der Wahrscheinlichkeit beim Würfeln.
 Scott, John. The Doctrine of Chance : the Arithmetic of Gambling. 56 pp.
 8vo. 1908.
 Segneri, Paolo. Lettere sulla materia del probabile. 12mo. Colonia, 1732.
 Sextus Empiricus. Works.
 Sheldon, W. H. " Chance." Journal of Phil., Psych., and Sci. Meth., vol.
 ix. pp. 281-290. 1912.
 Sheppard, W. F. " On the Application of the Theory of Error to Cases of
 Normal Distribution and Normal Correlation." Phil. Trans. A. vol. 192,
 pp. 101-167, 1899.

 " On the Calculation of the most Probable Values of the Frequency
 Constants for Data arranged according to Equidistant Divisions of a Scale."
 Proc. Lond. Math. Soc. vol. xxix. pp. 353-380.
 " Normal Correlation." Camb. Phil. Soc. vol. xix.
 " Normal Distribution and Correlation." Roy. Soc. Trans., 1898.
 Sigwart, C. Review of von Kries in Vierteljahrsschr. für Wiss. Phil. xiv.
 p. 90.

 Logik. Tübingen, 1878.

 2nd ed. Freiburg, i. B., 1893. English ed., 1895.
 Vol. ii. Part 3, chap. 3, § 85, Die Wahrscheinlichkeitsrechnung ; 5, § 102.
 Die Wahrscheinlichkeit auf statischem Boden.
 References in English ed. : 
 
 Probability, vol. ii. pp. 216-230, 261-271 (errors of observation), 303- 
 309 (induction), 504-507 (statistics). 
 Simmons, T. C. " A New Theorem in Probability." Proc. Lond. Math. Soc.
 vol. 26, pp. 290-323, 1895.

 " Sur la probabilité des événements composés." Ass. Franç. pour
 l'Avancement des Sciences. 1896.
 Simon. " Exposition des principes du calcul des probabilités." Journ. des
 Actuaires français, i,
 Simpson, T. "A Letter to the Right Honourable George, Earl of Macclesfield, 
 President of the Royal Society, as to the Advantage of taking the Mean 
 of a Number of Observations in Practical Astronomy." Phil. Trans. vol.
 xlix. pp. 82-93, 1755. 
 
 " An Attempt to show the Advantage arising by taking the Mean of a 
 Number of Observations in Practical Astronomy." (Miscellaneous tracts 
 on some curious subjects, pp. 64-75). London, 4to, 1757. 
 
 [A reprint of the above with some new matter. The probability, assum- 
 ing positive and negative errors to be equally likely, that the mean is 
 nearer to the truth than a single observation taken at random, is here 
 investigated for the first time.] 
 
 Treatise on the Nature and Laws of Chance. 4to. London, 1740. 
 Another edition. 8vo. 1792. 
 Sorel, G. " Le Calcul des probabilités et l'expérience." Rev. Philos. vol.
 xxiii. pp. 50-66, 1887.
 Spehr, F. W. Vollständiger Lehrbegriff der reinen Combinationslehre mit
 Anwendungen derselben auf Analysis und Wahrscheinlichkeitsrechnung. 2.
 wohlfeile Ausg. 4to. Braunschweig, 1840.
 
 Spinoza. " Letter to Jan van der Meer." Opera ed. Van Vloten and Land,
 vol. ii. pp. 145-149, Ep. 38 (in Latin and Dutch).
 See also Spinoza's Briefwechsel in J. H. v. Kirchmann's Philos.
 Bibliothek, vol. xlvi. pp. 145-147.
 Sprague, T. B. On Probability and Chance and their Connexion with the
 Business of Insurance. 8vo. 1892.
 Stamkart, F. J. Over de waarschijnlijkheidsrekening. 8vo.
 Sterzinger, O. Zur Logik und Naturphilosophie der Wahrscheinlichkeitslehre.
 Leipzig, 1911.
 Stewart, Dugald. " On the Calculus of Probabilities, in reference to the 
 
 Preceding Argument for the Existence of God, from Final Causes." Philo- 
 sophy of the Moral Powers, vol. ii. pp. 108-119. (Sir W. Hamilton's ed., 
 
 Edin., 1860.) 1st ed., 1828. 
 Stieda, L. Über die Anwendung der Wahrscheinlichkeitsrechnung in der
 anthropologischen Statistik. Arch. f. Anthrop., 1882.
 2nd ed. 8vo. Braunschweig, 1892.
 Streeter, T. E. The Elements of the Theory of Probabilities. 31 pp. 8vo.
 1908.
 Struve. Catalogus novus stellarum duplicium et multiplicium. Dorpati,
 1827, pp. xxxvii-xlviii.
 Stumpf, C. " Bemerkung zur Wahrscheinlichkeitslehre." Jahrb. f. national.
 Ök. u. Stat. (3), vol. 17, pp. 671, 672, 1899 ; vol. 18, p. 243, 1899.
 [In criticism of Bortkiewicz, q.v.]
 Stumpf, K. " Über den Begriff der mathematischen Wahrscheinlichkeit."
 Ber. bayr. Ak. (Phil. Cl.), pp. 37-120, 1892.

 " Über die Anwendung des mathematischen Wahrscheinlichkeitsbegriffes
 auf Teile eines Continuums." Ber. bayr. Ak. (Phil. Cl.), pp. 681-691,
 1892.
 Suppantschitsch. Einführung in die Wahrscheinlichkeitsrechnung. Leipzig.
 
 Tait, P. G. " Law of Frequency of Error." Edin. Phil. Trans. vol. 4, 1865.

 On a Question of Arrangement and Probabilities. 1873.
 Tchebychef, P. L. Essai d'analyse élémentaire de la théorie des probabilités.
 4to. Moscow, 1845 (in Russian, degree thesis). Pp. ii + 61 + iii.

 " Démonstration élémentaire d'une proposition générale de la théorie
 des probabilités." Crelle J. vol. 33, pp. 259-267, 1846.

 " Des valeurs moyennes." Liouv. J. (2), vol. 12, pp. 177-184, 1867.
 (Extrait du Recueil des Sciences mathématiques, vol. ii.)
 " Sur deux théorèmes relatifs aux probabilités." Petersb. Abh. vol.
 55, 1887. (In Russian.) French translation by J. Lyon : Act. Math.
 Petr. vol. 14, pp. 305-315, 1891.
 Œuvres. 2 vols. 4to. St-Pétersbourg, 1907.
 (The three memoirs preceding are here reprinted in French.)
 Terrot, Bishop. " Summation of a Compound Series and its Application to a
 Problem in Probabilities." Edin. Phil. Trans., 1853, vol. xx. pp. 541-545.
 " On the Possibility of combining two or more Probabilities of the same
 Event, so as to form one Definite Probability." Edin. Phil. Trans., 1856,
 vol. xxi. pp. 369-376.
 Thiele, T. N. Theory of Observations. Pp. 6 + 143. 4to. London, 1903.
 Thomson, Archbishop. Laws of Thought. § 124, Syllogisms of Chance (13 pp.).
 Thubeuf. Élémens et principes de la royale arithmétique aux jettons, etc.
 12mo. Paris, 1661.
 Timerding. Die Analyse des Zufalls. Pp. ix + 168. Braunschweig, 1915.
 Todhunter, I. " On the Method of Least Squares." Camb. Phil. Trans. vol. ii.
 A History of the Mathematical Theory of Probability from the Time of
 Pascal to that of Laplace. Lge. 8vo. pp. xvi + 624, Camb. and Lond., 1865.
 
 Tozer, J. On the Measure of the Force of Testimony in Cases of Legal
 Evidence. 4to. Camb. Phil. Soc. vol. viii. Part II. 16 pp. (read Nov. 27,
 1843). 1844.
 Trembley. " Observations sur le calcul d'un jeu de hasard." Mém. Ac. Berl.
 (1802), pp. 86-102.

 " Recherches sur une question relative au calcul des probabilités."
 Mém. Ac. Berl. (1794-5), pp. 69-108, 1799.

 (On Euler's memoir, " Solutio quarundam quaestionum difficiliorum in
 calculo probabilitatum.")

 " De probabilitate causarum ab effectibus oriunda." Comm. Soc. Reg.
 Gott. (1795-8), vol. 13, pp. 64-119, 1799.

 " Observations sur la méthode de prendre les milieux entre les
 observations." Mém. Ac. Berl. (1801), pp. 29-58, 1804.

 " Disquisitio elementaris circa calculum probabilium." Comm. Soc.
 Reg. Gott. (1793-4), vol. 12, pp. 99-136, 1796.
 Tschuprow, A. A. " Die Aufgaben der Theorie der Statistik." Jahrb. f.
 gesetzg. Verwalt. u. Volkswirtsch. vol. 29, pp. 421-480, 1905.

 " Zur Theorie der Stabilität statistischer Reihen." Skandinavisk
 Aktuarietidskrift, pp. 199-256, 1918 ; pp. 80-133, 1919.
 Twardowski, K. " Über sogenannte relative Wahrheiten." Arch. f. syst.
 Philos. vol. viii. pp. 439-447, 1902.

 Urban, F. M. " Über den Begriff der mathematischen Wahrscheinlichkeit."
 Vierteljahrsschr. f. wiss. Phil. und Soz., vol. x. (N.S.), 1911.
 
 Vastel, L. G. F. L'Art de conjecturer. Traduit du latin de J. Bernoulli, avec
 observations, éclaircissemens et additions. Caen, 1801.

 [Translation of Part I. only of Bernoulli's Ars Conjectandi (q.v.)
 containing a commentary on and reprint of Huygens, De ratiociniis in ludo
 aleae.]
 Venn, J. The Logic of Chance. 1866. 2nd ed., 1876. 3rd ed., 1888.

 " The Foundations of Chance." Princeton Rev. vol. 2, pp. 471-510,
 1872.

 " On the Nature and Uses of Averages." Stat. Journ. vol. 54, pp. 429-448,
 1891.
 
 Wagner, A. Die Gesetzmässigkeit in den scheinbar willkürlichen
 Handlungen des Menschen. Hamburg, 1864.

 " Wahrscheinlichkeitsrechnung und Lebensversicherung." Zeitschr. f.
 d. ges. Versicherungswissenschaft. Berlin, 1906.

 Waring, E. (M.D. Lucasian Prof.) On the Principles of translating Algebraic
 Quantities into Probable Relations and Annuities, etc. Pp. 59.
 Cambridge, 1792.

 An Essay on the Principles of Human Knowledge. Pp. 244.
 Cambridge, 1794.

 Welton, J. Manual of Logic. (Probability, vol. ii. pp. 165-185.)
 London, 1896.

 Westergaard. Grundzüge der Theorie der Statistik.

 Whitaker, Lucy. " On the Poisson Law of Small Numbers." Biometrika,
 vol. x. 1914.
 
 Whittaker (E. T.). " On Some Disputed Questions of Probability."
 Transactions of the Faculty of Actuaries in Scotland, vol. viii. (1920), pp.
 163-206.
 
 [Problems of Inverse Probability including the Law of Succession. This 
 paper is followed by others on the same subject by various writers.] 
 
 Whitworth, W. A. Choice and Chance, An Elementary Treatise on
 Permutations, Combinations, and Probability, with 300 Exercises. 1867. 2nd
 ed., 1870. 3rd ed. pp. viii + 244. Cambridge, 1878.

 Expectations of Parts into which a Magnitude is divided at Random.
 1898.

 Wicksell, S. D. " Some Theorems in the Theory of Probabilities."
 Skandinavisk Aktuarietidskrift, p. 196 (1910).

 Wijnne, H. A. De leer der waarschijnlijkheid in hare toepassing op het
 dagelijksche leven. 1862.

 Wilbraham, H. " On the Theory of Chances developed in Prof. Boole's
 ' Laws of Thought.' " Phil. Mag., 1854.

 Wild, A. Die Grundsätze der Wahrscheinlichkeitsrechnung und ihre
 Anwendungen. München, 1862.

 Windelband, W. Die Lehren vom Zufall. Berlin, 1870.

 Wolf, A. " The Philosophy of Probability." Proc. Arist. Soc. vol. xiii.
 pp. 29. London, 1913.
 
 Wolf, R. " Über eine neue Serie von Würfelversuchen." Vierteljs.
 Naturforsch. Gesellsch. in Zurich, vol. 26, pp. 126-136 and 201-224, 1881 ; vol.
 27, pp. 241-262, 1882 ; vol. 28, pp. 118-124, 1883.

 " Neue Serie von Würfelversuchen." Ibid. vol. 38, pp. 10-32, 1893.
 " Versuche zur Vergleichung der Erfahrungswahrscheinlichkeit mit der
 mathematischen Wahrscheinlichkeit." Mitth. d. Naturforsch. Gesellsch.,
 Bern, 1849-1851, 1853.

 Wolff, Christian. Philosophia rationalis sive logica. Leipzig, 1732.

 Woodward, R. S. Higher Mathematics, chap. x. pp. 467, 507. " Probability
 and Theory of Error." New York, 1900.

 Probability and Theory of Errors. New York, 1906.

 Wyrouboff, G. " Le Certain et le probable." La Philos. posit. p. 165, 1867.
 
 Young, J. R. Elementary Treatise on Algebra, Theoretical and Practical, with
 an Appendix on Probabilities and Life Annuities. 4th ed. enlarged, post
 8vo. 1844.
 Young, Rev. M. " On the Force of Testimony in establishing Facts contrary
 to Analogy." Trans. Roy. Ir. Acad. vol. vii. pp. 79-118, 1800.
 Young, T. " Remarks on the Probabilities of Error in Physical Observations,
 etc." Phil. Trans., 1819.
 Yule, G. U. " On the Theory of Correlation." Journ. Stat. Soc. vol. lx.
 p. 812, 1897.

 " On the Association of Attributes in Statistics." Phil. Trans. (A), vol.
 194, pp. 257-319, 1900.
 " On the Theory of Consistence of Logical Class-frequencies." Phil.
 Trans. (A), vol. 197, pp. 91-132, 1901.

 An Introduction to the Theory of Statistics. Pp. xiii. + 376. London,
 1911.
 Yule and Galton. " The Median." Stat. Journ. pp. 392-398, 1896.
 
INDEX¹
 
 Acquaintance, direct, 12
 Addition, of probabilities, 37, 135 
 
 definition of, 120 
 
 Theorem of, 104, 121, 144 
 
 and measurement, 158 
 Analogy, principle of, 68 
 
 and induction, 218, 222 
 
 negative, 219, 223, 233, 415 
 
 positive, 220, 223, 415 
 
 and generalisation, 223 
 
 logical foundation of, 258 
 
 and Bacon, 268 
 
 and Leibniz, 272 
 
 and Jevons, 273 
 
 and statistics, 391, 407, 415 f. 
 Ancillon, 5 n., 82
 
 Apprehension, direct, and ethical judg- 
 ment, 316 
 Argument, 13 
 Aristotle, 80, 92 
 
 and induction, 274 
 Arithmetic mean (or average), 205 
 
 and laws of error, 197 
 
 Laplace on, 206 
 
 Gauss on, 206 
 Astronomers and Least Squares, 210 
 Asymmetry, and Bernoulli's Theorem, 
 
 358 f. 
 Atomic Uniformity, 249 
 Averages, 205 f. 
 
 weighting of, 211
 
 and discordant observations, 214 
 Axioms, 135 f. 
 
 non-self-evident, 299

 Bachelier, 347 n.
 
 and statistical frequency, 349 n., 351 
 
 and Rule of Succession, 376 n. 
 Bacon, 265 f. 
 
 tables of, 269 
 
 and limited variety, 270 
 
 Bayes, and Inverse Probability, 174 
 
 Theorem of, 379 
 Belief, rational, 4 f., 10, 16, 307 
 
 degrees of, 11 
 Bentham, measurement of Proba- 
 bility, 20 
 Bernoulli, Daniel, and Inverse Proba- 
 bility, 174 
 
 and planets, 293 n., 294 
 
 and Petersburg Paradox, 316, 317 
 Bernoulli, Jac, 15 n., 41, 76, 81, 83, 
 86, 368, 369 
 
 weight of evidence, 313 
 
 second axiom of, 322 
 
 and regular frequency, 333 
 
 and statistical series, 392 
 Bernoulli's Theorem, 109, 314, 319 n., 
 333, 337 f. 
 
 and asymmetry, 358 f. 
 
 empirical verification of, 361 f. 
 
 Inverse of, 368 f., 385 f.
 Bertrand, 48 n., 49 
 
 on multiplication, 136 
 
 and Maxwell, 172 n.
 
 and independence, 173 
 
 and Law of Error, 208 n. 
 
 and chance, 284 
 
 and Petersburg Paradox, 317 
 
 and Bernoulli's Theorem, 339 
 
 and Rule of Succession, 382
 Bicquilley and testimony, 184 n.
 Bobek and Rule of Succession, 383 
 Bode's Law, 304 
 Boole, 43 n., 50 n., 84, 294 n. 
 
 and German logicians, 87 
 
 and relation of Probability, 90 
 
 and symbolic probability, 155
 
 and approximation, 161 
 
 and independence, 167 
 
 and Whately, 179 
 
 and combination of premisses, 179 
 
 ¹ This Index does not cover the Bibliography.
 
 and testimony, 180 
 
 and Challenge Problem, 187 
 
 and Cournot, 284 n.
 
 and Rule of Succession, 382 
 Borel, 47 n., 48 
 
 Bortkiewicz, von, and great numbers,
 333 n.
 
 and Marbe, 365 n. 
 
 method of, 384 
 
 and Lexis, 393 f. 
 
 and Law of Small Numbers, 401 f. 
 
 and Quetelet, 402 
 Boscovitch and Least Squares, 210 
 Bowley, 421, 423, 424 f. 
 Bradley, 319 n. 
 
 and relativity of Probability, 91 
 
 and Bernoulli's Theorem, 341 n. 
 Broad, C. D., 257 n. 
 Bromse and Marbe, 363 n. 
 Briinn and lotteries, 364 
 Bruns and Marbe, 365 n. 
 Buffon, 317, 322
 
 and coin-tossing, 362 
 Butler, Bishop, 79, 80, 309, 310 
 
 and risk, 321 
 
 Calculus of Probability, 83 n., 149,
 164, 303, 428 
 
 and Psychical Research, 302 
 
 and Sociology, 335 
 ' Casual,' 288 
 Causality, 263, 276 
 
 and independence, 164 
 ' Cause,' 275
 Cause, final, 297 
 Cayley, and tradition, 185 
 
 and Challenge Problem, 187 
 Certainty, 10, 127, 128 
 
 and truth, 15 
 
 Kahle and, 90 n. 
 
 definition of, 120 
 
 relation of, 134 
 
 and Bacon, 267 
 
 and Leibniz, 272 
 Chance, objective, 281, 286 f., 295, 
 418 
 
 Couturat on, 283 
 
 Poincare on, 284, 289 
 
 Condorcet on, 284 
 
 definition of, 287 
 
 and planets, 293 
 
 and binary stars, 295 
 Coefficient of Credibility, 183 
 
 of Correlation, 421 f. 
 Combination of premisses, 149, 178 
 
 Comte, and ' seven,' 246 
 
 and statistics, 335 
 Condorcet, 83 n., 317 
 
 and testimony, 180 
 
 and chance, 282, 284 
 
 and ethics, 313, 316
 
 and gambling, 319 
 Conduct and Probability, 307 
 Consistence and group theory, 124 
 Contradiction, 143 
 Coover, J., 298 n. 
 Correlation, 329, 390 
 
 and statistical frequency, 330 
 
 Quantitative, 391, 426 
 
 Inductive, 406 
 
 coefficient, 421 f. 
 Cournot, and frequency theory, 92
 
 and independence, 166 
 
 on testimony, 180 
 
 and causality, 275 
 
 and chance, 282, 283 
 Couturat, 272 n., 311 n. 
 Craig and tradition, 184 
 Cramer and Petersburg Paradox, 318 
 Crofton, 47 n. 
 Cumulative Formula, 150 
 
 Johnson and, 121 
 Czuber, 47 n., 78, 82, 86, 339 n.,
 345 n., 347
 
 and symbolic probability, 156 
 
 and ' cause,' 275 n. 
 
 and risk, 315 n. 
 
 and Bernoulli's Theorem, 340 n. 
 
 and statistical frequency, 351, 394 
 
 and Tchebycheff's Theorem, 353 n.,
 355 n.
 
 and verification of Bernoulli, 362 n. 
 
 and lotteries, 364 
 
 and Marbe, 365 
 
 and Inverse of Bernoulli's Theorem,
 370 n.

 and Rule of Succession, 376 n., 382

 D'Alembert, 82, 170 n., 321, 365 n., 369
 
 and chance, 282 
 
 and planets, 293 
 
 and mathematical expectation, 314 
 
 and ethics, 316 
 
 and Petersburg Paradox, 317 
 
 and Marbe, 365 
 Darbon, A., and Cournot, 284
 Darwin, 108 
 
 and Lyell, 161 
 
 and Mill, 265 
 Dedekind and ' Challenge Problem,'
 187 n.
 
 Definitions, 134 f. 
 
 summary of, 120 
 de la Placette, Jean, and chance, 283
 De Morgan, 21, 74, 83 
 
 and inference, 139 
 
 and independence, 168 
 
 and Inverse Probability, 178 
 
 and combination of premisses, 179 
 
 and tradition, 184 n. 
 
 and planets, 293 
 
 pupil of, 362

 and Inverse of Bernoulli's Theorem,
 370 n.
 
 and Rule of Succession, 375, 382 
 De Witt and arithmetic averages, 206 
 Dice-tossing, 361 f. 
 Diderot on testimony, 183 
 Discordant observations, rejection of, 
 
 213 
 Donkin, W. F., 20
 
 and Inverse Probability, 176 
 Dormoy, 394 
 
 Edgeworth, 29 n., 84, 85, 362 n., 379, 
 400 
 
 use of ' Probability,' 96 n.
 
 and randomness, 290 
 
 and Psychical Research, 298 n. 
 
 and ethics, 316 
 
 and German statisticians, 394 
 Eggenberger, 340 n. 
 Ellis, Leslie, 84, 85 
 
 and frequency theory, 92 
 
 and Least Squares, 207 n., 209 
 
 and Bacon, 265 n., 266 n., 269 n.,
 271 n., 274 n.
 
 and Bernoulli's Theorem, 341 
 Empirical School, 85, 86 
 Epistemology, 302 
 
 and inductive hypothesis, 261 
 Equiprobability, 41, 63, 65
 Equivalence, definition of, 120, 134 
 
 axiom of, 135 
 
 principle of, 141 
 Error, probable, 329 
 Ethics, 307 f. 
 
 Euler and Least Squares, 210 
 Event, probability of, 5 
 Evidence, and measurement of Prob- 
 ability, 7, 35 
 
 relevant and irrelevant, 53, 54 
 
 independent and complementary, 55 
 
 external, 57 
 
 addition of, 66, 68 
 
 weight of, 71 
 
 and Induction, 221 
 
 Excluded Middle, Law of, 143 
 Experience and the Principle of
 Indifference, 100

 Fechner, and median, 201
 and law of sensation, 208
 and lotteries, 364

 Fermat, formula of, 242

 Forbes, J. D., 20 n., 21, 294 n.

 Frazer, Sir J., 245
 
 Frequency curves, 199 
 and statistics, 328 
 
 Frequency, statistical, 330 
 
 Frequency theory, 92 f. 
 and randomness, 290 
 and Bernoulli's Theorem, 344
 and Rule of Succession, 378

 Fresnel and simplicity, 206
 
 Fries, 15 n. 
 
 Galton, 321 
 
 and Fechner's law, 208 
 Gambling, 319 
 Gauss, and laws of error, 196 n., 198 
 
 and arithmetic mean, 206 
 
 and Least Squares, 210 
 Generalisation, 389 
 
 definition of, 222 
 
 from statistics, 328 
 Generator properties, 253 
 
 plurality of, 254, 256, 257
 Geometrical probability, 47, 62
 German logicians, 87
 Gibbon, 29, 322, 333
 Gilman, B. I., and symbolic probability,
 156
 Goldschmidt, 29 n.
 Goodness, organic nature of, 310
 Graunt, 392 n.
 
 Great Numbers, Law of, 82, 330, 333 f. 
 Greville, Fulke, 466 
 Grimsehl, 248 n. 
 
 and Marbe, 365 n. 
 Groups, of propositions, 117, 124 
 
 definition of, 120, 125 
 
 real and hypothetical, 129
 Griinbaum and Marbe, 365 n. 
 
 Hagen, and error, 207 
 
 and discordant observations, 214 n. 
 Halley and mortality statistics, 332 
 Herodotus, 307 
 Herschell and binary stars, 294 
 Houdin, 364 n. 
 
 Hudson, W. H., and animism, 247 n.
 Hume, 52, 70, 80, 81, 82, 83, 239, 427 
 
 and testimony, 182 
 
 and Induction, 218, 233, 265, 272
 
 and analogy, 222, 224 
 
 and chance, 282 
 Huyghens, 82 
 
 and ' six,' 247
 Hypothesis, 7 
 Hypothetical entities, 299 
 
 Implication, 124 
 Impossibility, 15 
 
 definition of, 120 
 
 relation of, 134 
 Inconsistency, definition of, 120 
 Independence, for knowledge, 107, 
 165 
 
 definition of, 120, 138
 
 Theorem of, 121, 146 
 
 of events, 164 
 
 and law of error, 195 
 
 and measurement, 204 
 
 and averages, 212 
 
 and discordant observations, 214 
 
 and chance, 283 
 Index numbers, 211 
 ' Induction,' 274 
 Induction, 97 
 
 Principle of, 68 
 
 and frequency theory, 98, 99, 107 
 
 and Logic, 217 
 
 pure, 218 
 
 universal, 220, 406, 417 
 
 validity of, 221 
 
 and statistics, 327 f. 
 
 statistical, 406 f.
 Inductive correlation, 220, 257, 258, 
 
 392, 397, 406 
 Inductive hypothesis, 260, 264 
 Inductive method, 260 
 Inference, 129 
 
 necessary, 120, 139 
 
 hypothetical and assertoric, 130
 
 statistical, 327 f. 
 Insurance, 22, 285, 404 
 Intuition versus experience, 86 
 
 and ethical judgment, 312 
 Inverse Probability, 149, 174 
 
 and Venn, 100 
 
 and frequency theory, 106 
 
 Theorem of, 121 
 
 and statistics, 369, 370 n. 
 
 and Bowley, 425 
 Irrelevance, 255 
 
 judgments of, 54 
 
 definition of, 55, 120, 138 
 
 Theorem of, 121, 146 
 
 James, W., and spirits, 301 
 Jesuits, 308 
 Jevons, 244 n. 
 
 and equiprobability, 42 n.
 
 and Inverse Probability, 178 
 
 and index numbers, 212 
 
 and Induction, 222, 238, 243, 265,
 273, 274
 
 and analogy, 246 
 
 and coin-tossing, 362 
 
 and Rule of Succession, 382 
 Johnson, W. E., 116
 and propositions, 11 n.
 
 and added evidence, 68 
 
 and cumulative formula, 121, 150, 
 153, 155 
 
 and groups, 124 
 
 and testimony, 183 
 Judgments, 54 
 
 of preference and relevance, 65 
 
 direct, 70 
 
 disjunctive, 77 
 
 Kahle and the Probability relation, 90 
 Kant, 333 
 
 and Hume, 272 
 Kapteyn, Prof. J. C., and law of
 
 error, 199 
 Knowledge, 10 
 
 kinds of, 3, 4 
 
 direct and indirect, 12, 262 
 
 incomplete and proper, 13 
 
 of logical relations, 14 
 
 probable and vague, 17 
 
 relativity of, 17 
 
 vague and distinct, 53 
 
 nomologic and ontologic, 276, 288
 
 and ignorance, 281 
 
 and chance, 289 
 Kries, von, 42, 44 n., 45 n., 46 n., 50, 
 67 n., 84 
 
 and equiprobability, 87

 and Principle of Indifference, 172
 
 and independence, 173 
 
 and Inverse Probability, 176 
 
 and knowledge, 276 
 
 and Cournot, 284 n.
 
 and School of Lexis, 394 
 
 Lacroix, 184 n.

 Lambert and Least Squares, 210

 Lämmel, 47 n.

 and symbolic probability, 156
 Laplace, 15 n., 28 n., 31, 82, 83, 84,
 318, 427
 
 school of, 44, 51, 86, 358, 365 
 
 and relation of Probability, 91 
 
 and independence, 170 
 
 and Inverse Probability, 175, 178 
 
 and testimony, 180, 182 
 
 and doctrine of averages, 202 
 
 and arithmetic mean, 206 
 
 and Least Squares, 210 
 
 and Induction, 220, 239, 265, 273 
 
 and chance, 282 
 
 and planets, 293 n. 
 
 and Quetelet, 334
 
 and Bernoulli's Theorem, 340, 341, 
 370 
 
 and Rule of Succession, 351 n.,
 359 n., 368 
 
 and birth proportions, 364 
 
 and unknown probabilities, 370 
 
 and Bayes' Theorem, 380 
 
 and statistical series, 392 
 Laurent and gambling, 319 
 Law, 311 n. 
 Law of error, 194 f.
 
 and arithmetic mean, 197 
 
 and geometric mean, 198 
 
 and median, 200 
 
 and mode, 203 
 
 normal law, 199, 202, 205 
 
 Lexis and, 398 
 Least Squares and Venn, 206 
 
 method of, 202, 205, 206, 209 
 Lee and tradition, 184 n. 
 Legendre and Least Squares, 210 
 Leibniz, 24 n., 308, 368, 392, 427
 
 and arithmetic average, 206 
 
 and Induction, 272 
 Lexis, and asymmetry of statistical 
 frequency, 359 n. 
 
 and Marbe, 365 n. 
 
 method of, 384, 393 f., 397 f. 
 
 and Edgeworth, 401 
 
 and statistical stability, 415, 419 n. 
 Locke, 76, 80, 82, 83, 308, 323 
 
 on tradition, 184 
 
 and weight of evidence, 313 
 Logic, academic, 3 
 
 of probability, 8
 
 of implication, 58 
 
 and Induction, 217, 245 
 
 and initial probability, 299 
 Logical priority, 129 
 Lotteries, 333 n., 361, 364 f. 
 
 published results of, 363 
 Lotze, 89 
 
 and Rule of Succession, 382 
 Lucretius, 427 
 
 M'Alister, Sir Donald, and laws of 
 
 error, 198 
 Macaulay and Bacon, 266 
 McColl, and symbolic probability,
 155 
 
 and Boole, 167 n. 
 
 and Inverse Probability, 176 
 
 and ' Challenge Problem,' 188 n. 
 Macfarlane, and independence, 169 n. 
 
 and tradition, 185 
 
 and ' Challenge Problem,' 187 n. 
 Maclaurin, Theorem of, 207
 Marbe, Dr. Karl, and roulette, 365 
 Marginal utility, 318 
 Markoff, A. A., 177 n.

 and Inverse Probability, 176

 and Tchebycheff's Theorem, 357
 Mathematical Expectation, 311, 315, 
 
 316 
 Mathematicians, and probability, 84 
 
 and cumulative formula, 152 
 
 and laws of error, 207 
 
 and ethics, 316 
 Maxim, Sir Hiram, 364 n. 
 Maxwell, 172 n. 
 
 and theory of gases, 172 
 Mayer and Least Squares, 210 
 Means and laws of error, 194 f. 
 Measurement of Probability, 34, 158, 
 311 
 
 and frequency theory, 94 
 
 and induction, 259, 388 
 
 and psychical research, 302 
 
 and ethics, 311 
 Median and laws of error, 200 
 Meinong, 78 
 
 Meissner, Otto, and dice-throwing, 363 
 Memory, 14 
 Mendelism and statistics, 335, 419,
 428
 Merriman, Mansfield, and Least 
 
 Squares, 209 
 Metaphysics and certainty, 239 
 Method of Difference, 246 
 Michell, 302 
 
 and Inverse Probability, 174 
 
 and binary stars, 294 
 Middle Term, Fallacy of, 68, 155 
 Mill, and inductive correlations, 220 
 
 and induction, 265 f. 
 
 and plurality of causes, 267 n.

 and probability, 268 n.
 
 and pure induction, 269 
 
 methods of, 270 
 
 and limited variety, 271 
 Modality and probability, 16 n.
 
 Venn and, 98 
 Mode, and law of error, 203 
 
 asymmetry about, 361 
 Monte Carlo, 364 
 Moore, G. E., 19, 240 n., 309
 Morgan, vide De Morgan 
 Multiplication, 135 
 
 definition of, 120
 
 theorems of, 121, 148, 342 
 
 of instances, 233 f. 
 Munro, 370 n. 
 
 Necessary connection, law of, 251 
 Newton, and induction, 244 
 
 and 'seven,' 247 
 
 and Bacon, 265 
 Nitsche, A., 45 n., 50 n., 78, 172 n. 
 
 Occurrences, remarkable, 302 
 
 Pascal, 82 
 
 Pearson, Karl, 84, 351 n. 
 
 and frequency theory, 100 
 
 and arithmetic mean, 208 
 
 and stars, 297 
 
 and asymmetry, 347, 359 n. 
 
 and generalised Probability curves, 
 347 
 
 and roulette, 364 
 
 and Rule of Succession, 379, 382 
 Peirce, 50 n., 304 
 
 and randomness, 290 
 Petersburg Paradox, 316 
 
 psychology of, 318 
 
 and Buffon, 362
 Peterson and tradition, 184
 Physics and initial probability, 299
 Planets, movements of, 293
 Playfair, Dr. Lyon, 305
 Plurality of causes and Mill, 267
 Poetry and statistics, 401 
 Poincare, Henri, 48, 84 
 
 and independence, 173 
 
 and chance, 284, 289 
 Poisson, 51 n., 362 n.
 
 on testimony, 180 
 
 and least errors, 207 
 
 and Petersburg Paradox, 317 
 
 and gambling, 319 
 
 and great numbers, 333, 336 
 
 Theorem of, 344 
 
 and statistical frequency, 348 
 
 and Tchebycheff, 357 
 
 and inverse of Bernoulli's Theorem, 
 370 
 
 Poretzki, Platon S., and symbolic 
 
 probability, 157 
 Port Royal logic, 70, 80, 321 
 
 and probabilism, 308 
 Prediction, value of, 305 
 Price and Bayes, 174 n. 
 Primitive people and rational belief, 
 
 245 
 Principle of compelling reason, 86 
 Principle of Indifference, 42, 81, 83 f., 
 87, 104, 107, 171 
 
 analysis of, 63 
 
 modification of, 65, 58 
 
 and induction, 99 
 
 and measurement, 160 
 
 and Psychical Research, 302 
 
 and ethics, 310 
 
 and statistics, 367 
 
 and Laplace, 372, 374 
 
 and Rule of Succession, 377 
 Principle of Non-Sufficient Reason, 41, 
 
 85 
 Principle of superposition of small
 effects, 249
 Probabilism, 308 
 ' Probability,' 8 
 
 Venn's use of, 95 
 
 Edgeworth's use of, 96 n. 
 Probability, and relevant knowledge, 
 4 
 
 objective relation of, 5, 8, 281 
 
 mathematical, 6 
 
 dependent on evidence, 7 
 
 philosophical definition of, 8 
 
 three senses of, 11 
 
 measurement of, 20 f., 37 
 
 and law, 24 
 
 and similarity, 28, 36 
 
 comparison of, 34, 66, 160 
 
 series of, 35, 38 
 
 ' geometrical,' 47, 48, 62 
 
 and rational belief, 97
 
 and statistical frequency, 98 
 
 and truth frequency, 101 f., 337 f. 
 
 Inverse, 106, 149
 
 and truth, 116, 322 
 
 negative, 139 
 
 finite, 237 
 
 and randomness, 291 
 
 and planetary orbits, 293 
 
 and binary stars, 294 
 
 and star drifts, 296 
 
 and final causes, 297 
 
 and spirits, 300, 301 
 
 and telepathy, 300 
 
 and ethics, 307 
 
 from statistics, 367 f. 
 
 ' unknown,' and Laplace, 372
 Probability relation, 4, 8, 13, 134 
 
 intuition of, 52 
 Probable error, 74 
 Proctor, 364 n. 
 Proposition, characterisation of, 3, 4 
 
 primary and secondary, 11, 13 
 
 knowledge of, 12 
 
 self-evident, 17 
 
 classes of, 101 f. 
 
 groups of, 117, 124 
 
 sub-groups of, 126, 129 
 
 disjunction and conjunction of, 134 
 
 synthetic, 263 
 
 existential, 276 
 Propositional function, 56 
 
 and induction, 222 
 
 and randomness, 291 
 Psychical Research, 278 f. 
 Psychology and probability, 52 
 Pythagoras and ' seven,' 246 
 
 Quetelet, 333 n., 334, 335, 401, 418, 
 427, 428 
 and arithmetic mean, 208 
 and balls, 362 
 and statistical stability, 393 
 
 Randomness, 281, 290, 412 
 
 Pearson's use of, 297 
 Relation, of probability, 6 
 
 of ' between,' 35, 39 
 Relativity, of knowledge, 17 
 
 of probabilities, 102 
 
 doctrine of, and the Law of Uni- 
 formity, 248 n. 
 Relevance, judgments of, 54 
 
 and frequency theory, 104 
 
 theorems of, 147 
 Remarkableness, 302 
 Requirement, 129 
 Risk, 316 
 
 and ethics, 313 
 
 and Petersburg Paradox, 319 
 
 ' moral,' 320, 322 
 
 ' physical,' 322 
 Roulette, 361, 364 
 
 published results of, 363 n. 
 Rule of Succession, 359 n., 368, 372, 
 374 
 
 proof of, 375 
 
 and frequency theory, 378 
 
 and Pearson, 380 n. 
 Russell, Bertrand, 19, 115, 124 n., 126 
 
 Russell, Bertrand (contd.) — 
 and inference, 117 
 and implication, 124 
 
 Schematisation, 67 
 
 Schröder and symbolic probability, 
 
 157 
 Selection, random, 292 
 Series of probabilities, 35, 38 
 
 and frequency theory, 93 
 
 independent, 283, 420 
 
 organic, 399, 420 
 
 Gaussian, 421 n. 
 Sigwart, 88 
 
 and inverse probability, 178 
 
 and Induction, 273 
 Simmons and asymmetry in Bernoulli's 
 
 Theorem, 359 
 Simpson and Least Squares, 210 
 Small Numbers, Law of, 401 f. 
 Society for Psychical Research, 298 n. 
 Space, 255 
 
 and uniformity, 226 
 
 irrelevance of, 301 
 Spedding and Ellis and Bacon, 265 n., 
 
 266 n. 
 Spielräume, doctrine of, 88 
 Spinoza, 116 n., 282 n. 
 Spirits, probability of, 300 
 Star drifts, 296 
 Stars, binary, 294 
 Statistical frequency, theory of, 93 f. 
 
 generalisation of, 101 
 
 criticism of, 103 
 
 stability of, 336, 392-415 
 
 fluctuation of, 392 
 Statistical inference, 327 f. 
 
 induction, 406 f. 
 Statistics, and prediction, 306 
 
 descriptive and inductive, 327 
 Stumpf, 44 n., 50 n., 172 n. 
 Sub-analogies, 223, 229 
 Sub-groups of propositions, 126, 129 
 Succession, Law of, 82 
 
 See Rule of 
 Süssmilch and regular frequencies, 
 333 
 
 Taylor, Jeremy, 308 n. 
 Tchebycheff, Theorem of, 353, 355 
 
 and Poisson's Theorem, 357 
 Telepathy, probability of, 300 
 Terrot, Bishop, 43 n. 
 
 and Whately, 179 n. 
 
 and combination of premisses, 179 
 Testimony, theory of, 180 
 
 Time, 256 
 
 and uniformity, 226 
 
 irrelevance of, 301 
 Todhunter, 294 n., 318 n., 370 n. 
 
 and Bayes, 175 
 
 and Craig, 184 
 
 and Petersburg Paradox, 316 
 
 and Bernoulli's Theorem, 340 n. 
 Truth and probability, 116 n., 322 
 Truth frequency, 101, 406 
 Tschuprow, 358, 399 n. 
 
 and statistical frequency, 348, 394 n. 
 
 method of, 384 
 
 Uniformity of Nature, Law of, 226, 
 
 248, 255, 263, 276 
 and Mill, 270 
 Universal Causation, Law of, 248 
 Universal Induction and statistical 
 
 methods, 389, 406-417 
 Universe of reference, 117, 129, 130 
 Unknown probabilities, 372, 373, 
 
 375 
 
 Variables in Probability, 58, 123, 412 n. 
 
 Variety, 234 
 
 and induction, 219 
 limitation of, 258, 260, 427 
 
 Venn, 84, 106 n., 294 n. 
 
 Venn (contd.) — 
 and experience, 85 
 and Bernoulli, 86, 341 
 and frequency theory, 93 f. 
 and inverse probability, 100 
 and Least Squares, 206 n. 
 and induction, 273 
 and chance, 288 
 and ' random,' 290 
 and Rule of Succession, 372, 378, 382 
 
 Weight, of evidence, 312 
 and ethics, 315 
 
 Weighting of averages, 211 
 
 Weldon and dice, 362 
 
 Whately and combination of pre- 
 misses, 178 
 
 Whitehead, and frequency theory, 101 
 and invalid inference, 329 n. 
 
 Whittaker, E. T., and Rule of Succession, 
 376 n. 
 
 Wilbraham, H., and Boole, 167 n. 
 
 Wolf and dice, 362 
 
 Yule, 349 n., 361 n. 
 and approximation, 161 
 and independence, 166 
 and ' statistics,' 327 
 and coin-tossing, 346 n., 361 n. 
 and correlation, 421, 424 
 
 False and treacherous Probability, 
 Enemy of truth, and friend to wickednesse ; 
 With whose bleare eyes Opinion learnes to see, 
 Truth's feeble party here, and barrennesse. 
 
 THE END 
 
 Printed by R. & R. Clark, Limited, Edinburgh. 
 
BY THE SAME AUTHOR 
 8vo. 8s. 6d. net. 
 
 THE ECONOMIC 
 CONSEQUENCES OF THE PEACE 
 
 TIMES. — "Mr. Keynes's work on the Peace Conference is 
 one of a calibre quite different from any of those others which we 
 have hitherto received. Mr. Keynes writes with knowledge; he 
 was himself one of the chief actors in the Conference, and his book 
 is an important political event. . . . Mr. Keynes brings great 
 literary ability, a broad view, a clear grasp of general principles, to 
 bear upon the very complicated matters with which he is occupied, 
 and in his hands these questions of coal, exchange, and reparation 
 can be read with pleasure by the non-technical student." 
 
 WESTMINSTER GAZETTE. — "Mr. Keynes's very remarkable 
 book betrays a grasp of the subject which could only have 
 been derived from personal experience at the Conference itself." 
 
 ATHENÆUM. — "It is a perfectly equipped arsenal of facts 
 and arguments, to which every one will resort for years to come who 
 wishes to strike a blow against the forces of prejudice, delusion and 
 stupidity. . . . Never was the case for reasonableness more power- 
 fully put. It is enforced with extraordinary art. What might easily 
 have been a difficult treatise, semi-official or academic, proves to be 
 as fascinating as a good novel : it has all the merits — the accuracy, 
 the method, the well-considered arrangement — of the best kind of 
 State Paper, with none of the shortcomings." 
 
 LONDON: MACMILLAN & CO., Ltd. 
 
BY THE SAME AUTHOR 
 8vo. 7s. 6d. net. 
 
 INDIAN 
 CURRENCY AND FINANCE 
 
 ECONOMIC JOURNAL. — "The book is, and is likely long 
 to remain, the standard work on its subject. . . . While academic 
 students will be grateful for this acute and informing work, it will 
 be read with as much interest, and perhaps even greater appreciation, 
 by men of business and affairs." 
 
 SPECTATOR. — "Mr. Keynes's careful and disinterested study 
 of the monetary facts of twenty years, and his methodical marshal- 
 ling of facts and figures, will be useful even to those — and they will 
 probably be few — who are not convinced by his reasoning." 
 
 CLARE MARKET REVIEW. — "By his really masterly 
 treatment of the Indian currency system, the author has made a 
 very valuable addition to our economic literature. . . . Mr. Keynes 
 has succeeded admirably in both of the chief tasks which the writing 
 of his book involved — those, namely, of explaining, and of uphold- 
 ing, the system." 
 
 ECONOMIST. — "A searching, well-informed, and admirably 
 lucid survey." 
 
 BANKERS' MAGAZINE. — "Written in an attractive manner, 
 without undue repetition or employment of charts or tables, the 
 work is of value to all students of currency matters." 
 
 LONDON: MACMILLAN & CO., Ltd.