BEVERLEY EDUCATIONAL SERIES
EDITED BY W. W. CHARTERS
PROFESSOR OF EDUCATION, UNIVERSITY OF ILLINOIS

SCHOOL STATISTICS AND PUBLICITY

BY CARTER ALEXANDER
FIRST ASSISTANT STATE SUPERINTENDENT OF PUBLIC INSTRUCTION FOR WISCONSIN
SOMETIME PROFESSOR OF SCHOOL ADMINISTRATION, GEORGE PEABODY COLLEGE FOR TEACHERS

SILVER, BURDETT AND COMPANY
BOSTON NEW YORK CHICAGO

Copyright, 1919, by CARTER ALEXANDER
All rights reserved

EDITOR'S PREFACE

We hear from the forum and pulpit that reconstruction must follow a war of such magnitude as that just closing. But the laws of habit still operate and, if permitted to do so, the nation will return, in fact will prefer to return, to those accustomed grooves of thought and action from which it has been so vigorously shaken.

This return is not possible in those cases which have an economic basis. The industrial world will be compelled by the insistence of labor to change its ideals and practices, and problems of finance, taxes, and revenues will force reconstruction in the field of politics. But education will not feel the insistent urge of economic forces unless some obvious catastrophe, such as the spectacle of schoolrooms unprovided with teachers because of public parsimony, visualizes the crisis for the taxpayers. Situations less obvious than this the citizen does not see and understand. It is difficult to show him that low salaries may result in other evils as serious as the corporeal absence of teachers from the classroom. He must be shown that not only must teaching be good but that better buildings, better sanitation, curriculum changes, compulsory education, and vocational education must be provided.
Sonorous or staccato generalities about reconstruction which, I venture to say, in less than a twelvemonth of our swift-moving current of national life will have become platitudinous, will not produce that sympathetic attitude toward educational reconstruction which is necessary for its processes to be carried on. Antebellum habits will wage a momentous warfare upon transitory emotional ideals.

And, even though the present moment should be pregnant with sympathy for the improvement of the schools, other obligations compete for a place in the necessarily limited field of public attention. National taxes compete with school taxes, political conflicts are more prominent than school necessities, and industrial readjustments obtain readier attention than educational reconstruction.

To meet this situation the man of the hour is the superintendent of schools. The problem of crystallizing the liquid desires of the public for educational progress is as squarely placed upon his shoulders as any human task has ever rested upon any individual. For education is personal. The nation is educated in community groups. Just as the squad is the unit of the army, the community is the unit of the nation, and just as the non-commissioned officer who leads the squad is the "backbone of the army," so the superintendent who leads his community is the backbone of the forces of education. The educational army is just as dependent upon his intelligence and industry as the fighting army is dependent upon the corporal.

The chief weapon for leading the people of a community in educational activity is publicity. And this may be obtained in three ways. One method is to develop excellent schools and let the work speak for itself through satisfied parents, loyal teachers, and efficient children.
But this method as the sole method is open to the criticism that parents and children cannot clearly distinguish between good school systems and inferior systems, and to the further criticism that if people are too well satisfied with their schools they are not sensitive to the need for better schools. So, important as the maintenance of a good system is, its presence does not insure knowledge of its excellence and needs.

A second method is personal explanation of what the school is doing and attempting, carried on by conferences with the school board, by public meetings, and by private conversation. But a third method to be added to these is the superintendent's annual report and printed communications, in which a wider audience is reached. All three of these methods are used by those superintendents who have been most successful in molding public opinion.

But, as Mr. Alexander points out, seventy per cent of a group containing one hundred twenty-eight members of two of the most distinguished organizations of an intelligent community stated that they did not read the school board reports. And this proportion is small for the nation as a whole.

Upon the problem of making the superintendent's report readable by his community this text is directed. The author attacks the whole problem from the collecting of the data and their statistical treatment, to the presentation of his findings in simple and graphic form. It is, therefore, presented as a notable attempt to make known to the public those inner workings of the school, to the end that fluid educational interest may take on stability of action directed toward progressive ends.

AUTHOR'S PREFACE

The admitted ineffectiveness of our school statistics is due mainly to two causes. First, many superintendents do not know how to apply statistics as well as do their equals in intelligence in other fields.
Second, these superintendents do not know, because they have been too busy to learn, how to present statistical matter to the public effectively.

As the experience of publicists in other fields shows, the ways of doing these things are simple. But not enough advantage of the labors of these men has been taken, nor have their results been adapted for easy and quick use by the busy superintendent. There is pressing need for a book which will do these very things.

The aim of this book, then, is to make available for superintendents and classes in school administration the results of years of study of statistical theory and its applications to school data for publicity purposes, as shown in school reports and surveys.

These results have all passed through the fires of criticism of the large number of practical school men whom it has been the author's good fortune to have as students during the past six years. In particular, he wishes to acknowledge his great indebtedness to the men of Education 245, his graduate course at Peabody, whose generous aid and criticism have contributed much to the practical helpfulness of the book.

Acknowledgments are also gratefully made to the following: Professor E. L. Thorndike of Teachers College, who introduced the writer to statistical method; the editor, Professor W. W. Charters, who has given valuable suggestions for modifying the form of the book; Mr. W. C. Brinton and Professors J. F. Bobbitt and W. I. King, on whose writings the author has drawn freely; Dr. W. W. Theisen, supervisor of educational measurements in the Wisconsin State Department of Public Instruction, who has read part of the proof; the various writers and foundations who have given permission for the reproduction of copyrighted charts; Messrs. C. H. Moore, E. McK. Highsmith, and S. C. Garrison for careful criticism of the manuscript.
A special feature of this book is that all but five of the cuts are from drawings made by the boys in the Technological High School at Atlanta under the direction of Mr. E. S. Maclin, at that time head of the drawing department there. Mr. Maclin, who was one of the men in Education 245, has kindly given his services to produce this practical demonstration of the possibilities in utilizing high school students on school publicity. No doubt there are some slight errors or irregularities to be found in the cuts. But all such imperfections are only evidences of the genuineness of the demonstration.

The school superintendent must be a publicist. He must make reports to the public. In many places for the next decade at least he must fight as hard as any officer in the trenches to ward off the incessant and fierce attacks made upon his school appropriations by politicians and hard-pressed but unthinking taxpayers. For warding off or beating back such attacks, his most effective weapons will be reports containing simple but skillful statistical devices for presenting the claims of the school children. Unless he has such weapons, the enemy will be liable to sweep over the schools and place them on a starvation diet.

If this book in any material way helps the superintendent of schools, active or in training, to arouse his public and to secure adequate support for his school, it will fulfill its mission.

Carter Alexander.

SUGGESTIONS FOR USING THE BOOK

I. TO THE SUPERINTENDENT IN THE FIELD

The active superintendent, accustomed as he is to adapting materials quickly to his own problems, of course needs few suggestions for using this book. But to save his time, a few are given.

1. All of this book except Chapters V to VII can be read with little effort by any active school man. Applications to his own work will constantly come to mind.

2.
A beginner in statistical method can attain a fair acquaintance with the matters involved in Chapters V to VII by a careful reading of the text. But a usable knowledge of these chapters, for one not previously familiar with statistical method, can hardly be expected unless the exercises or similar ones are done as indicated in the suggestions for the instructor of future superintendents. For such exercises, the superintendent generally has on hand several administrative problems that would be greatly simplified by adequate statistical treatment. His annual report, in particular, will furnish material for practically every exercise in the book.

3. The table of contents and the index will enable him to use the book as a reference book for securing material or suggestions for procedure on school problems necessitating statistical treatment. He can also quickly get ideas and devices for presenting his results to the public.

4. If he has plenty of time, the procedure advocated for students in training will be very profitable.

II. TO THE INSTRUCTOR ENGAGED IN TRAINING FUTURE SUPERINTENDENTS

The experience of the writer with the active superintendents and future superintendents in his own classes indicates the following:

1. It is advisable to have a good supply of supplementary material for practice work and concrete illustrations. At least one copy of each reference given in the bibliography in the back of the book is advisable. In addition, one should secure before the class starts a fairly complete collection of typical annual reports and publicity pamphlets of the city and county superintendents in the district, state, and section of the country from which the students come. It is well to have several copies of the better ones of these, but for most of them one copy each is sufficient.
These should be distributed among the members of the class, each being allowed to choose those he especially cares for and each being responsible for his own set. All students may be required to become familiar with the better ones for which duplicate copies are available.

2. Portions of the text may be prepared for class discussion as in any textbook course. But the instructor may test the extent to which the students have mastered the text and create much more interest by constantly expecting them to bring in pertinent illustrations of the points under discussion. These illustrations should be obtained from the reports or other material for which the student is responsible, from his previous experience, or from current magazines and advertisements.

3. A mastery of the text, however, can be assured only when students prove their ability to handle the exercises in this book, or similar ones which the instructor may easily make up or select from the references in the bibliography. "To know whether any one has a given mental state, see if he can use it."

Answers to exercises have been purposely omitted. Most students come to such exercises with a habit of striving for an accuracy to several decimals in the answer rather than of concentrating their attention on the principles involved. The particular decimals secured in any answer will often vary considerably according to the grouping used, the extent to which decimals are carried out, and so forth. Again, many administrative problems involve original data based on approximate measures, where attempts to secure absolute numerical accuracy are a sheer waste of time for everybody. By checking the procedure and answers of various students against each other, a sufficiently accurate answer will be obtained in the class. To secure this result, however, the instructor should see that the exercises are worked out in full and handed in.
Judging by the writer's experience, it is hardly worth while to discuss any of the exercises that involve computation until they have been completely written out by the students. The work of both student and instructor will be greatly reduced for exercises which occur on pages 38 to 316, by utilizing the work already done on a problem, merely adding the work for the particular point to be illustrated and handing in the exercise again.

If there is time and the students are sufficiently mature, the problem started in the exercises on pages 38, 43, 57, 61, 81, 89 may be continued throughout the course by each student with excellent results. It is not continued in this book because of the difficulty of keeping a group together on such work in the usual time allotted to recitations. But in the writer's own classes, his students continue to work outside on such problems and make oral or written reports at the end of the course. Problems of this nature interest students greatly. Any instructor can find many opportunities for them in the local city school system, with resulting profit to the system.

4. For the convenience of the instructor and a few of the students who evince unusual interest or capacity for the work, a few references are given at the end of various chapters. It is not, however, advisable to use many of these for class reading. In attempting to understand statistical method, the student must, above everything else, avoid confusion and keep his feeling of mastery over what he has done. But he will find the various writers taking up topics in different sequence and using different definitions or different technical terms in a manner very confusing to a beginner. Much better results will be secured if the instructor will use most of the references himself and adapt the additional material he wishes to the plan of this book and the special needs of his students.
Experiments with the writer's own classes show that a satisfactory working knowledge of the material presented in this book can be comfortably acquired in from twenty-five to forty-five class periods. The shorter time will, of course, reduce the use of exercises. The longer time will allow abundant opportunity for the use of the exercises and special problems advocated. The book can thus be used in a six weeks' summer school, in a corresponding period during the regular session, or in whatever longer period is available.

TABLE OF CONTENTS

Chapter One. Why We Need Better School Statistics . . . 1
    I. Common Errors
    II. Wastefulness of These Errors
    III. Indifference of the Public
    IV. Progress Elsewhere in Using Statistics
    V. Unsatisfactory State of Affairs in School Statistics

Chapter Two. Collection of Data . . . 33
    I. When to Use Statistical Method
    II. How to Plan Statistical Treatment of Problems
    III. How to Determine Units and Scales
    IV. How to Do the Actual Collecting

Chapter Three. Technical Methods Needed in School Statistics . . . 90
    I. Usual Views
    II. Statistical Knowledge Needed for School Surveys
    III. Statistical Knowledge Needed for Reading Educational Investigations
    IV. Illustration of Value of Statistical Method to the Superintendent
    V. Statistical Method as a Form of Expression

Chapter Four. Scales, Distribution Tables, and Surfaces of Frequency . . . 100
    I. Scales
    II. Distribution Tables
    III. Surfaces of Frequency

Chapter Five. Measures of Type . . . 124
    I. The Mode
    II. The Median
    III. The Average
    IV. Which Measure of Type to Use in a Given Distribution

Chapter Six. Measures of Deviation or Dispersion . . . 149
    I. Extreme Range Variation
    II. Quartile Deviation
    III. Other Percentile Deviations
    IV. Median Deviation
    V. Average Deviation
    VI. Standard Deviation
    VII. Deviations for Skew Distributions
    VIII. Which Measure of Deviation to Use in a Given Distribution

Chapter Seven. Measures of Relationships . . . 164
    I. Relationships Inside of One Group
    II. Simple Relationships between Different Distributions
    III. Coefficient of Variability or Dispersion
    IV. Correlation

Chapter Eight. Supplement on Statistical Treatment . . . 187
    I. Reliability of Statistical Results
    II. Special Economies in Calculation
    III. Combining Data Given in Rank Order Only

Chapter Nine. Uselessness of Statistics in Current School Reports . . . 200
    I. The Situation
    II. Causes of the Uselessness
    III. Devices for Effective Presentation

Chapter Ten. Presenting Tabulated Statistics to the Public . . . 206
    I. Possibilities of Using Tabulations of Statistics to Influence the Public
    II. How to Give a Bird's-Eye View through Tabulation
    III. How to Make Up a Series of Tables of the Same General Nature

Chapter Eleven. Graphic Presentations of School Statistics Especially for the Public . . . 234
    I. Object of Graphic Presentations
    II. How to Make Graphs for the Public from Statistical Data
    III. How Graphs for the Public Differ from Those for the Administrator
    IV. Examples of Good Graphs on School Statistics for the Public
    V. Economies in Making School Graphs for the Public

Chapter Twelve. Translating Statistical Material on Schools for the Public . . . 303
    I. The Need of Translation
    II. Suggestions for Good Translations
    III. Examples of Good Translations of School Statistics

Selected and Annotated Bibliography . . . 317
Index . . . 323

CHAPTER I

WHY WE NEED BETTER SCHOOL STATISTICS

I. COMMON ERRORS

Many school superintendents have no doubt laughed over some variation of the following:

A clerk in a department store asked for a raise in salary and got this reply from the owner:

"Why, this is no time to ask for more wages. Times are too hard and you do very little work.

"I will show you how little work you do for me in a year.

"The year has 365 days in it, and each day is 24 hours, divided in three equal parts, 8 hours for work, 8 hours for sleep, and 8 hours for play.

"Now just listen. Take 8 hours that you sleep in each day, which is 122 days, from 365, and that leaves 243 days.

"You play 8 hours each day, which is another 122 days from 243, and that leaves 121 days, see?

"Now, there are 52 Sundays when you do not work; just take these from 121, and that leaves 69 days.

"When the summer comes, you say: 'I can't work, I'm all in, and I want a vacation.' I give you 2 weeks off; take 14 days from 69. This leaves 55 days.

"Then the store closes a half day every Saturday. This is 26 from 55, which leaves 29 days.

"Then you take 1½ hours for your lunch each day, which is 28 days, from 29, or 1 day left.

"And I just remember that that day was the retailmen's picnic, and you asked off to go to it."

The isolated statements seem fair enough on the surface when taken one at a time. What has been done to manipulate the figures so that a patently erroneous conclusion is reached? Humorous as this is, it is not so far beyond what can be found sometimes in the statistical parts of school reports.

Let us examine samples of these errors, which for convenience may be regarded as arising from troubles with

1. Unanalyzed Totals.
2. Comparisons Employing Indefinite Units.
3. Comparisons Using Unsound Elementary Treatment.
4. Attempts at Too Great Accuracy.
5. Neglect of Technical Statistics.
6. Presentations of School Statistics to the Public.

These will now be discussed in order, but only for the purpose of showing the errors clearly. The remedies for such troubles will be given in subsequent chapters.¹

1. Unanalyzed Totals

Some commercial clubs and school systems print figures on their letterheads giving the size of the schools, number of teachers, number of children enrolled, number of buildings, size of plant, etc.
A certain normal school in a middle-western state uses this material on its letterhead:

    A Teacher's College
    Faculty, 46: men, 19; women, 27
    Enrollment, 1915-16, 2179
    Average attendance, 874

¹ For those who wish to use this book for reference purposes or for review, cross references to later portions are given throughout this chapter. But for those who intend to read the whole book and who are going through the chapter for the first time, it is advisable to ignore the cross references at this stage.

These figures mean very little unless the reader knows in addition whether the proportion of the number of the faculty to the number in the student body is somewhere close to that of the average normal school in the country; whether, in averaging the attendance, there was one short term with a heavy attendance set over against three long terms with a rather lean attendance, etc.

Another illustration of this error is the statement sometimes made when medical inspection is installed in a city, that the results show 90 per cent or over of the children to be defective. This tends to alarm people until they consider that there must be some fallacy in it or there could be no such thing as normal children in that city. The fact is that practically every child is physically defective on some point. When children are measured on ten to twenty physical points, each child is certain to be defective on some one point, but many of them are, on the whole, average or normally healthy children. The unanalyzed total gives a misleading impression.

2. Comparisons Employing Indefinite Units

Even if the error of unanalyzed totals is avoided, a second weakness may creep in through foolish or misleading comparisons. These may be of this nature because of lack of suitable units for comparing the things, or the trouble may arise because of unsound methods in making the comparisons.

Enrollment.
For example, a certain college at one summer session announced an enrollment of 1108. At the second summer session after this the figures were 1484. Apparently this was a gain of only 34 per cent. In reality the school had gained about 60 per cent. The enrollment the first time was padded by the issuing of visitors' cards, even to wives of the faculty. The enrollment the third time included only bona fide students, owing to the need of impressing certain people with the school's actual enrollment. The difference in units here prevented the school from gaining the advertising value of the really large increase of the second year.

Serious errors often occur in comparisons between school systems because the number of children has not been figured on the same basis. Thus, this number may be calculated as the total enumerated of school age (school census), as the total ever on the roll (enrollment), as the total number in attendance that have not been absent over three days (number belonging), or as the average daily attendance. Also, the percentage of daily attendance is often figured on the "number belonging" base. This of course always gives a high percentage of daily attendance, since absent pupils are rapidly dropped from the count. It is in reality a measure of the holding power of the schools for truants, rather than for all children of school age.

Sometimes the enrollment in a school system equals or exceeds the enumeration, not because of the attraction for enumerated children, but because of the enrollment of non-residents, no mention of which may be made in that connection. Or the enrollment may be greater because of transfers inside the system so that the same pupil is enrolled in more than one ward school and consequently is counted twice or more on enrollment.

School Expenditures.
Erroneous conclusions are often reached in attempting to compare the total expenditures of one city school system with those of another city, without proper units. These totals are vitally affected by such factors as the number of people residing in the city, the per capita wealth, the proportion of children in the population, the rate of taxation for all purposes, etc. A rich suburban city can be expected to expend far more on its schools than can a city of the same size made up mainly of factory workers or similar wage earners.

Distribution of School Moneys.
In some places in the South, state school money is distributed on one unit, the enumeration basis, and expended on another, the number of children (more likely of white children) in attendance. For example, some years ago in Alabama one community is said to have had enough in this way to pay the principal of the white school $1200 and give him three assistants, all for an enrollment of forty-five white children and an average daily attendance of not more than thirty-five. This was of course achieved by spending on the white schools considerable money drawn on negro census children.¹

¹ Reported by Mr. W. F. Puckett.

Age of Child.
Another indefinite unit is the "age" of a child in any given school system. Is a six-year-old child one that is six and has not reached seven, or is it a child from five and a half up to six and a half years old? In making the recent survey of Cleveland, Dr. Ayres found that the age of a child in that city was determined by whatever birthday fell nearest to September of that particular school year. This meant that a child was the same age for a whole year. Children entering in June were set down as being the same age as they would have been if they had entered the preceding September. This made the average ages of the children somewhat younger than they should be. It had the effect of making many children that were over-age in the fall, no longer over-age when promoted at the mid-term. Thus at the beginning of the second term there was always a large and instantaneous drop in the percentage of retarded children, although the children in reality were relatively about as they were in September. That is, they were a half year older and a half year farther along in the grades. Because of this peculiar definition of age, the survey commission could make no comparisons in acceleration, elimination, and retardation, with other cities.²

² Cleveland Survey, volume on "Child Accounting," pp. 40-43.

Similar difficulties confront a superintendent who figures retardation before promotions in his system one year, and after promotions the next year. And when he compares his school system with any other system on retardation, he must be certain that he is taking it at the same time it was figured in the other systems. Another example is the custom reported from Louisville, Kentucky, of counting withdrawals as of the age they were on the previous September first. Many of those withdrawing after the middle of the year are, of course, in a higher grade or half-grade from what they were when their ages were taken.

Teachers' Salaries.
It is not uncommon to publish the salaries of teachers as being so much per month, without any mention of the number of monthly payments per year. Obviously the teacher has to live twelve months in the year, and for purposes of comparison the total for the year is the figure that must be used, especially if other people are to be influenced. For instance, a Kentucky superintendent in a small city reported that when he got his first month's salary check for $125, the cashier of the bank, who drew $90 a month, remarked jocularly that his own salary should be raised. He considered that the superintendent was earning more money. In fact, the superintendent drew $125 a month for eight months or a total of $1000; the cashier drew $90 a month for twelve months or a total of $1080. It is equally obvious that such a report on the monthly basis is very much to the advantage of the system that runs for a shorter term.

Another example is the published salary list of the faculty in a certain southern normal school. Some men were listed at $150 and others at $133⅓ a month. But as a matter of fact, the $150 men were paid for eleven months and the $133⅓ men for twelve months a year, while all worked the same length of time, thus making an average difference of only a little over $4 a month.

Teachers' Schedules.
It is rather common to attempt to compare the amount of work done by teachers within the same high school, normal school, or college, mainly by taking simply the number of recitations each has per week. This assumes that one recitation means as much work as any other and that teachers do no work except that connected with recitations. It leaves out of account such factors as: one recitation with a large class means much more work than one with a small class; two recitations in different classes involve much more labor than two sections of the same class; teachers serving on certain committees do a great deal of work connected with them and entirely apart from class duties.

Tax Rate.
Some state superintendents and many city superintendents have published comparisons of cities or counties with other cities or counties, based on the tax rate for school purposes. This is of no value unless one knows the rate of assessment and can thus arrive at the real rate of taxation. It is a well-known fact that there is an enormous variation in the rate of assessments in various cities; that practically no two are the same.
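The adjustment from the published levy to the real rate of taxation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the two cities, and every figure in it are hypothetical and are not drawn from any school report. It assumes the real rate is simply the nominal levy multiplied by the fraction of true value at which property is assessed.

```python
def effective_rate_mills(nominal_mills, assessment_ratio):
    """Real levy on true value, in mills.

    Assumption for this sketch: real rate = nominal levy x (assessed value / true value).
    """
    if assessment_ratio <= 0:
        raise ValueError("assessment_ratio must be positive")
    return nominal_mills * assessment_ratio


# Hypothetical cities: (nominal school levy in mills, assessed value / true value).
cities = {
    "City A": (10.0, 0.30),  # high published levy, property assessed at 30% of value
    "City B": (6.0, 0.80),   # lower published levy, property assessed at 80% of value
}

# Rank the cities from heaviest to lightest, first by published levy, then by real rate.
by_nominal = sorted(cities, key=lambda c: cities[c][0], reverse=True)
by_effective = sorted(
    cities, key=lambda c: effective_rate_mills(*cities[c]), reverse=True
)

for name, (mills, ratio) in cities.items():
    real = effective_rate_mills(mills, ratio)
    print(f"{name}: published {mills:.1f} mills, real {real:.2f} mills")
print("Ranked by published levy:", by_nominal)
print("Ranked by real rate:     ", by_effective)
```

With these made-up figures the two rankings come out reversed, which is the whole difficulty with lists of published levies.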
Yet how often do we see in school reports long lists of figures giving the school tax levies of many cities chosen almost at random. But these do not impress the taxpayer. He knows whether his total school taxes are heavy or not and will pay little attention to figures that merely place his city low in the list of cities ranked according to rate of school tax.¹

But even the actual rate of taxation is still of little value unless one knows how heavily the taxpayer is burdened with other taxes. For example, several years ago at Arkadelphia, Arkansas, the taxpayers protested against a school tax of seven mills (70 cents on the $100); they said that if this were added to their state school levy of three mills (30 cents on the $100), it would make their total school tax ten mills (100 cents on the $100), or higher than most places in the South. This was totally wrong, for many of the communities with which they were comparing themselves paid a county school tax in addition to the local and state school taxes considered by them.²

¹ See pp. 51-2.
² Reported by Mr. A. E. Phillips.

College Degrees.
A similar trouble arises from the custom in many schools of at least secondary grade, of advertising the percentage of their faculty that hold degrees. But with such degrees undefined these figures mean nothing. In all sincerity may not the public ask why an A.B. from Harvard or Vanderbilt University is rated just the same as one from a private college of junior rank which has perhaps four second-rate teachers and practically no equipment? The fact is that the public does ask just such a question. For this reason thinking people place very little value at the present time on the mere announcement of a degree. They wish to know what institution conferred it. As a consequence some editors, authors, and catalogues, when giving a man's degrees, also indicate the institutions conferring them.
Unequal Things. In the preceding examples the lack of suitable units has come about rather from thoughtlessness than from any definite intention. But a very common error in school statistics is caused by deliberately considering as equal, things which a very little serious thought would show are not equal. Professor Thorndike 1 gives a good example of the seriousness of this error. Dr. Rice had ranked a large list of words as of equal difficulty to spell. Professor Thorndike gave the same list to a group of children with these results: 42 children missed necessary; 37, disappoint; and so on, down to 1, feather; none, picture. Making the most of the possibility that the group of children found some of the words easier than others because of recent drills, yet the fact is unmistakably clear that picture is not so difficult for these children to spell as is necessary.

1 Thorndike, E. L.: Mental and Social Measurements, p. 8 ff.

An especially good example of considering as equal things which are not at all equal is found in the following excerpt from a recent circular published in Virginia:

The general administration of school affairs in Virginia has also furnished striking evidence of sound economy as well as splendid efficiency. This is best shown, perhaps, in the recent comparison with the wealthy state of Minnesota, which has practically the same population as Virginia. In 1916 Virginia with eight millions of revenue enrolled 11,560 more pupils than Minnesota with twenty-six millions of revenue.

This ignores the proportion of children in the two state populations, the length of school term, and anything connected with quality of education. The schooling given the average child in Virginia certainly is nothing like the schooling given the average child in Minnesota. The latter is so much superior that it might conceivably cost over twice as much and still be really more economical.
A third instance is found in library reports of various schools. These are often useless for comparative purposes because some schools report all bulletins, agricultural and census reports, and duplicate copies, as separate books and of equal value.

A fourth example is that furnished by the vicious attacks on the state university made in some states by unscrupulous politicians. In these the yearly cost of schooling for a child in a rural school is compared with the several hundred dollars required to pay for a year's instruction of a university student. For this comparison one child in a rural school is deliberately considered to be equal to one student in the university.1

1 See also p. 47.

Grading Standards. In this connection we must not overlook the errors that arise from differences in the grading standards of different teachers in the same school. Superintendents for purposes of records, determination of class honors, etc., generally assume that their teachers grade on something like the same standard. But, in fact, the almost countless articles and books on uniform grading published in the last few years demonstrate that all teachers if left to themselves will vary widely in their standards of marking pupils.

In some schools elaborate statistics are kept to determine the standing of pupils by marks, especially in high schools for class or graduating honors, etc. But these awards are really doubtful because no effort has been made to have the teachers mark uniformly. It is only when teachers mark on something like a uniform basis that the valedictorian may be easily found by averaging marks given by them. Without some system of uniform marking, a student getting an average of 95 may really be of the same ability as another getting an average of 97.
In the University of Missouri, before the adoption of its present system of uniform grading, it was well known that certain departments were "snap" departments because of the high grades given in proportion to the amount of work required of the student. Professor Meyer in one of his earlier articles on uniform grading 1 shows that in the department of philosophy at this time 55 per cent of all grades given were "A," while in the department of chemistry only one per cent were "A." Consequently, students of the university who wished to get into the Phi Beta Kappa fraternity tended to choose all their electives from the departments known to give many high marks. It may be taken for granted that few possessors of a Phi Beta Kappa key in this institution about this time had taken any more than was required in chemistry, but probably many of them had elected work rather freely in philosophy. The application of the same principle to the high school valedictorian needs no elaboration.

1 Meyer, Max: "The Grading of Students," Science, 28: 246. (Aug. 21, 1908)

Judging Contests. The same error is made in judging debating or essay contests with from, say, three to five judges, unless the judges are obliged to grade on the same scale. Suppose we have ten contestants, and three judges who grade in terms of their own judgment, the grades being as follows:

Contestant   Marked by   Marked by   Marked by   Average
             Judge A     Judge B     Judge C     Mark
     1          55          82          90         75.6
     2          95          80          94         89.6
     3          70          82          92         81.3
     4          85          88          94         89.0
     5          70          82          91         81.0
     6          84          86          96         88.6
     7          80          86          96         87.3
     8          75          84          96         85.0
     9          75          80          96         83.6
    10          60          82          92         78.0

Each of the judges has rated a different contestant as the highest, and the proposition is made that an average be taken of all the ratings. When this is done, it is found that Contestant 2 has won, although Judge B has rated him as one of the worst of the performers and Judge C has ranked him as tied for fifth place.
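The effect of the unequal scales in the table above can be checked with a short computation. The following Python sketch is an illustration added to the text and not part of the original study; the names marks, spread, and rank_sum are hypothetical. It averages the three marks for each contestant, finds the spread of each judge's marks, and sums each contestant's place under the three judges, tied contestants sharing the average of the tied places.

```python
# Marks from the table: contestant -> (Judge A, Judge B, Judge C).
marks = {
    1: (55, 82, 90), 2: (95, 80, 94), 3: (70, 82, 92), 4: (85, 88, 94),
    5: (70, 82, 91), 6: (84, 86, 96), 7: (80, 86, 96), 8: (75, 84, 96),
    9: (75, 80, 96), 10: (60, 82, 92),
}
judges = ("A", "B", "C")

# Plain average of the three marks, as proposed in the contest.
averages = {c: sum(m) / len(m) for c, m in marks.items()}
winner_by_average = max(averages, key=averages.get)

# Spread (highest minus lowest mark) of each judge's scale.
spread = {}
for j, name in enumerate(judges):
    column = [m[j] for m in marks.values()]
    spread[name] = max(column) - min(column)


def ranks_for_judge(j):
    """Rank 1 is the highest mark; tied contestants share the average place."""
    column = {c: m[j] for c, m in marks.items()}
    ranks = {}
    for c, mark in column.items():
        higher = sum(1 for v in column.values() if v > mark)
        tied = sum(1 for v in column.values() if v == mark)
        ranks[c] = higher + (tied + 1) / 2
    return ranks


# Sum of places over the three judges; a lower sum is a better standing.
per_judge = [ranks_for_judge(j) for j in range(len(judges))]
rank_sum = {c: sum(r[c] for r in per_judge) for c in marks}

print(winner_by_average, spread)
print(rank_sum[2], rank_sum[4])
```

Judge A's marks spread over forty points against eight and six for the other two, so his opinion dominates the plain average. Counted by places instead of marks, Contestant 4 has a rank sum of 8.5 against 16 for Contestant 2. Ranking is used here only as one simple way of putting the judges on a common footing; it is an assumption of this sketch, not a method prescribed by the text.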
Contestant 4 loses although he is rated in first place by Judge B, in second place by Judge A, and as tied with Contestant 2 in Judge C's opinion. The reason he loses, although his combined rankings are higher, is this: the judges did not grade on the same scale. Judge A used a scale of much wider variation than did the other judges, and he gave Contestant 2 ten points more than any of the other contestants. The whole range of Judge B's grades was only eight points; that of Judge C's only six points; but Judge A's ranged forty points. Therefore, his rating of Contestant 2 more than overcame the combined judgments of the other two men.1

1 For further treatment of units and scales, see pp. 43-57.

3. Comparisons Using Unsound Elementary Treatment

Errors in the Use of Percentages. Of the unsound methods of comparison, some common ones deal with percentages. First, there is the kind that ignores the real meaning of percentage. For example, a superintendent may claim a decrease of 200 per cent in retarded children or number of withdrawals. Such a statement is of course meaningless, for there can be only 100 per cent of anything on which to figure the decrease. What is usually meant is that the present number is now only one third of what it once was, a decrease of 66 2/3 per cent.

This error runs at once into the one which frankly loses sight of the original data. Some years ago a prominent city superintendent in Georgia had the charge brought against him that illiteracy of adult whites had increased 100 per cent in his city, while in the state as a whole it had decreased from 13 per cent to 10 per cent, or a 23 per cent actual decrease in the amount of illiteracy. (13 - 10 = 3; 3/13 = 23 per cent +.) The facts were that this city of about forty thousand population had, at the beginning of the period in question, a certain small number of adult white illiterates, say ten.
During the period several families of ignorant Greeks moved in, bringing at least ten more adult white illiterates. Based on the original number of illiterates, the increase was 100 per cent; but based on the total population, the increase was negligible. Yet some supposedly well-informed people in this city, not knowing or thinking how few cases the misleading statement was based upon, were inclined to criticize the superintendent of schools.

Especially may the original data be forgotten in figuring percentages on parts of a small group. Thus, taking the percentages of the various marks given by a teacher in a class of less than twenty pupils is of doubtful value. Any unusual condition of any sort affecting one pupil only will affect the number of pupils receiving the same mark, by more than one twentieth of the whole. If there are ten pupils in the class, it will be affected 10 per cent and so on. If two pupils out of ten received A, 20 per cent of the class would get this mark. If one of these pupils for any reason should do poorer work, the A group would lose 10 per cent of the whole, but half, or 50 per cent, of itself.

In similar fashion the Missouri School Journal some years ago ran a comparison of state teachers' associations, based upon the percentage of teachers in the state attending them. Rhode Island came first and Tennessee last. This frankly ignored to a considerable extent the original data with its lack of a unit for "teacher in attendance." The Rhode Island teachers could attend their meeting at Providence on a nickel street car fare or a dime at the highest. But many teachers in the larger states could not attend their state association for financial reasons. Thus a low percentage of attendance in some states might mean a much greater spirit and devotion than would a 100 per cent attendance in Rhode Island.
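The percentage errors described above all come down to the choice of base. As a rough illustration, the following Python sketch works the cases through. It is an addition to the text: the function names are hypothetical, the counts of ten illiterates are the text's own "say ten," and the population is the round forty thousand given for the city.

```python
def percent_change(old, new):
    """Change from old to new, as a percentage of the old value."""
    return (new - old) * 100 / old


def share_of(part, whole):
    """Part expressed as a percentage of the whole."""
    return part * 100 / whole


# A number cut to one third of its former size: a 66 2/3 per cent decrease.
cut_to_third = percent_change(300, 100)

# State illiteracy falling from 13 per cent to 10 per cent of the population.
state_decrease = percent_change(13, 10)

# City: ten adult illiterates become twenty in a population of about 40,000.
city_increase = percent_change(10, 20)
city_share_before = share_of(10, 40_000)
city_share_after = share_of(20, 40_000)

# Class of ten with two A's, one of whom slips: the A group loses
# 10 per cent of the class but 50 per cent of itself.
loss_of_class = share_of(2, 10) - share_of(1, 10)
loss_of_group = percent_change(2, 1)

print(round(cut_to_third, 1), round(state_decrease, 1))
print(city_increase, city_share_before, city_share_after)
print(loss_of_class, loss_of_group)
```

The city's "100 per cent increase" and its rise from 0.025 to 0.05 per cent of the population describe the same twenty persons; only the base differs.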
Other comparisons attempt to relate directly groups of data which should first have their component parts turned into percentages. For example, on the wall in a certain state department of education, the grades on certificate examinations made in each subject are plotted in a graph ranging from 70 to 100. The name of the subject is put on the graph and the name of the person grading the papers. The attempt is made to standardize the grading in the different subjects by comparing these graphs. But because of the failure to turn the numbers of grades for each group into percentages of the whole, the comparisons are either very difficult or erroneous. The comparisons could be made directly only if there were the same number of papers in each subject, a thing which does not often happen even approximately.

Carelessness in Securing Data. A second form of unsound comparisons frequently comes from carelessness in securing data. For example, in comparing school systems on expenditures, the figures taken for one year may be wholly misleading for at least one or two cities out of any twenty. An unusual condition in some one city, such as an epidemic or a great expansion of work, or the erection of a building a few years before, may make the expenditures for that year wholly different from the usual expenditures of that city. Again, unless great care is taken, the total expenditures for one city may include money spent on buildings and repairs, and duplicates, while for other cities they cover current expenses only without duplicates. Unless the figures for each city are usual, or "average" ones, and unless they are taken for all cities on the same basis, no amount of care later will give sound comparisons.

Omission of Important Factors.
A particularly unfortunate form of comparison is that which presents data related to each other and then draws from them conclusions that are entirely erroneous because certain important factors have been left out of consideration. In other words, it is the old fallacy of arguing from insufficient data. For instance, take the case of the state superintendent in the South who some years ago claimed credit for the increase in school revenues in his state. Both from the platform and in press reports, he claimed that he had done more towards adding money to the state school fund than had any of his predecessors. He even issued a bulletin in which he compared the amount per capita for schools during his regime with that of five of his predecessors in the office. In his closing sentence he styled himself "the wizard of finance." As a matter of fact, the state in question through industrial development had largely increased its wealth so that the same rates for school taxation brought larger school revenues. The wealth had increased faster than the population, and so the per capita spent on schools rose. However capable a state superintendent of schools may be, he can hardly legitimately claim much credit for increasing the wealth of his state in a few years, especially at a rate faster than the population.

Another example is the argument sometimes used against compulsory education. This cites that illiteracy has decreased faster in certain southern states that have no compulsory education laws than in certain northern states that have had such laws for years. But it leaves out of account the fact that the smaller northern decrease in illiteracy is due to recent immigration of foreign illiterates, a class of which only a small number come to the South.

4. Attempts at Too Great Accuracy

Cost Figures. Occasionally time and effort are wasted by the superintendent's going to great extremes to show accuracy in his figures.
This is particularly unfortunate at times, because what is considered great accuracy may be only unnecessary work that can never make things accurate. For instance, a superintendent may go to great length to calculate the cost of instruction per day per child in his system, for comparison with similar figures for other systems. He runs his figures out to hundredths of a cent. But owing to the trouble in units previously mentioned, his results are really worth very little. One system has taken the school census for the number of children; another, the enrollment; and still another, the average daily attendance. Unless he knows that the same base has been taken for each system, no amount of mechanical accuracy later will give exact results.

In many public presentations of school statistics, much of the effect is lost by attempting to be too accurate in giving the figures to the cent or the fraction thereof in all places. The mind of the average citizen, or even of an expert for that matter, cannot take in too many details. The exact figures add nothing to the impression in his mind, and indeed detract from the main things. As an illustration of this point, take Professor Bobbitt's expressions in the San Antonio Survey that "English costs in the neighborhood of $210,000" and "spelling costs in the neighborhood of $40,000." 1 These are far more effective than if he had run the sums out to the exact number of dollars and cents obtained from a very long and tedious computation.

1 Bobbitt, J. F.: Survey of San Antonio Public Schools, pp. 99, 103.

Standards. A superintendent may make an unwise recommendation when he advocates that his school should reach the precise figure on an "average" or "middle figure" of a certain table, comparing his school with other systems. In view of the well-known chances for inaccuracy in the original figures, it is far better to take for granted that such hair-splitting accuracy is not desirable and to use a rather wide standard. Professor Bobbitt 1 does this in his "zone of safety" standard which includes the middle half of the group. As an illustration of the workings of his standard, we quote Table 1 on the costs of instruction in high school mathematics per one thousand student hours, in certain cities of the country.

1 Bobbitt, J. F.: "High School Costs," School Review, 23: 505-534. (Oct., 1915)

Table 1. Bobbitt Table Showing Cost of Instruction per 1000 Student Hours (Mathematics)

Name of school               Cost per 1000
                             student hours
University High                  $169
Mishawaka, Ind.                   112
Elgin, Ill.                       100
Maple Lake, Minn.                 100
Granite City, Ill.                 88
East Chicago, Ind.                 82
De Kalb, Ill.                      74
San Antonio, Tex.                  69
Harvey, Ill.                       69
Waukegan, Ill.                     63
South Bend, Ind.                   62
East Aurora, Ill.                  61
Rockford, Ill.                     59
Booneville, Mo.                    58
Brazil, Ind.                       56
Leavenworth, Kan.                  56
Greensburg, Ind.                   54
Morgan Park, Ill.                  53
Noblesville, Ind.                  52
Norfolk, Neb.                      42
Washington, Mo.                    41
Bonner Springs, Kan.               38
Russell, Kan.                      34
Junction City, Kan.                33
Mt. Carroll, Ill.                  30

From this table, Professor Bobbitt would not say that the cost of teaching high school mathematics should be exactly $59; but that it should be somewhere between $52 and $74.1

This measure must for another reason be used with still more caution. In this same study by the same method, Professor Bobbitt found the "zone of safety" for Latin to be from $54 to $92, while that for English ranged only from $43 to $67. That is, the zones differ for different subjects, and one cannot at first glance be sure that one ought to strive for a difference in costs in the subjects. Again, this measure is not good for publicity purposes unless the superintendent's system is below the "safety zone." In such cases it is very effective.
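The limits of the "zone of safety" for mathematics can be reproduced from the table with a short computation. The sketch below is an added illustration in Python and is not Professor Bobbitt's own procedure; the variable names are hypothetical. It takes the middle half of the twenty-five costs under two common conventions for locating quartiles, both available in Python's standard statistics module.

```python
import statistics

# Cost per 1000 student hours (mathematics), from the table above.
costs = [169, 112, 100, 100, 88, 82, 74, 69, 69, 63, 62, 61, 59,
         58, 56, 56, 54, 53, 52, 42, 41, 38, 34, 33, 30]

# Middle figure of the twenty-five schools.
median = statistics.median(costs)

# "Inclusive" quartiles fall on actual table entries (7th and 19th in rank).
q1_in, _, q3_in = statistics.quantiles(costs, n=4, method="inclusive")

# "Exclusive" quartiles interpolate halfway between neighboring entries.
q1_ex, _, q3_ex = statistics.quantiles(costs, n=4, method="exclusive")

print(median, (q1_in, q3_in), (q1_ex, q3_ex))
```

The first pair of limits agrees with the $52 and $74 read from the table, and the second with the closer figures of $47 and $78 that the author gives in a footnote. That the author arrived at them by this particular interpolation is an assumption of the sketch.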
But if his system is above this zone, citizens are apt to rest contented or even to contemplate reducing school taxes.

5. Neglect of Technical Statistics

Averages. Even when the data are accurate, errors may creep in because of neglect of the elements of technical statistics. A frequent example of this arises from a very loose use of the average to typify a group, especially in regard to teachers' salaries. In a recent investigation 2 of salaries paid by fifteen of the best-known colleges for women, it was found that the average lowest salary paid was $700 and the average highest was $1500. The average would give a very erroneous impression, as far more of the instructors received the lower salaries than the higher. The maximum amount reported was $3000; the lowest, $100 and home. It is obvious that the high salary would exert a greater influence on the average than the lower one, since it is twice as far away.

1 To be still more accurate, between $47 and $78. See p. 130.
2 Journal of Pedagogy, 19: 185.

This point may be further illustrated from a report of a state superintendent.1 In one table he gives the highest and the lowest salary paid per month to male and to female teachers, for both races, separately. The table following, in the same way, gives the average salary of the various groups. For example, County A reports the highest salary for white male teachers, $175 per month; the lowest, $40; the average, $73.07. The fact undoubtedly is that in County A more white male teachers get less than $73.07 than get more. This table is made worse by averaging the salaries paid both white teachers and colored. This average means little for two reasons: First, there are fewer colored teachers by far in Florida than white; second, the colored teachers do not get anything like so much salary as the white.
The result is that the average is higher than all or most of the salaries paid colored teachers, and lower than all or most of the salaries paid white teachers, and so is significant for neither group.

In the summaries of another report of the same superintendent of the same state for 1913-14, this error appears.2 The average salary per month paid teachers in the ten highest and the ten lowest states in the United States is given. Wisconsin stands at the top of the lowest ten. Florida does not appear, so we suppose that she is somewhere between the highest ten states and the lowest ten; hence the apparent conclusion that Florida pays a better monthly salary than does Wisconsin. But from another table we learn that the average school term in Wisconsin is 175.7 days; for Florida, 122.2 days. In other words, Wisconsin pays on the average for 53.5 more days than does Florida. Now if these averages had been expressed in years instead of months, the chances are very likely that Wisconsin would be higher up in the list of states than Florida.

1 Report of State Superintendent of Florida, 1912: 471-472.
2 Report of State Superintendent of Florida, 1913-14: 37.

A very inaccurate but common way of taking the average is illustrated by the following procedure which came under the writer's notice: In a study for school purposes of the negroes of Texas, the percentage of the negro population to the total population of the state was desired. Three counties with a high percentage, three with a moderate percentage, and three with a low percentage were taken for the whole state. Then the average of these nine counties was determined. This is a very rough and inaccurate method. If the number of counties in the state were divided into groups having the same ratio to each other as the counties taken in the study, and if these latter were then chosen at random or equally spaced in their groups, the results would be fairly accurate.
But such a state of affairs could hardly be hoped for in popular sampling of this kind.

A similar error is liable to occur in the common practice of judging the work of a class by examining one of the best notebooks, one of the medium group, and one of the worst. Unless care is taken, the extremes may receive an undue amount of emphasis. The medium group is always so much larger than the extreme groups that it ought to have at least two samples to each one for the other groups.

Variations. Often bare averages are given, when variations from the average are the significant thing. Suppose that two boys in school have the same average for their grades in different subjects, say 85. But one boy made these grades, 84, 86, 86, 88, 81, and the other made 97, 98, 90, 80, 60. The deviations from the average in these two instances show that one boy is an "average" boy in all his studies while the other is very fine in some, possibly those he likes, but poor in the others.

In the illustration of the salaries paid teachers in colleges for women, previously given, the significant thing is not the average but the deviation from that average.

Professor Bobbitt in his San Antonio survey has one rather incomplete treatment in this matter of deviation, occurring in what is otherwise a most admirable use of variations.1 He desires to make the point that the problem of heating the schoolrooms in the city is of minor importance and gives a table of mean hourly temperature of the city for all school hours, month by month. This of course allows for deviations. These figures are for the winter months not a tremendous distance below the 68 degree standard for comfort. This would indicate that the city was always mild in winter. However, the significant thing is not this mean temperature of San Antonio, but the deviations from it. To get a mean temperature of say 55, there must be days below it, possibly a considerable number.
If there are only a few days each month, it is clear that the school buildings must be equipped to keep children warm on those days or else school must be dismissed until it gets warmer.

1 San Antonio Survey, pp. 226 ff.

Sampling. An extremely common error in school statistics arises because of ignorance of "sampling." This corresponds to the error in logic arising from conclusions drawn from one case, from too few cases, or from cases not properly selected. In many school statistics the samples are too few, or they are not impartially taken. Especially is this true in most questionnaire or straw ballot investigations. A questionnaire is generally answered chiefly by those especially interested in the subject. Any conclusions from such data should not be interpreted as applying to any save that class. The methods of taking the average described on page 21 are of course illustrations of bad sampling.

Incorrect sampling appears often in school advertising. Thus, a leading southern university gives prominence in its advertising to the number of its graduates who have been members of Congress, governors of states, United States senators, or even President. In many ways of course such advertising is thoroughly justifiable, but not when it seeks to convey the impression that the typical student at the university will later reach such prominence. Again, a certain southern university appears to admit to graduate standing all graduates of a certain normal school which does about two years of college work. This graduate standing is granted because at one time three or four students from the normal school in question were taken on trial at this university and did good graduate work. That is, all students from the normal school are assumed to be as capable as the original three or four.
Practically every school system, especially a high school striving for recognition, cites the performance of its best students in creating a general impression of its work. But such sampling needs to be taken with considerable salt. We might with equal justice expect all Chinamen to be so many Confuciuses, or all Americans to be so many Woodrow Wilsons.

The process of sampling is treated at length later on.1 But briefly, where sampling is resorted to, it must be done purely at random. And enough cases must be taken to insure an approximately accurate impression of the entire distribution that is being sampled.

1 Pp. 62-71.

Number on a Scale. A final technical weakness, one requiring great care to avoid, comes from a wrong interpretation of what a number means on a scale. For example, a score of 6.25 on any test should mean one-fourth of a step above the "6" step's starting point. If this step runs from 6 to 7, obviously the score of 6.25 is just what we need. If "6" is regarded as running from 5.5 to 6.5, then 6.25 may really mean 5.75. A superintendent may measure his work on a scale by one method and compare it with work measured by another method on the same scale, and feel much elated over beating the other man by half a step, when as a matter of fact the half step advantage is due solely to differences in counting on the scale.1

1 See p. 49.

6. Presentations of School Statistics to the Public

When it comes to presenting school statistics to the public, the errors are even greater. There is often little or no tabulation, or the tables are so complicated that they can be understood only by experts. Many of the presentations make no use of graphs, and others contain such intricate graphs or charts that laymen cannot easily understand them. Some of the charts have the zero line cut off so that the effect on the reader is apt to be totally misleading.2 In other graphs the scales used may give a false impression.3

2 See p. 283.
3 See p. 284.

II. WASTEFULNESS OF THESE ERRORS

The mere errors of school statistics are not their worst feature; it is the amount of time and energy put into such useless things. Some school executives apparently devote much time and effort to the collection and publication of educational statistics of this useless sort, presumably with the idea of doing some good. Professor Hanus 1 in a recent study of superintendents' reports found this to be true of one city: The report of the superintendent contained 35 tables, 23 of which contained statistics for the year 1913-14; the other 12 presented comparative statistical summaries covering the period from 1906 to 1914; but only two of the latter group of tables had any relation to the first-named group. Forty-two per cent of all the reports studied by Hanus contained "few or no comparative statistics, and very little or no satisfactory interpretation of statistics." The significance of such statistics is practically nil, he concluded.

Snedden and Allen, in a book published some years ago,2 indicate that in their investigation of the school reports of the best city systems of the country, much useless statistical material was found. This material they roughly classified as of two kinds: First, material that would be useful if properly related and explained; second, material useless in itself, either because it has no significance whatever, or because it is in detailed form, when only the summary is the significant thing.

1 Hanus, Paul: "Town and City Reports," etc., School and Society, 3: 145-155, 186-198. (Jan. 29, Feb. 5, 1916)
2 School Reports and School Efficiency, pp. 1-10.

The protests against such ineffectiveness in school statistics arise also from other sources. Superintendent Maxwell of New York frankly says:

"There are some ways in which the efficiency of a school may be determined with an approach to accuracy and without the assistance and without the retardation of time-wasting, energy-destroying statistical research.

"There may be ways in which the so-called scientific surveys or investigations, when stripped of past and present absurdities, will help in determining efficiency." 1

Again, the writer of a very thought-provoking satire on present-day educational theory offers the following "statistics on the different classes of teachers with respect to 'pedaguese' or the scientific terminology employed by educational theorists": 2

Use and think they understand it                                          12%
Have used and thought they understood it, but don't now                    2%
Think they understand it, but don't use it                                 6%
Use it but don't understand it                                             9%
Don't use it, don't understand it, but esteem with awe those who do       51%
Think it is rot                                                           20%
                                                                         100%

III. INDIFFERENCE OF THE PUBLIC

Moreover, the public, particularly the influential portion of the public, have a distrust of these school statistics and are little affected by them. It may be that the average citizen has a wholesome respect akin to awe for any statistical presentation, as claimed by some writer in the Unpopular Review.3 In this article appear some very illuminating examples of misleading statistics in general fields. Errors similar to these can be found in many school reports. But the writer's experience and results from conferring with many practical school men point in the other direction. Many influential people who are not especially trained in statistics are frankly skeptical of any statistical report on schools. At the same time they may be vitally interested in the schools and have confidence in the superintendent.

1 Maxwell, W. H.: "How to Determine the Efficiency of a School System," American School Board Journal, May, 1915.
2 Henrick, Welland: A Joysome History of Education, p. 54.
3 May-June, 1915: 352-366.
School statistics are usually included in large reports which do not reach the public to any appreciable extent. Professor Hanus sent 250 letters to the members of the Harvard Club, and 250 to the members of the Chamber of Commerce in Boston. He asked these people if they had seen a report of the school board, including that of the city superintendent, within the last two years. He received 128 replies from members of the club and 83 from members of the Chamber of Commerce. About 70 per cent of these answered "No." Snedden and Allen assert that the reports of city systems are not read and give as the reason that they are not so arranged as to be intelligible to the ordinary citizen. The common practice of issuing summaries or abstracts of school reports or surveys indicates the same thing.

The indifference of the public is further shown by the fact that editions of school reports are usually very small. Only a few years ago the proposition was made to place copies of the New York City report in the hands of each of the 15,000 teachers, so that more intelligent and earnest work would result. It was objected to on the ground that it would cost too much. But the cost of the reprints would have been insignificant compared with the $36,000,000 spent on the public schools that year. Another illustration comes from Mobile. This city has a debt of $4,000,000, and an annual expenditure of about $464,000 with $110,000 spent on schools. But the board hesitated about paying the $500 necessary to print the recent school survey.

IV. PROGRESS ELSEWHERE IN USING STATISTICS

While these and similar errors continue in statistics of many school systems with resulting indifference of the public, remarkable improvements and extensions in the use of statistical method have been made in other fields and in some schools.
Whether we look in economics, sociology, census reports, insurance, cost accounting, biological sciences, or the best school reports, wherever scientific method is used, that is, practically, wherever the experimental method is used, there has been a refinement of statistical method and an application of it to the problems in question. Let us list at random some of these problems as they come to mind:

Life insurance calculations, by which the average number of years a man will live after any given age, is known.
Weather conditions (rain, temperature, etc.) over any given area for a long period of time, by which it may be known whether it will be profitable to grow certain crops in that region.
The adaptation of the average to measuring and sampling processes, especially of grain.
The movement from country to city and its results.
The tenancy problem, its causes, types, and effects on general social status of the region under consideration.
Relative attainments of city and country born children.
Relative strength of various European belligerents.
The relation of men and women in various mental processes.
The effect of factory labor on children.
The fight against preventable diseases.
Comparison of the intellect of the white man with that of the negro.
The influence of woman suffrage in national elections.

Within the field of educational investigation itself, it is significant to scan the list of publications at a graduate school of education like Teachers College (Columbia University) and see that from only a small number of dissertations embodying a few statistics simply handled, there has been a steady drift toward investigations involving the comprehensive use of statistical method of greater and greater accuracy. The School of Education at the University of Chicago requires a course in statistical method very early in the program of every graduate student.
Statistical method has been used to get at the truth of many educational problems, ones that could not otherwise be solved, for example:

Do women teachers drive boys out of high schools?
Is the South taxing itself as high for schools as the North?
Is the record on college entrance examinations as good an index of the student's later achievement in college as is his high school record or the college record of his brother?
What is the relation between reasoning and fundamental operations in arithmetic?
What are the most economical ways of memorizing?
Do teachers get more salary for each year of training beyond the high school?
What are the causes of elimination of children from school?

A similar progress has been made in these other fields in presenting statistical information to the public. Political parties, labor leaders, Y. M. C. A. leaders, philanthropic workers, various foundations, bureaus of municipal research, evangelists, advertising agencies, and corporations are showing in their reports great progress in pictorial and graphic ways of presenting statistical material so as to influence the layman.

V. UNSATISFACTORY STATE OF AFFAIRS IN SCHOOL STATISTICS

In view of the progress in the use of statistics in other fields and in special educational investigations, our present unsatisfactory state of general school statistics cannot be allowed to continue. Statistical methods apply wherever things are to be counted or measured. Nearly all the problems of the school executive involve numerical data and cannot be adequately handled without statistical method. For example, what problem would a superintendent have that did not relate to one of the following general fields?

1. Children to be educated or changed.
2. The aims of education, or the nature and amount of change to be produced in these children.
3. The agents of this education: teachers and others.
4. The means of this education: buildings, books, laboratories, etc.
5. The methods by which these agents use these means.
6. The changes resulting from various combinations of these with the first, which is the big thing in education. [1]

Every one of these fields affords numerical data for the solution of problems, and they cannot be solved without handling such data. In such fields "the number of useful studies to be made is, for all practical purposes, infinite." [2]

    [1] Thorndike, E. L.: "Quantitative Investigations in Education," School Review Monographs for College Teachers of Education, No. I, pp. 31-32
    [2] Ibidem, p. 34

The school executive has as great need to appeal to the layman with statistical matter on schools, as do any of the publicists in other fields. In particular, he has practically to make the same appeal for funds as do workers in these other fields, especially the religious and philanthropic ones. Consequently, the school superintendent needs for school purposes any valuable methods or discoveries on statistics worked out in other fields.

In the preceding, the aim has been to set forth clearly typical errors in school statistics and to indicate briefly the causes for such ineffectiveness. The rest of this book shows in detail how to work up school statistics properly and how to present them effectively to the public.

Chapter Two deals with the collection of data. Chapter Three discusses the need for technical methods in handling school statistics. Chapters Four, Five, Six, and Seven show how to apply these technical methods to school facts that are to be presented to the public: Chapter Four dealing with scales, distribution tables, and surfaces of frequency; Chapter Five with the three measures of type; Chapter Six with the measures of deviation or dispersion; Chapter Seven with measures of relationship.
Chapter Eight is a supplement on statistical treatment, dealing with additional problems which the superintendent will encounter in working up his school statistics to present to the public. Chapter Nine treats of the ineffectiveness with the public of the statistics in many school reports. Chapter Ten shows how to tabulate school statistics for the layman. Chapter Eleven deals with graphic presentations of school statistics for the public. Chapter Twelve gives effective ways of translating school statistics into words, for popular consumption.

EXERCISES

1. When may a child be counted as a six-year-old?
2. Just what is the justice of a state university's rating a high school on the performance of its former graduates in the first two years at the university?
3. Precisely what does a boast in the state teachers' journal that a certain girl graduate in a given high school made an average of 98.5 for the four years, mean to other high school teachers?
4. How valuable is the practice of some teachers of keeping records of the work of pupils for only the last few days before the end of the term?
5. How significant is the claim of a superintendent that his high school is better than that of a neighboring superintendent because he has in it 6 teachers as against the other's 4, and enrolls 60 non-residents as against the other's 40?
6. Of what worth is the statement of a superintendent that he has increased the size of the graduating class in his high school 75 per cent in two years?
7. How could you compare in the same high school the cost of teaching one period a day for the principal who teaches 4 periods a day, with that for the English teacher who has 6 periods a day; with that for a science teacher who has 4 classes with double periods each, a day?
8. What examples of the errors given on pages 2-24 have you come across in your previous experience with school statistics, and in just what did the trouble consist?
9. Take some school report or some article in your state teachers' journal that contains considerable statistical material. Locate the errors, if any, in it and show in just what they consist.

CHAPTER II

COLLECTION OF DATA

I. WHEN TO USE STATISTICAL METHOD

1. When Statistical Method Is Profitable

Which school problems may be subjected with profit to statistical treatment? The superintendent needs to know this, for we have seen how easy it is for him to waste his time and make errors in school statistics. These troubles frequently come from two ever-present dangers: starting on the details of the task before it has been clearly thought out; and attempting to use statistical method where it is not applicable or will not yield results worth the effort. In other words, two questions at once arise:

1. Which problems of the superintendent may profitably be subjected to statistical treatment?
2. How may he know whether a given problem can profitably be so treated?

In reply to the first of these questions, it may be said that practically all of the superintendent's problems may be statistically treated, because on all of them he can work with data that may be counted or measured more or less accurately. The quotation from Professor Thorndike, page 30, shows how wide a range of problems can be thus treated. But to be still more specific, the following list of problems ever present for the superintendent must be so treated: [1]

    [1] Made up chiefly from Snedden and Allen: School Reports and School Efficiency, pp. 118-127

Buildings
Is the number of sittings adequate?
What is the value per child that can be comfortably seated?
How much has the city spent for buildings per $1000 of taxable property as compared with other cities for a period of years?
How much waste space is there in the buildings?
What is the proportionate cost of upkeep and repairs, per capita average daily attendance?
What has been the cost of equipping special buildings or rooms, as laboratories, manual training, etc., per capita average daily attendance, as compared with other years and other systems?

Receipts and expenditures
What has been the per capita expenditure, expressed in terms of average daily attendance of the system as a whole for several years?
What has been the per capita cost, per school, of such items as salaries, general administration, fuel, building and repairs, special classes, etc.? Also what have been the relative costs?
What do the different classes of schools (high, junior high, elementary, special) cost per capita?
Are receipts keeping pace with the increase in number of children?

Census
What are the numbers of children for each year within the school age limits in the city?
How many of these are within the compulsory school age?

Attendance
What is the number of children of compulsory school age attending public schools?
How many children of compulsory school age are attending private schools, special schools, or are otherwise satisfactorily accounted for?
How many are in voluntary attendance, per school, and has the number increased during the past several years?
What is the number of persistent attendances, i.e., children who attend 160 days out of a possible 200, etc.?
What is the character of the absences?

Elimination
How many children drop out each year, by grades and schools?
Why do these children drop out: transferred to other schools, non-promotion, irregular attendance, over-age, etc.?
Is the percentage of elimination increasing?

Retardation
What are the percentages of promotion and non-promotion by grades and schools?
Would retardation be lessened by flexible promotion schemes? By promotion by subject in the upper grades?
Are retardation and elimination increased or lessened in schools where industrial work is given?
What is the relation between retardation and over-age?
Special classes
What is the per capita cost, average daily attendance, of special classes, evening schools, etc.?
How does this cost compare with that in other cities?

Medical inspection
How many children have been treated?
How many defects have been remedied as a result of this treatment?
How many homes visited?
What is the per capita cost of this work?
Is it adequate?

Truancy
How many cases were reported?
What disposition was made of these cases?
What were the causes of these cases of truancy?
What has been the per capita cost of the truancy department?
How does it compare with other cities in amount of work done and in expense?
How much time on the average elapses before a reported truant is returned to school?

Supervision
Which of several methods of teaching a subject produces the best results with the given teachers and conditions of the system?
Which teachers or schools are doing the best work, in what subjects, and just how much better?
How many teachers can a supervisor most profitably work with in the given system?
How successful is the scheme of the system for rating teachers?
How effective for producing better teaching are the methods employed for improving teachers in service?
Are the teachers overburdened with routine and clerical work?
Are the classes too large for good work?
Which parts of the curriculum ought to be eliminated or given reduced time allotments?

2. When Statistical Method Is Unprofitable

Sometimes statistical method cannot be used, because it is impracticable to get suitable data. Lack of records, of suitable help, of money to pay for such assistance, or of time on the part of the superintendent, may bring this about.

There are cases also in which the effect aimed at with a statistical presentation could be secured more quickly and easily with some argument from analogy, some emotional appeal, etc.
For example, one of the Cleveland reports some years ago wished to emphasize the improvement in school children that could be produced by operations for adenoids and similar troubles. Instead of using statistics, it simply showed the picture of a boy afflicted with adenoids before the operation, and another picture of him some months afterward. The difference in expression in the two pictures, especially as regards intelligence, probably was more effective than any possible statistical presentation.

Then there are cases of individuals which, as Professor King so aptly puts it, "statistics cannot and never will be able to take into account. When these are important, other means must be used for their study." [1]

    [1] King, W. I.: Elements of Statistical Method, p. 35

The second of these statements is really true only on the assumption that the superintendent has the right conclusion to present. Often this statistical treatment would show him that his presentation was untrue or at least needed material modification.

Finally, there are some school problems that cannot be treated statistically with profit, for example:

Cost of subjects sure to enroll few students compared with the cost of those sure to enroll many.
Getting a city already spending money freely to do better.
Selection of texts; cheap books usually mean inferior quality.
Percentages of elimination and retardation where it is impossible to estimate fairly accurately the number of children entering in a given year.
Worth of teaching methods outside of those that may be objectively measured.

Superintendent Maxwell in the article previously cited gives a few such cases, for example, the items of character-building, development of reasoning ability (not yet measured to any appreciable extent), the motive to good, hard work, and all problems involving tact. [1]

    [1] Maxwell, W. H.: "How to Determine the Efficiency of a School System," American School Board Journal, May, 1915, p. 11

3. How to Decide Doubtful Cases

How shall the superintendent decide quickly which problems shall be statistically treated and which not? Conditions vary often with each problem. But in general, if the big elements in a given problem involve numbers or can be expressed or measured in numbers, statistical treatment will be applicable.

The cost of subjects sure to enroll few students cannot profitably be statistically compared with the cost of subjects sure to enroll many students, because the big factor in the situation cannot be stated in numbers. This factor is the relative value of the two subjects as parts of education. One may correspond to salt, of which all of us need a few grains each day, preferably some at each meal, and for which there is no substitute. English is a good example. The other may correspond to protein, which all of us also need, but which does not have to be taken at every meal, or even every day, and of which there are various forms. Science with its numerous alternative forms is a good example. Each subject is as important as the other for perfect educational health, but this relative importance can hardly be profitably treated or measured with numbers. Before rushing into any statistical treatment of a school problem, the superintendent, then, should first try to analyze out the big factors, considering whether the most important ones can be satisfactorily treated with statistical method.

EXERCISES

1. Which of the following problems, or which parts of them, may be profitably subjected to statistical treatment? Which may not? Give your reasons precisely for each one.
(a) To what extent can science be profitably taught in the grades?
(b) What percentage of the funds available for library purposes in a high school should go to each department in a given year?
(c) Should home teachers be paid markedly lower salaries than those from a distance?
(d) Should the passing mark be 65 or 70 on a scale of 100?
(e) Should a given child in a given grade be promoted?
(f) Is a given teacher marking too hard or too laxly?
(g) What salary as superintendent in a given state may a competent man reasonably look forward to?
2. State in the form of definite questions at least three school problems in which you are interested that might profitably be subjected to statistical treatment. Give your specific reasons for so listing them.
3. Do the same for at least three school problems in which you are interested that cannot profitably be subjected to statistical treatment.

II. HOW TO PLAN STATISTICAL TREATMENT OF PROBLEMS

Careful planning in statistical work is always a sine qua non for success. "Each hour spent in carefully arranging the work is likely to save a score of hours in trying to straighten out the confusion due to a hasty and ill-advised program." [1] It is equally true that "one of the peculiarities of statistical work is that practically everything must be anticipated in advance, all possible sources of error detected and guarded against, and even the general results estimated." [1]

    [1] King, W. I.: Elements of Statistics, p. 47

The saving of time in statistical work becomes all the more necessary when we remember that the superintendent is at best a very much overworked man. Careful planning means time saved for the really big things in the statistical work he does: the results and their meaning for his school work. He is also able to get a larger number of statistical processes done in a given time, which in turn means that more results and meanings will be available. In a word, careful planning of statistical work permits a larger use of statistical method by the superintendent in the time at his disposal.
To save time, the main cautions to be kept in mind in planning statistical work will be given briefly and dogmatically with only necessary explanation.

1. Decide Precisely What Is to Be Found Out or Proved in the Statistical Work

Indefinite phrasing of the problem means indefinite thinking, with the inevitable wastes of time that accompany it. The best device the author has ever found for compelling one to make a sharp and clean-cut statement of the problem, is to state it in the form of a very definite question, the adjectives, adverbs, subordinate phrases, etc. of which indicate the sub-questions or minor problems. This device has been found to be very serviceable on many problems not involving statistical treatment. It is just as serviceable on those that do need statistical method. The usual topical statement of the problem is one of the surest guarantees of loose thinking at the start. The interrogative form of statement accentuates the problem effect. A superintendent may simply state his problem as "School Costs in Blankville." How much better it would be to state it thus: "Are we paying all we can possibly afford for schools and are we getting our money's worth?" The same thing holds true for subordinate problems. For example, compare the ordinary superintendent's statement of his problem with that of Superintendent Spaulding in the 1912 school report for Newton, Massachusetts.

    Usual Statement                                   Superintendent Spaulding's Statement

    Report of Blankville Public Schools               The Newton Schools:

    Statement of aims                                 what are they trying to do?

    Attendance and progress of pupils                 Are they doing what they are trying to do?

    (Taken for granted)                               Do you approve of their policy?

    Expenditures of the system for the current year   Is their policy carried out economically?

    Administration                                    Is it administered efficiently?
    Course of study
    Reports of various supervisors

    Recommendations for next year's work              Can we afford to continue it?
                                                      Can we afford not to continue it?

2. Plan to Collect only Data for Which One Can Point Out in Advance Specific Ways in Which They Will Be of Value to Him

This does not of course mean that one can know in advance all the ways in which the data will be of value to him. The collection of data that do not seemingly answer some of one's problems or promise to buttress certain of his arguments, is simply a gambling proposition. And the odds are ten to one that the data will never be of any material use to him.

It is true that sometimes unexpected uses for data will appear after the work of collecting them has begun. For example, the writer and a graduate student collected considerable data on the cost of instruction in southern normal schools. But they had worked for some weeks before they discovered that the same data would give material for answering very important questions about size of classes. However, such a valuable by-product cannot be counted upon in all cases.

3. Plan the Whole Procedure Through to the End, Trying It Out on Sample Data to Be Sure That the Units, Blanks, Processes, etc. Will Work

Here is the place where one hour of good work will save twenty later on, as Professor King says. The units chosen should be carefully tested to see if they are practical. The blanks should be drawn out in detail and the actual operations attempted with them.

In particular, what is known as the question of "grouping" must be decided. This means that if the data are to be considered in groups, the exact range of each group must be determined beforehand. For example, if one is studying days' attendance, are the children to be grouped as those attending 0-19, 20-39, 40-59, etc., or 0-9, 10-19, 20-29, etc.? This cannot be treated in detail here, but is discussed fully on pages 106 and 107.
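The effect of fixing the grouping beforehand may be illustrated with a short sketch in Python. It is only an illustration under stated assumptions: the function name group_counts and the ten attendance figures are hypothetical and do not come from any school record; the class widths of 20 and 10 days are the two alternatives mentioned above.

```python
from collections import Counter


def group_counts(values, width, lowest=0):
    """Count values falling in consecutive classes of equal width.

    Each class runs from its lower limit through lower + width - 1,
    so width=20 gives the classes 0-19, 20-39, 40-59, and so on.
    """
    counts = Counter()
    for v in values:
        lower = lowest + ((v - lowest) // width) * width
        counts[(lower, lower + width - 1)] += 1
    return dict(sorted(counts.items()))


# Hypothetical days' attendance for ten children (illustrative only).
days_attended = [3, 12, 19, 20, 25, 38, 41, 47, 59, 60]

by_twenties = group_counts(days_attended, width=20)
by_tens = group_counts(days_attended, width=10)

for (low, high), n in by_twenties.items():
    print(f"{low}-{high}: {n}")
```

The same ten children fall into four groups under the first plan and seven under the second, which is why the range of each group should be settled before the blanks are drawn up.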
If the blanks are to be filled out by outside persons, some of the actual people, or preferably similar but less intelligent people, should be used to test out the blanks. The errors that these persons make should be noted and the blank revised accordingly. Thus, in sending out a blank for teachers to fill out, it is advisable to submit the rough draft to several average teachers, see how they can fill it out, and revise as necessary.

A very helpful device at this point is to make a "brief" of just what is to be found and of the methods to be used. This can be elaborated from the material accumulated under the suggestions in 1 and 2.

The actual processes for handling the data should be tested on the blanks themselves. Thus in making out a blank, if percentages are later to be calculated, the numbers from which they are to be calculated should come in adjacent columns if possible. Actually calculating such percentages on sample blanks will insure that an economy of this nature is cared for. Again, in making out blanks in series, the same fact should appear if possible in relatively the same column in different blanks. This will make all manipulation and calculation much easier. But such placing is almost sure to be overlooked unless the method of calculating with sample data is carried through.

In all probability the reader by this time is asking this question: "But how can one keep open-minded if so much planning is to be done ahead?" This is a natural question. So is the usual one as to whether such planning will not tend to the buttressing of preconceived opinions rather than to the discovery of anything new. In the attempt to find the truth about anything, the question method of outlining the plans is undoubtedly the surest mechanical device to aid in keeping one open-minded. Note that Superintendent Spaulding does not suggest the answer to any one of his questions.
Beyond this, probably no device will be of much service to a person who is determined to prove a certain thing by statistics whether or no. Furthermore, much of the superintendent's statistical work is for the purpose of demonstrating or proving to others what the superintendent already knows to be true. Here the element of keeping open-minded does not enter, and planning ahead is unquestionably very helpful.

EXERCISE

Take any one problem from your list in Exercise 2, page 38. Outline in question form precisely what you would wish to find or prove.

III. HOW TO DETERMINE UNITS AND SCALES

In planning statistical work, the need of units and scales early becomes apparent. [1] The superintendent is constantly called upon to pass judgment upon the worth of many school matters. He generally does this by merely placing the thing judged in its proper place in a graduated scale of values of such things. For example, a superintendent passing judgment on the work of a teacher merely puts her in her proper place in the list of teachers in his school system, ranged from high to low, or in a list made up of all the teachers he has ever observed. This is indicated in the very language he uses: "Best I ever had," "worst I ever saw," "hopeless," "practically perfect," etc. If he expresses his judgment in letters, as A, B, etc., or in figures as 85, 90, 100, etc., he is merely substituting symbols for such word estimates.

    [1] See pp. 3-11 for examples of errors arising from a lack of proper units and scales.

1. Subjective and Objective Scales

It is generally recognized that the superintendent will vary at different times in his judgments of the same teacher, engaged in the same kind of work. He may be suffering from a severe headache, or be perturbed over a recent business reversal. But his readings of a standard thermometer, when it was at the same temperature, would vary little from time to time.
Let us now consider the reasons for this difference. The thermometer is graduated into constant definite units that measure the same amount of heat in the room always in the same way. Not so in the case of judging the teacher. The superintendent's scale of teaching ability has no definite units that always measure the same amount of teaching efficiency.

These two scales represent the extremes of the kinds of scales that the superintendent must use. The units in the scale for judging teachers are in the superintendent's mind. Granting that he can transmit fairly clear ideas of his scale to others, there will be great disagreement among those using it. If they agree, they may easily be unaware of the fact, for the same descriptive words mean very different things to different persons. Since in a scale of this kind there are no units that can be made to mean exactly the same thing to different people, such a scale is said to be a "subjective" one. In the case of the scale for measuring temperature, the units are not concealed in the mind of the person using the scale; they are external to every one who wishes to use it. For this reason, such a scale is called "objective." The chances for error or difference of opinion in reading the units of an objective scale are slight as compared with those arising from the use of a subjective one.

Between these two types of scales lie others with varying degrees of subjectivity and objectivity. For example, the Thorndike handwriting scale is made up of samples of handwriting rated on a scale running up to 18, as determined by the combined judgment of a considerable number of competent judges. This scale is objective in that any one can see the sample of handwriting graded, say, No. 8. But its use is subjective in that all people do not agree that No. 8 is worse than No. 9, nor would all judges rate any other sample of handwriting at the same place on the scale, say No. 12.
If the number of judges were large enough, the variation in placing such a sample might range from 10 to 17.

It is highly desirable in planning any statistical work to try to secure units and scales that shall be as objective as possible and that shall have a minimum of harmful subjective elements. Thus, the step between 94 and 95 in the marking system of two teachers does not mean at all the same thing. Again, merely because several people think two words are equal in difficulty, it does not make them so. Professor Thorndike, as noted on page 9, quotes Rice as counting that "disappoint" is equal to "feather" in difficulty in spelling, or as proceeding as though it were. But by actual experimentation in a 5A grade, twenty-four times as many girls and thirteen times as many boys missed "disappoint" as missed "feather." In measuring arithmetic work, it is much better to take examples from the Courtis or Stone tests, because the practical worth of these has been demonstrated by the actual achievements of thousands of pupils.

One reason why subjective scales are so often undesirable is that the zero points on them are unknown. On an objective scale, such as length, 90 inches is just three times as long as 30 inches, or it is just three times as far from zero length. But in the grading or giving of marks by two teachers, there is no assurance that each regards 90 as just three times as good as 30 or just three times as far removed from utter failure. One teacher on a test may grade the worst student in the class at 30 and the best at 90. Another teacher might grade these same students on the same test at 60 and 90. The teachers would obviously be grading from different zero points.
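The point about zero points can be checked with a little arithmetic, sketched here in Python. The marks are those given above; the function names and the notion of an "implied zero" are assumptions introduced for illustration only, and are not the procedure by which the standard scales fix their zero points.

```python
def ratio_from_zero(worst, best, zero=0.0):
    """How many times as far from `zero` the best mark lies as the worst."""
    return (best - zero) / (worst - zero)


def implied_zero(worst, best, ratio):
    """Zero point at which `best` is `ratio` times as far from zero as `worst`.

    Solves (best - z) / (worst - z) = ratio for z; ratio must not equal 1.
    """
    return (ratio * worst - best) / (ratio - 1)


# Marks from the text: the same worst and best students on the same test.
first_teacher = (30, 90)
second_teacher = (60, 90)

# Measured from a zero of 0, the two teachers disagree about the ratio.
ratio_first = ratio_from_zero(*first_teacher)
ratio_second = ratio_from_zero(*second_teacher)

# Hypothetical: the zero the second teacher would need for a ratio of 3.
zero_second = implied_zero(*second_teacher, ratio=3)

print(ratio_first, ratio_second, zero_second)
```

Read from a zero of 0, the first teacher's best student stands three times as far from failure as the worst, and the second teacher's only one and a half times as far. The second teacher's marks would show the same threefold ratio only if utter failure stood at 45 on that teacher's scale, which nothing in the marks themselves can tell us.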
In the standard scales for grading composition, hand- writing, and so on, the zero points have been determined by a procedure too complicated to be given here.^ This is why such scales, even when they involve many sub- jective elements, are so superior to the attempt of a novice at making his own scale. If we must take subjective estimates as units and make our own scale, it is better to pursue the following treat- ment : 1. Avoid choosing estimators with known or probably marked prejudices. 2. Have all these persons estimate the worth of the problems in terms of a separate problem, which for convenience is to be consid- 1 See Thorndike : Mental and Social Measurements, Revised Edi- tion, p. 16^. Collection of Data 47 ered worth so much, say 10. (That is, all call this problem the value of 10.) 3. For the value of any other problem, take the average of the estimates given by the different persons.^ 2. The Jingle Fallacy The superintendent must beware especially of con- sidering things equal because they are called by the same words. This is known as the " jingle " fallacy.^ Thus, one child does not equal another child as a matter of school expenditure, if the first child is in the primary grades and the second child is in the last year of the high school. The cost of educating the latter for one year is much more than in the case of the former. The differ- ence between the ability to do one problem and the ability to do two problems in the Courtis tests is not the same as the difference in ability to do fifteen problems and the ability to do sixteen problems. Any one who can do fifteen can fairly easily work up to sixteen. But if a child can barely do one, it is a tremendous task to work up to doing two. The '' jingle " fallacy usually results from neglecting to define units or to consider the zero points. 3. Essentials of a Valid Scale The construction of a good scale for many lines of school work demands considerable technical knowledge and experience. 
¹ Professor Thorndike on pages 9 and 10 of Mental and Social Measurements has a much more complicated method for utilizing subjective estimates.
² Professor Thorndike borrows this term from Professor Aikins. See Mental and Social Measurements, p. 10.

The superintendent in general had better plan to use scales already worked out.¹ Beyond this we may for our purposes summarize the essentials of a valid scale from Thorndike:²

1. The scale must be as objective as possible. Its meaning must be such that all competent judges will agree as to what it is.
2. The series of facts used in making up the scale must be of the same sort of thing or quality.
3. The steps in the scale should be clearly defined. It is better if they are equal; if unequal, let the steps be defined as definitely as possible. However, a scale in which only the order or rank of the various facts making it up is known, is often very useful.
4. The zero point must be defined if possible.

4. Discrete and Continuous Series

It is impossible to use a scale properly unless one knows whether the facts it is to measure are in a discrete or a continuous series. A series is said to be discrete if it is regarded as broken up, i.e., the different items are separate or there are gaps between them. On the other hand, if the series is capable of any degree of subdivision, that is, if the items are regarded as strung out along the scale, and running into each other, the scale is said to be continuous. The table of the costs of instruction in mathematics, page 18, is an example of a discrete scale. In this table every item is regarded as an integer and there are gaps between the items. A good example of a continuous series is Table 2, made up from data worked out by the writer.

¹ See "Descriptive List of Standard Tests," by W. S. Gray, Elementary School Journal, 18: 56. (Sept., 1917.)
² Thorndike, E. L.: Mental and Social Measurements, pp. 11-18.
Table 2. Continuous Series Showing Fifth Grade Achievements with Courtis Tests in Addition, in a Western City

Number of problems     Number of children
attempted              making each score
 1 and 2                2 (only one count survives for these two rows)
 3                     10
 4                     24
 5                     32
 6                     35
 7                     34
 8                     54
 9                     25
10                     27
11                     24
12                      4
13                      4
14                      3
15                      6
16                      9
17                      2
18                      2

In such a series as this one, the 32 children who attempted five problems are not all regarded as being precisely at the point "five problems attempted" on the scale, but as distributed from "five problems attempted" to "six problems attempted." For this reason it is very important to know what a given number means on a scale. That is, does 6 mean from 5.5 to 6.5, or from 6 to 6.99, or nearer 6 than either 5 or 7? The second of these methods, that of measuring in terms of the point last passed, is often the natural way and saves labor in all sorts of measurements.¹ This method is the one to use where it is possible to say authoritatively that a given case is beyond a certain point on the scale, but "the how much beyond" cannot be easily determined. Obviously it is a good method for Table 2, since a given case has attempted say five problems, but one cannot easily tell whether it has just started on five, is half way through the fifth, or is practically ready to start on the sixth. The other method can be used with a scale like the handwriting scales, where a given case is said to be nearer a given sample on the scale than anything else. A case would be called "9," for example, without being definitely located as either below or above that point. Here "9" would mean from 8.5 to 9.5.

¹ See Thorndike: Mental and Social Measurements, p. 22.

5. How to Use Scales

In actual practice the superintendent can measure the worth of his work in whole or in part, on one of three kinds of scales:

a.
He can place the thing measured in its relative position in a scale of items (school systems, rooms, classes, etc.), all considered from the same viewpoint and without the use of units. This is the method used by the superintendent or school board of Blanktown, when he announces that his town has the best schools in the state, or that So-and-So makes this statement. It of course carries no weight whatever unless we know that the judgment of the one making the statement is sound. If one could read all the small town papers of any given state for one year, he would probably find three-fourths of them claiming that their home town had the best schools in the state. The same method with all its weaknesses is often used by a teacher in regard to the value of his pet method of teaching, his favorite mode of discipline, or the particular class he happens to be teaching at this time.

A refinement of the same method is used by a judge in a contest, when he ranks the contestants in order of merit only. He then gives the best contestant the rank of 1, the next best the rank of 2, and so on. But he wisely refrains from attempting to say how much better is the first.¹

b. He can compare his own school with other schools on a scale of his own making, all schools being measured with definite units. Thus, he may wish to compare his community with others on the basis of the amount of money it really pays, wealth considered, for schools. The Portland Survey Commission found in comparing the wealth and school expenditures of that city with thirty-six others nearest it in size, that there was a vast difference between the assessed and the real wealth in many cities. To show this point more clearly Table 3 has been adapted from the Portland Survey.

Table 3. Ranks of Certain Cities on Real Wealth and Assessed Wealth behind Each $1 Spent on Schools
(Adapted from Portland Survey, pp. 80, 304)

                        Real wealth            Assessed wealth
                        behind each $1         behind each $1
City                    for schools     Rank   for schools        Rank
Newark, N. J.           $165             1     $165               11
Worcester, Mass.         180             2      180               15
Toledo, O.               184             3      110                2
New Haven, Conn.         185             4      185               18
Paterson, N. J.          185             5      185               19
Lowell, Mass.            194             6      194               20
Fall River, Mass.        196             7      196               21
Syracuse, N. Y.          202             8      180               17
Cambridge, Mass.         204             9      204               22
Grand Rapids, Mich.      207            10      166               12
Dayton, O.               208            11      125                4
Washington, D. C.        212            12      148                8
Scranton, Pa.            216            13      173               13
Jersey City, N. J.       218            14      218               24
Columbus, O.             221            15      133                6
Rochester, N. Y.         225            16      180               16
Denver, Colo.            231            17      116                3
Albany, N. Y.            234            18      234               26
Providence, R. I.        256            19      256               31
Bridgeport, Conn.        276            20      276               34
Kansas City, Mo.         280            21      140                7
Minneapolis, Minn.       294            22      132                5
New Orleans, La.         314            23      236               27
Louisville, Ky.          326            24      228               25
Nashville, Tenn.         350            25      263               33
Omaha, Neb.              352            26       53                1
Oakland, Cal.            354            27      177               14
Seattle, Wash.           364            28      164               10
Spokane, Wash.           370            29      152                9
St. Paul, Minn.          407            30      204               23
Indianapolis, Ind.       408            31      245               29
Memphis, Tenn.           449            32      247               30
PORTLAND, ORE.           456            33      260               32
Birmingham, Ala.         479            34      240               28
Richmond, Va.            536            35      402               35

It may be seen from this table that it is very important to have as the unit the number of dollars of real wealth in the city, not the number of dollars of assessed wealth. If assessed valuation were taken as the unit, Omaha would be 1 instead of 26; Denver 3 instead of 17; Lowell 20 instead of 6, etc. It is evident that any one seeing this or a similar scale must agree to the ranking of each city or item as shown in the first column, provided the original data for the calculation are correct and the unit is a reasonable one. Then there remains only the question as to whether the cities selected for the scale are representative ones for fair comparison in the matter under discussion.

c.
He may compare his own school with other schools by means of a standard test, and then place his school on a scale of cities made up as in b, or merely compare it with the standards of the makers of the test. (This amounts to a scale.) The advantage of a scale of this kind is that the units have been proved equal or approximately equal, and there can be no question of the relative positions of samples in a given scale. And as time goes on, very authoritative scales will appear. Thus, we now have scales of this sort in the Thorndike or the Ayres handwriting scales, the Hillegas or the Harvard-Newton composition scales, the Ayres spelling scale, etc. Furthermore, by taking the achievements of school systems as measured on these scales or with standard units, a superintendent can easily make a scale of such achievements and see where his school system comes on it.

¹ See pp. 11, 192-198.

6. Practical Examples of Units and Scales for Superintendents

The best practical examples of units and scales for a superintendent, of course, appear in the recent school surveys. A superintendent wishing to get up a scale or find units on a given problem can get them very quickly by utilizing the following table. The particular survey is denoted by the name of the city, and the numbers refer to pages.

Units and Scales that a Superintendent may Profitably Use

Description                               Where found (Surveys)

Playgrounds
  Square feet per child                   Salt Lake City, 222; Rockford, 7; Ashland, 11; Denver, 11, 122
Buildings
  Square feet per child                   [illegible]; Ashland, 11
  Cubic feet per child                    Ashland, 11
  Space per teacher and child             Leavenworth, 48
  Number of sittings per room             Snedden and Allen, 29
  Total seating capacity by buildings     Snedden and Allen, 29
  Average cost per cubic foot             San Antonio, 315; [illegible]; Springfield, 22, 23
  Average cost per pupil                  San Antonio, 315
  Average cost per classroom              San Antonio, 315
  Same for fuel                           Denver, 1, 55

Description (in printed order)
  Buildings, continued: Same for repairs; Lighting by candle power; Janitor's salary per room; Same per hour; Same per 1000 cubic feet; Valuation per room.
  Teaching staff: Number of children per supervisory officer; Same on average daily attendance; Number of children, average daily attendance per teacher.
  Training of teachers: Years of experience; Years of training.
  Teachers' salaries: On yearly basis; On monthly basis; On weekly basis; Based on enrollment; Based on years taught; Maximum and minimum salaries; Principal's salary based on number of rooms in building.
  Proportionate expenditures: Percentages of school expenditures for different purposes.
Where found (in printed order)
  Denver, 1, 62; Salt Lake City, 235; San Antonio, 249; St. Louis, IV, 117-120; Ashland, 11; Ashland, 11; San Antonio, 250; Salt Lake City, 39; Oakland, 26; Salt Lake City, 53; Louisville, 33; Newton, 1913, Table IX; Oakland, 24; South Bend, 198; South Bend, 200; South Bend, 101; Leavenworth, 50; Butte, 120; Newton, 1913, V; Janesville, 43, 44; Bridgeport, 17; Vermont, 225; Springfield, 61; Ashland, 14; Vermont, 225; Baltimore, 74; Salt Lake City, 55, 56; Portland, 75; Cleveland, 97; Springfield, 61; Janesville, 74; St. Louis, IV, 55, 56.

Description (in printed order)
  Proportionate expenditures, continued: Percentage of salaries for high schools; Percentage of city expenditures for schools.
  Per capita costs: Total population; For each of population over 15; For each adult male; High school costs per person in population; Average daily attendance; Same for fuel; Enrollment; Average number belonging.
  Cost of instruction: Student hour; Per pupil; Per pupil enrolled; Per 1000 student hours.
  Miscellaneous costs: Expenditures for whole cities on medical inspection; Evening schools per session; Per wagon, rural consolidated schools; Part of each $1000 spent on instruction in each subject.
Where found (in printed order)
  Janesville, 82; Janesville, 68; St. Louis, IV, 32; Butte, 143-4; Bridgeport, 21; Janesville, 68; Oakland, 44; Baltimore, 34; Portland, 407; Portland, 407; Springfield, 95, 96; In most surveys; Butte, 82; Oakland, 44; Kansas City, 82; Birmingham, 36; Houston, 83; Snedden and Allen, 17; South Bend, 204; Vermont, 227; Leavenworth, 51; Springfield, 114, 97; Rockford, 111; Janesville, 75; Springfield, 114, 97; San Antonio, 215; Denver, 1, 60; Janesville, 90, 100; South Bend, 177; Newton (1913), 41; Texas, 33; San Antonio, 213.

Description (in printed order)
  Miscellaneous costs, continued: Expense for attendance officer per 1000 pupils enrolled.
  Time spent on each subject: Minutes per week; Part of each 1000 hours spent on each subject; Hours of recitation and directed study in reading and history.
  Population: Number per 1000 in certain occupations; Same by 100's; Races by 100's; Family for nearly everything.
Where found (in printed order)
  Portland, 390; Salt Lake City, 76; Leavenworth, 54; Houston, 83; San Antonio, 214; Cleveland, 121, 125; South Bend, 141; Salt Lake City, 17; Cleveland, 21; Cleveland, 21; Red River, 42, 48, 82.

Description (in printed order)
  Wealth: Per capita population; Same for real wealth.
Where found (in printed order)
  South Bend, 211; Janesville, 68; St. Louis, IV, 18; Salt Lake City, 19; Cleveland, 25.

Description (in printed order)
  Wealth, continued: Taxable wealth behind each child in school; Same for child in average daily attendance; Real wealth behind each $1 spent on schools; Possible revenue per child enrolled.
  Tax rate: Mills on assessed valuation.
Where found (in printed order)
  Oakland, 43; Maryland, 128; Janesville, 52; Portland, 414; Salt Lake City, 48.

Description (in printed order)
  Tax rate, continued: Same on real valuation; Same per $100 real valuation.
Where found (in printed order)
  Illinois, 262; Portland, 108; Leavenworth, 17; South Bend, 215; Janesville, 60; Salt Lake City, 313; Rockford, 119, 49.

Description                               Where found (Surveys)

Tax rate, continued
  Per $100,000 real wealth                Cleveland, 25
  Rate necessary to produce estimated
    per capita support for schools on
    actual wealth                         Salt Lake City, 311
Enrollment
  Increase in children per week
    (5 years)                             Salt Lake City, 36
  On basis of 1000 children in
    kindergarten                          Ogden, 9
  Parts of 100 pupils in public,
    private, and parochial schools        Cleveland, 28
  Same, failures, and promotions          Denver, 1, 70, 2, 77

EXERCISES

1. Discuss the value for the superintendent of the units used in each of the examples given on pages 53-57, and of the scales that could be made up from such units. Just how would you make up these scales?

2. What is the value, for measuring the efficiency of teachers, of counting the number of questions they ask in a given time? Why?

3. Which is better for measuring the preparation of high school teachers, and why?
   (a) The number of years they were in college.
   (b) The number of years beyond the elementary school they spent in study.

4. Precisely what is the value of each of the following methods of instructing judges in a contest, and why?
   (a) Mark on a scale of 100.
   (b) Mark on a scale of 30.
   (c) Mark the best 1, the next best 2, etc.
   (d) Mark the best 100, the worst 70, and the others where they should come in between.
   (e) Mark on a scale of 100, allowing 50 for content and organization, 30 for English, and 20 for delivery.

5.
What units and scales would you plan to use in studying the statistical problem selected in the exercise on page 43, and just why? Precisely how would you plan to use the scale or scales chosen?

IV. HOW TO DO THE ACTUAL COLLECTING

1. Records in One's Own School

Most of the superintendent's data must come from his own school system. But on many problems the matter of giving advice about the collecting may be like that of Holmes, when he said that one should always exercise great care in the selection of one's grandfather. The superintendent can at any rate collect far more data by seeing that his records are so kept as to show the desired facts later, than he can ever suddenly exact from teachers who have never thought of keeping or giving out information on this point. For example, the disputes and troubles in the studies of retardation and elimination a few years ago arose mainly because of the way school records had been kept. It was impossible to tell from existing records how many of the given pupils had entered the first grade at any given time years before, or to find all the significant facts about a pupil brought together in one place.

Inasmuch as the facts in the superintendent's own school are often meaningless unless they can be compared with similar facts in other school systems, he must as far as possible use records similar to those of his fellow school men. Therefore, it is advisable that he use the records and reports recommended by the Committee of the National Education Association on Records and Reports, and by the United States Bureau of Education. He should also try to get the State Department of Education in his state to use blanks that will fit in with this system.
If a decalogue for superintendents should be written, one of the first commandments undoubtedly should be: "Thou shalt keep thy records as nearly as possible by the uniform system of the National Education Association."¹

¹ See Bulletin U. S. Bureau Education, 1912, No. 3, or get particulars from such companies as the Library Bureau or the Shaw-Walker Co.

2. Other Sources of Data

The data from other school systems are obtained usually by the use of questionnaires, from printed reports, school surveys, magazine articles, etc. Aside from the question of selection, the questionnaire method is often practically worthless for collecting statistical data. School men are too busy to answer large numbers of questions, to work out the object of the questionnaire when this is not clearly stated, to hunt up much information on former conditions, to puzzle out involved or ambiguous questions, and sometimes too careless to give information definite enough to be of any service. As a result, few replies will come in on a given problem, and not all of these will be complete. This means few opportunities for comparison.

It is always better to make use of printed statistics where possible, taking care to be sure of the units and processes used in compiling the statistics taken. The mere fact that they are part of a printed report or formal presentation, often required by law, practically insures much more accurate figures than can be secured by the questionnaire method. A second advantage is that the superintendent can see that all his figures are on the same basis, a thing impossible with a questionnaire because of the inability of many people to understand or follow directions on an inquiry which they have never seen before and never expect to see again.

For busy superintendents the following suggestions as to where material for statistical comparisons may be found are appended:

1.
Report of the United States Commissioner of Education, Volume II, each year contains numerous tables on city school systems. It includes such material as enrollment, number of teachers, aggregate days' attendance, average daily attendance for both elementary and secondary schools, length of session, number of children of census age in private schools, number of buildings, number of sittings, itemized receipts and expenditures of all school systems in cities of 5000 or more.

2. The bulletins of the United States Bureau of Education often contain much valuable material on special topics. Especially good are such ones as the following:

   A comparative study of the salaries of teachers and officers. 1915, No. 31.
   Ayres, L. P.: Provision for exceptional children in the public schools. 1911, No. 14.
   Boykin, Jas. C. and King, Roberta: The tangible rewards of teaching. 1914, No. 16.
   Deffenbaugh, W. S.: School administration in the smaller cities. 1915, No. 44.
   Frost, Norman: A statistical study of the public school systems of the southern Appalachian Mountains. 1915, No. 11.
   Monahan, A. C. and Dye, C. H.: A comparison of the salaries of rural and urban superintendents of schools. 1917, No. 33.
   Morse, H. N.: Educational survey of Montgomery County, Md. 1913, No. 32.
   Public, society, and school libraries. 1915, No. 25.
   Statistics of certain manual training, agricultural, and industrial schools. 1915, No. 19.
   Strayer, G. D.: Age and grade census of schools and colleges. 1911, No. 5.
   Thorndike, E. L.: The elimination of pupils from school. 1907, No. 4.
   Thorndike, E. L.: The teaching staff of secondary schools in the U. S. 1909, No. 4.
   Updegraff, Harlan: A study of expenses of city school systems. 1912, No. 5.
   Updegraff, Harlan: Public and private high schools. 1912, No. 22.
   Updegraff, Harlan and Hood, Wm.: Urban and rural school statistics. 1912, No. 21.

3.
The reports of the State Department of Education within the state where the superintendent lives.

4. Strayer, G. D. and Thorndike, E. L. Studies in Educational Administration. (Contains the cream of many dissertations published at Teachers College prior to 1912.)

5. Publications of the United States Census Bureau, especially special reports on cities and abstracts of each census.

6. School surveys of all classes.

7. Bulletins issued at various times such as those started by Superintendent Spaulding at Minneapolis in September, 1916.

8. Dissertations from leading schools of education in universities. The publications of the superintendent's own state university in this line will be easily available. In addition, the dissertations and theses from Teachers College, Columbia University, and the School of Education at the University of Chicago, are very valuable.

9. Reports of investigations appearing in standard educational magazines. The best for this purpose are:

   American School Board Journal, Milwaukee
   Educational Administration and Supervision, Baltimore
   Elementary School Journal, University of Chicago Press
   Journal of Educational Psychology, Baltimore
   School Review, University of Chicago Press
   School and Society, New York

10. After the foregoing was written, Professor H. O. Rugg's book on Statistical Methods Applied to Education appeared. It has much more extended references, especially on pages 28-39, 361-375. The material is admirably classified for ready reference.

EXERCISE

For the special problem you have selected, jot down as carefully as you can at this time:
   (a) The places in your own school system (or one in which you are interested) from which you might secure statistical data on it.
   (b) Other likely sources of statistical data on it.

3.
Sampling¹

As soon as the sources of data are determined, the question arises as to what can be done in the way of "sampling" with a view to cutting down the inevitably large amount of work. "Sampling," of course, means working from selected or typical specimens rather than with the whole mass of data.

The superintendent usually considers sampling because he must answer one of three questions:

1. How many measures are needed of an item to be sure that the item is fairly well represented?
2. How many cases and which ones need to be treated in a large mass of data to be sure that the results will be approximately true of the whole?
3. In case the superintendent at best can get only a small list of items for comparative purposes (say only a dozen towns that are really comparable to his town on school expenditures), how is he to choose these items?

Let us now take these up in order.

¹ See also pp. 22-23.

Number of Measures of One Item Needed. No arbitrary rule can be laid down for the number of measures needed on any one item. But it is safe to say that often there should be more than one in order to insure a reliable average measure for the item. For example, in comparing cities for school expenditures, it is often very unfair to get the expenditures for one year only. There may be very unusual conditions for that year in several cities, say fires, cold spells that sent up fuel bills, epidemics that necessitated much medical expense, etc. In such studies it is customary to take the average of two years for each city, as Strayer does in his City School Expenditures. Superintendent Spaulding, in a recent monograph on expenditures in Minneapolis, takes the average for five years for all cities studied.

Another good example is seen in comparing teachers on their ways of marking students. It is very unfair to judge a teacher by the marks she gives at any one term examination, or in any one class.
Professor Max Meyer at the University of Missouri, for instance, never attempted to pass judgment on the marking of a member of the faculty there until he could get at least five hundred marks given by that teacher in all his classes. A teacher may rightly object to a rating given her by the superintendent on one short visitation in one subject only. In practice, of course, she will not object to such rating if it is highly favorable to her, but it is probably about as far from the exact truth as an unfavorable rating would be if made on the same visitation. The writer some years ago, in inspecting high schools for the University of Missouri, found that the inspection was hardest in this one particular. It was impossible to visit a school oftener than once a year for a day, and this made it extremely difficult to pass judgment on four or five teachers in the course of the six hours or less of teaching, judgment to which the teachers or their superintendents would really subscribe.

The more measures we take of an item, provided they are not all chosen with the same bias or cause for error, the more reliable will be the final average measure taken for that item. But time is too valuable to permit going on indefinitely getting measures of one item. The safe procedure is to take as few measures of it as may reasonably be expected to represent it fairly.

Selection of Samples Ordinarily. The amount of work in any statistical treatment is so great that the question of cutting down the number of items by sampling is very important. But it is equally important that the samples be so selected as really to represent the whole. Neglect of this is responsible for the worthlessness of many laborious pieces of school statistics. The selection of samples should be absolutely at random, and if there are groups of data, the same percentage of samples should be taken from each group.
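The rule just stated (random choice, with the same percentage drawn from every group) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group names and sizes are hypothetical, the function name `proportional_sample` is invented for the example, and the fixed seed is there merely so that the illustration can be repeated.

```python
import random


def proportional_sample(groups, fraction, seed=0):
    """Draw the same fraction, at random, from every group.

    groups   : dict mapping a group name to a list of its members
    fraction : share of each group to draw, between 0 and 1
    seed     : fixed only so that this illustration is repeatable
    """
    rng = random.Random(seed)
    sample = {}
    for name, members in groups.items():
        count = round(len(members) * fraction)  # same percentage in each group
        sample[name] = rng.sample(members, count)  # random, without repeats
    return sample


# Hypothetical group sizes, assumed for illustration only.
groups = {
    "wealthy": [f"wealthy-{i}" for i in range(50)],
    "comfortable": [f"comfortable-{i}" for i in range(200)],
    "poor": [f"poor-{i}" for i in range(1000)],
}

sample = proportional_sample(groups, fraction=0.10)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)
```

Because every group contributes in proportion to its size, the larger groups are not drowned out by the smaller ones, which is the point of taking the same percentage from each.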
Thus, a superintendent who was making a study of outside reading done by students in his town, could not get very trustworthy results by asking the pupils in five wealthy families, in five families in comfortable circumstances, and in five poor families. There would be so many more in the second group than in the first, and so very many more in the third than in either of the other two groups that his results would be untrustworthy. To get at anything like the truth, he would have to take, say, twenty from the second group and possibly a hundred from the third. It is equally erroneous to attempt to obtain results on the effect of negro population on schools in Texas by ranging the counties in order of percentage of negro population and then taking three of the counties most free from negroes, three from the middle group, and three of those having the largest numbers of negroes. The results are wholly unreliable until we know how many counties are in the lowest natural group (that is, one without gaps in the percentages), how many in the middle group, and how many in the highest group, and that we have taken the same proportion of samples from each. Again, a teacher in giving a grade to a pupil ought not to take the work of the pupil during the last few days of the term. It is not a fair sampling of the student's work. Nor should she take the sample grades for her private book at any stated time known to students. If they know she always grades them on Friday, they will do well on that day and slow down on other days.

Unfair selection explains why questionnaire methods of getting results are often so unsatisfactory and misleading. The people who answer questionnaires are often very selected and biased, and do not represent the whole group at all fairly. For example, a superintendent may wish to know what parents think of his school and may send out a questionnaire for this purpose.
It will be sure to be answered mainly by those who are favorable or who wish to make him think they are favorable, and possibly by a very few opponents who are stirred up enough to come out frankly, but who are probably seriously or unfairly prejudiced against him. If he sends out a questionnaire to other superintendents, as a general rule only the superintendents who think they can make a better showing than he can, will send in results.

So very serious is this defect of the questionnaire method that the phrase of President Kirk of the Kirksville, Missouri, Normal School, "a questionable questionnaire," is often justified. The most casual examination of the statistical outpourings of the questionnaire type in the last few years will show that very often only a small percentage of the persons receiving questionnaires ever answer them. This small percentage is almost certain to be selected on some peculiar bias and is getting smaller because so many foolish and needless questionnaires have been sent out to superintendents in the last few years. Many school men confess to throwing nearly all questionnaires into the waste basket or turning them over to clerks or pupils to answer.

A good method of sampling is the familiar one of taking every fourth or fifth case of the items arranged alphabetically, or in order of magnitude, so that there can be no prejudice in the matter. This, of course, insures getting the same percentage in each of any possible groups. Thus, if it is a case of consulting citizens, every fifth name in the telephone book would give the superintendent a good sampling of men able to have telephones. But it would not represent all citizens. To get all represented, he had better take every fifth name in a directory of the city, or in a list of the registered voters.
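Taking every fifth name from an ordered list is simple enough to sketch directly. The following is a minimal illustration, not a prescribed procedure: the directory entries are invented, and the function name `every_kth` and its `start` parameter are assumptions made for the example.

```python
def every_kth(items, k, start=0):
    """Return every k-th item of a list already arranged in a neutral order.

    items : list arranged alphabetically or in order of magnitude
    k     : step between chosen items (5 for "every fifth name")
    start : zero-based index of the first item taken
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return items[start::k]


# Hypothetical directory of 100 names, assumed for illustration only.
directory = sorted(f"Resident {n:03d}" for n in range(1, 101))

# Take the 5th, 10th, 15th, ... names on the list.
chosen = every_kth(directory, k=5, start=4)
print(len(chosen), chosen[0], chosen[-1])
```

The sample is only as representative as the list it is drawn from, which is why a city directory or a voters' list serves better here than the telephone book.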
If it is a case of children, arranging them alphabetically by grades, perhaps boys and girls separately, and then taking every fifth name would give him a good set of samples for boys and another good set for girls. If a teacher has kept daily grades of pupils, he can get a good sample set of grades by taking, say, every seventh grade for each pupil, or something of this sort, just so long as the grade taken does not fall on the same day of the week every time.

Sampling is often resorted to in giving standard tests to large numbers of children where the labor of grading the papers from all would be very great. One of the best sampling schemes coming to the writer's notice is the one employed in the San Francisco survey in 1916. Four tests (arithmetic, spelling, penmanship, and reading) were given. Two classrooms in different grades in the eighty-one elementary schools in the city were chosen at random for a test in some one of the four subjects. No teacher or principal knew in advance what rooms had been chosen or what subject would be given in the room selected.

If a questionnaire has to be used, it is generally advisable for the superintendent to pick out a reasonably small list by one of the preceding methods, and then to devote his energies to seeing that approximately every name on the list sends in a fairly accurate answer. The "personal questionnaire" filled out in person by an "interviewer" is sometimes an excellent device. It relieves the person interviewed of much drudgery and insures correct interpretation of the questions. But it tends to embarrass the person giving the information, often to the point of stopping easy conversation. To avoid this, the interviewer is apt to encourage or accept rough estimates in place of accurate data.
In any case, the best way for the superintendent to get answers is first to be sure that his investigation will be of value to some one besides himself, and especially to those he requests to fill out his questionnaire. Then he should promise to give all those answering it a copy of the results. If he ever intends to do any more investigating, he must faithfully keep this promise, aside from his moral obligation to keep it.¹

¹ For those who, in spite of the preceding, find it necessary to use a questionnaire, pp. 40-56 of Rugg's Statistical Methods Applied to Education will be very valuable. This contains the most practical treatment of the questionnaire that the author has yet seen.

Selection of Very Few Samples. Often the superintendent knows that he cannot by any possibility get more than a small number of items. His problem then is how to select these items so that he may be sure that his comparisons will not be absurd. For example, if he wishes to know whether his community is spending enough money on its schools, is he to take all cities of that size in the United States or in his state? Neither procedure will do because the same conditions manifestly do not obtain in all of these cities. The same would be true if the cities were selected on the basis of the number of children of school age, the number of children in school, etc. The towns certainly ought not to be compared mercilessly unless their wealth per capita is something like the same. They cannot be compared on current school expenditures unless we know that their school debts are something like the same relatively, and even this may not be enough. We may have to know their present taxes for other needs, the city's indebtedness for other purposes, etc. Mr. F. O. Seymour in writing his master's thesis at Peabody in 1916 had the same problems in making a study of school costs for Amarillo, Texas. Table 4 is adapted from his work.

Table 4.
Showing School Statistics for Certain Cities of ABOUT the Same Population and General Situation . Per Per Per cent Town Popu- lation cent of pupils to total popu- lation cent of pupils in school Wealth per capita City debt per capita City tax per capita of city revenue going to schools Coffeyville, Kan. 13,687 24.0 86 $635 $62 $30 48.1 Dennison, Tex. 12,632 19.4 69 501 22 14 41.9 Pearsons, Kan. 12,363 18.3 71 815 31 15 52.9 Sherman, Tex. 12,412 20.9 71 608 20 15 39.1 Guthrie, Okla. 11,654 20.1 72 474 68 14 34.2 Marshall, Tex. 11,452 19.4 61 409 51 13 33.4 Paris, Tex. 11,269 27.2 85 815 51 19 36.5 Palestine, Tex. 10,480 18.1 59 508 19 10 40.3 Cleburne, Tex. 10,464 24.3 78 508 31 14 40.8 San Angelo, Tex. 10,321 15.2 49 566 20 10 44.8 Amarillo, Tex. 9,957 16.8 61 549 26 9 45.7 It may be noted concerning these cities which Mr. Sey- mour has selected for purposes of comparison with Amarillo that : (1) They are of practically the same population; (2) they are all in the same section of the Collection of Data 69 country; (3) the various items given in the table are on the whole rather close together for the different cities. Had he included some eastern, northern, or extreme southern cities of the same population, undoubtedly the variations in these items would have been much greater. Examples of Bad Sampling. Failure to pick out samples that really represent the whole group may lead to some very fallacious conclusions. For instance, Professor Bobbitt in his article on the cost of instruction in high schools, for purely illustrative purposes, compares results from various cities. ^ The population of the cities he uses varies greatly. Figures could not be obtained for all, but Table 5 gives enough to show the variations. Table 5. Variations in Population of the Cities Used by Professor Bobbitt in His Study of High School Costs City Population in 1910 City Population in 1910 Mishawaka, Ind. Elgin, 111 Maple Lake, Minn. Granite City, 111. . East Chicago, 111. De Kalb, 111. . 
. . San Antonio, Tex. . Harvey, 111. . . . Waukegan, 111. . . South Bend, Ind. East Aurora, 111. . . Rockford, 111. . . 11,885 25,976 9,903 8,102 96,614 7,227 16,069 53,684 29,807 2 45,401 Booneville, Mo. . . . Brazil, Ind. .... . Leavenworth, Kan. . Greensburg, Ind. . . Morgan Park, 111. . . Noblesville, Ind. . . Norfolk, Neb. . . . Washington, Mo. . . Bonner Springs, Kan. . Russell, Kan. . . . Junction City, Kan. Mt. Carroll, 111. . . . 4,252 9,340 19,363 5,420 3,694 5,073 6,025 3,670 5,598 1,759 1 Bobbitt, J. F. : "Cost of Instruction in High Schools," School Review, 23 : 505-534. (Oct., 1915) 2 Population for Aurora. There are two high schools, one in East Aurora and one in West Aurora. 70 School Statistics and Publicity But Professor Updegraff has shown that the cost of instruction is higher in large cities than in small ones. See Table 6. Table 6. Variations in Cost of Instruction and Supervision PER Capita (Population) in Cities of Different Sizes ^ Per capita Per capita expenses of Median city in Population expenses of teachers' salaries salaries and expenditures for supervisors Total Group I 300,000 and over $50.98 $1.18 $52.16 Group II 100,000 to 200,000 36.15 3.26 39.41 Group III 50,000 to 100,000 36.93 3.39 40.32 Group IV 30,000 to 50,000 29.25 3.38 32.63 As to costs in schools in cities below 30,000, no such figures are known to the author. It is entirely probable that they are higher than the lowest figures cited here because of not having such full classes in high school. But the main point is that there are always great dangers in selecting samples that cannot reasonably be regarded as coming from the same class. Professor Bobbitt was, of course, perfectly aware of this difficulty and on page 506 of his article indicates that his tables are valuable as patterns of work mainly. 
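The difference between the two sets of cities can be put in a single figure. The sketch below is an illustration only: it uses the populations printed in Table 4 and the 1910 populations available in Table 5, and the ratio of largest to smallest is simply one convenient measure of spread chosen here, not one the sources themselves use.

```python
def spread(populations):
    """Ratio of the largest to the smallest population in a sample of cities."""
    return max(populations) / min(populations)

# Populations from Table 4 (cities chosen for comparison with Amarillo).
table_4 = [13687, 12632, 12363, 12412, 11654, 11452,
           11269, 10480, 10464, 10321, 9957]

# 1910 populations available in Table 5 (cities in the high school cost study).
table_5 = [11885, 25976, 9903, 8102, 96614, 7227, 16069, 53684, 29807, 45401,
           4252, 9340, 19363, 5420, 3694, 5073, 6025, 3670, 5598, 1759]

print(round(spread(table_4), 2))  # largest city is well under twice the smallest
print(round(spread(table_5), 1))  # largest city is more than fifty times the smallest
```

A superintendent might run such a check on population, wealth per capita, or debt per capita before trusting any comparison of costs; a very large ratio is a warning that the samples may not belong to the same class.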
Another example of the results of unwise selection of samples is furnished by Superintendent Spaulding's monograph on the cost of the Minneapolis schools.² He has a chart on page 46 showing the expenditure per child for ordinary maintenance of the elementary schools. But in this chart he has included three southern cities, Louisville, Birmingham, and New Orleans. All three have in their population large negro elements which seldom pay taxes and add little wealth to the community because they are not able to do so. At the same time they have relatively large numbers of children to be provided with schooling. Consequently these southern cities have to take care of a large number of negro children with no corresponding increase in revenue. Therefore, they should not be placed in comparison with Minneapolis. Their presence in this table has the effect of making Minneapolis appear more extravagant in her expenditures for schools than would be the case if only northern cities were considered.

¹ Adapted from Updegraff, Harlan: "A Study of Expenses in City-School Systems," Bulletin U. S. Bureau Education, 1912, No. 5, pp. 7, 86. (Computed on enrollment of pupils)
² Spaulding, F. E.: Financing the Minneapolis Schools. (Board of Education, Minneapolis, Sept., 1916)

Summary of Rules for Sampling. These may be given briefly:

1. Be sure that the measures of any one item represent its usual state.
2. Select samples absolutely at random.
3. If there are groups of data, take the same percentage of samples from each group.
4. Avoid using a questionnaire if possible. If it must be used, be careful to discount the results for classes liable to be selected by it.
5. If only a small number of samples are obtainable, select them with unusual care.

4. Blanks and Tabulating

Devices for Blanks. The actual collection of data should be made on some form for tabulating, called for convenience a "blank."
Rules for making blanks which hold good for many specific problems are extremely hard to lay down, but the following devices are often helpful:

a. Plan the blanks so as to get a maximum amount of information with a minimum of space.

This reduces chances for error, because the more nearly a person can see all of the blank at one look, the greater mastery of it and of the relations of its parts he will have. But this is not to be interpreted as advocating extreme condensation resulting in eye strain, or the elimination of all space for calculations. Nor does it mean that setting all the material on one sheet of paper is sufficient. A blank may be so long or so wide as to be hard to understand at one look.

b. Use "double distribution" tables where possible.

By these are meant "two-way" tables, that is, tables that classify data two ways, across the page on one classification, and down the page on another. They thus enable one to get the data from at least two separate tables condensed into a convenient form in one. Tables 7, 8, and 9, with slight modifications, are taken from Snedden and Allen's School Reports and School Efficiency ¹ to illustrate double distribution tables.

¹ Pp. 130, 132.

Table 7. Blank for Showing Number of Pupils Making Given Number of Days' Attendance

                  From 19   From 20 to   From 40 to   From 60 to   From 80 to
                  days      39 days      59 days      79 days      99 days      Etc.
First grade
Second grade
Third grade
Fourth grade
Etc.

Explanation: This table provides for a distribution of pupils' attendance in two ways: (1) by the grade; (2) by the number of days. Hence it is called a "two-way" or "double distribution" table.

Table 8. Blank for Showing Distribution of Pupils in Each Grade by Ages, Records of Ages Being Made

                                     First grade        Second grade       Totals
                                     Males   Females    Males   Females    Males   Females
From 5 yr. 6 mo. to 6 yr. 6 mo.
From 6 yr. 6 mo. to 7 yr. 6 mo.
Etc.

Note: This is a very convenient table, since it gives a distribution of pupils both by ages, and by grade and sex.

c. Strive to have all the printing on the blank so that it can be read from one position while filling in the data.

Table 7 was originally printed like Table 9.

Table 9. Blank for Showing Number of Pupils Making Given Number of Days' Attendance

[Table 9 has the same rows as Table 7 (First grade, Second grade, Etc.), but its column headings are printed so that the sheet must be turned to read them.]

Note: This is exactly the way the blank should not be made up. The proper form is given on page 72.

With care, the space usually can all be utilized either way. If all the titles and headings can be read from one position, that is, without having to turn the paper around, the chances for error are much reduced.

d. Put full or plainly abbreviated title, not merely an arbitrary symbol or a number, above each column, if any one besides the superintendent is to use the blank.

NATIONAL SCHOOL RECORD SYSTEM. TRANSFER CARD (Form No. 4, The Shaw-Walker Co., Muskegon)
To be filled out for the Attendance Officer in case of transfer to any other school, either in or outside city.
1. Last name                          2. First name and initial
6. Name of parents or guardian        7. Occupation of parent or guardian
8. Residence before discharge         9. Date of discharge
8. New residence (or name of private or parochial school if pupil is transferred to one)
10. Age when discharged: Years, Months
d. Grade   e. Room   f. Days Present   g. Health   h. Conduct   i. Scholarship
Date of last attendance     School     District     Teacher     Principal
Remarks on other side

Some superintendents get out blanks of the kind condemned here.
But it is very doubtful if such blanks will ever secure accuracy from anyone except the person who thought out the key (even he may forget it if it is at all intricate) and the small percentage of teachers who may be called "hyper-patient." Teachers dread such blanks for precisely the same reason that most readers dread having to refer constantly to footnotes in something they are trying to read.

If the data are to be copied or read to some one else, a key number should appear in each column immediately under, or in front of, the title. The reader or copyist can catch this key number much more quickly and easily than he can the heading. The key number is also preferable for calling or reading off to some one else. A good example of the use of key numbers and letters with titles is found in the National System of Uniform Records, the transfer card of which is reproduced on page 74.

e. If the same item appears in different blanks of a series, always give it the same key number and the same relative position to closely connected items.

The report of the Committee on Uniform Records and Reports, previously referred to, recognized this principle. The order of the various items appearing on the pupil's report card is the same as that appearing on the lower half of the attendance and scholarship record kept in the teacher's loose-leaf register, which is later filed in the principal's office. Other cards in the system which contain these items use the same key letters and relative position.

f. If data are to be summarized from figures furnished by many persons, make the summary with as little copying as possible.

Copying always tends to error, and the checking necessary to avoid such errors is very laborious and time-consuming.
The special economies here are things like the following: (1) In a building, the data for a report on the table given on page 72 may be secured by having each teacher call at the principal's office and put her data on the proper horizontal line. The completed blank will then be the desired summary. (2) If the data are to come from different buildings, each teacher can put hers on the proper line on a blank and send it to headquarters. There such blanks can be gathered, the line clipped off each blank (keeping necessary identification marks of course), and pasted on a sheet of paper, thus forming the summary at once. In very extensive or frequent tabulations of this sort, blanks with perforations where the lines are to be cut apart may be of service. Perforated blanks are, however, very expensive and usually unnecessary. For most purposes, cutting and pasting, or copying and checking, will be sufficient.

g. Figure or summarize on the blank itself if time can be saved thereby, as is often the case.

[Fig. 1. Example of Scoring and Figuring on the Blank Used for Collecting Data. The blank is headed "Data for Grading System, Hume-Fogg High School, Grades for 1913-14," with spaces for Subject, Class, Teacher, Sex, Years' Experience before 1913-14, and Head of Department. Its columns are ranges of marks (among them 90-94, 80-89, 65-79, and Below 65) and a Total; its rows are Boys and Girls. Tally marks, counts, and ringed percentages are entered in the cells.]

In summarizing the material collected as advised in section f, the additions can be made in ink (often red ink) on the bottom of the sheet where all the data have been collected. A good illustration of this point is afforded by a blank which was used by a class of graduate students making under the writer's direction a study of the marks given in the Hume-Fogg High School in Nashville.¹ (See Figure 1.)
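The practice of figuring on the blank itself, with the tally, the count, and the percentage all kept in the same cell, might be sketched as follows. The marks and the four bands are invented for illustration and are not the headings or data of the Hume-Fogg blank; the function names are likewise hypothetical.

```python
from collections import Counter

# Mark bands used as column headings; illustrative, not the exact headings of Figure 1.
BANDS = ["90-100", "80-89", "65-79", "Below 65"]

def band_of(mark):
    """Return the column heading under which a mark is tallied."""
    if mark >= 90:
        return "90-100"
    if mark >= 80:
        return "80-89"
    if mark >= 65:
        return "65-79"
    return "Below 65"

def figure_on_blank(marks_by_row):
    """Tally each row, then enter (count, rounded percentage) in every cell."""
    blank = {}
    for row, marks in marks_by_row.items():
        counts = Counter(band_of(m) for m in marks)
        total = len(marks)
        blank[row] = {
            b: (counts[b], round(100 * counts[b] / total) if total else 0)
            for b in BANDS
        }
        blank[row]["Total"] = (total, 100 if total else 0)
    return blank

# Hypothetical marks for one class, for illustration only.
blank = figure_on_blank({"Boys": [95, 85, 85, 70], "Girls": [92, 91, 60]})
for row, cells in blank.items():
    print(row, cells)
```

With groups as small as these, the percentages serve only to show the method, which is the same caution the text attaches to Figure 1.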
In Figure 1 original scores are indicated by the marks above the line. The plain figures below each line are the numerical equivalents of the scores. The figures in rings represent the approximate percentage values of the numbers just above them. On the original blank such percentages could be indicated by red ink without using rings. All percentages are calculated here to show the method. But in this particular problem, because of the small number of cases in each group, such percentages have little value. See pages 13, 63.

¹ The results of this investigation appeared in Educational Administration and Supervision, 1: 648 (Dec., 1915), from which article this illustration is adapted.

h. In case only one copy of a blank can be made, use cross-section paper, or ruled blank book.

Cross-section paper saves an enormous amount of time in ruling paper and avoids eyestrain. Such paper may be purchased from a good scientific supply or drawing company, in sheets of various sizes. That particular kind characterized by heavy lines every five or ten small squares is the better kind. The heavy lines serve as columns, while the small squares help to keep figures under each other. A ruled blank book or account book, especially the kind extending over two adjacent pages, is similarly useful because of the vertical columns and horizontal lines.

i. Put at the top of the blank the item which determines the place of the blank in a series or which identifies it.

The observance of this caution will make for ease in handling the data. For example, in the case of cards that are to be filed alphabetically according to the name of the student, this name should appear at the top of the card where it will be the first thing the eye of the one going through the records will light upon. Yet many school men get out blanks with the name of the school, the name of the card, the name of the town, and other insignificant items at the top, which is the choice position for identifying a blank quickly, especially if it is ever filed. Such material is very seldom handled except by persons who use the blanks more or less constantly. For these, this material may as well come at the bottom of the card and often even in fine print.

Even the transfer card of the National School Record System shown on page 74 could be improved upon in this respect. The card is to be filed alphabetically by the name of the pupil. This name should come on the top line, and the title of the card with the caution to the attendance officer should be placed at the bottom.

j. Always use blanks cut to standard sizes.

This is essential for easy handling of data, filing for ready reference, and storage for future use. Folders and filing cases can easily be procured for standard sizes, and the printer or supply house can furnish paper and cards for these sizes at much lower prices than for odd sizes. The standard sizes are 3 x 5, 4 x 6, 5 x 8, and 8½ x 11 inches.

Examples of Good Blanks. The busy superintendent may often save much time by getting good blanks used in other systems and quickly modifying them for his own purposes. Such blanks may be found in:

1. Snedden and Allen: School Reports and School Efficiency. (Macmillan)
This discusses the good and bad points in actual blanks from city school systems, on such topics as the school plant, expenditures, the census, attendance, attendance and ages, promotions, summaries, etc.

2. Report of the National Education Association Committee on Uniform Records and Reports, published as Bulletin U. S. Bureau Education, 1912, No. 3.
It gives model blanks for the following in elementary schools:
   1. Principal's term report: (a) enrollment; (b) promotions, non-promotions, by grades; (c) distribution of withdrawals by ages and causes; (d) distribution of attendance; (e) graduates by years in schools; (f) non-promotions by grades and causes; (g) failures by studies and grades; (h) distribution of leavings and withdrawals by ages and grades; (i) ages of graduates; (j) enrollment and attendance; (k) distribution of whole-time teachers.
   2. Teacher's term report: (a) enrollment by divisions; (b) non-promotions by grades and classes; (c) failures by grades and studies; (d) enrollment and attendance; (e) distribution of enrollment by ages; (f) distribution of withdrawals by ages and causes; (g) distribution of leavings by ages; (h) beginners by training; (i) beginners by ages.
There is also a set of blanks for high schools similar to these.

3. Strayer and Thorndike: Educational Administration. (Macmillan)
This contains extracts from statistical researches on education made at Teachers College, Columbia University. It contains some of the blanks recommended by the National Education Association Committee as well as very valuable table forms on other things.

4. Rugg, H. O.: Statistical Methods Applied to Education, pages 28-87. (Houghton Mifflin)

5. Certain Surveys.
The Butte Survey, for example, contains good suggestive blanks on these phases: census, attendance, absences, education and experience of teachers, enrollment, promotion and failures, size of classes, ages of children at beginning of first semester, receipts and expenditures.

One Blank versus Card Index. In many pieces of statistical work, one must early face the question as to whether it is better to put all the data on one large blank, allowing one line to each case or group, or to enter each case or group on a separate card.
One of the best arguments for the latter method is that the data in the columns cannot show their full meaning, as a rule, unless arranged in order from lowest to highest. If the data are all on one large sheet, the classes may come in order for the data in the first column, but this order will not be the same in any one of the other columns, which may be several in number. If these columns are to be accurately studied, the data in them must be copied off and rearranged, with many chances for error. On the other hand, if no large table is made, the cards may be taken up in order easily for one column. The data may then be copied off for a table on this one item. Next, the cards may be rearranged for column two and the corresponding table made. This process is much easier mentally and more accurate than the first one mentioned above. It is the procedure used in the Courtis tests, where each child has a folder containing his own work. These folders are taken up and sorted into piles of differing achievement for attempts in addition, and from these a distribution table is made; they are redistributed for attempts in subtraction, and a second distribution table is made, this time for subtraction; the process is repeated for multiplication and division. Then the start is made on "rights," and the four sortings take place on this basis and the four resulting tables are copied off.

To summarize, the advantages of each plan are:

Large Sheet
The data are not easily lost. It is impossible to lose a part, as can easily happen in the use of cards.
All data are before the eye at one time, and thus a better preliminary grasp of the situation may be had.
Less work in compiling at first.
At the beginning data may be copied in any order.

Card Index Plan
Less chance for error in copying or rearranging data.
Easier mentally.
Any parts of the data may be separated from the remainder at any time.
The data may be easily shifted to any order.
Pupils can handle data better. Often it will be advisable to give portions to different workers.

The disadvantages of these two plans are largely the converse of the advantages, but they may be emphasized through the statement of them:

Card Index Plan
If card is lost, it cannot be replaced, nor can the loss be easily discovered.
A card may stick to another and be easily overlooked.
Cards for easy handling must ordinarily be kept in alphabetical order. Hence if shifted for purposes of making tables, etc., must be returned to regular order.
Much mechanical work is necessary in examining data.

Large Sheet
Data from large buildings or groups not easily brought together on one sheet.
Additional data (often come late) are hard to insert.
Data in all columns not arranged from highest to lowest nor can they be so arranged without recopying.
More chances for error in recopying and rearranging.
Will stand less wear and tear.
Averages and summaries are hard to make unless data are arranged in order from the start. Often this cannot be determined when data are copied on large sheet.

On any given problem, a little forethought or practice on sample data will usually indicate which is the better plan. However, it is very difficult to be certain in this matter until the whole statistical procedure has been thought through.

EXERCISE

For your special problem:
(a) Decide just where you would use a card index and just where one large blank, giving full reasons.
(b) List the blanks, with very definite titles, that you now think you would need.
(c) Draw off in complete detail, accurate to size of paper, at least one of these blanks.

5. Miscellaneous Economies in Collecting Data

a. Cross-section paper for scoring. Many times it is necessary to keep track of the number of cases, the data for which may come from widely separated places.
Keeping track mentally is very difficult and inaccurate. It is far better to adopt some method of putting down a mark for each case as soon as it is located, and later count the marks. The cross-section paper device is one for such marking.

[Fig. 2. Scoring on Cross-Section Paper.]

Suppose that a study is being made of the marks given by various teachers to pupils. A sheet, column, or whatever space is desired, may be assigned to each teacher. One horizontal row of large squares may be labeled 90-100 or A, according to the grading system; another 80-89 or B, etc. Each mark between 90 and 100 as soon as located would be represented by a check mark in one of the little squares in the row of large squares representing grades 90-100, under the column for the proper teacher or subject. For such work, always begin at the left upper corner of the first big square on the left and fill in regularly to the right, row by row. Suppose, when the work has been finished, that the 90-100 column looks like the illustration given. One glance at the squares in Figure 2 shows that there are 33 grades between 90 and 100 given by the teacher during the time under investigation.

b. Scoring by fives on plain paper. This is the old device used in counting the votes in elections, etc. The first four cases of every five are given a straight perpendicular mark each; the fifth is made horizontally, tying the other four, as in Figure 3.

[Fig. 3. Scoring on Plain Paper.]

This makes 33 cases, as in the preceding method of scoring. For most people the counting is much easier if the groups of five are under each other and not in a horizontal line.

The author also has made much use of the scheme of putting down a mark for every case in the group as best he could for speed, on any paper handy. Then rings are drawn around the marks with five in each ring. It is easy to count the rings. The chief merit of this scheme for a single worker is that the eye does not have to be shifted quickly and accurately, with the resulting strain. Putting down the dots roughly requires no eyestrain. Putting the rings around the dots is not hard, as the eye is kept focused on the particular part of the paper. Thus, the same 33 cases might be scored as in Figure 4.

[Fig. 4. Scoring on Plain Paper.]

c. Checking on blanks. Where possible, tabulate by check marks in appropriate column on a blank, as in the United States Government stock device. Thus, a superintendent could study physical defects of school children with a blank like Table 10.

Table 10. Blank Illustrating Method of Checking in Entering Data

[The blank is headed "Physical Defects." Its columns are Pupil No.; Sight (Right, Left); Hearing (Right, Left); Adenoids; Tuberculosis; Etc. There is one line for each pupil, numbered 1 to 7, and an X is entered under each defect found; pupil 5, for example, has three X's, while pupils 3 and 4 have none.]

d. Some entry for each case. In some instances accuracy is increased by the device of making some sort of entry for each case. It is very easy to omit a few cases from a large number if entries are made only where data actually exist. But if numbers are put down for actual data, zeros if it is known that nothing is done, and "n. d." (for "no data") where it has been impossible to secure data, the results will be likely to be much more accurate. The necessity of having to account for each case in a positive form reduces the chances for omission, a fact long ago discovered by insurance companies, which require their agents to make some report on each item on a blank.

e. Ruler strip for printed reports. Often it is necessary for the superintendent to get figures from two or three widely separated columns in a table of fine print. There is great eyestrain and many chances for inaccuracy in trying to copy them off directly, especially where the tables run over two pages, as do those in the government bulletins.
The best way to avoid this is to cut out a strip of paper which may be placed over the table so that the desired figures will stand out clearly and quickly in the angles of the paper. Table 11 indicates how to work the scheme if the superintendent wishes to compare his system with the cities in the table in the matter of expenditures under the headings, Board of Education and Business Offices, Superintendent's Office, and Other Supplies. Government offices use specially constructed rulers for the same purpose.

Table 11. Illustration of Use of Ruler Strip Device on a Page in the Report of the U. S. Commissioner of Education

[The page has the column headings Cities; B'd of Ed. & Bus. Offices; Supt's Office; Sal. & Exp. of Supervisors; Sal. of Prin.; Sal. of Teachers; Text Books; Other Supplies. With the strip laid over it, only the three columns wanted remain in view.]

                            B'd of Ed. &      Supt's       Other
Cities                      Bus. Offices      Office       Supplies
ALABAMA
  Birmingham                   $3631          $10750        $5967
CALIFORNIA
  Los Angeles                 128163           43801       105600

Another very convenient form of ruler strip is made with two strips of zinc adhesive or bicycle tape and some thin, tough paper. The strips of adhesive are pasted across the paper, but a little apart. Then the top strip is cut across, the cuts corresponding to the lines in the table to be used. Then some of the resulting flaps may be pulled back and pressed back to show the figures in the desired columns, as in Figure 5.

[Fig. 5. Illustration of Use of Adhesive Tape Ruler Strip on Figures of Table 11.]

The advantages of this form are that it can be used for various combinations of columns, and that it is very durable.

f. One operation at a time. The idea is to carry the same step or operation all the way through without stopping to do something else. That is, if one is preparing a table from data found in different tables, he should copy all the data from each table in turn, and not skip from one to the other. Or if one has several groups of data to rearrange, he should finish each group before going to the next.
This gives the practice effect for that operation, insures greater accuracy, and is much easier mentally.

g. Using high school students to gather data. For gathering many of the data, high school students can do as well as any one else. A class can collect, classify, and check a great deal. Of course the question at once arises as to whether this is a legitimate use of the time of high school students, who supposedly go to school for their own benefit and not to help work out school statistics. But many of these are later to do clerical work of various sorts. Much of the statistical work in school affords the finest sort of clerical practice for such students. Their welfare will be properly cared for if the work is done under the careful supervision of the superintendent. He will naturally discover a great deal of valuable knowledge about the vocational aptitudes of his individual students for satisfactorily doing clerical work or for directing others in such work. Accordingly it may be considered thoroughly sound preliminary training for them, as Professor Bobbitt has pointed out.¹

¹ San Antonio Survey, pp. 32, 33.

But there is more direct evidence of the value of such work than any such theoretical statement. In 1915 one of the author's graduate students, Mr. S. J. Phelps, later professor of secondary education at the University of Vermont, directed the work of the students of the Gallatin, Tennessee, High School in making a study of four problems connected with their school system.¹ These problems were:

1. Cost of maintenance of the system.
2. Age-grade distribution of the pupils.
3. Variations in marks given by high school teachers.
4. Study of the lighting facilities in each room.

Clear and concise instructions were placed by Professor Phelps in the hands of the various teachers engaged in the work.
The answers of the students were carefully checked over in class and then submitted to the inspection of a graduate class in school administration at George Peabody College for Teachers. Most of these men had had much experience in administrative work and all were specializing in survey work. They unanimously agreed that the work was surprisingly accurate.

In answer to the question, Is not such work an exploitation of the students? Professor Phelps answers:

"All are agreed that in mathematics, especially, it is necessary for a high school student to do a large amount of drill work. Now in doing this, which is the more profitable and practical for a high school student who is studying, for instance, eleventh grade civics or arithmetic, to do as outside work: To study the costs of operating his own school system, compared to similar costs in other towns, or to find out the number of days it will take A, B, and C working together to do a piece of work which A can do in three days, B in four days, and C in five days?

"How would a study of costs in his school compare with a paper which he might, after much delving, prepare on the source of some abstract principle of governmental costs? Isn't this a place where the much-talked-of subject, Community Civics, could get some practical problems?

"Which seems more practical and profitable for a class in algebra studying the graph: to make a graph of these same costs, or to graph the profile of a river bed, or perhaps an extract from the table of American Mortality Experience?

"In which would a student in percentage be expected to show the more interest: in a study of the percentage distribution of the marks given by his own teachers, among which he has a mark, or in studying the percentage composition of some compound, perhaps a fertilizer, which he has never seen and in which he cannot be expected to show a passing interest or curiosity?

"Would another student in practical measurements get more from computing the surface and volume of the earth, or from finding how many spheres of a certain diameter could be placed in a cylindrical cup of certain dimensions, than he would get from studying the ratio of lighting space to floor space and air space per pupil in the same room?"

[1] Phelps, S. J.: Master's thesis at George Peabody College for Teachers, 1915, on file in library.

From such suggestive questions as these it may be surmised that work of this sort, instead of "exploiting" the high school student, would be of great practical benefit to him. A report of similar work from Wisconsin is as follows:

HIGH SCHOOL CLASSES GRAPH CONDITIONS
Over-Age and Failures Studied

In connection with the algebra work in the Frederic High School, graphic representations in colors are made showing conditions in the number of students retarded, and other school problems. One percentage graph compares the percentage of students over age, showing a distinct decrease in retardation during the past four years. Other graphs include material on students dropped, failed, and promoted in various subjects. Teachers are also compared with respect to the number of students failed by each of them. Incidentally, such work as this gives the algebra pupil practical work to do, illuminates the general subject of graphic analysis, and makes mathematics interesting.[1]

[1] Wisconsin State Department of Education: Educational News Bulletin, Jan. 1, 1917.

EXERCISE

Which of the miscellaneous economies would be applicable to your problem, and just how would you use them?

REFERENCES FOR SUPPLEMENTARY READING

King, W. I. Elements of Statistical Method, Chapters IV-IX.
Report of the Committee on Uniform Records and Reports. U. S.
Bureau of Education Bulletin, 1912, No. 3.
Rugg, H. O. Statistical Methods Applied to Education, Chapters II, III.
Thorndike, E. L. Mental and Social Measurements, Chapter II.

CHAPTER III

TECHNICAL METHODS NEEDED IN SCHOOL STATISTICS

I. USUAL VIEWS

So far we have considered only statistical matters that are plain to any experienced school man. With the suggestions previously given, such a man could successfully collect statistical data on most of his school problems. But the working up of the data and the proper interpretation of them would be altogether different and much more difficult. He would at once face such questions as these: Does the superintendent need any technical knowledge of statistics? Can he, without such special knowledge, analyze his data and get the really significant things out of them? Or, without such knowledge, can he present these results effectively to the public?

The Conservative's View. Attempts to answer these questions have brought forth much nonsense and fruitless effort. For example, one group of school men take the position that a superintendent needs no special knowledge of statistics, simply because there is in their opinion no virtue in statistics. They quote the old statement of Bagehot: "There are three kinds of lies: lies, damned lies, and statistics." Or they reiterate: "Figures don't lie, but liars do figure." Or else they would at least agree with the author of a recent article that "whatever the causes, the fact is that any one who presents his arguments in the form of tables, and his conclusions in dogmatic statements presumably based on the tables, is sure to convince nine tenths of his readers."[1] Consequently, superintendents having this viewpoint see no need of any special knowledge of statistics. They think that any effort to do anything unusual with statistical data is simply a waste of time.
The absurdity of such an opinion should be evident to any one who has even glanced through the preceding chapters of this book. Professor King states the whole point very strikingly thus: "To attempt to handle statistics properly without a knowledge of statistical method is only a little less absurd, though vastly more common, than to attempt to build a great steel bridge without a knowledge of trigonometry."[2]

[1] "Lies, Damned Lies and Statistics," Unpopular Review, 1915, Vol. II, 352-353.
[2] King, W. I.: Elements of Statistical Method, pp. 37-38.

The Specialist's View. Another group, composed of scientific educators and statistical experts, advocate either a very thorough course in statistical method or none whatever. They quote Pope on "A little learning is a dangerous thing," etc. Or they say that giving a school man only a little, or very superficial, knowledge of statistics is like putting a razor in the hands of a baby. Or they compare the results of such a procedure with those ensuing when very delicate and expensive machinery is put in the hands of a novice. Such machinery, which would produce wonders if run by a competent man, is, of course, soon ruined by a bungler, and the result produced is very inferior or altogether lacking. Even Professor King says: "The science of statistics, then, is a most useful servant, but only of great value to those who understand its proper use."[1]

The Golden Mean. The truth probably lies between these extremes. Even Professor Thorndike, the pioneer in the application of modern statistical method to educational problems, favors this moderate view when he says: "There is, happily, nothing in the great principles of modern statistical theory but refined common sense, and little in technique resulting from them that general intelligence cannot readily master."[2]
Of course, he later says that mathematical gifts and training will be very useful to students of quantitative mental science, but such things are not absolutely necessary for learning the elements of statistical method.

[1] King: op. cit., p. 33.
[2] Thorndike, E. L.: Mental and Social Measurements, p. 2.

Observations in other fields also support this "golden mean" view. Often the successful politician, minister, or business man has better practical ways of controlling people and reading human nature than has the expert in the psychology of such matters. It is not an uncommon thing for an experienced public speaker to influence an audience more than a teacher of public speaking could hope to do. We are only saying in other words that common sense and first-hand experience in reading and controlling human minds are as powerful factors in influencing the public as expert knowledge in the mechanics of any art having the same end in view. Is it not reasonable, then, to expect the experienced superintendent with a small amount of statistical theory to outdistance, in practical statistics with the public, the best-trained experts in statistical theory only?

II. STATISTICAL KNOWLEDGE NEEDED FOR SCHOOL SURVEYS

But another phase of this question arises. The modern superintendent must be able to read and understand school surveys and apply the statistical methods used to his own school problems. How much statistical work does he need for this sort of thing? A study of practically every school survey thus far published shows that to meet this requirement the superintendent should, in addition to things previously mentioned, know:

1. The meaning of these terms: median, average, quartile, range, central tendency, variability, overlapping, and coefficient of variability.
2. The methods of computing medians, quartiles, averages, and variability so that any fallacies or mistakes arising from poor or false methods may be detected.
3. How to read and understand tables and graphs.
4. The principles underlying the theory of good and bad units, and how the unit in the particular study was derived.
5. The principles underlying the construction of graphs, so that he will not be misled by a badly constructed graph.
6. How scales are derived, so that he will not be misled by the interpretations made from data measured on these scales.

However, the citation of survey material as evidence of the relatively small amount of statistical knowledge needed by the superintendent is open to at least one criticism, that the surveys are written for the public, and to be effective must contain little or no technical material. But the proper translation of such facts into popular language presupposes accurate statistical work. And few superintendents could hope to do more than to give the best possible statistical treatment to such problems as are so treated in all our best school surveys to date.

III. STATISTICAL KNOWLEDGE NEEDED FOR READING EDUCATIONAL INVESTIGATIONS

Moreover, for publicity work a good superintendent needs to keep up with recent investigations in education and psychology, as carried on by investigators in the various schools of education and by educational foundations. Some of the most valuable of these are not written especially for laymen and require some knowledge of statistical terms and methods to be understood at all. Good abstracts or news items of such studies often employ these terms. The superintendent needs to understand the terms and general processes used, but he need not know how to calculate or employ these terms accurately himself. It would be better if he could so use them, but it is not absolutely necessary. The situation is similar to that in reading for all of us. We must know how to employ, use, and spell correctly certain words. We need to be able to read with sufficient understanding a much larger list of words.
But it is not necessary for us to know how to spell these latter words, or how to use them correctly in our speech or writing. The superintendent needs to understand the meaning, for reading purposes, of at least the following:

1. Such terms as: average deviation, standard deviation, mode, probable error, inter-percentile range, correlation, skewness, dispersion.
2. The reliability of the various measures of variability.
3. The effect of different methods of grouping data, on the conclusions reached.
4. Some of the common methods of making allowance for the unreliability of data.

IV. ILLUSTRATION OF VALUE OF STATISTICAL METHOD TO THE SUPERINTENDENT

The value of statistical method and presentation to the superintendent may be most clearly presented by a concrete illustration. Suppose a superintendent wishes to know whether his high school classes are too large or too small for good work. He may take as his standards the pronouncements of colleges or universities, or the actual classes found in high schools that he or some competent person rates as good ones. He would, of course, get the enrollments of all the classes in his own high school, say fifty or more. He would get similar figures for twenty other high schools, or more if possible, say for at least a thousand classes.

Then he would be confronted with the problem of handling this enormous mass of data so as to bring any clear idea out of it. His data would cover pages and pages. He would have an unwieldy mass of facts that needed simplifying. Without proper treatment, his ideas of the whole would be "decidedly vague and indefinite." He would need some procedure that would enable him to "give clear-cut form to this hazy conception" and to "set objects in their proper perspectives and relationship."[1]
Now if he knew the elements of statistical method, he could very shortly summarize his data on a half page, as Professor Bobbitt does in the School Review for October, 1915.[2] In this article, to which we have previously referred, Professor Bobbitt is making studies of the cost of instruction per one thousand student hours in a number of high schools. First, each study, as English, mathematics, etc., is worked out separately as shown in Table 1, on page 18 of this book. Table 12 is the summarizing table.

[1] King, W. I.: Elements of Statistical Method, p. 28.
[2] Bobbitt, J. F.: "High School Costs," School Review, 23: 505-534.

Table 12. Bobbitt Table Showing Sizes of High School Classes by Subjects

Subject                   Median No. Pupils   "Zone of Safety," Pupils
Music                            58                   42-88
Physical Training                32                   28-55
English                          22                   20-24
Mathematics                      21                   18-24
History                          21                   17-23
Science                          20                   16-22
Agriculture                      19                   18-25
Commercial                       19                   15-23
Drawing                          18                   14-24
Modern Languages                 17                   15-20
Latin                            17                   14-19
Household Occupations            17                   13-23
Normal Training                  15                   10-21
Shopwork                         14                   12-18

To any one who understood the simplest things about statistics, this table would at a glance disclose such facts as these: In music, half the schools have more than 58 pupils in a class, and half have less; half of them have between 42 and 88 pupils; a fourth have less than 42 pupils; a fourth have more than 88 pupils. Similar statements would hold for the other subjects down the table. The table would also disclose that the "average classes" are in agriculture and commercial subjects, with 19 pupils each as a rule. Half the classes in other subjects have more than this number and half of them have less. The table would show which had more and which less, ranging from music with 58 down to shopwork with 14.
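As a rough sketch of how one row of such a table could be produced, suppose the superintendent has the enrollments of all classes in a single subject. The figures below are invented for illustration and are not Bobbitt's data; the function name `summarize` is likewise hypothetical. Textbooks differ in how they locate quartiles, and the "inclusive" convention of Python's standard library used here is only one choice.

```python
from statistics import median, quantiles

# Hypothetical enrollments of English classes in several high schools
# (invented for illustration; not taken from Bobbitt's study).
english_classes = [18, 20, 20, 21, 22, 22, 22, 23, 24, 24, 25, 27, 31]


def summarize(sizes):
    """Return (median, lower quartile, upper quartile) of the class sizes."""
    q1, _, q3 = quantiles(sizes, n=4, method="inclusive")
    return median(sizes), q1, q3


mid, low, high = summarize(english_classes)
print(f"Median {mid}; middle half of classes between {low} and {high} pupils")
```

Repeating the same summary for each subject and listing the subjects from the largest median to the smallest would give a table of the same form as Table 12.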
A little more inspection would disclose which classes ranged more in their variations from the "typical" or "average" class in that subject. All these things, if told in words, would occupy pages and pages of description that would be about as clear and interesting as a real-estate deed to the lot on which the superintendent's home stood.

The Bobbitt table,[1] of course, is infinitely clearer and more forcible than the great masses of data or the long and tedious description could ever be. It has indeed fulfilled "one of the prime objects of statistics." This, according to Professor King,[2] is "to give us a bird's-eye view of a large mass of facts, to simplify this extensive and complex array of isolated instances and reduce it to a form which will be comprehensible to the ordinary mind."

[1] The writer uses this name because such tables appear to have been first used by Professor J. F. Bobbitt of the University of Chicago.
[2] King: op. cit., p. 22.

V. STATISTICAL METHOD AS A FORM OF EXPRESSION

Finally, the superintendent needs statistical method just as he does any other method of effective presentation or expression. A good description of scenery, of an object, a face, etc., always proceeds by giving first a bird's-eye view, or very brief comprehensive sketch, called usually the "fundamental image," or in exposition, "the topic sentence." Then the details are later filled in. The success of the description or explanations depends upon the clearness, brevity, and vividness of the fundamental image, and then upon the extent to which approximately all the important details place themselves clearly under it. Of course, the whole process must not be so mechanical and obvious as to disgust the reader. But without the fundamental image and this procedure, the reader would soon be utterly lost in the details, or he would get so few of them in mind, or in such a disorganized manner, that he would get no clear idea of the whole.
And he would not waste any more time in trying to do so.

In the same way statistical method forces a mass of numerical data into a form which describes the whole by giving a good fundamental image or picture. At the same time it leaves all the data so grouped and classified that significant points stand out, but both points and minor details of any consequence may all be easily located under the fundamental image.

If suitable graphic presentations of the statistical results are made, they will have the same advantage that a line drawing has over a photograph. The superintendent without a knowledge of statistics might give in words only a picture that would correspond to a photograph taken without proper focus. This, of course, gives all the details, but blurred. A suitable graph would make the essential facts of the whole and the essential details stand out clearly. It has been found repeatedly by textbook writers for beginners that line drawings emphasizing the essential elements in apparatus, pictures, etc., are preferable to actual photographs of the objects, because the photographs give too many details and so obscure the big things. The superintendent with no knowledge of statistical method could, under the most favorable circumstances, give his reader only a sort of blurred photograph of his ideas. And, indeed, he would probably have only a blurred picture of them in his own mind. With statistical method he could have in mind a sharp line drawing and give his readers the same kind of picture. The essentials of this method that are of value to him will be taken up in the next chapter.

EXERCISES

1. Which of the statistical terms mentioned on page 93 do you think you understand fully?
2. Which of the statistical processes given on the same page do you think you know how to do?

Note: The best way to know whether you understand a term fully is to see whether you can quickly write a clear explanation of it.
Similarly, to know whether you can tell how to do a process, see if you can quickly write directions for doing it.

3. Which of the statistical terms mentioned on page 94 have you come across in your reading, and just what do they mean to you at present?

REFERENCES FOR SUPPLEMENTARY READING

King, W. I. Elements of Statistical Method, Chapters I-III.
Rugg, H. O. Statistical Methods Applied to Education, Chapter I.
Thorndike, E. L. Mental and Social Measurements, Introduction and pages 36-41.

CHAPTER IV

SCALES, DISTRIBUTION TABLES, AND SURFACES OF FREQUENCY

Thus far we have discussed only the most elementary statistical matters. But we have seen the need of some technical knowledge of statistics, which we shall now proceed to develop. The treatment will include the meanings of the various statistical terms, methods of calculating them, cautions as to their use, and devices for showing them graphically.

I. SCALES

Review of Scales. The first essential in all statistical work is to determine the units and scales to be used. It is impossible to collect data profitably until this has been done. For this reason, these terms were discussed at length in connection with the collection of data. If the reader is not familiar with the treatment given there,[1] he should read it before proceeding with this section. For our purposes here it is necessary to understand:

1. That whenever possible all the measures in a group should be expressed in terms of a definite common measure called a "unit."
2. That all measures in any group, when arranged in order of size, make a scale going from high to low.
3. That we must know what a given measure on a scale means, i.e., whether 6 extends from 5.5 to 6.5 or from 6.0 to 6.99 or from 5.95 to 6.05, etc.
4. That we must know whether the scale is discrete (all measures separate or with gaps between) or continuous (measures running into one another and spread out all over the scale).

[1] See pp. 43-57.

[Fig. 6. Device for Representing a Discrete Scale Graphically. This graph shows the cost per 1000 student hours in mathematics in certain high schools. (From J. F. Bobbitt, School Review, 23: 509.)]

It is worth noting that many of the scales the superintendent uses, especially those he makes up for himself, are discrete in appearance but are really used as though they were continuous. That is, each separate item is regarded as extending half the distance to the nearest items above and below.

When the measures are discrete and there is a relatively small number of cases, say twenty to thirty, they may be shown in a Bobbitt table.[1] In this kind of table the name of each measure is written in the left-hand column and the size of the measure in the right-hand column. The measures begin with the highest and run to the lowest, thus giving an idea of a scale like that of a thermometer, which goes up from the bottom.

[1] See pp.

[Fig. 7. Thermometer Chart for Presenting a Discrete Scale. It shows the total tax on $1000, for the year 1912, in all the cities and towns of the metropolitan district, Boston, Massachusetts. Note omission of zero line. (From 1912, Newton, Massachusetts, School Report, page 113.)]

Graphic Presentation of Discrete Scales.
A discrete scale is easily presented graphically by various methods. Four of these methods are given here, three of them using the data from the Bobbitt table on page 17.

1. By vertical scale on left with names of items to the right. Figure 6 shows this.

2. By thermometer device. Superintendent Spaulding in his 1912 report for the schools of Newton, Massachusetts, showed a similar table as a scale on a thermometer. As this is now out of print, the device is here reproduced. It is a very excellent one, save that the zero line is not properly shown. See Figure 7.

3. By bars to represent magnitude. The procedure for this is: First, the names of the items (schools in this instance) are placed on the left, the highest at the top. In the next column is placed the corresponding magnitude (cost per 1000 student hours in this case). At the top and running out to the right, the scale is placed (at $10 intervals here). Each distance of five small squares on the scale represents $10 and one small square will thus mean $2. Therefore, to get the bar for the first school, go out on the scale to 170, drop back half a square to get 169, and then construct the bar from the base or zero line to this point. In the same way the bar for each other school may be constructed. The significance of the bar in such a graph lies in its length only, not in its width. Hence, all bars in the same diagram must be uniform in width. The bars are here drawn adjacent to each other, but there can be spaces between if desired. See Figure 8.

[Fig. 8. Graphic Representation of a Bobbitt Table, Histogram Form. Names of schools at the left, cost per 1000 student hours on the horizontal scale. The data are from Table 1, page 18.]

4. By a "curve" drawn with the magnitudes on the vertical axis and the names of the cases running in order from high to low (or vice versa) on the horizontal axis. See Figure 9. This diagram or graph has been constructed in exactly the same way as the preceding one, except that dots have been placed at the middle of the spaces where the ends of the bars came in the preceding diagram, and then joined with a line. The dots were put in faintly and have been covered up by the line. Obviously, then, the "curve" is nothing more than the smoothing down of the corners of the bars.

[Fig. 9. Graphic Representation of a Bobbitt Table, Smoothed Curve Form. The data are from Table 1, page 18. This is simply the smoothed form of Fig. 8.]

EXERCISES

1.
The salaries of school superintendents in Missouri cities between 2500 and 5000 population in 1914-15, for all cities on which data could be obtained, were as follows:

Booneville, $1,650; Butler, $1,320; Cameron, $1,400; Carterville, $1,200; Caruthersville, $1,200; Charleston, $1,500; Clinton, $1,920; De Soto, $1,400; Excelsior Springs, $1,500; Farmington, $1,400; Fayette, $1,500; Festus, $1,140; Fredericktown, $1,450; Kennett, $1,500; Kirkwood, $2,400; Liberty, $1,800; Louisiana, $1,350; Macon, $1,700; Marceline, $1,100; Marshall, $2,100; Maryville, $1,500; Monette, $1,500; Rich Hill, $1,200; Richmond, $1,500; Sikeston, $1,500; Slater, $1,200; Warrensburg, $1,600; Washington, $1,170; West Plains, $1,500.

Make up a Bobbitt table to show the status of these salaries, and graph this table in as many ways as you can.

2. Make up a similar table on superintendents' salaries in a group of cities in some other state that may be legitimately compared, choosing cities of some other size if preferred.

Statistics on population may be gotten from census reports or a good almanac like the World Almanac. Figures for salaries may be gotten from directories issued by book companies, from reports of the state superintendent of education, or from "A Comparative Study of the Salaries of Teachers and School Officers" (Bulletin U. S. Bureau of Education, 1915, No. 31). Volume II of the Annual Report of the U. S. Commissioner of Education will furnish figures for the total expense of the superintendent's office but sometimes this contains other items in addition to his salary. In the smaller cities it very closely approximates the superintendent's salary.

II. DISTRIBUTION TABLES

In a continuous scale or distribution, it is customary to group measures of magnitudes that are almost the same in value, and call them by a group name.
Thus, in the Ayres spelling scale, all words spelled by from 70 to 76 per cent inclusive of children in a given grade are lumped into one group and called 73 in difficulty for that grade.

It is also customary in such a distribution to make up a table called a "distribution table" or "table of frequency." The table is made up with magnitudes in order of size in the left-hand column and the corresponding numbers of cases or frequencies in a parallel column on the right, the smaller measures preferably being at the bottom. The grouping is for the purpose of condensation and clearness, but the cases can always be kept individually if necessary. Professor Dearborn in a study of grades has kept all the cases of each magnitude separate. For example, he presents the distribution of marks given in English to 69 eighth grade children as shown in Table 13.[1]

[1] Dearborn, W. F.: "School and University Grades," Bulletin, University of Wisconsin, No. 368, H. S. Series, No. 9, 1910. From Figure 9, p. 25.

Table 13. Distribution Table of Marks of Eighth Grade Children (from Dearborn)

Mark   No. Making      Mark   No. Making
100        0            80        3
 99        0            79        3
 98        0            78        3
 97        2            77        2
 96        0            76        1
 95        2            75        2
 94        1            74        0
 93        3            73        3
 92        4            72        2
 91        6            71        2
 90        2            70        0
 89        3            69        0
 88        3            68        2
 87        3            67        0
 86        1            66        0
 85        2            65        1
 84        3            64        1
 83        2            63        0
 82        3            62        1
 81        3            61        0
                        60        0

The magnitudes in this table run from 60 to 97. This table is exactly like a Bobbitt table, except that in this instance there are several cases of the same magnitude. But even here these cases are supposed or assumed to run from the lower limit of the magnitude to the lower limit of the next magnitude. That is, the six 91's are not all exactly 91 but are spread from barely 91 up to not quite 92, or from just 90.5 to almost 91.5, depending upon which system of definition of the measure 91 is used.
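The making of such a table from the separate marks is a simple tally. The following is a minimal sketch, not Professor Dearborn's procedure: the frequencies are those read from Table 13, expanded into one entry per child as they might stand in a class book, and the function name `distribution_table` is an assumption made for illustration.

```python
from collections import Counter

# Frequencies as read from Table 13 (marks with no cases are omitted).
table_13 = {97: 2, 95: 2, 94: 1, 93: 3, 92: 4, 91: 6, 90: 2, 89: 3, 88: 3,
            87: 3, 86: 1, 85: 2, 84: 3, 83: 2, 82: 3, 81: 3, 80: 3, 79: 3,
            78: 3, 77: 2, 76: 1, 75: 2, 73: 3, 72: 2, 71: 2, 68: 2, 65: 1,
            64: 1, 62: 1}

# Expand to one entry per child, as the marks would appear before tallying.
marks = [mark for mark, n in table_13.items() for _ in range(n)]


def distribution_table(values):
    """Tally values into (magnitude, frequency) rows, highest magnitude first."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)


rows = distribution_table(marks)
for mark, n in rows:
    print(f"{mark:>4} {n:>3}")
print("Total", sum(n for _, n in rows))
```

The total printed at the foot serves as a check that no case has been lost in the tallying; here it should agree with the 69 children reported.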
But we know that such grouping as Professor Dearborn has used in this instance is rather finer than we are accustomed to use in examining teachers' marks. Let us now make coarser groupings. The first possibility that occurs to the reader is probably that of making the groupings cover supposedly equal parts of the scale, say five units. Thus, we may make the groups cover 60-64, 65-69, etc., setting the limits very definitely and getting these groups:

Marks      No. Making
95-100          4
90-94          16
85-89          12
80-84          14
75-79          11
70-74           7
65-69           3
60-64           2
Total          69

Or we may make them cover 60-69, 70-79, etc., and have:

Marks      No. Making
90-100         20
80-89          26
70-79          18
60-69           5
Total          69

Or we may group as many school systems do, and have:

Magnitude  No. Cases
95-100          4
90-94          16
80-89          26
70-79          18
Below 70        5
Total          69

For grouping, this from Professor Thorndike should be kept in mind:

In general, in mental and social measurements, in the calculation of averages, average deviations, and mean square deviations, when the face value of the series gives a grouping of 40 to 60 steps, it is allowable to group by double steps, and when the face value of a series gives a grouping of 60 to 80 steps, to group by triple steps. But it should be observed that coarse grouping saves little time except in the calculation of the average, average deviation, and mean square deviation. In the case of the calculation of the median, 25 percentile, 75 percentile, and median deviation, it is the author's opinion that the gain in precision from the finer scale is greater than the loss in time, if one economizes time in recording measures in the finer grouping.[1]

The superintendent, however, may not understand the finer points of this without considerable statistical theory and experience.
The best simple rule for him to follow is not to divide into small groups where the cases seem to bunch more closely than usual, and not to include in the same group cases that are manifestly far apart. In the example of the marks given above, it will be noted that the cases bunch together very closely when grouped in the third form. Nor are the cases to be found in one group too far apart from one another to be in that group. For instance, 81 and 89 are to be found in the same group. But experience with marks shows us that when one teacher marks one boy 81 and another teacher marks another boy 89 in the same subject, there may be little appreciable difference in the achievements of the boys.

[1] Mental and Social Measurements, p. 50.

In most cases common sense and experience must be utilized in considering grouping. Salaries of superintendents may be safely grouped by hundreds (1200-1299, 1300-1399, etc.) because their salary increases usually come by hundreds. But the salaries of grade teachers are more profitably grouped by fifties or twenty-fives, thus: 400-449, 450-499, or 400-424, 425-449, 450-474, 475-499, etc., because their usual salary increases are covered by the smaller steps. A grouping for training in weeks, of teachers that covered summer school work, would be 1-6, 7-12, 13-18, etc., or 1-3, 4-6, 7-9, 10-12 and not 1-4, 5-8, 9-12, because summer schools usually run either 6, 9, or 12 weeks.

EXERCISES

1. The salaries of superintendents in cities of 2500-5000 for 1914-15, for all obtainable, were in the states given, as follows:

Alabama. $1,500; 1,800; 1,250; 1,680; 1,900; 1,600; 1,800; 1,500.
Arkansas. $2,000; 1,620; 1,000; 1,500; 1,600; 1,600; 1,500; 1,100; 1,500; 1,200; 1,500; 1,500; 1,600; 1,350.
Florida. $1,500; 1,200; 1,500; 1,650.
Georgia. $1,800; 1,800; 1,800; 2,000; 1,500; 1,200; 1,500; 1,500; 1,200; 1,600; 1,800; 2,000; 2,000; 1,650.
Kentucky.
$1,000 — 1,500 — 1,800 — 1,350 — 1,800 — 1,600 — 1,800 — 1,400 — 1,200 — 1,400 — 1,650 — 1,200 — 1,400 — 1,500. Louisiana. $1,500 — 1,800 — 1,500 — 1,800 — 1,500. Maryland. $1,400 — 1,450. Mississippi. $1,800 — 2,200 — 1,125 — 1,700 — 1,650 — 1,800. Missouri. $1,650 — 1,320 — 1,400 — 1,200 — 1,200 — 1,500 — 1,920 — 1,400 — 1,500 — 1,400 — 1,500 — 1,140 — 1,450 — 1,500 — 2,400 — 1,800 — 1,350 — 1,700 — 1,100 — 2,100 — 1,500 — 1,200 — 1,500 — 1,500 — 1,200 — 1,600 — 1,170 — 1,500. North Carolina. $1,500 — 1,500 — 1,200 — 1,500 — 1,200. Oklahoma. $1,500 — 1,500 — 1,400 — 1,800 — 1,800 — 1,500 — 900 — 2,000 — 1,300 — 1,800 — 1,500 — 1,800 — 1,200 — 1,800 — 1,300 — 1,500 — 1,800 — 1,800. Scales and Distribution Tables 111 South Carolina. $1,200 — 1,500 — 1,215 — 1,500 — 1,350 — 1,800 — 1,200 — 1,250 — 1,500 — 2,000. Tennessee. $2,000 — 1,000 — 1,200 — 1,080 — 1,200 — 1,600 — 1,600 — 1,500 — 1,000 — 1,800. Texas. $1,960 — 2,000 — 1,500 — 1,800 — 2,100 — 2,200 — 2,000 — 1,800 — 1,500 — 1,800 — 2,000 — 1,200 — 1,500 — 1^560 — 1,800 — 1,500 — 1,800 — 1,300 — 1,800 — 2,300 — 1,500 — 1,675 — 2,200 — 1,200 — 2,000 — 1,400 — 1,800 — 1,500 — 1,500 — 1,500 — 1,800 — 1,800 — 1,800. Virginia. $1,750 — 1,200 — 1,200. West Virginia. $1,500 — 1,400 — 1,350 — 1,500 — 1,500 — 1,800 — 1,500 — 1,380 — 1,600 — 1,550 — 2080. Ignore the matter of sampling and arrange these salaries in a distri- bution table, being careful to justify the step chosen in your grouping. 2. Make a similar distribution table of superintendents' salaries in cities of this size for any other section of the United States, getting your data from the sources found in Exercise 2, page 105, and telling just why you use each step in the process of making the table. 3. 
Make a similar distribution table for the following figures on the number of hours required by individual pupils to complete one half-grade in grammar : ^ 7 — 10 — 11 — 11 — 11—12 — 12 — 13 — 13 — 15 — 16 — 16 _ 16 — 17 — 18 — 18 — 19 — 19 — 20 — 20 — 21 — 21 — 22 — 22 — 22 — 23 — 23 — 25 — 27 — 29 — 33 — 33 — 33 — 34 — 34 _ 36 _ 37 _ 38 — 39 — 40 — 43 — 44 — 44 — 48 — 49 — 49. III. SURFACE OF FREQUENCY Graphing Distribution Tables. For the presentation of grouped distributions by graphs, three simple devices are available, the histogram or rectangular graph, the smoothed graph, and the check form of the histogram. The procedure for this histogram is as follows : 1. Lay off on cross-section paper a horizontal scale, on which the magnitude scale runs by groups from the lowest magnitude at the left to the highest magnitude at the right. 1 From Monograph C, Individual Instruction, San Francisco State Normal School, p. 28 112 School Statistics and Publicity 2. From the same zero point erect a perpendicular scale which is to represent the number of cases. r— < 25 20 \ 1 • y 15 10 5 : I 1 1 1 1 \ ' 1 1 1 1 1 1 "No. Below Cas65 70 70 t079 80 to 89 SO to 84 QSfoWO Fig. 10. — Histogram Showing Data of Table 13, but Using Grouping as Given on Page 108. 3. Then find on the horizontal scale the point marking the mag- nitude of any given case, and count up to find the proper point to denote the number of cases in that group. Scales and Distribution Tables 113 Do the same for each group. In so doing, one will get a number of points at different heights strung out above the horizontal scale. 4. Then proceed to draw a line through these points coming down to the base line on the right, and either coming down to the base line 25 j / 70 / (5 / > ^ / \ /O / ' \ / \ 5 ^ \ / \ \ / \ Cases 70 Fig. 11. — Smoothed Form of Graph Shown in Figure 10. 70 to 79 ao f o 69 00 to 94 9 5 f o 100 on the left or going to the vertical scale on that side. 
There will thus be inclosed an area which is called the "surface of frequency."

This surface may be made by making each point located after the manner described above, the upper left-hand corner of a rectangle which is as wide as the length of the space occupied by that group on the horizontal scale. Thus the plotting of the Dearborn data as a histogram is shown in Figure 10. It is not customary to draw those parts of each rectangle shared in common with other rectangles. Common portions in the diagram are shown by dotted lines.

In the smoothed graph (Figure 11), the points located to determine the rectangles may represent the middles of the tops of such rectangles instead of the upper left-hand corners. Then these points may be joined by straight lines, giving a surface with apexes. This is somewhat "smoothed," it will be noticed.

The check form of the histogram simply uses dots on cross-section paper, one for each item, thus keeping the columns the same width. There is no line drawn above, the columns showing roughly the shape of the surface. It is a very valuable form for tabulating data and at the same time showing the shape of the surface of frequency. That is, it may be made up before the distribution table. Thus, if cross-section paper had been used for the Dearborn data at the outset, a surface of frequency like that in Figure 12 could have been obtained, and from this surface, the distribution table given on page 107 could have been easily made up.

In graphing a distribution table, the scale in which the items come regularly is always put on the base line. The scale in which the items come more or less irregularly is put on the vertical line. In other words, this means that magnitude is measured on the horizontal line of the graph and the number of cases is shown on the vertical line.
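The check form lends itself to a rough reproduction in plain characters. The sketch below, in Python, is offered only as an illustration under stated assumptions: it uses the third grouping of the 69 marks from page 108, prints one dot for each case, and puts magnitude along the base and number of cases on the vertical scale, following the convention just described. The names GROUPS and check_form are hypothetical. Note also that the columns are printed with equal width although the two highest groups cover only five units each.

```python
# Third grouping of the 69 marks given on page 108: (group label, number of cases),
# ordered from the lowest magnitude at the left to the highest at the right.
GROUPS = [("Below 70", 5), ("70-79", 18), ("80-89", 26), ("90-94", 16), ("95-100", 4)]


def check_form(groups):
    """Return the check form as text rows, top row first, one dot per case."""
    height = max(n for _, n in groups)
    rows = []
    for level in range(height, 0, -1):
        # A column shows a dot at this level only if its group has that many cases.
        cells = ["." if n >= level else " " for _, n in groups]
        rows.append(f"{level:>3} | " + "  ".join(cells))
    return rows


rows = check_form(GROUPS)
print("\n".join(rows))
print("    +" + "-" * (3 * len(GROUPS)))
print("      columns, left to right: " + ", ".join(label for label, _ in GROUPS))
```

Counting the dots in any column gives back the number of cases in that group, so the distribution table can be read off the printed surface, as the text observes for Figure 12.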
There is no bullet-proof reason for this; it is simply the conventional way of doing the thing, just as the order from left to right across the page and from top to bottom in reading and writing is the proper and customary one for European people and their descendants. One should draw pictures or graphs of data to be read with as much care as he would exercise in preparing the manuscript of an article to be printed.¹

¹ Paraphrased from W. C. Brinton: Graphic Methods of Presenting Facts

Characteristics of Surface of Frequency. The distribution table is a great economy over a miscellaneous mass of unassorted data. But it is too cumbersome to be kept in mind in all its details. We need to apply here our idea of the fundamental image or bird's-eye view of the whole. This can best be done through the use of two special qualifications, characteristics, or earmarks of the distribution. The first of these qualities is that measure which indicates the typical, average, or central size of the group. The second is that number which indicates how far the other members of the group on the average vary, spread, or deviate from the first-named quality. These two characteristics of the distribution table serve to make it full of meaning to any person who understands statistics; and with a little care the lay reader may readily acquire the ideas back of these devices.

Both of these characteristics may be expressed as magnitudes or as so many multiples of whatever unit may be used in the distribution table. They may also be shown graphically on the surface of frequency. The particular kind of central tendency to be used and the measure from it depend altogether upon the shape of this surface of frequency. The question at once comes up, then: How many variations in the surface of frequency are of significance to the superintendent, and how may he know them?

Normal Surface of Frequency. If the cases in any distribution are taken by chance or by a combination of causes that amount to chance, the shape of the surface of frequency invariably becomes bell-shaped with symmetrical sides. Figures 13, 14, and 15 are good examples because all the children were taken, and the variations then arise only from chance. Notice that in one of these graphs or diagrams the cases composing it are bunched much more closely around the highest point or apex of the distribution than in the second diagram. But it might sometimes happen that two different sets of children, when tested on the same thing, might give distribution tables which, when graphed, would make graphs or surfaces of frequency varying as widely as these two. For example, two sixth grades might make the same average or central tendency on the Courtis Tests. But in one case the achievements of all the children might be close to the average, while in the second case some of the children might make very high records and others very low.

Fig. 13. Example of Normal Surface of Frequency. This graph shows the per cent of pupils attaining given scores in Stone Reasoning Tests, all pupils being tested. (Adapted from Butte Survey, page 95.) (Vertical scale: per cent making given score. Horizontal scale: lowest score to highest score.)

This bell-shaped surface is called the "normal" or "probability" surface because it is the one found in natural and mental phenomena of all kinds, when the distributions are made up from unselected cases or from those picked at random. In other words, it is the normal or probable thing to expect from such data.

Fig. 14. Example of Normal Surface of Frequency. This figure shows the number of pupils writing at each speed from 0 to 9 letters per minute to 120 to 129 letters per minute. Data for 25,387 pupils in four upper grades of Cleveland. (From Measuring the Work of Public Schools, Cleveland Survey, page 67, by permission.)

Fig. 15. Smoothed Form of Figure 14.

Skew Surface of Frequency. But there are some distributions in which the cases bunch much more on one side of the apex than on the other, for the reason that a certain cause or combination of causes operates on some of the cases in the distribution but not on all. The distribution or surface is said to be "skewed" toward the thin, drawn-out side and is called a "skew" distribution or "skew" surface.¹

¹ From the verb "skew," meaning to put askew or twist to one side. While there is perfect agreement among educational writers as to what constitutes a skew distribution, there is no agreement as to which is the skew end. By this term some writers mean the blunt end and some the sharp end. The author uses it to describe the sharp end, following Professor Thorndike.

A good example of such a distribution is one made up of the number of children of different ages in school. As the children grow older, some die, and consequently the group of a given age is slightly smaller in size than any group of lower age. But at the age of fourteen the compulsory education laws usually cease to operate, and many children immediately drop out of school. Others want to go to work. Such forces tend to decrease the group rapidly. The following table of frequency for 1908 for Nashville, Tennessee, as reported by Strayer in his "Age and Grade Census of Schools and Colleges,"¹ well illustrates the point under discussion:

Table 14.
Distribution by Ages of Pupils in the Nashville Schools, 1908

Age            Number in group
7²                  1493
8                   1733
9                   1584
10                  1712
11                  1595
12                  1626
13                  1490
14                  1198
15                   885
16                   528
17 and over          390

¹ Bulletin U. S. Bureau of Education, 1911, No. 5, p. 34
² Children are not permitted to enter the Nashville schools until they are seven years of age.

Notice that there is practically no increase or decrease of importance in these groups until the age of fourteen is reached. From this point on, the decrease is very rapid. If the ages beyond seventeen had been given separately, the extension would narrow down to a very slender one. Figure 16 shows the facts in the histogram or unsmoothed form. Smoothed out, as in Figure 17, the graph shows the "skew" even better. A good example of a skew surface is shown in the results of the spelling test in Figure 18.

Fig. 16. Skewed Histogram Representing Distribution of Pupils by Ages in Nashville Public Schools, 1908. (From data in Table 14.) (Vertical scale: No. of pupils, 200 to 1800. Horizontal scale: Ages, 6 to 17 and over.)

Fig. 17. Smoothed Form of Graph Given in Figure 16. (Same scales as Figure 16.)

Fig. 18. Skewed Histogram Showing the Percentage of Children Attaining Each of the Possible Scores on the Spelling Test in Salt Lake City as a Whole. (Adapted from Salt Lake City Survey, page 135.) (Entire city, 3988 children. Vertical scale: per cent of children, 10 to 40. Horizontal scale: scores, 10 to 100.)

EXERCISES

1. Draw a surface of frequency for each of the distribution tables used or gotten up by you in the previous exercises.

2. Draw a surface of frequency for each of the distributions given in the table on page 123.

REFERENCES FOR SUPPLEMENTARY READING

King, W. I. Elements of Statistical Method, Chapters V, XI.
Rugg, H. O. Statistical Methods Applied to Education, Chapters IV, VII, VIII.
Thorndike, E. L. Mental and Social Measurements, Chapter II and pages 28-36.

Frequency of the Different Percentages of Boys and Girls Retarded Two Years in Certain Cities of 25,000 Population and Over, 1908¹

Per cent of total           Per cent of total
no. of boys    No. cities   no. of girls   No. cities
     2              1            2              3
     3              5            3              6
     4              3            4              9
     5              7            5              9
     6              7            6              9
     7              4            7             12
     8              9            8             18
     9             16            9             15
    10             12           10             13
    11             12           11             11
    12             19           12              7
    13             11           13              3
    14              9           14              2
    15              2           15              5
    16              2           16              3
    17              1           17              3
    18              7           18              2
    19              1           19              2
    20              2
    21              1           21              1
    22              1

¹ From Strayer, G. D.: "Age and Grade Census of Schools and Colleges," Bulletin U. S. Bureau of Education, 1911, No. 5, pp. 86-87

CHAPTER V

MEASURES OF TYPE

There are three measures of the "type" or central tendency of importance in school work, namely the mode, the median, and the average, which will now be discussed in order.

I. THE MODE

Definition. The mode is that number which represents the size of the most numerous item or items in a group. That is, it is the vogue or fashion in the cases, because there are more of this size than of any other. The mode is precisely what the ordinary man usually has in mind when he speaks of the "average." He is referring to that measure which includes the greatest number of cases. If he says that teachers instruct forty children on the average, he means that more teachers instruct just about forty children than teach thirty, fifty, or any number far removed from forty.

Graphic Representation. Graphically, the mode is the magnitude represented by the point on the scale above which the surface of frequency is highest. It may be marked by a perpendicular erected from this point to the apex of the surface. But note that the mode is a measure, not a number of cases.

Calculation. All that is necessary here is to pick out the group containing the largest number of cases and see what its magnitude is.
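The rule may be illustrated with the 69 marks grouped on page 108. The following Python sketch is an illustration only; the function name modal_groups and the two tables FIVES and TENS are assumptions made for the example, the figures themselves being those of the five-unit and ten-unit groupings given earlier.

```python
def modal_groups(table):
    """Return (largest number of cases, [groups having that number]) for a distribution table."""
    top = max(table.values())
    return top, [group for group, n in table.items() if n == top]


# The same 69 marks under two groupings (figures from page 108).
FIVES = {"60-64": 2, "65-69": 3, "70-74": 7, "75-79": 11,
         "80-84": 14, "85-89": 12, "90-94": 16, "95-100": 4}
TENS = {"60-69": 5, "70-79": 18, "80-89": 26, "90-100": 20}

print(modal_groups(FIVES))  # largest group under the five-unit grouping
print(modal_groups(TENS))   # largest group under the ten-unit grouping
```

The mode reported is the magnitude of the group, not the number of cases in it, and the two groupings of the same marks do not point to the same group, which shows how much the mode depends on the grouping adopted.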
If several adjoining groups have about the same number of cases in them, they should be run together into larger groups. Usually this procedure will give a more pronounced mode, which is its purpose. In case two widely separated groups are larger than the others, the distribution is said to be "bimodal," that is, it has two modes, and probably is composed of two rather different classes which have been lumped together but which possibly should not be so considered. A frequency table showing the number of teachers getting different salaries will often show more than one mode. For example, the surface of frequency for the salaries of elementary teachers in the Salt Lake City Survey has three well-defined modes at approximately $650, $850, and $1020. This means that, as regards salaries, there are really three distinct classes of teachers within the whole group. The graph is here shown in Figure 19.

Fig. 19. Example of Multi-modal Surface of Frequency. This shows the distribution of salaries paid elementary school teachers, Salt Lake City, 1914-15. Note that the printing on this graph cannot be read easily from one position. (Adapted from Salt Lake City Survey, page 51.) (Vertical scale: number receiving. Horizontal scale: salary.)

Advantages of the Mode for School Statistics.

1. The mode is useful where it is desirable to eliminate extreme variations. For example, the amount of work a given group of children can do in a school year is determined by the modal attendance of the group, not by that of the few who are absent almost continuously, or that of the small number who never miss a day.

2. In finding the mode, it is unnecessary to know anything about extreme cases except that they are few in number.
In comparing his school with other schools, the superintendent need not worry about the one or two schools that are higher than any of the others, if his own school falls in the mode or close to it. The extreme cases may not be measured accurately, and they may or may not really come legitimately into the distribution. But whether they do or not, they cannot affect the mode.

3. It is very easy to determine with considerable accuracy from well-selected data.

4. It is the best measure of type to the ordinary mind. As before indicated, this is what the ordinary man often means by "average."

5. It is unambiguous. No one ever thinks from it that all the measures in the group are practically on it.

6. The mode is often the most typical measure of a skew distribution.

Probably the most significant thing about a frequency table of teachers' salaries is that largest group which get the same salary or a salary within certain limits. The extreme salaries, the median salary, or the average salary might be of no especial significance. But the modal salary would point out the significant group at once. The three modal salaries for the Salt Lake City elementary teachers, as shown in the graph on page 125, indicate the significant salaries at a glance.

Disadvantages of the Mode for School Statistics.

1. In many groups, no single, well-defined type actually exists. There is no such thing as a modal age for children or a modal number of children in a grade in school. All the age groups up to 14 are about the same in size, and all the grades up to about the sixth or seventh keep about the same size. Of course children drop out but usually enough are held over to make the grades approximately the same size. When all cases are kept separate as in Bobbitt tables, there is, of course, no mode unless several cases happen to be of exactly the same size or are considered to be of the same size.

2. The mode is of no value if weight is to be given to extreme cases.
It would take no special account of the high per capita cost of a city at the upper end of a per capita group, so far as the size of the item was concerned, although such city might admittedly have the best schools in the group. Similarly it would take no special note of the lowest city in the group although it might admittedly have the worst schools in the group.

3. The mode cannot be determined by any simple arithmetical process and is sometimes difficult to get by any method.

4. The product of the mode by the number of items does not give the correct total. For example, take Table 15.

Table 15. Distribution Table Showing Penmanship Records of Second Grade at Butte¹

Score    Number pupils making
4               5
5              22
6              21
7              29
8              28
9              42
10              7
11             29
12              5
13              7
16              1
Total         196

¹ Butte Survey, p. 80

The mode in this example is 9. 196 × 9 = 1764. The sum of the products of the cases in each group by its score, however, is only 1617. Such a total sometimes proves very useful for checking other steps.

5. The mode may be determined by a very few items in case none of the groups contains more than a few items. This, of course, may be offset by wider grouping.

EXERCISES

1. What is the mode in each of the distribution tables used in previous exercises?

2. Draw the line to represent the position of the mode on each of the surfaces of frequency used in previous exercises.

II. THE MEDIAN

Definition. The median is the magnitude represented by the mid-point on a scale or distribution. Obviously, half the cases fall below this mid-point and half above it. Note that the median is a magnitude or size of a case, not the number of the case.

Calculation. Various devices and formulas have been given for calculating the median, according as there is an odd number of cases, an even number of cases, a gap between two groups that are equal in size, etc.
But the simplest and surest plan by far is to regard the distribution as a scale and always to find the magnitude of the mid-point on it. If there is an odd number of cases, the median is, of course, where the mid-point of the middle case lies. If the median falls in a gap, the mid-point in the gap must be taken. If the median falls in a group distributed over part of the scale, one must run up the part covered by this group until he finds the point that will exactly place half the cases in the whole distribution below it and half above it, splitting a case into halves if it is necessary. The essential thing is to find the mid-point. The magnitude denoted by this mid-point will be the exact median. The magnitude corresponding to the group containing the mid-point will be the approximate median. This mid-point method of calculation will now be illustrated with various examples, starting with an even number of cases.

If the distribution is a discrete one and contains an even number of cases, the median falls between the two middle cases. The place for it to fall is found by dividing the number of cases by 2, which gives the number of cases to have on one side. Then count in from one end till this number of cases has been checked off. For instance, suppose we make up a Bobbitt table with an even number of cases by taking only the first twelve cases from the table on page 18, as in Table 16.

Table 16. Bobbitt Table Showing Cost of Instruction per 1000 Student Hours (Mathematics)

Name of school          Cost per 1000 student hours
University High                  $169
Mishawaka, Ind.                   112
Elgin, Ill.                       100
Maple Lake, Minn.                 100
Granite City, Ill.                 88
East Chicago, Ind.                 82
-------------------------------------
De Kalb, Ill.                      74
San Antonio, Tex.                  69
Harvey, Ill.                       69
Waukegan, Ill.                     63
South Bend, Ind.                   62
East Aurora, Ill.                  61

The mid-point of these twelve cases will obviously have to throw six cases on each side of it, that is, it must come between cases six and seven.
For those who desire a formula, this will be found by dividing the number of cases by 2. The mid-point, then, is at the magnitude halfway between $82 and $74 or at $78. ($82 - $74 = $8. ½ of $8 = $4. $74 + $4 = $78.) The same result, of course, could be obtained by merely taking the average of the two middle cases. ($74 + $82 = $156. $156 ÷ 2 = $78.)

If the two cases between which the median in a Bobbitt table falls are the same size, as in the Bobbitt table of size of classes on page 96, the median, of course, is represented by the size of either case, 19 in this instance.

For an example of a continuous series and even number of cases, take the achievements of the eighth grade in composition at Butte.¹ These may be adapted for our purposes as follows, paying attention for the present to the two left-hand columns only:

Rated at    Number of papers
7                  2
6                  6      (8)   Adding down
5                 22     (30)
4                 43     (73)
3                 39
2                 32     (42)
1                  9     (10)   Adding up
0                  1
Total            154
154 ÷ 2 =         77

¹ Butte Survey, p. 74

There are 154 cases in all. The median, then, must be at the point which will throw 77 cases on one side and 77 cases on the other. This point will obviously be at the end of the 77th case or the beginning of the 78th case, as one prefers to call it. Say that this point will be located at the end of the 77th case. Counting up we find 42 cases in steps 0, 1, and 2. If the 39 cases in step 3 are added, we find that we have more than the required 77 cases. Subtracting 42 from 77 we find that we must have 35 cases more. That is, we must add 35 cases from group 3 to the other 42 so as to reach the end of the 77th case. This means that the median is located 35/39 of the distance up the scale represented by the step 3. (35/39 = .90)

But does step 3 extend from 3 to 4, or from 2.5 to 3.5? If it extends from 3 to 4, the median is obviously 3.90.
(3 + .90 = 3.90) In the Butte scoring, however, the latter method was used, and the actual values on the Hillegas scale are as follows:

0 is 0         3 is 3.69
1 is 1.83      4 is 4.74
2 is 2.60      5 is 5.85, etc.

With this in mind, we must use the true measures on the Hillegas scale. Step 3 would extend from the halfway point between 2.60 and 3.69 (or 3.145) to the mid-point between 3.69 and 4.74 (or 4.215). The distance between 3.145 and 4.215 is 1.07. 35/39 of 1.07 is .96. Adding this .96 to 3.145 we get the median or mid-point, 4.105.

The median may also be figured from the other end. The procedure is the same, the figures this time being as follows. Counting down we find that from quality 4 up we have 73 cases. We need to take 4 cases from the upper part of the group rated 3. That is, we must go down 4/39 of the part covered by that group. (4/39 = .10) If the step meant 3 to 4, the median would then be 4 - .10 or 3.9. But as we saw before, this group covers 1.07 and extends up to 4.215. 4/39 of 1.07 is .11. Then 4.215 - .11 gives 4.105 for the median.

If the distribution is a discrete one and contains an odd number of cases, the magnitude of the middle case is the median. This is easily found by counting in, usually adding 1 to the number of cases and dividing by 2 to get the number of the middle case. Thus, in the discrete series represented by the first two columns of the table on page 51, the median real wealth behind each $1 for schools is $234, because it is the eighteenth case on the scale. There are 17 cases below and 17 cases above. Note that 18, the number of the case wanted, is found by adding 1 to the total number of cases (35) and dividing by 2. ((1 + 35) ÷ 2 = 18.) In the Bobbitt table on page 18, the median is represented by the thirteenth case. (13 = (1 + 25) ÷ 2.) Note the horizontal lines inclosing the median.
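The calculation of the eighth grade median on the Hillegas scale can be retraced step by step. The Python sketch below is a minimal illustration under stated assumptions, not a general routine: the names locate_median and step_limits are hypothetical, only the scale values quoted in the text for qualities 2, 3, and 4 are used, and the case of a median falling in an empty step is deliberately left out.

```python
def locate_median(counts):
    """Return (index of the step holding the mid-point, fraction of that step below it).

    counts lists the cases in each step, lowest step first.
    A median falling exactly in an empty step (a gap) is not handled here.
    """
    half = sum(counts) / 2          # cases that must lie below the mid-point
    below = 0                       # cases counted so far, adding up
    for step, n in enumerate(counts):
        if n > 0 and below + n >= half:
            return step, (half - below) / n
        below += n
    raise ValueError("empty distribution")


def step_limits(step, values):
    """Limits of a step: halfway to the neighbouring scale value on each side."""
    lower = (values[step - 1] + values[step]) / 2
    upper = (values[step] + values[step + 1]) / 2
    return lower, upper


# Eighth grade composition at Butte: papers rated 0, 1, 2, ... 7.
COUNTS = [1, 9, 32, 39, 43, 22, 6, 2]
# Hillegas values quoted in the text for the steps around the median.
HILLEGAS = {2: 2.60, 3: 3.69, 4: 4.74}

step, fraction = locate_median(COUNTS)
lower, upper = step_limits(step, HILLEGAS)
median = lower + fraction * (upper - lower)

print(step, round(fraction, 2), round(median, 3))
```

The fraction found is 35/39 of step 3, and running that far up the distance from 3.145 to 4.215 gives the same median, 4.105, as the hand calculation.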
If, however, the series is a continuous one and has an odd number of cases, the median is manifestly located at the mid-point of the middle case. In the discrete series, we took the whole middle case for the median. In the continuous series, the middle case is itself supposed to be spread out along the scale, and consequently we have to find its mid-point.

For example, let us take the median for the fifth grade composition scores at Butte.¹ These were as follows:

Rated at    Number making
5                  1
4                 18     (19)   Adding down
3                 49     (68)
2                 86
1                 46     (47)   Adding up
0                  1
Total            201
201 ÷ 2 =        100.5

¹ Butte Survey, page 74

This time we have an odd number of cases. The median will fall on that point where 100½ cases come on either side, that is, in the middle of the 101st case. In steps 0 and 1, we have 47 cases. We need 53½ cases out of the 86 in step 2 to find our halfway place. Changing step 2 to the Hillegas value as before, we find that it extends from the mid-point between 1.83 and 2.60 to the mid-point between 2.60 and 3.69, that is, from 2.215 to 3.145, the distance being .93. 53.5/86 of .93 is .58. 2.215 plus .58 makes 2.795, the median.

Coming down, we find that we have 68 cases from 3 up. To get 100½ cases, we must take 32½ cases from the upper end of group 2. Figuring as before, 32.5/86 of .93 = .35. 3.145 - .35 gives us 2.795 for the median, the same result as before.

Some books give rules for finding the median which involve finding the middle case. The middle case does represent the median in that its magnitude is the median. But there is danger in using the formula of adding 1 to the total number of cases and dividing by 2 to get the middle case. The danger lies in the tendency to add the whole of the middle case to the part taken from the group. Thus in the example preceding, a beginner is apt to take 54/86 of step 2 to add to the lower limit of that step, or to take 33/86 from the upper limit if he is coming down.
The former procedure really puts 101 cases below and only 100 cases above the median calculated. The other procedure puts 101 cases above and only 100 cases below the median calculated. Obviously, two different results would come from these two calculations. That is, the median calculated coming down would not agree with the median calculated coming up. The mid-point plan, however, will give the same result from either end and is consequently safer.

In order to emphasize the fact that the median is best located by taking the mid-point on the scale or distribution, irrespective of whether the number of cases is odd or even, let us take several other examples. In 1916, the Courtis Tests were given in certain schools in a western city, with the following results for the number of problems attempted by each eighth grade child in one process:

No. problems attempted    No. attempting
24                              5
23                              2      (7)   Adding down
22                              2      (9)
21                              1     (10)
20                              5     (15)
19                              5     (20)
18                              5     (25)
17                              7     (32)
16                             13     (45)
15                             15     (60)
14                             10     (70)
13                             15     (85)
12                             24    (109)
11                             22
10                             26     (88)
9                              21     (62)
8                              19     (41)
7                               9     (22)
6                               8     (13)
5                               3      (5)
4                               1      (2)   Adding up
3                               1
Total                         219
219 ÷ 2 =                     109.5

There are 219 cases. Therefore, the median will fall in the middle of the 110th case, since there must be 109½ cases on either side of it. Steps 3 to 10 inclusive take in 88 cases. Subtracting 88 from 109.5, we find that we must have 21.5 cases out of the next step. There are 22 cases in the step. Therefore, we must go 21.5/22 of the distance of this step, which is 1, or .98. By the nature of the Courtis Tests, however, step 11 extends from 11 to 12. Therefore, the median is 11 + .98 = 11.98. Figuring down, we would have .5/22 of 1 = .02 to be taken from the upper limit of step 11, which is 12. 12 - .02 = 11.98 for our median, the same result as before.
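That the mid-point plan gives the same result from either end may be verified on the Courtis figures just given. The following Python sketch is illustrative only; the names COUNTS, median_counting_up, and median_counting_down are assumptions, and each step is taken to run from its own number to the next whole number, as the text states for step 11.

```python
# No. problems attempted -> No. attempting, from the table above.
COUNTS = {3: 1, 4: 1, 5: 3, 6: 8, 7: 9, 8: 19, 9: 21, 10: 26, 11: 22, 12: 24,
          13: 15, 14: 10, 15: 15, 16: 13, 17: 7, 18: 5, 19: 5, 20: 5, 21: 1,
          22: 2, 23: 2, 24: 5}


def median_counting_up(counts, width=1):
    """Mid-point median found by adding up from the lowest step."""
    half = sum(counts.values()) / 2
    below = 0
    for step in sorted(counts):
        n = counts[step]
        if n and below + n >= half:
            # Run up the step from its lower limit.
            return step + width * (half - below) / n
        below += n


def median_counting_down(counts, width=1):
    """Mid-point median found by adding down from the highest step."""
    half = sum(counts.values()) / 2
    above = 0
    for step in sorted(counts, reverse=True):
        n = counts[step]
        if n and above + n >= half:
            # Come down the step from its upper limit.
            return (step + width) - width * (half - above) / n
        above += n


up = median_counting_up(COUNTS)
down = median_counting_down(COUNTS)
print(round(up, 2), round(down, 2))
```

Both directions stop in step 11, one taking 21.5 of its 22 cases from below and the other taking .5 of a case from above, and both arrive at 11.98.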
Suppose, now, that we manipulate the data in the above problem so that step 11 contains only 1 case instead of 22, the remaining 21 being given to step 10, making its total 47. The total of all the cases is unchanged. We find, however, that steps 3 to 10 inclusive now contain 109 cases, and the median must fall in the 11th step, which has only one case. It will, therefore, fall in the middle of the step, since the one case must be considered as extending over the whole space in the step. The median, therefore, in this supposed case would be 11.5.

In order to show the working of the same principle in the case of the fifth grade composition results at Butte, given above, let us manipulate the data until they appear as follows:

Rated at    No. papers
5                 23
4                 28     (51)   Adding down
3                 49    (100)
2                  1
1                 46    (100)   Adding up
0                 54
Total            201
201 ÷ 2 =        100.5

The total number of cases remains the same, and the median would fall as above, namely, at the middle of the 101st case, or at the 100½ case. Steps 0 and 1 now together contain 100 cases. Therefore, ½ case must be taken from the next step. As there is only one case in this step, the median will fall at the mid-point of this step, for the same reason as stated in the solution of the problem just above this one. The distance covered by step 2 on the Hillegas scale, as shown before, is .93. One-half of this is .465. Add .465 to 2.215 (the lower limit of the step) and we have for the median, 2.68.

Suppose, however, that we have an even number of cases, and the median falls in a step containing 0 cases, that is, in a gap in the distribution. For example, let the data used in the last problem be manipulated so as to appear thus:

Rated at    No. papers
5                 23
4                 28     (51)   Adding down
3                 49    (100)
2                  0
1                 46    (100)   Adding up
0                 54
Total            200
200 ÷ 2 =        100

In this example, the median would come at the end of the 100th case. We find that the end of this case coincides with the end of step 1.
But coming down, the end of the 100th case would be at the beginning of step 3. Obviously any point in step 2 could be taken and there would still be 100 cases on each side. In a problem of this kind, the best procedure is to divide the vacant space on the scale so that the two nearest actual cases on it will be equidistant from the point of division. Then the median will fall in the middle of step 2 or at 2.68 for the real value, and will be the same as if one case had been in the step with the total number of cases odd.

If the data given on page 134 from the tests in arithmetic are manipulated so that there are only 218 cases in all and one-half of these are above 11 and one-half below, with none in step 11, but actual cases in steps 10 and 12, precisely the same procedure would be followed. That is, in this case, the median would be 11.5.

Sometimes the median is not calculated exactly but only approximately. Professor Monroe, for example, gives some devices for obtaining an approximate median and then correcting it to get the true median.¹ But in general, about the only time that an approximate median may be used to advantage is when one takes for it the magnitude of the lower limit of the group that contains the real median. That is, 11 would be taken for the approximate median for the problem on page 134, since the median falls within the 11-11.99 group. The straight mid-point method of calculating the median will get it with less confusion, greater accuracy, and more speed for the average person.

¹ Monroe, W. S.: Educational Tests and Measurements, pp. 242-247

[Fig. 20. Graphic Representation of the Median on a Surface of Frequency. Vertical scale: No. attempting; base scale: No. problems. Perpendiculars mark Quartile 1 (9.65), the Median (11.98), and Quartile 3 (15.35). The corresponding distribution table is given on page 134.]

Graphic Representation of the Median. In a Bobbitt table having an odd number of items, the median is the
size of the middle item, and it is usually inclosed in parallel horizontal lines to indicate that it is the median. (See page 18.) If the table has an even number of items, a single horizontal line is drawn across between the two middle cases, thus throwing half the cases above and half below. (See page 130.) The line then represents the median, but the exact magnitude is not shown in figures.

To represent the median graphically on a surface of frequency, find the point on the base line of the surface of frequency represented by the calculated median and through it erect a perpendicular. This perpendicular is sometimes erroneously called the median, but really the median is the size of the magnitude on the horizontal scale at the foot of the perpendicular. (See Figure 20.) This perpendicular divides the surface of frequency into two equal parts.

Through the point on the base representing the median (11.98) draw a perpendicular (represented by a dotted line here) which will cut the surface into two equal areas. The small squares may be counted to prove this. There are 88 small squares to the left of the 11-11.99 group and 109 to the right of it. The 11-11.99 group contains 22 small squares, of which .98 must go with the ones to the left, and .02 with the ones to the right. .98 of 22 = 21.56. Adding to 88 we get 109.56. .02 of 22 = .44. Adding to 109, we get 109.44, or approximately the same area. The slight discrepancy is due to carrying the decimals out to only two places.

Advantages of the Median for School Work.

1. The median can usually be located exactly without much trouble. This is of great service where the mode cannot be exactly determined, as in distributions where practically any legitimate groupings will give several groups of about the same size.

2. Extreme cases influence it little. In this it resembles the mode.
For example, dropping off a number of cases at either end of the distribution table on page 134 would only shift the median a part of one step. As long as the median stays within the 22 cases of step 11, for instance, one case dropped at either end could shift the median only 1/22 or .045 of one step.

3. Its location can never depend upon a small number of items. It falls at the midpoint of the distribution irrespective of how many cases are in a group or where the groups are.

4. If the number of extreme cases is known, or known approximately, we do not have to know their size. Thus, if we wish to know the median or typical salary of Latin teachers in a state, it is not necessary to get the salaries of teachers in the large high schools that will not report to the state superintendent, nor in the small unorganized high schools, provided we know about how many teachers are in each class of schools. If we know these numbers approximately, we can still get the median salary of Latin teachers for the whole state.

5. The median is of special value for data where the items cannot be measured in definite units. Thus, we may get the median child on any particular ability for a room, or the median performer in a debating or oratorical contest, without ever being able to measure in definite units the performance of a single contestant. An arithmetical average cannot be calculated here with any useful accuracy, but the median can be found and compared with similar medians.

Disadvantages of the Median for School Statistics.

1. The median is not so easily calculated as the average. The average is computed by an arithmetical process familiar to most children in the grammar grades, and it may be computed without rearranging the items.
The median cannot be calculated until the data are rearranged in order of size, and while the process of calculation is simple, the previous examples show that considerable care must be exercised in getting it.

2. The total cannot be gotten by multiplying the median by the number of items. In this respect it is like the mode.

3. It is not useful in those cases where it is desirable to give large weight to extreme variations. Thus, the average daily attendance of a school is much affected by those who are absent a large part of the time. The median attendance would not be so affected. But what we wish in this particular instance is that the great effect of the few extreme cases shall exert its full influence. Consequently, the average daily attendance is used.

4. Unlike the mode but like the average, the median may be located in the distribution where the actual cases are few.

5. In a discrete series there may be so many cases the same size as the median that it will become almost meaningless. It cannot mean much here unless there is some reasonable basis for regarding the given measures as spread out, that is, for regarding the distribution as in some ways continuous.

EXERCISES

1. Calculate the median from both ends for each of the distribution tables used in previous exercises.

2. Draw the line to represent the position of the median in each of the surfaces of frequency used in previous exercises.

III. THE AVERAGE

Definition. The average is a measure much used in ordinary life without being defined. Indeed, it is not capable of easy definition. The ordinary person, if pinned down long enough, will define it as a measure which will give "the general run" of a group by taking into account both the number of cases and the size of each one. But actually he means one of two things, which differ widely from each other and neither of which really corresponds to this definition.
Most of the time the "average" means to him the most frequent measure in the group, i.e., the mode. But in some instances, it means to him a very unreal and unjust thing obtained by statistical sleight of hand. Thus he will say that there is no such thing as an average boy, nobody ever saw such a boy, etc. This comes nearer his definition than does his other use of the term. But even here he probably is not conscious of the fact that the average is really the size of the balancing point or center of gravity in the distribution.

Calculation. 1. Ordinary method. Ordinarily, the average is calculated by dividing the sum of all the measures (or cases) by the number of measures. By formula it is:

\[ \text{Average (Av.)} = \frac{\text{Sum of all measures}}{\text{No. of measures}} \]

Thus, for the results on the Courtis Tests, page 134, we could find the average as follows:

    No. problems       No.              All
    attempted          attempting       measures
        24       x         5       =       120
        23       x         2       =        46
        22       x         2       =        44
        21       x         1       =        21
        20       x         5       =       100
        19       x         5       =        95
        18       x         5       =        90
        17       x         7       =       119
        16       x        13       =       208
        15       x        15       =       225
        14       x        10       =       140
        13       x        15       =       195
        12       x        24       =       288
        11       x        22       =       242
        10       x        26       =       260
         9       x        21       =       189
         8       x        19       =       152
         7       x         9       =        63
         6       x         8       =        48
         5       x         3       =        15
         4       x         1       =         4
         3       x         1       =         3
                         -----           ------
                          219          219)2667
                                           12.18   Average No. Problems Attempted.

2. Short method. By experienced workers in statistics averages are often computed by the following method of guessing the average and then correcting it.

First arrange figures in a distribution table. Then guess the average by inspection, usually taking the approximate median. To be absolutely correct, this must be guessed within one step of the true average. Then correct the guessed average by the average of all the deviations from it.

In the Courtis Test problem just worked, the procedure is as follows:

Guessed average is 11.

Deviations above, or + deviations, are:

    24 deviations of  1  or   24
    15 deviations of  2  or   30
    10 deviations of  3  or   30
    15 deviations of  4  or   60
    13 deviations of  5  or   65
     7 deviations of  6  or   42
     5 deviations of  7  or   35
     5 deviations of  8  or   40
     5 deviations of  9  or   45
     1 deviation  of 10  or   10
     2 deviations of 11  or   22
     2 deviations of 12  or   24
     5 deviations of 13  or   65
                            ----
                            +492

Deviations below, or - deviations, are:

    26 deviations of  1  or   26
    21 deviations of  2  or   42
    19 deviations of  3  or   57
     9 deviations of  4  or   36
     8 deviations of  5  or   40
     3 deviations of  6  or   18
     1 deviation  of  7  or    7
     1 deviation  of  8  or    8
                            ----
                            -234

    + Deviations                 492
    - Deviations                 234
                                 ---
    Excess of + deviations    =  258

This means that we got the guessed average too low because there are more deviations above than below it. The 258 is the excess of deviations above. As there are 219 cases, the average excess deviation to satisfy every case is 258/219 or 1.18 of a step. As the guessed average of 11 was too low, we add this 1.18 average deviation and get 12.18 for our corrected average.

Had the minus deviations been in excess, it would have meant that the guessed average was too high, and we should have subtracted the correction.

The rule for computing the average by the short-cut method is: "Arrange the numbers in the order of their magnitude; choose any number likely to be nearest the average; add together, regarding signs, the deviations from it of all the numbers; divide this result by the number of the measures of the average which you are obtaining; add the quotient to the chosen number." ¹

In using the short-cut method, three cautions must be kept in mind:

1. To get much accuracy, the guessed average must be within one step of the true average.

2. The correction must be added to the guessed average if the plus deviations are in the majority; it must be subtracted if the minus deviations are in the majority.

3. If the guessed average is in a group, the deviations for all that group are 0. The number of cases, however, must be used in dividing to get the average deviation.

The short method is not always short by itself; but it often saves time on certain calculations on deviations.

Graphic Representation. The average can be represented graphically in the same way as the median. (See page 137.)
That is, the point representing the calculated average is found on the base line and a perpendicular is erected to call attention to the size of the average. However, there is in general little value in representing the average this way because the line drawn has no appreciable relation to the surface of frequency. A line for the mode runs through the highest part of the surface, a fact the eye easily grasps. A line for the median cuts the surface into two equal areas, which the eye will readily compare. But the average has no such relationship to show.

¹ Thorndike, E. L.: Mental and Social Measurements, p. 3

Advantages of the Average for School Statistics.

1. "Unlike the median or mode, it may be definitely located by a simple process of addition and division, and it is unnecessary to draw diagrams or arrange the data in any set form or series." ¹ The great ease with which the average can be calculated from figures in almost any form is doubtless the main reason it has been so often used. It is not even necessary to throw figures into a distribution table or table of frequency, as is the case in finding the median or the mode.

2. It weighs extreme cases, which is a desirable thing in certain instances.

3. "Unlike the mode, it is affected by every item in the group, and its location can never be due to a small class of items." ² Thus Superintendent Spaulding in a recent bulletin gets the average expenditure for Minneapolis by taking the average for five years. No one year is any more important than another and so gets counted no more and no less than another year. The median in such a small number of cases, with an odd number especially, would emphasize one year unduly.

4. The method of calculating it is familiar to every one.

5. It may be determined when the aggregate and the number of items are the only things known.
Thus, in determining the typical number of days' attendance for a child in a certain grade, we may have the total number of days' attendance and the total number of children. We might not have the actual record of a single child given us. But from the other two items we could calculate the average number of days' attendance for a child.

Sometimes these are the only two items that can be secured from reports. Thus we may have only the total for teachers' salaries and the number of teachers; the total sum spent on repairs and the number of buildings; the total expenditures for a certain kind of school and the number of such schools, etc.

¹ King, W. I.: Elements of Statistical Method, p. 136
² Ibid., p. 136

Disadvantages of the Average for School Statistics.

1. It cannot be located on a surface of frequency from looking at the surface alone. The median or mode can be quickly located by inspecting the surface of frequency. The average can be put on it only after the average has been calculated, and the point counted out on the base scale.

2. It cannot be located accurately if the extremes are missing or in any way doubtful. This is particularly troublesome when we recall that we are often inclined to expect extreme cases of expenditures and such things in school work to be doubtful or inaccurate.

3. It lays too much stress on extreme variations. This is as bad a thing for school statistics as judging a patent medicine by the enthusiastic few who write testimonials about it for publication, or a school by what is done the first and last week in each session. The trouble can, however, be eliminated to some extent by dropping very extreme cases and calculating the average from the remaining ones.

4. It cannot be used where we cannot accurately measure the quantities studied. The so-called averaging of points on contestants in a debate is often futile because they are not measured by the different judges on the same scale.

5.
It may fall where no data actually exist. That is, it may fall in a gap (the median does this sometimes as well) and be almost as absurd as the case of the duck reported by a western professor. The duck was shot at with a double-barreled shotgun, one shot going two feet to the right of the duck and the other two feet to the left. The average performance was, of course, zero and centered on the duck. Statistically the duck was dead; actually it flew away.

6. The average often means to the ordinary man something different from what the calculated thing means. The process of calculation is undoubtedly familiar to all. But what the ordinary man often means by the "average boy" is a boy like the majority of boys, not like the few above or below this majority. The man in all probability has seldom thought through the fact that his method of calculation does not always give him this "majority" measure.

EXERCISES

1. Calculate by both the long and short method the average for each of the distribution tables used in previous exercises.

2. Draw the line to represent the position of the average in each of the surfaces of frequency used in previous exercises.

IV. WHICH MEASURE OF TYPE TO USE IN A GIVEN DISTRIBUTION

If the distribution is symmetrical or approximately so, the average, median, or mode are, of course, the same or approximately so. Consequently, as regards magnitude, it is a matter of indifference which of the three is used as a measure of central tendency. But it is generally impossible to tell whether the distribution is symmetrical or not till a frequency table has been made. Once this table has been made, it is much easier to determine the median than the average. In a Bobbitt table or similar distribution where all cases are separate, the median can be found much more quickly than the average. The mode is the measure to use in all skew distributions or bi-modal ones.
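For a reader who wishes to check the three measures of type against one another, the following Python sketch works them out for the Courtis Test distribution on page 134. It is offered only as an illustration under stated assumptions: the function names are hypothetical, the mode is taken simply as the step with the largest group, and, as in the worked examples of this chapter, the average counts each case at the face value of its step while the median treats step 11 as running from 11 to 12.

```python
from fractions import Fraction

# Courtis Test distribution, page 134: {problems attempted: number of children}.
COURTIS = {24: 5, 23: 2, 22: 2, 21: 1, 20: 5, 19: 5, 18: 5, 17: 7,
           16: 13, 15: 15, 14: 10, 13: 15, 12: 24, 11: 22, 10: 26,
           9: 21, 8: 19, 7: 9, 6: 8, 5: 3, 4: 1, 3: 1}


def average_long(freq):
    """Ordinary method: sum of all measures divided by number of measures."""
    total = sum(value * n for value, n in freq.items())
    return Fraction(total, sum(freq.values()))


def average_short(freq, guess):
    """Short method: guessed average corrected by the average signed deviation."""
    excess = sum((value - guess) * n for value, n in freq.items())
    return guess + Fraction(excess, sum(freq.values()))


def median_midpoint(freq, width=1):
    """Mid-point median counting up, each step covering `width` on the scale."""
    remaining = Fraction(sum(freq.values()), 2)
    for step in sorted(freq):
        n = freq[step]
        if n > 0 and remaining <= n:
            return step + width * remaining / n
        remaining -= n
    raise ValueError("empty distribution")


def largest_group(freq):
    """Step holding the most cases, used here as a rough mode (an assumption)."""
    return max(freq, key=freq.get)


long_avg = average_long(COURTIS)            # 2667 / 219
short_avg = average_short(COURTIS, 11)      # 11 + 258 / 219
median = median_midpoint(COURTIS)           # 11 + 21.5 / 22
mode_step = largest_group(COURTIS)          # step 10, with 26 cases

print(f"average {float(long_avg):.2f}, median {float(median):.2f}, "
      f"largest group at step {mode_step}")
```

The long and the short method agree at 12.18, the median stands at 11.98, and the largest group lies at step 10; how far apart such figures fall is one rough sign of how far the distribution departs from symmetry.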
The matter of symmetry or skewness can be seen at a glance from the surface of frequency. With a little experience it can be recognized directly from the table of frequency.

For school statistics, this is probably the safest rule: Use the mode for skew and bi-modal distributions; the average, for cases where every item must be counted as much as any other item (say in finding the average expenditures for five or less years); and the median for all others.

Of the three, the median is by far the best "all purpose" measure of central tendency for school statistics.

EXERCISE

In each of the distribution tables for which you have calculated the mode, median, and average, which is the best measure of central tendency? Why?

REFERENCES FOR SUPPLEMENTARY READING

King, W. I. Elements of Statistical Method, Chapter XII.
Rugg, H. O. Statistical Methods Applied to Education, Chapter V.
Thorndike, E. L. Mental and Social Measurements, pp. 36-39.

CHAPTER VI

MEASURES OF DEVIATION OR DISPERSION

The second element necessary in giving the bird's-eye view, it will be recalled, is some measure of how much the cases range from the central tendency. This is variously called "spread," "range," "deviation," or "dispersion." It is of just as much importance as the central tendency. For example, it makes a vast difference whether a teacher is called upon to teach children that are all very close to the typical or "average" child, or ones that vary enormously both above and below. Supervision means one thing for a superintendent when all his teachers are very close in ability, experience, spirit, etc., to his typical teacher; it means an entirely different thing when many of his teachers range widely either side of this typical teacher. There are several ways of measuring this range, the most useful of which will now be given.

I.
EXTREME RANGE VARIATION

The dispersion may be shown by giving the extreme cases and thus showing how far it is between them. This measure may be useful in showing the difference of ability in achievements of a grade in school. For example, the eighth grade in a western city as reported on some Courtis Tests varied in attempts on the addition problems from 4 to 24, or a range of 20 out of a possible range of 24. This measure shows at once that there is a wide range of ability of eighth grade children in that city as regards attempts on addition problems.

But as a general rule, the extreme range variation is not a good measure of dispersion because the extreme cases are likely to be very unreliable, especially if there are only one or two that are isolated from the rest of the measures in the distribution. The range of teachers' salaries in the typical school system is not from that of the superintendent down to the salary of the poorest paid teacher, because there is only one superintendent and his salary is far removed on the scale from that of the highest paid teacher under him. His salary must be eliminated before the extreme range for the others has any value.

The influence of extreme variations upon this measure is shown by Table 17, taken from a study in costs of instruction in home economics in fifteen southern normal schools.¹

Table 17. Variations in Cost of Instruction in Home Economics in 15 Southern Normal Schools, per 1000 Student Hours, 1916

    Key number of      Cost of home economics instruction
    normal school      per 1000 student hours
        III                  $194
        VII                   115
        XIII                   89
        II                     88
        VIII                   85
        IV                     82
        XI                     79
        XIV                    73
        XV                     70
        I                      64
        VI                     60
        X                      58
        V                      53
        IX                     52
        XII                    15

¹ Alexander, Carter: "Costs of Instruction in Normal Schools," Elementary School Journal, XVII, 653

The extreme range variation between 194 and 15 is 179. But it is very evident that both extremes are the results of some very unusual factor.
If the top case is cut off, the range is reduced to 100. If the bottom case is cut off, but the top one retained, the range is 142. If both are cut off, the range is reduced to 63. If the two top cases and the bottom one, all of which seem unusual, are eliminated, the range is reduced to 37, which is probably a fairly reliable figure.

The uncertainty as to just how many cases to leave out in any one distribution when computing the spread emphasizes the need of having some standard proportion cut off, say a fourth at each end. Then the average deviation or spread from the central tendency of the two intersections or points thus obtained can be computed. Let us now discuss some of the devices for getting deviations from such points.

II. QUARTILE DEVIATION (SEMI-INTER-QUARTILE RANGE OR "Q")

This is by far the easiest and most important measure of dispersion for school statistics. But before it can be easily understood, it is necessary to get the quartiles. These are the magnitudes of the points on the scale which divide the distribution into four equal groups of cases (or quarters).

Quartiles. Obviously, it takes three such points to divide the whole into four groups with equal numbers of cases: the first quartile or 25 percentile, the second quartile (which, of course, is the median), and the third quartile or 75 percentile. Obviously, also, the first quartile is the median of the lower half of the distribution, and the third quartile is the median of the upper half. Below the first quartile come one-fourth of the cases and above it, three-fourths. Above the third quartile come one-fourth of the cases, and below it, three-fourths. Between the first and third quartiles come one-half or 50 per cent of the cases.

The problem of locating the first quartile is statistically the same as that of locating the median, except that cases are counted in from one end so as to get only one-fourth of them on one side.
That is, the object this time is to find the "quarter point" from one end of the distribution.

In a Bobbitt table it is generally best to fix the number of cases so as to know in advance whether the quartiles will be shown as separate cases or fall between separate cases. In this way, there will be no trouble with splitting cases. Thus if the number of cases is some multiple of 4, plus 1, the median will appear as a separate case and the quartiles will fall between cases. For the Bobbitt table on page 18, the quartiles are respectively 47 for Q 1 (1/2 of 42 + 52) and 78 for Q 3 (1/2 of 74 + 82). If the total number of cases is some multiple of 4, the median and both quartiles will fall between cases. If the total number of cases is some multiple of 4, plus 3, the median and both quartiles will appear as separate cases.

The calculation of the quartiles in the distribution given on page 134 is as follows. Since there are 219 cases, the first quartile must come at the point that will throw 1/4 of 219 or 54 3/4 cases below it. Counting up, we find that this will require 13 3/4 cases of the 9-10 group, there being only 41 cases below. This will make it come 13.75/21 or .65 of the step up. Q 1 then is 9 + .65 = 9.65. Similarly Q 3 is in the 15 group coming down. 54 3/4 - 45 = 9 3/4. 9.75/15 = .65. Q 3 then is 16 - .65 = 15.35.

Graphically, the first quartile is the magnitude on the base at the foot of a perpendicular so located that it will cut off one-fourth of the area of the surface of frequency to the left of it and three-fourths to the right. The third quartile has one-fourth of the area to the right and three-fourths to the left. The median has one-half the area on either side of it. (See Figure 20, page 137.)

Quartile Deviation. The Q is found by taking half the difference between the first and third quartiles. This average will give the average distance of the quartiles from the median or central tendency selected.
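The quartile counting just described for the distribution on page 134 can be retraced with a short Python sketch. It is meant only as an illustrative check: the names `count_up`, `count_down`, and `COURTIS` are hypothetical, and each step is assumed to cover one unit on the scale with its cases spread evenly over it.

```python
from fractions import Fraction

# Courtis Test distribution, page 134: {problems attempted: number of children}.
COURTIS = {24: 5, 23: 2, 22: 2, 21: 1, 20: 5, 19: 5, 18: 5, 17: 7,
           16: 13, 15: 15, 14: 10, 13: 15, 12: 24, 11: 22, 10: 26,
           9: 21, 8: 19, 7: 9, 6: 8, 5: 3, 4: 1, 3: 1}


def count_up(freq, cases, width=1):
    """Scale value with `cases` cases below it, counting from the bottom."""
    remaining = Fraction(cases)
    for step in sorted(freq):
        n = freq[step]
        if n > 0 and remaining <= n:
            return step + width * remaining / n
        remaining -= n
    raise ValueError("more cases requested than the table contains")


def count_down(freq, cases, width=1):
    """Scale value with `cases` cases above it, counting from the top."""
    remaining = Fraction(cases)
    for step in sorted(freq, reverse=True):
        n = freq[step]
        if n > 0 and remaining <= n:
            return step + width - width * remaining / n
        remaining -= n
    raise ValueError("more cases requested than the table contains")


quarter = Fraction(sum(COURTIS.values()), 4)   # 219 / 4 = 54 3/4 cases
q1 = count_up(COURTIS, quarter)                # 9 + 13.75/21
q3 = count_down(COURTIS, quarter)              # 16 - 9.75/15
q = (q3 - q1) / 2                              # half the difference of the quartiles

print(f"Q1 {float(q1):.2f}  Q3 {float(q3):.2f}  Q {float(q):.2f}")
```

Carried to two places, the sketch gives 9.65 and 15.35 for the quartiles and 2.85 for Q.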
The average distance is used because sometimes the median is not exactly halfway between the quartiles (it cannot be if the distribution is not perfectly symmetrical). By formula, then,

\[ Q = \frac{\text{Quartile 3} - \text{Quartile 1}}{2} \]

For the Bobbitt table on page 18, the calculation is:

\[ Q = \frac{\$78 - \$47}{2} = \frac{\$31}{2} = \$15.50 \]

For the distribution on page 134, the calculation is:

\[ Q = \frac{15.35 - 9.65}{2} = \frac{5.70}{2} = 2.85 \]

Graphically, Q is one-half the distance from Quartile 3 to Quartile 1 on the base of the surface of frequency. But as it is difficult to show only one-half of this, the whole of this distance is usually represented as 2 Q. (See Figure 20.)

III. OTHER PERCENTILE DEVIATIONS

In some of the educational investigations, especially those issued by workers at the University of Chicago, dispersion is indicated by the distance from the median of some point marking off a convenient number of the cases, other than the quartile. The number of cases is usually a common percentage; hence the point is called a "percentile."

In some studies the cases are divided into thirds by "tertiles." The median here is, of course, in the middle of the second group. The tertile deviation, then, is half the difference of the first and second tertiles (only two tertiles are needed to make three groups). Similarly four "quintiles" will divide the distribution into five groups with the same number of cases, the median being at the middle of the third group. The quintile deviation then would be one-half the difference of the first and fourth quintiles.

Table 18.
Table to Show Variation by Percentage Groups, Using Distribution of Annual Salaries of Regular Teachers in Elementary Schools in Cleveland and in 13 Other Cities of More Than 250,000 Inhabitants ¹

Salaries not exceeding the amounts specified were earned by teachers bearing to the aggregate number employed in each city the proportion of:

    City              10 per    30 per    50 per    70 per    90 per
                      cent      cent      cent      cent      cent
    Baltimore         $600      $  700    $  700    $  750    $  800
    Boston             648         840     1,176     1,176     1,224
    Chicago            675         975     1,175     1,175     1,200
    Cincinnati         700         900     1,000     1,000     1,000
    Cleveland          600         750       900     1,000     1,000
    Indianapolis       475         625       875       925       925
    Milwaukee          876         876       876       876       876
    Minneapolis        750         950     1,000     1,000     1,000
    Newark             630         780     1,000     1,000     1,300
    New Orleans        500         600       700       750       800
    Philadelphia       630         780       900       940     1,000
    San Francisco      840       1,164     1,200     1,224     1,224
    St. Louis          700       1,032     1,032     1,032     1,120
    Washington         625         700       750       890       980

    Average           $661      $  834    $  949    $  988    $1,032

¹ Data for Cleveland from payroll for 1914-15; data for other cities for 1913-14, from "Tangible Rewards of Teaching," U. S. Bureau of Education

The calculations and graphic representations for these various percentile deviations are precisely similar to those for the Q or quartile deviation. These deviations are found as yet in few educational investigations, but should be understood for reading purposes.

A simple but effective device for popular consumption indicates the variability indirectly by showing the magnitudes which equal or exceed certain fixed percentages of the cases. This device has been found especially valuable for making comparisons between different distributions. A good example occurs in the Cleveland Survey ¹ and is reproduced in Table 18.

IV. MEDIAN DEVIATION (MED. DEV. OR P. E.)
This is the median of all the deviations arranged in order of size and irrespective of whether they are above or below the central tendency. In other words, it is calculated by arranging the deviations in order of size and then finding the median of these deviations in precisely the same way as the median of anything else would be found.

Within the median deviation of the central tendency used, the middle 50 per cent of the cases come. Within the Q of the central tendency approximately 50 per cent of the cases come, but they are not necessarily the middle 50 per cent, because the central tendency does not lie exactly halfway between the quartiles, except in a symmetrical distribution. In such a distribution the median deviation is of course the same as Q.

¹ Volume on "Financing the Schools," p. 57

The median deviation is little used in school statistics, but the superintendent may come across it in his reading of educational investigations and in attempts to indicate variability. It is sometimes called the "probable error," and this is where it gets the letters "P. E." However, this name is not a good one for, as Professor Thorndike suggests, the median deviation "is not specially probable and not an error at all." ¹

A statement of central tendency of 30 with a P. E. of 4 means that half the cases deviate more than 4 and half of them less than 4 from this 30. It also means that for any given case chosen at random, the chances are 50 to 50 that it will deviate more than 4 from this 30, that is, be below 26 or above 34; the chances are likewise 50 to 50 that it will deviate less than 4 from this 30, that is, be between 26 and 34. In other words, it is a toss-up as to whether any given case will be likely to deviate more or less than 4 from this central tendency of 30; that is, any given case is as likely to be between 26 and 34 as it is to be below 26 or above 34.

V. AVERAGE DEVIATION (A. D.)
This is simply the average of all the deviations from whatever central tendency is selected. It may be figured from any measure of central tendency. But it is best calculated from the average or approximate average, next from the median or approximate median, and seldom if ever from the mode, unless the average spread for each side is given separately.

¹ Mental and Social Measurements, p. 40, footnote

The A. D. for the Bobbitt table on page 18 is figured thus:

    Item      Deviation from median (59)
     61            2
     62            3
     63            4
     69           10
     69           10
     74           15
     82           23
     88           29
    100           41
    100           41
    112           53
    169          110
     59            0
     58            1
     56            3
     56            3
     54            5
     53            6
     52            7
     42           17
     41           18
     38           21
     34           25
     33           26
     30           29
                 ---
    No. of cases, 25)502   sum of deviations
                     20.1  A. D.

The A. D. for a distribution in groups is usually figured from the approximate average or the approximate median, so as to have the deviations in whole steps. The result is close enough for all practical purposes.

The A. D. for the distribution given on page 134 is figured thus from the guessed average:

                      492   + deviations (see page 143)
                      234   - deviations (see page 143)
                      ---
    No. of cases, 219)726   sum of all deviations
                      3.31  A. D.

VI. STANDARD DEVIATION (MEAN SQUARE DEVIATION, S. D.)

This is found by taking the square root of the average of the squares of the deviations, counting zero deviations.

\[ \text{S. D.} = \sqrt{\frac{\text{Sum of squares of deviations}}{\text{Number of deviations}}} \]

It is of no particular value in school statistics except that the superintendent should understand it so that he may read intelligently educational or economic treatments that use it. It saves time to use this as the measure of dispersion if the Pearson Coefficient of Correlation is later to be used. It is better than the A. D. if the deviations of the extreme cases need to be weighted.

The calculation of the S. D.
from the guessed average in the distribution on page 134 is figured as follows, using the deviations as given on page 143:

    24 × 1²  =  24        26 × 1² =  26
    15 × 2²  =  60        21 × 2² =  84
    10 × 3²  =  90        19 × 3² = 171
    15 × 4²  = 240         9 × 4² = 144
    13 × 5²  = 325         8 × 5² = 200
     7 × 6²  = 252         3 × 6² = 108
     5 × 7²  = 245         1 × 7² =  49
     5 × 8²  = 320         1 × 8² =  64
     5 × 9²  = 405
     1 × 10² = 100
     2 × 11² = 242
     2 × 12² = 288
     5 × 13² = 845

    Sum of squares of deviations, 4282
    No. of cases, 219

$$\text{S. D.} = \sqrt{\frac{4282}{219}} = 4.42$$

VII. DEVIATIONS FOR SKEW DISTRIBUTIONS

The measures for deviations so far given are serviceable only for symmetrical, or approximately symmetrical, distributions. If the distribution is a skew one, it is evident that no average deviation for the whole can be of any service, because the deviation on one side of the central tendency will be markedly different from the deviation on the other side. This average or median deviation for the whole would be a deviation which did not actually exist. Hence in a skew distribution it is customary to take the mode or median (preferably the mode) for the central tendency, and then to give the average or median deviation for the cases above this central tendency, and the same for the cases below it.

This procedure may be illustrated from the salaries paid the Cleveland elementary school teachers in 1914-15.¹ (See page 160.) The approximate median is sufficient here and inasmuch as only an average deviation is to be figured, it is unnecessary to make the groupings by even steps. There are in all 2204 teachers, making 1102 above the median and 1102 below.

    The A. D. above is $118810 ÷ 1102, or about $108.
    The A. D. below is $200920 ÷ 1102, or about $182.

The median of the part above the group containing the median is at the end of the 525th case, evidently in the $1000 group, making the quartile deviation above roughly $100. The median of the part below the group containing the median is evidently in the $700 group, making the quartile deviation below roughly $200.
The exact median and the exact quartile deviation could be figured if desired, counting 1102 cases above the median and 551 cases above the quartile.

¹ Adapted from Cleveland Survey, Summary Volume, p. 98

    Salary     Number getting     Deviation       Sum of
    paid       salary             from median     deviations
    $1650           1                 750           + 750
     1540           2                 640            1280
     1500           4                 600            2400
     1430           1                 530             530
     1400           1                 500             500
     1300           5                 400            2000
     1210           1                 310             310
     1200           7                 300            2100
     1155           1                 255             255
     1100          71                 200           14200
     1050          83                 150           12450
     1045           3                 145             435
     1000         762                 100           76200
      950         108                  50            5400
               Total teachers above = 1050         118810

      900         196      (approximate median)

      850         112                  50          - 5600
      825           2                  75             150
      800         130                 100           13000
      770           1                 130             130
      750         133                 150           19950
      715           4                 185             740
      700         164                 200           32800
      650         145                 250           36250
      600         136                 300           40800
      550          20                 350            7000
      500         110                 400           44000
      400           1                 500             500
               Total teachers below = 958          200920

    Total teachers, 2204; 2204 ÷ 2 = 1102

The deviations have been calculated with reference to the median in this particular example because this was the central tendency used in the survey. But it is apparent that if the modal salary of $1000 as denoted by the largest group of 762 cases were taken, the discrepancy between the deviation measures on the two sides would be much greater.

Fig. 21. Graphed Bobbitt Table of Mean Annual Salaries Paid Elementary Teachers in Certain Cities. (J. F. Bobbitt: Elementary School Journal, 15: 45.)
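The arithmetic of the Cleveland example can be checked with a short sketch. The Python below is an illustration only and is not part of the survey; the names SALARY_GROUPS, MEDIAN_SALARY, and side_average_deviations are made up for this purpose. It assumes, as the text does, that each side's sum of deviations is divided by half the total number of teachers (1102), so that the cases in the median group count as zero deviations.

```python
# (salary paid, number getting that salary), copied from the Cleveland table
SALARY_GROUPS = [
    (1650, 1), (1540, 2), (1500, 4), (1430, 1), (1400, 1), (1300, 5),
    (1210, 1), (1200, 7), (1155, 1), (1100, 71), (1050, 83), (1045, 3),
    (1000, 762), (950, 108),
    (900, 196),  # group containing the approximate median
    (850, 112), (825, 2), (800, 130), (770, 1), (750, 133), (715, 4),
    (700, 164), (650, 145), (600, 136), (550, 20), (500, 110), (400, 1),
]
MEDIAN_SALARY = 900  # approximate median used in the survey


def side_average_deviations(groups, median):
    """Return the sum of deviations and the A. D. for each side of the median."""
    total_cases = sum(count for _, count in groups)
    half = total_cases / 2  # cases counted on each side, as in the text
    sum_above = sum((salary - median) * count
                    for salary, count in groups if salary > median)
    sum_below = sum((median - salary) * count
                    for salary, count in groups if salary < median)
    return sum_above, sum_below, sum_above / half, sum_below / half


sum_above, sum_below, ad_above, ad_below = side_average_deviations(
    SALARY_GROUPS, MEDIAN_SALARY)
print(f"Sum of deviations above: {sum_above}, A. D. above: {ad_above:.2f}")
print(f"Sum of deviations below: {sum_below}, A. D. below: {ad_below:.2f}")
```

The two results differ widely, which is the point of giving a separate deviation for each side of a skew distribution.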
In this connection it should be noticed that a Bobbitt table may be really a skew distribution. This skewness will not be apparent at a glance because the table will show just as many lines or cases between one quartile and the median as between the other quartile and the median. It needs inspection to see whether cases listed on separate lines are really the same in size or approximately so. Professor Bobbitt removes this difficulty by graphing his results. (See page 101.) The crowding in of the cases below the median shows that the deviation is on the average much less on that side than on the upper. The same is true of the graph given in Figure 21.

VIII. WHICH MEASURE OF DEVIATION TO USE IN A GIVEN DISTRIBUTION

The choice of the measure of deviation depends upon the measure of central tendency selected. If the latter is the average, then the deviation should be expressed by the average deviation. If the median is used, deviation should be expressed by the Q, although the average deviation may be employed. If for any reason extreme variations are to be emphasized, the extreme range measure or the standard deviation should be employed. In a skew distribution, if the median is used, the quartile deviation or the average deviation for each side should be given separately; if the mode is used, the average deviation or the median deviation for each side should be given separately.

EXERCISES

1. Calculate all the measures of deviation or dispersion you can for each of several of the distribution tables used in previous exercises. For this use the measures of central tendency chosen in the exercise on page 148.

2. For each of the distributions used in the exercise above, which measures of deviation are preferable? Why?

REFERENCES FOR SUPPLEMENTARY READING

King, W. I. Elements of Statistical Method, Chapter XIII.
Rugg, H. O. Statistical Methods Applied to Education, pp.
149-173 and 178-180.
Thorndike, E. L. Mental and Social Measurements, pp. 46-50 and Chapter VI.

CHAPTER VII

MEASURES OF RELATIONSHIPS

So far we have for the most part considered distributions of measures as wholes and a single distribution at a time. But one of the greatest values of statistical method is the ease with which it makes possible the study of relationships, including those between separate distributions. Since things have no meaning except through their connections with other things, statistical method must bring out such relationships very clearly. It was, of course, impossible to cover the previous topics in this book without dealing with relationships. But it is now advisable to give these connections a special and separate treatment.

I. RELATIONSHIPS INSIDE OF ONE GROUP

Discrete Series. As soon as a Bobbitt table or central tendency and the deviation for any distribution are given, by any of the various ways for calculating these measures, a relationship is, of course, indicated. But in general, inside of one distribution, relationship is most easily shown by some form of the bar graph.

If every item is kept separate, each should be represented by a separate bar. All bars should be the same width and differ only in length. This length indicates the measure of the case. The graphing of the Bobbitt table in Figures 8 and 9 is an example of this. The bars in Figure 8, of course, may run vertically if preferred.

In such a table the cases are arranged from high to low, and the tops of the bars or the curve for the graphic presentation will have a general downward or upward direction, according to the end from which it is viewed. But there are some distributions in which the cases come in a time sequence or in an alphabetical order so that the tops of the bars or the curve jog up and down.
This would be the case with the number of children in school by years or months in a city of fluctuating population, if the base scale represented the calendar months or years in succession. In this instance, the central tendency and measure of deviation would have to be calculated and given separately, or else shown with the same data rearranged as a Bobbitt table or ordinary grouped distribution.

Continuous Series. If there is more than one case to an item, that is, if the items are grouped, the length of each bar will represent the number of cases. This amounts then merely to drawing the surface of frequency with bars representing groups which may or may not be adjacent to each other.¹ The tops of the ends of these bars form the broken line or "curve" which, united with the base line, makes up the surface of frequency.

¹ See p. 111

If the distribution table is to be written or printed without graphing, the relative sizes of the groups may frequently be brought out more forcibly by turning the numbers into percentages of the whole number of cases. Of course, the ratio of 47 cases of magnitude 6 to a total of 470 cases is exactly the same as the ratio of 10 per cent to 100 per cent. But the ordinary man is used to thinking in terms of percentages and can grasp the relationship much more quickly that way. Graphing will show this relationship as easily from the original figures as from percentages, so there is no need for the change to percentages where graphing is to be done, in the case of one distribution taken by itself.

II. SIMPLE RELATIONSHIPS BETWEEN DIFFERENT DISTRIBUTIONS

If two different distributions have been made up on the same scale, they can be conveniently shown in bar graphs, as in Figure 22. It may be noted there that the tops of the two different sets of bars give two "curves."
This device will not be of service in comparing more than two distributions at a time, but several more "curves" may be drawn provided they do not overlap too much. It is customary to represent curves alone by different conventional signs, a separate style of line being used for each curve (No. I, No. II, No. III, No. IV, No. V, No. VI). Figure 23 is a good example.

Fig. 22. Device for Comparing Two Different Distributions Made up on the Same Scale with Bar Graphs. The whole bars represent the total enrollment, while the shaded portions represent retarded children in the grades at Memphis, about 1908. (From Laggards in Our Schools by L. P. Ayres, page 39, by permission of Russell Sage Foundation)

[Chart: Results of Spelling Tests. Percentage of Words Spelled Correctly by Grades. Curves for the highest, average, and lowest rooms in Butte, and the average for schools in 22 cities.]

However, it is often very difficult to grasp comparisons between two different distributions using the same scale, unless the number of cases for each magnitude is put on the same basis as regards the rest of its distribution. This is most easily done by changing the number of cases in each step to the proper percentage of the whole. That is, one would not give the achievements in composition for several school grades by merely citing the number of children in each grade making the various scores. Instead, he would give tables presenting the percentages of children making each score in each grade. Table 18 on page 154 employs percentages for making comparisons between cities on salaries of teachers, by groups.

A good way to compare several distributions is to place the surfaces of frequency made up from the percentage tables one above the other. This, of course, may be used only when they are all applied to the same scales.
In this case, they must be centered exactly (the centers in the same vertical line), or else the differences between the central tendencies must be drawn to scale accurately. Figure 24 is a good example.

Fig. 23. Use of Curves for Comparing More Than Two Distributions. This chart represents the range from the poorest to the best room tested in each grade in spelling, Butte, Montana. For example, the poorest second-grade room averaged 73. The average for the whole city is represented by the dotted line, while the average for twenty-two cities is represented by the heavy line at 70. (From Butte, Montana, Survey, page 72)

Fig. 24. Graph for Comparing Related Distributions, Using the Composition Scores in Salt Lake City. (From Salt Lake City Survey, page 141)

Fig. 25. Device for Graphing Bobbitt Tables from Different Distributions on Related Scales. (Chart title: Spelling Ability by Grades.) Each wide column represents the achievement in spelling of the ward schools for that grade, on the Ayres scale. Each number represents a ward school, and the height on the scale shows the achievement of that grade in that school. The quartile and median lines are shown. (Adapted from the San Antonio Survey, page 105)

Professor Bobbitt in the San Antonio Survey used another device for making comparisons between different distributions that had been made up as Bobbitt tables. He drew their quartiles and medians as horizontal lines across vertical columns, allowing each distribution one column and adjusting the horizontal lines on the vertical ones as scales.
Figure 25 gives part of his device for showing achievements of different grades in spelling. This is a very convenient graph, for it shows a good many relationships at a glance, such as the variations in central tendencies, dispersions, etc. By observing only the lines belonging to the medians, a curve may be read across the page. That is, a little practice will enable one to see three curves on the chart, the upper representing quartile 3; the middle, the median; and the lower, quartile 1.

III. COEFFICIENT OF VARIABILITY OR DISPERSION

A serious difficulty arises when we try to compare two distributions as regards the amounts of dispersion unless they have about the same central tendencies. If the central tendencies are widely different and the absolute number of units of dispersion is the same, the real or relative amounts of dispersion are widely different.

For example, suppose a superintendent is studying the way his high school teachers mark pupils. He gets 500 or more marks given by each teacher. He finds that two teachers have the same median in their marks, 80 on a numerical scale. But one teacher has a Q of 5 and the other a Q of 10. In the first case, the typical dispersion is 6¼ per cent of the median; in the other it is 12½ per cent. Again, a median of 90 with a Q of 5 would be markedly different from a median of 80 with a Q of 5. In the former, the variability would be 5 5/9 per cent and in the latter, 6¼ per cent, although they varied 10 points in the central tendency.

Consequently, in comparing distributions or groups on variability, experienced persons compare them through percentages of variability or dispersion. The percentage of dispersion is usually called the coefficient of variability. By formula it is simply:

$$\text{Coefficient of Variability} = \frac{\text{Measure of Deviation}}{\text{Measure of Central Tendency}}$$

pointed off as per cent.
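The formula can be tried on the marking example just given. The following Python sketch is illustrative only; the function name coefficient_of_variability is an assumption of this example, and any measure of deviation (Q, A. D., or S. D.) may be passed together with the central tendency it was figured from.

```python
def coefficient_of_variability(deviation, central_tendency):
    """Return the measure of deviation as a percentage of the central tendency."""
    if central_tendency == 0:
        raise ValueError("central tendency must not be zero")
    return 100 * deviation / central_tendency


# The two teachers with the same median (80) but different quartile deviations.
print(coefficient_of_variability(5, 80))    # Q of 5 on a median of 80
print(coefficient_of_variability(10, 80))   # Q of 10 on a median of 80

# The same Q of 5 on a median of 90 gives a smaller relative dispersion.
print(round(coefficient_of_variability(5, 90), 2))
```

Expressing each deviation as a percentage of its own central tendency is what allows two distributions with different medians to be compared on the same footing.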
Thus, for the distribution on page 134, the calculation is:

$$\text{Coefficient of Variability} = \frac{Q}{M} = \frac{Q\ \text{(from p. 153)}}{11.98\ \text{(from p. 135)}}$$

Professor Haggerty in his study of arithmetic in twenty Indiana cities¹ has an interesting graph for comparing variation in two distributions. He had figures for each school grade on the Courtis Tests in both attempts and rights, thus getting two distributions of twenty cases each. So he made a scale for the attempts and one for the rights on each grade, with the medians centered and opposite. Thus a line joining these two centers was horizontal. A given city could be indicated by finding its position on each of the two scales and joining these points by a line. If the given city varied as did the whole group of cities, its line was parallel to the original line joining the medians. If it did not vary that way, its line would slant, the end farthest from the median line indicating that it varied more in that respect than in the other.

¹ Haggerty, M. E.: "Arithmetic: A Cooperative Study in Educational Measurements," Indiana University Studies, No. 27

[Chart: Fifth grade addition, Bloomington, Ind., compared with standard for 20 Ind. cities, Haggerty. Left scale, A (attempts); right scale, R (rights); lines for Bloomington and for the median of the 20 Indiana cities.]

Fig. 26. Graphic Device for Comparing Variation in Two Different Distributions. The scale for attempts in the Courtis Tests from twenty cities is shown on the left, centered with the scale for rights on the right. The slanting line represents the achievement of Bloomington, the slant upward indicating that it is relatively higher in rights than in attempts.

Thus in Figure 26, for the fifth grade in addition, the upper heavy line represents the work of the city of Bloomington and shows at a glance that the fifth grade in Bloomington attempted 9 problems in addition and got between 5 and 6 of these right; that in both attempts and
rights it exceeded the standards set by the twenty Indiana cities as a group (these standards are 6.5 and 3.6 respectively); also that its achievement in rights was better than its achievement in attempts.

EXERCISE

Which of the two distributions given in Exercise 2, page 122, is the more variable and just how much? Precisely how do you reach your conclusion? For this, use the central tendencies and measures of deviation previously calculated.

IV. CORRELATION

Meaning of Correlation. Sometimes it is very advantageous to be able to show accurately and briefly the general relation between two distributions that have some common element or have been tested on the same thing. Thus it may be desired to get the relation between the two distributions obtained by testing the same group of cases on two different tests. For example, one may wish to know if a group of cities rank the same way on excellence of schools that they do on per capita school costs.

Now it is evident in all such cases that the chances are very much against any situation where the cities would rank exactly the same or exactly opposite in both lists. Some will fall in exactly the same places, others exactly opposite, and still others will change indiscriminately. Consequently it is desirable to have some way of showing the extent to which the individual cases generally keep the same relative positions in the two distributions (that is, first in each, second in each, last in each, etc.), or, putting it another way, the extent to which the two distributions are correlated with each other.

For a concrete example, let us take the following data on Cleveland ward schools, accumulated during the survey there.¹ The records of eighteen schools for the same grade on two qualities, which for our purposes we may call the A test and the E test, are given in Table 19.
Correlation Table Using Data from Eighteen Cleveland Schools

School: Brownell, Clark, Marion, Detroit, Fullerton, Sackett, North Doan, Bolton, East Boulevard, Gilbert, Rosedale, Landon, Lawn, Walton, Gordon, Sibley, Waverly, Halle

Record in A test: 32.8, 28.3, 28.0, 26.3, 26.1, 25.5, 25.0, 24.6, 24.0, 23.0, 22.9, 22.6, 21.9, 21.9, 21.7, 21.3, 20.8, 19.4

Record in E test: 10.0, 7.5, 6.8, 6.5, 7.5, 7.0, 7.0, 6.9, 5.9, 6.7, 8.5, 6.2, 5.3, 7.1, 6.6, 7.1, 5.2

For our purposes it is not necessary to know which grades were used, what the tests mean, or how they were figured. With the records as given, the main question is: How did the rankings of the schools on the two tests correspond? That is, did each school (or the majority of the schools)

¹ From some material turned over to the author to be used as practice work in his class on Statistical Methods Applied to Education at the University of Chicago, summer quarter of 1915.

Long tables may also be broken up by grouping the items. Thus, contrast the tables of Cincinnati and Cleveland for comparison of expenditures for different years, as given by Snedden and Allen, pages 35-36, here shown as Tables 24 and 25:

Table 24. Example of Effect of Unclassified Items. Comparison of School Expenditures for the Years 1895-1905 (Cincinnati)

                                        Year ending     Year ending     Year ending
                                        Aug. 31, 1895   Aug. 31, 1900   Aug. 31, 1905
    Teachers, day schools                 $669,752        $799,286        $815,719
    Teachers, night schools                  9,606           6,612           8,321
    Officers and examiners                  15,143          16,646          17,792
    Librarians                              . . .           . . .           . . .
    Janitors                                . . .           . . .           . . .
    New buildings                           . . .           . . .           . . .
    Repairs                                 . . .           . . .           . . .
    Lots                                    . . .           . . .           . . .
    Furniture                               . . .           . . .           . . .
    Heating fixtures                        . . .           . . .           . . .
    Rent                                    . . .           . . .           . . .
    Fuel                                    . . .           . . .           . . .
    Supplies                                . . .           . . .           . . .
    Printing                                . . .           . . .           . . .
    Advertising                             . . .           . . .           . . .
    Gas                                     . . .           . . .           . . .
    Census                                  . . .           . . .           . . .
    Textbooks and supplementary readers      1,641           3,502          13,448
    Incidentals, etc.                        5,678           4,643           3,284
    Teachers' Institute                     . . .           . . .           . . .
    Interest and redemption of bonds        . . .           . . .           . . .
    Public library                          . . .           . . .           . . .
    Deaf-mute taxes                         . . .           . . .           . . .
    Transfer of funds                       . . .           . . .           . . .
    Apparatus                               . . .           . . .           . . .
    Totals                                  . . .           . . .           . . .

Table 25. Example of Effect of Classified Items. Comparison of School Expenditures (Cleveland)

Columns: August 31, 1900; August 31, 1905

    Tuition
        Supervisors' salaries
        Teachers' salaries
    Maintenance
        Officers' and employees' salaries
        Fuel and light
        Repairs
        Stationery and supplies
        Contingent
        Water
    Fixed charges
        Interest
        Bonds
        Rent and insurance
        Furniture and fixtures
    Permanent improvements
        Land
        Buildings
        Grading, paving, etc.
        Improvement on existing buildings
    Miscellaneous
        School books
        St. Louis Exposition
        Glenville annexation
    Total

Figures shown, in the order printed: for 1900, $37,406; 897,190; 118,664; 1,295; for 1905, $50,964; 1,314,660; 184,144; and a further entry, 10,335, standing after "Water."

4. Arrangement of Totals So That They Can Be Quickly Grasped

This will aid the superintendent to discover significant relations, as well. The most common devices are the use of the word "total," the employment of black-face type or italics, and the outset in another right-hand column; thus:

    $ 673               $ 673               $ 673
     1240                1240                1240
      890                 890                 890
    $2803 Total.        $2803                         $2803
                        (black-face type)   (outset in right-hand column)

5. Printing of Headings So They Can All Be Read from One Way

It is not pleasant to have to twist one's head or turn the book around to read vertical headings. The average reader will skip such rather than do the turning. For the same reason, tables that are printed lengthwise on the page should be avoided if at all possible. In general, the headings may, by the use of syllables and abbreviations, be horizontally printed in just as small a space as the wrong way takes. See pages 72, 213.

6.
Filling in Tables Only Where Data Actually Exist

If vacant places are filled with zeros, the labor involved in reading the zeros is as heavy as that in reading the actual numbers.¹ However, if the paper is not cross-lined, it is better to fill the vacant spaces with dotted lines, as the reader may get lost if several vacancies occur together. Compare for example Tables 26 and 27.²

¹ This does not in any way discount what was said on page 84 about making an entry for every case. That referred to a blank for the superintendent to use in collecting data, not for the public.
² Snedden and Allen: School Reports and School Efficiency, p. 78

Table 26. Showing Use of Zeros. Detroit Central High School
Total number of "first year" pupils who have left since Sept., 1904

                                     Totals
    Cause                        B¹     G      T
    Illness                       5    18     23
    Illness in family             0     4      4
    Failing eyesight              1     1      2
    Work                         34     7     41
    Transferred                   4     2      6
    Left city                     3     7     10
    Indifference to work          8     1      9
    Music                         0     1      1
    Unknown                      10    13     23
    Total                        65    54    119

    [The columns for the separate ages, each subdivided into B, G, and T, are not reproduced here.]

Table 27. Omitting Zeros. Detroit Central High School
Total number of "first year" pupils who have left since Sept., 1904

                                     Totals
    Cause                        B      G      T
    Illness                       5    18     23
    Illness in family           . .     4      4
    Failing eyesight              1     1      2
    Work                         34     7     41
    Transferred                   4     2      6
    Left city                     3     7     10
    Indifference to work          8     1      9
    Music                       . .     1      1
    Unknown                      10    13     23
    Total                        65    54    119

    [The columns for the separate ages, each subdivided into B, G, and T, are not reproduced here.]

¹ In order to include all the columns within the limits of the page, it has been necessary to use the abbreviation B for Boys, G for Girls, and T for Total.

7. Care to Avoid Eyestrain

All the type in a table should be large enough to be read easily and quickly. It is a false economy to compress tables into a very small space through the use of fine print. This may save money on printing bills.
But it loses more money, because the fine print will not be read by the very persons for whom the table was prepared.

Dotted lines are very helpful to aid the eye in covering long horizontal stretches between data that are to be connected. They should, however, go as far as they are needed. In Table 28 dotted lines start out, but do not go far enough to be of real service.

Table 28. Example of Bad Use of Dotted Lines. Cost for Overhead Administrative Control in Western Cities¹

    City                          Per cent of total maintenance cost
                                  spent for administrative control
    Sacramento, Cal. . .                        1.8
    Spokane, Wash. . .                          2.2
    Pasadena, Cal. . .                          2.4
    Seattle, Wash. . .                          2.6
    Oakland, Cal. . .                           2.7
    Denver, Colo. . .                           2.7
    etc.                                        etc.

¹ Oakland Survey, p. 17

8. Avoidance of the Alternate Column Scheme Unless the Types and Arrangement in the Columns Stand Out Sharply

For example, it is usually unwise to use alternate lines for statistics on boys and girls, for public consumption, unless one set is in red type, black-face type, or italics. Italics as used by Ayres in his "Laggards in Our Schools" are generally less satisfactory than black-face type. As red, for practical purposes, can be used only on individual copies and not in printed reports, it is often better to make a separate table for boys and one for girls if they are to be printed. This simplifies the matter. Both tables may be later summarized into one, as on page 217, for purposes of comparison which will be clear to any one who has looked at the previous tables.

In the Des Moines report for 1914-15, a table of this kind appears, opposite page 98. It is an age-progress table. Children who have always been in school in Des Moines are shown in the upper left-hand column of each rectangle; those coming in, in italics in the upper right-hand column; and the total below in heavy type:

    20    34
       54

The differentiation in lines within divisions of a table must also be carefully attended to.
Note in Tables 26 and 27, page 217, how lighter lines are used to subdivide the larger divisions. Dotted lines would have done as well for the interior lines, but the use of colons for this purpose is not advisable. The Salt Lake City Survey has a table¹ in which colons have been used to indicate interior lines. But as set up, they appear at first to indicate ratios between boys per seat and the percentage of sufficiency, and so on. It takes the average reader some time to realize that these colons are really intended to form dotted dividing lines.

9. Clear Headings and Subheadings

The subheadings must be clear without much explanation. The use of only key numbers or letters to head columns is bad enough for the person doing the work.² Every effort should be made to avoid it in a tabulation intended for the average man. On page 50 of the Hammond Survey appears a table employing key-letters which actually occupy almost as much space for explanation as the tabulation itself.

¹ Page 249
² See page 74

10. Neatness and Artistic Features

In addition to the foregoing, it is highly desirable to have all printed tables as neat and artistic as possible. Tables that are pleasing to the eye will by their very form and convenience attract unconscious attention, which may then easily be diverted to a consideration of the ideas or conclusions embodied in the tabulated material. Or they will cause the reader so little strain that whatever attention he has to give them will be concentrated on the thought alone. A competent printer will be able to secure such results by himself but may need to be held down to setting up all type so that it can be read from one position.
Where the superintendent has to give complete directions for printing, he should be especially careful about uniform headings, uniform spacing, large print, inclosing each table in a border, or at least using distinctive ruling or open spaces above and below each table to set it off properly. The tables in this book have been planned to serve as models on these matters.

III. HOW TO MAKE UP A SERIES OF TABLES OF THE SAME GENERAL NATURE

Often it is desired to have a series of tables of the same general nature or at least closely related. This necessitates:

1. A Summary Table at the Start and Minor Tables in the Same Sequence as Items in the Summary Table

For example, if it is desired to give a series of tables on various school costs per pupil, the procedure might be thus: First, have a summary table with various items, as cost of superintendent's office, cost of instruction, cost of supervision, etc., all appearing in the same table for the city as a whole. This could be followed by a series of tables, each covering all the ward schools on one item. These tables would follow the sequence of the items in the summary table.

2. The Same Sequence of Items within Similar Tables

If, in these various tables, the lump sum, the average number of pupils belonging, and the cost per pupil are given in adjacent columns from left to right, it is advisable to follow the same order of presentation and the same form of table all the way through. In this way the "mind set" of the reader may be utilized.

3. Keeping in Mind That the Main Purpose of Any Tabulation Is the Showing of Relationships

(a) The purely alphabetical order of items in many tables destroys or greatly handicaps the showing of relationships. Thus in the report of the city superintendent of South Bend for 1913-14, all the tables involving the different wards or schools are practically worthless for influencing the public because of this alphabetical arrangement.
For example, take the table on comparative costs of instruction and supervision by buildings for 1913-14, page 30. If the last column of this table had been arranged in order of magnitude from high to low, the comparison would stand out. See Tables 29 and 30 for this. The original table might do if the central tendency and quartiles for the city, or some other standards, were printed in bold-faced type at the top of the table, so that comparison might be made with them for each ward school. Or it would not be so unsatisfactory if another column had been added at the right giving the rank of each school on cost per pupil, from highest to lowest. Grammar would be rank 2; Colfax, rank 9; and so on to Warren, which would be rank 3. But the ranks would not show up so well as in the arrangement on page 223.

In any event, it is only just to note that the alphabetical arrangement is of service in making easy the work of checking the names of the schools so that no one will be omitted. This has its value during the period of working up data inside the school system. But this value disappears as soon as the table reaches the citizen. He assumes that the work is correct and desires only to get at the meaning of the whole and its parts as quickly as possible.

ORIGINAL FORM

Table 29. Example of Alphabetical Arrangement of Items Showing Comparative Cost of Instruction and Supervision by Buildings for 1912-13 (South Bend)

    Schools          Total cost of     Average daily     Cost per
                     instruction       attendance        pupil
    Grammar          $10,054.05        332               $30.28
    Colfax           etc.              etc.               24.53
    Coquillard                                            25.36
    Elder                                                 22.19
    Franklin                                              21.04
    Jefferson                                             32.35
    Kaley                                                 22.26
    Lafayette                                             27.97
    Laurel                                                20.14
    Lincoln                                               24.07
    Linden                                                25.10
    Madison                                               26.12
    Muessel                                               27.94
    Oliver                                                19.90
    Perley                                                23.61
    River Park                                            19.25
    Studebaker                                            23.89
    Warren                                                29.16

REVISED FORM

Table 30.
Bobbitt Table Arrangement of Data in Table 29

Schools         Cost per pupil
Jefferson       $32.35
Grammar          30.28
Warren           29.16
Lafayette        27.97
Muessel          27.94
Madison          26.12
Coquillard       25.36
Linden           25.10
Colfax           24.53
Lincoln          24.07
Studebaker       23.89
Perley           23.61
Kaley            22.26
Elder            22.19
Franklin         21.04
Laurel           20.14
Oliver           19.90
River Park       19.25

The full figures from which this table is derived are on file in the superintendent's office and may be inspected by any interested person.

(b) It is generally best to arrange the table in the form of a scale running from high to low.

There are some exceptions. Perhaps a better rule would be to arrange items in the table so that the city or school having the desired trait in the greatest abundance should be at the top. For example, a table showing the number of dollars behind each $1 spent for schools should be placed with the smallest number at the top. The smaller number is the more desirable, since a small number of dollars behind each dollar spent for schools indicates a high school tax and presumably good schools. Accordingly, Table 31 should have been reversed.

Table 31. Real Wealth Behind Each Dollar Spent for School Maintenance 1

 1. Atlanta, Ga.          $559.00
 2. Los Angeles, Cal.      538.00
 3. Richmond, Va.          536.00
 4. Birmingham, Ala.       479.00
 5. Portland, Ore.         456.00
 6. Memphis, Tenn.         449.00
 7. Indianapolis, Ind.     408.00
    and so on to
35. Toledo, Ohio           184.00
36. Worcester, Mass.       180.00
37. Newark, N. J.          165.00

Note that the ranks in this are really reversed. Atlanta in this showing is doing less for schools than any of the other cities and should have rank 37. Newark should have rank 1. Table 32, which gives the same idea in another way, is correct in putting the largest number at the top, because a high tax rate on real wealth for schools is desirable. It is given this way:

Table 32. Comparative Rates of Tax Required for School Maintenance (in Mills) 2 Based on Real Wealth of Cities 3

 1. Newark, N. J.          .00606
 2. Toledo, Ohio           .00543
 3. New Haven, Conn.       .00541
 4. Paterson, N. J.        .00541
 5. Lowell, Mass.          .00515
    and so on to
31. St. Paul, Minn.        .00244
32. Memphis, Tenn.         .00244
33. Portland, Ore.         .00219
34. Birmingham, Ala.       .00209
35. Richmond, Va.          .00186
36. Los Angeles, Cal.      .00184
37. Atlanta, Ga.           .00180

1 Portland Survey, p. 310
2 In some parts of the country, this would be more easily understood if given as cents on the one hundred dollars; as: 1. Newark, N. J. $.606
3 Portland Survey, p. 311

In printing Bobbitt tables, the most effective results on the casual reader will undoubtedly be obtained by giving each item a separate line.

Thus in Tables 33-35 from pages 62 and 63 of the San Antonio Survey, Form 1 is best, Form 2, the next best. Form 3 should not be used. A trained reader can understand one form as easily as the other. But the average man will understand the first form much more quickly than he will the others.

Table 33. Bobbitt Table, Form 1. Annual Per Capita Expenditures for Street Maintenance, 1912

Nashville        $2.79
Augusta           2.76
Tampa             2.10
Memphis           2.04
Houston           2.04
Savannah          1.71
Atlanta           1.63
Dallas            1.55
Galveston         1.54
Jacksonville      1.53
Austin            1.51
New Orleans       1.50
Macon             1.43
Shreveport        1.36
Montgomery        1.36
Mobile            1.33
Fort Worth        1.17
El Paso           1.10
Muskogee          1.07
Birmingham        1.02
San Antonio        .99
Charleston         .85
Little Rock        .63
Oklahoma City      .63

Table 34. Bobbitt Table, Form 2. Annual Per Capita Expenditures for Street Maintenance, 1912

Nashville        $2.79        Macon            $1.43
Augusta           2.76        Shreveport        1.36
Tampa             2.10        Montgomery        1.36
Memphis           2.04        Mobile            1.33
Houston           2.04        Fort Worth        1.17
Savannah          1.71        El Paso           1.10
Atlanta           1.63        Muskogee          1.07
Dallas            1.55        Birmingham        1.02
Galveston         1.54        San Antonio        .99
Jacksonville      1.53        Charleston         .85
Austin            1.51        Little Rock        .63
New Orleans       1.50        Oklahoma City      .63

Table 35.
Bobbitt Table, Form 3 (Practically Never Desirable). Annual Per Capita Expenditures for Street Maintenance, 1912

Nashville        $2.79        Augusta          $2.76
Tampa             2.10        Memphis           2.04
Houston           2.04        Savannah          1.71
Atlanta           1.63        Dallas            1.55
Galveston         1.54        Jacksonville      1.53
Austin            1.51        New Orleans       1.50
Macon             1.43        Shreveport        1.36
Montgomery        1.36        Mobile            1.33
Fort Worth        1.17        El Paso           1.10
Muskogee          1.07        Birmingham        1.02
San Antonio        .99        Charleston         .85
Little Rock        .63        Oklahoma City      .63

The use of Forms 2 and 3 probably grows out of a desire to utilize part of a page for the table, or through a mistaken idea that it economizes space. But in most places where it is desired to use such a table, there would be many such tables to present. It is readily apparent that two tables of Form 1 placed side by side on a page would take up practically no more space than if they were printed either as Form 2 or Form 3.

(c) If many horizontal lines appear in the table, reading them is made easier by running a heavy horizontal line or leaving a space every five lines or less.

Examples of this are familiar to most readers. The lines drawn across, or the gaps left in the Bobbitt tables to show the medians and quartiles serve the same purpose, if there are not too many cases in each section of the distribution. A third device for this purpose is the numbering of the items from top to bottom on both margins. Thus the data for item 3 on the left may be traced across to the data on the line numbered 3 on the right. This is the device used by the United States Bureau of Education on many of the tables that cover two pages, with continuous horizontal lines.

(d) In most tables where the cases are kept separate, it is advisable to make the name of the city, school, etc., to be compared with the others, stand out prominently.

It will be recalled that this was done in the tables from the Portland Survey.
The device is the familiar one used by newspapers to call attention to the home city in a table giving the standing of the baseball clubs of a league. Sometimes the emphasis is given with capital letters as in Portland, or with black-faced type as in Salt Lake City. On paper charts not to be printed, the particular case can be marked in red or some bright color. It is hardly possible to make the one case and the data with it stand out too prominently.

In the age-grade tables, it is customary to make the children of normal age stand out prominently by marking off these numbers. Heavy stairstep lines may be drawn to inclose the numbers for normal children; or heavy lines may be placed above and below these numbers; or they may be boxed in; or they may be printed in bold-faced type. Samples are given in Figure 29. Where the printer can handle it, the boxed-in form is preferable, because it enables one to separate the normal from both the retarded and the accelerated children very easily.

Fig. 29. Devices for Making the Number of Normal Children Stand Out in Age-Grade Tables.

In another form of age-grade table, the facts for one grade at a time are presented. This is usually done by running the ages along the top and the number of years the child has been in school down the table. Then two heavy lines are drawn vertically through the table to inclose the children of normal age in this grade. Two other heavy lines are drawn horizontally through to inclose the children of normal progress irrespective of how old they were at entrance. The area in the center, where the two sets of lines cross, incloses children who are both normal for age and progress. Various other combinations for the other parts of the table are easily worked out. The following table form is similar to one used in the Bridgeport, Connecticut, Survey, page 35.

Table 36. Age-Progress Table. 5B Grade

(A blank form. Ages by half years run across the top, ending with 13 and a total column. Years in school by half years, from 2 to 7, run down the side, ending with a total row.)

(e) It is often advisable to present only percentage derivations instead of the original figures.

The reason is that the original figures may be large and not easily comparable, whereas the percentage equivalents are small and easily compared. Distribution tables of the results on standard tests, however valuable for educational investigations, are not at all suited for the general public.

Round numbers or the nearest whole units are often better than exact figures, as they can be held in mind more easily. If all the parts of 100 per cent are shown, care should be taken to assign values to the parts that will total exactly 100 per cent. Approximations are satisfactory for the superintendent, but a discrepancy between the sum of the parts and 100 per cent would afford some readers an excuse for attacking the accuracy of the report.

These suggestions must, however, be used with caution. For in some instances the use of approximate figures may lead readers to suspect that the figures have been "doctored." Especially is this true as regards statistical presentations designed to increase the school tax.

(f) Special devices in tabulating are sometimes of great value.

For example, a tabulation showing the part of each dollar of school money that goes to various school expenses is effective with the average man.
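Where such a dollar tabulation is worked up from the original amounts, the caution above about parts that must total exactly 100 applies with full force. The following is only a minimal sketch of one way to do it, by the so-called largest remainder method, which is a modern device and not one prescribed in this book. The function name and the expense figures are hypothetical and are given purely for illustration.

```python
from math import floor


def cents_per_dollar(amounts, places=1):
    """Split one dollar (100 cents) among items in proportion to `amounts`.

    Each part is rounded to `places` decimals of a cent, and the rounded
    parts are adjusted so that they total exactly 100 cents.
    """
    total = sum(amounts.values())
    scale = 10 ** places
    units = 100 * scale  # the dollar expressed in the smallest unit shown

    # Exact share of each item, in smallest units, before any rounding.
    exact = {name: value / total * units for name, value in amounts.items()}
    # Round every share down first.
    whole = {name: floor(share) for name, share in exact.items()}

    # Units still unassigned after rounding down.
    leftover = units - sum(whole.values())
    # Give them, one apiece, to the items that lost most by rounding down.
    by_loss = sorted(exact, key=lambda name: exact[name] - whole[name], reverse=True)
    for name in by_loss[:leftover]:
        whole[name] += 1

    return {name: whole[name] / scale for name in amounts}


if __name__ == "__main__":
    # Hypothetical yearly expenses of a small school system, in dollars.
    expenses = {
        "Teachers' salaries": 61250,
        "Building upkeep": 18400,
        "Supplies": 7350,
        "Administration": 5000,
    }
    for item, cents in cents_per_dollar(expenses).items():
        print(f"{item:<20} {cents:5.1f} cents")
```

The adjustment changes no item by more than one unit in the last place shown, so the figures remain honest approximations while the total comes out exact.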
Insurance companies take great pains to show where every part of the dollar from premiums goes. Tables 37-39 show some of the possibilities. Similar devices could show the part of a year or years spent on each subject.

Table 37. How Portland Spends Its Dollar 1

Interest                                    20.7 cents
General expenses of city government          6.0 cents
Police department                            9.0 cents
Fire department                             12.0 cents
Inspection                                   0.9 cents
Health                                       0.7 cents
Street cleaning and sanitation               6.2 cents
Care of streets and bridges                  9.0 cents
Education                                   30.8 cents
Libraries                                    1.0 cents
Parks and playgrounds                        2.6 cents
Damages                                      1.1 cents
Total                                      100.0 cents

1 Adapted from the Portland Survey Graph, p. 84

Table 38. How the Trolley Nickel is Divided 1

2.02 cents for wages
 .78 cents for expenses
 .55 cents for taxes
1.06 cents for interest
 .59 cents for dividend
5.00 cents, your nickel

Table 39. How Rockford Spent Its School Dollar, 1915 2

Elementary Schools
  Teachers' salaries                        41.67 cents
  Building maintenance and upkeep           12.3
  New buildings                              9.9
  Educational supplies                       3.9
  Department of hygiene                       .97
                                                       68.74 cents
General
  Interest on school fund                    2.81
  Executive employees (Educational)          1.39
  Executive employees (Board)                 .86
  Evening schools and gymnasium               .4
                                                        5.46
High School
  Teachers' salaries                        15.81
  Upkeep of building                         6.7
  Educational supplies                       1.43
  New building                               1.41
  Educational employees                       .45
                                                       25.80
                                                      100.00 cents

1 Poster used by Louisville, Ky., Railway Company
2 Adapted from the chart on p. 120 of the Review of Rockford Public Schools, 1915-16. Some of the figures were changed slightly so as to make the total exactly 100 cents.

Sometimes the variations in sizes of different items may be indicated by difference in sizes of the type used in printing. Table 40 is a good example of this, but the variations in size are only very rough approximations.

Table 40. Variations in Type to Indicate Relative Size

"Watch the Central Association Grow!"

Year    New members added    Total paid up    Net increase
        during year          membership
1908    102                  324
1909    161                  445              121
1910    132                  484               39
1911    117                  486                2
1912    115                  559               73
1913    146                  619               60
1914    177                  680               61
1915    223                  768               88
1916    320                  973              205
1917    "Every member get one!"

Are you one of these? "There is no going back." Do you need more convincing proof of the worth of membership in The Central Association of SCIENCE and MATHEMATICS TEACHERS?

Table 41 uses very little space, for data classified in several ways.

Table 41. Example of Presenting in a Small Space Data Classified in Several Ways. The Standing of Salt Lake City in the Fundamentals of Arithmetic as Compared with Other Cities, Judged by the Median Score Attained by Each Grade 1

        Addition                                   Multiplication
V     VI    VII   VIII                        V     VI    VII   VIII
3.9   4.6   5.4   6.7    Detroit              3.8   4.8   6.0   7.5
3.7   4.9   5.6   7.8    Boston               3.3   4.8   5.1   6.5
3.9   4.4   4.7   5.6    Other Cities         2.6   4.5   5.2   6.4
2.9   3.4   3.8   5.3    Butte                4.1   5.0   6.5   8.1
4.1   6.4   6.9   8.6    Salt Lake City       4.3   6.3   7.1   8.3

        Subtraction                                Division
5.5   6.2   7.3   9.5    Detroit              2.7   4.4   7.1   8.8
4.9   6.3   6.9   8.6    Boston               2.0   3.3   5.1   6.9
4.5   6.1   7.8   8.4    Other Cities         2.3   4.3   5.8   6.3
2.9   3.4   3.8   5.3    Butte                3.6   4.3   7.2  10.2
6.2   7.8   8.8   9.8    Salt Lake City       3.0   6.6   7.7   9.6

1 Salt Lake City Survey, p. 174

(g) There are times when it is necessary to use conventional tables. In so doing, the following points may be of use:

1. Use double distribution tables and forms recommended by the National Education Association committee.1

2. Place dates old to new down the page or left to right.

3. Place magnitude so that the most desirable showing is at the top or to the left. (This sometimes reverses the order of dates advocated in 2.)

4. Use Roman numerals for one classification, e.g., the numbers for the school grades, and Arabic numerals for the data on pupils within the grades.

5. It is best where the sense is not destroyed and the effect desired will not be lost, to get up tables in the conventional forms. Bizarre effects in tabulation are no more to be desired than is writing from right to left or up the page.

1 Report of Committee on Uniform Records and Reports, Bureau of Education Bulletin, No. 2, 1912, p. 20

EXERCISE

Take some annual school report or school survey in which you are interested. Write out a detailed criticism of the tables or lack of them in it, from the standpoint of their effectiveness with the public, showing just why they are good or liable to be unsuccessful. In the cases of the unsuccessful ones, or failure to employ tabulations where desirable, draw up forms that would present the same data properly.

REFERENCES FOR SUPPLEMENTARY READING

Report of the Committee on Uniform Records and Reports. U. S. Bureau of Education Bulletin, 1912, No. 3.
Rugg, H. O. Statistical Methods Applied to Education, Chapter X.
Snedden, David S., and Allen, William H. School Reports and School Efficiency.

CHAPTER XI

GRAPHIC PRESENTATIONS OF SCHOOL STATISTICS, ESPECIALLY FOR THE PUBLIC

I. OBJECT OF GRAPHIC PRESENTATIONS

The object in making a graphic presentation of statistical matter is to give as quickly as possible through the eye a faithful and forceful bird's-eye view of the mass of statistics, the significant parts, their relationships, etc.

Graphic presentation in statistics is simply a development of a growing general tendency to make desired impressions by pictures rather than by word descriptions. This tendency is shown most clearly in the pictorial supplements and cartoons, and in the constantly increasing proportion of illustrations in printed material appearing in all our leading newspapers and magazines at the present time.

The graph, however, is more closely akin to the line drawing than it is to the photograph. It is found to be of great advantage in textbooks for the following reasons:

1. It presents the significant points in a clear and unmistakable way.

These points are also presented apart from the great mass of subsidiary data on which they rest.

2. It makes the presentation concrete by appealing to the eye.

As many people are unable to understand things they cannot image, the graph will drive the significant points home to those who could not be reached otherwise.

3. It often economizes time and space, for it will take up less room than the description it displaces.

4. It gives to most persons a more accurate basis of comparison than they could get with the same effort from word descriptions or tabulations.

Textbooks must make things clear very quickly to beginners in the subject. School reports must present school facts very rapidly and clearly to citizens who know little of them. Consequently, the graph, if properly used, should be about as valuable for school reports as it has been for textbooks.

The graph, however, is open to dangers of misrepresentation and exaggeration. These things are just as harmful in school graphs as they are in demagogic politics, patent medicine advertisements, etc. The ideal graph would probably have the forcefulness of the most powerful motor car, department store, or patent medicine advertisements, with something like the truthfulness and accuracy on essential points of a first-rate scientist.

Another trouble is that a graph leaves with most persons only a general impression; it is very difficult for them to recall it accurately, much less to reproduce it from memory later for any one else.

II. HOW TO MAKE GRAPHS FOR THE PUBLIC FROM STATISTICAL DATA

1. Component Parts

Circle Graph. It is often desirable to show in a graph the relative size of the component parts of a statistical whole. A popular device for doing this is the circle, each sector by its area representing one proportional part of the whole.
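The laying out of such a circle is a matter of simple proportion. Since the area of a sector varies directly with the angle at the center, each part receives the same fraction of the 360 degrees that it forms of the whole. The following is a minimal sketch of the computation; the function name and the pupil counts are hypothetical and serve only as illustration.

```python
def sector_angles(parts):
    """Return the central angle, in degrees, for each component part.

    Each part receives the same fraction of 360 degrees that it
    forms of the whole, so the angles always total 360.
    """
    total = sum(parts.values())
    return {name: 360.0 * value / total for name, value in parts.items()}


if __name__ == "__main__":
    # Hypothetical counts of pupils, for illustration only.
    pupils = {"Normal": 450, "Retarded": 400, "Accelerated": 150}
    for name, angle in sector_angles(pupils).items():
        print(f"{name:<12} {angle:6.1f} degrees")
```

The angles so obtained are then measured off in order around the circle with a protractor, beginning from any convenient radius.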
Figure 30 is a graph of this nature showing the distribution of time through the eight grades for the various common school subjects. Figure 31 gives a similar graph with fewer parts.

Fig. 30. Component Part Circle Graph Showing Distribution of Time to Various Subjects Throughout the Eight Elementary Grades. (Adapted from the 1915-16 Review of the Rockford, Illinois, Schools, page 52.)

This kind of graph is familiar to the public and of course makes the comparisons through the sizes of the angles. But it has these disadvantages:

1. If more than a few parts are to be represented, there is trouble in reading the names.

When there are only three parts and they are all large, the names may all be printed horizontally as in Figure 31. But if there are many parts and some of them are small, this is impossible. Then the names must be printed as in Figure 30 or else at the side with dotted lines leading into the parts indicated. If the former device is used, the printing must be reversed as the eye proceeds around the circle, with consequent delay. Note in Figure 30 how one has to read up for "music" and down for "geography." If the latter device is used, much more time is consumed in associating the names with the parts concerned, than when the names are printed on the parts.

Fig. 31. Component Part Circle Graph Showing Relative Proportions of Normal, Retarded, and Accelerated Pupils. (From Salt Lake City Survey, page 190.)

2. The figures denoting the various parts cannot be placed in such positions that they can be easily compared or added. This is especially bad when the parts are given in percentages.

3. It is extremely difficult to compare the same factors in different wholes.
For example, suppose a circle similar to this one for Rockford had been drawn for Joliet; it would be very hard to compare the sector on "reading" in the Rockford circle with the corresponding sector in the Joliet circle.

Bar Graph. The data shown in this circle graph for Rockford may be presented in a bar graph which permits of placing the figures so they can be added. For this the parts within each of the three main divisions should be arranged from high to low. Figure 32 shows this arrangement.

Fig. 32. Bar Graph Showing by Component Parts with Subdivisions the Distribution of Time for Common School Subjects. (Adapted from Figure 30.)

Note that practically all the defects of the circle graph have been remedied here. Note also that the component parts in a bar graph are shown proportionately by their lengths only. Their widths have nothing to do with it and their areas are in exactly the same proportion as their lengths. Note, too, that the figures are given for any reader who cares for them and in such shape that they may be added easily. Sometimes for economy in printing, the bar may be a horizontal one, in which case the lines of printing and figures may run vertically so that the reader will have to turn the page to read them.

An elaboration of the component bar graph is the form with two different but related scales, one on either side.

For example, the waste through repeaters in a school system might be shown as in Figure 33, using the average annual cost per elementary pupil, say $30.

Fig. 33. Bar Graph to Show Component Parts, with Two Different but Related Scales.

Grade     No. of repeaters    Cost of repeating work
8th        50                  $1,500
7th        25                     750
6th        50                   1,500
5th        75                   2,250
4th        50                   1,500
3rd        50                   1,500
2nd        50                   1,500
1st        75                   2,250
Total     425                 $12,750

The bar graph idea with component parts may be carried out for popular presentation with cartoon effects.

For example, the proportion of white and negro children out of each twelve may be shown as in Figure 34. The black children coming in from the right on the row serve the same purpose as putting the right end of a bar in black. But the idea of the children's figures here gives an interesting touch in addition. While this cartoon graph is not accurate, strictly speaking, a little care will make it accurate enough for its purpose. For example, by choosing twelve rather than ten figures for the whole line, fractions of two-thirds and one-third have been shown with whole figures. This device is extremely effective in a popular presentation.

Fig. 34. Bar Graph with Cartoon Effect Showing Proportion of White and Colored Children out of Every Twelve in Alabama. (From An Educational Survey of Three Counties in Alabama, page IS.)

2. Simple Comparisons

The simplest graphic comparison is usually made through some form of the bar graph. The latter is used a great deal for comparisons in presentations intended for the public, but its possibilities are not generally recognized. Many graphic comparisons for public consumption so far have involved one or both of two errors:

(a) The comparison is made by using similar areas.

(b) The comparison is made through a cartoon effect, when the latter is needed only to attract attention to the graph, in which bars would show the relationship much better.

We shall now examine the possibilities of the bar graph for comparisons, and after that take up typical graphs of other kinds for this purpose.

Comparisons with Bar Graphs.
Some of the possibilities of the bar graph in making comparisons have been indicated heretofore, particularly in connection with a distribution arranged as a Bobbitt table.1 If the component parts of any bar graph on pages 238 and 239 were taken out, arranged in order of size and lined up at the left, they would give a bar graph effect running from high to low.

1 See p. 103

In making comparisons with bars, it is generally advisable to follow this order, with the bars lined up at the left ends, as in Figure 35.

Fig. 35. Illustration of Correct Order of Items in a Bar Graph.

By doing this the magnitudes will come in a column where they can be seen by those who like to have the figures, and these can easily be added, with the sum at the bottom of the middle column. Very seldom should the numbers be placed at the immediate right-hand ends of the bars. They tend to make the bars seem disproportionately larger and they cannot be easily compared or added.

Fig. 36. Device Used by Dr. L. P. Ayres to Show by States Percentages of the School Population Enrolled in Public Schools, in Private Schools, and Not in Any School, in 1910. White portion indicates children in public schools; shaded, those in private schools; and black, those not in any school. (Adapted from A Comparative Study of the Public School Systems in the Forty-eight States and reproduced by permission of the Russell Sage Foundation.)

To make the bars stand out clearly, the distance between them must be markedly different from the width of a bar. Otherwise the lines will tend to run together or sink into the background.

Sometimes, to economize space, some of the largest items may be shown with double bars. A familiar example is the graph used by the American Book Company on its calendar.
On this a long double bar represents the expenditures for liquor and a short single bar, the expenditures for textbooks. This is very doubtful practice, as it greatly lessens the difference between the magnitudes for the untrained reader.

The horizontal bar graph is also very valuable where it is desirable to make comparisons between component parts in similar wholes. For this purpose, the wholes are represented by bars of the same width and length. Then the component parts for the same item are lined up on the left margin, and those for another item will appear lined up on the right margin. A fine example is shown in Figure 36.

It is to be noted that this will not work well for more than two component parts. Observe how difficult it is to get an idea of the size of the middle item representing the number of children enrolled in private schools when this item for different states is compared.

If, for any reason, it is desirable to make comparisons between the relative sizes of the wholes, the widths of the bar may be adjusted accordingly. Mr. W. C. Brinton borrows from Professor Irving Fisher such a device for giving an analysis of the total vote for president in 1912 in the forty-eight states.1 Each state is represented by a bar of the same length, divided according to the percentage of the vote given to each party, while the width of the bar varies with the number of voters in the state. (See Figure 37.)

1 Brinton, W. C.: Graphic Methods for Presenting Facts, p. 10

Fig. 37. Device to Show Parts of a Total and Also to Indicate Relative Sizes of the Totals. Adapted from Professor Irving Fisher's chart showing parts of the total vote for president in 1912, at the same time indicating the relative voting strength of each state. (Separate shadings mark the Republican, Progressive, and Democratic parts of each bar.)

The following are some of the places in school work where a graph of this kind would be helpful:

1. It could be used in the Ayres graph given before for comparing the total number of school children in the states.

2. In a chart comparing the number of retarded, normal, and accelerated children in a number of cities, the relative totals of children enrolled could be shown by the width of the bars.

3. If the distribution of the cost of educating one child were being shown for various items such as instruction, supplies, etc., for several cities, the width of the bar might represent the total amount spent by each city to educate one of its children, etc.

The bar graph may be used to compare two or more things whose proportions are constantly varying, by using a different shading for each separate kind of item.

Thus Dr. Ayres uses Figure 38 to show the number of beginners that remain in school at various ages at Springfield, Illinois, using shading for boys and white for girls. The reader not only gets a comparison between the number of boys and girls remaining in school at any given age, but also a comparison between the different age groups.

Fig. 38. Bar Graph for Comparing Two Things Whose Proportions Are Constantly Varying. Columns represent number of boys and girls among each hundred beginners who remain in school at each age from 13 to 19. Shaded columns represent boys and white columns girls. (From Springfield Survey, page 52, by permission of the Russell Sage Foundation.)

Other examples are:

1. Enrollment of pupils in different grades, white columns for boys and black columns for girls (Cleveland Summary Volume, page 85).

2. Contrast in percentage of retardation of white and colored pupils by grades, shaded bars for white, black bars for colored pupils, all on horizontal basis (Louisville Report for 1914-15, page 27).

An elaborate yet easily understood chart, making comparisons for the forty-eight states on ten different items, is found in Ayres's bulletin on the state school systems.1

1 Page 32

Fig. 39. Device for Showing Relative Standing of Several Cases on Each of Several Items. Adapted from Ayres's graph showing the standing of the forty-eight states, by permission of the Russell Sage Foundation. The highest quarter is represented by white, the second quarter by light shading, the third quarter by dark shading, and the lowest quarter by black.

This is made up by starting at the top with the best state, Washington, and going down to the lowest, Alabama. Each state is represented by a separate bar, and each bar is divided into ten parts. Each of these ten parts represents an item on which the state is graded. For each item, the states in the lowest quarter have that part black; the states in the next quarter have dark shading; the states in the next to the highest quarter have light shading; and the states in the highest quarter have white. (See Figure 39.)

This form of chart is excellent for graphing the numerous items of a summary table. The use of the different forms of shading uniformly through the table enables the reader quickly to locate any state's rank on any item.
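The rule of shading just described, white for the highest quarter on an item down to black for the lowest, may be applied mechanically once the cases have been ranked. The following is a minimal sketch for a single item. The function name and the scores are hypothetical, higher scores are assumed to be the more desirable, and cases with equal scores are simply left in the order given.

```python
# Shades from the highest quarter to the lowest, as in the chart described.
SHADES = ["white", "light shading", "dark shading", "black"]


def quarter_shades(scores):
    """Assign each case the shading of its quarter on one item.

    Cases are ranked from the highest score to the lowest; the first
    quarter of the ranking is white and the last quarter black.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    count = len(ranked)
    return {
        name: SHADES[position * 4 // count]
        for position, name in enumerate(ranked)
    }


if __name__ == "__main__":
    # Hypothetical median scores of eight schools on one standard test.
    medians = {
        "School A": 6.2, "School B": 7.8, "School C": 5.1, "School D": 8.4,
        "School E": 4.9, "School F": 6.9, "School G": 7.1, "School H": 5.6,
    }
    for school, shade in quarter_shades(medians).items():
        print(f"{school}: {shade}")
```

Repeating the computation for each item in turn gives the shading for every division of every bar. Where a small figure is the more desirable showing, the ranking would of course be reversed.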
The use of the darker shades for the worst ranks on any item is also a good device, for it would have a tendency to sting the pride of the average citizen. Note also that an idea of the general standing of any state on the ten items can be gotten quickly. For example, Washington appears with practically white spaces and so is manifestly very high on the whole. Alabama has a black space on every item and so is very low. These things can be seen at a glance.
Such a graph would be of use in comparing the performances of several pupils, of several teachers, of several classes, of several rooms, or of several schools where each individual case had been ranked on a number of qualities or achievements. For example, the showing on standard tests for several eighth grades might be made thus, using a horizontal bar for each grade, and a vertical column for each test used.
The Boston Survey has a graph to show how Boston's expenditures compare with those of the average city, using component parts of a bar graph that is a square. It is reproduced here as Figure 40, but experiments show that while it is not readily understood at first by the average school man, it is exceedingly effective once it is grasped.
Sometimes it is desirable to compare two distributions that have been grouped by similar steps, for the typical amount of the same quality in each step. In this case, a set of bar graphs running to the right may be used for one distribution, abutting on a similar set of bars running to the left for the other distribution. It is, however, difficult to compare magnitudes on the right with those on the left. An example is the graph used by the Bureau of Education, shown in Figure 41.
Fig. 41. Example of Right and Left Device for Comparing Distributions with Bar Graphs. It shows average yearly income of high school graduates as compared with that of persons not having high school training. (From one of the 1917 folders of the Agricultural Extension Division of the International Harvester Company.) [Legible text in the figure: "High School Education Pays"; yearly income with high school training and with no high school training.]
From this, it is only a short step to the graph which exhibits the facts about several qualities in a city or school, by showing increases on bars extending to the right of a vertical line, and decreases by bars extending to the left of it. A good example is Figure 42 from the Baltimore Survey, showing the change in enrollments in the various grades by percentages. The decrease in the first grade was due to special efforts to move up retarded children. This form is preferable to that in Figure 41 because the bars to be compared are closer to each other.
Fig. 42. Example of Right and Left Device for Comparing Distributions with Bar Graphs. It shows changes in distribution of enrollments by grades in Baltimore between 1899 and 1909. (From Baltimore Survey, page 98.) [Labels legible in the figure: Decreases, Increases, and one bar for each grade.]
The "Monument" Graph. Areas are not at all easy to compare. Such comparisons would be far more effective if made by bar graphs lined up at one end so that the comparison would merely be a matter of comparing the lengths of the various bars. For example, take the "monument" graph which is frequently found in school reports. A good illustration is found in one representing the number of pupils enrolled in each grade in the Alabama Survey of Three Counties, shown in Figure 43. Here each stone represents the enrollment for one grade, beginning with the first grade for 5423 pupils and topping off with a little stone for the 60 pupils in the last year of the high school.
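The two arrangements can be imitated in plain text. The sketch below is only an illustration in Python, using the three enrollments that are legible in Figures 43 and 44; the function names, the bar width of fifty characters, and the "#" symbol are assumptions made for the example.

```python
# Enrollments legible in Figures 43 and 44 of the source.
enrollments = [("IV H.S.", 60), ("4th grade", 2080), ("1st grade", 5423)]


def bar_lengths(data, width=50):
    """Scale each count to a bar length in characters; the largest fills `width`."""
    largest = max(count for _, count in data)
    # Every count gets at least one character so that no bar vanishes.
    return [max(1, round(count * width / largest)) for _, count in data]


def render(data, width=50, centered=False):
    """Return text rows; centered=True imitates the monument arrangement."""
    rows = []
    for (label, count), length in zip(data, bar_lengths(data, width)):
        bar = "#" * length
        if centered:
            bar = bar.center(width)  # pad on both sides, as the stones are stacked
        rows.append(f"{label:<10}{bar} {count}")
    return rows


print("\n".join(render(enrollments, centered=True)))  # monument arrangement
print()
print("\n".join(render(enrollments)))  # bars lined up at the left
```

The centered rows imitate the monument graph; the second set of rows gives every bar a common starting point.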
The errors in using the monument graph are as follows: The areas of the "stones" may be taken into account for the relationship by the reader, when it really is shown by their lengths only. As there is no common point from which to measure either the length or area of the stones, no adequate idea of relative sizes can be obtained from such a graph. The effect of this form of comparison is to make the difference between the larger and the smaller numbers seem less than it is. This may be shown by rearranging the data in the graph in a regular bar graph lined up at the left. (See Figure 44.) Notice how much accentuated the differences appear.
It may be contended that the monument graph is only one form of the graph given in Figures 41 and 42. That is, it really has two halves formed by an imaginary line down the center, either half of which gives a simple bar graph effect. But if that is the case, why not use the simple bar graph alone in the first place? When children are lined up in "stair steps" to get their varying heights, they are all placed on the floor. No one would ever think of putting their waist lines on the same level and then of taking account only of variations above the waist.
Fig. 43. Monument Graph Showing Number of Pupils Enrolled in Each Grade. (From An Educational Survey of Three Counties in Alabama, page 63.) [Entries legible in the figure: IV H.S., 60; II H.S., 198; I H.S., 455; 5th grade, 1800; 4th grade, 2080; 2nd grade, 2687; 1st grade, 5423.]
Fig. 44. Graph Showing Apparent Size of Certain "Stones" from the Monument Graph Lined up at One Side. [Bars shown: IV H.S., 60; 4th grade, 2080; 1st grade, 5423.]
Comparisons with Circle Graphs. Errors frequently arise in making comparisons with circles. It makes all the difference whether the comparison is made through the diameters or through their areas. The ordinary reader tends to make the comparison on the diameter basis.
If the circles are drawn on the area basis, however, it is apparent that the comparison will not be so striking as the maker of the graph intended. If, on the other hand, circles are drawn on the diameter basis, some readers will tend to overestimate the facts.
The graph used on page 91 of the Cleveland Survey, Summary Volume, is a good example of the difficulty in using circles on the area basis. It is reproduced in Figure 45. Here nine circles are employed to show the percentages of children in the under-age, normal-age, and over-age subdivisions of the rapid, normal, and slow groups. This requires nine circles and it is almost impossible to compare them accurately. One circle is marked 6 and another 30, but the former appears to be about one-fourth of the latter, or larger than in reality.
Fig. 45. Comparison of Circles by Areas Using the Percentage of Children in Each Age and Progress Group in Elementary Schools of Cleveland at Close of Year 1914-15. (From Cleveland Survey, Summary Volume, page 91. By permission of the Survey Committee of the Cleveland Foundation.) [Circle labels in the figure: under age, normal age, and over age, each with rapid progress, normal progress, and slow progress.]
Fig. 46. Concentric Circle Graph to Show Relative Increase in High School Enrollment. (From the 1915-16 Review of the Rockford, Illinois, Schools, page 107.) [Title in the figure: "A Larger Proportion of Children Are Going to High School."]
The use of concentric circles as a means of comparison is even worse than that of the circles apart from each other.
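The difference between the two bases of drawing can be put in figures. The following Python sketch is only an illustration of the geometry, using the two values 6 and 30 from the Cleveland graph; the function names are hypothetical, and the sketch makes no claim about how a given reader actually judges a circle.

```python
import math


def diameter_ratio_on_area_basis(small, large):
    """Ratio of the diameters when the circle areas are proportional to the values."""
    return math.sqrt(small / large)


def area_ratio_on_diameter_basis(small, large):
    """Ratio of the areas when the circle diameters are proportional to the values."""
    return (small / large) ** 2


small, large = 6, 30
true_ratio = small / large
by_diameter = diameter_ratio_on_area_basis(small, large)
by_area = area_ratio_on_diameter_basis(small, large)

print(f"true ratio of the values:              {true_ratio:.3f}")   # 0.200
print(f"diameters, circles drawn by area:      {by_diameter:.3f}")  # 0.447
print(f"areas, circles drawn by diameter:      {by_area:.3f}")      # 0.040
```

A reader who judges by diameters sees the smaller circle as larger than one-fifth of the other when the circles are drawn on the area basis, and a reader who judges by areas sees it as far smaller than one-fifth when they are drawn on the diameter basis.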
Figure 46 is a concentric circle graph showing the enrollment of the Rockford schools for the years 1895, 1900, 1905, 1910, and 1915, with a comparison of the high school enrollment with the total enrollment. This graph is worth little for the public, because it is hard to understand, and because it is almost impossible to get a correct notion of the areas of the different circles. One tends to look only at the rings and not at the circles, and the whole effect is something like the advertisements showing the various layers in an automobile tire. The bar graph would be very much better for this comparison, as in Figure 47.
Fig. 47. Component Bar Graph Comparison of Data Shown in Figure 46. [Bars in the figure are divided into high school and grades.]
Comparisons made by using the areas of segments of the same circle are questionable, because people have not been trained to estimate the areas of segments. Even as simple a graph as "How Portland Spends Its Dollar" (see Figure 48) is hard to interpret correctly. But when the graph becomes as complicated as the one in Figure 49, it is probably useless for the average reader.
It is even more difficult to compare sectors in different circles than in the same circle. For example, take the graph on the percentage of home-trained and non-trained teachers and principals in the Cleveland Survey, shown in Figure 50. The sectors here serve but little to emphasize the different percentages given. In addition, the white labels on the black parts probably cut down the apparent size of the black parts very materially.
Triangle Graphs. A few surveys make comparison by the heights or areas of overlapping isosceles triangles having equal bases, as in Figures 51 and 52.
Fig. 48. Example of Difficulty of Comparing Component Parts in a Circle Graph When the Angles Are Not Clearly Shown.
This figure gives an itemized statement of "How Portland Spends Its Dollar." (From Portland Survey, page 84.)
Since two triangles with equal bases are to each other as their altitudes, the comparison is perfectly accurate from either a height or area standpoint. But as heights alone are really wanted for determining the areas, plain bars would show the relative lengths of the altitudes with much less work. Besides, since the average reader has had no experience in comparing heights or areas of triangles, about the only justification for them is the one matter of adding variety.
[Fig. 49. Circle graph; the legend printed with it reads:] "Diagram III. Surface of circle represents total per capita expenditure in the average city. Sectors are proportional to amount spent for each of the twelve main purposes for which funds are expended. Shaded portion represents expenditure in Bridgeport. Under each heading the first figure gives in dollars and cents the amount spent per child per year in the average city and the second figure the corresponding amount for Bridgeport."
Fig. 50. Graph Showing Difficulty in Comparing Sectors from Different Circles. (From Cleveland Survey, Summary Volume, page 107. By permission of the Survey Committee of the Cleveland Foundation.)
Comparisons with Cartoon Effects. Another bad use of comparison through areas is sometimes found in the employment of cartoons in which the data are represented by the areas of persons or objects. For example, take Figure 53. The trouble with such a chart is that the area grows much faster than the height, so that the expenditure for 1914-15, instead of appearing only about 35 per cent greater than for 1910-11 as it should, really appears to be several hundred per cent greater.
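How quickly the area outruns the height can be shown with a little arithmetic. The Python sketch below is a rough geometric illustration only: it assumes that the cartoon figure is enlarged uniformly in every dimension, it takes the height ratio of 1.35 from the 35 per cent stated above, and the function name is hypothetical. It does not attempt to measure how strongly a reader actually feels the difference.

```python
def apparent_ratios(height_ratio):
    """Return (area_ratio, volume_ratio) for a figure enlarged uniformly."""
    return height_ratio ** 2, height_ratio ** 3


# 1914-15 is about 35 per cent greater than 1910-11, as stated in the text.
height_ratio = 1.35
area_ratio, volume_ratio = apparent_ratios(height_ratio)

print(f"height: {100 * (height_ratio - 1):.0f} per cent greater")
print(f"area:   {100 * (area_ratio - 1):.0f} per cent greater")
print(f"volume: {100 * (volume_ratio - 1):.0f} per cent greater, if the figure is read as a solid")
```

A bar of constant width avoids the difficulty, since only its length changes. The same reasoning applies to the growing child of Figure 53.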
In this particular chart, the horizontal lines at the back help to reduce the exaggeration by emphasizing the height factor. Even when the figures are given with the chart, as in this instance, the visual inaccuracy is serious enough to cause a distrust of the whole thing. However, the cartoon effect can be secured in a graph that, by use of units or separate figures, allows no chance for error in making comparisons.
Fig. 51. Comparison by Triangles between the Salaries of Principals in Springfield and the Average for Ten Other Cities in 1911-12. Shaded triangle represents average annual per capita expense for principals' salaries for each child in average attendance in the day schools of Springfield, and triangle in outline represents corresponding expenditures for the average of ten other cities. (From Springfield Survey, page 98, by permission of the Russell Sage Foundation.) [Title in the figure: "Salaries of Principals."]
Fig. 52. Comparison by Triangles of the Cost of Janitor Service in the Average City and in Bridgeport. Triangle in outline represents portion of each thousand dollars spent for janitors' wages in the average city; shaded triangle represents the amount spent in Bridgeport. (From Bridgeport Survey, page 27.) [Title in the figure: "Wages of Janitors."]
A very effective chart of this kind is the one from the Des Moines Report for 1914-15.[1] It is too large to be shown effectively in this book, so will be described in words only. It aims to show in a cartoon the relative numbers of retarded children, normal children, and accelerated children, boys and girls separately. To the left, 27 boys and 20 girls are represented as climbing a hill, book in hand and reading. In the middle, 18 boys and 23 girls are walking along on a level with books tucked away under their arms. At the right, 5 boys and 7 girls are going down hill with no books at all.
This is similar to the Alabama illustration in that it gives relative proportions, but note how much better this relationship is shown by separate children than it would be by a few children varying in size. The graph would have been a little more effective if each group of children had been in one line so that the length of the line might also have entered into the comparison. This could have been easily managed by adding a few more lines to indicate a portion of a hill for each child not on the level. This would take no more room on the whole, and it would not seriously weaken the effect, since the hill conveys practically the same idea as the use of the books.
[1] Page 99.
Fig. 53. Example of the Difficulty in Making Comparisons with Cartoon Effects. This shows what is spent in Louisville on each child in the grades. The comparison is really made by heights only, but the reader tends to take it by areas. (From 1914-15 Louisville Report, page 35.) [Title in the figure: "See how the child in the grades is growing"; years shown: 1910-11, 1911-12, 1912-13, 1913-14, 1914-15.]
Another example of a cartoon applying a good idea but in a very unreal way, and also using circles for comparison, is given in Figure 54.
Fig. 54. Cartoon Graph Using a Sector of a Circle to Represent Part of a Dollar. (From Bridgeport Survey, page 22.) [Legible text in the figure: "For every dollar that the average city spends," "Bridgeport spends," "A board of education office," "22¢."]
But the cartoon effect could have been kept and a much more accurate comparison made by using cents in a bar effect, thus:
Average city    $1.00    . . . to 100
Bridgeport        .22    . . . to 22
Ayres uses such a device to show the cost of schooling per child per day in the various states, in his bulletin on comparing the school systems of forty-eight states.[1]
[1] Page 18.
When the cartoon effect is very necessary to show relative parts of a dollar, and accuracy is not essential, a cartoon similar to Figure 55 may be advisable.
Fig. 55. Cartoon Effect to Show Parts of a Dollar. (From Newburgh Survey, page 93, by permission of the Russell Sage Foundation.) [Text in the figure: "Adventures of Mr. Taxpayer. With a municipal budget and without. How one city now supposes the money is spent. How a city knows where the money goes."]
Figure 56 is a cartoon effect to show the lack of sufficient playground space. Figure 57 is a skillful use of a bar graph effect. Other good examples of cartoon effects may be found in:
a. The Ohio Survey, page 65. Here there is a line of twelve teachers for each class of school, beginning with one-room township schools and going on up to high school. Teachers without professional training are in black; those with one or more terms in summer schools are in gray; those with one or more years in a professional school in white. A high degree of accuracy is attained by using half a figure to represent one twenty-fourth. Thus seven twenty-fourths of the twelve teachers in one line are represented by three women clothed in black and another with a black skirt and gray waist.
Fig. 56. Cartoon Effect to Show Lack of Playground Space. (From Newburgh Survey, page 63, by permission of the Russell Sage Foundation.) [Text in the figure: "Lawn vs. Playground. William Street. How One Newburgh School Saves the Grass at the Expense of the Children."]
b. Ayres's A Comparative Study of the Public School Systems in the Forty-eight States, page 6.
The value of school property in different states is represented by individual dollar marks, a line for each state, giving a bar effect; thus:
[Illustration: a row of dollar marks for each of Florida, Kentucky, Arkansas, and Mississippi.]

Fig. 60. Example of Using Curves to Show Changes on Different Dates in Two Items for Two Cities. The two charts show the competition for city funds. (From Cubberley's Public School Administration, page 414, by permission of Houghton Mifflin Company.) [One chart in the figure is labeled San Francisco, Cal.]
of Memphis, Tennessee, public schools affords a fine illustration. (See Figure 22, page 166.)
This same curve effect can be obtained by a trained man from the component part bar graphs. Obviously, if only two component parts are shown, the curve is read one way for one set, and another for the other, that is, looked at from the two sides separately, as in Figure 61. If there are more than two component parts, the curves show only for the two end parts, as in Figure 62.
Fig. 61. Curve Effect on Bars with Two Component Parts.
Fig. 62. Curve Effect on Bars with Three Component Parts.
The curve effect is also noticeable in a graph where various items in one group are compared with similar items in other groups. A good example is the standing in the four fundamental operations in the Courtis arithmetic tests for different grades. Letting A stand for Addition, S for Subtraction, M for Multiplication, and D for Division, the results might be shown as in Figure 63. A line connecting the ends of all the bars of the same kind will give a curve. See also Figure 38.
Fig. 63. Bar Graph Device with Curve Effect for Comparing Several Groups on Several Items. Each bar represents the standing in the Courtis Tests. A means addition, S, subtraction, etc. [Vertical scale in the figure: number of problems; bars labeled A, S, M, D for each grade.]
Standards for Drawing Curves. If, after all the preceding, it is still felt desirable to present school statistics to the public with curves, the curves should be drawn properly. Accordingly, the suggestions of the Joint Committee on Standards for Graphic Presentation[1] should be followed. The words alone are given here, but the full report contains graphs which make the text much clearer:
1. The general arrangement of a diagram should proceed from left to right.
2. Where possible, represent quantities by linear magnitudes, as areas or volumes are more likely to be misinterpreted.
3. For a curve, the vertical scale, whenever practicable, should be so selected that the zero line will appear on the diagram.
4. If the zero line of the vertical scale will not normally appear on the curve diagram, the zero line should be shown by the use of a horizontal break in the diagram.
5. The zero lines of the scales for a curve should be sharply distinguished from the other coordinate lines.
6. For curves having a scale representing percentages, it is usually desirable to emphasize in some distinctive way the 100 per cent line or other line used as a basis of comparison.
7. When the scale of a diagram refers to dates, and the period represented is not a complete unit, it is better not to emphasize the first and last ordinates, since such a diagram does not represent the beginning or end of time.
8. When curves are drawn on logarithmic coordinates, the limiting lines of the diagram should each be at some power of ten on the logarithmic scales.
9. It is advisable not to show any more coordinate lines than necessary to guide the eye in reading the diagram.
10. The curve lines of a diagram should be sharply distinguished from the ruling.
11. In curves representing a series of observations, it is advisable, whenever possible, to indicate clearly on the diagram all the points representing the separate observations.
12. The horizontal scale for curves should usually read from left to right and the vertical scale from bottom to top.
13. Figures for the scales of a diagram should be placed at the left and at the bottom or along the respective axes.
14. It is often desirable to include in the diagram the numerical data or formulae represented.
15. If numerical data are not included in the diagram, it is desirable to give the data in tabular form accompanying the diagram.
16. All lettering and all figures on a diagram should be placed so as to be easily read from the base as the bottom, or from the right-hand edge of the diagram as the bottom.
17. The title of a diagram should be made as clear and complete as possible. Sub-titles or descriptions should be added if necessary to insure clearness.
[1] Copies may be obtained from the American Society of Mechanical Engineers, 29 West 39th Street, New York, price 10 cents, discount in quantities.
3. Special Summarizing Graphs
Sometimes it is desired to give a graphic summary of something that has been measured by relative position on a number of items. The way of showing this roughly by the bar graph was given on page 244, but this indicated only the quarter of the distribution in which the case fell on that quality. A refinement of this is found in Figure 64.
Fig. 64. Graphic Device for Summarizing the Relative Position of a Given Case in a Number of Different Distributions. This figure shows the rank of Boston in a group of twenty-one cities in expenditure for operation and maintenance of schools. (From Boston Report, 1916, page 158.) [Column headings in the figure: expenditure per $1,000 of taxable wealth; expenditure per child in average daily attendance; expenditure per inhabitant. The shaded rectangles represent Boston.]
Another example is that used by the School of Education at the University of Chicago to sum up the rating of a teacher on several different qualities. Part of this graph is shown in Figure 65.[1]
Fig. 65. Summarizing Graph to Show Efficiency Record of a Teacher, Used by School of Education, University of Chicago.
EFFICIENCY RECORD
Teacher (indicate sex) ...... City (or building) ...... Grade taught (or subject) ......
Experience ...... years. Salary ...... per month.
Highest academic training ......
Extent of professional training ......
Detailed Rating (rating columns legible on the blank: V.P., Poor, Medium, Good, Ex.)
General appearance
Health
Voice
Intellectual capacity
Initiative and self-reliance
Adaptability and resourcefulness
Accuracy
Industry
Enthusiasm and optimism
Integrity and sincerity
Self-control
Promptness
Tact
Sense of justice
Academic preparation
Professional preparation
Grasp of subject-matter
Understanding of children
Interest in the life of the school
Interest in the life of the community
Ability to meet and interest patrons
Interest in lives of pupils
Co-operation and loyalty
Professional interest and growth
Daily preparation
Use of English
Care of light, heat, and ventilation
Neatness of room
Care of routine
Discipline (governing skill)
Definiteness and clearness of aim
Skill in habit formation
Skill in stimulating thought
Skill in teaching how to study
Skill in questioning
Choice of subject-matter
Organization of subject-matter
Skill and care in assignment
Skill in motivating work
Attention to individual needs
Attention and response of the class
Growth of pupils in subject-matter
General development of pupils
Stimulation of community
Moral influence
General Rating
Recorded by ...... Position ...... Date ......
[1] The original blank has suitable main headings for each Roman numeral, printed horizontally, but it was not feasible to show them on a page of this size.
It should be noted that a general heading will loom up on this graph in proportion to the number of sub-heads it has. Consequently, the way to make any general heading have weight will be to add sub-heads, without caring particularly how important these sub-heads are. There appears to be no way of overcoming this defect except by varying the widths of the horizontal spaces, which would probably complicate the device beyond the point of practical value.
A third summarizing device, and one capable of wide adaptation, is shown in Figure 66. The same device may be used to show the standing of a school or city on a number of items, by placing the names of the standards at the top in place of the Roman numerals for the grades.
Fig. 66. Graphic Device for Summarizing the Achievements of One Pupil or School in Several Fields as Related to Standards in Those Fields. (From Educational Tests and Measurements of Monroe, De Voss and Kelly, by permission of the authors and Houghton Mifflin Company.) [Rows in the figure: arithmetic (addition, subtraction, multiplication, and division, each for speed and accuracy), silent reading, and handwriting (speed and quality); the columns are headed by Roman numerals for the grades.]
Figure 24 on page 168, showing the surfaces of frequency one above the other, is a fourth summarizing graph that is of value to the public.
4. Brinton's Rules for Graphic Presentation
Brinton, in his standard book, Graphic Methods of Presenting Facts, has a very convenient set of rules followed by a set of check items. These were printed before the suggestions of the Committee on Standards for Graphic Presentation (of which he is chairman) were published (see pages 267-268), and some of the suggestions appear in both sets. But Mr. Brinton's original lists are not confined to suggestions for curves, as is the report of the committee. The practical school man will prefer to have all the suggestions given in one place, so they are here appended.
Helps on Graphic Presentations
(Selected from Brinton: Graphic Methods of Presenting Facts, pages 360-362)
I. Rules for Graphic Presentation
1. Avoid using areas, or volumes, when representing quantities. Presentations read from only one dimension are the least likely to be misinterpreted.
2. The general arrangement of a chart should proceed from left to right.
3. Figures for the horizontal scale should always be placed at the bottom of a chart. If needed, a scale may be placed at the top also.
4. Figures for the vertical scale should always be placed at the left of a chart. If needed, a scale may be placed at the right also.
5. Whenever possible, include in the chart the numerical data from which the chart was made.
6. If numerical data cannot be included in the chart, it is well to show the numerical data in tabular form accompanying the chart.
7. All lettering and all figures on a chart should be placed so as to be read from the base or from the right-hand edge of the chart.
8. A column of figures relating to dates should be arranged with the earliest date at the top.
9. Separate columns of figures, with each column relating to a different date, should be arranged to show the column for the earliest date at the left.
10. When charts are colored, the color green should be used to indicate features which are desirable or which are commended, and red for features which are undesirable or criticized adversely.
11. For most charts and for all curves, the independent variable should be shown in the horizontal direction.
12. As a general rule, the horizontal scale for curves should read from left to right and the vertical scale from bottom to top. (See "Special.")
13. For curves drawn on arithmetically ruled paper, the vertical scale whenever possible should be so selected that the zero line will be shown on the chart.
14. The zero line of the vertical scale for a curve should be a much broader line than the average coordinate lines.
15. If the zero line of the vertical scale cannot be shown at the bottom of a curve chart, the bottom line should be a slightly wavy line indicating that the field has been broken off and does not reach to zero.
16. When the scale of a curve chart refers to percentages, the line at 100 per cent should be a broad line of the same width as a zero line.
18. If the horizontal scale for a curve begins at zero, the vertical line at zero (usually the left-hand edge of the field) should be a broad line.
19. When the horizontal scale expresses time, the lines at the left-hand and the right-hand edges of a curve chart should not be made heavy, since a chart cannot be made to include the beginning or the end of time.
20. When curves are to be printed, do not show any more coordinate lines than are necessary for the data and to guide the eye. Lines one-fourth inch apart are sufficient to guide the eye.
21. Make curves with much broader lines than the coordinate ruling, so that the curves may be clearly distinguished from the background.
22. Whenever possible, have a vertical line of the coordinate ruling for each point plotted on a curve, so that the vertical lines may show the frequency of the data observations.
23. If there are not too many curves drawn in one field, it is desirable to show at the top of the chart the figures representing the value of each point plotted in a curve.
24. When figures are given at the top of a chart for each point in a curve, have the figures added if possible to show yearly totals or other totals which may be useful in reading.
25. Make the title of a chart so complete and so clear that misinterpretation will be impossible.
Special. In showing deviations from a central tendency, on the vertical scale, upwards is plus, and downwards, minus; on the horizontal scale, to the right is plus, and to the left, minus.
II. Checking List for Graphic Presentations
1. Are the data of a chart correct?
2. Has the best method been used for showing the data?
3. Are the proportions of the chart the best possible to show the data?
4. When the chart is reduced in size, will the proportions be those best suited to the space in which it must be printed?
5. Are the proportions such that there will be sufficient space for the title of the chart when the chart has been reduced to final printing size?
6. Are all scales in place?
7. Have the scales been selected and placed in the best possible manner?
8. Are the points accurately plotted?
9. Are the numerical figures for the data shown as a portion of the chart?
10. Have the figures for the data been copied correctly?
11. Can the figures for the data be added and the total shown?
12. Are all the dates accurately shown?
13. Is the zero of the vertical scale shown on the chart?
14. Are all zero lines and the 100 per cent lines made broad enough?
15. Are all lines on the chart broad enough to stand the reduction to the size used in printing?
16. Does lettering appear large enough and black enough when seen under a reducing glass in the size which will be used for printing?
17. Is all the lettering placed on the chart in the proper direction for reading?
18. Is cross-hatching well made with lines evenly spaced?
22. Are dimension lines used wherever advantageous?
23. Is a key or legend necessary?
24. Does the key or legend correspond with the drawing?
25. Is there a complete title, clear and concise?
26. Is the drafting work of good quality?
27. Have all pencil lines which might show in the engraving been erased?
28. Is there any portion of the illustration which should be cropped off to save space?
29. Are the instructions for the final size of the plate so given that the engraver cannot make a mistake?
30. Is the chart in every way ready to mark "O. K."?
5. Presenting Statistical Data with Maps
Sometimes it is necessary to impress the public with the way items vary in size or frequency in different geographic areas. Thus, a state superintendent may wish to show the legislature how high school facilities vary in different counties of the state; a city superintendent may desire to demonstrate how far pupils have to go to reach a school, or the crowded conditions necessitating a new building in a certain locality; or a county superintendent may need to show just how his county ought to be districted for schools so that all children will be within a reasonable distance of a school.
These things may be shown fairly well by variations in shading the maps for different localities, a device much used by the United States Bureau of Census and by geography, history, and economics texts. However, all such maps require keys for their interpretation, and it is difficult for any one, except a person very familiar with such work, to understand them quickly.
A much better way is to represent every case or every certain number of cases (say ten) by a dot. Figure 67 is a map used by George Peabody College for Teachers to show the communities from which students have come to the college during the years 1914 to 1918. This illustration shows admirably just how much territory is coming under the direct influence of the school. It does not give a correct notion of the number of students that have attended the college during these years, as some cities have contributed possibly a hundred or more each and yet such a city would be represented by only one dot. If a dot had been inserted for each student, some portions of the map would have been solid black.
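The plan of one dot for every certain number of cases can be sketched in a few lines of Python. This is only an illustration: the function name `dots_for`, the place names, and the student counts are hypothetical, and the rule of rounding to the nearest dot while giving every place with at least one student a dot is an assumption, not a rule stated by the source.

```python
def dots_for(count, cases_per_dot=10):
    """Number of dots to draw for `count` cases, at `cases_per_dot` cases to the dot."""
    if count <= 0:
        return 0
    # Round to the nearest whole dot, but never drop a place that has any cases.
    return max(1, (count + cases_per_dot // 2) // cases_per_dot)


# Hypothetical student counts for three communities (not from the source).
students = {"City A": 104, "City B": 25, "Town C": 3}
dots = {place: dots_for(number) for place, number in students.items()}

for place, number in students.items():
    print(f"{place:<8}{number:>4} students  {'.' * dots[place]}")
```

With one dot to the student, the first city alone would need 104 dots in one spot on the map, which is the crowding just described.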
This could be remedied by using perpendicular wires with a bead for each student, as recommended by Brinton,¹ but it is very difficult to reproduce such a map by photography or by a drawing. In showing the widespread influence of the college, the map is more effective than the one from the Report of the General Education Board for 1914, page 4. This is merely a map of the United States in outline, with the number of students in each state attending Vanderbilt University. It is thus only a map with a table on it, and not even in good tabular form, because there is no direct way to compare the numbers by having them in one column, and especially running from low to high or vice versa. It is impossible for one to visualize the number of students attending Vanderbilt University, and it takes a little time to get a correct notion of the sections of the country coming under its direct influence.

¹ Graphic Methods of Presenting Facts, p. 251

The University of Georgia uses a good map to show its attendance, heightening the effect by adding radiating lines to indicate its influence in the state. (See Figure 68.)

Fig. 68. Device for Showing Distribution of Cases on a Map. Each dot represents a student from the different counties, attending the University of Georgia. Note the radiating lines to indicate that the university is exerting an influence upon all the state.

The dot device on a map of the city was used by the superintendent in the Rockford schools¹ to show the residence of pre-tubercular pupils and also the residence of students in the evening schools. In the one case, he showed clearly that the pre-tubercular children were scattered widely through the city and that the problem of handling them was city-wide.
He also showed that the attendance at evening school was not restricted to a small area of the city.

¹ Rockford Review for 1915-16, pp. 85, 104

In some of the Red Cross work huge state maps are shown with the number of tubercular soldiers for each county pasted in as so many paper doll soldiers, upright in ranks. It is a very effective presentation.

If the distances on a map are to be measured in some time unit, a map similar to that in Figure 69 is useful.

Fig. 69. Device for Showing Distance with a Time Element on a Map. Map used by the Nashville Commercial Club to show how accessible the city is. Every city indicated is within twelve hours' travel of Nashville.

The accredited schools of a state may be shown on an outline map, using different-colored pins or tacks for the different classes of schools. In the division of rural education in the state department of education for Missouri, there is an immense map of the state painted on the wall. On this, each approved rural school is shown by a small kodak picture of the building. This enables a visitor to get almost immediately an idea of where the good rural school work is being done, and closer examination will show the kinds of school buildings being put up. Of course, such a map cannot be easily reproduced in printing. Still, something can be done with conventional drawings for each item in a class. Most readers of this, for example, probably recall the recent Y. M. C. A. propaganda with the sketch of the Western Battle Front, each building of the organization being represented by a tiny drawing of the right kind.

III.
HOW GRAPHS FOR THE PUBLIC DIFFER FROM THOSE FOR THE ADMINISTRATOR

The difference has been referred to many times before, especially in discussion of the superior value for the public of bar graphs over curves and in Brinton's check list.¹ Let us now analyze it further, restating certain points for additional emphasis.

¹ See pp. 274-5

In general, the public will view graphs in much the same way as they view any explanation or presentation. The ordinary man cannot quickly get from a rough copy of a chapter the meaning that an experienced writer can; he cannot extract from a confused and verbose mass of evidence the essentials that a trained lawyer can; he cannot grasp so quickly, nor in such large numbers, many intricate graphic presentations that seem relatively simple to a trained school man. School graphs for the public must be simple, with relatively few elements or lines, and very forcible. The trained reader can extract the significant things from most graphs, however poorly constructed; the average man cannot. Let us now take up some of the most significant aids for making graphs clear to the public.

1. The title should give all the significant points to be found in the graph, so that the graph would be quite intelligible apart from the context where it is found. This implies that there should not be many different points found in one chart. If the chart can show only one thing, it is all the better. Some examples of good titles are:

Standing of the children of Salt Lake City in the fundamentals of arithmetic, judged by the median score attained by each grade. (Salt Lake City Survey, p. 175.)
Distribution of ages at which Salt Lake City children enter the first school grade. (Same, p. 201.)
Columns represent number of pupils among each hundred beginners who remain in school at each grade from the first elementary to the fourth high. (Cleveland Survey, p. 88.)
Percentage of elementary teachers, high school teachers, and elementary principals in Cleveland who are home trained and not home trained. (Cleveland Survey, p. 107.)
How Portland spends its dollar. (Portland Survey, p. 84.)
Figure 7, representing the percentage of children in several grades who make the given scores in composition. For example, 1.7 per cent of the fourth-grade children wrote compositions scored at 0; 43.8 per cent of the fourth grade were scored at 1; etc. By following the median lines, the overlapping of ability from grade to grade is disclosed. (Butte Survey, p. 75.)

In each of these instances, the title and the chart make a complete unit. The last one is especially noteworthy. It gives a very brief title, then expands this with a full but concise explanation of all points in the diagram that may cause confusion.

2. A chart or graph too large to be seen without turning the head is apt to be a poor chart for the public, no matter how simple it may be. This is shown by some charts and folded graphs in some of the surveys. It is very seldom that a chart or graph should occupy more than one page of the publication in which it appears. In reducing charts for publication, however, one must be careful that the reduction is not carried to the point of making the differences negligible or the lettering too small to be read easily. Advertisers long ago discovered that the public will not read advertisements in fine print, and school graphs are only a form of advertising the school work. Often graphs are reduced greatly to economize space. If this is pushed to the extent of making the graph hard to read, it is clearly advisable to omit some of the graphs altogether, and make the remainder large enough to be effective.

3. The background of a chart should not be made any more prominent than necessary.
Many charts are plotted on coordinate paper heavily and finely ruled, while the curves or bars are but a trifle heavier than the coordinate ruling. Such charts do not stand out clearly from their background. Only as many coordinate lines should appear on a chart as are necessary to guide the eye of the reader and to permit of easy reading of the curves. The difficulty may be rather easily avoided by drawing on the coordinate paper in very heavy lines with India ink all the lines and figures which it is desired to reproduce, including the coordinate ruling. The necessary lettering can be put in with a typewriter, using a practically new black ribbon. The proper exposure in making the cut will "take" all the desired lines and lettering, but not the others.

4. Exaggerations should be avoided as much as possible.

(a) Usually, a very forceful presentation may be had without any great sacrifice of accuracy. Of course, only complete scientific presentation can ever give the whole truth. However, if only a phase of a problem at a time is presented to the public (and often this seems necessary), some exaggeration is inevitable. Here the problem is to choose between absolute accuracy and forcefulness of presentation.

(b) It is, in general, dangerous to leave the zero line off a chart intended for the public, or even to send it out with the conventional wave line at the bottom. Many school men in making charts have found it convenient to leave the zero line off. Sometimes, when all the quantities used come high on the vertical scale, this is done to economize space. An excellent example of such a chart is Figure 70. The upper chart conveys the idea that salaries have increased greatly in Louisville during this period of five years. But the chart begins at about $475 instead of zero. The chart below is drawn in full.
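The force of this objection may be put in figures. What follows is only a minimal sketch in Python, offered by way of illustration and forming no part of the Louisville report: the two salaries are hypothetical, chosen merely so that both lie above the $475 at which the upper chart begins, and the function names are likewise inventions of the sketch. It compares how many times longer the last bar appears than the first when the scale starts at zero and when it starts at $475.

```python
def drawn_length(value, baseline=0.0):
    """Length of the bar drawn for `value` when the scale starts at `baseline`."""
    if value < baseline:
        raise ValueError("value lies below the start of the scale")
    return value - baseline


def apparent_ratio(first, last, baseline=0.0):
    """How many times longer the bar for `last` is drawn than the bar for `first`."""
    first_length = drawn_length(first, baseline)
    if first_length == 0:
        raise ValueError("the first bar has no length on this scale")
    return drawn_length(last, baseline) / first_length


# Hypothetical average salaries, assumed only for illustration.
first_year, last_year = 525.0, 675.0

true_ratio = apparent_ratio(first_year, last_year)                # scale from zero
cut_ratio = apparent_ratio(first_year, last_year, baseline=475)   # scale from $475

print(f"Scale from zero: last bar is {true_ratio:.2f} times the first")  # 1.29
print(f"Scale from $475: last bar is {cut_ratio:.2f} times the first")   # 4.00
```

With these assumed figures, a true increase of about 29 per cent is drawn as a bar four times as long as the first.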
It is clearly seen from the complete chart that the salary increase, while noticeable, does not appear anything like so large as in the original and incorrectly drawn chart.

[Figure 70. Two sets of horizontal bars for the years 1910-11 to 1914-15. Upper set: "As it was drawn." Lower set: "As it should have been," on a scale running from $50 to $700.]

Fig. 70. Example of Danger of Leaving Zero Line Off a Chart. The top set of bars was intended to show a "comparison of salaries paid to elementary school teachers in Louisville for the years 1910-1915." The bottom set shows what the comparison really was. (Adapted from the Louisville Report, 1914-1915, page 18.)

(c) Care should be taken that the scales chosen for the graph do not exaggerate things unduly. The novice in chart making will probably become confused by the ever changing ratios between the perpendicular and the horizontal scales. No definite rules can be laid down for guidance in this matter. The only way to get facility in adjusting these ratios in the proper way is through the trial and error method. That a change in the ratio makes a great difference in the impression produced by the graph is shown in Figure 71. The left-hand graph shows the results of the Kansas Silent Reading Tests in the Rockford schools. The right-hand graph shows the same data plotted with the horizontal scale increased while the vertical scale is decreased. It will be seen at once that the difference in achievement between the grades does not appear so marked in the second illustration as in the first.

[Figure 71. Two graphs of the same data, each with Grades 3 to 8 on the horizontal scale.]

Fig. 71. Example of Effect Produced by Changing Ratio of Horizontal to Vertical Scale on a Graph. Results of Kansas Silent Reading Tests. (Adapted from Review of Rockford, Illinois, Schools, 1915-1916.)

(d) The superintendent must beware of graphs containing optical illusions. There is little danger of this in charts made small enough to publish, but there may be danger in large wall charts. Brinton calls attention to illusions caused by a row of perpendicular lines compared with a row of horizontal ones the same distance apart.¹ The lines in the first row appear shorter than they really are and spread farther apart; those in the second seem to be longer than they really are. Another illusion is caused when a white square and a black one of exactly the same size are drawn adjacent. The white one seems larger. This might affect slightly the ratio of lighting space as shown by white windows against a dead black wall space, in some surveys. (See Figure 72.)

Fig. 72. Cartoon Graph Representing Ratio of Lighting Space to Floor Space. (From Alabama Three-County Survey, page 92.)

5. Special effort should be made to introduce variety, novelty, and various striking features to attract attention to the statistical relations to be presented. Variety in graphs is as necessary to keep the attention as is variety anywhere else. No ordinary reader could stand to wade through a report of any length which had a great many bar graphs of one pattern and no other illustrations. Sometimes pictures or devices may be used simply to catch the attention. The bulletin on illiteracy in Virginia,² for example, uses a picture of a rural school to get the reader's attention for the statement that only six of these children are beyond the first reader and none beyond the fifth reader. Or devices similar to those in Figure 73 may be used to attract attention to the statistics.

¹ Graphic Methods of Presenting Facts, p. 358
² Illiteracy in Virginia, published by State Department of Public Instruction, p. 7

[Figure 73. Poster headed "The Profit from Two Herds for One Year," contrasting Herd A and Herd B, each of 11 cows, and closing: "Moral: It would have taken 93 poor cows to make the profit the 11 good cows made. Does it pay to keep records?"]

Fig. 73. Example of Use of Special Devices to Attract Attention to the Statistics Involved. This figure compares the profit from two herds for one year and shows how many dairymen are wasting time and money on low-producing cows. "Why not get rid of your 'visitors'?" (From a publication of the University of Wisconsin, Experiment Division, by permission.)

Then there are special touches which no one can tell precisely how to go about acquiring. For example, in the Alabama three-county survey, illiteracy is shown among voters by having the literate voters face the reader and the illiterate ones turn their backs. (See Figure 74.) Probably this is about as forceful a showing of the shame of illiteracy as could be devised.

[Figure 74. Two rows of figures of voters, lettered: "One out of every ten white men must ask another to mark his ballot for him. Four out of every ten negroes must ask another to sign their names for them."]

Fig. 74. Bar Graph with Cartoon Effect Showing Illiteracy in Alabama. (From the Survey of Three Counties in Alabama, page 19.)

IV. EXAMPLES OF GOOD GRAPHS ON SCHOOL STATISTICS FOR THE PUBLIC

1.
To Show Rise in School Costs

In the Newton, Massachusetts, Report for 1912,¹ the scale on tax for school maintenance per $1000 is shown on a thermometer device which is reproduced on page 102 of this book. This gives the idea that costs for school maintenance should rise. The names of various cities with which Newton is compared appear at one side of the graph, with lines running from each name to the proper degree on the thermometer where the mercury should stand for that city. This graph shows very forcibly that the mercury must rise many degrees for Newton before it will equal the best record made by the other cities.

¹ Page 113

For some cities, probably a cartoon utilizing the weighing machine seen at fairs and carnivals would be equally effective.¹ Instead of pounds, tax levies or per capita amounts of money could be shown on the scale of the upright, with the highest amount reached or desired at the top. Men representing the other cities could be standing around, evidently having struck the machine, and their records could be shown on a bulletin board in the background. Another man, representing the home city, could be shown as just getting ready to strike the machine to see what he can do, in the midst of words of encouragement or taunts from the other men. His old record might appear on the bulletin board. Underneath might be some such question as, "Can't he send it to the top?" "Who is the best man?" "How much will he beat his old record?"

¹ Suggested by Mr. F. C. Lowry

2. To Show Relative Investments in School Property

In the Educational Survey of Three Counties of Alabama, the number of dollars invested for each child of school age by each state is given.² Each dollar is represented by a dollar mark. Thus, Massachusetts has $115 invested in school property for each child of school age, while Mississippi has only $4. The advantages of this graph are: the symbol aids in calling attention to the graph; the length of the row of dollar marks gives the effect of a bar; the data are accurately represented; there is no material exaggeration or anything in the device to mislead.

² Page 212

3. To Show a Lack of Funds for Maintenance

In the Survey of Three Counties of Alabama,¹ there appear pictures of a schoolhouse and an automobile. From suitable figures, we learn that the initial cost of a cheap automobile is more than that of the average rural schoolhouse; and that the upkeep of the machine is more than that of the average rural school. This comparison depends for its power on contrast, not on accuracy, for there is nothing particularly accurate about it. All the same, it is a very powerful device in shaming rural people into doing their duty by schools. If there is a single automobile in such a district, it represents more than the rural school expenses. This illustration will be of service chiefly in suggesting similar comparisons.

The Columbus Dispatch some months ago had a very effective cartoon to represent the disparity in wages of women teachers and statehouse janitors in Ohio.² It depicts a gruff old man with hands in his pockets, labeled "Old Man Ohio." Above him are two inserts. The left insert represents a pitiful woman teacher in her classroom, with the statement that the average salary for the public school teacher in Ohio is $54 a month. The right insert depicts a shuffling negro janitor in cap and overalls, bearing broom, mop, and bucket, with the statement that Ohio pays the janitors in the statehouse $60. On either side of Old Man Ohio is a hand with index finger pointing at him to emphasize the title of the whole: "For Shame!"

¹ Page 72
² Reproduced in American School Board Journal, Jan., 1917, p. 33

4. To Show Length of School Term, Average Attendance, Etc.
For this the Ayres bulletin on the forty-eight states has a good graph. Each day is represented by a small square, the whole representing a bar graph, with each bar two squares wide to make the bars shorter. The total length of the bar represents the average number of days in which schools were open in that state. Beginning at the left, a sufficient number of these little squares are shaded to represent the average number of days attended by each pupil enrolled in that state. The names of the states appear on the left from high to low, beginning with Rhode Island, which had her school open 193 days and kept each pupil in 148.8 days. The lowest is New Mexico, which had her schools open only 100 days and kept each pupil in only 66.4 days. This is represented in Figure 75. This chart shows clearly which states have the longest school terms and which are making the best use of what they have. It could be used for cities just as well as for states.

Fig. 75. Graph for Showing the Relation of the Average Number of Days' Attendance by Each Pupil to the Number of Days School Was Open. (From Dr. Ayres's Comparative Study of the Public School Systems in the Forty-eight States.)

Another way to show the attendance of different school systems is suggested by the graph on page 10 of the same bulletin. This shows the number of days of schooling each child of school age would get per year if he got his share. Each day is represented by a small dot; the dots are clustered in groups of five, thereby giving a unit of the week as well as the day. The bar effect is obtained, and the graph has every advantage mentioned for those above. This graph could be used in any graphing of attendance, probably. The copy used by Ayres has the figures at the right end of the bars, which is bad, because it makes the bars appear lengthened unequally. This is corrected in the copy below.

48. New Mexico 46 :•: :•: :•: :•: :•: etc.

5. To Show Per Capita Costs of Schooling

A graph for this, similar to the last two described, is found on page 18 of the Ayres pamphlet. It sets forth the cost of one day's schooling for one child in each state in 1910. Each cent is represented by a black dot, and the bar effect is obtained. This dot or the cent mark or the dollar mark could be used in graphing any data on costs, the unit being chosen so as to keep the bar short enough.¹

South Carolina 7 :•: :•: :•: :•: :•: etc.

¹ See page 260 of this book

6. To Show a Disgraceful State of Affairs in Certain Localities

On page 160 of the Alabama three-county survey is shown a familiar map graph. This particular one shows the map of the United States with all states having compulsory education laws in white, and those not having such laws in black. As black is usually associated with shame and disgrace, the graph becomes a stinging accuser against the sections that are backward in this respect. The same idea, of course, has been used in religious maps. This use of black was referred to in the description of the chart on page 32 of the Ayres pamphlet. (See page 244 of this book.) The idea is capable of wide use in cases where it is desirable to shame backward school systems into doing something better. The objection that it is difficult to show lettering on black areas is easily overcome by using white ink for lettering.

7. To Show the Variability of Children in the Different Grades in Their Achievement in Some Standard Test or Similar Matter

A discrete distribution or Bobbitt table may be shown with bar graphs as shown on page 103, with a curve as shown on page 104, or with a scale as shown on pages 101 and 102. For a continuous distribution, some form of the block graph is probably best. It is much used in the surveys of Butte, Salt Lake City, and other places. (See pages 112 and 137 of this book.)

8. To Show the Relation between the Rank of Children in Their Classes in the Elementary School and Their Probable Entrance into High School

The pupils are divided into three classes, the upper, middle, and lower thirds. Each third is represented by a broad vertical bar, all bars the same length. The percentage of each bar representing those not going on should be colored black or shaded, the rest being left unshaded. The three bars should be placed side by side, with the low third on the left and the high third on the right. (See Figure 76.)

[Figure 76. Three vertical bars, labeled "Low third," "Middle third," and "High third."]

Fig. 76. Graph Showing Percentage of Eighth Grade Pupils Entering High School from the Low Third, the Middle Third, and the High Third of their Classes, Cleveland. (From Cleveland Survey, Summary Volume, page 185, by permission.)

9. To Compare Achievements of the Various Grades in Standard Tests with Similar Grades from Other Cities

The best graph for this purpose is probably a modification of the bar given in Figure 63 on page 267. Each grade could be represented by the one kind of shading throughout. Each group could have the name of its city beneath, the scale could appear on each margin, and, if necessary, faint horizontal scale lines could be drawn clear across the drawing.

10. To Show Ratio of Lighting to Floor Space

In the Springfield Survey, a graph is used for this in which the floor space is represented by a black square.¹ In the midst of this is a white square representing the lighting space. In the first diagram to the left appears the standard ratio; the second one gives the average ratio for Springfield. The percentages appearing in the white squares should be written below. White squares always appear larger than black ones in such a graph, but this one is too small for the feature to affect it materially.
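In drawing such a diagram, it matters whether the ratio is carried by the areas of the squares or by their sides. The sketch below, in Python, is an illustration only and rests on assumptions not stated in the Springfield Survey: that the areas are meant to stand in the given ratio, that the black square is two inches on a side, and that the ratios are 25 and 16 per cent. None of these figures comes from the survey, and the function name is hypothetical.

```python
import math


def white_square_side(black_side, ratio):
    """Side of a white square whose area is `ratio` times that of the black square."""
    if not 0 <= ratio <= 1:
        raise ValueError("ratio of lighting space to floor space must lie between 0 and 1")
    return black_side * math.sqrt(ratio)


BLACK_SIDE = 2.0  # inches; hypothetical size of the floor-space square

# Hypothetical ratios of lighting space to floor space, assumed for illustration.
for label, ratio in [("first diagram", 0.25), ("second diagram", 0.16)]:
    side = white_square_side(BLACK_SIDE, ratio)
    shown = (side / BLACK_SIDE) ** 2   # ratio of the areas actually drawn
    mistaken = BLACK_SIDE * ratio      # side obtained if the ratio were applied to the sides
    print(f"{label}: white side {side:.2f} in., areas in ratio {shown:.2f}; "
          f"scaling the side instead would give {mistaken:.2f} in.")
```

A white square drawn with its side, rather than its area, in the given ratio would show the lighting space as much smaller than it is.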
The white square in the midst of the black gives a window-like effect and so helps to call attention to the diagram. This may be improved upon by making the whole the shape of a side wall, with the white the shape of windows in that wall. (See Figure 72, page 286.)

¹ Page 24

The preceding illustrations comprise only a few of the best selections from school reports and surveys, in addition to those given before. It will be noticed that many of the problems the superintendent is sure to meet in graphing his data have not been mentioned. The bar graph, concerning which much has been said in the previous pages, is capable of wide adaptation, as is also the Bobbitt table. One or the other of these two may be pressed into service upon almost any occasion. It is well, however, to introduce some of the special forms mentioned above, for variety's sake if nothing else. The efficient superintendent, of course, will always be on the lookout for improving the devices we now have for presenting statistical data to the public, or he may work out some entirely new methods. But any graph he devises for the public should be relatively simple, very clear, and, if possible, forceful.

V. ECONOMIES IN MAKING SCHOOL GRAPHS FOR THE PUBLIC

Large Cross-Section Paper. If large charts are to be made for display purposes, excellent results may be obtained from the use of good wax or grease crayons of assorted colors. The time required for getting accurate measurements may be greatly reduced by the use of large size cross-section paper, as the counting is then very easily done for the two scales or for locating any particular point. The paper for this should be heavy manila, light enough in color and sufficiently free from spots to make a good background for the colors, and rough enough to take the colors easily. Sheets 36 by 40 inches ruled faintly in one-inch squares give excellent results.
These may be obtained from any large printing house equipped with ruling-pen machines, but are very expensive in small quantities. The University of Chicago Press carries them in stock at from three to five cents per sheet, depending upon the price of paper, transportation extra; or they may be obtained from the Peabody College Book Store on the same terms.

Making Cartoons. In some cases, good results may be obtained by pasting pictures on charts, if the cartoon effect is desired. For example, the writer wished to reproduce in large form the automobile and schoolhouse graph from the Alabama three-county survey, referred to on page 290. He got a picture of an automobile from a large-sized advertisement in the Saturday Evening Post and a picture of a rural schoolhouse from the front cover of an American School Board Journal. By putting on the title and the figures, the chart was soon made. Students often employ this same device in getting up posters for school entertainments.

Gummed Letters. A beautiful chart may be made at little cost of time and money by the use of gummed paper letters and strips. These can be had in several colors and any height from four inches down. The superintendent can use letters from one to one and a half inches high, costing from about $1.50 to $2.00 per thousand. The letters may be spaced very easily and quickly if the chart is made on a large sheet of cross-section paper. These letters and strips may be obtained from the Tablet and Ticket Company of Chicago, which will send pictures of such charts. The strips are very serviceable for making bar graphs.

Rubber Stamp Set. An advertising set such as is used by merchants for display cards can be used advantageously in making good charts. With a little practice, neat charts can be quickly made with such a stamp set, especially on the large cross-section sheets.
A satisfactory set may be obtained from the Milton Bradley Company, 73 Fifth Avenue, New York City, or from Salisbury-Schulz Company, 157 West Randolph Street, Chicago, or at any of their various offices, for about $5, depending upon the price of rubber.

Securing Clear Lines. In drawing charts for cuts, the chart and all essential cross-section lines should be reproduced. The error of reproducing the background of numerous cross-section lines may be avoided by tracing over in India ink the parts that should come out in the cut. The photographic process will reproduce this easily before the background with its fainter tones will "take." The too prominent background is usually due to making the drawings with ordinary ink or faint typewriter lettering. The latter may be easily traced in India ink and give good results. (See page 76.)

Miscellaneous Aids. Numerous aids to making charts quickly with various kinds of cross-section paper, special scales, etc., may be obtained from the Educational Exhibition Company, Providence, Rhode Island, or the Tablet and Ticket Company of Chicago.

"Perpetual" Attendance Graph Device. An exceedingly easily managed graph arrangement was observed some years ago by the writer in the device used by Principal R. L. Dimmitt of the Ensley High School at Birmingham, Alabama, to compare the attendance record of the classes in the high school. (See Figure 77.)

Fig. 77. "Perpetual" Attendance Graph Device.

A large chart was made up on Bristol board, once for all, with the percentage scale running up on the sides. Each class was represented by a paper ribbon that came through a slit on the base line. By pulling the ribbons up and down each month and fastening the ends with thumb tacks, the graph was quickly brought up to date. The omission of the zero line exaggerated differences, but such exaggeration was to some extent desired for emphasis.
The only real drawback seems to be that one cannot compare the records of the classes by month with such a chart. The chart as operated, however, does not need to show comparisons. The emphasis on attendance is intended to keep up attendance all the time, and not to let children slack up one month because of a good record the preceding month. If, however, it is desirable to make comparisons, this can be easily done by preserving kodak pictures of the chart at various times. This cut is drawn from such a picture.

Graphs without Cuts. If the charts are to be set up in type and not with cuts, the originals may be made either by hand or on the typewriter. For all such work the horizontal or vertical bar graph is especially useful. By the aid of conventional signs, different lengths of rules, etc., almost any chart containing a bar effect can be made so simple that it can be set up in any ordinary limitedly equipped printing office. The following are examples:

xx
xxxx
xxxxxx
xxxxxxxxx
xxxxxxxxxxxx
xxxxxxxxxxxxxxxx

[Further examples, set with ciphers and with printer's rules of different lengths, are not legible here.]

Utilizing Students. In most of the work on graphs for the public, the superintendent need only furnish the idea or design for the chart. The rest of the work can be done by various upper grades or high school classes as very profitable laboratory or practical exercises. The mathematics classes, especially those in algebra, and the drawing and art classes are the logical ones to call upon to take charge of this work. In Newton, Massachusetts, the boys in the high school printed the title pages of the annual school reports and probably made the graphs, although this latter is not so stated. In the World Book Company's reprint of McAndrew's The Public and Its School, the drawings are all made by public school children. Children who could do such drawing could make most of the graphs advocated in this book.
As it is, practically all the drawings in this book have been copied or drawn from the author's suggestions and statistical data by the students of Mr. E. S. Maclin at the Atlanta Technological High School, as a demonstration. Figure 78 shows the sketch and notes furnished one of these pupils by the writer, from which Figure 13 was drawn. Figure 79 shows the cartoon drawn by a student from notes given by Mr. Maclin.

Fig. 78. Showing Sketch and Rough Notes from Which a Pupil Drew Figure 13.

Fig. 79. Cartoon Drawn by an Atlanta High School Student. (Cartoon title: "Only Too True.")

This cartoon represents the high school situation in Atlanta and was drawn by the student with only these suggestions from his drawing instructor, Mr. E. S. Maclin:

Subject: The overcrowded conditions of the Atlanta High Schools. Represent a good-natured boy who has outgrown his clothes at every point, trousers splitting, feet running out of his shoes, shirt too small for him, etc. He is represented as saying to his father, the City Fathers, "Dad, I'm growing in spite of you. I'll soon be in the street and no place to go." His father is represented as being a rich man with Atlanta's per capita bonded indebtedness of about $22 per inhabitant.

It will be noted that the suggestions were not wholly followed, but the cartoon as published in an Atlanta paper was sufficiently forceful.

EXERCISE 1

Take the school report or school survey used in the exercise on page 233. Write out a detailed criticism of the graphs or lack of them in it from the standpoint of their effectiveness with the public, showing just why they are good or liable to be unsuccessful. In the cases of the unsuccessful ones, or failure to use graphs where desirable, sketch graphs that would present the same data properly.
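For sketching graphs of the kind described under "Graphs without Cuts," a bar graph made wholly of typewriter characters can be produced mechanically. The following is a minimal sketch under stated assumptions: the function name `text_bar_graph`, the choice of the letter x as the conventional sign, and the enrollment figures are all hypothetical and serve only as an illustration.

```python
def text_bar_graph(data, symbol="x", width=40):
    """Return a horizontal bar graph as lines of plain text.

    data: list of (label, value) pairs with non-negative values.
    The largest value is drawn with `width` symbols, the others in proportion.
    """
    largest = max(value for _, value in data)
    label_width = max(len(label) for label, _ in data)
    lines = []
    for label, value in data:
        n = round(value / largest * width) if largest else 0
        lines.append(f"{label.ljust(label_width)}  {symbol * n} {value}")
    return lines


# Hypothetical enrollment figures, for illustration only.
enrollment = [("Grade 5", 120), ("Grade 6", 90), ("Grade 7", 60), ("Grade 8", 30)]
for line in text_bar_graph(enrollment, width=20):
    print(line)
```

Because every bar is built from a single repeated character, the result can be typed, mimeographed, or set in any printing office without a cut, which is the point of the device.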
REFERENCES FOR SUPPLEMENTARY READING

Brinton, W. C. Graphic Methods for Presenting Facts. Practically all.
Ellis, A. Caswell. "The Money Value of Education." U. S. Bureau of Education Bulletin, 1917, No. 22.
King, W. I. Elements of Statistical Method, Chapter X.
Rugg, H. O. Statistical Methods Applied to Education, Chapter X.

CHAPTER XII

TRANSLATING STATISTICAL MATERIAL ON SCHOOLS FOR THE PUBLIC

I. THE NEED OF TRANSLATION

For the superintendent who wishes to present his school statistics effectively to the public, three devices are available. He may graph his material; he may tabulate it; he may translate it into words. Each procedure has its strengths and weaknesses. Each is best adapted to certain conditions.

Translation is the most serviceable device in cases where it is difficult to secure or to have printed good tabulations or graphs. A translation of statistical material into words with a few mere numbers can be typewritten or set up in type anywhere with little effort. Such a translation can be read or spoken at any public meeting with no particular preparation. Furthermore, it can be so neatly expressed that persons hearing it can easily remember it and quickly pass it on to others, a thing not possible with tabulations and graphs. It can be so forcibly worded that it will arouse people to action. For example, the New York Survey devoted many pages to figures and graphs setting forth the results attained in arithmetic. But it is very doubtful if the whole or any part of this section could impress the average man as does the simple translation of the summary by McAndrew: "It takes us less time to get a thing wrong here than it does in the average school system."¹

Statistical material on schools is often presented in words in such a way as to be clear to the trained school man and yet be unintelligible to the average man.
If it is to be clear to the latter, it must be as definitely translated for him as must material from the Latin or other foreign languages, from a doctor's description in medical parlance of a disease, from fundamental political theories, or from scientific experiments in agriculture. We have various writers to translate the classics; we have Woods Hutchinson and other medical writers to translate medical knowledge; we had Münsterberg and James to translate psychology; we have various writers and speakers to translate political theories into the party campaign books; the agricultural colleges have numerous writers of bulletins to translate agricultural knowledge for the farmer and housewife. Have we not just as much need for translating school statistics into the language of the average man?

Beyond clearness, the matter of force is very important. The translation must not only be clear to the man for whom it is intended, but it must take hold of him in some way. Force cannot be obtained by the mere repetition of tabulations in straight sentences of reading matter, however much some state superintendents appear to think it can. This is not translation. The problem is really the same problem as that of the life insurance companies, the corporation seeking to influence the public, and advertisers in general, when they try to reach the public with statistical material. They do far more than merely express figures by words.

¹ The Public and Its School, p. 8.

It cannot be too strongly stated that translation is not the same as definition or explanation. In defining statistical terms we simply aim to show precisely what we mean by them, often in terms just as technical. In explaining statistical terms we merely try to make our particular use of them clear to persons who already have the same definitions of them as ourselves, this too in language often just as technical.
Good translation of a statistical term, of course, involves both definition and explanation, but it is more. It puts the emphasis on the fact that the meaning must be carried over into an entirely different language or set of expressions. For example, the median may be defined as the magnitude of the midpoint in a distribution. In any given frequency table or surface of frequency, it may be explained that the point marked "m" signifies the median. But the idea of the median in a set of superintendents' salaries may be translated for the average man by telling him that half the superintendents get more than that salary and half of them get less.

The writer has never seen any discussion of this translation phase of school statistics. Consequently, the treatment here is only a preliminary or tentative analysis, to be supplemented with illustrations from any source. For clearness and brevity, the points will be given rather dogmatically.

II. SUGGESTIONS FOR GOOD TRANSLATIONS

The main things to be kept in mind in working out good translations of school statistics are:

1. The illustrations and images used must be of an elementary nature, or at least familiar to the people for whom the translation is being made.

The announcement that retarded children in school are as thick as Ford automobiles or men wearing Masonic pins is intelligible anywhere. On the other hand, the statement that retarded children are as thick as negroes in southern cities would be a very effective way of translating the facts to some southern audiences, but would hardly be of much value elsewhere. It would not even work in all southern cities, because the proportion of negroes in them varies from close to 50 per cent down to less than 5 per cent.

2. In some cases it may be necessary to use several illustrations in order to be sure of reaching all classes of people.

3. Instead of representing a total by imagining an unreal extension of a familiar object, or by making up from familiar units an aggregate so large as to be incomprehensible, it is usually better to employ some other unit. Often this other unit is one of time.

It is of doubtful value to ask the average man to think of a line of school children eight hundred miles long; of a schoolhouse as large as all the schoolhouses of the county put together; of a sheet of writing paper large enough to cover a township; of a lump of coal weighing as much as all the coal burned in one day in many schools; or of a total of any sort reaching into the hundreds of thousands. Such translations are sometimes attempted. They might well be called Jack-and-the-beanstalk translations, for they are about as far-fetched.

Practically all statistical totals needing to be translated will, on examination, be found to involve in some way units of length, area, volume, weight, and time. The aim should be to translate the total into another kind of unit that will keep it within the limits of comprehension or experience of the ordinary man.¹ Often a relatively larger unit of time that is forceful may be employed than in the case of the other kinds of units, because most people comprehend long stretches of time or the consumption of goods over long periods, fairly well. The periods of time, however, must in general be well within the limits of the ordinary man's active life span.

¹ For elaborating this point the writer is indebted to Mr. H. A. Webb, one of his students.

Consider the following translation of the average daily absence for the schools in Texas in 1911-13: "Placed twelve feet apart, these white pupils absent every day from the schools of Texas would form a line extending across the state from El Paso to Texarkana, a distance of over eight hundred miles."¹

The average Texan, even though he is from an early age accustomed to boasting of the size of the state, can have but a very hazy idea of the distance mentioned. Even if he has traveled from one city to the other, his idea is dependent chiefly upon his recollection of the time it took to make the trip, and a good part of this time he may have been asleep. In all likelihood, he has never seen children lined up twelve feet apart this way. It would probably be better for most people to consider these children as marching double file and say how many days it would take them to pass a given point. The average man has seen people marching double file in parades and has a good idea of about how fast they would pass.

Instead of saying that, if the school costs of a certain city were represented by silver dollars lying side by side, they would extend the distance of ten blocks, it would be far better to say that at a certain sum, say $50 a month, it would take a man a certain number of years to earn an equivalent amount of money. In this case, the average man has a much better idea of how long it takes him to earn money than he has of the distance dollars touching each other will extend. He has never seen silver dollars lined up along a street, but he has worked for wages and knows the value of his money.

A very effective example of this sort is found in the Negro Year-Book for 1916-17.² The compiler of this made a study of the average number of days that each negro child of school age attended school in each of the southern states. To make the smallness impressive, he calculated the number of years it would take the average negro child to complete the elementary school on the basis of eight grades and nine months to the school year, thus:

¹ White, E. V.: "A Study of Rural Schools in Texas," Bulletin of University of Texas, No. 364, Oct. 10, 1914, p. 20.
² Page 233.

No. of yrs. it would take the average negro child to complete the elementary course in the public schools provided for him

State             Years
Maryland            16
Texas               18
Virginia            18
Georgia             19
Florida             20
North Carolina      20
Alabama             22
Louisiana           25
South Carolina      33

This could have been made more forcible by changing the figures to the age at which a negro entering school, when six years old, could complete the grades, as 22 for Maryland, 24 for Texas, etc.

4. In cost statistics, it is sometimes advisable to minimize the total by expressing it in amount per small unit of time, usually a trivial sum.

Thus daily papers and many weekly periodicals advertise "10 cents a week" and do not call attention to the total of $5.20 for the year. A church calls for "30 cents a week" and does not emphasize the fact that this reaches $15.60 for the year. The Y. M. C. A. announces that membership with all privileges costs "a nickel a day or the price of your daily cigar," when the yearly total is from $15 to $18, without mentioning these latter figures. The Liberty Loan called for "a dollar a week" instead of stressing the $50 that would be laid aside for the year. The sum for the total is generally not stressed except in savings bank advertisements and such appeals, where the aim is to surprise the reader with the total saved, and not any amount that is expended.

5. Absolute accuracy frequently has to be sacrificed to force and clearness in translation.

Thus, if the percentage of negroes in a city population was 40, and the percentage of retarded children 35 or 45, the negro illustration mentioned before would be accurate enough for translating the idea. The "nickel a day" of the Y. M. C. A. is adequate for any annual sum from $15 to $18.

6. Practically all totals have to be translated through comparisons, using familiar objects or notions, before they can be understood or have much force for the average man.
The statement that four per cent of all children of school age are mentally defective and need special attention would mean little to most men. But to say that every school system enrolling 500 children had among that number 20 defective pupils, or enough to equal the ordinary 6A grade taught by one teacher, would drive this fact home to the average man.

The mere statement that there were on the average 64 children to each grade teacher in a school system would mean little to the people of the city and probably would receive no attention. However, if the statement were changed to read that there were enough surplus children over the standard number for the various classes to fill four classrooms, the fact would certainly impress citizens.

The bare announcement of the amount spent for some item of school expense is not nearly so forceful as the statement that it is only a certain fraction of the ice cream and soda, liquor or tobacco expenditures of the town, if such figures or approximate estimates of them can be obtained from merchants.

A southern county superintendent recently translated his valuation estimates as follows:

"The total value of all school property in the county outside of the city is $34,420, which is $3080 less than half the value of the county jail and site, and $3000 less than one fourth of the value of the county courthouse, site, furniture and fixtures.

"Of the $34,420 invested in school property in the county, only $15,665, or less than half, belongs to the state and county. In other words, the value of all the school property of those schools with titles vested in the state is $5335 less than the cost of three of her best motor trucks used in constructing good roads.

"The total value of all supplies and equipment, including musical instruments and libraries, is $5875, which is $1125 less than the cost of one motor truck used in building the roads of the county.

"All school equipment in the county outside the city of ______ is equal in value to less than one seventh of the value of the machinery owned by the county and used in the making of good roads. 'Seven to one' is the ratio of the county's investment in equipment for making roads as compared to her investment in equipment for making men and women."

The same superintendent translated the total area of all the school grounds in the county, 73 acres, by comparing it with the 180 acres of playgrounds in the property of two country clubs near the county-seat.

In similar fashion, the increased enrollment of the public high schools of the United States in 1914 over 1913 would make a city as large as Chattanooga and Knoxville, Tennessee, combined.

Dr. L. P. Ayres did a great service for the school survey movement when he used the following translation to show its results:

"About seven years ago this retardation became one of the most widely studied problems of educational administration, and in the past four it has been one of the prominent parts of the school survey. During the entire period, hundreds of superintendents throughout the country have been readjusting the schools to better the conditions disclosed.

"In these seven years the number of children graduating each year from the elementary schools of America has doubled. The number now is three quarters of a million greater annually than it was then. The only great organized industry in America that has increased the output of its finished product as rapidly as the public schools during the past seven years is the automobile industry."¹

The Warren County (Kentucky) Bulletin translates the losses through retardation as amounting to "thousands of years." This is probably more impressive than really comprehensible.

Suppose a teacher is repeating every answer after every pupil, hour in and hour out, for days, as did some teachers observed by the author's students.
To tell a citizen that such a teacher is wasting time would mean little. But to show him that if the teacher had forty pupils, she probably would have twenty of them reciting half a day each day; that if she repeated every answer, she could only cover about half as much as if she did not repeat; that this meant virtually wasting a fourth of a day for every pupil; that a fourth of a day for forty pupils amounted to ten days for one; that this ten days was two weeks in school for one pupil; that this teacher by her repetitions was each day consuming or wasting the equivalent of one pupil's time for two whole weeks in school; all this would mean a good deal to him.

¹ Ayres, L. P.: "A Survey of Surveys," Indiana University Bulletin, Vol. 13, No. 11, p. 180.

7. Many questions involving value, and particularly exhibits of loss or waste, can be profitably translated into a money equivalent. This is particularly true of all proposals involving an increase in school taxes, which must, of course, be addressed to the taxpayers.

Thus, the need of health education and sanitation may be shown by translating the loss in money through death and sickness into the cost estimates furnished by Professor Irving Fisher of Yale, and others. This is well done in the Warren County (Kentucky) Survey, page 2. It was found there that in the year 1915 there were 155 preventable deaths, and the potential loss occasioned by these deaths, according to Professor Fisher's estimate, was shown to be $263,500.

The well-known chart of the United States Bureau of Education, which attempted to show that every day spent in school is worth $9 to a child, is a good example. The fallacies in it are likely to give actual pain to any one who knows much about statistical method, or who will use his common sense effectively. But it does translate the material into something that is intelligible and appealing to most people.
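The arithmetic behind two of the translations above can be checked in a few lines. This is a sketch only. The figures for the repeating teacher and for Warren County are those given in the text; the five-day school week is an assumption, and the value per death is not stated in the text but is merely what the two published figures imply on division.

```python
# Repeated-answer example (figures from the text).
pupils = 40
recitation_share_of_day = 0.5    # each pupil recites about half of each day
share_lost_to_repetition = 0.5   # repeating every answer covers about half as much
school_days_per_week = 5         # assumption: a five-day school week

day_lost_per_pupil = recitation_share_of_day * share_lost_to_repetition
pupil_days_lost = day_lost_per_pupil * pupils
pupil_weeks_lost = pupil_days_lost / school_days_per_week

print(f"lost per pupil each day: {day_lost_per_pupil} of a day")
print(f"lost by the whole class: {pupil_days_lost} pupil-days, "
      f"or {pupil_weeks_lost} school weeks for one pupil")

# Warren County example (figures from the text).
preventable_deaths = 155
total_loss_dollars = 263_500

# Implied by division; the text does not state this figure.
implied_value_per_death = total_loss_dollars / preventable_deaths
print(f"implied loss per preventable death: ${implied_value_per_death:,.0f}")
```

Laying the steps out in this way shows how a translation of this kind is built: one small, familiar quantity is multiplied up until it reaches a unit the citizen already knows, such as a school week or a sum of money.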
The chart has been of great value for thousands of high school commencement addresses and campaigns for increasing school levies.

In any propaganda for increasing school taxes, it is well to translate the increase into the respective amounts of money which will be due from men who already pay certain round sums for total taxes, together with the number in each class, thus:

This increase in school taxes will require:
50 cents more per year from each of the 1500 men who now pay a total yearly tax of $5 each.
$1 more per year from each of the 500 men who now pay a total yearly tax of $10 each.
$2 more per year from each of the 250 men who now pay a total yearly tax of $20 each, etc., etc.

III. EXAMPLES OF GOOD TRANSLATIONS OF SCHOOL STATISTICS

In many of the school surveys which use modern statistical method, the technical statistical terms and results have not been translated so as to influence the average man. To just this extent they are certain to fall short of what a survey or review of a school claims to be. True, there are to be found in isolated places some very successful translations of these terms. But these are so scattered as not to be generally accessible. The remainder of this chapter aims to make a few of these available for general use.

1. Sampling

The following translation of the process of sampling was used by the author's students in preparing some material for a survey of one of the western cities:

In giving the standard tests to the children of this city, it was found to be too great a task to test every child, as the labor in grading the papers would be enormous. So certain schools and grades were chosen at random, and tests were given in these places.

When a carload of wheat is being graded, the grader does not look at all the grains, as every one knows that would take too long.
What he does is to take a few grains from each of several places well distributed throughout the whole lot of wheat, and to make his rating from these samples. This process has been found accurate enough so that it is constantly used in business without complaint from either the buyer or seller. The same thing is true in grading fruit. Not every apple or peach is actually looked at, or even every box, but only certain apples or peaches taken from certain boxes (determined at random), and the quality of these determines the grade assigned to the whole consignment.

These examples are well known to the people of this section, in all probability because of the amount of wheat and fruit raised in the state. After they have read such a translation, are they likely to doubt the validity of the sampling done in the school survey?

Sampling may also be explained by comparing it with the process of taking a straw vote. Most men understand how this is done, how reliable it is, etc.

2. The Average

A familiar idea for translating the average is afforded by the "see-saw" or by the lever. Every one knows that the farther the person is from the object upon which the lever is resting, the more weight it takes on the other side to counterbalance him. That is, the center is what the physicist means by "center of gravity." It is the same with the average. The average is the balancing point of all the cases in a distribution, with their distances from it and their sizes taken into account. The farther an item is from the average, the more weight it has. Professor King uses the expression "a type" for the average.

3. The Median

The following translations are taken from surveys:

"Among the teachers in the elementary schools, the median or midway age is twenty-nine years, half of the teachers being twenty-nine years old or older, and the other half, twenty-nine years of age or younger."¹
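Both pictures, the see-saw for the average and the midway point for the median, can be verified on a small set of figures. The following is a minimal sketch; the seven salaries are hypothetical and chosen only so that the arithmetic is easy to follow. It uses the `mean` and `median` functions of Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical yearly salaries of seven superintendents, in dollars.
salaries = [900, 1000, 1100, 1200, 1500, 1800, 3000]

avg = mean(salaries)
mid = median(salaries)

# See-saw picture: the distances below the average balance those above it.
below = sum(avg - s for s in salaries if s < avg)
above = sum(s - avg for s in salaries if s > avg)

# Midway picture: as many salaries lie below the median as above it.
n_below = sum(1 for s in salaries if s < mid)
n_above = sum(1 for s in salaries if s > mid)

print(f"average {avg}: {below} dollars of distance below, {above} above")
print(f"median {mid}: {n_below} salaries below, {n_above} above")
```

In this made-up list the single salary of $3000 lies far from the rest and pulls the average above the median, which is what the see-saw picture means by saying that the farther an item is from the average, the more weight it has.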
"With teachers ranked in descending order according to size of salaries, the median salary is the salary received by the teacher half way down the line."²

¹ "The Public Schools of Springfield, Illinois," Springfield Survey, p. 59.
² "Financing the Schools," Cleveland Survey, p. 53.

The idea of a median was translated with the aid of a picture by Superintendent Womack in his Conway (Arkansas) Survey by taking his ungraded class and lining them up for height from low to high. The middle child was standing on a drain which ran out white in front, and there were two gaps in the line to indicate the quartiles. Any reader could look at the line of pupils and quickly get a clear notion of what was meant by median height.

"The point above which and below which fifty per cent of the cases fall."¹

"The median is the case which was found in the investigation to have as many cases below it as there are above it."²

A very effective translation of the median can be secured by using a description of a typical person or school, which will have the median amount in each of a number of different qualities. The first example of this use of the median ever noted by the author was the description of the typical teacher in Professor Coffman's Social Composition of the Teaching Population, pages 79-80. A portion of this description follows:

The typical American male public school teacher ... is twenty-nine years of age, having begun teaching at almost twenty years of age, after he had received but three or four years of training beyond the elementary school. In the nine years elapsing between the age he began teaching and his present age, he has had seven years of experience, and his salary at the present time is $489 a year. Both his parents were living when he entered and both spoke the English language.
They had an annual income from their farm of $700, which they were compelled to use to support themselves and their four or five children.

His first experience as a teacher was secured in the rural schools, where he remained for two years at a salary of $390 per year. He found it customary for rural teachers to have only three years of training beyond the elementary school, but in order for him to advance to a town school position, he had to get an additional year of training. Etc., etc.

¹ Cubberley, E. P.: "Survey of the Organization, Scope and Finances of the Public School System of Oakland, California," Board of Education Bulletin, No. 8, June, 1915.
² Elliff, J. D.: "A Study of the Rural Schools of Saline County, Missouri," University of Missouri Bulletin, Vol. 16, No. 22, p. 8, footnote.

This has been imitated since in many cases. Thus, in the American School Board Journal, for August, 1917, page 70, there is a description of the typical Iowa high school principal, based on medians. The author has had this device used by many of his students in writing up investigations.

4. The Mode

The mode may be translated as follows:

A certain article of clothing is said to be in "fashion" when more people wear it than do without it. Likewise in a distribution, the mode is the "fashion" in cases; more appear there than anywhere else.

5. Spread or Dispersion or Variability

There are wide differences in the wealth of the people of this country. Wealth of individuals varies all the way from Rockefeller and his millions to the poor street beggar. This variation in statistics is called the spread or dispersion. Now some agitators, if they had their way, would eliminate this spread by making all people's wealth equal. In the same way there are all sorts of variations in children in school work, for any particular line of work.
It would be just as great an error as that of the agitators to reduce the variability in any one line so that all children were considered equal in performance.

Professor J. F. Bobbitt, in the School Review for October, 1915, page 508, uses the translation "zone of safety" to indicate that on high school costs of instruction, it would be well to try to get within the middle 50 per cent. That is, he translates the spread between the quartiles by "zone of safety."

6. Correlation

The Biblical phrase, "the first shall be last and the last shall be first," might be used to good advantage in translating a perfect negative correlation.¹

The following is a good translation of a coefficient of correlation of .48 between abilities in shop practice and abilities in drawing:

There is marked evidence that abilities in shop practice and drawing accompany each other. Students above the average in one group will tend to be above the average in the other. It is not known specifically in what way the two abilities are centrally connected, or to what extent the presence of either one is an indication of the other.²

¹ Suggested by one of the writer's students, Mr. L. A. Sharp.
² Rugg, H. O.: Statistical Methods Applied to Education, p. 257.

EXERCISE

Take the school report or school survey used in the exercises on pages 233 and 302. Write out a detailed criticism of the translations of statistics or lack of them in it from the standpoint of their effectiveness with the public, showing just why they are good or liable to be unsuccessful. In the cases of the unsuccessful ones, or failure to use translations where desirable, make up translations that would present the same data properly.

REFERENCES FOR SUPPLEMENTARY READING

Ellis, A. Caswell. "The Money Value of Education." United States Bureau of Education Bulletin, 1917, No. 22.
McAndrew, William. The Public and Its School.

SELECTED AND ANNOTATED BIBLIOGRAPHY

The aim in this is to give a minimum list of the simpler and more easily accessible materials.

I. Statistical Method

Chapman and Rush. The Scientific Measurement of Classroom Products. Silver, Burdett and Company, Boston, 1917. Contains excellent brief chapters on the theory of scales, their application in schools, and dangers incident to their use. Other chapters present the more important scales for measuring work in the formal subjects in the elementary school, describe processes for getting results, and show how the results may be used to better classroom work.

Elderton, W. P. and E. M. Primer of Statistics. A. & C. Black, London, 1910. A brief, very simple, and readable treatment, with no special reference to education.

King, W. I. The Elements of Statistical Method. The Macmillan Company, New York, 1915. An elementary, concise, straightforward treatment, but adapted more to economic or historical work than to school problems.

Monroe, W. S. Educational Tests and Measurements. Riverside Press, Cambridge, Mass., 1917. A treatment which works in many of the elements of statistics very simply and forcibly, under discussions of various tests and scales.

Rugg, H. O. Statistical Methods Applied to Education. The Riverside Press, Cambridge, Mass., 1917. An admirable book for its general purpose, emphasizing the problems of the school administrator. The statistical part proper, while written as much as possible in a non-technical style, is naturally carried to a much greater refinement and intricacy than are necessary in preparing school statistics for publicity. However, the last chapter is very fine for publicity work. The bibliography covers all the main problems which superintendents need to study quantitatively, especially surveys, and is alone worth the price of the book.

Thorndike, E. L. An Introduction to the Theory of Mental and Social Measurements.
Teachers College, Columbia University, New York, 1913. A complete exposition of things needed in the fields indi- cated by its title. It has few direct applications for the admin- istrator, and is extremely difficult for the beginner in statistics. II. Calculating Tables Crelle, a. L. Rechentafeln. G. Reiner, Berlin, new edition, 1907. Gives products to 1000 by 1000. Peters, J. Neue Rechentafeln fiir Multiplikation und Division. G. Reiner, Berlin. III. Exercises and Problems Thorndike's Mental and Social Measurements and Rugg's Statistical Methods Applied to Education have problems in various places, some of which may be easily adapted for practice work. RUGG, H. O. Illustrative Problems in Educational Statistics. Pub- lished by the author. University of Chicago Press, 1917. IV. Graphic Methods Brinton, W. C. Graphic Methods of Presenting Facts. Engineering Magazine Company, New York, 1914. An excellent non-technical treatment, profusely illustrated. While not written especially for school men, its conclusions and suggestions are easily adapted to school problems. Ellis, A. Caswell. The Money Value of Education. Bulletin of the United States Bureau of Education, 1917, No. 22, Washington. Contains numerous charts used in educational campaigns. RuGG, H. O. Statistical Methods Applied to Education. The River- side Press, Cambridge, Mass., 1917. Chapter X presents numerous good examples. Selected Bibliography 319 V. School Reports, General Treatments Bliss, D. C. Methods and Standards for Local School Surveys. D. C. Heath and Company, Boston, 1918. A simple but adequate treatment of the topics indicated, which will be of great value to the superintendent. Has many good tabulations and some graphs. Giles, J, T. A Statistical Study of School Reports from the Twenty- five Largest Cities of Indiana. Educational Administration and Supervision, Vol. II, pp. 305-311. Hanus, Paul H. School Efficiency. A Constructive Study. School Efficiency Series. 
World Book Company, Yonkers-on-Hudson, N. Y., 1913. A study of twenty-six widely selected city reports in the United States.

Snedden, David S. and Allen, William H. School Reports and School Efficiency. The Macmillan Company, New York, 1908. A pioneer book in this field, now useful chiefly for its suggestions on what to include in a report and on using tabulations.

VI. School Reports and Surveys Especially Valuable from a Publicity Standpoint

The publisher is indicated for each item given here. For a copy of any other survey or report mentioned in the body of the text, address the superintendent of the school system concerned.

Alabama. An Educational Survey of Three Counties in Alabama. Department of Education, Montgomery, Ala., 1914.

Boston, Massachusetts. Report of a Study of Certain Phases of the Public School System of Boston, Massachusetts. Teachers College, Columbia University, New York City, 1916.

Butte, Montana. Report of a Survey of the School System of Butte, Montana. By Strayer, G. D., and others. Board of School Trustees, 1914.

Cleveland, Ohio. The Cleveland Education Survey. Ayres, L. P., Director. Published in twenty-five separate monographs by the Survey Committee of the Cleveland Foundation, Cleveland, Ohio. The following are especially valuable:
    Child Accounting in the Public Schools, by Ayres
    Financing the Public Schools, by Clark
    Measuring the Work of the Public Schools, by Judd
    The Cleveland School Survey (summary), by Ayres

Dansville, New York. A Study: The Dansville High School. By Foster, J. M. F. A. Owen Publishing Co., Dansville, N. Y.

Denver, Colorado. Report of the School Survey of School District Number One in the City and County of Denver. Part I, General Organization and Management; Part II, The Work of the Schools; Part III, The Industrial Survey; Part IV, The Business Management; Part V, The Building Situation and Medical Inspection. The School Survey Committee, Denver, Colorado, 1916.

Des Moines, Iowa. Annual Report of the Des Moines Public Schools. For the year ending July 1, 1915. Board of Education, Des Moines, Iowa.

Ellis, A. Caswell. The Money Value of Education. Bulletin of the United States Bureau of Education, 1917, No. 22.

Grand Rapids, Michigan. School Survey. By a large staff. 1916.

Janesville, Wisconsin. An Educational Survey. By Theisen, W. W., and staff of state department. Published by State Department of Public Instruction, Madison, Wisconsin.

McAndrew, William. The Public and Its School. World Book Company, Yonkers-on-Hudson, N. Y., 1916.

Minneapolis, Minnesota. Three Monographs on School Finance in Minneapolis. By Spaulding, F. E. Board of Education, Minneapolis, Minn.:
    A Million a Year
    Financing the Minneapolis Schools
    The Price of Progress

Newburgh, New York. The Newburgh Survey. Department of Surveys and Exhibits, Russell Sage Foundation, 128 East 23d Street, New York City, 1913.

Newton, Massachusetts. The Newton Public Schools. By Spaulding, F. E. 1912 and 1913, Newton, Mass. (Out of print.)

New York City. Report of Committee on School Inquiry. By Hanus, Paul H., and others. School Efficiency Series, World Book Company, Yonkers-on-Hudson, N. Y.

Ohio. Report of the Ohio State School Survey Commission. By Campbell, M. E., Allendorf, W. L., and Thatcher, C. J., 1914.

Portland, Oregon. The Portland Survey. By Cubberley, E. P., and others, 1913.

Report of the Committee on Uniform Records and Reports. U. S. Bureau of Education Bulletin, 1912, No. 3.

Rockford, Illinois. A Review of the Rockford Public Schools, 1915-1916. Board of Education, Rockford, Illinois, 1916.

Salt Lake City, Utah. Report of a Survey of the School System of Salt Lake City, Utah. By Cubberley, E. P., and others. Board of Education, Salt Lake City, 1915.

San Antonio, Texas. The San Antonio Public School System. By Bobbitt, J. F. The San Antonio School Board, 1915.

Springfield, Illinois.
The Public Schools of Springfield, Illinois. By Ayres, L. P. Division of Education, Russell Sage Foundation, New York City, 1914.

St. Louis, Missouri. Report of Survey of St. Louis School System. Board of Education, 1917.

Texas. A Study of Rural Schools in Texas. By White, E. V. and Davis, E. E. University of Texas, Austin, Texas, 1914.

United States. A Comparative Study of Public School Systems in the Forty-eight States. Division of Education, Russell Sage Foundation, New York City, 1912. This is now out of print, but copies will doubtless be available for the superintendent at his state department of education or state university. Copies were sent to all the members of the state legislatures at the time of its issue, 1912.

INDEX

Note. To save space, the words education, publicity, school, statistics, and teachers are in the main omitted from this index. In the numerous combinations in which they naturally occur, look for the next most significant word.

Absence, translation for, 307
Absurdities in rank-order combinations, 198
Accuracy, 187-190, 230; errors in attempts at too great accuracy, 16-19; in graphs, 283; in translations, 308
Advertising, school, error in basing upon exceptional graduates, 23
Age-grade tables, emphasis in, 228
Age-progress, table form for, 229; circle graphs for, 251
Age, school, error in determining, 5
Ages of pupils, distribution table for, 120; graphs for, 121
Aikins, Professor, 47
Alabama, Survey of Three Counties, 239, 250, 288, 289, 290
Allen, W.
H., 25, 27, 33, 72, 78, 201, 202, 213, 216
Alphabetical order in tables, 221-222
Alternate columns in tables, 218
Amarillo, Tex., 68
American Book Company, graph, 241
Area graph for comparison on component parts, 245; for comparisons with circles, 251
Arithmetic tests, 46, 47, 49; surface of frequency for, 117; frequency table for, 134; table for results, 232; variability in, 171
Arkadelphia, Ark., 8
Artistic features in tables, 220
Association of science and mathematics teachers, table to show growth, 232
Atlanta, 300, 301
Attendance, at teachers' associations, error in computing, 14; school, blanks for showing, 72, 73; error in indefinite units, 4; perpetual graph device for showing, 298; problems of, 34
Automobile graph, 290, 296
Average, 141-147; advantages of, 145; computation of, long method, 141-142; computation of, short method, 142; definition of, 141; disadvantages of, 146; errors in computing, 19-21; graphic representation of, 144; translation for, 313
Average deviation, 156
Averages, deviations from, 21-22
Ayres, L. P., 5, 166, 241, 243, 244, 251, 259, 291, 292, 310
Ayres handwriting scale, 53
Ayres spelling scale, 106
Background for graphs, 241, 283
Bagehot, Walter, 90
Baltimore Survey, 248
Bar graph, 237; cartoon effect, 239, 260, 288; comparisons with, 243, 244, 253; component parts, use for, 239; curve effects with, 263; order of items for, 241; right and left form, 247, 248
Bi-modal distribution, 147
Bird's-eye view through tabulation, 209
Birmingham, Ala., 210, 298
Black, use of, on maps, 293
Blanks, 71-78; making, 78; examples of good, 78; revision of, 42; vs. card index, 79-81
Bloomington, Ind., 172
Bobbitt, J. F., 17, 22, 69, 70, 86, 95, 96, 101, 162, 167, 315
Bobbitt table, 18, 96, 103, 104, 127, 130, 147, 152, 153, 156, 161, 162, 165, 169, 223, 225, 226, 240, 293, 295
Bold-face type, use in tables, 219, 227
Boston, circulation of school report, 27
Boston Report, 245, 246, 269
Bowley, A.
L., 189
Bridgeport Survey, 229, 255, 257, 258
Brief, use of, in planning, 42
Brinton, W. C., 116, 242, 272, 276, 286; graphic methods of presentation, 242; rules for graphic presentation, 272-275
Buffalo, 213
Buildings, lack of, cartoon effect for, 301; problems on, 34; table for, 213; units and scales for, 53
Butte Survey, 117, 128, 130, 167, 293
By-products in collecting data, 41
Calculating devices, 191
Calculating tables, 191
Calculation, economies in, 190-192
Card index vs. one blank, 79-81
Carelessness in securing data, 15
Cartoon effects in graphs, 256, 287, 301
Cartoons, economies in making, 296
Census, school, problems of, 34
Central tendency, measures of, 124-147
Charts, time, 261
Checking, in calculations, 190; on blanks, 84
Chicago, University of, statistical method in School of Education, 29; blank for rating teachers, 270
Cincinnati, 214
Circle graph, 235, 259; for comparisons, 250, 254, 255, 256, 259; with cartoon effect, 258, 259
Cleveland, 5, 36, 174, 214
Cleveland Survey, 118, 154, 159, 251, 253, 256, 282, 294, 313
Coefficient of correlation, calculation of, 184, 185; examples of, 183; meaning of, 181, 182
Coefficient of variability, 170
Coffman, L.
D., 314
Collecting data with high school students, 86-88
College degrees, error in comparisons with, 8
College salaries, error in getting average, 19
Columbus Dispatch, 290
Committee on standards for graphic presentation, 267
Committee on uniform records and reports, 58
Comparisons, simple, 240 (see also Relationships, 164-186); using component parts, 14, 266, 267; using percentages, 13, 14; using indefinite units, 3-16; using relative position, 269; using unsound treatment, 3-16; with bar graphs, 239, 240, 243, 244, 253, 266, 267, 269; with circle graphs, 250, 256, 258; with cartoon effects, 256, 258; with curves, 104, 262, 263, 266, 267; with triangle graphs, 257
Component part graphs, 236, 238, 239, 242, 245, 254, 255
Composition scales, 46, 53
Composition tests, frequency table for, 131; graph for, 168
Concentric circle graph, 252
Constant errors, 188
Contests, judging, 11-12, 52, 192 ff.
Continuous series, 48; graph for, 165
Conway, Ark., Survey, 314
Correlation, 173 (see also Coefficient of Correlation); graphic devices for showing, 175-181; like-signs, table for, 178-179; translation for, 316
Cost of instruction, 16, 17, 18, 48, 55, 70, 96, 101, 104, 105, 130, 150, 222, 223, 244, 307, 308
Cost of maintenance, 218, 264
Cost per pupil, 263
Cost records, cartoon effect to show value of, 287
Costs, cartoon for, 290; errors in computing, 16; graphs for, 288, 292; sampling for, 68; translation for, 307, 308; units and scales for, 55
Courtis arithmetic tests, 46, 47, 142, 143, 149, 199; frequency table for, 134; graphs for, 172, 266
Crayons for graphs, 296
Crelle tables, use of, 191
Cross-section paper, for computation, 190; for large graphs, 296
Cubberley, E.
P., 263, 265, 314
Current school reports, 200
Curve effects with bar graphs, 263, 266
Curves, arbitrary signs for, 166; for comparisons, 262; for discrete scales, 104; standards for, 266-268
Data, carelessness in securing, 15; collection of, 33-89; economies in collecting, 82-88; sources of, 58-61
Dearborn, W. F., 106, 107, 108, 180
De Voss, J. C., 271
Defectives, unanalyzed total for, 3; number of, translation for, 309
Degrees, college and university, error in comparisons with, 8
Des Moines, 207, 210, 219, 257
Detroit, 217
Deviation, coefficient of, 170; measures of, 149-163; graphic representation of, 101, 137; measure for given distribution, 162; translation for, 315
Dimmitt, R. L., 298
Discrete scale or series, definition, 48; graphic representation of, 103-105, 164
Dispersion (see Deviation)
Distance, map to show by time elements, 280
Distribution, of cases on a map, 277; of pupils, age-grade blank for, 73; of school moneys, error in, 5; of time in elementary grades, graph for, 236; tables, 106-111; graphing, 111
Dollar circle graphs for component parts, 254
Dollar graph for proportionate parts with cartoon effects, 258, 259
Dollar proportionate parts table, 230, 231
Dollar-sign graph, 260, 289
Dots for bar graphs, 292
Double distribution table, 72
Double entry table for receipts and payments, 211
Economies, in collecting data, 82-88; in graphs, 296
Education, value of, graph to show, 247; translation for, 311
Educational investigations, knowledge needed for, 94
Efficiency record of teachers, summarizing graph for, 270
Elimination, bar graph for, 243; of high school students, table for, 217; problems of, 34
Elliff, J. D., 314
Emphasis in tables, 227
Enrollment, errors in, 3-5; graphs, 248, 249, 252, 253, 261, 294; map devices to show distribution, 277, 278; N. E. A.
blank for, 74; translation for, 310; units and scales for, 57
Errors, constant, 188; variable, 187
Estimating reliability, 189
Exaggerations in graphs, 283
Expenditure graphs, 245, 259, 265, 269
Expenditures, average, 62; errors in comparisons, 4; problems of, 34; proportionate units and scales for, 54; sampling for, 68, 70; tables, 214, 215; translation for, 308 (see also Costs)
Extreme range variation, 149
Eyestrain, avoiding, 218, 227
Fisher, Irving, 242
Florida, 20
Ford automobile translation, 306
Fractions, errors in getting too small, 17
Frederic, Wis., 88
Frequency, surface of, 111, 116, 123; multi-modal surface of, 125; normal surface of, 117; skew surface of, 118-122; tables, 106-111
Georgia, University of, 279
Giles, J. T., 200
Grading pupils, 10, 11, 21, 22, 63, 66, 107, 108, 170, 196; blank for studying teachers' standards, 76
Graphs, 234-301; check list for, 274; economies in making, 296; examples of good, 288; size of, 282; standards for, 266, 272, 281; summarizing, 268
Gray, W. S., 48
Grounds, size of, cartoon graph for, 260; translation for, 310
Grouping, 41, 107-110, 210
Gummed letters for graphs and charts, 297
Haggerty, M. E., 171, 172, 176, 199
Hammond Survey, 220
Handwriting scales, 45, 50, 53
Handwriting tests, distribution table for, 128; surface of frequency at Cleveland, 118
Hanus, Paul, 25, 27, 200, 201
Harvard-Newton composition scale, 53
Headings, for graphs, 281; for tables, 209, 219; form for printing, 216
Health education, lack of, translation for, 311
Heating, 22
Herrick, W., 26
High school enrollment, graph for, 293
High school training, value of, graph to show, 247
Hillegas composition scale, 53, 131
Histogram or column diagram, 111-116, 122; check form, 114, 115
Illiteracy, cartoon graph to show, 288; error in treatment of, 13, 16
India ink for graphs and charts, 297
Indiana, 199, 200
International Harvester Company graph, 247
Jingle fallacy, 47
Judging contests, 11, 12, 52, 192 ff.
Kelley, F. J., 184, 271
Key numbers, for blanks, 75; for tables, 219
King, W. I., 36, 38, 39, 41, 91, 92, 95, 97, 145, 313
Kirk, Jno. R., 65
Length of school year, bar graph for, 291; translation for, 307
Lettering on graphs, 296
Library Bureau, 59
Library statistics, error in indefinite units, 10
Lighting space, graphs for, 286, 295
Lines, clear in graphs, 297; dividing, in tables, 218, 219
Louisville, 6, 231, 244, 258, 283
Lowry, F. C., 289
MacAndrew, Wm., 300, 304
Maclin, E. S., 300
Maintenance cost, per cent of, table for, 218
Maps, 275 ff.; use of black on, 293
Marking pupils, 10, 11, 21, 22, 63, 66, 107, 108, 170, 196; blank for studying teachers' standards on, 76
Masonic pin translation, 306
Maxwell, Wm., 25, 26, 37
Median, advantages of, 138; computation of, 129-137; definition of, 128; disadvantages of, 139; graphic representation of, 137; translation of, 313
Median deviation, 155
Medical inspection, unanalyzed totals in, 3; problems in, 35
Membership, table for showing growth in, 232
Memphis, 166, 265
Meyer, Max, 11, 63
Minneapolis, 63, 70, 145
Minnesota, 9
Missouri, University of, grading system, 11; State Department of Education, 279
Mobile, 27
Mode, advantages of, 126; definition, 124; computation of, 125; disadvantages of, 127; graphic representation of, 125; translation of, 315
Monroe, W.
S., 136, 271
Monument graph, 249
Multimodal surface of frequency, 125
Nashville, Tenn., 77, 119-121, 280
National Education Association, 58, 74, 78
Neatness in tables, 220
Negroes, schooling for, 15, 307
Newburgh Survey, 259, 260, 261
Newton, Mass., 40, 102, 103, 261, 263, 288, 300
New York Survey, 27, 303
Normal schools, 2,7, 150
Normal surface of frequency, 117-119
Oakland Survey, 218
Ohio Survey, 259
Old Man Ohio cartoon, 290
Omitting important factors, 15
Order of items for bar graph, 241; for tables, 221, 223
Over-age, error in method of determining, 5
Paper letters for charts and graphs, 297
Peabody (George) College for Teachers, 277
Pearson coefficient of correlation, 158, 185
P. E., or probable error, 155
Penmanship (see Handwriting)
Percentage tables, graphing, 167
Percentages, errors in, 13-15
Percentile deviations, 153
Percentiles, 153
Perpetual attendance graph device, 298
Phelps, S. J., 86
Phi Beta Kappa, 11
Planning statistical treatment, 38-43
Playgrounds, cartoon graph for, 260; size of, translation for, 310; units and scales for, 53
Population of school district, units and scales for, 56
Portland Survey, 51, 52, 224, 227, 230, 282
Presentation of school statistics to public, errors in, 24
Probability surface of frequency, 118
Problem, statistical, how to state, 39
Property, value of, translation for, 309; value of, graph for, 289
Proportionate parts table, 231
Public indifference, 26-27
Puckett, W.
F., 5
Q or Quartile Deviation, 151, 153; graphic representation, 137; translation for, 315
Quartiles, 151; graphic representation, 137
Questionnaire method, value of, 59; blank for, 65; sampling for, 66
Range, measures of, 149-163
Rank-order combinations of data, 192-198
Rating of teachers, summarizing graph for, 270
Receipts and payments, problems of, 34; table, 211
Records, 58, 287
Red Cross, 279
Relationships, 164-186
Relative position, bar graph for, 269
Reliability, 187-190
Reports, 25-27, 40
Reproducing graphs for the public, 297
Retardation, error in determining, 6; graph for, 166, 237, 257; problems of, 35; translation for, 306, 310
Revenues, errors in comparison, 15
Rice, Dr., 9, 45
Rockford Review, 231, 236, 252, 264, 279, 285
Rubber stamps for graphs and charts, 297
Rugg, H. O., 67, 79, 182, 316
Ruled blank book, 77
Ruler strip device for use on reports, 85, 86
Rural school work, map for, 279
Russell Sage Foundation, 166, 243, 244, 257, 259, 260, 261
Salaries, janitors', 257; principals', 257; teachers', 22, 109, 110, 125, 150, 154, 290; teachers', error in indefinite units, 6, 7; teachers', units and scales for, 54
Salt Lake City Survey, 122, 125, 168, 219, 232, 237, 282, 293
Sampling, 22-24, 62-71
San Antonio Survey, 17, 22, 86, 167, 225
San Francisco Survey, 66
Scales, 24, 43-53, 100; discrete and continuous, 48; examples of, 53-57; graphic representation of, 101, 285; objective, 44; subjective, 44
Schedules, teachers', error in computing with indefinite units, 7
School problems, 34-37
School statistics, errors in, 1-24; errors in presentation of, 24; need for better, 1-32; value of, for superintendents, 30-31
Scoring data, economies in, 82-84
Semi-inter-quartile range, 151
Sequence in tables, 221
Seymour, F. O., 68
Sharp, L.
A., 316
Shaw-Walker Company, 59
σ = Greek sigma = abbreviation for Standard Deviation
Signs, arbitrary, for graphs, 299
Skew distribution, 148; deviations for, 159
Skewness, 118-122
Skew surface of frequency, 118-122
Smoothed surface of frequency, 119
Smoothing graphs, 113
Snedden, David, 25, 27, 33, 72, 78, 201, 202, 213, 216
South Bend Survey, 209, 221
Spaulding, F. E., 40, 43, 63, 70, 71, 103, 145
Special classes, 35
Spelling, errors in lack of units for, 9; scale, 106; tests, 45, 122, 167, 169
Spread, measures of, 149-163; translation for, 315
Springfield Survey, 211, 243, 257, 313
Standard Deviation, 158
Standard tests, sampling in, 66
Standards, errors in striving for, 17-19
Statistical method, reliability of results with, 187, 189, 190; value of, and when to use, 19-24, 28, 29, 30, 33, 36, 37, 38-40, 91-93, 95, 97-99, 100, 200
Step method, economy for computation, 191
Step on a scale, meaning of, 24, 49
Stone tests, 117
Strayer, G. D., 62, 79, 119
Street maintenance, table for, 225
Students, use of, for statistical work, 192, 299
Subjective scales, 46
Summarizing data on blanks, 75, 76
Summary tables, 221
Summer session enrollment, error in indefinite units, 3
Supervision, problems of, 35
Surface of frequency, 111-122
Symmetrical distribution, 117
Tables, distribution, 106; of frequency, 106; series of, 220
Tabulation, 209, 233; for the public, 203, 204, 206-233
Taxes, increase in, bar graph for, 261; increase in, translation for, 311
Tax rate, errors in comparison with, 7, 8; graphs for, 102, 265; table for, 224; units and scales for, 56, 57
Teachers College, use of statistical method at, 28
Teaching staff, units and scales for, 54
Technical methods needed in school statistics, 19-24, 90-99
Tests, rank-order combinations for, 192; summarizing graphs for, 168, 271
Textbooks, graph for cost of, 242
Thermometer graph, 102, 288
Thorndike, E.
L., 30, 45, 46, 47, 48, 49, 79, 92, 109, 144, 189, 191, 206
Thorndike handwriting scale, 45, 53
Tie rankings, 195
Title of graph or chart (see Headings)
Time charts, 261
Time spent on each subject, units and scales for, 56
Time unit for translations, 306
Totals, forms for emphasizing, 216; unanalyzed, error in, 2-3
Training of teachers, graphs for, 256, 259; units and scales for, 54
Translation of statistics for the public, 303-316
Triangle graphs, 254
Truancy, problems of, 35
Two-way tables, 72
Type, measures of, 124-147
Unclassified items table, 214, 215
Unequal things, errors in considering equal, 9-10, 47
Uniform records and reports, 233
Units, errors in, 3-12, 16; examples of, 53-57; how to determine, 43-57
Updegraff, Harlan, 70
U. S. Bureau of Education, 59, 120, 247, 248, 311
Uselessness of statistics in school reports, 200
Valedictorian, determining, 192
Valid scale, 47-48
Vanderbilt University, 279
Variability, of children, graph for, 293; coefficient of, 170; translation for, 315
Variable errors, 187
Variation, measures of, 149-163
Variations, errors in neglecting, 21-22; in size of type for tables, 232
Variety in graphs, 286
Virginia, 9, 286
Waste, in school statistics, 24-26; in teaching, translation for, 311
Wax crayons for graphs and charts, 296
Wealth, units and scales for, 56; real and assessed behind each $1 spent on schools, table for, 51, 52; real, table for, 223
Webb, H. A., 306
Weighing machine cartoon, 289
Weighting factors in rank-order combinations, 197
Wisconsin, Experiment Station, 287; State Department of Public Instruction, 88
Withdrawals, error in determining age, 6
Womack, J. P., 314
Y. M. C. A., 280, 308, 309
Zero line in graphs, 263, 283, 284
Zero point on scale, 46, 47
Zeros, use in tables, 216, 217
Zone of safety, 18, 315