History of Statistics
In modern society, statistics plays an important role in the management of the economy. Throughout many centuries of existence, regardless of the level and stage of economic development and the nature of the political system, statistics has always acted as a necessary and effective tool of public administration and, at the same time, as the science investigating the quantitative aspect of mass phenomena. Statistics as a practical human activity arose in ancient times. Its emergence and development were driven by public needs such as counting the population and recording property. Nowadays, statistics has a well-developed structure and a wide range of application fields.
Currently, there are about a thousand definitions of the term “statistics.” As Coetzer reports, for a long time, philosophers, mathematicians, economists, sociologists, political leaders, and statisticians have tried to define the meaning of statistics as a science. The term is derived from the Latin word “status”, which means “a certain state of affairs”, and it was originally used as a word meaning “political science.” Today the term is used with several meanings. First, statistics refers to the practical activities aimed at the collection, storage, processing, and analysis of numerical data characterizing the population, economy, culture, education, and other phenomena of social life. Second, statistics refers to a special science, namely the branch of knowledge studying the phenomena of the life of society in their quantitative aspects. Finally, as an academic discipline, statistics is a crucial part of the curriculum for training highly qualified businesspeople, managers, and economists.
The collection of statistical data began in ancient times. According to Coetzer, a later period added the processing and analysis of statistical data, which led to the emergence of statistics as a science. In ancient Rome, free citizens and their property were recorded by gender and age, and data on the state of industry and agriculture were collected. In the ancient world, newborns were registered, and special lists recorded the young men who had reached the age of military service (18 years old). Land lists (inventories), which combined data on buildings, slaves, cattle, stock, and incomes, were also compiled.
J. Graunt (1620-1674), W. Petty (1623-1687), and E. Halley (1656-1742) were the founders of the English school of political arithmeticians. Two directions prevailed in their works: the demographic direction, with a bias toward life insurance issues, created by Graunt and Halley, and the statistical and economic direction created by Petty. Graunt was the first to discover regularities in mass public events and to show how to process and analyze large amounts of primary data (Stephenson). He was also the first to attempt to construct a mortality table for the population; thus, he is considered the founder of demography. He calculated averages, i.e. relative frequencies, and detected a certain stability in them. Graunt’s statistical practice was based upon averaging, and his idea of averaging is still used today in dealing with randomly generated data. This approach detects “stable laws”, which are also called “statistical laws” (“History of Statistics”).
Graunt’s and Petty’s ideas were taken up by E. Halley and A. De Moivre in England and spread throughout Europe. Halley, the famous English astronomer, suggested the idea of the law of large numbers and applied methods for removing random deviations. Laplace and Poisson combined these ideas with the theory of probability, turning statistical methods based on averaging into much more mathematically sophisticated approaches to calculation (“History of Statistics”).
According to Stochastikon Encyclopedia, Petty devoted a number of scientific works to statistics. He aimed to estimate specific phenomena numerically despite the obvious lack of numerical data. Petty was the first to introduce the expression “Political Arithmetic”, and his most famous work was Political Arithmetic, published in 1690. In this publication, he intended to represent all political, social, and natural components of a state by means of numbers in order to create a proper basis for decision making and avoid political disputes. Political arithmeticians sought to characterize the state and development of society by means of figures and to reveal the regularities in the development of public phenomena that appeared in mass material. The goals and objectives of these scientists are close to the modern understanding of the essence of statistics (“History of Statistics”).
In the 19th century, a third direction of statistical science emerged, called statistical and mathematical. A special contribution to the development of this direction was made by the statistician A. Quetelet (1796-1874). He considered statistics a part of social physics, namely the science studying the laws of the social system by means of quantitative methods. His major merit is the justification of the idea that regularities revealed from a large number of cases are a crucial tool for learning about the objective world. It was he who defined the subject of statistics (the mass phenomena connected with the life of society and the state) and saw in it a tool of social knowledge. He also made a significant contribution to the theory of the stability of statistical indicators and revealed the essence of statistical methods (Stigler 52-65).
A significant contribution to the development of statistics was made by the English scientists F. Galton (1822-1911), K. Pearson (1857-1936), W. Gosset (1876-1937), R. Fisher (1890-1962), and others. They regarded probability theory, one of the branches of applied mathematics, as the basis of statistics. Galton was seriously interested in the issue of heredity, to the analysis of which he soon applied statistical methods. Among other things, he developed the use of percentiles. Pearson also conducted much fruitful research in statistics. Along with Galton, he made a significant contribution to the development of the theory of the quantitative assessment of association (correlation theory). Pearson developed the chi-square criterion for testing statistical hypotheses. Gosset, who published under the pseudonym “Student”, developed small-sample theory (Stigler 13-50). The research of these scientists had a substantial impact on modern statistics. At the end of the 18th century, political arithmetic was revived, having changed its name to statistics. On the basis of the “normal law” and the “concept of error”, statistics reached its first apogee in the 19th century with the emergence of the notion of the “average man” and its subsequent application in the newly created social sciences.
In the West, the most famous 20th-century scientist in the field of statistics was R. Fisher (1890-1962). Fisher developed methods of quantitative analysis, including the analysis of variance, the theory of experimental design, and the maximum-likelihood method of parameter estimation. Much of his research has had a significant impact on modern statistics (Chatterjee 39). In the first decades of the 20th century, a new mathematical result appeared that put the calculus of probabilities and statistics on a firmer footing. In 1933, the Russian mathematician A. N. Kolmogorov published his book “Foundations of the Calculus of Probabilities”, which axiomatized probability theory and thereby gave statistics a rigorous basis. Since the 1930s, data have served as the link between science and mathematics; thus, “mathematical statistics” and “probability theory” were combined into statistics (Chern and Hirzebruch 162-164).
Having appeared in England in the 17th century and spread widely across Europe in the 18th century, statistics began its active development in the United States in the early 20th century. The vital statistics of the USA were developed on the basis of a cooperative relationship between the federal government and the states. Since the end of the 1970s, this relationship has included a formal arrangement called the Vital Statistics Cooperative Program. The vital statistics of the USA are based upon the local registration of vital events such as births, deaths, fetal deaths, marriages, divorces, and abortions (Weed 527-539).
However, the registration of births, deaths, and marriages itself began much earlier, with a registration law passed by Virginia in 1632. The registration law was modified by Massachusetts in 1639. Later, the Constitution of the USA was framed, and it provided for a decennial census. In the latter half of the 19th century, questions about vital events were included in the decennial censuses in order to obtain national statistical data. However, this method was found inadequate, and its results were insufficient. In 1915, national birth registration was started, and by 1933, all states were registering births and deaths with adequate coverage and providing the central statistical bureau with the data required to create national birth and death statistics. In 1946, the US Public Health Service took over responsibility for collecting, processing, and publishing vital statistics at the federal level (Weed 527-539).
In 1989, the U.S. Standard Certificate of Live Birth was revised, and some items were added to the “Information for Medical and Health Use Only” section, such as tobacco and alcohol use, obstetric procedures, weight gain, abnormal conditions of the newborn, and method of delivery. Since 1989, the National Center for Health Statistics has published fertility data tabulated by the mother’s race. During the 20th century, the cause-of-death statistics of the United States have been compiled and classified in accordance with the International Statistical Classification of Diseases, Injuries, and Causes of Death. The American system of statistics has since dramatically improved the timeliness of the data provided by the National Center for Health Statistics. Methodological issues were addressed through full automation (Weed 527-539).
The present stage in the development of statistical methods is usually dated to about 1900, when Pearson launched the journal Biometrika. In the early 20th century, parametric statistics were actively applied. Methods based on the analysis of data drawn from Pearson’s n-parameter family of distributions were studied, with the normal distribution being the most popular. Pearson’s, Student’s, and Fisher’s criteria were used to test hypotheses. The maximum-likelihood method and the analysis of variance were proposed, and the main principles of experimental design were formulated (Wilcox 2-7).
Currently, scientists differentiate between practical (applied) statistics and mathematical statistics. Practical statistics is a methodological discipline that forms the center of statistics. When its methods are applied to specific areas of expertise and branches of the national economy, scientific and practical disciplines such as “statistics in medicine” or “statistics in industry” appear. From this point of view, econometrics is the application of statistical methods in economics. Mathematical statistics provides the mathematical foundation for practical statistics (Calvert).
Statistical methods of data analysis are applied in virtually all spheres of human activity. They are used whenever it is necessary to obtain and justify statements and judgments about objects or subjects with some internal heterogeneity. Statistical methods are a very useful tool for research in various areas such as economics, business, the social sciences, and medicine. The statistical data used in applied research are of different types. For example, a sample element can be a function that describes the dynamics of an indicator, namely its changes over time: a patient’s electrocardiogram, the amplitude of engine shaft beats, or a time series describing the dynamics of a certain company’s indicators. The type of data also depends on how the data are obtained. For example, if tests of some technical devices run only up to a certain time point, the result is so-called trimmed (censored) data: a set of numbers giving the operating time to failure of the devices that failed, together with the information that the other devices were still operating when the test ended. Trimmed data are often used in the assessment and control of the reliability of technical devices (“Applications of Statistics”).
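The structure of trimmed data from a time-limited test can be illustrated with a short sketch. The lifetimes, the test length, and the function name censor are all hypothetical and chosen only for illustration; the point is merely that each device yields either a failure time or the fact that it was still running when the test stopped.

```python
def censor(lifetimes, test_end):
    """Turn true lifetimes into what a time-limited test records.

    Each device yields a pair (observed_time, failed):
    - (lifetime, True) if it failed before the test ended;
    - (test_end, False) if it was still running when the test stopped.
    """
    return [(min(t, test_end), t <= test_end) for t in lifetimes]


# Hypothetical true lifetimes in hours (unknown to the experimenter in practice).
true_lifetimes = [120.0, 340.0, 95.0, 510.0, 780.0]

observed = censor(true_lifetimes, test_end=500.0)
failure_times = [t for t, failed in observed if failed]
still_running = sum(1 for _, failed in observed if not failed)

print(observed)
print(f"{len(failure_times)} failure times recorded, "
      f"{still_running} devices still running at the end of the test")
```

Any reliability estimate built on such data has to use both parts of the record, since discarding the devices that survived would understate the typical operating time.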
Besides, it is very instructive to compare the probabilistic and statistical models applied in various areas, to find their similarities, and at the same time to note some distinctions. For example, the problem statements and the statistical methods applied to solve them are similar in medical research, certain kinds of social research, and market research, that is, in medicine, sociology, and marketing. These are often grouped together and called “sampling analysis”. The application of statistical methods and models to the analysis of specific data is closely related to the problems of the corresponding area. For instance, studying the dynamics of price growth by means of inflation indices calculated from independently collected information is of interest primarily from the point of view of economics and economic management (both at the macro level and at the level of individual organizations) (“Applications of Statistics”).
Statistics is also widely used in school management. For example, statistics provides data on all changes in the demography and population of a school and helps to collect and process assessments in order to improve the school system. Statistical analysis may be used to measure the collective educational performance of students and to understand possible correlations between their performance and factors such as socioeconomic background. In sports, statistics provides a clear summary of sports events by means of well-tabulated scores and other parameters. In social life, statistics provides the government with more information about mass public events and citizens. With proper analysis of statistical results, many social reforms may be initiated in order to improve the standard of living. In science, statistics helps to monitor and control epidemics and diseases, as well as to assess medical practices and the efficacy of drugs (Habib).
The mass character of social laws and the distinctive way they operate make it necessary to study aggregate data. The law of large numbers arises from the special properties of mass phenomena. The individual phenomena in a mass differ from each other on the one hand and, on the other hand, have something in common because they belong to a certain class. Moreover, single phenomena are more susceptible to random factors than their population as a whole. The law of large numbers states that the quantitative regularities of mass phenomena show themselves distinctly only when the number of observations is sufficiently large. Thus, the essence of this law is that the numbers obtained from mass observation display certain regularities that cannot be found in a small number of facts (Dinov, Christou, and Gould).
The law of large numbers expresses the dialectic of the random and the necessary. As a result of the mutual compensation of random deviations, averages calculated for values of the same kind become typical, reflecting the action of constant and essential factors under the given conditions of time and place. The tendencies and regularities revealed by means of the law of large numbers are valid only as mass tendencies, not as laws for each separate case. The stability of mass phenomena observed in practice became a subject of theoretical research, which resulted in the proof of a number of theorems. Chebyshev’s inequality and Bernoulli’s theorem are among them (Dinov, Christou, and Gould).
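The stabilization of relative frequencies described above can be observed in a small simulation. This is only an illustrative sketch: the event probability p, the tolerance eps, the seed, and the function names are assumptions made for the example. For each sample size n it prints the observed relative frequency of an event together with the bound p(1 - p) / (n eps^2) that Chebyshev’s inequality gives for the probability that the frequency deviates from p by eps or more.

```python
import random


def relative_frequency(n, p, rng):
    """Share of successes in n independent trials with success probability p."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n


def chebyshev_bound(n, p, eps):
    """Upper bound on P(|frequency - p| >= eps) from Chebyshev's inequality.

    The variance of the relative frequency is p * (1 - p) / n, so the bound
    is p * (1 - p) / (n * eps**2), capped at 1 because it is a probability.
    """
    return min(1.0, p * (1 - p) / (n * eps ** 2))


rng = random.Random(42)  # fixed seed so the run is repeatable
p = 0.3                  # hypothetical probability of the event
eps = 0.05               # tolerated deviation of the frequency from p

for n in (10, 100, 1_000, 10_000, 100_000):
    freq = relative_frequency(n, p, rng)
    bound = chebyshev_bound(n, p, eps)
    print(f"n={n:>6}  frequency={freq:.4f}  Chebyshev bound={bound:.4f}")
```

For small n the bound is uninformative (it equals 1), while for large n it shrinks in proportion to 1/n, which is the quantitative content of Bernoulli’s theorem: a regularity that holds for the mass of trials but says nothing about any single trial.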
Student’s t-test is the common name for statistical tests in which the test statistic has a t-distribution. Most often, t-tests are applied to check the equality of the mean values of two samples. The null hypothesis assumes that the means are equal. All varieties of the t-test are parametric and are based on the additional assumption that the sample data are normally distributed. Therefore, before applying a t-test, it is recommended to perform a normality test. If the hypothesis of normality is rejected, other distributions may be checked; if they do not fit either, it is necessary to use non-parametric statistical tests (Ireland 79-84).
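As a hedged illustration of the two-sample case, the sketch below computes the classical pooled-variance t statistic. It assumes two independent samples with equal variances, which is only one of the varieties of the test; the function name and the measurements are hypothetical. The p-value is not computed, because that requires the t-distribution itself; instead, the statistic is meant to be compared with a tabulated critical value.

```python
import math
import statistics


def student_t_statistic(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Assumes independent samples from normal populations with equal variances.
    Returns the statistic and its degrees of freedom.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 divisor)
    var_b = statistics.variance(sample_b)

    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical measurements from two groups.
group_a = [5.1, 4.8, 5.6, 5.3, 4.9]
group_b = [5.9, 6.1, 5.4, 6.3, 5.8]

t, df = student_t_statistic(group_a, group_b)
print(f"t = {t:.3f} with {df} degrees of freedom")
# For 8 degrees of freedom, the two-sided 5% critical value is about 2.306:
# the null hypothesis of equal means is rejected when |t| exceeds it.
```

The normality check recommended above would have to be carried out separately before relying on the result.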
The Student’s distribution was introduced in 1908 by the English statistician W. Gosset, who worked at a brewery. Probabilistic and statistical methods were used there for making economic and technical decisions; therefore, the administration forbade Gosset from publishing scholarly articles under his own name. In this way, the trade secret, namely the probabilistic and statistical methods developed by Gosset, was protected. However, he was able to publish his works under the pseudonym “Student” (“Statistics” 589).
The story of Gosset-Student shows that a hundred years ago, the substantial economic value of probabilistic and statistical methods was already obvious to managers in Great Britain. Currently, the Student’s distribution is one of the best-known distributions used in the analysis of real data. It is used to estimate a population mean, a predicted value, and other characteristics by means of confidence intervals, and to test hypotheses about the value of a population mean, about regression coefficients, about the homogeneity of samples, etc. (Ireland 79-84).
In summary, the history of statistics shows that statistical science developed through the theoretical generalization of the best practices of registration and statistical work, driven first of all by the requirements of managing the life of society. Besides, statistics is a broad field of experience and an important subject that is useful to people of all ages, including teachers, who wish to broaden their knowledge. Statistics is widely used together with other sciences to reach substantive conclusions. Statistical conclusions apply to groups as a whole rather than to separate individuals. Statistics only summarizes data; it does not interpret them.