Fisher, Ronald Aylmer
Copyright notice |
---|
This article, Ronald Aylmer Fisher, was adapted from an original article by Sandy Zabell, which appeared in StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies. The original article ([http://statprob.com/encyclopedia/RonaldAylmerFISHER.html StatProb Source]) is copyrighted by the author(s); the article has been donated to Encyclopedia of Mathematics, and its further issues are under Creative Commons Attribution Share-Alike License. |
Ronald Aylmer FISHER
b. 17 February 1890 - d. 29 July 1962
Summary. R. A. Fisher transformed the statistics of his day from a modest collection of useful ad hoc techniques into a powerful and systematic body of theoretical concepts and practical methods. This achievement was all the more impressive because at the same time he pursued a dual career as a biologist, laying down, together with Sewall Wright and J. B. S. Haldane, the foundations of modern theoretical population genetics.
Ronald Aylmer Fisher, arguably the greatest statistician of the last (or any) century, was born in Hampstead, England. After attending Harrow, he studied at Gonville and Caius College, Cambridge from 1909 to 1913, later returning to Cambridge, first as a Fellow and then as Professor of Genetics. He died in Adelaide, Australia. Fisher made profound contributions to both theoretical and applied statistics, and to population genetics. His career divides naturally into five periods: 1913 - 1919 (held several minor commercial and teaching positions); 1919 - 1933 (resident statistician at Rothamsted); 1933 - 1943 (Galton Professor of Eugenics, University College, London); 1943 - 1957 (Arthur Balfour Professor of Genetics, University of Cambridge); and 1957 - 1962 (retirement). Throughout his life he received many awards and honors, including the Weldon Memorial Medal (1928), election to Fellowship of the Royal Society (1929), and honorary degrees from Harvard University (1936) and the University of Chicago (1956). Fisher's mathematical genius was already evident during his student days at Cambridge. During these years he also gained his first exposure to Mendelian genetics and the English eugenics movement; his interest in statistical inference was intimately connected to these biological interests, and in later years his output was almost always evenly divided between the two areas.
Initial Contributions, 1913 - 1919
During his first years after graduation from Cambridge, Fisher published two papers of outstanding significance: one on the small sample distribution of the sample correlation coefficient (1915; CP 4); the other linking the Mendelian and Darwinian approaches to genetics (1918; CP 9). (References to Fisher's papers are to the date of publication and their number in his Collected Papers [=CP] (Bennett, 1971-1974).)
During this initial period Fisher was strongly influenced by the work of William Sealy Gosset ("Student"), impressing the latter by producing a rigorous mathematical derivation for Student's $t$-distribution (in his paper Student had only guessed the form of the distribution on the basis of its first four moments). Encouraged by this first success, Fisher turned to the correlation coefficient. Earlier studies by Pearson, Filon, Gosset, and Soper had given approximate, large sample expressions for the mean and standard error of the sample correlation coefficient $r$. Using geometric insights (representing the outcome of a random sample as a single point in an $n$-dimensional space), in a mathematical tour-de-force Fisher was able to derive the exact, small-sample distribution of $r$. Although Karl Pearson accepted Fisher's paper for publication in his journal Biometrika, two years later a "cooperative study" by Pearson and his associates appeared criticizing Fisher's paper on several grounds. Although relations between Fisher and Pearson remained cordial in the immediate aftermath, Pearson rejected another paper of Fisher submitted to Biometrika and this, together with the rejection of Fisher's Mendelian paper by the Royal Society (Pearson had been one of the referees to give a negative report), ultimately led to a sharp deterioration in relations between the two. Pearson, nevertheless, recognizing Fisher's obvious abilities, offered him a position in his University College London statistical laboratory, but Fisher recognized the incompatibility of their personalities; and when a competing offer came from Rothamsted, an agricultural research station in the English countryside, Fisher readily accepted it.
Rothamsted: 1919 - 1933
Fisher's days at Rothamsted were among his happiest and most productive. Enjoying for the first time a secure position, he was surrounded by appreciative and supportive colleagues, and had access both to important problems and extensive statistical compilations of data that needed analysis. In his element at last, Fisher began to develop an extraordinary range of statistical methods fitted into a theoretical superstructure: during the next fifteen years he produced a flood of papers unparalleled either before or after. These fell into four main categories: statistical tests of significance and distribution theory, contributions to theoretical statistics and estimation theory, fiducial inference, and the design of experiments.
a. Statistical tests of significance and distribution theory. Fisher's mathematical skills and geometric insights enabled him to tackle a wide variety of distributional problems, extending the work on the $t$-distribution of Gosset (for whom Fisher had a life-long admiration). One of the most important of these was the distribution of the chi-squared statistic, determining the appropriate number of degrees of freedom if the cell probabilities depended on one or more estimated parameters. Fisher proceeded (in a series of five papers published over the seven-year period 1922-1928) to attack Pearson's use of the chi-squared statistic to test homogeneity, on the (entirely correct) grounds that Pearson had systematically employed an incorrect number of degrees of freedom. The result was a heated dispute, to which Fisher repeatedly returned (CP 19, 31, 34, 49, 62, 188). Other, less controversial but equally fundamental distributional results obtained by Fisher during this period included his paper on testing the significance of regression coefficients (1922, CP 20), the $F$ distribution (1924, CP 36), the independence of the sample mean and variance from a normal population (1925, CP 43), and the sampling distribution of the multiple correlation coefficient (1928, CP 61).
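The degrees-of-freedom point can be made concrete with a small sketch. For a $2 \times 2$ contingency table whose margins are estimated from the data, the chi-squared statistic is referred to 1 degree of freedom, that is $(r-1)(c-1)$, rather than to $rc - 1 = 3$. The counts below are hypothetical, chosen only so that the two conventions give different verdicts at the 5% level; the closed-form tail probabilities for 1 and 3 degrees of freedom are used to avoid any external library.

```python
import math


def pearson_chi2_2x2(table):
    """Pearson chi-squared statistic for a 2x2 table, margins estimated from the data."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_sf_df1(x):
    """Upper tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_df3(x):
    """Upper tail probability of a chi-squared variable with 3 degrees of freedom."""
    return chi2_sf_df1(x) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Hypothetical counts, not taken from any of the papers cited in the text.
table = [[10, 20], [20, 10]]
stat = pearson_chi2_2x2(table)
p_df1 = chi2_sf_df1(stat)  # reference distribution with (r-1)(c-1) = 1 df
p_df3 = chi2_sf_df3(stat)  # reference distribution with rc-1 = 3 df

print(f"chi-squared = {stat:.4f}")
print(f"p-value with 1 df: {p_df1:.4f}")
print(f"p-value with 3 df: {p_df3:.4f}")
```

With these counts the same statistic is significant at the 5% level on 1 degree of freedom but not on 3, which is why the choice of reference distribution mattered in practice.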
b. The theory of estimation. At the same time as he made major strides in distribution theory, Fisher began to craft a systematic theory of estimation, introducing the basic concepts of sufficiency, consistency, and efficiency (1920, CP 12; 1922, CP 18; 1925, CP 42). The method of maximum likelihood was advanced as the practical realization of these three goals; and the entire structure was laid out systematically in Fisher's epochal Statistical Methods for Research Workers (1st ed., 1925). The book illustrated Fisher's genius for the apt example, systematic exposition of efficient computational technique, and ruthless suppression of mathematical justification or derivation. The last chapter of the book, involving the estimation of a genetic linkage parameter, is a masterpiece of exposition; using five competing methods of estimation, the advantages and ease of use of the method of maximum likelihood estimation are contrasted with other, less efficient, more eclectic, or more cumbersome modes. Throughout his life Fisher remained an ardent supporter of the method of maximum likelihood, but his views regarding it evolved. His initial 1922 paper argued its large sample merits, but the 1925 paper began to argue that it had small sample advantages as well, and introduced the concept of information loss (as well as the method of scoring).
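As an illustration of maximum likelihood and the method of scoring in a linkage-type setting, the following minimal sketch assumes a four-category multinomial with cell probabilities $((2+\theta)/4,\ (1-\theta)/4,\ (1-\theta)/4,\ \theta/4)$. This parametrization and the counts are assumptions made for the example, not a reproduction of the data or the layout of the chapter described above. Each scoring step adds the score divided by the expected (Fisher) information, and the result is checked against the root of the likelihood equation, which here reduces to a quadratic.

```python
import math


def score(theta, counts):
    """Derivative of the multinomial log-likelihood with respect to theta."""
    y1, y2, y3, y4 = counts
    return y1 / (2.0 + theta) - (y2 + y3) / (1.0 - theta) + y4 / theta


def expected_information(theta, n):
    """Fisher information for n observations under the assumed cell probabilities."""
    return (n / 4.0) * (1.0 / (2.0 + theta) + 2.0 / (1.0 - theta) + 1.0 / theta)


def mle_by_scoring(counts, theta=0.5, tol=1e-12, max_iter=100):
    """Iterate theta <- theta + score / information until the step is negligible."""
    n = sum(counts)
    for _ in range(max_iter):
        step = score(theta, counts) / expected_information(theta, n)
        theta += step
        if abs(step) < tol:
            break
    return theta


def mle_closed_form(counts):
    """Positive root of n*theta^2 - b*theta - 2*y4 = 0, with b = y1 - 2*y2 - 2*y3 - y4."""
    y1, y2, y3, y4 = counts
    n = sum(counts)
    b = y1 - 2 * y2 - 2 * y3 - y4
    return (b + math.sqrt(b * b + 8.0 * n * y4)) / (2.0 * n)


# Illustrative counts only (an assumption for this sketch).
counts = (125, 18, 20, 34)
theta_hat = mle_by_scoring(counts)
std_error = 1.0 / math.sqrt(expected_information(theta_hat, sum(counts)))

print(f"scoring estimate:     {theta_hat:.6f}")
print(f"closed-form estimate: {mle_closed_form(counts):.6f}")
print(f"approximate standard error: {std_error:.4f}")
```

The reciprocal of the information at the estimate gives the usual large-sample variance, which is the sense in which the estimate is efficient.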
c. Fiducial inference. Pearson was also an exponent of Bayesian methods, and Fisher's rejection of inverse methods and his ultimate development of fiducial inference as an alternative to them was thus yet another assault on the Pearsonian edifice. It is in many ways ironic that Fisher's first paper on fiducial inference, "Inverse Probability" (1930, CP 84), contains little that is controversial. In it Fisher introduced the probability integral transformation, and observed that this transformation often provides a pivotal quantity that can be inverted to obtain interval estimates having any prespecified coverage frequency. It was only later, after Fisher attempted to extend the argument to include multiparameter estimation, that difficulties arose. In 1935 Fisher (CP 125) illustrated the use of a simultaneous fiducial distribution with two examples, one of which was the notorious Behrens-Fisher problem: to estimate the difference in the means $m_1$ and $m_2$ of two normal populations, given that the variances $s_1^2$ and $s_2^2$ of the two populations are unknown. Few could have predicted then that it would generate a debate lasting several decades. Fisher's solution was almost immediately questioned by Bartlett, who noted that unlike the examples involving the $t$-statistic, standard deviation, and correlation coefficient, the interval estimates for $m_2 - m_1$ advocated by Fisher gave rise to tests with inappropriate levels of significance, in terms of frequencies involving repeated sampling from the same initial population. Unlike Fisher's many other original and important contributions to statistical methodology and theory, fiducial inference never gained widespread acceptance, despite the importance that Fisher himself attached to the idea.
Instead, it was the subject of a long, bitter, and acrimonious debate within the statistical community; and although Fisher's impassioned advocacy gave it viability during his own lifetime, it quickly exited the theoretical mainstream after his death. Considerable confusion always existed about the exact nature of the fiducial argument; and the entire subject came to have an air of mystery. Fiducial inference never developed during Fisher's lifetime into a coherent and comprehensive theory, but remained a collection of examples, insights, and goals, added to and modified over time; and the polemical nature of the debate on both sides rendered much of the resulting literature opaque.
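The uncontroversial one-parameter argument, inverting a pivotal quantity obtained from the probability integral transformation, can be sketched as follows. The example is an assumption chosen for simplicity and is not one of Fisher's own: a single observation $X$ from an exponential distribution with mean $\theta$, for which $U = 1 - e^{-X/\theta}$ is uniform on $(0,1)$. Solving $\alpha/2 \le U \le 1 - \alpha/2$ for $\theta$ gives an interval, and a simulation with a fixed seed checks that it covers the true value with frequency close to $1 - \alpha$.

```python
import math
import random


def pivot_interval(x, alpha=0.05):
    """Interval for theta obtained by inverting U = 1 - exp(-x / theta)."""
    lower = x / (-math.log(alpha / 2.0))
    upper = x / (-math.log(1.0 - alpha / 2.0))
    return lower, upper


def coverage(theta, alpha=0.05, trials=20000, seed=0):
    """Fraction of simulated intervals that contain the true theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.expovariate(1.0 / theta)  # exponential draw with mean theta
        lower, upper = pivot_interval(x, alpha)
        if lower <= theta <= upper:
            hits += 1
    return hits / trials


print("interval for x = 1.0:", pivot_interval(1.0))
print("simulated coverage at theta = 2.0:", coverage(2.0))
```

In this one-parameter case the repeated-sampling frequency matches the nominal level, which is the property that Bartlett found lacking in the Behrens-Fisher solution.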
d. Design of experiments. The abundance of agricultural data available at Rothamsted led Fisher by a direct and natural route to the analysis of variance; and his first paper on this subject appeared as early as 1923 (CP 32). This initial effort suffered from a number of defects: Fisher had not tabulated the $F$ distribution, nor did he yet appreciate fully the crucial role of randomization. But progress came quickly: Fisher's address the next year to the 1924 International Congress of Mathematicians in Toronto (CP 36) described the relationship between the $z = \tfrac{1}{2} \ln F$, chi-squared, normal, and $t$-distributions; and passages from the 1925 edition of Statistical Methods for Research Workers emphasized the importance of randomization in ensuring the validity of tests of significance. During this period Fisher developed in rapid succession the basic elements of design: blocking, factorial designs (1926, CP 48), Latin squares, confounding and partial confounding, and the analysis of covariance. This work was summarized in Fisher's classic book The Design of Experiments (1935), another masterpiece of exposition containing the celebrated example of the lady tasting tea (Chapter 2: "The Principles of Experimentation, Illustrated by a Psycho-Physical Example").
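The arithmetic behind the lady tasting tea example is short enough to sketch. The sketch assumes the usual form of the experiment: eight cups, four prepared each way, presented in random order, with the subject told to pick out the four of one kind. Under pure guessing, randomization makes every choice of four cups equally likely, so the number of correct picks follows a hypergeometric distribution.

```python
from fractions import Fraction
from math import comb


def prob_correct(k, cups=8, milk_first=4):
    """P(exactly k of the chosen cups are truly milk-first) under pure guessing."""
    ways = comb(milk_first, k) * comb(cups - milk_first, milk_first - k)
    return Fraction(ways, comb(cups, milk_first))


p_all_correct = prob_correct(4)
p_at_least_three = prob_correct(3) + prob_correct(4)

print("P(all four correct)  =", p_all_correct)
print("P(at least 3 correct) =", p_at_least_three)
```

Only a perfect classification is rare enough under the null hypothesis to count as significant at the 5% level; three correct out of four is not.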
Professor of Eugenics, University College London, 1933 - 1943
Ironically, Fisher's move to London and a Professorship marked the beginning of a period of increasing controversy in his life, in particular the dispute with Neyman. Before his dispute with Neyman, Fisher had engaged in other statistical controversies, crossing swords with Arthur Eddington, Harold Jeffreys, and Karl Pearson. He had been fortunate in his previous choice of opponents: Eddington conceded Fisher's point; Jeffreys was cordial in rebuttal; and Pearson labored under the disadvantage of being wrong. But in Neyman, Fisher faced an opponent of an entirely different character. Neyman left Poland at the beginning of 1934 in order to assume an academic position at University College London. Shortly after his arrival in England, Neyman read a paper before the Royal Statistical Society (on 19 June 1934) dealing in part with the fiducial argument, and reformulating Fisher's theory in terms of what Neyman called "confidence intervals". Neyman described his theory of confidence intervals as an alternative description and development of Fisher's theory of fiducial probability, permitting its extension to the case of more than one parameter. Fisher, one of the paper's discussants, in turn referred to Neyman's work as a "generalization" of the fiducial argument, but pointed to the problem of a possible lack of uniqueness in the resulting probability statements if sufficient or ancillary statistics were not employed. Confidence intervals, Fisher thought, make statements which, although mathematically valid, are of only limited inferential value; but that they had some value he conceded in a footnote. Shortly afterwards, relations between the two broke down, following Fisher's discussion of Neyman's 1935 JRSS paper (read 28 March). Neyman's paper had been critical of some of Fisher's most important work in the design of experiments, although the attack was indirect and the tone of the paper towards Fisher himself was one of almost studied politeness.
Fisher's discussion was sharply critical of Neyman, both in substance and tone; and Neyman did not hold back in response.
Closely linked to his criticisms of Neyman's confidence intervals were Fisher's emerging views on the nature of conditional inference. There were already hints of this in some of his earlier papers (in particular, CP 42), but the issue first clearly arose in his papers in the 1930s. In his 1934 classic "Two new properties of mathematical likelihood" (CP 108), Fisher introduced the concept of the recovery of information using conditioning; and in his 1934 address to the Royal Statistical Society (CP 108), he turned to the conditional analysis of the two-by-two table and the use of the exact test. In both cases, as throughout his career, the examples advanced beyond the supporting theory, and no general prescription or rationale for conditioning was provided.
In his papers of the 1930s, Fisher was just beginning to grapple with these issues, and his comments are at times brief, fragmentary, even tentative. It is symptomatic of the uncertainty he must have felt at this period in his life that in 1941 he made the extraordinary concession that Jeffreys, "whose logical standpoint is very different from my own, may be right in proposing that 'Student's' method involves logical reasoning of so novel a type that a new postulate should be introduced to make its deductive basis rigorous" (1941, CP 181, p. 142).
But when referring to Neyman, no such concession was possible. By 1945 Fisher's view had hardened, and he labeled the criterion that "the level of significance must be equal to the frequency with which the hypothesis is rejected in repeated sampling of any fixed population allowed by hypothesis" as an "intrusive axiom, which is foreign to the reasoning on which the tests of significance were in fact based" (1945, CP 203, p. 507).
Professor of Genetics, University of Cambridge: 1943 - 1957
Fisher's return to Cambridge, although a professional triumph, was marked by personal tragedy: irreconcilable differences led to permanent separation from his wife, and shortly after, in December 1943, his son George was killed in the war. Fisher's output of scientific papers continued unabated, but although these still contained much of interest, there was nevertheless for the most part a clear decline in their depth and insight relative to his best earlier work, save for the occasional exception, such as "Dispersion on a sphere" (Fisher, 1953; CP 249). His book The Theory of Inbreeding (1949) has been described by his own daughter as "remarkable for being peculiarly his own formulation without reference to, or comparison with, what others had published on that subject" (Box, 1978, p. 417).
In the 1950s, sensing that the tide had turned against his theory of fiducial inference, Fisher returned to the subject in his 1956 book, Statistical Methods and Scientific Inference (SMSI). Fisher's treatment of probability in SMSI revealed an important shift in his view of the nature of probability. In his papers before World War II, Fisher had described prior distributions as referring to an objective process by which population parameters were generated; writing for example in 1921, that the problem of finding a posterior distribution "is indeterminate without knowing the statistical mechanism under which different values of [a parameter] come into existence" (1921, CP 14, p. 24), and that "we can know nothing of the probability of hypotheses or hypothetical quantities" (p. 35).
In contrast, in the 1950s Fisher espoused a view of probability much closer to the personalist or subjectivistic one: "probability statements do not imply the existence of [the hypothetical] population in the real world. All that they assert is that the exact nature and degree of our uncertainty is just as if we knew [the sample] to have been one chosen at random from such a population" (1959, CP 273, p. 22). None of the populations used to determine probability levels in tests of significance have "objective reality, all being products of the statistician's imagination" (1955, CP 261, p. 71; cf. SMSI, p. 81). In the 1st and 2nd editions of SMSI, Fisher even referred to "the role of subjective ignorance, as well as that of objective knowledge in a typical probability statement" (p. 33). Thus, although Fisher remained publicly anti-Bayesian, after World War II he was in fact much closer to the "objective Bayesian" position than to that of the frequentist Neyman.
Retirement: 1957 - 1962
Fisher did not retire from his Professorship of his own accord, but due to an imposed age requirement at Cambridge (although he continued on at Gonville and Caius, elected President of the College in 1957). In his last years, he traveled extensively, visiting Michigan State University in the Fall of 1957, and later both returning to the United States and visiting Japan, India, Belgium, France, and Italy. Controversial to the end, after becoming a scientific consultant for the British Tobacco Manufacturers Standing Committee he repeatedly expressed skepticism, both in invited lectures and in print, about the statistical evidence then available for a causal connection between smoking and lung cancer. In 1959 Fisher moved to the University of Adelaide in Australia, spending much of his remaining time there. It was here that he was diagnosed as having cancer in the summer of 1962; and although the operation (on July 21) was judged a success, eight days later Fisher died suddenly of a post-operative embolism.
Literature
Primary Sources. Fisher's three books on statistical theory (Statistical Methods for Research Workers, The Design of Experiments, and Statistical Methods and Scientific Inference) have gone through many editions. Now individually out of print, they have been reprinted in a single volume by Oxford (Fisher, 1990); The Genetical Theory of Natural Selection (1930) remains in print thanks to Dover Publications. Fisher's Collected Papers (Bennett, 1971 - 1974) remain an invaluable source for the study of Fisher (although the collection omits most of his book reviews, and scattered other minor contributions). Fisher's selected correspondence has also been published in two volumes, one genetic (Bennett, 1983) and one statistical (Bennett, 1990). These are invaluable for understanding the evolution of his thinking, and sometimes contrast in interesting fashion with his more public statements of position. Of particular interest is Fisher's correspondence with Gosset, unfortunately only published privately.
Secondary Sources. The biography of Fisher by his daughter (Box, 1978) is an invaluable source of information regarding Fisher's life and personality; its scientific discussion provides a useful orientation for further study. Also useful is the obituary notice of Yates and Mather (1963). There are many appreciations of Fisher's contributions to statistical science; but of particular note is the remarkable effort of L. J. Savage (1976). The essays in R. A. Fisher: An Appreciation (Fienberg and Hinkley, 1980), although of variable quality, have considerable value as assessments of Fisher's contributions from the perspective of later professional statisticians. (Outstanding among these is D. L. Wallace, "The Behrens-Fisher and Creasy-Fieller Problems", pp. 119-147.) Gosset's recent statistical biography (Pearson, 1990) of necessity touches on Fisher throughout. Among the numerous appreciations and discussions of Fisher, two of particular interest are Yates (1951) and Kruskal (1980).
References
[1] | Bennett, J. H. (ed.) (1971-1974). Collected Papers of R. A. Fisher (5 vols). University of Adelaide. |
[2] | Bennett, J. H. (ed.) (1983). Natural Selection, Heredity, and Eugenics. Including Selected Correspondence of R. A. Fisher with Leonard Darwin and Others. Clarendon Press, Oxford. |
[3] | Bennett, J. H. (ed.) (1990). Statistical Inference and Analysis: Selected Correspondence of R. A. Fisher. Clarendon Press, Oxford. |
[4] | Box, Joan Fisher (1978). R. A. Fisher: The Life of a Scientist. Wiley, New York. |
[5] | Fienberg, Stephen E. and Hinkley, David V. (1980). R. A. Fisher: An Appreciation. Lecture Notes in Statistics 1, Springer-Verlag, New York. |
[6] | Fisher, R. A. (1990). Statistical Methods, Experimental Design, and Scientific Inference. Clarendon Press, Oxford. |
[7] | Kruskal, W. (1980). The significance of Fisher: a review of R. A. Fisher: The Life of a Scientist. Journal of the American Statistical Association 75, 1019-1030. |
[8] | Pearson, E. S. (1990). `Student': A Statistical Biography of William Sealy Gosset (R. L. Plackett and G. A. Barnard, eds.). Clarendon Press, Oxford. |
[9] | Savage, L. J. (1976). On re-reading R. A. Fisher. Annals of Statistics 4, 441-500 (with discussion). |
[10] | Yates, F. (1951). The influence of Statistical Methods for Research Workers on the development of the science of statistics. Journal of the American Statistical Association 46, 19-34. |
[11] | Yates, F. and Mather, K. (1963). Ronald Aylmer Fisher, 1890-1962. Biographical Memoirs of Fellows of the Royal Society of London 9, 91-120. |
Reprinted with permission from
Christopher Charles Heyde and Eugene William Seneta (Editors),
Statisticians of the Centuries, Springer-Verlag Inc., New York, USA.
Fisher, Ronald Aylmer. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Fisher,_Ronald_Aylmer&oldid=53144