Last active
December 19, 2018 08:50
---------- Forwarded message ----------
From: chris wiggins <chris.wiggins@[YYY].edu>
Date: Wed, Aug 1, 2012 at 7:26 PM
Subject: stats history
To: hadley@[XXX].edu
Cc: chris wiggins <chris.wiggins@[YYY].edu>
Dear Hadley:
I'd like to try to address your tweeted inquiry
( https://twitter.com/hadleywickham/status/229402238404153344 )
as to why stats is so mathy.
like all good academics, let me start by saying how woefully
unqualified I am even to ask this question, having a PhD in
neither stats, math, nor history. Instead my PhD is in Physics
which, as we discussed, means I rush in to other people's fields
whereas more qualified and wise people fear to tread.
my brief understanding of statistics is this:
- 19th c, people already looked at data in mathematical
ways, including the great ones (Gauss (1822), Legendre (1805), etc)
- pre WWII: applied problems
such as genetics and Guinness (Fisher, Student)
- also pre-WWII, Jerzy Neyman passed through London
to hang with Karl Pearson and then moved to Berkeley
to found what would become the world's strongest stats dept,
a mathematical and Frequentist [edit] place.
- during WWII EE, physics, and statistics were badly needed.
the role of physics is well-celebrated. less so is the role
of EE (e.g., in developing radar) and stats, each of which
had small Manhattan-project-esque developments, i.e., federal
$ were used to bring into proximity many experts in the field
working very, very hard to solve very, very applied (martial)
problems. In the case of EE it was the MIT 'radlab' and its Harvard
offshoot, the Radio Research Laboratory, the latter led by
Fred Terman, who went on to be the father of Silicon Valley. in the
case of stats there were the Columbia statistical research group,
a like-minded group at Princeton (both funded via Warren Weaver),
and likely others I don't know.
- that said there were hardly any stats 'departments' (nor, I
presume, were there many applied math departments -- Warren Weaver
also seems to have seeded this field during WWII and led to
postwar department building) and the departments being created
all seem to have grown out of math departments (similarly with applied math).
- statistics seems to have existential issues both in character
and stability at this time.
* stability: stats efforts at Columbia and Princeton, which were
* good schools and
* funded by Warren Weaver to do stats during and I presume after WWII
yet both had their departments implode or cut back. Columbia hemorrhaged
people post WWII. Princeton lost stats in the 1980s.
* character: because stats departments grew out of math departments there
was lots of snobbery. Lehmann's book [1] has many quotes about mathematicians
looking down on stats. I think that gave stats issues, from WWII until
today, of always trying to be "real" math.
- except for singularities like Tukey. Tukey had more than enough
math credentials, having a great PhD thesis in topology. Moreover he was
eccentric, home-schooled, and in general seemed not to give a damn
how other people did it. he was also apparently incredibly smart,
and applied his math at Princeton, Bell Labs, ETS (Princeton), and in
a variety of consulting work in industrial and government contexts.
- Tukey also bucked the Berkeley math envy by pushing computation
and exploratory data analysis. If you look for example at the 1998
Neyman lecture by Chambers about computation and statistics he's quite
explicit about how Tukey's thinking in 1964-65 influenced his and thus
S and thus R.
- My opinion is that this attention to being "real" (i.e., mathy) caused stats
to miss the data boat. the clear innovative thinker here was Tukey
who influenced
- - Tufte, with whom he taught a class at Princeton
- - Cleveland, who wrote the 'data science' article in 2001
- - Chambers, who created S, from which R descends
- - he also coauthored with and influenced J H Friedman at Stanford,
who presumably influenced Hastie and Tibshirani.
- - probably a whole mess of other people I don't know about owing to illiteracy
- something I don't know is the history of departments which don't feature
in Lehmann's autobiography, including
- - Harvard (who made that place?)
- - Wisconsin (seems to have a lively history/department)
- the most amazing punchline here seems to me to be how WWII turned
Tukey into a statistician who spawned what is now data science and
data visualization. he sprang fully-formed from WWII a statistician
without, as far as I can tell, inheriting the DNA from Berkeley (which
in turn came from the UK). it's as though two separate species had
spontaneously formed and then interbred. am I wrong?
- cf https://twitter.com/mshron/status/229899515690364928
- cf https://twitter.com/mshron/status/229961814685908993
==
APPENDIX: gems from Lehmann's book:
==
"The basic difference between the roles of mathematical probability in 1946
and 1988 is that the subject is now accepted as mathematics, whereas in
1946, to most mathematicians, mathematical probability was to mathematics
as black marketing to marketing . . . . And the fact that probability was
intrinsically related to statistics did not improve either subject’s
standing in the eyes of pure mathematicians."
@Stanford:
“The mathematics department received me with a certain detachment,” Bowker
says. “Although he became a great supporter of statistics, Gabor Szegö was
then chairman of the mathematics department, and explained to me very nicely
that while what I did was very interesting, it wasn’t mathematics. So we
moved rather quickly to a separate department.”
@Stanford: And thus it came about that Al Bowker, formally still a graduate
student at Columbia (although by then his thesis had been completed), in
1948 became chairman of the fledgling statistics department,
@Princeton: "He also did not try to build up a group; however, he acquired a
colleague fortuitously. This was John Tukey, a topologist in the mathematics
department since 1939. During the war Tukey became involved in statistics,
and by 1945 considered himself a statistician rather than a topologist."
@Berkeley, circa 1946, "Evans [the chair] argued forcefully against a B.A.
degree in statistics, since it would ... would be essentially nothing but an
undergraduate professional degree.”
@Berkeley: "No course on Bayesian statistics was introduced until 1969."
Le Cam, 1950: "at Berkeley everything was full of measure theory and other
fanciful mathematics"
crazy: "Because of her interest in the history of probability, the Berkeley
statistics department in 1970 asked David to give a course in this
subject. The course, which met for two hours on Fridays, was given by her
regularly for a number of years. It was one way to satisfy a statistics
requirement and it soon became very popular, with a steadily increasing
enrollment that eventually rose to five hundred students. There were two
reasons for this popularity. One was that David was a lively and
entertaining lecturer; the other, which I am afraid was an even more
important reason, was that she demanded very little of the students. She
assigned no homework and there were no exams. The only requirement was the
final, an essay written at home on any topic of some relevance. Toward the
end of my term as chair, I began to hear rumors that a brisk market had
developed in essays recycled from previous years. As a result, we decided
soon after to discontinue the course."
@Tukey: "By the end of late 1945, I was a statistician rather than a
topologist."
@von Mises: "All of von Mises’ work was infused by his view that the task of
applied mathematics is to build mathematical models of some aspects of the
real world,"
moment of clarity: "At this point, classical statistics splits into three
branches: point estimation, which tries to pinpoint the unknown parameter;
confidence sets, which provides a set in which [the parameter] can be stated
to lie with a certain guaranteed probability; and hypothesis testing, where
a hypothesis about [the parameter] is either accepted or rejected"
!: "Larry’s Los Angeles draft board considered mathematics as a deferrable
subject but not statistics. It therefore became essential for him to be in a
mathematics rather than a statistics department. So Larry contacted Jack
Kiefer, who arranged a position for him at Cornell, where statistics was in
the mathematics department"
"Tukey went in the opposite direction: he argued that much statistical
activity should take place without the use of any models....Tukey stressed
the primacy of the data"
"The Bayesian approach, however, was violently opposed as unscientific by
both Fisher and Neyman in the 1920s and 1930s, and as a result fell into
disuse. The person bringing it to life again was Leonard J. Savage"
- "Tukey stressed exploratory data analysis without any probability or
mathematics." -E.L. on JWT
- Huber (1997) writes:
Very few people will have realized at that time (I certainly was not among
them) that Tukey, while ostensibly speaking about his personal
predilections, was in fact redefining statistics.
@Berkeley, hard to picture: "as one of her courses, she started a
statistical consulting service, staffed by the graduate students taking the
course. During the ten years that she was in charge of this course, the
service provided statistical advice to about two thousand clients,
mostly—but not exclusively—from within the university. To head this service,
Julie was appointed lecturer in 1977 and senior lecturer in 1981, a position
in which she remained until her retirement in 1994."
That others too found it difficult is illustrated by a 1987 paper by
Speed, “What Is An Analysis of Variance?” which is followed by the comments
of eleven discussants, no two of whom quite agree on its meaning.
# References
[1] Citation to Lehmann's book:
Lehmann, Erich L. Reminiscences of a statistician: The company I kept.
Springer Science & Business Media, 2007.
https://goo.gl/9I9pt1
An authority/former advisee of JWT corrects me, saying "I'm pretty sure Fred Mosteller
... wasn't really John Tukey's Ph.D. student but more like buddy."
according to a Special Collections Assistant at Princeton:
Upon looking at the dissertation for Mr. Mosteller I was able to find a mention of thanks in the conclusion to his work. There he stated that he thanked S.S. Wilks who he was under the direction of, as well as J.W. Tukey for his suggestions and constructive criticisms. Mosteller also sweetly thanked his wife Virginia.
regarding "harvard -- who made that place" ( https://gist.github.com/chrishwiggins/1594c8b72a4c74bdb369#file-letter-cw-hw-on-stats-and-data-science-history-L91 ) I later found out it was Tukey's student Mosteller, continuing the Tukey lineage.