In a recent interview with the Journalist’s Resource blog, Harvard political scientist Gary King responded to a question about “data journalism as ‘social science on deadline’” with an answer that just blew me away, so hopefully no one at Harvard will get angry with me for posting King’s full remarks:
I think ultimately there is no line between journalists and social scientists. Nor is it true that journalists are less sophisticated than social scientists. And it is not true that social scientists totally understand whatever method they should know in order to access some new dataset. What matters in the end is that whatever conclusions you draw have the appropriate uncertainty attached to them. That’s the most important thing.
The worst phrase ever invented is “That’s not an exact science.” That is a sentence that makes no sense. The whole point of science is that you’re making inferences about things that we’re not really sure of. So the only relevant thing to express is the appropriate level of uncertainty with our inferences.
Sometimes we have a shorter deadline. That’s true in journalism and in social science as well. No matter what, in the end there’s always some data we don’t have. In the end, there’s always some uncertainty about the conclusion that we’re going to draw. And the more interesting, the more innovative, the more cutting edge the subject is we’re analyzing, the more uncertainty we’re going to have. And that’s just the breaks.
And so what makes us — I would say scientists; journalists maybe don’t like to call themselves scientists, but I’m happy to — all doing the right thing is expressing the appropriate degree of uncertainty with respect to our conclusions. So I don’t see any difference between journalists and social scientists. I see the same continuum within journalism and within social science.
King is encouraging scholars and journalists to use his Dataverse project as a way to share and archive the data that they’ve collected. If we agree with King that there really is “no line between journalists and social scientists,” why would we continue to accept different cultures of evidence sharing and replication in academic and journalistic publications? Will we forever be expected to just take a journalist’s word for their claims? Good data journalism, like a good academic article, should probably include a link to the dataset and replication code.
Quick Update: It strikes me that a lot of people don’t know what to call Horace Dediu at Asymco, but after posting this, I think we should call him a data journalist, one who follows the sorts of best practices (including data sharing) laid out by Gary King. I’m sure we could think of a few other journalist/blogger/analyst types playing by the rules as well, but I wonder if they are mostly bloggers like Dediu.