28th October 2010: On Data Journalism
It represents the convergence of a number of fields which are significant in their own right – from investigative research and statistics to design and programming. The idea of combining those skills to tell important stories is powerful – but also intimidating. Who can do all that?
Actually, this is an old yet interesting topic. Barry Saunders notes he wrote a thesis on data journalism “before it was cool.” In 2001, during my stint as Disinformation’s editor, I wrote a profile of information visualisation tools and a glossary of coding terms (links now outdated). In 2002, I gave a Swinburne University presentation on developing an editorial framework for interpreting data and event-based news. However, the roots of today’s ‘data journalism’ can be traced back to the 1960s movement in ‘social indicators’, statistical inference methods, and early computer-assisted journalism. A good primer on these techniques is Philip Meyer’s Precision Journalism: A Reporter’s Introduction to Social Science Methods (Rowman & Littlefield Publishers, Lanham MD, 2002).
For me, the more interesting area is not data itself but analytical models, causal inference and judgment. This deals with how to evaluate data, and how to infer findings when there is no ‘smoking gun’ or ‘deep throat’ source. In two celebrated papers for the Journal of Political Economy and NBER’s Working Paper series, Columbia’s Ray Fisman and UC Berkeley’s Edward Miguel inferred patterns of corruption from the unpaid parking tickets of United Nations diplomats. Greenlight Capital’s hedge fund manager David Einhorn gave a speech at the 2002 Ira Sohn Conference, and then wrote a follow-up book in which he explained how he used mosaic theory in his research process to uncover information about a ‘target’ company and to short-sell its stock. Philip Tetlock, Robert Jervis, and Gregory Treverton have written extensively on this area, in terms of the epistemology and research methods of intelligence analysis.