Science. Communication. Community.
Increasingly powerful computer software enables data visualization well beyond a basic graph or pie chart. How can journalists present the raw data behind a data-driven piece in a way that captures not only the reader’s attention but also their interest in the nitty-gritty details?
It’s 2013. Data-driven journalism (and even sensor journalism) is here. As large datasets have become increasingly important to writers, so too has the software to analyze such information. Much has been said about this use of data visualization in science journalism, but how does one translate a data table full of numbers into something engaging, something eye-catching, or even beautiful?
Inspired in part by this Ask Metafilter thread, I started to think about how I, as a scientist and as a journalist, can present information in an understandable medium for all readers. For those unfamiliar with the subject, I can use engaging visual techniques to share why my topics are important and interesting. Data, after all, is just an organized set of numbers and labels. By exploiting the relationships between these values, you can turn a list of measurements and their labels, such as [1 1 10 2 15; apple apple orange apple orange], into a clear comparison of pesticide levels in different fruits.
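To make the fruit example concrete, here is a minimal sketch in Python of the grouping step. It pairs each number with its label and averages the values for each fruit. The variable and function names are my own, and reading the numbers as pesticide levels in unspecified units is an assumption for illustration.

```python
from collections import defaultdict
from statistics import mean

# Values and labels from the example in the text. Treating the numbers
# as pesticide levels (units unspecified) is an assumption.
levels = [1, 1, 10, 2, 15]
fruits = ["apple", "apple", "orange", "apple", "orange"]

def mean_by_label(values, labels):
    """Group values by their label and return the mean of each group."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return {label: mean(group) for label, group in groups.items()}

averages = mean_by_label(levels, fruits)
for fruit, avg in averages.items():
    print(f"{fruit}: {avg:.1f}")
```

With these five values, the apples average about 1.3 and the oranges 12.5, which is the kind of contrast a simple bar chart can make obvious at a glance.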
At a base level, any data can be represented as plain old graphs. Line graphs, bar graphs, pie charts…they don’t sugar-coat anything. While straightforward and fundamental to scientific communication, such graphs by themselves are sometimes a bit dry for journalistic outlets. Enter the infographic.
Today, infographics have gone to the next generation: they’re interactive. With the advent of internet-based information comes the opportunity to produce interactive datasets never before possible in fixed-print media.
But how do you, as a holder of data, turn a string of numbers in a spreadsheet or database into something the public can relate to and engage with? You need a few computer skills, but don’t worry: you don’t have to be an expert programmer to work with data, though programming skills certainly help take things to the next level. If the thought of spending more than 10 minutes in Excel makes you woozy, perhaps you should make friends with someone who’s more comfortable doing the actual computer work, and ask them to teach you what you need to get started.
If the thought of gorgeous visualizations makes you drool, forge ahead.
At the very least, you can plot your data in Excel. This is the lowest common denominator. Many beautiful graphs have been created in Excel, but you have to put in a little extra effort to show your reader you care about the presentation of your data. Label your axes. Create an engaging chart title. Override the default font and automated line styles. If your graph includes a legend that says “Series1”, you probably haven’t made a graph that will draw your readers in.
Ready to dig a little deeper into your data? Maybe you want to run some statistical formulas or select an important subset of your dataset. A popular program for manipulating scientific data is MATLAB. MATLAB is commercial software, and it’s hardly affordable for those outside of scientific institutions where campus licenses make the program available, but scientists worldwide rely on it to process their datasets and visualize the contents.
Interested in a free alternative? Perhaps R is up your alley. R is a free statistical package that makes it easy to run countless statistical tests. (Remember the famous quote: “There are lies, damned lies, and statistics.” You can use this to your advantage.)
Getting started with D3, a free JavaScript library for building interactive, web-based visualizations, is easier than it sounds because base datasets and computer code templates are available for you to start with. Will you be displaying data on a state-by-state basis or a county-by-county basis? No need to create your own map: geographical boundaries are all freely available. All you need to do is add your particular data and visualization scheme.
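The idea of adding your own data to ready-made boundaries can be sketched in a few lines. The example below is written in Python rather than D3’s JavaScript, and it is only an illustration of the joining step, not D3 code. The boundary records mimic the shape of GeoJSON features (a common format for boundary files), the state values are made up, and the function names and thresholds are hypothetical.

```python
# Hypothetical boundary records shaped like GeoJSON features. A real file
# would also carry a "geometry" entry with the outline of each state.
boundaries = [
    {"type": "Feature", "properties": {"name": "Ohio"}},
    {"type": "Feature", "properties": {"name": "Texas"}},
    {"type": "Feature", "properties": {"name": "Maine"}},
]

# Made-up values, for illustration only.
my_data = {"Ohio": 12.0, "Texas": 30.5}

def attach_values(features, values, key="name"):
    """Return copies of the features with a 'value' property added.

    Regions with no matching data get None so gaps stay visible.
    """
    joined = []
    for feature in features:
        props = dict(feature["properties"])
        props["value"] = values.get(props[key])
        joined.append({**feature, "properties": props})
    return joined

def shade_for(value, low=10.0, high=20.0):
    """A simple visualization scheme: bin a value into one of three shades."""
    if value is None:
        return "no data"
    if value < low:
        return "light"
    if value < high:
        return "medium"
    return "dark"

joined = attach_values(boundaries, my_data)
shades = {f["properties"]["name"]: shade_for(f["properties"]["value"])
          for f in joined}
print(shades)
```

Keeping the join separate from the shading makes it easy to swap in a different dataset or a different color scheme without touching the boundary file.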
It’s one thing to see that the newshounds at the New York Times have great data visualization, but can an average Joe work with the same data? Definitely. Although you’re unlikely to learn D3 overnight, with a little practice and a rich dataset, you can be on the path to New York Times-quality interactive visualizations using free software. The best way to find out if D3 is right for you is to start playing around. For example, one group used D3 to illustrate a reader’s quest to find tasty new beers based on their current beverage preferences.
There are, of course, pros and cons to each of the software packages mentioned above. Some cost money and others are free. Some are meant for interactive web presentation, and others are better at generating fixed graphics. Some have a graphical interface while others require text-based computer coding. Regardless, each has its own learning curve.
So, data journalists, what do you think? What data visualization styles do you admire? What software packages do you use that I forgot to mention? I can’t wait to see what the next generation of science journalists develops as technology evolves.