How Reporters Can Ask the Right Questions of Databases

Dec 11, 2014

Investigative journalists often look to numbers to back up or fuel their reports, but the data they need can't always be found in a tidy spreadsheet or gathered straight from a source.

"As a journalist obviously your main tool is talking to people; it’s being able to ask the right questions of the right people," said ICFJ Knight Fellow Friedrich Lindenberg in a recent webinar on digital tools for investigative reporting. "That’s still true in many ways, but now you can ask the right questions of the right databases. You can ask the right questions using the right tools."

The ICFJ Anywhere - Dow Jones Foundation webinar helped narrow down the ever-growing list of digital resources for journalists, exploring tools that make it easier to search lengthy documents and turn PDFs into data-friendly visuals.

Analyze, search and release documents

DocumentCloud lets users track where people or companies appear in documents and how often, making it a useful service when combing through court or legal documents, Lindenberg said. Rather than looking through 200-plus pages of information, reporters can jump directly to areas of the document regarding the person or company in question. DocumentCloud was founded in 2009 with a grant from the Knight News Challenge.
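For readers curious about the idea behind that kind of lookup, here is a minimal Python sketch that maps names to the pages where they appear and counts each mention. It is not DocumentCloud's code or API. The function name `find_mentions`, the sample pages and the names are all made up for illustration, and it assumes you already have the text of each page as a plain string.

```python
import re


def find_mentions(pages, names):
    """Return {name: {page_number: count}} for each name found in the pages.

    Matching ignores case and only counts whole phrases, so "Acme Corp"
    does not match inside "Acme Corporation".
    """
    mentions = {name: {} for name in names}
    for page_number, text in enumerate(pages, start=1):
        for name in names:
            # Lookarounds keep the match from starting or ending mid-word.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            count = len(re.findall(pattern, text, flags=re.IGNORECASE))
            if count:
                mentions[name][page_number] = count
    return mentions


# Hypothetical page texts standing in for a long court filing.
sample_pages = [
    "Acme Corp signed the lease. John Doe witnessed it.",
    "No relevant parties are named on this page.",
    "Payments went to ACME CORP; Acme Corp later repaid John Doe.",
]

for name, page_counts in find_mentions(sample_pages, ["Acme Corp", "John Doe"]).items():
    print(name, page_counts)
```

With a result like this, a reporter could skip straight to the pages that mention the person or company under investigation.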

News organizations and journalists have used the tool to examine and publicly release everything from WikiLeaks documents to city contracts to rarely seen reports and memos.

You can use DocumentCloud to:

  • Comb through other documents posted publicly by news organizations
  • Privately view documents
  • Share documents with the public or with other reporters
  • Add notes to documents by commenting or highlighting passages
  • Embed portions of the document directly into your news article online

The tool began as a joint project between ProPublica and The New York Times but is now run by Investigative Reporters and Editors. To use the tool, you need to have an affiliation with a news organization.

But Lindenberg encouraged journalists interested in using the service to simply email him. Lindenberg and Code for Africa are developing a similar tool called sourceAFRICA, and he’s willing to let you try it out. (DocumentCloud is open source, so others can use its code to create similar products.)

Lindenberg also recommended Overview, which sorts information in your documents by counting how frequently words or themes appear in texts. You can tag similarities you find and create visualizations based on the information. Anyone can use the tool by importing documents from DocumentCloud or directly uploading PDFs.
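The word-counting idea can be illustrated in a few lines of Python. This is a rough sketch, not how Overview itself works internally. The stop-word list, the function names `top_terms` and `shared_terms`, and the sample memos are assumptions invented for the example.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "on",
              "with", "was", "were", "is", "by"}


def top_terms(text, n=3):
    """Return the n most frequent words in text, skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(n)


def shared_terms(documents, n=3):
    """Group document names by the frequent terms they have in common."""
    groups = {}
    for name, text in documents.items():
        for term, _count in top_terms(text, n):
            groups.setdefault(term, []).append(name)
    # Keep only the terms that link two or more documents.
    return {term: names for term, names in groups.items() if len(names) > 1}


# Hypothetical memos standing in for uploaded documents.
documents = {
    "memo_a": "The contract for the bridge was awarded. The bridge contract "
              "cost rose. Bridge repairs followed.",
    "memo_b": "Road contract signed. The road contract was delayed. "
              "Road crews waited.",
    "memo_c": "Budget hearing on the budget deficit. The budget passed.",
}

print(shared_terms(documents, n=2))
```

Terms that link several documents are natural candidates for tags, which is the kind of similarity a reporter might then want to explore further.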

Find out more about companies

If you're investigating a company and need to find more information, try OpenCorporates. The free database lets users search about 55 million registered companies in more than 75 jurisdictions. Using this tool, you can also see whether a director owns entities other than his or her main company.

If you’re in the United Kingdom or reporting on business there, you can also try DueDil. It’s similar to OpenCorporates but is specifically for businesses in the UK.

If you want to dig into contracts, go to Investigative Dashboard. You’ll be able to find where certain governments do business with private companies and explore who owns certain intellectual properties. You can also find links to additional online databases with information about businesses and who runs them.

The best part about Investigative Dashboard, Lindenberg added, is that you may be able to get help from real people through its research desk. If part of your reporting is based in another country, you can ask other journalists and researchers to do additional digging you can’t do remotely.

Get data

Calling it the “gateway drug to data journalism,” Lindenberg recommended Tabula to help you extract data from PDFs.

Tabula pulls statistics and tables out of long documents and converts a PDF table into spreadsheet-ready data, so you can sort, filter and recalculate the numbers yourself.

Keep learning

The School of Data has free online courses for those new to working with data and for experienced pros. You can also read up on the latest in data journalism by subscribing to NICAR-L’s newsletter.

Visit Friedrich's website for a full list of vetted tools for investigative journalists. You can also view the full ICFJ Anywhere - Dow Jones Foundation webinar at the top of this post.

Friedrich Lindenberg is an ICFJ Knight International Journalism Fellow who works with journalists and watchdog organizations to develop data resources and investigative tools.

This post is also published on IJNet, which is produced by ICFJ.