How Reporters Can Ask the Right Questions of Databases

By: ICFJ | 12/11/2014

Investigative journalists often look to numbers to back up or fuel their reports, but the data they need can't always be found in a tidy spreadsheet or gathered straight from a source.

"As a journalist obviously your main tool is talking to people; it’s being able to ask the right questions of the right people," said ICFJ Knight Fellow Friedrich Lindenberg in a recent webinar on digital tools for investigative reporting. "That’s still true in many ways, but now you can ask the right questions of the right databases. You can ask the right questions using the right tools."

The ICFJ Anywhere - Dow Jones Foundation webinar helped narrow down the ever-growing list of resources for journalists, exploring tools that make it easier to search long, dense documents and turn PDFs into usable data and visuals.

Analyze, search and release documents

DocumentCloud lets users track where people or companies appear in documents and how often, making it a useful service when combing through court or legal documents, Lindenberg said. Rather than looking through 200-plus pages of information, reporters can jump directly to areas of the document regarding the person or company in question. DocumentCloud was founded in 2009 with a grant from the Knight News Challenge.
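To make the idea of jumping straight to the relevant pages concrete, here is a minimal Python sketch. It is not DocumentCloud's code or API; it only assumes you already have the text of each page as a string, and the function name `find_mentions` and the sample pages are hypothetical.

```python
import re
from collections import Counter


def find_mentions(pages, name):
    """Return {page_number: count} for pages that mention `name`.

    Matching is case-insensitive and on whole words only.
    """
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    hits = Counter()
    for page_number, text in enumerate(pages, start=1):
        count = len(pattern.findall(text))
        if count:
            hits[page_number] = count
    return dict(hits)


# Hypothetical page texts standing in for an extracted court filing.
sample_pages = [
    "The contract was awarded to Acme Holdings in March.",
    "No bids were received from other firms.",
    "Acme Holdings later subcontracted the work. Payments to ACME HOLDINGS followed.",
]

mentions = find_mentions(sample_pages, "Acme Holdings")
print(mentions)  # {1: 1, 3: 2}
```

A hosted tool does far more than this, including recognizing people and organizations automatically, but the basic payoff is the same: a short list of pages to read instead of the whole file.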

News organizations and journalists have used the tool to examine and publicly release everything from WikiLeaks documents to city contracts to rarely seen reports and memos.

You can use DocumentCloud to:

  • Comb through other documents posted publicly by news organizations
  • Privately view documents
  • Share documents with the public or with other reporters
  • Add notes to documents by commenting or highlighting passages
  • Embed portions of the document directly into your news article online

The tool began as a joint project between ProPublica and The New York Times but is now run by Investigative Reporters and Editors. To use the tool, you need to have an affiliation with a news organization.

But Lindenberg encouraged journalists interested in using the service to simply email him. Lindenberg and Code for Africa are developing a similar tool called sourceAFRICA, and he’s willing to let you try it out. (DocumentCloud is open source, so others can use its code to create similar products.)

Lindenberg also recommended Overview, which sorts information in your documents by counting how frequently words or themes appear in texts. You can tag similarities you find and create visualizations based on the information. Anyone can use the tool by importing documents from DocumentCloud or directly uploading PDFs.
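The counting idea can be illustrated with a few lines of Python. This is a rough sketch of simple word-frequency counting, not Overview's actual method; the stop-word list, the function name `top_terms` and the sample documents are all assumptions made for the example.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real tools use much longer ones.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "was", "for", "by"}


def top_terms(documents, n=3):
    """Count how often each non-stop word appears across all documents
    and return the n most frequent (word, count) pairs."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical snippets standing in for a set of uploaded documents.
sample_docs = [
    "The ministry approved the tender for road repairs.",
    "A tender for bridge repairs was approved by the ministry.",
    "The ministry cancelled the bridge tender.",
]

terms = top_terms(sample_docs, n=2)
print(terms)
```

Words that keep surfacing across a stack of documents, such as "ministry" and "tender" here, are often a reasonable first hint of which themes to tag and read more closely.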

Find out more about companies

If you're investigating a certain company and need to find more information, try OpenCorporates. The free database allows users to search through about 55 million registered companies in more than 75 jurisdictions. Using this tool, you’re also able to see if a director owns entities other than his or her main company.

If you’re in the United Kingdom or reporting on business there, you can also try DueDil. It’s similar to OpenCorporates but is specifically for businesses in the UK.

If you want to dig into contracts, go to Investigative Dashboard. You’ll be able to find where certain governments do business with private companies and explore who owns certain intellectual properties. You can also find links to additional online databases with information about businesses and who runs them.

The best part about Investigative Dashboard, Lindenberg added, is that you can work with real people through its research desk. If part of your reporting is based in another country, you can ask other journalists and researchers to do additional digging you can’t do remotely.

Get data

Calling it the “gateway drug to data journalism,” Lindenberg recommended Tabula to help you extract data from PDFs.

Tabula pulls statistics and tables out of long documents and lets you convert a PDF table into a spreadsheet-ready file, such as a CSV, that you can sort, filter and analyze.

Keep learning

The School of Data has free online courses for those new to working with data and for experienced pros. You can also read up on the latest in data journalism by subscribing to the NICAR-L mailing list.

Visit Friedrich's website for a full list of vetted tools for investigative journalists. You can also view the full ICFJ Anywhere - Dow Jones Foundation webinar at the top of this post.

Friedrich Lindenberg is an ICFJ Knight International Journalism Fellow who works with journalists and watchdog organizations to develop data resources and investigative tools.

This post is also published on IJNet, which is produced by ICFJ.
