Skip to content

Instantly share code, notes, and snippets.

@brentajones
Last active July 13, 2023 16:40
Show Gist options
  • Save brentajones/77167d4dfe81d46cb731 to your computer and use it in GitHub Desktop.
Save brentajones/77167d4dfe81d46cb731 to your computer and use it in GitHub Desktop.

Want to team up?

We're gauging interest in forming some kind of an informal group of lone data journalists to help us better connect and collaborate. If you're interested, please fill out our form and we'll go from there. Thanks!


Solo data journalist tips and tools

Get to work

Keep lists of story ideas

Be sure to include both the simple one-shot stories as well as the bigger projects. It may be helpful to split up your lists that way.

Unless you have a super data-centric newsroom, you'll likely be seeing story ideas, or ways of approaching stories, other reporters don't. So you'll want to keep track of them for the right moment.

Employ templates and checklists

You're the only one doing this stuff? Congratulations. You're also the only one who remembers how — until you forget or go on vacation.

Once you figure out a skill, whether it's how to create that essential Excel PivotTable to analyze the city's crime report each month or how to code a bar graph that works in your CMS, document it!

Write down a checklist, step-by-step, of how you finished that project. Yes, you might need to modify that project for your next one, but having a place to start is invaluable.

Learn to code…or don't

You don't need to code to be an effective data journalist. You can find a lot of stories with basic Excel skills. And you can make maps and graphics with off-the-shelf tools (see the tools section below). As you progress in your data journalism career, eventually you'll hit a wall that can only be scaled with a little code — but you can cross that bridge when you come to it.

Start simple

If you're just starting out, start small. Do a simple locator map. A single bar graph or line chart. Then work your way up. Starting this way shows editors that useful data journalism can be done in days, not weeks.

It can also be smart to start with a silly or fun project (i.e. a dog name database, popular baby names over time) so -- if you make a mistake -- you won’t have to worry about getting sued. This helps reduce your stress level as you’re learning the ropes.

Use (or develop) a stylesheet

If you're developing visualizations of any kind, you should have a stylesheet. What fonts do you use? What size are they? What colors do you pick for a bar graph?

Following a stylesheet has many benefits. For your readers, it gives your work a consistent, professional look. For you, it helps you produce visualizations faster. You can make all those decisions once, instead of every time you fire up a project, and save your brain-power for tackling the current problem.

Enjoy economies of scale

You’ll be amazed, as you do more and more data journalism, how much faster you’ll get. Things that literally took you three days to build you can now whip off in 20 minutes. Once you get to this stage, you can take advantage of these time savings to build time into your day to learn new skills (like coding!). And you’re unlikely to get too much push back from your bosses, because they’ll be amazed at how quickly you’re pumping this stuff out.

Get some allies

Team up with your newsroom

You may not have a “data team” where you work but you can dramatically increase the amount of data journalism your organization produces by teaming up with your non-data-journalist colleagues. Rather than do everything yourself -- from crunching your data to writing the story -- break the journalism into its component parts.

You make the chart of standardized test scores, then hand off your analysis to your education reporter. You make the map of ER wait times and then ask your health reporter to do some interviews about it. Those beat specialists will often do a better job of the interviewing and writing anyways, as they have a better sense of what all those numbers mean.

Data journalism isn’t for everyone but you can often get others in your newsroom into a “data state of mind” so that they’ll start coming to you with great ideas for data stories you can team up on.

Team up with your editor

Learn to manage your editor's expectations. Also learn how she can help you make a better story:

Produce! Unless you're heads down on one huge project (and your editor is cool with that), try to come up with a finished thing every day. A simple map or chart, or a short data story. It keeps the bosses happy and keeps you from burning out on one thing.

Double-check your data before rushing to your editor with a blockbuster story. You may have made an error. If you didn't, report it out a little before showing your editor. It lets your editor know your direction and helps you keep focus.

Under-promise, then over-deliver. You should have a backup plan should your Cool Data Vis falls through. What is it?

Don't rush. When your boss asks: "Hey, how long will that take?" In your head, multiply the number by 2 and change the units. Two hours = four days. Then, if you finish in less time you look like a genius. Note: This loses effectiveness if you're consistently beating your estimates. Get one right every once in a while.

A few tips for editors supporting data journalists:
  • Ask questions about how he or she came up with the numbers or information
  • Ask if the reporter double checked his or her information
  • Find someone else in the newsroom who knows the subject matter to talk to the reporter about the story

Team up with the data journalism community

NICAR! Find some folks to conspire with and serve as a mutual support group. Have a monthly Skype call or Google Hangout.

Join the NICAR Listserv.

Follow folks on Twitter and join the conversation. #NICAR15 is a good place to start.

Follow folks on GitHub and make some pull requests.

Team up with the data/programming community where you live

You may be the only data journalist at your newsroom, or maybe even in your city. But you’re almost certainly not the only data geek where you live. There are lots of other folks out there in business, government and the non-profit sector who love playing with data. Seek these people out and connect with them. They’re going to be your biggest fans. And you can often ask them for help when you’re stuck on a technical problem.

Twitter is great for this. Also GitHub. Check out various "User Group" meetups, like GIS user groups for mapping or Python or Ruby user groups for programming.

Get some tools

Just as photographers require access to cameras, data journalists require access to certain tools. Learn what you can do with these tools, perhaps on a personal computer, and be able to make the case to your boss (and their boss, and the IT department) what the tool does and how it will help make journalism.

The good news is that most of them are free, or have free equivalents. The bad news is you'll require admin access to install most of them. So…

Step 1: Get admin access to your computer, or get friendly with your IT department.

If that just isn't happening for you, look toward web-based tools, of which there are many. Some folks also do this work on their personal machines.

Tools for working with data

  • CSVKit — CSVkit is a command-line tool for working with CSV files. It has great documentation, and there's a tutorial built right in. And because it’s a Unix-like tool, you don’t really have to understand all of it to start using it. Some great CSVkit uses:
    • Joining two csvs without having to use a database
    • Turning a CSV into geojson for mapping
    • Easily inserting a csv into a sqlite table
  • SQLite — SQLite is a lightweight database. Peter Aldhous has a good tutorial.
  • Open Refine — Open Refine is a tool that can solve some basic data-cleaning and reformatting problems. Also it seems like pretty much everyone on NICAR-L uses it, so it’s easy to find answers for questions on the list. Some uses for Open Refine:
    • Trimming leading and ending whitespaces from columns.
    • "Cluster and merge" equivalent (but badly entered/spelled) entries. Super powerful. Doing this by hand is slow and painful.
    • Combine a bunch of excel worksheets into a contiguous csv. There’s probably a bunch of ways to do this, but Open Refine takes 10 seconds.
    • Refine is one of the easiest, and most reliable, ways to reformat data for Tableau Public.
  • Excel PivotTables (or equivalent) — Sometimes it's appropriate to bust out Python Pandas iPython Notebooks or R to do some analysis. Other times, all you need to do is make a PivotTable. These are fast, simple, surprisingly powerful tools. There are tons of video tutorials available online.
  • Cometdocs — As IRE members, we get a free license. Did you get a pesky PDF that looks like it used to be an Excel file? Cometdocs can convert it into something you can actually use.
  • Tabula — Tabula is a free open-source tool for converting PDFs.
  • NPR Visuals Team setup guide — if you're just getting started with coding (and you have a Mac), NPR's Visuals Team has a step by step for installing many of the basic tools you can use to start developing. This isn't the only way to set up your machine, but it is thorough.

Tools for helping to visualize data

  • Datawrapper — Datawrapper is an open source (though not free in all cases) web app to help create data visualizations.
  • Tableau Public — A steeper learning curve than Datawrapper but you can do a lot of stuff in Tableau without coding and its once-sluggish interface is getting a lot snappier (like with tooltips and maps)
  • Google Fusion Tables — Still probably the easiest way to make interactive maps, even if you’ve got thousands of points. And you can jazz up your Fusion Table maps with the Layer Wizard or Derek Eder’s search template.
  • QGIS — QGIS is a free, open-source package for working with geographic data. Even if you never deploy something made with QGIS, it can still come in handy. Helpful for:
    • Some quick and dirty geocoding (through MMQGIS plugin. Free and easy!)
    • Converting .shp into geojson or KML and back
    • Quickly exploring .shp files. Not sure if you need the 10m or the 100m file? Fire up QGIS.
    • Filtering a large SHP file into smaller chunks. Statistics Canada, for example, only releases boundary files for Census Tracts (neighbourhoods) for the whole country, which is a massive file. You can quickly use QGIS to just grab the tracts you need.
  • GDAL — GDAL is a command-line tool for working with shapefiles. It's powerful, but easier than it seems at first glance. Mike Bostock's Let's make a map tutorial introduces it gently.

Tools for helping present data

  • Twitter Bootstrap — If you're starting a project and need a basic web template, here is one.
  • NPR's Daily Graphics Rig and Apps Template — If you're doing web graphics or web projects on a regular basis, templates like these can make them a snap to start, develop and deploy.
  • ColorBrewer — ColorBrewer helps you develop color schemes, primarily for maps but useful for other purposes. It also lets you know which ones are colorblind-friendly.
  • D3.js and D3-tip — D3.js is a Javascript library for binding data to web elements to make visualizations. There are many, many examples of various techniques online. One tool for working with D3 is D3-tip which makes creating tooltips for those graphics easy.

Credits:

Compiled for NICAR 2015, Atlanta Ga.

Panel: Flying Solo

Panelists:

Moderator:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment