How to: plan a journalism project that needs data entry

The challenges of data entry projects

Tip 1: Sketch out the table structure(s) to avoid having to repeat work

  1. Liabilities amount
  2. Name of company owing money
  3. Date
  4. Name of company owed money
  5. Agreement owed under
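
As a rough illustration of sketching the structure before anyone types anything, the planned columns can be written down once and turned into an empty table. This is a minimal sketch, assuming the table will be kept as a CSV file; the name `empty_table` is hypothetical and used only for this example.

```python
import csv
import io

# Column names copied from the list above, in the planned column order.
COLUMNS = [
    "Liabilities amount",
    "Name of company owing money",
    "Date",
    "Name of company owed money",
    "Agreement owed under",
]


def empty_table() -> str:
    """Return CSV text containing only the header row of the planned table."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    return buffer.getvalue()


print(empty_table())
```

Keeping the column list in one place means a change of plan only has to be made once, before any data has been entered.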

One table — or multiple tables?

Tip 2: Add columns to your table plans for attribution and checking, notes and newsworthiness

  • Who entered the problematic data;
  • Whether it is actually an error, or the data is what it seems;
  • What may have caused any error;
  • Whether the same mistake has been made by the same person elsewhere.
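
The last question in that list can be explored with a simple count. The sketch below assumes each row records who entered it and whether it passed a check; the field names `Entered by` and `check passed`, and the example rows, are invented for illustration.

```python
from collections import Counter


def errors_by_person(rows):
    """Count rows that failed a check, grouped by the person who entered them."""
    return Counter(row["Entered by"] for row in rows if not row["check passed"])


# Invented example rows, for illustration only.
rows = [
    {"Entered by": "Alex", "check passed": True},
    {"Entered by": "Sam", "check passed": False},
    {"Entered by": "Sam", "check passed": False},
]

print(errors_by_person(rows))
```

A high count for one person is only a prompt to look more closely: it may reflect a harder batch of documents rather than carelessness.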

Columns to check the data

  1. Entered by
  2. Liabilities amount
  3. AMOUNT CHECK
  4. Name of company owing money
  5. Company number
  6. Date
  7. DATE CHECK
  8. Name of company owed money
  9. Agreement owed under
  10. Source
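
One way to think about the AMOUNT CHECK and DATE CHECK columns is as small tests that return true or false for each entry. The sketch below is an assumption about what such checks might do, not the method used in the spreadsheet itself: it treats amounts as valid if they parse as a number once commas are removed, and dates as valid if they match a day/month/year format. Both function names and the date format are hypothetical.

```python
from datetime import datetime


def amount_check(value: str) -> bool:
    """True if the entry parses as a number after thousands commas are removed."""
    try:
        float(value.replace(",", ""))
        return True
    except ValueError:
        return False


def date_check(value: str, fmt: str = "%d/%m/%Y") -> bool:
    """True if the entry is a real calendar date in the assumed format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


print(amount_check("1,250,000"), amount_check("1.2m"))
print(date_check("31/01/2020"), date_check("31/02/2020"))
```

Entries that fail either check can then be traced back through the "Entered by" and "Source" columns.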

Creating the table — and the checking columns

Tip 3: Formatting columns to avoid bad data entry

  • Company numbers, telephone numbers, contract codes and other ID codes being stored as numbers (which means leading zeroes are removed).
  • Dates being stored as text.
  • In Excel: right-click on the column letter at the top of the column. Then select Format cells and on the window that appears select Text.
  • In Google Sheets: select the whole column, then click on the Format menu and select the Number menu. This will open up another menu where you can specify the format you want (in this case, Plain Text).
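
The leading-zero problem is easy to reproduce outside a spreadsheet. This short sketch uses an invented company number to show what is lost when an ID code is converted to a number instead of being kept as text.

```python
import csv
import io

# Invented example data: a company number that starts with zeroes.
raw = "Company number,Name\n00123456,Company A\n"

rows = list(csv.DictReader(io.StringIO(raw)))

as_text = rows[0]["Company number"]  # the csv module keeps every value as text
as_number = int(as_text)             # converting to a number drops the zeroes

print(as_text)    # 00123456
print(as_number)  # 123456
```

Once the zeroes are gone there is no reliable way to tell how many there were, which is why the column is best formatted as text before data entry begins.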

Tip 4: The brutal option: data validation

  1. First, you need to create a list of values that you will allow. That’s best done in a separate sheet in Excel.
  2. Create a second sheet. In A1 of this new sheet type ‘Companies’ as the heading for column A, then type ‘Company A’ in A2 and ‘Company B’ in A3. You now have a list of two companies.
  3. Now go back to your main sheet, and select all the cells below the heading in column H (‘Company owed’).
  4. Make sure you are in the Data tab in Excel, and click the button marked Data validation.
  5. A window will appear where you can specify what’s allowed in these cells. Select List.
  6. Click into the Source: box at the bottom. With your cursor still there, click on the second sheet in Excel, where your list is. Then click and drag to select the cells containing the company names you want to allow (leave out the heading in A1). The box should fill with the location of those cells: `=Sheet2!$A$2:$A$3`
  7. The other tabs contain options for the warning shown when someone tries to enter a value not in the list, and for whether they can override it, but for now click OK.
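
The same allow-list idea can be applied after the fact, to data that was entered without validation. This is a minimal sketch using the two example companies from the steps above; the function name `invalid_entries` is hypothetical, and it uses exact matching, which is an assumption and may be stricter than the spreadsheet's own rule.

```python
# The allowed values from the example list on the second sheet.
ALLOWED_COMPANIES = {"Company A", "Company B"}


def invalid_entries(values, allowed=ALLOWED_COMPANIES):
    """Return (position, value) pairs for entries that are not in the allow-list."""
    return [(i, v) for i, v in enumerate(values, start=1) if v not in allowed]


entered = ["Company A", "company b", "Company C"]
print(invalid_entries(entered))
```

Positions are counted from 1 here, so they refer to the first, second and third entries rather than to spreadsheet row numbers.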

Tip 5: Using Google Forms to enter data

  • Multiple people can work on the same spreadsheet at the same time
  • The spreadsheet can be accessed from any computer with an internet connection
  • Google Sheets stores a history of the sheet as it is changed (look under the File menu for ‘Version history’). This means that if mistakes are made you can compare changes and revert to an earlier version of the spreadsheet (removing any changes made after that version)
  • Data can be entered using a form (Google Forms) rather than directly into the spreadsheet itself
  • For dates, remember to change the question type to ‘Date’.
  • For notes, use ‘Paragraph’ so there’s enough space if they need it.
  • For newsworthiness, you can use the ‘Linear scale’ option, which makes them choose between 1 and 5 or another range you specify.
  • You can also use the Drop-down option to make them choose from a set of options (such as countries).
  • For yes/no answers, use checkboxes.

Taking it further: structured journalism

Paul Bradshaw

I write the @ojblog. I run the MA in Data Journalism and the MA in Multiplatform and Mobile Journalism @bcujournalism and wrote @ojhandbook #scrapingforjournos