Creating tidy data tables

It’s quite common in visualization and analytics to work with tidy tables of data. That is, tables where each column of data contains only a single variable, and each variable is represented in a single column.

Yet data are often stored in tables that are arranged for direct visual representation or for convenient working in a spreadsheet. Often, such arrangements are not ‘tidy’.
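
To make the distinction concrete, here is a minimal sketch in plain Python of the kind of reshaping involved. The table, the column names and the `gather` helper are all made up for illustration; this is not how the page or any particular library implements it.

```python
# Hypothetical "messy" table: one row per city, one column per year.
# The year is really a variable, but here it is spread across column headers.
wide = [
    {"city": "London", "2019": 10, "2020": 12},
    {"city": "Paris", "2019": 7, "2020": 9},
]


def gather(rows, id_columns, key_name, value_name):
    """Move every non-id column into its own row as a (key, value) pair."""
    tidy_rows = []
    for row in rows:
        # Columns that identify the observation are copied to each new row.
        ids = {col: row[col] for col in id_columns}
        for col, value in row.items():
            if col not in id_columns:
                tidy_rows.append({**ids, key_name: col, value_name: value})
    return tidy_rows


# Tidy result: one column per variable (city, year, value), one row per observation.
tidy = gather(wide, id_columns=["city"], key_name="year", value_name="value")
for row in tidy:
    print(row)
```

The two-row wide table becomes four rows, each holding one city, one year and one value. Note that the years stay as strings because they came from column headers; converting them would be a separate step.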

For users of packages in R, Elm or Python, there are built-in functions to tidy ‘messy’ tabular data. But if you don’t have access to them or experience using them, tidying can be unexpectedly complex, especially if multiple columns of tidy data are required.

So I thought I’d create a simple page for uploading, tidying and downloading any CSV file of messy data, one that does not rely on any programming expertise. It uses the excellent Arquero library to do the hard work and hopefully provides a quick and simple way of reorganising tabular data for visualization and analysis.

Hey Jo! This is great! Reshaping data is a real pain and can be confusing to people who aren’t used to doing it. This is super helpful!

Nice work @jwoLondon. I use a library similar to Arquero, tidy.js (Intro & Demo / Peter Beshai / Observable). It’s also inspired by Hadley Wickham (Radio NZ interview).
