Data should be easily understood by you, collaborators, evaluators (e.g. reviewers), and computers
Sometimes, organizational strategies that work well for humans (e.g. color-coding or merged cells) don’t work well for computers
Develop good practices that make data legible to both humans and computers
Keep your raw data raw! Make copies for cleaning.
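If you manage your files with a script rather than by hand, the same rule can be sketched in Python. This is only a minimal illustration: the function name `make_working_copy` and any folder names you pass to it are hypothetical and not part of the workshop materials.

```python
import shutil
import stat
from pathlib import Path


def make_working_copy(raw_path, work_dir):
    """Copy a raw data file into work_dir and mark the original read-only.

    Hypothetical helper for illustration; the names are assumptions.
    """
    raw_path = Path(raw_path)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # All cleaning happens on this copy, never on the raw file.
    working_copy = work_dir / raw_path.name
    shutil.copy2(raw_path, working_copy)

    # Drop write permission on the raw file so accidental edits are harder.
    raw_path.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    # Make sure the working copy itself stays editable by its owner.
    working_copy.chmod(stat.S_IREAD | stat.S_IWRITE)
    return working_copy
```

For example, `make_working_copy("raw/surveys.csv", "working")` would leave the raw file untouched and give you `working/surveys.csv` to clean (both paths are made up here).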
In today’s workshop we will be using the Portal Project Teaching Dataset
This comes from a long-running study in Arizona of how rodents and ants affect plant communities
Let’s take a look at a “messy” version of a dataset that might be collected for a project like this.
Our dataset has two tabs. Two field assistants conducted the surveys, one in 2013 and one in 2014, and each recorded the data in their own way, in the tabs named 2013 and 2014, respectively. Now you’re the person in charge of this project and you want to start analyzing the data.
Your challenge: With a partner, look through this Google Sheet and identify the problems you will have to address to create a “flat” sheet ready for analysis.
Make a copy of the sheet and start addressing issues
Work on this for ~10-15 minutes
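To make the idea of a “flat” sheet concrete, here is a small Python sketch of the target shape: one table, one row per observation, with the year stored in its own column instead of in the tab name. The column names and values are made up for illustration and are not taken from the workshop sheet, and `flatten_tabs` is a hypothetical helper.

```python
def flatten_tabs(tabs):
    """Combine per-year tabs into one flat list of rows.

    tabs maps a year to that tab's rows (each row is a dict of
    column name -> value). The year becomes an ordinary column.
    """
    flat = []
    for year, rows in sorted(tabs.items()):
        for row in rows:
            flat_row = {"year": year}
            flat_row.update(row)
            flat.append(flat_row)
    return flat


# Made-up rows for illustration only.
tabs = {
    2013: [{"plot": 1, "species": "A", "weight_g": 40}],
    2014: [
        {"plot": 1, "species": "B", "weight_g": 48},
        {"plot": 2, "species": "A", "weight_g": 44},
    ],
}
flat = flatten_tabs(tabs)
```

This only works once both tabs use the same columns, which is why the first step of the challenge is finding where the two assistants recorded things differently.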