# Your code here; add more chunks as needed.
Activity 3
Overview
The goals for this activity are for you to:
- finalize the published paper in your field whose results you aim to reproduce as part of the semester project for this course,
- thoroughly explore the dataset and understand the “data generation process”, and
- identify the software packages you will need in order to complete this reproduction analysis.
Fill out the information in the sections below.
Overview of the original paper
What were the motivations for the study? What were the core questions/hypotheses?
Main results
What are the core results from the original analysis that you wish to reproduce?
Details of the dataset
Provide an overview of all details regarding the original dataset that you feel will be relevant for your re-analysis. For example, what are the different “types” of data relevant to this analysis? (E.g. what are the columns in the dataset? Is each column categorical or continuous? How many observations are present? How are missing values represented?)
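As a starting point for answering these questions, here is a minimal sketch of how the column types, number of observations, and missing values might be inspected with dplyr. It uses the built-in `airquality` dataset purely as a stand-in (an assumption for illustration; substitute the dataset from your chosen paper), and the object names `dat`, `na_counts`, and `n_unique` are hypothetical.

```r
library(dplyr)

# Stand-in dataset (assumption): `airquality` ships with R and contains NAs.
# Replace it with the dataset from your chosen paper.
dat <- as_tibble(airquality)

# Number of observations (rows) and variables (columns).
n_obs  <- nrow(dat)
n_vars <- ncol(dat)

# Column names, storage types, and the first few values in one view.
glimpse(dat)

# Count missing values (NA) in each column.
na_counts <- dat %>%
  summarise(across(everything(), ~ sum(is.na(.x))))
print(na_counts)

# Count distinct values in each column; a small count can hint that a
# numerically stored column is really categorical (e.g. Month).
n_unique <- dat %>%
  summarise(across(everything(), n_distinct))
print(n_unique)
```

Note that this only detects values already stored as `NA`. If the original dataset encodes missing values with a sentinel (for example a specific number or an empty string), those would need to be converted first, for instance with `dplyr::na_if()`.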
Details of the analytical approach
Provide an overview of the analytical approach used in the original study. Include both conceptual and practical information (e.g. what kind of analyses were conducted, and using which software packages?)
Path to reproduction
Describe what data are available, including links to all repositories.
Describe what code is available, including links to all repositories.
Getting familiar with the data
- Download the dataset and use dplyr to start familiarizing yourself with the structures and patterns in the dataset. Please include your code and the output below. (Try making at least 5 plots to conduct exploratory data analysis, and run 1–2 simple statistical models.)
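
One possible shape for this exploration is sketched below: a grouped dplyr summary, two exploratory plots, and one simple model. It again uses the built-in `airquality` dataset as a stand-in (an assumption; your own dataset, variables, and model will differ), and it uses base graphics only to keep dependencies minimal. The names `dat`, `monthly`, and `fit` are hypothetical.

```r
library(dplyr)

# Stand-in dataset (assumption): replace `airquality` with your paper's data.
dat <- as_tibble(airquality) %>%
  mutate(Month = factor(Month))  # treat month as a categorical variable

# Grouped summary: mean ozone and number of non-missing readings per month.
monthly <- dat %>%
  group_by(Month) %>%
  summarise(
    mean_ozone = mean(Ozone, na.rm = TRUE),
    n_ozone    = sum(!is.na(Ozone)),
    .groups    = "drop"
  )
print(monthly)

# Two exploratory plots using base graphics.
hist(dat$Ozone, main = "Distribution of ozone", xlab = "Ozone (ppb)")
boxplot(Ozone ~ Month, data = dat, xlab = "Month", ylab = "Ozone (ppb)")

# A simple statistical model: linear regression of ozone on temperature.
# With R's default na.action, lm() drops rows that have missing values.
fit <- lm(Ozone ~ Temp, data = dat)
print(summary(fit))
```

The same pattern (summarise, plot, fit) can be repeated for other variable pairs to reach the suggested number of plots and models.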