Activity 3
Instructions for graduate students
The goals for this activity are for you to:
- finalize the published paper in your field whose results you aim to reproduce as part of the semester project for this course,
- thoroughly explore the dataset and understand the “data generation process”, and
- identify the software packages you will need to complete this reproduction analysis.
Fill out the information in the sections below.
Overview of the original paper
Citation
Add an in-text citation to your publication here:
(Recall that to complete this, you will need to ensure that the BibTeX entry for your chosen paper is available in a .bib file in your project repository, and that the path to the bibliography file is specified in the YAML header.)
Main results
What are the core results from the original analysis that you wish to reproduce?
Details of the dataset
Provide an overview of all details regarding the original dataset that you feel will be relevant for your re-analysis. For example, what are the different “types” of data relevant to this analysis? (E.g. what are the columns in the dataset? Is each column categorical or continuous? How many observations are present? How are missing values represented?)
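The questions above can be answered with a few lines of R. The following is a minimal sketch, assuming the built-in `airquality` data frame as a stand-in for your own dataset; the object names `dat`, `n_obs`, `n_cols`, and `na_counts` are illustrative only. Note that it only counts values stored as `NA`; your dataset may encode missing values differently (for example as a sentinel number or an empty string), which you would need to check separately.

```r
library(dplyr)

# Stand-in data (assumption): the built-in `airquality` data frame
# takes the place of the dataset from your chosen paper.
dat <- as_tibble(airquality)

# How many observations (rows) and columns are present?
n_obs <- nrow(dat)
n_cols <- ncol(dat)
cat("Observations:", n_obs, " Columns:", n_cols, "\n")

# What are the columns, and what type is each one?
glimpse(dat)

# How many values are missing (stored as NA) in each column?
na_counts <- dat %>%
  summarise(across(everything(), ~ sum(is.na(.x))))
print(na_counts)
```

A column stored as a number is not necessarily continuous: in this stand-in dataset `Month` is an integer but is better treated as categorical, so it is worth comparing the storage type against the paper's description of each variable.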
Details of the analytical approach
Provide an overview of the analytical approach used in the original study. Include both conceptual and practical information (e.g. what kind of analyses were conducted, and using which software packages?)
Path to reproduction
Describe what data are available, including links to all repositories.
Describe what code is available, including links to all repositories.
Getting familiar with the data
- Download the dataset and use dplyr to start familiarizing yourself with the structures and patterns in the dataset. Please include your code and the output below. (Try making at least 5 plots to conduct exploratory data analysis, and run 1–2 simple statistical models.)
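As a starting point, one grouped summary, one exploratory plot, and one simple model might look like the sketch below. It assumes the built-in `airquality` data frame as a stand-in for your own dataset, and the choice of variables (`Ozone`, `Temp`, `Month`) is purely illustrative. Base R graphics are used here only to keep the example dependency-free; any plotting package would work.

```r
library(dplyr)

# Stand-in data (assumption): built-in `airquality`; swap in your own dataset.
dat <- as_tibble(airquality)

# Grouped summary: mean ozone and number of non-missing readings per month
monthly <- dat %>%
  group_by(Month) %>%
  summarise(
    mean_ozone = mean(Ozone, na.rm = TRUE),
    n_ozone = sum(!is.na(Ozone)),
    .groups = "drop"
  )
print(monthly)

# Exploratory plot: ozone against temperature
plot(dat$Temp, dat$Ozone,
     xlab = "Temperature (degrees F)",
     ylab = "Ozone (ppb)",
     main = "Ozone vs. temperature")

# Simple statistical model: linear regression of ozone on temperature.
# lm() drops rows with missing values by default (na.action = na.omit).
fit <- lm(Ozone ~ Temp, data = dat)
print(summary(fit))
```

Repeating this pattern with different variables, plot types (histograms, boxplots, time series), and groupings is one way to reach the suggested five plots.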
Instructions for undergraduate students
The goal for this activity is for you to finalize the open science dataset that you plan to explore as part of the semester project for this course, to understand the structure and content of the data, and to identify the software packages you will need to conduct your analysis/visualization.
Overview of the dataset
Description and source of data:
- Describe how the dataset was generated (i.e. who collected the data, over what time period, what was the purpose, etc.).
Your intended visualization
- Describe the analysis/visualization you hope to generate from these data. For example, if you plan to visualize the data, what kind of graphs do you plan to make? What will be on the X- and Y-axes?
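To make the description concrete, it can help to draft a rough version of the graph with explicit axis labels. The sketch below assumes the built-in `iris` data frame as a stand-in for your dataset, with a categorical variable on the X-axis and a continuous variable on the Y-axis; the variable choice is illustrative only.

```r
# Stand-in data (assumption): built-in `iris`; replace with your own dataset.
# Boxplot with a categorical X-axis (Species) and a continuous Y-axis.
bp <- boxplot(Sepal.Length ~ Species, data = iris,
              xlab = "Species",
              ylab = "Sepal length (cm)",
              main = "Sepal length by species")

# The returned object lists the groups drawn on the X-axis
print(bp$names)
```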
Getting familiar with the data
- Download the dataset and use dplyr to start familiarizing yourself with the structures and patterns in the dataset. Please include your code and the output below.
# Your code here; add more chunks as needed.
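If you are unsure where to begin, a few core dplyr verbs cover most first-pass exploration. The sketch below assumes the built-in `mtcars` data frame as a stand-in for your dataset; the object names (`cars`, `cyl_counts`, `heavy`) and the weight cutoff are hypothetical and chosen only for illustration.

```r
library(dplyr)

# Stand-in data (assumption): built-in `mtcars`, with row names kept as a column.
cars <- as_tibble(mtcars, rownames = "model")

# count(): how many rows fall into each category?
cyl_counts <- cars %>% count(cyl)
print(cyl_counts)

# filter(), select(), arrange(): keep a subset of rows and columns, then sort.
# The cutoff wt > 3.5 is arbitrary and only for illustration.
heavy <- cars %>%
  filter(wt > 3.5) %>%
  select(model, mpg, wt) %>%
  arrange(desc(mpg))
print(heavy)
```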