Reproducible research in ecology and evolution

Author

Gaurav Kandlikar

About the course

Welcome to Biol [4800|7800] - Open and Reproducible Research in Ecology and Evolution.

You today?

In a few months

Course objectives

The content and structure of this course are designed to help you work towards the following objectives:

  1. Understand the trends and tools for reproducible and open research practices in science generally, and in ecology/evolutionary biology specifically;

  2. Develop and articulate your individual philosophy and workflow towards reproducible research;

  3. Integrate openly available datasets and tools to reproduce classic result(s);

  4. Envision and begin to implement the steps you will take towards ensuring reproducibility and robustness of your own research;

  5. Build a community of practice around reproducible and open research in ecology and evolution.

1 Communities of practice are “groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly.” – ref

Who is this course for?

The target audience for this course is graduate students or advanced undergraduates with some past experience analyzing biological data with tools like R, Python, or other programs called from the command line. The tools we will cover in this course are broadly applicable across fields, but many examples will refer to topics in ecology and evolution.

Pre-requisites

While there are no formal course pre-requisites, this course will likely be most valuable for participants who have a working knowledge of conducting data analysis/visualization in R (and/or Python) and executing commands from the command line. As a practical yardstick, if the material covered in chapters 1–5 of R for Data Science is not completely unfamiliar to you, then you should be able to complete all the exercises in this course.

If you have never before seen the material in the chapters mentioned above, you might be better served by signing up for LSU’s “Foundations for computational biology” (S2025: Biol 4800 with Dr. Brant Faircloth) course, which is more specifically designed as an introduction to using these tools.

If you are starting with zero prior experience in using R but want to take this course anyway, please meet with Dr. Kandlikar early in the semester to ensure that there is a path for you to get the most out of this course.

Please come to class with a laptop capable of running R, RStudio, git, and other associated tools. LSU students can borrow a laptop for an entire semester through the LSU Library.

Course communication

Official communication about the course will occur over Moodle and LSU email. For informal communication (e.g. to seek help on a bug you are encountering, or to share a cool tool), I encourage all students to join the unofficial course Discord server.

Calendar

Last updated: 2025-04-15

| Week | Class 1 (Tues) | Class 2 (Thurs) | Submission |
|---|---|---|---|
| Week 01 (14 & 16 Jan) | Overview of reproducible and open research | Set up tools and intro to semester project | none |
| Week 02 (21 & 23 Jan) | Project organization and management | In-class exercise: SWC Unix Shell | none |
| Week 03 (28 & 30 Jan) | Writing with Quarto | In-class exercise: tbd | none |
| Week 04 (04 & 06 Feb) | Version control with git | In-class exercises: tbd | Activity 1 due on 04 Feb |
| Week 05 (11 & 13 Feb) | Data visualization | In-class exercises: Publication-quality graphics with ggplot2 | none |
| Week 06 (18 & 20 Feb) | How data are stored in R | In-class exercises: Data Carpentry Data Processing (Setup and Episodes 3–4) | Activity 2 due on 18 Feb |
| Week 07 (25 & 27 Feb) | Data archiving and storage | Open work time for semester project | none |
| Week 08 (04 & 06 Mar) | None - Mardi Gras break | Buffer day – review any previous material | Activity 3 due on 06 Mar |
| Week 09 (11 & 13 Mar) | Crash course topic 1: Digital lab notebooks | Student presentations - semester project proposal | In-class presentation of semester project proposal |
| Week 12 (18 & 20 Mar) | Crash course topic 2: User-defined functions | Open work time for semester project | None |
| Week 13 (25 & 27 Mar) | Crash course topic 3: Accessing open databases | Open work time for semester project | none |
| Week 14 (01 & 03 Apr) | None - spring break | None - spring break | |
| Week 15 (08 & 10 Apr) | Crash course topic 4: Intro to High-performance computing; Using quarto to create presentations | Open work time for semester project | Activity 4 due on 08 Apr |
| Week 16 (15 & 17 Apr) | Crash course topic 5: Advanced tools for collaboration | Open work time for semester project | none |
| Week 17 (22 & 24 Apr) | Semester project presentations | Semester project presentations | Student presentations |
| Week 18 (29 Apr & 1 May) | Semester project presentations | End-of-semester individual meetings | Student presentations |

  • Review/deeper dive into one of the core lessons of Weeks 1–7
  • Maintaining an open digital lab notebook
  • Introduction to high-performance computing
  • Accessing open databases through R
  • Wrangling text-based data with Regular Expressions
  • Writing user-defined functions
  • Advanced tools for collaborative programming (e.g. renv)
  • Advanced open science practices (pre-registration, pre-printing, etc.)
  • Unit testing and continuous integration
  • Student choice!

Grading

Letter grades will be determined through your activities (four activities worth 25 points each) and semester project (worth a total of 100 points; see semester project for details). Your final grade out of 200 points will determine your letter grade.

A+ for earning 194–200 points over the semester;
A  for earning 186–193 points over the semester;
A- for earning 180–185 points over the semester;
B+ for earning 174–179 points over the semester;
B  for earning 166–173 points over the semester;
B- for earning 160–165 points over the semester;
C+ for earning 154–159 points over the semester;
C  for earning 146–153 points over the semester;
C- for earning 140–145 points over the semester;
D+ for earning 134–139 points over the semester;
D  for earning 126–133 points over the semester;
D- for earning 120–125 points over the semester.
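
As a rough illustration of how the scale above works, here is a minimal Python sketch that adds up four activity scores and a project score and looks up the matching letter grade. The function name `letter_grade` and the table `GRADE_FLOORS` are hypothetical names made up for this example. The sketch also makes two assumptions that the scale does not state: fractional totals count towards the band whose lower bound they meet, and totals below 120 return no letter.

```python
# Lower bound (in points out of 200) for each letter grade, copied from the scale above.
GRADE_FLOORS = [
    (194, "A+"), (186, "A"), (180, "A-"),
    (174, "B+"), (166, "B"), (160, "B-"),
    (154, "C+"), (146, "C"), (140, "C-"),
    (134, "D+"), (126, "D"), (120, "D-"),
]


def letter_grade(activity_scores, project_score):
    """Return (total_points, letter) for four activities and the semester project.

    Assumption: totals under 120 are not covered by the scale, so the
    letter is returned as None rather than guessed.
    """
    if len(activity_scores) != 4:
        raise ValueError("expected exactly four activity scores")
    if any(not 0 <= score <= 25 for score in activity_scores):
        raise ValueError("each activity is worth between 0 and 25 points")
    if not 0 <= project_score <= 100:
        raise ValueError("the semester project is worth between 0 and 100 points")

    total = sum(activity_scores) + project_score
    # Walk the scale from the highest band down and stop at the first floor met.
    for floor, letter in GRADE_FLOORS:
        if total >= floor:
            return total, letter
    return total, None


print(letter_grade([25, 22, 24, 20], 88))  # (179, 'B+')
```

In this example, the activities sum to 91 points and the project adds 88, for a total of 179, which falls in the B+ band.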

Assignment deadlines

All assignments except the final semester project submission come with a 24-hour grace period (i.e. you can submit the assignment up to 24 hours after the deadline for full credit, without any prior discussion with me). If you are unable to complete an activity submission during this grace period, please get in touch with me to discuss alternatives.

Guidelines for using AI-generated code

The goal for this course is for you to think through the principles and practice of conducting ethical, open, and robust science. In my experience, the casual use of generative AI tools is largely antithetical to these goals, and I strongly discourage their use among students.

Instead, when you are stuck, consider turning to a human, whether it is through our unofficial course Discord, a human-authored resource (books, blogs, package documentation, etc.), or forums like Stack Overflow. You are also welcome to come to Gaurav’s “office hours” (AKA hacky hours) to discuss any issues.

I don’t have the tools, capacity, or desire to monitor or penalize your use of AI tools in this course. But if I get the sense that you are relying on these sources to complete the coursework, I may ask for an individual meeting to discuss the extent to which your submissions reflect your own understanding of the material.

About this site

The source code of this website is available on GitLab: https://gitlab.com/gklab/teaching/reproducible-research-s25.