Version control with git

http://www.phdcomics.com/comics/archive.php?comicid=1323

https://phdcomics.com/comics/archive/phd101212s.gif

Version control to the rescue

  • One file, lots of versions

    • Historical versions
    • Multiple versions that you might be considering
    • Multiple versions across collaborators
    • Changes are explicitly tracked so you know what’s changing

Basics of git

  • Version control software
  • Can be used entirely on your own computer, without connecting to online (remote) platforms such as GitLab or GitHub
    • But the biggest gains come from syncing to a remote
    • Share with collaborators
    • Share with outside world
    • Share with yourself

Basics of git

Simplest workflow: just you working on your project, on only one computer.

  • Do your work (enter data, edit your qmd manuscript, etc.)
  • Every time you make a “meaningful” change, commit your work with a message.
  • Push to a remote repository if you want to back up your work.

Illustrations from the Openscapes blog post “GitHub for supporting, contributing, and failing safely” by Allison Horst and Julia Lowndes

Implementing a simple git workflow

  • “Remote-first” approach
  1. Create a blank repository on GitLab (or GitHub)
  2. Clone it to your local machine by creating a new RProject
  3. Work on your code/writing
  4. When you feel you are at an “anchor point”, commit your work and push
  5. Repeat 3–4 as needed.
  • Live demo
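The same remote-first steps can be sketched on the command line. This is only an illustration: a local bare repository stands in for the blank GitLab/GitHub repository (with a real remote you would clone its URL instead), and the file name, commit message, and author identity are placeholders.

```shell
# Remote-first sketch. The bare repository below is a stand-in for a blank
# GitLab/GitHub repository; all names are placeholders.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"               # 1. blank "remote" repository

git clone -q "$sandbox/remote.git" "$sandbox/project"  # 2. clone it locally
cd "$sandbox/project"
git config user.name "Example Author"                  # placeholder identity
git config user.email "author@example.com"

echo "# My project" > README.md                        # 3. do some work
git add README.md                                      # 4. stage, commit, push
git commit -q -m "Add README with project title"
git push -q -u origin HEAD                             # -u remembers the remote branch

git log --oneline                                      # inspect the history so far
```

After the first push with `-u`, later pushes from this clone need only `git push`.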

Implementing a simple git workflow

  • “Local-first” approach
  1. Create a new repository on your local computer (e.g. by initiating a new RProject with “Use version control” selected)
  2. Create a new repository on GitLab, and connect your local repository to it.
  3. Work on your code/writing
  4. When you feel you are at an “anchor point”, commit your work and push
  5. Repeat 3–4 as needed.
  • Live demo
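A command-line sketch of the local-first steps, under the same assumptions as before: a local bare repository plays the role of the new GitLab repository, and the branch name `main`, the file name, and the author identity are placeholders.

```shell
# Local-first sketch. The bare repository is a stand-in for a new GitLab
# repository; with a real one, use its URL in "git remote add".
sandbox=$(mktemp -d)
mkdir "$sandbox/project"
cd "$sandbox/project"
git init -q                                    # 1. new local repository
git checkout -q -b main                        #    name the first branch "main"
git config user.name "Example Author"          # placeholder identity
git config user.email "author@example.com"

git init -q --bare "$sandbox/remote.git"       # 2. create the "remote" ...
git remote add origin "$sandbox/remote.git"    #    ... and connect to it

echo "Draft notes" > notes.txt                 # 3. do some work
git add notes.txt
git commit -q -m "Add first draft of notes"    # 4. commit and push
git push -q -u origin main

git remote -v                                  # show where pushes and pulls go
```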

Aside: When should I commit?

Using a Git commit is like using anchors and other protection when climbing. If you’re crossing a dangerous rock face you want to make sure you’ve used protection to catch you if you fall.

Commits play a similar role: if you make a mistake, you can’t fall past the previous commit. Coding without commits is like free-climbing: you can travel much faster in the short-term, but in the long-term the chances of catastrophic failure are high!

Like rock climbing protection, you want to be judicious in your use of commits. Committing too frequently will slow your progress; use more commits when you’re in uncertain or dangerous territory. Commits are also helpful to others, because they show your journey, not just the destination.

— Hadley Wickham, quoted in Happy git with R

Commits should be ‘atomic’, meaning that they should do one simple thing and they should do it completely. For example, an ‘atomic’ commit could be adding a new function or renaming a variable. If a lot of different changes to your project are all committed together, it can be hard to troubleshoot if any error appears in that version. Furthermore, undoing the whole commit may throw away valid and useful work.
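A small sketch of why atomic commits help, using made-up file names and messages. Two unrelated edits are committed separately, so one of them can later be undone without touching the other.

```shell
# Atomic-commit sketch; file names, contents, and identity are placeholders.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "x <- 1" > analysis.R
echo "Intro" > paper.txt
git add analysis.R paper.txt
git commit -q -m "Set up project skeleton"

# Two unrelated edits are sitting in the working directory ...
echo "y <- x + 1" >> analysis.R
echo "Methods" >> paper.txt

# ... so each one is staged and committed on its own.
git add analysis.R
git commit -q -m "Compute y from x in analysis script"
git add paper.txt
git commit -q -m "Add Methods heading to paper draft"

# Undo only the paper edit; the analysis change is untouched.
git revert --no-edit HEAD
git log --oneline
```

Had both edits gone into a single commit, reverting it would also have thrown away the change to `analysis.R`.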

Commit messages

Git, day 2

First wrinkle:

You are working on a project with a collaborator.

  1. Create a blank repository on GitLab (or GitHub)

  2. Clone it to both machines

  3. Recall the previous workflow: Work, commit, push

  4. We now add an extra step at the beginning: Pull, work, commit, push
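The pull-first habit can be tried out on a single computer. In this sketch two clones of one local bare repository stand in for your machine and your collaborator's machine; the directory names, file name, and identities are placeholders.

```shell
# Collaboration sketch: "you" and "collaborator" are two clones of the same
# stand-in remote. All names are placeholders.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"

# You: clone, work, commit, push.
git clone -q "$sandbox/remote.git" "$sandbox/you"
cd "$sandbox/you"
git config user.name "Example Author"
git config user.email "author@example.com"
echo "Shared analysis plan" > plan.txt
git add plan.txt
git commit -q -m "Add analysis plan"
git push -q -u origin HEAD

# Collaborator: clone, work, commit, push.
git clone -q "$sandbox/remote.git" "$sandbox/collaborator"
cd "$sandbox/collaborator"
git config user.name "Example Collaborator"
git config user.email "collaborator@example.com"
echo "Step 1: clean the data" >> plan.txt
git add plan.txt
git commit -q -m "Add data cleaning step to plan"
git push -q

# You, at the start of your next session: pull before doing anything else.
cd "$sandbox/you"
git pull -q
cat plan.txt                                   # now includes the collaborator's line
```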

First wrinkle:

You are working on a project with a collaborator.

  • Live exercise:
  1. Organize into pairs or groups of 3 (Required for this activity)
  2. Each person creates a new repository (either local-first in RStudio, or remote-first on GitLab)
  3. Push a change to your repository.
  4. Make a clone of your partner’s repository, and make a change.
  5. Push your change to GitLab.
  6. Pull changes to your local repo from GitLab.

Second wrinkle:

You are testing out some new features of a project - the work may or may not end in success!

(Practical example: you have a working draft of the paper but your advisor suggests a different way to structure the discussion. You want to explore what it would look like if you were to adopt this suggestion.)

  1. Within your R project, create a new branch.
  2. Work as you normally would.
  3. Once you are done exploring the changes, you can choose to merge them into the main branch, or leave the branch unmerged.
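The branch steps above can be sketched on the command line as follows. The branch names `main` and `restructure-discussion`, the file, and the identity are placeholders chosen for this illustration.

```shell
# Branching sketch; names and contents are placeholders.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git checkout -q -b main
git config user.name "Example Author"
git config user.email "author@example.com"

echo "Discussion: original structure" > paper.txt
git add paper.txt
git commit -q -m "Add working draft of discussion"

git checkout -q -b restructure-discussion            # 1. create and switch to a new branch
echo "Discussion: advisor's structure" > paper.txt   # 2. work as usual
git commit -q -am "Try advisor's structure for discussion"

git checkout -q main                                 # main still holds the original draft
cat paper.txt

git merge -q restructure-discussion                  # 3. adopt the changes in main
cat paper.txt
```

If the experiment does not work out, skip the merge; the main branch is unaffected by commits made on the other branch.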

The Turing Way project illustration by Scriberia. Zenodo.

  • Live demo

Other assorted features of git/gitlab

  • “Releases”/tags
  • Live demo
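On the command line, a release point can be marked with an annotated tag. This is a minimal sketch: the tag name `v1.0`, its message, and the file are made up for the example.

```shell
# Tagging sketch; tag name, message, and file are placeholders.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "Final manuscript text" > paper.txt
git add paper.txt
git commit -q -m "Finish manuscript for submission"

git tag -a v1.0 -m "Version submitted to journal"   # annotated tag on the current commit
git tag --list                                      # list all tags
git log --oneline --decorate                        # the tag appears next to its commit
```

Tags are not sent by a plain `git push`; with a remote configured, `git push origin v1.0` would publish this one.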

Other assorted features of git/gitlab

  • Ignoring things
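Ignoring is done with a `.gitignore` file that lists patterns git should leave untracked. The patterns below (R session files and an `output/` folder) are only assumptions about what one might want to ignore in an R project.

```shell
# .gitignore sketch; the ignored patterns and file names are placeholders.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# One pattern per line; a trailing slash matches a whole directory.
printf '%s\n' '.Rhistory' '.RData' 'output/' > .gitignore

touch .Rhistory                                 # session clutter
mkdir output
echo "large generated table" > output/results.csv
echo "x <- 1" > analysis.R                      # the file we actually care about

git add .                                       # "add everything" skips ignored files
git status --short --ignored                    # ignored entries are marked with !!
git commit -q -m "Add analysis script and ignore rules"
git ls-files                                    # files that git is tracking
```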

Other assorted features of git/gitlab

  • Using git from the command line (bash shell)

    • pull, add, commit, push pipeline
    • git branch
    • git diff
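A short session showing the inspection commands from this list. The repository and file are throwaway placeholders; pulling and pushing are left out here because they need a remote.

```shell
# Command-line sketch of status, diff, and branch; names are placeholders.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Start notes"

echo "second line" >> notes.txt
git status --short            # which files changed?
git diff                      # what changed, not yet staged
git add notes.txt
git diff --staged             # what will go into the next commit
git commit -q -m "Add second line to notes"

git branch                    # list branches; * marks the current one
git log --oneline             # one line per commit
```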

Other assorted features of git/gitlab

  • Using git for maintaining a “digital lab notebook”
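One possible reading of the lab-notebook idea, offered as a sketch rather than a prescribed method: keep dated entries in a plain-text file, commit each entry, and read the commit log back as a timeline. The file name `notebook.md` and the entries are invented for the example.

```shell
# Lab-notebook sketch; file name, entries, and identity are placeholders.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "Ran pilot survey; 12 responses" >> notebook.md
git add notebook.md
git commit -q -m "Notebook: pilot survey run"

echo "Cleaned pilot data; dropped 1 duplicate" >> notebook.md
git add notebook.md
git commit -q -m "Notebook: pilot data cleaned"

# Read the history back as a dated list of entries.
git log --date=short --format="%ad  %s"
```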

More learning

There are tons of resources for learning git.

  • Read a few and see which style(s) suit you best.

  • A handy “cheatsheet” for when something breaks: https://dangitgit.com/

In-class exercise