Writing with Quarto

Announcements

  • Activity 1 due at the end of next week

  • Instructions are available at the top of the activity

  • For graduate students: identify 5 papers that are important to your dissertation and whose results you think would be worth reproducing

    • Characteristics of good papers:
    • Ideally published within the last ~20 years
    • Analysis does not rely on extremely heavy computation
    • Original authors did not make a fully reproducible pipeline available
    • (But if they did, we can discuss potential ways forward.)

Quarto

Quarto files are designed to be used in three ways:

  • For communicating to decision-makers, who want to focus on the conclusions, not the code behind the analysis.

  • For collaborating with other data scientists (including future you!), who are interested in both your conclusions, and how you reached them (i.e. the code).

  • As an environment in which to do data science, as a modern-day lab notebook where you can capture not only what you did, but also what you were thinking.

— R for Data Science, Ch. 28

Anatomy of a qmd file

(link to download this file)

Lines 1–5: “YAML” header - this is the space to add information about your file (title, author, date, additional details).

  • Demarcated by three dashes (---)
  • Options always appear in key: value format
  • Lots of possible options – see this guide for details.
    • No need to get overwhelmed by the options for now - but good to keep in mind for future projects.

Lines 7–15: R code chunk

  • Demarcated by three backticks, followed by the programming language name in curly braces (```{r} {code} ```)
  • Lines 8–9 are options for the R code chunk - these control, e.g. whether the code is run or not, how big figures are, etc.
  • Lines 11–15 are standard R code

Lines 16–20: Text in Markdown

  • This is text meant to be read by humans
  • Can customize how text is rendered, e.g. adding * around a word to italicize: *quarto* becomes quarto
  • Read about additional formatting options here

What to do with qmd files

  • Render (i.e. “generate”) into a “public facing” document, with all the source code readily available.
  • Settings for rendering are based on the YAML header of the file

Some review from the Crump lab tutorial

Markdown options

  • Goal: write in plain text to generate documents with “rich” formatting
  • *text in italics* \(\to\) text in italics

  • **text in bold** \(\to\) text in bold

  • ***text in bold-italic*** \(\to\) text in bold-italic

  • [underlined text]{.underline} \(\to\) underlined text

  • [text in small caps]{.smallcaps} \(\to\) text in small caps

  • text with ^superscript^ or ~subscript~ \(\to\) text with superscript or subscript

  • $\frac{dN}{dt} = rN$ \(\to\) \(\frac{dN}{dt} = rN\)

  • Community ecology is a mess [@lawton_1999] \(\to\) Community ecology is a mess (Lawton 1999)

  • You can also add footnotes^[like this] \(\to\) You can also add footnotes¹

Markdown options, cont’d

![Here's a picture of a Panamanian golden frog](https://upload.wikimedia.org/wikipedia/commons/5/55/Atelopus_zeteki1.jpg) \(\to\)

Here’s a picture of a Panamanian golden frog

Markdown options, cont’d

# Header 1

## Header 2

### Header 3

Markdown options, cont’d

See https://quarto.org/docs/authoring/markdown-basics.html for a comprehensive guide to markdown options

R code chunks

  • To write R code in qmd documents, we need to insert code “chunks”. There are three ways to do so:
  1. The keyboard shortcut Cmd + Option + I / Ctrl + Alt + I.

  2. The “Insert” button icon in the editor toolbar.

  3. By manually typing the chunk delimiters ```{r} and ```.

— R for Data Science, Ch. 28

```{r}
1 + 1
```
[1] 2

R code chunks, cont’d

  • Code chunks can be modified with several options
  • Simple example: labeling the chunk with a name
Listing 1: Simple Addition
```{r}
#| lst-label: lst-simple-addition
#| lst-cap: Simple Addition

1 + 1
```
[1] 2

As seen in Listing 1, 1+1=2

R code chunks, cont’d

  • Some R chunks can be shown but not run (“evaluated”)
```{r}
#| label: simple-multiplication
#| eval: false
2 * 2
```

R code chunks, cont’d

  • R code chunks can control the appearance of the output
```{r}
#| label: first-figure
#| fig-width: 2

library(ggplot2)  # provides ggplot(), aes(), and geom_point()

cars |>
  ggplot(aes(x = speed, y = dist)) +
  geom_point()
```

R code chunks, cont’d

  • R code chunks can control the appearance of the output
```{r}
#| label: second-figure
#| fig-width: 8

cars |> 
  ggplot(aes(x = speed, y = dist)) + 
  geom_point()
```

R code chunks, cont’d

  • R code chunks can control the appearance of the output
```{r}
#| label: third-figure
#| fig-width: 8
#| fig-cap: "Plot of speed vs. dist"
#| fig-subcap: "Generated from `cars` dataset in R"
#| fig-align: center

cars |> 
  ggplot(aes(x = speed, y = dist)) + 
  geom_point()
```

[Figure: Plot of speed vs. dist. Subcaption: Generated from cars dataset in R]

Day 2: Advanced Quarto utilities

Status check

  • Reminder of “Project” oriented workflow: every time you come to class, or sit down to work on class material, open the appropriate project in RStudio, and work from there.

  • As of now, there are very few regular pushes to GitLab (at least in publicly viewable repositories)

  • Consistency is your best friend for getting into a habit, so try to push notes at least once at the end of class. Bonus points for setting up RStudio Projects for your other coursework. Extra bonus for tracking these with Git.

|>: Piping in R

  • Recall the pipe from the Bash shell:
wc -l *.txt | sort
  • “Take the output of the left side, and pass it as input to the command on the right side”

  • For a review, revisit Episode 4 of the Software Carpentry workshop on the Bash Shell: link

|>: Piping in R

Consider some computations on a vector vec:

  • Take the log of vec
  • Then take the lagged difference of the logged vector (i.e. x[2]-x[1], x[3]-x[2], …, where x is the logged vector)
  • Then exponentiate the differenced vector
  • Finally, round the result to 3 decimal places
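
The four steps above can be sketched in base R, first as nested function calls and then with the native pipe. This is a minimal illustration, not the code used in class: the values in `vec` are made up, and the `|>` operator assumes R 4.1 or later.

```r
# Hypothetical example vector (values chosen only for illustration)
vec <- c(1, 2, 4, 8, 16)

# Nested calls: the steps are read from the inside out
nested <- round(exp(diff(log(vec))), 3)

# Native pipe: the same steps, read top to bottom
piped <- vec |>
  log() |>      # take the log of vec
  diff() |>     # lagged difference of the logged vector
  exp() |>      # exponentiate the differences
  round(3)      # round to 3 decimal places

# Both versions perform the same computation
identical(nested, piped)
```

Both versions call the same functions in the same order; the piped form simply lists the steps in the order they are described, which is why it tends to be easier to read as the chain grows.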

Quarto utilities for scientific writing

  • Changing output formats (beyond HTML)
  • Embedding figures, tables, and equations
  • Bibliography/reference management
  • Generating appendices/supplements
  • Word templates

Changing output formats beyond HTML

  • HTML Page, PDF, Docx file, Revealjs slides, etc. ✔

Embedding figures, tables, and equations

  • Modify demo.qmd to show how figure captions can be generated, figures can be cross-referenced, and ditto for tables. ✔

  • If your figures are generated by code within the Quarto document, there is no need to insert them separately: they are inserted automatically when you render the file!

  • If you have additional figures (e.g. to add a microscopy photo), use markdown syntax:

![Figure caption](path/to/figure.png)

Bibliography/reference management

Generating appendices/supplements

  • “Including” other Quarto documents

Customized Word templates

  • Link to Gaurav’s preferred template for Docx generation

Quarto as a “swiss army knife” for scientific writing

  • When preparing a manuscript for journal submission, there are several moving pieces to manage:

    • Reference management/inserting citations
    • Inserting figures, tables, equations
    • Generating appendices/supplements
    • Sharing code with reviewers
    • Keeping track of your changes during peer-review

Quarto can help with all of these!

  • See the editorial by the Journal of Ecology for how editors are thinking about the role of Quarto and other reproducible report generation methods.

Lawton, John H. 1999. “Are There General Laws in Ecology?” Oikos, 177–92.