2 Quantitative skills laboratory
PURPOSE
- To learn how to record data in electronic format
- To learn how to write hypotheses as equations
- To learn how to choose the appropriate visualizations
- To learn how to make graphs using R Studio
Before coming to the laboratory:
If you have not already, install
R
andRStudio
.Before coming to the laboratory read:
EXERCISE 1. Data entry and graphing with a continuous independent variable
The increase of carbon dioxide in the atmosphere is expected to reduce the concentrations of carbonate ions in the surface of the ocean. This process, in general called ocean acidification, affects the biology and survival of a wide range of marine species because it prevents the formation of shells and plates of calcium carbonate. A long term study on coral reefs showed how the net community calcification rate (G, measured in mmol CaCO3m-2d-1) is related to the concentration of carbonate (CO32-, measured in \(\mu\)mol\(\cdot\)kg-1). We will use the data from the research paper (which is summarized in pg. 58 (or pg. 60 in the 2nd Canadian Edition) of the Campbell textbook) to understand this relationship. The results of this experiment are as follows:
CO3 | G |
---|---|
230.77 | 2.99 |
230.09 | 4.07 |
233.79 | 6.25 |
235.86 | 7.34 |
241.84 | 8.62 |
248.27 | 9.36 |
254.96 | 12.45 |
259.80 | 14.45 |
265.79 | 17.36 |
271.09 | 19.54 |
273.88 | 22.61 |
Enter the data
- We provide this data as a .csv called
LangdonDataset.csv
. You can view and download the data from a github repo from here. Alternatively, you can just load the data from the internet link as demonstrated below.
NOTE Follow the instructions in Entering and loading data and import the data. For the read.csv
function, quote marks "
are needed, and the path to the data must be specified exactly.
Note that this is just a subset of the data from the full paper. Include a copy of the code you used in your lab report.
Make a graph
- Follow the instructions in Making graphs in R to make a scatterplot for these data and replicate the figure from the Campbell textbook (see Figure 14.1 below). Include both the code you wrote and the final figure in your lab report. NOTE: Don’t worry about figuring out how to write superscripts and subscripts in R for the axis-labels. You can use the “^” symbol to indicate a superscript, and the “-” to indicate a subscript.
HINT Think about what you are specifying as x
and y
in the plot
function. Use the tools you learned about in Introduction to R to explore the dataframe.
NEW TIP instead of entering the data as a vector the way you learned in Making graphs in R, you can use the code below to create an x vector from a data frame (here I titled my data frame “langdon_data”; your name may be different, and I created a vector of data called “CO3”). The $
sign tells R to look in the data frame specified before the $
sign (in the example below the data frame is “langdon_data” and then read the column that comes after the $
sign (in the example here, it’s CO3. The column header you specify after the $ sign MUST match the way it is shown in the dataframe EXACTLY. Use the commands you learned in Introduction to R to confirm what the column header is.
You can use this to create the vector of data that you need to plot on the x-axis, and then write you own code to create the vector of data for the y-axis.
langdon_data <- read.csv("https://raw.githubusercontent.com/ahurford/quant-guide-data/main/BIOL1001-data/LangdonDataset.csv")
CO3 <- langdon_data$CO3
Use the examples in Making graphs in R to create a fully labelled scatter plot. Export this and paste it into your document that you will hand in for this lab. Include both the code you wrote and the final figure in your lab report.
HINT if you want to add a line of best fit based on linear regression, use the function abline
using a linear regression model (the function lm
). Your code will look something like this: abline(lm(y~x))
Refer to the help files for assistance. The two terms on each side of the tilde (“~” symbol) tell R what the equation for the regression y~x
is. In your case substitute in what YOU defined as x
and y
in R. NOTE for the lab assignment, this step is optional.
- Follow the instructions in Making graphs in R to make a second graph - a line graph for these data. Export your line graph insert it into your lab report. Write a figure caption. Include both the code you wrote and the final figure in your lab report.
EXERCISE 2. Bar plots and boxplots with discrete independent variables
We will use data from a research paper that investigated whether plants were able to respond to stress cues emitted from their drought-stressed neighbours [@falik2011rumor]. This research project is also discussed in your textbook. The research team used Pisum sativum plants in an experimental setup that allowed them to connect the plants’ root systems. One plant in the central position of a row was subjected to osmotic stress, while neighbouring plants remained unstressed on both sides. Plants on one side of the stressed plant shared roots with others in the group but did not share roots with the stressed plant; these were the control group. On the other side, the stressed plant shared its roots with its nearest unstressed neighbour, and all the other plants shared their roots with their nearest neighbour (See Figure 2.2).
Stress was quantified fifteen minutes after the induction of drought by measuring the width of the stomatal openings on the leaves of the plants. The results of this experiment are as follows:
type | plant | width |
---|---|---|
control | 1 | 12.109661 |
15min | 1 | 11.577023 |
control | 2 | 11.608355 |
15min | 2 | 10.903394 |
control | 3 | 9.446475 |
15min | 3 | 12.093995 |
control | 4 | 9.697128 |
15min | 4 | 11.013055 |
control | 5 | 10.840731 |
15min | 5 | 11.483029 |
control | 6 | 10.746736 |
15min | 6 | 7.221932 |
control | 7 | 10.605744 |
15min | 7 | 7.973890 |
control | 8 | 11.780679 |
15min | 8 | 7.080940 |
control | 9 | 11.608355 |
15min | 9 | 8.772846 |
control | 10 | 11.686684 |
15min | 10 | 9.838120 |
control | 11 | 11.874674 |
15min | 11 | 11.639687 |
Questions
Follow the instructions in Entering and loading data to load the data file FalikDataset.csv using the command line (
read.csv
) option. Note that the data in the CSV file are ordered differently than in the list above (to verify this you can open the CSV file in Excel to see the full set).Looking at the dataset, can you tell how many treatments were applied in the experiment? How many plants were used for each treatment?
What is the smallest stomatal width measured? What is the largest?
EXERCISE 2b. Making a bar plot
Questions
- Follow the instructions below to replicate the figure from the Campbell textbook (bar plot; see Figure 14.3 below). Export your bar plot and insert it into your lab report. Add a caption to your bar plot and hand it in with your lab report. Include the code you wrote to generate the figure.
To make a bar plot, use the code below as a starting point, and then use the resources in this guide to add appropriate x and y axis labels.Don’t worry about figuring out how to write the symbol “mu” in R; you can just write “um” or “micrometres” for now. However, if you want to try writing Greek symbols in R, look at the code below for the boxplot example, and see if you can apply it to your bar plot.
width <- falik_data$width #This creates a vector of data of the plant stomatal widths
plant <- falik_data$plant #The creates a vector of data of the plant ID numbers
barplot(width, names = plant, col = c("blue", "red"))
In the bar plot, what can you tell about the stomatal openings of the treatment plant vs. the control plants? Is it consistent across all the individual plants?
Follow the instructions below to re-plot the data from EXERCISE 2b as a boxplot that compares control versus treatment. Export your boxplot and insert it into your lab report. Add a caption to your boxplot. NOTE: If you describe what the symbols/colours mean in the caption, then you do not need to include a legend. If you want to try adding a legend (adding one is optional) then look up “legend” in the help window.
control <- subset(falik_data, type == "control") #This subset command is a handy function to break the data into subsets. Here, we're creating a data frame with just the values from the control.
treatment <- subset(falik_data, type == "15min") #Here we are subsetting the data to contain only the data from the treatment
boxplot(control$width, treatment$width, names = c("control", "treatment"), ylab = expression(paste("stomatal opening ( ", mu, ")")))
Label the following on the boxplot: median, 25% and 75% quantiles, 95% confidence limits. HINT you can insert the exported graph into a blank PowerPoint slide and use lines/text boxes to draw the labels. Then save the slide as an image file (e.g., JPEG) and insert it in your lab report.
What does the boxplot tell you about the differences between the treatment plants (plants 6-11) vs. the control plants that the bar plot does not?
Why do you think a scientist might choose to represent these data one way over another?