# Load dplyr for arrange()
library(dplyr)
# Set random seed
set.seed(2412)
# Specify sample size, mean, and standard deviation
n <- 5      # number of points
mu <- 10    # mean
sigma <- 3  # standard deviation
# Simulate normal data, round to one decimal place, and sort ascending
sim_data <- data.frame(x = round(rnorm(n, mu, sigma), 1)) |>
  arrange(x)
Covariance and Correlation
Chapter 2: Lesson 1
Learning Outcomes
Compute the key statistics used to describe the linear relationship between two variables
- Compute the sample mean
- Compute the sample variance
- Compute the sample standard deviation
- Compute the sample covariance
- Compute the sample correlation coefficient
- Explain sample covariance using a scatter plot
Interpret the key statistics used to describe sample data
- Interpret the sample mean
- Interpret the sample variance
- Interpret the sample standard deviation
- Interpret the sample covariance
- Interpret the sample correlation coefficient
Preparation
- Read Sections 2.1-2.2.2 and 2.2.4
Learning Journal Exchange (10 min)
- Review another student’s journal
- What would you add to your learning journal after reading your partner’s?
- What would you recommend your partner add to their learning journal?
- Sign the Learning Journal review sheet for your peer
Class Activity: Variance and Standard Deviation (10 min)
In this section we explore the variance and standard deviation.
The code at the top of this page simulates five observations of a normally distributed random variable with mean 10 and standard deviation 3. We use these data throughout this activity.
The data simulated by this process are: 6.9, 7.7, 8.1, 10.8, 13.5.
The variance and standard deviation are single numbers that summarize how far the data lie from the mean. We first compute the deviations from the mean, $x - \bar{x}$, where $\bar{x} = 9.4$ is the sample mean.
We can summarize this information in a table:
Table 1: Deviations from the mean
$x$ | $x - \bar{x}$
--- | ---
6.9 | -2.5
7.7 | -1.7
8.1 | -1.3
10.8 | 1.4
13.5 | 4.1
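The deviations in Table 1 can be carried through to the variance and standard deviation with a short R sketch. It assumes the five values shown in Table 1 and the usual sample formulas with divisor n - 1 (the divisor used by R's `var()`); the variable names are illustrative only.

```r
# Values taken from Table 1
x <- c(6.9, 7.7, 8.1, 10.8, 13.5)

n     <- length(x)
x_bar <- mean(x)         # sample mean
dev   <- x - x_bar       # deviations from the mean
ss    <- sum(dev^2)      # sum of squared deviations
s2    <- ss / (n - 1)    # sample variance
s     <- sqrt(s2)        # sample standard deviation

round(dev, 1)            # should match the second column of Table 1
c(variance = s2, sd = s)

# The built-in functions should agree with the by-hand values
all.equal(s2, var(x))
all.equal(s, sd(x))
```

Squaring the deviations keeps positive and negative deviations from cancelling, and taking the square root returns the result to the original units of the data.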
Class Activity: Covariance and Correlation (15 min)
Team Activity: Computational Practice (15 min)
Table 3: Computational Practice
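The table below contains values of two time series, x and y. The first row has been completed as an example.
$t$ | $x$ | $y$ | $x - \bar{x}$ | $(x - \bar{x})^2$ | $y - \bar{y}$ | $(y - \bar{y})^2$ | $(x - \bar{x})(y - \bar{y})$
--- | --- | --- | --- | --- | --- | --- | ---
1 | -2.1 | 2.8 | -1.9 | 3.61 | 1 | 1 | -1.9
2 | -0.2 | 2.2 | | | | |
3 | 0.8 | 0.9 | | | | |
4 | 0.4 | 2 | | | | |
5 | 2.3 | -1 | | | | |
6 | -2.4 | 3.9 | | | | |
sum | -1.2 | 10.8 | | | | |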
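After working the table by hand, the columns can be checked with a short R sketch. It assumes the columns of Table 3 are, in order, the deviations of x, their squares, the deviations of y, their squares, and the cross-products, which is consistent with the completed first row. The column names are hypothetical.

```r
x <- c(-2.1, -0.2, 0.8, 0.4, 2.3, -2.4)
y <- c(2.8, 2.2, 0.9, 2, -1, 3.9)

# Build the columns of Table 3 (column names are illustrative)
tab <- data.frame(
  t        = seq_along(x),
  x        = x,
  y        = y,
  x_dev    = x - mean(x),
  x_dev_sq = (x - mean(x))^2,
  y_dev    = y - mean(y),
  y_dev_sq = (y - mean(y))^2,
  cross    = (x - mean(x)) * (y - mean(y))
)

# Column totals for the "sum" row (the index t is excluded)
col_sums <- colSums(tab[, -1])

print(round(tab, 2))
print(round(col_sums, 2))
```

Dividing the totals of the squared-deviation and cross-product columns by n - 1 gives the sample variances and the sample covariance.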
Use the table above to determine these values:
Here is a scatterplot of the data.
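A scatterplot of these data can be drawn with base R graphics. This is only a sketch, not necessarily the plot used in class: the dashed lines at the two sample means are an added illustration that splits the plot into quadrants, so the sign of each point's contribution to the covariance can be read from its position.

```r
x <- c(-2.1, -0.2, 0.8, 0.4, 2.3, -2.4)
y <- c(2.8, 2.2, 0.9, 2, -1, 3.9)

# Scatterplot with dashed reference lines at the sample means
plot(x, y, pch = 19, main = "Scatterplot of x and y")
abline(v = mean(x), lty = 2)
abline(h = mean(y), lty = 2)

# Each point contributes (x - mean(x)) * (y - mean(y)) to the covariance:
# positive in the upper-right and lower-left quadrants,
# negative in the upper-left and lower-right quadrants
contrib <- (x - mean(x)) * (y - mean(y))
round(contrib, 2)
sum(contrib) / (length(x) - 1)   # sample covariance
```

Most of the points fall in the upper-left and lower-right quadrants, so the negative contributions dominate and the sample covariance is negative.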
Summary
Computations in R (5 min)
Use these commands to load the data from the previous activity into R.
x <- c( -2.1, -0.2, 0.8, 0.4, 2.3, -2.4 )
y <- c( 2.8, 2.2, 0.9, 2, -1, 3.9 )
We can use R to compute the mean, variance, standard deviation, correlation coefficient, and covariance.
Mean of x:
mean(x)
[1] -0.2
Variance of x:
var(x)
[1] 3.212
Standard deviation of x:
sd(x)
[1] 1.792205
Correlation coefficient of x and y:
cor(x, y)
[1] -0.9449384
Covariance of x and y:
cov(x, y)
[1] -2.86
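These statistics are related to one another, and a brief sketch can confirm the relationships numerically. It uses the same x and y as above and assumes the standard sample definitions with divisor n - 1; the names `cov_manual` and `r_manual` are illustrative.

```r
x <- c(-2.1, -0.2, 0.8, 0.4, 2.3, -2.4)
y <- c(2.8, 2.2, 0.9, 2, -1, 3.9)

# The standard deviation is the square root of the variance
all.equal(sd(x), sqrt(var(x)))

# Covariance from its definition: sum of cross-products over (n - 1)
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (length(x) - 1)
all.equal(cov_manual, cov(x, y))

# Correlation is the covariance scaled by both standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))
all.equal(r_manual, cor(x, y))
```

Because the correlation coefficient divides out the units of x and y, it always lies between -1 and 1, whereas the covariance depends on the scale of the data.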
Homework Preview (5 min)
- Review upcoming homework assignment
- Clarify questions