Autoregressive (AR) Models

Chapter 4: Lesson 3

Learning Outcomes

Characterize the properties of an \(AR(p)\) stochastic process
  • Define an \(AR(p)\) stochastic process
  • Express an \(AR(p)\) process using the backward shift operator
  • State an \(AR(p)\) forecast (or prediction) function
  • Identify stationarity of an \(AR(p)\) process using the backward shift operator
  • Determine the stationarity of an \(AR(p)\) process using a characteristic equation
Check model adequacy using diagnostic plots like correlograms of residuals
  • Characterize a random walk’s second-order properties using a correlogram
  • Define partial autocorrelations
  • Explain how to use a partial correlogram to decide what model would be suitable to estimate an \(AR(p)\) process
  • Demonstrate the use of a partial correlogram via simulation

Preparation

  • Read Section 4.5

Learning Journal Exchange (10 min)

  • Review another student’s journal

  • What would you add to your learning journal after reading another student’s?

  • What would you recommend the other student add to their learning journal?

  • Sign the Learning Journal review sheet for your peer

Class Activity: Definition of Autoregressive (AR) Models (10 min)

We now define an autoregressive (or AR) model.

Definition of an Autoregressive (AR) Model

The time series \(\{x_t\}\) is an autoregressive process of order \(p\), denoted as \(AR(p)\), if \[ x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \alpha_3 x_{t-3} + \cdots + \alpha_{p-1} x_{t-(p-1)} + \alpha_p x_{t-p} + w_t ~~~~~~~~~~~~~~~~~~~~~~~ (4.15) \]

where \(\{w_t\}\) is white noise and the \(\alpha_i\) are the model parameters with \(\alpha_p \ne 0\).

In short, this means that the next observation of a time series depends linearly on the previous \(p\) terms and a random white noise component.

Check Your Understanding
  • Show that we can write Equation (4.15) as a polynomial of order \(p\) in terms of the backward shift operator: \[ \left( 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p \right) x_t = w_t \]

We have seen some special cases of this model already.

Check Your Understanding
  • Give another name for an \(AR(0)\) model.

  • Show that the random walk is the special case \(AR(1)\) with \(\alpha_1 = 1\). (See Chapter 4, Lesson 1.)

  • Show that the exponential smoothing model is the special case where \[\alpha_i = \alpha(1-\alpha)^{i-1}\] for \(i = 1, 2, \ldots\) and \(p \rightarrow \infty\). (See Chapter 3, Lesson 2.)

We now explore the autoregressive properties of this model.

Check Your Understanding
  • Show that the \(AR(p)\) model is a regression of \(x_t\) on past terms from the same series. Hint: write the \(AR(p)\) model in more familiar terms, letting \[y_i = x_t, ~~ x_1 = x_{t-1}, ~~ x_2 = x_{t-2}, ~~ \ldots, ~~ x_p = x_{t-p}, ~~ \text{and} ~~ \epsilon_i = w_t\]

  • Explain why the prediction at time \(t\) is given by \[ \hat x_t = \hat \alpha_1 x_{t-1} + \hat \alpha_2 x_{t-2} + \cdots + \hat \alpha_{p-1} x_{t-(p-1)} + \hat \alpha_p x_{t-p} \]

  • Explain why the model parameters (the \(\alpha\)’s) can be estimated by minimizing the sum of the squared error terms: \[\sum_{t=1}^n \left( \hat w_t \right)^2 = \sum_{t=1}^n \left( x_t - \hat x_t \right)^2\]

  • What is the reason this is called an autoregressive model?
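
These ideas can be tried directly in R. Here is a minimal sketch (the simulated series and its coefficients are arbitrary choices, not from the text): the built-in ar() function with method = "ols" estimates the \(\alpha\)’s by minimizing the sum of squared errors, and predict() applies the forecast function.

# Minimal sketch: estimate the alphas by least squares, then forecast one step ahead
set.seed(123)
x <- arima.sim(model = list(ar = c(0.6, 0.25)), n = 500)  # a simulated AR(2) series

fit <- ar(x, order.max = 5, method = "ols")  # selects p by AIC; estimates alphas by OLS
fit                                          # shows the selected order and coefficients
predict(fit, n.ahead = 1)$pred               # one-step-ahead forecast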

Class Activity: Exploring \(AR(1)\) Models (10 min)

Definition

Recall that an \(AR(p)\) model is of the form \[ x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \alpha_3 x_{t-3} + \cdots + \alpha_{p-1} x_{t-(p-1)} + \alpha_p x_{t-p} + w_t \] So, an \(AR(1)\) model is expressed as \[ x_t = \alpha x_{t-1} + w_t \] where \(\{w_t\}\) is a white noise series with mean zero and variance \(\sigma^2\).

Second-Order Properties of an \(AR(1)\) Model

We now explore the second-order properties of this model.

Second-Order Properties of an \(AR(1)\) Model

If \(\{x_t\}_{t=1}^n\) is a stable \(AR(1)\) process (i.e., \(|\alpha| < 1\)), then its first- and second-order properties are summarized below.

\[ \begin{align*} \mu_x &= 0 \\ \gamma_k = cov(x_t, x_{t+k}) &= \frac{\alpha^k \sigma^2}{1-\alpha^2} \end{align*} \]

Why is \(cov(x_t, x_{t+k}) = \dfrac{\alpha^k \sigma^2}{1-\alpha^2}\)?

If \(\{x_t\}\) is a stable \(AR(1)\) process (which means that \(|\alpha| < 1\)), then it can be written as:

\[\begin{align*} (1-\alpha \mathbf{B}) x_t &= w_t \\ \implies x_t &= (1-\alpha \mathbf{B})^{-1} w_t \\ &= w_t + \alpha w_{t-1} + \alpha^2 w_{t-2} + \alpha^3 w_{t-3} + \cdots \\ &= \sum\limits_{i=0}^\infty \alpha^i w_{t-i} \end{align*}\]

From this, we can deduce that the mean is

\[ E(x_t) = E\left( \sum\limits_{i=0}^\infty \alpha^i w_{t-i} \right) = \sum\limits_{i=0}^\infty \alpha^i E\left( w_{t-i} \right) = 0 \]

The autocovariance is computed similarly. Because \(\{w_t\}\) is white noise, \(cov(w_{t-i}, w_{t+k-j}) = \sigma^2\) when \(j = k+i\) and is zero otherwise, so only those terms survive in the double sum:

\[\begin{align*} \gamma_k = cov(x_t, x_{t+k}) &= cov \left( \sum\limits_{i=0}^\infty \alpha^i w_{t-i}, ~ \sum\limits_{j=0}^\infty \alpha^j w_{t+k-j} \right) \\ &= \sum\limits_{i=0}^\infty \alpha^i \alpha^{k+i} \, cov \left( w_{t-i}, w_{t-i} \right) \\ &= \alpha^k \sigma^2 \sum\limits_{i=0}^\infty \alpha^{2i} \\ &= \frac{\alpha^k \sigma^2}{1-\alpha^2} \end{align*}\]

See Equations (2.15) and (4.2).
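
As a quick numeric sanity check (a sketch; the values \(\alpha = 0.6\) and \(\sigma = 1\) are arbitrary), we can simulate a long \(AR(1)\) series and compare its sample variance with \(\gamma_0 = \sigma^2 / (1-\alpha^2)\):

# Check gamma_0 = sigma^2 / (1 - alpha^2) by simulation
set.seed(42)
x <- arima.sim(model = list(ar = 0.6), n = 100000)  # AR(1) with alpha = 0.6, sigma = 1
var(x)           # sample variance of the simulated series
1 / (1 - 0.6^2)  # theoretical gamma_0 = 1.5625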

Correlogram of an \(AR(1)\) process

Correlogram of an AR(1) Process

The autocorrelation function for an AR(1) process is

\[ \rho_k = \alpha^k ~~~~~~ (k \ge 0) \] where \(|\alpha| < 1\).

Check Your Understanding
  • Use the equation for the autocovariance function, \(\gamma_k\), to show that \[ \rho_k = \alpha^k \] for \(k \ge 0\) when \(|\alpha|<1\).

  • Use this to explain why the correlogram decays to zero more quickly when \(|\alpha|\) is small.
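
To check the first result numerically, here is a minimal sketch (with \(\alpha = 0.7\) as an arbitrary choice) comparing the sample ACF of a simulated \(AR(1)\) series with the theoretical values \(\alpha^k\):

# Compare the sample ACF of a simulated AR(1) series with rho_k = alpha^k
set.seed(1)
alpha <- 0.7
x <- arima.sim(model = list(ar = alpha), n = 5000)
sample_acf <- acf(x, lag.max = 10, plot = FALSE)$acf[-1]  # drop the lag-0 value
round(cbind(sample = sample_acf, theory = alpha^(1:10)), 3)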

Small Group Activity: Simulation of an \(AR(1)\) Process

Check Your Understanding

In each of the following cases, what do you observe in the correlogram? (If you expect to see significant results and you do not, try increasing the number of points.) A minimal simulation sketch appears after the list.

  • \(\alpha = 1\)
  • \(\alpha = 0.5\)
  • \(\alpha = 0.1\)
  • \(\alpha = 0\)
  • \(\alpha = -0.1\)
  • \(\alpha = -0.5\)
  • \(\alpha = -1\)
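
If you are not using a simulation app, the following minimal sketch generates the series directly. (A loop is used instead of arima.sim() because arima.sim() rejects the non-stationary cases \(\alpha = \pm 1\).) Change alpha and n_obs to explore each case:

# Simulate x_t = alpha * x_{t-1} + w_t and plot its correlogram
alpha <- 0.5   # try 1, 0.5, 0.1, 0, -0.1, -0.5, and -1
n_obs <- 1000  # increase this if expected significant results do not appear

w <- rnorm(n_obs)   # white noise
x <- numeric(n_obs)
x[1] <- w[1]
for (t in 2:n_obs) {
  x[t] <- alpha * x[t-1] + w[t]
}

acf(x, plot = TRUE, lag.max = 25)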

Class Activity: Partial Autocorrelation (10 min)

Definition of Partial Autocorrelation

Partial Autocorrelation

The partial autocorrelation at lag \(k\) is defined as the portion of the correlation that is not explained by shorter lags.

For example, the partial correlation for lag 4 is the correlation not explained by lags 1, 2, or 3.

Check Your Understanding
  • What is the value of the partial autocorrelation function for an \(AR(2)\) process for all lags greater than 2?

On page 81, the textbook states that, in general, the partial autocorrelation at lag \(k\) is the \(k^{th}\) coefficient of a fitted \(AR(k)\) model. This implies that if the underlying process is \(AR(p)\), then the coefficients \(\alpha_k = 0\) for all \(k > p\), so an \(AR(p)\) process yields partial autocorrelations that are zero after lag \(p\). A correlogram of partial autocorrelations can therefore help determine the order of an appropriate \(AR\) process to model a time series.

ACF and PACF of an \(AR(p)\) Process

For an \(AR(p)\) process, we observe the following:

  • ACF: tails off (decays gradually toward zero)
  • PACF: cuts off after lag \(p\)
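
Here is a quick simulated illustration of this pattern (a sketch; the coefficients 0.5 and 0.3 are arbitrary values that give a stationary process):

# For a simulated AR(2) process, the ACF tails off while the PACF cuts off after lag 2
set.seed(7)
x <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 2000)
acf(x, lag.max = 25)   # tails off gradually
pacf(x, lag.max = 25)  # significant spikes at lags 1 and 2 only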

Example: McDonald’s Stock Price

Here is a partial autocorrelation plot for the McDonald’s stock price data:

Show the code
# Load the packages this chunk uses
library(tidyverse)  # mutate(), select(), arrange(), lag(), and the pipe
library(tsibble)    # as_tsibble()

# Set symbol and date range
symbol <- "MCD"
company <- "McDonald's"

# Retrieve static file
stock_df <- rio::import("https://byuistats.github.io/timeseries/data/stock_price_mcd.parquet")

# Transform data into tibble
stock_ts <- stock_df |>
  mutate(
    dates = date,
    value = adjusted
  ) |>
  select(dates, value) |>
  as_tibble() |>
  arrange(dates) |>
  mutate(diff = value - lag(value)) |>
  as_tsibble(index = dates, key = NULL)

pacf(stock_ts$value, plot=TRUE, lag.max = 25)

The only significant partial autocorrelation is at lag \(k=1\). This suggests that an \(AR(1)\) process could be used to model the McDonald’s stock prices.

Partial Autocorrelation Plots of Various \(AR(p)\) Processes

Here are some time plots, correlograms, and partial correlograms for \(AR(p)\) processes with various values of \(p\).

Shiny App

Class Activity: Stationary and Non-Stationary AR Processes (15 min)

Definition of the Characteristic Equation

Treating the symbol \(\mathbf{B}\) formally as a number (either real or complex), we can write the \(AR(p)\) model as \(\theta_p(\mathbf{B}) x_t = w_t\), where the polynomial

\[ \theta_p(\mathbf{B}) = 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p \]

is called the characteristic polynomial of an AR process.

If we set the characteristic polynomial to zero, we get the characteristic equation:

\[ \theta_p(\mathbf{B}) = \left( 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p \right) = 0 \]

The roots of the characteristic polynomial are the values of \(\mathbf{B}\) that make the polynomial equal to zero; that is, the values of \(\mathbf{B}\) for which \(\theta_p(\mathbf{B}) = 0\). These are also called the solutions of the characteristic equation. The roots of the characteristic polynomial can be real or complex numbers.

We now explore an important result for AR processes that uses the absolute value of complex numbers.

Identifying Stationary Processes

An AR process is stationary if the absolute values of all the solutions of its characteristic equation are strictly greater than 1.

First, we will find the roots of the characteristic polynomial (i.e., the solutions of the characteristic equation), and then we will determine whether the absolute values of these solutions are all strictly greater than 1.

We can use the polyroot function to find the roots of polynomials in R. For example, to find the roots of the polynomial \(x^2-x-6\), we apply the command

polyroot(c(-6,-1,1))
[1]  3+0i -2+0i

Note the order of the coefficients. They are given in increasing order of the power of \(x\).

Of course, we could simply factor the polynomial: \[ x^2-x-6 = (x-3)(x+2) \overset{set}{=} 0 \] which implies that \[ x = 3 ~~~ \text{or} ~~~ x = -2 \]

Definition of the Absolute Value in the Complex Plane

Let \(z = a+bi\) be any complex number. It can be represented by the point \((a,b)\) in the complex plane. We define the absolute value of \(z\) as the distance from the origin to the point:

\[ |z| = \sqrt{a^2 + b^2} \]
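
In R, the built-in Mod() function returns this absolute value (abs() gives the same result for complex arguments), and it can be applied directly to the output of polyroot():

# Absolute value (modulus) of complex numbers in R
Mod(-3 + 4i)                 # sqrt((-3)^2 + 4^2) = 5
abs(-3 + 4i)                 # same result; abs() computes the modulus of complex input
Mod(polyroot(c(-6, -1, 1)))  # moduli of the roots found above: 3 and 2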

Practice computing the absolute value of a complex number.

Check Your Understanding

Find the absolute value of the following (complex) numbers:

  • \(-3\)

  • \(4i\)

  • \(-3+4i\)

  • \(-\dfrac{\sqrt{3}}{4} + \dfrac{1}{4} i\)

  • \(\dfrac{1}{\sqrt{2}} - \dfrac{1}{\sqrt{2}} i\)

  • \(5-12i\)

We will now practice assessing whether an AR process is stationary using the characteristic equation.

Check Your Understanding

For each of the following AR processes, do the following (a worked sketch for a different process appears after the list):

  1. Write the AR process in terms of the backward shift operator.
  2. Solve the characteristic equation.
  3. Determine if the AR process is stationary.
  • \(AR(1)\) process: \[ x_t = x_{t-1} + w_t \]

  • \(AR(1)\) process: \[ x_t = \frac{1}{3} x_{t-1} + w_t \]

  • \(AR(2)\) process: \[ x_t = - \frac{1}{4} x_{t-1} + \frac{1}{8} x_{t-2} + w_t \]

  • \(AR(2)\) process: \[ x_t = - \frac{2}{3} x_{t-1} + \frac{1}{3} x_{t-2} + w_t \]

  • \(AR(2)\) process: \[ x_t = -x_{t-1} - 2 x_{t-2} + w_t \]

  • \(AR(2)\) process: \[ x_t = \frac{3}{2} x_{t-1} - x_{t-2} + w_t \]

  • \(AR(2)\) process: \[ x_t = 4 x_{t-2} + w_t \]

  • \(AR(3)\) process: \[ x_t = \frac{2}{3} x_{t-1} + \frac{1}{4} x_{t-2} - \frac{1}{6} x_{t-3} + w_t \]
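
As a worked sketch of this three-step workflow, consider a hypothetical \(AR(2)\) process that is not in the list above: \(x_t = \frac{1}{2} x_{t-1} + \frac{1}{4} x_{t-2} + w_t\). In terms of the backward shift operator, it is \(\left( 1 - \frac{1}{2} \mathbf{B} - \frac{1}{4} \mathbf{B}^2 \right) x_t = w_t\), and we can solve the characteristic equation in R:

# Characteristic equation: 1 - 0.5 B - 0.25 B^2 = 0
# Coefficients are listed in increasing powers of B
roots <- polyroot(c(1, -0.5, -0.25))
roots        # approximately 1.24+0i and -3.24+0i
Mod(roots)   # both moduli are strictly greater than 1, so the process is stationary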

  1. Choose one stationary \(AR(2)\) process and one non-stationary \(AR(2)\) process. For each, do the following:
  • Simulate at least 1000 sequential observations.
  • Make a time plot of the simulated values.
  • Make a correlogram of the simulated values.
  • Plot the partial correlogram of the simulated values.
The following code chunk may be helpful, or you can use the simulation above.
Show the code
# Load the packages this chunk uses
library(lubridate)  # my(), year(), now(), days()
library(tsibble)    # as_tsibble()
library(feasts)     # autoplot() method for tsibbles (attaches fabletools)

# Number of observations
n_obs <- 1000

# Generate sequence of dates
start_date <- my(paste(1, floor(year(now())-n_obs/365)))
date_seq <- seq(start_date,
    start_date + days(n_obs - 1),
    by = "1 days")

# Simulate random component
w <- rnorm(n_obs)

# Set first few values of x
x <- rep(0, n_obs)
x[1] <- w[1]
x[2] <- 0.6 * x[1] + w[2]

# Set all remaining values of x
for (t in 3:n_obs) {
  x[t] <- 0.6 * x[t-1] + 0.25 * x[t-2] + w[t]
}

# Create the tsibble
sim_ts <- data.frame(dates = date_seq, x = x) |>
  as_tsibble(index = dates) 

# Generate the plots
sim_ts |> autoplot(.vars = x)
acf(sim_ts$x, plot=TRUE, lag.max = 25)
pacf(sim_ts$x, plot=TRUE, lag.max = 25)
  1. What do you observe about the difference in the behavior of the stationary and non-stationary processes?

Homework Preview (5 min)

  • Review upcoming homework assignment
  • Clarify questions
Download Homework

Autoregressive Model Definition

Absolute Value

Using the Characteristic Polynomial to Assess Stationarity