Autoregressive (AR) Models

Chapter 4: Lesson 3

Learning Outcomes

Characterize the properties of an AR(p) stochastic process
  • Define an AR(p) stochastic process
  • Express an AR(p) process using the backward shift operator
  • State an AR(p) forecast (or prediction) function
  • Identify stationarity of an AR(p) process using the backward shift operator
  • Determine the stationarity of an AR(p) process using a characteristic equation
Check model adequacy using diagnostic plots like correlograms of residuals
  • Characterize a random walk’s second order characteristics using a correlogram
  • Define partial autocorrelations
  • Explain how to use a partial correlogram to decide what model would be suitable to estimate an AR(p) process
  • Demonstrate the use of partial correlogram via simulation

Preparation

  • Read Section 4.5

Learning Journal Exchange (10 min)

  • Review another student’s journal

  • What would you add to your learning journal after reading another student’s?

  • What would you recommend the other student add to their learning journal?

  • Sign the Learning Journal review sheet for your peer

Class Activity: Definition of Autoregressive (AR) Models (10 min)

We now define an autoregressive (or AR) model.

Definition of an Autoregressive (AR) Model

The time series $\{x_t\}$ is an autoregressive process of order $p$, denoted as AR(p), if
$$x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \alpha_3 x_{t-3} + \cdots + \alpha_{p-1} x_{t-(p-1)} + \alpha_p x_{t-p} + w_t \qquad (4.15)$$

where $\{w_t\}$ is white noise and the $\alpha_i$ are the model parameters with $\alpha_p \neq 0$.

In short, this means that the next observation of a time series depends linearly on the previous p terms and a random white noise component.

Check Your Understanding
  • Show that we can write Equation (4.15) as a polynomial of order $p$ in terms of the backward shift operator: $(1 - \alpha_1 B - \alpha_2 B^2 - \cdots - \alpha_p B^p) \, x_t = w_t$

We have seen some special cases of this model already.

Check Your Understanding
  • Give another name for an AR(0) model.

  • Show that the random walk is the special case AR(1) with $\alpha_1 = 1$. (See Chapter 4, Lesson 1.)

  • Show that the exponential smoothing model is the special case where $\alpha_i = \alpha (1 - \alpha)^i$ for $i = 1, 2, \ldots$ and $p \to \infty$. (See Chapter 3, Lesson 2.)

We now explore the autoregressive properties of this model.

Check Your Understanding
  • Show that the AR(p) model is a regression of $x_t$ on past terms from the same series. Hint: write the AR(p) model in more familiar terms, letting $y_i = x_t$, $x_1 = x_{t-1}$, $x_2 = x_{t-2}$, $\ldots$, $x_p = x_{t-p}$, and $\epsilon_i = w_t$. (A short simulation sketch illustrating this idea follows this list.)

  • Explain why the prediction at time $t$ is given by $\hat{x}_t = \hat{\alpha}_1 x_{t-1} + \hat{\alpha}_2 x_{t-2} + \cdots + \hat{\alpha}_{p-1} x_{t-(p-1)} + \hat{\alpha}_p x_{t-p}$

  • Explain why the model parameters (the $\alpha$'s) can be estimated by minimizing the sum of the squared error terms: $\sum_{t=1}^{n} \hat{w}_t^{\,2} = \sum_{t=1}^{n} \left( x_t - \hat{x}_t \right)^2$

  • What is the reason this is called an autoregressive model?
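
To make the regression interpretation concrete, here is a minimal R sketch (not from the textbook) that simulates an AR(2) series and then recovers the coefficients by ordinary least squares, regressing $x_t$ on its own lagged values. The coefficient values 0.6 and 0.25 are arbitrary choices for illustration.

# Simulate an AR(2) series with arbitrarily chosen coefficients (0.6 and 0.25)
set.seed(123)
x <- as.numeric(arima.sim(model = list(ar = c(0.6, 0.25)), n = 500))

# Build the lagged predictors by hand
n <- length(x)
lagged_df <- data.frame(
  x_t  = x[3:n],        # current value
  x_t1 = x[2:(n - 1)],  # lag 1
  x_t2 = x[1:(n - 2)]   # lag 2
)

# Least-squares regression of x_t on its past values (no intercept, since the mean is 0)
fit <- lm(x_t ~ x_t1 + x_t2 - 1, data = lagged_df)
coef(fit)  # estimates should be near 0.6 and 0.25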

Class Activity: Exploring AR(1) Models (10 min)

Definition

Recall that an AR(p) model is of the form $$x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \alpha_3 x_{t-3} + \cdots + \alpha_{p-1} x_{t-(p-1)} + \alpha_p x_{t-p} + w_t$$ So, an AR(1) model is expressed as $$x_t = \alpha x_{t-1} + w_t$$ where $\{w_t\}$ is a white noise series with mean zero and variance $\sigma^2$.

Second-Order Properties of an AR(1) Model

We now explore the second-order properties of this model.

Second-Order Properties of an AR(1) Model

If $\{x_t\}_{t=1}^{n}$ is an AR(1) process, then its first- and second-order properties are summarized below.

$$\mu_x = 0 \qquad\qquad \gamma_k = \operatorname{cov}(x_t, x_{t+k}) = \frac{\alpha^k \sigma^2}{1 - \alpha^2}$$

Why is $\operatorname{cov}(x_t, x_{t+k}) = \dfrac{\alpha^k \sigma^2}{1 - \alpha^2}$?

If $\{x_t\}$ is a stable AR(1) process (which means that $|\alpha| < 1$), then it can be written as:

$$(1 - \alpha B) x_t = w_t$$
$$x_t = (1 - \alpha B)^{-1} w_t = w_t + \alpha w_{t-1} + \alpha^2 w_{t-2} + \alpha^3 w_{t-3} + \cdots = \sum_{i=0}^{\infty} \alpha^i w_{t-i}$$

From this, we can deduce that the mean is

$$E(x_t) = E\left( \sum_{i=0}^{\infty} \alpha^i w_{t-i} \right) = \sum_{i=0}^{\infty} \alpha^i E(w_{t-i}) = 0$$

The autocovariance is computed similarly as:

$$\gamma_k = \operatorname{cov}(x_t, x_{t+k}) = \operatorname{cov}\left( \sum_{i=0}^{\infty} \alpha^i w_{t-i}, \; \sum_{j=0}^{\infty} \alpha^j w_{t+k-j} \right) = \sum_{j=k+i} \alpha^i \alpha^j \operatorname{cov}(w_{t-i}, w_{t+k-j}) = \alpha^k \sigma^2 \sum_{i=0}^{\infty} \alpha^{2i} = \frac{\alpha^k \sigma^2}{1 - \alpha^2}$$

See Equations (2.15) and (4.2).
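
As a quick numerical check (not part of the reading), we can simulate a long AR(1) series and compare its sample autocovariances with the formula above. The choices $\alpha = 0.7$ and $\sigma = 1$ are arbitrary.

# Simulate a long AR(1) series
set.seed(456)
alpha <- 0.7
sigma <- 1
x <- as.numeric(arima.sim(model = list(ar = alpha), n = 10000, sd = sigma))

# Sample autocovariances for lags 0 through 5
sample_gamma <- acf(x, type = "covariance", lag.max = 5, plot = FALSE)$acf[, 1, 1]

# Theoretical autocovariances: alpha^k * sigma^2 / (1 - alpha^2)
theoretical_gamma <- alpha^(0:5) * sigma^2 / (1 - alpha^2)

round(cbind(sample_gamma, theoretical_gamma), 3)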

Correlogram of an AR(1) process

Correlogram of an AR(1) Process

The autocorrelation function for an AR(1) process is

$$\rho_k = \alpha^k \qquad (k \ge 0)$$ where $|\alpha| < 1$.

Check Your Understanding
  • Use the equation for the autocovariance function, $\gamma_k$, to show that $\rho_k = \alpha^k$ for $k \ge 0$ when $|\alpha| < 1$.

  • Use this to explain why the correlogram decays to zero more quickly when $|\alpha|$ is small.

Small Group Activity: Simulation of an AR(1) Process

Check Your Understanding

In each of the following cases, what do you observe in the correlogram? (If you expect to see significant results and you do not, try increasing the number of points.) A simulation sketch is provided after this list as one possible starting point.

  • $\alpha = -1$
  • $\alpha = -0.5$
  • $\alpha = -0.1$
  • $\alpha = 0$
  • $\alpha = 0.1$
  • $\alpha = 0.5$
  • $\alpha = 1$
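
One possible starting point for the simulation (a sketch, not the required approach): the code below simulates an AR(1) series for a chosen value of $\alpha$ and plots its correlogram. Values with $|\alpha| < 1$ can use arima.sim; the boundary cases $\alpha = \pm 1$ are not stationary, so they are simulated directly with a loop.

# Simulate an AR(1) series for a chosen alpha and plot its correlogram
set.seed(789)
n_obs <- 1000
alpha <- 0.5   # try each value in the list above

if (alpha == 0) {
  x <- rnorm(n_obs)                    # AR(0): pure white noise
} else if (abs(alpha) < 1) {
  x <- as.numeric(arima.sim(model = list(ar = alpha), n = n_obs))
} else {
  w <- rnorm(n_obs)                    # alpha = 1 or -1: simulate recursively
  x <- rep(0, n_obs)
  x[1] <- w[1]
  for (t in 2:n_obs) {
    x[t] <- alpha * x[t - 1] + w[t]
  }
}

acf(x, plot = TRUE, lag.max = 25)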

Class Activity: Partial Autocorrelation (10 min)

Definition of Partial Autocorrelation

Partial Autocorrelation

The partial autocorrelation at lag k is defined as the portion of the correlation that is not explained by shorter lags.

For example, the partial correlation for lag 4 is the correlation not explained by lags 1, 2, or 3.

Check Your Understanding
  • What is the value of the partial autocorrelation function for an AR(2) process for all lags greater than 2?

On page 81, the textbook states that, in general, the partial autocorrelation at lag $k$ is the $k$th coefficient of a fitted AR(k) model. This implies that if the underlying process is AR(p), then the coefficients satisfy $\alpha_k = 0$ for all $k > p$, so an AR(p) process yields partial autocorrelations that are zero after lag $p$. A correlogram of partial autocorrelations (a partial correlogram) is therefore helpful for determining the order of an appropriate AR process to model a time series.

ACF and PACF of an AR(p) Process

For an AR(p) process, we observe the following:

  • ACF: tails off
  • PACF: cuts off after lag $p$
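
A brief illustration of this pattern (an assumed example, with arbitrarily chosen coefficients 0.5 and 0.3): simulate a stationary AR(2) series and compare its ACF and PACF.

# Simulate a stationary AR(2) series
set.seed(101)
x <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.3)), n = 2000))

# The ACF tails off gradually; the PACF should show significant spikes only at lags 1 and 2
acf(x, plot = TRUE, lag.max = 25)
pacf(x, plot = TRUE, lag.max = 25)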

Example: McDonald’s Stock Price

Here is a partial autocorrelation plot for the McDonald’s stock price data:

# Packages used below (assumed to be loaded in the course setup)
library(tidyverse)  # mutate(), select(), as_tibble(), arrange(), lag()
library(tsibble)    # as_tsibble()

# Set symbol and date range
symbol <- "MCD"
company <- "McDonald's"

# Retrieve static file
stock_df <- rio::import("https://byuistats.github.io/timeseries/data/stock_price_mcd.parquet")

# Transform data into tibble
stock_ts <- stock_df |>
  mutate(
    dates = date,
    value = adjusted
  ) |>
  select(dates, value) |>
  as_tibble() |>
  arrange(dates) |>
  mutate(diff = value - lag(value)) |>
  as_tsibble(index = dates, key = NULL)

pacf(stock_ts$value, plot=TRUE, lag.max = 25)

The only significant partial autocorrelation is at lag $k = 1$. This suggests that an AR(1) process could be used to model the McDonald's stock prices.
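
One way to follow up on this conclusion (a sketch, not the textbook's procedure) is to let R's ar() function choose the order of an AR model by AIC; if the partial correlogram pattern above holds, a low order should be selected. This reuses the stock_ts object created in the code above.

# Fit AR models of increasing order and keep the one with the smallest AIC
ar_fit <- ar(stock_ts$value, order.max = 10)
ar_fit$order   # order selected by AIC
ar_fit$ar      # estimated AR coefficients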

Partial Autocorrelation Plots of Various AR(p) Processes

Here are some time plots, correlograms, and partial correlograms for AR(p) processes with various values of p.

Shiny App

Class Activity: Stationary and Non-Stationary AR Processes (15 min)

Definition of the Characteristic Equation

Treating the symbol B formally as a number (either real or complex), the polynomial

$$\theta_p(B) \, x_t = (1 - \alpha_1 B - \alpha_2 B^2 - \cdots - \alpha_p B^p) \, x_t$$

is called the characteristic polynomial of an AR process.

If we set the characteristic polynomial to zero, we get the characteristic equation:

$$\theta_p(B) = 1 - \alpha_1 B - \alpha_2 B^2 - \cdots - \alpha_p B^p = 0$$

The roots of the characteristic polynomial are the values of $B$ that make the polynomial equal to zero, i.e., the values of $B$ that make $\theta_p(B) = 0$. These are also called the solutions of the characteristic equation. The roots of the characteristic polynomial can be real or complex numbers.

We now explore an important result for AR processes that uses the absolute value of complex numbers.

Identifying Stationary Processes

An AR process will be stationary if the absolute values of the solutions of the characteristic equation are all strictly greater than 1.

First, we will find the roots of the characteristic polynomial (i.e., the solutions of the characteristic equation), and then we will determine whether the absolute value of each solution is greater than 1.

We can use the polyroot function to find the roots of polynomials in R. For example, to find the roots of the polynomial $x^2 - x - 6$, we apply the command

polyroot(c(-6,-1,1))
[1]  3+0i -2+0i

Note the order of the coefficients. They are given in increasing order of the power of x.

Of course, we could simply factor the polynomial: $$x^2 - x - 6 = (x - 3)(x + 2) \overset{\text{set}}{=} 0$$ which implies that $$x = 3 \quad\text{or}\quad x = -2$$

Definition of the Absolute Value in the Complex Plane

Let $z = a + bi$ be any complex number. It can be represented by the point $(a, b)$ in the complex plane. We define the absolute value of $z$ as the distance from the origin to the point:

$$|z| = \sqrt{a^2 + b^2}$$
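
In R, the Mod() function computes this absolute value (modulus) of a complex number, which pairs naturally with polyroot(). For example, consider the AR(2) process $x_t = 0.6 x_{t-1} + 0.25 x_{t-2} + w_t$ (the process simulated in a later code chunk in this lesson); its characteristic equation is $1 - 0.6B - 0.25B^2 = 0$, and we can check the moduli of its roots as follows.

# Roots of the characteristic polynomial 1 - 0.6*B - 0.25*B^2
roots <- polyroot(c(1, -0.6, -0.25))
roots

# Moduli of the roots; both exceed 1, so this AR(2) process is stationary
Mod(roots)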

Practice computing the absolute value of a complex number.  

Check Your Understanding

Find the absolute value of the following (complex) numbers:

  • 3

  • 4i

  • 3+4i

  • $\frac{3}{4} + \frac{1}{4} i$

  • $\frac{1}{2} - \frac{1}{2} i$

  • $5 - 12i$

We will now practice assessing whether an AR process is stationary using the characteristic equation.

Check Your Understanding

For each of the following AR processes, do the following:

  1. Write the AR process in terms of the backward shift operator.
  2. Solve the characteristic equation.
  3. Determine if the AR process is stationary.
  • AR(1) process: $x_t = x_{t-1} + w_t$

  • AR(1) process: $x_t = \frac{1}{3} x_{t-1} + w_t$

  • AR(2) process: $x_t = \frac{1}{4} x_{t-1} + \frac{1}{8} x_{t-2} + w_t$

  • AR(2) process: $x_t = \frac{2}{3} x_{t-1} + \frac{1}{3} x_{t-2} + w_t$

  • AR(2) process: $x_t = x_{t-1} - 2 x_{t-2} + w_t$

  • AR(2) process: $x_t = \frac{3}{2} x_{t-1} - x_{t-2} + w_t$

  • AR(2) process: $x_t = 4 x_{t-2} + w_t$

  • AR(3) process: $x_t = \frac{2}{3} x_{t-1} + \frac{1}{4} x_{t-2} - \frac{1}{6} x_{t-3} + w_t$

  4. Choose one stationary AR(2) process and one non-stationary AR(2) process. For each, do the following:
  • Simulate at least 1000 sequential observations.
  • Make a time plot of the simulated values.
  • Make a correlogram of the simulated values.
  • Plot the partial correlogram of the simulated values.
The following code chunk may be helpful, or you can use the simulation above.
# Packages used below (assumed to be loaded in the course setup)
library(lubridate)  # my(), year(), now(), days()
library(tsibble)    # as_tsibble()
library(feasts)     # autoplot() for tsibbles

# Number of observations
n_obs <- 1000

# Generate sequence of dates
start_date <- my(paste(1, floor(year(now())-n_obs/365)))
date_seq <- seq(start_date,
    start_date + days(n_obs - 1),
    by = "1 days")

# Simulate random component
w <- rnorm(n_obs)

# Set first few values of x
x <- rep(0, n_obs)
x[1] <- w[1]
x[2] <- 0.6 * x[1] + w[2]

# Set all remaining values of x
for (t in 3:n_obs) {
  x[t] <- 0.6 * x[t-1] + 0.25 * x[t-2] + w[t]
}

# Create the tsibble
sim_ts <- data.frame(dates = date_seq, x = x) |>
  as_tsibble(index = dates) 

# Generate the plots
sim_ts |> autoplot(.vars = x)
acf(sim_ts$x, plot=TRUE, lag.max = 25)
pacf(sim_ts$x, plot=TRUE, lag.max = 25)
  5. What do you observe about the difference in the behavior of the stationary and non-stationary processes?

Homework Preview (5 min)

  • Review upcoming homework assignment
  • Clarify questions
Download Homework

Autoregressive Model Definition

Absolute Value

Using the Characteristic Polynomial to Assess Stationarity