Show that the exponential smoothing model is the special case where $\alpha_i = \alpha(1-\alpha)^i$ for $i = 1, 2, \ldots$ and $p \to \infty$. (See Chapter 3, Lesson 2.)
We now explore the autoregressive properties of this model.
Check Your Understanding
Show that the $AR(p)$ model is a regression of $x_t$ on past terms from the same series. Hint: write the model in more familiar terms, letting $y = x_t$ and treating the lagged values $x_{t-1}, x_{t-2}, \ldots, x_{t-p}$ as the predictor variables.
Explain why the prediction at time $t$ is given by $\hat{x}_t = \hat{\alpha}_1 x_{t-1} + \hat{\alpha}_2 x_{t-2} + \cdots + \hat{\alpha}_p x_{t-p}$.
Explain why the model parameters (the $\alpha_i$'s) can be estimated by minimizing the sum of the squared error terms: $\sum_t \left( x_t - \alpha_1 x_{t-1} - \alpha_2 x_{t-2} - \cdots - \alpha_p x_{t-p} \right)^2$.
What is the reason this is called an autoregressive model?
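In connection with the questions above, here is a minimal R sketch (not from the text) that estimates the coefficients of a simulated AR(2) series by ordinary least squares, regressing $x_t$ on its own lagged values with a zero intercept. The simulation coefficients (0.6 and 0.25) are arbitrary choices for illustration.

```r
set.seed(123)

# Simulate a stationary AR(2) series: x_t = 0.6 x_{t-1} + 0.25 x_{t-2} + w_t
x <- arima.sim(model = list(ar = c(0.6, 0.25)), n = 1000)

# Regress x_t on x_{t-1} and x_{t-2}, with no intercept
df <- data.frame(y = x[3:1000], lag1 = x[2:999], lag2 = x[1:998])
fit <- lm(y ~ 0 + lag1 + lag2, data = df)
coef(fit)  # estimates should be close to 0.6 and 0.25
```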
Class Activity: Exploring AR(p) Models (10 min)
Definition
Recall that an AR(p) model is of the form $x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \cdots + \alpha_p x_{t-p} + w_t$. So, an AR(1) model is expressed as $x_t = \alpha x_{t-1} + w_t$, where $\{w_t\}$ is a white noise series with mean zero and variance $\sigma^2$.
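To make the definition concrete, here is a minimal simulation sketch that builds an AR(1) series directly from the recursion; the coefficient $\alpha = 0.7$ is an arbitrary choice for illustration.

```r
set.seed(42)

n <- 500
alpha <- 0.7       # hypothetical AR(1) coefficient
w <- rnorm(n)      # white noise with mean 0 and variance 1

x <- numeric(n)
x[1] <- w[1]       # initialize the series
for (t in 2:n) {
  x[t] <- alpha * x[t - 1] + w[t]   # the AR(1) recursion
}

plot(x, type = "l")
```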
Second-Order Properties of an AR(1) Model
We now explore the second-order properties of this model.
Second-Order Properties of an AR(1) Model
If $\{x_t\}$ is an AR(1) process, then its first- and second-order properties are summarized below. The mean is $\mu_x = 0$, and the autocovariance function is $\gamma_k = \dfrac{\alpha^k \sigma^2}{1 - \alpha^2}$ for $k \ge 0$.
Proof of the equation for $\gamma_k$
Why is $\gamma_k = \dfrac{\alpha^k \sigma^2}{1 - \alpha^2}$?
If $\{x_t\}$ is a stable AR(1) process (which means that $|\alpha| < 1$), then $x_t$ can be written as: $x_t = w_t + \alpha w_{t-1} + \alpha^2 w_{t-2} + \cdots = \sum_{i=0}^{\infty} \alpha^i w_{t-i}.$
From this, we can deduce that the mean is $E(x_t) = \sum_{i=0}^{\infty} \alpha^i \, E(w_{t-i}) = 0.$
The autocovariance is computed similarly as: $\gamma_k = \operatorname{Cov}(x_t, x_{t+k}) = \sum_{i=0}^{\infty} \alpha^i \alpha^{k+i} \sigma^2 = \dfrac{\alpha^k \sigma^2}{1 - \alpha^2}.$
See Equations (2.15) and (4.2).
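As a numerical sanity check (not part of the text), we can compare the sample variance of a long simulated AR(1) series with the theoretical value $\gamma_0 = \sigma^2 / (1 - \alpha^2)$; the parameter value below is arbitrary.

```r
set.seed(1)

alpha <- 0.7
x <- arima.sim(model = list(ar = alpha), n = 100000)  # sigma^2 = 1 by default

var(x)             # sample variance of the simulated series
1 / (1 - alpha^2)  # theoretical gamma_0 = sigma^2 / (1 - alpha^2), about 1.96
```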
Correlogram of an AR(1) Process
Correlogram of an AR(1) Process
The autocorrelation function for an AR(1) process is $\rho_k = \alpha^k$ for $k \ge 0$,
where $|\alpha| < 1$.
Check Your Understanding
Use the equation for the autocovariance function, $\gamma_k = \frac{\alpha^k \sigma^2}{1 - \alpha^2}$, to show that $\rho_k = \alpha^k$ for $k \ge 0$ when $|\alpha| < 1$.
Use this to explain why the correlogram decays to zero more quickly when $|\alpha|$ is small.
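To see this decay in practice, the following sketch (with arbitrary values $\alpha = 0.9$ and $\alpha = 0.3$) compares the correlograms of two simulated AR(1) series:

```r
set.seed(2)

x_large <- arima.sim(model = list(ar = 0.9), n = 1000)
x_small <- arima.sim(model = list(ar = 0.3), n = 1000)

acf(x_large, lag.max = 25)  # autocorrelations decay slowly toward zero
acf(x_small, lag.max = 25)  # autocorrelations decay to zero much more quickly
```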
Small Group Activity: Simulation of an AR(1) Process
Check Your Understanding
In each of the following cases, what do you observe in the correlogram? (If you expect to see significant results and you do not, try increasing the number of points.)
Class Activity: Partial Autocorrelation (10 min)
Definition of Partial Autocorrelation
Partial Autocorrelation
The partial autocorrelation at lag $k$ is defined as the portion of the correlation that is not explained by shorter lags.
For example, the partial correlation for lag 4 is the correlation not explained by lags 1, 2, or 3.
Check Your Understanding
What is the value of the partial autocorrelation function for an AR(2) process for all lags greater than 2?
On page 81, the textbook states that, in general, the partial autocorrelation at lag $k$ is the $k$th coefficient of a fitted AR(k) model. This implies that if the underlying process is AR(p), then all the coefficients $\alpha_k = 0$ whenever $k > p$. So, an AR(p) process will yield partial autocorrelations that are zero after lag $p$. A correlogram of partial autocorrelations can therefore be helpful in determining the order of an appropriate AR process to model a time series.
ACF and PACF of an AR(p) Process
For an AR(p) process, we observe the following:

| Model | ACF | PACF |
|-------|-----|------|
| AR(p) | Tails off | Cuts off after lag $p$ |
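The following sketch illustrates this pattern for a simulated AR(2) series (coefficients 0.6 and 0.25, chosen for illustration): the ACF tails off gradually, while the PACF cuts off after lag 2.

```r
set.seed(3)

x <- arima.sim(model = list(ar = c(0.6, 0.25)), n = 1000)

acf(x,  lag.max = 25)   # tails off
pacf(x, lag.max = 25)   # cuts off after lag 2
```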
Example: McDonald’s Stock Price
Here is a partial autocorrelation plot for the McDonald’s stock price data:
Show the code
```r
library(tidyverse)   # dplyr verbs and the pipe
library(tsibble)     # as_tsibble()

# Set symbol and date range
symbol <- "MCD"
company <- "McDonald's"

# Retrieve static file
stock_df <- rio::import("https://byuistats.github.io/timeseries/data/stock_price_mcd.parquet")

# Transform data into a tsibble
stock_ts <- stock_df |>
  mutate(dates = date, value = adjusted) |>
  select(dates, value) |>
  as_tibble() |>
  arrange(dates) |>
  mutate(diff = value - lag(value)) |>
  as_tsibble(index = dates, key = NULL)

pacf(stock_ts$value, plot = TRUE, lag.max = 25)
```
The only significant partial correlation is at lag 1. This suggests that an AR(1) process could be used to model the McDonald's stock prices.
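A natural follow-up (a sketch, not part of the original lesson) is to fit the suggested AR(1) model to the series with base R's arima() function, reusing the stock_ts object created above:

```r
# Fit an AR(1) model to the adjusted closing prices
fit <- arima(stock_ts$value, order = c(1, 0, 0))
fit  # the ar1 coefficient is the estimate of alpha
```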
Partial Autocorrelation Plots of Various Processes
Here are some time plots, correlograms, and partial correlograms for AR processes with various choices of the parameters.
Shiny App
Class Activity: Stationary and Non-Stationary AR Processes (15 min)
Definition of the Characteristic Equation
Treating the symbol $\mathbf{B}$ formally as a number (either real or complex), the polynomial $\theta_p(\mathbf{B}) = 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p$
is called the characteristic polynomial of an AR process.
If we set the characteristic polynomial to zero, we get the characteristic equation: $\theta_p(\mathbf{B}) = 0$.
The roots of the characteristic polynomial are the values of $\mathbf{B}$ that make the polynomial equal to zero; that is, the values of $\mathbf{B}$ that satisfy $\theta_p(\mathbf{B}) = 0$. These are also called the solutions of the characteristic equation. The roots of the characteristic polynomial can be real or complex numbers.
We now explore an important result for AR processes that uses the absolute value of complex numbers.
Identifying Stationary Processes
An AR process will be stationary if the absolute values of the solutions of the characteristic equation are all strictly greater than 1.
First, we will find the roots of the characteristic polynomial (i.e., the solutions of the characteristic equation), and then we will determine whether the absolute values of these solutions are all greater than 1.
We can use the polyroot function to find the roots of polynomials in R. For example, to find the roots of the polynomial $x^2 - x - 6$, we apply the command
```r
polyroot(c(-6, -1, 1))
```

```
[1]  3+0i -2+0i
```
Note the order of the coefficients. They are given in increasing order of the power of $x$.
Of course, we could simply factor the polynomial: $x^2 - x - 6 = (x - 3)(x + 2) = 0$, which implies that $x = 3$ or $x = -2$.
Definition of the Absolute Value in the Complex Plane
Let $z = a + bi$ be any complex number. It can be represented by the point $(a, b)$ in the complex plane. We define the absolute value of $z$ as the distance from the origin to this point: $|z| = \sqrt{a^2 + b^2}.$
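In R, the Mod() function computes the absolute value (modulus) of a complex number, including the complex roots returned by polyroot(). A small sketch; the example polynomial $0.5x^2 - x + 1$ is an arbitrary choice:

```r
z <- complex(real = 3, imaginary = 4)
Mod(z)                        # sqrt(3^2 + 4^2) = 5

# Mod() also works on the complex roots returned by polyroot()
Mod(polyroot(c(1, -1, 0.5)))  # the roots 1 + i and 1 - i both have modulus sqrt(2)
```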
Practice computing the absolute value of a complex number.
Check Your Understanding
Find the absolute value of the following (complex) numbers:
We will now practice assessing whether an AR process is stationary using the characteristic equation.
Check Your Understanding
For each of the following AR processes, do the following:
Write the AR process in terms of the backward shift operator.
Solve the characteristic equation.
Determine if the AR process is stationary.
process:
process:
process:
process:
process:
process:
process:
process:
Choose one stationary process and one non-stationary process. For each, do the following:
Simulate at least 1000 sequential observations.
Make a time plot of the simulated values.
Make a correlogram of the simulated values.
Plot the partial correlogram of the simulated values.
The following code chunk may be helpful, or you can use the simulation above. (A worked sketch of the stationarity check follows the code.)
Show the code
```r
library(tidyverse)   # dplyr, ggplot2, lubridate (my, year, now, days)
library(tsibble)     # as_tsibble()
library(feasts)      # autoplot() method for tsibbles

# Number of observations
n_obs <- 1000

# Generate sequence of dates
start_date <- my(paste(1, floor(year(now()) - n_obs / 365)))
date_seq <- seq(start_date, start_date + days(n_obs - 1), by = "1 days")

# Simulate random component
w <- rnorm(n_obs)

# Set first few values of x
x <- rep(0, n_obs)
x[1] <- w[1]
x[2] <- 0.6 * x[1] + w[2]

# Set all remaining values of x
for (t in 3:n_obs) {
  x[t] <- 0.6 * x[t - 1] + 0.25 * x[t - 2] + w[t]
}

# Create the tsibble
sim_ts <- data.frame(dates = date_seq, x = x) |>
  as_tsibble(index = dates)

# Generate the plots
sim_ts |> autoplot(.vars = x)
acf(sim_ts$x, plot = TRUE, lag.max = 25)
pacf(sim_ts$x, plot = TRUE, lag.max = 25)
```
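As a worked sketch of the stationarity check, consider the AR(2) process simulated above, $x_t = 0.6 x_{t-1} + 0.25 x_{t-2} + w_t$. Its characteristic polynomial is $1 - 0.6\mathbf{B} - 0.25\mathbf{B}^2$, and we can confirm stationarity by verifying that every root has modulus greater than 1:

```r
# Characteristic polynomial of x_t = 0.6 x_{t-1} + 0.25 x_{t-2} + w_t
roots <- polyroot(c(1, -0.6, -0.25))
roots       # approximately 1.13 and -3.53
Mod(roots)  # both moduli exceed 1, so the process is stationary
```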
What do you observe about the difference in the behavior of the stationary and non-stationary processes?
Show that the AR(p) model is a regression of $x_t$ on previous terms in the series. (This is why it is called an "autoregressive model.") Hint: write the model in more familiar terms, letting $y = x_t$ and treating the lagged values $x_{t-1}, x_{t-2}, \ldots, x_{t-p}$ as the predictor variables.
Solution:
Letting $y = x_t$ and treating the lagged values $x_{t-1}, \ldots, x_{t-p}$ as predictors, the AR(p) model becomes $y = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \cdots + \alpha_p x_{t-p} + w_t$. This is a multiple linear regression equation with zero intercept.
Explain why the prediction at time $t$ is given by $\hat{x}_t = \hat{\alpha}_1 x_{t-1} + \hat{\alpha}_2 x_{t-2} + \cdots + \hat{\alpha}_p x_{t-p}$.
Solution:
The prediction at time $t$ in a multiple regression setting would be $\hat{y} = \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_p x_p$. Translated to the AR(p) setting, this becomes $\hat{x}_t = \hat{\alpha}_1 x_{t-1} + \hat{\alpha}_2 x_{t-2} + \cdots + \hat{\alpha}_p x_{t-p}$.
Explain why the model parameters (the $\alpha_i$'s) can be estimated by minimizing the sum of the squared error terms: $\sum_t \left( x_t - \alpha_1 x_{t-1} - \alpha_2 x_{t-2} - \cdots - \alpha_p x_{t-p} \right)^2$.
Solution:
This is exactly how the multiple linear regression coefficients are estimated: by minimizing the sum of the squared error terms.
What is the reason this is called an autoregressive model?
Solution:
This is called an autoregressive model because we regress the current term, $x_t$, on previous terms from the same series.