The Autoregressive Integrated Moving Average (ARIMA) model was introduced by Box and Jenkins (hence it is also known as the Box-Jenkins model) in the 1960s for forecasting a variable. This paper develops an ARIMA model for Total houses sold per quarter and applies it to forecast Total houses sold over the following three years. ARIMA is an extrapolation method for forecasting and, like any other such method, it requires only the historical time series data on the variable being forecast. Among the extrapolation methods, it is one of the most sophisticated: it incorporates the features of all such methods, does not require the investigator to choose a priori the initial values of any variable or the values of various parameters, and is robust enough to handle any data pattern. As one would expect, it is also quite a difficult model to develop and apply, as it involves transformation of the variable, identification of the model, estimation by a non-linear method, verification of the model and derivation of forecasts. In what follows, we first explain the ARIMA model, then develop it for Total houses sold using quarterly data from 1989 to 2007, and finally apply it to forecast the values of the variable over the next three years.
Theoretical Basis of Time-Series Analysis:
A time series is a set of values of a continuous variable Y (Y1, Y2, ..., Yn), ordered according to a discrete index variable t (1, 2, ..., n). The term time series comes from econometric studies in which the index variable refers to intervals of time measured on a suitable scale. However, this direct reference to time is not required: any other meaning can be attributed to the index variable, provided that it is able to order the Y values. In general, the following components can be recognized and separated in a given time series (3) (Kendall, 1966):
1) A regular, long-term component of variability, termed trend, that represents the whole evolution pattern of the series;
2) A regular, short-term component whose shape recurs periodically at intervals of s lags of the index variable, commonly known as seasonality, a term that also derives from applications in economics;
3) An AR(p) autoregressive component of order p, which relates each value $Z_t = Y_t - (\text{trend and seasonality})$ to the p previous Z values, according to the following linear relationship:
$$Z_t = \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + \dots + \phi_p Z_{t-p} + \varepsilon_t \qquad [1]$$
where $\phi_i$ (i = 1, ..., p) are parameters to be estimated and $\varepsilon_t$ is a residual term; and
4) An MA(q) moving average component of order q, which relates each value $Z_t$ to the q residuals of the q previous Z estimates:
$$Z_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_q \varepsilon_{t-q}$$
where $\theta_i$ (i = 1, ..., q) are parameters to be estimated.
The theory of time-series analysis has developed a specific language and a set of linear operators. According to Box and Jenkins (1), a highly useful operator in time-series theory is the lag or backward linear operator (B), defined by
$$B Z_t = Z_{t-1}$$
Consider the result of applying the lag operator twice to a series:
$$B(B Z_t) = B Z_{t-1} = Z_{t-2}$$
Such a double application is indicated by $B^2$, and, in general, for any integer k, it can be written
$$B^k Z_t = Z_{t-k}$$
By using the backward operator, Equation [1] can be rewritten as
$$(1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p) Z_t = \varepsilon_t$$
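As a minimal sketch of the ideas in this section, the Python code below implements the lag operator B on a plain list and uses it to check that applying (1 - phi_1 B - ... - phi_p B^p) to a simulated AR(p) series returns the residuals that generated it. The function names (`lag`, `simulate_ar`, `ar_residuals`, `simulate_ma`), the coefficient values and the use of `None` for undefined lagged values are illustrative assumptions, not part of the analysis in this paper. The MA(q) function assumes the Box-Jenkins sign convention, in which the theta terms are subtracted.

```python
def lag(series, k=1):
    """Apply B^k: position t holds the value from position t-k.

    Positions with no predecessor are filled with None (an assumption
    made here for illustration).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    n = len(series)
    k = min(k, n)
    return [None] * k + list(series[:n - k])


def simulate_ar(phi, eps, start):
    """Generate Z_t = phi_1 Z_{t-1} + ... + phi_p Z_{t-p} + eps_t.

    `start` supplies the p initial Z values needed by the recursion.
    """
    p = len(phi)
    if len(start) < p:
        raise ValueError("need at least p starting values")
    z = list(start)
    for e in eps:
        z.append(sum(phi[i] * z[-1 - i] for i in range(p)) + e)
    return z


def ar_residuals(z, phi):
    """Compute (1 - phi_1 B - ... - phi_p B^p) Z_t for every t >= p."""
    p = len(phi)
    lagged = [lag(z, i + 1) for i in range(p)]  # B Z, B^2 Z, ..., B^p Z
    return [
        z[t] - sum(phi[i] * lagged[i][t] for i in range(p))
        for t in range(p, len(z))
    ]


def simulate_ma(theta, eps):
    """Generate Z_t = eps_t - theta_1 eps_{t-1} - ... - theta_q eps_{t-q}, t >= q."""
    q = len(theta)
    return [
        eps[t] - sum(theta[i] * eps[t - 1 - i] for i in range(q))
        for t in range(q, len(eps))
    ]


if __name__ == "__main__":
    phi = [0.5, -0.2]                    # hypothetical AR(2) coefficients
    eps = [0.3, -0.1, 0.4, 0.0, -0.2]    # hypothetical residuals
    z = simulate_ar(phi, eps, start=[1.0, 0.5])
    print("Z:", z)
    print("recovered residuals:", ar_residuals(z, phi))
    print("B^2 Z:", lag(z, 2))
```

Here `ar_residuals` is the operator form of the AR(p) relationship: it subtracts the weighted lagged copies of the series from the series itself, so for a series produced by `simulate_ar` it should reproduce the input residuals up to floating-point rounding.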