Sometimes a nonlinear relationship between the dependent and independent variables is more appropriate than a linear one, and a linear regression is then no longer optimal. If the linear model is not the correct functional form, the slope and intercept estimates and the fitted values will be biased, so the estimated coefficients will not be meaningful (Hox, 2010). Over a restricted range of the independent or dependent variables, a nonlinear model may be well approximated by a linear one (this is in fact the basis of linear interpolation), but accurate prediction requires a model appropriate to the data. When the relationship is nonlinear, a nonlinear transformation (for example, taking logarithms) should be applied to the data before running the regression.
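As a rough illustration of transforming the data before regressing, the sketch below fits the same hypothetical data twice: once on the raw independent variable and once on its natural logarithm. The data, the helper name `ols_fit`, and the choice of a log transform are assumptions made for this example only; they are not taken from the report.

```python
import math

def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical data: y depends on ln(x), not on x itself.
x = [float(i) for i in range(1, 21)]
y = [3.0 + 2.0 * math.log(xi) for xi in x]

# Fit 1: linear regression on the raw variable (misspecified form).
raw_slope, raw_intercept, raw_r2 = ols_fit(x, y)

# Fit 2: apply the nonlinear transformation first, then regress.
log_x = [math.log(xi) for xi in x]
log_slope, log_intercept, log_r2 = ols_fit(log_x, y)

print(f"raw x : slope={raw_slope:.4f}, intercept={raw_intercept:.4f}, R^2={raw_r2:.4f}")
print(f"ln(x) : slope={log_slope:.4f}, intercept={log_intercept:.4f}, R^2={log_r2:.4f}")
```

On this constructed data the transformed fit recovers the generating coefficients, while the raw fit leaves systematic curvature in the residuals. With real data the appropriate transformation has to be chosen from the diagnostics rather than assumed.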
Diagnostic Results

| Variable | Heteroskedasticity: W-Test p-value | Heteroskedasticity: Hypothesis Test result | Micronumerosity: Approximation result | Outliers: Natural Lower Bound | Outliers: Natural Upper Bound | Outliers: Number of Potential Outliers | Nonlinearity: Nonlinear Test p-value | Nonlinearity: Hypothesis Test result |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Y | | | no problems | -7.86 | 671.70 | 2 | | |
| Variable X1 | 0.2543 | Homoskedastic | no problems | -21377.95 | 64713.03 | 3 | 0.2458 | linear |
| Variable X2 | 0.3371 | Homoskedastic | no problems | 77.47 | 445.93 | 2 | 0.0335 | nonlinear |
| Variable X3 | 0.3649 | Homoskedastic | no problems | -5.77 | 15.69 | 3 | 0.0305 | nonlinear |
| Variable X4 | 0.3066 | Homoskedastic | no problems | -295.96 | 628.21 | 4 | 0.9298 | linear |
| Variable X5 | 0.2495 | Homoskedastic | no problems | 3.35 | 9.38 | 3 | 0.2727 | linear |
Statistical Summary
Sometimes, certain types of time-series data cannot be modeled by any method other than a stochastic process, because the underlying events are stochastic in nature. For instance, stock prices, interest rates, the price of oil, and other commodity prices cannot be adequately modeled and forecast with a simple regression model, because these variables are highly uncertain and volatile and do not follow a predefined static rule of behavior; in other words, the process is not stationary (Snijders, 2011). Stationarity is checked here using the Runs Test, and the Autocorrelation report provides another, visual clue (the ACF of a nonstationary series tends to decay slowly). A stochastic process is a sequence of events or paths generated by probabilistic laws: random events can occur over time, but they are governed by specific statistical and probabilistic rules. The main stochastic processes include the Random Walk or Brownian Motion, Mean-Reversion, and Jump-Diffusion. These processes can be used to forecast a multitude of variables that seemingly follow random trends but are restricted by probabilistic laws. The process-generating equation is known in advance, but the actual results generated are unknown.
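To make the three process types concrete, here is a minimal simulation sketch. The discretizations (a log-normal step for the random walk, an Euler step for mean-reversion, and at most one jump per step for jump-diffusion) are common textbook choices assumed for illustration; the report does not state which equations its software uses. All function names and parameter values are hypothetical and are not the estimates shown in the report.

```python
import math
import random

def random_walk(start, drift, volatility, steps, dt, rng):
    """Random walk with drift (geometric Brownian motion), log-normal step."""
    path = [start]
    for _ in range(steps):
        shock = rng.gauss(0.0, 1.0)
        growth = (drift - 0.5 * volatility ** 2) * dt + volatility * math.sqrt(dt) * shock
        path.append(path[-1] * math.exp(growth))
    return path

def mean_reversion(start, long_term, reversion_rate, volatility, steps, dt, rng):
    """Euler step of a process pulled toward long_term (volatility in level units)."""
    path = [start]
    for _ in range(steps):
        shock = rng.gauss(0.0, 1.0)
        pull = reversion_rate * (long_term - path[-1]) * dt
        path.append(path[-1] + pull + volatility * math.sqrt(dt) * shock)
    return path

def jump_diffusion(start, drift, volatility, jump_rate, jump_size, steps, dt, rng):
    """Random walk plus occasional proportional jumps.

    A jump arrives in a step with probability jump_rate * dt, which
    approximates Poisson arrivals when dt is small.
    """
    path = [start]
    for _ in range(steps):
        shock = rng.gauss(0.0, 1.0)
        growth = (drift - 0.5 * volatility ** 2) * dt + volatility * math.sqrt(dt) * shock
        if rng.random() < jump_rate * dt:
            growth += math.log(1.0 + jump_size)
        path.append(path[-1] * math.exp(growth))
    return path

# Hypothetical settings: 120 monthly steps, seeded for repeatability.
rng = random.Random(42)
dt = 1.0 / 12.0
rw = random_walk(100.0, 0.05, 0.20, 120, dt, rng)
mr = mean_reversion(100.0, 120.0, 1.5, 10.0, 120, dt, rng)
jd = jump_diffusion(100.0, 0.05, 0.20, 0.5, 0.10, 120, dt, rng)

print(f"random walk final value:    {rw[-1]:.2f}")
print(f"mean-reversion final value: {mr[-1]:.2f}")
print(f"jump-diffusion final value: {jd[-1]:.2f}")
```

Each call produces one possible path: the generating equation is fixed, but every run with a different seed yields different results, which is why forecasts from such models are usually summarized over many simulated paths.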
Distributive Lags

P-Values of Distributive Lag Periods of Each Independent Variable

| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| X1 | 0.8467 | 0.2045 | 0.3336 | 0.9105 | 0.9757 | 0.1020 | 0.9205 | 0.1267 | 0.5431 | 0.9110 | 0.7495 | 0.4016 |
| X2 | 0.6077 | 0.9900 | 0.8422 | 0.2851 | 0.0638 | 0.0032 | 0.8007 | 0.1551 | 0.4823 | 0.1126 | 0.0519 | 0.4383 |
| X3 | 0.7394 | 0.2396 | 0.2741 | 0.8372 | 0.9808 | 0.0464 | 0.8355 | 0.0545 | 0.6828 | 0.7354 | 0.5093 | 0.3500 |
| X4 | 0.0061 | 0.6739 | 0.7932 | 0.7719 | 0.6748 | 0.8627 | 0.5586 | 0.9046 | 0.5726 | 0.6304 | 0.4812 | 0.5707 |
| X5 | 0.1591 | 0.2032 | 0.4123 | 0.5599 | 0.6416 | 0.3447 | 0.9190 | 0.9740 | 0.5185 | 0.2856 | 0.1489 | 0.7794 |
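One way to read the lag table is to screen it for lag periods whose p-value falls below a chosen significance level. The sketch below does this with the p-values copied from the table; the helper name `significant_lags` and the two thresholds are choices made for this example, not something the report prescribes.

```python
# P-values copied from the distributive lag table (lag periods 1 through 12).
LAG_P_VALUES = {
    "X1": [0.8467, 0.2045, 0.3336, 0.9105, 0.9757, 0.1020,
           0.9205, 0.1267, 0.5431, 0.9110, 0.7495, 0.4016],
    "X2": [0.6077, 0.9900, 0.8422, 0.2851, 0.0638, 0.0032,
           0.8007, 0.1551, 0.4823, 0.1126, 0.0519, 0.4383],
    "X3": [0.7394, 0.2396, 0.2741, 0.8372, 0.9808, 0.0464,
           0.8355, 0.0545, 0.6828, 0.7354, 0.5093, 0.3500],
    "X4": [0.0061, 0.6739, 0.7932, 0.7719, 0.6748, 0.8627,
           0.5586, 0.9046, 0.5726, 0.6304, 0.4812, 0.5707],
    "X5": [0.1591, 0.2032, 0.4123, 0.5599, 0.6416, 0.3447,
           0.9190, 0.9740, 0.5185, 0.2856, 0.1489, 0.7794],
}

def significant_lags(p_values, alpha):
    """Return the 1-based lag periods whose p-value is below alpha."""
    return [lag for lag, p in enumerate(p_values, start=1) if p < alpha]

for alpha in (0.05, 0.10):
    print(f"alpha = {alpha:.2f}")
    for name, p_values in LAG_P_VALUES.items():
        print(f"  {name}: {significant_lags(p_values, alpha)}")
```

At the 0.05 level this flags only lag 6 for X2 and X3 and lag 1 for X4. Relaxing the threshold to 0.10 adds lags 5 and 11 for X2 and lag 8 for X3, while X1 and X5 show no significant lags at either level.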
| Parameter | Periodic |
| --- | --- |
| Drift Rate | -1.48% |
| Reversion Rate | 283.89% |
| Jump Rate | 20.41% |
| Volatility | 88.84% |
| Long-Term Value | 327.72 |
| Jump Size | 237.89 |

Probability of stochastic model fit: 46.48%
A high fit means a stochastic model is better than conventional models.

| Runs Test statistic | Value |
| --- | --- |
| Runs | 20 |
| Standard Normal | -1.7321 |
| Positive | 25 |
| P-Value (1-tail) | 0.0416 |
| Negative | 25 |
| P-Value (2-tail) | 0.0833 |
| Expected Run | 26 |

A low p-value (below 0.10, 0.05, or 0.01, depending on the chosen significance level) means that the sequence is not random and hence suffers from stationarity problems, so an ARIMA model might be more appropriate. Conversely, higher p-values indicate randomness, in which case stochastic process models might be appropriate.
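The runs test statistics can be approximated with the textbook Wald-Wolfowitz normal approximation, sketched below using the counts from the report (20 runs, 25 positive, 25 negative). The function names are hypothetical, and the variance formula is the standard textbook one, assumed here because the report does not state its own. It reproduces the expected run count of 26 but gives a z-score of about -1.71 rather than the reported -1.7321, so the reporting software presumably uses a slightly different variance approximation.

```python
import math
from statistics import NormalDist

def count_runs(signs):
    """Count maximal blocks of identical consecutive values."""
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def runs_test(runs, n_pos, n_neg):
    """Wald-Wolfowitz runs test using the textbook normal approximation.

    Returns (expected_runs, z, p_one_tail, p_two_tail).
    """
    n = n_pos + n_neg
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)) / (n ** 2 * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    p_one_tail = NormalDist().cdf(-abs(z))
    return expected, z, p_one_tail, 2.0 * p_one_tail

# Counts taken from the runs test table in the report.
expected, z, p1, p2 = runs_test(runs=20, n_pos=25, n_neg=25)
print(f"expected runs = {expected:.0f}")
print(f"z = {z:.4f}, one-tail p = {p1:.4f}, two-tail p = {p2:.4f}")
```

In practice the signs passed to `count_runs` would mark whether each observation lies above or below a reference level such as the mean or median; fewer runs than expected, as here, indicates that like values cluster together rather than alternating randomly.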