Multiple Regression Analysis


Introduction

In regression analysis there is a response or dependent variable (y), which may be, for example, the abundance or the presence-absence of a species, and an explanatory or independent variable (x). The purpose is to obtain a simple function of the independent variable that describes, as closely as possible, the variation of the dependent variable.

Because the observed values of the dependent variable generally differ from those predicted by the function, the function carries an error. The best function is the one that describes the dependent variable with the least possible error or, in other words, with the smallest difference between observed and predicted values. The differences between observed and predicted values (the errors of the function) are called residuals. The parameters of the function are estimated by least squares: that is, the method seeks the function for which the sum of the squared differences between observed and predicted values is smallest. This strategy requires, however, that the residuals be normally distributed and vary similarly over the entire range of values of the dependent variable. These assumptions can be tested by examining the distribution of the residuals and their relationship with the dependent variable.
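The least-squares fit and the residual checks described above can be sketched as follows; the data here are invented for illustration, with a known true line plus noise.

```python
# Minimal sketch of a least-squares fit with NumPy; the data are made up.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 + 1.5 * x + rng.normal(0, 1.0, size=x.size)  # true line plus noise

# Design matrix with an intercept column; lstsq minimizes the sum of
# squared differences between observed and predicted values.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

predicted = X @ coef
residuals = y - predicted  # the residual variation

# The assumptions can be checked by inspecting the residuals: their mean
# is (numerically) zero, and their spread should look roughly constant
# across the range of x.
print(coef)              # estimated intercept and slope
print(residuals.mean())
```

Plotting `residuals` against `x` (or against `predicted`) is the usual visual check for the constant-spread assumption.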

[Figure: scatter plot with regression line for the relationship between life expectancy (Y) and the percentage of literate people (X) in various countries, early 1980s. Source: US Census Bureau, 1989.]

Regression analysis uses a mathematical equation to describe this kind of relationship. In linear regression the equation describes the straight line that, on average, provides the closest "fit" to all the points at once. There is only one such line, and it is known as the least-squares regression line. It gets its name from the fact that if we measure the vertical distance (called a residual) between the line and each point on the graph, square each distance, and add them up, the total will be smaller for this line than for any other. Hence, it is the line that best "fits" the data.
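The defining property of the least-squares line, that its sum of squared vertical distances is smaller than for any other line, can be checked directly. The numbers below are invented, not the Census Bureau data.

```python
# Illustration that the least-squares line beats any perturbed line on
# the sum of squared residuals. Data are hypothetical.
import numpy as np

x = np.array([60.0, 70.0, 80.0, 90.0, 99.0])   # hypothetical literacy %
y = np.array([55.0, 60.0, 66.0, 70.0, 75.0])   # hypothetical life expectancy

# Closed-form least-squares slope and intercept.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

def ssr(a, b):
    """Sum of squared vertical distances (residuals) for the line y = a + b*x."""
    return float(np.sum((y - (a + b * x)) ** 2))

best = ssr(intercept, slope)
# Any other line does worse:
assert best <= ssr(intercept + 0.5, slope)
assert best <= ssr(intercept, slope * 1.05)
print(round(best, 3))
```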

One common application of multiple regression is predicting values of a particular dependent variable based on knowledge of its association with certain independent variables. In this context, the independent variables are commonly referred to as predictor variables and the dependent variable is characterized as the criterion variable. In applied settings, it is often desirable to be able to predict a score on a criterion variable by using information that is available in certain predictor variables. For example, in the life insurance industry, actuarial scientists use complex regression models to predict, on the basis of certain predictor variables, how long a person will live.
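Prediction from several predictor variables works the same way as the simple case, with one coefficient per predictor. The sketch below uses invented predictors and coefficients; it is not an actual actuarial model.

```python
# Hedged sketch of multiple regression for prediction. Predictor names
# and true coefficients are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 200
exercise = rng.uniform(0, 10, n)    # hypothetical predictor variable
smoking = rng.uniform(0, 40, n)     # hypothetical predictor variable
lifespan = 70 + 1.2 * exercise - 0.3 * smoking + rng.normal(0, 2, n)  # criterion

# Fit the multiple regression: one column of ones plus one column per predictor.
X = np.column_stack([np.ones(n), exercise, smoking])
coef, *_ = np.linalg.lstsq(X, lifespan, rcond=None)

# Predict the criterion for a new individual from the fitted equation.
new_person = np.array([1.0, 5.0, 10.0])  # intercept term, exercise=5, smoking=10
print(new_person @ coef)
```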

Multiple regression is most commonly used to predict values of a criterion variable based on linear associations with predictor variables. A brief example using simple regression easily illustrates how this works. Assume that a horticulturist developed a new hybrid maple tree that grows exactly 2 feet for every year that ...
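The hybrid-maple premise can be shown in miniature: when the criterion grows exactly 2 feet per year with no error, simple regression recovers that slope exactly. The ages below are an assumed toy sample.

```python
# Sketch of the hybrid-maple example: height increases exactly 2 feet
# per year, so the fitted slope is exactly 2 and the intercept is 0.
import numpy as np

age = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical tree ages in years
height = 2.0 * age                         # grows exactly 2 feet per year

slope = np.sum((age - age.mean()) * (height - height.mean())) / np.sum((age - age.mean()) ** 2)
intercept = height.mean() - slope * age.mean()
print(slope, intercept)  # slope 2.0, intercept 0.0
```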