1. Write a brief description of the data you're working with.
Answer
The part of the total data analysed in this paper comprises two variables, education and income. The education variable records per-capita education expenditure, whereas the income variable records the per-capita income of the people living in the mentioned districts of the United States of America.
2. Produce a histogram of both variables and describe both distributions.
Answer
In statistics, a histogram is a graphical representation that helps in understanding the distribution of a given data set. The distribution of any data can be either normal or non-normal. The normality of the data can be evaluated by observing the shape of the histogram and of the normal curve that is drawn over it.
The histograms drawn for the variables of education (per-capita education expenditure) and income (per-capita income) are shown below:
The histogram and the normal curve drawn for the variable of education indicate that the data for this variable are approximately normally distributed. The data points follow the common pattern of a bell-shaped curve, which qualifies the distribution as normal. However, the data are not perfectly normally distributed, as some extended edges depart from the normal curve.
The histogram and the normal curve drawn for the variable of income suggest that this variable is also normally distributed. There are only a few extended edges that depart from the normal bell-shaped curve (Boston University School of Public Health, 2013).
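The visual check described above, a histogram compared against a fitted normal curve, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `histogram_vs_normal`, the choice of 8 bins and the randomly generated sample are all hypothetical, and the sample is placeholder data, not the education or income data analysed in this paper.

```python
import random
from statistics import NormalDist, mean, stdev


def histogram_vs_normal(values, bins=8):
    """Return bin edges, observed counts, and the counts expected
    under a normal curve fitted to the sample mean and standard deviation."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]

    # Count how many observations fall into each bin.
    observed = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # keep the maximum in the last bin
        observed[idx] += 1

    # Expected count per bin = n * (normal probability mass inside the bin).
    fitted = NormalDist(mean(values), stdev(values))
    n = len(values)
    expected = [
        n * (fitted.cdf(edges[i + 1]) - fitted.cdf(edges[i])) for i in range(bins)
    ]
    return edges, observed, expected


# Placeholder sample of 44 values (assumption: NOT the real data set).
rng = random.Random(0)
sample = [rng.gauss(100, 15) for _ in range(44)]

edges, observed, expected = histogram_vs_normal(sample)
for left, obs, exp in zip(edges, observed, expected):
    # Text histogram: bars of '#' next to the count the normal curve predicts.
    print(f"{left:8.1f} | {'#' * obs:<15} observed={obs:2d} expected={exp:4.1f}")
```

Bins whose observed count lies far from the expected count correspond to the "extended edges" that depart from the bell-shaped curve.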
3. Test to see which of the two variables is most normal.
Answer
The tests of normality conducted for the variables of education and income yielded Shapiro-Wilk significance values (p-values) of 0.284 and 0.753, respectively. Because both values are greater than the chosen alpha level of 0.05, the hypothesis of normality cannot be rejected for either variable, which suggests that the data are normally distributed. However, a comparison of the two results suggests that the data for the variable of income are more normal than the data recorded for the variable of education (Weinberg & Abramowitz, 2008). Income has both the higher significance value and the Shapiro-Wilk statistic closer to 1 (.983 against .969).
Table 1: Test of Normality for Income
Tests of Normality
            Kolmogorov-Smirnov(a)          Shapiro-Wilk
            Statistic   df   Sig.          Statistic   df   Sig.
income      .070        44   .200*         .983        44   .753
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
Table 2: Test of Normality for Education
Tests of Normality
            Kolmogorov-Smirnov(a)          Shapiro-Wilk
            Statistic   df   Sig.          Statistic   df   Sig.
education   .090        44   .200*         .969        44   .284
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
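The decision rule applied to Tables 1 and 2 can be sketched in Python as follows. The numbers are the Shapiro-Wilk values reported in the tables; the names `consistent_with_normality` and `rank_by_normality` are hypothetical helpers introduced only for illustration. Ranking by the W statistic (closer to 1 meaning closer to normal) is a reasonable comparison here only because both variables have the same sample size (df = 44).

```python
ALPHA = 0.05  # chosen significance level

# Shapiro-Wilk results as reported in Tables 1 and 2 (df = 44 for both).
shapiro_wilk = {
    "income": {"statistic": 0.983, "p_value": 0.753},
    "education": {"statistic": 0.969, "p_value": 0.284},
}


def consistent_with_normality(p_value, alpha=ALPHA):
    """True when the null hypothesis of normality is NOT rejected (p > alpha)."""
    return p_value > alpha


def rank_by_normality(results):
    """Order variable names from most to least normal by the W statistic."""
    return sorted(results, key=lambda name: results[name]["statistic"], reverse=True)


for name, result in shapiro_wilk.items():
    verdict = "not rejected" if consistent_with_normality(result["p_value"]) else "rejected"
    print(f"{name}: W = {result['statistic']:.3f}, p = {result['p_value']:.3f}, normality {verdict}")

print("Most normal first:", rank_by_normality(shapiro_wilk))
```

With the raw observations available, the same statistic and p-value could be obtained from `scipy.stats.shapiro`, which returns both values for a one-dimensional sample.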
Other methods used to compare the normality of the data for the two variables included stem-and-leaf plots, normal Q-Q plots and detrended normal Q-Q plots. These outputs suggest that the data are normally distributed; however, the data set recorded for the variable of income tends to be more normal as ...