This report develops a regression equation from the supplied data on sales and a set of other variables. The case involves the decision to locate a new store at one of two candidate sites. The decision will be based on estimates of sales potential, so we need to develop a multiple regression model to predict sales.
Data
The data were supplied in an Excel file containing a number of variables. The first column, stores, holds the serial number of each observation. The dataset contains 32 other variables, including sales, and each variable has 250 observations.
We first examine the relationship between sales and the comtype variable using a scatter plot. This is followed by a brief summary of the regression analysis of the data. The goal is a regression equation that can be used to predict future values of sales. Regression describes how a dependent variable depends on one or more independent variables.
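The correlation step described above can be illustrated with a short sketch. It assumes the coefficients in this report are ordinary Pearson correlations (as Excel's correlation tool produces); the function name pearson_r and the five data points are hypothetical and are not taken from the case data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five stores (NOT taken from the case data).
pct_nocars = [5.0, 12.0, 20.0, 31.0, 44.0]
sales = [9000.0, 11500.0, 12800.0, 16000.0, 19500.0]
r = pearson_r(pct_nocars, sales)
print(f"r = {r:.4f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none. For a categorical code such as comtype the coefficient is only a rough guide, which is why the scatter plot is examined as well.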
Results and discussion
In the scatter plot of sales against comtype, sales in the middle categories 3 to 6 fall in similar ranges on the vertical axis, but categories 1 and 2 have somewhat higher sales and category 7 appears to have somewhat lower sales. This suggests that, when dummy variables are created for comtype, the dummy variables for categories 1, 2 and 7 are likely to be statistically significant in the multiple regression model (and the dummy variables for categories 3 to 6 are likely to be not significant).
The tables below give the correlations between sales and the other variables, and among those variables themselves. The matrix is split into three blocks of columns. The stores column is excluded because it contains only the serial numbers of the observations. The correlated variables are shown in bold.
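The dummy-variable coding mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the case: the function name make_dummies, the column names comtype_1, comtype_2, ... and the choice of category 4 as the omitted baseline are all hypothetical.

```python
def make_dummies(comtype_values, categories=range(1, 8), baseline=4):
    """Return one dict of 0/1 indicators per observation.

    The baseline category is omitted so that the dummies are not
    perfectly collinear with the regression intercept.
    """
    kept = [c for c in categories if c != baseline]
    rows = []
    for value in comtype_values:
        if value not in categories:
            raise ValueError(f"unexpected comtype value: {value!r}")
        rows.append({f"comtype_{c}": int(value == c) for c in kept})
    return rows


# Three hypothetical stores with comtype 1, 4 (the assumed baseline) and 7.
example = make_dummies([1, 4, 7])
for row in example:
    print(row)
```

With seven categories this yields six indicator columns. Each coefficient is then read as the difference in expected sales relative to the baseline category, holding the other predictors fixed.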
                          %black      %spanishsp  %inc0-10    %inc10-14   %inc14-20   %inc20-30   %inc30-50   %inc50-100  %inc100+
%black                    1
%spanishsp                0.357427    1
%inc0-10                  0.362107    0.726546    1
%inc10-14                 0.457146    0.691285    0.77167     1
%inc14-20                 0.258855    0.307898    0.467917    0.445675    1
%inc20-30                 -0.09728    -0.2023     -0.19229    -0.12769    0.100411    1
%inc30-50                 -0.26088    -0.46347    -0.53141    -0.58086    -0.47484    -0.10451    1
%inc50-100                -0.15994    -0.27106    -0.43594    -0.37013    -0.53457    -0.49964    0.190495    1
%inc100+                  -0.09349    -0.05814    -0.2058     -0.18094    -0.30687    -0.47028    -0.28237    0.393417    1
medianinc                 -0.24338    -0.4433     -0.50953    -0.47957    -0.3511     0.023333    0.337775    0.367527    0.104077
medianrent                -0.25587    -0.47718    -0.57385    -0.56233    -0.3611     0.139653    0.447805    0.25987     0.014864
medianhome                -0.16203    -0.16396    -0.21734    -0.27103    -0.34112    -0.26177    0.210398    0.45391     0.189482
%owners                   -0.23069    -0.49081    -0.6785     -0.67266    -0.35936    0.295064    0.486771    0.159258    -0.00133
%nocars                   0.345944    0.614018    0.775739    0.748319    0.400051    -0.27983    -0.51292    -0.28128    -0.02377
%1car                     0.123581    -0.18832    -0.03531    0.001752    0.078698    -0.04701    -0.08659    0.07361     0.078594
%tvs                      0.093697    0.018538    -0.07821    -0.02379    0.042768    0.011531    -0.04748    0.003554    0.086017
%washers                  -0.20157    -0.41246    -0.55679    -0.51408    -0.30793    0.191278    0.378661    0.196096    0.022294
%dryers                   -0.42793    -0.45287    -0.65238    -0.6672     -0.43538    0.265067    0.471975    0.247908    0.007152
%dishw                    -0.37281    -0.42938    -0.63281    -0.65789    -0.47741    0.027258    0.535406    0.406822    0.081965
%aircond                  -0.12488    -0.4069     -0.38638    -0.42986    -0.17775    -0.122      0.357187    0.344613    0.036188
%freezer                  -0.30988    -0.41081    -0.65183    -0.63883    -0.44748    0.156002    0.530266    0.285889    0.02411
%sechome                  -0.09138    -0.27526    -0.32441    -0.3254     -0.25463    -0.15453    0.392878    0.324265    -0.00091
%sch0-8                   0.322587    0.533597    0.664647    0.625479    0.369669    -0.16176    -0.51621    -0.28566    0.009884
%sch9-11                  0.193613    0.275794    0.25788     0.243109    0.201412    0.172021    -0.27617    -0.32679    -0.03781
%sch12                    -0.08104    -0.23518    -0.31339    -0.31211    -0.0885     0.143911    0.239439    0.001755    -0.01949
%sch12+                   -0.27392    -0.36596    -0.407      -0.37146    -0.30854    -0.04272    0.349707    0.353606    0.021023
population                0.459857    0.394565    0.515098    0.521082    0.322593    -0.20415    -0.30771    -0.22       -0.05598
familysize                -0.14715    -0.11384    -0.21871    -0.27426    -0.13895    0.174816    0.15144     0.025405    -0.0234
selling_sqrft (in 1000s)  0.215144    0.14474     0.217351    0.261532    0.2191      -0.14756    -0.15275    -0.01554    -0.05674
sales (in $1000s)         0.274686    0.547427    0.615054    0.61405     0.265031    -0.31003    -0.40371    -0.10666    0.010405
                          medianinc   medianrent  medianhome  %owners     %nocars     %1car       %tvs        %washers    %dryers     %dishw      %aircond
medianinc                 1
medianrent                0.338675    1
medianhome                0.28343     0.233588    1
%owners                   0.296713    0.502462    0.10303     1
%nocars                   -0.43326    -0.5452     -0.17962    -0.84667    1
%1car                     0.094887    -0.02583    -0.01319    -0.04599    0.012165    1
%tvs                      -0.03998    0.010248    -0.07697    0.065414    -0.01908    -0.14887    1
%washers                  0.231024    0.334522    0.091914    0.61592     -0.64096    -0.03349    0.030832    1
%dryers                   0.3847      0.454108    0.177265    0.758404    -0.79981    -0.15638    0.084337    0.663177    1
%dishw                    0.456397    0.466217    0.313716    0.635515    -0.66661    -0.158      0.081797    0.485032    0.690043    1
%aircond                  0.308554    0.239947    0.23632     0.238798    -0.32381    -0.03254    0.00852     0.243451    0.187352    0.332953    1
%freezer                  0.327953    0.481519    0.160979    0.75627     -0.76103    -0.20303    0.064363    0.632635    0.792721    0.697435    0.213012
%sechome                  0.272736    0.223736    0.228777    0.31387     -0.29784    -0.12657    0.092936    0.305367    0.356459    0.464875    0.523516
%sch0-8                   -0.38713    -0.51689    -0.21895    -0.61511    0.681949    0.074723    -0.09917    -0.48054    -0.62474    -0.611      -0.31198
%sch9-11                  -0.27948    -0.2489     -0.19581    -0.11733    0.197764    -0.01349    0.038365    -0.07224    -0.17752    -0.29309    -0.21112
%sch12                    0.194872    0.303938    -0.08422    0.353458    -0.32615    0.012231    0.084403    0.178108    0.274093    0.239167    0.114437
%sch12+                   0.281288    0.295108    0.30717     0.28397     -0.38724    -0.06085    0.003744    0.280838    0.370123    0.430365    0.250019
population                -0.31629    -0.29829    -0.15946    -0.579      0.641109    0.183335    -0.01078    -0.47076    -0.64801    -0.50905    -0.12824
familysize                0.029826    0.123929    -0.03479    0.321108    -0.29133    -0.03909    -0.04211    0.273843    0.326702    0.251451    -0.06656
selling_sqrft (in 1000s)  -0.14107    -0.24716    -0.05195    -0.2832     0.299593    0.084676    -0.06612    -0.19569    -0.33736    -0.23141    0.073247
sales (in $1000s)         -0.32539    -0.39391    0.029872    -0.68985    0.700939    0.009904    -0.0584     -0.56226    -0.65733    -0.49124    -0.29024
                          %freezer    %sechome    %sch0-8     %sch9-11    %sch12      %sch12+     population  familysize  selling_sqrft  sales
%freezer                  1
%sechome                  0.353215    1
%sch0-8                   -0.64322    -0.29341    1
%sch9-11                  -0.1416     -0.23188    0.238672    1
%sch12                    0.265051    0.014981    -0.36328    -0.00024    1
%sch12+                   0.37601     0.311942    -0.62543    -0.59231    -0.40048    1
population                -0.57667    -0.1102     0.488521    0.133415    -0.14483    -0.33466    1
familysize                0.321659    0.089155    -0.18507    -0.02167    0.066079    0.107372    -0.21113    1
selling_sqrft (in 1000s)  -0.33474    0.004429    0.306365    -0.08974    -0.15165    -0.09774    0.306989    -0.18811    1
sales (in $1000s)         -0.63945    -0.28745    0.486217    0.007783    -0.23765    -0.21836    0.599957    -0.27955    0.349022       1
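The ten variables carried forward to the regression are %inc14-20, %inc20-30, %inc30-50, %inc50-100, %inc100+, medianinc, medianrent, medianhome, %owners and %nocars. Of these, %nocars (0.701) and %owners (-0.690) have the strongest correlations with sales, while %inc100+ (0.010) and medianhome (0.030) are almost uncorrelated with it. Several variables outside this list, such as %dryers (-0.657), %freezer (-0.639) and %inc0-10 (0.615), are more strongly correlated with sales than most of the ten.
Regression of sales on the 10 selected variables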
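To see how strongly each of these ten variables is related to sales, their correlations can be sorted by absolute value. The sketch below copies the coefficients from the sales row of the correlation tables; the dictionary name sales_corr and the ranking step itself are illustrative additions, not part of the original analysis.

```python
# Correlation of each selected variable with sales (in $1000s),
# copied from the sales row of the correlation tables.
sales_corr = {
    "%inc14-20": 0.265031,
    "%inc20-30": -0.31003,
    "%inc30-50": -0.40371,
    "%inc50-100": -0.10666,
    "%inc100+": 0.010405,
    "medianinc": -0.32539,
    "medianrent": -0.39391,
    "medianhome": 0.029872,
    "%owners": -0.68985,
    "%nocars": 0.700939,
}

# Rank by strength of linear association, ignoring the sign.
ranked = sorted(sales_corr.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, r in ranked:
    print(f"{name:<12} {r:+.6f}")
```

The ranking puts %nocars and %owners at the top and %inc100+ at the bottom, so the ten variables differ widely in how much each one, taken alone, tells us about sales.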
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.758935
R Square             0.575982
Adjusted R Square    0.558241
Standard Error       3623.869
Observations         250

ANOVA
            df     SS          MS          F           Significance F
Regression  10     4.26E+09    4.26E+08    32.46557    3.09E-39
Residual    239    3.14E+09    13132430
Total       249    7.4E+09

            Coefficients  Standard Error  t Stat     P-value    Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept   29992.26      5634.274        5.323181   2.35E-07   18893.08   41091.44   18893.08     41091.44
%inc14-20   -218.377      101.365         -2.15436   0.032212   -418.06    -18.6937   -418.06      -18.6937
%inc20-30   -247.175      71.69977        -3.44737   0.000669   -388.42    -105.931   -388.42      -105.931
%inc30-50   -226.18       67.52907        -3.34937   0.000941   -359.208   -93.1517   -359.208     -93.1517
%inc50-100  -154.335      85.79453        -1.79889   0.073297   -323.345   14.67481   -323.345     14.67481
%inc100+    -209.654      68.48379        -3.06137   0.002455   -344.563   -74.7453   -344.563     -74.7453
medianinc   -0.04507      0.046098        -0.97776   0.329181   -0.13588   0.045738   -0.13588     0.045738
medianrent  2.951005      4.161783        0.709072   0.478971   -5.24745   11.14946   -5.24745     11.14946
medianhome  0.050972      0.016838        3.027244   0.002738   0.017803   0.084142   0.017803     0.084142
%owners     -54.7744      16.54521        -3.31059   0.001075   -87.3675   -22.1813   -87.3675     -22.1813
%nocars     91.90787      34.95944        2.628985   0.009119   23.03988   160.7758   23.03988     160.7758
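The summary statistics in the regression output are linked by standard formulas, and recomputing them is a quick consistency check. The sketch below uses only numbers reported in the output (R Square, the number of observations, the number of predictors, and one coefficient with its standard error); the variable names are illustrative.

```python
import math

n = 250               # observations
k = 10                # predictors in the model
r_squared = 0.575982  # reported R Square

# Multiple R is the square root of R Square.
multiple_r = math.sqrt(r_squared)

# Adjusted R Square penalises R Square for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Overall F statistic written in terms of R Square.
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# A t statistic is the coefficient divided by its standard error (%inc14-20 here).
t_inc14_20 = -218.377 / 101.365

print(f"Multiple R        {multiple_r:.6f}")
print(f"Adjusted R Square {adj_r_squared:.6f}")
print(f"F                 {f_stat:.4f}")
print(f"t (%inc14-20)     {t_inc14_20:.5f}")
```

The recomputed values agree with the reported Multiple R, Adjusted R Square, F and t Stat to the precision shown in the output. The residual degrees of freedom, n - k - 1 = 239, also match the ANOVA table.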
With this model we can predict the value of y (sales) from the x variables. The fitted regression equation is: Sales = 29992 + (-218.377 x1) + (-247.175 x2) + … + 91.90787 ...
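A prediction from the fitted equation is the intercept plus each coefficient multiplied by the corresponding variable. The sketch below takes the intercept and coefficients from the regression output above; the function name predict_sales and the candidate-site values are hypothetical and are not data from the case.

```python
# Intercept and coefficients copied from the regression output.
INTERCEPT = 29992.26
COEFFICIENTS = {
    "%inc14-20": -218.377,
    "%inc20-30": -247.175,
    "%inc30-50": -226.18,
    "%inc50-100": -154.335,
    "%inc100+": -209.654,
    "medianinc": -0.04507,
    "medianrent": 2.951005,
    "medianhome": 0.050972,
    "%owners": -54.7744,
    "%nocars": 91.90787,
}


def predict_sales(site):
    """Predicted sales (in $1000s) for a dict of predictor values."""
    missing = set(COEFFICIENTS) - set(site)
    if missing:
        raise KeyError(f"missing predictors: {sorted(missing)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * site[name] for name in COEFFICIENTS)


# Hypothetical candidate site (made-up values, for illustration only).
site_a = {
    "%inc14-20": 10.0, "%inc20-30": 20.0, "%inc30-50": 30.0,
    "%inc50-100": 15.0, "%inc100+": 3.0, "medianinc": 30000.0,
    "medianrent": 400.0, "medianhome": 90000.0,
    "%owners": 60.0, "%nocars": 15.0,
}
print(f"Predicted sales: {predict_sales(site_a):.1f} (in $1000s)")
```

Running the same function on both candidate sites gives two predicted sales figures to compare. The Standard Error of 3623.869 reported in the output indicates roughly how far individual predictions can be expected to fall from actual sales.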