The purpose of this paper is to analyze a data set from a fruit and vegetable market that supplies fresh produce to supermarket outlets in NSW and Victoria. Seven variables are used in this study, and they are listed below. The paper evaluates the customers in Sydney and Melbourne.
| Column | Name | Description |
| --- | --- | --- |
| A | ID Number | Customer Number |
| B | Customer | 1 = Coles; 2 = Private; 3 = Safeway |
| C | City | 1 = Sydney; 2 = Melbourne |
| D | Produce | 1 = Apple; 2 = Orange; 3 = Potato; 4 = Tomato |
| E | Quantity (sold/day) | Total quantity of produce sold per day |
| F | Price $/kg | Price in dollars per kilogram |
| G | % of produce damage | Percentage of produce damaged |
A sample of 335 customers that sell different fresh fruits and vegetables has been taken. The price variable gives the price in dollars per kg charged by the customer in a particular city, and the percentage of damaged produce is also taken into consideration. Three of the variables (Customer, City, Produce) are measured on a nominal scale, while the others are scale (continuous) measures. The random sample of 50 records generated through Excel is shown in the table below.
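| ID number | Customer | City | Produce | Quantity (sold/day) | Price $/kg | % of Produce Damage |
| --- | --- | --- | --- | --- | --- | --- |
| 80 | 1 | 2 | 3 | 529.41 | 0.36 | 7.92 |
| 42 | 1 | 2 | 2 | 11.10 | 2.99 | 1.13 |
| 68 | 1 | 1 | 3 | 321.88 | 1.25 | 5.25 |
| 327 | 3 | 2 | 4 | 360.72 | 1.53 | 8.41 |
| 218 | 3 | 1 | 1 | 177.78 | 2.31 | 4.07 |
| 211 | 2 | 2 | 4 | 85.31 | 2.34 | 4.58 |
| 74 | 1 | 2 | 3 | 396.38 | 0.81 | 7.07 |
| 63 | 1 | 1 | 3 | 337.18 | 1.04 | 5.88 |
| 3 | 1 | 1 | 1 | 235.31 | 1.88 | 5.36 |
| 186 | 2 | 1 | 4 | 478.02 | 1.90 | 7.30 |
| 307 | 3 | 1 | 4 | 461.12 | 1.09 | 9.73 |
| 120 | 2 | 2 | 1 | 305.48 | 2.05 | 4.85 |
| 17 | 1 | 2 | 1 | 140.02 | 2.72 | 3.04 |
| 145 | 2 | 2 | 2 | 344.16 | 0.94 | 7.18 |
| 104 | 2 | 1 | 1 | 182.03 | 2.28 | 4.16 |
| 313 | 3 | 1 | 4 | 82.03 | 2.05 | 5.62 |
| 261 | 3 | 2 | 2 | 83.06 | 2.29 | 3.23 |
| 329 | 3 | 2 | 4 | 118.80 | 2.29 | 4.76 |
| 127 | 2 | 2 | 1 | 162.50 | 2.50 | 3.70 |
| 157 | 2 | 1 | 3 | 318.86 | 1.29 | 5.13 |
| 154 | 2 | 1 | 3 | 396.85 | 1.57 | 4.79 |
| 165 | 2 | 2 | 3 | 342.91 | 1.38 | 5.36 |
| 132 | 2 | 1 | 2 | 135.82 | 2.76 | 1.82 |
| 229 | 3 | 1 | 1 | 104.44 | 2.58 | 3.46 |
| 272 | 3 | 2 | 2 | 331.46 | 1.12 | 6.64 |
| 227 | 3 | 1 | 1 | 111.50 | 2.51 | 3.67 |
| 30 | 1 | 1 | 2 | 346.85 | 1.57 | 5.29 |
| 194 | 2 | 1 | 4 | 5.00 | 2.28 | 4.79 |
| 286 | 3 | 1 | 3 | 344.85 | 0.93 | 6.21 |
| 183 | 2 | 1 | 4 | 381.38 | 1.42 | 8.74 |
| 160 | 2 | 1 | 3 | 356.88 | 0.75 | 6.75 |
| 140 | 2 | 2 | 2 | 35.82 | 2.76 | 1.82 |
| 197 | 2 | 1 | 4 | 16.03 | 2.62 | 3.57 |
| 33 | 1 | 1 | 2 | 323.28 | 1.85 | 4.45 |
| 182 | 2 | 1 | 4 | 461.12 | 1.65 | 8.05 |
| 318 | 3 | 1 | 4 | 42.45 | 2.45 | 4.18 |
| 86 | 1 | 1 | 4 | 95.95 | 2.88 | 2.63 |
| 112 | 2 | 1 | 1 | 69.82 | 2.91 | 2.47 |
| 190 | 2 | 1 | 4 | 18.00 | 2.82 | 2.85 |
| 54 | 1 | 2 | 2 | 275.88 | 1.82 | 4.54 |
| 116 | 2 | 2 | 1 | 350.09 | 1.53 | 6.41 |
| 178 | 2 | 2 | 3 | 502.84 | 0.73 | 6.81 |
| 87 | 1 | 1 | 4 | 9.00 | 2.96 | 2.34 |
| 20 | 1 | 2 | 1 | 188.77 | 2.23 | 4.51 |
| 133 | 2 | 1 | 2 | 72.61 | 3.33 | 0.11 |
| 136 | 2 | 1 | 2 | 364.28 | 1.35 | 5.95 |
| 222 | 3 | 1 | 1 | 154.48 | 2.47 | 3.59 |
| 303 | 3 | 2 | 3 | 523.45 | 0.45 | 7.65 |
| 128 | 2 | 2 | 1 | 145.21 | 2.67 | 3.19 |
| 320 | 3 | 2 | 4 | 404.24 | 1.59 | 8.23 |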
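The sample above was drawn in Excel. As a rough equivalent, the sketch below draws 50 distinct ID numbers from 335 without replacement in Python. It assumes the customers are numbered 1 to 335, and the seed value is arbitrary, so it will not reproduce the particular rows listed in the table.

```python
import random

POPULATION_SIZE = 335  # number of customers in the data set
SAMPLE_SIZE = 50       # number of rows in the sample table


def draw_sample_ids(seed=None):
    """Return SAMPLE_SIZE distinct ID numbers chosen at random.

    Assumption: customers are numbered 1..POPULATION_SIZE.
    """
    rng = random.Random(seed)
    return rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE)


# The seed is arbitrary and only makes the draw repeatable.
sample_ids = draw_sample_ids(seed=1)
print(sorted(sample_ids))
```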
Contingency tables are used to record and analyze the relationship between two or more variables, usually qualitative in nature (nominal or ordinal). Such a table lets us see at a glance whether the proportion of one category (for example, a type of produce) is roughly the same across the groups of the other variable (for example, the customers). When the proportions are not identical, the statistical significance of the difference between them can be assessed with Pearson's χ² test, provided that the figures in the table are a random sample from a population. If the proportion of individuals in each column varies between the different rows, and vice versa, we say that there is an association between the two variables.
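Customer * Produce Cross-tabulation (Count)

| Customer | Apple | Orange | Potato | Tomato | Total |
| --- | --- | --- | --- | --- | --- |
| Coles | 3 | 4 | 4 | 2 | 13 |
| Private | 6 | 5 | 5 | 7 | 23 |
| Safeway | 4 | 2 | 2 | 6 | 14 |
| Total | 13 | 11 | 11 | 15 | 50 |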
The contingency table is a particular way of representing simultaneously two characteristics observed on the same population, provided they are discrete, or continuous and grouped into classes. The table above shows that, among the Coles customers in the sample, oranges and potatoes are the most common produce (4 customers each), whereas only 2 Coles customers sell tomatoes. Among the Safeway customers, on the other hand, tomatoes are the most common produce (6 of 14). Overall, private customers form the largest group in the sample (23 of 50).
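The Pearson χ² test mentioned earlier can be applied to the Customer * Produce counts. The following is a minimal sketch in plain Python, not the procedure used to produce the output in this paper. The function name `chi_squared` is ours, and the critical value is the standard 5% value for 6 degrees of freedom taken from a χ² table.

```python
# Observed counts from the Customer * Produce cross-tabulation.
# Columns are Apple, Orange, Potato, Tomato.
observed = {
    "Coles":   [3, 4, 4, 2],
    "Private": [6, 5, 5, 7],
    "Safeway": [4, 2, 2, 6],
}


def chi_squared(table):
    """Return (Pearson chi-squared statistic, degrees of freedom)."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            # Expected count if Customer and Produce were independent.
            expected = row_total * col_total / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


chi2, dof = chi_squared(observed)
CRITICAL_5PCT_DF6 = 12.592  # from a standard chi-squared table
print(f"chi2 = {chi2:.3f}, df = {dof}, "
      f"reject independence: {chi2 > CRITICAL_5PCT_DF6}")
```

Several expected counts in this table are below 5, so the χ² approximation should be read with caution for a sample of this size.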
Statistics

| | Produce | Price/kg | Percentage of Produce Damage |
| --- | --- | --- | --- |
| N Valid | 50 | 50 | 50 |
| N Missing | 285 | 285 | 285 |
| Mean | 2.56 | 1.9170 | 4.9848 |
| Median | 3.00 | 1.9750 | 4.7900 |
| Std. Deviation | 1.181 | .75377 | 2.11331 |
| Variance | 1.394 | .568 | 4.466 |
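The Price/kg figures in the Statistics output can be checked against the sample data. The sketch below uses Python's standard `statistics` module on the 50 prices transcribed from the sample table. It assumes the sample (n - 1) definitions of standard deviation and variance, which is what `statistics.stdev` and `statistics.variance` compute.

```python
import statistics

# Price $/kg for the 50 sampled customers, in the order of the sample table.
prices = [
    0.36, 2.99, 1.25, 1.53, 2.31, 2.34, 0.81, 1.04, 1.88, 1.90,
    1.09, 2.05, 2.72, 0.94, 2.28, 2.05, 2.29, 2.29, 2.50, 1.29,
    1.57, 1.38, 2.76, 2.58, 1.12, 2.51, 1.57, 2.28, 0.93, 1.42,
    0.75, 2.76, 2.62, 1.85, 1.65, 2.45, 2.88, 2.91, 2.82, 1.82,
    1.53, 0.73, 2.96, 2.23, 3.33, 1.35, 2.47, 0.45, 2.67, 1.59,
]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
sd_price = statistics.stdev(prices)      # sample standard deviation (n - 1)
var_price = statistics.variance(prices)  # sample variance (n - 1)

print(f"N = {len(prices)}")
print(f"mean = {mean_price:.4f}, median = {median_price:.4f}")
print(f"std. deviation = {sd_price:.5f}, variance = {var_price:.3f}")
```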
Report: Percentage of Produce Damage

| Customer | Mean | N | Std. Deviation |
| --- | --- | --- | --- |
| Coles | 4.5700 | 13 | 1.90883 |
| Private | 4.7991 | 23 | 2.18408 |
| Safeway | 5.6750 | 14 | 2.15678 |
| Total | 4.9848 | 50 | 2.11331 |
ANOVA Table

| Percentage of Produce Damage * Customer | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- |
| Between Groups (Combined) | 9.699 | 2 | 4.849 | 1.090 | .345 |
| Within Groups | 209.140 | 47 | 4.450 | | |
| Total | 218.839 | 49 | | | |
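The F value in the ANOVA table follows directly from the reported sums of squares and degrees of freedom: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the between-groups to the within-groups mean square. A short sketch of that arithmetic, using only the figures in the table:

```python
# Values taken from the ANOVA table.
ss_between, df_between = 9.699, 2
ss_within, df_within = 209.140, 47

ms_between = ss_between / df_between  # between-groups mean square
ms_within = ss_within / df_within     # within-groups mean square
f_stat = ms_between / ms_within       # F ratio

ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"MS between = {ms_between:.4f}, MS within = {ms_within:.4f}")
print(f"F = {f_stat:.3f}")
print(f"SS total = {ss_total:.3f}, df total = {df_total}")
```

The reported Sig. value of .345 is above the conventional 0.05 level, which indicates that the mean percentage of produce damage does not differ significantly between the three customer groups in this sample.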
In this section we will see how to test the null hypothesis that the means of two independent samples (or subgroups) are equal. We judge whether the two means are equal in the population based on the result of the comparison between the two samples. The technique used is called the t-test for independent samples (independent-samples t-test). It is used to compare two groups, defined by a categorical variable, on the average of a continuous measurement.
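As an illustration of the technique, the sketch below computes a pooled-variance t statistic from summary statistics. The choice of groups is ours and is only an example: it uses the Coles and Safeway means, standard deviations and group sizes for percentage of produce damage from the Report table. It assumes equal variances in the two groups, and the critical value is the standard two-sided 5% value for 25 degrees of freedom from a t table.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance.

    Assumes the two groups have equal population variances.
    Returns (t statistic, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Percentage of produce damage: Coles vs Safeway (illustrative choice).
t_stat, df = pooled_t(4.5700, 1.90883, 13, 5.6750, 2.15678, 14)

T_CRITICAL_5PCT_DF25 = 2.060  # two-sided, from a standard t table
reject_null = abs(t_stat) > T_CRITICAL_5PCT_DF25

print(f"t = {t_stat:.3f}, df = {df}, reject null: {reject_null}")
```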
Null hypothesis
There is no difference between the averages of two groups in the ...