Chapter 9
Robust estimation
Puzzle 1
What is a robust estimate?
A robust estimate is one that is, on average, equal to the expected population value even when the normal assumptions of the statistic are not met.
Puzzle 2
What is the difference between trimming data and winsorizing it?
They both give robust estimates, but they treat extreme scores differently. The trimmed mean is the mean of the scores that remain after a percentage of the most extreme scores has been removed from each end of the distribution. For example, removing the highest and lowest 20% of scores and then computing the mean of the remaining scores would give us the 20% trimmed mean. Winsorizing data, on the other hand, keeps every score but replaces the extreme ones: a percentage of the highest scores is replaced with the next highest score in the data (i.e., the largest score that is not being replaced), and the same percentage of the lowest scores is replaced with the next lowest score in the data.
Puzzle 3
Zach randomly selected 10 scores from the professional services non-employees (see Figure 9.1 in the book): 14, 15, 13, 11, 16, 13, 21, 12, 11, 15. Calculate the mean, the 20% trimmed mean, the 10% trimmed mean, and the 20% winsorized mean.
First, let’s calculate the mean by adding the scores and dividing by the number of scores:
$$ \begin{aligned} \bar{X} &= \frac{\sum_{i = 1}^n x_i}{n} \\ &= \frac{14+15+13+11+16+13+21+12+11+15}{10} \\ &= \frac{141}{10} \\ &= 14.1. \end{aligned} $$
To trim 20% of the data from the two ends of the distribution, we need to trim 2 scores from each end (because 20% of 10 is 2). The mean of the remaining 6 scores is the 20% trimmed mean. We first need to arrange the scores in ascending order: 11, 11, 12, 13, 13, 14, 15, 15, 16, 21. Then we trim (i.e. delete) 2 scores from each end. The data are now: 12, 13, 13, 14, 15, 15 (note that we trimmed the two 11s from the bottom, and the 16 and 21 from the top). Finally, we calculate the mean of these 6 scores:
$$ \begin{aligned} \bar{X} &= \frac{\sum_{i = 1}^n x_i}{n} \\ &= \frac{12+13+13+14+15+15}{6} \\ &= \frac{82}{6} \\ &= 13.67. \end{aligned} $$
To trim 10% of the data, we need to trim 1 score from each end because 10% of 10 is 1. This involves removing the lowest score (one of the two 11s) and the highest score (21). The remaining 8 scores are: 11, 12, 13, 13, 14, 15, 15, 16. The 10% trimmed mean will be the mean of these scores:
$$ \begin{aligned} \bar{X} &= \frac{\sum_{i = 1}^n x_i}{n} \\ &= \frac{11+ 12+13+13+14+15+15+16}{8} \\ &= \frac{109}{8} \\ &= 13.63. \end{aligned} $$
To calculate the 20% winsorized mean, we need to replace the top and bottom 20% of scores with the next highest or lowest score. For these data, the top 2 scores (16 and 21) are both replaced with the next highest score (15), and the bottom 2 scores (11 and 11) are replaced with the next lowest score (12). So the data become: 12, 12, 12, 13, 13, 14, 15, 15, 15, 15. We then calculate the mean of these data:
$$ \begin{aligned} \bar{X} &= \frac{\sum_{i = 1}^n x_i}{n} \\ &= \frac{12+12+12+13+13+14+15+15+15+15}{10} \\ &= \frac{136}{10} \\ &= 13.6. \end{aligned} $$
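The four calculations above can be reproduced with a short Python sketch. The function names `trimmed_mean` and `winsorized_mean` are illustrative choices rather than anything from the book, and the percentage is passed as a whole number (an assumption made here to keep the arithmetic exact). The number of scores affected in each tail is rounded up, which is the rule used in these solutions.

```python
import math

def trimmed_mean(scores, percent):
    """Mean after removing `percent`% of the scores from each end."""
    ordered = sorted(scores)
    n = len(ordered)
    k = math.ceil(percent * n / 100)  # scores removed per tail (rounded up)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

def winsorized_mean(scores, percent):
    """Mean after replacing `percent`% of the scores in each tail
    with the nearest score that is not being replaced."""
    ordered = sorted(scores)
    n = len(ordered)
    k = math.ceil(percent * n / 100)  # scores replaced per tail (rounded up)
    low, high = ordered[k], ordered[n - k - 1]
    clipped = [min(max(x, low), high) for x in ordered]
    return sum(clipped) / n

scores = [14, 15, 13, 11, 16, 13, 21, 12, 11, 15]

print(f"mean:                {sum(scores) / len(scores):.3f}")      # 14.100
print(f"20% trimmed mean:    {trimmed_mean(scores, 20):.3f}")       # 13.667
print(f"10% trimmed mean:    {trimmed_mean(scores, 10):.3f}")       # 13.625
print(f"20% winsorized mean: {winsorized_mean(scores, 20):.3f}")    # 13.600
```

The results are printed to three decimal places so that 13.625 is shown unrounded; the text reports it as 13.63.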
Puzzle 4
Square-root transform the above scores.
To square-root transform the scores, we replace each score with its square root.
Scores and their square root transformation | |
---|---|
Original score $x_i$ | Transformed score $\sqrt{x_i}$ |
14 | 3.74 |
15 | 3.87 |
13 | 3.61 |
11 | 3.32 |
16 | 4.00 |
13 | 3.61 |
21 | 4.58 |
12 | 3.46 |
11 | 3.32 |
15 | 3.87 |
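If you would rather not do this by hand, the transformation is a one-liner in most software. The following is a minimal Python sketch (the variable names are illustrative) that prints each score next to its square root, rounded to two decimal places as in the table.

```python
import math

scores = [14, 15, 13, 11, 16, 13, 21, 12, 11, 15]

# Replace each score with its square root
roots = [math.sqrt(x) for x in scores]

for x, r in zip(scores, roots):
    print(f"{x:>2} | {r:.2f}")
```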
Puzzle 5
Using the data in Table 9.3 (in the book), what was the mean strength of scientists in both the JIG:SAW group and the non-employees?
To calculate the mean strength, we need to add up all the scores in each group and then divide the total by the number of scientists in each group.
Let’s start with the strength scores for JIG:SAW employees:
1161, 1141, 1174, 1112, 1185, 1095, 1102, 1112, 1071, 1244, 1102, 1216, 1884, 1276, 1373, 1145, 1169, 1136, 1313, 1129, 1119, 1197, 1111, 1121, 1274, 1197, 1139, 1233, 1334, 1150, 1138, 1185, 1158, 1445, 1525, 1408, 1128, 1723
$$ \bar{X} = \frac{\sum_{i = 1}^n x_i}{n} = \frac{46725}{38} = 1229.61. $$
The mean scientists’ strength score for JIG:SAW employees was 1229.61.
Now, let’s move on to the strength scores for the non-employees:
1321, 1153, 1072, 1218, 1088, 1373, 1135, 1055, 1096, 1007, 1223, 1291, 1171, 1101, 2091, 1308, 1141, 1433, 1141, 1212, 1769, 1071, 1412, 1214, 1031, 1209, 1222, 1241, 1740, 1367, 1313, 1208, 1257, 1376, 1155, 1065, 1147, 1166, 1566, 1436
$$ \bar{X} = \frac{\sum_{i = 1}^n x_i}{n} = \frac{50595}{40} = 1264.88. $$
The mean scientists’ strength score for non-employees was 1264.88.
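With this many scores it is easy to slip when adding by hand, so it can help to check the totals in software. A minimal Python sketch, assuming the scores are typed in exactly as listed above (the helper name `mean_strength` is made up for this example):

```python
jigsaw = [
    1161, 1141, 1174, 1112, 1185, 1095, 1102, 1112, 1071, 1244,
    1102, 1216, 1884, 1276, 1373, 1145, 1169, 1136, 1313, 1129,
    1119, 1197, 1111, 1121, 1274, 1197, 1139, 1233, 1334, 1150,
    1138, 1185, 1158, 1445, 1525, 1408, 1128, 1723,
]
non_employees = [
    1321, 1153, 1072, 1218, 1088, 1373, 1135, 1055, 1096, 1007,
    1223, 1291, 1171, 1101, 2091, 1308, 1141, 1433, 1141, 1212,
    1769, 1071, 1412, 1214, 1031, 1209, 1222, 1241, 1740, 1367,
    1313, 1208, 1257, 1376, 1155, 1065, 1147, 1166, 1566, 1436,
]

def mean_strength(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

for label, group in [("JIG:SAW", jigsaw), ("Non-employees", non_employees)]:
    print(label, "sum =", sum(group), "n =", len(group),
          "mean =", mean_strength(group))
```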
Puzzle 6
Using the data in Table 9.3 (in the book and reproduced above), what was the 20% trimmed mean strength of scientists in both the JIG:SAW group and the non-employees?
First, we will calculate the 20% trimmed mean strength for the JIG:SAW employees. There are 38 scores in total and 20% of 38 is 7.6. We can’t remove 7.6 scores, so we round up and remove 8 scores from each end of the distribution instead. The table shows the raw scores listed in ascending order, and in the final column I have deleted the bottom and top 8 scores. The 20% trimmed mean is the mean of the 22 scores in this final column:
$$ \bar{X}_\text{20\% trimmed} = \frac{25896}{22} = 1177.09. $$
Trimming 20% of the strength scores (JIG:SAW employees) | |||
---|---|---|---|
Participant ID | Strength (complete) | Strength (20% trimmed) | |
14 | 1071 | ||
8 | 1095 | ||
12 | 1102 | ||
17 | 1102 | ||
51 | 1111 | ||
5 | 1112 | ||
13 | 1112 | ||
44 | 1119 | ||
53 | 1121 | 1121 | |
78 | 1128 | 1128 | |
43 | 1129 | 1129 | |
40 | 1136 | 1136 | |
66 | 1138 | 1138 | |
56 | 1139 | 1139 | |
2 | 1141 | 1141 | |
31 | 1145 | 1145 | |
65 | 1150 | 1150 | |
69 | 1158 | 1158 | |
1 | 1161 | 1161 | |
38 | 1169 | 1169 | |
3 | 1174 | 1174 | |
7 | 1185 | 1185 | |
68 | 1185 | 1185 | |
47 | 1197 | 1197 | |
55 | 1197 | 1197 | |
18 | 1216 | 1216 | |
60 | 1233 | 1233 | |
16 | 1244 | 1244 | |
54 | 1274 | 1274 | |
23 | 1276 | 1276 | |
41 | 1313 | ||
62 | 1334 | ||
24 | 1373 | ||
75 | 1408 | ||
72 | 1445 | ||
74 | 1525 | ||
82 | 1723 | ||
22 | 1884 | ||
Sum | 46,725.00 | 25,896.00 | |
n | 38.00 | 22.00 | |
Mean | 1,229.61 | 1,177.09 |
We calculate the 20% trimmed mean strength of the non-employees in exactly the same way. There are 40 scores in total and 20% of 40 is 8, so we take 8 scores from each end of the distribution (after putting them in ascending order) and then calculate the mean of the remaining scores. The table shows the raw scores listed in ascending order, and in the final column I have deleted the bottom and top 8 scores. The 20% trimmed mean will be the mean of the 24 scores in this column:
$$ \bar{X}_\text{20\% trimmed} = \frac{29287}{24} = 1220.29. $$
Trimming 20% of the strength scores (Non-employees) | |||
---|---|---|---|
Participant ID | Strength (complete) | Strength (20% trimmed) | |
25 | 1007 | ||
50 | 1031 | ||
20 | 1055 | ||
77 | 1065 | ||
42 | 1071 | ||
9 | 1072 | ||
11 | 1088 | ||
21 | 1096 | ||
30 | 1101 | 1101 | |
19 | 1135 | 1135 | |
34 | 1141 | 1141 | |
36 | 1141 | 1141 | |
79 | 1147 | 1147 | |
6 | 1153 | 1153 | |
76 | 1155 | 1155 | |
80 | 1166 | 1166 | |
28 | 1171 | 1171 | |
67 | 1208 | 1208 | |
52 | 1209 | 1209 | |
37 | 1212 | 1212 | |
49 | 1214 | 1214 | |
10 | 1218 | 1218 | |
57 | 1222 | 1222 | |
26 | 1223 | 1223 | |
58 | 1241 | 1241 | |
70 | 1257 | 1257 | |
27 | 1291 | 1291 | |
33 | 1308 | 1308 | |
63 | 1313 | 1313 | |
4 | 1321 | 1321 | |
61 | 1367 | 1367 | |
15 | 1373 | 1373 | |
73 | 1376 | ||
45 | 1412 | ||
35 | 1433 | ||
83 | 1436 | ||
81 | 1566 | ||
59 | 1740 | ||
39 | 1769 | ||
32 | 2091 | ||
Sum | 50,595.00 | 29,287.00 | |
n | 40.00 | 24.00 | |
Mean | 1,264.88 | 1,220.29 |
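The two trimmed means can be checked with a short Python sketch. The helper `trimmed_mean` is an illustrative name, and it rounds the number of scores to remove up to the next whole number (7.6 becomes 8), as in the text. Be aware that some software rounds down instead, which would remove 7 rather than 8 scores from each end of the JIG:SAW data and give a slightly different answer.

```python
import math

jigsaw = [
    1161, 1141, 1174, 1112, 1185, 1095, 1102, 1112, 1071, 1244,
    1102, 1216, 1884, 1276, 1373, 1145, 1169, 1136, 1313, 1129,
    1119, 1197, 1111, 1121, 1274, 1197, 1139, 1233, 1334, 1150,
    1138, 1185, 1158, 1445, 1525, 1408, 1128, 1723,
]
non_employees = [
    1321, 1153, 1072, 1218, 1088, 1373, 1135, 1055, 1096, 1007,
    1223, 1291, 1171, 1101, 2091, 1308, 1141, 1433, 1141, 1212,
    1769, 1071, 1412, 1214, 1031, 1209, 1222, 1241, 1740, 1367,
    1313, 1208, 1257, 1376, 1155, 1065, 1147, 1166, 1566, 1436,
]

def trimmed_mean(scores, percent):
    """Return (mean, sum, n) of the scores left after trimming each end."""
    ordered = sorted(scores)
    n = len(ordered)
    k = math.ceil(percent * n / 100)  # scores removed per tail (rounded up)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept), sum(kept), len(kept)

for label, group in [("JIG:SAW", jigsaw), ("Non-employees", non_employees)]:
    mean, total, n_kept = trimmed_mean(group, 20)
    print(f"{label}: sum = {total}, n = {n_kept}, 20% trimmed mean = {mean:.2f}")
```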
Puzzle 7
Using the data in Table 9.3 (in the book and reproduced in Puzzle 5), what was the 20% winsorized mean strength of scientists in both the JIG:SAW group and non-employees?
To calculate the 20% winsorized mean, we need to replace the top and bottom 20% of scores with the next highest or lowest score. Starting with the JIG:SAW employees, there were 38 in total and 20% of 38 is 7.6, which we round up to 8 because we need a whole number. Therefore, we take 8 scores from each end of the distribution and replace them with the next highest or lowest score. First, I put the scores into ascending order (see the table below). In the final column, I have replaced the largest 8 scores with the next largest score (1276), and replaced the lowest 8 scores with the next lowest score (1121). To get the 20% winsorized mean, calculate the mean of the final column:
$$ \bar{X}_\text{winsorized} = \frac{45072}{38} = 1186.11. $$
Winsorizing the strength scores (JIG:SAW employees) | |||
---|---|---|---|
Participant ID | Strength (complete) | Strength (Winsorized) | |
14 | 1071 | 1121 | |
8 | 1095 | 1121 | |
12 | 1102 | 1121 | |
17 | 1102 | 1121 | |
51 | 1111 | 1121 | |
5 | 1112 | 1121 | |
13 | 1112 | 1121 | |
44 | 1119 | 1121 | |
53 | 1121 | 1121 | |
78 | 1128 | 1128 | |
43 | 1129 | 1129 | |
40 | 1136 | 1136 | |
66 | 1138 | 1138 | |
56 | 1139 | 1139 | |
2 | 1141 | 1141 | |
31 | 1145 | 1145 | |
65 | 1150 | 1150 | |
69 | 1158 | 1158 | |
1 | 1161 | 1161 | |
38 | 1169 | 1169 | |
3 | 1174 | 1174 | |
7 | 1185 | 1185 | |
68 | 1185 | 1185 | |
47 | 1197 | 1197 | |
55 | 1197 | 1197 | |
18 | 1216 | 1216 | |
60 | 1233 | 1233 | |
16 | 1244 | 1244 | |
54 | 1274 | 1274 | |
23 | 1276 | 1276 | |
41 | 1313 | 1276 | |
62 | 1334 | 1276 | |
24 | 1373 | 1276 | |
75 | 1408 | 1276 | |
72 | 1445 | 1276 | |
74 | 1525 | 1276 | |
82 | 1723 | 1276 | |
22 | 1884 | 1276 | |
Sum | 46,725.00 | 45,072.00 | |
n | 38.00 | 38.00 | |
Mean | 1,229.61 | 1,186.11 |
I did exactly the same for the non-employees: because there were 40 scores in total and 20% of 40 is 8, I took the raw scores and replaced the largest 8 scores with the next largest score (1373), and replaced the lowest 8 scores with the next lowest score (1101); see the final column of the table below. To get the 20% winsorized mean, calculate the mean of the final column:
$$ \bar{X}_\text{winsorized} = \frac{49079}{40} = 1226.97. $$
Winsorizing the strength scores (Non-employees) | |||
---|---|---|---|
Participant ID | Strength (complete) | Strength (Winsorized) | |
25 | 1007 | 1101 | |
50 | 1031 | 1101 | |
20 | 1055 | 1101 | |
77 | 1065 | 1101 | |
42 | 1071 | 1101 | |
9 | 1072 | 1101 | |
11 | 1088 | 1101 | |
21 | 1096 | 1101 | |
30 | 1101 | 1101 | |
19 | 1135 | 1135 | |
34 | 1141 | 1141 | |
36 | 1141 | 1141 | |
79 | 1147 | 1147 | |
6 | 1153 | 1153 | |
76 | 1155 | 1155 | |
80 | 1166 | 1166 | |
28 | 1171 | 1171 | |
67 | 1208 | 1208 | |
52 | 1209 | 1209 | |
37 | 1212 | 1212 | |
49 | 1214 | 1214 | |
10 | 1218 | 1218 | |
57 | 1222 | 1222 | |
26 | 1223 | 1223 | |
58 | 1241 | 1241 | |
70 | 1257 | 1257 | |
27 | 1291 | 1291 | |
33 | 1308 | 1308 | |
63 | 1313 | 1313 | |
4 | 1321 | 1321 | |
61 | 1367 | 1367 | |
15 | 1373 | 1373 | |
73 | 1376 | 1373 | |
45 | 1412 | 1373 | |
35 | 1433 | 1373 | |
83 | 1436 | 1373 | |
81 | 1566 | 1373 | |
59 | 1740 | 1373 | |
39 | 1769 | 1373 | |
32 | 2091 | 1373 | |
Sum | 50,595.00 | 49,079.00 | |
n | 40.00 | 40.00 | |
Mean | 1,264.88 | 1,226.97 |
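The winsorized sums and means can be checked in the same way. In this Python sketch (the helper name `winsorize` is an illustrative choice), the 8 lowest scores are raised to the 9th lowest score and the 8 highest are lowered to the 9th highest, which is what the final columns of the two tables do.

```python
import math

jigsaw = [
    1161, 1141, 1174, 1112, 1185, 1095, 1102, 1112, 1071, 1244,
    1102, 1216, 1884, 1276, 1373, 1145, 1169, 1136, 1313, 1129,
    1119, 1197, 1111, 1121, 1274, 1197, 1139, 1233, 1334, 1150,
    1138, 1185, 1158, 1445, 1525, 1408, 1128, 1723,
]
non_employees = [
    1321, 1153, 1072, 1218, 1088, 1373, 1135, 1055, 1096, 1007,
    1223, 1291, 1171, 1101, 2091, 1308, 1141, 1433, 1141, 1212,
    1769, 1071, 1412, 1214, 1031, 1209, 1222, 1241, 1740, 1367,
    1313, 1208, 1257, 1376, 1155, 1065, 1147, 1166, 1566, 1436,
]

def winsorize(scores, percent):
    """Return the sorted scores with each tail replaced by its nearest kept score."""
    ordered = sorted(scores)
    n = len(ordered)
    k = math.ceil(percent * n / 100)            # scores replaced per tail (rounded up)
    low, high = ordered[k], ordered[n - k - 1]  # replacement values
    return [min(max(x, low), high) for x in ordered]

for label, group in [("JIG:SAW", jigsaw), ("Non-employees", non_employees)]:
    w = winsorize(group, 20)
    print(f"{label}: replacements = ({w[0]}, {w[-1]}), "
          f"sum = {sum(w)}, n = {len(w)}, mean = {sum(w) / len(w)}")
```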
Puzzle 8
Using your answers above, how do the robust estimates of the mean differ from those based on the raw data?
If we collate our answers from the previous Puzzles it will make it easier to compare the robust estimates:
Estimated mean strength | |||
---|---|---|---|
Group | Mean strength (raw scores) | Mean strength (20% trimmed) | Mean strength (20% winsorized) |
JIG:SAW | 1229.61 | 1177.09 | 1186.11 |
Non-employee | 1264.88 | 1220.29 | 1226.97 |
Looking at the means based on the raw scores, we can see that there is not much difference between the mean strength of scientists in the JIG:SAW and non-employee groups; the non-employees were slightly stronger than the JIG:SAW employees, but not by very much. Looking at the 20% trimmed and 20% winsorized means, these robust estimates are smaller than the raw mean by about 38–45 units in the non-employee group, and smaller by about 44–53 units in the JIG:SAW group. In other words, the change in the mean is fairly similar in the two groups, and the differences between the groups have stayed fairly similar (raw mean difference = 35.27, trimmed mean difference = 43.20, winsorized mean difference = 40.86). (You might think that 35.27 is quite different to 43.20, and you’d be correct if the scale of measurement ranged from, say, 0 to 50, but the strength scores range from about 1000 to 2100, and in that context a difference of around 8 is not particularly startling.)
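The comparisons in this answer are simple subtractions on the values in the summary table, so they are easy to recompute. A minimal Python sketch (the dictionary layout and key names are just one convenient way to hold the table, not anything from the book):

```python
# Means copied from the summary table above
means = {
    "raw":            {"jigsaw": 1229.61, "non_employee": 1264.88},
    "20% trimmed":    {"jigsaw": 1177.09, "non_employee": 1220.29},
    "20% winsorized": {"jigsaw": 1186.11, "non_employee": 1226.97},
}

# Difference between the groups for each type of estimate
group_difference = {
    name: round(m["non_employee"] - m["jigsaw"], 2) for name, m in means.items()
}

# How far each robust estimate falls below the raw mean, per group
drop_from_raw = {
    name: {g: round(means["raw"][g] - m[g], 2) for g in m}
    for name, m in means.items() if name != "raw"
}

print(group_difference)
print(drop_from_raw)
```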
Puzzle 9
Log-transform the JIG:SAW data from Table 9.3 (in the book and reproduced in Puzzle 5).
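To log-transform the data we take the natural log of each score (the table also shows the base-10 log for comparison). You can use software such as Excel, SPSS or R to do this for you. I used R to create this table. Note that the scores tabulated here (IDs 4 to 83) are the non-employee scores listed in Puzzle 5; the JIG:SAW scores are transformed in exactly the same way.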
Scores and their log transformations | |||
---|---|---|---|
ID | Strength score $x_i$ | Natural log $\ln{x_i}$ | Log (base 10) $\log_{10}{x_i}$ |
4 | 1321 | 7.19 | 3.12 |
6 | 1153 | 7.05 | 3.06 |
9 | 1072 | 6.98 | 3.03 |
10 | 1218 | 7.10 | 3.09 |
11 | 1088 | 6.99 | 3.04 |
15 | 1373 | 7.22 | 3.14 |
19 | 1135 | 7.03 | 3.05 |
20 | 1055 | 6.96 | 3.02 |
21 | 1096 | 7.00 | 3.04 |
25 | 1007 | 6.91 | 3.00 |
26 | 1223 | 7.11 | 3.09 |
27 | 1291 | 7.16 | 3.11 |
28 | 1171 | 7.07 | 3.07 |
30 | 1101 | 7.00 | 3.04 |
32 | 2091 | 7.65 | 3.32 |
33 | 1308 | 7.18 | 3.12 |
34 | 1141 | 7.04 | 3.06 |
35 | 1433 | 7.27 | 3.16 |
36 | 1141 | 7.04 | 3.06 |
37 | 1212 | 7.10 | 3.08 |
39 | 1769 | 7.48 | 3.25 |
42 | 1071 | 6.98 | 3.03 |
45 | 1412 | 7.25 | 3.15 |
49 | 1214 | 7.10 | 3.08 |
50 | 1031 | 6.94 | 3.01 |
52 | 1209 | 7.10 | 3.08 |
57 | 1222 | 7.11 | 3.09 |
58 | 1241 | 7.12 | 3.09 |
59 | 1740 | 7.46 | 3.24 |
61 | 1367 | 7.22 | 3.14 |
63 | 1313 | 7.18 | 3.12 |
67 | 1208 | 7.10 | 3.08 |
70 | 1257 | 7.14 | 3.10 |
73 | 1376 | 7.23 | 3.14 |
76 | 1155 | 7.05 | 3.06 |
77 | 1065 | 6.97 | 3.03 |
79 | 1147 | 7.04 | 3.06 |
80 | 1166 | 7.06 | 3.07 |
81 | 1566 | 7.36 | 3.19 |
83 | 1436 | 7.27 | 3.16 |
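For the JIG:SAW scores listed in Puzzle 5, the same two transformations can be produced with a few lines of code. This is a Python sketch rather than the R code used for the table, and the variable names are illustrative; `math.log` gives the natural log and `math.log10` the base-10 log.

```python
import math

jigsaw = [
    1161, 1141, 1174, 1112, 1185, 1095, 1102, 1112, 1071, 1244,
    1102, 1216, 1884, 1276, 1373, 1145, 1169, 1136, 1313, 1129,
    1119, 1197, 1111, 1121, 1274, 1197, 1139, 1233, 1334, 1150,
    1138, 1185, 1158, 1445, 1525, 1408, 1128, 1723,
]

# One row per score: (score, natural log, base-10 log)
rows = [(x, math.log(x), math.log10(x)) for x in jigsaw]

print("Strength | ln | log10")
for x, natural, base10 in rows:
    print(f"{x} | {natural:.2f} | {base10:.2f}")
```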
Puzzle 10
Describe the process of bootstrapping.
Bootstrapping is a technique in which the sampling distribution of a statistic is estimated by taking repeated samples (with replacement) from the data set (in effect, treating the data as a population from which smaller samples are taken). The statistic of interest (e.g., the mean or a b coefficient) is calculated for each sample, and the distribution of these values across samples is used as an estimate of the sampling distribution. The standard error of the statistic is estimated as the standard deviation of this bootstrap sampling distribution. Confidence intervals and significance tests can also be computed from it.
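To make the process concrete, here is a minimal Python sketch that bootstraps the mean of Zach's 10 scores from Puzzle 3. Several details are assumptions made for this example rather than anything specified above: each bootstrap sample is the same size as the original data (the most common choice), 2000 samples are drawn, the random seed is fixed so the run is repeatable, and the confidence interval is a simple percentile interval.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=2000, seed=42):
    """Means of repeated samples drawn with replacement from `data`."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    n = len(data)              # each resample is as large as the data (an assumption)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

scores = [14, 15, 13, 11, 16, 13, 21, 12, 11, 15]

# Sorted bootstrap means approximate the sampling distribution of the mean
boot = sorted(bootstrap_means(scores))

# Standard error = standard deviation of the bootstrap means
standard_error = statistics.stdev(boot)

# Approximate 95% percentile interval: cut off 2.5% of the means at each end
lower = boot[int(0.025 * len(boot))]
upper = boot[int(0.975 * len(boot)) - 1]

print(f"bootstrap SE of the mean: {standard_error:.3f}")
print(f"approximate 95% interval: [{lower:.2f}, {upper:.2f}]")
```

Swapping `statistics.fmean` for another function (a trimmed mean, say) would bootstrap that statistic instead.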