Tukey’s test is a method for comparing the individual means from an analysis of variance of several samples subjected to different treatments.

The test, presented in 1949 by John W. Tukey, makes it possible to determine whether the results obtained are significantly different. It is also known as Tukey’s honestly significant difference test (Tukey’s HSD test).

In experiments that compare three or more different treatments, each applied to the same number of samples, it is necessary to determine whether the results are significantly different.

An experiment is said to be balanced when the size of all the statistical samples is equal in each treatment. When the sample size is different for each treatment, then we have an unbalanced experiment.

Sometimes an analysis of variance (ANOVA) is not enough. When different treatments (or experiments) are applied to several samples, ANOVA tells us only whether the null hypothesis holds (H0: “all the treatments are equal”) or, on the contrary, the alternative hypothesis holds (Ha: “at least one of the treatments is different”). It does not say which treatments differ.

Tukey’s test is not the only one of its kind. There are many other tests for comparing sample means, but it is one of the best known and most widely applied.

## Tukey comparator and table

To apply this test, a value w, called the Tukey comparator, is calculated. It is defined as follows:

w = q √(MSE /r)

where the factor q is obtained from a table (Tukey table) that lists values of q by the number of treatments or experiments and by the degrees of freedom of the error. The available tables normally cover significance levels of 0.05 and 0.01.

In this formula, the square root contains the mean square error (MSE) divided by r, the number of replicates per treatment. The MSE is usually obtained from an analysis of variance (ANOVA).
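As a rough illustration of where the MSE comes from, the sketch below computes the within-groups sum of squares, its degrees of freedom and the MSE directly from raw samples. The function name `mean_square_error` and the data are hypothetical and are not taken from the article; a spreadsheet ANOVA tool reports the same quantities.

```python
from statistics import mean


def mean_square_error(groups):
    """Return (SS_within, df_within, MSE) for a list of samples, one per treatment."""
    ss_within = 0.0
    n_total = 0
    for sample in groups:
        m = mean(sample)
        # Squared deviations of each observation from its own group mean.
        ss_within += sum((x - m) ** 2 for x in sample)
        n_total += len(sample)
    # Degrees of freedom within groups: total observations minus number of groups.
    df_within = n_total - len(groups)
    return ss_within, df_within, ss_within / df_within


# Hypothetical data (not from the article): three treatments, four replicates each.
groups = [
    [1.0, 2.0, 3.0, 2.0],
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 9.0, 8.0],
]
ss, df, mse = mean_square_error(groups)
print(f"SS within = {ss:.3f}, df = {df}, MSE = {mse:.4f}")
```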

When the absolute difference between two mean values exceeds w (Tukey’s comparator), it is concluded that the means are different. If the difference is less than w, the two samples have statistically indistinguishable mean values.

The number w is also known as the HSD (Honestly Significant Difference) number.

This single comparator can be used only when every treatment has the same number of samples.
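The decision rule for the balanced case can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the value of q, the MSE and the treatment means are all hypothetical placeholders, and in practice q must be read from a Tukey table for the chosen significance level.

```python
import math
from itertools import combinations


def tukey_hsd(q, mse, r):
    """Tukey comparator w = q * sqrt(MSE / r) for a balanced experiment."""
    return q * math.sqrt(mse / r)


def compare_means(means, w):
    """Map each pair of treatments to (absolute difference, differs?)."""
    result = {}
    for a, b in combinations(sorted(means), 2):
        diff = abs(means[a] - means[b])
        result[(a, b)] = (diff, diff > w)
    return result


# Hypothetical inputs (not from the article).
q, mse, r = 3.5, 0.04, 4
means = {"A": 1.00, "B": 1.10, "C": 1.60}

w = tukey_hsd(q, mse, r)
print(f"w = {w:.3f}")
for pair, (diff, differs) in compare_means(means, w).items():
    verdict = "different" if differs else "not distinguishable"
    print(pair, f"{diff:.3f}", verdict)
```

With these placeholder numbers only the pair A and B falls below the comparator, so C would be the only treatment that differs from the others.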

### Unbalanced experiments

When for some reason the sample size differs between the treatments to be compared, the procedure described above is modified slightly and is known as the Tukey–Kramer test.

A comparator w is now obtained for each pair of treatments i, j:

w(i,j) = q √( (MSE/2) (1/ri + 1/rj) )

In this formula, the factor q is obtained from Tukey’s table. It depends on the number of treatments and on the degrees of freedom of the error. ri is the number of replicates in treatment i, while rj is the number of replicates in treatment j. When ri = rj = r, the expression reduces to the balanced comparator w = q √(MSE /r).
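A minimal sketch of the pairwise comparator follows, assuming the standard Tukey–Kramer form w(i,j) = q √((MSE/2)(1/ri + 1/rj)). The function name `tukey_kramer_w`, the value of q, the MSE and the replicate counts are hypothetical placeholders chosen only to show how the comparator changes from pair to pair.

```python
import math
from itertools import combinations


def tukey_kramer_w(q, mse, r_i, r_j):
    """Pairwise comparator for treatments with r_i and r_j replicates."""
    return q * math.sqrt((mse / 2.0) * (1.0 / r_i + 1.0 / r_j))


# Hypothetical inputs (not from the article).
q, mse = 4.0, 0.05
replicates = {"T1": 5, "T2": 6, "T3": 6, "T4": 5}

for a, b in combinations(replicates, 2):
    w_ab = tukey_kramer_w(q, mse, replicates[a], replicates[b])
    print(f"w({a},{b}) = {w_ab:.4f}")
```

Pairs with fewer replicates get a larger comparator, so a bigger difference in means is needed before two small groups are declared different.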

## Example case

A rabbit breeder wants to carry out a reliable statistical study that tells him which of four brands of rabbit fattening food is the most effective. For the study, he formed four groups of six rabbits, each a month and a half old, that until then had been fed under the same conditions.

In the experiment, the first group is called A1 because it is fed with brand 1, and likewise for groups A2, A3 and A4. The weight gain (in pounds) of each specimen after one month of feeding with the different brands of food is recorded in a table.

Although the experiment began as a balanced one, in the sense that each treatment was to be applied to the same number of rabbits, it could not be completed in this way.

Deaths occurred in groups A1 and A4 from causes not attributable to the food: one rabbit was stung by an insect, and the other death was most likely caused by a congenital defect. The groups are therefore unbalanced, and the Tukey–Kramer test must be applied.

## Solved exercise

To keep the calculations short, a balanced experiment is used for the solved exercise. The following data will be taken:

In this case, there are four groups corresponding to four different treatments. All the groups have the same number of data points, so this is a balanced case.

The ANOVA is carried out with the tool built into the *LibreOffice* spreadsheet. Other spreadsheets, such as *Excel*, include a similar data analysis tool. Below is the summary table produced by the analysis of variance (ANOVA):

The analysis of variance also yields the P value, which in this example is 2.24E-6, well below the 0.05 significance level. This leads directly to rejecting the null hypothesis that all treatments are equal.

That is, some of the treatments have different mean values, but the Tukey test is needed to determine which of them are honestly significantly different (HSD) from the statistical point of view.

To find the number w, also known as the HSD number, we first need the mean square error (MSE). The ANOVA gives a within-groups sum of squares SS = 0.2 and within-groups degrees of freedom df = 16. With these data we can find the MSE:

MSE = SS/df = 0.2/16 = 0.0125

Tukey’s factor q must also be found from the table. We look up column 4, which corresponds to the 4 groups or treatments to be compared, and row 16, since the ANOVA yielded 16 degrees of freedom within the groups. This leads to a value of *q = 4.33*, corresponding to 0.05 significance or 95% confidence. Finally, with r = 5 data points per group, the value of the “honestly significant difference” is found:

w=HSD= q √(MSE /r) = 4.33 √(0.0125 /5) = 0.2165
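The arithmetic of this step can be checked with a short script. It only reproduces the numbers quoted in the text (SS = 0.2, df = 16, q = 4.33, r = 5); the variable names are illustrative.

```python
import math

# Values quoted in the solved exercise.
ss_within = 0.2   # sum of squares within the groups
df_within = 16    # degrees of freedom within the groups
q = 4.33          # Tukey factor as read from the table in the text
r = 5             # data points per group

mse = ss_within / df_within
hsd = q * math.sqrt(mse / r)

print(f"MSE = {mse:.4f}")  # MSE = 0.0125
print(f"HSD = {hsd:.4f}")  # HSD = 0.2165
```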

To determine which groups or treatments are honestly different, the mean value of each treatment is needed:

The differences between the mean values of each pair of treatments are also needed. They are shown in the following table:

Groups T3 and T1, as well as groups T2 and T4, have statistically identical results. The honestly different groups are T1 and T2, or T3 and T4, since the difference in their mean values exceeds the HSD value of Tukey’s test.

It is concluded that the best treatments, in terms of maximizing the result, are T1 or T3, which are statistically indistinguishable. To choose between T1 and T3, one would have to consider factors outside the analysis presented here, such as price or availability.

## References

Cochran, W. and Cox, G. 1974. Experimental designs. Trillas, México. Third reprint. 661 p.

Snedecor, G.W. and Cochran, W.G. 1980. Statistical methods. 7th ed. The Iowa State University Press, Iowa. 507 p.

Steel, R.G.D. and Torrie, J.H. 1980. Principles and procedures of statistics: a biometrical approach. 2nd ed. McGraw-Hill, New York. 629 p.

Tukey, J.W. 1949. Comparing individual means in the analysis of variance. Biometrics, 5:99-114.

Wikipedia. Tukey’s test. Retrieved from: en.wikipedia.com