Thoughts from a Bionic Lime: January 2011

I ran some statistics on the judging in the United States Chess League's 2010 Game of the Year contest.

There were five judges: Hess, Gustafsson, Johannesson, Melekhina, and Young. I will refer to each of them by the first letter of their last name.

Several analyses were completed.

What are the games for which the judges agreed most and disagreed most?

This can be calculated by looking at the standard deviations of the scores on each game.

The most agreed upon games were:

#20, Sammour-Hasbun vs. Kaplan (sd = 2.51)
#2, Sammour-Hasbun vs. Kacheishvili (sd = 2.61)
#4, Rosen vs. Guo (sd = 2.97)

The most disagreed upon games were:

#13, Schroer vs. Kacheishvili (sd = 7.99)
#19, Galofre vs. Milat (sd = 7.80)
#12, Friedel vs. Akobian (sd = 7.36)
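As a minimal sketch of this calculation in Python, assuming the sample standard deviation (dividing by n - 1) was used: that assumption reproduces the 7.80 reported for Galofre vs. Milat from the five scores quoted later in this post. The function name `disagreement` is my own label for illustration.

```python
from statistics import mean, stdev

# The five scores quoted later in the post for Galofre vs. Milat.
galofre_milat = [1, 1, 1, 5, 19]

def disagreement(scores):
    """Sample standard deviation of the judges' scores for one game."""
    return stdev(scores)

print(round(mean(galofre_milat), 2))          # 5.4
print(round(disagreement(galofre_milat), 2))  # 7.8
```

Sorting the games by this value, smallest first, would give the "most agreed upon" list; largest first would give the "most disagreed upon" list.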

Which judges were most different?

I calculated which of the judges were "most different" from the combined wisdom of all the judges together. The judges who differed the most could be considered outliers.

There are several ways to do this. I will demonstrate two approaches.

FINDING THE OUTLIER JUDGES

First, I compared the score a judge gave a game with the average of all the judges' scores, scaled by the amount of disagreement among the judges. For instance, Judge Y gave 2 points (19th place) to Schroer-Kacheishvili, while the average was 9.2 points and the standard deviation (the amount of disagreement) was 7.99. Therefore, for that game, Judge Y would receive the absolute value of (2 - 9.2)/7.99, or about 0.90 "difference points". For each judge, add up the difference points over all twenty games. The higher the total, the more that judge differed from the other judges.

The total difference points were:
Judge Y: 17.49
Judge J: 11.09
Judge M: 19.40
Judge G: 12.80
Judge H: 16.96

Therefore, Judge Y and Judge M were the most different from the other judges.
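The difference-point calculation could be sketched roughly as follows. This is an illustration under assumptions, not the spreadsheet actually used: the function names and the `games` layout are hypothetical, the sample standard deviation is assumed, and since the post does not say which judge gave which score to Galofre vs. Milat, the judge labels A to E are placeholders.

```python
from statistics import mean, stdev

def difference_points(judge_score, all_scores):
    """|judge's score - average| divided by the game's standard deviation."""
    return abs(judge_score - mean(all_scores)) / stdev(all_scores)

def total_difference(judge, games):
    """Sum one judge's difference points over every game.

    `games` maps a game name to a {judge: score} dict (hypothetical layout).
    """
    return sum(
        difference_points(scores[judge], list(scores.values()))
        for scores in games.values()
    )

# Scores from the post; the assignment to judges A-E is a placeholder.
games = {"Galofre vs. Milat": {"A": 1, "B": 1, "C": 1, "D": 5, "E": 19}}
for judge in "ABCDE":
    print(judge, round(total_difference(judge, games), 2))
```

Note that a game on which every judge gave the same score would have a standard deviation of zero, so a real implementation would need to handle that case before dividing.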

Then, we could discard the scores of these two judges, and rescore the contest.

See below for how the results would have changed.
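Rescoring without the outlier judges amounts to re-totalling each game from the remaining three judges. Here is a small sketch; the scores and game names are made up purely for illustration (they are not the contest data), and `rescore` is a hypothetical helper name.

```python
def rescore(games, dropped):
    """Total each game's points, ignoring the judges in `dropped`."""
    return {
        game: sum(pts for judge, pts in scores.items() if judge not in dropped)
        for game, scores in games.items()
    }

# Made-up scores for two imaginary games, for illustration only.
games = {
    "Game A": {"H": 20, "G": 18, "J": 19, "M": 5, "Y": 12},
    "Game B": {"H": 15, "G": 17, "J": 16, "M": 20, "Y": 20},
}

print(rescore(games, dropped=set()))       # all five judges
print(rescore(games, dropped={"M", "Y"}))  # three judges only
```

In this invented example, Game B leads with all five judges counted, but Game A leads once Judges M and Y are set aside.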

COMPUTE THE MIDDLE SCORES FOR EACH GAME

Another way of rescoring the contest is to do it on a "per game" basis, as opposed to throwing out judges as a whole. With this method, discard the highest and lowest scores given to each game, and total the three that remain.

For example, Galofre-Milat received scores of 1, 1, 1, 5, and 19. If we were to use this method, we would throw out one of the 1s and the 19, and the game would receive a revised score of 7.
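The same arithmetic as a short Python sketch (the function name `no_hi_lo_total` is mine, chosen for illustration):

```python
def no_hi_lo_total(scores):
    """Sum the scores after dropping one lowest and one highest value."""
    if len(scores) < 3:
        raise ValueError("need at least three scores")
    return sum(sorted(scores)[1:-1])

print(no_hi_lo_total([1, 1, 1, 5, 19]))  # 7
```

Only one copy of a repeated extreme is dropped, which is why two of the three 1s still count.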

. . .

The table below shows the original place for each game, as well as the place it would have come in under the "Three Judges Only" method or the "No Hi-Lo" method. Ties were not broken for these alternate methods.

GAME	Original	Three Judges	No Hi-Lo
Sammour-Hasbun vs Kaplan	20	19	19
Galofre vs Milat	19	20	20
Gurevich vs Barcenilla	18	18	18
Akobian vs Friedel	17	T13-14	17
Rosenthal vs Thompson	16	T15-17	15
Krasik vs Balasubramanian	15	T13-14	16
Hungaski vs Schroer	14	T15-17	13
Schroer vs Kacheishvili	13	T15-17	14
Friedel vs Akobian	12	T11-12	12
Shulman vs Felecan	11	T11-12	11
Rensch vs Abrahamyan	10	T4-5	T7-10
Shankland vs Becerra	9	8	T7-10
Stripunsky vs Erenburg	8	10	T7-10
Christiansen vs Kraai	7	T6-7	T7-10
Schroer vs Christiansen	6	T4-5	4
Kacheishvili vs Shankland	5	9	T5-6
Rosen vs Guo	4	T6-7	T5-6
Shulman vs Khachiyan	3	2	2
Sammour-Hasbun vs Kacheishvili	2	3	3
Akobian vs Shulman	1	1	1
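For anyone who wants to reproduce place labels such as "T4-5" from a set of totals, here is one possible sketch. The totals and game names below are invented for illustration, and `places_with_ties` is a hypothetical helper; it assumes a higher total means a better place and leaves ties unbroken, as in the table.

```python
from itertools import groupby

def places_with_ties(totals):
    """Map each game to a place label; tied games share a label like 'T4-5'."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    labels = {}
    place = 1
    # groupby works here because `ranked` is already sorted by total.
    for _, group in groupby(ranked, key=lambda item: item[1]):
        tied = [game for game, _ in group]
        last = place + len(tied) - 1
        label = str(place) if len(tied) == 1 else f"T{place}-{last}"
        for game in tied:
            labels[game] = label
        place = last + 1
    return labels

# Invented totals, for illustration only.
totals = {"Game A": 50, "Game B": 41, "Game C": 41, "Game D": 30}
print(places_with_ties(totals))
```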

Readers are invited to make their own conclusions.

Monday, January 24, 2011

USCL Game of the Year Judging Analysis
