lichess.org

Objective Elo in a game

Is it possible to have an objective Elo evaluation of one's play after a game?
Ok
Say my Elo is 1500, but I play extremely well in a game against another 1500. The objective quality of my moves would perhaps be around 1800 (as if I had beaten SF at an 1800 level).
One game tells you nothing (so there is nothing objective in it).
Also Elo tells nothing after too few games.
Elo obsession hurts your game play anyway.
You can already look at the blundercheck after the game; that's enough.
No. It's technically impossible. Elo is a statistical concept. Like many other statistical parameters, its meaning rests on a sample with a sufficient number of observations (in this case, the observations are games played, each with 3 possible outcomes: won, lost or tied) and a probability distribution modeling the dispersion. That's the true, unique and original meaning of the Elo parameter. It has no meaning apart from the player pool it was calculated from (Lichess Elo is not equal to FIDE Elo, for example), and it only tends to be exact when the number of games played is large. That's why there's no exact way to relate the "quality" of a player's moves in a game to an overall Elo. It would be really hard to evaluate this "quality" alone! Relating this new "move quality" parameter to an Elo would be even more difficult and inaccurate!

It might be a good subject of research, but I guess it wouldn't yield very interesting results, because we would be relating 2 parameters of different natures, one of which is not even defined (if it can be defined at all) and would never be representative of one's overall chess skill.
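
For reference, the statistical model described above can be sketched in a few lines. This is a minimal sketch of the standard logistic Elo expected-score formula and update rule, not any site's actual rating code; the function names and the K-factor of 32 are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """One Elo update: move the rating toward the observed result.

    `actual` is 1 for a win, 0.5 for a draw, 0 for a loss.
    `k` is an assumed K-factor chosen for illustration.
    """
    return rating + k * (actual - expected)


# Two 1500-rated players: each is expected to score 0.5.
e = expected_score(1500, 1500)
# A single win moves the winner by k * (1 - 0.5) = 16 points, however well
# or badly the moves were actually played.
new_rating = updated_rating(1500, e, 1.0)
```

Note that the only input from the game is the result, which is why a single game says so little about the quality of the moves.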
I agree with TinchoVM. You could invent metrics for the "quality" of a player's moves, but that wouldn't be Elo.

"The Elo rating system is a method for calculating the relative skill levels of players in competitor-versus-competitor games such as chess. It is named after its creator Arpad Elo, a Hungarian-born American physics professor." http://en.wikipedia.org/wiki/Elo_rating_system

A move isn't a blunder if it causes your opponent to resign! And this does happen! http://timkr.home.xs4all.nl/chess2/resigntxt.htm

Wow, what a tricky CAPTCHA! I didn't see the bishop on e2! http://en.lichess.org/O9WWycPL
Well, while any objective measure based on move quality wouldn't be identical to Elo, as was pointed out, that hardly means it would never be representative of skill.

Just look at some of Ken Regan's work developing IPR as part of his work on cheating detection.

Is it perfect? Of course not, but it is interesting and does seem to correlate well with results.

Obviously there are some stylistic issues that might make objective quality of moves less than completely reliable as a measure to predict future results.

Someone who purposely plays inferior moves to make the game especially complicated might do very well in practice, but the objective quality of his moves would be lower than someone who just played simple positions that were easy for his opponents to draw.

The first player would be playing sub-par moves to induce even bigger mistakes by his opponent, a quality that could ultimately be captured by a rating system like Elo or Glicko, but not by objective quality of moves.

On the other hand, it's a purely empirical matter whether that makes the objective quality of moves measure less good at predicting results, because traditional rating systems have their own flaws.

The biggest, of course, is that you only get one data point per game, and humans typically don't play enough games frequently enough to keep error bars all that small.

With objective measures of move quality, every move is a data point, so the measure becomes less uncertain much more quickly.
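
The "more data points" argument is the usual 1/sqrt(n) shrinkage of a standard error. Here is a rough sketch, assuming independent observations (moves within one game are certainly not independent, so this overstates the gain) and made-up numbers; the same unit spread is used for both cases purely to isolate the effect of sample size.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of a mean of n independent observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    return std_dev / math.sqrt(n)


# Hypothetical numbers, for illustration only.
games = 25            # rated games played
moves_per_game = 40   # assumed average number of moves per player per game

# With one result per game, the sample size is the number of games.
se_results = standard_error(1.0, games)
# With one observation per move, the sample size is games * moves_per_game.
se_moves = standard_error(1.0, games * moves_per_game)

# The ratio depends only on moves_per_game: sqrt(40), roughly 6.3.
ratio = se_results / se_moves
```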

Whether that would offset being insensitive to the practical features of play is, again, an empirical matter for some intrepid soul to investigate :)

A similar observation was made about Scrabble, which also traditionally uses Elo ratings, in an explanation of the top program Maven in a 2002 paper that can be downloaded here:

http://www.math-info.univ-paris5.fr/~bouzy/ProjetUE3/Scrabble.pdf.gz

Section 10.1 is where the discussion of skill metrics takes place, and he notes the same pros and cons of each method that I've pointed out here. The closing quote is quite appropriate, and works for the discussion here quite nicely:

"We do not really need one single measure of strength; the important thing is to have measures that are appropriate for what we are trying to accomplish."

Ratings work nicely for some things, like measuring your progress relative to a pool of other players over time, but are worthless for other things, like how strongly you played in a single game or short series of games.

Having a well-developed measure of objective quality is a useful metric for assessing such things.
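
One simple stand-in for such a measure is average centipawn loss: for each move, how much worse the engine evaluation is after the move played than after the engine's preferred move. The sketch below is not Regan's IPR, which is considerably more sophisticated; the input format and function name are assumptions for illustration, and the evaluations are made-up numbers rather than real engine output.

```python
def average_centipawn_loss(evals: list[tuple[int, int]]) -> float:
    """Mean centipawn loss over a player's moves.

    Each item is (best_eval, played_eval) in centipawns from the mover's
    point of view: the evaluation after the engine's top choice and after
    the move actually played. Negative losses are clamped to zero.
    """
    if not evals:
        raise ValueError("need at least one move")
    losses = [max(0, best - played) for best, played in evals]
    return sum(losses) / len(losses)


# Made-up evaluations for five moves by one player.
sample = [(30, 30), (45, 20), (10, 10), (120, -80), (60, 55)]
acpl = average_centipawn_loss(sample)  # (0 + 25 + 0 + 200 + 5) / 5
```

A number like this describes one game on its own terms; turning it into something comparable to a rating would need the kind of calibration against a player pool discussed earlier in the thread.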

If you're interested, a link to some of Regan's work on this front is here:

http://www.cse.buffalo.edu/~regan/papers/pdf/Reg12IPRs.pdf

The results obtained for historical figures and matches towards the end are interesting at the very least.

I'm done rambling now :)
As was mentioned, the current rating system rewards a win as a win, and does not care how you got there. I'm sure everyone has delivered a very crude checkmate at some point, and even though it was recorded as a win, there was still a feeling of disgust at the game. (If the game is played on a board, the pieces are immediately put back in the box and tucked under the coffee table. Online, the player merely closes the browser with a single click.)

And on the other side, the movement of the pieces can become so artistic, all competitive thoughts have been pushed to the far recesses of the mind. Even the loss can be viewed as a win, but usually these are the games that end in a draw. Chess does have a peaceful side.
