Croco-Puzzle: Ratings and Ranking

This walkthrough assumes you have registered for the site and logged in. If you have not registered for the site, see that stage of the walkthrough on the registration page.

So you registered for the site some time ago and have solved some daily puzzles. Perhaps you’ve solved some puzzles in the Croco-League as well. Solving puzzles, then waiting, will give you a chance to have your performance rated over time.

1. Select the “Play Puzzles” page

Press the Rätselspaß button in the fourth tab at the top of any page on the site. The result will be this page:

2. Select the “Daily Puzzles” page

There are several different sections with lots of different puzzles, but in this walkthrough we will concern ourselves with the top-right one, about daily puzzles.

Click on the Überraschungsrätsel box, or the link beneath it. The result will be a rather long page which starts like this:

3. Look at the Ratings table

In the navigation box top-left, click on the “Rating” link just below “Liga”, to produce the Croco-Puzzle ratings table that looks like so:

At time of writing, there are 839 users listed on the Rating table, which means that they have attempted at least one rated puzzle within the last six months. (Clicking the “inaktive Teilnehmer anzeigen” button will relax the “within the last six months” criterion, and produce an all-time table with 1468 users.) The higher the rating score, the better.

Scrolling further down, to take just one example:

The webmaster of this site is rated 333rd, at the time of the screen grab, having solved rated puzzles on 869 days, with a current rating of 1355 and a personal all-time best rating of 1399. Very moderate, especially in the context of the best solvers having ratings that go up to over 2700.

Your performance on daily puzzles (both applet and freeform) will affect your rating, as will your performance on league puzzles. However, there is a catch: this will only happen if you have requested this in your registration. Half-way up the registration page, which looks like this:

…there is a box next to text marked “Ich möchte die Überraschungsrätsel auf Zeit lösen und meine Ergebnisse dürfen in den Highscoretabellen stehen.” This must be ticked in order for your performances to affect your rating, and it is not ticked by default. From now on, we assume that this is ticked and that the change has taken effect, as discussed on the Registration page.

4. An explanation of the rating system

Performance on individual puzzles is rated on a 0-to-3000 scale. The fastest correct solution is awarded a performance of 3000 for that puzzle. The solution of median speed is awarded a performance of 1500 for that puzzle.

The difference between the fastest performance and the median performance is used as a unit of variance that is used to calculate all other scores: the fastest performance is zero units behind the top, the median performance is one unit behind the top, and it’s possible to express every other performance as a number of units behind the top in the same way.

Keeping the numbers simple: suppose the fastest performance took one minute, and the median performance took three minutes. Then the unit of variance for this puzzle is two minutes. A solution in five minutes is two units behind the top, a solution in eight minutes is three-and-a-half units behind the top and so on.

For each unit that your performance on a puzzle falls behind the top, your score for that puzzle is halved, starting from 3000. So a top performance scores 3000, a median performance (one unit from the top) scores 1500, a performance as far below the median as the median is from the top (i.e., two units from the top) scores 750, a performance three units from the top scores 375 and so on. Additionally, if you have submitted an incorrect answer (marked with a Fehlversuch in the ranking table), your performance for that puzzle is generally halved; multiple Fehlversuch submissions are penalised even more harshly.
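The halving rule described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the function name performance_score is made up for this example, and it assumes that fractional units (such as three-and-a-half) are handled by a smooth power of one half. The Fehlversuch penalty is not modelled.

```python
def performance_score(time_s, fastest_s, median_s, top_score=3000.0):
    """Illustrative score for one puzzle: halve top_score per unit behind the top.

    Assumption: fractional units are interpolated with a continuous exponent.
    """
    unit = median_s - fastest_s  # the "unit of variance" from the text
    if unit <= 0:
        raise ValueError("median time must be slower than the fastest time")
    units_behind = (time_s - fastest_s) / unit
    return top_score * 0.5 ** units_behind


# Worked example from the text: fastest solve 1 minute, median solve 3 minutes.
for minutes in (1, 3, 5):
    print(minutes, "min ->", round(performance_score(minutes * 60, 60, 180)))
# 1 min -> 3000
# 3 min -> 1500
# 5 min -> 750
```

Under the same assumption, a solution in seven minutes (three units behind) would come out at 375, matching the figures quoted above.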

Your performance for each puzzle is compared to your rating so far, and your rating is set to change by one-sixtieth of the difference between the two. So if your rating is currently zero but you are the fastest solver (i.e., a 3000 performance) on your first puzzle, you have performed 3000 better on that puzzle than expected and your rating will change by one-sixtieth of 3000, or +50. Unlikely, but technically possible. Be warned: committing to a puzzle, by starting it, and then not submitting the correct solution at all will be assessed as a 0 performance.
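As a rough sketch of that update rule, again in Python: the name updated_rating is hypothetical, and any rounding the site applies is not known here, so the sketch leaves the result unrounded.

```python
def updated_rating(rating, performance, divisor=60):
    """Move the rating one-sixtieth of the way towards the latest performance."""
    return rating + (performance - rating) / divisor


# First-ever puzzle, solved fastest: a 3000 performance against a rating of 0.
print(updated_rating(0, 3000))   # 50.0

# Starting a puzzle and never solving it counts as a 0 performance.
print(updated_rating(1500, 0))   # 1475.0
```

The second call shows why abandoning a puzzle hurts: a 1500-rated solver loses 25 points for a single unsolved puzzle.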

League puzzle performances work in a similar way, though it is assumed that the solvers in the league are not distributed the same way as in the daily puzzles and so the performance associated with a median time is calculated differently.

5. An example showing how ratings change

The rating table from the puzzle we considered in the freeform daily puzzle section looked like this:

Accordingly, we will consider Auftakt, who solved the puzzle faster than anybody else, and investigate the (+16) listed by the time. While the puzzle was available to solve for the last day, their position on the ratings table was like so:

Being fastest for any particular puzzle is always equivalent to a 3000 performance for the puzzle in question. The ratings table suggests that their rating was 2045 before the puzzle concluded; as 3000 is 955 higher than 2045, this 955 is divided by 60 to reflect a change of (almost) +16 compared to expected performance.

Croco-Puzzle has some other performance-tracking wrinkles, though. Clicking on Auftakt’s name produces the following performance graph:

At the top of the page is a graph, and lower on the page (not pictured) is a long chart of individual performance ratings.

The graph has three lines on it. The black line shows the latest rating over time, the green line (captioned “50-Durchschnitt”) shows a rolling average of the last 50 distinct ratings, and the red line (captioned “200-Durchschnitt”) shows a rolling average of the last 200 distinct ratings. For both of these rolling average figures, if you are new to the site and have fewer than 50 (or 200) ratings, the rolling average fills in the numbers you are missing with zeroes.

So, suppose you score +50 as a result of your first ever puzzle; your black line score will go from 0 to 50, your green line score will be a rolling average of 49 zeroes plus one 50, or 1, and your red line score will be a rolling average of 199 zeroes plus one 50, or ¼ (rounded down to zero). The green and red line scores move slowly over time. You can see the vertical red and green dotted lines showing the point in time at which the 50th last and 200th last scores were earnt; if you solve every day, these will be (roughly) 50 and 200 days ago respectively, but if you miss days then the averages will stretch back longer in time.
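The zero-padding behaviour can be illustrated with a short Python sketch. The function name padded_rolling_average is invented for this example, and it assumes that the history is simply a list of your black-line rating after each rated puzzle.

```python
def padded_rolling_average(ratings, window):
    """Average of the last `window` ratings, counting missing entries as zero."""
    recent = ratings[-window:]
    # Dividing by the full window size, not len(recent), is what pads with zeroes.
    return sum(recent) / window


history = [50]  # black-line rating after a first-ever puzzle
print(padded_rolling_average(history, 50))    # 1.0   (green line)
print(padded_rolling_average(history, 200))   # 0.25  (red line, shown as 0)
```

Once the history holds at least 200 entries, the padding no longer has any effect and the red line becomes a true average of your last 200 ratings.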

Lower down the page, but not pictured, is a list of more specific ratings. Solvers will have a U1 rating, based on their performances on applet puzzles, and a U2 rating, based on their performances on freeform puzzles. Furthermore, there will be a rating for as many of the 40 different applets as you have solved. These individual ratings move more quickly than the overall rating: instead of the change to your overall rating being one-sixtieth of the difference between your rating and your performance, the change to your specific rating is one-sixth of that difference. In this way, you might be a 1500-rated solver overall but, say, a 2000-rated solver at Slalom puzzles, a 1000-rated solver at Domino puzzles and so on.
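To get a feel for how much faster a one-sixth rating moves than a one-sixtieth rating, here is a small Python simulation. It is a sketch under simplifying assumptions: the helper rating_after is hypothetical, every puzzle is assumed to produce exactly the same performance, and both ratings start from zero.

```python
def rating_after(n_puzzles, performance, divisor, start=0.0):
    """Apply the same performance n_puzzles times with the given divisor."""
    rating = start
    for _ in range(n_puzzles):
        rating += (performance - rating) / divisor
    return rating


# A solver who always performs at exactly 1500, starting from a rating of 0.
for n in (10, 60, 200):
    overall = rating_after(n, 1500, 60)   # one-sixtieth steps
    specific = rating_after(n, 1500, 6)   # one-sixth steps
    print(n, round(overall), round(specific))
```

In this simplified model the specific rating is already most of the way to 1500 after ten puzzles, while the overall rating needs a couple of hundred puzzles to get close.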

Changes to ratings take place once puzzles are no longer available to solve. Each daily puzzle is available for seven days, so a change might take place as long as seven days after you solve the puzzle. So looking at the graph again once the puzzle that Auftakt won has closed gives the following result:

The black line is clearly higher, the x-axis of the graph has scrolled along slightly and the value quoted for the green rolling average is one point higher. (The U2 rating listed below, but not pictured, is also rather higher.)

Returning one page to look at the altered ratings table shows that:

Auftakt’s overall rating has improved from 2045 to 2061 as expected, and the position in the rating table has improved from 72nd to 68th.

6. The Croco-Puzzle ranking scheme

Competitive types may enjoy showing off their prowess, and this is the meaning of the (3D), (1D), (1k), (11k) and so on that you see listed next to people’s names on the site. These reflect the highest scores that you have ever attained with your red-line 200-score rolling average, and the system is based on that used by the European Go Federation. (It may or may not be coincidence that the highest rating scores in this system look similar to – but cannot meaningfully be compared with – the highest rating scores earnt by the most accomplished chess players.)

Earning a red-line rolling average of 100 earns you a grade of 20th kyu, which is signified by (20k) next to your name. A rolling average of 200 earns you 19th kyu and the (19k) rank, and so on, up to a rolling average of 2000 which earns you 1st kyu and the (1k) rank. Getting your rolling average up to 2100 earns you 1st dan and the (1D) rank, with the capital D an obvious indicator of extended success. Similarly, a rolling average of 2200 earns 2nd dan and (2D), 2300 earns 3rd dan and (3D) and so on.
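The thresholds above follow a regular pattern of one grade per hundred points, which can be sketched in Python as follows. The function name rank_label is made up for this illustration, and it assumes that each threshold is inclusive (so exactly 100 already earns 20th kyu) and that the input is your best-ever red-line rolling average.

```python
def rank_label(best_200_average):
    """Map the best-ever 200-score rolling average to a kyu/dan label."""
    steps = int(best_200_average // 100)  # number of full hundreds reached
    if steps < 1:
        return None                   # below 100: no rank yet
    if steps <= 20:
        return f"{21 - steps}k"       # 100 -> 20k, 200 -> 19k, ... 2000 -> 1k
    return f"{steps - 20}D"           # 2100 -> 1D, 2200 -> 2D, 2300 -> 3D, ...


for average in (99, 100, 2000, 2100, 2300):
    print(average, rank_label(average))
# 99 None
# 100 20k
# 2000 1k
# 2100 1D
# 2300 3D
```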

Theoretically a rolling average of 2900 could earn 9th dan and (9D) but in practice this would require you to be first on the site, or extremely close, for every single puzzle for years. The highest black-line rating yet achieved is 2804 by user uvo, better known as Ulrich Voigt of Germany, who has won the World Puzzle Championship nine times. He attained 7th dan and got his rolling average, at one point, up to above 2750. The only other solver to have reached 7th dan so far is user kirarin, better known as Hideaki Jo, who has finished third in the World Puzzle Championship once, third in the World Sudoku Championship twice and has a string of other extremely strong achievements.

In practice, getting 20th kyu is quite a slow process. However, later grades are rather quicker to earn, especially after 7-8 months (or more) when you have 200 different non-zero ratings scores making up your rolling average rather than lots of zeroes, and new grades can be earnt more frequently than once a month. It will probably take 18-24 months’ consistent play for your red line rolling average to actually represent a consistent long-term average of your performance.

Rankings are more accurately considered as a reflection of past achievements, rather than current performance; it is possible to earn a relatively high ranking and then have your rating tail off by a few hundred rating points from its peak over time. Even ratings are only accurate relative to the strength of the userbase at a particular time, and this strength goes up and down over time as particularly strong solvers join and leave the site.

As an extremely broad rule of thumb, people who attend the World Puzzle Championships and perform reasonably competitively (say, on par with the top 60% or so of A-team solvers) often tend to have a rating of perhaps 2000 and up. However, there are many strong solvers with ratings over 2000, often well over 2000, who have not yet got to participate at the World Puzzle Championships, particularly if they have the misfortune to represent a country like Germany, the US or Japan with many extremely strong solvers. It’s probably fair to take a very strong Croco-Puzzle rating as some sort of justification for believing that you really might be able to match up with the best solvers from some less accomplished nations.

It’s also relevant that Croco-Puzzle ratings mostly represent attainment at a relatively limited number of puzzle types, some of which can be presented at a relatively accessible level of difficulty, whereas the World Puzzle Championship habitually features innovative variants and sets no artificial ceiling to difficulty levels. (The recent introduction of U2 puzzles does go some way towards closing the gap, though.) Accordingly, it’s probably not completely accurate to say that a high Croco-Puzzle rating would translate into a competitive World Puzzle Championship performance – but, on the other hand, there aren’t all that many alternative measurements to compare against…

Earning a 20th kyu ranking on Croco-Puzzle reflects dedication to the site, rather than particular skill at the puzzles it presents. Conceivably it could take less than a month, but practically 2-3 months is more likely, particularly if you are not familiar with the puzzles in the applets before you start, or longer still if you do not attempt every puzzle each day. (It took me 72 days.) If getting a ranking at all is the first big step, then perhaps earning 9th kyu and having a single-digit kyu ranking beside your name is the second big step, suggesting that you are, reasonably consistently, not too far off the median (high!) standard of the site’s users. The third big step is dan status, which reflects talent as well as extremely high standards of practice and skill.

The strongest solvers need not stop at 1st dan and have higher ranks still to attain, but perhaps there comes a point at which, metaphorically, different black belts might be considered less important than championship belts…

