12 December 2010 @ 11:47 am
Stranger in a Strange Land - Croco-puzzle review  
At the WPC I had a good conversation with Stefano Forcolin of Italy about how to turn competitive puzzling into a true sport, particularly with regard to having a ranking system that could tell people where they stood at different levels of performance, from local/school competitions to national or international events. Many other games (chess, Scrabble, ...) have such numeric systems at play, and they give competitors a concrete measure and goal (increase that number).



One thing to start such ratings out, at the individual puzzle level, is simply time goals applied to printed puzzles. Nikoli does this on some of their puzzles (more types online than in print) and beating the "expert" time is the goal I tend to have when solving on paper, and I note if I beat it by 2x or 3x or such with double or triple circles of the time. Sudoku is hard to rate, since most computer solvers don't approach the puzzle like a human does, but Wei-Hwa experimented with writing code to do this before the WSC and his results seemed pretty good at mapping human times. I'd like to get back to advising him on that sometime soon, and then implementing it in a printed book or two so that solvers could solve a classic sudoku and know, based on a percentage of a goal score, how well they are doing relative to other solvers.

But on the broader topic of rankings, with so many online sites and live tests, there is not much out there. A long time ago Cihan Altay tried to keep this up, using live and online tests for rating points, to give a top list, but this stopped just as I was starting to compete. Right now, Logicmasters India is making rankings based on their sudoku and puzzle tests. I happen to be leading the sudoku list despite sitting out many tests (I simply don't love sudoku the way some do) and am tied at the top of the puzzle list right now with Hideaki Jo; Ulrich Voigt is in third after a poor Flip test - otherwise he'd probably also be essentially tied with us since we are often seconds apart on an hour plus test. (Aside: an excellent "Puzzles and Chess" competition by Nikola Zivanovic is running this weekend and will be open for another 5-6 hours so do head over there to print out the puzzles for later if you can't find time to compete). Even these rankings require solving one or two hour tests on weekends and don't extend to other settings or other sites well since the formula at play is hardly transparent.

Well, the German site Croco-puzzle was brought to my attention by Stefano. I've avoided the site for a while because A) there is a huge language barrier since it's all in German, B) I suffer from puzzle-snobbery: the puzzles are computer-generated and many, like a CG Hitori, leave me with as much "joy" as you'd expect, and C) I have a general dislike of online solving of any form, because applets rarely give me the freedom paper does to notate puzzles in a good way and I want to practice for live competitions, not learn bad habits from a particular applet.

However, Croco-puzzle has a very interesting ranking system that is (relatively) easy to understand and seems to be a good basis for a puzzle league/ladder system. Basically, every day solvers get access to a puzzle (since Dec. 1st, two puzzles), which they solve. After the day is over, all the times, including those of solvers who opened but didn't finish the puzzle, are considered to calculate a median time and a best time. The best time is worth 3000 ranking points, the median time 1500. Anything else is scaled based on its position relative to these two times through a formula given on the main page. While this can give a score for one day, the way to make it a true ranking system is that only a fraction of that day's points counts toward your next ranking. 59/60ths of tomorrow's ranking comes from today's ranking. Each of the two puzzles then contributes 1/120th of your score on it that day, from 0 to 3000 depending on your result. Everyone's scores go up and down. If you are successful at a certain score level for long enough, you earn a new rank much like in judo or other systems, and ranks exist at every 100 point plateau so everyone has goals to shoot for both personally and competitively with others. If you are away for a week, your rating won't change at all. Only on days you compete will your ranking points increase or decrease.
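The daily update described above can be sketched in a few lines of Python. This is only a minimal sketch, not the site's implementation: the real scaling formula is the one on the Croco-puzzle main page, which I am not reproducing here, so `day_score` simply assumes linear interpolation in solving time between the two anchors (best = 3000, median = 1500), clamped to the 0 to 3000 range. The function names are my own, and the handling of a day where only one puzzle is played is likewise an assumption.

```python
def day_score(time_s, best_s, median_s):
    """Score one solve: best time -> 3000, median time -> 1500.

    ASSUMPTION: linear in time between (and beyond) the two anchors,
    clamped to [0, 3000]. The site's real formula may differ.
    """
    if median_s <= best_s:
        raise ValueError("median time must be slower than the best time")
    raw = 1500.0 + 1500.0 * (median_s - time_s) / (median_s - best_s)
    return max(0.0, min(3000.0, raw))


def update_rating(rating, scores_today):
    """Blend today's puzzle scores into the rating.

    Each puzzle played carries weight 1/120, so with both puzzles
    played the old rating keeps 59/60 of its weight. With no puzzles
    played the rating is unchanged.
    ASSUMPTION: a single puzzle leaves the old rating at 119/120.
    """
    n = len(scores_today)
    return rating * (1 - n / 120) + sum(scores_today) / 120
```

Under these assumptions, a solver rated 1175 who tops both puzzles gains about 30 points that day (1175 × 59/60 + 3000/120 + 3000/120), which shows why climbing is slow even with perfect days.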

After Chris Dickson of the UK made a nice walkthrough of some of the site in English which brought it to the fore of my mind right after the WPC again, I thought I'd try it out for a month. I didn't want my trial to be publicly known, so I did not register as motris or drsudoku as I would elsewhere. I wasn't sure I'd stick with it, and the mystery of who is this guy could be fun. So instead, I made a half-hearted attempt at a new nom. In a cryptic crossword sense, a half-hearted attempt is exactly what I did as half of my middle name is MARS. Also, solving on a site exclusively in German is like being a solver from another planet, so MARS fit pretty perfectly. The first week didn't go so well, as I was at best barely getting within 10% of uvo times and the puzzles were new and frustrating (a Pyramid I really stumbled on, an ABC-Box that I just didn't solve fast, an insanely large and un-fun Domino "puzzle" that took me almost an hour to hammer (not logic) out that may now be known as the Refrigerator puzzle as someone took delivery of that appliance during the puzzle and got credit for it alongside their time). Aside from unfriendly or unfamiliar puzzles, a large part of my stumbling was in learning the new applets which are not easily documented as their text can't be pushed to a translation engine as easily. Many of the applets have hidden buttons of incredible value. Dominoes has a "D" that marks out a full domino. Didn't know that when I first tackled the Refrigerator puzzle. Magnets had, at the time, an "F" key that fixed all the magnets unknown around a particular one to match polarity. An awesome feature, if you know to use it. New updates to the applet have changed some things (Magnets now does "auto-F" compared to the old setting) and also given a good trial and error feature that can instantly backtrack, a good addition to the four colors that can be used to stratify guesses as well. Some have inconsistently applied systems. 
The "#" route to sudoku notation is too cumbersome to use. I prefer the "-" based double noting: 1-2 puts both numbers into a cell. This works on Killers and Kakuro, but doesn't on regular sudoku. So remembering where notation can and can't work is an extra step some days. Some of the applets work quite well. Others, particularly loop-drawing ones, are nowhere near as easy to use as the Nikoli ones, and even Nikoli's can't match the ease of drawing on paper.

Many days I enjoy the puzzles greatly - they have a quite varied spread and sometimes feature tough forms of some favorites like Skyscrapers and Easy as ABC and Star Battle and Tapa and other enjoyable WPC types that simply aren't as easy to find elsewhere regularly, and certainly not in a time-based setting where you can see how you stack up to others. As noted above, Magnets (with the applet assists) is a favorite for me on Croco-puzzle even though uvo will top it most all the time - but maybe "auto-F" is a bad habit to gain before the next paper test. Other days the puzzles aren't what I'd want them to be, almost exclusively when they venture to Nikoli types. The Slitherlink (Rundweg) on croco-puzzle are actually pretty good, particularly as they use large boxes or hex variants regularly, but the rest range from average (Masyu, Heyawake) to poor (Arukone = Numberlink, Hitori). The Heyawake generator, for example, is in love with 1 x n rectangles clustered together, as well as 0 clues being seeds, which means you can quickly adapt your mind to how the solutions will tend to look/expand. The applet on Heyawake also doesn't fully shade cells which I dislike compared to Nikoli's presentation. The sudoku generators are very hit and miss and I hate the inequality ones, particularly on the harder end. I made thermo-doku to improve inequality sudoku puzzles, and these rub me the wrong way as one would expect, particularly as the forcing nature of the constraints is often way too subtle for a puzzle. If I need to bifurcate to show something on a sudoku, the puzzle is broken (for a human solver).

The "new" puzzles on the site also split across a range of like to dislike. Some really don't feel like puzzles to me. Sternenhimmel feels like "work", just like a Hitori does. Basically, if I had a button to remove all the unpointed at squares, this is almost never a puzzle. If removing the unduplicated numbers similarly ruins most Hitori, and certainly these, you can understand why I don't enjoy either as there is no thinking to be had and a puzzle should engage the mind. Pillen sits on the other end of the extreme, with some search elements and logic elements that can give a good challenge. While the initial grid looks like a Hitori, the solve is so much different and quite fun. Others are still very unfamiliar, but as I solve more of them, I've gotten better at some. Solving all the preisratsels (Prize Puzzles) from the last 8 years will very much train you in the various applets and how to solve and guess on croco-puzzles.

So, the first week was rocky, but then as I said with some practice I started to top puzzles. Then top some more. This felt a lot better to me than being second or third. The reason is that when you don't have as many points to defend, and will certainly increase each day, what you really want to do is get the top time and set a standard that reduces the number of points others get. This is very much about "playing offense", until you have ranking points to defend (I'm approaching that point finally). So I aggressively went after puzzles as fast as I know how to, on some types this meant quicker bifurcation when I really didn't need to do so, and doing less checking than I might otherwise. Sometimes this costs me a lot (like on a laser recently, or a Hitori I would have beaten everyone by 40 seconds on), as errors are very bad. Unlike sites like Nikoli.com, where an error just makes you go back to the puzzle with the mistake highlighted, here an error affects your time with a penalty equal to the median time of all solvers without errors. This almost always means you are going from potentially a max of 3000 to closer to the 1500 range or worse. The red text that you've made a mistake is a real stress inducer. And I have in my haste made the common errors I do on some of these types. A Tapa had an unconnected white cell. New approach: mark all unused cells too to check. A Hitori had an unmarked black that was needed to not have 2 solutions but otherwise unimportant. New approach: mark all unused cells too to check. I'm adapting and getting better.

And mars got noticed and exposed. Sooner than I wanted. Berni actually announced who I was in week 2 of the experiment on the German forum, long before I could build any anticipation of what new solver is setting benchmark times on ~20% of the puzzles aside from a single UK forum post asking who I was that went unanswered. As it stands, after 36 days, I have solved every puzzle I've seen (48 now), topped 9 (in place to get 2 more today though for 11/48) and have built my rank to 238th (1175 points). It will likely take about 5 more months before I will be in the 2500+ ranks, and longer to earn a 5th or 6th Dan rating, but while I find this slow change frustrating at the moment, it is exactly the right thing to do to have a fair and slow to change ranking. Puzzle ranking is absolutely done right on this site.
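To put a rough number on that "5 more months", here is a small back-of-the-envelope sketch. It assumes I play both puzzles every day and average the same score on each (neither is realistic), in which case the rating follows a simple exponential approach toward that average: r_n = S + (r_0 − S)·(59/60)^n. The function names and the sample averages are hypothetical.

```python
import math

KEEP = 59 / 60  # share of the old rating kept per day with both puzzles played


def rating_after(start, avg_score, days):
    """Rating after `days` days of scoring `avg_score` on both puzzles."""
    return avg_score + (start - avg_score) * KEEP ** days


def days_to_reach(start, target, avg_score):
    """Smallest whole number of days until the rating reaches `target`."""
    if avg_score <= target:
        raise ValueError("the rating can never exceed the average daily score")
    if start >= target:
        return 0
    ratio = (avg_score - target) / (avg_score - start)
    return math.ceil(math.log(ratio) / math.log(KEEP))


# From 1175 points to 2500, for a few assumed daily averages.
for avg in (2600, 2700, 2800):
    print(avg, days_to_reach(1175, 2500, avg))
```

For averages between 2600 and 2800 this comes out at roughly 100 to 160 days, so the five-month guess is in the right range for someone averaging in the 2600s.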

How would I compare this to the only other online place I play (Nikoli.com)? Well, I wouldn't pay for croco-puzzle daily without some changes, but I don't have to as this site is free to register and play on. The ranking aspect of Croco-puzzle is so much better from a competition standpoint than Nikoli. On Nikoli, the only obvious ranking to go for is first to finish, since solvers will retake puzzles (and those scores then crowd the top of the time-sorted rankings for a reason I'm still not sure of) and solvers may not get to a puzzle for days/weeks so knowing your final spot requires looking back. There are no high-score lists for different puzzle types on Nikoli.com, while this is a feature on croco-puzzle with all-time and current leaderboards. On Croco-puzzle, the surprise puzzle ranking is fixed after 24 hours, since solvers just have any time during that day to solve the puzzle. You can retake the puzzle as many times as you want afterwards (although they eventually disappear after 48 hours entirely from the site), but the ranking list is set from the first time, and is easy to follow and watch from day to day. The goal of "double-top" has had my attention for the last 11 days, since the double puzzle came out. Hausigel (Roland Voigt) did it first, but I might squeak it out today with an impressive Heyawake solve and a Buchstabensalat (Easy as ABC) solve 1 second faster than misko (Michael Ley).

Even aside from the rankings, the range of puzzles is great WPC practice that simply isn't available doing just Nikoli.com. So, to catch back up to Ulrich I'll be playing here going forward. I would highly recommend other putative team members try it out too, maybe reading through Chris's walkthrough in English to get a little sense of what does what, but the experience for now is still very much like a stranger in a strange land. Everything will be in a foreign language, and until you are sure of a type and how the applet behaves for it, you might not be comfortable solving it. So certainly not where I'd point beginners, but a good experience for those who want to really figure it out. The site has a lot more to it than the surprise puzzle and rating system - the old Advent puzzle sets have some really good tough puzzles on the highest levels - but the rating system alone sets a model that begs to be copied and used elsewhere in the puzzling world. Using Nikoli-quality puzzles with this rating system would be exactly what I'd implement if I ever made my own site. Then we could establish who is the best across each of several puzzle types, as well as overall, in a compelling way.
( 21 comments )
jiggery_pokery on December 12th, 2010 09:35 pm (UTC)
Hooray! Glad you're enjoying the site. I really ought to go back and revise that walkthrough at some point... I had completely failed to peg you as mars; it took me a long time to realise that motris is Sir Tom backwards.

Croco-puzzle is one of my very favourite web sites at the moment. It certainly helps considerably that Berni seems to be personable, pleasant and kind, and that the site is continuing to grow and expand; there have been enough web sites that have stumbled in one regard or the other sufficiently to annoy that it is a joy when one of your favourite web sites proves not to suffer either fault.

The rating and grading scheme is genius that makes the ue-ratsel game work, I think, and it makes it work even for people, like me, with relatively modest levels of attainment. (For instance, I have attracted my friend daweaver to the site, who I had never previously pegged as a logic puzzle fan, though I have known him long to be extremely bright and something of a puzzle and game fan in general. There's a good chance he may overtake me soon.) I will have to improve considerably ever to reach a grading of 1000, though I think that in time I will get to 500, and "Fragen zum Rating" suggests I might be capable of 707 with a few more lucky Pyramids. Perhaps this, possibly in conjunction with the LMI rating scheme and maybe others, might lead to a more universal rating scheme... not least so that strong competitors who never attend a WPC, only because of the strength of their nation, really might compare themselves to attendees who represent the best from weaker nations.

Congratulations on your recent award of 20th kyu; the grades will roll in more and more quickly for you from now on! I think there's something of the feel of "levelling up" to it, with the added bonus that it's not your character levelling up, but you, as you develop your puzzle skills. Of course, now you've posted this and attracted all your classy US puzzle friends to the site, median times are going to get better and better and the rest of us will have to work harder just to compete!
(Anonymous) on December 13th, 2010 02:29 am (UTC)
"aside from a single UK forum post asking who I was that went unanswered."

Finally I got an answer!: I didn't see the German forum spoiler but soon after that post I began to suspect that 'mars' was some corruption of 'motris' :)

A nice review, thanks for your point of view. Interesting how the ranking system works for players at the very top - that you consider the motivation to punish other players' ratings. (and, then, how uvo will feel about the fact that you make his 8 dan target far more challenging)

I tend to vote down Nikoli puzzles on the "Ligasystem" page for the same reasons as you give (they are already on Nikoli, with better puzzles and a better applet there). Kakuro is probably the only directly comparable type where I moderately prefer the croco-puzzle version

Congratulations on your successful double top!

Ronald
motris on December 13th, 2010 02:42 am (UTC)
There are a few types where the difference in challenge from generation across the sites is nice. The Kakuro are fairly different here and pretty challenging at times (although some of the Advent ones bordered on unreasonable - it's a fine line). The standard Rundweg/Slitherlink solve differently too, although you may already have experienced this if you search out other slitherlink generators, as Nikoli's use preset seeds a lot more than a computer generator will. I suppose the same "seeded" break-in versus "one solution" puzzle construction differences explain the major differences, and most hand-construction will therefore feel easier over time if designers don't really push things to new places.

The new "Ligasystem" rating system is a superb addition to the site (I didn't even get my first review out before so much changed on Dec. 1). I'm sort of glad to see the community voting down a lot of the worse puzzles (all sudoku are going down) and voting up mostly what I'd call the better ones. This will lead to an improved mix of puzzles going forward. The only one I'm sad to see so poorly received is Schlange (amazingly enough). The old applet auto-numbered and made those puzzles really fun to play compared to paper. The generator tended to make fairly tractable ones, or at least fairly tweakable ones, which are good practice for the WPC. Not enough people, however, seem to stand up for the Schlange.

Edited at 2010-12-13 02:44 am (UTC)
(Anonymous) on December 13th, 2010 03:01 am (UTC)
Yes - and I'm surprised to find that the computer generated Kakuro give me more interesting solves.
Nikoli puzzles often seem to be solved by scanning the grid for commonly occurring patterns - I find this less interesting.

I think I mentioned to you before that Kakuro generators are often very poor - this is by far the best non-trade generator that I've seen (and perhaps the best generator, period).

The Slitherlink are similarly interesting, as you say, and are nice variants. I particularly enjoyed the recent puzzle with 4(?) point stars.

Ronald
motris on December 13th, 2010 03:12 am (UTC)
The penrosemuster tiling is an interesting choice although I'm not set on whether, once you learn some of its tricks and counts, it is a great grid in general. There was actually another Rundweg type in the earlier puzzles based on triangles which was absolutely horrendous to solve - wrong ratio of edges to vertex choices to be honest - so I'm glad to see that hasn't come back recently. As others have said, the site seems to improve, with good ideas kept and expanded and others made more marginal.

Edited at 2010-12-13 03:14 am (UTC)
(Anonymous) on December 13th, 2010 02:50 am (UTC)
For interest, the UKPA has been privately working on the possibility of a rating system to decide future WPC/WSC teams, based on the results of any or all online puzzle tournaments comparable to LMI in their scope (presumably including, for example, the USPC).

The subtleties of such a rating system are awful and have caused more than a few heated 'discussions': about the best/fairest data to include, so that everyone at every level of ability is motivated to participate, ranked fairly, and unable to 'game' the system to get an unfair advantage.
(The major difference between us and croco-puzzle is that we want to fairly rate players who are not able to make such a large time commitment to regular puzzle-playing)

As it stands, we've made some progress, but it may be that we are unable to agree and will continue to piggyback off the USPC.

With this work going on in the background, it makes the croco-puzzle rating system all the more impressive in its scope. In the short term, everybody faces their own challenge (whether that is competing with a key opponent, reaching a kyu/dan grade, double topping, etc) - in the medium term, players are motivated to come back day after day - and in the long term, rankings match up to a fair reflection of ability. It's an absolutely awesome piece of work, and a great credit to berni.
(Anonymous) on December 13th, 2010 02:53 am (UTC)
(that was me again, I don't intend to post anonymously but I don't care to give Livejournal access to my account details for other sites - Ronald)
motris on December 13th, 2010 03:27 am (UTC)
Using a rating list for team selection seems a very good goal for such a ranking. The USPC is the one puzzle test I look forward to all year, but anyone can have a bad day and making it THE piece in team selection has always felt a slight weakness. For this reason, "exemptions" have been used to ensure higher quality US teams, but this often means just one or two spots are actually available. I could have told you that MellowMelon had a >80% chance of qualifying this year based on his relative Nikoli.com performance improvements, and from my own personal belief that writing hundreds of puzzles will make you a much better solver too, but the independent evaluation of his skills was not going to matter if he didn't bring the ability to the USPC.

The answer for the UK probably comes down to a matter of weights. A 100% USPC criterion will still work - it is one of the most balanced of all the online tests, and is rather consistent from year to year in type of content - but building in some other metrics can remove the "one bad day" effect. With some of your own tests now too, certainly they can become a small component of whatever the UKPA decides to use.
(Anonymous) on December 13th, 2010 05:05 pm (UTC)
We wondered at the German Sudoku Championship who this mysterious "mars" might be. We came to the conclusion that it's not possible to get these results without massive experience in logical puzzles and dedication to adapt the solving style. So there were only three options left: a cheater, a new account of somebody already on the list (but all top-solvers are still there day by day, so this would also be cheating) or someone of maybe a handful of people in the world.

The ranking system has some minor flaws, but it takes a long time until you notice them. Nobody can get better forever, so after some time, you can't get motivation from the goal "more points" or "higher grade". One day (after two or three years or so) you will realize, that you had a very lucky streak of puzzles one time and will probably not reach your highscore again. At this point you have to get your motivation elsewhere. I think with the new accounts topping the leaderboard in the last months, it's also not possible for somebody like uvo to maintain his rating, because more people will beat him occasionally. For everybody else it would also be very difficult to ever reach his actual score. So why shouldn't he just stop? I think he will not do this, because he likes the daily competition too much, but the rating system only works as a measure of "actual skill", if everybody keeps solving even with no room to improve.

Some solvers also improve their rating way over their general skill level by ignoring some puzzle-types, or by ignoring hard puzzles or by ignoring easy puzzles. (I know I would improve much by ignoring every puzzle solved under 60 seconds, but I think that's not fair.) Maybe a very very small decay would solve some of these problems, but I'm not sure.
motris on December 13th, 2010 05:21 pm (UTC)
The ability to "skip" puzzle types to improve a rating is one weakness of this site and I have recognized that already. I find doing all the puzzles when I'm able to do so (at home, with internet) the fairest way to compete. If I come to the site to do the puzzle, I will do it, whether the high score table tells me it is a 20 minute laser puzzle or a 20 second fillomino. But others who pick and choose can alter the ranking in different ways. On your last point, perhaps a separate type of classification (besides by puzzle) could be used with easy/medium/hard puzzle types. You would probably use median times to group, or even just completion percentage. Doing well on the puzzles that <80% of the solvers finish is much different than doing well on a 30 second puzzle. The community results would demonstrate the difficulty of the puzzle, as opposed to the rating itself, which feels a natural use of the data.
jiggery_pokery on December 13th, 2010 06:33 pm (UTC)
Mmm... there are plenty of days that I don't do the Ue-ratsel puzzles because (a) they look so hard that I'm not likely to enjoy them and/or (b) they're of types that I don't enjoy / can't solve. This will always be the case and probably isn't a bad thing, given that it's a fun activity that people can dive in and out of to suit themselves.

I do think that people who spend even long times cracking very difficult puzzles tend to gain points - possibly more points than "they deserve", by some metric - even from slow times simply because the existence of people who try and fail on the very difficult puzzles push the median position so far down the list of solvers (sometimes past the last solver!) that it's possible to get a near-median result, or better, even from a below-median performance. I idly wonder whether the results might more accurately reward performance still by having a small number of imaginary bots (3?) who try, but fail, on each puzzle every day, so that the median deliberately reflects more accurately not just (the median performance of solvers) but (the median performance of solvers and non-solvers)?
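To make the bot idea concrete, here is a minimal sketch of a day's median in which non-finishers (and any imaginary bots) count as infinitely slow entries. How the site actually treats a median that falls past the last solver is not something I know, so the infinite result below is purely an assumption of this sketch, as are the function and parameter names.

```python
import math
import statistics


def day_median(finish_times, unfinished=0, bots=0):
    """Median time for the day, counting non-finishers as infinitely slow.

    `unfinished` is the number of real solvers who opened the puzzle but
    did not finish; `bots` is the number of imaginary always-failing
    entries. An infinite result means the median position falls past
    the last successful solver.
    """
    entries = list(finish_times) + [math.inf] * (unfinished + bots)
    return statistics.median(entries)
```

With three finishers at 100, 200 and 300 seconds and two real non-finishers, the median is already the slowest finishing time; adding three failing bots pushes it past the last solver entirely.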

I also note, with a smile, the return of uvo to the site, and have a fanciful suspicion that Thomas' post may be what inspired him... ;-)
motris on December 13th, 2010 06:47 pm (UTC)
I totally agree that players should have the freedom to choose which puzzles to solve, and which days to solve them. If you see top solvers taking 20 minutes on a U-Bahn, and you haven't solved many/any of the type, it makes perfect sense to not get your first experience on such a challenge. Similarly, if you only want to play on Saturdays and Sundays, that is fine too. As the earlier commenter mentioned, this means in some areas of the ranking the general classification isn't as consistent as it could be, but that seems of little harm to me.

Where I frame my view is at the eventual top end of the spectrum, where frequent WPC competitors would likely want the high score levels to represent overall puzzle skill. So if I only chose the Masyu and Heyawake and Hashi and Sudoku to get to 2800+ in score, that doesn't seem right to do; rather, like the other top 20 or more, it is to play everything, expose my strengths and weaknesses, and see after a year where I sit relative to other very good players. Our natural love of puzzles and competition sets us up for the most sporting way to participate.
jiggery_pokery on December 15th, 2010 01:28 pm (UTC)
One of the nice things about the croco-puzzle set-up is that it puts a structure in place that enables all sorts of additional interesting competitions, some of which could be done by hand without further programming from Berni.

For instance, imagine team competitions. Here's one possible model; declare a puzzle team to have seven (or six, or eight, or...) participants, some of whom may participate in either or both of the croco-puzzle puzzles on any particular day. The team's score for that day is the sum of the four (or five, or...) best times recorded by any of the team's players on each puzzle. Compare two team's scores, and the team taking less time wins. Accordingly if you have team members who don't complete both puzzles then your team is not sunk outright, but every team member who completes both puzzles has the opportunity of improving the cumulative team performance of that day.
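The model above could be scored by hand, but as a minimal sketch it might look like this in Python. The names are hypothetical, and one rule is my own assumption: a team with fewer than the required number of finishers on a puzzle gets an infinite score for the day, since the model does not say what should happen in that case.

```python
import heapq
import math


def team_score(times_by_puzzle, best_n=4):
    """Sum of the `best_n` fastest team times on each puzzle (lower wins).

    `times_by_puzzle` maps a puzzle name to the finishing times (seconds)
    of the team members who completed it.
    ASSUMPTION: a team with fewer than `best_n` finishers on a puzzle
    gets an infinite score for the day.
    """
    total = 0.0
    for times in times_by_puzzle.values():
        if len(times) < best_n:
            return math.inf
        total += sum(heapq.nsmallest(best_n, times))
    return total


def winner(name_a, times_a, name_b, times_b, best_n=4):
    """Return the name of the faster team, or None on a tie."""
    a = team_score(times_a, best_n)
    b = team_score(times_b, best_n)
    if a == b:
        return None
    return name_a if a < b else name_b
```

Because only the best four times count, a fifth or sixth finisher can only ever help the team, which is the property described above.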

It would be interesting to see "team USA" vs. "team Germany" conducted in this way, though I'm not sure if there are seven of the top US solvers on croco-puzzle yet. Certainly there's you and Roger Barkan, who would both be easy top-seven choices, but you'd probably know if the other top US solvers are on croco-puzzle rather more accurately than I would. The UK could put together a representative VII easily: in no order, DavidMcN, detuned, oenomel, PuzzleScot, ronaldx, paulredman (not a known WPC name, but has been on croco-puzzle much longer than the rest of us!) and probably rodders for the seventh position. That's off the top of my head and I may well be missing people there. The UK's VII wouldn't compete with the German VII or the US VII, but might well enjoy resuming combat against, say, the Finnish VII, which seems to be about our natural level. (Or perhaps it might be interesting to compare the UK against the Hamburg team, or the Dortmund team, or the Essen team - I don't know where in Germany the solvers live.)

Perhaps such team competition might motivate those who have reached their natural peak to return to the site and push themselves once again. On the other hand, perhaps such raising the stakes might inspire malpractice of the sort we don't currently see at the moment, but such malpractice could exist already and (to the best of our knowledge) doesn't, so it's probably not worth worrying about, for now.
zundevil on December 18th, 2010 07:14 am (UTC)
This kind of reminds me of the Nikoli league of which I'm still listed as (default) commissioner on my self-introduction. Feel free to read about it here:

http://zundevil.livejournal.com/34953.html

I liked the idea of only taking the top-n performances for each puzzle from the team's overall N. It felt like more of a team-y thing that way. I could even imagine the best players *not* being the selected VII, since breadth of coverage would be just as valuable.

The problem with our quickie league thing -- if you can call it a problem -- was that we didn't reach anybody; it was just eight anglophones. Should the croco-puzzle thing turn into something truly international, it sounds like it could be really awesome.

Even as is, it sounds like something I really should sign up for.
motris on December 18th, 2010 06:59 pm (UTC)
Well, the other problem with the league was that it took too long to gather results (this was manual work); compared to a site that does this automatically, that made it harder to compare Japan's best with the US's and UK's.

I still really like the idea of using Nikoli times to track solvers, and croco-puzzle at least gives a model for something that works for individuals, but a team aspect would be interesting in either place.
Gabriele Simionatogabrieleud on December 14th, 2010 05:02 pm (UTC)
score and ranking
Are you talking about an ELO-like score? Or something with rankings like "Master", "Senior", "Grand Master" and so on?
motris on December 14th, 2010 05:08 pm (UTC)
Re: score and ranking
I'm not familiar enough with chess to know the specifics of an ELO-like score and how that would differ from the kind of numeric score that croco-puzzle uses.

I'd say the right system would be able to rank all players with a scaled number after they have completed some number of puzzles of some type with time information. Maintaining a certain score or breaking a score threshold to earn a title like Master or Grandmaster would be worth including too. Having a general "puzzle" rating is hard, but imagine first the classic "sudoku" rating. If we could define something for the latter, we could begin to work towards the former.
(Anonymous) on December 14th, 2010 08:12 pm (UTC)
Re: score and ranking
It's an ELO-like system. Indeed the ELO-system was a model for the rating system (more precisely I used the European Go-Rating system as a model, which is derived from the chess-ELO). I also tried to make the rating numbers comparable to the chess-ELO-numbers, although I can't tell if they really are. In chess a Master is (roughly speaking) someone with about 2300 ELO (I'm not 100% sure on this), which would mean that someone getting the grade of a 3-Dan would qualify as a Master?

Berni (owner of the CrocoPuzzle site)
Teesside Snog Monsterjiggery_pokery on December 15th, 2010 01:04 pm (UTC)
Re: score and ranking
If I might interject...

The croco-puzzle rating system is superficially ELO-like, inasmuch as the top ratings are around 2800 or so. However, the chess rating system is a zero-sum game; when two rated players meet then one player will donate rating points to the other according to the result of the game. The croco-puzzle rating system does not have the same property, and it's easy to create (slightly pathological) sets of results whereby the total number of points in the system increases or decreases.
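For anyone unfamiliar with the zero-sum property, here is a minimal sketch of the standard chess Elo update in Python. It assumes both players share the same K-factor (the K value of 20 is just an example, and the property fails if the two K-factors differ); it says nothing about croco-puzzle's own formula, which isn't given here.

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def elo_update(r_a, r_b, score_a, k=20):
    """Return the new ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    Each player's change is computed independently from their own
    expectation; with a shared K the two changes cancel exactly.
    """
    delta_a = k * (score_a - expected_score(r_a, r_b))
    delta_b = k * ((1.0 - score_a) - expected_score(r_b, r_a))
    return r_a + delta_a, r_b + delta_b


new_a, new_b = elo_update(2000, 2000, 1.0)
print(new_a, new_b)  # the winner gains exactly what the loser drops
```

The two expected scores always add up to 1, so whatever one player gains the other loses, and the total number of points in the pool stays constant.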

The croco-puzzle rating system awards numbered dan (master) and numbered kyu (student) grades based on the average of your previous 200 distinct daily ratings; once grades are awarded, they are awarded for life, like chess titles. Kyu titles are effectively "negative dan" titles, and thus "20th kyu" is the most modest title awarded, through to "1st kyu" which is pretty good. There is no "zero'th dan" title; you go from "1st kyu" to "1st dan", the most modest master title, then can earn increasing dan titles after that for more prodigious levels of accomplishment still. Having your previous 200 distinct daily ratings average 2000 is sufficient for 1st kyu, 2100 earns 1st dan, 2200 earns 2nd dan and so forth.
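As a rough illustration, the grade thresholds above can be sketched in Python. Only the 2000 (1st kyu), 2100 (1st dan) and 2200 (2nd dan) lines come from the description; the sketch assumes that kyu grades also sit 100 points apart below 2000, and that no grade is awarded until 200 daily ratings exist. All function names are made up for the example.

```python
import math

WINDOW = 200  # "previous 200 distinct daily ratings"


def grade_steps(avg):
    """Whole 100-point steps above the 2000 line (negative below it)."""
    return math.floor((avg - 2000) / 100)


def grade_name(steps):
    """1 -> '1 dan', 0 -> '1 kyu', -1 -> '2 kyu', ...; None below 20 kyu."""
    if steps >= 1:
        return f"{steps} dan"
    kyu = 1 - steps  # ASSUMPTION: kyu grades are spaced 100 points apart
    return f"{kyu} kyu" if kyu <= 20 else None


def lifetime_grade(daily_ratings, window=WINDOW):
    """Best grade ever reached by any full window, since grades are for life."""
    best = None
    for end in range(window, len(daily_ratings) + 1):
        avg = sum(daily_ratings[end - window:end]) / window
        steps = grade_steps(avg)
        if best is None or steps > best:
            best = steps
    return None if best is None else grade_name(best)


print(grade_name(grade_steps(2100)))  # 1 dan
print(grade_name(grade_steps(2000)))  # 1 kyu
```

Because the best window is what counts, a solver whose average once reached 2100 keeps "1 dan" in this sketch even if later results drag the current average down.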

I think it would be presumptuous to try to compare croco-puzzle ratings and titles with chess (or go, or...) ratings and titles. One of the very strongest solvers in the world (who finished fourth in the world championship four times and is a perennial world championship contender) can go from a croco-puzzle rating of zero to one of over 2300 in about seven months. On the other hand, he has been solving puzzles for so long - well over ten years! - that you have to give his puzzle career a great deal of credit for years spent improving before he started playing on croco-puzzle. It would be interesting to follow the progress of a prodigy who hadn't taken up logic puzzles before discovering croco-puzzle and see how quickly they could improve. It is said that a go player will take three years to improve from being a beginner to reaching first dan, even with natural talent for the game; my gut feeling is that the same journey would be a fair bit quicker on croco-puzzle, though I don't know if we'll ever have someone to test this out on.

While I would be reluctant to award nominal titles based on croco-puzzle achievement alone (for instance, you can't get titles through playing online chess!) I would be inclined, purely for the sake of argument, to roughly equate GM with 6th dan and IM with 5th dan. (For comparison, you need a chess rating of 2500 to reach GM and one of 2400 to reach IM, which would equate to one dan level below in each case, but you additionally need "norms" - a series of results in long tournaments against strong, international opposition - at a 2600 level of performance for the GM title and a 2500 level of performance for the IM title.)
(Anonymous) on December 15th, 2010 07:07 pm (UTC)
Re: score and ranking
"I don't know if we'll ever have someone to test this out on"

flooser started solving puzzles in spring/summer 2008. On CrocoPuzzle he needed about a year to reach 5D level. From his "Tagesrating" (daily rating) you can see that his skills were already at that level in May 2009.

But I also know Go players who only needed one year to reach dan level.
Teesside Snog Monster: msojiggery_pokery on December 15th, 2010 07:19 pm (UTC)
Re: score and ranking
Interesting data points, thanks.

You know some seriously brilliant people!