
17 January 2010 @ 03:43 pm
What If...? #7 - What if I wrote the puzzles? - SudokuCup Edition  
First, as a bonus puzzle, here is a hard classic I wrote for the Sudoku Cup that was initially puzzle three but which was cut after testing to introduce a much easier classic.

3:51, same pattern as one of my Silicon Valley Puzzles last year that I really like.

What if I wrote the puzzles? Well, this weekend we got the first solid test of that with the 3rd SudokuCup. I thought I'd share some details of the experience and repost all the puzzles, now with their titles which were left off from the competition (so as to not be too clueful or introduce aspects that would not be language-neutral).

I've wanted to put together an online competition for the international community for a long time. Unfortunately, I've never had a good infrastructure in place to run such an event, and the US itself hardly runs a qualification for its sudoku team (although in the past some of my puzzles have been used informally in this process). Having written for the Czech National Championship last year, I was asked last summer if I wanted to construct for the next SudokuCup. The timing of this event in January was (and still is) inconvenient for me and a lot of my US friends who are doing the MIT Mystery Hunt, although I quit the Hunt cold turkey this year, and even stopped picking up coins this weekend so as to not accidentally win the Hunt whilst in Palo Alto. Still, with the established competition base for the first two SudokuCups in the hundreds, with the existing infrastructure to run a 2-hour competition, and with the solver-friendly 2-day window* to compete, writing for the SudokuCup as my first truly broad international competition was an attractive choice. Soon after the WPC in Antalya, even though I was burned out by puzzles for the year, I agreed to take on this task.

*(Of course, the long 2-day window also allows a huge potential for cheating by collaboration or multiple log-in accounts, added to the standard list of available online cheating methods such as the use of computer solvers for classic puzzles and other variants. However, I am not terrifically concerned about cheaters on a test with no meaningful prizes, so they are just the fools they want to be if they go about the process this way. Still, watching the scores come in, I certainly saw some huge irregularities. For example, you can see a group that certainly seemed to work together if you look at some submissions with ridiculous times for some Indian solvers [45th-50th] with identical scores and identical mistakes. This is the simplest kind of fraud to detect, since incorrect entries should not be highly correlated, nor should 6 solvers from one country so perfectly overlap in the score spreadsheet. One can identify and deal with a certain number of these examples if one really wants to, but let's just say I won't naively claim cheating couldn't happen here by some people exploiting loopholes in the process.)
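To make the "identical mistakes" check concrete, here is a minimal Python sketch. The data layout (one answer string per puzzle per solver), the function name, and the `min_size` threshold are all assumptions for illustration; this is not how the SudokuCup results are actually stored or reviewed.

```python
from collections import defaultdict


def suspicious_groups(submissions, answer_key, min_size=3):
    """Group solvers whose sets of wrong entries are exactly identical.

    submissions: dict mapping solver name -> {puzzle_id: submitted answer}
    answer_key:  dict mapping puzzle_id -> correct answer
    (Both layouts are hypothetical.)
    """
    groups = defaultdict(list)
    for solver, answers in submissions.items():
        # Keep only the incorrect entries; correct ones are expected to match.
        wrong = tuple(sorted(
            (pid, ans) for pid, ans in answers.items()
            if ans != answer_key.get(pid)
        ))
        if wrong:  # a clean sheet carries no signal
            groups[wrong].append(solver)
    # Independent solvers rarely make the very same wrong entries.
    return [sorted(names) for names in groups.values() if len(names) >= min_size]
```

A flagged group is only a prompt for a closer look at times and scores, not proof of anything by itself.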

My goals for the SudokuCup were simple at the start: make a test that would probably take the top solver 95-105 minutes, with perhaps 5 people in total finishing. This is a lower rate of success than Jan Novotny's SudokuCup, but I felt it was the right average of the past two competitions to shoot for. I wanted to construct primarily my puzzle types in my style - visually and logically elegant - and I wanted to make good examples that were similar in difficulty (since many competitions fail to have good examples). It struck me that the way to get the best possible puzzles was probably to write more than one for each type and have a choice of puzzles, and so in general, except when I was facing my own time pressure of finishing everything before Christmas, I made 2 new puzzles in every style and selected the best one for the test and used the other for the example. I was hoping the example puzzles would stand out as a reason to do the test, as they would be very good puzzles in their own right. I loved the reaction I got with my first Friday Puzzle Arrow Sudoku Example, where the discarded puzzle showed that something special was going to come (as nickbaxter put it: [I]f this is a "reject", then the competition is going to be pretty awesome.)

For the 15 puzzles in the competition I wanted a good mix of variants (some varied geometries, some involving math or properties of the numbers, ...), and since my two recent sudoku variants books themselves offer a good mix by design, I started by just choosing my favorites from those books. I wrote exclusively themed puzzles using either complete hand-crafting or computer-assisted construction depending on the type. Frequent solvers will know there are a lot of themes I use a lot (like odd or even only puzzles, Easy as 1,2,3 as a theme, ...), and many of these came up again. Amazingly, I did not make any smiley face puzzles :).

Near the end of the process, I was reminded by Karel that he liked having a "Surprise" puzzle and on a walk one day the correct design for this struck me. I did not really have any puzzle that used "shared group"-thinking as I've featured in Battleship Sudoku and in the Shape Sudoku section of Mutant Sudoku. I conceived of a simple (and hopefully obvious without rules) concept of using symbols to mark pairs and quadruples of cells. The rule would simply be: "Every time you see a symbol, the same 2 or 4 digits must occur in some order in the connected cells." With a good idea in hand, I still needed a nice theme for the design. The concept of giving just the top row in 1-n order was terrifically compelling to achieve the right initial level of shock/awe but also immediately draw attention to the symbols and how they appear in the solution since there would be absolutely no useful sudoku steps until you cracked the new rule and could propagate digits into other rows and boxes.

With 15 puzzles in hand, it was off to the testers right after Christmas. Two things surprised me as the results came back. First, the classics were taking test-solvers much, much longer than I expected. I guess in the last year I've gotten much better at classics, but here I considered my times a good unspoiled comparison, and my usual testers like Wei-Hwa, who are say 50% slower, were over 2-fold slower. Karel had his own set of testers and they also were slower than expected on the classics. While I hated to see it go, on Karel's advice I replaced the hardest classic (above) with a very easy classic (puzzle 1), which shaved several minutes off the test and added a friendlier classic sudoku to the mix for average solvers to enjoy. The second surprise was that total times over all the puzzles seemed quite high, and my goal of getting the podium finishers to complete the test in time seemed in doubt when I saw how long it was taking some talented test-solvers. So, we made a couple of other edits. The Surprise Sudoku in particular went through a round of revisions, not to fundamentally change the concept but rather to simplify it so ALL pairs/quadruples that could be marked were marked. Although I never intended this "if and only if" property to be necessary, or even useful, to the solution of the puzzle itself, applying it made it even easier to identify the pair/quadruple rule in the example. Also, the Surprise puzzle was rewritten to take less time than it initially did. With these changes, it seemed more reasonable that a handful of solvers would be able to finish the test cleanly and we were ready to go (or so I thought) in early January.

Then it was a lot of waiting for the official test booklet to finally arrive (first in Czech, then in English). We debated some answer entry rows a little bit but had everything in place, I thought, by the beginning of this past week. Still, the test booklet took longer to be finalized and get online than expected, and I had a few anxious solvers commenting over here that it wasn't available, but it was up hours before the start of the test. Also, there were those odd promotional videos posted Friday morning when I woke up expecting the booklet instead. Suffice it to say, I don't think intimidating potential competitors with Jakub Ondrousek's well-established classic solving speed is a good way to advertise a competition, particularly by using ridiculously simple, asymmetric, bland, computer-generated sudoku for a competition that has nothing of the sort. I could have just made my #3 Trophy Classic much simpler if asked, since that is a really nice design (a sudoku cup for the SudokuCup), but I wanted my example to match the average solve times and methods of the puzzles I actually used so competitors could better gauge the challenges coming.

Finally, the competition started, and I awaited some news on the test. The first comment I got back on my blog was titled "Wonderful Sudoku Competition and Site Failure" and I panicked for a bit. Had the server crashed? I eventually saw the commentary on the site related to this one solver and realized it was possibly resolved and not the end of the world although at least a few people had run into some apparent problems. As the administration of the test was the only thing really out of my control, I was reassured when I finally was given a link to a live results page that showed the scores as they came in throughout the weekend and saw reasonable results for the expected "favorites". This also gave my first confirmation that some solvers could finish all the puzzles in two hours. Jan Mrozowski ended up, again, a clear champion of the SudokuCup and I extend my congratulations to all the top finishers.

So, with the story of the steps of making the championship now told, let's get to the real meat of any competition, which is the puzzles themselves. I've reposted them all below, along with my times to solve them and some commentary on their construction (and potentially some spoilers on their solution). The final version of the SudokuCup would have taken me about 70 minutes to finish, but many of my times on puzzles will be inaccurate as even weeks after constructing them I remembered some things about them, although this may make me slower or faster. In particular cases, such as the Outside Sudoku where I knew the middle was where to begin, I spent a lot of time trying not to use the middle to test that experience, and so maybe in the end my total time is within 10-20% of what my time could be, even having to deal with printing and answer-submission.

1:20 - An obvious 32-given design (my improvement maybe on the arrangement of the Goes to 11 puzzle as it leaves 2 rows/2 columns/1 region open but has more footholds) where I selected a very very easy version after getting the two identical 12345678 blocks in place.

2:30 - I like how the different arrangements of triples in this grid really control the solving flow. It's certainly a pattern that almost always solves this exact way whatever the givens.

3:00 - This design, inspired I guess by all the D4 Heine puzzles seen before Philly, was set up to have the kind of box 2/4/6/8 pointing pair logic that all those puzzles run into as well.

1:34 - I like doing all odd or all even designs on variations that allow them. Here, some obvious starting points to get a 2 and some 8s were enough to get the whole design to work.

5:12 - I really liked one of Wei-Hwa's designs in Mutant Sudoku called Waves. It used the gray space to demark sets of white cells, and then put givens in isolated regions of the puzzle. Here, I attempted the same kind of connection between pattern and givens which worked out ok.

4:54 - This was the only "I can't sleep until I get this done" moment of the design process. I really wanted a Sudo-Kurve in the test but I did not think I'd explored enough atypical/original designs when working on them for Sudoku Masterpieces. As I was tossing in bed, unable to fall asleep one night as my mind was too busy imagining grids that could exist, I devised what I thought was a 9 box arrangement with some twisted corners that could work. It took an hour of sketching to get the right grid shape, then some time in Illustrator to make a practice grid to construct from, then over an hour to fill the grid to prove its validity and select givens to leave a good puzzle, then I tested it, and tried to sleep again. During testing, I had learned that there were a lot of constrained parts of the grid that included forced repetition in limited places of some digits which were fun discoveries but would be spoiled by seeing the grid/example beforehand. So, as I tried to get back to sleep, my mind was now racing with how to give an example puzzle that wouldn't spoil the new grid. I eventually just got back out of bed and came up with a simplification of the new grid that was cross-shaped but which would work to teach the desired concept with some extreme row to column to row wrapping. This also took another hour and a half to make just right. Instead of going to bed at 11 as intended, the compulsion to get this design just right meant I was up until after 5 AM. It's been years since solving a puzzle/book of puzzles has kept me up in this way, but constructing puzzles can do it all the time. This should suggest where my passions really lie at the moment.

Regarding the puzzle and not the process, I really like the challenge presented here by the twisted classic grid. I should really work on making some tools for creating/validating Sudo-Kurve puzzles as they are fun to solve but often hard to construct without confirmation you haven't over-constrained certain cells.

3:00 - I really like 3D sudoku, but I find the "standard" 2x4 rectangle region version to be rather easy because of the implied 2x2 opposite corners of each face identities. So, recently I've been making a fair number of these 3D puzzles with different region shapes. This F shape came about while I was again unable to fall asleep, having just gotten back home to Buffalo for the holidays and searching in my mind for some new region shapes that would be good to explore. Fortunately, I was able to fall asleep once I imagined the F grid which was obviously a good choice, and I constructed the puzzle the next morning. I much prefer when this happens (compared to the Sudo-Kurve case).

6:39 - I like Vlad's isosudoku type a lot as a fairly simple twist on classic sudoku that adds in many partial diagonal constraints (it could easily be presented on a square grid but I like the hexagons more). Still, I haven't seen a lot of visually stunning examples of this type and so I worked on some ideas that I thought would pop really well when constructing this. The diamond, which matches the outer grid shape, ended up perfect, if a bit tough.

7:00 - Some part of me really likes making Outside Sudoku without many/any intersecting clues. I also love putting a naked single in the exact center of the puzzle. This Outside does both things, and is best solved (as it would have been titled) from the Inside-Out, which is atypical for Outside Sudoku in general but very typical for my versions.

5:30 - I've struggled mightily with getting good graphical themes in consecutive puzzles. I've never been able to do them entirely by hand, as the implied non-consecutive elsewhere constraint is a really strong one (for example, one can make valid, if not humanly solvable, non-consecutives with ~7 givens). Here, I really wanted a puzzle with a +/- 1 theme since that is the consecutive constraint at its heart. I couldn't do it without 3 extra bars in the bottom, but I think the eventual puzzle with the break-ins from low digits, works rather well.
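Because every missing bar is itself a clue, a finished grid has to agree with the bars in both directions. A rough Python sketch of that two-way check is below; the grid and bar representations and the function name are assumptions for illustration, not the tooling I actually used.

```python
def bars_match_solution(grid, bars):
    """Return True if bars mark exactly the consecutive neighbours in grid.

    grid: square list of lists of ints (a completed solution)
    bars: set of frozenset({(r, c), (r2, c2)}) over orthogonally adjacent cells
    (Both representations are hypothetical.)
    """
    n = len(grid)
    for r in range(n):
        for c in range(n):
            # Look right and down so each adjacent pair is visited once.
            for r2, c2 in ((r, c + 1), (r + 1, c)):
                if r2 >= n or c2 >= n:
                    continue
                consecutive = abs(grid[r][c] - grid[r2][c2]) == 1
                marked = frozenset({(r, c), (r2, c2)}) in bars
                # A bar on a non-consecutive pair, or a missing bar on a
                # consecutive pair, both break the puzzle.
                if consecutive != marked:
                    return False
    return True
```

Passing an empty set of bars turns this into a plain non-consecutive check, which is why so few givens are needed in that type.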

7:07 - I came up with the idea to do another sort of parity theme here, made very possible by it being a nonconsecutive puzzle where a subset of evens touching odds is the entire relevant constraint space. A symmetric arrangement of odds and evens on opposite sides of the grid ended up working quite well, and careful consideration of the numbers for the very top/bottom ensured a solid work-in for rows 3 and 7 and then a slow but enjoyable solving process from then on.

6:33 - I talked about this already, but I hope I succeeded (unlike some of the "Instructionless" puzzles in Antalya) in obeying the concept of having a new gimmick that is explainable in one sentence, and certainly in one clear image, without ambiguity.

5:17 - Here is my over-used "Easy as 1,2,3" theme repeated in Thermo form. I've made all of my Thermo-Sudoku to now by hand (such as All Smiles and Downward Spiral in Mutant Sudoku, my favorites in part for ending up clueless) but after finishing my half of the section in Mutant Sudoku I saw Wei-Hwa's construction tools for the type and decided I'd experiment with them a bit. This puzzle was a success of combining hand-construction with computation. I wanted the thermo 1/2/3 theme, and saw how to pin digits in the lower-right to get a great work-in. However, the software helped me get the puzzle absolutely polished to just have a single instance of each number as givens. Testing this design revealed it worked really well.
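For anyone new to the type, the Thermo rule is that digits strictly increase from the bulb of each thermometer to its tip. A minimal Python sketch of that one check follows; the representation (a thermometer as a list of cells, bulb first) and the function name are assumptions for illustration, and this is not Wei-Hwa's construction software.

```python
def thermo_ok(grid, thermo):
    """Return True if digits strictly increase along the thermometer.

    grid:   list of lists of ints (a completed solution)
    thermo: list of (row, col) cells, ordered from bulb to tip
    (Both representations are hypothetical.)
    """
    values = [grid[r][c] for r, c in thermo]
    # Compare each cell with the next one along the thermometer.
    return all(a < b for a, b in zip(values, values[1:]))
```

A construction tool would run something like this for every thermometer inside a solver that also counts solutions; the sketch covers only the rule itself.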

5:25 - I've mentioned Arrow is my favorite variant right now, although I've seen no one else trying to make visual themes with them. I had two great unexplored concepts before starting work on this contest. First was the Downward Spiral (which became the example, as it took me over 8 minutes to solve and ended up too hard in my opinion) and then this one, which involved several different boxes using just diagonal lines. The work-ins on the left-side for a 9 from 123+12 and then another 9 from what must be 4+32 in the upper-left worked really well together and then with the bottom. While this puzzle is still a bit hard and probably worth every one of the 28 points it got, I felt this was a clear highlight of the test (and have heard the same from other test-solvers).

5:50 - I like Vlad's style here as well to let me do English-language themes in Sudoku (even if I've not seen the letters exploited to this extent ever). However, writing for an international audience meant I couldn't do some of the things I did in Sudoku Masterpieces with English sentences or words. I thought I'd be most playful in the example, which I'd title "Self-Referential" which said "Ex. For This One" which it certainly was. I also wrote the instructions to subtly state that the English numbers ONE to NINE will appear in the competition puzzle. Indeed, my "language-neutral" theme had each of ONE to NINE appearing in the competition puzzle, one per row. I'd played with various 1-9 puzzles for Masterpieces, eventually using an interlocking criss-cross of the digits in the book, but I came very close to accomplishing the theme as shown here back in April. So, I revisited it for this test. While I could never get it to work without any givens in the grid besides the words, it became clear when I saw how a single extra 9, in the row that already had the word NINE, would leave one solution that this was THE way to go and gave an unintended extra ripple to the theme. Two nines in one row?!? Really good solving flow for a puzzle with an internationally accessible theme.

So there you have it. A solid answer to the question "What if I wrote the puzzles?" I think my style of design adapted just fine for an international competition, and I welcome any and all comments on the puzzles and the format since I will certainly run some of these again, but probably on my own web-forum once I put it together.
motris on January 22nd, 2010 03:58 pm (UTC)
Re: Cheaters and server..
Puzzler Media does an online qualifier with classic puzzles and variants (although pretty tame ones like diagonal, odd/even, killer, and jigsaw). The Times Championship just uses classic puzzles and one can argue if that is the best qualifier for the puzzles that show up on a WSC. Also, the two publishers are not connected so there is no clear reason why the Times winner would go to the WSC although this consideration was given the first year when the various newspaper champions arrived in Lucca (at least this is my recollection).
(Anonymous) on January 24th, 2010 12:53 am (UTC)
Re: Cheaters and server..
Well, that's not totally accurate; classic puzzles from the Puzzler Media generator are the ones currently published in The Times (this hasn't always been the case).

The Times are the only people organising any sort of UK sudoku tournament these days, so it's fair to label theirs as *the* national championship. As motris says, it's classics only, and so some people who are pretty quick at classics only can excel without being good at the sort of broader spectrum of variants required to do well at a WSC. Still, I look on enviously at the US championship (also classics only) winner, who does get a spot on the US team.

The Puzzler qualifications are generally kept quiet to try and keep them as UK only - I presume because they can't be bothered to filter out international participation. As motris rightly points out again, though, it's not the most original/fun test going; most of the variants used on it are there purely because Puzzler have a generator for them.

I could (and indeed have in the past) say an awful lot more about what I think about their WSC qualifiers - I'll restrain myself and only comment that I think there's plenty of evidence showing they've failed to pick the strongest UK team available in previous years.