
21 January 2013 @ 02:40 pm
Too Big to Solve?  
Not my tagline, but a good description for the Mystery Hunt that just happened. One line of dialogue after last year's Hunt that I led with in my wrap-up was a question of when is too soon for a Hunt to end. I said, in this era of a few competitive teams trying to grow to get over the winning hurdle, constructors aiming bigger was a mistake. The Hunt ending after 36 hours (Midnight Saturday) is fine if that makes the solving experience stretch over the weekend for everyone else. I won't comment generally on this year's effort but it seems a great example to point back to of too much ambition by too many people towards the further militarization of the size of Hunt so that by 2025 the team "The whole of new USA" can go after the coin against "USSReunited" for at least a month. The sense of "puzzle" versus "grindy work" is also a discussion I have every year and I don't choose to repeat myself. I've felt since 2008 that the Mystery Hunt is far from an event I'd regularly attend in person although I'm glad to have finally been onsite to play with Team Luck with whom I've been a "free agent" now for three years.

I had a good solving year, relatively speaking, but it was mostly demoralizing personally. I soloed Palmer's Portals, for example, but spent many hours after basically solving 8/10ths with a need to tweak a very small and underconstrained set of things to get from that hard work state to a finished state. At some stage I told the team "I'm going to solve Portals and the Feynman meta and then go sleep" and I met this goal, but in many times the time I expected when I made the statement. I led the solve of both Danny Ocean (with zebraboy stating the most necessary last bit to get my work over the cliff) and Richard Feynman (with Jasters).

I obviously co-solved lots of the logic puzzles and other puzzles, and gave various finishing help to a range of things too. I think I did this best for "Kid Crossword" once when he had spent a lot of time mastering the hard steps of a crossword/scrabble puzzle -- and could quite impressively fast rewrite out the set of steps I wanted him to do about the puzzle -- and the follow-up steps were not obvious but I led the killing of the beast. This was too often the feel for these puzzles, and my assassination rate was far lower than I wanted.

My Sunday was spent earning 3 puzzle answers by actually going to an event, and then falsely believing the power to buy some answers would let me finish solving the Indiana Jones mini-metas -- where I had already mostly soloed Adventure 2's snakes with 5/8 answers, but then killed myself dead on #1/Ouroboros for the rest of the day, spending so long solving, as many solvers will say in hindsight, the puzzle it was meant to be, in one of a dozen ways, and not the puzzle it was.

Let me state here, as I did for hours with my team, that the phrase "I'm not cut out for this" is horrible flavor. It implies both "cut this out" and, in a different way, "don't cut this out." This makes you want to cut it out, which takes a lot of time, but also to not invest too much time in cutting it out, so as to save the wasted time of doing a task you are being told not to do. Other wordings are far safer, and implied negatives within positives are one of the five worst flavor failure modes in my opinion. Puzzle editing and flavor text is an art and is certainly the biggest variable from year to year and constructing team to constructing team.

So yeah, Mystery Hunt happened. And there were the usual share of overwhelmingly incredible Aha moments. Endgame seemed very fun and I wish all teams could do just that for the weekend or at least a lot more things like that. More of that, and more sleep, would both have been good choices this year. If only the puzzles had been solved on schedule.

ETA: And as I added far below around comment #300, as a solver who was both frustrated yet had fun in this Hunt, I do want to thank everyone on Sages for the incredible effort they put in. Making a Mystery Hunt is a gift for all solvers whether it matches expectations or not, and as a mostly thankless job I do want the constructors and editors and software engineers and graphic designers and cooks and phone center workers and everyone else to know I appreciated all you did over the last weekend to give us several days together for puzzling.

Further, as I was asked to write a larger piece elsewhere that has given me personally a lot more attention as the face of the criticism, and as I use the phrase "My team" a lot in general as solving forms this kind of bond, I want to be very clear: since Bombers broke up after 2009 I have been a free agent. I have solved recently with Team Luck but am not a core part of their leadership and these opinions I state are my own. I intend to form my own team next year to go after the coin again, and if you have a problem with what I have said anywhere on the internets, please hate me for it. I believe in my posts I have been offering constructive criticism, but even what I have said is without all the facts of what went on inside Sages so I could easily be speaking from ignorance a lot of the time.

EFTA: Thanks to tablesaw for pointing out this chronologic feature of posts. If you want to see all the additions to this post in time sorted order, go here http://motris.livejournal.com/181790.html?view=flat. We're on page 14 at the moment.
Ali Lloyd on January 24th, 2013 02:23 pm (UTC)
The one with 263 MP3 files was actually really fun. A couple of us testsolved it over Christmas. Despite reservations about the length, I thought that if a reasonable number of people on a team listened to what was essentially 2x10 minutes of well known music, almost all of it would be ID'd in no time. The answer was also gettable with about 75% of the information, possibly less.

However, the fact that it appeared in the final round of this hunt was obviously something of a death blow.
Gemini6Ice (gemini6ice) on January 24th, 2013 03:54 pm (UTC)
I thought that if a reasonable number of people on a team listened to what was essentially 2x10 minutes of well known music

I think you're absolutely right for large teams. A large team the size of Codex or Manic Sages can certainly get a dozen people to listen to these clips.

But a 5-person team simply doesn't have this bandwidth :( Even my team, in the 40-person range, could get only about three people on it.
Ali Lloyd on January 24th, 2013 04:28 pm (UTC)
Yes, I see what you mean. And I was proved wrong anyway, since I'm not aware of any teams (even larger ones) having solved it.
AJD (dr_whom) on January 24th, 2013 05:15 pm (UTC)
I mean, even if it wouldn't actually take that long to listen to all of the samples, there's a bit of a "screw that, man" moment when you open up the file and see that there's 263 clips to listen to. It just looks like a lot of clues to solve.
jcberk on January 24th, 2013 06:35 pm (UTC)
"A couple of us testsolved it over Christmas" is unfortunately a really bad metric for what will be fun during Hunt. Spare time over Christmas without many other pressing obligations is different from "I could do this essentially boring/frustrating identification task for 10 hours or I could work on three other puzzles during that time." People found the IDs difficult with three seconds of two overlapped songs.
Ali Lloyd on January 24th, 2013 06:58 pm (UTC)
Well if you already think it's a boring / frustrating task, then I have no argument.
lunchboy on January 24th, 2013 10:44 pm (UTC)
I thought it looked like a boring, frustrating puzzle, and so I didn't work on it, which in retrospect was a good decision, I think. 200+ discrete research tasks, even if they're small ones (and if they're not small, cripes!), is just a lot of grunt work. Do I come to mystery hunt to sit at computers and do menial research, or do I come to have insights and solve puzzles? But a manageable amount of research is a different thing, and the same idea, scaled down to a much smaller grid, could have worked quite nicely.
Ali Lloyd on January 24th, 2013 11:17 pm (UTC)
Like I say, I was proved wrong. I just wanted to remark that I had a lot of fun solving it, and take some responsibility for its appearance in the hunt given that my feedback was pretty overwhelmingly positive.
AJD (dr_whom) on January 25th, 2013 12:00 am (UTC)
I think lunchboy's point and mine, though, isn't necessarily that it was a boring puzzle—just that it looked like a boring puzzle, which was enough to stop us from working on it. As a testsolver, you were more obliged to work on it than we were, and therefore less likely to be stopped by appearances than actual Hunt solvers were.
Thouis R. Jones on January 25th, 2013 03:01 am (UTC)
Was this perchance a case of giving a puzzle for testsolving to someone it was well-suited to on the constructing team? We usually tried to give it to the second best when testsolving on Setec. (That might be something like 5th or 10th best on a team the size of Sages.)
Ali Lloyd on January 25th, 2013 08:41 am (UTC)
It certainly didn't happen deliberately (Sages, on the whole, don't know me at all), but I don't deny I turned out to be one of the most suited to it.
Dr. C. Scott Ananian (cananian) on January 24th, 2013 05:35 pm (UTC)
Codex got a bunch of people to listen to the clips, but still didn't come close to 100% identification. (Perhaps Sages are most likely to identify music which is familiar to Sages?) And filling out a diagramless crossword with only tentative identifications for most of the entries was attempted without success.

Note that the puzzle also had the disadvantage of combining two very different skill sets. The crossword-ers didn't want to come near the puzzle since it was a music ID puzzle, and the music ID people generally didn't have a clue how to construct a diagramless crossword.

If the puzzle had been edited more tightly (clearer IDs, confirmation mechanisms such as listing clips in alphabetical order or in left-to-right order as they appear in the crossword), this could have been a fine puzzle. Note that the puzzle was also spoiled by the title change, which gave away the whole "diagramless crossword of music" aha. After that, there wasn't much doubt about what to do with the puzzle -- or interest in doing it.
Ali Lloyd on January 24th, 2013 06:39 pm (UTC)
There's definitely no 'hive mind' working here - I am remote and have never actually met any Sages in person. Perhaps me and Tom just happened to be the only people with both skill sets. In any case, they are almost disjoint parts of the puzzle.

I suppose alphabetical order by across song would have worked. I guess all I'm saying is that I really enjoyed solving the puzzle.
Tom Yue (yuethomas) on January 25th, 2013 05:00 am (UTC)
Keep in mind it also wasn't much of a crossword. Sure you had to identify songs, but once you had a couple down, you began to notice that the number of clips matched the number of letters in the songs. I spent the longest time debating to myself whether it was an American or a British-style crossword (blocks vs. lines).

FWIW, the songs were almost overwhelmingly NOT post 2000 pop music.

I won't debate the individual merits of this puzzle, but the fact that it was placed in the last round was most likely what killed it.

Doesn't matter though; I had a good time solving it.
Jenny Ghahathor on January 25th, 2013 01:46 am (UTC)
What does "Testsolved it over Christmas" mean? Does it mean you locked yourself in a room and didn't come out till it was solved, or do you mean that over a week-long Christmas break, some group of you were able to identify all the clips and solve the puzzle? Because if it's the latter, well, yes, that could be a fun puzzle, but not necessarily for hunt.
Ali Lloyd on January 25th, 2013 09:43 am (UTC)
It means 2 days, Christmas Eve and Christmas Day, IIRC. So neither of your options, but closer to the first one. yuethomas and I (and a third person towards the end) identified 90% of the clips. I reasoned that if we could do it in 2 days, a larger group could do it much more quickly, since the most time-consuming element is identifying and/or putting together clips of songs that you don't know.
Ali Lloyd on January 25th, 2013 09:45 am (UTC)
But yes, I absolutely take the point that the circumstances are completely different, not least because it was something like puzzle #140 in an already too long hunt.
Dr. C. Scott Ananian (cananian) on January 25th, 2013 05:56 pm (UTC)
I think it's fair to say that on most teams they had more like "two people (who'd have to work for two straight days)" on the puzzle, than "a hundred people". Codex probably had four people working on the puzzle at max. I think there was some overestimation of the amount of parallelism happening during the hunt. Sure, some teams have 100+ people, but some large fraction of them are asleep/eating/offline/stuck on some other killer puzzle and the rest of them are spread thinly among a large number of open puzzles.
(Anonymous) on January 25th, 2013 09:02 pm (UTC)
I wonder if it's time for Mystery Hunt to steal the "have a couple teams test run the full event, start to finish, in as close to real circumstances as possible" betas from (shorter, smaller-team, less parallelized) West Coast events. No idea how you'd get teams to give up Hunt proper, or collect feedback for different puzzles when dozens are being solved at once, but I've found them extremely valuable to get an idea for how things will actually work in the event proper. Plus, it gives you an incentive to get things done early.

-gfpuzz/@gfilpus (Don't have access to my LJ creds at work)
noahspuzzlelj on January 25th, 2013 09:17 pm (UTC)
There are two major problems with this excellent suggestion.

First, as far as I know no team has ever had a completed ready-to-run hunt in place even the weekend before it runs. Many (most?) writing teams are still doing non-trivial things during hunt. Remember that for West Coast events you get to pick your own date, and you can even do much of the writing before you pick a date. A year just isn't enough time to write a Mystery Hunt, so even taking a week off is really tough.

Second, recruiting a full testsolving team is very hard. You need 40 people for a whole weekend who are willing to skip mystery hunt in order to be the testsolving team. Furthermore, odds are good that even if you did recruit such a team they wouldn't be able to finish hunt in a weekend unless you wrote a dramatically shorter hunt than solvers would be happy with.

I think this is a huge problem with MH, and part of why I'm not sure I want to ever write another MH (as opposed to a different puzzle event). But there are reasons why it's really really hard to do in a MH context.
Andrew (brokenwndw) on January 25th, 2013 09:36 pm (UTC)
I think that forcing the core hunt to be fully ready seven days in advance is a perfectly fine side effect. We could certainly have used smoke testing on our tech infrastructure and release scheme!

I think the only workable solution to recruitment would be to establish the tradition that the 2012 winners test the 2014 hunt and so forth, having demonstrated that they should be able to finish the hunt in a weekend. I imagine most people aren't ready to win again even two years out. But I don't know even then if you'd be able to keep a team together well enough, and motivate them to solve a hunt without the big spectacular event to drive it.

On the last point: I for one hope you do write another Mystery Hunt, some time soon even. Plant has been responsible for both of my favorite hunts to date.

Edit: and yes, I think the hunt *could* be ready seven days in advance. The pace of work accelerates enormously as the deadline approaches, so as long as people took the one-week-before deadline seriously you would really be losing the week from the beginning, not the end, which is much less daunting.

Edited at 2013-01-25 09:40 pm (UTC)
noahspuzzlelj on January 25th, 2013 09:46 pm (UTC)
The other possible solution for recruitment which I'd thought of (which also has some problems) would be to do a Bay Area testsolve. Since remote solving is in some ways less exciting, it might be easier to recruit if you're promising a more hunt-like experience than what people would get remote solving. There were years when I might have been willing to skip hunt to testsolve with a giant Bay Area team.

Speaking of having more than one team obligated to be involved, I have two totally radical suggestions that I like (but know no one will agree with):

Option 1: Hunt is written on a *two year cycle*. That is, if you win in year X then you're obligated to write the hunt in year X+2. (To get this started you'd have to have a year where the first place team wrote in year X+1 and the second place team wrote in year X+2.)

Option 2: Every year there are *two* mystery hunts (at Monopoly size) written by the top two teams from the previous year. This way writing hunt is half as hard (awesome!) and twice as many teams get to write (awesome!).
Peasant's Paladin (ppaladin) on January 25th, 2013 09:52 pm (UTC)
A two year cycle would not allow two years to work on the hunt. It would allow a year and a half for a team to lose momentum, then the same half a year of crunch time to write a hunt. People are motivated by deadlines -- in the Codex hunt, Andrew made a graph of doom that he would email out every day, tracking our progress towards the goal. Extrapolating out on the graph always showed our hunt prep going at a good clip towards a hunt sometime in February or March -- until we really started crunching at the end, and finished everything in time:). I don't think the experience is different for any other writing team.
Mikey G (emengee) on January 26th, 2013 09:06 pm (UTC)
I'm curious: have any of the previous constructing teams kept track of their progress throughout the year in a way that we could collaboratively come up with a list of deadlines that could be passed down for future constructing teams? Perhaps this information is already provided to the leadership of the constructing team each year, but if not, I think it would be helpful to have a calendar that shows when things usually get accomplished in a hunt that's mostly free of problems.

The tasks could even be broken down into leadership, tech, art, writers, editors, etc. For example:

January: Determine theme and structure (leadership with input from team), set up puzzle writing/managing software (tech), recruit and organize artistic talent (art).

February: Create metapuzzles and amass a list of answers (leadership with some writers), submit puzzle ideas/drafts (writers), assign editors to puzzle ideas (leadership), create drafts of artwork for metapuzzles and hunt website (art), talk to last year's writing team about any sorts of tech issues they had (tech).

Perhaps this would help keep teams from getting so far behind so early? Or remind teams of tasks they might be able to do early but not think to do? Or at the very least, this information would be interesting to see.
Scott Handelman on January 26th, 2013 10:39 pm (UTC)
Well, the schedule you describe is definitely a best-case scenario, at least as it went for Codex. Looking back, we didn't vote on the final theme until early March, we spent *months* writing metas and released answers to puzzles as each one was finalized. The Watson 2.0 meta couldn't even be written until every other meta had gone through testing, because who knows if we would have to change answers due to meta failure? No answers at all were assigned to puzzles until June, and no regular puzzle made it into post production until the end of August.

I think you can set a schedule all you want, but be prepared to throw it out the window.