21 January 2013 @ 02:40 pm
Too Big to Solve?  
Not my tagline, but a good description for the Mystery Hunt that just happened. One line of dialogue after last year's Hunt that I led with in my wrap-up was a question of when is too soon for a Hunt to end. I said, in this era of a few competitive teams trying to grow to get over the winning hurdle, constructors aiming bigger was a mistake. The Hunt ending after 36 hours (Midnight Saturday) is fine if that makes the solving experience stretch over the weekend for everyone else. I won't comment generally on this year's effort but it seems a great example to point back to of too much ambition by too many people towards the further militarization of the size of Hunt so that by 2025 the team "The whole of new USA" can go after the coin against "USSReunited" for at least a month. The sense of "puzzle" versus "grindy work" is also a discussion I have every year and I don't choose to repeat myself. I've felt since 2008 that the Mystery Hunt is far from an event I'd regularly attend in person although I'm glad to have finally been onsite to play with Team Luck with whom I've been a "free agent" now for three years.

I had a good solving year, relatively speaking, but it was mostly demoralizing personally. I soloed Palmer's Portals, for example, but spent many hours after basically solving 8/10ths with a need to tweak a very small and underconstrained set of things to get from that hard work state to a finished state. At some stage I told the team "I'm going to solve Portals and the Feynman meta and then go sleep" and I met this goal, but it took many times longer than I expected when I said it. I led the solve of both Danny Ocean (with zebraboy stating the most necessary last bit to get my work over the cliff) and Richard Feynman (with Jasters). I obviously co-solved lots of the logic puzzles and other puzzles, and gave various finishing help to a range of things too. I think I did this best for "Kid Crossword" once when he had spent a lot of time mastering the hard steps of a crossword/scrabble puzzle -- and could impressively quickly write out the set of steps I wanted him to do about the puzzle -- and the follow-up steps were not obvious but I led the killing of the beast. This was too often the feel for these puzzles, and my assassination rate was far lower than I wanted. My Sunday was spent earning 3 puzzle answers by actually going to an event, and then falsely believing the power to buy some answers would let me finish solving the Indiana Jones mini-metas -- where I had already mostly soloed Adventure 2's snakes with 5/8 answers, but then killed myself dead on #1/Ouroboros for the rest of the day for so long solving, as many solvers will say in hindsight, the puzzle that was meant to be in one of a dozen ways and not the puzzle it was. Let me state here as I did for hours with my team, the phrase "I'm not cut out for this" is horrible flavor. It implies both cut this out and, in a different way, also don't cut this out. This makes you want to cut it out, which takes a lot of time, but also to not invest too much time in cutting it out, so as to save the wasted time of doing a task you are being told not to do. Other wordings are far safer, and implied negatives within positives are one of the five worst flavor failure modes in my opinion. Puzzle editing and flavor text is an art and is certainly the biggest variable from year to year and constructing team to constructing team.

So yeah, Mystery Hunt happened. And there were the usual share of overwhelmingly incredible Aha moments. Endgame seemed very fun and I wish all teams could do just that for the weekend or at least a lot more things like that. More of that, and more sleep, would have both been some good choices this year. If only the puzzles solved on schedule.

ETA: And as I added far below around comment #300, as a solver who was both frustrated yet had fun in this Hunt, I do want to thank everyone on Sages for the incredible effort they put in. Making a Mystery Hunt is a gift for all solvers whether it matches expectations or not, and as a mostly thankless job I do want the constructors and editors and software engineers and graphic designers and cooks and phone center workers and everyone else to know I appreciated all you did over the last weekend to give us several days together for puzzling.

Further, as I was asked to write a larger piece elsewhere that has given me personally a lot more attention as the face of the criticism, and as I use the phrase "My team" a lot in general as solving forms this kind of bond, I want to be very clear: since Bombers broke up after 2009 I have been a free agent. I have solved recently with Team Luck but am not a core part of their leadership and these opinions I state are my own. I intend to form my own team next year to go after the coin again, and if you have a problem with what I have said anywhere on the internets, please hate me for it. I believe in my posts I have been offering constructive criticism, but even what I have said is without all the facts of what went on inside Sages so I could easily be speaking from ignorance a lot of the time.

EFTA: Thanks to tablesaw for pointing out this chronologic feature of posts. If you want to see all the additions to this post in time sorted order, go here http://motris.livejournal.com/181790.html?view=flat. We're on page 14 at the moment.
lunchboy on January 23rd, 2013 01:11 am (UTC)
I don't know, I think pretty much anyone could have told you a hunt with 170+ puzzles would be way too long. I don't know what the largest number of puzzles in a Hunt has been to date, but I want to say around 120? And some of the Hunts with that amount or fewer still ran long. Based on the one obstacle training puzzle my team did (the maze of guards), which I thought was excellent, I suspect there were other cool, fun live action puzzles which we -- and most teams -- never saw, and personally I think building such high, insurmountable walls around those puzzles did the people who wrote them a disservice.
Andrewbrokenwndw on January 23rd, 2013 04:42 am (UTC)
I did the math on this some time back. There is not necessarily a perfect correlation between end time and puzzle count, though, which is something I'm planning to write about soon.
Andrewbrokenwndw on January 23rd, 2013 04:47 am (UTC)
In reply to myself: Noah Snyder makes a truly prescient comment in that post, the very first one.

I'd also like to say (especially since people from sages might be reading) that first time hunt writers *should* be writing short hunts. I think 107 puzzles was already ambitious for a first time team (though certainly pulled off brilliantly). There's a great tradition (Setec00, Plant06, Midnight07, Luck10, Codex12) of short clean "warmup" hunts before you really go nuts for your next hunt. It's the right way to run a first time hunt, and you really shouldn't feel bad *at all* about ending on the early side.
ze top blurberry: driftingztbb on January 23rd, 2013 06:38 am (UTC)
This. What makes me saddest about what we saw this weekend is that the mistakes Sages made were mostly well-known pitfalls that anyone competent giving them construction advice would have advised them to avoid. I don't know if they didn't ask for advice, or didn't listen, but....

I will say that I thought the length of the individual puzzles was a bigger problem than the number of puzzles. A puzzle has to be truly special for me to want to spend 6+ hours on it. The sheer volume of puzzles with tons of work to do -- 263 (!!) audio clips to listen to, 50 MIT location photos, etc, etc -- made this hunt unpleasant for me. If you're going to ask solvers to spend six hours doing work on a puzzle, that work had better be intellectually challenging in some way, because we're there to spend our weekend being intellectually challenged. (A well-constructed logic puzzle can obviously fill that role, and I'm looking forward to trying Portals at some point.)

Once a solver has their aha! moment on a puzzle, they should be able to count on eventually being rewarded with an answer. Too many of the puzzles in this hunt did not do that for us -- either because the work after the aha! was too long and unrewarding for the solver to want to do, or the extraction didn't come out for us after hours of work, or hours of work and a successful extraction only led to a second layer of the puzzle rather than to an answer. At a certain point I stopped being interested in picking up new puzzles, because my trust in the constructor had been broken -- every puzzle seemed like a major time commitment that I could not expect would be worth the investment.

Only a very small number of teams come equipped with a hive mind that can tear through scutwork the way that Sages can.

A hunt with 170 beautiful puzzles that provided rewarding solve after rewarding solve and ran into Monday morning would have been just fine; this hunt was too much chaff and not enough wheat. I realize here that I'm being a bit harsh about something that a lot of good friends of mine (including some with whom I work very closely) spent the better part of a year working on, and I feel bad about that. But I think these things have to be said.
Andrewbrokenwndw on January 23rd, 2013 06:43 am (UTC)
I really don't want to get into recriminations. In short, yes, they had advice from us and even at least one member of Plant, and it's more or less exactly the advice you can see in that LJ thread. It is not necessarily the case that they ignored this advice; lots of things go wrong when writing a hunt, including things having to do with balancing what your team is like with what the hunt needs.
ze top blurberry: driftingztbb on January 23rd, 2013 06:59 am (UTC)
I am pretty sure I gave them some of the same advice, at least informally. Honestly I wouldn't have expected the advice even to be necessary. I've solved excellent puzzles written by many of the same people on previous occasions. A good number of them have written hunts before (including Derek, who was on Setec when we wrote Normalville). I know they know these principles. I can't begin to guess why or how they went out the window, and I agree with you that it's pointless to speculate here. I expected that this hunt would significantly surpass the community's expectations.
screwdestiny on January 23rd, 2013 09:40 pm (UTC)
I definitely agree that the individual puzzles were too long. This is my first time doing Mystery Hunt (I was with Team Palindrome) and while the variety of puzzles was great, a lot of them involved what felt like "busy work." I'm the novice of novices, so I'm not sure if that's normal. Even reading the solution for Bottom of the Top (which we were on the right track for precisely because of the title hint, but didn't have time to do) makes me wonder why exactly Manic Sages didn't realize logistically their hunt was too long:

"The title of the puzzle is intended to help solvers narrow down the list of potential songs to listen to. All songs in this puzzle were #1 hits on the Billboard Hot 100 during the 50 years of that list's existence, narrowing down the search space to only about 1000 songs. :-) On Saturday night of the Hunt, before this puzzle had been released to anyone, it became clear that another very long puzzle was... um... unnecessary, so I added a list of years that each of the songs could be found in, which further narrowed the search. Another helpful feature of the puzzle is that the songs were given in alphabetical order by title"

It really took until Saturday night to realize we didn't need a puzzle where after you'd gotten the gist, you had to listen to songs where every 1 out of 33 was one that you needed? It's not a puzzle at that point anymore, it's me listening to songs on Rhapsody, failing to adequately distinguish the bassline, and entertaining the notion of downloading them and running them through Audacity before deciding my efforts are better spent elsewhere.

The traveling puzzles were also big time commitments. With Highlight Reel, at least all the doors were inside and on campus. But the Cambridge Waldo and Square Routes puzzles, which required you to either know Cambridge like the back of your hand or travel around from the locations that you did know and hope you spotted the ones you didn't on the way (in the dark and/or cold, with a phone or laptop beside you), were logistical nightmares for our teams, and while we made a good faith effort to Google Map them (although I'm pretty sure the square dancing puzzle asked you to record things you couldn't see without actually being there), we'd pretty much decided it was just Too Much.

I expected to be super frustrated yet having a great time and while that was COMPLETELY TRUE (no jokes, I have so many wonderful memories), things like the above just stood out as "...really, though?"
lunchboy on January 23rd, 2013 07:03 am (UTC)
I still think the Producers hunt was pitched at the correct difficulty level overall. It may have ended too early from Sages' viewpoint, but they were five hours ahead of their nearest competition, and their status as an outlier had to do with their overwhelming numbers (which we didn't have control over). I think it's a mistake to write with the idea in mind of "let's write this so the fastest team finishes it at the exact moment we hope the Hunt will end" -- and it seems that's what Sages did this year, and, of course, then there was the problem that the team best suited to solve this Hunt in any sort of reasonable time frame was the only one not solving, because they were running the Hunt.
Andrewbrokenwndw on January 23rd, 2013 07:33 am (UTC)
Eh. I actually think they're a lot like Codex in terms of solving ability-- especially in terms of breadth vs. depth-- and thus not well positioned to handle a harder hunt. If anything they succeeded last time exactly because we gave them a lot of very thoroughly vetted, comparatively easy puzzles, letting them best apply their large amount of less Dankatzian labor.

Edit: And lest anyone think I mean this as a putdown, I definitely put myself in the non-Dankatzian camp. I'm just saying that I think the hard hunts are best for the small, concentrated teams.

Edited at 2013-01-23 07:36 am (UTC)
Dan Katz on January 23rd, 2013 10:01 pm (UTC)
I am flattered to be an adjective.
Dan Katz on January 23rd, 2013 10:00 pm (UTC)
We were the second team to finish Producers, which means it was shorter for us than it was for any other team except Sages, and I did not feel short-changed by the Hunt length. As Thomas said, I would rather have the best teams finish early and have more teams get to see the entire Hunt before the time-limit cutoff.
noahspuzzlelj on January 23rd, 2013 10:33 pm (UTC)
If enough people want hunt to go longer into Sunday, it'd be totally reasonable to experiment with a later announced time-limit cutoff. Personally, I think early Sunday afternoon is right (so that wrapup can be Sunday before dinner), but if a team wants to say in November (before plane tickets are bought) "We'll stay open until Midnight on Sunday and have wrap-up Monday morning" that'd be fine too. (Though there is a trade-off there because people who work on Monday will miss wrap-up and any other post hunt gatherings.)
noahspuzzlelj on January 23rd, 2013 07:30 pm (UTC)
A 170 puzzle hunt with roughly the same level of difficulty and quality as the past three hunts I'd expect to run between 48 and 60 hours (a naive estimate is 48, but you'd probably expect it to go longer as hunters got more tired), which in and of itself isn't a problem.

The main problem with writing 170 puzzles is that it means you don't get to cut as much, and so the quality goes down significantly and the difficulty goes up. It's already the case that usually the last 5-10 puzzles that get put into hunt have issues. To get to 170 you just have to put in lots of puzzles that you'd really rather cut. You also don't have the time to cut or rewrite puzzles that testsolvers thought were solvable but annoying. You have to use a lot of ok, but not great ideas. To get a quality product, you need to be able to leave a lot of work on the cutting room floor, and writing 170 puzzles in a year just doesn't give you enough material to cut.

Edited at 2013-01-23 07:51 pm (UTC)
(Deleted comment)
noahspuzzlelj on January 23rd, 2013 09:08 pm (UTC)
The last few hunts took roughly 36 hours for the fastest teams. I was just scaling up roughly.
 Catherinecmouse on January 23rd, 2013 11:31 pm (UTC)
I mean this is a huge question and believe me I have more models than you would believe on hunt length. This is an unbelievably complex question.

We (SagesHQ) all built models in deciding hunt length and I actually built solving models for every top or middle team. When we finish compiling and releasing the data you'll see that Death and Mayhem (Death from Above and Electric Mayhem) ended up nearly 200 people strong and super competitive. You can see the change in solving rates and projected solving rates from either team alone.

Puzzle solving rates have been increasing in the past few years and hunts have been shortening. Teams are getting bigger and *technology* is improving in a startlingly nonlinear rate. I know a lot of people wish that hunt wouldn't end in the early hours of Sunday (we targeted 3pm on Sunday). That's a big gap between when 2012's hunt ended and our target hunt time.
lunchboy on January 24th, 2013 12:09 am (UTC)
A 200-person team solved more puzzles than other teams and was in competition for finding the coin? This isn't altogether surprising. When you state: "Puzzle solving rates have been increasing in the past few years and hunts have been shortening. Teams are getting bigger and *technology* is improving in a startlingly nonlinear rate," you're buying into an assumption that the optimal team size is 150 or more people. Teams got bigger because they wanted a better chance of winning, and a bigger team meant (possibly) solving faster. If a large team has an inbuilt solving edge, I don't see that it follows that the Hunt has to escalate its length and difficulty to cater to the teams that increased their size. This screws smaller teams over two ways: they have less chance to win in the first place because they're up against megateams, and then future Hunts are not even approachable for them because they're so overwhelmed with puzzles.

(Also, the 2012 Hunt *did* end on Sunday afternoon. We did not stop running it just because Sages found the coin.)
Andrewbrokenwndw on January 24th, 2013 12:35 am (UTC)
What are you talking about? Our hunt ended at 3 PM on Sunday, just like the 2011 hunt!

...and to take my tongue slightly out of my cheek, what I mean here is that I really liked the transition in attitude, in 2011-2012, from "the hunt 'ends' when the coin is found" to "the hunt 'ends' when HQ closes, well after the coin is found". I want lots of teams to solve 90% of the puzzles and see endgame. I want even small teams to have a fighting chance at solving a full-strength round. And none of these things are possible if you assume the hunt has to last all weekend for the winners.

I would be curious as to how your models worked. I am slightly ashamed to admit that our models consisted of "well, our puzzles are similar in difficulty and cleanliness to the 2011 puzzles, and there are about 0.9x as many, so it should be 0.9x as long." But it did work! :-)
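
For anyone who wants to play with the back-of-the-envelope scaling described in this thread, here is a minimal Python sketch of that naive linear model. The function names are hypothetical and only illustrate the arithmetic. The 36-hour baseline, the 170-puzzle count, the 48-hour estimate, and the 0.9x ratio are the figures quoted in the comments above; the model deliberately ignores fatigue, team size, and puzzle difficulty, so treat it as a rough illustration rather than anyone's actual planning tool.

```python
def scaled_hours(baseline_hours: float, puzzle_ratio: float) -> float:
    """Naive linear model: hunt length scales in proportion to puzzle count.

    baseline_hours: how long a comparable earlier hunt took the fastest team.
    puzzle_ratio:   new puzzle count divided by the earlier hunt's count.
    """
    if baseline_hours <= 0 or puzzle_ratio <= 0:
        raise ValueError("baseline_hours and puzzle_ratio must be positive")
    return baseline_hours * puzzle_ratio


def implied_baseline_puzzles(
    baseline_hours: float, target_puzzles: float, target_hours: float
) -> float:
    """Invert the same model: which baseline puzzle count makes
    target_puzzles take target_hours, given baseline_hours?"""
    if min(baseline_hours, target_puzzles, target_hours) <= 0:
        raise ValueError("all inputs must be positive")
    return target_puzzles * baseline_hours / target_hours


if __name__ == "__main__":
    # "about 0.9x as many puzzles, so it should be 0.9x as long",
    # applied to a roughly 36-hour baseline (figure quoted in the thread).
    print(f"{scaled_hours(36, 0.9):.1f} hours")

    # A 170-puzzle hunt estimated at 48 hours from a 36-hour baseline
    # implies a baseline of this many puzzles under the linear model.
    print(f"{implied_baseline_puzzles(36, 170, 48):.1f} puzzles")
```

Under these assumptions the 48-hour figure corresponds to a baseline hunt of about 127.5 puzzles, which is in the same ballpark as the "around 120" guess earlier in the thread.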
zandperl: Huntzandperl on January 24th, 2013 01:43 am (UTC)
How do you define "middle team"?

FWIW the Hunt did end at 3pm Sunday for Grand Unified Theory of Love: one of our team goals is to have a spotless room after the Hunt, so to make sure everyone chips in on the cleanup we always start packing up at 3pm Sunday. A few individuals continued to work Sunday night and Monday morning, but it was not as a team.
Andrewbrokenwndw on January 23rd, 2013 09:08 pm (UTC)
I don't think that's what he means; the previous several hunts ran well under 48 hours for the winning team.
 Catherinecmouse on January 23rd, 2013 11:32 pm (UTC)
We produced around 400 puzzle ideas that were worked on in some form or another - many puzzles were cut at various stages of development.
Dave "Novalis" Turnernovalis on January 23rd, 2013 11:37 pm (UTC)
We (Codex) had 350 in the server.

Edited at 2013-01-24 04:59 am (UTC)
ze top blurberry: driftingztbb on January 24th, 2013 12:39 am (UTC)
That's very impressive, but with all due respect does not explain why some (many?) of the puzzles that were in the hunt were in the hunt; or at least, why they were in the hunt in the form that they were in.

Dr. C. Scott Ananiancananian on January 24th, 2013 04:18 am (UTC)
It does, actually. As novalis states, sages and codex had similar numbers of puzzles in raw form. Codex let 30% fewer of them make it into the hunt.

(But I really think the main differences in length were actually (1) meta construction, and (2) number of ahas permitted in a single puzzle.)
Andrewbrokenwndw on January 24th, 2013 04:20 am (UTC)
Well, I can't say about the 400 number, but that 350 includes a number of tracking puzzles, metapuzzles, events, and dead numbers. I'd say the actual number of reasonably fleshed-out submissions is below 300.

Still, we did cut pretty aggressively, even before answer assignment, so probably your broad point stands.