21 January 2013 @ 02:40 pm
Too Big to Solve?  
Not my tagline, but a good description of the Mystery Hunt that just happened. One line of dialogue after last year's Hunt, which I led with in my wrap-up, was the question of when is too soon for a Hunt to end. I said that, in this era of a few competitive teams trying to grow to get over the winning hurdle, constructors aiming bigger was a mistake. The Hunt ending after 36 hours (midnight Saturday) is fine if that makes the solving experience stretch over the weekend for everyone else. I won't comment generally on this year's effort, but it seems a great example to point back to of too much ambition by too many people towards the further militarization of the size of Hunt, so that by 2025 the team "The whole of new USA" can go after the coin against "USSReunited" for at least a month. The sense of "puzzle" versus "grindy work" is also a discussion I have every year, and I don't choose to repeat myself. I've felt since 2008 that the Mystery Hunt is far from an event I'd regularly attend in person, although I'm glad to have finally been onsite to play with Team Luck, with whom I've been a "free agent" now for three years.

I had a good solving year, relatively speaking, but it was mostly demoralizing personally. I soloed Palmer's Portals, for example, but spent many hours after basically solving 8/10ths of it needing to tweak a very small and underconstrained set of things to get from that hard-won state to a finished state. At some stage I told the team "I'm going to solve Portals and the Feynman meta and then go sleep," and I met this goal, but it took many times longer than I expected when I made the statement. I led the solve of both Danny Ocean (with zebraboy stating the most necessary last bit to get my work over the cliff) and Richard Feynman (with Jasters). I obviously co-solved lots of the logic puzzles and other puzzles, and gave various finishing help to a range of things too. I think I did this best for "Kid Crossword" once, when he had spent a lot of time mastering the hard steps of a crossword/scrabble puzzle -- and could quite impressively fast rewrite out the set of steps I wanted him to do about the puzzle -- and the follow-up steps were not obvious, but I led the killing of the beast. This was too often the feel for these puzzles, and my assassination rate was far lower than I wanted. My Sunday was spent earning 3 puzzle answers by actually going to an event, and then falsely believing the power to buy some answers would let me finish solving the Indiana Jones mini-metas -- where I had already mostly soloed Adventure 2's snakes with 5/8 answers, but then killed myself dead on #1/Ouroboros for the rest of the day, spending so long solving, as many solvers will say in hindsight, the puzzle that was meant to be in one of a dozen ways and not the puzzle it was. Let me state here, as I did for hours with my team: the phrase "I'm not cut out for this" is horrible flavor. It implies both cut this out and, in a different way, also don't cut this out. This makes you want to cut it out, which takes a lot of time, but also to not invest too much time in cutting it out, so as to save the wasted time of doing a task you are being told not to do. Other wordings are far safer, and implied negatives within positives are one of the five worst flavor failure modes in my opinion. Puzzle editing and flavor text is an art and is certainly the biggest variable from year to year and constructing team to constructing team.

So yeah, Mystery Hunt happened. And there was the usual share of overwhelmingly incredible Aha moments. Endgame seemed very fun, and I wish all teams could do just that for the weekend, or at least a lot more things like that. More of that, and more sleep, would both have been good choices this year. If only the puzzles had solved on schedule.

ETA: And as I added far below around comment #300, as a solver who was both frustrated yet had fun in this Hunt, I do want to thank everyone on Sages for the incredible effort they put in. Making a Mystery Hunt is a gift for all solvers whether it matches expectations or not, and as a mostly thankless job I do want the constructors and editors and software engineers and graphic designers and cooks and phone center workers and everyone else to know I appreciated all you did over the last weekend to give us several days together for puzzling.

Further, as I was asked to write a larger piece elsewhere that has given me personally a lot more attention as the face of the criticism, and as I use the phrase "My team" a lot in general as solving forms this kind of bond, I want to be very clear: since Bombers broke up after 2009 I have been a free agent. I have solved recently with Team Luck but am not a core part of their leadership and these opinions I state are my own. I intend to form my own team next year to go after the coin again, and if you have a problem with what I have said anywhere on the internets, please hate me for it. I believe in my posts I have been offering constructive criticism, but even what I have said is without all the facts of what went on inside Sages so I could easily be speaking from ignorance a lot of the time.

EFTA: Thanks to tablesaw for pointing out this chronologic feature of posts. If you want to see all the additions to this post in time sorted order, go here http://motris.livejournal.com/181790.html?view=flat. We're on page 14 at the moment.
 
 
 
motris (motris) on January 24th, 2013 12:59 am (UTC)
But that the system -- which should by now be approaching an automated system, even if the result is still announced by a call-out -- logged PLANAR and did not flag it as correct is really inexcusable.

One model Microsoft and other Hunts have used is that the first two to three answers on a puzzle are auto-submits with auto-response, no calls either way. This would be fine. Multiple incorrect answers require a request to HQ and a call to unlock answer entry again. I'd like to see this happen sooner rather than later as an advance for MIT, as the delay to know an answer is right got high at times, and mistakes should be eliminated, not accepted as a state of nature with any human system.
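To make that model concrete, here is a minimal sketch of it. This is not the Microsoft code or any real hunt's software; the class name, the limit of three, and the response strings are all assumptions made up for illustration.

```python
from dataclasses import dataclass

AUTO_LIMIT = 3  # assumed: wrong guesses allowed before entry locks


def normalize(answer: str) -> str:
    """Compare answers ignoring case, spaces, and punctuation."""
    return "".join(ch for ch in answer.upper() if ch.isalnum())


@dataclass
class PuzzleEntry:
    """Hypothetical per-team, per-puzzle answer entry state."""
    solution: str
    wrong_guesses: int = 0
    solved: bool = False
    locked: bool = False

    def submit(self, guess: str) -> str:
        # A solved puzzle never accepts (or re-queues) further guesses.
        if self.solved:
            return "already solved"
        # After too many misses, the team must contact HQ.
        if self.locked:
            return "locked: call HQ"
        if normalize(guess) == normalize(self.solution):
            self.solved = True
            return "correct"
        self.wrong_guesses += 1
        if self.wrong_guesses >= AUTO_LIMIT:
            self.locked = True
        return "incorrect"

    def unlock(self) -> None:
        """Called by HQ after talking to the team."""
        self.locked = False
        self.wrong_guesses = 0


entry = PuzzleEntry("PLANAR")
print(entry.submit("planer"))  # auto-response, no phone call
print(entry.submit("planar"))
```

The point of the design is that the right/wrong decision is never relayed by a person, while HQ still gets a reason to talk to a team that is flailing.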

Edited at 2013-01-24 01:00 am (UTC)
 Catherine (cmouse) on January 24th, 2013 01:18 am (UTC)
It was flagged as "Correct". Software can't stop a sleepy person from picking up the phone, reading that, and telling you your answer is incorrect.
Dr. C. Scott Ananian (cananian) on January 24th, 2013 04:05 am (UTC)
But software can stop us from repeatedly calling in answers for a puzzle which has already been marked correct.

Software can also display that the answer is correct on the call-in screen (all correct answers were just listed as "proposed answer").

I disagree with completely automating the system; I think the human contact is important (and fun). There are known ways of making this work well and insulating it from human error... those mechanisms were not implemented in 2013 -- probably the hunt-running team ran out of time?
 Catherine (cmouse) on January 24th, 2013 04:07 am (UTC)
The software folks certainly ran out of time. ;)
Dr. C. Scott Ananian (cananian) on January 24th, 2013 04:52 am (UTC)
(We actually anticipated this being a problem, based on when your tech guys asked us for a copy of our hunt-running software.) We had issues getting the hunt-running code done in time, too, but novalis is a wild hacker man and saved our bacon.

[Atlas Shrugged]: another note from past constructors -- get that hunt-running code done early! We actually wanted to do all our test solving using the final hunt-running code, and would have been much less stressed if we could have made that happen. (But as it was both novalis and I were doing a lot of puzzle editing and writing and I was concentrating on the post-prod code, which we *did* get done in time to use for all the test-solving, and somehow the hunt-running code kept slipping through the cracks...)

Little known fact: all our puzzles had code names which were names of My Little Ponies, since we needed placeholder names when doing meta testing and website design. The My Little Pony names *almost* made it into the hunter-visible javascript for doing answer submission... but we thought that was going to be too much of a red herring (sigh) so we substituted md5sums of the pony names at the last moment.
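For anyone curious what that substitution looks like, here is a minimal sketch using Python's standard `hashlib`. The codenames below are placeholders picked for illustration; the comment above does not say which pony names were actually used or how the real mapping was built.

```python
import hashlib

# Placeholder codenames (assumed for illustration only).
codenames = ["Twilight Sparkle", "Applejack"]


def opaque_id(codename: str) -> str:
    """Return the hex MD5 digest of a codename for use as a public ID."""
    return hashlib.md5(codename.encode("utf-8")).hexdigest()


# Map each internal codename to the opaque ID shown to solvers.
public_ids = {name: opaque_id(name) for name in codenames}

for name, digest in public_ids.items():
    print(f"{name} -> {digest}")
```

An unsalted MD5 of a short, guessable name hides it only from casual inspection, since anyone can hash a list of candidate names and compare. That is fine for avoiding a red herring, but it is not a way to keep a secret.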
Andrew (brokenwndw) on January 24th, 2013 07:52 am (UTC)
I would say that future "best practices" would include: start writing and testing hunt running code as soon as your structure is well enough known to know what the kinks are. You'll have to do it eventually, and being able to blindsolve (for example) on final code seems like a great idea.

Also, we would have done better to have CScott win the hunt the day before...

(cscott was the name of the testing account we used during the hunt to drive ahead of the leading teams and check for bugs. We failed to do this diligently and thus didn't smoke out a critical bug having to do with releasing the final round. Whoops!)
Jessica (jessicala) on January 27th, 2013 03:29 am (UTC)
Speaking as someone who's hosted a number of Microsoft events using auto-response software: there's still plenty of human contact happening. During PuzzleHunt 11.0 I remember spending a lot of time in personal contact (both phone and email) with members of various teams.

Having the software pick up a lot of the auto-answering gave us (hosts I mean) more time for one-on-one interaction, to help when midrange teams needed a nudge or some guidance to help them keep having fun.
(Anonymous) on January 25th, 2013 01:38 am (UTC)
Can't stop it, but can reduce the odds. E.g., text displayed on a giant green field when calling back a right answer; text displayed on a giant red field for a wrong answer.

But it seems like every year the Hunt running software is written at the last possible minute. In 2011, several of us were sitting around at 10 AM Friday with laptops "submitting" answers in order to simulate a rapidly sped up Hunt (Setec won in 20 minutes, IIRC).

David S.
Alison (landofnowhere) on January 25th, 2013 04:02 am (UTC)
We had green and white (though the colors were not that bold). I didn't actually get to respond to the answer queue, so I don't know how well it worked.

--Alison from Sages
Hooligan: neuron art (jedusor) on January 26th, 2013 09:54 am (UTC)
Setec won

Now I know you're making things up. ;)
ze top blurberry: drifting (ztbb) on January 26th, 2013 04:53 pm (UTC)
Oh c'mon, we're sober for at least the first 20 minutes of the hunt. (-:


Edited at 2013-01-26 04:54 pm (UTC)
AJD (dr_whom) on January 24th, 2013 02:07 am (UTC)
So certainly for the Mario Hunt, and perhaps even as far back as SPIES, Plant was talking about phone-free auto-response for answer confirmation, and one of the reasons we decided not to implement that was because we thought people value the answer-confirmation calls as a point of human contact with the running team, so you don't feel quite so much like you're in a hermetically-sealed box for the duration of the Hunt. Do you think this is a valid concern?
motris (motris) on January 24th, 2013 02:18 am (UTC)
Depends how some solvers use the thing. I never took the calls for the puzzles I answered. So I basically generated work by sending everything through the team line.

On the running side, as a smaller team I hated losing so much manpower to the call lines and would prefer more frequent visits to every team HQ.

Opinions will be very broad on this point though and each Hunt team should encourage some amount of interaction with each team however that is achieved.
rlangmit on January 24th, 2013 03:18 am (UTC)
I think we at Up Late enjoy the calling mechanism as it is right now. It's nice to (a) have to think a bit about submitting something (i.e., not just play Funny Farm), (b) have some anticipation before finding out if it's correct, and (c) have some human interaction. For example, we got a lot of amusement responding "Oh, you don't say?" or something else pseudo-sarcastic when Sages would call back to confirm an answer they had just sold us. And for real solves, it's nice to hear that congratulations from an actual human voice.

I don't think there were any huge delays in the call queue--at least for us--so I don't think this is a huge problem. The devolution of the software was a problem.
Dr. C. Scott Ananian (cananian) on January 24th, 2013 04:06 am (UTC)
Incidentally: https://github.com/mysteryhunt contains all the software we wrote for the 2012 hunt. The goal was for this to be a resource for future hunts.