DASH 7: “There’s never plenty of time”

[Image: cartoon of a permanently stopped watch]

Do not take the graphic as a dig or a suggestion that DASH 7 was in some way broken, that most absolute and damning term of game criticism…

A common theme in the commentary of DASH 7 was its quantity, as well as its undoubtedly very high quality. There was more than people were expecting, possibly to the point where it strained the logistic constraints of practicality that its players had to place on it, and that’s where some of the relatively negative feedback has come from. This post concerns the Experienced players’ track only; primarily this is from inevitable self-centredness, though it’s worth noting that (provisionally) the convincing majority of players were on the Experienced track.

A phrase frequently used when describing the hunt in advance ran, roughly, to the effect of “We expect that most teams will solve all puzzles in 6-8 hours”, though the precise wording varied from location to location. Some locations announced specific wrap-up times in advance, others used phrases like “All teams across the world will be working on the same 10 puzzles over the course of a max of 8-hours”; it’s not completely clear where the concept came from that there would be an overall time limit, including non-solving time, of eight hours this year, except possibly from expecting a repeat of last year’s hard limit in the absence of anything to set our expectations otherwise. That said, this site probably propagated this incorrect notion; if so – whoops, sorry, genuine mistake.

The combined par time of the nine scored puzzles for DASH 7 was 5:45, very similar to the combined par time of the nine scored puzzles for DASH 6 of 5:50. However, as previously discussed, a reasonably representative total solving time (based on early, probably incomplete data) for a globally mid-table team rose from 5:10 for DASH 6 to 6:55 for DASH 7. Another way of looking at it is that the median score for DASH 6 was 411 and for DASH 7 was 349. True, DASH 6 had five minutes more par time and thus scores might be expected to be five points higher, but that accounts for only a small part of the 62-point drop; the main story is that people were scoring far fewer bonus points than in previous years.

In DASH 4, the par value was described as a “generous average solve time”; this year, that was rather less the case. Looking at the nine global-median-scoring teams (usual caveats: early, possibly incomplete, data subject to revision), in DASH 6, a typical team earned bonus points on seven (sometimes six) of the nine scored puzzles whereas in DASH 7, a typical team earned bonus points on two, maybe three, of the nine. This is rather an abrupt analysis; a fuller one would look at earlier years as well. Nevertheless, the DASH 7 par values broadly didn’t feel like generous average solve times.

The very dear Snoutcast used to mention the phrase “Everybody likes solving puzzles, nobody likes not solving puzzles” often. From there, it’s not much of an extension to “Everybody likes solving puzzles, everybody likes solving puzzles and earning bonus points from doing so even more”. Teams who were used to having sufficient time to solve puzzles and frequently earning bonus points in previous years may not have had their expectations set to the higher standard this year, which doesn’t just cause “we’re not doing as well as we did last year” ill feeling but also can cause “we might not have time to get all the fun from solving puzzles that we want before the hard time limit expires” worries, which may knock on to causing teams to take sub-optimal decisions over their self-care, worsening their experience further.

There’s a very interesting discussion on the GAST scoring system on the Puzzle Hunters Facebook group at the moment. When the par times are sufficiently generous, then the ordering by (highest) scores and (fastest) solve times are identical; when they are not, some teams are arguably over-rewarded, or insufficiently punished, for relatively slow solves on some puzzles. This was an arguable issue as high as the top ten this year.
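To make that point concrete, here is a minimal sketch under a loudly stated assumption: the real scoring rule isn't spelled out in this post, so the code below uses a hypothetical rule of one bonus point per minute under par and no penalty for going over par. The team names, solve times and par values are invented purely for illustration.

```python
def bonus(par_minutes, solve_minutes):
    # Hypothetical rule (an assumption, not the official formula):
    # one bonus point per minute under par, never negative.
    return max(0, par_minutes - solve_minutes)


def total_bonus(pars, solves):
    # Sum the per-puzzle bonus over all puzzles.
    return sum(bonus(p, s) for p, s in zip(pars, solves))


# Invented solve times in minutes for two puzzles.
team_a = [30, 30]  # steady: 60 minutes in total
team_b = [10, 55]  # one fast solve, one slow: 65 minutes in total

generous_pars = [60, 60]  # every solve beats par
tight_pars = [30, 30]     # some solves run over par

# Generous pars: the faster team overall (A) also has the higher bonus.
print(total_bonus(generous_pars, team_a), total_bonus(generous_pars, team_b))  # 60 55

# Tight pars: B's slow second solve costs nothing beyond losing its bonus,
# so B outscores A despite being slower overall.
print(total_bonus(tight_pars, team_a), total_bonus(tight_pars, team_b))  # 0 20
```

Under this assumed rule, whenever every solve beats par a team's bonus is just the combined par minus its total solve time, so score order and time order must agree. Once solves run over par, the floor at zero hides how far over they ran, and the two orderings can come apart.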

DASH has one of the hardest calibration issues of all puzzle hunts because it aims to cater to teams of so many different abilities, even among those who self-select for one level of difficulty or another. Previous DASHes perhaps might not have got the degree of credit that they have deserved for making the balancing act work quite so well. So this all points to a question of where DASH should seek to target its activities.

Is the number of puzzles correct? Should the puzzles be shorter… or the same length, with longer par values? Would DASH be better served by having the sort of quantity of content (i.e. total solve time 4½-5½ hours for median teams) that it had in previous years, or a similar quantity of content to that of this year spread over a longer day? The considerable downsides of a longer day could include that it might well put potential players off, potential GC and volunteers off and that it might make finding appropriate locations even more difficult still. On the other hand, challenges as meaty as those of this year were an awful lot of fun!

This is a very INTP-ish “throwing things out there” sort of post, so perhaps time to be a bit more concrete. It’s inevitable that calibration suggestions will turn out to be self-interested, though the self-interest will be subconscious as efforts have been made to try to eliminate conscious bias. For an eight-hour-overall-time-limit day, perhaps the calibration target should be that 75% of teams solve all the puzzles, in their division of choice, within 5½ hours solving time, and that 80% of teams beat the par value for each puzzle.
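For anyone with playtest or results data to hand, a target like this is easy to check mechanically. The sketch below is only an illustration: the function names and sample numbers are made up, it assumes every team has a recorded solve time for every puzzle, and it treats "beating par" as solving in strictly less than the par time.

```python
def calibration_report(team_times, pars, limit_minutes):
    # team_times: one list of per-puzzle solve minutes for each team.
    # Returns (share of teams whose total solve time is within the limit,
    #          list of per-puzzle shares of teams that beat par).
    n = len(team_times)
    finished = sum(1 for times in team_times if sum(times) <= limit_minutes) / n
    beat_par = [
        sum(1 for times in team_times if times[i] < par) / n
        for i, par in enumerate(pars)
    ]
    return finished, beat_par


def meets_target(team_times, pars, limit_minutes=330,
                 finish_share=0.75, par_share=0.80):
    # Defaults mirror the suggestion above: 75% within 5.5 hours (330 minutes)
    # and 80% of teams beating par on each puzzle.
    finished, beat_par = calibration_report(team_times, pars, limit_minutes)
    return finished >= finish_share and all(b >= par_share for b in beat_par)


# Invented data: four teams, two puzzles, with a shortened limit for the demo.
sample_times = [[30, 40], [35, 45], [38, 48], [60, 70]]
sample_pars = [40, 50]

print(calibration_report(sample_times, sample_pars, limit_minutes=100))
print(meets_target(sample_times, sample_pars, limit_minutes=100))
```

In this made-up sample, three of the four teams finish in time and three of the four beat par on each puzzle, so the finishing target is met but the 80% par target is not.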

That said, it’s not as if tuning puzzle difficulty up or down is at all an exact science, or that playtest results are necessarily reflective of how puzzles will turn out in real life. The whole process is the endeavour of fallible humans after all; the puzzle community at large is truly grateful to those who submit puzzles, those who edit them, those who make the selections and turn raw puzzles into complete hunts. The quality has once again been extremely high, even if the quantity was not what people had been led to expect.

It could be possible for a DASH to offer so little challenge to the fastest teams as to hurt their experience, so here’s an out-there suggestion to finish. While adding multiple levels of difficulty by writing more sets of puzzles adds very considerably to the workload – and while the BAPHL series of hunts offers two levels of difficulty, this site isn’t aware of any other hunt that offers three, what with the brilliantly thoughtful junior track as another labour of love – here’s a possibility.

Consider the addition of a hardcore mode that shares the same material with the experienced track, but is different in the proactivity with which it offers hints, and also limits team sizes to three. This could slow the best solvers down while hurting their experience in only the “it’s fun to solve in large teams” fashion – but, if you’re that hardcore, you’re likely to have access to other events which will let you solve in larger teams as well. It’s also been proven to be the case that the best three-player teams can match the best larger teams as well!

15 thoughts on “DASH 7: “There’s never plenty of time””

  1. The eight hour limit was in the start info we got, including the word ‘strictly’ in bold and everything, so that’s where it came from at least.

  2. Timing is really, really hard. We try to do calibrations based on how fast it’s solved in playtest, with some added time, but I think playtest teams are not necessarily representative of their tracks. I do like the idea of the “hardcore” mode where there are no free hints.

  3. I felt the rubber-band nature of the 8 or 10 hour time limit was the most puzzling thing that day. If we accept that DASH is a competition intended to be held in near-identical circumstances except the location, then I don’t see how it’s logical to allow the time limit to be “8 hours, or more if your local GC allows”. That makes a mockery of the whole competition, even if it’s done with the best of intentions.

    There’s also the issue of trust. If GC says there is a time limit of 8 hours, then that should be the limit and that should apply to everyone. Equally, the overall calibration needs to lie comfortably within that 8 hours, and I feel that target was missed by a distance this year even though I much preferred this year’s puzzles to last.

    1. It was a uniform 8 hours + 2 extra for buffer time in ALL cities. We did not want to say “this hunt is actually 10 hours” because it’s not. It’s meant to be finished in 8 hours and we wanted teams to pace themselves for 8 hours. However, given that many teams stay and finish the meta regardless of time limits, we wanted to reward teams for persistence.

      Pre-cluekeeper, and not just in DASH, we had advertised end times, but GC could, at their discretion, stay longer to accept answers. The extra hours tacked on (but not advertised) was an attempt to model this, because it absolutely sucks for those teams that finish just a few minutes outside of the time limit to get no credit.

      1. If you’re in a location where the GC can’t stick around for longer, you will be penalised compared to other cities. “…Same Hunt, Different Time Limit”?

        The other thing that sticks in the craw is that the first I heard of the 10 hour limit was the next day. How would you feel if you had 3 hours to do an exam and then discovered afterwards that everyone else got 4 hours to do it? Wouldn’t that seem grossly unfair?

        1. It’s never going to be the same experience across all cities. Routes cannot be uniform. Travel time is vastly different. Locations and weather are going to be very different too. What we do instead for this is to discount travel time, and make sure content is the same.

          And, as I said before, in this case, thanks to Cluekeeper, every team in every city has that extra time. Before, it varied from city to city towards the end, and without cluekeeper to accept answers whether or not GC was present it was a different experience across each city. So yes, it was explicitly unfair but it was accepted that you cannot have the same experience across every city. However, we always recorded solve time, so at least there you can still see who finished when.

          Regarding the extra time – everyone had that extra time. Every team had the same 10 hours programmed in. Maybe it’s an American team habit, but we did expect people to try to submit answers until Cluekeeper explicitly said “Hunt is over.” I’m sorry it caused a lot of confusion and grief for you. Our goal was to run an 8 hour hunt and to have everyone pace themselves as if it were an 8 hour hunt, but to be more forgiving with cutoff times once people got to the meta than last year. If we’d said “you actually have 10 hours for the whole thing” that would have led to even more delays. Instead, we did what we did to push people to try to finish or get to the meta within 8 hours, and then allow extra time to finish the meta.

  4. Oooh and another comment regarding hardcore mode: a lot of the Games and Shinteki mandate a team of 4. No more, no less. Team size mandate has precedent and makes sense. 🙂

  5. Chris, fellow INTP here!
    Yuan, I could have sworn that the DASH San Francisco page (and emails from GC) all said that it would start promptly at 9 am (it didn’t) and end at 6 pm. Isn’t that nine hours, not eight?

    I am captain of team Mystic Fish (est 2000) however I was the only actual Mystic Fish on the team this time. None of the other Fish could play so I recruited four new people, none of whom had ever played with me or each other before, and in fact only two of us (Effie and I) had ever met. We gelled pretty well, considering, and ended up 118/333 in the Experienced division, with a score of 375, total solve time of 5:43 and overall time of 8:54, with bonus points on 5 puzzles. If my memory serves me correctly about 9-6 being the advertised time window, and if the game had started on time, we would have finished the game by 6 pm. And, I’m sure that if it had truly been an 8 hour game we would have shaved some time off our leisurely pace in order to finish within the time limit.

    Anyway, we were not very efficient over the ground, with a whopping 3 hours 11 minutes to cover a fairly tight course. By the way, whoever picked the San Francisco route, thank you for situating the entire course within 1.5 miles of my home. At one point we were solving puzzles in a park less than a half block from my house and when I realized I needed to pee, it was actually less of a walk to go home than to use the nearest public restroom. I felt pretty spoiled 🙂

    Oh, but here’s the thing. We got the suite of 14 mini-puzzles at the Little Marina Green, which was fairly exposed to the elements (wind), the grass was wet, there were very few benches and NO tables, and there was no place to buy food nearby. After we solved those puzzles, we were told that we had everything we needed in order to solve the meta. With a par value of 75 and it getting colder by the minute, we wanted to go someplace more comfortable to solve the meta and the location could not have been worse in terms of options. It was at least a 1/2 mile walk to the nearest cafe. We ended up backtracking to the Starbucks in the Presidio where we had just been on the puzzle before the minis. Let’s just say, being a local, that’s not how I would have laid out the course 🙂

    I hate to kvetch about little stuff, because I had a wonderful time, loved the puzzles, and totally appreciate the work that went into it. I am continually amazed and humbled at how much of their free time people volunteer – over many many months – for no pay, little recognition, just to create an ephemeral experience for their fellow puzzlers. Griping about little things that maybe I would have done differently seems churlish. I only do it because I’m one of those kick the tires analytical types. You know, an INTP 🙂

    1. I wonder if the SF page meant it would start the registration + waiver collection at 9am? The city lead for SF is someone who stepped up when it seemed like no one else would, and in addition it was his first time running any event like this. He most likely wanted to build in some buffer time because no event ever starts on time, as far as I know. After all, thanks to Cluekeeper, we can time exactly how many hours players will get to play.

      Regarding the last location – sorry to hear about that. I do know Chris had to change it and much of the route on short notice because, as you know, there’s always something in SF to make sure nothing goes according to plan. He’d been very diligent on making sure there were no event conflicts but there was a very late announcement that one of the streets he’d wanted to use would be closed DASH weekend.

      And I understand about the little things. My team and I also discuss what could have been better after many events.

  6. While it didn’t affect us on the novice track, had we been in experienced we would have been pretty miffed at missing out on the last puzzle because we had trains booked home so couldn’t stay past six. It was already getting a bit dicey with starting twenty minutes late. Maybe another issue unique to the UK, with London being the only city playing so people coming in from all over, and the fact that no one in their right mind drives into London. And that specific train time tickets are 5-10x cheaper than open fares.

    1. *nods* True. I wonder whether it would be worth setting people’s expectations by writing the invitation to be something like “The hunt will start at (10) a.m. and run for (8) hours, but we encourage you to plan to stay later, enjoy the company of your fellow puzzlers and perhaps go out for a shared dinner.” This would also help the social side of the event; there are plenty of players and teams who I’d like to get to spend some time with but have not yet done so in practice. (That said, there’s always Puzzled Pint!) Certainly I’ll suggest that as a tip for next year on this site, so long as I remember…

      1. Right. These are good suggestions and barring anything like teams not showing up, thus delaying start times, we could also suggest building in extra time at the beginning. Perhaps we could tell players “plan on spending 8 hours on puzzling, and then budget in an extra 2 to finish up anything else and/or to socialize.”

  7. I am concerned though, that we keep making DASH harder, and am not sure how to stop this trend. In the past we simply had to think about one track, period, now we have novice, experienced, jr. Novice is, in some ways, easier to think about than experienced. Experienced, in practice, seems to have two tracks – those super experienced (hardcore, in your suggested track) and more experienced than novice, but haven’t been playing in day games for the good part of a decade.

    I think some of the pain comes in because we tend to write puzzles first for the experienced track, and then we figure out how to make it accessible for novice. It’s much easier to make a good hard puzzle easier, than it is to make a good easy puzzle harder, in my opinion. Maybe next year’s organizers should consider scrapping the experienced track altogether and focus again on just novice and Jr. tracks. Experienced teams have many more events, after all.

    1. I suspect that if you ditched the “experienced” track, a lot of the experienced teams would simply show up and play anyway. Everyone is hungry for puzzle events.
      I am also concerned about the puzzle-creep you discuss for two reasons. The first is the effect it has on truly novice players, as you outline. The second is the effect it has on GC. One of the reasons the BANG series of puzzles in the Bay Area died out was that each BANG seemed to outdo the last in some way, and the people who were considering running one were too daunted. I suspect that to get that series re-started, someone will need to run a really crappy BANG. Maybe one of the next few DASHes needs to be a “back to basics”. It might also help to have central DASH GC be someone other than the Bay Area, where experienced puzzlers are the norm and form a very tight network that doesn’t include a lot of novice voices.

      1. Deb and I came up with some great ideas for a terrible BANG during Shinteki. (Please don’t blame Shinteki.)
