Reiko's Ramblings and Writings

What I'm reading and writing about lately.

IF Comp Scoring

Whenever I have time, I try to play and review the IFComp games each year. I’ve been doing this for a number of years now, so I have a pretty good idea of what sorts of things make good interactive fiction. Scoring is always a subjective activity, so when reading reviews and looking at the scores awarded, it’s useful to know how different people score. For me, each score implies an approximate quality range, but how a game gets to that score might vary between games scored at the same level. One game might be very good overall but with some significant weaknesses, while another might be mediocre but with certain features exhibiting particular flair or innovation. My scoring method captures both of these, even though the final result might be the same.

I start with a base score of either 4 or 7. A game that is unfinishable for any reason starts with a 4. This includes technical difficulties, fatal bugs, broken design, obtuse puzzles with no walkthrough, etc. Obtuse puzzles alone aren’t a game-breaker, but obtuse puzzles with no way available to work through them are. A game that is finishable or appears finishable, even if I don’t personally finish it, generally starts with a 7. I also reserve the right to start with a 4 on a game that I finished but whose quality is low enough that it shouldn’t end up with a higher score, even if I don’t want to enumerate its flaws separately to reduce the score from 7. This is most likely to happen with a game that is just poorly written in general, or a non-parser game that doesn’t effectively take advantage of choice-based tools to create interesting interaction.

From there, I adjust up or down, adding or subtracting points for particularly noteworthy features or bugs. A really fun or well-written game could get a good score even if it’s unfinishable, but it’s unlikely, since it has to overcome the automatic unfinishability penalty.
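To make the arithmetic concrete, here is a minimal sketch in Python. The function and parameter names are made up for illustration, and keeping the result within the 1 to 10 range is an assumption; the actual size of each adjustment is a judgment call, not a fixed formula.

```python
def final_score(finishable, adjustments, low_quality=False):
    """Illustrative sketch: pick a base of 4 or 7, then add or subtract points.

    finishable  -- True if the game is finishable or appears finishable
    adjustments -- list of signed points, e.g. [+1, -1] (sizes are assumed)
    low_quality -- True for a finished game that is poor enough to start at 4
    """
    # Unfinishable games, or finished games of very low quality, start at 4.
    base = 7 if finishable and not low_quality else 4
    total = base + sum(adjustments)
    # Assumption: the result is kept within the 1-10 scoring range.
    return max(1, min(10, total))


# A fun but unfinishable game has to climb up from the lower base.
print(final_score(finishable=False, adjustments=[+1, +1]))  # 4 + 2 = 6
# A finishable game whose good and bad points cancel stays at the base.
print(final_score(finishable=True, adjustments=[+1, -1]))   # 7 + 0 = 7
```

The three-point gap between the two bases is what the unfinishability penalty amounts to in this sketch.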

Some things I penalize for (not necessarily in order of severity):

  • Shortness. Judges are allowed up to two hours to play each game. It can be a problem to pitch the game length too long, because then judges won’t see the whole thing, won’t be able to reach an ending, won’t be able to consider the full story arc. But it’s possibly even worse to submit a tiny game that takes ten minutes to finish, because it’s just not going to be able to sustain the sort of complexity and depth that characterizes really good games.
  • Bugs, either explicit (software error messages) or implicit (clearly undesirable behavior). Of course, parser error messages are to be expected from parser games when you do something the game didn’t expect. On the other hand, too many of these can mean that not enough was implemented.
  • Lack of walkthrough (particularly for a puzzle game). This won’t matter if the game is primarily link-driven or focuses more on conversation or story. But I don’t appreciate banging my head against puzzles when I’ve only got two hours and I’d really like to see the whole game. I put a good effort into solving puzzles on my own, but I appreciate hints when I’m stuck, and sometimes even hints aren’t sufficient if something is particularly obtuse. That said…
  • Broken walkthrough. It’s worse if the walkthrough doesn’t actually work. Some games have random elements and therefore can’t give a strict command-by-command walkthrough, but if a game can provide one, it’s a really big problem if the walkthrough hasn’t been tested properly.
  • Typos/spelling errors. One or two can easily be overlooked, but when they are so common that it’s clear that no spellchecker was used, and no editor or anyone looked through the text before release, it really grates on me.
  • Tropes. This isn’t an automatic deduction the way most of the others are, but it’s a possibility. We’ve all seen amnesia games by now, surely, so a game starting with total amnesia is going to have a very hard time pulling it off in an innovative way. Other tropes might include hunger timers, ordinary mazes, “my apartment/house”, etc. Some games include the appearance of a maze that is then solved all at once with a particular item or action. That’s fine. But otherwise, you’d better have a good reason if you use these things.
  • Format. Some people automatically rate down non-parser games, and there’s a whole argument going on about whether they belong in the comp or not. My stance is that choice-based or hyperlinked games are on average not going to do as well as parser games because the depth of interaction is much simpler. I’ve played some well-written choice-based games with stats and hyperlinked games with subtle interactions, and I think they have the potential to be very interesting. So I give them the same consideration as parser games and rate up or down based on their features or lack of them. However, I do tend to rate down homebrew executable games, because it’s clear that using established tools generally provides better results. It’s not so much an automatic deduction as a recognition that the homebrew wrapper does not provide a good experience. Writing a game is a lot of work anyway, and writing your own system on top of that is even more work. It’s just not a good idea unless you’re a professional game developer or something.
  • Incompleteness. Submitting a preview, intro, or half-finished story just isn’t going to cut it, even if it’s a brilliant beginning. If you want to compete with a preview in order to find out whether people like it enough that you should make the rest of the game, then you want IntroComp.

Some things I give points for (not necessarily in order of appreciation):

  • Well-done feature. In general, if a particular feature is crafted expertly or shows particular innovation, I’ll note it. This can be a particular puzzle, an aspect of the world-building, a gameplay attribute, etc.
  • Realistic character(s). Let’s just say it: NPCs are hard, because people are complex. But some games have done a really excellent job with them, and it’s worth noting when that happens. Often it’s because the NPC’s role is strictly controlled so that you only see what you need to see for the interaction to make sense.
  • Setting. Not just weird for the sake of weird, but a setting that’s particularly well-realized or vividly unusual – an exotic country, an alien land, an uncommon event, even a fresh look at a normally ordinary place.
  • Multiple endings. Not just lots of death endings, either, or a single branch point at the end. A game will get a point for having multiple distinct outcomes that are achievable based on doing different things during play. That’s not to say that a more linear game can’t score well if what it does is crafted well, of course.
  • Humor. Some games are just ridiculous, some think they’re funny but they’re not, and others take themselves far too seriously. A game that hits the humor well will be funny without being too slapstick, or it will poke fun at the medium or interactive fiction history. A good response to >xyzzy helps here.
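The two lists above can be tallied roughly like this. This is only a sketch: the labels are shorthand for the bullet points, and weighting every item at exactly one point is an assumption made for illustration. Some penalties, such as tropes and format, aren't automatic, so they only count when actually noted for a given game.

```python
# Shorthand labels for the bullets above (hypothetical names).
PENALTIES = {
    "shortness", "bugs", "no walkthrough", "broken walkthrough",
    "typos", "tropes", "format", "incompleteness",
}
BONUSES = {
    "well-done feature", "realistic characters", "setting",
    "multiple endings", "humor",
}


def net_adjustment(noted):
    """Return bonuses minus penalties for the features noted during play.

    Assumption: each noted feature is worth exactly one point.
    """
    noted = list(noted)
    unknown = [n for n in noted if n not in PENALTIES and n not in BONUSES]
    if unknown:
        raise ValueError(f"unrecognized feature(s): {unknown}")
    plus = sum(1 for n in noted if n in BONUSES)
    minus = sum(1 for n in noted if n in PENALTIES)
    return plus - minus


# Two things done well and one problem: a net gain of one point.
print(net_adjustment(["humor", "setting", "typos"]))  # 2 - 1 = 1
```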

By adding or subtracting points, I end up with the final score. A good game with flaws might still end up with an average score when its positive and negative features cancel out. Here’s an approximate interpretation of each quality level:

  • 1: This is the minimum, and I don’t use it often. Only a completely broken, worthless game would get 1, where I honestly think it never should have been submitted at all and can’t think of anything positive to say about it.
  • 2: A broken game that contains some interesting ideas might get this score. More games than I’d like get this score. Generally games at this level probably should have been tested a lot more before submission.
  • 3: A deeply flawed game with some potential.
  • 4: An unfinishable game with balanced positive and negative qualities.
  • 5: A low-average game: either an unfinishable game that did something well, or a finishable game that had some serious issues.
  • 6: A decent game, but needed more testing and polishing.
  • 7: A good game: either a finishable game whose flaws balanced its good points, or a slightly broken game that had a lot of potential and did many things well but just wasn’t tested properly.
  • 8: A very good game, polished and fun, with few flaws.
  • 9: Excellent game that did several things right. Highly recommended, and I hope it wins the competition or comes close. I might give one or two games per comp this score.
  • 10: This is the maximum, so I don’t use it often either. Only a completely outstanding game would get 10, where it’s so well crafted and polished that there’s nothing negative I can say about it, and very little in the way of improvement.

