Fun With ZCharacter

From IFWiki

Here's an interesting snippet of Inform code that does almost nothing, but can save a few KB of game file size:

Zcharacter
  "abcdefghijklmnop.rstuvwxy,"
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  "0123456789[];?!'#zq-:()";

(You'll need to make this the first thing in your source file, excluding directives that affect compilation.)

What does this do? Each character in the first line of the Zcharacter directive is stored as a single five-bit unit (a Z-character) in the game's encoded text, so three of them fit into two bytes. Characters in the next two lines use two units each, and all other characters use four. Most of the savings come from picking the right punctuation to put on that last line. It also occurred to me that the period and comma occur more frequently in English text than the least common lowercase letters, and that this could be exploited for further savings.
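To make the arithmetic concrete, here is a rough Python sketch that counts units for a string under the standard alphabet table and under the one above. It is only an estimate under stated assumptions: a space always costs one unit, a newline costs two, abbreviations are ignored, and each string is padded to a whole number of two-byte words. The names `unit_cost`, `zchar_units` and `packed_bytes` are made up for this illustration and are not part of Inform.

```python
import math

# Standard alphabet rows (the third row omits the reserved escape and newline slots).
DEFAULT_ROWS = (
    "abcdefghijklmnopqrstuvwxyz",
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "0123456789.,!?_#'\"/\\-:()",
)

# The rows from the Zcharacter directive in this article.
CUSTOM_ROWS = (
    "abcdefghijklmnop.rstuvwxy,",
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "0123456789[];?!'#zq-:()",
)


def unit_cost(ch, rows):
    """Five-bit units needed for one character (assumption: no abbreviations)."""
    if ch == " " or ch in rows[0]:
        return 1  # space or first-row character: one unit
    if ch == "\n" or ch in rows[1] or ch in rows[2]:
        return 2  # shift unit plus the character itself
    return 4      # shift, escape, and two units holding the character code


def zchar_units(text, rows):
    """Total five-bit units for a whole string."""
    return sum(unit_cost(ch, rows) for ch in text)


def packed_bytes(text, rows):
    """Bytes used once the units are packed three to a two-byte word."""
    return 2 * math.ceil(zchar_units(text, rows) / 3)


if __name__ == "__main__":
    sample = "Hello, world."
    for name, rows in (("default", DEFAULT_ROWS), ("custom", CUSTOM_ROWS)):
        print(name, zchar_units(sample, rows), "units,",
              packed_bytes(sample, rows), "bytes")
```

Running something like this over all the strings in a game gives a quick estimate of whether a candidate alphabet is worth trying before recompiling.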

The optimal Zcharacter usage may vary depending on the frequency of characters in the strings in your source files, so it may be possible to do a little better than this; experiment if you like. (The '#' is only there because I use Platypus, which uses '#' internally in the library to mark items in strings to be processed.) Remember that when deciding whether input matches a dictionary word, an Inform game only looks at the first nine units, so with 'q' and 'z' each taking two five-bit units, if your game includes both a 'quetzal' and a 'quetzalcoatl', it will no longer be able to distinguish between them.
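The dictionary caveat can be checked with a small Python sketch that expands a word into its five-bit units and keeps only the first nine. This is a simplified model, not the compiler's actual routine: the shift and escape units are represented by placeholder labels, ASCII codes stand in for ZSCII codes, and the helper names `word_units` and `dictionary_key` are invented for the example.

```python
# Standard rows and the rows from this article (third rows omit the reserved slots).
DEFAULT_ROWS = (
    "abcdefghijklmnopqrstuvwxyz",
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "0123456789.,!?_#'\"/\\-:()",
)
CUSTOM_ROWS = (
    "abcdefghijklmnop.rstuvwxy,",
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "0123456789[];?!'#zq-:()",
)

DICT_UNITS = 9  # number of units compared, as stated in the article


def word_units(word, rows):
    """Expand a word into a list of five-bit units (labels are placeholders)."""
    units = []
    for ch in word.lower():
        if ch in rows[0]:
            units.append(ch)
        elif ch in rows[1]:
            units.extend(["<shift-1>", ch])
        elif ch in rows[2]:
            units.extend(["<shift-2>", ch])
        else:
            code = ord(ch)  # assumption: ASCII code equals the ZSCII code
            units.extend(["<shift-2>", "<escape>", code >> 5, code & 31])
    return units


def dictionary_key(word, rows, limit=DICT_UNITS):
    """Only the first `limit` units take part in dictionary matching."""
    return tuple(word_units(word, rows)[:limit])


if __name__ == "__main__":
    for name, rows in (("default", DEFAULT_ROWS), ("custom", CUSTOM_ROWS)):
        same = dictionary_key("quetzal", rows) == dictionary_key("quetzalcoatl", rows)
        print(name, "rows: quetzal and quetzalcoatl collide?", same)
```

With the custom rows, 'quetzal' already fills all nine units, so the longer word gets cut off at exactly the same point. Feeding your game's dictionary words through a check like this is one way to spot collisions before moving a letter out of the first row.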

I had problems with things breaking when I tried to put lowercase letters in the second row. I believe this may have been due to a compiler bug that was fixed in the 6.30 compiler, but I haven't yet tried it out with the new compiler. In any case, it's easily avoided.

For more background on how Zcharacter works, see section 36 of the DM4.

--Two-star 17:02, 6 Mar 2005 (Central Standard Time)