Infocom-type parser
The current baseline standard in interactive fiction parsers was largely originated by Dungeon (later Zork) and further refined by Infocom. At the time it was a huge leap ahead of the two-word parser of the original Adventure, and though there has been slow evolution since, there have been few really revolutionary changes to the grammar. A few rival IF companies of the 1980s (such as Magnetic Scrolls) built parsers of equal complexity, but none really did anything new.
(Any counterexamples from Magnetic Scrolls or similar games?)
While coding a two-word parser is simplicity itself, writing an Infocom-style parser can be an exercise in baffling complexity, even though the semantics of the final parsed result are not much more involved than in Adventure.
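To make the contrast concrete, here is a minimal sketch of a two-word parser in Python. The vocabulary tables (`VERBS`, `NOUNS`) and the function name `parse_two_words` are hypothetical and chosen for illustration; they are not taken from Adventure's actual word list or source.

```python
# Hypothetical vocabulary for illustration only.
# Synonyms map to one canonical verb.
VERBS = {"take": "take", "get": "take", "drop": "drop", "go": "go"}
NOUNS = {"lamp", "keys", "north", "south"}


def parse_two_words(line):
    """Parse a VERB [NOUN] command.

    Returns a (verb, noun) tuple, with noun set to None when omitted,
    or None if the command is not understood.
    """
    words = line.lower().split()
    if not 1 <= len(words) <= 2:
        return None  # anything longer than two words is rejected outright
    verb = VERBS.get(words[0])
    if verb is None:
        return None
    noun = words[1] if len(words) == 2 else None
    if noun is not None and noun not in NOUNS:
        return None
    return (verb, noun)
```

Under these assumptions, GET LAMP parses to `("take", "lamp")`, while TAKE THE LAMP is rejected because it has three words. That rejection is exactly the limitation the Infocom-style parser removes.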
There may yet be huge leaps to be made in parser grammar and semantics, but so far the Infocom parser seems to be the sweet spot of features versus complexity and player intelligibility.
The minimal dialect of English understood by the Inform implementation of the Infocom-style parser is often referred to as Informese.
Features
The major advances of the Infocom parser over the two-word parser are:
- Actors (JACK, PASS ME THE KETCHUP)
- Indirect objects (UNSCREW THE THYRISTOR WITH THE FROBLITZ WRENCH)
- Verbs as phrases rather than words (TURN THE LIGHT ON, DON'T PANIC)
- Multi-word nouns, allowing for adjectives and complex names (THE THING YOUR AUNT GAVE YOU WHICH YOU DON'T KNOW WHAT IT IS)
- Multiple object selection phrases (TAKE ALL FROM THE BASKET EXCEPT THE CHIP AND THE SOCKET)
- Numeric quantities (TAKE FIVE GOLD COINS)
- Raw text or numbers in the place of object slots (TURN DIAL TO 11, SAY "BOOJUM" TO THE SNARK)
- Articles (THE CAT, AN APPLE, ONE OF THE GOLD COINS)
- Pronouns (PICK UP THE TICKET. LOOK AT IT. FOLLOW MRS PODSNAP. ASK HER ABOUT THE CRIME)
- Interactive disambiguation (Do you mean the white chip, the black chip, or the taco chip?)
- Automatic guessing of missing sentence elements (SHOOT ELEPHANT (with the rifle) )
- Typo correction (OOPS <text>)
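As an illustration of one feature from the list above, interactive disambiguation can be sketched roughly as follows. This is a simplified Python sketch, not how any Infocom or Inform parser actually does it: objects are assumed to be plain name strings, and `match_objects` and `disambiguation_question` are hypothetical helper names.

```python
def match_objects(noun_words, in_scope):
    """Return the object names in scope that contain every word the player typed."""
    wanted = set(word.lower() for word in noun_words)
    return [name for name in in_scope if wanted <= set(name.lower().split())]


def disambiguation_question(candidates):
    """Build a 'Do you mean ...?' question from two or more candidate names."""
    listed = ["the " + name for name in candidates]
    if len(listed) == 2:
        return "Do you mean {} or {}?".format(*listed)
    return "Do you mean {}, or {}?".format(", ".join(listed[:-1]), listed[-1])


# Assumed contents of the current scope, for illustration.
scope = ["white chip", "black chip", "taco chip", "socket"]
matches = match_objects(["chip"], scope)
if len(matches) > 1:
    # More than one object fits the noun phrase, so ask the player.
    print(disambiguation_question(matches))
```

If the player answers WHITE, a fuller parser would add that word to the noun phrase and match again; here `match_objects(["white", "chip"], scope)` narrows the result to a single object.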
Grammar
The Infocom parser generally accepts sentences of the form:
>[ACTOR,] [VERB PHRASE] [DIRECT OBJECT PHRASE] [PREPOSITION PHRASE] [INDIRECT OBJECT PHRASE]
and parses out four major components:
- the actor who is being addressed
This usually defaults to the 'player' object if no explicit actor was given. The early Infocom game Suspended is one of the few games which breaks this convention, defaulting to the last spoken-to actor.
- the verb or action being taken
Many verb phrases may parse to the same action, reducing the burden on the game author to code separately for each synonym.
It is very rare for a game to need to know the exact verb used to initiate an action, but some games do this (usually by some kind of 'parser hack' to read the raw input buffer) as a special effect.
- the direct object(s) if any (called the 'noun' in Inform)
This can be a list of multiple objects, or even an incomplete and ambiguous noun phrase that requires further words to resolve to a single known object.
One interesting quirk of the Infocom parser is that in order to correctly parse the noun phrases, the verb usually needs to be known at this point, as only certain verbs allow a list of multiple objects. By convention TAKE and DROP usually support this feature, but other verbs do so only at the author's discretion.
Use of multiple objects can spoil many game puzzles - the classic case is 'EXAMINE ALL' revealing objects in a room that are not explicitly listed in the room description and whose existence the player is supposed to learn only by searching - so it's rare for a game to use this feature much.
The result of actions performed on multiple objects is usually listed in a special notation: 'Object: <action>'. For example:
>TAKE ALL FROM BASKET
White chip: taken
Black chip: taken
Taco chip: taken
By convention, multiple actions all occur within one 'move' of game time, which bends mimesis slightly; this is usually done for ease of programming. Some games, however, have made the use of multiple-take a requirement for solving certain time-limited puzzles.
- the indirect object(s) if any (called the 'second' in Inform)
Follows similar rules to the direct object, though it is generally much rarer for a verb to allow multiple objects in the indirect object slot.
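The four components above can be sketched as a small Python function. This is a deliberately naive illustration and not the actual Infocom or Inform algorithm. The tables `VERB_PHRASES`, `PREPOSITIONS` and `ARTICLES`, the action names, and the functions `parse` and `report_multiple` are all hypothetical. The sketch treats any comma as the end of an actor name, so it cannot handle comma-separated object lists, and it returns raw noun text without matching it against game objects.

```python
# Hypothetical tables for illustration; a real game would define far more.
VERB_PHRASES = {
    ("take",): "Take", ("get",): "Take", ("pick", "up"): "Take",
    ("drop",): "Drop", ("unscrew",): "Unscrew",
}
PREPOSITIONS = {"with", "from", "to", "in", "on"}
ARTICLES = {"the", "a", "an"}


def parse(line, default_actor="player"):
    """Split a command into actor, action, noun text and second-noun text."""
    actor = default_actor
    if "," in line:
        # Assumption: text before the first comma names the actor addressed.
        name, line = line.split(",", 1)
        actor = name.strip().lower()
    words = [w for w in line.lower().split() if w not in ARTICLES]

    # Try the longest verb phrase first so PICK UP beats a one-word match.
    action = None
    for length in (2, 1):
        key = tuple(words[:length])
        if key in VERB_PHRASES:
            action = VERB_PHRASES[key]
            words = words[length:]
            break
    if action is None:
        return None  # verb not understood

    # Everything before the first preposition is the direct object phrase,
    # everything after it is the indirect object phrase.
    noun, second = words, []
    for i, word in enumerate(words):
        if word in PREPOSITIONS:
            noun, second = words[:i], words[i + 1:]
            break
    return {
        "actor": actor,
        "action": action,
        "noun": " ".join(noun) or None,
        "second": " ".join(second) or None,
    }


def report_multiple(objects, act):
    """Run act on each object and format results in the 'Object: result' notation."""
    return ["{}: {}".format(obj.capitalize(), act(obj)) for obj in objects]
```

Under these assumptions, GET LAMP, TAKE LAMP and PICK UP THE LAMP all produce the same `Take` action, so game code only has to handle the action once. Keying the table on word tuples keeps multi-word verb phrases and single-word verbs in one lookup.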