This section explains how to generate and to use a GLR parser to produce structural results. Please check the examples for further information. Discussion of semantic issues comes later; see Section 3.3, “Including semantic results”.
The process of generating a GLR parser is broadly the same as for standard Happy. You write a grammar specification, run Happy on this to generate some Haskell code, then compile and link this into your program.
An alternative to using Happy directly is to use the BNF Converter tool by Markus Forsberg, Peter Gammie, Michael Pellauer and Aarne Ranta. This tool creates an abstract syntax, grammar, pretty-printer and other useful items from a single grammar formalism, thus it saves a lot of work and improves maintainability. The current output of BNFC can be used with GLR mode now with just a few small changes, but from January 2005 we expect to have a fully-compatible version of BNFC.
Most of the features of Happy still work, but note the important points below.
The GLR parser is generated in TWO files, one for data and one for the driver. This is because the driver code needs to be optimized, but for large parsers with lots of data, optimizing the data tables too causes compilation to be too slow.
	       Given a file Foo.y, file
	       FooData.hs is generated with basic type
	       information, the parser tables, and the header and tail code
	       that was included in the parser specification.
	       Note that Happy generates the
	       module declaration line, so you should NOT give it in the
	       grammar file.
	       The driver is placed in file Foo.hs, and
	       does not contain any user-supplied text.
	       
	       You can declare a lexer (and error token) with the
	       %lexer directive as normal, but the
	       generated parser does NOT call this lexer automatically.
	       The action of the directive is only to
	       export the lexer function to the top
	       level. This is because some applications need finer control
	       of the lexing process.
	       
This still works, but note the reasons. The precedence and associativity declarations are used in Happy's LR table creation to resolve certain conflicts. It does this by retaining the actions implied by the declarations and removing the ones which clash with these. The GLR parser back-end then produces code from these filtered tables, hence the rejected actions are never considered by the GLR parser.
Hence, declaring precedence and associativity is still a good thing, since it avoids a certain amount of ambiguity that the user knows how to remove.
	       There is some support for monadic parsers.
	       The "tree decoding" mode
	       (see Section 3.3.2, “Tree decoding”) can use the
	       information given in the %monad
	       declaration to monadify the decoding process.
	       This is explained in more detail in
	       Section 3.3.4, “Monadic tree decoding”.
	       
Note: the generated parsers don't include Ashley Yakeley's monad context information yet. It is currently just ignored. If this is a problem, email and I'll make the changes required.
This has no effect at present. It will probably remain this way: if you want to control names, you could use qualified import.
	       The generation of semantic code relies on type information
	       given in the grammar specification. If you don't give an
	       explicit signature, the type () is
	       assumed. If you get type clashes mentioning
	       () you may need to add type annotations.
	       Similarly, if you don't supply code for the semantic rule
	       portion, then the value () is used.
	       
error symbol in grammars, and recovery
		
	       No attempt to implement this yet. Any use of
	       error in grammars is thus ignored, and
	       parse errors will eventually mean a parse will fail.
	       
	       The type used for tokens must be in
	       the Ord type class (and hence in
	       Eq), plus it is recommended that they
	       are in the Show class too.
	       The ordering is required for the implementation of
	       ambiguity packing.
	       It may be possible to relax this requirement, but it
	       is probably simpler just to require instances of the type
	       classes. Please tell us if this is a problem.
	       
	The driver file exports a function
	doParse :: [[UserDefTok]] -> GLRResult.
	If you are using several parsers, use qualified naming to
	distinguish them.
	UserDefTok is a synonym for the type declared with
	the %tokentype directive.
        
	The input to doParse is a list of
	list of token values.
	The outer level represents the sequence of input symbols, and
	the inner list represents ambiguity in the tokenisation of each
	input symbol.
	For example, the word "run" can be at least a noun or a verb,
	hence the inner list will contain at least two values.
	If your tokens are not ambiguous, you will need to convert each
	token to a singleton list before parsing.
        
The parse result is expressed with the following types. A successful parse yields a forest (explained below) and a single root node for the forest. A parse may fail for one of two reasons: running out of input or a (global) parse error. A global parse error means that it was not possible to continue parsing any of the live alternatives; this is different from a local error, which simply means that the current alternative dies and we try some other alternative. In both error cases, the forest at failure point is returned, since it may contain useful information. Unconsumed tokens are returned when there is a global parse error.
	type ForestId = (Int,Int,GSymbol)
	data GSymbol  = <... automatically generated ...>
	type Forest   = FiniteMap ForestId [Branch]
	type RootNode = ForestId
	type Tokens   = [[(Int, GSymbol)]]
	data Branch   = Branch {b_sem :: GSem, b_nodes :: [ForestId]}
	data GSem     = <... automatically generated ...>
	data GLRResult
	 = ParseOK     RootNode Forest    -- forest with root
	 | ParseError  Tokens   Forest    -- partial forest with bad input
	 | ParseEOF             Forest    -- partial forest (missing input)
        
	Conceptually, the parse forest is a directed, acyclic and-or
	graph. It is represented by a mapping of ForestIds
	to lists of possible analyses. The FiniteMap
	type is used to provide efficient and convenient access.
	The ForestId type identifies nodes in the
	graph, named by the range of input they span and the category of
	analysis they license. GSymbol is generated
	automatically as a union of the names of grammar rules (prefixed
	by G_ to avoid name clashes) and of tokens and
	an EOF symbol. Tokens are wrapped in the constructor
	HappyTok :: UserDefTok -> GSymbol.
	
	The Branch type represents a match for some
	right-hand side of a production, containing semantic information
	(see below)
	and a list of sub-analyses. Each of these is a node in the graph.
	Note that tokens are represented as childless nodes that span
	one input position. Empty productions will appear as childless nodes
	that start and end at the same position.
	
	Happy will generate two files, and these
	should be compiled as normal Haskell files.
	If speed is an issue, then you should use the -O
	flags etc with the driver code, and if feasible, with the parser
	tables too.
        
        You can also use the --ghc flag to trigger certain
	GHC-specific optimizations. At present,
	this just causes use of unboxed types in the tables and in some key
	code.
	Using this flag causes relevant GHC
	option pragmas to be inserted into the generated code, so you shouldn't
	have to use any strange flags (unless you want to...).