Happy is a parser generator system for Haskell, similar to the tool yacc for C. Like yacc, it takes a file containing an annotated BNF specification of a grammar and produces a Haskell module containing a parser for the grammar.
Happy is flexible; unlike yacc, you can have several Happy parsers in the same program, and each parser may have multiple entry points. Happy can work in conjunction with a lexical analyser supplied by the user (either hand-written or generated by another program), or it can parse a stream of characters directly (but this isn't practical in most cases). In a future version we hope to include a lexical analyser generator with Happy as a single package.
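To illustrate what a user-supplied, hand-written lexical analyser might look like, here is a minimal sketch for a small arithmetic language. The `Token` type, its constructor names, and the function name `lexer` are assumptions chosen for this example; they are not prescribed by Happy.

```haskell
import Data.Char (isAlpha, isDigit, isSpace)

-- Hypothetical token type for a small arithmetic language.
-- A grammar file would refer to these constructors when it
-- declares its terminal symbols.
data Token
  = TokInt Int
  | TokVar String
  | TokPlus
  | TokTimes
  | TokOpenParen
  | TokCloseParen
  deriving (Eq, Show)

-- Convert a string into a list of tokens, skipping whitespace.
lexer :: String -> [Token]
lexer [] = []
lexer (c:cs)
  | isSpace c = lexer cs
  | isDigit c = let (digits, rest) = span isDigit (c:cs)
                in TokInt (read digits) : lexer rest
  | isAlpha c = let (name, rest) = span isAlpha (c:cs)
                in TokVar name : lexer rest
-- If no guard above matches, matching falls through to these clauses.
lexer ('+':cs) = TokPlus : lexer cs
lexer ('*':cs) = TokTimes : lexer cs
lexer ('(':cs) = TokOpenParen : lexer cs
lexer (')':cs) = TokCloseParen : lexer cs
lexer (c:_)    = error ("lexer: unexpected character " ++ show c)
```

Under these assumptions, a parser function generated by Happy would be applied to the lexer's output, for example `parse (lexer input)`, where `parse` stands for whatever name the grammar gives its parser.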
Parsers generated by Happy are fast; generally faster than an equivalent parser written using parsing combinators or similar tools. Furthermore, any future improvements made to Happy will benefit an existing grammar, without need for a rewrite.
Happy is sufficiently powerful to parse Haskell itself: there is a freely available Haskell parser written using Happy, which can be obtained from The hsparser Page and is included with versions of GHC from 5.00 onwards.
Happy can currently generate four types of parser from a given grammar, the intention being that we can experiment with different kinds of functional code to see which is the best, and compiler writers can use the different types of parser to tune their compilers. The types of parser supported are:
1. "Standard" Haskell 98 (should work with any compiler that compiles Haskell 98).
2. Standard Haskell using arrays (this is not the default because we have found this generates slower parsers than 1).
3. Haskell with GHC (Glasgow Haskell) extensions. This is a slightly faster option than 1 for Glasgow Haskell users.
4. GHC Haskell with string-encoded arrays. This is the fastest/smallest option for GHC users. If you're using GHC, the optimum flag settings are -agc (see Chapter 4).
Happy can also generate parsers which will dump debugging information at run time, showing state transitions and the input tokens to the parser.