Version 1.0 Summary
(Released October 23, 2011)
Java APG is a Java-language implementation of APG, an ABNF Parser Generator. A summary of new features follows:
- Both APG and the parsers it generates are written entirely in Java.
- A new operator, the User-Defined Terminal (UDT), is introduced. It gives the user an unrestricted, handwritten phrase recognizer that speeds up the recognition of simple phrases and solves many of the problems that arise outside the reach of context-free grammars (CFGs).
- The Abstract Syntax Tree (AST) is optionally available in XML format, freeing the user to develop a translator with the XML parser of their choice.
- A trace of the parser's path through the syntax tree is also available in XML format.
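As an illustration of the XML option above, the following is a minimal sketch of walking an XML syntax tree with the DOM parser that ships with the JDK. The XML layout used here (`ast` and `node` elements with a `rule` attribute) and the class name `AstXmlWalker` are assumptions made for the example; they are not the format Java APG actually emits, which should be taken from its documentation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class AstXmlWalker {
    // Hypothetical sample document; element and attribute names are illustrative only.
    static final String SAMPLE =
        "<ast><node rule=\"sum\">"
        + "<node rule=\"number\">1</node>"
        + "<node rule=\"number\">2</node>"
        + "</node></ast>";

    /** Returns an indented outline of the rule names found in the XML tree. */
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not read XML syntax tree", e);
        }
    }

    // Depth-first walk: print each element that carries a "rule" attribute,
    // indenting two spaces per level of nesting.
    private static void walk(Element e, int depth, StringBuilder out) {
        if (e.hasAttribute("rule")) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(e.getAttribute("rule")).append('\n');
            depth++;
        }
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            if (kid instanceof Element) {
                walk((Element) kid, depth, out);
            }
        }
    }
}
```

A translator would replace the outline printing with its own per-rule handling; the point is only that any standard XML tooling can consume the tree once it is in XML form.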
APG was originally developed simply for the purpose of having a parser generator for the ABNF grammar syntax (RFC 4234). It has since developed into an easy-to-use, reliable, compact and speedy tool that has found its way into a number of commercial applications. From inception it has been built on the concept of operators which simulate the seven ABNF syntax constructs. These are the four non-terminals (rules or productions, alternations, concatenations and repetitions of phrases) and the three terminals (the character range, the case-insensitive literal string and the binary string). In its present form, APG generates unambiguous parsers with syntactic predicates and user-defined semantic actions. Though developed in a completely ad hoc fashion, APG bears a remarkable similarity to the rigorously developed Parsing Expression Grammars (PEGs).
The most flexible feature of APG is the simplicity of introducing new operators to stretch beyond the limits of CFG languages. For example, syntactic predicates were first introduced to APG in Version 4.0 simply as a new operator, closely related to the repetition operator, as a solution to a non-CFG parsing problem that arose in developing a C++ pre-parser. They were later standardized to the PEG forms in Version 5.0.
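The operator model and the PEG-style syntactic predicates can be sketched with a few small combinators. This is a minimal illustration only, not APG's implementation: the `Op` interface, the `Ops` class and all method names are hypothetical, and first-match alternation is assumed in the PEG style. Each operator returns the length of the phrase it matches at a given position, or -1 on failure. The two predicates look ahead and always match the empty phrase or fail, so they never consume input.

```java
// Hypothetical operator interface: phrase length matched at pos, or -1 on failure.
interface Op {
    int match(String in, int pos);
}

final class Ops {
    // Terminal: case-insensitive literal string.
    static Op literal(String text) {
        return (in, pos) ->
            in.regionMatches(true, pos, text, 0, text.length()) ? text.length() : -1;
    }

    // Terminal: a single character in the range lo..hi.
    static Op range(char lo, char hi) {
        return (in, pos) ->
            pos >= 0 && pos < in.length() && in.charAt(pos) >= lo && in.charAt(pos) <= hi ? 1 : -1;
    }

    // Non-terminal: concatenation, every operand must match in sequence.
    static Op cat(Op... ops) {
        return (in, pos) -> {
            int len = 0;
            for (Op op : ops) {
                int n = op.match(in, pos + len);
                if (n < 0) return -1;
                len += n;
            }
            return len;
        };
    }

    // Non-terminal: alternation, the first operand that matches wins.
    static Op alt(Op... ops) {
        return (in, pos) -> {
            for (Op op : ops) {
                int n = op.match(in, pos);
                if (n >= 0) return n;
            }
            return -1;
        };
    }

    // Non-terminal: repetition, between min and max matches of op.
    static Op rep(int min, int max, Op op) {
        return (in, pos) -> {
            int len = 0;
            int count = 0;
            while (count < max) {
                int n = op.match(in, pos + len);
                if (n < 0) break;
                if (n == 0) return len; // an empty match repeats indefinitely; stop here
                len += n;
                count++;
            }
            return count >= min ? len : -1;
        };
    }

    // Syntactic predicates: succeed or fail on lookahead, consuming nothing.
    static Op and(Op op) {
        return (in, pos) -> op.match(in, pos) >= 0 ? 0 : -1;
    }

    static Op not(Op op) {
        return (in, pos) -> op.match(in, pos) >= 0 ? -1 : 0;
    }
}
```

For example, `Ops.cat(Ops.not(Ops.literal("if")), Ops.rep(1, Integer.MAX_VALUE, Ops.range('a', 'z')))` accepts a lowercase word only when it does not begin with `if`. Because a predicate is just another operator with the same signature, adding it requires no change to the other operators.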
As a generalization of this idea, Java APG introduces User-Defined Terminals (UDTs). This feature opens up the design of new operators to the user's imagination. As a small beginning, Java APG comes with a modest UDT library of efficient alternatives to simple phrases such as alphanumeric strings, whitespace, comments and others. These UDTs effectively enlarge and enhance the three ABNF terminals provided in the original syntax.
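To make the idea of a handwritten phrase recognizer concrete, here is a minimal sketch of one for alphanumeric strings, roughly the phrase that the ABNF rule `alphanum = ALPHA *(ALPHA / DIGIT)` describes. The class name `AlphanumRecognizer` and its `match` signature are assumptions for illustration; they are not the Java APG UDT interface or part of its UDT library.

```java
// Hypothetical handwritten recognizer; not the Java APG UDT interface.
final class AlphanumRecognizer {
    /**
     * Tries to match a letter followed by any number of letters or digits,
     * starting at offset. Returns the matched phrase length, or -1 if the
     * phrase does not occur at that position.
     */
    static int match(char[] input, int offset) {
        if (offset < 0 || offset >= input.length || !isAlpha(input[offset])) {
            return -1;
        }
        int i = offset + 1;
        // A single tight loop replaces the rule, alternation and repetition
        // operators a grammar-driven parser would step through per character.
        while (i < input.length && (isAlpha(input[i]) || isDigit(input[i]))) {
            i++;
        }
        return i - offset;
    }

    private static boolean isAlpha(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```

Called as `AlphanumRecognizer.match("abc123 rest".toCharArray(), 0)`, the sketch reports a phrase of length 6 and stops at the space. The recognizer only has to report whether a phrase is present and how long it is, which is what lets it stand in for a terminal.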
While UDTs are similar in functionality to the semantic actions found in many other parser generators, and to the rule callback functions of previous versions of APG, Java APG presents them in a standardized form that is easy to write, visualize and implement. The UDT formalism puts semantic actions on a nearly equal footing with the parser's other, language-defining operators, including the original ABNF terminals.
A large number of examples are given which demonstrate the use of Java APG, from simple setup and execution to rather sophisticated use of UDTs. The examples also include some timing studies comparing CFG and UDT versions of simple language phrases.