Java APG
Java APG is APG, an ABNF Parser Generator
written in the Java language.
A summary of its new features is:
-
Both APG and the parsers it generates are written entirely in Java.
-
A new operator, the User-Defined Terminal (UDT), is introduced
which puts semantic actions on a nearly equal footing with the other
ABNF terminal phrases such as the literal string.
UDTs allow the user to write phrase recognition functions and convert them to
parser operators.
-
The AST is optionally available in XML format, freeing the user to develop a translator with the XML parser of his or her choice.
APG was originally written to fulfill a need for a parser generator that would generate parsers
directly from ABNF grammars as defined by the IETF in
RFC 4234.
Since then the grammar syntax for APG has evolved from that standard
1) to generate unambiguous parsers
and 2) to add capabilities beyond the class of context-free languages.
Because of 2) the APG grammars are called SABNF (superset ABNF).
ABNF and SABNF will often be used interchangeably in this document.
The differences between RFC 4234 grammars and SABNF are summarized here:
-
The <prose-val> element is not supported.
-
Incremental Alternatives - Rule1 =/ Rule2 - are not supported.
-
Prioritized-choice is used to disambiguate the grammars. That is, alternates are tried
as they appear in the grammar from left to right. The first alternate to successfully match a phrase
is accepted and all other alternates are ignored.
-
Repetitions always consume the longest string possible
with no alternative being considered.
-
APG accepts the syntactic predicate operators AND(&) and NOT(!).
-
Java APG accepts User-Defined Terminals (UDTs). UDTs appear in rule expressions
just like rule names, except that the names must begin with
u_ or
e_ (* see below).
The underscore ensures that there will never be
a name conflict with a rule name. No rule name definition is given for the UDTs.
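The matching rules listed above (prioritized choice, greedy repetition without backtracking, and the AND/NOT predicates) can be sketched with a few tiny matcher functions. This is only an illustration of the semantics, not the Java APG API: the class SabnfSemanticsSketch, the interface Op and the helpers lit, alt, cat, rep, and, not are all hypothetical names introduced here. Each operator returns the length of the matched phrase, or -1 on failure.

```java
/** Hypothetical mini-matcher illustrating SABNF matching rules; NOT the Java APG API. */
final class SabnfSemanticsSketch {
    /** Returns the matched phrase length at pos, or -1 if there is no match. */
    interface Op { int match(String in, int pos); }

    /** Literal string terminal. */
    static Op lit(String s) {
        return (in, pos) -> in.startsWith(s, pos) ? s.length() : -1;
    }

    /** Prioritized choice: the first alternate that matches wins, the rest are ignored. */
    static Op alt(Op... ops) {
        return (in, pos) -> {
            for (Op op : ops) {
                int n = op.match(in, pos);
                if (n >= 0) return n;
            }
            return -1;
        };
    }

    /** Concatenation: every operand must match in sequence. */
    static Op cat(Op... ops) {
        return (in, pos) -> {
            int total = 0;
            for (Op op : ops) {
                int n = op.match(in, pos + total);
                if (n < 0) return -1;
                total += n;
            }
            return total;
        };
    }

    /** Zero-or-more repetition: consumes the longest string possible and never gives any back. */
    static Op rep(Op op) {
        return (in, pos) -> {
            int total = 0;
            while (true) {
                int n = op.match(in, pos + total);
                if (n <= 0) return total; // stop on failure or on an empty match
                total += n;
            }
        };
    }

    /** AND predicate (&): succeeds with an empty match if op matches; consumes nothing. */
    static Op and(Op op) { return (in, pos) -> op.match(in, pos) >= 0 ? 0 : -1; }

    /** NOT predicate (!): succeeds with an empty match if op fails; consumes nothing. */
    static Op not(Op op) { return (in, pos) -> op.match(in, pos) >= 0 ? -1 : 0; }

    public static void main(String[] args) {
        // "a" is tried first and accepted, so "ab" is never considered: prints 1
        System.out.println(alt(lit("a"), lit("ab")).match("ab", 0));
        // the repetition consumes all three a's, leaving nothing for the final "a": prints -1
        System.out.println(cat(rep(lit("a")), lit("a")).match("aaa", 0));
    }
}
```

Note that the order of alternates matters under these rules: in this sketch, alt(lit("a"), lit("ab")) followed by lit("c") fails on "abc", while the reordered alt(lit("ab"), lit("a")) succeeds.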
(*) The reason for the two designations u_ and
e_ is a subtle but serious difference as to whether the UDT will accept empty
strings or not. e_ indicates that the UDT will accept empty strings.
u_ indicates that the UDT will not accept empty strings. The Parser
enforces this distinction. If a UDT named u_my-udt, for example, returns
an empty string, the Parser will throw an exception. The reason is that
left-recursive grammars put the Parser into an infinite loop that causes a stack overflow.
The check for left recursion in the GeneratorAttributes class relies on
knowing whether the rule name operators (and UDT operators as well) accept empty strings.
The general properties of the UDTs are unknown to the Generator because they are user-written. This
naming convention, and the Parser's enforcement of it, is the only way to prevent a possible stack overflow
due to a hidden left recursion.
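The enforcement described above might look roughly like the following sketch. It is not taken from the Java APG source: the class UdtEmptyCheckSketch, the callback interface Udt, the methods mayBeEmpty and call, and the choice of exception types are all assumptions made for illustration. The callback is assumed to return the matched phrase length, or -1 for no match.

```java
/** Hypothetical sketch of the u_/e_ naming rule; NOT the Java APG implementation. */
final class UdtEmptyCheckSketch {
    /** Assumed callback shape: returns the matched length at offset, or -1 for no match. */
    interface Udt { int match(String input, int offset); }

    /** Derives the empty-string property from the name prefix alone. */
    static boolean mayBeEmpty(String name) {
        if (name.startsWith("e_")) return true;   // may accept empty strings
        if (name.startsWith("u_")) return false;  // must not accept empty strings
        throw new IllegalArgumentException("UDT name must begin with u_ or e_: " + name);
    }

    /** Runs the user-written UDT and enforces the promise made by its name. */
    static int call(String name, Udt udt, String input, int offset) {
        boolean emptyAllowed = mayBeEmpty(name);
        int length = udt.match(input, offset);
        if (length == 0 && !emptyAllowed) {
            // An empty match here could hide a left recursion that the
            // generator's check was told could not happen.
            throw new IllegalStateException("UDT " + name + " returned an empty string");
        }
        return length;
    }

    public static void main(String[] args) {
        Udt digit = (in, off) ->
            off < in.length() && Character.isDigit(in.charAt(off)) ? 1 : -1;
        System.out.println(call("u_digit", digit, "7", 0)); // prints 1
    }
}
```

Because the property is read from the name rather than from the user's code, a left-recursion check can treat u_ operators as always consuming input, and the run-time check turns any violation of that promise into an immediate exception instead of an unbounded recursion.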
For more information about UDTs, APG, its versions and downloads,
you can visit the official web site
http://www.coasttocoastresearch.com.
For information on how to use Java APG you should consult this document,
the examples in com.coasttocoastresearch.examples
and the source code.
Licence Notification
All the software in this distribution is free software:
you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program in the COPYING file. If not, see GPL, Version 2
or GPL, Version 3 or write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.