ABNF Parser Generator
e-mail addresses
When processing HTML forms, a common problem is validating e-mail addresses. This is usually done with a regular expression, and an Internet search will turn up a large number of them in use, ranging from simple to complex. For form validation it is generally not a good idea to go overboard with the precision of the validation, and these regular expressions all serve a useful purpose. However, the specification given in RFC 5322 allows parenthesised comments that can be nested inside one another. Nesting is a recursive construct and hence cannot be parsed with a regular expression.
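To make the recursion concrete, here is a minimal sketch in Python of how a hand-written matcher might step over a nested comment. The function name `skip_comment` is hypothetical, and the sketch is deliberately simplified: it handles only parentheses and backslash escapes (quoted-pairs) and does not check the character ranges or folding white space rules of RFC 5322.

```python
def skip_comment(text, start=0):
    """Return the index just past the comment opening at text[start].

    Returns -1 if text[start] is not "(" or the comment is never closed.
    Simplified: any character is accepted as comment text.
    """
    if start >= len(text) or text[start] != "(":
        return -1
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # quoted-pair: the backslash escapes the next character
            if i + 1 >= len(text):
                return -1
            i += 2
        elif ch == "(":
            # a nested comment: recurse, then continue after it
            i = skip_comment(text, i)
            if i == -1:
                return -1
        elif ch == ")":
            return i + 1
        else:
            i += 1
    return -1  # ran out of input before the closing parenthesis


# Comment taken from one of the test addresses, cal(woo(yay)hoopla)@iamcal.com
end = skip_comment("(woo(yay)hoopla)@iamcal.com")
```

The recursive call is the part a classical regular expression has no counterpart for: the matcher must remember how many comments are still open, and that depth is unbounded.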
This grammar was put together as an exercise in using Interactive APG. The situation is a common one: the desired grammar is not neatly and completely defined in one place but is, rather, defined by statements scattered throughout a larger specification. The procedure was to find the required start rule in RFC 5322 ("addr-spec" in this case, which is briefly defined with only four other rules) and paste these rules into the ABNF grammar textarea. The message log will then list the other rule names that are required but not present. Add them and repeat the procedure until a complete grammar is defined. At this point several unneeded rules will have crept in. They can be discovered in the list of unreferenced rules at the end of the "Grammar Attributes" output and then eliminated, resulting in the grammar given here. (NOTE: Here is a nifty tool for extracting ABNF from RFCs.)
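The bookkeeping in that procedure (which rule names are referenced but not yet defined, and which are defined but never referenced) can be sketched independently of APG. The following Python sketch is an assumption-laden illustration, not how APG itself works: the function name `audit_grammar` and the sample fragment are hypothetical, the fragment is only loosely modelled on RFC 5322, and the scanner relies on simple pattern matching rather than a real ABNF parser.

```python
import re

# A rule definition starts in column 0: name, optional spaces, "=" or "=/".
RULE_DEF = re.compile(r"^([A-Za-z][A-Za-z0-9-]*)\s*=/?")
RULE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
# Text that may contain letters but is not a rule reference:
# quoted strings, prose values, numeric values, and comments.
NOISE = re.compile(r'(?:%[sSiI])?"[^"]*"|<[^>]*>|%[bdxBDX][0-9A-Fa-f.\-]+|;.*')


def audit_grammar(abnf, start_rule):
    """Return (missing, unreferenced) rule names, lower-cased and sorted."""
    defined = set()
    referenced = set()
    for line in abnf.splitlines():
        body = line
        match = RULE_DEF.match(line)
        if match:
            defined.add(match.group(1).lower())
            body = line[match.end():]
        body = NOISE.sub(" ", body)
        referenced.update(name.lower() for name in RULE_NAME.findall(body))
    missing = sorted(referenced - defined)
    unreferenced = sorted(defined - referenced - {start_rule.lower()})
    return missing, unreferenced


# Hypothetical, shortened fragment used only to exercise the scanner.
FRAGMENT = """\
addr-spec  = local-part "@" domain
local-part = dot-atom / quoted-string
domain     = dot-atom / domain-literal
dot-atom   = 1*atext *("." 1*atext)
atext      = ALPHA / DIGIT / "-"   ; shortened for the example
word       = atom / quoted-string
"""

missing, unreferenced = audit_grammar(FRAGMENT, "addr-spec")
print("still to add:", missing)
print("candidates for removal:", unreferenced)
```

For this fragment the first list names the rules that still have to be pasted in, and the second contains only `word`, which nothing reachable from the start rule refers to. Rule names are compared case-insensitively, as ABNF requires.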
A parser generated from the grammar given here, together with the JavaScript APG library, could serve as a more precise alternative to the regular expression validators.
e-mail addresses input
A more thorough study of e-mail addresses can be found here and here, and no doubt elsewhere. The grammar given here isn't perfect, but it is a good start if you want to tune it up and create a JavaScript parser that will compete with the PHP parsers cited above. You can select from a few of their trickier test cases here. These are all valid e-mail addresses and should parse correctly.
"first\"last"@example.com
"first@last"@example.com
first.last@[IPv6:::12.34.56.78]
(foo)cal(bar)@(baz)iamcal.com(quux)
cal(woo(yay)hoopla)@iamcal.com
cal(foo\)bar)@iamcal.com