
UDT Naming and Usage

UDTs appear in rule definitions just like other rule names. For example, instead of

    ipv4 = num 3("." num)
    num = 1*3%d48-57

we could write

    ipv4 = u_num 3("." u_num)

Note that the UDT "u_num" begins with "u_" and that, unlike the rule name "num", it has no rule definition in the grammar. The "u_" prefix identifies it to the Generator as a UDT. The underscore ensures that it will not conflict with any rule name.
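The naming rule above can be illustrated with a short sketch. It is not the Generator's code. It assumes rule names follow the RFC 5234 form (a letter followed by letters, digits, or hyphens) and that a UDT name is one of the two prefixes followed by a name of that same form. The function names `classify` and `u_num`, and the callback convention (return the matched length, or `None` on failure), are hypothetical.

```python
import re

# Assumption: rule names follow RFC 5234 (letter, then letters/digits/hyphens).
RULE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
# Assumption: a UDT name is "u_" or "e_" followed by a rule-name-like body.
UDT_NAME = re.compile(r"[ue]_[A-Za-z][A-Za-z0-9-]*")


def classify(name):
    """Return 'udt', 'rule', or 'invalid' for a name found in a grammar."""
    if UDT_NAME.fullmatch(name):
        return "udt"
    if RULE_NAME.fullmatch(name):
        return "rule"
    return "invalid"


def u_num(text, offset):
    """Hypothetical hand-written terminal equivalent to num = 1*3%d48-57.

    Matches one to three decimal digits starting at offset and returns the
    number of characters matched, or None if no digit is found there.
    """
    end = offset
    while end < len(text) and end - offset < 3 and "0" <= text[end] <= "9":
        end += 1
    return end - offset if end > offset else None


print(classify("u_num"))          # udt
print(classify("num"))            # rule
print(u_num("192.168.0.1", 0))    # 3
print(u_num("192.168.0.1", 3))    # None
```

Because the underscore is not part of the assumed rule-name syntax, no string can be classified as both a rule name and a UDT name.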

There are actually two prefixes that identify UDTs: "u_" and "e_". The two designations mark a subtle but serious difference: whether or not the UDT accepts empty strings. "e_" indicates that the UDT will accept empty strings; "u_" indicates that it will not. The Parser enforces this distinction. If a UDT named "u_my-udt", for example, returns an empty string, the Parser will throw an exception.

The reason is that a left-recursive grammar puts the Parser into an infinite recursion that ends in a stack overflow. The check for left recursion in the GeneratorAttributes class relies on knowing whether each rule name operator (and each UDT operator as well) accepts empty strings. Because UDTs are user-written, their general properties are unknown to the Generator. The naming convention, together with the Parser's enforcement of it, prevents a possible stack overflow caused by hidden left recursion.
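The enforcement described above can be sketched as a small wrapper around the UDT call. This is only an illustration of the idea, not the Parser's actual implementation. The names `UdtError` and `call_udt` are hypothetical, as is the callback convention (a UDT returns the matched length, or `None` when it does not match).

```python
class UdtError(Exception):
    """Raised when a UDT breaks the empty-string naming convention."""


def call_udt(name, udt, text, offset):
    """Call a UDT and enforce the "u_" / "e_" convention on its result.

    The udt callback is assumed to return the number of characters matched
    at offset, or None when it does not match at all.
    """
    matched = udt(text, offset)
    if matched is None:
        return None  # an ordinary failure to match is always allowed
    if matched == 0 and name.startswith("u_"):
        # A "u_" UDT promised never to match the empty string.
        raise UdtError(f'UDT "{name}" returned an empty match')
    return matched


def always_empty(text, offset):
    """A UDT that always succeeds with an empty match."""
    return 0


print(call_udt("e_maybe", always_empty, "abc", 0))  # 0
try:
    call_udt("u_my-udt", always_empty, "abc", 0)
except UdtError as err:
    print(err)  # UDT "u_my-udt" returned an empty match
```

With this check in place, a left-recursion analysis can safely treat every "u_" UDT as consuming at least one character, without knowing anything else about the user-written code.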