Grammar! Grammar! Grammar! Grammar!

Mushroom! Mushroom!

Internet Engineering Task Force

  • The IETF defines Internet standards, many of which underpin the web

  • Some of those standards contain grammars


  • For example, Uniform Resource Identifiers (URIs) are defined by RFC 3986

  • RFC 3986 contains a grammar for parsing URIs

  • Unfortunately, uri.abnf is not an iXML grammar

    ;; Transcribed from RFC 3986 by ndw
    URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    hier-part     = "//" authority path-abempty
                  / path-absolute
                  / path-rootless
                  / path-empty
    URI-reference = URI / relative-ref
    absolute-URI  = scheme ":" hier-part [ "?" query ]
    relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
    relative-part = "//" authority path-abempty
                  / path-absolute
                  / path-noscheme
                  / path-empty
    scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    authority     = [ userinfo "@" ] host [ ":" port ]
    userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
    host          = IP-literal / IPv4address / reg-name
    port          = *DIGIT
    IP-literal    = "[" ( IPv6address / IPvFuture  ) "]"
    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
    IPv6address   =                            6( h16 ":" ) ls32
                  /                       "::" 5( h16 ":" ) ls32
                  / [               h16 ] "::" 4( h16 ":" ) ls32
                  / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                  / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                  / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                  / [ *4( h16 ":" ) h16 ] "::"              ls32
                  / [ *5( h16 ":" ) h16 ] "::"              h16
                  / [ *6( h16 ":" ) h16 ] "::"
    h16           = 1*4HEXDIG
    ls32          = ( h16 ":" h16 ) / IPv4address
    IPv4address   = dec-octet "." dec-octet "." dec-octet "." dec-octet
    dec-octet     = DIGIT                 ; 0-9
                  / %x31-39 DIGIT         ; 10-99
                  / "1" 2DIGIT            ; 100-199
                  / "2" %x30-34 DIGIT     ; 200-249
                  / "25" %x30-35          ; 250-255
    reg-name      = *( unreserved / pct-encoded / sub-delims )
    path          = path-abempty    ; begins with "/" or is empty
                  / path-absolute   ; begins with "/" but not "//"
                  / path-noscheme   ; begins with a non-colon segment
                  / path-rootless   ; begins with a segment
                  / path-empty      ; zero characters
    path-abempty  = *( "/" segment )
    path-absolute = "/" [ segment-nz *( "/" segment ) ]
    path-noscheme = segment-nz-nc *( "/" segment )
    path-rootless = segment-nz *( "/" segment )
    path-empty    = 0<pchar>
    segment       = *pchar
    segment-nz    = 1*pchar
    segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                  ; non-zero-length segment without any colon ":"
    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
    query         = *( pchar / "/" / "?" )
    fragment      = *( pchar / "/" / "?" )
    pct-encoded   = "%" HEXDIG HEXDIG
    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
    reserved      = gen-delims / sub-delims
    gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                     / "*" / "+" / "," / ";" / "="
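The top-level rule of this grammar (scheme, authority, path, query, fragment) can be illustrated with the regular expression that RFC 3986 itself gives in Appendix B. This is only a sketch in Python: the function name split_uri and the dictionary keys are choices made here for illustration, and the expression splits a URI reference into components without validating them against the finer-grained rules such as IPv6address or dec-octet.

```python
import re

# Regular expression from RFC 3986, Appendix B. It separates the five
# top-level components but does not check them against the full grammar.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(text):
    """Split a URI reference into its five top-level components.

    Components that are absent come back as None; the path is always a
    string, possibly empty. (split_uri is a name chosen for this sketch.)
    """
    # Every part of the expression is optional, so it always matches.
    match = URI_REFERENCE.match(text)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


parts = split_uri("http://example.com/a/b?x=1#top")
```

Splitting is the easy part; deciding whether "example.com" is a valid reg-name, or whether a bracketed host is a valid IPv6address, needs the whole grammar, which is where a grammar-driven parser earns its keep.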

Grammar! Grammar!

  • Grammars in IETF specifications are defined with Augmented BNF (ABNF)

  • RFC 5234 contains a grammar for ABNF

  • Unfortunately, abnf.abnf is also not an iXML grammar

    rulelist       =  1*( rule / (*c-wsp c-nl) )
    rule           =  rulename defined-as elements c-nl
                           ; continues if next line starts
                           ;  with white space
    rulename       =  ALPHA *(ALPHA / DIGIT / "-")
    defined-as     =  *c-wsp ("=" / "=/") *c-wsp
                           ; basic rules definition and
                           ;  incremental alternatives
    elements       =  alternation *c-wsp
    c-wsp          =  WSP / (c-nl WSP)
    c-nl           =  comment / CRLF
                           ; comment or newline
    comment        =  ";" *(WSP / VCHAR) CRLF
    alternation    =  concatenation
                      *(*c-wsp "/" *c-wsp concatenation)
    concatenation  =  repetition *(1*c-wsp repetition)
    repetition     =  [repeat] element
    repeat         =  1*DIGIT / (*DIGIT "*" *DIGIT)
    element        =  rulename / group / option /
                      char-val / num-val / prose-val
    group          =  "(" *c-wsp alternation *c-wsp ")"
    option         =  "[" *c-wsp alternation *c-wsp "]"
    char-val       =  DQUOTE *(%x20-21 / %x23-7E) DQUOTE
                           ; quoted string of SP and VCHAR
                           ;  without DQUOTE
    num-val        =  "%" (bin-val / dec-val / hex-val)
    bin-val        =  "b" 1*BIT
                      [ 1*("." 1*BIT) / ("-" 1*BIT) ]
                           ; series of concatenated bit values
                           ;  or single ONEOF range
    dec-val        =  "d" 1*DIGIT
                      [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]
    hex-val        =  "x" 1*HEXDIG
                      [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ]
    prose-val      =  "<" *(%x20-3D / %x3F-7E) ">"
                           ; bracketed string of SP and VCHAR
                           ;  without angles
                           ; prose description, to be used as
                           ;  last resort
    ALPHA          =  %x41-5A / %x61-7A   ; A-Z / a-z
    BIT            =  "0" / "1"
    CHAR           =  %x01-7F
                           ; any 7-bit US-ASCII character,
                           ;  excluding NUL
    CR             =  %x0D
                           ; carriage return
    CRLF           =  CR LF
                           ; Internet standard newline
    CTL            =  %x00-1F / %x7F
                           ; controls
    DIGIT          =  %x30-39
                           ; 0-9
    DQUOTE         =  %x22
                           ; " (Double Quote)
    HEXDIG         =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
    HTAB           =  %x09
                           ; horizontal tab
    LF             =  %x0A
                           ; linefeed
    LWSP           =  *(WSP / CRLF WSP)
                           ; Use of this linear-white-space rule
                           ;  permits lines containing only white
                           ;  space that are no longer legal in
                           ;  mail headers and have caused
                           ;  interoperability problems in other
                           ;  contexts.
                           ; Do not use when defining mail
                           ;  headers and use with caution in
                           ;  other contexts.
    OCTET          =  %x00-FF
                           ; 8 bits of data
    SP             =  %x20
    VCHAR          =  %x21-7E
                           ; visible (printing) characters
    WSP            =  SP / HTAB
                           ; white space
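The repeat rule above is what gives meaning to prefixes such as 6( h16 ":" ), *5( h16 ":" ) and 1*4HEXDIG in the URI grammar. As a rough sketch of how that one rule might be interpreted, here is a small Python function; the name parse_repeat and the (minimum, maximum) return shape are assumptions made for this example, not anything defined by RFC 5234.

```python
import re

# repeat = 1*DIGIT / (*DIGIT "*" *DIGIT)
# Group 1 is the exact-count form; groups 2 and 3 are the optional
# lower and upper bounds around "*".
REPEAT = re.compile(r"([0-9]+)|([0-9]*)\*([0-9]*)")


def parse_repeat(text):
    """Return (minimum, maximum) for an ABNF repeat prefix.

    A maximum of None means "no upper bound". (parse_repeat is a name
    chosen for this sketch.)
    """
    match = REPEAT.fullmatch(text)
    if match is None:
        raise ValueError(f"not an ABNF repeat: {text!r}")
    if match.group(1) is not None:
        # A bare number n means exactly n occurrences.
        count = int(match.group(1))
        return (count, count)
    minimum = int(match.group(2)) if match.group(2) else 0
    maximum = int(match.group(3)) if match.group(3) else None
    return (minimum, maximum)
```

For example, parse_repeat("1*4") gives (1, 4) for h16, and parse_repeat("*") gives (0, None), the plain "zero or more" case.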

Grammar! Grammar! Grammar!

  • Michael Sperberg-McQueen wrote an iXML grammar for ABNF

  • It is an iXML grammar!

    { The grammar notation defined by RFC 5234, "Augmented BNF for Syntax
    Specifications: ABNF", ed. D. Crocker and P. Overell, January 2008.
    The nonterminals used here are those of RFC 5234, but some definitions
    have been reformulated to use ixml idioms.  The definition of ABNF has
    no analogue to ixml marks for guiding XML serialization; the marks
    used here have been supplied by the transcriber.
    Transcribed into ixml by C. M. Sperberg-McQueen, February 2022. }
    rulelist      = (rule | (c-wsp*, c-nl))+.
    rule          = rulename, defined-as, elements, c-nl.
                         { continues if next line starts 
                           with white space }
    rulename      = ALPHA, (ALPHA | DIGIT | "-")*.
    defined-as    = c-wsp*, ("=" | "=/"), c-wsp*.
                         {  basic rules definition and 
                            incremental alternatives }
    elements      = alternation, c-wsp*.
    c-wsp         = WSP | (c-nl, WSP).
    c-nl          = comment | CRLF.  { comment or newline }
    comment       = ";", (WSP | VCHAR)*, CRLF.
    alternation   = concatenation ** (c-wsp*, "/", c-wsp*).
    concatenation = repetition ** (c-wsp+).
    repetition    = repeat?, element.
    repeat        = DIGIT+ | (DIGIT*, "*", DIGIT*).
    element       = rulename | group | option 
                  | char-val | num-val | prose-val.
    group         = "(", c-wsp*, alternation, c-wsp*, ")".
    option        = "[", c-wsp*, alternation, c-wsp*, "]".
    char-val      = DQUOTE, [#20 - #21; #23 - #7E]*, DQUOTE.
                         { quoted string of SP and VCHAR 
                           without DQUOTE }
    num-val       = "%", (bin-val | dec-val | hex-val).
    bin-val       = "b", BIT+, ((".", BIT+)+ | ("-", BIT+))?.
                         { series of concatenated bit values
                           or single ONEOF range }
    dec-val       = "d", DIGIT+, ((".", DIGIT+)+ | ("-", DIGIT+))?.
    hex-val       = "x", HEXDIG+, ((".", HEXDIG+)+ | ("-", HEXDIG+))?.
    prose-val     = "<", [#20 - #3D; #3F - #7E]*, ">".
                         { bracketed string of SP and VCHAR
                           without angles
                           prose description, to be used as
                           last resort }
    { 'Core rules' from Appendix B, intended for re-use. }
    ALPHA         = ["A"-"Z"; "a"-"z"]. { #41-#5A; #61-#7A }
    BIT           = "0"; "1".
    CR            = #0D. { carriage return }
    CRLF          = CR, LF { Internet standard newline }
                  | LF .   { Extension: support single LF convention for newlines }
    DIGIT         = ["0"-"9"]. { #30 - #39 }
    DQUOTE        = #22. { (Double Quote) }
    HEXDIG        = DIGIT; ["A"-"F"].
    HTAB          = #09. { horizontal tab }
    LF            = #0A. { linefeed }
    SP            = #20.
    VCHAR         = [#21 - #7E]. { visible (printing) characters }
    WSP           = SP | HTAB. { white space }
    { Included in Appendix B for reuse elsewhere, but not used
    by ABNF itself: }
    CHAR          = [#01 - #7F].  
                         { Any 7-bit US-ASCII character, 
                           excluding NUL }
    CTL           = [#00 - #1F; #7F]. { controls }
    LWSP          = (WSP | CRLF, WSP)*.
                         { Use of this linear-white-space rule
                           permits lines containing only white
                           space that are no longer legal in
                           mail headers and have caused 
                           interoperability problems in other contexts.
                           Do not use when defining mail
                           headers and use with caution in 
                           other contexts. }
    OCTET         = [#00 - #FF]. { 8 bits of data }
    { Notes:
    - As noted in the comments, this grammar has been reported ambiguous
      in a couple of places.
    - ABNF nonterminals are case-insensitive unless specified using
      numeric values for the characters.
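Comparing the two listings shows the pattern behind the reformulation: ABNF's prefix repetition becomes an iXML postfix operator (*c-wsp becomes c-wsp*, 1*DIGIT becomes DIGIT+, [repeat] becomes repeat?). The Python sketch below spells that mapping out. It assumes, as far as the iXML 1.0 notation goes, that there is no counted repetition, so a bounded ABNF repeat such as 1*4HEXDIG has to be written out; the function name ixml_repetition is made up for this example, and the term passed in is assumed to be a single name or an already parenthesised group.

```python
def ixml_repetition(term, minimum, maximum=None):
    """Render `term` repeated minimum..maximum times in iXML notation.

    maximum=None means unbounded. Bounded optional repeats are nested,
    e.g. x, (x, x?)?, rather than written flat as x, x?, x?, because
    the flat form can be parsed in more than one way.
    (ixml_repetition is a name chosen for this sketch.)
    """
    if minimum < 0 or (maximum is not None and maximum < max(minimum, 1)):
        raise ValueError("need 0 <= minimum <= maximum and maximum >= 1")
    if maximum is None:
        if minimum == 0:
            return f"{term}*"
        # n or more: n-1 mandatory copies followed by term+.
        return ", ".join([term] * (minimum - 1) + [f"{term}+"])
    optional = ""
    for _ in range(maximum - minimum):
        # Build the optional tail from the innermost copy outwards.
        optional = f"({term}, {optional})?" if optional else f"{term}?"
    parts = [term] * minimum + ([optional] if optional else [])
    return ", ".join(parts)
```

With this, ixml_repetition("HEXDIG", 1, 4) produces HEXDIG, (HEXDIG, (HEXDIG, HEXDIG?)?)?, one possible iXML spelling of the h16 rule from the URI grammar.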

An XML version of uri.abnf

  • Now we can parse uri.abnf into XML

  • Unfortunately, it is not an iXML grammar

    And it’s a little bit verbose

    <rulelist xmlns:ixml='' ixml:state='ambiguous'>
                <SP> </SP>
                <SP> </SP>
                <SP> </SP>
                <SP> </SP>
                <SP> </SP>
                <SP> </SP>
                   <!-- ... several hundred more <SP> elements, at varying depths ... -->
                      <SP> </SP>
                         <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                            <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                         <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                            <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                         <SP> </SP>
                         <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                            <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                         <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                                  <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                            <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                                  <SP> </SP>
                                        <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                         <SP> </SP>
                                  <SP> </SP>
                                        <SP> </SP>
                                                 <SP> </SP>
                                                       <SP> </SP>
                                                 <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                         <SP> </SP>
                                  <SP> </SP>
                                        <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                         <SP> </SP>
                                  <SP> </SP>
                                        <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                                  <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                                  <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                                  <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                     <SP> </SP>
                                  <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                         <SP> </SP>
                         <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                   <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>
                      <SP> </SP>

Another XML version of uri.abnf

  • By adding a few marks to abnf.ixml, we can make the result less verbose

  • But it’s still not an iXML grammar

XSLT enters the chat

  • We have the technology to fix that!

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    <xsl:output method="xml" encoding="utf-8" indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:preserve-space elements="char-val"/>
    <xsl:key name="definitions" match="rule" use="rulename"/>
    <xsl:key name="uses" match="rulename[not(parent::rule)]" use="."/>
    <xsl:param name="marks" select="()"/>
    <xsl:variable name="parser" select="cs:load-grammar('marks.ixml')"/>
    <xsl:variable name="marklist" as="element(marks)?"
                  select="$marks ! $parser(unparsed-text(.))/*"/>
    <xsl:variable name="marked" as="map(xs:string, xs:string)">
      <xsl:variable name="root" select="/"/>
        <xsl:iterate select="$marklist/*">
          <xsl:param name="selected" select="()"/>
          <xsl:variable name="mark" select="@mark/string()"/>
          <xsl:variable name="elements" as="element()*">
              <xsl:when test="self::rule">
                <xsl:evaluate context-item="$root" as="element()*"
              <xsl:when test="self::token">
                <!--<xsl:message select="string(.)"/>-->
                <xsl:evaluate context-item="$root" as="element()*"
              <xsl:when test="self::renametoken">
                <!-- nop -->
              <xsl:when test="self::rename">
                <!-- nop -->
                <xsl:message select="'Unrecognized mark:', ."/>
          <xsl:for-each select="$elements">
            <xsl:if test="not(generate-id(.) = $selected)">
              <xsl:map-entry key="generate-id(.)" select="$mark"/>
            <xsl:with-param name="selected" select="($selected, $elements ! generate-id(.))"/>
    <xsl:variable name="renamed" as="map(xs:string, xs:string)">
      <xsl:variable name="root" select="/"/>
        <xsl:iterate select="$marklist/*">
          <xsl:param name="selected" select="()"/>
          <xsl:variable name="rename" select="@name/string()"/>
          <xsl:variable name="elements" as="element()*">
              <xsl:when test="self::rule">
                <!-- nop -->
              <xsl:when test="self::token">
                <!-- nop -->
              <xsl:when test="self::renametoken">
                <!--<xsl:message select="string(.)"/>-->
                <xsl:evaluate context-item="$root" as="element()*"
              <xsl:when test="self::rename">
                <!-- nop -->
                <xsl:message select="'Unrecognized mark:', ."/>
          <xsl:for-each select="$elements">
            <xsl:if test="not(generate-id(.) = $selected)">
              <xsl:map-entry key="generate-id(.)" select="$rename"/>
            <xsl:with-param name="selected" select="($selected, $elements ! generate-id(.))"/>
    <xsl:variable name="core-rules" as="element()">
        <rule mark='-' name='ALPHA'>
              <member from='A' to='Z'/>
              <member from='a' to='z'/>
        <rule mark='-' name='BIT'>
            <literal string='0'/>
            <literal string='1'/>
        <rule mark='-' name='CR'>
            <literal tmark='-' hex='0D'/>
        <rule mark='-' name='CRLF'>
            <nonterminal name='CR'/>
            <nonterminal name='LF'/>
            <nonterminal name='LF'/>
        <rule mark='-' name='DIGIT'>
              <member from='0' to='9'/>
        <rule mark='-' name='DQUOTE'>
            <literal tmark='-' hex='22'/>
        <rule mark='-' name='HEXDIG'>
            <nonterminal name='DIGIT'/>
              <member from='A' to='F'/>
              <member from='a' to='f'/>
        <rule mark='-' name='HTAB'>
            <literal hex='09'/>
        <comment> horizontal tab </comment>
        <rule mark='-' name='LF'>
            <literal tmark='-' hex='0A'/>
        <rule mark='-' name='SP'>
            <literal hex='20'/>
        <rule mark='-' name='VCHAR'>
              <member from='#21' to='#7E'/>
        <rule mark='-' name='WSP'>
            <nonterminal name='SP'/>
            <nonterminal name='HTAB'/>
        <rule mark='-' name='CHAR'>
              <member from='#01' to='#7F'/>
        <rule mark='-' name='CTL'>
              <member from='#00' to='#1F'/>
              <literal hex='7F'/>
        <rule mark='-' name='LWSP'>
            <nonterminal name='CR'/>
            <nonterminal name='LF'/>
            <nonterminal name='LF'/>
        <rule name='LWSP'>
                  <nonterminal name='WSP'/>
                  <nonterminal name='CRLF'/>
                  <nonterminal name='WSP'/>
        <rule mark='-' name='OCTET'>
              <member from='#00' to='#FF'/>
    <xsl:template match="rulelist">
      <xsl:variable name="rules" select="."/>
          <version string='1.1-nineml'/>
        <xsl:for-each select="$core-rules/rule">
          <xsl:variable name="name" select="@name/string()"/>
          <xsl:if test="empty(key('definitions', $name, $rules))
                        and exists(key('uses', $name, $rules))">
            <xsl:sequence select="."/>
    <xsl:template match="rule">
      <xsl:variable name="name" select="rulename/string()"/>
      <rule name='{$name}'>
        <xsl:sequence select="f:mark(.)"/>
        <xsl:if test="$marklist/rename[@name = $name]">
          <xsl:attribute name="rename" select="normalize-space($marklist/rename[@name = $name])"/>
    <xsl:template match="alternation">
    <xsl:template match="group/alternation" priority="10">
    <xsl:template match="rule/alternation" priority="10">
    <xsl:template match="alternation/*" priority="10">
    <xsl:template match="group">
    <xsl:template match="repetition">
      <xsl:message terminate="yes" select="'Unhandled repetition type:', ."/>
    <xsl:template match="repetition[not(repeat)]">
    <xsl:template match="repetition[repeat='+']">
        <xsl:apply-templates select="* except repeat"/>
    <xsl:template match="repetition[repeat castable as xs:integer]">
      <xsl:variable name="item" select="* except repeat"/>
      <xsl:for-each select="1 to xs:integer(repeat)">
        <xsl:apply-templates select="$item"/>
    <xsl:template match="repetition[repeat='0']" priority="10">
    <xsl:template match="repetition[matches(repeat, '\d*\*\d*')]">
      <xsl:variable name="item" select="* except repeat"/>
      <xsl:variable name="min"
                    select="if (substring-before(repeat, '*') = '')
                            then 0
                            else xs:integer(substring-before(repeat, '*'))"/>
      <xsl:variable name="max"
                    select="if (substring-after(repeat, '*') = '')
                            then ()
                            else xs:integer(substring-after(repeat, '*'))"/>
      <xsl:if test="$min gt 0">
        <xsl:for-each select="1 to $min">
          <xsl:apply-templates select="$item"/>
        <xsl:when test="exists($max)">
          <xsl:for-each select="$min to $max">
            <xsl:apply-templates select="$item"/>
            <xsl:apply-templates select="$item"/>
    <xsl:template match="option">
    <xsl:template match="char-val">
      <literal string='{.}'>
        <xsl:sequence select="f:tmark(.)"/>
    <xsl:template match="hex-val">
      <xsl:variable name="attr" select="f:tmark(.)"/>
      <xsl:for-each select="tokenize(., '\.')">
        <literal hex='{.}'>
          <xsl:sequence select="$attr"/>
    <xsl:template match="hex-val[contains(., '-')]">
      <xsl:variable name="attr" select="f:tmark(.)"/>
      <xsl:variable name="first" select="substring-before(., '-')"/>
      <xsl:variable name="last" select="substring-after(., '-')"/>
            <member from='#{$first}' to='#{$last}'>
              <xsl:sequence select="$attr"/>
    <xsl:template match="rulename">
      <nonterminal name='{.}'>
        <xsl:sequence select="f:mark(.)"/>
        <xsl:sequence select="f:rename(.)"/>
    <xsl:template match="rule/rulename"/>
    <xsl:template match="comment">
    <!-- ============================================================ -->
    <xsl:function name="f:hex-to-dec" as="xs:integer">
      <xsl:param name="hex" as="xs:string"/>
      <xsl:iterate select="reverse(string-to-codepoints(upper-case($hex)))">
        <xsl:param name="dec" select="0"/>
        <xsl:param name="pow" select="1"/>
        <xsl:on-completion select="$dec"/>
        <xsl:variable name="digit" select="if (. gt 64) then . - 55 else . - 48"/>
        <xsl:next-iteration>
          <xsl:with-param name="dec" select="$dec + ($digit * $pow)"/>
          <xsl:with-param name="pow" select="$pow * 16"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </xsl:function>
    <xsl:function name="f:dec-to-hex" as="xs:string">
      <xsl:param name="dec" as="xs:integer"/>
      <xsl:choose>
        <xsl:when test="$dec lt 16">
          <xsl:sequence select="substring('0123456789ABCDEF', $dec+1, 1)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:variable name="newdec" select="$dec idiv 16"/>
          <xsl:variable name="digit"  select="$dec - ($newdec * 16)"/>
          <xsl:sequence select="f:dec-to-hex($newdec)||substring('0123456789ABCDEF', $digit+1, 1)"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:function>
    <xsl:function name="f:mark" as="attribute()?">
      <xsl:param name="node" as="element()"/>
      <xsl:if test="map:contains($marked, generate-id($node))">
        <xsl:attribute name="mark" select="map:get($marked, generate-id($node))"/>
      </xsl:if>
    </xsl:function>
    <xsl:function name="f:tmark" as="attribute()?">
      <xsl:param name="node" as="element()"/>
      <xsl:if test="map:contains($marked, generate-id($node))">
        <xsl:attribute name="tmark" select="map:get($marked, generate-id($node))"/>
      </xsl:if>
    </xsl:function>
    <xsl:function name="f:rename" as="attribute()?">
      <xsl:param name="node" as="element()"/>
      <xsl:if test="map:contains($renamed, generate-id($node))">
        <xsl:attribute name="rename" select="map:get($renamed, generate-id($node))"/>
      </xsl:if>
    </xsl:function>
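
The two fiddly parts of the stylesheet are the ABNF repeat prefixes (`1*4`, `*`, `3`) and the hex helpers. As a rough sketch only, here is the same idea in Python. The function names `parse_repeat`, `expand`, `hex_to_dec` and `dec_to_hex` are made up for this illustration and are not part of the stylesheet. The text that `expand` produces is iXML-like shorthand, not the XML that the stylesheet emits, and it assumes a bounded repeat becomes `min` required copies followed by `max - min` optional ones.

```python
from typing import Optional, Tuple


def parse_repeat(spec: str) -> Tuple[int, Optional[int]]:
    """Split an ABNF repeat prefix such as '1*4', '*', '2*' or '3' into (min, max).

    max is None when the repeat has no upper bound.
    """
    if "*" not in spec:
        n = int(spec)  # a bare integer means "exactly n"
        return n, n
    low, high = spec.split("*", 1)
    return (int(low) if low else 0), (int(high) if high else None)


def expand(item: str, spec: str) -> str:
    """Render a repetition as iXML-like text (illustrative shorthand only)."""
    lo, hi = parse_repeat(spec)
    required = [item] * lo  # the copies that must be present
    if hi is None:
        tail = [f"{item}*"]  # unbounded: zero or more further copies
    else:
        tail = [f"{item}?"] * (hi - lo)  # bounded: the remaining copies are optional
    return ", ".join(required + tail)


def hex_to_dec(hex_digits: str) -> int:
    """Accumulate hex digits from least to most significant, like f:hex-to-dec."""
    dec, power = 0, 1
    for codepoint in reversed([ord(c) for c in hex_digits.upper()]):
        # 'A' is codepoint 65 and '0' is codepoint 48
        digit = codepoint - 55 if codepoint > 64 else codepoint - 48
        dec += digit * power
        power *= 16
    return dec


def dec_to_hex(dec: int) -> str:
    """Recurse on the quotient and append the last digit, like f:dec-to-hex."""
    digits = "0123456789ABCDEF"
    if dec < 16:
        return digits[dec]
    return dec_to_hex(dec // 16) + digits[dec % 16]


if __name__ == "__main__":
    print(expand("HEXDIG", "1*4"))
    print(expand("HEXDIG", "1*"))
    print(hex_to_dec("7E"), dec_to_hex(255))
```

Like the XSLT, `hex_to_dec` does no validation, so it assumes its input really is a string of hex digits.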

Grammar! Grammar! Grammar! Grammar!

Now we can…

  • Convert uri.abnf to XML…

    coffeepot -g:ABNFp.ixml -i:uri.abnf -o:uri.xml

  • Transform that XML…

    saxon -s:uri.xml -xsl:abnf2ixml.xsl -o:uri-raw.vxml

  • And now it’s iXML!

    <?xml version="1.0" encoding="utf-8"?>
          <version string="1.1-nineml"/>
       <rule name="URI">
             <nonterminal name="scheme"/>
             <literal string=":"/>
             <nonterminal name="hier-part"/>
                      <literal string="?"/>
                      <nonterminal name="query"/>
                      <literal string="#"/>
                      <nonterminal name="fragment"/>
       <rule name="hier-part">
             <literal string="//"/>
             <nonterminal name="authority"/>
             <nonterminal name="path-abempty"/>
             <nonterminal name="path-absolute"/>
             <nonterminal name="path-rootless"/>
             <nonterminal name="path-empty"/>
       <rule name="URI-reference">
             <nonterminal name="URI"/>
             <nonterminal name="relative-ref"/>
       <rule name="absolute-URI">
             <nonterminal name="scheme"/>
             <literal string=":"/>
             <nonterminal name="hier-part"/>
                      <literal string="?"/>
                      <nonterminal name="query"/>
       <rule name="relative-ref">
             <nonterminal name="relative-part"/>
                      <literal string="?"/>
                      <nonterminal name="query"/>
                      <literal string="#"/>
                      <nonterminal name="fragment"/>
       <rule name="relative-part">
             <literal string="//"/>
             <nonterminal name="authority"/>
             <nonterminal name="path-abempty"/>
             <nonterminal name="path-absolute"/>
             <nonterminal name="path-noscheme"/>
             <nonterminal name="path-empty"/>
       <rule name="scheme">
             <nonterminal name="ALPHA"/>
                      <nonterminal name="ALPHA"/>
                      <nonterminal name="DIGIT"/>
                      <literal string="+"/>
                      <literal string="-"/>
                      <literal string="."/>
       <rule name="authority">
                      <nonterminal name="userinfo"/>
                      <literal string="@"/>
             <nonterminal name="host"/>
                      <literal string=":"/>
                      <nonterminal name="port"/>
       <rule name="userinfo">
                      <nonterminal name="unreserved"/>
                      <nonterminal name="pct-encoded"/>
                      <nonterminal name="sub-delims"/>
                      <literal string=":"/>
       <rule name="host">
             <nonterminal name="IP-literal"/>
             <nonterminal name="IPv4address"/>
             <nonterminal name="reg-name"/>
       <rule name="port">
                <nonterminal name="DIGIT"/>
       <rule name="IP-literal">
             <literal string="["/>
                   <nonterminal name="IPv6address"/>
                   <nonterminal name="IPvFuture"/>
             <literal string="]"/>
       <rule name="IPvFuture">
             <literal string="v"/>
             <nonterminal name="HEXDIG"/>
                <nonterminal name="HEXDIG"/>
             <literal string="."/>
                   <nonterminal name="unreserved"/>
                   <nonterminal name="sub-delims"/>
                   <literal string=":"/>
                      <nonterminal name="unreserved"/>
                      <nonterminal name="sub-delims"/>
                      <literal string=":"/>
       <rule name="IPv6address">
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
             <nonterminal name="ls32"/>
             <literal string="::"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
             <nonterminal name="ls32"/>
                      <nonterminal name="h16"/>
             <literal string="::"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
             <nonterminal name="ls32"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                      <nonterminal name="h16"/>
             <literal string="::"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
             <nonterminal name="ls32"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                      <nonterminal name="h16"/>
             <literal string="::"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
                   <literal string=":"/>
             <nonterminal name="ls32"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                      <nonterminal name="h16"/>
             <literal string="::"/>
             <nonterminal name="h16"/>
             <literal string=":"/>
             <nonterminal name="ls32"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                      <nonterminal name="h16"/>
             <literal string="::"/>
             <nonterminal name="ls32"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                      <nonterminal name="h16"/>
             <literal string="::"/>
             <nonterminal name="h16"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                            <nonterminal name="h16"/>
                            <literal string=":"/>
                      <nonterminal name="h16"/>
             <literal string="::"/>
       <rule name="h16">
             <nonterminal name="HEXDIG"/>
             <nonterminal name="HEXDIG"/>
             <nonterminal name="HEXDIG"/>
             <nonterminal name="HEXDIG"/>
             <nonterminal name="HEXDIG"/>
       <rule name="ls32">
                   <nonterminal name="h16"/>
                   <literal string=":"/>
                   <nonterminal name="h16"/>
             <nonterminal name="IPv4address"/>
       <rule name="IPv4address">
             <nonterminal name="dec-octet"/>
             <literal string="."/>
             <nonterminal name="dec-octet"/>
             <literal string="."/>
             <nonterminal name="dec-octet"/>
             <literal string="."/>
             <nonterminal name="dec-octet"/>
       <rule name="dec-octet">
             <nonterminal name="DIGIT"/>
                      <member from="#31" to="#39"/>
             <nonterminal name="DIGIT"/>
             <literal string="1"/>
             <nonterminal name="DIGIT"/>
             <nonterminal name="DIGIT"/>
             <literal string="2"/>
                      <member from="#30" to="#34"/>
             <nonterminal name="DIGIT"/>
             <literal string="25"/>
                      <member from="#30" to="#35"/>
       <rule name="reg-name">
                      <nonterminal name="unreserved"/>
                      <nonterminal name="pct-encoded"/>
                      <nonterminal name="sub-delims"/>
       <rule name="path">
             <nonterminal name="path-abempty"/>
             <nonterminal name="path-absolute"/>
             <nonterminal name="path-noscheme"/>
             <nonterminal name="path-rootless"/>
             <nonterminal name="path-empty"/>
       <rule name="path-abempty">
                      <literal string="/"/>
                      <nonterminal name="segment"/>
       <rule name="path-absolute">
             <literal string="/"/>
                      <nonterminal name="segment-nz"/>
                               <literal string="/"/>
                               <nonterminal name="segment"/>
       <rule name="path-noscheme">
             <nonterminal name="segment-nz-nc"/>
                      <literal string="/"/>
                      <nonterminal name="segment"/>
       <rule name="path-rootless">
             <nonterminal name="segment-nz"/>
                      <literal string="/"/>
                      <nonterminal name="segment"/>
       <rule name="path-empty">
       <rule name="segment">
                <nonterminal name="pchar"/>
       <rule name="segment-nz">
             <nonterminal name="pchar"/>
                <nonterminal name="pchar"/>
       <rule name="segment-nz-nc">
                   <nonterminal name="unreserved"/>
                   <nonterminal name="pct-encoded"/>
                   <nonterminal name="sub-delims"/>
                   <literal string="@"/>
                      <nonterminal name="unreserved"/>
                      <nonterminal name="pct-encoded"/>
                      <nonterminal name="sub-delims"/>
                      <literal string="@"/>
       <rule name="pchar">
             <nonterminal name="unreserved"/>
             <nonterminal name="pct-encoded"/>
             <nonterminal name="sub-delims"/>
             <literal string=":"/>
             <literal string="@"/>
       <rule name="query">
                      <nonterminal name="pchar"/>
                      <literal string="/"/>
                      <literal string="?"/>
       <rule name="fragment">
                      <nonterminal name="pchar"/>
                      <literal string="/"/>
                      <literal string="?"/>
       <rule name="pct-encoded">
             <literal string="%"/>
             <nonterminal name="HEXDIG"/>
             <nonterminal name="HEXDIG"/>
       <rule name="unreserved">
             <nonterminal name="ALPHA"/>
             <nonterminal name="DIGIT"/>
             <literal string="-"/>
             <literal string="."/>
             <literal string="_"/>
             <literal string="~"/>
       <rule name="reserved">
             <nonterminal name="gen-delims"/>
             <nonterminal name="sub-delims"/>
       <rule name="gen-delims">
             <literal string=":"/>
             <literal string="/"/>
             <literal string="?"/>
             <literal string="#"/>
             <literal string="["/>
             <literal string="]"/>
             <literal string="@"/>
       <rule name="sub-delims">
             <literal string="!"/>
             <literal string="$"/>
             <literal string="&amp;"/>
             <literal string="'"/>
             <literal string="("/>
             <literal string=")"/>
             <literal string="*"/>
             <literal string="+"/>
             <literal string=","/>
             <literal string=";"/>
             <literal string="="/>
       <rule mark="-" name="ALPHA">
                <member from="A" to="Z"/>
                <member from="a" to="z"/>
       <rule mark="-" name="DIGIT">
                <member from="0" to="9"/>
       <rule mark="-" name="HEXDIG">
             <nonterminal name="DIGIT"/>
                <member from="A" to="F"/>
                <member from="a" to="f"/>

I can haz URI!

  • Now I can parse a URI with the grammar from RFC 3986:

    coffeepot -g:uri-raw.vxml "https://mushroom.mushroom/?notareal#tld"

  • XML!



  • I could improve the output with marks…

  • I could edit the VXML file, but that would be wrong.

  • What if I could describe where I wanted the marks to go?

  • In a declarative way:

    mark rule unreserved with "-"
    mark rule pchar with “-”
    mark token //char-val[. = ('/', ':', '//')] with “-”
    mark token /rulelist/rule[rulename = 'URI']//char-val with ‘-’
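
  • As a rough illustration of what those lines mean, here is a minimal Python sketch that reads one "mark … with …" line into a small record. It is not the tooling from these slides; the names Mark, QUOTES and parse_mark are hypothetical, and the rule-name check is only an approximation.

```python
import re
import unicodedata
from typing import NamedTuple, Optional


class Mark(NamedTuple):
    kind: str    # "rule" or "token"
    target: str  # a rule name, or an expression selecting tokens
    mark: str    # the mark character, e.g. "-"


# Opening quote -> closing quote pairs allowed around the mark character.
QUOTES = {'"': '"', "'": "'", "“": "”", "‘": "’"}

# mark (rule|token) <target> with <open-quote><mark><close-quote>
LINE = re.compile(
    r"^mark[ \t]+(rule|token)[ \t]+(.+?)[ \t]+with[ \t]+(\S)(\S)(\S)[ \t]*$"
)


def parse_mark(line: str) -> Optional[Mark]:
    """Return a Mark for a well-formed line, or None if it does not match."""
    m = LINE.match(line)
    if m is None:
        return None
    kind, target, open_q, mark, close_q = m.groups()
    # The quotes must be a matching pair.
    if QUOTES.get(open_q) != close_q:
        return None
    # Assumption: the mark is a single punctuation character.
    if not unicodedata.category(mark).startswith("P"):
        return None
    # Assumption: rule names are letters, digits, "-", "." and "_" only.
    if kind == "rule" and re.fullmatch(r"[\w.\-]+", target) is None:
        return None
    return Mark(kind, target, mark)


if __name__ == "__main__":
    for text in (
        'mark rule unreserved with "-"',
        "mark token //char-val[. = ('/', ':', '//')] with “-”",
    ):
        print(parse_mark(text))
```

    All four quoting styles shown above are accepted, as long as the opening and closing quotes pair up.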

Marks, in XML

  • That marks file sure would be easier to process if it were in XML, though…

  • CoffeeSacks (iXML extension functions for Saxon) to the rescue

    <xsl:param name="marks" select="()"/>
    <xsl:variable name="parser" select="cs:load-grammar('marks.ixml')"/>
    <xsl:variable name="marklist" as="element(marks)?"
                  select="$marks ! $parser(unparsed-text(.))/*"/>

  • With this little grammar grammar grammar grammar grammar.

    ixml version "1.1-nineml" .
                marks = mark**NL, NL? .
                -mark = rule | token | rename .
                 rule = -'mark', s, -'rule', s, name, s, -'with', s, themark, s? .
                token = -'mark', s, -'token', s, expr, s, -'with', s, themark, s? .
              -rename = renamerule | renametoken .
    renamerule>rename = -'rename', s, @name, s, -'to', s, name, s? .
          renametoken = -'rename', s, -'token', s, expr, s, -'to', s, @name, s? .
                -expr = ~[#A]+ .
                -name = [L|N|'-'|'.'|'_']+ .
        @themark>mark = -'"', [P], -'"' | -"'", [P], -"'" | -'“', [P], -'”' | -"‘", [P], -"’" .
                   -s = -[#20 | #9]+ .
                  -NL = -#D?, -#A .
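
  • To picture the XML this grammar hands to the stylesheet, here is a hedged Python sketch that builds the shape I would expect for "mark" lines. It assumes the usual iXML serialization: deleted terminals vanish, the hidden name and expr contribute only text, and themark becomes a mark attribute. The function name marks_to_xml is hypothetical, and the real processor's output may differ in detail.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def marks_to_xml(entries: Iterable[Tuple[str, str, str]]) -> str:
    """Serialize (kind, target, mark) triples as a <marks> document.

    kind is "rule" or "token"; target is the rule name or the token
    expression; mark is the mark character.
    """
    root = ET.Element("marks")
    for kind, target, mark in entries:
        if kind not in ("rule", "token"):
            raise ValueError(f"unsupported kind: {kind!r}")
        # Assumed shape: <rule mark="-">unreserved</rule>
        element = ET.SubElement(root, kind, {"mark": mark})
        element.text = target
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(marks_to_xml([
        ("rule", "unreserved", "-"),
        ("token", "//char-val[. = ('/', ':', '//')]", "-"),
    ]))
```

    The rename forms are left out of this sketch.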

Mushroom! Mushroom!

Ready! Set! Go!

  • coffeepot -g:ABNFp.ixml -i:uri.abnf -o:uri.xml

  • saxon -s:uri.xml -xsl:abnf2ixml.xsl -o:uri.vxml marks=uri-marks.txt

  • coffeepot -g:uri.vxml "https://mushroom.mushroom/?notareal#tld"

  • Tada!


An animated gif