vcards have such a simple form that they don't really need a "real" parser. But assuming that we want to write one anyway, we will have to make a few changes.
The main point is that all your lines have the same grammatical form: a name, a colon, and a value.
So there is no difference between the "inner" lines and the surrounding vcard lines.
So I think we should change "begin", "end" and "vcard" into individual terminal symbols/keywords: [t_begin], [t_end], and [t_vcard].
Given that, a vcard has the form:
[t_begin] [t_colon] [t_vcard]
[t_prop] [t_colon] [t_prop]
[t_prop] [t_colon] [t_prop]
...
[t_prop] [t_colon] [t_prop]
[t_end] [t_colon] [t_vcard]
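To make the token form above concrete, here is a minimal tokenizer sketch. It is written in Python rather than Visual Prolog purely for illustration, and the function name `tokenize` and the `KEYWORDS` table are my own assumptions, not part of any library. It also takes the shortcut of treating the words BEGIN, END and VCARD as keywords wherever they occur, which a real lexer might not want to do for property values.

```python
# Hypothetical illustration: map each vcard line to three terminal symbols.
KEYWORDS = {"BEGIN": "t_begin", "END": "t_end", "VCARD": "t_vcard"}


def tokenize(text):
    """Return a flat list of (terminal, text) pairs for the given vcard text."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split at the first colon only, so values may contain further colons.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"missing colon in line: {line!r}")
        tokens.append((KEYWORDS.get(name.upper(), "t_prop"), name))
        tokens.append(("t_colon", ":"))
        tokens.append((KEYWORDS.get(value.upper(), "t_prop"), value))
    return tokens
```

Every line, whether it is the BEGIN line, the END line, or a property line, yields exactly three tokens, which is why the grammar can treat them uniformly.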
Let us first ignore collecting the result and just consider the recognition.
The repetition part in the middle (i.e. your question) will be handled using recursion. This is an obvious solution:
rules
vcrd ==>
[t_begin], [t_colon], [t_vcard],
cntlines,
[t_end], [t_colon], [t_vcard].
rules
cntlines ==> .
cntlines ==> cntl, cntlines.
rules
cntl ==> [t_prop], [t_colon], [t_prop].
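If it helps to see the recursion at work, the three rules above can be mirrored by a small hand-written recognizer. This is a recursive-descent sketch in Python, not the LALR parser that the grammar tool would generate, and the helper names (`expect`, `cntlines`, `recognize_vcrd`) are hypothetical. It works on a list of terminal names only.

```python
CNTL = ["t_prop", "t_colon", "t_prop"]


def expect(tokens, pos, expected):
    """Return the position after `expected` if it occurs at `pos`, else None."""
    end = pos + len(expected)
    return end if tokens[pos:end] == expected else None


def cntlines(tokens, pos):
    """Consume zero or more content lines and return the new position."""
    nxt = expect(tokens, pos, CNTL)
    if nxt is None:
        return pos                  # cntlines ==> .
    return cntlines(tokens, nxt)    # cntlines ==> cntl, cntlines.


def recognize_vcrd(tokens):
    """Return True if the terminal names match the vcrd rule."""
    pos = expect(tokens, 0, ["t_begin", "t_colon", "t_vcard"])
    if pos is None:
        return False
    pos = cntlines(tokens, pos)
    pos = expect(tokens, pos, ["t_end", "t_colon", "t_vcard"])
    return pos == len(tokens)
```

The empty alternative of `cntlines` is what ends the repetition: as soon as the next three tokens are not a content line, the recursion stops and the END line is checked.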
In my opinion you have collected a lot of irrelevant information: the colons are not interesting. There will always be a colon in those places, so they may just as well be skipped completely.
The result could look like this:
grammar vCardgrm
open vCard, vCardgrmSem
nonterminals
vcrd : vcard.
rules
vcrd { vcard(CL) } ==>
[t_begin],
[t_colon],
[t_vcard],
cntlines { CL },
[t_end],
[t_colon],
[t_vcard].
nonterminals
cntlines : contentline*.
rules
cntlines { [] } ==>
.
cntlines { [C | CL] } ==>
cntl { C },
cntlines { CL }.
nonterminals
cntl : contentline.
rules
cntl { contentline(P, V) } ==>
[t_prop] { P },
[t_colon],
[t_prop] { V }.
end grammar vCardgrm
Parsing your second vcard will produce this parse tree:
vcard([contentline("VERSION", "3.0"), contentline("N", "John"), contentline("FN", "John Neumann")])
LALR grammars and right recursion do not go well together. Right recursion often causes so-called shift-reduce conflicts (though not in this case), and in any case the parse stack will grow with the length of the parsed sequence.
As far as the recognized language is concerned, we can easily make cntlines left recursive like this instead:
rules
cntlines ==> .
cntlines ==> cntlines, cntl.
This change does not make any difference for the recognized language, but now we find the lines in the opposite order of what we need for the list we want to collect.
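To make the stack argument concrete, here is a rough Python simulation of the shift and reduce steps for both variants of cntlines. It is a deliberate simplification and rests on my own assumptions: each content line is treated as one already-reduced symbol, the stack holds values rather than parser states, and the function names are hypothetical. It only illustrates how deep the stack gets and where each new line has to be attached.

```python
def trace_right_recursive(n):
    """Simulate cntlines ==> . | cntl, cntlines. for n content lines."""
    stack, deepest = [], 0
    for i in range(n):
        stack.append(f"cntl{i}")       # every line is shifted before any reduce
        deepest = max(deepest, len(stack))
    stack.append([])                   # reduce the empty rule at the end
    deepest = max(deepest, len(stack))
    while len(stack) > 1:
        tail = stack.pop()
        head = stack.pop()
        stack.append([head] + tail)    # reduce cntl, cntlines: attach at front
    return deepest, stack[0]


def trace_left_recursive(n):
    """Simulate cntlines ==> . | cntlines, cntl. for n content lines."""
    stack = [[]]                       # reduce the empty rule first
    deepest = 1
    for i in range(n):
        stack.append(f"cntl{i}")       # shift one line
        deepest = max(deepest, len(stack))
        line = stack.pop()
        seen = stack.pop()
        stack.append(seen + [line])    # reduce cntlines, cntl: attach at rear
    return deepest, stack[0]
```

In this simulation the right-recursive variant needs a stack as deep as the number of lines plus one, while the left-recursive variant never needs more than two entries. The price is that each new line must be added at the rear of what has been collected so far, which is awkward for an ordinary front-consed list.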
The support class contains a reverse list domain (revList) to assist with this situation. The left-recursive solution could look like this:
grammar vCardgrm
open vCard, vCardgrmSem
nonterminals
vcrd : vcard.
rules
vcrd { vcard(unRevList(CL)) } ==>
[t_begin],
[t_colon],
[t_vcard],
cntlines { CL },
[t_end],
[t_colon],
[t_vcard].
nonterminals
cntlines : revList{contentline}.
rules
cntlines { nil } ==>
.
cntlines { consRear(CL, C) } ==>
cntlines { CL },
cntl { C }.
nonterminals
cntl : contentline.
rules
cntl { contentline(P, V) } ==>
[t_prop] { P },
[t_colon],
[t_prop] { V }.
end grammar vCardgrm
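For readers who wonder what the reverse list is doing, here is a small Python sketch of the idea. It is only my assumption of how such a domain could behave, not the actual Visual Prolog implementation: `NIL`, `cons_rear` and `un_rev_list` are hypothetical stand-ins for nil, consRear and unRevList.

```python
NIL = None  # stand-in for the empty reverse list


def cons_rear(rev, item):
    """Add `item` at the rear in constant time by wrapping the list so far."""
    return (rev, item)


def un_rev_list(rev):
    """Convert a reverse list into an ordinary list in source order."""
    items = []
    while rev is not NIL:
        rev, item = rev        # peel off the most recently added element
        items.append(item)
    items.reverse()            # elements were peeled off last to first
    return items


# Collect the lines the way the left-recursive rule sees them: first to last.
collected = NIL
for line in [("VERSION", "3.0"), ("N", "John"), ("FN", "John Neumann")]:
    collected = cons_rear(collected, line)
```

Each reduction only does a constant amount of work, and the single conversion in the vcrd rule restores the order we actually want in the result.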