I’m starting to learn Sublime Text 3’s YAML syntax rules. So far, it seems pretty powerful and slick. I use the Ada language and I have never found a good syntax coloring system for Ada or languages like Ada. I am hopeful that Sublime Text 3’s system can do a better job.
Ada does not use { and } to denote blocks; it uses keywords instead. Also, as in most languages, any run of white-space (including new-lines) between tokens is equivalent to a single space. Hence, an Ada program can be written on a single line or with each parser token on its own line.
Here is an example:
    procedure FOO is
        <var-defs>
    begin
        <statements>
    end FOO;

    procedure
    FOO
    is
    <var-defs>
    begin
    <statements>
    end
    FOO
    ;
Firstly, because Sublime Text matches patterns against one line at a time, I realize that I have to create matching rules that each contain a single parser token. If a rule spans more than one token, the syntax styler will only match when I lay out my code the way the styler expects, which is not a language requirement.
Now, if I want to enforce syntax rules such as “you can’t have the word procedure twice in a row” or “you can’t have two identifiers naming the procedure before the ‘is’ keyword”, is it correct to say that I have to put each parser token in the syntax progression in its own context?
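To make the question concrete, here is a minimal sketch of what I mean by one context per token. The context names (proc-name, proc-is, proc-decls) and the scope names are my own invention, and the reserved-word list is deliberately incomplete. It relies on `set` to move to the next token's context and on a catch-all rule to flag anything unexpected.

```yaml
%YAML 1.2
---
# Hypothetical sketch: context and scope names are assumptions, not a finished grammar.
name: Ada (token-per-context sketch)
scope: source.ada
variables:
  identifier: '[A-Za-z][A-Za-z0-9_]*'
contexts:
  main:
    - match: '\bprocedure\b'
      scope: keyword.declaration.procedure.ada
      push: proc-name
  proc-name:
    # A reserved word where the name should be is an error
    # (this catches "procedure procedure").
    - match: '\b(?i:procedure|is|begin|end)\b'
      scope: invalid.illegal.ada
    # Exactly one identifier is accepted, then we move on to expect "is".
    - match: '\b{{identifier}}\b'
      scope: entity.name.function.ada
      set: proc-is
  proc-is:
    - match: '\bis\b'
      scope: keyword.other.ada
      set: proc-decls
    # Anything else here (such as a second identifier) is flagged.
    - match: '\S+'
      scope: invalid.illegal.ada
  proc-decls:
    # Contexts for begin, the statements, and "end NAME ;" would follow
    # the same pattern; this sketch just returns at "begin".
    - match: '\bbegin\b'
      scope: keyword.other.ada
      pop: true
```

Because a context stays active across line breaks, this should behave the same whether the tokens share a line or each sit on their own line. The cost is one context per token, which is exactly what I was hoping to avoid.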
In Ada, you’ve reached the end of a code block when you encounter the ‘end something;’ statement.

    loop
    end loop;

    if x then
    elsif y then
    else
    end if;
and nested:
    if x then
        loop
        end loop;
    end if;
What I think I need to do is match the ‘if’, ‘loop’, and ‘FOO’ from the start of the block with the ‘end IF’, ‘end LOOP’, and ‘end FOO’ respectively at the end of the block. I’ve seen the trick in the documentation of carrying captures forward to the next context, but can this be done to a context much further down the scope tree with named captures?
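The trick I am referring to looks roughly like the sketch below, adapted from the heredoc example in the documentation: the pop pattern uses \1 to refer to a capture of the match that pushed the context. The context names are my own, the match is case-sensitive for brevity (Ada is not), and as far as I can tell the backreference only reaches the context that was pushed directly, with ‘end’ and the keyword on the same line.

```yaml
%YAML 1.2
---
# Hypothetical sketch: context names are assumptions.
name: Ada (block-matching sketch)
scope: source.ada
contexts:
  main:
    - include: blocks
  blocks:
    # Capture the opening keyword so the pushed context can refer to it.
    - match: '\b(if|loop)\b'
      scope: keyword.control.ada
      push: block-body
  block-body:
    # \1 is capture 1 of the match that pushed this context,
    # so an "if" block only closes on "end if" and a "loop" on "end loop".
    - match: '\bend\s+\1\b'
      scope: keyword.control.ada
      pop: true
    # Nested blocks push their own copy of this context with their own \1.
    - include: blocks
```

This seems to cover nesting, since each nested block gets its own capture, but it does not help once the capture has to travel through several intermediate contexts.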
My other question: if I have multiple matches in a context, can they be ordered according to the syntax production rules so that, once a rule has matched, it cannot be used again at the current scope level? For example, if I write (ignoring the recursive content):
    contexts:
      procedure-start:
        - match: \bprocedure\b
          scope: procedure
          push: proc-def
      proc-def:
        - match: '\b(?<procname>{{identifier}})\b'
        - match: \bis\b
        - match: \bbegin\b
        - match: \bend\b
        - match: '\b\k<procname>\b'
        - match: ';'
          pop: true
Once I’m in the proc-def context, any of these rules can match in any order. Right?
This means the following text will highlight as if it were a legal procedure definition:
procedure BLAH FIBBLE is is
begin is end begin ;
This is clearly not legal syntax, and I hope to flag it as such.
What I hope I can do, to avoid having to put each token into its own context, is to have these rules matched in order (or in some specified order) and, once matched, taken out of eligibility for the current scope (perhaps optionally, by a YAML rule) so that they will not be matched again. If an ineligible rule is matched, then a syntax error is identified. Can something like this be done?
Also notice the named capture. Is that possible? I can’t seem to get it to work.
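For comparison, this is the numbered-capture form that I understand to be the documented route. It is only a sketch: the scope names are my own, and it forces ‘procedure’ and the name onto one line, and likewise ‘end’, the name, and the semicolon, which is the very layout restriction I am trying to avoid.

```yaml
%YAML 1.2
---
# Hypothetical sketch: scope names are assumptions.
name: Ada (numbered-capture sketch)
scope: source.ada
variables:
  identifier: '[A-Za-z][A-Za-z0-9_]*'
contexts:
  main:
    # Capture 2 holds the procedure name for use in the pop pattern.
    - match: '\b(procedure)\s+({{identifier}})\b'
      captures:
        1: keyword.declaration.procedure.ada
        2: entity.name.function.ada
      push: proc-def
  proc-def:
    - meta_scope: meta.function.ada
    # \2 refers to capture 2 of the match that pushed this context.
    - match: '\bend\s+\2\s*;'
      scope: keyword.other.ada
      pop: true
```

What I would like is the same thing with a name instead of a number, and without the single-line restriction.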
I have high hopes that this new language definition syntax will be better than anything before, but the learning curve is steep.
Thank you to any who can lend their experience.