I’ve been trying to improve my syntax file to fix a particular bug. Normally a signal declaration looks like `signal foo : std_logic;`
and I capture that fine. However, a single declaration can name more than one signal, so `signal foo, bar : std_logic;`
is also valid, and I don’t handle that properly. I have a variable called `identifier`
that contains the regex for a well-formed identifier.
Here’s what I have that’s close to working:
- match: |-
    (?xi)
    ((signal)\s*)
    ({{identifier}})
    (\s*(,)\s*({{identifier}}))*
    (\s*(:))
  captures:
    2: storage.type.signal.vhdl
    3: variable.other.vhdl
    5: punctuation.separator.vhdl
    6: variable.other.vhdl
    8: punctuation.separator.vhdl
The logic here is that it finds the keyword and the first (mandatory) identifier. A repeated group then matches a comma followed by a further identifier, and that group may repeat zero or more times. Finally it finds the colon separator, which is where I push a context to look for the terminating semicolon. This appears to match the lexical elements pretty well until I drill down into the scopes.
signal alpha, beta, gamma, delta : std_logic;
^-- storage.type.signal.vhdl -- Correct
       ^-- variable.other.vhdl -- Correct
            ^-----------^ -- Incorrect. Matches the line however does not match any captured group
                         ^-- punctuation.separator.vhdl -- Correct
                           ^-- variable.other.vhdl -- Correct
                                 ^-- punctuation.separator.vhdl -- Correct
Note the bit in the middle. The match as a whole covers the line, and the first and last identifiers are scoped correctly, but the identifiers in between (and their commas) don’t pick up the scopes from the capture groups. I did some poking around, and this seems very similar to the topic here: Syntax highlight capture scopes should apply to all captures of a given group
Am I running into a known behavior here, where a capture group inside a repetition only has its scope applied to the final iteration?
If I am, that’s going to push me toward the more elaborate approach where I match `signal`, push a context, and then set until I hit the terminating semicolon.
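
For reference, here is a minimal sketch of what that push/set approach might look like. It is untested. The context names (`signal-names`, `signal-type`), the simplified `identifier` variable, the top-level `source.vhdl` scope, and the `punctuation.terminator.vhdl` scope are placeholders I made up for illustration, not what my real syntax file uses:

```yaml
%YAML 1.2
---
name: VHDL signal sketch          # hypothetical name, illustration only
scope: source.vhdl

variables:
  # Simplified stand-in; the real identifier regex is more involved.
  identifier: '\b[a-zA-Z][a-zA-Z0-9_]*\b'

contexts:
  main:
    # Match only the keyword, then hand off to a dedicated context.
    - match: (?i)\bsignal\b
      scope: storage.type.signal.vhdl
      push: signal-names

  signal-names:
    # Each identifier and comma is its own match, so every one gets
    # a scope instead of only the last iteration of a repeated group.
    - match: '{{identifier}}'
      scope: variable.other.vhdl
    - match: ','
      scope: punctuation.separator.vhdl
    # The colon ends the name list; replace this context with the next.
    - match: ':'
      scope: punctuation.separator.vhdl
      set: signal-type
    # Safety net: a stray semicolon ends the declaration early.
    - match: ';'
      scope: punctuation.terminator.vhdl
      pop: true

  signal-type:
    # Everything up to the semicolon is the subtype indication.
    - match: ';'
      scope: punctuation.terminator.vhdl
      pop: true
```

Because the identifier list is consumed one match at a time, this also copes with declarations that span several lines, which the single-regex version cannot do.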