Yeah, I’ve had to give up matching identifiers on most things because I want greater scope identification and control for this language. The TextMate syntax definition is alright in general, and Brian Padolino put a lot of work into it. It does match the trailing identifier in many instances, but that makes for very complex patterns. My rewrite scopes things with more granularity, but then I lose out on the identifier match.
I don’t know that I follow your final comment, though, because I have seen this work in Padolino’s syntax. For example, the language has the following construct:
architecture <valid_identifier_for_architecture_name> of <valid_identifier_for_entity> is
    <block declaration_pattern>
begin
    <concurrent statements>
end [architecture] [<valid_identifier_for_architecture>];
The Padolino construct looks like this (after being translated into YAML – it was originally the TextMate XML format):
architecture_pattern:
  - match: |-
      (?x)
      # The word architecture $1
      \b((?i:architecture))\s+
      # Followed up by a valid $3 or invalid identifier $4
      (([a-zA-Z][a-zA-Z0-9_]*)|(.+))(?=\s)\s+
      # The word of $5
      ((?i:of))\s+
      # Followed by a valid $7 or invalid identifier $8
      (([a-zA-Z][a-zA-Z0-9_]*)|(.+?))(?=\s*(?i:is))\b
    captures:
      1: keyword.language.vhdl
      3: entity.name.type.architecture.begin.vhdl
      4: invalid.illegal.invalid.identifier.vhdl
      5: keyword.language.vhdl
      7: entity.name.type.entity.reference.vhdl
      8: invalid.illegal.invalid.identifier.vhdl
    push:
      - meta_scope: meta.block.architecture
      - match: |-
          (?x)
          # The word end $1
          \b((?i:end))
          # Optional word architecture $3
          (\s+((?i:architecture)))?
          # Optional same identifier $6 or illegal identifier $7
          (\s+((\3)|(.+?)))?
          # This will cause the previous to capture until just before the ; or $
          (?=\s*;)
        captures:
          1: keyword.language.vhdl
          3: keyword.language.vhdl
          6: entity.name.type.architecture.end.vhdl
          7: invalid.illegal.mismatched.identifier.vhdl
        pop: true
... (there are a lot of - includes: after this for various lexical structures)
If I’m reading your line correctly, this shouldn’t work because \3 would be referencing group 3 from the pushed match. However, it does seem to work, though there are no separate scopes for the prologue block and the statements block, which means it will technically match a lot of things in places where it shouldn’t.
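To make the behaviour being discussed concrete, here is a rough Python sketch of what "a closing pattern that reuses the opening identifier" amounts to. It is not how Sublime implements anything; it simply substitutes the architecture name captured on the opening line into a closing regex modelled on the one quoted above. The names `IDENT`, `BEGIN`, `closing_pattern` and `classify` are made up for the illustration, and the comparison is case-sensitive, as in the quoted pattern.

```python
import re

# Assumption: a basic VHDL identifier, as in the quoted pattern.
IDENT = r"[a-zA-Z][a-zA-Z0-9_]*"

# Opening line, modelled on the begin pattern quoted above.
BEGIN = re.compile(
    rf"\b(?i:architecture)\s+(?P<arch>{IDENT})\s+(?i:of)\s+(?P<entity>{IDENT})\s+(?i:is)\b"
)


def closing_pattern(arch_name):
    """Build the closing regex with the opening name substituted in,
    which is the effect a backreference to the opening match would have."""
    return re.compile(
        rf"\b(?i:end)(?:\s+(?i:architecture))?"
        rf"(?:\s+(?:(?P<same>{re.escape(arch_name)})|(?P<mismatch>.+?)))?"
        rf"(?=\s*;)"
    )


def classify(opening_line, closing_line):
    """Return 'ok', 'mismatched', or a short reason when a line does not parse."""
    opening = BEGIN.search(opening_line)
    if opening is None:
        return "not an architecture"
    closing = closing_pattern(opening.group("arch")).search(closing_line)
    if closing is None:
        return "no end found"
    # The 'mismatch' branch only wins when the expected name did not fit.
    return "mismatched" if closing.group("mismatch") else "ok"


if __name__ == "__main__":
    opening = "architecture rtl of counter is"
    for closing in ("end architecture rtl;", "end rtl;", "end;", "end architecture behav;"):
        print(closing, "->", classify(opening, closing))
```

The lookahead for the semicolon is what forces a partial hit such as `rtl` inside `rtl2` to fall through to the mismatch branch instead of being accepted.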
My variation is somewhat simpler, but all I do is check for valid identifier construction; I cannot ensure that the closing identifier matches the opening one. I did it with three named contexts, but that was the first one I’d done with a prologue, and I’ve subsequently done some others with anonymous contexts. Anyhow, the whole point of the feature suggestion was that it’d be nice to be able to accomplish both tasks: the greater granularity on scoping the lexical elements, and feedback to the author if they write an invalid closing construct.
# The architecture needs three contexts to get the scopes right
# for the various structures.
architecture-begin:
  - match: '(?i)^\s*(architecture)\s+({{identifier}})\s+(of)\s+({{identifier}})\s+(is)'
    captures:
      1: storage.type.architecture.vhdl
      2: entity.name.architecture.vhdl
      3: keyword.other.vhdl
      4: entity.name.entity.vhdl
      5: keyword.declaration.vhdl
    push: architecture-declarations

# Note: 'set' is used because push will set us two into the stack and I
# cannot pop twice.
architecture-declarations:
  - meta_scope: meta.block.arch-declarations.vhdl
  - include: block-declarative-items
  - match: '(?i)\b(begin)\b'
    captures:
      1: keyword.declaration.vhdl
    set: architecture-statements

architecture-statements:
  - meta_scope: meta.block.arch-statements.vhdl
  - include: concurrent-statements
  # Each optional part carries its own leading whitespace, so 'end;',
  # 'end architecture;' and 'end <name>;' match as well as the full form.
  - match: '(?i)^\s*(end)(?:\s+(architecture))?(?:\s+({{identifier}}))?\s*(;)'
    captures:
      1: keyword.declaration.vhdl
      2: storage.type.architecture.vhdl
      3: entity.name.architecture.vhdl
      4: punctuation.terminator.vhdl
    pop: true
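A quick way to sanity-check an end pattern like this outside the editor is to expand `{{identifier}}` by hand and run a few closing lines through Python's `re` module. The sketch below assumes `{{identifier}}` is `[a-zA-Z][a-zA-Z0-9_]*` (a guess borrowed from the Padolino pattern) and that Python's `re` agrees with Sublime's regex engine on constructs this simple. It compares two spellings of the same idea: one with a mandatory `\s+` on either side of the optional keyword, and one where each optional part owns its leading whitespace. `STRICT_END`, `LOOSE_END` and `accepted` are names invented for the example.

```python
import re

# Assumption: what {{identifier}} expands to in the syntax definition.
IDENT = r"[a-zA-Z][a-zA-Z0-9_]*"

# Mandatory whitespace on both sides of the optional keyword.
STRICT_END = re.compile(rf"(?i)^\s*(end)\s+(architecture)?\s+({IDENT})?\s*(;)")

# Each optional part carries its own leading whitespace.
LOOSE_END = re.compile(rf"(?i)^\s*(end)(?:\s+(architecture))?(?:\s+({IDENT}))?\s*(;)")

# The four closing forms the construct allows.
CLOSINGS = ["end architecture rtl;", "end rtl;", "end architecture;", "end;"]


def accepted(pattern):
    """Return the closing lines that the given pattern matches."""
    return [line for line in CLOSINGS if pattern.match(line)]


if __name__ == "__main__":
    print("strict:", accepted(STRICT_END))
    print("loose: ", accepted(LOOSE_END))
```

With single spaces between words, the strict spelling only accepts the fully written-out closing, because the two `\s+` runs cannot both be satisfied once either optional part is omitted.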