Sublime Forum

Complicated Syntax YAML Structure

#1

I do apologize for the questions if they get to be too many. The documentation for the YAML file is good for a basic overview, but then I look at something like the Python.sublime-syntax file for examples and it’s clear that there are a lot of things that are glossed over.

  1. Actually a simple one first. In a lot of the Python.sublime-syntax contexts, particularly looking for strings, it tries to capture the group (''). When I am testing regexps I usually have to escape a single quote and it looks like (\'). Are ('') and (\') identical? I haven’t figured out if this is a YAML thing, or a Oniguruma thing or where it comes from so I don’t know how valid it is.

  2. Second question. I think I’ve got basic string identification set up correctly, but I thought I’d try something a little more complicated. The language uses string literals for a lot of things like bit arrays. It’s possible to specify these bit arrays explicitly as such: B"10100110". So far, so good. When explicitly specified like this, there are a limited number of characters that are allowed. You get 0, 1, and then a few extras like X, U, -, and _, which mean specific things in digital design context. I thought, wouldn’t it be nice if I could flag bad characters in the string?

So, I wrote the following. Is this going to do what I think it’s going to do? Namely,

  1. mark the initial case-insensitive B with a storage type context,
  2. mark the punctuation context
  3. Push the unnamed context onto the stack
  4. Apply the string scope
  5. Watch for the end of the string and flag if we hit an invalid EOL case.
  6. Watch for good characters, or anything else, by definition bad characters.

I think #6 is where I have the most unease because . explicitly matches everything. However I also think REs are evaluated left to right, so I should match and capture the good ones before matching the anything-else clause. I’m also a little uneasy about multiple matches in the unnamed context. In the examples in the documentation, when you push, you declare a specific context and then you write that context separately. However this unnamed context is a LOT more convenient and so you don’t have a lot of structure-start, structure-continue, structure-end context names which seems to clutter up the namespace greatly. So, based on Python.sublime-syntax I think I’m okay, but I figured it couldn’t hurt to check in and see if I’m way out in the weeds, or pretty close to the trail.

  binary-bit-string-literal:
    - match: '(bB)(\")'
      captures: 
        1: storage.type.bit.vhdl
        2: punctuation.definition.string.begin.vhdl
      push: 
        - meta_scope: string.bit.binary.vhdl
        - match: '(\")|(\n)'
          captures:
            1: punctuation.definition.string.end.vhdl
            2: invalid.illegal.unclosed-string.vhdl
          pop: true
        - match: '([0-1zZxX\-_])|(.)'
          captures: 
            1: valid.character.bit.vhdl 
            2: invalid.illegal.unknown-char.vhdl

Update: Yeah sorry if anyone saw any of the other stuff. I have a basic starter file launched and correctly identifying comments now.


#2

Your syntax file should look like this:
You need a main context so the parser knows where to start; from there you just include other contexts. I made small changes to your scope captures, because the (...)|(...) form hurts readability.

%YAML 1.2
---
name: VHDL
file_extensions:
  - vhdl
scope: source.vhdl
contexts:
  main:
    - include: string
  string:
    - include: binary-bit-string-literal

  binary-bit-string-literal:
    - match: '([bB])(")'
      captures:
        1: storage.type.bit.vhdl
        2: punctuation.definition.string.begin.vhdl
      push:
        - meta_scope: string.bit.binary.vhdl
        - match: '"'
          scope: punctuation.definition.string.end.vhdl
          pop: true
        - match: \n
          scope: invalid.illegal.unclosed-string.vhdl
          pop: true
        - match: '[0-1zZxX\-_]'
          scope: valid.character.bit.vhdl
        - match: '.'
          scope: invalid.illegal.unknown-char.vhdl

#3

Okay, so the matches are ordered with priority. I agree that looks a lot more readable and logical, and it lets me set up fall-through failure handling. I was sort of writing the syntax file up by itself, but then finally had to back off and go more of a baby-steps route where I am adding a single construct at a time and watching a test file get rendered with it. I also wrote myself a little scope sniffer routine, which I prefer to the context viewer because it prints out to the console instead of popping up a little dialog box.
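
The "ordered with priority" behaviour can be approximated in Python. This is only a rough model, assuming that when several rules match at the same position the one listed first wins; it uses Python's re module rather than Sublime's engine, and the function name scope_bit_string_body is made up for illustration.

```python
import re

# The four rules of the pushed context from post #2, in file order:
# (pattern, scope, pops_context)
RULES = [
    (re.compile(r'"'), "punctuation.definition.string.end.vhdl", True),
    (re.compile(r"\n"), "invalid.illegal.unclosed-string.vhdl", True),
    (re.compile(r"[0-1zZxX\-_]"), "valid.character.bit.vhdl", False),
    (re.compile(r"."), "invalid.illegal.unknown-char.vhdl", False),
]

def scope_bit_string_body(text):
    """Scope the text that follows the opening B" until the context pops."""
    scoped = []
    pos = 0
    while pos < len(text):
        for pattern, scope, pops in RULES:
            m = pattern.match(text, pos)
            if m:  # first rule in file order that matches here wins
                break
        else:
            break  # not reached with these rules: "." and "\n" cover every character
        scoped.append((m.group(), scope))
        pos = m.end()
        if pops:   # closing quote or newline ends the context
            break
    return scoped

for char, scope in scope_bit_string_body('10Q1"'):
    print(repr(char), scope)
```

Because the catch-all "." rule is listed last, it only ever sees characters that the more specific rules above it have already rejected.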


#4

To clarify on your first question, YAML has multiple types of strings: quoted, unquoted and block. Only quoted strings have escape mechanisms, which are different for the two quotation marks. Double-quoted strings use backslash escapes and support stuff like \n or \u.... (and of course \\) while single-quoted strings only have an escape sequence for single quotes, which is two single quotes ('').
Because regular expressions make use of backslashes a lot, they aren’t suited to be contained within double-quoted strings, so you usually want to use any of the others.
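
To make the single-quote rule concrete, here is a tiny Python sketch of how a YAML single-quoted scalar is decoded. It is deliberately not a YAML parser (it ignores multi-line folding, for instance), and the helper name unquote_single is invented for this illustration.

```python
def unquote_single(scalar: str) -> str:
    """Decode a YAML single-quoted scalar; the only escape is '' -> '."""
    if len(scalar) < 2 or scalar[0] != "'" or scalar[-1] != "'":
        raise ValueError("not a single-quoted scalar")
    return scalar[1:-1].replace("''", "'")

# Backslashes pass through untouched, so the regex engine sees them as written.
print(unquote_single(r"'[0-1zZxX\-_]'"))
# A doubled single quote collapses to one literal quote character.
print(unquote_single("'('')'"))
```

So ('') in a single-quoted YAML string reaches the regex engine as ('), a group containing one quote character.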

On another note, anonymous and named contexts are pretty much equivalent except that you can’t set or push multiple contexts if one of them is anonymous (since in that case push expects a list of strings, which are context names).

(Also, you could just match [^0-1zZxX\-_] as invalid and leave the remaining characters unmatched and untouched.)
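
As a quick check of that negated class, the following Python sketch lists the characters it would flag inside the body of a bit string. It uses Python's re module as a stand-in for Oniguruma, and invalid_positions is a hypothetical helper, not part of any syntax file.

```python
import re

# The negated character class suggested above, applied to the string body only.
INVALID = re.compile(r"[^0-1zZxX\-_]")

def invalid_positions(body):
    """Return (index, character) for every character the class rejects."""
    return [(m.start(), m.group()) for m in INVALID.finditer(body)]

print(invalid_positions("10Q1_a"))
```

Note that the class also rejects the double quote, so in a real context the closing-quote rule still has to come before it.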


#5

Aha, thanks. The double single quote is a YAML thing. Thank you.

The anonymous contexts are so useful that I’ve been making good use of them, however your point on multiple contexts is well noted. I know I’ve got some branching things later on (i.e. function prototype and function declaration patterns, not dissimilar to the typedef example in the literature) where I will have to be somewhat careful about the stack manipulation.


#6

Actually, this does work as long as the first context on the list is anonymous. I rely heavily upon this functionality in combination with my macro system. In order to circumvent this limitation, I sometimes push an anonymous “dummy” context first.


#7

Thought perhaps I would just add another question here because this does seem to be related to the difference between anonymous contexts, named contexts, and meta scopes.

So I have a structure that looks like this (there are actually a lot of variations on this so how I manage this will impact a lot of others).

architecture <arch_identifier> of <entity_identifier> is
    <block_declarative_items>
begin
    <statements>
end architecture <arch_identifier>;

The RE matching is going well (leveraging what I’ve done earlier) and I can create a meta scope for the entire block. However it would be technically more correct to create a meta scope for the block declarative items and a meta scope for the statements.

So I tried this:

  architectures:
    - match: '(?i)^\s*(architecture)\s+({{identifier}})\s+(of)\s+({{identifier}})\s+(is)'
      captures:
        1: storage.type.architecture.vhdl
        2: entity.name.architecture.vhdl
        3: keyword.other.vhdl
        4: entity.name.entity.vhdl
        5: keyword.declaration.vhdl
      push:
        - meta_scope: meta.block.arch-decl.vhdl
        - match: '(?i)\b(begin)\b'
          captures:
            1: keyword.declaration.vhdl
          set: meta.block.arch-body.vhdl
        - match: '(?i)^\s*(end)\s+(architecture)?\s+(\2)?\s*(;)'
          captures:
            1: keyword.declaration.vhdl
            2: storage.type.architecture.vhdl
            3: entity.name.architecture.vhdl
            4: punctuation.terminator.vhdl
          pop: true

What I was hoping would happen is that I’d set the meta_scope, and then when I hit begin, I would replace one meta_scope with another meta_scope. However this generates the error no such target meta.block.arch-body.vhdl. So it was trying to set a context, not a scope.

I believe I can do this pretty straightforward with three named contexts. One context to match the beginning line and set the follow on context, the second to watch for begin and set the third context, and the final context might be anonymous embedded in the second context or separate on its own. However in this fashion, I don’t believe I get to do the identifier matching because it’ll have lost the match data from the original context.

I could probably push again at the second phrase and create a second anonymous context, but this starts to get kind of messy when dealing with invalid identifier matching. I think as long as I’m in a child of the original context match, the group matches persist but the literature is definitely a bit vague on the way matches work (and I still have not figured out what the numbering is on nested groups).
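
On the numbering of nested groups: capture groups are numbered by the position of their opening parenthesis, counting left to right, and non-capturing groups (?:...) get no number. The Python sketch below shows this with the re module; the patterns are simplified stand-ins for the ones in this post, with \w+ in place of {{identifier}}.

```python
import re

line = "end architecture rtl;"

# Group 2 wraps groups 3 and 4, so the outer group is numbered first.
nested = re.match(r"(?i)(end)\s+((architecture)\s+(\w+))?\s*(;)", line)
print(nested.groups())

# Making the wrapper non-capturing removes it from the numbering.
flat = re.match(r"(?i)(end)\s+(?:(architecture)\s+(\w+))?\s*(;)", line)
print(flat.groups())
```

In the first pattern the identifier is group 4; in the second it is group 3, which is the number a captures: entry would need to use.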

So… is there a way to replace meta scopes in the middle of an anonymous context instead of pushing down another meta scope? I suspect the workaround is not to worry about getting terribly detailed with scopes, but honestly this scoping thing is pretty exciting because I think if done well, it could provide a wealth of information for searching and outlining and checking far beyond syntax coloring.


#8

I didn’t read it entirely, but it sounds like clear_scopes could help.


#9

Yeah, what I generally do here is something like:

some_context_on_the_stack:
    - meta_scope: meta.declaration

more_specific_context:
    - clear_scopes: 1
    - meta_scope: meta.declaration.name

I tried to come up with a live example, but I rely heavily on my macro system for stuff like this:

contexts:
  else-pop:
    - match: (?=\S)
      pop: true

  architectures:
    - match: !word architecture
      scope: storage.type.architecture.vhdl
      push:
        - !meta meta.block.arch-decl.vhdl
        - !expect [';', punctuation.terminator.vhdl]
        - - match: !word begin
            scope: keyword.declaration.vhdl
            set:
              - !meta_set meta.block.arch-decl.body.vhdl
              - !expect [ !word architecture, storage.type.architecture.vhdl ]
              - !expect [ !word end, keyword.declaration.vhdl ]
              - - !pop_on [ !word end ]
                - include: statements
          - include: else-pop
        - !expect ['{{identifier}}', entity.name.architecture.vhdl ]
        - !expect [ !word of, 'keyword.other.vhdl' ]
        - !expect ['{{identifier}}', entity.name.entity.vhdl]

The full expansion of this is not easy to read:

contexts:
  else-pop:
    - match: (?=\S)
      pop: true
  
  architectures:
    - match: '(?i)\b(?:architecture)\b'
      scope: storage.type.architecture.vhdl
      push:
        - - meta_scope: meta.block.arch-decl.vhdl
          - include: else-pop

        - - match: ';'
            scope: punctuation.terminator.vhdl
            pop: true
          - include: else-pop

        - - match: '(?i)\b(?:begin)\b'
            scope: keyword.declaration.vhdl
            set:
              - - clear_scopes: 1
                - meta_scope: meta.block.arch-decl.body.vhdl
                - include: else-pop

              - - match: '(?i)\b(?:architecture)\b'
                  scope: storage.type.architecture.vhdl
                  pop: true
                - include: else-pop

              - - match: '(?i)\b(?:end)\b'
                  scope: keyword.declaration.vhdl
                  pop: true
                - include: else-pop

              - - match: '(?=(?i)\b(?:end)\b)'
                  pop: true
                - include: statements

          - include: else-pop

        - - match: '{{identifier}}'
            scope: entity.name.architecture.vhdl
            pop: true
          - include: else-pop

        - - match: '(?i)\b(?:of)\b'
            scope: keyword.other.vhdl
            pop: true
          - include: else-pop

        - - match: '{{identifier}}'
            scope: entity.name.entity.vhdl
            pop: true
          - include: else-pop

If written by hand, this approach would probably call for quite a few more named contexts. This approach does have several advantages over the traditional method:

  • It seems more declarative, which helps me to reason about it.
  • It’s very concise (with macros).
  • It handles newlines very well without extra code (e.g. “architecture¬name¬of¬name¬is…”)
  • If any part is missing, the rest will be highlighted correctly.

But it does result in a lot of boilerplate if you have to write it all yourself.


#10

Is your “macro system” what is handling those !meta, !expect, and a few of the others? I will have to study this and try to figure out what’s going on. Is this out there on package control or just your own creation for personal use?

It does look like I should 1) use named contexts here and 2) investigate how the clear command worked. I had forgotten that it didn’t clear everything and that you could provide a value for it to pop off. The name matching might go and I haven’t quite figured out the else-pop (I mean I can tell what it’s doing, but the why hasn’t struck me yet.)

Also, not entirely sure that wouldn’t match all the initial line elements out of order? It looks like it’s matching twice on my {{identifier}} variable, so wouldn’t the first one match before the other? Yeah I think this is definitely something I don’t understand about the deeper grouping.

I will say the basic syntax that I wrote (aside from the bit about ‘begin’) does seem to work out okay. It’ll identify all the way through to the end and seems to apply the context appropriately throughout. It’s just the matter of the declarative block and the statements block would be nice to have separate metas for. Clear might do it for me though… just need to look at the scope stack right before the begin and make sure that I’m going to erase the correct one.


#11

Yes; that is my macro system. The !whatever tokens are YAML tags, which are perfectly valid though somewhat obscure; they are used by various serializers and such. I have a build system that interprets them as macros; you supply your own python file with each macro defined as a method.

I have not yet made a package. I should probably do that.

I haven’t quite figured out the else-pop (I mean I can tell what it’s doing, but the why hasn’t struck me yet.)

The idea is that you have a context that does exactly one thing — for example, consumes a semicolon. If that is your top context, then anything other than a semicolon is unexpected and you should break out of that context. In this architecture, you often have large stacks of single-purpose contexts — you push them all at once in reverse order, and they are popped one by one in forward order. This can be a bit confusing — the context at the end of the push list will be the first one whose match will be checked. I once wrote a !sequence macro that reversed this so that the contexts were written in sequence order, but that became confusing in complex cases.
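
The push-in-reverse, pop-in-order behaviour can be modelled in a few lines of Python. This is a rough sketch, not how Sublime Text implements it: run_expect_stack, the token list, and the keyword-excluding identifier pattern are all made up for illustration, and splitting the input into tokens stands in for else-pop's (?=\S) skipping over whitespace and newlines.

```python
import re

# Hypothetical identifier pattern that refuses the keywords used below.
IDENT = r"(?!(?:of|is)\b)[a-z_]\w*"

# Written like a push list: the LAST entry ends up on top and is tried first.
PUSHED = [
    (r"is", "keyword.declaration.vhdl"),
    (IDENT, "entity.name.entity.vhdl"),
    (r"of", "keyword.other.vhdl"),
    (IDENT, "entity.name.architecture.vhdl"),
]

def run_expect_stack(pushed, tokens):
    """Return (scoped tokens, unconsumed tokens) for single-purpose contexts."""
    stack = list(pushed)
    scoped = []
    i = 0
    while stack and i < len(tokens):
        pattern, scope = stack.pop()  # "pop: true" and else-pop both remove the top
        if re.fullmatch(pattern, tokens[i], re.IGNORECASE):
            scoped.append((tokens[i], scope))
            i += 1                    # only a real match consumes the token
        # otherwise else-pop fired: nothing consumed, the next context gets a try
    return scoped, tokens[i:]

print(run_expect_stack(PUSHED, ["rtl", "of", "thing", "is"]))
print(run_expect_stack(PUSHED, ["rtl", "thing", "is"]))  # "of" is missing
```

In the second call the context expecting "of" simply pops without consuming anything, and the remaining tokens still receive their intended scopes.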

The context produced by !meta doesn’t match anything at all. It’s just there so that the meta_scope is used as long as the context is on the stack. When everything above it pops off, it means that the construct you’re parsing is done, so the !meta context will pop immediately.

I readily admit that this is not the typical approach; I don’t recall seeing any other syntaxes written in this style. For me, the heavily stack-based approach makes sense, whereas long chains of sets confused me, and I sometimes found it difficult to maintain the correct meta_scope and to handle errors with grace.

In particular, I was working on a custom Oracle SQL syntax. I ran into several major difficulties when testing it on my company’s large legacy code base:

  • Multi-word keywords and operators might be split across several lines, with many optional parts.

    select
    distinct
    …
    group
    by
    rollup
    …
    
  • Identifiers can always be double-quoted.

    select "colA" as "Column A" …;
    create table "myTable" …;
    
  • The as keyword is often optional, making expression parsing tricky:

    select
        "colA" as "Column A",
        "colB" "Column B"
    --  ^^^^^^ variable.other.column
    --          ^^^^^^^^ entity.name.alias
    ;
    

The heavily stack-based architecture helped with the first and last problems, and the macros cleaned up the code dramatically while solving the second problem. Example:

contexts:
  select-clause:
    - match: !word select
      scope: keyword.other.select
      push:
        - !meta select-clause
        - select-list
        - !expect [ !word "distinct|unique|all", keyword.other ]
        - expect-hint

  select-list:
    - match: (?=\S)
      set:
        - select-list-rest
        - select-list-item

  select-list-rest:
    - match: ','
      scope: punctuation.separator.comma
      push: select-list-item
    - include: else-pop

  select-list-item:
    - match: '\*'
      scope: keyword.other.star
      pop: true
    - match: (?=\S)
      set:
        - !meta_set meta.select-clause.item
        - !expect_identifier entity.name.alias
        - !expect [ !word as, keyword.other ]
        - expression  # not pictured

The macros:

def word(expr):
  return r'(?i:\b(?:%s)\b)' % expr

def meta(scope):
  return [
    { "meta_scope": scope, },
    { "include": "else-pop", },
  ]

def meta_set(scope):
  return [
    { "clear_scopes": 1, },
    { "meta_scope": scope, },
    { "include": "else-pop", },
  ]

def expect(expr, scope):
  return [
    {
      "match": expr,
      "scope": scope,
      "pop": True,
    },
    { "include": "else-pop", }
  ]

def expect_identifier(scope):
  return [
    {
      "match": "{{identifier}}",
      "scope": scope,
      "pop": True,
    },
    {
      "match": '"',
      "scope": "punctuation.definition.string.begin",
      "set": [
        { "meta_scope": "meta.string.identifier" },
        { "meta_content_scope": scope },
        {
          "match": '"',
          "scope": "punctuation.definition.string.end",
          "pop": True,
        },
        {
          "match": "\n",
          "scope": "invalid.illegal.newline",
          "pop": True,
        }
      ],
    },
    { "include": "else-pop", }
  ]

This example covers a core set of macros that are useful throughout a SQL syntax. Having these available made it easy to “do the right thing” when it came to quoted identifiers, newlines, case-sensitivity, and so forth, without filling the syntax with hundreds of lines of boilerplate. The macros also helped to provide a much richer set of meta scopes for custom tooling than is available for most syntaxes.

Finally, a key design requirement of the syntax was gracefully handling invalid syntax. We have a lot of legacy code that manually builds up very large and complex queries, and it was very important that the syntax could correctly handle a partial code fragment out of context and deal with “missing” constructs that are interpolated from the enclosing syntax using with_prototype. The built-in SQL syntax handles this moderately well because it doesn’t try to interpret much structure, but it lacked support for Oracle features we use. Existing Oracle syntaxes tried to be smarter, but this made them brittle.

Eventually, I tried to prototype my own syntax to see if there was any way to modify an existing syntax to correctly handle alias names without the as keyword, but that proved impossible without a dramatic restructuring of large parts of the syntax, so I wrote my own instead. (This restructuring, less the macros, is essentially what I’ve been working on to fix a variety of issues in the JavaScript syntax.)


#12

@ThomSmith you magnificent bastard :smiley: meta-programming syntax files (!) thanks for giving this exposition.


#13

Wow. That is seriously impressive. I’ll have to give it some thought and see if a strategy like that would be suitable for this language. I have been progressing down a more traditional route and it’s worked out fairly well so far, but I’ll think about it and see if it might be a good idea. This language (originally with Ada roots) is pretty great, but also very verbose so having a tool aid keeping track of things is pretty nice.


#14

Yeah clear_scopes has achieved one thing at the loss of another. If I do it as follows I can properly scope the declarative portion of the structure and the concurrent statements portion, but I lose the ability to check the identifier as the knowledge of the regexp field from the original match seems to be lost once I push down again.

  architectures:
    - match: '(?i)^\s*(architecture)\s+({{identifier}})\s+(of)\s+({{identifier}})\s+(is)'
      captures:
        1: storage.type.architecture.vhdl
        2: entity.name.architecture.vhdl
        3: keyword.other.vhdl
        4: entity.name.entity.vhdl
        5: keyword.declaration.vhdl
      push:
        - meta_scope: meta.block.arch-declarations.vhdl
        - include: block-declarative-items
        - match: '(?i)\b(begin)\b'
          captures:
            1: keyword.declaration.vhdl
          push: 
            - clear_scopes: 1
            - meta_scope: meta.block.arch-statements.vhdl
            - match: '(?i)^\s*(end)\s+(architecture)?\s+({{identifier}})?\s*(;)'
              captures:
                1: keyword.declaration.vhdl
                2: storage.type.architecture.vhdl
                3: entity.name.architecture.vhdl
                4: punctuation.terminator.vhdl
              pop: true

This may not be a great loss as you typically don’t have a lot of these blocks running around (it’s pretty much the ‘main’ of a VHDL code block) so if you screw up the identifier match at the end, all it’ll do is give you a compiler error and it’s a simple quick fix. What I CAN do this way though is have two different includes for the two different meta scopes. Only partially shown here because I don’t have everything in the second set done, but you have some things that are the declaratives and others that are the statements, so I could separate that out a bit and that could be useful for parsing.

Where it’s more problematic is there are other variations on this theme (intro lexical statement / block / begin / another block / end lexical statement) that are a lot more frequent. It would be nice to have a way to have some sort of consistent captured store but I can pick and choose when it’s more important to support well defined includes and when it’s more useful to support matching identifiers.


#15

Well I spoke too soon. clear_scopes is not doing what I want it to do.

So, it looked like it was doing what it was supposed to do, however it’s not actually clearing as far as I can tell. Observing the scope just after the architecture statement, it is correctly

source.vhdl meta.block.arch-declarations.vhdl

Observing the scope just after the begin statement, it appears to be:

source.vhdl meta.block.arch-statements.vhdl

Then I go to the end, it correctly identifies the ending clause, and then I sniff again afterwards and it’s back to

source.vhdl meta.block.arch-declarations.vhdl

So… as far as I can tell, clear scopes didn’t actually clear it… just made it invisible for a little bit? I may just have to punt and go back to not marking internal portions, or doing it with several named contexts.

EDIT: So I translated this into a 3 named context variation and the scopes are still not working correctly. Here’s what I have thus far:

  architecture-begin:
    - match: '(?i)^\s*(architecture)\s+({{identifier}})\s+(of)\s+({{identifier}})\s+(is)'
      captures:
        1: storage.type.architecture.vhdl
        2: entity.name.architecture.vhdl
        3: keyword.other.vhdl
        4: entity.name.entity.vhdl
        5: keyword.declaration.vhdl
      push: architecture-declarations

  architecture-declarations:
    - meta_scope: meta.block.arch-declarations.vhdl
    - include: block-declarative-items
    - match: '(?i)\b(begin)\b'
      captures:
        1: keyword.declaration.vhdl
      push: architecture-statements

  architecture-statements:
    - clear_scopes: 1
    - meta_scope: meta.block.arch-statements.vhdl
    - include: concurrent-statements
    - match: '(?i)^\s*(end)\s+(architecture)?\s+({{identifier}})?\s*(;)'
      captures:
        1: keyword.declaration.vhdl
        2: storage.type.architecture.vhdl
        3: entity.name.architecture.vhdl
        4: punctuation.terminator.vhdl
      pop: true

Examining the code, I get the following. I’ve annotated the scopes inline, in the style of the syntax test files.

...
   ^-- source.vhdl
architecture rtl of thing is
   ^-- correctly identified with contexts
...
   ^-- source.vhdl meta.block.arch-declarations.vhdl
begin
...
   ^-- source.vhdl meta.block.arch-statements.vhdl
end architecture rtl;
   ^-- correctly identified marks
...
   ^-- source.vhdl meta.block.arch-declarations.vhdl

Why in the world is arch-declarations coming back?


#16

Well I have a method that is working. I think (but do not know for sure) that what is happening is that my first context identifies the starter and pushes the basic large body structure notion onto the stack, more specifically it pushed the declaration point onto the stack. When I found the statements portion the push there works in some ways, but honestly it was the problem because now I’m 2 deep into a context that really should be one deep. When I pop, I went back to the declaration portion and so the meta_scope actually reapplied itself.

So trying set seems to work, as long as I get rid of the clear_scopes construct (because that was destroying my source.vhdl scope as well).

I think what is getting to me is that there’s a context stack and there’s a scope stack and there are commands for manipulating both. It would have been nice to handle this with anonymous contexts, but this is okay with three and it’s definitely self documenting. And I can clearly have my includes for each block.

I decided as cool as the preprocessor macro system was for the syntax files, it doesn’t suit itself well to this situation because there are so many variations on these verbose structures that even that would probably get mangled.

pop taking an integer argument a bit like clear_scopes might have also been a potential solution if that worked. I suppose I might suggest that as a feature.


#17

I’m not entirely following you, but if you have the problem that a meta_scope somewhere down the stack is prematurely scoping things that should not yet be scoped with that particular meta_scope, you could use something like this:

architecture-statements:
  # the dummy, only here to match anything and consume nothing
  - match: ""
    set:
      # this is the real context
      - meta_scope: meta.block.arch-statements.vhdl
      - ... etc ... pop as usual

#18

Why in the world is arch-declarations coming back?

I can’t see where you’re ever popping architecture-declarations off the stack.

It looks like once you see the begin and you put architecture-statements onto the stack, you never want to see the scope meta.block.arch-declarations.vhdl again. If that is so, then in architecture-declarations, you should set architecture-statements instead of pushing it. Then, you also don’t need clear_scopes.

If you do want the meta.block.arch-declarations.vhdl to continue for a time after the statements end, then your implementation is nearly correct, but you’ll have to define a rule that will pop architecture-declarations when you’re done with it.


#19

Yes, I apologize. I end up with a question and I realize there’s a lag time between when I am working on the problem, and when anyone reads it. However at the same time I’m continuing work on it, and thus I end up editing the post and trying to add in additional information that I find out, but to someone coming in somewhat fresh it’s confusing.

I believe when I was using two push statements, I originally had problems because whether anonymous or named, when I hit the ending construct, I was two contexts deep onto the stack. However I can’t actually see the context stack. All I can see is my trail of meta_scopes. (Seeing the current context stack would be kind of useful for debugging for sure.) I was using clear_scopes: 1 so that a 2nd push didn’t end up with both scopes on the scope stack, however pop doesn’t work on scopes, it works on contexts so since I can’t pop twice, when I popped I actually went back to my 2nd tier context.

To elaborate:

Context Stack                             Scope Stack
orig_stack                                source.vhdl
orig_stack arch-decl                      source.vhdl meta.block.arch-decl    
orig_stack arch-decl arch-stmt            source.vhdl (meta.block.arch-decl deleted) meta.block.arch-stmts
orig_stack arch-decl                      source.vhdl meta.block.arch-decl  <-- REAPPLIED because I'm 
                                                                                NOT where I thought I was.

Point being, I was trying to take care of the stack I could see (scopes) but had forgotten that I am dealing with a context stack too and pop works on that.
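
The interplay of the two stacks can be reproduced with a small Python model. It is a simplification under stated assumptions: the visible scope is the base scope plus the meta_scope of each context on the stack, and clear_scopes: N hides the N scopes below the context that declares it. The names visible_scopes and trace are hypothetical, and details such as how the matched text itself is scoped are left out.

```python
BASE = "source.vhdl"
DECL = ("meta.block.arch-declarations.vhdl", 0)       # (meta_scope, clear_scopes)
STMT = ("meta.block.arch-statements.vhdl", 0)
STMT_CLEAR = ("meta.block.arch-statements.vhdl", 1)

def visible_scopes(context_stack):
    """Derive the scope stack from the context stack."""
    scopes = [BASE]
    for meta_scope, clear in context_stack:
        if clear:
            del scopes[-clear:]       # hidden only while this context is on the stack
        scopes.append(meta_scope)
    return " ".join(scopes)

def trace(begin_action, stmt_context):
    """Scopes seen after 'architecture ... is', after 'begin', and after 'end ...;'."""
    stack = [DECL]                    # architecture ... is -> push declarations
    seen = [visible_scopes(stack)]
    if begin_action == "push":
        stack.append(stmt_context)    # begin -> two contexts deep
    else:
        stack[-1] = stmt_context      # set = pop the current context, then push
    seen.append(visible_scopes(stack))
    stack.pop()                       # end architecture ...; -> pop: true
    seen.append(visible_scopes(stack))
    return seen

print(trace("push", STMT_CLEAR))      # declarations scope reappears at the end
print(trace("set", STMT))             # back to plain source.vhdl at the end
```

With push plus clear_scopes the declarations context is still on the context stack, so its meta_scope returns after the pop; with set there is nothing left underneath to return to.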

@ThomSmith Thank you, yes, I wasn’t. And honestly there is no way to do that really with the language. It’s a 3 stage statement. There’s a lexical beginning, a lexical separator, and a lexical ending. Thus, using set seems to work out okay because it pops and pushes and I’m only ever 1 context deep. There’s no real ‘end’ to the first section because that’s automatically the beginning of the second.

I’m seriously half tempted to do a writeup on the documentation for these commands because I think there’s a lot of nuance to each one that is not immediately obvious – either that or I’m just especially dense, I don’t discount that possibility! :smiley:


#20

I think I may be misunderstanding you. How does the following differ from the behavior you want?

contexts:
  architecture-begin:
    - match: '(?i)^\s*(architecture)\s+({{identifier}})\s+(of)\s+({{identifier}})\s+(is)'
      captures:
        1: storage.type.architecture.vhdl
        2: entity.name.architecture.vhdl
        3: keyword.other.vhdl
        4: entity.name.entity.vhdl
        5: keyword.declaration.vhdl
      push: architecture-declarations

  architecture-declarations:
    - meta_scope: meta.block.arch-declarations.vhdl
    - include: block-declarative-items
    - match: '(?i)\b(begin)\b'
      captures:
        1: keyword.declaration.vhdl
      set: architecture-statements

  architecture-statements:
    - meta_scope: meta.block.arch-statements.vhdl
    - include: concurrent-statements
    - match: '(?i)^\s*(end)\s+(architecture)?\s+({{identifier}})?\s*(;)'
      captures:
        1: keyword.declaration.vhdl
        2: storage.type.architecture.vhdl
        3: entity.name.architecture.vhdl
        4: punctuation.terminator.vhdl
      pop: true
