I do apologize for the questions if they get to be too many. The documentation for the YAML file is good for a basic overview, but then I look at something like the Python.sublime-syntax file for examples and it’s clear that there are a lot of things that are glossed over.
- Actually, a simple one first. In a lot of the Python.sublime-syntax contexts, particularly the ones looking for strings, it tries to capture the group (''). When I am testing regexps I usually have to escape a single quote, and it looks like (\'). Are ('') and (\') identical? I haven’t figured out if this is a YAML thing, or an Oniguruma thing, or where it comes from, so I don’t know how valid it is.
- Second question. I think I’ve got basic string identification set up correctly, but I thought I’d try something a little more complicated. The language uses string literals for a lot of things like bit arrays. It’s possible to specify these bit arrays explicitly, like so: B"10100110". So far, so good. When explicitly specified like this, only a limited number of characters are allowed. You get 0, 1, and then a few extras like X, U, -, and _, which mean specific things in a digital design context. I thought, wouldn’t it be nice if I could flag bad characters in the string?
So, I wrote the following. Is this going to do what I think it’s going to do? Namely:
1. Mark the initial case-insensitive B with a storage type scope.
2. Mark the opening quote with the punctuation scope.
3. Push the unnamed context onto the stack.
4. Apply the string scope.
5. Watch for the end of the string, and flag it if we hit an invalid EOL case.
6. Watch for good characters; anything else is, by definition, a bad character.
I think #6 is where I have the most unease, because the final (.) explicitly matches everything. However, I also think REs are evaluated left to right, so I should match and capture the good characters before matching the anything-else clause. I’m also a little uneasy about multiple matches in the unnamed context. In the examples in the documentation, when you push, you declare a specific context and then you write that context separately. However, this unnamed context is a LOT more convenient, and it means you don’t have a lot of structure-start, structure-continue, structure-end context names, which seem to clutter up the namespace greatly. So, based on Python.sublime-syntax I think I’m okay, but I figured it couldn’t hurt to check in and see if I’m way out in the weeds, or pretty close to the trail.
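binary-bit-string-literal:
  # [bB] accepts either case; (bB) would only match the literal text "bB"
  - match: '([bB])(\")'
    captures:
      1: storage.type.bit.vhdl
      2: punctuation.definition.string.begin.vhdl
    push:
      - meta_scope: string.bit.binary.vhdl
      # closing quote ends the string; a newline first means it was never closed
      - match: '(\")|(\n)'
        captures:
          1: punctuation.definition.string.end.vhdl
          2: invalid.illegal.unclosed-string.vhdl
        pop: true
      # allowed characters are tried first; (.) catches whatever is left
      - match: '([0-1zZxX\-_])|(.)'
        captures:
          1: valid.character.bit.vhdl
          2: invalid.illegal.unknown-char.vhdl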
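One way to sanity-check the ordering worry in #6 outside of Sublime is a small simulation. This is only a sketch: it uses Python's re module as a stand-in for Oniguruma, the function name classify and the short labels are made up for illustration, and it assumes that when two rules in a context match at the same position, the one listed first wins. It also uses ([bB]) for the opening rule, since (bB) would only match the two-character text "bB".

```python
import re

# Python's re is a stand-in for Oniguruma here (an assumption); ordered
# alternation behaves the same way for patterns this simple.
OPEN = re.compile(r'([bB])(")')
CLOSE = re.compile(r'(")|(\n)')
BODY = re.compile(r'([0-1zZxX\-_])|(.)')


def classify(line):
    """Return (text, label) pairs for the first bit-string literal in line.

    classify and its labels are hypothetical helpers, not Sublime scopes.
    """
    opener = OPEN.search(line)
    if opener is None:
        return []
    tokens = [(opener.group(1), "storage"), (opener.group(2), "begin")]
    pos = opener.end()
    while pos < len(line):
        # The closing rule is listed first, so it is tried before the body
        # rule at every position (assumed tie-breaking: first rule wins).
        end = CLOSE.match(line, pos)
        if end:
            tokens.append((end.group(0), "end" if end.group(1) else "unclosed"))
            break
        body = BODY.match(line, pos)
        # Alternatives are tried left to right, so (.) only captures a
        # character that the allowed-character class already rejected.
        tokens.append((body.group(0), "valid" if body.group(1) else "invalid"))
        pos = body.end()
    return tokens


if __name__ == "__main__":
    for text, label in classify('B"10Q1"\n'):
        print(repr(text), label)
```

In this simulation only the Q in B"10Q1" ends up labelled invalid, and a line that ends before the closing quote gets the unclosed label on its newline, which matches the intent of steps 5 and 6.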
Update: Yeah sorry if anyone saw any of the other stuff. I have a basic starter file launched and correctly identifying comments now.