Yes, by all means. More expressivity always trumps minor maintainability/QoL improvements. (Though in my current case, some variables are an OR of hundreds of literals that need to be updated in about twenty files… The YAML Macros plugin will probably take care of that for me now.)
Importing variables from one syntax to another
In such a case, I’d try to put the list of literals into a single match in a common file and include that scope in all needed syntaxes.
The include:
%YAML 1.2
---
name: Common
hidden: true
scope: library.mysyntax
contexts:
  variables:
    - match: |-
        (?x)
        (?:
          literal1 | literal2 | literal3
        )
      scope: variable.language.mysyntax
The file using the include:
%YAML 1.2
---
name: mysyntax
scope: source.mysyntax
contexts:
  main:
    - include: MySyntax Include.sublime-syntax#variables
While we’re waiting, is there anything YAML Macros could do to make life easier? I’m open to improving it if possible.
I think that some more controls for context stack manipulation would be a better use of time, since it would allow us to handle things like complex heredocs properly.
Do you have something particular in mind? As it stands, the old backreference behavior is the only way to handle features like tag matching and heredocs. Extending the syntax engine to handle these more directly sounds like a tough challenge, but also one that could greatly benefit complex languages.
Yes, heredocs can be “nested”, but are FIFO in order, unlike pretty much all other constructs in most languages. They also have this funky ability to allow code to continue on until the end of the line. Here is an example (in Ruby): https://github.com/sublimehq/Packages/issues/710. If I recall correctly, Bash allows similar constructs with multiple heredocs per line.
In order to handle this, we could use some sort of ability to have a temporary stack that could be pushed or unshifted. It would then need to be “applied” to the main stack. I haven’t yet spent any time on how feasible the implementation is. It is a fairly specialized requirement, from what I can tell, so it hasn’t been a very high priority.
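To make the FIFO ordering concrete, here is a minimal Python sketch. It is not tied to Sublime's engine and assumes a heavily simplified heredoc syntax: `<<NAME` opens a heredoc, and a line consisting solely of `NAME` closes it. The function name `split_heredocs` is made up for illustration.

```python
import re
from collections import deque

# Simplified assumption: no <<-, <<~ or quoted terminators.
HEREDOC_START = re.compile(r"<<(\w+)")

def split_heredocs(source):
    """Return (code_lines, bodies); bodies is a list of (terminator, lines)."""
    pending = deque()  # terminators opened but not yet read, oldest first
    code_lines = []
    bodies = []
    current = None     # (terminator, lines) of the body being read
    for line in source.splitlines():
        if current is None and pending:
            # The oldest pending heredoc is read first: FIFO, not LIFO.
            current = (pending.popleft(), [])
        if current is not None:
            terminator, lines = current
            if line.strip() == terminator:
                bodies.append((terminator, lines))
                current = None
            else:
                lines.append(line)
            continue
        # Ordinary code: the whole line stays code, even after a `<<NAME`.
        code_lines.append(line)
        pending.extend(HEREDOC_START.findall(line))
    return code_lines, bodies
```

The queue of pending terminators is roughly what a “temporary stack” would have to carry: the line that opens the heredocs keeps being lexed as code to its end, and only then are the queued bodies consumed in the order they were opened.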
(I apologize in advance to the extent that I’m telling you things you already know.)
One way to do this would be to let each context take a parameter. You would have a stack of [context, parameter] pairs; or, equivalently, a stack of contexts and a stack of parameters. Parameters could be (and usually would be) empty. Then, the matching rules of the context could depend upon the parameter. (This might be exactly what you’re suggesting.)
This is basically how the current system works, where captures are a sort of hidden parameter that the child context can access. But it’s not properly implemented, probably because this is legacy tmLanguage behavior that I suspect was actually a bug to begin with.
This approach is incompatible with the Sublime strategy of (I assume) precompiling all of a context’s regexps into a DFA. Changing the parameter would require recompiling the context. I don’t know how long it takes to compile a single context; maybe it’s not that bad, and maybe the vast majority of the work can be cached anyway. This depends on the internals of the custom regexp system. Dynamic recompilation might be fast enough.
The reason this isn’t a problem today is that backreferences automatically trigger the legacy regexp engine. A new, proper parameter implementation could do the same. The old engine is relatively slow, but there’s no compilation time. It may be faster than dynamic recompilation (not to mention easier on memory).
Any such implementation could be tremendously useful for quite a few languages. Heredocs are a great example, but you could also use it for tag matching in HTML/XML/JSX and possibly even to handle whitespace in Python. Wouldn’t that be something!
Here’s an example I’ve come up with. It features with_parameters, which tells the engine to set a parameter on the pushed context. For XML tag matching:
contexts:
  main:
    - match: <(\w+)>
      scope: open-tag.xml
      push:
        - with_parameters:
            tag_name: \1
        - meta_scope: meta.tag.xml
        - include: tag-body
  tag-body:
    - match: </({{$tag_name}})>
      scope: close-tag.xml
      pop: true
    - match: </(\w+)>
      scope: invalid.illegal.unmatched
    - include: main
For Python indentation:
contexts:
  main:
    - match: ''
      push:
        - with_parameters:
            indent: ''
        - include: block
  block:
    - match: ^\s*$ # do nothing, empty line
    - match: ({{$indent}}(?:\s+)) # increase indent
      push:
        - with_parameters:
            indent: \1
        - meta_scope: meta.block.python
        - include: block
    - match: ^{{$indent}}(?=\S) # do nothing, same indent
    - match: ^ # decrease indent
      pop: true
The best implementation would probably be a separate parameter stack. Every time a context is pushed onto the context stack, a parameter record (or null) would be pushed onto the parameter stack. If a context refers to a parameter using {{$param}}, that context would either use the old engine or be dynamically compiled. It would look up the value starting from the top of the stack. (If no value was found, perhaps an unmatchable value should be substituted.) This way, the parameter records are nice and immutable. In the usual case where a context neither pushed nor used any parameters, everything would work as it does now, and the only overhead would be pushing a null value onto a stack. In cases where the new behavior was used, the parameter record overhead should hopefully be small, and the context matching performance should be at least as fast as the current backreference system (or faster if dynamic recompilation worked).
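The lookup described above can be modeled in a few lines of Python. This is only a sketch of the proposal, not anything Sublime implements: the class `ParameterStack`, its method names, and the choice of `(?!)` as the unmatchable value are all assumptions made for illustration.

```python
import re

UNMATCHABLE = r"(?!)"  # assumed stand-in: an empty negative lookahead never matches
PARAM_REF = re.compile(r"\{\{\$(\w+)\}\}")  # matches {{$name}}

class ParameterStack:
    """One record (a dict) or None per pushed context."""

    def __init__(self):
        self._records = []

    def push(self, record=None):
        # Copy the record so that later changes by the caller cannot leak in.
        self._records.append(dict(record) if record else None)

    def pop(self):
        return self._records.pop()

    def lookup(self, name):
        # Search from the top of the stack down to the bottom.
        for record in reversed(self._records):
            if record is not None and name in record:
                return record[name]
        return None

    def expand(self, pattern):
        """Substitute every {{$name}} in `pattern` with its escaped value."""
        def substitute(match):
            value = self.lookup(match.group(1))
            return UNMATCHABLE if value is None else re.escape(value)
        return PARAM_REF.sub(substitute, pattern)
```

With `{"tag_name": "div"}` on the stack, `expand(r"</({{$tag_name}})>")` yields a pattern that matches `</div>`; with no such parameter anywhere on the stack, the expanded pattern matches nothing. In this model the expanded pattern would have to be compiled each time, which is the dynamic recompilation cost discussed above.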
I haven’t done the math, but offhand, I think this would augment Sublime’s parser to handle a variety of context-sensitive languages. It would be orthogonal to other means of improving the parser, such as adding nondeterminism.
[sublime-syntax] Allow \1 in patterns that don't pop
How do you use them with regular expressions? Variables avoid repetitions and you can handle a long list of strings.
Example:
- match: '\b(string)\s*=\s*({{VARIABLE_XY}})'
  captures:
    1: scope.txt
    2: scope2.txt
I believe that you can’t replace this variable with a context.
Thank you for your plugin. Could you be more specific? I haven’t mastered Python, and when I try to do what you say using your plugin, ST says “variables are missing”.
What do I need to do?
Now that you mention possible enhancements of yamlmacros…
One major improvement: let the user of yaml macros write the macros in a normal text file with normal syntax rather than in Python. That would help it be used by people who are not fluent in Python, and increase the overall readability of the macros’ content. (Of course I understand this would be a huge change in the behaviour of the plugin as well.)
A medium-sized improvement would be allowing us to define macros that take more than one parameter.
And a micro-improvement: allow us to customise the first character of a macro call, rather than it be hardcoded to ‘!’.
The idea behind my suggestion is to create common re-usable rules.
Here is an extended version of my prior example
%YAML 1.2
---
name: Common
hidden: true
scope: library.mysyntax
contexts:
  variables:
    - match: |-
        (?x)
        (?:
          literal1 | literal2 | literal3
        )
      scope: variable.language.mysyntax
The main file
%YAML 1.2
---
name: mysyntax
scope: source.mysyntax
contexts:
  main:
    # the mapping key or the L-value in an assignment
    - match: \bstring\b
      scope: meta.mapping.key variable.other.
      push: mapping-maybe-operator
  mapping-maybe-operator:
    - meta_content_scope: meta.mapping
    - match: =
      scope: punctuation.separator.mapping.key-value keyword.operator.assignment
      set: mapping-value
    # no assignment, pop off
    - match: (?=\S)
      pop: true
  mapping-value:
    # the value contains one of the imported variables
    - meta_content_scope: meta.mapping.value
    - include: MySyntax Include.sublime-syntax#variables
    # pop off if something other than the variable was found
    - match: (?=\S)
      pop: true
This looks a bit more complicated at first glance, but it allows proper meta-scope and invalid.illegal handling. The latter is not included here.
In general, I’d suggest avoiding capturing patterns like…
- match: '\b(string)\s*=\s*({{VARIABLE_XY}})'
  captures:
    1: scope.txt
    2: scope2.txt
They can slow down the whole lexer by 10 to 15% and cause the expression to be highlighted only once it is complete. While you are typing, string won’t be highlighted until = and one of the VARIABLE_XY literals have been matched as well. This makes for a poor writing experience.
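The all-or-nothing behaviour can be illustrated with Python's `re` module. This is only an analogy: Python's regex engine is not Sublime's, so it says nothing about the 10 to 15% figure, and the literal names and helper functions below are placeholders invented for the example.

```python
import re

# Placeholder literals standing in for the contents of VARIABLE_XY.
LITERALS = ["literal1", "literal2", "literal3"]

# One pattern that captures key, operator and value in a single match.
COMBINED = re.compile(r"\b(string)\s*=\s*(" + "|".join(LITERALS) + r")")

# The first rule of the context-based approach: the key on its own.
KEY_ONLY = re.compile(r"\bstring\b")

def highlighted_by_combined(text):
    """True if the single capturing pattern matches anywhere in `text`."""
    return COMBINED.search(text) is not None

def highlighted_by_key_rule(text):
    """True if the key-only rule matches anywhere in `text`."""
    return KEY_ONLY.search(text) is not None
```

For the partial inputs `string`, `string =` and `string = lit`, only the key rule matches; the combined pattern matches nothing until a complete literal such as `literal2` has been typed.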
Syntactically, YAML Macros are implemented using YAML tags. This means that a YAML Macros file is a syntactically valid YAML file, and the macro system simply transforms it into another YAML file. This makes the implementation very simple, and it explains some of the design quirks of the system.
One major improvement: let the user of yaml macros write the macros in a normal text file with normal syntax rather than in Python.
One way this could be done is by writing a Python macro that grabs text out of another file. This could be implemented in a library bundled with the package so that it could be used without having to write any more Python.
Could you give an example of a use case for this?
A medium-sized improvement would be allowing us to define macros that take more than one parameter.
Because of the way the syntax works, you can only pass a single argument. However, if you pass an array, the macro system will pass the elements of the array as separate arguments rather than as a single array argument. So in effect, you can use this to pass multiple arguments. If you have an example of what you’re looking to do, we can see whether the current system may do the trick.
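As a rough model of that behaviour (not the plugin's actual code), the dispatch could look like the following Python sketch. Both `call_macro` and the `word_list` macro are hypothetical names introduced only for this example.

```python
def call_macro(macro, node):
    """Call `macro` with a YAML node's value, spreading arrays into arguments."""
    if isinstance(node, list):
        # An array is passed as separate positional arguments.
        return macro(*node)
    # Anything else is passed as a single argument.
    return macro(node)

def word_list(scope, *words):
    """Hypothetical macro: build one match rule from several literals."""
    return {"match": r"\b(?:" + "|".join(words) + r")\b", "scope": scope}
```

Under this model, a hypothetical call written as `!word_list [keyword.control, if, else]` would reach the macro as `word_list("keyword.control", "if", "else")`, so a single array argument serves as a multi-parameter call.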
And a micro-improvement: allow us to customise the first character of a macro call, rather than it be hardcoded to ‘!’.
The ! is part of the YAML tag syntax. As-is, you can use the YAML syntax to choose another prefix, but that prefix must begin with !. This makes it possible to import multiple macro libraries from a file. This is something that’s less likely to change, because it’s mandated by the YAML spec itself.
Is your plugin compatible with ApplySyntax?
This is a fictitious example, but should cover all my needs:
In User:
Example.sublime-syntax
%YAML 1.2
%TAG ! tag:yaml-macros:YAMLMacros.lib.extend:
---
name: Example
scope: source.example
variables: !extend
  _base: myVariables.yaml
contexts:
  main:
    - match: '\b(blabla)\s*=\s*({{VARIABLE_1}})'
      captures:
        1: scope1
        2: scope2
    - match: '\b(blablabla)\s*=\s*({{VARIABLE_2}})'
      captures:
        1: scope1
        2: scope2
myVariables.yaml
VARIABLE_1: |-
  (?x:
    literal_string_1
  | literal_string_2
  )
VARIABLE_2: |-
  (?x:
    literal_string_3
  | literal_string_4
  )
Maybe the syntax is wrong?
Variables_macros.py (empty)
If it doesn’t take too long, could you help me please?
This should be totally orthogonal to ApplySyntax; they should work together just fine.
In your example, you should have a file named Example.sublime-syntax.yaml-macros. When you build this file, it should create a file in the same directory named Example.sublime-syntax. That file will not contain the %TAG directive or any macros; rather, it will be the result of applying all of the macros in your original Example.sublime-syntax.yaml-macros file. It is this compiled output file that Sublime will use for syntax highlighting.
Are you creating an Example.sublime-syntax.yaml-macros file and using the YAML Macros build system to compile it?
Your suggestion is very interesting, because I use a lot of regexes and get a huge performance loss.
Could you add invalid.illegal handling in your example please? Just to be sure.
I thought that it would be easy with your plugin, but I need to know Python, and I don’t. I give up. Thanks for your help.
As long as you’re only using macros from the built-in libraries (including extend), you shouldn’t need to write any Python code for what you’re trying to do.
@ThomSmith
It works now. (Build doesn’t work, but Build With… does; I’m not sure exactly what the difference between them is.)
The result is not what I want though. I have almost 70 syntax files sharing many long lists of variables. I don’t want to add them to all files, I would like that ST uses them from a single file. I’m afraid ST doesn’t support this feature and even a plugin can’t do it.
As I can add a link to external contexts using a basic syntax I can reduce a lot of redundancy. Not the best way, but no choice I guess.
I don’t want to add them to all files, I would like that ST uses them from a single file.
I’m not sure I understand the distinction.
Sublime doesn’t directly “use” sublime-syntax files; it compiles them into a binary representation. And I’m nearly certain that it resolves syntax imports (like scope:source.js) by simply copying all of the relevant rules from the other syntax. So at the level of the binary representation that Sublime actually uses, there is no sharing of code between syntaxes.
YAML Macros adds another layer: the sublime-syntax.yaml-macros representation. This makes the sublime-syntax an intermediate representation. I’ve put a lot of work into making that intermediate representation as similar as possible to the authored code in the yaml-macros file, but fundamentally the compiled sublime-syntax file isn’t meant to be the canonical human-editable representation of the syntax. So I don’t mind if the compiled sublime-syntax is messy or contains a lot of duplicate code as long as it doesn’t affect performance.
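The idea of a single authored source that is duplicated into each compiled output can be sketched in Python with plain dictionaries. This is a conceptual illustration only, not how YAML Macros or its extend macro is implemented; the function name `with_shared_variables` and the precedence rule (per-syntax values win) are assumptions.

```python
import copy

def with_shared_variables(syntax, shared):
    """Return a copy of `syntax` whose variables include everything in `shared`."""
    result = copy.deepcopy(syntax)  # leave the authored syntax untouched
    merged = dict(shared)
    # Assumed precedence: a variable defined in the syntax itself wins.
    merged.update(result.get("variables", {}))
    result["variables"] = merged
    return result
```

Applied to each of many syntaxes with the same `shared` dictionary, every output carries its own copy of the variables, while the list itself is only ever edited in one place.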
I’d like to help you accomplish what you want to do, including adding new capabilities to YAML Macros if it makes sense to do so. It sounds like YAML Macros isn’t meeting your needs perfectly, but I can’t tell exactly how. Maybe if you were to describe your application in more detail (70 syntax files is certainly a lot!), I could better understand what you need.