Sublime Forum

Importing variables from one syntax to another

#1

It would be awesome for syntax modularity and maintainability if we could have an import_variables: key that added another syntax's variables to those of the syntax importing them (in case of a name conflict, the importing syntax should override the imported one).

This would allow e.g. syntaxes for several lexically related grammars to share the same pool of tokens much more painlessly, especially regarding maintainability.
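
To make the proposed override rule concrete, here is a minimal Python sketch of the merge. import_variables is only a suggestion at this point, so the function name and the example variable names are hypothetical; nothing here is an existing Sublime Text API.

```python
def import_variables(imported, own):
    """Merge two variable tables; the importing syntax wins on conflicts."""
    merged = dict(imported)   # start from the shared pool
    merged.update(own)        # local definitions override imported ones
    return merged


# Hypothetical variable tables for a shared library and one syntax using it.
shared = {"ident": r"[A-Za-z_]\w*", "keyword": r"if|else"}
local = {"keyword": r"if|else|unless"}
variables = import_variables(shared, local)
```

The shared table is left untouched, so several syntaxes could import the same pool and each override only what it needs.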

3 Likes

#2

You can implement this using YAML Macros:

%YAML 1.2
%TAG ! tag:yaml-macros:YAMLMacros.lib.extend:
---
name: Example
scope: source.example

variables: !extend
  _base: myVariables.yaml

contexts:
  ...
1 Like

#3

Interesting. Will look into it. Thanks (still, built-in functionality for it would be neat).

0 Likes

#4

In my experience with syntaxes, copying/maintaining a few variables tends to be one of the things I spend the least amount of time on.

I think that some more controls for context stack manipulation would be a better use of time, since it would allow us to handle things like complex heredocs properly.

5 Likes

#5

Yes, by all means. More expressivity always trumps minor maintainability/QoL improvements. (Though in my current case, some variables are an OR of like hundreds of literals which need to be updated in like twenty files… Probably the YAML Macros plugin will take care of that for me now.)

0 Likes

#6

In such a case, I’d try to put the list of literals into a single match in a common file and include that context in all needed syntaxes.

The include:

%YAML 1.2
---
name: Common
hidden: true
scope: library.mysyntax

contexts:
  
  variables:
    - match: |-
        (?x)
        (?:
          literal1 | literal2 | literal3
        )
      scope: variable.language.mysyntax

The file using the include:

%YAML 1.2
---
name: mysyntax
scope: source.mysyntax

contexts:
  main:
    - include: MySyntax Include.sublime-syntax#variables
1 Like

#7

I’m still keeping my fingers crossed for an inheritance model of sorts.

1 Like

#8

While we’re waiting, is there anything YAML Macros could do to make life easier? I’m open to improving it if possible.

1 Like

#9

I think that some more controls for context stack manipulation would be a better use of time, since it would allow us to handle things like complex heredocs properly.

Do you have something particular in mind? As it stands, the old backreference behavior is the only way to handle features like tag matching and heredocs. Extending the syntax engine to handle these more directly sounds like a tough challenge, but also one that could greatly benefit complex languages.

0 Likes

#10

Yes, heredocs can be “nested”, but are FIFO in order, unlike pretty much all other constructs in most languages. They also have this funky ability to allow code to continue on until the end of the line. Here is an example (in Ruby): https://github.com/sublimehq/Packages/issues/710. If I recall correctly, Bash allows similar constructs with multiple heredocs per line.

In order to handle this, we could use some sort of ability to have a temporary stack that could be pushed or unshifted. Then it would need to be able to be “applied” to the main stack. I haven’t spent any time yet on how feasible the implementation is. It is a fairly specialized requirement, from what I can tell, so it hasn’t been very high priority.
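
A rough Python sketch of the FIFO idea, assuming a simplified Ruby-like heredoc syntax (<<NAME, <<-NAME or <<~NAME, terminated by a line containing only NAME, ignoring surrounding whitespace). The pending queue plays the role of the temporary stack: openers are collected while the rest of the line is treated as ordinary code, then applied in first-in, first-out order. The function name is made up, and this only illustrates the ordering problem; it is not how the Sublime engine works.

```python
import re
from collections import deque

HEREDOC_START = re.compile(r"<<[-~]?(\w+)")  # simplified opener pattern


def heredoc_bodies(lines):
    """Return (terminator, body_lines) pairs in the order the bodies appear."""
    pending = deque()   # terminators opened on a code line, awaiting a body
    results = []
    current = None      # (terminator, body) currently being collected
    for line in lines:
        if current is not None:
            terminator, body = current
            if line.strip() == terminator:
                results.append((terminator, body))
                # The next body belongs to the OLDEST pending opener (FIFO).
                current = (pending.popleft(), []) if pending else None
            else:
                body.append(line)
            continue
        # Ordinary code line: collect every opener found on it.
        pending.extend(HEREDOC_START.findall(line))
        if pending:
            current = (pending.popleft(), [])
    return results


source = ["foo(<<A, <<B)", "a1", "A", "b1", "b2", "B", "bar"]
bodies = heredoc_bodies(source)
```

With an ordinary LIFO context stack, B would have to close before A; here A's body comes first even though B was opened later on the same line.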

1 Like

#11

(I apologize in advance to the extent that I’m telling you things you already know.)

One way to do this would be to let each context take a parameter. You would have a stack of [context, parameter] pairs; or, equivalently, a stack of contexts and a stack of parameters. Parameters could be (and usually would be) empty. Then, the matching rules of the context could depend upon the parameter. (This might be exactly what you’re suggesting.)

This is basically how the current system works, where captures are a sort of hidden parameter that the child context can access. But it’s not properly implemented, probably because this is legacy tmLanguage behavior that I suspect was actually a bug to begin with.

This approach is incompatible with the Sublime strategy of (I assume) precompiling all of a context’s regexps into a DFA. Changing the parameter would require recompiling the context. I don’t know how long it takes to compile a single context; maybe it’s not that bad, and maybe the vast majority of the work can be cached anyway. This depends on the internals of the custom regexp system. Dynamic recompilation might be fast enough.

The reason this isn’t a problem today is that backreferences automatically trigger the legacy regexp engine. A new, proper parameter implementation could do the same. The old engine is relatively slow, but there’s no compilation time. It may be faster than dynamic recompilation (not to mention easier on memory).

Any such implementation could be tremendously useful for quite a few languages. Heredocs are a great example, but you could also use it for tag matching in HTML/XML/JSX and possibly even to handle whitespace in Python. Wouldn’t that be something!

Here’s an example I’ve come up with. It features with_parameters, which tells the engine to set a parameter on the pushed context. For XML tag matching:

contexts:
  main:
    - match: <(\w+)>
      scope: open-tag.xml
      push:
        - with_parameters:
            tag_name: \1
        - meta_scope: meta.tag.xml
        - include: tag-body

  tag-body:
    - match: </({{$tag_name}})>
      scope: close-tag.xml
      pop: true
    - match: </(\w+)>
      scope: invalid.illegal.unmatched
    - include: main

For Python indentation:

contexts:
  main:
    - match: ''
      push:
        - with_parameters:
            indent: ''
        - include: block

  block:
    - match: ^\s*$ # do nothing, empty line
    
    - match: ^({{$indent}}(?:\s+)) # increase indent
      push:
        - with_parameters:
            indent: \1
        - meta_scope: meta.block.python
        - include: block

    - match: ^{{$indent}}(?=\S) # do nothing, same indent

    - match: ^ # decrease indent
      pop: true

The best implementation would probably be a separate parameter stack. Every time a context is pushed onto the context stack, a parameter record (or null) would be pushed onto the parameter stack. If a context refers to a parameter using {{$param}}, that context would either use the old engine or be dynamically compiled. It would look up the value starting from the top of the stack. (If no value was found, perhaps an unmatchable value should be substituted.) This way, the parameter records are nice and immutable. In the usual case where a context neither pushed nor used any parameters, everything would work as it does now, and the overhead would be pushing a null value onto a stack. In cases where the new behavior was used, the parameter record overhead should hopefully be small and the context matching performance should be at least as fast as the current backreference system (or faster if dynamic recompilation worked).
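
As a rough illustration of the lookup described above, here is a small Python sketch. Everything in it is hypothetical: with_parameters and {{$name}} are the proposed syntax from this post, and ParameterStack and expand are names invented for the example. It models only the stack and the substitution step, not the engine or its regexp compilation.

```python
import re

UNMATCHABLE = r"(?!)"  # a regex that can never match
PARAM_REF = re.compile(r"\{\{\$(\w+)\}\}")  # the hypothetical {{$name}} syntax


class ParameterStack:
    """One record (a dict, or None) per pushed context."""

    def __init__(self):
        self._records = []

    def push(self, record=None):
        self._records.append(record)

    def pop(self):
        return self._records.pop()

    def lookup(self, name):
        # Search from the top of the stack down; records are never mutated.
        for record in reversed(self._records):
            if record is not None and name in record:
                return record[name]
        return None


def expand(pattern, stack):
    """Substitute {{$name}} references with the nearest parameter value."""
    def substitute(match):
        value = stack.lookup(match.group(1))
        return UNMATCHABLE if value is None else re.escape(value)
    return PARAM_REF.sub(substitute, pattern)


stack = ParameterStack()
stack.push({"tag_name": "div"})   # pushed by a match on <div>
stack.push(None)                  # an ordinary context with no parameters
close_tag = expand(r"</({{$tag_name}})>", stack)
```

The expanded pattern is what would then be handed to the old engine or compiled on the fly. A nested push with its own tag_name shadows the outer one until it is popped, which is the behavior tag matching needs.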

I haven’t done the math, but offhand, I think this would augment Sublime’s parser to handle a variety of context-sensitive languages. It would be orthogonal to other means of improving the parser, such as adding nondeterminism.

0 Likes

#12

How do you use them with regular expressions? Variables avoid repetitions and you can handle a long list of strings.

Example:

- match: '\b(string)\s*=\s*({{VARIABLE_XY}})'
  captures:
    1: scope.txt
    2: scope2.txt

I believe that you can’t replace this variable with a context.

0 Likes

#13

Thank you for your plugin. Could you be more specific? I’m not proficient in Python, and when I try to do what you describe using your plugin, ST says “variables are missing”.
What do I need to do?

0 Likes

#14

If you post your code, I can take a look at it.

0 Likes

#15

Now that you mention possible enhancements of YAML Macros…

One major improvement: let the user of yaml macros write the macros in a normal text file with normal syntax rather than in Python. That would help it be used by people who are not fluent in Python, and increase the overall readability of the macros’ content. (Of course I understand this would be a huge change in the behaviour of the plugin as well.)

A medium-sized improvement would be allowing us to define macros that take more than one parameter.

And a micro-improvement: allow us to customise the first character of a macro call, rather than it be hardcoded to ‘!’.

0 Likes

#16

The idea behind my suggestion is to create common re-usable rules.

Here is an extended version of my prior example:

%YAML 1.2
---
name: Common
hidden: true
scope: library.mysyntax

contexts:
  
  variables:
    - match: |-
        (?x)
        (?:
          literal1 | literal2 | literal3
        )
      scope: variable.language.mysyntax

The main file:

%YAML 1.2
---
name: mysyntax
scope: source.mysyntax

contexts:
  main:
    # the mapping key or the L-value in an assignment
    - match: \bstring\b
      scope: meta.mapping.key variable.other.mysyntax
      push: mapping-maybe-operator

  mapping-maybe-operator:
    - meta_content_scope: meta.mapping
    - match: =
      scope: punctuation.separator.mapping.key-value keyword.operator.assignment
      set: mapping-value
    # no assignment, pop off
    - match: (?=\S)
      pop: true

  mapping-value:
    # the value contains one of the imported variables
    - meta_content_scope: meta.mapping.value
    - include: MySyntax Include.sublime-syntax#variables
    # pop off if something other than the variable was found
    - match: (?=\S)
      pop: true

This looks a bit more complicated at first glance, but allows proper meta-scope and invalid.illegal handling. The latter is not included here.

In general I’d suggest avoiding capturing patterns like…

- match: '\b(string)\s*=\s*({{VARIABLE_XY}})'
  captures:
    1: scope.txt
    2: scope2.txt

They can slow down the whole lexer by 10 to 15% and cause the expression to be highlighted only once it is complete. While you are typing, string won’t be highlighted until the = and one of the VARIABLE_XY literals have been matched. This makes for a poor writing experience.
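
The highlighting issue can be reproduced outside Sublime with plain regular expressions. This is only a Python sketch of the matching behavior, with literal1 to literal3 standing in for the real list; it says nothing about the performance figures.

```python
import re

# One big pattern: key, operator and value must all be present to match.
COMBINED = re.compile(r"\b(string)\s*=\s*(literal1|literal2|literal3)")

# Context-style: each piece is matched (and scoped) on its own.
KEY = re.compile(r"\bstring\b")

partial_input = "string = lit"   # the user is still typing the value

combined_hit = COMBINED.search(partial_input)   # nothing to highlight yet
key_hit = KEY.search(partial_input)             # the key can already be scoped
```

With the context-based rules, the key gets its scope as soon as it is typed, and the operator and value are picked up by the following contexts.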

1 Like

#17

Syntactically, YAML Macros are implemented using YAML tags. This means that a YAML Macros file is a syntactically valid YAML file, and the macro system simply transforms it into another YAML file. This makes the implementation very simple, and it explains some of the design quirks of the system.

One major improvement: let the user of yaml macros write the macros in a normal text file with normal syntax rather than in Python.

One way this could be done is by writing a Python macro that grabs text out of another file. This could be implemented in a library bundled with the package so that it could be used without having to write any more Python.

Could you give an example of a use case for this?

A medium-sized improvement would be allowing us to define macros that take more than one parameter.

Because of the way the syntax works, you can only pass a single argument. However, if you pass an array, the macro system will pass the elements of the array as separate arguments rather than as a single array argument. So in effect, you can use this to pass multiple arguments. If you have an example of what you’re looking to do, we can see whether the current system may do the trick.

And a micro-improvement: allow us to customise the first character of a macro call, rather than it be hardcoded to ‘!’.

The ! is part of the YAML tag syntax. As-is, you can use the YAML syntax to choose another prefix, but that prefix must begin with !. This makes it possible to import multiple macro libraries from a file. This is something that’s less likely to change, because it’s mandated by the YAML spec itself.
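
The array convention can be pictured with a few lines of Python. This is not the YAML Macros source; call_macro and the toy repeat macro are made up for the illustration, and the real dispatch may differ in detail.

```python
def call_macro(macro, node):
    """Call `macro` with the value found after its tag (hypothetical helper)."""
    if isinstance(node, list):
        return macro(*node)   # a YAML sequence becomes positional arguments
    return macro(node)        # anything else is passed as a single argument


def repeat(text, times=1):
    """Toy two-parameter macro, made up for this example."""
    return text * times


single = call_macro(repeat, "ab")          # like: !repeat ab
double = call_macro(repeat, ["ab", 3])     # like: !repeat [ab, 3]
```

So a macro written with two Python parameters can be fed from a single YAML node, as long as that node is a sequence.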

0 Likes

#18

Is your plugin compatible with ApplySyntax?

This is a fictitious example, but should cover all my needs:

In User:
Example.sublime-syntax

%YAML 1.2
%TAG ! tag:yaml-macros:YAMLMacros.lib.extend:
---
name: Example
scope: source.example

variables: !extend
  _base: myVariables.yaml

contexts:

  main:
    - match: '\b(blabla)\s*=\s*({{VARIABLE_1}})'
      captures:
        1: scope1
        2: scope2
    - match: '\b(blablabla)\s*=\s*({{VARIABLE_2}})'
      captures:
        1: scope1
        2: scope2

myVariables.yaml

  VARIABLE_1: |-
    (?x:
        literal_string_1
      | literal_string_2
    )

  VARIABLE_2: |-
    (?x:
        literal_string_3
      | literal_string_4
    )

Maybe the syntax is wrong?

Variables_macros.py (empty)

If it doesn’t take too long, could you help me please?

0 Likes

#19

This should be totally orthogonal to ApplySyntax; they should work together just fine.

In your example, you should have a file named Example.sublime-syntax.yaml-macros. When you build this file, it should create a file in the same directory named Example.sublime-syntax. That file will not contain the %TAG directive or any macros; rather, it will be the result of applying all of the macros in your original Example.sublime-syntax.yaml-macros file. It is this compiled output file that Sublime will use for syntax highlighting.
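
The naming convention amounts to dropping the final suffix. A tiny Python sketch, assuming only the convention described above (compiled_path is a name invented for this example, not part of the plugin, and the User/ directory is just a sample location):

```python
from pathlib import Path

MACRO_SUFFIX = ".yaml-macros"


def compiled_path(source):
    """Return where the compiled output is expected for a macro source file."""
    path = Path(source)
    if path.suffix != MACRO_SUFFIX:
        raise ValueError(f"not a YAML Macros source file: {path.name}")
    return path.with_suffix("")   # drop only the final .yaml-macros suffix


target = compiled_path("User/Example.sublime-syntax.yaml-macros")
```

If the source file is saved directly as Example.sublime-syntax, there is nothing for the build system to compile, and Sublime ends up reading the macro tags as-is.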

Are you creating an Example.sublime-syntax.yaml-macros file and using the YAML Macros build system to compile it?

0 Likes

#20

Your suggestion is very interesting, because I use a lot of regexes and get a huge performance loss.
Could you add invalid.illegal handling in your example please? Just to be sure.

0 Likes