Sublime Forum

Syntax definition for Regular Expressions inside JSON strings (think keybindings)

#1

I am working on a new syntax definition for JSON, mainly to make it possible to scope and color strings differently based on whether they are dictionary keys, dictionary values or array values, for example (see https://github.com/sublimehq/Packages/issues/421).

While doing so, I want to get highlighting of Regular Expressions in strings, which will be useful when working with ST’s keybindings. The idea I had is, for a string that should be interpreted as a regex, only scope the first backslash in \\ as constant.character.escape.json, and allow the next characters to be matched by the regex syntax def.
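To make the two escaping layers concrete, here is a small Python sketch. Python is only used for illustration and is not part of the syntax definition, and the sample string is made up for this example. It decodes a JSON string literal and hands the result to a regex engine, showing why `\\d` and `\\t` in the JSON source are really regex escapes, while a bare `\t` is consumed by JSON itself.

```python
import json
import re

# A JSON string literal as it would appear in a keymap file (made-up sample).
raw = r'"\\d+\t\\t"'

# Layer 1: the JSON parser consumes the JSON escapes.
#   \\d -> \d   (backslash + d, still two characters)
#   \t  -> an actual tab character
#   \\t -> \t   (backslash + t, still two characters)
pattern = json.loads(raw)
print(repr(pattern))  # '\\d+\t\\t'

# Layer 2: the regex engine interprets what is left.
#   \d+ matches digits, the literal tab matches a tab,
#   and \t is a regex escape that also matches a tab.
regex = re.compile(pattern)
print(regex.fullmatch("42\t\t") is not None)  # True
```

So the first backslash of each `\\` pair belongs to JSON, and the second backslash together with the character after it is what the regex syntax should get to see.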

However, so far, I haven’t figured out how to do it reliably while keeping the JSON character escapes in place. I don’t want to have to manually copy and paste / maintain the regex syntax in my JSON definition, so I’m looking for some other solution to work around the fact that there is no way to tell ST to pop a particular context after a single match (without copying the context and modifying the relevant matches to pop). Maybe someone can think of an approach I’m missing… :)

For example, take the following JSON string:

"\\d, \" . (\\d, \" \t \\t) ."

I’d want the following scopes:

 ^ constant.character.escape.before-regex-escape.json
  ^^ keyword.control.character-class.regexp
      ^^ constant.character.escape.json
            ^ constant.character.escape.before-regex-escape.json
             ^^ keyword.control.character-class.regexp
                 ^^ constant.character.escape.json
                    ^^ constant.character.escape.json
                       ^ constant.character.escape.before-regex-escape.json
                        ^^ constant.character.escape.regexp

However, the \\t currently fails: the \t part is matched as though it were a JSON escape. I know why, I just don’t know whether there is a way to fix it while staying within my constraints.
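To pin down the target behavior independently of the syntax engine, here is a rough Python sketch of the classification I am after. It is not how Sublime Text matches anything; `classify_escapes` and the `"regexp"` label are hypothetical names for this illustration, and `"regexp"` simply stands for "whatever scope the regexp syntax would assign". Offsets count from the opening quote, like the carets above.

```python
import re

# JSON escapes: backslash plus one of "\/bfnrt, or \u and four hex digits.
JSON_ESCAPE = re.compile(r'\\(?:["\\/bfnrt]|u[0-9A-Fa-f]{4})')

BEFORE_REGEX = "constant.character.escape.before-regex-escape.json"
JSON_SCOPE = "constant.character.escape.json"
REGEXP = "regexp"  # placeholder: the real scope comes from the regexp syntax


def classify_escapes(text):
    """Return (start, end, label) spans for the escapes in a JSON string literal."""
    spans = []
    i = 0
    while i < len(text):
        # "\\" followed by something other than a quote: the first backslash
        # belongs to JSON, the second one plus the next character to the regex.
        if (text.startswith("\\\\", i) and i + 2 < len(text)
                and text[i + 2] != '"'):
            spans.append((i, i + 1, BEFORE_REGEX))
            spans.append((i + 1, i + 3, REGEXP))
            i += 3
            continue
        # Anything else that looks like a JSON escape stays a JSON escape,
        # including "\\" directly before a quote.
        m = JSON_ESCAPE.match(text, i)
        if m:
            spans.append((m.start(), m.end(), JSON_SCOPE))
            i = m.end()
            continue
        i += 1
    return spans


sample = r'"\\d, \" . (\\d, \" \t \\t) ."'
for start, end, label in classify_escapes(sample):
    print(start, end, sample[start:end], label)
```

The important part is the order of the checks: the `\\` case is tried before the plain JSON escapes, so the `\t` inside `\\t` is never seen by the JSON rule. The sketch deliberately ignores harder cases such as `\\\\` (a regex-escaped backslash).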

Here is the relevant part of my syntax def so far:

%YAML 1.2
---
# See http://www.sublimetext.com/docs/3/syntax.html
scope: source.example-keymap
contexts:
  main:
    - match: '"'
      scope: punctuation.string.begin.json
      push:
        - meta_scope: string.quoted.double.json
        - match: '"'
          scope: punctuation.string.end.json
          pop: true
        
        - match: '(?=\S)'
          push:
            - include: scope:source.regexp#base-literal
          with_prototype:
            - match: '(?=")'
              pop: true
            - match: '\\\\(?=")'
              scope: constant.character.escape.json
              comment: quotes have to be escaped in json, a quote in the regex will count as the closing quote for the string, and the regex will have a lonely escape character at the end of it
              pop: true
            - match: '\\(?=\\)'
              scope: constant.character.escape.before-regex-escape.json
              comment: match the first backslash so that the next one and the following character can be matched by the regexp syntax definition. TODO push into the regex scope without matching JSON escapes and pop after the first match somehow?
            - match: |-
                (?x:                # turn on extended mode
                  \\                # a literal backslash
                  (?:               # ...followed by...
                    ["\\/bfnrt]     # one of these characters
                    |               # ...or...
                    u               # a u
                    \h{4}           # and four hex digits
                  )
                )
              scope: constant.character.escape.json
              comment: note that variables don't work in with_prototype sections, see https://github.com/SublimeTextIssues/Core/issues/1488

Don’t get me wrong, it works pretty well at the moment, but it’s just… wrong… If my color scheme showed JSON escapes differently from regex escapes, it would look terrible ;)


Not shown in my example here, but I’d also love to be able to pop multiple contexts at once, which would enable me to scope the } that closes a dictionary as invalid when it comes after a dictionary key or separator without a value, for example. Clearing scopes doesn’t help in this case. And using set instead of push ruins my meta scoping…


#2

I totally agree. It would be nice to have a way to extend a definition without having to copy-paste it.
I personally added Git-specific contexts to my syntaxes to keep git merge from breaking the syntax.
It’s the kind of thing that probably shouldn’t be part of the Default package syntaxes, but it would be nice to be able to inject it into all syntaxes (it would also avoid duplicated code).

A hack I have used in some places to inject my own regex into an existing syntax is to copy the structure, but replace the content of each context with an include pointing to the original syntax, like this:

contexts:
  main:
    - match: <your-regex-here>
    - include: "My syntax.sublime-syntax#patch-json-main"
    - include: "Packages/Default/Json/Json.sublime-syntax#main" 

The problem with this approach is that you’re limited to reusing contexts that have an explicit name, and you have to do it per language.

I have thought about creating a meta-language around “.sublime-syntax” that could support this kind of extension and would generate regular “.sublime-syntax” files on “compilation”.
