Sublime Forum

Issues embedding RegEx highlighter inside another syntax definition

#1

Hello,

I’m writing a syntax definition for a language which uses PCRE regular expressions for substring extraction and boolean testing. The regexes are always wrapped in hashes i.e. #<regex_here>#. Example:

foo = "Test String";
if (foo ~ #[tT]est#)
    success();
if (foo ~ #\##)
    // hashes inside the regex are escaped

My current solution works OK:

- match: '(?<!^)(#)(.*)(?<!\\)(#)'      # first lookbehind avoids shebang (#!/bin/lang) 
  captures:
    1: string.regexp.begin.myLanguage
    2: string.regexp.myLanguage
    3: string.regexp.end.myLanguage

But I would love to use Sublime Text’s own PCRE syntax highlighter in these regexes. My attempt to add RegExp.sublime-syntax, using the same match regexes, is broken:

- match: '(?<!^)#'
  push: "Packages/Regular Expressions/RegExp.sublime-syntax"
  with_prototype:
    - match: '(?<!\\)#'
      clear_scopes: true
      pop: true

It successfully matches the first #, and starts highlighting using RegExp.sublime-syntax. But it doesn’t revert to the original language when it hits the final #.
For some reason, adding two hashes in a row makes it stop:

test ~~ #.*/([^/]*)#;         <fails: highlights next line as if it was a regex
test ~~ #.*/([^/]*)#;  ##     <reverts after the two hashes             

What am I doing wrong?
Your advice appreciated!

Cheers,
nwh

#2

Generally, a with_prototype pop pattern should be a positive lookahead so that it correctly exits multiple levels of nesting. The prototype is injected into every context the embedded syntax pushes, and a pattern that consumes the # pops only the one context that is current at that point. A lookahead consumes nothing, so it matches again in each parent context until the whole embedded syntax has been exited.

Also, try to avoid lookbehinds: your (?<!^)# can be replaced with a lookahead, (?!^)#, and your pop pattern shouldn’t need to worry about an escaped hash, because the embedded syntax will already have consumed it.

Of course, that means a bit more work to scope the closing #: the lookahead leaves it unconsumed, and popping returns you to the context that pushed into the regex one, where the same token would be recognized as opening a new regex…
You could check the shipped packages to see how they do it: https://github.com/sublimehq/Packages/search?q=with_prototype&type=Code&utf8=✓

#3

%YAML 1.2
---
# See http://www.sublimetext.com/docs/3/syntax.html
scope: source.nwh
contexts:
  main:
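    # Opening hash; (?!^) skips a hash at the start of a line (shebang).
    - match: '(?!^)#'
      # RegExp is pushed on top; pop_at_hash waits beneath it for the closing hash.
      push: [pop_at_hash, Packages/Regular Expressions/RegExp.sublime-syntax]
      with_prototype:
      # The lookahead consumes nothing, so it can pop every nested RegExp context.
      - match: '(?=#)'
        pop: true
  pop_at_hash:
    # Consume the closing hash and return to main.
    - match: '#'
      pop: true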
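
The definition above leaves both hashes unscoped. Here is a minimal sketch of one way to scope them, assuming the same two-context layout. The scope names ending in .nwh are illustrative assumptions, not something taken from the posts above, and this variant has not been checked against every construct in RegExp.sublime-syntax:

```yaml
%YAML 1.2
---
scope: source.nwh
contexts:
  main:
    - match: '(?!^)#'
      # Assumed scope name for the opening delimiter.
      scope: punctuation.definition.string.begin.nwh
      push: [pop_at_hash, Packages/Regular Expressions/RegExp.sublime-syntax]
      with_prototype:
        - match: '(?=#)'
          pop: true
  pop_at_hash:
    # Assumed scope name; meta_scope should cover both delimiters and the regex between them.
    - meta_scope: string.regexp.nwh
    - match: '#'
      # Assumed scope name for the closing delimiter.
      scope: punctuation.definition.string.end.nwh
      pop: true
```

Because pop_at_hash is the context that consumes the closing hash, that token gets scoped as the end delimiter before main has a chance to treat it as a new opening hash.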

#4

Thanks very much, kingkeith… the YAML sample you posted works really well, and has no issue correctly processing escaped hashes (i.e. \#). I think this is because RegExp.sublime-syntax itself now consumes the escaped hash (scoping it as constant.character.escape.regexp) before the pop pattern ever sees it, but I will fiddle with it and see if I can understand it better.