Sublime Forum

ST3 RegEx Engine Supports Look-Behind?

#1

As of ST 3120, are look-behind RegExs supported at all?

I know that the use of look-behind RegExs in syntax definitions has already been asked about many times on the forum! But I’ve seen varying answers (ranging from “it’s a bug that’s going to be fixed” to “they are partly supported”), and there’s confusion about the various versions of ST, which seem to have adopted different RegEx engines over time, including a switch of engine within ST3. So I need to ask this question and dispel all confusion.

The official Syntaxes Documentation states:

The match key is a regex, supporting features from the Oniguruma regex engine.

So it’s natural to expect that look-behinds are supported.

But, in practice, when I run “Syntax Tests - Regex compatibility” on the syntax I’m working on, I get a lot of:

Packages/.../MySyntax.sublime-syntax:874:9: Negative look behind is not supported
Packages/.../MySyntax.sublime-syntax:948:9: Look behind is not supported
Packages/.../MySyntax.sublime-syntax:966:9: Look behind is not supported
FAILED: 12 patterns in "Packages/.../MySyntax.sublime-syntax" are incompatible with the new regex engine
[Finished]

Which confuses me.
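
To make it concrete which constructs I mean, here is a minimal Python sketch that scans pattern strings for the two look-behind forms, `(?<=...)` and `(?<!...)`. The function name `find_look_behinds` and the sample patterns are hypothetical, made up for illustration; this is not how ST’s compatibility check is implemented, only a rough way to locate the same kind of construct.

```python
def find_look_behinds(pattern):
    """Return (offset, kind) pairs for look-behind groups in a regex string.

    Simplification: character classes are not parsed, so "[(?<=]" would be
    reported as a false positive.
    """
    found = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\":
            i += 2  # skip the escaped character, e.g. "\(" is a literal paren
            continue
        if pattern.startswith("(?<=", i):
            found.append((i, "look behind"))
        elif pattern.startswith("(?<!", i):
            found.append((i, "negative look behind"))
        i += 1
    return found


# Hypothetical patterns, not taken from any real syntax definition.
patterns = [r"(?<=\w)\*", r"(?<!\\)`", r"(?<name>\w+)", r"\(?<=", r"(?=\s)"]
for p in patterns:
    print(p, find_look_behinds(p))
```

Named groups such as `(?<name>...)` and look-aheads such as `(?=...)` are deliberately not reported, since only look-behinds show up in the output above.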

Since look-behind RegExs are something that anyone working on a syntax will undoubtedly try to use, I think that it’s worth mentioning in the Syntaxes Documentation whether they are supported or not — instead of just linking to the Oniguruma documentation, which describes them as being supported!

Until the official documentation explicitly describes the current state of the ST3 RegEx engine, including which features it supports and which it doesn’t, there’ll always be confusion on the matter. That can be frustrating for those willing to create new syntaxes, and it will ultimately lead to the same questions being asked again on the forum.

0 Likes

#2

IIRC, if you use look-behinds or backreferences, ST will fall back to the Oniguruma engine, which has worse performance, so people try not to use them in the default built-in syntaxes.

But does the “worse performance” really matter to you? Maybe the “worse” is tolerable or not even noticeable.

0 Likes

#3

@jfcherng:

IIRC, if you use look-behinds or backreferences, ST will fall back to the Oniguruma engine, which has worse performance, so people try not to use them in the default built-in syntaxes.

I didn’t know that! This is a precious tip, one that I wish were documented!
Thanks indeed.

The report from the “Syntax Tests - Regex compatibility” execution is rather confusing then, when it says FAILED: X patterns in ... are incompatible with the new regex engine; one is led to think that they won’t match at all.

But does the “worse performance” really matter to you? Maybe the “worse” is tolerable or not even noticeable.

Difficult to say, as it might depend on the actual syntax. But so far, for the syntax I’m working on (AsciiDoc, which is a rather huge syntax), it doesn’t seem to be an issue at all.

Since the new .sublime-syntax format is quite smart, allowing contexts to be pushed and popped, I’d say that if a syntax is well designed then performance shouldn’t be an issue: wise use of optimized RegExs, and keeping separate contexts to avoid the overhead of long RegExs, should do the trick.

Is the new “default RegEx engine” of ST3 documented anywhere? It would be cool to know what the differences between the two are (feature-wise).

1 Like

#4

For the missing features, you can find some info in this post, but this is why the “Regex compatibility” test exists: to ensure that your syntax is compatible with the new engine.

As for performance, you might not notice whether it is an issue, because it only becomes apparent in two cases: with very large files, and when Sublime has to re-index your project and there are a lot of files using the syntax. The difference between Oniguruma and the “new” engine is significant most of the time, so it might be worth the time to remove those look-behinds, especially since you have only 3; it should not be that complicated (there is a basic example in the post I linked).
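
As a rough illustration of the general idea (not the example from the linked post), a look-behind can often be replaced by consuming the prefix and capturing the part you care about in a group. The sketch below uses Python’s `re` module rather than Sublime’s engine, and the sample text and patterns are made up; in a .sublime-syntax file the captured group would typically be scoped through the `captures` key.

```python
import re

# Made-up sample text, for illustration only.
text = "key: value, other: thing"

# With a look-behind: match the word that follows ": " without consuming ": ".
with_look_behind = re.compile(r"(?<=: )\w+")

# Without a look-behind: consume ": " and capture the word in group 1.
without_look_behind = re.compile(r": (\w+)")

a = [(m.start(), m.group()) for m in with_look_behind.finditer(text)]
b = [(m.start(1), m.group(1)) for m in without_look_behind.finditer(text)]

# Both approaches find the same words at the same offsets.
print(a == b)
```

The two forms are not always interchangeable: the rewritten pattern consumes the prefix, so it behaves differently if an earlier rule has already matched that prefix or if matches would need to overlap. In those cases, pushing a context after the prefix is probably the better fit.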

2 Likes

#5

Not really. What I know about it is gleaned less from the documentation and more from folk wisdom, random conversations, extensive testing, and fearless extrapolation from first principles.

One of these days I should write up what I have for general reference.

0 Likes