The implementation of Sublime’s internal regex engine seems pretty straightforward, but I’m curious how zero-width assertions (^, $, \b, lookahead) are implemented. In particular, \b would seem to require a limited form of lookbehind, since it has to inspect the character before the current position. Many syntaxes use \b freely even when a less stringent assertion would do. For instance, in the JavaScript syntax:
- match: \bimport\b
This pattern appears over and over, but in most cases it could be replaced by:
- match: import(?!\w)
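This is a looser assertion: it drops the leading boundary check entirely, so on its own it would also match the tail of a longer word such as reimport. That only matters if the preceding word characters have not already been consumed by another rule. Depending on the implementation, the looser form could be more performant, but I haven’t been able to get good benchmarks yet.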
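To make the semantic difference between the two patterns concrete, here is a minimal sketch using Python’s re module. This is an assumption-laden stand-in: Python’s engine is not Sublime’s, so it says nothing about performance, and the helper name spans is made up for illustration. It only shows where the two patterns agree and where they diverge.

```python
import re

# The two candidate patterns from the post, compiled with Python's re
# (NOT Sublime's engine; used here only to compare matching behaviour).
STRICT = re.compile(r"\bimport\b")
LOOSE = re.compile(r"import(?!\w)")


def spans(pattern, text):
    """Hypothetical helper: return the (start, end) span of every match."""
    return [m.span() for m in pattern.finditer(text)]


# Standalone keyword: both patterns match the same span.
standalone_strict = spans(STRICT, "import x")
standalone_loose = spans(LOOSE, "import x")

# Keyword followed by a word character: both patterns reject it.
suffix_strict = spans(STRICT, "imports")
suffix_loose = spans(LOOSE, "imports")

# Keyword preceded by a word character: only the loose pattern matches,
# because nothing in it checks the character before "import".
prefix_strict = spans(STRICT, "reimport x")
prefix_loose = spans(LOOSE, "reimport x")

print("standalone:", standalone_strict, standalone_loose)
print("suffix:    ", suffix_strict, suffix_loose)
print("prefix:    ", prefix_strict, prefix_loose)
```

The patterns only disagree in the last case. In a syntax definition, an identifier rule would typically have consumed reimport as a whole before the keyword rule is ever tried in the middle of it, which is why the replacement is safe "in most cases" rather than always.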
Should we expect there to be a cost to unnecessary use of \b, compared to (?!\w) or no assertion at all? If so, this may be an opportunity to improve (however slightly) the performance of built-in syntaxes.