I’m trying to tweak a .sublime-syntax file to make it match functions like
doThing()
with
(?:\s*([a-zA-Z0-9_]*)\s*(?=\())
I’m running into issues with the fact that control keywords like if, for, while etc are being matched to the pattern above, even though they have their own pattern:
\b(if|while|do|switch|for|foreach|return|throw|yield|continue|try|catch|resume|default|else|case|break)\b
Is there any way for me to prioritize which patterns get matched in which order, or any other (preferably not too complicated) approach to solving this problem?
.sublime-syntax control keywords matching as functions first
Patterns are tried in the order they appear in the syntax definition, so whichever match comes first wins. Putting that match before the one for if statements will make it take precedence.
If those keywords can indeed be used as a function name, that sounds like a bug for the syntax definition though. Filing an issue on https://github.com/sublimehq/Packages/issues/new/choose would be preferable.
So I’ve done some testing with other components of the file and you’re right for everything else, I can move around the ordering and it changes the classification of other components, but for some reason it doesn’t matter what order I set up the keywords vs function matches. The keywords are always identified as functions so long as that match exists (if I delete it then they aren’t) regardless of what order the two are in.
I am… so confused right now.
- match: '(?:\s*(len|tofloat|tointeger|tochar|slice|find|tolower|toupper|rawget|rawset|rawin|rawdelete|clear|append|push|extend|pop|top|insert|remove|resize|sort|reverse|map|apply|find|capture|match|search)\s*(?=\())'
  scope: support.function.squirrel
- match: \b(if|while|do|switch|for|foreach|return|throw|yield|continue|try|catch|resume|default|else|case|break)\b
  scope: keyword.control.squirrel
- match: '(?:\s*([a-zA-Z0-9_]*)\s*(?=\())'
  scope: variable.function.squirrel
- match: '(?:(?<=\.)\b([a-zA-Z0-9_]*)\b)'
  scope: variable.other.member.squirrel
This is a snippet of what I currently have, with everything that should be relevant, and it’s still prioritizing the function match over the keyword match.
Edit: also a screenshot to hopefully show I’m not a raving madman
So I’ve figured out what my problem was: my function match was also matching the whitespace before the function name, so it started further to the left than the keyword match and was prioritized over it. I fixed it by making sure the match only covers the name itself, like so:
(?:\b([a-zA-Z0-9_]*)\b(?=\s*\())
I have solved my issue.
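The effect can be reproduced outside Sublime Text. Below is a minimal Python sketch, assuming the engine picks whichever rule matches at the earliest position and only falls back to file order on a tie. Python's `re` stands in for the real regex engine, and the shortened keyword list, the `winning_rule` helper and the sample line are made up for illustration.

```python
import re

# Simplified versions of the rules from this thread, as (scope, pattern).
KEYWORD = ("keyword.control.squirrel", r"\b(if|while|for|return)\b")
FUNC_OLD = ("variable.function.squirrel", r"(?:\s*([a-zA-Z0-9_]*)\s*(?=\())")
FUNC_NEW = ("variable.function.squirrel", r"(?:\b([a-zA-Z0-9_]*)\b(?=\s*\())")


def winning_rule(rules, text, pos=0):
    """Return (scope, start, end) of the rule whose match starts earliest.

    On a tie the rule listed first wins (assumed model of the engine).
    """
    best = None
    for scope, pattern in rules:
        m = re.compile(pattern).search(text, pos)
        if m and (best is None or m.start() < best[1]):
            best = (scope, m.start(), m.end())
    return best


line = "  if (ready) run()"

# The old function rule swallows the leading whitespace, so it starts at 0.
print(winning_rule([KEYWORD, FUNC_OLD], line))  # ('variable.function.squirrel', 0, 5)

# The fixed rule starts at the same position as the keyword, so order decides.
print(winning_rule([KEYWORD, FUNC_NEW], line))  # ('keyword.control.squirrel', 2, 4)
```

With the fixed pattern both rules start at the same column, which is why moving the keyword rule above the function rule finally has an effect.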
You probably want to avoid lookbehinds in new syntax definitions, as those trigger the slower Oniguruma regex engine.
Does having even one of them do this? It’s not too hard to fix, just a bit more verbose, so I’m curious.
Edit: also do lookaheads do the same?
Edit 2: actually I’m not sure how I’d go about remedying this? If I change the match to this
- match: '(?:(?:\.)\b([a-zA-Z0-9_]*)\b)'
  captures:
    1: variable.other.member.squirrel
It matches further to the left than some of my other patterns and is therefore prioritized (since it’s technically matching the . before the property as well).
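The shift in start position is easy to check in isolation. A small Python sketch, assuming Python's `re` behaves like the syntax engine for these two patterns; the sample text `obj.name` is made up:

```python
import re

text = "obj.name"

# Lookbehind: the dot is only asserted, so the match starts at the name.
lookbehind = re.search(r"(?:(?<=\.)\b([a-zA-Z0-9_]*)\b)", text)

# Non-capturing group: the dot is consumed, so the match starts one column earlier.
consuming = re.search(r"(?:(?:\.)\b([a-zA-Z0-9_]*)\b)", text)

print(lookbehind.span())  # (4, 8)
print(consuming.span())   # (3, 8)
```

Any other rule that would have matched the dot at column 3 now competes with this one.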
Lookaheads are safe. You can test compatibility by running Syntax Test - Compatibility Check with the view containing the sublime-syntax definition being focused/active.
An alternative for the lookbehind …
- match: '(?:(?<=\.)\b([a-zA-Z0-9_]*)\b)'
  scope: variable.other.member.squirrel
… is …
- match: (\.)([a-zA-Z0-9_]*)\b
  captures:
    1: punctuation.accessor.dot.squirrel
    2: variable.other.member.squirrel
Identifiers normally start with a letter (a-zA-Z) rather than a digit, to avoid confusion with floating point numbers, so I’d suggest at least
- match: (\.)([a-zA-Z_][a-zA-Z0-9_]*)\b
  captures:
    1: punctuation.accessor.dot.squirrel
    2: variable.other.member.squirrel
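To see why the first character matters, here is a quick Python check of both patterns. This is only a sketch: Python's `re` stands in for the syntax engine, and the `member_spans` helper and the sample strings are made up for illustration.

```python
import re

LOOSE = re.compile(r"(\.)([a-zA-Z0-9_]*)\b")
STRICT = re.compile(r"(\.)([a-zA-Z_][a-zA-Z0-9_]*)\b")


def member_spans(pattern, text):
    """Return [(dot_span, name_span), ...] for every accessor match in text."""
    return [(m.span(1), m.span(2)) for m in pattern.finditer(text)]


# Both capture groups get their own span, so each can carry its own scope.
print(member_spans(STRICT, "obj.name"))  # [((3, 4), (4, 8))]

# The loose pattern treats the fraction of a float as a member ...
print(member_spans(LOOSE, "x = 1.5"))    # [((5, 6), (6, 7))]

# ... while the stricter one leaves the number alone.
print(member_spans(STRICT, "x = 1.5"))   # []
```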
If you want or need to support more kinds of members - maybe also member functions - it can be a good idea to push a dedicated context onto the stack:
accessors:
  - match: \.
    scope: punctuation.accessor.squirrel
    push: maybe-member
maybe-member:
  # dot is followed by an identifier, so consume as member and pop.
  - match: '[a-zA-Z_][a-zA-Z0-9_]*\b'
    scope: variable.other.member.squirrel
    pop: true
  # add more here
  # ...
  # dot is not followed by an identifier, so pop.
  - match: (?=\S)
    pop: true
  # end of line found, so pop??? Not sure if squirrel is line-based.
  - match: $
    pop: true
It also applies if members may be located on the next line.
var .
member
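If the push/pop flow is hard to picture, the following Python sketch walks a single line with an explicit stack. It is a rough model only, not how Sublime Text is implemented: the `scope_accessors` helper and the `main` context name are hypothetical, and the end-of-line rule is represented simply by the loop ending.

```python
import re

IDENT = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*\b")


def scope_accessors(line):
    """Return (text, scope) pairs, mimicking the accessors/maybe-member contexts."""
    tokens = []
    stack = ["main"]
    pos = 0
    while pos < len(line):
        if stack[-1] == "main":
            if line[pos] == ".":
                tokens.append((".", "punctuation.accessor.squirrel"))
                stack.append("maybe-member")  # push: maybe-member
            pos += 1
        else:  # inside maybe-member
            m = IDENT.match(line, pos)
            if m:
                # Identifier follows the dot: consume it as a member and pop.
                tokens.append((m.group(), "variable.other.member.squirrel"))
                pos = m.end()
                stack.pop()
            elif not line[pos].isspace():
                stack.pop()  # (?=\S): pop without consuming anything
            else:
                pos += 1  # whitespace: stay in the context and keep looking
    return tokens


print(scope_accessors("obj . name"))
print(scope_accessors("obj.(x)"))
```

The first call reports the dot and `name`; the second reports only the dot, because the context pops as soon as it sees `(`.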
Awesome, thanks for taking the time to write all that up. I’ve implemented basically everything you suggested as well as tweaking a different part of the syntax file flagged by the engine test and now everything works flawlessly.