Sublime Forum

.sublime-syntax control keywords matching as functions first

#1

I’m trying to tweak a .sublime-syntax file to make it match functions like
doThing()
with
(?:\s*([a-zA-Z0-9_]*)\s*(?=\())
I’m running into issues with the fact that control keywords like if, for, while etc are being matched to the pattern above, even though they have their own pattern:
\b(if|while|do|switch|for|foreach|return|throw|yield|continue|try|catch|resume|default|else|case|break)\b
Is there any way for me to prioritize which patterns get matched in which order, or any other (preferably not too complicated) approach to solving this problem?


#2

Patterns are matched in the order they appear in the syntax definition. Putting the keyword match before the function match should make the keywords win.


#3

If those keywords can indeed be matched as function names, that sounds like a bug in the syntax definition though. Filing an issue on https://github.com/sublimehq/Packages/issues/new/choose would be preferable.


#4

So I’ve done some testing with other components of the file, and you’re right for everything else: moving rules around changes how those components are classified. For some reason, though, the order of the keyword and function matches doesn’t matter. The keywords are always identified as functions as long as the function match exists (if I delete it, they aren’t), regardless of which of the two comes first.

I am… so confused right now.

- match: '(?:\s*(len|tofloat|tointeger|tochar|slice|find|tolower|toupper|rawget|rawset|rawin|rawdelete|clear|append|push|extend|pop|top|insert|remove|resize|sort|reverse|map|apply|find|capture|match|search)\s*(?=\())'
  scope: support.function.squirrel
- match: \b(if|while|do|switch|for|foreach|return|throw|yield|continue|try|catch|resume|default|else|case|break)\b
  scope: keyword.control.squirrel
- match: '(?:\s*([a-zA-Z0-9_]*)\s*(?=\())'
  scope: variable.function.squirrel
- match: '(?:(?<=\.)\b([a-zA-Z0-9_]*)\b)'
  scope: variable.other.member.squirrel

This is a snippet of what I currently have, with everything that should be relevant, and it’s still prioritizing the function match over the keyword match.

Edit: also a screenshot to hopefully show I’m not a raving madman


#5

This is for Squirrel, which doesn’t have an official Package.


#6

So I’ve figured out what my problem was. My function match was also matching the whitespace before the function name, so it started further to the left than the keyword match and was prioritized over it. I fixed it by making sure the match only covers the identifier itself, like so:
(?:\b([a-zA-Z0-9_]*)\b(?=\s*\())
That solved my issue.
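
For anyone landing here later, this is a minimal sketch of how the two rules might look with that fix applied. It assumes, based on the behavior described in this thread, that the engine prefers whichever rule matches furthest to the left and only falls back to rule order when two rules match at the same position. The `main` context name is a placeholder, and `+` is used instead of `*` so the identifier cannot be empty; neither detail comes from the original file.

```yaml
contexts:
  main:  # placeholder context name, not taken from the original file
    # Keywords come first so they win when both rules match at the same position.
    - match: '\b(if|while|do|switch|for|foreach|return|throw|yield|continue|try|catch|resume|default|else|case|break)\b'
      scope: keyword.control.squirrel
    # The match covers only the identifier; the whitespace and "(" stay in the lookahead,
    # so this rule no longer starts to the left of the keyword rule.
    - match: '\b[a-zA-Z_][a-zA-Z0-9_]+\b(?=\s*\()'
      scope: variable.function.squirrel
```

Note that `[a-zA-Z_][a-zA-Z0-9_]+` requires at least two characters; use `[a-zA-Z_][a-zA-Z0-9_]*` if single-letter function names should match too. The built-in function rule from post #4 starts with the same `\s*` and would likely benefit from the same change.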


#7

You probably want to avoid lookbehinds in new syntax definitions, as those trigger the slower Oniguruma regex engine.


#9

Does having even one of them do this? It’s not too hard to fix, just a bit more verbose, so I’m curious.

Edit: do lookaheads do the same?

Edit 2: actually, I’m not sure how I’d go about remedying this. If I change the match to this

- match: '(?:(?:\.)\b([a-zA-Z0-9_]*)\b)'
  captures:
    1: variable.other.member.squirrel

it matches further to the left than some of my other patterns and is therefore prioritized (since it’s technically matching the . before the property as well).


#10

Lookaheads are safe. You can test compatibility by running Syntax Test - Compatibility Check while the view containing the sublime-syntax definition is focused/active.

An alternative for the lookbehind …

- match: '(?:(?<=\.)\b([a-zA-Z0-9_]*)\b)'
  scope: variable.other.member.squirrel

… is …

- match: (\.)([a-zA-Z0-9_]*)\b
  captures:
    1: punctuation.accessor.dot.squirrel
    2: variable.other.member.squirrel

Identifiers normally start with a letter or underscore rather than a digit, to avoid confusion with floating point numbers, so I’d suggest at least

- match: (\.)([a-zA-Z_][a-zA-Z0-9_]*)\b
  captures:
    1: punctuation.accessor.dot.squirrel
    2: variable.other.member.squirrel

If you want or need to support more kinds of members (maybe also member functions), it can be a good idea to push a dedicated context onto the stack:

  accessors:
    - match: \.
      scope: punctuation.accessor.squirrel
      push: maybe-member

  maybe-member:
    # dot is followed by an identifier, so consume as member and pop.
    - match: '[a-zA-Z_][a-zA-Z0-9_]*\b'
      scope: variable.other.member.squirrel
      pop: true
    # add more here
    # ...
    # dot is not followed by an identifier, so pop.
    - match: (?=\S)
      pop: true
    # end of line found, so pop??? Not sure if squirrel is line-based.
    - match: $
      pop: true

This approach also works if a member may be located on the next line, provided the end-of-line rule is left out:

var .
  member
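
As a rough sketch of the "add more here" part, the contexts above could be wired into the rest of the syntax and extended with a member function rule as shown below. The `main` context name and the reuse of `variable.function.squirrel` for method calls are assumptions for illustration, not something taken from the actual Squirrel syntax file. The end-of-line rule is left out here so that a member on the following line is still picked up.

```yaml
contexts:
  main:  # placeholder name for whichever context holds the other rules
    - include: accessors

  accessors:
    - match: '\.'
      scope: punctuation.accessor.squirrel
      push: maybe-member

  maybe-member:
    # identifier followed by "(": treat it as a member function call and pop.
    # This rule must come before the plain member rule, since both match at the same position.
    - match: '[a-zA-Z_][a-zA-Z0-9_]*\b(?=\s*\()'
      scope: variable.function.squirrel
      pop: true
    # any other identifier: plain member, then pop.
    - match: '[a-zA-Z_][a-zA-Z0-9_]*\b'
      scope: variable.other.member.squirrel
      pop: true
    # dot is not followed by an identifier, so pop without consuming anything.
    - match: '(?=\S)'
      pop: true
```

Because the dot is consumed by its own rule, no lookbehind is needed to tell members apart from other identifiers.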

#11

Awesome, thanks for taking the time to write all that up. I’ve implemented basically everything you suggested as well as tweaking a different part of the syntax file flagged by the engine test and now everything works flawlessly.
