Sublime Forum

Syntax highlighting: put a meta scope on all that is in the next line

#1

Suppose I have code like this:

If (x = 1)
Msgbox, X is 1!
Msgbox, You get this message regardless of what x is.

So the first line has If and the condition. The second line has the outcome/body of the conditional; on the third line, the conditional is done, and normal code flow continues. What I’d like to do is put a meta scope on this outcome/body line.

So, after the second line, this meta scope should end, and we should return to source. The problem is that the outcome scope begins and ends at the same regex: ^ (the beginning of a new line). That doesn’t work: either the scope begins and ends immediately, when ^ is the first line within the scope; or it is never triggered, because the other lines within the scope are always triggered first. This is my attempt, which doesn’t work:

  if:
    - match: '(?i:^\s*(if)\b)'
      captures:
        1: meta.conditional.ahk keyword.control.conditional.ahk
      push:
        - meta_content_scope: meta.conditional.condition.ahk
        - match: '='
          scope: keyword.operator.comparison.ahk
        - match: '^'  # We are on line 2: the condition has ended, so here the outcome must begin.
          set: outcome

  outcome:
    - meta_content_scope: meta.conditional.outcome.ahk
    - include: messagebox
    - match: '^'  # When the outcome scope encounters a new line, it is finished.
      pop: true

  messagebox:  # This is just an example of the various commands that could be in the outcome, or anywhere else in the main scope.
    - match: '(?i:\b(msgbox)\b),?(.*)$'
      captures:
        1: support.type.builtin.command.ahk
        2: string.ahk

When the line that includes messagebox is above match: '^', the outcome never ends; when they are reversed, the outcome scope ends immediately after it begins, covering nothing.

How can I make sure the meta.outcome scope is applied only to line 2?


#2

Instead of popping on a start of line (which isn’t really a match exactly, it’s just an anchor) you might try

  • Adding the conditional text to the match for your if statement so that you end up consuming the rest of the line. Or if you are permitted to have multiline conditional text, some way of identifying the end of the conditional portion. Then push your outcome context.
  • Searching for non-whitespace and newline to trigger popping your context. I don’t think searching for the ^ will do much since it’s an anchor for the beginning of a line and isn’t exactly a matching object.

#3

Hey, thanks for the tips!

Adding the conditional text to the match: I’d love to do that, but there can be anything in the condition, and it needs to be highlighted properly. I don’t think that’s possible if I were to include the entire condition in the regex which identified the beginning of the outcome.

There is no way to identify the end of the conditional portion, alas, except when a new line begins that cannot be connected with the condition (for example, any line beginning with a dot or a comma is attached to the previous line, so such a line would be attached to the conditional line and hence be part of the condition: but I can’t tell whether the next line begins with a dot until I’m there with the parser).

So the problem with searching for non-whitespace after a newline is that, at the beginning of the outcome line(s), this should set the outcome (like at the first Msgbox line), while, at the end of the outcome (at the second Msgbox), it should pop the outcome scope. So the same match should trigger beginning and end of the scope.

I can’t use embed/escape, because there could be nested If blocks, so I don’t want to pop the outermost one when it’s only the innermost one that should end.

Should I just give up?


#4

Well, I don’t know about giving up. There are folks far more clever at syntax parsing than I am. However, I do feel that you need sufficient lexical anchors to determine boundaries. The TCL syntax is inherently flawed and problematic in this way, since it’s a command language and it’s extremely difficult to tell what’s what (there are lines in the syntax file that lean on code conventions, such as a [ followed by a space and then a word meaning one thing, and a [ followed by the word meaning another).

So I don’t know about giving up, but you certainly have your work cut out for you. Can you at least rely on the conditional portion being enclosed in parentheses? That might get you down the road a little further being able to distinguish between the if and the result.


#5

How about something like this:

%YAML 1.2
---
# See http://www.sublimetext.com/docs/3/syntax.html

scope: source.ahk
contexts:
  main:
    - include: if

  if:
    - match: '(?i:^\s*(if)\b)'
      captures:
        1: meta.conditional.ahk keyword.control.conditional.ahk
      push:
        - meta_content_scope: meta.conditional.condition.ahk
        - match: '='
          scope: keyword.operator.comparison.ahk
        - match: '^(?![\.,])'  # start of line, but only if first character is no dot or comma
          set: outcome

  outcome:
    - meta_scope: meta.conditional.outcome.ahk
    - include: messagebox
    - match: '\n'  # When the outcome scope encounters a new line, it is finished.
      pop: true

  messagebox:  # This is just an example of the various commands that could be in the outcome, or anywhere else in the main scope.
    - match: '(?i:\b(msgbox)\b),?(.*)$'
      captures:
        1: support.type.builtin.command.ahk
        2: string.ahk

I assume you want the meta.conditional.outcome scope on the whole line, as shown in the following screenshot. Otherwise use meta_content_scope instead of meta_scope in the outcome context.

screenshot
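
For comparison, a sketch of the meta_content_scope variant of the outcome context (untested; everything else stays as in the definition above). As far as I understand the scoping rules, meta_scope also covers the text matched by the rules that enter and leave a context, while meta_content_scope does not. The rule that enters this context matches zero characters, so the visible difference should be that the trailing newline consumed by the pop rule is left out of meta.conditional.outcome.ahk.

```yaml
  outcome:
    # Content scope only: text matched by the entering and leaving rules is excluded.
    - meta_content_scope: meta.conditional.outcome.ahk
    - include: messagebox
    - match: '\n'  # Consumes the newline; with meta_content_scope it is not part of the outcome scope.
      pop: true
```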


#6

Yeah, its syntax is sometimes odd, even inconsistent. But that is exactly why I want to make a proper highlighter: it will prevent people (and myself) from making mistakes due to the inconsistency.

Unfortunately, the condition does not need to end with a closing bracket. But, even if it did, that still wouldn’t solve the problem, because outcome and post-outcome code can both start exactly the same way.


#7

Thank you for your help! You’re thinking in the right direction. The problem is that the rule [where a new line beginning with a dot or comma (or a bunch of other things) is attached to the previous line] applies not only to conditions, but to all code. So my example and the Msgbox regex were not realistic enough (I tried to make a minimal example). For example, the text in the message box could be split into several lines using dots or commas:

    x := 1
    If (x = 1)
    Msgbox, X is 1
    , congratulations!
    Msgbox, This message is shown regardless of what x is.

I have adapted your code a bit to try and match the outcome properly, but I get stuck at the same problem, i.e. either the outcome is not scoped as such at all, or the outcome and what follows are both scoped as outcome.

    %YAML 1.2
    ---
    # See http://www.sublimetext.com/docs/3/syntax.html

    scope: source.ahk
    contexts:
      main:
        - include: if

      if:
        - match: '(?i:^\s*(if)\b)'
          captures:
            1: meta.conditional.ahk keyword.control.conditional.ahk
          push:
            - meta_content_scope: meta.conditional.condition.ahk
            - match: '='
              scope: keyword.operator.comparison.ahk
            - match: '^(?![\.,])'  # start of line, but only if first character is no dot or comma
              set: outcome

      outcome:
        - meta_scope: meta.conditional.outcome.ahk
        - include: messagebox
        - match: '\n'  # When the outcome scope encounters a new line, it is finished.
          pop: true

      messagebox:  # This is just an example of the various commands that could be in the outcome, or anywhere else in the main scope.
        - match: '(?i:\b(msgbox)\b),?'
          captures:
            1: support.type.builtin.command.ahk
          push:
            - match: '^(?![\.,])'  # start of line, but only if first character is no dot or comma
              pop: true
            - match: '.'
              scope: string.ahk

#8

Oh, now I understand. Ok, second try:

%YAML 1.2
---
# See http://www.sublimetext.com/docs/3/syntax.html

scope: source.ahk
contexts:
  main:
    - include: if
    - include: messagebox

  if:
    - match: '(?i:^\s*(if)\b)'
      captures:
        1: meta.conditional.ahk keyword.control.conditional.ahk
      push:
        - meta_content_scope: meta.conditional.condition.ahk
        - match: '='
          scope: keyword.operator.comparison.ahk
        - match: '^(?![\.,])'  # start of line, but only if first character is no dot or comma
          set: outcome

  outcome:
    - meta_scope: meta.conditional.outcome.ahk
    - include: messagebox
    - match: '\n'  # At the end of the line, check whether the next line continues the outcome.
      set: test-line-continuation

  test-line-continuation:
    - match: '^[\.,].*'
      scope: string.ahk
      set: outcome
    - match: '(?=.)'
      pop: true

  messagebox:  # This is just an example of the various commands that could be in the outcome, or anywhere else in the main scope.
    - match: '(?i:\b(msgbox)\b),?(.*)$'
      captures:
        1: support.type.builtin.command.ahk
        2: string.ahk

screenshot

Is that syntax for AutoHotkey scripts?

Edit: I think it should work now. I’ve added an auxiliary context to test for line continuation and I added messagebox to the main context as well here.


#9

Thanks a lot! A special scope for testing line continuation: interesting approach. It eats the continuing line, and it thereby ensures that its other match (the one that pops the context) cannot match at the beginning of a line. The problem is, though, that none of those matches should eat anything, for this ‘anything’ could be many different elements of code, each having its own highlighting rules (which I have left out from this example).

By the way, I altered the ‘messagebox’ context in my previous example, because it, too, can continue onto a new line, if the new line begins with a comma, dot, etc. So the ‘messagebox’ context cannot know it is finished at the end of a line: it can only know (pop itself) at the beginning of a new line not beginning with dot, comma, etc.

I get the feeling that what I want just isn’t possible with Sublime at the moment. Perhaps I should settle for scoping the ‘outcome’ only when it begins with some whitespace: I think that is possible.
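
A rough, untested sketch of that fallback, under stated assumptions: the context names condition and outcome and the exact regexes are made up for illustration, and only dot and comma are treated as continuation characters. The idea is that the rule starting the outcome consumes the leading whitespace, so the parser is no longer at the start of the line, and the ^-anchored pop rule cannot fire until the following line.

```yaml
%YAML 1.2
---
# Sketch only: an outcome is scoped only when its line is indented.
scope: source.ahk
contexts:
  main:
    - include: if
    - include: messagebox

  if:
    - match: '(?i:^\s*(if)\b)'
      captures:
        1: meta.conditional.ahk keyword.control.conditional.ahk
      push: condition

  condition:
    - meta_content_scope: meta.conditional.condition.ahk
    - match: '='
      scope: keyword.operator.comparison.ahk
    # Indented line that is not a continuation: consume the indentation, start the outcome.
    - match: '^[ \t]+(?![ \t.,]|$)'
      set: outcome
    # Unindented line that is not a continuation: leave without scoping an outcome.
    - match: '^(?![ \t.,])'
      pop: true

  outcome:
    - meta_scope: meta.conditional.outcome.ahk
    # Listed before the include so that it wins at the start of the next statement.
    # It cannot match on the outcome's own first line, because the indentation
    # was already consumed by the rule that set this context.
    - match: '^(?![ \t]*[.,])'
      pop: true
    - include: messagebox

  messagebox:
    - match: '(?i:\b(msgbox)\b),?(.*)$'
      captures:
        1: support.type.builtin.command.ahk
        2: string.ahk
```

An unindented outcome line is simply left unscoped here, which is the compromise described above.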


#10

Shouldn’t it be enough to check, at the beginning of the line, whether the first non-whitespace character is not a comma (and thus the line is not a continuation), and then pop the context? I’m on mobile, so I can’t exactly type that out as an example. You may need to refactor a bit.


#11

The problem is, though, that none of those matches should eat anything, for this ‘anything’ could be many different elements of code, each having its own highlighting rules

In that case I think the only possibility is to push a new context after the Msgbox keyword and check for line continuation within that context. That would ensure that the meta.conditional.outcome scope is not popped until the Msgbox command is finished. The same would be necessary for every other command that you want to include in the outcome context, though.

Shouldn’t it be enough to check, at the beginning of the line, whether the first non-whitespace character is not a comma (and thus the line is not a continuation), and then pop the context?

I think you mean something like the following, right?

  outcome:
    - meta_scope: meta.conditional.outcome.ahk
    - include: messagebox
    - match: '^(?![\.,])'
      pop: true

This won’t work because it doesn’t pop on successive Msgbox lines. And if you place the pop-check before include: messagebox, it will immediately pop and the meta.conditional.outcome would be empty.


#12

The problem is that that exact same check is what begins the outcome scope: a new line not beginning with certain characters/words. So the outcome scope is set/pushed and popped by the exact same check, which ensures that it ends immediately as it begins. You can’t really know whether the outcome should begin until you’re on the line where it begins, nor can you tell where it should end until you’re at the line immediately after the outcome. It’s been this logic puzzle I have been struggling with from the beginning; I still don’t see a way out!


#13

Yeah, I already have an include that checks for line continuation in all(?) of the contexts where it applies, e.g. in commands like Msgbox, in expressions, assignments, everything. That works well. But I don’t think it can solve this little logic puzzle. I think your reply to Fichte accurately describes the issue.
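
Such a shared include could look roughly like the following (a sketch, not the actual syntax file; the variable name continuation_start and the context name pop-unless-continued are hypothetical, and only dot and comma are covered). Keeping the continuation characters in one variable means the lookahead is defined once and can be extended in a single place.

```yaml
%YAML 1.2
---
scope: source.ahk
variables:
  # Hypothetical and incomplete: only dot and comma, as in the examples in this thread.
  continuation_start: '[.,]'
contexts:
  main:
    - match: '(?i:\b(msgbox)\b),?'
      captures:
        1: support.type.builtin.command.ahk
      push:
        - include: pop-unless-continued
        - match: '.'
          scope: string.ahk

  # Reusable check: leave the current context at the start of any line
  # that does not continue the previous one.
  pop-unless-continued:
    - match: '^(?!{{continuation_start}})'
      pop: true
```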

There is really no way for the outcome context to know that it should begin, except when we are at the beginning of the (first) line of the outcome; and there is no way for the outcome context to know that it should end, except at the beginning of the first line of whatever comes after it. We cannot know either of these at the end of the previous line, because we cannot look ahead from there, so we cannot see whether or not the next line starts with a comma or similar (continuing the line). So we are stuck with the two matches, the one beginning and the one ending the outcome context, being exactly identical, and consuming zero characters, so that the position of the regex doesn’t move. As a result, both matches will match at the same position and immediately after each other.


#14

Yes, right. In fact, pushing a new context as I wrote before wouldn’t work, because we would need to pop 2 levels at once to leave the outcome context. But it should work with set instead of push. Unfortunately, we then can’t put the meta scope in the outcome context and must include it in each command context instead, so those contexts can’t be reused elsewhere. Here is an example of what I mean:

%YAML 1.2
---
# See http://www.sublimetext.com/docs/3/syntax.html

scope: source.ahk
contexts:
  main:
    - include: if

  if:
    - match: '(?i:^\s*(if)\b)'
      captures:
        1: meta.conditional.ahk keyword.control.conditional.ahk
      push:
        - meta_content_scope: meta.conditional.condition.ahk
        - match: '='
          scope: keyword.operator.comparison.ahk
        - match: '^(?![\.,])'
          set:
            - match: '(?i:\b(msgbox)\b),?'
              captures:
                1: support.type.builtin.command.ahk
              set:
                - meta_scope: meta.conditional.outcome.ahk
                - match: '^(?![\.,])'
                  pop: true
                - match: '.'
                  scope: string.ahk
            - match: '(?i:\b(OtherCommandWithNumbers)\b),?'
              captures:
                1: support.type.builtin.command.ahk
              set:
                - meta_scope: meta.conditional.outcome.ahk
                - match: '^(?![\.,])'
                  pop: true
                - match: \d
                  scope: constant.numeric.ahk
                - match: '.'
                  scope: string.ahk
            - match: '\n'
              pop: true

screenshot


#15

Ah, yes, that would solve the puzzle. I’d have to make a copy of all contexts that can exist in the outcome (which is almost all that can exist in the main context) and alter them such that they would always have to end in a way that pops the outcome context. Some redundancy could be removed by using includes wherever possible, but that would make the syntax file somewhat chaotic. And the copy would increase the size of the syntax file a lot, because basically any sub-context that can return to the main context would need to be copied and altered. I shall have to ponder this!
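
A sketch of how includes might limit that duplication, reworking the Msgbox part of the example in post #14 (untested; the context names outcome-start, msgbox-arguments and msgbox-arguments-outcome are made up for illustration). The argument rules are written once, and the outcome variant only adds the meta scope on top.

```yaml
%YAML 1.2
---
scope: source.ahk
contexts:
  main:
    - include: if
    - match: '(?i:\b(msgbox)\b),?'
      captures:
        1: support.type.builtin.command.ahk
      push: msgbox-arguments

  if:
    - match: '(?i:^\s*(if)\b)'
      captures:
        1: meta.conditional.ahk keyword.control.conditional.ahk
      push:
        - meta_content_scope: meta.conditional.condition.ahk
        - match: '='
          scope: keyword.operator.comparison.ahk
        - match: '^(?![.,])'
          set: outcome-start

  # First line after the condition: each command hands over to its outcome variant.
  outcome-start:
    - match: '(?i:\b(msgbox)\b),?'
      captures:
        1: support.type.builtin.command.ahk
      set: msgbox-arguments-outcome
    - match: '\n'
      pop: true

  # Shared argument rules, written once, with no meta scope of their own.
  msgbox-arguments:
    - match: '^(?![.,])'
      pop: true
    - match: '.'
      scope: string.ahk

  # Outcome variant: the same rules plus the outcome meta scope.
  msgbox-arguments-outcome:
    - meta_scope: meta.conditional.outcome.ahk
    - include: msgbox-arguments
```

Every command would still need its own small outcome wrapper, so this only trims the duplication rather than removing it.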

Alternatively, I could ignore outcomes that start at the beginning of the line after Ifs, i.e. not scope them as outcomes. I can scope any outcomes that start with some whitespace, and also those that start with a brace. It would be bad practice anyway, to start an outcome without whitespace, and my linter already warns against it, so perhaps that is an acceptable compromise. I’m thinking.
