Sublime Forum

Syntax highlighting guidelines

#1

Are there any guidelines for developing syntax definitions with regard to handling invalid syntax? I’ve noticed that many syntax definitions are generous in allowing invalid syntax, and that “invalid” scopes are rarely used.

I have a few guesses as to why this might be. Are they correct?

  • Simplifies the syntax definition.
  • Helps with performance.
  • Avoids flickering and odd highlighting while the user is typing.

I’m wondering how to strike a balance between keeping the syntax definition simple and giving the user real-time feedback that something is invalid.

#2

I suspect it depends greatly on the language. In my own package, I indicate invalid syntax whenever possible. In other areas, if someone writes a statement that doesn’t belong in a block (say, an assignment in a declaration region), it does not scope properly and will not be identified as an assignment. It isn’t scoped as invalid either, and it’s an interesting question whether it should be.

I know that in another case, due to limitations of the engine and the way I created contexts, I simply cannot mark some identifiers as invalid. This is usually the case where the user can match an identifier from a previous clause, and I’ve lost the capture by that point.

#3

Personally, I try to err on the side of graceful highlighting even for invalid code. Flickering is one reason; it would be awful if invalid code always brought forth angry splotches of magenta. Another reason is recontextualization; it’s nice if a code snippet pasted into a temporary buffer is highlighted even if it’s invalid out of context. When I wrote my Oracle syntax, a key design goal was that disconnected snippets should be highlighted properly. Example:

runQuery("select #expr# as foo from #table# where id=:id and active=1", { id: 42 });

As far as the Oracle syntax knows, that’s “select as foo from where id= and active=1”. Even though this is invalid, I expect the syntax to handle it. One of the key benefits of the architecture I devised for that syntax is that tokens are optional by default; missing tokens will be skipped and ignored.

In a perfect world, I would expect a linter to catch syntax errors, obviating any such functionality in the syntax. However, quality linters aren’t always available (or used when available), and it’s not unreasonable for the syntax to catch some of the lowest-hanging fruit, like unmatched close parens, invalid escapes, and so forth. More sophisticated error handling, though, is often impossible in Sublime’s parser, particularly in complex languages like JavaScript and C.
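
To make the low-hanging fruit concrete, here is a minimal .sublime-syntax sketch. It is not taken from any shipped package: the toy language, the `source.example` scope, the scope suffixes, and the set of legal escapes are all assumptions made for illustration.

```yaml
%YAML 1.2
---
# Hypothetical toy language; name, scopes, and escape set are assumptions.
name: Example
scope: source.example
contexts:
  main:
    - match: \(
      scope: punctuation.section.group.begin.example
      push: group
    # A close paren at top level has no opener, so it can never be valid.
    - match: \)
      scope: invalid.illegal.stray-bracket-end.example
    - match: '"'
      scope: punctuation.definition.string.begin.example
      push: string
  group:
    # Inside a group, a close paren is legitimate and ends the group.
    - match: \)
      scope: punctuation.section.group.end.example
      pop: true
    - include: main
  string:
    - meta_scope: string.quoted.double.example
    - match: \\[nrt"\\]
      scope: constant.character.escape.example
    # Any other backslash sequence is flagged, but only those two characters.
    - match: \\.
      scope: invalid.illegal.unknown-escape.example
    - match: '"'
      scope: punctuation.definition.string.end.example
      pop: true
```

Note that an unclosed open paren is deliberately left alone in this sketch, since that is the normal state of a buffer while someone is typing.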

#4

I tend to try to be helpful: mark anything obviously invalid (syntax, stray close brackets, etc.) as invalid, but minimize what gets scoped as invalid while code is still being typed.

#5

What it generally comes down to is the fact that, for an editor, invalidity is the common case. You’re in the process of typing code; naturally it’s not going to be perfectly valid on a character-by-character basis. For this reason, use of “invalid” scopes generally produces a really bad experience. Most of the work I do on the Scala mode, for example, actually involves making it more tolerant of bad buffer states, usually in the form of context bail-outs and such, but sometimes in the form of eliminating “invalid” scopes (or restricting them).
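
For readers unfamiliar with the term, a context bail-out can be sketched roughly as follows. This is a generic illustration rather than an excerpt from the Scala syntax; the context names, the `val` keyword rule, and the scope names are hypothetical.

```yaml
# Fragment of a .sublime-syntax file; all names here are hypothetical.
contexts:
  main:
    - match: \bval\b
      scope: keyword.declaration.example
      push: expect-name
  expect-name:
    # The happy path: the declared name follows the keyword.
    - match: '[a-z]\w*'
      scope: variable.other.example
      pop: true
    # Bail-out: any other non-whitespace text ends this context without being
    # consumed, so a half-typed declaration cannot swallow what follows it.
    - match: (?=\S)
      pop: true
```

The lookahead consumes nothing, so whatever triggered the bail-out is then matched normally by the parent context instead of being marked invalid.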

An “invalid” scope generally works well for things that are never intended to exist, even transiently. Mismatched parentheses are a good example of this. Another decent example is Scala’s class inheritance syntax, which requires that the first clause be delimited by “extends” and that all subsequent clauses use “with”. This means that we can and should mark a “with” that has no preceding “extends” as invalid, and likewise a doubled “extends”.
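
A rough sketch of how the extends/with rule could be expressed is below. It is not how the actual Scala package implements it; the context names and scope names are made up, and for simplicity the class header is assumed to end at an opening brace or the end of the line.

```yaml
# Fragment of a .sublime-syntax file; illustrative only, not the Scala package.
contexts:
  main:
    - match: \bclass\b
      scope: keyword.declaration.class.example
      push: class-header
  class-header:
    # The first inheritance clause must use `extends`.
    - match: \bextends\b
      scope: keyword.declaration.extends.example
      set: class-parents
    # `with` before any `extends` is never valid, even mid-typing.
    - match: \bwith\b
      scope: invalid.illegal.with-before-extends.example
    - include: class-header-end
  class-parents:
    # Every later clause must use `with`.
    - match: \bwith\b
      scope: keyword.declaration.with.example
    - match: \bextends\b
      scope: invalid.illegal.repeated-extends.example
    - include: class-header-end
  class-header-end:
    # Assumption: the header ends at `{` or at end of line.
    - match: '(?=\{|$)'
      pop: true
```

Only the offending keyword is scoped as invalid here, so the rest of the header keeps its normal highlighting.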

Just be very, very careful about invalidity. Think about what happens as you type each construct. No one likes to see large swaths of their buffer flash to and from hot pink on a character-by-character basis.
