I do not work for Sublime HQ. I do, however, know a great deal about Sublime's syntax highlighter, syntax highlighting in general, and the broader topic of formal language theory. I am well qualified to talk about these things and, more importantly, to explain them. If you have any questions I would be happy to address them.
I have not seen that video, but I have seen other explanations of Atom's tree-sitter implementation by the same presenter. The GitHub employee does not understand how Sublime's engine works, and some of the things that he says about it are not true.
“… Why is there this inconsistency here? And it's because none of these tools really understand Go code, or any code for that matter. They're just kind of looking for simple patterns in the code that they can identify in regexes.”
He is describing, more or less, the old TextMate syntax highlighting system (tmLanguage). It's a fairly simplistic system that doesn't allow for very sophisticated parsing. Since its introduction, it has become a de facto standard for many text editors, and it is supported by Sublime, Atom, and Visual Studio Code. I would argue that his brief description is not entirely fair to the TextMate system, but on the whole it's generally accurate.
However, since 2015 Sublime has supported a different, more powerful syntax highlighting system (
The GitHub employee in the video shows an example of some Go code highlighted with an old TextMate-derived syntax definition. He thinks that all types should be highlighted the same way in the sample. That's an opinion. Then, he claims that the reason they aren't is that “none of these tools really understand Go code”. This is nonsense. Even a TextMate syntax definition could probably have highlighted all of those types the same way. I am absolutely sure that Sublime's engine can. The reason they're highlighted differently is that the specific syntax definition he's using highlights them differently. He specifically claims that this is not an artifact of the specific syntax definition he's using, but rather a fundamental limitation of the simplistic algorithms used for highlighting in other editors, specifically including Sublime. This is a false claim.
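To make that concrete, here is a toy sketch in Python of a pattern-based highlighter driven by a stack of contexts. It is not Sublime's engine and not any real Go syntax definition; the context names, the scope names, and the tiny subset of Go it handles (plain `var NAME TYPE` declarations) are all assumptions I made up for illustration. The point is only that "regexes plus a context stack" has no trouble giving built-in, user-defined, qualified, pointer, and slice types the same scope if the definition's author chooses to.

```python
import re

TYPE_SCOPE = "storage.type"  # hypothetical scope name for this sketch

# Each context is an ordered list of (pattern, scope, action) rules.
# action is None (stay), ("push", ctx), ("set", ctx), or ("pop", None).
CONTEXTS = {
    "main": [
        (re.compile(r"\bvar\b"), "keyword", ("push", "var-name")),
    ],
    "var-name": [
        (re.compile(r"[A-Za-z_]\w*"), "variable", ("set", "type")),
    ],
    "type": [
        (re.compile(r"\*|\[\]"), "punctuation", None),
        (re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)?"), TYPE_SCOPE, ("pop", None)),
    ],
}


def highlight(source):
    """Return (text, scope) pairs for every piece of text a rule matched."""
    tokens = []
    stack = ["main"]
    pos = 0
    while pos < len(source):
        for pattern, scope, action in CONTEXTS[stack[-1]]:
            match = pattern.match(source, pos)
            if match is None:
                continue
            tokens.append((match.group(), scope))
            pos = match.end()
            if action is not None:
                op, target = action
                if op == "pop":
                    stack.pop()
                elif op == "set":
                    stack[-1] = target
                else:  # "push"
                    stack.append(target)
            break
        else:
            pos += 1  # nothing matched here: leave the character unscoped
    return tokens


sample = "var a int\nvar b Foo\nvar c *pkg.Bar\nvar d []string\n"
types = [text for text, scope in highlight(sample) if scope == TYPE_SCOPE]
print(types)
```

Every type in the sample ends up with the same scope because the rule that scopes it is chosen by position (the "type" context), not by a hard-coded list of built-in names. Whether a given definition does this is a choice made by whoever wrote it.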
Sublime syntax definitions can be very simplistic or very sophisticated. The current built-in Go syntax definition is fairly simplistic (derived from an old TextMate syntax, I think, and probably from the same one that other editors use). However, there is an open pull request for a brand-new, much more sophisticated syntax definition that takes advantage of the greater power of Sublime's engine.
So on one hand, the GitHub employee is wrong about the sophistication of Sublime's parser, and on the other hand he's wrong about the level of sophistication required to do the job.
Error recovery is a bit of a non sequitur. A well-written syntax definition can sensibly handle invalid code. Atom is not special in this regard. Perhaps it's of particular concern due to the formality of the tree-sitter algorithm. In the video, the GitHub employee describes in some detail how tree-sitter handles syntax errors using nondeterminism. You don't need nondeterminism to solve that sort of problem. Nondeterminism is slow. (This is a theorem, not an implementation detail; nondeterministic context-free languages cannot be parsed as efficiently as deterministic context-free languages.) Even if Sublime implemented nondeterministic parsing, I wouldn't use it for that. It's not clear to me whether this is a real problem with tree-sitter, whether the syntax definition was badly written, or whether the GitHub employee is just using it as a simple example. From the way he congratulates himself for the originality, probably not the third.
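As a rough illustration of recovering from invalid code without nondeterminism, here is another toy sketch in Python. Again, this is not Sublime's engine or a real syntax definition; the rules, context names, and scope names (including `invalid.unterminated`) are hypothetical. The scanner keeps a single context stack and never forks into alternative parses. Recovery is just one extra rule: a zero-width lookahead that pops out of the string context when the line ends before the closing quote.

```python
import re

# Ordered (pattern, scope, action) rules per context.
# action is None (stay), ("push", ctx), or ("pop", None).
RULES = {
    "main": [
        (re.compile(r'"'), "string", ("push", "string")),
        (re.compile(r"\b(?:var|func|return)\b"), "keyword", None),
    ],
    "string": [
        (re.compile(r"\\."), "string.escape", None),
        (re.compile(r'"'), "string", ("pop", None)),
        # Recovery rule: at end of line, leave the string without consuming input.
        (re.compile(r"(?=\n)"), "invalid.unterminated", ("pop", None)),
        (re.compile(r'[^"\\\n]+'), "string", None),
    ],
}


def highlight(source):
    """Single left-to-right pass; returns (text, scope) for each rule match."""
    tokens = []
    stack = ["main"]
    pos = 0
    while pos < len(source):
        for pattern, scope, action in RULES[stack[-1]]:
            match = pattern.match(source, pos)
            if match is None:
                continue
            tokens.append((match.group(), scope))
            pos = match.end()
            if action is not None:
                op, target = action
                if op == "pop":
                    stack.pop()
                else:  # "push"
                    stack.append(target)
            break
        else:
            pos += 1  # nothing matched here: leave the character unscoped
    return tokens


# The first line has an unterminated string; the second line is valid.
broken = 'var a = "oops\nreturn "ok"\n'
tokens = highlight(broken)
keywords = [text for text, scope in tokens if scope == "keyword"]
print(keywords)
```

Without the recovery rule, the unterminated string on the first line would swallow `return` on the second. With it, the damage stops at the end of the line and everything after is highlighted normally.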
Code folding has barely anything to do with this, and this comment is long enough as it is, so I'll punt on that.