Sublime Forum

A convention for scope naming

#1

Hi, I’d like to use the recent activity as an opportunity to put down a convention for scope naming.
For now, syntaxes name scopes based on the TextMate guidelines and on conventions that have been implicitly adopted.

As Sublime has its own syntax format, and a lot of contributions have been made to syntaxes recently, I think it would be a good time to agree on a more detailed convention.

I started writing down the rules I follow to create my C#, Scala and .sublime-syntax syntaxes in this git repo.

It’s not finished, because I wanted to have your input on the subject.

Do you agree that we need more precise guidelines?
What shortcomings do you think the TextMate conventions have?
What would you want to add?

3 Likes

#2

To answer my own questions:

The goal of this convention is to enable theme creation without knowledge of every language supported by Sublime, and syntax creation without knowledge of all the themes.

I think we need more precise guidelines because a lot of things aren’t really clear in the TextMate conventions.
For example, what’s the difference between ‘constant.language’ and ‘support.constant’?
TextMate only mentions ‘type’, but a lot of syntaxes and themes also use ‘class’. Should we explicitly discourage it?

I also really miss conventions on punctuation. Punctuation is omnipresent in code, and having it colored with the right color really improves readability. Look at @bathos’s Ecmascript syntax to get an idea of how far we can push this concept, or at my C# syntax for a more moderate approach.

1 Like

#3

I think one of the most important aspects of doing something like this would be to write a script to download all of the packages from https://packagecontrol.io/browse/labels/language%20syntax and generate a sqlite DB or spreadsheet of all of the scope names and in how many syntaxes they are used.

This would help inform if there have been some conventions adopted by the community, but also help identify package maintainers that could be communicated with for improving their scope names. We could identify a list of “legacy” scope names with guidance on what they should be updated to.
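As a rough illustration of the database idea, here is a minimal Python sketch that stores (syntax, scope) pairs in SQLite and counts how many syntaxes use each scope. The table layout, the function names and the sample data are assumptions made for this example; downloading the packages from Package Control is left out entirely.

```python
import sqlite3


def build_scope_db(scopes_by_syntax):
    """Store (syntax, scope) pairs in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per scope used by a syntax.
    conn.execute(
        "CREATE TABLE usage (syntax TEXT, scope TEXT, UNIQUE (syntax, scope))"
    )
    for syntax, scopes in scopes_by_syntax.items():
        conn.executemany(
            "INSERT OR IGNORE INTO usage VALUES (?, ?)",
            [(syntax, scope) for scope in scopes],
        )
    return conn


def syntax_counts(conn):
    """Return (scope, number of syntaxes using it), most used first."""
    query = (
        "SELECT scope, COUNT(*) AS n FROM usage "
        "GROUP BY scope ORDER BY n DESC, scope"
    )
    return conn.execute(query).fetchall()


# Made-up sample data; a real run would fill this from downloaded packages.
sample = {
    "Python": ["keyword.control", "variable.parameter"],
    "Java": ["keyword.control", "storage.type"],
}
for scope, count in syntax_counts(build_scope_db(sample)):
    print(scope, count)
```

Scopes with a low count in such a table would be natural candidates for the “legacy” list.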

Ideally in addition to working towards better standards in scope names, I would love to see syntaxes try to use more explicit context manipulation, and rely less on large regular expressions with lots of branches. Obviously by not having the benefit of a two-pass system, some context information has to be processed by regular expressions with multiple options or look-aheads. However, the simpler the regular expressions and the more deterministic the branching, the better the performance all users will experience.

2 Likes

#4

That’s a good idea. I don’t have much experience with Package Control, but I’ll have a look at it.
I’ll start by doing this on the syntaxes in the official Packages repo.

0 Likes

#5

Hey Gwenzek, I had similar thoughts, though I specifically wondered what things would look like with a “clean slate” – and there’s actually an experimental branch on ESS with an alternative approach that tries to do that:

IIRC it isn’t quite done, though despite just being an experiment, I did do most of it.

0 Likes

#6

Ok, I added a few scripts to my repo that download ‘Sublime/Packages’ from GitHub and extract the scopes used in each syntax.
I used the pyyaml parser, so I hope I didn’t miss any scopes.

Here is a summary of which scopes are used IRL.
I removed scopes appearing only once and scopes with a depth > 3.

The numbers correspond to the number of different syntaxes using a given scope.
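The filtering described above can be sketched roughly as follows. This is not the script from the repo: it uses a regular expression instead of pyyaml to stay dependency-free, it only looks at `scope`, `meta_scope` and `meta_content_scope` lines, and the sample syntaxes are invented for the example.

```python
import re
from collections import Counter

# Assumed layout: "scope: a.b.c" style lines, optionally starting a list item.
SCOPE_LINE = re.compile(
    r"^[ \t]*(?:-[ \t]*)?(?:scope|meta_scope|meta_content_scope)"
    r"[ \t]*:[ \t]*(\S.*?)[ \t]*$",
    re.MULTILINE,
)


def scopes_in(syntax_text, suffix):
    """Collect the scopes of one syntax, without the language suffix."""
    tail = "." + suffix
    found = set()
    for match in SCOPE_LINE.finditer(syntax_text):
        # One line may carry several space-separated scopes.
        for scope in match.group(1).split():
            if scope.endswith(tail):
                scope = scope[: -len(tail)]
            found.add(scope)
    return found


def summarize(syntaxes, max_depth=3, min_syntaxes=2):
    """Count syntaxes per scope, dropping rare and deeply nested scopes."""
    counts = Counter()
    for suffix, text in syntaxes.items():
        counts.update(scopes_in(text, suffix))  # a set: one vote per syntax
    return {
        scope: n
        for scope, n in sorted(counts.items())
        if n >= min_syntaxes and scope.count(".") + 1 <= max_depth
    }


# Invented miniature syntaxes, keyed by their language suffix.
SAMPLE = {
    "python": """scope: source.python
contexts:
  main:
    - match: def
      scope: storage.type.function.python
    - match: arg
      scope: variable.parameter.python
""",
    "js": """scope: source.js
contexts:
  main:
    - match: arg
      scope: variable.parameter.js
    - match: name
      scope: variable.other.readwrite.js
""",
}
print(summarize(SAMPLE))
```

Scopes assigned through `captures` are ignored here, so a real extraction would need to handle those as well.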

0 Likes

#7

Bathos, that’s nice :smiley:
I like the idea of a less strict hierarchy on scopes.

For example, why should we use ‘punctuation.section.for’ instead of ‘for.punctuation’?
With your approach both are possible depending on the preference of the Theme maker.

0 Likes

#8

I’m very much interested in adopting a scope naming convention of our own. The TextMate conventions do a good job for the most part, but some syntax definitions just don’t exactly follow them (maybe because of misunderstandings?), resulting in oddly colored files occasionally. The most recent example is the FreeMarker template engine syntax, which decided to scope each tag as entity.name.function, which I have a background color set for in my color scheme, and it was absolutely terrible.

I believe that we should specify a scope name depth of 2 as the minimum and let syntax definition writers have some freedom after that, much like the TextMate conventions did with variable.other as a kind of “catch-all”. Color schemes should define a color for at least a certain set of top-level scope names, i.e. support or keyword, and then go further when they see fit.


Other than that, I attempted to “standardize” the punctuation scopes a while ago and maybe you could put that to use:
https://github.com/SublimeText/PackageDev/blob/f29ddfdaf4c61d29b495d8ec0bb81bfdad86ada3/scope_data/init.py#L60-L75

Furthermore, I did something similar to the parsing with one person on IRC some time ago, where we downloaded color schemes and analyzed the scope names they used. We used aziz’s online tmTheme editor as the source. The result can be found here: https://gist.github.com/FichteFoll/986bfd00864c7def37bf

1 Like

#9

As we can see from all the complaints following the minor changes in the JavaScript syntax scopes, scope naming is a crucial issue!

I looked at the scopes used under variable, and I don’t find them very coherent.
For example, what does variable.other.readwrite mean? Why not just variable?
And what about variable.constant? Is it a joke!?

Also, I’d like to add a variable.type scope to label all types and classes appearing outside their declaration.
It would also allow us to replace entity.other.inherited-class with variable.type.inherited.

variable 130
variable.function 2
variable.import 3
variable.language 20
variable.language.omitted 2
variable.language.self 2
variable.other 61
variable.other.class 2
variable.other.constant 2
variable.other.dot-access 2
variable.other.global 3
variable.other.object 2
variable.other.predefined 2
variable.other.property 4
variable.other.readwrite 14
variable.other.regexp 4
variable.parameter 36
variable.parameter.function 12
variable.parameter.handler 2

PS: thanks FichteFoll, I’ll have a look at it

0 Likes

#10

I downloaded all themes (299) from ColorSublime and extracted the scopes used by them.
The themes available are heterogeneous because ColorSublime doesn’t provide a GUI to create themes. Therefore I hope that they are representative of the themes on Package Control.

My first remark is that a lot of themes target punctuation scopes, which are not documented. Conventions seem to exist, but there are some inconsistencies.
For instance, while HTML, JS and Ruby use punctuation.separator.key-value, JSON uses punctuation.separator.dictionary.key-value.

While entity.name.class is mentioned in 219 themes, it only appears in the Scala syntax!

There is no convention on ‘modules’ so they are called module, namespace or package across different syntaxes.

List of scopes present in more than 2 themes
Manually cleaned version
The numbers are the number of themes using a given scope.
The code is available on Github

3 Likes

#11

Here is a list of scopes I’d like to deprecate:

  • entity.other.attribute-name: deprecated, use variable.parameter
  • entity.other.inherited-class: deprecated, use variable.type.inherited
  • support.class: deprecated, use support.type
  • support.variable: deprecated, use variable.language
  • support.constant: deprecated, use constant.language
  • punctuation.terminator.statement: deprecated, use punctuation.separator.statement

All these changes can be made automatically, if they are adopted.
The first two are widely used.

entity.other.attribute-name would be replaced by variable.parameter which is already present in 85% of themes.
entity.other.inherited-class would be replaced by variable.type.inherited.
This scope doesn’t exist in themes yet, but there is a real lack of a variable.type scope.
Even though most people likely use non-typed languages inside Sublime, it would be nice to improve support for typed languages.

support.class is used in only 9 languages, and most themes treat support.class and support.type the same way. So not a big change.

While most themes match both support.variable and support.constant, these are duplicates of variable.language and constant.language.
I think we should deprecate one of each pair.
In practice *.language seems a little more used than support.*.

punctuation.terminator.statement is only found in JavaScript and in 2% of themes, while punctuation.separator is present in 31% of themes.
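To show what “made automatically” could look like, here is a small Python sketch that rewrites the scopes listed above inside a selector string. The mapping is copied from the list in this post and was still under discussion; the function name and the simplified idea of what counts as a scope name are assumptions for the example.

```python
import re

# Renames proposed in this post (not an adopted convention).
DEPRECATED = {
    "entity.other.attribute-name": "variable.parameter",
    "entity.other.inherited-class": "variable.type.inherited",
    "support.class": "support.type",
    "support.variable": "variable.language",
    "support.constant": "constant.language",
    "punctuation.terminator.statement": "punctuation.separator.statement",
}


def migrate_selector(selector, renames=DEPRECATED):
    """Rewrite deprecated scope prefixes inside a selector string."""

    def fix(match):
        scope = match.group(0)
        for old, new in renames.items():
            # Only rewrite whole scopes or dot-separated prefixes.
            if scope == old or scope.startswith(old + "."):
                return new + scope[len(old):]
        return scope

    # Simplification: a scope name starts with a letter and may contain
    # letters, digits, '_', '.', '+' and '-'. Operators are left untouched.
    return re.sub(r"[A-Za-z][\w.+-]*", fix, selector)


print(migrate_selector("support.class, support.constant.json"))
print(migrate_selector("source.js -punctuation.terminator.statement"))
```

Matching on whole prefixes rather than raw substrings keeps an unrelated scope such as `support.classes` from being rewritten by accident.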

2 Likes

#12

I believe we should keep the support top-level-scope for language-provided identifiers/features. Theoretically it could be anything in most languages since those identifiers are not reserved keywords and can thus be overridden, but even then I believe that highlighting those differently than variable is valuable. And specifying scope selectors for variable.*.language is annoying. It would be much better if it was variable.language.*, in which case we might as well just stick to the rather well-established support.*. Thus, I vote to deprecate variable.*.language.* instead.
That way we would also keep support.constant (and support.variable.constant? since you can override boolean constants in Python 2, for example, and that makes a difference imo).

I’m really unsure about entity.other.attribute-name. Narrowing that down to variable.parameter is too generic imo, but variable.parameter.attribute-name could work.

Other than that, I agree with deprecating support.class and punctuation.terminator.statement.


Since this would break many color schemes in their highlighting (especially if we were to change variable.other.attribute-name afaik), we need to think of a migration or conversion procedure too. I can take care of the Nil color schemes, for example.

0 Likes

#13

I think the preference between support.* and *.language is the kind of thing that should be handled by the theme, not the syntax.

We could have the same discussion about *.function and function.*. The second would match my highlighting preference more closely, but the convention is to distinguish between support.function, entity.name.function and variable.function.

For the ‘support’ I personally don’t have any preference, so given your experience, I’ll side with you on this one :smiley:.

My initial hope when I started this was to find a convention that would make the life of theme creators easy. But now that I have read a lot of themes, I realize that there exists a wide variety of themes, and changing the convention to simplify half of the themes won’t simplify it for the other half. That’s why my suggestions for improvements are quite minor!

My conclusion is that it would be nice to improve the expressiveness of scope selectors. I’d love to be able to use ‘*.function.*’ to match all scopes containing ‘function’, and I understand that other people prefer to distinguish between user and support functions.
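The kind of matching wished for here, where a name counts wherever it sits in the scope, can be sketched in a few lines of Python. This is a hypothetical helper for illustration only; it is not something Sublime’s selector engine offers.

```python
def has_segment(scope, segment):
    """True if `segment` appears as a whole dot-separated part of `scope`."""
    return segment in scope.split(".")


scopes = [
    "support.function.builtin",
    "entity.name.function",
    "variable.function",
    "meta.functional",
]
# Keeps the first three; "meta.functional" only contains the substring.
print([s for s in scopes if has_segment(s, "function")])
```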

0 Likes

#14

On the migration issue, I was thinking of doing it automatically.
The changes I propose always introduce an alternative scope, so it should be doable to create a script.
First I was thinking about reading the ‘.tmTheme’ with the Python ‘plistlib’ module, modifying the scopes in it and saving it back. But I’m afraid this might lose comments or de-synchronize the ‘.tmTheme’ from an associated ‘.YAML-tmTheme’.
The second option is reading the file line by line and replacing parts of it, but it’s more error-prone.

If such a script exists (I’d take care of making one), we could send a PR to the most popular themes, modify those included in Sublime, and let other themers port the change themselves if they think it’s worth it. End users could also run the script locally (through a plugin maybe) to modify their theme if the one they are using isn’t maintained.
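A minimal sketch of the first option, using `plistlib` from the Python standard library, could look like this. The function name and the sample theme are made up for the example, and the rename is a plain string replacement, so a real script would need stricter matching.

```python
import plistlib


def migrate_theme(data, renames):
    """Return tmTheme bytes with the scope selectors rewritten."""
    theme = plistlib.loads(data)
    for rule in theme.get("settings", []):
        scope = rule.get("scope")
        if scope:
            # Plain substring replacement: simple, but it can also hit
            # longer names that merely start with a deprecated scope.
            for old, new in renames.items():
                scope = scope.replace(old, new)
            rule["scope"] = scope
    # Serializing from the parsed dict drops any XML comments.
    return plistlib.dumps(theme)


# Made-up miniature theme, built in memory instead of read from disk.
sample = plistlib.dumps({
    "name": "Example",
    "settings": [
        {"settings": {"foreground": "#FFFFFF"}},
        {
            "name": "Library class",
            "scope": "support.class",
            "settings": {"foreground": "#66D9EF"},
        },
    ],
})
migrated = plistlib.loads(migrate_theme(sample, {"support.class": "support.type"}))
print(migrated["settings"][1]["scope"])
```

Because the file is rebuilt from the parsed data, comments in the original XML do not survive the round trip, which is the drawback mentioned above.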

0 Likes

#15

@wbond I just wanted to say that I’m a bit surprised about the new official scope conventions.

The documentation says:

The names of data structures will use one of the following scopes, or a new sub-scope of `entity.name` – this list is not exhaustive. To provide rich semantic information, use the specific terminology for a given language construct.

Avoid `entity.name.type.class` and `entity.name.type.struct` which unnecessarily nest scope labels under type.

    entity.name.class
    entity.name.struct
    entity.name.enum
    entity.name.union
    entity.name.trait
    entity.name.interface
    entity.name.type

For me the goal of scope names was to abstract language conventions away from color scheme writers and plugin developers, and you are going in the opposite direction.

An example of what could be achieved with more coherent scoping is this plugin for inserting Doxygen comments in a file.
In this plugin I ask Sublime to find the parts of the file matching some broad scopes that are not specific to a language, and use them to insert documentation.

public class Foo {
    public void Main(String someArgument){ ... }
}

Becomes:

public class Foo {
    /// <summary>Main summary</summary>
    /// <param name="someArgument">someArgument</param>
    public void Main(String someArgument){ ... }
}

I wrote this long ago and the code is far from perfect. My point is that a developer from language X could write a plugin that works with several languages at once without worrying about the differences in C++ between a class and a struct.

I’m afraid that the documentation in its current state encourages people to overfit the scopes to their languages and to not care about the conventions used by plugins and schemes.

I think entity.name.type was something quite clear, which can benefit from refinement, but in a nested way (entity.name.type.xxx). I agree that these scopes were a bit long, but now I have to write entity.name -entity.name.function -entity.name.namespace instead of entity.name.type. I’m not sure that’s a win.
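To make the cost of that longer selector concrete, here is a deliberately simplified Python model of prefix matching with `-` exclusions. It handles one positive term plus exclusions against a single scope name and ignores everything else real selectors support (nesting, `,`, `|`, `&`, parentheses), so treat it as an illustration and not as Sublime’s matching algorithm.

```python
def is_prefix(term, scope):
    """True if `term` equals `scope` or is a dot-separated prefix of it."""
    return scope == term or scope.startswith(term + ".")


def matches(selector, scope):
    """Simplified model: first term must match, '-' terms must not."""
    positive, *rest = selector.split()
    excluded = [term[1:] for term in rest if term.startswith("-")]
    return is_prefix(positive, scope) and not any(
        is_prefix(term, scope) for term in excluded
    )


long_form = "entity.name -entity.name.function -entity.name.namespace"
for scope in ["entity.name.class", "entity.name.struct", "entity.name.function.java"]:
    # The old nested form only needed "entity.name.type" to cover all types.
    print(scope, matches(long_form, scope), matches("entity.name.type", scope))
```

Under the flat naming, `entity.name.type` no longer covers classes or structs, which is why the exclusion list is needed.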

0 Likes

#16

Also I wanted to speak about the storage.type scope.

I know you already had a long discussion about it on Github, so I’m going to try to be clear and short.

In Python I write:

def foo(a, b):

In Java:

public int foo(int a, int b){

For me the def and public are not important to understand the code. They are boilerplate, which is why I put a low-accent color on them (on the storage scope).

But the int in Java is actually quite important to understand what’s going on.
I want it in a more visible color. In the current state of things, I have to change the color of storage.type.java to change the color of just the return types. And it’s not only Java; it’s the same for all statically typed languages.

That’s why I think that a dedicated scope should be created to differentiate the def of Python from the int of Java.

0 Likes