Everything you (n)ever wanted to know about indentation in ST3
Hello, I’d like to talk about indentation in ST3. No, not that old tabs vs spaces argument but actually how automatic indentation works in the editor.
I’m sure we can all agree that, from the user side, auto indentation in ST3 is pretty good, with only a few edge cases where it misses the mark. But from a technical perspective, there is room for improvement. I therefore thought it would be useful to document the current indentation behavior (facts) first; afterwards, we can say what we like and don’t like (opinions) to help the ST devs decide on a way forward that both makes it nicer from the technical side and fixes those pesky edge cases while we’re at it. I’m making this a community wiki so that people can edit this post with things I’ve missed or not expressed clearly enough - I find it quite tricky to decide how best to present information and ideas.
For the purpose of this discussion, assume the `auto_indent` and `smart_indent` preferences are set to `true` (the default). This guide is valid as of build 3126. Obviously, if changes are made to the indentation engine in the future, this article will mainly be relevant as an archive reference. Note that every regex pattern mentioned hereafter uses the Oniguruma engine, which supports recursion and thus allows one to craft a regex for (un)balanced parens etc.
Indentation when pressing Enter
The main aspect of auto indentation increasing the indent level occurs when pressing Enter, anywhere on a line.
Note that some keybindings override Enter with their own logic by inserting a snippet etc., but otherwise, the logic is currently based on the `.tmPreferences` metadata files: https://docs.sublimetext.io/reference/metadata.html#indentation-options. (The `insert` command processes indentation logic, but the `insert_snippet` and `append` commands don't.) The scope selector is equivalent to the `eol_selector` context, wherein it works on the scope of the `\n` character at the end of the line. All the regular expression patterns match against one line of text.
All subsequent lines
Example: type an opening XML tag or a curly brace `{` and press Enter.
This is handled by the `increaseIndentPattern` regex.
A single line
For example, typing an if statement without an opening brace `if (x)` Enter. In this case, only the next line is indented, because the "if" will apply to only a single statement/line. Then, if a `{` is typed, it gets unindented again, because most people want their opening and closing braces in the same column as the start of the "if". See also the "Adjusting indentation when you type on a line" section.
This is also potentially useful for incomplete statements (split over multiple lines). For example, some languages require statements to be terminated with a semicolon `;`, and visually indenting the continuation lines makes them more readable:

    a.b().c.d.e.f
        .g().h.i();

as opposed to

    a.b().c.d.e.f
    .g().h.i();

This is handled by the `bracketIndentNextLinePattern` regex - presumably the `bracket` part relates to the `{` behavior, although it is actually not hardcoded, and is instead overridden by a separate `disableIndentNextLinePattern` regex. The naming of this is perhaps a little confusing, because the disable regex applies to the "next line" - the one whose indentation was caused by the `bracketIndentNextLinePattern`. ST3 handles this disable by default in `Packages/Default/Indentation Rules.tmPreferences`.
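To make the interplay of the three patterns a bit more concrete, here is a minimal Python sketch. It is only a model of the behavior described above, not ST's implementation: the three regexes are hypothetical examples for a C-like language, Python's `re` stands in for Oniguruma, and the function names are made up. It also does not model the return to the original level on the line after a single-line indent.

```python
import re

# Hypothetical patterns for a C-like language; real packages define their own.
INCREASE = re.compile(r".*\{\s*$")                          # increaseIndentPattern
BRACKET_NEXT = re.compile(r"\s*(if|while|for)\b.*\)\s*$")   # bracketIndentNextLinePattern
DISABLE_NEXT = re.compile(r"\s*\{")                         # disableIndentNextLinePattern


def level(line):
    """Count the leading tabs of a line."""
    return len(line) - len(line.lstrip("\t"))


def indent_after_enter(line):
    """Indent level of the new line created by pressing Enter at the end of `line`."""
    if INCREASE.match(line) or BRACKET_NEXT.match(line):
        return level(line) + 1
    return level(line)


def indent_while_typing(previous_line, current_text, current_level):
    """Undo a single-line indent once the typed text matches the disable pattern."""
    if BRACKET_NEXT.match(previous_line) and DISABLE_NEXT.match(current_text):
        return level(previous_line)
    return current_level
```

With these assumed patterns, pressing Enter after `if (x)` yields one extra level, and typing `{` on that new line brings it back to the level of the `if`.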
But “wait”, I hear you say, “most packages I’ve seen have the key `indentNextLinePattern` instead of `bracketIndentNextLinePattern` in their metadata file!”
Yes, and ST ignores it - maybe some plug-ins make use of it though. (They can do so using `view.meta_info('indentNextLinePattern', view.sel()[0].begin())` - note that this method is currently missing from the official documentation.)
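As a rough illustration of how a plugin might read such a key, the helper below wraps the `view.meta_info` call quoted above. `indentation_setting` is a made-up name, and `FakeView`/`FakeRegion` are hypothetical stand-ins so that the snippet can be tried outside ST; inside a plugin you would pass a real view instead. The stub returns `None` for a missing key, which is an assumption about what ST does.

```python
def indentation_setting(view, key):
    """Return the metadata value for `key` at the first caret position."""
    point = view.sel()[0].begin()
    return view.meta_info(key, point)


class FakeRegion:
    """Hypothetical stand-in for a selection region."""

    def __init__(self, point):
        self.point = point

    def begin(self):
        return self.point


class FakeView:
    """Hypothetical stand-in for a view, backed by a plain dict of metadata."""

    def __init__(self, meta):
        self.meta = meta

    def sel(self):
        return [FakeRegion(0)]

    def meta_info(self, key, point):
        # Assumption: a missing key yields None.
        return self.meta.get(key)
```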
lines containing only whitespace and comments are ignored
The auto-indenter will ignore lines matching the `unIndentedLinePattern` regex when computing the next line’s indentation level. This mainly seems to be used to ensure that `bracketIndentNextLinePattern` can affect multiple lines when the first line(s) is/are purely comment lines.
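One way to picture this is as a search for the nearest line above that is not ignored. The sketch below assumes a hypothetical `unIndentedLinePattern` matching blank lines and `//` comment lines; the function name is made up, and Python's `re` is used in place of Oniguruma.

```python
import re

# Hypothetical unIndentedLinePattern: blank lines and //-comment-only lines.
UNINDENTED = re.compile(r"\s*(//.*)?$")


def reference_line(lines, index):
    """Return the nearest line above `index` that is not ignored, or None."""
    for line in reversed(lines[:index]):
        if not UNINDENTED.match(line):
            return line
    return None
```

For the lines `if (x)`, `// explain`, an empty line and `y();`, the reference line for `y();` is `if (x)`, so a pattern matching the `if` can still affect it.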
other built-ins
The user preference `indent_to_bracket` adds whitespace up to the first open bracket when indenting. It also affects unindenting: the close paren is placed in the column after the open paren.

```js
foobar(
       okay(
            'yeah'
            )
       )
```
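The column that `indent_to_bracket` aligns to can be sketched as follows. This is a guess at the rule based on the paragraph above, not ST's code: it assumes the innermost unclosed paren is the one that counts, it treats every character as one column wide, and it ignores strings and comments. `bracket_column` is a made-up name.

```python
def bracket_column(line):
    """Return the column just after the innermost unclosed '(' in `line`, or None."""
    stack = []
    for col, ch in enumerate(line):
        if ch == "(":
            stack.append(col)
        elif ch == ")" and stack:
            stack.pop()
    return stack[-1] + 1 if stack else None
```

For a line such as `foobar(` the result is 7, so the next line would be padded with seven spaces.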
`indentSquareBrackets` and `indentParens` ensure that an open/unclosed square bracket `[` or paren `(` will cause subsequent lines to be indented an extra level. However, they don't have any effect on unindenting the current line when the closing bracket is typed, so they must be combined with a `decreaseIndentPattern` for consistency. See [the JSON indentation rules](https://github.com/sublimehq/Packages/blob/7ef80d531b752baee46f792b6bc6b26206e56012/JavaScript/JSON%20Indent.tmPreferences#L26) for an example. They do, however, unindent the *next* line when typing a closing paren/square bracket that doesn't have a corresponding opening bracket on the same line (ignoring those in a `string` or `comment`).
Adjusting indentation when you type on a line
The main aspect of auto indentation decreasing the indent level occurs when typing on a line, for example a closing brace `}`, keywords such as `end if`, or closing SGML tags.
But it can also occur when you have typed an if statement on one line and then an opening brace on the next line: `if (x)` Enter `{`. If you then delete that brace, it will increase the indentation level by one again. This means that ST has to keep track of whether it has increased or decreased the indentation while you typed on the line. See also “Indentation when pressing Enter/A single line”.
Manually adjusting the indentation level on the line by any means will prevent ST from automatically changing the indentation level as you type on the line. As will typing on a different line. So it seems ST analyzes the line between pressing Enter and typing on a different line - just moving the caret off the line and back again without typing doesn’t affect it.
This is handled by the `decreaseIndentPattern` regex.
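The bookkeeping described in this section could look roughly like the sketch below. It is an assumption-laden model rather than ST's implementation: the `decreaseIndentPattern` is a hypothetical one matching a leading `}`, the class and method names are made up, and restoring the level when the match disappears is extrapolated from the brace-deletion behavior described above.

```python
import re

DECREASE = re.compile(r"\s*\}")  # hypothetical decreaseIndentPattern


class LineState:
    """Remembers whether the automatic unindent was already applied to this line."""

    def __init__(self, level):
        self.level = level
        self.unindented = False

    def on_text_changed(self, text):
        """Update and return the indent level after the line's text changed."""
        matches = DECREASE.match(text) is not None
        if matches and not self.unindented:
            self.level = max(self.level - 1, 0)
            self.unindented = True
        elif not matches and self.unindented:
            # The match went away again, e.g. the brace was deleted.
            self.level += 1
            self.unindented = False
        return self.level
```

Typing `}` on a line at level 2 drops it to level 1 once, further typing leaves it there, and deleting the brace restores level 2.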
Reindenting the whole file or a selected part of the file at once, on demand
One can ask ST to reindent any part of the file on command. (Note that this is different to reformatting i.e. the only changes are to indentation, no non-whitespace characters are moved to different lines etc. Reformatting is best left to plugins and thus is out of scope for this discussion.)
Reindentation uses the indentation of the line above as a reference point to continue from, so even if the file above the part being reindented doesn’t completely follow the indentation rules, the part being reindented can still look correct/not out of place.
Also, it is possible to “Paste and Indent” (see the Edit menu) in one go. This doesn’t perform a paste followed by a reindent, as one might expect; instead it ensures the indentation of the pasted text fits the place where it was pasted, i.e. indentation is added to or removed from the pasted lines so that they keep the same relative indentation but start from the indentation level at the paste location.
For example, if your original text was `hello\n\tworld\n` with your cursor at the end and you pasted `\t\t\tfoo\nbar`, you’d normally get `hello\n\tworld\n\t\t\tfoo\nbar`, but using paste and indent, you’d get `hello\n\tworld\n\t\t\t\tfoo\n\tbar`.
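The shifting in this example can be reproduced with a small Python sketch. It assumes the rule is "move the least-indented pasted line to the target level and shift the rest by the same amount", which agrees with the example but may not be exactly what ST does. The function names are made up, and `target_level` is taken to be the indentation of the line above the paste location (one tab here).

```python
def leading_tabs(line):
    """Number of tab characters at the start of `line`."""
    return len(line) - len(line.lstrip("\t"))


def paste_and_indent(pasted, target_level):
    """Shift all pasted lines so the least-indented one sits at `target_level`."""
    lines = pasted.split("\n")
    base = min((leading_tabs(l) for l in lines if l.strip()), default=0)
    shift = target_level - base
    shifted = []
    for l in lines:
        if l.strip():
            shifted.append("\t" * (leading_tabs(l) + shift) + l.lstrip("\t"))
        else:
            shifted.append(l)  # leave blank lines untouched
    return "\n".join(shifted)
```

Calling `paste_and_indent("\t\t\tfoo\nbar", 1)` gives `"\t\t\t\tfoo\n\tbar"`, matching the result described above.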
The Tab key is bound to the `reindent` command by default when there is no selection and the caret is on a completely empty line (not even whitespace present), so one can press it once to instantly get the caret at the correct indentation level.
Currently ST supports using different rules when reindenting in "batch" mode. If `batchIncreaseIndentPattern` or `batchDecreaseIndentPattern` is defined, it takes precedence over `increaseIndentPattern` or `decreaseIndentPattern` respectively.
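Expressed as a lookup, the precedence rule amounts to something like the following sketch. The `meta` dict and the `pattern_for` helper are hypothetical; only the four key names come from the text above.

```python
def pattern_for(meta, name, batch):
    """Pick the batch variant of `name` during batch reindentation, if defined."""
    if batch:
        batch_key = "batch" + name[0].upper() + name[1:]
        if batch_key in meta:
            return meta[batch_key]
    return meta.get(name)
```

With only `batchIncreaseIndentPattern` defined as an override, a batch reindent would use it for increases but fall back to the regular `decreaseIndentPattern` for decreases.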
`preserveIndent` (a metadata boolean) can be used to ensure that lines are not moved during batch indentation. This is useful for multiline comments, especially "docblock"s, so that leading whitespace inside the comment is not obliterated.
Note that `unIndentedLinePattern` also functions similarly to `preserveIndent`, in that a line matching this pattern has no indentation rules applied to it upon batch reindentation. However, there are a few differences:
- `unIndentedLinePattern` takes effect if the first non-whitespace token on the line matches the scope selector given in the `tmPreferences` file (and the regex matches the line). If the line is the first (or only) consecutive line that matches this rule, and it had some indentation, its indentation will be changed to match that of the line above. If the `\n` matched the scope, then the next line, if it had no indentation, will be indented to the same level as this one. Otherwise the line will stay at its existing indentation level.
- `preserveIndent` takes effect if the `\n` at the end of the line matches the scope selector given in the `tmPreferences` file, or the `\n` of the line above matched. Whatever indentation the first line had, the next line’s indentation is changed to match this one’s.
Therefore, you will observe different results if you unindent your file before reindenting it.
Removing automatic indentation when moving off an otherwise empty line
There is a user preference called `trim_automatic_white_space` to control this behavior.
General `tmPreferences` info
The usual scope selector specificity precedence rules apply if there are multiple metadata files that apply to the scope at the end of the line, and the overrides work as you’ve come to expect from ST.
If the PList file contains a key followed by an empty string, then that pattern is overridden with a blank regex and consequently the pattern doesn’t apply / is disabled. To always enable a pattern, use `<string>.</string>`.
The rules are executed with a left-aligned match, so most regex patterns will typically start with `\s*`. The `^` anchor is implied, but there is no implicit `$` anchor.
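Python's `re.match` happens to have the same anchoring (start of the string only), so it makes a convenient way to try out what a pattern will and won't match. The sketch below is an approximation: `rule_applies` is a made-up helper, the patterns are examples, and Python's `re` syntax differs from Oniguruma in places.

```python
import re


def rule_applies(pattern, line):
    """Mimic how a metadata pattern is tested against one line of text."""
    if pattern == "":
        return False  # an empty <string> disables the rule
    # re.match anchors at the start of the line only: implied ^, no implied $.
    return re.match(pattern, line) is not None
```

So `\s*\}` applies to `  } // done` (trailing text is fine) but not to `x }`, because the match has to begin at the start of the line.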
Summary
- `increaseIndentPattern` affects the next line
- `decreaseIndentPattern` affects the current line
- `bracketIndentNextLinePattern` affects the next line
- `disableIndentNextLinePattern` affects the current line if the line above was matched by `bracketIndentNextLinePattern`
Limitations / Drawbacks / Known Bugs in the current implementation
- `bracketIndentNextLinePattern` seems to be cumulative. So if one wanted incomplete statements to be indented, statements spanning 3+ lines get indented too far and are not restored to the original level after a `;`.

      a.b()
          .c()
              .d()
                  .e();
              f();
      // instead of
      a.b()
          .c()
          .d()
          .e();
      f();

- Currently, it is only possible to automatically adjust indentation one level at a time, which affects `switch` statements (see the forum thread “Configure auto-indent with multiple scopes per line”).
- A combination of the above two points: the same applies to multiple `if` statements without braces:

      if (true)
          if (false)
              cool();
          this_should_be_one_level_backwards();

- Regular expressions are potentially duplicated from the syntax definition in the metadata file - if the syntax gets updated, does anyone remember to check the indentation regexes are still relevant?
- Having different reindentation rules for batch reindentation can be confusing and lead to unexpected behavior, especially with `trim_automatic_white_space` enabled: https://github.com/SublimeTextIssues/Core/issues/1583
- Instead of a user preference, users have to override a metadata file to get comments to be reindented in batch mode https://github.com/SublimeTextIssues/Core/issues/1271, but then it's a trade-off between comments not being moved or docblocks being aligned wrongly.
- Because the scope selectors operate at EOL, and the regular expressions only have the one line of context to match against, it is not possible to reliably skip block comments that don’t cover the newline character. For instance, using a `<scope>comment</scope>` selector and matching `<key>unIndentedLinePattern</key><string>.</string>` would be a generic solution that would save each language from needing to override this regex pattern (probably in the past, single line comments didn’t scope the `\n` character, so this technique couldn’t be used), but it still causes e.g. `if (true) /* test example */` Enter to lose the indentation from `bracketIndentNextLinePattern` after the comment, because one can’t guarantee that `*/\s*$` ended a comment.
- Related to some of the items mentioned already is batch reindenting multi-line `if` statements. Unless the syntax definition has a unique meta scope on the `if`, it’s hard to use regexes that would handle this correctly based on a single line of context - see http://stackoverflow.com/questions/41571959/sublime-text-3-indentation-for-multi-line-statements-in-php
  `if (VeryLongThingThatTakesUpALotOfRoom || OtherQuiteLongThingSoINeedTwoLines) { statement1(); statement2(); }`
  reindents to
  `if (VeryLongThingThatTakesUpALotOfRoom || OtherQuiteLongThingSoINeedTwoLines) { statement1(); statement2(); }`
- There are times when unindenting doesn’t work properly when an increase in indentation is immediately followed by a decrease (or with only blank/whitespace lines in between): https://github.com/SublimeTextIssues/Core/issues/1262
- There is no way to configure `indent_to_bracket` to place the close paren in the same column as the open paren, and it only works with parentheses, not anything else like square brackets.
- There is currently no way for users to easily customize snippet brace style: https://github.com/sublimehq/Packages/issues/131
- The `reindent` command acts weirdly with lines that don’t conform to the tab stop size. For example, using a tab size of 2 spaces, the following PHP code:
  `<?php function test_indent() { if (condition) { echo "here"; } else { echo "else"; } }`
  reindents to
  `<?php function test_indent() { if (condition) { echo "here"; } else { echo "else"; } }`
  Removing all existing indentation first helps the command to work as expected - http://stackoverflow.com/questions/42510367/imporper-sublime-indentation-for-php-files#42510367
- One can’t easily unindent a line based on the previous line.
Proposals
Please feel free to comment/reply with your ideas and proposals. As mentioned before, if there is any factual information here that could be improved, please edit this post.