Sublime Forum

**Ap0c552** · February 8, 2017, 5:42pm

I have been using a build system regex which I found on the internet.

Now I am trying to understand how it works.

It is being used on Android ndk-build errors. When an error occurs, Sublime correctly navigates to the file and the line number, even though I have not specified the line_regex variable.

How does Sublime know what line to go to?

My regex is as follows…

"file_regex": "^(..[^:]):([0-9]+):?([0-9]+)?:? (.)$"

and an error looks like…

head/src/MyText.cpp:23:27: error: expected ‘;’ after expression

**OdatNurd** · February 8, 2017, 5:54pm

The regex in file_regex can contain up to four regex captures to capture error information. In order, the captures are used for the file name, line number, column number and error message. The regex you’re using matches the file name and location, so Sublime knows where to go.
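To make the four captures concrete, here is a minimal Python sketch that applies the file_regex under discussion (with its * characters in place) to the error line from the first post. Python's `re` module stands in for whatever engine Sublime uses internally, and the dictionary keys are names picked for this example only; Sublime identifies the captures purely by their position.

```python
import re

# The file_regex from this thread, with its * characters in place.
FILE_REGEX = re.compile(r"^(..[^:]*):([0-9]+):?([0-9]+)?:? (.*)$")

error_line = "head/src/MyText.cpp:23:27: error: expected ';' after expression"

match = FILE_REGEX.match(error_line)

# Illustrative names only: capture 1 is the file, 2 the line,
# 3 the column and 4 the message.
captures = {
    "file": match.group(1),
    "line": match.group(2),
    "column": match.group(3),
    "message": match.group(4),
}

for name, value in captures.items():
    print(f"{name}: {value}")
```

All four values come out of a single match against a single line, which is why no line_regex is needed for this kind of output.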

line_regex is only needed in situations where the error file and error line number don’t appear on the same line in the build output. In that case, you would use line_regex with a capture to catch where the error line is, and when that regex matches Sublime will go backwards through the build output looking for a match in file_regex to figure out the file name.
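The backwards search described above can be sketched in Python as follows. This is only an emulation of the behaviour as described in this post, not Sublime's actual implementation. The build output format, both regexes and the `locate_errors` helper are hypothetical and exist only for this example.

```python
import re

# Hypothetical output format: the file name is on one line and the
# line numbers of its errors appear on the lines that follow it.
FILE_REGEX = re.compile(r"^Compiling (.+)$")
LINE_REGEX = re.compile(r"^\s+line ([0-9]+): (.*)$")


def locate_errors(output_lines):
    """Pair each line_regex match with the nearest earlier file_regex match."""
    errors = []
    for index, text in enumerate(output_lines):
        line_match = LINE_REGEX.match(text)
        if line_match is None:
            continue
        # Walk backwards through the earlier output to find the file name.
        for previous in reversed(output_lines[:index]):
            file_match = FILE_REGEX.match(previous)
            if file_match is not None:
                errors.append(
                    (file_match.group(1), int(line_match.group(1)), line_match.group(2))
                )
                break
    return errors


sample_output = [
    "Compiling src/main.c",
    "  line 12: unexpected token",
    "  line 40: missing return",
    "Compiling src/util.c",
    "  line 7: unknown type",
]

errors = locate_errors(sample_output)
for file_name, line_number, message in errors:
    print(file_name, line_number, message)
```

In this made-up format the file name never shares a line with the line number, which is the situation where a line_regex becomes necessary.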

There is more information on this in the Unofficial Docs on Build Systems.

**Ap0c552** · February 9, 2017, 6:46pm

How are the regex captures separated in the regex?

I have put an error output from a build into a file, and have been trying to run this regex on it in a search, and I am not getting any match.

Here is an example error output…

C:\src\TRUNK\AppSuite\Android\MyNativeApp>call c:\setenv.bat
[armeabi-v7a] Compile++ thumb: mynativeapp <= Class.cpp
[armeabi-v7a] Compile++ thumb: mynativeapp <= Screen.cpp
[armeabi-v7a] Gdbserver : [arm-linux-androideabi] libs/armeabi-v7a/gdbserver
[armeabi-v7a] Gdbsetup : libs/armeabi-v7a/gdb.setup
jni/Folder/Class.cpp:23:27: error: expected ‘;’ after expression
UI::Text::Animate(f)
^
;
jni/Folder/Class.cpp:49:1: error: unknown type name ‘jni’
jni/Folder/Class.cpp:23:27: error: expected ‘;’ after expression
^
jni/Folder/Class.cpp:49:4: error: expected unqualified-id
jni/Folder/Class.cpp:23:27: error: expected ‘;’ after expression
^
3 errors generated.
make: *** [obj/local/armeabi-v7a/objs-debug/mynativeapp/Folder/Class.o] Error 1
make: *** Waiting for unfinished jobs…
jni/Folder2/Screen.cpp:50:44: error: expected ‘;’ at end of declaration
XMLnode action(node("Strings"))
^
;
1 error generated.
make: *** [obj/local/armeabi-v7a/objs-debug/mynativeapp/Folder2/Screen.o] Error 1
[Finished in 3.4s]

**OdatNurd** · February 9, 2017, 6:53pm

Regex captures are represented by the pairs of ( and ); instead of matching actual text, they tell the regex engine that whatever text is matched at those locations should be saved for future use.

In the Replace panel, the captures can be used in the replacement text with codes like \1 for the first capture, \2 for the second, and so on. Here Sublime is using them internally to determine exactly which file, line and column each error is at.

The regex as you have it above is not correct; it’s missing some * characters. Are you sure that’s the one that you’re using in your build system? It is almost but not quite identical to the one that’s used by default in Sublime for C/C++ build output.

For comparison:

Yours:   ^(..[^:]):([0-9]+):?([0-9]+)?:? (.)$
Default: ^(..[^:]*):([0-9]+):?([0-9]+)?:? (.*)$

The * means “0 or more of the preceding item”; without it, the regex only matches file names that are exactly 3 characters long and that happen to generate an error message that’s only a single character long.
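A quick way to see the difference is to run both versions against the same lines. The sketch below uses Python's `re` module as a stand-in for Sublime's engine; `tiny_error` is a made-up line constructed to fit the version without the * characters.

```python
import re

# The regex as posted (no * characters) and the default one.
posted = re.compile(r"^(..[^:]):([0-9]+):?([0-9]+)?:? (.)$")
default = re.compile(r"^(..[^:]*):([0-9]+):?([0-9]+)?:? (.*)$")

real_error = "jni/Folder/Class.cpp:23:27: error: expected ';' after expression"
# Hypothetical line: a 3-character file name and a 1-character message.
tiny_error = "abc:23:27: x"

results = {
    "posted vs real": posted.match(real_error) is not None,
    "posted vs tiny": posted.match(tiny_error) is not None,
    "default vs real": default.match(real_error) is not None,
    "default vs tiny": default.match(tiny_error) is not None,
}

for label, matched in results.items():
    print(f"{label}: {matched}")
```

Only the first combination fails: the version without * cannot get past the third character of a realistic file name.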

**Ap0c552** · February 9, 2017, 11:00pm

For some reason my example was missing the * but the actual regex in my build system was not.

It seems like the default Sublime regex is working for ndk-build output.

When I copied the build output to a text file and ran the proper regex over it, it would match the entire line…

like this one…

jni/Folder/Class.cpp:23:27: error: expected ‘;’ after expression

So really the source of my confusion is this: if the regex matches the entire line, how does Sublime know how to parse out the file path and line number?

**OdatNurd** · February 9, 2017, 11:22pm

That’s due to the regex captures, as I mentioned above. The parts of the matched text that are captured are stored and available to whatever did the match, which in this case is the build output handling in Sublime.

This is something that’s just inherent to regexes in general and is one of the reasons why they’re so powerful. Not only can you craft an expression to match some sequence of text, you can also save parts of what you matched and use them later.

For an illustration, open up a new empty file and paste your error line into it:

jni/Folder/Class.cpp:23:27: error: expected ';' after expression

Then use Find > Replace..., enter the following and hit Replace All:

Find What: ^(..[^:]*):([0-9]+):?([0-9]+)?:? (.*)$
Replace With: 1="\1"  2="\2"   3="\3"   4="\4"

The error line gets transformed, showing you the contents of the four different captures. Sublime uses the first three of them to know exactly where the error occurred, so that it can open that file and go to the precise location. It can also use the fourth to show you the error message in context in the file, if you have that turned on.
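The same experiment can be run outside the editor. The following Python sketch mirrors the Find/Replace step using `re.sub`, on the assumption that Python's `\1`-style backreferences behave like the Replace panel's for this pattern.

```python
import re

FILE_REGEX = re.compile(r"^(..[^:]*):([0-9]+):?([0-9]+)?:? (.*)$")

error_line = "jni/Folder/Class.cpp:23:27: error: expected ';' after expression"

# Same replacement text as in the Replace panel: \1 to \4 are the captures.
replacement = r'1="\1"  2="\2"   3="\3"   4="\4"'

transformed = FILE_REGEX.sub(replacement, error_line)
print(transformed)
```

The whole line is matched and replaced, but each capture is still available on its own in the replacement text.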

**Ap0c552** · February 10, 2017, 7:45pm

So are you saying there are 4 separate regexes, working as 4 separate regexes in addition to working as one? What is this capability called?

What is delineating the 4 regexes?

**OdatNurd** · February 10, 2017, 8:09pm

No, not at all; it’s a single regex with four captures that store the text matched at those positions for later use.

I highly recommend something like this regex tutorial if you’re not familiar with how a regex works; it also contains a section on captures.

For completeness, here is a really simple regex that just matches any two characters:

..

However, with the addition of some parentheses:

(.)(.)

Besides comically looking like boobs, this is still a single regex that matches any two characters, but reads as “match any two characters, but I care about what each of those two characters are, so save them for future use”.

If all you’re trying to do is search for something, the captures (in this case) mean nothing to you; both of them still find instances of two characters. Where the captures get interesting is when you want to do further processing on the matched text.
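Here is a small Python sketch of that point, using `re` as a stand-in engine and `abcd` as arbitrary sample text. Both patterns find exactly the same matches; the version with parentheses additionally remembers each character separately.

```python
import re

text = "abcd"

# Without captures: each match is just two characters.
plain = [m.group(0) for m in re.finditer(r"..", text)]

# With captures: the matched text is identical...
capture_matches = list(re.finditer(r"(.)(.)", text))
with_captures = [m.group(0) for m in capture_matches]

# ...but each character is also saved on its own.
saved = [m.groups() for m in capture_matches]

print(plain)
print(with_captures)
print(saved)
```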

For Sublime build systems, that further processing is working out where the errors are. For other uses, you can use captures for more extensive text transformations. For example, if for some reason you wanted to turn some Lorem ipsum text into Pig Latin, that is as easy as a simple regex that finds a sequence of word characters and uses captures to save parts of the matched text for use in the replacement.
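As a rough sketch of that Pig Latin idea in Python: the rule used here is a deliberately simplified assumption (move the first character of every word to the end and add "ay"), ignoring the usual special cases for vowels and capitalisation.

```python
import re

text = "Lorem ipsum dolor"

# Capture 1 is the first word character, capture 2 is the rest of the word.
# The replacement swaps them around and appends "ay".
pig_latin = re.sub(r"\b(\w)(\w*)\b", r"\2\1ay", text)

print(pig_latin)
```

The search part only finds words; it is the two captures that make it possible to rearrange each word in the replacement.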

**Ap0c552** · February 10, 2017, 11:16pm

Great! Thanks for the explanation. I am learning regex and know some of the basics, but I did not know about “captures”. I think that was the key point I needed explained.

**Cornishman** · February 15, 2017, 9:26am

I have found that a web based regex tester will help refine the build error expression. I have used this one:

You can enter the error text and then adjust the expression until you get the correct data. The matches are colour coded to show you exactly what is being matched by which part of the expression.

**math2001** · February 18, 2017, 5:12am

This has nothing to do with the topic, but I’ve noticed that you were posting your code blocks like this:

> "file_regex": "^(..[^:]):([0-9]+):?([0-9]+)?:? (.)$"

instead of

```
"file_regex": "^(..[^:]):([0-9]+):?([0-9]+)?:? (.)$"
```

Just so that it’d be easier to read your future messages… (https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)

Build System regex