Sublime Forum

Easy Way To Create Syntax Highlighters

#1

Hi,

Just logged on here to make people aware of a web-based system I developed for creating textmate / sublime syntax highlighters.

The name of the system is Iro, and more information (and link) is available here:

https://medium.com/@model_train/creating-universal-syntax-highlighters-with-iro-549501698fd2

The system will let you create Atom / Textmate (xml) / Sublime 3 / Ace Editor / Rouge / Pygments syntax highlighters from a single definition file. It has autocompletion and a built-in debugger that lets you examine the call stack.

Let me know what you think.

Chris

1 Like

#2

Is there an example of a high-quality .sublime-syntax produced using this system?

0 Likes

#3

Click the Sublime 3 tab for sublime 3 syntax (this is untested actually as I don’t use Sublime).

If you want to use ‘set’ then make sure that ‘textmate_compatible’ is set to false.

Full documentation here :

http://eeyo.io/iro/documentation/

Please note I wrote the Sublime code years ago based on blog posts when S3 was being developed. There are likely bugs. The .tmLanguage export is pretty strong I’d say.

0 Likes

#4

I guess this isn’t an open source project so there is no way for us to fix the bugs to make it usable? For example, (\\{) as per your screenshot is going to match a literal backslash followed by a {, as opposed to just a literal {…
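The escaping point can be checked with a small sketch. It assumes the pattern reaches the regex engine with both backslashes intact, as it would from an unquoted or single-quoted YAML scalar, where backslashes are not escape characters. Python's `re` module stands in for the editor's engine here; the two treat these particular escapes the same way. The constant and helper names are made up for illustration.

```python
import re

# Pattern text as the engine sees it when both backslashes survive
# (for example from a single-quoted or unquoted YAML scalar).
TWO_BACKSLASHES = r"(\\{)"
# Pattern text after a double-quoted YAML scalar collapses `\\` to `\`.
ONE_BACKSLASH = r"(\{)"


def first_group(pattern: str, text: str):
    """Return capture group 1 of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(1) if m else None


# The two-backslash form only matches a backslash followed by a brace.
print(first_group(TWO_BACKSLASHES, "class A {"))
print(first_group(TWO_BACKSLASHES, r"class A \{"))
# The one-backslash form matches the plain brace.
print(first_group(ONE_BACKSLASH, "class A {"))
```

So the same characters on screen give different behaviour depending on the YAML quoting style, which is why the quoting used by the exporter matters.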

0 Likes

#5

Not open source, not yet anyway. But does this help ?

0 Likes

#6

yes, although using double quotes harms readability it would at least work :slight_smile: :+1:

minor nitpick: the file_extensions shouldn’t be mandatory, sometimes there are legitimate reasons to create syntaxes without file extensions associated with them

but the debug mouse hover is cool, showing which contexts are active, nice work :slight_smile:

0 Likes

#7

Refresh the page.

Mandatory file extensions are now a thing of the past…

1 Like

#8

Neat initiative! One thing that would be interesting is a way to take an already defined syntax (say, in .sublime-syntax or in .tmLanguage format) and port it to the IRO common format, so that we could easily port syntaxes already created in Sublime to other editors.

Also, sorry for asking without reading the documentation first, but is there a way in the common syntax to do the stack manipulation that sublime allows? Namely, pushing/setting a list of contexts at once. I’m not sure if that behaviour can be replicated in the other formats.

0 Likes

#9

Multi push/set is supported. Of course, this breaks compatibility with textmate as you say, so you just disable textmate compatibility mode. Doing so will mean that Atom and Textmate targets are not generated.

Importing from existing .tmLanguage files is something I’m working on as we speak actually, but it may take a few days. Importing from existing .sublime-syntax files is unlikely to happen anytime soon unfortunately, but it would probably be easier work because sublime syntax is an intrinsically cleaner format.

0 Likes

#10

The debugger looks really neat. I’ve thought about writing a similar tool for .sublime-syntax definitions, but the chief hurdle is the lack of a compatible JavaScript-based regular expression engine. Does Iro do any translation between various regex formats?

0 Likes

#11

Iro does some regex pre-validation and mutation, but not much. It looks for lookbehind expressions and says - hey up - that’s not allowed (because there needs to be a lowest common denominator). As such, I guess if you absolutely require lookbehind expressions, Iro might not be for you - or at least, not the web based version.
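A pre-validation pass of the kind described could look roughly like the sketch below. This is not Iro's code; the function name is hypothetical, and the scan makes simplifying assumptions (for instance, it does not handle a `]` placed first inside a character class).

```python
from typing import List


def find_lookbehinds(pattern: str) -> List[int]:
    """Return the offsets of `(?<=` and `(?<!` groups in a regex pattern.

    Escaped characters and character classes are skipped, and named
    groups such as `(?<name>...)` are not reported.
    """
    hits = []
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif pattern.startswith(("(?<=", "(?<!"), i):
            hits.append(i)
        i += 1
    return hits


for candidate in [r"(?<=foo)bar", r"a(?<!x)b", r"(?<name>x)", r"[(?<=]"]:
    offsets = find_lookbehinds(candidate)
    verdict = "rejected" if offsets else "ok"
    print(candidate, verdict, offsets)
```

A validator built this way can point at the exact offset of the offending group rather than just refusing the whole pattern.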

It also messes around with regular expressions for Rouge and Pygments, because they perform matches across lines (unlike Textmate based languages which are line based). Check out the generated files for those exporters to see the magic.

Iro uses the JavaScript RegEx engine internally. I re-implemented the rules of TextMate based syntax highlighting in my Java codebase, then I used GWT to convert to JavaScript.

If it doesn’t work properly, please let me know.

0 Likes

#12

One bug I found was that the \h escape is silently passed through. In JavaScript, this represents the character h; in Oniguruma’s syntax it represents a hex digit ([A-Fa-f0-9]). The behavior of the resultant syntaxes will be inconsistent.

Also, I’d like to echo kingkeith’s comment that the use of double-quoted strings makes it hard to read the .sublime-syntax output. Single-quoted strings should be fine.

0 Likes

#13

Thanks for picking up the \h bug, I’ll resolve, but we would be looking at \h being used in the JavaScript sense and losing Oniguruma’s mapping to [A-Za-z0-9].

Point taken, I’ll look at better encoding of the YAML string (single quote version).

EDIT : Do you actually mean [A-Fa-f0-9] ?

0 Likes

#14

For the regexp processing, are you doing transpilation or just a quick sanity check?

Do you actually mean [A-Fa-f0-9] ?

Yes. I’ve edited the post.

0 Likes

#15

I have a basic regular expression parser that I use for sanity checking. It’s a best-efforts parser I built myself.

0 Likes

#16

The \h problem is fixed now… The Iro Regular Expression format should be equivalent to JavaScript regex format. If you want to define hex chars, just use the [0-9a-fA-F] manual expansion.
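For anyone porting existing Oniguruma-style patterns, the manual expansion can be automated with a short pre-pass. The sketch below is an assumption about how one might do it, not part of Iro; the function name is hypothetical, and it only handles `\h` (not the negated `\H`).

```python
import re

HEX_RANGES = "0-9A-Fa-f"


def expand_hex_escape(pattern: str) -> str:
    """Rewrite Oniguruma's `\\h` as an explicit hex-digit character class.

    Outside a character class `\\h` becomes `[0-9A-Fa-f]`; inside one,
    only the ranges are spliced in. Other escapes are copied unchanged.
    """
    out = []
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "h":
                out.append(HEX_RANGES if in_class else "[" + HEX_RANGES + "]")
            else:
                out.append(ch + nxt)  # keep every other escape as written
            i += 2
            continue
        if in_class and ch == "]":
            in_class = False
        elif not in_class and ch == "[":
            in_class = True
        out.append(ch)
        i += 1
    return "".join(out)


colour = expand_hex_escape(r"#\h{6}")
print(colour)
print(bool(re.fullmatch(colour, "#1a2B3c")))
```

The expanded pattern contains no engine-specific escape, so it should behave the same in JavaScript and Oniguruma.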

0 Likes

#17

Now using single quoted version.

0 Likes

#18

Hi. I’m working on a Textmate to Iro converter and I wonder if someone can help me make sense of this rule, in textmate format? … as it appears to be contradictory. I understand begin, end to demarcate the start of a new inline context. What I don’t understand is “captures” without a “match” or in the presence of “begin” and “end”. It just looks and feels like a mess.

So, I’m looking for someone to tell me if this is a legitimate way to build a grammar file, or if the “captures” section is a bug?

<dict>

	<key>begin</key>
	<string>\b(_|:)\b</string>

	<key>end</key>
	<string>(?={)</string>
	
	<key>name</key>
	<string>meta.definition.class.extends.d</string>
	
	<key>captures</key>
	<dict>
		<key>1</key>
		<dict>
			<key>name</key>
			<string>storage.modifier.d</string>
		</dict>
	</dict>
	
	<key>patterns</key>
	<array>
		<dict>
			<key>include</key>
			<string>#all-types</string>
		</dict>
	</array>
	
</dict>
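
For reference, the TextMate manual describes `captures` on a begin/end rule as shorthand for giving `beginCaptures` and `endCaptures` the same value, so a converter can treat the rule above as legitimate. A minimal sketch of that normalisation follows; the function name is hypothetical, the `patterns` array is left out for brevity, and letting an explicit `beginCaptures` or `endCaptures` take priority over `captures` is an assumption.

```python
import plistlib

# The rule from the post, wrapped in a plist root so plistlib can parse it.
RULE_XML = rb"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>begin</key>
    <string>\b(_|:)\b</string>
    <key>end</key>
    <string>(?={)</string>
    <key>name</key>
    <string>meta.definition.class.extends.d</string>
    <key>captures</key>
    <dict>
        <key>1</key>
        <dict>
            <key>name</key>
            <string>storage.modifier.d</string>
        </dict>
    </dict>
</dict>
</plist>
"""


def normalize_captures(rule: dict) -> dict:
    """Return a copy of a begin/end rule with `captures` made explicit."""
    out = dict(rule)
    if "begin" in out and "captures" in out:
        shared = out.pop("captures")
        # Assumption: explicit begin/end captures win over the shorthand.
        out.setdefault("beginCaptures", shared)
        out.setdefault("endCaptures", shared)
    return out


rule = normalize_captures(plistlib.loads(RULE_XML, fmt=plistlib.FMT_XML))
print(sorted(rule))
print(rule["beginCaptures"]["1"]["name"])
```

In this particular rule the `end` pattern has no group 1, so only the begin match is actually scoped as `storage.modifier.d`.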
0 Likes