Sublime Forum

Add BOM Awareness for Syntax Test Files

#1

As of ST3 3157, syntax test files with a BOM fail to work because the BOM prevents ST from correctly parsing the first line. See my post:

Some languages’ native IDEs do add a BOM when they save source files. The UTF-8 specification allows adding a BOM, even if it’s discouraged.

Adding a BOM check and stripping function shouldn’t be problematic, and it might solve lots of issues when working with shared code where some contributors use IDEs that add a BOM.

Also, it might be worth mentioning the issue in the documentation. I went around in circles for quite a while before realizing the problem was due to the presence of the BOM!
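To illustrate the kind of check and stripping function meant above, here is a minimal Python sketch. It only shows the general technique on raw file bytes; the function name `strip_utf8_bom` and the sample header path are made up for illustration, and nothing here reflects how ST reads files internally.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present.

    Hypothetical helper for illustration; not part of any ST API.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Example: a test file saved by an IDE that prepends a BOM.
# The syntax path below is a placeholder, not a real package.
raw = codecs.BOM_UTF8 + b'// SYNTAX TEST "Packages/Example/Example.sublime-syntax"\n'
clean = strip_utf8_bom(raw)
print(clean.startswith(b"// SYNTAX TEST"))
```

When decoding text in Python, the standard `utf-8-sig` codec achieves the same result: it drops a leading BOM if there is one and otherwise behaves like plain `utf-8`.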

0 Likes

ST3 3157: Syntax Tests Problems
#2
ST 3175 x86_64 | Win 10

The problem persists: when the BOM is present, the tests don’t get executed because ST fails to detect the comment delimiter from the first line.

Basically, every time I edit the syntax source file with the language’s native IDE, it adds a BOM again at save time, and I have to remove it in ST.

ST is aware of the BOM, since the status bar shows the encoding as “UTF-8 with BOM”.

If I click on the encoding and choose “Set encoding -> UTF-8”, it removes the BOM and the test works.

ST should really check for the presence of a BOM in test files and skip it when parsing the comment delimiter, without forcing the user to remove the BOM from the file and make the source file non-compliant with its native IDE’s standards.
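As a rough sketch of what skipping the BOM during header parsing could look like, the Python below parses a first line of the form `<comment token> SYNTAX TEST "<syntax file>"`. The regular expression, the function name `parse_test_header`, and the `skip_bom` flag are assumptions made for this example; ST’s actual parser is not public, so this only demonstrates one plausible way a BOM can break the detection.

```python
import re

BOM = "\ufeff"  # U+FEFF, what a UTF-8 BOM decodes to

# Assumed header shape: <comment token> SYNTAX TEST "<syntax file>"
HEADER_RE = re.compile(r'^(?P<token>\S+)\s+SYNTAX TEST\s+"(?P<syntax>[^"]+)"')


def parse_test_header(first_line: str, skip_bom: bool = True):
    """Return (comment_token, syntax_path), or None if there is no header.

    Hypothetical helper: with skip_bom=True a leading U+FEFF is dropped
    before the header is matched.
    """
    if skip_bom and first_line.startswith(BOM):
        first_line = first_line[len(BOM):]
    match = HEADER_RE.match(first_line)
    if match is None:
        return None
    return match.group("token"), match.group("syntax")


# Placeholder syntax path, used only for the demonstration.
line = BOM + '// SYNTAX TEST "Packages/Example/Example.sublime-syntax"'
print(repr(parse_test_header(line, skip_bom=False)))
print(repr(parse_test_header(line)))
```

In this sketch, U+FEFF does not count as whitespace, so without the skip the detected token becomes the BOM glued to `//`, and later lines that start with a plain `//` would no longer be recognized as test comments. With the skip enabled, the token is just `//`.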

When creating test files for languages that have a native IDE, it’s useful to be able to work with the native IDE too, especially when building a new ST syntax: one would like to use code that really works in that language, or the native IDE may simply make editing the code easier. But having to add or remove the BOM each time the test file is opened in ST or in its native IDE gets annoying (and it’s usually convenient to keep the test file open in both editors at the same time). Also, if the native IDE supports different file encodings, removing the BOM could lead to corruption of special characters.

0 Likes

#3

You should create an issue for this on the core tracker so it is not forgotten.

0 Likes

#4

Thanks @FichteFoll! The problem is that I’m not sure where that would be.

If it’s a GitHub repo, could you please provide a link?

0 Likes

#5

The issue tracker for Sublime Text is at:

0 Likes

#6

Done:

1 Like