Sublime Forum

[Solved] Curly brackets causing unwanted line wrapping in custom syntax?

#1

Hi everyone! I’ve quietly been using Sublime Text for a while now (10+ years?), but recently I started using it for something new (to me): as a lightweight, blazing-fast e-reader!

I’m working on a custom syntax to facilitate custom text styling for this type of long-form reading, and I’m having an issue with curly brackets causing unwanted line wrapping within Chinese text (which has no spaces between words or characters):

Here is my super simple, baby’s first sublime-syntax file so far:

%YAML 1.2
---
file_extensions:
  - read
scope: text
contexts:
  main:
    - match: '\{(.*?)\}'
      scope: string

I’ve observed that commenting out the match and scope lines achieves this result:

This is perfect except that I’d like to have the text within curly brackets be highlighted green (or whatever color) for improved visual grepping. The other purpose of the brackets is to mark vocab words so that I can post-process the text to auto-generate Anki vocab flashcards with textual context.
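
For the post-processing step, a minimal Python sketch might look like the following. It reuses the same \{(.*?)\} pattern as the syntax rule to pull out each marked word together with its sentence. The names extract_vocab and SAMPLE are made up for illustration, treating the Chinese full stop 。 as the sentence boundary is an assumption, and the tab-separated output is just one plausible format for a flashcard import.

```python
import re

# Same pattern as the syntax rule: non-greedy text between { and }.
VOCAB = re.compile(r"\{(.*?)\}")

# Hypothetical sample input (the kind of marked-up text described above).
SAMPLE = "謝文東的爸爸媽媽開{飯店}。謝文東{有時候}會{帶}錢大朋去飯店吃飯。"


def extract_vocab(text):
    """Return (word, context) pairs for every {bracketed} word.

    Assumption: a sentence ends at the Chinese full stop "。".
    """
    cards = []
    # Each match is one sentence, including its trailing full stop if present.
    for sentence in re.findall(r"[^。]+。?", text):
        # Strip the braces so the context reads as plain text.
        context = sentence.replace("{", "").replace("}", "")
        for word in VOCAB.findall(sentence):
            cards.append((word, context))
    return cards


if __name__ == "__main__":
    # One tab-separated line per card: word, then the sentence it came from.
    for word, context in extract_vocab(SAMPLE):
        print(f"{word}\t{context}")
```

A sentence with several marked words yields one card per word, each sharing the same context sentence.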

Is there any way to have the text within brackets highlighted some color without also causing a line wrap?

Thanks in advance for any tips anyone might be able to provide! I’ve been looking all over Google and the Sublime Text documentation/forums for solutions, and I’m getting the feeling that I’m not quite using the right words to describe the situation.

I :heartpulse: Sublime Text! It literally single-handedly ignited my love for software.

Edit: This topic is similar, but I’m having a hard time understanding it because there’s so much going on.

#2

Can you provide that Chinese text in a copy-pastable format? I’m not sure, though, why a syntax would cause word wrap, as it just assigns scope information and nothing more.

#3

Sure! Here’s the text:

謝文東的爸爸媽媽開{飯店}。謝文東{有時候}會{帶}錢大朋去飯店吃飯。謝文東的爸爸媽媽都{見過}錢大朋。錢大朋有時候也會開車帶謝文東{回家}。

And here are the syntax-specific settings I have applied. I don’t think they make any difference, but just in case!

{
  "font_face": "kaiti",
  "font_size": 30,
  "wrap_width": 40,
  "draw_white_space": "none",
  "caret_style": "solid",
  "draw_centered": true,
  "show_minimap": false,
  "line_numbers": false,
}

#4

It could well be some combination of those settings, or the fact that the text is Chinese, that’s causing the wrap, but that’s just a guess.

#5

I added these settings to the ones provided above to visualize things; the ruler is set to where the wrap width is, and draw_debug shows the content of the token buffer:

  "rulers": [40],
  "draw_debug": true,

With those in place, your sample content looks like this:

It’s wrapping, but note that the place where it’s wrapping is between tokens (which are delimited by the color changing between different shades of blue when using Mariana as a color scheme).

If we alter the syntax so that the main context doesn’t include that rule, we see:

Now the wrap is not happening, but note also that everything is a single token now (it’s all the same blue).

Modifying the syntax to:

  main:
    - match: '.'
      scope: string

Produces this result:

Now the token alternation is every character, and the wrap is the same as when there’s no match statement.

The conclusion to draw from this is that the word wrap mechanism in Sublime doesn’t go character by character; it would appear to go token by token.

As such, the rules that are put in place in your syntax definition change the tokenization and thus change the wrap points.

This makes sense for languages like English, where a sequence of characters making up a single token is likely a word that should not be split, but perhaps it makes less sense in Chinese.
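
To make the token-by-token idea concrete, here is a toy Python model. It is not Sublime’s actual wrapping code, which isn’t shown anywhere in this thread: tokenize and wrap_tokens are made-up helpers, width is counted in characters rather than pixels, and hard-splitting a token that is longer than a whole line is an assumption chosen to match the observation that a single-token line wraps the same way as per-character tokens.

```python
import re


def tokenize(text, pattern):
    """Split text into tokens: each regex match is one token, and each
    unmatched run between matches is one token."""
    tokens, pos = [], 0
    for m in re.finditer(pattern, text):
        if m.start() > pos:
            tokens.append(text[pos:m.start()])
        tokens.append(m.group(0))
        pos = m.end()
    if pos < len(text):
        tokens.append(text[pos:])
    return tokens


def wrap_tokens(tokens, width):
    """Greedy wrap that prefers to break only between tokens."""
    lines, current = [], ""
    for tok in tokens:
        # Assumption: a token longer than a full line gets hard-split.
        while len(tok) > width:
            if current:
                lines.append(current)
                current = ""
            lines.append(tok[:width])
            tok = tok[width:]
        if current and len(current) + len(tok) > width:
            # The token does not fit on this line: break before it.
            lines.append(current)
            current = tok
        else:
            current += tok
    if current:
        lines.append(current)
    return lines


text = "aaaa{bb}cccc{dd}ee"
brace_tokens = tokenize(text, r"\{.*?\}")   # the original rule
char_tokens = list(text)                    # the '.' rule
print(wrap_tokens(brace_tokens, 6))  # ['aaaa', '{bb}', 'cccc', '{dd}ee']
print(wrap_tokens(char_tokens, 6))   # ['aaaa{b', 'b}cccc', '{dd}ee']
```

In this model the brace rule creates a few multi-character tokens that each get pushed to a new line when they don’t fit, while per-character tokens let every line fill up to the full width.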

#6

What an amazing exploration/explanation, thank you so much! Btw, I’ve learned a ton from watching your YouTube videos over the years; they are amazing :slight_smile:

Also, I especially enjoyed the tip about the draw_debug option, that’s a new one for me.

And best of all, your explanation led me to a solution - check it out: highlighted vocab words and improved line wrapping for Chinese (or any non-spaced language, I suppose)!

Read CJK.sublime-syntax:

%YAML 1.2
---
# See http://www.sublimetext.com/docs/syntax.html
file_extensions:
  - rd_cjk
scope: text
contexts:
  main:
    # Apply a scope to {text in brackets}. Specify a different scope
    # to pull a different color from your color scheme (constant.numeric, etc.).
    - match: '\{(.*?)\}'
      scope: string

    # Tokenize each remaining character individually (fixes line wrapping
    # issues in non-spaced languages like Chinese and Japanese).
    - match: '.'
      scope: text

Read CJK.sublime-settings:

{
  "font_face": "kaiti",
  "font_size": 30,
  "wrap_width": 40,
  "draw_white_space": "none",
  "caret_style": "solid",
  "draw_centered": true,
  "show_minimap": false,
  "line_numbers": false,
  // "rulers": [0, 40],
  // "draw_debug": true,
}

Note that I changed the file extension and the file_extensions field to include an explicit language family specifier (just for my own sanity). The syntax rule that tokenizes each remaining character individually makes decent sense for non-spaced languages like Chinese and Japanese, but makes less sense for languages that are already spaced (like English), because it can cause mid-word line breaks.

Also, “CJK” might not be the best name, because Korean already uses spaces. I don’t know Korean (yet :slight_smile: ), so I can’t comment on whether mid-word line wraps look as weird as they do in English.

Thanks again for the help @OdatNurd! You rock!
