Sublime Forum

**jps** · January 13, 2016, 7:01am

The compatibility issue with Arch Linux has been resolved. This was an interesting issue, and I’ve got a mostly finished writeup for it that I’ll post later.

There’s a new file setting, move_to_limit_on_up_down, which is enabled by default on OS X to behave as is typical for the platform. This means that pressing up while on the first line of the file will move the caret to the beginning of the line. Personally, I’m not a fan of this behavior, but it’s the right thing to do for users who are used to it.

**Fed03** · January 13, 2016, 7:01am

It’s incredible how much effort you put into speedy development… I’m very impressed.
I know this has been requested again and again, but is it really so difficult to add an option for choosing the minimap’s position?
Like right or left… because this editor is becoming more and more widely known, and I think that from a business perspective it’s better to have options that may go unused than not to have them…

My 2 cents, and keep up the good work ^

**jps** · January 13, 2016, 7:01am

As promised, a full description of the Arch Linux compatibility bug:

The symptoms were that in a recent dev build, all the regexes loaded from .tmLanguage files were missing their last character, which would often render them invalid. This wasn’t happening on any other operating systems, nor any other Linux distributions.

The first thing I tried was getting a livecd version of Arch Linux (ArchBang in this case), and running it under a VM. Everything was working peachy - I had a bug that was triggering only for users of one specific Linux distribution, except not for me.

One of the changes in the broken build was that plist files (as .tmLanguage ones are) are now represented differently in memory, so I assumed that the .cache files weren’t entirely compatible (every .tmLanguage has a corresponding binary representation in a .cache file, to save on XML parsing time at startup), and something has gone wrong there. Further indicating this may have been the problem was that the size of the generated .cache files had changed between the two builds.

This turned out to be a red herring. The .cache files contain key-value dictionaries, stored in memory as hash tables, and they’re written out in the order the keys live in memory. This order is usually consistent, but had changed in this build because the key representation changed from wchar_t * (i.e., UTF-16 on Windows and UTF-32 on Linux and OS X), to a UTF-8 encoded char *. .cache files are zlib compressed, so changing the order of the keys will change the compressibility of the file, leading to different file sizes.

My next guess, given I wasn’t able to reproduce the issue, was that it was memory management related. One of the changes in the same build was to place plist strings in a memory arena (*1). Along with this, the memory arena code was changed to make use of malloc_usable_size() (aka malloc_size on OS X and _msize on Windows) to better utilise available memory, and glibc has at least one scenario where it implements this function incorrectly (sourceware.org/bugzilla/show_bug.cgi?id=1349). Alas, removing the calls to malloc_usable_size() didn’t fix anything.

I was pretty much out of ideas by now, and still unable to replicate the bug myself. As a hunch, I guessed that using a derivative of Arch Linux rather than the real thing could be related to this. Several hours later, after downloading and installing Arch the hard way, I was able to replicate the bug!

Running under GDB I could see that, yes, the regexes were indeed missing their last character. They had it when they were loaded from disk, but lost it by the time they made their way into the lexer. But only on Arch. Tracking things down, I could see that the string was copied into the memory arena, but when it was later used, the last character was gone. I looked at my 3 line copying function, but there were no visible problems. I rewrote the function to work in a different manner, but the end result was the same. I looked up the symbols in GDB to make sure the function I thought was being called was the actual one being called. Here’s the function in question, if you’re following along at home:

uchar * u_dupcstr(usubstring s, memory_arena * arena)
{
    const size_t num_chars = s.size();

    uchar * data = (uchar *) arena->alloc((num_chars + 1) * sizeof(uchar));
    memcpy(data, s.begin(), num_chars * sizeof(uchar));
    data[num_chars] = '\0';

    return data;
}

Stepping through this function, I could see that the string was being copied correctly, and the null terminator was in the right spot. However when the returned value was printed, the last character (which I could see just fine in GDB!) was missing. Looking at the data where it was used via GDB, the last character was indeed there. wcslen (strlen for wchar_t data) however, was saying it wasn’t: it was reporting the string as being one character shorter than it actually was. This is when I paid a bit more attention to the memory reported by GDB:

0x18d96ed:  0x74    0x00    0x00    0x00    0x65    0x00    0x00    0x00
0x18d96f5:  0x78    0x00    0x00    0x00    0x74    0x00    0x00    0x00
0x18d96fd:  0x2e    0x00    0x00    0x00    0x70    0x00    0x00    0x00
0x18d9705:  0x6c    0x00    0x00    0x00    0x61    0x00    0x00    0x00
0x18d970d:  0x69    0x00    0x00    0x00    0x6e    0x00    0x00    0x00
0x18d9715:  0x00    0x00    0x00    0x00

What we’re looking at here is the wchar_t representation of “text.plain”, with 4 little-endian bytes per character. There are 10 characters in “text.plain”, so including the null terminator, it has a 44 byte representation in memory, and it’s all nice and null terminated, as you can see above. Every other system has wcslen() report 10 for the above data, however on Arch Linux it returns 9.
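For anyone who wants to check the dump by hand, here is a minimal sketch that decodes those 44 bytes as little-endian 32-bit code units. The function name `decode_utf32le_ascii` and the `dump` array are illustrative and not part of Sublime Text; the decoder only handles ASCII code points, which is all the dump contains.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decodes little-endian 32-bit code units up to the
// null terminator. Reading byte-by-byte places no alignment requirement on
// `bytes`. Only ASCII code points are handled correctly.
std::string decode_utf32le_ascii(const unsigned char * bytes, std::size_t num_bytes)
{
    assert(bytes != nullptr);

    std::string out;
    for (std::size_t i = 0; i + 4 <= num_bytes; i += 4)
    {
        const std::uint32_t unit = static_cast<std::uint32_t>(bytes[i])
            | (static_cast<std::uint32_t>(bytes[i + 1]) << 8)
            | (static_cast<std::uint32_t>(bytes[i + 2]) << 16)
            | (static_cast<std::uint32_t>(bytes[i + 3]) << 24);
        if (unit == 0)
            break; // null terminator reached
        out.push_back(static_cast<char>(unit));
    }
    return out;
}

// The 44 bytes from the GDB dump: 10 characters plus the terminator.
const unsigned char dump[44] = {
    0x74, 0, 0, 0, 0x65, 0, 0, 0,
    0x78, 0, 0, 0, 0x74, 0, 0, 0,
    0x2e, 0, 0, 0, 0x70, 0, 0, 0,
    0x6c, 0, 0, 0, 0x61, 0, 0, 0,
    0x69, 0, 0, 0, 0x6e, 0, 0, 0,
    0x00, 0, 0, 0,
};
```

Calling `decode_utf32le_ascii(dump, sizeof(dump))` yields a 10 character string, which is the length a correct wcslen() should report for this data.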

The issue is the address of the first character, 0x18d96ed. wchar_t values have an implementation defined alignment requirement in C++, and being a 4 byte value here, that generally means they should lie on a 4 byte boundary, but 0x18d96ed is not on a 4 byte boundary (*2). The misaligned data comes from the memory arena, which is used to store a mix of UTF-8 and UTF-32 data, so will naturally end up returning unaligned memory addresses unless care is taken. In practical terms, x86 CPUs will happily load unaligned data, and you have to go out of your way to write code that doesn’t handle unaligned data correctly.

Enter Arch Linux, with its fancy new glibc 2.15, where the implementation of wcslen lives. One of the changes in glibc 2.15 is an optimised version of wcslen, that apparently doesn’t like unaligned data.

Long story short, my wchar_t strings are now all properly aligned, and everyone’s happy again. The moral of the story is, of course, just use UTF-8 everywhere.

As to the story of ArchBang, and why it didn’t reproduce the problem, glibc 2.15 only landed in Arch a few weeks ago, and the livecd dates from before then. In reality, this is a bit of luck: if I didn’t know exactly which build introduced the issue, it would have been much harder to track down.

*1. Memory arenas are a technique to coalesce multiple small allocations into a single larger allocation. When you have a lot of data with the same lifetime, they provide faster allocation and deallocation, better locality of reference, and less fragmentation than just mallocing the allocations individually.
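As a rough illustration of the idea, here is a minimal bump-pointer arena that also rounds each allocation up to a requested alignment. The post does not show Sublime Text's actual memory_arena, so the class name `bump_arena`, its fixed capacity, and the two-argument `alloc` are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical bump-pointer arena; names and layout are assumptions,
// not Sublime Text's implementation.
class bump_arena
{
public:
    explicit bump_arena(std::size_t capacity)
        : base_(static_cast<unsigned char *>(std::malloc(capacity))),
          capacity_(capacity),
          used_(0)
    {
        if (!base_)
            throw std::bad_alloc();
    }

    // Everything in the arena shares one lifetime: one free() releases it all.
    ~bump_arena() { std::free(base_); }

    bump_arena(const bump_arena &) = delete;
    bump_arena & operator=(const bump_arena &) = delete;

    // Returns `size` bytes whose address is a multiple of `alignment`
    // (which must be a power of two), or nullptr if the arena is full.
    void * alloc(std::size_t size, std::size_t alignment)
    {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

        const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(base_) + used_;
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (start + mask) & ~mask;
        const std::size_t padding = static_cast<std::size_t>(aligned - start);

        if (padding > capacity_ - used_ || size > capacity_ - used_ - padding)
            return nullptr;

        used_ += padding + size;
        return reinterpret_cast<void *>(aligned);
    }

private:
    unsigned char * base_;
    std::size_t capacity_;
    std::size_t used_;
};
```

With an arena like this, a 3 byte UTF-8 allocation followed by `alloc(n, alignof(wchar_t))` skips the padding bytes needed to land on a suitable boundary, instead of handing back an odd address.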

*2. When I was a young fellow, I was debugging a misbehaving program with a coworker, and marveled at his ability to tell if a hexadecimal memory address is 4-byte aligned at a glance. The trick, of course, is to just look at the last digit.
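The same check can be written down in code. This is a small sketch with hypothetical helper names, assuming power-of-two alignments: an address is 4-byte aligned exactly when its low two bits are zero, which is why only the last hex digit (0, 4, 8 or c) matters.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not from Sublime Text's source.
// `alignment` must be a power of two.
constexpr bool is_aligned(std::uintptr_t addr, std::size_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}

// The "last digit" trick: 4-byte alignment depends only on the final
// hex digit, which must be 0, 4, 8 or c.
constexpr bool last_hex_digit_is_4_aligned(std::uintptr_t addr)
{
    return ((addr & 0xf) % 4) == 0;
}
```

For the address in the dump, `is_aligned(0x18d96ed, 4)` is false because the last digit is d, while the neighbouring addresses 0x18d96ec and 0x18d96f0 both pass.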

**freewizard** · January 13, 2016, 7:01am

[quote=“jps”]…

0x18d96ed:  0x74    0x00    0x00    0x00    0x65    0x00    0x00    0x00
0x18d96f5:  0x78    0x00    0x00    0x00    0x74    0x00    0x00    0x00
0x18d96fd:  0x2e    0x00    0x00    0x00    0x70    0x00    0x00    0x00
0x18d9705:  0x6c    0x00    0x00    0x00    0x61    0x00    0x00    0x00
0x18d970c:  0x69    0x00    0x00    0x00    0x6e    0x00    0x00    0x00
~~~~~~~~^ should be d ?
0x18d9715:  0x00    0x00    0x00    0x00

…[/quote]

could be a memory crc error I guess

**mikeb** · January 13, 2016, 7:01am

Jon, I just have to say I am in awe. Amazing work.

**jps** · January 13, 2016, 7:01am

[quote=“freewizard”]
0x18d970c: 0x69 0x00 0x00 0x00 0x6e 0x00 0x00 0x00
[/quote]

Good catch! I've updated the post to fix the transcription error.

**tito** · January 13, 2016, 7:01am

Is there any chance of adjusting the time a plugin takes before it is considered slow? Some people have started to say a plugin is slow because it takes “0.03xxx”, which is pretty fast. Also, most of the time taken by the plugin is spent in internal Sublime APIs.
I like the feature because it pushes for improvements in plugins, but I find it too restrictive and sometimes distracting.
Regards

**jps** · January 13, 2016, 7:01am

A plugin taking 30ms in its on_modified or on_selection_modified callback will visibly impact performance. In the case of on_selection_modified, things like holding down an arrow key will no longer feel smooth.

Ideally all processing in response to these events (the sum of all plugins’ event handlers, plus Sublime Text’s own time to process the action and redraw the affected part of the window) would be done in less than 16.7ms, to keep up with the 60 Hz refresh rate found on most monitors.

This isn’t about blame, just ensuring that users don’t get an unresponsive editor. This absolutely limits what can be done in response to on_modified and on_selection_modified, but it’s a price I personally feel is worth paying.
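The arithmetic behind those numbers, as a small sketch (the constant and function names are illustrative only): at 60 Hz a new frame is due every 1000 / 60 ms, so a 30ms handler on its own spans more than one full refresh interval.

```cpp
// Frame-budget arithmetic; names are illustrative.
constexpr double refresh_rate_hz = 60.0;
constexpr double frame_budget_ms = 1000.0 / refresh_rate_hz; // about 16.7 ms

// Number of whole refresh intervals that elapse while a handler runs.
constexpr int frames_missed(double handler_ms)
{
    return static_cast<int>(handler_ms / frame_budget_ms);
}
```

Here `frames_missed(30.0)` is 1, whereas a 1ms handler gives 0 and leaves almost the entire budget for everything else.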

**sublimator** · January 13, 2016, 7:01am

0.03x seconds is kind of slow, especially for modification events.

In 2012, 16ms is a pretty modest aim. With a ~1ms on_selection_modified you can set the status bar to the XPath of the element containing the cursor and paint in the begin/end tags on a ~15,000 line XML file:

http://gmh.akalias.net/16ms-is-a-long-time.jpg

For this and a lot of plugins it’s not absolutely critical that it completes and gives a status update inside the event handler every time. Building the tree takes a little longer, but we are armed with threads and set_timeout/generator methods for background processing. If a tree (or the equivalent for whatever plugin) has already been built when the event handler runs, the status is updated right away; if not, the update is queued up with set_timeout.

With regard to long-running blocking functions composed of API calls which you can’t defer to secondary threads (most API calls can’t be run in threads), you can use set_timeout to turn the gears on a generator, slapping in yield points. Or you can use recursion or whatever.

It’s definitely worth the extra effort to keep things running smooth.

**corydeppen** · January 13, 2016, 7:01am

I’m not seeing tabs using build 2161 on OS X 10.6.8. It seems only one file is being opened at a time, since when I attempt to open multiple files and then press Command-W, I don’t see any open files. I’m pretty certain this was working in 2159 or 2160.

**tiger2wander** · January 13, 2016, 7:01am

Great to see the bug fixed on Arch Linux x64, awesome work! Thanks.

**firefusion** · January 13, 2016, 7:01am

Small feature request: can we drop .sublime-project from the project switcher popup window? Make project names easy to spot.

Dev Build 2161