Sublime Forum

Infinite recursion in syntax or bug?

#1

I’m working on adding extended mode comments to the default RegExp.sublime-syntax syntax definition, but I’m unable to do it the way I wanted to.

I intended to use a with_prototype pattern to match #-indicated line comments within patterns that have extended mode enabled, but managed to lock up ST while its RAM usage climbed rapidly (around 6 GB in 20 s). I suspected infinite recursion, but ST normally catches that with a safety net and just reports an error. I also cannot find a reason for this to happen, which is why I’m creating this thread.

The file: https://gist.github.com/FichteFoll/3dd8223ddb5d8a1d8ee405956b045618

If I uncomment line 40 (or 34, same result) or comment out lines 81 and 82, ST goes crazy on my RAM.

cc @kingkeith (since you primarily worked on this)


Note: This method is not ideal because # is allowed in sets (as are whitespace characters), but I haven’t thought of a better method yet that didn’t involve duplicating the entire group-start context. If only the YAML merge operator was supported …


#2

I don’t believe this is specified for YAML 1.2, is it?

As to the main issue, you end up with recursion due to https://gist.github.com/FichteFoll/3dd8223ddb5d8a1d8ee405956b045618#file-regexp-sublime-syntax-L80-L82.

By using with_prototype, you are creating a new copy of group-body-extended, which includes group-body, which includes (a new copy of) base-group, which includes (a new copy of) group, either directly or via base. The push to group-start from group completes the loop, since a copy of it was made to apply the added prototype. It is possible that the ballooning RAM usage slowed down responsiveness before 25,000 contexts were created. This may be due to the combinatorial nature of the includes?
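To make the loop easier to see, here is a toy model in Python. It is only a sketch, not how Sublime Text actually builds contexts: the graph below is my guess at how the contexts named above reference each other (including the assumption that group-start leads back to group-body-extended), and both function names are made up for illustration. The naive version copies every referenced context afresh, while the memoized version reuses a copy it has already made.

```python
from collections import deque

# Hypothetical context graph using the names from this thread.
# Each context lists the contexts it references via include or push.
# The edge from group-start back to group-body-extended is an assumption.
CONTEXTS = {
    "group-body-extended": ["group-body"],
    "group-body": ["base-group"],
    "base-group": ["group", "base"],
    "base": ["group"],
    "group": ["group-start"],
    "group-start": ["group-body-extended"],
}


def naive_copies(root, limit):
    """Copy every referenced context afresh, giving up after `limit` copies.

    Returns the number of copies made and the number still waiting to be copied.
    """
    copies = 0
    queue = deque([root])
    while queue and copies < limit:
        name = queue.popleft()
        copies += 1
        # Every reference triggers another fresh copy, even if one already exists.
        queue.extend(CONTEXTS[name])
    return copies, len(queue)


def memoized_copies(root):
    """Copy each context at most once, reusing copies that already exist."""
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(CONTEXTS[name])
    return len(seen)


copies, pending = naive_copies("group-body-extended", 25000)
print(copies, pending)
print(memoized_copies("group-body-extended"))
```

In this model the naive expansion always hits the cap with work still pending, because the graph is cyclic and base-group fans out into two references. The memoized walk stops after the six distinct contexts.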


#3

Thanks for taking a look.

However, group-start is only pushed when an opening parenthesis, (, is matched, which is why this shouldn’t be a perfect infinite loop at runtime.

From your description, though, I take it that the recursion happens when the syntax definition is loaded: the prototyped context copies are constructed before any text is parsed, and in that case I can see the recursion. I just didn’t approach it from that perspective, although it makes sense performance-wise.


I ended up using a different and more applicable approach and submitted it as a pull request.


#4

It is mentioned in the spec (as part of the “tag repository”) but intentionally not specified, although implementing it is recommended. (ref)

Direct link to it: http://yaml.org/type/merge.html

It wouldn’t have helped for the lists since the merge operation can only be performed on mappings, but (as observable in the PR) there are still a lot of duplicated patterns that could have made use of this. I can make an example for this tomorrow, if you’re interested.
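As a rough idea of what the merge key does, here is a small Python sketch of its semantics as I read the linked page: key/value pairs from the merged mapping are inserted unless the key already exists, and with a sequence of mappings the earlier ones win. The apply_merge helper and the scope name are made up for illustration and are not taken from the PR.

```python
def apply_merge(mapping):
    """Resolve a '<<' merge key in a single mapping (illustrative helper)."""
    if "<<" not in mapping:
        return dict(mapping)
    sources = mapping["<<"]
    if isinstance(sources, dict):
        sources = [sources]
    # Keys written explicitly in the mapping always take precedence.
    result = {key: value for key, value in mapping.items() if key != "<<"}
    for source in sources:
        if not isinstance(source, dict):
            raise TypeError("merge values must be mappings")
        for key, value in source.items():
            # Earlier sources win over later ones; existing keys are kept.
            result.setdefault(key, value)
    return result


# Hypothetical shared part of a match pattern.
shared = {"scope": "punctuation.definition.group.end.regexp", "pop": True}
rule = {"<<": shared, "match": r"\)"}
merged = apply_merge(rule)
print(merged)
```

A context is a list of such patterns, and nothing here concatenates two lists, which is why this only helps with the duplicated mappings.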


#5

Yes, I saw that. However, it mentions YAML 1.1, which is why I asked whether it was specified for 1.2.


#6

Btw, the 1.1 spec is even less specific about the tag repository imo and only mentions it in this section, without saying which types implementations are required to parse. “Schemas” were only defined in 1.2 (for explicit compatibility with JSON and better differentiation).
