Sublime Forum

Regex (syntax file) with positive and negative lookahead

#1

Sublime Text build 4107, syntax file .sublime-syntax

Hi,

I am parsing a kind of HTML file with the following tags:
    <b:skin and more> … </b:skin>
    <b:template-skin and more> … </b:template-skin>
    <b:anything else and more> … </b:anything>

I want to match all <b: tags without consuming anything (using a look-ahead) and distinguish between the <b:skin or <b:template-skin tags and the other <b: tags.

I tried these two regexes (and many others):

  1. to match <b:…skin … >:      (?=<(?i:b:(template-)?skin)\s+[^>]*>)
  2. to match all other <b:…>:   (?=<(?i:b:(?!(template-)?skin))[^>]*>)

When I use each of these regexes in a context without the other, the matching works for each targeted tag. But once I put the two rules in one context, no matter the order, all tags are caught by rule no. 2 and the ‘skin’ tags never follow rule no. 1. After many hours of testing, the reason is still a big mystery to me.

I also tried another approach with branch and fail, but it seems that fail does not treat a matching look-ahead as a success, maybe because the match returns no resulting string?

So if someone could give me more insight into what’s happening and maybe propose a solution, I would be more than happy!

Thanks in advance,
Bart.

#2

This happens because those patterns don’t consume anything.

What happens if both of those patterns are placed next to each other in the same context?

  1. Pattern one matches but nothing is consumed. Therefore nothing is scoped, nor is the pointer moved forward in your text file.
  2. As a result, ST sees that the first pattern did nothing and continues with the next pattern, which also matches at the same position.
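
The "nothing is consumed" part can be checked outside of Sublime Text. The following is a minimal sketch using Python's re module, which is an assumption made for illustration only: ST uses its own regex engines, and the sample line is made up. It shows that a look-ahead match has an empty span, while a consuming pattern moves the position forward:

```python
import re

# Sample line, made up for illustration.
line = "<b:skin version='1.0.0'>"

# Look-ahead pattern from the question: it matches, but consumes nothing.
lookahead = re.compile(r"(?=<(?i:b:(template-)?skin)\s+[^>]*>)")
# A consuming alternative that captures the parts of the tag name.
consuming = re.compile(r"(<)(b)(:)((?:template-)?skin)(?=[/<>\s])")

zero_width = lookahead.match(line)
consumed = consuming.match(line)

print(zero_width.span())  # start == end: the position does not move
print(consumed.span())    # end > start: the tag name is consumed
print(consumed.group(4))  # the local tag name
```

Only the consuming variant yields text that can receive a scope and advance the parser.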

You can create consuming patterns and place them right next to each other to match different types of tags. They just need to be explicit enough to match only the desired tags.

I guess you want to push into dedicated contexts for each tag?

The following example uses some contexts of the default XML.sublime-syntax for convenience, but shows how to match different tag types and push into dedicated contexts for each.

%YAML 1.2
---
name: XML (Example)
scope: text.xml.example
version: 2

##############################################################################

variables:
  # The atomic part of a tag or attribute name without namespace separator `:`
  identifier: '[[:alpha:]_][[:alnum:]_.-]*'

  tag_name_break: (?=[/<>\s])

contexts:

  main:
    - include: tags

  tags:
    - include: skin-tags
    - include: other-tags

  skin-tags:
    # opening tag
    # (closing > may be on another line, so don't include in lookahead)
    - match: (<)(b)(:)((?:template-)?skin){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.xml
        2: entity.name.tag.namespace.xml
        3: entity.name.tag.xml punctuation.separator.namespace.xml
        4: entity.name.tag.localname.xml
      push:
        - skin-body
        - tag-attributes

    # closing tag
    # (also match closing tags if opening tag is missing)
    - match: (</)(b)(:)((?:template-)?skin){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.xml
        2: entity.name.tag.namespace.xml
        3: entity.name.tag.xml punctuation.separator.namespace.xml
        4: entity.name.tag.localname.xml
      push: end-tag-content

  skin-body:
    - meta_content_scope: meta.skin-body.xml
    ## OPTION 1 #########################

    # pop body context if corresponding closing tag is matched
    # (don't consume anything to let parent context handle it)
    - match: (?=</b:\4)
      pop: 1

    ## OPTION 2 #########################

    # match and consume closing tag and pop context off stack
    # (see: set to end-tag-content)
    # - match: (</)(b)(:)((?:template-)?skin){{tag_name_break}}
    #   captures:
    #     1: punctuation.definition.tag.begin.xml
    #     2: entity.name.tag.namespace.xml
    #     3: entity.name.tag.xml punctuation.separator.namespace.xml
    #     4: entity.name.tag.localname.xml
    #   set: end-tag-content

    #####################################
    # ... match whatever you want between those tags

  other-tags:
    # opening tag
    # (closing > may be on another line, so don't include in lookahead)
    - match: (<)(b)(:)({{identifier}}){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.xml
        2: entity.name.tag.namespace.xml
        3: entity.name.tag.xml punctuation.separator.namespace.xml
        4: entity.name.tag.localname.xml
      push:
        - other-body
        - tag-attributes

    # closing tag
    # (also match closing tags if opening tag is missing)
    - match: (</)(b)(:)({{identifier}}){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.xml
        2: entity.name.tag.namespace.xml
        3: entity.name.tag.xml punctuation.separator.namespace.xml
        4: entity.name.tag.localname.xml
      push: end-tag-content

  other-body:
    - meta_content_scope: meta.other-body.xml
    # pop body context if corresponding closing tag is matched
    # (don't consume anything to let parent context handle it)
    - match: (?=</b:\4)
      pop: 1
    # ... match whatever you want between those tags

  tag-attributes:
    - meta_scope: meta.tag.xml
    - match: '/>'
      scope: punctuation.definition.tag.end.xml
      pop: 2 # skip ...-body context
    - match: '>'
      scope: punctuation.definition.tag.end.xml
      pop: 1 # move on with ...-body context
    # use attribute contexts from default XML.sublime-syntax
    - include: scope:text.xml#tag-attribute

  end-tag-content:
    - meta_scope: meta.tag.xml
    - include: scope:text.xml#end-tag-content

#3

hi Deathaxe,

Thanks for the insight.

But still … a matching look-ahead, even one consuming nothing, triggers a push, pop, set, embed, … as in:

    - match: (?=<(?i:b:(template-)?skin)\s+[^>]*>)
      embed: Packages/XML/XML.sublime-syntax

This way I had hoped to avoid constructing a bunch of contexts/rules that are already well defined in the default syntaxes. So it’s a bit of a pity that ST in its next step doesn’t act on the triggered ‘embed’ and only looks at whether some characters have been consumed by the match. Are characters consumed by the embed then not taken into account?

Anyway, if this is the only way, the code frame you made will be a lot of help. Thanks for that!

Some final questions though. One: where your code says

    - include: scope:text.xml#tag-attribute

it is not clear to me what is happening. Normally an include acts on a context. So which context is used here, and what about the scope referred to? Will attributes be processed here as they normally are in an XML syntax?

Question two. You use

    - match: (?=</b:\4)

What is the meaning of the \4?

Greets,
Bart.

#4

“But still … a matching look-ahead, even one consuming nothing, triggers a push, pop, set, embed, …”

Yes, this is exactly one of the use cases for lookaheads. From the information in your initial post, I was under the impression that you had just put those two patterns after each other with a simple scope or something like that.

If a non-consuming pattern pushes another context onto the stack which contains consuming patterns, everything is fine. Those will be matched until the context is popped or set away from.
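
To illustrate the point, here is a minimal sketch of a toy context-stack lexer in Python. It is not Sublime Text's engine: the context names, the patterns and the tokenize helper are all hypothetical and only mimic the push/pop idea. A zero-width rule in main pushes tag, and the consuming rules of tag then move the position forward:

```python
import re

# Each rule: (pattern, action, target context, scope name).
# Hypothetical contexts, loosely modelled on a .sublime-syntax file.
CONTEXTS = {
    "main": [
        (re.compile(r"(?=<b:)"), "push", "tag", None),  # zero-width, only pushes
        (re.compile(r"[^<]+|<"), None, None, "text"),
    ],
    "tag": [
        (re.compile(r"<b:[\w-]+"), None, None, "tag-name"),
        (re.compile(r"[^>]+"), None, None, "attributes"),
        (re.compile(r">"), "pop", None, "tag-end"),
    ],
}

def tokenize(text):
    """Return (scope, text) pairs produced by the toy lexer."""
    stack, pos, tokens = ["main"], 0, []
    while pos < len(text):
        for pattern, action, target, scope in CONTEXTS[stack[-1]]:
            m = pattern.match(text, pos)
            if not m:
                continue
            if m.end() > pos:  # only consuming matches produce a token
                tokens.append((scope, m.group(0)))
            pos = m.end()
            if action == "push":
                stack.append(target)
            elif action == "pop":
                stack.pop()
            break
        else:
            pos += 1  # no rule matched: skip one character
    return tokens

tokens = tokenize("a <b:skin v='1'> b")
for token in tokens:
    print(token)
```

The look-ahead rule itself never yields a token; all scoped text comes from the rules of the pushed context.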

“So it’s a bit of a pity that ST in its next step doesn’t act on the triggered ‘embed’ and only looks at whether some characters have been consumed by the match.”

I guess you misunderstood.

A matching lookahead does in fact trigger the embed. The question is, what does the escape look like? Does the embedded syntax have any chance to match something? Note that the escape pattern is matched before anything else, once on each line.

“Are characters consumed by the embed then not taken into account?”

That depends on the escape pattern.

“… if this is the only way, the code frame you made will be a lot of help.”

Maybe some more info about your goals would help with providing more precise tips. We can do a lot if it’s clear where to go. :wink:

    - include: scope:text.xml#tag-attribute

It works like any other include statement. It includes the context named tag-attribute from whichever syntax has the main scope text.xml. You could instead write - include: Packages/XML/XML.sublime-syntax#tag-attribute to make sure the context comes from the default XML syntax.

If your whole syntax is meant to extend XML, you could also inherit from the default XML syntax and just add or modify the contexts of interest. You might want to have a look at HTML (Plain).sublime-syntax and HTML.sublime-syntax to learn how it works.

    - match: (?=</b:\4)

\4 = the 4th capture group of the pattern which pushed the context onto the stack. This way you make sure to pop the context only at exactly the matching tag. I used it because the pattern ((?:template-)?skin) matches two tags. So if the opening tag is <b:skin>, \4 says “pop only if matching </b:skin>”. The same goes for <b:template-skin>, respectively.
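
The effect of \4 can be imitated in plain Python. This is only a sketch of the idea: Python's re cannot reference a group from a different pattern, so the hypothetical closing_lookahead helper below substitutes the captured text by hand, and the sample text is made up:

```python
import re

# Opening-tag pattern in the style of the example syntax.
OPENING = re.compile(r"(<)(b)(:)((?:template-)?skin)(?=[/<>\s])")

def closing_lookahead(opening_match):
    # Hypothetical helper: insert the text captured by group 4 literally
    # into the closing-tag look-ahead, so only the same tag name matches.
    name = re.escape(opening_match.group(4))
    return re.compile(r"(?=</b:" + name + r")")

# Made-up sample: a stray </b:skin> sits inside a <b:template-skin> body.
text = "<b:template-skin> body </b:skin> more </b:template-skin>"

pop = closing_lookahead(OPENING.match(text))
end = pop.search(text)

print(text[end.start():])  # the body ends only at the matching closing tag
```

The stray </b:skin> is skipped because the look-ahead was built from the name captured by the opening tag.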

#5

hello Deathaxe,

I am trying to make a syntax file for Google Blogger theme backup files. It is a somewhat bizarre mix of Blogger’s own tags, CSS, XML, JavaScript and a touch of JSON objects. A little mock-up reads as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html>
<html ...>

<b:skin version='1.0.0'>
    <![CDATA[
        /*! ... (comment) */
        ... (css)
        /*
            <...> (html/xml) </...>
        */
    ]]>
</b:skin>

<b:xyz>
    ... (html/xml)
    <![CDATA[
        {...} (Json object)
    ]]>
</b:xyz>

&lt;script type=&quot;text/javascript&quot;&gt;
    ... (javascript)
&lt;/script&gt;

<script language='javascript' type='text/javascript'>
    // <![CDATA[
        ... (javascript)
    // ]]>
</script>

</html>

Next is the part of the syntax with the one context, ‘tag-blog-open’, that is bothering me:

  main:
    - match: ""
      push: xml-blogger

  xml-blogger:
    - match: ""
      set: Packages/HTML/HTML.sublime-syntax
      with_prototype:
        - include: just-mark
        - include: tag-comments
        - include: tag-blogskin-close
        - include: tag-blog-close
        - include: tag-blog-open

  tag-blog-open:
      # §§X§§ and §§Y§§ are used to quickly disable a rule
    - match: §§X§§(?=<(?i:b:(template-)?skin)\s+[^>]*>)
      embed: Packages/HTML/HTML.sublime-syntax
      embed_scope: meta.tag.blogskin.begin.html
      escape: '/?>'
      escape_captures:
        0: meta.tag.blogskin.begin.html meta.tag.other.html punctuation.definition.tag.end.html
    - match: (?=<!\[CDATA\[)
      push: css-blogskin-prefix
    - match: §§Y§§(?=<(?i:b:(?!(template-)?skin))[^>]*>)
      embed: Packages/HTML/HTML.sublime-syntax
      embed_scope: meta.tag.blogger.any.html
      escape: '/?>'
      escape_captures:
        0: meta.tag.blogger.any.html meta.tag.other.html punctuation.definition.tag.end.html

  css-blogskin-prefix:
    - meta_scope: source.skin.embedded.html
    - match: (<!\[)(CDATA)(\[)
      captures:
        1: punctuation.definition.tag.begin.html
        2: keyword.declaration.cdata.html
        3: punctuation.definition.tag.begin.html
      set:
        - include: css-blogskin-suffix

  css-blogskin-suffix:
    - match: (?=\]\]>)
      scope: punctuation.definition.tag.end.html
      pop: true

  tag-blogskin-close:
    - match: (</)((?i:b:(template-)?skin)\s*)(>)
      captures:
        1: meta.tag.blogskin.end.html meta.tag.other.html punctuation.definition.tag.begin.html
        2: meta.tag.blogskin.end.html meta.tag.other.html entity.name.tag.blogskin.html
        4: meta.tag.blogskin.end.html meta.tag.other.html punctuation.definition.tag.end.html

  tag-blog-close:
    - match: (</)((?i:b:(?!((template-)?skin))\w*\s*))(>)
      captures:
        1: meta.tag.blogger.any.html meta.tag.other.html punctuation.definition.tag.begin.html
        2: meta.tag.blogger.any.html meta.tag.other.html entity.name.tag.other.html
        5: meta.tag.blogger.any.html meta.tag.other.html punctuation.definition.tag.end.html

If in the code I only leave out §§X§§, only Blogger skin tags get the ‘meta.tag.blogskin.begin.html’ scope.
If I only leave out §§Y§§, only Blogger non-skin tags get the ‘meta.tag.blogger.begin.html’ scope.
If I leave out both, all Blogger tags get the ‘meta.tag.blogger.begin.html’ scope!?

Then about the

    - include: scope:text.xml#tag-attribute

So #tag-attribute is not just a comment but a parameter! I missed that somewhere in the documentation.

And about the

    - match: (?=</b:\4)

OK, so it’s possible to refer to a captured group not only inside the same match but also to one from the match pattern which pushed the context onto the stack. That’s very useful!

Many thanks already for enlightening me!

Bart.

PS. How do you get the color in your post ?

#6

Your example looks pretty much like simple HTML with some additional special-purpose “foreign tags” (e.g. <b:skin> …) whose content needs special treatment.

The only odd thing I see is this &lt;script type=&quot;text/javascript&quot;&gt; part, as it would normally be treated as plain text in HTML/XML due to entity escaping. Maybe this is intended? Anyway, it could be treated very much like normal <script...> tags, if wanted.

I’d use HTML as a base syntax and add some rules for those <b:...> tags. I’d even use tag-attributes contexts from HTML and leave XML out completely.

By pushing into HTML using with_prototype you add those included contexts to literally every context of HTML. I can’t recommend doing so anymore, for several reasons. You would include them even in strings and attribute names, all kinds of contexts where you don’t want them to be matched.

That’s also the issue with tag-blog-open. You embed HTML but pop at the first tag end (the escape is '/?>'). The embedded syntax therefore can’t match anything (no tag), because it is escaped before.

The solution seems much simpler. You don’t need to mess with all those with_prototype and embeds, from what I can judge at first glance from your example.

These <b: ...> tags are the only special thing? Any example for this §§X§§ thing?

I may try to hack something together to start with, tomorrow or the day after.

#7

How about that …

%YAML 1.2
---
name: Google Blogger Theme
scope: text.html.google-blogger-theme
version: 2

# Use builtin HTML as a base syntax and just add rules to handle <b:...> tags.
extends: Packages/HTML/HTML.sublime-syntax

##############################################################################

contexts:

  tag-other:
    # Prepend rules to existing context defined in HTML.sublime-syntax
    # which contain rules for <b: ...> tags.
    - meta_prepend: true
    - include: blogger-skin-tags
    - include: blogger-other-tags

  entities:
    # Prepend rules to existing context defined in HTML.sublime-syntax
    # which scope $$X$$ as keyword.
    - meta_prepend: true
    - match: \$\$[XY]\$\$
      scope: keyword.control.google-blogger-theme

###[ B:SKIN TAGS ]############################################################

  blogger-skin-tags:
    - match: (<)(b)(:)((?:template-)?skin){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.localname.html
      push: blogger-skin-attributes
    - match: (</)(b)(:)((?:template-)?skin){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.localname.html
      push: blogger-skin-closing

  blogger-skin-closing:
    - meta_scope: meta.tag.blogskin.end.html
    - include: tag-end

  blogger-skin-attributes:
    - meta_scope: meta.tag.blogskin.begin.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      set: blogger-skin-body
    - include: tag-end-self-closing
    - include: tag-attributes

  blogger-skin-body:
    - meta_content_scope: meta.blogger.skin.html
    - match: (?=</b:\4)
      pop: 1
    - include: blogger-skin-cdata

  blogger-skin-cdata:
    - match: (<!\[)(CDATA)(\[)
      captures:
        0: meta.tag.sgml.cdata.html
        1: punctuation.definition.tag.begin.html
        2: keyword.declaration.cdata.html
        3: punctuation.definition.tag.begin.html
      embed_scope: meta.tag.sgml.cdata.html source.css.embedded.html
      embed: blogger-skin-cdata-content
      escape: ']]>'
      escape_captures:
        0: meta.tag.sgml.cdata.html punctuation.definition.tag.end.html

  blogger-skin-cdata-content:
    # Include CSS with HTML supporting comments prepended.
    # Note: We could also extend CSS to add HTML support and directly embed extended CSS.
    - include: blogger-skin-cdata-comment
    - include: scope:source.css
      apply_prototype: true

  # This context is copied from CSS.sublime-syntax to add html support to comments.
  blogger-skin-cdata-comment:
    # empty block comment
    - match: (/\*+)(\*/)
      scope: comment.block.css
      captures:
         1: punctuation.definition.comment.begin.css
         2: punctuation.definition.comment.end.css
    # normal block comment
    - match: /\*+
      scope: punctuation.definition.comment.begin.css
      push: blogger-skin-cdata-comment-content

  # This context is copied from CSS.sublime-syntax to add html support to comments.
  blogger-skin-cdata-comment-content:
    - meta_scope: comment.block.css
    - match: \*+/
      scope: punctuation.definition.comment.end.css
      pop: 1
    - match: ^\s*(\*)(?!/)
      captures:
        1: punctuation.definition.comment.css
    - include: main

###[ B:OTHER TAGS ]###########################################################

  blogger-other-tags:
    - match: (<)(b)(:)({{tag_name}})
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.localname.html
      push: blogger-other-attributes
    - match: (</)(b)(:)({{tag_name}})
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.localname.html
      push: blogger-other-closing

  blogger-other-closing:
    - meta_scope: meta.tag.blogger.end.html
    - include: tag-end

  blogger-other-attributes:
    - meta_scope: meta.tag.blogger.begin.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      set: blogger-other-body
    - include: tag-end-self-closing
    - include: tag-attributes

  blogger-other-body:
    - meta_content_scope: meta.blogger.any.html
    - match: (?=</b:\4)
      pop: 1
    - include: blogger-other-json
    - include: comment
    - include: tag
    - include: entities

  blogger-other-json:
    - match: (<!\[)(CDATA)(\[)
      captures:
        0: meta.tag.sgml.cdata.html
        1: punctuation.definition.tag.begin.html
        2: keyword.declaration.cdata.html
        3: punctuation.definition.tag.begin.html
      embed_scope: meta.tag.sgml.cdata.html source.json.embedded.html
      embed: scope:source.json
      escape: ']]>'
      escape_captures:
        0: meta.tag.sgml.cdata.html punctuation.definition.tag.end.html

#8

I will need some time to check it out ;-}

Meanwhile, let me give you some insight into this Blogger theme code …

It starts out as an XML file, so all other types of code except HTML are escaped in many ways. As a first step, plenty of <b:… and <data:… tags are processed server-side at Google. This is quite complex. The final result is then an HTML file with CSS and JavaScript.

The §§X§§ thing is just an easy trick I use in Sublime Text to enable or disable rules, so I don’t have to comment out all the lines that go with a rule …
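
Why the trick works can be seen in a few lines of Python. This is only a sketch under the assumption that the §§X§§ text never occurs in the file being parsed; the sample line is made up, and Python's re stands in for ST's regex engine:

```python
import re

# The rule pattern from the syntax, without and with the §§X§§ prefix.
enabled = re.compile(r"(?=<(?i:b:(template-)?skin)\s+[^>]*>)")
disabled = re.compile(r"§§X§§(?=<(?i:b:(template-)?skin)\s+[^>]*>)")

line = "<b:skin version='1.0.0'>"  # sample line, made up for illustration

print(enabled.search(line) is not None)  # the plain rule can fire
print(disabled.search(line) is None)     # the literal prefix never matches
```

Removing the prefix re-enables the rule without touching the lines that belong to it.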

The Blogger theme I’m working on has 4500+ lines of code. If you’re interested, I can send you a shortened version (a non-functional Blogger theme) that has all kinds of possible structures. It will still be 200-300 lines.

A little peek of what you can encounter:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html>
<html b:css='false' b:defaultwidgetversion='2' b:layoutsVersion='3' b:responsive='true' b:templateUrl='strm.xml' b:templateVersion='1.0.0' expr:dir='data:blog.languageDirection' expr:lang='data:blog.locale' xmlns='http://www.w3.org/1999/xhtml' xmlns:b='http://www.google.com/2005/gml/b' xmlns:data='http://www.google.com/2005/gml/data' xmlns:expr='http://www.google.com/2005/gml/expr'>

<head>
<b:if cond='not( or( data:view.isLayoutMode, data:view.isArchive, (data:blog.view == data:dtcs.archView), data:view.isSingleItem, and( data:view.isSearch, or( and( data:view.isLabelSearch, ( data:view.search.label in data:dtcs.okLabelSearch ) ), ( data:dtcs.okSearch any (s =&gt; data:view.search.query contains s) ) ) ) ) )'>
  <!-- *dtcs*: unaccepted URL, prepare redirecting and redirect-->
  <style>
  body{
    display:none
  }
  </style>
  <title><data:view.title.escaped/> - redirecting</title>
  <script language='javascript' type='text/javascript'>
    <!-- *dtcs*: default redirect homepage to _Toeter -->
    var redir = &#39;<b:eval expr='data:blog.searchUrl + &quot;/label/_Toeter&quot;'/>&#39; ;
    <b:if cond='(data:blog.url != data:blog.homepageUrl)'>
    var ref = document.referrer;
    <!-- *dtcs*: if referrer is dtcs.archView then redirect to dtcs.archView -->
    if ( ref.match(/\bview=<data:dtcs.archView/>\b/) !== null ) {
      redir = &#39;<b:eval expr='data:blog.url params { view: data:dtcs.archView }'/>&#39; ;
    <!-- *dtcs*: if referrer is from dated url (posts, archive) then redirect to dtcs.archView -->
    } else if ( ref.match(new RegExp(&#39;<data:blog.homepageUrl.jsEscaped/>[0-9]{4}/&#39;)) !== null ) {
      redir = &#39;<b:eval expr='data:blog.url params { view: data:dtcs.archView }'/>&#39; ;
      } else {
      var clusref = ref.match(<data:dtcs.refRegExp/>);
      <!-- *dtcs*: if referrer is a search from a known dtcs cluster, redirect to same cluster, else redirect to _Toeter -->
      if ( clusref !== null ) {
      <b:if cond='data:view.isLabelSearch'>
        <b:with value='&quot;label:@@&quot; + data:view.search.label + &quot;@@ label:&quot;' var='qparam'>
        redir = &#39;<b:eval expr='data:blog.searchUrl appendParams { q: data:qparam }'/>&#39; ;
        redir = redir.replaceAll(&#39;@@&#39;, &#39;&quot;&#39;)
        </b:with>
      <b:elseif cond='data:view.isSearch'/>
        <b:with value='data:view.search.query + &quot; label:&quot;' var='qparam'>
        redir = &#39;<b:eval expr='data:blog.searchUrl appendParams { q: data:qparam }'/>&#39; ;
        </b:with>
      </b:if>
        redir = redir + clusref;
      };
    };
    </b:if>
  window.location.replace(redir);
  </script>
<b:else/>
  <!-- *dtcs*: URL accepted, show blog -->
  <meta content='width=device-width, initial-scale=1' name='viewport'/>
  <title><data:view.title.escaped/></title>
  <b:include data='blog' name='all-head-content'/>
  <b:skin version='1.0.0'><![CDATA[

/*!normalize.css v8.0.0 | MIT License | github.com/necolas/normalize.css */
html{line-height:1.15;-webkit-text-size-adjust:100%}body{margin:0}h1{font-size:2em;margin:.67em 0}hr{box-sizing:content-box;height:0;overflow:visible}pre{font-family:monospace,monospace;font-size:1em}a{background-color:transparent}abbr[title]{border-bottom:none;text-decoration:underline;text-decoration:underline dotted}b,strong{font-weight:bolder}code,kbd,samp{font-family:monospace,monospace;font-size:1em}small{font-size:80%}sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline}sub{bottom:-0.25em}sup{top:-0.5em}img{border-style:none}button,input,optgroup,select,textarea{font-family:inherit;font-size:100%;line-height:1.15;margin:0}button,input{overflow:visible}button,select{text-transform:none}button,[type="button"],[type="reset"],[type="submit"]{-webkit-appearance:button}button::-moz-focus-inner,[type="button"]::-moz-focus-inner,[type="reset"]::-moz-focus-inner,[type="submit"]::-moz-focus-inner{border-style:none;padding:0}button:-moz-focusring,[type="button"]:-moz-focusring,[type="reset"]:-moz-focusring,[type="submit"]:-moz-focusring{outline:1px dotted ButtonText}fieldset{padding:.35em .75em .625em}legend{box-sizing:border-box;color:inherit;display:table;max-width:100%;padding:0;white-space:normal}progress{vertical-align:baseline}textarea{overflow:auto}[type="checkbox"],[type="radio"]{box-sizing:border-box;padding:0}[type="number"]::-webkit-inner-spin-button,[type="number"]::-webkit-outer-spin-button{height:auto}[type="search"]{-webkit-appearance:textfield;outline-offset:-2px}[type="search"]::-webkit-search-decoration{-webkit-appearance:none}::-webkit-file-upload-button{-webkit-appearance:button;font:inherit}details{display:block}summary{display:list-item}template{display:none}[hidden]{display:none}

/*
<!-- Constants -->

<Variable hideEditor="true" type="font" name="damionRegular36" description="Damion Regular 36" default="400 36px Damion, cursive" value="400 36px Damion, cursive"/>
<Variable hideEditor="true" type="font" name="damionRegular62" description="Damion Regular 62" default="400 62px Damion, cursive" value="400 62px Damion, cursive"/>
<Variable hideEditor="true" type="font" name="playfairDisplayBlack28" description="Playfair Display Black 28" default="900 28px Playfair Display, serif" value="900 28px Playfair Display, serif"/>
<Variable hideEditor="true" type="font" name="playfairDisplayBlack36" description="Playfair Display Black 36" default="900 36px Playfair Display, serif" value="900 36px Playfair Display, serif"/>
<Variable hideEditor="true" type="font" name="playfairDisplayBlack44" description="Playfair Display Black 44" default="900 44px Playfair Display, serif" value="900 44px Playfair Display, serif"/>
<Variable hideEditor="true" type="font" name="robotoNormal15" description="Roboto Normal 15" default="15px Roboto, sans-serif" value="15px Roboto, sans-serif"/>

Greetings,
Bart

#9

Hi Deathaxe,

I let go of the idea of using a general prototype (as it is indeed too complex) and worked further on the structure you made, extending the HTML base.

Only for JavaScript am I now using a with_prototype clause, because here additional elements can occur nearly anywhere in the code.

A Blogger theme also has simple self-closing data: tags. The code for these was easy to add.

For the Blogger tags inside JavaScript I made a dedicated context to ensure that the JavaScript code between the opening and closing tag is also highlighted correctly.

The CSS embedded in the skin tag is actually some SCSS code, but with variables and operations that are always between parentheses. A small (nonsensical) example follows to clarify. Do you have any idea what type of preprocessor code this is, or is it some Google-specific thing?

<b:skin version='1.0.0'><![CDATA[
iframe.b-hbp-video{
border:0
}
.post-body iframe,.post-body img{
max-width:100%
}
.post-body a[imageanchor=\31]{
display:inline-block
}
.bg-photo{
background:$(body.background);
background-attachment:scroll;
background-size:cover;
-webkit-filter:blur($(body.background.blur));
filter:blur($(body.background.blur));
height:calc(100% + 2 * $(body.background.blur) - $(body.background.offset.y));
position:absolute;
top:calc(0px - $(body.background.blur) + $(body.background.offset.y));
}
@media screen and (max-width:$(sidebar.width + content.width * 1.2 - 1px)){
.bg-photo{
left:$(0 - body.background.blur);
width:calc(100% + 2 * $(body.background.blur))
}
.byline{
font:$(robotoNormal15);
margin-$endSide:1em
}
.byline:last-child{
margin-$endSide:0
}
}
.dialog input[type=email],.dialog input[type=text]{
background-color:transparent;
border:0;
border-bottom:1px solid rgba($(body.text.color.red),$(body.text.color.green),$(body.text.color.blue),.12);
color:$(body.text.color);
display:block;
font-family:$(body.text.font.family);
}
.dialog input[type=email]::-webkit-input-placeholder,.dialog input[type=text]::-webkit-input-placeholder{
color:$(body.text.color)
}
]]></b:template-skin>

Besides this, the only thing left to find out is whether there is an easy way to highlight encoded JavaScript/HTML/…, as in:

<b:includable id='postMetadataJSONPublisher'>
 &quot;publisher&quot;: {
    &quot;@type&quot;: &quot;Organization&quot;,
    &quot;name&quot;: &quot;Blogger&quot;,
    &quot;logo&quot;: {
      &quot;@type&quot;: &quot;ImageObject&quot;,
      &quot;url&quot;: &quot;https://lh3.googleusercontent.com/ULB6iBuCeTVvSjjjU1A-O8e9ZpVba6uvyhtiWRti_rBAs9yMYOFBujxriJRZ-A=h60&quot;,
      &quot;width&quot;: 206,
      &quot;height&quot;: 60
    }
  },
</b:includable>
<b:widget-setting name='content'>&lt;div style=&quot;height: 164px; width: 264px; margin: -16px -8px -20px -8px; overflow: visible;&quot;&gt;
&lt;div style=&quot;height: 66px;  width: 250px; margin: -12px 0 0; position: absolute; z-index: 2; overflow: hidden;&quot;&gt;
&lt;a class=&quot;weatherwidget-io&quot; href=&quot;https://forecast7.com/nl/51d053d72/belgium/&quot; data-font=&quot;Roboto&quot; data-mode=&quot;Current&quot; data-theme=&quot;pure&quot; data-scale=&quot;0.8&quot;&gt;
&lt;/a&gt;
&lt;script&gt;
!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=&#39;https://weatherwidget.io/js/widget.min.js&#39;;fjs.parentNode.insertBefore(js,fjs);}}(document,&#39;script&#39;,&#39;weatherwidget-io-js&#39;);
&lt;/script&gt;
&lt;/div&gt;
&lt;div style=&quot;height: 320px; width: 242px; margin: -22px 0 0 8px; transform-origin: 0 0; transform: scale(1,0.8); position: relative; top: 34px;&quot;&gt;
&lt;a class=&quot;weatherwidget-io&quot; href=&quot;https://forecast7.com/nl/60d021d17/belgium/&quot; data-font=&quot;Roboto&quot; data-mode=&quot;Forecast&quot; data-theme=&quot;pure&quot; data-days=&quot;7&quot; data-scale=&quot;1&quot; &gt;
&lt;/a&gt;
&lt;script&gt;
!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[1];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=&#39;https://weatherwidget.io/js/widget.min.js&#39;;fjs.parentNode.insertBefore(js,fjs);}}(document,&#39;script&#39;,&#39;weatherwidget-io-js&#39;);
&lt;/script&gt;
&lt;/div&gt;
&lt;/div&gt;</b:widget-setting>
<b:widget-setting name='content'>&lt;script type=&quot;text/javascript&quot;&gt;
//################ Function Start
function mbtlist(json) {
  document.write(&#39;&lt;ul class=&quot;mbtlist&quot;&gt;&#39;);
  for (var i = 0; i &lt; ListCount; i++) {

//################### Variables Declared
    var listing = ListImage = ListUrl = ListTitle = ListImage = ListContent = ListConten = ListAuthor = ListTag = ListDate = ListUpdate = &quot;&quot;;
    var ListComments = thumbUrl = TotalPosts = sk = AuthorPic= ListMonth = Y = D = M = m = YY = DD = MM = mm = TT =  &quot;&quot;;

// .........

  if (showTotal == &#39;on&#39;) {
    document.write(&quot;&lt;div class=&#39;itotal&#39;&gt;&lt;span&gt; &lt;a href=&#39;&quot;+ListBlogLink+&quot;/search/label/&quot;+ListLabel+&quot;&#39;&gt;View all &lt;font&gt;&quot;+TotalPosts+&quot;&lt;/font&gt; posts in  &#8594;  &quot; +ListLabel+&quot; &lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&quot;);
  }
  document.write(&quot;&lt;/ul&gt;&quot;);
}

document.write(&quot;&lt;script src=&#39;&quot;+ListBlogLink+&quot;/feeds/posts/default/&quot;+ListLabel+&quot;?alt=json-in-script&amp;callback=mbtlist&#39;&gt;&lt;/&quot;+&quot;script&gt;&quot;);

&lt;/script&gt;

And so finally, here is the .sublime-syntax file I coded with your much appreciated help:

%YAML 1.2
---
name: Google Blogger Theme
version: 2
file_extensions:
  - xml

# Use builtin HTML as a base syntax and add rules to handle Blogger tags and inclusions.
scope: text.html.google-blogger-theme
extends: Packages/HTML/HTML.sublime-syntax
contexts:

###[ ADAPT HTML ]#############################################################

  tag-other:
    # Prepend rules to existing context defined in HTML.sublime-syntax
    # which contain rules for <b: ...> and <data: ...> tags.
    - meta_prepend: true
    - include: blogger-skin-tags
    - include: blogger-data-tags
    - include: blogger-other-tags

  comment:
    # Prepend rules to existing context defined in HTML.sublime-syntax
    # which contain rules for <!-- ... --> comment tags.
    - meta_prepend: true
    - include: dtcs-comment

###[ ADAPT EMBEDDED JAVASCRIPT ]##############################################

  script-javascript:
    # Overwrite existing context defined in HTML.sublime-syntax
    # to contain rules for <b: ...>, <data: ...> and other tags.
    - meta_scope: meta.tag.script.begin.html
    - include: script-common
    - match: '>'
      scope: punctuation.definition.tag.end.html
      set:
        - include: script-close-tag
        - include: script-javascript-content-blogger

  script-javascript-content-blogger:
    # use modified context from HTML.sublime-syntax to contain
    # rules for <b: ...>, <data: ...> and other tags.
    - match: ""
      push: Packages/JavaScript/JavaScript.sublime-syntax
      with_prototype:
        - include: script-javascript-encapsulation
        - include: comment
        - match: '{{script_close_lookahead}}'
          pop: 1
        - include: blogger-js-embed-tags
        - include: blogger-data-tags
        - include: quote-escaped

  script-javascript-encapsulation:
    - match: (//)\s*((<!\[)(CDATA)(\[))
      captures:
        1: comment.line.double-slash.js punctuation.definition.comment.js
        2: meta.tag.sgml.cdata.html
        3: punctuation.definition.tag.begin.html
        4: keyword.declaration.cdata.html
        5: punctuation.definition.tag.begin.html
    - match: (//)\s*(]]>)
      captures:
        1: comment.line.double-slash.js punctuation.definition.comment.js
        2: meta.tag.sgml.cdata.html punctuation.definition.tag.end.html

###[ B:SKIN TAGS ]############################################################

  blogger-skin-tags:
    - match: (<)(b)(:)((?:template-)?skin){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.blogskin.html
      push: blogger-skin-attributes
    - match: (</)(b)(:)((?:template-)?skin){{tag_name_break}}
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.blogskin.html
      push: blogger-skin-closing

  blogger-skin-attributes:
    - meta_scope: meta.tag.blogskin.begin.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      set: blogger-skin-body
    - include: tag-end-self-closing
    - include: tag-attributes

  blogger-skin-body:
    - meta_content_scope: meta.blogger.skin.html
    - match: (?=</b:\4)
      pop: 1
    - include: blogger-skin-cdata

  blogger-skin-closing:
    - meta_scope: meta.tag.blogskin.end.html
    - include: tag-end

  blogger-skin-cdata:
    - match: (<!\[)(CDATA)(\[)
      captures:
        0: meta.tag.sgml.cdata.html
        1: punctuation.definition.tag.begin.html
        2: keyword.declaration.cdata.html
        3: punctuation.definition.tag.begin.html
      embed_scope: meta.tag.sgml.cdata.html
      embed: blogger-skin-cdata-content
      escape: ']]>'
      escape_captures:
        0: meta.tag.sgml.cdata.html punctuation.definition.tag.end.html

  blogger-skin-cdata-content:
    # Include SCSS with HTML supporting comments prepended.
    - include: blogger-skin-cdata-comment
    - include: Packages/Sass/Syntaxes/SCSS.sublime-syntax
      apply_prototype: true

  # This context is copied from CSS.sublime-syntax to add html support to comments.
  blogger-skin-cdata-comment:
    # empty block comment
    - match: (/\*+)(\*/)
      scope: comment.block.css
      captures:
         1: punctuation.definition.comment.begin.css
         2: punctuation.definition.comment.end.css
    # normal block comment
    - match: /\*+
      scope: punctuation.definition.comment.begin.css
      push: blogger-skin-cdata-comment-content

  # This context is copied from CSS.sublime-syntax to add html support to comments.
  blogger-skin-cdata-comment-content:
    - meta_scope: comment.block.css
    - match: \*+/
      scope: punctuation.definition.comment.end.css
      pop: 1
    - match: ^\s*(\*)(?!/)
      captures:
        1: punctuation.definition.comment.css
    - include: main

###[ DATA: TAGS ]#############################################################

  blogger-data-tags:
    - match: (<)(data)(:)
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
      push: blogger-data-attribute

  blogger-data-attribute:
    - meta_scope: meta.tag.blogdata.begin.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      pop: 1
    - include: tag-end-self-closing
    - include: tag-attributes

###[ B:OTHER TAGS EMBEDDED IN JS ]############################################

  blogger-js-embed-tags:
    - match: (<)(b)(:)({{tag_name}})
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.blogger.html
      push: blogger-js-embed-attributes
    - match: (</)(b)(:)({{tag_name}})
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.blogger.html
      push: blogger-js-embed-closing

  blogger-js-embed-attributes:
    - meta_scope: meta.tag.blogger.begin.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      set: blogger-js-embed-body
    - include: tag-end-self-closing
    - include: tag-attributes

  blogger-js-embed-body:
    - meta_content_scope: meta.blogger.any.javascript
    - match: (?=</b:\4)
      pop: 1
    - include: script-javascript-content-blogger

  blogger-js-embed-closing:
    - meta_scope: meta.tag.blogger.end.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      pop: 3

###[ B:OTHER TAGS ]###########################################################

  blogger-other-tags:
    - match: (<)(b)(:)({{tag_name}})
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.blogger.html
      push: blogger-other-attributes
    - match: (</)(b)(:)({{tag_name}})
      captures:
        1: punctuation.definition.tag.begin.html
        2: entity.name.tag.namespace.html
        3: entity.name.tag.html punctuation.separator.namespace.html
        4: entity.name.tag.blogger.html
      push: blogger-other-closing

  blogger-other-attributes:
    - meta_scope: meta.tag.blogger.begin.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      set: blogger-other-body
    - include: tag-end-self-closing
    - include: tag-attributes

  blogger-other-body:
    - meta_content_scope: meta.blogger.any.html
    - match: (?=</b:\4)
      pop: 1
    - include: blogger-other-embedded
    - include: comment
    - include: tag
    - include: quote-escaped

  blogger-other-closing:
    - meta_scope: meta.tag.blogger.end.html
    - match: '>'
      scope: punctuation.definition.tag.end.html
      pop: 1

  blogger-other-embedded:
    - match: ((<!\[)(CDATA)(\[))
      captures:
        1: meta.tag.sgml.cdata.html
        2: punctuation.definition.tag.begin.html
        3: keyword.declaration.cdata.html
        4: punctuation.definition.tag.begin.html
      push:
        - match: ']]>'
          scope: meta.tag.sgml.cdata.html punctuation.definition.tag.end.html
          pop: 1
        - match: \s*(?=<)
          embed_scope: meta.tag.sgml.cdata.html source.html.embedded.html
          embed: Packages/HTML/HTML.sublime-syntax
          escape: (?=]]>)
        - match: \s*(?=\S)
          embed_scope: meta.tag.sgml.cdata.html source.json.embedded.html
          embed: Packages/JSON Key-Value/JSON Key-Value.tmLanguage
          escape: (?=]]>)

###[ ENCODED LINES ]##########################################################

  quote-escaped:
    - match: (&#)(39)(;)
      captures:
        1: constant.character.entity.number.html punctuation.definition.entity.html
        2: constant.character.entity.number.html
        3: constant.character.entity.number.html punctuation.terminator.entity.html
      push:
        - match: (?=\S)
          push:
            - match: (&#)(39)(;)
              captures:
                1: constant.character.entity.number.html punctuation.definition.entity.html
                2: constant.character.entity.number.html
                3: constant.character.entity.number.html punctuation.terminator.entity.html
              pop: 2
            - include: blogger-data-tags
            - include: blogger-js-embed-tags

  # still to do: &lt;  &quot;  &gt;

###[ DEDICATED COMMENT TAGS ]#################################################

  dtcs-comment:
    - match: (<!--)\s*(\*dtcs\*)(-?>)?
      captures:
        1: punctuation.definition.comment.begin.html
        2: dtcs.mark1
        3: invalid.illegal.bad-comments-or-CDATA.html
      push: dtc-comment-content

  dtc-comment-content:
    - meta_scope: dtcs.comment
    - include: comment-content

.
It would be nice if there were also a way to extend a .sublime-color-scheme file with some additional scopes … (or is there?)
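
As far as I understand the color scheme documentation, a file with the same name as the active scheme, placed in Packages/User, gets merged with the original and its rules are appended. A minimal sketch of that idea, assuming Monokai is the active scheme; the file name, rule names and colors are placeholders I made up, only the scopes come from the syntax above:

```json
// Hypothetical file: Packages/User/Monokai.sublime-color-scheme
// Assumption: rules listed here are appended to those of the original scheme.
{
    "rules": [
        {
            // tag name of <b:skin> and <b:template-skin>
            "name": "Blogger skin tag name",
            "scope": "entity.name.tag.blogskin.html",
            "foreground": "#E6DB74"
        },
        {
            // tag name of all other <b:...> tags
            "name": "Blogger tag name",
            "scope": "entity.name.tag.blogger.html",
            "foreground": "#66D9EF"
        },
        {
            // content of the dedicated <!-- *dtcs* ... comments
            "name": "Dedicated dtcs comment",
            "scope": "dtcs.comment",
            "foreground": "#75715E",
            "font_style": "italic"
        }
    ]
}
```

I have not verified this against every build, so take it as a starting point rather than a confirmed solution.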

Any advice on the code is always welcome!

Greetings,
Bart.

0 Likes