Sublime Forum

**me_suzy** · January 10, 2018, 2:05pm

Hello. I have this text with many &nbsp; entities inside this tag:

 <p class="my_class">An Extension&nbsp;of Java for Event Correlation. 571 geographical/logical coordinates, or sources. Henceforth,&nbsp;we will use the term&nbsp;events to refer to&nbsp;both the incidents underlying such&nbsp;events as well as to their incarnations&nbsp;and notifications</p>

I want to select this tag and replace every &nbsp; with a regular space.

First of all, I select the tag and the content: (?s)([^<]*)

Then I tried to include &nbsp; in this regex so that only the &nbsp; is selected:

(?s).*?&nbsp;([^<]*) but it doesn’t work. Can anyone help me?

**kingkeith** · January 10, 2018, 2:10pm

I’ll give my usual advice: keep it simple and performant.

Keep your first working search of (?s)([^<]*), click Find All, then do a new search, “in selection”, for &nbsp;

or, for better selection of the p tag contents (as this regex will not work if the paragraph contains child elements), don’t use regex to parse HTML, but try the right tool for the job instead like:
https://packagecontrol.io/packages/xpath
to select the tags, then do the replace in selection

**me_suzy** · January 10, 2018, 2:14pm

Yes, but that’s the problem: I need to run the search and replace across more than 100 HTML pages.

**kingkeith** · January 10, 2018, 2:25pm

ah I see, why not just

(<p class="my_class">[^&<]*+)&nbsp;

replace with \1

although you will need to execute it as many times as &nbsp; appears in the tag’s inner HTML

unfortunately there’s not really a better way, though maybe you can reduce the number of times it needs to be executed by adding something like (?:([^&<]*)&nbsp;)? to the end and replacing with \1\2
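As a rough illustration of the "run it until nothing matches" idea, here is a minimal Python sketch. It is not what Sublime does internally: it uses Python's `re` module, drops the possessive `*+` (older Python versions reject it), and replaces with `\1` plus a space, which is an assumption based on the stated goal. The name `replace_nbsp_in_tag` is made up for this example.

```python
import re

# Opening tag plus any text that contains no "&" or "<", followed by one &nbsp;.
# Each pass can therefore fix only the first remaining &nbsp; in each paragraph.
PATTERN = re.compile(r'(<p class="my_class">[^&<]*)&nbsp;')


def replace_nbsp_in_tag(text):
    """Repeat the substitution until a pass makes no replacement."""
    while True:
        text, count = PATTERN.subn(r'\1 ', text)
        if count == 0:
            return text


sample = '<p class="my_class">a&nbsp;b&nbsp;c</p><p>x&nbsp;y</p>'
print(replace_nbsp_in_tag(sample))
```

Like the regex it is based on, this stops at the first other entity (such as `&amp;`) or child element inside the paragraph, so any &nbsp; after that point is left alone.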

**me_suzy** · January 10, 2018, 2:36pm

Your regex, (<p class="my_class">[^&<]*+)&nbsp; replaced with \1, will find and replace only the first instance of &nbsp;

But I need to select and replace every &nbsp; inside the tag

**kingkeith** · February 28, 2018, 11:36am

which is why I recommended to execute the replacement multiple times, until there are no matches - or to duplicate parts of the expression so you can replace multiple capture groups at once…

otherwise, what you want is impossible without using a regex engine like .NET’s that stores all captured text that matched for a capture group:
https://www.regular-expressions.info/captureall.html

you could probably get clever using \G though, and skip the start of the file:

(?:(?!\A)\G|(<p class="my_class">))([^&<]*+)&nbsp;

replace with \1\2 followed by a space

**me_suzy** · January 10, 2018, 4:32pm

I can not handle it

**facelessuser** · January 10, 2018, 11:57pm

This is something that is really best done with a mix of regular expression and coded logic. A pure regex solution is pretty much impossible.

So for a quick example, I’ll use an application I wrote called Rummage to illustrate the logic. First, we would use this pattern:

(<p class="my_class">)([^<]+)

And use a little Python code:

```python
from rummage.lib import rumcore


class NbspReplace(rumcore.ReplacePlugin):
    def replace(self, m):
        return m.group(1) + m.group(2).replace('&nbsp;', ' ')


def get_replace():
    return NbspReplace
```

And you can see the results:

Essentially you can put similar logic in a script and make the changes. Or use something like the plugin RegReplace and apply it to your files. Often if I’m making changes across multiple files, I’ll use Rummage as it will crawl folders and such, and I don’t have to rewrite all that logic. But I’m sure there are other things out there that you can use that can do the same thing.
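For the "put similar logic in a script" route, a minimal standalone sketch using only the Python standard library might look like the following. It reuses the pattern from this post; the function names, the `*.html` glob, and the UTF-8 encoding are assumptions for illustration, so adjust them to your files and back the folder up first.

```python
import re
from pathlib import Path

# Group 1 is the opening tag, group 2 is the text up to the next "<".
PATTERN = re.compile(r'(<p class="my_class">)([^<]+)')


def fix_nbsp(match):
    """Keep the tag as-is and swap every &nbsp; in its text for a space."""
    return match.group(1) + match.group(2).replace('&nbsp;', ' ')


def fix_text(text):
    return PATTERN.sub(fix_nbsp, text)


def fix_folder(folder, encoding='utf-8'):
    """Rewrite every .html file under folder; return the paths that changed."""
    changed = []
    for path in sorted(Path(folder).rglob('*.html')):
        original = path.read_text(encoding=encoding)
        updated = fix_text(original)
        if updated != original:
            path.write_text(updated, encoding=encoding)
            changed.append(path)
    return changed
```

Calling `fix_folder('path/to/site')` (a placeholder path) would process the folder recursively. Because the pattern stops at the first `<`, text after a child element inside the paragraph is not touched.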

**me_suzy** · January 11, 2018, 6:45am

SEARCH: (?-si)(?!#)|( )(?=.+#)|#

REPLACE: (?1$0#)(?2\x20)

IMPORTANT : you’ll have to click TWICE, on the Replace All button

**me_suzy** · January 12, 2018, 5:34am

and another answer:

SEARCH: (?:\G(?!^)|<p\s+class="my_class">)(?:(?!</p>).)*?\K&nbsp;
REPLACE: a single space

or

SEARCH: (?-s)(\G(?!^)|<p\s+class="text_obisnuit">)((?!</p>).)*?\K&nbsp;
REPLACE: a single space
