Sublime Forum

Global replacement

#1

It seems like it should be easy, but I can’t see what I’m doing wrong.

I have a dictionary - a large file with a few words per line. I want to replace YY with ZZ at the end of every line that has XX in it - that’s all!

If I understood well how findall() or search() works, maybe I could just find XX and then replace any following YY with ZZ. But how would I know it’s on the same line?

So I’m trying to use sub(), but maybe I’m calling it wrong. Sub() makes all the replacements in one call and returns the new string, right? I guess the entire 3.5M file is one string, or is it broken into lines?

import re
self.re.sub(“XX.+)YY\n”, “\1ZZ\n”, edit)

Thanks in advance

#2

There is a missing bracket in your search string before XX?

If that’s still not working, you might want to compile your regexes first using re.compile to avoid backslash headaches in matching. Read more about using Python’s re module here.

Assuming you don’t want to do anything more clever than what you posted, writing your own plugin might be overkill, in which case take a look at the RegReplace plugin. And of course the regexes you mention will, with the missing bracket fixed, work in the Sublime replace UI.
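
To illustrate the bracket fix and the backslash issue outside Sublime, here is a minimal sketch in plain Python. The three-line `text` is a made-up stand-in for the dictionary file, and the raw-string prefix is my assumption about what avoids the backslash trouble in practice.

```python
import re

# Hypothetical three-line stand-in for the dictionary file.
text = "aXXbYY\ncYY\nXXdYY\n"

# Without the r prefix, "\1" is the control character chr(1),
# not a backreference to group 1.
assert "\1" == chr(1)
assert r"\1" == "\\1"

# Corrected pattern: the capturing group now has its opening bracket.
# "." does not match a newline by default, so XX and YY must share a line.
pattern = re.compile(r"(XX.+)YY\n")
fixed = pattern.sub(r"\1ZZ\n", text)
print(fixed)
```

Note that this pattern needs a newline after YY, so a last line without a trailing newline would be left alone.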

:)

#3

There was an extra ) before YY, but I don’t think my problem has to do with what’s inside the call to sub() - that’s well-documented.

My problem is that I don’t know how to call sub(), nor how to pass to it the entire file as an argument, nor how to replace the file with the return value.

For example, if I have a file that contains:
North Carolina
North Dakota
North Korea

And I want to replace all the Norths with South using sub(), would the package be:
import re
import sublime
import sublime_plugin

class TestCommand(sublime_plugin.TextCommand):
def run(self, edit):
self.sub(“North”, “South”, edit)

Or should the last line be
self = re.sub(“North”, “South”, edit)

Or something else?

#4

Something like:

import sublime
import sublime_plugin

class TestCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        regions = self.view.find_all(r'North') # Or maybe r'\bNorth\b'.
        for region in reversed(regions):
            self.view.replace(edit, region, 'South')

You have to use self.view, which is a sublime.View object. view.find_all() will return a list of sublime.Region objects matching the given pattern. view.replace() will modify the view, replacing the contents of the given region with the given text.
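
In case the reversed() looks odd: as far as I understand it, the regions are plain offsets, so replacing from the end backwards keeps the earlier offsets valid when the replacement has a different length. A rough sketch in plain Python, where `replace_regions` is a hypothetical helper standing in for repeated view.replace() calls and "Southern" is chosen only because its length differs from "North":

```python
import re

text = "North Carolina\nNorth Dakota\nNorth Korea\n"

# Stand-in for view.find_all(): (start, end) offsets of every match.
regions = [m.span() for m in re.finditer(r"\bNorth\b", text)]

def replace_regions(text, regions, replacement):
    """Replace each (start, end) span in the order given (hypothetical helper)."""
    for start, end in regions:
        text = text[:start] + replacement + text[end:]
    return text

# Back to front: spans earlier in the text are untouched by each edit.
back_to_front = replace_regions(text, reversed(regions), "Southern")

# Front to back: after the first edit the remaining offsets are stale.
front_to_back = replace_regions(text, regions, "Southern")

print(back_to_front)
print(front_to_back == back_to_front)
```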

N.B. This forum supports fenced code blocks.

#5

Thank you.

I already use very similar code when calling replace(), but this particular replacement depends on a context earlier in the line, so I thought I had to use sub() and a regex.

And I have the idea that a single call to sub() will replace all the matches, so I don’t have to iterate like I did with replace().

But my problem is in getting any call to sub() to work. I don’t know how to pass the file data to the call, nor what to do with the return. Might I have to use self.view with sub(), too? Do I need to refer to the data as edit when passing TO the function, but as self when using the return value?

Could I ask you for an example with sub()?

#6

Keep in mind that in order to get text into the buffer, you need to use view.replace() (since you’re trying to replace some piece of text with another). That call requires you to provide the region that’s going to be replaced and the text as well.

If you want to use re.sub() to do all the replacements at once, then you need to:

  1. Grab the entire contents of the buffer all at once as a string
  2. Run your re.sub() to perform the replacements
  3. Put the entire contents of the buffer (with replacements) back into the buffer.

That’s certainly possible; roughly something like this:

# 1
region = sublime.Region(0, self.view.size())            
content = self.view.substr(region)
# 2
new_content = re.sub(somethingorother)
# 3
self.view.replace(edit, region, new_content)
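
To make step 2 concrete, here is a sketch in plain Python (runnable outside Sublime) using the XX/YY example from the first post. The helper name `replace_line_endings` and the sample text are made up for illustration. It uses re.subn() rather than re.sub() because subn also returns the number of replacements, which would let a plugin skip step 3 when nothing matched.

```python
import re

def replace_line_endings(content):
    """Step 2 only: turn a trailing YY into ZZ on lines that contain XX.

    Returns (new_content, number_of_replacements).
    """
    # re.MULTILINE lets $ match at the end of every line; "." does not
    # match a newline, so XX and YY always sit on the same line.
    return re.subn(r"(XX.*)YY$", r"\1ZZ", content, flags=re.MULTILINE)

# Hypothetical buffer contents, as view.substr(region) might return them.
sample = "fooXX barYY\nbazYY\nXXYY quxYY"
new_content, count = replace_line_endings(sample)
print(count)
print(new_content)
```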

Extracting the entire contents of a file is wasteful, though, unless you know that most of the text needs replacing. For example, if your file was 5MB and the word North appears once, this is akin to driving in a nail by ramming it with a speeding pickup truck.

On the other hand, you can also do something like this using find_all; it has the power to extract the matched text and give it to you (leaving the rest alone) and the ability to format the strings on the way through.

A modified example of the plugin @ThomSmith posted above might be:

import sublime
import sublime_plugin

class TestCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        result = []
        regions = self.view.find_all(r'^((?:This|That) .*) North', 0,
                                     r"\1 South", result)
        for idx,region in enumerate(reversed(regions)):
            self.view.replace(edit, region, result[-idx - 1])

This tells the find_all call to put the matched text into the result array along with telling you the region that it was found in, and also modifies the text as it’s extracting it so that it’s ready for replacement later.

Input:

This is North Carolina
Where is North Dakota
That was North Korea

Output:

This is South Carolina
Where is North Dakota
That was South Korea

Not all of the instances of the word North are replaced, only those on lines that start with This or That.
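
If you want to try the pattern outside Sublime first, the same idea can be sketched with Python's re module. This is only an approximation of what find_all does with its format string: Python needs re.MULTILINE for ^ to match at every line start, and match.expand() stands in for the `\1 South` formatting.

```python
import re

text = "This is North Carolina\nWhere is North Dakota\nThat was North Korea\n"

pattern = re.compile(r"^((?:This|That) .*) North", re.MULTILINE)

# One pass over the matches: where each one sits, and what it becomes.
matches = list(pattern.finditer(text))
regions = [m.span() for m in matches]
result = [m.expand(r"\1 South") for m in matches]

# Apply the replacements back to front, pairing each region with its text.
for (start, end), replacement in zip(reversed(regions), reversed(result)):
    text = text[:start] + replacement + text[end:]

print(text)
```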

#7

May I abuse your goodwill by asking for more help?

Here is my code, which as you can see is a near copy of yours. At least it seems to be running!

import sublime
import sublime_plugin

class TestCommand(sublime_plugin.TextCommand):
def run(self, edit):
result = []
regions = self.view.find_all(r"((?:AS:|A’S:|AS’:) .*) AX0 Z $", 0, r"\1 AH0 Z ", result)
for idx, region in enumerate(reversed(regions)):
self.view.replace(edit, region, result[-idx-1])

I made a few changes in the arguments to find_all()

  1. I retained the leading r, although the pattern contains no \
  2. I removed the leading ^, since I want the replacement to operate on lines where the pattern IS matched, not those where it’s not.
  3. My “contextual” search string is any one of AS: A’S: AS’:, with apostrophe and colon. Perhaps the colons are confusing the conditional, or the apostrophes the pattern.
  4. The string to be replaced is AX0 Z (with trailing space), only at the end of the line, by AH0 Z (again with trailing space).

Is my error obvious to you?

For your interest, the original text is an English pronouncing dictionary, and this bit of code is trying to disambiguate when a reduced vowel is realized as a mid schwa vowel or as a high schwi vowel. When the spelled word ends in “as”, possibly with apostrophes, it’s pronounced as a schwa, for example in “Alicia’s” but not “Alice’s”.

#8

I imagine the problem is that the command just doesn’t seem to do anything? If so, that’s an indication that the regex isn’t matching. Do you have a sample of some data that this is supposed to match but doesn’t?

#9

Please use fenced code blocks. It is very difficult to read the code otherwise.

#10

Here’s an example:

AACHEN:AA1 K AX0 N 
ABACUS:AE1 B AX0 K AX0 S 
ABALONE:AE2 B AX0 L OW1 N IY0 
ABANDON:AX0 B AE1 N D AX0 N 
ABLEST:EY1 B L AX0 S T 
ABRAXAS:AX0 B R AE1 K S AX0 Z 
ACCOMPLICE:AX0 K AA1 M P L AX0 S 
ACCOMPLICES:AX0 K AA1 M P L AX0 S AX0 Z 
ACURAS:AE1 K Y ER0 AX0 Z 
ADIDAS:AX0 D IY1 D AX0 S 
ADMIRAL:AE1 D M ER0 AX0 L 
AENEAS:AE1 N IY0 AX0 S 
ALICE'S:AE1 L IH0 S AX0 Z 
ALICIA'S:AX0 L IH1 SH AX0 Z 
ARENA:ER0 IY1 N AX0 

#11

I seem to have solved my problem. The following code replaces AX0 with AH0 only on lines whose word ends in AS, A’S or AS’, and only when the AX0 is followed by a final S or Z.

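import sublime_plugin

class TestCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        result = []
        # Groups are numbered by opening parenthesis, so the final (S|Z) is
        # group 6; group 2 is the AS:/A'S:/AS': alternation.
        regions = self.view.find_all(r"(((AS:)|(A'S:)|(AS':)).*) AX0 (S|Z) $", 0, r"\1 AH0 \6 ", result)
        for idx, region in enumerate(reversed(regions)):
            self.view.replace(edit, region, result[-idx-1])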

There turned out to be no need for conditionals - it’s just pattern matching. But I never would have figured out the code without your help - thank you very much!
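
For anyone reusing this pattern, it may be worth checking the group numbers against a few of the sample lines in plain Python before running it on the whole dictionary. Groups are counted by opening parenthesis, so with this many nested brackets the trailing (S|Z) ends up as group 6. A small sketch, assuming Python's re numbers groups the same way Sublime's engine does:

```python
import re

pattern = re.compile(r"(((AS:)|(A'S:)|(AS':)).*) AX0 (S|Z) $")

# Three lines taken from the sample data earlier in the thread.
lines = [
    "ABRAXAS:AX0 B R AE1 K S AX0 Z ",
    "ALICE'S:AE1 L IH0 S AX0 Z ",
    "ALICIA'S:AX0 L IH1 SH AX0 Z ",
]

m = pattern.search(lines[0])
print(m.group(2))  # the word ending that matched
print(m.group(6))  # the final consonant

# Referring to group 6 keeps the final S or Z in the replacement.
fixed = [pattern.sub(r"\1 AH0 \6 ", line) for line in lines]

# Referring to group 2 would insert the word ending instead.
wrong = pattern.sub(r"\1 AH0 \2 ", lines[0])

for line in fixed:
    print(line)
```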
