For a widespread operation in a big file, you’re probably better off crafting a regular expression that matches directly in the second column, without having to select the column first. Otherwise things will slow to a crawl (and selecting the data will take a long time too ;)).
An example would be something like the following:
- Find: ^([^\t]*\t[^\t]*),
- Replace: \1 
- Note: there is a space character following the \1 in the Replace field above.
Basically:
- From the start of the line, match 0 or more characters that aren’t tabs, followed by a tab
- Now the match has consumed the entire first column
- Match 0 or more characters that aren’t tabs, followed by a comma
The result is a match running from the start of the line up to a comma in the second column, with all of the preceding text captured so that it can be inserted back with a space following it instead of the comma. Because the * quantifiers are greedy, when the second column contains several commas it is the last one that gets matched.
Since we know that tab characters separate columns of data, commas in the first column are skipped: the match has to see a tab before it can continue. From there, any comma we see prior to another tab must be in the second column. We can’t see into the third column or beyond, because that would require crossing a tab, and the second part of the match excludes tab characters.
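If you want to sanity-check the expression outside the editor, here is a rough Python sketch of the same find/replace. The function name and the sample data are made up for illustration. One adjustment that is mine and not part of the expression above: in Python a negated class like [^\t] also matches a newline, so I exclude \n as well to keep a match from running onto the next line.

```python
import re

# Same idea as the Find expression above, with \n also excluded
# (an adjustment for Python) so a match stays on one line.
SECOND_COLUMN_COMMA = re.compile(r"^([^\t\n]*\t[^\t\n]*),", re.MULTILINE)


def replace_one_comma(text):
    """Replace one comma in the second tab-separated column of each line."""
    # r"\1 " puts the captured text back, followed by a space.
    return SECOND_COLUMN_COMMA.sub(r"\1 ", text)


if __name__ == "__main__":
    sample = "a,b\tc,d\te,f"
    # Only the comma in the second column (between c and d) is replaced.
    print(repr(replace_one_comma(sample)))
```

Commas in the first and third columns of the sample stay put, which is the whole point of anchoring on the tab.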
If you have examples with multiple commas in the second column you would have to run the operation more than once to find them all. Also, although this worked for me in a small contrived data set, I would recommend that you make sure you can reverse the operation if things go weird.
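As a sketch of the "run it more than once" idea, the loop below keeps applying the replacement until a pass changes nothing. This is Python rather than the editor’s own search dialog, the function name and sample line are hypothetical, and the character classes also exclude \n (my addition) so that a match cannot spill onto the following line.

```python
import re

# Find expression from above, adapted for Python: \n is excluded too.
SECOND_COLUMN_COMMA = re.compile(r"^([^\t\n]*\t[^\t\n]*),", re.MULTILINE)


def replace_all_commas_in_second_column(text):
    """Repeat the replacement until no second-column commas remain."""
    while True:
        # subn returns the new text and the number of replacements made.
        text, count = SECOND_COLUMN_COMMA.subn(r"\1 ", text)
        if count == 0:
            return text


if __name__ == "__main__":
    original = "a,b\tc,d,e\tf,g"
    cleaned = replace_all_commas_in_second_column(original)
    print(repr(cleaned))
```

Since the function returns a new string and leaves the input alone, the original text is still around if the result looks wrong.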