I’m not a programmer. I use ST4 for prose writing. Some of the text files I’m working on have been hauled in from a number of old sources. I’ve begun noticing a dimmed-out little string – <0x10> – showing up strewn throughout some of my old files. From poking around a bit online, it seems to be some kind of ASCII control character (??). I’m trying to do a search/replace to wipe them out, but search doesn’t find them. ST lets me delete them one at a time as I come across them, but I’d rather do a search/replace. Any thoughts?
Deleting <0x10> character in text files
You should be able to clobber all of them from the current buffer by selecting one before you open the Replace panel; the selected text gets put directly into the Find: field, and then you can make sure that the Replace: field is empty and click Replace All to get rid of them.
You can also select one and then repeatedly press Ctrl+D to select the others in the file, then press Backspace to remove them.
I imagine you’re seeing files created or modified in one operating system and you’re now using another. Each has its own ideas on line endings and what you’re seeing there is naked ‘linefeed’ characters.
I’m going to stab at this without being able to try it. If you enable regex matching when doing your replace (That’s the .* button), you should be able to enter \n to match those characters. So it’s:
- Bring up the replace panel
- Click on .*
- Enter into the Find box: \n
- Enter into the replace box:
(I.e. nothing in the replace box). Now you should find that replace/replace all will remove those characters. Don’t forget to un-click .* next time or you might get effects you don’t understand in your find string.
Or listen to OdatNurd above - he’s worth listening to!
Thanks, rogerjane. I’m not near my ST box at the moment so can’t try these fixes. If I have trouble with OdatNurd’s solution, I’ll give your regex alternate a try. I appreciate the help! And I’ll report results.
Just as a note, 0x10 is a DLE character (Data Link Escape); the line ending characters are 0x0d (Carriage Return) and 0x0a (Line Feed). As noted by @rogerjane above, those are line ending characters which Sublime normalizes away for you, so it’s fairly rare to see them in files as loaded. If that happens, you’d see them as <CR> instead.
Of course it is! I’m too used to interpreting 10 as a linefeed! Makes it all the odder to find them in text files - they’re otherwise known as Ctrl+P if that helps.
Anyway, it does mean my ‘\n’ won’t work but you can try \020.
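For anyone who trips over the same 10-versus-0x10 mix-up, here is a small Python sketch of the arithmetic. It is not Sublime Text itself: it uses Python's own `re` module, which happens to accept the same `\n` and `\020` spellings, and the helper name `codes` is made up for illustration.

```python
import re

DLE = "\x10"        # Data Link Escape, the character shown as <0x10>
LINE_FEED = "\n"    # Line Feed, 0x0a

def codes(ch):
    """Return the decimal, hexadecimal and octal spellings of a character's code."""
    n = ord(ch)
    return n, f"0x{n:02x}", f"0o{n:o}"

print(codes(LINE_FEED))  # (10, '0x0a', '0o12')
print(codes(DLE))        # (16, '0x10', '0o20')

sample = "old" + DLE + "text\n"

# \n removes the line feed and leaves the DLE character in place.
print(repr(re.sub(r"\n", "", sample)))    # 'old\x10text'

# \020 (octal 20 = hex 10 = decimal 16) removes the DLE character.
print(repr(re.sub(r"\020", "", sample)))  # 'oldtext\n'
```

So decimal 10 is the line feed, while the 10 inside <0x10> is hexadecimal, i.e. decimal 16 or octal 20.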
It’s not unusual to find similar non-representable characters in old files, i.e. character codes which don’t have a corresponding glyph to visually represent them, which is why the editor is informing you of their presence by showing you their hexadecimal code in the <0xnn> notation, where nn is the character code.
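If you would like an inventory of such characters before deleting anything, a rough Python sketch along these lines could help. It only imitates the <0xnn> labels; it is not how Sublime Text decides what to display, and both the function name `control_char_report` and the choice to skip tab, LF and CR are assumptions made for the example.

```python
from collections import Counter

def control_char_report(text):
    """Count control characters other than tab, LF and CR, keyed by a <0xnn> label."""
    keep = {"\t", "\n", "\r"}  # ordinary whitespace, deliberately not reported
    counts = Counter()
    for ch in text:
        # ASCII control characters are 0x00-0x1f plus 0x7f (DEL).
        if ch not in keep and (ord(ch) < 0x20 or ord(ch) == 0x7F):
            counts[f"<0x{ord(ch):02x}>"] += 1
    return dict(counts)

sample = "Dear\x10 Sir,\nplease\x10 find\x0c attached.\n"
print(control_char_report(sample))  # {'<0x10>': 2, '<0x0c>': 1}
```

Running it over the text of a file gives a quick idea of which codes are present and how often, before you decide what to search for.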
You’ll often find such characters in old text files that were manipulated via old DOS commands, redirections and/or pipes, which might have caused the insertion of some control characters.
In other cases, they might be the result of text-processing applications, e.g. inserting non-breaking space entities (<0xa0>) to prevent wrapping a line between “Mr.” and the name that follows it.
You can always look for such characters via Sublime Text’s Search functionality using Regular Expressions. The RegEx for an ASCII char is \xnn, where nn is its ASCII code in hexadecimal notation (in your example, it would be \x10).
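The same \x10 pattern also works outside the editor, which may be handy if many files are affected. Below is a minimal Python sketch, not a Sublime Text feature: the names `strip_dle` and `clean_file` are hypothetical, and it assumes the files are UTF-8 or a single-byte encoding, where the byte 0x10 can only be the DLE character. It works on raw bytes so that line endings and everything else stay exactly as they were.

```python
import re
from pathlib import Path

# Same idea as typing \x10 into the Find field with regex mode on,
# here compiled as a bytes pattern.
DLE = re.compile(rb"\x10")

def strip_dle(data: bytes) -> bytes:
    """Return data with every 0x10 byte removed."""
    return DLE.sub(b"", data)

def clean_file(path) -> int:
    """Remove 0x10 bytes from one file in place; return how many were removed."""
    p = Path(path)
    original = p.read_bytes()
    cleaned = strip_dle(original)
    if cleaned != original:
        p.write_bytes(cleaned)  # rewrite only when something changed
    return len(original) - len(cleaned)
```

Calling `clean_file("chapter1.txt")` (a made-up file name) would rewrite that file and report the number of characters removed. Try it on a copy first, since it overwrites the file.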
If you work in prose editing, I strongly advise you to get well acquainted with the ASCII table, which would allow you to recognize these non-printable chars via their hex notation, as well as to know how to use Alt codes to insert special characters which are not present on the keyboard (e.g. the © char, curly quotes, en/em dashes, typography symbols, etc.). You can also search for Unicode characters, with a very similar notation.
Alt codes also use a character’s ASCII value or its Unicode code point as the means to reference it and insert it into the text (via the numeric pad, usually).