How do I replace or find non-printable characters in vim regex?

I have a file with some non-printable characters that show up as ^C or ^B. I want to find and replace those characters. How do I go about doing that?

Removing control symbols only:
:%s/[[:cntrl:]]//g
Removing non-printable characters (note that in versions prior to ~8.1.1 this removes non-ASCII characters also):
:%s/[^[:print:]]//g
The difference between them can be seen if the file contains characters that are non-printable but not control characters, e.g. a zero-width space (U+200B).
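To make the control vs. non-printable distinction concrete outside of Vim, here is a minimal Python sketch. It is only an approximation: it uses Unicode category Cc as a stand-in for [:cntrl:] and str.isprintable() as a stand-in for [:print:], and Vim's own classes may differ in detail. The function names are made up for illustration.

```python
import unicodedata

def strip_control(text: str) -> str:
    """Drop only control characters (Unicode category Cc), keeping newlines."""
    return "".join(
        ch for ch in text
        if ch == "\n" or unicodedata.category(ch) != "Cc"
    )

def strip_nonprintable(text: str) -> str:
    """Drop everything str.isprintable() rejects, keeping newlines."""
    return "".join(ch for ch in text if ch == "\n" or ch.isprintable())

# ^C (0x03) is a control character; U+200B is non-printable but not a control.
sample = "a\x03b\u200bc\n"
print(repr(strip_control(sample)))       # the zero-width space survives
print(repr(strip_nonprintable(sample)))  # the zero-width space is removed too
```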

Say you want to replace ^C with C:
:%s/CtrlVC/C/g
Here CtrlVC means: hold Ctrl and press V, then C.
CtrlV lets you enter control characters.

Try this after saving your file in Vim (assuming you are in a Linux environment):
:%!tr -cd '[:print:]\n'
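If tr is not available (for example outside Linux), roughly the same filter can be sketched in Python. This assumes the C locale meaning of [:print:], i.e. bytes 0x20 through 0x7E; other locales may treat more bytes as printable. The function name is hypothetical.

```python
def keep_printable_ascii(data: bytes) -> bytes:
    """Keep bytes 0x20-0x7E plus newline (0x0A); delete everything else."""
    keep = set(range(0x20, 0x7F)) | {0x0A}
    return bytes(b for b in data if b in keep)

# ^C, ^B and a stray high byte are dropped; the newline is preserved.
raw = b"a\x03b\nc\x02\xe2"
print(keep_printable_ascii(raw))
```

Like the tr command, this also deletes every non-ASCII byte, so it is only suitable for files that are meant to be plain ASCII.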

None of the answers here using Vim's control characters worked for me. I had to enter a Unicode range.
:%s/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F-\x9F]//g
That Unicode range was found on this other post: https://stackoverflow.com/a/8171868/231914
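For reference, the same range works as a character class in Python's re module, which makes it easy to check what it does and does not touch. This is a sketch; the names are made up. Note that the range deliberately skips tab (0x09), line feed (0x0A) and carriage return (0x0D).

```python
import re

# Same range as the Vim command above: C0 controls except TAB, LF and CR,
# plus DEL (0x7F) and the C1 controls (0x80-0x9F).
CONTROL_RANGE = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\x7F-\x9F]")

def remove_control_range(text: str) -> str:
    """Delete every character matched by CONTROL_RANGE."""
    return CONTROL_RANGE.sub("", text)

dirty = "a\x03\tb\n\x7fc\x85\r"
print(repr(remove_control_range(dirty)))
```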

You can use:
:%s/^C//g
To enter the ^C, hold the Control key and press V, then C; the ^C will appear. This finds all occurrences and replaces them with nothing.
To remove both ^C and ^B you can do:
:%s/^C\|^B//g

You can use the CTRL-V prefix to enter them, or if they're not easily typeable, yank and insert them using CTRL-R ".

An option not mentioned in other answers.
Delete a specific unicode character with a long hex code, e.g. <200b>:
:%s/\%U200b//g
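To find out which code point to put after \%U in the first place, it can help to list the invisible characters in the text. A small Python sketch, assuming str.isprintable() is an acceptable definition of "invisible" here; tabs and newlines are skipped on purpose, and the function name is hypothetical.

```python
def find_nonprintable(text: str) -> list[tuple[int, str]]:
    """Return (offset, 'U+XXXX') for each non-printable character."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch not in "\n\t" and not ch.isprintable()
    ]

# A zero-width space at offset 1 and a ^C at offset 3.
for offset, code in find_nonprintable("a\u200bb\x03\n"):
    print(offset, code)
```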

Related

How to replace a character with another in hexadecimal

I'm a new mainframe user. I have a file and I need to change all dots '.' in the file to spaces. I tried this command
change X'05' X'40' all
after converting the file to hexadecimal, but it doesn't work.
How can I change all the dots to spaces in the file, in a simple way?
The dots are non-displayable characters. You can match them using picture strings in the ISPF editor (which is what I assume you're trying to use to edit the file?)
Try the command
change p'.' ' ' all
The "p'.'" part will match any non-displayable character and change it to a blank.
Hans's answer above will certainly change any non-displayable character to a space. However, make sure you really want to change all non-displayable characters to spaces. Turn HEX ON to look at the actual data. You can then do F p'.' to find the non-displayable character(s) before changing them. Browse shows non-displayable characters as dots. Edit, however, replaces the value with an attribute for display purposes, which keeps you from typing over the data. You have to turn on HEX mode to modify the non-displayable value manually, or use the Change command as you were trying to do. Typically any hex value from x'00' to x'3F' is non-displayable. So a
C P'.' X'40' ALL
would modify every one of those values to a space. This may or may not be desirable depending on the file.
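To see what such a change does at the byte level, here is a rough Python sketch that operates on a record downloaded in binary. It assumes, as stated above, that the non-displayable values are exactly x'00' through x'3F'; what ISPF's p'.' actually matches depends on the terminal, so treat this as an approximation. The names are made up.

```python
# Translation table: bytes 0x00-0x3F become 0x40 (EBCDIC space), the rest
# are left unchanged.
TABLE = bytes(0x40 if b < 0x40 else b for b in range(256))

def blank_low_bytes(record: bytes) -> bytes:
    """Replace every byte below 0x40 with an EBCDIC space."""
    return record.translate(TABLE)

# 'A', x'05', 'B', x'00' in EBCDIC (code page 037).
record = bytes([0xC1, 0x05, 0xC2, 0x00])
cleaned = blank_low_bytes(record)
print(cleaned.hex(), repr(cleaned.decode("cp037")))
```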

I see a character called xDB on notepad++. What character is this?

What is this character
All I really need to know is what is this character. I have not seen anything like this before.
How do I remove this using VB.NET:
data = data.Replace(Chr(???????), "")
Is there a specific control character decimal number or something similar for this character that I can use in place of the question marks?
Please help.
I tried looking through the HTML, ASCII and regex references to find this character, but I did not find it anywhere.
To prevent possible bugs related to the encoding of your source files, you should use a hex editor (such as this Notepad++ plugin) to find the hexadecimal code of the character, then use that to reference the character in your code:
data = data.Replace(ChrW(&HDB), "")
as opposed to:
data = data.Replace("Û", "")
Note: In this case the hex editor is unnecessary because xDB is already a hex code, but other control characters, such as CR and LF, are not displayed as their hex values [in Notepad++].
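The same idea, referencing the character by its hex code instead of typing it literally, can be sketched in Python. This assumes the file is read as Latin-1, so that the byte 0xDB becomes the character U+00DB; with another encoding the stray byte may need to be handled differently. The names are hypothetical.

```python
STRAY = chr(0xDB)  # U+00DB, built from the hex code rather than typed as a literal

def remove_stray(data: str) -> str:
    """Delete every occurrence of the character with code 0xDB."""
    return data.replace(STRAY, "")

# Assumption: the raw bytes are decoded as Latin-1 (one byte per character).
text = b"a\xdbb".decode("latin-1")
print(remove_stray(text))
```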

SCI_AUTOCSHOW on Scintilla

I have a question, when I use
CallScintilla(SCI_AUTOCSHOW, nLen, (LPARAM)m_strCandidate.c_str())
to show the autocompletion window. It works when I type a word, but it doesn't always work when I press Backspace. Is there some conflict with the Backspace key?
Use SCI_AUTOCSETSEPARATOR to set the separator character used to separate words in the SCI_AUTOCSHOW list. The default is the space character.

How to remove stray '\302' in C++?

How to remove stray '\302' in C++?
I do not want to remove them one by one by hitting the delete key.
Thanks
You don't. You use something such as libiconv or ICU to convert the UTF-8 text to a charset you can understand.
Since the OP's post is currently tagged vi, I assume the OP is looking for a way to delete all such characters (those with octal value 302) from within this editor.
:%s/\%o302//g
The above command will search the whole file for the octal value \302 and replace every instance with an empty string.
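Some background that may help: octal 302 is hex C2, which is the first byte of many two-byte UTF-8 sequences. A frequent culprit in pasted source code is the non-breaking space U+00A0, encoded as C2 A0, though that is an assumption about this particular file. The Python sketch below replaces that character with a normal space instead of deleting one byte of it. The names are made up.

```python
NBSP = "\u00a0"  # non-breaking space; UTF-8 bytes C2 A0 (octal 302 240)

def replace_nbsp(source: str) -> str:
    """Swap non-breaking spaces for ordinary spaces in source text."""
    return source.replace(NBSP, " ")

print(oct(NBSP.encode("utf-8")[0]))      # first byte of the encoding, in octal
print(replace_nbsp("int\u00a0x = 0;"))
```

Replacing the whole character avoids leaving the second byte behind, which a compiler would otherwise report as another stray character.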

How can I use gvim to add a carriage return (aka ENTER) to a pattern?

What's the vi/gvim syntax to replace a pattern with a pattern that includes <ENTER>? I know this is possible but never felt like diving too deep in the documentation to know how to do it.
Something like this:
:s/\(word\)/\1<ENTER>/
But correctly :)
Thanks
Use the "escape" encoding:
:s/\(word\)/\1\r/
See the Vim documentation for pattern whitespace escapes.
Alternatively, use Ctrl+V or Ctrl+Q to quote (escape) the Enter key:
:s/\(word\)/\1^QENTER/
Where ^Q is Ctrl+Q and ENTER is the Enter key.
Clarification: Depending on your installation, either ^Q or ^V should work. The quoting character differs on some platforms.
(This has the helpful side-effect of inserting the appropriate end-of-line character for whichever platform you're using, eliminating the CR vs. LF vs. CRLF problem.)
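For comparison, the same substitution can be sketched with Python's re module. One difference worth knowing: in a Vim replacement, \r is what breaks the line, while \n inserts a NUL character; in Python the replacement uses \n. The function name and default argument are made up, and count=1 mirrors :s without the g flag.

```python
import re

def break_after_first(line: str, word: str = "word") -> str:
    """Insert a line break after the first occurrence of `word`."""
    return re.sub(f"({re.escape(word)})", r"\1\n", line, count=1)

print(repr(break_after_first("a word b word")))
```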
Just for clarification, now that we're talking about carriage returns: the RETURN and ENTER keys are not the same, or, more precisely, they should not be the same.
I haven't used a desktop keyboard for some time now, but the ENTER key is usually the one at the bottom right, while the RETURN key is the big one in the middle.
RETURN key is the one that should be used for entering a carriage return, while ENTER key is the one that should be used for entering commands. I remember an old DOS editor EDT, in which RETURN key was for newline and ENTER key was for giving commands. You couldn't give a command with RETURN. I think ENTER also gave ^1 (line feed).
Today that difference is somewhat lost, although I still, now and then, run into an editor that respects it.