Regex - Unicode combining character sequence \x - text terminal - regex

This PDF document, in section VI, "Other Special Characters", says:
e. ANSCII or ANSI codes
1. Codes that control appearance of a text terminal
2. 0xA9 = \xA9
I don't understand the phrase "appearance of a text terminal".
What does it mean?

Presumably the author meant terminal attributes like text and background color, character set, character attributes (bold, underscored, blinking, inverse) etc.
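As a rough illustration of such attributes, the sketch below builds ANSI "Select Graphic Rendition" sequences for a few of them. The helper name `styled` and the `SGR` table are made up for this example, and whether a given terminal honors each code is an assumption.

```python
ESC = "\x1b"  # ASCII escape character (decimal 27, hex 1B)

# A few common SGR parameters; the table name and keys are illustrative only.
SGR = {
    "reset": 0,
    "bold": 1,
    "underline": 4,
    "blink": 5,
    "inverse": 7,
    "red_fg": 31,
    "blue_bg": 44,
}


def styled(text, *attrs):
    """Wrap text in one SGR sequence and reset all attributes afterwards."""
    codes = ";".join(str(SGR[name]) for name in attrs)
    return f"{ESC}[{codes}m{text}{ESC}[0m"


if __name__ == "__main__":
    # On a terminal that supports these codes, this should appear bold and red.
    print(styled("warning", "bold", "red_fg"))
```

The trailing reset (code 0) keeps the attributes from leaking into whatever is printed next.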

Related

Why does ^z have two ASCII codes?

When I put the control key Ctrl + Z at the beginning of the string, its ASCII code is zero, but when I put it at the end of a string, it has an ASCII code of 26.
Ex:
^zhi --> ASCII ^z=0
But
Hi^z --> ASCII ^z=26
Why is this?
Ctrl-Z is the "substitute character" (ASCII code 26):
https://en.wikipedia.org/wiki/Substitute_character
A substitute character (␚) is a control character that is used in the
place of a character that is recognized to be invalid or erroneous, or
that cannot be represented on a given device. It is also used as an
escape sequence in some programming languages.
As such, it can translate to different outputs in different contexts.
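For reference, the value 26 follows from the usual caret-notation rule that Ctrl plus a letter yields the letter's code with the upper bits masked off. The function name `ctrl` below is hypothetical, and the sketch does not explain the 0 seen at the start of the string, which likely depends on how the program reads its input.

```python
def ctrl(letter):
    """Return the ASCII code conventionally produced by Ctrl+<letter>."""
    # Keep only the low five bits of the uppercase letter's code.
    return ord(letter.upper()) & 0x1F


if __name__ == "__main__":
    print(ctrl("Z"))       # code for Ctrl+Z
    print(ord("\x1a"))     # the SUB control character written as a hex escape
```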

printf data type specifier complex question

printf("\e[2J\e[0;0H");
What does this line mean?
Can I know what to learn and from where to understand this statement?
"\e" as an escape sequence is not part of the C standard.
A number of compilers give the otherwise undefined escape a meaning: the character with the value 27, the ASCII escape character.
Alternative well-defined code:
//printf("\e[2J\e[0;0H");
printf("\x1B[2J\x1b[0;0H");
printf("\033[2J\033[0;0H");
#define ESC "\033"
printf(ESC "[2J" ESC "[0;0H");
The escape character introduces ANSI escape sequences, as explained well in @Mickael B.'s answer. Not every terminal implements these sequences, and those that do may support only some of them.
They are ANSI escape sequences:
These sequences define functions that change display graphics, control cursor movement, and reassign keys.
Each sequence starts with \e[ and the characters that follow define what should happen.
2J clears the terminal:
Esc[2J Erase Display:
Clears the screen and moves the cursor to the home position (line 0, column 0).
0;0H moves the cursor to the position (0, 0), the top-left corner:
Esc[Line;ColumnH Cursor Position:
Moves the cursor to the specified position (coordinates).
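The two sequences described above can be assembled programmatically. The following is a minimal sketch; the function names are invented for illustration, and it assumes the output device is a terminal that understands ANSI sequences. Many terminals number lines and columns from 1 and treat 0 as 1, so the sketch defaults to 1;1 for the home position.

```python
import sys

ESC = "\x1b"       # ASCII escape, the portable spelling of "\e"
CSI = ESC + "["    # Control Sequence Introducer


def erase_display(mode=2):
    """Build the Erase Display sequence; mode 2 means the whole screen."""
    return f"{CSI}{mode}J"


def cursor_position(line=1, column=1):
    """Build the Cursor Position sequence for the given line and column."""
    return f"{CSI}{line};{column}H"


def clear_screen():
    """Clear the screen, then move the cursor to the top-left corner."""
    return erase_display(2) + cursor_position(1, 1)


if __name__ == "__main__":
    sys.stdout.write(clear_screen())
    sys.stdout.flush()
```

Sending the cursor-position sequence explicitly avoids relying on the erase sequence to move the cursor, which not all terminals do.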
See also:
console_codes - Linux console escape and control sequences
List of ANSI color escape sequences

I see a character called xDB on notepad++. What character is this?

What is this character
All I really need to know is what is this character. I have not seen anything like this before.
How do I remove this using VB.NET:
data = data.Replace(Chr(???????), "")
Is there a specific character code (a decimal number or similar) for this character that I can use in place of the ??????? above?
Please help.
I tried looking through the HTML, ASCII and regex references to find this character, but I did not find it anywhere.
To prevent possible bugs related to the encoding of your source files, you should use a hex editor (such as this Notepad++ plugin) to find the hexadecimal code of the character, then use that to reference the character in your code:
data = data.Replace(ChrW(&HDB), "")
as opposed to:
data = data.Replace("Û", "")
Note: In this case the hex editor is unnecessary because xDB is already a hex code, but other control characters, such as CR and LF, are not displayed as their hex values [in Notepad++].
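The same idea of referring to the character by its numeric code rather than typing it literally can be sketched in Python. The helper names are hypothetical, and which variant applies depends on an assumption about the data: whether it has already been decoded to text or is still raw bytes.

```python
UNWANTED = chr(0xDB)  # the character with code point U+00DB, given by number


def strip_char(data, ch=UNWANTED):
    """Remove every occurrence of ch from already-decoded text."""
    return data.replace(ch, "")


def strip_byte(raw, value=0xDB):
    """Remove every occurrence of a single byte value from raw bytes."""
    return raw.replace(bytes([value]), b"")


if __name__ == "__main__":
    print(strip_char("abc" + chr(0xDB) + "def"))
    print(strip_byte(b"abc\xdbdef"))
```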

How to use extended unix characters in C++ in Visual Studio?

We are using a Korean font and the FreeType library and trying to display a Korean character, but it displays some other characters instead of the expected hieroglyph.
Code:
std::wstring text3 = L"놈";
Are there any tricks for typing Korean characters?
For maximum portability, I'd suggest avoiding encoding Unicode characters directly in your source code and using \u escape sequences instead. The character 놈 is Unicode code point U+B188, so you could write this as:
std::wstring text3 = L"\uB188";
The question is what encoding the source file uses.
It is likely UTF-8, which is one of the reasons not to use wstring. Use regular string. For more information on my way of handling characters, see http://utf8everywhere.org.
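To see how the code point relates to the bytes stored in a file, the sketch below prints the code point of the character together with its UTF-8 and UTF-16 (little-endian) encodings. The function name `describe` is made up for this example, and the sketch is only a way to inspect the values, not a statement about what Visual Studio does with the source file.

```python
def describe(ch):
    """Return the code point label and two encodings of a single character."""
    label = f"U+{ord(ch):04X}"
    return label, ch.encode("utf-8"), ch.encode("utf-16-le")


if __name__ == "__main__":
    # "\uB188" is the escape form of the Korean syllable from the question.
    label, utf8_bytes, utf16_bytes = describe("\uB188")
    print(label, utf8_bytes.hex(" "), utf16_bytes.hex(" "))
```

If a file saved as UTF-8 is read under a different assumed encoding, the three UTF-8 bytes would be interpreted as other characters, which is consistent with the symptom described.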

Delimiting Character

We are loading a fixed-width text file into a SAS dataset.
The character we are using to delimit multivalued field values is being interpreted as two characters by SAS. This breaks things, because the fields are of a fixed width.
We can use characters that appear on the keyboard, but obviously this isn't as safe, because our data could actually contain those characters.
The character we would like to use is '§'.
I'm guessing this may be an encoding issue, but don't know what to do about it.
Could you specify the character by its hex code, as in DLM='09'x, changing 09 to the right code for your character?
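One way to check the encoding guess is to look at how many bytes '§' occupies in different encodings. This is a small sketch in Python rather than SAS; the helper name `encoded_bytes` is hypothetical, and which encodings the file and the SAS session actually use is an assumption to verify.

```python
SECTION_SIGN = "\u00a7"  # the section sign character from the question


def encoded_bytes(ch, encoding):
    """Return the bytes that represent ch in the given encoding."""
    return ch.encode(encoding)


if __name__ == "__main__":
    for name in ("latin-1", "cp1252", "utf-8"):
        raw = encoded_bytes(SECTION_SIGN, name)
        print(name, len(raw), raw.hex(" "))
```

A single byte in Latin-1 or Windows-1252 but two bytes in UTF-8 would be consistent with the delimiter being counted as two characters when the file and the session disagree on the encoding.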