Translate \n new line from Char to String in SML/NJ - sml

I am trying to convert #"\n", a Char, to "\n", a String. I used
Char.toString(#"\n");
and it gives
val it = "\\n" : string
Why doesn't it return "\n"?

From the documentation, Char.toString
returns a printable string representation of the character, using, if
necessary, SML escape sequences.
The documentation also specifies that some control characters are converted to two-character escape sequences, and \n is one of them.
To return a string of size one, use String.str.
- String.str(#"\n");
val it = "\n" : string

Related

Str.global_replace in OCaml putting carets where they shouldn't be

I am working to convert multiline strings into a list of tokens that might be easier for me to work with.
In accordance with the specific needs of my project, I'm padding any caret symbol that appears in my input with spaces, so that "^" gets turned into " ^ ". I'm using something like the following function to do so:
let bad_function string = Str.global_replace (Str.regexp "^") " ^ " (string)
I then use something like the function below to turn this multiline string into a list of tokens (ignoring whitespace).
let string_to_tokens string = (Str.split (Str.regexp "[ \n\r\x0c\t]+") (string));;
For some reason, bad_function adds carets to places where they shouldn't be. Take the following line of code:
(bad_function " This is some
multiline input
with newline characters
and tabs. When I convert this string
into a list of tokens I get ^s showing up where
they shouldn't. ")
The first line of the string turns into:
^ This is some \n ^
When I feed the output from bad_function into string_to_tokens I get the following list:
string_to_tokens (bad_function " This is some
multiline input
with newline characters
and tabs. When I convert this string
into a list of tokens I get ^s showing up where
they shouldn't. ")
["^"; "This"; "is"; "some"; "^"; "multiline"; "input"; "^"; "with";
"newline"; "characters"; "^"; "and"; "tabs."; "When"; "I"; "convert";
"this"; "string"; "^"; "into"; "a"; "list"; "of"; "tokens"; "I"; "get";
"^s"; "showing"; "up"; "where"; "^"; "they"; "shouldn't."]
Why is this happening, and how can I fix this so that these functions behave the way I want them to?
As explained in the documentation of the Str module:
^ Matches at beginning of line: either at the beginning of the
matched string, or just after a '\n' character.
So you have to quote the '^' character using the escape character "\".
However, note that (also from the doc)
any backslash character in the regular expression must be doubled to
make it past the OCaml string parser.
This means you have to put a double '\' to do what you want without getting a warning.
This should do the job:
let bad_function string = Str.global_replace (Str.regexp "\\^") " ^ " (string);;

C++: about char and string

What do char ctemp = ' '; and string stemp = ""; mean, when there is only ' ' or " " with nothing written inside? Please help! I will appreciate any answer.
Single quotes (') indicate a character literal: a single character. Double quotes (") denote a string literal, i.e: an array of characters.
' ' is a single space character, while " " is a single space character followed by a null terminator, as is customary for C-style strings.
Character literals are directly assignable to char variables.
The type of a string literal is const char[N], where N is the length of the literal, including the null terminator. In C and C++, a static array decays to (is implicitly convertible to) a pointer to the first element, and std::string is constructible from a const char * pointer (see constructor (5)), which in C usually means a pointer to an array of characters terminated by a null terminator.
The char ctemp = ' ' will put the value ' ' (32 in ASCII decimal) inside the ctemp variable.
The string stemp = ""; will create an empty string in stemp.
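The distinctions above can be checked directly by the compiler. The following is a minimal sketch, assuming an ASCII-compatible character set; storedLength is a hypothetical helper introduced only for this illustration.

```cpp
#include <cstddef>
#include <string>

// A character literal is a single char; 32 is the ASCII code for a space.
static_assert(' ' == 32, "assumes an ASCII-compatible character set");

// String literals are arrays that include the terminating '\0'.
static_assert(sizeof("") == 1, "only the null terminator");
static_assert(sizeof(" ") == 2, "one space plus the null terminator");

// Hypothetical helper: number of characters a std::string built from the
// literal actually stores (the null terminator is not counted).
inline std::size_t storedLength(const char* literal) {
    return std::string(literal).size();
}
```

Under these assumptions, storedLength("") gives 0 and storedLength(" ") gives 1, which is the difference between an empty string and a string that contains one space.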
Here
char ctemp = ' ';
you are assigning a whitespace character ' ' to ctemp.
Here
string stemp = "";
the initializer "" creates an empty string.
' ' is the space character. "" is an empty string. " " is a string that contains only the space character.
Note that a statement like string stemp = "" implicitly invokes the string(char const *) constructor to create a new string instance from a char const * pointer.
The first one is a whitespace character, like the one you type with the space bar to separate words. That space is still a character, so your char holds exactly that one space.
The second one is of type string, but it holds even less than a space: it is a completely empty string.
A string is an array (collection) of char.
ctemp = ' '
means a whitespace character.
stemp = ""
means an empty string, with no characters in it.
You can assign ' ' to a char variable.
You can assign " " to an array of char.
In C++, single quotes delimit a single character, and double quotes delimit string literals. The string literal "x" contains the character 'x' and a null terminator '\0', so "x" is a two-character array.
Some Examples:
string s = ""; => empty string
char c = ' '; => space (there must be exactly one character inside the single quotes)
string s = " "; => the literal " " is a space followed by '\0' (a two-character array)
A whitespace character and an empty string. You can see a string as a sequence of characters, but char and string are two different types.

Replace all non-ASCII characters in a string by their ASCII equivalent

Using Qt/C++, I need to generate a string with only a subset of ASCII characters : letters, digits, hyphen, underscore, period, or colon.
As input, I can have anything.
So I try to apply some rules :
every QChar::isSpace will be replaced with an underscore
every non-ASCII letter will be replaced with an ASCII equivalent (example : "é" will be replaced with "e")
every other non-ASCII character will be removed
Is there any simple way with Qt/C++ to apply the 2nd and the 3rd rule ?
Thanks
Yes, there is a way.
First, apply Unicode normalization to your string with QString::normalized. Normalization separates diacritical marks from letters and replaces some fancy symbols with ASCII equivalents. Here you can read about normalization forms.
Then keep only the characters that can be encoded in Latin-1, which you can test with the toLatin1 method of QChar.
char QChar::toLatin1() const
Returns the Latin-1 character equivalent to the QChar, or 0. This is mainly useful for non-internationalized software.
...
QString testString = QString::fromUtf8("Ceñía-üÏÖ马克ñ");
QString normalized = testString.normalized(QString::NormalizationForm_KD);
QString result;
std::copy_if(normalized.begin(), normalized.end(), std::back_inserter(result), [](const QChar& c) {
return c.toLatin1() != 0;
});
qDebug() << result; // Cenia-uIOn
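Note that the Latin-1 test keeps every character up to U+00FF, so punctuation and letters that do not decompose can still get through, and whitespace is not yet turned into underscores. A final pass over the normalized text could enforce the allowed subset from the question. The following is a rough sketch in standard C++, using std::string as a stand-in for QString; keepAllowedAscii is a hypothetical name, and the input is assumed to be a byte string (for example UTF-8), where every non-ASCII byte is simply dropped.

```cpp
#include <string>

// Hypothetical helper: keeps ASCII letters, digits, '-', '_', '.' and ':',
// maps ASCII whitespace to '_', and drops every other byte.
std::string keepAllowedAscii(const std::string& input) {
    std::string out;
    for (unsigned char c : input) {
        const bool isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        const bool isDigit = c >= '0' && c <= '9';
        if (isLetter || isDigit || c == '-' || c == '_' || c == '.' || c == ':') {
            out += static_cast<char>(c);
        } else if (c == ' ' || (c >= '\t' && c <= '\r')) {
            // Space, tab, newline, vertical tab, form feed, carriage return.
            out += '_';
        }
        // Anything else, including bytes above 127, is removed.
    }
    return out;
}
```

In a Qt program the same loop could run over the QChar values of the normalized QString instead; that variant is not shown here.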

How to find the character "\" in a string?

I am trying to manipulate a string by finding the \ character in the string Find\inHere. However, I can't put that as an input in test.find('\', 0). It won't work and gives me the error "missing terminating character." Is there a way to fix test.find('\', 0)?
string test = "Find\inHere";
int x = test.find('\', 0); // error on this line
cout << x; // x should equal 4
\ is the character used to introduce escape sequences, for example \n for a newline or \xDB for the character with hexadecimal code DB.
So, in order to search for this special character, you have to escape it by adding another \. Use:
test.find("\\",0);
EDIT: Also, your first string does not actually contain "Find\inHere": \i is not a valid escape sequence, so the compiler complains about it. For the same reason, write "Find\\inHere".
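Putting both fixes together, here is a minimal sketch; findBackslash is a hypothetical helper name. It returns std::size_t rather than int because std::string::find reports "not found" as std::string::npos.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first backslash in text,
// or std::string::npos if there is none.
std::size_t findBackslash(const std::string& text) {
    // '\\' is a character literal holding a single backslash.
    return text.find('\\', 0);
}
```

With the string written as "Find\\inHere", findBackslash returns 4, the position the question expects.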

Is it possible to return "weird" characters in a char?

I would like to know whether it is possible to return "weird" characters, or rather ones that are important to the language.
For example: \ ; '
I would like to know that because I need to return them from a function that checks the Unicode value of the text key and returns the character by its number. I need these too.
I get a 356|error: missing terminating ' character
Line 356 looks as following
return '\';
Ideas?
The backslash is an escape for special characters. If you want a literal backslash you have to escape it with another backslash. Try:
return '\\';
The only problem here is that a backslash is used to escape characters in a literal. For example \n is a new line, \t is a horizontal tab. In your case, the compiler is seeing \' and thinking you mean a ' character (this is so you could have the ' character like so: '\''). You just need to escape your backslash:
return '\\';
Despite this looking like a character literal with two characters in it, it's not. \\ is an escape sequence which represents a single backslash.
Similarly, to return a ', you would do:
return '\'';
The list of available escape sequences is given by Table 7:
You can have a character literal containing any character from the execution character set and the resulting char will have the value of that character. However, if the value does not fit in a char, it will have implementation-defined value.
Any character can be returned.
Yet some of them have to be escaped with a backslash: \.
So for returning backslash, you have to return:
return '\\';
To get a plain backslash use '\\'.
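As an illustration of the answers above, a function that maps a numeric code to one of these characters might look like the following. This is only a sketch: charFromCode is a hypothetical name, the codes are assumed to be ASCII values, and '\0' is an arbitrary choice for unknown codes.

```cpp
// Each escaped literal below is still exactly one char.
static_assert(sizeof('\\') == 1, "a backslash literal is a single char");
static_assert(sizeof('\'') == 1, "a quote literal is a single char");

// Hypothetical lookup: returns the character for a few "special" codes.
char charFromCode(int code) {
    switch (code) {
        case 92: return '\\';  // backslash, must be escaped
        case 39: return '\'';  // single quote, must be escaped in a char literal
        case 34: return '"';   // double quote, no escape needed in a char literal
        case 59: return ';';   // semicolon, no escape needed
        default: return '\0';  // unknown code
    }
}
```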
In C, the following escape sequences are written with a backslash. They are case-sensitive, so there are no uppercase variants such as \N or \Xhh:
\a : A bell (alert)
\b : A backspace
\f : A formfeed
\n : A new line
\r : A carriage return
\t : A horizontal tab
\v : A vertical tab
\xhh : A hexadecimal bit pattern
\ooo : An octal bit pattern
\0 : A null character
\" : The " character
\' : The ' character
\? : The ? character
\\ : A backslash (\)
A plain backslash confuses the system because it expects a character to follow it. Thus, you need to "escape" it. The octal/hexadecimal bit patterns may not seem too useful at first, but they let you use ANSI escape codes.
If the character following the backslash does not specify a legal escape sequence, as shown above, the result is implementation defined, but often the character following the backslash is taken literally, as though the escape were not present.
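To make the octal and hexadecimal forms concrete, here is a small sketch. It assumes an ASCII-compatible character set and a terminal that understands ANSI colour codes; inRed is a hypothetical helper name.

```cpp
#include <string>

// The same character written three ways (65 is 'A' in ASCII).
static_assert('\x41' == 'A', "hexadecimal escape");
static_assert('\101' == 'A', "octal escape");
static_assert('\0' == 0, "the null character is the octal escape for zero");

// Hypothetical helper: wraps text in the ANSI codes for red and reset.
// \033 is the octal escape for ESC (decimal 27).
std::string inRed(const std::string& text) {
    return "\033[31m" + text + "\033[0m";
}
```

Printing inRed("error") on such a terminal would show the word in red; elsewhere the raw escape bytes may be displayed instead.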
If you have to return such characters (", ', \, {, ], etc.) more than once, you should write a function that escapes those characters. I wrote such a function once, and here it is:
function EscapeSpecialChars (_data) {
    try {
        // Only strings can be escaped; return anything else unchanged.
        if (typeof _data !== "string") {
            return _data;
        }
        // Remove line breaks. Assumption: "\r\n" stands in for a literal
        // that is unreadable in the source.
        while (_data.indexOf("\r\n") >= 0) {
            _data = _data.replace("\r\n", "");
        }
        // indexOf returns -1 when nothing is found, so test with >= 0
        // to also catch a match at position 0.
        while (_data.indexOf("\n") >= 0) {
            _data = _data.replace("\n", "\\n");
        }
        while (_data.indexOf("\r") >= 0) {
            _data = _data.replace("\r", "\\r");
        }
        while (_data.indexOf("\t") >= 0) {
            _data = _data.replace("\t", "\\t");
        }
        while (_data.indexOf("\b") >= 0) {
            _data = _data.replace("\b", "\\b");
        }
        while (_data.indexOf("\f") >= 0) {
            _data = _data.replace("\f", "\\f");
        }
        return _data;
    } catch (err) {
        alert(err);
    }
}
Then use it like this:
return EscapeSpecialChars("\'\"{[}]");
You should improve the function. It was working for me, but it does not escape all special characters.
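Since the question is about C/C++, the same idea can be sketched there as a single pass over the string. This is a sketch rather than a complete escaper: escapeSpecialChars is a hypothetical name, and the set of handled characters (backslash, both quotes, and the five control characters from the function above) is an assumption about what needs escaping.

```cpp
#include <string>

// Hypothetical helper: replaces selected characters with their
// two-character escape sequences and copies everything else unchanged.
std::string escapeSpecialChars(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (char c : input) {
        switch (c) {
            case '\\': out += "\\\\"; break;  // backslash -> two backslashes
            case '"':  out += "\\\""; break;  // double quote
            case '\'': out += "\\'";  break;  // single quote
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            case '\b': out += "\\b";  break;
            case '\f': out += "\\f";  break;
            default:   out += c;      break;
        }
    }
    return out;
}
```

Processing one character at a time avoids re-scanning text that has already been escaped, which matters once the backslash itself is in the list.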