Checking if a string is a hexadecimal value - c++

I have a char whose value is 183, which I encounter while parsing RTF. It is a special character.
When I create a string from it, I get the string \xb7, which is a hexadecimal escape. The string has length one.
How can I determine whether the string is prefixed with \x, i.e. whether it is a hexadecimal string?
string substr(1, ch); // ch is the char with value 183
cout << substr << substr.length();
Regards

Related

How to delimit this text file? strtok

There's a text file in which each line holds 1. a language, 2. the name of a number written out in that language, 3. the base of the number, and 4. the number written in digits. Here's a sample:
francais deux mille quatre cents 10 2400
How I went about it:
struct Nomen{
char langue[21], nomNombre [31], baseC[3], nombreC[21];
int base, nombre;
};
and in the main:
if (myfile.is_open()) {
    while (getline(myfile, line)) {
        strcpy(Linguo[i].langue, strtok((char *)line.c_str(), " "));
        strcpy(Linguo[i].nomNombre, strtok(NULL, " "));
        strcpy(Linguo[i].baseC, strtok(NULL, " "));
        strcpy(Linguo[i].nombreC, strtok(NULL, "\n"));
        i++;
    }
}
Difficulty: I'm trying to put two whitespaces as a delimiter, but it seems that strtok() counts it as if there were only one whitespace. The fact there are spaces in the text number, etc. is messing up the tokenization. How should I go about it?
strtok treats each character in the delimiter string as a separate delimiter. It does not treat the string as a whole as a single delimiter. So "  " (two spaces) behaves the same as " " (one space).
strtok also treats a run of consecutive delimiters as a single delimiter. So the input "t1  t2" (two spaces between the tokens) is tokenized as two tokens, "t1" and "t2".
As mentioned in the comments, strtok also writes the NUL character into the input to terminate each token. So it is an error to pass the result of string::c_str() as input to the function. The fact that you needed to cast away the const should have been enough to dissuade you from this approach.
If you want to treat a double space as a delimiter, you will have to scan the string and search for them yourself. Given you are using C APIs, you can consider strstr. However, in C++, you can use string::find.
Here's an algorithm to parse your string manually:
Given an input string input:
language is the substring from the start of input to the first SPC character.
From where language ends, skip over all whitespace, changing input to begin at the first non-whitespace character.
text is the substring from the start of input to the first double SPC sequence.
From where text ends, skip over all whitespace, changing input to begin at the first non-whitespace character.
Parse base, and parse number.
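The steps above can be sketched in C++ with std::string::find. This is only a minimal sketch, not the one right way to do it: the Entry struct and the parseLine function are hypothetical names, and it assumes the written-out number is followed by at least two spaces, as the question describes.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical record type for one line of the file.
struct Entry {
    std::string language;
    std::string text;
    int base = 0;
    std::string digits;  // kept as text, since digits in other bases may include letters
};

// Parses one line; returns false if the line does not have the assumed layout.
bool parseLine(const std::string& line, Entry& out) {
    const std::string::size_type npos = std::string::npos;
    const char* whitespace = " \t";

    // language: from the start of the line to the first space.
    std::size_t langEnd = line.find(' ');
    if (langEnd == npos) return false;
    out.language = line.substr(0, langEnd);

    // Skip whitespace; text then runs up to the first double space.
    std::size_t textBegin = line.find_first_not_of(whitespace, langEnd);
    if (textBegin == npos) return false;
    std::size_t textEnd = line.find("  ", textBegin);
    if (textEnd == npos) return false;
    out.text = line.substr(textBegin, textEnd - textBegin);

    // Skip whitespace again, then read the base and the digits.
    std::size_t rest = line.find_first_not_of(whitespace, textEnd);
    if (rest == npos) return false;
    std::istringstream fields(line.substr(rest));
    return static_cast<bool>(fields >> out.base >> out.digits);
}
```

For a sample line whose fields are separated by two spaces, parseLine fills in language "francais", text "deux mille quatre cents", base 10 and digits "2400". Unlike strtok, it never modifies the input string.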

Translate \n new line from Char to String in SML/NJ

I am trying to convert #"\n", a Char, to "\n", a String. I used
Char.toString(#"\n");
and it gives
val it = "\\n" : string
Why doesn't it return "\n"?
From the documentation, Char.toString
returns a printable string representation of the character, using, if
necessary, SML escape sequences.
It also specifies that some control characters are converted to two-character escape sequences, and \n is one of them.
To return a string of size one, use String.str.
- String.str(#"\n");
val it = "\n" : string

Replace all non-ASCII characters in a string by their ASCII equivalent

Using Qt/C++, I need to generate a string with only a subset of ASCII characters : letters, digits, hyphen, underscore, period, or colon.
As input, I can have anything.
So I try to apply some rules :
every QChar::isSpace will be replaced with an underscore
every non-ASCII letter will be replaced with an ASCII equivalent (example : "é" will be replaced with "e")
every other non-ASCII character will be removed
Is there any simple way with Qt/C++ to apply the 2nd and the 3rd rule ?
Thanks
Yes, there is a way.
First, apply Unicode normalization to your string with QString::normalized. Normalization is needed to separate diacritical marks from letters and to replace some fancy symbols with their ASCII equivalents. The normalization forms are described in the Unicode standard.
Then keep only the characters that can be encoded in Latin-1, which you can test with the toLatin1 method of QChar.
char QChar::toLatin1() const
Returns the Latin-1 character equivalent to the QChar, or 0. This is mainly useful for non-internationalized software.
...
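QString testString = QString::fromUtf8("Ceñía-üÏÖ马克ñ");
// NFKD splits each accented letter into a base letter plus combining marks.
QString normalized = testString.normalized(QString::NormalizationForm_KD);
QString result;
// Keep only characters representable in Latin-1; this drops the combining marks and the CJK characters.
// Latin-1 is a superset of ASCII, so test c.unicode() < 128 instead if strict ASCII is required.
std::copy_if(normalized.begin(), normalized.end(), std::back_inserter(result), [](const QChar& c) {
    return c.toLatin1() != 0;
});
qDebug() << result; // Cenia-uIOn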

Create string with ESC characters

I can initialise a string with an escape character like std::string s="\065" and this creates the "A" character. But what if I need the ASCII character 200? std::string s="\200" is not working. Why?
"\065" does not create "A" but "5". "\065" is interpreted as an octal number, which is decimal 53, which is character '5'.
std::string s = "\xc8" ; (hex) gives me the character 200.
Because \065 is actually in octal form; to specify character 200, try \310 or \xc8. And BTW, \065 is not character A but is 5.
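To make the octal and hexadecimal escapes concrete, here is a small sketch; the variable names and the byteValue helper are only illustrative. Note that "\200" is itself a valid octal escape, but it denotes the value 128 (octal 200), not decimal 200.

```cpp
#include <string>

// Octal escape: 065 (octal) is 53 (decimal), which is the digit '5'.
const std::string five = "\065";

// 'A' has code 65 (decimal), i.e. 101 in octal and 41 in hexadecimal.
const std::string letterA = "\101";

// Character 200 (decimal) is 310 in octal and c8 in hexadecimal.
const std::string octal200 = "\310";
const std::string hex200 = "\xc8";

// "\200" compiles, but it is octal 200, i.e. decimal 128.
const std::string octalEscape200 = "\200";

// char may be signed, so convert through unsigned char to read the byte value.
int byteValue(const std::string& s) {
    return static_cast<unsigned char>(s.at(0));
}
```

With these definitions, byteValue(hex200) and byteValue(octal200) both give 200, while byteValue(octalEscape200) gives 128.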

Can zlib-compressed string contain whitespace?

Can zlib-compressed string contain whitespace? By whitespace I mean ' ', \n, \t.
Any byte can appear in a zlib-compressed string.
In fact, for a long enough properly compressed string, any byte (from 0 to 255) should have a more-or-less equal probability, or else the string could be further compressed.
You can try this yourself -- for example using Python:
>>> import os, zlib
>>> z = zlib.compress(os.urandom(1000000)) # compress a long string of junk
>>> [z.count(bytes([i])) for i in range(256)] # number of occurrences of each byte
[3936, 3861, 3978, 3951, 3858, 3937, 3945, 3828, 3984, 3871, 3985,
3961, 3879, 3924, 3817, 3984, 3963, 3858, 4029, 3903, 3884, 3817,
... yada ...
Yes; it's just a stream of bytes. Any byte value can appear in there (including zero, which is more likely to cause you problems than whitespace characters!)
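In C++ this matters mostly for how the compressed buffer is stored. The sketch below uses a made-up five-byte buffer (an assumption for illustration, not real zlib output) to show why a zero byte is more troublesome than whitespace: std::string tracks its length explicitly, whereas C string functions stop at the first NUL. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Made-up bytes standing in for compressed data: 0x78, 0x9c, a zero byte,
// a space and a newline. This is not real zlib output.
std::string makeSampleBuffer() {
    const char raw[] = {'\x78', '\x9c', '\0', ' ', '\n'};
    // Pass the length explicitly so the zero byte is kept.
    return std::string(raw, sizeof raw);
}

// Length as seen by C string functions: counting stops at the first zero byte.
std::size_t cStringLength(const std::string& data) {
    return std::strlen(data.c_str());
}
```

Here makeSampleBuffer().size() is 5, but cStringLength reports 2, so anything that treats the buffer as a NUL-terminated string silently loses the tail. Carrying the length alongside the data avoids that.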