I wrote a simple program to print a unicode smile emoji. Unfortunately, something else is printed. Does anyone know what the problem with this code is? Thanks
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char *argv[])
{
string str = u8"\u1F600";
cout << str << endl;
return 0;
}
Compilation and output:
g++ -pedantic -Wall test109.cc && ./a.out
ὠ0
\u escape sequences have the format \u#### (i.e. exactly 4 hex digits). You need \U########:
auto str = u8"\U0001F600";
Or, encoding the UTF8 bytes separately:
auto str2 = u8"\xf0\x9f\x98\x80";
That works.
The \u escape sequence takes exactly 4 hex digits, so "\u1F600" is parsed as two separate characters, \u1F60 (ὠ) and 0, which is exactly what you are seeing in your console output.
Codepoint U+1F60 GREEK SMALL LETTER OMEGA WITH PSILI is very different from codepoint U+1F600 GRINNING FACE.
For what you are trying, you need to use the \U escape instead, which takes exactly 8 hex digits:
string str = u8"\U0001F600";
Alternatively, you can use one of these instead:
string str = u8"\xF0\x9F\x98\x80"; // UTF-8 codeunits in hex format
string str = u8"\360\237\230\200"; // UTF-8 codeunits in octal format
string str = u8"😀"; // if your compiler/editor allows this
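To see where the bytes \xF0\x9F\x98\x80 come from, here is a minimal sketch of UTF-8 encoding for a single code point. The name encodeUtf8 is hypothetical (it is not part of the standard library or of the answers above), and the sketch does not reject surrogate code points.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
// Surrogates (U+D800..U+DFFF) are not rejected in this sketch.
std::string encodeUtf8(char32_t cp)
{
    assert(cp <= 0x10FFFF);  // largest valid Unicode code point
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx and three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Usage: the two code points discussed above.
void checkEncodeUtf8()
{
    assert(encodeUtf8(0x1F600) == "\xF0\x9F\x98\x80");  // GRINNING FACE
    assert(encodeUtf8(0x1F60) == "\xE1\xBD\xA0");       // OMEGA WITH PSILI
}
```

The checks use plain (non-u8) literals on purpose: since C++20, u8 literals have type char8_t and no longer initialize a std::string directly.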
You can use whichever of the following works for you.
string str = "\u263A"; // --> ☺
//string str = u8"\xe2\x98\xba"; --> ☺
//string str = u8"\U0001F600"; --> 😀
//string str = u8"😀"; --> 😀
//string str = "\342\230\272"; --> ☺
cout << str << endl;
I need to generate a string that matches another string, where both contain special characters. I wrote what I thought would be a simple function, but so far nothing has given me a successful match.
I know that special characters in C++ are preceded by a "\". For example, a single quote would be written as "\'".
string json_string(const string& incoming_str)
{
string str = "\\\"" + incoming_str + "\\\"";
return str;
}
And this is the string I have to compare to:
bool comp = json_string("hello world") == "\"hello world\"";
I can see in the cout stream that in fact I'm generating the string as needed but the comparison still gives a false value.
What am I missing? Any help would be appreciated.
One way is to filter one string and compare this filtered string. For example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
using namespace std;
std::string filterBy(std::string unfiltered, std::string specialChars)
{
std::string filtered;
// Keep only the characters that do not appear in specialChars.
std::copy_if(unfiltered.begin(), unfiltered.end(),
std::back_inserter(filtered), [&specialChars](char c){return specialChars.find(c) == std::string::npos;});
return filtered;
}
int main() {
std::string specialChars = "\"";
std::string string1 = "test";
std::string string2 = "\"test\"";
std::cout << (string1 == filterBy(string2, specialChars) ? "match" : "no match");
return 0;
}
Output is match. This code also works if you add an arbitrary number of characters to specialChars.
If both strings contain special characters, you can also put string1 through the filterBy function. Then, something like:
"\"hello \"world \"" == "\"hello world "
will also match.
If the comparison is performance-critical, you could also write a comparison that walks both strings with two iterators and skips the special characters as it goes. That takes O(N+M) time, where N and M are the sizes of the two strings, and avoids building the filtered copies.
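A minimal sketch of that two-iterator comparison follows. The name equalIgnoring is hypothetical, and the sketch assumes specialChars is short, so each lookup in it counts as constant time.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: compares a and b while ignoring every character that
// appears in specialChars. No filtered copies are built.
bool equalIgnoring(const std::string& a, const std::string& b,
                   const std::string& specialChars)
{
    // Advance an iterator past any run of special characters.
    auto skip = [&specialChars](std::string::const_iterator it,
                                std::string::const_iterator end) {
        while (it != end && specialChars.find(*it) != std::string::npos) ++it;
        return it;
    };

    auto i = a.begin();
    auto j = b.begin();
    for (;;) {
        i = skip(i, a.end());
        j = skip(j, b.end());
        if (i == a.end() || j == b.end()) break;
        if (*i != *j) return false;  // first mismatch among kept characters
        ++i;
        ++j;
    }
    // Equal only if both strings ran out of kept characters together.
    return i == a.end() && j == b.end();
}

// Usage, mirroring the filterBy example.
void checkEqualIgnoring()
{
    assert(equalIgnoring("test", "\"test\"", "\""));
    assert(!equalIgnoring("test", "\"text\"", "\""));
}
```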
bool comp = json_string("hello world") == "\"hello world\"";
This will definitely yield false. json_string("hello world") creates the string \"hello world\" (backslashes included), but you are comparing it to "hello world" (quotes only, no backslashes).
The problem is here:
string str = "\\\"" + incoming_str + "\\\"";
In the first string literal of str, the leading backslash that you assume acts as an escape character is itself escaped: "\\" produces a literal backslash, so "\\\"" is a backslash followed by a double quote. The same happens in the last string literal.
Do this:
string str = "\"" + incoming_str + "\"";
In C++ string literals are delimited by quotes.
Then the problem arises: How can I define a string literal that does itself contain quotes? In Python (for comparison), this can get easy (but there are other drawbacks with this approach not of interest here): 'a string with " (quote)'.
C++ doesn't have this alternative string representation [1]; instead, you are limited to using escape sequences (which are available in Python, too, just for completeness): within a string (or character) literal (but nowhere else!), the sequence \" will be replaced by a single double-quote character in the resulting string.
So "\"hello world\"" defined as character array would be:
{ '"', 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '"', 0 };
Note that now the escape character is not necessary...
Within your json_string function, you append additional backslashes, though:
"\\\""
{ '\', '"', 0 }
//^^^
Note that I wrote '\' just for illustration! How would you define a single quote? By escaping again: '\''. But now you need to escape the escape character, too, so a single backslash actually needs to be written as '\\' here (whereas, in comparison, you don't have to escape the single quote in a string literal: "i am 'singly quoted'", just as you didn't have to escape the double quote in the character literal).
As JSON uses double quotes for strings, too, you'd most likely want to change your function:
return "\"" + incoming_str + "\"";
or even much simpler:
return '"' + incoming_str + '"';
Now
json_string("hello world") == "\"hello world\""
would yield true...
[1] Side note (taken from an answer that has since been deleted): Since C++11, there are raw string literals, too. Using these, you don't have to escape either.
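As a quick check of the points above, this sketch compares the escaped literals with their raw string literal equivalents. The function checkLiterals is a hypothetical name used only for illustration, and json_string here does not escape quotes or backslashes inside incoming_str, which real JSON output would need.

```cpp
#include <cassert>
#include <string>

// The corrected function: wraps the text in double quotes.
// Assumption: incoming_str contains nothing that needs JSON escaping.
std::string json_string(const std::string& incoming_str)
{
    return '"' + incoming_str + '"';
}

// Hypothetical check function showing which literals are equal.
void checkLiterals()
{
    // A raw string literal keeps everything between R"( and )" as written.
    assert(json_string("hello world") == R"("hello world")");
    assert(std::string(R"("hello world")") == "\"hello world\"");

    // "\\\"" is two characters: a backslash followed by a double quote.
    assert(std::string("\\\"") == R"(\")");
    assert(std::string("\\\"").size() == 2);

    // "\"" is a single character: just the double quote.
    assert(std::string("\"").size() == 1);
}
```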
I have a situation where I want to get the separator character from a given string, as below:
String str1 = "saurabh|om|anurag|abhishek|jitendra"
String str2 = "amit,ankur,sumit,aniket,suheel"
String str3 = "aj-kumar-manav-lalit-gaurav"
-------
In the above strings, I want to get the separator character as:
String separatorStr1 = "|"
String separatorStr2 = ","
String separatorStr3 = "-"
Note: the separator character will always be non-alphabetical.
Is there any way to achieve this?
Using Groovy regexp and find ([^\w] matches any character that is not a letter, digit, or underscore):
def getSeparator = { str ->
str.find(~/[^\w]/)
}
String str1 = "saurabh|om|anurag|abhishek|jitendra"
String str2 = "amit,ankur,sumit,aniket,suheel"
String str3 = "aj-kumar-manav-lalit-gaurav"
assert getSeparator(str1) == '|'
assert getSeparator(str2) == ','
assert getSeparator(str3) == '-'
Why is - the separator of str3? It could be a as well.
Assuming the separator must be non-alphabetical, loop through the characters and look for the first non-alphabetical character.
In future questions, try to avoid making other users guess what you mean; define the problem clearly.
Following xenteros's suggestion, I achieved this in the following way:
String str1 = "saurabh|om|anurag|abhishek|jitendra"
String str2 = "amit,ankur,sumit,aniket,suheel"
String str3 = "aj-kumar-manav-lalit-gaurav"
String separatorStr1 = str1.toCharArray().find { !Character.isLetterOrDigit(it) }
String separatorStr2 = str2.toCharArray().find { !Character.isLetterOrDigit(it) }
String separatorStr3 = str3.toCharArray().find { !Character.isLetterOrDigit(it) }
assert separatorStr1 == '|'
assert separatorStr2 == ','
assert separatorStr3 == '-'
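The thread above uses Groovy; for comparison, the same "first character that is not a letter or digit" idea can be sketched in C++. The name findSeparator and the choice of '\0' as the "not found" value are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns the first character that is neither a letter
// nor a digit, or '\0' if the string contains no such character.
char findSeparator(const std::string& s)
{
    for (unsigned char c : s) {  // unsigned char keeps std::isalnum well-defined
        if (!std::isalnum(c)) return static_cast<char>(c);
    }
    return '\0';
}

// Usage with the strings from the question.
void checkSeparators()
{
    assert(findSeparator("saurabh|om|anurag|abhishek|jitendra") == '|');
    assert(findSeparator("amit,ankur,sumit,aniket,suheel") == ',');
    assert(findSeparator("aj-kumar-manav-lalit-gaurav") == '-');
}
```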
I have a problem with extracting a character or characters from a string in C++.
It was easy in Python; for example, I could take the first character of a string like this:
x = "Hello Word"
p = x[0]
Now the H character is stored in p.
As mentioned, surely you mean a char (character?)
std::string x = "Hello Word";
char p = x[0]; //p now contains 'H'
See http://en.cppreference.com/w/cpp/string/basic_string for more detail (thanks for suggestion of link)
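For more than one character, a few std::string members cover the usual Python-style indexing. This is a small sketch; firstChars and checkIndexing are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the first n characters, similar to Python's x[:n].
std::string firstChars(const std::string& s, std::size_t n)
{
    return s.substr(0, n);  // substr clamps n to the remaining length
}

void checkIndexing()
{
    std::string x = "Hello Word";
    assert(x[0] == 'H');      // unchecked: undefined if the index exceeds size()
    assert(x.at(0) == 'H');   // checked: throws std::out_of_range when invalid
    assert(x.back() == 'd');  // last character, similar to Python's x[-1]
    assert(firstChars(x, 5) == "Hello");
    assert(firstChars(x, 100) == x);
}
```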
Assuming:
std::string ToShow,NumStr;
The following displays "This is 19 ch00":
ToShow = "This is nineteen ch";
ToShow.resize(ToShow.length()+0);
NumStr = "00";
ToShow += NumStr;
mvaddstr(15,0,ToShow.c_str());
And the following displays "This is 19 ch ":
ToShow = "This is nineteen ch";
ToShow.resize(ToShow.length()+1);
NumStr = "0";
ToShow += NumStr;
mvaddstr(16,0,ToShow.c_str());
In the second case, operator+= isn't adding the string "0" to the end of ToShow. Does anyone know why?
My guess is:
You don't specify the fill character for resize, so after ToShow.resize(ToShow.length()+1) the string is padded with a null character and looks like:
"This is nineteen ch\0"
And after += NumStr:
"This is nineteen ch\00"
which, after calling c_str, gets trimmed to the first \0 and looks like:
"This is nineteen ch"
(C strings end at the first null character; a std::string tracks its own length and can contain embedded nulls.)
Try calling .resize(someLength, ' ') instead.
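The effect can be checked without curses. In this sketch, growThenAppend and visibleLength are hypothetical helpers: the first repeats the steps from the question, and the second measures the string the way a C API taking a const char* would see it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper repeating the steps from the question: grow s by
// `extra` characters (filled with `fill`), then append `tail`.
std::string growThenAppend(std::string s, std::size_t extra,
                           const std::string& tail, char fill = '\0')
{
    s.resize(s.length() + extra, fill);
    s += tail;
    return s;
}

// Length as seen through c_str() by a C function: up to the first '\0'.
std::size_t visibleLength(const std::string& s)
{
    return std::strlen(s.c_str());
}

void checkResize()
{
    // Default fill: the "0" is appended, but it sits behind an embedded '\0'.
    std::string hidden = growThenAppend("This is nineteen ch", 1, "0");
    assert(hidden.size() == 21);
    assert(visibleLength(hidden) == 19);

    // Space fill: every character is visible through c_str().
    std::string shown = growThenAppend("This is nineteen ch", 1, "0", ' ');
    assert(shown.size() == 21);
    assert(visibleLength(shown) == 21);
}
```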
int main()
{
char* a = " 'Fools\' day' ";
char* b[64];
sscanf(a, " '%[^']s ", b);
printf ("%s", b);
}
--> puts "Fools" in b
Obviously, I want to have "Fools' day" in b. Can I tell sscanf() not to consider escaped apostrophes as the end of the character sequence?
Thanks!
No. Those functions just read plain old characters. They don't interpret the contents according to any escaping rules because there's nothing to escape from: quotation marks, apostrophes, and backslashes aren't special in the input string.
You'll have to use something else to parse your string. You can write a little state machine to read the string one character at a time, keeping track of whether the previous character was a backslash. (Don't just scan to the next apostrophe and then look one character backward; if you're allowed to escape backslashes as well as apostrophes, then you could end up re-scanning all the way back to the start of the string to see whether you have an odd or even number of escape characters. Always parse strings forward, not backward.)
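A minimal sketch of such a forward-only state machine follows. The name readQuoted is hypothetical, and the sketch assumes a backslash escapes whatever single character follows it. Note that inside C++ source, \' in a string literal is just an apostrophe, so the test input needs \\' to contain a real backslash.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the text between the first apostrophe and the
// next unescaped apostrophe. A backslash escapes the character after it.
// If the closing apostrophe is missing, everything read so far is returned.
std::string readQuoted(const std::string& input)
{
    std::string out;
    bool inside = false;   // true once the opening apostrophe has been seen
    bool escaped = false;  // true if the previous character was an escaping backslash
    for (char c : input) {
        if (!inside) {
            if (c == '\'') inside = true;
            continue;
        }
        if (escaped) {
            out += c;        // take the escaped character literally
            escaped = false;
        } else if (c == '\\') {
            escaped = true;  // remember the backslash, emit nothing yet
        } else if (c == '\'') {
            break;           // unescaped apostrophe closes the string
        } else {
            out += c;
        }
    }
    return out;
}

void checkReadQuoted()
{
    assert(readQuoted(" 'Fools\\' day' ") == "Fools' day");
    assert(readQuoted(" 'a\\\\' b' ") == "a\\");  // escaped backslash, then close
}
```

Because the escaped flag is updated one character at a time, the parser never has to look backward to count backslashes.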
Replace
char* a = " 'Fools\' day' ";
with
char* a = " 'Fools' day' ";
The ' character isn't special inside a C string literal (although it is special within a character literal). So there is no need to escape it.
Also, if all you want is "Fools' day", why put the extra 's at the start and end? Maybe you are confusing C strings with those in some other language?
Edit:
As Rob Kennedy's comment says, I was assuming you are supplying the string yourself. Otherwise, see Rob's answer.
Why on earth would you write such a thing, instead of using std::string? Since your question is tagged C++.
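#include <iostream>
#include <string>
int main(int argc, char* argv[])
{
std::string a = " 'Fools' day' ";
// Take everything between the first and the last apostrophe.
std::string b(a.begin() + a.find('\'') + 1, a.begin() + a.rfind('\''));
std::cout << b;
std::cin.get();
}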
Edit: Oh wait a second, you want to read a string within a string? Just use escaped double quotes, e.g.
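#include <algorithm>
#include <iostream>
#include <string>
int main(int argc, char* argv[]) {
std::string a = " \"Fool's day\" ";
// Start just past the opening quote and stop at the closing one.
auto it = std::find(a.begin(), a.end(), '"') + 1;
std::string b(it, std::find(it, a.end(), '"'));
std::cout << b;
}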
If the user put the string in, they won't have to escape single quotes, although they would have to escape double quotes, and you'd have to make your own system for that.