In the program, Lambda λ theoretically represents nothing: ''. I thought of representing this programmatically as '\0', but obviously that terminates a string, which is not necessarily what lambda does. Also, I am reading in from an istringstream, and it has problems reading that character in.
So what character would you use?
I'm assuming you have a reason for representing Int,Char,Int as a string, rather than just defining a struct to hold the data.
As you say, \0 doesn't work as it terminates the string. But there are other invisible ASCII characters that you can use and easily escape in C++. Have a look at this list of escape codes.
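As a minimal sketch of that suggestion: the code below picks the ASCII unit separator ('\x1F') as the stand-in for lambda and reads Int,Char,Int triples from a stream. The names kLambda, Transition and readTransition are hypothetical, and the choice of '\x1F' is an assumption that only holds if the real input alphabet never contains that byte.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Assumed sentinel for lambda: ASCII unit separator. It is not whitespace,
// so operator>> reads it like any other character, and it does not
// terminate a C string the way '\0' does.
constexpr char kLambda = '\x1F';

// Hypothetical struct for one Int,Char,Int transition.
struct Transition {
    int from;
    char symbol;
    int to;
};

// Reads one whitespace-separated transition; returns false on failure.
bool readTransition(std::istream& in, Transition& t) {
    return static_cast<bool>(in >> t.from >> t.symbol >> t.to);
}
```

For example, an std::istringstream built from std::string("0 ") + kLambda + " 1" can be passed to readTransition, and the resulting symbol compares equal to kLambda.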
The title is pretty much it. If a standard C++ string with UTF-8 characters has no zero bytes, does the scanning terminate at the end of the string as defined by its size? Conversely, if the string has a zero byte, does scanning stop at that byte, or continue to the full length of the string?
I've looked at the Re2.h file and it does not seem to address this issue.
A std::string containing UTF-8 text normally has no 0 bytes as part of the text
(only as termination), because UTF-8 uses a 0 byte only to encode U+0000 itself.
And given you're using something C++11-compliant, a terminating 0 is guaranteed
(it doesn't matter whether you use data() or c_str(); data() is the original data, so...).
See http://en.cppreference.com/w/cpp/string/basic_string/data
or the standard (21.4.7.1/1 etc.).
=> Any processing that relies on the terminator will stop at the 0
The interface to Re2 seems to use std::string, which almost
certainly means that it uses the begin and the end of the
string, and that null characters are characters like any other.
(They are, after all, defined in Unicode and in UTF-8.) Of
course, '\0' is in the category control characters, so it won't
match something like "\pL" (which matches a letter). But it
should match "\pC". And of course, '\u0000' and other representations of the null character.
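Neither answer exercises RE2 itself, so the following is only a library-independent illustration of the distinction they are arguing about: a std::string carries its own length, while C-string functions stop at the first null byte. The helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds a 5-byte string with a null byte in the middle. The explicit
// length is required; std::string("ab\0cd") alone would stop at the '\0'.
std::string makeWithEmbeddedNul() {
    return std::string("ab\0cd", 5);
}

// Length as seen by code that uses the string's begin and end.
std::size_t lengthBySize(const std::string& s) {
    return s.size();
}

// Length as seen by code that treats the buffer as a C string.
std::size_t lengthByTerminator(const std::string& s) {
    return std::strlen(s.c_str());
}
```

With the string from makeWithEmbeddedNul(), lengthBySize reports 5 and lengthByTerminator reports 2, so which answer applies depends on whether the consuming code uses the size or the terminator.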
I am writing a C++ program to solve a common problem of message decoding. Part of the problem requires me to get a bunch of random characters, including '\', and map them to a key, one by one.
My program works fine in most cases, except that when I read characters such as '\' from a string, I obviously get a completely different character representation (e.g. '\0' yields a null character, or '\' simply escapes itself when it needs to be treated as a character).
Since I am not supposed to have any control over which character keys are included, I have been desperately trying to find a way to treat special control characters such as the backslash as the character itself.
My questions are basically these:
Is there a way to turn all special characters off within the scope of my program?
Is there a way to override the current digraph definitions of special characters and define them as something else (like digraphs using very rare keys)?
Is there some obscure method on the String class that I missed which can force the actual character on the string to be read instead of the pre-defined constant?
I have been trying to look for a solution for hours now but all possible fixes I've found are for other languages.
Any help is greatly appreciated.
If you read in a string like "\0" from stdin or a file, it will be treated as two separate characters: '\\' and '0'. There is no additional processing that you have to do.
Escaping characters is only used for string/character literals. That is to say, when you want to hard-code something into your source code.
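A small sketch of the difference, assuming the input arrives through a stream (the helper name readToken is hypothetical): the backslash and the zero typed by a user arrive as two ordinary characters, whereas the same spelling in a source-code literal is turned into a single null character by the compiler.

```cpp
#include <sstream>
#include <string>

// Reads the first whitespace-delimited token from rawInput, standing in
// for text that came from stdin or a file. No escape processing happens.
std::string readToken(const std::string& rawInput) {
    std::istringstream in(rawInput);
    std::string token;
    in >> token;
    return token;
}

// The same spelling as a literal in source code: the compiler turns the
// escape sequence into one null character.
std::string escapedLiteral() {
    return std::string("\0", 1);
}
```

Calling readToken(R"(\0)") yields a two-character string made of a backslash and the digit 0, while escapedLiteral() yields a one-character string holding the null character.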
I have a problem where I have UTF16 strings (std::wstring) that might have "invalid" characters which cause my console terminal to stop printing (see question).
I wonder if there is a fast way to check all the characters in a string and replace any invalid chars with ?.
I know I could do something along these lines with a regex, but it would be difficult to make it validate all valid chars, and also slow. Is there e.g. a numeric range for the char codes that I might use, e.g. all char codes between 26 and 5466 are valid?
It should be possible to use std::ctype<wchar_t> to determine if a character is printable:
// needs <algorithm> and <locale>
std::locale loc;
std::replace_if(string.begin(), string.end(),
[&](wchar_t c)->bool { return !std::isprint(c, loc); }, L'?');
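On the question of a numeric range: a different, locale-independent approach is to replace only the code units that are certainly problematic in UTF-16, namely the C0/C1 control codes and unpaired surrogates. The sketch below is not the std::isprint approach above but a swapped-in range check; the function name sanitizeUtf16 is hypothetical, it uses std::u16string so the code unit is 16 bits on every platform, and it makes no claim about what a particular console can actually display.

```cpp
#include <cstddef>
#include <string>

// Replaces control codes (U+0000-U+001F, U+007F-U+009F) and unpaired
// surrogates with '?'. Note that this also replaces tab and newline.
std::u16string sanitizeUtf16(std::u16string s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        const bool isHigh = c >= 0xD800 && c <= 0xDBFF;
        const bool isLow = c >= 0xDC00 && c <= 0xDFFF;
        if (isHigh && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            ++i;  // well-formed surrogate pair: keep both halves
            continue;
        }
        const bool isControl = c < 0x20 || (c >= 0x7F && c <= 0x9F);
        if (isControl || isHigh || isLow) {
            s[i] = u'?';
        }
    }
    return s;
}
```

The function takes its argument by value, so the caller's string is left untouched and the cleaned copy is returned.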
I suspect your problem is not related to the validity of characters, but to the capability of the console to print them.
The definition Unicode gives of "printable" does not necessarily coincide with the actual capability of the console itself to "print".
Characters like '€' are "printable" but, for example, not on WinXP consoles.
I am able to read a char into char[2] in OCI C++ code, but I am not able to read it into char[1].
Does anyone have any idea why?
(oracle data type is char(1))
If the input is being treated like a string, then room is needed to apply the null-termination (a '\0') at the end. That is, if the data is 'a', then the string representation ("a") is stored in memory as two characters, 'a' and '\0'. The '\0' is needed to tell the usual string processing suspects where the string ends.
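The storage requirement can be shown without any database API. The sketch below is not OCI-specific; the helper name asCString is hypothetical, and it only illustrates that a one-character string occupies two bytes once the terminator is included.

```cpp
#include <array>
#include <cstring>

// Stores a single character as a C string: one byte for the character
// and one byte for the terminating '\0'.
std::array<char, 2> asCString(char value) {
    return {value, '\0'};
}

// Length reported by the usual C string functions for such a buffer.
std::size_t visibleLength(const std::array<char, 2>& buffer) {
    return std::strlen(buffer.data());
}
```

For asCString('a'), visibleLength reports 1 even though the buffer holds 2 bytes, which is why a char[1] receiver has no room left for the terminator.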
Without knowing anything about the tools you're using I can't say for sure, but you might be able to assign to a character variable (as opposed to a character array variable).
Looking briefly at the docs along the link you posted, I suspect that you should be using std::string as the receiving type for textual data.
Possibly you need space for the null character at the end of the string?
According to the manual, string concatenation isn't implemented in gdb. I need it however, so is there a way to achieve this, perhaps using array functions?
I don't have a copy of gdb around to try this on, but perhaps this line from later in the Ada section of the document will help you?
Rather than use catenation and
symbolic character names to introduce
special characters into strings, one
may instead use a special bracket
notation, which is also used to print
strings. A sequence of characters of
the form '["XX"]' within a string or
character literal denotes the (single)
character whose numeric encoding is XX
in hexadecimal. The sequence of
characters '["""]' also denotes a
single quotation mark in strings. For
example, "One line.["0a"]Next
line.["0a"]"
contains an ASCII newline character
(Ada.Characters.Latin_1.LF) after each
period.
For Objective-C:
[@"asd" stringByAppendingString:@"zxc"]
[@"ID: " stringByAppendingString:(NSString*) [aTaskDict valueForKey:@"ID"]]