Is this regular expression? - regex

This is how to split a string in UnityScript, according to the Unity Wiki. However, I don't recognize " "[0]. Is this a regular expression? If so, is there any reference for it? I'm generally familiar with regular expressions and have used them a lot, but this syntax is a little confusing.
var qualifiedName = "System.Integer myInt";
var name = qualifiedName.Split(" "[0]);
Wiki Reference

On any string, whether it is a variable or a literal (" "), you can use an indexer to get the char at the nth position.
Your code sample is a very odd way of defining a space char literally, and could be simplified by using this:
' '
Note the single quotes instead of double quotes.

As many have already mentioned, " "[0] is the first character of the " " string (which is a System.String, or an array of System.Chars). The problem with UnityScript is that ' ' is interpreted as a String too, so the only way to provide a Char is by indexing.

" "[0] is the first character of the string " ".
typeof " "[0]; // "string"
Your example is strange, because " "[0] and " " are strictly equal.
" "[0] === " "; // true
Reading reference:
Mono Types When a Mono function requires a char as an input, you can
obtain one by simply indexing a string. E.g. if you wanted to pass the
lowercase a as a char, you'd write: "a"[0]
I suppose it's because UnityScript is implemented in Boo and String is provided by Mono.

Related

Using one cout command to print multiple strings with each string placed on a different (text editor) line

Take a look at the following example:
cout << "option 1:
\n option 2:
\n option 3";
I know it's not the best way to output a string, but the question is: why does this cause an error saying that a " character is missing? There is a single string that must go to stdout; it just contains a lot of whitespace characters.
What about this:
string x="
string_test";
One may interpret that string as: "\nxxxxxxxxxxxxstring_test" where x is a whitespace character.
Is it a convention?
That's called a multiline string literal.
You need to escape the embedded newline; otherwise it will not compile:
std::cout << "Hello world \
and stackoverflow";
Note: Backslashes must be immediately before the line ends as they need to escape the newline in the source.
You can also take advantage of the fact that adjacent string literals are concatenated by the compiler:
std::cout << "Hello World"
"Stack overflow";
In C++11, we also have raw string literals, which are kind of like here-text.
Syntax:
prefix(optional) R"delimiter( raw_characters )delimiter"
It allows any character sequence, except that it must not contain the
closing sequence )delimiter". It is used to avoid escaping of any
character. Anything between the delimiters becomes part of the string.
const char* s1 = R"foo(
Hello
World
)foo";
Example taken from cppreference.
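The three approaches above can be compared side by side. The following is only a sketch: the variable names and the check_literals helper are made up for illustration, and the menu text is adapted from the question.

```cpp
#include <cassert>
#include <string>

// 1) Backslash-newline: the line break is spliced out of the source before the
//    literal is formed, so the resulting string contains no newline here.
const std::string continued = "option 1: \
option 2";

// 2) Adjacent literals are concatenated; the "\n" escapes are explicit.
const std::string adjacent =
    "option 1:\n"
    "option 2:\n"
    "option 3";

// 3) Raw literal (C++11): line breaks in the source become part of the string.
const std::string raw = R"menu(option 1:
option 2:
option 3)menu";

// Hypothetical self-check documenting what each form produces.
void check_literals() {
    assert(continued == "option 1: option 2");
    assert(adjacent == raw);
}
```

Note that with the backslash form, any indentation at the start of the continued line would end up inside the string, which is one reason the adjacent-literal form is often easier to keep tidy.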

Removing expressions from QString using QRegExp

I'm having an issue removing expressions from a QString using QRegExp. I have tried countless regexes to no avail. What am I doing wrong?
Sample Text (QString myString) In this instance, myString contains "\u0006\u0007\u0013Hello".
myString.remove(QRegExp("\\[u][0-9]{4}"));
It does not remove any instances of \uXXXX where X = numbers.
However, when I am specific such as:
myString.remove("\u0006");
It does remove it.
A string literal is not always the same as the character sequence it produces:
for (char c : "\u0006\u0007\u0013Hello".toCharArray()) {
System.out.println( c + " (" + (int)c + ")" );
}
System.out.println( "--------------" );
for (char c : "\\u0006\\u0007\\u0013Hello".toCharArray()) {
System.out.println( c + " (" + (int)c + ")" );
}
In the first example, \u0006 encodes a Unicode code point, whereas in the second the string actually contains a backslash.
String literals only exist at compile time; at runtime they are character sequences.
Regexes work on character sequences, not on string literals. Also, the backslash has a special meaning in a regex and needs to be escaped.
Also note that \u0041 is another way to encode A.
Maybe what you are looking for is Unicode categories; the following may help:
string.replaceAll( "\\p{Cc}", "" )
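The same distinction can be sketched in C++. This example uses std::regex instead of QRegExp purely so that it needs no Qt, and it uses the POSIX class [[:cntrl:]] as a rough stand-in for the \p{Cc} category; strip_control_chars and check_strip_control_chars are hypothetical names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: removes control characters from a byte string.
std::string strip_control_chars(const std::string& input) {
    static const std::regex control_chars("[[:cntrl:]]");
    return std::regex_replace(input, control_chars, "");
}

void check_strip_control_chars() {
    // Each escape below is ONE character at run time (codes 6, 7 and 19).
    const std::string decoded = "\x06\x07\x13Hello";
    assert(decoded.size() == 8);
    assert(strip_control_chars(decoded) == "Hello");

    // Here the string really contains a backslash, 'u' and four digits.
    const std::string escaped = "\\u0006Hello";
    assert(escaped.size() == 11);
    assert(strip_control_chars(escaped) == escaped);  // nothing to strip

    // To remove that text, the regex itself must match a literal backslash.
    static const std::regex escape_text("\\\\u[0-9]{4}");
    assert(std::regex_replace(escaped, escape_text, "") == "Hello");
}
```

In the question's case the string holds the decoded characters, so a pattern that looks for a backslash followed by u has nothing to match.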

Issues with encoding plus sign (executing Poco::URI::setQueryParameters and Poco::URI::getQueryParameters gives unexpected result)

Suppose I have a URI parameter whose value contains plus signs (+) and other special chars.
When I execute URI::setQueryParameters and then URI::getQueryParameters, the resulting value is not the same as the original one: all special chars are fine, except the plus sign.
Could you please advise what the conventional way to do this is?
Workaround: explicitly invoke URI::encode with reserved containing plus sign. But this doesn't seem to be right, it really looks like a workaround.
Anyway, if this is the correct way to achieve this, what symbols should I include in reserved, if I want to avoid such surprises in the future?
Other observations: URI::decode has a parameter named plusAsSpace (defaulted to false), but this does not help. URI::getQueryParameters replaces + with (space) before calling URI::decode.
Here's a sample code:
const std::string value_with_plus_signs = "value+with+plus+signs";
Poco::URI::QueryParameters out_params;
out_params.push_back(std::make_pair("param", value_with_plus_signs));
Poco::URI uri("path");
uri.setQueryParameters(out_params);
const auto in_params = uri.getQueryParameters();
std::cout << "Expected: '" << value_with_plus_signs << "', received: '"
<< in_params.front().second << "'" << std::endl;
output: Expected: 'value+with+plus+signs', received: 'value with plus signs'
It seems this was fixed in Poco (notice that '+' is added to the symbols that are encoded by default):
https://github.com/pocoproject/poco/issues/1260
https://github.com/pocoproject/poco/commit/c32e683b6c00950ddfce817dfe8f3fc0b6846455
I tested your code with poco 1.7.9p2 and I got the correct results.
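To see why the plus sign has to be percent-encoded for the round trip to work, here is a minimal standalone sketch. Both helpers are hypothetical and are not Poco's implementation; the encoder simply escapes everything except RFC 3986 unreserved characters, and the decoder treats + as a space the way the question describes getQueryParameters doing.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: percent-encodes every byte except unreserved
// characters (letters, digits, '-', '.', '_', '~'), so '+' becomes "%2B".
std::string encode_query_value(const std::string& value) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical decoder: '+' means space, "%XX" is one byte.
std::string decode_query_value(const std::string& encoded) {
    std::string out;
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        if (encoded[i] == '+') {
            out += ' ';
        } else if (encoded[i] == '%' && i + 2 < encoded.size() &&
                   std::isxdigit(static_cast<unsigned char>(encoded[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(encoded[i + 2]))) {
            out += static_cast<char>(std::stoi(encoded.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += encoded[i];
        }
    }
    return out;
}

void check_round_trip() {
    const std::string value = "value+with+plus+signs";
    // An unencoded '+' is read back as a space, as in the question's output.
    assert(decode_query_value(value) == "value with plus signs");
    // Once '+' is written as %2B, the round trip preserves it.
    assert(decode_query_value(encode_query_value(value)) == value);
}
```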

C++ Qt QString replace double backslash with one

I have a QString with following content:
"MXTP24\\x00\\x00\\xF4\\xF9\\x80\r\n"
I want it to become:
"MXTP24\x00\x00\xF4\xF9\x80\r\n"
I need to replace each "\\x" with "\x" so that I can start parsing the values. But the following code, which I think should do the job, does nothing: I get the same string before and after:
qDebug() << "BEFORE: " << data;
data = data.replace("\\\\x", "\\x", Qt::CaseSensitivity::CaseInsensitive);
qDebug() << "AFTER: " << data;
Here, no change!
Then I tried like this:
data = data.replace("\\x", "\x", Qt::CaseSensitivity::CaseInsensitive);
Then the compiler complains that \x is used with no following hex digits!
Any ideas?
First let's look at what this piece of code does:
data.replace("\\\\x", "\\x", ....
The first string becomes \\x (two backslashes and an x) in the compiled code. This overload of QString::replace takes plain strings, not regular expressions, so it searches for exactly that literal 3-character sequence. Had you passed a QRegExp instead, the backslash would be special in the pattern and would need escaping with another backslash, so 4 backslashes in the C++ string literal would match a single literal backslash in the target text.
The replacement string "\\x" compiles to the literal 2-character string \x. So the call changes something only when the text really contains two consecutive backslashes before an x.
However, this is not your problem, your problem is how qDebug() prints strings. It prints them escaped. That \" at start of output means just plain double quote, 1 char, in the actual string because double quote is escaped. And those \\ also are single backslash char, because literal backslash is also escaped (because it is the escape char and has special meaning for the next char).
So it seems you don't need to do any search replace at all, just remove it.
Try printing the QString in one of these ways to get it shown literally:
std::cout << data.toStdString() << std::endl;
qDebug() << data.toLatin1().constData();
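The effect can be imitated without Qt. The sketch below uses a hypothetical escape_for_display helper that loosely mimics a quoting debug printer (it is not Qt's actual code) to show that the stored string has single backslashes while the printed form shows them doubled.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not Qt's implementation: renders a string the way a
// quoting debug printer might, escaping backslashes, quotes, CR and LF.
std::string escape_for_display(const std::string& text) {
    std::string out = "\"";
    for (char c : text) {
        switch (c) {
            case '\\': out += "\\\\"; break;
            case '"':  out += "\\\""; break;
            case '\r': out += "\\r";  break;
            case '\n': out += "\\n";  break;
            default:   out += c;      break;
        }
    }
    out += '"';
    return out;
}

void check_escape_for_display() {
    // Actual content: MXTP24, then the text \x00 etc. with ONE backslash each,
    // then a real carriage return and line feed.
    const std::string data = "MXTP24\\x00\\x00\\xF4\\xF9\\x80\r\n";
    assert(data.size() == 28);
    // The displayed form doubles every backslash and spells out CR and LF.
    assert(escape_for_display(data) ==
           R"("MXTP24\\x00\\x00\\xF4\\xF9\\x80\r\n")");
}
```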

How to declare a variable that spans multiple lines

I'm attempting to initialise a string variable in C++, and the value is so long that it's going to exceed the 80 character per line limit I'm working to, so I'd like to split it to the next line, but I'm not sure how to do that.
I know that when splitting the contents of a stream across multiple lines, the syntax goes like
cout << "This is a string"
<< "This is another string";
Is there an equivalent for variable assignment, or do I have to declare multiple variables and concatenate them?
Edit: I misspoke when I wrote the initial question. When I say 'next line', I'm just meaning the next line of the script. When it is printed upon execution, I would like it to be on the same line.
You can simply break the line like this:
string longText("This is a "
"very very very "
"long text");
In the C family, whitespace between tokens is insignificant and adjacent string literals are concatenated by the compiler, so you can freely split a string literal across multiple lines this way.
It can also simply be
cout << "This is a string"
"This is another string";
You can write this:
const char * str = "First phrase, "
"Second phrase, "
"Third phrase";
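One detail worth keeping in mind with all of these: the compiler inserts nothing between the fragments, so any space has to be written inside one of the literals. A small sketch (the names kJoined, long_text, kOptions and check_concatenation are made up for illustration):

```cpp
#include <cassert>
#include <string>

// The fragments are merged at compile time; no separator is inserted.
constexpr char kJoined[] = "This is a string"
                           "This is another string";

// The same works when initialising a std::string.
const std::string long_text = "This is a "
                              "very very very "
                              "long text";

// Pitfall: a missing comma silently merges two intended array elements.
const char* const kOptions[] = {"alpha", "beta" "gamma"};  // 2 elements, not 3

void check_concatenation() {
    assert(std::string(kJoined) == "This is a stringThis is another string");
    assert(long_text == "This is a very very very long text");
    assert(sizeof(kOptions) / sizeof(kOptions[0]) == 2);
}
```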