QString split() function with QRegExp strange behaviour - c++

I have some text like this:
"1.801908\t20.439980\t\r\n25.822865\t20.439305\t\r\n26.113739\t4.069647\t\r\n1.800252\t4.301690\t\r\n"
I want to split this text by lines and then by tabs. I am using QString::split() with QRegExp like this:
QStringList rows = text.split(QRegExp("[\r\n]"), QString::SkipEmptyParts);
QStringList cols = rows.at(0).split(QRegExp("[ \t]"), QString::SkipEmptyParts);
But what I got in cols is just one item:
"1.801908\920.439980\9"
As far as I can tell, the first split replaced all the \t characters with \9, but I don't understand why or how to fix that. Any explanation?
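For comparison, here is a rough sketch of the intended two-level split (rows first, then columns) in plain standard C++ rather than Qt, so it compiles without any Qt headers. The helper name splitSkipEmpty is made up for this illustration; it only mimics what split() with SkipEmptyParts and a character class is expected to do, and it does not explain the Qt behaviour described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `text` at any character found in `delimiters`
// and drop empty parts (similar in spirit to SkipEmptyParts).
std::vector<std::string> splitSkipEmpty(const std::string& text,
                                        const std::string& delimiters)
{
    std::vector<std::string> parts;
    // Skip leading delimiters, then alternate between token and delimiter runs.
    std::size_t start = text.find_first_not_of(delimiters);
    while (start != std::string::npos) {
        const std::size_t stop = text.find_first_of(delimiters, start);
        parts.push_back(text.substr(start, stop - start));
        start = text.find_first_not_of(delimiters, stop);
    }
    return parts;
}
```

With the sample text, splitSkipEmpty(text, "\r\n") should give one entry per line, and splitSkipEmpty(rows[0], " \t") should then give the two numbers of the first row as separate items.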

Related

Regex for QString::split() for two float numbers

I have a QString like this: "567\n1.23456 2.34567\n1.23456 2.34"
I want only the "whole" float numbers between the \n characters.
After split() I need a QStringList that contains only these float numbers. QString::split() accepts a regex, so maybe I can use one here.
I tried QStringList myList = QString("56\n1.12345 2.34567\n1.23456 2.34").split('\n'), which gives me ["1.2345 2.34567"], so I need to split this again into ["1.23456"] and ["2.34567"].
The Qt documentation for QString::split has your answer:
QString str;
QStringList list;
str = "Some text\n\twith strange whitespace.";
list = str.split(QRegularExpression("\\s+"));
// list: [ "Some", "text", "with", "strange", "whitespace." ]
The regex \d+(\.\d+)? will match any float or int number.
You should split at QRegularExpression("\\s+"). \s means whitespace (which includes both the space character and \n, the newline), + means one or more, and the backslash has to be escaped inside a C++ string literal.
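To see the effect of the whitespace split suggested above without needing Qt, here is a minimal sketch in standard C++ using std::regex. The function name splitOnWhitespace is an invention for this example, not a Qt or standard library API.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on runs of whitespace
// (spaces, tabs, newlines) and drop empty tokens.
std::vector<std::string> splitOnWhitespace(const std::string& text)
{
    static const std::regex whitespace("\\s+");
    std::vector<std::string> tokens;
    // Submatch index -1 selects the text between matches, i.e. the tokens.
    std::sregex_token_iterator it(text.begin(), text.end(), whitespace, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        if (it->length() > 0) {  // a leading delimiter yields one empty token
            tokens.push_back(it->str());
        }
    }
    return tokens;
}
```

Calling splitOnWhitespace("567\n1.23456 2.34567\n1.23456 2.34") should produce five tokens, starting with "567". Note that this keeps the leading integer as well; filtering it out would need an extra check on each token.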

Regex split string in qt

So, I'm writing an app in Qt and I need to split a string into operators and variables like this:
a&&b||!c --> vars ["a", "b", "c"] and operators ["&&", "||", "!"].
I wrote this to get the list of vars:
QRegExp rx("[&|!]+");
QStringList vars = exp.split(rx, QString::SkipEmptyParts);
For the operators, I tried this one:
QRegExp oprx("[a-z]");
QStringList operators = exp.split(oprx, QString::SkipEmptyParts);
but it gives the last operator as "||!". Can you help me with a pattern for splitting out the operators?
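Splitting on [a-z] cannot separate "||" from "!" because no variable sits between them in a&&b||!c. One possible way around that is to collect matches instead of splitting. The sketch below does this in standard C++ with std::regex rather than QRegExp, so it is an analogy and not Qt code; findAll is a hypothetical helper name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: return every non-overlapping match of `pattern`
// in `text`, in order of appearance.
std::vector<std::string> findAll(const std::string& text,
                                 const std::string& pattern)
{
    const std::regex rx(pattern);
    std::vector<std::string> matches;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), rx); it != end; ++it) {
        matches.push_back(it->str());
    }
    return matches;
}
```

With this, findAll("a&&b||!c", "&&|\\|\\||!") should yield the three operators as separate items, and findAll("a&&b||!c", "[a-z]+") the variables. The operator pattern is an alternation of the three operators, assuming those are the only ones in the expression.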

Pack QStringList to QString and unpack it back

I'm looking for an easy and foolproof way to convert an arbitrary QStringList to a single QString and back.
QStringList fruits;
fruits << "Banana" << "Apple" << "Orange";
QString packedFruits = pack(fruits);
QStringList unpackFruits = unpack(packedFruits);
// Should be true
// fruits == unpackFruits;
What might be the easiest solution for this kind of problem?
From QStringList to QString - QStringList::join:
Joins all the string list's strings into a single string with each element separated by the given separator (which can be an empty string).
QString pack(QStringList const& list)
{
    return list.join(reserved_separator);
}
From QString to QStringList - QString::split:
Splits the string into substrings wherever sep occurs, and returns the list of those strings. If sep does not match anywhere in the string, split() returns a single-element list containing this string.
QStringList unpack(QString const& string)
{
    return string.split(reserved_separator);
}
Previous answers mentioned QString::split and QStringList::join which is the correct way, but if the separator you choose is included in any of the strings it will break your conversion.
You must keep the strings in the list from containing your separator, using one of the following techniques:
Throw an error before QStringList::join if any string includes the separator.
Ensure they cannot contain the separator, for example by storing each string as its QByteArray::toHex(myString.toLatin1()) representation and using a separator with character(s) outside the ranges 0..9 and a..f. Convert back with QString::fromLatin1(QByteArray::fromHex(myHexString)) afterward.
Use any separator regardless if the strings contain it, but implement an escape logic for it before the join(), and an un-escape logic after the split(), so that the separator is never present in any of the strings at the time of join, but all instances of it will be restored.
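The escaping technique from the last item could look roughly like the sketch below. It is written in standard C++ (std::string and std::vector standing in for QString and QStringList) so it builds without Qt. The separator ',' and the escape character '\' are arbitrary choices made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed for this sketch: ',' separates items and '\\' escapes.
constexpr char kSeparator = ',';
constexpr char kEscape = '\\';

// Join the items, prefixing every separator or escape character
// inside an item with the escape character.
std::string pack(const std::vector<std::string>& items)
{
    std::string packed;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) packed += kSeparator;
        for (char c : items[i]) {
            if (c == kSeparator || c == kEscape) packed += kEscape;
            packed += c;
        }
    }
    return packed;
}

// Reverse of pack(): an escaped character is copied literally,
// an unescaped separator starts a new item.
std::vector<std::string> unpack(const std::string& packed)
{
    std::vector<std::string> items(1);
    for (std::size_t i = 0; i < packed.size(); ++i) {
        const char c = packed[i];
        if (c == kEscape && i + 1 < packed.size()) {
            items.back() += packed[++i];
        } else if (c == kSeparator) {
            items.emplace_back();
        } else {
            items.back() += c;
        }
    }
    return items;
}
```

One limitation of this sketch: an empty list and a list holding a single empty string both pack to an empty string, so unpack() cannot tell them apart. If that distinction matters, a length prefix or a dedicated marker would be needed.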
Use QStringList::join():
QStringList strList;
strList << "Banana" << "Apple" << "Orange";
QString str = strList.join(""); // str = "BananaAppleOrange";
str = strList.join(","); // str = "Banana,Apple,Orange";

Regex - Private subtags RFC5646

Can someone please help me with a regex to pull out the private-use subtags from an RFC 5646 language tag?
Example strings
en-us-x-test-test1 = test,test1
en-gb-x-test-test2 = test,test2
fr-x-test-test3 = test,test3
I'm using a QRegExp
Thanks for any assistance
You don't need a regex here. Split your input by - then take the last two strings and add a comma in between:
QString str = "en-us-x-test-test1";
QStringList list = str.split('-');
QString output = list.at(list.count()-2) + "," + list.at(list.count()-1);
Of course, you have to check the list length first to avoid an out-of-range index.
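A version of the same idea with the length check included might look like this. It is a sketch in standard C++ instead of Qt, and lastTwoSubtags is a made-up name. Returning an empty string for inputs with fewer than two parts is just one possible convention.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `tag` at '-' and return the last two
// parts joined by a comma, or an empty string if there are fewer than two.
std::string lastTwoSubtags(const std::string& tag)
{
    std::vector<std::string> parts;
    std::istringstream stream(tag);
    std::string part;
    while (std::getline(stream, part, '-')) {
        parts.push_back(part);
    }
    if (parts.size() < 2) {
        return std::string();  // not enough subtags to pick two
    }
    return parts[parts.size() - 2] + "," + parts[parts.size() - 1];
}
```

For the example strings, lastTwoSubtags("en-us-x-test-test1") should give "test,test1". Like the answer above, this assumes there are always exactly two subtags after the x.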

C++ Boost: Split function is_any_of()

I'm trying to use the split() function provided in boost/algorithm/string.hpp in the following function :
vector<std::string> splitString(string input, string pivot) { //Pivot: e.g., "##"
    vector<string> splitInput; //Vector where the string is split and stored
    split(splitInput,input,is_any_of(pivot),token_compress_on); //Split the string
    return splitInput;
}
The following call :
string hello = "Hieafds##addgaeg##adf#h";
vector<string> split = splitString(hello,"##"); //Split the string based on occurrences of "##"
splits the string into "Hieafds" "addgaeg" "adf" & "h". However I don't want the string to be split by a single #. I think that the problem is with is_any_of().
How should the function be modified so that the string is split only by occurrences of "##" ?
You're right, you have to use is_any_of()
std::string input = "some##text";
std::vector<std::string> output;
split( output, input, is_any_of( "##" ) );
Update:
If you want to split on exactly two '#' characters, you may need a regular expression instead:
split_regex( output, input, regex( "##" ) );
Take a look at the documentation example.
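If pulling in a regex library feels like too much for a fixed delimiter, a plain substring search also works. The following is a minimal sketch using only the standard library, not Boost; splitOnDelimiter is a hypothetical name, and unlike token_compress_on it keeps empty parts between adjacent delimiters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `input` at every occurrence of the whole
// `delimiter` string (not at its individual characters).
std::vector<std::string> splitOnDelimiter(const std::string& input,
                                          const std::string& delimiter)
{
    std::vector<std::string> parts;
    if (delimiter.empty()) {  // nothing to split on; return input unchanged
        parts.push_back(input);
        return parts;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(delimiter, start);
        if (pos == std::string::npos) {
            parts.push_back(input.substr(start));  // remainder after last match
            break;
        }
        parts.push_back(input.substr(start, pos - start));
        start = pos + delimiter.size();
    }
    return parts;
}
```

With the string from the question, splitOnDelimiter("Hieafds##addgaeg##adf#h", "##") should return three parts, with the single # left inside the last one.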