Issues with conversion between QString and QByteArray - c++

I have a strange issue converting QString to QByteArray:
qDebug()<<"content to encryp:\n"<<content<<"\n";
QString pwHash = generalSettings.password;
QByteArray secretKey;
secretKey.append(pwHash);
QString clearText(content);
QBlowfish bf(secretKey);
bf.setPaddingEnabled(true);
QByteArray encryptedBa = bf.encrypted(clearText.toUtf8());
qDebug()<<"encryptedBa:";
qDebug()<<encryptedBa<<"\n";
const char* cString = encryptedBa.constData();
QString saved = QString::fromUtf8(cString);
QByteArray decryptedBa = bf.decrypted(saved.toUtf8());
qDebug()<<"decryptedBa:";
qDebug()<<decryptedBa<<"\n";
The output is this:
content to encryp:
"<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
<html><head><meta name="qrichtext" content="1" /><style type="text/css">
p, li { white-space: pre-wrap; }
</style></head><body style=" font-family:'Arial'; font-size:12pt; font-weight:400; font-style:normal;">
<p style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">HALLO</p></body></html>"
encryptedBa:
"?6N?x9?1!???P?a?+]6H???O??Y?or??l?
??k#???m??d?.?M?o?$F??I(?
7?3??NtE-?fH(?w|sL?i??
cipherText (Input to decrypt):
"?6N?x9?1!???P?a?+]6H???O??Y?or??l?
??k#???m??d?.?M?o?$F??I(?
7?3??NtE-?fH(?w|sL?i??"
decryptedBa:
""
Now the strange part: If I switch the line
QByteArray decryptedBa = bf.decrypted(saved.toUtf8());
to
QByteArray decryptedBa = bf.decrypted(encryptedBa);
everything works. BUT: the printed output is 100% the same in both cases, except that decryptedBa at the end is a copy of content, as it should be.
I am also wondering why the cipherText has a " at the end, but that is the same for both versions. What am I missing here?
EDIT:
Following the tip from the answer below and the comments, I came up with this solution:
QString saved = encryptedBa.toBase64();
std::string stdString = std::string(saved.toUtf8().data());
QByteArray fromSaved = QByteArray::fromBase64(stdString.data());
The " at the end of the ciphertext is also gone, then.

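The cipher text is most likely not valid UTF-8, so the data is not guaranteed to round-trip through QString::fromUtf8(data).toUtf8(). Note also that QString::fromUtf8(const char*) called without a length stops at the first zero byte, and encrypted data may well contain one.
Double-check that encryptedBa and saved.toUtf8() are actually byte-for-byte identical.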
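The Base64 approach from the EDIT works because Base64 maps arbitrary bytes onto plain ASCII characters, which survive any text conversion. Below is a minimal, Qt-free sketch of that round trip. The helpers base64_encode, base64_decode and round_trip_demo are hypothetical names written for this illustration, not Qt API; in Qt itself QByteArray::toBase64() and QByteArray::fromBase64() do the same job.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Standard Base64 alphabet (illustrative helper, not part of Qt).
const std::string kAlphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encodes arbitrary bytes (including '\0' and bytes >= 0x80) as ASCII text.
std::string base64_encode(const std::string& in) {
    std::string out;
    std::uint32_t buffer = 0;
    int bits = 0;
    for (unsigned char c : in) {
        buffer = (buffer << 8) | c;
        bits += 8;
        while (bits >= 6) {          // emit one character per 6 bits
            bits -= 6;
            out.push_back(kAlphabet[(buffer >> bits) & 0x3F]);
        }
    }
    if (bits > 0) {                  // left-align the remaining bits
        out.push_back(kAlphabet[(buffer << (6 - bits)) & 0x3F]);
    }
    while (out.size() % 4 != 0) out.push_back('=');
    return out;
}

// Decodes Base64 text; stops at padding and skips unknown characters.
std::string base64_decode(const std::string& in) {
    std::string out;
    std::uint32_t buffer = 0;
    int bits = 0;
    for (char c : in) {
        if (c == '=') break;
        const std::string::size_type pos = kAlphabet.find(c);
        if (pos == std::string::npos) continue;
        buffer = (buffer << 6) | static_cast<std::uint32_t>(pos);
        bits += 6;
        if (bits >= 8) {             // a full byte is available
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}

// Usage sketch: data with a zero byte and non-UTF-8 bytes survives intact.
void round_trip_demo() {
    const std::string raw{'\0', '\xff', 'A', '\x80'};
    assert(base64_decode(base64_encode(raw)) == raw);
}
```

The encoded form is about a third larger than the raw bytes, which is the price for being able to store it in a QString or a text file.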


Protobuf: Serialize/DeSerialize C++ to Js

I'm using protobuf to send/receive binary data from Cpp to Js and vice-versa and I'm using QWebChannel to communicate with the HTML client.
Question: How to deserialize binary data in cpp which is serialized and sent from Js?
Here is what I tried:
//Serialization: Cpp to JS - WORKING
tutorial::PhoneNumber* ph = new tutorial::PhoneNumber();
ph->set_number("555-515-135");
ph->set_type(500);
QByteArray bytes = QByteArray::fromStdString(ph->SerializeAsString());
QString args = "\"" + bytes.toBase64(QByteArray::Base64Encoding) + "\"";
QString JsFunctionCall = QString("DeserializePhoneNumber(%1);").arg(args);
m_pWebView->page()->runJavaScript(JsFunctionCall);
//Deserialization In JS - Js Code - WORKING
var obj = phone_msg.PhoneNumber.deserializeBinary(data);
console.log("PhoneNumber: " + obj.getNumber());
console.log("Type: " + obj.getType());
//Serialization in Js - WORKING
var phNum = new phone_msg.PhoneNumber;
phNum.setNumber("555-515-135");
phNum.setId(500);
var base64Str = btoa(phNum.serializeBinary());
console.log("base64Str: " + base64Str);
//Call Cpp function from Js
MainChannel.SendMsgToCpp(base64Str);
//Deserialization in Cpp - NOT WORKING
bool WebRelay::ReceiveMsgFromJs(QVariant data)
{
QString str = data.toString();
QByteArray bytedata = str.toLatin1();
QByteArray base64data = QByteArray::fromBase64(bytedata);
std::string stdstr = base64data.toStdString();
tutorial::PhoneNumber cppPhNum;
//THIS IS NOT WORKING. Text and id are invalid
bool ok = cppPhNum.ParseFromArray(base64data.constData(), base64data.size());
qDebug() << "Text:" << QString::fromStdString(cppPhNum.number());
qDebug() << "id:" << cppPhNum.id();
return ok;
}
Found the problem.
I was getting comma-separated bytes from Js like:
10,7,74,115,73,116,101,109,49,16,45
I split the strings by ',' and created a QByteArray
QStringList strList = str.split(',');
QByteArray bytedata;
foreach(const QString & byteStr, strList)
{
// each field is one decimal byte value (0-255)
bytedata.append(static_cast<char>(byteStr.toUInt()));
}
std::string stdstr = bytedata.toStdString();
itemData.ParseFromString(stdstr);
It works.
Also, in JS, I removed the conversion of the binary data to Base64:
var base64Str = phNum.serializeBinary();
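A likely explanation for the comma-separated text: btoa() converts its argument to a string first, and a Uint8Array (which serializeBinary() returns) stringifies as its decimal values joined by commas, so the Base64 payload contained that text rather than the raw bytes. The conversion back to bytes can be sketched without Qt as follows. bytes_from_csv and csv_demo are hypothetical helper names for this illustration, and the range check is an added assumption about how malformed input should be handled.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: turns "10,7,74" into the three bytes 0x0A 0x07 0x4A.
// Throws std::invalid_argument if a field is not a decimal value in 0..255.
std::string bytes_from_csv(const std::string& csv) {
    std::string out;
    std::istringstream stream(csv);
    std::string field;
    while (std::getline(stream, field, ',')) {
        std::size_t used = 0;
        const unsigned long value = std::stoul(field, &used);
        if (used != field.size() || value > 255) {
            throw std::invalid_argument("not a byte value: " + field);
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(value)));
    }
    return out;
}

// Usage sketch with the sample data from the answer above.
void csv_demo() {
    const std::string bytes =
        bytes_from_csv("10,7,74,115,73,116,101,109,49,16,45");
    assert(bytes.size() == 11);
    assert(bytes.substr(2, 7) == "JsItem1");
}
```

The resulting std::string can be passed to ParseFromString() directly, since std::string is allowed to hold arbitrary bytes.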

can't get xhtml <script> content with libxml++ using xpath expression

#include <libxml++/libxml++.h>
xmlpp::NodeSet xmlP(std::string xml_string, std::string xpath) {
xmlpp::DomParser doc;
// 'response' contains your HTML
doc.parse_memory(xml_string);
xmlpp::Document* document = doc.get_document();
xmlpp::Element* root = document->get_root_node();
xmlpp::NodeSet elemns = root->find(xpath);
xmlpp::Node* element = elemns[0];
std::cout << elemns.size() << std::endl;
std::cout << element->get_line() << std::endl;
//const auto nodeText = dynamic_cast<const xmlpp::TextNode*>(element);
const auto nodeText = dynamic_cast<const xmlpp::ContentNode*>(element);
if (nodeText && nodeText->is_white_space()) //Let's ignore the indenting - you don't always want to do this.
{
std::cout << nodeText->get_content() << std::endl;
}
}
The xml_string is something like this :
std::string xml_strings("
<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">
<html lang=\"en\" xml:lang=\"en\" xmlns=\"http://www.w3.org/1999/xhtml\">
<head>
<title>Demo page</title></head>
<body>
<div class=\"item\">
<div class=\"row\">
<div class=\"col-xs-8\">Item</div>
<div class=\"col-xs-4 value\">
<script type=\"text/javascript\">fruit('orange');</script>
</div></div></div>
</body></html>");
The function is called with the page and the XPath expression like this: xmlpp::NodeSet xmlNodes = xmlP(xml_strings, "/html/body/div/div/div[2]/script");
The problem is that I couldn't get the text inside the <script>. I tried dynamic_cast'ing to ContentNode, but nothing helped.
Is libxml++ worth it, or do I need to solve my problem with another XML library?
I would appreciate any suggestions that get me the text value from the <script> tag.
I tried reproducing your issue locally and could not get root->find(xpath) to produce any nodes.
According to this issue, you need to tell XPath which namespace your nodes are under, even if it is the default namespace.
I changed the XPath string and find invocation as follows:
std::string xpath("/x:html/x:body/x:div/x:div/x:div[2]/x:script");
xmlpp::Node::PrefixNsMap nsMap = {{"x",root->get_namespace_uri()}};
xmlpp::Node::NodeSet elemns = root->find(xpath, nsMap);
xmlpp::Node* element = elemns[0];
const auto nodeText = dynamic_cast<const xmlpp::Element*>(element);
if (nodeText) {
std::cout << nodeText->get_first_child_text()->get_content() << std::endl;
}

How to make the characters viewable?

I send a request to Alipay and get this response. I feel like some characters are Chinese characters and I cannot view them.
How can I change it so I can see them as Chinese characters or something that is viewable?
<head><title>\xD6\xA7\xB8\xB6\xB1\
xA6 - \xCD\xF8\xC9\xCF\xD6\xA7\xB8\xB6 \xB0\xB2\xC8\xAB\xBF\xEC\xCB\xD9\xA3\xA1</title><meta http-equiv=\"Content-Type\" content=\"text/html; charset=gb2312\" /><meta http-equiv=\"x-ua-compatible\" content=\"ie=7\" />
<meta name=\"description\" content=\"\xD6\xD0\xB9\xFA\xD7\xEE\xB4\xF3\xB5\xC4\xB5\xDA\xC8\xFD\xB7\xBD\xB5\xE7\xD7\xD3\xD6\xA7\xB8\xB6\xB7\xFE\xCE\xF1\xCC\xE1\xB9\xA9\xC9\xCC\" /><meta name=\"keywords\" content=\"\xCD\
xF8\xC9\xCF\xB9\xBA\xCE\xEF/\xCD\xF8\xC9\xCF\xD6\xA7\xB8\xB6/\xB0\xB2\xC8\xAB\xD6\xA7\xB8\xB6/\xB0\xB2\xC8\xAB\xB9\xBA\xCE\xEF/\xB9\xBA\xCE\xEF\xA3\xAC\xB0\xB2\xC8\xAB/\xD6\xA7\xB8\xB6/\xD6\xA7\xB8\xB6\xB1\xA6,\xD4\xDA\xCF\xD
F/\xB8\xB6\xBF\xEE,\xCA\xD5\xBF\xEE/\xCD\xF8\xC9\xCF,\xC3\xB3\xD2\xD7/\xCD\xF8\xC9\xCF\xC3\xB3\xD2\xD7\" /><link rel=\"icon\" href=\"https://img.alipay.com:443/img/icon/favicon.ico\" type=\"image/x-icon\" /><l
ink rel=\"shortcut icon\" href=\"https://img.alipay.com:443/img/icon/favicon.ico\" type=\"image/x-icon\" /><link rel=\"stylesheet\" type=\"text/css\" href=\"https://img.alipay.com:443/assets/c/global/global_v1.css?t=20081
119.css\" /><link rel=\"stylesheet\" type=\"text/css\" href=\"https://img.alipay.com:443/assets/c/sys/sys.tabs.css?t=20080709.css\" /><link rel=\"stylesheet\" type=\"text/css\" href=\"https://img.alipay.com:443/assets
/c/typography/ot.old.css?t=20080709.css\" /><link rel=\"stylesheet\" type=\"text/css\" href=\"https://img.alipay.com:443/assets/c/typography/as.kt.css?t=20080709.css\" /></head>
<!--[if gte IE 6]><script type='text/javascript' src='https://img.alipay.com:443/js/sys/sys.object.js?t=20080122.js' defer='d
efer'></script><![endif]--><style type=\"text/css\">.topsearch{font-size:12px;position:relative;}.topsearch form{margin:-2px 0 0 0;padding:0}.topsearch form input{width:94px;height:13px;line-height:13p
x;border:1px solid #d7d7d7;padding:2px 2px 0 2px;font-size:12px}.topsearch form button{width:40px;height:18px;margin:0 0 0 4px;border:1px solid #d7d7d7;background:#f3f3f3;padding:0}.topsearch .xnews{border:0;position:
absolute;top:0;right:-15px}#Header #QuickLinks .QuickLinksMore{position:relative;}#Header #QuickLinks .QuickLinksMore ol{display:none;position:absolute;top:18px;left:5px;float:none;width:65px;text-align:left;z-index:9
999;text-align:left;border:1px solid #ccc;background:#fff;margin:0;padding:5px 0}#Header #QuickLinks .QuickLinksMore ol li{display:block;float:none;background:none;margin:0;padding:2px 3px;text-align:center}#Header #Q
uickLinks .QuickLinksMore ol li a{text-decoration:none;color:#666;margin:0;padding:0}#Header #QuickLinks .QuickLinksMore ol li a:hover{color:#f60}}</style><!-- Header start--><div id=\"Header\" class=\"cle
arfix\"> <!-- HeadTop start--> <div id=\"HeadTop\"> <div id=\"Logo\"></div> <div id=\"QuickLink
s\" style=\"padding-top:8px\"> <ul><li class=\"topsearch\" style=\"background:none\"><form id=\"topsearch\" name=\"topsearch\" action=\"https://help.alipay.com/support/search_new_result.htm\" method=\"get
\" onsubmit=\"return checkTopSearch()\"><input type=\"hidden\" name=\"_form_token\" value=\"kp76AFTuuAIZrf7eAcL6jMopmn2ka5fL\"/><input id=\"word\" name=\"word\" type=\"text\"/><button type=\"submit
\">\xCB\xD1\xCB\xF7</button></form></li><li>\xB0\xEF\xD6\xFA\xD6\xD0\xD0\xC4</li><li><a href=\"https:
//jifen.alipay.com/index.htm?src=yy_jifen_sy01\" target=\"_blank\">\xBF\xEC\xC0\xD6\xBB\xFD\xB7\xD6</a></li><li id=\"QuickLinksMore1\" class=\"QuickLinksMore\"><a href=\"https://wow.alipay.com?src=wow_ho
me\">\xCD\xDB\xA3\xA1\xD6\xA7\xB8\xB6\xB1\xA6</a><ol><li>\xB4\xD9 \xCF\xFA \xBD\xD6</li><li><a href=\"https://wow
.alipay.com/overseas.htm?src=overseas_home\">\xBA\xA3\xCD\xE2\xB9\xBA\xCE\xEF</a></li><li>\xBA\xCF\xD7\xF7\xC9\xCC\xBC\xD2</l
i><li>\xBB\xE1\xD4\xB1\xB7\xFE\xCE\xF1</li></ol></li></ul><script type=\"text/javascript\">var search
Info=document.getElementById(\"word\");function searchclearInfoJs(){if(searchInfo.value==\"\xCA\xE4\xC8\xEB\xC4\xFA\xB5\xC4\xCE\xCA\xCC\xE2\"){searchInfo.style.color=\"#000\";searchInfo.value=\
"\";}}function searchinputInfoJs(){if(searchInfo.value==\"\"){searchInfo.style.color=\"#999\";searchInfo.value=\"\xCA\xE4\xC8\xEB\xC4\xFA\xB5\xC4\xCE\xCA\xCC\xE2\";}}if(se
archInfo!=undefined){if(searchInfo.value==\"\"||searchInfo.value==\"\xCA\xE4\xC8\xEB\xC4\xFA\xB5\xC4\xCE\xCA\xCC\xE2\"){searchInfo.style.color=\"#999\";searchInfo.value=\"\xCA\xE4\xC8\xEB\xC4\xFA\x
B5\xC4\xCE\xCA\xCC\xE2\";searchInfo.onfocus=function(){searchclearInfoJs();}searchInfo.onblur=function(){searchinputInfoJs();}}}function showMore(obj){var oMor
e=document.getElementById(obj);var oMoreUl=oMore.getElementsByTagName(\"ol\")[0];if(document.all){//ieoMore.setAttribute(\"onmouseover\",eval(function(){oMoreUl.style.display=\"block\";}));oMore.setAttribu
te(\"onmouseout\",eval(function(){oMoreUl.style.display=\"none\";}));oMoreUl.setAttribute(\"onmouseover\",eval(function(){oMoreUl.style.display=\"block\";}));oMoreUl.setAttribute(\"onmouseout\",eval(function(){oMo
reUl.style.display=\"none\";}));}else{//ffoMore.addEventListener(\"mouseover\", function(){oMoreUl.style.display=\"block\";}, false);oMore.addEventListener(\"mouseout\", function(){oMoreUl.style.display=\"none
\";}, false);oMoreUl.addEventListener(\"mouseover\", function(){oMoreUl.style.display=\"block\";}, false);oMoreUl.addEventListener(\"mouseout\", function(){oMoreUl.style.display=\"none\";}, false);}}sh
owMore(\"QuickLinksMore1\");function checkTopSearch(){if(searchInfo.value==\"\"||searchInfo.value==\"\xCA\xE4\xC8\xEB\xC4\xFA\xB5\xC4\xCE\xCA\xCC\xE2\"){alert(\"\xC7\xEB\xCA\xE4\xC8\xEB\xC4\xFA\xB5\xC4\xCE\xCA\x
CC\xE2 \xC8\xE7\xA3\xBA\xCD\xFC\xBC\xC7\xC3\xDC\xC2\xEB\");return false;}return true;}</script><ul style=\"clear:both\"><li class=\"Inpour\" style=\"background:none\"><a href=\"https://www.
alipay.com/user/inpour_request.htm?src=yy_content_czbutton\"><img style=\"position:absolute;top:-15px;margin-left:70px;\"src=\"https://img.alipay.com:443/assets/i/base/icon/sjf.gif\" width=\"43\" height=\"22\" border=\"0\" a
lt=\"\xB3\xE4\xD6\xB5\xCB\xCD\xBB\xFD\xB7\xD6\"/></a></li><li>\xD0\xC5\xC8\xCE\xBC\xC6\xBB\xAE</li>\
t<li>\xC4\xFA\xBA\xC3\xA3\xAC\xC7\xEB \xB5\xC7\xC2\xBC</li>
</ul> </div> </div> <!-- HeadTop ending--> </div><!-- Header ending--><div id=\"Info\"><div class=\"ExclaimedInfo\"><strong>\xB5\xF7\xCA\xD4\xB4\xED\xCE\x
F3\xA3\xAC\xC7\xEB\xBB\xD8\xB5\xBD\xC7\xEB\xC7\xF3\xC0\xB4\xD4\xB4\xB5\xD8\xA3\xAC\xD6\xD8\xD0\xC2\xB7\xA2\xC6\xF0\xC7\xEB\xC7\xF3\xA1\xA3</strong> <div class=\"Todo\">\xB4\xED\xCE\xF3\xB4\xFA\xC2\xEB ILLEGAL_PARTNER <
/div> <ul> <li>\xCB\xB5\xC3\xF7:\xC8\xE7\xB9\xFB\xC4\xFA\xB2\xBB\xCA\xC7\xD2\xF2\xCE\xAA\xB1\xBE\xBD\xD3\xBF\xDA\xBC\xAF\xB3\xC9\xB5\xF7\xCA\xD4\xB6\xF8\xBF\xB4\xBC\xFB\xB8\xC3\xB4\xED\xCE\xF3\xCC\xE1\xD0\xD1\
xA3\xAC\xC7\xEB\xC1\xAA\xCF\xB5\xB1\xBE\xB4\xCE\xC7\xEB\xC7\xF3\xC0\xB4\xD4\xB4\xCD\xF8\xD5\xBE\xA3\xAC\xB1\xBE\xB4\xED\xCE\xF3\xCA\xF4\xD3\xDA\xCD\xF8\xD5\xBE\xBC\xAF\xB3\xC9\xBD\xD3\xBF\xDA\xB5\xC4\xB4\xED\xCE\xF3\xA1\xA3</
li> </ul><ul><div class=\"HelpSubmit\"><b>\xCE\xCA\xCC\xE2\xC3\xBB\xBD\xE2\xBE\xF6\xA3\xBF<input type=\"button\" value=\"\xCB\xD1\xCB\xF7\xD5\xD2\xB4\xF0\xB0\xB8\" onclick=\"window.open('http://help
.alipay.com/support/index.htm?src=yy_reach_error','_self')\">\xA3\xAC\xBB\xF2\xD5\xDF\xCF\xF2\xCD\xF8\xD3\xD1\xCC\xE1\xCE\xCA\xA1\xA3</b
></div></ul></div></div><!--footer start--><div id=\"Foot\"> <div class=\"Shell clearfix\"> <ul> <li><a href=\"https://www.alipay.com/static/aboutalipay/about.htm\" target=\"_bla
nk\">\xB9\xD8\xD3\xDA\xD6\xA7\xB8\xB6\xB1\xA6</a></li> <li>\xCC\xE5\xD1\xE9\xBC\xC6\xBB\xAE</li> <li><a href=\"https://blog.alipay.com\" target=\"_blan
k\">\xB9\xD9\xB7\xBD\xB2\xA9\xBF\xCD</a></li> <li>\xB3\xCF\xD5\xF7\xD3\xA2\xB2\xC5</li> <li><a href=\"https://www.alipay.com/static/aboutalipay/contac
t.htm\" target=\"_blank\">\xC1\xAA\xCF\xB5\xCE\xD2\xC3\xC7</a></li> <li>International Business</li> <li><a href=\"https://www
.alipay.com/static/aboutalipay/englishabout.htm\" target=\"_blank\">About Alipay</a></li> </ul> </div> <ul class=\"CopyRight clearfix\"> <li><a href=\"https://www.alipay.com/static/phone/alipay_pho
ne.htm?src=yy_sy_sjzf\" target=\"_blank\">\xB5\xE7\xBB\xB0\xD6\xA7\xB8\xB6\xB1\xA6</a>\xA3\xBA""400-66-13800 | \xCA\xD6\xBB\xFA\xD6\xA7\xB8\xB6\xB1\xA6\xA3\xBAwap.alipay.com</li> <li>\xD6\xA7\xB8\xB6\xB1\xA6\xB0\xE6\xC8\
xA8\xCB\xF9\xD3\xD0 2004-2016 ALIPAY.COM</li> </ul> <div id=\"ServerNum\">mapi-60-72</div></div><!--footer ending-->"
Here is my code in Qt :
void sendRequest();
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
sendRequest();
return a.exec();
}
void sendRequest(){
QNetworkProxyFactory::setUseSystemConfiguration(true);
QEventLoop eventLoop;
QNetworkAccessManager mgr;
QObject::connect(&mgr, SIGNAL(finished(QNetworkReply*)), &eventLoop, SLOT(quit()));
// the HTTP request
QNetworkRequest req( QUrl( QString("https://mapi.alipay.com/gateway.do?_input_charset=UTF-8&currency=USD&notify_url=10.237.221.84:80&out_trade_no=123456789&partner=2088101122136241&sign=760bdzec6y9goq7ctyx96ezkz78287de&subject=Coke&sign_type=MD5&service=create_forex_trade&total_fee=0.01") ) );
QNetworkReply *response = mgr.get(req);
eventLoop.exec(); // blocks stack until "finished()" has been called
if (response->error() == QNetworkReply::NoError) {
//success
qDebug() << "Success\n\n\n\n" << response->readAll();
delete response;
}
else {
//failure
qDebug() << "Failure" <<response->errorString();
delete response;
}
}
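response->readAll() returns a QByteArray, which qDebug() prints in the form shown in your question (bytes that are not printable ASCII are escaped as \xXY). Before you can print such raw data as text, you have to know how it is encoded. Note that the response shown above declares charset=gb2312 in its meta tag; if the bytes really are in that encoding, they need a matching codec (for example through QTextCodec) rather than a UTF-8 conversion. From what I see (in the comments on this answer too), your text is UTF-8 encoded, so a simple conversion to QString should do the job and give you readable HTML: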
QString html(response->readAll());
The docs list other ways to do this, such as QString::fromUtf8.
To get rid of the html tags and get only the text content you have to use QTextDocument.
QTextDocument doc;
doc.setHtml(html);
qDebug() << "Success\n\n" << doc.toPlainText();
From your comment I can see that you don't need this last step.

Email Crawler in c++

I have this assignment that I just can't figure out. I want my function to get a line from an HTML file and extract an email from it, then split the email into email, username, and domain. Then I want a third function to get the next email in the HTML file.
void get_line_emails(ifstream &in_stream, ofstream &out_stream, string email[], string users[], string domain[])
{
int location, end;
string mail;
getline(in_stream, mail);
location = mail.find("mailto:");
end = mail.find(">");
mail = mail.substr(location, (end - 1));
cout << mail << endl;
}
void get_next_email(ifstream &in_stream, string mail)
{
getline(in_stream, mail);
int location = mail.find("mailto:");
int end = mail.find(">");
mail = mail.substr(location, (end - 1));
}
void split_email(string email[], string domain[], string users)
{
int count = 300;
string mail;
for (int i = 1; i < count; ++i) //For loop to input stream.
{
mail = email[i];
int location = mail.find("#");
int end = mail.find(">");
string domain[i] = mail.substr(location, (end - 1));
string users[i] = mail.substr(0, location);
}
}
I also get this error when I run the program:
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr: __pos (which is 4294967295) > this->size() (which is 244)
Abort (core dumped)
If it helps, here's my main function:
int main()
{
string email[1000];
string users[1000];
string domain[1000];
int count = 300;
string filename;
ifstream in_stream;
ofstream out_stream;
cout << "Enter input filename: " << endl;
cin >> filename; //Input of filename.
in_stream.open(filename.c_str()); //Opening the input file for population and other information.
if (in_stream.fail()) //Checking to see if file opens.
{
cout << "Error opening input/output files" << endl; //Telling user file isn't opening.
exit(1); //Exiting program.
}
out_stream.open("Emails.txt");//If it does not exist it will not be created. If it exists it will be overwritten.
out_stream << "Email " << right << setw(20) << "User " << right << setw(20) << "Domain" << endl;
out_stream << "_______________________________________________________________________________" << endl;
get_line_emails(in_stream, out_stream, email, users, domain);
//split_email(email, domain, users);
sort(email, users, domain, count);
in_stream.close(); //Closing the in stream.
out_stream.close(); //Closing the out stream.
cout << "A new file Emails has been created with the emails extracted. Thank you." << endl; //End message.
return 0;
}
Part of the HTML file I am inputting:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <!-- Content Copyright Ohio University Server ID: 2-->
<!-- Page generated 2016-03-22 14:55:21 by CommonSpot Build 9.0.3.119 (2015-08-14 15:00:01) -->
<!-- JavaScript & DHTML Code Copyright © 1998-2015, PaperThin, Inc. All Rights Reserved. --> <head>
<meta name="Description" id="Description" content="Faculty" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="Keywords" id="Keywords" content="engineering" />
<meta name="Generator" id="Generator" content="CommonSpot Content Server Build 9.0.3.119" />
<link rel="stylesheet" href="/style/ouws_0111_allin1_nonav.css" type="text/css" />
<link rel="stylesheet" href="/engineering/upload/engineeringEV.css" type="text/css" />
<link rel="stylesheet" href="/engineering/upload/gridpak.css" type="text/css" />
<style type="text/css">
.mw { color:#000000;font-family:Verdana,Arial,Helvetica;font-weight:bold;font-size:xx-small;text-decoration:none; }
a.mw:link {color:#000000;font-family:Verdana,Arial,Helvetica;font-weight:bold;font-size:xx-small;text-decoration:none;}
a.mw:visited {color:#000000;font-family:Verdana,Arial,Helvetica;font-weight:bold;font-size:xx-small;text-decoration:none;}
a.mw:hover {color:#0000FF;font-family:Verdana,Arial,Helvetica;font-weight:bold;font-size:xx-small;text-decoration:none;}
</style> <script type="text/javascript">
<!--
var gMenuControlID = 0;
var menus_included = 0;
var jsDlgLoader = '/engineering/about/people/loader.cfm';
var jsSiteID = 1;
var jsSubSiteID = 6148;
var js_gvPageID = 2177477;
var jsPageID = 2177477;
var jsPageSetID = 0;
var jsPageType = 0;
var jsControlsWithRenderHandlers = ",1366057,1407941,1408984,1409120,1409220,1463564,1653027,1464282,1484855,1663987,1703445,1714178,1719109,1716274,1719109,1719109,1722161,1748941,1743237,1767756,1771704,1240950,1795856,1799077,1806233,1814378,1814378,1814378,36,1156323,958270,959997,36,1239784,1239535,1240103,1264495,1264559,1240832,1241026,1268776,1269019,1365662,1365798,1367666,1367112,1367146,1403322,1236239,1644435,1707482,36,1707482,1708185,1708185,1707846,1718301,1718356,1722082,1735273,1156092,1736675,1738340,1758445,1487747,1740183,1750814,1755341,36,4,1241075,1320447,1410344,1440455,1462605,1463564,1642797,1644920,1644955,1659254,1656252,1707459,1692320,1290294,1705469,1705596,1707846,1708163,1708367,1719109,1719109,1719109,1728460,1718356,1706218,1725200,1739433,1193755,1782561,1806244,1781609,1783821,1784445,1783821,1788664,1750814,1781533,1781788,1812661,1810778,1822088,1644219,39,36,36,438722,443887,523857,542895,36,867909,671210,733944,1074794,671213,671222,671225,671231,671234,1190981,1190914,1190943,1193755,1236239,1239497,1280404,1284325,860732,860741,1080236,671204,1237273,671216,671219,671228,671237,671207,1190973,1243855,1264544,1264564,1241172,1267910,1240840,1240849,1241220,1264699,1241365,1264571,1289737,8,1290184,1321465,1322500,1363024,1365670,1365954,1365998,1366014,2214456,2068897,1837521,1190931,1190931,2239453,1992371,1967400,1992371,1808005,1792195,1792195,1156323,1716646,1967400,1763595,1080236,1971121,1960374,1290151,2007514,2013290,2012663,2012302,2012026,2012663,2021773,1191128,426028,1808005,2108357,426028,36,36,36,2145522,2145522,2186158,1792195,1827509,1827486,1827486,1840641,1843869,1843869,1843879,1843879,1827509,1827486,635375,1190931,1853586,1854295,1854509,1854614,1855117,1855125,1859942,1232520,996841,999747,1074782,801933,1156092,1231112,1240950,1264518,1264536,1240828,1241280,1241033,1241322,1265043,1268750,1269805,1287352,1290231,1321501,1322534,1368599,1407796,1407917,1408156,1408447,1461409,1463586,1466072,1660460,17
04499,1701618,1704211,1701596,1707383,1706218,1713783,1713443,1715100,1716646,1714352,1723376,1706218,1717134,1717134,1759841,1740127,1740183,1737868,1755222,1763595,1750814,1812661,1784600,860732,1785700,1786558,1786640,1788366,1788803,1787835,1758851,1802116,1802116,1802116,1802116,1810778,1870892,1827509,1854528,1859942,1859942,1870780,1865837,1905202,1905202,1750814,1243855,1763595,1806295,1806280,860741,1893429,1893243,1893429,1898989,1913110,1915322,1921065,1871293,1872541,1900928,1708367,1874008,1827509,1808005,1948002,1708367,1859942,1827509,1243851,1959041,1243851,1746007,1243851,1243851,1967400,1967400,1191128,1780116,1960374,1960374,1780116,1827486,1156092,1153939,36,1827486,1859942,1974908,1156092,1156323,1763595,1080236,1763595,1854295,1854641,1865837,1867230,1867211,1869328,738180,8,1191128,1808005,1967400,1156323,2104541,2058309,2013290,2047047,2068897,2010928,2087246,2010928,2104541,2104541,2104578,2115265,1708185,2120941,426028,2129783,1663761,2166426,2068897,1967400,1967400,1967400,2068897,1808005,1716646,1833649,1827509,2010085,36,2167570,2068897,1706218,1156092,2012337,2186146,1191128,2191212,1190931,1156323,1716646,2012663,2508370,1992371,1080236,2280950,1808005,36,36,1156323,1808005,1819898,1191128,1243855,2281280,2013290,2239453,1837521,1156323,1644219,1849105,1849105,2376567,2381406,1808005,1808005,1156092,2552104,2552104,2281280,1805958,1967400,2068897,2390125,1808005,2444428,2459222,2013290,2568057,2508370,1661786,1763595,2349059,2349059,2438289,1708367,2120941,2508370,2120951,2596819,1156323,1191128,2239453,2367160,2012337,2451225,1808005,2615851,1808005,1849105,55,55,2734901,1191128,55,55,2012663,2734829,1967400,1967400,1996683,1992371,2013290,2018337,2012337,2018364,1156092,1363024,1967400,1888191,1888191,1805958,1967400,2057362,39,1153939,1708185,2010085,2010085,2010085,2079659,2079659,2010928,2010928,2087246,1808005,36,1190931,2369360,2380491,1808005,2120941,1153939,1708367,2511867,2540778,1704499,1787140,1758479,1716646,1827486,223945
3,1808005,1808005,1080236,2451225,2120941,1808005,";
var jsDefaultRenderHandlerProps = ",,";
var jsAuthorizedControls = ",65684,62081,62169,62236,62658,67860,70371,70560,70645,70911,71567,71570,71579,71582,71585,71588,71630,71645,73051,73055,73135,73175,73177,73179,73181,73183,73185,75593,75596,75598,75600,75602,75604,75943,77337,77339,77367,77369,77371,77397,77399,77401,77403,77406,77408,77423,77425,77429,77431,77433,77435,77454,77456,77458,77460,77462,77464,77524,77526,77528,77530,77533,77535,77564,77566,77569,77572,77579,77581,77755,77759,77771,77940,78254,78304,78759,81449,81447,81452,81454,86430,95027,110992,112176,114559,122476,122590,122592,122594,122998,123000,123002,123004,123010,123012,123014,123016,123113,123115,123117,123119,123121,123123,123125,123127,123129,123131,123133,123135,123137,123139,123141,123143,123193,123217,123219,123221,543,1784,1786,1791,1829,1901,1903,3434,3062,10165,17470,19113,17964,17975,20458,18450,19246,20461,20532,20535,20631,22975,22976,29043,29065,29198,29497,29894,32565,37812,42989,50270,50283,51427,51770,51940,51987,52309,52306,52325,52338,52440,52727,52935,53585,53717,54936,55739,56170,57624,70375,57659,58549,60274,60859,65324,65375,65378,65630,341266,341268,341270,343681,344120,344123,344125,344127,344129,344131,344133,1155418,344136,344142,344918,344920,346066,349254,349260,353078,353096,353249,353368,353500,353518,356036,356519,356527,356534,359303,359315,359619,365645,365647,365651,372637,372642,373892,409046,385136,402687,408565,416225,423380,423445,423634,423934,424407,424503,426545,425757,425785,426028,426263,433478,438722,440105,440778,441424,441447,441488,441530,441743,441914,441917,441920,441923,442181,442184,442228,442231,442767,443887,444519,444536,448085,446524,447856,448121,450241,450489,450583,451031,123223,123225,123227,123229,123231,123233,123235,123237,123239,123241,123243,123245,123247,123249,133712,138458,138462,138472,138493,140917,152719,152941,155012,174553,176272,182475,185313,185545,185572,185600,185653,189527,189717,189912,189915,209638,190014,209612,209640,210772,233752,233754,240835,242005,
245048,245061,246392,247905,253143,255217,258368,258370,258448,259352,259507,259535,259540,259557,259597,270079,272462,272484,273374,275946,276171,281359,281731,281886,285356,285362,285364,289279,290246,293573,293580,293990,306206,306372,307096,307117,1409047,1410292,1410344,1440455,1462692,1462605,1463206,1463358,1463363,1463559,1463575,1466067,1466072,1466949,565361,577664,577666,580782,580785,586106,593209,631308,631375,671204,671207,671210,671213,671216,671219,671222,671225,671228,630659,630928,631186,631230,671231,703507,703512,872630,872675,951724,1070639,1070773,1071579,1074782,1074794,1116648,1118602,1153954,1153962,310170,319781,325794,326607,326613,331241,331243,331248,338287,338305,338307,338805,340095,340098,341260,341264,523857,523883,540187,541324,542748,542895,543075,543442,543531,545031,545034,545925,550439,550694,551327,551342,551843,551848,554801,557468,563421,563522,564335,564350,564362,565392,565403,565430,565440,565460,578908,580751,589443,589691,589825,631522,631342,671234,704390,704500,730405,733189,733195,733931,733944,735045,721050,721061,720116,803640,807230,860741,867909,869754,878921,872399,911315,951437,952815,952921,954983,956036,958270,960899,960901,960903,960912,960914,960916,959997,990601,993320,996841,999438,999472,999741,999747,999871,1034551,1034553,1035679,1035681,1070829,1080236,1111202,1112587,1112594,1116088,1117180,566481,567951,635375,671237,705089,708277,738180,738270,738274,756640,808480,993241,993247,993326,998452,999162,1034549,1034793,1034795,1118837,1121340,1150407,1152064,1153928,1153933,1153939,1153948,1154637,1156092,1156320,753746,754822,754960,755002,755412,755426,755453,801854,801933,802037,802071,802077,802080,802083,802087,802091,802417,802525,804060,860732,753752,754885,753748,754422,802568,451785,453349,452911,452935,454345,454916,464533,465324,476013,469286,469308,470126,472222,476011,476015,489860,478066,482338,482852,492048,486517,489015,489681,492017,492050,492052,498151,516411,516413,516415,516417,516419
,516422,1935063,1939712,1992371,1996683,2010928,2012302,2012840,2013290,2021773,2047047,2058309,2079659,2104541,2108357,2115265,2120941,2120951,2135749,2145522,2157693,2157775,1193061
<img border="0" alt="YouTube" title="YouTube" src="/engineering/images/icon_youtube.png" /><span class="imageCaption" style="display:none;"></span>
</div>
<div class="imageImg">
<img border="0" alt="LinkedIn" title="LinkedIn" src="/engineering/images/icon_linkedin.png" /><span class="imageCaption" style="display:none;"></span>
</div>
<div class="imageImg">
<img border="0" alt="Facebook" title="Facebook" src="/engineering/images/icon_fb.png" /><span class="imageCaption" style="display:none;"></span>
</div>
<div class="imageImg">
<img border="0" alt="Twitter" title="Twitter" src="/engineering/images/icon_twitter.png" /><span class="imageCaption" style="display:none;"></span>
</div>
<div class="imageImg">
<img border="0" alt="Instagram" title="Instagram" src="/engineering/images/russ_instagram.png" /><span class="imageCaption" style="display:none;"></span>
</div>
</div></div></div><div id="cs_control_2398199" class="cs_control CS_Element_Custom"></div></div></div><div id="cs_control_2142700" class="contentWrap col row"><div title="" id="CS_Element_2177477_2142700"><div id="cs_control_2142767" class="cs_control col pageTitle">
<!-- Portal Content -->
<div class="content-element">
<h2>Faculty</h2>
<p></p>
<br />
</div>
<!-- Portal Content -->
</div><div id="cs_control_2142762" class="mainContent col"><div title="" id="CS_Element_2177477_2142762"><div id="cs_control_2142772" class="cs_control CS_Element_Custom">
<!-- Portal Content -->
<div class="content-element">
<p>  </p>
</div>
<!-- Portal Content -->
</div><div id="cs_control_2177314" class="cs_control">
<style type="text/css">
/* This fixes some issues with the anchor links from the A-Z bar at the top */
.group a[name]
{
position: absolute;
}
</style>
<div id="staffAlpha">
<ul class="azList">
<li class="children ">A</li>
<li class="children ">B</li>
<li class="children ">C</li>
<li class="children ">D</li>
<li class="children ">E</li>
<li class="children ">F</li>
<li class="children ">G</li>
<li class="children ">H</li>
<li class="children ">I</li>
<li class="children ">J</li>
<li class="children ">K</li>
<li class="children ">L</li>
<li class="children ">M</li>
<li class="children ">N</li>
<li class="children ">O</li>
<li class="children ">P</li>
<li>Q</li>
<li class="children ">R</li>
<li class="children ">S</li>
<li class="children ">T</li>
<li class="children ">U</li>
<li class="children ">V</li>
<li class="children ">W</li>
<li class="children ">X</li>
<li class="children ">Y</li>
<li class="children last">Z</li>
</ul>
<div id="azContent">
<div class="group">
<a id="A" name="A"></a>
<h3 class="letter">A</h3>
Nasseef Abukamail<br />
Electrical Engineering and Computer Science <br />
Associate Lecturer <br />
abukamai#ohio.edu <br />
740.593.1229 
<div><br />
</div>Khairul Alam<br />
Mechanical Engineering, Center for Advanced Materials Processing, ESP Lab <br />
Professor <br />
alam#ohio.edu <br />
740.593.1558 
<div><br />
</div>Muhammad Ali<br />
Biomedical Engineering, Mechanical Engineering, ESP Lab <br />
Associate Professor <br />
alim1#ohio.edu <br />
740.593.1389 
<div><br />
</div>Deak Arch<br />
Aviation <br />
Associate Professor, Assistant Chair <br />
arch#ohio.edu <br />
740.597.2688
Divide the problem up into tasks. You have four tasks and they should be tackled individually. Do not proceed to the next task until you know the current task does exactly what you want. Working on more than one task at a time widens the problem area, and this turns out to be more than a geometric expansion. Bugs tend to interact with other bugs. A bug in task 1 may make a bug in task 2 look different, causing you to debug the wrong symptoms.
Consider giving each task a function or, if the task is complex, its own file. This way each task can easily be tested on its own. Why? What if you change the code from task 1 and want to know if it broke? Sure, you can test the whole program, but what if you broke 2 things? If you want to test the splitter logic with a few hundred addresses to make sure you correctly handle all of the weird edge cases, you can just call the splitter function with those few hundred strings and not have to invent a complicated file.
Task 1: read a file line by line.
This is first because until you can do this, you can't do much else.
std::string line;
while (std::getline(in_stream, line))
{
// output line to compare with source
}
will read the file until it can no longer be read, whether that is because of end of file, corrupt data, some joker pulling out the USB drive while you're reading it, or sundry other problems. How do you test this? An easy way is to read the file in from one stream line by line and print it to the console. This is a pretty big file, and the eye is only so useful for comparing large amounts of text, so write all received lines to an output file and then diff the two files. If they match, you win. Move on to task 2. If they don't, debug.
Task 2: Look for "mailto".
This takes a line from task 1 and looks for "mailto":
size_t loc = line.find("mailto:");
if (loc != std::string::npos)
{
std::cout << "found: " << line << std::endl;
}
This is an easier thing to test, so we can get away with the Mk 1 eyeball, or Notepad and Ctrl+F, to confirm that all mailto lines were printed.
Task 3: Isolate the address.
You've found a line containing "mailto" in task 2. Now you have to isolate the address on that line. You have the starting location from task 2, and you may be able to extract the string between the ':' after "mailto" and the next '\"'. I'm not going to spend much time here because this is the meat and potatoes of this assignment. If I do too much here, I pass the course, not you. Basically, this is a find and a substr, similar to what OP has in their question.
Task 4: Split the Address from task 3
This is more work with find and substr to isolate the parts of the address.
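For what it's worth, here is a minimal sketch of the find-and-substr idea for this last step, assuming the address has the usual user@domain shape. The function name split_address and the choice of std::pair as the return type are assumptions made for illustration, not part of the assignment.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (name is an assumption): splits "user@domain" at
// the first '@'. Returns {user, domain}; if there is no '@', the whole
// input is returned as the user part and the domain is empty.
std::pair<std::string, std::string> split_address(const std::string& address)
{
    const std::string::size_type at = address.find('@');
    if (at == std::string::npos)
        return std::make_pair(address, std::string());
    return std::make_pair(address.substr(0, at), address.substr(at + 1));
}
```

Because the splitter is its own function, it can be called directly with a long list of awkward addresses without having to build an input file for each case.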
You need to make a loop and test every line until you find one with the string "mailto:".
Here is some example code to give you an idea of how you can do that:
std::ifstream ifs("test.txt");
std::string line; // general buffer
// read each line
while(std::getline(ifs, line))
{
// try to find "mailto:"
std::string::size_type pos = line.find("mailto:");
// ignore if not found
if(pos == std::string::npos)
continue;
// we found it! extract address from line here
// remember that pos holds the start of the information
// ...
}
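One possible way to fill in the extraction step is sketched below. It assumes the address sits in an href attribute and is terminated by a double quote. The helper name extract_mailto is invented for this example.

```cpp
#include <string>

// Hypothetical helper (name is an assumption): returns the text between
// "mailto:" and the next double quote, or an empty string if the line
// contains no "mailto:".
std::string extract_mailto(const std::string& line)
{
    const std::string marker = "mailto:";
    std::string::size_type start = line.find(marker);
    if (start == std::string::npos)
        return std::string();

    start += marker.size(); // skip past "mailto:" itself
    const std::string::size_type end = line.find('"', start);
    if (end == std::string::npos)
        return line.substr(start); // no closing quote: take the rest of the line

    return line.substr(start, end - start);
}
```

Inside the loop above, this would be called on each line that passes the npos check. An empty result means there was nothing to extract.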

QT5 C++ QByteArray XML Parser

I get the following xml
<Tra Type="SomeText">
<tr>Abcdefghij qwertzu</tr>
<Rr X="0.0000" Y="0.0000" Z="0.0000" A="0.0000" B="0.0000" C="0.0000" />
<Ar A1="0.0000" A2="0.0000" A3="0.0000" A4="0.0000" A5="0.0000" A6="0.0000" />
<Er E1="0.0000" E2="0.0000" E3="0.0000" E4="0.0000" E5="0.0000" E6="0.0000" />
<Te T21="1.09" T22="2.08" T23="3.07" T24="4.06" T25="5.05" T26="6.04" T27="7.03" T28="8.02" T29="9.01" T210="10.00" />
<D>125</D>
<IP></IP>
</Tra>
through a socket, which stores it in a QByteArray called Data.
I want to extract every value from the XML and save each one to its own variable (some as integers, some as QStrings).
My main problem is that I don't know how to distinguish XML elements like <D>125</D>, which hold the value between the tags, from elements like <Te T210="10.00" T29="9... />, which hold their values in attributes inside the tag itself.
My code looks like this so far:
QByteArray Data = socket->readAll();
QXmlStreamReader xml(Data);
while(!xml.atEnd() && !xml.hasError())
{
.....
}
There's just so many examples already, aren't there? =(
Anyway, like Frank said, if you want to read data (characters) from within tags - use QXmlStreamReader::readElementText.
Alternatively, you can do this:
QXmlStreamReader reader(xml);
while(!reader.atEnd())
{
if(reader.isStartElement())
{
if(reader.name() == "tr")
{
reader.readNext();
if(reader.atEnd())
break;
if(reader.isCharacters())
{
// Here is the text that is contained within <tr>
QString text = reader.text().toString();
}
}
}
reader.readNext();
}
For attributes, you should use QXmlStreamReader::attributes, which returns a QXmlStreamAttributes container holding all attributes of the current start element.
QXmlStreamReader reader(xml);
while(!reader.atEnd())
{
if(reader.isStartElement())
{
if(reader.name() == "Rr")
{
QXmlStreamAttributes attributes = reader.attributes();
// Warning: this doesn't check whether the attribute exists. value() returns
// an empty string for a missing attribute; use hasAttribute() to check first.
QString x = attributes.value("X").toString();
QString y = attributes.value("Y").toString();
QString a = attributes.value("A").toString();
// etc...
}
}
reader.readNext();
}