Reading an XML file in C++ with TinyXML2

I'm pretty new to using XML in C++ and I'm trying to parse a list of files to download.
The XML file I'm using is generated via PHP and looks like this:
<?xml version="1.0"?>
<FileList>
<File Name="xxx" Path="xxx" MD5="xxx" SHA1="xxx"/>
</FileList>
The code I'm using in C++ is the following, which I came up with using some online tutorials (it's included in some global function):
tinyxml2::XMLDocument doc;
doc.LoadFile("file_listing.xml");
tinyxml2::XMLNode* pRoot = doc.FirstChild();
tinyxml2::XMLElement* pElement = pRoot->FirstChildElement("FileList");
if (pRoot == nullptr)
{
QString text = QString::fromLocal8Bit("Error text in french");
//other stuff
}
else
{
tinyxml2::XMLElement* pListElement = pElement->FirstChildElement("File");
while (pListElement != nullptr)
{
QString pathAttr = QString::fromStdString(pListElement->Attribute("Path"));
QString md5Attr = QString:: fromStdString(pListElement->Attribute("MD5"));
QString sha1Attr = QString::fromStdString(pListElement->Attribute("SHA1"));
QString currentPath = pathAttr.remove("path");
QString currentMd5 = this->fileChecksum(currentPath, QCryptographicHash::Md5);
QString currentSha1 = this->fileChecksum(currentPath, QCryptographicHash::Sha1);
QFile currentFile(currentPath);
if (md5Attr != currentMd5 || sha1Attr != currentSha1 || !currentFile.exists())
{
QString url = "url" + currentPath;
this->downloadFile(url);
}
pListElement = pListElement->NextSiblingElement("File");
}
Problem is, I get an error like "Access violation, this was nullptr" on the following line :
tinyxml2::XMLElement* pListElement = pElement->FirstChildElement("File");
Since I'm far from a pro when it comes to coding and I already searched the internet up and down, I hope that someone here can provide me some pointers.
Have a good day, folks.

I don't know if you have C++17 available, but you can remove a lot of noise by using auto* and if-init-expressions (or rely on the fact that pointers can be implicitly converted to boolean values.)
The main issue with your code is that you were working with an XMLNode instead of an XMLElement*. The function tinyxml2::XMLDocument::RootElement() automatically gets the top-most element for you.
Because you have an XML declaration at the top, FirstChild returns that, and it doesn't have any children, so the rest of the code fails.
By using RootElement, TinyXML2 knows to skip any leading non-element nodes (comments, doctypes, etc.) and give you <FileList> instead.
tinyxml2::XMLDocument doc;
auto err = doc.LoadFile("file_listing.xml");
if(err != tinyxml2::XML_SUCCESS) {
//Could not load file. Handle appropriately.
} else {
if(auto* pRoot = doc.RootElement(); pRoot == nullptr) {
QString text = QString::fromLocal8Bit("Error text in french");
//other stuff
} else {
for(auto* pListElement = pRoot->FirstChildElement("File");
pListElement != nullptr;
pListElement = pListElement->NextSiblingElement("File"))
{
QString pathAttr = QString::fromStdString(pListElement->Attribute("Path"));
QString md5Attr = QString:: fromStdString(pListElement->Attribute("MD5"));
QString sha1Attr = QString::fromStdString(pListElement->Attribute("SHA1"));
QString currentPath = pathAttr.remove("path");
QString currentMd5 = this->fileChecksum(currentPath, QCryptographicHash::Md5);
QString currentSha1 = this->fileChecksum(currentPath, QCryptographicHash::Sha1);
QFile currentFile(currentPath);
if(md5Attr != currentMd5 || sha1Attr != currentSha1 || !currentFile.exists()) {
QString url = "url" + currentPath;
this->downloadFile(url);
}
}
}
}
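One caveat about the loop above: tinyxml2's Attribute() returns a null pointer when the named attribute is missing, and building a std::string from a null pointer is undefined behavior. A minimal sketch of a guard, assuming a hypothetical helper named attrOr (not part of TinyXML2 or Qt) that substitutes a fallback:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of TinyXML2): convert a possibly-null C string,
// such as the result of XMLElement::Attribute(), into a std::string.
std::string attrOr(const char* value, const char* fallback)
{
    assert(fallback != nullptr);  // the fallback itself must be a valid string
    return std::string(value != nullptr ? value : fallback);
}
```

With this in place, the conversion could read QString::fromStdString(attrOr(pListElement->Attribute("Path"), "")), and an empty result can be treated as a malformed entry.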

According to the reference for tinyxml2::XMLNode::FirstChild():
Get the first child node, or null if none exists.
This line will therefore get the first node in the document (here, the XML declaration):
tinyxml2::XMLNode* pRoot = doc.FirstChild();
Meaning that when you attempt to find a FileList element within that node, it returns null.
To avoid the access violation, check that your pointers are valid before using them. There is an if check for pRoot, but the line immediately before it already calls a function on pRoot. There is no if check for pElement at all, which is why you get an access violation. As well as checking that pointers are valid, consider adding else blocks with logging to say what went wrong (e.g. "could not find element X"). This will help you in the long run: XML parsing is a pain even with a library like TinyXML2, and there are always teething problems like this, so getting into the habit of checking pointers and logging helpful messages will definitely pay off.

Related

libxml2 - failure to parse valid xml

I have a small C program using libxml2 for parsing XML files. Basically, my code is like
xmlDocPtr doc = xmlParseFile("test.xml");
if (doc == nullptr) {
return;
}
xmlNodePtr node = xmlDocGetRootElement(doc);
if (node == nullptr) {
return;
}
...
I'm getting an error situation, where doc != null and node == null. Under which conditions could that happen? I've tested with completely valid, invalid, and empty files, it happens in every case. If file does not exist, doc == null (as it should). I suspect that the program is not able to open the file for some reason, but I've checked rights, and no other program uses that file. Also, this only happens in an environment, where I cannot use a debugger.

RapidXML - how can I handle missing nodes/values

I'd like to read from XML to C++ using RapidXML. However, if a node doesn't exist or a value is missing, the program crashes.
for (rapidxml::xml_node<> * xmlasset_node = root_node->first_node("Asset"); xmlasset_node; xmlasset_node = xmlasset_node->next_sibling())
{ mystring += xmlasset_node->first_attribute("name")->value(); }
However, this "name" attribute doesn't exist in all nodes and is to be filled with a default value if it's not in the XML. Similarly, I've got some sub-nodes that are not in all nodes. The reason is just to keep the XML as small and clear as possible for manual adjustments.
How can a check/test be implemented (C++), to prevent the program from crashing and just taking default values if a value/node doesn't exist?
Kind regards,
- Corak

Here is what I do: you can check whether the name of the node and its attribute match your criteria, and accept it only if they do:
// basically I am looking for "settings" node then "network" subnode, then "port" attribute
if( boost::iequals(doc.first_node()->next_sibling()->name(), "settings"))
{
for (xml_node<> *node = doc.first_node()->next_sibling()->first_node(); node; node = node->next_sibling())
{
// find network tag
if (boost::iequals(node->name(),"network"))
{
for (xml_attribute<> *attr = node->first_attribute(); attr; attr = attr->next_attribute())
{
if ( boost::iequals(attr->name(), "port"))
{
strcpy(attr->value(), portname);
}
}
}
}
}
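For the default-value case in the question, a small wrapper can be enough, since first_attribute() returns a null pointer when the attribute is absent. The sketch below assumes a hypothetical helper named valueOr; StubAttribute is only a stand-in so the example compiles without RapidXML, and any type with a value() member, such as rapidxml::xml_attribute<>, should work the same way:

```cpp
#include <string>

// Stand-in for an attribute type; hypothetical, used only to keep the sketch self-contained.
struct StubAttribute
{
    const char* text;
    const char* value() const { return text; }
};

// Hypothetical helper: return the attribute's value, or the fallback when the
// pointer is null (which is how a missing attribute is reported).
template <typename Attr>
std::string valueOr(const Attr* attr, const std::string& fallback)
{
    return attr != nullptr ? std::string(attr->value()) : fallback;
}
```

In the loop from the question this would become mystring += valueOr(xmlasset_node->first_attribute("name"), "default");. The same kind of null check applies to first_node() for optional sub-nodes.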

C++/CX WinRT File Copy

I'm really suffering through the WinRT Windows::Storage namespace with all its asynchronousness.
I have the following private members in my header file:
//Members for copying the SQLite db file
Platform::String^ m_dbName;
Windows::Storage::StorageFolder^ m_localFolder;
Windows::Storage::StorageFolder^ m_installFolder;
Windows::Storage::StorageFile^ m_dbFile;
And I have the following code block in my implementation file:
//Make sure the SQLite Database is in ms-appdata:///local/
m_dbName = L"DynamicSimulations.db";
m_localFolder = ApplicationData::Current->LocalFolder;
m_installFolder = Windows::ApplicationModel::Package::Current->InstalledLocation;
auto getLocalFileOp = m_localFolder->GetFileAsync(m_dbName);
getLocalFileOp->Completed = ref new AsyncOperationCompletedHandler<StorageFile^>([this](IAsyncOperation<StorageFile^>^ operation, AsyncStatus status)
{
m_dbFile = operation->GetResults();
if(m_dbFile == nullptr)
{
auto getInstalledFileOp = m_installFolder->GetFileAsync(m_dbName);
getInstalledFileOp->Completed = ref new AsyncOperationCompletedHandler<StorageFile^>([this](IAsyncOperation<StorageFile^>^ operation, AsyncStatus status)
{
m_dbFile = operation->GetResults();
m_dbFile->CopyAsync(m_localFolder, m_dbName);
});
}
});
I get a memory access violation when it gets to m_dbFile = operation->GetResults();
What am I missing here? I come from a C# background, in which this is really easy stuff to do :/
I've tried using '.then' instead of registering the event, but I couldn't even get that to compile.
Thank you for your help!

If you are interested in the WinRT solution, here it is:
It seems all you want to do is to copy the DB file from the installed location into the local folder. For that the following code should suffice:
//Make sure the SQLite Database is in ms-appdata:///local/
m_dbName = L"DynamicSimulations.db";
m_localFolder = ApplicationData::Current->LocalFolder;
m_installFolder = Windows::ApplicationModel::Package::Current->InstalledLocation;
create_task(m_installFolder->GetFileAsync(m_dbName)).then([this](StorageFile^ file)
{
create_task(file->CopyAsync(m_localFolder, m_dbName)).then([this](StorageFile^ copiedFile)
{
// do something with copiedFile
});
});

I've tried this thing before. Don't do this:
if(m_dbFile == nullptr)
Instead verify the value of "status".
if(status == AsyncStatus::Error)

Changing cell content on google spreadsheets via api 3.0

I need to change the contents of a cell on google spreadsheets.
I have successfully gotten the data via google docs api (all required authorization and options tag are set).
But I can't change the cell content. I have generated the following url and data:
req url: https://spreadsheets.google.com/feeds/cells/0AnT0uFQJWw_edENkYndfQWxCWlVmeG9oNW5kWjhYVUE/tCdbw_AlBZUfxoh5ndZ8XUA/private/full/R2C1
req data: <?xml version='1.0' encoding='UTF-8'?><entry xmlns='http://www.w3.org/2005/Atom' xmlns:gs='http://schemas.google.com/spreadsheets/2006'><id>https://spreadsheets.google.com/feeds/cells/0AnT0uFQJWw_edENkYndfQWxCWlVmeG9oNW5kWjhYVUE/tCdbw_AlBZUfxoh5ndZ8XUA/private/full/R2C1</id><link rel='edit' type='application/atom+xml' href='https://spreadsheets.google.com/feeds/cells/0AnT0uFQJWw_edENkYndfQWxCWlVmeG9oNW5kWjhYVUE/tCdbw_AlBZUfxoh5ndZ8XUA/private/full/R2C1'/><gs:cell row='2' col='1' inputValue='*match found*
имя: вася
фамилия: тра та та
номер телефона дом: +7123456789
номер телефона моб: +7098765432
город: москва'/></entry>
repl data: Response contains no content type
And sometimes I receive "bad request" in reply.
I followed this document when writing the code, and came up with this:
1. I get the cellsfeed URL
NETLIBHTTPREQUEST nlhr = {0};
nlhr.cbSize = sizeof(NETLIBHTTPREQUEST);
nlhr.headersCount = 2;
nlhr.headers = (NETLIBHTTPHEADER*)malloc(sizeof(NETLIBHTTPHEADER) * (nlhr.headersCount));
nlhr.headers[0].szName = "Authorization";
if(AuthTag.empty())
{
string str;
str += "GoogleLogin auth=";
str += Auth;
AuthTag = str;
nlhr.headers[0].szValue = _strdup(str.c_str());
}
else
nlhr.headers[0].szValue = _strdup(AuthTag.c_str());
nlhr.headers[1].szName = "GData-Version";
nlhr.headers[1].szValue = "3.0";
nlhr.cbSize = sizeof(NETLIBHTTPREQUEST);
nlhr.flags = NLHRF_SSL;
{
string str = "https://spreadsheets.google.com/feeds/worksheets/";
str += toUTF8(Params.vtszDocuments[0]);
str += "/private/full";
nlhr.szUrl = _strdup(str.c_str());
}
nlhr.requestType = REQUEST_GET;
nlhr2 = (NETLIBHTTPREQUEST*)CallService(MS_NETLIB_HTTPTRANSACTION, (WPARAM)hNetlibUser, (LPARAM)&nlhr);
if(!nlhr2)
{
boost::this_thread::sleep(boost::posix_time::minutes(Params.Interval));
continue;
}
using namespace rapidxml;
xml_document<> xml;
xml.parse<0>(nlhr2->pData);
Netlib_CloseHandle(nlhr2);
for(xml_node<> *node = xml.first_node()->first_node("entry"); node; node = node->next_sibling("entry"))
{
if(strcmp(node->first_node("title")->value(), toUTF8(Params.tszListName).c_str()))
continue;
bool found = false;
xml_node<> *id = node->first_node("id");
string spreadsheet_id = id->value();
if(spreadsheet_id.find(toUTF8(Params.vtszDocuments[0])) == string::npos)
continue;
for(xml_node<> *link = node->first_node("link"); link; link = link->next_sibling("link"))
{
if(strcmp(link->first_attribute("rel")->value(), "http://schemas.google.com/spreadsheets/2006#cellsfeed"))
continue;
cellsfeed = link->first_attribute("href")->value();
found = true;
if(found)
break;
}
if(found)
break;
}
2. I fetch the cells feed into a buffer, to parse when needed
base_document_xml.clear();
if(base_document_xml_buffer)
free(base_document_xml_buffer);
NETLIBHTTPREQUEST nlhr = {0};
nlhr.cbSize = sizeof(NETLIBHTTPREQUEST);
nlhr.headersCount = 2;
nlhr.headers = (NETLIBHTTPHEADER*)malloc(sizeof(NETLIBHTTPHEADER) * (nlhr.headersCount));
nlhr.headers[0].szName = "Authorization";
if(AuthTag.empty())
{
string str;
str += "GoogleLogin auth=";
str += Auth;
AuthTag = str;
nlhr.headers[0].szValue = _strdup(str.c_str());
}
else
nlhr.headers[0].szValue = _strdup(AuthTag.c_str());
nlhr.headers[1].szName = "GData-Version";
nlhr.headers[1].szValue = "3.0";
nlhr.cbSize = sizeof(NETLIBHTTPREQUEST);
nlhr.flags = NLHRF_SSL;
nlhr.szUrl = _strdup(cellsfeed.c_str());
nlhr.requestType = REQUEST_GET;
nlhr2 = (NETLIBHTTPREQUEST*)CallService(MS_NETLIB_HTTPTRANSACTION, (WPARAM)hNetlibUser, (LPARAM)&nlhr);
if(!nlhr2)
{
boost::this_thread::sleep(boost::posix_time::minutes(Params.Interval));
continue;
}
using namespace rapidxml;
base_document_xml_buffer = _strdup(nlhr2->pData);
base_document_xml.parse<0>(base_document_xml_buffer); //memory leak ?
Netlib_CloseHandle(nlhr2);
3. I get the etag and edit URL for the needed cell
string edit_link, etag;
using namespace rapidxml;
for(xml_node<> *node = base_document_xml.first_node()->first_node("entry"); node; node = node->next_sibling("entry"))
{
xml_node<> *cell_id = node->first_node("gs:cell");
char buf[12]; // large enough for any 32-bit int plus the terminating null
_itoa(i->row +1 ,buf, 10);
if(strcmp(cell_id->first_attribute("row")->value(), buf))
continue;
_itoa(i->column +1 ,buf, 10);
if(strcmp(cell_id->first_attribute("col")->value(), buf))
continue;
for(xml_node<> *link = node->first_node("link"); link; link = link->next_sibling("link"))
{
if(strcmp(link->first_attribute("rel")->value() , "edit"))
continue;
edit_link = link->first_attribute("href")->value();
etag = node->first_attribute("gd:etag")->value();
}
}
I am using the Miranda IM core network library in this code, and I think the network part is all right; something is wrong with the request URL or the data content in the request.
UPD:
I had missed the content type header in the first code. I have fixed this now, but have another problem: Google returns "premature end of file"... Code updated.
UPD2:
I have solved this problem; it was caused by wrong parameters passed by the network library. Now I get the following: "Invalid query parameter value for grid-id.", and I do not understand what it means...
UPD3:
It looks like I have misunderstood the API. I need to rewrite some code; I will post the result here...
UPD4:
I have tried to obtain the edit URL via a different API function, but got the same result...
UPD5:
I have solved this problem. It is not optimal and I think it is slow, but at least it works: I implemented a few more API calls and additional XML parsing steps to get the correct edit link for each cell. Code updated in case someone needs this; the RapidXML parsing library and the Miranda IM core net library are used here.

Parsing XML file using pugixml

Hi
I want to use an XML file as a config file, from which I will read parameters for my application. I came across the PugiXML library; however, I have a problem with getting the values of attributes.
My XML file looks like this
<?xml version="1.0"?>
<settings>
<deltaDistance> </deltaDistance>
<deltaConvergence>0.25 </deltaConvergence>
<deltaMerging>1.0 </deltaMerging>
<m> 2</m>
<multiplicativeFactor>0.7 </multiplicativeFactor>
<rhoGood> 0.7 </rhoGood>
<rhoMin>0.3 </rhoMin>
<rhoSelect>0.6 </rhoSelect>
<stuckProbability>0.2 </stuckProbability>
<zoneOfInfluenceMin>2.25 </zoneOfInfluenceMin>
</settings>
To parse the XML file I use this code
void ReadConfig(char* file)
{
pugi::xml_document doc;
if (!doc.load_file(file)) return; // the function is void, so there is no value to return
pugi::xml_node tools = doc.child("settings");
//[code_traverse_iter
for (pugi::xml_node_iterator it = tools.begin(); it != tools.end(); ++it)
{
cout<<it->name() << " " << it->attribute(it->name()).as_double();
}
}
and I also tried using this
void ReadConfig(char* file)
{
pugi::xml_document doc;
if (!doc.load_file(file)) return; // the function is void, so there is no value to return
pugi::xml_node tools = doc.child("settings");
//[code_traverse_iter
for (pugi::xml_node_iterator it = tools.begin(); it != tools.end(); ++it)
{
cout<<it->name() << " " << it->value();
}
}
Attributes are loaded correctly; however, all values are equal to 0. Could somebody tell me what I am doing wrong?

I think your problem is that you're expecting the value to be stored in the node itself, but it's really in a CHILD text node. A quick scan of the documentation showed that you might need
it->child_value()
instead of
it->value()
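To turn that text into a number, something along these lines might work. It is a sketch using only the standard library; textToDouble is a hypothetical helper name, and the fallback value is an assumption about how an empty element such as <deltaDistance> should be treated:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: parse element text such as "0.25 " or " 2".
// Returns the fallback when the text is null or contains no number.
double textToDouble(const char* text, double fallback)
{
    if (text == nullptr)
        return fallback;
    char* end = nullptr;
    const double parsed = std::strtod(text, &end);
    // strtod leaves end == text when no conversion could be performed
    return end == text ? fallback : parsed;
}
```

Inside the loop this could be called as textToDouble(it->child_value(), 0.0). std::strtod skips leading whitespace and stops at the trailing space, so values like "0.25 " and " 2" from the file should parse as expected.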

Are you trying to get all the attributes for a given node or do you want to get the attributes by name?
For the first case, you should be able to use this code:
unsigned int numAttributes = node.attributes();
for (unsigned int nAttribute = 0; nAttribute < numAttributes; ++nAttribute)
{
pug::xml_attribute attrib = node.attribute(nAttribute);
if (!attrib.empty())
{
// process here
}
}
For the second case:
LPCTSTR GetAttribute(pug::xml_node & node, LPCTSTR szAttribName)
{
if (szAttribName == NULL)
return NULL;
pug::xml_attribute attrib = node.attribute(szAttribName);
if (attrib.empty())
return NULL; // or empty string
return attrib.value();
}

If you want to store plain text data in the nodes, like
<name> My Name</name>
You need to do it like
rootNode.append_child("name").append_child(node_pcdata).set_value("My name");
If you want to store datatypes, you need to set an attribute. I think what you want is to be able to read the value directly right?
When you are writing the node,
rootNode.append_child("version").append_attribute("value").set_value(0.11);
When you want to read it,
rootNode.child("version").attribute("value").as_double()
At least that's my way of doing it!
