How to get string from xml in COM - c++

I have one array like this:
static WCHAR FilesToShow[][100] = { { L"start.cmd" },{ L"image.xml" }, { L"xyz" }};
As you can see, the last entry is "xyz", which I have to replace with a unique name. To get that name I have to read the image.xml file.
Can you tell me how I can do this?
I wrote a method like this:
PRIVATE WCHAR GetSystemName(WCHAR *pName)
{
WCHAR line;
wfstream in("image.xml");
WCHAR tmp;
bool begin_tag = false;
while (getline(in,line))
{
// strip whitespaces from the beginning
for (int i = 0; i < line.length(); i++)
{
if (line[i] == ' ' && tmp.size() == 0)
{
}
else
{
tmp += line[i];
}
}
if (wcswcs(tmp,"<SystemPath>") != NULL)
{
???????? how to get "vikash" from here <SystemPath>C:\Users\rs_user\Documents\RobotStudio\Systems\vikash</SystemPath>
}
else
{
continue;
}
}
return tmp;
}
I'm getting errors for wfstream, getline and the line.length() method.
I have included the fstream.h header file, but I think it's not supported in COM.
Please help me solve this issue without parsing the XML file.

If your XML file is simple enough that there is only a single tag with the given name, you could do it like this:
#include <string>
#include <sstream>
#include <iostream>
#include <iterator> // std::istreambuf_iterator
std::wstring get_value(std::wistream & in, std::wstring const & tagname)
{
std::wstring text = std::wstring(std::istreambuf_iterator<std::wstring::value_type>(in),
std::istreambuf_iterator<std::wstring::value_type>());
std::wstring start_tag = L"<" + tagname + L">";
std::wstring end_tag = L"</" + tagname + L">";
std::wstring::size_type start = text.find(start_tag);
if (start == std::wstring::npos)
{
throw 123;
}
start += start_tag.length();
std::wstring::size_type end = text.find(end_tag);
if (end == std::wstring::npos)
{
throw 123;
}
return text.substr(start, end - start);
}
std::wstring get_substr_after(std::wstring const & str, wchar_t delim)
{
std::wstring::size_type pos = str.rfind(delim);
if (pos == std::wstring::npos)
{
throw 123;
}
return str.substr(pos + 1);
}
void stackoverflow()
{
std::wstring text(L"<foo>\n<bar>abc/def/ghi</bar>\n<baz>123/456/789</baz>\n</foo>\n");
std::wistringstream wiss(text);
std::wcout << text << std::endl;
std::wcout << get_substr_after(get_value(wiss, std::wstring(L"bar")), L'/') << std::endl;
}
The output of this program is:
<foo>
<bar>abc/def/ghi</bar>
<baz>123/456/789</baz>
</foo>
ghi
I hope that answered your question.

You have several issues here:
What you are getting are compiler errors, not exceptions.
The header file to include is 'fstream', not 'fstream.h'.
Make sure you have a line saying using namespace std;
You are declaring line as a variable of type WCHAR, so it is a single wide character, not a wstring object. Therefore line.length() is incorrect.
Why are you mixing C (wcswcs()) and C++ (STL)? Maybe you should redesign your function signature.
However, try the function below. I have modified the signature to return a pointer to WCHAR and to place the requested string in the buffer provided by pName. I added a check to verify that the buffer is large enough to hold the name and the terminating null character.
WCHAR* GetSystemName(WCHAR *pName, size_t buflen)
{
wstring line;
wifstream in("image.xml");
WCHAR* tmp = NULL;
while (getline(in,line))
{
// strip whitespaces from the beginning
size_t beg_non_whitespace = line.find_first_not_of(L" \t");
if (beg_non_whitespace != wstring::npos)
{
line = line.substr( beg_non_whitespace );
}
size_t beg_system_path = line.find( L"<SystemPath>" );
if ( beg_system_path != wstring::npos )
{
// strip the tags (assuming closing tag is present)
size_t beg_data = beg_system_path + wstring( L"<SystemPath>" ).length();
size_t range = line.find( L"</SystemPath>" ) - beg_data;
line = line.substr( beg_data, range );
// get file name
size_t pos_last_backslash = line.find_last_of( L'\\' );
if ( pos_last_backslash != wstring::npos )
{
line = line.substr( pos_last_backslash + 1 );
if ( buflen <= line.length() )
{
// ERROR: pName buffer is not large enough to fit the string + terminating NULL character.
return NULL;
}
wcscpy( pName, line.c_str() );
tmp = pName;
break;
}
}
}
return tmp;
}
EDIT: Moreover, if you are using or parsing XML in other areas of your program, I strongly suggest using an XML parsing library such as Xerces-C or libxml2.

Thank you all for your answers. Here is the solution I ended up with:
PRIVATE WCHAR* GetNewSystemName()
{
WCHAR line[756];
WCHAR tempBuffer[100];
CComBSTR path = CurrentFolder.Path();
CComBSTR imagePath1 = L"rimageinfo.xml";
path.AppendBSTR(imagePath1);
std::wfstream in(path);
WCHAR tmp[756];
in.getline(line, 756);
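// Locate the tags; return early if either is missing so NULL is never dereferenced.
WCHAR* buffer = wcsstr(line, L"<SystemPath>");
WCHAR* dest = (buffer != NULL) ? wcsstr(buffer, L"</SystemPath>") : NULL;
if (buffer == NULL || dest == NULL) { return NULL; }
int pos = (int)(dest - buffer);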
unsigned int i = 0;
if (wcswcs(buffer,L"<SystemPath>") != NULL && wcswcs(buffer,L"</SystemPath>") != NULL)
{
for (; i < pos; i++)
{
if (buffer[i] == ' ' && sizeof(tmp) == 0)
{
}
else
{
tmp[i] = buffer[i];
}
}
tmp[i] = NULL;
//break;
}
int j = i;
for (; j > 0; j--)
{
if (tmp[j] == '\\')
{
break;
}
}
j++;
int k = 0;
for (; j < i ; j++)
{
System_Name[k] = tmp[j];
k++;
}
System_Name[k] = NULL;
return System_Name;
}

Related

How can I speed up parsing of large strings?

So I've made a program that reads in various config files. Some of these config files can be small, some can be semi-large (largest one is 3,844 KB).
The read in file is stored in a string (in the program below it's called sample).
I then have the program extract information from the string based on various formatting rules. This works well, the only issue is that when reading larger files it is very slow....
I was wondering if there is anything I could do to speed up the parsing, or if there is an existing library that does what I need (extract a string up to a delimiter, and extract a string between two delimiters on the same level). Any assistance would be great.
Here's my code & a sample of how it should work...
#include "stdafx.h"
#include <string>
#include <vector>
std::string ExtractStringUntilDelimiter(
std::string& original_string,
const std::string& delimiter,
const int delimiters_to_skip = 1)
{
std::string needle = "";
if (original_string.find(delimiter) != std::string::npos)
{
int total_found = 0;
auto occurance_index = static_cast<size_t>(-1);
while (total_found != delimiters_to_skip)
{
occurance_index = original_string.find(delimiter);
if (occurance_index != std::string::npos)
{
needle = original_string.substr(0, occurance_index);
total_found++;
}
else
{
break;
}
}
// Remove the found string from the original string...
original_string.erase(0, occurance_index + 1);
}
else
{
needle = original_string;
original_string.clear();
}
if (!needle.empty() && needle[0] == '\"')
{
needle = needle.substr(1);
}
if (!needle.empty() && needle[needle.length() - 1] == '\"')
{
needle.pop_back();
}
return needle;
}
void ExtractInitialDelimiter(
std::string& original_string,
const char delimiter)
{
// Remove extra new line characters
while (!original_string.empty() && original_string[0] == delimiter)
{
original_string.erase(0, 1);
}
}
void ExtractInitialAndFinalDelimiters(
std::string& original_string,
const char delimiter)
{
ExtractInitialDelimiter(original_string, delimiter);
while (!original_string.empty() && original_string[original_string.size() - 1] == delimiter)
{
original_string.erase(original_string.size() - 1, 1);
}
}
std::string ExtractStringBetweenDelimiters(
std::string& original_string,
const std::string& opening_delimiter,
const std::string& closing_delimiter)
{
const size_t first_delimiter = original_string.find(opening_delimiter);
if (first_delimiter != std::string::npos)
{
int total_open = 1;
const size_t opening_index = first_delimiter + opening_delimiter.size();
for (size_t i = opening_index; i < original_string.size(); i++)
{
// Check if we have room for opening_delimiter...
if (i + opening_delimiter.size() <= original_string.size())
{
for (size_t j = 0; j < opening_delimiter.size(); j++)
{
if (original_string[i + j] != opening_delimiter[j])
{
break;
}
else if (j == opening_delimiter.size() - 1)
{
total_open++;
}
}
}
// Check if we have room for closing_delimiter...
if (i + closing_delimiter.size() <= original_string.size())
{
for (size_t j = 0; j < closing_delimiter.size(); j++)
{
if (original_string[i + j] != closing_delimiter[j])
{
break;
}
else if (j == closing_delimiter.size() - 1)
{
total_open--;
}
}
}
if (total_open == 0)
{
// Extract result, and return it...
std::string needle = original_string.substr(opening_index, i - opening_index);
original_string.erase(first_delimiter, i + closing_delimiter.size());
// Remove new line symbols
ExtractInitialAndFinalDelimiters(needle, '\n');
ExtractInitialAndFinalDelimiters(original_string, '\n');
return needle;
}
}
}
return "";
}
int main()
{
std::string sample = "{\n"
"Line1\n"
"Line2\n"
"{\n"
"SubLine1\n"
"SubLine2\n"
"}\n"
"}";
std::string result = ExtractStringBetweenDelimiters(sample, "{", "}");
std::string LineOne = ExtractStringUntilDelimiter(result, "\n");
std::string LineTwo = ExtractStringUntilDelimiter(result, "\n");
std::string SerializedVector = ExtractStringBetweenDelimiters(result, "{", "}");
std::string SubLineOne = ExtractStringUntilDelimiter(SerializedVector, "\n");
std::string SubLineTwo = ExtractStringUntilDelimiter(SerializedVector, "\n");
// Just for testing...
printf("LineOne: %s\n", LineOne.c_str());
printf("LineTwo: %s\n", LineTwo.c_str());
printf("\tSubLineOne: %s\n", SubLineOne.c_str());
printf("\tSubLineTwo: %s\n", SubLineTwo.c_str());
system("pause");
}
Use string_view or a hand-rolled equivalent.
Don't modify the loaded string.
original_string.erase(0, occurance_index + 1);
is a code smell and is going to be expensive with a large original string.
If you are going to modify something, do it in one pass. Don't repeatedly delete from the front of it; that is O(n^2). Instead, proceed along it and shove "finished" stuff into an output accumulator.
This will involve changing how your code works.
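As a rough illustration of the "don't erase, just move a view" idea, here is a minimal sketch assuming C++17 and std::string_view. NextToken is a hypothetical name, not something from the question; it plays the role of ExtractStringUntilDelimiter but only shrinks a view over the buffer instead of erasing from it.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Returns the text up to the next delimiter and advances `rest` past it.
// Only the view moves; the underlying buffer is never modified.
std::string_view NextToken(std::string_view& rest, char delimiter)
{
    const std::size_t pos = rest.find(delimiter);
    if (pos == std::string_view::npos)
    {
        // No delimiter left: the remainder is the last token.
        const std::string_view token = rest;
        rest = std::string_view();
        return token;
    }
    const std::string_view token = rest.substr(0, pos);
    rest.remove_prefix(pos + 1); // O(1): adjusts a pointer and a length
    return token;
}
```

Each call costs time proportional to the token it returns rather than to the whole remaining file. The returned views are only valid while the original std::string is alive and unmodified.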
You're reading your data into a string. "Length of string" should not be a problem. So far, so good...
You're using "string.find().". That's not necessarily a bad choice.
You're using "string.erase()". That's probably the main source of your problem.
SUGGESTIONS:
Treat the original string as "read-only". Don't call erase(), don't modify it.
Personally, I'd consider reading your text into a C string (a text buffer), then parsing the text buffer, using strstr().
Here is a more efficient version of ExtractStringBetweenDelimiters. Note that this version does not mutate the original buffer. You would perform subsequent queries on the returned string.
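#include <algorithm> // std::find, std::find_if
#include <iterator> // std::make_reverse_iterator (C++14)
#include <string>
std::string trim(std::string buffer, char what)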
{
auto not_what = [&what](char ch)
{
return ch != what;
};
auto first = std::find_if(buffer.begin(), buffer.end(), not_what);
auto last = std::find_if(buffer.rbegin(), std::make_reverse_iterator(first), not_what).base();
return std::string(first, last);
}
std::string ExtractStringBetweenDelimiters(
std::string const& buffer,
const char opening_delimiter,
const char closing_delimiter)
{
std::string result;
auto first = std::find(buffer.begin(), buffer.end(), opening_delimiter);
if (first != buffer.end())
{
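auto rlast = std::find(buffer.rbegin(), std::make_reverse_iterator(first),
closing_delimiter);
// base() points one past the matched delimiter; it equals first if nothing was found.
if (rlast.base() != first && rlast.base() - 1 > first)
{
// Copy only the text strictly between the two delimiters.
result.assign(first + 1, rlast.base() - 1);
result = trim(std::move(result), '\n');
}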
}
return result;
}
If you have access to string_view (std::string_view in C++17, or boost::string_view), you could return one from both functions for extra efficiency.
It's worth mentioning that this method of parsing a structured file is going to cause you problems down the line if any of the serialised strings contains a delimiter, such as a '{'.
In the end you'll want to write or use someone else's parser.
The boost::spirit library is a little complicated to learn, but creates very efficient parsers for this kind of thing.

Using C++ regex for multi match

I want to parse a relatively simple registry file format; let's assume it's plain ASCII, saved in the old REGEDIT4 format. I want to parse it using the standard C++ regex classes or functions (preferably no Boost). As input it could take, for example, a sample file like this:
REGEDIT4
[HKEY_LOCAL_MACHINE\SOFTWARE\MyCompany\ConfigurationData\v1.0]
[HKEY_LOCAL_MACHINE\SOFTWARE\MyCompany\ConfigurationData\v1.0\General]
"SettingDword"=dword:00000009
"Setting1"="Some string 1"
"SettingString2"="my String"
[HKEY_LOCAL_MACHINE\SOFTWARE\MyCompany\ConfigurationData\v1.0\Networking]
"SettingDword2"=dword:00000002
"Setting2"="Some string 2"
"SettingString3"="my String2"
From a brief analysis, scanning multiple [] sections can be done with, for example, the cregex_token_iterator class, but the main problem is that it works the opposite way from how I want to use it. I want to match a pattern like regex re("(\\[.*?\\])"), but the token iterator returns all the strings that were not matched, which sounds kind of silly to me.
Basically, I would like to match a whole section first with (\\[.*?\\])(.*?\n\n), then pick out the registry path, then the key-values, and finally split the key-value pairs using regex.
It's remarkable that in C# it's relatively easy to write a regex matcher like this, but I would prefer to go with C++, as it's native and does not have performance or assembly-unload problems.
After further analysis: it's possible to use regex_search, but the search needs to be retried, continuing from the next char* after the found pattern.
Below is an almost complete example that loads a .reg file at run time. I'm using MFC's CString because it's slightly easier to use than std::string and portability is not currently needed.
#include "stdafx.h"
#include <afx.h> //CFile
#include "TestRegex.h"
#include <fstream>
#include <string>
#include <regex>
#include <map>
CWinApp theApp;
using namespace std;
typedef enum
{
eREG_DWORD = REG_DWORD,
eREG_QWORD = REG_QWORD,
eREG_BINARY = REG_BINARY,
eREG_SZ = REG_SZ
}eRegType;
class RegVariant
{
public:
eRegType type;
union
{
DWORD dw;
__int64 qw;
};
CStringA str;
};
class RegKeyNode
{
public:
// Paths to next nodes
map<CStringA, RegKeyNode> keyToNode;
// Values of current key
map<CStringA, RegVariant> keyValues;
};
map<HKEY, RegKeyNode> g_registry;
int char2int(char input)
{
if (input >= '0' && input <= '9')
return input - '0';
if (input >= 'A' && input <= 'F')
return input - 'A' + 10;
if (input >= 'a' && input <= 'f')
return input - 'a' + 10;
return 0;
}
void hexToBin( const char* hex, CStringA& bin, int maxSize = -1 )
{
int size = (strlen(hex) + 1)/ 3;
if(maxSize != -1 && size > maxSize)
size = maxSize;
unsigned char* buf = (unsigned char*)bin.GetBuffer(size);
for( int i = 0; i < size; i++ )
buf[i] = char2int( hex[ i*3 ] ) * 16 + char2int(hex[i * 3 + 1]);
bin.ReleaseBuffer();
}
int main()
{
HMODULE hModule = ::GetModuleHandle(nullptr);
AfxWinInit(hModule, nullptr, ::GetCommandLine(), 0);
//
// Load .reg file.
//
CString fileName = L"test1.reg";
CStringA file;
CFile cfile;
if (cfile.Open(fileName, CFile::modeRead | CFile::shareDenyNone))
{
int len = (int)cfile.GetLength();
UINT read = cfile.Read(file.GetBuffer(len), len);
// Pass the byte count explicitly: the raw buffer is not null-terminated.
file.ReleaseBuffer((int)read);
cfile.Close();
}
file.Replace("\r\n", "\n");
const char* pbuf = file.GetBuffer();
regex reSection("\\[(.*?)\\]([^]*?)\n\n");
regex reLine("^\\s*\"(.*?)\"\\s*=\\s*(.*)$");
regex reTypedValue("^(hex|dword|hex\\(b\\)):(.*)$");
regex reStringValue("^\"(.*)\"$" );
cmatch cmSection, cmLine;
//
// For each section:
//
// [registry path]
// "value1"="value 1"
// "value2"="value 1"
//
while( regex_search(pbuf, pbuf + strlen(pbuf), cmSection, reSection) )
{
CStringA path = cmSection[1].str().c_str();
string key_values = cmSection[2].str();
const char* pkv = key_values.c_str();
int iPath = 0;
CStringA hkeyName = path.Tokenize("\\", iPath).MakeUpper();
RegKeyNode* rnode;
if( hkeyName.Compare("HKEY_LOCAL_MACHINE") == 0 )
rnode = &g_registry[HKEY_LOCAL_MACHINE];
else
rnode = &g_registry[HKEY_CURRENT_USER]; // Don't support other HKEY roots.
//
// Locate path where to place values.
//
for( ; hkeyName = path.Tokenize("\\", iPath); )
{
if( hkeyName.IsEmpty() )
break;
rnode = &rnode->keyToNode[hkeyName];
}
//
// Scan "key"="value" pairs.
//
while( regex_search(pkv, pkv+strlen(pkv), cmLine, reLine ))
{
CStringA key = cmLine[1].str().c_str();
string valueType = cmLine[2].str();
smatch cmTypeValue;
RegVariant* rvValue = &rnode->keyValues[key];
//
// Extract type and value.
//
if(regex_search(valueType, cmTypeValue, reTypedValue))
{
string type = cmTypeValue[1].str();
string value = cmTypeValue[2].str();
if( type == "dword")
{
rvValue->type = eREG_DWORD;
rvValue->dw = (DWORD)strtoul(value.c_str(), 0, 16);
}
else if (type == "hex(b)")
{
rvValue->type = eREG_QWORD;
rvValue->qw = 0;
if( value.size() == 8 * 2 + 7 )
{
CStringA v;
hexToBin(value.c_str(), v, sizeof(__int64));
rvValue->qw = *((__int64*)v.GetBuffer());
}
} else //if (type == "hex")
{
rvValue->type = eREG_BINARY;
hexToBin(value.c_str(), rvValue->str);
}
} else if( regex_search(valueType, cmTypeValue, reStringValue))
{
rvValue->type = eREG_SZ;
rvValue->str = cmTypeValue[1].str().c_str();
}
pkv = cmLine[2].second;
} //while
pbuf = cmSection[2].second;
} //while
return 0;
}
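The "search, then continue after the match" loop above can also be written with std::sregex_iterator, which yields one match result per occurrence. This is only a sketch: FindKeyValues is a hypothetical name, and the simplified pattern assumes that value names contain no embedded quotes and that every value is non-empty and fits on one line.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Collects every "name"=value pair found in `text`.
std::vector<std::pair<std::string, std::string>> FindKeyValues(const std::string& text)
{
    // Group 1: value name between quotes. Group 2: the rest of the line.
    static const std::regex reLine("\"([^\"]*)\"\\s*=\\s*([^\n]*)");
    std::vector<std::pair<std::string, std::string>> result;
    // A default-constructed sregex_iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(text.begin(), text.end(), reLine), end; it != end; ++it)
    {
        result.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return result;
}
```

The iterator takes care of restarting the search after each match, so no manual pointer arithmetic on the input buffer is needed. String values keep their surrounding quotes here and would still need the type dispatch shown above.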

C++ Access violation writing location 0x000A000B

I'm building a web crawler. This error occurs when I start debugging; the debugger takes me to memcpy.asm, xstring or dbgdel.cpp, showing different lines of these files every time.
I was wondering if the code is wrong somehow. I started thinking I am accessing memory blocks that I shouldn't. Here is some code. I hope you can help.
The idea is to iterate through httpContent and get all the URLs from the <a> tags. I am looking for href=" in the beginning and then for the next ". What is in between I am trying to put in temp, then pass the content of temp to an array of strings.
struct Url
{
string host;
string path;
};
int main(){
struct Url website;
string href[100];
website.host = "crawlertest.cs.tu-varna.bg";
website.path = "";
string httpContent = downloadHTTP(website);
for(unsigned int i = 0; i <= httpContent.length()-7; i++){
char c = httpContent[i];
if(c == 'h'){
c = httpContent[i+1];
if(c == 'r'){
c = httpContent[i+2];
if(c == 'e'){
c = httpContent[i+3];
if(c == 'f'){
c = httpContent[i+4];
if(c == '='){
c = httpContent[i+5];
if(c == '\"'){
i+=6;
c = httpContent[i];
string temp = "";
while(c!='\"'){
i++;
c = httpContent[i];
temp+= c;
}
href[i] = temp;
temp = "";
cout<<href[i]<<endl;
}}}}}}
}
system("pause");
return 0;
}
UPDATE
I edited the =, now ==
I am also stopping the iterations 7 positions earlier so the 'if's should not be problem.
I am getting the same errors though.
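The crash comes from href[i] = temp;: i is a position in httpContent, not the number of links found so far, so it quickly runs past the end of the 100-element array. Use std::vector< std::string > href; to store your results.
With string::find you can find a sequence in a string, and with string::substr you can extract it.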
#include <vector>
#include <string>
struct Url
{
std::string host;
std::string path;
};
int main(){
struct Url website;
website.host = "crawlertest.cs.tu-varna.bg";
website.path = "";
std::string httpContent = downloadHTTP(website);
std::vector< std::string > href;
std::size_t pos = httpContent.find("href="); // search for first "href="
while ( pos != std::string::npos )
{
pos = httpContent.find( '"', pos+5 ); // search for opening '"'
if ( pos != std::string::npos )
{
std::size_t posSt = pos + 1;
pos = httpContent.find( '"', posSt ); // search for closing '"'
if ( pos != std::string::npos )
{
href.push_back( httpContent.substr( posSt, pos - posSt ) ); // extract the URL and append it to the result
pos = httpContent.find( "href=", pos+1 ); // search for next "href="
}
}
}
system("pause");
return 0;
}

Using a loop with std::strcmp to load lots of settings

In my game I keep track of unlocked levels with a vector std::vector<bool> lvlUnlocked_;.
The simple function to save the progress is this:
void save() {
std::stringstream ss;
std::string stringToSave = "";
std::ofstream ofile("./progress.txt");
if (ofile.good()) {
ofile.clear();
for (std::size_t i = 0; i < lvlUnlocked_.size(); ++i) {
ss << "lvl" << i << "=" << (lvlUnlocked_.at(i) ? "1" : "0") << std::endl;
}
stringToSave = ss.str();
ofile << stringToSave;
ofile.close();
}
}
This works and is nice since I can just use a loop to dump the info.
Now to the part where I am stuck, the lower part of my load function (see comment in code below):
void load() {
std::ifstream ifile("./progress.txt");
if (ifile.good()) {
int begin;
int end;
std::string line;
std::string stringKey = "";
std::string stringValue = "";
unsigned int result;
while (std::getline(ifile, line)) {
stringKey = "";
stringValue = "";
for (unsigned int i = 0; i < line.length(); i++) {
if (line.at(i) == '=') {
begin = i + 1;
end = line.length();
break;
}
}
for (int i = 0; i < begin - 1; i++) {
stringKey += line.at(i);
}
for (int i = begin; i < end; i++) {
stringValue += line.at(i);
}
result = static_cast<unsigned int>(std::stoi(stringValue));
// usually I now compare the value and act accordingly, like so:
if (std::strcmp(stringKey.c_str(), "lvl0") == 0) {
lvlUnlocked_.at(0) = true;
} else if (std::strcmp(stringKey.c_str(), "lvl1") == 0) {
lvlUnlocked_.at(1) = true;
} else if (std::strcmp(stringKey.c_str(), "lvl2") == 0) {
lvlUnlocked_.at(2) = true;
}
// etc....
}
}
}
This works fine, but...
the problem is that I have 100+ levels and I want it to be dynamic based on the size of my lvlUnlocked_ vector instead of having to type it all like in the code above.
Is there a way to somehow make use of a loop like in my save function to check all levels?
If you parse your key to extract a suitable integer value, you can just index into the bit-vector with that:
while (std::getline(ifile, line)) {
const size_t eq = line.find('=');
if (eq == std::string::npos)
// no equals sign
continue;
auto stringKey = line.substr(0, eq);
auto stringValue = line.substr(eq+1);
if (stringKey.substr(0,3) != "lvl")
// doesn't begin with lvl
continue;
// strip off "lvl"
stringKey = stringKey.substr(3);
size_t end;
std::vector<bool>::size_type index = std::stoi(stringKey, &end);
if (end == 0 || end != stringKey.length())
// not a valid level number
continue;
if (index >= lvlUnlocked_.size())
// out of range
continue;
// Set it :-)
lvlUnlocked_[index] = stringValue=="1";
}
(I've also updated your parsing for "key=value" strings to more idiomatic C++.)
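One caveat with the loop above: std::stoi throws std::invalid_argument when the text has no leading digits, so a line such as "lvlx=1" raises an exception before the end == 0 check is reached. Below is a sketch of a stricter variant that validates the digits first. ApplyLevelLine is a hypothetical helper name, and it takes the vector as a parameter only so that it can be shown standalone.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: applies one "lvlN=0|1" line to `unlocked`.
// Returns false and changes nothing if the line is malformed or N is out of range.
bool ApplyLevelLine(const std::string& line, std::vector<bool>& unlocked)
{
    const std::string prefix = "lvl";
    const std::size_t eq = line.find('=');
    if (eq == std::string::npos || line.compare(0, prefix.size(), prefix) != 0)
        return false;
    const std::string digits = line.substr(prefix.size(), eq - prefix.size());
    // Accept decimal digits only, so std::stoul cannot see signs or spaces.
    if (digits.empty() || digits.find_first_not_of("0123456789") != std::string::npos)
        return false;
    std::size_t index = 0;
    try
    {
        index = std::stoul(digits);
    }
    catch (const std::out_of_range&)
    {
        return false; // more digits than unsigned long can hold
    }
    if (index >= unlocked.size())
        return false;
    unlocked[index] = (line.substr(eq + 1) == "1");
    return true;
}
```

Inside load() this would be called once per line as ApplyLevelLine(line, lvlUnlocked_), ignoring or logging the lines for which it returns false.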

How can I decode HTML entities in C++? [duplicate]

I'm interested in unescaping text, for example: &#92; maps to \ in C. Does anyone know of a good library?
For reference, see the Wikipedia List of XML and HTML Character Entity References.
For another open source reference in C to decoding these HTML entities you can check out the command line utility uni2ascii/ascii2uni. The relevant files are enttbl.{c,h} for entity lookup and putu8.c which down converts from UTF32 to UTF8.
uni2ascii
I wrote my own unescape code; very simplified, but does the job: pn_util.c
Function description: converts special HTML entities back to characters.
You may need to modify it to fit your requirements.
char* HtmlSpecialChars_Decode(char* encodedHtmlSpecialEntities)
{
int encodedLen = 0;
int escapeArrayLen = 0;
static char decodedHtmlSpecialChars[TITLE_SIZE];
char innerHtmlSpecialEntities[MAX_CONFIG_ITEM_SIZE];
/* This mapping table can be extended if necessary. */
static const struct {
const char* encodedEntity;
const char decodedChar;
} entityToChars[] = {
{"&lt;", '<'},
{"&gt;", '>'},
{"&amp;", '&'},
{"&quot;", '"'},
{"&#39;", '\''},
};
if(strchr(encodedHtmlSpecialEntities, '&') == NULL)
return encodedHtmlSpecialEntities;
memset(decodedHtmlSpecialChars, '\0', TITLE_SIZE);
memset(innerHtmlSpecialEntities, '\0', MAX_CONFIG_ITEM_SIZE);
escapeArrayLen = sizeof(entityToChars) / sizeof(entityToChars[0]);
strcpy(innerHtmlSpecialEntities, encodedHtmlSpecialEntities);
encodedLen = strlen(innerHtmlSpecialEntities);
for(int i = 0; i < encodedLen; i++)
{
if(innerHtmlSpecialEntities[i] == '&')
{
/* Potential encode char. */
char * tempEntities = innerHtmlSpecialEntities + i;
for(int j = 0; j < escapeArrayLen; j++)
{
if(strncmp(tempEntities, entityToChars[j].encodedEntity, strlen(entityToChars[j].encodedEntity)) == 0)
{
int index = 0;
strncat(decodedHtmlSpecialChars, innerHtmlSpecialEntities, i);
index = strlen(decodedHtmlSpecialChars);
decodedHtmlSpecialChars[index] = entityToChars[j].decodedChar;
if(strlen(tempEntities) > strlen(entityToChars[j].encodedEntity))
{
/* Not to the end, continue */
char temp[MAX_CONFIG_ITEM_SIZE] = {'\0'};
strcpy(temp, tempEntities + strlen(entityToChars[j].encodedEntity));
memset(innerHtmlSpecialEntities, '\0', MAX_CONFIG_ITEM_SIZE);
strcpy(innerHtmlSpecialEntities, temp);
encodedLen = strlen(innerHtmlSpecialEntities);
i = -1;
}
else
encodedLen = 0;
break;
}
}
}
}
if(encodedLen != 0)
strcat(decodedHtmlSpecialChars, innerHtmlSpecialEntities);
return decodedHtmlSpecialChars;
}
QString UNESC(const QString &txt) {
QStringList bld;
static QChar AMP = '&', SCL = ';';
static QMap<QString, QString> dec = {
{"&lt;", "<"}, {"&gt;", ">"}
, {"&amp;", "&"}, {"&quot;", R"(")"}, {"&#39;", "'"} };
if(!txt.contains(AMP)) { return txt; }
int bgn = 0, pos = 0;
while((pos = txt.indexOf(AMP, pos)) != -1) {
int end = txt.indexOf(SCL, pos)+1;
QString val = dec[txt.mid(pos, end - pos)];
bld << txt.mid(bgn, pos - bgn);
if(val.isEmpty()) {
end = txt.indexOf(AMP, pos+1);
bld << txt.mid(pos, end - pos);
} else {
bld << val;
}// else // if(val.isEmpty())
bgn = end; pos = end;
}// while((pos = txt.indexOf(AMP, pos)) != -1)
if(bgn >= 0) { bld << txt.mid(bgn); } // append the text after the last decoded entity
return bld.join(QString());
}// UNESC
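For comparison, the same table-driven idea can be written against std::string in a single left-to-right pass, which avoids fixed-size buffers and repeated copying. This is a sketch only: DecodeBasicEntities is a hypothetical name, it handles just the handful of entities listed in its table, and anything it does not recognise is copied through unchanged.

```cpp
#include <cstddef>
#include <string>

// Decodes &lt; &gt; &amp; &quot; &apos; and &#39; in one left-to-right pass.
// Unknown or unterminated entities are copied through unchanged.
std::string DecodeBasicEntities(const std::string& in)
{
    static const struct { const char* entity; char ch; } table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&quot;", '"'}, {"&apos;", '\''}, {"&#39;", '\''},
    };
    std::string out;
    out.reserve(in.size()); // the decoded text is never longer than the input
    std::size_t i = 0;
    while (i < in.size())
    {
        bool matched = false;
        if (in[i] == '&')
        {
            for (const auto& e : table)
            {
                const std::string entity = e.entity;
                if (in.compare(i, entity.size(), entity) == 0)
                {
                    out += e.ch;
                    i += entity.size(); // skip the whole entity
                    matched = true;
                    break;
                }
            }
        }
        if (!matched)
            out += in[i++]; // ordinary character, or an '&' that starts no known entity
    }
    return out;
}
```

Because already-decoded output is never rescanned, "&amp;lt;" becomes "&lt;" rather than "<". Numeric references in general and the full named-entity list would need a larger table or a proper library.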