Most efficient way to escape XML/HTML in C++ string?


I can't believe this question hasn't been asked before. I have a string that needs to be inserted into an HTML file but it may contain special HTML characters. I want to replace these with the appropriate HTML representation.
The code below works but is pretty verbose and ugly. Performance is not critical for my application but I guess there are scalability problems here also. How can I improve this? I guess this is a job for STL algorithms or some esoteric Boost function, but the code below is the best I can come up with myself.
void escape(std::string *data)
{
std::string::size_type pos = 0;
for (;;)
{
pos = data->find_first_of("\"&<>", pos);
if (pos == std::string::npos) break;
std::string replacement;
switch ((*data)[pos])
{
case '\"': replacement = "&quot;"; break;
case '&': replacement = "&amp;"; break;
case '<': replacement = "&lt;"; break;
case '>': replacement = "&gt;"; break;
default: ;
}
data->replace(pos, 1, replacement);
pos += replacement.size();
};
}

Instead of replacing in the original string, you can copy into a new string and substitute on the fly, which avoids having to move characters within the string. Copying is linear in the input size, whereas every in-place replace shifts the rest of the string, and it has better cache behavior, so I'd expect a large improvement. Or you can use boost::spirit::xml encode or http://code.google.com/p/pugixml/.
void encode(std::string& data) {
std::string buffer;
buffer.reserve(data.size());
for(size_t pos = 0; pos != data.size(); ++pos) {
switch(data[pos]) {
case '&': buffer.append("&amp;"); break;
case '\"': buffer.append("&quot;"); break;
case '\'': buffer.append("&apos;"); break;
case '<': buffer.append("&lt;"); break;
case '>': buffer.append("&gt;"); break;
default: buffer.append(&data[pos], 1); break;
}
}
data.swap(buffer);
}
EDIT: A small improvement can be achieved by using a heuristic for the size of the buffer. Change the buffer.reserve argument to data.size()*1.1 (10% extra) or something similar, depending on how many replacements are expected.
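As a rough sketch of that heuristic, here is a variant that returns an escaped copy and reserves about 10% extra up front. The name escape_copy and the 10% figure are only assumptions for illustration; the right margin depends on how many special characters your data contains.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the answer above): returns an escaped copy.
std::string escape_copy(const std::string& data)
{
    std::string buffer;
    // Reserve roughly 10% more than the input so that a handful of
    // replacements does not force the buffer to grow.
    buffer.reserve(data.size() + data.size() / 10);
    for (std::size_t pos = 0; pos != data.size(); ++pos) {
        switch (data[pos]) {
            case '&':  buffer.append("&amp;");  break;
            case '\"': buffer.append("&quot;"); break;
            case '\'': buffer.append("&apos;"); break;
            case '<':  buffer.append("&lt;");   break;
            case '>':  buffer.append("&gt;");   break;
            default:   buffer.push_back(data[pos]); break;
        }
    }
    return buffer;
}
```

Calling escape_copy("Tom & Jerry") would yield "Tom &amp; Jerry"; input without special characters comes back unchanged.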

void escape(std::string *data)
{
using boost::algorithm::replace_all;
replace_all(*data, "&", "&amp;");
replace_all(*data, "\"", "&quot;");
replace_all(*data, "\'", "&apos;");
replace_all(*data, "<", "&lt;");
replace_all(*data, ">", "&gt;");
}
Could win the prize for least verbose?

Here is a simple ~30-line C function that does the trick reasonably well. It assumes that temp_str points to a buffer large enough to hold the additional escaped characters.
void toExpatEscape(char *temp_str)
{
const char cEscapeChars[6]={'&','\'','\"','>','<','\0'};
const char * const pEscapedSeqTable[] =
{
"&amp;",
"&apos;",
"&quot;",
"&gt;",
"&lt;",
};
unsigned int i, j, k, nRef = 0, nEscapeCharsLen = strlen(cEscapeChars), str_len = strlen(temp_str);
int nShifts = 0;
for (i=0; i<str_len; i++)
{
for(nRef=0; nRef<nEscapeCharsLen; nRef++)
{
if(temp_str[i] == cEscapeChars[nRef])
{
if((nShifts = strlen(pEscapedSeqTable[nRef]) - 1) > 0)
{
// Shift the tail (the str_len - i bytes starting at i) right by nShifts.
memmove(temp_str+i+nShifts, temp_str+i, str_len-i);
for(j=i,k=0; j<=i+nShifts,k<=nShifts; j++,k++)
temp_str[j] = pEscapedSeqTable[nRef][k];
str_len += nShifts;
}
}
}
}
temp_str[str_len] = '\0';
}

My tests showed that this answer performed best of those offered (not surprising, as it is the highest rated).
I implemented the same algorithm for my project (I really want good performance and memory usage), and in my tests my implementation was about 2.6-3.25 times faster. I also dislike the memory behavior of the previously best algorithm: it uses extra memory both when the 1.1 multiplier 'heuristic' over-reserves and when .append() forces a resize.
So I'll leave my code here; maybe somebody will find it useful.
HtmlPreprocess.h:
#ifndef _HTML_PREPROCESS_H_
#define _HTML_PREPROCESS_H_
#include <string>
class HtmlPreprocess
{
public:
HtmlPreprocess();
~HtmlPreprocess();
static void htmlspecialchars(
const std::string & in,
std::string & out
);
};
#endif // _HTML_PREPROCESS_H_
HtmlPreprocess.cpp:
#include "HtmlPreprocess.h"
HtmlPreprocess::HtmlPreprocess()
{
}
HtmlPreprocess::~HtmlPreprocess()
{
}
const unsigned char map_char_to_final_size[] =
{
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 6, 1, 1, 1, 5, 6, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 4, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
};
const unsigned char map_char_to_index[] =
{
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 2, 0xFF, 0xFF, 0xFF, 0, 1, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 4, 0xFF, 3, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};
void HtmlPreprocess::htmlspecialchars(
const std::string & in,
std::string & out
)
{
const char * lp_in_stored = &in[0];
size_t in_size = in.size();
const char * lp_in = lp_in_stored;
size_t final_size = 0;
// Index the tables through unsigned char: plain char may be signed,
// and bytes >= 0x80 would otherwise produce a negative index.
for (size_t i = 0; i < in_size; i++)
final_size += map_char_to_final_size[static_cast<unsigned char>(*lp_in++)];
out.resize(final_size);
lp_in = lp_in_stored;
char * lp_out = &out[0];
for (size_t i = 0; i < in_size; i++)
{
char current_char = *lp_in++;
unsigned char next_action = map_char_to_index[static_cast<unsigned char>(current_char)];
switch (next_action){
case 0:
*lp_out++ = '&';
*lp_out++ = 'a';
*lp_out++ = 'm';
*lp_out++ = 'p';
*lp_out++ = ';';
break;
case 1:
*lp_out++ = '&';
*lp_out++ = 'a';
*lp_out++ = 'p';
*lp_out++ = 'o';
*lp_out++ = 's';
*lp_out++ = ';';
break;
case 2:
*lp_out++ = '&';
*lp_out++ = 'q';
*lp_out++ = 'u';
*lp_out++ = 'o';
*lp_out++ = 't';
*lp_out++ = ';';
break;
case 3:
*lp_out++ = '&';
*lp_out++ = 'g';
*lp_out++ = 't';
*lp_out++ = ';';
break;
case 4:
*lp_out++ = '&';
*lp_out++ = 'l';
*lp_out++ = 't';
*lp_out++ = ';';
break;
default:
*lp_out++ = current_char;
}
}
}

If you're going for processing speed, then it seems to me that the best would be to have a second string that you build as you go, copying from the first string to the second string, and then appending the html escapes as you encounter them. Since I assume that the replace method involves first a memory move, followed by a copy into the replaced position, it's going to be very slow for large strings. If you have a second string to build using .append(), it will avoid the memory move.
As far as code "cleanness" goes, I think that's about as pretty as you're going to get. You could create an array of characters and their replacements, and then search the array, but that would probably be slower and not much cleaner anyway.
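For comparison, the array-of-replacements variant could look something like the sketch below. All names here (Escape, kEscapes, find_entity, escape_with_table) are made up for illustration, and whether it is actually slower than a switch would need measuring.

```cpp
#include <string>

namespace sketch {

// Hypothetical table pairing each special character with its entity.
struct Escape { char ch; const char* entity; };

const Escape kEscapes[] = {
    { '&',  "&amp;"  },
    { '"',  "&quot;" },
    { '\'', "&apos;" },
    { '<',  "&lt;"   },
    { '>',  "&gt;"   },
};

// Returns the entity for c, or nullptr when c needs no escaping.
const char* find_entity(char c)
{
    for (const Escape& e : kEscapes) {
        if (e.ch == c) return e.entity;
    }
    return nullptr;
}

// Builds the escaped text in a second string, using append only.
std::string escape_with_table(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (const char* entity = find_entity(c)) out.append(entity);
        else out.push_back(c);
    }
    return out;
}

} // namespace sketch
```

Adding a new character to escape then only means adding a row to the table.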

I'd honestly go with a more generic version using iterators, such that you can "stream" the encoding. Consider the following implementation:
#include <algorithm>
#include <cstddef>
#include <iterator>
namespace xml {
// Helper for null-terminated ASCII strings (no end of string iterator).
template<typename InIter, typename OutIter>
OutIter copy_asciiz ( InIter begin, OutIter out )
{
while ( *begin != '\0' ) {
*out++ = *begin++;
}
return (out);
}
// XML escaping in its general form. Note that 'out' is expected
// to be an "infinite" sequence.
template<typename InIter, typename OutIter>
OutIter escape ( InIter begin, InIter end, OutIter out )
{
static const char bad[] = "&<>";
static const char* rep[] = {"&amp;", "&lt;", "&gt;"};
// Exclude the terminating '\0' so that it never indexes past 'rep'.
static const std::size_t n = sizeof(bad)/sizeof(bad[0]) - 1;
for ( ; (begin != end); ++begin )
{
// Find which replacement to use.
const std::size_t i =
std::distance(bad, std::find(bad, bad+n, *begin));
// No need for escaping.
if ( i == n ) {
*out++ = *begin;
}
// Escape the character.
else {
out = copy_asciiz(rep[i], out);
}
}
return (out);
}
}
Then, you can simplify the average case using a few overloads:
#include <istream>
#include <iterator>
#include <ostream>
#include <string>
namespace xml {
// Get escaped version of "content".
std::string escape ( const std::string& content )
{
std::string result;
result.reserve(content.size());
escape(content.begin(), content.end(), std::back_inserter(result));
return (result);
}
// Escape data on the fly, using "constant" memory.
void escape ( std::istream& in, std::ostream& out )
{
escape(std::istreambuf_iterator<char>(in),
std::istreambuf_iterator<char>(),
std::ostreambuf_iterator<char>(out));
}
}
Finally, test the whole lot:
#include <iostream>
int main ( int, char ** )
{
std::cout << xml::escape("<foo>bar & qux</foo>") << std::endl;
}
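To try the stream form without touching files, string streams work as well. The sketch below is self-contained, so it uses a simplified stand-in for the iterator-based escape (same three characters); escape_via_streams is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstring>
#include <iterator>
#include <sstream>
#include <string>

namespace sketch {

// Simplified stand-in for the iterator-based escape above
// (assumption: it handles the same three characters '&', '<' and '>').
template <typename InIter, typename OutIter>
OutIter escape(InIter begin, InIter end, OutIter out)
{
    for (; begin != end; ++begin) {
        const char* rep = nullptr;
        switch (*begin) {
            case '&': rep = "&amp;"; break;
            case '<': rep = "&lt;";  break;
            case '>': rep = "&gt;";  break;
            default: break;
        }
        if (rep) out = std::copy(rep, rep + std::strlen(rep), out);
        else *out++ = *begin;
    }
    return out;
}

// Hypothetical helper: runs the stream-to-stream form over in-memory streams.
std::string escape_via_streams(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    escape(std::istreambuf_iterator<char>(in),
           std::istreambuf_iterator<char>(),
           std::ostreambuf_iterator<char>(out));
    return out.str();
}

} // namespace sketch
```

The same call works with std::ifstream and std::ofstream, in which case the input is never held in memory as a whole.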

You can use the boost::property_tree::xml_parser::encode_char_entities if you don't want to write it yourself.
For reference, here's the code in boost 1.64.0:
```
template<class Str>
Str encode_char_entities(const Str &s)
{
// Don't do anything for empty strings.
if(s.empty()) return s;
typedef typename Str::value_type Ch;
Str r;
// To properly round-trip spaces and not uglify the XML beyond
// recognition, we have to encode them IF the text contains only spaces.
Str sp(1, Ch(' '));
if(s.find_first_not_of(sp) == Str::npos) {
// The first will suffice.
r = detail::widen<Str>("&#32;");
r += Str(s.size() - 1, Ch(' '));
} else {
typename Str::const_iterator end = s.end();
for (typename Str::const_iterator it = s.begin(); it != end; ++it)
{
switch (*it)
{
case Ch('<'): r += detail::widen<Str>("&lt;"); break;
case Ch('>'): r += detail::widen<Str>("&gt;"); break;
case Ch('&'): r += detail::widen<Str>("&amp;"); break;
case Ch('"'): r += detail::widen<Str>("&quot;"); break;
case Ch('\''): r += detail::widen<Str>("&apos;"); break;
default: r += *it; break;
}
}
}
return r;
}
```

I profiled three solutions with Visual Studio 2017. The input was 10,000,000 strings of length 5-20, with a 9.4% probability that a character needs to be escaped.
1. Solution from Giovanni Funchal
2. Solution from HostageBrain
3. My solution
The results:
1. needs 1.675 seconds
2. needs 0.769 seconds
3. needs 0.368 seconds
In my solution, the final size is precalculated and the string data is copied only when needed, so heap memory allocations should be minimal.
const unsigned char calcFinalSize[] =
{
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 6, 1, 1, 1, 5, 6, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 4, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
};
void escapeXml(std::string & in)
{
const char* dataIn = in.data();
size_t sizeIn = in.size();
const char* dataInCurrent = dataIn;
const char* dataInEnd = dataIn + sizeIn;
size_t outSize = 0;
while (dataInCurrent < dataInEnd)
{
outSize += calcFinalSize[static_cast<uint8_t>(*dataInCurrent)];
dataInCurrent++;
}
if (outSize == sizeIn)
{
return;
}
std::string out;
out.resize(outSize);
dataInCurrent = dataIn;
char* dataOut = &out[0];
while (dataInCurrent < dataInEnd)
{
switch (*dataInCurrent) {
case '&':
memcpy(dataOut, "&amp;", sizeof("&amp;") - 1);
dataOut += sizeof("&amp;") - 1;
break;
case '\'':
memcpy(dataOut, "&apos;", sizeof("&apos;") - 1);
dataOut += sizeof("&apos;") - 1;
break;
case '\"':
memcpy(dataOut, "&quot;", sizeof("&quot;") - 1);
dataOut += sizeof("&quot;") - 1;
break;
case '>':
memcpy(dataOut, "&gt;", sizeof("&gt;") - 1);
dataOut += sizeof("&gt;") - 1;
break;
case '<':
memcpy(dataOut, "&lt;", sizeof("&lt;") - 1);
dataOut += sizeof("&lt;") - 1;
break;
default:
*dataOut++ = *dataInCurrent;
}
dataInCurrent++;
}
in.swap(out);
}
Edit: Replaced "&quote;" with "&quot;". Old solution was overwriting memory, because the look-up table contained a length of 6 for "&quote;".

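Or with just the STL:
std::string& rep(std::string &s, const std::string &from, const std::string &to)
{
if (from.empty()) return s; // nothing to search for
std::string::size_type pos = 0;
// Resume after the inserted text so the replacement is never rescanned.
while ( (pos = s.find(from, pos)) != std::string::npos) {
s.replace(pos, from.length(), to);
pos += to.length();
}
return s;
}
Usage:
rep(s, "&", "&amp;");
rep(s, "\"", "&quot;");
or:
rep(s, "HTML","xxxx");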

Related

static map exceeds stack

The situation is that I programmed an assembler and I'm using a std::unordered_multimap container to store all the different instructions, where the actual mnemonic is my key into the map and the associated value is a custom structure with some additional information about the parameters, etc.
Since I don't need to make any changes to this lookup during runtime I thought I'd declare it as static and const and put all the values manually in an initializer_list.
Altogether it looks like this:
typedef std::wstring STRING;
static const
std::unordered_multimap<STRING, ASM_INSTRUCTION> InstructionLookup = {
// { MNEMONIC, { Opcode1, Opcode2, Param1Type, Param2Type, Param3Type, NrBytes, Bt1, Bt2, Bt3, Bt4, NoRexW, InvalidIn64Bit, InvalidIn32Bit } },
{ L"AAA",{ ot_none, ot_none, par_noparam, par_noparam, par_noparam, 1, 0x37, 0x00, 0x00, 0x00, false, true, false } },
{ L"AAD",{ ot_none, ot_none, par_noparam, par_noparam, par_noparam, 2, 0xD5, 0x0A, 0x00, 0x00, false, true, false } },
{ L"AAD",{ ot_ib, ot_none, par_imm8, par_noparam, par_noparam, 1, 0xD5, 0x00, 0x00, 0x00, false, true, false } },
{ L"AAM",{ ot_none, ot_none, par_noparam, par_noparam, par_noparam, 2, 0xD4, 0x0A, 0x00, 0x00, false, true, false } },
...
My problem now is that there're a lot of instructions (currently 1,225 of them) implemented.
So when I run a code-analysis with Visual Studio, it tells me that the constructor function exceeds the stack with 98,000/16,384 bytes because the constructor first puts all those entries on the stack it seems, before processing them any further.
My question now is how to initialize all that space directly on the heap, preferably without having to rewrite much of it.

I think emplace is what you are looking for:
InstructionLookup.emplace(std::piecewise_construct, std::forward_as_tuple(L"sXs"), std::forward_as_tuple(ot_none, ot_none, par_noparam, par_noparam, par_noparam, 1, 0x37, 0x00, 0x00, 0x00, false, true, false));
I tried to keep your syntax as much as possible and adapted a Boost.Assign-style implementation to use perfect forwarding:
template <typename T, typename U>
class create_unmap
{
private:
std::unordered_multimap<T, U> m_map;
public:
template <typename ...Args>
create_unmap(Args&&... _Val)
{
m_map.emplace(std::forward<Args>(_Val)...);
}
template <typename ...Args>
create_unmap<T, U>& operator()(Args&&... _Val)
{
m_map.emplace(std::forward<Args>(_Val)...);
return *this;
}
operator std::unordered_multimap<T, U>()
{
return std::move(m_map);
}
};
You can declare your map using this syntax:
static const std::unordered_multimap<STRING, ASM_INSTRUCTION> InstructionLookup = create_unmap<STRING, ASM_INSTRUCTION>
(std::piecewise_construct, std::forward_as_tuple(L"AAA"), std::forward_as_tuple(ot_none, ot_none, par_noparam, par_noparam, par_noparam, 1, 0x37, 0x00, 0x00, 0x00, false, true, false))
(std::piecewise_construct, std::forward_as_tuple(L"AAD"), std::forward_as_tuple(ot_none, ot_none, par_noparam, par_noparam, par_noparam, 1, 0x37, 0x00, 0x00, 0x00, false, true, false))
(std::piecewise_construct, std::forward_as_tuple(L"AAD"), std::forward_as_tuple(ot_none, ot_none, par_noparam, par_noparam, par_noparam, 1, 0x37, 0x00, 0x00, 0x00, false, true, false));
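Here is a small self-contained sketch of the same builder so you can try it out. Instruction is a hypothetical, much smaller stand-in for ASM_INSTRUCTION, and lookup() is just an illustrative wrapper. Note the assumption that the value type has a constructor: emplace constructs it with parentheses, which plain aggregates do not support before C++20.

```cpp
#include <string>
#include <tuple>
#include <unordered_map>
#include <utility>

namespace sketch {

// Hypothetical stand-in for ASM_INSTRUCTION with only three fields.
struct Instruction {
    Instruction(int bytes, int b1, bool no64)
        : nr_bytes(bytes), byte1(b1), invalid_in_64bit(no64) {}
    int nr_bytes;
    int byte1;
    bool invalid_in_64bit;
};

template <typename T, typename U>
class create_unmap {
    std::unordered_multimap<T, U> m_map;
public:
    template <typename... Args>
    create_unmap(Args&&... args) { m_map.emplace(std::forward<Args>(args)...); }

    template <typename... Args>
    create_unmap& operator()(Args&&... args)
    {
        m_map.emplace(std::forward<Args>(args)...);
        return *this;
    }

    operator std::unordered_multimap<T, U>() { return std::move(m_map); }
};

// Entries are emplaced one call at a time instead of going through
// one large initializer_list.
const std::unordered_multimap<std::wstring, Instruction>& lookup()
{
    static const std::unordered_multimap<std::wstring, Instruction> table =
        create_unmap<std::wstring, Instruction>
        (std::piecewise_construct, std::forward_as_tuple(L"AAA"), std::forward_as_tuple(1, 0x37, true))
        (std::piecewise_construct, std::forward_as_tuple(L"AAD"), std::forward_as_tuple(2, 0xD5, true))
        (std::piecewise_construct, std::forward_as_tuple(L"AAD"), std::forward_as_tuple(1, 0xD5, true));
    return table;
}

} // namespace sketch
```

With this, sketch::lookup().count(L"AAD") gives 2, since both AAD entries are kept by the multimap.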

Allocate a type to a uint8_t value

In my project I read the unique ID from an RFID tag, the result is in the form uint8_t TagRead[4].
The result is compared with a number of predefined tag ID values to establish which tag has been read.
For example:
uint8_t RED1[4] = { 0x73, 0xD5, 0xB7, 0xAC };
uint8_t RED2[4] = { 0x7E, 0x27, 0x49, 0x4E };
uint8_t RED3[4] = { 0x02, 0xFD, 0x06, 0x40 };
uint8_t GREEN1[4] = { 0xAB, 0xEC, 0x68, 0x80 };
uint8_t GREEN2[4] = { 0xEE, 0x20, 0x50, 0x4E };
uint8_t GREEN3[4] = { 0x27, 0x06, 0x40, 0x73 };
if (*((uint32_t *)TagRead) == *((uint32_t *)RED2)) {
// RED2 tag has been read
}
else if (*((uint32_t *)TagRead) == *((uint32_t *)GREEN3)) {
// GREEN3 tag has been read
}
My question relates to being able to assign a type/category to a group of tags so that an action can be performed based on the colour of the tag that has been scanned.
It may be that when a RED tag is scanned we switch on a red LED and when a GREEN tag is scanned we switch on a blue LED.
Because there are approximately 50 tags of each colour, I don't want to have to list all the tag names in the if statement. Instead, is it possible to assign the colour to the tag?
It would then be possible to do:
If scanned tag is of type RED, do red action.
If scanned tag is of type GREEN do green action.
Thanks for your help.

You can create a structure with id and a color enum:
enum class Color { red, green };
struct Tag
{
uint8_t id[4];
Color color;
};
Tag RED1 = { { 0x73, 0xD5, 0xB7, 0xAC }, Color::red } ;
Tag RED2 = { { 0x7E, 0x27, 0x49, 0x4E }, Color::red } ;
Tag RED3 = { { 0x02, 0xFD, 0x06, 0x40 }, Color::red } ;
Tag GREEN1 = { { 0xAB, 0xEC, 0x68, 0x80 }, Color::green } ;
Tag GREEN2 = { { 0xEE, 0x20, 0x50, 0x4E }, Color::green } ;
Tag GREEN3 = { { 0x27, 0x06, 0x40, 0x73 }, Color::green } ;
void test(Tag tag)
{
if (tag.color == Color::red)
{
//
}
else if (tag.color == Color::green)
{
}
}
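To get from the scanned bytes to a colour, the known tags can be kept in one table and searched. This is only a sketch: kKnownTags, color_of and the extra Color::unknown value are assumptions of mine, and the IDs are compared with std::memcmp rather than by casting to uint32_t.

```cpp
#include <cstdint>
#include <cstring>

namespace sketch {

// Color::unknown is an added value for IDs that are not in the table.
enum class Color { red, green, unknown };

struct Tag {
    std::uint8_t id[4];
    Color color;
};

// Hypothetical table of known tags; a real project would list all of them.
const Tag kKnownTags[] = {
    { { 0x73, 0xD5, 0xB7, 0xAC }, Color::red   },  // RED1
    { { 0x7E, 0x27, 0x49, 0x4E }, Color::red   },  // RED2
    { { 0xAB, 0xEC, 0x68, 0x80 }, Color::green },  // GREEN1
    { { 0x27, 0x06, 0x40, 0x73 }, Color::green },  // GREEN3
};

// Returns the colour of the tag whose ID matches tag_read,
// or Color::unknown when no entry matches.
Color color_of(const std::uint8_t (&tag_read)[4])
{
    for (const Tag& tag : kKnownTags) {
        if (std::memcmp(tag.id, tag_read, sizeof tag.id) == 0) return tag.color;
    }
    return Color::unknown;
}

} // namespace sketch
```

The calling code then needs a single check such as if (sketch::color_of(TagRead) == sketch::Color::red), no matter how many red tags exist.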

First, your comparison is undefined behaviour. The right way to go is with a std::memcmp. You also need to take care of endianness.
In order to attach properties (like color) to your tags, simply define a struct:
struct rfid_tag
{
uint8_t value[4];
enum { ... } color;
};
Once you got a struct, you can enrich it with operator== so you can use std::find() to lookup the appropriate tag in one line:
#include <cstdint>
#include <iostream>
#include <array>
#include <algorithm>
#include <cstring>
struct rfid_tag
{
enum color_type { red = 10, blue = 11 };
std::array<uint8_t, 4> value;
color_type color;
};
bool operator==(std::array<uint8_t, 4> const& tagvalue, rfid_tag const& rhs)
{
return std::memcmp(tagvalue.data(), rhs.value.data(), rhs.value.size()) == 0;
}
bool operator==(rfid_tag const& lhs, std::array<uint8_t, 4> const& tagvalue)
{
return tagvalue == lhs;
}
static const std::array<rfid_tag, 3> known_tags = {
rfid_tag{ { 0x00, 0x01, 0x02, 0x03 }, rfid_tag::red },
rfid_tag{ { 0x10, 0x11, 0x12, 0x13 }, rfid_tag::blue },
rfid_tag{ { 0x20, 0x21, 0x22, 0x23 }, rfid_tag::red }
};
int main()
{
const std::array<uint8_t, 4> tag_to_find{ 0x10, 0x11, 0x12, 0x13 };
std::cout << std::find(begin(known_tags), end(known_tags), tag_to_find)->color << "\n"; // outputs "11" as expected
}

There are multiple ways.
You could write a struct which contains your tag along with your color, like this:
struct ColoredTag
{
uint8_t value[4];
std::string color;
};
ColoredTag RED1 = {{ 0x73, 0xD5, 0xB7, 0xAC }, "Red"};
ColoredTag RED2 = {{ 0x7E, 0x27, 0x49, 0x4E }, "Red"};
ColoredTag RED3 = {{ 0x02, 0xFD, 0x06, 0x40 }, "Red"};
ColoredTag GREEN1 = {{ 0xAB, 0xEC, 0x68, 0x80 }, "Green"};
ColoredTag GREEN2 = {{ 0xEE, 0x20, 0x50, 0x4E }, "Green"};
ColoredTag GREEN3 = {{ 0x27, 0x06, 0x40, 0x73 }, "Green"};
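Or you can use a std::map to assign a color to a tag. A raw array cannot be used as a map key, so the ID is held in a std::array here:
#include <array>
#include <map>
#include <string>
typedef std::array<uint8_t, 4> TagId;
std::map<TagId, std::string> tags;
void fillTags()
{
tags[TagId{{ 0x73, 0xD5, 0xB7, 0xAC }}] = "Red"; // RED1
tags[TagId{{ 0x7E, 0x27, 0x49, 0x4E }}] = "Red"; // RED2
//...
}
std::string getColor(const TagId& tag)
{
std::map<TagId, std::string>::const_iterator it = tags.find(tag);
return it != tags.end() ? it->second : std::string(); // empty if unknown
}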
There might be some more solutions for this issue, but these are the ones that came to my mind first.

Error when hooking a function, "Stack around the variable x was corrupted."? C++

I'm trying to hook a function on x64 application. Here's my code:
int __stdcall nRecv(SOCKET s, char* buf, int len, int flags)
{
Log("Receving!");
return 0;
}
BOOL HookFunction(LPCWSTR moduleName, LPCSTR funcName, LPVOID pDestination)
{
BYTE stub[6] = { 0xe9, 0x00, 0x00, 0x00, 0x00, 0xc3 };
DWORD pProtection;
DWORD pSource = (DWORD)GetProcAddress(GetModuleHandle(moduleName), funcName);
LPVOID pTrampoline = VirtualAlloc(NULL, 6 + sizeof(stub), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
VirtualProtect((LPVOID)pSource, 6, PAGE_EXECUTE_READWRITE, &pProtection);
CopyMemory(stub + 1, &pDestination, 4);
CopyMemory((LPVOID)((DWORD_PTR)pTrampoline), &pSource, 6);
CopyMemory((LPVOID)((DWORD_PTR)pTrampoline + 6), stub, sizeof(stub));
CopyMemory(stub + 1, &pTrampoline, 4);
CopyMemory(&pSource, &stub, sizeof(stub));
VirtualProtect((LPVOID)pSource, 6, pProtection, NULL);
return TRUE;
}
BOOL recvHook = HookFunction(L"ws2_32.dll", "recv", &nRecv);
I attached a debugger and spotted an error:
Stack around the variable pSource was corrupted.
I couldn't really find a definitive reason for this happening, am I doing something wrong? Thanks!

This line copies 6 bytes (sizeof(stub)) into pSource, which is a 4-byte DWORD, so it overruns the variable and corrupts the stack:
CopyMemory(&pSource, &stub, sizeof(stub));

Send Hex Bytes To A Serial Port

I am attempting to send hexadecimal bytes to a serial com port. The issue is that the segment that sends the command apparently wants a system string instead of an integer (error C2664 "cannot convert parameter 1 from 'int' to 'System::String ^'). I have looked for a way to send an integer instead but have had no luck. (I have tried sending string representations of the hexadecimal values, but the device did not recognize the commands)
Main part of Code
private: System::Void poll_Click(System::Object^ sender, System::EventArgs^ e)
{
int i, end;
double a = 1.58730159;
String^ portscan = "port";
String^ translate;
std::string portresponse [65];
std::fill_n(portresponse, 65, "Z");
for (i=1;i<64;i++)
{
if(this->_serialPort->IsOpen)
{
// Command 0 generator
int y = 2;
y += i;
int command0[10] = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x02, dectohex(i), 0x00, 0x00, dectohex(y)};
for (end=0;end<10;end++)
{
this->_serialPort->WriteLine(command0[end]);
}
translate = (this->_serialPort->ReadLine());
MarshalString(translate, portresponse [i]);
if(portresponse [i] != "Z")
{
comboBox7->Items->Add(i);
}
this->progressBar1->Value=a;
a += 1.58730159;
}
}
}
Here is the function dectohex:
int dectohex(int i)
{
int x = 0;
char hex_array[10];
sprintf (hex_array, "0x%02X", i);
string hex_string(hex_array);
x = atoi(hex_string.c_str());
return x;
}
This is what solved my problem, courtesy of Jochen Kalmbach
auto data = gcnew array<System::Byte> { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x02, 0xBF, 0x00, 0x00, 0xBD };
_serialPort->Write(data, 0, data->Length);
Replaced this
this->_serialPort->WriteLine(command0[end]);

You cannot send an integer over a serial line as such; you can only send BYTES (7-8 bits each)!
You need to choose what you want to do:
Send characters: the "number" 12 is converted into the bytes of its text representation
_serialPort->Write(System::Convert::ToString(12));
// => 0x31, 0x32 (ASCII '1' and '2', i.e. decimal 49 and 50)
Send the integer (4 bytes) as little endian
auto data = System::BitConverter::GetBytes(12);
_serialPort->Write(data, 0, data->Length);
// => 0x0c, 0x00, 0x00, 0x00
Or write just a single byte:
auto data = gcnew array<System::Byte> { 12 };
_serialPort->Write(data, 0, data->Length);
// => 0x0c
Or write a byte array:
auto data = gcnew array<System::Byte> { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x02, 0xBF, 0x00, 0x00, 0xBD };
_serialPort->Write(data, 0, data->Length);
// => 0xFF 0xFF 0xFF 0xFF 0xFF 0x02 0xBF 0x00 0x00 0xBD
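The difference between these encodings can also be shown in plain standard C++, independent of the SerialPort API. This is only an illustration; the function names are made up, and the little-endian layout is built explicitly with shifts so it does not depend on the machine.

```cpp
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

typedef std::vector<std::uint8_t> Bytes;

// The number as text: one byte per decimal digit character.
Bytes as_text(int value)
{
    const std::string s = std::to_string(value);
    return Bytes(s.begin(), s.end());
}

// The number as a 4-byte integer, least significant byte first.
Bytes as_little_endian(std::uint32_t value)
{
    Bytes out;
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
    return out;
}

// The number as one raw byte (the caller must pass a value in 0..255).
Bytes as_single_byte(std::uint8_t value)
{
    return Bytes(1, value);
}

} // namespace sketch
```

For the value 12 these give 0x31 0x32, then 0x0C 0x00 0x00 0x00, then 0x0C. A device that expects raw command bytes needs the last form, which is why the string representations were not recognized.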

Invalid Algorithm Specified CryptoAPI

I am trying to decrypt something using 128-bit AES decryption. When I attempt to call CryptDecrypt, I get an error stating "Invalid Algorithm Specified". I get the same problem when using the library posted here: http://www.codeproject.com/KB/security/WinAES.aspx
What can cause this error?
I am using CryptoAPI on 64-bit Vista with Visual Studio 2008. I checked the registry and the AES library is there...
EDIT
BYTE*& encryptedData /* get data length */
HCRYPTPROV cryptoHandle = NULL;
HCRYPTKEY aesKeyHandle = NULL;
hr = InitWinCrypt(cryptoHandle);
if(FAILED(hr))
{
return hr;
}
AesKeyOffering aesKey = { {PLAINTEXTKEYBLOB, CUR_BLOB_VERSION, 0, CALG_AES_128}, 16, { 0xFF, 0x00, 0xFF, 0x1C, 0x1D, 0x1E, 0x03, 0x04, 0x05, 0x0F, 0x20, 0x21, 0xAD, 0xAF, 0xA4, 0x04 }};
if(CryptImportKey(cryptoHandle, (CONST BYTE*)&aesKey, sizeof(AesKeyOffering), NULL, 0, &aesKeyHandle) == FALSE)
{
// DO error
return HRESULT_FROM_WIN32(GetLastError());
}
if(CryptSetKeyParam(aesKeyHandle, KP_IV, { 0xFF, 0x00, 0xFF, 0x1C, 0x1D, 0x1E, 0x03, 0x04, 0x05, 0x0F, 0x20, 0x21, 0xAD, 0xAF, 0xA4, 0x04 } , 0) == FALSE)
{
return HRESULT_FROM_WIN32(GetLastError());
}
BYTE blah2 = CRYPT_MODE_CBC;
// set block mode
if(CryptSetKeyParam(aesKeyHandle, KP_MODE, &blah2, 0) == FALSE)
{
//
return HRESULT_FROM_WIN32(GetLastError());
}
DWORD lol = dataLength / 16 + 1;
DWORD lol2 = lol * 16;
if(CryptDecrypt(aesKeyHandle, 0, TRUE, 0, encryptedData, &lol2) == FALSE)
{
return HRESULT_FROM_WIN32(GetLastError());
}
InitWinCrypt function
if(!CryptAcquireContextW(&cryptoHandle, NULL, L"Microsoft Enhanced RSA and AES Cryptographic Provider", PROV_RSA_AES, CRYPT_VERIFYCONTEXT))
{
if(!CryptAcquireContextW(&cryptoHandle, NULL, L"Microsoft Enhanced RSA and AES Cryptographic Provider", PROV_RSA_AES, 0))
{
return HRESULT_FROM_WIN32(GetLastError());
}
else
{
return S_OK;
}
}
return S_OK;
AesOffering struct:
struct AesKeyOffering
{
BLOBHEADER m_Header;
DWORD m_KeyLength;
BYTE Key[16];
};
EDIT2
After rebooting my computer and removing the CBC chunk, I am now getting Bad Data errors. The data decrypts fine in C#, but I need to do this using wincrypt.

Are you passing cryptoHandle by reference to InitWinCrypt? If not, your code
if(!CryptAcquireContextW(&cryptoHandle, ...
would only modify InitWinCrypt's copy of cryptoHandle.
EDIT: Given that you do, try getting rid of the CryptSetKeyParam call which sets CRYPT_MODE_CBC.
