How do I encode a string to base64 using only boost? - c++

I'm trying to quickly encode a simple ASCII string to base64 (Basic HTTP Authentication using boost::asio) without pasting in any new code or using any libraries beyond boost.
Simple signature would look like:
string Base64Encode(const string& text);
Again I realize the algorithm is easy and there are many libraries/examples doing this but I'm looking for a clean boost example. I found boost serialization but no clear examples there or from Google.
http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/dataflow.html
Is this possible without adding the actual base64 algorithm explicitly to my code?

Here is my solution. It uses the same basic technique as the other solutions on this page, but solves the problem of the padding in what I feel is a more elegant way. This solution also makes use of C++11.
I think that most of the code is self-explanatory. The bit of math in the encode function calculates the number of '=' characters we need to add. The modulo 3 of val.size() gives the remainder, but what we really want is the difference between val.size() and the next number divisible by three. Since we have the remainder we can just subtract it from 3, but that leaves 3 in the case where we want 0, so we have to take modulo 3 one more time.
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/algorithm/string.hpp>
std::string decode64(const std::string &val) {
using namespace boost::archive::iterators;
using It = transform_width<binary_from_base64<std::string::const_iterator>, 8, 6>;
return boost::algorithm::trim_right_copy_if(std::string(It(std::begin(val)), It(std::end(val))), [](char c) {
return c == '\0';
});
}
std::string encode64(const std::string &val) {
using namespace boost::archive::iterators;
using It = base64_from_binary<transform_width<std::string::const_iterator, 6, 8>>;
auto tmp = std::string(It(std::begin(val)), It(std::end(val)));
return tmp.append((3 - val.size() % 3) % 3, '=');
}
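The padding arithmetic described above can be checked in isolation, without Boost. The following is a minimal sketch; padding_count is a hypothetical helper name used only for illustration and is not part of the answer's code.

```cpp
#include <cstddef>

// Hypothetical helper (not from the answer): number of '=' characters that
// must be appended after encoding input_size bytes.
constexpr std::size_t padding_count(std::size_t input_size) {
    // input_size % 3 is the remainder. 3 - remainder is the distance to the
    // next multiple of three, but it gives 3 when the remainder is 0, so the
    // second "% 3" folds that case back to 0.
    return (3 - input_size % 3) % 3;
}

// Compile-time checks covering each possible remainder.
static_assert(padding_count(3) == 0, "remainder 0: no padding");
static_assert(padding_count(4) == 2, "remainder 1: two '=' characters");
static_assert(padding_count(5) == 1, "remainder 2: one '=' character");
```

This is the same expression that encode64 passes to tmp.append, so the three cases line up with "", "==" and "=".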

I improved the example in the link you provided a little:
#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/archive/iterators/ostream_iterator.hpp>
#include <sstream>
#include <string>
#include <iostream>
int main()
{
using namespace boost::archive::iterators;
std::string test = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce ornare ullamcorper ipsum ac gravida.";
std::stringstream os;
typedef
insert_linebreaks< // insert line breaks every 72 characters
base64_from_binary< // convert binary values to base64 characters
transform_width< // retrieve 6 bit integers from a sequence of 8 bit bytes
const char *,
6,
8
>
>
,72
>
base64_text; // compose all the above operations in to a new iterator
std::copy(
base64_text(test.c_str()),
base64_text(test.c_str() + test.size()),
ostream_iterator<char>(os)
);
std::cout << os.str();
}
This prints the base64-encoded string, nicely formatted with a line break every 72 characters, onto the console, ready to be put into an email. If you don't like the line breaks, just stay with this:
typedef
base64_from_binary<
transform_width<
const char *,
6,
8
>
>
base64_text;

You could use beast's implementation.
For boost version 1.71, the functions are:
boost::beast::detail::base64::encode()
boost::beast::detail::base64::encoded_size()
boost::beast::detail::base64::decode()
boost::beast::detail::base64::decoded_size()
From #include <boost/beast/core/detail/base64.hpp>
For older versions back to beast's inclusion in 1.66, the functions are:
boost::beast::detail::base64_encode()
boost::beast::detail::base64_decode()
From #include <boost/beast/core/detail/base64.hpp>

Another solution using boost base64 encode decode:
const std::string base64_padding[] = {"", "==","="};
std::string base64_encode(const std::string& s) {
namespace bai = boost::archive::iterators;
std::stringstream os;
// convert binary values to base64 characters
typedef bai::base64_from_binary
// retrieve 6 bit integers from a sequence of 8 bit bytes
<bai::transform_width<const char *, 6, 8> > base64_enc; // compose all the above operations in to a new iterator
std::copy(base64_enc(s.c_str()), base64_enc(s.c_str() + s.size()),
std::ostream_iterator<char>(os));
os << base64_padding[s.size() % 3];
return os.str();
}
std::string base64_decode(const std::string& s) {
namespace bai = boost::archive::iterators;
std::stringstream os;
typedef bai::transform_width<bai::binary_from_base64<const char *>, 8, 6> base64_dec;
unsigned int size = s.size();
// Remove the padding characters, cf. https://svn.boost.org/trac/boost/ticket/5629
if (size && s[size - 1] == '=') {
--size;
if (size && s[size - 1] == '=') --size;
}
if (size == 0) return std::string();
std::copy(base64_dec(s.data()), base64_dec(s.data() + size),
std::ostream_iterator<char>(os));
return os.str();
}
And here are the test cases:
std::string t_e[TESTSET_SIZE] = {
""
, "M"
, "Ma"
, "Man"
, "pleasure."
, "leasure."
, "easure."
, "asure."
, "sure."
};
std::string t_d[TESTSET_SIZE] = {
""
, "TQ=="
, "TWE="
, "TWFu"
, "cGxlYXN1cmUu"
, "bGVhc3VyZS4="
, "ZWFzdXJlLg=="
, "YXN1cmUu"
, "c3VyZS4="
};
Hope this helps
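To cross-check the expected strings in the test cases independently of Boost, a small hand-written encoder can serve as a test oracle. This is only a sketch under that assumption: reference_base64 is a made-up name, and the function is meant for verifying test vectors, not as a replacement for the Boost-based functions above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical reference encoder used only to verify expected test strings.
std::string reference_base64(const std::string& in) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((in.size() + 2) / 3 * 4);
    for (std::size_t i = 0; i < in.size(); i += 3) {
        const std::size_t remaining = in.size() - i;
        // Gather up to three bytes into a 24-bit group, zero-filling the rest.
        std::uint32_t group = static_cast<unsigned char>(in[i]) << 16;
        if (remaining > 1) group |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (remaining > 2) group |= static_cast<unsigned char>(in[i + 2]);
        // Emit four 6-bit symbols, replacing unused ones with '='.
        out += alphabet[(group >> 18) & 0x3F];
        out += alphabet[(group >> 12) & 0x3F];
        out += remaining > 1 ? alphabet[(group >> 6) & 0x3F] : '=';
        out += remaining > 2 ? alphabet[group & 0x3F] : '=';
    }
    return out;
}
```

Comparing reference_base64(t_e[i]) against t_d[i] for each index gives a check that does not depend on the code under test.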

For anyone coming here from Google, here are my base64 encode/decode functions based on boost. They handle padding correctly, as per DanDan's comment above. The decode function stops when it encounters an illegal character and returns a pointer to that character, which is great if you're parsing base64 in JSON or XML.
///
/// Convert up to len bytes of binary data in src to base64 and store it in dest
///
/// \param dest Destination buffer to hold the base64 data.
/// \param src Source binary data.
/// \param len The number of bytes of src to convert.
///
/// \return The number of characters written to dest.
/// \remarks Does not store a terminating null in dest.
///
uint base64_encode(char* dest, const char* src, uint len)
{
char tail[3] = {0,0,0};
typedef base64_from_binary<transform_width<const char *, 6, 8> > base64_enc;
uint one_third_len = len/3;
uint len_rounded_down = one_third_len*3;
uint j = len_rounded_down + one_third_len;
std::copy(base64_enc(src), base64_enc(src + len_rounded_down), dest);
if (len_rounded_down != len)
{
uint i=0;
for(; i < len - len_rounded_down; ++i)
{
tail[i] = src[len_rounded_down+i];
}
std::copy(base64_enc(tail), base64_enc(tail + 3), dest + j);
for(i=len + one_third_len + 1; i < j+4; ++i)
{
dest[i] = '=';
}
return i;
}
return j;
}
///
/// Convert null-terminated string src from base64 to binary and store it in dest.
///
/// \param dest Destination buffer
/// \param src Source base64 string
/// \param len Pointer to unsigned int representing the size of the dest buffer. After the function returns, this is set to the number of characters written to dest.
///
/// \return Pointer to first character in source that could not be converted (the terminating null on success)
///
const char* base64_decode(char* dest, const char* src, uint* len)
{
uint output_len = *len;
typedef transform_width<binary_from_base64<const char*>, 8, 6> base64_dec;
uint i=0;
try
{
base64_dec src_it(src);
for(; i < output_len; ++i)
{
*dest++ = *src_it;
++src_it;
}
}
catch(dataflow_exception&)
{
}
*len = i;
return src + (i+2)/3*4; // bytes in = bytes out / 3 rounded up * 4
}

While the encoding works, the decoder is certainly broken. There is also an open bug report: https://svn.boost.org/trac/boost/ticket/5629.
I have not found a fix for that.

This is another answer:
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/transform_width.hpp>
std::string ToBase64(const std::vector<unsigned char>& binary)
{
using namespace boost::archive::iterators;
using It = base64_from_binary<transform_width<std::vector<unsigned char>::const_iterator, 6, 8>>;
auto base64 = std::string(It(binary.begin()), It(binary.end()));
// Add padding.
return base64.append((3 - binary.size() % 3) % 3, '=');
}
std::vector<unsigned char> FromBase64(const std::string& base64)
{
using namespace boost::archive::iterators;
using It = transform_width<binary_from_base64<std::string::const_iterator>, 8, 6>;
auto binary = std::vector<unsigned char>(It(base64.begin()), It(base64.end()));
// Remove padding.
auto length = base64.size();
if(binary.size() > 2 && base64[length - 1] == '=' && base64[length - 2] == '=')
{
binary.erase(binary.end() - 2, binary.end());
}
else if(binary.size() > 1 && base64[length - 1] == '=')
{
binary.erase(binary.end() - 1, binary.end());
}
return binary;
}

Base64 encode text and data
const std::string base64_padding[] = {"", "==","="};
std::string base64EncodeText(std::string text) {
using namespace boost::archive::iterators;
typedef std::string::const_iterator iterator_type;
typedef base64_from_binary<transform_width<iterator_type, 6, 8> > base64_enc;
std::stringstream ss;
std::copy(base64_enc(text.begin()), base64_enc(text.end()), ostream_iterator<char>(ss));
ss << base64_padding[text.size() % 3];
return ss.str();
}
std::string base64EncodeData(std::vector<uint8_t> data) {
using namespace boost::archive::iterators;
typedef std::vector<uint8_t>::const_iterator iterator_type;
typedef base64_from_binary<transform_width<iterator_type, 6, 8> > base64_enc;
std::stringstream ss;
std::copy(base64_enc(data.begin()), base64_enc(data.end()), ostream_iterator<char>(ss));
ss << base64_padding[data.size() % 3];
return ss.str();
}

I modified Answer 8 because it did not work on my platform.
const std::string base64_padding[] = {"", "==","="};
std::string *m_ArchiveData;
/// \brief To Base64 string
bool Base64Encode(string* output)
{
try
{
UInt32 iPadding_Mask = 0;
typedef boost::archive::iterators::base64_from_binary
<boost::archive::iterators::transform_width<const char *, 6, 8> > Base64EncodeIterator;
UInt32 len = m_ArchiveData->size();
std::stringstream os;
std::copy(Base64EncodeIterator(m_ArchiveData->c_str()),
Base64EncodeIterator(m_ArchiveData->c_str()+len),
std::ostream_iterator<char>(os));
iPadding_Mask = m_ArchiveData->size() % 3;
os << base64_padding[iPadding_Mask];
*output = os.str();
return output->empty() == false;
}
catch (...)
{
PLOG_ERROR_DEV("unknown error happens");
return false;
}
}
/// \brief From Base64 string
bool mcsf_data_header_byte_stream_archive::Base64Decode(const std::string *input)
{
try
{
std::stringstream os;
bool bPaded = false;
typedef boost::archive::iterators::transform_width<boost::archive::iterators::
binary_from_base64<const char *>, 8, 6> Base64DecodeIterator;
UInt32 iLength = input->length();
// Remove the padding characters, cf. https://svn.boost.org/trac/boost/ticket/5629
if (iLength && (*input)[iLength-1] == '=') {
bPaded = true;
--iLength;
if (iLength && (*input)[iLength - 1] == '=')
{
--iLength;
}
}
if (iLength == 0)
{
return false;
}
if(bPaded)
{
iLength --;
}
copy(Base64DecodeIterator(input->c_str()) ,
Base64DecodeIterator(input->c_str()+iLength),
ostream_iterator<char>(os));
*m_ArchiveData = os.str();
return m_ArchiveData->empty() == false;
}
catch (...)
{
PLOG_ERROR_DEV("unknown error happens");
return false;
}
}

Related

String into binary

I used Huffman encoding that we wrote to compress a file.
The function takes String and its output is String.
The problem is I want to save it as binary to get lower size than the original size, but when I take it back (0's and 1's ) as a string its size is larger than the main file. How can I convert that string of (0's and 1's) to a binary so that every character is saved in 1 bit? I am using Qt to achieve this:
string Huffman_encoding(string text)
{
buildHuffmanTree(text);
string encoded = "";
unordered_map<char, string> StringEncoded;
encoding(main_root, "", StringEncoded);
for (char ch : text) {
encoded += StringEncoded[ch];
}
return encoded;
}
The canonical solution uses a "bit packer" that accepts bitstrings and emits packed bytes. As a start, replace encoded with an instance of the following:
class BitPacker {
QByteArray res;
quint8 bitsLeft = 8;
quint8 buf = 0;
public:
void operator+=(const std::string& s) {
for (auto c : s) {
buf = buf << 1 | c - '0';
if (--bitsLeft == 0) {
res.append(buf);
buf = 0;
bitsLeft = 8;
}
}
}
QByteArray finish() {
if (bitsLeft < 8) {
res.append(buf << bitsLeft);
buf = 0;
bitsLeft = 8;
}
return res;
}
};
operator+= will add additional bits to buf and flush complete bytes to res. At the end of the process you may be left with, say, 3 bits. finish uses a simple algorithm: it pads the buffer with zeroes to produce a final byte and hands you back the fully encoded buffer.
A more sophisticated solution might be to introduce an explicit "end of stream" token that is not present in the source character set.
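An alternative to an end-of-stream token, sketched here as a different technique, is to store the number of valid bits next to the packed bytes so the decoder knows how much of the last byte is padding. The sketch below uses only the standard library rather than Qt, and the names PackedBits, pack_bits and unpack_bits are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical container: packed bytes plus the number of meaningful bits.
struct PackedBits {
    std::vector<std::uint8_t> bytes;
    std::size_t bit_count = 0;
};

// Packs a string of '0'/'1' characters, most significant bit first.
// Any character other than '1' is treated as a zero bit.
PackedBits pack_bits(const std::string& bits) {
    PackedBits out;
    out.bit_count = bits.size();
    out.bytes.assign((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        if (bits[i] == '1') {
            out.bytes[i / 8] |= static_cast<std::uint8_t>(0x80u >> (i % 8));
        }
    }
    return out;
}

// Restores exactly bit_count characters, ignoring the zero padding.
std::string unpack_bits(const PackedBits& in) {
    std::string bits;
    bits.reserve(in.bit_count);
    for (std::size_t i = 0; i < in.bit_count; ++i) {
        bits += ((in.bytes[i / 8] >> (7 - i % 8)) & 1) ? '1' : '0';
    }
    return bits;
}
```

When writing to a file, bit_count would have to be stored as well (for example in a small header) so that the Huffman decoder does not interpret the padding bits as symbols.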
Seems what you're searching for is a way to convert a string containing a sequence of 0s and 1s like "0000010010000000" to an actual binary representation (numbers 4 and 128 in this example).
This could be achieved with a function like this:
#include <algorithm>
#include <iostream>
#include <string>
#include <cstdint>
#include <vector>
std::vector<uint8_t> toBinary(std::string const& binStr)
{
std::vector<uint8_t> result;
result.reserve(binStr.size() / 8);
size_t pos = 0;
size_t len = binStr.length();
while (pos < len)
{
size_t curLen = std::min(static_cast<size_t>(8), len-pos);
auto curStr = binStr.substr(pos, curLen) + std::string(8-curLen, '0');
result.push_back(std::stoi(curStr, 0, 2));
pos += 8;
}
return result;
}
// test:
int main()
{
std::string binStr("000001001000000001");
auto bin = toBinary(binStr);
for (auto i: bin)
{
std::cout << static_cast<int>(i) << " ";
}
return 0;
}
Output:
4 128 64
You can then do whatever you want with these numbers, e.g. write them into a binary file.
Note that toBinary, as written above, pads the last byte with zeros if it is incomplete.
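Writing the resulting bytes into a binary file, as mentioned above, could look roughly like this. It is a minimal sketch using standard streams; write_bytes and read_file_bytes are hypothetical helper names, and the file name in the usage note is made up.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: writes the bytes verbatim, one byte per value.
bool write_bytes(const std::string& path, const std::vector<std::uint8_t>& bytes) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    out.close(); // flush before reporting success
    return !out.fail();
}

// Hypothetical helper: reads the whole file back into a byte vector.
// Returns an empty vector if the file cannot be opened.
std::vector<std::uint8_t> read_file_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}
```

For example, write_bytes("encoded.bin", toBinary(binStr)) would store 18 bits of input in 3 bytes instead of 18 characters. Opening the streams with std::ios::binary avoids newline translation on platforms that perform it.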
You can create a bitstream using bitwise logic like this :
#include <cassert>
#include <cstdint>
#include <string>
#include <stdexcept>
#include <vector>
auto to_bit_stream(const std::string& bytes)
{
std::vector<std::uint8_t> stream;
std::uint8_t shift{ 0 };
std::uint8_t out{ 0 };
// allocate enough bytes to hold the bits
// speeds up the code a bit
stream.reserve((bytes.size() + 7) / 8);
// loop over all bytes
for (const auto c : bytes)
{
// check input
if (!((c == '0') || (c == '1'))) throw std::invalid_argument("invalid character in input");
// shift output by one to accept next bit
out <<= 1;
// keep track of number of shifts
// after 8 shifts a byte has been filled
shift++;
// or the output with a 1 if needed
out |= (c == '1');
// complete an output byte
if (shift == 8)
{
stream.push_back(out);
out = 0;
shift = 0;
}
}
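// flush a trailing partial byte, padding the low bits with zeros
if (shift != 0) stream.push_back(static_cast<std::uint8_t>(out << (8 - shift)));
return stream;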
}
int main()
{
// stream is 8 bits per value, values 0,1,2,3
auto stream = to_bit_stream("00000000000000010000001000000011");
assert(stream.size() == 4ul);
assert(stream[0] == 0);
assert(stream[1] == 1);
assert(stream[2] == 2);
assert(stream[3] == 3);
return 0;
}
Use std::stoi()
int n = std::stoi("01000100", nullptr, 2);

Sorting string vector using integer values at the end of the string in C++

I have a directory containing files {"good_6", "good_7", "good_8", ..., "good_660"}. After reading it using readdir and storing the names in a vector, I get {"good_10", "good_100", "good_101", "good_102", ...}.
What I want to do is keep the file names in the vector as {"good_6", "good_7", "good_8", ..., "good_660"} and then replace the first name with 1, the second with 2, and so on, such that good_6 will be 1, good_7 will be 2, and so on. But now good_10 corresponds to 1, good_100 to 2, and so on.
I tried std::sort on vector but the values are already sorted, just not in a way that I desire (based on integer after _). Even if I just get the last integer and sort on that, it will still be sorted as 1, 100, 101...
Any help would be appreciated. Thanks.
You can use a custom function that compares strings with a special case for digits:
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
int natural_string_cmp(const char *sa, const char *sb) {
for (;;) {
int a = (unsigned char)*sa++;
int b = (unsigned char)*sb++;
/* simplistic version with overflow issues */
if (isdigit(a) && isdigit(b)) {
const char *sa1 = sa - 1;
const char *sb1 = sb - 1;
unsigned long na = strtoul(sa1, (char **)&sa, 10);
unsigned long nb = strtoul(sb1, (char **)&sb, 10);
if (na == nb) {
if ((sa - sa1) == (sb - sb1)) {
/* XXX should check for '.' */
continue;
} else {
/* Perform regular strcmp to handle 0 :: 00 */
return strcmp(sa1, sb1);
}
} else {
return (na < nb) ? -1 : +1;
}
} else {
if (a == b) {
if (a != '\0')
continue;
else
return 0;
} else {
return (a < b) ? -1 : 1;
}
}
}
}
Depending on your sorting algorithm, you may need to wrap it with an extra level of indirection:
int natural_string_cmp_ind(const void *p1, const void *p2) {
return natural_string_cmp(*(const char * const *)p1, *(const char * const *)p2);
}
char *array[size];
... // array is initialized with filenames
qsort(array, size, sizeof(*array), natural_string_cmp_ind);
I think you can play around with your data structure. For example, instead of vector<string>, you can convert your data to vector< pair<int, string> >. Then {"good_6", "good_7", "good_8", ..., "good_660"} becomes {(6, "good"), (7, "good"), (8, "good"), ..., (660, "good")}. In the end, you convert it back and do whatever you want.
Another way is just to define your own comparator to do the exact comparison as what you want.
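The pair-based idea can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: sort_by_numeric_suffix is a hypothetical name, every file name is assumed to end in '_' followed by decimal digits, and the full name is kept in the pair (rather than only the prefix) so that no conversion back is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: orders names by the integer after the last '_'.
// std::stoi throws if a name does not end in "_<digits>".
std::vector<std::string> sort_by_numeric_suffix(const std::vector<std::string>& names) {
    std::vector<std::pair<int, std::string>> keyed;
    keyed.reserve(names.size());
    for (const std::string& name : names) {
        const std::size_t pos = name.rfind('_');
        keyed.emplace_back(std::stoi(name.substr(pos + 1)), name);
    }
    // Pairs compare by the integer first, then by the string.
    std::sort(keyed.begin(), keyed.end());
    std::vector<std::string> sorted;
    sorted.reserve(keyed.size());
    for (const auto& entry : keyed) sorted.push_back(entry.second);
    return sorted;
}
```

After sorting, the position of each name in the returned vector (plus one) gives the 1, 2, 3, ... numbering asked for in the question.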
You can use string::substr to drop the "good_" prefix, and use stoi to convert the remaining integral part of the string. Let's say the value obtained is x.
Create a std::map and populate it in this way: myMap[x] = vec_element.
Then you can traverse from myMap.begin() to myMap.end() to get the sorted order.
Code (the first line runs once for each element vec[i]):
// substr(5) drops "good_" without modifying vec[i], unlike replace
myMap[ stoi( vec[i].substr(5) ) ] = vec[i];
for( MapType::iterator it = myMap.begin(); it != myMap.end(); ++it ) {
sortedVec.push_back( it->second );
}
If I understand your question, you're just having trouble with the sorting and not how you plan to change the names after you sort.
Something like this might work for you:
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <random>
#include <tuple>
#include <cstdio>
int main()
{
std::vector<std::string> v;
char buffer[64] = {};
for (int i = 1; i < 10; ++i)
{
std::snprintf(buffer, sizeof buffer, "good_%d", i * 3);
v.push_back(buffer);
std::snprintf(buffer, sizeof buffer, "bad_%d", i * 2);
v.push_back(buffer);
}
// std::random_shuffle was removed in C++17; std::shuffle replaces it.
std::shuffle(v.begin(), v.end(), std::mt19937{std::random_device{}()});
for (const auto& s : v)
{
std::cout << s << "\n";
}
std::sort(v.begin(), v.end(),
[](const std::string& lhs, const std::string& rhs)
{
//This assumes a lot about the contents of the strings
//and has no error checking just to keep things short.
size_t l_pos = lhs.find('_');
size_t r_pos = rhs.find('_');
std::string l_str = lhs.substr(0, l_pos);
std::string r_str = rhs.substr(0, r_pos);
int l_num = std::stoi(lhs.substr(l_pos + 1));
int r_num = std::stoi(rhs.substr(r_pos + 1));
return std::tie(l_str, l_num) < std::tie(r_str, r_num);
});
std::cout << "-----\n";
for (const auto& s : v)
{
std::cout << s << "\n";
}
return 0;
}
I managed to do it with the following compare function:
bool numericStringCompare(const std::string& s1, const std::string& s2)
{
size_t foundUnderScore = s1.find_last_of("_");
size_t foundDot = s1.find_last_of(".");
std::string s11 = s1.substr(foundUnderScore+1, foundDot - foundUnderScore - 1);
foundUnderScore = s2.find_last_of("_");
foundDot = s2.find_last_of(".");
std::string s22 = s2.substr(foundUnderScore+1, foundDot-foundUnderScore - 1);
int i1 = std::stoi(s11);
int i2 = std::stoi(s22);
return i1 < i2;
}
The full file name was good_0.png, hence the find_last_of(".").

Convert vector<string> to unsigned char array in C++

I have a string vector that holds some values. These values are supposed to be hex bytes but are being stored as strings inside this vector.
The bytes were read from inside a text file actually, something like this:
(contents of the text file)
<jpeg1>
0xFF,0xD8,0xFF,0xE0,0x00,0x10,0x4A,0x46,0x49,0x46,0x00,0x01,0x01,0x01,0x00,0x60
</jpeg1>
So far, my code starts reading at the line after the <jpeg1> tag, continues until the </jpeg1> tag, and then, using the comma ',' as a delimiter, stores the bytes in the string vector.
After splitting the string, the vector currently stores the values like this:
vector<string> myString = {"0xFF", "0xD8", "0xFF", "0xE0", "0x00", "0x10", "0x4A", "0x46", "0x49", "0x46", "0x00", "0x01", "0x01", "0x01", "0x00", "0x60"};
and if I print this, I get the following:
0: 0xFF
1: 0xD8
2: 0xFF
3: 0xE0
4: 0x00
5: 0x10
6: 0x4A
7: 0x46
8: 0x49
9: 0x46
What I'd like is to store these bytes in an unsigned char array, such that each element is treated as a hex byte and not as a string value.
Preferably something like this :
unsigned char myHexArray[] = {0xFF,0xD8,0xFF,0xE0,0x00,0x10,0x4A,0x46,0x49,0x46,0x00,0x01,0x01,0x01,0x00,0x60};
If I print this, I get:
0:  
1: ╪
2:  
3: α
4:
5:
6: J
7: F
8: I
9: F
Solved!
Thanks for your help, guys. So far ranban282's solution has worked for me; I'll try the solutions provided by other users as well.
I wouldn't even go through the std::vector<std::string> stage, you don't need it and it wastes a lot of allocations for no good reason; just parse the string to bytes "online".
If you already have an istream for your data, you can parse it straight from the stream, although I have had terrible experiences with its performance.
// is is some derived class of std::istream
std::vector<unsigned char> ret;
while(is) {
int val = 0;
is>>std::hex>>val;
if(!is) {
break; // failed conversion; remember to clean up the stream
// if you need it later!
}
ret.push_back(val);
if(is.get()!=',') break;
}
If instead you have it in a string - as often happens when extracting data from an XML file, you can parse it either using istringstream and the code above (one extra string copy + generally quite slow), or parse it straight from the string using e.g. sscanf with %i; say that your string is in a const char *sz:
std::vector<unsigned char> ret;
for(; *sz; ++sz) {
int read = 0;
int val = 0;
if(sscanf(sz, " %i %n", &val, &read)!=1) break; // format error or end of input
ret.push_back(val);
sz += read;
if(*sz != ',') break; // end of string or format error
}
// now ret contains the decoded string
If you are sure that the strings are always hexadecimal, regardless of the 0x prefix, and that whitespace is not present strtol is a bit more efficient and IMO nicer to use:
std::vector<unsigned char> ret;
for( ;*sz;++sz) {
char *endp;
long val = strtol(sz, &endp, 16);
if(endp==sz) break; // format error
sz = endp;
ret.push_back(val);
if(*sz!=',') break; // end of string or format error
}
If C++17 is available, you can use std::from_chars instead of strtol to cut out the locale bullshit, which can break your parsing function (although that's more typical for floating point parsing) and slow it down for no good reason.
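A std::from_chars version might look like the sketch below. It assumes C++17, takes the input as a std::string_view instead of the const char *sz used above, and does not skip whitespace; parse_hex_bytes is a hypothetical name.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: parses text such as "0xFF,0xD8,0x00" into bytes.
// Returns false on a format error; out then holds the bytes parsed so far.
bool parse_hex_bytes(std::string_view text, std::vector<std::uint8_t>& out) {
    const char* p = text.data();
    const char* const end = p + text.size();
    while (p != end) {
        // std::from_chars does not accept a "0x"/"0X" prefix, so skip it by hand.
        if (end - p >= 2 && p[0] == '0' && (p[1] == 'x' || p[1] == 'X')) p += 2;
        unsigned value = 0;
        const auto result = std::from_chars(p, end, value, 16);
        if (result.ec != std::errc() || value > 0xFF) return false;
        out.push_back(static_cast<std::uint8_t>(value));
        p = result.ptr;
        if (p == end) break;
        if (*p != ',') return false;
        ++p; // step over the separator
    }
    return true;
}
```

Because std::from_chars is locale-independent and reports where parsing stopped through result.ptr, no null terminator or temporary substring is needed.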
OTOH, if the performance is critical but from_chars is not available (or if it's available but you measured that it's slow), it may be advantageous to hand roll the whole parser.
auto conv_digit = [](char c) -> int {
if(c>='0' && c<='9') return c-'0';
// notice: technically not guaranteed to work;
// in practice it'll work on anything that doesn't use EBCDIC
if(c>='A' && c<='F') return c-'A'+10;
if(c>='a' && c<='f') return c-'a'+10;
return -1;
};
std::vector<unsigned char> ret;
for(; *sz; ++sz) {
while(*sz == ' ') ++sz;
if(*sz!='0' || (sz[1]!='x' && sz[1]!='X')) break; // format error
sz+=2;
int val = 0;
int digit = -1;
const char *sz_before = sz;
while((digit = conv_digit(*sz)) >= 0) {
val=val*16+digit; // or, if you prefer: val = val<<4 | digit;
++sz;
}
if(sz==sz_before) break; // format error
ret.push_back(val);
while(*sz == ' ') ++sz;
if(*sz!=',') break; // end of string or format error
}
If you're using C++11, you can use the stoi function.
vector<string> myString = {"0xFF", "0xD8", "0xFF", "0xE0", "0x00", "0x10", "0x4A", "0x46", "0x49", "0x46", "0x00", "0x01", "0x01", "0x01", "0x00", "0x60"};
unsigned char* myHexArray=new unsigned char[myString.size()];
for (unsigned i=0;i<myString.size();i++)
{
myHexArray[i]=stoi(myString[i],NULL,0);
}
for (unsigned i=0;i<myString.size();i++)
{
cout<<myHexArray[i]<<endl;
}
The function stoi() was introduced by C++11. In order to compile with gcc, you should compile with the flags -std=c++11.
In case you're using an older version of c++ you can use strtol instead of stoi. Note that you need to convert the string to a character array first.
myHexArray[i]=strtol(myString[i].c_str(),NULL,0);
You can use std::stoul on each of your values and build your array using another std::vector like this:
std::vector<std::string> vs {"0xFF", "0xD8", "0xFF" ...};
std::vector<unsigned char> vc;
vc.reserve(vs.size());
for(auto const& s: vs)
vc.push_back((unsigned char) std::stoul(s, 0, 0));
Now you can access your array with:
vc.data(); // <-- pointer to unsigned char array
Here's a complete solution including a test and a rudimentary parser (for simplicity, it assumes that the xml tags are on their own lines).
#include <cstdint>
#include <string>
#include <sstream>
#include <stdexcept>
#include <regex>
#include <iostream>
#include <iomanip>
#include <iterator>
#include <vector>
const char test_data[] =
R"__(<jpeg1>
0xFF,0xD8,0xFF,0xE0,0x00,0x10,0x4A,0x46,0x49,0x46,0x00,0x01,0x01,0x01,0x00,0x60,
0x12,0x34,0x56,0x78,0x9a,0xbc,0xde,0xf0
</jpeg1>)__";
struct Jpeg
{
std::string name;
std::vector<std::uint8_t> data;
};
std::ostream& operator<<(std::ostream& os, const Jpeg& j)
{
os << j.name << " : ";
const char* sep = " ";
os << '[';
for (auto b : j.data) {
os << sep << std::hex << std::setfill('0') << std::setw(2) << std::uint32_t(b);
sep = ", ";
}
return os << " ]";
}
template<class OutIter>
OutIter read_bytes(OutIter dest, std::istream& source)
{
std::string buffer;
while (std::getline(source, buffer, ','))
{
*dest++ = static_cast<std::uint8_t>(std::stoul(buffer, 0, 16));
}
return dest;
}
Jpeg read_jpeg(std::istream& is)
{
auto result = Jpeg {};
static const auto begin_tag = std::regex("<jpeg(.*)>");
static const auto end_tag = std::regex("</jpeg(.*)>");
std::string line, hex_buffer;
if(not std::getline(is, line)) throw std::runtime_error("end of file");
std::smatch match;
if (not std::regex_match(line, match, begin_tag)) throw std::runtime_error("not a <jpeg_>");
result.name = match[1];
while (std::getline(is, line))
{
if (std::regex_match(line, match, end_tag)) { break; }
std::istringstream hexes { line };
read_bytes(std::back_inserter(result.data), hexes);
}
return result;
}
int main()
{
std::istringstream input_stream(test_data);
auto jpeg = read_jpeg(input_stream);
std::cout << jpeg << std::endl;
}
expected output:
1 : [ ff, d8, ff, e0, 00, 10, 4a, 46, 49, 46, 00, 01, 01, 01, 00, 60, 12, 34, 56, 78, 9a, bc, de, f0 ]

attempt to decode a value not in base64 char set

I am using the following code snippet to base64 encode and decode a string using Boost C++ library.
//Base64 Encode Implementation using Boost C++ library
const std::string base64_padding[] = {"", "==", "="};
std::string X_Privet_Token_Generator::base64_encode(const std::string & s)
{
namespace bai = boost::archive::iterators;
std::stringstream os;
// convert binary values to base64 characters
typedef bai::base64_from_binary
// retrieve 6 bit integers from a sequence of 8 bit bytes
<bai::transform_width<const char *, 6, 8> > base64_enc; // compose all the above operations in to a new iterator
std::copy(base64_enc(s.c_str()), base64_enc(s.c_str() + s.size()), std::ostream_iterator<char>(os));
os << base64_padding[s.size() % 3];
return os.str();
}
std::string X_Privet_Token_Generator::base64_decode(std::string & s)
{
namespace bai = boost::archive::iterators;
std::stringstream os;
// convert binary values to base64 characters
typedef bai::binary_from_base64
<bai::transform_width<const char *, 8, 6> > base64_dec;
unsigned int size = s.size();
// Remove the padding characters, cf.
if (size && s[size - 1] == '=')
{
--size;
if (size && s[size - 1] == '=')
--size;
}
if (size == 0)
return std::string();
LOGINFO("Hash decoded token : %s", s.c_str());
std::copy(base64_dec(s.data()), base64_dec(s.data() + size), std::ostream_iterator<char>(os));
std::cout<< os.str();
return os.str();
}
Encoding works well, however, while decoding I get the following error:
terminate called after throwing an instance of boost::archive::iterators::dataflow_exception
what(): attempt to decode a value not in base64 char set
Is it one of the padded characters that is causing this issue? Am I missing something here?
The padding characters '=' are part of the b64 encoded data and should not be removed before decoding.
b64 is encoded in blocks of 4 characters; I suspect that while decoding it reads a '\0' instead of an expected '=' at the end of the string.
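The relationship between the 4-character blocks, the '=' padding and the decoded size can be sketched without Boost. The helper below is hypothetical (decoded_length is a made-up name) and only inspects the length and the trailing padding; it does not validate the other characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: computes how many bytes a padded base64 string
// represents. Returns false if the length is not a multiple of four.
bool decoded_length(const std::string& encoded, std::size_t& length) {
    if (encoded.size() % 4 != 0) return false; // incomplete final block
    std::size_t padding = 0;
    if (!encoded.empty() && encoded[encoded.size() - 1] == '=') {
        ++padding;
        // A non-empty string here has at least four characters.
        if (encoded[encoded.size() - 2] == '=') ++padding;
    }
    // Each full block of four characters carries three bytes; every '='
    // stands for one byte that was not present in the input.
    length = encoded.size() / 4 * 3 - padding;
    return true;
}
```

A check like this can run before decoding to size the output or to reject strings whose padding has been stripped.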
Changing the
std::copy(base64_dec(s.data()), base64_dec(s.data() + size), std::ostream_iterator<char>(os))
to
return std::string( base64_dec(s.c_str()), base64_dec(s.c_str() + size))
resolved the issue.
A more efficient solution for base64 encoding and decoding is as follows:
#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/remove_whitespace.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/archive/iterators/ostream_iterator.hpp>
#include <boost/algorithm/string.hpp>
#include <algorithm>
#include <sstream>
#include <string>
std::string X_Privet_Token_Generator::base64_encode(std::string s)
{
namespace bai = boost::archive::iterators;
std::stringstream os;
// convert binary values to base64 characters
typedef bai::base64_from_binary
// retrieve 6 bit integers from a sequence of 8 bit bytes
<bai::transform_width<const char *, 6, 8> > base64_enc; // compose all the above operations in to a new iterator
std::copy(base64_enc(s.c_str()), base64_enc(s.c_str() + s.size()), std::ostream_iterator<char>(os));
os << base64_padding[s.size() % 3];
return os.str();
}
std::string X_Privet_Token_Generator::base64_decode(std::string s)
{
namespace bai = boost::archive::iterators;
std::stringstream os;
typedef bai::transform_width<bai::binary_from_base64<const char *>, 8, 6>
base64_dec;
unsigned int size = s.size();
// Remove the padding characters.
if(size && s[size - 1] == '=') {
--size;
if(size && s[size - 1] == '=')
--size;
}
if(size == 0) return std::string();
unsigned int paddChars = count(s.begin(), s.end(), '=');
std::replace(s.begin(),s.end(), '=', 'A');
std::string decoded_token(base64_dec(s.c_str()), base64_dec(s.c_str() + size));
decoded_token.erase(decoded_token.end()-paddChars,decoded_token.end());
return decoded_token;
}

Converting from char string to an array of uint8_t?

I'm reading a string from a file so it's in the form of a char array. I need to tokenize the string and save each char array token as a uint8_t hex value in an array.
char* starting = "001122AABBCC";
// ...
uint8_t[] ending = {0x00,0x11,0x22,0xAA,0xBB,0xCC}
How can I convert from starting to ending? Thanks.
Here is a complete working program. It is based on Rob I's solution, but fixes several problems and has been tested to work.
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <iostream>
const char* starting = "001122AABBCC";
int main()
{
std::string starting_str = starting;
std::vector<unsigned char> ending;
ending.reserve( starting_str.size());
for (int i = 0 ; i < starting_str.length() ; i+=2) {
std::string pair = starting_str.substr( i, 2 );
ending.push_back(::strtol( pair.c_str(), 0, 16 ));
}
for(int i=0; i<ending.size(); ++i) {
printf("0x%X\n", ending[i]);
}
}
strtoul will convert text in any base you choose into bytes. You have to do a little work to chop the input string into individual digits, or you can convert 32 or 64bits at a time.
P.S. uint8_t[] ending = {0x00,0x11,0x22,0xAA,0xBB,0xCC}
doesn't mean anything special: you aren't storing the data in a uint8_t as 'hex', you are storing bytes. It's up to you (or your debugger) how to interpret the binary data.
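The point that hex is only a notation can be illustrated with a short sketch. describe_byte is a hypothetical helper that prints the same stored byte in decimal and in hexadecimal.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats one byte as "<decimal> 0x<HEX>".
std::string describe_byte(std::uint8_t value) {
    std::ostringstream os;
    // Cast to int so the stream prints a number rather than a character.
    os << std::dec << static_cast<int>(value) << " 0x"
       << std::hex << std::uppercase << std::setw(2) << std::setfill('0')
       << static_cast<int>(value);
    return os.str();
}
```

describe_byte(0xAA) and describe_byte(170) produce the same text because both literals denote the same byte value; only the formatting applied on output differs.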
With C++11, you may use std::stoi for that:
std::vector<uint8_t> convert(const std::string& s)
{
if (s.size() % 2 != 0) {
throw std::runtime_error("Bad size argument");
}
std::vector<uint8_t> res;
res.reserve(s.size() / 2);
for (std::size_t i = 0, size = s.size(); i != size; i += 2) {
std::size_t pos = 0;
res.push_back(std::stoi(s.substr(i, 2), &pos, 16));
if (pos != 2) {
throw std::runtime_error("bad character in argument");
}
}
return res;
}
I think any canonical answer (w.r.t. the bounty notes) would involve some distinct phases in the solution:
Error checking for valid input (a length check and a data content check)
Element conversion
Output creation
Given the usefulness of such conversions, the solution should probably include some flexibility w.r.t. the types being used and the locale required.
From the outset, given the date of the request for a "more canonical answer" (circa August 2014) liberal use of C++11 will be applied.
An annotated version of the code, with types corresponding to the OP:
std::vector<std::uint8_t> convert(std::string const& src)
{
// error check on the length
if ((src.length() % 2) != 0) {
throw std::invalid_argument("conversion error: input is not even length");
}
auto ishex = [] (decltype(*src.begin()) c) {
return std::isxdigit(c, std::locale()); };
// error check on the data contents
if (!std::all_of(std::begin(src), std::end(src), ishex)) {
throw std::invalid_argument("conversion error: input values are not all xdigits");
}
// allocate the result, initialised to 0 and size it to the correct length
std::vector<std::uint8_t> result(src.length() / 2, 0);
// run the actual conversion
auto str = src.begin(); // track the location in the string
std::for_each(result.begin(), result.end(), [&str](decltype(*result.begin())& element) {
element = static_cast<std::uint8_t>(std::stoul(std::string(str, str + 2), nullptr, 16));
std::advance(str, 2); // next two elements
});
return result;
}
The template version of the code adds flexibility:
template <typename Int /*= std::uint8_t*/,
typename Char = char,
typename Traits = std::char_traits<Char>,
typename Allocate = std::allocator<Char>,
typename Locale = std::locale>
std::vector<Int> basic_convert(std::basic_string<Char, Traits, Allocate> const& src, Locale locale = Locale())
{
using string_type = std::basic_string<Char, Traits, Allocate>;
auto ishex = [&locale] (decltype(*src.begin()) c) {
return std::isxdigit(c, locale); };
if ((src.length() % 2) != 0) {
throw std::invalid_argument("conversion error: input is not even length");
}
if (!std::all_of(std::begin(src), std::end(src), ishex)) {
throw std::invalid_argument("conversion error: input values are not all xdigits");
}
std::vector<Int> result(src.length() / 2, 0);
auto str = std::begin(src);
std::for_each(std::begin(result), std::end(result), [&str](decltype(*std::begin(result))& element) {
element = static_cast<Int>(std::stoul(string_type(str, str + 2), nullptr, 16));
std::advance(str, 2);
});
return result;
}
The convert() function can then be based on the basic_convert() as follows:
std::vector<std::uint8_t> convert(std::string const& src)
{
return basic_convert<std::uint8_t>(src, std::locale());
}
uint8_t is typically no more than a typedef of an unsigned char. If you're reading characters from a file, you should be able to read them into an unsigned char array just as easily as a signed char array, and an unsigned char array is a uint8_t array.
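A rough sketch of that idea, assuming a platform where std::uint8_t really is unsigned char (common, but not guaranteed by the standard). The function name read_bytes is hypothetical. Note that this only stores the raw character codes; turning pairs of hex digits into byte values is still a separate step.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <type_traits>
#include <vector>

// Assumption: holds on mainstream platforms, but the standard does not require it.
static_assert(std::is_same<std::uint8_t, unsigned char>::value,
              "this sketch assumes uint8_t is a typedef of unsigned char");

// Hypothetical helper: reads every character from the stream and stores it
// as a uint8_t. The values are character codes, not parsed hex digits.
std::vector<std::uint8_t> read_bytes(std::istream& in)
{
    std::vector<std::uint8_t> bytes;
    char c;
    while (in.get(c)) {
        bytes.push_back(static_cast<std::uint8_t>(c));
    }
    return bytes;
}
```

The same function accepts a std::ifstream opened on the file, or a std::istringstream for a quick experiment.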
I'd try something like this:
std::string starting_str = starting;
uint8_t* ending = new uint8_t[starting_str.length()/2];
for (std::size_t i = 0 ; i < starting_str.length() ; i+=2) {
// the second argument of substr is a length, not an end index
std::string pair = starting_str.substr( i, 2 );
ending[i/2] = ::strtol( pair.c_str(), 0, 16 );
}
Didn't test it but it looks good to me...
You may add your own conversion from set of char { '0','1',...'E','F' } to uint8_t:
uint8_t ctoa(char c)
{
if( c >= '0' && c <= '9' ) return c - '0';
else if( c >= 'a' && c <= 'f' ) return 0xA + c - 'a';
else if( c >= 'A' && c <= 'F' ) return 0xA + c - 'A';
else return 0;
}
Then it will be easy to convert a string into an array:
uint32_t endingSize = strlen(starting)/2;
uint8_t* ending = new uint8_t[endingSize];
for( uint32_t i=0; i<endingSize; i++ )
{
ending[i] = ( ctoa( starting[i*2] ) << 4 ) + ctoa( starting[i*2+1] );
}
This simple solution should work for your problem
const char* starting = "001122AABBCC";
uint8_t ending[6];
// This algorithm works for any even-length starting string.
// However, you have to make sure that ending has enough space (strlen(starting)/2 bytes).
size_t i = 0;
while (i + 1 < strlen(starting))
{
// copy two hex digits into a null-terminated string
char str[3] = "";
str[0] = starting[i];
str[1] = starting[i + 1];
// convert the string to an int, base 16
ending[i / 2] = (uint8_t)strtol(str, NULL, 16);
i += 2;
}
// reinterprets the characters as bytes; it does not parse the hex digits
const uint8_t* ending = reinterpret_cast<const uint8_t*>(starting);