uintx_t to const char* in freestanding C++ using GNU compiler

I am trying to convert integers into character arrays that my terminal can write, so I can see the values of my code's calculations for debugging while it runs.
As in: if count = 57, I want the terminal to write 57, so the char* would be an array holding the characters 5 and 7.
The kicker here is that this is a freestanding environment, which means no C++ standard library.
EDIT:
This means no std::string, no c_str(), no to_string(); I can't just print integers.
The headers I have access to are iso646, stddef, float, limits, stdint, stdalign, stdarg, stdbool and stdnoreturn.
I've tried a few things, from casting the int to a const char* (which just led to random characters being displayed) to feeding my compiler different headers from the GCC collection, but they just kept needing other headers, which I kept feeding it until I no longer knew which header the compiler wanted.
So here is where the code needs to be used to print:
uint8_t count = 0;
while (true)
{
    terminal_setcolor(3);
    terminal_writestring("hello\n");
    count++;
    terminal_writestring((const char*)count);
    terminal_writestring("\n");
}
Any advice on this would be greatly appreciated.
I am using a GNU g++ cross-compiler targeting i686-elf, and I guess I am using C++11 since I have access to stdnoreturn.h, but it could be C++14 since I only just built the compiler against the latest GNU dependencies.

Without the C/C++ standard library you have no option except writing the conversion function manually, e.g.:
template <int N>
const char* uint_to_string(
    unsigned int val,
    char (&str)[N],
    unsigned int base = 10)
{
    static_assert(N > 1, "Buffer too small");
    static const char* const digits = "0123456789ABCDEF";
    if (base < 2 || base > 16) return nullptr;
    int i = N - 1;
    str[i] = 0;
    do
    {
        --i;
        str[i] = digits[val % base];
        val /= base;
    }
    while (val != 0 && i > 0);
    return val == 0 ? str + i : nullptr;
}
template <int N>
const char* int_to_string(
    int val,
    char (&str)[N],
    unsigned int base = 10)
{
    // Output as unsigned.
    if (val >= 0) return uint_to_string(val, str, base);
    // Output as binary representation if base is not decimal.
    if (base != 10) return uint_to_string(val, str, base);
    // Output signed decimal representation.
    const char* res = uint_to_string(-val, str, base);
    // Buffer has room for the minus sign.
    if (res > str)
    {
        const auto i = res - str - 1;
        str[i] = '-';
        return str + i;
    }
    else return nullptr;
}
Usage:
char buf[100];
terminal_writestring(int_to_string(42, buf)); // Will print '42'
terminal_writestring(int_to_string(42, buf, 2)); // Will print '101010'
terminal_writestring(int_to_string(42, buf, 8)); // Will print '52'
terminal_writestring(int_to_string(42, buf, 16)); // Will print '2A'
terminal_writestring(int_to_string(-42, buf)); // Will print '-42'
terminal_writestring(int_to_string(-42, buf, 2)); // Will print '11111111111111111111111111010110'
terminal_writestring(int_to_string(-42, buf, 8)); // Will print '37777777726'
terminal_writestring(int_to_string(-42, buf, 16)); // Will print 'FFFFFFD6'
Live example: http://cpp.sh/5ras
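Dropped into the loop from the question, it would look like this (a sketch, assuming the OP's terminal_setcolor/terminal_writestring; buf is sized so even base 2 fits an 8-bit value):
uint8_t count = 0;
char buf[9]; // 8 binary digits + terminator; plenty for the base-10 default
while (true)
{
    terminal_setcolor(3);
    terminal_writestring("hello\n");
    count++;
    terminal_writestring(uint_to_string(count, buf)); // decimal by default
    terminal_writestring("\n");
}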

You could declare a string and get a pointer to it:
std::string str = std::to_string(count);
str += "\n";
terminal_writestring(str.c_str());

Related

longest palindromic substring. Error: AddressSanitizer, heap overflow

#include <string>
#include <cstring>
using namespace std;

class Solution {
    void shift_left(char* c, const short unsigned int bits) {
        const unsigned short int size = sizeof(c);
        memmove(c, c+bits, size - bits);
        memset(c+size-bits, 0, bits);
    }
public:
    string longestPalindrome(string s) {
        char* output = new char[s.length()];
        output[0] = s[0];
        string res = "";
        char* n = output;
        auto e = s.begin() + 1;
        while(e != s.end()) {
            char letter = *e;
            char* c = n;
            (*++n) = letter;
            if((letter != *c) && (c == &output[0] || letter != (*--c)) ) {
                ++e;
                continue;
            }
            while((++e) != s.end() && c != &output[0]) {
                if((letter = *e) != (*--c)) {
                    const unsigned short int bits = c - output + 1;
                    shift_left(output, bits);
                    n -= bits;
                    break;
                }
                (*++n) = letter;
            }
            string temp(output);
            res = temp.length() > res.length()? temp : res;
            shift_left(output, 1);
            --n;
        }
        return res;
    }
};
Input: longestPalindrome("babad");
The program works fine and prints "bab" as the longest palindrome, but there's a heap overflow somewhere. An error like this appears:
Read of size 6 at ...memory address... thread T0
"babad" is size 5, and after going over this for an hour I don't see where the iteration ever exceeds 5.
There are 3 pointers here that iterate:
e, the element of string s;
n, the pointer to the next char of output;
and c, a copy of n that decrements until it reaches the address of &output[0].
Maybe it's something with the memmove or memset, since I've never used them before.
I'm completely lost.
TL;DR: a mixture of char* and std::string is not a good idea if you don't understand exactly how it works.
If you want the length of a string, you can't do const unsigned short int size = sizeof(c); (sizeof returns the size of the pointer, which is commonly 4 on a 32-bit machine and 8 on a 64-bit machine). You must do this instead: const size_t size = strlen(c);
AddressSanitizer is right that you are (indirectly) trying to access memory that does not belong to you.
How does the constructor of string from char* work?
Answer: the char* is treated as a C-style string, which means it must be null ('\0') terminated.
More details: the constructor of string from char* calls a strlen-like function, which looks roughly like this:
https://en.cppreference.com/w/cpp/string/byte/strlen
int strlen(char *begin){
    int k = 0;
    while (*begin != '\0'){
        ++k;
        ++begin;
    }
    return k;
}
If the C-style char* string does not contain '\0', it causes access to memory that doesn't belong to you.
How to fix?
Answer (two options):
don't use a mixture of char* and std::string;
or replace char* output = new char[s.length()]; with char* output = new char[s.length() + 1]; memset(output, 0, s.length() + 1);
Also, you must delete all memory which you newed, so add delete[] output; before return res;.
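Applied to the code above, the second option plus the cleanup looks like this (a sketch showing only the changed lines):
string longestPalindrome(string s) {
    char* output = new char[s.length() + 1]; // one extra byte for '\0'
    memset(output, 0, s.length() + 1);       // zeroed, so always null-terminated
    output[0] = s[0];
    // ... the iteration stays exactly as before ...
    delete[] output; // free what was newed
    return res;
}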

How to detect UTF16 strings in PE files

I need to extract Unicode strings from a PE file, and while extracting I need to detect them first. For UTF-8 characters I used the following link: How to easily detect utf8 encoding in the string?. Is there any similar way to detect UTF-16 characters? I have tried the following code. Is this right? Please help or provide suggestions. Thanks in advance!
BYTE temp1 = buf[offset];
BYTE temp2 = buf[offset+1];
while (!(temp1 == 0x00 && temp2 == 0x00) && offset <= bufSize)
{
    if ((temp1 >= 0x00 && temp1 <= 0xFF) && (temp2 >= 0x00 && temp2 <= 0xFF))
    {
        tmp += 2;
    }
    else
    {
        break;
    }
    offset += 2;
    temp1 = buf[offset];
    temp2 = buf[offset+1];
    if (temp1 == 0x00 && temp2 == 0x00)
    {
        break;
    }
}
I just implemented a function for you, DecodeUtf16Char(). It can do two things: either just check whether the input is valid UTF-16 (when check_only = true), or check and return the decoded Unicode code point (32-bit). It also supports either big-endian (the default, big_endian = true) or little-endian (big_endian = false) byte order within each two-byte UTF-16 word. bad_skip is the number of bytes to skip when a character fails to decode (invalid UTF-16); bad_value is the value used to signify that the UTF-16 couldn't be decoded (was invalid), -1 by default.
Examples of usage/tests are included after the function definition. Basically you pass a starting pointer (ptr) and an ending pointer to the function and check the return value: if it is -1, then there was an invalid UTF-16 sequence at the starting pointer; if it is not -1, then the returned value contains a valid 32-bit Unicode code point. The function also advances ptr, by the number of decoded bytes for valid UTF-16 or by bad_skip bytes if it was invalid.
The function should be very fast, because it contains only a few ifs (plus a bit of arithmetic when you ask it to actually decode chars). Always place it in a header so that it is inlined into the calling function to produce fast code! Also pass only compile-time constants for check_only and big_endian; this will remove the extra decoding code through C++ optimizations.
If, for example, you just want to detect long runs of UTF-16 bytes, then iterate in a loop calling this function: the first position where it returns something other than -1 is a possible beginning of text, and the last not-equal-to -1 value marks the end. It is also important to pass bad_skip = 1 when searching for UTF-16 bytes, because a valid char may start at any byte.
I used different characters for testing: English ASCII, Russian chars (two-byte UTF-16), plus two 4-byte chars (two UTF-16 words each). My tests append the converted line to a test.txt file; this file is UTF-8 encoded so it is easily viewable, e.g. in Notepad. All of the code after the decoding function is just testing code and is not needed for it to work.
The decoder needs two functions: _DecodeUtf16Char_ReadWord() (a helper) plus DecodeUtf16Char() (the main decoder). I include only one standard header, <cstdint>; if you're not allowed to include anything, just define uint8_t, uint16_t and uint32_t yourself, as these type definitions are all I use from that header.
Also, for reference, see my other post, which implements all types of conversions between UTF-8 <--> UTF-16 <--> UTF-32, both from scratch and using the standard C++ library!
Try it online!
#include <cstdint>

static inline bool _DecodeUtf16Char_ReadWord(
    uint8_t const * & ptrc, uint8_t const * end,
    uint16_t & r, bool const big_endian
) {
    if (ptrc + 1 >= end) {
        // No data left.
        if (ptrc < end)
            ++ptrc;
        return false;
    }
    if (big_endian) {
        r  = uint16_t(*ptrc) << 8; ++ptrc;
        r |= uint16_t(*ptrc)     ; ++ptrc;
    } else {
        r  = uint16_t(*ptrc)     ; ++ptrc;
        r |= uint16_t(*ptrc) << 8; ++ptrc;
    }
    return true;
}

static inline uint32_t DecodeUtf16Char(
    uint8_t const * & ptr, uint8_t const * end,
    bool const check_only = true, bool const big_endian = true,
    uint32_t const bad_skip = 1, uint32_t const bad_value = -1
) {
    auto ptrs = ptr, ptrc = ptr;
    uint32_t c = 0;
    uint16_t v = 0;
    if (!_DecodeUtf16Char_ReadWord(ptrc, end, v, big_endian)) {
        // No data left.
        c = bad_value;
    } else if (v < 0xD800 || v > 0xDFFF) {
        // Correct single-word symbol.
        if (!check_only)
            c = v;
    } else if (v >= 0xDC00) {
        // Disallowed UTF-16 sequence!
        c = bad_value;
    } else { // Possibly double-word sequence.
        if (!check_only)
            c = (v & 0x3FF) << 10;
        if (!_DecodeUtf16Char_ReadWord(ptrc, end, v, big_endian)) {
            // No data left.
            c = bad_value;
        } else if ((v < 0xDC00) || (v > 0xDFFF)) {
            // Disallowed UTF-16 sequence!
            c = bad_value;
        } else {
            // Correct double-word symbol.
            if (!check_only) {
                c |= v & 0x3FF;
                c += 0x10000;
            }
        }
    }
    if (c == bad_value)
        ptr = ptrs + bad_skip; // Skip bytes.
    else
        ptr = ptrc; // Skip all eaten bytes.
    return c;
}
// --------- The code below is for testing only and is not needed for decoding ------------
#include <iostream>
#include <string>
#include <codecvt>
#include <fstream>
#include <locale>

static std::u32string DecodeUtf16Bytes(uint8_t const * ptr, uint8_t const * end) {
    std::u32string res;
    while (true) {
        if (ptr >= end)
            break;
        uint32_t c = DecodeUtf16Char(ptr, end, false, false, 2);
        if (c != uint32_t(-1))
            res.append(1, c);
    }
    return res;
}

#if (!_DLL) && (_MSC_VER >= 1900 /* VS 2015 */) && (_MSC_VER <= 1914 /* VS 2017 */)
std::locale::id std::codecvt<char16_t, char, _Mbstatet>::id;
std::locale::id std::codecvt<char32_t, char, _Mbstatet>::id;
#endif

template <typename CharT = char>
static std::basic_string<CharT> U32ToU8(std::u32string const & s) {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> utf_8_32_conv;
    auto res = utf_8_32_conv.to_bytes(s.c_str(), s.c_str() + s.length());
    return res;
}

template <typename WCharT = wchar_t>
static std::basic_string<WCharT> U32ToU16(std::u32string const & s) {
    std::wstring_convert<std::codecvt_utf16<char32_t, 0x10ffffUL, std::little_endian>, char32_t> utf_16_32_conv;
    auto res = utf_16_32_conv.to_bytes(s.c_str(), s.c_str() + s.length());
    return std::basic_string<WCharT>((WCharT*)(res.c_str()), (WCharT*)(res.c_str() + res.length()));
}

template <typename StrT>
void OutputString(StrT const & s) {
    std::ofstream f("test.txt", std::ios::binary | std::ios::app);
    f.write((char*)s.c_str(), size_t((uint8_t*)(s.c_str() + s.length()) - (uint8_t*)s.c_str()));
    f.write("\n\x00", sizeof(s.c_str()[0]));
}

int main() {
    std::u16string a = u"привет|мир|hello|𐐷|world|𤭢|again|русский|english";
    *((uint8_t*)(a.data() + 12) + 1) = 0xDD; // Introduce a bad UTF-16 byte.
    // Also truncate by 1 byte ("... - 1" in the next line).
    OutputString(U32ToU8(DecodeUtf16Bytes((uint8_t*)a.c_str(), (uint8_t*)(a.c_str() + a.length()) - 1)));
    return 0;
}
Output:
привет|мир|hllo|𐐷|world|𤭢|again|русский|englis
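The run-detection idea described above could be driven like this (a sketch; FindUtf16Run is a hypothetical helper built only on DecodeUtf16Char):
// Finds the first run of decodable UTF-16 in [ptr, end); returns false
// if no valid character exists in the range.
static bool FindUtf16Run(uint8_t const * ptr, uint8_t const * end,
    uint8_t const * & run_begin, uint8_t const * & run_end,
    bool const big_endian = true)
{
    while (ptr < end) {
        uint8_t const * cur = ptr;
        // bad_skip = 1: a valid character may start at any byte.
        if (DecodeUtf16Char(cur, end, true, big_endian, 1) != uint32_t(-1)) {
            run_begin = ptr;
            // Extend the run while decoding keeps succeeding.
            do {
                ptr = cur;
            } while (ptr < end &&
                     DecodeUtf16Char(cur, end, true, big_endian, 1) != uint32_t(-1));
            run_end = ptr;
            return true;
        }
        ptr = cur; // cur was already advanced past the bad byte
    }
    return false;
}
For PE files you would pass big_endian = false, since Windows strings are UTF-16LE; to enumerate all runs, call the function again with ptr advanced past run_end.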

How to convert the template from C++ to C

I am trying to convert some C++ code to C for a compiler that can't handle C++. I'd like to convert the template below to C. The template converts a decimal integer to hexadecimal and pads the front with 0s if the hexadecimal string is shorter than sizeof(T)*2. The type T can be unsigned char, char, short, unsigned short, int, unsigned int, long long, or unsigned long long.
template< typename T > std::string hexify(T i)
{
    std::stringbuf buf;
    std::ostream os(&buf);
    os << std::setfill('0') << std::setw(sizeof(T) * 2)
       << std::hex << i;
    std::cout << "sizeof(T) * 2 = " << sizeof(T) * 2
              << " buf.str() = " << buf.str()
              << " buf.str().c_str() = " << buf.str().c_str() << std::endl;
    return buf.str().c_str();
}
Thank you for your help.
Edit 1: I have tried the declaration
char * hexify (void data, size_t data_size)
but when I call it with an int value int_value:
char * result = hexify(int_value, sizeof(int))
it doesn't work because of the incompatible types (void and int).
So in this case, do I have to use a macro? I haven't tried a macro because it's complicated.
C does not have templates. One solution is to pass the maximum-width integer type supported (uintmax_t, in Value below) along with the size of the original integer (in Size). The routine can use the size to determine the number of digits to print. Another complication is that C does not provide C++'s std::string with its automatic memory management. A typical way to handle this in C is for the called function to allocate a buffer and return it to the caller, who is responsible for freeing it when done.
The code below shows a hexify function that does this, and it also shows a Hexify macro that takes a single parameter and passes both its size and its value to the hexify function.
Note that, in C, character constants such as 'A' have type int, not char, so some care is needed in providing the desired size. The code below includes an example of that.
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

char *hexify(size_t Size, uintmax_t Value)
{
    // Allocate space for "0x", 2*Size digits, and a null character.
    size_t BufferSize = 2 + 2*Size + 1;
    char *Buffer = malloc(BufferSize);

    // Ensure a buffer was allocated.
    if (!Buffer)
    {
        fprintf(stderr,
            "Error, unable to allocate buffer of %zu bytes in %s.\n",
            BufferSize, __func__);
        exit(EXIT_FAILURE);
    }

    // Format the value as "0x" followed by 2*Size hexadecimal digits.
    snprintf(Buffer, BufferSize, "0x%0*" PRIxMAX, (int) (2*Size), Value);

    return Buffer;
}

/* Provide a macro that passes both the size and the value of its parameter
   to the hexify function.
*/
#define Hexify(x) (hexify(sizeof (x), (x)))

int main(void)
{
    char *Buffer;

    /* Show two examples of using the hexify function with different integer
       types. (The examples assume ASCII.)
    */
    char x = 'A';
    Buffer = hexify(sizeof x, x);
    printf("Character '%c' = %s.\n", x, Buffer); // Prints "0x41".
    free(Buffer);

    int i = 123;
    Buffer = hexify(sizeof i, i);
    printf("Integer %d = %s.\n", i, Buffer); // Prints "0x0000007b".
    free(Buffer);

    /* Show examples of using the Hexify macro, demonstrating that 'A' is an
       int value, not a char value, so it would need to be cast if a char is
       desired.
    */
    Buffer = Hexify('A');
    printf("Character '%c' = %s.\n", 'A', Buffer); // Prints "0x00000041".
    free(Buffer);

    Buffer = Hexify((char) 'A');
    printf("Character '%c' = %s.\n", 'A', Buffer); // Prints "0x41".
    free(Buffer);
}
You don't need templates if you step down to raw bits and bytes.
If performance is important, it is also best to roll out the conversion routine by hand, since the string-handling functions in C and C++ come with lots of slow overhead. A somewhat well-optimized version would look something like this:
char* hexify_data (char* restrict dst, const char* restrict src, size_t size)
{
    const char NIBBLE_LOOKUP[0xF+1] = "0123456789ABCDEF";
    char* d = dst;

    for(size_t i=0; i<size; i++)
    {
        size_t byte = size - i - 1; // assuming little endian
        *d = NIBBLE_LOOKUP[ (src[byte]&0xF0u)>>4 ];
        d++;
        *d = NIBBLE_LOOKUP[ (src[byte]&0x0Fu)>>0 ];
        d++;
    }
    *d = '\0';

    return dst;
}
This breaks any passed type down byte by byte, using a character type, which is fine when using character types specifically. It also uses caller allocation for maximum performance. (It can also be made endianness-independent with an extra check per loop.)
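That extra check could look roughly like this (a sketch; is_little_endian is a hypothetical helper, not part of the code above):
#include <stdint.h>

/* Runtime byte-order probe: returns 1 on a little-endian machine. */
static int is_little_endian(void)
{
    const uint16_t probe = 1u;
    return *(const uint8_t*)&probe == 1;
}

/* The loop body then picks the byte index accordingly:
   size_t byte = is_little_endian() ? size - i - 1 : i;
*/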
We can make the call a bit more convenient with a wrapper macro:
#define hexify(buf, var) hexify_data(buf, (char*)&var, sizeof(var))
Full example:
#include <string.h>
#include <stdint.h>
#include <stdio.h>
#define hexify(buf, var) hexify_data(buf, (char*)&var, sizeof(var))
char* hexify_data (char*restrict dst, const char*restrict src, size_t size)
{
const char NIBBLE_LOOKUP[0xF+1] = "0123456789ABCDEF";
char* d = dst;
for(size_t i=0; i<size; i++)
{
size_t byte = size - i - 1; // assuming little endian
*d = NIBBLE_LOOKUP[ (src[byte]&0xF0u)>>4 ];
d++;
*d = NIBBLE_LOOKUP[ (src[byte]&0x0Fu)>>0 ];
d++;
}
*d = '\0';
return dst;
}
int main (void)
{
char buf[50];
int32_t i32a = 0xABCD;
puts(hexify(buf, i32a));
int32_t i32b = 0xAAAABBBB;
puts(hexify(buf, i32b));
char c = 5;
puts(hexify(buf, c));
uint8_t u8 = 100;
puts(hexify(buf, u8));
}
Output:
0000ABCD
AAAABBBB
05
64
An optional solution is to use a format string, like printf does.
Note that you can't return a pointer to a local variable, but you can take the buffer as an argument (here without bounds checking):
#include <stdio.h>
#include <string.h>

/* The value is passed by address; the format string tells us its size. */
char* hexify(char* result, const char* format, const void* arg)
{
    int size = 0;
    if(0 == strcmp(format,"%d") || 0 == strcmp(format,"%u"))
    {
        size = 4;
        sprintf(result, "%08x", *(const unsigned int*)arg);
    }
    else if(0 == strcmp(format,"%hd") || 0 == strcmp(format,"%hu"))
    {
        size = 2;
        sprintf(result, "%04x", (unsigned)*(const unsigned short*)arg);
    }
    else if(0 == strcmp(format,"%hhd") || 0 == strcmp(format,"%hhu"))
    {
        size = 1;
        sprintf(result, "%02x", (unsigned)*(const unsigned char*)arg);
    }
    else if(0 == strcmp(format,"%lld") || 0 == strcmp(format,"%llu"))
    {
        size = 8;
        sprintf(result, "%016llx", *(const unsigned long long*)arg);
    }
    //printf("size=%d", size);
    return result;
}

int main()
{
    char result[256];
    unsigned char value = 1;
    printf("%s", hexify(result, "%hhu", &value));
    return 0;
}

boost::uuid into char* without std::string

I am trying to convert a boost UUID to a char* without using std::string at all.
I mostly adapted the to_string method from https://www.boost.org/doc/libs/1_68_0/boost/uuid/uuid_io.hpp into my own version. However, it fails on certain UUIDs.
Here is my modification:
#include <boost/uuid/string_generator.hpp>
#include <boost/uuid/uuid_generators.hpp>

using UUID = boost::uuids::uuid;
static constexpr std::size_t UUID_STR_LEN = 37;

inline char uuid_byte_to_char(size_t i)
{
    if (i <= 9) {
        return static_cast<char>('0' + i);
    } else {
        return static_cast<char>('a' + (i - 10));
    }
}

inline void uuid_to_cstr(UUID const& uuid, char out[UUID_STR_LEN])
{
    std::size_t out_i = 0;
    std::size_t dash_i = 0;
    for (UUID::const_iterator it_data = uuid.begin(); it_data != uuid.end(); ++it_data, ++dash_i) {
        const size_t hi = ((*it_data) >> 4) & 0x0F;
        out[out_i++] = uuid_byte_to_char(hi);
        const size_t lo = (*it_data) & 0x0F;
        out[out_i++] = uuid_byte_to_char(lo);
        if (dash_i == 3 || dash_i == 5 || dash_i == 7 || dash_i == 9) {
            out[out_i++] += '-';
        }
    }
    out[UUID_STR_LEN - 1] = '\0';
}
Usage:
int main() {
    UUID uuid(uuid_generator());
    char uuid_cstr[UUID_STR_LEN];
    uuid_to_cstr(uuid, uuid_cstr);
    std::cout << uuid_cstr << "\n";
}
So if the UUID was cd0fa728-e7d6-4578-9450-7beb284e0103, for example, this works fine.
However, for 0cf31c43-7621-407c-94d6-6d593bae96e8 what I actually end up getting is 0cf31c43-7621�407cQ94d6-6d593bae96e8.
What's the problem in my code? As far as I'm aware, my char manipulations mimic what the std::string version does, minus the temporary copies due to the constant appending. Or am I mistaken?
Your buffer char uuid_cstr[UUID_STR_LEN]; was allocated on the stack, so it holds garbage: every element starts with some indeterminate value, probably not 0.
1) You can set all items to zero with
char uuid_cstr[UUID_STR_LEN];
memset(uuid_cstr, 0, UUID_STR_LEN);
and then the following statement can work:
out[out_i++] += '-';
2) Or use plain assignment:
out[out_i++] = '-';
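The difference is easy to see in isolation (a minimal sketch, separate from the OP's code):
#include <cstdio>
#include <cstring>

int main() {
    char buf[4];                  // stack buffer: contents indeterminate
    std::memset(buf, 0, sizeof buf);
    buf[0] += '-';                // 0 + '-' == '-', works only after the memset
    buf[1] = '-';                 // plain assignment works regardless
    std::printf("%c %c\n", buf[0], buf[1]); // prints "- -"
}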

Converting from char string to an array of uint8_t?

I'm reading a string from a file so it's in the form of a char array. I need to tokenize the string and save each char array token as a uint8_t hex value in an array.
char* starting = "001122AABBCC";
// ...
uint8_t[] ending = {0x00,0x11,0x22,0xAA,0xBB,0xCC}
How can I convert from starting to ending? Thanks.
Here is a complete working program. It is based on Rob I's solution, but fixes several problems and has been tested to work.
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <iostream>

const char* starting = "001122AABBCC";

int main()
{
    std::string starting_str = starting;
    std::vector<unsigned char> ending;
    ending.reserve(starting_str.size());
    for (int i = 0 ; i < starting_str.length() ; i += 2) {
        std::string pair = starting_str.substr(i, 2);
        ending.push_back(::strtol(pair.c_str(), 0, 16));
    }
    for (int i = 0; i < ending.size(); ++i) {
        printf("0x%X\n", ending[i]);
    }
}
strtoul will convert text in any base you choose into bytes. You have to do a little work to chop the input string into individual digits, or you can convert 32 or 64 bits at a time.
P.S. uint8_t[] ending = {0x00,0x11,0x22,0xAA,0xBB,0xCC} doesn't mean anything: you aren't storing the data in a uint8_t as 'hex', you are storing bytes; it's up to you (or your debugger) how to interpret the binary data.
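That chopping could look like this (a sketch; hex_to_bytes is a hypothetical name, and out must have room for strlen(in)/2 bytes):
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Converts pairs of hex digits with strtoul; returns the number of bytes written. */
size_t hex_to_bytes(const char* in, uint8_t* out)
{
    size_t n = strlen(in) / 2;
    for (size_t i = 0; i < n; ++i) {
        char pair[3] = { in[2*i], in[2*i + 1], '\0' };
        out[i] = (uint8_t)strtoul(pair, NULL, 16);
    }
    return n;
}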
With C++11, you may use std::stoi for that:
std::vector<uint8_t> convert(const std::string& s)
{
    if (s.size() % 2 != 0) {
        throw std::runtime_error("Bad size argument");
    }
    std::vector<uint8_t> res;
    res.reserve(s.size() / 2);
    for (std::size_t i = 0, size = s.size(); i != size; i += 2) {
        std::size_t pos = 0;
        res.push_back(std::stoi(s.substr(i, 2), &pos, 16));
        if (pos != 2) {
            throw std::runtime_error("bad character in argument");
        }
    }
    return res;
}
Live example.
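A possible usage sketch (assuming the headers the function needs, e.g. <cstdint>, <stdexcept>, <string>, <vector>):
#include <cstdio>

int main()
{
    for (uint8_t b : convert("001122AABBCC"))
        std::printf("0x%02X\n", b); // prints 0x00, 0x11, 0x22, 0xAA, 0xBB, 0xCC
}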
I think any canonical answer (w.r.t. the bounty notes) would involve some distinct phases in the solution:
Error checking for valid input
  Length check and
  Data content check
Element conversion
Output creation
Given the usefulness of such conversions, the solution should probably include some flexibility w.r.t. the types being used and the locale required.
From the outset, given the date of the request for a "more canonical answer" (circa August 2014), liberal use of C++11 will be applied.
An annotated version of the code, with types corresponding to the OP's:
std::vector<std::uint8_t> convert(std::string const& src)
{
    // error check on the length
    if ((src.length() % 2) != 0) {
        throw std::invalid_argument("conversion error: input is not even length");
    }

    auto ishex = [] (decltype(*src.begin()) c) {
        return std::isxdigit(c, std::locale()); };

    // error check on the data contents
    if (!std::all_of(std::begin(src), std::end(src), ishex)) {
        throw std::invalid_argument("conversion error: input values are not all xdigits");
    }

    // allocate the result, initialised to 0 and sized to the correct length
    std::vector<std::uint8_t> result(src.length() / 2, 0);

    // run the actual conversion
    auto str = src.begin(); // track the location in the string
    std::for_each(result.begin(), result.end(), [&str](decltype(*result.begin())& element) {
        element = static_cast<std::uint8_t>(std::stoul(std::string(str, str + 2), nullptr, 16));
        std::advance(str, 2); // next two elements
    });

    return result;
}
The template version of the code adds flexibility:
template <typename Int /*= std::uint8_t*/,
    typename Char = char,
    typename Traits = std::char_traits<Char>,
    typename Allocate = std::allocator<Char>,
    typename Locale = std::locale>
std::vector<Int> basic_convert(std::basic_string<Char, Traits, Allocate> const& src, Locale locale = Locale())
{
    using string_type = std::basic_string<Char, Traits, Allocate>;

    auto ishex = [&locale] (decltype(*src.begin()) c) {
        return std::isxdigit(c, locale); };

    if ((src.length() % 2) != 0) {
        throw std::invalid_argument("conversion error: input is not even length");
    }
    if (!std::all_of(std::begin(src), std::end(src), ishex)) {
        throw std::invalid_argument("conversion error: input values are not all xdigits");
    }

    std::vector<Int> result(src.length() / 2, 0);

    auto str = std::begin(src);
    std::for_each(std::begin(result), std::end(result), [&str](decltype(*std::begin(result))& element) {
        element = static_cast<Int>(std::stoul(string_type(str, str + 2), nullptr, 16));
        std::advance(str, 2);
    });

    return result;
}
The convert() function can then be based on basic_convert() as follows:
std::vector<std::uint8_t> convert(std::string const& src)
{
    return basic_convert<std::uint8_t>(src, std::locale());
}
Live sample.
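As a quick usage sketch (assuming the definitions above are in scope), the template version accepts wide strings as well:
#include <cstdio>

int main()
{
    auto narrow = convert("001122AABBCC");
    auto wide = basic_convert<std::uint8_t>(std::wstring(L"001122AABBCC"));
    std::printf("%zu %zu\n", narrow.size(), wide.size()); // prints "6 6"
}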
uint8_t is typically no more than a typedef of unsigned char. If you're reading characters from a file, you should be able to read them into an unsigned char array just as easily as into a signed char array, and an unsigned char array is a uint8_t array.
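For example (a sketch, assuming a hypothetical data.bin input file):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t buf[64];
    FILE* f = fopen("data.bin", "rb");
    if (!f) return 1;
    // fread fills the uint8_t array directly; no signed/unsigned dance needed
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    printf("read %zu bytes\n", n);
    return 0;
}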
I'd try something like this:
std::string starting_str = starting;
uint8_t* ending = new uint8_t[starting_str.length()/2];
for (int i = 0 ; i < starting_str.length() ; i += 2) {
    std::string pair = starting_str.substr(i, 2);
    ending[i/2] = ::strtol(pair.c_str(), 0, 16);
}
Didn't test it but it looks good to me...
You may add your own conversion from the set of chars { '0','1',...,'E','F' } to uint8_t:
uint8_t ctoa(char c)
{
    if( c >= '0' && c <= '9' ) return c - '0';
    else if( c >= 'a' && c <= 'f' ) return 0xA + c - 'a';
    else if( c >= 'A' && c <= 'F' ) return 0xA + c - 'A';
    else return 0;
}
Then it is easy to convert a string into an array:
uint32_t endingSize = strlen(starting)/2;
uint8_t* ending = new uint8_t[endingSize];
for( uint32_t i=0; i<endingSize; i++ )
{
    ending[i] = ( ctoa( starting[i*2] ) << 4 ) + ctoa( starting[i*2+1] );
}
This simple solution should work for your problem:
const char* starting = "001122AABBCC";
uint8_t ending[6];

// This algo will work for any size of starting.
// However, you have to make sure that ending has enough space.
int i = 0;
while (i < strlen(starting))
{
    // copy one pair of hex digits into a small string
    char str[3] = {starting[i], starting[i+1], '\0'};
    // convert the string to int, base 16 (atoi has no base argument)
    ending[i/2] = (uint8_t)strtol(str, NULL, 16);
    i += 2;
}
uint8_t* ending = reinterpret_cast<uint8_t*>(starting);