Is there an alternative to the std::string substring?

Given a string s = "RADILAMIA" I want to take all the substrings of length 4 (or something else).
If len == 4 then the substrings are: "RADI","ADIL","DILA","ILAM","LAMI","AMIA". It seems easy to do that by using the std::string substr method:
vector<string> allSubstr(string s,int len) {
vector<string>ans;
for(size_t i=0;i+len<=s.size();i++) { // i+len<=size avoids unsigned underflow when len > s.size()
ans.push_back(s.substr(i,len));
}
return ans;
}
substr's time complexity is unspecified, but it is generally linear in the length of the substring.
Can I do this without std::string substr? Each substring differs from the previous one only by dropping its first letter and appending one new letter. Is there a better way that reduces the time complexity?

string_view (C++17) has a constant time substr:
vector<string_view> allSubstr(const string_view& s, int len) {
vector<string_view> ans;
ans.reserve(s.size() - len + 1);
for (size_t i = 0 ; i + len <= s.size(); ++i) {
ans.push_back(s.substr(i, len));
}
return ans;
}
Just make sure that s outlives the return value of the function.
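A minimal sketch of what that lifetime rule means in practice, assuming C++17. The name all_windows is hypothetical and not part of the answer above. The views point into the caller's buffer, so the owning std::string has to stay alive and unmodified for as long as the views are used:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Returns every window of `len` characters as a non-owning view into `s`.
// Hypothetical helper: the caller must keep the underlying buffer alive.
std::vector<std::string_view> all_windows(std::string_view s, std::size_t len) {
    assert(len > 0);  // a zero-length window is treated as a caller error here
    std::vector<std::string_view> out;
    if (len > s.size()) return out;  // nothing fits; also avoids unsigned underflow below
    out.reserve(s.size() - len + 1);
    for (std::size_t i = 0; i + len <= s.size(); ++i) {
        out.push_back(s.substr(i, len));  // O(1): just a pointer and a length, no copy
    }
    return out;
}
```

Calling all_windows(std::string("temp"), 2) and keeping the result would leave dangling views, because the temporary string is destroyed at the end of that expression.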

There can be millions of different approaches. Here is my algorithm.
vector<string> allSubstr(string s,int len) {
vector<string>ans;
ans.reserve(s.size() - len + 1); // one slot per substring
for(size_t i=0;i+len<=s.size();i++)
{
ans.emplace_back( s.begin() +i, s.begin() + i + len );
}
return ans;
}
It is tested. Which approach you use hardly matters, but emplace_back above can make a difference because each string is constructed in place from the iterator range, with no temporary to copy or move. The reserve call adds a little more performance by avoiding reallocations.

No matter what you do, you still need O(NL) time to write all your substrings into the vector.
The fastest thing would be probably:
vector<string> ans(s.size()-len+1); // one slot per substring; assumes len <= s.size()
for(size_t i=0;i+len<=s.size();i++) {
ans[i] = s.substr(i, len);
}
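This is because push_back is comparatively slow: every call checks the capacity and may reallocate, so it should generally be avoided when the element count is known up front. It is overused.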
PS: maybe this code would be even faster:
vector<string> ans(s.size()-len+1); // one slot per substring; assumes len <= s.size()
for(size_t i=0;i+len<=s.size();i++) {
ans[i].append(s.begin()+i, s.begin()+i+len);
}

You could probably use an array of chars instead. For example, say this is your word:
char s[] = "RADILAMIA";
To visit every substring you can use an approach like this:
int substLength = 4;
int length = strlen(s);
char buffer[256];
for (int i = 0; i < length - substLength + 1; i++) {
strncpy(buffer, s + i, substLength);
buffer[substLength] = '\0';
cout << buffer << endl;
}
With the char array you can easily reach the start of any substring by adding the index to the beginning of the array.
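The same pointer arithmetic also makes the temporary buffer optional. A small sketch, assuming the input is a null-terminated C string. The name join_windows is a hypothetical helper that appends each window straight from s + i:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Concatenates every window of `len` characters of `s`, separated by spaces.
// Each window is read directly at s + i; no intermediate buffer is filled.
std::string join_windows(const char* s, std::size_t len) {
    assert(s != nullptr);
    const std::size_t n = std::strlen(s);
    std::string out;
    if (len == 0 || len > n) return out;  // no window fits
    for (std::size_t i = 0; i + len <= n; ++i) {
        if (i != 0) out += ' ';
        out.append(s + i, len);  // copies exactly len characters starting at s + i
    }
    return out;
}
```

For "RADILAMIA" and a length of 4 this produces the same six windows as the loop above, just collected into one string instead of printed.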

It pays to revisit the docs:
// string proto(len);
vector<string> result(s.size()-len+1, string(len, char(32))); // preallocates the buffers, one per substring
const char *str=s.c_str();
const char* end=str+s.size()-len+1; // one past the start of the last substring
for(size_t i=0; str<end; str++, i++) {
result[i].assign(str, len); // likely to result in a simple copy in the preallocated buffer
}
The complexity is the same O(len*s.size()) - one can only hope for a smaller proportionality factor.

C is not always faster than C++, but @Fomalhaut was right to post the performant core solution in C. Here is my complete version (a C program), based on his algorithm and written without strncpy.
Here it is on the godbolt.
#ifdef __STDC_ALLOC_LIB__
#define __STDC_WANT_LIB_EXT2__ 1
#else
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <assert.h>
#include <malloc.h>
//////////////////////////////////////////////////////////////
// array of buffers == a_of_b
typedef struct a_of_b {
const unsigned size;
unsigned count ;
char ** data ;
} a_of_b ;
a_of_b a_of_b_make ( const unsigned size_ )
{
return (a_of_b){ .size = size_, .count = 0, .data = calloc(1, sizeof(char * [size_] ) ) } ;
}
a_of_b * a_of_b_append ( a_of_b * self, const unsigned len_, const char str_[len_] )
{
assert( self->data ) ;
assert( self->size > self->count ) ;
self->data[ self->count ] = strndup( str_, len_ ) ;
self->count += 1;
return self ;
}
a_of_b * a_of_b_print ( a_of_b * self , const char * fmt_ )
{
for (unsigned j = 0; j < self->count; ++j)
printf( fmt_ , self->data[j]);
return self ;
}
a_of_b * a_of_b_free ( a_of_b * self )
{
for (unsigned j = 0; j < self->count; ++j)
free( self->data[j]) ;
free( self->data) ;
self->count = 0 ;
return self ;
}
//////////////////////////////////////////////////////////////
a_of_b breakit ( const unsigned len_, const char input_[len_], const unsigned substLength )
{
assert( len_ > 2 ) ;
assert( substLength > 0 ) ;
assert( substLength < len_ ) ;
const unsigned count_of_buffers = len_ - substLength + 1;
a_of_b rez_ = a_of_b_make( count_of_buffers +1 ) ;
for (int i = 0; i < count_of_buffers ; i++) {
a_of_b_append( &rez_, substLength, input_ + i ) ;
}
return rez_ ;
}
//////////////////////////////////////////////////////////////
static void driver( const char * input_, const unsigned substLength )
{
printf("\n");
a_of_b substrings = breakit( strlen(input_), input_, substLength );
a_of_b_print( & substrings , "%s ");
a_of_b_free( & substrings);
}
//////////////////////////////////////////////////////////////
int main () {
driver( "RADILAMIA", 4) ;
driver( "RADILAMIA", 3) ;
driver( "RADILAMIA", 2) ;
driver( "RADILAMIA", 1) ;
return EXIT_SUCCESS;
}
And the program output is:
RADI ADIL DILA ILAM LAMI AMIA
RAD ADI DIL ILA LAM AMI MIA
RA AD DI IL LA AM MI IA
R A D I L A M I A
Enjoy.

Related

How to convert a literal string of hex to actual hex values in C++? [duplicate]

What is the best way to convert a variable-length hex string, e.g. "01A1", to a byte array containing that data?
i.e. converting this:
std::string hex = "01A1";
into this
char* hexArray;
int hexLength;
or this
std::vector<char> hexArray;
so that when I write this to a file and hexdump -C it I get the binary data containing 01A1.

This implementation uses the built-in strtol function to handle the actual conversion from text to bytes, but will work for any even-length hex string.
std::vector<char> HexToBytes(const std::string& hex) {
std::vector<char> bytes;
for (unsigned int i = 0; i < hex.length(); i += 2) {
std::string byteString = hex.substr(i, 2);
char byte = (char) strtol(byteString.c_str(), NULL, 16);
bytes.push_back(byte);
}
return bytes;
}

This ought to work:
int char2int(char input)
{
if(input >= '0' && input <= '9')
return input - '0';
if(input >= 'A' && input <= 'F')
return input - 'A' + 10;
if(input >= 'a' && input <= 'f')
return input - 'a' + 10;
throw std::invalid_argument("Invalid input string");
}
// This function assumes src to be a zero terminated sanitized string with
// an even number of [0-9a-f] characters, and target to be sufficiently large
void hex2bin(const char* src, char* target)
{
while(*src && src[1])
{
*(target++) = char2int(*src)*16 + char2int(src[1]);
src += 2;
}
}
Depending on your specific platform there's probably also a standard implementation though.
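One standard facility that fits here, assuming a C++17 compiler, is std::from_chars from <charconv>. The sketch below is not the answer's code: hex_to_bytes is a hypothetical name, and rejecting odd-length input is a design choice made for this example only.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>
#include <vector>

// Decodes an even-length hex string two characters at a time.
// Returns std::nullopt on odd length or on any non-hex character.
std::optional<std::vector<std::uint8_t>> hex_to_bytes(std::string_view hex) {
    if (hex.size() % 2 != 0) return std::nullopt;
    std::vector<std::uint8_t> out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const char* first = hex.data() + i;
        const char* last = first + 2;
        unsigned int value = 0;
        // Base 16; from_chars accepts no sign, whitespace, or "0x" prefix here.
        const auto [ptr, ec] = std::from_chars(first, last, value, 16);
        if (ec != std::errc{} || ptr != last) return std::nullopt;
        out.push_back(static_cast<std::uint8_t>(value));
    }
    assert(out.size() == hex.size() / 2);  // sanity check: one byte per digit pair
    return out;
}
```

Unlike strtol, std::from_chars does not need a null-terminated string, so no per-byte substr copy is required.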

So for fun, I was curious if I could do this kind of conversion at compile-time. It doesn't have a lot of error checking and was done in VS2015, which doesn't support C++14 constexpr functions yet (thus how HexCharToInt looks). It takes a c-string array, converts pairs of characters into a single byte and expands those bytes into a uniform initialization list used to initialize the T type provided as a template parameter. T could be replaced with something like std::array to automatically return an array.
#include <array>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <utility>
/* Quick and dirty conversion from a single character to its hex equivalent */
constexpr std::uint8_t HexCharToInt(char Input)
{
return
((Input >= 'a') && (Input <= 'f'))
? (Input - 87)
: ((Input >= 'A') && (Input <= 'F'))
? (Input - 55)
: ((Input >= '0') && (Input <= '9'))
? (Input - 48)
: throw std::exception{};
}
/* Position the characters into the appropriate nibble */
constexpr std::uint8_t HexChar(char High, char Low)
{
return (HexCharToInt(High) << 4) | (HexCharToInt(Low));
}
/* Adapter that converts each pair of characters into a single byte and combines the results into a uniform initialization list used to initialize T */
template <typename T, std::size_t Length, std::size_t ... Index>
constexpr T HexString(const char (&Input)[Length], const std::index_sequence<Index...>&)
{
return T{HexChar(Input[(Index * 2)], Input[((Index * 2) + 1)])...};
}
/* Entry function */
template <typename T, std::size_t Length>
constexpr T HexString(const char (&Input)[Length])
{
return HexString<T>(Input, std::make_index_sequence<(Length / 2)>{});
}
// Usage, assuming the functions above are declared inside namespace KS::Utility
constexpr auto Y = KS::Utility::HexString<std::array<std::uint8_t, 3>>("ABCDEF");

You can use boost:
#include <boost/algorithm/hex.hpp>
char bytes[60] = {0};
std::string hash = boost::algorithm::unhex(std::string("313233343536373839"));
std::copy(hash.begin(), hash.end(), bytes);

You said "variable length." Just how variable do you mean?
For hex strings that fit into an unsigned long I have always liked the C function strtoul. To make it convert hex pass 16 as the radix value.
Code might look like:
#include <cstdlib>
std::string str = "01a1";
unsigned long val = strtoul(str.c_str(), 0, 16);
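Passing a null end pointer, as above, hides malformed input. Here is a hedged sketch of the same strtoul call with basic validation added. The wrapper name parse_hex_ulong is hypothetical:

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Parses the whole string as a hex number.
// Returns false on empty input, trailing junk, or overflow of unsigned long.
bool parse_hex_ulong(const std::string& s, unsigned long& out) {
    if (s.empty()) return false;
    errno = 0;
    char* end = nullptr;
    const unsigned long value = std::strtoul(s.c_str(), &end, 16);
    assert(end != nullptr);  // strtoul always sets the end pointer when given one
    if (errno == ERANGE) return false;              // value did not fit
    if (end != s.c_str() + s.size()) return false;  // stopped before the end
    out = value;
    return true;
}
```

Note that strtoul itself still accepts leading whitespace, a sign, and an optional "0x" prefix when the base is 16, so stricter input rules would need an extra check.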

If you want to use OpenSSL to do it, there is a nifty trick I found:
BIGNUM *input = BN_new();
int input_length = BN_hex2bn(&input, argv[2]);
input_length = (input_length + 1) / 2; // BN_hex2bn() returns number of hex digits
unsigned char *input_buffer = (unsigned char*)malloc(input_length);
int retval = BN_bn2bin(input, input_buffer); // number of bytes written
Just be sure to strip off any leading '0x' from the string.

This can be done with a stringstream, you just need to store the value in an intermediate numeric type such as an int:
std::string test = "01A1"; // assuming this is an even length string
std::vector<char> bytes(test.length()/2); // a vector, since variable-length arrays are not standard C++
std::stringstream converter;
for(size_t i = 0; i < test.length(); i+=2)
{
converter << std::hex << test.substr(i,2);
int byte;
converter >> byte;
bytes[i/2] = byte & 0xFF;
converter.str(std::string());
converter.clear();
}

Somebody mentioned using sscanf to do this, but didn't say how. This is how. It's useful because it also works in ancient versions of C and C++ and even most versions of embedded C or C++ for microcontrollers.
When converted to bytes, the hex-string in this example resolves to the ASCII text "Hello there!" which is then printed.
#include <stdio.h>
int main ()
{
char hexdata[] = "48656c6c6f20746865726521";
char bytedata[20]{};
for(int j = 0; j < sizeof(hexdata) / 2; j++) {
sscanf(hexdata + j * 2, "%02hhX", bytedata + j);
}
printf ("%s -> %s\n", hexdata, bytedata);
return 0;
}

I would use a standard function like sscanf to read the string into an unsigned integer, and then you already have the bytes you need in memory. If you were on a big endian machine you could just write out (memcpy) the memory of the integer from the first non-zero byte. However you can't safely assume this in general, so you can use some bit masking and shifting to get the bytes out.
const char* src = "01A1";
char hexArray[256] = {0};
int hexLength = 0;
// read in the string
unsigned int hex = 0;
sscanf(src, "%x", &hex);
// write it out
for (unsigned int mask = 0xff000000, bitPos=24; mask; mask>>=8, bitPos-=8) {
unsigned int currByte = hex & mask;
if (currByte || hexLength) {
hexArray[hexLength++] = currByte>>bitPos;
}
}

C++11 variant (with gcc 4.7 - little endian format):
#include <cstdint>
#include <string>
#include <vector>
std::vector<uint8_t> decodeHex(const std::string & source)
{
if ( std::string::npos != source.find_first_not_of("0123456789ABCDEFabcdef") )
{
// you can throw exception here
return {};
}
union
{
uint64_t binary;
char byte[8];
} value{};
auto size = source.size(), offset = (size % 16);
std::vector<uint8_t> binary{};
binary.reserve((size + 1) / 2);
if ( offset )
{
value.binary = std::stoull(source.substr(0, offset), nullptr, 16);
for ( auto index = (offset + 1) / 2; index--; )
{
binary.emplace_back(value.byte[index]);
}
}
for ( ; offset < size; offset += 16 )
{
value.binary = std::stoull(source.substr(offset, 16), nullptr, 16);
for ( auto index = 8; index--; )
{
binary.emplace_back(value.byte[index]);
}
}
return binary;
}
Crypto++ variant (with gcc 4.7):
#include <string>
#include <vector>
#include <crypto++/filters.h>
#include <crypto++/hex.h>
std::vector<unsigned char> decodeHex(const std::string & source)
{
std::string hexCode;
CryptoPP::StringSource(
source, true,
new CryptoPP::HexDecoder(new CryptoPP::StringSink(hexCode)));
return std::vector<unsigned char>(hexCode.begin(), hexCode.end());
}
Note that the first variant is about two times faster than the second one and works with both odd and even numbers of nibbles (the result of "a56ac" is {0x0a, 0x56, 0xac}). Crypto++ discards the last nibble if there is an odd number of nibbles (the result of "a56ac" is {0xa5, 0x6a}) and silently skips invalid hex characters (the result of "a5sac" is {0xa5, 0xac}).

#include <iostream>
#include <sstream>
#include <vector>
int main() {
std::string s("313233");
char delim = ',';
int len = s.size();
for(int i = 2; i < len; i += 3, ++len) s.insert(i, 1, delim);
std::istringstream is(s);
std::ostringstream os;
is >> std::hex;
int n;
while (is >> n) {
char c = (char)n;
os << std::string(&c, 1);
if(is.peek() == delim) is.ignore();
}
// std::string form
std::string byte_string = os.str();
std::cout << byte_string << std::endl;
printf("%s\n", byte_string.c_str());
// std::vector form
std::vector<char> byte_vector(byte_string.begin(), byte_string.end());
byte_vector.push_back('\0'); // needed for a c-string
printf("%s\n", byte_vector.data());
}
The output is
123
123
123
'1' == 0x31, etc.

If your goal is speed, I have an AVX2 SIMD implementation of an encoder and decoder here: https://github.com/zbjornson/fast-hex. These benchmark ~12x faster than the fastest scalar implementations.

#include <iostream>
using byte = unsigned char;
static int charToInt(char c) {
if (c >= '0' && c <= '9') {
return c - '0';
}
if (c >= 'A' && c <= 'F') {
return c - 'A' + 10;
}
if (c >= 'a' && c <= 'f') {
return c - 'a' + 10;
}
return -1;
}
// Decodes specified HEX string to bytes array. Specified nBytes is length of bytes
// array. Returns -1 if fails to decode any of bytes. Returns number of bytes decoded
// on success. Maximum number of bytes decoded will be equal to nBytes. It is assumed
// that specified string is '\0' terminated.
int hexStringToBytes(const char* str, byte* bytes, int nBytes) {
int nDecoded {0};
for (int i {0}; str[i] != '\0' && nDecoded < nBytes; i += 2, nDecoded += 1) {
if (str[i + 1] != '\0') {
int m {charToInt(str[i])};
int n {charToInt(str[i + 1])};
if (m != -1 && n != -1) {
bytes[nDecoded] = (m << 4) | n;
} else {
return -1;
}
} else {
return -1;
}
}
return nDecoded;
}
int main(int argc, char* argv[]) {
if (argc < 2) {
return 1;
}
byte bytes[0x100];
int ret {hexStringToBytes(argv[1], bytes, 0x100)};
if (ret < 0) {
return 1;
}
std::cout << "number of bytes: " << ret << "\n" << std::hex;
for (int i {0}; i < ret; ++i) {
if (bytes[i] < 0x10) {
std::cout << "0";
}
std::cout << (bytes[i] & 0xff);
}
std::cout << "\n";
return 0;
}

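I've modified TheoretiCAL's code:
uint8_t buf[32] = {};
std::string hex = "0123";
if (hex.length() % 2)
hex = "0" + hex; // left-pad odd-length input
for (size_t i = 0; i < sizeof(buf) && 2 * i < hex.length(); i++) {
// Extract through an unsigned int: operator>> into a uint8_t reads a single character, not a number
std::stringstream stream(hex.substr(2 * i, 2));
unsigned int byte = 0;
stream >> std::hex >> byte;
buf[i] = static_cast<uint8_t>(byte);
}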

How I do this at compile time:
#pragma once
#include <memory>
#include <iostream>
#include <string>
#include <array>
#define DELIMITING_WILDCARD ' '
// #sean :)
constexpr int _char_to_int( char ch )
{
if( ch >= '0' && ch <= '9' )
return ch - '0';
if( ch >= 'A' && ch <= 'F' )
return ch - 'A' + 10;
return ch - 'a' + 10;
};
template <char wildcard, typename T, size_t N = sizeof( T )>
constexpr size_t _count_wildcard( T &&str )
{
size_t count = 1u;
for( const auto &character : str )
{
if( character == wildcard )
{
++count;
}
}
return count;
}
// construct a base16 hex and emplace it at make_count
// change 16 to 256 if you want the result to be:
// sig[0] == 0xA && sig[1] == 0xB = 0xA0B
// or leave as is for the scenario to return 0xAB
#define CONCATE_HEX_FACTOR 16
#define CONCATE_HEX(a, b) ( CONCATE_HEX_FACTOR * ( a ) + ( b ) )
template
< char skip_wildcard,
// How many occurrences of a delimiting wildcard do we find in sig
size_t delimiter_count,
typename T, size_t N = sizeof( T )>
constexpr auto _make_array( T &&sig )
{
static_assert( delimiter_count > 0, "this is a logical error, delimiter count can't be of size 0" );
static_assert( N > 1, "sig length must be bigger than 1" );
// Resulting byte array, for delimiter_count skips we should have delimiter_count integers
std::array<int, delimiter_count> ret{};
// List of skips that point to the position of the delimiter wildcard in skip
std::array<size_t, delimiter_count> skips{};
// Current skip
size_t skip_count = 0u;
// Character count, traversed for skip
size_t skip_traversed_character_count = 0u;
for( size_t i = 0u; i < N; ++i )
{
if( sig[i] == DELIMITING_WILDCARD )
{
skips[skip_count] = skip_traversed_character_count;
++skip_count;
}
++skip_traversed_character_count;
}
// Finally traversed character count
size_t traversed_character_count = 0u;
// Make count (we will supposedly have at least an instance in our return array)
size_t make_count = 1u;
// Traverse signature
for( size_t i = 0u; i < N; ++i )
{
// Read before
if( i == 0u )
{
// We don't care about this, and we don't want to use 0
if( sig[0u] == skip_wildcard )
{
ret[0u] = -1;
continue;
}
ret[0u] = CONCATE_HEX( _char_to_int( sig[0u] ), _char_to_int( sig[1u] ) );
continue;
}
// Make result by skip data
for( const auto &skip : skips )
{
if( ( skip == i ) && skip < N - 1u )
{
// We don't care about this, and we don't want to use 0
if( sig[i + 1u] == skip_wildcard )
{
ret[make_count] = -1;
++make_count;
continue;
}
ret[make_count] = CONCATE_HEX( _char_to_int( sig[i + 1u] ), _char_to_int( sig[i + 2u] ) );
++make_count;
}
}
}
return ret;
}
#define SKIP_WILDCARD '?'
#define BUILD_ARRAY(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( a )>( a )
#define BUILD_ARRAY_MV(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( std::move( a ) )>( std::move( a ) )
// -----
// usage
// -----
template <int n>
constexpr int combine_two()
{
constexpr auto numbers = BUILD_ARRAY( "55 8B EC 83 E4 F8 8B 4D 08 BA ? ? ? ? E8 ? ? ? ? 85 C0 75 12 ?" );
constexpr int number = numbers[0];
constexpr int number_now = n + number;
return number_now;
}
int main()
{
constexpr auto shit = BUILD_ARRAY( "?? AA BB CC DD ? ? ? 02 31 32" );
for( const auto &hex : shit )
{
printf( "%x ", hex );
}
combine_two<3>();
constexpr auto saaahhah = combine_two<3>();
static_assert( combine_two<3>() == 88 );
static_assert( combine_two<3>() == saaahhah );
printf( "\n%d", saaahhah );
}
This method can be used at runtime too, but there you would probably prefer something faster.

This may be useful to someone: the logic for translating a set of bytes into a hex string and back. It solves the zero character problem.
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
std::string BytesToHex(const std::vector<char>& data, size_t len)
{
std::stringstream ss;
ss << std::hex << std::setfill('0');
for(size_t index(0); index < len; ++index)
{
// Go through unsigned char so that bytes >= 0x80 are not sign-extended to four digits
ss << std::setw(2) << static_cast<unsigned short>(static_cast<unsigned char>(data[index]));
}
return ss.str();
}
std::vector<char> HexToBytes(const std::string& data)
{
std::vector<char> resBytes;
resBytes.reserve(data.size() / 2);
// Consume two characters at a time; a trailing odd character is ignored
for(size_t count = 0; count + 1 < data.size(); count += 2)
{
unsigned short num = 0;
char hexNum[3] = { data[count], data[count + 1], '\0' }; // sscanf needs a terminated string
sscanf(hexNum, "%2hx", &num);
resBytes.push_back(static_cast<char>(num));
}
return resBytes;
}

If you can make your data look like this, e.g. an array of "0x01", "0xA1",
then you can iterate over the array and use sscanf to build the array of values:
unsigned int result;
sscanf(data, "%x", &result);

The difficulty in a hex to char conversion is that the hex digits work pairwise, for example 3132 or A0FF, so an even number of hex digits is assumed. However, an odd number of digits could be perfectly valid, as in 332 and AFF, which should be understood as 0332 and 0AFF.
I propose an improvement to Niels Keurentjes' hex2bin() function.
First we count the number of valid hex digits. As we have to count anyway, let's also check the buffer size:
void hex2bin(const char* src, char* target, size_t size_target)
{
int countdgts=0; // count hex digits
for (const char *p=src; *p && isxdigit(*p); p++)
countdgts++;
if (static_cast<size_t>((countdgts+1)/2+1) > size_target)
throw std::length_error("Risk of buffer overflow");
By the way, to use isxdigit() you'll have to #include <cctype>, and std::length_error needs #include <stdexcept>.
Once we know how many digits, we can determine if the first one is the higher digit (only pairs) or not (first digit not a pair).
bool ishi = !(countdgts%2);
Then we can loop digit by digit, combining each pair using a binary shift (<<) and a binary or, and toggling the 'high' indicator at each iteration:
for (*target=0; *src; ishi = !ishi) {
char tmp = char2int(*src++); // hex digit on 4 lower bits
if (ishi)
*target = (tmp << 4); // high: shift by 4
else *target++ |= tmp; // low: complete previous
}
*target=0; // null terminated target (if desired)
}
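Putting the pieces above together, here is a compact sketch of the same high/low toggling idea. It assumes a std::string input and a std::vector output instead of raw buffers, so the buffer-size check is not needed. The names hex_digit_value and hex2bin_padded are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Value of one hex digit; throws on anything else.
int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Decodes hex of any length; odd-length input is read as if it had a leading '0'.
std::vector<unsigned char> hex2bin_padded(const std::string& src) {
    std::vector<unsigned char> out;
    out.reserve((src.size() + 1) / 2);
    bool is_high = (src.size() % 2 == 0);  // odd length: first digit is a low nibble
    unsigned char current = 0;
    for (char c : src) {
        const int v = hex_digit_value(c);
        if (is_high) {
            current = static_cast<unsigned char>(v << 4);  // high nibble: shift by 4
        } else {
            out.push_back(static_cast<unsigned char>(current | v));  // low nibble completes the byte
            current = 0;
        }
        is_high = !is_high;
    }
    assert(is_high);  // every started byte was completed
    return out;
}
```

With this sketch "332" decodes to the two bytes 0x03 0x32, matching the 0332 reading described above.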

I found this question, but the accepted answer didn't look like a C++ way of solving the task to me (this doesn't mean it's a bad answer or anything, just explaining motivation behind adding this one). I recollected this nice answer and decided to implement something similar. Here is complete code of what I ended up with (it also works for std::wstring):
#include <cctype>
#include <cstdlib>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <ostream>
#include <stdexcept>
#include <string>
#include <vector>
template <typename OutputIt>
class hex_ostream_iterator :
public std::iterator<std::output_iterator_tag, void, void, void, void>
{
OutputIt out;
int digitCount;
int number;
public:
hex_ostream_iterator(OutputIt out) : out(out), digitCount(0), number(0)
{
}
hex_ostream_iterator<OutputIt> &
operator=(char c)
{
number = (number << 4) | char2int(c);
digitCount++;
if (digitCount == 2) {
digitCount = 0;
*out++ = number;
number = 0;
}
return *this;
}
hex_ostream_iterator<OutputIt> &
operator*()
{
return *this;
}
hex_ostream_iterator<OutputIt> &
operator++()
{
return *this;
}
hex_ostream_iterator<OutputIt> &
operator++(int)
{
return *this;
}
private:
int
char2int(char c)
{
static const std::string HEX_CHARS = "0123456789abcdef";
const char lowerC = std::tolower(c);
const std::string::size_type pos = HEX_CHARS.find_first_of(lowerC);
if (pos == std::string::npos) {
throw std::runtime_error(std::string("Not a hex digit: ") + c);
}
return pos;
}
};
template <typename OutputIt>
hex_ostream_iterator<OutputIt>
hex_iterator(OutputIt out)
{
return hex_ostream_iterator<OutputIt>(out);
}
template <typename InputIt, typename OutputIt>
hex_ostream_iterator<OutputIt>
from_hex_string(InputIt first, InputIt last, OutputIt out)
{
if (std::distance(first, last) % 2 == 1) {
*out = '0';
++out;
}
return std::copy(first, last, out);
}
int
main(int argc, char *argv[])
{
if (argc != 2) {
std::cout << "Usage: " << argv[0] << " hexstring" << std::endl;
return EXIT_FAILURE;
}
const std::string input = argv[1];
std::vector<unsigned char> bytes;
from_hex_string(input.begin(), input.end(),
hex_iterator(std::back_inserter(bytes)));
typedef std::ostream_iterator<unsigned char> osit;
std::copy(bytes.begin(), bytes.end(), osit(std::cout));
return EXIT_SUCCESS;
}
And the output of ./hex2bytes 61a062a063 | hexdump -C:
00000000 61 a0 62 a0 63 |a.b.c|
00000005
And of ./hex2bytes 6a062a063 | hexdump -C (note odd number of characters):
00000000 06 a0 62 a0 63 |..b.c|
00000005

In: "303132", Out: "012". Input string can be odd or even length.
char char2int(char input)
{
if (input >= '0' && input <= '9')
return input - '0';
if (input >= 'A' && input <= 'F')
return input - 'A' + 10;
if (input >= 'a' && input <= 'f')
return input - 'a' + 10;
throw std::runtime_error("Incorrect symbol in hex string");
};
string hex2str(string &hex)
{
string out;
out.resize(hex.size() / 2 + hex.size() % 2);
string::iterator it = hex.begin();
string::iterator out_it = out.begin();
if (hex.size() % 2 != 0) {
*out_it++ = char(char2int(*it++));
}
for (; hex.end() - it >= 2; it += 2) {
*out_it++ = char(char2int(it[0]) << 4 | char2int(it[1])); // read both digits, then advance; no unsequenced it++
}
return out;
}

Very similar to some of the other answers here, this is what I went with:
typedef uint8_t BYTE;
BYTE* ByteUtils::HexStringToBytes(BYTE* HexString, int ArrayLength)
{
BYTE* returnBytes;
returnBytes = (BYTE*) malloc(ArrayLength/2);
int j=0;
for(int i = 0; i < ArrayLength; i++)
{
if(i % 2 == 0)
{
int valueHigh = (int)(*(HexString+i));
int valueLow = (int)(*(HexString+i+1));
valueHigh = ByteUtils::HexAsciiToDec(valueHigh);
valueLow = ByteUtils::HexAsciiToDec(valueLow);
valueHigh *= 16;
int total = valueHigh + valueLow;
*(returnBytes+j++) = (BYTE)total;
}
}
return returnBytes;
}
int ByteUtils::HexAsciiToDec(int value)
{
if(value > 47 && value < 58) // '0'..'9'
{
value -= 48;
}
else if(value > 96 && value < 103)
{
value -= 97;
value += 10;
}
else if(value > 64 && value < 71)
{
value -= 65;
value += 10;
}
else
{
value = 0;
}
return value;
}

static bool Hexadec2xdigit(const std::string& data, std::string& buffer, std::size_t offset = sizeof(uint16_t))
{
if (data.empty())
{
return false;
}
try
{
constexpr auto s_function_lambda = [] (const char* string) noexcept { return *static_cast<const uint16_t*>(reinterpret_cast<const uint16_t*>(string)); };
{
for (std::size_t i = 0, tmp = s_function_lambda(data.c_str() + i); i < data.size(); i += offset, tmp = s_function_lambda(data.c_str() + i))
{
if (std::isxdigit(data[i]))
{
buffer += static_cast<char>(/*std::stoul*/std::strtoul(reinterpret_cast<const char*>(std::addressof(tmp)), NULL, 16));
}
}
}
return true;
}
catch (const std::invalid_argument& ex)
{
}
catch (const std::out_of_range& ex)
{
}
return false;
}
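This code does very little copying. Note, however, that it relies on a little-endian byte order and on reading a uint16_t through a char pointer, so it is not portable.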

Building a char array with hex bytes from string values [duplicate]

What is the best way to convert a variable length hex string e.g. "01A1" to a byte array containing that data.
i.e converting this:
std::string = "01A1";
into this
char* hexArray;
int hexLength;
or this
std::vector<char> hexArray;
so that when I write this to a file and hexdump -C it I get the binary data containing 01A1.

This implementation uses the built-in strtol function to handle the actual conversion from text to bytes, but will work for any even-length hex string.
std::vector<char> HexToBytes(const std::string& hex) {
std::vector<char> bytes;
for (unsigned int i = 0; i < hex.length(); i += 2) {
std::string byteString = hex.substr(i, 2);
char byte = (char) strtol(byteString.c_str(), NULL, 16);
bytes.push_back(byte);
}
return bytes;
}

This ought to work:
int char2int(char input)
{
if(input >= '0' && input <= '9')
return input - '0';
if(input >= 'A' && input <= 'F')
return input - 'A' + 10;
if(input >= 'a' && input <= 'f')
return input - 'a' + 10;
throw std::invalid_argument("Invalid input string");
}
// This function assumes src to be a zero terminated sanitized string with
// an even number of [0-9a-f] characters, and target to be sufficiently large
void hex2bin(const char* src, char* target)
{
while(*src && src[1])
{
*(target++) = char2int(*src)*16 + char2int(src[1]);
src += 2;
}
}
Depending on your specific platform there's probably also a standard implementation though.

So for fun, I was curious if I could do this kind of conversion at compile-time. It doesn't have a lot of error checking and was done in VS2015, which doesn't support C++14 constexpr functions yet (thus how HexCharToInt looks). It takes a c-string array, converts pairs of characters into a single byte and expands those bytes into a uniform initialization list used to initialize the T type provided as a template parameter. T could be replaced with something like std::array to automatically return an array.
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <utility>
/* Quick and dirty conversion from a single character to its hex equivelent */
constexpr std::uint8_t HexCharToInt(char Input)
{
return
((Input >= 'a') && (Input <= 'f'))
? (Input - 87)
: ((Input >= 'A') && (Input <= 'F'))
? (Input - 55)
: ((Input >= '0') && (Input <= '9'))
? (Input - 48)
: throw std::exception{};
}
/* Position the characters into the appropriate nibble */
constexpr std::uint8_t HexChar(char High, char Low)
{
return (HexCharToInt(High) << 4) | (HexCharToInt(Low));
}
/* Adapter that performs sets of 2 characters into a single byte and combine the results into a uniform initialization list used to initialize T */
template <typename T, std::size_t Length, std::size_t ... Index>
constexpr T HexString(const char (&Input)[Length], const std::index_sequence<Index...>&)
{
return T{HexChar(Input[(Index * 2)], Input[((Index * 2) + 1)])...};
}
/* Entry function */
template <typename T, std::size_t Length>
constexpr T HexString(const char (&Input)[Length])
{
return HexString<T>(Input, std::make_index_sequence<(Length / 2)>{});
}
constexpr auto Y = KS::Utility::HexString<std::array<std::uint8_t, 3>>("ABCDEF");

You can use boost:
#include <boost/algorithm/hex.hpp>
char bytes[60] = {0};
std::string hash = boost::algorithm::unhex(std::string("313233343536373839"));
std::copy(hash.begin(), hash.end(), bytes);

You said "variable length." Just how variable do you mean?
For hex strings that fit into an unsigned long I have always liked the C function strtoul. To make it convert hex pass 16 as the radix value.
Code might look like:
#include <cstdlib>
std::string str = "01a1";
unsigned long val = strtoul(str.c_str(), 0, 16);

If you want to use OpenSSL to do it, there is a nifty trick I found:
BIGNUM *input = BN_new();
int input_length = BN_hex2bn(&input, argv[2]);
input_length = (input_length + 1) / 2; // BN_hex2bn() returns number of hex digits
unsigned char *input_buffer = (unsigned char*)malloc(input_length);
retval = BN_bn2bin(input, input_buffer);
Just be sure to strip off any leading '0x' to the string.

This can be done with a stringstream, you just need to store the value in an intermediate numeric type such as an int:
std::string test = "01A1"; // assuming this is an even length string
char bytes[test.length()/2];
stringstream converter;
for(int i = 0; i < test.length(); i+=2)
{
converter << std::hex << test.substr(i,2);
int byte;
converter >> byte;
bytes[i/2] = byte & 0xFF;
converter.str(std::string());
converter.clear();
}

Somebody mentioned using sscanf to do this, but didn't say how. This is how. It's useful because it also works in ancient versions of C and C++ and even most versions of embedded C or C++ for microcontrollers.
When converted to bytes, the hex-string in this example resolves to the ASCII text "Hello there!" which is then printed.
#include <stdio.h>
int main ()
{
char hexdata[] = "48656c6c6f20746865726521";
char bytedata[20]{};
for(int j = 0; j < sizeof(hexdata) / 2; j++) {
sscanf(hexdata + j * 2, "%02hhX", bytedata + j);
}
printf ("%s -> %s\n", hexdata, bytedata);
return 0;
}

I would use a standard function like sscanf to read the string into an unsigned integer, and then you already have the bytes you need in memory. If you were on a big endian machine you could just write out (memcpy) the memory of the integer from the first non-zero byte. However you can't safely assume this in general, so you can use some bit masking and shifting to get the bytes out.
const char* src = "01A1";
char hexArray[256] = {0};
int hexLength = 0;
// read in the string
unsigned int hex = 0;
sscanf(src, "%x", &hex);
// write it out
for (unsigned int mask = 0xff000000, bitPos=24; mask; mask>>=8, bitPos-=8) {
unsigned int currByte = hex & mask;
if (currByte || hexLength) {
hexArray[hexLength++] = currByte>>bitPos;
}
}

C++11 variant (with gcc 4.7 - little endian format):
#include <string>
#include <vector>
std::vector<uint8_t> decodeHex(const std::string & source)
{
if ( std::string::npos != source.find_first_not_of("0123456789ABCDEFabcdef") )
{
// you can throw exception here
return {};
}
union
{
uint64_t binary;
char byte[8];
} value{};
auto size = source.size(), offset = (size % 16);
std::vector<uint8_t> binary{};
binary.reserve((size + 1) / 2);
if ( offset )
{
value.binary = std::stoull(source.substr(0, offset), nullptr, 16);
for ( auto index = (offset + 1) / 2; index--; )
{
binary.emplace_back(value.byte[index]);
}
}
for ( ; offset < size; offset += 16 )
{
value.binary = std::stoull(source.substr(offset, 16), nullptr, 16);
for ( auto index = 8; index--; )
{
binary.emplace_back(value.byte[index]);
}
}
return binary;
}
Crypto++ variant (with gcc 4.7):
#include <string>
#include <vector>
#include <crypto++/filters.h>
#include <crypto++/hex.h>
std::vector<unsigned char> decodeHex(const std::string & source)
{
std::string hexCode;
CryptoPP::StringSource(
source, true,
new CryptoPP::HexDecoder(new CryptoPP::StringSink(hexCode)));
return std::vector<unsigned char>(hexCode.begin(), hexCode.end());
}
Note that the first variant is about twice as fast as the second one and also works with both odd and even numbers of nibbles (the result of "a56ac" is {0x0a, 0x56, 0xac}). Crypto++ discards the last nibble if there is an odd number of them (the result of "a56ac" is {0xa5, 0x6a}) and silently skips invalid hex characters (the result of "a5sac" is {0xa5, 0xac}).
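The odd-length behaviour of the first variant (a lone leading nibble becomes its own byte) can also be obtained without relying on byte order. The following is a sketch under my own assumptions: decodeHexPortable and nibbleValue are invented names, and invalid characters raise an exception rather than being skipped.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit, throws on anything else.
static int nibbleValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Decodes digit by digit, so the result does not depend on endianness.
// An odd number of digits is treated as if a leading '0' were present.
std::vector<std::uint8_t> decodeHexPortable(const std::string& source)
{
    std::vector<std::uint8_t> binary;
    binary.reserve((source.size() + 1) / 2);
    std::size_t i = 0;
    if (source.size() % 2 != 0)
        binary.push_back(static_cast<std::uint8_t>(nibbleValue(source[i++])));
    for (; i < source.size(); i += 2)
        binary.push_back(static_cast<std::uint8_t>(
            (nibbleValue(source[i]) << 4) | nibbleValue(source[i + 1])));
    return binary;
}
```

This is likely slower than converting 16 digits at a time with std::stoull, but it avoids the union and behaves the same on big-endian and little-endian machines.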

#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
int main() {
std::string s("313233");
char delim = ',';
int len = s.size();
for(int i = 2; i < len; i += 3, ++len) s.insert(i, 1, delim);
std::istringstream is(s);
std::ostringstream os;
is >> std::hex;
int n;
while (is >> n) {
char c = (char)n;
os << std::string(&c, 1);
if(is.peek() == delim) is.ignore();
}
// std::string form
std::string byte_string = os.str();
std::cout << byte_string << std::endl;
printf("%s\n", byte_string.c_str());
// std::vector form
std::vector<char> byte_vector(byte_string.begin(), byte_string.end());
byte_vector.push_back('\0'); // needed for a c-string
printf("%s\n", byte_vector.data());
}
The output is
123
123
123
'1' == 0x31, etc.

If your goal is speed, I have an AVX2 SIMD implementation of an encoder and decoder here: https://github.com/zbjornson/fast-hex. These benchmark ~12x faster than the fastest scalar implementations.

#include <iostream>
using byte = unsigned char;
static int charToInt(char c) {
if (c >= '0' && c <= '9') {
return c - '0';
}
if (c >= 'A' && c <= 'F') {
return c - 'A' + 10;
}
if (c >= 'a' && c <= 'f') {
return c - 'a' + 10;
}
return -1;
}
// Decodes specified HEX string to bytes array. Specified nBytes is length of bytes
// array. Returns -1 if fails to decode any of bytes. Returns number of bytes decoded
// on success. Maximum number of bytes decoded will be equal to nBytes. It is assumed
// that specified string is '\0' terminated.
int hexStringToBytes(const char* str, byte* bytes, int nBytes) {
int nDecoded {0};
for (int i {0}; str[i] != '\0' && nDecoded < nBytes; i += 2, nDecoded += 1) {
if (str[i + 1] != '\0') {
int m {charToInt(str[i])};
int n {charToInt(str[i + 1])};
if (m != -1 && n != -1) {
bytes[nDecoded] = (m << 4) | n;
} else {
return -1;
}
} else {
return -1;
}
}
return nDecoded;
}
int main(int argc, char* argv[]) {
if (argc < 2) {
return 1;
}
byte bytes[0x100];
int ret {hexStringToBytes(argv[1], bytes, 0x100)};
if (ret < 0) {
return 1;
}
std::cout << "number of bytes: " << ret << "\n" << std::hex;
for (int i {0}; i < ret; ++i) {
if (bytes[i] < 0x10) {
std::cout << "0";
}
std::cout << (bytes[i] & 0xff);
}
std::cout << "\n";
return 0;
}

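I've modified TheoretiCAL's code:
#include <cstdint>
#include <string>
uint8_t buf[32] = {};
std::string hex = "0123";
while (hex.length() % 2)
    hex = "0" + hex; // pad to an even number of digits
// operator>> on a uint8_t extracts a single character, not a hex number,
// so each pair of digits is converted explicitly instead
for (size_t i = 0; i < sizeof(buf) && 2 * i < hex.length(); i++)
    buf[i] = static_cast<uint8_t>(std::stoul(hex.substr(2 * i, 2), nullptr, 16));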

How I do this at compile time:
#pragma once
#include <array>
#include <cstdio>
#include <iostream>
#include <memory>
#include <string>
#define DELIMITING_WILDCARD ' '
// #sean :)
constexpr int _char_to_int( char ch )
{
if( ch >= '0' && ch <= '9' )
return ch - '0';
if( ch >= 'A' && ch <= 'F' )
return ch - 'A' + 10;
return ch - 'a' + 10;
};
template <char wildcard, typename T, size_t N = sizeof( T )>
constexpr size_t _count_wildcard( T &&str )
{
size_t count = 1u;
for( const auto &character : str )
{
if( character == wildcard )
{
++count;
}
}
return count;
}
// construct a base16 hex and emplace it at make_count
// change 16 to 256 if u want the result to be when:
// sig[0] == 0xA && sig[1] == 0xB = 0xA0B
// or leave as is for the scenario to return 0xAB
#define CONCATE_HEX_FACTOR 16
#define CONCATE_HEX(a, b) ( CONCATE_HEX_FACTOR * ( a ) + ( b ) )
template
< char skip_wildcard,
// How many occurances of a delimiting wildcard do we find in sig
size_t delimiter_count,
typename T, size_t N = sizeof( T )>
constexpr auto _make_array( T &&sig )
{
static_assert( delimiter_count > 0, "this is a logical error, delimiter count can't be of size 0" );
static_assert( N > 1, "sig length must be bigger than 1" );
// Resulting byte array, for delimiter_count skips we should have delimiter_count integers
std::array<int, delimiter_count> ret{};
// List of skips that point to the position of the delimiter wildcard in skip
std::array<size_t, delimiter_count> skips{};
// Current skip
size_t skip_count = 0u;
// Character count, traversed for skip
size_t skip_traversed_character_count = 0u;
for( size_t i = 0u; i < N; ++i )
{
if( sig[i] == DELIMITING_WILDCARD )
{
skips[skip_count] = skip_traversed_character_count;
++skip_count;
}
++skip_traversed_character_count;
}
// Finally traversed character count
size_t traversed_character_count = 0u;
// Make count (we will supposedly have at least an instance in our return array)
size_t make_count = 1u;
// Traverse signature
for( size_t i = 0u; i < N; ++i )
{
// Read before
if( i == 0u )
{
// We don't care about this, and we don't want to use 0
if( sig[0u] == skip_wildcard )
{
ret[0u] = -1;
continue;
}
ret[0u] = CONCATE_HEX( _char_to_int( sig[0u] ), _char_to_int( sig[1u] ) );
continue;
}
// Make result by skip data
for( const auto &skip : skips )
{
if( ( skip == i ) && skip < N - 1u )
{
// We don't care about this, and we don't want to use 0
if( sig[i + 1u] == skip_wildcard )
{
ret[make_count] = -1;
++make_count;
continue;
}
ret[make_count] = CONCATE_HEX( _char_to_int( sig[i + 1u] ), _char_to_int( sig[i + 2u] ) );
++make_count;
}
}
}
return ret;
}
#define SKIP_WILDCARD '?'
#define BUILD_ARRAY(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( a )>( a )
#define BUILD_ARRAY_MV(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( std::move( a ) )>( std::move( a ) )
// -----
// usage
// -----
template <int n>
constexpr int combine_two()
{
constexpr auto numbers = BUILD_ARRAY( "55 8B EC 83 E4 F8 8B 4D 08 BA ? ? ? ? E8 ? ? ? ? 85 C0 75 12 ?" );
constexpr int number = numbers[0];
constexpr int number_now = n + number;
return number_now;
}
int main()
{
constexpr auto signature = BUILD_ARRAY( "?? AA BB CC DD ? ? ? 02 31 32" );
for( const auto &hex : signature )
{
printf( "%x ", hex );
}
combine_two<3>();
constexpr auto saaahhah = combine_two<3>();
static_assert( combine_two<3>() == 88 );
static_assert( combine_two<3>() == saaahhah );
printf( "\n%d", saaahhah );
}
This method can also be used at run time, but there you would probably prefer something faster.

This may be useful to someone: the logic for translating a set of bytes into a hex string and back. It also handles bytes that are zero.
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
std::string BytesToHex(const std::vector<char>& data, size_t len)
{
    std::stringstream ss;
    ss << std::hex << std::setfill('0');
    for(size_t index(0); index < len; ++index)
    {
        // convert through unsigned char so bytes above 0x7F do not sign-extend
        ss << std::setw(2) << static_cast<unsigned short>(static_cast<unsigned char>(data[index]));
    }
    return ss.str();
}
std::vector<char> HexToBytes(const std::string& data)
{
    std::stringstream ss;
    ss << data;
    std::vector<char> resBytes;
    size_t count = 0;
    const auto len = data.size();
    while(ss.good() && count < len)
    {
        unsigned short num = 0;
        char hexNum[3] = {}; // two digits plus the terminator that sscanf needs
        ss.read(hexNum, 2);
        sscanf(hexNum, "%2hX", &num);
        resBytes.push_back(static_cast<char>(num));
        count += 2;
    }
    return resBytes;
}
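For the encoding direction, a lookup table is a common alternative to a stringstream. The sketch below is not part of the answer above; bytesToHexTable is a made-up name, and the output is lowercase by my own choice.

```cpp
#include <string>
#include <vector>

// Hypothetical encoder: two lowercase hex digits per byte, via a lookup table.
std::string bytesToHexTable(const std::vector<char>& data)
{
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    hex.reserve(data.size() * 2);
    for (char c : data)
    {
        // go through unsigned char so negative values do not sign-extend
        const unsigned char byte = static_cast<unsigned char>(c);
        hex.push_back(digits[byte >> 4]);
        hex.push_back(digits[byte & 0x0F]);
    }
    return hex;
}
```

Because the length comes from the vector rather than from a terminator, zero bytes inside the data are encoded like any other byte.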

If you can make your data look like this, e.g. an array of "0x01", "0xA1",
then you can iterate over the array and use sscanf to create the array of values:
unsigned int result;
sscanf(data, "%x", &result);

The difficulty in a hex-to-char conversion is that hex digits work pairwise, for example 3132 or A0FF, so an even number of hex digits is assumed. However, an odd number of digits can be perfectly valid, like 332 and AFF, which should be understood as 0332 and 0AFF.
I propose an improvement to Niels Keurentjes' hex2bin() function.
First we count the number of valid hex digits. Since we have to count anyway, let's also check the buffer size:
void hex2bin(const char* src, char* target, size_t size_target)
{
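size_t countdgts=0; // count hex digits
for (const char *p=src; *p && isxdigit((unsigned char)*p); p++)
countdgts++;
if ((countdgts+1)/2+1>size_target)
throw std::length_error("Risk of buffer overflow"); // needs <stdexcept>; std::exception has no standard constructor taking a message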
By the way, to use isxdigit() you'll have to #include <cctype>.
Once we know how many digits, we can determine if the first one is the higher digit (only pairs) or not (first digit not a pair).
bool ishi = !(countdgts%2);
Then we can loop digit by digit, combining each pair using a bitwise shift (<<) and a bitwise OR, and
toggling the 'high' indicator at each iteration:
for (*target=0; *src; ishi = !ishi) {
char tmp = char2int(*src++); // hex digit on 4 lower bits
if (ishi)
*target = (tmp << 4); // high: shift by 4
else *target++ |= tmp; // low: complete previous
}
*target=0; // null terminated target (if desired)
}

I found this question, but the accepted answer didn't look like a C++ way of solving the task to me (this doesn't mean it's a bad answer or anything, just explaining motivation behind adding this one). I recollected this nice answer and decided to implement something similar. Here is complete code of what I ended up with (it also works for std::wstring):
#include <cctype>
#include <cstdlib>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <ostream>
#include <stdexcept>
#include <string>
#include <vector>
template <typename OutputIt>
class hex_ostream_iterator :
public std::iterator<std::output_iterator_tag, void, void, void, void>
{
OutputIt out;
int digitCount;
int number;
public:
hex_ostream_iterator(OutputIt out) : out(out), digitCount(0), number(0)
{
}
hex_ostream_iterator<OutputIt> &
operator=(char c)
{
number = (number << 4) | char2int(c);
digitCount++;
if (digitCount == 2) {
digitCount = 0;
*out++ = number;
number = 0;
}
return *this;
}
hex_ostream_iterator<OutputIt> &
operator*()
{
return *this;
}
hex_ostream_iterator<OutputIt> &
operator++()
{
return *this;
}
hex_ostream_iterator<OutputIt> &
operator++(int)
{
return *this;
}
private:
int
char2int(char c)
{
static const std::string HEX_CHARS = "0123456789abcdef";
const char lowerC = std::tolower(c);
const std::string::size_type pos = HEX_CHARS.find_first_of(lowerC);
if (pos == std::string::npos) {
throw std::runtime_error(std::string("Not a hex digit: ") + c);
}
return pos;
}
};
template <typename OutputIt>
hex_ostream_iterator<OutputIt>
hex_iterator(OutputIt out)
{
return hex_ostream_iterator<OutputIt>(out);
}
template <typename InputIt, typename OutputIt>
OutputIt // std::copy returns the output iterator it was given, not a wrapper around it
from_hex_string(InputIt first, InputIt last, OutputIt out)
{
if (std::distance(first, last) % 2 == 1) {
*out = '0';
++out;
}
return std::copy(first, last, out);
}
int
main(int argc, char *argv[])
{
if (argc != 2) {
std::cout << "Usage: " << argv[0] << " hexstring" << std::endl;
return EXIT_FAILURE;
}
const std::string input = argv[1];
std::vector<unsigned char> bytes;
from_hex_string(input.begin(), input.end(),
hex_iterator(std::back_inserter(bytes)));
typedef std::ostream_iterator<unsigned char> osit;
std::copy(bytes.begin(), bytes.end(), osit(std::cout));
return EXIT_SUCCESS;
}
And the output of ./hex2bytes 61a062a063 | hexdump -C:
00000000 61 a0 62 a0 63 |a.b.c|
00000005
And of ./hex2bytes 6a062a063 | hexdump -C (note odd number of characters):
00000000 06 a0 62 a0 63 |..b.c|
00000005

In: "303132", Out: "012". Input string can be odd or even length.
#include <stdexcept>
#include <string>
char char2int(char input)
{
    if (input >= '0' && input <= '9')
        return input - '0';
    if (input >= 'A' && input <= 'F')
        return input - 'A' + 10;
    if (input >= 'a' && input <= 'f')
        return input - 'a' + 10;
    throw std::runtime_error("Incorrect symbol in hex string");
}
std::string hex2str(const std::string &hex)
{
    std::string out;
    out.reserve((hex.size() + 1) / 2);
    std::string::const_iterator it = hex.begin();
    if (hex.size() % 2 != 0) {
        out.push_back(char2int(*it++)); // odd length: the first digit stands alone
    }
    while (it != hex.end()) {
        // read the two digits in separate statements so the order is well defined
        const char high = char2int(*it++);
        const char low = char2int(*it++);
        out.push_back(static_cast<char>(high << 4 | low));
    }
    return out;
}

Very similar to some of the other answers here, this is what I went with:
typedef uint8_t BYTE;
BYTE* ByteUtils::HexStringToBytes(BYTE* HexString, int ArrayLength)
{
BYTE* returnBytes;
returnBytes = (BYTE*) malloc(ArrayLength/2);
int j=0;
for(int i = 0; i < ArrayLength; i++)
{
if(i % 2 == 0)
{
int valueHigh = (int)(*(HexString+i));
int valueLow = (int)(*(HexString+i+1));
valueHigh = ByteUtils::HexAsciiToDec(valueHigh);
valueLow = ByteUtils::HexAsciiToDec(valueLow);
valueHigh *= 16;
int total = valueHigh + valueLow;
*(returnBytes+j++) = (BYTE)total;
}
}
return returnBytes;
}
int ByteUtils::HexAsciiToDec(int value)
{
if(value > 47 && value < 58) // '0'..'9'; 58 is ':' and must not be accepted
{
value -= 48;
}
else if(value > 96 && value < 103)
{
value -= 97;
value += 10;
}
else if(value > 64 && value < 71)
{
value -= 65;
value += 10;
}
else
{
value = 0;
}
return value;
}

#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <string>
static bool Hexadec2xdigit(const std::string& data, std::string& buffer, std::size_t offset = sizeof(uint16_t))
{
    if (data.empty() || offset == 0)
    {
        return false;
    }
    // Copy each pair of digits into a small null-terminated buffer instead of
    // reinterpreting the string as uint16_t, which breaks aliasing rules and
    // reads past the end of the string on the last iteration.
    for (std::size_t i = 0; i + 1 < data.size(); i += offset)
    {
        if (std::isxdigit(static_cast<unsigned char>(data[i])))
        {
            const char pair[3] = { data[i], data[i + 1], '\0' };
            // strtoul reports errors through its return value and never throws,
            // so no try/catch is needed around it
            buffer += static_cast<char>(std::strtoul(pair, NULL, 16));
        }
    }
    return true;
}
This code doesn't have much of a copy process

Casting string to literal bytes [duplicate]

What is the best way to convert a variable-length hex string, e.g. "01A1", to a byte array containing that data?
i.e. converting this:
std::string hex = "01A1";
into this
char* hexArray;
int hexLength;
or this
std::vector<char> hexArray;
so that when I write this to a file and hexdump -C it I get the binary data containing 01A1.

This implementation uses the built-in strtol function to handle the actual conversion from text to bytes, and works for any even-length hex string. It does not validate the input.
std::vector<char> HexToBytes(const std::string& hex) {
std::vector<char> bytes;
for (unsigned int i = 0; i < hex.length(); i += 2) {
std::string byteString = hex.substr(i, 2);
char byte = (char) strtol(byteString.c_str(), NULL, 16);
bytes.push_back(byte);
}
return bytes;
}
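If the input may be malformed, strtol's end pointer can be used to detect it. The following is a sketch rather than a drop-in replacement: HexToBytesChecked is a hypothetical name, and throwing std::invalid_argument is my own choice of error handling. The extra isxdigit test is there because strtol on its own accepts leading whitespace and a sign.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical variant of HexToBytes that rejects malformed input.
std::vector<char> HexToBytesChecked(const std::string& hex)
{
    if (hex.length() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");
    std::vector<char> bytes;
    bytes.reserve(hex.length() / 2);
    for (std::size_t i = 0; i < hex.length(); i += 2)
    {
        const std::string byteString = hex.substr(i, 2);
        char* end = nullptr;
        const long value = std::strtol(byteString.c_str(), &end, 16);
        // the first character must be a hex digit (no sign or whitespace),
        // and strtol must have consumed both characters
        if (!std::isxdigit(static_cast<unsigned char>(byteString[0])) ||
            end != byteString.c_str() + 2)
            throw std::invalid_argument("invalid hex digit");
        bytes.push_back(static_cast<char>(value));
    }
    return bytes;
}
```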

This ought to work:
int char2int(char input)
{
if(input >= '0' && input <= '9')
return input - '0';
if(input >= 'A' && input <= 'F')
return input - 'A' + 10;
if(input >= 'a' && input <= 'f')
return input - 'a' + 10;
throw std::invalid_argument("Invalid input string");
}
// This function assumes src to be a zero terminated sanitized string with
// an even number of [0-9a-f] characters, and target to be sufficiently large
void hex2bin(const char* src, char* target)
{
while(*src && src[1])
{
*(target++) = char2int(*src)*16 + char2int(src[1]);
src += 2;
}
}
Depending on your specific platform there's probably also a standard implementation though.

So for fun, I was curious if I could do this kind of conversion at compile-time. It doesn't have a lot of error checking and was done in VS2015, which doesn't support C++14 constexpr functions yet (thus how HexCharToInt looks). It takes a c-string array, converts pairs of characters into a single byte and expands those bytes into a uniform initialization list used to initialize the T type provided as a template parameter. T could be replaced with something like std::array to automatically return an array.
#include <array>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <utility>
/* Quick and dirty conversion from a single character to its hex equivalent */
constexpr std::uint8_t HexCharToInt(char Input)
{
return
((Input >= 'a') && (Input <= 'f'))
? (Input - 87)
: ((Input >= 'A') && (Input <= 'F'))
? (Input - 55)
: ((Input >= '0') && (Input <= '9'))
? (Input - 48)
: throw std::exception{};
}
/* Position the characters into the appropriate nibble */
constexpr std::uint8_t HexChar(char High, char Low)
{
return (HexCharToInt(High) << 4) | (HexCharToInt(Low));
}
/* Adapter that performs sets of 2 characters into a single byte and combine the results into a uniform initialization list used to initialize T */
template <typename T, std::size_t Length, std::size_t ... Index>
constexpr T HexString(const char (&Input)[Length], const std::index_sequence<Index...>&)
{
return T{HexChar(Input[(Index * 2)], Input[((Index * 2) + 1)])...};
}
/* Entry function */
template <typename T, std::size_t Length>
constexpr T HexString(const char (&Input)[Length])
{
return HexString<T>(Input, std::make_index_sequence<(Length / 2)>{});
}
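// The original qualified this call as KS::Utility::HexString; that namespace is not shown in the snippet above.
constexpr auto Y = HexString<std::array<std::uint8_t, 3>>("ABCDEF");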
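With C++17, where constexpr functions may contain loops and std::array elements can be assigned in constant expressions, the same conversion can be written more directly. This is a sketch under that assumption (it will not compile as C++11 or C++14); hexDigitValue and hexLiteral are names invented for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical C++17 helper: value of one hex digit, usable in constant expressions.
constexpr std::uint8_t hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
    // reaching this line during constant evaluation is a compile error
    throw std::invalid_argument("not a hex digit");
}

// Length counts the terminating '\0', so a literal of 2*N digits has Length == 2*N + 1.
template <std::size_t Length>
constexpr std::array<std::uint8_t, (Length - 1) / 2> hexLiteral(const char (&input)[Length])
{
    static_assert(Length % 2 == 1, "expected an even number of hex digits");
    std::array<std::uint8_t, (Length - 1) / 2> out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<std::uint8_t>(
            (hexDigitValue(input[2 * i]) << 4) | hexDigitValue(input[2 * i + 1]));
    return out;
}

constexpr auto decoded = hexLiteral("ABCDEF");
static_assert(decoded.size() == 3, "three bytes expected");
static_assert(decoded[0] == 0xAB && decoded[1] == 0xCD && decoded[2] == 0xEF, "unexpected bytes");
```

Here the array size is deduced from the literal, so the caller does not have to spell out the result type.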

You can use boost:
#include <boost/algorithm/hex.hpp>
char bytes[60] = {0};
std::string hash = boost::algorithm::unhex(std::string("313233343536373839"));
std::copy(hash.begin(), hash.end(), bytes);

You said "variable length." Just how variable do you mean?
For hex strings that fit into an unsigned long, I have always liked the C function strtoul. To make it convert hex, pass 16 as the radix value.
Code might look like:
#include <cstdlib>
std::string str = "01a1";
unsigned long val = strtoul(str.c_str(), 0, 16);
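strtoul reports problems only through its end pointer and errno, so a checked call needs a few more lines. A minimal sketch, with the hypothetical name parseHex and an ok flag of my own design; note that strtoul itself still accepts leading whitespace, a sign and a 0x prefix.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical wrapper: parses str as hexadecimal and reports success through ok.
unsigned long parseHex(const std::string& str, bool& ok)
{
    char* end = nullptr;
    errno = 0;
    const unsigned long val = std::strtoul(str.c_str(), &end, 16);
    // ok only if the string is non-empty, was consumed completely,
    // and the value fit into an unsigned long
    ok = !str.empty() && end == str.c_str() + str.size() && errno != ERANGE;
    return val;
}
```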

If you want to use OpenSSL to do it, there is a nifty trick I found:
BIGNUM *input = BN_new();
int input_length = BN_hex2bn(&input, argv[2]);
input_length = (input_length + 1) / 2; // BN_hex2bn() returns number of hex digits
unsigned char *input_buffer = (unsigned char*)malloc(input_length);
int retval = BN_bn2bin(input, input_buffer); // number of bytes actually written
Just be sure to strip off any leading '0x' from the string.

This can be done with a stringstream, you just need to store the value in an intermediate numeric type such as an int:
std::string test = "01A1"; // assuming this is an even length string
char bytes[test.length()/2];
stringstream converter;
for(int i = 0; i < test.length(); i+=2)
{
converter << std::hex << test.substr(i,2);
int byte;
converter >> byte;
bytes[i/2] = byte & 0xFF;
converter.str(std::string());
converter.clear();
}

Somebody mentioned using sscanf to do this, but didn't say how. This is how. It's useful because it also works in ancient versions of C and C++ and even most versions of embedded C or C++ for microcontrollers.
When converted to bytes, the hex-string in this example resolves to the ASCII text "Hello there!" which is then printed.
#include <stdio.h>
int main ()
{
char hexdata[] = "48656c6c6f20746865726521";
char bytedata[20]{};
for(int j = 0; j < sizeof(hexdata) / 2; j++) {
sscanf(hexdata + j * 2, "%02hhX", bytedata + j);
}
printf ("%s -> %s\n", hexdata, bytedata);
return 0;
}

I would use a standard function like sscanf to read the string into an unsigned integer, and then you already have the bytes you need in memory. If you were on a big endian machine you could just write out (memcpy) the memory of the integer from the first non-zero byte. However you can't safely assume this in general, so you can use some bit masking and shifting to get the bytes out.
const char* src = "01A1";
char hexArray[256] = {0};
int hexLength = 0;
// read in the string
unsigned int hex = 0;
sscanf(src, "%x", &hex);
// write it out
for (unsigned int mask = 0xff000000, bitPos=24; mask; mask>>=8, bitPos-=8) {
unsigned int currByte = hex & mask;
if (currByte || hexLength) {
hexArray[hexLength++] = currByte>>bitPos;
}
}

C++11 variant (with gcc 4.7 - little endian format):
#include <string>
#include <vector>
std::vector<uint8_t> decodeHex(const std::string & source)
{
if ( std::string::npos != source.find_first_not_of("0123456789ABCDEFabcdef") )
{
// you can throw exception here
return {};
}
union
{
uint64_t binary;
char byte[8];
} value{};
auto size = source.size(), offset = (size % 16);
std::vector<uint8_t> binary{};
binary.reserve((size + 1) / 2);
if ( offset )
{
value.binary = std::stoull(source.substr(0, offset), nullptr, 16);
for ( auto index = (offset + 1) / 2; index--; )
{
binary.emplace_back(value.byte[index]);
}
}
for ( ; offset < size; offset += 16 )
{
value.binary = std::stoull(source.substr(offset, 16), nullptr, 16);
for ( auto index = 8; index--; )
{
binary.emplace_back(value.byte[index]);
}
}
return binary;
}
Crypto++ variant (with gcc 4.7):
#include <string>
#include <vector>
#include <crypto++/filters.h>
#include <crypto++/hex.h>
std::vector<unsigned char> decodeHex(const std::string & source)
{
std::string hexCode;
CryptoPP::StringSource(
source, true,
new CryptoPP::HexDecoder(new CryptoPP::StringSink(hexCode)));
return std::vector<unsigned char>(hexCode.begin(), hexCode.end());
}
Note that the first variant is about two times faster than the second one and at the same time works with odd and even number of nibbles (the result of "a56ac" is {0x0a, 0x56, 0xac}). Crypto++ discards the last one if there are odd number of nibbels (the result of "a56ac" is {0xa5, 0x6a}) and silently skips invalid hex characters (the result of "a5sac" is {0xa5, 0xac}).

#include <iostream>
#include <sstream>
#include <vector>
int main() {
std::string s("313233");
char delim = ',';
int len = s.size();
for(int i = 2; i < len; i += 3, ++len) s.insert(i, 1, delim);
std::istringstream is(s);
std::ostringstream os;
is >> std::hex;
int n;
while (is >> n) {
char c = (char)n;
os << std::string(&c, 1);
if(is.peek() == delim) is.ignore();
}
// std::string form
std::string byte_string = os.str();
std::cout << byte_string << std::endl;
printf("%s\n", byte_string.c_str());
// std::vector form
std::vector<char> byte_vector(byte_string.begin(), byte_string.end());
byte_vector.push_back('\0'); // needed for a c-string
printf("%s\n", byte_vector.data());
}
The output is
123
123
123
'1' == 0x31, etc.

If your goal is speed, I have an AVX2 SIMD implementation of an encoder and decoder here: https://github.com/zbjornson/fast-hex. These benchmark ~12x faster than the fastest scalar implementations.

#include <iostream>
using byte = unsigned char;
static int charToInt(char c) {
if (c >= '0' && c <= '9') {
return c - '0';
}
if (c >= 'A' && c <= 'F') {
return c - 'A' + 10;
}
if (c >= 'a' && c <= 'f') {
return c - 'a' + 10;
}
return -1;
}
// Decodes specified HEX string to bytes array. Specified nBytes is length of bytes
// array. Returns -1 if fails to decode any of bytes. Returns number of bytes decoded
// on success. Maximum number of bytes decoded will be equal to nBytes. It is assumed
// that specified string is '\0' terminated.
int hexStringToBytes(const char* str, byte* bytes, int nBytes) {
int nDecoded {0};
for (int i {0}; str[i] != '\0' && nDecoded < nBytes; i += 2, nDecoded += 1) {
if (str[i + 1] != '\0') {
int m {charToInt(str[i])};
int n {charToInt(str[i + 1])};
if (m != -1 && n != -1) {
bytes[nDecoded] = (m << 4) | n;
} else {
return -1;
}
} else {
return -1;
}
}
return nDecoded;
}
int main(int argc, char* argv[]) {
if (argc < 2) {
return 1;
}
byte bytes[0x100];
int ret {hexStringToBytes(argv[1], bytes, 0x100)};
if (ret < 0) {
return 1;
}
std::cout << "number of bytes: " << ret << "\n" << std::hex;
for (int i {0}; i < ret; ++i) {
if (bytes[i] < 0x10) {
std::cout << "0";
}
std::cout << (bytes[i] & 0xff);
}
std::cout << "\n";
return 0;
}

i've modified TheoretiCAL's code
uint8_t buf[32] = {};
std::string hex = "0123";
while (hex.length() % 2)
hex = "0" + hex;
std::stringstream stream;
stream << std::hex << hex;
for (size_t i= 0; i <sizeof(buf); i++)
stream >> buf[i];

How I do this at compiletime
#pragma once
#include <memory>
#include <iostream>
#include <string>
#include <array>
#define DELIMITING_WILDCARD ' '
// #sean :)
constexpr int _char_to_int( char ch )
{
if( ch >= '0' && ch <= '9' )
return ch - '0';
if( ch >= 'A' && ch <= 'F' )
return ch - 'A' + 10;
return ch - 'a' + 10;
};
template <char wildcard, typename T, size_t N = sizeof( T )>
constexpr size_t _count_wildcard( T &&str )
{
size_t count = 1u;
for( const auto &character : str )
{
if( character == wildcard )
{
++count;
}
}
return count;
}
// construct a base16 hex and emplace it at make_count
// change 16 to 256 if u want the result to be when:
// sig[0] == 0xA && sig[1] == 0xB = 0xA0B
// or leave as is for the scenario to return 0xAB
#define CONCATE_HEX_FACTOR 16
#define CONCATE_HEX(a, b) ( CONCATE_HEX_FACTOR * ( a ) + ( b ) )
template
< char skip_wildcard,
// How many occurances of a delimiting wildcard do we find in sig
size_t delimiter_count,
typename T, size_t N = sizeof( T )>
constexpr auto _make_array( T &&sig )
{
static_assert( delimiter_count > 0, "this is a logical error, delimiter count can't be of size 0" );
static_assert( N > 1, "sig length must be bigger than 1" );
// Resulting byte array, for delimiter_count skips we should have delimiter_count integers
std::array<int, delimiter_count> ret{};
// List of skips that point to the position of the delimiter wildcard in skip
std::array<size_t, delimiter_count> skips{};
// Current skip
size_t skip_count = 0u;
// Character count, traversed for skip
size_t skip_traversed_character_count = 0u;
for( size_t i = 0u; i < N; ++i )
{
if( sig[i] == DELIMITING_WILDCARD )
{
skips[skip_count] = skip_traversed_character_count;
++skip_count;
}
++skip_traversed_character_count;
}
// Finally traversed character count
size_t traversed_character_count = 0u;
// Make count (we will supposedly have at least an instance in our return array)
size_t make_count = 1u;
// Traverse signature
for( size_t i = 0u; i < N; ++i )
{
// Read before
if( i == 0u )
{
// We don't care about this, and we don't want to use 0
if( sig[0u] == skip_wildcard )
{
ret[0u] = -1;
continue;
}
ret[0u] = CONCATE_HEX( _char_to_int( sig[0u] ), _char_to_int( sig[1u] ) );
continue;
}
// Make result by skip data
for( const auto &skip : skips )
{
if( ( skip == i ) && skip < N - 1u )
{
// We don't care about this, and we don't want to use 0
if( sig[i + 1u] == skip_wildcard )
{
ret[make_count] = -1;
++make_count;
continue;
}
ret[make_count] = CONCATE_HEX( _char_to_int( sig[i + 1u] ), _char_to_int( sig[i + 2u] ) );
++make_count;
}
}
}
return ret;
}
#define SKIP_WILDCARD '?'
#define BUILD_ARRAY(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( a )>( a )
#define BUILD_ARRAY_MV(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( std::move( a ) )>( std::move( a ) )
// -----
// usage
// -----
template <int n>
constexpr int combine_two()
{
constexpr auto numbers = BUILD_ARRAY( "55 8B EC 83 E4 F8 8B 4D 08 BA ? ? ? ? E8 ? ? ? ? 85 C0 75 12 ?" );
constexpr int number = numbers[0];
constexpr int number_now = n + number;
return number_now;
}
int main()
{
constexpr auto shit = BUILD_ARRAY( "?? AA BB CC DD ? ? ? 02 31 32" );
for( const auto &hex : shit )
{
printf( "%x ", hex );
}
combine_two<3>();
constexpr auto saaahhah = combine_two<3>();
static_assert( combine_two<3>() == 88 );
static_assert( combine_two<3>() == saaahhah );
printf( "\n%d", saaahhah );
}
Method can be used for runtime too, but for that you'd probably prefer something else, faster.

It may be useful to someone. The logic of translating a set of bytes into a string and back. Solves the zero character problem.
#include <sstream>
#include <iomanip>
std::string BytesToHex(const std::vector<char>& data, size_t len)
{
std::stringstream ss;
ss << std::hex << std::setfill('0');
for(size_t index(0); index < len; ++index)
{
ss << std::setw(2) << static_cast<unsigned short>(data[index]);
}
return ss.str();
}
std::vector<char> HexToBytes(const std::string& data)
{
std::stringstream ss;
ss << data;
std::vector<char> resBytes;
size_t count = 0;
const auto len = data.size();
while(ss.good() && count < len)
{
unsigned short num;
char hexNum[2];
ss.read(hexNum, 2);
sscanf(hexNum, "%2hX", &num);
resBytes.push_back(static_cast<char>(num));
count += 2;
}
return resBytes;
}

If you can make your data to look like this e.g array of "0x01", "0xA1"
Then you can iterate your array and use sscanf to create the array of values
unsigned int result;
sscanf(data, "%x", &result);

The difficulty in an hex to char conversion is that the hex digits work pairwise, f.ex: 3132 or A0FF. So an even number of hex digits is assumed. However it could be perfectly valid to have an odd number of digits, like: 332 and AFF, which should be understood as 0332 and 0AFF.
I propose an improvement to Niels Keurentjes hex2bin() function.
First we count the number of valid hex digits. As we have to count, let's control also the buffer size:
void hex2bin(const char* src, char* target, size_t size_target)
{
int countdgts=0; // count hex digits
for (const char *p=src; *p && isxdigit(*p); p++)
countdgts++;
if ((countdgts+1)/2+1>size_target)
throw exception("Risk of buffer overflow");
By the way, to use isxdigit() you'll have to #include <cctype>.
Once we know how many digits, we can determine if the first one is the higher digit (only pairs) or not (first digit not a pair).
bool ishi = !(countdgts%2);
Then we can loop digit by digit, combining each pair using bin shift << and bin or, and
toggling the 'high' indicator at each iteration:
for (*target=0; *src; ishi = !ishi) {
char tmp = char2int(*src++); // hex digit on 4 lower bits
if (ishi)
*target = (tmp << 4); // high: shift by 4
else *target++ |= tmp; // low: complete previous
}
*target=0; // null terminated target (if desired)
}

I found this question, but the accepted answer didn't look like a C++ way of solving the task to me (this doesn't mean it's a bad answer or anything, just explaining motivation behind adding this one). I recollected this nice answer and decided to implement something similar. Here is complete code of what I ended up with (it also works for std::wstring):
#include <cctype>
#include <cstdlib>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <ostream>
#include <stdexcept>
#include <string>
#include <vector>
template <typename OutputIt>
class hex_ostream_iterator :
public std::iterator<std::output_iterator_tag, void, void, void, void>
{
OutputIt out;
int digitCount;
int number;
public:
hex_ostream_iterator(OutputIt out) : out(out), digitCount(0), number(0)
{
}
hex_ostream_iterator<OutputIt> &
operator=(char c)
{
number = (number << 4) | char2int(c);
digitCount++;
if (digitCount == 2) {
digitCount = 0;
*out++ = number;
number = 0;
}
return *this;
}
hex_ostream_iterator<OutputIt> &
operator*()
{
return *this;
}
hex_ostream_iterator<OutputIt> &
operator++()
{
return *this;
}
hex_ostream_iterator<OutputIt> &
operator++(int)
{
return *this;
}
private:
int
char2int(char c)
{
static const std::string HEX_CHARS = "0123456789abcdef";
const char lowerC = std::tolower(c);
const std::string::size_type pos = HEX_CHARS.find_first_of(lowerC);
if (pos == std::string::npos) {
throw std::runtime_error(std::string("Not a hex digit: ") + c);
}
return pos;
}
};
template <typename OutputIt>
hex_ostream_iterator<OutputIt>
hex_iterator(OutputIt out)
{
return hex_ostream_iterator<OutputIt>(out);
}
template <typename InputIt, typename OutputIt>
hex_ostream_iterator<OutputIt>
from_hex_string(InputIt first, InputIt last, OutputIt out)
{
if (std::distance(first, last) % 2 == 1) {
*out = '0';
++out;
}
return std::copy(first, last, out);
}
int
main(int argc, char *argv[])
{
if (argc != 2) {
std::cout << "Usage: " << argv[0] << " hexstring" << std::endl;
return EXIT_FAILURE;
}
const std::string input = argv[1];
std::vector<unsigned char> bytes;
from_hex_string(input.begin(), input.end(),
hex_iterator(std::back_inserter(bytes)));
typedef std::ostream_iterator<unsigned char> osit;
std::copy(bytes.begin(), bytes.end(), osit(std::cout));
return EXIT_SUCCESS;
}
And the output of ./hex2bytes 61a062a063 | hexdump -C:
00000000 61 a0 62 a0 63 |a.b.c|
00000005
And of ./hex2bytes 6a062a063 | hexdump -C (note odd number of characters):
00000000 06 a0 62 a0 63 |..b.c|
00000005

In: "303132", Out: "012". Input string can be odd or even length.
char char2int(char input)
{
if (input >= '0' && input <= '9')
return input - '0';
if (input >= 'A' && input <= 'F')
return input - 'A' + 10;
if (input >= 'a' && input <= 'f')
return input - 'a' + 10;
throw std::runtime_error("Incorrect symbol in hex string");
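}
string hex2str(const string &hex)
{
string out;
out.resize(hex.size() / 2 + hex.size() % 2);
string::const_iterator it = hex.begin();
string::iterator out_it = out.begin();
if (hex.size() % 2 != 0) {
*out_it++ = char(char2int(*it++));
}
// An even number of digits remains here. Read the two nibbles in separate
// statements: "char2int(*it++) << 4 | char2int(*it)" does not guarantee the order of the reads.
while (it != hex.end()) {
const int high = char2int(*it++);
const int low = char2int(*it++);
*out_it++ = char(high << 4 | low);
}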
return out;
}

Very similar to some of the other answers here, this is what I went with:
typedef uint8_t BYTE;
BYTE* ByteUtils::HexStringToBytes(BYTE* HexString, int ArrayLength)
{
BYTE* returnBytes;
returnBytes = (BYTE*) malloc(ArrayLength/2);
int j=0;
for(int i = 0; i + 1 < ArrayLength; i++) // stop before a trailing unpaired digit so HexString[i+1] stays in bounds
{
if(i % 2 == 0)
{
int valueHigh = (int)(*(HexString+i));
int valueLow = (int)(*(HexString+i+1));
valueHigh = ByteUtils::HexAsciiToDec(valueHigh);
valueLow = ByteUtils::HexAsciiToDec(valueLow);
valueHigh *= 16;
int total = valueHigh + valueLow;
*(returnBytes+j++) = (BYTE)total;
}
}
return returnBytes;
}
int ByteUtils::HexAsciiToDec(int value)
{
if(value > 47 && value < 58) // '0'..'9' (58 is ':', which is not a digit)
{
value -= 48;
}
else if(value > 96 && value < 103)
{
value -= 97;
value += 10;
}
else if(value > 64 && value < 71)
{
value -= 65;
value += 10;
}
else
{
value = 0;
}
return value;
}

static bool Hexadec2xdigit(const std::string& data, std::string& buffer, std::size_t offset = sizeof(uint16_t))
{
if (data.empty())
{
return false;
}
try
{
for (std::size_t i = 0; i < data.size(); i += offset)
{
if (std::isxdigit(static_cast<unsigned char>(data[i])))
{
// Copy the pair into a small null-terminated buffer; reading it through a
// uint16_t pointer is not alignment-safe and can run past the end of the string.
const char pair[3] = { data[i], (i + 1 < data.size()) ? data[i + 1] : '\0', '\0' };
buffer += static_cast<char>(std::strtoul(pair, NULL, 16));
}
}
return true;
}
catch (const std::invalid_argument& ex)
{
}
catch (const std::out_of_range& ex)
{
}
return false;
}
This code doesn't have much of a copy process

How to convert a decimal string to binary string?

I have a decimal string like this (length < 5000):
std::string decimalString = "555";
Is there a standard way to convert this string to binary representation? Like this:
std::string binaryString = "1000101011";
Update.
This post helps me.

As the number is very large, you can use a big integer library (boost, maybe?), or write the necessary functions yourself.
If you decide to implement the functions yourself, one way is to implement the old pencil-and-paper long division method in your code, where you'll need to divide the decimal number repeatedly by 2 and accumulate the remainders in another string. May be a little cumbersome, but division by 2 should not be so hard.
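The pencil-and-paper approach described above can be sketched as follows. This is a minimal illustration, not a tuned implementation: the names divideByTwo and decimalToBinary are made up for this example, and the input is assumed to contain only the characters '0' to '9'.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Divides the decimal number stored in `digits` by 2 in place and returns
// the remainder (0 or 1). Assumes `digits` holds only '0'..'9'.
int divideByTwo(std::string& digits)
{
    int carry = 0;
    for (char& ch : digits) {
        const int current = carry * 10 + (ch - '0');
        ch = static_cast<char>('0' + current / 2);
        carry = current % 2;
    }
    // Strip leading zeros, but keep a single "0" for the value zero.
    const std::size_t firstNonZero = digits.find_first_not_of('0');
    if (firstNonZero == std::string::npos) {
        digits = "0";
    } else {
        digits.erase(0, firstNonZero);
    }
    return carry;
}

// Hypothetical helper: converts a non-negative decimal string to binary by
// dividing by 2 repeatedly and collecting the remainders.
std::string decimalToBinary(std::string decimal)
{
    std::string binary;
    while (decimal != "0") {
        binary.push_back(static_cast<char>('0' + divideByTwo(decimal)));
    }
    if (binary.empty()) {
        return "0";
    }
    // Remainders come out least significant bit first.
    std::reverse(binary.begin(), binary.end());
    return binary;
}
```

Each division pass walks the whole decimal string, so the running time grows roughly quadratically with the number of digits, which should still be acceptable for a few thousand digits.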

Since 10 is not a power of two (or the other way round), you're out of luck. You will have to implement arithmetic in base 10. You need the following two operations:
Integer division by 2
Checking the remainder after division by 2
Both can be computed by the same algorithm.
Alternatively, you can use one of the various big integer libraries for C++, such as GNU MP or Boost.Multiprecision.

I tried to do it. I don't think my answer is right, but here is the idea behind what I was trying to do.
Let's say we have 2 decimals:
100 and 200.
To concatenate these, we can use the formula:
a * CalcPowers(b) + b, where CalcPowers is defined below.
Knowing this, I tried to split the very long decimal string into chunks of 4. I convert each chunk to binary and store the results in a vector.
Finally, I go through each string and apply the formula above to concatenate the binary strings into one massive one.
I didn't get it working, but here is the code; maybe someone else can see where I went wrong. BinaryAdd, BinaryMulDec and CalcPowers work perfectly fine; the problem is actually in ToBinary.
#include <algorithm>
#include <bitset>
#include <cstdint>
#include <iostream>
#include <limits>
#include <string>
#include <vector>
std::string BinaryAdd(std::string First, std::string Second)
{
int Carry = 0;
std::string Result;
while(Second.size() > First.size())
First.insert(0, "0");
while(First.size() > Second.size())
Second.insert(0, "0");
for (int I = First.size() - 1; I >= 0; --I)
{
int FirstBit = First[I] - 0x30;
int SecondBit = Second[I] - 0x30;
Result += static_cast<char>((FirstBit ^ SecondBit ^ Carry) + 0x30);
Carry = (FirstBit & SecondBit) | (SecondBit & Carry) | (FirstBit & Carry);
}
if (Carry)
Result += 0x31;
std::reverse(Result.begin(), Result.end());
return Result;
}
std::string BinaryMulDec(std::string value, int amount)
{
if (amount == 0)
{
for (auto &s : value)
{
s = 0x30;
}
return value;
}
std::string result = value;
for (int I = 0; I < amount - 1; ++I)
result = BinaryAdd(result, value);
return result;
}
std::int64_t CalcPowers(std::int64_t value)
{
std::int64_t t = 1;
while(t < value)
t *= 10;
return t;
}
std::string ToBinary(const std::string &value)
{
std::vector<std::string> sets;
std::vector<int> multipliers;
int Len = 0;
int Rem = value.size() % 4;
for (auto it = value.end(), jt = value.end(); it != value.begin() - 1; --it)
{
if (Len++ == 4)
{
std::string t = std::string(it, jt);
sets.push_back(std::bitset<16>(std::stoull(t)).to_string());
multipliers.push_back(CalcPowers(std::stoull(t)));
jt = it;
Len = 1;
}
}
if (Rem != 0 && Rem != value.size())
{
sets.push_back(std::bitset<16>(std::stoull(std::string(value.begin(), value.begin() + Rem))).to_string());
}
auto formula = [](std::string a, std::string b, int mul) -> std::string
{
return BinaryAdd(BinaryMulDec(a, mul), b);
};
std::reverse(sets.begin(), sets.end());
std::reverse(multipliers.begin(), multipliers.end());
std::string result = sets[0];
for (std::size_t i = 1; i < sets.size(); ++i)
{
result = formula(result, sets[i], multipliers[i]);
}
return result;
}
void ConcatenateDecimals(std::int64_t* arr, int size)
{
auto formula = [](std::int64_t a, std::int64_t b) -> std::int64_t
{
return (a * CalcPowers(b)) + b;
};
std::int64_t val = arr[0];
for (int i = 1; i < size; ++i)
{
val = formula(val, arr[i]);
}
std::cout<<val;
}
int main()
{
std::string decimal = "64497387062899840145";
//6449738706289984014 = 0101100110000010000100110010111001100010100000001000001000001110
/*
std::int64_t arr[] = {644, 9738, 7062, 8998, 4014};
ConcatenateDecimals(arr, 5);*/
std::cout<<ToBinary(decimal);
return 0;
}

I found my old code that solves a competitive programming task:
ai -> aj (the number a written in base i, converted to base j)
2 <= i,j <= 36; 0 <= a <= 10^1000
time limit: 1 sec
Execution time was ~0.039 in the worst case. The multiplication, addition and division algorithms are very fast because they use 10^9 as the numeral base, but I think the implementation could still be optimized a lot.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
#define sz(x) (int((x).size()))
typedef vector<int> vi;
typedef long long llong;
int DigToNumber(char c) {
if( c <= '9' && c >= '0' )
return c-'0';
return c-'A'+10;
}
char NumberToDig(int n) {
if( n < 10 )
return '0'+n;
return n-10+'A';
}
const int base = 1000*1000*1000;
void mulint(vi& a, int b) { //a*= b
for(int i = 0, carry = 0; i < sz(a) || carry; i++) {
if( i == sz(a) )
a.push_back(0);
llong cur = carry + a[i] * 1LL * b;
a[i] = int(cur%base);
carry = int(cur/base);
}
while( sz(a) > 1 && a.back() == 0 )
a.pop_back();
}
int divint(vi& a, int d) { // carry = a%d; a /= d; return carry;
int carry = 0;
for(int i = sz(a)-1; i >= 0; i--) {
llong cur = a[i] + carry * 1LL * base;
a[i] = int(cur/d);
carry = int(cur%d);
}
while( sz(a) > 1 && a.back() == 0 )
a.pop_back();
return carry;
}
void add(vi& a, vi& b) { // a += b
for(int i = 0, c = 0, l = max(sz(a),sz(b)); i < l || c; i++) {
if( i == sz(a) )
a.push_back(0);
a[i] += ((i<sz(b))?b[i]:0) + c;
c = a[i] >= base;
if( c ) a[i] -= base;
}
}
int main() {
ios_base::sync_with_stdio(0);
cin.tie(0);
int from, to; cin >> from >> to;
string s; cin >> s;
vi res(1,0); vi m(1,1); vi tmp;
for(int i = sz(s)-1; i >= 0; i--) {
tmp.assign(m.begin(), m.end());
mulint(tmp,DigToNumber(s[i]));
add(res,tmp); mulint(m,from);
}
vi ans;
while( sz(res) > 1 || res.back() != 0 )
ans.push_back(divint(res,to));
if( sz(ans) == 0 )
ans.push_back(0);
for(int i = sz(ans)-1; i >= 0; i--)
cout << NumberToDig(ans[i]);
cout << "\n";
return 0;
}
How "from -> to" works for string "s":
accumulate Big Number (vector< int >) "res" with s[i]*from^(|s|-i-1), i = |s|-1..0
compute digits by dividing "res" by "to" while res > 0 and save the remainders to another vector
send it to output digit-by-digit (you can use ostringstream instead)
PS I've noted that nickname of thread starter is Denis. And I think this link may be useful too.

Sorting char arrays by swapping pointers, C++

I am trying to sort an array of char pointers (char * _string) by swapping pointers.
I have this method, and what I want to do is use the values I get from _string and sort them by not manipulating _string, but the empty helper array (char * _output) which I also hand over to the method.
Can anyone help me and tell me what I am doing wrong?
void sortAsc(char* _string, char* _output)
{
int length = strlen(_string);
// output and string now point to the same area in the memory
_output = _string;
for( int i = 0; i < length; i++) {
for( int j = 0; j < length; j++) {
if( *(_output) > (_output[j] ) ) {
// save the pointer
char* tmp = _output;
// now output points to the smaller value
_output = _output+j;
// move up the pointer to the smaller value
_output + j;
// now the pointer of the smaller value points to the higher value
_output = tmp;
// move down to where we were + 1
_output - j + 1;
}
}
}
//_output[length]='\0';
//delete chars;
}
In my main-Method, I do something like this:
char * string = {"bcdae"};
char * output = new char[5];
sortAsc(string, output);
After that code, I want the output array to contain the sorted values.

Let's do a selection sort on a 10-element int array using pointer notation; you can easily adapt it to your own array.
*---*---*---*---*---* ........
a[] = | 1 | 2 | 4 | 0 | 3 | ........
*---*---*---*---*---* ........
^--------We start here looking for the smaller numbers and sort the array.
for( i = 0; i < 10; i++ ){
k = i;
bypass = *( a + i );
for( j = i + 1; j < 10; j++ ){
/* To get Increasing order. */
if( bypass > *( a + j ) ){
bypass = *( a + j );
k = j;
}
}
if ( k != i ){
*( a + k ) = *( a + i );
*( a + i ) = bypass;
}
}
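Adapted to the char array from the question, the same selection sort could look like the sketch below. The function name sortAscCopy is made up for this example, and the caller is assumed to provide an output buffer with room for strlen(input) + 1 characters.

```cpp
#include <cstddef>
#include <cstring>

// Copies `input` into `output` and sorts the copy in ascending order using
// selection sort with pointer notation. `input` itself is left untouched.
// Assumption: `output` has room for strlen(input) + 1 chars.
void sortAscCopy(const char* input, char* output)
{
    const std::size_t length = std::strlen(input);
    std::memcpy(output, input, length + 1); // copy including the '\0'

    for (std::size_t i = 0; i < length; ++i) {
        std::size_t k = i;                  // index of the smallest value seen so far
        char smallest = *(output + i);
        for (std::size_t j = i + 1; j < length; ++j) {
            if (smallest > *(output + j)) {
                smallest = *(output + j);
                k = j;
            }
        }
        if (k != i) {                       // swap the smallest value into position i
            *(output + k) = *(output + i);
            *(output + i) = smallest;
        }
    }
}
```

Note that for the 5-character string "bcdae" the output buffer needs 6 chars, one more than the new char[5] in the question, so that the terminating '\0' fits.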

This sorts the string into an already allocated buffer, and if the buffer isn't large enough tells you how big it has to be:
std::size_t sortAsc(char const* string, char* dest, std::size_t dest_length) {
std::size_t str_length = strlen(string);
char const* str_end = string + str_length;
// Buffer too small: report the required size (including the terminator) and write nothing.
if (dest_length < str_length+1)
return str_length+1;
std::copy( string, str_end, dest );
dest[str_length] = '\0';
std::sort( dest, dest+str_length );
return str_length+1;
}
This does the poor "allocate a new string" pattern, using the above implementation:
char* allocate_and_sortAsc(char const* string) {
std::size_t str_length = strlen(string);
char* retval = new char[str_length+1];
std::size_t count = sortAsc( string, retval, str_length+1);
ASSERT( count == str_length+1 ); // sortAsc returns the required size, including the terminator
return retval;
}
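The "tells you how big it has to be" convention also allows a two-call pattern: try with whatever buffer you have, then grow it if the returned size is larger. The sketch below repeats sortAsc so that it compiles on its own; sortedCopy is a hypothetical caller added for illustration, and std::vector is used here only as one possible way to own the buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Returns the buffer size needed (including the terminator) and only writes
// to dest when dest_length is large enough.
std::size_t sortAsc(char const* string, char* dest, std::size_t dest_length) {
    std::size_t str_length = std::strlen(string);
    if (dest_length < str_length + 1)
        return str_length + 1;
    std::copy(string, string + str_length, dest);
    dest[str_length] = '\0';
    std::sort(dest, dest + str_length);
    return str_length + 1;
}

// Hypothetical caller: first call with a small buffer, resize if sortAsc
// reports that more space is needed, then call again.
std::vector<char> sortedCopy(char const* string) {
    std::vector<char> buffer(1);
    std::size_t needed = sortAsc(string, buffer.data(), buffer.size());
    if (needed > buffer.size()) {
        buffer.resize(needed);
        sortAsc(string, buffer.data(), buffer.size());
    }
    return buffer;
}
```

Because the vector owns the memory, the caller does not have to remember a matching delete[].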
And don't use variable names that start with an _, it is a bad practice because it wanders really near compiler reserved names. _Capital is reserved everywhere, and _lower in global scope, and foo__bar everywhere.
