Converting double array values to char values in C++

I have a matrix of doubles that I need to store into an array of chars. These double values are guaranteed to be small enough to fit into an 8-bit char value. (The maximum double value in my program is 31.) I've researched a bit, and what I keep finding are solutions for storing a double as a char*, in other words converting a double to a C string. This is NOT what I seek to achieve.
// I'm dealing with a 15*4 double array
double **d_array = new double*[15];
for (int i = 0; i < 15; ++i)
    d_array[i] = new double[4];
// This creates a char array (that will have >= 15*4 slots)
unsigned char *c_array = new unsigned char[1024];
I can loop over the double matrix and store each value into the character array.
Say I had d_array[1][0] = 4; I want to end up with c_array[5] = 4. Because 4 is 00000100 in binary, it should easily fit in a char.

I think you should be able to just make the assignment in your loop and it will automatically be truncated and converted (you may get a compiler warning):
c_array[0] = d_array[0][0];
To be safe, you could do
c_array[0] = (char)(int)d_array[0][0];
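Putting it together, a minimal sketch of the whole copy loop, assuming the 15*4 dimensions from the question and row-major flattening into c_array:
for (int i = 0; i < 15; ++i)
    for (int j = 0; j < 4; ++j)
        // static_cast makes the narrowing explicit; values are <= 31, so nothing is lost
        c_array[i * 4 + j] = static_cast<unsigned char>(d_array[i][j]);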

You may want to use uint8_t, since whether plain char is signed or unsigned is implementation-defined (char, signed char, and unsigned char are three distinct types).
You can use static_cast:
uint8_t value = static_cast<uint8_t>(d_array[i][j]);
If you want to copy the bytes of a floating point to a buffer:
uint8_t buffer[4096];
std::size_t index = 0; // current write position in the buffer
float f_value = 3.14f;
// reinterpret_cast is required here: static_cast cannot convert float* to uint8_t*
uint8_t *p_float = reinterpret_cast<uint8_t *>(&f_value);
for (unsigned int i = 0; i < sizeof(float); ++i)
{
    buffer[index + i] = p_float[i];
}
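As a side note, std::memcpy from <cstring> performs the same byte copy and is the more idiomatic way to write it:
std::memcpy(buffer + index, &f_value, sizeof(f_value));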

Related

casting a pointer to an array to a structure in efficient C++11 way

I have a large amount of point cloud data that I read from a file into
char * memblock = new char [size];
where size is the size of data. Then I cast my data to float numbers
float * file_content = reinterpret_cast<float *>(memblock);
Now I would like to change the data from a pointer to an array and place it in a certain structure like std::vector<PointXYZ>.
vector.clear();
for (int i = 2; i < file_content_size; i += 3) {
    vector.push_back(
        PointXYZ(file_content[i-2], file_content[i-1], file_content[i])
    );
}
But I feel there must be a better way than just looping through the whole data, considering that the size of the vector is more than 1e6.
std::vector has a range constructor that you can use to copy the elements into the vector. Note that the iterators must point at PointXYZ rather than char, so memblock has to be cast first (this assumes PointXYZ is trivially copyable and that size is a multiple of sizeof(PointXYZ)):
PointXYZ *first = reinterpret_cast<PointXYZ *>(memblock);
std::vector<PointXYZ> vec(first, first + size / sizeof(PointXYZ));
I believe this will be faster because you are not reallocating memory on every push_back; however, you will still be copying every element in memblock.
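A variant that sidesteps the aliasing question entirely is to size the vector first and copy the raw bytes into it; a minimal sketch, again assuming PointXYZ is a trivially copyable struct of three floats with no padding (std::memcpy lives in <cstring>):
std::vector<PointXYZ> vec(size / sizeof(PointXYZ));
std::memcpy(vec.data(), memblock, vec.size() * sizeof(PointXYZ));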
I think you may run into alignment (and strict-aliasing) problems when you cast your char* raw data to a float*.
Generally you should arrange things so that you cast other types to char*, because char* is allowed to alias everything else, and this way the data is correctly aligned for its real type from the start.
// create your array in the target type (float)
std::vector<float> file_content(size/sizeof(float));
// read the data in (cast to char* here)
file.read(reinterpret_cast<char*>(file_content.data()), size);
I honestly don't think you can get away from copying all the data.
std::vector<PointXYZ> points;
points.reserve(file_content.size() / 3);
for (auto i = 0ULL; i < file_content.size(); i += 3)
    points.emplace_back(file_content[i], file_content[i + 1], file_content[i + 2]);

Nice way to truncate an integer

I have a function that is given a buffer that can be filled up to a size_t length; however, the actual call that fills it takes an int as its maximum length.
So, in case the value cannot fit in an int, I want it truncated to the largest value that does fit; I couldn't get more data anyway.
I can do this
int truncatedMaxLen = static_cast<int>(std::min<std::size_t>(maxLength, (std::numeric_limits<int>::max)()));
Any less ugly ways?
A branchless way would be (masking first keeps the truncated value non-negative, which the signed case needs; a plain int = size_t assignment can produce a negative value and break the comparison):
int truncatedMaxLen = static_cast<int>(maxLength & static_cast<std::size_t>(std::numeric_limits<int>::max()));
truncatedMaxLen |= -static_cast<int>(truncatedMaxLen < maxLength) & std::numeric_limits<int>::max();
For unsigned types it is nicer because there is no sign bit to take care of:
unsigned truncatedMaxLen = maxLength;
truncatedMaxLen |= -(truncatedMaxLen < maxLength);
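If "less ugly" just means hiding the cast, a small helper keeps the call site readable. A minimal sketch (clamp_to_int is a made-up name, not a standard function):
#include <algorithm>
#include <cstddef>
#include <limits>

// Saturate a size_t to the int range.
inline int clamp_to_int(std::size_t n)
{
    constexpr std::size_t int_max = static_cast<std::size_t>((std::numeric_limits<int>::max)());
    return static_cast<int>(std::min(n, int_max));
}
The call site then becomes int truncatedMaxLen = clamp_to_int(maxLength);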

Setting pointer to a double array in for loop

I have an algorithm that I want to run that uses a potentially very long array of doubles. Because the array can be millions of elements long, I'm putting it on the GPU, so I need to export the array from a CPP file to a CU file. However, I'm prototyping it in CPP only for now, because it doesn't work in either case.
In my CPU prototype I get errors when I try to set the members of the double array with my for loop. For example, any operation involving cout gives error C2109: subscript requires array or pointer type in the CPP file,
or, if the same code is run from a CU file, error: expression must have pointer-to-object type.
const int size = 100000;
double inputMeshPts_PROXY[size][4];
inputMeshPts.get(inputMeshPts_PROXY);
int lengthPts = inputMeshPts.length();

if (useCUDA == 1)
{
    double *inputMeshPts_CUDA = &inputMeshPts_PROXY[size][4];
    myArray(lengthPts, inputMeshPts_CUDA);
}

MStatus abjBlendShape::myArray(int length_CUDA, float weight_CUDA, double *inputMeshPts_CUDA)
{
    for (int i = 0; i < length_CUDA; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            cout << inputMeshPts_CUDA[i][j] << endl;
            // inputMeshPts_CUDA[i][j] += (sculptedMeshPts_PROXY[i][j] - inputMeshPts_CUDA[i][j]); // WHAT I WANT, EVENTUALLY
        }
    }
}
When you are writing:
double *inputMeshPts_CUDA = &inputMeshPts_PROXY[size][4];
The variable inputMeshPts_CUDA is a pure pointer, so you cannot use two-dimensional [][] indexing through it as before. (Note also that &inputMeshPts_PROXY[size][4] points past the end of the array; &inputMeshPts_PROXY[0][0] is the address you want.) The right way to access it now is to linearize the indices:
inputMeshPts_CUDA[i*4+j]
Alternatively, you could declare your pointer "correctly":
double (*inputMeshPts_CUDA)[4] = inputMeshPts_PROXY;
which allows you to use the 2-dimensional indexing again.
MStatus abjBlendShape::myArray(int length_CUDA, float weight_CUDA, double *inputMeshPts_CUDA)
{
inputMeshPts_CUDA is just a pointer; the compiler has lost all of the dimension information. It needs that dimension information for inputMeshPts_CUDA[i][j], which would have to be converted to an access at address (byte arithmetic, not C++ pointer arithmetic):
inputMeshPts_CUDA + i * sizeof(double) * num_columns + j * sizeof(double)
You can either provide the missing information yourself and do the arithmetic like Angew suggests, or have the compiler pass the dimension information through:
template<size_t M, size_t N>
MStatus abjBlendShape::myArray(int length_CUDA, float weight_CUDA, double (&inputMeshPts_CUDA)[M][N])
Of course, this only works when the size is known at compile-time.
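For illustration, the dimensions are then deduced at the call site; a hypothetical call using the question's array (the 1.0f weight is just a placeholder):
myArray(lengthPts, 1.0f, inputMeshPts_PROXY); // M = 100000 and N = 4 are deduced from the array type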
inputMeshPts_CUDA is a pointer to double; that is, it can represent a 1D array. Yet you're accessing it as a 2D array: inputMeshPts_CUDA[i][j]. That doesn't make sense: you're effectively applying [j] to the double object stored at inputMeshPts_CUDA[i].
I believe you were looking for inputMeshPts_CUDA[i * 4 + j]; you have to compute the 2D addressing yourself.
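Putting the pieces together, a minimal sketch of the corrected function with manual 2D addressing (the += line from the question stays a comment since sculptedMeshPts_PROXY isn't shown, and the kSuccess return is an assumption about the surrounding Maya code):
MStatus abjBlendShape::myArray(int length_CUDA, float weight_CUDA, double *inputMeshPts_CUDA)
{
    for (int i = 0; i < length_CUDA; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            // row i, column j, with 4 columns per row
            cout << inputMeshPts_CUDA[i * 4 + j] << endl;
            // inputMeshPts_CUDA[i * 4 + j] += (sculptedMeshPts_PROXY[i][j] - inputMeshPts_CUDA[i * 4 + j]);
        }
    }
    return MStatus::kSuccess;
}
// called as: myArray(lengthPts, 1.0f, &inputMeshPts_PROXY[0][0]);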

C++ Initialization query

Can someone please tell me if there is anything incorrect with this initialization:
static unsigned char* var[1];
var[0] = new unsigned char[ 800 * 600 ];
Is this creating a 2D array at var[0]? Is this even valid?
It's creating a single array with 480,000 elements (aka 800*600) and storing it in var[0]. It is not creating a 2D array. It's valid as long as var[0] is an unsigned char*.
It is creating a 1-d array of length (800*600) at var[0]. You'll be able to access its elements with:
var[0][0]
var[0][1]
...
var[0][800*600-1]
I think you want something more like this (although I may be interpreting the question wrongly):
static unsigned char** var = new unsigned char*[800];
for (int i = 0; i < 800; i++)
    var[i] = new unsigned char[600];
That will give you an 800x600 2D character array.
Your code is valid, but the one-element array buys you nothing; var[0] is just a single unsigned char* pointer.
You may want to do something like:
static unsigned char* var;
var = new unsigned char[800 * 600];
That does not create a 2D array. It is just a 1D array. Having said that, you can use it as a 2D array if you compute the offset yourself. For instance, to access position (row, col) you could do:
var[row * 800 + col] // 800 columns per row
If you really want a "true" 2D array, you will need to use for instance:
static unsigned char** var;
var = new unsigned char*[600];
for (int i = 0; i < 600; i++)
var[i] = new unsigned char[800];
Of course, if you do not need to dynamically specify the array size at runtime, you can just use a statically allocated array:
static unsigned char var[600][800];
BTW, I assume 600 is the number of rows and 800 the number of cols (like in a matrix). That is why I always use 600 in the first index. Otherwise, just swap the values.
Yes, it's quite incorrect to use new[] here. What you're looking for is std::vector<unsigned char> var(800 * 600);.
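For completeness, a minimal sketch of the vector version with manual row-major indexing (keeping the 600-rows-by-800-columns convention from the answer above):
#include <vector>

std::vector<unsigned char> var(600 * 800); // 600 rows, 800 columns, zero-initialized
int row = 10, col = 20;                    // hypothetical position
var[row * 800 + col] = 255;                // element (row, col) via a computed offset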

How can you convert a std::bitset<64> to a double?

Is there a way to convert a std::bitset<64> to a double without using any external library (Boost, etc.)? I am using a bitset to represent a genome in a genetic algorithm and I need a way to convert a set of bits to a double.
The C++11 road:
union Converter { uint64_t i; double d; };

double convert(std::bitset<64> const& bs) {
    Converter c;
    c.i = bs.to_ullong();
    return c.d;
}
EDIT: As noted in the comments, we can use char* aliasing as it is unspecified instead of being undefined.
double convert(std::bitset<64> const& bs) {
    static_assert(sizeof(uint64_t) == sizeof(double), "Cannot use this!");
    uint64_t const u = bs.to_ullong();
    double d;
    // Aliases to `char*` are explicitly allowed in the Standard (and only them)
    char const* cu = reinterpret_cast<char const*>(&u);
    char* cd = reinterpret_cast<char*>(&d);
    // Copy the bitwise representation from u to d
    memcpy(cd, cu, sizeof(u));
    return d;
}
C++11 is still required for to_ullong.
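If C++20 is available, std::bit_cast from <bit> expresses the same bitwise reinterpretation directly, with the size check done by the compiler; a minimal sketch:
#include <bit>
#include <bitset>
#include <cstdint>

double convert(std::bitset<64> const& bs) {
    // source and destination must have the same size; checked at compile time
    return std::bit_cast<double>(static_cast<std::uint64_t>(bs.to_ullong()));
}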
Most people are trying to provide answers that let you treat the bit-vector as though it directly contained an encoded int or double.
I would advise you to completely avoid that approach. While it does "work" for some definition of working, it introduces Hamming cliffs all over the place. You usually want your encoding arranged so that if two decoded values are near one another, their encoded values are near one another as well. It also forces you to use 64 bits of precision.
I would manage the conversion manually. Say you have three variables to encode, x, y, and z. Your domain expertise can be used to say, for example, that -5 <= x < 5, 0 <= y < 100, and 0 <= z < 1, where you need 8 bits of precision for x, 12 bits for y, and 10 bits for z. This gives you a total search space of only 30 bits. You can have a 30-bit string, treat the first 8 bits as encoding x, the next 12 as y, and the last 10 as z. You are also free to Gray-code each one to remove the Hamming cliffs.
I've personally done the following in the past:
inline void binary_encoding::encode(const vector<double>& params)
{
    unsigned int start = 0;
    for(unsigned int param = 0; param < params.size(); ++param) {
        // m_bpp[i] = number of bits in encoding of parameter i
        unsigned int num_bits = m_bpp[param];
        // map the double onto the appropriate integer range
        // m_range[i] is a pair of (min, max) values for ith parameter
        pair<double,double> prange = m_range[param];
        double range = prange.second - prange.first;
        double max_bit_val = pow(2.0, static_cast<double>(num_bits)) - 1;
        int int_val = static_cast<int>((params[param] - prange.first) * max_bit_val / range + 0.5);
        // convert the integer to binary
        vector<int> result(m_bpp[param]);
        for(unsigned int b = 0; b < num_bits; ++b) {
            result[b] = int_val % 2;
            int_val /= 2;
        }
        if(m_gray) {
            for(unsigned int b = 0; b < num_bits - 1; ++b) {
                result[b] = !(result[b] == result[b+1]);
            }
        }
        // insert the bits into the correct spot in the encoding
        copy(result.begin(), result.end(), m_genotype.begin() + start);
        start += num_bits;
    }
}
inline void binary_encoding::decode()
{
    unsigned int start = 0;
    // for each parameter
    for(unsigned int param = 0; param < m_bpp.size(); param++) {
        unsigned int num_bits = m_bpp[param];
        unsigned int intval = 0;
        if(m_gray) {
            // convert from gray to binary
            vector<int> binary(num_bits);
            binary[num_bits-1] = m_genotype[start+num_bits-1];
            intval = binary[num_bits-1];
            for(int i = num_bits-2; i >= 0; i--) {
                binary[i] = !(binary[i+1] == m_genotype[start+i]);
                intval += intval + binary[i];
            }
        }
        else {
            // convert from binary encoding to integer
            for(int i = num_bits-1; i >= 0; i--) {
                intval += intval + m_genotype[start+i];
            }
        }
        // convert from integer to double in the appropriate range
        pair<double,double> prange = m_range[param];
        double range = prange.second - prange.first;
        double m = range / (pow(2.0, double(num_bits)) - 1.0);
        // m_phenotype is a vector<double> containing all the decoded parameters
        m_phenotype[param] = m * double(intval) + prange.first;
        start += num_bits;
    }
}
Note that for reasons that probably don't matter to you, I wasn't using bit vectors -- just ordinary vector<int> to encode things. And of course, there's a bunch of stuff tied into this code that isn't shown here, but you can probably get the basic idea.
One other note, if you're doing GPU calculations or if you have a particular problem such that 64 bits are the appropriate size anyway, it may be worth the extra overhead to stuff everything into native words. Otherwise, I would guess that the overhead you add to the search process will probably overwhelm whatever benefits you get by faster encoding and decoding.
Edit: I've decided that I was being a bit silly with this. While you do end up with a double, it assumes that the bitset holds an integer... which is a big assumption to make. You will end up with a predictable and repeatable value per bitset, but I still don't think this is what the author intended.
Well, if you iterate over the bit values and do
output_double += pow(2, 64 - (bit_position + 1)) * bit_value;
that would work, as long as you treat bit_position 0 as the most significant bit (note that std::bitset itself numbers its bits starting from the least significant one).
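For illustration, a self-contained version of that loop, treating bit 63 of the bitset as the most significant bit per the caveat above (doubles represent integers exactly only up to 2^53, so high bit patterns will round):
#include <bitset>
#include <cmath>

double to_double_value(std::bitset<64> const& bs) {
    double output_double = 0.0;
    for (unsigned bit_position = 0; bit_position < 64; ++bit_position) {
        double bit_value = bs[63 - bit_position] ? 1.0 : 0.0;
        output_double += std::pow(2.0, 64 - (bit_position + 1)) * bit_value;
    }
    return output_double; // equals double(bs.to_ullong()) up to rounding
}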