Convert __m128d to double - c++

I have just started using SSE extensions, beginning with a simple vector dot product.
So I wrote the following code:
void SSE_vectormult(double * A, double * B)
{
    __m128d a;
    __m128d b;
    a = _mm_load_pd(A);
    b = _mm_load_pd(B);
    const int mask = 0xf1;
    __m128d res = _mm_dp_pd(a, b, mask);
    A = res; // <-- this is where I'm stuck: how do I get the result back into a double?
}
where A and B are vectors of the same length. Now I have to convert the result from __m128d back into a double. Is there an easy way to do this (or a conversion function)?
Thank you!

You should use double dot = _mm_cvtsd_f64(res). This extracts the lower 64-bit double from the 128-bit register.

The counterpart of load would be store (documented by both Microsoft and Intel). So in your case I'd guess (double precision, aligned pointer, regular store):
_mm_store_pd(A, res); //A = res;
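Combining the two suggestions, a minimal sketch of a corrected function might look like this (assuming A and B each point to two 16-byte-aligned doubles and that SSE4.1 is available for _mm_dp_pd; the function name is illustrative, not from the question):

#include <smmintrin.h> // SSE4.1, for _mm_dp_pd

// Sketch: dot product of two 2-element double vectors, returned as a plain double.
double SSE_dot2(const double* A, const double* B)
{
    __m128d a = _mm_load_pd(A);          // requires 16-byte-aligned pointers
    __m128d b = _mm_load_pd(B);
    __m128d res = _mm_dp_pd(a, b, 0x31); // multiply both lanes, sum, write result to the low lane
    return _mm_cvtsd_f64(res);           // extract the low double
}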

Related

SSE vector Operation on type double

I want to use SIMD operations on vectors containing double values on the AMD64 architecture. Below is a simple example of my problem. It works fine if I print float values, but not for doubles. I need precision of up to 9 decimal digits.
#include <stdio.h>
#include <emmintrin.h>

typedef union f4vector
{
    __m128d v;
} float4;

int main()
{
    float4 x, y, z;
    double f0[2] = {2334, 5};
    double f1[2] = {2334.32345324, 5};
    double f3[2];
    x.v = _mm_set_pd(f0[0], f0[1]);
    y.v = _mm_set_pd(f1[0], f1[1]);
    z.v = _mm_mul_pd(x.v, y.v);
    f3[0] = z.v[0];
    f3[1] = z.v[1];
    printf("%d, %d\n", f3[0], f3[1]); // doesn't print correct values.
}
You have some mistakes:
Using the %d format specifier instead of %f in printf.
To use SIMD instructions effectively, you have to load and store data with vector instructions such as _mm_loadu_pd/_mm_storeu_pd. The _mm_set_pd intrinsic is very inefficient.
Below is a corrected example:
#include <stdio.h>
#include <emmintrin.h>

int main()
{
    double d0[2] = { 2334, 5 };
    double d1[2] = { 2334.32345324, 5 };
    double d2[2] = { 0, 0 };
    __m128d v0 = _mm_loadu_pd(d0);
    __m128d v1 = _mm_loadu_pd(d1);
    __m128d v2 = _mm_mul_pd(v0, v1);
    _mm_storeu_pd(d2, v2);
    printf("%f, %f\n", d2[0], d2[1]);
}
Output:
5448310.939862, 25.000000
In printf you have used the %d format specifier; you need to use %f, since you want to print a double value.

Split Multiplication of integers

I need an algorithm that takes two 32-bit integers as parameters and returns their product split into two other 32-bit integers: a highest-32-bits part and a lowest-32-bits part.
I would try:
uint32_t p1, p2; // globals to hold the result

void mult(uint32_t x, uint32_t y){
    uint64_t r = (uint64_t)x * y; // widen before multiplying, or the product is truncated to 32 bits
    p1 = r >> 32;
    p2 = r & 0xFFFFFFFF;
}
Although it works¹, the existence of 64-bit integers on the machine is not guaranteed, nor is the compiler's support for them.
So, how is the best way to solve it?
Note 1: Actually, it didn't work, because my compiler does not support 64-bit integers.
Obs: Please, avoid using boost.
Just use 16-bit digits.
void multiply(uint32_t a, uint32_t b, uint32_t* h, uint32_t* l) {
    uint32_t const base = 0x10000; // work in base 2^16 so every partial product fits in 32 bits
    uint32_t al = a % base, ah = a / base, bl = b % base, bh = b / base;
    *l = al * bl;
    *h = ah * bh;
    uint32_t rlh = *l / base + al * bh;  // middle partial products, with carries
    *h += rlh / base;
    rlh = rlh % base + ah * bl;
    *h += rlh / base;
    *l = (rlh % base) * base + *l % base;
}
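A quick usage sketch (my own test, not part of the original answer), multiplying the largest 32-bit values and printing the combined 64-bit result in hex:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t h = 0, l = 0;
    multiply(0xFFFFFFFFu, 0xFFFFFFFFu, &h, &l);
    /* 0xFFFFFFFF * 0xFFFFFFFF = 0xFFFFFFFE00000001 */
    printf("%08X%08X\n", h, l); /* prints FFFFFFFE00000001 */
    return 0;
}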
As I commented, you can treat each number as a binary string of length 32.
Just multiply these numbers using school arithmetic. You will get a 64-character-long string.
Then just partition it.
If you want fast multiplication, you can look into the Karatsuba multiplication algorithm.
There is an explanation and an implementation of the Karatsuba algorithm available. I have downloaded the code and run it several times; it seems to work well. You can modify the code according to your needs.
If the unsigned long long type is supported, this should work:
void umult32(uint32_t a, uint32_t b, uint32_t* c, uint32_t* d)
{
    unsigned long long x = ((unsigned long long)a) * ((unsigned long long)b); // thanks to @Толя
    *c = x & 0xffffffff;         // low 32 bits
    *d = (x >> 32) & 0xffffffff; // high 32 bits
}
Logic borrowed from an existing implementation.

Alternative to C++11's std::nextafter and std::nexttoward for C++03?

As the title says, the functionality I'm after is provided by C++11's math library: finding the next floating-point value towards a particular value.
Aside from pulling the code out of the std library (which I may have to resort to), are there any alternatives for doing this in C++03 (using GCC 4.4.6)?
In a platform-dependent way, assuming IEEE 754 and modulo endianness, you can store the bits of the floating-point number in an integer, increment it by one, and retrieve the result:
float input = 3.15;
uint32_t tmp;
unsigned char * p = reinterpret_cast<unsigned char *>(&tmp);
unsigned char * q = reinterpret_cast<unsigned char *>(&input);
p[0] = q[0]; p[1] = q[1]; p[2] = q[2]; p[3] = q[3]; // endianness?!
++tmp;
q[0] = p[0]; q[1] = p[1]; q[2] = p[2]; q[3] = p[3];
return input;
Beware of zeros, NaNs and infinities, of course.
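As a minimal sketch of the same idea wrapped into a C++03 helper (assuming a 32-bit unsigned int, IEEE-754 single precision, and a positive, finite, non-zero input; memcpy sidesteps the byte-order bookkeeping because it copies the whole object):

#include <cstring> // memcpy

// Sketch only: next representable float above x, for positive finite non-zero x.
float next_float_up(float x)
{
    unsigned int bits;                   // assumed to be 32 bits wide
    std::memcpy(&bits, &x, sizeof bits); // grab the bit pattern
    ++bits;                              // step to the adjacent representable value
    std::memcpy(&x, &bits, sizeof bits);
    return x;
}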
I realize this answer is quite late, but I stumbled upon the solution when I needed something very similar, so I figured it should be included here for completeness. In C++11, the mentioned APIs are in the std namespace. Prior to that, just drop the std qualifier and use the declarations from math.h:
double nextafter (double x , double y);
float nextafterf (float x , float y);
long double nextafterl (long double x, long double y);
Although I don't have GCC 4.4.6 to validate with, this did work for me on an old compiler that does not support C++11.
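A small usage sketch of the non-std declarations above (assuming a compiler where math.h provides them, as described):

#include <math.h>
#include <stdio.h>

int main()
{
    double next = nextafter(1.0, 2.0); // smallest double strictly greater than 1.0
    printf("%.17g\n", next);           // prints 1.0000000000000002
    return 0;
}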

How to convert float to int preserving bit value

I have a float4 coming into a compute shader. Three of these floats are really floats, but the fourth is two uints shifted together. How would I convert the float to uint, preserving the bit sequence instead of the numeric value?
On the C++ side I solved it by creating a uint pointer, filling it with the desired number and passing the pointer on as a float pointer instead. However, in HLSL, as similar as it is to C/C++, there are no pointers, so I'm stuck here :|
In HLSL you should be able to do the following (assuming the value you are after is in f4.w)
uint ui = asuint( f4.w );
uint ui1 = ui & 0xffff;
uint ui2 = ui >> 16;
Basically it looks like the asuint intrinsic is your friend :)
You could use a union.
float f; // your float value is here

union X
{
    float f;
    short int a[2];
} x;

x.f = f;
int i1 = x.a[0]; // these are your ints
int i2 = x.a[1];
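For the C++ side of this kind of packing, a sketch that avoids the pointer cast altogether by going through memcpy (function name and parameters are illustrative, not from the question; beware that some packed bit patterns correspond to NaNs):

#include <stdint.h>
#include <cstring>

// Packs two 16-bit values into one 32-bit pattern and reinterprets it as a float,
// so the HLSL side can recover them with asuint(), & 0xffff and >> 16.
float pack_two_uint16(uint16_t lo, uint16_t hi)
{
    uint32_t combined = (uint32_t)lo | ((uint32_t)hi << 16);
    float f;
    std::memcpy(&f, &combined, sizeof f); // preserve the bit pattern, not the numeric value
    return f;
}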

How can you convert a std::bitset<64> to a double?

Is there a way to convert a std::bitset<64> to a double without using any external library (Boost, etc.)? I am using a bitset to represent a genome in a genetic algorithm and I need a way to convert a set of bits to a double.
The C++11 road:
union Converter { uint64_t i; double d; };

double convert(std::bitset<64> const& bs) {
    Converter c;
    c.i = bs.to_ullong();
    return c.d;
}
EDIT: As noted in the comments, we can use char* aliasing instead, as it is merely unspecified rather than undefined.
#include <cstring> // memcpy

double convert(std::bitset<64> const& bs) {
    static_assert(sizeof(uint64_t) == sizeof(double), "Cannot use this!");
    uint64_t const u = bs.to_ullong();
    double d;
    // Aliases to `char*` are explicitly allowed in the Standard (and only them)
    char const* cu = reinterpret_cast<char const*>(&u);
    char* cd = reinterpret_cast<char*>(&d);
    // Copy the bitwise representation from u to d
    memcpy(cd, cu, sizeof(u));
    return d;
}
C++11 is still required for to_ullong.
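A quick usage sketch: feeding in the IEEE-754 bit pattern of 1.0 should come back as 1.0 (assuming either convert function above):

#include <bitset>
#include <cstdio>

int main()
{
    std::bitset<64> bits(0x3FF0000000000000ULL); // IEEE-754 bit pattern of 1.0
    std::printf("%f\n", convert(bits));          // prints 1.000000
    return 0;
}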
Most people are trying to provide answers that let you treat the bit-vector as though it directly contained an encoded int or double.
I would advise you to completely avoid that approach. While it does "work" for some definition of working, it introduces Hamming cliffs all over the place. You usually want your encoding to arrange things so that if two decoded values are near to one another, then their encoded values are near to one another as well. It also forces you to use 64 bits of precision.
I would manage the conversion manually. Say you have three variables to encode, x, y, and z. Your domain expertise can be used to say, for example, that -5 <= x < 5, 0 <= y < 100, and 0 <= z < 1, where you need 8 bits of precision for x, 12 bits for y, and 10 bits for z. This gives you a total search space of only 30 bits. You can have a 30-bit string, treat the first 8 as encoding x, the next 12 as y, and the last 10 as z. You are also free to Gray-code each one to remove the Hamming cliffs.
I've personally done the following in the past:
inline void binary_encoding::encode(const vector<double>& params)
{
    unsigned int start = 0;
    for(unsigned int param = 0; param < params.size(); ++param) {
        // m_bpp[i] = number of bits in encoding of parameter i
        unsigned int num_bits = m_bpp[param];

        // map the double onto the appropriate integer range
        // m_range[i] is a pair of (min, max) values for ith parameter
        pair<double,double> prange = m_range[param];
        double range = prange.second - prange.first;
        double max_bit_val = pow(2.0, static_cast<double>(num_bits)) - 1;
        int int_val = static_cast<int>((params[param] - prange.first) * max_bit_val / range + 0.5);

        // convert the integer to binary
        vector<int> result(m_bpp[param]);
        for(unsigned int b = 0; b < num_bits; ++b) {
            result[b] = int_val % 2;
            int_val /= 2;
        }

        if(m_gray) {
            for(unsigned int b = 0; b < num_bits - 1; ++b) {
                result[b] = !(result[b] == result[b+1]);
            }
        }

        // insert the bits into the correct spot in the encoding
        copy(result.begin(), result.end(), m_genotype.begin() + start);
        start += num_bits;
    }
}
inline void binary_encoding::decode()
{
    unsigned int start = 0;
    // for each parameter
    for(unsigned int param = 0; param < m_bpp.size(); param++) {
        unsigned int num_bits = m_bpp[param];
        unsigned int intval = 0;

        if(m_gray) {
            // convert from gray to binary
            vector<int> binary(num_bits);
            binary[num_bits-1] = m_genotype[start+num_bits-1];
            intval = binary[num_bits-1];
            for(int i = num_bits-2; i >= 0; i--) {
                binary[i] = !(binary[i+1] == m_genotype[start+i]);
                intval += intval + binary[i];
            }
        }
        else {
            // convert from binary encoding to integer
            for(int i = num_bits-1; i >= 0; i--) {
                intval += intval + m_genotype[start+i];
            }
        }

        // convert from integer to double in the appropriate range
        pair<double,double> prange = m_range[param];
        double range = prange.second - prange.first;
        double m = range / (pow(2.0, double(num_bits)) - 1.0);
        // m_phenotype is a vector<double> containing all the decoded parameters
        m_phenotype[param] = m * double(intval) + prange.first;
        start += num_bits;
    }
}
Note that for reasons that probably don't matter to you, I wasn't using bit vectors -- just ordinary vector<int> to encode things. And of course, there's a bunch of stuff tied into this code that isn't shown here, but you can probably get the basic idea.
One other note: if you're doing GPU calculations, or if you have a particular problem such that 64 bits are the appropriate size anyway, it may be worth the extra overhead to stuff everything into native words. Otherwise, I would guess that the overhead you add to the search process will probably overwhelm whatever benefit you get from faster encoding and decoding.
Edit: I've decided that I was being a bit silly with this. While you do end up with a double, it assumes that the bitset holds an integer... which is a big assumption to make. You will end up with a predictable and repeatable value per bitset, but I still don't think this is what the author intended.
Well, if you iterate over the bit values and do
output_double += pow( 2, 64-(bit_position+1) ) * bit_value;
that would work, as long as it is big-endian.
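As a sketch of that loop using std::bitset's least-significant-bit-first indexing (my own illustration; note the result is exact only while the integer value fits in the 53-bit mantissa of a double):

#include <bitset>
#include <cmath>
#include <cstddef>

// Sketch: interpret the bitset as an unsigned integer and return its numeric value as a double.
double bitset_value(const std::bitset<64>& bs)
{
    double output_double = 0.0;
    for (std::size_t bit_position = 0; bit_position < bs.size(); ++bit_position)
        output_double += std::pow(2.0, double(bit_position)) * bs[bit_position]; // bit i weighs 2^i
    return output_double;
}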