Injecting a PCM s16be array into Novocaine AudioManager

To #alexbw and friends,
First of all thanks for this great piece of code.
I have pcm data (signed 16 bit big endian) in a byte array and I want to play it with Novocaine AudioManager setOutputBlock.
I understand I first need to convert to a float array.
Or is there a faster way?
Cheers
Philippe

Late, but for anyone else reading:
You can use the Accelerate framework here:
float *float_data = malloc(sizeof(float) * numFrames);
vDSP_vflt16(my_s16_data, 1, float_data, 1, numFrames);
//Scale from [-32768, 32767] to [-1, 1]
float scale = 1.0f / (float)INT16_MAX;
vDSP_vsmul(float_data, 1, &scale, float_data, 1, numFrames);
And "float_data" will now have the float equivalent.
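One caveat: the question's data is big-endian, and vDSP_vflt16 reads the shorts in host byte order, so on a little-endian device the bytes need swapping first. A portable sketch without Accelerate (the function name is illustrative):

```cpp
#include <cstdint>
#include <cstddef>
#include <vector>

// Convert big-endian signed 16-bit PCM to floats in [-1, 1).
std::vector<float> pcm_s16be_to_float(const uint8_t *bytes, size_t numFrames) {
    std::vector<float> out(numFrames);
    const float scale = 1.0f / 32768.0f;
    for (size_t i = 0; i < numFrames; ++i) {
        // Assemble each sample from its big-endian byte pair.
        int16_t s = (int16_t)((bytes[2 * i] << 8) | bytes[2 * i + 1]);
        out[i] = s * scale;
    }
    return out;
}
```

On a real iOS device you would likely do the swap once and then hand the whole buffer to vDSP as in the answer above; this loop just shows the byte order and scaling explicitly.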

Related

Trying to mix two PCM audio sources

I have two audio files I read in using libsndfile.
SNDFILE* file1 = sf_open("D:\\audio1.wav", SFM_READ, &info);
SNDFILE* file2 = sf_open("D:\\audio2.wav", SFM_READ, &info2);
After I've done the previous I sample x-number of samples:
//Buffers that will hold the samples
short* buffer1 = new short[2 * sizeof(short) * 800000];
short* buffer2 = new short[2 * sizeof(short) * 800000];
// Read the samples using libsndfile
sf_readf_short(file1, buffer1, 800000);
sf_readf_short(file2, buffer2, 800000);
Now, I want to mix those two. I read that you need to get the left and right channel separately and then sum them up. I tried doing it like this:
short* mixdown = new short[channels * sizeof(short) * 800000];
for (int t = 0; t < 800000; ++t)
{
    mixdown[t] = buffer1[t] + buffer2[t] - ((buffer1[t]*buffer2[t]) / 65535);
    t++;
    mixdown[t] = buffer1[t] + buffer2[t] - ((buffer1[t]*buffer2[t]) / 65535);
}
After that I'm encoding the new audio using ffmpeg:
FILE* process2 = _popen("ffmpeg -y -f s16le -acodec pcm_s16le -ar 44100 -ac 2 -i - -f vob -ac 2 D:\\audioMixdown.wav", "wb");
fwrite(mixdown, 2 * sizeof(short) * 800000, 1, process2);
Now, the problem is that the audio from buffer1 sounds fine in the mixdown, but all that buffer2 seems to add is noise (as if it were an old audio recording) when I encode the mixdown to a file.
If I encode only one of the two to a file it works perfectly.
I have no idea why it's going wrong. I guess it has something to do with the mixing, obviously, but I don't know what I'm doing wrong. I got the mixing algorithm here but it doesn't give me the expected results.
I've also read other information on SO about people having similar questions but I couldn't figure it out with those.
Your mixing code is very odd: you are adding a non-linear term, which will result in distortion. It looks like a hack specifically for 8-bit PCM, where the dynamic range is very limited, but you don't need it for 16-bit PCM. For basic mixing you just want this:
for (int t = 0; t < 800000 * 2; ++t)
{
    mixdown[t] = (buffer1[t] + buffer2[t]) / 2;
}
Note that the divide by 2 is necessary to prevent distortion when you have two full-scale signals. Note also that I've removed the 2x loop unrolling.
Your algorithm is correct, but you missed an important point: the range of 16-bit PCM is -32768 to 32767. Thus you must divide by 32768, not 65535.
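Putting the two answers together, a minimal sketch of the corrected mixing loop (the sum is widened to 32 bits before averaging, so it cannot overflow):

```cpp
#include <cstdint>
#include <cstddef>

// Average two interleaved s16 buffers sample by sample.
// With the divide by 2 the sum always fits back into 16 bits,
// so no clamping is needed.
void mix_s16(const int16_t *a, const int16_t *b, int16_t *out, size_t numSamples) {
    for (size_t i = 0; i < numSamples; ++i) {
        out[i] = (int16_t)(((int32_t)a[i] + b[i]) / 2);
    }
}
```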

How to convert 16 bit image to 32 bit image in OpenCV?

I am a novice in OpenCV. My program reads image data as 16-bit unsigned int. I need to multiply the image data by some gain, also a 16-bit unsigned int, and the resulting data should be kept in a 32-bit image file.
I tried the following, but I get an 8-bit all-white image. Please help.
Mat inputData = Mat(Size(width, height), CV_16U, inputdata);
inputData.convertTo(input1Data, CV_32F);
input1Data = input1Data * gain;//gain is ushort
As Micka noticed in the comment, first of all we need to scale inputData to have values between 0.0f and 1.0f by passing a scaling factor:
inputData.convertTo(input1Data, CV_32F, 1.0/65535.0f); // since in inputData
// we have values between 0 and
// 65535 so all resulted values
// will be between 0.0f and 1.0f
And now, the same with the multiplication:
input1Data = input1Data * gain * (1.0f / 65535.0f); // gain, of course, will be
// automatically cast to float
// therefore the resulted factor
// will have value from 0 to 1,
// so input1Data too!
And I think this should compile too:
input1Data *= gain * (1.0f / 65535.0f);
which optimizes the first version a bit by not creating temporary data.

OpenCL: Downsampling with bilinear interpolation

I have a problem downsampling an image with bilinear interpolation. I've read almost all the relevant articles on Stack Overflow and searched around on Google, trying to solve, or at least locate, the problem in my OpenCL kernel. This is my main source for the theory. I implemented this code in OpenCL:
__kernel void downsample(__global uchar* image, __global uchar* outputImage,
                         __global int* width, __global int* height, __global float* factor)
{
    //image: vector containing original RGB values
    //outputImage: vector containing "downsampled" RGB mean values
    //factor: downsampling factor, downscaling the image by factor: 1024*1024 -> 1024/factor * 1024/factor
    int r = get_global_id(0);
    int c = get_global_id(1);       //current coordinates
    int oWidth = get_global_size(0);
    int olc, ohc, olr, ohr;         //coordinates of the original image used for bilinear interpolation
    int index;                      //linearized index of the point
    uchar q11, q12, q21, q22;
    float accurate_c, accurate_r;   //the exact scaled point
    int k;

    accurate_c = convert_float(c*factor[0]);
    olc = convert_int(accurate_c);
    ohc = olc + 1;
    if (!(ohc < width[0]))
        ohc = olc;

    accurate_r = convert_float(r*factor[0]);
    olr = convert_int(accurate_r);
    ohr = olr + 1;
    if (!(ohr < height[0]))
        ohr = olr;

    index = (c + r*oWidth)*3; //3 bytes per pixel

    //Compute RGB values: take a central mean RGB value among four points
    for (k = 0; k < 3; k++) {
        q11 = image[(olc + olr*width[0])*3 + k];
        q12 = image[(olc + ohr*width[0])*3 + k];
        q21 = image[(ohc + olr*width[0])*3 + k];
        q22 = image[(ohc + ohr*width[0])*3 + k];
        outputImage[index+k] = convert_uchar(q11*(ohc - accurate_c)*(ohr - accurate_r) +
                                             q21*(accurate_c - olc)*(ohr - accurate_r) +
                                             q12*(ohc - accurate_c)*(accurate_r - olr) +
                                             q22*(accurate_c - olc)*(accurate_r - olr));
    }
}
The kernel works with factor = 2, 4, 5, 6 but not with factor = 3 or 7 (I get missing pixels, and the image appears a little bit skewed), whereas the "identical" code written in C++ works fine with all factor values. I can't explain why that happens in OpenCL. I attach my full code project here
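For comparison, here is the bilinear formula from the linked article written out on the CPU for a single channel. This is a sketch of the math only, not the asker's actual C++ version:

```cpp
// Bilinear interpolation of one channel at fractional coordinates (x, y).
// q11..q22 follow the usual naming: q11 = f(x1, y1), q22 = f(x2, y2).
float bilerp(float q11, float q21, float q12, float q22,
             float x1, float x2, float y1, float y2, float x, float y) {
    float dx = x2 - x1, dy = y2 - y1;
    // Degenerate cells (edge pixels where the high coordinate was clamped
    // onto the low one, as in the kernel) fall back to the sample itself.
    if (dx == 0.0f || dy == 0.0f) return q11;
    return (q11 * (x2 - x) * (y2 - y) +
            q21 * (x - x1) * (y2 - y) +
            q12 * (x2 - x) * (y - y1) +
            q22 * (x - x1) * (y - y1)) / (dx * dy);
}
```

In the kernel the divisor is implicitly 1 because the neighbours are one pixel apart; note the weights must pair each sample with the distance to the *opposite* corner, which is what both versions do.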

zlib compression on floats in a struct

I have been searching for a way to compress a struct containing float variables, using the zlib library (and the compress function).
Every example I've seen shows how to compress a string, specifically an unsigned char*.
My struct is a simple one:
struct Particle {
    float x;
    float y;
    float z;
};
And I am calling the compress function as below:
uLong initSize = sizeof(Particle);
uLongf destSize = initSize * 1.1 + 12;
Bytef *dataOriginal = (Bytef*)malloc( initSize );
Bytef *dataCompressed = (Bytef*)malloc( destSize );
Particle p;
memset( &p, 0, sizeof(Particle) );
p.x = 10.24;
p.y = 23.5;
p.z = 7.4;
memcpy( dataOriginal, &p, sizeof(p) );
compress( dataCompressed, &destSize, dataOriginal, initSize );
But when I try to uncompress my data to see what's inside, I can't get back to my initial float values:
Bytef *decomp = (Bytef*)malloc( initSize );
uncompress( decomp, &initSize, dataCompressed, destSize );
for (int i = 0; i < initSize; i++) {
    std::cout << (float)decomp[i] << std::endl;
}
If anyone has a solution to this problem, please share; I've been at it for two days now...
You would need to copy the decompressed data back into the Particle struct, just like you copied it out in the first place. (Or you could just use casts instead of copies.) Then you will recover the original floats in the struct. Whatever it is you think you're doing with `decomp[i]` doesn't make any sense: it casts each individual byte to a float.
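A minimal sketch of the copy-back step (the zlib calls are omitted; the point is only how the bytes map back onto the struct):

```cpp
#include <cstring>

struct Particle {
    float x, y, z;
};

// Reinterpret a decompressed byte buffer as a Particle again.
Particle bytes_to_particle(const unsigned char *decomp) {
    Particle p;
    // Copy the raw bytes back in one go; casting individual bytes to
    // float (as in the question) reads them as numbers 0..255 instead.
    std::memcpy(&p, decomp, sizeof(Particle));
    return p;
}
```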
However there are several problems with this. First, it is only assured to work on the same machine, with the same compiler, and even then only within the same version of the compiler. If a different compiler or different version chooses to align the structure differently, then the compressed data will not be transferable between the two. If there is a different representation of floats between different machines, the compressed data will not be transferable.
Furthermore, you will not get any compression when compressing three floats. I presume that this is just a prelude to compressing a large array of such Particle structs. Then maybe you'll get somewhere with this.
Better would be to first convert the floats to the precision needed as integers. You should know the range and the useful number of bits for your application. This will compress before even using compress(), by using only the number of bits needed as opposed to 32 per float. Then convert those integers portably to a series of bytes with shift operations. You can then also apply differencing to successive Particles (e.g. x1-x2, y1-y2, z1-z2), which might improve compression if there is a correlation between successive Particles.
By the way, instead of * 1.1 + 12, you should use compressBound(), which does exactly what you want in a way that is assured by the zlib library for future versions.
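A sketch of the quantize-then-serialize idea suggested above, assuming the values fit a known range and that int16 resolution is enough (both the range and the precision are application-specific assumptions):

```cpp
#include <cstdint>
#include <cmath>

// Quantize a float in [-range, range] to int16 and emit little-endian bytes.
void quantize_le(float v, float range, unsigned char out[2]) {
    int32_t q = (int32_t)std::lround(v / range * 32767.0f);
    if (q > 32767) q = 32767;      // clamp out-of-range inputs
    if (q < -32768) q = -32768;
    // Portable byte order via shifts, independent of host endianness.
    out[0] = (unsigned char)(q & 0xFF);
    out[1] = (unsigned char)((q >> 8) & 0xFF);
}
```

The resulting byte stream (optionally differenced between successive Particles) is what you would then hand to compress().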

C++ creating image

I haven't been programming in C++ for a while, and now I have to write a simple thing, but it's driving me nuts.
I need to create a bitmap from a table of colors:
char image[200][200][3];
First coordinate is width, second height, third colors: RGB. How to do it?
Thanks for any help.
Adam
I'm sure you've already checked http://en.wikipedia.org/wiki/BMP_file_format.
With that information in hand we can write a quick BMP with:
// setup header structs bmpfile_header and bmp_dib_v3_header before this (see wiki)
// * note for a windows bitmap you want a negative height if you're starting from the top *
// * otherwise the image data is expected to go from bottom to top *
FILE *fp = fopen("file.bmp", "wb");
fwrite(&bmpfile_header, sizeof(bmpfile_header), 1, fp);
fwrite(&bmp_dib_v3_header, sizeof(bmp_dib_v3_header_t), 1, fp);
for (int i = 0; i < 200; i++) {
    for (int j = 0; j < 200; j++) {
        fwrite(&image[j][i][2], 1, 1, fp);
        fwrite(&image[j][i][1], 1, 1, fp);
        fwrite(&image[j][i][0], 1, 1, fp);
    }
}
fclose(fp);
If setting up the headers is a problem let us know.
Edit: I forgot, BMP files expect BGR instead of RGB, I've updated the code (surprised nobody caught it).
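Since the header setup is where most people get stuck, here is a sketch of a minimal 24-bit BMP writer with the two headers built field by field (offsets follow BITMAPFILEHEADER + BITMAPINFOHEADER as described on the wiki page; a positive height means rows are stored bottom-up):

```cpp
#include <cstdio>
#include <cstdint>

// Write `image` (RGB, row-major, top-down) as a minimal 24-bit BMP.
bool write_bmp(const char *path, const unsigned char *image, int w, int h) {
    int rowSize = (3 * w + 3) & ~3;            // BMP rows are padded to 4 bytes
    uint32_t fileSize = 54 + (uint32_t)rowSize * h;
    unsigned char hdr[54] = {0};
    auto put32 = [&](int off, uint32_t v) {    // little-endian 32-bit field
        hdr[off]     = (unsigned char)v;
        hdr[off + 1] = (unsigned char)(v >> 8);
        hdr[off + 2] = (unsigned char)(v >> 16);
        hdr[off + 3] = (unsigned char)(v >> 24);
    };
    hdr[0] = 'B'; hdr[1] = 'M';
    put32(2, fileSize);
    hdr[10] = 54;                              // offset of pixel data
    hdr[14] = 40;                              // info header size
    put32(18, (uint32_t)w);
    put32(22, (uint32_t)h);                    // positive => bottom-up rows
    hdr[26] = 1;                               // color planes
    hdr[28] = 24;                              // bits per pixel
    FILE *fp = std::fopen(path, "wb");
    if (!fp) return false;
    std::fwrite(hdr, 1, 54, fp);
    unsigned char pad[3] = {0, 0, 0};
    for (int y = h - 1; y >= 0; --y) {         // emit rows bottom-up
        for (int x = 0; x < w; ++x) {
            const unsigned char *p = image + (y * w + x) * 3;
            unsigned char bgr[3] = {p[2], p[1], p[0]};  // BMP stores BGR
            std::fwrite(bgr, 1, 3, fp);
        }
        std::fwrite(pad, 1, (size_t)(rowSize - 3 * w), fp);
    }
    std::fclose(fp);
    return true;
}
```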
I'd suggest ImageMagick; it's a comprehensive library.
I would first try to find out, how the BMP file format (that's what you mean by a bitmap, right?) is defined. Then I would convert the array to that format and print it to the file.
If that's an option, I would also consider trying to find an existing library for BMP files creation, and just use it.
Sorry if what I said is already obvious for you, but I don't know on which stage of the process you are stuck.
For simple image operations I highly recommend CImg. This library works like a charm and is extremely easy to use. You just have to include a header file in your code. It literally took me less than 10 minutes to compile and test.
If you want to do more complicated image operations however, I would go with Magick++ as suggested by dagoof.
It would be advisable to declare the image as a simple one-dimensional array,
i.e. (where bytes is the number of bytes per pixel)
char image[width * height * bytes];
You can then access the relevant position in the array as follows
char byte1 = image[(x * bytes) + (y * (width * bytes)) + 0];
char byte2 = image[(x * bytes) + (y * (width * bytes)) + 1];
char byte3 = image[(x * bytes) + (y * (width * bytes)) + 2];
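The same indexing can be wrapped in a small helper (purely illustrative):

```cpp
#include <cstddef>

// Linear index of channel `c` of pixel (x, y) in a width-by-height
// image with `bytes` bytes per pixel.
size_t pixel_index(int x, int y, int width, int bytes, int c) {
    return (size_t)x * bytes + (size_t)y * width * bytes + c;
}
```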