Export buffer to WAV in C++ - c++

I have a simple program that creates a single cycle of a sine wave and puts the float values into a buffer, which is then exported to a text file.
But I want to be able to export it to a WAV file (24-bit). Is there a simple way of doing it, like with the text file?
Here is the code I have so far:
#include <iostream>
#include <fstream>
#include <cmath>
using namespace std;
int main ()
{
long double pi = 3.14159265359; // Declaration of PI
ofstream textfile; // Text object
textfile.open("sine.txt"); // Creating the txt
double samplerate = 44100.00; // Sample rate
double frequency = 200.00; // Frequency
int bufferSize = (1/frequency)*samplerate; // Buffer size
double buffer[bufferSize]; // Buffer
for (int i = 0; i <= (1/frequency)*samplerate; ++i) // Single cycle
{
buffer[i] = sin(frequency * (2 * pi) * i / samplerate); // Putting into buffer the float values
textfile << buffer[i] << endl; // Exporting to txt
}
textfile.close(); // Closing the txt
return 0; // Success
}

First you need to open the stream in binary mode.
ofstream stream;
stream.open("sine.wav", ios::out | ios::binary);
Next you'll need to write out a WAV header. You can look up the details of the WAV file format; the important bits are the sample rate, the bit depth, and the length of the data.
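The write<T>, write32, and write16 calls below are not standard library functions; the answer assumes small helpers that write the raw bytes of a value to the stream. A minimal sketch of such helpers (my assumption; it relies on a little-endian machine, which matches the little-endian fields of a WAV header):
#include <cstdint>
#include <fstream>
// Write the raw bytes of any value to the stream (little-endian host assumed).
template <typename T>
void write(std::ofstream& stream, const T& t) {
    stream.write((const char*)&t, sizeof(T));
}
void write32(std::ofstream& stream, int32_t v) { write(stream, v); }
void write16(std::ofstream& stream, int16_t v) { write(stream, v); }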
int bufferSize = (1/frequency)*samplerate;
stream.write("RIFF", 4); // RIFF chunk
write<int>(stream, 36 + bufferSize*sizeof(int)); // RIFF chunk size in bytes
stream.write("WAVE", 4); // WAVE chunk
stream.write("fmt ", 4); // fmt chunk
write32(stream, 16); // size of fmt chunk
write16(stream, 1); // Format = PCM
write16(stream, 1); // # of Channels
write32(stream, samplerate); // Sample Rate
write32(stream, samplerate*sizeof(int)); // Byte rate
write16(stream, sizeof(int)); // Frame size
write16(stream, 24); // Bits per sample
stream.write("data", 4); // data chunk
write32(stream, bufferSize*sizeof(int)); // data chunk size in bytes
Now that the header is out of the way, you'll just need to modify your loop to first convert the double samples (in the range -1.0 to 1.0) into 32-bit signed ints, zero out the bottom 8 bits since you only want 24 bits, and then write out the data in binary. Just so you know, it is common practice to store 24-bit samples inside a 32-bit word because it is much easier to stride through using native types.
for (int i = 0; i < bufferSize; ++i) // Single cycle
{
double tmp = sin(frequency * (2 * pi) * i / samplerate);
int intVal = (int)(tmp * 2147483647.0) & 0xffffff00; // scale to 32-bit, drop the low 8 bits
write<int>(stream, intVal); // binary write; the textual operator<< would corrupt the file
}
A couple other things:
1) I don't know how you weren't overflowing buffer by using the <= in your loop. I changed it to a <.
2) Again regarding the buffer size. I'm not sure if you are aware, but you can't represent a repeating waveform with a single cycle for all frequencies. What I mean is that for most frequencies, if you use this code and expect to play the waveform on repeat, you're going to hear a glitch on every cycle. It'll work for frequencies that divide the sample rate evenly, like 441 Hz at 44.1 kHz, because there will be exactly 100 samples per cycle and it will come around to exactly the same phase. 999.9 Hz will be a different story, though.
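A quick way to check whether a given frequency will loop cleanly as a single cycle is to see whether the samples-per-cycle count is a whole number; a small sketch:
#include <cmath>
// A frequency only loops cleanly as a single cycle if samplerate/frequency
// is (very close to) an integer number of samples.
bool loopsCleanly(double samplerate, double frequency) {
    double samplesPerCycle = samplerate / frequency;
    return std::abs(samplesPerCycle - std::round(samplesPerCycle)) < 1e-9;
}
// loopsCleanly(44100.0, 441.0) -> true  (exactly 100 samples per cycle)
// loopsCleanly(44100.0, 999.9) -> false (about 44.104 samples per cycle)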

Static noise in generated sine wave pcm sound

This is very similar to a question here, but I don't seem to be able to apply the solution.
I have code that samples a sine wave and writes it into a PCM file. When I listen to it with ffplay, there is some static noise and I don't know where it comes from. Based on the solution in the mentioned post, I use a binary file for writing out, and I make sure I play the file in signed 8-bit format.
This is the code I use:
int createSineWavePCM(int freq, int sample_rate) {
char out_name[100];
sprintf(out_name, "../sine_freq%d_sr%d.pcm", freq, sample_rate);
ofstream outfile(out_name, ios::binary);
char data[1000000];
for (int j = 0 ; j < 1000000 ; ++j) {
double ll = 50.0L * sin((2.0L * M_PIl * j * freq / sample_rate));
data[j] = ll;
}
outfile.write(data, sizeof data);
outfile.close();
cout << "Stored sine wave pcm file in " << out_name << endl;
return 0;
}
I use freq = 440 and sample_rate = 44100, and then I play with:
ffplay {pcm_file} -f s8 -sample_rate 44100
Any ideas on what may cause the static noise?
The expression in the sin function is worth double-checking. Are all the components the types you expect? 2.0L is a long double literal (the L suffix on a floating-point literal means long double, not long), and M_PIl is glibc's long double pi, so the product, and the final division by sample_rate, should be evaluated in long double. The classic pitfall would be freq / sample_rate being evaluated on its own with both operands int, which truncates to zero, so make sure the expression never gets rearranged that way.
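One way to rule that out entirely is to make the floating-point conversions explicit; a small sketch (my own rewrite of the phase computation, not the original code):
#include <cmath>
// Hypothetical helper: computes one sample with explicit double conversions,
// so no sub-expression can fall back to integer division.
// (M_PI may require _USE_MATH_DEFINES on MSVC.)
double sineSample(int j, int freq, int sample_rate) {
    double phase = 2.0 * M_PI * static_cast<double>(j)
                 * static_cast<double>(freq) / static_cast<double>(sample_rate);
    return 50.0 * std::sin(phase); // same +/-50 amplitude as the original code
}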

OpenCV vs byte array

I am working on a simple C++ image processing application and deciding whether to use OpenCV for loading the image and accessing individual pixels.
My current approach is to simply load the image using fopen, read the 54-byte header, and load the rest of the bytes into a char* array.
To access a specific pixel I use
char* q = bmpData + x*3 + (bmpSize.height - y - 1) * bmpSize.stride;
To perform a simple color check, for ex. "is blue?"
if ((*(long*)q | 0xFF000000) == 0xFFFF0000) // for some reason RGB is reversed to BGR
//do something here
Is OpenCV any faster considering all the function calls, parsing, etc.?
The bitmap header is actually 54 bytes (a 14-byte file header plus a 40-byte BITMAPINFOHEADER) and you can't skip it. You have to read it to find the width, height, bit count, and other information, and calculate the row padding if necessary.
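For example, the row padding mentioned above follows from the rule that each BMP row is aligned to 4 bytes; a small sketch of the usual stride calculation:
// Each BMP pixel row is padded to a multiple of 4 bytes.
int strideFor(int width, int bitCount) {
    return ((width * bitCount + 31) / 32) * 4;
}
// e.g. a 3-pixel-wide 24-bit image has 9 bytes of pixel data per row
// but a 12-byte stride.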
Depending on how the file is opened, OpenCV will read the header and then read the pixels directly into a buffer. The only change is that the rows are flipped so the image is right side up.
cv::Mat mat = cv::imread("filename.bmp", CV_LOAD_IMAGE_COLOR);
uint8_t* data = (uint8_t*)mat.data;
The header checks and the small changes made by OpenCV will not significantly affect performance; the bottleneck is mainly reading the file from disk. The difference will be difficult to measure unless you are doing a very specific task, for example reading only 3 bytes out of a very large file rather than the whole thing.
OpenCV is overkill for this task, so you may prefer another library, for example CImg as suggested in the comments. Smaller libraries load faster, which may be noticeable when your program starts.
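For comparison, loading a bitmap and reading a pixel with CImg takes only a few lines (a sketch; CImg reads BMP files natively, and the coordinates here are just placeholders):
#include "CImg.h"
using namespace cimg_library;
// Load the bitmap and read one pixel's channels (CImg stores planes as R, G, B).
CImg<unsigned char> img("filename.bmp");
int x = 0, y = 0;                    // placeholder coordinates
unsigned char r = img(x, y, 0, 0);   // channel 0 = red
unsigned char g = img(x, y, 0, 1);
unsigned char b = img(x, y, 0, 2);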
The following code is a test run on Windows.
For a large 16 MB bitmap file, the result is almost identical for OpenCV versus plain C++.
For a small 200 KB bitmap file, the result is 0.00013 seconds to read in plain C++ and 0.00040 seconds with OpenCV. Note the plain C++ version does little besides reading the bytes.
#include <iostream>
#include <fstream>
#include <vector>
#include <chrono>
#include <cstdio>
#include <windows.h>            // BITMAPFILEHEADER, BITMAPINFOHEADER
#include <opencv2/opencv.hpp>   // cv::imread, CV_LOAD_IMAGE_COLOR
class stopwatch
{
std::chrono::time_point<std::chrono::system_clock> time_start, time_end;
public:
stopwatch() { reset();}
void reset(){ time_start = std::chrono::system_clock::now(); }
void print(const char* title)
{
time_end = std::chrono::system_clock::now();
std::chrono::duration<double> diff = time_end - time_start;
if(title) std::cout << title;
std::cout << diff.count() << "\n";
}
};
int main()
{
const char* filename = "filename.bmp";
//I use `fake` to prevent the compiler from over-optimization
//and skipping the whole loop. But it may not be necessary here
int fake = 0;
//open the file 100 times
int count = 100;
stopwatch sw;
for(int i = 0; i < count; i++)
{
//plain c++
std::ifstream fin(filename, std::ios::binary);
fin.seekg(0, std::ios::end);
int filesize = (int)fin.tellg();
fin.seekg(0, std::ios::beg);
std::vector<uint8_t> pixels(filesize - 54);
BITMAPFILEHEADER hd;
BITMAPINFOHEADER bi;
fin.read((char*)&hd, sizeof(hd));
fin.read((char*)&bi, sizeof(bi));
fin.read((char*)pixels.data(), pixels.size());
fake += pixels[i];
}
sw.print("time fstream: ");
sw.reset();
for(int i = 0; i < count; i++)
{
//opencv:
cv::Mat mat = cv::imread(filename, CV_LOAD_IMAGE_COLOR);
uint8_t* pixels = (uint8_t*)mat.data;
fake += pixels[i];
}
sw.print("time opencv: ");
printf("show some fake calculation: %d\n", fake);
return 0;
}

Read bmp file header size

I am trying to find the file size, header size, width, and height of a BMP file. I have studied the format of BMP files and the arrangement of bytes in the file.
When I try this code it shows the wrong width and height for some files.
I have tried it on three images so far: one image gave the right measurements, another did not.
I don't understand where I went wrong, but the bit depth showed the right value for all three images.
Here is my code:
#include<iostream>
#include<fstream>
#include<math.h>
using namespace std;
int main() {
ifstream inputfile("bmp.bmp",ios::binary);
char c; int imageheader[1024];
double filesize=0; int width=0; int height=0;int bitCount = 0;
for(int i=0; i<1024; i++) {
inputfile.get(c); imageheader[i]=int(c);
}
filesize=filesize+(imageheader[2])*pow(2,0)+(imageheader[3])*pow(2,8)+(imageheader[4])*pow(2,16)+(imageheader[5])*pow(2,24);
cout<<endl<<endl<<"File Size: "<<(filesize/1024)<<" Kilo Bytes"<<endl;
width=width+(imageheader[18])*pow(2,0)+(imageheader[19])*pow(2,8)+(imageheader[20])*pow(2,16)+(imageheader[21])*pow(2,24);
cout<<endl<<"Width: "<<endl<<(width)<<endl;
height=height+(imageheader[22])*pow(2,0)+(imageheader[23])*pow(2,8)+(imageheader[24])*pow(2,16)+(imageheader[25])*pow(2,24);
cout<<endl<<"Height: "<<endl<<(height)<<endl;
bitCount=bitCount+(imageheader[28])*pow(2,0)+(imageheader[29])*pow(2,8);
cout<<endl<<"Bit Depth: "<<endl<<(bitCount)<<endl;
}
Let's start by reading the BMP header in as a series of bytes, not integers. To make this code portable, we'll use the fixed-width types from <stdint.h>.
#include <fstream>
#include <stdint.h>
int main()
{
std::ifstream inputfile("D:/test.bmp", std::ios::binary);
uint8_t headerbytes[54] = {};
inputfile.read((char*)headerbytes, sizeof(headerbytes));
Now that we've got the header in memory as an array of bytes, we can simply cast the memory address of each header field back into an integer. Reference the Wikipedia page for the BMP file format and its layout diagram for the offsets.
uint32_t filesize = *(uint32_t*)(headerbytes+2);
uint32_t dibheadersize = *(uint32_t*)(headerbytes + 14);
uint32_t width = *(uint32_t*)(headerbytes + 18);
uint32_t height = *(uint32_t*)(headerbytes + 22);
uint16_t planes = *(uint16_t*)(headerbytes + 26);
uint16_t bitcount = *(uint16_t*)(headerbytes + 28);
Now, an astute reader of the code will recognize that the individual fields of a BMP header are stored in little-endian format, and that the code above relies on an x86 processor or another architecture whose byte layout is little-endian. On a big-endian machine, you'd have to apply a workaround to convert each of the variables above from little-endian to the host's byte order.
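A portable alternative (a sketch; not needed on x86) is to assemble each field byte by byte, which gives the right result regardless of the host's endianness:
#include <stdint.h>
// Read little-endian fields byte by byte, independent of host byte order.
uint32_t readLE32(const uint8_t* p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
uint16_t readLE16(const uint8_t* p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
// e.g. uint32_t width = readLE32(headerbytes + 18);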
The bug is reading into a signed char. This should fix it:
for(int i = 0; i < 1024; i++)
{
//inputfile.get(c); imageheader[i] = int(c);
// This version of get returns int, where -1 means EOF. Should be checking for errors...
imageheader[i] = inputfile.get();
}
Others have commented on improvements to the code so I won't bother.

wave file export is corrupted

I wrote two functions which should export an audio float buffer into a .wav file, but I have problems playing the exported file. Audacity plays it as it should (it sounds exactly like it does within my application); however, Ableton (DAW software) seems to misinterpret some part of the WAV, so it sounds really distorted (like a distortion effect).
I guess that Ableton somehow assumes a wrong (smaller) sample depth, so the actual samples exceed the limits.
I have two functions: one creates an int32_t buffer from two float buffers (mixing left and right into one interleaved buffer), the other writes the .wav file, including the format chunk etc. I guess the problem is somewhere in there.
class members / structs
// statics used in the export function
static const int FORMAT_PCM = 1;
static const int CHANNEL_COUNT = 2; // fixed stereo
static const int BYTES_PER_SAMPLE = 4; // fixed bytes per sample, 32-bit audio
// a helper I found on the internet; it writes the raw bytes of a value to the file
template <typename T>
static void write(std::ofstream& stream, const T& t) {
stream.write((const char*)&t, sizeof(T));
};
// used "structure" to store the buffer
class StereoAudioBuffer {
public:
StereoAudioBuffer(int length) : sizeInSamples(2*length){
samples = new int32_t[2*length];
};
~StereoAudioBuffer() {delete[] samples;}; // array delete to match new[]
int32_t *samples;
const int sizeInSamples;
};
converting function
StereoAudioBuffer* WaveExport::convertTo32BitStereo(
float *leftSamples,
float*rightSamples,
int length)
{
StereoAudioBuffer *buffer = new StereoAudioBuffer(length);
float max = 0;
// find max sample
for(int i = 0; i < length; i++) {
if(abs(leftSamples[i]) > max) {
max = abs(leftSamples[i]);
}
if(abs(rightSamples[i]) > max) {
max = abs(rightSamples[i]);
}
}
// normalise and scale to size(int32_t)
float factor = 2147483000.0f / max;
for(int i = 0; i < length; i++) {
buffer->samples[2*i] = leftSamples[i] * factor ;
buffer->samples[2*i+1] = rightSamples[i] * factor;
}
return buffer;
}
the exporting function (part of this code comes from the internet; sadly, I can't find the source anymore)
void WaveExport::writeStereoWave(
const char *path,
StereoAudioBuffer* buffer,
int sampleRate)
{
std::ofstream stream(path, std::ios::binary);
// RIFF
stream.write("RIFF", 4);
// FILE SIZE
write<int>(stream, 36 + buffer->sizeInSamples * BYTES_PER_SAMPLE); // 32 bits -> 4 bytes
// WAVE
stream.write("WAVE", 4);
// FORMAT CHUNK
stream.write("fmt ", 4);
write<int>(stream, 16);
write<short>(stream, FORMAT_PCM); // Format
write<short>(stream, CHANNEL_COUNT); // Channels
write<int>(stream, sampleRate); // Sample Rate
write<int>(stream, sampleRate * CHANNEL_COUNT * BYTES_PER_SAMPLE); // Byterate
write<short>(stream, CHANNEL_COUNT * BYTES_PER_SAMPLE); // Frame size
write<short>(stream, 8 * BYTES_PER_SAMPLE); // Bits per sample
int dataChunkSize = buffer->sizeInSamples * BYTES_PER_SAMPLE;
// SAMPLES
stream.write("data", 4);
stream.write((const char*)&dataChunkSize, 4);
stream.write((const char*)buffer->samples, BYTES_PER_SAMPLE*buffer->sizeInSamples);
}
Does anybody know how to write .wav files and maybe can tell me what I did wrong or missed?
Thanks!
There was no problem with the file itself. I was exporting a 32-bit .wav, which just wasn't supported by the application I used for playback.
I changed the export functions to use int16_t (16-bit depth) and it works fine.
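For reference, a small sketch of what the 16-bit conversion could look like (my own variant of convertTo32BitStereo above; the header would then use BYTES_PER_SAMPLE = 2 and 16 bits per sample):
#include <algorithm>
#include <cmath>
#include <cstdint>
// Hypothetical 16-bit version: normalise and scale into the int16_t range,
// writing interleaved left/right samples into 'out' (length*2 elements).
void convertTo16BitStereo(const float* left, const float* right, int length, int16_t* out) {
    float maxSample = 0.0f;
    for (int i = 0; i < length; i++) {
        maxSample = std::max(maxSample, std::max(std::abs(left[i]), std::abs(right[i])));
    }
    float factor = 32767.0f / maxSample;
    for (int i = 0; i < length; i++) {
        out[2*i]     = (int16_t)(left[i]  * factor);
        out[2*i + 1] = (int16_t)(right[i] * factor);
    }
}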

How to record the microphone until there is no sound?

I've created 2 functions :
- One that records the microphone
- One that plays the sound of the microphone
It records the microphone for 3 seconds
#include <iostream>
#include <Windows.h>
#include <vector>
using namespace std;
#pragma comment(lib, "winmm.lib")
short int waveIn[44100 * 3];
void PlayRecord();
void StartRecord()
{
const int NUMPTS = 44100 * 3; // 3 seconds
int sampleRate = 44100;
// 'short int' is a 16-bit type; I request 16-bit samples below
// for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types
HWAVEIN hWaveIn;
MMRESULT result;
WAVEFORMATEX pFormat;
pFormat.wFormatTag=WAVE_FORMAT_PCM; // simple, uncompressed format
pFormat.nChannels=1; // 1=mono, 2=stereo
pFormat.nSamplesPerSec=sampleRate; // 44100
pFormat.nAvgBytesPerSec=sampleRate*2; // = nSamplesPerSec * n.Channels * wBitsPerSample/8
pFormat.nBlockAlign=2; // = n.Channels * wBitsPerSample/8
pFormat.wBitsPerSample=16; // 16 for high quality, 8 for telephone-grade
pFormat.cbSize=0;
// Specify recording parameters
result = waveInOpen(&hWaveIn, WAVE_MAPPER,&pFormat,
0L, 0L, WAVE_FORMAT_DIRECT);
WAVEHDR WaveInHdr;
// Set up and prepare header for input
WaveInHdr.lpData = (LPSTR)waveIn;
WaveInHdr.dwBufferLength = NUMPTS*2;
WaveInHdr.dwBytesRecorded=0;
WaveInHdr.dwUser = 0L;
WaveInHdr.dwFlags = 0L;
WaveInHdr.dwLoops = 0L;
waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
// Insert a wave input buffer
result = waveInAddBuffer(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
// Commence sampling input
result = waveInStart(hWaveIn);
cout << "recording..." << endl;
Sleep(3 * 1000);
// Wait until finished recording
waveInClose(hWaveIn);
PlayRecord();
}
void PlayRecord()
{
const int NUMPTS = 44100 * 3; // 3 seconds
int sampleRate = 44100;
// 'short int' is a 16-bit type; I request 16-bit samples below
// for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types
HWAVEIN hWaveIn;
WAVEFORMATEX pFormat;
pFormat.wFormatTag=WAVE_FORMAT_PCM; // simple, uncompressed format
pFormat.nChannels=1; // 1=mono, 2=stereo
pFormat.nSamplesPerSec=sampleRate; // 44100
pFormat.nAvgBytesPerSec=sampleRate*2; // = nSamplesPerSec * n.Channels * wBitsPerSample/8
pFormat.nBlockAlign=2; // = n.Channels * wBitsPerSample/8
pFormat.wBitsPerSample=16; // 16 for high quality, 8 for telephone-grade
pFormat.cbSize=0;
// Specify recording parameters
waveInOpen(&hWaveIn, WAVE_MAPPER,&pFormat, 0L, 0L, WAVE_FORMAT_DIRECT);
WAVEHDR WaveInHdr;
// Set up and prepare header for input
WaveInHdr.lpData = (LPSTR)waveIn;
WaveInHdr.dwBufferLength = NUMPTS*2;
WaveInHdr.dwBytesRecorded=0;
WaveInHdr.dwUser = 0L;
WaveInHdr.dwFlags = 0L;
WaveInHdr.dwLoops = 0L;
waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
HWAVEOUT hWaveOut;
cout << "playing..." << endl;
waveOutOpen(&hWaveOut, WAVE_MAPPER, &pFormat, 0, 0, WAVE_FORMAT_DIRECT);
waveOutWrite(hWaveOut, &WaveInHdr, sizeof(WaveInHdr)); // Playing the data
Sleep(3 * 1000); //Sleep for as long as there was recorded
waveInClose(hWaveIn);
waveOutClose(hWaveOut);
}
int main()
{
StartRecord();
return 0;
}
How can I change my StartRecord function (and I guess my PlayRecord function as well) to make it record until there's no input from the microphone?
(So far, those 2 functions are working perfectly: they record the microphone for 3 seconds, then play the recording)...
Thanks!
Edit: by no sound, I mean the sound level is too low or something (meaning the person probably isn't speaking)...
Because sound is a wave, it oscillates between high and low pressures. This waveform is usually recorded as positive and negative numbers, with zero being the neutral pressure. If you take the absolute value of the signal and keep a running average it should be sufficient.
The average should be taken over a long enough period that you account for the appropriate amount of silence. A very cheap way to keep an estimate of the running average is like this:
const double threshold = 50; // Whatever threshold you need
const int max_samples = 10000; // The representative running average size
double average = 0; // The running average
int sample_count = 0; // When we are building the average
while( sample_count < max_samples || average > threshold ) {
// New sample arrives, stored in 'sample'
// Adjust the running absolute average
if( sample_count < max_samples ) sample_count++;
average *= double(sample_count-1) / sample_count;
average += std::abs(sample) / sample_count;
}
The larger max_samples, the slower average will respond to a signal. After the sound stops, it will slowly trail off. However, it will be slow to rise again too. This would be fine for reasonably continuous sound.
With something like speech, which can have short or long pauses, you may want to use an impulse-based approach. You can just define the number of samples of 'silence' that you expect, and reset it whenever you receive an impulse that exceeds the threshold. Using the running average above with a much shorter window size will give you a simple way of detecting an impulse. Then you just need to count...
const int max_samples = 100; // Smaller window size for impulse
const int max_silence_samples = 10000; // Maximum samples below threshold
int silence = 0; // Number of samples below threshold
while( silence < max_silence_samples ) {
// Compute running average as before
//...
// Check for silence. If there's a signal, reset the counter.
if( average > threshold ) silence = 0;
else ++silence;
}
Adjusting threshold and max_samples will control the sensitivity to pops and clicks, while max_silence_samples gives you control over how much silence is allowed before you stop recording.
There are undoubtedly more technical ways to achieve your goals, but it's always good to try the simple one first. See how you go with this.
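Tying that back to the question's 16-bit waveIn buffer, a minimal sketch (my own, untested against the Windows API) of a per-block silence check could look like this:
#include <cmath>
// Hypothetical helper: true if the average absolute amplitude of a block of
// 16-bit samples stays below the chosen threshold (full scale is 32767).
bool isSilent(const short* samples, int count, double threshold)
{
    double sum = 0.0;
    for (int i = 0; i < count; ++i)
        sum += std::abs((double)samples[i]);
    return (sum / count) < threshold;
}
// e.g. capture in short blocks (say 100 ms) and stop recording once isSilent()
// has been true for a second or two, instead of sleeping for a fixed 3 seconds.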
I suggest you do it via DirectShow. You would create an instance of the microphone capture source, a SampleGrabber, an audio encoder, and a file writer. Your graph should look like this:
Microphone -> SampleGrabber -> Audio Encoder -> File Writer
Every sample passes through the SampleGrabber, so you can read all the raw samples and decide whether to continue recording or not. This way you can both record and check the content at the same time.