Writing to WAV file C++

I have a homework about WAV files and FIR filters for a Digital Signal Processing class.
My program must read a WAV file, apply a filter to the data and write the output data to another WAV file again.
I have completed reading and applying the filter, but I can't write the WAV file. The program compiles without errors, but the output WAV file doesn't play.
If I write "temp" to the WAV file, it plays properly, but if I write "data", it doesn't.
How can I write a WAV file properly?
#define _CRT_SECURE_NO_WARNINGS
#define PI 3.14f
#define WAV_HEADER_LENGTH 44
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <iostream>
#include <fstream>
char* read_wav(const char* filename, short*, short*, int*);
void write_wav(const char* filename, const char*, int);
using namespace std;
int main()
{
short nchannel, ssample;
int csample;
//Reading WAV file and returning the data.
char* temp = read_wav("sum.wav", &nchannel, &ssample, &csample);
short* data = (short*)&temp[WAV_HEADER_LENGTH];
cout << "How many coefficients are there in filter ?" << endl;
int N;
cin >> N ;
float filter[N];
cout << "Type coefficients in filter." << endl;
for(int i=0; i<N;i++){
cin >> filter[i];
}
short* output = (short*)&temp[WAV_HEADER_LENGTH];
for(int i=0; i < csample; i++){
double sum = 0;
for(int j=0; j < N; j++){
if((i - j) >= 0)
sum += filter[j] * data[i-j];
}
output[i] = (short) sum;
}
write_wav("test.wav", out, csample * ssample + WAV_HEADER_LENGTH);
}
char* read_wav(const char* filename, short* nchannel, short* ssample, int* csample) {
//Reading the file.
FILE* fp = fopen(filename, "rb");
if (!fp) {
fprintf(stderr, "Couldn't open the file \"%s\"\n", filename);
exit(0);
}
fseek(fp, 0, SEEK_END);
int file_size = ftell(fp);
fseek(fp, 0, SEEK_SET);
printf("The file \"%s\" has %d bytes\n\n", filename, file_size);
char* buffer = (char*)malloc(sizeof(char) * file_size);
fread(buffer, file_size, 1, fp);
// Dump the buffer info.
*nchannel = *(short*)&buffer[22];
*ssample = *(short*)&buffer[34] / 8;
*csample = *(int*)&buffer[40] / *ssample;
printf("ChunkSize :\t %u\n", *(int*)&buffer[4]);
printf("Format :\t %u\n", *(short*)&buffer[20]);
printf("NumChannels :\t %u\n", *(short*)&buffer[22]);
printf("SampleRate :\t %u\n", *(int*)&buffer[24]); // number of samples per second
printf("ByteRate :\t %u\n", *(int*)&buffer[28]); // number of bytes per second
printf("BitsPerSample :\t %u\n", *(short*)&buffer[34]);
printf("Subchunk2ID :\t \"%c%c%c%c\"\n", buffer[36], buffer[37], buffer[38], buffer[39]); // marks beginning of the data section
printf("Subchunk2Size :\t %u\n", *(int*)&buffer[40]); // size of data (byte)
printf("Duration :\t %fs\n\n", (float)(*(int*)&buffer[40]) / *(int*)&buffer[28]);
fclose(fp);
return buffer;
}
void write_wav(const char* filename, const char* data, int len) {
FILE* fp = fopen(filename, "wb");
if (!fp) {
fprintf(stderr, "Couldn't open the file \"%s\"\n", filename);
exit(0);
}
fwrite(data, len, 1, fp);
fclose(fp);
}

This works for me:
int main()
{
    short nchannel, ssample;
    int csample;
    // Reading WAV file and returning the data.
    char* temp = read_wav("sum.wav", &nchannel, &ssample, &csample);
    short* data = (short*)&temp[WAV_HEADER_LENGTH];
    // cout << "How many coefficients are there in filter ?" << endl;
    const int N = 2;
    // cin >> N;
    float filter[N] = {0.5, 0.75};
    // cout << "Type coefficients in filter." << endl;
    // for (int i = 0; i < N; i++)
    // {
    //     cin >> filter[i];
    // }
    short* output = (short*)&temp[WAV_HEADER_LENGTH];
    for (int i = 0; i < csample; i++)
    {
        double sum = 0;
        for (int j = 0; j < N; j++)
        {
            if ((i - j) >= 0) sum += filter[j] * data[i - j];
        }
        output[i] = (short)sum;
    }
    write_wav("test.wav", (char*)temp, csample * ssample + WAV_HEADER_LENGTH);
}
My changes:
The major change is to use the full buffer, with its extremely misleading name temp, instead of your out (which does not even compile), as the argument of write_wav.
I applied "my" filter coefficients (the sound from the output file is really distorted),
I applied my favorite indentation
If the code is to be portable, you need to check the endianness and act accordingly.
I would expect the input and output files to be of the same length, but they're not. Please check it yourself why this is not the case.
Example:
-rw-r--r-- 1 zkoza zkoza 787306 06-23 14:09 sum.wav
-rw-r--r-- 1 zkoza zkoza 787176 06-23 14:16 test.wav
It looks like 130 bytes are missing in the output file.
Your float filter[N] with N not known at compile time is a C++ extension: please use std::vector in your final code instead.
Next time please also provide a link to any input files. For my tests I used https://freewavesamples.com/alesis-fusion-clean-guitar-c3 , but all these little things, like finding an input file (the WAV format has several flavors, I could have missed the correct one), guessing filter parameters etc., take time and effort.
Your condition if ((i - j) >= 0) can be written in a way that is easier to understand, preferably by changing the inner loop "header"; see the sketch below.
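For illustration, the filtering part could look something like this. This is only a sketch with placeholder coefficients; data, output and csample are assumed to come from read_wav() exactly as in the code above.
#include <algorithm>
#include <vector>

// Sketch of the filtering loop only, with std::vector instead of a VLA and
// with the inner loop bounded so the (i - j) >= 0 test disappears.
void apply_fir(const short* data, short* output, int csample)
{
    std::vector<float> filter = {0.5f, 0.75f};   // placeholder coefficients
    const int N = static_cast<int>(filter.size());

    for (int i = 0; i < csample; ++i)
    {
        double sum = 0.0;
        // j never exceeds i, so no per-sample range check is needed.
        for (int j = 0; j < std::min(N, i + 1); ++j)
            sum += filter[j] * data[i - j];
        output[i] = static_cast<short>(sum);
    }
}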

Related

Reading a file consisting of float numbers with mmap() in C++

I'm trying to read a file consisting of 100000000 float numbers like 0.12345678 or -0.1234567, separated by spaces, in C++. I used fscanf() to read the file and the code looks like this:
FILE *fid = fopen("testingfile.txt", "r");
if (fid == NULL)
return false;
float v;
for (int i = 0; i < 100000000; i++)
fscanf(fid, "%f", &v);
fclose(fid);
The file is 1199999988 bytes in size and took around 18 seconds to read using fscanf(). Therefore, I would like to use mmap() to speed up the reading, and the code looks like this:
#define FILEPATH "testingfile.txt"
char text[10] = {'\0'};
struct stat s;
int status = stat(FILEPATH, &s);
int fd = open(FILEPATH, O_RDONLY);
if (fd == -1)
{
perror("Error opening file for reading");
return 0;
}
char *map = (char *)mmap(NULL, s.st_size, PROT_READ, MAP_SHARED, fd, 0);
close(fd);
if (map == MAP_FAILED)
{
perror("Error mmapping the file");
return 0;
}
for (int i = 0,j=0; i < s.st_size; i++)
{
if (isspace(map[i]))
{
text[j] = '\0';
j = 0;
float v = atof(text);
for (int j = 0; j < 10; j++)
text[j] = '\0';
continue;
}
text[j] = map[i];
j++;
}
if (munmap(map, s.st_size) == -1)
{
return 0;
}
However, it still takes around 14.5 seconds to finish reading. I found the most time-consuming part is converting the char array to float, which consumes around 10 seconds.
So I have three questions:
1) Is there any way I can directly read floats instead of chars, or
2) Is there any better method to convert a char array to float?
3) How does fscanf recognize a floating point value and read it, and why is it so much faster than atof()?
Thanks in advance!
Based on the advice given, here are two possible solutions to this problem:
The first approach would be a bit "stupid". Since the format of the stored floating point values is known, the conversion from char array to float can easily be done without using atof().
By removing atof(), it only takes 8 seconds to finish reading and converting the same file.
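For reference, a conversion along these lines might look like the sketch below. It assumes every token has the fixed form [sign]0.dddddddd shown in the question, which is exactly why it can beat a general-purpose atof():
// Convert one token of the fixed form "0.dddddddd" or "-0.ddddddd"
// (as described in the question) to float without calling atof().
static float parse_fixed_float(const char* text)
{
    const char* p = text;
    bool negative = false;
    if (*p == '-') { negative = true; ++p; }
    p += 2;                        // skip the leading "0."
    float value = 0.0f;
    float scale = 0.1f;
    while (*p >= '0' && *p <= '9')
    {
        value += (*p - '0') * scale;
        scale *= 0.1f;
        ++p;
    }
    return negative ? -value : value;
}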
The second approach is to change the storage format of the float numbers in the file (as advised by Jeremy Friesner). The float values are stored in binary format, so the conversion step after mmap() is not required. The code becomes something like this:
#include <cstdio>
#include <ctime>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#define FILEPATH "myfile.bin"
int main()
{
int start_s = clock();
struct stat s;
int status = stat(FILEPATH, &s);
int fd = open(FILEPATH, O_RDONLY);
if (fd == -1)
{
perror("Error opening file for reading");
return 0;
}
float *map = (float *)mmap(NULL, s.st_size, PROT_READ, MAP_SHARED, fd, 0);
close(fd);
if (map == MAP_FAILED)
{
perror("Error mmapping the file");
return 0;
}
for (int i = 0; i < s.st_size / 4; i++)
{
float v = map[i];
}
if (munmap(map, s.st_size) == -1)
{
return 0;
}
}
This dramatically reduces the time required to read a file of the same size.
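For completeness, the binary input file for this second approach could be generated with something like the following; the number of values and the values themselves are just placeholders:
#include <cstdio>
#include <cstdlib>

// Writes floats in raw binary form so they can later be mmap()ed
// directly, without any text-to-float conversion.
int main()
{
    FILE* out = fopen("myfile.bin", "wb");
    if (out == NULL)
    {
        perror("Error opening file for writing");
        return 1;
    }
    for (int i = 0; i < 100000000; i++)
    {
        float v = (float)rand() / (float)RAND_MAX;   // placeholder values
        fwrite(&v, sizeof(float), 1, out);
    }
    fclose(out);
    return 0;
}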

C++: Write BMP image format error on WINDOWS

I have the most strange problem here... I'm using the same code (copy-paste) on Linux and on Windows to READ and WRITE a BMP image. For some reason on Linux everything works perfectly fine, but on Windows 10 I can't open the resulting images and I receive an error message that says something like this:
"It looks like we don't support this file format."
Do you have any idea what I should do? I will put the code below.
EDIT:
I've solved the padding problem and now it writes the images, but they are completely white. Any idea why? I've updated the code as well.
struct BMP {
int width;
int height;
unsigned char header[54];
unsigned char *pixels;
int size;
int row_padded;
};
void writeBMP(string filename, BMP image) {
string fileName = "Output Files\\" + filename;
FILE *out = fopen(fileName.c_str(), "wb");
fwrite(image.header, sizeof(unsigned char), 54, out);
unsigned char tmp;
for (int i = 0; i < image.height; i++) {
for (int j = 0; j < image.width * 3; j += 3) {
// Convert (B, G, R) to (R, G, B)
tmp = image.pixels[j];
image.pixels[j] = image.pixels[j + 2];
image.pixels[j + 2] = tmp;
}
fwrite(image.pixels, sizeof(unsigned char), image.row_padded, out);
}
fclose(out);
}
BMP readBMP(string filename) {
BMP image;
string fileName = "Input Files\\" + filename;
FILE *f = fopen(fileName.c_str(), "rb");
if (f == NULL)
throw "Argument Exception";
fread(image.header, sizeof(unsigned char), 54, f); // read the 54-byte header
// extract image height and width from header
image.width = *(int *) &image.header[18];
image.height = *(int *) &image.header[22];
image.row_padded = (image.width * 3 + 3) & (~3);
image.pixels = new unsigned char[image.row_padded];
unsigned char tmp;
for (int i = 0; i < image.height; i++) {
fread(image.pixels, sizeof(unsigned char), image.row_padded, f);
for (int j = 0; j < image.width * 3; j += 3) {
// Convert (B, G, R) to (R, G, B)
tmp = image.pixels[j];
image.pixels[j] = image.pixels[j + 2];
image.pixels[j + 2] = tmp;
}
}
fclose(f);
return image;
}
From my point of view this code should be cross-platform... but it's not... why?
Thanks for help
Check the header
The header must start with the following two signature bytes: 0x42 0x4D. If it's something different, a third-party application will think that this file doesn't contain a BMP picture despite the .bmp file extension.
The size and the way pixels are stored is also a little bit more complex than what you expect: you assume that the number of bits per pixel is 24 and that no compression is used. This is not guaranteed. If that's not the case, you might read more data than is available and corrupt the file when writing it back.
Furthermore, the size of the header also depends on the BMP version you are using, which you can detect using the 4-byte integer at offset 14.
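As a rough illustration, the checks could look like this. The offsets follow the standard BITMAPINFOHEADER layout, and the sketch assumes the 54-byte header has already been read as in the question:
// Sketch of the suggested sanity checks on the 54-byte header.
bool checkHeader(const unsigned char* header)
{
    if (header[0] != 0x42 || header[1] != 0x4D)        // "BM" signature
        return false;
    int dibSize = *(const int*)&header[14];            // DIB header size / BMP version
    short bitsPerPixel = *(const short*)&header[28];
    int compression = *(const int*)&header[30];
    // The read/write code above only handles this particular flavour.
    return dibSize == 40 && bitsPerPixel == 24 && compression == 0;
}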
Improve your code
When you load a file, check the signature, the bmp version, the number of bits per pixel and the compression. For debugging purpose, consider dumping the header to check it manually:
for (int i = 0; i < 54; i++)
    cout << hex << (int)image.header[i] << " ";
cout << endl;
Furthermore, when you fread(), check that the number of bytes read corresponds to the size you wanted to read, to be sure that you're not working with uninitialized buffer data.
Edit:
Having checked the dump, it appears that the format is as expected. But comparing the padded size in the header with the padded size that you have calculated, it appears that the error is here:
image.row_padded = (image.width * 3 + 3) & (~3); // ok size of a single row rounded up to multiple of 4
image.pixels = new unsigned char[image.row_padded]; // oops ! A little short ?
In fact you read row by row, but you only keep the last one in memory! This is different from your first version, where you did read the full pixel data of the picture.
Similarly, you write the last row repeated height times.
Reconsider your padding, working with the total padded size.
image.row_padded = (image.width * 3 + 3) & (~3); // ok size of a single row rounded up to multiple of 4
image.size_padded = image.row_padded * image.height; // padded full size
image.pixels = new unsigned char[image.size_padded]; // yeah !
if (fread(image.pixels, sizeof(unsigned char), image.size_padded, f) != image.size_padded) {
    cout << "Error: all bytes couldn't be read" << endl;
}
else {
    ... // process the pixels as expected
}
...

How to read an audio file in an array format from libsndfile library like MATLAB's audioread

I am using libsndfile to read a .caf file. I am able to read the file properly and get the number of items in the audio file. However, when I save those numbers in a text file and try to verify my values with MATLAB, they look a lot different. I have attached the code in C++ and the values I obtained from C++ and MATLAB.
void ofApp::setup(){
const char* fn = "/Users/faiyadhshahid/Desktop/Desktopdemo.caf";
SNDFILE *sf;
SF_INFO info;
int num_channels, num, num_items, *buf, f, sr,c, i , j;
FILE *out;
/* Open the WAV file. */
info.format = 0;
sf = sf_open(fn,SFM_READ,&info);
if (sf == NULL)
{
printf("Failed to open the file.\n");
}
/* Print some of the info, and figure out how much data to read. */
f = info.frames;
sr = info.samplerate;
c = info.channels;
printf("frames=%d\n",f);
printf("samplerate=%d\n",sr);
printf("channels=%d\n",c);
num_items = f*c;
printf("num_items=%d\n",num_items);
/* Allocate space for the data to be read, then read it. */
buf = (int *) malloc(num_items*sizeof(int));
num = sf_read_int(sf,buf,num_items);
sf_close(sf);
printf("Read %d items\n",num);
/* Write the data to filedata.out. */
out = fopen("/Users/faiyadhshahid/Desktop/filedata.txt","w");
for (i = 0; i < num; i += c)
{
for (j = 0; j < c; ++j)
fprintf(out,"%d ",buf[i+j]);
fprintf(out,"\n");
}
fclose(out);
return 0;
}
Values of C++ (on left) vs MATLAB (on right):
I figured it out by myself. I was comparing apples with oranges.
The change I needed to make was to read the values into a float buffer instead of an int buffer:
int num_channels, num, num_items, f, sr, c, i, j;
float *buf;
FILE *out;
/* Open the WAV file. */
info.format = 0;
sf = sf_open(fn,SFM_READ,&info);
if (sf == NULL)
{
printf("Failed to open the file.\n");
}
/* Print some of the info, and figure out how much data to read. */
f = info.frames;
sr = info.samplerate;
c = info.channels;
printf("frames=%d\n",f);
printf("samplerate=%d\n",sr);
printf("channels=%d\n",c);
num_items = f*c;
printf("num_items=%d\n",num_items);
/* Allocate space for the data to be read, then read it. */
buf = (float *) malloc(num_items*sizeof(float));
num = sf_read_float(sf,buf,num_items);
sf_close(sf);
printf("Read %d items\n",num);
/* Write the data to filedata.out. */
out = fopen("/Users/faiyadhshahid/Desktop/filedata.txt","w");
for (i = 0; i < num; i += c)
{
for (j = 0; j < c; ++j)
fprintf(out,"%f \n",buf[i+j]);
// fprintf(out,"\n");
}
fclose(out);
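For comparison, a frame-oriented version using sf_readf_float() prints one line per frame with one column per channel; the float samples are normalized, which is what MATLAB's audioread returns as well. This is only a sketch, reusing the paths from the question:
#include <cstdio>
#include <cstdlib>
#include <sndfile.h>

// Minimal sketch: read normalized float samples frame by frame with
// libsndfile and dump one line per frame (one column per channel).
int main()
{
    SF_INFO info;
    info.format = 0;
    SNDFILE* sf = sf_open("/Users/faiyadhshahid/Desktop/Desktopdemo.caf", SFM_READ, &info);
    if (sf == NULL) { printf("Failed to open the file.\n"); return 1; }

    float* buf = (float*)malloc(info.frames * info.channels * sizeof(float));
    sf_count_t frames_read = sf_readf_float(sf, buf, info.frames);
    sf_close(sf);

    FILE* out = fopen("/Users/faiyadhshahid/Desktop/filedata.txt", "w");
    for (sf_count_t i = 0; i < frames_read; ++i)
    {
        for (int j = 0; j < info.channels; ++j)
            fprintf(out, "%f ", buf[i * info.channels + j]);
        fprintf(out, "\n");
    }
    fclose(out);
    free(buf);
    return 0;
}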

Writing with FILE pointer, reading with ifstream

I have to work with code where on one side information is written to a file using FILE* and on the other side it is read in using ifstream.
I tried to put together some dummy code which shows the same behavior as the original code:
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
int main()
{
FILE* outFile = fopen("testFile", "w");
char* posBuf = NULL;
unsigned int counter = 0;
posBuf = (char*) malloc( sizeof(int) + 2*sizeof(double) );
int iDummy = 123;
memcpy(posBuf+counter, (const void*) &iDummy, sizeof(int));
counter += sizeof(int);
double dDummy = 456.78;
memcpy(posBuf+counter, (const void*) &dDummy, sizeof(double));
counter += sizeof(double);
dDummy = 111.222;
memcpy(posBuf+counter, (const void*) &dDummy, sizeof(double));
fputs(posBuf, outFile);
fclose(outFile);
/////////////////////
std::ifstream myfile;
myfile.open("testFile", std::ios::in|std::ios::binary);
myfile.seekg (0, std::ios::end);
unsigned int length = myfile.tellg();
myfile.seekg (0, std::ios::beg);
char* posBuf2 = (char*) malloc( length );
myfile.read(posBuf2, length);
counter = 0;
int idummy = 0;
memcpy((void*) &idummy, posBuf2+counter, sizeof(int));
counter += sizeof(int);
printf("read integer: %u\n", idummy);
double ddummy = 1.0;
memcpy((void*) &ddummy, posBuf2+counter, sizeof(double));
counter += sizeof(double);
printf("read double: %f\n", ddummy);
ddummy = 1.0;
memcpy((void*) &ddummy, posBuf2+counter, sizeof(double));
counter += sizeof(double);
printf("read double: %f\n", ddummy);
myfile.close();
/////////////////////
FILE* inFile = fopen("testFile", "r");
char* posBuf3 = NULL;
unsigned int c = 0;
while ( ! feof (inFile) )
{
posBuf3 = (char*) realloc((void*) posBuf3, c+4);
fgets(posBuf3+c, 4, inFile);
c += 4;
}
idummy = 0;
memcpy((void*) &idummy, posBuf, sizeof(int));
printf("read again integer: %u\n", idummy);
ddummy =1.0;
memcpy((void*) &ddummy, posBuf+sizeof(int), sizeof(double));
printf("read again double: %f\n", ddummy);
ddummy =1.0;
memcpy((void*) &ddummy, posBuf+sizeof(int)+sizeof(double), sizeof(double));
printf("read again double: %f\n", ddummy);
return 0;
}
The output I get from that is:
read integer: 123
read double: 0.000000
read double: 0.000000
read again integer: 123
read again double: 456.780000
read again double: 111.222000
As you can see, the deserialization only works if I use FILE* also for the reading of the file.
QUESTION: Any explanation for that behavior?
Thanks!
UPDATED:
1) open ifstream using std::ios::in|std::ios::binary
2) fix malloc
A few problems with the posted code:
it is writing beyond the bounds of the memory allocated for posBuf (1 int and 2 doubles are copied to the memory, but only sizeof(int) + sizeof(double) is allocated), which is undefined behaviour.
fputs() treats its argument as a null-terminated string and so will stop writing when it encounters a null character. Open the file in binary mode and use fwrite() instead, which does not treat its input as a null-terminated string (a combined sketch is shown after this list).
There are several other issues with the code:
it is a horrible mix of C and C++
avoidable explicit dynamic memory management (never mind malloc() and realloc()), which can simply be replaced with:
char posBuf[sizeof(int) + 2 * sizeof(double)];
while (!feof(inFile)) is almost always wrong: feof() only reports end-of-file after a read has already failed, so the last iteration works with stale or invalid data.
there is practically no checking of the success of I/O operations
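Putting the two main fixes together (binary mode plus fwrite(), fixed-size buffers, no feof() loop), a minimal sketch of the round trip could look like this:
#include <cstdio>
#include <cstring>
#include <fstream>

int main()
{
    char posBuf[sizeof(int) + 2 * sizeof(double)];
    unsigned int counter = 0;

    int iDummy = 123;
    std::memcpy(posBuf + counter, &iDummy, sizeof(int));
    counter += sizeof(int);
    double dDummy = 456.78;
    std::memcpy(posBuf + counter, &dDummy, sizeof(double));
    counter += sizeof(double);
    dDummy = 111.222;
    std::memcpy(posBuf + counter, &dDummy, sizeof(double));

    // Write in binary mode with fwrite(): embedded zero bytes are preserved.
    FILE* outFile = std::fopen("testFile", "wb");
    std::fwrite(posBuf, 1, sizeof(posBuf), outFile);
    std::fclose(outFile);

    // Read the exact same number of bytes back with an ifstream in binary mode.
    char posBuf2[sizeof(posBuf)];
    std::ifstream myfile("testFile", std::ios::in | std::ios::binary);
    myfile.read(posBuf2, sizeof(posBuf2));

    int idummy;
    double ddummy1, ddummy2;
    std::memcpy(&idummy, posBuf2, sizeof(int));
    std::memcpy(&ddummy1, posBuf2 + sizeof(int), sizeof(double));
    std::memcpy(&ddummy2, posBuf2 + sizeof(int) + sizeof(double), sizeof(double));
    std::printf("read integer: %d\nread double: %f\nread double: %f\n",
                idummy, ddummy1, ddummy2);
}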

Reading then adding large number of integers from binary file fast in C/C++

I was writing code to read unsigned integers from a binary file using C/C++ on a 32 bit Linux OS intended to run on an 8-core x86 system. The application takes an input file which contains unsigned integers in little-endian format one after another. So the input file size in bytes is a multiple of 4. The file could have a billion integers in it. What is the fastest way to read and add all the integers and return the sum with 64 bit precision?
Below is my implementation. Error checking for corrupt data is not the major concern here, and the input file is assumed to be without any issues in this case.
#include <iostream>
#include <fstream>
#include <pthread.h>
#include <string>
#include <string.h>
using namespace std;
string filepath;
unsigned int READBLOCKSIZE = 1024*1024;
unsigned long long nFileLength = 0;
unsigned long long accumulator = 0; // assuming 32 bit OS running on X86-64
unsigned int seekIndex[8] = {};
unsigned int threadBlockSize = 0;
unsigned long long acc[8] = {};
pthread_t thread[8];
void* threadFunc(void* pThreadNum);
//time_t seconds1;
//time_t seconds2;
int main(int argc, char *argv[])
{
if (argc < 2)
{
cout << "Please enter a file path\n";
return -1;
}
//seconds1 = time (NULL);
//cout << "Start Time in seconds since January 1, 1970 -> " << seconds1 << "\n";
string path(argv[1]);
filepath = path;
ifstream ifsReadFile(filepath.c_str(), ifstream::binary); // Create FileStream for the file to be read
if(0 == ifsReadFile.is_open())
{
cout << "Could not find/open input file\n";
return -1;
}
ifsReadFile.seekg (0, ios::end);
nFileLength = ifsReadFile.tellg(); // get file size
ifsReadFile.seekg (0, ios::beg);
if(nFileLength < 16*READBLOCKSIZE)
{
//cout << "Using One Thread\n"; //**
char* readBuf = new char[READBLOCKSIZE];
if(0 == readBuf) return -1;
unsigned int startOffset = 0;
if(nFileLength > READBLOCKSIZE)
{
while(startOffset + READBLOCKSIZE < nFileLength)
{
//ifsReadFile.flush();
ifsReadFile.read(readBuf, READBLOCKSIZE); // At this point ifsReadFile is open
int* num = reinterpret_cast<int*>(readBuf);
for(unsigned int i = 0 ; i < (READBLOCKSIZE/4) ; i++)
{
accumulator += *(num + i);
}
startOffset += READBLOCKSIZE;
}
}
if(nFileLength - (startOffset) > 0)
{
ifsReadFile.read(readBuf, nFileLength - (startOffset));
int* num = reinterpret_cast<int*>(readBuf);
for(unsigned int i = 0 ; i < ((nFileLength - startOffset)/4) ; ++i)
{
accumulator += *(num + i);
}
}
delete[] readBuf; readBuf = 0;
}
else
{
//cout << "Using 8 Threads\n"; //**
unsigned int currthreadnum[8] = {0,1,2,3,4,5,6,7};
if(nFileLength > 200000000) READBLOCKSIZE *= 16; // read larger blocks
//cout << "Read Block Size -> " << READBLOCKSIZE << "\n";
if(nFileLength % 28)
{
threadBlockSize = (nFileLength / 28);
threadBlockSize *= 4;
}
else
{
threadBlockSize = (nFileLength / 7);
}
for(int i = 0; i < 8 ; ++i)
{
seekIndex[i] = i*threadBlockSize;
//cout << seekIndex[i] << "\n";
}
pthread_create(&thread[0], NULL, threadFunc, (void*)(currthreadnum + 0));
pthread_create(&thread[1], NULL, threadFunc, (void*)(currthreadnum + 1));
pthread_create(&thread[2], NULL, threadFunc, (void*)(currthreadnum + 2));
pthread_create(&thread[3], NULL, threadFunc, (void*)(currthreadnum + 3));
pthread_create(&thread[4], NULL, threadFunc, (void*)(currthreadnum + 4));
pthread_create(&thread[5], NULL, threadFunc, (void*)(currthreadnum + 5));
pthread_create(&thread[6], NULL, threadFunc, (void*)(currthreadnum + 6));
pthread_create(&thread[7], NULL, threadFunc, (void*)(currthreadnum + 7));
pthread_join(thread[0], NULL);
pthread_join(thread[1], NULL);
pthread_join(thread[2], NULL);
pthread_join(thread[3], NULL);
pthread_join(thread[4], NULL);
pthread_join(thread[5], NULL);
pthread_join(thread[6], NULL);
pthread_join(thread[7], NULL);
for(int i = 0; i < 8; ++i)
{
accumulator += acc[i];
}
}
//seconds2 = time (NULL);
//cout << "End Time in seconds since January 1, 1970 -> " << seconds2 << "\n";
//cout << "Total time to add " << nFileLength/4 << " integers -> " << seconds2 - seconds1 << " seconds\n";
cout << accumulator << "\n";
return 0;
}
void* threadFunc(void* pThreadNum)
{
unsigned int threadNum = *reinterpret_cast<int*>(pThreadNum);
char* localReadBuf = new char[READBLOCKSIZE];
unsigned int startOffset = seekIndex[threadNum];
ifstream ifs(filepath.c_str(), ifstream::binary); // Create FileStream for the file to be read
if(0 == ifs.is_open())
{
cout << "Could not find/open input file\n";
return 0;
}
ifs.seekg (startOffset, ios::beg); // Seek to the correct offset for this thread
acc[threadNum] = 0;
unsigned int endOffset = startOffset + threadBlockSize;
if(endOffset > nFileLength) endOffset = nFileLength; // for last thread
//cout << threadNum << "-" << startOffset << "-" << endOffset << "\n";
if((endOffset - startOffset) > READBLOCKSIZE)
{
while(startOffset + READBLOCKSIZE < endOffset)
{
ifs.read(localReadBuf, READBLOCKSIZE); // At this point ifs is open
int* num = reinterpret_cast<int*>(localReadBuf);
for(unsigned int i = 0 ; i < (READBLOCKSIZE/4) ; i++)
{
acc[threadNum] += *(num + i);
}
startOffset += READBLOCKSIZE;
}
}
if(endOffset - startOffset > 0)
{
ifs.read(localReadBuf, endOffset - startOffset);
int* num = reinterpret_cast<int*>(localReadBuf);
for(unsigned int i = 0 ; i < ((endOffset - startOffset)/4) ; ++i)
{
acc[threadNum] += *(num + i);
}
}
//cout << "Thread " << threadNum + 1 << " subsum = " << acc[threadNum] << "\n"; //**
delete[] localReadBuf; localReadBuf = 0;
return 0;
}
I wrote a small C# program to generate the input binary file for testing.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace BinaryNumWriter
{
class Program
{
static UInt64 total = 0;
static void Main(string[] args)
{
BinaryWriter bw = new BinaryWriter(File.Open("test.txt", FileMode.Create));
Random rn = new Random();
for (UInt32 i = 1; i <= 500000000; ++i)
{
UInt32 num = (UInt32)rn.Next(0, 0xffff);
bw.Write(num);
total += num;
}
bw.Flush();
bw.Close();
}
}
}
Running the program on a Core i5 machine @ 3.33 GHz (it's quad-core, but it's what I have at the moment) with 2 GB RAM and Ubuntu 9.10 32 bit gave the following performance numbers:
100 integers ~ 0 seconds (I would really have to suck otherwise)
100000 integers < 0 seconds
100000000 integers ~ 7 seconds
500000000 integers ~ 29 seconds (1.86 GB input file)
I am not sure if the HDD is 5400RPM or 7200RPM. I tried different buffer sizes for reading and found reading 16 MB at a time for big input files was kinda the sweet spot.
Are there any better ways to read faster from the file to increase overall performance? Is there a smarter way to add large arrays of integers faster, e.g. by folding partial sums repeatedly? Are there any major roadblocks to performance in the way I have written the code / am I doing something obviously wrong that's costing a lot of time?
What can I do to make this process of reading and adding data faster?
Thanks.
Chinmay
Accessing a mechanical HDD from multiple threads the way you do is going to cause some head movement (read: slow it down). You're almost surely IO bound (65 MB/s for the 1.86 GB file).
Try to change your strategy by:
starting the 8 threads - we'll call them CONSUMERS
the 8 threads will wait for data to be made available
in the main thread, start reading chunks (say 256KB) of the file, thus being a PROVIDER for the CONSUMERS
the main thread hits EOF and signals the workers that there is no more available data
the main thread waits for the 8 workers to join.
You'll need quite a bit of synchronization to get it working flawlessly (a rough sketch follows below), and I think it would totally max out your HDD / filesystem IO capabilities by doing sequential file access. YMMV on smallish files, which can be cached and served from the cache at lightning speed.
Another thing you can try is to start only 7 threads, leave one free CPU for the main thread & the rest of the system.
.. or get an SSD :)
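A bare-bones sketch of that layout, using C++11 threads instead of the question's pthreads and with no backpressure on the queue, might look like this (chunk size and thread count are the values suggested above):
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

int main(int argc, char* argv[])
{
    if (argc < 2) return -1;
    const size_t kChunkSize = 256 * 1024;       // 256 KB chunks, as suggested above
    const unsigned kWorkers = 8;

    std::queue<std::vector<char> > chunks;      // PROVIDER -> CONSUMER handoff
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    std::vector<unsigned long long> partial(kWorkers, 0);

    // CONSUMERS: wait for chunks and sum the 32-bit integers they contain.
    auto consumer = [&](unsigned id) {
        for (;;) {
            std::vector<char> c;
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [&] { return done || !chunks.empty(); });
                if (chunks.empty()) return;     // done and nothing left to process
                c = std::move(chunks.front());
                chunks.pop();
            }
            const unsigned int* num = reinterpret_cast<const unsigned int*>(c.data());
            for (size_t i = 0; i < c.size() / 4; ++i)
                partial[id] += num[i];
        }
    };

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < kWorkers; ++i)
        workers.emplace_back(consumer, i);

    // PROVIDER: the main thread reads the file strictly sequentially.
    FILE* fp = fopen(argv[1], "rb");
    if (fp) {
        for (;;) {
            std::vector<char> c(kChunkSize);
            size_t n = fread(c.data(), 1, kChunkSize, fp);
            if (n == 0) break;
            c.resize(n);
            { std::lock_guard<std::mutex> lock(m); chunks.push(std::move(c)); }
            cv.notify_one();
        }
        fclose(fp);
    }
    { std::lock_guard<std::mutex> lock(m); done = true; }   // signal EOF
    cv.notify_all();

    unsigned long long total = 0;
    for (unsigned i = 0; i < kWorkers; ++i) { workers[i].join(); total += partial[i]; }
    printf("%llu\n", total);
    return 0;
}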
Edit:
For simplicity, see how fast you can simply read the file (discarding the buffers) with no processing, single-threaded. That plus epsilon is your theoretical limit on how fast you can get this done.
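Something along these lines gives that baseline number (plain sequential reads into a single buffer, data discarded, timing with time() as in the question):
#include <cstdio>
#include <ctime>
#include <vector>

// Streams the whole file through one buffer and reports the wall-clock time.
int main(int argc, char* argv[])
{
    if (argc < 2) return -1;
    std::vector<char> buf(16 * 1024 * 1024);   // 16 MB blocks, the "sweet spot" above
    FILE* fp = fopen(argv[1], "rb");
    if (!fp) return -1;

    time_t start = time(NULL);
    unsigned long long totalBytes = 0;
    size_t n;
    while ((n = fread(buf.data(), 1, buf.size(), fp)) > 0)
        totalBytes += n;                       // discard the data, only count it
    fclose(fp);

    printf("read %llu bytes in %ld seconds\n", totalBytes, (long)(time(NULL) - start));
    return 0;
}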
If you want to read (or write) a lot of data fast, and you don't want to do much processing with that data, you need to avoid extra copies of the data between buffers. That means you want to avoid fstream or FILE abstractions (as they introduce an extra buffer that needs to be copied through), and avoid read/write type calls that copy stuff between kernel and user buffers.
Instead, on linux, you want to use mmap(2). On a 64-bit OS, just mmap the entire file into memory, use madvise(MADV_SEQUENTIAL) to tell the kernel you're going to be accessing it mostly sequentially, and have at it. For a 32-bit OS, you'll need to mmap in chunks, unmapping the previous chunk each time. Something much like your current structure, with each thread mmapping one fixed-size chunk at a time should work well.
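A rough sketch of the 64-bit variant described above (whole file mapped at once, MADV_SEQUENTIAL hint, 64-bit accumulation) might look like this:
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Maps the entire file (64-bit address space assumed), hints sequential
// access to the kernel and sums the unsigned 32-bit integers it contains.
int main(int argc, char* argv[])
{
    if (argc < 2) return -1;
    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); return -1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return -1; }

    void* map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (map == MAP_FAILED) { perror("mmap"); return -1; }
    madvise(map, st.st_size, MADV_SEQUENTIAL);

    const unsigned int* num = (const unsigned int*)map;
    unsigned long long accumulator = 0;
    for (long long i = 0; i < st.st_size / 4; ++i)
        accumulator += num[i];

    munmap(map, st.st_size);
    printf("%llu\n", accumulator);
    return 0;
}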