Reading subchunk2 data of a wav file in C++ - c++

I am trying to read the data part of a .wav file into a buffer. I have already read the header part according to "C++ Reading the Data part of a WAV file".
Therefore, my file pointer wavFile now points to the beginning of the data section. Then I use the following code to read audio data into a buffer.
long bytes = wavHeader.bitsPerSample/8;
long buffsize= wavHeader.Subchunk2Size/bytes;
int16_T *audiobuf = new int16_T[buffsize];
fread(audiobuf,bytes,buffsize,wavFile);
// do some processing
delete[] audiobuf; // memory from new[] must be released with delete[]
In my test audio file, bitsPerSample is 16 and Subchunk2Size is 79844. Therefore, buffsize is 39922.
After running this code, I noticed that only the first 256 positions of audiobuf get filled. But theoretically there should be 39922 entries of audio data. How can I sort out this issue?
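One way to narrow this down is to check how many samples fread actually returns instead of assuming the whole chunk was read. The sketch below assumes 16-bit PCM and a file opened in binary mode with fopen(path, "rb"). That second assumption is worth checking, because a stream opened in text mode on Windows can stop early at a 0x1A byte. The helper name readPcm16 is hypothetical and not part of the original code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Reads up to dataBytes bytes of 16-bit PCM from a file that is already
// positioned at the start of the data chunk.
// Assumption: wavFile was opened with fopen(path, "rb") (binary mode).
// The returned vector holds only the samples that were actually read.
std::vector<std::int16_t> readPcm16(std::FILE* wavFile, std::uint32_t dataBytes)
{
    std::vector<std::int16_t> samples(dataBytes / sizeof(std::int16_t));
    if (samples.empty()) {
        return samples;
    }
    // fread returns the number of complete elements read, which can be
    // smaller than requested on end-of-file or on a read error.
    const std::size_t got =
        std::fread(samples.data(), sizeof(std::int16_t), samples.size(), wavFile);
    samples.resize(got);
    return samples;
}
```

If the returned vector is shorter than Subchunk2Size / 2, feof() and ferror() on the stream indicate whether the read hit a premature end-of-file or an error.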

Related

Endianness in wav files

I have tried to make a simple wav writer. I wanted to do this so that I could read in a wav file (using a pre-existing wav reader), resample the audio data then write the resampled data to another wav file. Input files could be 16 bitsPerSample or 32 bitsPerSample and I wanted to save the resampled audio with the same number of bitsPerSample.
The writer is working but there are a couple of things I don't understand to do with endianness and I was hoping someone may be able to help me?
I previously had no experience of reading or writing binary files. I began by looking up the wav file format online and tried to write the data following the correct format. At first the writing wasn't working but I then found out that wav files are little-endian and it was trying to make my file writer consistent with this that brought up the majority of my problems.
I have got the wav writer to work now (by way of a test whereby I read in a wav file and checked I could write the unsampled audio and reproduce the exact same file) however there are a couple of points I am still unsure on to do with endianness and I was hoping someone may be able to help me?
Assuming the relevant variables have already been set here is my code for the wav writer:
// Write RIFF header
out_stream.write(chunkID.c_str(),4);
out_stream.write((char*)&chunkSize,4);
out_stream.write(format.c_str(),4);
// Write format chunk
out_stream.write(subchunk1ID.c_str(),4);
out_stream.write((char*)&subchunk1Size,4);
out_stream.write((char*)&audioFormat,2);
out_stream.write((char*)&numOfChannels,2);
out_stream.write((char*)&sampleRate,4);
out_stream.write((char*)&byteRate,4);
out_stream.write((char*)&blockAlign,2);
out_stream.write((char*)&bitsPerSample,2);
// Write data chunk
out_stream.write(subchunk2ID.c_str(),4);
out_stream.write((char*)&subchunk2Size,4);
// Variables for writing 16 bitsPerSample data
std::vector<short> soundDataShort;
soundDataShort.resize(numSamples);
char theSoundDataBytes [2];
// soundData samples are written as shorts if bitsPerSample=16 and floats if bitsPerSample=32
switch( bitsPerSample )
{
case (16):
// cast each of the soundData samples from floats to shorts
// then save the samples in little-endian form (requires reversal of byte-order of the short variable)
for (int sample=0; sample < numSamples; sample++)
{
soundDataShort[sample] = static_cast<short>(soundData[sample]);
theSoundDataBytes[0] = (soundDataShort[sample]) & 0xFF;
theSoundDataBytes[1] = (soundDataShort[sample] >> 8) & 0xFF;
out_stream.write(theSoundDataBytes,2);
}
break;
case (32):
// save the soundData samples in binary form (does not require change to byte order for floats)
out_stream.write((char*)&soundData[0],numSamples);
}
The questions that I have are:
In the soundData vector why does the endianness of a vector of shorts matter but the vector of floats doesn't? In my code I have reversed the byte order of the shorts but not the floats.
Originally I tried to write the shorts without reversing the byte order. When I wrote the file it ended up being half the size it should have been (i.e. half the audio data was missing, but the half that was there sounded correct), why would this be?
I have not reversed the byte order of the shorts and longs in the other single variables which are essentially all the other fields that make up the wav file e.g. sampleRate, numOfChannels etc but this does not seem to affect the playing of the wav file. Is this just because media players do not use these fields (and hence I can't tell that I have got them wrong) or is it because the byte order of these variables does not matter?
In the soundData vector why does the endianness of a vector of shorts matter but the vector of floats doesn't? In my code I have reversed the byte order of the shorts but not the floats.
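Actually, if you take a closer look at your code, you will see that you are not reversing the endianness of your shorts at all. The shift-and-mask code writes the low byte first and the high byte second, which is little-endian order on any machine. Nor do you need to reverse anything on Intel CPUs (or on any other little-endian CPU), where the in-memory layout of a short is already little-endian.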
Originally I tried to write the shorts without reversing the byte order. When I wrote the file it ended up being half the size it should have been (i.e. half the audio data was missing, but the half that was there sounded correct), why would this be?
I have no idea without seeing the code but I suspect that some other factor was in play.
I have not reversed the byte order of the shorts and longs in the other single variables which are essentially all the other fields that make up the wav file e.g. sampleRate, numOfChannels etc but this does not seem to affect the playing of the wav file. Is this just because media players do not use these fields (and hence I can't tell that I have got them wrong) or is it because the byte order of these variables does not matter?
These fields are in fact very important and must also be little-endian, but, as we have seen, you don't need to swap those either.

Knowing current compressed file size using gzwrite (zlib)

I'm using zlib for c++.
Quote from
http://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/zlib-gzwrite-1.html regarding gzwrite function:
The gzwrite() function shall write data to the compressed file referenced by file, which shall have been opened in a write mode (see gzopen() and gzdopen()). On entry, buf shall point to a buffer containing len bytes of uncompressed data. The gzwrite() function shall compress this data and write it to file. The gzwrite() function shall return the number of uncompressed bytes actually written.
I interpret this as the return value will NOT tell me how much larger the file became when writing. Only how much data was compressed into the file.
The only way to know how large the file is would then be to close it, and read the size from the file system. I have a requirement to only continue to write to the file until it reaches a certain size. Can this be achieved without closing the file?
A workaround would be to write until the uncompressed size reaches my limit and then close the file, read the size from file system and update my best guess of file size based on that, and then re-open the file and continue writing. This would make me close and open the file a few times towards the end (as I'm approaching the size limit).
Another workaround, which would give more of an estimate (which is not what I want really) would be to write until uncompressed size reaches the limit, close the file, read the file size from the file system and calculate the compression ratio so far. Then I can use this compression ratio to calculate a new limit for uncompressed file size where the compression should get me down to the limit for the compressed file size. If I repeat this the estimate would improve, but again, not what I'm looking for.
Are there better options?
The preferred option would be if zlib could tell me the compressed file size while the file is still open. I don't see why this information would not be available inside zlib at this point, since compression happens when I call gzwrite and not when I close the file.
zlib provides the function gzoffset(), which does exactly what you're asking.
If for some reason you are stuck with a version of zlib that is more than about eight years old, from before gzoffset() was added, then this is easy to do with gzdopen(). Open the output file with fopen() or open(), obtain a file descriptor (using fileno() and dup() if you used fopen()), and pass that descriptor to gzdopen(). Then you can use ftell() or lseek() at any time to see how much has been written. Be careful not to double-close the descriptor. See the comments for gzdopen().
You can work around this issue by using a pipe. The idea is to write the compressed data into a pipe. After that, you read the data from the other end of the pipe, count it and write it to the actual file.
To set this up you need to first open the file to write to via a simple open. Then create a pipe via pipe2 and initialize zlib by passing the write end of the pipe, p[1], to gzdopen:
int out = open("/path/to/file", O_WRONLY | O_CREAT | O_TRUNC, 0644); // O_CREAT requires a mode argument
int p[2];
pipe2(p, O_NONBLOCK); // p[0] is the read end, p[1] is the write end
gzFile zFile = gzdopen(p[1], "w");
You can now write the data first to the pipe and then splice it from the read end of the pipe to the out file:
gzwrite(zFile, buf, 1024); //or any other length
ssize_t bytesWritten = 0; // splice returns ssize_t, and -1 on error
do {
    bytesWritten = splice(p[0], NULL, out, NULL, 1024, SPLICE_F_NONBLOCK | SPLICE_F_MORE);
} while(bytesWritten == 1024);
As you can see, you now have bytesWritten to tell you how much data was actually written. Simply sum it up in another variable and stop splicing as soon as you have written as much data as you need to. Alternatively, splice it in one go: write everything to zFile and then splice once, passing the amount of data you are allowed to store as the fifth parameter. If you do not want to compress unnecessary data, do it in chunks as shown above.
A note on splice: splice is Linux-specific, and is basically just a very efficient copy. You can always replace it with a simple "read and write" combo, i.e. read data from p[0] into a buffer and then write the data from that buffer into out. splice is just faster and less code.

How to truncate a JPEG 2000 filestream?

I am trying to extract quality layers from a JPEG 2000 filestream, which is contained in a .j2k file for testing. I am trying to do this in order to learn how to transmit the filestream, and eventually to perform Region of Interest (ROI) selection on it. I want to do these things without decoding, and right now the only utility I have is the OpenJPEG library.
I've used the image_to_j2k utility (linux) to transform a test image into a filestream contained in a .j2k file. I've then read the .j2k file into a buffer, in binary mode:
long fsize = get_file_size("img.j2k"); //This does what it's supposed to
char* buffer = new char[fsize];
ifstream in ("img.j2k", ios::in | ios::binary);
in.read(buffer, fsize); //The entire file goes into the buffer
ofstream out1("out1.j2k");
ofstream out2("out2.j2k");
ofstream out3("out3.j2k");
//This is where I try to truncate the filestream
out1.write(buffer, fsize); //Write the entire file to out1.j2k - this works
out2.write(buffer, 11032); //Write 11032 bytes of the filestream to out2.j2k - this does not do what I thought it would
out3.write(buffer, 14714); //Write 14714 bytes of the filestream to out3.j2k - this does not do what I thought it would
in.close();
out1.flush();out1.close();
out2.flush();out2.close();
out3.flush();out3.close();
The number of bytes written to the out2 and out3 files are not chosen at random - they come from an index file that OpenJPEG makes whilst compressing. The thought was that if I took the file from the beginning and read it up to a certain point where the index file tells me there is an "end_pos" marker corresponding to the end of a quality layer, I would simulate an unfinished wireless transmission of the file - this is the end goal, to transmit the file wirelessly out in the forest and show the image in progressively better quality on a handheld device or laptop somewhere else in the forest. The result of trying to use j2k_to_image on the out2.j2k and out3.j2k files is:
[ERROR] JPWL: bad tile byte size (1307053 bytes against 10911 bytes left)
[ERROR] 00000081; expected a marker instead of 1
ERROR -> j2k_to_image: failed to decode image!
Am I going about this the entirely wrong way? Not using JPEG 2000 is out of the question. Thankful for any answers, I've really gone through documentation on this thing but can't find this detail.

Storing audio file into an array/stringstream C++

I would like to send the contents of an audio file to another system over the network using socket. Both systems run on Windows operating system. Is there a tutorial on some way to store the audio contents into a C++ array or Stringstream datatype, so that it will be easier to send it to a different node.
I basically want to know how to extract data bytes from an audio file.
The easiest thing to do is to simply send the data in chunks of bytes. If you are starting with an audio file, just open it like any other binary file with something like file = fopen(filename, "rb"); (where filename is the name of the audio file). Then enter a loop to read chunks of bytes until you reach the end of the file. Just use something like bytes_read = fread(buffer, sizeof(char), read_size, file); where buffer should be a char array of at least size read_size, which could be, say, 1024. After each fread, you can make your network send call. Alternatively, you could read the whole file first and then send it chunk by chunk. Your call. Either way, when you reach the end of the file, send some sort of signal that you have reached the end. The receiving system should take these chunks and call fwrite to create a new audio file. You can either append each chunk as it comes in or buffer it all until you reach the end and then write it all out.
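The read loop described above might be sketched as follows. The function name forEachChunk and the sendChunk callback are hypothetical; the callback stands in for whatever network send call you use, which is not shown here.

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Reads the file at path in chunks of at most chunkSize bytes and hands each
// chunk to sendChunk (a hypothetical stand-in for the network send call).
// Returns the total number of bytes read, or -1 if the file cannot be opened.
long long forEachChunk(const char* path, std::size_t chunkSize,
                       const std::function<void(const char*, std::size_t)>& sendChunk)
{
    std::FILE* file = std::fopen(path, "rb"); // binary mode: no byte translation
    if (!file) {
        return -1;
    }
    std::vector<char> buffer(chunkSize);
    long long total = 0;
    std::size_t bytesRead = 0;
    // fread returns 0 at end-of-file; the last chunk may be shorter than chunkSize.
    while ((bytesRead = std::fread(buffer.data(), sizeof(char), buffer.size(), file)) > 0) {
        sendChunk(buffer.data(), bytesRead);
        total += static_cast<long long>(bytesRead);
    }
    std::fclose(file);
    return total;
}
```

The end-of-transfer signal mentioned above is left to the caller, for example by sending the total size first or by closing the connection once the function returns.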
soundfile++ can be used if you have wav files only. Check the readtest and writetest demo programs here http://sig.sapp.org/doc/examples/soundfile/

is there a way to fopen a file that allows me to edit just a few bytes?

I am writing a class that compresses binary data using a zlib stream. I have a buffer that I fill with the output stream, and once it becomes full I dump the buffer out to a file using fopen(filename, "ab"). This means that my program only opens the file whenever it has a buffer full of data to dump: it writes the data and immediately closes the file.
The issue is in my format I use an 8 byte header at the beginning of each file which contains the original length and compressed length but I do not know these values until the end of the whole compression process.
What I wanted to do was write 8 bytes of zeros, then append with all my compressed data, then come back at the end during cleanup to fill in those 8 bytes with the size data, but I can't seem to find a way to open the file without bringing it all back into memory. I just want to edit the first 8 bytes of the file. Do I need to use mmap?
Since you're using the file in append mode, you do need to close and re-open it:
open with fopen(filename, "r+b");
write the 8 bytes;
close the file using fclose().
The r+ means
Open for reading and writing. The stream is positioned at the
beginning of the file.
and the b is needed to open in binary mode.
You can use this method to change the data at any position in the file, not just at the beginning: simply use fseek() to seek to the required position before writing.
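Use rewind() to take the file pointer back to the start of the file after you write out the last few bytes of data. You can then output your 8 bytes of length info. Note that this only works if the stream was not opened in append mode, because in append mode every write goes to the end of the file regardless of the current position.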
If you have flexibility in changing your format, I might suggest this. Define your compressed stream such that it is a sequence of an unknown number of blocks, and each block is preceded by a fixed length integer specifying the number of bytes in the block. The stream is finished when the next block has a size of zero.
The drawback to this format is that there is no way for the reader of the stream to know how much data is coming until it has all been read. But the advantage is that it avoids the problem you are trying to solve.
More importantly, it allows you to send a compressed stream of data somewhere as you read the input and you don't have to save it all before sending it. For example, you could write a compression Unix filter that you could put in a pipe stream:
prog1 | yourprog -compress | rsh host yourprog -expand | prog2
Good luck.