Converting audio samples into an audible file format - c++

I've been trying to convert a huge amount of data (about 900 MB) to an audible file format for a few days now. I've been given a .dat file containing 900 million floating-point samples (one per line) representing 90 seconds of music sampled at 10 MHz.
I downsampled to 40 kHz, but now I don't know how to listen to the audio hidden in those bytes. I'm writing a C++ program in a Linux environment, but if anyone knows how to accomplish this task using Matlab, Octave, Python, Audacity, MPlayer or any other tool, please come forth and speak :) Contributions in any amount are greatly appreciated.
head -n 5 ~/input.dat
-2.4167
-7.5322e-016
-0.2283
0.13581
-0.51926

Target your sample rate to 44100 Hz (or 48000, 22050, 11025, or 8000 Hz).
Convert your audio samples to 16-bit signed integers (-32768 to +32767).
Follow the instructions on WAV file synthesis here:
WAV File Synthesis From Scratch - C

The WAV file format is a rather simple one.
You just need to write the 44-byte header block defined in that link, followed by your data converted to integers.
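To make the "header plus integers" recipe concrete, here is a minimal Python sketch. It assumes mono audio and that it is acceptable to normalise by the loudest sample; the function name `pcm16_wav_bytes` and the output file name are made up for illustration, and the same byte layout can be written from C++ just as well.

```python
import struct

def pcm16_wav_bytes(samples, sample_rate=44100):
    """Return a complete mono 16-bit PCM WAV file as bytes (hypothetical helper)."""
    # Normalise so the loudest sample maps to full scale (assumption: peak scaling is OK).
    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    ints = [int(round(s / peak * 32767)) for s in samples]
    data = struct.pack("<%dh" % len(ints), *ints)  # little-endian signed 16-bit

    channels, bits = 1, 16
    block_align = channels * bits // 8
    # The classic 44-byte header: RIFF chunk, "fmt " chunk (PCM = tag 1), "data" chunk.
    header = struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + len(data), b"WAVE",
        b"fmt ", 16, 1, channels, sample_rate,
        sample_rate * block_align, block_align, bits,
        b"data", len(data),
    )
    return header + data
```

Usage would be something like `open("out.wav", "wb").write(pcm16_wav_bytes(samples, 44100))`, where `samples` is the list of floats read from the .dat file.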

If you have a sequence of bytes and want to convert it to audio, all you need to do is write a header to it. Since you mentioned that you can use MATLAB, I would recommend the wavwrite command. It is simple, tried and tested, and excellent for prototyping. Here is the link to the documentation:
http://www.mathworks.in/help/matlab/ref/wavwrite.html
Here are some steps you may need to take in case you are using wavwrite.
- Since your input data is floating point, scale the data in your file to within a range of [-1, 1].
- Once data is scaled plug and chug into the function call.
- Play the wav file using wavplay command.

WAV allows for floating-point sample data and a wide range of sample rates (1 Hz to roughly 4.3 GHz in 1 Hz increments, if memory serves, since the rate is stored in a 32-bit field).
You don't need to bother with converting to integer values. Just set up the WAV file's header appropriately and write 32-bit floats as binary data in the data section.
From a storage perspective, the 10MHz sample-rate is no problem for a WAV file. Playback, however, will require conversion to something the hardware can handle. The upper limit these days is typically 96 or 192 kHz.
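A sketch of the float variant in Python, under a few assumptions: mono data, format tag 3 (IEEE float), and, as I understand the format, an 18-byte "fmt " chunk plus a "fact" chunk for non-PCM data. The function name `float32_wav_bytes` is invented for this example. Players generally expect float samples within [-1, 1], so scaling may still be needed before listening.

```python
import struct

def float32_wav_bytes(samples, sample_rate):
    """Return a mono 32-bit float WAV file as bytes (hypothetical helper)."""
    data = struct.pack("<%df" % len(samples), *samples)
    channels, bits = 1, 32
    block_align = channels * bits // 8
    # Format tag 3 = IEEE float; the trailing 0 is the extension size (cbSize).
    fmt = struct.pack("<HHIIHHH", 3, channels, sample_rate,
                      sample_rate * block_align, block_align, bits, 0)
    # "fact" chunk: number of samples per channel.
    fact = struct.pack("<I", len(samples))
    body = (b"WAVE"
            + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"fact" + struct.pack("<I", len(fact)) + fact
            + b"data" + struct.pack("<I", len(data)) + data)
    return b"RIFF" + struct.pack("<I", len(body)) + body
```

With `sample_rate=10_000_000` this stores the original 10 MHz data as-is; for playback you would still resample to something the sound card accepts.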


Is there a way to check if a buffer is in Brotli compressed format?

I'm an intern doing research into whether using Brotli compression in a piece of software provides a performance boost over the current release, which uses GZip.
My task is to change anything using GZip to use Brotli compression instead. One function I need to replace does a check to test if a buffer contains data that was compressed using GZip. It does this by checking the stream identifier at the beginning and end:
bool isGzipped() const
{
// Gzip file signature (0x1f8b)
return
(_bufferEnd >= _bufferStart + 2) &&
(static_cast<unsigned char>(_bufferStart[0]) == 0x1f) &&
(static_cast<unsigned char>(_bufferStart[1]) == 0x8b);
}
I want to create a similar function, bool isBrotliEncoded(). I was wondering if there is a similar quick check that can be done with Brotli-encoded buffers? I've had a look at the byte values for some of the compressed files that brotli produces, but I can't find a rule that holds for all of them. Some start with 0x5B, some with 0x1B, compression of empty files results in 0x06, and files that have been compressed multiple times start with a range of different values. The end of each file is also inconsistent.
The only way I know of to test if it is in the correct format is to attempt decompression and wait for an error, which defeats the purpose of doing this test.
So my question is: Does anyone know how to check if a buffer has been compressed with Brotli without attempting decompression and waiting for failure?
Unfortunately, the raw brotli format is not well suited to such detection, even when simply trying to decompress and waiting for an error.
I ran a trial of one million brotli decompressions of random data. About 5% of them checked out as good brotli streams. So you've already got a problem right there. 3.5% of the million are a single byte, since there are nine one-byte values that are each a valid brotli stream. The mean length of the random valid streams was almost a megabyte.
For those in which an error was detected (about 95% of the million cases), 3.5% went more than a megabyte before the error was detected. 1.4% went more than ten megabytes. The mean number of random bytes before finding an error was 309 KB. Another problem.
In short, the probability of a false positive is relatively high, and the number of bytes to process to find a negative can be quite large.
If you are writing this software, then you should put your own header before the brotli data to aid in detection. Or you can use the brotli framing format that I developed at their request, which has a unique four-byte header before the brotli compressed stream. That would reduce the probability of a false positive dramatically.
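The "put your own header in front" idea can be sketched in a few lines of Python. The 4-byte marker below is invented for illustration and is NOT the header of the brotli framing format mentioned above; the compressed payload is treated as opaque bytes, so no brotli library is needed for the check itself.

```python
MAGIC = b"BRx1"  # hypothetical 4-byte marker, not part of any Brotli specification

def wrap(compressed: bytes) -> bytes:
    """Prefix an already-compressed brotli stream with the marker."""
    return MAGIC + compressed

def is_wrapped_brotli(buf: bytes) -> bool:
    """Cheap check, analogous to isGzipped(): look only at the first bytes."""
    return len(buf) >= len(MAGIC) and buf[:len(MAGIC)] == MAGIC

def unwrap(buf: bytes) -> bytes:
    """Strip the marker so the rest can be handed to the brotli decoder."""
    if not is_wrapped_brotli(buf):
        raise ValueError("buffer does not start with the expected marker")
    return buf[len(MAGIC):]
```

For uniformly random input, a 4-byte marker matches by accident with probability 2^-32, compared with the roughly 5% false-positive rate measured above for raw streams.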
Brotli is formally defined in RFC 7932. The format of the data stream is covered in Section 2: Compressed Representation Overview and Section 9: Compressed Data Format. Brotli does not employ leading/trailing identifiers like gzip does, but it does consist of a sequence of uncompressed headers and commands that describe the compressed data. They are not all aligned on byte boundaries, so you have to parse them at the bit level instead (Brotli is processed as a stream of bits and bytes). Refer to Section 10: Decoding Algorithm for how to read these headers. If you parse out a few headers that follow the Brotli format without error, then it is a good bet that you are dealing with a Brotli-compressed buffer.

Understanding Mp3 file structure

I'm working on an MP3 steganography project and I want to encode text inside the MP3 file by manipulating least significant bits (LSB) at regular intervals. I want to encode that text without making any significant changes to the audio. According to this link, http://www.datavoyage.com/mpgscript/mpeghdr.htm, there are MP3 headers which carry information about the MP3 chunk they lead. So I'd like some guidance on how I can make this possible.
An MP3 file is made of a sequence of "frames" (roughly 9,000 frames for a 4-minute file at 44.1 kHz, since each frame holds 1152 samples). At the front and end of an MP3 file there may be two metadata fields (an ID3v2 tag at the front, an ID3v1 tag at the end) containing information about the file. These two fields are optional and can exist or not without any impact on the quality of the MP3 file. You should not hide the stego message there because it can easily be found. Each frame consists of a frame header (32 bits) and a frame body (containing the compressed sound). Since your question is about the headers, I'll focus on the frame header!
Among the 32 bits of the frame header there are still some "unimportant" bits, given their functions (read up on each field's function for more detail). In short, you can use the bits at index 24, 27, 28, 29, 30, 31 and 32 (bits 27 and 28 will have a small impact on the sound quality), with the indices as shown in the picture at this link: https://en.wikipedia.org/wiki/MP3#/media/File:Mp3filestructure.svg.
So it depends on whether you want just 5 bits per frame or 7 bits per frame. 7 is the maximum number of bits I could use in each frame in my work (both in theory and in tests with source code), but someone else may find more!
To access the byte array of each frame, you can write your own class, but there are many free classes available on the Internet. NAudio.dll by Mark Heath is a useful one (I cannot post the link due to forum rules; you can search Google for it).
Once you have access to the byte array of each frame, you can embed data in, or extract data from, the MP3 file. Note that the first 32 bits of each frame's byte array are the frame header, so you can easily identify the precise index of the unimportant bits!
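As a rough illustration of the embed/extract step (in Python rather than C#), the sketch below flips only the five bits said above not to affect the sound: 24, 29, 30, 31 and 32. It assumes the positions are counted 1 to 32 from the most significant bit of the header, as in the linked diagram; the function names are invented, and finding the frames in the file is left to whatever MP3 library you use. Whether every decoder tolerates every value in these fields is something to test on your own files.

```python
# 1-indexed bit positions, counted from the most significant bit of the
# 32-bit frame header (assumption: same numbering as the linked diagram).
CARRIER_BITS = (24, 29, 30, 31, 32)

def embed_bits(header: bytes, bits):
    """Return a copy of the 4-byte frame header with up to 5 payload bits written."""
    value = int.from_bytes(header[:4], "big")
    for pos, bit in zip(CARRIER_BITS, bits):
        mask = 1 << (32 - pos)
        value = (value | mask) if bit else (value & ~mask)
    return value.to_bytes(4, "big")

def extract_bits(header: bytes):
    """Read the payload bits back out of a 4-byte frame header."""
    value = int.from_bytes(header[:4], "big")
    return [(value >> (32 - pos)) & 1 for pos in CARRIER_BITS]
```

A message is then spread across frames five bits at a time, and read back by calling `extract_bits` on each frame header in order.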
I recently completed my final-year thesis on this topic (steganography on images with LSB and parity coding, and on MP3 with unused header bits). The following source code from my thesis (written in C#) is a runnable steganography program. I hope it helps: http://www.mediafire.com/download/aggg33i5ydvgrpg/ThesisSteganography%2850900483%29.rar
PS: I'm Vietnamese, so there may be some errors in my sentences!

Parallelization of PNG file creation with C++, libpng and OpenMP

I am currently trying to implement a PNG encoder in C++ based on libpng that uses OpenMP to speed up the compression process.
The tool is already able to generate PNG files from various image formats.
I uploaded the complete source code to pastebin.com so you can see what I have done so far: http://pastebin.com/8wiFzcgV
So far, so good! Now, my problem is to find a way to parallelize the generation of the IDAT chunks containing the compressed image data. Usually, the libpng function png_write_row gets called in a for-loop with a pointer to the struct that contains all the information about the PNG file and a row pointer with the pixel data of a single image row.
(Line 114-117 in the Pastebin file)
//Loop through image
for (i = 0, rp = info_ptr->row_pointers; i < png_ptr->height; i++, rp++) {
png_write_row(png_ptr, *rp);
}
Libpng then compresses one row after another and fills an internal buffer with the compressed data. As soon as the buffer is full, the compressed data gets flushed in an IDAT chunk to the image file.
My approach was to split the image into multiple parts and let one thread compress row 1 to 10 and another thread 11 to 20 and so on. But as libpng is using an internal buffer it is not as easy as I thought first :) I somehow have to make libpng write the compressed data to a separate buffer for every thread. Afterwards I need a way to concatenate the buffers in the right order so I can write them all together to the output image file.
So, does someone have an idea how I can do this with OpenMP and some tweaking to libpng? Thank you very much!
This is too long for a comment but is not really an answer either--
I'm not sure you can do this without modifying libpng (or writing your own encoder). In any case, it will help if you understand how PNG compression is implemented:
At the high level, the image is a set of rows of pixels (generally 32-bit values representing RGBA tuples).
Each row can independently have a filter applied to it -- the filter's sole purpose is to make the row more "compressible". For example, the "sub" filter makes each pixel's value the difference between it and the one to its left. This delta encoding might seem silly at first glance, but if the colours between adjacent pixels are similar (which tends to be the case) then the resulting values are very small regardless of the actual colours they represent. It's easier to compress such data because it's much more repetitive.
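A tiny Python sketch of the "sub" filter, to show why the filtered values end up small. It works on bytes, with `bpp` being the number of bytes per pixel (4 for RGBA); the function names are invented, and the filter-type byte that PNG puts in front of each row is left out.

```python
def sub_filter(row: bytes, bpp: int) -> bytes:
    """Replace each byte by its difference to the matching byte one pixel to the left."""
    out = bytearray(len(row))
    for i, value in enumerate(row):
        left = row[i - bpp] if i >= bpp else 0  # first pixel has no left neighbour
        out[i] = (value - left) & 0xFF          # differences wrap modulo 256
    return bytes(out)

def undo_sub_filter(filtered: bytes, bpp: int) -> bytes:
    """Inverse operation, as a decoder would apply it."""
    out = bytearray(len(filtered))
    for i, value in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        out[i] = (value + left) & 0xFF
    return bytes(out)
```

For three similar RGBA pixels, everything after the first pixel becomes values close to 0 or 255, which is exactly the kind of repetition deflate handles well.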
Going down a level, the image data can be seen as a stream of bytes (rows are no longer distinguished from each other). These bytes are compressed, yielding another stream of bytes. The compressed data is arbitrarily broken up into segments (anywhere you want!) written to one IDAT chunk each (along with a little bookkeeping overhead per chunk, including a CRC checksum).
The lowest level brings us to the interesting part, which is the compression step itself. The PNG format uses the zlib compressed data format. zlib itself is just a wrapper (with more bookkeeping, including an Adler-32 checksum) around the real compressed data format, deflate (zip files use this too). deflate supports two compression techniques: Huffman coding (which reduces the number of bits required to represent some byte-string to the optimal number given the frequency that each different byte occurs in the string), and LZ77 encoding (which lets duplicate strings that have already occurred be referenced instead of written to the output twice).
The tricky part about parallelizing deflate compression is that in general, compressing one part of the input stream requires that the previous part also be available in case it needs to be referenced. But, just like PNGs can have multiple IDAT chunks, deflate is broken up into multiple "blocks". Data in one block can reference previously encoded data in another block, but it doesn't have to (of course, it may affect the compression ratio if it doesn't).
So, a general strategy for parallelizing deflate would be to break the input into multiple large sections (so that the compression ratio stays high), compress each section into a series of blocks, then glue the blocks together (this is actually tricky since blocks don't always end on a byte boundary -- but you can put an empty non-compressed block (type 00), which will align to a byte boundary, in-between sections). This isn't trivial, however, and requires control over the very lowest level of compression (creating deflate blocks manually), creating the proper zlib wrapper spanning all the blocks, and stuffing all this into IDAT chunks.
If you want to go with your own implementation, I'd suggest reading my own zlib/deflate implementation (and how I use it) which I expressly created for compressing PNGs (it's written in Haxe for Flash but should be comparatively easy to port to C++). Since Flash is single-threaded, I don't do any parallelization, but I do split the encoding up into virtually independent sections ("virtually" because there's the fractional-byte state preserved between sections) over multiple frames, which amounts to largely the same thing.
Good luck!
I finally managed to parallelize the compression process.
As mentioned by Cameron in the comment to his answer, I had to strip the zlib header from the zstreams to combine them. Stripping the footer was not required, as zlib offers an option called Z_SYNC_FLUSH which can be used for all chunks (except the last one, which has to be written with Z_FINISH) to write to a byte boundary. So you can simply concatenate the stream outputs afterwards. Finally, the adler32 checksum has to be calculated over the data of all threads and copied to the end of the combined zstreams.
If you are interested in the result you can find the complete proof of concept at https://github.com/anvio/png-parallel
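For readers who want to see the Z_SYNC_FLUSH trick in isolation, here is a small Python sketch of it (not the code from the repository above). Assumptions: each chunk is compressed as raw deflate with its own compressor, so no chunk references data from another; the two-byte zlib header is written once by hand; and the Adler-32 is simply computed over the whole input at the end. Splitting the result into IDAT chunks is not shown.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

def _deflate_part(job):
    chunk, is_last = job
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # negative wbits = raw deflate, no header
    out = comp.compress(chunk)
    # Z_SYNC_FLUSH pads to a byte boundary without ending the stream;
    # only the last chunk is finished, which marks the final deflate block.
    out += comp.flush(zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH)
    return out

def parallel_zlib(data: bytes, chunk_size: int = 1 << 16, workers: int = 4) -> bytes:
    """Compress chunks independently and stitch them into one valid zlib stream."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    jobs = [(c, i == len(chunks) - 1) for i, c in enumerate(chunks)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(_deflate_part, jobs))  # map() keeps the input order
    header = b"\x78\x9c"  # zlib header: deflate, 32K window
    trailer = struct.pack(">I", zlib.adler32(data) & 0xFFFFFFFF)
    return header + b"".join(parts) + trailer
```

The output can be checked with a plain `zlib.decompress(parallel_zlib(data))`. The price for the independence is a somewhat lower compression ratio, since matches cannot cross chunk borders.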

How to determine .mp3 bit rate without downloading it?

I have a list of .mp3 files over the web and I would like to get the highest quality file.
For my purposes, the quality of a multimedia file is its bit rate.
The bit rate itself should be found in the file's headers. If not, length of the audio track could be used too. (Filesize / Track Length = Bit Rate)
These things would be easy if I would have these files locally, but I would like to fetch this information over HTTP and determine which file has the highest quality.
Can I get an audio track's length out of HTTP headers? If not, is it possible to fetch only the bits that describes the length/bit rate instead of downloading the whole file?
I'm writing the code in python, but the question is quite general so I'm not tagging it as a python question.
Assuming that the remote server is behaving nicely, you could issue a HEAD request to the file and check the contents of the Content-Length header field. It doesn't give you track length or bit rate but you can get the size of the file.
EDIT: MP3s consist of multiple frames, each of which can have a different bit rate (VBR). Track length is calculated from the bit rate of each of these frames, rather than the length itself being stored. If you want the bit rate reliably, you'd need to get the whole file and read the bit rate of each of the frames. It may be possible to grab the first few KB of the file and read the bit rate from the first frame, but this is not always at the same point in the file (e.g. due to the position of the ID3 tag etc.).
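Since the question mentions Python, here is a rough sketch of the "grab the first few KB" idea. Assumptions: the server honours the Range header, the file is MPEG-1 Layer III, and the first thing that looks like a frame sync really is a frame (false syncs are possible). The function names are invented, and for VBR files the result only describes the first frame.

```python
import urllib.request

# Bit rates in kbit/s for MPEG-1 Layer III, indexed by the 4-bit field in the header.
MPEG1_L3_KBPS = (None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None)

def fetch_prefix(url, nbytes=16384):
    """Download only the first nbytes of the file via an HTTP Range request."""
    req = urllib.request.Request(url, headers={"Range": "bytes=0-%d" % (nbytes - 1)})
    with urllib.request.urlopen(req) as resp:
        return resp.read(nbytes)  # cap the read in case the server ignores Range

def first_frame_bitrate_kbps(buf):
    """Skip a leading ID3v2 tag, find the first frame header, return its bit rate."""
    pos = 0
    if len(buf) >= 10 and buf[:3] == b"ID3":
        # ID3v2 stores its size as four 7-bit ("syncsafe") bytes.
        size = (buf[6] << 21) | (buf[7] << 14) | (buf[8] << 7) | buf[9]
        pos = 10 + size
    while pos + 4 <= len(buf):
        b1, b2 = buf[pos + 1], buf[pos + 2]
        if buf[pos] == 0xFF and (b1 & 0xE0) == 0xE0:  # 11 sync bits set
            version, layer, index = (b1 >> 3) & 3, (b1 >> 1) & 3, b2 >> 4
            if version == 3 and layer == 1 and MPEG1_L3_KBPS[index]:
                return MPEG1_L3_KBPS[index]
        pos += 1
    return None
```

Typical use would be `first_frame_bitrate_kbps(fetch_prefix(url))` for each candidate URL, falling back to Content-Length when it returns None.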

What is the best compression algorithm for small 4 KB files?

I am trying to compress TCP packets, each about 4 KB in size. The packets can contain any byte (from 0 to 255). All of the benchmarks on compression algorithms that I found were based on larger files. I did not find anything that compares the compression ratio of different algorithms on small files, which is what I need. I need it to be open source so it can be implemented in C++, so no RAR for example. What algorithm can be recommended for small files of about 4 kilobytes in size? LZMA? HACC? ZIP? gzip? bzip2?
Choose the algorithm that is the quickest, since you probably care about doing this in real time. Generally for smaller blocks of data, the algorithms compress about the same (give or take a few bytes) mostly because the algorithms need to transmit the dictionary or Huffman trees in addition to the payload.
I highly recommend Deflate (used by zlib and Zip) for a number of reasons. The algorithm is quite fast, well tested, permissively licensed (zlib ships under the zlib license), and is the only compression method required to be supported by Zip (as per the Zip Appnote). Aside from the basics, when it determines that the compressed output would be larger than the uncompressed input, there's a STORE mode which only adds 5 bytes for every block of data (max block is 64k bytes). Aside from the STORE mode, Deflate supports two different types of Huffman tables (or dictionaries): dynamic and fixed. A dynamic table means the Huffman tree is transmitted as part of the compressed data and is the most flexible (for varying types of nonrandom data). The advantage of a fixed table is that the table is known by all decoders and thus doesn't need to be contained in the compressed stream. The decompression (or Inflate) code is relatively easy. I've written both Java and Javascript versions based directly off of zlib and they perform rather well.
The other compression algorithms mentioned have their merits. I prefer Deflate because of its runtime performance on both the compression step and particularly in decompression step.
A point of clarification: Zip is not a compression type, it is a container. For doing packet compression, I would bypass Zip and just use the deflate/inflate APIs provided by zlib.
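To show what "bypass Zip and use deflate/inflate directly" looks like, here is a short Python sketch using the standard zlib binding (the C API used from C++ follows the same pattern). The negative window-bits value asks for raw deflate without the zlib header and checksum; the helper names are invented for the example.

```python
import zlib

def deflate_packet(payload: bytes, level: int = 6) -> bytes:
    """Compress one packet as raw deflate (no zlib/gzip/Zip wrapper)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)  # -15 = raw deflate, 32K window
    return comp.compress(payload) + comp.flush()

def inflate_packet(blob: bytes) -> bytes:
    """Inverse of deflate_packet."""
    return zlib.decompress(blob, -15)
```

Comparing `len(deflate_packet(p))` with `len(p)` on a few captured packets is a quick way to see whether compression pays off for your traffic.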
This is a follow-up to Rick's excellent answer which I've upvoted. Unfortunately, I couldn't include an image in a comment.
I ran across this question and decided to try deflate on a sample of 500 ASCII messages that ranged in size from 6 to 340 bytes. Each message is a bit of data generated by an environmental monitoring system that gets transported via an expensive (pay-per-byte) satellite link.
The most fun observation is that the crossover point at which messages are smaller after compression is the same as the Ultimate Question of Life, the Universe, and Everything: 42 bytes.
To try this out on your own data, here's a little bit of node.js to help:
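const zlib = require('zlib')
const sprintf = require('sprintf-js').sprintf
// data_packet stands in for one of your own messages (sample value, replace it)
const data_packet = Buffer.from('replace this with one of your own messages')
const inflate_len = data_packet.length
const deflate_len = zlib.deflateRawSync(data_packet).length
// negative delta = the message shrank, positive delta = it grew
const delta = +((inflate_len - deflate_len)/-inflate_len * 100).toFixed(0)
console.log(`inflated,deflated,delta(%)`)
console.log(sprintf(`%03i,%03i,%3i`, inflate_len, deflate_len, delta))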
If you want to "compress TCP packets", you might consider using an RFC standard technique.
RFC1978 PPP Predictor Compression Protocol
RFC2394 IP Payload Compression Using DEFLATE
RFC2395 IP Payload Compression Using LZS
RFC3173 IP Payload Compression Protocol (IPComp)
RFC3051 IP Payload Compression Using ITU-T V.44 Packet Method
RFC5172 Negotiation for IPv6 Datagram Compression Using IPv6 Control Protocol
RFC5112 The Presence-Specific Static Dictionary for Signaling Compression (Sigcomp)
RFC3284 The VCDIFF Generic Differencing and Compression Data Format
RFC2118 Microsoft Point-To-Point Compression (MPPC) Protocol
There are probably other relevant RFCs I've overlooked.
All of those algorithms are reasonable to try. As you say, they aren't optimized for tiny files, but your next step is to simply try them. It will likely take only 10 minutes to test-compress some typical packets and see what sizes result. (Try different compress flags too). From the resulting files you can likely pick out which tool works best.
The candidates you listed are all good first tries. You might also try bzip2.
Sometimes a simple "try them all" is a good solution when the tests are easy to do. Thinking too much sometimes slows you down.
I don't think the file size matters - if I remember correctly, the LZW in GIF resets its dictionary every 4K.
ZLIB should be fine. It is used in MCCP.
However, if you really need good compression, I would do an analysis of common patterns and include a dictionary of them in the client, which can yield even higher levels of compression.
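The shared-dictionary idea can be tried directly with zlib's preset dictionary support. The sketch below is in Python; the dictionary contents and message are made-up examples, and both sides must hold the identical dictionary bytes for decompression to work.

```python
import zlib

def compress_with_dict(payload: bytes, zdict: bytes) -> bytes:
    """Compress using a preset dictionary known to both client and server."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(payload) + comp.flush()

def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress data produced by compress_with_dict with the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()

# Hypothetical dictionary: byte strings that commonly appear in the packets.
COMMON = b'{"sensor":"temp","value":,"unit":"C"}'
```

For a short message such as `b'{"sensor":"temp","value":21.5,"unit":"C"}'`, most of the content is found in the dictionary, so the output is noticeably smaller than plain `zlib.compress` of the same message.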
I've had luck using zlib compression libraries directly and not using any file containers. ZIP, RAR, have overhead to store things like filenames. I've seen compression this way yield positive results (compression less than original size) for packets down to 200 bytes.
You may test bicom.
This algorithm is forbidden for commercial use.
If you want it for professional or commercial usage look at "range coding algorithm".
You can try delta compression. Compression will depend on your data. If you have any encapsulation on the payload, then you can compress the headers.
I did what Arno Setagaya suggested in his answer: made some sample tests and compared the results.
The compression tests were done using 5 files, each of them 4096 bytes in size. Each byte inside of these 5 files was generated randomly.
IMPORTANT: In real life, the data would not likely be all random, but would tend to have quite a few repeating bytes. Thus in a real-life application the compression would tend to be a bit better than the following results.
NOTE: Each of the 5 files was compressed by itself (i.e. not together with the other 4 files, which would result in better compression). In the following results I just use the sum of the size of the 5 files together for simplicity.
I included RAR just for comparison reasons, even though it is not open source.
Results: (from best to worst)
LZOP: 20775 / 20480 * 100 = 101.44% of original size
RAR : 20825 / 20480 * 100 = 101.68% of original size
LZMA: 20827 / 20480 * 100 = 101.69% of original size
ZIP : 21020 / 20480 * 100 = 102.64% of original size
BZIP: 22899 / 20480 * 100 = 111.81% of original size
Conclusion: To my surprise, ALL of the tested algorithms produced a larger size than the originals!!! I guess they are only good for compressing larger files, or files that have a lot of repeating bytes (not random data like the above). Thus I will not be using any type of compression on my TCP packets. Maybe this information will be useful to others who consider compressing small pieces of data.
EDIT:
I forgot to mention that I used default options (flags) for each of the algorithms.