Understanding Mp3 file structure

I'm working on an MP3 steganography project and I want to encode text inside the MP3 file by manipulating least significant bits (LSBs) at regular intervals. I want to encode the text without making any significant change to the audio. According to this link http://www.datavoyage.com/mpgscript/mpeghdr.htm there are MP3 headers that carry information about the MP3 chunk that follows them. How can I make this possible?

An MP3 file is made up of a sequence of "frames" (roughly 11,000 frames for a 4-minute file). At the start and end of the file there may be two metadata fields (ID3 tags, v2 and v1) that carry information about the file. Both are optional, and their presence or absence has no effect on audio quality. You should not hide a stego-message there because it can easily be found. Each frame consists of a frame header (32 bits) and a frame body (the compressed sound). For your question, the steganography will work on the frame header, so I'll focus on that.
Some of the 32 header bits are "unimportant" because of what they signal (read up on each field's function for details). In short, you can use the bits at index 24, 27, 28, 29, 30, 31 and 32 (bits 27 and 28 have a small impact on sound quality), using the bit numbering in this picture: https://en.wikipedia.org/wiki/MP3#/media/File:Mp3filestructure.svg.
So it depends on whether you want just 5 bits per frame or 7 bits per frame. Seven is the most I could use per frame in my own work (both in theory and in tests with source code), but someone else may find more.
To access the byte array of each frame you can write your own class, but there are many free classes available on the Internet. NAudio.dll by Mark Heath is a useful one (I cannot post the link due to forum rules; you can find it through Google).
Once you have the byte array of each frame, you can embed data in, or extract it from, the MP3 file. Note that the first 32 bits of each frame's byte array are the frame header, so you can easily work out the exact index of the unimportant bits.
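The header-bit idea above can be sketched in Python roughly as follows. This is only an illustration, not the thesis code: it assumes the 4 header bytes of a frame have already been located, that bit indices are counted from 1 starting at the most significant bit (as in the linked picture), and the function names are made up for the example.

```python
# 1-based bit positions (counted from the most significant bit) named in the answer.
CANDIDATE_BITS = (24, 27, 28, 29, 30, 31, 32)


def is_frame_header(header: bytes) -> bool:
    """True if the 4 bytes begin with the 11-bit frame sync (all ones)."""
    return len(header) == 4 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0


def embed_bits(header: bytes, payload_bits, positions=CANDIDATE_BITS) -> bytes:
    """Return a copy of the header with payload bits written at the given positions."""
    value = int.from_bytes(header, "big")
    for pos, bit in zip(positions, payload_bits):
        mask = 1 << (32 - pos)  # position 1 is the top bit, position 32 the lowest
        value = (value | mask) if bit else (value & ~mask)
    return value.to_bytes(4, "big")


def extract_bits(header: bytes, positions=CANDIDATE_BITS):
    """Read the bits back from the same positions."""
    value = int.from_bytes(header, "big")
    return [(value >> (32 - pos)) & 1 for pos in positions]
```

Passing a shorter positions tuple, for example (24, 29, 30, 31, 32), gives the 5-bits-per-frame variant that leaves bits 27 and 28 alone.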
I recently completed my final-year thesis on this topic (steganography on images with LSB and parity coding, and on MP3 with unused header bits). The following source code from my thesis (written in C#) is a runnable steganography program. I hope it helps: http://www.mediafire.com/download/aggg33i5ydvgrpg/ThesisSteganography%2850900483%29.rar
PS: I'm Vietnamese, so there may be some errors in my sentences!

Related

Converting audio samples into an audible file format

I've been trying to convert a huge amount of data (about 900 MB) to an audible file format for a few days now. I've been given a .dat file containing 900 million floating-point samples (one per line) representing 90 seconds of music at 10 MHz.
I downsampled to 40 kHz, but now I don't know how I could possibly listen to the audio hidden in those bytes. I'm writing a C++ program in a Linux environment, but if anyone knows how to accomplish this task using Matlab, Octave, Python, Audacity, MPlayer or any other tool, please come forth and speak :) Contributions in any amount are greatly appreciated.
head -n 5 ~/input.dat
-2.4167
-7.5322e-016
-0.2283
0.13581
-0.51926

Target a sample rate of 44100 Hz (or 48000, 22050, 11025, or 8000 Hz).
Convert your audio samples to 16-bit signed integers (-32768 to +32767).
Follow the instructions on WAV file synthesis here:
WAV File Synthesis From Scratch - C

The wav file format is a rather simple one.
You just need to write the 44 byte header block defined in that link, followed by your data converted to integers.
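As a rough sketch of that recipe, the Python standard library's wave module writes the 44-byte PCM header for you. The function name and the peak normalisation step are assumptions made for the example; the samples are the first five lines of the question's input.

```python
import io
import struct
import wave


def floats_to_wav_bytes(samples, sample_rate=44100):
    """Scale float samples to 16-bit signed PCM and wrap them in a mono WAV container."""
    # Normalise by the largest magnitude so the loudest sample maps to +/-32767.
    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    pcm = [int(round(s / peak * 32767)) for s in samples]

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return buf.getvalue()


if __name__ == "__main__":
    data = floats_to_wav_bytes([-2.4167, -7.5322e-16, -0.2283, 0.13581, -0.51926])
    with open("out.wav", "wb") as f:
        f.write(data)
```

For the full 900 MB input you would stream the file in chunks rather than hold every sample in a list, and you would need to know the peak beforehand (or pick a fixed scale factor).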

If you have a sequence of bytes and want to convert it to audio, all you need to do is write a header in front of it. Since you mentioned that you can use MATLAB, I would recommend the wavwrite command. It is simple, tried and tested, and excellent for prototyping. Here is the link to the documentation:
http://www.mathworks.in/help/matlab/ref/wavwrite.html
Here are some steps you may need to take in case you are using wavwrite.
- Since your input data is floating point, scale the data in your file to within a range of [-1, 1].
- Once the data is scaled, plug and chug into the function call.
- Play the wav file using the wavplay command.

WAV allows for floating-point sample data and a wide range of sample-rates (1Hz to 4.2GHz in 1Hz increments, if memory serves).
You don't need to bother with converting to integer values. Just set up the WAV file's header appropriately and write 32-bit floats as binary data in the data section.
From a storage perspective, the 10MHz sample-rate is no problem for a WAV file. Playback, however, will require conversion to something the hardware can handle. The upper limit these days is typically 96 or 192 kHz.
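A minimal sketch of such a float WAV, written by hand in Python with struct. It assumes the simplest layout (a 16-byte fmt chunk with format tag 3, IEEE float); stricter readers may also expect a "fact" chunk for non-PCM data, which this sketch leaves out. The function name is made up for the example.

```python
import struct

WAVE_FORMAT_IEEE_FLOAT = 3  # format tag for 32-bit float samples


def float_wav_bytes(samples, sample_rate, channels=1):
    """Build a WAV file in memory that stores the samples as little-endian 32-bit floats."""
    data = struct.pack("<%df" % len(samples), *samples)
    block_align = channels * 4  # bytes per sample frame
    fmt = struct.pack(
        "<HHIIHH",
        WAVE_FORMAT_IEEE_FLOAT,
        channels,
        sample_rate,                # 32-bit unsigned field, so 10 MHz fits
        sample_rate * block_align,  # byte rate
        block_align,
        32,                         # bits per sample
    )
    riff_size = 4 + (8 + len(fmt)) + (8 + len(data))
    return (
        b"RIFF" + struct.pack("<I", riff_size) + b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt)) + fmt
        + b"data" + struct.pack("<I", len(data)) + data
    )
```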

tar.Z file format, structure, header

I am trying to figure out the file layout of a tar.Z file (a so-called .taz file, a compressed tar file). This file can be produced with the tar -Z option or with the Unix compress utility (the results are the same).
I tried to Google some documentation on this file structure, but found none. I know that this is an LZW-compressed file and that it starts with the magic number "1F 9D", but that's all I can figure out. Could someone please tell me more about the file header or anything else?
I am not interested in how to uncompress this file or which Linux command can process it. What I want to know is the internal file structure/header/format/layout.
Thank you in advance.

A .Z file is compressed using compress and can be uncompressed with uncompress (or on some machines this is called uncompress.real). This .Z file can hold any data. .tar.Z or .taz is just a .tar file that is compressed with compress.
The first 2 bytes (MAGIC_1 and MAGIC_2) are used to check if the .Z file really is a .Z file, and not something else with accidentally the same extension. These bytes are hardcoded in the sources.
The third byte is a settings byte and holds 2 values:
The most significant bit is the block mode.
The last 5 bits indicate the maximum size of the code table (the code table is used for lzw compression).
From the original code: BLOCK_MODE=0x80; byte3=(BIT|BLOCK_MODE); and BIT is in an if/else block where it is 12..16.
If block mode is turned on, an entry is added to the code table at position 256 (remember that 0..255 are filled with the values 0..255), and it holds the CLEAR code. Whenever the CLEAR code is read from the data stream, the code table has to be reset to its initial state (so it contains only 0..256).
The maximum code size indicates how many bits wide the codes in the code table can get. When the maximum is hit, no more entries are added to the code table. So if the maximum code size is 0b00001100, codes are at most 12 bits wide, for a maximum of 2^12=4096 entries.
The highest value used by compress is 16 bits. That means 2 bits in this settings byte are unused.
After these 3 bytes the raw LZW data starts. Because the LZW table starts at 9 bits, the 4th byte will be the same as the first byte of the input (in case of a .tar.Z file, or taz file, this byte will be the first byte of the uncompressed .tar file).
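The 3-byte header described above could be read in Python roughly like this. It is a sketch based only on the layout given in this answer; the function name and the returned dictionary keys are made up for the example.

```python
def parse_z_header(data: bytes) -> dict:
    """Decode the 3-byte header of a compress (.Z) stream."""
    if len(data) < 3 or data[0] != 0x1F or data[1] != 0x9D:
        raise ValueError("not a compress (.Z) stream")
    settings = data[2]
    max_bits = settings & 0x1F              # lowest 5 bits: maximum code width
    return {
        "block_mode": bool(settings & 0x80),  # top bit: CLEAR code at 256 enabled
        "max_bits": max_bits,
        "max_codes": 1 << max_bits,           # e.g. 2^12 = 4096 for 12 bits
        "lzw_data_offset": 3,                 # raw LZW data starts right after
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(parse_z_header(f.read(3)))
```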

A tar.Z file is just a compressed tar file, so you will only find the 1F 9D magic number telling you to uncompress it.
When uncompressed you can read the tar file header:
http://www.fileformat.info/format/tar/corion.htm

Q: this file can be produced with tar -Z option or using unix compress utility(result are same)
A: Yes. "tar -cvf myfile.tar myfiles; compress myfile.tar" is equivalent to using "-Z". An even better choice is often "j" (using bzip2 instead of compress).
Q: What is the layout of a tar file?
A: There are many references, and much freely available source. For example:
http://en.wikipedia.org/wiki/Tar_%28file_format%29
Q: What is the format of a Unix compressed file?
A: Again: many references; easy to find sample source code:
http://en.wikipedia.org/wiki/Compress
For a compressed tar file such as .taz you'll need both formats: you must first uncompress it, then untar it. The "tar" utility will do both for you, automagically :)

Parallelization of PNG file creation with C++, libpng and OpenMP

I am currently trying to implement a PNG encoder in C++ based on libpng that uses OpenMP to speed up the compression process.
The tool is already able to generate PNG files from various image formats.
I uploaded the complete source code to pastebin.com so you can see what I have done so far: http://pastebin.com/8wiFzcgV
So far, so good! Now, my problem is to find a way to parallelize the generation of the IDAT chunks containing the compressed image data. Usually, the libpng function png_write_row gets called in a for-loop with a pointer to the struct that contains all the information about the PNG file and a row pointer with the pixel data of a single image row.
(Lines 114-117 in the Pastebin file)
//Loop through image
for (i = 0, rp = info_ptr->row_pointers; i < png_ptr->height; i++, rp++) {
png_write_row(png_ptr, *rp);
}
Libpng then compresses one row after another and fills an internal buffer with the compressed data. As soon as the buffer is full, the compressed data gets flushed in an IDAT chunk to the image file.
My approach was to split the image into multiple parts and let one thread compress rows 1 to 10, another thread rows 11 to 20, and so on. But as libpng is using an internal buffer, it is not as easy as I thought at first :) I somehow have to make libpng write the compressed data to a separate buffer for every thread. Afterwards I need a way to concatenate the buffers in the right order so I can write them all together to the output image file.
So, does someone have an idea how I can do this with OpenMP and some tweaking to libpng? Thank you very much!

This is too long for a comment but is not really an answer either--
I'm not sure you can do this without modifying libpng (or writing your own encoder). In any case, it will help if you understand how PNG compression is implemented:
At the high level, the image is a set of rows of pixels (generally 32-bit values representing RGBA tuples).
Each row can independently have a filter applied to it -- the filter's sole purpose is to make the row more "compressible". For example, the "sub" filter makes each pixel's value the difference between it and the one to its left. This delta encoding might seem silly at first glance, but if the colours between adjacent pixels are similar (which tends to be the case) then the resulting values are very small regardless of the actual colours they represent. It's easier to compress such data because it's much more repetitive.
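To make the "sub" filter concrete, here is a small Python sketch. It assumes bpp is the number of bytes per pixel (4 for 8-bit RGBA) and works byte-wise modulo 256; the function names are made up for the example.

```python
def sub_filter(row: bytes, bpp: int) -> bytes:
    """Replace each byte by its difference from the byte one pixel to the left."""
    return bytes(
        (row[i] - (row[i - bpp] if i >= bpp else 0)) & 0xFF
        for i in range(len(row))
    )


def sub_unfilter(filtered: bytes, bpp: int) -> bytes:
    """Undo sub_filter by adding back the already reconstructed left neighbour."""
    out = bytearray()
    for i, value in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        out.append((value + left) & 0xFF)
    return bytes(out)
```

Two similar RGBA pixels such as (10, 20, 30, 255) and (12, 22, 32, 255) turn into (10, 20, 30, 255) followed by (2, 2, 2, 0), which is the kind of small, repetitive data that compresses well.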
Going down a level, the image data can be seen as a stream of bytes (rows are no longer distinguished from each other). These bytes are compressed, yielding another stream of bytes. The compressed data is arbitrarily broken up into segments (anywhere you want!) written to one IDAT chunk each (along with a little bookkeeping overhead per chunk, including a CRC checksum).
The lowest level brings us to the interesting part, which is the compression step itself. The PNG format uses the zlib compressed data format. zlib itself is just a wrapper (with more bookkeeping, including an Adler-32 checksum) around the real compressed data format, deflate (zip files use this too). deflate supports two compression techniques: Huffman coding (which reduces the number of bits required to represent some byte-string to the optimal number given the frequency that each different byte occurs in the string), and LZ77 encoding (which lets duplicate strings that have already occurred be referenced instead of written to the output twice).
The tricky part about parallelizing deflate compression is that in general, compressing one part of the input stream requires that the previous part also be available in case it needs to be referenced. But, just like PNGs can have multiple IDAT chunks, deflate is broken up into multiple "blocks". Data in one block can reference previously encoded data in another block, but it doesn't have to (of course, it may affect the compression ratio if it doesn't).
So, a general strategy for parallelizing deflate would be to break the input into multiple large sections (so that the compression ratio stays high), compress each section into a series of blocks, then glue the blocks together (this is actually tricky since blocks don't always end on a byte boundary -- but you can put an empty non-compressed block (type 00), which will align to a byte boundary, in-between sections). This isn't trivial, however, and requires control over the very lowest level of compression (creating deflate blocks manually), creating the proper zlib wrapper spanning all the blocks, and stuffing all this into IDAT chunks.
If you want to go with your own implementation, I'd suggest reading my own zlib/deflate implementation (and how I use it) which I expressly created for compressing PNGs (it's written in Haxe for Flash but should be comparatively easy to port to C++). Since Flash is single-threaded, I don't do any parallelization, but I do split the encoding up into virtually independent sections ("virtually" because there's the fractional-byte state preserved between sections) over multiple frames, which amounts to largely the same thing.
Good luck!

I finally managed to parallelize the compression process.
As mentioned by Cameron in the comment to his answer, I had to strip the zlib header from the zstreams to combine them. Stripping the footer was not required, as zlib offers an option called Z_SYNC_FLUSH which can be used for all chunks (except the last one, which has to be written with Z_FINISH) to write up to a byte boundary. So you can simply concatenate the stream outputs afterwards. Finally, the adler32 checksum has to be calculated over the data of all threads and copied to the end of the combined zstreams.
If you are interested in the result you can find the complete proof of concept at https://github.com/anvio/png-parallel
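The same concatenation idea can be sketched with Python's zlib module. This is not the code from the linked repository: it uses raw deflate streams (negative wbits) so there is no per-section zlib header to strip, writes a single standard zlib header by hand, and computes the Adler-32 over the whole input in one pass instead of combining per-thread checksums. The function names and the section size are assumptions.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor


def _deflate_section(job):
    """Compress one section as raw deflate, ending on a byte boundary."""
    section, is_last = job
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # -15 = raw deflate, no header/trailer
    out = comp.compress(section)
    # Z_SYNC_FLUSH pads to a byte boundary without ending the stream;
    # only the last section emits the final block.
    out += comp.flush(zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH)
    return out


def parallel_zlib(data: bytes, section_size: int = 1 << 20, workers: int = 4) -> bytes:
    """Compress sections independently and glue them into one valid zlib stream."""
    sections = [data[i:i + section_size] for i in range(0, len(data), section_size)] or [b""]
    jobs = [(s, i == len(sections) - 1) for i, s in enumerate(sections)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(_deflate_section, jobs))  # map keeps section order
    header = b"\x78\x9c"                                # standard zlib header
    trailer = struct.pack(">I", zlib.adler32(data))     # checksum of the uncompressed data
    return header + b"".join(parts) + trailer
```

Because each section starts with an empty dictionary, the result is somewhat larger than a single-pass compression; larger sections reduce that cost. For a PNG, the returned bytes would then be split into IDAT chunks.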

How to determine .mp3 bit rate without downloading it?

I have a list of .mp3 files over the web and I would like to get the highest quality file.
For multimedia files, I take quality to mean bit rate.
The bit rate itself should be found in the file's headers. If not, length of the audio track could be used too. (Filesize / Track Length = Bit Rate)
These things would be easy if I had these files locally, but I would like to fetch this information over HTTP and determine which file has the highest quality.
Can I get an audio track's length out of HTTP headers? If not, is it possible to fetch only the bits that describes the length/bit rate instead of downloading the whole file?
I'm writing the code in python, but the question is quite general so I'm not tagging it as a python question.

Assuming that the remote server is behaving nicely, you could issue a HEAD request to the file and check the contents of the Content-Length header field. It doesn't give you track length or bit rate but you can get the size of the file.
EDIT: MP3s consist of multiple frames, each of which can have a different bit rate (VBR). Track length is calculated from the bit rate of each of these frames, rather than the length itself being stored. If you want the bit rate reliably, you'd need to get the whole file and read the bit rate of each of the frames. It may be possible to grab the first few KB of the file and read the bit rate from the first frame, but this is not always at the same point in the file (e.g. due to the position of the ID3 tag etc.).
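A sketch of the "grab the first few KB" approach in Python. It assumes the server honours HTTP Range requests, that the file is MPEG-1 Layer III, and that any ID3v2 tag fits within the fetched prefix; for VBR files it only reports the first frame's bit rate. The function names are made up for the example.

```python
import urllib.request

# Bit rates in kbit/s for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
MPEG1_L3_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def fetch_prefix(url: str, nbytes: int = 4096) -> bytes:
    """Download only the first nbytes of a remote file via a Range request."""
    req = urllib.request.Request(url, headers={"Range": "bytes=0-%d" % (nbytes - 1)})
    with urllib.request.urlopen(req) as resp:
        return resp.read(nbytes)


def first_frame_kbps(data: bytes):
    """Return the bit rate of the first MPEG-1 Layer III frame found, or None."""
    start = 0
    if data[:3] == b"ID3" and len(data) >= 10:
        # Skip the ID3v2 tag: 10-byte header plus a 4 x 7-bit size field.
        s = data[6:10]
        start = 10 + ((s[0] << 21) | (s[1] << 14) | (s[2] << 7) | s[3])
    for i in range(start, len(data) - 3):
        # 0xFF then 1111101x: frame sync, MPEG-1, Layer III.
        if data[i] == 0xFF and (data[i + 1] & 0xFE) == 0xFA:
            index = data[i + 2] >> 4
            if 1 <= index <= 14:
                return MPEG1_L3_KBPS[index]
    return None
```

A HEAD request for Content-Length, as suggested above, can still be combined with this to compare files.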

How to extract album cover from a mp3 file without download the whole file

I'm using TagLib for extraction, but I need to know how many bytes I should download from the MP3 file in order to be able to extract the cover with TagLib.
I've looked into the MP3 specs, but I didn't find anything relevant.

In 99% of cases, if you pull down the first 10 bytes, you'll have the ID3v2 header, whose last 4 bytes hold the size of the ID3v2 tag, which will contain the cover art.
The ID3v2 size is a "sync-safe integer", but TagLib has a function to decode that to a normal integer:
TagLib::ID3v2::SynchData::toUInt(const ByteVector &data)
So, basically the algorithm would be:
Grab the first 10 bytes
Sanity check those bytes that they start with "ID3"
Read the last 4 bytes of those 10 and pass them through the function above to get the ID3v2 tag length
Grab that much additional data from the stream
Pass that block of data to TagLib
Extract the cover art
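The size calculation in those steps can be sketched in Python as follows. The sync-safe decoding (four bytes, 7 usable bits each) is done by hand here rather than through TagLib, and the function names are made up for the example.

```python
def id3v2_tag_size(header: bytes):
    """Return the ID3v2 tag body size from the 10-byte header, or None if absent."""
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    size_bytes = header[6:10]
    if any(b & 0x80 for b in size_bytes):
        return None  # the top bit of each size byte must be clear
    return (
        (size_bytes[0] << 21)
        | (size_bytes[1] << 14)
        | (size_bytes[2] << 7)
        | size_bytes[3]
    )


def range_header_for_tag(header: bytes):
    """Build the HTTP Range value that covers the header plus the whole tag."""
    size = id3v2_tag_size(header)
    if size is None:
        return None
    return "bytes=0-%d" % (10 + size - 1)
```

The resulting Range value can be sent in a second request, and the returned block handed to the tag library.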

The MP3 specification doesn't really define metadata like song name or album art. That is part of ID3: ID3v1 sits at the end of the file, while ID3v2 (which can hold album art) is normally placed at the start.
