C/C++: Streaming MP3

In a C++ program, I receive multiple chunks of PCM data, which I currently encode into MP3 files using libmp3lame. The PCM chunks are produced one after another. However, instead of waiting until the PCM stream is finished, I'd like to encode the data as early as possible into multiple MP3 chunks, so the client can either play the pieces individually or append them together.
As far as I understand, MP3 files consist of frames; a file can be split along frame boundaries, and the pieces can be published in isolation. Moreover, no length information is needed in advance, so the format is suitable for streaming. However, when I use libmp3lame to generate MP3 files from partial data, the products cannot be interpreted by audio players after being concatenated. I deactivated the bit reservoir, so I expect the frames to be independent.
Based on this article, I wrote a Python script that extracts and lists the frames of an MP3 file. I generated an MP3 file with libmp3lame by first collecting the whole PCM data and then applying libmp3lame. Then I took the first n frames from this file and put them into another file. But the result was unplayable as well.
How is it possible to encode only chunks of audio, which library is suitable for this, and what is the minimum size of a chunk?

I examined the source code of LAME, and the file lame_main.c helped me arrive at a solution. This file implements the lame command-line utility, which can also encode multiple WAV files so that they can be appended into a single MP3 file without gaps.
My mistake was to initialize LAME every single time I called my encode function, i.e., once for each new segment. This causes short interruptions in the output MP3. Initializing LAME once and reusing it for subsequent calls already solved the problem. Additionally, I call lame_init_bitstream at the start of encode and use lame_set_nogap_currentindex and lame_set_nogap_total appropriately. Now the output fragments can be combined seamlessly.
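A minimal sketch of that setup, assuming 16-bit interleaved stereo PCM at 44.1 kHz; encoder_setup and encode_segment are illustrative names, not part of the LAME API, and error handling is omitted:

    #include <lame/lame.h>
    #include <vector>

    static lame_global_flags *gfp = nullptr;

    void encoder_setup(int total_segments) {
        gfp = lame_init();                        // initialize ONCE, not per segment
        lame_set_in_samplerate(gfp, 44100);
        lame_set_num_channels(gfp, 2);
        lame_set_disable_reservoir(gfp, 1);       // keep frames independent
        lame_set_nogap_total(gfp, total_segments);
        lame_init_params(gfp);
    }

    std::vector<unsigned char> encode_segment(const short *pcm, int nsamples, int index) {
        // worst-case output size suggested by the LAME documentation
        std::vector<unsigned char> mp3(nsamples * 5 / 4 + 7200);
        lame_set_nogap_currentindex(gfp, index);
        lame_init_bitstream(gfp);                 // fresh bitstream for this fragment
        int n = lame_encode_buffer_interleaved(gfp, const_cast<short *>(pcm),
                                               nsamples, mp3.data(), (int)mp3.size());
        n += lame_encode_flush_nogap(gfp, mp3.data() + n, (int)mp3.size() - n);
        mp3.resize(n);
        return mp3;
    }

Each returned fragment is then a plain run of MP3 frames, and appending the fragments byte for byte yields one gapless stream.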

Related

Concatenate .wav files in C++

How can I, using a function, library, whatever I have to, concatenate two .wav files? The input should be the absolute paths, and the output an audio file created and placed (not just played) somewhere; it doesn't really matter where.
I am writing a Mac command-line application in Xcode 6.
The .wav file format is very simple, consisting of a fixed header that defines the audio's properties (namely the endianness, the number of channels, and the sampling rate) followed by the raw sample data. It is documented all over the intertubes.
Off the top of my head I don't recall whether any common library offers a convenient way to do this (it's worth looking through libsndfile's API documentation for something that fits the bill).
In any case, it shouldn't be too tough to read the headers of both WAV files, check their formats, and then create the output file. If both WAV files have the same endianness, number of channels, and sampling rate, the procedure is trivial; otherwise you will have to resample/remix at least one of the files.
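A sketch of the trivial case, assuming both inputs use the canonical 44-byte PCM WAV header and a little-endian host; real files may carry extra chunks, so treat this as an illustration rather than a robust parser:

    #include <cstdint>
    #include <cstring>
    #include <fstream>
    #include <iterator>
    #include <vector>

    bool concat_wav(const char *a, const char *b, const char *out) {
        std::ifstream fa(a, std::ios::binary), fb(b, std::ios::binary);
        char ha[44], hb[44];
        if (!fa.read(ha, 44) || !fb.read(hb, 44)) return false;
        // bytes 20..35 hold format tag, channels, sample rate, block align, bit depth
        if (std::memcmp(ha + 20, hb + 20, 16) != 0) return false;
        std::vector<char> da((std::istreambuf_iterator<char>(fa)), {});
        std::vector<char> db((std::istreambuf_iterator<char>(fb)), {});
        uint32_t dataSize = uint32_t(da.size() + db.size());
        uint32_t riffSize = 36 + dataSize;
        std::memcpy(ha + 4, &riffSize, 4);    // patch RIFF chunk size
        std::memcpy(ha + 40, &dataSize, 4);   // patch data chunk size
        std::ofstream fo(out, std::ios::binary);
        fo.write(ha, 44);
        fo.write(da.data(), da.size());
        fo.write(db.data(), db.size());
        return fo.good();
    }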
There is a very simple, lightweight, and mature open-source C API library for reading and writing several common audio file formats. I haven't worked with it for a while, but if I remember correctly, it has routines for opening a sound file for writing, seeking to the end, appending data from another file, and updating the header. I hope this helps.
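libsndfile is one library matching this description (my assumption, since no name is given above). A sketch of the append operation using its actual API, assuming both files share a channel count and sample rate; error handling is omitted:

    #include <sndfile.h>
    #include <vector>

    bool concat_with_sndfile(const char *a, const char *b, const char *dst) {
        SF_INFO ia = {}, ib = {};
        SNDFILE *fa = sf_open(a, SFM_READ, &ia);
        SNDFILE *fb = sf_open(b, SFM_READ, &ib);
        if (!fa || !fb) return false;
        SF_INFO io = ia;                                  // reuse the first file's format
        SNDFILE *out = sf_open(dst, SFM_WRITE, &io);
        std::vector<short> buf(4096 * ia.channels);
        sf_count_t n;
        while ((n = sf_readf_short(fa, buf.data(), 4096)) > 0)
            sf_writef_short(out, buf.data(), n);          // copy file A frame by frame
        while ((n = sf_readf_short(fb, buf.data(), 4096)) > 0)
            sf_writef_short(out, buf.data(), n);          // then append file B
        sf_close(fa); sf_close(fb); sf_close(out);
        return true;
    }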

Are wave files a better candidate for steganography than mp3? If so, why?

I read about the WAV file format and found a lot of steganography projects based on it, but I didn't find nearly as many projects based on mp3, even though mp3 is found more frequently on the web than wav.
The wav format is uncompressed audio sitting behind a minimal header. You can change a few low-order bits in this format without significantly affecting the audio; you will not break the file format, and a listener will not be able to tell the difference between the original file and the modified one.
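A sketch of the idea on 16-bit PCM samples: overwriting the least significant bit changes each amplitude by at most one part in 32768, which is inaudible. embed_bits is a hypothetical helper, not a library function:

    #include <cstdint>
    #include <vector>

    // Hide one message bit in the least significant bit of each sample.
    void embed_bits(std::vector<int16_t> &samples, const std::vector<bool> &message) {
        for (size_t i = 0; i < message.size() && i < samples.size(); ++i)
            samples[i] = int16_t((samples[i] & ~1) | (message[i] ? 1 : 0));
    }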
The mp3 format is compressed audio. If you change bits in an mp3, you run risks:
You modify a header, and the audio no longer plays back.
You modify the audio data, and a listener can tell the file is weird. The audio is compressed, so changes in the audio data get magnified upon decompression.

Parsing char* data of h.264

I have a char* array of binary data.
It is a binary media stream encoded with H.264.
It has the following structure: ...
stream_header is a 64-byte struct.
I've already done reinterpret_cast<stream_header*>(charArray), where charArray represents the first 64 bytes of the stream, and I successfully get all the header data. In this header there is an nLength variable, which tells us how many bytes of media data are in the next stream_data.
For example, 1024 bytes.
I read the next 1024 bytes into a char* data array, and here my question begins: how can I get the set of video frames from this data (the header contains the resolution of these frames) and save them as *.jpg files (1.jpg, 2.jpg, 3.jpg, ...)?
Maybe someone has already done something similar? Help me, please.
You need an H.264 decoder library; the best option is FFmpeg.
But even then the library is a bit complicated to use, although decoding is simpler since you have fewer options to worry about.
Do you really need to do this in a program? It's very simple to use the ffmpeg executable to save a video as JPEGs.
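Something along these lines (input.264 is a placeholder for your stream file; the numbered pattern tells ffmpeg to write one JPEG per frame):

    ffmpeg -i input.264 frame%04d.jpg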
If you just want to get a sequence of JPEGs from a video file, GStreamer can do that among many other things.
If you want to write code from scratch to convert H.264 video into JPEGs, let me warn you that there are many hundreds of pages of specification documents and some very serious mathematics to understand and then implement. It would be months of work for a reasonably skilled programmer/mathematician. Understanding the MP4 container format is the easy part; the video compression will blow your mind.

MJPEG Video from IP Camera too fast

I'm just trying to read a video Stream out of an IP Camera (Basler BIP-1280c).
The stream I want is stored in a buffer on the camera, is 40 seconds long, and is encoded as MJPEG.
Now, if I access the stream via my web browser, it shows me the 40 seconds without any problems.
But actually I need an application which is capable of downloading and saving the stream by itself.
The camera is accessed via HTTP, so I am using libcurl. This works fine, and I can also download the stream without any trouble. I have chosen to save the stream data into an *.avi file (hope that's correct…?).
But now to the problem: I can open the video (tried with Totem Video Player and VLC) and view everything that has been recorded, BUT it's way too fast. The whole video lasts about 5 seconds (instead of 40). Is there anything in an MJPEG header where information like the total video length or the fps can be put? I mean, there must be some information missing for the video players, since they play it way too fast.
Update:
As suggested in the answers, I opened the file with a hex editor, and what I found was this:
--myboundary..Content-Type: image/jpeg..Content-Length: 39050.........*Exif..II*...............V...........................2...................0210................FrameNr=000398732
6.AOI=(0800x0720)#(0240,0060)/(1280x0720).Motion=00000 (no)
[00000 | 00000 | 00000 | 00000 | 00000].Alarm=0000 (no) .IO
=000.RtTrigger=0...Basler..BIP2-1280c..1970:01:05 23:08:10.8
98286......JFIF.................................. ....&"((
This header recurs throughout the file (followed by a lot of bytes of binary data). This is actually okay, since I read in the camera manual that every MJPEG picture gets this header.
More interesting is the JFIF in the last line. As suggested in the answers, this may be the indicator of the file format. But AFAIK, JFIF is a single-picture format just like jpg. So does this mean that the whole video file is just a bunch of "brainlessly" chained pictures? And my player just assumes that it should show these pictures one after another, without any knowledge about the frame rate?
There is not a single format to use with MJPEG. From Wikipedia:
[...] there is no document that defines a single exact format that is universally recognized as a complete specification of "Motion JPEG" for use in all contexts.
The formats differ by vendor. My advice would be to closely inspect the file you download. Check if it is really an AVI container. (Some cameras can send the frames wrapped in a MIME container).
Once the container format is clear, you can check that container's documentation and compare against a file which has that format and the desired fps. Then you can start adjusting your downloaded file to get the desired effect.
You might also find this project useful: http://mjpeg.sourceforge.net/
Edit:
According to your sample data, your camera sends the frames packed into a MIME container. (The first line is the boundary, then come the headers until you encounter an empty line, then the image data itself, followed by the boundary again, and so on.)
These are JPEG files, as the header suggests: image/jpeg. JFIF is the standard file format used to store JPEG data.
I recommend that you:
Extract the contents of the file into multiple jpeg files (with munpack for instance), then
use ffmpeg or mplayer to create a movie file out of the series of jpegs.
This way you can specify the desired frame rate too.
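A sketch of the extraction step in C++, assuming the stream follows the structure shown in the hex dump above (a boundary line, headers including Content-Length, an empty line, then exactly that many bytes of JPEG data); split_mjpeg is an illustrative name and error handling is omitted:

    #include <cstdio>
    #include <fstream>
    #include <string>
    #include <vector>

    void split_mjpeg(const char *path) {
        std::ifstream in(path, std::ios::binary);
        std::string line;
        int frame = 0;
        size_t length = 0;
        while (std::getline(in, line)) {
            if (line.rfind("Content-Length:", 0) == 0)
                length = std::stoul(line.substr(15));   // size of the next JPEG
            if ((line.empty() || line == "\r") && length > 0) {
                std::vector<char> jpeg(length);         // headers ended: read the image
                in.read(jpeg.data(), length);
                char name[32];
                std::snprintf(name, sizeof name, "%d.jpg", ++frame);
                std::ofstream(name, std::ios::binary).write(jpeg.data(), length);
                length = 0;
            }
        }
    }

Afterwards, something like ffmpeg -framerate 25 -i %d.jpg out.mp4 (with the frame rate chosen to match the camera) turns the series into a movie that plays at the correct speed.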
Things can get more complicated if the camera dynamically changes the AOI (area of interest), meaning it can send only a smaller part of the image where a change occurred. But you should first check whether the simple approach works.
On un*x systems (Linux, OS X, ...), you can use the file command-line tool to make a (usually good) guess about the file format.
--myboundary is an indication that the stream is regular M-JPEG streamed as multipart content over HTTP. There is no well-known file format which can hold this stream "as is" and be playable (that is, if you rename it to .avi it is not supposed to play back).
The format itself is a sequence of (boundary, subheader, JPEG image), (boundary, subheader, JPEG image), ... etc. The stream does not carry time stamps, so playback speed depends entirely on the player.

How to write mp3 frames from PCM data (C/C++)?

How do I write mp3 frames (not full mp3 files with ID3 etc.) from PCM data?
I have something like PCM data (for example, 100 MB) and I want to create an array of mp3 frames from that data. How do I perform such an operation? (For example, with LAME or any other open-source encoder.)
What I need:
Open-source libs for encoding.
Tutorials and blog articles on how to do it.
You should be able to use LAME. It has a -t command-line switch that turns off the INFO header in the output (otherwise present in frame 0). If that still leaves too much bookkeeping data, you should be able to write a separate tool to strip it away.
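If you drive LAME through its library API rather than the executable, the equivalent (to my knowledge) is to disable the tag frames before encoding; make_headerless_encoder is an illustrative name:

    #include <lame/lame.h>

    lame_global_flags *make_headerless_encoder() {
        lame_global_flags *gfp = lame_init();
        lame_set_bWriteVbrTag(gfp, 0);            // no Xing/INFO tag in frame 0
        lame_set_write_id3tag_automatic(gfp, 0);  // no automatic ID3 tags
        lame_init_params(gfp);
        return gfp;                               // output will be bare MP3 frames
    }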
You are already on the right track: use the LAME external executable, or any other shell-invoked encoder.
Building MPEG audio frames, where your layer of interest is 3, is not easy to do from scratch. There are compression steps, fast Fourier transforms followed by quantization, which are complex and tediously long to explain. The amount of work required for a developer to build this from scratch is very large.
There are programmatic C and C++ MP3 encoding libs, but you will either be asked for fees, be left with very limited support, or have very limited interfacing options.
Go with LAME and study their wiki.