Frequencies in mp3 - c++

I'm trying to get information about the frequency characteristics of a music file in MP3 format.
How can I get the frequencies from an MP3 file with C++? And what kind of data does the FFTW library work with?

This is a two-step process. First, you need to decode the MP3 file into a data structure of your choice in your C++ program. Most MP3 files are stereo, so you will end up with two arrays/vectors, one per channel. FFTW works in two steps as well: first you create a plan describing your data (number of FFT points, forward or inverse direction, etc.), then you execute that plan on your input data to compute the frequency transform. FFTW operates on arrays of floating-point samples, so feed it the decoded PCM data rather than the raw MP3 bytes. You can then take the magnitude of the resulting float/double complex array. Link to the FFTW tutorial: http://www.fftw.org/fftw2_doc/fftw_2.html
Having said that, you do not need to go through these steps manually. There are many open source music information retrieval libraries that can do this for you, such as Marsyas. You can also use audio libraries such as PortAudio to get the audio data, and then feed that data to signal processing libraries like openvsip or LiquidDSP to get the results you want.
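To make "take the magnitude of the transform" concrete, here is a minimal sketch. It deliberately uses a naive O(N^2) DFT written with the standard library instead of FFTW, so it compiles without extra dependencies; FFTW would give the same values much faster. The names magnitudeSpectrum and binFrequency are made up for this example, and the input is assumed to be one channel of already decoded PCM samples.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^2) DFT of a real signal; returns the magnitudes of bins 0..N/2.
// Stand-in for an FFT library call (illustrative name, not an FFTW function).
std::vector<double> magnitudeSpectrum(const std::vector<double>& samples) {
    assert(!samples.empty());
    const std::size_t n = samples.size();
    const double pi = std::acos(-1.0);
    std::vector<double> mags(n / 2 + 1);
    for (std::size_t k = 0; k < mags.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += samples[t] * std::polar(1.0, angle);
        }
        mags[k] = std::abs(sum);  // magnitude of the complex bin
    }
    return mags;
}

// Frequency in Hz that bin k represents for an N-point transform.
double binFrequency(std::size_t k, std::size_t n, double sampleRate) {
    return static_cast<double>(k) * sampleRate / static_cast<double>(n);
}
```

For a block of N samples taken at sample rate fs, the bin with the largest magnitude corresponds to the dominant frequency k * fs / N. In practice you would run this on short windows (for example 1024 or 2048 samples) of each channel rather than on the whole file.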

Related

C/C++: Streaming MP3

In a C++ program, I get multiple chunks of PCM data and I am currently using libmp3lame to encode this data into MP3 files. The PCM chunks are produced one after another. However, instead of waiting until the PCM data stream has finished, I'd like to encode data as early as possible into multiple MP3 chunks, so the client can either play the pieces or append them together.
As far as I understand, MP3 files consist of frames, and files can be split along frame boundaries and published in isolation. Moreover, no length information is needed in advance, so the format is suitable for streaming. However, when I use libmp3lame to generate MP3 files from partial data, the result cannot be interpreted by audio players after the pieces are concatenated. I deactivated the bit reservoir, so I expect the frames to be independent.
Based on this article, I wrote a Python script that extracts and lists frames from MP3 files. I generated an MP3 file with libmp3lame by first collecting the whole PCM data and then applying libmp3lame. Then, I took the first n frames from this file and put them into another file. But the result would be unplayable as well.
How is it possible to encode only chunks of audio, which library is suitable for this, and what is the minimum size of a chunk?
I examined the source code of lame and the file lame_main.c helped me to come to a solution. This file implements the lame command-line utility, which also can encode multiple wav files, so they can be appended to a single mp3 file without gaps.
My mistake was to initialize lame every time I called my encode function, i.e. once for each new segment. This causes short interruptions in the output mp3. Initializing lame once and re-using it for subsequent calls solved the problem. Additionally, I call lame_init_bitstream at the start of encode and use lame_set_nogap_currentindex and lame_set_nogap_total appropriately. Now the output fragments can be combined seamlessly.
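For checking where the frame boundaries of a generated fragment lie (the job of the Python script mentioned in the question), here is a minimal C++ sketch that reads one 4-byte frame header. It is an illustration under stated assumptions, not part of LAME: it only handles MPEG-1 Layer III, the names FrameInfo and parseMpeg1Layer3Header are made up, and it ignores ID3 tags and the optional Xing/Info frame. The frame size follows the usual Layer III rule of 144 * bitrate / sample rate, plus one byte if the padding bit is set.

```cpp
#include <cassert>
#include <cstdint>

struct FrameInfo {
    int bitrateKbps;
    int sampleRateHz;
    int frameBytes;  // total frame size in bytes, header included
};

// Parses a 4-byte MPEG-1 Layer III frame header (hypothetical helper).
// Returns false if the bytes are not such a header.
bool parseMpeg1Layer3Header(const std::uint8_t* h, FrameInfo& out) {
    assert(h != nullptr);
    // Index 0 means "free format" and index 15 is invalid; both are rejected here.
    static const int kBitrates[16] = {0,   32,  40,  48,  56,  64,  80,  96,
                                      112, 128, 160, 192, 224, 256, 320, 0};
    static const int kSampleRates[4] = {44100, 48000, 32000, 0};

    if (h[0] != 0xFF || (h[1] & 0xE0) != 0xE0) return false;  // 11-bit sync word
    if (((h[1] >> 3) & 0x03) != 0x03) return false;           // MPEG version 1
    if (((h[1] >> 1) & 0x03) != 0x01) return false;           // Layer III

    const int bitrate = kBitrates[(h[2] >> 4) & 0x0F];
    const int sampleRate = kSampleRates[(h[2] >> 2) & 0x03];
    if (bitrate == 0 || sampleRate == 0) return false;

    const int padding = (h[2] >> 1) & 0x01;
    out.bitrateKbps = bitrate;
    out.sampleRateHz = sampleRate;
    out.frameBytes = 144 * bitrate * 1000 / sampleRate + padding;
    return true;
}
```

Walking a file is then a loop: parse the header at the current offset, advance by frameBytes, and repeat. If the next offset does not hold a valid header, the fragment was probably cut in the middle of a frame.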

how to include time from c++ code to ASCII vtk files for paraView animation

I print out data from C/C++ simulation code to vtk files at each time step. I create numbered data files (e.g. data.000.vtk, data.001.vtk, …).
I am having trouble including the time of each calculation step in these vtk files, which contain structured grid data, so that I can see the time (in seconds) in ParaView during the animation.
Is that possible in any way? Your help is really appreciated.
Unfortunately, there is no way to record the simulation time in the legacy VTK file format. Thus, if you want to pass the time from the simulation to ParaView, you will need to use a different file format.
I think the easiest way to record the simulation time is to use a ParaView Data (PVD) file. A PVD file is a simple XML file that captures metadata about a group of data files, and time value information is one of the things that can be captured. A brief description of the PVD format is given at http://www.paraview.org/Wiki/ParaView/Data_formats#PVD_File_Format. The example on that page has "timestep" values that are integers starting at 0, but you can replace them with any sequence of floating point numbers.
The PVD file does not hold the data itself. Rather, it points to other files that have the actual data. The data files it points to have to be in the newer XML-based VTK file formats (vti, vtr, or vts depending on the nature of your structured data). The XML-based VTK file format is also documented in http://www.vtk.org/VTK/img/file-formats.pdf (after the documentation for the legacy VTK format).
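As a rough sketch of what the simulation code could emit, the function below builds the text of a PVD file from a list of simulation times and the matching data file names. The function name buildPvd and the file names used with it are assumptions for illustration; the element and attribute layout follows the example on the wiki page linked above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Builds the contents of a PVD collection file (hypothetical helper).
// times[i] is the simulation time in seconds of the data set stored in files[i].
std::string buildPvd(const std::vector<double>& times,
                     const std::vector<std::string>& files) {
    assert(times.size() == files.size());
    std::ostringstream out;
    out << "<?xml version=\"1.0\"?>\n"
        << "<VTKFile type=\"Collection\" version=\"0.1\" byte_order=\"LittleEndian\">\n"
        << "  <Collection>\n";
    for (std::size_t i = 0; i < times.size(); ++i) {
        out << "    <DataSet timestep=\"" << times[i]
            << "\" group=\"\" part=\"0\" file=\"" << files[i] << "\"/>\n";
    }
    out << "  </Collection>\n"
        << "</VTKFile>\n";
    return out.str();
}
```

You would write the returned string to a file such as data.pvd (for example with std::ofstream) after the last time step, or rewrite it after every step, and then open that .pvd file in ParaView instead of the individual data files.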

How to decode MP3 files? How MP3 files stores sounds?

I'm not asking about any particular language here. I want to analyse an MP3 file, so I want to get some information about the sound at a specific second (for example its tone, pitch, or frequency). How is that data stored in a single file?
Unless you have weeks (months?) available to play with it, I would recommend using an existing MP3 decoding library to pull the decoded audio out of the file. In C/C++, there's libMAD or libmpg123, as well as the Windows components. In C#, you can use NAudio or NLayer.
Once you have the decoded data, you'll need to run an FFT, DFT, or DCT over it to convert to frequency & amplitude. The FFT is probably your best bet: it is a fast algorithm for computing the DFT, so the two give the same result up to rounding error. YMMV.
Note that all three of the transforms provide amplitude values you can convert to decibel values.
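A minimal sketch of that conversion, assuming the usual amplitude convention of 20 * log10(amplitude / reference). The function name amplitudeToDb and the default reference of 1.0 (full scale for samples normalized to the range -1..1) are assumptions for this example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Converts a linear amplitude to decibels relative to `reference`.
// Zero or negative amplitudes map to negative infinity (silence).
double amplitudeToDb(double amplitude, double reference = 1.0) {
    assert(reference > 0.0);
    if (amplitude <= 0.0) {
        return -std::numeric_limits<double>::infinity();
    }
    return 20.0 * std::log10(amplitude / reference);
}
```

With this convention, halving the amplitude lowers the level by about 6 dB, and a tenfold increase raises it by 20 dB.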
There are some useful MP3 libraries that give you information about your MP3 file.
If you use C#, NAudio is one option.
http://naudio.codeplex.com/
For the first steps I recommend the program xxd and Google.
First of all, I would look at the file's binary content.
xxd -b file.mp3
Viewing it as a hex dump with the ASCII column also exposes some information.
xxd file.mp3
Those were my first steps.

DICOM File compression

My line of work requires the use of DICOM files. Each DICOM dataset consists of many .dcm files in a single directory. I am required to send these files over the network, a process which is somewhat slow due to the massive size of the files.
I am also a programmer and I was wondering what is the ideal way to compress such files? I'm talking about a compression that will be made on the local computer and later decompressed on the destination computer (namely the compression is solely for speeding up the over-the-network transfer of the file). Is there a simple way to crop the DICOM files? (the files contain imaging of an entire head, whereas I'm only interested in a small part of the head).
Thanks!
In a medical context, lossy compression is somewhere between not encouraged and forbidden. If you insist on cropping existing datasets, the standard requires you to generate at least new image & series UIDs. The standard does allow lossless compression in the form of JPEG 2000, but it is quite rare. If I had to bet, I'd say your dataset is not compressed at all.
In my experience it is significantly better to compress a medical dataset as a solid archive - that is, unify all the images into a single stream. This makes a lot of sense, as there is typically a lot of similarity between nearby images and this is the way to take advantage of that similarity (a unified compression dictionary). This is available as a command line option both to rar and gzip compressors.
Solution:
gdcmconv --jpeg uncompressed.dcm compressed.dcm
or for better compression ratio:
gdcmconv --jpegls uncompressed.dcm compressed.dcm
See:
http://gdcm.sourceforge.net/html/gdcmconv.html
I would also recommend against lossy compression; you would need to be a DICOM wizard to do it properly (see the derivation mechanism in the DICOM standard). I would also recommend against cropping the image (you would need to regenerate UIDs, get the Frame of Reference updated...).
HTH
You could use something simple like lzma compression on one end to pack up the files and send them over. This is the easiest solution, since you can grab something like gzip and pack/unpack the files easily and programmatically. This may help considerably, because transferring one large file is usually much faster than transferring many small files (a single 1GB file will transfer much faster than 10000 100KB files).
As for actually reducing the aggregate size, each .dcm file is probably a slice (if you're looking at something like MRI or CT data), and the viewer you are using reconstructs the slices into the 3d image. Cropping them isn't impossible, but parsing the DICOM format is a bit tricky. I'm not aware of any free programs that will help you parse the DICOM files, but I haven't looked for some time.
Since DICOM is a container format, the image data you are after is usually stored in a common format (such as JPEG), so if you are able to grab the relevant part of the file to extract the image data, you can use any of the loads of image processing tools available to crop the image to whatever dimensions you choose.
We have a compression router called "DICOM Shrinkinator" that can do this as it transmits the study to PACS:
http://fluxinc.ca/medical/dicom-shrinkinator/

How to write mp3 frames from PCM data (C/C++)?

How to write mp3 frames (not full mp3 files with ID3 etc) from PCM data?
I have something like PCM data (for ex 100mb) I want to create an array of mp3 frames from that data. How to perform such operation? (for ex with lame or any other opensource encoder)
What do I need:
Open Source Libs for encoding.
Tutorials and blog articles on How to do it, about etc.
You should be able to use LAME. It has a -t command line switch that turns off the INFO header in the output (otherwise present in frame 0). If that still leaves too much bookkeeping data, you should be able to write a separate tool to strip that away.
You are already on the right track: use the external LAME executable, or any other shell-invoked encoder.
Building MPEG audio frames from scratch, where your layer of interest is Layer 3, is not easy. There are compression steps, fast Fourier transforms followed by quantization, which are complex and tedious to explain. The amount of work required for a developer to build this from scratch is very large.
There are programmatic C and C++ MP3 encoding libraries, but you will either be asked for fees, be left with very limited support, or have very limited interfacing options.
Go LAME, study their wiki.