Running music as SDL_Mixer chunks - C++

Currently, SDL_Mixer has two types of sound resources: chunks and music.
Apart from the API and supported-format limitations, are there any reasons not to load and play music as an SDL_Chunk on a channel (memory, speed, etc.)?

The API is the real issue. The "music" APIs are designed to deal with streaming compressed music, while the "sound" APIs aren't. Then again, if you manage to make it work in your app, then it works.

I haven't looked at the SDL code, but my guess is that "chunks" are intended for smaller sound samples and are fully decoded and cached in memory, while "music" is streamed: not cached in memory in its entirety, but decoded and buffered as needed, on the assumption that it will mostly be played from the beginning and continuously from that point, with perhaps an occasional reset back to the start.
So the reason is memory. You don't want to decode, say, 4 minutes of a 16-bit stereo song into memory when you can decode and buffer smaller pieces of it; held fully decoded it would eat 44100 Hz * 2 bytes * 2 channels * 4 minutes * 60 sec/min == 42,336,000 bytes.
OTOH, if you have ~10 MB of RAM per minute of music to spare and you need the CPU that on-the-fly decoding would otherwise consume... you could probably use chunks.
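For illustration, a minimal sketch of the two code paths (SDL2 + SDL_mixer assumed; whether Mix_LoadWAV will fully decode a given compressed format into memory depends on the SDL_mixer build and version):

#include <SDL.h>
#include <SDL_mixer.h>

int main(int, char**) {
    SDL_Init(SDL_INIT_AUDIO);
    Mix_OpenAudio(44100, MIX_DEFAULT_FORMAT, 2, 1024);

    // "Chunk" path: the whole file is decoded into RAM up front
    // (~10 MB per minute of 44.1 kHz 16-bit stereo, as computed above).
    Mix_Chunk* chunk = Mix_LoadWAV("song.ogg");
    Mix_PlayChannel(-1, chunk, 0);               // -1 = first free channel

    // Equivalent "music" path for comparison: decoded and buffered on the fly.
    // Mix_Music* music = Mix_LoadMUS("song.ogg");
    // Mix_PlayMusic(music, -1);                 // -1 = loop forever

    SDL_Delay(5000);                             // let it play for a moment
    Mix_FreeChunk(chunk);
    Mix_CloseAudio();
    SDL_Quit();
    return 0;
}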

Related

Is there a standardized method to send JPG images over network with TCP/IP?

I am looking for a standardized approach to stream JPG images over the network. Also desirable would be a C++ programming interface, which can be easily integrated into existing software.
I am working on a GPGPU program which processes digitized signals and compresses them to JPG images. The size of the images can be defined by the user; typically the images are 1024 x 2048 or 2048 x 4096 pixels. I have written my "own" protocol, which first sends a header (image size in bytes, width, height and corresponding channel) and then the JPG data itself via TCP. After that, the receiver sends a confirmation that all data have been received and displayed correctly, so that the next image can be sent. So far so good; unfortunately, my approach reaches just 12 fps, which does not satisfy the project requirements.
I am sure that there are better approaches with higher frame rates. Which approach do streaming services like Netflix and Amazon take for UHD videos? Of course I googled a lot, but I couldn't find any satisfactory results.
Is there a standardized method to send JPG images over network with TCP/IP?
There are several internet protocols that are commonly used to transfer files over TCP. Perhaps the most commonly used protocol is HTTP. Another, older one is FTP.
Which approach do streaming services like Netflix and Amazon take for UHD videos?
Firstly, they don't use JPEG at all. They use a video compression codec (such as MPEG) that compresses the data not only spatially but also temporally (successive frames tend to hold similar data). An example of a protocol they might use to stream the data is DASH, which operates over HTTP.
I don't have a specific library in mind that already does these things well, but some items to keep in mind:
Most image/screen-share/video streaming applications use UDP, RTP, or RTSP exclusively for the video stream data, in a lossy fashion. They use TCP for control-flow data, like sending key commands or client/server communication about what to present, but the streamed data is not sent over TCP.
If you are streaming video, see this.
For sending individual images you just need efficient methods to compress, serialize, and deserialize, and you probably want to do so in batches instead of one at a time: batch 10 JPEGs together, compress them, serialize them, and send (see the sketch below).
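As a rough illustration of that batching idea, here is a minimal sketch over plain POSIX TCP sockets; the header layout mirrors the one described in the question, the batch size is arbitrary, and endianness/error handling are left out:

#include <sys/types.h>
#include <sys/socket.h>
#include <cstdint>
#include <vector>

// Header layout mirroring the one described in the question; sent raw, so
// both ends are assumed to share endianness.
struct FrameHeader {
    uint32_t sizeBytes;   // length of the JPG payload that follows
    uint32_t width;
    uint32_t height;
    uint32_t channel;
};

// Serialize a batch of already-compressed JPGs into one contiguous buffer and
// push it out with a single send loop, instead of one round trip per image.
void sendBatch(int sock, const std::vector<std::vector<uint8_t>>& jpgs,
               uint32_t width, uint32_t height, uint32_t channel) {
    std::vector<uint8_t> out;
    for (const auto& jpg : jpgs) {
        FrameHeader h{static_cast<uint32_t>(jpg.size()), width, height, channel};
        const uint8_t* p = reinterpret_cast<const uint8_t*>(&h);
        out.insert(out.end(), p, p + sizeof h);
        out.insert(out.end(), jpg.begin(), jpg.end());
    }
    size_t sent = 0;
    while (sent < out.size()) {
        ssize_t n = send(sock, out.data() + sent, out.size() - sent, 0);
        if (n <= 0) break;                        // error handling elided
        sent += static_cast<size_t>(n);
    }
}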
You mentioned fps, so it sounds like you are trying to stream video and not just copy images over quickly. I'm not entirely sure what you are trying to do. Can you elaborate on the digitized signals and why they have to be JPEG? Could they be in some other format and converted to JPEG at the receiving end?
This is not a direct answer to your question, but a suggestion that you will probably need to change how you are sending your movie.
Here's a calculation: suppose you can get 1 Gb/s of throughput out of your network. If each 2048x4096 file compresses to about 10 MB (80 Mb), then:
1000000000 ÷ (80 × 1000000) = 12.5
So you can send about 12 frames per second. This means that if you have a continuous stream of JPGs you want to display and you want faster frame rates, you need a faster network.
If your stream is a fixed-length movie, then you could buffer the data and start the movie once enough data is buffered to sustain playback at the desired frame rate, rather than waiting for the entire movie to download. If you want playback at 24 frames per second, you will need to buffer roughly half of the movie before you begin playback, because the playback is twice as fast as your download speed.
As stated in another answer, you should use a streaming codec so that you can also take advantage of compression between successive frames, rather than just compressing the current frame alone.
To sum up, your playback rate will be limited by the number of frames you can transfer per second if the stream never ends (e.g., a live stream).
If you have a fixed length movie, buffering can be used to hide the throughput and latency bottlenecks.
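A quick back-of-the-envelope check of those numbers (the values are the ones assumed in this answer, not measurements):

#include <cstdio>

int main() {
    const double linkBitsPerSec = 1e9;                         // 1 Gb/s network
    const double frameBits      = 80e6;                        // ~10 MB JPG per frame
    const double downloadFps    = linkBitsPerSec / frameBits;  // = 12.5
    const double playbackFps    = 24.0;
    // Fraction of a fixed-length movie to buffer before starting playback so
    // that the stream never stalls: 1 - (download rate / playback rate).
    const double bufferFraction = 1.0 - downloadFps / playbackFps;
    std::printf("download: %.1f fps, buffer %.0f%% of the movie first\n",
                downloadFps, bufferFraction * 100.0);
    return 0;
}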

How to create vector of matrices to store large number of images?

I want to create a vector of matrices to store as many images as possible.
I know that it is possible as written below:
vector<Mat> images1;
During image acquisition from the camera I would save the images at 100 fps with a resolution of 1600*800 as below:
images1.push_back(InputImage.clone());
where InputImage is the Mat given by the camera. (Creating the video during the acquisition process either leads to missing frames in the video or a reduction in acquisition speed.)
Later, after stopping the image acquisition and before stopping the program, I would write the images into a video as written below:
VideoWriter writer;
writer = VideoWriter("video.avi", -1, 100, Size(1600, 800), false);
for (vector<Mat>::iterator iter = images1.begin(); iter != images1.end(); iter++)
writer.write(*iter);
Is this correct? I am not sure that images1 can store around 1500 images without overflow.
You don't really have to worry about "overflow", whatever that means in your context.
The bigger problem is memory. A single frame takes (at 8 bits per color, with 3 colors) 3 * 1600 * 800 == 3.84 MB. At 100 fps, one second of footage requires 0.384 GB of memory. 8 GB of memory can only hold about 20 seconds of footage. You'll need almost 24 GB of memory before you can hold a whole minute. There's a reason that the vast, vast, vast majority of video encoding software only keeps a few frames of video data in memory at any given time and dumps the rest to the hard drive (or discards it, depending on the purpose the software serves).
What you should probably be doing (which is what programs like FRAPS do) is dumping frames to the hard drive as soon as you receive them. Then, when recording finishes, you can either call it a day (if raw video footage is what you need) or you can begin a process of reading the file and encoding it into a more compressed format.
Pre-allocate your image vector in memory so that you just need to copy the frames without real-time allocation.
If you have memory problems, try dumping the frames to a file; the OS will hopefully be able to handle the I/O. If not, try memory-mapped files.
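A minimal sketch combining both suggestions (OpenCV assumed; the frame count, resolution and mono format are taken from the question, the file name is illustrative):

#include <opencv2/opencv.hpp>
#include <fstream>
#include <vector>

int main() {
    const int frameCount = 1500;

    // Pre-allocate every Mat up front so the capture loop only copies pixels.
    std::vector<cv::Mat> images1(frameCount);
    for (auto& m : images1) m.create(800, 1600, CV_8UC1);

    // Alternative: stream raw frames straight to disk and let the OS buffer the I/O.
    std::ofstream raw("frames.raw", std::ios::binary);

    cv::Mat InputImage(800, 1600, CV_8UC1);       // stand-in for the camera frame
    for (int i = 0; i < frameCount; ++i) {
        InputImage.copyTo(images1[i]);            // copy only, no allocation here
        raw.write(reinterpret_cast<const char*>(InputImage.data),
                  static_cast<std::streamsize>(InputImage.total() * InputImage.elemSize()));
    }
    return 0;
}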

Can the mp3 or wav file format take advantage of repetitious sounds?

I want to store a number of sound fragments as MP3 or WAV files, but these fragments are each highly repetitive (a 10 second burst of tone for example). Are the MP3 or WAV file formats able to take advantage of this - i.e. is there a sound file equivalent of run-length encoding?
No, neither format can do this.
WAV files (typically) use PCM, which holds a value for every single sample. Even if there were complete digital silence (all values the same), every sample is stored.
MP3 works in frames of 1,152 samples. Each frame stands alone (well, there is the bit reservoir but for the purpose of encoding/decoding, this is just extra bandwidth made available). Even if there were a way to say do-this-n-times, it would be fixed within a frame. Now, if you are using MP3 with variable bit rate, I suspect that you will have great results with perfect sine waves since they have no harmonics. MP3 works by converting from the time domain to the frequency domain. That is, it samples the frequencies in each frame. If you only have one of those frequencies (or no sound at all), the VBR method would be efficient.
I should note that FLAC does use RLE when encoding silence. However, I don't think FLAC could be hacked to use RLE for 10 seconds of audio, since again there is a frame border. FLAC's RLE for silence is problematic for live internet radio stations that leave a few-second gap between songs. It's important for these stations to have a large buffer, since clients will often pause the stream if they don't receive enough data. (They do catch back up again, though, as soon as that silent block is sent and audio resumes.)

Saving images from a video camera to hard disk

I have a Firewire camera whose driver software deposits incoming images into a circular buffer of 16 images. I would like to avoid re-buffering these images, and just write them as fast as possible to disk. So I would prefer to just enqueue a pointer to each buffer as it is filled, and have a separate disk write thread which kept far enough ahead of the incoming images to be confident that it would write them out to disk before the incoming images overwrote them.
Clearly this would depend on the image size and frame rate... but in principle, for VGA images at 30 frames per second, we're talking about needing to write 27.6 MB/sec. This seems quite achievable, particularly if the writing thread can decide to drop occasional frames to keep far enough ahead of the incoming images, and if this strategy fails, to at least detect that an overwrite has invalidated the image, and signal that appropriately (e.g. delete the file after completion).
Comments on the validity of this strategy are welcome... but what I really want to know is what disk writing functions should be used for maximum efficiency to get the disk writing rate up as high as possible. E.g. CreateFile() using FILE_FLAG_NO_BUFFERING + WriteFile()?
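For reference, a minimal sketch of the CreateFile() + FILE_FLAG_NO_BUFFERING + WriteFile() combination the question mentions (Win32 assumed; the file name is illustrative, and real code must keep every write sector-aligned):

#include <windows.h>

int main() {
    // One raw VGA RGB frame: 640 * 480 * 3 = 921,600 bytes, which happens to
    // be a multiple of 4096, so it satisfies typical sector-size alignment.
    const DWORD frameBytes = 640 * 480 * 3;

    HANDLE file = CreateFileA("frame_000001.raw", GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS,
                              FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
                              nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    // Page-aligned scratch buffer standing in for the camera's ring-buffer slot;
    // FILE_FLAG_NO_BUFFERING also requires the buffer address to be aligned.
    void* frame = VirtualAlloc(nullptr, frameBytes, MEM_COMMIT | MEM_RESERVE,
                               PAGE_READWRITE);

    DWORD written = 0;
    WriteFile(file, frame, frameBytes, &written, nullptr);

    VirtualFree(frame, 0, MEM_RELEASE);
    CloseHandle(file);
    return 0;
}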

XAudio2 delay with small buffer size

I'm writing a video player. For the audio part I'm using XAudio2. For this I have a separate thread that waits for the BufferEnd event, then fills the buffer with new data and calls SubmitSourceBuffer.
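A minimal sketch of that callback/submit pattern (names such as BufferEndCallback, FeederThread and FillNextChunk are illustrative, not from the question; the callback instance is assumed to have been passed to CreateSourceVoice):

#include <xaudio2.h>
#include <windows.h>
#include <cstdint>
#include <cstring>

// Signals an event from the audio thread whenever a submitted buffer finishes.
struct BufferEndCallback : IXAudio2VoiceCallback {
    HANDLE bufferEnd = CreateEventA(nullptr, FALSE, FALSE, nullptr);
    void STDMETHODCALLTYPE OnBufferEnd(void*) override { SetEvent(bufferEnd); }
    // The remaining callbacks are required by the interface but unused here.
    void STDMETHODCALLTYPE OnVoiceProcessingPassStart(UINT32) override {}
    void STDMETHODCALLTYPE OnVoiceProcessingPassEnd() override {}
    void STDMETHODCALLTYPE OnStreamEnd() override {}
    void STDMETHODCALLTYPE OnBufferStart(void*) override {}
    void STDMETHODCALLTYPE OnLoopEnd(void*) override {}
    void STDMETHODCALLTYPE OnVoiceError(void*, HRESULT) override {}
};

// Placeholder for the decoder: here it just produces silence.
static void FillNextChunk(uint8_t* dst, uint32_t bytes) { std::memset(dst, 0, bytes); }

// Feeder thread: wait for BufferEnd, refill the chunk, resubmit it.
void FeederThread(IXAudio2SourceVoice* voice, BufferEndCallback& cb,
                  uint8_t* pcm, uint32_t chunkBytes) {
    for (;;) {
        WaitForSingleObject(cb.bufferEnd, INFINITE);
        FillNextChunk(pcm, chunkBytes);
        XAUDIO2_BUFFER buf = {};
        buf.AudioBytes = chunkBytes;
        buf.pAudioData = pcm;
        voice->SubmitSourceBuffer(&buf);
    }
}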
The problem is that XAudio2 (the driver or the sound card) has huge delays before playing the next buffer if the buffer size is small (1024 bytes). I made measurements, and XAudio2 takes up to twice as long to play such a chunk (a 1024-byte chunk of 48 kHz raw 2-channel PCM should play in roughly 5 ms, but on my computer it takes up to 10 ms). There are nearly no delays if I make the buffer 4 KB or more.
I need such a small buffer to be able to synchronize with the video clock or an external clock (like ffplay does). If I make my buffer too big, the end user will hear a lot of noise in the output due to the synchronization adjustments.
I have also measured all my functions that decode and synchronize audio, and anything else that could block or produce delays; they take 0 or 1 ms to execute, so they are definitely not the problem.
Does anybody know what this could be and why it's happening? Can anyone check whether they have the same delay problems with a small buffer?
I've not experienced any delay or pause using .wav files. If you are using the MP3 format, it may add silence at the beginning and end of the sound during compression, thus causing a delay in playback. See this post for more information.