I'm developing a screencasting utility in C++.
It basically captures desktop frames and creates an AVI file. The algorithm is as follows:
Create a thread: this->m_hThread=CreateThread(NULL,0,thScreenCapture,this,0,NULL);
Capture desktop in thScreenCapture n times per second (like 5 fps).
obj->Capture();
In Capture(), append the bitmap data to the avi file.
this->appendBitmapToAvi(this->avifile, bmp);
This utility also records sound. So, in method thScreenCapture, sound data is also being appended to the avi file.
The problem is that a lag occurs between frames and sound when more than 6 frames (this can change depending on hardware configuration) are captured per second.
I'm seeking for a solution to optimize the algorithm. A solution may be buffering frames in memory, not appending all of them to the avi file on-the-fly. But this makes the code more complex because I have to deal with the sound data that's being captured in a different thread.
What do you suggest to increase the fps value that this utility supports without losing synchronization?
Are you writing the AVI file yourself? A noble effort, but there are APIs to help with this task.
If you're working on the windows platform, I'd suggest considering using the DirectShow or Media Foundation APIs to mux the audio and video to disk. DirectShow is the API for A/V capture, streaming and muxing on the windows platform.
This article on CodeProject talks about audio & video sync issues and the mechanism that DirectShow uses to overcome this difficulty.
Essentially, a reference clock is employed and the frames are timestamped.
There's a very active DirectShow community that is an extremely useful resource for new folks. TMH's website is well worth checking out - he's an MS MVP and is an active member of the community.
I hope this helps!
You could take a look at the source for other screencasting software, such as CamStudio, to see how they are doing it.
If your program is disk bound (and I suspect it is), then things might improve by using compression (this is how the big name programs, such as Camtasia Studio, operate)
Use a circular double or triple buffer to store the bitmap and sound each frame and use a separate thread to add the bitmap and sound to the avi. So data-collection is in one thread, data is in a circular (thread-safe) buffer and the data-storage is in another thread.
What OS are you targeting? If tyou are working on Windows XP I would take a look at some of the DirectShow code at http://tmhare.mvps.org/downloads.htm, specifically Filter Graph Library.
Related
I'm trying to develop a little application in which you can load a mp3 file and play it in variable speeds! (I know it already exists :-) )
I'm using Qt and C++. I already have the basic player but I'm stuck with the rate thing, because I want to change the rate smoothly (like in Mixxx) without stopping the playback! The QMediaPlayer always stops if I change the value and creates a gap in the sound. Also I don't want the pitch to change!
I already found something called "SoundTouch" but now I'm completely clueless what to do with it, how to process my mp3 data and how to get it to the player! The "SoundTouch" Library is capable of doing what I want, i got that from the samples on the homepage.
How do I have to import the mp3 file, so I can process it with the SoundTouch functions
How can I play the output from the SoundTouch function? (Perhaps QMediaPlayer can do the job?)
How is that stuff done live? I have to do some kind of stream I guess? So I can change the speed during play and keep on playing without gaps. Graphicaly in my head it has to be something that sits between the data and the player, where all data has to go through live, with a small buffer (20-50 ms or so) behind to avoid gaps during processing future data.
Any help appreciated! I'm also open to any another solution then "SoundTouch" as long as I can stay with Qt/C++!
(Second thing: I want to view a waveform overview aswell as moving part of it (around actual position of the song), so I could also use hints on how to get the waveform data)
Thanks in advance!
As of now (Qt 5.5) this is impossible to do with QMediaPlayer only. You need to do the following:
Decode the audio using GStreamer, FFMpeg or (new) QAudioDecoder: http://doc.qt.io/qt-5/qaudiodecoder.html - this will give you raw PCM stream;
Apply SoundTouch or some other library to this raw data to change the pitch. If GPL is ok, take a look at http://nsound.sourceforge.net/examples/index.html, if you develop proprietary stuff, STK might be a better choice: https://ccrma.stanford.edu/software/stk/
Output the modified data into audio device by using QAudioOutput.
This strategy uses Qt as much as possible, and brings you the best platform coverage (you still lose Android though as it does not support QAudioOutput)
We have an "MC1362 Camera" and an "Inspecta-5" frame grabber in our lab. There is program in LABVIEW11 which gets the data from a frame grabber, however as the Labview is slow my supervisor has told me to write a program in c++ to get the data from the frame grabber. I have no idea how to write a c++ program to connect to a frame grabber and do the data acquisition. I know how to write software in c++, but have never tried programming to connect to hardware and read data from it. Is there any specific library or framework which can help me, or any tutorial?
Please, if anybody knows, help me in this matter.
Update:just to add, we are doing medical image analysis, and a laser illuminate a subject, so camera will take pictures and pass it to the computer. I need to grab the pictures and analysis them.
You basically have a couple of options,
1 see if there is an SDK for the grabber card, if there is this is usually easier then option 2 but is of course restricted to work with that grabber or familly of grabber cards, we do it this way with the eurysys grabber cards.
2 assuming you are running on a windows platform, implement a DirectShow filtergraph and write your own ouput filter to get the data, the SDK for DirectShow is quiet good and has many examples. This approach is far more flexible and you should be able to use a number of grabber but its also alot more complex, we do it this way for USB / some other inbuilt grabbers.
Our software is done in Delphi 7 but its just importing DLLs, for C++ should be no problem and most SDK's are written round C++ anyway.
I know its not much but its a place to start.
Update
Just done a quick Google search and there is an SDK for that Grabber and on first looks its seams fairly straight forward.
We have a requirement to lets users record a video of our 3D application. I can already grab the individual rendered frames so this question is specifically about how to write frames into a video file.
I don't think writing each frame as a separate file and post-processing is a workable option.
I can look at options to record to a simple video file for later optimising/encoding, or writing directly to a sensibly encoded format.
FFmpeg was suggested in another post but it looks a bit daunting to me. Is it the best option, if not what can be suggested? We can work with LGPL but not full GPL.
We're working on Windows (Win32 not MFC) in C++. Sample/pseudo code with your recommended library is very much appreciated... basically after how to do 3 functions:
startRecording() does whatever initialization is needed
recordFrame() takes pointer to frame data and encodes it, ideally with timing data
endRecording() finalizes the video file, shuts down video system, etc
Check out the sources to Taksi on sourceforge. http://taksi.sourceforge.net/
You need 2 things.
1. A code to compress the frames.
2. A container file format. Like AVI or MPG.
Taksi useses the old VideoForWindows API and AVI not the newer COM API's but it still might work for you.
I'm writing a program similar to StreamMyGame with the difference of the client being free and more importantly, open source, so I can port it to other devices (in my case an OpenPandora), or even make an html5 or flash client.
Because the objective of the program is to stream video games, latency should be reduced to a minimum.
Right now I can capture video of Direct 3D 9 games at a fixed frame rate, encode it using libx264 and dumping it to disk, and send input remotely, but I'm stumped at sending the video and eventually the audio through the network.
I don't want to implement a way just to discover that it introduces several seconds of delay and I don't care how it is done as long as it is done.
Off of my head I can think several ways:
My current way, encode video with libx264 and audio with lame or as ac3 and send them with live555 as a RTSP feed, though the library is not playing nice with MSVC and I’m still trying to understand its functioning.
Have the ffmpeg library do all the grunt work, where it encodes and sends (I guess I'll have to use ffserver to get an idea on how to do it)
Same but using libvlc, perhaps hurting encoding configurability in the process.
Using several pipes with the independent programs (ie: piping data to x264.exe or ffmpeg.exe)
Use other libraries such as pjsip or JRTPLIB that might simplify the process.
The hard way, sending video and audio through an UDP channel and figuring out how to synchronizing everything at the client (though the reason to use RTSP is to avoid this).
Your way, if I didn't think of something.
The second option would really be the best as it would reduce the number of libraries (integrate swscale, libx264, the audio codec and the sender library), simplify the development and bringing more codec variety (CELT looks promising) but I worry about latency as it might have a longer pipeline.
100 ms would already be too much, especially when you consider you might be adding another 150 ms of latency when it is used trough broadband.
Does any of you have experience with these libraries, to recommend me to switch to ffmpeg, keep wrestling live555 or do anything else (even if I didn’t mentioned it)?
I had very good results of streaming large blocks of data with low latency using UDT4 library. But first I would suggest checking ffmpegs network capabilities, so you have a native solution in all operations.
I wanted to get some ideas one how some of you would approach this problem.
I've got a robot, that is running linux and uses a webcam (with a v4l2 driver) as one of its sensors. I've written a control panel with gtkmm. Both the server and client are written in C++. The server is the robot, client is the "control panel". The image analysis is happening on the robot, and I'd like to stream back the video from the camera to the control panel for two reasons:
A) for fun
B) to overlay image analysis results
So my question is, what are some good ways to stream video from the webcam to the control panel as well as giving priority to the robot code to process it? I'm not interested it writing my own video compression scheme and putting it through the existing networking port, a new network port (dedicated to video data) would be best I think. The second part of the problem is how do I display video in gtkmm? The video data arrives asynchronously and I don't have control over main() in gtkmm so I think that would be tricky.
I'm open to using things like vlc, gstreamer or any other general compression libraries I don't know about.
thanks!
EDIT:
The robot has a 1GHz processor, running a desktop like version of linux, but no X11.
Gstreamer solves nearly all of this for you, with very little effort, and also integrates nicely with the Glib event system. GStreamer includes V4L source plugins, gtk+ output widgets, various filters to resize / encode / decode the video, and best of all, network sink and sources to move the data between machines.
For prototype, you can use the 'gst-launch' tool to assemble video pipelines and test them, then it's fairly simply to create pipelines programatically in your code. Search for 'GStreamer network streaming' to see examples of people doing this with webcams and the like.
I'm not sure about the actual technologies used, but this can end up being a huge synchronization ***** if you want to avoid dropped frames. I was streaming a video to a file and network at the same time. What I eventually ended up doing was using a big circular buffer with three pointers: one write and two read. There were three control threads (and some additional encoding threads): one writing to the buffer which would pause if it reached a point in the buffer not read by both of the others, and two reader threads that would read from the buffer and write to the file/network (and pause if they got ahead of the producer). Since everything was written and read as frames, sync overhead could be kept to a minimum.
My producer was a transcoder (from another file source), but in your case, you may want the camera to produce whole frames in whatever format it normally does and only do the transcoding (with something like ffmpeg) for the server, while the robot processes the image.
Your problem is a bit more complex, though, since the robot needs real-time feedback so can't pause and wait for the streaming server to catch up. So you might want to get frames to the control system as fast as possible and buffer some up in a circular buffer separately for streaming to the "control panel". Certain codecs handle dropped frames better than others, so if the network gets behind you can start overwriting frames at the end of the buffer (taking care they're not being read).
When you say 'a new video port' and then start talking about vlc/gstreaming i'm finding it hard to work out what you want. Obviously these software packages will assist in streaming and compressing via a number of protocols but clearly you'll need a 'network port' not a 'video port' to send the stream.
If what you really mean is sending display output via wireless video/tv feed that's another matter, however you'll need advice from hardware experts rather than software experts on that.
Moving on. I've done plenty of streaming over MMS/UDP protocols and vlc handles it very well (as server and client). However it's designed for desktops and may not be as lightweight as you want. Something like gstreamer, mencoder or ffmpeg on the over hand is going to be better I think. What kind of CPU does the robot have? You'll need a bit of grunt if you're planning real-time compression.
On the client side I think you'll find a number of widgets to handle video in GTK. I would look into that before worrying about interface details.