OpenCV VideoCapture default settings - C++

I am using a MacBook and have a program written in C++ that extracts successive frames from the webcam. The extracted frames are then grayscaled and smoothed using OpenCV functions. After that I use cv::norm to find the relative difference between 2 frames. I am using the VideoCapture class.
I found out that the frame rate is 30 fps and, using cv::norm, the relative difference between successive frames is less than 200 most of the time.
I am trying to do the same thing in Xcode so as to implement the program on an iPad. This time I am using AVCaptureSession; the same steps are performed, but I realize that the relative difference between 2 frames is much higher (>600).
Thus I would like to know the default settings of the VideoCapture class. I know that I can edit the settings using cvSetCaptureProperty, but I cannot find the defaults anywhere. I would then compare them with the settings of the AVCaptureSession and hopefully find out why there is such a huge difference in cv::norm between these 2 approaches to extracting my frames.
Thanks in advance.

OpenCV's VideoCapture class is just a simple wrapper for capturing video from cameras or reading video files. It is built on top of several multimedia frameworks (AVFoundation, DirectShow, FFmpeg, V4L, GStreamer, etc.) and completely hides them from the outside. This is where the problem comes from: it is really hard to achieve the same capturing behaviour across different platforms and multimedia frameworks. The common set of capture properties is small, and setting their values is only a suggestion rather than a requirement.
In summary, the default properties can differ across platforms, but in the case of the AV Foundation framework:
The cvCreateCameraCapture_AVFoundation(int index) function creates the CvCapture instance used under iOS; it is defined in cap_qtkit.mm. It seems you are not able to change the sampling rate: only CV_CAP_PROP_FRAME_WIDTH, CV_CAP_PROP_FRAME_HEIGHT and DISABLE_AUTO_RESTART are supported.
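As a quick sanity check, you can also just open the capture and read the properties back to see what defaults the backend actually picked. A small sketch (unsupported properties simply return 0):
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::VideoCapture cap(0);   // default webcam
    if (!cap.isOpened())
        return 1;

    // Read back what the backend chose; unsupported properties return 0.
    std::cout << "width:  " << cap.get(CV_CAP_PROP_FRAME_WIDTH)  << "\n"
              << "height: " << cap.get(CV_CAP_PROP_FRAME_HEIGHT) << "\n"
              << "fps:    " << cap.get(CV_CAP_PROP_FPS)          << "\n";
    return 0;
}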
The grabFrame() implementation is below. I'm absolutely not an Objective-C expert, but it seems like it waits until the capture updates the image or a timeout occurs.
bool CvCaptureCAM::grabFrame() {
    return grabFrame(5);
}

bool CvCaptureCAM::grabFrame(double timeOut) {
    NSAutoreleasePool* localpool = [[NSAutoreleasePool alloc] init];
    double sleepTime = 0.005;
    double total = 0;

    [NSTimer scheduledTimerWithTimeInterval:100 target:nil selector:@selector(doFireTimer:) userInfo:nil repeats:YES];
    while (![capture updateImage] && (total += sleepTime) <= timeOut) {
        [[NSRunLoop currentRunLoop] runUntilDate:[NSDate dateWithTimeIntervalSinceNow:sleepTime]];
    }

    [localpool drain];
    return total <= timeOut;
}

Related

Media Foundation multiple-video playback results in memory leak & crash after an undefined timeframe

So we are using a stack consisting of C++ Media Foundation code in order to play back video files. An important requirement is the ability to play these videos in constantly repeating sequences, so every single video slot will periodically change the video it is playing. In our current example we are creating 16 HWNDs to render video into and 16 corresponding player objects. The main application loops over all of them, one after another, and does the following:
Shutdown the last player
Release the object
CoCreateInstance for a new player
Initialize the player with the (old) HWND
Start Playback
The media player is called "MediaPlayer2"; it needs to be built and registered as a COM server (regsvr32). The main application is found in the TAPlayer2 project. It looks up the CLSID of the player in the registry and instantiates it. As the current test file we use a test.mp4 that has to reside on disk, like C:\test.mp4.
Now everything goes fine initially. The program loops through the players and the video keeps restarting and playing. The memory footprint is normal and everything runs smoothly. After a timeframe of anything between 20 minutes and 4 days, all of a sudden things get weird. At this point it seems as if calls to "InitializeRenderer" by the EVR slow down and eventually don't go through anymore at all. With this, the thread count and memory footprint also start to increase drastically, and after a certain amount of time, depending on the available RAM, all memory is exhausted and our application crashes, usually somewhere in the GPU driver or near the EVR DLL.
I am happy to try out any other code examples that propose to solve my issue: displaying multiple video windows at the same time, and looping through them like in a playlist. Needs to be running on Windows 10!
I have been going at this for quite a while now and am pretty hard stuck. I uploaded the above-mentioned code example and added the link to this post. It should work out of the box, afaik. I can also provide code excerpts here in the thread if that is preferred.
Any help is appreciated, thanks
Thomas
Link to demo project (VS2015): https://filebin.net/l8gl79jrz6fd02vt
edit: the following code from the end of winmain.cpp is used to restart the players:
do
{
    for (int i = 0; i < PLAYER_COUNT; i++)
    {
        hr = g_pPlayer[i]->Shutdown();
        SafeRelease(&g_pPlayer[i]);

        hr = CoCreateInstance(CLSID_AvasysPlayer,     // CLSID of the coclass
                              NULL,                   // no aggregation
                              CLSCTX_INPROC_SERVER,   // the server is in-proc
                              __uuidof(IAvasysPlayer),    // IID of the interface we want
                              (void**)&g_pPlayer[i]);     // address of our interface pointer

        hr = g_pPlayer[i]->InitPlayer(hwndPlayers[i]);
        hr = g_pPlayer[i]->OpenUrl(L"C:\\test.mp4");
    }
} while (true);
Some Media Foundation interfaces, like
IMFMediaSource
IMFMediaSession
IMFMediaSink
need to be shut down before you release them.
At this point it seems as if calls to "InitializeRenderer" by the EVR slow down and eventually don't go through anymore at all.
... usually somewhere in the GPU driver or near the EVR DLL.
These are good leads for a precise search in your code.
In your file PlayerTopoBuilder.cpp, at CPlayerTopoBuilder::AddBranchToPartialTopology:
if (bVideo)
{
    if (false) {
        BREAK_ON_FAIL(hr = CreateMediaSinkActivate(pSD, hVideoWnd, &pSinkActivate));
        BREAK_ON_FAIL(hr = AddOutputNode(pTopology, pSinkActivate, 0, &pOutputNode));
    }
    else {
        //// try directly create renderer
        BREAK_ON_FAIL(hr = MFCreateVideoRenderer(__uuidof(IMFMediaSink), (void**)&pMediaSink));

        CComQIPtr<IMFVideoRenderer> pRenderer = pMediaSink;
        BREAK_ON_FAIL(hr = pRenderer->InitializeRenderer(nullptr, nullptr));

        CComQIPtr<IMFGetService> getService(pRenderer);
        BREAK_ON_FAIL(hr = getService->GetService(MR_VIDEO_RENDER_SERVICE, __uuidof(IMFVideoDisplayControl), (void**)&pVideoDisplayControl));
        BREAK_ON_FAIL(hr = pVideoDisplayControl->SetVideoWindow(hVideoWnd));

        BREAK_ON_FAIL(hr = pMediaSink->GetStreamSinkByIndex(0, &pStreamSink));
        BREAK_ON_FAIL(hr = AddOutputNode(pTopology, 0, &pOutputNode, pStreamSink));
    }
}
You create an IMFMediaSink with MFCreateVideoRenderer and pMediaSink. pMediaSink is released because of the use of CComPtr, but Shutdown is never called on it.
You must keep a reference to the media sink and Shutdown/Release it when the player shuts down.
Or you can use a different approach with MFCreateVideoRendererActivate.
IMFMediaSink::Shutdown
If the application creates the media sink, it is responsible for calling Shutdown to avoid memory or resource leaks.
In most applications, however, the application creates an activation object for the media sink, and the Media Session uses that object to create the media sink.
In that case, the Media Session — not the application — shuts down the media sink. (For more information, see Activation Objects.)
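A minimal sketch of that fix, assuming a hypothetical CComPtr<IMFMediaSink> member (m_pMediaSink) is added to the player class and assigned from pMediaSink in AddBranchToPartialTopology:
#include <atlbase.h>   // CComPtr
#include <mfidl.h>     // IMFMediaSink

// Hypothetical member on the player class so the sink can be shut down later:
//   CComPtr<IMFMediaSink> m_pMediaSink;   // set to pMediaSink when the renderer is created

// Called from the player's Shutdown path, before the session is released.
HRESULT ShutdownVideoSink(CComPtr<IMFMediaSink>& pSink)
{
    HRESULT hr = S_OK;
    if (pSink)
    {
        hr = pSink->Shutdown();   // IMFMediaSink::Shutdown releases the sink's internal resources
        pSink.Release();          // then drop our own reference
    }
    return hr;
}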
I also suggest using this kind of code at the end of CPlayer::CloseSession (after releasing all the other objects):
if (m_pSession != NULL) {
    hr = m_pSession->Shutdown();
    ULONG ulMFObjects = m_pSession->Release();
    m_pSession = NULL;
    assert(ulMFObjects == 0);
}
For the use of MFCreateVideoRendererActivate, you can look at my MFNodePlayer project:
MFNodePlayer
EDIT
I rewrote your program, but I tried to keep your logic and original source code, like CComPtr/Mutex...
MFMultiVideo
Tell me if this program has memory leaks.
Depending on your answer, we can then talk about best practices with Media Foundation.
Another thought:
Your program uses 1 to 16 IMFMediaSession instances. On a good computer configuration, you could use only one IMFMediaSession, I think (never try to aggregate 16 MFSources).
Visit:
CustomVideoMixer
to understand the other way to do it.
I think using 16 IMFMediaSession instances is not the best approach on a modern computer. VuVirt talked about this.
EDIT2
I've updated MFMultiVideo using Work Queues.
I think the problem may be that you call MFStartup/MFShutdown for each player.
Just call MFStartup/MFShutdown once, in winmain.cpp for example, like my program does.
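A sketch of the idea only, with one MFStartup/MFShutdown pair wrapped around the existing player loop (the real winmain.cpp also needs its COM initialization, window creation, etc.):
#include <windows.h>
#include <mfapi.h>

int WINAPI wWinMain(HINSTANCE, HINSTANCE, PWSTR, int)
{
    // One Media Foundation startup for the whole process...
    if (FAILED(MFStartup(MF_VERSION)))
        return -1;

    // ... create the 16 windows, run the do/while player loop shown above ...

    // ...and one matching shutdown when the application exits.
    MFShutdown();
    return 0;
}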

Is there an API that will run on iOS to change the frames per second of an existing video?

I am looking for a way to receive as input any video (that is supported on iOS) and save on the device a new video with a new frames-per-second rate. The motivation is to decrease the video size and make it as lightweight as possible.
Tried using the ffmpeg library from the command line (I need it to run directly from the application)
Tried working with SDAVAssetExportSessionDelegate, but managed only to change the bits per second (each frame's quality is lower)
Thought about working with OpenCV - but would prefer something lighter and built-in if possible
Objective C:
compressionEncoder.videoSettings = @{
    AVVideoCodecKey: AVVideoCodecTypeH264,
    AVVideoWidthKey: [NSNumber numberWithInt:width],   // Set your resolution width here
    AVVideoHeightKey: [NSNumber numberWithInt:height], // Set your resolution height here
    AVVideoCompressionPropertiesKey: @{
        AVVideoAverageBitRateKey: [NSNumber numberWithInt:bitRateKey], // Lower values give a smaller size
        AVVideoProfileLevelKey: AVVideoProfileLevelH264High40,
        // Does not change - this is a quality setting, not related to playback frame rate!
        //AVVideoMaxKeyFrameIntervalKey: @800,
    },
};
compressionEncoder.audioSettings = @{
    AVFormatIDKey: @(kAudioFormatMPEG4AAC),
    AVNumberOfChannelsKey: @2,
    AVSampleRateKey: @44100,
    AVEncoderBitRateKey: @128000,
};
Expected: a video with fewer frames per second, with each frame at the same quality - similar to a brief thumbnail summary of the video.
The type of conversion you are doing will be time and power consuming on a mobile device, but I am guessing you are already aware of that.
Given your end goal is to reduce size while presumably maintaining reasonable quality, you may want to experiment with different settings in the encodings.
For this type of video manipulation, ffmpeg is a good choice, as you probably saw from your command-line usage. To use ffmpeg from an application, a common approach is to use a well-supported 'ffmpeg wrapper' - this effectively runs the ffmpeg command-line commands from within your application.
The advantage is that all the usual syntax should work and you can leverage the vast amount of info on ffmpeg command-line syntax on the web. The downside is that ffmpeg was not designed to be wrapped like this, so you may see some issues, although with a well-supported wrapper you should find either help or that others have already worked around the issues.
Some examples of popular iOS ffmpeg wrappers:
https://github.com/tanersener/mobile-ffmpeg
https://github.com/sunlubo/SwiftFFmpeg
Get MobileFFMpeg up and running:
https://stackoverflow.com/a/59325680/1466453
Once you can make MobileFFmpeg calls in your iOS code, changing the frame rate is pretty straightforward with a command like this (the input and output paths are placeholders):
[MobileFFmpeg execute:@"-i <input path> -filter:v fps=fps=30 <output path>"];

Stream OpenCV cv::Mat image to a website (HTML5 page)

I have C++ code running on a Raspberry Pi, using OpenCV to process the camera input (shape and color detection). Here is the thread where I capture the images from my Pi camera:
(variable names are in French, sorry about that)
Mat imgOriginal;
VideoCapture camera;
int largeur = camPartage->getLargeur();
int hauteur = camPartage->getHauteur();

camera.open(0);
if (!camera.isOpened())
{
    screen->dispStr(10, 1, "Cannot open the web cam");
}
else
{
    screen->dispStr(10, 1, "Open the web cam");
    camera.set(CV_CAP_PROP_FRAME_WIDTH, largeur);
    camera.set(CV_CAP_PROP_FRAME_HEIGHT, hauteur);
    camera.set(CV_CAP_PROP_FPS, 30);
}

while (1)
{
    if (largeur != camPartage->getLargeur() || hauteur != camPartage->getHauteur())
    {
        largeur = camPartage->getLargeur();
        hauteur = camPartage->getHauteur();
        camera.set(CV_CAP_PROP_FRAME_WIDTH, largeur);
        camera.set(CV_CAP_PROP_FRAME_HEIGHT, hauteur);
    }

    camera.grab();
    camera.retrieve(imgOriginal);
    camPartage->setImageCam(imgOriginal); // shared object

    if (thread.destruction == DESTRUCTION_SYNCHRONE)
    {
        pthread_testcancel();
    }
    usleep(20000);
}
Now I want to stream those images to my website hosted on another Raspberry Pi. I have looked into GStreamer, FFmpeg and sockets, but I didn't find any good example in C++ that worked for me. I'm trying to get the lowest latency possible.
Some people suggested using raspistill, but I can't open the camera in another program since it's already opened by OpenCV.
If you need more information let me know; any help is appreciated.
If you need to stream your camera images from an RPi over the network, there are many approaches to do that, based on your needs.
One approach is to use high-level applications like MJPG-streamer, RPi IP Camera, etc.
Another approach is to stream camera images over the network (via RTP, UDP, etc.) with GStreamer, FFmpeg, raspistill, etc. With this approach, you need a receiver app to get the streams (e.g. FFmpeg).
There is also the approach you already stated in your question: directly access the camera, capture images, and then transfer them manually over the network. With this approach you have more freedom to modify the design (like adding your own compression, encryption, etc.), but you have to take care of the network protocol yourself.
In your example, you can transfer each frame over the network with a simple TCP/IP socket, or you can build a simple web server. It is obvious that you can't access the cam with two apps at the same time. You can use v4l2loopback to create multiple camera interfaces and access them from multiple apps, but it won't solve your problem.
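If you go the plain-socket route, a minimal sketch of the sending side could look like this (assuming sock is an already-connected POSIX TCP socket and the receiver reads a 4-byte length prefix followed by the JPEG bytes; error handling kept minimal):
#include <opencv2/opencv.hpp>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <vector>
#include <cstdint>

// Encode one frame as JPEG and send it with a 4-byte big-endian length prefix.
bool sendFrameAsJpeg(int sock, const cv::Mat& frame)
{
    std::vector<uchar> jpeg;
    if (!cv::imencode(".jpg", frame, jpeg))
        return false;

    uint32_t len = htonl(static_cast<uint32_t>(jpeg.size()));
    if (send(sock, &len, sizeof(len), 0) != static_cast<ssize_t>(sizeof(len)))
        return false;

    size_t sent = 0;
    while (sent < jpeg.size())
    {
        ssize_t n = send(sock, jpeg.data() + sent, jpeg.size() - sent, 0);
        if (n <= 0)
            return false;
        sent += static_cast<size_t>(n);
    }
    return true;
}

// In the capture loop above you would then call, for example:
//   sendFrameAsJpeg(clientSocket, imgOriginal);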
There are good projects like rpi-webrtc-streamer and streameye which use low-level protocols to transfer images.

ParaView: Live point cloud visualization plugin

I am writing a ParaView 5.1.2 plugin in C++ to visualize point cloud data produced by a LiDAR sensor. I noticed that Velodyne has an open-source custom ParaView application called VeloView to visualize their LiDAR data. I tweaked some of their code to get started, but I am stuck now.
So far I wrote a reader that takes a pcap file and renders a point cloud that can be played back frame by frame. I also wrote a ParaView source that listens on a port, captures UDP packets, and then uses the reader to split them into frames and visualize the point cloud.
Now I would like to take live UDP packets and render the point cloud in real time as each frame is completed.
I am having trouble accomplishing this because of the ParaView plugin structure. Currently, my reader displays a frame when the RequestData method is called. My method looks something like this:
int RequestData(vtkInformation *request, vtkInformationVector **inputVector, vtkInformationVector *outputVector)
{
    vtkPolyData* output = vtkPolyData::GetData(outputVector);
    vtkInformation* info = outputVector->GetInformationObject(0);

    int timestep = 0;
    if (info->Has(vtkStreamingDemandDrivenPipeline::UPDATE_TIME_STEP()))
    {
        double timeRequest = info->Get(vtkStreamingDemandDrivenPipeline::UPDATE_TIME_STEP());
        int length = info->Length(vtkStreamingDemandDrivenPipeline::TIME_STEPS());
        timestep = static_cast<int>(floor(timeRequest + 0.5));
    }

    this->Open();
    // GetFrame returns a vtkSmartPointer<vtkPolyData> that is the frame
    output->ShallowCopy(this->GetFrame(timestep));
    this->Close();

    return 1;
}
The RequestData method is called every time the timestep is updated in the ParaView GUI. The frame for that timestep is then copied into the output vector.
I am not sure how to implement this with live data because, in that case, RequestData is not called since no timesteps are requested. I saw there is a way to keep RequestData executing by using CONTINUE_EXECUTING() like this:
request->Set(vtkStreamingDemandDrivenPipeline::CONTINUE_EXECUTING(), 1);
But I do not know if that is supposed to be used to visualize live data.
For now I am interested in simply reading live packets and throwing them away as soon as their frame is rendered. Does anyone know how I can achieve this?
In the code of VeloView (which is basically a bundled ParaView + Lidar plugin), the ParaView timesteps are changed by the main code, not by the Lidar plugin.
We advise you to start from the VeloView code, which is much closer to your goal.
If you really want to start from scratch within ParaView, you need to increment the requested timestep yourself.
The newest version of VeloView (unreleased) uses the same mechanism as the ParaView "LiveSource" plugin (available in 5.6+), where the plugin tells ParaView to set a Qt timer that automatically increments the available and requested timesteps.
request->Set(vtkStreamingDemandDrivenPipeline::CONTINUE_EXECUTING(), 1); relates to another mechanism that will run RequestData multiple times, but it won't take care of updating the requested timestep.
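For reference, a minimal sketch of that CONTINUE_EXECUTING pattern (StillStreaming is a hypothetical member flag; the advertised/requested timesteps still have to be updated separately, e.g. in RequestInformation):
#include <vtkInformation.h>
#include <vtkInformationVector.h>
#include <vtkPolyData.h>
#include <vtkStreamingDemandDrivenPipeline.h>

int RequestData(vtkInformation* request, vtkInformationVector**, vtkInformationVector* outputVector)
{
    vtkPolyData* output = vtkPolyData::GetData(outputVector);
    // ... copy the latest completed frame into "output" here ...

    if (this->StillStreaming)   // hypothetical flag set by the packet-capture thread
    {
        // Ask the executive to run RequestData again without a new GUI request.
        request->Set(vtkStreamingDemandDrivenPipeline::CONTINUE_EXECUTING(), 1);
    }
    else
    {
        // Clear the key so the pipeline can come to rest.
        request->Remove(vtkStreamingDemandDrivenPipeline::CONTINUE_EXECUTING());
    }
    return 1;
}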
Best,
Bastien Jacquet
VeloView project leader

Canon SDK (EDSDK) capture region of specified size for video stream

I am very new to the EDSDK, so sorry if the question is odd in places.
Is it possible to access the video stream and perform some operations on it using the SDK? I need this in order to capture a very thin region of interest (ROI) of a specified size (for example 3840x10 px) from each frame in the stream. Don't read this as compressing a frame; aspect ratios do not need to be preserved. In theory these changes should increase the fps, because the region will be very thin (or should they?).
I found the code snippet below in the official documentation, although it seems this only sends a signal to start and stop movie recording, without accessing the stream.
EdsUInt32 record_start = 4; // Begin movie shooting
err = EdsSetPropertyData(cameraRef, kEdsPropID_Record, 0, sizeof(record_start), &record_start);
EdsUInt32 record_stop = 0; // End movie shooting
err = EdsSetPropertyData(cameraRef, kEdsPropID_Record, 0, sizeof(record_stop), &record_stop);
I would be very thankful for any suggestions and help. Please feel free to ask for any additional information!
This SDK doesn't allow you direct access to high-resolution streams the way industrial cameras would. Over USB you can access ~960x640 live-view images as sequential JPEGs. Movie recording can only be done to the internal card, and the result can only be transferred after recording stops. Outside of this SDK, an external HDMI recorder gives access to a near-realtime feed at up to Full HD 1080p, depending on the model, and the feed is not always "clean".
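For completeness, a rough sketch of the usual EDSDK live-view download pattern the answer refers to (names are from the EDSDK headers, cameraRef is assumed to be an open EdsCameraRef; treat this as an illustration, not code verified against a specific SDK version):
// 1. Route live view to the PC.
EdsError err = EDS_ERR_OK;
EdsUInt32 device = kEdsEvfOutputDevice_PC;
err = EdsSetPropertyData(cameraRef, kEdsPropID_Evf_OutputDevice, 0, sizeof(device), &device);

// 2. For every frame: download one live-view JPEG into a memory stream.
EdsStreamRef stream = NULL;
EdsEvfImageRef evfImage = NULL;
err = EdsCreateMemoryStream(0, &stream);
err = EdsCreateEvfImageRef(stream, &evfImage);
err = EdsDownloadEvfImage(cameraRef, evfImage);

// 3. The stream now holds a live-view JPEG (~960x640, much narrower than 3840 px);
//    decode it and crop your ROI yourself (e.g. with OpenCV) - the SDK has no ROI mode.
EdsUInt64 length = 0;   // EdsUInt32 in older SDK versions
EdsVoid* data = NULL;
err = EdsGetLength(stream, &length);
err = EdsGetPointer(stream, &data);
// ... decode (data, length) as JPEG, crop, process ...

EdsRelease(evfImage);
EdsRelease(stream);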