How to combine a series of images into a single motion history image for a presentation? - c++

I am working on a project with gesture recognition. Now I want to prepare a presentation in which I can only show images. I have a series of images defining a gesture, and I want to show them in a single image just like motion history images are shown in literature.
My question is simple: which functions in OpenCV can I use to make a motion history image from, let's say, 10 or more images defining the motion of a hand?
As an example I have the following image, and I want to show the hand's location, with opacity directly dependent on the time reference.
I tried using GIMP to merge layers with different opacity to achieve the same thing, but the output is not good.

You could use cv::updateMotionHistory.
Actually, OpenCV also demonstrates its usage in samples/c/motempl.c
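For reference, here is a minimal sketch (not taken from motempl.c) of how a sequence of hand images could be collapsed into one MHI-style picture with the OpenCV 2.x API; the file names, frame count and threshold are placeholders, and in OpenCV 3+ the same function lives in the optflow contrib module as cv::motempl::updateMotionHistory. Brighter pixels correspond to more recent motion, which is the time-dependent opacity effect asked about.

```cpp
// Hedged sketch, OpenCV 2.x API: collapse N frames into one motion history image.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/video/tracking.hpp>
#include <cstdio>
#include <vector>

int main()
{
    const int N = 10;                       // number of images in the gesture
    std::vector<cv::Mat> frames;
    for (int i = 0; i < N; ++i)
    {
        char name[64];
        std::sprintf(name, "hand%02d.png", i);  // hypothetical file names
        frames.push_back(cv::imread(name, cv::IMREAD_GRAYSCALE));
    }

    // The MHI is a float image; duration controls how long old motion stays visible.
    cv::Mat mhi = cv::Mat::zeros(frames[0].size(), CV_32FC1);
    const double duration = N;

    for (int i = 1; i < N; ++i)
    {
        // Silhouette of the moving hand = thresholded difference of consecutive frames.
        cv::Mat diff, silhouette;
        cv::absdiff(frames[i], frames[i - 1], diff);
        cv::threshold(diff, silhouette, 30, 255, cv::THRESH_BINARY);

        // Timestamp = frame index, so newer motion is stored with larger values.
        cv::updateMotionHistory(silhouette, mhi, static_cast<double>(i), duration);
    }

    // Scale to 8 bits for display: brighter pixels = more recent hand positions.
    cv::Mat vis;
    cv::normalize(mhi, vis, 0, 255, cv::NORM_MINMAX, CV_8UC1);
    cv::imwrite("mhi.png", vis);
    return 0;
}
```

If you prefer the actual hand images faded rather than a grayscale trail, the normalized MHI can also be used as a per-pixel weight when blending the original frames.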

Related

How to use CNN-LSTMs to classify image sequences for multiple bounding boxes in a video stream?

I am working on a PyTorch project where I'm using a webcam video stream. An object detector is used to find objects within the frame, and each box is given an ID by a tracker. Then I want to analyse each bounding box with a CNN-LSTM and classify it (binary classification) based on the previous frame sequence of that box (the last 5 frames). I want the program to run as close to real-time as possible.
Currently I am stuck with the CNN-LSTM part of my problem - the detector and tracker are working quite well already.
I am a little bit clueless on how to approach this task. Here are the questions I have:
1) How does inference work in this case? Do I have to save NumPy arrays for each bounding box containing the last 5 frames, then add the current frame and delete the oldest one, and then run the model for each bounding box in the current frame? This sounds very slow and inefficient. Is there a faster or easier way? (A rough sketch of this buffering idea is included after this question.)
2) Do you have any tips for creating the dataset? I have a couple of videos with bounding boxes and labels. Should I loop through the videos and save each frame sequence for each bounding box in a new folder, together with a CSV that contains the label? I have never worked with a CNN-LSTM, so I don't know how to load the data for training.
3) Would it be possible to use the extracted features of the CNN in parallel? As mentioned above, the extracted features should be used by the LSTM for a binary classification problem. The classification is only needed for the current frame. I would also like to use an additional classifier (8 classes) based on the extracted CNN features, again only for the current frame. For this classifier, the LSTM is not needed.
Since my explanation is probably very confusing, the following image hopefully helps with understanding what I want to build:
Architecture
This is the architecture I want to use. Is this possible using PyTorch? So far, I have only worked with CNNs and LSTMs separately. Any help is appreciated :)
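Regarding question 1), here is a rough sketch (not from the question) of the per-track rolling buffer, written in C++ for consistency with the rest of this page; in Python it maps directly onto a dict of collections.deque. The tracker IDs and the 5-frame window come from the question text, everything else is an assumption.

```cpp
// Hedged sketch: one deque of the last 5 crops per tracker ID.
#include <opencv2/core/core.hpp>
#include <deque>
#include <unordered_map>

struct TrackBuffers
{
    static const size_t kWindow = 5;                        // last 5 frames per box
    std::unordered_map<int, std::deque<cv::Mat> > buffers;  // keyed by tracker ID

    // Call once per frame for every tracked box: append the new crop and
    // drop the oldest one once the window is full.
    void push(int trackId, const cv::Mat& boxCrop)
    {
        std::deque<cv::Mat>& q = buffers[trackId];
        q.push_back(boxCrop.clone());
        if (q.size() > kWindow)
            q.pop_front();
    }

    // Only boxes with a full window are handed to the CNN-LSTM.
    bool ready(int trackId) const
    {
        auto it = buffers.find(trackId);
        return it != buffers.end() && it->second.size() == kWindow;
    }
};
```

To keep this fast, the CNN backbone can be run once on each crop as it arrives and only the resulting feature vector stored in the buffer, so the LSTM (and the additional 8-class classifier from question 3) reuses those features instead of re-encoding 5 images per box every frame.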

Using OpenCV to touch and select object

I'm using the OpenCV framework on iOS (Xcode, Objective-C). Is there a way I could process the image feed from the video camera, let the user touch an object on the screen, and then use some functionality in OpenCV to highlight it?
Here is graphically what I mean. The first image shows an example of what the user might see in the video feed:
Then, when they tap on the screen of the iPad, I want to use OpenCV feature/object detection to process the area they've tapped and highlight it. It would look something like this if they tapped the iPad:
Any ideas on how this would be achievable in Objective-C with OpenCV?
I can see quite easily how we could achieve this using trained templates of the iPad and matching them with OpenCV algorithms, but I want to try to make it dynamic, so users can just touch anything on the screen and we'll take it from there.
Explanation: why we should use the segmentation approach
According to my understanding, the task you are trying to solve is segmentation of objects, regardless of their identity.
The object recognition approach is one way to do it, but it has two major downsides:
It requires you to train an object classifier and to collect a dataset containing a respectable number of examples of the objects you would like to recognize. If you choose a classifier which is already trained, it won't necessarily work on every type of object you would like to detect.
Most object recognition solutions find a bounding box around the recognized object, but they don't perform a complete segmentation of it. The segmentation part requires extra effort.
Therefore, I believe that the best way for your case is to use an image segmentation algorithm. More precisely, we'll be using the GrabCut segmentation algorithm.
The GrabCut algorithm
This is an iterative algorithm with two stages:
Initial stage: the user specifies a bounding box around the object.
Given this bounding box, the algorithm estimates the color distributions of the foreground (the object) and the background using GMMs, followed by a graph-cut optimization that finds the optimal boundary between the foreground and the background.
In the next stage, the user may correct the segmentation if needed by supplying scribbles of the foreground and the background. The algorithm updates the model accordingly and performs a new segmentation based on the updated information.
Using this approach has pros and cons.
The pros:
The segmentation algorithm is easy to implement with OpenCV.
It enables the user to fix segmentation errors if needed.
It doesn't rely on collecting a dataset and training a classifier.
The main con is that you will need an extra source of information from the user besides a single tap on the screen. This information is a bounding box around the object and, in some cases, additional scribbles to correct the segmentation.
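To illustrate how compact the basic call is, here is a minimal, non-interactive sketch (the rectangle coordinates and file names are placeholders; the interactive sample linked below adds the rectangle selection and scribble handling):

```cpp
// Hedged sketch: one non-interactive GrabCut pass initialised from a rectangle.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

int main()
{
    cv::Mat image = cv::imread("scene.jpg");    // placeholder input image

    // Bounding box supplied by the user around the object to segment.
    cv::Rect rect(100, 80, 200, 150);

    cv::Mat mask, bgModel, fgModel;
    cv::grabCut(image, mask, rect, bgModel, fgModel, 5, cv::GC_INIT_WITH_RECT);

    // Keep pixels labelled as definite or probable foreground.
    cv::Mat fgMask = (mask == cv::GC_FGD) | (mask == cv::GC_PR_FGD);
    cv::Mat object;
    image.copyTo(object, fgMask);

    cv::imwrite("segmented.png", object);
    return 0;
}
```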
Code
Luckily, there is an implementation of this algorithm in OpenCV. The OpenCV repository ships a simple and easy-to-use sample for the GrabCut algorithm, which can be found here: https://github.com/Itseez/opencv/blob/master/samples/cpp/grabcut.cpp
Application usage:
The application receives a path to an image file as a command line argument. It renders the image on the screen, and the user is required to supply an initial bounding rectangle.
The user can press 'n' to perform the segmentation for the current iteration, or press 'r' to revert the operation.
After choosing a rectangle, the segmentation is calculated. If the user wants to correct it, they can add foreground or background scribbles by pressing Shift+left-click and Ctrl+left-click, respectively.
Examples
Segmenting the iPod:
Segmenting the pen:
You can do it by training a classifier on iPad images using OpenCV Haar classifiers and then detecting iPad images in a given frame.
Then, based on the coordinates of the touch, check whether that area overlaps with a detected iPad image area. If it does, draw a bounding box on the detected object; from there you can proceed with processing the detected iPad image.
Repeat the above procedure for the number of objects you want to detect.
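As a rough sketch of this idea (not a complete app): assuming you have already trained a cascade, here called ipad_cascade.xml, and have mapped the touch into image coordinates, the detection and overlap test could look like this:

```cpp
// Hedged sketch: Haar cascade detection plus a touch-point containment test.
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/objdetect/objdetect.hpp>
#include <vector>

// Returns the detected object rectangle that contains the touch point,
// or an empty rectangle if the touch missed every detection.
cv::Rect findTouchedObject(const cv::Mat& frame, const cv::Point& touch,
                           cv::CascadeClassifier& cascade)
{
    cv::Mat gray;
    cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
    cv::equalizeHist(gray, gray);

    std::vector<cv::Rect> detections;
    cascade.detectMultiScale(gray, detections, 1.1, 3);

    for (size_t i = 0; i < detections.size(); ++i)
        if (detections[i].contains(touch))
            return detections[i];

    return cv::Rect();
}
```

The cascade would be loaded once with cascade.load("ipad_cascade.xml"), and on a tap you would convert the UIKit touch coordinates into the frame's pixel coordinates before calling this.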
The task you are trying to solve is called "object proposal" generation. It doesn't work very accurately yet, and these results are quite new.
These two articles give you a good overview of methods for this:
https://pdollar.wordpress.com/2013/12/10/a-seismic-shift-in-object-detection/
https://pdollar.wordpress.com/2013/12/22/generating-object-proposals/
For state-of-the-art results, look for the latest CVPR papers on object proposals. Quite often they have code available to test.

Auto white balancing for camera

I am developing a sample camera and am able to control the image sensor directly. The sensor gives out a Bayer image, and I need to show the images as a live view.
I have looked at debayering code and also at white balancing. Is there any library in C/C++ that can help me with this process?
Since I need a live view, these steps have to run very fast, so I need very fast algorithms.
For example, I can change the RGB gains on the sensor, so I need an algorithm that acts at that level instead of acting on the generated image.
Is there also any library that helps with saving images in raw format?
SimpleCV has a function for white balance control:
simplecv project web site
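As an illustration of a gain-level approach (a generic gray-world sketch, not part of SimpleCV): you can compute per-channel gains from a demosaiced BGR preview frame, with the idea that those gains are then written to the sensor's R/G/B gain registers rather than applied in software.

```cpp
// Hedged sketch: gray-world white balance gains from a BGR preview frame.
#include <opencv2/core/core.hpp>
#include <algorithm>

void grayWorldGains(const cv::Mat& bgr, double& gainR, double& gainG, double& gainB)
{
    // Mean of each channel over the whole frame (index 0 = B, 1 = G, 2 = R),
    // clamped away from zero to avoid division problems on degenerate frames.
    cv::Scalar mean = cv::mean(bgr);
    double b = std::max(mean[0], 1e-6);
    double g = std::max(mean[1], 1e-6);
    double r = std::max(mean[2], 1e-6);

    // Gray-world assumption: the scene average should be neutral gray,
    // so each channel is scaled towards the common gray level.
    double gray = (b + g + r) / 3.0;
    gainB = gray / b;
    gainG = gray / g;
    gainR = gray / r;
}
```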

C++ OpenCV sky image stitching

Some background:
Hi all! I have a project that involves cloud imaging. I take pictures of the sky using a camera mounted on a rotating platform, and I then need to compute the amount of cloud present based on some color threshold. I am able to do this individually for each picture. To fully achieve my goal, I need to do the computation on a whole image of the sky, so my problem lies in stitching several images (about 44-56 of them). I've tried using the stitch function on the full set and on some subsets of the images, but it returns an incomplete result (some images were not stitched). This could be because of a lack of overlap or something, I don't know. Also, the output image is distorted in a strange way (I am actually expecting the output to be something similar to a picture taken with a fish-eye lens).
The actual problem:
So now I'm trying to figure out the opencv stitching pipeline. Here is a link:
http://docs.opencv.org/modules/stitching/doc/introduction.html
Based on what I have researched, I think this is what I want to do: I want to map all the images onto a circular shape, mainly because of the way my camera rotates, or onto something else that uses a fairly simple coordinate transformation. So I think I need to get some sort of fixed coordinate transform for the images. Is this what they call a homography? If so, does anyone have any idea how I can go about my problem? After this, I believe I need to get a mask for blending the images. Will I need a fixed mask, like the one I want for my homography?
Am I heading down a feasible path? I have some background in programming but almost none in image processing. I'm basically lost. T.T
"So I think I need get some sort of fixed coordinate transform thing for the images. Is this what they call the homography?"
Yes, the homography matrix is the transformation matrix between an original image and the ideal result. It warps an image in perspective so that it can be stitched to the other image.
"If so, does anyone have any idea how I can go about my problem?"
Not with the limited information you provided. It would ease the problem a lot if you knew the order of the pictures (which borders which: row and column position).
If you have no experience in image processing, I would recommend a tutorial that covers stitching with the more basic functions in detail. There is some important work behind the scenes, and it's not THAT hard to actually do it yourself.
Start with this example. It stitches two pictures.
http://ramsrigoutham.com/2012/11/22/panorama-image-stitching-in-opencv/
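To show what such a two-image stitch boils down to, here is a bare-bones sketch using features, a homography, and a perspective warp (OpenCV 2.4-style API; file names are placeholders and all error handling is omitted):

```cpp
// Hedged sketch: stitch img2 onto img1 via an ORB-based homography.
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <vector>

int main()
{
    cv::Mat img1 = cv::imread("sky1.jpg");   // placeholder file names;
    cv::Mat img2 = cv::imread("sky2.jpg");   // img2 gets warped into img1's frame

    // 1. Detect features and compute descriptors in both images.
    cv::ORB orb(2000);
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    orb(img1, cv::Mat(), kp1, desc1);
    orb(img2, cv::Mat(), kp2, desc2);

    // 2. Match descriptors (Hamming distance for binary ORB descriptors).
    cv::BFMatcher matcher(cv::NORM_HAMMING, true /*crossCheck*/);
    std::vector<cv::DMatch> matches;
    matcher.match(desc2, desc1, matches);

    // 3. Homography from the matched points (RANSAC rejects bad matches).
    std::vector<cv::Point2f> pts1, pts2;
    for (size_t i = 0; i < matches.size(); ++i)
    {
        pts2.push_back(kp2[matches[i].queryIdx].pt);
        pts1.push_back(kp1[matches[i].trainIdx].pt);
    }
    cv::Mat H = cv::findHomography(pts2, pts1, cv::RANSAC);

    // 4. Warp img2 into img1's coordinate frame and paste img1 on top.
    cv::Mat panorama;
    cv::warpPerspective(img2, panorama, H,
                        cv::Size(img1.cols + img2.cols, img1.rows));
    img1.copyTo(panorama(cv::Rect(0, 0, img1.cols, img1.rows)));

    cv::imwrite("stitched.jpg", panorama);
    return 0;
}
```

In OpenCV 3+ you would create the detector with cv::ORB::create() and call detectAndCompute instead of the functor-style call used here.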

adding cliparts to image/video OpenCV

I'm building a webcam application as my C++ project in my college. I am integrating Qt (for the GUI) and OpenCV (for image processing). My application will be a simple webcam app that accesses the webcam, shows/records videos, captures images, and does other things.
Well, I also want to add a feature for putting clipart onto captured images or the streaming video. During my research, I found out that there is no way to overlay two images using OpenCV. The best alternative I was able to find was to write the clipart directly into the original image, making it a single image. That's not going to work for me, as I have to be able to move, resize, or rotate the clipart on my canvas.
So, I was wondering if anybody could tell me how to achieve the effect I want most efficiently.
I would really appreciate your help. The deadline for the project submission is closing in, and it's a huge bump on the road to completion. PLEEEASE... HELP!!
If you just want to stick a logo onto the OpenCV image, then you simply define a region of interest (ROI) on the destination video image and copy the source image into it (the details vary with each version of OpenCV).
If you want the logo to be semi-transparent, like a TV channel ID, then you can copy the image but loop over the pixels, writing a destination value of source_pixel/2 + dest_pixel/2.
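A small sketch of both variants (placeholder file names and position; cv::addWeighted does the 50/50 blend in one call instead of the manual per-pixel loop):

```cpp
// Hedged sketch: opaque ROI copy vs. semi-transparent blend of a clipart.
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

int main()
{
    cv::Mat frame = cv::imread("frame.png");    // current video frame
    cv::Mat logo  = cv::imread("clipart.png");  // clipart, already sized as wanted

    // Region of the frame where the clipart should appear.
    cv::Rect where(20, 20, logo.cols, logo.rows);
    cv::Mat roi = frame(where);

    // Opaque overlay: just copy the clipart into the ROI.
    // logo.copyTo(roi);

    // Semi-transparent overlay: roi = 0.5*logo + 0.5*roi, written back in place.
    cv::addWeighted(logo, 0.5, roi, 0.5, 0.0, roi);

    cv::imwrite("with_clipart.png", frame);
    return 0;
}
```

Moving or resizing the clipart then just means changing the ROI rectangle and resizing the logo before the blend on each frame.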