Get the clearest IplImage from the buffer - C++

How can I get the clearest IplImage from a set of frames using OpenCV, when there is no training image to compare against?
As an example, there is a webcam input of a moving hand. When I stream this video I get 10 IplImages, but only the 5th one is clear. I want to filter out that 5th one using OpenCV.
It would be good to evaluate each and every IplImage (all 10 images) and rank them by clearness. Is there any way to do this?
I hope for your kind support.
Thank you.

Several measures of 'clearness' or blur in pictures have been quantified in research papers, so what you're looking for is probably to estimate the amount of blur present in each image and then use that to decide whether it should be discarded or not. I found this study to be quite helpful; it gives a comparative overview of the different metrics of blurriness. Google Scholar is always there with a search string like "estimating blur in images" if you want to check other resources.
You could choose whichever method gives you the best results with your set of images, experimentally determining which threshold of the measure will be 'clear' enough. Or of course you could run the blur estimation on every frame and choose the one with the lowest blur.
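As a concrete starting point, here is a minimal C++/OpenCV sketch of that ranking idea using one common quick-and-dirty sharpness measure, the variance of the Laplacian (my own assumption, not necessarily one of the metrics from the paper above). It uses the cv::Mat C++ API rather than the old IplImage one; if you are stuck with IplImage buffers you can wrap them with cv::cvarrToMat first.

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Variance of the Laplacian as a rough sharpness score: sharper frames
    // contain more high-frequency content, so the variance is larger.
    static double sharpnessScore(const cv::Mat& frame)
    {
        cv::Mat gray, lap;
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::Laplacian(gray, lap, CV_64F);
        cv::Scalar mean, stddev;
        cv::meanStdDev(lap, mean, stddev);
        return stddev[0] * stddev[0];                  // variance = stddev^2
    }

    int main()
    {
        cv::VideoCapture cap(0);                       // webcam
        std::vector<cv::Mat> buffer;
        cv::Mat frame;
        while (buffer.size() < 10 && cap.read(frame))  // grab 10 frames
            buffer.push_back(frame.clone());

        // Score each buffered frame and keep the sharpest one.
        int best = -1;
        double bestScore = -1.0;
        for (size_t i = 0; i < buffer.size(); ++i) {
            double s = sharpnessScore(buffer[i]);
            if (s > bestScore) { bestScore = s; best = static_cast<int>(i); }
        }
        if (best >= 0)
            cv::imwrite("sharpest.png", buffer[best]);
        return 0;
    }

If you want a full ranking of all ten frames rather than just the winner, sort the frame indices by their scores instead of keeping the maximum.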

Related

Extracting the descriptors of a bunch of sample images and training an SVM in OpenCV

I have 4 types of musical note symbols, all of the same color: whole note, half note, crotchet and quaver. I need to classify an image and tell whether it contains one of these symbols (just one for now) and which one. For example, if I have an image with just the musical staff (but nothing else in it), it should tell me that the image is empty, but if I have an image with a half note symbol in it, it should tell me something like "it is a half note".
Suppose I have 20 sample images for each possible symbol and 20 for the base case (nothing in them); I want to train an SVM to classify any input image. I've read about how I could do it, but I still have certain doubts. I think the process is something like this (and please correct me if I'm wrong):
Extract the descriptors of all the sample images.
Put those descriptors inside different Mat objects (one for each symbol).
Feed those Mats to the SVM to train it.
Use the SVM to classify the images.
I have specific doubts about what I think is the process:
Is what I described the correct process for what I need to do?
Should I pre-process the sample images (say, extract the background and apply Canny edges) before I feed them to the descriptor extractor? Or can I leave them as they are?
I have read about three methods of extracting descriptors: HOG, BOW (Bag of Words) and SIFT. I think they all do what I need, but I don't know which one to use. I see that HOG is used mostly (if not always) for face and pedestrian detection, and I don't know if it could be used for my case. Any advice on which one I should use?
How many sample images should I have for each case?
I don't need specific implementation details, but I do need answers to these questions. Thank you in advance.
I'm not an expert on SIFT and BOW, but I know something about HOG and SVM.
1. Is what I described the correct process for what I need to do?
If you are using OpenCV and HOG, no, that is not correct. Have a look at the HOG sample code in the OpenCV samples and you will find that, once extracted, the descriptors feed the SVM directly, without filling a Mat element.
2. Should I pre-process the sample images (say, extract the background and apply Canny edges) before I feed them to the descriptor extractor? Or can I leave them as they are?
This is not mandatory. Preprocessing has proved to be very useful, but for your simple case you won't need it. On the other hand, if your background contains drawings, stickers or anything else that can confuse the detector, then yes, it can be a good way to decrease the number of false positives.
3. I have read about three methods of extracting descriptors: HOG, BOW (Bag of Words) and SIFT. I think they all do what I need, but I don't know which one to use. I see that HOG is used mostly (if not always) for face and pedestrian detection, and I don't know if it could be used for my case. Any advice on which one I should use?
I have direct knowledge only of HOG. You can easily implement your own detector with HOG without any problem; I'm currently using it for traffic signs. Pay attention to the detection window that you want to use. You can leave all the other parameters as they are; that will work for simple cases.
4. How many sample images should I have for each case?
Once again, it depends on the situation. I would say that 200 images per class (try with fewer, too) will do the trick, but you can always increase the number by applying transformations to the positives. Try flipping, saturating or blurring the images.
Some more considerations: I think you can work with greyscale images, since color is not important for distinguishing the notes (they are all the same color, right?). If you have problems with false positives, you can try using the HSV color space to filter out the patches that you will then run the note detector on (it works really well with red!). The easiest way to train your SVM is to use a linear kernel and train one model for each class.
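To make the HOG + linear SVM route more concrete, here is a rough sketch under some stated assumptions: it uses the newer cv::ml::SVM interface (OpenCV 3+), whereas the samples mentioned above were written against the older CvSVM-era API; the file names, the HOG window size and the labels are placeholders; and the samples are assumed to be resizable to one fixed window. It trains a single binary model (half note vs. empty staff); you would repeat it per class.

    #include <opencv2/opencv.hpp>
    #include <opencv2/ml.hpp>
    #include <string>
    #include <vector>

    int main()
    {
        // Hypothetical file lists: positives contain a half note,
        // negatives contain only the empty staff.
        std::vector<std::string> posFiles = { "half_01.png", "half_02.png" /* ... */ };
        std::vector<std::string> negFiles = { "empty_01.png", "empty_02.png" /* ... */ };

        // HOG with a window size matching the (resized) sample images.
        cv::HOGDescriptor hog(cv::Size(64, 128), cv::Size(16, 16),
                              cv::Size(8, 8), cv::Size(8, 8), 9);

        cv::Mat trainData;                 // one descriptor per row
        std::vector<int> labels;

        auto addSamples = [&](const std::vector<std::string>& files, int label) {
            for (const auto& f : files) {
                cv::Mat img = cv::imread(f, cv::IMREAD_GRAYSCALE);
                if (img.empty()) continue;
                cv::resize(img, img, hog.winSize);
                std::vector<float> desc;
                hog.compute(img, desc);
                trainData.push_back(cv::Mat(desc).reshape(1, 1));  // as a row vector
                labels.push_back(label);
            }
        };
        addSamples(posFiles, +1);
        addSamples(negFiles, -1);
        trainData.convertTo(trainData, CV_32F);        // SVM expects CV_32F samples

        // Linear SVM, one binary model per symbol class.
        cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
        svm->setType(cv::ml::SVM::C_SVC);
        svm->setKernel(cv::ml::SVM::LINEAR);
        svm->train(trainData, cv::ml::ROW_SAMPLE, labels);
        svm->save("half_note_svm.yml");
        return 0;
    }

At classification time you would compute the HOG descriptor of the input image the same way and call svm->predict on it, asking each per-class model in turn.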

Face recognition using neural networks

I am doing a project on face recognition, and for that I have already used different methods like eigenfaces, fisherfaces, LBP histograms and SURF. But these methods are not giving me accurate results. SURF gives good matches for exactly the same images, but I need to match one image against different poses of the same person (wearing glasses, side pose, somebody covering part of his face), etc. LBP compares histograms of the images, i.e. only color information, so when there is high variation in lighting conditions it does not show good results. So I heard about neural networks, but I don't know much about them. Is it possible to train the system very accurately by using neural networks? If so, how can we do that?
According to this OpenCV page, there does seem to be some support for machine learning. That being said, the support seems to be a bit limited.
What you could do would be to:
Use OpenCV to extract the face of the person.
Convert the image to greyscale.
Try to manipulate the image so that the face is always the same size.
All the above should be doable with OpenCV itself (could be wrong, haven't messed with OpenCV in a while) so that should save you some time.
Next, you take the image, as a bitmap maybe, and feed the bitmap as a vector to the neural network. Alternatively, as @MatthiasB recommended, you could feed features instead of individual pixels. This would simplify the data being passed, making the network easier to train.
As for training, you manipulate these images as above, and then feed them to the network. If a person uses glasses occasionally, you could have cases of the same person with and without glasses, etc.
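To make those steps concrete, here is a minimal OpenCV sketch of the preprocessing part only (detect the face, convert to greyscale, resize to a fixed size). The Haar cascade file is the frontal-face one that ships with OpenCV, the 64x64 target size is an arbitrary choice of mine, and the histogram equalisation is an extra step I added to soften lighting differences; what you then feed the network (raw pixels or features, as discussed) is up to you.

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Detect the largest face, crop it, convert to greyscale and
    // resize to a fixed size so every training sample has the same shape.
    cv::Mat preprocessFace(const cv::Mat& image, cv::CascadeClassifier& detector,
                           cv::Size targetSize = cv::Size(64, 64))
    {
        cv::Mat gray;
        cv::cvtColor(image, gray, cv::COLOR_BGR2GRAY);

        std::vector<cv::Rect> faces;
        detector.detectMultiScale(gray, faces, 1.1, 3);
        if (faces.empty()) return cv::Mat();           // no face found

        // Keep the largest detection.
        cv::Rect best = faces[0];
        for (const auto& r : faces)
            if (r.area() > best.area()) best = r;

        cv::Mat face;
        cv::resize(gray(best), face, targetSize);
        cv::equalizeHist(face, face);                  // soften lighting variation
        return face;                                   // pixels (or features) go to the network
    }

    int main()
    {
        // The cascade path depends on your OpenCV installation.
        cv::CascadeClassifier detector("haarcascade_frontalface_default.xml");
        cv::Mat img = cv::imread("person.jpg");
        cv::Mat face = preprocessFace(img, detector);
        if (!face.empty()) cv::imwrite("face_64x64.png", face);
        return 0;
    }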

Remove noise from the computed optical flow

I compute the optical flow on grayscale videos which contain true-white and noisy-black patches besides the useful information. I want to remove those patches because the corresponding optical flow is meaningless.
Those patches are on the edges of the image and their sizes vary from one video to another. My goal is to extract, thanks to the optical flow, a bounding box describing the useful information in my video.
How can I compute this bounding box? Or at least, how can I remove the computed optical flow in those regions?
Edit: I saw your answers. I'll try that next weekend and then come back to discuss. Thank you!
Removing noise from optical flow can be a complicated task. A simple, naive way is to apply a threshold on the optical flow vector magnitude.
But if you only need to find bounding boxes, why not use a simple background/moving-object segmentation, such as MOG or GMG? OpenCV has nice implementations of them; they work well and are quite fast. See this tutorial.
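If you go the background-subtraction route, a minimal sketch with OpenCV's MOG2 implementation (cv::createBackgroundSubtractorMOG2, available since OpenCV 3; the file name and the median-blur clean-up are placeholders of mine) could look like this, drawing one bounding box per frame around the moving pixels:

    #include <opencv2/opencv.hpp>
    #include <vector>

    int main()
    {
        cv::VideoCapture cap("input.avi");
        cv::Ptr<cv::BackgroundSubtractor> mog = cv::createBackgroundSubtractorMOG2();

        cv::Mat frame, fgMask;
        while (cap.read(frame)) {
            mog->apply(frame, fgMask);

            // Clean the mask a little, then take the bounding box of all
            // foreground pixels as the "useful" region of this frame.
            cv::medianBlur(fgMask, fgMask, 5);
            std::vector<cv::Point> fgPoints;
            cv::findNonZero(fgMask, fgPoints);
            if (!fgPoints.empty()) {
                cv::Rect box = cv::boundingRect(fgPoints);
                cv::rectangle(frame, box, cv::Scalar(0, 255, 0), 2);
            }
            cv::imshow("motion box", frame);
            if (cv::waitKey(30) == 27) break;          // Esc to quit
        }
        return 0;
    }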
It's a little tough to understand what the problem is. If the noise consists of true-white and noisy-black patches in a grayscale image, as you have said, then I suggest you look at eroding and dilating. More information can be found here: Eroding and Dilating
Should this not be what you are asking, please post some sample images with the patches and comment, so that I can get a clearer idea of what the problem is. Cheers.
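In case it helps, a tiny sketch of that morphological clean-up, assuming the patches are small enough to be wiped out by an opening followed by a closing with a 5x5 kernel (the kernel size and file name are placeholders of mine):

    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::Mat img = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);

        // Opening (erode then dilate) removes small bright specks;
        // closing (dilate then erode) removes small dark holes.
        cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
        cv::Mat cleaned;
        cv::morphologyEx(img, cleaned, cv::MORPH_OPEN, kernel);
        cv::morphologyEx(cleaned, cleaned, cv::MORPH_CLOSE, kernel);

        cv::imwrite("cleaned.png", cleaned);
        return 0;
    }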
If I understand correctly, you are getting noisy optical flow in patches which are grey/white or basically uniform. A simple approach would be to divide the image into small patches and compute the entropy over each patch. Now, patches which have a very low entropy can be discarded by choosing an appropriate threshold because they do not contain much information.
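A rough sketch of that entropy idea: compute the grey-level histogram of each block and its Shannon entropy, then keep only blocks above a threshold; the block size and threshold below are arbitrary placeholders to tune on your own videos.

    #include <opencv2/opencv.hpp>
    #include <cmath>

    // Shannon entropy of the grey-level histogram of a patch.
    static double patchEntropy(const cv::Mat& patch)
    {
        int histSize = 256;
        int channels[] = {0};
        float range[] = {0.f, 256.f};
        const float* ranges[] = {range};
        cv::Mat hist;
        cv::calcHist(&patch, 1, channels, cv::Mat(), hist, 1, &histSize, ranges);

        double total = static_cast<double>(patch.total()), entropy = 0.0;
        for (int i = 0; i < histSize; ++i) {
            double p = hist.at<float>(i) / total;
            if (p > 0.0) entropy -= p * std::log2(p);
        }
        return entropy;
    }

    int main()
    {
        cv::Mat gray = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);
        const int block = 32;                          // patch size to tune
        const double threshold = 2.0;                  // entropy threshold to tune

        // Mark high-entropy blocks; flow in the remaining (near-uniform) blocks
        // can be discarded, and the mask's bounding box gives the useful region.
        cv::Mat keepMask = cv::Mat::zeros(gray.size(), CV_8U);
        for (int y = 0; y + block <= gray.rows; y += block)
            for (int x = 0; x + block <= gray.cols; x += block) {
                cv::Rect roi(x, y, block, block);
                if (patchEntropy(gray(roi)) > threshold)
                    keepMask(roi).setTo(255);
            }
        cv::imwrite("useful_region_mask.png", keepMask);
        return 0;
    }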

Stitching aerial images to create a map

I am working on a project to stitch together around 400 high-resolution aerial images, around 36000x2600, to create a map. I am currently using OpenCV, and so far I have obtained the match points between the images. Now I am at a loss figuring out how to get the matrix transformations of the images so I can begin the stitching process. I have absolutely no background in working with images or graphics, so this is a first time for me. Can I get some advice on how I would approach this?
The images that I received also came with a data sheet showing the longitude, latitude, airplane wing angle, altitude, etc. of each image. I am unsure how accurate these data are, but I am wondering if I can use this information to perform the proper matrix transformations that I need.
Thanks
Do you want to understand the math behind the process, or just have a superficial idea of what's going on and just use it?
The usual term for "image stitching" is image alignment. Feed Google with it and you'll find tons of sources.
For example, here.
Best regards,
zhengtonic
In the recent OpenCV 2.3 release they implemented a whole image stitching pipeline. Maybe it is worth looking at.
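In current OpenCV releases that pipeline is exposed through the cv::Stitcher class (the API has changed a bit since 2.3, so treat this as a sketch rather than the exact 2.3 interface). Something like the following estimates the transforms and composites the mosaic for you; the file names are placeholders, and SCANS mode is my assumption for roughly planar aerial imagery:

    #include <opencv2/opencv.hpp>
    #include <opencv2/stitching.hpp>
    #include <iostream>
    #include <vector>

    int main()
    {
        // Load a handful of overlapping aerial tiles (file names are placeholders).
        std::vector<cv::Mat> tiles;
        for (const auto& name : {"tile_01.jpg", "tile_02.jpg", "tile_03.jpg"}) {
            cv::Mat img = cv::imread(name);
            if (!img.empty()) tiles.push_back(img);
        }

        // SCANS mode assumes roughly planar scenes (e.g. aerial or scanned imagery)
        // and estimates affine transforms instead of full camera rotations.
        cv::Ptr<cv::Stitcher> stitcher = cv::Stitcher::create(cv::Stitcher::SCANS);
        cv::Mat map;
        cv::Stitcher::Status status = stitcher->stitch(tiles, map);

        if (status == cv::Stitcher::OK)
            cv::imwrite("map.jpg", map);
        else
            std::cerr << "Stitching failed, status " << int(status) << std::endl;
        return 0;
    }

For 400 large tiles you would probably stitch in smaller batches or feed your metadata as an initial guess, but the class shows which transforms the pipeline estimates.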

Assessing the quality of an image with respect to compression?

I have images that I am using for a computer vision task. The task is sensitive to image quality. I'd like to remove all images that are below a certain threshold, but I am unsure if there is any method/heuristic to automatically detect images that are heavily compressed via JPEG. Anyone have an idea?
Image Quality Assessment is a rapidly developing research field. As you don't mention being able to access the original (uncompressed) images, you are interested in no reference image quality assessment. This is actually a pretty hard problem, but here are some points to get you started:
Since you mention JPEG, there are two major degradation features that manifest themselves in JPEG-compressed images: blocking and blurring.
No-reference image quality assessment metrics typically look for those two features.
Blocking is fairly easy to pick up, as it appears only on macroblock boundaries. Macroblocks are a fixed size -- 8x8 or 16x16, depending on what the image was encoded with.
Blurring is a bit more difficult. It occurs because higher frequencies in the image have been attenuated (removed). You can break up the image into blocks, DCT (Discrete Cosine Transform) each block and look at the high-frequency components of the DCT result. If the high-frequency components are lacking for a majority of blocks, then you are probably looking at a blurry image (see the sketch after these points).
Another approach to blur detection is to measure the average width of edges in the image. Perform Sobel edge detection on the image and then measure the distance between local minima/maxima on each side of the edge. Google for "A no-reference perceptual blur metric" by Marziliano -- it's a famous approach. "No Reference Block Based Blur Detection" by Debing is a more recent paper.
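Here is a rough sketch of that block-DCT blur check, purely illustrative: DCT each 8x8 block of a greyscale image and count the blocks whose energy outside the low-frequency corner is essentially zero. The energy threshold and the 2x2 "low-frequency corner" are arbitrary assumptions to tune.

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main()
    {
        cv::Mat gray = cv::imread("photo.jpg", cv::IMREAD_GRAYSCALE);
        if (gray.empty()) return 1;
        gray.convertTo(gray, CV_32F);

        const int B = 8;                      // DCT block size
        const double hfThreshold = 1.0;       // placeholder for "no high-frequency energy"
        int flatBlocks = 0, totalBlocks = 0;

        for (int y = 0; y + B <= gray.rows; y += B)
            for (int x = 0; x + B <= gray.cols; x += B) {
                cv::Mat block = gray(cv::Rect(x, y, B, B)).clone();
                cv::Mat coeffs;
                cv::dct(block, coeffs);

                // Sum the magnitude of everything outside the 2x2 low-frequency corner.
                double hfEnergy = cv::sum(cv::abs(coeffs))[0]
                                - cv::sum(cv::abs(coeffs(cv::Rect(0, 0, 2, 2))))[0];
                if (hfEnergy < hfThreshold) ++flatBlocks;
                ++totalBlocks;
            }

        double flatRatio = totalBlocks ? double(flatBlocks) / totalBlocks : 0.0;
        std::cout << "fraction of low-detail blocks: " << flatRatio << std::endl;
        // A ratio close to 1 suggests a blurry (or heavily compressed) image.
        return 0;
    }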
Regardless of what metric you use, think about how you will deal with false positives/negatives. As opposed to simple thresholding, I'd use the metric result to sort the images and then snip the end of the list that looks like it contains only blurry images.
Your task will be a lot simpler if your image set contains fairly similar content (e.g. faces only). This is because the image quality assessment metrics can often be influenced by image content, unfortunately.
Google Scholar is truly your friend here. I wish I could give you a concrete solution, but I don't have one yet -- if I did, I'd be a very successful Masters student.
UPDATE:
Just thought of another idea: for each image, re-compress the image with JPEG and examine the change in file size before and after re-compression. If the file size after re-compression is significantly smaller than before, then it's likely the image is not heavily compressed, because it had some significant detail that was removed by re-compression. Otherwise (very little difference or file size after re-compression is greater) it is likely that the image was heavily compressed.
The quality setting you use during re-compression will let you decide what exactly "heavily compressed" means.
If you're on Linux, this shouldn't be too hard to implement using bash and ImageMagick's convert utility.
You can try other variations of this approach:
Instead of JPEG compression, try another form of degradation, such as Gaussian blurring
Instead of merely comparing file-sizes, try a full reference metric such as SSIM -- there's an OpenCV implementation freely available. Other implementations (e.g. Matlab, C#) also exist, so look around.
Let me know how you go.
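If you would rather stay inside OpenCV than script ImageMagick, here is a sketch of the same re-compression idea using cv::imencode: re-encode the image as JPEG at a fixed quality and compare the encoded size against the size of the original file. The quality value (75), the file name and the ratio threshold are placeholders to tune.

    #include <opencv2/opencv.hpp>
    #include <fstream>
    #include <iostream>
    #include <vector>

    int main()
    {
        const std::string path = "input.jpg";          // placeholder file name

        // Original file size on disk.
        std::ifstream f(path, std::ios::binary | std::ios::ate);
        const double originalBytes = static_cast<double>(f.tellg());

        // Re-encode at a fixed JPEG quality and measure the new size.
        cv::Mat img = cv::imread(path);
        std::vector<uchar> encoded;
        std::vector<int> params = { cv::IMWRITE_JPEG_QUALITY, 75 };
        cv::imencode(".jpg", img, encoded, params);
        const double recompressedBytes = static_cast<double>(encoded.size());

        // If re-compression barely shrinks the data (or grows it), the image
        // was probably already heavily compressed.
        const double ratio = recompressedBytes / originalBytes;
        std::cout << "recompressed/original size ratio: " << ratio << std::endl;
        if (ratio > 0.9)                               // placeholder threshold
            std::cout << "likely heavily compressed already" << std::endl;
        return 0;
    }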
I had many photos shot of an ancient book (so similar layout, two pages per image), but some were so blurred that the text could not be read. I searched for a ready-made batch script to find the most blurred ones, but I didn't find anything useful, so I took part of a script found on the net (based on ImageMagick, but no longer working; I couldn't retrieve the author for the credits!) that was useful for assessing the blur level of a single image, tweaked it, and automated it over a whole folder. I uploaded it here:
https://gist.github.com/888239
hoping it will be useful for someone else. It works on a Linux system and uses ImageMagick (and some commonly installed command-line tools, such as gawk, sort, grep, etc.).
One simple heuristic could be to check whether width * height * color depth < sigma * file size. You would have to determine a good value for sigma, of course; sigma depends on the expected entropy of the images you are looking at.
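Spelled out in code, that heuristic might look like the sketch below; sigma and the file name are placeholders, and I'm reading the color depth as bytes per pixel via cv::Mat::elemSize, which may or may not match the depth you have in mind.

    #include <opencv2/opencv.hpp>
    #include <fstream>
    #include <iostream>

    int main()
    {
        const std::string path = "input.jpg";          // placeholder file name
        const double sigma = 20.0;                     // tune for your image set

        cv::Mat img = cv::imread(path);
        std::ifstream f(path, std::ios::binary | std::ios::ate);
        const double fileSize = static_cast<double>(f.tellg());

        // width * height * color depth (bytes per pixel) vs. sigma * file size
        const double rawSize = double(img.cols) * img.rows * img.elemSize();
        if (rawSize < sigma * fileSize)
            std::cout << "compression ratio below sigma: probably acceptable" << std::endl;
        else
            std::cout << "compression ratio above sigma: probably heavily compressed" << std::endl;
        return 0;
    }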