Pattern Recognition in C++ - c++

I have a simple grayscale template image with a white background and a black shape on it, and several similar test images. I want to compare the template against each test image and see whether it matches any of them. Can you suggest a simple (easy-to-use) pattern recognition library for C++ that takes two images, compares them, and shows the result?

Just compute image1 - image2 for every pixel, then sum the absolute values of the differences (without the absolute value, positive and negative differences cancel out). The lower the result, the closer the images.
If your pattern can appear at several sizes, you have to resize it and check it at each position.
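As a rough sketch of the idea above, assuming the images are already loaded as 8-bit grayscale buffers: the `GrayImage` struct and the `sadAt` and `bestMatch` names are made up for this illustration and do not come from any library. It slides the template over the image and keeps the position with the lowest sum of absolute differences. Handling several sizes would mean resizing the template and repeating the search, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;  // size must be width * height

    std::uint8_t at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
};

// Sum of absolute differences between the template and the window of `image`
// whose top-left corner is (ox, oy). Lower means more similar; 0 is identical.
std::uint64_t sadAt(const GrayImage& image, const GrayImage& tmpl,
                    std::size_t ox, std::size_t oy) {
    assert(ox + tmpl.width <= image.width && oy + tmpl.height <= image.height);
    std::uint64_t sum = 0;
    for (std::size_t y = 0; y < tmpl.height; ++y) {
        for (std::size_t x = 0; x < tmpl.width; ++x) {
            const int d = static_cast<int>(image.at(ox + x, oy + y)) -
                          static_cast<int>(tmpl.at(x, y));
            sum += static_cast<std::uint64_t>(d < 0 ? -d : d);
        }
    }
    return sum;
}

struct Match {
    std::size_t x;
    std::size_t y;
    std::uint64_t score;
};

// Try every position where the template fits and return the best one.
Match bestMatch(const GrayImage& image, const GrayImage& tmpl) {
    assert(tmpl.width > 0 && tmpl.height > 0);
    assert(tmpl.width <= image.width && tmpl.height <= image.height);
    Match best{0, 0, std::numeric_limits<std::uint64_t>::max()};
    for (std::size_t oy = 0; oy + tmpl.height <= image.height; ++oy) {
        for (std::size_t ox = 0; ox + tmpl.width <= image.width; ++ox) {
            const std::uint64_t score = sadAt(image, tmpl, ox, oy);
            if (score < best.score) best = Match{ox, oy, score};
        }
    }
    return best;
}
```

To decide "match or no match", you would compare the returned `score` against a threshold chosen for your own images; no universal value exists.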

Implement a neural network on the image. The inputs should be the grayscale values of your image. Train the network on a training set, choose proper regularization parameters using a cross-validation set, and finally test the network on a test set.
http://www.codeproject.com/Articles/13582/Back-propagation-Neural-Net
(I have done this myself to train a network to recognise handwritten digits, and it works very well.)

How simple the library you need is depends on the specific parameters of your problem. OpenCV is a great image processing library that should be able to do what you need it to. Here is a tutorial on template matching in OpenCV. It makes it very easy to switch between matching metrics and choose the best one for your problem.

Related

How to find logos on a website screenshot

I'm looking for a way to check if a given logo appears on a screenshot of a webpage. So basically, I need to be able to find a small predefined image on a larger image that may or may not contain the smaller image. A match could be of a different scale, somewhat different colors. I need to judge occurrence similarity as well. Need some pointers for what to look at, I've never worked with computer vision before.
The simplest (though still not simple) way to do it is a plain CNN trained on an augmented dataset of the logos.
To keep the answer short: build a CNN in TensorFlow and train your model on a large number of logo images, with a label on each training image. It's a simple task, and even a fairly basic CNN should be able to get the job done.
CNN: Convolutional Neural Network
Reference : https://etasr.com/index.php/ETASR/article/view/3919

Extracting the descriptors of a bunch of sample images and training an SVM in OpenCV

I have 4 types of musical note symbols, all the same color: whole note, half note, crotchet, and quaver. I need to classify an image and tell whether it contains one of these symbols (just one for now) and which one. For example, if I have an image with just the musical staff (and nothing else in it), it should tell me that the image is empty, but if I have an image with a half note symbol in it, it should tell me something like "it is a half note".
Suppose I have 20 sample images for each possible symbol and 20 for the base case (nothing in it). I want to train an SVM to classify any input image. I've read about how I could do it, but I still have certain doubts. I think the process is something like this (please correct me if I'm wrong):
Extract the descriptors of all the sample images.
Put those descriptors inside different Mat objects (one for each symbol).
Feed those Mats to the SVM to train it.
Use the SVM to classify the images.
I have specific doubts about what I think is the process:
Is what I described the correct process for what I need to do?
Should I pre-process the sample images (say, extract the background and apply Canny edges) before I feed them to the descriptor extractor, or can I leave them as they are?
I have read about three methods of extracting the descriptors: HOG, BOW (Bag of Words) and SIFT. I think they all do what I need, but I don't know which one to use. I see that HOG is mostly (if not always) used for face and pedestrian detection, and I don't know if it could be used for my case. Any advice on which one I should use?
How many sample images should I have for every case?
I don't need specific details of the implementation, but I do need answers to these questions. Thank you in advance.
I'm not an expert on SIFT and BOW, but I know something about HOG and SVM.
1. Is what I described the correct process for what I need to do?
If you are using OpenCV and HOG, no, that is not correct. Have a look at the HOG sample code in the OpenCV samples and you will find that, once extracted, the descriptors are fed directly to the SVM without filling a Mat.
2. Should I pre-process the sample images (say, extract the background and apply Canny edges) before I feed them to the descriptor extractor, or can I leave them as they are?
This is not mandatory. Preprocessing has proved to be very useful, but for your simple case you won't need it. On the other hand, if the background contains drawings, stickers, or anything else that can confuse the detector, then yes: it can be a good way to decrease the number of false positives.
3. I have read about three methods of extracting the descriptors: HOG, BOW (Bag of Words) and SIFT. I think they all do what I need, but I don't know which one to use. I see that HOG is mostly (if not always) used for face and pedestrian detection, and I don't know if it could be used for my case. Any advice on which one I should use?
I have direct knowledge only of HOG. You can implement your own detector with HOG without any problem; I'm currently using it for traffic signs. Pay attention to the detection window that you want to use. You can leave all the other parameters as they are, and it will work for simple cases.
4. How many sample images should I have for every case?
Once again, it depends on the situation. I would say that 200 images per class (try with fewer as well) will do the trick, but you can always increase the number by applying some transformations to the positives. Try flipping, saturating, or blurring the images.
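To illustrate the augmentation idea, here is a small sketch that produces flipped and blurred variants of one grayscale sample. `GrayImage`, `flipHorizontal`, `boxBlur3` and `augment` are names invented for this example, and the 3x3 box blur is just one simple choice of blur. Whether a mirrored note is still a valid positive for your classes is something you would have to check on your own data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;  // size must be width * height
};

// Mirror the image left to right.
GrayImage flipHorizontal(const GrayImage& src) {
    assert(src.pixels.size() == src.width * src.height);
    GrayImage dst{src.width, src.height, std::vector<std::uint8_t>(src.pixels.size())};
    for (std::size_t y = 0; y < src.height; ++y)
        for (std::size_t x = 0; x < src.width; ++x)
            dst.pixels[y * src.width + (src.width - 1 - x)] = src.pixels[y * src.width + x];
    return dst;
}

// 3x3 box blur. Border pixels average only over the neighbours that exist.
GrayImage boxBlur3(const GrayImage& src) {
    assert(src.pixels.size() == src.width * src.height);
    GrayImage dst{src.width, src.height, std::vector<std::uint8_t>(src.pixels.size())};
    const long w = static_cast<long>(src.width);
    const long h = static_cast<long>(src.height);
    for (long y = 0; y < h; ++y) {
        for (long x = 0; x < w; ++x) {
            unsigned sum = 0;
            unsigned count = 0;
            for (long dy = -1; dy <= 1; ++dy) {
                for (long dx = -1; dx <= 1; ++dx) {
                    const long nx = x + dx;
                    const long ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                    sum += src.pixels[static_cast<std::size_t>(ny * w + nx)];
                    ++count;
                }
            }
            dst.pixels[static_cast<std::size_t>(y * w + x)] =
                static_cast<std::uint8_t>(sum / count);
        }
    }
    return dst;
}

// One original sample becomes four training samples.
std::vector<GrayImage> augment(const GrayImage& sample) {
    const GrayImage flipped = flipHorizontal(sample);
    return {sample, flipped, boxBlur3(sample), boxBlur3(flipped)};
}
```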
Some more considerations: I think you can work with grayscale images, since color is not needed to distinguish the notes (they are all the same color, right?). If you have problems with false positives, you can try using the HSV color space to filter out the patches that you will then use to detect the notes (it works really well with red!). The easiest way to train your SVM is to use a linear kernel and train one model for each class.
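The "one linear model per class" setup could look roughly like this at prediction time. This only sketches the decision step; the training itself (done by the SVM library) is not shown, and `LinearModel`, `decisionValue` and `predictOneVsRest` are hypothetical names for illustration. Each class has its own weight vector and bias, and the class whose model gives the highest score wins.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical container for one trained linear model (one per class).
struct LinearModel {
    std::vector<double> weights;
    double bias;
};

// Linear decision value: w . x + b. Larger means "more likely this class".
double decisionValue(const LinearModel& model, const std::vector<double>& features) {
    assert(model.weights.size() == features.size());
    return std::inner_product(model.weights.begin(), model.weights.end(),
                              features.begin(), model.bias);
}

// One-vs-rest prediction: evaluate every class model, return the best index.
std::size_t predictOneVsRest(const std::vector<LinearModel>& models,
                             const std::vector<double>& features) {
    assert(!models.empty());
    std::size_t best = 0;
    double bestScore = decisionValue(models[0], features);
    for (std::size_t i = 1; i < models.size(); ++i) {
        const double score = decisionValue(models[i], features);
        if (score > bestScore) {
            bestScore = score;
            best = i;
        }
    }
    return best;
}
```

In the notes problem, `features` would be the descriptor (for example HOG) of the input image, and there would be one model per symbol plus one for the empty staff.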

Face recognition using neural networks

I am doing a project on face recognition, and I have already tried different methods such as eigenfaces, fisherfaces, LBP histograms and SURF, but these methods are not giving me accurate results. SURF gives good matches for exactly the same image, but I need to match an image against different poses of the same person (wearing glasses, side pose, face partly covered, etc.). LBP compares histograms of images, i.e., only color information, so when there is high variation in lighting conditions it does not give good results. I have heard about neural networks, but I don't know much about them. Is it possible to train the system very accurately using neural networks? If so, how can we do that?
According to this OpenCV page, there does seem to be some support for machine learning. That being said, the support does seem to be a bit limited.
What you could do, would be to:
Use OpenCV to extract the face of the person.
Change the image to grey scale.
Try to manipulate so that the face is always the same size.
All the above should be doable with OpenCV itself (could be wrong, haven't messed with OpenCV in a while) so that should save you some time.
Next, you take the image, as a bitmap maybe, and feed the bitmap as a vector to the neural network. Alternatively, as @MatthiasB recommended, you could feed in extracted features instead of individual pixels. This would simplify the data being passed, making the network easier to train.
As for training, you manipulate these images as above, and then feed them to the network. If a person uses glasses occasionally, you could have cases of the same person with and without glasses, etc.
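A minimal sketch of the preprocessing steps above, assuming the face has already been cropped out (the detection step itself is left to OpenCV and not shown). `RgbImage` and `toNetworkInput` are invented names. The code converts to grayscale with the usual luma weights, resizes to a fixed size with nearest-neighbour sampling, and scales the values to [0, 1] so every face yields an input vector of the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cropped face image: row-major, interleaved R, G, B bytes.
struct RgbImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> data;  // size must be width * height * 3
};

// Grayscale + fixed-size resize + normalisation, flattened into one vector.
std::vector<float> toNetworkInput(const RgbImage& face,
                                  std::size_t outWidth, std::size_t outHeight) {
    assert(face.width > 0 && face.height > 0 && outWidth > 0 && outHeight > 0);
    assert(face.data.size() == face.width * face.height * 3);

    std::vector<float> input;
    input.reserve(outWidth * outHeight);
    for (std::size_t oy = 0; oy < outHeight; ++oy) {
        // Nearest-neighbour: map each output row/column back to a source one.
        const std::size_t sy = oy * face.height / outHeight;
        for (std::size_t ox = 0; ox < outWidth; ++ox) {
            const std::size_t sx = ox * face.width / outWidth;
            const std::uint8_t* p = &face.data[(sy * face.width + sx) * 3];
            // Standard luma weights for R, G, B.
            const float gray = 0.299f * p[0] + 0.587f * p[1] + 0.114f * p[2];
            input.push_back(gray / 255.0f);
        }
    }
    return input;
}
```

For example, `toNetworkInput(face, 32, 32)` would give a 1024-element vector; the 32x32 size is only an illustrative choice.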

Setting parameters for BRISK in OpenCV

I'm trying to use the BRISK implementation in OpenCV (for C++) to check whether an image (or part of an image) appears in a photo. For example, I take a photo, try to match it against a set of images in a database, and would like to select the best corresponding image (or get an error message if none of the images is good enough).
So, I'm just testing OpenCV for the moment. I've simply taken the sample included in the framework (matching_to_many_images) and changed the detector and descriptor from SURF to BRISK.
However, I have weird results. These are the results of matching (BruteForce Hamming):
In the first one, the scenes are entirely different, but there are a lot of matches!
In the second one, the scenes are pretty similar, but some matches are wrong.
I think this is a parameter issue, because in the demo videos of BRISK the results are much better.
Have you seen the OpenCV documentation for BRISK? I'm not sure what parameters you're using now, but you can specify the threshold and octaves, as well as the pattern. Documentation at
http://docs.opencv.org/modules/features2d/doc/feature_detection_and_description.html#brisk
You could also try a different feature matching algorithm, although it appears that the BRISK paper also used Hamming distance.
Lastly, it's not too unexpected to have erroneous feature matches; try out different scenes as well as different feature parameters and see how your results change.
There are commonly many incorrect initial matches when doing feature-feature matching using SIFT, SURF, BRISK, or any other local descriptor.
Many of these initial matches will be incorrect due to ambiguous features or features that arise from background clutter. [From Distinctive Image Features from Scale Invariant Keypoints]
The next step is to select only a subset of those matches that all agree on a common transformation between the two images. This is explained in sections 7.3 and 7.4 of Distinctive Image Features from Scale Invariant Keypoints.
The OpenCV Tutorial gives an excellent example of how to extract features and calculate a homography (a transformation that tells you how to transform each point from one image to the other one).
You can replace the feature detector/descriptor with any other one, which will result in different robustness to certain transformations like rotation and scaling, or to errors like blur or illumination change. The basic implementation of BRISK already has meaningful parameters defined.
Last but not least, if you try to match two completely different images, what would you expect as a result? The algorithm will try to find similarities, and therefore always calculate a result, even if it is nonsense and the scores are very low. Just keep in mind: garbage in -> garbage out.
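To make the "keep only the matches that agree on a common transformation" step concrete, here is a deliberately simplified sketch. It assumes the transformation is a pure translation rather than a full homography, so it is a stand-in for what the tutorial does, not the same method. `Point`, `PointMatch`, `consensusInliers` and `imagesLikelyMatch` are invented names. Each match proposes a translation, the proposal supported by the most matches wins, and too few supporters means the images probably do not match.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Point {
    double x;
    double y;
};

// One tentative feature match: keypoint `a` in image 1, keypoint `b` in image 2.
struct PointMatch {
    Point a;
    Point b;
};

// Returns the indices of the largest group of matches that agree (within
// `tolerance` pixels) on a single translation between the two images.
std::vector<std::size_t> consensusInliers(const std::vector<PointMatch>& matches,
                                          double tolerance) {
    assert(tolerance >= 0.0);
    std::vector<std::size_t> best;
    for (std::size_t i = 0; i < matches.size(); ++i) {
        // Hypothesis: the translation implied by match i.
        const double dx = matches[i].b.x - matches[i].a.x;
        const double dy = matches[i].b.y - matches[i].a.y;
        std::vector<std::size_t> inliers;
        for (std::size_t j = 0; j < matches.size(); ++j) {
            const double ex = (matches[j].b.x - matches[j].a.x) - dx;
            const double ey = (matches[j].b.y - matches[j].a.y) - dy;
            if (std::hypot(ex, ey) <= tolerance) inliers.push_back(j);
        }
        if (inliers.size() > best.size()) best = std::move(inliers);
    }
    return best;
}

// Reject the image pair when too few matches agree with each other.
bool imagesLikelyMatch(const std::vector<PointMatch>& matches,
                       double tolerance, std::size_t minInliers) {
    return consensusInliers(matches, tolerance).size() >= minInliers;
}
```

With real data you would estimate a homography with a robust method instead (OpenCV provides `cv::findHomography` with a RANSAC option) and count its inliers in the same way. The values for `tolerance` and `minInliers` have to be tuned for your images.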

Image Correspondence - Matching regions of images

I have 2 images with the same content, but possibly at a different scale or rotation. The problem is that I have to find the regions of these images and match them with one another. For example, if I have a circle in image1, I have to find the corresponding circle in image2.
I'd just like to ask what the proper way of solving this is. I am looking at OpenCV's matchShapes. I believe this problem is called image correspondence, but I really have no idea how to solve it!
Thanks in advance!
I have the following images:
Template Image => https://lh6.googleusercontent.com/-q5qeExXUlpc/T7SbL9yWmCI/AAAAAAAAByg/gV_vM1kyLnU/w348-h260-n-k/1.labeled.jpg
Sample Image => https://lh4.googleusercontent.com/-x0IWxV7JdbI/T7SbNjG5czI/AAAAAAAAByw/WSu-y5O7ee4/w348-h260-n-k/2.labeled.jpg
Note that the numbers on the images correspond to the proper matching of regions. These are not present when comparing the images.
As usual with computer vision problems, you can never provide too much information or make too many assumptions about the data you intend to analyze. Solving the general problem is close to impossible, as we can't do human-level pattern recognition with computers. What does your problem set look like? A few examples would be very helpful in trying to provide good answers.
You mention that the images have the same content, but with different colors. If that means it's the same scene photographed under different lighting conditions and from possibly different angles, you might need to do a rigid image registration first, so that the feature points in the two images overlap. If the shapes in your images might be distorted in multiple ways relative to each other, you might be interested in non-rigid image registration.
If you already know the objects you are looking for, you can simply do a search for these objects in both images, for example with chamfer matching or any other matching algorithm.
Use ORB feature detector from OpenCV. Once you have the descriptors use BFMatcher with norm type NORM_HAMMING.
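For intuition about what brute-force matching with Hamming distance does, here is a plain C++ sketch that does not use OpenCV. It assumes 32-byte (256-bit) binary descriptors, which is the usual ORB descriptor size; `Descriptor`, `hammingDistance`, `DescriptorMatch` and `matchBruteForce` are names made up for this example. In practice `cv::BFMatcher` with `cv::NORM_HAMMING` does this work for you.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumed layout: one binary descriptor = 32 bytes = 256 bits.
using Descriptor = std::array<std::uint8_t, 32>;

// Hamming distance: number of bit positions where the two descriptors differ.
unsigned hammingDistance(const Descriptor& a, const Descriptor& b) {
    unsigned distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const unsigned diff = static_cast<unsigned>(a[i] ^ b[i]);
        distance += static_cast<unsigned>(std::bitset<8>(diff).count());
    }
    return distance;
}

struct DescriptorMatch {
    std::size_t queryIdx;
    std::size_t trainIdx;
    unsigned distance;
};

// For every query descriptor, find the closest train descriptor by trying
// them all. Returns no matches if `train` is empty.
std::vector<DescriptorMatch> matchBruteForce(const std::vector<Descriptor>& query,
                                             const std::vector<Descriptor>& train) {
    std::vector<DescriptorMatch> matches;
    if (train.empty()) return matches;
    for (std::size_t q = 0; q < query.size(); ++q) {
        DescriptorMatch best{q, 0, std::numeric_limits<unsigned>::max()};
        for (std::size_t t = 0; t < train.size(); ++t) {
            const unsigned d = hammingDistance(query[q], train[t]);
            if (d < best.distance) {
                best.trainIdx = t;
                best.distance = d;
            }
        }
        matches.push_back(best);
    }
    return matches;
}
```

The matches with small distances are the candidates for corresponding regions; the nearest neighbour is always returned, so a distance cutoff or a geometric consistency check is still needed to drop bad ones.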