I was asked to recognize a logo in an image using OpenCV. The lecturer told me that I don't have to do logo detection, only logo recognition. I am using OpenCV in C++. What is the easiest way to do it?
P.S.: I am a newbie in computer vision.
It largely depends on the kind of images you have.
If your logo occupies, say, 90% of the image, you don't need detection, since comparing color histograms will probably be good enough.
If the logo is small compared to the image, you should "find" the logo first, so that your comparison focuses on it and not on the background clutter.
Could there be multiple logos in the same image?
Is the logo always fully visible?
Is the logo rigid, or could it be deformed? (Think, for example, of a logo on a shirt or a small bottle.)
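To make the color-histogram idea concrete, here is a minimal sketch in plain C++ without OpenCV. The `RgbImage` struct, the 8 bins per channel and the function names are assumptions chosen for illustration; in OpenCV you would probably reach for `cv::calcHist` and `cv::compareHist` instead.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: interleaved 8-bit RGB pixels.
struct RgbImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> data;  // size = width * height * 3
};

constexpr std::size_t kBins = 8;  // bins per channel (an arbitrary choice)
using Histogram = std::array<double, kBins * kBins * kBins>;

// Build a normalized 8x8x8 color histogram; the entries sum to 1.
Histogram colorHistogram(const RgbImage& img) {
    assert(!img.data.empty());
    assert(img.data.size() == img.width * img.height * 3);
    Histogram hist{};
    const std::size_t pixels = img.width * img.height;
    const std::size_t binWidth = 256 / kBins;
    for (std::size_t i = 0; i < pixels; ++i) {
        const std::size_t r = img.data[3 * i] / binWidth;
        const std::size_t g = img.data[3 * i + 1] / binWidth;
        const std::size_t b = img.data[3 * i + 2] / binWidth;
        hist[(r * kBins + g) * kBins + b] += 1.0;
    }
    for (double& v : hist) v /= static_cast<double>(pixels);
    return hist;
}

// Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones.
double histogramIntersection(const Histogram& a, const Histogram& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += std::min(a[i], b[i]);
    return sum;
}
```

You would compute one histogram per reference logo and pick the logo with the highest intersection score. Note that this ignores shape entirely, so two logos with the same color mix would be indistinguishable.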
Assuming that you have a single complete rigid logo to find, the simplest thing to try is template matching.
A more accurate approach is to match descriptors.
You can also see a related topic on SO here
Other, more robust approaches would require building constellations of keypoints on your reference logo and matching those constellations on the target image.
Last, but not least, have fun on Google!
I agree with @Miki: you need to do template matching. My recommendation is to use the sum of squared differences and to allow only a rigid transformation. You can find a lot of information here. The last is one of the best books that I've read: it is simple to understand and works through most of the equations step by step.
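A rough sketch of template matching with the sum of squared differences, written in plain C++ so it has no dependencies. The `GrayImage` and `Match` types and the function name are made up for this example, and only translation is searched (no rotation or scaling); OpenCV offers the same metric through `cv::matchTemplate` with `cv::TM_SQDIFF`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical minimal grayscale image: row-major 8-bit pixels.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> data;  // size = width * height
    std::uint8_t at(std::size_t x, std::size_t y) const { return data[y * width + x]; }
};

// Top-left corner of the best placement and its SSD score (lower is better).
struct Match {
    std::size_t x;
    std::size_t y;
    double ssd;
};

// Slide the template over every position where it fits entirely inside the
// image and keep the placement with the smallest sum of squared differences.
Match matchTemplateSsd(const GrayImage& image, const GrayImage& templ) {
    assert(templ.width > 0 && templ.height > 0);
    assert(templ.width <= image.width && templ.height <= image.height);
    Match best{0, 0, std::numeric_limits<double>::max()};
    for (std::size_t y = 0; y + templ.height <= image.height; ++y) {
        for (std::size_t x = 0; x + templ.width <= image.width; ++x) {
            double ssd = 0.0;
            for (std::size_t ty = 0; ty < templ.height; ++ty) {
                for (std::size_t tx = 0; tx < templ.width; ++tx) {
                    const double d = static_cast<double>(image.at(x + tx, y + ty)) -
                                     static_cast<double>(templ.at(tx, ty));
                    ssd += d * d;
                }
            }
            if (ssd < best.ssd) best = Match{x, y, ssd};
        }
    }
    return best;
}
```

An SSD of 0 means a pixel-exact match. For recognition you would run this once per candidate logo and compare the best scores, ideally after normalizing for template size and brightness.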
Related
I'm looking for a way to check whether a given logo appears in a screenshot of a webpage. So basically, I need to be able to find a small predefined image in a larger image that may or may not contain it. A match could be at a different scale or have somewhat different colors. I also need to judge how similar each occurrence is. I need some pointers on what to look at, as I've never worked with computer vision before.
The simplest (though still not simple) way to do it is a standard CNN trained on an augmented dataset of the logos.
To keep the answer short: build a CNN in TensorFlow and train your model on a large number of logo images, with a label on each training image. It's a simple task, and even a not-very-sophisticated CNN should get the job done.
CNN: Convolutional Neural Network
Reference : https://etasr.com/index.php/ETASR/article/view/3919
I raised this question out of curiosity while using Google Goggles and Google's "Search by Image".
If you give Google an image to search for, it can show you some results. Identical images work best (of course), but photos taken of various objects can be difficult.
I guess Google Goggles works around this a bit by combining text recognition and image matching. If text recognition finds text, for instance "SONY", then things might get simpler. If a brand's image is detected, then things should be simpler as well. The same goes for other famous brands and famous landmarks, such as the Eiffel Tower. Having text and a brand's image could help recognize things easily.
But if we are to search for something more obscure (need a better wording here), for instance, take this ramen image.
If you put this image into Google, you will get various other images that have similar colors and sometimes a similar shape. Heck, there are other ramen images in the results, but I think it would be better if these ramen images were at the top, since we input a ramen image and our context here is ramen.
So here is my question: is it possible to create software that can understand the context of an image? How can we express that context in the software?
Man, you just pointed out the very reason why so many people work on computer vision.
It is quite easy to describe objects mathematically: color, shape, density, ...
All those can be calculated easily.
But computer vision becomes very complex when talking about "real life objects".
Angle, luminosity, and plain inconsistency make it almost impossible to detect an object accurately.
When working on computer vision, you should always ask yourself: what makes the object I want to recognize unique?
What descriptor can I use that no other object possesses?
Ask yourself that question for this ramen. Let's say I simply want to detect ramen.
What if the color of the soup changes? What if the meat is bigger?
If you want to know more, you should read about pattern recognition and pattern matching.
And if you can find a generic solution to this kind of problem, you can apply for the Nobel Prize, I think :)
Some things are quite well known nowadays, like face recognition or OCR; but they are often quite specialized and apply to only one domain.
Think about it, even Google's image search algorithm sucks when you feed it with ramen.
It is pretty efficient with sudoku though, as it knows exactly what it is searching for.
All the difference is made in training, where you give a list of assumptions to help the algorithm.
So basically you got it. Either you create a really nice computer vision system that is good at detecting one thing based on a lot of assumptions, or an "ok" but quite generic one :).
The choice mostly depends on your application.
I have a simple grayscale template image with a white background and a black shape on it, and I have several similar test images. I want to compare the template with each test image and see whether it matches any of them. Can you please suggest a simple (easy to use) pattern recognition library for C++ which takes two images, compares them and shows the result?
Just compute image1 - image2 for every pixel, then sum the absolute (or squared) differences, since signed differences would cancel each other out. The lower the result, the closer the images.
If your pattern can appear at several sizes, then you have to resize it and check it at each position.
Implement a neural network on the image. The inputs should be the greyscale values of your image. You should train your network on a training set, choose proper regularization parameters using a cross-validation set, and finally test your network on a test set.
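The difference-and-sum idea, plus a resize step for patterns of different sizes, could look roughly like the sketch below. It is plain C++ with a made-up `Gray` struct and function names; the nearest-neighbor resize is just the simplest option I could pick, and in OpenCV `cv::resize` would normally do that job.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major 8-bit pixels.
struct Gray {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> data;  // size = width * height
};

// Mean absolute per-pixel difference: 0 for identical images, 255 at worst.
double meanAbsDiff(const Gray& a, const Gray& b) {
    assert(a.width == b.width && a.height == b.height);
    assert(!a.data.empty() && a.data.size() == b.data.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        sum += std::abs(static_cast<double>(a.data[i]) - static_cast<double>(b.data[i]));
    }
    return sum / static_cast<double>(a.data.size());
}

// Nearest-neighbor resize, used here to bring the pattern to the test image's size.
Gray resizeNearest(const Gray& src, std::size_t newW, std::size_t newH) {
    assert(src.width > 0 && src.height > 0 && newW > 0 && newH > 0);
    Gray dst{newW, newH, std::vector<std::uint8_t>(newW * newH)};
    for (std::size_t y = 0; y < newH; ++y) {
        for (std::size_t x = 0; x < newW; ++x) {
            const std::size_t sx = x * src.width / newW;
            const std::size_t sy = y * src.height / newH;
            dst.data[y * newW + x] = src.data[sy * src.width + sx];
        }
    }
    return dst;
}
```

Typical use would be `meanAbsDiff(resizeNearest(pattern, test.width, test.height), test)`, with a threshold you tune on your own images to decide what counts as a match.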
http://www.codeproject.com/Articles/13582/Back-propagation-Neural-Net
(I have done this myself to train a network to recognise hand written digits - it works very well.)
How simple the library you need is depends on the specific parameters of your problem. OpenCV is a great image processing library that should be able to do what you need it to. Here is a tutorial on template matching in OpenCV. It makes it very easy to switch between matching metrics and choose the best one for your problem.
I recently saw the virtual mirror concept on YouTube, so I tried it out and did some research on it. It seems that the creators have used augmented reality so that people can see the output on their screens. While researching, I found out that augmented reality usually works by identifying a pattern on which a 3D image is superimposed.
Question 1: How are they able to superimpose the jewellery and track the face of the person without identifying any pattern?
I also tried to check various libraries that I can use to make a program similar to the one they show. Seems to me that a lot of people are using Android phones and iPhones and making apps that use augmented reality.
Question 2: Is there any way that I can use C++ to make a program that uses augmented reality?
Oh, and the most important thing, the link to the application is provided below:
http://www.boutiqueaccessories.com.au/virtual-mirror/w1/i1001664/
Do try it out. It's a good experience. :D
I'm not able to actually try the live demo, but the linked video suggests that they either use some simplified pattern recognition (getting the person's outline), or they simply track you based on the initial image (with your position/texture being determined by the outline being shown).
Following the video, it's easy to see that there's no real/advanced AR behind this. The images are simply overlaid or hidden (e.g. when it loses track of one ear because you are looking to the side) and they're not transformed (no perspective or resizing happening). They definitely seem to track the head (or features like ears, neck, etc.). Depending on your background and surroundings, that's actually a rather trivial task.
Question 2: Sure! There are lots of premade toolsets out there, but you could as well use some general image processing library such as OpenCV to do the math. Augmented reality usually uses some kind of pattern (e.g. a card or page with a known pattern) to determine the correct position and transformation for the contents to be added to the image. There are also approaches using the device's orientation and perspective changes in camera images to determine depth/position (I really like this demo).
I have 2 images with the same content, but they might differ in scale or rotation. The problem is that I have to find the regions in these images and match them with one another. For example, if I have a circle in image1, I have to find the corresponding circle in image2.
I would just like to ask what the proper way of solving this is. I am looking at OpenCV's matchShapes. I believe this is an image correspondence problem, but I really have no idea how to solve it!
Thanks in advance!
I have the following images:
Template Image => https://lh6.googleusercontent.com/-q5qeExXUlpc/T7SbL9yWmCI/AAAAAAAAByg/gV_vM1kyLnU/w348-h260-n-k/1.labeled.jpg
Sample Image => https://lh4.googleusercontent.com/-x0IWxV7JdbI/T7SbNjG5czI/AAAAAAAAByw/WSu-y5O7ee4/w348-h260-n-k/2.labeled.jpg
Note that the numbers on the images correspond to the proper matching of regions. These are not present when comparing the images.
As usual with computer vision problems, you can never provide too much information or make too many assumptions about the data you intend to analyze. Solving the general problem is close to impossible, as we can't do human-level pattern recognition with computers. What does your problem set look like? A few examples would be very helpful in trying to provide good answers.
You mention that the images have the same content, but with different colors. If that means it's the same scene photographed under different lighting conditions and from possibly different angles, you might need to do a rigid image registration first, so the feature points in the two images should overlap. If the shapes on your images might have multiple distortions compared to each other, you might be interested in non-rigid image registration.
If you already know the objects you are looking for, you can simply do a search for these objects in both images, for example with chamfer matching or any other matching algorithm.
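For a feel of how chamfer matching scores a candidate position, here is a small sketch in plain C++. It assumes edges have already been extracted as point lists, uses a city-block distance transform for simplicity, and all type and function names are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Point {
    int x;
    int y;
};

// Two-pass city-block distance transform: each cell receives the distance
// to the nearest edge point of the image.
std::vector<int> distanceTransform(int width, int height, const std::vector<Point>& edges) {
    assert(width > 0 && height > 0);
    const int inf = width + height;  // larger than any distance inside the grid
    std::vector<int> dist(static_cast<std::size_t>(width) * height, inf);
    for (const Point& p : edges) {
        assert(p.x >= 0 && p.x < width && p.y >= 0 && p.y < height);
        dist[p.y * width + p.x] = 0;
    }
    // Forward pass: propagate distances from the left and from above.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int& d = dist[y * width + x];
            if (x > 0) d = std::min(d, dist[y * width + x - 1] + 1);
            if (y > 0) d = std::min(d, dist[(y - 1) * width + x] + 1);
        }
    }
    // Backward pass: propagate distances from the right and from below.
    for (int y = height - 1; y >= 0; --y) {
        for (int x = width - 1; x >= 0; --x) {
            int& d = dist[y * width + x];
            if (x + 1 < width) d = std::min(d, dist[y * width + x + 1] + 1);
            if (y + 1 < height) d = std::min(d, dist[(y + 1) * width + x] + 1);
        }
    }
    return dist;
}

// Chamfer score: mean distance from each shifted template edge point to the
// nearest image edge. Lower is better; 0 means every point lands on an edge.
double chamferScore(const std::vector<int>& dist, int width, int height,
                    const std::vector<Point>& templEdges, int offsetX, int offsetY) {
    assert(!templEdges.empty());
    double sum = 0.0;
    for (const Point& p : templEdges) {
        const int x = p.x + offsetX;
        const int y = p.y + offsetY;
        assert(x >= 0 && x < width && y >= 0 && y < height);
        sum += dist[y * width + x];
    }
    return sum / static_cast<double>(templEdges.size());
}
```

You would evaluate `chamferScore` over all candidate offsets (and, if needed, over rotated or scaled copies of the template points) and keep the minimum.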
Use the ORB feature detector from OpenCV. Once you have the descriptors, use BFMatcher with norm type NORM_HAMMING.
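To show what brute-force matching with the Hamming norm boils down to, here is a dependency-free sketch. It assumes 32-byte binary descriptors (the size ORB produces by default) that were computed elsewhere; the `HammingMatch` struct and the function names are hypothetical and are not OpenCV's API.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// A 256-bit binary descriptor stored as 32 bytes.
using Descriptor = std::array<std::uint8_t, 32>;

// Hamming distance: the number of bits in which the two descriptors differ.
int hammingDistance(const Descriptor& a, const Descriptor& b) {
    int distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        distance += static_cast<int>(std::bitset<8>(a[i] ^ b[i]).count());
    }
    return distance;
}

// Hypothetical result type: the best train descriptor for one query descriptor.
struct HammingMatch {
    std::size_t queryIdx;
    std::size_t trainIdx;
    int distance;
};

// Brute force: for every query descriptor, scan all train descriptors and
// keep the one with the smallest Hamming distance.
std::vector<HammingMatch> bruteForceMatch(const std::vector<Descriptor>& query,
                                          const std::vector<Descriptor>& train) {
    assert(!train.empty());
    std::vector<HammingMatch> matches;
    matches.reserve(query.size());
    for (std::size_t q = 0; q < query.size(); ++q) {
        HammingMatch best{q, 0, std::numeric_limits<int>::max()};
        for (std::size_t t = 0; t < train.size(); ++t) {
            const int d = hammingDistance(query[q], train[t]);
            if (d < best.distance) {
                best.trainIdx = t;
                best.distance = d;
            }
        }
        matches.push_back(best);
    }
    return matches;
}
```

In practice you would also discard matches whose distance is above some threshold, since every query descriptor gets a "best" match even when nothing in the other image really corresponds to it.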