AWS image processing - amazon-web-services

I am working on a project where I need to take a picture of a surface using my phone and then analyze the surface for defects and marks.
I want to take the image and then send it to the cloud for analysis.
Does AWS-Rekognition provide such a service to analyze the defects I want to study?
Or would I need to write custom code using OpenCV or something?

While Amazon Rekognition can detect faces and objects, it has no idea what is meant by a "defect".
Imagine if you had 10 people lined up and showed them a picture, asking them if they could see a defect. Would they all agree? They'd probably ask you what you mean by a defect and how bad something has to look before it could be considered a defect.
Similarly, you would need to train a system on what is a valid defect and what is not a defect.
This is a good use case for Amazon SageMaker. You would need to provide LOTS of sample images of defects and not-defects. They should be shot from many different angles in many different lighting situations, similar to the images you would want to test.
It would then build a model that could be used for detecting 'defects' in supplied images. You could even put the model into an AWS DeepLens unit to do the processing locally.
Please note, however, that you need to provide a large number of images (hundreds is good, thousands is better) to be able to train it to correctly detect 'defects'.
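As a rough illustration only (not part of the original answer), this is roughly what training a binary defect/no-defect classifier with SageMaker's built-in image classification algorithm might look like using the SageMaker Python SDK. The bucket, prefixes, IAM role, and sample count are placeholders you would replace with your own resources:

```python
# Hypothetical sketch: train a "defect vs. no-defect" image classifier with
# SageMaker's built-in image classification algorithm.
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
region = session.boto_region_name
role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder IAM role

# Built-in image classification container for this region
training_image = image_uris.retrieve("image-classification", region)

estimator = Estimator(
    image_uri=training_image,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    output_path="s3://my-bucket/defect-model/output",  # placeholder bucket
    sagemaker_session=session,
)

estimator.set_hyperparameters(
    num_classes=2,              # defect vs. no-defect
    num_training_samples=2000,  # however many labelled images you collected
    image_shape="3,224,224",
    epochs=30,
)

# Channels point at S3 prefixes of labelled training data
estimator.fit({
    "train": "s3://my-bucket/defect-data/train",
    "validation": "s3://my-bucket/defect-data/validation",
})
```

The trained model artifact could then be deployed to an endpoint for inference on new images.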

Related

Quick and Dirty Image Registration Tool

Given two images, e.g. two cats, is there a library that includes a "quick and dirty" way of telling by how much the two images differ in terms of translation and rotation? Image registration is a big field, and every application I run into seems to be tailored to medical scans and usually has certain domain-specific caps on the transformation ranges. The tool I require should take two images as input and return an angle of rotation and a translation vector, maybe even a confidence metric; it's that simple. (Most algorithms out there are heavy-duty and focus on minute details for alignment; the tool I'm looking for need not be as exact.)
If it doesn't need to be very precise, you can probably tweak the code from PyImageSearch to better suit your application.
If you know that the two images you are going to compare do contain the same object (i.e., if there is no additional object recognition problem that comes before this step), then you can maybe try using the ORB detector to find good keypoints and then estimate the homography using ViSP.
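If OpenCV is acceptable instead of ViSP, a minimal sketch of that idea might look like this: ORB keypoints, brute-force matching, and a RANSAC-fitted partial affine (rotation + uniform scale + translation). The filenames are placeholders, and the inlier ratio is only a crude confidence proxy:

```python
import math
import cv2
import numpy as np

img1 = cv2.imread("img1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("img2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match descriptors and keep the best ones
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]

src = np.float32([kp1[m.queryIdx].pt for m in matches])
dst = np.float32([kp2[m.trainIdx].pt for m in matches])

# 4-DOF model: rotation, uniform scale, translation (RANSAC rejects outliers)
M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)

angle_deg = math.degrees(math.atan2(M[1, 0], M[0, 0]))
translation = (M[0, 2], M[1, 2])
confidence = inliers.mean()  # fraction of matches consistent with the model

print(f"rotation: {angle_deg:.1f} deg, translation: {translation}, "
      f"inlier ratio: {confidence:.2f}")
```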

Detecting liveness with AWS Rekognition

I am looking to use AWS Rekognition in one of my projects and am trying to find out whether it's possible to differentiate between a still image (photograph) vs. a real person, in other words liveness detection. I don't want my system to be fooled by a still photograph for authentication.
I see that it has many features such as pose and emotion detection, etc. If it's not an official feature, is there a workaround or any tricks that some of you have used to achieve what I want?
I am also wondering if it's possible to detect gaze and how best to approach that. I want to see where the user is looking: at the screen, to the side, etc.
Alternatively, if AWS does not have a good solution for this, what are some of your alternative recommendations?
Regards
Could you make use of blink detection, which isn't part of AWS Rekognition, to check that the input isn't a still photograph? You just need OpenCV.
Here is an example.
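For illustration, a minimal OpenCV-only sketch of the blink idea (a real liveness check would use something more robust, such as the eye-aspect-ratio method with facial landmarks); the webcam index and cascade thresholds are assumptions:

```python
# Count frames where a face is visible but the eye detector finds no eyes:
# a printed photograph should produce zero blinks.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

cap = cv2.VideoCapture(0)  # webcam feed
eyes_were_open = False
blinks = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        roi = gray[y:y + h, x:x + w]
        eyes = eye_cascade.detectMultiScale(roi, 1.1, 10)
        if len(eyes) == 0 and eyes_were_open:
            blinks += 1  # eyes just disappeared: likely a blink
        eyes_were_open = len(eyes) > 0
    cv2.imshow("liveness", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
print("blinks seen:", blinks)
```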
Face recognition alone is notoriously insecure when it comes to authentication, as has been evidenced by the many examples of the Android Face Unlock functionality being fooled by photographs.
Apple makes use of depth-sensing cameras in its Face ID technology to create a 3D map of the face, which can't be fooled by a photograph. Windows Hello face authentication utilises a camera specially configured for near-infrared (IR) imaging to authenticate.
As an alternative to gaze detection, you can have a look at a liveness example using AWS Rekognition based on face and nose position:
https://aws.amazon.com/pt/blogs/industries/improving-fraud-prevention-in-financial-institutions-by-building-a-liveness-detection-solution/
https://aws.amazon.com/blogs/industries/liveness-detection-to-improve-fraud-prevention-in-financial-institutions-with-amazon-rekognition/
https://github.com/aws-samples/liveness-detection
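For what it's worth, here is a rough sketch of the face-and-nose-position idea using Rekognition's DetectFaces API; the thresholds and frame filenames are made up for illustration:

```python
# Use the Pose (yaw/pitch/roll) to check the user is roughly facing the screen,
# and compare the nose landmark across two frames to see whether the head moved.
import boto3

rekognition = boto3.client("rekognition")

def face_info(image_path):
    with open(image_path, "rb") as f:
        resp = rekognition.detect_faces(Image={"Bytes": f.read()},
                                        Attributes=["ALL"])
    face = resp["FaceDetails"][0]
    nose = next(lm for lm in face["Landmarks"] if lm["Type"] == "nose")
    return face["Pose"], (nose["X"], nose["Y"])

pose, nose_a = face_info("frame_a.jpg")
_, nose_b = face_info("frame_b.jpg")

facing_screen = abs(pose["Yaw"]) < 20 and abs(pose["Pitch"]) < 20  # rough threshold
moved = abs(nose_a[0] - nose_b[0]) > 0.02 or abs(nose_a[1] - nose_b[1]) > 0.02

print("facing screen:", facing_screen, "| head moved between frames:", moved)
```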

Time between training set images for individual facial recognition

Edit: I didn't make this clear; this is for the possible future development of an application.
I am looking into individual facial recognition for an application, but an essential part of this seems to be a fairly large training set of images for each individual to be recognized.
Is it important for the images to be taken at different times in different environments, or could several images captured over a few seconds with a handheld camera possibly provide the necessary variations for a good training set?
(This isn't for human facial recognition, by the way, so existing tools and databases won't really help too much. I'm aware that 2D image recognition can not necessarily be applied to all species; let's just assume that it does work in my use case.)
This paper may answer some of your questions:
http://uran.donetsk.ua/~masters/2011/frt/dyrul/library/article8.pdf
From the pattern classification point of view, a usual problem in face recognition is having a plethora of classes and only a few, possibly only one, training sample(s) per class. For this reason, more sophisticated classifiers are not needed but a nearest-neighbour classifier is used.
While I'm not an expert on the subject, it appears to be a common problem to have only one image per person as a training sample and one that has been solved with at least some level of accuracy in controlled lighting/positional situations.
To specifically answer your question, a training set that had multiple images of each person with little or no variation ("several images captured over a few seconds with a handheld camera"), would not be as valuable as one that had more variation (e.g. different facial expressions, lighting, backgrounds).
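To make the nearest-neighbour point concrete, here is a tiny hypothetical sketch; the `embed` function is a stand-in (raw resized pixels) for whatever feature extractor you end up using, and the filenames are placeholders:

```python
# One (or very few) reference feature vectors per individual; a new image is
# assigned to whichever reference it is closest to.
import numpy as np
import cv2

def embed(path, size=(64, 64)):
    """Placeholder embedding: resized grayscale pixels as a flat vector."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.resize(img, size).astype(np.float32).ravel()

# One reference image per individual (placeholder filenames)
gallery = {
    "animal_01": embed("animal_01_ref.jpg"),
    "animal_02": embed("animal_02_ref.jpg"),
}

query = embed("unknown.jpg")
best = min(gallery, key=lambda name: np.linalg.norm(gallery[name] - query))
print("closest match:", best)
```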

Possibility of creating software that can recognize the context of an image?

I raised this question out of curiosity while using Google Goggles and Google's "Search by Image".
If you try giving Google an image to search for, it can show you some results. Identical images work best (of course), but a photo you have taken of some object can be difficult.
I guess Google Goggles works around this a bit by using text recognition and image matching. If text recognition finds text, for instance "SONY", then things might get simpler. If a brand's logo is detected, then things should be simpler as well. The same goes for other famous brands and famous landmarks, such as the Eiffel Tower. Having text and a brand's logo can help recognize things easily.
But if we are to search for something more obscure (need a better wording here), for instance, take this ramen image.
If you put this image into Google, you will get various other images that have similar colors and sometimes a similar shape. Heck, there are other ramen images in the results, but I think it would be better if those ramen images were at the top, since we input a ramen image and our context here is ramen.
So here is my question: would it be possible to create software that can understand the context of an image? How can we express that context in the software?
Man, you just pointed out the very reason why so many people work on computer vision.
It is quite easy to describe objects mathematically: color, shape, density, and so on.
All of those can be calculated easily.
But computer vision becomes very complex when talking about "real life" objects.
Angle, luminosity, and plain inconsistency make it almost impossible to detect an object accurately.
When working on computer vision, you should always ask yourself: what makes the object I want to recognize unique?
What descriptor can I use that no other object possesses?
Ask yourself that question for this ramen. Let's say I simply want to detect ramen.
What if the color of the soup changes? What if the meat is bigger?
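For instance (purely illustrative, not from the original answer), a naive color-histogram descriptor shows how fragile such a choice is; the filenames are placeholders:

```python
# Compare two images by their hue/saturation histograms. If the soup color
# changes, the similarity score drops even though both images are ramen.
import cv2

def color_descriptor(path):
    img = cv2.imread(path)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # 2D hue/saturation histogram, normalised so image size doesn't matter
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist)
    return hist

h1 = color_descriptor("ramen_a.jpg")
h2 = color_descriptor("ramen_b.jpg")

# Correlation metric: closer to 1 means more similar color distributions
similarity = cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL)
print("color similarity:", similarity)
```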
If you want to know more, you should read about pattern recognition and pattern matching.
And if you can find a solution to this kind of problem in a generic way, I think you can register for the Nobel Prize :)
Some things are quite well known nowadays, like face recognition or OCR; but they are often quite specialized and apply to only one domain.
Think about it: even Google's image search algorithm struggles when you feed it ramen.
It is pretty efficient with sudoku, though, as it knows exactly what it is searching for.
All the difference is made in training, where you give the algorithm a set of assumptions to help it.
So basically you've got it: either you create a really nice computer vision system that is good at detecting one thing based on a lot of assumptions, or an "okay" but quite generic one :).
The choice mostly depends on your application.

Classification of Lightning type in Images

I need to write an application that uses image processing functionality to identify the type of lightning in an image. The lightning types it has to identify are cloud-to-ground and intracloud lightning, which are shown in the pictures below. The cloud-to-ground lightning has these features: it hits the ground and has flashes branching downwards. The features of the intracloud lightning are that it has no contact with the ground. Are there any image processing algorithms you know of that I can use to identify these features in the image, so that the application will be able to identify the lightning type? I want to implement this in C++ using the CImg library.
Thanking you in advance
(Since I can't upload photos because I am a new user, I posted a link to the images.)
http://wvlightning.com/types.shtml
Wow, this seems like a fun algorithm. If you had a large set of images for each type you might be able to use Haar training (http://note.sonots.com/SciSoftware/haartraining.html), but I'm not sure that would work because of the shape of lightning. Maybe Haar training in combination with your own algorithm. For instance, it should be very straightforward to know whether the lightning goes to the ground. You could use some OpenCV image analysis to do that - http://www.cs.iit.edu/~agam/cs512/lect-notes/opencv-intro/
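A rough sketch of that "does it reach the ground?" check, shown in Python with OpenCV only to illustrate the idea (the same logic could be ported to C++/CImg). The brightness threshold, the bottom-band size, and the filename are assumptions, not a tested recipe:

```python
# Threshold the bright lightning channel, then check whether any bright
# connected component touches the bottom band of the frame.
import cv2
import numpy as np

img = cv2.imread("lightning.jpg", cv2.IMREAD_GRAYSCALE)

# Lightning is much brighter than sky/clouds, so a high threshold isolates it
_, mask = cv2.threshold(img, 220, 255, cv2.THRESH_BINARY)

# Label connected bright blobs and see whether any of them reaches the bottom rows
num_labels, labels = cv2.connectedComponents(mask)
bottom_band = labels[-int(img.shape[0] * 0.05):, :]  # bottom 5% of the image
touches_ground = np.any(bottom_band > 0)

print("cloud-to-ground" if touches_ground else "intracloud")
```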