When I run the following image through the Google Cloud Vision API, it sees the grass but not the snake. What can I do to improve object detection?
We can improve detection by following the recommended image size guidelines or by using crop hints to make the snake more dominant in the image. The Google Cloud Vision API is powered by machine learning, and misses like this one (the snake) are to be expected in the early stages of the API. The Vision API improves over time as new concepts are introduced and accuracy improves.
Sample use of crop hints:
The result shows "60% reptile" when using the Vision API explorer:
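The crop-hint step can be sketched in code. This is a minimal sketch, not the exact method used for the result above: it assumes the crop hint arrives as a list of vertices shaped like a `boundingPoly` in the JSON response (where zero coordinates may be omitted), and the vertex values and the `crop_box` helper are hypothetical.

```python
def crop_box(vertices):
    """Convert bounding-poly vertices into a (left, top, right, bottom) box.

    Missing "x" or "y" keys are treated as 0, since zero values may be
    left out of a JSON response.
    """
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical vertices for illustration only.
hint_vertices = [
    {"x": 120, "y": 80},
    {"x": 520, "y": 80},
    {"x": 520, "y": 380},
    {"x": 120, "y": 380},
]
print(crop_box(hint_vertices))
```

The resulting tuple has the (left, upper, right, lower) layout that Pillow's `Image.crop` accepts, so the cropped image can be sent back for label detection with the snake filling more of the frame.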
Related
I recently found an OCR tool called PaddleOCR. Has anyone used it, and how does its performance compare to the Google Cloud Vision API?
I heard that PaddleOCR calls itself an industry-level open-source OCR engine, so I tested a few images with both it and Google Cloud Vision.
Generally speaking, commercial APIs like Google Cloud and Azure are supposed to work better than open-source OCR engines, and they do, but in some scenarios the gap is not large.
If the text is clear and flat, both work great. The main difference is the result format. Google API gives you rich content including block, paragraph, and word location information. PaddleOCR only returns the result according to the text line (transcriptions and locations).
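To show what the richer Google format looks like in practice, here is a minimal sketch that flattens the block / paragraph / word hierarchy into a plain list of words with their locations. It assumes the response is the JSON form of a text-detection result with a `fullTextAnnotation` field; the `iter_words` helper and the sample data are hypothetical.

```python
def iter_words(response):
    """Yield (text, vertices) for each word in a fullTextAnnotation-style dict."""
    annotation = response.get("fullTextAnnotation", {})
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word is stored as a list of symbols (characters).
                    text = "".join(s.get("text", "") for s in word.get("symbols", []))
                    vertices = word.get("boundingBox", {}).get("vertices", [])
                    yield text, vertices


# Hypothetical response fragment for illustration only.
sample = {
    "fullTextAnnotation": {
        "pages": [{"blocks": [{"paragraphs": [{"words": [{
            "symbols": [{"text": "H"}, {"text": "i"}],
            "boundingBox": {"vertices": [
                {"x": 1, "y": 2}, {"x": 20, "y": 2},
                {"x": 20, "y": 12}, {"x": 1, "y": 12},
            ]},
        }]}]}]}]
    }
}

for text, vertices in iter_words(sample):
    print(text, vertices)
```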
If your test images are more complicated, for example curved text, handwriting, or blurry text, the commercial APIs will probably work better than the open-source engine. However, when they cannot meet your needs, try training a new model with PaddleOCR.
Here are some visualization images:
PaddleOCR:
test1
test2
Google Cloud Vision API:
test1
test2
I am looking at Google AutoML Vision API and Google Vision API. I know that if you use Google AutoML Vision API that it is a custom model because you train ML models based on your own images and define your own labels. And when using Google Vision API, you are using a pretrained model...
However, I am wondering whether it is possible to use my own algorithm (one that I created, not one provided by Google) with the Vision / AutoML Vision API instead? ...
Sure, you can definitely deploy your own ML algorithm on Google Cloud without being tied to the Vision or AutoML API.
Two approaches that I have used many times for this same use case:
Serverless approach, if your model is relatively light in its computational resource requirements: deploy your own custom cloud function. More info here.
To be more specific, you just call your cloud function, passing your image directly (as base64 or by pointing to a storage location). The function then automatically allocates all required resources, runs your custom algorithm to process the image and/or run inference, sends the results back, and vanishes (all resources are released, so there are no more running costs). Neat :)
Google AI Platform. More info here
Use AI Platform to train your machine learning models at scale, to host your trained model in the cloud, and to use your model to make predictions about new data.
When in doubt, go for AI Platform, as the whole pipeline is nicely lined up for any of your custom code/models. It is perfect for deployment in production as well.
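For the serverless approach, the core of such a function might look like the sketch below. It is only an outline under stated assumptions: the `image_base64` field name, the `handle_request` and `run_model` names, and the return shape are all hypothetical, and the wiring to an actual cloud function framework is left out.

```python
import base64


def run_model(image_bytes):
    """Hypothetical stand-in for your own algorithm; replace with real inference."""
    return {"num_bytes": len(image_bytes)}


def handle_request(payload):
    """Decode a base64 image from the payload and run the custom model on it.

    Returns a (body, status_code) pair, a common shape for HTTP handlers.
    """
    if "image_base64" not in payload:
        return {"error": "missing 'image_base64' field"}, 400
    image_bytes = base64.b64decode(payload["image_base64"])
    return {"result": run_model(image_bytes)}, 200


# Example call with a tiny fake "image".
request_payload = {"image_base64": base64.b64encode(b"abc").decode("ascii")}
print(handle_request(request_payload))
```

Keeping the model call behind a single function such as `run_model` makes it easy to move the same code to AI Platform later if the model outgrows the serverless limits.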
I am doing OCR using the Google Cloud Vision API.
To make the results easier to check, I'd like to visualize which parts need careful review and which parts can be trusted, depending on how reliable the API output is.
I couldn't find it anywhere I looked, but does the API have the ability to output a confidence level? I would very much appreciate it if you could tell me.
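One possible way to approach this, as a sketch rather than a confirmed answer: the `fullTextAnnotation` part of a text-detection response carries a `confidence` field on its words, which can be used to flag items for review. The code below assumes a JSON response in that shape; the helper names, the sample data, and the 0.8 threshold are hypothetical, and the field may be absent depending on the feature requested.

```python
def words_with_confidence(response):
    """Collect (text, confidence) pairs; confidence is None when absent."""
    words = []
    for page in response.get("fullTextAnnotation", {}).get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    text = "".join(s.get("text", "") for s in word.get("symbols", []))
                    words.append((text, word.get("confidence")))
    return words


def needs_review(words, threshold=0.8):
    """Return the words whose confidence is missing or below the threshold.

    The 0.8 default is an arbitrary assumption; tune it on your own data.
    """
    return [text for text, conf in words if conf is None or conf < threshold]


# Hypothetical response fragment for illustration only.
sample = {"fullTextAnnotation": {"pages": [{"blocks": [{"paragraphs": [{"words": [
    {"symbols": [{"text": "T"}, {"text": "o"}], "confidence": 0.98},
    {"symbols": [{"text": "1"}, {"text": "O"}], "confidence": 0.42},
]}]}]}]}}

print(needs_review(words_with_confidence(sample)))
```

The flagged words can then be highlighted using their bounding boxes to produce the visualization described above.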
I am new to Azure Cognitive Services. I want to detect multiple objects in a single image. Is it possible with the Custom Vision API?
Any help is appreciated. Thank you.
You should be able to with the Object Detection part of Custom Vision. Simply give it images containing multiple objects to train on, and it should start detecting both items.
For example, I was playing with it a while ago to see if it could detect red and white wines. After sending a few images with both to train on, I started getting results like the one below.
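Programmatically, several detected objects show up as separate entries in the prediction response. The sketch below assumes the JSON response has a `predictions` list whose entries carry `tagName`, `probability`, and a normalized `boundingBox`; the sample values, the `detected_objects` helper, and the 0.5 cut-off are hypothetical.

```python
from collections import Counter


def detected_objects(result, min_probability=0.5):
    """Keep only predictions at or above the probability cut-off."""
    return [
        p for p in result.get("predictions", [])
        if p.get("probability", 0.0) >= min_probability
    ]


# Hypothetical prediction result for illustration only.
sample_result = {
    "predictions": [
        {"tagName": "red wine", "probability": 0.92,
         "boundingBox": {"left": 0.10, "top": 0.20, "width": 0.25, "height": 0.60}},
        {"tagName": "white wine", "probability": 0.88,
         "boundingBox": {"left": 0.55, "top": 0.22, "width": 0.24, "height": 0.58}},
        {"tagName": "red wine", "probability": 0.12,
         "boundingBox": {"left": 0.40, "top": 0.05, "width": 0.10, "height": 0.10}},
    ]
}

kept = detected_objects(sample_result)
print(Counter(p["tagName"] for p in kept))
```

Low-probability entries are common in detection output, so some cut-off is usually needed before counting objects.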
Using Ionic, is it possible to capture an image automatically, with the trigger being whenever the face smiles? I am looking for suggestions or any resource materials that I could use with Ionic.
You need to use an emotion detection API. This problem is not related to Ionic itself but to computer vision. What you would likely do is upload your photo to an online API (for example Google Cloud Vision or any other) that detects emotions in the photo. Your application then uses the result.
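As an illustration of using the result, here is a minimal server-side sketch in Python (not Ionic code). It assumes the photo was sent to Google Cloud Vision face detection and that the JSON response contains `faceAnnotations` entries with a `joyLikelihood` value; the `has_smile` helper and the choice of which likelihood levels count as a smile are assumptions.

```python
# Likelihood levels treated as a smile; this cut-off is an assumption.
SMILE_LEVELS = {"LIKELY", "VERY_LIKELY"}


def has_smile(response):
    """Return True if any detected face reports a high joy likelihood."""
    return any(
        face.get("joyLikelihood") in SMILE_LEVELS
        for face in response.get("faceAnnotations", [])
    )


# Hypothetical response fragments for illustration only.
print(has_smile({"faceAnnotations": [{"joyLikelihood": "VERY_LIKELY"}]}))
print(has_smile({"faceAnnotations": [{"joyLikelihood": "UNLIKELY"}]}))
```

The app would call the API on captured frames and keep the image once a check like this succeeds.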