Google Cloud Vision API online vs offline pricing

I need a plug-and-play text recognition system. After trying some solutions such as Tesseract OCR, Google's Vision API seemed to produce the best results for me.
However, I have never used any of their cloud APIs before, and I've noticed it is able to work offline. How would billing work for this? As I understand it, the online version charges per 1000 images; wouldn't the offline library circumvent this? What is the quality difference between online and offline?

Both online and offline requests are charged based on the features used. Here is the pricing chart: https://cloud.google.com/vision/pricing
Quality should be similar for online and offline. You could run a small experiment with your own files to verify this.
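To get a rough feel for per-feature billing, here is a minimal sketch of a monthly cost estimate. The function name, the free tier of 1000 units, the rate of 1.50 USD per 1000 units, and the pro-rated billing are all assumptions for illustration; take the real numbers from the pricing chart linked above.

```typescript
// Rough monthly cost estimate for per-feature billing.
// All defaults are ASSUMPTIONS for illustration, not official prices:
//   freeUnits       - units per month that are not billed
//   usdPer1000Units - price per 1000 billable units
function estimateMonthlyCost(
  images: number,
  featuresPerImage: number,
  freeUnits: number = 1000,
  usdPer1000Units: number = 1.5
): number {
  // One "unit" is one feature applied to one image.
  const units = images * featuresPerImage;
  // Only units above the free tier are billed (assumed pro-rated).
  const billableUnits = Math.max(0, units - freeUnits);
  return (billableUnits / 1000) * usdPer1000Units;
}

// Example: 11,000 images with a single feature (e.g. text detection) each.
const exampleCost = estimateMonthlyCost(11000, 1);
```

Under these assumptions the same formula applies whether the images are sent online or offline, since the charge depends on the features used rather than on how the request is submitted.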

Related

Fine tuning on either Google Cloud Vision, Microsoft Azure Computer Vision API or Amazon Textract

I need to transcribe a large number of handwritten documents. I tried cloud services from Google, Amazon, and Microsoft, namely:
https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/
https://cloud.google.com/vision/docs/handwriting
https://aws.amazon.com/textract/
Unfortunately, none of them achieved good enough results. I suspect it is because my documents have a weird handwriting style, and as a result, the networks struggle a lot.
I searched whether I could fine-tune (with manually transcribed data), but I have not found anything online, so as a last resort, I ask here.
If it is possible to fine-tune one of these models, could you please point me to some resources?
You are correct: with Azure Cognitive Services Computer Vision, you cannot upload your own data to train the API to recognise the handwriting in your documents, I'm afraid. I can't comment on the offerings from AWS and Google, but it is certainly not possible with Azure.

Google API Speeds Slow in Cloud Run / Functions?

Bottom Line: Cloud Run and Cloud Functions seem to have bizarrely limited bandwidth to the Google Drive API endpoints. Looking for advice on how to work around this or, ideally, #Google support to fix the underlying issue(s), as mine will not be the only use case like this.
Background: I have what I think is a really simple use case. We're trying to let private-domain Google Drive users take existing audio recordings, send them off to the Speech API to generate a transcript on an ad hoc basis, and have the transcript dumped back into the same Drive folder with an email notification to the submitter. Easy, right? The only hard part is that the Speech API will only read from Google Cloud Storage, so the 'hard part' should be moving the file over. 'Hard' doesn't really cover it...
Problem: Writing in nodejs and using the latest version of the official modules for Drive and GCS, the file copying was extremely slow. When we broke things down, it became apparent that the GCS speed was acceptable (mostly -- honestly it didn't get a robust test, but was fast enough in limited testing); it was the Drive ingress which was causing the real problem. Even the sample Google Drive Download app from the repo was slow as can be. Thinking the issue might be either my code or the library, though, I ran the same thing from the Cloud Console, and it was fast as lightning. Same with GCE. Same locally. But in Cloud Functions or Cloud Run, it's like molasses.
Request:
Has anyone in the community run into this or a similar issue and found a workaround?
#Google -- Any chance that whatever the underlying performance bottleneck is, you can fix it? This is a quintessentially 'serverless' use case, and it's hard to believe that the folks who've been doing this the longest can't crack it.
Thank you all in advance!
Updated 1/4/19 -- GCS is also slow following more robust testing. Image base also makes no difference (tried nodejs10-alpine, nodejs12-slim, nodejs12-alpine without impact), and memory limits equally do not impact results locally or on GCP (256m works fine locally; 2Gi fails in GCP).
Google Issue at: https://issuetracker.google.com/147139116
Self-inflicted wound. Google-provided code seeks to be asynchronous and do work in the background. Cloud Run and Cloud Functions do not support that model (for now at least). Move to promise-chaining and all of a sudden it works like it should -- so long as the CPU keeps the attention it needs. Limits what we can do with CR / CF, but hopefully that too will evolve.
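A minimal sketch of the difference described above. `copyFile` and `Responder` are hypothetical stand-ins for the Drive-to-GCS transfer and the HTTP response object, not real googleapis or framework calls. The first handler responds while the copy is still pending, which is the background model the answer says Cloud Run and Cloud Functions do not support. The second chains the promise so the response is sent only after the copy has finished.

```typescript
type Log = string[];

// Hypothetical stand-in for the Drive -> GCS transfer (not a real API call).
// The work completes asynchronously, after the current call stack unwinds.
function copyFile(name: string, log: Log): Promise<string> {
  return Promise.resolve().then(() => {
    log.push(`copied ${name}`);
    return name;
  });
}

// Minimal response shape, assumed for illustration only.
interface Responder {
  send(body: string): void;
}

// Anti-pattern: the promise is dropped and the response goes out first,
// so the copy is left to finish "in the background".
function fireAndForgetHandler(name: string, res: Responder, log: Log): void {
  void copyFile(name, log);
  res.send("accepted");
}

// Promise-chained: the response is sent only once the copy has completed,
// so all the work happens while the request is still being handled.
function chainedHandler(name: string, res: Responder, log: Log): Promise<void> {
  return copyFile(name, log).then((copied) => {
    res.send(`done: ${copied}`);
  });
}
```

The trade-off is the one the answer mentions: the request stays open for the whole transfer, so the work has to fit within the request's time limit.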

How to render an AE project on Google Cloud Platform

I've got a 2012 MacBook Pro that has been holding strong for most tasks to this day, though it would appear rendering video seems to be its breaking point.
I read that it was possible to render an After Effects project on the Google Cloud Platform, however, I can't find any tutorials on how it could be done.
I'm not really looking to purchase a $3,000 rendering station right now, which is why I am pursuing the Google Cloud option under the assumption that it would be cheaper but please correct me if there is a better alternative.
The video I want to render is about 30 min long @ 30 fps.
Any help is appreciated.
Thanks.
The easiest way will be to use a Windows Compute Engine instance.
Depending on the video quality you want, it would be a good idea to use a GPU instance.
Here is a link to a Google tutorial.

Google Cloud Speech API on production

As we know, Google Cloud Speech API is in Beta now.
Will it be safe to use it in an application on a production server?
I was also searching for applications that use the Google Cloud Speech API. So far I have found the following:
VoiceBase, Hyperconnect, InterActiveTel
Does anyone know of any other applications that could give us more confidence in using it on a production server?
The official definition of GCP launch stages, such as Beta, can be found in our documentation here.
Beta is the point at which we are ready to open a release for any customer to use. There are no SLA or technical support obligations in a Beta release, and charges may be waived in some cases. Products will be complete from a feature perspective, but may have some open outstanding issues. Beta releases are suitable for limited production use cases.
Emphasis is mine: Limited production. Ultimately, it is going to come down to your risk appetite.
As of Tuesday, April 18, the Cloud Speech API has reached General Availability, meaning all features are open to developers and are to be considered stable.
VoiceBase provides more than just speech recognition, and it is currently used in production by large customers. Take a look at some of the features:
http://voicebase.readthedocs.io/en/v2-beta/index.html

Speech recognition (web) services?

I have a buffer of audio and I'd like to perform speech recognition/transcription on it. I have limited CPU and RAM locally so I want to perform recognition on a server.
Are there any (web) services that allow me to do this?
My searches so far have led nowhere...
Google has just introduced browser-based access to its speech engine through HTML5.
http://slides.html5rocks.com/#speech-input
To get this page to work, I launched the Chromium browser as follows in Ubuntu:
$ chromium-browser --enable-speech-input
I believe that the idea is to be able to build applications that use Google's speech recognizer, but I haven't had a chance to look deeply into it.
Another interesting project is WAMI from MIT:
http://wami.csail.mit.edu
LumenVox offers such a service, but it seems expensive for your needs.