Please allow me to ask a rather newbie question. So far, I have been using local tools like ImageMagick or GOCR to do the job, but that is rather old-fashioned, and I am being urged to "move to Google Cloud AI".
The setup
I have a (training) data set of various documents (as JPG and PDF) of different kinds, and I intend to classify them by certain features (like prevailing color, repetitive layout), e.g. as invoice type 1, invoice type 2, or not an invoice. In a second step, I would like to OCR certain predefined areas of each document and extract e.g. the address of the company sending the invoice and the date.
The architecture I am envisioning
On a modern platform as a service (PaaS), I have already set up a UI where I can upload new files. These are then stored locally in a directory under their filenames (or in a MongoDB). Meta information like upload timestamp, user, and original file name is stored in a DB.
The newly uploaded file should then be submitted to Google Cloud, which should perform the classification step and deliver back the label to be saved in the database.
The document pages should be auto-cropped, i.e. black or white margins removed, most probably with Google Cloud as well. The parameters of the crop should be persisted in the DB.
In case it is e.g. an invoice, OCR should be performed (again by Google Cloud) for certain regions of the document, e.g. a bounding box spanning from the middle of the page to the right margin within the upper 10% of the cropped page. The results of the OCR should again be persisted locally.
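To make that region concrete, here is a minimal sketch of the arithmetic I have in mind, assuming pixel coordinates with the origin at the top-left of the already cropped page (the function name and page size are just illustrative):

    # Hypothetical helper: compute the OCR region "middle of the page to the
    # right margin, upper 10%" for a cropped page of the given pixel size.
    def invoice_address_region(page_width, page_height):
        left = page_width // 2            # from the middle of the page ...
        right = page_width                # ... to the right margin
        top = 0                           # upper edge of the page
        bottom = int(page_height * 0.10)  # upper 10% of the cropped page
        return (left, top, right, bottom)

    print(invoice_address_region(2480, 3508))  # A4 at 300 dpi -> (1240, 0, 2480, 350)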
The problem
I seem to be missing the correct search term to figure out how to do this with Google Cloud. Is there a Google API (e.g. REST) I can use to upload files and which gives me back the results of steps 2 to 4?
I think that your best option here is to use Document AI (REST API and Libraries).
Using Document AI, you can:
Convert images to text
Classify documents
Analyze and extract entities
Additionally, for your use case, we have a new Document AI feature that is still in preview and has limited access: the Invoice parser.
Invoice parser is similar to Form parser but for invoices instead of forms. Check out the Invoice parser page and you will see what I mean by preview and limited access.
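As an illustration, here is a minimal sketch of a synchronous processing call with the Python client library (google-cloud-documentai); the project, location, and processor IDs are placeholders you would get after creating a processor:

    # pip install google-cloud-documentai
    from google.cloud import documentai

    PROJECT_ID = "my-project"      # placeholder
    LOCATION = "us"                # region where the processor was created
    PROCESSOR_ID = "abcdef123456"  # placeholder: ID of your (e.g. invoice) processor

    client = documentai.DocumentProcessorServiceClient()
    name = client.processor_path(PROJECT_ID, LOCATION, PROCESSOR_ID)

    with open("invoice.pdf", "rb") as f:
        raw = documentai.RawDocument(content=f.read(), mime_type="application/pdf")

    result = client.process_document(
        request=documentai.ProcessRequest(name=name, raw_document=raw)
    )

    # Full OCR text plus typed entities (e.g. supplier address, invoice date)
    print(result.document.text[:200])
    for entity in result.document.entities:
        print(entity.type_, "->", entity.mention_text)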
AFAIK, there isn't any GCP tool for image editing (e.g. the auto-cropping you describe).
Related
We’re just getting started with Document AI. So far, we have about 80 labeled documents and one trained version.
We are making changes to the schema and adding a property. We’d like to go back in and label this new property on the previously labeled documents.
The Document AI user interface presents some challenges here.
I want to isolate the documents that don’t contain this label. With the filtering capabilities, it looks like I’m only able to filter on documents that have that label, not the inverse.
I also don’t see a way to mark a bunch of documents as unlabeled once they have been marked as labeled. That would be useful for indicating which previously labeled documents need some additional work.
For those that are making schema changes and need to go back and re-label documents, what does your workflow look like?
I want to create a classifier, and I do not like Google's browser labeling service. Is there a tool similar to VoTT, or some code, that I can use to import my VoTT-labeled data into Google AutoML?
Google's labeling service is very slow at loading images and inefficient: it literally has a white labeling cursor, and my images have a light background.
VoTT, on the other hand, is much better in every way. So is there a way for me to import a VoTT-labeled CSV into Google Cloud AutoML?
I don't think that it is currently possible to import already labeled data from other apps (like VOTT).
At the moment there are 3 ways to label images in Cloud AutoML Vision, as described in Annotating imported training images:
Provide bounding boxes with labels for your training images via labeled bounding boxes in your .csv import file
In the CSV file you need to provide the GCS URL and the label(s):
Labeled: gs://my-storage-bucket-vcm/flowers/images/img100.jpg,daisy
Multi-label: gs://my-storage-bucket-vcm/flowers/images/img384.jpg,dandelion,tulip,rose
Assigned to a set: TEST,gs://my-storage-bucket-vcm/flowers/images/img805.jpg,daisy
More details can be found in the documentation.
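Since you already have VoTT annotations, one workaround under this first option is to convert VoTT's CSV export into the AutoML import format yourself. A minimal sketch, assuming VoTT's CSV columns (image, xmin, ymin, xmax, ymax, label) with absolute pixel coordinates, local copies of the images, and that you have already uploaded the images to GCS (AutoML's object detection rows expect coordinates normalized to [0, 1]):

    import csv
    from PIL import Image  # pip install pillow

    VOTT_CSV = "vott-csv-export/my-project-export.csv"  # placeholder path
    IMAGE_DIR = "images"                                # local copies of the images
    GCS_PREFIX = "gs://my-storage-bucket-vcm/images"    # where you uploaded them

    rows = []
    with open(VOTT_CSV, newline="") as f:
        for r in csv.DictReader(f):
            with Image.open(f"{IMAGE_DIR}/{r['image']}") as im:
                w, h = im.size  # needed to normalize pixel coordinates
            # AutoML object detection row: set,uri,label,x_min,y_min,,,x_max,y_max,,
            rows.append([
                "UNASSIGNED",
                f"{GCS_PREFIX}/{r['image']}",
                r["label"],
                round(float(r["xmin"]) / w, 4), round(float(r["ymin"]) / h, 4),
                "", "",
                round(float(r["xmax"]) / w, 4), round(float(r["ymax"]) / h, 4),
                "", "",
            ])

    with open("automl-import.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)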
Provide unannotated images in your .csv import file and use the UI to provide image annotations
Not labeled: gs://my-storage-bucket-vcm/flowers/images/img403.jpg
However, you will later need to label them using the UI, otherwise they will be ignored.
AutoML Vision ignores items without a category label.
Request manual image annotation with Google's Human Labeling service
This option involves human labelers; you would need to provide additional information like the dataset, a label set, and instructions for the labelers.
In the documentation you can also find a note that the API currently does not support any method for labeling:
The AutoML API does not currently include methods for labeling.
However, you can file a Feature Request via the IssueTracker to add import methods for other apps, or to enable labeling through the API.
I am trying to build an object detection model using Google Cloud Vision. The model should draw bounding boxes around rice.
What I have done so far:
I have imported an image set of 15 images
I have used the Google Cloud tool to draw ~550 bounding boxes in 10 images
Where I am stuck:
I have built models before, and the data set was automatically split into train, validation and test set. This time however, Google Cloud is not splitting the data set.
What I have tried:
Downloading the .csv with the labeled data and reimporting it into Google Cloud
Adding more labels beyond the one label I have right now
Deleting and recreating the data set
How can I get Google Cloud to split the data set?
Your problem is that Google Cloud Platform determined your train, test, and validation sets when you uploaded your images. Your test and validation images are likely your last 5 images, which are not available for training if you have not labeled them yet. If you label all of your images or remove the unlabeled ones from the dataset, you should be able to train. See this SO answer for more info.
You can verify this by clicking the Export Data option and downloading a CSV of your dataset: you can see that the data set splits are already assigned, even for images that have not been labeled yet.
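For instance, here is a quick sketch to tally the split assignments in that exported CSV (the file name is a placeholder; the first column holds TRAIN/VALIDATION/TEST or UNASSIGNED, and rows with an empty label column are the ones AutoML will ignore):

    import csv
    from collections import Counter

    splits = Counter()
    unlabeled = 0
    with open("exported_dataset.csv", newline="") as f:  # placeholder file name
        for row in csv.reader(f):
            splits[row[0]] += 1               # TRAIN / VALIDATION / TEST / UNASSIGNED
            if len(row) < 3 or not row[2]:
                unlabeled += 1                # no label -> ignored during training
    print(splits, "unlabeled:", unlabeled)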
I want to scrape the data from an ArcGIS map. The following map has a popup when we click the red features. How do I access that data programmatically?
Link : https://cslt.maps.arcgis.com/apps/MapSeries/index.html?appid=2c9f3e737cbf4f6faf2eb956fa26cdc5
Note: Please respect the access and use constraints of any ArcGIS Online item you access. When in doubt, don't save a copy of someone else's data.
The ArcGIS Online REST interface makes it relatively simple to get the data behind ArcGIS Online items. You need to use an environment that can make HTTP requests and parse JSON text. Most current programming languages either have these capabilities built in or have libraries available with these capabilities.
Here's a general workflow that your code could follow.
Use the app ID and the item data endpoint to see the app's JSON text:
https://www.arcgis.com/sharing/rest/content/items/2c9f3e737cbf4f6faf2eb956fa26cdc5/data
Search that text for webmap and see that the app uses the following web maps:
d2b4a98c39fd4587b99ac0878c420125
7b1af1752c3a430184fbf7a530b5ec65
c6e9d07e4c2749e4bfe23999778a3153
Look at the item data endpoint for any of those web maps:
https://www.arcgis.com/sharing/rest/content/items/d2b4a98c39fd4587b99ac0878c420125/data
The list of operationalLayers specifies the feature layer URLs from which you could harvest data. For example:
https://services2.arcgis.com/gWRYLIS16mKUskSO/arcgis/rest/services/VHR_Areas/FeatureServer/0
Then just run a query with a where of 0=0 (or whatever you want) and an outFields of *:
https://services2.arcgis.com/gWRYLIS16mKUskSO/arcgis/rest/services/VHR_Areas/FeatureServer/0/query?where=0%3D0&outFields=%2A&f=json
Use f=html instead if you want to see a human-readable request form and results.
Note that feature services have a limit of how many features you can get per request, so you will probably want to filter by geometry or attribute values. Read the documentation to learn everything you can do with feature service queries.
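Putting the whole workflow together, here is a minimal sketch in Python with the requests library. It assumes the web map IDs appear as "webmap" entries in the app's JSON (as described above) and that each operational layer exposes a feature layer URL; it is a starting point, not a complete scraper (no paging past maxRecordCount, no auth):

    # pip install requests
    import re
    import requests

    ITEMS = "https://www.arcgis.com/sharing/rest/content/items"
    APP_ID = "2c9f3e737cbf4f6faf2eb956fa26cdc5"

    # Steps 1-2: fetch the app's JSON and pull out the web map IDs it references.
    app_text = requests.get(f"{ITEMS}/{APP_ID}/data", params={"f": "json"}).text
    webmap_ids = set(re.findall(r'"webmap"\s*:\s*"([0-9a-f]{32})"', app_text))

    for webmap_id in webmap_ids:
        # Step 3: each web map lists its feature layers under operationalLayers.
        webmap = requests.get(f"{ITEMS}/{webmap_id}/data",
                              params={"f": "json"}).json()
        for layer in webmap.get("operationalLayers", []):
            if "url" not in layer:
                continue  # e.g. embedded or tiled layers without a feature URL
            # Steps 4-5: query all features with all fields (subject to the
            # service's per-request maxRecordCount limit).
            result = requests.get(f'{layer["url"]}/query',
                                  params={"where": "0=0", "outFields": "*",
                                          "f": "json"}).json()
            for feature in result.get("features", []):
                print(feature["attributes"])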
We are working on a customer module. With this module, we calculate prices for Google Cloud components like Compute Engine, with attributes such as images, boot disk, region, snapshot, etc.
But we found that when GCP revises its prices, the JSON file is not updated immediately. Newly revised pricing values appear in the JSON only after a few days.
So in this case, the price according to the JSON differs from the price in the Google Console/Calculator.
Is there any alternative format or source from which we can get the revised pricing immediately?
The JSON file isn't exactly the ideal way to obtain pricing information anymore. In fact, the link to the JSON file was removed from the Pricing Calculator page very recently, precisely because they want people to start using the Catalog API. That one has up-to-date prices for sure, so I'd recommend using it instead.
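For example, here is a minimal sketch of listing Compute Engine SKU prices via the Catalog API with Python, assuming you have enabled the Cloud Billing API and created an API key (the service ID shown is Compute Engine's; results are paginated via nextPageToken, which the sketch ignores):

    # pip install requests
    import requests

    API_KEY = "YOUR_API_KEY"             # placeholder
    SERVICE = "services/6F81-5844-456A"  # Compute Engine in the Catalog API

    url = f"https://cloudbilling.googleapis.com/v1/{SERVICE}/skus"
    data = requests.get(url, params={"key": API_KEY}).json()
    for sku in data.get("skus", []):
        expr = sku["pricingInfo"][0]["pricingExpression"]
        price = expr["tieredRates"][0]["unitPrice"]  # first pricing tier
        amount = int(price.get("units", 0)) + price.get("nanos", 0) / 1e9
        print(f'{sku["description"]}: {amount} {price["currencyCode"]} '
              f'per {expr["usageUnit"]}')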