Interpretation of output of bounding box annotation job - amazon-web-services

The output folder of an annotation job contains the following file structure:
active learning
annotation-tools
annotations
intermediate
manifests
Each line of the manifests/output/output.manifest file is a dictionary, where the key 'jobname' contains information about the annotations, and the key 'jobname-metadata' contains confidence scores and other information about each of the bounding box annotations. There is also another folder called annotations, which contains JSON files with information about the annotations and the associated worker IDs. How are these two sets of annotation information related to each other? Are there any blogs/tutorials that discuss how to interpret the data received from the Amazon SageMaker Ground Truth service? Thanks in advance.
Links I referred to:
1. https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html
2. https://github.com/awslabs/amazon-sagemaker-examples/blob/master/ground_truth_labeling_jobs/ground_truth_object_detection_tutorial/object_detection_tutorial.ipynb
I have displayed the annotations received using the code available in link 2 above, which treats the consolidated annotations and the worker responses separately.

Thank you for your question. I’m the product manager for Amazon SageMaker Ground Truth and am happy to answer your question here.
We have a feature called annotation consolidation that takes the responses from multiple workers for a single image and then consolidates those responses into a single set of bounding boxes for the image. The bounding boxes referenced in the manifest file are the consolidated responses, whereas what you see in the annotations folder are the raw annotations (which is why you have the respective worker IDs).
You can find out more about the annotation consolidation feature here: https://docs.aws.amazon.com/sagemaker/latest/dg/sms-annotation-consolidation.html
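For illustration, here is a minimal sketch of reading the consolidated output, assuming 'jobname' is the label attribute name of your labeling job and the manifest has been downloaded locally (adjust the path and key names to your job):

```python
import json

# Each line of output.manifest is a standalone JSON object.
# 'jobname' holds the consolidated boxes; 'jobname-metadata' holds the confidences.
with open("manifests/output/output.manifest") as f:
    for line in f:
        record = json.loads(line)
        boxes = record["jobname"]["annotations"]        # consolidated bounding boxes
        metadata = record["jobname-metadata"]           # per-box confidence, class map
        print(record["source-ref"])
        for box, obj in zip(boxes, metadata["objects"]):
            print("  class:", box["class_id"], "confidence:", obj["confidence"],
                  "left/top/width/height:", box["left"], box["top"], box["width"], box["height"])

# The raw per-worker responses live in the annotations folder instead, one JSON
# file per batch, each carrying the worker ID that drew the boxes before consolidation.
```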
Please let us know if you have any further questions.

Related

Google Document AI - Add New Label Workflow

We’re just getting started with Document AI. So far, we have about 80 labeled documents and one trained version.
We are making changes to the schema and adding a property. We’d like to go back in and extract this new label to the previously labeled documents.
The Document AI user interface presents some challenges here.
I want to isolate the documents that don’t contain this label. With the filtering capabilities, it looks like I’m only able to filter on documents that have that label, not the inverse.
I also don’t see a way to mark a bunch of documents as unlabeled once they have been marked as labeled. That would be useful for indicating which previously labeled documents need some additional work.
For those that are making schema changes and need to go back and re-label documents, what does your workflow look like?

Cloud Data Fusion - Input HTTP Post Body from BQ rows

I am a new Cloud Data Fusion user and have run into a problem I can't find a solution for.
I have a table in BQ with ~150 rows of latitude and longitude points. For each row, I want to pass the lat and lng into an HTTP POST request to get a result from the TravelTime API. Ultimately, I want to have a table with all my original rows plus a column with the response for each one.
Where I am stuck is that so far I have only been able to hard-code the body of the POST request into the HTTP Source plugin and successfully write the response to a file in GCS. However, I expect the rows will change over time, so I would like to dynamically generate the POST request body from my BQ data and pass it in.
Is this possible with Data Fusion? Is this an advisable approach? Or is there a better way?
As @Albert Shau and @user3750486 agreed in the comments:
There is no out-of-the-box way to pass data from BQ rows dynamically in a POST HTTP request.
A possible workaround is to have an HTTP transform plugin that sits in the middle of the pipeline and can be configured to make calls based on the input data. Then you would have a BQ source followed by that plugin followed by the GCS sink. I think your best bet would be to write a custom transform.
This can be done by following the link that @Albert Shau provided, or by writing custom code using GCP's Cloud Functions, as the OP did.
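For illustration, the sketch below shows the Cloud Function route: it reads the lat/lng rows from BQ and issues one POST per row. The table name, TravelTime endpoint, request body, and credentials are placeholders; you would need to adapt them to the actual TravelTime API contract and write the responses wherever you need them (BQ, GCS, etc.).

```python
import json
import requests
from google.cloud import bigquery

def call_traveltime(request):
    """HTTP-triggered Cloud Function: read lat/lng rows from BQ and POST each one."""
    client = bigquery.Client()
    rows = client.query(
        "SELECT id, lat, lng FROM `my_project.my_dataset.points`"  # placeholder table
    ).result()

    results = []
    for row in rows:
        # Placeholder endpoint/body -- adapt to the real TravelTime API schema.
        resp = requests.post(
            "https://api.traveltimeapp.com/v4/time-map",
            json={"lat": row.lat, "lng": row.lng},
            headers={"X-Application-Id": "APP_ID", "X-Api-Key": "API_KEY"},
        )
        results.append({"id": row.id, "response": resp.json()})

    # Persist `results` to BQ or GCS as needed; here we just return a summary.
    return json.dumps({"processed": len(results)})
```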
Posting the answer as community wiki for the benefit of the community that might encounter this use case in the future.
Feel free to edit this answer for additional information.

Import Labeled Data from vott to Google Cloud AutoML

I want to create a classifier, and I do not like Google's browser labeling service. Is there a tool similar to VoTT, or some code, that I can use to import my VoTT-labeled data into Google AutoML?
The Google labeling service looks something like the image here: it is very slow at loading images and inefficient, and it literally has a white labeling cursor while my images have a light background.
VoTT, on the other hand, is much better in every way. So is there a way for me to label with VoTT and import the labeled CSV into Google's Cloud AutoML?
I don't think that it is currently possible to import already labeled data from other apps (like VOTT).
At the moment there are 3 ways to label images in Cloud Vision, as described in Annotating imported training images:
Provide bounding boxes with labels for your training images via labeled bounding boxes in your .csv import file
In the CSV file you would need to provide GCS url and label/labels
Labeled: gs://my-storage-bucket-vcm/flowers/images/img100.jpg,daisy
Multi-label: gs://my-storage-bucket-vcm/flowers/images/img384.jpg,dandelion,tulip,rose
Assigned to a set: TEST,gs://my-storage-bucket-vcm/flowers/images/img805.jpg,daisy
More details can be found here. A sketch of converting a VoTT export into this CSV format is shown at the end of this answer.
Provide unannotated images in your .csv import file and use the UI to provide image annotations
Not labeled: gs://my-storage-bucket-vcm/flowers/images/img403.jpg
However, you will later need to label it using the UI; otherwise it will be ignored.
AutoML Vision ignores items without a category label.
Request manual image annotation with Google's Human Labeling service
This option involves human labelers, and you would need to provide additional information like the dataset, a label set, and instructions for the labelers.
In the documentation you can also find information that the API currently does not support any method for labeling.
The AutoML API does not currently include methods for labeling.
However, you can file a Feature Request via IssueTracker to add import methods from other apps or to enable labeling via the API.
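For illustration only, here is a minimal sketch of converting a VoTT CSV export into the classification import format from option 1. It assumes the export has 'image' and 'label' columns and that the images have already been uploaded to a GCS bucket; the paths and column names are placeholders to adjust to your project.

```python
import csv

# Hypothetical paths -- adjust to your VoTT export and GCS bucket.
VOTT_EXPORT = "vott-csv-export/my-project-export.csv"
AUTOML_CSV = "automl_import.csv"
GCS_PREFIX = "gs://my-storage-bucket-vcm/images/"

with open(VOTT_EXPORT, newline="") as src, open(AUTOML_CSV, "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.writer(dst)
    seen = set()
    for row in reader:
        # AutoML classification import format: gcs_uri,label (one line per image/label pair).
        pair = (GCS_PREFIX + row["image"], row["label"])
        if pair not in seen:          # VoTT writes one row per bounding box; de-duplicate
            seen.add(pair)
            writer.writerow(pair)
```

The resulting automl_import.csv can then be used as the .csv import file described in option 1.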

Using google cloud for image classification, cropping and OCR

Please allow me to ask a rather newbie question. So far, I have been using local tools like imagemagick or GOCR to perform the job, but that is rather old-fashioned, and I am urged to "move to google cloud AI".
The setup
I have a (training) data set of various documents (as JPG and PDF) of different kinds, and by certain features (like prevailing color, repetitive layout) I intend to classify them, e.g. as invoice type 1, invoice type 2, not an invoice. In a 2nd step, I would like to OCR certain predefined areas of each document and extract e.g. the address of the company sending the invoice and the date.
The architecture I am envisioning
In a modern platform as a service (PaaS), I have already set up a UI where I can upload new files. These are then stored locally in a directory with filenames (or in a MongoDB). Meta info like the upload timestamp, user, and original file name is stored in a DB.
The newly uploaded file should then be submitted to Google Cloud, which should do the classification step and deliver back the label to be saved in the database.
The document pages should be auto-cropped, i.e. black or white margins are removed, most probably with Google Cloud as well. The parameters of the crop should be persisted in the DB.
In case it is e.g. an invoice, OCR should be performed (again by Google Cloud) for certain regions of the document, e.g. a bounding box spanning from the middle of the page to the right margin in the upper 10% of the cropped page. The results of the OCR should again be persisted locally.
The problem
I seem to be missing the correct search term to figure out how to do this with Google Cloud. Is there a Google API (e.g. REST) that I can use for the upload and that gives me back the results of steps 2 to 4?
I think that your best option here is to use Document AI (REST API and Libraries).
Using Document AI, you can:
Convert images to text
Classify documents
Analyze and extract entities
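As an illustration, here is a minimal sketch of calling a Document AI processor with the Python client library; the project ID, location, and processor ID are placeholders for a processor you would create in your own GCP project, and a single file is processed synchronously.

```python
from google.cloud import documentai_v1 as documentai

def process_document(file_path: str, project_id: str, location: str, processor_id: str):
    """Send a local PDF to a Document AI processor and return the OCR text and entities."""
    client = documentai.DocumentProcessorServiceClient()
    name = client.processor_path(project_id, location, processor_id)

    with open(file_path, "rb") as f:
        # Adjust mime_type (e.g. "image/jpeg") if you send images instead of PDFs.
        raw_document = documentai.RawDocument(content=f.read(), mime_type="application/pdf")

    result = client.process_document(
        request=documentai.ProcessRequest(name=name, raw_document=raw_document)
    )
    document = result.document
    # document.text holds the full OCR text; document.entities holds extracted fields
    # (e.g. supplier name, invoice date) when using a specialized processor like the invoice parser.
    return document.text, list(document.entities)
```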
Additionally, for your use case, we have a new Document AI feature that is still in preview and has limited access: the Invoice parser.
Invoice parser is similar to Form parser but for invoices instead of forms. Check out the Invoice parser page and you will see what I mean by preview and limited access.
AFAIK, there isn't any GCP tool for image editing.

Getting error "INTERNAL" when training a model with AutoML

I'm training a small model with AutoML entity extraction, but the training keeps failing with the error message "INTERNAL" and no other details.
I'm doing this from the Google Cloud console, and I've followed the same steps I've used successfully to train other models.
The dataset has two labels with a few hundred text items each, so I doubt it's a timeout or anything like that.
What might be causing this and is there a way to debug/get more visibility?
It could be that the dataset contains duplicate columns, which is not currently supported. If this is not your case, I'd suggest reaching out to GCP Support so they can check it internally.