Google Document AI - google-cloud-platform

I raised a request for the Google invoice parser last week, but haven't received any response from them. Does anyone have experience in getting access from Google? Or any contact?

With API calls you can:
Either get the result immediately in a Document structure if your file is small,
Or get the name of the processing job if your file is large. You then have to poll the job name periodically to check its status and retrieve the result.
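A minimal sketch of the synchronous path, assuming the google-cloud-documentai Python client; the project, location and processor IDs are placeholders:

```python
# Minimal sketch: synchronous processing of a small file with the
# google-cloud-documentai Python client. Project, location and
# processor IDs below are placeholders.
from google.cloud import documentai

def process_small_file(path: str) -> documentai.Document:
    client = documentai.DocumentProcessorServiceClient()
    # Hypothetical identifiers -- replace with your own.
    name = client.processor_path("my-project", "us", "my-invoice-processor-id")

    with open(path, "rb") as f:
        raw_document = documentai.RawDocument(
            content=f.read(), mime_type="application/pdf"
        )

    # Small files: the Document structure comes back in the response itself.
    response = client.process_document(
        request=documentai.ProcessRequest(name=name, raw_document=raw_document)
    )
    return response.document
```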

I got a response for the invoice parser in about a month. I think it depends on how many requests they have, and after the recent web presentation of their AI APIs the number of requests grew. Just give it some more time.

Related

Cloud Data Fusion - Input HTTP Post Body from BQ rows

I am a new Cloud Data Fusion user and have run into a problem I can't find a solution for.
I have a table in BQ with ~150 rows of latitude and longitude points. For each row, I want to pass the lat and lng into an HTTP POST request to get a result from the TravelTime API. Ultimately I want to have a table with all my original rows plus a column with the response for each one.
Where I am stuck is that so far I have only been able to hard-code the body of the POST request into the HTTP Source plugin and successfully write the response to a file in GCS. However, I expect the rows will change over time, so I would like to dynamically generate and pass in the POST request body from my BQ data.
Is this possible with Data Fusion? Is this an advisable approach? Or is there a better way?
As @Albert Shau and @user3750486 agreed in the comments:
There is no out-of-the-box way to pass data from BQ rows dynamically in a POST HTTP request.
A possible workaround is to have an HTTP transform plugin that sits in the middle of the pipeline and can be configured to make calls based on the input data. Then you would have a BQ source followed by that plugin followed by the GCS sink. I think your best bet would be to write a custom transform.
This can be done by following the link that @Albert Shau provided, or by writing custom code using GCP's Cloud Functions, as the OP did.
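For what it's worth, here is a rough sketch of the Cloud Functions route, assuming the google-cloud-bigquery and requests Python libraries; the table name, endpoint URL, auth header and request body are placeholders, not the real TravelTime schema:

```python
# Rough sketch of the Cloud Function approach: read lat/lng rows from
# BigQuery and POST one request per row. Table name, endpoint URL and
# payload shape are placeholders, not the real TravelTime schema.
import requests
from google.cloud import bigquery

def post_rows(request):
    bq = bigquery.Client()
    rows = bq.query(
        "SELECT id, lat, lng FROM `my-project.my_dataset.points`"  # placeholder table
    ).result()

    results = []
    for row in rows:
        resp = requests.post(
            "https://api.traveltimeapp.com/v4/time-map",   # check the real endpoint/auth
            json={"lat": row.lat, "lng": row.lng},          # build the real body here
            headers={"X-Api-Key": "YOUR_KEY"},              # placeholder auth header
            timeout=30,
        )
        results.append({"id": row.id, "response": resp.json()})
    return {"results": results}
```

The same loop could write its output to GCS or back to BQ instead of returning it, depending on where you want the responses to land.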
Posting the answer as community wiki for the benefit of the community that might encounter this use case in the future.
Feel free to edit this answer for additional information.

Fetch historical liquidation orders from Binance

I am fetching the past liquidation orders of a specific trading pair (say ETHUSDT) over a time interval (3/2021-3/2022) to perform some market simulation.
However, when I looked up the official documentation of the Binance API, I only found the websocket market stream for fetching real-time liquidation orders, via the websocket URI wss://fstream.binance.com/ws/!forceOrder@arr.
I searched the past Postman collection and saw that there was a REST endpoint to get all liquidation orders at the URL https://fapi.binance.com/fapi/v1/allForceOrders?. However, this endpoint is deprecated. I cannot find anything to retrieve the historical liquidation data.
Does anyone have any idea where I should go? Thank you very much!
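For reference, a minimal sketch of subscribing to the real-time liquidation stream mentioned above, assuming the Python websockets package; the payload field names reflect my reading of the forceOrder event format and should be checked against the Binance docs:

```python
# Illustrative sketch only: subscribe to the real-time liquidation
# order stream (the deprecated allForceOrders REST endpoint has no
# public replacement for historical data).
import asyncio
import json
import websockets

async def stream_liquidations():
    uri = "wss://fstream.binance.com/ws/!forceOrder@arr"
    async with websockets.connect(uri) as ws:
        async for message in ws:
            event = json.loads(message)
            order = event.get("o", {})
            if order.get("s") == "ETHUSDT":   # filter the pair you care about
                print(order.get("T"), order.get("S"), order.get("q"), order.get("p"))

asyncio.run(stream_liquidations())
```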

How to batch send documents in DocumentAI?

I am doing the processDocument process using the expense parser as in the example here. Since the billing costs too much, instead of sending the documents one by one, I combine 10 documents into one PDF and use processDocument again. However, Document AI sees the 10 separate receipts that we combined as a single receipt, and instead of returning 10 different total_amount entities (one per receipt), it returns 1 total_amount. I want to combine 10 documents into one PDF and send it for a lower billing cost. In addition, I am looking for a way to treat each document independently from the others and extract its entities separately. Will batch processing work for me? What can I do for it? Can you help me please?
Unfortunately there is no way to make the billing cheaper, because the pricing of Document AI is calculated on a per-page/per-document basis. See Document AI pricing.
With regards to your question:
I am looking for a way to treat each document independently from the others and extract its entities separately. Will batch processing work for me?
Yes, batch processing will work for you, but the pricing is the same as with processDocument. See the pricing info I attached above.
The only difference between batch processing and processDocument is that instead of sending a single request for a single document, batch processing sends all your documents in a single request. The response is then stored in a GCS bucket that you define in the batch process options. See the batch process sample code.
Another thing to add is that batch processing processes the documents asynchronously. This means that when the request is sent, the processing is done on the backend, and you can poll the status of your request to see whether it is still processing or done.
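A minimal sketch of that flow, assuming the google-cloud-documentai Python client; the project, processor ID and GCS URIs are placeholders:

```python
# Minimal sketch of batch processing with the google-cloud-documentai
# Python client: all input PDFs go in one request, the output Document
# JSON lands in the GCS prefix you configure, and you poll the
# long-running operation. Bucket names and IDs are placeholders.
from google.cloud import documentai

def batch_process(gcs_input_uris, gcs_output_prefix):
    client = documentai.DocumentProcessorServiceClient()
    name = client.processor_path("my-project", "us", "my-expense-processor-id")

    input_config = documentai.BatchDocumentsInputConfig(
        gcs_documents=documentai.GcsDocuments(
            documents=[
                documentai.GcsDocument(gcs_uri=uri, mime_type="application/pdf")
                for uri in gcs_input_uris
            ]
        )
    )
    output_config = documentai.DocumentOutputConfig(
        gcs_output_config=documentai.DocumentOutputConfig.GcsOutputConfig(
            gcs_uri=gcs_output_prefix
        )
    )

    operation = client.batch_process_documents(
        request=documentai.BatchProcessRequest(
            name=name,
            input_documents=input_config,
            document_output_config=output_config,
        )
    )
    # Processing happens asynchronously on the backend; this blocks
    # until the operation reports it is done.
    operation.result(timeout=1800)
```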

Integration of BigQuery and Dialogflow

I'm getting started on Dialogflow and I would like to integrate it with BigQuery.
I have some tables in BigQuery with different data, for instance a record of alarms that a wind turbine showed over time.
In one of my test cases, let's say I want my chatbot to tell me what alarms were raised in the wind turbine number 5 of my farm, on the 25th of October.
I have already created a chatbot in Dialogflow that asks for all the necessary parameters of the enquiry, such as the wind farm name, the wind turbine number, the date, and the name of the alarm.
My doubt now is how I can send those parameters to BigQuery in order to dig into my tables, extract the required information, and print it in Dialogflow.
I have been looking for documentation or tutorials but nothing came out that could fit my case...
Thanks in advance!
You need to implement fulfillment. It triggers a webhook, for example a Cloud Function or a Cloud Run service.
This webhook call contains the values gathered by your intent and its parameters. You have to extract them and perform your processing, for example a call to BigQuery. Then format the response and return it to Dialogflow for display.
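As a hedged sketch, assuming a Python Cloud Function and the google-cloud-bigquery client; the table and parameter names (farm, turbine, date) are placeholders for whatever you defined in your intent:

```python
# Hedged sketch of a Cloud Function fulfillment webhook: pull the
# intent parameters from the Dialogflow request, query BigQuery with
# them, and send the text back as fulfillmentText. The table and
# parameter names (farm, turbine, date) are placeholders.
from google.cloud import bigquery

def dialogflow_webhook(request):
    params = request.get_json()["queryResult"]["parameters"]

    bq = bigquery.Client()
    query = """
        SELECT alarm_name
        FROM `my-project.turbines.alarms`   -- placeholder table
        WHERE farm = @farm AND turbine = @turbine AND DATE(raised_at) = @day
    """
    job = bq.query(
        query,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("farm", "STRING", params["farm"]),
                bigquery.ScalarQueryParameter("turbine", "INT64", int(params["turbine"])),
                # Assumes Dialogflow sends the date as an ISO 8601 string.
                bigquery.ScalarQueryParameter("day", "DATE", params["date"][:10]),
            ]
        ),
    )
    alarms = [row.alarm_name for row in job.result()]
    return {"fulfillmentText": "Alarms raised: " + (", ".join(alarms) or "none")}
```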

Google Admin Reports API: Users Usage Report stats accuracy

I am trying to use the Google Admin Reports API: Users Usage Report to pull emails received/sent per user per day in our org's Google Apps domain.
When I use the Google APIs Explorer to pull my own stats for a particular day and compare them with the real situation, the numbers seem far off.
For example, on Sunday, 7th Dec 2014, I only sent out one email. But the stats shows there were 4 emails sent out by me on that day.
Any assistance would be appreciated
Cheers,
You should get the same results as searching in Gmail:
in:sent from:me after:2014/12/07 before:2014/12/08
The missing bit is the time zone the server is using, which in my research is always Pacific Standard Time.
Did you:
Send out any calendar invitations that day? (1 email per attendee)
Share any Google Drive files/folders that day? (1 email per file shared)
Send mail from a Google Group?
There are likely other actions you may have performed in other Google Apps which caused emails to go out in your name and count against your quota, but not necessarily show in your Sent folder.
If you'd like for these messages to appear in your Sent folder, turn on Comprehensive Storage.
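For reference, a minimal sketch of pulling those per-user Gmail counts with the Reports API (reports_v1), assuming domain-wide delegated service-account credentials; the gmail:num_emails_sent / gmail:num_emails_received parameter names should be checked against the current parameter list:

```python
# Hedged sketch: pull per-user Gmail send/receive counts with the Admin
# SDK Reports API (reports_v1). Requires delegated credentials with the
# reports.usage.readonly scope; parameter names are assumptions to
# verify against the Users Usage Report parameter reference.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/admin.reports.usage.readonly"]

creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES
).with_subject("admin@example.com")          # an admin user to impersonate

reports = build("admin", "reports_v1", credentials=creds)
usage = reports.userUsageReport().get(
    userKey="user@example.com",
    date="2014-12-07",                        # report dates use Pacific time
    parameters="gmail:num_emails_sent,gmail:num_emails_received",
).execute()

for report in usage.get("usageReports", []):
    for item in report.get("parameters", []):
        print(item["name"], item.get("intValue"))
```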