Model directory expected to contain the 'export.meta' file - google-cloud-ml

While creating a new version of a model, after selecting a bucket and folder, I got this error from the Cloud Console:
{
  "error": {
    "code": 400,
    "message": "Field: version.deployment_uri Error: The model directory gs://ml-codelab/v1-output/ is expected to contain the 'export.meta' file. Please make sure it exists and Cloud ML service account cloud-ml-service@xxx.iam.gserviceaccount.com has read access to it",
    "status": "FAILED_PRECONDITION",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "version.deployment_uri",
            "description": "The model directory gs://ml-codelab/v1-output/ is expected to contain the 'export.meta' file. Please make sure it exists and Cloud ML service account cloud-ml-service@xxxx.iam.gserviceaccount.com has read access to it"
          }
        ]
      }
    ]
  }
}

You need to create a meta graph when you export your model. You can do this using a saver, e.g.

import os
import tensorflow as tf

saver = tf.train.Saver()
# saver.save writes the checkpoint files plus the accompanying export.meta file.
saver.save(sess, os.path.join(FLAGS.output_dir, "export"))

Typically you save the session and graph separately, because your serving graph can be different from the training graph.
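For example, a minimal sketch of exporting the serving graph's MetaGraph on its own (TF 1.x; serving_graph and output_dir are assumed names, not from the original answer):

import os
import tensorflow as tf

# Hypothetical sketch: export the serving graph's MetaGraph separately
# from the training checkpoint, so Cloud ML finds an export.meta file.
with serving_graph.as_default():
    tf.train.export_meta_graph(filename=os.path.join(output_dir, "export.meta"))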

Related

GCP recommendation data format for catalog

I am currently working on Recommendations AI. Since I am new to GCP recommendations, I have been struggling with the data format for the catalog. I read the documentation, and it says each product item's JSON should be on a single line.
I understand this, but it would be really helpful to see what the JSON format looks like in practice, because the example in the documentation is ambiguous to me, and I am trying to use the console to import the data.
I tried to import data that looks like the example below, but I got an "invalid JSON format" error 100 times, with lots of different reasons such as unexpected tokens, missing elements, and so on.
[
  {
    "id": "1",
    "title": "Toy Story (1995)",
    "categories": [
      "Animation",
      "Children's",
      "Comedy"
    ]
  },
  {
    "id": "2",
    "title": "Jumanji (1995)",
    "categories": [
      "Adventure",
      "Children's",
      "Fantasy"
    ]
  },
  ...
]
Maybe it failed because each item was not on a single line, but I am also wondering whether the above is enough for importing. I am not sure whether the data should be wrapped in another property, like:
{
  "inputConfig": {
    "productInlineSource": {
      "products": [
        {
          "id": "1",
          "title": "Toy Story (1995)",
          "categories": [
            "Animation",
            "Children's",
            "Comedy"
          ]
        },
        {
          "id": "2",
          "title": "Jumanji (1995)",
          "categories": [
            "Adventure",
            "Children's",
            "Fantasy"
          ]
        }
      ]
    }
  }
}
I can see the above in the documentation, but it is for importing inline using a POST request; it does not mention anything about importing with the console. I am guessing the same format is also used for the console, but I am not 100% sure, which is why I am asking.
Can anyone show me the exact data format for importing data through the console?
Problem Solved
For those who might have the same question: the exact data format to import using the GCP console looks like
{"id":"1","title":"Toy Story (1995)","categories":["Animation","Children's","Comedy"]}
{"id":"2","title":"Jumanji (1995)","categories":["Adventure","Children's","Fantasy"]}
No square bracket wrapping all the items.
No comma between items.
Only each item on a single line.
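If your source data is a JSON array like the one above, a minimal sketch for flattening it into that one-item-per-line form might look like this (the file names are placeholders):

import json

# Hypothetical helper: rewrite a JSON array of catalog items as
# newline-delimited JSON: no wrapping brackets, no commas between
# items, one item per line.
with open("catalog.json") as src, open("catalog.ndjson", "w") as dst:
    for item in json.load(src):
        dst.write(json.dumps(item) + "\n")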
Posting this Community Wiki for better visibility.
The OP edited the question and added the solution:
The exact data format to import using the GCP console looks like
{"id":"1","title":"Toy Story (1995)","categories":["Animation","Children's","Comedy"]}
{"id":"2","title":"Jumanji (1995)","categories":["Adventure","Children's","Fantasy"]}
No square bracket wrapping all the items.
No comma between items.
Only each item on a single line.
However, I'd like to elaborate a bit.
There are a few ways of importing catalog information:
Importing catalog data from Merchant Center
Importing catalog data from BigQuery
Importing catalog data from Cloud Storage
I guess this is what the OP used, as I was able to import a catalog through the UI and GCS with the JSON file below.
{
  "inputConfig": {
    "catalogInlineSource": {
      "catalogItems": [
        {"id":"111","title":"Toy Story (1995)","categories":["Animation","Children's","Comedy"]}
        {"id":"222","title":"Jumanji (1995)","categories":["Adventure","Children's","Fantasy"]}
        {"id":"333","title":"Test Movie (2020)","categories":["Adventure","Children's","Fantasy"]}
      ]
    }
  }
}
Importing catalog data inline
At the bottom of the Importing catalog information documentation you can find this note:
The line breaks are for readability; you should provide an entire catalog item on a single line. Each catalog item should be on its own line.
It means you should use something similar to NDJSON, a convenient format for storing or streaming structured data that may be processed one record at a time.
If you would like to try the inline method, use the format below; each catalog item is logically a single line, shown here with breaks for readability.
data.json file
{
  "inputConfig": {
    "catalogInlineSource": {
      "catalogItems": [
        {
          "id": "1212",
          "category_hierarchies": [ { "categories": [ "Animation", "Children's" ] } ],
          "title": "Toy Story (1995)"
        },
        {
          "id": "5858",
          "category_hierarchies": [ { "categories": [ "Adventure", "Fantasy" ] } ],
          "title": "Jumanji (1995)"
        },
        {
          "id": "321123",
          "category_hierarchies": [ { "categories": [ "Comedy", "Adventure" ] } ],
          "title": "The Lord of the Rings: The Fellowship of the Ring (2001)"
        }
      ]
    }
  }
}
Command
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data @./data.json \
  "https://recommendationengine.googleapis.com/v1beta1/projects/[your-project]/locations/global/catalogs/default_catalog/catalogItems:import"
{
  "name": "import-catalog-default_catalog-1179023525XX37366024",
  "done": true
}
Please keep in mind that the above method requires service account authentication, otherwise you will get a PERMISSION_DENIED error:

"message" : "Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the translate.googleapis.com. We recommend that most server applications use service accounts instead. For more information about service accounts and how to use them in your application, see https://cloud.google.com/docs/authentication/.",
"status" : "PERMISSION_DENIED"

How to output attributes of resources created?

I'm executing a GCP module to create a service account.
main.tf:
resource "google_service_account" "gsvc_account" {
account_id = "xxx"
display_name = ""
project = "proj-yyy"
}
output "account_id" {
value = "${google_service_account.gsvc_account.account_id}"
}
Once the account is created, a terraform.tfstate file is created containing all details of the account.
terraform.tfstate
{
  "version": 4,
  "terraform_version": "0.12.0",
  "serial": 3,
  "lineage": "aaaa-bbbb-cccc",
  "outputs": {
    "xxx": {
      "value": "xxx",
      "type": "string"
    }
  },
  "resources": [
    {
      "module": "module.gsvc_tf",
      "mode": "managed",
      "type": "google_service_account",
      "name": "gsvc_account",
      "provider": "provider.google",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "account_id": "xxx",
            "display_name": "",
            "email": "xxx@yyy.com",
            "id": "projects/proj-yyy/serviceAccounts/xxx@yyy.com",
            "name": "projects/proj-yyy/serviceAccounts/xxx@yyy.com",
            "policy_data": null,
            "project": "proj-xxx",
            "unique_id": "10891885"
          }
        }
      ]
    }
  ]
}
As you can see above, in the module I'm outputting the account_id input variable. Is there a way to output the computed attributes, viz. id, name, etc., so that they can be accessed by another module? The attributes are computed after the resource is created.
From the docs for the google_service_account resource:
the following computed attributes are exported:
email - The e-mail address of the service account. This value should be referenced from any google_iam_policy data sources that would grant the service account privileges.
name - The fully-qualified name of the service account.
unique_id - The unique id of the service account.
You can declare outputs using these attributes in the same way as you declared your account_id output. For example:
output "id" {
value = "${google_service_account.gsvc_account.unique_id}"
}
output "email" {
value = "${google_service_account.gsvc_account.email}"
}
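For instance, if the other module is instantiated from the same root configuration, the root can wire these outputs straight into it; a hypothetical sketch (the module names, paths, and service_email variable are placeholders):

module "gsvc_tf" {
  source = "./modules/gsvc"
}

module "consumer" {
  source = "./modules/consumer"

  # Assumes the consumer module declares a "service_email" variable
  # and the gsvc module declares the "email" output shown above.
  service_email = "${module.gsvc_tf.email}"
}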
Re this: "so that they can be accessed by another module" ... if the "other module" uses the same state file then the above outputs are addressable using ...
${google_service_account.gsvc_account.account_id}
${google_service_account.gsvc_account.email}
etc
... i.e. you don't need outputs at all. So, I'm guessing that the "other module" is in a separate project / workspace / repo and hence is using a different state file. If so, then you would access these outputs via remote state. For example, you would declare a remote state data source to point at whatever state contains your outputs:
resource "terraform_remote_state" "the_other_state" {
backend = "..."
config {
...
}
}
And then refer to the outputs within that state like so:

${data.terraform_remote_state.the_other_state.outputs.account_id}
${data.terraform_remote_state.the_other_state.outputs.email}

etc.
If your other module is run against a different state file (e.g. your Terraform code is in a separate directory), then you might be better off using the google_service_account data source instead of trying to output the values of the resource to your state file and using the terraform_remote_state data source to fetch them.
The documentation for the google_service_account data source shows a nice example of how you would use this:
data "google_service_account" "myaccount" {
account_id = "myaccount-id"
}
resource "google_service_account_key" "mykey" {
service_account_id = "${data.google_service_account.myaccount.name}"
}
resource "kubernetes_secret" "google-application-credentials" {
metadata = {
name = "google-application-credentials"
}
data {
credentials.json = "${base64decode(google_service_account_key.mykey.private_key)}"
}
}
This avoids needing to configure your remote state data source and can be significantly simpler. In fact, this is the way I'd recommend accessing information about an existing resource in any case where the provider has a suitable data source. I'd even go so far as to recommend the external data source over the terraform_remote_state data source if there's another way to get at that information (e.g. through a cloud provider's CLI), just because the terraform_remote_state data source is particularly clunky.

Google Cloud Vision API only returns "name"

I am trying to use the Google Cloud Vision API, specifically this REST method:
POST https://vision.googleapis.com/v1/files:asyncBatchAnnotate
My request is:
{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://redaction-vision/pdf_page1_employment_request.pdf"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "outputConfig": {
        "gcsDestination": {
          "uri": "gs://redaction-vision"
        }
      }
    }
  ]
}
But the response always contains only "name", like below:

{
  "name": "operations/a7e4e40d1e1ac4c5"
}

My "gs" location is valid: when I put a wrong path in "gcsSource", a 404 Not Found error comes back.
Does anyone know why my response looks like this?
This is expected; it will not send you the output as an HTTP response. To see what the API did, you need to go to your destination bucket and check for a file named "xxxxxxxxoutput-1-to-1.json". Also, you need to specify the name of the object in your gcsDestination section, for example: gs://redaction-vision/test.
Since asyncBatchAnnotate is an asynchronous operation, it won't return the result; instead, it returns the name of the operation. You can use that unique name to call GetOperation to check the status of the operation.
Note that there can be more than one output file for your PDF if the PDF has more pages than batchSize, and the output JSON file names change depending on the number of pages, so it isn't safe to always append "output-1-to-1.json".
Make sure that the URI prefix you put in the output config is unique, because you have to do a wildcard search in GCS on the prefix you provide to get all of the JSON files that were created.
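A minimal sketch of that prefix search with the google-cloud-storage Python client might look like this (the bucket and "test" prefix follow the example above and are assumptions):

from google.cloud import storage

# Hypothetical sketch: list every output JSON the async operation
# wrote under the gcsDestination prefix.
client = storage.Client()
for blob in client.list_blobs("redaction-vision", prefix="test"):
    if blob.name.endswith(".json"):
        print(blob.name)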

Google Data Transfer API says completed but nothing has happened?

I'm using the Data Transfer API to programmatically transfer the files owned by user A to user B as part of our exit process.
I look up the email addresses for the two users so that I can retrieve their IDs. I also query the list of data transfer applications to get the application ID for "Drive and Docs".
I pass the built transfer definition to the API and get the following JSON back:
{
  "kind": "admin#datatransfer#DataTransfer",
  "etag": "\"RV_wOygBiIUZUtakV6Iq44-H_Gw/2M4Z2X_c8OpsyQOJxtWDmIHcYzo\"",
  "id": "AKrEtIbF0aAg_4KK7-lHFOpRNPhcgAOWWDEK1HE0zD_EEY-bOPHXuj1rKNrEE-yHPYyjY8vzvZkK",
  "oldOwnerUserId": "101496053770427062754",
  "newOwnerUserId": "118268322014081744703",
  "applicationDataTransfers": [
    {
      "applicationId": "55656082996",
      "applicationTransferStatus": "pending"
    }
  ],
  "overallTransferStatusCode": "inProgress",
  "requestTime": "2017-03-31T10:50:48.560Z"
}
I then query the transfers API to get an update on that transfer and get the following back:
{
  'kind': 'admin#datatransfer#DataTransfer',
  'requestTime': '2017-03-31T10:50:48.560Z',
  'applicationDataTransfers': [
    {
      'applicationTransferStatus': 'completed',
      'applicationId': '55656082996'
    }
  ],
  'newOwnerUserId': '118268322014081744703',
  'oldOwnerUserId': '101496053770427062754',
  'etag': '"RV_wOygBiIUZUtakV6Iq44-H_Gw/ZVnLgj3YLcsURTSzNm8m91tNeC0"',
  'overallTransferStatusCode': 'completed',
  'id': 'AKrEtIbF0aAg_4KK7-lHFOpRNPhcgAOWWDEK1HE0zD_EEY-bOPHXuj1rKNrEE-yHPYyjY8vzvZkK'
}
and, indeed, I get a confirmation email that the files have been transferred.
However, if I look in Google Drive for both users, the files have NOT changed ownership. For user B, a new directory has been created with the email address of user A, but it contains no files and user A still owns all of their files.
What have I done wrong or misunderstood?
Thanks.
I faced the same issue; you need to provide applicationTransferParams with a key and value:
"applicationTransferParams": [
{
"key": string,
"value": [
string
]
}
]
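For the Drive and Docs application the documented parameter is, as far as I recall, PRIVACY_LEVEL (verify this against the Admin SDK Data Transfer docs), so the applicationDataTransfers entry of the insert request would look something like:

"applicationDataTransfers": [
  {
    "applicationId": "55656082996",
    "applicationTransferParams": [
      {
        "key": "PRIVACY_LEVEL",
        "value": ["SHARED", "PRIVATE"]
      }
    ]
  }
]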

Embedding Facebook videos in a website

Using the Graph API I request /me?fields=videos.type(uploaded).fields(id, embed_html). This gives me a list of my uploaded videos:
{
  "id": "[...snip...]",
  "videos": {
    "data": [
      {
        "id": "10151488520332264",
        "embed_html": "<iframe src=\"https://graph.facebook.com/video/embed?video_id=10151488520332264\" width=\"190\" height=\"240\" frameborder=\"0\"></iframe>",
        "updated_time": "2013-02-28T11:09:14+0000"
      },
      [...snip...]
    ]
  }
}
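For reference, the same request as a raw HTTP call might look like this (ACCESS_TOKEN is a placeholder):

curl "https://graph.facebook.com/me?fields=videos.type(uploaded).fields(id,embed_html)&access_token=ACCESS_TOKEN"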
I expect embed_html to be HTML code that embeds the video, but when I use it, the iframe shows only a Graph API error:

{
  "error": {
    "message": "Unknown path components: /embed",
    "type": "OAuthException",
    "code": 2500
  }
}
The video is public, and I get the same error in the Graph API Explorer when requesting it with an access token that has the user_videos permission.
The video object also has a source property, which links directly to the source video file (no player). I could use that and build my own player, but I'd prefer to use the embed code that Facebook thinks is best for the video (and browser).