"We can not access the URL currently." - google-cloud-platform

When I call the Google Vision API, it returns "We can not access the URL currently.", but the image resource exists and is accessible. The endpoint:
https://vision.googleapis.com/v1/images:annotate
request content:
{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "http://yun.jybdfx.com/static/img/homebg.jpg"
        }
      },
      "features": [
        {
          "type": "TEXT_DETECTION"
        }
      ],
      "imageContext": {
        "languageHints": [
          "zh"
        ]
      }
    }
  ]
}
response content:
{
  "responses": [
    {
      "error": {
        "code": 4,
        "message": "We can not access the URL currently. Please download the content and pass it in."
      }
    }
  ]
}

As of August 2017, this is a known issue with the Google Cloud Vision API (source). It appears to reproduce for some users, but not deterministically, and I've run into it myself with many images.
Current workarounds include either uploading your content to Google Cloud Storage and passing its gs:// URI (note that it does not have to be publicly readable on GCS) or downloading the image locally and passing it to the Vision API in base64 format.
Here's an example in Node.js of the latter approach:
const request = require('request-promise-native').defaults({
encoding: 'base64'
})
const data = await request(image)
const response = await client.annotateImage({
image: {
content: data
},
features: [
{ type: vision.v1.types.Feature.Type.LABEL_DETECTION },
{ type: vision.v1.types.Feature.Type.CROP_HINTS }
]
})

I faced the same issue when trying to call the API using a Firebase Storage download URL (although it worked initially).
After looking around, I found the example below in the API docs for Node.js.
Node.js example
// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');
// Creates a client
const client = new vision.ImageAnnotatorClient();
/**
* TODO(developer): Uncomment the following lines before running the sample.
*/
// const bucketName = 'Bucket where the file resides, e.g. my-bucket';
// const fileName = 'Path to file within bucket, e.g. path/to/image.png';
// Performs text detection on the gcs file
const [result] = await client.textDetection(`gs://${bucketName}/${fileName}`);
const detections = result.textAnnotations;
console.log('Text:');
detections.forEach(text => console.log(text));

For me, the only thing that works is uploading the image to Google Cloud Storage and passing its gs:// URI in the request.

In my case, I was trying to retrieve an image hosted on Cloudinary, our main image hosting provider.
When I accessed the same image hosted on our secondary, Rackspace-powered CDN, Google OCR was able to access it.
I'm not sure why Cloudinary didn't work when I could open the image in my web browser, but that was my little workaround.

I believe the error is caused by the Cloud Vision API refusing to download images on a domain whose robots.txt file blocks Googlebot or Googlebot-Image.
The workaround that others mentioned is in fact the proper solution: download the images yourself and either pass them in the image.content field or upload them to Google Cloud Storage and use the image.source.gcsImageUri field.
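For reference, a request body using the gcsImageUri field might look like this (the bucket and object names below are placeholders):
{
  "requests": [
    {
      "image": {
        "source": {
          "gcsImageUri": "gs://your-bucket/homebg.jpg"
        }
      },
      "features": [
        {
          "type": "TEXT_DETECTION"
        }
      ],
      "imageContext": {
        "languageHints": [
          "zh"
        ]
      }
    }
  ]
}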

I resolved this issue by passing the GCS URI (e.g. gs://bucketname/filename.jpg) instead of the public or authenticated URL.
const vision = require('@google-cloud/vision');
function uploadToGoogleCloudlist (req, res, next) {
const originalfilename = req.file.originalname;
const bucketname = "yourbucketname";
const imageURI = "gs://"+bucketname+"/"+originalfilename;
const client = new vision.ImageAnnotatorClient(
{
projectId: 'yourprojectid',
keyFilename: './router/fb/yourprojectid-firebase.json'
}
);
var visionjson;
async function getimageannotation() {
const [result] = await client.imageProperties(imageURI);
visionjson = result;
console.log ("vision result: "+JSON.stringify(visionjson));
return visionjson;
}
getimageannotation().then( function (result){
var datatoup = {
url: imageURI || ' ',
filename: originalfilename || ' ',
available: true,
vision: result,
};
})
.catch(err => {
console.error('ERROR CODE:', err);
});
next();
}

I faced the same issue several days ago.
In my case the problem was caused by using queues and making many API requests at the same time from the same IP. After reducing the number of parallel processes from 8 to 1, the rate of these errors dropped from ~30% to under 1%.
Maybe this will help somebody. I suspect there is some internal limit on Google's side for fetching remote images (as others have reported, using Google Cloud Storage also avoids the problem).
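As a rough illustration of that workaround, here is a minimal sketch that sends the annotation requests one at a time instead of in parallel (client is assumed to be an ImageAnnotatorClient and imageUris your queue of URLs):
// Sketch: throttle Vision requests by running them one at a time
// instead of firing the whole queue in parallel.
async function annotateSequentially(client, imageUris) {
  const results = [];
  for (const uri of imageUris) {
    // One request at a time; add a small delay here if errors persist.
    const [response] = await client.annotateImage({
      image: { source: { imageUri: uri } },
      features: [{ type: 'TEXT_DETECTION' }],
    });
    results.push(response);
  }
  return results;
}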

My hypothesis is that an overall (short) timeout exists on the Google API side, which limits the number of files that can actually be retrieved.
Sending 16 images for batch labeling is possible, but only 5 or 6 get labeled, because the origin web server hosting the images cannot return all 16 files within that timeout.
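If that hypothesis is right, splitting large batches into smaller ones should help; a rough sketch with an arbitrary chunk size (client is an ImageAnnotatorClient, requests are the per-image request objects):
// Sketch: split a large list of requests into smaller batches so the
// origin server only has to serve a few images per call.
async function annotateInChunks(client, requests, chunkSize = 4) {
  const responses = [];
  for (let i = 0; i < requests.length; i += chunkSize) {
    const chunk = requests.slice(i, i + chunkSize);
    const [result] = await client.batchAnnotateImages({ requests: chunk });
    responses.push(...result.responses);
  }
  return responses;
}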

In my case, the image URI I specified in the request pointed at a large image, roughly 4000 x 6000 px. When I changed it to a smaller version of the image, the request succeeded.
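If large source images are the trigger, downscaling before sending is one workaround; a sketch using the third-party sharp package (an assumption on my part, not something the Vision docs prescribe):
// Sketch: download the image, shrink it, and send the bytes directly
// instead of the original URI. Uses the third-party "sharp" package.
const axios = require('axios');
const sharp = require('sharp');

async function annotateResized(client, imageUrl) {
  const { data } = await axios.get(imageUrl, { responseType: 'arraybuffer' });
  const resized = await sharp(Buffer.from(data))
    .resize({ width: 1600, withoutEnlargement: true })  // arbitrary cap
    .toBuffer();
  return client.annotateImage({
    image: { content: resized.toString('base64') },
    features: [{ type: 'TEXT_DETECTION' }],
  });
}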

The very same request works for me. It is possible that the image host was temporarily down or had issues on their side. If you retry the request, it will most likely work for you.
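Since the failure often appears transient, a simple retry with backoff is a reasonable guard; a minimal sketch:
// Sketch: retry the annotate call a few times with a growing delay,
// since the URL-fetch failure is often transient.
async function annotateWithRetry(client, request, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    const [response] = await client.batchAnnotateImages({ requests: [request] });
    const error = response.responses[0].error;
    if (!error || !/access the URL/i.test(error.message)) {
      return response.responses[0];
    }
    await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1)));  // back off
  }
  throw new Error('Vision API could not fetch the image after retries');
}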

Related

Google Cloud Functions Credentials for Local Development

I have a google cloud function. Within this function, I want to write files to GCS (google cloud storage), then get a signed URL of the file that is written to GCS and send that URL to the caller.
For local development, I run the functions locally using the functions-framework command:
functions-framework --source=.build/ --target=http-function --port 8082
When I want to write to GCS or get the signed URL, the cloud functions framework just tries to get the credentials from the signed-in gcloud CLI user. However, I want to point it to read the credentials from a service account. For all other gcloud development purposes, we have put the service account information in a local creds.json file and point the gcloud to read from that file.
Is there any way I can achieve this for functions? Meaning that when I start the functions locally (using functions-framework), I point it to the creds.json file to read the credentials from there?
All of Google's SDKs (e.g., for GCS) use Application Default Credentials, which you should rely on instead of explicitly pointing to a key file. If that holds for functions-framework as well, then exporting the GOOGLE_APPLICATION_CREDENTIALS variable should work.
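For example, assuming your key lives at ./creds.json, something along these lines should make the framework pick it up (the flags mirror the command above):
# Point Application Default Credentials at the service account key,
# then start the framework as usual.
export GOOGLE_APPLICATION_CREDENTIALS=./creds.json
functions-framework --source=.build/ --target=http-function --port 8082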
The command gcloud auth application-default login is a better recommendation in that case, especially for testing signed URLs: with that local credential, as with the Cloud Functions credential (obtained through the metadata server), no private key is present, so the URL must be signed in a specific manner (by providing an access token and the service account that should sign the URL).
Using gcloud auth application-default login creates Application Default Credentials, which have all the powers of the user's account and are persisted as a key file at {HOME}/.config/gcloud/application_default_credentials.json.
Cloud Function Local Development Authentication
This is what I do for local development of a Google Cloud Function, written in Node.js, that is triggered by a Pub/Sub event. The function reads a file from Google Cloud Storage and uses the Functions Framework for Node.js.
TL;DR;
# shell A
gcloud auth application-default login
npm start
pub/sub event message
# shell B
curl -d "#mockPubSub.json" \
-X POST \
-H "Content-Type: application/json" \
http://localhost:8080
Greater Details
Cloud Function with Functions Framework
Docs: Functions Framework Nodejs
package.json
note the --target and --signature-type
{
...
"scripts": {
"start": "npx functions-framework --target=helloPubSub --signature-type=http"
},
"dependencies": {
"#google-cloud/debug-agent": "^7.0.0",
"#google-cloud/storage": "^6.0.0"
},
"devDependencies": {
"#google-cloud/functions-framework": "^3.1.2"
}
...
}
sample nodejs cloud function that downloads file into memory
/* modified from the sample
index.js
*/
const {Storage} = require('@google-cloud/storage');
function log(message, severity = 'DEBUG', payload) {
// Structured logging
// https://cloud.google.com/functions/docs/monitoring/logging#writing_structured_logs
if (!!payload) {
// If payload is an Error, get the stack trace.
if (payload instanceof Error && !!payload.stack) {
if (!!message ) {
message = message + '\n' + payload.stack;
} else {
message = payload.stack;
}
}
}
const logEntry = {
message: message,
severity: severity,
payload : payload
};
console.log(JSON.stringify(logEntry));
}
function getConfigFile(payload){
console.log("Get Config File from GCS")
const bucketName = 'some-bucket-in-a-project';
const fileName = 'config.json';
// Creates a client
const storage = new Storage();
async function downloadIntoMemory() {
// Downloads the file into a buffer in memory.
const contents = await storage.bucket(bucketName).file(fileName).download();
console.log(
`Contents of gs://${bucketName}/${fileName} are ${contents.toString()}.`
);
}
downloadIntoMemory().catch(console.error);
}
exports.helloPubSub = async (pubSubEvent, context) => {
/*
Read payload from the event and log the exception in App project if the payload cannot be parsed
*/
try {
const payload = Buffer.from(pubSubEvent.body.message.data, 'base64').toString()
const pubSubEventObj = JSON.parse(payload) ;
console.log("name: ", pubSubEventObj.name);
getConfigFile(pubSubEventObj)
} catch (err) {
log('failed to process payload', 'ERROR', err);
}
};
Mock Message for Pub/Sub Event
blog reference, but I'm not using the emulator
myJson.json
{"widget": {
"debug": "on",
"window": {
"title": "Sample Konfabulator Widget",
"name": "main_window",
"width": 500,
"height": 500
},
"image": {
"src": "Images/Sun.png",
"name": "sun1",
"hOffset": 250,
"vOffset": 250,
"alignment": "center"
},
"text": {
"data": "Click Here",
"size": 36,
"style": "bold",
"name": "text1",
"hOffset": 250,
"vOffset": 100,
"alignment": "center",
"onMouseUp": "sun1.opacity = (sun1.opacity / 100) * 90;"
}
}}
encode for Pub/sub message ( likely there's a better way )
cat myJson.json | grep -v % | base64
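An alternative, assuming GNU coreutils is available, avoids the line wrapping entirely:
base64 -w 0 myJson.json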
take that output and put it into value for the data key:
mockPubSub.json
{
"message": {
"attributes": {
"greeting": "Hello from the Cloud Pub/Sub Emulator!"
},
"data": "< put the output of the base64 from above here >",
"messageId": "136969346945"
},
"subscription": "projects/myproject/subscriptions/mysubscription"
}
Follow the steps from the TL;DR above.
Disclaimers
gcloud auth application-default login uses the permissions of the user who executes the command. So remember, in production, the service account the cloud function uses will need to read from the storage bucket.
while scrubbing this (i.e. renaming bits) and copying it over, I may have messed it up. Sorry if that is true.
This is all somewhat contrived; if you are curious, my design is to take a message from Cloud Scheduler that includes relevant details about what to read from the config.
this article explains a way to hot-reload the cloud function
It remains unclear to me why I'm using `--signature-type=http` to mock the message, but for now, I am.

s3.putObject(params).promise() does not upload file, but successfully executes then() callback

I have made a long series of attempts to put a file into an S3 bucket, after which I need to update my model.
I have the following code (note that I have tried the commented lines too; it doesn't work either with or without them).
The problem observed:
Everything in the first .then() block (successCallBack()) executes successfully, but I do not see the result of s3.putObject().
The bucket in question is public, with no access restrictions. This used to work with the sls offline option; then, because it wasn't working on AWS, I made a lot of changes and managed to get successCallBack() working, which handles the database work successfully. However, the file upload still doesn't work.
Some questions:
While solving this, the real questions I am pondering / searching for are:
Is a Lambda supposed to return something? I looked at the AWS docs, but they only have fragmented code snippets.
Putting await in front of s3.putObject(params).promise() does not help. I see samples both with and without await in front of calls that return an AWS promise(), and I'm not sure which is correct.
What is the correct way to chain async functions within one Lambda function?
UPDATE:
var myJSON = {}
const createBook = async (event) => {
let bucketPath = "https://com.xxxx.yyyy.aa-bb-zzzzzz-1.amazonaws.com"
let fileKey = //file key
let path = bucketPath + "/" + fileKey;
myJSON = {
//JSON from headers
}
var s3 = new AWS.S3();
let buffer = Buffer.from(event.body, 'utf8');
var params = {Bucket: 'com.xxxx.yyyy', Key: fileKey, Body: buffer, ContentEncoding: 'utf8'};
let putObjPromise = s3.putObject(params).promise();
putObjPromise
.then(successCallBack())
.then(c => {
console.log('File upload Success!');
return {
statusCode: 200,
headers: { 'Content-Type': 'text/plain' },
body: "Success!!!"
}
})
.catch(err => {
let str = "File upload / Creation error:" + err;
console.log(str);
return {
statusCode: err.statusCode || 500,
headers: { 'Content-Type': 'text/plain' },
body: str
}
});
}
const successCallBack = async () => {
console.log("Inside success callback - " + JSON.stringify(myJSON)) ;
const { myModel } = await connectToDatabase()
console.log("After connectToDatabase")
const book = await myModel.create(myJSON)
console.log(msg);
}
Finally, I got this to work. My code already worked in the sls offline setup.
What was different on the AWS endpoint?
What I observed in the console was that my Lambda was set to run inside a VPC.
When I chose "No VPC", it worked. I do not know whether this is best practice; there must be some security advantage to functions running inside a VPC.
I came across this huge explanation about VPCs, but I could not find anything related to S3.
The code posted in the question currently runs fine on the AWS endpoint.
If the Lambda is running in a VPC, then you need a VPC endpoint to access a service outside the VPC, and S3 is outside the VPC. If security is a concern, creating a VPC endpoint solves the issue in a better way. Also, if security is a concern, then by adding a policy (or using the default AmazonS3FullAccess policy) to the role the Lambda uses, the S3 bucket wouldn't need to be public.
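For the asker's side question about chaining async calls: inside an async handler you can await each step and return the response object directly. A minimal sketch reusing the names from the question (fileKey, successCallBack and the bucket name are assumed to be defined as in the question; this is not the exact production code):
// Sketch: await the upload, then the database work, then return the
// HTTP response from the handler itself.
const AWS = require('aws-sdk');

const createBook = async (event) => {
  const s3 = new AWS.S3();
  const params = {
    Bucket: 'com.xxxx.yyyy',
    Key: fileKey,
    Body: Buffer.from(event.body, 'utf8'),
    ContentEncoding: 'utf8'
  };
  try {
    await s3.putObject(params).promise(); // upload completes before moving on
    await successCallBack();              // then the database work
    return { statusCode: 200, headers: { 'Content-Type': 'text/plain' }, body: 'Success!!!' };
  } catch (err) {
    return { statusCode: err.statusCode || 500, headers: { 'Content-Type': 'text/plain' }, body: String(err) };
  }
};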

Cloud Vision PDF unsupported input file format

I'm using Cloud Vision to detect text in a PDF file. I've used the code provided in the documentation, but it throws an error saying "unsupported input file format". I'm 100% sure the file is a PDF, and I even used the sample resource file https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/vision/cloud-client/detect/resources/kafka.pdf. What should I do?
EDIT
This is the code I used, taken straight from the documentation.
const vision = require('@google-cloud/vision').v1;
const client = new vision.ImageAnnotatorClient();
const gcsSourceUri = `gs://${bucketName}/${fileName}`;
const gcsDestinationUri = `gs://${bucketName}/${outputPrefix}/`;
const inputConfig = {
// Supported mime_types are: 'application/pdf' and 'image/tiff'
mimeType: 'application/pdf',
gcsSource: {
uri: gcsSourceUri,
},
};
const outputConfig = {
gcsDestination: {
uri: gcsDestinationUri,
},
};
const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];
const request = {
requests: [
{
inputConfig: inputConfig,
features: features,
outputConfig: outputConfig,
},
],
};
const [operation] = await client.asyncBatchAnnotateFiles(request);
const [filesResponse] = await operation.promise();
const destinationUri =
filesResponse.responses[0].outputConfig.gcsDestination.uri;
console.log('Json saved to: ' + destinationUri);
I tried moving that kafka.pdf into my GCS bucket and ran the Python sample code, which worked as expected. Maybe something went wrong with the kafka.pdf file when you moved it into your GCS bucket.
Try the sample file they provide, 'gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf', to see if it works for you. The census file works for me as well.
I was getting the same response from the batch annotation service on otherwise valid PDF files. In my case it came down to copy/pasting the example code from the Node sample for uploading files to Google Cloud Storage, which includes the gzip and cacheControl options.
It doesn't look like you've included those values, but after a lot of head scratching I found that if I uploaded my PDFs without those options, the annotation service accepted them. Not an exact reproduction of your issue, but I hope it leads to progress for you :)
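For reference, a minimal upload sketch that leaves those options out (bucket and file names are placeholders):
// Sketch: upload the PDF without the gzip / cacheControl options
// mentioned above.
const { Storage } = require('@google-cloud/storage');
const storage = new Storage();

async function uploadPdf(bucketName, localPath, destination) {
  await storage.bucket(bucketName).upload(localPath, { destination });
  console.log(`Uploaded ${localPath} to gs://${bucketName}/${destination}`);
}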

How to return an entire Datastore table by name using Node.js on a Google Cloud Function

I want to retrieve a table (with all rows) by name, making an HTTP request with something like {"table": user} in the body.
I tried this code without success:
'use strict';
const {Datastore} = require('@google-cloud/datastore');
// Instantiates a client
const datastore = new Datastore();
exports.getUsers = (req, res) => {
//Get List
const query = this.datastore.createQuery('users');
this.datastore.runQuery(query).then(results => {
const customers = results[0];
console.log('User:');
customers.forEach(customer => {
const cusKey = customer[this.datastore.KEY];
console.log(cusKey.id);
console.log(customer);
});
})
.catch(err => { console.error('ERROR:', err); });
}
Google Datastore is a NoSQL database that works with entities, not tables. What you want is to load all the "records", which are key identifiers in Datastore, together with all their "properties", which are the "columns" you see in the Console. And you want to load them based on the "kind" name, which is the "table" you are referring to.
Here is how to retrieve all the key identifiers and their properties from Datastore, using an HTTP-triggered Cloud Function running in the Node.js 8 environment:
Create a Google Cloud Function and set the trigger to HTTP.
Choose Node.js 8 as the runtime.
In index.js, replace all the code with this GitHub code.
In package.json add:
{
"name": "sample-http",
"version": "0.0.1",
"dependencies": {
"#google-cloud/datastore": "^3.1.2"
}
}
Under Function to execute add loadDataFromDatastore, since this is the name of the function that we want to execute.
NOTE: This will log all the loaded records into the Stackdriver logs of the Cloud Function. The response for each record is JSON, so you will have to convert the response to a JSON object to get the data you want. Get the idea and modify the code accordingly.
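For orientation, here is a rough sketch of what such a loadDataFromDatastore function could look like (the kind name is taken from the request body; this is an illustration, not the exact linked code):
// Sketch: HTTP-triggered function that loads every entity of a kind
// whose name is passed in the request body, e.g. {"table": "users"}.
const { Datastore } = require('@google-cloud/datastore');
const datastore = new Datastore();

exports.loadDataFromDatastore = async (req, res) => {
  try {
    const kind = req.body.table;               // e.g. "users"
    const query = datastore.createQuery(kind);
    const [entities] = await datastore.runQuery(query);
    const rows = entities.map(entity => ({
      id: entity[datastore.KEY].id || entity[datastore.KEY].name,
      ...entity,
    }));
    res.status(200).json(rows);
  } catch (err) {
    console.error('ERROR:', err);
    res.status(500).send(String(err));
  }
};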

Amazon Rekognition for text detection

I have images of receipts and I want to store the text in the images separately. Is it possible to detect text from images using Amazon Rekognition?
Update from November 2017:
Amazon Rekognition announces real-time face recognition, Text in Image recognition, and improved face detection
Read the announcement here: https://aws.amazon.com/about-aws/whats-new/2017/11/amazon-rekognition-announces-real-time-face-recognition-text-in-image-recognition-and-improved-face-detection/
No, Amazon Rekognition does not provide Optical Character Recognition (OCR).
At the time of writing (March 2017), it only provides:
Object and Scene Detection
Facial Analysis
Face Comparison
Facial Recognition
There is no AWS-provided service that offers OCR. You would need to use a 3rd-party product.
Amazon doesn't provide an OCR API. You can use the Google Cloud Vision API for document text detection instead; it costs $3.5/1000 images, though. To test Google's API, open this link and paste the code below into the test request body on the right.
https://cloud.google.com/vision/docs/reference/rest/v1/images/annotate
{
"requests": [
{
"image": {
"source": {
"imageUri": "JPG_PNG_GIF_or_PDF_url"
}
},
"features": [
{
"type": "DOCUMENT_TEXT_DETECTION"
}
]
}
]
}
You may get better results with Amazon Textract although it's currently only available in limited preview.
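If you do get access to Textract, the call shape in the JS SDK is similar to Rekognition's; a rough sketch (assumes aws-sdk v2 and that Textract is enabled in your region):
// Rough sketch: detect text on a receipt image with Amazon Textract.
const AWS = require('aws-sdk');
const fs = require('fs');

const textract = new AWS.Textract({ region: 'us-east-1' });

async function detectReceiptText(filename) {
  const Bytes = fs.readFileSync(filename);
  const result = await textract.detectDocumentText({ Document: { Bytes } }).promise();
  // LINE blocks hold the detected lines of text.
  return result.Blocks.filter(b => b.BlockType === 'LINE').map(b => b.Text);
}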
It's possible to detect text in an image using the AWS JS SDK for Rekognition but your results may vary.
/* jshint esversion: 6, node:true, devel: true, undef: true, unused: true */
// Import libs.
const AWS = require('aws-sdk');
const axios = require('axios');
// Grab AWS access keys from Environmental Variables.
const { S3_ACCESS_KEY, S3_SECRET_ACCESS_KEY, S3_REGION } = process.env;
// Configure AWS with credentials.
AWS.config.update({
accessKeyId: S3_ACCESS_KEY,
secretAccessKey: S3_SECRET_ACCESS_KEY,
region: S3_REGION
});
const rekognition = new AWS.Rekognition({
apiVersion: '2016-06-27'
});
const TEXT_IMAGE_URL = 'https://loremflickr.com/g/320/240/text';
(async url => {
// Fetch the URL.
const textDetections = await axios
.get(url, {
responseType: 'arraybuffer'
})
// Wrap the response bytes in a Buffer.
.then(response => Buffer.from(response.data))
// Pass bytes to SDK
.then(bytes =>
rekognition
.detectText({
Image: {
Bytes: bytes
}
})
.promise()
)
.catch(error => {
console.log('[ERROR]', error);
return false;
});
if (!textDetections) return console.log('Failed to find text.');
// Output the raw response.
console.log('\n', 'Text Detected:', '\n', textDetections);
// Output to Detected Text only.
console.log('\n', 'Found Text:', '\n', textDetections.TextDetections.map(t => t.DetectedText));
})(TEXT_IMAGE_URL);
See more examples of using Rekognition with NodeJS in this answer.
public async Task<List<string>> IdentifyText(string filename)
{
// Specify the region explicitly rather than relying on the default
AmazonRekognitionClient rekoClient = new AmazonRekognitionClient("Access Key ID", "Secret Access Key", RegionEndpoint.USEast1);
Amazon.Rekognition.Model.Image img = new Amazon.Rekognition.Model.Image();
byte[] data = null;
using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
{
data = new byte[fs.Length];
fs.Read(data, 0, (int)fs.Length);
}
img.Bytes = new MemoryStream(data);
DetectTextRequest dfr = new DetectTextRequest();
dfr.Image = img;
var outcome = rekoClient.DetectText(dfr);
return outcome.TextDetections.Select(x=>x.DetectedText).ToList();
}