Google Vision API request size limitation (text detection) - google-cloud-platform

I'm using the Google Vision API via curl (the image is sent as a base64-encoded payload within JSON). I only get correct results back when the request sent via curl is under 16 kB or so. As soon as it's over ~16 kB I get no response at all, while exactly the same request with a smaller image works fine.
I have added the request over 16k to pastebin:
{
  "requests": [
    {
      "image": {
        "content": ...base64...
        ....
}
Failing request is here:
https://pastebin.com/dl/vL4Ahfw7
I could only find a 20 MB limit in the docs (https://cloud.google.com/vision/docs/supported-files?hl=th), but nothing that explains the weird issue I'm seeing. Thanks.
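For reference, a rough Node.js equivalent of the kind of request being sent (a sketch only: the file name and API key are placeholders, not values from the question, and the global fetch assumes Node 18+):

// Build the same JSON body the curl request would send and POST it to images:annotate.
const fs = require('fs');

const body = {
  requests: [
    {
      image: { content: fs.readFileSync('image.jpg').toString('base64') },
      features: [{ type: 'TEXT_DETECTION' }]
    }
  ]
};

fetch('https://vision.googleapis.com/v1/images:annotate?key=YOUR_API_KEY', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(body)
})
  .then(res => res.json())
  .then(console.log)
  .catch(console.error);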

Related

Receiving a 429 error when iterating through data using Newman with Postman collection

New to Postman and Newman. I am iterating through a CSV data file; at times the values are few (20 or less) and at times they are many (50 or more). When iterating through large sets of values (50 or more) I receive a 429 error. Is there a way to write a retry function on a single request when the status is not 200?
I am working with the Amazon SP-API, and reading through the documentation it appears there is nothing I can do about the x-amzn-RateLimit-Limit. My current limit is set at 5.0, which I believe means 5 requests per second.
Any advice would be helpful: a retry function, telling the requests to sleep/wait every X amount, another method I am not aware of, etc.
This is the error I receive
{
  "errors": [
    {
      "code": "QuotaExceeded",
      "message": "You exceeded your quota for the requested resource.",
      "details": ""
    }
  ]
}
@Danny Dainton pointed me to the right place. Reading through the documentation, I found that by using options.delayRequest I am able to delay the time between requests. My final code looks like the sample below, and it works now:
newman.run({
  delayRequest: 3000,
  ...
})
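For a more complete picture, a newman.run call with delayRequest might look like the sketch below (the collection and data file names are placeholders, not taken from the original answer):

const newman = require('newman');

newman.run({
  collection: require('./my-collection.json'),  // placeholder collection file
  iterationData: './data.csv',                  // the CSV data file being iterated
  delayRequest: 3000,                           // wait 3 seconds between requests
  reporters: 'cli'
}, function (err, summary) {
  if (err) { throw err; }
  console.log('Run complete, failures:', summary.run.failures.length);
});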

The Content-MD5 you specified is invalid for multi-part uploads

I am using the aws-sdk for Node.js and have a working upload.
const checksum = await this.getChecksum(path);
const payload: S3.PutObjectRequest = {
  Bucket: bucket,
  Key: key,
  Body: fs.createReadStream(path),
  ContentMD5: checksum,
};
return this.s3.upload(payload).promise();
This piece of code works great for small files and takes advantage of ContentMD5 which automatically verifies the file integrity.
Content-MD5
The base64-encoded 128-bit MD5 digest of the message (without the headers) according to RFC 1864. This header can be used as a message integrity check to verify that the data is the same data that was originally sent. Although it is optional, we recommend using the Content-MD5 mechanism as an end-to-end integrity check. For more information about REST request authentication, see REST Authentication.
https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html
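For reference, the getChecksum helper referenced above presumably produces exactly this base64-encoded MD5 digest; a minimal sketch (an assumption, not the author's actual code) could look like:

const crypto = require('crypto');
const fs = require('fs');

// Computes the base64-encoded MD5 digest expected by the Content-MD5 header.
// This is an assumed implementation of the getChecksum helper, not the author's code.
function getChecksum(path) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('md5');
    fs.createReadStream(path)
      .on('data', chunk => hash.update(chunk))
      .on('end', () => resolve(hash.digest('base64')))
      .on('error', reject);
  });
}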
However, it doesn't work for multipart uploads.
The Content-MD5 you specified is invalid for multi-part uploads.
That makes sense, because we send the file chunk by chunk, but then how am I supposed to use this feature with multipart uploads?
I too faced this and did multiple rounds of testing. I finally found the answer and verified it in my own way. If you know a better way, please let me know here.
That said, here's how I solved the issue.
When you create the S3 client you have to create it as below.
const s3 = new AWS.S3({ computeChecksums: true });
Then you can define the s3 upload parameters like below
var fileData = Buffer.from(fileContentString, 'binary');
var s3params = {
  Bucket: bucketName,
  Key: folderName + "/" + fileName,
  ContentType: 'binary',
  Body: fileData
};
Then do the upload as below.
await s3.upload(s3params).promise()
  .then(data => {
    // use this log to verify whether the md5 checksum was verified
    console.log(`File uploaded successfully ${JSON.stringify(data)}`);
    // handle successful upload
  })
  .catch(error => {
    // handle the error
  });
After the upload is successful, here's how I verified it's working.
1. Check the item in S3. In the object details, Object overview section, check the ETag. You should see something like 7e35f58f134d8914604c9fc6c35b2db7-9. The number after the - is how many parts were uploaded. For bigger files it should be greater than 1; in this case it is 9.
2. Check the log in the console (the log with the comment in the code above). You'll see something like the below.
{
  "Location": "https://bucketname.s3.region.amazonaws.com/folderName/fileName",
  "Bucket": "bucketName",
  "Key": "folderName/fileName",
  "ETag": "\"7e35f58f134d8914604c9fc6c35b2db7-9\""
}
3. If you are debugging, you can further test it by printing the s3 upload request.
var uploadRequest = s3.upload(s3params);
console.log(`S3 Request ${JSON.stringify(uploadRequest)}`);
This will print the s3 client configurations. Check whether the 'computeChecksums' is set to true.
I tried to verify with s3.putObject as well, but when I print the request it didn't show me the md5Checksum config in the header as it is intended to. It also gave me a cyclic JSON stringify error when I tried to log the whole object in the same way as in the third point for upload, so I printed httpRequest only.
var uploadRequest = s3.putObject(s3params);
console.log(`S3 Request ${JSON.stringify(uploadRequest.httpRequest)}`);
// The following gives the cyclic JSON stringify issue
// console.log(`S3 Request ${JSON.stringify(uploadRequest)}`);
I'd appreciate it if someone can tell me how to do this and verify it with putObject as well.
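As a side note, the ETag/part-count check from the first point can also be done programmatically; a minimal sketch using headObject (assuming the same s3 client, bucketName, folderName and fileName as above, inside the same async context):

// Fetch the ETag after upload and inspect the multipart part count.
const head = await s3.headObject({
  Bucket: bucketName,
  Key: folderName + "/" + fileName
}).promise();

// A multipart ETag looks like "<md5>-<parts>", e.g. "7e35f58f134d8914604c9fc6c35b2db7-9".
const etag = head.ETag.replace(/"/g, '');
const parts = etag.includes('-') ? Number(etag.split('-')[1]) : 1;
console.log(`ETag: ${head.ETag}, parts: ${parts}`);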

Debugging "read time out" for AWS lambda function in Alexa Skill

I am using an AWS Lambda function to serve my Node.js codebase for an Alexa Skill.
The skill makes external API calls to a custom API as well as the Amazon GameOn API; it also uses URLs which serve audio files and images from an S3 bucket.
The issue I am having is intermittent, and is affecting about 20% of users. At random points of the skill, the user request will produce an invalid response from the skill, with the following error:
{
  "Request": {
    "type": "System.ExceptionEncountered",
    "requestId": "amzn1.echo-api.request.ab35c3f1-b8e6-4478-945c-16f644359556",
    "timestamp": "2020-05-16T19:54:24Z",
    "locale": "en-US",
    "error": {
      "type": "INVALID_RESPONSE",
      "message": "Read timed out for requestId amzn1.echo-api.request.323b1fbb-b4e8-4cdf-8f31-30c9b67e4a5d"
    },
    "cause": {
      "requestId": "amzn1.echo-api.request.323b1fbb-b4e8-4cdf-8f31-30c9b67e4a5d"
    }
  },
I have looked up this issue and believe it's something wrong with the Lambda function configuration, but I can't figure out where!
I've tried increasing the Memory the function uses (now 256MB).
It should be noted that the function timeout is 8000ms, since this is the max time you are allowed for an Alexa response.
What causes this Read timeout issue, and what measures can I take to debug and resolve it?
Take a look at AWS X-Ray. By using it with your Lambda you should be able to identify the source of these timeouts.
The X-Ray documentation should help you understand how to apply it.
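As a rough sketch of wiring that up in the Node.js handler (assuming the aws-xray-sdk package, which is not mentioned in the original answer):

// Wrap the AWS SDK and outbound HTTPS calls so X-Ray records subsegments
// for S3 and the external APIs (custom API, GameOn). Sketch only.
const AWSXRay = require('aws-xray-sdk');
const AWS = AWSXRay.captureAWS(require('aws-sdk'));
AWSXRay.captureHTTPsGlobal(require('https'));
const https = require('https');

// Active tracing must also be enabled on the Lambda function's configuration
// so the traces show up in the X-Ray console.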
We found that this was occurring when the skill was trying to access a resource stored on our Azure website.
The CPU and memory allocation for the Azure site was too low, and it would fail when facing a large number of requests.
To fix it, we upgraded the plan the App Service was running on.

AWS API Gateway base64Decode produces garbled binary?

I'm trying to return a 1px gif from an AWS API Gateway method.
Since binary data is now supported, I return an image/gif using the following 'Integration Response' mapping:
$util.base64Decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")
However, when I look at this in Chrome, the binary being returned is garbled and does not match the original GIF bytes.
Could anyone help me understand why this is garbled and the wrong length? Or what I could do to return the correct binary? Is there some other way I could always return this 1px gif without using the base64Decode function?
Many thanks in advance, this has been causing me a lot of pain!
EDIT
This one gets stranger. It looks like the issue is not with base64Decode, but with the general handling of binary. I added a Lambda backend (previously I was using Firehose) following this blog post and this Stack Overflow question. I set images as binaryMediaType as per this documentation page.
This has let me pass the following image/bmp pixel from Lambda through the Gateway API, and it works correctly:
exports.handler = function(event, context) {
  var imageHex = "\x42\x4d\x3c\x00\x00\x00\x00\x00\x00\x00\x36\x00\x00\x00\x28\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x01\x00\x18\x00\x00\x00\x00\x00\x06\x00\x00\x00\x27\x00\x00\x00\x27\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\x00\x00";
  context.done(null, { "body": imageHex });
};
However the following images representing an image/png or a image/gif get garbled when passed through:
exports.handler = function(event, context) {
  //var imageHex = "\x47\x49\x46\x38\x39\x61\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff\x21\xf9\x04\x01\x00\x00\x00\x00\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x01\x44\x00\x3b";
  //var imageHex = "\x47\x49\x46\x38\x39\x61\x01\x00\x01\x00\x80\x00\x00\xff\xff\xff\x00\x00\x00\x21\xf9\x04\x01\x00\x00\x00\x00\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02\x44\x01\x00\x3b";
  var imageHex = "\x47\x49\x46\x38\x39\x61\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x21\xf9\x04\x01\x00\x00\x00\x00\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02\x44\x01\x00\x3b\x0a";
  context.done(null, { "body": imageHex });
};
This seems to be the same issue as another Stack Overflow question, but I was hoping this would be fixed with the Gateway API binary support. Unfortunately image/bmp doesn't work for my use case as it can't be transparent...
In case it helps anyone, this has been a good tool for converting between base64 and hex.
To anyone else having problems with this: I was also banging my head against the wall trying to retrieve a binary image over an API Gateway proxy integration from Lambda, but then I noticed that it says right there in the Binary Support section of the Lambda console:
API Gateway will look at the Content-Type and Accept HTTP headers to decide how to handle the body.
So I added Accept: image/png to the request headers and it worked. Oh the joy, and joyness!
No need to manually change content handling to CONVERT_TO_BINARY or muck about with the cli. Of course this rules out using, for example, <img src= directly (can't set headers).
So, in order to get a binary file over API Gateway from lambda with proxy integration:
List all supported binary content types in the lambda console (and deploy)
The request Accept header must include the Content-Type returned from the Lambda function
The returned body must be base64 encoded
The result object must also have the isBase64Encoded property set to true
Code:
callback(null, {
statusCode: 200,
headers: { 'Content-Type': 'image/png' },
body: buffer.toString('base64'),
isBase64Encoded: true
}
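On the client side, the request that makes API Gateway return the raw bytes might look like this (a rough sketch; the endpoint URL is a placeholder and fetch assumes a browser or Node 18+):

// The Accept header must match one of the configured binary media types,
// otherwise API Gateway returns the base64 text instead of the binary blob.
const res = await fetch('https://example.execute-api.us-east-1.amazonaws.com/prod/image', {
  headers: { Accept: 'image/png' }
});
const blob = await res.blob();   // the decoded PNG bytes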
It looks like this was a known issue previously:
https://forums.aws.amazon.com/thread.jspa?messageID=668306&#668306
But it should be possible now that they've added support for binary data:
http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings.html
It looks like this is the bit we need: "Set the contentHandling property of the IntegrationResponse resource to CONVERT_TO_BINARY to have the response payload converted from a Base64-encoded string to its binary blob". Then we shouldn't need the base64Decode() function.
Working on a test now to see if this works.
EDIT: I was finally able to get this working. You can see the binary image here:
https://chtskiuz10.execute-api.us-east-1.amazonaws.com/prod/rest/image
I updated the method response, and I updated the integration response to include a hard-coded image/png content type.
The last step was tricky: setting the contentHandling property to "CONVERT_TO_BINARY". I couldn't figure out how to do that in the AWS console, so I had to use the CLI to accomplish it:
aws apigateway update-integration-response \
--profile davemaple \
--rest-api-id chtskiuzxx \
--resource-id ki1lxx \
--http-method GET \
--status-code 200 \
--patch-operations '[{"op" : "replace", "path" : "/contentHandling", "value" : "CONVERT_TO_BINARY"}]'
I hope this helps.
Check out this answer. It helped me with exposing a PDF file for download through a GET request without any additional headers.

SOAP vs REST in a non-CRUD and stateless environment

Pretend I am building a simple image-processing API. This API is completely stateless and only needs three items in the request: an image, the image format, and an authentication token.
Upon receipt of the image, the server merely processes the image and returns a set of results.
Ex: I see five faces in this image.
Would this still work with a REST-based API? Should this be used with a REST-based API?
Most of the examples I have seen comparing REST and SOAP have been purely CRUD-based, so I am slightly confused about how they compare in a scenario such as this.
Any help would be greatly appreciated; although this question seems quite broad, I have yet to find a good answer explaining this.
REST is not about CRUD. It is about resources. So you should ask yourself:
What are my resources?
One answer could be:
An image processing job is a resource.
Create a new image processing job
To create a new image processing job, make an HTTP POST to a collection of jobs.
POST /jobs/facefinderjobs
Content-Type: image/jpeg
The body of this POST would be the image.
The server would respond:
201 Created
Location: /jobs/facefinderjobs/03125EDA-5044-11E4-98C5-26218ABEA664
Here 03125EDA-5044-11E4-98C5-26218ABEA664 is the ID of the job assigned by the server.
Retrieve the status of the job
The client now wants to get the status of the job:
GET /jobs/facefinderjobs/03125EDA-5044-11E4-98C5-26218ABEA664
If the job is not finished, the server could respond:
200 OK
Content-Type: application/json
{
"id": "03125EDA-5044-11E4-98C5-26218ABEA664",
"status": "processing"
}
Later, the client asks again:
GET /jobs/facefinderjobs/03125EDA-5044-11E4-98C5-26218ABEA664
Now the job is finished and the response from the server is:
200 OK
Content-Type: application/json
{
"id": "03125EDA-5044-11E4-98C5-26218ABEA664",
"status": "finished",
"faces": 5
}
The client would parse the JSON and check the status field. If it is finished, it can read the number of faces found from the faces field.
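To make the flow concrete, a minimal server-side sketch of this jobs resource (hypothetical Node.js/Express code, not part of the original answer) could look like:

const express = require('express');
const crypto = require('crypto');
const app = express();

// In-memory job store; a real service would persist jobs and process images asynchronously.
const jobs = {};

// Create a new image processing job: the POST body is the raw image.
app.post('/jobs/facefinderjobs', express.raw({ type: 'image/*', limit: '10mb' }), (req, res) => {
  const id = crypto.randomUUID();
  jobs[id] = { id, status: 'processing' };

  // Stand-in for the actual image processing; record the result when done.
  setTimeout(() => { jobs[id] = { id, status: 'finished', faces: 5 }; }, 5000);

  res.status(201).location(`/jobs/facefinderjobs/${id}`).end();
});

// Retrieve the status (and, once finished, the result) of a job.
app.get('/jobs/facefinderjobs/:id', (req, res) => {
  const job = jobs[req.params.id];
  if (!job) return res.sendStatus(404);
  res.json(job);
});

app.listen(3000);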