How to catch failed S3 copyObject with 200 OK result in AWSJavaScriptSDK - amazon-web-services

Documentation on the S3.copyObject method in AWSJavaScriptSDK indicates the following:
A copy request might return an error when Amazon S3 receives the copy
request or while Amazon S3 is copying the files. If the error occurs
before the copy operation starts, you receive a standard Amazon S3
error. If the error occurs during the copy operation, the error
response is embedded in the 200 OK response. This means that a 200 OK
response can contain either a success or an error. Design your
application to parse the contents of the response and handle it
appropriately.
However, no example is given of what that failure might look like, and the types associated with copyObject in the aws-sdk Node library (i.e. CopyObjectResult and S3.Types.CopyObjectOutput) suggest that there isn't a place for a failed copy to be reported in a success response.
Does anyone know how to interpret this documentation? What is an example of a copy operation failing while returning a 200 OK to copyObject, and how would the caller know?

The SDK itself massages 200 status OK responses into errors for specific API calls, including copyObject.
As of this commit, the operations completeMultipartUpload, copyObject, and uploadPartCopy are flagged as able to return a status code 200 that is actually an error, and there is a handler to coerce those responses into error responses.

Related

Media Tailor ad returning 504 error in AWS

I'm using AWS Media Tailor to test an ad inserting demo. The demo page is this one: https://github.com/aws-samples/aws-media-services-simple-vod-workflow/tree/master/12-AdMarkerInsertion.
When I place my manifest into a TheoPlayer I always get a 504 error. My manifest page is: https://ebf348c58b834d189af82777f4f742a6.mediatailor.us-west-2.amazonaws.com/v1/master/3c879a81c14534e13d0b39aac4479d6d57e7c462/MyTestCampaign/llama.m3u8.
I have also tried with: https://ebf348c58b834d189af82777f4f742a6.mediatailor.us-west-2.amazonaws.com/v1/master/3c879a81c14534e13d0b39aac4479d6d57e7c462/MyTestCampaign/llama_with_slates.m3u8.
The specific error is:
{"message":"failed to generate manifest: Unable to obtain template playlist. sessionId:[c915d529-3527-4e37-89e0-087e393e75de]"}
I have read about this error: https://docs.aws.amazon.com/mediatailor/latest/ug/playback-errors-examples.html
But I don't know how to fix it.
Maybe I did something wrong or do I need a quote in AWS?
Any idea?
Thanks for the inquiry!
The following example shows the result when a timeout occurs between AWS Elemental MediaTailor and either the ad decision server (ADS) or the origin server.
An HTTP 504 error is known as a Gateway Timeout, meaning that a resource was unresponsive and prevented the request from completing successfully. In this case, since MediaTailor is returning an HTTP 504, either the ADS or the origin failed to respond within the timeout period.
To troubleshoot this, you will need to determine which dependency is failing to respond to MediaTailor and correct it. Typically the issue is the ADS failing to respond to a VAST request performed by MediaTailor, which you can confirm by reviewing your CloudWatch logs.
https://docs.aws.amazon.com/mediatailor/latest/ug/monitor-cloudwatch-ads-logs.html
Make sure that your ADS follows the guidelines listed below for integrating with MediaTailor.
https://docs.aws.amazon.com/mediatailor/latest/ug/vast-integration.html

"LAMBDA_RUNTIME" Error on high-volume Lambda Function

I'm currently using a Lambda function written in JavaScript that is set up with an SQS event source to automatically pull messages from an SQS queue and do some basic processing on the message contents. I cannot show the code, but the summary of the Lambda function's execution is basically:
For each message in the batch it receives as part of the event:
It parses the body, which is a JSON string, into a Javascript object.
It reads the S3 object referenced in the parsed message using getObject.
It puts a record into a DynamoDB table using put.
If there were no errors, it deletes the individual SQS message that was processed from the Queue using deleteMessage.
This SQS queue is high-volume and receives messages in bulk, regularly building up a backlog of millions of messages. The Lambda is normally able to scale to process hundreds of thousands of messages concurrently. This solution has worked well for me with other applications in the past, but I'm now encountering the following intermittent error that reliably begins to appear as the Lambda scales up:
[ERROR] [#############] LAMBDA_RUNTIME Failed to post handler success response. Http response code: 400.
I've been unable to find any information anywhere about what this error means and what causes it. There appears to be no discernible pattern as to which executions encounter it. The function is usually able to run for a brief period without encountering the error and scale to expected levels. But then the error starts to appear quite suddenly and completely destroys the Lambda throughput by forcing it to auto-scale down:
Does anyone know what this "LAMBDA_RUNTIME" error means and what might cause it? My Lambda Function runtime is Node v12.
Your function is being invoked asynchronously, so when it finishes it signals the caller whether it was successful.
You should have an error some milliseconds earlier, probably an unhandled exception not being logged. If that's the case, your function ends without knowing about the exception and tries to post a success response.
I have the same error, except that I get:
[ERROR] [1638918279694] LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413.
I went to the Lambda function in the AWS console and ran the test with a custom event I built, and the error I got there was:
{
"errorMessage": "Response payload size exceeded maximum allowed payload size (6291556 bytes).",
"errorType": "Function.ResponseSizeTooLarge"
}
So this is the actual error, which CloudWatch doesn't show but the test section of the Lambda function console does.
I think I'll have to return info to an S3 file or something, but that's another matter.
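One way to guard against this is to measure the serialized response before returning it and fall back to a pointer when it is too large. The sketch below is in Python rather than Node and only shows the size check. The 6 MiB threshold is an assumption inferred from the error message above, and store_large_result is a hypothetical callback (for example, one that writes the payload to S3 and returns its location).

```python
import json

# Assumed limit: 6 MiB (6291456 bytes), just under the 6291556 bytes
# reported in the error message above.
MAX_RESPONSE_BYTES = 6 * 1024 * 1024


def build_response(result, store_large_result):
    """Return `result` inline if it is small enough, else a reference to it.

    `store_large_result` is a hypothetical callable that persists the
    serialized payload somewhere (e.g. S3) and returns a locator string.
    """
    response = {"inline": True, "data": result}
    # Measure the response exactly as it would be serialized.
    if len(json.dumps(response).encode("utf-8")) <= MAX_RESPONSE_BYTES:
        return response
    location = store_large_result(json.dumps(result).encode("utf-8"))
    return {"inline": False, "location": location}
```

The caller then checks the inline flag and fetches the stored payload only when needed.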

Google pubsub 88% of requests come back as 503

Why do Pub/Sub requests seem to trigger such a high number of 503 errors? Is this common? It seems other people see something similar, but the majority of my requests end up that way.
Similar to
Google Pubsub: UNAVAILABLE: The service was unable to fulfill your request
Catch error code from GCP pub/sub
This is expected behavior. Streaming pull, which is used by the client libraries, creates a bidirectional stream for receiving messages and sending back acknowledgements. These streams stay open for long periods of time and don't close with a successful response code when messages are received. Instead, they terminate with an error condition when the stream disconnects, perhaps due to a restart on the part of the server receiving the request or because of a brief network blip. Therefore, even if you are receiving messages successfully, you'll still see error response codes for all of the streams themselves. The new streaming pull docs address this question directly.

How can I distinguish between different BadRequest errors with Amazon S3?

Amazon S3 has a large number of reasons why it will return an HTTP 400 Bad Request error. Most relevant is the fact that some of these errors are from the unreliability of the internet, such as a request timeout. Another reason it might be returned is if the bucket or key name is invalid.
I am attempting to upload files to S3 with customer-supplied key names. I need to be able to distinguish between a transient 400 error such as a timeout and a bad key/bucket name error that will not be transient. A transient error indicates we should retry that upload, while a non-transient error means we should stop trying to upload that file.
However, I do not know how to distinguish between these two errors! If it matters, I am attempting to use the JetS3t API to perform these uploads. How can I distinguish between a bad key/bucket name error and anything else with a 400 error code?
Read the response body.
From the page you cited:
The body o[f] the response also contains information about the error.
Parse the response body that accompanies the HTTP error code. An explanation of the error is almost always spelled out in XML in the response body.
Example nonsense request (nothing edited here, this is exactly what I used for a GET request to generate this error):
http://example-bucket.s3.amazonaws.com/?AWSAccessKeyId=AKIAEXAMPLEEXAMPLE&Signature=bogus&Expires=1500000000
Response:
<Error>
<Code>InvalidAccessKeyId</Code>
<Message>
The AWS Access Key Id you provided does not exist in our records.
</Message>
<AWSAccessKeyId>AKIAEXAMPLEEXAMPLE</AWSAccessKeyId>
<RequestId>...</RequestId>
<HostId>...</HostId>
</Error>
You'll find a pretty close correlation between this content and the list of possible errors.
Now, technically, that's a 403, not a 400; it was just the first idea I came up with for an easily handcrafted nonsense request to generate an error. Any S3 error should generate a comparable response.
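Building on that, a retry decision can key off the Code element of the body rather than the HTTP status. This is a minimal Python sketch (the question uses JetS3t, so treat it as language-neutral logic). The two code sets are illustrative assumptions taken from S3's published error list, not an exhaustive classification, and classify_s3_error is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed, non-exhaustive groupings of S3 error codes returned with HTTP 400.
TRANSIENT_CODES = {"RequestTimeout"}
PERMANENT_CODES = {"InvalidBucketName", "KeyTooLongError", "InvalidURI"}


def classify_s3_error(body):
    """Return (code, decision); decision is 'retry', 'give_up' or 'unknown'."""
    root = ET.fromstring(body)
    code = (root.findtext("Code") or "").strip()
    if code in TRANSIENT_CODES:
        return code, "retry"
    if code in PERMANENT_CODES:
        return code, "give_up"
    # Anything not listed is left for the caller to decide.
    return code, "unknown"
```

Returning "unknown" for unlisted codes keeps the sketch from silently retrying, or silently dropping, errors it was never taught about.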

What are REST API error handling best practices? [closed]

I'm looking for guidance on good practices when it comes to returning errors from a REST API. I'm working on a new API so I can take it any direction right now. My content type is XML at the moment, but I plan to support JSON in future.
I am now adding some error cases, for instance when a client attempts to add a new resource but has exceeded his storage quota. I am already handling certain error cases with HTTP status codes (401 for authentication, 403 for authorization and 404 for plain bad request URIs). I looked over the blessed HTTP error codes, but none of the 400-417 range seems right for reporting application-specific errors. So at first I was tempted to return my application error with 200 OK and a specific XML payload (i.e. Pay us more and you'll get the storage you need!), but I stopped to think about it and it seems too soapy (/shrug in horror). Besides, it feels like I'm splitting the error responses into distinct cases, as some are HTTP status code driven and others are content driven.
So what are the industry recommendations? Good practices (please explain why!) and also, from a client pov, what kind of error handling in the REST API makes life easier for the client code?
A great resource to pick the correct HTTP error code for your API:
http://www.codetinkerer.com/2015/12/04/choosing-an-http-status-code.html
The article walks through the choice under these headings: Where to start, 2XX/3XX, 4XX, 5XX.
So at first I was tempted to return my application error with 200 OK and a specific XML payload (i.e. Pay us more and you'll get the storage you need!), but I stopped to think about it and it seems too soapy (/shrug in horror).
I wouldn't return a 200 unless there really was nothing wrong with the request. From RFC2616, 200 means "the request has succeeded."
If the client's storage quota has been exceeded (for whatever reason), I'd return a 403 (Forbidden):
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.
This tells the client that the request was OK, but that it failed (something a 200 doesn't do). This also gives you the opportunity to explain the problem (and its solution) in the response body.
What other specific error conditions did you have in mind?
The main choice is whether you want to treat the HTTP status code as part of your REST API or not.
Both ways work fine. I agree that, strictly speaking, one of the ideas of REST is that you should use the HTTP Status code as a part of your API (return 200 or 201 for a successful operation and a 4xx or 5xx depending on various error cases.) However, there are no REST police. You can do what you want. I have seen far more egregious non-REST APIs being called "RESTful."
At this point (August, 2015) I do recommend that you use the HTTP Status code as part of your API. It is now much easier to see the return code when using frameworks than it was in the past. In particular, it is now easier to see the non-200 return case and the body of non-200 responses than it was in the past.
The HTTP Status code is part of your api
You will need to carefully pick 4xx codes that fit your error conditions. You can include a rest, xml, or plaintext message as the payload that includes a sub-code and a descriptive comment.
The clients will need to use a software framework that enables them to get at the HTTP-level status code. Usually do-able, not always straight-forward.
The clients will have to distinguish between HTTP status codes that indicate a communications error and your own status codes that indicate an application-level issue.
The HTTP Status code is NOT part of your api
The HTTP status code will always be 200 if your app received the request and then responded (both success and error cases)
ALL of your responses should include "envelope" or "header" information. Typically something like:
envelope_ver: 1.0
status: # use any codes you like. Reserve a code for success.
msg: "ok" # A human string that reflects the code. Useful for debugging.
data: ... # The data of the response, if any.
This method can be easier for clients since the status for the response is always in the same place (no sub-codes needed), no limits on the codes, no need to fetch the HTTP-level status-code.
Here's a post with a similar idea: http://yuiblog.com/blog/2008/10/15/datatable-260-part-one/
Main issues:
Be sure to include version numbers so you can later change the semantics of the api if needed.
Document...
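The envelope approach described above could be sketched as follows in Python. The field names come from the example envelope. The numeric codes (0 for success, 42 for a quota error) and the helper names make_envelope and handle_response are hypothetical choices for illustration.

```python
import json

ENVELOPE_VER = "1.0"
STATUS_OK = 0                # reserved code for success (assumed value)
STATUS_QUOTA_EXCEEDED = 42   # hypothetical application-level error code


def make_envelope(status, msg, data=None):
    """Server side: wrap a result or an error in the same envelope."""
    return {"envelope_ver": ENVELOPE_VER, "status": status, "msg": msg, "data": data}


def handle_response(raw):
    """Client side: return the data on success, raise on an application error."""
    envelope = json.loads(raw)
    if envelope["status"] != STATUS_OK:
        raise RuntimeError(f"application error {envelope['status']}: {envelope['msg']}")
    return envelope["data"]
```

The client looks in one place for the outcome regardless of the HTTP status, which is the convenience this option trades against standard HTTP semantics.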
Remember there are more status codes than those defined in the HTTP/1.1 RFCs; the IANA registry is at http://www.iana.org/assignments/http-status-codes. For the case you mentioned, status code 507 sounds right.
As others have pointed out, having a response entity in an error response is perfectly allowable.
Do remember that 5xx errors are server-side, i.e. the client cannot change anything in its request to make the request pass. If the client's quota is exceeded, that's definitely not a server error, so 5xx should be avoided.
I know this is extremely late to the party, but now, in year 2013, we have a few media types to cover error handling in a common distributed (RESTful) fashion. See "vnd.error", application/vnd.error+json (https://github.com/blongden/vnd.error) and "Problem Details for HTTP APIs", application/problem+json (https://datatracker.ietf.org/doc/html/draft-nottingham-http-problem-05).
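As a rough illustration of the second media type, a quota error could be expressed as a problem document like the one built below. This is a Python sketch under stated assumptions: the member names type, title, status, detail and instance are those of the Problem Details format (later published as RFC 7807). The type URI, the used_bytes and limit_bytes extension members, the quota_problem helper, and the choice of 403 (taken from one of the other answers here) are illustrative.

```python
import json

PROBLEM_CONTENT_TYPE = "application/problem+json"


def quota_problem(used_bytes, limit_bytes, instance="/uploads/123"):
    """Build (status, headers, body) for a quota-exceeded error response."""
    problem = {
        "type": "https://example.com/problems/quota-exceeded",  # hypothetical URI
        "title": "Storage quota exceeded",
        "status": 403,
        "detail": f"Using {used_bytes} of {limit_bytes} bytes.",
        "instance": instance,        # hypothetical resource path
        # Extension members carrying machine-readable details (assumed names).
        "used_bytes": used_bytes,
        "limit_bytes": limit_bytes,
    }
    return 403, {"Content-Type": PROBLEM_CONTENT_TYPE}, json.dumps(problem)
```

The HTTP status still carries the coarse outcome, while the body gives clients a stable type to branch on and a human-readable explanation.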
There are two sorts of errors. Application errors and HTTP errors. The HTTP errors are just to let your AJAX handler know that things went fine and should not be used for anything else.
5xx Server Error
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported
506 Variant Also Negotiates (RFC 2295)
507 Insufficient Storage (WebDAV) (RFC 4918)
509 Bandwidth Limit Exceeded (Apache bw/limited extension)
510 Not Extended (RFC 2774)
2xx Success
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information (since HTTP/1.1)
204 No Content
205 Reset Content
206 Partial Content
207 Multi-Status (WebDAV)
However, how you design your application errors is really up to you. Stack Overflow for example sends out an object with response, data and message properties. The response I believe contains true or false to indicate if the operation was successful (usually for write operations). The data contains the payload (usually for read operations) and the message contains any additional metadata or useful messages (such as error messages when the response is false).
Agreed. The basic philosophy of REST is to use the web infrastructure. The HTTP Status codes are the messaging framework that allows parties to communicate with each other without increasing the HTTP payload. They are already established universal codes conveying the status of response, and therefore, to be truly RESTful, the applications must use this framework to communicate the response status.
Sending an error response in a HTTP 200 envelope is misleading, and forces the client (api consumer) to parse the message, most likely in a non-standard, or proprietary way. This is also not efficient - you will force your clients to parse the HTTP payload every single time to understand the "real" response status. This increases processing, adds latency, and creates an environment for the client to make mistakes.
Modeling your api on existing 'best practices' might be the way to go.
For example, here is how Twitter handles error codes
https://developer.twitter.com/en/docs/basics/response-codes
Please stick to the semantics of the protocol. Use 2xx for successful responses and 4xx, 5xx for error responses, be it your business exceptions or other. Had using 2xx for any response been the intended use case in the protocol, they would not have other status codes in the first place.
Don't forget the 5xx errors as well for application errors.
In this case what about 409 (Conflict)? This assumes that the user can fix the problem by deleting stored resources.
Otherwise 507 (not entirely standard) may also work. I wouldn't use 200 unless you use 200 for errors in general.
If the client quota is exceeded, it is not a server error; avoid 5xx in this instance.