Amazon retries upload with large files

I have a problem: I'm using the aws-sdk in the browser to upload videos from my web app to Amazon S3. It works fine with smaller files (less than 100MB), but with large files (500MB, for example) Amazon retries the upload.
For example, when the upload is at 28% it jumps back to 1%. I don't know why; I attached an event listener to the upload, but it doesn't report any error, the progress just goes back to 1%.
This is literally my code:
const params = {
  Bucket: process.env.VUE_APP_AWS_BUCKET_NAME, // It comes from my dotenv file
  Key: videoPath, // It comes from an external variable
  ACL: 'public-read',
  ContentType: this.type, // It comes from my class, where I have the video type
  Body: VideoObject // <- It comes from the input file
};
s3.putObject(params, function (err) {
  if (err)
    console.log(err);
}).on('httpUploadProgress', progress => {
  console.log(progress);
  console.log(progress.loaded + " - " + progress.total);
  this.progress = parseInt((progress.loaded * 100) / progress.total);
});
I would really like to give more info, but that's all I have. I don't know why S3 retries the upload without any error (I also don't know how to catch S3 errors...).
My internet connection is fine; I'm on my business connection and it works well otherwise. This issue happens on all my computers.

For large objects you may want to try the upload function, which supports progress tracking and multipart uploading so parts can be uploaded in parallel. Also, I didn't see a content length set in your example; upload accepts a stream without a content length necessarily being defined.
An example: https://aws.amazon.com/blogs/developer/announcing-the-amazon-s3-managed-uploader-in-the-aws-sdk-for-javascript/
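Along those lines, here is a minimal sketch using the managed uploader, reusing the params object from the question and assuming the AWS SDK for JavaScript v2 (the partSize and queueSize values are only illustrative):

// The managed uploader splits the body into parts and uploads them in parallel.
const upload = s3.upload(params, {
  partSize: 10 * 1024 * 1024, // 10 MB parts (minimum part size is 5 MB)
  queueSize: 4                // number of parts uploaded concurrently
});

upload.on('httpUploadProgress', progress => {
  // With streams of unknown length, progress.total may be undefined.
  if (progress.total) {
    this.progress = Math.round((progress.loaded * 100) / progress.total);
  }
});

upload.send((err, data) => {
  if (err) {
    console.log(err);
  } else {
    console.log('Uploaded to', data.Location);
  }
});

Because each part is retried individually on failure, a transient error no longer restarts the whole 500MB transfer from scratch.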

You need to use a multipart upload for large files.

Related

How to stream microphone audio from the browser to S3

I want to stream the microphone audio from the web browser to AWS S3.
I got it working with:
this.recorder = new window.MediaRecorder(...);
this.recorder.addEventListener('dataavailable', (e) => {
  this.chunks.push(e.data);
});
Then, when the user clicks stop, I upload the chunks as new Blob(this.chunks, { type: 'audio/wav' }) in multiple parts to AWS S3.
The problem is that if the recording is 2-3 hours long, the upload can take exceptionally long, and the user might close the browser before the recording finishes uploading.
Is there a way to stream the audio directly to S3 while the recording is still in progress?
Things I tried, but couldn't get a working example of:
Kinesis Video Streams: it looks like it's only for real-time streaming between multiple clients, and I would have to write my own consumer that then saves the stream to S3.
I thought about using Kinesis Data Firehose, but couldn't find any client-side data producer for the browser.
I even looked into AWS Lex and AWS IVS, but I think they would be over-engineering for my use case.
Any help will be appreciated.
You can set the timeslice parameter when calling start() on the MediaRecorder. The MediaRecorder will then emit chunks which roughly match the length of the timeslice parameter.
You could upload those chunks using S3's multipart upload feature as you already mentioned.
Please note that you need a library like extendable-media-recorder if you want to record a WAV file since no browser supports that out of the box.
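A rough sketch of how those pieces could fit together, assuming the AWS SDK for JavaScript v2 in the browser with an already-configured s3 client and credentials that allow multipart uploads (the timeslice and the 5 MB threshold are only illustrative):

// Feed MediaRecorder chunks into an S3 multipart upload while recording runs.
async function streamRecordingToS3(recorder, Bucket, Key) {
  const { UploadId } = await s3.createMultipartUpload({ Bucket, Key }).promise();
  const parts = [];
  const pending = [];
  let buffered = [];
  let bufferedBytes = 0;
  let nextPartNumber = 1;

  function flushBuffer() {
    if (bufferedBytes === 0) return;
    const PartNumber = nextPartNumber++;
    const Body = new Blob(buffered);
    buffered = [];
    bufferedBytes = 0;
    pending.push(
      s3.uploadPart({ Bucket, Key, UploadId, PartNumber, Body })
        .promise()
        .then(({ ETag }) => parts.push({ ETag, PartNumber }))
    );
  }

  recorder.addEventListener('dataavailable', (e) => {
    buffered.push(e.data);
    bufferedBytes += e.data.size;
    // Every part except the last must be at least 5 MB.
    if (bufferedBytes >= 5 * 1024 * 1024) flushBuffer();
  });

  recorder.addEventListener('stop', async () => {
    flushBuffer(); // the final part may be smaller than 5 MB
    await Promise.all(pending);
    parts.sort((a, b) => a.PartNumber - b.PartNumber);
    await s3.completeMultipartUpload({
      Bucket, Key, UploadId,
      MultipartUpload: { Parts: parts }
    }).promise();
  });

  // Emit a 'dataavailable' chunk roughly every 10 seconds.
  recorder.start(10000);
}

Error handling (retrying a failed part, or calling abortMultipartUpload if the user bails out) is left out to keep the sketch short.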

File gets corrupted and size increases when uploading after publishing code to an AWS Lambda function with .NET Core 3.1

I'm uploading an mp3 file to an AWS S3 bucket. When I upload from my local system it works fine and the file is uploaded with the same size, but after publishing to AWS, when I try to upload a file it gets corrupted and its size also increases.
Here is my API:
I'm just returning the file from here.
[HttpPost, DisableRequestSizeLimit]
[Route("SaveSoundRecordingDataFile")]
public async Task<IActionResult> SaveSoundRecordingDataFile()
{
    try
    {
        var file = Request.Form.Files[0];
        return Ok(file);
    }
    catch (Exception ex)
    {
        return StatusCode(500, $"Internal server error: {ex}");
    }
}
Check the responses below. I'm uploading a 1MB file, and you can see how the file size increases after publishing the code to AWS Lambda.
Response from my local system: (screenshot)
Response after publishing to AWS Lambda: (screenshot)
I have tried many approaches and gone through many articles and forums, but couldn't find any explanation for why this is happening.
Thank you in advance.

The Content-MD5 you specified is invalid for multi-part uploads

I am using the aws-sdk for Node.js and have a working upload.
const checksum = await this.getChecksum(path);
const payload: S3.PutObjectRequest = {
  Bucket: bucket,
  Key: key,
  Body: fs.createReadStream(path),
  ContentMD5: checksum,
};
return this.s3.upload(payload).promise();
This piece of code works great for small files and takes advantage of ContentMD5 which automatically verifies the file integrity.
Content-MD5
The base64-encoded 128-bit MD5 digest of the message (without the headers) according to RFC 1864. This header can be used as a message integrity check to verify that the data is the same data that was originally sent. Although it is optional, we recommend using the Content-MD5 mechanism as an end-to-end integrity check. For more information about REST request authentication, see REST Authentication.
https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html
However it doesn't work for multipart uploads.
The Content-MD5 you specified is invalid for multi-part uploads.
That makes sense, because we send the file chunk by chunk, but then I am wondering: how am I supposed to use this feature with multipart uploads?
I faced this too and did quite a bit of testing. I finally found an answer and verified it in my own way; if you know a much better way, please let me know here.
That being said, here's how I solved the issue.
When you create the S3 client, create it as below.
const s3 = new AWS.S3({ computeChecksums: true });
Then you can define the s3 upload parameters like below
var fileData = Buffer.from(fileContentString, 'binary');
var s3params = {
  Bucket: bucketName,
  Key: folderName + "/" + fileName,
  ContentType: 'binary',
  Body: fileData
};
Then do the upload as below.
await s3.upload(s3params).promise()
  .then(data => {
    // use this log to verify whether the MD5 checksum was verified
    console.log(`File uploaded successfully ${JSON.stringify(data)}`);
    // handle successful upload
  })
  .catch(error => {
    // handle the error
  });
After the upload succeeds, here's how I verified that it works.
Check the object in S3. In the object details, under the Object overview section, check the ETag. You should see something like 7e35f58f134d8914604c9fc6c35b2db7-9. The number after the - indicates how many parts were uploaded; for bigger files it should be greater than 1, and in this case it is 9.
Check the log in the console (the log with the comment in the code above). You'll see something like this:
{
  "Location": "https://bucketname.s3.region.amazonaws.com/folderName/fileName",
  "Bucket": "bucketName",
  "Key": "folderName/fileName",
  "ETag": "\"7e35f58f134d8914604c9fc6c35b2db7-9\""
}
If you are debugging, you can further test it by printing the S3 upload request.
var uploadRequest = s3.upload(s3params);
console.log(`S3 Request ${JSON.stringify(uploadRequest)}`);
This will print the S3 client configuration. Check whether 'computeChecksums' is set to true.
I tried to verify with s3.putObject as well, but when I printed the request it didn't show the MD5 checksum in the headers as it is supposed to. It also gave me a cyclic-JSON stringify error when I tried to log the whole object the same way as in the third point for upload, so I printed httpRequest only.
var uploadRequest = s3.putObject(s3params);
console.log(`S3 Request ${JSON.stringify(uploadRequest.httpRequest)}`);
// The following gives the cyclic JSON stringify issue:
// console.log(`S3 Request ${JSON.stringify(uploadRequest)}`);
I'd appreciate it if someone could tell me how to do and verify this with putObject as well.
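As for the original question of supplying Content-MD5 yourself, another option is to drive the multipart API directly and pass a per-part ContentMD5 to uploadPart. A minimal sketch, assuming the AWS SDK v2 for Node.js (the 5 MB part size is only illustrative, and bucket, key, and path come from the caller):

const fs = require('fs');
const crypto = require('crypto');

// Upload a local file in parts, sending a base64 MD5 digest with each part.
async function uploadWithPartChecksums(s3, Bucket, Key, path, partSize = 5 * 1024 * 1024) {
  const { UploadId } = await s3.createMultipartUpload({ Bucket, Key }).promise();
  const size = fs.statSync(path).size;
  const fd = fs.openSync(path, 'r');
  const parts = [];

  try {
    for (let offset = 0, partNumber = 1; offset < size; offset += partSize, partNumber++) {
      const length = Math.min(partSize, size - offset);
      const buffer = Buffer.alloc(length);
      fs.readSync(fd, buffer, 0, length, offset);

      const { ETag } = await s3.uploadPart({
        Bucket, Key, UploadId,
        PartNumber: partNumber,
        Body: buffer,
        // S3 rejects the part if this digest doesn't match the received bytes.
        ContentMD5: crypto.createHash('md5').update(buffer).digest('base64')
      }).promise();
      parts.push({ ETag, PartNumber: partNumber });
    }

    return await s3.completeMultipartUpload({
      Bucket, Key, UploadId,
      MultipartUpload: { Parts: parts }
    }).promise();
  } catch (err) {
    await s3.abortMultipartUpload({ Bucket, Key, UploadId }).promise();
    throw err;
  } finally {
    fs.closeSync(fd);
  }
}

Note that the integrity check happens per part; the final ETag is still the hash-of-part-hashes form with the -N suffix described above, not the MD5 of the whole file.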

Random SSL handshake failure when pulling a file from an Amazon S3 bucket

I have a specific fetch request in my node app which just pulls a json file from my S3 bucket and stores the content within the state of my app.
The fetch request works 99% of the time, but for some reason about every 4 or 5 days I get a notification saying the app has crashed, and when I investigate, the reason is always this SSL handshake failure.
I am trying to figure out why this is happening, as well as a fix to prevent it in future cases.
The fetch request looks like the following and is called every time someone new visits the site. Once the request has been made and the JSON is in the app's state, the request is not called again.
function grabPreParsedContentFromS3 (prefix, callback) {
  fetch(`https://s3-ap-southeast-2.amazonaws.com/my-bucket/${prefix}.json`)
    .then(res => res.json())
    .then(res => callback(res[prefix]))
    .catch(e => console.log('Error fetching data from s3: ', e))
}
When this error happens, the .catch handler is triggered and logs the following error message:
Error fetching data from s3: {
FetchError: request to https://s3-ap-southeast-2.amazonaws.com/my-bucket/services.json failed,
reason: write EPROTO 139797093521280:error:1409E0E5:SSL routines:ssl3_write_bytes:ssl handshake failure:../deps/openssl/openssl/ssl/s3_pkt.c:659:
...
}
Has anyone encountered this kind of issue before, or has any idea why this might be happening? Currently I am wondering whether there is a limit to the number of S3 requests I can make at one time, which is causing it to fail, but the site isn't popular enough to be making a huge number of fetches either.

Optimize Image Storage / Get Requests from Amazon S3 with Picasso

I am creating a polling app, and each poll is going to have an associated image of the particular topic.
I am using Firebase to dynamically update polls as events occur. In Firebase, I am storing the relevant Image URL (referencing the URL in Amazon S3), and I am then using Picasso to load the image onto the client's device (see code below).
I have already noticed that I may be handling this data inefficiently, resulting in unnecessary GET requests to my files in Amazon S3. I was wondering what options I have with Picasso (i.e. I am thinking of some caching) to pull the images for each client just once and then store them locally (I do not want them to remain on the client's device permanently, however). My goal is to minimize costs without compromising performance. Below is my current code:
mPollsRef.child(mCurrentDateString).child(homePollFragmentIndexConvertedToFirebaseReferenceImmediatelyBelowDate).addListenerForSingleValueEvent(new ValueEventListener() {
    @Override
    public void onDataChange(DataSnapshot dataSnapshot) {
        int numberOfPollAnswersAtIndexBelowDate = (int) dataSnapshot.child("Poll_Answers").getChildrenCount();
        Log.e("TAG", "There are " + numberOfPollAnswersAtIndexBelowDate + " polls answers at index " + homePollFragmentIndexConvertedToFirebaseReferenceImmediatelyBelowDate);
        addRadioButtonsWithFirebaseAnswers(dataSnapshot, numberOfPollAnswersAtIndexBelowDate);

        String pollQuestion = dataSnapshot.child("Poll_Question").getValue().toString();
        mPollQuestion.setText(pollQuestion);

        // This is where the image "GET" from Amazon S3 using Picasso begins; the URL is stored
        // in Firebase and then passed to the Picasso.load method
        final String mImageURL = (String) dataSnapshot.child("Image").getValue();

        Picasso.with(getContext())
                .load(mImageURL)
                .fit()
                .into((ImageView) rootView.findViewById(R.id.poll_image));
    }

    @Override
    public void onCancelled(FirebaseError firebaseError) {
    }
});
First, the Picasso instance will hold a memory cache by default (or you can configure it).
Second, disk caching is done by the HTTP client. You should use OkHttp 3+ in 2016. By default, Picasso will make a reasonable default cache with OkHttp if you include OkHttp in your dependencies. You can also set the Downloader when creating the Picasso instance (make sure to set the cache on the client and use OkHttpDownloader or comparable).
Third, OkHttp will respect cache headers, so make sure the max-age and max-stale have appropriate values.
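A minimal sketch of that setup, assuming Picasso 2.5.x with OkHttp 3 via the picasso2-okhttp3-downloader artifact (the cache directory name and 50 MB size are only illustrative):

import java.io.File;

import android.content.Context;

import com.jakewharton.picasso.OkHttp3Downloader;
import com.squareup.picasso.Picasso;

import okhttp3.Cache;
import okhttp3.OkHttpClient;

public final class PicassoSetup {
    public static void init(Context context) {
        // The disk cache lives on the OkHttp client: cached images survive between
        // sessions but are evicted as needed, so they don't stay on the device forever.
        OkHttpClient client = new OkHttpClient.Builder()
                .cache(new Cache(new File(context.getCacheDir(), "picasso-cache"),
                        50L * 1024 * 1024)) // 50 MB
                .build();

        Picasso picasso = new Picasso.Builder(context)
                .downloader(new OkHttp3Downloader(client))
                .build();

        // Make Picasso.with(...) return this configured instance everywhere.
        Picasso.setSingletonInstance(picasso);
    }
}

With this in place, repeated loads of the same S3 URL hit the memory or disk cache instead of S3, provided the objects are served with cache headers (e.g. Cache-Control: max-age) that allow caching.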