I'm facing some timeout problems with a "middleware" service (from now on, file-service) developed with NestJS and AWS S3.
The file-service has two main purposes:
Act as an object storage abstraction layer, allowing the backend to upload files to different storage services, completely transparently to the user.
Receive signed tokens as a URL query parameter with file/object information, verify access to the resource, and stream it.
Upload works without problems.
Downloading small files works fine too.
But when I try to download large files (> 50MB), after a few seconds the connection breaks down because of a timeout and, as you can imagine, the download fails.
I've spent some days looking for solutions and reading docs.
Here are some of them:
About KeepAlive
Use an instance of S3 each time
But nothing works.
Here is the code:
Storage definition class
import * as AWS from 'aws-sdk';
import { Readable } from 'stream';

export class S3Storage implements StorageInterface {
  config: any;
  private s3: AWS.S3;

  constructor() {}

  async initialize(config: S3ConfigInterface): Promise<void> {
    this.config = config;
    // Apply the S3 configuration before creating the client so the
    // credentials and region are actually picked up by the instance.
    AWS.config.update({
      accessKeyId: config.accessKeyId,
      secretAccessKey: config.secretAccessKey,
      region: config.region
    });
    this.s3 = new AWS.S3();
  }

  async downloadFile(target: FileDto): Promise<Readable> {
    const params = {
      Bucket: this.config.Bucket,
      Key: target.sourcePath
    };
    return this.s3.getObject(params).createReadStream();
  }
}
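For reference, the kind of KeepAlive / timeout tweak suggested in those links looks roughly like this against the v2 SDK (a sketch; the timeout and agent values are just examples, and, if I read the docs right, the default request timeout is 120000 ms). It did not help in my case:

import * as https from 'https';
import * as AWS from 'aws-sdk';

// Sketch: raise the per-request timeout and reuse sockets via keep-alive.
const s3 = new AWS.S3({
  httpOptions: {
    timeout: 10 * 60 * 1000,                    // allow up to 10 minutes per request (example value)
    connectTimeout: 5000,                       // fail fast if the connection cannot be opened
    agent: new https.Agent({ keepAlive: true }) // reuse sockets between requests
  }
});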
Download method
private async downloadOne(target: FileDto, request, response) {
  const storage = await this.provider.getStorage(target.datasource);
  response.setHeader('Content-Type', mime.lookup(target.filename) || 'application/octet-stream');
  response.setHeader('Content-Disposition', `filename="${path.basename(target.filename)}";`);
  const stream = await storage.downloadFile(target);
  stream.pipe(response);
  // await download and exit
  await new Promise((resolve, reject) => {
    stream.on('end', () => {
      resolve(`${target.filename} has been downloaded`);
    });
    stream.on('error', () => {
      reject(`${target.filename} could not be downloaded`);
    });
  });
}
If anyone has faced the same (or a similar) issue, or has any idea (useful or not), I would appreciate any help or advice.
Thank you in advance.
I had the same issue, and here is how I solved it on my side: instead of processing the file by getting the stream directly from S3, I decided to download the content to a temp file (on the Amazon backend server for my API) and process the stream from that temp file instead. Afterwards, I removed the temp file so as not to fill the hard drive.
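A minimal sketch of that approach, assuming the v2 SDK and Node 15+ (stream/promises); the helper names downloadToTemp and sendFromTemp are just illustrative:

import * as AWS from 'aws-sdk';
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';
import { pipeline } from 'stream/promises';

// Sketch: buffer the S3 object into a temp file, then stream the temp file to
// the client, then delete the temp file.
async function downloadToTemp(s3: AWS.S3, params: AWS.S3.GetObjectRequest): Promise<string> {
  const tmpFile = path.join(os.tmpdir(), `dl-${Date.now()}`);
  await pipeline(s3.getObject(params).createReadStream(), fs.createWriteStream(tmpFile));
  return tmpFile;
}

async function sendFromTemp(tmpFile: string, response: NodeJS.WritableStream): Promise<void> {
  try {
    await pipeline(fs.createReadStream(tmpFile), response);
  } finally {
    await fs.promises.unlink(tmpFile).catch(() => undefined); // best-effort cleanup
  }
}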
Related
I'm trying to upload a large file using the AWS SDK for JavaScript v3 multipart upload.
Basically I'm using the Upload class from @aws-sdk/lib-storage to upload. But after some time, when the sessionToken expires, AWS starts throwing a 400 Bad Request error.
I'm calling uploadReq.abort() in the catch block, and I was expecting the code in the catch block to be executed immediately when AWS started throwing the 400 error, so that no further part upload requests would be triggered. Instead, it continues to trigger part upload requests, and the catch block is only called once all the subsequent part requests have finished and failed. Is there a way to tell the AWS S3 client not to trigger any more part upload requests when there is an error?
Here is the code I'm trying:
import {
  AbortMultipartUploadCommandOutput,
  CompleteMultipartUploadCommandOutput,
  S3,
  Tag
} from '@aws-sdk/client-s3';
import { Progress, Upload } from '@aws-sdk/lib-storage';
...
const s3Client = new S3({
  region: config.region,
  credentials: {
    accessKeyId: config.accessKeyId,
    secretAccessKey: config.secretAccessKey,
    sessionToken: config.sessionToken
  }
});

const uploadReq = new Upload({
  client: s3Client,
  params: {
    Bucket: <bucketName>,
    Key: <key>,
    Body: <file_body>
  },
  tags: [], // optional tags
  queueSize: 4, // optional concurrency configuration
  partSize: 1024 * 1024 * 10, // (10MB) - optional size of each part, in bytes
  leavePartsOnError: false // optional manually handle dropped parts
});

const uploadReq$ = from(uploadReq.done()).pipe(
  catchError(() => {
    uploadReq.abort();
    return of(null);
  })
);
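One way to avoid the expired-token 400s in the first place (a sketch, not a confirmed fix for the abort behaviour itself) is to give the v3 client a credentials provider function instead of a static object; when the returned credentials carry an expiration, the SDK can call the provider again for fresh ones rather than keep signing parts with a stale sessionToken. fetchFreshCredentials and its URL are hypothetical placeholders for however your temporary credentials are issued:

import { S3 } from '@aws-sdk/client-s3';

// Hypothetical helper: fetch fresh temporary credentials from wherever they
// are issued (your backend, an STS wrapper, etc.).
async function fetchFreshCredentials() {
  const res = await fetch('https://example.com/sts-credentials'); // placeholder URL
  const c = await res.json();
  return {
    accessKeyId: c.accessKeyId,
    secretAccessKey: c.secretAccessKey,
    sessionToken: c.sessionToken,
    expiration: new Date(c.expiration) // tells the SDK when these expire
  };
}

// Passing a function lets the SDK re-invoke it for fresh credentials
// instead of reusing a session token that has already expired.
const s3Client = new S3({
  region: 'us-east-1', // your region
  credentials: fetchFreshCredentials
});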
I am working with an AWS S3 bucket and trying to upload an image from a React Native project managed by Expo, with Express on the backend. I have created an s3 file on the backend that handles getting the presigned URL; this works and returns the URL to the front end inside the thunk function below from Redux Toolkit. I used axios to send the request to my server, and this works.
I have used both axios and fetch to try the final PUT to the presigned URL, but when it reaches the S3 bucket there is nothing in the file, just an empty file of 200 bytes every time. When I use the same presigned URL from Postman and upload an image in the binary section, then send the request, the image uploads to the bucket with no problems. When I send binary or base64 to the bucket from the RN app, it just uploads those values as text. I attempted react-native-image-picker but was having problems with that too.
Any ideas would be helpful, thanks. I have included a snippet from the Redux slice; if you need more info let me know.
redux slice projects.js
// create a project
// fancy function here ......
export const createProject = createAsyncThunk(
  "projects/createProject",
  async (postData) => {
    // sending image to s3 bucket and getting a url to store in d
    const response = await axios.get("/s3")
    // post image directly to s3 bucket
    const s3Url = await fetch(response.data.data, {
      method: "PUT",
      body: postData.image
    });
    console.log(s3Url)
    console.log(response.data.data)
    // make another request to my server to store extra data
    try {
      const response = await axios.post('/works', postData)
      return response.data.data;
    } catch (err) {
      console.log("Create projects failed: ", err)
    }
  }
)
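For what it's worth, the pattern that usually works in Expo/React Native is to turn the local file URI into a Blob first and PUT the Blob (not a base64 string) to the presigned URL. A sketch under the assumption that postData.image is a local file URI and presignedUrl stands in for response.data.data above:

// Sketch: convert the picker's local URI into a Blob, then PUT the raw bytes.
// Assumes postData.image is something like "file:///.../photo.jpg".
const localUri: string = postData.image;
const fileResponse = await fetch(localUri);
const blob = await fileResponse.blob();

const s3Response = await fetch(presignedUrl, {
  method: 'PUT',
  body: blob, // raw bytes, not base64 text
  headers: {
    // must match the ContentType the URL was signed with, if one was specified
    'Content-Type': blob.type || 'image/jpeg'
  }
});
if (!s3Response.ok) {
  throw new Error(`S3 upload failed with status ${s3Response.status}`);
}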
Currently I have been trying to upload objects (videos) to my Google Cloud Storage bucket. I have found that the reason (possibly) I haven't been able to make them public is ACL or IAM permissions. The way it's currently done is that I get a signed URL from the backend as:
const getGoogleSignedUrl = async (root, args, context) => {
  const { filename } = args;
  const googleCloud = new Storage({
    keyFilename: ,
    projectId: 'something'
  });
  const options = {
    version: 'v4',
    action: 'write',
    expires: Date.now() + 15 * 60 * 1000, // 15 minutes
    contentType: 'video/quicktime',
    extensionHeaders: {'x-googl-acl': 'public-read'}
  };
  const bucketName = 'something';
  // Get a v4 signed URL for uploading file
  const [url] = await googleCloud
    .bucket(bucketName)
    .file(filename)
    .getSignedUrl(options);
  return { url };
}
Once I have gotten temporary permission from the backend as a URL, I try to make a PUT request to upload the file:
const response = await fetch(url, {
  method: 'PUT',
  body: blob,
  headers: {
    'x-googl-acl': 'public-read',
    'content-type': 'video/quicktime'
  }
}).then(res => console.log("the res is ", res)).catch(e => console.log(e));
Even though the file does get uploaded to Google Cloud Storage, it always shows public access as Not public. Any help would be appreciated, since I am starting to not understand how making an object public works in Google Cloud.
With AWS it was (previously) easy to make an object public by adding x-amz-acl to a PUT request.
Thank you in advance.
Update
I have changed the code to reflect what it currently looks like. Also, when I look at the object in Google Storage after it has been uploaded I see:
Public access: Not authorized
Type: video/quicktime
Size: 369.1 KB
Created: Feb 11, 2021, 5:49:02 PM
Last modified: Feb 11, 2021, 5:49:02 PM
Hold status: None
Retention policy: None
Encryption type: Google-managed key
Custom time: —
Public URL: Not applicable
Update 2
As stated, the issue where I wasn't able to upload the file after trying to add the recommended header was that I wasn't providing the header correctly. I changed the header from x-googl-acl to x-goog-acl, which has allowed me to upload to the cloud.
The new problem is that Public access is now showing as Not authorized.
Update 3
In order to try something new I followed the directions listed here: https://www.jhanley.com/google-cloud-setting-up-gcloud-with-service-account-credentials/. Once I finished everything, the next steps I took were:
1 - Upload a new video to the cloud. This was done using the new JSON file provided:
const googleCloud = new Storage({
  keyFilename: json_file_given,
  projectId: 'something'
});
Once the file was uploaded I noticed there were no changes in regard to it being public; it still has Public access Not authorized.
2 - After checking the status of the uploaded object, I went on to follow a similar approach below to make sure I am using the same account as the JSON that uploaded the file:
gcloud auth activate-service-account test#development-123456.iam.gserviceaccount.com --key-file=test_google_account.json
3 - Once I confirmed I was using the right account with the right permissions, I performed the next step:
gsutil acl ch -u AllUsers:R gs://example-bucket/example-object
This actually resulted in a response of No changes to gs://crit_bull_1/google5.mov
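In case it helps, an alternative to fighting the ACL header on the signed URL is to mark the object public from the backend after the upload, using the same service account. A sketch with the Node client; the key file, bucket, and object names are placeholders, and it assumes uniform bucket-level access is disabled so ACLs are still allowed:

import { Storage } from '@google-cloud/storage';

// Sketch: mark an already-uploaded object as publicly readable.
// Requires ACLs to be enabled on the bucket (uniform bucket-level access off)
// and the service account to have permission to change object ACLs.
const storage = new Storage({ keyFilename: 'service-account.json', projectId: 'something' });

async function makeObjectPublic(bucketName: string, objectName: string): Promise<void> {
  await storage.bucket(bucketName).file(objectName).makePublic();
  console.log(`https://storage.googleapis.com/${bucketName}/${objectName} is now public`);
}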
I am trying to create an AWS Lambda function that will read rows from multiple Google Sheets documents using the Google Sheets API, merge them, and write the result to another spreadsheet. To do so I followed all the necessary steps according to several tutorials:
Create credentials for the AWS user to have the key pair.
Create a Google Service Account, download the credentials.json file.
Share each necessary spreadsheet with the Google Service Account client_email.
When executing the program locally it works perfectly: it successfully logs in using the credentials.json file and reads & writes all the necessary documents.
However, when uploading it to AWS Lambda using the Serverless Framework and google-spreadsheet, the program fails silently at the authentication step. I've tried changing the permissions as recommended in this question but it still fails. The file is read properly and I can print it to the console.
This is the simplified code:
async function getData(spreadsheet, psychologistName) {
  await spreadsheet.useServiceAccountAuth(clientSecret);
  // It never gets to this point, it fails silently
  await spreadsheet.loadInfo();
  ... etc ...
}

async function main() {
  const promises = Object.entries(psychologistSheetIDs).map(async (psychologistSheetIdPair) => {
    const [psychologistName, googleSheetId] = psychologistSheetIdPair;
    const sheet = new GoogleSpreadsheet(googleSheetId);
    psychologistScheduleData = await getData(sheet, psychologistName);
    return psychologistScheduleData;
  });
  // When all sheets are available, merge their data and write back in joint view.
  Promise.all(promises).then(async (psychologistSchedules) => {
    ... merge the data ...
  });
}

module.exports.main = async (event, context, callback) => {
  const result = await main();
  return {
    statusCode: 200,
    body: JSON.stringify(
      result,
      null,
      2
    ),
  };
};
I solved it.
While locally having Promise.all(promises).then(result => ...) eventually returned the value and executed what was inside the then(), AWS Lambda returned before the promises were resolved.
This solved it:
const res = await Promise.all(promises);
mergeData(res);
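In other words, the handler has to await the merged result before returning. A sketch of the corrected main() under that assumption (mergeData stands in for whatever merges and writes the joint view):

// Sketch: resolve all sheet reads before merging, so the Lambda handler
// does not return while work is still pending.
async function main() {
  const promises = Object.entries(psychologistSheetIDs).map(async ([psychologistName, googleSheetId]) => {
    const sheet = new GoogleSpreadsheet(googleSheetId);
    return getData(sheet, psychologistName);
  });
  const psychologistSchedules = await Promise.all(promises);
  return mergeData(psychologistSchedules); // merge and write back in joint view
}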
I'm working on a project which integrates Google Cloud's Speech-to-Text API in an Android and iOS environment. I ran through the example code provided (https://cloud.google.com/speech-to-text/docs/samples) and was able to get it to run, and used it as a template to add voice into my app. However, there is a serious danger in the samples, specifically in generating the AccessToken (Android snippet below):
// ***** WARNING *****
// In this sample, we load the credential from a JSON file stored in a raw resource
// folder of this client app. You should never do this in your app. Instead, store
// the file in your server and obtain an access token from there.
// *******************
final InputStream stream = getResources().openRawResource(R.raw.credential);
try {
  final GoogleCredentials credentials = GoogleCredentials.fromStream(stream)
      .createScoped(SCOPE);
  final AccessToken token = credentials.refreshAccessToken();
This was fine to develop and test locally, but as the comment indicates, it isn't safe to ship the credential file in a production app build. So what I need to do is replace this code with a request to a server endpoint. Additionally, I need to write the endpoint that will take the request and pass back a token. Although I found some very interesting tutorials related to the Firebase Admin libraries generating tokens, I couldn't find anything related to doing a similar operation for GCP APIs.
Any suggestions/documentation/examples that could point me in the right direction are appreciated!
Note: The server endpoint will be a Node.js environment.
Sorry for the delay; I was able to get it all working together and am now circling back to post an extremely simplified how-to. To start, I installed the following library in the server endpoint project: https://www.npmjs.com/package/google-auth-library
The server endpoint in this case lacks any authentication/authorization etc. for simplicity's sake; I'll leave that part up to you. We are also going to pretend this endpoint is reachable from https://www.example.com/token.
The expectation is that calling https://www.example.com/token will result in a response with a string token, a number for expires, and some extra info about how the token was generated,
i.e.:
{"token":"sometoken", "expires":1234567, "info": {... additional stuff}}
Also, for this example I used a ServiceAccountKey file, which will be stored on the server.
The suggested route is to set up a server environment variable and use https://cloud.google.com/docs/authentication/production#finding_credentials_automatically; however, this is for the example's sake and is easy enough for a quick test. These files look something like the following (honor system: don't steal my private key):
ServiceAccountKey.json
{
  "type": "service_account",
  "project_id": "project-id",
  "private_key_id": "378329234klnfgdjknfdgh9fgd98fgduiph",
  "private_key": "-----BEGIN PRIVATE KEY-----\nThisIsTotallyARealPrivateKeyPleaseDontStealIt=\n-----END PRIVATE KEY-----\n",
  "client_email": "project-id#appspot.gserviceaccount.com",
  "client_id": "12345678901234567890",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/project-id%40appspot.gserviceaccount.com"
}
So here it is: a simple endpoint that spits out an AccessToken and a number indicating when the token expires (so you can call for a new one later).
endpoint.js
const express = require("express");
const { GoogleAuth } = require("google-auth-library");
const serviceAccount = require("./ServiceAccountKey.json");

const googleauthoptions = {
  scopes: ['https://www.googleapis.com/auth/cloud-platform'],
  credentials: serviceAccount
};

const app = express();
const port = 3000;

const auth = new GoogleAuth(googleauthoptions);
auth.getClient().then(client => {
  app.get('/token', (req, res) => {
    client
      .getAccessToken()
      .then((clientresponse) => {
        if (clientresponse.token) {
          return clientresponse.token;
        }
        return Promise.reject('unable to generate an access token.');
      })
      .then((token) => {
        return client.getTokenInfo(token).then(info => {
          const expires = info.expiry_date;
          return res.status(200).send({ token, expires, info });
        });
      })
      .catch((reason) => {
        console.log('error: ' + reason);
        res.status(500).send({ error: reason });
      });
  });
  app.listen(port, () => {
    console.log(`Server is listening on https://www.example.com:${port}`);
  });
  return;
});
Almost done now; I will use Android as an example. The first clip shows how it was originally pulling the credential from a device file:
public static final List<String> SCOPE = Collections.singletonList("https://www.googleapis.com/auth/cloud-platform");

final GoogleCredentials credentials = GoogleCredentials.fromStream(this.mContext.getResources().openRawResource(R.raw.credential)).createScoped(SCOPE);
final AccessToken accessToken = credentials.refreshAccessToken();
final String value = accessToken.getTokenValue();
final long expires = accessToken.getExpirationTime().getTime();

final SharedPreferences prefs = getSharedPreferences(PREFS, Context.MODE_PRIVATE);
prefs.edit().putString(PREF_ACCESS_TOKEN_VALUE, value).putLong(PREF_ACCESS_TOKEN_EXPIRATION_TIME, expires).apply();
fetchAccessToken();
Now we get our token from the endpoint over the internet (not shown). With the token and expires information in hand, we handle it in the same manner as if it had been generated on the device:
//
// lets pretend endpoint contains the results from our internet request against www.example.com/token
final String value = endpoint.token;
final long expires = endpoint.expires;

final SharedPreferences prefs = getSharedPreferences(PREFS, Context.MODE_PRIVATE);
prefs.edit().putString(PREF_ACCESS_TOKEN_VALUE, value).putLong(PREF_ACCESS_TOKEN_EXPIRATION_TIME, expires).apply();
fetchAccessToken();
Anyway hopefully that is helpful if anyone has a similar need.
===== re: AlwaysLearning comment section =====
Compared to the original file credential based solution:
https://github.com/GoogleCloudPlatform/android-docs-samples/blob/master/speech/Speech/app/src/main/java/com/google/cloud/android/speech/SpeechService.java
In my specific case I am interacting with a secured API endpoint that is unrelated to Google, via the react-native environment (which sits on top of Android and uses JavaScript).
I already have a mechanism to securely communicate with the api endpoint I created.
So conceptually, in React Native I call
MyApiEndpoint()
which gives me a token / expires pair, i.e.:
token = "some token from the api" // token info returned from the api
expires = 3892389329237 // expiration time returned from the api
I then pass that information from react-native down to Java, and update the Android preferences with the stored information via this function (I added this function to the SpeechService.java file):
public void setToken(String value, long expires) {
  final SharedPreferences prefs = getSharedPreferences(PREFS, Context.MODE_PRIVATE);
  prefs.edit().putString(PREF_ACCESS_TOKEN_VALUE, value).putLong(PREF_ACCESS_TOKEN_EXPIRATION_TIME, expires).apply();
  fetchAccessToken();
}
This function adds the token and expires content to the well-known shared preference location and kicks off the AccessTokenTask().
The AccessTokenTask was modified to simply pull from the preferences:
private class AccessTokenTask extends AsyncTask<Void, Void, AccessToken> {
  protected AccessToken doInBackground(Void... voids) {
    final SharedPreferences prefs = getSharedPreferences(PREFS, Context.MODE_PRIVATE);
    String tokenValue = prefs.getString(PREF_ACCESS_TOKEN_VALUE, null);
    long expirationTime = prefs.getLong(PREF_ACCESS_TOKEN_EXPIRATION_TIME, -1);
    if (tokenValue != null && expirationTime != -1) {
      return new AccessToken(tokenValue, new Date(expirationTime));
    }
    return null;
  }
You may notice I don't do much with the expires information here; I do the check for expiration elsewhere.
Here you have a couple of useful links:
Importing the Google Cloud Storage Client library in Node.js
Cloud Storage authentication