I am trying to download multiple files from S3 to AWS RDS MSSQL at the same time.
The following example shows the stored procedure used to download a file from S3:
exec msdb.dbo.rds_download_from_s3
@s3_arn_of_file='arn:aws:s3:::bucket_name/data.csv',
@rds_file_path='D:\S3\Folder\data.csv',
@overwrite_file=1;
This procedure can only download one file at a time; the rest are queued. Is there a solution whereby I can download multiple files at the same time?
I was thinking of downloading a zipped file and unzipping it once downloaded, but the integration does not support the zip format.
I have also checked the limitations in the AWS documentation:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/User.SQLServer.Options.S3-integration.html
Can anyone help?
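Since the procedure queues the tasks rather than blocking until each file is on disk, a script can at least submit all the downloads up front and then poll msdb.dbo.rds_fn_task_status (documented on the page linked above) instead of waiting on each file. Below is a minimal, untested sketch in Python with pyodbc; the connection string, bucket, file names, and paths are placeholders.

import time
import pyodbc

# Placeholder connection string for the RDS SQL Server instance.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=my-rds-endpoint,1433;DATABASE=msdb;UID=admin;PWD=secret",
    autocommit=True,
)
cur = conn.cursor()

files = ["data1.csv", "data2.csv", "data3.csv"]
for name in files:
    # Each exec returns a task id immediately; RDS still downloads one file at a time.
    cur.execute(
        "exec msdb.dbo.rds_download_from_s3 "
        "@s3_arn_of_file = ?, @rds_file_path = ?, @overwrite_file = 1;",
        f"arn:aws:s3:::bucket_name/{name}",
        f"D:\\S3\\Folder\\{name}",
    )

# Poll until every task has finished, successfully or not.
while True:
    rows = cur.execute(
        "select task_id, lifecycle from msdb.dbo.rds_fn_task_status(NULL, 0);"
    ).fetchall()
    if all(row.lifecycle in ("SUCCESS", "ERROR", "CANCELLED") for row in rows):
        break
    time.sleep(30)

This doesn't make the downloads run in parallel, but it does remove the need to babysit each call.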
I have lots of .tar.gz files (millions) stored in a bucket on Amazon S3.
I'd like to untar them and create the corresponding folders on Amazon S3 (in the same bucket or another).
Is it possible to do this without me having to download/process them locally?
It's not possible with only S3. You'll need something like EC2, ECS, or Lambda, preferably running in the same region as the S3 bucket, to download the .tar.gz files, extract them, and upload every extracted file back to S3.
With AWS Lambda you can do this! Since everything stays inside AWS, you also avoid the extra network costs of downloading and uploading the data.
You can follow this blog post, but extract instead of zipping:
https://www.antstack.io/blog/create-zip-using-lambda-with-files-streamed-from-s3/
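For example, here is a minimal, untested sketch of such a Lambda handler; the event fields are hypothetical, and it assumes the archives are ordinary .tar.gz files:

import tarfile
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    source_bucket = event["source_bucket"]          # hypothetical event fields
    dest_bucket = event.get("dest_bucket", source_bucket)
    key = event["key"]                              # e.g. "archives/batch-0001.tar.gz"

    body = s3.get_object(Bucket=source_bucket, Key=key)["Body"]
    # mode "r|gz" reads the archive as a stream, so it never has to fit in /tmp
    with tarfile.open(fileobj=body, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue
            extracted = archive.extractfile(member)
            # Recreate the archive's folder structure under a prefix named after the archive
            dest_key = f"{key.rsplit('.tar.gz', 1)[0]}/{member.name}"
            s3.upload_fileobj(extracted, dest_bucket, dest_key)

With millions of archives you would drive this with something like S3 Batch Operations or an SQS queue, one archive per invocation.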
I have a link in a request which points to some PDF/image content type. My requirement is to upload the content at that link to the S3 server.
Do I have to download it and then upload the file? I have too many calls and limited file storage on the machine. Or is there any other way to achieve this?
You must upload the file to Amazon S3.
It is not possible to tell Amazon S3 to retrieve a file from a URL.
My requirement is to upload the content in the link to the s3 server.
You need some compute resource; S3 itself won't do that.
Do I have to download it and then upload the file
Or is there any other way to achieve this.
The compute resource (logic) doesn't need to reside on your computer. You can use an AWS compute resource close to S3, such as Lambda, EC2, or ECS. Decide based on the predicted load or other requirements.
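As an illustration, a small sketch with that kind of compute in mind (assuming the requests library; the URL, bucket, and key are placeholders). The object is streamed from the source URL straight into S3, so nothing is written to local storage:

import boto3
import requests

s3 = boto3.client("s3")

def copy_url_to_s3(url, bucket, key):
    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        # upload_fileobj reads the response in chunks and handles multipart uploads itself
        s3.upload_fileobj(
            resp.raw, bucket, key,
            ExtraArgs={"ContentType": resp.headers.get("Content-Type", "application/octet-stream")},
        )

copy_url_to_s3("https://example.com/report.pdf", "my-upload-bucket", "incoming/report.pdf")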
What is the better option to get data from a directory on an SFTP server and copy it into an S3 bucket on AWS? On the SFTP server I only have read permission, so rsync isn't an option.
My idea is to create a Glue job in Python that downloads this data and copies it into an S3 bucket. They are different files; one is about 600 MB, others are 4 GB.
Assuming you are talking about an sFTP server that is not on AWS, you have a few different options that may be easier than what you have proposed (although your solution could work):
Install the AWS CLI on the sFTP server and copy the files with the aws s3 cp command.
Write a script using the AWS SDK that reads the files and copies them (see the sketch after this list). Given the size of your files, you may need to use multipart upload.
You can create an AWS managed sFTP server that links directly to your S3 bucket as the backend storage for that server, then use sftp commands to copy the files over.
Be mindful that you will need the appropriate permissions in your AWS account to complete any of these 3 (or 4) solutions.
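A rough sketch of the second option, assuming the paramiko library for SFTP; host, credentials, paths, and bucket names are placeholders. boto3's upload_fileobj streams the data and switches to multipart upload automatically, which matters for the 4 GB files, and read-only access on the SFTP side is enough:

import boto3
import paramiko

def sftp_to_s3(host, username, password, remote_path, bucket, key):
    transport = paramiko.Transport((host, 22))
    transport.connect(username=username, password=password)
    sftp = paramiko.SFTPClient.from_transport(transport)
    s3 = boto3.client("s3")
    try:
        with sftp.open(remote_path, "rb") as remote_file:
            remote_file.prefetch()                       # speeds up sequential reads
            s3.upload_fileobj(remote_file, bucket, key)  # streams, no local copy needed
    finally:
        sftp.close()
        transport.close()

sftp_to_s3("sftp.example.com", "readonly_user", "secret",
           "/exports/big_file.dat", "my-landing-bucket", "sftp/big_file.dat")

The same approach would also work from the Glue Python job you proposed.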
I want my users to be able to download many files from an AWS S3 bucket (potentially over a few hundred GB in total) as one large ZIP file. I would download those selected files from S3 first and upload a newly created ZIP file to S3. This job will rarely be invoked during our service, so I decided to use Lambda for it.
But Lambda has its own limitations: 15 minutes of execution time, ~500 MB of /tmp storage, etc. I found several workarounds on Google that can beat the storage limit (streaming), but no way around the execution-time limit.
Here are what I've found so far:
https://dev.to/lineup-ninja/zip-files-on-s3-with-aws-lambda-and-node-1nm1
Create a zip file on S3 from files on S3 using Lambda Node
Note that programming language is not a concern here.
Could you please give me a suggestion?
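Not a full answer, but here is a rough, untested sketch of the streaming idea in Python: the ZIP is written directly to an S3 multipart upload while the source objects are read in chunks, so nothing large ever touches /tmp. Bucket names, keys, and the part size are placeholders, and it assumes Python 3.6+ (from that version zipfile can write entries to a non-seekable stream).

import io
import zipfile
import boto3

s3 = boto3.client("s3")

class S3MultipartWriter(io.RawIOBase):
    """Minimal file-like object that flushes buffered bytes to an S3 multipart upload."""
    def __init__(self, bucket, key, part_size=8 * 1024 * 1024):
        self.bucket, self.key, self.part_size = bucket, key, part_size
        self.upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
        self.parts, self.buffer = [], bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.buffer.extend(b)
        while len(self.buffer) >= self.part_size:
            self._flush_part(self.buffer[:self.part_size])
            del self.buffer[:self.part_size]
        return len(b)

    def _flush_part(self, data):
        number = len(self.parts) + 1
        resp = s3.upload_part(Bucket=self.bucket, Key=self.key,
                              UploadId=self.upload["UploadId"],
                              PartNumber=number, Body=bytes(data))
        self.parts.append({"PartNumber": number, "ETag": resp["ETag"]})

    def close(self):
        if self.closed:
            return
        if self.buffer:
            self._flush_part(self.buffer)
            self.buffer.clear()
        s3.complete_multipart_upload(Bucket=self.bucket, Key=self.key,
                                     UploadId=self.upload["UploadId"],
                                     MultipartUpload={"Parts": self.parts})
        super().close()

def zip_objects(source_bucket, keys, dest_bucket, dest_key):
    writer = S3MultipartWriter(dest_bucket, dest_key)
    with zipfile.ZipFile(writer, "w", zipfile.ZIP_DEFLATED) as archive:
        for key in keys:
            body = s3.get_object(Bucket=source_bucket, Key=key)["Body"]
            with archive.open(key, "w") as entry:
                for chunk in iter(lambda: body.read(1024 * 1024), b""):
                    entry.write(chunk)
    writer.close()

This only works around the storage limit; the 15-minute cap would still push you toward compute without that limit (ECS/Fargate, EC2) or splitting the work across invocations.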
I have a CSV hosted on a server which updates daily. I'd like to set up a transfer to load this into Google Cloud Storage so that I can then query it using BigQuery.
I'm looking at the transfer service and it doesn't seem to have what I need, e.g. it only accepts CSVs or files from other Google Cloud Storage buckets or Amazon S3 buckets.
Thanks in advance
You can also use a URL to a TSV file as explained here and configure the transfer to run daily at the time of your choice.
Alternatively, if that still doesn't fit your needs, you can install gsutil on your remote machine, use the gsutil rsync command, and schedule it to run daily.
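If a scheduled script ends up being the easier route, here is a sketch in the same spirit as the gsutil suggestion, assuming the google-cloud-storage and requests libraries; the URL, bucket, and object names are placeholders. It streams the CSV from its URL into a GCS bucket that BigQuery can then query:

import requests
from google.cloud import storage

def load_csv_to_gcs(csv_url, bucket_name, blob_name):
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    with requests.get(csv_url, stream=True) as resp:
        resp.raise_for_status()
        # upload_from_file streams the response body without saving it locally
        blob.upload_from_file(resp.raw, content_type="text/csv")

load_csv_to_gcs("https://example.com/daily_export.csv",
                "my-analytics-bucket", "imports/daily_export.csv")

Run it from cron, Cloud Scheduler with a Cloud Function, or any other daily scheduler.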