I have a requirement that whenever files are placed in an S3 bucket, they need to be moved to an external VMC server. I have written a boto3 script that can download these files.
But how do I automate this process, so that whenever new files land in the S3 bucket, the script runs on the external VMC server and downloads them?
Is there any way I can do that?
Is there any way I can monitor an S3 bucket for new files added to it using boto3? Once a new file is added to the S3 bucket, it needs to be downloaded.
My Python code needs to run on an external VMC server, which is not hosted on an AWS EC2 instance. Whenever a vendor pushes a new file to our public S3 bucket, I need to download those files to this VMC server for ingestion into our on-prem databases/servers. I can't access the VMC server from AWS either, and there is no webhook available.
I have written the code for downloading the files, but how can I monitor the S3 bucket for new files?
Take a look at S3 Event Notifications: https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html
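Since the VMC server can't receive webhooks, one common pattern (an assumption here, not the only option) is to have S3 Event Notifications publish to an SQS queue and then long-poll that queue from the external server with boto3. A minimal sketch, assuming the queue already receives the bucket's s3:ObjectCreated:* events; the queue URL and download directory are placeholders:

import json
from urllib.parse import unquote_plus

import boto3

# Assumed names -- replace with your own queue URL and download directory.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/new-s3-files"
DOWNLOAD_DIR = "/data/incoming"

sqs = boto3.client("sqs", region_name="us-east-1")
s3 = boto3.client("s3")

while True:
    # Long-poll the queue for S3 event notification messages.
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10,
                               WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        body = json.loads(msg["Body"])
        for record in body.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys in event notifications are URL-encoded.
            key = unquote_plus(record["s3"]["object"]["key"])
            filename = key.rsplit("/", 1)[-1]
            s3.download_file(bucket, key, f"{DOWNLOAD_DIR}/{filename}")
        # Delete the message so the same event is not processed twice.
        sqs.delete_message(QueueUrl=QUEUE_URL,
                           ReceiptHandle=msg["ReceiptHandle"])

This keeps all connectivity outbound from the VMC server, so no inbound access from AWS is required.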
I am publishing a web application to Amazon EC2 via Elastic Beanstalk from Visual Studio 2017. I let users upload files to the server and store those files internally in the App_Data folder. But every time I publish a new build, it deletes all the files in the App_Data folder that were previously uploaded by users.
I also tried excluding the App_Data folder from the project and creating that folder from code, but it still deletes all the files. Any suggestions on this?
Do let me know if you need any further information.
You should not hold any data on your Elastic Beanstalk instances; you'll have issues when running multiple instances, and it's not very resilient.
One option is to attach an Elastic File System (EFS) to your Elastic Beanstalk instances and treat it as a shared volume.
My preferred solution is to save the files to S3, either by uploading from your application to the S3 bucket, or by using generated pre-signed URLs to upload directly from the user's browser.
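For the pre-signed URL approach, the URL is generated server-side and handed to the browser, which then PUTs the file straight to S3 so nothing lands on the instance's local disk. A minimal illustrative sketch with boto3 (the question's app is .NET, so treat this as a sketch of the idea rather than the exact code; bucket and key names are placeholders):

import boto3

s3 = boto3.client("s3")

# Generate a URL that allows a single PUT of this key for 15 minutes.
url = s3.generate_presigned_url(
    ClientMethod="put_object",
    Params={"Bucket": "my-uploads-bucket", "Key": "uploads/user-file.pdf"},
    ExpiresIn=900,
)

# Hand `url` to the browser; it uploads with a plain HTTP PUT.
print(url)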
I am making a cron job on an AWS server, and I have a file-handling function which creates a JSON file. I already have Amazon S3 storage, and I want my JSON file saved into it. How can I do it? I tried to locate the directory for Amazon S3 storage using FileZilla but found nothing. Thank you!
You have to put another command into your cron.
After you create the JSON file, you have to use the AWS CLI to upload it to S3.
Here is how to install it: installation guide
After it's set up, you can use the aws s3 command to upload the file.
Have a look here for more information: S3 upload command
I guess this is the command you need to add:
aws s3 cp ./yourfile.json s3://your-bucket-name/
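If the cron job itself runs a Python script, the upload can also be done in the same script with boto3 instead of shelling out to the CLI. A small sketch, assuming placeholder paths and bucket name:

import boto3

s3 = boto3.client("s3")

# After the existing file-handling function has written the JSON file,
# push it to the bucket in the same cron run.
s3.upload_file("/path/to/yourfile.json", "your-bucket-name", "yourfile.json")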
As per my project requirement, I want to fetch some files from an on-prem FTP server and put them into an S3 bucket. The files are 1-2 GB in size. Once a file is placed in the FTP server folder, I want it to be uploaded to the S3 bucket.
Please suggest the easiest way to achieve this.
Note: the files will mostly be placed on the FTP server only once a day, so I don't want to continuously scan the FTP server. Once the files have been uploaded to S3 from the FTP server, I want to terminate any resources (like EC2) created in AWS.
These are my ideas:
I think you could create an agent on your FTP server that uploads the files every N seconds/minutes/hours using the AWS CLI (or boto3; see the sketch after this list). This way you avoid external access to your FTP server.
Another approach is a Lambda function for the pulling process, but as you said, the FTP server doesn't allow external access.
Create a VPN between your on-prem network and the cloud infra, create a CloudWatch Events schedule, and execute the pulling process through a Lambda; there you can configure a timeout.
Create a VPN between your on-prem network and the cloud infra, and upload the files from your FTP server using the AWS CLI (pay attention to the sync option). Take a look at this link: https://aws.amazon.com/answers/networking/accessing-vpc-endpoints-from-remote-networks/
With Jenkins, create a task to execute a process that uploads the files.
You can use Storage Gateway; visit its site here: https://aws.amazon.com/es/storagegateway/
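For the agent-on-the-FTP-server idea above, here is a minimal boto3 sketch of the upload step, using a multipart transfer configuration because the files are 1-2 GB. The bucket name and paths are placeholders, and the scheduling itself would be a cron entry or similar:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart uploads with 64 MB parts and several threads for 1-2 GB files.
config = TransferConfig(multipart_threshold=64 * 1024 * 1024,
                        multipart_chunksize=64 * 1024 * 1024,
                        max_concurrency=8)

# Upload one file that landed in the FTP drop folder.
s3.upload_file("/ftp/drop/bigfile.dat", "your-bucket-name",
               "incoming/bigfile.dat", Config=config)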
Here is how we solved it.
Enable S3 Transfer Acceleration on your S3 bucket. This is very much needed, since you are pushing large files.
If you have access to the server, install the AWS CLI and perform a sync of the folder to the S3 bucket. The AWS CLI will automatically sync your folder to the bucket, so if you change any of your existing files, they will be kept in sync with the S3 bucket. This is the ideal and simplest way if you have access to the server and are able to install the AWS CLI.
https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration-examples.html#transfer-acceleration-examples-aws-cli
aws s3api put-bucket-accelerate-configuration --bucket bucketname --accelerate-configuration Status=Enabled
If you want to enable it for a specific or the default profile:
aws configure set default.s3.use_accelerate_endpoint true
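If the upload runs from a boto3 script rather than the CLI, the accelerate endpoint can likewise be enabled on the client. An illustrative sketch; the bucket name and paths are placeholders:

import boto3
from botocore.config import Config

# Route requests through the S3 Transfer Acceleration endpoint.
s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))

s3.upload_file("/ftp/drop/bigfile.dat", "bucketname", "incoming/bigfile.dat")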
If you don't have access to the FTP server on your premises, you need an external server to perform this process. In that case you need to poll or use a shared file system, copy the file locally, and move it to the S3 bucket. There will be a lot of failure points with this process.
Hope it helps.
I'm using AWS to host a static website. Unfortunately, it's very tedious to upload the directory to S3. Is there any way to streamline the process?
Have you considered using the AWS CLI (AWS Command Line Interface) to interact with AWS services and resources?
Once you install and configure the AWS CLI, to update the site all you need to do is:
aws s3 sync /local/dev/site s3://my-website-bucket
This way you can continue developing the static site locally, and a simple aws s3 sync call automatically looks at the files that have changed since the last sync and uploads them to S3 without any mess.
To make the newly created objects public (if not done using a bucket policy):
aws s3 sync /local/dev/site s3://my-website-bucket --acl public-read
The best part is that multipart upload is built in. Additionally, you can sync back from S3 to local (the reverse).
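If you'd rather script the upload direction from Python instead of the CLI, a rough boto3 sketch is below. Unlike aws s3 sync it does no change detection and simply uploads everything; the bucket name and local path are placeholders:

import os
import boto3

s3 = boto3.client("s3")
LOCAL_ROOT = "/local/dev/site"
BUCKET = "my-website-bucket"

# Walk the local site and upload every file, preserving relative paths as keys.
for dirpath, _, filenames in os.walk(LOCAL_ROOT):
    for name in filenames:
        local_path = os.path.join(dirpath, name)
        key = os.path.relpath(local_path, LOCAL_ROOT).replace(os.sep, "/")
        s3.upload_file(local_path, BUCKET, key,
                       ExtraArgs={"ACL": "public-read"})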