I have done a number of searches and can't seem to work out whether this is doable at all.
I have a data logger that has an FTP-push function. The FTP-push function has the following settings:
FTP server
Port
Upload directory
User name
Password
In general, I understand that a FileZilla client (I have the Pro edition) is able to drop files into my AWS S3 bucket, and I have done this successfully on my local PC.
Is it possible to remove the FileZilla client requirement and input my S3 information directly into my data logger? Something like the diagram below:
Data logger ----FTP----> S3 bucket
If not, what would be the most sensible way to have my data logger's JSON files dropped into AWS S3 via FTP?
Frankly, you'd be better off with:
Logging to local files
Copying the log files to Amazon S3 on a schedule with the aws s3 sync command
The schedule could be triggered by cron (Linux) or a Scheduled Task (Windows).
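If you would rather script the scheduled copy in Python with boto3 instead of calling aws s3 sync directly, a minimal sketch might look like the following; the directory, bucket and prefix names are placeholders, and the script is intended to be run from cron or a Scheduled Task.

# sync_logs.py -- upload any new JSON log files to S3; run it on a schedule
import os
import boto3
from botocore.exceptions import ClientError

LOG_DIR = "/var/log/datalogger"   # placeholder: wherever the logger writes files
BUCKET = "my-datalogger-bucket"   # placeholder bucket name
PREFIX = "uploads/"               # key prefix inside the bucket

s3 = boto3.client("s3")

def already_uploaded(key):
    # head_object raises ClientError (404) if the object does not exist yet
    try:
        s3.head_object(Bucket=BUCKET, Key=key)
        return True
    except ClientError:
        return False

for name in os.listdir(LOG_DIR):
    if not name.endswith(".json"):
        continue
    key = PREFIX + name
    if not already_uploaded(key):
        s3.upload_file(os.path.join(LOG_DIR, name), BUCKET, key)
        print(f"uploaded {name} -> s3://{BUCKET}/{key}")

The head_object check is just a cheap way to mimic sync's "only copy what is new" behaviour; aws s3 sync itself compares file size and timestamps instead.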
Amazon recently added FTP support to AWS Transfer. This provides an integration with Amazon S3 via FTP without setting up any additional infrastructure; however, you should review the pricing before committing to it.
As an alternative, you could create an intermediary server that syncs with Amazon S3 using the aws s3 sync CLI command.
Related
I'm trying to prepare a flow where we can regularly pull newly available files from a third party's on-prem server into our S3 bucket using AWS Transfer Family.
I read this documentation https://aws.amazon.com/blogs/storage/how-discover-financial-secures-file-transfers-with-aws-transfer-family/, but it was not clear about how to set up and configure the process.
Can someone share clear documentation or reference links on using AWS Transfer Family to pull files from an external on-prem server into our S3 bucket?
@Sampath, I think you misunderstood the available features of the AWS Transfer service. That service acts as a serverless SFTP endpoint with Amazon S3 as the backend storage, to which you can connect via the SFTP protocol (FTP and FTPS are now supported as well). You can either PUSH data to S3 or PULL data from S3 via the AWS Transfer service. You cannot PULL data into S3 from anywhere else via the AWS Transfer service alone.
You would have to use another solution, such as a Python script running on AWS EC2, for that purpose.
Another option is to have the external third-party server connect to the AWS Transfer service and PUSH files to S3 through it.
As per your use case, I think you need a simple solution that connects to the external third-party server and copies files from it to the AWS S3 bucket. This can be done with a Python script, which you can run on AWS EC2, AWS ECS, AWS Lambda, AWS Batch, etc., depending on the specifications and requirements.
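As a rough sketch of that Python-script idea (not an official recipe), the job is essentially: connect to the third party's server, list the files, and stream each one into S3 with boto3. The host, credentials, directory and bucket names below are placeholders, and it assumes the third party exposes SFTP (paramiko); swap in ftplib if it is plain FTP.

import boto3
import paramiko

SFTP_HOST = "sftp.thirdparty.example.com"   # placeholder third-party server
SFTP_USER = "username"                      # placeholder credentials
SFTP_PASS = "password"
REMOTE_DIR = "/outgoing"                    # placeholder directory to pull from
BUCKET = "my-landing-bucket"                # placeholder S3 bucket

s3 = boto3.client("s3")

transport = paramiko.Transport((SFTP_HOST, 22))
transport.connect(username=SFTP_USER, password=SFTP_PASS)
sftp = paramiko.SFTPClient.from_transport(transport)
try:
    for name in sftp.listdir(REMOTE_DIR):   # assumes the directory holds only files
        remote_file = sftp.open(f"{REMOTE_DIR}/{name}", "rb")
        try:
            # Stream the remote file straight into S3 without writing it to disk
            s3.upload_fileobj(remote_file, BUCKET, name)
        finally:
            remote_file.close()
        print(f"copied {name} -> s3://{BUCKET}/{name}")
finally:
    sftp.close()
    transport.close()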
I have used AWS Transfer once; I found it to be very expensive and went with AWS EC2 instead. In the case of AWS EC2, you can even buy Reserved Instances to further reduce the cost. If the task is just about copying files from an external server to S3 and the copy job will never take more than 10 minutes, then it is better to run it on AWS Lambda.
In short, you cannot PULL data from any server into S3 using the AWS Transfer service. You can only PUSH data to or PULL data from S3 using the AWS Transfer service.
References to some informative blogs:
Centralize data access using AWS Transfer Family and AWS Storage Gateway
How Discover Financial secures file transfers with AWS Transfer Family
Moving external site data to AWS for file transfers with AWS Transfer Family
Easy SFTP Setup with AWS Transfer Family
With the AWS Transfer Family service you can create servers that use the SFTP, FTPS, and FTP protocols for your file transfers, and use Amazon S3 or Amazon EFS as the domain where your files are stored and accessed.
To connect your on-premises servers with the Transfer Family server, you will need a service like AWS Storage Gateway (File Gateway), which syncs your files to S3 over HTTPS.
Your architecture will be something like this:
On-premises servers ----> Storage Gateway (File Gateway) ----HTTPS----> Amazon S3 <----> AWS Transfer Family server (SFTP/FTPS/FTP clients)
If you want more details on how to connect your on-premises servers with Amazon S3 and the Transfer Family service, take a look at this blog post: Centralize data access using AWS Transfer Family and AWS Storage Gateway
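For reference, creating such a Transfer Family server programmatically is only a few boto3 calls. This is just a minimal sketch of an SFTP server with the S3 domain and a service-managed user; the IAM role ARN, bucket path and SSH key are placeholders, and an FTP- or FTPS-enabled server needs additional settings not shown here.

import boto3

transfer = boto3.client("transfer")

# Create an SFTP-enabled server whose storage domain is Amazon S3
server = transfer.create_server(
    Protocols=["SFTP"],
    Domain="S3",
    IdentityProviderType="SERVICE_MANAGED",
    EndpointType="PUBLIC",
)
server_id = server["ServerId"]

# Add a user mapped to a home directory in your bucket; the IAM role must grant
# access to that bucket (placeholder ARN and paths shown here)
transfer.create_user(
    ServerId=server_id,
    UserName="partner-user",
    Role="arn:aws:iam::123456789012:role/transfer-s3-access",   # placeholder
    HomeDirectory="/my-landing-bucket/incoming",                 # placeholder
    SshPublicKeyBody="ssh-rsa AAAA... user@example",             # placeholder key
)
print("Transfer Family server created:", server_id)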
I have a requirement to send files from an S3 bucket to an external client. FTP or SFTP can be used for this. Based on some research I found that this can be done using Lambda or EC2, but I couldn't find detailed steps for it. Please let me know how this can be done.
Amazon S3 cannot "send" files anywhere.
Therefore, you will need some code running 'somewhere' that will:
Download the file(s) from Amazon S3
Send the file(s) to the external client via SFTP
This is all easily scriptable. The difficulty probably comes in deciding which files to send and how to handle any errors.
You probably couldn't find any documentation on the topic because sending files via SFTP is not specific to AWS. Just do it the way you would from anywhere else.
For example, let's say you wanted to do it via a Python program running either on an Amazon EC2 instance or as an AWS Lambda function:
Download the desired files by using the AWS SDK for Python (boto3). See: Amazon S3 examples
Send the files via SFTP. See: Transfer file from AWS S3 to SFTP using Boto 3
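Putting those two steps together, a minimal Python sketch (runnable on EC2, or packaged for Lambda) could look like this; the bucket, key, host, credentials and paths are all placeholders:

import boto3
import paramiko

BUCKET = "my-source-bucket"              # placeholder bucket holding the file
KEY = "exports/report.csv"               # placeholder object key
LOCAL_PATH = "/tmp/report.csv"           # /tmp is the writable path inside Lambda

SFTP_HOST = "sftp.client.example.com"    # the client's SFTP endpoint (placeholder)
SFTP_USER = "username"                   # placeholder credentials
SFTP_PASS = "password"
REMOTE_PATH = "/incoming/report.csv"     # placeholder destination path

# 1. Download the file from Amazon S3
s3 = boto3.client("s3")
s3.download_file(BUCKET, KEY, LOCAL_PATH)

# 2. Send the file to the external client via SFTP
transport = paramiko.Transport((SFTP_HOST, 22))
transport.connect(username=SFTP_USER, password=SFTP_PASS)
sftp = paramiko.SFTPClient.from_transport(transport)
try:
    sftp.put(LOCAL_PATH, REMOTE_PATH)
finally:
    sftp.close()
    transport.close()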
I came across a similar requirement, and this can be done very easily with a Lambda function.
The functional requirement for our use case was automated transfer of the files once they are ready to be sent back to the customer.
Architecture
We came up with this simple architecture for the basic use case:
S3 bucket ----event notification----> Lambda function ----SFTP----> external server
Workflow
Upload a file to the S3 bucket
The upload triggers a push event notification for the Lambda function. We prefer to have a separate Lambda function for each client so that we can store all SFTP connection details in environment variables.
Environment variables are used to store server details, credentials, file paths, etc.
The Lambda function fetches the file from the S3 bucket.
The Lambda function transfers the file to the external server.
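A sketch of such a Lambda function is below. It assumes the function is attached to the bucket's object-created event notification and that the SFTP connection details come from environment variables (the variable names are just examples); paramiko would need to be bundled with the function or provided via a layer.

import os
import urllib.parse
import boto3
import paramiko

s3 = boto3.client("s3")

def lambda_handler(event, context):
    host = os.environ["SFTP_HOST"]
    user = os.environ["SFTP_USER"]
    password = os.environ["SFTP_PASSWORD"]
    remote_dir = os.environ.get("SFTP_REMOTE_DIR", "/incoming")

    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Fetch the object into Lambda's /tmp scratch space
        local_path = "/tmp/" + os.path.basename(key)
        s3.download_file(bucket, key, local_path)

        # Push the file to the external server over SFTP
        transport = paramiko.Transport((host, 22))
        transport.connect(username=user, password=password)
        sftp = paramiko.SFTPClient.from_transport(transport)
        try:
            sftp.put(local_path, f"{remote_dir}/{os.path.basename(key)}")
        finally:
            sftp.close()
            transport.close()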
Worthwhile additions
Changes worth considering on top of this simple approach:
If the Lambda function fails to fetch a file, it should retry a couple of times; if it still fails, it should send a notification to the client that uploaded the file to the S3 bucket.
If the external transfer fails, the Lambda function should push a message to an SQS queue, from which any application can process the messages and notify the system; a retry can also be scheduled for a few days later.
I have a workflow need. I have a customer that does not want to deal with our S3 folders where we drop their files. They want us to send the files directly to their SFTP account. When I unload files from my backend, they are automatically unloaded to S3 by AWS services. As this is a one-time request per customer, I don't wish to set up an automated transfer process in a Lambda or bash script, nor do I wish to go through the hassle of copying the file to my local server only to post it to the SFTP site. I would prefer to just right-click on the file and choose to transfer it to an SFTP location. Does anyone know if AWS has any plans to add file transfer protocol support (SFTP, FTP, etc.) to the S3 console UI?
What would be even better is if AWS S3 allowed all files dropped in an S3 bucket location to be automatically transferred to a defined SFTP location, for the scenario where the customer never wishes to deal with S3 but we need to use it.
Given the current capabilities of Amazon S3, automating a send of files from Amazon S3 to an SFTP target would require the use of an AWS Lambda function.
There are a few ways to do this. Since you are looking for the easiest way, I would suggest installing s3fs (s3fs-fuse) on a Linux server; this enables you to mount S3 as a file system. You can mount it directly on the SFTP server and copy the files locally. Below is a URL describing S3 as a file system:
https://cloud.netapp.com/blog/amazon-s3-as-a-file-system
The other method would be to use the AWS CLI to copy the files (add the --recursive flag to copy a whole prefix); this involves installing the AWS CLI and generating API keys. Below is an example of the command:
aws s3 cp s3://mybucket/test.txt test2.txt
You can revoke the API keys once you are done with the transfer!
As per my project requirement, I want to fetch some files from an on-prem FTP server and put them into an S3 bucket. The files are 1-2 GB in size. Once a file is placed in the FTP server folder, I want it to be uploaded to the S3 bucket.
Please suggest the easiest way to achieve this.
Note: the files will mostly be put on the FTP server only once a day, so I don't want to continuously scan the FTP server. Once the files have been uploaded from the FTP server to S3, I want to terminate any resources (like EC2) created in AWS.
These are my ideas:
I think you could create an agent on your FTP server that uploads the files every N seconds/minutes/hours/etc. using the AWS CLI. This way you avoid external access to your FTP server.
Another approach is a Lambda function for the pulling process, but as you said, the FTP server doesn't allow external access.
Create a VPN between your on-premises network and the cloud infrastructure, create a CloudWatch Events rule, and execute the pulling process through a Lambda function, where you can configure a timeout (a rough sketch of this pulling process appears after this list).
Create a VPN between your on-premises network and the cloud infrastructure, and upload the files from your FTP server using the AWS CLI (pay attention to the sync option). Take a look at this link: https://aws.amazon.com/answers/networking/accessing-vpc-endpoints-from-remote-networks/
With Jenkins, create a task to execute a process that uploads the files.
You can use Storage Gateway; visit its site here: https://aws.amazon.com/es/storagegateway/
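For completeness, here is a rough sketch of the pulling process referenced in the ideas above; it is only an illustration, and the host, credentials, directory and bucket names are placeholders. It connects to the FTP server, downloads each file to local disk and uploads it to S3 with boto3; given the 1-2 GB file sizes and the once-a-day schedule, it fits well on a small EC2 instance or container that you start for the job and terminate afterwards.

import ftplib
import os
import boto3

FTP_HOST = "ftp.example.internal"    # placeholder on-prem FTP server
FTP_USER = "username"                # placeholder credentials
FTP_PASS = "password"
REMOTE_DIR = "/daily"                # placeholder directory on the FTP server
BUCKET = "my-ingest-bucket"          # placeholder S3 bucket

s3 = boto3.client("s3")

ftp = ftplib.FTP(FTP_HOST)
ftp.login(FTP_USER, FTP_PASS)
ftp.cwd(REMOTE_DIR)
try:
    for name in ftp.nlst():          # assumes the directory holds only files
        local_path = "/tmp/" + name
        # Download the file from the FTP server to local disk first ...
        with open(local_path, "wb") as f:
            ftp.retrbinary("RETR " + name, f.write)
        # ... then upload it to S3 (boto3 handles multipart for large files)
        s3.upload_file(local_path, BUCKET, name)
        os.remove(local_path)
finally:
    ftp.quit()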
Here is how we solved it.
Enable S3 Transfer Acceleration on your S3 bucket. This is very much needed, since you are pushing large files.
If you have access to the server, install the AWS CLI and perform a sync of the folder to the S3 bucket. The AWS CLI will automatically sync your folder to the bucket, so if you change any of your existing files they will be kept in sync with the S3 bucket. This is the ideal and simplest way if you have access to the server and are able to install the AWS CLI.
https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration-examples.html#transfer-acceleration-examples-aws-cli
aws s3api put-bucket-accelerate-configuration --bucket bucketname --accelerate-configuration Status=Enabled
If you want to enable the accelerate endpoint for a specific profile or the default profile:
aws configure set default.s3.use_accelerate_endpoint true
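If you prefer to do the same from Python rather than the CLI, a rough boto3 equivalent is below; the bucket and file names are placeholders that mirror the commands above.

import boto3
from botocore.config import Config

BUCKET = "bucketname"   # same placeholder as in the CLI example

# Enable Transfer Acceleration on the bucket (equivalent to the s3api command above)
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket=BUCKET,
    AccelerateConfiguration={"Status": "Enabled"},
)

# Upload through the accelerate endpoint (equivalent to use_accelerate_endpoint=true)
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("bigfile.dat", BUCKET, "bigfile.dat")   # placeholder file name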
If you don't have access to the FTP server on your premises, you need an external server to perform this process. In that case you need to poll or share the file system, copy the file locally and move it to the S3 bucket. There will be a lot of failure points with this process.
Hope it helps.
I'm using AWS to host a static website. Unfortunately, it's very tedious to upload the directory to S3. Is there any way to streamline the process?
Have you considered using the AWS CLI (AWS Command Line Interface) to interact with AWS services and resources?
Once you install and configure the AWS CLI, all you need to do to update the site is:
aws s3 sync /local/dev/site s3://my-website-bucket
This way you can continue developing the static site locally, and a simple aws s3 sync call will automatically look at which files have changed since the last sync and upload them to S3 without any mess.
To make the newly created objects public (if this is not already done via a bucket policy):
aws s3 sync /local/dev/site s3://my-website-bucket --acl public-read
The best part is that multipart upload is built in. Additionally, you can sync back from S3 to local (the reverse direction).