Files in AWS Fargate

Is it possible to actually write/edit/delete files in Fargate?
Since it's serverless and doesn't really have a filesystem,
I couldn't get a clear answer about this.
For example, one of our clients needs to write some temporary cache files on the local container.
Is this possible?
I don't want to set up a whole data container volume just for this...
Thanks!

Fargate runs containers; it is Containers as a Service (CaaS). Your (Docker) container can be anything: Linux, Windows, etc. You DO have a filesystem in Fargate: it is the OS filesystem of whatever you set up in your container image. Your application is deployed on this filesystem, and the OS user running your application has whatever permissions to the local filesystem that you give it in the container.
The filesystem is ephemeral, meaning that when your Fargate task stops and is destroyed, your local storage is destroyed with it. It is also limited to a fairly small amount of storage, on the order of 10 GB (see the task storage documentation linked below for current limits).
In Fargate you cannot mount a volume such as an EBS volume. If you need to do that with containers, you have to use EC2 launch type tasks in ECS instead of Fargate launch type tasks, or use a raw EC2 instance. None of this prevents read/write/delete access to the local filesystem inside your container.
So you can write local temp files just fine. If you need the data to persist beyond the life of the Fargate task, or you have very large amounts of data, write to some other storage such as S3 or RDS.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/fargate-task-storage.html
Edit: Mounting EFS volumes in ECS and Fargate is now generally available.
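If you want to sanity-check the read/write/delete behaviour yourself, a minimal sketch (assuming ECS Exec is enabled on the task; the cluster, task, and container names below are made up) is to open a shell inside the running task and exercise the local filesystem:

# Open an interactive shell in the running task (names are hypothetical).
aws ecs execute-command \
    --cluster my-cluster \
    --task 0123456789abcdef0 \
    --container app \
    --interactive \
    --command "/bin/sh"

# Inside the container: the ephemeral filesystem is ordinary and writable.
echo "cached at $(date)" > /tmp/cache.dat   # write
cat /tmp/cache.dat                          # read
rm /tmp/cache.dat                           # delete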

Related

What is the concept of attached disk and the default min/max disk space provided by the AWS Lightsail container service?

I have created an AWS Lightsail container service (Micro), uploaded a Docker image, and made the deployment.
The project is a web API with a POST endpoint that expects a file; it saves the file in a folder, extracts the keywords from the file, deletes the file, and replies with a JSON response.
I think that when the container service restarts (for example, due to an API crash or any other reason), it will create a new instance of the image, and hence the disk will get reset.
I am not able to locate any documentation about the concept of an attached disk, or the default min/max disk space provided by the AWS Lightsail container service.
when the container service restarts (for example, due to an API crash or any other reason), it will create a new instance of the image, and hence the disk will get reset.
The container service has only ephemeral (temporary) storage by default.
When you want to persist a filesystem in a container service, you have to use some persistent storage.
If you want to use a "local" filesystem, you can mount EFS storage. Another option is to store the files directly in an S3 bucket, if that is feasible for you.
S3 and EFS are elastic - practically without any size limits.
If you want to mount block storage (EBS - a disk), I'm not sure (I doubt) that it is possible in a Lightsail container.
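For the upload-process-delete workflow in the question, one hedged sketch (the bucket name and paths are made up) is to copy each uploaded file to S3 before deleting the local copy, so nothing of value lives on the ephemeral disk:

# Hypothetical bucket and paths; run by the API after saving the upload.
aws s3 cp /tmp/uploads/report.pdf s3://my-keyword-api-uploads/report.pdf
# Extract keywords, then delete the local copy; the S3 object survives restarts.
rm /tmp/uploads/report.pdf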

Will EFS maintain files after instance is terminated?

I am using CloudFormation heavily at the moment, and I have three stacks I am currently dealing with. The first stack is my load balancer stack, which essentially just has an application load balancer as its resource. The second stack has only a single resource: an elastic file system. The third stack is my main stack, where I have an autoscaling group behind the load balancer from the first stack. This autoscaling group also mounts the EFS from the second stack onto each new instance. If my autoscaling group were to kill an unhealthy instance and then initialize a new one to take its place, will it keep all of the files that were initially in the old EFS?
Basically, I am just wondering if files in an EFS in a particular instance will remain if that same file system is mounted in a different instance.
Basically, I am just wondering if files in an EFS in a particular instance will remain if that same file system is mounted in a different instance.
Yes. Files persist in an EFS filesystem until a) you delete them via a file operation from an instance that has mounted the EFS target, or b) the EFS resource itself is deleted from the console or CLI. They are independent of any instance.
This persistence is what makes EFS useful as a shareable network-attached file store. It is designed for your exact use case.
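As a minimal sketch (the file system ID, region, and mount point are placeholders), a replacement instance launched by your autoscaling group simply remounts the same file system and sees the same files:

# Run on each new instance, e.g. from user data; IDs and region are hypothetical.
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs
ls /mnt/efs   # shows the files written by the instance that was replaced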
Please be aware that you should consider backing up the EFS file share to another EFS file share, or keeping a synced backup in S3, as a safety precaution. Backup is not built into the service, but can be added via scheduled tasks, Lambdas, etc. In our system, I launch a scheduled instance once a day and sync the production EFS to a backup EFS, for safety and redundancy.
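A hedged sketch of that daily sync (the mount points and schedule are assumptions, not the actual setup described above):

# /etc/cron.d/efs-backup on the backup instance, with both file systems mounted:
0 3 * * * root rsync -a --delete /mnt/efs-prod/ /mnt/efs-backup/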
As long as you mount the same EFS on the new machine that was used on the destroyed machine, you will not lose your files.
Based on my experience and the description provided on the EFS webpage, EFS files are not lost until the EFS itself is deleted. Since it is also used for containers, which are easily thrown-away environments, it suits your load-balancer-based architecture as well.
"Amazon EFS is ideal for container storage providing persistent shared access to a common file repository"
https://aws.amazon.com/efs/

AWS Auto-Scaling

I'm trying AWS auto-scaling for the first time. As far as I understand, it creates instances if, for example, my CPU utilization reaches a critical level that I define.
So I am curious: after I launch my instance, I spend a fair amount of time configuring it and copying the data. If AWS auto-scales my instance, how will it configure the new instances and move the data to them?
You can't store any data that you want to keep on an instance that is part of an autoscaling group (well, you can, but you will lose it).
There are (at least) two ways to answer your question:
Create a 'golden image': in other words, spin up an instance, configure it, install the software, etc., and then save it as an AMI (Amazon Machine Image). Then tell the autoscaling group to use that AMI each time an instance starts, so it will be pre-configured when it starts.
Put a script on the instance that tells the instance how to configure itself when it starts up (in the user data). So basically, each time an instance scales up, it runs the script and does all the steps it needs to configure itself (a sketch follows this answer).
As for your data, best practice would be to store any data you want to keep in a database or object store that is not on the instance - something like RDS, DynamoDB, or even S3 objects.
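A minimal sketch of option 2 (every package and artifact name below is made up): a user data script that the launch configuration runs on each new instance:

#!/bin/bash
# User data: runs on first boot of every autoscaled instance.
# Package and artifact names are hypothetical.
yum install -y nginx
mkdir -p /opt/app
aws s3 cp s3://my-app-artifacts/app.tar.gz /tmp/app.tar.gz
tar -xzf /tmp/app.tar.gz -C /opt/app
systemctl enable --now nginx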
You could also use AWS EFS: store the data/scripts that the EC2 instances will be sharing there, and automatically mount it via /etc/fstab every time a new EC2 instance is created.
Once you have configured the EFS to be mounted on the EC2 instance (/etc/fstab), you should create a new AMI, and use this new AMI to create a new launch configuration and autoscaling group, so that the new instances automatically mount your EFS and are able to consume that shared data.
https://aws.amazon.com/efs/faq/
Q. What use cases is Amazon EFS intended for?
Amazon EFS is designed to provide performance for a broad spectrum of workloads and applications, including Big Data and analytics, media processing workflows, content management, web serving, and home directories.
Q. When should I use Amazon EFS vs. Amazon Simple Storage Service (S3) vs. Amazon Elastic Block Store (EBS)?
Amazon Web Services (AWS) offers cloud storage services to support a wide range of storage workloads.
Amazon EFS is a file storage service for use with Amazon EC2. Amazon EFS provides a file system interface, file system access semantics (such as strong consistency and file locking), and concurrently-accessible storage for up to thousands of Amazon EC2 instances. Amazon EBS is a block level storage service for use with Amazon EC2. Amazon EBS can deliver performance for workloads that require the lowest-latency access to data from a single EC2 instance. Amazon S3 is an object storage service. Amazon S3 makes data available through an Internet API that can be accessed anywhere.
https://docs.aws.amazon.com/efs/latest/ug/mount-fs-auto-mount-onreboot.html
You can use the file fstab to automatically mount your Amazon EFS file system whenever the Amazon EC2 instance it is mounted on reboots. There are two ways to set up automatic mounting: you can update the /etc/fstab file in your EC2 instance after you connect to the instance for the first time, or you can configure automatic mounting of your EFS file system when you create your EC2 instance.
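A hedged example of such an /etc/fstab entry (the file system ID and mount point are placeholders; this uses the plain NFSv4 client rather than the amazon-efs-utils mount helper):

# /etc/fstab line; fs-12345678 and /mnt/efs are hypothetical.
fs-12345678.efs.us-east-1.amazonaws.com:/  /mnt/efs  nfs4  nfsvers=4.1,_netdev  0  0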
I recommend using a shared data container if the data is updated and the updated data is needed by all instances that might be spinning up.
If it is database data, or you could store the needed data in a database, I would consider using RDS.
If it is static data only used to configure the instances, like dumps or configuration files which are not updated by running instances, then I would recommend pulling them from CloudFlare or S3 if it is not possible to pull them from a repository.
Good luck

How to load EBS Volume by ID via .ebextensions

I'm trying to mount the same volume for a Beanstalk build but can't figure out how to make it work with the volume-id.
I can attach a new volume, and I can attach one based on a snapshot ID but neither are what I'm after.
My current .ebextensions config:
commands:
  01umount:
    command: "umount /dev/sdh"
    ignoreErrors: true
  02mkfs:
    command: "mkfs -t ext3 /dev/sdh"
  03mkdir:
    command: "mkdir -p /media/volume1"
    ignoreErrors: true
  04mount:
    command: "mount /dev/sdh /media/volume1"

option_settings:
  - namespace: aws:autoscaling:launchconfiguration
    option_name: BlockDeviceMappings
    value: /dev/sdh=:20
Which of course will mount a new volume, not attach an existing one. Perhaps snapshot is what I want and I just don't understand the terminology here?
I need the same data that was on the volume to be on each EC2 instance that autoscaling launches... A snapshot would surely just contain the data that existed at the point the snapshot was created?
Any ideas or better approaches?
Elastic Block Store (EBS) allows you to create, snapshot/clone, and destroy virtual hard drives for EC2 instances. These drives ("volumes") can be attached to and detached from EC2 instances, but they are not a "share" or shared volume... so attaching a volume by ID becomes a non-useful idea after the first instance launched.
EBS volumes are hard drives. The analogy is imprecise (because they're on a SAN) but much the same way as you can't physically install the same hard drive in multiple servers, you can't attach an EBS volume to multiple instances (SAN != NAS).
Designing with a cloud mindset, all of your fixed resources would actually be on the snapshot (disk image) you deploy when you release a new version and then use to spawn each fresh auto-scaled instance... and nothing persistent would be stored there because -- just as important as scaling up, is scaling down. Autoscaled instances go away when not needed.
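As a hedged illustration of that mindset (the instance ID and names are placeholders), baking a fresh deployment image for the autoscaling group to launch from looks like:

# Bake the configured instance into an AMI; all identifiers are hypothetical.
aws ec2 create-image \
    --instance-id i-0123456789abcdef0 \
    --name "app-golden-image-v42" \
    --description "Pre-configured image for autoscaled instances"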
AWS has Simple Storage Service (S3) which is commonly used for storing things like documents, avatars, images, videos, and other resources that need to be accessible in a distributed environment. It is not a filesystem, and can't properly be compared to a filesystem, because it's an object store... but is a highly scalable and highly available storage service that is well-suited to distributed applications. s3fs allows an S3 "bucket" to be mounted into your machine's filesystem, but this is no panacea. That mechanism should be reserved for back-end process use, if you use it at all, because it's not appropriate for resources like code or templates, and will not perform as well for serving up content as S3 will perform if used as designed, with clients directly accessing it over https. You can secure the content through more than one mechanism, as documented.
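If you do use s3fs for a back-end process, a minimal sketch (the bucket name and mount point are invented; assumes the s3fs-fuse package is installed and the instance role can access the bucket):

# Mount an S3 bucket as a filesystem; names are hypothetical.
sudo mkdir -p /mnt/s3-bucket
s3fs my-app-bucket /mnt/s3-bucket -o iam_role=auto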
AWS also now has Elastic File System (EFS), which sets up an array of storage that you can mount from all of your machines using NFS. AWS provides the NFS server and the back-end storage. Unlike EBS, you do not need to know how much storage to provision up front, because it scales up and down based on what you've stored, billing you only for what you use. This service is still in "preview" as of this writing, so it should not be used for production data.
Or, you can manually configure your own NFS server and mount it from the autoscaling machines. Making such a setup fail-safe is a bit tricky, though.
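A bare-bones sketch of that do-it-yourself option (the hostname, paths, and subnet are all assumptions, and this ignores the fail-safe problem just mentioned):

# On the self-managed NFS server: export a directory to the VPC subnet.
echo "/srv/shared 10.0.0.0/16(rw,sync,no_subtree_check)" | sudo tee -a /etc/exports
sudo exportfs -ra

# On each autoscaled client:
sudo mkdir -p /mnt/shared
sudo mount -t nfs nfs-server.internal:/srv/shared /mnt/shared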

Persistent storage on Elastic Beanstalk

How can I attach persistent storage to Elastic Beanstalk?
I know I need a .config file where I set the parameters of the environment, to be run every time an instance is created.
My goal is to have a volume, let's say 100GB, that persists even if the instances get deleted/terminated: a volume with persistent data that all instances can access to read from.
I could use S3 to store this data, but it would require changes to the application, and latency could be a problem.
This way I could access the filesystem like on any common server.
AWS now offers a solution called Elastic File System (Amazon EFS) that lets multiple instances access a shared file store.
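A minimal .ebextensions sketch for mounting an EFS file system on every instance in the environment (the file system ID, region, and mount point are placeholders, not values from the question):

commands:
  01mkdir:
    command: "mkdir -p /mnt/efs"
    ignoreErrors: true
  02mount:
    # fs-12345678 and us-east-1 are hypothetical.
    command: "mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs"
    ignoreErrors: true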
If your desire is to have a central data repository that all EC2 instances can access, then Amazon S3 would be your best option.
Normal disk volumes are provided via Elastic Block Store (EBS). EBS volumes can only be mounted to one EC2 instance at a time. Therefore, to share data that is contained on an EBS volume, you will need to use normal network sharing methods to mount network volumes.
However, if your goal is to provide shared access without one specific instance sharing a volume to other instances, then it is better to use S3 because it is accessible from all instances. It would likely be worth the effort of modifying your application to take advantage of S3.