ECS with persistent data on EFS or EBS with CloudFormation

ECS with persistent data on EFS or EBS with CloudFormation - amazon-web-services

I am looking for some expert with AWS to help me with this thing. I've spent almost one week trying to deploy my backend docker image to AWS with no 100% of the desired behaviour.
Firstly I was suggested to try out the new Fargate service AWS recently provided. I managed to deploy everything that I needed but it quickly turned out that I need any kind of data persistence which is unavailable with Fargate for the moment from what I've read.
I found those templates which are very helpful because AWS is soooo big and overwhelming so I would do nothing without those and currently tried the deploy using EC2 instances. https://github.com/awslabs/aws-cloudformation-templates/tree/master/aws/services/ECS/EC2LaunchType
I've a question to this kind of deploy:
1: Why this deploy creates 2 instances of EBS for the cluster? One 8GB with snapshot and second with 22GB in size without snapshot.
2: Is it possible to reduce the size of those EBS volumes ?? If so how ?
3: Is it possible to have just only one of those EBS ?
4: Is it possible to mount volue from my docker backend image to those EBS volumes to persist data ? If so how ? I need to mount two volumes for my backend that is
/root/.local/share/Bisq /root/.local/share/bisq-api or ~/.local/share/Bisq ~/.local/share/bisq-api im not quite sure how this works with AWS. What are the paths compared to the steps on the local environment.
5: Would it be better to use EBS or EFS for data persistance? Problem with EFS is that i Can't find any related documents how to connect EFS to this kind of ECS deploy. Everything MUST be with use of CloudFormation templates
Overall the requirements that would match 100% desired behaviour are:
1: CloudFormation Template/tempalte's that deploy neccesarry services as less as possible in order to not build huge infrastructure to preserve the COSTs and ability to just click button and get external link to the backend service. (There can't be any manuall configuration everything must be automated)
2: Ability to stop/start EC2 instace for the backend container (EC2 will be running up from few minutes to hours per day / few days a month. (dependent on each user scenario how often he will be using the backend)
3: Ability to preserve the data when user stops the instance and then starts it in the future point in time.
I would apprecieate any help/suggestions because Im starting to lose my HEAD over everything what connects to AWS services. This is really huge problem to understand any use cases for AWS so I would appreciate help
Thank you!

1: Why this deploy creates 2 instances of EBS for the cluster? One 8GB with snapshot and second with 22GB in size without snapshot.
As per the docs:
Amazon ECS-optimized AMIs from version 2015.09.d and later launch with an 8-GiB volume for the operating system that is attached at /dev/xvda and mounted as the root of the file system. There is an additional 22-GiB volume that is attached at /dev/xvdcz that Docker uses for image and metadata storage.
Here the reference: ecs-ami-storage-config
2: Is it possible to reduce the size of those EBS volumes ?? If so how ?
Also from the same docs:
You can increase these default volume sizes by changing the block device mapping settings for your instances when you launch them; however, you cannot specify a smaller volume size than the default.
3: Is it possible to have just only one of those EBS ?
For this you can better use a custom AMI and configure it as needed.
4: Is it possible to mount volume from my docker backend image to those EBS volumes to persist data ? If so how ? I need to mount two volumes for my backend that is /root/.local/share/Bisq /root/.local/share/bisq-api or ~/.local/share/Bisq ~/.local/share/bisq-api im not quite sure how this works with AWS. What are the paths compared to the steps on the local environment.
This is configured in the task definition:
AWS::ECS::TaskDefinition
Type: AWS::ECS::TaskDefinition
Properties:
Volumes:
- Volume Definition
...
5: Would it be better to use EBS or EFS for data persistance? Problem with EFS is that i Can't find any related documents how to connect EFS to this kind of ECS deploy. Everything MUST be with use of CloudFormation templates
It depends on your use case scenario. Each one has advantages and disadvantages depending on your needs, so better read the docs for each and chose the best one accordingly. In the templates you found, you can customize the LaunchConfiguration UserData to run the attach commands. You can do all this in CloudFormation.
Additionally I will leave you this documentation for mounting automatically an EFS: Mounting Your Amazon EFS File System Automatically

Related

Automate AWS AMI creation without downtime and Data loss

I wanted to know is it possible to automate the creation of AMI in AWS without downtime and data loss, if possible how can we achieve it.
I have use system manager-> maintenance window in which i have set the reboot to true for data integrity, but i need a way so that the data is not lost.
Any help will be appreciated.
Thank-you.

Answering it as per comments discussion, question is somehow still vague to me
You have EBS right now. I'm not sure if your Instances are in Same AZ or not. If they are in same AZ then you can use EBS multi attach feature (available for IO volumes only) to share same storage with all of them.
Regarding backup you can choose EBS snapshots
Ideally my suggestion to you would be create a launch template, use EFS that can be mounted to multiple instances in same region, if you want it across regions then create mount targets. EFS is natively integrated with AWS backup.
Whenever any failover happens or your EC2 crashes for any reason and it goes less than your target capacity, auto scaling would automatically provision a new instance using launch template which would be using same EFS

but i need a way so that the data is not lost.
if you want to achieve this, then According to Docs, you need to ensure that application or os is not writing to ebs, which can be managed by either a script or a custom logic.
You can take a snapshot of an attached volume that is in use. However, snapshots only capture data that has been written to your Amazon EBS volume at the time the snapshot command is issued. This might exclude any data that has been cached by any applications or the operating system. If you can pause any file writes to the volume long enough to take a snapshot, your snapshot should be complete. However, if you can't pause all file writes to the volume, you should unmount the volume from within the instance, issue the snapshot command, and then remount the volume to ensure a consistent and complete snapshot.
if you achieved the above then you can automate the creation, retention, and deletion of EBS snapshots and EBS-backed AMIs it using Data Lifecycle Manager
I haven't tried this but I think exporting VM to S3 and then automating the entire pipeline with Ec2 image builder should do the trick, you can customise your further images with build components
Refers importing and exporting vm's
Unfortunately there is not of box solution other than compromising data integrity but you can try above mentioned which can ensure data integrity and automation

How to make EBS volume to be available with zero downtime and attach to EC2 instance when ebs volume crashes?

I have one r5.xlarge windows ec2 instance which is attached to 6TB EBS volume and the backup for EBS is taken every week.
Now I want a better solution with zero downtime.
if ec2 instance fails I want new ec2 instance to be to created and it should be attached to EBS volume automatically.
if EBS volume crashes I want backup snapshot to be available and gets attached to ec2 instance with zero downtime and volume content available immediately.
Is there a way to implement solution for this and how?

If you want to improve the availability of the large drive, I would recommend FSx. It's not so trivial - you need to set up Active Directory, and your EC2 needs to join AD - but once you have all these components (and there is additional cost as well) - FSx provides significantly higher availability than EBS assuming you're using a multi-AZ setup.
More information here
Note that load balancer tags are irrelevant to your question - although to address your first situation you probably need load balancer balancing two EC2 instances

You should try and use Elastic BeanStalk, this will handle the autoscaling on its own. You just have to configure the settings while creating it.
This should be the nearest solution that you need, as explained by Marcin above there is no zero downtime for 2 single point of failures. So Elastic BeanStalk seems to be the best solution for you.
Updated:
So in this case what you can do is add a load balance and add auto scaling, so that the first point of your questions is taken care and if the ec2 instance goes down through autoscaling a new instance will be create and in the shell script you can add commands to map to the EBS.
And for EBS, there isn't much you can do, you can take regular backups and if the EBS fails then replace it with the latest backup or snapshot, also check the below link:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-restoring-volume.html

AWS Central Mounting for Autoscaling

We need an option to mount /var/www/html/ folder on multiple servers to EFS or some other services for setting up autoscaling environment.
Firstly we have used EFS and faced an issue such that, at high throughput, and if the file size is not increasing, then it will burst.
So could you please suggest an alternative for high throughput and the file size not increasing rapidly.

I have tried using EFS with auto-scaling and I have faced the same issue as yours then I have done the below solution.
I have mounted a drive at any mount point at "/home/code" which is using XFS filesystem and deploy the code in this particular directory. I am taking up snapshot after every deployment using script (Link).
I am using the ec2 bootstrap script to create a volume from this snapshot with high IOPS which automatically attaches to the new instance which are coming in.
Alternatively, you can tar your code at every time you deploy and copy the code to the new servers which are coming in through bootstrap script itself.

AWS - HA NFS - Best practices

Anyone have a sound strategy for implementing NFS on AWS in such a way that it's not a SPoF (single point of failure), or at the very least, be able to recover quickly if an instance crashes?
I've read this SO post, relating to the ability to share files with multiple EC2 instances, but it doesn't answer the question of how to ensure HA with NFS on AWS, just that NFS can be used.
A lot of online assets are saying that AWS EFS is available, but it is still in preview mode and only available in the Oregon region, our primary VPC is located in N. Cali., so can't use this option.
Other online assets are saying that GlusterFS is a way to go, but after some research I just don't feel comfortable implementing this solution due to race conditions and performance concerns.
Another options is SoftNAS but I want to avoid bringing in an unknown AMI into a tightly controlled, homogeneous environment.
Which leaves NFS. NFS is what we use in our dev environment and works fine, but it's dev, so if it crashes we go get a couple beers while systems fixes the problem, but on production, this is obviously a no go.
The best solution I can come up with at this point is to create an EBS and two EC2 instances. Both instances will be updated as normal (via puppet) to maintain stack alignment (kernel, nfs libs etc), but only one instance will mount the EBS. We set up a monitor on the active NFS instance, and if it goes down, we are notified and we manually detach and attach to the backup EC2 instance. I'm thinking we also create a network interface that can also be de/re-attached so we only need to maintain a single IP in DNS.
Although I suppose we could do this automatically with keepalived, and a IAM policy that will allow the automatic detachment/re-attachment.
--UPDATE--
It looks like EBS volumes are tied to specific availability zones, so re-attaching to an instance in another AZ is impossible. The only other option I can think of is:
Create EC2 in each AZ, in public subnet (each have EIP)
Create route 53 healthcheck for TCP:2049
Create route 53 failover policies for nfs-1 (AZ1) and nfs-2 (AZ2)
The only question here is, what's the best way to keep the two NFS servers in-sync? Just cron an rsync script between them?
Or is there a best practice that I am completely missing?

There are a few options to build a highly available NFS server. Though I prefer using EFS or GlusterFS because all these solutions have their downsides.
a) DRBD
It is possible to synchronize volumes with the help of DRBD. This allows you to mirror your data. Use two EC2 instances in different availability zones for high availability. Downside: configuration and operation is complex.
b) EBS Snapshots
If a RPO of more than 30 minutes is reasonable you can use periodic EBS snapshots to be able to recover from an outage in another availability zone. This can be achieved with an Auto Scaling Group running a single EC2 instance, a user-data script and a cronjob for periodic EBS snapshots. Downside: RPO > 30 min.
c) S3 Synchronisation
It is possible to synchronize the state of an EC2 instance acting as NFS server to S3. The standby server uses S3 to stay up to date. Downside: S3 sync of lots of small files will take too long.
I recommend watching this talk from AWS re:Invent: https://youtu.be/xbuiIwEOCAs

AWS has reviewed and approved a number of SoftNAS AMIs, which are available on AWS Marketplace. The jointly published SoftNAS Architecture on AWS White Paper provides more details:
Security (pages 4-11)
HA across AZs (pages 13-14)
You can also try a 30 day free trial to see if it meets your needs.
http://softnas.com/tryaws
Full disclosure: I work for SoftNAS.

How to load ESB Volume by ID via .ebextensions

I'm trying to mount the same volume for a Beanstalk build but can't figure out how to make it work with the volume-id.
I can attach a new volume, and I can attach one based on a snapshot ID but neither are what I'm after.
My current .ebextension
commands:
01umount:
command: "umount /dev/sdh"
ignoreErrors: true
02mkfs:
command: "mkfs -t ext3 /dev/sdh"
03mkdir:
command: "mkdir -p /media/volume1"
ignoreErrors: true
04mount:
command: "mount /dev/sdh /media/volume1"
option_settings:
- namespace: aws:autoscaling:launchconfiguration
option_name: BlockDeviceMappings
value: /dev/sdh=:20
Which of course will mount a new volume, not attach an existing one. Perhaps snapshot is what I want and I just don't understand the terminology here?
I need the same data that was on the volume when the autoscaling kicks in to be on each EC2 instants that scales... A snapshot would surely just be the data that existed at the point the snapshot was created?
Any ideas or better approaches?

Elastic Block Store (EBS) allows you to create, snapshot/clone, and destroy virtual hard drives for EC2 instances. These drives ("volumes") can be attached to and detached from EC2 instances, but they are not a "share" or shared volume... so attaching a volume by ID becomes a non-useful idea after the first instance launched.
EBS volumes are hard drives. The analogy is imprecise (because they're on a SAN) but much the same way as you can't physically install the same hard drive in multiple servers, you can't attach an EBS volume to multiple instances (SAN != NAS).
Designing with a cloud mindset, all of your fixed resources would actually be on the snapshot (disk image) you deploy when you release a new version and then use to spawn each fresh auto-scaled instance... and nothing persistent would be stored there because -- just as important as scaling up, is scaling down. Autoscaled instances go away when not needed.
AWS has Simple Storage Service (S3) which is commonly used for storing things like documents, avatars, images, videos, and other resources that need to be accessible in a distributed environment. It is not a filesystem, and can't properly be compared to a filesystem, because it's an object store... but is a highly scalable and highly available storage service that is well-suited to distributed applications. s3fs allows an S3 "bucket" to be mounted into your machine's filesystem, but this is no panacea. That mechanism should be reserved for back-end process use, if you use it at all, because it's not appropriate for resources like code or templates, and will not perform as well for serving up content as S3 will perform if used as designed, with clients directly accessing it over https. You can secure the content through more than one mechanism, as documented.
AWS also now has Elastic File System (EFS) which sets up an array of storage that you can mount from all of your machines, using NFS. AWS provides the NFS server and the back-end storage. Unlike EBS, you do not need to know how much storage to provision up front, because it scales up and down based on what you've stored, billing you This service is still in "preview" as of this writing, so should not be used for production data.
Or, you can manually configure your own NFS server and mount it from the autoscaling machines. Making such as setup fail-safe is a bit tricky, though.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js