AWS Automatic Attach EBS Volume to EC2 Instances behind an Elastic Beanstalk - amazon-web-services

I am facing an architecture-related problem:
I have created a new environment in Elastic Beanstalk and pushed my app there. All good so far. I have set it to auto scale up and down.
My app depends on filesystem storage (it creates files and then serves them to users). I am using a 5 GB EBS volume to create the files, then push them to S3 and delete them from EBS. I'm using EBS because the filesystem on EC2 instances is ephemeral.
When AWS scales up, the new instances don't have the EBS volume attached, because an EBS volume can be attached to only one instance at a time.
When it scales down, it shuts down the instance that had the EBS volume attached, which totally messes things up.
I have added a line to /etc/fstab that automatically mounts the EBS volume at /data, but that only applies to the instance whose /etc/fstab I edited. I guess the solution would be to create a customized AMI with that line. But again, an EBS volume can't be attached to more than one instance at a time, so it seems like a dead end.
What am I getting wrong? What would be a possible solution or the proper way of doing it?
For some reason, I believe that using S3 is not the right way of doing it.

S3 is a fine way to do it: your application creates the file, uploads it to S3, removes the file from the local filesystem, and hands the client a URL for accessing the file. Totally reasonable. But why can't you use ephemeral storage for this? Instance store-backed instances have additional storage available, mounted at /mnt by default. Why can't the application create the file there? If the files don't need to persist across instance stop/start/reboot, there's no great reason to use EBS (unless you want faster boot times for your autoscaled instances, I suppose).
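To make the create/upload/delete/return-URL flow concrete, here is a minimal Python sketch. The function name `publish_file` and the `upload` and `url_for` callables are hypothetical placeholders of mine, not anything from the question; in a real app they might wrap boto3's `upload_file` and `generate_presigned_url`, and `scratch_dir` could point at the instance's ephemeral storage such as /mnt.

```python
import os
import tempfile
from typing import Callable, Optional


def publish_file(data: bytes, key: str,
                 upload: Callable[[str, str], None],
                 url_for: Callable[[str], str],
                 scratch_dir: Optional[str] = None) -> str:
    """Write data to local scratch space, upload it, delete the local copy, return a URL."""
    # scratch_dir=None falls back to the system temp directory.
    fd, path = tempfile.mkstemp(dir=scratch_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # upload(local_path, key) is an assumed callable, e.g. a thin wrapper around S3.
        upload(path, key)
    finally:
        # The local copy is removed whether or not the upload succeeded.
        os.remove(path)
    return url_for(key)
```

Because nothing has to survive on the instance after the upload, it does not matter which instance handles the request or whether that instance is later terminated by a scale-down.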

Automate AWS AMI creation without downtime and Data loss

I want to know whether it is possible to automate the creation of AMIs in AWS without downtime or data loss and, if so, how to achieve it.
I have used Systems Manager > Maintenance Windows, with reboot set to true for data integrity, but I need a way to make sure no data is lost.
Any help will be appreciated.
Thank you.
Answering based on the discussion in the comments; the question is still somewhat vague to me.
You have EBS right now. I'm not sure whether your instances are in the same AZ. If they are, you can use the EBS Multi-Attach feature (available only for Provisioned IOPS io1/io2 volumes) to share the same storage with all of them.
For backups, you can use EBS snapshots.
Ideally, my suggestion would be to create a launch template and use EFS, which can be mounted on multiple instances in the same region; create a mount target in each Availability Zone where the instances run. EFS is natively integrated with AWS Backup.
Whenever a failover happens, or an EC2 instance crashes for any reason and the group drops below your target capacity, Auto Scaling automatically provisions a new instance from the launch template, which mounts the same EFS.
"but i need a way so that the data is not lost."
If you want to achieve this then, according to the docs, you need to ensure that neither the application nor the OS is writing to the EBS volume while the snapshot is taken, which can be managed with a script or custom logic:
You can take a snapshot of an attached volume that is in use. However, snapshots only capture data that has been written to your Amazon EBS volume at the time the snapshot command is issued. This might exclude any data that has been cached by any applications or the operating system. If you can pause any file writes to the volume long enough to take a snapshot, your snapshot should be complete. However, if you can't pause all file writes to the volume, you should unmount the volume from within the instance, issue the snapshot command, and then remount the volume to ensure a consistent and complete snapshot.
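The pause/snapshot/resume sequence from the quoted documentation could be sketched roughly as below. This is only one way to "pause file writes": it uses `fsfreeze` from util-linux, which is my substitution rather than something the docs excerpt names, and it assumes a Linux instance, root privileges, a configured AWS CLI, and a data volume that is not the root filesystem. The function names and the injectable `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def snapshot_with_paused_writes(mount_point: str, volume_id: str,
                                run: Runner = run_checked) -> None:
    """Freeze the filesystem, request an EBS snapshot, then always unfreeze."""
    # Freezing flushes pending writes and blocks new ones on this mount point.
    run(["fsfreeze", "--freeze", mount_point])
    try:
        run(["aws", "ec2", "create-snapshot",
             "--volume-id", volume_id,
             "--description", f"consistent snapshot of {mount_point}"])
    finally:
        # Unfreeze even if the snapshot request failed, so the app is not left blocked.
        run(["fsfreeze", "--unfreeze", mount_point])
```

Passing a different `run` callable (for example one that only logs the commands) gives a dry run before pointing this at a real volume.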
Once you have achieved the above, you can automate the creation, retention, and deletion of EBS snapshots and EBS-backed AMIs using Data Lifecycle Manager.
I haven't tried this, but I think exporting the VM to S3 and then automating the entire pipeline with EC2 Image Builder should do the trick; you can customise your further images with build components.
See the documentation on importing and exporting VMs.
Unfortunately, there is no out-of-the-box solution other than ones that compromise data integrity, but you can try the approaches above, which can provide both data integrity and automation.

How do I sync a folder from one EBS volume to another for the same EC2?

I have an EC2 instance with EBS volumes A and B attached to it, and I want to copy/replicate/sync the data from a specific folder in EBS A to EBS B.
EBS A is the primary volume which hosts application installation data and user data, and I'm looking to effectively backup the user data (which is just a specific directory) to EBS B in the event that the application install gets corrupted or needs to be blown away. That way I can simply stand up a new EC2 with a new primary EBS, call it C, attach EBS B to it, and push the user data from EBS B into EBS C.
I am using Amazon Linux 2 and have already formatted and mounted the backup EBS volume. I can manually copy data from EBS A to EBS B, but I was hoping someone could point me towards best practices for keeping the directory data in sync between the two volumes.
I have found recommendations for rsync, a cron task, and gluster for similar use cases. Would it be considered good practice to use one of these for my use case?
While you can use rsync, a better alternative is Data Lifecycle Manager, which takes automated EBS snapshots.
The reason it's better is that you can keep a fixed number of snapshots, taken at a fixed time interval, so you aren't forced to restore the latest one (important if the "current" data is corrupted).
To use this most effectively, I would separate the boot volume from the application/data volume(s). So you could just restore the snapshot, spin up a new instance, and mount the restored volume to it.
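To illustrate what "a fixed number of snapshots" means in practice, here is a small sketch of a count-based retention rule. Data Lifecycle Manager applies this kind of rule for you; the function below is a hypothetical illustration of the idea, not its implementation, and the snapshot IDs are made up.

```python
from datetime import datetime
from typing import Dict, List


def snapshots_to_delete(snapshots: Dict[str, datetime], keep: int) -> List[str]:
    """Return the IDs of all but the `keep` newest snapshots, oldest first."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Sort snapshot IDs by creation time, newest first.
    newest_first = sorted(snapshots, key=snapshots.get, reverse=True)
    # Everything past the first `keep` entries is eligible for deletion.
    return list(reversed(newest_first[keep:]))
```

With daily snapshots and `keep=7`, for example, you would always have a week of restore points, so a corruption noticed a few days late can still be rolled back past.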

Will EFS maintain files after instance is terminated?

I am using CloudFormation heavily at the moment, and I am currently dealing with three stacks. The first stack is my load balancer stack, which essentially just has an application load balancer as its resource. The second stack has only a single resource: an elastic file system. The third stack is my main stack, where I have an autoscaling group behind the load balancer from the first stack. This autoscaling group also mounts the EFS from the second stack onto each new instance. If my autoscaling group were to kill an unhealthy instance and then initialize a new one to take its place, would the new instance keep all of the files that were in the EFS the old instance had mounted?
Basically, I am just wondering if files in an EFS in a particular instance will remain if that same file system is mounted in a different instance.
Yes. Files persist in an EFS filesystem until a) you delete them via a file operation from an instance that has mounted the EFS target, or b) the EFS resource itself is deleted from the console or CLI. They are independent of any instance.
This persistence is what makes EFS useful as a shareable network-attached file store. It is designed for your exact use case.
Please be aware that you should consider backing up the EFS file system to another EFS file system, or syncing a backup to S3, as a safety precaution. Do not assume backups are happening unless you have set them up, for example with AWS Backup, scheduled tasks, or Lambdas. In our system, I launch a scheduled instance once a day and sync the production EFS with a backup EFS, for security and redundancy.
As long as you mount the same EFS on the new machine that was used on the destroyed machine, you will not lose your files.
Based on my experience and the description on the EFS webpage, EFS files are not lost until the EFS file system itself is deleted. Since it is also used for containers, which are easily thrown-away environments, it would suit your ELB-based architecture as well.
"Amazon EFS is ideal for container storage providing persistent shared access to a common file repository"
https://aws.amazon.com/efs/

Persistent storage on Elastic Beanstalk

How can I attach persistent storage on Elastic Beanstalk?
I know I need a .config file in which I set the environment parameters to apply every time an instance is created.
My goal is to have a volume, let's say 100 GB, that keeps its data even if the instances are deleted/terminated, and that all instances can read from.
I could use S3 to store this data, but that would require changes to the application, and latency could be a problem.
This way I could access the filesystem as on any ordinary server.
AWS now offers a solution called Elastic File System (Amazon EFS) that lets multiple instances access a shared file store.
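As a rough sketch of what mounting EFS on each instance involves, the helper below builds an /etc/fstab line that uses the NFS options AWS recommends for EFS. The function name, file system ID, region, and mount point are placeholders I made up; on Elastic Beanstalk, something equivalent would typically be run from a .config file at instance launch, and the EFS mount helper (amazon-efs-utils) is an alternative to plain NFS.

```python
def efs_fstab_entry(file_system_id: str, region: str, mount_point: str) -> str:
    """Build an /etc/fstab line that mounts an EFS file system over NFSv4.1."""
    # EFS exposes a regional DNS name of the form <fs-id>.efs.<region>.amazonaws.com.
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    # _netdev delays mounting until the network is up.
    options = ("nfsvers=4.1,rsize=1048576,wsize=1048576,hard,"
               "timeo=600,retrans=2,noresvport,_netdev")
    return f"{dns_name}:/ {mount_point} nfs4 {options} 0 0"


# Placeholder values for illustration only.
print(efs_fstab_entry("fs-12345678", "us-east-1", "/data"))
```

Every instance that appends the same line and mounts it sees the same files, which is exactly what a single-attach EBS volume cannot give you.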
If your desire is to have a central data repository that all EC2 instances can access, then Amazon S3 would be your best option.
Normal disk volumes are provided via Elastic Block Store (EBS). EBS volumes can only be mounted to one EC2 instance at a time. Therefore, to share data that is contained on an EBS volume, you will need to use normal network sharing methods to mount network volumes.
However, if your goal is to provide shared access without one specific instance sharing a volume to other instances, then it is better to use S3 because it is accessible from all instances. It would likely be worth the effort of modifying your application to take advantage of S3.

aws - can I use EC2 without S3?

Why and when exactly should I use EC2 with S3?
I'm using EC2 to install tools like Gitlab and Rundeck. It works fine without S3 storage.
My only concern: if I terminate instances, will I lose my files?
Short answer: Yes, you can use EC2 without S3. S3 is cloud storage and isn't used for EC2 images.
S3 is used for storing files, such as distributions, backups, and can even be used for static websites.
To answer the second part of your question: when creating storage for a new EC2 instance, uncheck Delete on Termination so that the volume is kept if you ever terminate the EC2 instance.
Be careful though, I've had problems in the past where AWS will not let you reuse volumes that were used with a marketplace image.
EC2 uses EBS, not S3, for storing the volumes. (In fact, I don't exactly know how to make it use anything besides EBS. S3 is used for AMIs, which are basically templates that are copied to EBS when creating an instance.)
Option 1: Don't terminate your instances. Note that terminate means delete, not stop. You can stop them without terminating them.
Option 2: Configure your EBS volumes to not be deleted on termination. The volume will be detached rather than deleted. You can then attach it to another machine later.
You can also attach multiple EBS volumes to an instance, so if you want to save your data only but discard the OS, simply place your data on a secondary volume. The primary volume can be deleted and the secondary volume can be preserved. Delete-on-termination can be configured per-volume.
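The per-volume setting could look like the sketch below, which builds the `BlockDeviceMappings` structure that boto3's `run_instances` accepts. The function name, the device names, and the size are assumptions for illustration; the real root device name depends on the AMI.

```python
from typing import Any, Dict, List


def block_device_mappings(root_device: str, data_device: str,
                          data_size_gib: int) -> List[Dict[str, Any]]:
    """Root volume is discarded on termination; the data volume is preserved."""
    return [
        # OS volume: deleted together with the instance.
        {"DeviceName": root_device, "Ebs": {"DeleteOnTermination": True}},
        # Data volume: survives termination and can be attached elsewhere later.
        {"DeviceName": data_device,
         "Ebs": {"VolumeSize": data_size_gib, "DeleteOnTermination": False}},
    ]


# Assumed device names; check the AMI's actual root device before using them.
mappings = block_device_mappings("/dev/xvda", "/dev/sdf", 100)
```

The resulting list would be passed as the `BlockDeviceMappings` argument when launching the instance, alongside the usual image ID and instance type.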