AWS EC2 Apache file persistence - amazon-web-services

I'm new to AWS and a little perplexed about the situation with the /var/www/html folder in an EC2 instance on which Apache has been installed.
After setting up an Elastic Beanstalk environment and uploading the files, I see that these are stored in the regular /var/www/html folder of the instance.
From reading the AWS documentation, it seems that instances may be deleted and re-provisioned, which is why use of an S3 bucket, EFS or EBS is recommended.
Why, then, is source code stored on the EC2 instance when using an Apache server? Won't these files potentially be deleted with the instance?

If you manually uploaded some files and data to /var/www/html, then of course they will be wiped out when AWS replaces or deletes the instance, e.g. due to autoscaling.
All files that you use on Elastic Beanstalk should be part of your deployment package, and any files that your users upload, for example, should be stored outside of EB, e.g. on S3.

Even if the instance is terminated for some reason, the source code is part of your deployment package on Beanstalk, so it can provision a replacement instance with the exact same application and configuration. You do lose the data on the instance, but it doesn't matter.
The data-loss concern applies to anything that is not part of the automated provisioning/deployment, i.e. any manual configuration changes or any data your application writes to ephemeral storage. That is what you would need a more persistent storage option for.

It seems that, when the app is first deployed, all files are uploaded to an S3 bucket, from where they are copied into the relevant directory of each new instance. If new instances have to be created (such as for auto scaling) or replaced, they also pull the code from that S3 bucket. This is also how the app is redeployed: the bucket is updated and each instance makes a new copy of the code.
Apologies if this is stating the obvious to some people, but I had seen a few similar queries.
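If you want to see this for yourself, the CLI can list where Beanstalk keeps each application version's source bundle (the application name below is just a placeholder):

    # Placeholder application name: lists the S3 bucket/key of each version's
    # source bundle, which Beanstalk re-deploys onto any replacement instance.
    aws elasticbeanstalk describe-application-versions \
        --application-name my-app \
        --query 'ApplicationVersions[].{Label:VersionLabel,Bucket:SourceBundle.S3Bucket,Key:SourceBundle.S3Key}' \
        --output table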

Related

Automating certificate installation using config files in .ebextensions on AWS

My application is deployed on Elastic Beanstalk on AWS. It accesses an API that requires an SSL certificate to be installed on the instance. I have to manually run the keytool command to import the certificate file every time the instance is rebuilt: whenever Elastic Beanstalk rebuilds the EC2 instance, the installed certificates are lost and I have to transfer the certificate file and import it all over again.
I think .ebextensions could be a solution to this problem, but I am not able to work out exactly how to use it.
Please help me with some directions here.
First, create the certificate file in question and put it into an S3 bucket. I'd recommend having it encrypted, and making sure there are no public permissions on the file, for security purposes. Then create a .ebextensions folder in your application source root, and inside it a .config file named however you want.
This file needs to spell out where to grab the cert from and where to put it. The AWS documentation spells out how to grab a file from S3 and put it somewhere on the instance. The instance profile it talks about is basically a way to allow your instance to talk to S3 without needing to store credentials in a file somewhere; you'll need to make sure it has at least read permission on the bucket to pull the file.
Once this is all set up, Beanstalk should have the file on the instance when all is said and done. Another option is to generate a custom AMI with the key already on the file system; just be aware of the performance considerations mentioned in the documentation.
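As a rough sketch, these are the commands the .config ultimately needs to run on each fresh instance (for example from a container_commands entry); the bucket name, certificate file, alias and keystore path below are all assumptions:

    #!/bin/bash
    # Sketch only: bucket name, cert file, alias and keystore path are assumptions.
    set -euo pipefail

    # Pull the certificate from a private bucket; the instance profile supplies
    # the credentials, so nothing sensitive lives in the source bundle.
    aws s3 cp s3://my-cert-bucket/partner-api.crt /tmp/partner-api.crt

    # Import it into the JVM trust store (the cacerts path varies by Java version).
    keytool -importcert -noprompt -trustcacerts \
        -alias partner-api \
        -file /tmp/partner-api.crt \
        -keystore "$JAVA_HOME/jre/lib/security/cacerts" \
        -storepass changeit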

AWS Windows EC2 Pull From S3 on Upload

I have a subset of Windows EC2 instances that I would like to continuously copy files to whenever files are uploaded to a specific S3 bucket. Files will be uploaded to this bucket anywhere from once a month to several times a month, but they will need to be copied to the instances within an hour of upload. EC2 instances will be continually added to and removed from this subset of instances. I would like this functionality to be controlled by the EC2 instance, so that whenever a new instance is created it can be configured to pull from this bucket. Ideally, this would be instantaneous upon upload (vs. a cron job running periodically). I have researched AWS Lambda and S3 event notifications, and I am unsure if these are the correct methods to use. What solution is best suited to this model of copying files?
If you don't need "real time" presence of the files, you can run aws s3 sync on each instance from a cron job (the easy option); otherwise, an S3 event notification can trigger a Lambda function that delivers the sync command to the instances via EC2 Run Command.
If the instances are in an Auto Scaling group, you can use aws s3 cp in the user data section of your launch configuration to accomplish this.
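For the Run Command route, the fan-out issued by whatever reacts to the S3 notification (a small Lambda, for instance) might look roughly like this; the tag key, bucket and target folder are assumptions, and the instances need the SSM agent plus an instance profile allowing S3 reads:

    # Sketch: push the sync command to every running instance tagged S3Sync=true,
    # so newly launched instances are covered automatically.
    aws ssm send-command \
        --document-name "AWS-RunPowerShellScript" \
        --targets "Key=tag:S3Sync,Values=true" \
        --comment "Pull the latest drop from S3" \
        --parameters '{"commands":["aws s3 sync s3://my-drop-bucket C:\\drop"]}'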

Where to store public data and how to deploy it from GitHub

I'm working on a project which is hosted on an AWS EC2 instance. I used CodeDeploy to deploy the app from GitHub to EC2, but I want to store public data such as stylesheets, JS, images etc. on S3. Is it even possible to deploy the app to EC2 and S3 in one step? Or should I place all files on the EC2 instance only?
I've been reading the AWS documentation about Elastic Beanstalk, CodeDeploy, CodePipeline, OpsWorks and others for two days, but I'm confused.
It sounds like you want two steps in your deployment: one where you update your static assets in S3, and another where you update your servers and dynamic content on EC2 instances.
Here are some options:
1. Since they are static, just have every EC2 host upload the static assets to your bucket in a BeforeInstall script. You would need to include the static content as part of the bundle you use with CodeDeploy.
2. Use a leader-election algorithm to do (1) from a single host. You could deploy something like ZooKeeper as part of your CodeDeploy deployment.
3. Upload your static assets separately from your CodeDeploy deployment. You might want to look at CodePipeline as a solution for a more complex multi-stage deployment (it can use CodeDeploy for the server deployment).
In any case, you will want to make sure that you aren't just overwriting your static assets, or you'll end up in a situation where your old server code is trying to use new static assets. You should always be careful that you can run both versions of your code side by side during a deployment.
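As a concrete illustration of option 1 (and of keeping old and new assets side by side), a hook script along these lines could be referenced from appspec.yml; the bucket name and the static/ folder inside the bundle are assumptions:

    #!/bin/bash
    # Hypothetical BeforeInstall hook; bucket name and static/ folder are assumptions.
    set -euo pipefail

    # The CodeDeploy agent exposes DEPLOYMENT_ID and DEPLOYMENT_GROUP_ID to hook
    # scripts, and has already unpacked the bundle under deployment-archive.
    BUNDLE_DIR="/opt/codedeploy-agent/deployment-root/${DEPLOYMENT_GROUP_ID}/${DEPLOYMENT_ID}/deployment-archive"

    # Upload the assets under a per-deployment prefix so old and new server code
    # can each keep pointing at their own copies during the rollout.
    aws s3 sync "${BUNDLE_DIR}/static/" "s3://my-static-assets/${DEPLOYMENT_ID}/"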
I won't complicate it. I'll put all the files on EC2, including the CSS and JS, via CodeDeploy from GitHub, because there is no simple and ideal solution for this.

Integrating AWS EC2, RDS and... S3?

I'm following this tutorial and it works 100%, like a charm:
http://docs.aws.amazon.com/gettingstarted/latest/wah-linux/awsgsg-wah-linux.pdf
But that tutorial uses Amazon EC2 and RDS only. I was wondering: what if my servers scale up into multiple EC2 instances and I need to update my PHP files?
Do I have to distribute them manually across those instances? Because, as far as I know, those instances are not synced with each other.
So I decided to use S3 as a replacement for my /var/www so the PHP files are centralised in one place.
That way, whenever the EC2 fleet scales up, the files remain in one place and I don't need to upload them to multiple EC2 instances.
Is it best practice to have a centralised file server (S3) for /var/www? Because currently I'm still having permission issues when it's mounted using s3fs.
Thank you.
You have to put your /var/www/ in S3, and when your instances scale up they have to run 'aws s3 sync' from your bucket; you can do that in the user data. You also have to select a 'master' instance where you make changes: a sync script uploads the changes to S3, and rsync copies them to your live front-end instances. This is because, if you have three front ends that downloaded /var/www/ from S3 and you want to make a new change, you would otherwise have to run an s3 sync on every instance.
You can manage changes on your 'master' instance with inotify. inotify can detect a change in /var/www/ and execute two commands: one could be aws s3 sync, followed by an rsync to the rest of your instances. You can get the list of your instances from the ELB through the AWS API.
The last thing is to enable termination protection on your 'master' instance.
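A minimal sketch of what that watcher could look like on the 'master' instance (bucket, ELB name and SSH user are assumptions; it needs inotify-tools, the AWS CLI and key-based SSH between instances):

    #!/bin/bash
    # Sketch only: bucket, ELB name and SSH user are assumptions.
    BUCKET="s3://my-www-bucket"
    ELB_NAME="my-frontend-elb"

    inotifywait -m -r -e close_write,create,delete,move /var/www/ |
    while read -r _dir _event _file; do
        # Keep the canonical copy in S3 so freshly scaled instances can seed from it.
        aws s3 sync /var/www/ "${BUCKET}/var/www/" --delete

        # Ask the (classic) ELB which front ends are in service right now.
        IDS=$(aws elb describe-instance-health --load-balancer-name "${ELB_NAME}" \
            --query 'InstanceStates[?State==`InService`].InstanceId' --output text)

        # Push the change straight to each of them (this also hits the master
        # itself, which is harmless, or you can filter out its own instance ID).
        for IP in $(aws ec2 describe-instances --instance-ids ${IDS} \
            --query 'Reservations[].Instances[].PrivateIpAddress' --output text); do
            rsync -az --delete /var/www/ "ec2-user@${IP}:/var/www/"
        done
    done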
Your architecture should look like the one shown here: http://www.markomedia.com.au/scaling-wordpress-in-amazon-cloud/
Good luck!!

Auto scaling and data replication on EC2

Here is my scenario.
We have an ELB setup with two reserved EC2 instances acting as web servers under it (Amazon Linux).
There are some rapidly changing files (PDF, XLS, JPG, etc.) on the web servers which are consumed by the websites hosted on the EC2 instances. The code files are identical, and we will be sure to update both servers manually at the same time with new code as and when needed.
The main problem is the user-uploaded content which is stored on the EC2 instances.
What is the best approach to make sure that the uploaded files are available on both servers almost instantly?
Many people have suggested the use of rsync or unison, but this would involve setting up a cron job. I am looking for something like FileSystemWatcher in C#, which is triggered ONLY when the contents of the specified folder change. Moreover, due to the ELB we are not sure which of the EC2 instances will actually be connected to the user when the files are uploaded.
To add to the above, we have one more staging server which pushes certain files to one of the EC2 web servers. We want these files replicated to the other instance too.
I was wondering whether S3 can solve the problem? Will this setup still be good if we decide to enable auto scaling?
I am confused at this stage. Please help.
S3 would be the choice for your case. That way, you don't have to sync files between EC2 instances, and it is probably the best choice if you need to enable auto scaling. You should not keep any data on the EC2 instances; they should be stateless so that you can easily auto scale.
Using S3 will require your application to support it instead of writing directly to the local file system. It should be quite easy; there are many libraries in each language which can help you store files in S3.
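In effect, the upload handler stores the object in S3 and hands back a link, rather than writing to the instance's disk. The bucket and key below are assumptions, and an SDK call would do the same thing from inside the application:

    # Sketch only: bucket and key are assumptions; the application would make the
    # equivalent SDK call (e.g. a PutObject) instead of shelling out to the CLI.
    aws s3 cp /tmp/uploads/invoice-0142.pdf s3://my-user-uploads/uploads/invoice-0142.pdf

    # Serve the file back with a time-limited link instead of a path on one
    # instance's local disk.
    aws s3 presign s3://my-user-uploads/uploads/invoice-0142.pdf --expires-in 3600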