Is s3cmd a safe option for sync EC2 instances? - amazon-web-services

I have the following problem: we are working on a project on AWS which will use autoscaling, so the EC2 instances will start and die very often. Freeze images, update the launch configurations, auto scalling groups, alarms, etc, takes a while and several things can go wrong.
I just want the new instances to sync the most recent code, so I was just thinking about fetching it from S3 using s3cmd once the instance finishes booting and manually updating it everytime we have new codes to be uploaded. So my doubts are:
Is it too much risky to store the code on s3? How secure are the files in there? Using the s3cmd encryption password it is unlikely someone will be able do decrypt them?
What other ooptions would be good for this? I was thinking about rsync, but then I think I would need to store the private key for the servers inside them, which I don't think its a good idea.
Thanks for any advices

You might be a candidate for Elastic Beanstalk - using a plain vanilla AMI.
Then package your application, use AWS's ebextensions tool to customize the instance when it is spun up. ebextensions will allow you to do anything you like to the image, in place, as it is deploying. change .htaccess, erase a file, place a cron job, whatever.
When you have code updates, package them, upload and do a rolling update.
All instances will use your latest code, including auto-scaled ones.
The key concept here is to never have your real data in the instance, where it might go away if an instance dies or is shut down.
Elastic Beanstalk will allow you to set up the load balancing, auto-scaling, monitoring, etc.

Related

How to add some new code to an existing EC2 instance

Bear with me, what I am requesting may be impossible. I am a AWS noob.
So I am going to describe to you the situation I am in...
I am doing a freelance gig and was essentially handed the keys to AWS. That is, I was handed the root user login credentials for the AWS account that powers this website.
Now there are 3 EC2 instances. One of the instances is a linux box that, from what I am being told, is running a Django Python backend.
My new "service" if you will must exist within this instance.
How do I introduce new source code into this instance? Is there a way to pull down the existing source code that lives within it?
I am not be helped by any existing/previous developers so I am kind of just handed the AWS credentials and have no idea where to start.
Is this even possible. That is, is it possible to pull the source code from an EC2 instance and/or modify the code? How do I do this?
EC2 instances are just virtual machines. So you can use SSH/SCP/SFTP files to and from. You can use the AWS CLI tools to copy stuff from S3. Dealers choice...
Now to get into this instance... If you look in the web console you can find its IP(s), what the security groups (firewall rules), and the key pair name. Hopefully they gave you the keys. You need these to SSH in.
You'll also want to check to make sure there's a security group applied that has SSH open. Hopefully only to your IP :)
If you don't have the keys you'll have to create an AMI image of the instance so you can create a new one with a key pair you do have.
Amazon has a set of tools for you in Amazon CodeSuite.
The tool used for "deploying" the code is Amazon CodeDeploy. By using this service you install an agent onto your host, then when triggered it will pull down an artifact of a code base and install it matching hosts. You can even specify additional commands through the hook system.
But you also want to trigger this to happen, maybe even automatically? CodeDeploy can be orchestrated using the CodePipeline tool.

AWS EC2 instances with auto scaling staying in sync

I have a Node.js web application currently running on a single EC2 instance on AWS. I am thinking of using auto scaling with 2 or more EC2 instances since the load on the application is increasing.
I have been trying to understand something with AWS Auto Scaling for a couple hours now but I cant seem to find an answer anywhere.
Currently, at many instances I SSH into my Ubuntu EC2 instance to modify some things or to run a deploy command (which grabs latest code from github). How does this work when you have, let's say 4 instances running under the auto scaling?
So if I SSH into a server and change the server.js file, what happens to the other 3 instances?
If that is not possible what are my choices? I have seen many people seeing that using S3 is the way to keep things in Sync but I don't fully get that. So I have to keep all my source code in S3 and do my edits from there?
You won't be able to modify files directly on the server once they are in an auto-scaling group. Changing something on one server won't be reflected on the other servers, and even if you manually updated all the currently running servers, any servers added by auto-scaling actions will not have those changes.
There are many methods to solve this, for example using AWS Code Deploy.
You could also configure something via an EC2 User-Data script in your auto-scaling configuration which will run on each server when they are created. That script could checkout the latest code from Git, or pull the latest build artifact from S3, and then start the app. When you have an update ready to deploy, you would simply flag the current instances as "unhealthy" and wait for the Auto-Scaling group to automatically replace them with new, updated instances.
You could use AWS EFS to host your application code and all web servers will get content from EFS instead of individual server. This way you don't have to worry about modifying individual server content.
One way you can do it is using github. you can update your code and push it to github and then terminate your existing instances and let the auto-scaling group spin up new instances with the updated code. here is a youtube tutorial video that has detailed steps on how to do it: https://www.youtube.com/watch?v=lB3Ip0Yn-Zs

Do I need to duplicate code on every EC2 instance running behind an ELB?

Hi this is a very noob question, but I am trying to deply my Node JS API server on AWS.
Everything is working fine with one m1.large instance that my Front End running on S3 connects to.
Now I want to Scale and put my EC2 instance and possibly many more behing and ELB and an Auto Scaling Group.
Do I need to duplicate my server code on every EC2 instance?
If so , I assume I'll have to create a seperate DB server which all of the EC2 instances will connect to.
Am I right,anyone experienced in Amazon AWS can answer this, I tried googling but most of the links point to detailed tutorials which however don't answer my question.
Any help would be much appreciated. Thanks
yep. that's basically correct. the code needs to be on all instances fronted by the load balancer. for the database you may want to look into RDS.
Of course NOT.. But sure you can do..
That's why there are EFS volumes, which are shared volumes to more than one EC2 instance, but you have to choose a region that support them since they are available on certain regions. As a candidate AWS certified architect I would recommend you more than two options.
You can follow your first approach and create an EC2 instance put your code inside and then create an AMI and use this AMI to launch your upcoming EC2s through autoscaling group. In my opinion bad decision since on any code change you have to go on each one and put the new code and then create a new AMI and a new Auto scaling configuration..Lot's of stuff to do, but it will work.
Second approach, following the first approach but do not create an AMI, instead upload your code on a private (I suppose) Repo like github, bitbucket, install SSM and the appropriate roles for managing EC2 and on every code changes push them to repo and pull them on your EC2, using SSM. Of course you may write a webhook to bitbucket to call an api and run the git pull command on each EC2. Probably the last sentence could be a third approach but needs more coding!!!
Last but not least!! Use an EFS volume put your code there, mount this volume on your EC2, add a auto mount command on every boot, alter your apache httpd main document to point on this EFS/folder and create an AMI with this configuration. Voila! every new EC2 will use the same code which located on this shared/network volume. Whenever you need to change something you have to log in on a third instance outside of your autoscaling group for a certain amount of time upload your changes and then turn it off and all of your EC2 will take immediately the new code. Of course you may pull the changes from a repo following the third approach.
Maybe there are more approaches, I'm using the third one with private repos of course and until now I haven't faced any problem (Fingers crossed)!
One other option is to use Elastic Beanstalk to Deploy NodeJs applications. Here is the guide specific to NodeJs. This will take care of most of the stuff which you would need to do otherwise if you only use EC2 For example: ELB, Autoscaling Cloudwatch etc.
For Database, you may want to use the Master Slave with Read Replicas. Another option is to evaluate NoSql Databases like DynamoDB if it fits your use case. The scalability of DynamoDB tables is managed by AWS so you dont need to worry about it.

updating all files on AWS EC2

I'm trying to determine the "best" way for a small company to keep web app EC2 instances in sync with current files while using autoscaling.
From my research, CloudFormation, Chef, Puppet, OpsWorks, and others seem like the tools to do so. All of them seem to have a decent learning curve, so I am hoping someone can point me in the right direction and I'll learn one.
The initial setup I am after is:
Route53
1x Load Balancer
2x EC2 (different AZ) - Apache/PHP
1x ElastiCache Redis
2x EC2 (different AZ) w/ MySQL
Email thru Google Apps
Customer File/Image Storage via S3
CloudFront for CDN
The only major challenge I can see is versioning/syncing the web/app server. We're small now, so I could probably just manually update the EBS or even using rsync, but I would rather automate it and be setup for autoscaling.
This is probably too broad of a question and may be closed, but let me give you a few thoughts.
Why not use RDS for MySQL?
You need to get into the thought of how to make and promote disk images. In the cloud world, you don't want to be rsyncing around a bunch of files from server to server. When you are ready to publish a revised set of code, just make am image from your staging environment, start new EC2 instances in your ELB based on that image, and turn off old instances. You may have a little different deployment sequence if you need to coordinate with DB schema changes, but that is a pretty straightforward approach.
You should still seek to automate some of your activities using tools such as those you mentioned. You don't need to do this all at once. Just figure out a manual part in your process that you want to automate and do it.

Auto scaling and data replication on EC2

Here is my scenario.
We have an ELB setup with two reserved instances of EC2 acting as web server under it (Amazon Linux).
There are some rapidly changing files (pdf, xls, jpg, etc) on the web server which are consumed by the websites hosted on the EC2 instances. Code files are identical and we will be sure to update both the servers manually at the same time with new code as and when needed.
The main problem is the user uploaded content which is stored on the EC2 instance.
What is the best approach to make sure that the uploaded files are available on both the servers almost instantly ?
Many people have suggested the use of rsync or unison, but this will involve setting a cron job. I am looking for something like FileSystemWatcher in C# which is triggered
ONLY when the contents of the specified folder are changed. Moreover due to the ELB we are not sure which of the EC2 instances will actually be connected to the user when the files are uploaded.
To add to the above we have one more Staging Server which pushes certain files to one of the EC2 web servers. We want these files too replicated to the other instance.
I was wondering whether S3 can solve the problem ? Will this setup be still good if we decide to enable auto scaling ?
I am confused at this stage. Please help
S3 will be the choice for your case. In this way, you don't have to sync files between EC2 instances. Also it is probably the best choice if you need to enable auto scaling. You should not put any data in EC2 instances, they should be stateless so that you can easily auto scale.
To use S3, it will require your application to support it instead of directly writing to local file system. It should be quite easy, there are many libraries in each language which can help you to store files into S3.