Why does AWS EC2 instance authentication keeps stop working (session maybe)? - amazon-web-services

Problem In Short: PHP website hosted on AWS EC2 session stops working and restarting the EC2 instance fixes it.
=================
I have this website built on custom PHP hosted on AWS as an instance of EC2, DB is MySql RDS. Everything was working just fine a few weeks back. Now at certain times, the user can't log in to the website. I reckon there's some problem with the session but not sure.
But whenever I restart the ec2 instance, the authentication (user/pass) for the website starts working again. This is very strange because.. there's is no such issues on AWS troubleshooting or in Stackoverflow. I hope there is a permanent solution rather than having to restart the server every other day.
This is an e-commerce site with at least 500 to 700 orders per day.

If an issue is resolved by the restart of server/EC2-instance, then most likely you have memory/CPU related problems in your application.
You can do the following to nail the issue
Trace the cloudwatch statistics for your EC2 instance for CPU and memory
Set up another environment for load testing your application
Generate load script to simulate the scenario leading to the problem
Run a code profiler to investigate the problematic code
Fix problems, run the load again to verify your changes
Apply the changes in prod and hope your application rocks in production afterward
EDIT : As suggested by #Boinst, as in interm solution you can schedule restart of EC2 instance, while you find the root cause. One of the ways to do that can be to use AWS CLI
aws ec2 reboot-instances --instance-ids yourInstanceId
you can add a cronjob/scheduled task a machine setup with AWS CLI.

Related

AWS EC2 instances with auto scaling staying in sync

I have a Node.js web application currently running on a single EC2 instance on AWS. I am thinking of using auto scaling with 2 or more EC2 instances since the load on the application is increasing.
I have been trying to understand something with AWS Auto Scaling for a couple hours now but I cant seem to find an answer anywhere.
Currently, at many instances I SSH into my Ubuntu EC2 instance to modify some things or to run a deploy command (which grabs latest code from github). How does this work when you have, let's say 4 instances running under the auto scaling?
So if I SSH into a server and change the server.js file, what happens to the other 3 instances?
If that is not possible what are my choices? I have seen many people seeing that using S3 is the way to keep things in Sync but I don't fully get that. So I have to keep all my source code in S3 and do my edits from there?
You won't be able to modify files directly on the server once they are in an auto-scaling group. Changing something on one server won't be reflected on the other servers, and even if you manually updated all the currently running servers, any servers added by auto-scaling actions will not have those changes.
There are many methods to solve this, for example using AWS Code Deploy.
You could also configure something via an EC2 User-Data script in your auto-scaling configuration which will run on each server when they are created. That script could checkout the latest code from Git, or pull the latest build artifact from S3, and then start the app. When you have an update ready to deploy, you would simply flag the current instances as "unhealthy" and wait for the Auto-Scaling group to automatically replace them with new, updated instances.
You could use AWS EFS to host your application code and all web servers will get content from EFS instead of individual server. This way you don't have to worry about modifying individual server content.
One way you can do it is using github. you can update your code and push it to github and then terminate your existing instances and let the auto-scaling group spin up new instances with the updated code. here is a youtube tutorial video that has detailed steps on how to do it: https://www.youtube.com/watch?v=lB3Ip0Yn-Zs

AWS Elastic Beanstalk Restart

I work on an AWS Elastic Beanstalk environment. I was testing websockets on my server, when I noticed that var/www/error_log was about 5GB large, filling up my diskspace. I (stupidly) tried to view its contents using vim, which froze everything. I am currently trying to restart the environment, but the restart keeps timing out. What can I do to get it back up again?
The solution was to restart the underlying EC2 machine, by accessing the AWS EC2 Console.
I cannot thank #riverfall and #sébastien-stormacq enough for helping me! These things tend to get very stressful

Capistrano and Auto-Scaling AWS

We're trying to figure out the best way to deploy to an auto-scaling AWS setup using Capistrano, and stuck on the best way to ensure new servers automatically get the latest code, without having to rely on AMIs.
Any ideas?
Using User Data, you can have your EC2 instances pull the latest code each time a new instance is launched.
More info on user data here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
tldr: user data is pretty much a shell script thats executed when your ec2 instance launches. you can get it to pull the latest code and run it
#Moe's answer (or something like it is the right one). But just as another thought, you could write some Ruby which queries AWS on deploy to fetch the list of servers to which Capistrano will deploy. The issue with this approach is that you will have to manually deploy to all servers every time auto-scaling adds a server, which kind of defeats the purpose.

Should multiple ec2 instances share an EFS or should code be downloaded to instance on spin-up?

I am playing around with the idea of having an Auto Scaling Group for my website that receives a lot of traffic. I need each server to be running an identical webservice, so I have come up with several ideas to make this happen.
Idea 1: Use Code Commit + User Data
I will keep my webserver code in a git repo in CodeCommit. Then, when my EC2 instances spin-up, they will install apache2, and then pull from the git repo.
Idea 2: Use Elastic File System
After a server spins up, it will mount to one central EFS that has my webserver code on it. EC2 will install apache2 then use EFS to get the proper php files etc.
Idea 3: Use AWS S3
Like above with apache2, but then download webserver code from s3.
Which option is advised? Why?
I suggest you have a reference machine which is used for creating images. Keep it updated with the latest version of your code and when you are happy with it, create an image out of it, update your launch configuration, and change the ASG configuration so that it uses it. You can then stop the reference machine and leave the job to the ASG instances.

Deploying Flask app that uses Celery and Redis to AWS: Elastic Beanstalk or EC2 directly?

I'm new to web development and i wrote a small Flask API that uses Celery for message queue and Redis as the broker. I start redis with redis-server and Celery with celery -A application.celery worker --loglevel=info on my local machine and the app runs with no problem.
However i couldn't get it to work on AWS. Right now I'm deploying the app following the docs but when I try to send requests to my API I get internal server errors, which are probably related to Redis and Celery not working. I SSH'ed into the EC2 instance but since I'm new, couldn't find what to do to get the app working.
My questions are:
1) What do i do to start my application, Redis and Celery after deploying it to AWS? Does Elastic Beanstalk do it automatically or do I need to do some stuff?
2) Where do I find my app files? I think I'll need to install all the requirements manually from requirements.txt, and set up a virtualenv in the EC2 instance, is that right?
3) If I setup and install all the requirements in a virtualenv, will they persist if the EC2 instance changes? The command line tool for Elastic Beanstalk deployed the application automatically and created Load Balancer and Auto Scaling Group. Will the installations I make through the SSH be available when new instances are created, or do I need to manually do that everytime, or is there some other way?
4) I heard some people say that creating an EC2 instance and deploying manually is better than using Elastic Beanstalk. What does Elastic Beanstalk do for me? Is it better if I use Elastic Beanstalk or deploy manually?
Thanks for any help!
For the past week I was trying to do the exact same thing, so I'd thought I'd share everything I've learned. Although these answers are spread about stackoverflow/google, but I can help all the same.
1) To get a flask app running is easy, you can use the elastic beanstalk CLI. Generally, just follow the AWS documentation here, it's fairly straightforward. In terms of Redis/Celery, you start to get multiple things going on. Before you do your initial deploy, you'll probably want to setup the celery worker, you can use this stackoverflow answer on how to setup celery as a daemon. Be sure you read the script, you'll need to set your application name properly. The thing to note when you deploy to production via EBS is that your application will be hosted by apache, meaning some strange things will happen if you call your tasks via "some_task.delay", as the name of the celery app will be unknown. As far as I know, the only way to work around this properly is to use:
my_celery_object.send_task("flask_application_name.the_task", [param1, param2], ...)
Wherever you need to call tasks.
You can now prepare your redis cache. You can use anything, for this I'll just assume you want to use AWS ElasticCache (EC). Going to EC, you'll need to deploy a cache cluster with however many nodes you want. Once deployed you'll see it on the list under "Cache Clusters". Next, click the "X node" link that's in the table, you'll need to copy the endpoint url (and port!) to your celery application which you can learn about here.
So now that you have everything ready to deploy, you'll be sad to hear that the security thing I mentioned earlier will cause your application to fail on any task requests as the elastic cache cluster will be part of the wrong security group initially. Go ahead and deploy, this will create the security group you need along with your app and everything else. You can then find that security group under the EC2 dashboard, under Network & Security -> Security Groups. The name of the group should be the name of your environment, something like "myapp-env" is the default. Now modify the inbound rules and add a custom TCP rule setting the port number to your redis port and the source to "Anywhere", save. At this point, note the group name and go to your elastic cache. Click the Cach Clusters on the left, modify the CACHE CLUSTER (not the node) for the app, and update the VPC security group to the one you just noted and save.
Now celery will automatically connect to the redis cache as it will keep attempting to make connections for awhile. Otherwise you can always redeploy.
Hopefully you now have a functioning Flask/Celery app utilizing redis.
2) You shouldn't need to know the location of your app files on the EBS EC2 instance as it will automatically use a virtual environment and requirements.txt assuming you followed the instructions found here. However, at the time of writing this, you can always ssh to your EC2 instance at find your application files at:
/opt/python/current/app
3) I don't know how you mean "If I setup and install all the requirements in a virtualenv, will they persist if the EC2 instance changes?" As I said previously, if you followed the instructions on how to deploy an EBS environment for flask, then new instances that are deployed will automatically update their environment based on your requirements.txt
4) This is a matter of opinion. I have definitely heard not using EBS could be a better way to go, I personally have no opinion here as I have only used EBS. There have been some severe struggles (including trying to setup this application). What I hear some people do is deploy via EBS so that they can get a pre-configured and ready to go EC2 machine and then make an AMI from that machine, tear the EBS down, and then make an EC2 with the AMI. Regardless of the route you go, if you are planning to have a database backed server, I have learned (the hard way) that you shouldn't have EBS automatically attach the RDS. This is due to the fact that the RDS is then associated with the EBS application, so if you have to replace the resources, terminate it, etc., you'll lose the RDS (you can work around this of course, it's just a pain is all).