I'm trying to transfer a project from TeamCity to my EC2 server using CodeDeploy. In the process we hit a problem: the file isn't being transferred from the S3 service to our EC2 instance. The error message is:
The overall deployment failed because too many individual instances
failed deployment, too few healthy instances are available for
deployment, or some instances in your deployment group are
experiencing problems. (Error code: HEALTH_CONSTRAINTS)
The team decided that the best way to solve this was to read the server log, but in the process we noticed that the server keeps shutting down on its own, which is a serious problem. We thought the logs could explain that too, so we tried to collect them with CloudWatch (our team created the correct IAM role and installed the agent on the server with Run Command). Unfortunately, the only logs we managed to get contain shutdown and startup information.
At this moment we don't know how to solve this problem, but our plan was to get all the logs and then see what is wrong.
I've been stuck on this since September. Can someone help me?
Related
I have created an AWS EKS cluster with 2 nodes running a web-based application, and I am using EBS volumes on the Pods. I can reach the initial login page on the server and proceed with the other steps. But sometimes one particular module throws an "internal server error: 500", and I used the Inspect option to try to tell whether it comes from the application code or from the AWS side.
The issue occurs for only one particular Pod (service). I can't tell whether it is a network-related issue where communication is failing, or whether the data simply can't be pulled from the server.
We had a working env yesterday morning. One person was working on some VPC peering changes, and somehow that seems to have blocked the ELB from talking to the application servers. So we can't deploy.
We get this error:
Command failed on instance. An unexpected error has occurred [ErrorCode: 0000000001].
And EB can't retrieve any logs from the app server either, so clearly it is getting blocked. The instance's IAM role has AWSElasticBeanstalkWebTier attached and is the same role used in another env that works.
Also, I can RDP to the app instance from my laptop, though I added my IP to one of the instance's security groups. I audited the SGs for the working env and don't see anything specific to EB.
Everything points to something with the VPC... but what should I look at?
So it turns out that this error message doesn't always mean EB can't talk to the instance. We had updated the AMI a while back and used the wrong one; it wasn't compatible with the platform. So nothing worked right, including getting the logs. The fix was to sync up the AMI and the EB platform.
Problem in short: the session on a PHP website hosted on AWS EC2 stops working, and restarting the EC2 instance fixes it.
=================
I have a website built on custom PHP and hosted on an AWS EC2 instance; the DB is MySQL on RDS. Everything was working just fine until a few weeks ago. Now, at certain times, users can't log in to the website. I reckon there's some problem with the session, but I'm not sure.
Whenever I restart the EC2 instance, authentication (user/pass) for the website starts working again. This is very strange, because I can't find any such issue in the AWS troubleshooting docs or on Stack Overflow. I hope there is a permanent solution rather than having to restart the server every other day.
This is an e-commerce site with at least 500 to 700 orders per day.
If an issue is resolved by restarting the server/EC2 instance, then most likely your application has memory- or CPU-related problems.
You can do the following to nail down the issue:
Trace the CloudWatch statistics for your EC2 instance for CPU and memory
Set up another environment for load testing your application
Write a load script to simulate the scenario leading to the problem
Run a code profiler to investigate the problematic code
Fix problems, run the load again to verify your changes
Apply the changes in prod and hope your application rocks in production afterward
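As a starting point for the first step, here is a minimal sketch in Python that pulls the instance's CPU history from CloudWatch and flags stretches where it stayed high. It assumes boto3 is installed and credentials are configured; the function names, the 90% threshold and the three-datapoint window are illustrative choices, not anything from the original answer. Note that EC2 publishes CPUUtilization by default, but memory metrics only appear in CloudWatch if the CloudWatch agent is installed on the instance.

```python
from datetime import datetime, timedelta, timezone


def fetch_cpu_datapoints(instance_id, hours=24, period_seconds=300):
    """Fetch average CPUUtilization datapoints for one instance (hypothetical helper)."""
    import boto3  # imported lazily so the analysis helper below works without it

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period_seconds,
        Statistics=["Average"],
    )
    # Datapoints are not guaranteed to be ordered, so sort by timestamp.
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])


def sustained_high_cpu(datapoints, threshold=90.0, min_consecutive=3):
    """Return (start, end) timestamp pairs where the average CPU stayed at or
    above `threshold` for at least `min_consecutive` datapoints in a row."""
    windows, run = [], []
    for point in datapoints:
        if point["Average"] >= threshold:
            run.append(point["Timestamp"])
            continue
        if len(run) >= min_consecutive:
            windows.append((run[0], run[-1]))
        run = []
    if len(run) >= min_consecutive:
        windows.append((run[0], run[-1]))
    return windows
```

Comparing the flagged windows with the times users reported login failures should show whether the outages line up with CPU pressure or whether you need to look elsewhere (memory, disk, PHP session storage).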
EDIT: As suggested by #Boinst, as an interim solution you can schedule a restart of the EC2 instance while you find the root cause. One way to do that is to use the AWS CLI:
aws ec2 reboot-instances --instance-ids yourInstanceId
You can add a cron job/scheduled task on a machine set up with the AWS CLI.
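If you would rather schedule a script than a raw CLI call, the same reboot can be sketched in Python with boto3. This is only an illustration: `reboot_instance` and `main` are hypothetical names, and it assumes boto3 is installed and the credentials in use are allowed to call ec2:RebootInstances.

```python
def reboot_instance(ec2_client, instance_id):
    """Ask EC2 to reboot one instance and return the raw API response.

    The client is passed in so the function can be exercised with a stub.
    """
    return ec2_client.reboot_instances(InstanceIds=[instance_id])


def main(instance_id):
    """Entry point for a scheduled job (hypothetical wiring)."""
    import boto3  # imported lazily; requires configured AWS credentials

    return reboot_instance(boto3.client("ec2"), instance_id)
```

A cron job or scheduled task would then call `main("yourInstanceId")` at whatever quiet hour suits the shop. Treat this strictly as a stopgap until the root cause is found.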
I am having an issue pulling private images from Artifactory to AWS Fargate. It shows an "access violation" error. Is anybody else getting the same error while running a task in AWS Fargate?
Status reason : CannotPullContainerError: API error (500): Get https://xxx.artifactory.xx:xxx/v2/: Access violation
This issue has been fixed in Fargate Platform Version 1.3.0.
In 1.3.0, along with Secrets support, AWS fixed the issue of pulling images from private registries that run on HTTPS ports other than 443.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/platform_versions.html
https://aws.amazon.com/about-aws/whats-new/2018/12/aws-fargate-platform-version-1-3-adds-secrets-support/
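To check quickly whether a task's image reference falls into the affected case (a registry addressed on an explicit port other than 443), something like the following sketch could help. The function names are made up for illustration, and the host detection follows the usual Docker convention for image references rather than anything specific to Fargate.

```python
def registry_port(image_ref):
    """Return the explicit registry port in an image reference, or None."""
    first, slash, _rest = image_ref.partition("/")
    # By Docker convention the first path component is a registry host only
    # if it contains "." or ":" or is "localhost"; otherwise it is part of
    # the repository name (for example "library/nginx").
    if not slash or not ("." in first or ":" in first or first == "localhost"):
        return None
    _host, colon, port = first.rpartition(":")
    return int(port) if colon and port.isdigit() else None


def uses_non_443_port(image_ref):
    """True when the registry is addressed on an explicit port other than 443."""
    port = registry_port(image_ref)
    return port is not None and port != 443
```

If this returns True for your image, make sure the task or service runs on platform version 1.3.0 or later.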
AWS have recognized this as an issue with Fargate:
Thank you for your patience. Here's an update which I received from our Fargate team on this issue.
We have identified an issue where customers are unable to pull images from private container image registries running on non-443 HTTPS ports. We have fixed the issue and will make it available with the next AWS Fargate Platform Version release (mid December).
I hope the above answers your questions and I do apologize for the issues you faced. Please do get back on the case if you needed more information. Have a great day ahead :)
So the thing to do is retry after mid-December.
I'm in the process of migrating a Django app from Heroku to Elastic Beanstalk. Deploys were working, and I was debugging issues to get requirements.txt working, when deploys stopped.
Now they time out. The worst part is that I'm unable to access logs, so I have no idea what's causing the issue.
When I try to access logs, it either returns the error "An error occurred retrieving logs: Rate exceeded" or triggers the event "requestEnvironmentInfo is starting.", which itself times out.
I'm not sure how you can move forward in debugging without access to the logs. I've cloned into a new environment and terminated the old one, but that didn't work. If you've encountered similar issues or know how to proceed please provide help!
I fixed the issue by terminating the main EC2 instance and redeploying in EB on a larger instance, in this case a small.