Cannot run more than two tasks in Amazon Web Services

I have two clusters in Amazon Elastic Container Service, one for production and one as a testing environment.
Each cluster has three different services with one task each, so there should be 6 tasks running.
To update a task, I have always pushed my new Docker image to the Elastic Container Registry and restarted the service with the new image.
For about 2 weeks now I have only been able to start 2 tasks in total. It doesn't depend on the cluster; it's just 2 tasks in general.
The tasks that should start appear to be stuck in the "In Progress" rollout state.
Has anybody had a similar problem, or does anyone know how to fix this?

I wrote to AWS Support about this issue and got this reply:
"After a review, I have noticed that the XXXXXXX region has not yet been activated. In order to activate the region you will have to launch an instance; I recommend a Free Tier EC2 instance.
After the EC2 instance has been launched you can terminate it thereafter."
I don't know why, but it works.
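For reference, the same workaround can be scripted with the AWS CLI (a rough sketch; the region, AMI ID, and instance ID below are placeholders, not values from my account):

# Launch a Free Tier instance in the not-yet-activated region
# (region and AMI ID are placeholders for your own values)
aws ec2 run-instances --region eu-central-1 --image-id ami-0abcdef1234567890 \
  --instance-type t2.micro --count 1
# Once it is running, terminate it again
aws ec2 terminate-instances --region eu-central-1 --instance-ids i-0abcdef1234567890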

Related

Difference between AWS ECS Service Type Daemon & Constraint "One Task Per Host"

At first glance, the AWS ECS "Daemon" service type and the placement constraint "One Task per Host" look very similar. Can someone please explain the differences between the two, with some real-life examples of when one is preferred over the other?
By "One Task Per Host" are you referring to the distinctInstance constraint?
distinctInstance means that no more than 1 instance of the task can be running on a server at a time. However, the actual count of task instances across your cluster depends on your desired task count setting. So if you have 3 servers in your cluster, you could have as few as 1 task running, or as many as 3.
daemon tells ECS that one of these tasks has to be running on every server in the cluster. So if you have 3 servers in your cluster, you will have 3 instances of the task running, one on each server.
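To make the difference concrete, here is a sketch of how each would be created with the AWS CLI (cluster, service, and task-definition names are placeholders):

# distinctInstance: at most one copy of the task per instance; the total
# number of copies is still set by --desired-count
aws ecs create-service --cluster my-cluster --service-name my-service \
  --task-definition my-task --desired-count 2 \
  --placement-constraints type=distinctInstance

# DAEMON: exactly one copy on every instance in the cluster; no desired
# count is given because ECS derives it from the cluster size
aws ecs create-service --cluster my-cluster --service-name my-daemon-service \
  --task-definition my-task --scheduling-strategy DAEMON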

Running ECS service on 2 container instances (ECS instances)

I have an ECS service that is required to run on exactly 2 container instances. How can this be achieved? I could not find any place in the container definition where I can fix the number of ECS instances.
There are a few ways to achieve this. One is to deploy your ECS service on Fargate. When you do so and set your task count to, say, 2, ECS will deploy your 2 tasks onto 2 separate, dedicated operating systems/VMs managed by AWS. Two or more tasks can never be colocated on one of these VMs; it's always a 1 task : 1 VM relationship.
If you are using EC2 as your launch type and you want to make sure your service deploys exactly 1 task per instance, the easiest way is to configure your ECS service as type DAEMON. In this case you don't need to (in fact, can't) configure the number of tasks in your service, because ECS will always deploy 1 task per EC2 instance that is part of the cluster.
When creating the service you will find the field Number of tasks, which is exactly how many copies of the task you want. If you enter 1 it will launch only 1, and if you enter 2 it will launch 2. I hope this helps.
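As a sketch of the desired-count suggestion above (cluster and service names are placeholders):

# Pin the service to exactly 2 task copies; on Fargate each lands on its own
# AWS-managed VM, and on EC2 you can add the distinctInstance constraint to
# keep them on separate instances
aws ecs update-service --cluster my-cluster --service my-service --desired-count 2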

ECS services taking more than 10 minutes to start

I am using ECS to deploy my services. I have 2 services; after an ECS instance starts from my ASG, the ecs-agent Docker container comes up immediately, but both of my service containers take more than 10 minutes to come up.
I am using a t2.medium instance, and both services are very small and don't do any checks at startup.
Let me know if I need to provide any other information. Note that I've checked the events section, and there is no information there until the instance has started.
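A starting point for debugging this kind of delay (a sketch; the cluster name is a placeholder) is to check whether ECS sees the instance as registered, and to read the agent log on the host:

# Is the instance registered and ACTIVE from ECS's point of view?
aws ecs list-container-instances --cluster my-cluster
aws ecs describe-container-instances --cluster my-cluster \
  --container-instances <arn-from-previous-command>

# On the instance itself, the agent log usually shows why placement is slow
sudo tail -n 100 /var/log/ecs/ecs-agent.log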

Why every time Elastic Beanstalk issues a command to its instance it always timed out?

I have a PHP application deployed to Amazon Elastic Beanstalk. But I've noticed a problem: every time I push my code changes via git aws.push to Elastic Beanstalk, the deployed application doesn't pick up the changes. I checked the events log on my application's Beanstalk environment and noticed that every time Beanstalk issues:
Deploying new version to instance(s)
it's always followed by:
The following instances have not responded in the allowed command timeout time (they might still finish eventually on their own):
[i-d5xxxxx]
The same thing happens when I try to request snapshot logs. Beanstalk issues:
requestEnvironmentInfo is starting
then after a few minutes it's again followed by:
The following instances have not responded in the allowed command timeout time (they might still finish eventually on their own): [i-d5xxxxx].
I had this problem a few times. It seems to affect only particular instances, so it can be solved by terminating the EC2 instance (via the EC2 page on the Management Console). Elastic Beanstalk will then detect that there are 0 healthy instances and automatically launch a new one.
If this is a production environment, you have only 1 instance, and you want minimal downtime:
1. Configure minimum instances to 2; Beanstalk will launch another instance for you.
2. Terminate the problematic instance via the EC2 tab; Beanstalk will launch a replacement, because the minimum is now 2.
3. Configure minimum instances back to 1; Beanstalk will remove one of your two instances.
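The same steps can be scripted with the AWS CLI (a sketch; the environment name is a placeholder):

# Raise the minimum so Beanstalk launches a second instance
aws elasticbeanstalk update-environment --environment-name my-env \
  --option-settings Namespace=aws:autoscaling:asg,OptionName=MinSize,Value=2
# ...terminate the problematic instance, then scale back down
aws elasticbeanstalk update-environment --environment-name my-env \
  --option-settings Namespace=aws:autoscaling:asg,OptionName=MinSize,Value=1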
By default, Elastic Beanstalk throws a timeout exception after 8 minutes (480 seconds, defined in settings) if your commands do not complete in time.
You can set a higher timeout, up to 30 minutes (1800 seconds):
{
  "Namespace": "aws:elasticbeanstalk:command",
  "OptionName": "Timeout",
  "Value": "1800"
}
Read here: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html
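The same option can also be applied with the AWS CLI instead of a saved configuration (the environment name is a placeholder):

aws elasticbeanstalk update-environment --environment-name my-env \
  --option-settings Namespace=aws:elasticbeanstalk:command,OptionName=Timeout,Value=1800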
Had the same issue here (single t1.micro instance).
I solved the problem by rebooting the EC2 instance via the EC2 page on the Management Console (not from the EB page).
Beanstalk deployment (and other features like Get Logs) works by sending SQS commands to instances. An SQS client is deployed to each instance and checks the queue about every 20 seconds (see /var/log/cfn-hup.log):
2018-05-30 10:42:38,605 [DEBUG] Receiving messages for queue https://sqs.us-east-2.amazonaws.com/124386531466/93b60687a33e19...
If the SQS client crashes or has network problems on t1/t2 instances, it will not be able to receive commands from Beanstalk, and deployment will time out. Rebooting the instance restarts the SQS client, and it can receive commands again.
An easier way to fix the SQS client is to restart the cfn-hup service:
sudo service cfn-hup restart
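To verify the client is polling again after the restart (the log path is the one mentioned above):

sudo service cfn-hup status         # the daemon should report as running
sudo tail -f /var/log/cfn-hup.log   # "Receiving messages for queue ..." lines should resume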
In the case of deployment, an alternative to shutting down the EC2 instances and waiting for Elastic Beanstalk to react, or messing about with minimum and maximum instances, is simply to perform a Rebuild environment on the target environment.
If a previous deployment failed due to a timeout, the new version will still be registered against the environment, but because of the timeout it will not appear to be operational (in my experience the instance appears to still be running the old version).
Rebuilding the environment seems to reset things, with the new version being used.
Obviously there's the downside of a period of downtime.
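Rebuild environment is also available from the CLI if you script your recovery (the environment name is a placeholder):

aws elasticbeanstalk rebuild-environment --environment-name my-env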
I think this is the correct way to deal with this.
I think the correct way to deal with this is to figure out the cause of the timeout by doing what this answer suggests.
chongzixin's answer is what needs to be done if you need this fixed ASAP before investigating the reason for a timeout.
However, if you do need to increase timeout, see the following:
Add configuration files to your source code in a folder named .ebextensions and deploy it in your application source bundle.
Example:
option_settings:
  "aws:elasticbeanstalk:command":
    Timeout: 2400
*"value" represents the length of time before timeout in seconds.
Reference: https://serverfault.com/a/747800/496353
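As a concrete sketch of "add configuration files to .ebextensions" (the file name is arbitrary; any *.config file in that folder is picked up):

mkdir -p .ebextensions
cat > .ebextensions/command-timeout.config <<'EOF'
option_settings:
  "aws:elasticbeanstalk:command":
    Timeout: 2400
EOF
eb deploy   # the option is applied as part of the next deployment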
"Restart App Server(s)" from the "Actions" menu in Elastic Beanstalk management dashboard followed by eb deploy fixes it for me.
After two days of checking random issues, I restarted both EC2 instances one after another to make sure there was no downtime. The site worked fine, but after a while it started throwing 504 errors.
When I checked the HTTP server, nginx was down with an out-of-disk-space error. After I increased the disk size, Elastic Beanstalk created new instances and the issue was fixed.
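To confirm disk exhaustion before resizing, a quick check on the instance (standard tools, nothing Beanstalk-specific):

df -h /                        # free space on the root volume
sudo du -sh /var/log /var/app  # common growth spots on Beanstalk instances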
For me, the problem was my VPC security group rules. According to the docs, you need to allow outbound traffic on port 123 for NTP to work. I had the port closed, so the clock was drifting, and the EC2 instances were becoming unresponsive to commands from the Elastic Beanstalk environment, taking forever to deploy (only to time out), failing to get logs, etc.
Thanks to @Logan Pickup for the hint in the comments.
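For completeness, a sketch of the missing egress rule (the security group ID is a placeholder); NTP is UDP on port 123:

aws ec2 authorize-security-group-egress --group-id sg-0123456789abcdef0 \
  --protocol udp --port 123 --cidr 0.0.0.0/0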

ColdFusion multi instance installation

We have a single ColdFusion 10 Tomcat server on Apache with a multi-instance configuration. We have 4 instances: cfusion and cfusion2 serve visitors in a single cluster, and the other two are a Search instance and a Scheduled Job instance. We have a couple of questions.
Question 1: It does not seem to be obeying the cluster rule and moving clients from instance to instance. cfusion is clearly on; cfusion2 seems to be stopped, but I seem to have no way of getting it restarted. What am I doing wrong there?
Question 2: How can we be assured that scheduled jobs will only run from the Scheduled Job instance and not be touched by the other 3 instances? I don't seem to be able to delete or disable their reference on the other instances.