Amazon Linux 2 instances won't appear in Systems Manager - amazon-web-services

I think I've done everything listed as a pre-req for this, but I just can't get the instances to appear in Systems Manager as managed instances.
I've picked an AMI which I believe should have the agent installed by default.
ami-032598fcc7e9d1c7a
PS C:\Users\*> aws ec2 describe-images --image-ids ami-032598fcc7e9d1c7a
{
    "Images": [
        {
            "ImageLocation": "amazon/amzn2-ami-hvm-2.0.20200520.1-x86_64-gp2",
            "Description": "Amazon Linux 2 AMI 2.0.20200520.1 x86_64 HVM gp2",
I've also created my own role, and included the following policy which I've used previously to get instances into Systems Manager.
Finally I've attached the role to the instances.
I've got Systems Manager set to a 30 min schedule and have waited this out, but the instances don't appear. I've clearly missed something here; I'd appreciate suggestions as to what.
Does the agent use some sort of backplane to communicate, or should I have enabled some sort of communication with base in the security groups?
Could this be because the instances have private IPs only? Previous working examples had public IPs, but I don't want that for this cluster.

Besides the instance role for the EC2 instances, SSM also needs to be able to assume a role to securely run commands on the instances. You have only done the first step. All of the steps are described in the AWS documentation for SSM.
However, I strongly recommend you use the Quick Setup feature in Systems Manager to set up everything for you in no time!
In AWS Console:
Go to Systems Manager
Click on Quick Setup
Leave all the defaults
In the Targets box at the bottom, select Choose instances manually and tick your ec2 instance(s)
Finish the setup
It will automatically create the AmazonSSMRoleForInstancesQuickSetup role, assign it to the selected EC2 instance(s), and also create the proper AssumeRole for SSM
Go to the EC2 Console, find the EC2 instance(s), right-click and reboot by choosing Instance State > Reboot
Wait for a couple of minutes
Refresh the page and try to connect via the Session Manager tab
Notes:
It's totally fine and recommended to create your ec2 instances in private subnets if you don't need them to be accessed from internet. However, make sure the private subnet has internet access itself via NAT. It's a hidden requirement of SSM!
Some of the Amazon Linux 2 images, like amzn2-ami-hvm-2.0.20200617.0-x86_64-gp2, do not have a working SSM Agent pre-installed. If the steps above didn't work, recreate your instance from a different AMI and try again.
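If the instance still doesn't show up after the reboot, a quick way to verify registration is to query SSM and check the agent service on the box. This is only a sketch; it assumes a configured AWS CLI and that the agent service is named amazon-ssm-agent, as it is on Amazon Linux 2:
# From your workstation: list instances that have registered with SSM
aws ssm describe-instance-information \
    --query 'InstanceInformationList[].[InstanceId,PingStatus,PlatformName]' \
    --output table
# On the instance itself: confirm the agent is installed and running
sudo systemctl status amazon-ssm-agent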

Could this be because the instances have private IPs only? Previous working examples had public IPs, but I don't want that for this cluster.
If you place your instance in a private subnet (or in a public subnet but without a public IP), then the SSM Agent can't reach the SSM service, and thus can't register with it.
There are two solutions to this issue:
Set up VPC interface endpoints for Systems Manager in the private subnet. With these, your instances will be able to connect to the SSM service without internet access.
Create a public subnet with a NAT gateway/instance, and set up route tables to route internet traffic from the private subnets to the NAT gateway. This way your private instances will be able to access the SSM service over the internet through the NAT device.
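For the first option, a rough sketch with the AWS CLI (all IDs and the region are placeholders; SSM needs interface endpoints for the ssm, ec2messages, and ssmmessages services, and the endpoint security group must allow inbound HTTPS from the instances):
# Create the three interface endpoints SSM needs (placeholder IDs and region)
for svc in ssm ec2messages ssmmessages; do
    aws ec2 create-vpc-endpoint \
        --vpc-id vpc-0123456789abcdef0 \
        --vpc-endpoint-type Interface \
        --service-name "com.amazonaws.eu-west-1.${svc}" \
        --subnet-ids subnet-0123456789abcdef0 \
        --security-group-ids sg-0123456789abcdef0 \
        --private-dns-enabled
done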

Related

EC2 Instance doesn't become managed after installing SSM Agent

I've installed SSM Agent (2.2.607.0) on a Windows Server 2012 R2 Standard instance with EC2Config (4.9.2688.0). After installing it, I cannot see the server on the Managed Instances screen. I did the same steps on other servers (Windows and Linux) and it worked.
I tried uninstalling EC2Config and reinstalling it, with no luck. I also tried installing a different SSM Agent version (2.2.546.0), with no luck either.
Any thoughts?
The agent is installed, but the instance still needs the proper role to communicate with the systems manager. Particularly this step of Configuring Access to Systems Manager.
By default, Systems Manager doesn't have permission to perform actions on your instances. You must grant access by using an IAM instance profile. An instance profile is a container that passes IAM role information to an Amazon EC2 instance at launch.
You should review the whole configuration guide and make sure you have configured all required roles appropriately.
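For reference, a minimal sketch of building such an instance profile with the CLI; the role and profile names are made up, and I'm assuming the AmazonSSMManagedInstanceCore managed policy (older guides use AmazonEC2RoleforSSM) is sufficient for your case:
# Trust policy so EC2 can assume the role
cat > ec2-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ec2.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF
aws iam create-role --role-name MySSMInstanceRole \
    --assume-role-policy-document file://ec2-trust-policy.json
aws iam attach-role-policy --role-name MySSMInstanceRole \
    --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
# Wrap the role in an instance profile and attach it to the instance
aws iam create-instance-profile --instance-profile-name MySSMInstanceProfile
aws iam add-role-to-instance-profile --instance-profile-name MySSMInstanceProfile \
    --role-name MySSMInstanceRole
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=MySSMInstanceProfile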
I had this problem, and of the four troubleshooting areas (SSM Agent, IAM instance role, service endpoint connectivity, target operating system type), it turned out that the problem was endpoint connectivity.
My VPC, subnet, route table, and internet gateway all looked correct (and were identical to another instance which was being managed by SSM). But the instance didn't have a public IP, and without one it couldn't reach the SSM service through the Internet Gateway (and there was no VPC endpoint to use instead). Adding a public IP allowed the instance to connect to SSM and become managed.
Extra complication: I was trying to use EC2 Image Builder, which creates an instance without a public IP. So there is no way to use Image Builder in a VPC whose only route to SSM is an Internet Gateway.
Newer SSM Agent versions come with a diagnostic tool. You can run it to see which prerequisite is missing.
https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-cli.html
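For example (assuming an agent version recent enough to bundle ssm-cli), run this on the instance:
# Checks connectivity, instance profile, proxy settings, etc. and reports pass/fail
sudo ssm-cli get-diagnostics --output table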

AWS ECS Error when running task: No Container Instances were found in your cluster

I'm trying to deploy a Docker container image to AWS using ECS, but the EC2 instance is not being created. I have scoured the internet looking for an explanation as to why I'm receiving the following error:
"A client error (InvalidParameterException) occurred when calling the RunTask operation: No Container Instances were found in your cluster."
Here are my steps:
1. Pushed a docker image FROM Ubuntu to my Amazon ECS repo.
2. Registered an ECS Task Definition:
aws ecs register-task-definition --cli-input-json file://path/to/my-task.json
3. Ran the task:
aws ecs run-task --task-definition my-task
Yet, it fails.
Here is my task:
{
    "family": "my-task",
    "containerDefinitions": [
        {
            "environment": [],
            "name": "my-container",
            "image": "my-namespace/my-image",
            "cpu": 10,
            "memory": 500,
            "portMappings": [
                {
                    "containerPort": 8080,
                    "hostPort": 80
                }
            ],
            "entryPoint": [
                "java",
                "-jar",
                "my-jar.jar"
            ],
            "essential": true
        }
    ]
}
I have also tried using the management console to configure a cluster and services, yet I get the same error.
How do I configure the cluster to have ec2 instances, and what kind of container instances do I need to use? I thought this whole process was to create the EC2 instances to begin with!!
I figured this out after a few more hours of investigating. Amazon, if you are listening, you should state this somewhere in your management console when creating a cluster or adding instances to the cluster:
"Before you can add ECS instances to a cluster you must first go to the EC2 Management Console and create ecs-optimized instances with an IAM role that has the AmazonEC2ContainerServiceforEC2Role policy attached"
Here is the rigmarole:
1. Go to your EC2 Dashboard, and click the Launch Instance button.
2. Under Community AMIs, Search for ecs-optimized, and select the one that best fits your project needs. Any will work. Click next.
3. When you get to Configure Instance Details, click on the create new IAM role link and create a new role called ecsInstanceRole.
4. Attach the AmazonEC2ContainerServiceforEC2Role policy to that role.
5. Then, finish configuring your ECS instance. NOTE: If you are creating a web server you will want to create a security group that allows access to port 80.
After a few minutes, when the instance is initialized and running, you can refresh the ECS Instances tab of the cluster you are trying to add instances to.
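To confirm the instance actually joined, you can also check from the CLI (this assumes the default cluster name and a configured AWS CLI):
# Should return one container instance ARN per registered EC2 instance
aws ecs list-container-instances --cluster default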
I ran into this issue when using Fargate. I fixed it by explicitly setting launchType="FARGATE" when calling run_task.
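For reference, the AWS CLI equivalent looks roughly like this (cluster name, task definition, and network IDs are placeholders; Fargate tasks also need an awsvpc network configuration):
aws ecs run-task \
    --cluster my-cluster \
    --launch-type FARGATE \
    --task-definition my-task \
    --network-configuration 'awsvpcConfiguration={subnets=[subnet-0123456789abcdef0],securityGroups=[sg-0123456789abcdef0],assignPublicIp=ENABLED}'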
Currently, the AWS web console can automatically create instances with the correct AMI and the correct name so they'll register with the correct cluster.
Even though all instances were created by Amazon with the correct settings, my instances wouldn't register. On the AWS forums I found a clue: it turns out that your container instances need internet access, and if your private VPC does not have an internet gateway, they won't be able to connect.
The fix
In the VPC dashboard you should create a new Internet Gateway and connect it to the VPC used by the cluster.
Once attached, you must update (or create) the route table for the VPC and add the following as the last line:
0.0.0.0/0 igw-24b16740
Where igw-24b16740 is the ID of your freshly created internet gateway.
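If you prefer the CLI, the same change looks roughly like this (VPC and route table IDs are placeholders):
# Create and attach an internet gateway, then add the default route
IGW_ID=$(aws ec2 create-internet-gateway \
    --query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --internet-gateway-id "$IGW_ID" \
    --vpc-id vpc-0123456789abcdef0
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 \
    --destination-cidr-block 0.0.0.0/0 --gateway-id "$IGW_ID"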
Other suggested checks
Selecting the suggested AMI which was specified for the given region solved my problem.
To find the right AMI, check Launching an Amazon ECS Container Instance.
By default, all EC2 instances are added to the default cluster, so the name of the cluster also matters.
See point 10 at Launching an Amazon ECS Container Instance.
More information available in this thread.
Just in case someone else is blocked with this problem as I was...
I've tried everything here and it didn't work for me.
Besides what was said here regarding the EC2 instance role, as commented here, in my case it only worked after I also configured the EC2 instance with one simple piece of information, using an initial User Data script like this:
#!/bin/bash
cat <<'EOF' >> /etc/ecs/ecs.config
ECS_CLUSTER=quarkus-ec2
EOF
Setting the name of the related ECS cluster in this ecs.config file resolved my problem. Without this config, the ECS agent log on the EC2 instance showed an error that it was not possible to connect to ECS; after adding it, the EC2 instance became visible to the ECS cluster.
After doing this, the EC2 instance showed up as available in my ECS cluster.
The AWS documentation said that this part is optional, but in my case, it didn't work without this "optional" configuration.
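A quick way to see which cluster the agent actually joined is the agent's local introspection endpoint (assuming the default port 51678):
# Run on the container instance; shows Cluster, ContainerInstanceArn and agent Version
curl -s http://localhost:51678/v1/metadata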
When this happens, you need to look at the following:
Your EC2 instances should have a role with the AmazonEC2ContainerServiceforEC2Role managed policy attached to it
Your EC2 instances should be running an ECS-optimized AMI (you can check this in the EC2 dashboard)
If your instances in private subnets don't have public IPs assigned, you need either an interface VPC endpoint or a NAT gateway set up
Most of the time, this issue appears because of a misconfigured VPC. According to the documentation:
QUOTE: If you do not have an interface VPC endpoint configured and your container instances do not have public IP addresses, then they must use network address translation (NAT) to provide this access.
To create a VPC endpoint: follow the documentation here
To create a NAT gateway: follow the documentation here
These are the reasons why you don't see the EC2 instances listed in the ECS dashboard.
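As a sketch, the NAT route for the private subnets can be wired up like this with the CLI (subnet and route table IDs are placeholders; the NAT gateway itself must live in a public subnet, and it takes a few minutes to become available):
# Allocate an Elastic IP, create the NAT gateway, then route private traffic through it
ALLOC_ID=$(aws ec2 allocate-address --domain vpc --query 'AllocationId' --output text)
NAT_ID=$(aws ec2 create-nat-gateway --subnet-id subnet-PUBLIC \
    --allocation-id "$ALLOC_ID" --query 'NatGateway.NatGatewayId' --output text)
aws ec2 create-route --route-table-id rtb-PRIVATE \
    --destination-cidr-block 0.0.0.0/0 --nat-gateway-id "$NAT_ID"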
If you have come across this issue after creating the cluster
Go to the ECS instance in the EC2 instances list and check the IAM role that you have assigned to the instance. You can identify the instances easily, as the instance name starts with ECS Instance.
After that, click on the IAM role and it will direct you to the IAM console. Attach the AmazonEC2ContainerServiceforEC2Role policy from the permission policy list and save the role.
Your instances will be available in the cluster shortly after you save it.
The real issue is lack of permission. As long as you create and assign an IAM role with the AmazonEC2ContainerServiceforEC2Role policy attached, the problem goes away.
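For example, a one-line sketch with the CLI (the role name is a placeholder; the ARN is the AWS managed policy, as far as I know):
aws iam attach-role-policy \
    --role-name ecsInstanceRole \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role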
I realize this is an older thread, but I stumbled on it after seeing the error the OP mentioned while following this tutorial.
Changing to an ecs-optimized AMI image did not help. My VPC already had a route 0.0.0.0/0 pointing to the subnet. My instances were added to the correct cluster, and they had the proper permissions.
Thanks to #sanath_p's link to this thread, I found a solution and took these steps:
Copied my Autoscaling Group's configuration
Set IP address type under the Advanced settings to "Assign a public IP address to every instance"
Updated my Autoscaling Group to use this new configuration.
Refreshed my instances under the Instance refresh tab.
Another possible cause that I ran into was updating my ECS cluster AMI to an "Amazon Linux 2" AMI instead of an "Amazon Linux AMI", which caused my EC2 user_data launch script to not work.
For an instance image other than the ECS-optimized one, please do the steps below (a sketch for Amazon Linux 2 follows after them):
Install the ECS Agent (see the ECS Agent download link)
Add the following line to /etc/ecs/ecs.config:
ECS_CLUSTER=REPLACE_YOUR_CLUSTER_NAME
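A minimal sketch of those steps on Amazon Linux 2 (assuming the amazon-linux-extras repository is available; other distributions install the agent differently):
# Install the ECS agent, point it at your cluster, then enable and start it
sudo amazon-linux-extras install -y ecs
echo "ECS_CLUSTER=REPLACE_YOUR_CLUSTER_NAME" | sudo tee -a /etc/ecs/ecs.config
sudo systemctl enable --now ecs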
The instances in the VPC will need to be able to communicate with ECR.
To do this, the security group attached to the instances will need an outbound rule allowing traffic to 0.0.0.0/0.

Why can't my ECS service register available EC2 instances with my ELB?

I've got an EC2 launch configuration that uses the ECS-optimized AMI. I've got an auto scaling group that ensures that I've got at least two available instances at all times. Finally, I've got a load balancer.
I'm trying to create an ECS service that distributes my tasks across the instances in the load balancer.
After reading the documentation for ECS load balancing, it's my understanding that my ASG should not automatically register my EC2 instances with the ELB, because ECS takes care of that. So, my ASG does not specify an ELB. Likewise, my ELB does not have any registered EC2 instances.
When I create my ECS service, I choose the ELB and also select the ecsServiceRole. After creating the service, I never see any instances available in the ECS Instances tab. The service also fails to start any tasks, with a very generic error of ...
service was unable to place a task because the resources could not be found.
I've been at this for about two days now and can't seem to figure out what configuration settings are not properly configured. Does anybody have any ideas as to what might be causing this to not work?
Update # 06/25/2015:
I think this may have something to do with the ECS_CLUSTER user data setting.
In my EC2 auto scaling launch configuration, if I leave the user data input completely empty, the instances are created with an ECS_CLUSTER value of "default". When this happens, I see an automatically-created cluster, named "default". In this default cluster, I see the instances and can register tasks with the ELB like expected. My ELB health check (HTTP) passes once the tasks are registered with the ELB and all is good in the world.
But, if I change that ECS_CLUSTER setting to something custom I never see a cluster created with that name. If I manually create a cluster with that name, the instances never become visible within the cluster. I can't ever register tasks with the ELB in this scenario.
Any ideas?
I had similar symptoms but ended up finding the answer in the log files:
/var/log/ecs/ecs-agent.2016-04-06-03:
2016-04-06T03:05:26Z [ERROR] Error registering: AccessDeniedException: User: arn:aws:sts::<removed>:assumed-role/<removed>/<removed> is not authorized to perform: ecs:RegisterContainerInstance on resource: arn:aws:ecs:us-west-2:<removed>:cluster/MyCluster-PROD
status code: 400, request id: <removed>
In my case, the resource existed but was not accessible. It sounds like OP is pointing at a resource that doesn't exist or isn't visible. Are your clusters and instances in the same region? The logs should confirm the details.
In response to other posts:
You do NOT need public IP addresses.
You do need: the ecsServiceRole or equivalent IAM role assigned to the EC2 instance in order to talk to the ECS service. You must also specify the ECS cluster, which can be done via user data during instance launch or in the launch configuration definition, like so:
#!/bin/bash
echo ECS_CLUSTER=GenericSericeECSClusterPROD >> /etc/ecs/ecs.config
If you fail to do this on newly launched instances, you can do this after the instance has launched and then restart the service.
In the end, it ended up being that my EC2 instances were not being assigned public IP addresses. It appears ECS needs to be able to directly communicate with each EC2 instance, which would require each instance to have a public IP. I was not assigning my container instances public IP addresses because I thought I'd have them all behind a public load balancer, and each container instance would be private.
Another problem that might arise is not assigning a role with the proper policy to the Launch Configuration. My role didn't have the AmazonEC2ContainerServiceforEC2Role policy (or the permissions that it contains) as specified here.
You definitely do not need public IP addresses for each of your private instances. The correct (and safest) way to do this is to set up a NAT Gateway and attach that gateway to the route table that is associated with your private subnet.
This is documented in detail in the VPC documentation, specifically Scenario 2: VPC with Public and Private Subnets (NAT).
It might also be that the ECS agent creates a file in /var/lib/ecs/data that stores the cluster name.
If the agent first starts up with the cluster name of 'default', you'll need to delete this file and then restart the agent.
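A rough sketch of that reset on Amazon Linux 2 (assuming the agent runs as the ecs systemd service and keeps its state under /var/lib/ecs/data):
# Stop the agent, wipe its cached state (including the stored cluster name), restart it
sudo systemctl stop ecs
sudo rm -rf /var/lib/ecs/data/*
sudo systemctl start ecs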
There were several layers of problems in our case. I will list them out so it might give you some idea of the issues to pursue.
My goal was to have 1 ECS host. But ECS forces you to have 2 subnets in your VPC, each with 1 Docker host instance. I was trying to have just 1 Docker host in 1 availability zone and could not get it to work.
Then the other issue was that only one of the subnets had an internet-facing gateway attached to it, so the other one was not accessible from the public internet.
The end result was that DNS was serving 2 IPs for my ELB, and one of the IPs would work while the other did not. So I was seeing random 404s when accessing the load balancer using the public DNS.

Cannot access instances of openstack on AWS

I have OpenStack (single node) installed on an AWS instance. I can log into the OpenStack dashboard and I'm also able to spawn instances, but I'm not able to connect to those instances via SSH or ping them.
In the security group setting section, I've allowed all types of protocols for the instances.
SSH into the AWS instance first; from there you can access the OpenStack instances.
If you need your spawned instances to be available over the public internet, you need to manage your public (floating) IPs.

How can I tell if an EC2 instance is inside my VPC?

My client has many EC2 instances running, and a VPC (virtual private cloud) running.
I'm using a platform called Starcluster to launch nodes, and I need to know if they're in the VPC or just ordinary EC2 nodes. How can I do that?
Amazon's VPC console at this address:
https://console.aws.amazon.com/vpc/home?region=us-east-1
shows:
1 VPC
3 Running Instances
but some of those running instances are non-VPC instances, as far as I know. Hints?
You can see it in the AWS Console.
When you select an instance in the EC2 Instances screen, you can see a bunch of fields under the Description tab. Look for a field called "VPC ID". If there is no value for that field, it is not in a VPC.
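If you prefer the CLI, something along these lines will list each instance with its VPC (the VpcId column is empty for EC2-Classic instances that aren't in a VPC):
aws ec2 describe-instances \
    --query 'Reservations[].Instances[].[InstanceId,VpcId]' \
    --output table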