Running Docker on a Data Pipeline EC2 instance - amazon-web-services

I am having issues running Docker (I cannot connect to it after the service is started) on an EC2 instance launched as an Ec2Resource with a ShellCommandActivity in AWS Data Pipeline.
Basically, part of my pipeline spins up an Ec2Resource that performs a shell command: the command installs Docker (successfully, it seems) and then starts the service (again it reports OK), but the user then cannot connect to the Docker daemon, as if it isn't running.
Has anyone got this working before?
Can I, or should I, be using a different AMI? (I'm running in Sydney, AU.)
Your help would be most appreciated!

Okay, solved: use an HVM AMI rather than the default PV (paravirtual) one; that fixed it right away!
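For reference, a minimal sketch of what the relevant pipeline objects might look like with an HVM AMI pinned explicitly on the Ec2Resource; the AMI ID, instance type, pipeline ID, and command below are placeholders, not the original poster's values:

# Hypothetical sketch: pin an HVM AMI on the Ec2Resource via its imageId field.
# Replace ami-xxxxxxxx with an HVM AMI available in ap-southeast-2 (Sydney).
cat > pipeline-objects.json <<'EOF'
{
  "objects": [
    {
      "id": "MyEc2Resource",
      "type": "Ec2Resource",
      "imageId": "ami-xxxxxxxx",
      "instanceType": "m3.medium",
      "terminateAfter": "2 Hours"
    },
    {
      "id": "InstallAndRunDocker",
      "type": "ShellCommandActivity",
      "runsOn": { "ref": "MyEc2Resource" },
      "command": "sudo yum install -y docker && sudo service docker start && sudo docker info"
    }
  ]
}
EOF

# Push the definition (pipeline ID is a placeholder):
aws datapipeline put-pipeline-definition \
    --pipeline-id df-XXXXXXXXXXXX \
    --pipeline-definition file://pipeline-objects.json

If imageId is left out, Data Pipeline falls back to its default AMI for the region, which appears to be where the PV/HVM mismatch came from.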

Related

Docker Service not running on EC2 when launched in an AWS auto-scaling group?

Currently I have a fresh Ubuntu image with just Docker installed on it. I tested this image to make sure my Docker container works, and everything is fine.
I created a golden AMI from this, then created a launch template and added the following to the user data:
#!/bin/bash
sudo docker pull command
sudo docker run command
My issue is that the commands will not run. From the other testing I have done, it seems the Docker service is not yet running when these commands are executed. When I SSH onto the EC2 instance and check the Docker service, it is running, and I can run the Docker commands manually and they work.
Any idea why the Docker service wouldn't be running when the instance boots up in the Auto Scaling group?
I have tried putting the commands in a bash script and running that from the user data, but it's the same thing.
I even added a sleep to give the Docker service time to come up before running the commands, but still the same result.
Any help would be greatly appreciated.
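One pattern that may help (a sketch only; the image name is a placeholder, and user data already runs as root so sudo isn't strictly needed) is to enable the service and then block until the daemon actually answers before pulling and running:

#!/bin/bash
# Hypothetical sketch: make sure the Docker daemon is up before using it.
systemctl enable --now docker

# Wait until the daemon responds, giving up after roughly two minutes.
for i in $(seq 1 60); do
  docker info >/dev/null 2>&1 && break
  sleep 2
done

docker pull my-image:latest    # placeholder image
docker run -d my-image:latest

On Ubuntu the output of the user data script ends up in /var/log/cloud-init-output.log, which should show whether the pull failed because the daemon wasn't ready or for some other reason (missing registry credentials, for example).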

Gitlab-runner autoscale not registering

I am new to GitLab and following this guide:
https://docs.gitlab.com/runner/configuration/runner_autoscale_aws/
It might be out of date?
There are a couple of issues:
When IdleCount is zero, the GitLab runner with docker-machine does not automatically create an instance when I submit a job. I had to set IdleCount to 1 to get the runner to create an instance.
When I run gitlab-runner in debug mode, it keeps showing builds=0, even though the GitLab shared runners execute the jobs, so the build count is not really zero. I'm using group shared runners, by the way.
docker-machine uses Ubuntu 16.04 as the default AMI, and spinning up an instance with it fails completely.
I had to point docker-machine at an Ubuntu 18.04 or 20.04 AMI instead. With that it spins up an instance and completes, but it does not register the gitlab-runner. I logged into the new instance: gitlab-runner is not installed and no Docker container is running on the machine.
Questions:
Has anybody used this guide recently?
Is AWS tested, or should we use GCP like the GitLab shared runners?
docker-machine is no longer supported upstream, but I understood GitLab would still continue supporting it?
I was thinking of building a solution around Lambda functions that create GitLab runners, but there seems to be no way to view the pending job queue in GitLab? Any suggestions?
Thanks in advance!
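In case it helps to compare notes, here is a minimal sketch of the config.toml a docker+machine runner might use, with the AMI pinned explicitly; the token, region, AMI ID, and instance type are placeholders, not values from the guide:

# Hypothetical sketch of /etc/gitlab-runner/config.toml for docker+machine autoscaling.
# This overwrites the file, so only use something like this on a fresh runner manager.
sudo tee /etc/gitlab-runner/config.toml >/dev/null <<'EOF'
concurrent = 4
check_interval = 0

[[runners]]
  name     = "aws-autoscale"
  url      = "https://gitlab.com/"
  token    = "RUNNER_TOKEN"                  # placeholder: token from gitlab-runner register
  executor = "docker+machine"
  [runners.docker]
    image = "alpine:latest"
  [runners.machine]
    IdleCount     = 0
    IdleTime      = 1800
    MachineDriver = "amazonec2"
    MachineName   = "gitlab-docker-machine-%s"
    MachineOptions = [
      "amazonec2-region=ap-southeast-2",
      "amazonec2-ami=ami-xxxxxxxx",          # pin an 18.04/20.04 AMI instead of the 16.04 default
      "amazonec2-instance-type=m5.large"
    ]
EOF
sudo gitlab-runner restart

As for builds=0: if the jobs are untagged, the shared runners may simply be picking them up first, so it can be worth tagging the autoscaled runner and the jobs (or disabling shared runners for the project) to confirm whether your runner is ever assigned any work.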

Elastic Beanstalk Deployment TaskStoppedBeforePullBeginError

I am experiencing an issue deploying a multicontainer Elastic Beanstalk app which appears to be undocumented by AWS. The error reads "client: TaskStoppedBeforePullBeginError: Task stopped before image pull could begin for task:", where client is my container name. I am using the deprecated Multicontainer Docker environment, as the latest Docker environment on Amazon Linux 2 is undocumented and had a lot of issues when I tried it.
Things I have tried:
verified that the image is pullable and runnable from my local machine (i.e. the image runs without immediately exiting)
used eb-cli to verify that my Dockerrun.aws.json configuration is working by running it all locally with eb local run (roughly as sketched below)
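For concreteness, the local checks above amount to something like this (the image name is a placeholder):

# Hypothetical sketch of the local verification steps described above.
docker pull myregistry/client:latest       # placeholder image
docker run --rm myregistry/client:latest   # confirm it does not exit immediately

# Validate the Dockerrun.aws.json configuration locally with the EB CLI
eb local run

Since the Multicontainer Docker platform runs on ECS under the hood, the ECS agent log on the instance (usually under /var/log/ecs/ in the full log bundle) often says more about why a task stopped before the pull than the Beanstalk activity log does.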
I honestly have no idea how I could go about further troubleshooting the issue. Any help here would be much appreciated. Here are the last 100 lines of logs produced by Elastic Beanstalk.

AWS OpsWorks deploy runs forever after instances are joined to a domain

I have a custom Chef 12.2 script to run the deployment on my OpsWorks stack; the deploy and run-recipe commands used to work great on the instances (Windows custom AMI, Windows Server 2012).
Since migrating the instances to a domain, nothing seems to work.
The OpsWorks agent is running on the instances; I'm not sure what else to look at to solve the issue.
Any suggestions on how I can investigate and solve this?
Note: a reboot issued from OpsWorks does reboot the instance.
Whenever an OpsWorks command runs for a long time, use the
opsworks-agent-cli show_log
command to get the current activity log of the OpsWorks agent. If it's stuck inside some loop forever, you can get its PID and kill it abruptly.
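A rough sketch of that flow on an instance; the PID is a placeholder, and the agent CLI behaves somewhat differently on Windows instances:

# Hypothetical sketch of the troubleshooting flow described above (Linux syntax).
# Show the log of whatever the agent is currently doing:
sudo opsworks-agent-cli show_log

# List the commands the agent has run recently, to see which one is stuck:
sudo opsworks-agent-cli list_commands

# If a Chef run is looping forever, find its process and kill it:
ps aux | grep chef
kill -9 1234    # placeholder PID taken from the ps output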

CodeDeploy with AWS ASG

I have configured an AWS ASG using Ansible to provision new instances and then install the CodeDeploy agent via a "user_data" script, in a similar fashion to what is suggested in this question:
Can I use AWS code Deploy for pulling application code while autoscaling?
CodeDeploy works fine and I can install my application onto the ASG once it has been created. When new instances are triggered in the ASG via one of my rules (e.g. high CPU usage), the CodeDeploy agent is installed correctly. The problem is that CodeDeploy does not install the application on these new instances. I suspect it is trying to run before the user_data script has finished. Has anyone else encountered this problem, or know how to get CodeDeploy to automatically deploy the application to new instances spawned as part of the ASG?
Auto Scaling tells CodeDeploy to start the deployment before the user data is started. To get around this, CodeDeploy gives the instance up to an hour to start polling for commands for the first lifecycle event, instead of the usual 5 minutes.
Since you are having problems with automatic deployments but not manual ones, and assuming that you didn't make any manual changes to your instances that you forgot about, there is most likely a dependency specific to your deployment that is not yet available at the time the instance launches.
Try listing out all the things your deployment needs to succeed and make sure each of them is available before you install the host agent. If you can log onto the instance fast enough (before Auto Scaling terminates it), grab the host agent logs and your application's logs to find out where the deployment is failing.
If you think the host agent is failing to install entirely, make sure you have Ruby 2.0 installed. It should be there by default on Amazon Linux, but Ubuntu and RHEL need it installed as part of the user data before you can install the host agent. There is an installer log in /tmp that you can check for problems with the initial install (again, you have to be quick to grab it before the instance terminates).
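For what it's worth, a minimal sketch of a user_data script for Ubuntu that installs the deployment's dependencies before the host agent, along the lines of the advice above; the regional S3 bucket in the installer URL is a placeholder, and any application-specific dependencies go where the comment indicates:

#!/bin/bash
# Hypothetical sketch: install everything the deployment needs, then the CodeDeploy agent.
set -euxo pipefail

apt-get update
apt-get install -y ruby wget        # the host agent requires Ruby on Ubuntu

# Install any other dependencies your deployment scripts rely on here,
# so they already exist before the agent starts polling for commands.

# Install and start the CodeDeploy host agent (bucket region is a placeholder).
cd /tmp
wget https://aws-codedeploy-us-east-1.s3.amazonaws.com/latest/install
chmod +x ./install
./install auto
service codedeploy-agent status || service codedeploy-agent start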