Using a custom AMI (with s3cmd) in a Datapipeline - amazon-web-services

How can I install s3cmd on an AMI that is used in the pipeline?
This should be a fairly basic thing to do, but I can't seem to get it done.
Here's what I've tried:
Started a Pipeline without the Image-id option => Everything works fine
Navigated to EC2 and created an Image of the running Instance, to make sure everything needed to run in the pipeline is installed on my custom AMI
Started this AMI manually as an Instance
SSH'd into the machine and installed s3cmd
Created another Image of the machine, this time with s3cmd installed
Shut down the Instance
Started the Pipeline again, this time with the newly created AMI (with s3cmd installed) as Image-id
Now the Resource starts "RUNNING", but my Activity (ShellCommandActivity) is stuck in the WAITING_FOR_RUNNER state and the script never executes.
What do I have to do to get the pipeline running with a custom image? Or is there even an easier way to use s3cmd in a pipeline?
Thank you!

I figured it out now: I used a "clean" Amazon Linux AMI (from the marketplace, for example) and installed s3cmd on it, rather than creating an AMI out of a running Pipeline Resource. I noticed that the resource-derived AMI had a different kernel version, so this could have been the problem.
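The fix above can be scripted so the custom AMI is reproducible. A minimal sketch, assuming a clean Amazon Linux base and that pip is the chosen install route (package names vary between Amazon Linux generations):

```shell
#!/bin/bash
# Hypothetical provisioning sketch for a clean Amazon Linux instance:
# install s3cmd before snapshotting the instance into the pipeline AMI.
set -e
yum update -y
# s3cmd may not be in the default repos; installing via pip is one option.
yum install -y python3-pip || yum install -y python-pip
pip3 install s3cmd || pip install s3cmd
# Sanity-check the install before creating the image.
s3cmd --version
```

After this runs, create the image from the instance and pass its id as the pipeline's Image-id, as described in the steps above.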

Related

Gitlab-runner autoscale not registering

I am new to GitLab and am following this guide:
https://docs.gitlab.com/runner/configuration/runner_autoscale_aws/
It might be out of date?
There are a couple of issues:
When IdleCount is zero, the GitLab runner with docker-machine does not automatically create an instance when I submit a job. I had to set IdleCount to 1 to get the runner to create an instance.
When I run gitlab-runner in debug mode, it keeps showing builds=0, even though the GitLab shared runners execute the jobs, so builds is not zero. I'm using group shared runners, by the way.
docker-machine uses Ubuntu 16.04 as the default AMI. When spinning up, it fails completely.
I had to point docker-machine at an Ubuntu 18.04 or 20.04 AMI. It then spins up an instance and completes, but it does not register the gitlab-runner. I logged into the new instance: gitlab-runner is not installed and no Docker container is running on the machine.
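The AMI and idle-count workarounds described above live in the runner's `config.toml` under the `[runners.machine]` section. A hedged sketch of that fragment, written to a temporary path for illustration (the real file is `/etc/gitlab-runner/config.toml`; the AMI id, instance type, and region are placeholders):

```shell
#!/bin/bash
# Sketch: append a [runners.machine] section pinning the AMI instead of
# relying on docker-machine's Ubuntu 16.04 default. Values are placeholders.
cat >> /tmp/config.toml <<'EOF'
  [runners.machine]
    IdleCount = 1
    MachineDriver = "amazonec2"
    MachineOptions = [
      "amazonec2-ami=ami-0123456789abcdef0",
      "amazonec2-instance-type=m5.large",
      "amazonec2-region=us-east-1",
    ]
EOF
grep -q "amazonec2-ami" /tmp/config.toml && echo "ami option written"
```

Restarting gitlab-runner after editing the real config picks up the new machine options.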
Questions:
Has anybody used this guide recently?
Is AWS tested, or should we use GCP like the GitLab shared runners?
docker-machine is no longer supported, but I understood GitLab would still continue supporting it?
I was thinking of building a solution around Lambda functions that create GitLab runners, but there is no way to view the pending-jobs queue in GitLab? Any suggestions?
Thanks in advance!

Dynamically update AMI

I have a question regarding AWS: I have an AMI with Windows Server installed, IIS installed, and a site up and running.
My Auto Scaling group always maintains two instances created from this AMI.
However, whenever I need to change something on the site, I have to launch a new instance, make the changes, update the AMI, and update the Auto Scaling group, which is quite time-consuming.
Is there any way to automate this by linking to a Git repository?
This is more of a CI/CD job than something achieved purely in AWS.
You can set up a CI/CD pipeline that detects any update in SCM (Git) and triggers a build job (Jenkins or a similar tool), which produces an artifact. You can then deploy the artifact to the respective application server using CD tools (Ansible, or even Jenkins or similar tools), whichever suits your infrastructure. In the deploy script itself, you can call the EC2 service to create a new AMI once the deployment is complete.
You need a set of tools to achieve this: an SCM webhook/poll, Jenkins, and Ansible.
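The "create a new AMI once deployment is complete" step above can be sketched with the AWS CLI. This is a hedged example, not the answerer's exact method: the instance id, launch template name, and region are placeholders, and it assumes the ASG uses a launch template rather than a launch configuration.

```shell
#!/bin/bash
# Hypothetical post-deploy step: bake an AMI from the freshly deployed
# instance and register it as a new launch template version for the ASG.
set -euo pipefail
INSTANCE_ID="i-0123456789abcdef0"   # placeholder
STAMP=$(date +%Y%m%d%H%M%S)

# Create the image; --no-reboot avoids downtime but risks FS inconsistency.
AMI_ID=$(aws ec2 create-image \
  --instance-id "$INSTANCE_ID" \
  --name "iis-site-$STAMP" \
  --no-reboot \
  --query ImageId --output text)

aws ec2 wait image-available --image-ids "$AMI_ID"

# Point the Auto Scaling group's launch template at the new AMI.
aws ec2 create-launch-template-version \
  --launch-template-name "iis-site" \
  --source-version '$Latest' \
  --launch-template-data "{\"ImageId\":\"$AMI_ID\"}"
```

New instances launched by the ASG after this step come up from the updated image.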

Creating image of the running instance using Packer

I launched an instance in AWS using Terraform with basic functionality. Once the instance is launched, I need to capture that instance into an image using Packer.
How can I accomplish this?
Packer is used to make customised AMIs, but if the image is already running and customised, then an AMI can be made with standard AWS tools.
Once the instance is running and configured with Terraform, take an AMI image of it. See this documentation: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/creating-an-ami-ebs.html
OTOH, if you want to automate the whole process of making the AMI, then add the steps that currently configure the base AMI (what you run Terraform for) to the Packer script itself.
Once you have a Packer script like this, the whole process is automatic: running packer build on your Packer config file should do everything.
Edit: your comments below hint that what you may actually want is to use Packer on an already-running instance to make an AMI, perhaps with further configuration added on top. To do this, first make an AMI of the running instance; there are instructions for doing this in the link above.
Next, write your Packer script so that it takes the AMI id of the new image as a parameter. Then you can run the Packer script with the new AMI as input, and end up with an AMI made by Packer but based on the running instance.
Sorry it took a while to add this but I can't imagine a use case where this would be a useful thing to do
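The two-step flow in the edit above (snapshot the running instance, then feed the result to Packer) can be sketched like this. The instance id and template filename are placeholders, and it assumes the Packer template declares a `source_ami` variable:

```shell
#!/bin/bash
# Hypothetical sketch: capture the running Terraform-launched instance into
# an AMI, then hand that AMI to Packer as the source for further baking.
set -euo pipefail

SOURCE_AMI=$(aws ec2 create-image \
  --instance-id "i-0123456789abcdef0" \
  --name "terraform-base-$(date +%s)" \
  --query ImageId --output text)

aws ec2 wait image-available --image-ids "$SOURCE_AMI"

# template.pkr.hcl is assumed to declare: variable "source_ami" { type = string }
# and to use var.source_ami in its amazon-ebs source block.
packer build -var "source_ami=$SOURCE_AMI" template.pkr.hcl
```

This keeps Terraform responsible for the running instance and Packer responsible for producing the final image.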

Codedeploy with AWS ASG

I have configured an AWS ASG using Ansible to provision new instances and then install the CodeDeploy agent via a "user_data" script, in a similar fashion as suggested in this question:
Can I use AWS code Deploy for pulling application code while autoscaling?
CodeDeploy works fine and I can install my application onto the asg once it has been created. When new instances are triggered in the ASG via one of my rules (e.g. high cpu usage), the codedeploy agent is installed correctly. The problem is, CodeDeploy does not install the application on these new instances. I suspect it is trying to run before the user_data script has finished. Has anyone else encountered this problem? Or know how to get CodeDeploy to automatically deploy the application to new instances which are spawned as part of the ASG?
AutoScaling tells CodeDeploy to start the deployment before the user data is started. To get around this CodeDeploy gives the instance up to an hour to start polling for commands for the first lifecycle event instead of 5 minutes.
Since you are having problems with automatic deployments but not manual ones and assuming that you didn't make any manual changes to your instances you forgot about, there is most likely a dependency specific to your deployment that's not available yet at the time the instance launches.
Try listing out all the things that your deployment needs to succeed and make sure that each of those is available before you install the host agent. If you can log onto the instance fast enough (before AutoScaling terminates the instance), you can try and grab the host agent logs and your application's logs to find out where the deployment is failing.
If you think the host agent is failing to install entirely, make sure you have Ruby 2.0 installed. It should be there by default on Amazon Linux, but Ubuntu and RHEL need to have it installed as part of the user data before you can install the host agent. There is an installer log in /tmp that you can check for problems in the initial install (again, you have to be quick to grab the log before the instance terminates).
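Putting the advice above together, a user-data sketch for Ubuntu that installs the Ruby dependency before the CodeDeploy host agent might look like this (the region in the bucket URL is a placeholder following the documented `aws-codedeploy-<region>` naming convention):

```shell
#!/bin/bash
# Hypothetical user-data sketch for Ubuntu: install the agent's Ruby
# dependency *before* installing the CodeDeploy host agent.
set -e
apt-get update
apt-get install -y ruby wget

# Fetch the agent installer from the regional CodeDeploy bucket.
cd /tmp
wget https://aws-codedeploy-us-east-1.s3.us-east-1.amazonaws.com/latest/install
chmod +x ./install
./install auto

# Confirm the agent is running; its install log lands in /tmp.
service codedeploy-agent status
```

Any application-specific dependencies the deployment needs should be installed in the same script, before the agent starts polling for commands.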

Stuck in WAITING_FOR_RUNNER while using an AMI for EC2 Resource

I was able to successfully run Data pipeline for predefined templates. I wanted to use a customized AMI for my EC2 Resource as I require some libraries and packages to be installed.
I also have to run a python script as a part of the process.
So, I have created a Base Image of EC2 Resource with all packages required and the code that has to be run.
As part of my activity, I trigger a shell command activity, where I execute the python script as the command that has to be run.
The EC2 resource comes up successfully based on the customized AMI that I specified. I am able to log in to that machine using the key pair I specified, but the Activity gets stuck in the WAITING_FOR_RUNNER state.
I am not sure how to solve this problem. Please let me know if there are better ways to fix it. Am I missing some basic step in trying to use an EC2 resource from an AMI?
Use Amazon Linux when creating your custom AMI and it will resolve this issue. That OS comes preinstalled with the tools Data Pipeline uses to communicate with the instance.
Are you running in a VPC or EC2Classic? I had the same problem when running in a VPC. When I checked run.out on the EC2 instance, I saw an error message:
Error in custom provider, java.lang.RuntimeException: java.net.UnknownHostException: . . . "
The TaskRunner was not able to resolve its own hostname, and was failing to start.
I solved this by setting the "DNS hostnames" setting to yes on my VPC in the VPC console. By default on new accounts it is set to no. This resolved the issue.
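The console change described above can also be made from the AWS CLI; the VPC id below is a placeholder:

```shell
#!/bin/bash
# Enable DNS hostnames on the VPC so the TaskRunner can resolve its own
# hostname. The VPC id is a placeholder.
aws ec2 modify-vpc-attribute \
  --vpc-id vpc-0123456789abcdef0 \
  --enable-dns-hostnames '{"Value": true}'

# Verify the attribute took effect.
aws ec2 describe-vpc-attribute \
  --vpc-id vpc-0123456789abcdef0 \
  --attribute enableDnsHostnames
```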
I realize this is old, but if you are using a custom AMI with runsOn specified, you should make sure your custom AMI has all dependencies installed: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-ami.html