I have an AMI that I am trying to put on AWS Marketplace. During this process, Amazon scans the AMI for potential security vulnerabilities. The scan found several in my AMI.
How can I fix them?
Do I:
Delete the current AMI, go back to the EC2 instance from which the AMI was created, make the changes, and create a new AMI?
Or can I somehow launch an instance from the current AMI, SSH into it, and make the necessary changes?
The best practice is to build a repeatable process for creating your AMIs from a base operating system image (typically Amazon Linux, Ubuntu, etc.). The reason is that you have many more updates ahead of you:
You might not succeed at fixing the identified issues completely to Amazon's satisfaction
Future scans may find new, different issues
AWS Marketplace staff will manually check some things with your AMI
You might find your own bugs
You will eventually want to deploy new software versions
Yes, you could launch an instance from your image, modify it, and create a new AMI from it. It might be worth doing that to learn something about the AMI scanning criteria.
But you would not make any progress towards a reliable, repeatable image building flow. I strongly recommend looking into tools like Packer that can help you automate the AMI building process.
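To make that concrete, here is a minimal Packer sketch of a repeatable build: start from the latest Amazon Linux 2 base image, apply OS security patches, install your product, and produce a freshly named AMI on every run. The region, instance type, filter values, and provisioner commands are illustrative assumptions, not part of the original answer.

```hcl
packer {
  required_plugins {
    amazon = {
      version = ">= 1.0.0"
      source  = "github.com/hashicorp/amazon"
    }
  }
}

locals {
  # e.g. 20240101120000 - gives every build a unique AMI name
  timestamp = regex_replace(timestamp(), "[- TZ:]", "")
}

source "amazon-ebs" "marketplace_product" {
  region        = "us-east-1"        # assumption: pick your build region
  instance_type = "t3.small"
  ssh_username  = "ec2-user"
  ami_name      = "my-product-${local.timestamp}"

  # Always start from the latest official Amazon Linux 2 base image.
  source_ami_filter {
    filters = {
      name                = "amzn2-ami-hvm-*-x86_64-gp2"
      virtualization-type = "hvm"
      root-device-type    = "ebs"
    }
    owners      = ["amazon"]
    most_recent = true
  }
}

build {
  sources = ["source.amazon-ebs.marketplace_product"]

  provisioner "shell" {
    inline = [
      "sudo yum update -y",   # apply outstanding security patches
      # ...install and configure your product here...
    ]
  }
}
```

With something like this checked into source control, responding to a failed scan becomes "adjust the provisioner and re-run packer build" rather than hand-patching a running instance, and the same flow covers future security findings and new software versions.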
Related
I wanted to ask more experienced cloud users: I am thinking about deploying my applications on EC2 machines using AMIs. Each new release is a new AMI containing the application artifacts, built from a base image, and each EC2 instance is replaced on deploy.
Is this a bad practice? Are there any problems or vulnerabilities that could occur with this approach? I don't see any drawbacks apart from the long deployment time.
It's not a bad practice. A lot of vendors these days create their own AMIs and share them with their clients. Creating the AMI is not the hard part: you can always start an instance from the previous AMI, update it, and call the AWS API to create a new AMI from the instance once you have finalized it.
You will, however, want to automate the tasks involved, as it would be cumbersome to manually update your code, install security updates, and do any cleanup you need every time you rebuild the image.
Deployment is a different story. The problem there is that the AMI ID changes with every build, and you need a way to update it for whatever is launching the instances. You could tag your AMIs and build logic that always filters on the tag and picks the latest one when choosing the AMI ID.
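As a rough sketch of that tag-and-pick-the-latest idea in Terraform (the tag key, tag value, and launch template settings are placeholders, not from the original answer):

```hcl
# Look up the newest AMI we own that carries the application tag.
data "aws_ami" "app" {
  owners      = ["self"]
  most_recent = true

  filter {
    name   = "tag:Application"
    values = ["my-app"]
  }
}

# Whatever launches the instances references the data source, not a fixed ID.
resource "aws_launch_template" "app" {
  name_prefix   = "my-app-"
  image_id      = data.aws_ami.app.id
  instance_type = "t3.small"
}
```

Each new build only has to apply the same tag; the next terraform apply then picks up the newest AMI without anyone editing an AMI ID by hand.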
I have a Linux EC2 instance hosting some legacy tools that I need to move to another AWS account. We hoped to do this by making an AMI, sharing it with the other account, bringing up the new instance from the AMI, etc.
The problem: the instance was built on an AWS Marketplace image of Debian, so I must opt in to the terms and conditions (and subscribe) in order to use the image. However, it is Debian 8 (https://aws.amazon.com/marketplace/pp/prodview-5pgbnftzmrgec), which is no longer offered in the Marketplace. Since the base image is no longer offered, I cannot opt in.
Is it possible to just upgrade/update the source instance to Debian 9 or 10 (both are still offered in the marketplace) so that I will be able to accept the T&C? Or is there some way to tell the AMI itself to use Debian 9 instead?
If not, I am looking at an old style file-based migration, and I was really hoping not to have to get into the guts of this server (it's a legacy integration) just yet. (It's on the to-do list, I just really wanted to get the migration to our AWS account finished first.)
I found this related question, but the suggested answer does not work: I can create and share the snapshot, and even create a volume, but I cannot attach/mount the volume without "accept[ing] terms and subscrib[ing]" to the underlying product (Debian 8).
Can't export a EC2 AMI to another account because the AWS Marketplace OS is obsolete
Thanks for any advice!
I am developing a cloud solution. I have no experience with it, so I want to ask some professionals about best practices. The current question is mostly related to Auto Scaling group functionality.
I've read a lot of how-tos and guides and came to the conclusion that the only ways to provision/configure instances in an ASG are:
to pre-bake an AMI;
to use the user_data field.
So, let's assume I have an Auto Scaling group, and I want to configure the instances it launches using, for example, chef-solo (or ansible-local, but as I understand it, Chef is the better option for AWS).
I see only two ways to do this:
Use Packer to pre-bake the image (using the chef-solo provisioner), then update the ASG configuration with the newly created AMI;
Use a base Amazon AMI and configure instances at launch using a user_data script: install chef-solo, fetch cookbooks from Git, and run chef-solo on the machine.
Which is the better choice in your opinion, and why? I am also interested in how to update already-running instances in the ASG when my Chef cookbook configuration changes.
Also, if you know of better options, please share them. I am open to discussion.
It depends on your use case.
A pre-baked AMI may be quicker to launch when scaling up, but if you need to make even small changes to the code or configuration, you'll need to bake another AMI. Using user data (whether straight OS commands, Chef, or something else) may take longer if you're installing application servers and deploying applications, and you may also be introducing external dependencies for scaling: what if the GitHub repository is offline or a necessary download is blocked?
So, if speed of scale-up is important, consider a pre-baked AMI. If you can tolerate a reasonable scale-up hit, look at a hybrid approach:
Bake the Chef DK and any other large objects you need into your AMI. For example, you might bake your application server installation into the AMI and then just have Chef configure it through user data (see the sketch after this list).
Make sure your dependencies, scripts and deployables such as WAR files are in reliable repositories such as S3.
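A rough sketch of that hybrid setup, assuming the AMI already contains Chef, the AWS CLI, and the application server, so boot only fetches cookbooks from S3 and runs a short converge (the bucket, paths, recipe name, and variables are placeholders):

```hcl
variable "prebaked_ami_id" {
  type        = string
  description = "AMI baked ahead of time with Chef, the AWS CLI, and the app server"
}

variable "private_subnet_ids" {
  type = list(string)
}

resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = var.prebaked_ami_id
  instance_type = "t3.small"

  user_data = base64encode(<<-EOF
    #!/bin/bash
    set -euo pipefail
    # Cookbooks live in S3, so scale-up has no GitHub or download dependency.
    aws s3 cp s3://my-bucket/cookbooks.tar.gz /tmp/cookbooks.tar.gz
    mkdir -p /var/chef && tar -xzf /tmp/cookbooks.tar.gz -C /var/chef
    # solo.rb and the cookbooks ship inside the tarball.
    chef-solo --chef-license accept -c /var/chef/solo.rb -o 'recipe[myapp::configure]'
  EOF
  )
}

resource "aws_autoscaling_group" "app" {
  min_size            = 1
  max_size            = 4
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }
}
```

Because the heavy installs are already in the image, the converge at boot stays short, which limits the scale-up hit mentioned above.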
The best advice is to try both approaches to get some metrics and see how these fit your use cases.
I am in a situation where I need to provision EC2 instances with some packages on startup. There are a couple of (enterprise/corporate) constraints:
I need to provision on top of a specific AMI, which adds enterprisey stuff such as LDAP/AD access and so on
These changes are intended to be used for all internal development machines
Mainly because of the second constraint, I was wondering where the best place is to put the provisioning. This is what I've come up with:
Provision in Terraform
As it states, I simply provision the necessary instances in Terraform. If I package these resources into modules, then the provisioning won't "leak out". The disadvantages:
I won't be able to add a different set of provisioning steps on top of the module?
A change in the provisioning will probably result in instances being destroyed on apply?
Provisioning takes a long time because of the packages it tries to install
Provisioning in Packer
This is based on the assumption that Packer allows you to provision on top of AMIs so that AMIs can be "extended". Also, this will only be used in AWS, so it won't necessarily use other builders. Provisioning in Packer makes the Terraform code much simpler, and terraform apply becomes faster because it just fires up a pre-built AMI.
For me, both of these methods have their place. But what I really want to know is: when do you choose Packer provisioning over Terraform provisioning?
Using Packer to create finished (or very nearly finished) images drastically shortens the time it takes to deploy new instances and also allows you to use autoscaling groups.
If you have Terraform run a provisioner such as Chef or Ansible on every EC2 instance creation, you add a chunk of time for the provisioner to run at exactly the moment you need new instances. In my opinion it's much better to do the configuration up front, using Packer to bake as much as possible into the AMI, and then use user data scripts or tools like Consul-Template to provide environment-specific differences.
Packer certainly can build on top of images and in fact requires a source_ami to be specified. I'd strongly recommend tagging your AMIs in a way that allows you to use source_ami_filter in Packer and Terraform's aws_ami data source, so that when you make changes to your AMIs, Packer and Terraform will automatically pull them in to be built on top of or deployed at the next opportunity.
I personally bake a reasonably lightweight "Base" AMI that does some basic hardening and sets up monitoring and logging that I want for all instances that are deployed and also makes sure that Packer encrypts the root volume of the AMI. All other images are then built off the latest "Base" AMI and don't have to worry about making sure those things are installed/configured or worry about encrypting the root volume.
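A hedged Packer fragment illustrating that pattern; the tag names and region are assumptions, and the "Base" image's own build is assumed to set encrypt_boot = true so that children inherit an encrypted root volume:

```hcl
locals {
  timestamp = regex_replace(timestamp(), "[- TZ:]", "")
}

source "amazon-ebs" "app" {
  region        = "us-east-1"        # assumption
  instance_type = "t3.small"
  ssh_username  = "ec2-user"
  ami_name      = "app-${local.timestamp}"

  # Always build on top of the newest hardened "Base" image we own.
  source_ami_filter {
    filters = {
      "tag:Role" = "base"
    }
    owners      = ["self"]
    most_recent = true
  }

  # Tag the result so the next layer of Packer builds (and Terraform's
  # aws_ami data source) can find the latest app image the same way.
  tags = {
    Role = "app"
  }
}
```

A matching aws_ami data source filter on the Terraform side keeps deployments pointed at the newest build without anyone editing IDs.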
By baking your configuration into the AMI, you are also able to move towards the immutable infrastructure model, which has some major benefits: you know you can always throw away an instance that is having issues and very quickly replace it with a new one. Depending on your maturity level, you could even remove access to the instances so that it's no longer possible to change anything on an instance once it has been deployed; in my experience, ad hoc changes to running instances are a major factor in operational issues.
Very occasionally you might come across something that is very difficult to bake into an AMI, and in those cases you might choose to run your provisioning scripts in a Terraform provisioner when the instance is created. Sometimes it's simply easier to move an existing process over to Terraform provisioners than to bake AMIs, but I would push to move things over to Packer where possible.
I have come across the same situation. As per my understanding:
If you bring up your EC2 instances very frequently, say two to three times a day, then go with creating a customized AMI with Packer and referencing that AMI from Terraform.
If your base image (the AMI created by Packer) changes frequently based on your requirements, then Packer is still a good fit, though for me running Packer builds is very time-consuming.
You can also get the same result without Packer: write your requirements in a script and call it from Terraform. Having everything incorporated in the Terraform configuration saves some of that time.
Finally, it's your decision, and it depends on how frequently you bring up EC2 instances.
I'm using AWS CloudFormation to set up numerous elements of network infrastructure (VPCs, security groups, subnets, Auto Scaling groups, etc.) for my web application. I want the whole process to be automated: I want to click a button and be able to fire up the whole thing.
I have successfully created a CloudFormation template that sets up all this network infrastructure. However, the EC2 instances are currently launched without any of the software they need on them. Now I'm trying to figure out how best to get that software onto them.
To do this, I'm creating AMIs using Packer.io. But some people have instead urged me to use Cloud-Init. What heuristic should I use to decide what to bake into the AMIs and/or what to configure via Cloud-Init?
For example, I want to preconfigure an EC2 instance to allow me (saqib) to log in without a password from my own laptop. Thus the EC2 instance must have a user. That user must have a home directory. And in that home directory must live a .ssh/authorized_keys file containing my public key. Should I bake these directories into the AMI? Or should I use cloud-init to set them up? And how should I decide in this and other similar cases?
I like to separate out machine provisioning from environment provisioning.
In general, I use the following as a guide:
Build Phase
Build a Base Machine Image with something like Packer, including all software required to run your application. Create an AMI out of this.
Install the application(s) onto the Base Machine Image, creating an Application Image. Tag and version this artifact. Do not embed environment-specific things here like database connections etc., as this precludes you from easily reusing this AMI across different environment runtimes.
Ensure all services are stopped
Release Phase
Spin up an environment consisting of the images and infra required, using something like CFN.
Use Cloud-Init user-data to configure the application environment (database connections, log forwarders etc.) and then start the applications/services
This approach gives the greatest flexibility and cleanly separates out the various concerns of a continuous delivery pipeline.
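As a minimal sketch of the release phase, here is the kind of Cloud-Init payload that injects environment-specific values and starts the stopped services at boot. It is shown wrapped in a Terraform resource for consistency with other examples in this thread; the same #cloud-config body would go into a CloudFormation UserData property. The user, key, file paths, hostnames, and service name are all placeholders.

```hcl
variable "application_image_id" {
  type        = string
  description = "Tagged, versioned Application Image produced in the build phase"
}

resource "aws_instance" "app" {
  ami           = var.application_image_id
  instance_type = "t3.small"

  user_data = <<-EOF
    #cloud-config
    users:
      - name: saqib                          # hypothetical user from the question
        shell: /bin/bash
        ssh_authorized_keys:
          - ssh-ed25519 AAAA...example saqib@laptop
    write_files:
      - path: /etc/myapp/env.conf            # environment-specific values only
        content: |
          DB_HOST=db.staging.internal
          LOG_FORWARDER=logs.staging.internal
    runcmd:
      - systemctl start myapp                # services were left stopped in the image
  EOF
}
```

Because nothing environment-specific was baked into the AMI, the same Application Image can be promoted unchanged from one environment to the next; only this user data differs.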
One of the important factors that determines how you should assemble servers, AMIs, and your infrastructure plan is the answer to the question: in production, how fast will I need a new instance launched?
The answer to this question will determine how much you bake into the AMI vs. how much you build after boot.
NOTE: My experience is with Chef Server, so I will use Chef terminology, but the concepts are the same for any other configuration management stack.
The general rule of thumb is to treat your "Infrastructure as Code". This means thinking about the process of launching instances, creating users on those machines, and managing authorized_keys files and SSH keys the same way you would your application code. Being able to track infrastructure changes in source code makes management, redeployment, and even CI much easier.
This Chef Introduction covers the terminology in Chef of Cookbooks, Recipes, Resources, and more. It shows you how to build a simple LAMP stack, and how you can relaunch it just as easily with one command.
So given the example in your question, at a high level I would do the following:
Launch a base Ubuntu Linux AMI (currently 14.04) with a CloudFormation template.
In the UserData section of the instance configuration, bootstrap the Chef client install process.
Run a Recipe to create a user.
Run a Recipe to create the .ssh/authorized_keys file for that user.
Tools like Chef are used because you are able to break down the infrastructure into small blocks of code performing specific functions. There are numerous Cookbooks already built and available that perform the basic building blocks of creating services, installing software packages, etc.
All that being said, there are times when you have to deviate from best practices in the interest of your specific domain and requirements. There may be situations where, given all the advantages of infrastructure management, you will still need to bake items into the AMI.
Let's pretend your application does image processing and requires ImageMagick, and assume you need to build ImageMagick from source. If you were to do this via Chef recipes, it could add another 7 minutes of ImageMagick compilation on top of the normal instance boot time. If waiting 10-12 minutes is too long for a new instance to come online, then you may want to consider baking your own AMI that has ImageMagick already compiled and installed.
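If you do bake it, the compile can live in a Packer shell provisioner so the cost is paid once at image-build time instead of on every boot. This is only a hedged sketch: the source block name is hypothetical (defined elsewhere in the template, as in the earlier examples), the download URL is the generic "latest" tarball, and a real build would pin a specific ImageMagick release and add whatever delegate libraries the application needs.

```hcl
build {
  # Hypothetical amazon-ebs source for the image-processing AMI.
  sources = ["source.amazon-ebs.image_processing"]

  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y build-essential curl",
      "curl -fsSL -o /tmp/ImageMagick.tar.gz https://imagemagick.org/archive/ImageMagick.tar.gz",
      "mkdir -p /tmp/imagemagick && tar -xzf /tmp/ImageMagick.tar.gz -C /tmp/imagemagick --strip-components=1",
      # The slow part: this runs once per AMI build, not once per instance boot.
      "cd /tmp/imagemagick && ./configure && make -j$(nproc) && sudo make install",
      "sudo ldconfig",
    ]
  }
}
```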
This is an acceptable solution, but you should keep in mind that managing your own fleet of pre-baked AMIs adds additional infrastructure overhead: you will need to keep your custom AMIs updated as new base AMIs are released, as you expand to different instance types, and as you move into different AWS Regions.