CloudFormation open source equivalent or rolling your own - amazon-web-services

Would anybody have any clues as to how AWS CloudFormation works under the hood?
Also, would anybody know an open-source equivalent to AWS CF (and I don't mean tooling that may be using CloudFormation)?
It's clearly a powerful orchestrator, but I'd be keen to explore the inner workings of such tools.

AWS Cloudformation has multiple pre-defined set of schemas for each of the components that are supported. When you upload a Cloudformation template for creating resources, it performs the below steps:
It validates the templates against the schema
It generates dynamic form for gathering parameters
It validates the values of parameters
Once it has all it needs, you can click Create to begin with the resource creation
Under the hood, it starts creation of resources using the internal coding for which is keeps echoing the status and progress continuously on the console.
We need to understand here that internally Cloudformation in itself is a product that does use AWS SDK/CLI as needed. However, under the hood, it maintains its own data to compare the attributes and resources when you run an update.
An open source alternative to this is Terraform. Terraform is the most widely used open source replacement of Cloudformation. Terraform is known for its Cloud independent architecture. Terraform works with multiple clouds with minimal changes in the templates.
The under-the-hood working of terraform involves creation of a State file/directory where it stores the current state of any stack identified uniquely by the name provided by the user. Terraform creates resources majorly using Python SDK (boto3) and some other APIs as needed. We need to pass the access key and secrets to the Terraform configuration in order to enable it to access the AWS Cloud environment.
If you are looking to build a smart new alternative, it should be fairly simple considering that AWS strictly follows standard design patterns in its SDK and CLI interface design. This makes it easier to convert template into executable code.
More information about working of Cloudformation can be found here

Related

Can aws cdk provide remote state?

Terraform has a remote stack via well documented plugins, i.e. terraform.backend.s3
https://www.terraform.io/docs/language/settings/backends/s3.html
Can aws cdk provide remote state for the stacks?
I can't find in documentation.
https://docs.aws.amazon.com/cdk/latest/guide/awscdk.pdfstack
I ask about aws cdk because I found pure documentation about aws cdktf.
Found that cloud cloudfront generates a lot of json file as well as uses it. Does the contain state?
The CDK uses CloudFormation under the hood, which manages the remote state of the infrastructure in a similar way like a Terraform state-file.
You get the benefit of AWS taking care of state management for you (for free) without the risks of doing it yourself and messing up your state file.
The drawback is that if there is drift between the state CloudFormation thinks resources are in and their actual state, things get tricky.

Documentation for AWS infrastructure as code

Recently, while trying to build a terraform IaC, I found that I couldn’t get the API Gateway to route to the Lambda properly. It turned out that when using the console AWS automatically assigns the permissions the gateway needs for the Lambda, but with IaC in terraform this must be assigned explicitly.
The above is understandable but for a newbie, to both AWS and terraform, confusing.
Is there documentation which explains the required components within an infrastructure connection, such as that above?
I know of the AWS docs and the terraform docs are particularly well thought out but none of it actually explains (as far as I’ve seen) that a certain resource is required in any particular (however common or obscure) setup. Inferring these connections from general searching is not a great replacement.
I don't think that there is a documentation that lists "all of the required components" in one single page/area. But you can get different pieces of information from different docs, and as you mentioned AWS and Terraform do both a great job at this.
Talking about AWS, in the case of permissions in API gateway, I can think of two useful links (the 1st one is referenced from the 2nd one though):
How API Gateway resource policies affect authorization workflow
Control access for invoking an API
I agree in the fact that sometimes it's a lot of guesses to translate AWS into terraform if you don't really know what you are trying to achieve. Usually when I am blocked on something that "should theoritically work" in IaC vs AWS console, I step back from the problem and try to figure out what kind of components am I really trying to glue together in AWS world. Then usually things become more obvious.
Because in terraform it's really creating small independant pieces of infrastructure and make them work together. Comparing with other IaC, in my experience it's a lot more granular than CloudFormation for instance.
A personal practice that helps me figure out things faster is to read every single intro doc of the components I am working on in Terraform. For instance, if I am writing lambda in terraform IaC, I would quickly read all the lambda_xxxx_yyyy intro parts to get less stuck and react faster when something fails. It usually works for me.
I haven't see such a documentation, but I can share my work-around for similar cases.
You can make changes you need using AWS console - manually, using UI. Then you can define resources you just created in your TF files, defining only/required required set of properties, even random values will work. Then you import what you created manually into resources you defined.
By running terraform plan you will see the differences, that will allow you to adjust your TF files accordingly.
After few iterations you will replicate what you have just done in the UI using TF. As a final test you can manually revert your changes, run terraform apply and ensure that everything works as expected.

AWS CDK VS SDK for IaC

I recently started working with AWS and IaC, I'm using Cloudformation to provision my AWS resources, but I discovered that AWS provide both a SDK and a CDK to enable you to provision resources programmatically instead of plain json/yaml.
But based on the documentation I did not really understand how they differ, can someone explain me how they differ and for what use case you should use what?
CDK: Is a framework to model and provision your infrastructure or stack. Stack can consist of a database for ex: DynamoDB, S3 Bucket, Lambda, API Gateway etc. It provides a facility to write code to create an infrastructure in AWS. Also called Infrastructure as code.
Check here
SDK: These are the code libraries provided by Amazon in various languages, like Java, Python, PHP, Javascript, Typescript etc. These libraries help interact with AWS services (like creating data in DynamoDB) which you either create through CDK or console. SDKs simplify using AWS services in your application with an API.
Check here
AWS SDK is a library primarily to ease the access to the AWS services by handling for you the data (de)serialization, credentials management, failure handling, etc. Perhaps, for specific scenarios, you could use the AWS SDK as the infrastructure as a code tool, however it could be cumbersome as it is not the intended usage of the library.
Based on the https://docs.aws.amazon.com/whitepapers/latest/develop-deploy-dotnet-apps-on-aws/infrastructure-as-code.html, dedicated tools for the IaC are AWS CloudFormation and AWS CDK.
AWS CDK is an abstraction on top of CloudFormation. CDK scripts are in fact transformed to the CloudFormation definitions when scripts are synthesized.
The difference can be best described on an example: Imagine that for each lambda function in your stack you want to create an error CloudWatch alarm and connect to the SNS topic.
With CloudFormation you will either a) need to write a pretty much similar bunch of yaml/json definitions for each lambda function to ensure the monitoring, b) use the nested stack templates, c) use CloudFormation modules.
With CDK you can write a generic code construct - class or method, which can create the alarm for the given lambda function and create the SNS alarm action for given topic.
In other words, CDK helps you generalize and re-use your IaC in a very familiar way to how you develop your business code. The code is shorter and more readable than the CF definitions.
The difference is even more remarkable when you need to set up similar resources in different AWS regions and when you have different AWS account per environment. You can manage all AWS accounts and regions with a single CDK codebase.
Some background first: CloudFormation is Amazon's solution for an “Infrastructure as Code” approach to managing the definition, provisioning and deployment of a bunch of resources across accounts/regions. This is done by using their declarative yaml/json-based template language to define it all, and then executing the templates through various means (console, cli, APIs...). More info:
white paper: https://docs.aws.amazon.com/whitepapers/latest/develop-deploy-dotnet-apps-on-aws/infrastructure-as-code.html
faq: https://aws.amazon.com/cloudformation/faqs/
There are other popular IaC solutions or tools to help achieve it more easily out there, such as Terraform and Kubernetes (container orchestration that also uses declarative templates to define desired states).
Potential benefits of IaC: At a high level, you can better track & audit your infra, reuse definitions/processes, make all your changes in a more consistent manner, faster thanks to all the automation and assurances you can get with an infra-as-code approach. You may be familiar with these as mentioned in previous answers and more, such as:
version controlling your infrastructure definitions,
more efficient and logically complex ways of constructing templates,
ability to write tests against them,
do diffs (see "change sets") before making real infra changes with the templates,
detect when live infra differs from your definitions,
automate rollbacks,
and lots of other state management assistance through a framework like CF that might be needed when performing regular ops duties.
CDK:
This is for helping to automate CloudFormation as part of an IaC approach to provisioning and deploying resources. It lets you use various popular programming languages to help with the creation, testing, and management of your CF setup. Some of AWS’s motivations: “YAML is an excellent format for describing the desired state of your cluster, but it is does not have primitives for expressing logic and reusable abstractions.“ “AWS CDK uses the familiarity and expressive power of programming languages for modeling your applications.”
 More info: https://docs.aws.amazon.com/cdk/v2/guide/home.html

However, Amazon knows about other solutions, and happily points them out on the main CDK page now, downplaying its original connection to CF. You don't need to use CloudFormation if you don't want to; specifically, they mention you can use the same CDK constructs with the help of:
cdktf for Terraform maintained by its creators, Hashicorp
cdk8s for Kubernetes by AWS. re: “We realized this was exactly the same problem our customers had faced when defining their applications through CloudFormation templates, a problem solved by the AWS Cloud Development Kit (AWS CDK), and that we could apply the same design concepts from the AWS CDK to help all Kubernetes users.”

SDK:

AWS has an API for all of their services, and the various SDKs give you access to them. For example, I can use AWS’s Java SDK to manage an API Gateway. If I wanted to script some custom deployment process, I could do so with the SDK, managing all the state, etc. myself. You could probably even re-implement the CloudFormation service with the various underlying APIs... The APIs have varying levels of documentation though. E.g. CloudFormation Java APIs are only mentioned in the raw API reference, not the friendlier Developer Guide.
I find that the difference for me is that the CDK codifies the CloudFormation JSON/YAML. First response, is great ya okay in code but the benefit on the code side of things is you can write unit testing against the code. Therefore you get to build that sense of security or insurance policy against the provisioned services in the CDK.
There are other ways to test CF, however, with a dev background, this feels more comfortable.

How to programmatically recreate resources done via AWS consoles?

I am trying to programmatically recreate a bunch of AWS resources that were created/configured manually via AWS consoles.
The AWS consoles do a lot for you.
For example, you can create a Lambda function with an Api-Gateway trigger in about 10 seconds using the AWS console.
The console is doing a lot of magic under the covers, defining and configuring resources such as policies, stages, permissions, models, etc.
In theory, CloudTrail is supposed to allow me to see what exactly is happening under the covers, but it seems to be silent in this case (i.e. Lambda function with Api-Gateway trigger).
I can play hide and seek and do extensive dumps using the CLI to list stages, policies, export api definitions, etc. etc. and look for the differences but is there an easier way? - like some way to trace the REST calls that the console is creating when it does all its magic?
Note: CloudFormer could have helped but it is only half-written software (Hey Amazon!) and only covers about a third of the resources I have defined. Does embracing Cloudformation imply not using these great time-saving consoles?
CloudFormation and other Infrastructure as code services are there to lessen the clicks you make while using AWS console or any other cloud console to manage your resources.
But these come in handy when you have to spin up resources which will be having almost the same configurations and software stack.
If you use CloudFormation you will be able to define the policies according to your need, which OS image to use, which stack to install etc. etc. it provides you minute control over your resources.
I suggest if you have to deploy these resources multiple times then create a CloudFormation template and use it.
So, I would suggest that rather than finding a way to recreate the code from your current infrastructure, create a CloudFormation template and use it for future needs.
If you need something easier than your current flow, this is it, as you just have to write your required configuration once.
Hashicorp Terraform is also a good alternative to AWS CloudFormation. You can use Terraforming to export the current infrastructure into Terraform readable files.

Does CloudFormation's "create-change-set" feature allow to separate planning from execution?

When deciding between Terraform and AWS CloudFormation, one advantage generally associated with Terraform is the separation of planning (terraform plan) and execution (terraform apply).
I recently heard about CloudFormation's create-change-set feature, which is described as:
Creates a list of changes that will be applied to a stack so that you can review the changes before executing them.
In Updating Stacks Using Change Sets from the AWS documentation, they describe the recommended update process.
Now I wonder:
Doesn't create-change-set also allow to separate planning from execution?
If not, what are the shortcomings in comparison with Terraform's plan & apply?
Background: Examples where the separation is emphasized as a significant advantage:
Official comparison on terraform.io:
Terraform also separates the planning phase from the execution phase, by using the concept of an execution plan. By running terraform plan, the current state is refreshed and the configuration is consulted to generate an action plan. The plan includes all actions to be taken: which resources will be created, destroyed or modified. It can be inspected by operators to ensure it is exactly what is expected.
[...]
Terraform lets operators apply changes with confidence, as they know exactly what will happen beforehand.
(Possibly outdated) Blog post from 2014 comparing Terraform and CloudFormation:
Infrastructure Updates: This is the absolute killer feature of Terraform. Terraform has a separate planning and execution phase. The planning phase shows which resources will be created, modified and destroyed. It gives you complete control of how your changes will affect the existing environment, which is quite crucial. This was one of the main reasons why we went ahead with Terraform. CloudFormation does not show you what changes it is going to make to the environment.
A CloudFormation Change Set is an analogous object to a Terraform plan saved in a plan file. Its core functionality is the same: inspect a report of the changes required to reach a new desired state, and then apply those changes.
When Terraform was originally created, CloudFormation did not have this feature. The documentation you have found was accurate at the time it was written, but with the release of Change Sets there is not a significant difference in this regard.
The other difference covered in the official comparison still applies, however:
Terraform similarly uses configuration files to detail the
infrastructure setup, but it goes further by being both cloud-agnostic
and enabling multiple providers and services to be combined and
composed. For example, Terraform can be used to orchestrate an AWS and
OpenStack cluster simultaneously, while enabling 3rd-party providers
like Cloudflare and DNSimple to be integrated to provide CDN and DNS
services. This enables Terraform to represent and manage the entire
infrastructure with its supporting services, instead of only the
subset that exists within a single provider. It provides a single
unified syntax, instead of requiring operators to use independent and
non-interoperable tools for each platform and service.