How to create a Stack in AWS via Terraform?

My goal is to be able to create a 'stack' in AWS, i.e. a grouping of related resources that I can update and change using Terraform.
I've been attempting to read the documentation but I'm a little confused as to how I could accomplish this in terraform.
I understand the concept of possibly writing modules which are reusable, but I'm used to dealing with CF stacks when using AWS.
Is there an idiomatic way to do this in Terraform? It seems that the concept of a stack is abstracted away somewhat, i.e. if I want to get an output from a resource, e.g. an RDS URL, I can reference that in the Terraform code and it will be evaluated and determined at runtime, rather than reading a CF stack output value in AWS?
Is this correct?

From what I understand, you want to replicate the idea of a "stack" in Terraform and get a grip on the underlying concepts.
There are a great number of resources for seeing example stacks; take a look at the official Terraform AWS examples to get a feel for the notation.
You're describing modules etc., which are best practice, but start small. Add a simple piece of infrastructure to your main.tf file and then build on that.
The best way to learn will be through doing, but take it at a steady pace.
And yes, you can reference your resources directly. Terraform evaluates those references when it builds the plan, and any resource dependencies will be created in the correct order.
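As a minimal sketch of what that might look like in a main.tf (the resource names and argument values here are illustrative, not taken from your setup):

```hcl
provider "aws" {
  region = "us-east-1"
}

variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "example" {
  identifier          = "example-db"      # illustrative name
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "exampleuser"
  password            = var.db_password
  skip_final_snapshot = true
}

# Instead of reading a CloudFormation stack output, reference the attribute directly.
output "rds_address" {
  value = aws_db_instance.example.address
}
```

Any other resource that interpolates aws_db_instance.example.address picks up an implicit dependency, so Terraform orders the creation for you, much like a Ref/GetAtt inside a single CloudFormation stack.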

Related

AWS CDK multi stack or single stack

I use CDK to deploy a Lambda function (along with an IAM role & a queue) and monitoring resources for the Lambda, the Lambda log group and the queue. What I have right now is basically 2 classes: 1 class to create all the Lambda-related resources and another to create the monitoring resources, and they are all added into 1 deployment stack.
Recently I deployed this to a new account and I realized my stack fails to create because some of the monitoring stuff is looking for the Lambda log group and can't find it, since it's not created yet.
So which is the better option:
have 2 deployment groups, 1 for Lambda-related resources and 1 for monitoring resources
use dependencies to create some ordering in my stack.
Both seem like possible solutions, but which is the better long-term solution?
Assuming you mean a Stack for each of your two classes, then you are better off making them both cdk.NestedStacks and instantiating them in a single common stack. You can then expose constructs as class attributes in one stack and pass them as parameters into the other. Of course, this only works one way - if you have to go both ways you need to re-evaluate how you have your stacks organized.
The advantage of doing this is great: exposing constructs as attributes is the best practice, as it gives you direct access to each construct before it creates the CloudFormation data for it. You have complete access to every part of that construct, from various ARNs (like DynamoDB stream ARNs, which are difficult to import) to automatically knowing the layer versions for Lambda layers - among many other things.
In addition, you never run into a stack dependency: if they are different top-level stacks and you share constructs between them, you can easily run into lock situations where attempting to change something in one stack creates a dependency lock and prevents the stack from deploying.
The downside is that they are all part of one deployment, so there is a potential for something to be updated when you didn't expect it to - though CDK does use the CloudFormation change set system, so it should not update things that have no changes applied to them (but sometimes changes occur because of the way CDK generates tokens and such that you may not be aware of).
If you do not go this route you are stuck using the various from* methods on CDK constructs to import the existing construct into your stack. This causes some issues, as it can't import everything about a given construct at synth time (layer versions and DynamoDB stream ARNs are two notable ones I mentioned already). Plus, you need to know the name of the construct - and best practice says you shouldn't deliberately name your constructs, so you can easily spin up ad hoc versions of your app without naming issues.

Understanding where to begin with batch processing on AWS

I have a set of calculations that needs to run in a batch, and the workload is easily parallelized across machines. The work is already packaged in a Docker container. I'm trying to understand the easiest way for me to run this workload in a highly parallel way on AWS. However, in trying to figure out where to begin, I'm having trouble finding the right entry point. I read about AWS Batch and AWS Fargate, but each time I try to go down one of those paths to learn about them in more detail, more AWS services start popping up (Lambdas, Step Functions, ECS, Auto Scaling groups), with each article having a different combination. Furthermore, I start thinking about the problem as a Batch vs. Fargate problem, and then I find another article that talks about Batch + Fargate, or X + ECS + ....
I'm having trouble finding the appropriate introduction to the choices so I can get started with setting something up and getting some experience. Any pointers on which direction I might go or some resources for me to look at?
AWS containers services team member here. Your question hits all my buttons, because I have been working on a deliverable to address some of this confusion ("where do I start with xyz?"). I can try to answer your question briefly here, but if you want to read more (perhaps way more than you'd need) feel free to contact me offline (mreferre at amazon dot com will work).
First and foremost, it's not a "vs" but an "and". Think of all these products you mention as being distributed at different layers of the stack (there is a draft visual of this in the deliverable):
Fargate represents capacity (where your container runs), ECS represents a core container orchestrator, and Batch is one of the provisioners on top of the container orchestrator. Lambda is something separate that lives on its own. The options for your specific use case seem to be:
Lambda
ECS/Fargate
Batch/ECS/Fargate
Step Functions/ECS/Fargate (this one is outside of my analysis and you don't see it in my visual - I'm wondering if I should add it).
As others have hinted you probably want to use Lambda if your model is event-driven (e.g. if you want to fire up a dedicated function for every event like a new file uploaded to S3).
You probably do not want to use a naked ECS/Fargate solution because it would require more work to deal with the triggering and the scheduling of your batch jobs.
You probably want to use either Batch or Step Functions to schedule jobs on ECS/Fargate. I'd argue SF is good if you have basic workflows to deal with, and Batch if you need to manage complex jobs at scale. Perhaps the 35-minute presentation that I did last year can provide a bit more background on these Batch vs. SF differences.
Let me know if you have any additional questions because this discussion is super useful for the positioning I am trying to build.
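To make the Batch/ECS/Fargate layering a bit more concrete, here is a rough Terraform sketch of a managed Batch compute environment backed by Fargate with a job queue on top. The names, capacities, subnet and security group IDs are placeholders, and the exact argument names may differ between AWS provider versions:

```hcl
# Fargate provides the capacity; Batch sits on top as the job scheduler.
resource "aws_batch_compute_environment" "fargate" {
  compute_environment_name = "parallel-calculations"   # illustrative name
  type                     = "MANAGED"

  compute_resources {
    type               = "FARGATE"
    max_vcpus          = 64
    subnets            = ["subnet-xxxxxxxx"]           # placeholder
    security_group_ids = ["sg-xxxxxxxx"]               # placeholder
  }
}

# Jobs submitted to this queue are placed onto the Fargate capacity above.
resource "aws_batch_job_queue" "default" {
  name                 = "parallel-calculations-queue"
  state                = "ENABLED"
  priority             = 1
  compute_environments = [aws_batch_compute_environment.fargate.arn]
}
```

A job definition pointing at your existing Docker image, plus job submissions (from the CLI, Batch array jobs, or Step Functions), would then sit on top of this, which is exactly the layering described above.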

Is there a way to discover VMs using Terraform?

Infrastructure team members are creating, deleting and modifying resources in a GCP project using the console. The security team wants to scan the infra and check whether proper security measures are taken care of.
I am trying to create a Terraform script which will:
1. Take a project ID as input and list all instances of the given project.
2. Loop over all the instances and check if the security controls are in place.
3. If any security control is missing, the Terraform script will modify the resource (VM).
I have to repeat the same steps for all resources available in the project, like subnets, cloud storage buckets, firewalls, etc.
As per my initial investigation, to do such a task we will have to import the resources into Terraform using the "terraform import" command and after that will have to think about loops.
Now it looks like using the GCP APIs is the best fit for this task, as it seems Terraform is not a good choice for this kind of task, and I am not sure whether it is even achievable using Terraform.
Can somebody provide any directions here?
Curious if by "console" you mean the GCP console (i.e. by hand), because if you are not already using Terraform to create the resources (and do not plan to in the future), then Terraform is not the correct tool for what you're describing. I'd actually argue it increases the complexity.
Mostly because:
The import feature is not intended for this kind of use case and we still find regular issues with it. Maybe one time for a few resources, but not for entire environments, and not without it becoming the future source of truth. Projects such as terraforming do their best but still face wild-west issues in complex environments. Not all resources even support importing.
Terraform will not tell you anything about the VMs that you wouldn't know from the GCP CLI already. If you need more information to make an assessment about the controls then you will need to use another tool or some complicated provisioners. Provisioners at best would end up being a wrapper around other tooling you could probably use directly.
Honestly, I'm worried your team is trying to avoid the pain of converting older practices to IaC. It's uncomfortable and challenging, but yields better fruit in the long run than the path you're describing.
I digress; if you do have infra created via Terraform then I'd invest more time in some other practices that can accomplish the same results. Some other options are: 1) enforce best practices via parent modules that security has "blessed" (a sketch of this follows below), 2) implement some CI on your Terraform, 3) AWS has Config and Systems Manager; I'm not sure if GCP has an equivalent but I would look around. Also it's worth evaluating different technologies for different layers of abstraction. What checks your OS might be different from what checks your security groups, and that's OK. Knowing is half the battle and might make for a saner first version than automatic remediation.
With or without Terraform, there is an ecosystem of both products and open-source projects that can help with compliance or control enforcement. Take a look at tools like InSpec, Sentinel, or SaltStack for inspiration.
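As a rough illustration of option 1, a "blessed" parent module can hard-code the security team's non-negotiable settings so consumers cannot forget them. The module path, variable names and the specific controls below are hypothetical examples, not recommendations for your environment:

```hcl
# modules/secure-bucket/main.tf - hypothetical security-blessed wrapper
variable "name" {
  type = string
}

variable "location" {
  type    = string
  default = "US"
}

resource "google_storage_bucket" "this" {
  name     = var.name
  location = var.location

  # Controls the security team always requires, baked into the module
  uniform_bucket_level_access = true
  public_access_prevention    = "enforced"
}
```

Application teams then consume the secure-bucket module instead of writing google_storage_bucket directly, and CI (option 2) can reject plans that bypass it.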

How to Deploy Lambdas from one code base?

After doing some brief research, I'm getting conflicting answers regarding best practices for the AWS Lambda service. I'm writing a few microservices for my company that will automate the steps for adding clients to our various services: creating API keys, uploading documents to a repo, sending an email, etc.
I have copied and pasted my code for 3 Lambdas now (only changing a few variable values), but before I start doing this for all of them, I wanted to ask if anyone had an easier method. I do know about proxy integration, where I could use the same Lambda for similar requests and differentiate them by their resource paths; however, is there an easier way I could "map" the Lambdas to shared code?
I was thinking about using an S3 object to hold the code and then changing the variables via environment variables (which could very well work), but does anyone have any other recommendations or obvious solutions I'm not seeing?
Thanks!
There is a very recent feature called Lambda Layers that specifically allows you to share code between AWS Lambda functions.
You would build the common code as a library and deploy it as a Layer. Then each individual Lambda function would include that Layer.
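For instance, if you manage the functions with Terraform (the tool discussed at the top of this page), the shared library becomes one layer that every function references, with per-function differences pushed into environment variables. The names, file paths and runtime below are purely illustrative:

```hcl
# The common code, zipped once and published as a layer
resource "aws_lambda_layer_version" "shared" {
  layer_name          = "client-onboarding-shared"   # illustrative name
  filename            = "build/shared_layer.zip"     # hypothetical build artifact
  compatible_runtimes = ["python3.12"]
}

# Each function stays tiny: it includes the layer and differs only in config
resource "aws_lambda_function" "create_api_keys" {
  function_name = "create-api-keys"                  # illustrative name
  filename      = "build/create_api_keys.zip"        # hypothetical build artifact
  handler       = "handler.main"
  runtime       = "python3.12"
  role          = aws_iam_role.lambda_exec.arn       # assumed to be defined elsewhere
  layers        = [aws_lambda_layer_version.shared.arn]

  environment {
    variables = {
      TARGET_SERVICE = "api-keys"                    # the per-Lambda variation
    }
  }
}
```

The same pattern is available through the console, SAM, or CDK; the key point is that the shared code lives in exactly one place.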

AWS CloudFormation vs. Web Console?

I'm trying to understand the real-world usefulness of AWS CloudFormation. It seems to be a way of describing AWS infrastructure as a JSON file, but even then I'm struggling to understand what benefits that serves (besides potentially "recording" your infrastructure changes in VCS).
What purpose do CloudFormation's JSON files serve? What benefits do they have over using the AWS web console and making changes manually?
CloudFormation gives you the following benefits:
You get to version control your infrastructure. You have a full record of all changes made, and you can easily go back if something goes wrong. This alone makes it worth using.
You have a full and complete documentation of your infrastructure. There is no need to remember who did what on the console when, and exactly how things fit together - it is all described right there in the stack templates.
In case of disaster you can recreate your entire infrastructure with a single command, again without having to remember just exactly how things were set up.
You can easily test changes to your infrastructure by deploying separate stacks, without touching production. Instead of having permanent test and staging environments you can create them automatically whenever you need to.
Developers can work on their own, custom stacks while implementing changes, completely isolated from changes made by others, and from production.
It really is very good, and it gives you both more control, and more freedom to experiment.
First, you seem to underestimate the power of tracking changes in your infrastructure provisioning and configuration in VCS.
Provisioning and editing your infrastructure configuration via the web interface is usually a very lengthy process. Having the configuration in a file, versus having it spread across multiple web dashboards, gives you much-needed perspective and an overall view of what you use and how it is configured. Also, when you repeatedly configure similar stacks, you can re-use the code and avoid errors or mistakes.
It's also important to note that AWS CloudFormation resources frequently lag behind the development of services available in the AWS Console. CloudFormation also requires some know-how and time to get used to, but in the end the benefits prevail.