Terraform: Separate modules vs. one big project - amazon-web-services

I'm working on a data lake project composed of many services: a VPC (+ subnets, security groups, internet gateway, ...), S3 buckets, an EMR cluster, Redshift, Elasticsearch, some Lambda functions, API Gateway, and RDS.
We can say that some resources are "static", as they will be created only once and will not change in the future, like the VPC + subnets and the S3 buckets.
The other resources will change during the development and production lifecycle of the project.
My question is: what's the best way to structure the project?
I first started this way:
modules/
    rds/
        main.tf
        variables.tf
        output.tf
    emr/
    redshift/
    s3/
    vpc/
    elasticsearch/
    lambda/
    apigateway/
main.tf
variables.tf
This way I only have to run a single terraform apply and it deploys all the services.
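Roughly, the top-level main.tf looks something like this (module inputs and outputs are simplified and illustrative):

```hcl
# Root main.tf: one "terraform apply" here deploys every service.
# Module inputs/outputs below are illustrative assumptions.
module "vpc" {
  source = "./modules/vpc"
}

module "s3" {
  source = "./modules/s3"
}

module "rds" {
  source     = "./modules/rds"
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnet_ids
}

# ...one module block per remaining service
# (emr, redshift, elasticsearch, lambda, apigateway)
```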
The second option (I have seen some developers use it) is that each service lives in a separate folder; you then go into the folder of the service you want to deploy and run terraform apply there.
There will be 2 to 4 developers on this project, and some of us will only work on specific resources.
What strategy would you advise me to follow? Or do you have another idea or best practice?
Thanks for your help.

The way we do it is separate modules for each service, with a “foundational” module that sets up VPCs, subnets, security policies, CloudTrail, etc.
The modules for each service are as self-contained as possible. The module for our RDS cluster for example creates the cluster, the security group, all necessary IAM policies, the Secrets Manager entry, CloudWatch alarms for monitoring, etc.
We then have a deployment “module” at the top that includes the foundational module plus any other modules it needs. One deployment per AWS account, so we have a deployment for our dev account, for our prod account, etc.
The deployment module is where we set up any inter-module communication. For example, if web servers need to talk to the RDS cluster, we create a security group rule to connect the SG from the web server module to the SG from the RDS module (both modules pass back their security group ID as an output).
Think of the deployment as a shopping list of modules and stitching between them.
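For illustration, the stitching in the deployment module might look something like this (module and output names are just examples, not our actual code):

```hcl
module "rds" {
  source = "./modules/rds"
}

module "web" {
  source = "./modules/web"
}

# Let the web servers reach the RDS cluster on the MySQL port; both modules
# expose their security group IDs as outputs (assumed output names).
resource "aws_security_group_rule" "web_to_rds" {
  type                     = "ingress"
  from_port                = 3306
  to_port                  = 3306
  protocol                 = "tcp"
  security_group_id        = module.rds.security_group_id
  source_security_group_id = module.web.security_group_id
}
```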
If you are working on a module and the change is self-contained, you can do a terraform apply -target=module.modulename to change your piece without disrupting others. When your account has lots of resources this is also handy so plans and applies run faster.
P.S. I also HIGHLY recommend that you set up remote state for Terraform, stored in S3 with DynamoDB for locking. If you have multiple developers, you DO NOT want to try to manage the state file yourself; you WILL clobber each other's work. I usually have a state.tf file in the deployment module that sets up remote state.
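A minimal state.tf along those lines might look like this (bucket, key, and table names are placeholders; the S3 bucket and DynamoDB table need to exist already):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-company-terraform-state"      # placeholder bucket
    key            = "datalake/dev/terraform.tfstate"  # one key per deployment
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"                 # table used for state locking
  }
}
```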

Related

Use AWS CodeBuild to test private components of a system deployed with CodePipeline

I have a system that is automatically deployed by AWS CodePipeline. The deployment creates all resources from scratch, including the VPC.
This system has some components in private subnets, which expose an API and have no public access. They are accessible from other internal components only.
Once the deployment completes, I would like CodePipeline to start some API tests against these components.
I was planning to do it with AWS CodeBuild test action, but I have hit a roadblock there. The issue is that, to configure CodeBuild to run in a VPC, you need the VPC to exist. But my VPC does not exist yet at setup time of the pipeline.
Does anyone have a simple solution for that?
Note that I don't consider creating the VPC in advance, separately from the rest of the system, a solution. I really want my deployment to be atomic.

Extract Entire AWS Setup into storable Files or Deployment Package(s)

Is there some way to 'dehydrate' or extract an entire AWS setup? I have a small application that uses several AWS components, and I'd like to put the project on hiatus so I don't get charged every month.
I wrote / constructed the app directly through the various services' sites, such as VPN, RDS, etc. Is there some way I can extract my setup into files so I can save these files in Version Control, and 'rehydrate' them back into AWS when I want to re-setup my app?
I tried extracting pieces from Lambda and Event Bridge, but it seems like I can't just 'replay' these files using the CLI to re-create my application.
Specifically, I am looking to extract all code, settings, connections, etc. for:
Lambda. Code, env variables, layers, scheduling through Event Bridge
IAM. Users, roles, permissions
VPC. Subnets, Route tables, Internet gateways, Elastic IPs, NAT Gateways
Event Bridge. Cron settings, connections to Lambda functions.
RDS. MySQL instances. Would like to get all DDL. Data in tables is not required.
Thanks in advance!
You could use Former2. It will scan your account and allow you to generate CloudFormation, Terraform, or Troposphere templates. It uses a browser plugin, but there is also a CLI for it.
What you describe is called Infrastructure as Code. The idea is to define your infrastructure as code and then deploy your infrastructure using that "code".
There are a lot of options in this space. To name a few:
Terraform
CloudFormation
CDK
Pulumi
All of those should allow you to import already existing resources. Terraform, for instance, has an import command to bring an already existing resource into your IaC project.
This way you could create a project that mirrors what you currently have in AWS.
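For example, with Terraform you would write a resource block that matches the existing resource and then import it into state (resource and bucket names here are made up):

```hcl
# Resource block describing an already existing bucket.
resource "aws_s3_bucket" "assets" {
  bucket = "my-existing-assets-bucket"   # assumed name of the live bucket
}

# Then, on the CLI:
#   terraform import aws_s3_bucket.assets my-existing-assets-bucket
```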
Excluded are things that are, strictly speaking, not AWS resources, like:
Code of your Lambdas
MySQL DDL
Depending on the Lambda's deployment "strategy", the code is either on S3 or was uploaded directly to the Lambda service. If it is the former, you just need to find the S3 bucket and download the code from there. If it is the latter, you might need to copy and paste it by hand.
When it comes to your MySQL DDL, you need to find tools to export it, but there are plenty of tools out there that do this.
Once you have done that, you should be able to destroy all the AWS resources and later deploy them again from your new IaC.

Argo with multiple GCP projects

I've been looking into Argo as a GitOps-style CD system. It looks really neat. That said, I am not understanding how to use Argo across multiple GCP projects. Specifically, the plan is to have environment-dependent projects (i.e. prod, stage, dev). It seems like Argo is not designed to orchestrate deployment across environment-dependent clusters, or is it?
Your question is mainly about security management. You have several possibilities, with several points of view / levels of security.
1. Project segregation
The simplest and most secure way is to have Argo running in each project, with no relation/bridge between the environments. There is no security risk and no risk of deploying to the wrong project; the default project segregation (VPC and IAM roles) is sufficient.
But it implies deploying and maintaining the same app on several clusters, and paying for several clusters (dev, staging and prod CD aren't used at the same frequency).
In terms of security, you can use the Compute Engine default service account for authorization, or you can rely on Workload Identity (the preferred way).
2. Namespace segregation
The other way is to have only one project with a cluster deployed in it and a Kubernetes namespace per delivery project. This way, you can reuse the same cluster for all the projects in your company.
You still have to update and maintain Argo in each namespace, but cluster administration is easier because the nodes are the same.
In terms of security, you can use Workload Identity per namespace (and thus have one service account per namespace authorized in the delivery project) and keep the permissions segregated.
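A rough Terraform sketch of such a per-namespace Workload Identity binding could be (project IDs, namespace, and account names are assumptions):

```hcl
# One Google service account per delivery project / namespace.
resource "google_service_account" "argo_dev1" {
  project    = "delivery-dev1"      # assumed delivery project
  account_id = "argo-deployer"
}

# Allow the Kubernetes service account in the "dev1" namespace to
# impersonate it via Workload Identity.
resource "google_service_account_iam_member" "argo_dev1_wi" {
  service_account_id = google_service_account.argo_dev1.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:ci-cluster-project.svc.id.goog[dev1/argo-server]"
}
```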
Here, the trade-off is private IP access. If your deployment needs to access private IPs inside the delivery project (for testing purposes or to reach a private K8s master), you have to set up VPC peering (and you are limited to 25 peerings per project) or a Shared VPC.
3. Service account segregation
The last solution isn't recommended, but it's the easiest to maintain. You have only one GKE cluster for all the environments, and only one namespace with Argo deployed in it. Through configuration, you can tell Argo to use a specific service account to access the delivery project (with service account key files (not recommended) stored in GKE secrets or in Secret Manager, or (better) by using service account impersonation).
Here also, you have one service account authorized per delivery project. And the peering issue is the same if private IP access is required in the delivery project.

Docker for AWS vs pure Docker deployment on EC2

The purpose is production-level deployment of an 8-container application, using Swarm.
It seems (ECS aside) we are faced with 2 options:
Use the so-called docker-for-aws, which does (Swarm) provisioning via a CloudFormation template.
Set up our VPC as usual, install the Docker engines, bootstrap the swarm (via init/join etc.) and deploy our application on normal EC2 instances.
Is the only difference between these two approaches the swarm bootstrap performed by docker-for-aws?
Any other benefits of docker-for-aws compared to a normal AWS VPC provisioning?
Thx
If you need portability across different cloud providers, go with the AWS CloudFormation template provided by the Docker team. If you only need to run on AWS, ECS should be fine, but you will need to spend a bit of time figuring out how service discovery works there. The benefit of Swarm is that they made it fairly simple: you just access your services via their service name, as if they were DNS names, with built-in load balancing.
It's fairly easy to automate new environment creation with it, and if you need to go to, say, Azure or Google Cloud later, you simply use their template to get your Docker cluster ready.
The Docker team has put quite a few things into that template and you really don't want to re-create them yourself unless you have to. For instance, if you don't use static IPs for your infra (a fairly typical scenario) and one of the managers dies, you can't just restart it; you will need to manually re-join it to the cluster. Docker for AWS handles that by syncing IPs via DynamoDB and uses other provider-specific techniques to make failover/recovery work smoothly. Another example is logging: it pushes your logs automatically into CloudWatch, which is very handy.
A few tips on automating your environment provisioning if you go with the Swarm template:
Use some infra automation tool to create a VPC per environment, and use a template provided by that tool so you don't write too much yourself. Using a separate VPC makes each environment very isolated and easier to work with, with less chance of screwing something up. Also, you're likely to add more elements to those environments later, such as RDS. If you control your VPC creation it's easier to do that and keep all related resources under the same VPC; say, the DEV1 environment's DB lives in the DEV1 VPC.
Hook up the AWS CloudFormation template provided by Docker to provision a Swarm cluster within this VPC (they have a separate template for that).
My preference for automation is Terraform. It lets me describe the desired state of the infrastructure rather than how to achieve it.
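For example, a VPC per environment with the community VPC module might look roughly like this (names, CIDRs, and AZs are placeholders):

```hcl
# One isolated VPC for the DEV1 environment; repeat per environment.
module "vpc_dev1" {
  source = "terraform-aws-modules/vpc/aws"

  name = "dev1"
  cidr = "10.10.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b"]
  public_subnets  = ["10.10.101.0/24", "10.10.102.0/24"]
  private_subnets = ["10.10.1.0/24", "10.10.2.0/24"]
}
```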
I would say no, there are basically no other benefits.
However, if you want to achieve all or several of the things that the docker-for-aws template provides, I believe your second bullet point should contain a bit more, e.g.:
Logging to CloudWatch
Setting up EFS for persistence/sharing
Creating subnets and route tables
Creating and configuring elastic load balancers
Basic auto scaling for your nodes
and probably more that I do not recall right now.
The template also ingests a bunch of information about resources related to your EC2 instances and makes it readily available to all Docker services.
I have been using the docker-for-aws template at work and have grown to appreciate a lot of what it automates. And what I do not appreciate I change, with the official template as a base.
I would go with ECS over a roll-your-own solution. Unless your organization has the capacity to re-engineer the services and integrations AWS already offers, you would be artificially painting yourself into a corner for future changes. "Do not reinvent the wheel" comes to mind here.
Basically what #Jonatan states. Building the solutions to integrate what is already available is...a trial of pain when you could be working on other parts of your business / application.

How to structure Terraform so underlying AWS resources can be referenced across applications?

I'm using Terraform to create AWS ECS resources. I'm planning to deploy two applications as Docker containers that share the same ECS Cluster so that I can have multi-tenancy on the EC2 instances.
I have three separate Git repos with respective Terraform files in them:
Underlying ECS resources shared across the 2 apps - i.e. ECS Cluster, EC2 instance, security group, ASG, launch configuration, etc.
First application. As well as the source code, I have the Terraform configuration that creates the ECS Service, Task, ELB, the ELB's security group, etc., which depend on resources created by repo 1.
Second application. Similar to repo 2, with the same set of AWS resources created, leveraging the AWS resources created by repo 1.
Repos 2 and 3 depend on repo 1. To be specific, repo 1 creates the ECS Cluster, which is referenced when creating the ECS Service in repos 2 and 3. Right now I've hard-coded the ARN, but I'd like to remove the hard-coded value. Is this possible? Do I need to structure my projects and Terraform separately or differently?
Furthermore, there's actually a circular dependency, as ideally repo 1 would also know about resources created in repos 2 and 3. Specifically, I'd like the EC2 security groups (SGs) to open ports for the ELB SG. If this is also possible, that would be awesome.
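For reference, one common pattern for this kind of cross-repo reference is to expose the value as an output in repo 1 and read it via a terraform_remote_state data source in repos 2 and 3; a hedged sketch, assuming repo 1 uses an S3 backend (backend settings, bucket, key, and output names below are placeholders):

```hcl
# In repo 1: expose the cluster ARN as an output.
output "ecs_cluster_arn" {
  value = aws_ecs_cluster.shared.arn   # assumed resource name
}

# In repos 2 and 3: read repo 1's remote state instead of hard-coding the ARN.
data "terraform_remote_state" "shared" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"             # assumed state bucket
    key    = "shared-ecs/terraform.tfstate"   # assumed key used by repo 1
    region = "us-east-1"
  }
}

locals {
  ecs_cluster_arn = data.terraform_remote_state.shared.outputs.ecs_cluster_arn
}
```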