We are evaluating Terraform to (partially) replace our Ansible provisioning process for a multi-tenant SaaS, and we have come to appreciate its convenience, performance and reliability: it handles infrastructure changes (adding/removing resources) smoothly and keeps track of the infrastructure state, which is very useful.
Our application is a multi-tenant SaaS for which we provision a separate set of instances per customer - in Ansible we use our own dynamic inventory (much like the EC2 dynamic inventory). We have gone through many Terraform books, tutorials and best-practice guides, and most of them suggest managing each environment's state separately and remotely, but they all assume a static set of environments (Dev/Staging/Prod).
Is there a best practice or real-world example of managing a dynamic inventory of states for a multi-tenant application? We would like to track the state of each customer's set of instances and propagate changes to them easily.
One approach might be to create a directory per customer and place *.tf files inside that call a module hosted in a shared location. The state files could be stored in S3 so that we can propagate changes to each individual customer when needed.
Terraform works on a folder level, pulling in all .tf files (and by default a terraform.tfvars file).
We do something similar to Anton's answer but avoid some of the complexity around templating things with sed. As a basic example, your structure might look like this:
$ tree -a --dirsfirst
.
├── components
│   ├── application.tf
│   ├── common.tf
│   ├── global_component1.tf
│   └── global_component2.tf
├── modules
│   ├── module1
│   ├── module2
│   └── module3
├── production
│   ├── customer1
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   ├── customer2
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   └── global
│       ├── common.tf -> ../../components/common.tf
│       ├── global_component1.tf -> ../../components/global_component1.tf
│       └── terraform.tfvars
├── staging
│   ├── customer1
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   ├── customer2
│   │   ├── application.tf -> ../../components/application.tf
│   │   ├── common.tf -> ../../components/common.tf
│   │   └── terraform.tfvars
│   └── global
│       ├── common.tf -> ../../components/common.tf
│       ├── global_component1.tf -> ../../components/global_component1.tf
│       ├── global_component2.tf -> ../../components/global_component2.tf
│       └── terraform.tfvars
├── apply.sh
├── destroy.sh
├── plan.sh
└── remote.sh
You run your plan/apply/destroy from the root level, where the wrapper shell scripts handle things like cd'ing into the target directory and running terraform get -update=true. They also run terraform init for that folder so you get a unique state file key in S3, allowing you to track the state of each folder independently.
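A minimal sketch of what one of those wrappers might look like, assuming an S3 backend and that common.tf declares an empty backend "s3" {} block for partial configuration; the bucket name, region and argument handling here are illustrative assumptions, not part of the original setup:
#!/usr/bin/env bash
# plan.sh <environment> <location>, e.g. ./plan.sh production customer1
set -euo pipefail

ENVIRONMENT="$1"
LOCATION="$2"
STATE_BUCKET="example-terraform-state"   # hypothetical bucket name
STATE_REGION="eu-west-1"                 # hypothetical region

cd "${ENVIRONMENT}/${LOCATION}"

# Refresh modules and initialise the backend with a state key unique to this folder
terraform get -update=true
terraform init \
  -backend-config="bucket=${STATE_BUCKET}" \
  -backend-config="key=${ENVIRONMENT}/${LOCATION}/terraform.tfstate" \
  -backend-config="region=${STATE_REGION}"

terraform plan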
The above solution uses generic modules that wrap resources to provide a common interface (for example, our EC2 instances are tagged in a specific way based on some input variables and are also given a private Route53 record), and then "implemented components" built on top of them.
These components contain a set of modules/resources that Terraform applies together in the same folder. For example, we might put an ELB, some application servers and a database under application.tf; symlinking that file into a location then gives us a single place to manage with Terraform. Where a location needs different resources, those are split out into separate components. In the example above you can see that staging/global has a global_component2.tf that isn't present in production. That might be something only applied in non-production environments, such as a network control that blocks internet access to the environment.
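Adding a new customer is then mostly a matter of creating a folder, symlinking in the components and providing a tfvars file. A rough sketch (the customer name and variable names are made up for illustration):
# Create a new customer location in staging and link in the shared components
mkdir -p staging/customer3
cd staging/customer3

ln -s ../../components/application.tf application.tf
ln -s ../../components/common.tf common.tf

# Per-customer differences live only in terraform.tfvars
cat > terraform.tfvars <<'EOF'
customer_name = "customer3"
instance_type = "t2.medium"
EOF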
The real benefit here is that everything is directly viewable in source control by developers, rather than having a templating step that produces the Terraform code you actually apply.
It also keeps things DRY: the only real differences between environments live in each location's terraform.tfvars file, and it is easier to test changes before putting them live because each folder is almost identical to the others.
Your suggested approach sounds right to me, but there are a few more things you may want to consider.
Keep the original Terraform templates (_template in the tree below) as versioned artifacts (in a git repo, for example) and just pass key-value properties to be able to recreate your infrastructure. This way you will have very little copy-pasted Terraform configuration lying around in the per-customer directories.
This is how it looks:
/tf-infra
├── _global
│   └── global
│       ├── README.md
│       ├── main.tf
│       ├── outputs.tf
│       ├── terraform.tfvars
│       └── variables.tf
└── staging
    └── eu-west-1
        ├── saas
        │   ├── _template
        │   │   └── dynamic.tf.tpl
        │   ├── customer1
        │   │   ├── auto-generated.tf
        │   │   └── terraform.tfvars
        │   ├── customer2
        │   │   ├── auto-generated.tf
        │   │   └── terraform.tfvars
...
Two helper scripts are needed (a rough sketch of both follows below):
Template rendering. Use sed to generate the module's source attribute, or use a more powerful tool (as is done, for example, in airbnb/streamalert).
Wrapper script. Running terraform with -var-file=... is usually enough.
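A minimal sketch of those two helpers combined into one script, assuming the layout above; the placeholder tokens, module source URL and script name are illustrative assumptions:
#!/usr/bin/env bash
# provision-customer.sh <customer>
# Renders the customer's Terraform from the template, then runs Terraform on it.
set -euo pipefail

CUSTOMER="$1"
BASE="staging/eu-west-1/saas"
TEMPLATE="${BASE}/_template/dynamic.tf.tpl"
TARGET="${BASE}/${CUSTOMER}"

mkdir -p "${TARGET}"

# 1. Template rendering: fill in the customer name and module source with sed
#    (__CUSTOMER__ and __MODULE_SOURCE__ are hypothetical placeholder tokens)
sed -e "s|__CUSTOMER__|${CUSTOMER}|g" \
    -e "s|__MODULE_SOURCE__|git::https://example.com/terraform-modules.git//saas|g" \
    "${TEMPLATE}" > "${TARGET}/auto-generated.tf"

# 2. Wrapper: run Terraform against the rendered directory
cd "${TARGET}"
terraform init
terraform plan -var-file=terraform.tfvars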
Shared Terraform state files, as well as resources that should be global (the _global directory above), can be stored in S3 so that other layers can access them.
PS: I am very much open to comments on the proposed solution, because this is an interesting task to work on :)
I have a Django 3.1 project with the following layout:
.
├── app
│   ├── app
│   │   ├── asgi.py
│   │   ├── __init__.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   ├── core
│   │   ├── admin.py
│   │   ├── apps.py
│   │   ├── fixtures
│   │   │   ├── Client.json
│   │   │   └── DataFeed.json
│   │   ├── __init__.py
│   │   ├── migrations
│   │   │   ├── 0001_initial.py
│   │   │   ├── 0002_auto_20201009_0950.py
│   │   │   └── __init__.py
│   │   ├── models.py
│   │   └── tests
│   │       └── __init__.py
│   └── manage.py
I want to add 2 scripts to this project:
download_xml.py - checks for and downloads .xml files from external sources on a schedule (every ~30 minutes)
update_db_info.py - invoked by download_xml.py; transfers data from the downloaded XML files to the database
What is the Django best practice for organizing where this kind of script should live?
My ideas:
just create a scripts folder inside app/core and put the scripts there, invoking them with cron
run python manage.py startapp db_update
so that a new Django app is created; I would then remove the migrations, views, models etc. from it, put the scripts there and use cron again
create an app/core/management/commands folder and put the scripts there, calling them from cron with python manage.py download_xml && python manage.py update_db_info
Option 3 (mostly)
However, if download_xml.py doesn't use or rely on Django, I would put it in a scripts directory outside of the Django project (but still in source control). You might decide against that if the script needs to be deployed with your app, but it doesn't need to be a management command.
update_db_info.py definitely sounds like it would be best suited as a management command.
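If you go the management-command route, the cron side might look roughly like this (the project path, virtualenv path and 30-minute schedule are assumptions based on the question):
# Example crontab entry: every 30 minutes, download the XML feeds and then
# load them into the database using the two management commands.
*/30 * * * * cd /path/to/project/app && /path/to/venv/bin/python manage.py download_xml && /path/to/venv/bin/python manage.py update_db_info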
I am working on an AWS OU/account/infrastructure deployment automation pipeline that uses Terragrunt, Terraform and Terraform Cloud to deploy accounts to an AWS organization. I am using local execution in Terraform Cloud, so I currently run everything on my machine and only the state is stored in TFC (as opposed to S3 or GCS). Terragrunt is compatible with this state storage technique, but I have found that if I apply the resources (which works great), clean the .terragrunt-cache (find . -type d -name ".terragrunt-cache" -prune -exec rm -rf {} \;), and then plan or apply the previously created resources again, I lose sync with the remote state and Terraform wants to recreate everything. When I re-plan, Terragrunt regenerates the backend.tf file in .terragrunt-cache, and I'm wondering whether the provider is getting a new ID that doesn't match the previous provider in the existing state. One hack I am going to try is using provider aliases. According to the Terragrunt docs this should not be an issue, as the state persists and the cache can be lost and regenerated.
Any ideas as to what my issue might be? I am new to Terragrunt and am doing some initial investigation now.
My design is based on the infrastructure-live example (and corresponding modules linked in the README). It is shared by Terragrunt here: https://github.com/gruntwork-io/terragrunt-infrastructure-live-example
I am running terragrunt plan-all -refresh=true from this directory structure:
teamname
├── Makefile
├── base
│   └── terragrunt.hcl
├── deploy.hcl
├── dev
│   ├── account
│   │   └── terragrunt.hcl
│   ├── env.hcl
│   ├── iam
│   │   └── terragrunt.hcl
│   ├── regions
│   │   ├── us-east-1
│   │   │   ├── region.hcl
│   │   │   └── terragrunt.hcl
│   │   └── us-west-2
│   │       ├── region.hcl
│   │       └── terragrunt.hcl
│   └── terragrunt.hcl
└── terragrunt.hcl
I generate a state for the account (in account/), and for the infra in the root account (base/). All the Terraform modules are in a separate repo.
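One way to narrow it down (a hypothetical debugging sketch, not a confirmed fix) is to check which backend configuration Terragrunt actually generates after the cache is rebuilt and whether the state it points at still contains the resources:
cd dev/account

# Re-initialise so Terragrunt regenerates .terragrunt-cache and backend.tf
terragrunt init

# Inspect the generated backend config; if the organization/workspace name differs
# from the one used for the original apply, the plan is running against a fresh,
# empty state and will want to recreate everything.
find .terragrunt-cache -name backend.tf -exec cat {} \;

# List what is in the state that backend points at; an empty list suggests a
# backend mismatch rather than actual drift in the resources.
terragrunt state list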
I am building a project from scratch in Python 2 with the structure shown below. In the past I have created projects with a single hierarchy, so there was a single virtualenv, but this project has multiple subpackages. What is the best practice here: a single virtualenv inside the project_root directory shared by all subpackages, or a separate virtualenv for each subpackage?
project_root/
├── commons
│   ├── hql_helper.py
│   ├── hql_helper.pyc
│   ├── __init__.py
│   └── sample_HQL.hql
├── fl_wtchr
│   ├── fl_wtchr_test.py
│   ├── fl_wtchr_test.pyc
│   ├── __init__.py
│   ├── meta_table.hql
│   ├── requirements.txt
│   ├── sftp_tmp
│   ├── sql_test.py
│   └── sql_test.pyc
├── qry_exec
│   ├── act_qry_exec_script.py
│   ├── hive_db.logs
│   ├── params.py
│   └── params.pyc
├── sqoop_a
│   ├── __init__.py
│   └── sqoop.py
└── test.py
A case could be made for creating a separate virtual environment for each module, but fundamentally you want and expect all of this code to eventually be able to run without a virtualenv at all. All of your modules should be able to run with whatever you install into a single top-level virtual environment, so that is what you should primarily be testing against.
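A minimal sketch of that single shared environment, assuming Python 2.7 and the layout above (the .venv name is arbitrary; only fl_wtchr has a requirements.txt in the tree):
cd project_root/
virtualenv -p python2.7 .venv
source .venv/bin/activate

# Install every subpackage's dependencies into the one shared environment
pip install -r fl_wtchr/requirements.txt

# All subpackages then run against the same interpreter and site-packages
python test.py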
I'm trying to run a script on AWS Lambda that sends data to Google Cloud Storage (GCS) at the end. When I do so locally, it works, but when I run the script on AWS Lambda, importing the GCS client library fails (other imports work fine though). Anyone know why?
Here's an excerpt of the script's imports:
# main_script.py
import robobrowser
from google.cloud import storage
# ...generate data...
# ...send data to storage...
The error message from AWS:
Unable to import module 'main_script': No module named google.cloud
To confirm that the problem is the Google client library import, I ran a version of this script on AWS Lambda with and without the GCS import (commenting out the later references to it): with the import commented out, the script runs without import-related errors. The other imports (robobrowser) work fine at all times, both locally and on AWS.
I'm using a virtualenv with python set to 2.7.6. To deploy to AWS Lambda, I'm going through the following manual process:
zip the pip packages for the virtual environment:
cd ~/.virtualenvs/{PROJECT_NAME}/lib/python2.7/site-packages
zip -r9 ~/Code/{PROJECT_NAME}.zip *
zip the contents of the project, adding them to the same zip as above:
zip -g ~/Code/{PROJECT_NAME}.zip *
upload the zip to AWS and test using the web console
Here is a subset of the result from running tree inside ~/.virtualenvs/{PROJECT_NAME}/lib/python2.7/site-packages:
...
│
├── google
│   ├── ...
│   ├── cloud
│   │   ├── _helpers.py
│   │   ├── _helpers.pyc
│   │   ├── ...
│   │   ├── bigquery
│   │   │   ├── __init__.py
│   │   │   ├── __init__.pyc
│   │   │   ├── _helpers.py
│   │   │   ├── _helpers.pyc
│   │   ├── ...
│   │   ├── storage
│   │   │   ├── __init__.py
│   │   │   ├── __init__.pyc
│   │   │   ├── _helpers.py
│   │   │   ├── _helpers.pyc
├── robobrowser
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── browser.py
│   ├── browser.pyc
│   ├── ...
...
Unzipping and inspecting the contents of the zip confirms that this structure is kept intact during the zipping process.
I was able to solve this problem by adding an __init__.py to the google and google/cloud directories in the pip installation of google-cloud. Despite the current google-cloud package (0.24.0) claiming to support Python 2.7, the package structure as downloaded with pip seems to cause problems for me.
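For reference, the fix amounts to something like the following before rebuilding the zip (the paths mirror the deployment steps above):
# Add regular-package markers so the import machinery can resolve google.cloud
cd ~/.virtualenvs/{PROJECT_NAME}/lib/python2.7/site-packages
touch google/__init__.py google/cloud/__init__.py

# Re-zip the site-packages as before
zip -r9 ~/Code/{PROJECT_NAME}.zip *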
In the interest of reporting everything: I also had a separate problem after doing this, where AWS Lambda had trouble importing the main script as a module. I fixed that by recreating the repo step by step from scratch. I wasn't able to pinpoint the cause of this second issue, but hey. Computers.
Is there any guide on how to get started with HTMLBars? I am following the "Building HTMLBars" section but I am stuck. I have run the build tool and now have files in my dist directory like this:
.
├── htmlbars-compiler.amd.js
├── htmlbars-runtime.amd.js
├── morph.amd.js
├── test
│   ├── htmlbars-compiler-tests.amd.js
│   ├── htmlbars-runtime-tests.amd.js
│   ├── index.html
│   ├── loader.js
│   ├── morph-tests.amd.js
│   ├── packages-config.js
│   ├── qunit.css
│   └── qunit.js
└── vendor
    ├── handlebars.amd.js
    └── simple-html-tokenizer.amd.js
Which of these should I add to my Ember project, and is that all or do I have to do something more? Is this library ready, or is it still unusable with Ember?
Not even close to ready yet, I'd love to give more info, but there really isn't any. Last I heard they wanted it as a beta in 1.9, but we'll see.