Using Terraform outputs in Kitchen-Terraform tests

I am using Kitchen-Terraform to deploy and test an environment on GCP.
I am struggling to get the Kitchen/InSpec part to use the Terraform output values so I can use them in my tests.
This is what I have.
My inspec.yml
name: default
depends:
  - name: inspec-gcp
    url: https://github.com/inspec/inspec-gcp/archive/master.tar.gz
supports:
  - platform: gcp
attributes:
  - name: gcloud_project
    required: true
    description: gcp project
    type: string
My kitchen.yml
driver:
  name: terraform
  root_module_directory: test/fixtures/tf_module

provisioner:
  name: terraform

verifier:
  name: terraform
  format: documentation
  systems:
    - name: default
      backend: gcp
      controls:
        - instance

platforms:
  - name: terraform

suites:
  - name: kt_suite
My Unit test
gcloud_project = attribute('gcloud_project',
  { description: "The name of the project where resources are deployed." })

control "instance" do
  describe google_compute_instance(project: "#{gcloud_project}", zone: 'us-central1-c', name: 'test') do
    its('status') { should eq 'RUNNING' }
    its('machine_type') { should match 'n1-standard-1' }
  end
end
My output.tf
output "gcloud_project" {
  description = "The name of the GCP project to deploy against. We need this output to pass the value to tests."
  value       = "${var.project}"
}
The error I am getting is
× instance: /mnt/c/Users/Github/terra-test-project/test/integration/kt_suite/controls/default.rb:4
× Control Source Code Error /mnt/c/Users/Github/terra-test-project/test/integration/kt_suite/controls/default.rb:4
bad URI(is not URI?): "https://compute.googleapis.com/compute/v1/projects/Input 'gcloud_project' does not have a value. Skipping test./zones/us-central1-c/instances/test"
Everything works if I declare the project name directly in the control, but obviously I don't want to have to do that.
How can I get Kitchen/InSpec to use the Terraform outputs?

Looks like this may just be due to a typo. You've listed gcp_project under attributes in your inspec.yml but gcloud_project everywhere else.

Not sure if this is fixed, but I am using something like the setup below and it works pretty well. I assume the problem could be the way you are using the gcloud_project attribute.
Unit Test
dataset_name = input('dataset_name')
account_name = input('account_name')
project_id   = input('project_id')

control "gcp" do
  title "Google Cloud configuration"

  describe google_service_account(
    name: account_name,
    project: project_id
  ) do
    it { should exist }
  end

  describe google_bigquery_dataset(
    name: dataset_name,
    project: project_id
  ) do
    it { should exist }
  end
end
inspec.yml
name: big_query
depends:
  - name: inspec-gcp
    git: https://github.com/inspec/inspec-gcp.git
    tag: v1.8.0
supports:
  - platform: gcp
inputs:
  - name: dataset_name
    required: true
    type: string
  - name: account_name
    required: true
    type: string
  - name: project_id
    required: true
    type: string

Related

Custom Check for GCP Cloud SQL Database Flags

I have been working with tfsec for about a week, so I am still figuring things out. So far the product is pretty awesome. That said, I'm having a bit of trouble getting this custom check for Google Cloud SQL to work as expected. The goal of the check is to ensure the database flag for remote access is set to "off". The Terraform code below should pass the custom check, but it does not; instead I get an error (see below).
I figured maybe I am not using subMatch/predicateMatchSpec correctly, but no matter what I do the check keeps failing. There is a similar check included as a standard check for GCP. I ran the custom check through a YAML validator and it came back okay, so I can rule out any YAML-specific syntax errors.
TF Code (Pass example)
resource "random_id" "db_name_suffix" {
  byte_length = 4
}

resource "google_sql_database_instance" "instance" {
  provider = google-beta

  name             = "private-instance-${random_id.db_name_suffix.hex}"
  region           = "us-central1"
  database_version = "SQLSERVER_2019_STANDARD"
  root_password    = "#######"

  depends_on = [google_service_networking_connection.private_vpc_connection]

  settings {
    tier = "db-f1-micro"

    ip_configuration {
      ipv4_enabled    = false
      private_network = google_compute_network.private_network.id
      require_ssl     = true
    }

    backup_configuration {
      enabled = true
    }

    password_validation_policy {
      min_length                  = 6
      reuse_interval              = 2
      complexity                  = "COMPLEXITY_DEFAULT"
      disallow_username_substring = true
      password_change_interval    = "30s"
      enable_password_policy      = true
    }

    database_flags {
      name  = "contained database authentication"
      value = "off"
    }

    database_flags {
      name  = "cross db ownership chaining"
      value = "off"
    }

    database_flags {
      name  = "remote access"
      value = "off"
    }
  }
}
Tfsec Custom Check:
---
checks:
  - code: SQL-01 Ensure Remote Access is disabled
    description: Ensure Remote Access is disabled
    impact: Prevents locally stored procedures from being run remotely
    resolution: configure remote access = off
    requiredTypes:
      - resource
    requiredLabels:
      - google_sql_database_instance
    severity: HIGH
    matchSpec:
      name: settings
      action: isPresent
      subMatchOne:
        - name: database_flags
          action: isPresent
          predicateMatchSpec:
            - name: name
              action: equals
              value: remote access
            - name: value
              action: equals
              value: off
    errorMessage: DB remote access has not been disabled
    relatedLinks:
      - http://testcontrols.com/gcp
Error Message
Error: invalid option: failed to load custom checks from ./custom_checks: Check did not pass the expected schema. yaml: unmarshal errors:
line 15: cannot unmarshal !!map into []custom.MatchSpec
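As an aside, the unmarshal error is saying that the parser found a single YAML mapping at a spot where the schema wanted a list of matchers (Go's []custom.MatchSpec). The two shapes, written as their equivalent Python literals (values illustrative):

```python
# "- name: ..." (a YAML sequence entry) parses to a list of maps,
# which is the shape a []custom.MatchSpec field can absorb:
sequence_form = [{"name": "database_flags", "action": "isPresent"}]

# "name: ..." (a bare mapping) parses to a single map, which triggers
# "cannot unmarshal !!map into []custom.MatchSpec":
mapping_form = {"name": "database_flags", "action": "isPresent"}

assert isinstance(sequence_form, list)
assert isinstance(mapping_form, dict)
```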
I was able to get this working last night finally. This worked for me:
---
checks:
  - code: SQL-01 Ensure Remote Access is disabled
    description: Ensure Remote Access is disabled
    impact: Prevents locally stored procedures from being run remotely
    resolution: configure remote access = off
    requiredTypes:
      - resource
    requiredLabels:
      - google_sql_database_instance
    severity: HIGH
    matchSpec:
      name: settings
      action: isPresent
      predicateMatchSpec:
        - name: database_flags
          action: isPresent
          subMatch:
            name: name
            action: equals
            value: remote access
        - action: and
          subMatch:
            name: value
            action: equals
            value: off
    errorMessage: DB remote access has not been disabled
    relatedLinks:
      - http://testcontrols.com/gcp

Terraform / CloudFormation - Pass parameter as YAML

I use Terraform to launch a CloudFormation stack that creates Glue DataBrew resources, which don't exist yet in Terraform.
The thing is that I have a Terraform variable containing the list of my data sources, and in order to create the DataBrew resources associated with this data, I loop over the list to create one instance of my CloudFormation template per data source.
Inside this template, there is one resource that I want to differ per data source: the AWS::DataBrew::Ruleset resource.
It looks like this:
DataBrewDataQualityRuleset:
  Type: AWS::DataBrew::Ruleset
  Properties:
    Name: !Ref RuleSetName
    Description: Data Quality ruleset
    Rules:
      - Name: Check columns for missing values
        Disabled: false
        CheckExpression: AGG(MISSING_VALUES_PERCENTAGE) == :val1
        SubstitutionMap:
          - ValueReference: ":val1"
            Value: '0'
        ColumnSelectors:
          - Regex: ".*"
      - Name: Check two
        Disabled: false
        CheckExpression: :col IN :list
        SubstitutionMap:
          - ValueReference: ":col"
            Value: "`group`"
          - ValueReference: ":list"
            Value: "[\"Value1\", \"Value2\"]"
    TargetArn: !Sub SomeArn
What I want to do is extract the Rules part of the component and create one file where I will keep all my rules per data source, ending up with something like this:
DataBrewDataQualityRuleset:
  Type: AWS::DataBrew::Ruleset
  Properties:
    Name: !Ref RuleSetName
    Description: Data Quality ruleset
    Rules: !Ref Rules
    TargetArn: !Sub SomeArn
In my Terraform code, the Rules parameter would then be the actual set of rules for one particular data source.
I've thought about having one YAML file that I would loop over from Terraform, but I'm not sure it's doable, or whether CloudFormation would accept YAML as a parameter type.
Below you'll also find my Terraform component:
resource "aws_cloudformation_stack" "databrew_jobs" {
  for_each = var.data_sources

  name = "datachecks-${each.value.stack_name}"

  parameters = {
    Bucket         = "test_bucket"
    DataSetKey     = "raw/${each.value.job_name}"
    DataSetName    = "dataset-${each.value.stack_name}"
    RuleSetName    = "ruleset-${each.value.stack_name}"
    JobName        = "profile-job-${each.value.stack_name}"
    DataSourceName = "${each.value.stack_name}"
    JobResultKey   = "databrew-results/${each.value.job_name}"
    RoleArn        = iam_role_test.arn
  }

  template_body = file("${path.module}/databrew-job.yaml")
}
Do you have any idea how I could achieve this?
Thanks in advance!
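For what it's worth, CloudFormation parameter values are flat strings (String, Number, CommaDelimitedList and their List variants), so a nested Rules list cannot be passed through parameters directly. One workaround is to render a template body per data source with the rules inlined. A minimal Python sketch of the idea (the rule data and resource names are illustrative; CloudFormation accepts JSON template bodies, so json.dumps of a dict is a valid template):

```python
import json

def render_template(rules):
    """Return a JSON CloudFormation template body with the rules inlined."""
    return json.dumps({
        "Parameters": {"RuleSetName": {"Type": "String"}},
        "Resources": {
            "DataBrewDataQualityRuleset": {
                "Type": "AWS::DataBrew::Ruleset",
                "Properties": {
                    "Name": {"Ref": "RuleSetName"},
                    "Description": "Data Quality ruleset",
                    "Rules": rules,          # nested structure inlined, not a parameter
                    "TargetArn": "SomeArn",  # placeholder
                },
            }
        },
    }, indent=2)

# stand-in for the rules of one data source (e.g. loaded from a per-source YAML file)
news_rules = [{
    "Name": "Check columns for missing values",
    "Disabled": False,
    "CheckExpression": "AGG(MISSING_VALUES_PERCENTAGE) == :val1",
    "SubstitutionMap": [{"ValueReference": ":val1", "Value": "0"}],
    "ColumnSelectors": [{"Regex": ".*"}],
}]

body = render_template(news_rules)
```

Inside Terraform the same inlining could be done without an external script, e.g. with templatefile() or yamldecode()/jsonencode() on a per-source rules file, instead of passing Rules as a stack parameter.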

AWS WAF Update IP set Automation

I am trying to automate the process of updating IPs to help engineers whitelist IPs in an AWS WAF IP set. aws waf-regional update-ip-set returns a ChangeToken which has to be used in the next run of the update-ip-set command.
I am trying to achieve this automation through a Rundeck job (Community edition). Ideally, engineers will not have access to the output of the previous job to retrieve the ChangeToken. What's the best way to accomplish this task?
You can hide the step output using the "Mask Log Output by Regex" output filter.
Take a look at the following job definition example; the first step is just a simulation of getting the token, and its output is hidden by the filter.
- defaultTab: nodes
  description: ''
  executionEnabled: true
  id: fcf8cf5d-697c-42a1-affb-9cda02183fdd
  loglevel: INFO
  name: TokenWorkflow
  nodeFilterEditable: false
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
      - exec: echo "abc123"
        plugins:
          LogFilter:
            - config:
                invalidKeyPattern: \s|\$|\{|\}|\\
                logData: 'false'
                name: mytoken
                regex: s*([^\s]+?)\s*
              type: key-value-data
            - config:
                maskOnlyValue: 'false'
                regex: .*
                replacement: '[SECURE]'
              type: mask-log-output-regex
      - exec: echo ${data.mytoken}
    keepgoing: false
    strategy: node-first
  uuid: fcf8cf5d-697c-42a1-affb-9cda02183fdd
The second step uses that token (to demonstrate data passing between steps, it just prints the value generated in the first step; in your case, of course, the token would be used by another command).
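Outside of Rundeck, what the two filters accomplish together amounts to: capture a value from a step's output, then scrub it from anything that gets logged. A toy Python sketch of that idea (the ChangeToken string here is made up):

```python
import re

raw_output = "ChangeToken: abc123"

# capture the token from the command output (the key-value-data filter's job)
token = re.search(r"ChangeToken:\s*(\S+)", raw_output).group(1)

def masked(line: str) -> str:
    """Replace the secret with [SECURE] before the line reaches a log
    (the mask-log-output-regex filter's job)."""
    return line.replace(token, "[SECURE]")

print(masked(raw_output))  # ChangeToken: [SECURE]
```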
Update (passing the data value to another job)
Just use the job reference step and put the data variable name on the remote job option as an argument.
Check the following example:
The first job generates the token (or gets it from your service, hiding the result as in the first example). Then it calls another job that receives that data in an option (Job Reference Step > Arguments) using this format:
-token ${data.mytoken}
Where -token is the target job option name, and ${data.mytoken} is the current data variable name.
- defaultTab: nodes
  description: ''
  executionEnabled: true
  id: fcf8cf5d-697c-42a1-affb-9cda02183fdd
  loglevel: INFO
  name: TokenWorkflow
  nodeFilterEditable: false
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
      - exec: echo "abc123"
        plugins:
          LogFilter:
            - config:
                invalidKeyPattern: \s|\$|\{|\}|\\
                logData: 'false'
                name: mytoken
                regex: s*([^\s]+?)\s*
              type: key-value-data
            - config:
                maskOnlyValue: 'false'
                regex: .*
                replacement: '[SECURE]'
              type: mask-log-output-regex
      - jobref:
          args: -token ${data.mytoken}
          group: ''
          name: ChangeRules
          nodeStep: 'true'
          uuid: b6975bbf-d6d0-411e-98a6-8ecb4c3f7431
    keepgoing: false
    strategy: node-first
  uuid: fcf8cf5d-697c-42a1-affb-9cda02183fdd
This is the job that receives the token and does something with it; the example just echoes the token, but the idea is to use it internally to perform some action (as in the first example).
- defaultTab: nodes
  description: ''
  executionEnabled: true
  id: b6975bbf-d6d0-411e-98a6-8ecb4c3f7431
  loglevel: INFO
  name: ChangeRules
  nodeFilterEditable: false
  options:
    - name: token
  plugins:
    ExecutionLifecycle: null
  scheduleEnabled: true
  sequence:
    commands:
      - exec: echo ${option.token}
    keepgoing: false
    strategy: node-first
  uuid: b6975bbf-d6d0-411e-98a6-8ecb4c3f7431

Parsing variables in an Ansible inventory in Python

I'm trying to use Python to parse Ansible variables specified in an inventory file like the one below:
[webservers]
foo.example.com type=news
bar.example.com type=sports
[dbservers]
mongodb.local type=mongo region=us
mysql.local type=mysql region=eu
I want to be able to parse type=news for host foo.example.com in webservers, and type=mongo region=us for host mongodb.local under dbservers. Any help with this is greatly appreciated.
The play below:

- name: List type=news hosts in the group webservers
  debug:
    msg: "{{ hostvars[item].inventory_hostname }}"
  loop: "{{ groups['webservers'] }}"
  when: hostvars[item].type == "news"

- name: List type=mongo and region=us hosts in the group dbservers
  debug:
    msg: "{{ hostvars[item].inventory_hostname }}"
  loop: "{{ groups['dbservers'] }}"
  when:
    - hostvars[item].type == "mongo"
    - hostvars[item].region == "us"
gives:
"msg": "foo.example.com"
"msg": "mongodb.local"
If the playbook is run on the host foo.example.com, you can get type = news simply by referencing "{{ type }}". If you want to use it in a when condition, just reference type.
If the playbook is run on the host mongodb.local, the value of type will automatically be "mongo", and region will automatically be "us".
The values of the variables, if they are defined in the hosts file as you specified, are resolved automatically on the corresponding hosts.
Thus the playbook can be executed on all hosts, and you can print the value of type, for example:
- debug:
    msg: "{{ type }}"
On each host you will get the unique values defined for it in the hosts file.
I'm not sure I understood the question correctly, but if the goal was for the foo.example.com host to get a list of the servers from the "webservers" group that have type = news, then the answer is already given above.
Rather than reinventing the wheel, I suggest you have a look at how Ansible itself parses INI files to turn them into an inventory object.
You could also easily get this info in JSON format with a very simple playbook (as suggested by @vladimirbotka), or rewrite your inventory in YAML, which would be much easier to parse with any external tool.
inventory.yaml
---
all:
  children:
    webservers:
      hosts:
        foo.example.com:
          type: news
        bar.example.com:
          type: sports
    dbservers:
      hosts:
        mongodb.local:
          type: mongo
          region: us
        mysql.local:
          type: mysql
          region: eu
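And if an external tool still has to read the original INI-style file, the host lines are simple enough to split with the Python standard library alone. A minimal sketch, with no Ansible dependency (the inventory is embedded as a string for illustration; in practice it would be read from the hosts file):

```python
inventory_text = """\
[webservers]
foo.example.com type=news
bar.example.com type=sports

[dbservers]
mongodb.local type=mongo region=us
mysql.local type=mysql region=eu
"""

def parse_inventory(text):
    """Map group -> host -> {var: value} for plain `host key=value` lines."""
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # new group header
            groups[current] = {}
            continue
        host, *pairs = line.split()  # hostname, then key=value tokens
        groups[current][host] = dict(p.split("=", 1) for p in pairs)
    return groups

groups = parse_inventory(inventory_text)
print(groups["webservers"]["foo.example.com"])  # {'type': 'news'}
print(groups["dbservers"]["mongodb.local"])     # {'type': 'mongo', 'region': 'us'}
```

Note this only covers plain `host key=value` lines; Ansible's own inventory loader additionally handles host ranges, group children, group vars, and so on.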

Ansible gcp_compute inventory plugin - groups based on machine names

Consider the following config for ansible's gcp_compute inventory plugin:
plugin: gcp_compute
projects:
  - myproj
scopes:
  - https://www.googleapis.com/auth/compute
filters:
  - ''
groups:
  connect: '"connect" in list'
  gcp: 'True'
auth_kind: serviceaccount
service_account_file: ~/.gsutil/key.json
This works for me and puts all hosts in the gcp group as expected. So far so good.
However, I'd like to group my machines based on certain substrings appearing in their names. How can I do this?
Or, more broadly, where can I find a description of the variables available to the Jinja expressions in the groups dictionary?
The variables available are the keys available inside each of the items in the response, as listed here: https://cloud.google.com/compute/docs/reference/rest/v1/instances/list
So, for my example:
plugin: gcp_compute
projects:
  - myproj
scopes:
  - https://www.googleapis.com/auth/compute
filters:
  - ''
groups:
  connect: "'connect' in name"
  gcp: 'True'
auth_kind: serviceaccount
service_account_file: ~/.gsutil/key.json
To complement the accurate answer above: to select machines based on certain substrings appearing in their names, the filters parameter accepts expressions such as:

filters:
  - 'name = gke*'

This lists only the instances whose names start with gke.