I am working on creating multiple Azure VMs through terraform.
I earlier created 4 VMs with count variable successfully. Now I need to add additional 2 VMs. When I give the extra count of 2 VMs. It tries to destroy existing 4 Vms as well to create total VMs of 6. So I gave Prevent_destroy as true so that earlier 4 VMs will not be destroyed but new 2 VMs will be created. but with prevent_destroy also it gives error that Instance needs to be destroyed, this is not a desirable behavior as destroying existing 4 VMs is not expected as users are working on them I just need 2 new VMs using same script.
I tried using terraform-azurerm latest provider version v2.93.0 with latest terraform version v0.1.13 with the below code and it works as expected :
I tested it by creating 2 vm's first and then added another one by changing the count value as 3 .
provider "azurerm" {
features{}
}
data "azurerm_resource_group" "example" {
name = "ansumantest"
}
resource "azurerm_virtual_network" "example" {
name = "example-network"
address_space = ["10.0.0.0/16"]
location = data.azurerm_resource_group.example.location
resource_group_name = data.azurerm_resource_group.example.name
}
resource "azurerm_subnet" "example" {
name = "internal"
resource_group_name = data.azurerm_resource_group.example.name
virtual_network_name = azurerm_virtual_network.example.name
address_prefixes = ["10.0.2.0/24"]
}
resource "azurerm_public_ip" "example" {
count = 3
name = "vm-public-ip${count.index}"
location = data.azurerm_resource_group.example.location
resource_group_name = data.azurerm_resource_group.example.name
allocation_method = "Dynamic"
}
resource "azurerm_network_interface" "example" {
count = 3
name = "example-nic-${count.index}"
location = data.azurerm_resource_group.example.location
resource_group_name = data.azurerm_resource_group.example.name
ip_configuration {
name = "internal"
subnet_id = azurerm_subnet.example.id
public_ip_address_id = "${element(azurerm_public_ip.example.*.id, count.index)}"
private_ip_address_allocation = "Dynamic"
}
}
resource "azurerm_ssh_public_key" "example" {
name = "ansuman-sshkey"
resource_group_name = data.azurerm_resource_group.example.name
location = data.azurerm_resource_group.example.location
public_key = file("~/.ssh/id_rsa.pub")
}
resource "azurerm_linux_virtual_machine" "example" {
count = 3
name = "example-machine-${count.index}"
resource_group_name = data.azurerm_resource_group.example.name
location = data.azurerm_resource_group.example.location
size = "Standard_F2"
admin_username = "adminuser"
admin_password = "Password#1234"
disable_password_authentication = true
network_interface_ids = ["${element(azurerm_network_interface.example.*.id, count.index)}"]
admin_ssh_key {
username = "adminuser"
public_key = azurerm_ssh_public_key.example.public_key
}
os_disk {
caching = "ReadWrite"
storage_account_type = "Standard_LRS"
}
source_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "16.04-LTS"
version = "latest"
}
}
Output:
After changing count from 2 to 3:
Related
I'm having issues when trying to create an EKS cluster with a few security groups that I already have created. I don't want a new SG every time I create a new EKS Cluster.
I have a problem with a part of this code under vpc_id part."cluster_create_security_group=false" produces an error, and cluster_security_group_id = "sg-123" is completely ignored.
My code is like this:
provider "aws" {
region = "us-east-2"
}
terraform {
backend "s3" {
bucket = "mys3bucket"
key = "eks/terraform.tfstate"
region = "us-east-2"
}
}
data "aws_eks_cluster" "cluster" {
name = module.eks.cluster_id
}
data "aws_eks_cluster_auth" "cluster" {
name = module.eks.cluster_id
}
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.cluster.token
}
variable "cluster_security_group_id" {
description = "Existing security group ID to be attached to the cluster. Required if `create_cluster_security_group` = `false`"
type = string
default = "sg-1234"
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 18.0"
cluster_name = "cluster-example"
cluster_version = "1.21" #This may vary depending on the purpose of the cluster
cluster_endpoint_private_access = true
cluster_endpoint_public_access = true
cluster_addons = {
coredns = {
resolve_conflicts = "OVERWRITE"
}
kube-proxy = {}
vpc-cni = {
resolve_conflicts = "OVERWRITE"
}
}
vpc_id = "vpc-12345"
subnet_ids = ["subnet-123", "subnet-456", "subnet-789"]
create_cluster_security_group=false ----------> ERROR: An argument named "cluster_create_security_group" is not expected here
cluster_security_group_id = "my-security-group-id"
# EKS Managed Node Group(s)
eks_managed_node_group_defaults = {
disk_size = 50
instance_types = ["t3.medium"]
}
eks_managed_node_groups = {
Test-Nodegroup = {
min_size = 2
max_size = 5
desired_size = 2
instance_types = ["t3.large"]
capacity_type = "SPOT"
}
}
tags = {
Environment = "dev"
Terraform = "true"
}
}
Where am I wrong? This is my whole Terraform file.
I need to get the terraform state file from the backend GCS and to use it while I update the resource is it possible to overwrite the excising terraform state file and fetch it on need.
this is my main.tf
provider "google" {
project = var.project
region = var.region
zone = var.zone
}
###################################################
########################################################
data "google_compute_default_service_account" "default" {
}
#create instance1
resource "random_id" "bucket_prefix" {
byte_length = 8
}
#create instance
resource "google_compute_instance" "vm_instance" {
name = var.instance_name
machine_type = var.machine_type
zone = var.zone
metadata_startup_script = var.script
allow_stopping_for_update = true
#metadata = {
# enable-oslogin = "TRUE"
#}
service_account {
email = data.google_compute_default_service_account.default.email
scopes = ["cloud-platform"]
}
boot_disk {
initialize_params {
image = var.image
#image = "ubuntu-2004-lts" # TensorFlow Enterprise
size = 30
}
}
# Install Flask
tags = ["http-server","allow-ssh-ingress-from-iap", "https-server"]
network_interface {
network = "default"
access_config {
}
}
guest_accelerator{
#type = "nvidia-tesla-t4" // Type of GPU attahced
type = var.type
count = var.gpu_count
#count = 2 // Num of GPU attached
}
scheduling{
on_host_maintenance = "TERMINATE"
automatic_restart = true// Need to terminate GPU on maintenance
}
}
This is my variables.tfvars:
instance_name = "test-vm-v5"
machine_type = "n1-standard-16"
region = "europe-west4"
zone = "europe-west4-a"
image = "tf28-np-pandas-nltk-scikit-py39"
#image = "debian-cloud/debian-10"
project = "my_project"
network = "default"
type = ""
gpu_count = "0"
I wanted to create multiple instances by changing the variables.tfvars and need to modify the instance on the basis of the name of vm.
I have a requirement to create multiple VMs in GCP using the Instance Template module located here:
https://github.com/terraform-google-modules/terraform-google-vm/tree/master/modules/instance_template
My Instance Template code looks like this:
module "db_template" {
source = "terraform-google-modules/vm/google//modules/instance_template"
version = "7.8.0"
name_prefix = "${var.project_short_name}-db-template"
machine_type = var.app_machine_type
disk_size_gb = 20
source_image = "debian-10-buster-v20220719"
source_image_family = "debian-10"
source_image_project = "debian-cloud"
additional_disks = var.additional_disks
labels = {
costing = "db",
inventory = "gcp",
}
network = var.network
subnetwork = var.subnetwork
access_config = []
service_account = {
email = var.service_account_email
scopes = ["cloud-platform"]
}
tags = ["compute"]
}
in my tfvars I have this:
additional_disks = [
{ disk_name = "persistent-disk-1"
device_name = "persistent-disk-1"
auto_delete = true
boot = false
disk_size_gb = 50
disk_type = "pd-standard"
interface = "SCSI"
disk_labels = {}
}
]
However when my code has multiple VMs to deploy with this template, only 1 VM gets deployed--the first--and the subsequent VMs error out with this message:
Error: Error creating instance: googleapi: Error 409: The resource 'projects/<PATH>/persistent-disk-1' already exists, alreadyExists
I understand what is happening but I don't know how to fix it. The subsequent VMs cannot be created because the additional_disk name has already been taken by the first VM. I thought the whole point of using the instance template would be that there is logic built into this where you can use the same template and create multiple VMs of that type.
But it seems like I have to do some additional coding to get multiple VMs deployed with this template.
Can anyone suggest how to do this?
Ultimately got this working with various for_each constructs:
locals {
app_servers = ["inbox", "batch", "transfer", "tools", "elastic", "artemis"]
db_servers = ["inboxdb", "batchdb", "transferdb", "gatewaydb", "artemisdb"]
}
resource "google_compute_disk" "db_add_disk" {
for_each = toset(local.db_servers)
name = "${each.value}-additional-disk"
type = "pd-standard" // pd-ssd
zone = var.zone
size = 50
// interface = "SCSI"
labels = {
environment = "dev"
}
physical_block_size_bytes = 4096
}
module "db_template" {
source = "terraform-google-modules/vm/google//modules/instance_template"
version = "7.8.0"
name_prefix = "${var.project_short_name}-db-template"
machine_type = var.app_machine_type
disk_size_gb = 20
source_image = "debian-10-buster-v20220719"
source_image_family = "debian-10"
source_image_project = "debian-cloud"
labels = {
costing = "db",
inventory = "gcp",
}
network = var.network
subnetwork = var.subnetwork
access_config = []
service_account = {
email = var.service_account_email
scopes = ["cloud-platform"]
}
tags = ["compute"]
}
resource "google_compute_instance_from_template" "db_server-1" {
for_each = toset(local.db_servers)
name = "${var.project_short_name}-${each.value}-1"
zone = var.zone
source_instance_template = module.db_template.self_link
// Override fields from instance template
labels = {
costing = "db",
inventory = "gcp",
component = "${each.value}"
}
lifecycle {
ignore_changes = [attached_disk]
}
}
resource "google_compute_attached_disk" "db_add_disk" {
for_each = toset(local.db_servers)
disk = google_compute_disk.db_add_disk[each.key].id
instance = google_compute_instance_from_template.db_server-1[each.key].id
}
I am trying to pass mat_ip of google compute instances created in module "microservice-instance" to another module "database". Since I am creating more than one instance, I am getting following error for output variable in module "microservice-instance".
Error: Missing resource instance key
on modules/microservice-instance/ms-outputs.tf line 3, in output "nat_ip": 3: value = google_compute_instance.apps.network_interface[*].access_config[0].nat_ip
Because google_compute_instance.apps has "count" set, its attributes must be accessed on specific instances.
For example, to correlate with indices of a referring resource, use:
google_compute_instance.apps[count.index]
I have looked at following and using the same way of accessing attribute but its not working. Here is code -
main.tf
provider "google" {
credentials = "${file("../../service-account.json")}"
project = var.project
region =var.region
}
# Include modules
module "microservice-instance" {
count = var.appserver_count
source = "./modules/microservice-instance"
appserver_count = var.appserver_count
}
module "database" {
count = var.no_of_db_instances
source = "./modules/database"
nat_ip = module.microservice-instance.nat_ip
no_of_db_instances = var.no_of_db_instances
}
./modules/microservice-instance/microservice-instance.tf
resource "google_compute_instance" "apps" {
count = var.appserver_count
name = "apps-${count.index + 1}"
# name = "apps-${random_id.app_name_suffix.hex}"
machine_type = "f1-micro"
boot_disk {
initialize_params {
image = "ubuntu-os-cloud/ubuntu-1804-lts"
}
}
network_interface {
network = "default"
access_config {
// Ephemeral IP
}
}
}
./modules/microservice-instance/ms-outputs.tf
output "nat_ip" {
value = google_compute_instance.apps.network_interface[*].access_config[0].nat_ip
}
./modules/database/database.tf
resource "random_id" "db_name_suffix" {
byte_length = 4
}
resource "google_sql_database_instance" "postgres" {
name = "postgres-instance-${random_id.db_name_suffix.hex}"
database_version = "POSTGRES_11"
settings {
tier = "db-f1-micro"
ip_configuration {
dynamic "authorized_networks" {
for_each = var.nat_ip
# iterator = ip
content {
# value = ni.0.access_config.0.nat_ip
value = each.key
}
}
}
}
}
You are creating var.appserver_count number of google_compute_instance.apps resources. So you will have:
google_compute_instance.apps[0]
google_compute_instance.apps[1]
...
google_compute_instance.apps[var.appserver_count - 1]
Therefore, in your output, instead of:
output "nat_ip" {
value = google_compute_instance.apps.network_interface[*].access_config[0].nat_ip
}
you have to reference individual apps resources or all of them using [*], for example:
output "nat_ip" {
value = google_compute_instance.apps[*].network_interface[*].access_config[0].nat_ip
}
Terraform CLI and Terraform AWS Provider Version
Installed from https://releases.hashicorp.com/terraform/0.13.5/terraform_0.13.5_linux_amd64.zip
hashicorp/aws v3.15.0
Affected Resource(s)
aws_rds_cluster
aws_rds_cluster_instance
Terraform Configuration Files
# inside ./modules/rds/main.tf
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
}
}
required_version = "~> 0.13"
}
provider "aws" {
alias = "primary"
}
provider "aws" {
alias = "dr"
}
locals {
region_tags = ["primary", "dr"]
db_name = "${var.project_name}-${var.stage}-db"
db_cluster_0 = "${local.db_name}-cluster-${local.region_tags[0]}"
db_cluster_1 = "${local.db_name}-cluster-${local.region_tags[1]}"
db_instance_name = "${local.db_name}-instance"
}
resource "aws_rds_global_cluster" "global_db" {
global_cluster_identifier = "${var.project_name}-${var.stage}"
database_name = "${var.project_name}${var.stage}db"
engine = "aurora-mysql"
engine_version = "${var.mysql_version}.mysql_aurora.${var.aurora_version}"
// force_destroy = true
}
resource "aws_rds_cluster" "primary_cluster" {
depends_on = [aws_rds_global_cluster.global_db]
provider = aws.primary
cluster_identifier = "${local.db_name}-cluster-${local.region_tags[0]}"
# the database name does not allow dashes:
database_name = "${var.project_name}${var.stage}db"
# The engine and engine_version must be repeated in aws_rds_global_cluster,
# aws_rds_cluster, and aws_rds_cluster_instance to
# avoid "Value for engine should match" error
engine = "aurora-mysql"
engine_version = "${var.mysql_version}.mysql_aurora.${var.aurora_version}"
engine_mode = "global"
global_cluster_identifier = aws_rds_global_cluster.global_db.id
# backtrack and multi-master not supported by Aurora Global.
master_username = var.username
master_password = var.password
backup_retention_period = 5
preferred_backup_window = "07:00-09:00"
db_subnet_group_name = aws_db_subnet_group.primary.id
# We must have these values, because destroying or rolling back requires them
skip_final_snapshot = true
final_snapshot_identifier = "ci-aurora-cluster-backup"
tags = {
Name = local.db_cluster_0
Stage = var.stage
CreatedBy = var.created_by
}
}
resource "aws_rds_cluster_instance" "primary" {
depends_on = [aws_rds_global_cluster.global_db]
provider = aws.primary
cluster_identifier = aws_rds_cluster.primary_cluster.id
engine = "aurora-mysql"
engine_version = "${var.mysql_version}.mysql_aurora.${var.aurora_version}"
instance_class = "db.${var.instance_class}.${var.instance_size}"
db_subnet_group_name = aws_db_subnet_group.primary.id
tags = {
Name = local.db_instance_name
Stage = var.stage
CreatedBy = var.created_by
}
}
resource "aws_rds_cluster" "dr_cluster" {
depends_on = [aws_rds_cluster_instance.primary, aws_rds_global_cluster.global_db]
provider = aws.dr
cluster_identifier = "${local.db_name}-cluster-${local.region_tags[1]}"
# db name now allowed to specified on secondary regions
# The engine and engine_version must be repeated in aws_rds_global_cluster,
# aws_rds_cluster, and aws_rds_cluster_instance to
# avoid "Value for engine should match" error
engine = "aurora-mysql"
engine_version = "${var.mysql_version}.mysql_aurora.${var.aurora_version}"
engine_mode = "global"
global_cluster_identifier = aws_rds_global_cluster.global_db.id
# backtrack and multi-master not supported by Aurora Global.
# cannot specify username/password in cross-region replication cluster:
backup_retention_period = 5
preferred_backup_window = "07:00-09:00"
db_subnet_group_name = aws_db_subnet_group.dr.id
# We must have these values, because destroying or rolling back requires them
skip_final_snapshot = true
final_snapshot_identifier = "ci-aurora-cluster-backup"
tags = {
Name = local.db_cluster_1
Stage = var.stage
CreatedBy = var.created_by
}
}
resource "aws_rds_cluster_instance" "dr_instance" {
depends_on = [aws_rds_cluster_instance.primary, aws_rds_global_cluster.global_db]
provider = aws.dr
cluster_identifier = aws_rds_cluster.dr_cluster.id
engine = "aurora-mysql"
engine_version = "${var.mysql_version}.mysql_aurora.${var.aurora_version}"
instance_class = "db.${var.instance_class}.${var.instance_size}"
db_subnet_group_name = aws_db_subnet_group.dr.id
tags = {
Name = local.db_instance_name
Stage = var.stage
CreatedBy = var.created_by
}
}
resource "aws_db_subnet_group" "primary" {
name = "${local.db_name}-subnetgroup"
subnet_ids = var.subnet_ids
provider = aws.primary
tags = {
Name = "primary_subnet_group"
Stage = var.stage
CreatedBy = var.created_by
}
}
resource "aws_db_subnet_group" "dr" {
provider = aws.dr
name = "${local.db_name}-subnetgroup"
subnet_ids = var.dr_subnet_ids
tags = {
Name = "dr_subnet_group"
Stage = var.stage
CreatedBy = var.created_by
}
}
resource "aws_rds_cluster_parameter_group" "default" {
name = "rds-cluster-pg"
family = "aurora-mysql${var.mysql_version}"
description = "RDS default cluster parameter group"
parameter {
name = "character_set_server"
value = "utf8"
}
parameter {
name = "character_set_client"
value = "utf8"
}
parameter {
name = "aurora_parallel_query"
value = "ON"
apply_method = "pending-reboot"
}
}
Inside ./modules/sns/main.tf, this is the resource I'm adding when calling terraform apply from within the ./modules directory:
resource "aws_sns_topic" "foo_topic" {
name = "foo-${var.stage}-${var.topic_name}"
tags = {
Name = "foo-${var.stage}-${var.topic_name}"
Stage = var.stage
CreatedBy = var.created_by
CreatedOn = timestamp()
}
}
./modules/main.tf:
terraform {
backend "s3" {
bucket = "terraform-remote-state-s3-bucket-unique-name"
key = "terraform.tfstate"
region = "us-east-2"
dynamodb_table = "TerraformLockTable"
}
}
provider "aws" {
alias = "primary"
region = var.region
}
provider "aws" {
alias = "dr"
region = var.dr_region
}
module "vpc" {
stage = var.stage
source = "./vpc"
providers = {
aws = aws.primary
}
}
module "dr_vpc" {
stage = var.stage
source = "./vpc"
providers = {
aws = aws.dr
}
}
module "vpc_security_group" {
source = "./vpc_security_group"
vpc_id = module.vpc.vpc_id
providers = {
aws = aws.primary
}
}
module "rds" {
source = "./rds"
stage = var.stage
created_by = var.created_by
vpc_id = module.vpc.vpc_id
subnet_ids = [module.vpc.subnet_a_id, module.vpc.subnet_b_id, module.vpc.subnet_c_id]
dr_subnet_ids = [module.dr_vpc.subnet_a_id, module.dr_vpc.subnet_b_id, module.dr_vpc.subnet_c_id]
region = var.region
username = var.rds_username
password = var.rds_password
providers = {
aws.primary = aws.primary
aws.dr = aws.dr
}
}
module "sns_start" {
stage = var.stage
source = "./sns"
topic_name = "start"
created_by = var.created_by
}
./modules/variables.tf:
variable "region" {
default = "us-east-2"
}
variable "dr_region" {
default = "us-west-2"
}
variable "service" {
type = string
default = "foo-back"
description = "service to match what serverless framework deploys"
}
variable "stage" {
type = string
default = "sandbox"
description = "The stage to deploy: sandbox, dev, qa, uat, or prod"
validation {
condition = can(regex("sandbox|dev|qa|uat|prod", var.stage))
error_message = "The stage value must be a valid stage: sandbox, dev, qa, uat, or prod."
}
}
variable "created_by" {
description = "Company or vendor name followed by the username part of the email address"
}
variable "rds_username" {
description = "Username for rds"
}
variable "rds_password" {
description = "Password for rds"
}
./modules/sns/main.tf:
resource "aws_sns_topic" "foo_topic" {
name = "foo-${var.stage}-${var.topic_name}"
tags = {
Name = "foo-${var.stage}-${var.topic_name}"
Stage = var.stage
CreatedBy = var.created_by
CreatedOn = timestamp()
}
}
./modules/sns/output.tf:
output "sns_topic_arn" {
value = aws_sns_topic.foo_topic.arn
}
Debug Output
Both outputs have modified keys, names, account IDs, etc:
The plan output from running terraform apply:
https://gist.github.com/ystoneman/95df711ee0a11d44e035b9f8f39b75f3
The state before applying: https://gist.github.com/ystoneman/5c842769c28e1ae5969f9aaff1556b37
Expected Behavior
The entire ./modules/main.tf had already been created, and the only thing that was added was the SNS module, so only the SNS module should be created.
Actual Behavior
But instead, the RDS resources are affected too, and terraform "claims" that engine_mode has changed from provisioned to global, even though it already was global according to the console:
The plan output also says that cluster_identifier is only known after apply and therefore forces replacement, however, I think the cluster_identifier is necessary to let the aws_rds_cluster know it belongs to the aws_rds_global_cluster, and for the aws_rds_cluster_instance to know it belongs to the aws_rds_cluster, respectively.
Steps to Reproduce
comment out the module "sns_start"
cd ./modules
terraform apply (after this step is done is where the state file I included is at)
uncomment out the module "sns_start"
terraform apply (at this point is where I provide the debug output)
Important Factoids
This problem happens whether I run it from my Mac or within AWS CodeBuild.
References
Seems like AWS Terraform tried to destory and rebuild RDS cluster references this too, but it's not specific to a Global Cluster, where you do need identifiers so that instances and clusters know to what they belong to.
It seems like you are using an outdated version of the aws provider and are specifying the engine_mode incorrectly. There was a bug ticket relating to this: https://github.com/hashicorp/terraform-provider-aws/issues/16088
It is fixed in version 3.15.0 which you can use via
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 3.15.0"
}
}
required_version = "~> 0.13"
}
Additionally you should drop the engine_mode property from your terraform specification completely.