How to identify when a create read replica action has completed - amazon-web-services

I am trying to create a read replica of the production database and then I want to promote the read replica to a test database.
I used this AWS CLI command to create a read replica:
aws rds create-db-instance-read-replica --db-instance-identifier test-database --source-db-instance-identifier production-database --region eu-central-1
I know I cannot issue a promote read replica command immediately as I would get an error.
bash-3.2$ aws rds promote-read-replica --db-instance-identifier test-database --region eu-central-1
An error occurred (InvalidDBInstanceState) when calling the PromoteReadReplica operation: DB Instance is not in an available state.
How can I check if the read replica is created successfully so I can issue a promote read replica command?
I tried to query the events for the database, but it returns an empty list.
bash-3.2$ aws rds describe-events --source-identifier test-database --source-type db-instance
{
    "Events": []
}
I am doing this in a Jenkins pipeline, so it has to be checked programmatically.
Kindly advise.

You can use describe-db-instances and build a simple while-based waiter that polls until the replica is available.
For example:
while true; do
  db_status=$(aws rds describe-db-instances \
    --db-instance-identifier test-database \
    --query 'DBInstances[0].DBInstanceStatus' \
    --output text)
  [[ $db_status != "available" ]] \
    && (echo $db_status; sleep 5) \
    || break
done
echo "Finally ${db_status}"
The above will check the status of test-database every 5 seconds until it's available.
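Alternatively, the AWS CLI has a built-in waiter that does the same polling for you; a minimal sketch (by default it checks every 30 seconds and gives up after 60 attempts):
# Block until the replica reports "available", then promote it.
aws rds wait db-instance-available \
  --db-instance-identifier test-database \
  --region eu-central-1
aws rds promote-read-replica \
  --db-instance-identifier test-database \
  --region eu-central-1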

I'm trying to continue the above script by adding a parameter group and rebooting the read replica.
while true; do
  db_status=$(aws rds describe-db-instances \
    --db-instance-identifier db-replica \
    --profile xxx \
    --region us-west-2 \
    --query 'DBInstances[0].DBInstanceStatus' \
    --output text)
  [[ $db_status != "available" ]] \
    && (echo $db_status; sleep 15) \
    || break
done
echo "Finally ${db_status}"
if [ "$db_status" == "available" ]
then
  echo "replica created successfully"
  echo "Rebooting replica"
  aws rds reboot-db-instance --db-instance-identifier db-replica
  sleep 5
  echo "replica rebooted successfully"
else
  echo "replica is not available, skipping reboot"
fi
exit 0
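For the parameter-group step the comment mentions but the script never performs, a hedged sketch (db-replica and my-param-group are placeholders; substitute your own identifiers, and note that static parameters only take effect after the reboot):
# Attach the parameter group, wait for the change to settle, then reboot.
aws rds modify-db-instance \
  --db-instance-identifier db-replica \
  --db-parameter-group-name my-param-group \
  --apply-immediately
aws rds wait db-instance-available --db-instance-identifier db-replica
aws rds reboot-db-instance --db-instance-identifier db-replica
aws rds wait db-instance-available --db-instance-identifier db-replica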

Related

aws_ecs_cluster local-exec aws not found

resource "aws_ecs_cluster" "demo" {
name = var.ecs_cluster_name
capacity_providers = [local.cluster_name]
default_capacity_provider_strategy {
capacity_provider = local.cluster_name
}
# We need to terminate all instances before the cluster can be destroyed.
# (Terraform would handle this automatically if the autoscaling group depended
# on the cluster, but we need to have the dependency in the reverse
# direction due to the capacity_providers field above).
provisioner "local-exec" {
when = destroy
interpreter = ["bash", "-c"]
command = <<EOT
set -e
CAP_PROVS="$(aws ecs describe-clusters --clusters "${self.arn}" \
--query 'clusters[*].capacityProviders[*]' --output text)"
ASG_ARNS="$(aws ecs describe-capacity-providers \
--capacity-providers "$CAP_PROVS" \
--query 'capacityProviders[*].autoScalingGroupProvider.autoScalingGroupArn' \
--output text)"
if [ -n "$ASG_ARNS" ] && [ "$ASG_ARNS" != "None" ]
then
for ASG_ARN in $ASG_ARNS
do
ASG_NAME=$(echo $ASG_ARN | cut -d/ -f2-)
aws autoscaling update-auto-scaling-group \
--auto-scaling-group-name "$ASG_NAME" \
--min-size 0 --max-size 0 --desired-capacity 0
INSTANCES="$(aws autoscaling describe-auto-scaling-groups \
--auto-scaling-group-names "$ASG_NAME" \
--query 'AutoScalingGroups[*].Instances[*].InstanceId' \
--output text)"
aws autoscaling set-instance-protection --instance-ids $INSTANCES \
--auto-scaling-group-name "$ASG_NAME" \
--no-protected-from-scale-in
done
fi
EOT
}
}
The aws_ecs_service local-exec is not working and the error says aws command not found.
https://github.com/hashicorp/terraform-provider-aws/issues/11409
Error: Error deleting ECS cluster: ClusterContainsContainerInstancesException: The Cluster cannot be deleted while Container Instances are active or draining.
Error waiting for internet gateway (igw-0e21e59722a46f970) to detach: timeout while waiting for state to become 'detached' (last state: 'detaching', timeout: 15m0s)
aws_ecs_cluster.demo (local-exec): /bin/bash: aws: command not found
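The error means the shell spawned by local-exec cannot find the aws binary on its PATH. A minimal guard you could put at the top of the command heredoc (the extra PATH entries are assumptions; point them at wherever your CLI is actually installed):
# Fail fast if the aws CLI is not reachable from the provisioner's shell.
export PATH="$PATH:/usr/local/bin:$HOME/.local/bin"
command -v aws >/dev/null 2>&1 || { echo "aws CLI not found on PATH" >&2; exit 1; }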

Jenkins pipeline - check if an ECS service exist

I'm using the AWS ECS CLI within a Jenkins pipeline to automate my CI/CD. What I'm trying to do is create a service based upon the task definition if the service does not exist yet; if the service already exists, I just want to update it instead. Here is the create-service CLI command:
aws ecs create-service \
  --cluster $stg_cluster \
  --task-definition $task_def \
  --service-name $ecs_service \
  --desired-count 1 \
  --launch-type EC2 \
  --scheduling-strategy REPLICA \
  --load-balancers \"targetGroupArn=$target_group_arn,containerName=$container_name,containerPort=80\" \
  --deployment-configuration \"maximumPercent=200,minimumHealthyPercent=100\"
It works fine the first time but fails at subsequent deployments with this error:
An error occurred (InvalidParameterException) when calling the CreateService operation: Creation of service was not idempotent.
I believe I have to use the command update-service instead, but I'm not sure how to write an ECS CLI command to check whether an ECS service already exists. One way I can think of is to check the return code from the create-service CLI command to see if it equals 0, but again I'm not sure how to retrieve it from the pipeline. Thanks for your help.
First you can run aws ecs describe-services to check whether the service is already there and its status is ACTIVE. If it is, run aws ecs update-service to update the existing service; if it's not ACTIVE, run aws ecs create-service to create the service.
This is a sample code you can use to check if a service is already created and active:
aws ecs describe-services --cluster CLUSTER_NAME --services SERVICE_NAME | jq --raw-output 'select(.services[].status != null ) | .services[].status'
and then use an if condition to run update-service or create-service.
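A minimal sketch of that condition in shell (CLUSTER_NAME, SERVICE_NAME, and TASK_DEF are placeholders, and the create-service flags are abbreviated):
status=$(aws ecs describe-services --cluster "$CLUSTER_NAME" --services "$SERVICE_NAME" \
  | jq --raw-output 'select(.services[].status != null) | .services[].status')
if [ "$status" = "ACTIVE" ]; then
  # Service exists and is active: roll out the new task definition.
  aws ecs update-service --cluster "$CLUSTER_NAME" --service "$SERVICE_NAME" --task-definition "$TASK_DEF"
else
  # Service missing (or inactive): create it from scratch.
  aws ecs create-service --cluster "$CLUSTER_NAME" --service-name "$SERVICE_NAME" \
    --task-definition "$TASK_DEF" --desired-count 1
fi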
This is my code in case anyone is interested.
#Jenkinsfile pipeline
stage('Deploy') {
    steps {
        ....
        script {
            // jq '.services | length' counts matching services; cast the trimmed stdout to an int
            int count = sh(script: """
                aws ecs describe-services --cluster $ecs_cluster --services $ecs_service | jq '.services | length'
            """, returnStdout: true).trim() as int
            echo "service count: $count"
            if (count > 0) {
                //ECS service exists: update
                echo "Updating ECS service $ecs_service..."
                sh(script: """
                    aws ecs update-service \
                        --cluster $ecs_cluster \
                        --service $ecs_service \
                        --task-definition $task_def
                """)
            }
            else {
                //ECS service does not exist: create new
                echo "Creating new ECS service $ecs_service..."
                sh(script: """
                    aws ecs create-service \
                        --cluster $ecs_cluster \
                        --task-definition $task_def \
                        --service-name $ecs_service \
                        --desired-count 1 \
                        --launch-type EC2 \
                        --scheduling-strategy REPLICA \
                        --load-balancers \"targetGroupArn=${target_group_arn},containerName=$app_name,containerPort=80\" \
                        --deployment-configuration \"maximumPercent=200,minimumHealthyPercent=100\"
                """)
            }
        }
    }
}
Use the --query parameter in the AWS CLI:
#!/bin/bash
CLUSTER="test-cluster-name"
SERVICE="test-service-name"
echo "check ECS service exists"
status=$(aws ecs describe-services --cluster ${CLUSTER} --services ${SERVICE} --query 'failures[0].reason' --output text)
if [[ "${status}" == "MISSING" ]]; then
echo "ecs service ${SERVICE} missing in ${CLUSTER}"
exit 1
fi
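The same check can drive the create-or-update decision directly; a sketch under the same placeholders (TASK_DEF is an assumption, and the flags are abbreviated):
if [[ "${status}" == "MISSING" ]]; then
  echo "ecs service ${SERVICE} missing in ${CLUSTER}; creating it"
  aws ecs create-service --cluster "${CLUSTER}" --service-name "${SERVICE}" \
    --task-definition "${TASK_DEF}" --desired-count 1
else
  echo "ecs service ${SERVICE} exists in ${CLUSTER}; updating it"
  aws ecs update-service --cluster "${CLUSTER}" --service "${SERVICE}" --task-definition "${TASK_DEF}"
fi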
The same in a Jenkins pipeline:
stage('Deploy') {
    steps {
        ....
        script {
            String status = sh(script: """
                aws ecs describe-services --cluster $ecs_cluster --services $ecs_service --query 'failures[0].reason' --output text
            """, returnStdout: true).trim()
            echo "service status: $status"
            if (status == 'MISSING') {
                echo "ecs service: $ecs_service does not exist in $ecs_cluster"
                // put create service logic here
            }
        }
    }
}

How to make Terraform wait for cloudinit to finish?

In my Terraform AWS Docker Swarm module I use cloud-init to initialize the EC2 instance. However, Terraform says the resource is ready before cloud-init finishes. Is there a way of making it wait for cloud-init to finish, ideally without SSHing in or checking for a port to be up using a null resource?
Your managers and workers both use template_cloudinit_config, and they also have the ec2:CreateTags permission.
You can use an EC2 resource tag like trajano/terraform-docker-swarm-aws/cloudinit-complete to indicate that cloud-init has finished.
You could add this final part to each to invoke a tagging script:
part {
  filename     = "tag_complete.sh"
  content      = local.tag_complete_script
  content_type = "text/x-shellscript"
}
And declare tag_complete_script to be the following:
locals {
  tag_complete_script = <<-EOF
    #!/bin/bash
    # IMDSv2: fetch a session token, then ask the metadata service for this instance's id
    TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
    instance_id=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)
    aws ec2 create-tags --resources "$instance_id" --tags 'Key=trajano/terraform-docker-swarm-aws/cloudinit-complete,Value=true'
  EOF
}
Then with a null_resource, you wait for the tag to appear (wrote this on my phone, so use it for a general idea, but I don't expect that it will work without testing and edits):
resource "null_resource" "wait_for_cloudinit" {
provisioner "local-exec" {
command = <<-EOF
#!/bin/bash
poll_tags="aws ec2 describe-tags --filters 'Name=resource-id,Values=${join(",", aws_instance.managers[*].id)}' 'Name=key,Values=trajano/terraform-docker-swarm-aws/cloudinit-complete' --output text --query 'Tags[*].Value'"
expected='${join(",", formatlist("true", aws_instance.managers[*].id))}'
$tags="$($poll_tags)"
while [[ "$tags" != "$expected" ]] ; do
$tags="$($poll_tags)"
done
EOF
}
}
This way you can declare a dependency on null_resource.wait_for_cloudinit in any resources that need to run after cloud-init has completed.
Another possible approach is using AWS Systems Manager Run Command, if available on your AMI.
You create an SSM Document with Terraform that uses the cloud-init status --wait command, then you trigger the command from a local provisioner and wait for it to complete. In this way, you don't have to play around with tags, and you are 100% sure cloud-init has completed.
This is an example of the document you can create with Terraform:
resource "aws_ssm_document" "cloud_init_wait" {
name = "cloud-init-wait"
document_type = "Command"
document_format = "YAML"
content = <<-DOC
schemaVersion: '2.2'
description: Wait for cloud init to finish
mainSteps:
- action: aws:runShellScript
name: StopOnLinux
precondition:
StringEquals:
- platformType
- Linux
inputs:
runCommand:
- cloud-init status --wait
DOC
}
and then you can use a local-exec provisioner inside the EC2 instance block, or in a null resource, depending on what you need to do with it.
The provisioner would be more or less like this:
provisioner "local-exec" {
interpreter = ["/bin/bash", "-c"]
command = <<-EOF
set -Ee -o pipefail
export AWS_DEFAULT_REGION=${data.aws_region.current.name}
command_id=$(aws ssm send-command --document-name ${aws_ssm_document.cloud_init_wait.arn} --instance-ids ${self.id} --output text --query "Command.CommandId")
if ! aws ssm wait command-executed --command-id $command_id --instance-id ${self.id}; then
echo "Failed to start services on instance ${self.id}!";
echo "stdout:";
aws ssm get-command-invocation --command-id $command_id --instance-id ${self.id} --query StandardOutputContent;
echo "stderr:";
aws ssm get-command-invocation --command-id $command_id --instance-id ${self.id} --query StandardErrorContent;
exit 1;
fi;
echo "Services started successfully on the new instance with id ${self.id}!"
EOF
}
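One prerequisite this approach assumes: the instance must be running the SSM agent and have an instance profile that permits Systems Manager, otherwise send-command fails. A quick pre-check sketch ($INSTANCE_ID is a placeholder):
# Prints "Online" once the instance has registered with Systems Manager.
aws ssm describe-instance-information \
  --filters "Key=InstanceIds,Values=$INSTANCE_ID" \
  --query 'InstanceInformationList[0].PingStatus' \
  --output text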

Is there any way to figure out which AWS account a public IP belongs to?

I have multiple AWS accounts and I don't remember in which account this EC2 instance was created. Is there an efficient way to figure this out quickly?
Note: I need to know the account DNS name or alias name (not the account number).
If you have access to the instance, you could use the instance metadata API:
[ec2-user ~]$ curl http://169.254.169.254/latest/dynamic/instance-identity/document
It returns JSON with an accountId field.
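A sketch that pulls out just the account ID, using IMDSv2 and assuming jq is installed:
# IMDSv2: fetch a session token, then read the identity document.
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .accountId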
If you configure the AWS CLI for all accounts, then you can get the account ID, ARN, and user ID.
The script does the following:
Get the list of AWS configuration profiles
Loop over all profiles
Get a list of all EC2 public IP addresses
Print the account info and exit if the IP matches
RUN
./script.sh 52.x.x.x
script.sh
#!/bin/bash
INSTANCE_IP="${1}"
if [ -z "${INSTANCE_IP}" ]; then
  echo "please provide an instance IP"
  echo "./script.sh 54.x.x.x"
  exit 1
fi
PROFILE_LIST=$(grep -o "\\[[^]]*]" < ~/.aws/credentials | tr -d "[]")
for PROFILE in $PROFILE_LIST; do
  ALL_IPS=$(aws ec2 describe-instances --profile "${PROFILE}" --query "Reservations[].Instances[][PublicIpAddress]" --output text | tr '\r\n' ' ')
  echo "looking against profile ${PROFILE}"
  for IP in $ALL_IPS; do
    if [ "${INSTANCE_IP}" == "${IP}" ]; then
      echo "Instance IP matched in below account"
      # use the matching profile here, otherwise this reports the default account
      aws sts get-caller-identity --profile "${PROFILE}"
      exit 0
    fi
  done
done
echo "it seems the instance does not belong to any of these profiles:"
echo "${PROFILE_LIST}"
exit 1
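Since the question asks for the account alias rather than the number, a hedged follow-up you could add just before the exit 0 in the matched branch (requires iam:ListAccountAliases on each profile):
# Prints the first account alias for the matched profile, if one is set.
aws iam list-account-aliases --profile "${PROFILE}" \
  --query 'AccountAliases[0]' --output text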
Loop over accounts, loop over regions, and also be aware of Lightsail!
I came up with the following and it helped me. I didn't exclude the regions that don't have Lightsail.
for region in $(aws ec2 describe-regions --output text --query 'Regions[*].[RegionName]' --region eu-west-1); do \
  echo $region; \
  aws ec2 describe-network-interfaces --output text --filters Name=addresses.private-ip-address,Values="IPv4 address" --region $region; \
  aws lightsail get-instances --output text --query 'instances[*].[name,publicIpAddress]' --region $region; \
done

AWS Aurora MySQL Database cloning using CLI

I want to create a copy of my production Aurora MySQL database on a weekly basis. The copies will be used for development.
I like the clone feature of Aurora MySQL but unfortunately, the instructions to create these clones from the AWS CLI are not clear.
Following the docs, I am able to create another Aurora cluster, but it doesn't create the DBs; it just creates an empty cluster. I am not able to figure out the commands to create a new DB inside this cluster from a snapshot of the DB in the source cluster, as restore-db-instance-from-db-snapshot is not supported for Aurora MySQL.
Please let me know the commands to clone the Aurora Cluster along with the DBs inside it.
According to the AWS documentation, this is a two-phase process.
First you create a new cluster with:
aws rds restore-db-cluster-to-point-in-time \
  --source-db-cluster-identifier arn:aws:rds:eu-central-1:AAAAAAAAAAA:cluster:BBBBBBBBBB-cluster \
  --db-cluster-identifier YYYYYYYYYY-cluster \
  --restore-type copy-on-write \
  --use-latest-restorable-time
When this completes, the data store has been created and is ready to be used, but there are no Aurora instances running.
The second step would be to create one (or more) instances:
aws rds create-db-instance \
  --db-cluster-identifier YYYYYYYYYY-cluster \
  --db-instance-class <value> \
  --engine <value>
(other optional values)
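A sketch of how to block until both pieces are usable before connecting (YYYYYYYYYY-cluster is the placeholder from above; YYYYYYYYYY-instance stands in for whatever --db-instance-identifier you chose):
# Poll the cluster status, then use the built-in waiter for the instance.
until [ "$(aws rds describe-db-clusters \
    --db-cluster-identifier YYYYYYYYYY-cluster \
    --query 'DBClusters[0].Status' --output text)" = "available" ]; do
  sleep 15
done
aws rds wait db-instance-available --db-instance-identifier YYYYYYYYYY-instance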
The concept of Aurora DB cloning took me a while to get my head around. With Aurora, the data is actually part of the cluster; the database instance gets its data from the cluster. To clone one Aurora cluster to another, you clone the cluster and then create a DB instance in the new cluster. The DB instance you create in the new cluster gets its data from the cluster in which it is created. Phew! That was a long explanation. Anyway, the shell script below is something I run from a cron job and it works for me (so far). The security group IDs below are fake for this example, obviously.
#!/bin/bash
start=$(date +%s)
NOW_DATE=$(date '+%Y-%m-%d-%H-%M')
SOURCE_CLUSTER_INSTANCE_ID=source-aurora-cluster
TARGET_CLUSTER_INSTANCE_ID=target-aurora-cluster
TARGET_CLUSTER_INSTANCE_CLASS=db.r3.large
TARGET_ENGINE="aurora-mysql"
NEW_MASTER_PASS=setyourpasshere
SECURITY_GROUP_ID=sg-0cbc97f44ed74d652
SECURITY_GROUP_ID_DEV=sg-0b36b590347ba8796
SECURITY_GROUP_ID_ADMIN=sg-04032188f428031fd
BACKUP_RETENTION=7
echo -e "\e[93mDeleting existing RDS instance ${TARGET_CLUSTER_INSTANCE_ID} ..."
aws rds delete-db-instance --db-instance-identifier $TARGET_CLUSTER_INSTANCE_ID --skip-final-snapshot
echo -e "\e[93mWaiting for database deletion to complete..."
sleep 10
aws rds wait db-instance-deleted --db-instance-identifier $TARGET_CLUSTER_INSTANCE_ID
echo -e "\e[92mFinished deleting old ${TARGET_CLUSTER_INSTANCE_ID} RDS instance."
EXISTING_CLUSTER_INSTANCE=$(aws rds describe-db-instances --db-instance-identifier $TARGET_CLUSTER_INSTANCE_ID --query 'DBInstances[0].[DBInstanceIdentifier]' --output text)
echo -e "\e[93mDeleting existing cluster instance ${TARGET_CLUSTER_INSTANCE_ID} ..."
aws rds delete-db-cluster --db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID --skip-final-snapshot
echo -e "\e[93mWaiting for cluster deletion to complete..."
status="available"
while [ "$status" == "available" ] || [ "$status" == "deleting" ]; do
sleep 10
status=$(aws rds describe-db-clusters --db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID --query "*[].{DBClusters:Status}" --output text)
echo " status = $status "
done
echo -e "\e[92mFinished deleting old ${TARGET_CLUSTER_INSTANCE_ID} cluster."
echo -e "\e[93mRestoring cluster ${SOURCE_CLUSTER_INSTANCE_ID} to new cluster ${TARGET_CLUSTER_INSTANCE_ID} ..."
CLUSTERRESTORECOMMAND="
aws rds restore-db-cluster-to-point-in-time \
  --source-db-cluster-identifier $SOURCE_CLUSTER_INSTANCE_ID \
  --db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID \
  --restore-type copy-on-write \
  --use-latest-restorable-time "
eval $CLUSTERRESTORECOMMAND
status=unknown
while [ "$status" != "available" ]; do
  sleep 10
  status=$(aws rds describe-db-clusters --db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID --query "*[].{DBClusters:Status}" --output text)
done
echo -e "\e[93mModifying cluster ${TARGET_CLUSTER_INSTANCE_ID} settings..."
CREATECLUSTERCOMMAND="
aws rds modify-db-cluster \
--db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID \
--master-user-password $NEW_MASTER_PASS \
--vpc-security-group-ids $SECURITY_GROUP_ID $SECURITY_GROUP_ID_DEV $SECURITY_GROUP_ID_ADMIN \
--backup-retention-period $BACKUP_RETENTION \
--apply-immediately "
eval $CREATECLUSTERCOMMAND
status_modify=unknown
while [ "$status_modify" != "available" ]; do
  sleep 10
  status_modify=$(aws rds describe-db-clusters --db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID --query "*[].{DBClusters:Status}" --output text)
done
echo -e "\e[92mModifications to ${TARGET_CLUSTER_INSTANCE_ID} complete."
echo " create RDS instance within new cluser ${TARGET_CLUSTER_INSTANCE_ID}."
CREATEDBCOMMAND="
aws rds create-db-instance \
--db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID \
--db-instance-identifier $TARGET_CLUSTER_INSTANCE_ID \
--db-instance-class $TARGET_CLUSTER_INSTANCE_CLASS \
--publicly-accessible
--engine $TARGET_ENGINE "
eval $CREATEDBCOMMAND
# neeed to wait until the new db is in an available state
while [ "${exit_status3}" != "0" ]; do
echo -e "\e[93mWaiting for ${TARGET_CLUSTER_INSTANCE_ID} to enter 'available' state..."
aws rds wait db-instance-available --db-instance-identifier $TARGET_CLUSTER_INSTANCE_ID
exit_status3="$?"
INSTANCE_STATUS=$(aws rds describe-db-instances --db-instance-identifier $TARGET_CLUSTER_INSTANCE_ID --query 'DBInstances[0].[DBInstanceStatus]' --output text)
echo -e "\e[92m${TARGET_CLUSTER_INSTANCE_ID} is now ${INSTANCE_STATUS}."
echo -e "\e[92mCreation of ${TARGET_CLUSTER_INSTANCE_ID} complete."
done
echo -e "\e[92mFinished clone of ${SOURCE_DB_INSTANCE_ID} to ${TARGET_CLUSTER_INSTANCE_ID}!"
end=$(date +%s)
runtime=$((end - start))
displaytime=$(displaytime runtime)
echo -e "\e[92mFinished clone of '${SOURCE_DB_INSTANCE_ID}' to '${TARGET_CLUSTER_INSTANCE_ID}"
echo -e "\e[92mThe script took ${displaytime} to run."
exit 0
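As an aside, newer AWS CLI releases ship cluster-level waiters that could replace the polling loops above; check aws rds wait help on your version before relying on them:
# Assumption: your CLI version includes the db-cluster-available waiter.
aws rds wait db-cluster-available --db-cluster-identifier $TARGET_CLUSTER_INSTANCE_ID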
The answer is correct. One important detail that isn't mentioned (and that had me thinking it didn't work) is that the security groups won't necessarily be the same, so to make the DB available you need to set the same or appropriate security groups, plus make the DB public. I am providing a snippet for the Java API:
private final AmazonRDS rds;

rds.restoreDBClusterToPointInTime(
    new RestoreDBClusterToPointInTimeRequest()
        .withSourceDBClusterIdentifier("sourceClusterIdentifier")
        .withDBClusterIdentifier("targetName")
        .withRestoreType("copy-on-write")
        .withVpcSecurityGroupIds("vpc_group_id_to_be_found") // important
        .withUseLatestRestorableTime(true));

DBInstance instanceOfDb = rds.createDBInstance(new CreateDBInstanceRequest()
    .withDBClusterIdentifier("targetName")
    .withDBInstanceIdentifier("targetName-cluster")
    .withEngine("aurora-postgresql")
    .withDBInstanceClass("db.r4.large")
    .withPubliclyAccessible(true) // important
    .withMultiAZ(false));

rds.waiters().dBInstanceAvailable()
    .run(new WaiterParameters<>(new DescribeDBInstancesRequest()
        .withDBInstanceIdentifier(instanceOfDb.getDBInstanceIdentifier()))
        .withPollingStrategy(new PollingBuilder().delay(30).maxWait(30, TimeUnit.MINUTES).build()));