AWS Waiter TasksStopped failed: taskId length should be one of

For some reason I am getting the following error:
Waiter TasksStopped failed: taskId length should be one of [32,36]
I really don't know what taskId is supposed to mean, and the AWS documentation isn't helping. Does anyone know what is going wrong in this pipeline script?
- step:
    name: Run DB migrations
    script:
      - >
        export BackendTaskArn=$(aws cloudformation list-stack-resources \
          --stack-name=${DEXB_PRODUCTION_STACK} \
          --output=text \
          --query="StackResourceSummaries[?LogicalResourceId=='BackendECSTask'].PhysicalResourceId")
      - >
        SequelizeTask=$(aws ecs run-task --cluster=${DEXB_PRODUCTION_ECS_CLUSTER} --task-definition=${BackendTaskArn} \
          --overrides='{"containerOverrides":[{"name":"NodeBackend","command":["./node_modules/.bin/sequelize","db:migrate"]}]}' \
          --launch-type=EC2 --output=text --query='tasks[0].taskArn')
      - aws ecs wait tasks-stopped --cluster=${DEXB_PRODUCTION_ECS_CLUSTER} --tasks ${SequelizeTask}

AWS introduced a new ARN format for tasks, container instances, and services. This format now contains the cluster name, which might break scripts and applications that were counting on the ARN only containing the task resource ID.
# Previous format (taskId contains hyphens)
arn:aws:ecs:$region:$accountID:task/$taskId
# New format (taskId does not contain hyphens)
arn:aws:ecs:$region:$accountId:task/$clusterName/$taskId
Until March 31, 2021, it is possible to opt out of this change per Region using https://console.aws.amazon.com/ecs/home?#/settings. To change the behavior for the whole account, you will need to use the account's root user.
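If your script only needs the task ID itself (for example, to sanity-check its length), a small shell sketch like the following handles both formats, since the ID is always the last path segment. The ARN below is a made-up placeholder:
# Works for both ARN formats shown above: the task ID is the final "/" segment.
taskArn="arn:aws:ecs:us-east-1:123456789012:task/my-cluster/0123456789abcdef0123456789abcdef"
taskId="${taskArn##*/}"                # strip everything up to and including the last "/"
echo "${taskId} (${#taskId} chars)"    # expect 32 (new format) or 36 (old format)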

It turns out I had a duplicate task running in the background. I went to the ECS clusters page and stopped the duplicate task. However, this may be dangerous if you have used CloudFormation to set up your tasks and services, so proceed cautiously if you're in the same boat.

We were bitten by this cryptic error message, and what it actually means is that the task_id you're passing to the script is invalid. Task IDs must have a length of 32 or 36 characters.
In our case, an undocumented change in the way AWS sent back the taskArn value caused us to grab the wrong field and pass an unrelated string as the task_id. AWS detected this and blew up. So double-check the task_id string and you should be good.


AWS Service Quota: How to get service quota for Amazon S3 using boto3

I get the error "An error occurred (NoSuchResourceException) when calling the GetServiceQuota operation:" while running the following boto3 Python code to get the value of the quota for "Buckets":
client_quota = boto3.client('service-quotas')
resp_s3 = client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
In the above code, the QuotaCode "L-89BABEE8" is for "Buckets". I presumed the value of ServiceCode for Amazon S3 would be "s3", so I put it there, but I guess that is wrong and is throwing the error. I tried to find documentation on the ServiceCode for S3 but could not. I even tried "S3" (uppercase 'S') and "Amazon S3", but those didn't work either.
What I tried?
client_quota = boto3.client('service-quotas')
resp_s3 = client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
What I expected?
Output in the format below for S3. The example below is for EC2, which is the output of resp_ec2 = client_quota.get_service_quota(ServiceCode='ec2', QuotaCode='L-6DA43717').
I just played around with this and I'm seeing the same thing you are: empty responses from any Service Quotas list or get command for service s3. However, s3 is definitely the correct service code, because it comes back from the Service Quotas list_services() call. Then I saw there are also list and get commands for AWS default service quotas, and when I tried those they came back with data. I'm not entirely sure, but based on the docs I think any quota that can't be adjusted, and possibly any quota your account hasn't requested an adjustment for, will come back with an empty response from get_service_quota(), and you'll need to call get_aws_default_service_quota() instead.
So I believe what you need to do is probably run this first:
client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
And if that throws an exception, then run the following:
client_quota.get_aws_default_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
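Putting those two calls together, a minimal boto3 sketch of that fallback (reusing the quota code from the question) might look like this:
import boto3
from botocore.exceptions import ClientError

client_quota = boto3.client('service-quotas')

def get_s3_bucket_quota():
    """Try the account-specific quota first; fall back to the AWS default."""
    try:
        return client_quota.get_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
    except ClientError as ex:
        if ex.response['Error']['Code'] == 'NoSuchResourceException':
            # No applied (account-specific) value exists; ask for the default instead.
            return client_quota.get_aws_default_service_quota(ServiceCode='s3', QuotaCode='L-89BABEE8')
        raise

print(get_s3_bucket_quota()['Quota']['Value'])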

How to get CloudWatch Logs for the last 30 minutes at the command prompt?

How can I get the last 30 minutes of AWS CloudWatch logs written to a specific log stream, using an AWS command?
Can you describe what you already tried yourself and what you ran into? Looking at the AWS CLI command reference, it seems that you should be able to run "aws logs get-log-events --log-group-name <name of the group> --log-stream-name <name of the stream> --start-time <timestamp>" to get a list of events starting at a given UNIX timestamp (in milliseconds); calculating that timestamp should be fairly trivial.
Addition, based on your comment: you'll need to look into the AWS concept of pagination. Most/many AWS API calls (which the CLI also makes for you) retrieve a size/length limited set of data and return a token if there is more data present. You can then make a subsequent call passing that token, which tells the service to return data starting at that token. Repeat this process until you no longer get a token back, at which point you know you have iterated the full dataset.
For this specific CLI command, there is a flag.
--next-token (string)
The token for the next set of items to return. (You received this token from a previous call.)
Hope this helps?
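To make that concrete, here is a rough sketch (the log group and stream names are placeholders) that computes the timestamp and fetches the last 30 minutes:
# CloudWatch Logs timestamps are epoch milliseconds; compute "30 minutes ago".
start_time=$(( ( $(date +%s) - 30*60 ) * 1000 ))

# Fetch events from the stream, starting at that timestamp.
aws logs get-log-events \
  --log-group-name "/my/log-group" \
  --log-stream-name "my-log-stream" \
  --start-time "${start_time}"

# If the response contains a nextForwardToken, pass it back via --next-token
# and repeat until the token stops changing (the pagination described above).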

GKE cluster creator in GCP

How can we get the cluster owner details in GKE? The logging only contains entries for service account operations, and there is no entry with a principal email or user ID anywhere.
It seems very difficult to get the name of the user who created the GKE cluster.
We have exported the complete JSON file of logs but did not find an entry for the user who actually clicked the create-cluster button. I think knowing the GKE cluster creator is a very common use case; not sure if we are missing something.
Query:
resource.type="k8s_cluster"
resource.labels.cluster_name="clusterName"
resource.labels.location="us-central1"
-protoPayload.methodName="io.k8s.core.v1.configmaps.update"
-protoPayload.methodName="io.k8s.coordination.v1.leases.update"
-protoPayload.methodName="io.k8s.core.v1.endpoints.update"
severity=DEFAULT
-protoPayload.authenticationInfo.principalEmail="system:addon-manager"
-protoPayload.methodName="io.k8s.apiserver.flowcontrol.v1beta1.flowschemas.status.patch"
-protoPayload.methodName="io.k8s.certificates.v1.certificatesigningrequests.create"
-protoPayload.methodName="io.k8s.core.v1.resourcequotas.delete"
-protoPayload.methodName="io.k8s.core.v1.pods.create"
-protoPayload.methodName="io.k8s.apiregistration.v1.apiservices.create"
I have referred to the link below, but it did not help either.
https://cloud.google.com/blog/products/management-tools/finding-your-gke-logs
See Audit Logs and, specifically, Admin Activity Logs.
And, there's a "trick": The activity audit log entries include the API method. You can find the API method that interests you. This isn't super straightforward but it's relatively easy. You can start by scoping to the service. For GKE, the service is container.googleapis.com.
NOTE See APIs Explorer and the Kubernetes Engine API (really container.googleapis.com) and projects.locations.clusters.create. The mechanism breaks down a little here, as protoPayload.methodName is a variant of the underlying REST method name.
And so you can use logs explorer with the following very broad query:
logName="projects/{PROJECT}/logs/cloudaudit.googleapis.com%2Factivity"
container.googleapis.com
NOTE replace {PROJECT} with your project ID.
And then refine this based on what's returned:
logName="projects/{PROJECT}/logs/cloudaudit.googleapis.com%2Factivity"
protoPayload.serviceName="container.googleapis.com"
protoPayload.methodName="google.container.v1beta1.ClusterManager.CreateCluster"
NOTE I mentioned that it isn't super straightforward because, as you can see in the above, I'd used gcloud beta container clusters create and so I need the google.container.v1beta1.ClusterManager.CreateCluster method but, it was easy to determine this from the logs.
And, who dunnit?
protoPayload: {
  authenticationInfo: {
    principalEmail: "{me}"
  }
}
So:
PROJECT="[YOUR-PROJECT]"
FILTER="
logName=\"projects/${PROJECT}/logs/cloudaudit.googleapis.com%2Factivity\"
protoPayload.serviceName=\"container.googleapis.com\"
protoPayload.methodName=\"google.container.v1beta1.ClusterManager.CreateCluster\"
"
gcloud logging read "${FILTER}" \
--project=${PROJECT} \
--format="value(protoPayload.authenticationInfo.principalEmail)"
For those who are looking for a quick answer:
Use the log filter below in Logs Explorer to check the creator of the cluster.
resource.type="gke_cluster"
protoPayload.authorizationInfo.permission="container.clusters.create"
resource.labels.cluster_name="your-cluster-name"
From the gcloud command line, you can get the creation date of the cluster:
gcloud container clusters describe YOUR_CLUSTER_NAME --zone ZONE
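If you prefer the command line over Logs Explorer, roughly the same filter can be passed to gcloud logging read (the cluster name and project ID below are placeholders):
gcloud logging read '
  resource.type="gke_cluster"
  protoPayload.authorizationInfo.permission="container.clusters.create"
  resource.labels.cluster_name="your-cluster-name"
' \
  --project=YOUR_PROJECT_ID \
  --format="value(protoPayload.authenticationInfo.principalEmail)"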

An error occurred (ValidationError) when calling the UpdateStack operation: No updates are to be performed

What is the correct command syntax for checking whether or not a specific call to aws cloudformation update-stack will result in any changes being made?
The problem we are having is that an automation program that runs an aws cloudformation update-stack --stack-name ourstackname --template-body file://c:\path\to\ourtemplate.json --parameters ParameterKey=someKey,ParameterValue=someValue ... command is failing with the following error:
An error occurred (ValidationError) when calling the UpdateStack operation: No updates are to be performed.
The result is a 254 exit code, which, as we can tell from this documentation link, can indicate a lot of possible problems. So handling that 254 exit code would NOT help us.
What aws cloudformation cli command syntax can we type instead to have the automation process receive a 0 response code in cases where no changes will be made? For example, a --flag added to the aws cloudformation update-stack ... command to return 0 when no changes are made.
Alternatively, if there were some preview command that returned 0 indicating that NO CHANGES WILL BE MADE, then our automation could simply refrain from calling aws cloudformation update-stack ... in that situation.
Terraform, for example, defaults to simply succeeding, while reporting that no changes have been made, when presented with this use case.
Since you are asking for a "preview", I suggest you try creating a Change Set, reviewing its output, and then deciding whether to execute it in case some changes are listed.
The commands below have been tested in bash/zsh, you might need to tweak it a bit in a Windows environment (unfortunately I have no way to test in a Windows machine right now).
# start the change-set and get its ID (note the extra change-set-name, output and query params)
myid=$(aws cloudformation create-change-set --change-set-name my-change --stack-name ourstackname --template-body file://ourtemplate.json --parameters ParameterKey=someKey,ParameterValue=someValue --output text --query Id)
# wait for the change set creation to finish
aws cloudformation wait change-set-create-complete --change-set-name $myid 2> /dev/null
# get the result status in a variable
result_status=$(aws cloudformation describe-change-set --change-set-name $myid --output text --query Status)
# only execute the change set if there were changes, i.e. Status is CREATE_COMPLETE (if there are no changes, Status will be FAILED)
[[ "$result_status" == "CREATE_COMPLETE" ]] && aws cloudformation execute-change-set --change-set-name $myid
# cleanup change-set afterwards if you want to re-use the name "my-change" from the 1st command, otherwise just leave it
aws cloudformation delete-change-set --change-set-name $myid
Update: the author asked for a more resilient Python version of the implementation:
from typing import Dict, Tuple

import boto3
from botocore.exceptions import ClientError, WaiterError


def main():
    template_file = "../s3-bucket.yaml"
    with open(template_file) as template_fileobj:
        template_data = template_fileobj.read()
    client = boto3.client('cloudformation')
    changeset_applied = _create_and_execute_changeset(
        client, 'my-stack', 'my-changeset', template_data)
    print(changeset_applied)


def _create_and_execute_changeset(client: boto3.client, stack_name: str, changeset_name: str, template_body: str) -> bool:
    client.validate_template(TemplateBody=template_body)
    response_create_changeset = client.create_change_set(
        StackName=stack_name,
        ChangeSetName=changeset_name,
        TemplateBody=template_body,
    )
    changeset_id = response_create_changeset['Id']
    apply_changeset = True
    waiter = client.get_waiter('change_set_create_complete')
    try:
        waiter.wait(ChangeSetName=changeset_id)
    except WaiterError as ex:
        if ex.last_response['Status'] == 'FAILED' and ex.last_response['StatusReason'].startswith('The submitted information didn\'t contain changes'):
            apply_changeset = False
        else:
            raise
    if apply_changeset:
        client.execute_change_set(ChangeSetName=changeset_id)
        # executed change sets clean up automatically
    else:
        # clean up the change set that was not executed
        client.delete_change_set(ChangeSetName=changeset_id)
    return apply_changeset


if __name__ == '__main__':
    main()
Instead of aws cloudformation update-stack, you can use the aws cloudformation deploy command.
Documentation for aws cloudformation deploy
Based on your exact requirements, the following two flags described in the linked documentation can be used:
--no-execute-changeset (boolean) Indicates whether to execute the change set. Specify this flag if you want to view your stack changes before executing the change set. The command creates an AWS CloudFormation change set and then exits without executing the change set. After you view the change set, execute it to implement your changes.
--fail-on-empty-changeset | --no-fail-on-empty-changeset (boolean) Specify if the CLI should return a non-zero exit code if there are no changes to be made to the stack. The default behavior is to return a zero exit code.
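For illustration, a hedged sketch of replacing the update-stack call from the question (the stack name and parameter are taken from there; the template path is a placeholder) might be:
# Creates the stack if it doesn't exist, updates it if it does; the
# --no-fail-on-empty-changeset flag is passed explicitly so an empty
# change set exits with code 0 regardless of the CLI default.
aws cloudformation deploy \
  --stack-name ourstackname \
  --template-file ourtemplate.json \
  --parameter-overrides someKey=someValue \
  --no-fail-on-empty-changeset
Depending on the template, you may also need to pass --capabilities CAPABILITY_IAM, just as you would with update-stack.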
Another advantage of using the deploy command is that it can be used to create a stack as well as update the stack if it already exists. If you are interested specifically in differences between deploy and update-stack or create-stack, this answer to a previous question provides more details.

AWS Elastic Beanstalk cron.yaml worker issue

I have an application deployed to Elastic Beanstalk that runs as a worker. I wanted to add a periodic task that runs each hour, so I created a cron.yaml with this configuration:
version: 1
cron:
  - name: "task1"
    url: "/task"
    schedule: "00 * * * *"
But during the deploy I always got this error:
[Instance: i-a072e41d] Command failed on instance. Return code: 1 Output: missing required parameter params[:table_name] - (ArgumentError). Hook /opt/elasticbeanstalk/addons/sqsd/hooks/start/02-start-sqsd.sh failed. For more detail, check /var/log/eb-activity.log using console or EB CLI.
I added the right permissions to the Elastic Beanstalk role, and I checked whether the cron.yaml was formatted for Windows (CR/LF), but I always get the same error.
missing required parameter params[:table_name] looks like a DynamoDB table name is missing. Where can I define it?
Any idea how I can fix this?
Thanks!
Well, I didn't figure out a solution to this issue, so I moved to another approach: use a CloudWatch Events rule of type schedule and select the SQS queue (the one configured with the worker) as the target.
Works perfectly!
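In case it helps someone taking the same route, a rough CLI sketch of that wiring (the rule name and queue ARN are placeholders) could look like this:
# Create a rule that fires every hour.
aws events put-rule \
  --name hourly-task \
  --schedule-expression "rate(1 hour)"

# Point the rule at the worker's SQS queue (placeholder ARN).
aws events put-targets \
  --rule hourly-task \
  --targets Id=1,Arn=arn:aws:sqs:us-east-1:123456789012:my-worker-queue
Note that the queue's access policy also has to allow events.amazonaws.com to send messages to it.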
I encountered this same error when I was dynamically generating a cron.yaml file in a container command instead of already having it in my application root.
The DynamoDB table for the cron is created in the PreInitStage, which occurs before any of your custom code executes, so if there is no cron.yaml file then no DynamoDB table is created. When the file later appears and the cron jobs are being scheduled, it fails because the table was never created.
I solved this problem by having a skeleton cron.yaml in my application root. It must contain a valid cron job (I just hit my health-check URL once a month), but that job doesn't actually get scheduled, since job registration happens after your custom commands, which can replace the file with only the jobs you need.
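For illustration, a skeleton along those lines (the /health URL is just an assumed endpoint) might look like:
version: 1
cron:
  # Placeholder job that keeps the cron DynamoDB table provisioned; it only
  # hits an assumed health-check endpoint once a month and gets replaced later.
  - name: "keepalive"
    url: "/health"
    schedule: "0 0 1 * *"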
This might not be your exact problem but hopefully it helps you find yours as it appears the error happens when the DynamoDB table does not get created.
It looks like your YAML formatting is off. That might be the issue here.
version: 1
cron:
  - name: "task1"
    url: "/task"
    schedule: "00 * * * *"
Formatting is critical in YAML. Give this a try at the very least.