Strange behavior while creating ES domain using AWS CLI with LocalStack

I'm trying to create an AWS Elasticsearch domain using the AWS CLI on my local machine with LocalStack.
All my domain configuration is in one file called mydomain.json, which I pass with the command awslocal es create-elasticsearch-domain --cli-input-json file://path/to/mydomain.json.
Here are the contents of the JSON file:
mydomain.json
{
    "DomainName": "mydomain",
    "ElasticsearchVersion": "6.8",
    "ElasticsearchClusterConfig": {
        "DedicatedMasterEnabled": true,
        "DedicatedMasterCount": 3,
        "ZoneAwarenessEnabled": true,
        "InstanceCount": 3,
        "InstanceType": "m3.medium.elasticsearch",
        "DedicatedMasterType": "m3.medium.elasticsearch"
    },
    "EBSOptions": {
        "EBSEnabled": true,
        "VolumeSize": 1,
        "VolumeType": "gp2"
    },
    "Tags": [
        {
            "Key": "Name",
            "Value": "Localstack_Elasticsearch_Domain"
        }
    ]
}
There are several errors I'm facing:
1. It should create the domain with a "DedicatedMasterCount" of 3, but it creates it with a "DedicatedMasterCount" of 1.
2. Even though the volume size is specified as "VolumeSize": 1, the domain is created with "VolumeSize": 10.
3. The "Tags" key is rejected with a validation error:
Parameter validation failed:
Unknown parameter in input: "Tags", must be one of: DomainName, ElasticsearchVersion, ElasticsearchClusterConfig, EBSOptions, AccessPolicies, SnapshotOptions, VPCOptions, CognitoOptions, EncryptionAtRestOptions, NodeToNodeEncryptionOptions, AdvancedOptions, LogPublishingOptions, DomainEndpointOptions, AdvancedSecurityOptions, AutoTuneOptions, TagList
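Judging by that validation message, this version of the CreateElasticsearchDomain API expects the tags under a TagList key rather than Tags. A minimal sketch of the same call via boto3, assuming a recent enough SDK; the endpoint URL and dummy credentials are assumptions based on LocalStack's default edge port:

import boto3

# Sketch only: "Tags" renamed to "TagList" as the validation error suggests.
# Endpoint and credentials assume LocalStack defaults (edge port 4566).
es = boto3.client(
    "es",
    endpoint_url="http://localhost:4566",
    region_name="us-east-1",
    aws_access_key_id="test",
    aws_secret_access_key="test",
)

es.create_elasticsearch_domain(
    DomainName="mydomain",
    ElasticsearchVersion="6.8",
    ElasticsearchClusterConfig={
        "DedicatedMasterEnabled": True,
        "DedicatedMasterCount": 3,
        "ZoneAwarenessEnabled": True,
        "InstanceCount": 3,
        "InstanceType": "m3.medium.elasticsearch",
        "DedicatedMasterType": "m3.medium.elasticsearch",
    },
    EBSOptions={"EBSEnabled": True, "VolumeSize": 1, "VolumeType": "gp2"},
    TagList=[{"Key": "Name", "Value": "Localstack_Elasticsearch_Domain"}],
)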


Is it possible to extract "instanceId" from EventBridge event data, and use it as Target Value?

I was able to set up AutoScaling events as rules in EventBridge to trigger SSM commands, but I've noticed that with my chosen Target Value the event is passed to all my active EC2 instances. My Target key is a tag shared by those instances, so my mistake makes sense now.
I'm pretty new to EventBridge, so I was wondering if there's a way to actually target the instance that triggered the AutoScaling event (as in extracting the "InstanceId" that's present in the event data and using that as my new Target Value). I saw the Input Transformer, but I think that just transforms the event data that gets passed to the target.
Thanks!
EDIT - help with JS code for Lambda + SSM Run Command
I realize I can achieve this by setting EventBridge to invoke a Lambda function instead of the SSM Run Command directly. Can anyone help with the JavaScript code to call a shell command on the EC2 instance specified in the event data (event.detail.EC2InstanceId)? I can't seem to find a relevant and up-to-date base template online, and I'm not familiar enough with JS or Lambda. Any help is greatly appreciated! Thanks
Sample of the event data, as per the AWS docs:
{
    "version": "0",
    "id": "12345678-1234-1234-1234-123456789012",
    "detail-type": "EC2 Instance Launch Successful",
    "source": "aws.autoscaling",
    "account": "123456789012",
    "time": "yyyy-mm-ddThh:mm:ssZ",
    "region": "us-west-2",
    "resources": [
        "auto-scaling-group-arn",
        "instance-arn"
    ],
    "detail": {
        "StatusCode": "InProgress",
        "Description": "Launching a new EC2 instance: i-12345678",
        "AutoScalingGroupName": "my-auto-scaling-group",
        "ActivityId": "87654321-4321-4321-4321-210987654321",
        "Details": {
            "Availability Zone": "us-west-2b",
            "Subnet ID": "subnet-12345678"
        },
        "RequestId": "12345678-1234-1234-1234-123456789012",
        "StatusMessage": "",
        "EndTime": "yyyy-mm-ddThh:mm:ssZ",
        "EC2InstanceId": "i-1234567890abcdef0",
        "StartTime": "yyyy-mm-ddThh:mm:ssZ",
        "Cause": "description-text"
    }
}
Edit 2 - my Lambda code so far
'use strict'

const ssm = new (require('aws-sdk/clients/ssm'))()

exports.handler = async (event) => {
    const instanceId = event.detail.EC2InstanceId
    const params = {
        DocumentName: "AWS-RunShellScript",
        InstanceIds: [instanceId],
        TimeoutSeconds: 30,
        Parameters: {
            commands: ["/path/to/my/ec2/script.sh"],
            workingDirectory: [],
            executionTimeout: ["15"]
        }
    };
    const data = await ssm.sendCommand(params).promise();
    const response = {
        statusCode: 200,
        body: "Run Command success",
    };
    return response;
}
Yes, but through Lambda
EventBridge -> Lambda (using SSM api) -> EC2
Thank you @Sándor Bakos for helping me out!! My JavaScript ended up not working for some reason, so I ended up just using part of the Python code linked in the comments.
1. Add ssm:SendCommand permission:
After I let Lambda create a basic role during function creation, I added an inline policy to allow Systems Manager's SendCommand. This needs access to your documents/*, instances/* and managed-instances/* resources (a hedged sketch of such a policy follows).
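For reference, a minimal sketch of attaching such an inline policy with boto3; the role and policy names here are hypothetical, and the resource ARNs mirror the documents/instances/managed-instances pattern mentioned above:

import json

import boto3

iam = boto3.client("iam")

# Hypothetical policy; resources mirror what ssm:SendCommand touches.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "ssm:SendCommand",
        "Resource": [
            "arn:aws:ssm:*:*:document/*",
            "arn:aws:ec2:*:*:instance/*",
            "arn:aws:ssm:*:*:managed-instance/*"
        ]
    }]
}

iam.put_role_policy(
    RoleName="my-lambda-basic-role",  # hypothetical: the role Lambda created
    PolicyName="AllowSsmSendCommand",
    PolicyDocument=json.dumps(policy)
)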
2. Code - Python 3.9
import boto3
import botocore

def lambda_handler(event=None, context=None):
    try:
        client = boto3.client('ssm')
        instance_id = event['detail']['EC2InstanceId']
        command = '/path/to/my/script.sh'
        client.send_command(
            InstanceIds=[instance_id],
            DocumentName='AWS-RunShellScript',
            Parameters={
                'commands': [command],
                'executionTimeout': ['60']
            }
        )
    except botocore.exceptions.ClientError as error:
        # Surface SSM failures in the Lambda logs.
        print(f"SSM SendCommand failed: {error}")
        raise
You can do this without using Lambda, as I just did, by using EventBridge's input transformers.
I specified a new automation document that called the document I was trying to use (AWS-ApplyAnsiblePlaybooks).
My document declared the InstanceId as a parameter, which is supplied by the input transformer from EventBridge. I had to pass the event into Lambda just to see how to parse the JSON event object to get the desired instance ID - this ended up being
$.detail.EC2InstanceId
(it was coming from an Auto Scaling group).
I then passed it into a template that was used for the runbook:
{"InstanceId":[<instance>]}
This template was read in my runbook as a parameter.
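For reference, a hedged sketch of wiring that transformer to the rule's target with boto3; the rule name and ARNs are hypothetical, while the path and template match what's described above:

import boto3

events = boto3.client("events")

events.put_targets(
    Rule="asg-launch-success",  # hypothetical rule name
    Targets=[{
        "Id": "run-automation-doc",
        # Hypothetical automation document and role ARNs:
        "Arn": "arn:aws:ssm:us-west-2:123456789012:automation-definition/MyRunbook:$DEFAULT",
        "RoleArn": "arn:aws:iam::123456789012:role/eventbridge-ssm-role",
        "InputTransformer": {
            "InputPathsMap": {"instance": "$.detail.EC2InstanceId"},
            "InputTemplate": '{"InstanceId":[<instance>]}'
        }
    }]
)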
These were the SSM playbook inputs I used to run the AWS-ApplyAnsiblePlaybooks document; I just mapped each parameter to the specified parameters in the nested playbook:
"inputs": {
"InstanceIds": ["{{ InstanceId }}"],
"DocumentName": "AWS-ApplyAnsiblePlaybooks",
"Parameters": {
"SourceType": "S3",
"SourceInfo": {"path": "https://testansiblebucketab.s3.amazonaws.com/"},
"InstallDependencies": "True",
"PlaybookFile": "ansible-test.yml",
"ExtraVariables": "SSM=True",
"Check": "False",
"Verbose": "-v",
"TimeoutSeconds": "3600"
}
See the document below for reference; they used a document that was already set up to receive the variable:
https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-tutorial-eventbridge-input-transformers.html
This is the full automation playbook I used; most of the parameters are defaults from the nested playbook:
{
    "description": "Runs Ansible Playbook on Launch Success Instances",
    "schemaVersion": "0.3",
    "assumeRole": "<Place your automation role ARN here>",
    "parameters": {
        "InstanceId": {
            "type": "String",
            "description": "(Required) The ID of the Amazon EC2 instance."
        }
    },
    "mainSteps": [
        {
            "name": "RunAnsiblePlaybook",
            "action": "aws:runCommand",
            "inputs": {
                "InstanceIds": ["{{ InstanceId }}"],
                "DocumentName": "AWS-ApplyAnsiblePlaybooks",
                "Parameters": {
                    "SourceType": "S3",
                    "SourceInfo": {"path": "https://testansiblebucketab.s3.amazonaws.com/"},
                    "InstallDependencies": "True",
                    "PlaybookFile": "ansible-test.yml",
                    "ExtraVariables": "SSM=True",
                    "Check": "False",
                    "Verbose": "-v",
                    "TimeoutSeconds": "3600"
                }
            }
        }
    ]
}
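As a follow-up, a sketch of registering that automation document with boto3, assuming the JSON above is saved locally; the file name and document name are hypothetical:

import boto3

ssm = boto3.client("ssm")

# Hypothetical file and document names.
with open("run-ansible-on-launch.json") as f:
    content = f.read()

ssm.create_document(
    Content=content,
    Name="RunAnsibleOnLaunch",
    DocumentType="Automation",
    DocumentFormat="JSON"
)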

How can I get the TaskId of a Fargate ecs Container

Similar to this question How to get Task ID from within ECS container? but I want to get the TaskId for my Fargate task. How can you do this? Like others, I want this for logging information.
I'm running a Spring app with an ELK stack for logging and would like, if possible, to include the TaskId in the logs.
Edit
I actually never got this to work, by the way. Here is my code:
private String getTaskIdInternal() {
    String metadataUri = System.getenv("ECS_CONTAINER_METADATA_URI_V4");
    // Check the variable before building the URL; concatenating null
    // would produce "null/task" and the check would never fire.
    if (metadataUri == null) {
        throw new RuntimeException("ECS_CONTAINER_METADATA_URI_V4 env variable not defined");
    }
    String url = metadataUri + "/task";
    logger.info("Getting ecsMetaDataURL={}", url);
    RestTemplate restTemplate = new RestTemplate();
    ResponseEntity<JsonNode> response = restTemplate.getForEntity(url, JsonNode.class);
    logger.info("ecsMetaData={}", response);
    JsonNode map = response.getBody();
    String taskArn = map.get("TaskARN").asText();
    String[] splitTaskArn = taskArn.split("/");
    String taskId = splitTaskArn[splitTaskArn.length - 1];
    logger.info("ecsTaskId={}", taskId);
    return taskId;
}
But I always get this stack trace:
Could not get the taskId from ECS. exception=org.springframework.web.client.HttpClientErrorException: 403 Forbidden
    at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:118)
    at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:103)
    at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63)
    at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:732)
    at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:690)
    at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:646)
    at org.springframework.web.client.RestTemplate.getForEntity(RestTemplate.java:325)
If you're trying to get the task ID in Fargate for ECS, you can make use of the task metadata endpoints.
Assuming you're using Fargate platform version 1.4.0, you can get this via an HTTP request to ${ECS_CONTAINER_METADATA_URI_V4}/task.
An example response from this endpoint is below:
{
    "Cluster": "arn:aws:ecs:us-west-2:123456789012:cluster/default",
    "TaskARN": "arn:aws:ecs:us-west-2:123456789012:task/default/febee046097849aba589d4435207c04a",
    "Family": "query-metadata",
    "Revision": "7",
    "DesiredStatus": "RUNNING",
    "KnownStatus": "RUNNING",
    "Limits": {
        "CPU": 0.25,
        "Memory": 512
    },
    "PullStartedAt": "2020-03-26T22:25:40.420726088Z",
    "PullStoppedAt": "2020-03-26T22:26:22.235177616Z",
    "AvailabilityZone": "us-west-2c",
    "Containers": [
        {
            "DockerId": "febee046097849aba589d4435207c04aquery-metadata",
            "Name": "query-metadata",
            "DockerName": "query-metadata",
            "Image": "mreferre/eksutils",
            "ImageID": "sha256:1b146e73f801617610dcb00441c6423e7c85a7583dd4a65ed1be03cb0e123311",
            "Labels": {
                "com.amazonaws.ecs.cluster": "arn:aws:ecs:us-west-2:123456789012:cluster/default",
                "com.amazonaws.ecs.container-name": "query-metadata",
                "com.amazonaws.ecs.task-arn": "arn:aws:ecs:us-west-2:123456789012:task/default/febee046097849aba589d4435207c04a",
                "com.amazonaws.ecs.task-definition-family": "query-metadata",
                "com.amazonaws.ecs.task-definition-version": "7"
            },
            "DesiredStatus": "RUNNING",
            "KnownStatus": "RUNNING",
            "Limits": {
                "CPU": 2
            },
            "CreatedAt": "2020-03-26T22:26:24.534553758Z",
            "StartedAt": "2020-03-26T22:26:24.534553758Z",
            "Type": "NORMAL",
            "Networks": [
                {
                    "NetworkMode": "awsvpc",
                    "IPv4Addresses": [
                        "10.0.0.108"
                    ],
                    "AttachmentIndex": 0,
                    "IPv4SubnetCIDRBlock": "10.0.0.0/24",
                    "MACAddress": "0a:62:17:7a:36:68",
                    "DomainNameServers": [
                        "10.0.0.2"
                    ],
                    "DomainNameSearchList": [
                        "us-west-2.compute.internal"
                    ],
                    "PrivateDNSName": "ip-10-0-0-108.us-west-2.compute.internal",
                    "SubnetGatewayIpv4Address": ""
                }
            ]
        }
    ]
}
As you can see, you would need to parse the TaskARN to get the TaskID (it is the last part of the ARN if you split by "/").
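For example, a minimal sketch in Python of that parsing, assuming the code runs inside a Fargate task on platform version 1.4.0 or later:

import json
import os
import urllib.request

# Read the v4 metadata endpoint injected into every Fargate 1.4.0+ task.
metadata_uri = os.environ["ECS_CONTAINER_METADATA_URI_V4"]
with urllib.request.urlopen(metadata_uri + "/task") as resp:
    task_metadata = json.load(resp)

# The task ID is the last segment of the TaskARN.
task_id = task_metadata["TaskARN"].split("/")[-1]
print(task_id)  # e.g. febee046097849aba589d4435207c04a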
Amazon does specify the following in the documentation, which should be noted:
For tasks using the Fargate launch type and platform versions prior to 1.4.0, the task metadata version 3 and 2 endpoint are supported. For more information, see Task Metadata Endpoint version 3 or Task Metadata Endpoint version 2.
The link in the accepted answer is for the EC2 launch type. The direct doc link for Fargate is: https://docs.aws.amazon.com/AmazonECS/latest/userguide/task-metadata-endpoint-v4-fargate.html. The JSON content seems to be pretty much the same, though.

Failed to create API Gateway

I'm trying to create this API Gateway (gist) with an Authorizer and an ANY method.
I run into this error:
The following resource(s) failed to create: [BaseLambdaExecutionPolicy, ApiGatewayDeployment]
I've checked the parameters passed into this template from my other stacks and they're correct. I've also checked this template and it's valid.
My template is modified from this template with "Runtime": "nodejs8.10".
This is the same stack (gist) that is created successfully using Swagger 2. I just want to replace Swagger 2 with AWS::ApiGateway::Method.
Update 6 Jun 2019:
I tried to create the whole nested stack using the working version of the API Gateway stack, then create another API Gateway with the template that doesn't work, using the parameters I get from the nested stack. Then I get this:
The REST API doesn't contain any methods (Service: AmazonApiGateway; Status Code: 400; Error Code: BadRequestException; Request ID: ID)
But I did specify the method in my template, following the AWS docs:
"GatewayMethod": {
"Type" : "AWS::ApiGateway::Method",
"DependsOn": ["LambdaRole", "ApiGateway"],
"Properties" : {
"ApiKeyRequired" : false,
"AuthorizationType" : "Cognito",
"HttpMethod" : "ANY",
"Integration" : {
"IntegrationHttpMethod" : "ANY",
"Type" : "AWS",
"Uri" : {
"Fn::Sub": "arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${LambdaFunction.Arn}/invocations"
}
},
"MethodResponses" : [{
"ResponseModels": {
"application/json": "Empty"
},
"StatusCode": 200
}],
"RequestModels" : {"application/json": "Empty"},
"ResourceId" : {
"Fn::GetAtt": ["ApiGateway", "RootResourceId"]
},
"RestApiId" : {
"Ref": "ApiGateway"
}
}
},
Thanks to @John's suggestion, I tried to create the nested stack with the version that worked and pass in the parameters for the version that doesn't work.
The reason for that error is:
CloudFormation might try to create Deployment before it creates Method
from balaji's answer here.
So this is what I did:
"methodANY": {
"Type": "AWS::ApiGateway::Method",
"Properties": {
"AuthorizationType": "COGNITO_USER_POOLS",
...},
"ApiGatewayDeployment": {
"Type": "AWS::ApiGateway::Deployment",
"DependsOn": "methodANY",
...
I also found this article on cloudonaut.io by Michael Wittig helpful.

HIVE_INVALID_METADATA in Amazon Athena

How can I work around the following error in Amazon Athena?
HIVE_INVALID_METADATA: com.facebook.presto.hive.DataCatalogException: Error: : expected at the position 8 of 'struct<x-amz-request-id:string,action:string,label:string,category:string,when:string>' but '-' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null)
When looking at position 8 in the table that AWS Glue generated and Athena queries, I can see that it has a column named attributes with a corresponding struct data type:
struct <
    x-amz-request-id:string,
    action:string,
    label:string,
    category:string,
    when:string
>
My guess is that the error occurs because the attributes field is not always populated (cf. the _session.start event below) and does not always contain all fields (e.g. the DocumentHandling event below does not contain the attributes.x-amz-request-id field). What is the appropriate way to address this problem? Can I make a column optional in Glue? Can (should?) Glue fill the struct with empty strings? Other options?
Background: I have the following backend structure:
Amazon Pinpoint Analytics collects metrics from my application.
The Pinpoint event stream has been configured to forward the events to an Amazon Kinesis Firehose delivery stream.
Kinesis Firehose writes the data to S3.
AWS Glue crawls S3.
Athena is used to write queries based on the databases and tables generated by AWS Glue.
I can see Pinpoint events successfully being added to JSON files in S3, e.g.
First event in a file:
{
    "event_type": "_session.start",
    "event_timestamp": 1524835188519,
    "arrival_timestamp": 1524835192884,
    "event_version": "3.1",
    "application": {
        "app_id": "[an app id]",
        "cognito_identity_pool_id": "[a pool id]",
        "sdk": {
            "name": "Mozilla",
            "version": "5.0"
        }
    },
    "client": {
        "client_id": "[a client id]",
        "cognito_id": "[a cognito id]"
    },
    "device": {
        "locale": {
            "code": "en_GB",
            "country": "GB",
            "language": "en"
        },
        "make": "generic web browser",
        "model": "Unknown",
        "platform": {
            "name": "macos",
            "version": "10.12.6"
        }
    },
    "session": {
        "session_id": "[a session id]",
        "start_timestamp": 1524835188519
    },
    "attributes": {},
    "client_context": {
        "custom": {
            "legacy_identifier": "50ebf77917c74f9590c0c0abbe5522d2"
        }
    },
    "awsAccountId": "672057540201"
}
Second event in the same file:
{
    "event_type": "DocumentHandling",
    "event_timestamp": 1524835194932,
    "arrival_timestamp": 1524835200692,
    "event_version": "3.1",
    "application": {
        "app_id": "[an app id]",
        "cognito_identity_pool_id": "[a pool id]",
        "sdk": {
            "name": "Mozilla",
            "version": "5.0"
        }
    },
    "client": {
        "client_id": "[a client id]",
        "cognito_id": "[a cognito id]"
    },
    "device": {
        "locale": {
            "code": "en_GB",
            "country": "GB",
            "language": "en"
        },
        "make": "generic web browser",
        "model": "Unknown",
        "platform": {
            "name": "macos",
            "version": "10.12.6"
        }
    },
    "session": {},
    "attributes": {
        "action": "Button-click",
        "label": "FavoriteStar",
        "category": "Navigation"
    },
    "metrics": {
        "details": 40.0
    },
    "client_context": {
        "custom": {
            "legacy_identifier": "50ebf77917c74f9590c0c0abbe5522d2"
        }
    },
    "awsAccountId": "[aws account id]"
}
Next, AWS Glue has generated a database and a table. Specifically, I see that there is a column named attributes with the following type:
struct <
    x-amz-request-id:string,
    action:string,
    label:string,
    category:string,
    when:string
>
However, when I attempt to Preview table from Athena, i.e. execute the query
SELECT * FROM "pinpoint-test"."pinpoint_testfirehose" limit 10;
I get the error message described earlier.
As a side note, I have tried to remove the attributes field (by editing the database table from Glue), but that results in an internal error when executing the SQL query from Athena.
This is a known limitation: Athena table and database names allow only the underscore special character.
Athena table and database names cannot contain special characters, other than underscore (_).
Source: http://docs.aws.amazon.com/athena/latest/ug/known-limitations.html
Use backticks (`) when the table name has a - in it.
Example:
SELECT * FROM `pinpoint-test`.`pinpoint_testfirehose` limit 10;
Make sure you select "default" database on the left pane.
I believe the problem is your struct element name, x-amz-request-id: specifically, the "-" in the name.
I'm currently dealing with a similar issue, since the elements in my struct have "::" in the name.
Sample data:
some_key: {
    "system::date": date,
    "system::nps_rating": 0
}
Glue-derived struct schema (it tried to escape them with \):
struct <
    system\:\:date:String,
    system\:\:nps_rating:Int
>
But that still gives me an error in Athena.
I don't have a good solution for this other than changing the struct to STRING and trying to process the data that way.
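For what it's worth, once the column is re-typed as a string, json_extract_scalar should be able to pull out the hyphenated fields. A hedged sketch via boto3, assuming the attributes column was changed from struct to string in Glue; the output bucket is hypothetical, and the bracketed JSON path is used to cope with the - characters:

import boto3

athena = boto3.client("athena")

# Assumes attributes is now a string column holding the raw JSON.
query = """
SELECT json_extract_scalar(attributes, '$["action"]') AS action,
       json_extract_scalar(attributes, '$["x-amz-request-id"]') AS request_id
FROM "pinpoint-test"."pinpoint_testfirehose"
LIMIT 10
"""

athena.start_query_execution(
    QueryString=query,
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"}  # hypothetical
)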

AWS data pipeline job failing but there is no error message or error code

I tried running a data pipeline job, but the EmrActivity step reached a FAILED status with no error code or error message:
Name:
#EMR cluster to perform the work_2013-09-03T16:15:00
View instance fields
Description:
Latest attempt count: 3, Tries left: 0
Select attempt for this instance:
Status:
FAILED
Error code:
Error message:
Any idea why? Where can I find out more info about the underlying problem?
The job is simple: fire up an EMR cluster and run a Pig script (where xxx is my bucket name):
{
    "objects": [
        {
            "id": "Default",
            "failureAndRerunMode": "cascade"
        },
        {
            "id": "MyScheduleID",
            "type": "Schedule",
            "period": "1 hour",
            "startDateTime": "2013-09-03T19:00:00",
            "endDateTime": "2013-09-03T20:00:00"
        },
        {
            "id": "MyEmrCluster",
            "name": "EMR cluster to perform the work",
            "type": "EmrCluster",
            "hadoopVersion": "0.20",
            "masterInstanceType": "m1.small",
            "coreInstanceType": "m1.medium",
            "coreInstanceCount": "2",
            "terminateAfter": "1 Hours",
            "schedule": {
                "ref": "MyScheduleID"
            },
            "logUri": "s3://xxx/amazonlogs",
            "emrLogUri": "s3://xxx/amazonlogs"
        },
        {
            "id": "MyEmrActivity",
            "name": "Work to perform on my data",
            "type": "EmrActivity",
            "runsOn": {"ref": "MyEmrCluster"},
            "schedule": {
                "ref": "MyScheduleID"
            },
            "step": "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar,s3://us-east-1.elasticmapreduce/libs/pig/pig-script,--base-path,s3://us-east-1.elasticmapreduce/libs/pig/,--install-pig,--pig-versions,latest",
            "step": "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar,s3://us-east-1.elasticmapreduce/libs/pig/pig-script,--base-path,s3://us-east-1.elasticmapreduce/libs/pig/,--pig-versions,latest,--run-pig-script,--args,-f,s3://xxx/carls_minimal_script.pig"
        }
    ]
}
Does this config look OK?
I don't see anything in s3://xxx/amazonlogs
Here are a couple of things you could try:
Go to https://console.aws.amazon.com/elasticmapreduce/home, find the corresponding cluster that got started (based on the timestamp), and click "Debug"; you should find logs for each step.
Or start an EMR cluster from the AWS Console, log into the master node, and run the Pig script to check if it's working. A hedged sketch of checking step status programmatically follows.
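If you'd rather not click through the console, a sketch of pulling step states and failure details with boto3; the region and the cluster-selection logic here are assumptions:

import boto3

emr = boto3.client("emr", region_name="us-east-1")  # region is an assumption

# Pick the cluster whose creation time matches the failed run,
# then inspect each step's state and failure details.
clusters = emr.list_clusters()["Clusters"]
cluster_id = clusters[0]["Id"]

for step in emr.list_steps(ClusterId=cluster_id)["Steps"]:
    status = step["Status"]
    print(step["Name"], status["State"], status.get("FailureDetails", {}))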