I have deployed Elasticsearch and Kibana 7.10.1, and I am streaming CloudWatch metrics data (raw JSON) to Elasticsearch.
The raw metric data format looks like:
{
"metric_stream_name" : "metric-stream-elk",
"account_id" : "264100014405",
"region" : "ap-southeast-2",
"namespace" : "AWS/DynamoDB",
"metric_name" : "ReturnedRecordsCount",
"dimensions" : {
"Operation" : "GetRecords",
"StreamLabel" : "2021-06-18T01:12:31.851",
"TableName" : "dev-dms-iac-events"
},
"timestamp" : 1624924620000,
"value" : {
"count" : 121,
"sum" : 0,
"max" : 0,
"min" : 0
},
"unit" : "Count"
}
I can see that this raw data is saved in Elasticsearch under a custom index name, aws-metrics-YYYY-MM-DD. Now how can I let Kibana read metrics from this index?
I don't want to use Metricbeat because it queries metrics from AWS, whereas my event flow streams AWS metrics into Elasticsearch. How can I achieve that?
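As a sanity check (the endpoint below is just a placeholder for my local setup), I can confirm documents are landing in the index with something like:
curl -s 'http://localhost:9200/aws-metrics-*/_search?size=1&pretty'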
I am trying to send a binary file and string parameters to AWS API Gateway.
This is the mapping template on the API Gateway POST method:
{
"imageFile" : $input.params('imageFile'),
"purdueUsername" : $input.params('purdueUsername'),
"description" : $input.params('description'),
"price" : $input.params('price'),
"longitude" : $input.params('longitude'),
"latitude" : $input.params('latitude'),
"category" : $input.params('category'),
}
Making a POST request with this template does not work. When I try this instead:
{
"imageFile" : "$input.params('imageFile')",
"purdueUsername" : "$input.params('purdueUsername')",
"description" : "$input.params('description')",
"price" : "$input.params('price')",
"longitude" : "$input.params('longitude')",
"latitude" : "$input.params('latitude')",
"category" : "$input.params('category')",
}
I am getting empty parameters. The API is not receiving the parameters I am sending in the POST request.
How should I change the mapping template?
Note: when the mapping template contains only imageFile and I send just the binary file without the extra parameters, it works completely fine:
{
"imageFile" : "$input.body"
}
However, I want to be able to send other parameters besides the binary file.
This is how I solved the problem: I send the binary file in the body of the POST request and the other parameters as headers.
This is the mapping template I put on AWS API Gateway:
{
"purdueUsername" : "$input.params('purdueUsername')",
"description" : "$input.params('description')",
"price" : "$input.params('price')",
"longitude" : "$input.params('longitude')",
"latitude" : "$input.params('latitude')",
"category" : "$input.params('category')",
"isbnNumber" : "$input.params('isbnNumber')",
"imageFile" : "$input.body"
}
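For reference, here is a rough sketch of the client call (the invoke URL, header values, and file name are placeholders, not the real ones). Since $input.params() looks up path, query string, and header parameters, the header names must match the names used in the template:
curl -X POST 'https://<api-id>.execute-api.<region>.amazonaws.com/<stage>/upload' \
  -H 'Content-Type: application/octet-stream' \
  -H 'purdueUsername: jsmith' \
  -H 'description: calculus textbook' \
  -H 'price: 25' \
  -H 'isbnNumber: 9780134438986' \
  --data-binary '@book.jpg'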
Is it possible to capture the startTime and endTime of execution of Lambda functions, along with the parameters that were passed to them?
I couldn't find any state-change event configurations that could be set up to send events when a Lambda function starts or terminates.
A crappy alternative is to record the parameters and start time in a database when the Lambda is invoked, and have the Lambda update the end time as the final step before completion. This appears prone to failure scenarios, like the function erroring out before updating the DB.
Are there other alternatives for capturing this information?
AWS X-Ray may be a good solution here. It is easy to integrate and use, and you can enable it from the AWS console:
Go to your Lambda function's Configuration tab.
Scroll down and, in the AWS X-Ray box, choose Active tracing.
Without any configuration in the code, it will record the start_time and end_time of the function along with additional metadata. You can also integrate the X-Ray SDK as a library in your Lambda function and send additional subsegments, such as the request parameters. Please check the documentation for details.
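For example, here is a minimal sketch in Python of recording the incoming parameters on a custom subsegment (the handler and key names are made up; it assumes the aws-xray-sdk package is bundled with the function and active tracing is enabled):
from aws_xray_sdk.core import xray_recorder

def lambda_handler(event, context):
    # X-Ray already records the function's start_time and end_time automatically;
    # this attaches the invocation parameters to a custom subsegment.
    subsegment = xray_recorder.begin_subsegment('request-parameters')
    subsegment.put_metadata('event', event)  # full payload, visible in the trace details
    subsegment.put_annotation('source', str(event.get('source', 'unknown')))  # indexed and searchable
    xray_recorder.end_subsegment()

    # ... rest of the handler ...
    return {'status': 'ok'}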
Here is a sample payload:
{
"trace_id" : "1-5759e988-bd862e3fe1be46a994272793",
"id" : "defdfd9912dc5a56",
"start_time" : 1461096053.37518,
"end_time" : 1461096053.4042,
"name" : "www.example.com",
"http" : {
"request" : {
"url" : "https://www.example.com/health",
"method" : "GET",
"user_agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/601.7.7",
"client_ip" : "11.0.3.111"
},
"response" : {
"status" : 200,
"content_length" : 86
}
},
"subsegments" : [
{
"id" : "53995c3f42cd8ad8",
"name" : "api.example.com",
"start_time" : 1461096053.37769,
"end_time" : 1461096053.40379,
"namespace" : "remote",
"http" : {
"request" : {
"url" : "https://api.example.com/health",
"method" : "POST",
"traced" : true
},
"response" : {
"status" : 200,
"content_length" : 861
}
}
}
]
}
I am having issues finding good sources for / figuring out how to correctly add server-side validation to my AppSync GraphQL mutations.
In essence, I used the AWS console to define my AppSync schema, which had DynamoDB tables created for me, plus some basic resolvers set up for the data.
Now I need to achieve the following:
I have a player who has inventory and gold
Player calls purchaseItem mutation with item_id
Once this mutation is called, I need to perform some checks in the resolver: check whether item_id exists in the associated 'Items' DynamoDB table, check whether the player has enough gold (again, in the associated 'Players' table), and if so, write to the Players table by adding the item to their inventory and storing the new, reduced gold amount.
I believe the most efficient way to achieve this, with the least cost and latency, is to use the 'Apache Velocity' templating language for AppSync?
It would be great to see an example of this showing how to query/write to DynamoDB, handle errors, and resolve the mutation correctly.
For writing to DynamoDB with VTL, use the following tutorial; you can start with the PutItem template. My request template looks like this:
{
"version" : "2017-02-28",
"operation" : "PutItem",
"key" : {
"noteId" : { "S" : "${context.arguments.noteId}" },
"userId" : { "S" : "${context.identity.sub}" }
},
"attributeValues" : {
"title" : { "S" : "${context.arguments.title}" },
"content": { "S" : "${context.arguments.content}" }
}
}
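The matching response mapping template is typically just a pass-through of the DynamoDB result:
$util.toJson($context.result)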
For a query:
{ "version" : "2017-02-28",
"operation" : "Query",
"query" : {
## Provide a query expression. **
"expression": "userId = :userId",
"expressionValues" : {
":userId" : {
"S" : "${context.identity.sub}"
}
}
},
## Add 'limit' and 'nextToken' arguments to this field in your schema to implement pagination. **
"limit": #if(${context.arguments.limit}) ${context.arguments.limit} #else 20 #end,
"nextToken": #if(${context.arguments.nextToken}) "${context.arguments.nextToken}" #else null #end
}
This is based on the Paginated Query template.
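The matching response mapping template for the paginated query typically returns the items together with the pagination token (assuming the field's return type exposes items and nextToken):
{
    "items": $util.toJson($context.result.items),
    "nextToken": $util.toJson($context.result.nextToken)
}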
What you want to look at is Pipeline Resolvers:
https://docs.aws.amazon.com/appsync/latest/devguide/pipeline-resolvers.html
Yes, this requires VTL (the Velocity Template Language).
Pipeline resolvers allow you to perform reads, writes, validation, and anything else you'd like using VTL. What you basically do is chain the output of each template into the input of the next and perform the required processing.
Here's a Medium post showing you how to do it:
https://medium.com/@dabit3/intro-to-aws-appsync-pipeline-functions-3df87ceddac1
In other words, what you can do is: have one template that queries the database, then pipeline the result into another template that validates it and either performs the insert if the validation succeeds or fails the mutation if it doesn't, as sketched below.
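As a rough sketch only (the gold, itemPrice, and playerId names are invented for illustration, not taken from your schema), the request mapping template of the second pipeline function could look like this:
## $ctx.prev.result holds the player fetched by the previous function,
## and $ctx.stash carries values (e.g. the item's price) between functions.
#if($ctx.prev.result.gold < $ctx.stash.itemPrice)
    $util.error("Player does not have enough gold", "ValidationError")
#end
{
    "version" : "2017-02-28",
    "operation" : "UpdateItem",
    "key" : {
        "playerId" : $util.dynamodb.toDynamoDBJson($ctx.args.playerId)
    },
    "update" : {
        "expression" : "SET gold = gold - :price",
        "expressionValues" : {
            ":price" : $util.dynamodb.toDynamoDBJson($ctx.stash.itemPrice)
        }
    }
}
If the check fails, $util.error aborts the pipeline and the error is returned on the mutation instead of performing the write.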
Basically, I am trying to transfer data from Postgres to Redshift using AWS Data Pipeline, and the process I am following is:
Write a pipeline (CopyActivity) that moves data from Postgres to S3
Write a pipeline (RedshiftCopyActivity) that moves data from S3 to Redshift
Both pipelines work perfectly in my case, but the problem is that the data gets duplicated in the Redshift database.
For example, the Postgres database has a table called company.
After a successful run of the S3-to-Redshift (RedshiftCopyActivity) pipeline, the data is copied, but each row ends up duplicated.
Below is part of the definition from the RedshiftCopyActivity (S3 to Redshift) pipeline:
pipeline_definition = [{
"id":"redshift_database_instance_output",
"name":"redshift_database_instance_output",
"fields":[
{
"key" : "database",
"refValue" : "RedshiftDatabaseId_S34X5",
},
{
"key" : "primaryKeys",
"stringValue" : "id",
},
{
"key" : "type",
"stringValue" : "RedshiftDataNode",
},
{
"key" : "tableName",
"stringValue" : "company",
},
{
"key" : "schedule",
"refValue" : "DefaultScheduleTime",
},
{
"key" : "schemaName",
"stringValue" : RedShiftSchemaName,
},
]
},
{
"id":"CopyS3ToRedshift",
"name":"CopyS3ToRedshift",
"fields":[
{
"key" : "output",
"refValue" : "redshift_database_instance_output",
},
{
"key" : "input",
"refValue" : "s3_input_data",
},
{
"key" : "runsOn",
"refValue" : "ResourceId_z9RNH",
},
{
"key" : "type",
"stringValue" : "RedshiftCopyActivity",
},
{
"key" : "insertMode",
"stringValue" : "KEEP_EXISTING",
},
{
"key" : "schedule",
"refValue" : "DefaultScheduleTime",
},
]
},]
So, according to the docs for RedshiftCopyActivity, we need to use insertMode to describe how the data should behave (inserted/updated/deleted) when copied into the database table, as below:
insertMode : Determines what AWS Data Pipeline does with pre-existing data in the target table that overlaps with rows in the data to be loaded. Valid values are KEEP_EXISTING, OVERWRITE_EXISTING, TRUNCATE and APPEND. KEEP_EXISTING adds new rows to the table, while leaving any existing rows unmodified. KEEP_EXISTING and OVERWRITE_EXISTING use the primary key, sort, and distribution keys to identify which incoming rows to match with existing rows, according to the information provided in Updating and inserting new data in the Amazon Redshift Database Developer Guide. TRUNCATE deletes all the data in the destination table before writing the new data. APPEND will add all records to the end of the Redshift table. APPEND does not require a primary, distribution key, or sort key so items that may be potential duplicates may be appended.
So my requirements are:
When copying from Postgres (in fact, the data is in S3 now) to the Redshift database, if a row already exists, then just update it
If there are new records in S3, then create new records in Redshift
But for me, even though I have used KEEP_EXISTING or OVERWRITE_EXISTING, the data just keeps repeating over and over again, as described above.
So, finally, how do I achieve my requirements? Are there any tweaks or settings I should add to my configuration?
Edit
Table (company) definition from Redshift:
If you want to avoid duplication, you must define a primary key in Redshift and also set the insertMode to "OVERWRITE_EXISTING".
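Concretely (as a sketch), that means the company table DDL in Redshift should declare an explicit PRIMARY KEY (id), the RedshiftDataNode keeps the primaryKeys field shown above, and the activity's insertMode field changes to:
{
    "key" : "insertMode",
    "stringValue" : "OVERWRITE_EXISTING",
},
Redshift does not enforce primary key constraints, but the copy activity uses the declared key to decide which incoming rows match existing ones.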
I'm designing a template which includes a Redis service, and I would like to enable the Multi-AZ feature in Redis so that, upon primary cluster failure, a read replica can be promoted to primary. I looked in the CloudFormation documentation but couldn't find this feature, i.e. Multi-AZ; it is available for the RDS service but not for Redis. How can I include this feature for Redis so that AWS takes care of the automatic failover?
Source:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-elasticache-cache-cluster.html
The properties available for the ElastiCache cache cluster are listed below:
"AutoMinorVersionUpgrade" : Boolean,
"AZMode" : String,
"CacheNodeType" : String,
"CacheParameterGroupName" : String,
"CacheSecurityGroupNames" : [ String, ... ],
"CacheSubnetGroupName" : String,
"ClusterName" : String,
"Engine" : String,
"EngineVersion" : String,
"NotificationTopicArn" : String,
"Port" : Integer,
"PreferredAvailabilityZone" : String,
"PreferredAvailabilityZones" : [String, ... ],
"PreferredMaintenanceWindow" : String,
"SnapshotArns" : [String, ... ],
"SnapshotName" : String,
"SnapshotRetentionLimit" : Integer,
"SnapshotWindow" : String,
"Tags" : [Resource Tag, ...],
"VpcSecurityGroupIds" : [String, ...]
These are the two ways you can set Redis to use Multi-AZ programmatically.
Using the CLI:
aws elasticache modify-replication-group \
--replication-group-id myReplGroup \
--automatic-failover-enabled
Using the ElastiCache API:
https://elasticache.us-west-2.amazonaws.com/
?Action=ModifyReplicationGroup
&AutoFailover=true
&ReplicationGroupId=myReplGroup
&Version=2015-02-02
&SignatureVersion=4
&SignatureMethod=HmacSHA256
&Timestamp=20140401T192317Z
&X-Amz-Credential=<credential>
These are some of the notes you should read when selecting Multi-AZ for Redis:
http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/AutoFailover.html#AutoFailover.Notes
For CloudFormation, below are the properties:
{
"Type" : "AWS::ElastiCache::ReplicationGroup",
"Properties" : {
"AutomaticFailoverEnabled" : Boolean,
"AutoMinorVersionUpgrade" : Boolean,
"CacheNodeType" : String,
"CacheParameterGroupName" : String,
"CacheSecurityGroupNames" : [ String, ... ],
"CacheSubnetGroupName" : String,
"Engine" : String,
"EngineVersion" : String,
"NotificationTopicArn" : String,
"NumCacheClusters" : Integer,
"Port" : Integer,
"PreferredCacheClusterAZs" : [ String, ... ],
"PreferredMaintenanceWindow" : String,
"ReplicationGroupDescription" : String,
"SecurityGroupIds" : [ String, ... ],
"SnapshotArns" : [ String, ... ],
"SnapshotRetentionLimit" : Integer,
"SnapshotWindow" : String
}
}
You have to tweak this property for Multi-AZ:
AutomaticFailoverEnabled
Indicates whether Multi-AZ is enabled. When Multi-AZ is enabled, a read-only replica is automatically promoted to a read-write primary cluster if the existing primary cluster fails. If you specify true, you must specify a value greater than 1 for the NumCacheClusters property. By default, AWS CloudFormation sets the value to true.
For more information about Multi-AZ, see Multi-AZ with Redis Replication Groups in the Amazon ElastiCache User Guide.
Note
You cannot enable automatic failover for Redis versions earlier than 2.8.6 or for T1 and T2 cache node types.
Required: No
Type: Boolean
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-elasticache-replicationgroup.html
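Putting it together, a minimal illustrative snippet (the resource name, node type, and referenced subnet group and security group are placeholders for your own values):
"RedisReplicationGroup" : {
    "Type" : "AWS::ElastiCache::ReplicationGroup",
    "Properties" : {
        "ReplicationGroupDescription" : "Redis with Multi-AZ automatic failover",
        "Engine" : "redis",
        "CacheNodeType" : "cache.m3.medium",
        "NumCacheClusters" : 2,
        "AutomaticFailoverEnabled" : true,
        "CacheSubnetGroupName" : { "Ref" : "RedisSubnetGroup" },
        "SecurityGroupIds" : [ { "Ref" : "RedisSecurityGroupId" } ]
    }
}
Keep in mind the note above: automatic failover is not supported on T1/T2 cache node types or Redis versions earlier than 2.8.6.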