Automate turning a Redis cluster on and off - amazon-web-services

I'm trying to automate the process of turning a Redis cluster on and off in AWS. I saw the following link for reference (https://forums.aws.amazon.com/thread.jspa?threadID=149772). Is there a way to do it via CloudWatch?
I am very new to the AWS platform.

1. Check the documentation regarding scale in/out: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/redis-cluster-resharding-online.html It also has the commands to reshard a cluster manually.
2. Check the CloudWatch metrics for the Redis cluster: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.HostLevel.html and https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.Redis.html Choose the metrics that should trigger the scaling.
3. You can trigger an AWS Lambda function on a schedule, or on an event for a metric: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html
4. From the Lambda you can call the AWS CLI to reshard the cluster as described in 1. Example: https://alestic.com/2016/11/aws-lambda-awscli/
5. If you need to turn off the cluster completely, instead of the resharding commands just use https://docs.aws.amazon.com/cli/latest/reference/elasticache/delete-cache-cluster.html
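As a rough sketch of steps 3-5, a scheduled Lambda can use boto3 directly instead of shelling out to the AWS CLI. This assumes a single-node, non-clustered deployment named my-redis-cluster (a placeholder id) and that "turning off" means deleting the cluster, so cached data is lost:

```python
import boto3

elasticache = boto3.client("elasticache")

CLUSTER_ID = "my-redis-cluster"  # placeholder cluster id


def handler(event, context):
    # The CloudWatch Events rule passes {"action": "stop"} or {"action": "start"}
    # (a made-up payload convention, one scheduled rule per action).
    if event.get("action") == "stop":
        # ElastiCache has no native "stop"; deleting the cluster is the
        # closest equivalent, and cached data is discarded.
        elasticache.delete_cache_cluster(CacheClusterId=CLUSTER_ID)
    else:
        # Recreate the cluster with the same configuration in the morning.
        elasticache.create_cache_cluster(
            CacheClusterId=CLUSTER_ID,
            Engine="redis",
            CacheNodeType="cache.t3.micro",  # placeholder node type
            NumCacheNodes=1,
        )
```

Note that a cluster-mode-enabled Redis Cluster is managed as a replication group, in which case the create_replication_group / delete_replication_group calls apply instead.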

Related

Cloud Data Fusion triggered pipeline - reuse the already provisioned Dataproc clusters

Is there a way to avoid the provisioning step for subsequently triggered outbound pipelines? It looks like when a pipeline triggers an outbound pipeline, it does the provisioning all over again. Can we simply execute the triggered pipeline on the provisioned cluster of the first?
Thanks.
Cloud Data Fusion has support for running a pipeline against an existing Dataproc cluster; see more details in this doc.

AWS EC2 standalone Windows instance, can I include a start/stop schedule in CloudFormation?

I am already doing scale-in/scale-out for an Auto Scaling stack through scheduled actions.
I need to auto start and stop a standalone EC2 instance (not part of an ASG); just wondering if CloudFormation supports that?
You can use the AWS Instance Scheduler to automate the starting and stopping of EC2 machines. Here you can find the documentation on how to set this up manually, and there is also a nice tutorial to follow along if the documentation is a bit too heavy.
The AWS Instance Scheduler deployment can also be set up using CloudFormation. AWS provides these examples as a starting point.
There is no capability in AWS (or CloudFormation) to schedule a start and stop of an instance.
However, you can code up a simple solution using Amazon CloudWatch Events to trigger an AWS Lambda function on a schedule. You could even use tags to identify which instances to start/stop.
For some examples, see: Simple EC2 Stopinator in Lambda - DEV Community
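A minimal sketch of such a Lambda, assuming instances are tagged AutoStop=true (a made-up tag convention):

```python
import boto3

ec2 = boto3.client("ec2")


def handler(event, context):
    # Find running instances tagged for automatic stopping.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:AutoStop", "Values": ["true"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]

    instance_ids = [
        i["InstanceId"] for r in reservations for i in r["Instances"]
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
```

A mirror-image function (or a branch on the event payload) calling start_instances handles the morning schedule; one CloudWatch Events rule per schedule invokes the Lambda.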

How to run PySpark on AWS EMR with AWS Lambda

How can I make my PySpark code run on AWS EMR from AWS Lambda? Do I have to use AWS Lambda to create an auto-terminating EMR cluster to run my S3-stored code once?
You need a transient cluster for this case, which will auto-terminate once your job is completed or the timeout is reached, whichever occurs first.
You can access this link on how to initialise one.
What are the processes available to create an EMR cluster?
- Using boto3 / AWS CLI / Java SDK
- Using CloudFormation
- Using Data Pipeline
Do I have to use AWS Lambda to create an auto-terminating EMR cluster to run my S3-stored code once?
No. It isn't mandatory to use Lambda to create an auto-terminating cluster.
You just need to specify the --auto-terminate flag while creating the cluster using boto3 / CLI / Java SDK, but in that case you need to submit the job along with the cluster config. Ref
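For illustration, a boto3 sketch of a transient cluster that runs one PySpark step and terminates itself; the release label, instance types, roles, and S3 paths are placeholder values. Setting KeepJobFlowAliveWhenNoSteps to False is boto3's counterpart of the CLI's --auto-terminate flag:

```python
import boto3

emr = boto3.client("emr")

response = emr.run_job_flow(
    Name="transient-pyspark-cluster",  # placeholder name
    ReleaseLabel="emr-6.3.0",
    Applications=[{"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"Name": "Master", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "Core", "InstanceRole": "CORE",
             "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        # False = shut the cluster down once there are no more steps,
        # i.e. the boto3 equivalent of --auto-terminate.
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    Steps=[{
        "Name": "run-pyspark-job",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://my-bucket/my_job.py"],  # placeholder path
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])
```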
Note:
It's not possible to create an auto-terminating cluster using CloudFormation. By design, CloudFormation assumes that the resources being created will be permanent to some extent. If you really had to do it this way, you could make an AWS API call to delete the CloudFormation stack upon finishing your EMR tasks.
How can I make my PySpark code run on AWS EMR from AWS Lambda?
You can design your Lambda to submit a Spark job. You can find an example here.
In my use case I have one parameterised Lambda which invokes CloudFormation to create the cluster, submits the job, and terminates the cluster.
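A minimal sketch of a Lambda that submits a Spark job to an already-running cluster; the event keys cluster_id and script_s3_path are hypothetical and should be adapted to your invocation payload:

```python
import boto3

emr = boto3.client("emr")


def handler(event, context):
    # Submit a spark-submit step to the cluster id passed in the event.
    emr.add_job_flow_steps(
        JobFlowId=event["cluster_id"],
        Steps=[{
            "Name": "pyspark-step",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", event["script_s3_path"]],
            },
        }],
    )
```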

How to show AWS CodeDeploy deployment on Grafana

Using Grafana's CloudWatch data source and a little InfluxDB magic, I can pull many metrics from my live environment, like CPU utilisation, memory utilisation, host count, thread count, etc.
These metrics would make more sense if I could spot the moments of live deployments on the graph. The ELB healthy host count metric kind of helps, but it shows auto-scaling activities rather than deployments.
I can't find any metrics for CodeDeploy in the AWS CloudWatch adapter. Does anybody have a way of doing this?
(My env: Spring Boot app in Docker containers deployed on AWS Fargate using CodeDeploy)
You can push datapoints into a CloudWatch metric using the "put-metric-data" AWS CLI call [1]. You can call this command from AppSpec file hooks like BeforeInstall and AfterInstall [2]. Make sure the EC2 instance role has the requisite permissions.
[1] https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/put-metric-data.html
[2] https://docs.aws.amazon.com/codedeploy/latest/userguide/reference-appspec-file-example.html#appspec-file-example-server
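For illustration, an AfterInstall hook could run a small boto3 script instead of the raw CLI call; the Custom/Deployments namespace and DeploymentMarker metric name are made-up conventions:

```python
#!/usr/bin/env python3
# AfterInstall hook script: push a "deployment happened" datapoint
# into a custom CloudWatch metric.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_data(
    Namespace="Custom/Deployments",        # hypothetical namespace
    MetricData=[{
        "MetricName": "DeploymentMarker",  # hypothetical metric name
        "Value": 1.0,
        "Unit": "Count",
    }],
)
```

In Grafana the metric can then be drawn as an annotation or a second series, so deployment moments line up with the CPU/memory graphs.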

Automated Amazon CloudWatch Metric creation

Had a couple of questions on AWS:
Is there a way by which I can recreate/write AWS CloudWatch metrics to DynamoDB?
If an Amazon EC2 instance is deleted or if I change a VPC, I need to recreate all CloudWatch metrics manually every time. Is there a way by which I can automate CloudWatch metrics creation for every new VPC instance? Through Terraform, I can only create CloudWatch metric alarms, events and logs, but not CloudWatch metrics (e.g. EC2, RDS metrics, etc.).
#1 I could achieve via the AWS CLI and via a Python script, thereby writing the metrics to DynamoDB as well. #2 is still open.
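For reference on #1, a rough boto3 sketch; the table name cloudwatch-metrics, its key schema, and the instance id are placeholders:

```python
from datetime import datetime, timedelta
from decimal import Decimal

import boto3

cloudwatch = boto3.client("cloudwatch")
table = boto3.resource("dynamodb").Table("cloudwatch-metrics")  # placeholder table

# Pull the last hour of CPU utilisation for one instance.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Average"],
)

for point in stats["Datapoints"]:
    table.put_item(Item={
        "metric_name": "CPUUtilization",
        "timestamp": point["Timestamp"].isoformat(),
        # DynamoDB rejects Python floats; numbers go in as Decimal.
        "average": Decimal(str(point["Average"])),
    })
```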