Idle delete configuration for PySpark Cluster on GCP - google-cloud-platform

I am trying to define a create cluster function to create a cluster on Cloud Dataproc. While going through the reference material I came across an idle delete parameter (idleDeleteTtl) which would auto-delete the cluster if not in use for the amount of time defined. When I try to include it in cluster config it gives me a ValueError: Protocol message ClusterConfig has no "lifecycleConfig" field.
The create cluster function for reference:
def create_cluster(dataproc, project, zone, region, cluster_name, pip_packages):
    """Create the cluster."""
    print('Creating cluster...')
    zone_uri = \
        'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(
            project, zone)
    cluster_data = {
        'project_id': project,
        'cluster_name': cluster_name,
        'config': {
            'initialization_actions': [{
                'executable_file': 'gs://<some_path>/python/pip-install.sh'
            }],
            'gce_cluster_config': {
                'zone_uri': zone_uri,
                'metadata': {
                    'PIP_PACKAGES': pip_packages
                }
            },
            'master_config': {
                'num_instances': 1,
                'machine_type_uri': 'n1-standard-1'
            },
            'worker_config': {
                'num_instances': 2,
                'machine_type_uri': 'n1-standard-1'
            },
            'lifecycleConfig': {  #### PROBLEM AREA ####
                'idleDeleteTtl': '30m'
            }
        }
    }
    cluster = dataproc.create_cluster(project, region, cluster_data)
    cluster.add_done_callback(callback)
    global waiting_callback
    waiting_callback = True
I want similar functionality, ideally in the same function itself. I already have a manual delete function defined, but I want to add the ability to auto-delete clusters when they are not in use.

You are calling the v1 API while passing a parameter that is only part of the v1beta2 Dataproc API. Note that the zone_uri in your config is a Compute Engine URL; it does not select which Dataproc API version you call, that is determined by the Dataproc client/endpoint you invoke create_cluster on. Two changes are needed: call the v1beta2 Dataproc API, where the lifecycle config lives, and spell the fields in the snake_case form the Python client expects, i.e. lifecycle_config and idle_delete_ttl rather than lifecycleConfig and idleDeleteTtl.
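A minimal sketch of the relevant part, assuming the older google-cloud-dataproc client that exposes a dataproc_v1beta2 module and accepts client_options (only the lifecycle portion of the config is shown; merge it into your existing cluster_data):

from google.cloud import dataproc_v1beta2

def create_cluster_with_idle_delete(project, region, cluster_name):
    # Regional endpoint; assumes a client version that accepts client_options.
    client = dataproc_v1beta2.ClusterControllerClient(
        client_options={'api_endpoint': '{}-dataproc.googleapis.com:443'.format(region)}
    )
    cluster_data = {
        'project_id': project,
        'cluster_name': cluster_name,
        'config': {
            # snake_case field names, as the protobuf message expects.
            'lifecycle_config': {
                # Duration message: delete the cluster after 30 minutes of idleness.
                'idle_delete_ttl': {'seconds': 1800}
            }
        }
    }
    return client.create_cluster(project, region, cluster_data)

For comparison, the gcloud equivalent of this setting is the --max-idle flag on (beta) gcloud dataproc clusters create.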

Related

What is the integration service name for CloudWatch

I am trying to create an AWS API Gateway with an AWS service integration to CloudWatch using AWS CDK/CloudFormation, but I am getting errors like "AWS service of type cloudwatch not supported". When I use CloudWatch Logs it works, but not plain CloudWatch.
Code
new AwsIntegrationProps
{
    Region = copilotFoundationalInfrastructure.Region,
    Options = new IntegrationOptions {
        PassthroughBehavior = PassthroughBehavior.WHEN_NO_TEMPLATES,
        CredentialsRole = Role.FromRoleArn(this, "CloudWatchAccessRole", "arn:aws:iam::800524210815:role/APIGatewayCloudWatchRole"),
        RequestParameters = new Dictionary<string, string>()
        {
            { "integration.request.header.Content-Encoding", "'amz-1.0'" },
            { "integration.request.header.Content-Type", "'application/json'" },
            { "integration.request.header.X-Amz-Target", "'GraniteServiceVersion20100801.PutMetricData'" },
        },
    },
    IntegrationHttpMethod = "POST",
    Service = "cloudwatch", // this is working with s3 and logs
    Action = "PutMetricData"
}
What is the correct service name for CloudWatch to call PutMetricData?
new AwsIntegrationProps
{
    Region = copilotFoundationalInfrastructure.Region,
    Options = new IntegrationOptions {
        PassthroughBehavior = PassthroughBehavior.WHEN_NO_TEMPLATES,
        CredentialsRole = Role.FromRoleArn(this, "CloudWatchAccessRole", "arn:aws:iam::800524210815:role/APIGatewayCloudWatchRole"),
        RequestParameters = new Dictionary<string, string>()
        {
            { "integration.request.header.Content-Encoding", "'amz-1.0'" },
            { "integration.request.header.Content-Type", "'application/json'" },
            { "integration.request.header.X-Amz-Target", "'GraniteServiceVersion20100801.PutMetricData'" },
        },
    },
    IntegrationHttpMethod = "POST",
    Service = "", // What will be the correct value for cloudwatch
    Action = "PutMetricData"
}
What will be the correct value for CloudWatch?
For CloudWatch Logs you put logs, right?
So for CloudWatch (metrics) it is monitoring, i.e. Service = "monitoring". I got it from some GitHub code but cannot find it anymore.
There are several ways to configure CloudWatch to monitor your API Gateway. First, you can create an AWS CloudWatch metric to monitor specific outputs produced by your API Gateway - see an example here. The second way is to use the default configuration - see here.

How to add two or more Target id under 'Id' : instance id in aws lambda function with python

I want to register/deregister two EC2 instances, i-26377gdhdhj and i-9876277sgshj, in an AWS ALB target group using a Lambda function Python script.
I want to know how to add both instance IDs under Targets simultaneously. Please help.
This is my current script.
import boto3
clients = boto3.client('elbv2')
response_tg = clients.register_targets(
    TargetGroupArn='arn:aws:elasticloadbalancing:us-east-1:123456789123:targetgroup/target-demo/c64e6bfc00b4658f',
    Targets=[
        {
            'Id': 'i-26377gdhdhj',
        },
    ]
)
response_tg = clients.register_targets(
    TargetGroupArn='arn:aws:elasticloadbalancing:us-east-1:123456789123:targetgroup/target-demo/c64e6bfc00b4658f',
    Targets=[
        {
            'Id': 'i-26377gdhdhj',
        },
        {
            'Id': 'i-9876277sgshj',
        }
    ]
)
Since Targets is a list, you can pass them both in.
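If you also need to deregister them, deregister_targets takes the same list shape. A minimal sketch, reusing the same (hypothetical) target group ARN and instance IDs from above:

import boto3

clients = boto3.client('elbv2')

# Remove both instances from the target group in a single call.
response_dereg = clients.deregister_targets(
    TargetGroupArn='arn:aws:elasticloadbalancing:us-east-1:123456789123:targetgroup/target-demo/c64e6bfc00b4658f',
    Targets=[
        {'Id': 'i-26377gdhdhj'},
        {'Id': 'i-9876277sgshj'},
    ]
)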

Is it possible to insert a value into a file before it gets used in a terraform resource?

I'm deploying an Amazon Connect instance with an attached contact flow, following this documentation: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/connect_contact_flow
My contact flow is stored in a file, so I'm using the following:
resource "aws_connect_contact_flow" "general" {
instance_id = aws_connect_instance.dev.id
name = "General"
description = "General Flow routing customers to queues"
filename = "flows/contact_flow.json"
content_hash = filebase64sha256("flows/contact_flow.json")
}
However my contact flow requires me to specify the ARN of an AWS Lambda function in one particular section:
...
"parameters":[
    {
        "name":"FunctionArn",
        "value":"<lambda function arn>",
        "namespace":null
    },
    {
        "name":"TimeLimit",
        "value":"3"
    }
],
</snip>
...
The value I would like to substitute into the JSON file at <lambda function arn> before the contact flow is created is accessible at
data.terraform_remote_state.remote.outputs.lambda_arn
Is there any way to achieve this? Or will I have to use the 'content = ' method in the documentation linked above to achieve what I need?
Thanks.
If you really want to use filename instead of content, you have to write the rendered template to some temporary file using local_file.
But using content with templatefile directly for that would probably be easier. For that you would have to convert your flows/contact_flow.json into a template format:
"parameters":[
{
"name":"FunctionArn",
"value":"$arn",
"namespace":null
},
{
"name":"TimeLimit",
"value":"3"
}
],
then, for example:
locals {
  contact_flow = templatefile("flows/contact_flow.json", {
    arn = data.terraform_remote_state.remote.outputs.lambda_arn
  })
}
resource "aws_connect_contact_flow" "general" {
  instance_id = aws_connect_instance.dev.id
  name        = "General"
  description = "General Flow routing customers to queues"
  content     = local.contact_flow
}

VPN Using AWS CDK

I've been working on creating a VPN using AWS's CDK. I had to use Cloudformation lower level resources, as there doesn't seem to be any constructs yet. I believe I have the code set up correctly, as cdk diff doesn't show any errors. However, when running cdk deploy I get the following error:
CREATE_FAILED | AWS::EC2::ClientVpnEndpoint | ClientVpnEndpoint2
Mutual authentication is required but is missing in the request (Service: AmazonEC2; Status Code: 400; Error Code: MissingParameter; Request ID: 5384a1d9-ff60-4ac4-a1bc-df3a4db9146b; Proxy: null)
Which is odd... because I wouldn't think I'd need mutual authentication in order to create a VPN that uses mutual authentication. And if that is the case, then how do I get the aws cdk stack to use mutual authentication on deployment? Here is the relevant code I have:
client_cert = certificate_manager.Certificate.from_certificate_arn(
    self,
    "ServerCertificate",
    self.cert_arn,
)
server_cert = certificate_manager.Certificate.from_certificate_arn(
    self,
    "ClientCertificate",
    self.client_arn,
)
log_group = logs.LogGroup(
    self,
    "ClientVpnLogGroup",
    retention=logs.RetentionDays.ONE_MONTH
)
log_stream = log_group.add_stream("ClientVpnLogStream")
endpoint = ec2.CfnClientVpnEndpoint(
    self,
    "ClientVpnEndpoint2",
    description="VPN",
    authentication_options=[{
        "type": "certificate-authentication",
        "mutual_authentication": {
            "client_root_certificate_chain_arn": client_cert.certificate_arn
        }
    }],
    tag_specifications=[{
        "resourceType": "client-vpn-endpoint",
        "tags": [{
            "key": "Name",
            "value": "Swyp VPN CDK created"
        }]
    }],
    client_cidr_block="10.27.0.0/20",
    connection_log_options={
        "enabled": True,
        "cloudwatch_log_group": log_group.log_group_name,
        "cloudwatch_log_stream": log_stream.log_stream_name,
    },
    server_certificate_arn=server_cert.certificate_arn,
    split_tunnel=False,
    vpc_id=vpc.vpc_id,
    dns_servers=["8.8.8.8", "8.8.4.4"],
)
dependables = core.ConcreteDependable()
for i, subnet in enumerate(vpc.isolated_subnets):
    network_asc = ec2.CfnClientVpnTargetNetworkAssociation(
        self,
        "ClientVpnNetworkAssociation-" + str(i),
        client_vpn_endpoint_id=endpoint.ref,
        subnet_id=subnet.subnet_id,
    )
    dependables.add(network_asc)
auth_rule = ec2.CfnClientVpnAuthorizationRule(
    self,
    "ClientVpnAuthRule",
    client_vpn_endpoint_id=endpoint.ref,
    target_network_cidr="0.0.0.0/0",
    authorize_all_groups=True,
    description="Allow all"
)
# add routes for subnets in order to surf internet (useful while splitTunnel is off)
for i, subnet in enumerate(vpc.isolated_subnets):
    ec2.CfnClientVpnRoute(
        self,
        "CfnClientVpnRoute" + str(i),
        client_vpn_endpoint_id=endpoint.ref,
        destination_cidr_block="0.0.0.0/0",
        description="Route to all",
        target_vpc_subnet_id=subnet.subnet_id,
    ).node.add_dependency(dependables)
Maybe this is something simple like needing to update IAM policies? I'm fairly new to aws, aws cdk/cloudformation, and devops in general. So any insight would be much appreciated!

How can I create a custom metric watching EFS metered size in AWS Cloudwatch?

Title pretty much says it all: since the EFS metered size (usage) is not a metric that I can use in CloudWatch, I need to create a custom metric watching the last metered file size in EFS.
Is there any possibility to do so? Or is there maybe an even better way to monitor the size of my EFS?
I would recommend using a Lambda, running every hour or so and sending the data into CloudWatch.
This code gathers all the EFS file systems and sends their size (in KB) to CloudWatch along with the file system name. Modify it to suit your needs:
import json
import boto3

region = "us-east-1"

def push_efs_size_metric(region):
    efs_name = []
    efs = boto3.client('efs', region_name=region)
    cw = boto3.client('cloudwatch', region_name=region)
    efs_file_systems = efs.describe_file_systems()['FileSystems']
    for fs in efs_file_systems:
        efs_name.append(fs['Name'])
        cw.put_metric_data(
            Namespace="EFS Metrics",
            MetricData=[
                {
                    'MetricName': 'EFS Size',
                    'Dimensions': [
                        {
                            'Name': 'EFS_Name',
                            'Value': fs['Name']
                        }
                    ],
                    'Value': fs['SizeInBytes']['Value'] / 1024,
                    'Unit': 'Kilobytes'
                }
            ]
        )
    return efs_name

def cloudtrail_handler(event, context):
    response = push_efs_size_metric(region)
    print({
        'EFS Names': response
    })
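To get the "every hour or so" part, one option is a scheduled EventBridge (CloudWatch Events) rule that invokes the function. A minimal sketch, assuming the Lambda is already deployed; the rule name, function name, and ARN below are placeholders:

import boto3

events = boto3.client('events', region_name='us-east-1')
lambda_client = boto3.client('lambda', region_name='us-east-1')

# Hypothetical function ARN; replace with your deployed Lambda.
function_arn = 'arn:aws:lambda:us-east-1:123456789012:function:efs-size-metric'

# Create (or update) an hourly schedule rule.
rule = events.put_rule(
    Name='efs-size-metric-hourly',
    ScheduleExpression='rate(1 hour)',
    State='ENABLED'
)

# Allow EventBridge to invoke the function.
lambda_client.add_permission(
    FunctionName=function_arn,
    StatementId='AllowEventBridgeInvoke',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)

# Point the rule at the Lambda.
events.put_targets(
    Rule='efs-size-metric-hourly',
    Targets=[{'Id': 'efs-size-metric', 'Arn': function_arn}]
)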
I'd also suggest reading up on the reference below for more details on creating custom metrics.
References
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/publishingMetrics.html