Getting error while testing AWS Lambda function: "Invalid database identifier" - amazon-web-services

Hi, I'm getting the following error while testing my Lambda function:
{
"errorMessage": "An error occurred (InvalidParameterValue) when calling the DescribeDBInstances operation: Invalid database identifier: <RDS instance id>",
"errorType": "ClientError",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 25, in lambda_handler\n db_instances = rdsClient.describe_db_instances(DBInstanceIdentifier=rdsInstanceId)['DBInstances']\n",
" File \"/var/runtime/botocore/client.py\", line 391, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/var/runtime/botocore/client.py\", line 719, in _make_api_call\n raise error_class(parsed_response, operation_name)\n"
]
}
And here is my Lambda code:
import json
import boto3
import logging
import os

# Logging
LOGGER = logging.getLogger()
LOGGER.setLevel(logging.INFO)

# Initialise Boto3 for RDS
rdsClient = boto3.client('rds')

def lambda_handler(event, context):
    # Log input event
    LOGGER.info("RdsAutoRestart Event Received, now checking if event is eligible. Event Details ==> %s", event)

    # Input event from the SNS topic originated from RDS event notifications
    snsMessage = json.loads(event['Records'][0]['Sns']['Message'])
    rdsInstanceId = snsMessage['Source ID']
    stepFunctionInput = {"rdsInstanceId": rdsInstanceId}
    rdsEventId = snsMessage['Event ID']

    # Retrieve RDS instance ARN
    db_instances = rdsClient.describe_db_instances(DBInstanceIdentifier=rdsInstanceId)['DBInstances']
    db_instance = db_instances[0]
    rdsInstanceArn = db_instance['DBInstanceArn']

    # Filter on the Auto Restart RDS Event. Event code: RDS-EVENT-0154.
    if 'RDS-EVENT-0154' in rdsEventId:
        # Log input event
        LOGGER.info("RdsAutoRestart Event detected, now verifying that instance was tagged with auto-restart-protection == yes")

        # Verify that the instance is tagged with the auto-restart-protection tag. The tag is used to classify instances that are required to be terminated once started.
        tagCheckPass = 'false'
        rdsInstanceTags = rdsClient.list_tags_for_resource(ResourceName=rdsInstanceArn)
        for rdsInstanceTag in rdsInstanceTags["TagList"]:
            if 'auto-restart-protection' in rdsInstanceTag["Key"]:
                if 'yes' in rdsInstanceTag["Value"]:
                    tagCheckPass = 'true'
                    # Log instance tags
                    LOGGER.info("RdsAutoRestart verified that the instance is tagged auto-restart-protection = yes, now starting the Step Functions Flow")
                else:
                    tagCheckPass = 'false'
                    # Log instance tags
                    LOGGER.info("RdsAutoRestart Event detected, now verifying that instance was tagged with auto-restart-protection == yes")

        if 'true' in tagCheckPass:
            # Initialise Step Functions client
            stepFunctionsClient = boto3.client('stepfunctions')

            # Start the Step Functions workflow
            # The state machine ARN is stored in an environment variable
            stepFunctionsArn = os.environ['STEPFUNCTION_ARN']
            stepFunctionsResponse = stepFunctionsClient.start_execution(
                stateMachineArn=stepFunctionsArn,
                name=event['Records'][0]['Sns']['MessageId'],
                input=json.dumps(stepFunctionInput)
            )
        else:
            LOGGER.info("RdsAutoRestart Event detected, and event is not eligible")

    return {
        'statusCode': 200
    }
I'm trying to stop an Amazon RDS database that starts automatically after 7 days. I'm following this AWS blog post: Field Notes: Stopping an Automatically Started Database Instance with Amazon RDS | AWS Architecture Blog
Can anyone help me?

The error message says: Invalid database identifier: <RDS instance id>
It seems to be coming from this line:
db_instances = rdsClient.describe_db_instances(DBInstanceIdentifier=rdsInstanceId)['DBInstances']
The error message is saying that the rdsInstanceId variable contains <RDS instance id>, which seems to be an example value rather than a real value.
In looking at the code on Field Notes: Stopping an Automatically Started Database Instance with Amazon RDS | AWS Architecture Blog, it is asking you to create a test event that includes this message:
"Message": "{\"Event Source\":\"db-instance\",\"Event Time\":\"2020-07-09 15:15:03.031\",\"Identifier Link\":\"https://console.aws.amazon.com/rds/home?region=<region>#dbinstance:id=<RDS instance id>\",\"Source ID\":\"<RDS instance id>\",\"Event ID\":\"http://docs.amazonwebservices.com/AmazonRDS/latest/UserGuide/USER_Events.html#RDS-EVENT-0154\",\"Event Message\":\"DB instance started\"}",
If you look closely at that line, it includes this part to identify the Amazon RDS instance:
dbinstance:id=<RDS instance id>
I think that you are expected to modify the provided test event to fill in your own values for anything in <angle brackets> (such as the instance ID of your Amazon RDS instance).
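To check what the function is actually receiving, you could log the parsed Source ID before calling DescribeDBInstances. Below is a minimal sketch, assuming the same event structure as the blog post; the instance identifier database-1 and the MessageId are hypothetical placeholders to be replaced with your own values:
import json

def extract_instance_id(event):
    """Parse the SNS message from the (test) event and return the Source ID."""
    sns_message = json.loads(event['Records'][0]['Sns']['Message'])
    instance_id = sns_message['Source ID']
    # If this still prints a placeholder such as '<RDS instance id>',
    # the test event has not been updated with a real identifier.
    print("Source ID received:", instance_id)
    return instance_id

# Hypothetical test event with the placeholders replaced by real values,
# e.g. a DB instance named 'database-1' (adjust to your own instance).
test_event = {
    "Records": [{
        "Sns": {
            "MessageId": "00000000-0000-0000-0000-000000000000",
            "Message": json.dumps({
                "Event Source": "db-instance",
                "Source ID": "database-1",
                "Event ID": "http://docs.amazonwebservices.com/AmazonRDS/latest/UserGuide/USER_Events.html#RDS-EVENT-0154",
                "Event Message": "DB instance started"
            })
        }
    }]
}

print(extract_instance_id(test_event))  # -> database-1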

Related

AWS SSM error while targets.1.member.values failed to satisfy constraint: Member must have length less than or equal to 50

I am trying to run an SSM command on more than 50 EC2 instances in my fleet. Using the AWS boto3 SSM client, I am running a specific command on my nodes. My code is given below. After running it, an unexpected error shows up.
# running EC2 instances
instances = client.describe_instances()
instance_ids = [
    inst["InstanceId"]
    for reservation in instances["Reservations"]
    for inst in reservation["Instances"]
]  # might contain more than 50 instances

# run command
run_cmd_resp = ssm_client.send_command(
    Targets=[
        {"Key": "InstanceIds", "Values": instance_ids},
    ],
    DocumentName="AWS-RunShellScript",
    DocumentVersion="1",
    Parameters={
        "commands": ["#!/bin/bash", "ls -ltrh", "# some commands"]
    }
)
On executing this, I get the error below:
An error occurred (ValidationException) when calling the SendCommand operation: 1 validation error detected: Value '[...91 instance IDs...]' at 'targets.1.member.values' failed to satisfy constraint: Member must have length less than or equal to 50.
How do I run the SSM command on my whole fleet?
As shown in the error message and the boto3 documentation (link), the number of instances in one send_command call is limited to 50. To run the SSM command on all instances, splitting the original list into batches of 50 could be a solution.
FYI: If your account has a fair number of instances, describe_instances() can't retrieve all instance info in one API call, so it would be better to check whether NextToken is in the response.
ref: How do you use "NextToken" in AWS API calls
# running EC2 instances
instances = client.describe_instances()
instance_ids = [
    inst["InstanceId"]
    for reservation in instances["Reservations"]
    for inst in reservation["Instances"]
]
while "NextToken" in instances:
    instances = client.describe_instances(NextToken=instances["NextToken"])
    instance_ids += [
        inst["InstanceId"]
        for reservation in instances["Reservations"]
        for inst in reservation["Instances"]
    ]

# run command in batches of 50 instances
for i in range(0, len(instance_ids), 50):
    target_instances = instance_ids[i : i + 50]
    run_cmd_resp = ssm_client.send_command(
        Targets=[
            {"Key": "InstanceIds", "Values": target_instances},
        ],
        DocumentName="AWS-RunShellScript",
        DocumentVersion="1",
        Parameters={
            "commands": ["#!/bin/bash", "ls -ltrh", "# some commands"]
        }
    )
Finally, following @Rohan Kishibe's answer, I implemented the batched execution below for the SSM RunShellScript document.
import math

ec2_ids_all = [...]  # all instance IDs fetched by pagination.
PG_START, PG_STOP = 0, 50
PG_SIZE = 50
PG_COUNT = math.ceil(len(ec2_ids_all) / PG_SIZE)

for page in range(PG_COUNT):
    cmd = ssm.send_command(
        Targets=[{"Key": "InstanceIds", "Values": ec2_ids_all[PG_START:PG_STOP]}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["ls -ltrh", "# other commands"]}
    )
    PG_START += PG_SIZE
    PG_STOP += PG_SIZE
In this way, the total list of instance IDs is distributed into batches and executed batch by batch. One can also save the command IDs and batch instance IDs in a mapping for future use.
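As a variation on the above, boto3's built-in paginator can replace the manual NextToken loop, and the CommandId returned by each send_command call can be recorded per batch. This is a rough sketch under the same assumptions as the answer (AWS-RunShellScript document, batches of 50); the commands shown are placeholders:
import boto3

ec2 = boto3.client("ec2")
ssm = boto3.client("ssm")

# Collect all instance IDs using the paginator instead of a manual NextToken loop.
instance_ids = []
for page in ec2.get_paginator("describe_instances").paginate():
    for reservation in page["Reservations"]:
        for instance in reservation["Instances"]:
            instance_ids.append(instance["InstanceId"])

# Send the command in batches of 50 and remember which CommandId covered which batch.
BATCH_SIZE = 50
command_batches = {}  # CommandId -> list of instance IDs
for i in range(0, len(instance_ids), BATCH_SIZE):
    batch = instance_ids[i : i + BATCH_SIZE]
    resp = ssm.send_command(
        Targets=[{"Key": "InstanceIds", "Values": batch}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["ls -ltrh"]},  # placeholder commands
    )
    command_batches[resp["Command"]["CommandId"]] = batch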

Creating Connection for RedshiftDataOperator

So I went to the Airflow documentation for AWS Redshift; there are two operators that can execute a SQL query: RedshiftSQLOperator and RedshiftDataOperator. I already implemented my job using RedshiftSQLOperator, but I want to do it with RedshiftDataOperator instead, because I don't want to use a Postgres connection in RedshiftSQLOperator but the AWS API.
RedshiftDataOperator Documentation
I read this documentation; there is an aws_conn_id parameter. But when I try to use the same connection ID, I get this error:
[2023-01-11, 04:55:56 UTC] {base.py:68} INFO - Using connection ID 'redshift_default' for task execution.
[2023-01-11, 04:55:56 UTC] {base_aws.py:206} INFO - Credentials retrieved from login
[2023-01-11, 04:55:56 UTC] {taskinstance.py:1889} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/amazon/aws/operators/redshift_data.py", line 146, in execute
self.statement_id = self.execute_query()
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/amazon/aws/operators/redshift_data.py", line 124, in execute_query
resp = self.hook.conn.execute_statement(**filter_values)
File "/home/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 415, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 745, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (UnrecognizedClientException) when calling the ExecuteStatement operation: The security token included in the request is invalid.
Here is the task definition:
redshift_data_task = RedshiftDataOperator(
    task_id='redshift_data_task',
    database='rds',
    region='ap-southeast-1',
    aws_conn_id='redshift_default',
    sql="""
    call some_procedure();
    """
)
What should I fill in for the Airflow connection? In the documentation there is no example of the values I should use. Thanks.
Airflow RedshiftDataOperator Connection Required Value
Have you tried using the Amazon Redshift connection? There is both an option for authenticating using your Redshift credentials:
Connection ID: redshift_default
Connection Type: Amazon Redshift
Host: <your-redshift-endpoint> (for example, redshift-cluster-1.123456789.us-west-1.redshift.amazonaws.com)
Schema: <your-redshift-database> (for example, dev, test, prod, etc.)
Login: <your-redshift-username> (for example, awsuser)
Password: <your-redshift-password>
Port: <your-redshift-port> (for example, 5439)
(source)
and an option for using an IAM role (there is an example in the first link).
Disclaimer: I work at Astronomer :)
EDIT: Tested the following with Airflow 2.5.0 and Amazon provider 6.2.0:
Added the IP of my Airflow instance to the VPC security group with "All traffic" access.
Airflow connection with the connection id aws_default, connection type "Amazon Web Services", extra: { "aws_access_key_id": "<your-access-key-id>", "aws_secret_access_key": "<your-secret-access-key>", "region_name": "<your-region-name>" }. All other fields blank. I used a root key for my toy AWS account. If you use other credentials, you need to make sure that the IAM role has access and the right permissions to the Redshift cluster (there is a list in the link above).
Operator code:
red = RedshiftDataOperator(
    task_id="red",
    database="dev",
    sql="SELECT * FROM dev.public.users LIMIT 5;",
    cluster_identifier="redshift-cluster-1",
    db_user="awsuser",
    aws_conn_id="aws_default"
)

EndpointConnectionError: Could not connect to the endpoint URL: "http://169.254.169.254/....."

I am trying to create an AWS RDS instance and deploy a Lambda function using a Python script. However, I am getting the error below; it looks like it is unable to communicate with AWS to create the RDS instance.
DEBUG: Caught retryable HTTP exception while making metadata service request to http://169.254.169.254/latest/meta-data/iam/security-credentials/: Could not connect to the endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/botocore/utils.py", line 303, in _get_request
response = self._session.send(request.prepare())
File "/usr/lib/python2.7/site-packages/botocore/httpsession.py", line 282, in send raise EndpointConnectionError(endpoint_url=request.url, error=e)
EndpointConnectionError: Could not connect to the endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
I am getting the AWS credentials through Okta SSO. In the ~/.aws directory, below are the contents of the 'credentials' and 'config' files respectively.
[default]
aws_access_key_id = <Key Id>
aws_secret_access_key = <Secret Key>
aws_session_token = <Token>
[default]
region = us-west-2
for az in availability_zones:
    if aurora.get_db_instance(db_instance_identifier + "-" + az)[0] != 0:
        aurora.create_db_instance(db_cluster_identifier, db_instance_identifier + "-" + az, az, subnet_group_identifier, db_instance_type)
    else:
        aurora.modify_db_instance(db_cluster_identifier, db_instance_identifier + "-" + az, az, db_instance_type)

# Wait for DB to become available for connection
iter_max = 15
iteration = 0
for az in availability_zones:
    while aurora.get_db_instance(db_instance_identifier + "-" + az)[1]["DBInstances"][0]["DBInstanceStatus"] != "available":
        iteration += 1
        if iteration < iter_max:
            logging.info("Waiting for DB instances to become available - iteration " + str(iteration) + " of " + str(iter_max))
            time.sleep(10*iteration)
        else:
            raise Exception("Waiting for DB Instance to become available timed out!")
cluster_endpoint = aurora.get_db_cluster(db_cluster_identifier)[1]["DBClusters"][0]["Endpoint"]
The actual error is below, coming from the while loop. DEBUG shows 'Unable to locate credentials', but the credentials are there. I can deploy an Elastic Beanstalk environment from the CLI using the same AWS credentials, but not this. It looks like the aurora.create_db_instance call above failed.
DEBUG: Unable to locate credentials
Traceback (most recent call last):
File "./deploy_api.py", line 753, in <module> sync_rds()
File "./deploy_api.py", line 57, in sync_rds
while aurora.get_db_instance(db_instance_identifier + "-" + az)[1]["DBInstances"][0]["DBInstanceStatus"] != "available":
TypeError: 'NoneType' object has no attribute '__getitem__'
I had this error because an ECS task didn't have permissions to write to DynamoDB. The code causing the problem was:
from boto3 import resource
dynamodb_resource = resource("dynamodb")
The problem was resolved when I filled in the region_name, aws_access_key_id and aws_secret_access_key parameters for the resource() function call.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session.resource
If this doesn't solve your problem then check your code that connects to AWS services and make sure that you are filling in all of the proper function parameters.
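For illustration, explicitly passing the region and credentials to resource() looks roughly like the sketch below. The parameter names are standard boto3 arguments; reading the values from environment variables is just one way to avoid hard-coding secrets, and in most setups letting boto3 resolve credentials from ~/.aws or an instance/task role is preferable:
import os
from boto3 import client, resource

# Credentials and region supplied explicitly (read from environment variables
# here rather than hard-coded). All parameters are standard boto3 arguments.
dynamodb_resource = resource(
    "dynamodb",
    region_name=os.environ.get("AWS_REGION", "us-west-2"),
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# The same parameters work for any client, e.g. RDS in the question above.
rds_client = client(
    "rds",
    region_name=os.environ.get("AWS_REGION", "us-west-2"),
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    aws_session_token=os.environ.get("AWS_SESSION_TOKEN"),  # needed for SSO/temporary credentials
)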

When using lambda to generate elbv2 attributes (name specifically), receiving error from Lambda that name is longer than 32 characters

I am building a CloudFormation template that uses a Lambda function to generate the name of the load balancer built by the template.
When the function runs, it fails with the following error:
Failed to validate attributes of ELB arn:aws-us-gov:elasticloadbalancing:us-gov-west-1:273838691273:loadbalancer/app/dev-fu-WALB-18VHO2DJ4MHK/c69c48fd3464de01. An error occurred (ValidationError) when calling the DescribeLoadBalancers operation: The load balancer name 'arn:aws-us-gov:elasticloadbalancing:us-gov-west-1:273838691273:loadbalancer/app/dev-fu-WALB-18VHO2DJ4MHK/c69c48fd3464de01' cannot be longer than '32' characters.
It is obviously pulling the arn rather than the name of the elbv2.
I opened a ticket with AWS to no avail, and also with the company that wrote the script... same results.
I have attached the script and any help is greatly appreciated.
import cfn_resource
import boto3
import boto3.session
import logging

logger = logging.getLogger()
handler = cfn_resource.Resource()

# Retrieves DNSName and source security group name for the specified ELB
@handler.create
def get_elb_attributes(event, context):
    properties = event['ResourceProperties']
    elb_name = properties['PORALBName']
    elb_template = properties['PORALBTemplate']
    elb_subnets = properties['PORALBSubnets']
    try:
        client = boto3.client('elbv2')
        elb = client.describe_load_balancers(
            Names=[
                elb_name
            ]
        )['LoadBalancers'][0]
        for az in elb['AvailabilityZones']:
            if not az['SubnetId'] in elb_subnets:
                raise Exception("ELB does not include VPC subnet '" + az['SubnetId'] + "'.")
        target_groups = client.describe_target_groups(
            LoadBalancerArn=elb['LoadBalancerArn']
        )['TargetGroups']
        target_group_arns = []
        for target_group in target_groups:
            target_group_arns.append(target_group['TargetGroupArn'])
        if elb_template == 'geoevent':
            if elb['Type'] != 'network':
                raise Exception("GeoEvent Server requires network ElasticLoadBalancer V2.")
        response_data = {}
        response_data['DNSName'] = elb['DNSName']
        response_data['TargetGroupARNs'] = target_group_arns
        msg = 'ELB {} found.'.format(elb_name)
        logger.info(msg)
        return {
            'Status': 'SUCCESS',
            'Reason': msg,
            'PhysicalResourceId': context.log_stream_name,
            'StackId': event['StackId'],
            'RequestId': event['RequestId'],
            'LogicalResourceId': event['LogicalResourceId'],
            'Data': response_data
        }
    except Exception as e:
        error_msg = 'Failed to validate attributes of ELB {}. {}'.format(elb_name, e)
        logger.error(error_msg)
        return {
            'Status': 'FAILED',
            'Reason': error_msg,
            'PhysicalResourceId': context.log_stream_name,
            'StackId': event['StackId'],
            'RequestId': event['RequestId'],
            'LogicalResourceId': event['LogicalResourceId']
        }
The error says:
An error occurred (ValidationError) when calling the DescribeLoadBalancers operation
So, looking at where it calls DescribeLoadBalancers:
elb = client.describe_load_balancers(
    Names=[
        elb_name
    ]
)['LoadBalancers'][0]
The error also said:
The load balancer name ... cannot be longer than '32' characters.
The name comes from:
properties = event['ResourceProperties']
elb_name = properties['PORALBName']
So, the information is being passed into the Lambda function via event. This is coming from whatever is triggering the Lambda function. So, you'll need to find out what is triggering the function and discover what information it is actually sending. Your problem is outside of the code listed.
Other options
In your code, you can send event to the debug logs (e.g. print(event)) and see whether they are passing the ELB name in a different field.
Alternatively, you could call describe_load_balancers without a Name filter to retrieve a list of all load balancers, then use the ARN (that you have) to find the load balancer of interest. Simply loop through all the results until you find the one that matches the ARN you have. Then, continue as normal.
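A rough sketch of that alternative, assuming the value being passed in really is the load balancer ARN (the ResourceProperties field name reused in the usage comment is just the one from the question's template):
import boto3

def find_elb_by_arn(elb_arn):
    """Return the load balancer description whose ARN matches elb_arn, or None."""
    client = boto3.client('elbv2')
    paginator = client.get_paginator('describe_load_balancers')
    # List all load balancers (no Names filter) and match on the ARN we already have.
    for page in paginator.paginate():
        for lb in page['LoadBalancers']:
            if lb['LoadBalancerArn'] == elb_arn:
                return lb
    return None

# Hypothetical usage inside the handler: in this scenario the 'PORALBName'
# property actually contains an ARN, not a name.
# elb = find_elb_by_arn(event['ResourceProperties']['PORALBName'])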

AWS: How to programmatically create a RDS Aurora Cluster in Python/Boto3

My application is hosted on Amazon Web Services, and I'm starting to script the creation of all the infrastructure of my app (VPC, security group, Beanstalk, etc.). I did not find the proper way to create an RDS Aurora cluster, and I failed to reproduce the RDS wizard (which helps you create the DB instances and the cluster) in Python with Boto3. Maybe I lack knowledge of infrastructure and networking, but I think creating an Aurora cluster should be accessible to me.
So here is my question:
Let's say I have a VPC ID, a security group ID, and some database info (user, password...). What are the minimum API calls I have to make to create a cluster and make it usable by my application? The procedure must end with a cluster reader/writer endpoint and a reader-only endpoint.
Here is how I create an Aurora MySQL cluster in Python/Boto3. You will have to implement some missing helper functions (the psa.* calls) yourself.
import boto3
import botocore

def create_aurora(
    instance_identifier,  # used for instance name and cluster name
    db_username,
    db_password,
    db_name,
    db_port,
    vpc_id,
    vpc_sg,  # Must be an array
    dbsubnetgroup_name,
    public_access = False,
    AZ = None,
    instance_type = "db.t2.small",
    multi_az = True,
    nb_instance = 1,
    extratags = []
):
    rds = boto3.client('rds')
    # Assume a DB Subnet Group exists before creating the cluster. You must have created a DBSubnetGroup associated to the subnets of the VPC of your cluster. AWS will find it automatically.
    #
    # Search if the cluster exists
    try:
        db_cluster = rds.describe_db_clusters(
            DBClusterIdentifier = instance_identifier
        )['DBClusters']
        db_cluster = db_cluster[0]
    except botocore.exceptions.ClientError as e:
        psa.printf("Creating empty cluster\r\n")
        res = rds.create_db_cluster(
            DBClusterIdentifier = instance_identifier,
            Engine="aurora",
            MasterUsername=db_username,
            MasterUserPassword=db_password,
            DBSubnetGroupName=dbsubnetgroup_name,
            VpcSecurityGroupIds=vpc_sg,
            AvailabilityZones=AZ
        )
        db_cluster = res['DBCluster']

    cluster_name = db_cluster['DBClusterIdentifier']
    instance_identifier = db_cluster['DBClusterIdentifier']
    psa.printf("Cluster identifier : %s, status : %s, members : %d\n", instance_identifier, db_cluster['Status'], len(db_cluster['DBClusterMembers']))
    if (db_cluster['Status'] == 'deleting'):
        psa.printf(" Please wait for the cluster to be deleted and try again.\n")
        return None
    psa.printf(" Writer Endpoint : %s\n", db_cluster['Endpoint'])
    psa.printf(" Reader Endpoint : %s\n", db_cluster['ReaderEndpoint'])

    # Now create instances
    # Loop on the requested number of instances, and balance them across AZs
    for i in range(1, nb_instance + 1):
        if AZ != None:
            the_AZ = AZ[(i - 1) % len(AZ)]
            dbinstance_id = instance_identifier + "-" + str(i) + "-" + the_AZ
        else:
            the_AZ = None
            dbinstance_id = instance_identifier + "-" + str(i)
        psa.printf("Creating instance %d named '%s' in AZ %s\n", i, dbinstance_id, the_AZ)
        try:
            res = rds.create_db_instance(
                DBInstanceIdentifier=dbinstance_id,
                DBInstanceClass=instance_type,
                Engine='aurora',
                PubliclyAccessible=False,
                AvailabilityZone=the_AZ,
                DBSubnetGroupName=dbsubnetgroup_name,
                DBClusterIdentifier=instance_identifier,
                Tags = psa.tagsKeyValueToAWStags(extratags)
            )['DBInstance']
            psa.printf(" DbiResourceId=%s\n", res['DbiResourceId'])
        except botocore.exceptions.ClientError as e:
            psa.printf(" Instance seems to exist.\n")
            res = rds.describe_db_instances(DBInstanceIdentifier = dbinstance_id)['DBInstances']
            psa.printf(" Status is %s\n", res[0]['DBInstanceStatus'])
    return db_cluster
Yeah, you are on the right track. Here is the boto3 documentation for creating an Aurora RDS cluster.
Further, to address the bigger picture problem (i.e. managing your entire infrastructure as code), you should look at options like Terraform.
Check out their Git repo (Terraform Git Repo). You can accomplish the same task of creating the Aurora cluster with Terraform using this template.
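To tie this back to the original question about the minimum API calls, a bare-bones sketch could look like the following. It assumes the DB subnet group and security group already exist; the identifiers and credentials are placeholders, and production code should not hard-code the master password:
import boto3

rds = boto3.client("rds")

# 1. Create the cluster (placeholder identifiers/credentials).
rds.create_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",
    Engine="aurora-mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me",          # do not hard-code in real code
    DBSubnetGroupName="my-db-subnet-group",
    VpcSecurityGroupIds=["sg-0123456789abcdef0"],
)

# 2. Create at least one instance inside the cluster.
rds.create_db_instance(
    DBInstanceIdentifier="my-aurora-instance-1",
    DBInstanceClass="db.t3.medium",
    Engine="aurora-mysql",
    DBClusterIdentifier="my-aurora-cluster",
)

# 3. Wait until the instance is available, then read both endpoints from the cluster.
rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="my-aurora-instance-1")
cluster = rds.describe_db_clusters(DBClusterIdentifier="my-aurora-cluster")["DBClusters"][0]
print("Writer endpoint:", cluster["Endpoint"])
print("Reader endpoint:", cluster["ReaderEndpoint"])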