AWS Glue ETL job from AWS Redshift to S3 fails - amazon-web-services

I am trying out AWS Glue service to ETL some data from redshift to S3. Crawler runs successfully and creates the meta table in data catalog, however when I run the ETL job ( generated by AWS ) it fails after around 20 minutes saying "Resource unavailable".
I cannot see AWS glue logs or error logs created in Cloudwatch. When I try to view them it says "Log stream not found. The log stream jr_xxxxxxxxxx could not be found. Check if it was correctly created and retry."
I would appreciate it if you could provide any guidance to resolve this issue.

So basically, the job you add to Glue will either run if there's not too much traffic in the region your Glue is. If there are no resources available, you need to either manually re-add the job again or you can also bind yourself to events from CloudWatch via SNS.
Also, there are parameters you can pass to the job like maximunRetry and timeout.
If you have a Ressource not available, it won't trigger a retry because the job did not fail, it just didn't even started. But if you set the timeout to let's say 60 minutes, it will trigger an error after that time, decrement your retry pool and re-launch the job.

The closest thing I see to Glue documentation on this is here:
If you encounter errors in AWS Glue, use the following solutions to
help you find the source of the problems and fix them. Note The AWS
Glue GitHub repository contains additional troubleshooting guidance in
AWS Glue Frequently Asked Questions. Error: Resource Unavailable If
AWS Glue returns a resource unavailable message, you can view error
messages or logs to help you learn more about the issue. The following
tasks describe general methods for troubleshooting. • A custom DNS
configuration without reverse lookup can cause AWS Glue to fail. Check
your DNS configuration. If you are using Amazon Route 53 or Microsoft
Active Directory, make sure that there are forward and reverse
lookups. For more information, see Setting Up DNS in Your VPC (p. 23).
• For any connections and development endpoints that you use, check
that your cluster has not run out of elastic network interfaces.

I have recently struggled with Resource Unavailable thrown by Glue Job
Also i was not able to make a direct connection in Glue using RDS -it said "no suitable security group found"
I faced this issue while trying to connect with AWS RDS and Redshift
The problem was with the Security Group that the Redshift was using. There is a need to place a self referencing inbound rule in the Security Group.
For those who dont know what is self referencing inbound rule, follow the steps
1) Go to the Security Group you are using (VPC -> Security Group)
2) In the Inbound Rules select Edit Inbound Rules
3) Add a Rule
a) Type - All Traffic b) Protocol - All c) Port Range - ALL d) Source - custom and in space available write the initial of your security group and select it. e) Save it.
Its done !
if you were missing this condition in your Security Group Inbound Rules
Try creating the connection you will be able to create the connection.
Also job should work this time.

Related

AWS CloudWatch sending logs but not custom metrics to CloudWatch

first time asker.
So I've been trying to implement AWS Cloud Watch to monitor Disk Usage on an EC2 instance running EC2 Linux. I'm interesting in doing this just using the CW Agent and I've installed it according to the how-to found here. The install runs fine and I've made sure I've created an IAM Role for the instance as is described here. Unfortunately whenever I run the amazon-cloudwatch-agent.service it only sends log files and not the custom used_percent measurement specified. I receive this error when I tail the logs.
2021-06-18T15:41:37Z E! WriteToCloudWatch failure, err: RequestError: send request failed
caused by: Post "https://monitoring.us-west-2.amazonaws.com/": dial tcp 172.17.1.25:443: i/o timeout
I've done my best googlefu but gotten nowhere thus far. If you've got any advice it would be appreciated.
Thank you
Belated answer to my own question. I had to create a security group that would accept traffic from that same security group!
Having the same issue, it definitely wasn't a network restriction as I was still able to telnet to the monitoring endpoint.
From AWS docs: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/create-iam-roles-for-cloudwatch-agent.html
One role or user enables CloudWatch agent to be installed on a server
and send metrics to CloudWatch. The other role or user is needed to
store your CloudWatch agent configuration in Systems Manager Parameter
Store. Parameter Store enables multiple servers to use one CloudWatch
agent configuration.
If you're using the default cloudwatchagent configuration wizard, you may require extra policy CloudWatchAgentAdminRole in your role for the agent to connect to the monitoring service.

Send AWS EC2 metrics to AWS Elasticsearch Service Domain for monitoring in Kibana

I am stuck on one point I have created one EC2 Linux based instance in Aws.
Now I want to send the EC2 metrics data to the managed Elasticsearch domain for monitoring purposes in Kiban, I go through the cloud watch console and check the metric is present of instance but didn't get how to connect with the Elasticsearch domain that I have created.
Can anyone please help me with this situation?
There is no build in mechanism for extraction/streaming of metrics data points in real time. You have to develop a custom solution for that. For example, by having a lambda function which is invoked every minute and which reads data points using get_metric_data. The the lambda would inject the points into your ES.
To invoke a lambda function periodically, e.g. every 1 minute you would have to setup CloudWatch Event rule with schedule Expressions. Lambda function would also need to have permissions granted to interact with CloudWatch metrics.
Welcome to SO :)
An alternative to the solution suggested by Marcin is to install metricbeat on the EC2 Instance and configure the metricbeat config file to send metrics to your Managed AWS ES Domain.
This is pretty simple and you should be able to do this fairly quickly.

Load data from S3 into Aurora Serverless using AWS Glue

According to Moving data from S3 -> RDS using AWS Glue
I found that an instance is required to add a connection to a data target. However, my RDS is a serverless, so there is no instance available. Does Glue support this case?
I have tried to connect Aurora MySql Serverless with AWS glue recently, and I failed. And I got a timeout error.
Check that your connection definition references your JDBC database with
correct URL syntax, username, and password. Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago.
The driver has not received any packets from the server.
I think the reason was Aurora serverless doesn't have any continuously running instances so in the connection URL you cannot give any instances, and that's why Glue cannot connect.
So, you need to make sure that DB instance is running. Only then your JDBC connection works.
If your DB runs in a private VPC, you can follow this link:
Nat Creation
EDIT:
Instead of NAT GW, you can also use the VPC endpoint for S3.
Here is a really good blog that explains step by step.
Or AWS documentation
AWS Glue supports the scenario, i.e., it works well to load data from S3 into Aurora Serverless using an AWS Glue job. The engine version I'm currently using is 8.0.mysql_aurora.3.02.0
Note: if you get an error saying Data source rejected establishment of connection, message from server: "Too many connections", you can increase ACUs (currently mine is set to min 4 - max 8 ACUs for your reference), as the maximum number of connections depends on the capacity of ACUs.
I can use JDBC build connection,
There is one thing very important is you should have at least one subnet open ALL TCP port, but you can point the port to the subnet.
With the setting, connection test pass, crawler also can create tables.

AWS ECS service error: Task long arn format must be enabled for launching service tasks with ECS managed tags

I have a ECS service running in a cluster which has 1 task. Upon task update, the service suddenly died with error:
'service my_service_name failed to launch a task with (error Task long arn format must be enabled for launching service tasks with ECS managed tags.)'
Current running tasks are automatically drained and the above message shows up every 6 hours in the "Events" tab of the service. Any changes done to the service config does not repair the issue. Rolling back the task update also doesn't change anything.
I believe I'm already using the long ARN format. Looking for help.
This turned out to be a AWS bug acknowledged by them now. It was supposed to manifest after Jan 1 2020 but appeared early because of a workflow fault in AWS.
The resources were created by an IAM user who was later deleted and hence the issue appears.
I simply removed the following from my task JSON input: propagateTags, enableECSManagedTags
It seems like you are Tagging Your Amazon ECS Resources but you did not opt-in to this feature so you have to opt-in and I thin you are using regular expression in deployment so if your deployment mechanism uses regular expressions to parse the old format ARNs or task IDs, this may be a breaking change.
Starting today you can opt in to a new Amazon Resource Name (ARN) and
resource ID format for Amazon ECS tasks, container instances, and
services. The new format enables the enhanced ability to tag resources
in your cluster, as well as tracking the cost of services and tasks
running in your cluster.
In most cases, you don’t need to change your system beyond opting in
to the new format. However, if your deployment mechanism uses regular
expressions to parse the old format ARNs or task IDs, this may be a
breaking change. It may also be a breaking change if you were storing
the old format ARNs and IDs in a fixed-width database field or data
structure.
After you have opted in, any new ECS services or tasks have the new
ARN and ID format. Existing resources do not receive the new format.
If you decide to opt out, any new resources that you later create then
use the old format.
You can check this AWS compute blog to migrate to new ARN.
migrating-your-amazon-ecs-deployment-to-the-new-arn-and-resource-id-format-2
Tagging Your Amazon ECS Resources
To help you manage your Amazon ECS tasks, services, task definitions,
clusters, and container instances, you can optionally assign your own
metadata to each resource in the form of tags. This topic describes
tags and shows you how to create them.
Important
To use this feature, it requires that you opt-in to the new Amazon
Resource Name (ARN) and resource identifier (ID) formats. For more
information, see Amazon Resource Names (ARNs) and IDs.
ecs-using-tags

Starting AWS DMS Replication Task in Terraform

Is there any way to start an AWS Database Migration Service full-load-and-cdc replication task through Terraform? Preferably, this would start automatically upon creation of the task.
The AWS DMS console provides an option to "Start task on create", and the AWS CLI provides a start-replication-task command, but I'm not seeing similar options in the Terraform resources. The aws_dms_replication_task provides the cdc_start_time argument, but I think this may apply only to cdc tasks. I've tried setting this argument to a number of past/current/future timestamps with my full-load-and-cdc replication task, but the task never started (it was merely created and entered the ready state).
I'd be glad to log a feature request to Terraform if this feature is not supported, but wanted to check with the community first to see if I was overlooking an existing way to do this today.
(Note: This question has also been logged to the Terraform Google group.)
I've logged an issue for this feature request:
Terraform AWS Provider #2083: Support Starting AWS Database Migration Service Replication Task