AWS RDS pg_transport failed to download file data

When running the following command:
SELECT transport.import_from_server(%s,5432,'My RDS ADMIN USER',%s,%s,%s,true);
I get the following response from the command:
AWS RDS pg_transport failed to download file data
Both RDS instances are in the same region and the same VPC, both have security groups allowing the connection between them, and the SG only has an inbound rule for port 5432.
I was unable to find documentation or any further information on possible causes of this failure.
Steps followed were: https://aws.amazon.com/blogs/database/migrating-databases-using-rds-postgresql-transportable-databases/
Unlike the tutorial, I am working with existing RDS instances; both are running PostgreSQL 11.5 and hold my own data instead of the sample data from the tutorial.
Any advice?

Could you please recheck whether your source instance's security group allows connections from the destination instance?
Also recheck all the parameters that you have set in the source and destination parameter groups.
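If I remember the setup correctly, the parameters involved are shared_preload_libraries (must include pg_transport), pg_transport.num_workers and pg_transport.timing. A rough boto3 sketch of applying them to both parameter groups (the group names and values below are placeholders, not taken from your setup):

import boto3

rds = boto3.client("rds")

# Placeholder parameter group names -- substitute the groups attached to
# your source and destination instances.
for group_name in ("source-pg11-params", "destination-pg11-params"):
    rds.modify_db_parameter_group(
        DBParameterGroupName=group_name,
        Parameters=[
            # Static parameter: append pg_transport to any existing value
            # and reboot the instance for it to take effect.
            {"ParameterName": "shared_preload_libraries",
             "ParameterValue": "pg_transport",
             "ApplyMethod": "pending-reboot"},
            # Number of parallel transport workers (placeholder value).
            {"ParameterName": "pg_transport.num_workers",
             "ParameterValue": "4",
             "ApplyMethod": "immediate"},
            # Log timing information for the transport.
            {"ParameterName": "pg_transport.timing",
             "ParameterValue": "1",
             "ApplyMethod": "immediate"},
        ],
    )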

I've had this before; it seems to be a bug within pg_transport.
The advice from AWS was to use a larger instance class on both the source and target instances. It seems to be stable using db.m5.4xlarge.
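If you want to script that workaround, a minimal boto3 sketch (the instance identifiers are placeholders):

import boto3

rds = boto3.client("rds")

# Placeholder identifiers -- replace with your source and target instances.
for instance_id in ("source-postgres", "target-postgres"):
    rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass="db.m5.4xlarge",  # the class that was stable for us
        ApplyImmediately=True,            # apply the resize right away
    )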

Related

AWS EMR jupyter error 403 Forbidden (Workspace is not attached to cluster)

I have a simple notebook in EMR. I have no running clusters. From the notebook's open page itself I request a new cluster, so my expectation is that all parameters necessary to ensure a good notebook-cluster connection are in place. I observe that the release is emr-5.36.0 and that the applications Hadoop, Spark, Livy, Hive, and JupyterEnterpriseGateway are all included. I am using default security groups.
Both the cluster and the notebook hosts start, but upon opening Jupyter (or JupyterLab) the kernel launch fails with the message Error 403: Workbook is not attached to cluster. All attempts at "jiggling" the kernel -- choosing a different one, doing a start/stop, etc. -- yield the same error.
There are a number of docs plus answers here on SO, but these tend to revolve around trying to use EC2 instances instead of EMR, messing with master vs. core nodes, forgetting JupyterEnterpriseGateway, and the like. Again, you'd think that a cluster launched directly from the notebook would work.
Any clues?
I have done this many times before and it always works, with the create new cluster option, and default security groups are not an issue.
One thing that could cause this error, and which you have not made clear, is that it will not let you open it as root. So do not use the root AWS account to create the cluster / notebook; create and use an IAM user that has permissions to launch the cluster.
I tried with the admin policy attached.

Resetting AWS params to default params

I have a Basic plan AWS account that I use for doing some labs and PoCs since I started doing some DevOps.
Lately I'm no longer able to connect to a newly created instance via SSH (despite creating a rule for port 22) after changing a couple of network parameters.
Is there a way to reset my AWS account so I get back to the default configuration?
Finally, I used aws-nuke, but I couldn't remove all my resources for some reason (I might have missed something).
However, when I changed regions from Paris to Frankfurt, I was able to create EC2 instances and access them via SSH.

Using AWS Glue with MySQL in my EC2 Instance

I am trying to connect the MySQL installed on my EC2 instance with Glue; the purpose is to extract some information and move it to Redshift, but I am receiving the following error:
Check that your connection definition references your JDBC database with correct URL syntax, username, and password. Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
This is the format that I am using: jdbc:mysql://host:3306/database
I am using the same VPC, same SG, and same subnet for the instance.
I know the user/password are correct because I can connect to the database with SQL Developer.
What do I need to check? Is it possible to use AWS Glue with MySQL on my instance?
Thanks in advance.
In the JDBC connection URL that you have mentioned, use the private IP of the EC2 instance (where MySQL is installed) as the host.
jdbc:mysql://ec2_instance_private_ip:3306/database
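If you are creating the Glue connection programmatically, here is a hedged boto3 sketch that looks up the private IP and plugs it into the JDBC URL (instance ID, connection name, credentials, subnet and security group are all placeholders):

import boto3

ec2 = boto3.client("ec2")
glue = boto3.client("glue")

# Placeholder instance ID of the EC2 box where MySQL is installed.
resp = ec2.describe_instances(InstanceIds=["i-0123456789abcdef0"])
private_ip = resp["Reservations"][0]["Instances"][0]["PrivateIpAddress"]

glue.create_connection(
    ConnectionInput={
        "Name": "mysql-on-ec2",  # placeholder connection name
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": f"jdbc:mysql://{private_ip}:3306/database",
            "USERNAME": "glue_user",   # placeholder
            "PASSWORD": "change-me",   # placeholder
        },
        # Must match the VPC/subnet/SG the instance lives in.
        "PhysicalConnectionRequirements": {
            "SubnetId": "subnet-0123456789abcdef0",
            "SecurityGroupIdList": ["sg-0123456789abcdef0"],
        },
    }
)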
Yes, it is possible to use AWS Glue with the MySQL running in your EC2 instance, but before that you should first use DMS to migrate your databases.
Moreover, your target database (Redshift) has a different schema than the source database (MySQL); that's what we call a heterogeneous database migration (the schema structure, data types, and database code of the source and target are quite different), so you also need AWS SCT.
That said, I'm not sure you can migrate straight from MySQL in an EC2 instance to Amazon Redshift.
Hope this helps

Load data from S3 into Aurora Serverless using AWS Glue

According to Moving data from S3 -> RDS using AWS Glue,
I found that an instance is required to add a connection to a data target. However, my RDS is serverless, so there is no instance available. Does Glue support this case?
I tried to connect Aurora MySQL Serverless with AWS Glue recently, and I failed with a timeout error.
Check that your connection definition references your JDBC database with correct URL syntax, username, and password. Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
I think the reason is that Aurora Serverless doesn't have any continuously running instances, so in the connection URL you cannot point at an instance, and that's why Glue cannot connect.
So, you need to make sure that the database is actually up and reachable; only then will your JDBC connection work.
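If this is Aurora Serverless v1, one likely culprit (my assumption, not confirmed by the original poster) is auto-pause: a paused cluster has to resume before it can answer, and the Glue connection test may time out first. A rough boto3 sketch to keep the cluster awake (identifier and capacities are placeholders):

import boto3

rds = boto3.client("rds")

# Placeholder cluster identifier. AutoPause=False keeps the Serverless v1
# cluster from pausing when idle, so Glue's JDBC handshake does not hit
# a cold, paused database.
rds.modify_db_cluster(
    DBClusterIdentifier="my-aurora-serverless",
    ScalingConfiguration={
        "MinCapacity": 2,
        "MaxCapacity": 8,
        "AutoPause": False,
    },
    ApplyImmediately=True,
)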
If your DB runs in a private VPC, you can follow this link:
Nat Creation
EDIT:
Instead of NAT GW, you can also use the VPC endpoint for S3.
Here is a really good blog that explains step by step.
Or AWS documentation
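For the S3 endpoint route, this is roughly what the gateway endpoint creation looks like in boto3 (region, VPC and route table IDs are placeholders):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

# A gateway endpoint lets a job in a private subnet reach S3 without a
# NAT gateway. The VPC and route table IDs below are placeholders.
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],
)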
AWS Glue supports this scenario, i.e., it works well to load data from S3 into Aurora Serverless using an AWS Glue job. The engine version I'm currently using is 8.0.mysql_aurora.3.02.0.
Note: if you get an error saying Data source rejected establishment of connection, message from server: "Too many connections", you can increase ACUs (currently mine is set to min 4 - max 8 ACUs for your reference), as the maximum number of connections depends on the capacity of ACUs.
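For reference, this is roughly how the capacity range can be adjusted with boto3 for a Serverless v2 cluster like the one above (the cluster identifier is a placeholder):

import boto3

rds = boto3.client("rds")

# Raising MaxCapacity raises the connection ceiling, since max_connections
# scales with the ACUs available to the cluster. Identifier is a placeholder.
rds.modify_db_cluster(
    DBClusterIdentifier="my-aurora-serverless-v2",
    ServerlessV2ScalingConfiguration={
        "MinCapacity": 4.0,
        "MaxCapacity": 8.0,
    },
    ApplyImmediately=True,
)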
I was able to build the connection using JDBC.
One very important thing: you should have at least one subnet with all TCP ports open, though you can restrict that rule to the subnet's range.
With that setting, the connection test passes and the crawler can also create tables.

AWS Glue ETL job from AWS Redshift to S3 fails

I am trying out the AWS Glue service to ETL some data from Redshift to S3. The crawler runs successfully and creates the meta table in the data catalog; however, when I run the ETL job (generated by AWS), it fails after around 20 minutes saying "Resource unavailable".
I cannot see any AWS Glue logs or error logs created in CloudWatch. When I try to view them, it says "Log stream not found. The log stream jr_xxxxxxxxxx could not be found. Check if it was correctly created and retry."
I would appreciate it if you could provide any guidance to resolve this issue.
So basically, the job you add to Glue will only run if there isn't too much traffic in the region where your Glue runs. If there are no resources available, you need to either manually re-add the job again, or you can subscribe to events from CloudWatch via SNS.
Also, there are parameters you can pass to the job, like the maximum retries and the timeout.
If you get a "Resource unavailable", it won't trigger a retry because the job did not fail; it just never even started. But if you set the timeout to, let's say, 60 minutes, it will trigger an error after that time, decrement your retry pool, and re-launch the job.
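In boto3 those knobs map to the MaxRetries and Timeout fields of the job definition; a hedged sketch (job name, role and script location are placeholders):

import boto3

glue = boto3.client("glue")

# Placeholder job definition. Timeout is in minutes; once it elapses the
# run is marked as failed, which consumes one retry and re-launches the job.
glue.create_job(
    Name="redshift-to-s3",
    Role="GlueServiceRole",
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/redshift_to_s3.py",
    },
    MaxRetries=2,
    Timeout=60,
)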
The closest thing I see to Glue documentation on this is here:
If you encounter errors in AWS Glue, use the following solutions to help you find the source of the problems and fix them.
Note: The AWS Glue GitHub repository contains additional troubleshooting guidance in AWS Glue Frequently Asked Questions.
Error: Resource Unavailable
If AWS Glue returns a resource unavailable message, you can view error messages or logs to help you learn more about the issue. The following tasks describe general methods for troubleshooting.
• A custom DNS configuration without reverse lookup can cause AWS Glue to fail. Check your DNS configuration. If you are using Amazon Route 53 or Microsoft Active Directory, make sure that there are forward and reverse lookups. For more information, see Setting Up DNS in Your VPC.
• For any connections and development endpoints that you use, check that your cluster has not run out of elastic network interfaces.
I have recently struggled with "Resource Unavailable" thrown by a Glue job.
Also, I was not able to make a direct connection in Glue to RDS; it said "no suitable security group found".
I faced this issue while trying to connect with AWS RDS and Redshift.
The problem was with the Security Group that Redshift was using. You need to place a self-referencing inbound rule in the Security Group.
For those who don't know what a self-referencing inbound rule is, follow these steps:
1) Go to the Security Group you are using (VPC -> Security Group)
2) In the Inbound Rules, select Edit Inbound Rules
3) Add a Rule:
a) Type - All Traffic
b) Protocol - All
c) Port Range - All
d) Source - Custom; in the search box type the first letters of your security group and select the group itself
e) Save it
It's done!
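If you prefer to script it, the same self-referencing rule can be added with boto3; a minimal sketch (the security group ID is a placeholder):

import boto3

ec2 = boto3.client("ec2")

SG_ID = "sg-0123456789abcdef0"  # placeholder: the SG your Redshift/RDS uses

# All traffic, all ports, with the group itself as the source --
# i.e. a self-referencing inbound rule.
ec2.authorize_security_group_ingress(
    GroupId=SG_ID,
    IpPermissions=[{
        "IpProtocol": "-1",
        "UserIdGroupPairs": [{"GroupId": SG_ID}],
    }],
)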
If you were missing this condition in your Security Group inbound rules, try creating the connection again; this time you should be able to create it, and the job should work as well.