AWS AppFlow Salesforce to Redshift Error Creating Connection - amazon-web-services

I want to create a one-way, real-time copy of a Salesforce (SF) object in Redshift. The idea is that when fields are updated in SF, those fields will be updated in Redshift as well. The history of changes is irrelevant in AWS/Redshift; that's all being tracked in SF. I just need a real-time, read-only copy of that particular object to query, preferably without having to query the whole SF object, clear the Redshift table, and pipe the data in.
I thought AWS AppFlow listening for SF Change Data Capture events might be a good setup for this:
When I try to create a flow, I don't have any issues with the SF source connection:
So I click "Connect" in the Destination details section to set up Redshift, fill out this page, and click "Connect" again:
About 5 seconds go by and I receive this error pop-up:
An error occurred while creating the connection
Error while communicating to connector: Failed to validate Connection while attempting "select distinct(table_schema) from information_schema.tables limit 1" with connector failure Can't connect to JDBC database with message: Amazon Error setting/closing connection: SocketTimeoutException. (Service: null; Status Code: 400; Error Code: Client; Request ID: null; Proxy: null)
I know my connection string, username, password, etc. are all good - I'm connected to Redshift from other apps. Any idea what the issue could be? Is this even the right solution for what I'm trying to do?

I solved this by adding the AppFlow IP address ranges for my region to the inbound rules of the security group on my Redshift cluster's VPC.
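For reference, that inbound rule can also be added programmatically. Below is a minimal sketch using the AWS SDK for JavaScript v3; the region, security group ID, and CIDR block are placeholders, and you would substitute the published AppFlow IP ranges for your region and the port your Redshift cluster listens on (5439 by default):

import { EC2Client, AuthorizeSecurityGroupIngressCommand } from '@aws-sdk/client-ec2';

// Region of the Redshift cluster's VPC (placeholder).
const ec2 = new EC2Client({ region: 'us-east-1' });

// 'sg-0123456789abcdef0' and '3.0.0.0/24' are placeholders: use your own
// security group ID and the AppFlow CIDR blocks published for your region.
await ec2.send(new AuthorizeSecurityGroupIngressCommand({
  GroupId: 'sg-0123456789abcdef0',
  IpPermissions: [{
    IpProtocol: 'tcp',
    FromPort: 5439, // default Redshift port
    ToPort: 5439,
    IpRanges: [{ CidrIp: '3.0.0.0/24', Description: 'AppFlow IP range (example)' }]
  }]
}));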

Related

Errors connecting to AWS Keyspaces using a lambda layer

Intermittently getting the following error when connecting to an AWS keyspace using a lambda layer
All host(s) tried for query failed. First host tried, 3.248.244.53:9142: Host considered as DOWN. See innerErrors.
I am trying to query a table in a keyspace using a nodejs lambda function as follows:
import cassandra from 'cassandra-driver';
import fs from 'fs';

export default class AmazonKeyspace {
  tpmsClient = null;

  constructor () {
    // Service-specific credentials generated for Amazon Keyspaces
    let auth = new cassandra.auth.PlainTextAuthProvider('cass-user-at-xxxxxxxxxx', 'zzzzzzzzz');

    // TLS options using the Amazon root certificate bundled in the lambda layer
    let sslOptions1 = {
      ca: [fs.readFileSync('/opt/utils/AmazonRootCA1.pem', 'utf-8')],
      host: 'cassandra.eu-west-1.amazonaws.com',
      rejectUnauthorized: true
    };

    this.tpmsClient = new cassandra.Client({
      contactPoints: ['cassandra.eu-west-1.amazonaws.com'],
      localDataCenter: 'eu-west-1',
      authProvider: auth,
      sslOptions: sslOptions1,
      keyspace: 'tpms',
      protocolOptions: { port: 9142 }
    });
  }

  // Fetch a single organisation row by its key, using a prepared statement
  getOrganisation = async (orgKey) => {
    const SQL = 'select * FROM organisation where organisation_id=?;';
    return new Promise((resolve, reject) => {
      this.tpmsClient.execute(SQL, [orgKey], { prepare: true }, (err, result) => {
        if (!err?.message) resolve(result.rows);
        else reject(err.message);
      });
    });
  };
}
I am basically following this recommended AWS documentation.
https://docs.aws.amazon.com/keyspaces/latest/devguide/using_nodejs_driver.html
It seems that around 10-20% of the time the lambda function (cassandra driver) cannot connect to the endpoint.
I am pretty familiar with Cassandra (I already use a 6 node cluster that I manage) and don't have any issues with that.
Could this be a timeout or do I need more contact points?
I followed the recommended guides and checked the AWS console for errors, but none are shown.
UPDATE:
I am occasionally (roughly 1 call in 50 when I invoke the function in parallel with 5 concurrent calls) getting the error below:
"All host(s) tried for query failed. First host tried,
3.248.244.5:9142: DriverError: Socket was closed at Connection.clearAndInvokePending
(/opt/node_modules/cassandra-driver/lib/connection.js:265:15) at
Connection.close
(/opt/node_modules/cassandra-driver/lib/connection.js:618:8) at
TLSSocket.
(/opt/node_modules/cassandra-driver/lib/connection.js:93:10) at
TLSSocket.emit (node:events:525:35)\n at node:net:313:12\n at
TCP.done (node:_tls_wrap:587:7) { info: 'Cassandra Driver Error',
isSocketError: true, coordinator: '3.248.244.5:9142'}
This exception may be caused by throttling on the Amazon Keyspaces side, resulting in the DriverError that you are seeing sporadically.
I would suggest taking a look at this repo, which should help you put measures in place to either prevent this issue or at least reveal the true cause of the exception.
For some of the errors you see in the logs, you will need to investigate Amazon CloudWatch metrics to see whether you are hitting throttling or system errors. I've built this AWS CloudFormation template to deploy a CloudWatch dashboard with all the appropriate metrics, which will give you better observability for your application.
A system error indicates an event that must be resolved by AWS and is often part of normal operations; activities such as timeouts, server faults, or scaling activity can result in server errors. A user error indicates an event that can usually be resolved by the user, such as an invalid query or exceeding a capacity quota. Amazon Keyspaces passes a system error back as a Cassandra ServerError. In most cases this is a transient error, in which case you can retry your request until it succeeds. With the Cassandra driver's default retry policy, customers can also experience NoHostAvailableException or AllNodesFailedException, or messages like yours, "All host(s) tried for query failed". This is a client-side exception that is thrown once all hosts in the load balancing policy's query plan have attempted the request.
Take a look at this retry policy for Node.js, which should help resolve your "All hosts failed" exception or pass back the original exception.
The retry policies in the Cassandra drivers are pretty crude and are not able to do more sophisticated things like circuit breaker patterns. You may eventually want to use a "fail fast" retry policy for the driver and handle the exceptions in your application code.
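As a rough illustration of that "fail fast" approach (a sketch, not the policy from the linked repo), a custom retry policy for the Node.js cassandra-driver could rethrow on every failure so the original exception surfaces to your application code:

import cassandra from 'cassandra-driver';

// Minimal "fail fast" policy: never retry inside the driver, always rethrow,
// so the application decides whether and how to retry.
class FailFastRetryPolicy extends cassandra.policies.retry.RetryPolicy {
  onReadTimeout() { return this.rethrowResult(); }
  onWriteTimeout() { return this.rethrowResult(); }
  onUnavailable() { return this.rethrowResult(); }
  onRequestError() { return this.rethrowResult(); }
}

// Pass the policy when building the client; authProvider, sslOptions, etc.
// stay the same as in the question and are omitted here for brevity.
const client = new cassandra.Client({
  contactPoints: ['cassandra.eu-west-1.amazonaws.com'],
  localDataCenter: 'eu-west-1',
  keyspace: 'tpms',
  protocolOptions: { port: 9142 },
  policies: { retry: new FailFastRetryPolicy() }
});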

Database Migration Task fails to load the data into the target database

I have created a PostgreSQL (target) RDS instance on AWS, did the schema conversion using SCT, and now I am trying to move data with a Database Migration task from a DB2 database on an EC2 instance (source) to the target DB. The data is not loading and the task gives the following error:
Last Error ODBC general error. Task error notification received from subtask 1, thread 0 [reptask/replicationtask.c:2800] [1022502] Error executing source loop; Stream component failed at subtask 1, component st_1_5D3OUPDVTS3BLNMSQGEXI7ARKY ; Stream component 'st_1_5D3OUPDVTS3BLNMSQGEXI7ARKY' terminated [reptask/replicationtask.c:2807] [1022502] Stop Reason RECOVERABLE_ERROR Error Level RECOVERABLE
I was getting the same error and the issue was related to database user rights for REPLICATION CLIENT and REPLICATION SLAVE as mentioned in AWS Documentation:
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MySQL.html#CHAP_Source.MySQL.Prerequisites
I resolved it by granting the above-mentioned REPLICATION rights using the following statements in MySQL (replacing {dbusername} with the actual database user name used in the DMS endpoint):
GRANT REPLICATION CLIENT ON *.* TO '{dbusername}'@'%';
GRANT REPLICATION SLAVE ON *.* TO '{dbusername}'@'%';

AWS Glue job runs correct but returns a connection refused error

I am running a test job on AWS. I am reading CSV data from an S3 bucket, running a Glue ETL job on it, and storing the same data in Amazon Redshift. The Glue job just reads the data from S3 and stores it in Redshift without any modification. The job runs fine and I get the desired result in Redshift, but it returns an error which I am unable to understand.
Here is the error log:
18/11/14 09:17:31 WARN YarnClient: The GET request failed for the URL http://169.254.76.1:8088/ws/v1/cluster/apps/application_1542186720539_0001
com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.conn.HttpHostConnectException: Connect to 169.254.76.1:8088 [/169.254.76.1] failed: Connection refused (Connection refused)
It is a WARN rather than an error, but I want to understand what is causing it. I tried to search for the IP indicated in the WARN, but I am not able to find a machine with that IP.
I noticed these errors coming up in my AWS Glue job, so I found something from AWS that could be helpful:
This WARN message is not so special, and does not mean job failure or any errors directly. I guess there should be other cause.
I would recommend you to enable continuous logging, and check both driver/executor logs to see if there are any suspicious behavior.
If you enable job bookmark, please try disabling it and see how it goes without bookmark.
https://forums.aws.amazon.com/thread.jspa?messageID=927547
I had disabled bookmarks from the beginning. What I found was that my Glue job, while writing data to S3, got a memory exception, so what I did was repartition the data.
MyDynamicFrame.coalesce(100).write.partitionBy("month").mode("overwrite").parquet("s3://"+bucket+"/"+path+"/out_data")
So if you have write operations, I'd recommend checking how you are writing to S3.
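For reference, the two suggestions in the quoted reply (continuous logging and disabling the job bookmark) can be passed as job arguments when you start the job. A minimal sketch, assuming the AWS SDK for JavaScript v3 Glue client and a placeholder job name:

import { GlueClient, StartJobRunCommand } from '@aws-sdk/client-glue';

const glue = new GlueClient({ region: 'us-east-1' }); // your Glue job's region (placeholder)

// 'my-etl-job' is a placeholder. The arguments enable continuous CloudWatch
// logging and explicitly disable the job bookmark for this run.
await glue.send(new StartJobRunCommand({
  JobName: 'my-etl-job',
  Arguments: {
    '--enable-continuous-cloudwatch-log': 'true',
    '--job-bookmark-option': 'job-bookmark-disable'
  }
}));

The same arguments can also be set as default job parameters on the job itself instead of per run.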

How to reconnect if AWS RDS recovery happens

How I have written the code:
createPool is used at the start of the app,
then for every request I am using getConnection.
I am using AWS RDS and it went into sudden recovery mode; my DB URL was unchanged, but the instance IP must have changed because it was recreated in another AZ.
So in such a scenario I am supposed to reinitialize my DB connection so that the new instance DNS is picked up.
The issue is that in this scenario I did not receive any timeout or connection error. So how do I capture this type of failure?
Kindly guide if possible.
Thanks
It is unclear from your description what exactly you have built, but it sounds like you've created a connection pool.
If you open a connection to the DB through the pool, then when you call getConnection you should validate that the connection is still active. Obviously, if the DB fails over, the existing connection will get closed, and you will either need to create a new connection or re-open the existing one.
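As an illustration only (the question doesn't say which client library is in use; this sketch assumes a Node.js app with the mysql2-style pool API), you could validate a pooled connection with a ping before handing it out, and rebuild the pool when the ping fails:

import mysql from 'mysql2';

// Placeholder connection settings - substitute your own RDS endpoint and credentials.
const dbConfig = { host: 'mydb.xxxxxx.rds.amazonaws.com', user: 'app', password: 'secret', database: 'appdb' };

let pool = mysql.createPool(dbConfig);

// Validate the connection with a lightweight ping before using it. If the ping
// fails (e.g. the old sockets died during an RDS failover), rebuild the pool so
// new connections re-resolve the endpoint's DNS, then retry once.
export function getValidatedConnection(callback) {
  pool.getConnection((err, conn) => {
    if (err) return callback(err);
    conn.ping((pingErr) => {
      if (!pingErr) return callback(null, conn); // healthy connection
      conn.destroy();                            // drop the dead connection
      pool.end(() => {});                        // discard the stale pool
      pool = mysql.createPool(dbConfig);         // recreate against current DNS
      pool.getConnection(callback);
    });
  });
}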

Azure Event Hub receiver giving "Encountered error while fetching the list of EventHub PartitionIds" error

I am trying to implement the receiver part as per the tutorial
https://azure.microsoft.com/en-us/documentation/articles/event-hubs-java-ephjava-getstarted/
Failure while registering: com.microsoft.azure.eventprocessorhost.EPHConfigurationException:
Encountered error while fetching the list of EventHub PartitionIds
I have 16 partitions in my event hub, but when I send the data I don't specify any partition. How do I know which partition my data is sent to? Am I getting the above error because of all the partitions?
Make sure that your consumer SAS policies include "Manage" and not only "Listen".
I guess it has to have Manage rights to be able to list the partitions.
Ideally, EPH should work with "Listen"-only claims. Right now we have a bug in the EventProcessorHost client code, as a result of which it needs "Manage" claims. We are working on it.
The error "Encountered error while fetching the list of EventHub PartitionIds" is generic and Thrown at PartitionManager, while querying for Partitions. You ran into one of the exceptions in the below catch block. Please indicate inner-exception for completeness (SEO) & faster resolution.
catch (XPathExpressionException | ParserConfigurationException | IOException | InvalidKeyException
        | NoSuchAlgorithmException | URISyntaxException | SAXException exception)
{
    throw new EPHConfigurationException(
            "Encountered error while fetching the list of EventHub PartitionIds", exception);
}
EDIT:
this issue is fixed in version 0.7.7.
I was working on integrating Event Hubs with the ELK stack and came across the same error.
To solve it, I found that in the settings for the Event Hubs namespace my event hub was in, I had to allow access to the VNet my ELK stack was in.
This can be done by going to the following page: Your Event Hubs Namespace > Settings > Firewalls and virtual networks.
Either allow access from all networks or add your specific VNet, subnet, or IP range (for specific machines).
This, in addition to setting the consumer SAS policies to Manage, should resolve the "Encountered error while fetching the list of EventHub PartitionIds" error.