I am trying to create an MSK connector using kafka-connect-aws-s3-kafka-2-8-4.0.0.zip, but it gets stuck in **Creating...** status and then fails with an unknown error.
So a bit about my connector config:
I am trying to deploy this connector to sink data from MSK to S3 (kafka-connect-aws).
Here are the connector properties:
connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
behavior.on.null.values=ignore
s3.region=eu-central-1
flush.size=5
tasks.max=8
topics=MyTopic
s3.part.size=5242880
connect.s3.vhost.bucket=true
schema.enable=false
format.bytearray.separator=\n--==SEPARATOR==--\n
key.converter.schemas.enable=false
connect.s3.kcql=INSERT INTO kafka-connect-aws2:evelin-msk SELECT * FROM MyTopic WITH_FLUSH_INTERVAL = 300
format.class=io.confluent.connect.s3.format.bytearray.ByteArrayFormat
partitioner.class=io.confluent.connect.storage.partitioner.DefaultPartitioner
value.converter.schemas.enable=false
connect.s3.aws.region=eu-central-1
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
storage.class=io.confluent.connect.s3.storage.S3Storage
errors.log.enable=true
s3.bucket.name=kafka-connect-aws
Any idea what could be the reason? Please let me know if I need to provide more information.
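Edit: one thing I noticed while re-reading the config: format.class, partitioner.class, storage.class, flush.size, s3.part.size, s3.region and s3.bucket.name are properties of the Confluent S3 sink, while the connector class above is the Lenses one, which is configured through connect.s3.* and KCQL. For reference, a version that stays within the Lenses namespace would look roughly like this (untested sketch; the STOREAS clause is my guess at the byte-array equivalent, so please check it against the 4.0.0 docs):
connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
tasks.max=8
topics=MyTopic
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter.schemas.enable=false
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter.schemas.enable=false
errors.log.enable=true
connect.s3.aws.region=eu-central-1
connect.s3.vhost.bucket=true
connect.s3.kcql=INSERT INTO kafka-connect-aws2:evelin-msk SELECT * FROM MyTopic STOREAS `Bytes` WITH_FLUSH_INTERVAL = 300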
I am trying to set up cloud replication for my master/slave DB. The master resides in an external VPC, and I want to set up a slave in Google Cloud SQL. I followed the steps here to set up the databases.
They are set up fine, and I can see the initial replication taking place from my master; the data is synchronized. However, shortly afterwards the replica becomes disabled for replication. I cannot seem to start replication again, and each attempt gives the following error:
The instance or operation is not in an appropriate state to handle the request.
I checked the suggestions here, but that didn't work.
Running gcloud sql instances describe replica-instance1 gives me the following (excerpt):
state: RUNNABLE
replicaConfiguration:
  failoverTarget: false
  kind: sql#replicaConfiguration
I can post more of the output if needed, but it all looks fine. Can anyone help?
Edit:
This is in the PostgreSQL logs:
resource: {
  labels: {3}
  type: "cloudsql_database"
}
severity: "ERROR"
textPayload: "2023-01-20 22:10:36.354 UTC [282]: [2-1] db=postgres,user=[unknown] ERROR: data stream ended"
timestamp: "2023-01-20T22:10:36.354863Z"
}
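For anyone digging for the same error later: the entry above is from the Cloud SQL PostgreSQL logs in Cloud Logging, and a filter along these lines should pull the same entries (a sketch; adjust the filter and limit to your instance):
gcloud logging read 'resource.type="cloudsql_database" AND severity=ERROR' --limit=10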
I want to add a connection to my Glue job using Lambda, so I did this:
import boto3

glue = boto3.client("glue")  # assumes credentials/region are configured
response = glue.create_job(
    Name="redshift",
    Role="XXXXXXX",
    Command={'Name': 'glueetl', 'PythonVersion': '3', 'ScriptLocation': path_redshift_job},
    ExecutionProperty={'MaxConcurrentRuns': 1},
    Connections={'Connections': ['RedshiftClusterConnection']},
)
But as you can see in the image, no connection is added to the job. What can I do to solve this issue?
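One quick check is to read the job back right after creating it, to see whether the create call itself dropped the connection or whether it is only missing in the console (a minimal sketch using the same boto3 client):
job = glue.get_job(JobName="redshift")["Job"]
print(job.get("Connections"))  # expected: {'Connections': ['RedshiftClusterConnection']}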
I am trying to connect to Athena using SQL Workbench. I followed all the instructions on pages 15 to 19 of this PDF file:
https://s3.amazonaws.com/athena-downloads/drivers/JDBC/SimbaAthenaJDBC_2.0.7/docs/Simba+Athena+JDBC+Driver+Install+and+Configuration+Guide.pdf
If I use the default Athena bucket name, I get this error:
S3://aws-athena-query-results-51346970XXXX-us-east-1/Unsaved
[Simba]AthenaJDBC An error has been thrown from the AWS SDK
client. Unable to execute HTTP request: No such host is known
(athena.useast-1.amazonaws.com) [Execution ID not available]
For any other bucket name I get this error:
s3://todel162/testfolder-1
[Simba]AthenaJDBC An error has been thrown from the AWS SDK
client. Unable to execute HTTP request: athena.useast-1.amazonaws.com
[Execution ID not available]
How do I connect to Athena using JDBC client?
Copy-pasting the connection string on page 16 introduced an issue:
jdbc:awsathena://AwsRegion=useast-1;
It should have a hyphen, like this:
jdbc:awsathena://AwsRegion=us-east-1;
Once I corrected this, I was able to connect.
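For completeness, the corrected URL used from plain JDBC looks roughly like this (a sketch: the bucket name and credentials are placeholders, and the driver class name is the one documented for Simba 2.0.x):

import java.sql.Connection;
import java.sql.DriverManager;

Class.forName("com.simba.athena.jdbc.Driver");
Connection conn = DriverManager.getConnection(
        "jdbc:awsathena://AwsRegion=us-east-1;S3OutputLocation=s3://my-query-results/;",
        "ACCESS_KEY_ID", "SECRET_ACCESS_KEY");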
I am trying to execute the following flow:
1. a user hits AWS API Gateway (REST),
2. which triggers AWS Lambda,
3. which uses TinkerPop/Gremlin to connect to
4. TitanDB on EC2, which uses
5. AWS DynamoDB in the cloud (not on EC2) as its backend.
Right now I have managed to create a fully working TitanDB instance on EC2 that stores its data in DynamoDB.
I am also able to connect from AWS Lambda to EC2 through TinkerPop/Gremlin, but only this way:
Cluster.build()
       .addContactPoint("10.x.x.x") // IP of the EC2 instance
       .create()
       .connect()
       .submit("here I type my query as string and it will work");
And this works; however, I would strongly prefer to use a "criteria API" (GremlinPipeline) instead of the plain Gremlin language.
In other words, I need an ORM or something like that.
I know that TinkerPop includes one.
I have realized that what I need is an object of class Graph.
This is what I have tried:
Graph graph = TitanFactory.build()
        .set("storage.hostname", "10.x.x.x")
        .set("storage.backend", "com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager")
        .set("storage.dynamodb.client.credentials.class-name", "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
        .set("storage.dynamodb.client.credentials.constructor-args", "")
        .set("storage.dynamodb.client.endpoint", "https://dynamodb.ap-southeast-2.amazonaws.com")
        .open();
However, it throws "Could not find implementation class: com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager".
Of course, the computer is correct: IntelliJ IDEA cannot find it either.
My dependencies:
//
// aws
compile 'com.amazonaws:aws-lambda-java-core:+'
compile 'com.amazonaws:aws-lambda-java-events:+'
compile 'com.amazonaws:aws-lambda-java-log4j:+'
compile 'com.amazonaws:aws-java-sdk-dynamodb:1.10.5.1'
compile 'com.amazonaws:aws-java-sdk-ec2:+'
//
// database
// titan 1.0.0 is compatible with gremlin 3.0.2-incubating, but not yet with 3.2.0
compile 'com.thinkaurelius.titan:titan-core:1.0.0'
compile 'org.apache.tinkerpop:gremlin-core:3.0.2-incubating'
compile 'org.apache.tinkerpop:gremlin-driver:3.0.2-incubating'
My goal: a fully working Graph object.
My problem: I don't have the DynamoDBStoreManager class, and I do not know which dependency I have to add.
My additional question: why does connecting through the Cluster class require only an IP address and work, while TitanFactory requires properties like the ones I used for gremlin-server on EC2?
I do not want to create a second server; I just want to connect to the existing one as a client and obtain a Graph object.
EDIT:
After adding the resolver it builds, but in the output I get many repetitions of:
13689 [TitanID(0)(4)[0]] WARN com.thinkaurelius.titan.diskstorage.idmanagement.ConsistentKeyIDAuthority - Temporary storage exception while acquiring id block - retrying in PT2.4S: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Wrote claim for id block [1, 51) in PT0.342S => too slow, threshold is: PT0.3S
and execution hangs on the open() method, so I cannot execute any queries.
For the DynamoDBStoreManager class, you would need this dependency:
compile 'com.amazonaws:dynamodb-titan100-storage-backend:1.0.0'
Then for the DynamoDBLocal issue, try adding this resolver:
resolvers += "AWS DynamoDB Local Release Repository" at "http://dynamodb-local.s3-website-us-west-2.amazonaws.com/release"
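On the open() hang with the repeated id-block warnings from your edit: the PT0.3S threshold in that message is Titan's ids.authority.wait-time setting, so raising it may help when the DynamoDB round trips are slow. This is a guess on my side, and the exact value format should be checked against your Titan version's configuration reference:
.set("ids.authority.wait-time", "1000 ms") // default is 300 ms, per the warning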
I'm not entirely clear on what this means -- "Criteria API" instead of plain Gremlin language. I'm guessing that you mean that you want to interact with the graph using Java rather than passing Gremlin as a string over to a running Titan/Gremlin Server? If this is the case, then you don't need to start a Titan/Gremlin Server at all (step 4 above). Write an AWS Lambda program (step 2-3 above) that creates a direct Titan client connection via TitanFactory, where all of the Titan configuration properties are for your DynamoDB instance (step 5 above).
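A rough sketch of that direct connection, trimmed to the properties that matter for the DynamoDB backend (the values are the ones from your question, and the traversal at the end is only an illustration):

TitanGraph graph = TitanFactory.build()
        .set("storage.backend", "com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager")
        .set("storage.dynamodb.client.endpoint", "https://dynamodb.ap-southeast-2.amazonaws.com")
        .open();
GraphTraversalSource g = graph.traversal();
List<Vertex> result = g.V().has("name", "some-name").toList(); // plain Java instead of a Gremlin string
graph.close();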
I am trying to automate the process of creating an application version for an existing Elastic Beanstalk application through the Java API and command-line arguments.
While calling createApplicationVersion() on AWSElasticBeanstalkClient, I am getting errors for the code snippet below.
Note: I am setting the endpoint for AWSElasticBeanstalkClient to US East-1 (N. Virginia) or to the environment URL of the existing application.
ArrayList<String> s3SourceBundleList = AmazonS3BucketUploadApp.doBucketUploadFromLocal(sourceLocation);
String bucketName = s3SourceBundleList.get(0);
String keyName = java.net.URLEncoder.encode(s3SourceBundleList.get(1), "UTF-8");
//String keyName = s3SourceBundleList.get(1);
S3Location s3SourceBundle = new S3Location();
s3SourceBundle.setS3Bucket(bucketName);
s3SourceBundle.setS3Key(keyName);
createApplicationVersionRequest.setSourceBundle(s3SourceBundle);
createApplicationVersionRequest.setDescription("New version");
appVersionResultObject = awsBeanstalkclient.createApplicationVersion(createApplicationVersionRequest);
Error:
com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[6,1]
and one more error is:
AWS service: AmazonElasticBeanstalk AWS Request ID: null AWS service unavailable.
Please suggest a solution for this.
How are you initializing the client? (Check the log output; enabling the org.apache.http.wire logger at TRACE level could help.)
If you want an idea, peek at this source:
https://github.com/jenkinsci/awseb-deployment-plugin/blob/master/src/main/java/br/com/ingenieux/jenkins/plugins/awsebdeployment/Deployer.java
It contains all you need to build and deploy into AWS EB :)
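Also, about that first error: "Unable to unmarshall response (ParseError ...)" usually means the SDK received something other than the expected XML, which is exactly what happens if the client endpoint is set to the environment URL rather than the regional service endpoint. A minimal sketch of the client setup with the v1 SDK (application name and version label are placeholders):

import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.services.elasticbeanstalk.AWSElasticBeanstalkClient;
import com.amazonaws.services.elasticbeanstalk.model.*;

AWSElasticBeanstalkClient eb = new AWSElasticBeanstalkClient(new DefaultAWSCredentialsProviderChain());
eb.setEndpoint("elasticbeanstalk.us-east-1.amazonaws.com"); // regional service endpoint, not the environment URL

CreateApplicationVersionRequest req = new CreateApplicationVersionRequest()
        .withApplicationName("my-existing-app")              // placeholder
        .withVersionLabel("v-" + System.currentTimeMillis()) // placeholder
        .withSourceBundle(new S3Location().withS3Bucket(bucketName).withS3Key(keyName))
        .withDescription("New version");
CreateApplicationVersionResult result = eb.createApplicationVersion(req);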