JanusGraph is deployed and running in a GCP container, and I can access it using Cloud Shell.
I want to perform some CRUD operations from a Python runtime.
What connection URL and port do I have to specify to get a proper result?
Docs used to create the GCP environment - https://cloud.google.com/architecture/running-janusgraph-with-bigtable#overview
Docs used to connect gremlin to Python - https://tinkerpop.apache.org/docs/current/reference/#connecting-via-drivers
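For context, this is roughly what I'm trying from Cloud Shell with gremlin-python, following the TinkerPop driver docs above (a minimal sketch; "janusgraph-host" is a placeholder for the address of the machine running JanusGraph Server, and 8182 is the default Gremlin Server WebSocket port):

# Minimal gremlin-python CRUD sketch. "janusgraph-host" is a placeholder for the
# JanusGraph Server address; 8182 is the default Gremlin Server WebSocket port.
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

connection = DriverRemoteConnection('ws://janusgraph-host:8182/gremlin', 'g')
g = traversal().withRemote(connection)

# Create, read, update, then delete a vertex.
v = g.addV('person').property('name', 'alice').next()
print(g.V(v).valueMap().toList())
g.V(v).property('name', 'bob').iterate()
g.V(v).drop().iterate()

connection.close()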
But I'm unable to hit the server. Has anyone out there tried to establish this type of connection?
Related
I am doing a POC on Google Cloud Dataproc with HBase as one of the components.
I created a cluster and was able to get it running along with the HBase service. I can list and create tables via the shell.
I want to use Apache Phoenix as the client to query HBase. I installed it on the cluster by referring to this link.
The installation went fine, but when I execute sqlline.py localhost, which should create the meta table in HBase, it fails with a "Region in Transition" error.
Does anyone know how to resolve this, or is there a limitation that prevents Apache Phoenix from being used with Dataproc?
There is no limitation on Dataproc that prevents using Apache Phoenix. You might want to dig deeper into the error message; it might be a configuration issue.
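Once the region-in-transition problem is sorted out, here is a rough idea of querying Phoenix from Python, assuming you run the Phoenix Query Server on the cluster and use the python-phoenixdb package (8765 is the query server's default port, and the table name is purely illustrative):

# Hedged sketch: python-phoenixdb against the Phoenix Query Server (default
# port 8765); the table US_POPULATION is only an example.
import phoenixdb

conn = phoenixdb.connect('http://localhost:8765/', autocommit=True)
cursor = conn.cursor()
cursor.execute('SELECT * FROM US_POPULATION LIMIT 10')
for row in cursor.fetchall():
    print(row)
conn.close()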
I am currently setting up an AWS Data Pipeline using the RDStoRedshift template. During the first RDStoS3Copy activity I receive the following error:
"[ERROR] (TaskRunnerService-resource:df-04186821HX5MK8S5WVBU_#Ec2Instance_2021-02-09T18:09:17-0) df-04186821HX5MK8S5WVBU amazonaws.datapipeline.database.ConnectionFactory: Unable to establish connection to jdbc://mysql:/myhostname:3306/mydb No suitable driver found for jdbc://mysql:/myhostname:3306/mydb"
I'm relatively new to AWS services, but it seems that the copy activity spins up an EC2 instance to do the copy. The error clearly states there isn't a driver available. Do I need to stand up an EC2 instance for AWS Data Pipeline to use and install the driver there?
Typically, when you are coding a solution that interacts with a MySQL RDS instance, especially a Java solution such as a Lambda function written against the Java runtime API or a cloud-based web app (e.g., a Spring Boot web app), you specify the driver via a POM/Gradle dependency.
For this use case, there seems to be information here about a Driver file: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-jdbcdatabase.html
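Based on that page, here is a hedged sketch of what the database object could look like when pushed with boto3, with jdbcDriverJarUri pointing at a MySQL Connector/J jar you upload to S3 so the task runner on the EC2 instance can find a driver (the jar path, credentials and region are placeholders, and the connection string is shown in the standard jdbc:mysql:// form rather than the malformed one in the error):

# Sketch only: define a JdbcDatabase object for the pipeline via boto3.
# The S3 jar path, credentials and region are placeholders.
import boto3

client = boto3.client('datapipeline', region_name='us-east-1')

rds_database = {
    'id': 'rds_mysql',
    'name': 'rds_mysql',
    'fields': [
        {'key': 'type', 'stringValue': 'JdbcDatabase'},
        # Standard MySQL JDBC URL format: jdbc:mysql://host:3306/db
        {'key': 'connectionString', 'stringValue': 'jdbc:mysql://myhostname:3306/mydb'},
        {'key': 'jdbcDriverClass', 'stringValue': 'com.mysql.jdbc.Driver'},
        {'key': 'jdbcDriverJarUri', 'stringValue': 's3://my-bucket/drivers/mysql-connector-java-5.1.49.jar'},
        {'key': 'username', 'stringValue': 'mydbuser'},
        {'key': '*password', 'stringValue': 'mydbpassword'},
    ],
}

client.put_pipeline_definition(
    pipelineId='df-04186821HX5MK8S5WVBU',
    pipelineObjects=[rds_database],  # plus the rest of the template's objects
)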
How can we run the Neptune graph database on Docker?
Since Neptune DB was productized only recently, it is not available on LocalStack. Can someone guide me on how to deploy the AWS Neptune DB service in a Docker container?
You don't deploy Neptune; you deploy a client application that uses an appropriate client library to access Neptune. The Neptune software/hardware is managed by AWS, and you can't access it except via its API.
My guess is that you're attempting to create a local Neptune-compatible Docker container (i.e., a Docker container with a compatible API). This would be similar to using MinIO when performing local integration testing with S3. If this is indeed what you're after, I'd recommend using TinkerPop's gremlin-server image. It should get the job done for you, since Neptune uses Gremlin as its query language.
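As a rough illustration, your client code can then be pointed at either the local container (for example one started from the tinkerpop/gremlin-server image on the default port 8182) or the real Neptune endpoint later; the endpoint values below are placeholders:

# Sketch: switch between a local gremlin-server container and a Neptune
# cluster endpoint via an environment variable (both values are placeholders).
import os
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

# Local integration tests: ws://localhost:8182/gremlin
# Real Neptune cluster:    wss://<your-neptune-endpoint>:8182/gremlin
endpoint = os.environ.get('GREMLIN_ENDPOINT', 'ws://localhost:8182/gremlin')

conn = DriverRemoteConnection(endpoint, 'g')
g = traversal().withRemote(conn)
print(g.V().limit(5).valueMap().toList())
conn.close()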
For now, I have found only one way: the Pro version of LocalStack, which includes Neptune DB. https://localstack.cloud/features/ Unfortunately, the free version does not support the DB interface. =(
Neptune is a fully managed graph database, not a binary that can be independently deployed in your own containers or infrastructure. You can run your client application in your own custom Docker containers and set up your network so that the container makes requests to the managed Neptune cluster you have created.
Hope this helps.
I am deploying a Django app to the Google App Engine flexible environment using the following command:
gcloud app deploy
But I get this error:
Updating service [default] (this may take several minutes)...failed.
ERROR: (gcloud.app.deploy) Error Response: [13] Invalid Cloud SQL name: gcloud beta sql instances describe
What could be the problem?
In your app.yaml file (as well as in mysite/settings.py), you have to provide the instance connection name of your Cloud SQL instance. It is in the format:
[PROJECT_NAME]:[REGION_NAME]:[INSTANCE_NAME].
You can get this instance connection name by running the gcloud command gcloud sql instances describe [YOUR_INSTANCE_NAME] and copying the value shown for connectionName. In your case, it seems you have copied the command itself instead of the connectionName value.
Alternatively, you can get the instance connection name by going to your Developer Console > SQL and clicking on your instance. You'll find the instance connection name under the "Connect to this instance" section.
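For reference, a sketch of how that connection name is typically wired into mysite/settings.py on the flexible environment (database name, user and password are placeholders; the same connection name also goes into app.yaml under beta_settings / cloud_sql_instances):

# settings.py sketch: on the flexible environment the app reaches Cloud SQL
# through the /cloudsql/<INSTANCE_CONNECTION_NAME> socket. Values are placeholders.
import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',  # or django.db.backends.mysql
        'NAME': 'mydatabase',
        'USER': 'mydatabaseuser',
        'PASSWORD': os.environ.get('DB_PASSWORD', ''),
        'HOST': '/cloudsql/[PROJECT_NAME]:[REGION_NAME]:[INSTANCE_NAME]',
    }
}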
LundinCast's post contains the most important information to fix the issue. Also take into account that the Cloud SQL Proxy provides secure access to your Cloud SQL Second Generation instances (as described here). If you have already created an instance, use this command to run the proxy, as suggested in this Django on App Engine flexible guide:
./cloud_sql_proxy -instances="[YOUR_INSTANCE_CONNECTION_NAME]"=tcp:5432
This command establishes a connection from your local computer to your Cloud SQL instance for local testing. The proxy must be running while you test, but it is not required when deploying.
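While the proxy from the command above is running, the local settings would typically point at it over TCP instead of the /cloudsql socket (credentials are placeholders):

# Local-testing sketch: connect through the Cloud SQL Proxy listening on
# tcp:5432, matching the command above. Values are placeholders.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydatabase',
        'USER': 'mydatabaseuser',
        'PASSWORD': 'mydatabasepassword',
        'HOST': '127.0.0.1',
        'PORT': '5432',
    }
}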
I have Spark running on EMR and I have been trying to connect to Spark SQL from SQLWorkbench using the JDBC Hive drivers, but in vain. I have started the Thrift server on EMR, and I'm able to connect to Hive on port 10000 (the default) from Tableau/SQL Workbench. When I run a query, it fires a Tez/Hive job. However, I want to run the query using Spark. Within the EMR box, I'm able to connect to Spark SQL using beeline and run a query as a Spark job. The resource manager shows that the beeline query runs as a Spark job, while the query submitted through SQLWorkbench runs as a Hive/Tez job.
When I checked the logs, I found that the Thrift server for Spark was running on port 10001 (the default).
When I fire up beeline, entries show up for the connection and the SQL I'm running. However, when the same connection parameters are used to connect from SQLWorkbench/Tableau, I get an exception without much detail; it just says the connection ended.
I tried running on a custom port by passing the parameters; beeline works, but the JDBC connection does not.
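For reference, this is roughly the connection I'm trying to reproduce from outside the cluster, expressed as a small PyHive sketch (the hostname and table are placeholders; 10001 is the port where the logs show the Spark Thrift server listening, and it speaks the same HiveServer2 protocol the JDBC Hive driver uses):

# Sketch: connect to the Spark Thrift server on the EMR master via PyHive,
# assuming port 10001 is reachable (security group or SSH tunnel). The hostname,
# username and table name are placeholders.
from pyhive import hive

conn = hive.Connection(host='emr-master-hostname', port=10001, username='hadoop')
cursor = conn.cursor()
cursor.execute('SELECT count(*) FROM my_table')  # should show up as a Spark job
print(cursor.fetchall())
conn.close()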
Any help to resolve this issue?
I was able to resolve the issue and connect to Spark SQL from Tableau. The reason I could not connect was that we were bringing up the Thrift service as root. I'm not sure why it matters, but I had to change the permissions on the log folder to the current user (not root) and bring the Thrift service back up, which enabled me to connect without any issues.