PrestoDB EMR Server refused connection - amazon-web-services

I have set up an EMR cluster in AWS with PrestoDB installed on it. Earlier I was able to query with PrestoDB, but somehow after a restart it stopped working and started giving the following error:
"Error running command: Server refused connection: http://ip-*---.us-west-2.compute.internal:8889/v1/statement"
I have looked into all the configuration files and nothing seems to be wrong. I have also cross-checked the Hive configuration files but had no success.
Could anyone who has encountered a similar issue help me?

Yes, you will have to restart Presto on all machines.
Adding to that, I would also suggest giving open source Presto on EMR a shot using presto-admin. It has a lot of functions that will help you avoid such issues.
Updating and maintaining the cluster is easy using presto-admin.

I know this is an old question, but I've run into this as well.
The likely reason is that you only restarted the Presto server on the coordinating node. You have to ssh into each core node and restart the Presto server there as well.
If this is a persistent Presto cluster, you would probably benefit from installing presto-admin. It's kind of a pain to set up the first time, but it makes this stuff much easier once it's in place.
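If you want to script that restart, a rough sketch along these lines can help: it lists the cluster's master and core nodes with boto3 and restarts Presto on each over SSH. The cluster ID, key path, SSH user and service name are placeholders, and older EMR releases manage Presto with upstart rather than systemd, so adjust the restart command to your release.
# Hypothetical helper: restart Presto on every node of an EMR cluster.
import os
import subprocess
import boto3

CLUSTER_ID = "j-XXXXXXXXXXXXX"                          # your EMR cluster ID (placeholder)
SSH_KEY = os.path.expanduser("~/.ssh/my-emr-key.pem")   # key pair used for the cluster (placeholder)

emr = boto3.client("emr", region_name="us-west-2")
nodes = emr.list_instances(
    ClusterId=CLUSTER_ID,
    InstanceGroupTypes=["MASTER", "CORE"],
)["Instances"]

for node in nodes:
    ip = node["PrivateIpAddress"]
    print(f"Restarting Presto on {ip}")
    # Newer EMR releases run Presto under systemd; older ones use upstart
    # ("sudo stop presto-server && sudo start presto-server").
    subprocess.run(
        ["ssh", "-i", SSH_KEY, f"hadoop@{ip}", "sudo systemctl restart presto-server"],
        check=True,
    )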

Related

lyft/Cartography on EC2, is it possible?

I've been trying to run Cartography on my EC2 account for the last 2 days. I have no previous knowledge of Neo4j, but following their installation process doesn't work.
First I tried to install Neo4j using the RPM instructions from the Neo4j website, with no success accessing Neo4j on port 7474. Error: Connection refused.
Then I gave up trying to make Neo4j work on an EC2 installation and used their Marketplace AMI. It works like a charm, but I don't know what is being installed on that AMI. So I decided to install and run Cartography on this instance.
My first problem was installing Python, pip and Java correctly. After getting everything working, I discovered the Neo4j Bolt port used my public IP, not my localhost. After that I was finally able to execute Cartography, but now it's giving me the following error:
neobolt.exceptions.ClientError: Supplied bookmark [FB:kcwQ40omSYgvSzKPpCQTXDOcCBSQ] does not conform to pattern neo4j:bookmark:v1:tx
Has anyone really been able to use this? Every step along the way requires some specific libraries.
Thanks!
I maintain cartography and hope I can help (wish I saw this earlier though, haha).
A few things to check:
Are you using Neo4j 4.x? cartography currently only supports 3.5.x.
To run for one AWS account:
AWS_PROFILE=profilename cartography --neo4j-uri <uri for your neo4j instance; usually bolt://localhost:7687>
To run multiple accounts, set up an AWS config file and run:
AWS_CONFIG_FILE=/path/to/your/aws/config cartography --neo4j-uri <uri for your neo4j instance; usually bolt://localhost:7687> --aws-sync-all-profiles
(see https://github.com/lyft/cartography/blob/master/docs/setup/install.md#cartography-installation)
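As a quick sanity check before a multi-account run, you can confirm which profiles boto3 sees in that config file (roughly the set --aws-sync-all-profiles will iterate over); the path below is a placeholder:
# Hypothetical check: list the AWS profiles visible through a given config file.
import os
import boto3

os.environ["AWS_CONFIG_FILE"] = "/path/to/your/aws/config"  # same file you pass to cartography
print(boto3.session.Session().available_profiles)           # profiles visible to boto3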
If you have more questions, feel free to open a GitHub issue or start a thread on our Slack (we can also talk about more specialized setups there, like if you're using containers or anything like that).

AWS Glue Development Endpoint Not Working properly

I am trying to use a development endpoint to interactively run and edit ETL scripts, but there seem to be some issues with the development endpoint just after creating it: I am getting errors in the Scala/Python REPL and am also unable to set up an SSH tunnel to the remote interpreter.
Let me explain what I did exactly. I created a development endpoint in the AWS console with all the default configurations. While creating the development endpoint I only provided three things: the development endpoint name, the IAM role and my public SSH key. This is how it looks after creation.
Then, right after creating the endpoint, I connect to the Spark/Python REPL. I am able to connect successfully, but within a couple of minutes of connecting the REPL starts throwing errors without my having written a single line of code. This is happening in all the REPLs present in the development endpoint.
Also, when I try to SSH tunnel to the remote interpreter to connect my local Zeppelin notebook, it throws "bind: Cannot assign requested address".
A couple of things are working, though:
I am able to SSH to the endpoint.
I created a SageMaker notebook in AWS Glue attached to this development endpoint, and that notebook seems to be working fine, although it surely adds additional cost and I don't want to continue using it.
Can anyone please help with what I am doing wrong? Am I missing any important steps that need to be done on the machine right after creating the development endpoint?
Thanks in Advance!
I'm not very sure about this error, but if you are working with smaller datasets then you would probably like to use the Docker implementation, as it will not add any additional cost and you can carry on with your development.
You can refer to this blog on how to set it up:
https://towardsdatascience.com/develop-glue-jobs-locally-using-docker-containers-bffc9d95bd1
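Once the container from that post is running, a minimal smoke test along the lines below (run inside the container; the image and setup come from the blog, so treat the details as assumptions) confirms that the Glue libraries and Spark are wired up:
# Minimal local smoke test for an aws-glue-libs container (setup per the blog above).
from pyspark.context import SparkContext
from awsglue.context import GlueContext

sc = SparkContext.getOrCreate()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
print(spark.range(5).collect())   # should print rows 0..4 if the local setup works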

Installing Sitecore9.2 in AWS

Can anyone provide me an answer to the query below?
I want to install Sitecore 9.2 on AWS. Does the installation process require SQL VMs?
Or can someone point me to the right article on this?
Thanks in advance.
From my own experience, Sitecore XM can use AWS RDS for the database; whether that is a good idea you must decide for yourself. As for the installation: Sitecore 9 uses contained databases, which may break the SIF installation. You can turn contained database authentication on in AWS RDS, or use a normal database user account, but you need a workaround, such as first installing on SQL Server and then migrating to RDS, installing manually without SIF, or adjusting SIF.
For more information see:
https://jeroen-de-groot.com/2018/07/19/deploying-sitecore-9-in-aws-rds/
https://sitecore.stackexchange.com/questions/11047/sitecore-9-installation-using-sql-active-directory-user/11063
https://sitecore.stackexchange.com/questions/13859/why-do-we-require-contained-database-for-sitecore-9
https://sheenumalhi.wordpress.com/2019/02/19/sitecore-9-with-aws-rds/

Django Cassandra engine timeout when creating data using models

I have a Django app running in a Docker swarm. I'm trying to connect it to Cassandra using django-cassandra-engine.
Following the guides, I've configured the installed apps and connection settings, and the first step works great (manage.py sync_cassandra): it created the keyspace and my models.
However, whenever I try to create data using the models or a raw query, there is simply a timeout while connecting, without any other errors.
The swarm is running on AWS with a custom VPC. Docker Flow Proxy is used as the reverse proxy for the setup (not that it should affect the connection in any way).
I've tried deploying cassandra both as a standalone and as a docker image.
I am also able to ping the servers both ways.
I am even able to manually connect to the Django app container, install Cassandra, and connect to the Cassandra cluster using cqlsh.
I've been banging my head against this for the past few days...
Has anyone encountered something similar? Any ideas as to where I can start digging?
Feel free to ask for any information you think may be relevant.
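For reference, the connection settings follow the standard django-cassandra-engine layout, roughly like the sketch below (the keyspace and host names are placeholders; in a swarm the HOST needs to be the Cassandra service name reachable on the overlay network rather than localhost):
# Illustrative settings.py fragment for django-cassandra-engine (names are placeholders).
DATABASES = {
    'default': {
        'ENGINE': 'django_cassandra_engine',
        'NAME': 'my_keyspace',   # keyspace created by manage.py sync_cassandra
        'HOST': 'cassandra',     # swarm service name, not localhost
        'OPTIONS': {
            'replication': {
                'strategy_class': 'SimpleStrategy',
                'replication_factor': 1,
            },
        },
    },
}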

Spark step on EMR just hangs as "Running" after done writing to S3

I'm running a PySpark 2 job on EMR 5.1.0 as a step. Even after the script is done, with a _SUCCESS file written to S3 and the Spark UI showing the job as completed, EMR still shows the step as "Running". I've waited for over an hour to see if Spark was just trying to clean itself up, but the step never shows as "Completed". The last thing written in the logs is:
INFO MultipartUploadOutputStream: close closed:false s3://mybucket/some/path/_SUCCESS
INFO DefaultWriterContainer: Job job_201611181653_0000 committed.
INFO ContextCleaner: Cleaned accumulator 0
I didn't have this problem with Spark 1.6. I've tried a bunch of different hadoop-aws and aws-java-sdk JARs to no avail.
I'm using the default Spark 2.0 configurations, so I don't think anything else like metadata is being written. Also, the size of the data doesn't seem to have an impact on this problem.
If you aren't already, you should close your Spark context:
sc.stop()
Also, if you are watching the Spark web UI via a browser, you should close that, as it sometimes keeps the Spark context alive. I recall seeing this on the Spark dev mailing list, but can't find the JIRA for it.
We experienced this problem and resolved it by running the job in cluster deploy mode, using the following spark-submit option:
spark-submit --deploy-mode cluster
It had something to do with the fact that, when running in client mode, the driver runs on the master instance and the spark-submit process gets stuck despite the Spark context closing. This was causing the instance controller to continuously poll for the process, as it never receives the completion signal. Running the driver on one of the instance nodes using the above option doesn't seem to have this problem. Hope this helps.
I experienced the same issue with Spark on AWS EMR and I solved it by calling sys.exit(0) at the end of my Python script. The same worked for a Scala program with System.exit(0).
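Putting these suggestions together, the tail end of a PySpark step would look roughly like this (the S3 path is just the placeholder from the logs above):
# Rough sketch of the end of an EMR PySpark step, combining the suggestions above.
import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("my-emr-step").getOrCreate()
spark.range(10).write.mode("overwrite").parquet("s3://mybucket/some/path/")  # placeholder job output

spark.stop()   # close the Spark context (sc.stop() if you only have a SparkContext)
sys.exit(0)    # explicit exit so the step reports completion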