I am setting up a new Percona node on an AWS instance to connect to an existing cluster that is not in AWS. I have updated the security groups.
I added all the node IPs to the my.cnf file and could not start Percona, so I removed them to start from scratch. I am getting this error:
140114 16:24:30 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():139
140114 16:24:30 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():195: Failed to open backend connection: -110 (Connection timed out)
140114 16:24:30 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1289: Failed to open channel 'my_centos_cluster' at 'gcomm://10.10.25.10,10.20.4.11,10.10.20.12,10.20.4.13': -110 (Connection timed out)
140114 16:24:30 [ERROR] WSREP: gcs connect failed: Connection timed out
140114 16:24:30 [ERROR] WSREP: wsrep::connect() failed: 6
140114 16:24:30 [ERROR] Aborting
That is only a sample of the log. Here is my my.cnf:
[mysqld]
#datadir=/var/lib/mysql
datadir=/data/mysql
user=mysql
log-error=/data/mysql/mysqlerror.log
# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so
# Cluster connection URL contains the IPs of node#1, node#2 and node#3
wsrep_cluster_address=gcomm://LOCAL_IP
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node #1 address
wsrep_node_address=LOCAL_IP
# SST method
wsrep_sst_method=xtrabackup
# Cluster name
wsrep_cluster_name=my_centos_cluster
# Authentication for SST method
wsrep_sst_auth="USER:PASS"
What am I doing wrong?
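For reference, when a new node joins an existing cluster, wsrep_cluster_address normally lists the addresses of the nodes already running (not the local node), and the joiner has to be able to reach them on the Galera ports. A minimal sketch, reusing the IPs from the log above and the default ports (adjust to your cluster):

# my.cnf on the joining node
wsrep_cluster_address=gcomm://10.10.25.10,10.20.4.11,10.10.20.12,10.20.4.13
wsrep_node_address=LOCAL_IP

# from the new node, check that an existing node is reachable on the Galera ports
nc -zv 10.10.25.10 4567   # group communication
nc -zv 10.10.25.10 4568   # incremental state transfer (IST)
nc -zv 10.10.25.10 4444   # state snapshot transfer (SST, xtrabackup)

If any of these time out, a security group or an on-premises firewall is still blocking the traffic, which would match the gcomm connection timeout in the log.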
Related
After the software update of Command Line Tools for Xcode to version 13.4, the gcloud compute ssh command stopped working with the error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate.
I'm not behind a proxy or firewall.
What I've tried so far: updating the Google Cloud SDK, then reinstalling it, then removing and installing the Google Cloud SDK from scratch a number of times, but the gcloud init command fails to complete with the same error. Downgrading Command Line Tools to 13.2 didn't help. Updating certifi and launching "Install Certificates.command" didn't help either.
output of "gcloud info --run-diagnostics --verbosity debug":
DEBUG: Running [gcloud.info] with arguments: [--run-diagnostics: "True", --verbosity: "debug"]
Network diagnostic detects and fixes local network connection issues.
Checking network connection...⠏DEBUG: Starting new HTTPS connection (1): accounts.google.com:443
Checking network connection...⠛DEBUG: https://accounts.google.com:443 "GET / HTTP/1.1" 302 338
Checking network connection...⠹DEBUG: https://accounts.google.com:443 "GET /ServiceLogin?passive=1209600&continue=https%3A%2F%2Faccounts.google.com%2F&followup=https%3A%2F%2Faccounts.google.com%2F HTTP/1.1" 302 526
DEBUG: https://accounts.google.com:443 "GET /v3/signin/identifier?dsh=S352504070%3A1656098809680794&continue=https%3A%2F%2Faccounts.google.com%2F&followup=https%3A%2F%2Faccounts.google.com%2F&passive=1209600&flowName=WebLiteSignIn&flowEntry=ServiceLogin&ifkv=AX3vH3-l3sW9otbTScMC6LItjgqZXIpEl6jaKQLX4a-o3Z7M4L5oVPqMq_V_Vltgjce-HlGz4y0mFQ HTTP/1.1" 200 None
Checking network connection...⠼DEBUG: Starting new HTTPS connection (1): cloudresourcemanager.googleapis.com:443
DEBUG: Starting new HTTPS connection (1): www.googleapis.com:443
Checking network connection...⠶DEBUG: Starting new HTTPS connection (1): dl.google.com:443
Checking network connection...⠧DEBUG: https://dl.google.com:443 "GET /dl/cloudsdk/channels/rapid/components-2.json HTTP/1.1" 200 190919
Checking network connection...done.
ERROR: Reachability Check failed.
httplib2 cannot reach https://cloudresourcemanager.googleapis.com/v1beta1/projects:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)
httplib2 cannot reach https://www.googleapis.com/auth/cloud-platform:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)
requests cannot reach https://cloudresourcemanager.googleapis.com/v1beta1/projects:
HTTPSConnectionPool(host='cloudresourcemanager.googleapis.com', port=443): Max retries exceeded with url: /v1beta1/projects (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)')))
requests cannot reach https://www.googleapis.com/auth/cloud-platform:
HTTPSConnectionPool(host='www.googleapis.com', port=443): Max retries exceeded with url: /auth/cloud-platform (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)')))
Network connection problems may be due to proxy or firewall settings.
Do you have a network proxy you would like to set in gcloud (Y/n)? n
ERROR: Network diagnostic failed (0/1 checks passed).
Property diagnostic detects issues that may be caused by properties.
Checking hidden properties...done.
Hidden Property Check passed.
Property diagnostic passed (1/1 checks passed).
DEBUG: (gcloud.info) Some of the checks in diagnostics failed.
Traceback (most recent call last):
File "/Users/gclouder/google-cloud-sdk/lib/googlecloudsdk/calliope/cli.py", line 987, in Execute
resources = calliope_command.Run(cli=self, args=args)
File "/Users/gclouder/google-cloud-sdk/lib/googlecloudsdk/calliope/backend.py", line 809, in Run
resources = command_instance.Run(args)
File "/Users/gclouder/google-cloud-sdk/lib/surface/info.py", line 91, in Run
raise exceptions.Error('Some of the checks in diagnostics failed.')
googlecloudsdk.core.exceptions.Error: Some of the checks in diagnostics failed.
ERROR: (gcloud.info) Some of the checks in diagnostics failed.
output of "gcloud info":
Google Cloud SDK [391.0.0]
Platform: [Mac OS X, x86_64] uname_result(system='Darwin', node='gclouder.local', release='21.5.0', version='Darwin Kernel Version 21.5.0: Tue Apr 26 21:08:22 PDT 2022; root:xnu-8020.121.3~4/RELEASE_X86_64', machine='x86_64', processor='i386')
Locale: (None, 'UTF-8')
Python Version: [3.7.9 (v3.7.9:13c94747c7, Aug 15 2020, 01:31:08) [Clang 6.0 (clang-600.0.57)]]
Python Location: [/Users/gclouder/.config/gcloud/virtenv/bin/python3]
OpenSSL: [OpenSSL 1.1.1g 21 Apr 2020]
Requests Version: [2.22.0]
urllib3 Version: [1.25.9]
Site Packages: [Enabled]
Installation Root: [/Users/gclouder/google-cloud-sdk]
Installed Components:
gsutil: [5.10]
core: [2022.06.17]
bq: [2.0.75]
System PATH: [/Users/gclouder/.config/gcloud/virtenv/bin:/Users/gclouder/google-cloud-sdk/bin:/Users/gclouder/.nvm/versions/node/v14.19.0/bin:/Users/gclouder/.jenv/shims:/Users/gclouder/.jenv/bin:/usr/local/opt/mysql@5.7/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin]
Python PATH: [/Users/gclouder/google-cloud-sdk/lib/third_party:/Users/gclouder/google-cloud-sdk/lib:/Library/Frameworks/Python.framework/Versions/3.7/lib/python37.zip:/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7:/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/lib-dynload:/Users/gclouder/.config/gcloud/virtenv/lib/python3.7/site-packages]
Cloud SDK on PATH: [True]
Kubectl on PATH: [/usr/local/bin/kubectl]
Installation Properties: [/Users/gclouder/google-cloud-sdk/properties]
User Config Directory: [/Users/gclouder/.config/gcloud]
Active Configuration Name: [default]
Active Configuration Path: [/Users/gclouder/.config/gcloud/configurations/config_default]
Account: [None]
Project: [None]
Current Properties:
[core]
disable_usage_reporting: [True] (property file)
Logs Directory: [/Users/gclouder/.config/gcloud/logs]
Last Log File: [/Users/gclouder/.config/gcloud/logs/2022.06.24/21.26.47.993939.log]
git: [git version 2.32.1 (Apple Git-133)]
ssh: [OpenSSH_8.6p1, LibreSSL 3.3.6]
Update: it turned out to be the corporate antivirus, which started interfering with certificate verification after a software update.
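In case it helps others who hit the same error: a quick way to confirm that something on the machine (an antivirus or a TLS-intercepting proxy) is rewriting the certificate chain, and to point gcloud at a custom CA bundle, looks roughly like this (the PEM path below is only an example):

# show the certificate chain the client actually receives; an interception product
# usually shows up as an unexpected issuer here
openssl s_client -connect www.googleapis.com:443 -servername www.googleapis.com -showcerts </dev/null

# if the issuer is a corporate/AV root, export that root certificate to a PEM file
# and tell gcloud to trust it
gcloud config set core/custom_ca_certs_file /usr/local/etc/corp-root-ca.pem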
I have installed MySQL in a VM and want my EKS cluster (with Istio 1.9 installed) to talk to it. I am following https://istio.io/latest/docs/setup/install/virtual-machine/, but at the step that generates the hosts file, the generated file is empty.
I tried anyway with the empty hosts file and started Istio on the VM with this command:
> sudo systemctl start istio
When I tail this file, I see:
/var/log/istio/istio.log
2021-03-22T18:44:02.332421Z info Proxy role ips=[10.8.1.179 fe80::dc:36ff:fed3:9eea] type=sidecar id=ip-10-8-1-179.vm domain=vm.svc.cluster.local
2021-03-22T18:44:02.332429Z info JWT policy is third-party-jwt
2021-03-22T18:44:02.332438Z info Pilot SAN: [istiod.istio-system.svc]
2021-03-22T18:44:02.332443Z info CA Endpoint istiod.istio-system.svc:15012, provider Citadel
2021-03-22T18:44:02.332997Z info Using CA istiod.istio-system.svc:15012 cert with certs: /etc/certs/root-cert.pem
2021-03-22T18:44:02.333093Z info citadelclient Citadel client using custom root cert: istiod.istio-system.svc:15012
2021-03-22T18:44:02.410934Z info ads All caches have been synced up in 82.7974ms, marking server ready
2021-03-22T18:44:02.411247Z info sds SDS server for workload certificates started, listening on "./etc/istio/proxy/SDS"
2021-03-22T18:44:02.424855Z info sds Start SDS grpc server
2021-03-22T18:44:02.425044Z info xdsproxy Initializing with upstream address "istiod.istio-system.svc:15012" and cluster "Kubernetes"
2021-03-22T18:44:02.425341Z info Starting proxy agent
2021-03-22T18:44:02.425483Z info dns Starting local udp DNS server at localhost:15053
2021-03-22T18:44:02.427627Z info dns Starting local tcp DNS server at localhost:15053
2021-03-22T18:44:02.427683Z info Opening status port 15020
2021-03-22T18:44:02.432407Z info Received new config, creating new Envoy epoch 0
2021-03-22T18:44:02.433999Z info Epoch 0 starting
2021-03-22T18:44:02.690764Z warn ca ca request failed, starting attempt 1 in 91.93939ms
2021-03-22T18:44:02.693579Z info Envoy command: [-c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-proxy --service-node sidecar~10.8.1.179~ip-10-8-1-179.vm~vm.svc.cluster.local --local-address-ip-version v4 --bootstrap-version 3 --log-format %Y-%m-%dT%T.%fZ %l envoy %n %v -l warning --component-log-level misc:error --concurrency 2]
2021-03-22T18:44:02.782817Z warn ca ca request failed, starting attempt 2 in 195.226287ms
2021-03-22T18:44:02.978344Z warn ca ca request failed, starting attempt 3 in 414.326774ms
2021-03-22T18:44:03.392946Z warn ca ca request failed, starting attempt 4 in 857.998629ms
2021-03-22T18:44:04.251227Z warn sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 10.8.0.2:53: no such host"
2021-03-22T18:44:04.849207Z warn ca ca request failed, starting attempt 1 in 91.182413ms
2021-03-22T18:44:04.940652Z warn ca ca request failed, starting attempt 2 in 207.680983ms
2021-03-22T18:44:05.148598Z warn ca ca request failed, starting attempt 3 in 384.121814ms
2021-03-22T18:44:05.533019Z warn ca ca request failed, starting attempt 4 in 787.704352ms
2021-03-22T18:44:06.321042Z warn sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 10.8.0.2:53: no such host"
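For context, in the virtual-machine install the generated hosts file is expected to map istiod.istio-system.svc (the exact name the log above fails to resolve) to an IP the VM can reach, normally the east-west gateway. A rough sketch of what to check, assuming the gateway is named istio-eastwestgateway as in the Istio docs:

# on the cluster: find the externally reachable IP of the east-west gateway
kubectl get svc istio-eastwestgateway -n istio-system

# on the VM: the generated hosts file (appended to /etc/hosts) should contain a line like
# <EAST_WEST_GATEWAY_IP> istiod.istio-system.svc

If that file is empty, the VM has no way to resolve istiod.istio-system.svc, which matches the "no such host" errors above.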
I am trying to use the ECS CLI to deploy a Docker container, but I'm getting a connection reset and I'm not sure why.
ecs-cli compose --project-name blah service up --create-log-groups --cluster-config blah --timeout 20
WARN[0000] Skipping unsupported YAML option for service... option name=expose service name=app
WARN[0000] Ignoring the ip address while transforming it to task definition container=app portMapping="0.0.0.0:8080:8080"
WARN[0000] Ignoring the ip address while transforming it to task definition container=app portMapping="0.0.0.0:8080:8080"
INFO[0000] Using ECS task definition TaskDefinition="blah:2"
WARN[0000] No log groups to create; no containers use 'awslogs'
INFO[0001] Updated ECS service successfully desiredCount=1 force-deployment=false service=blah
INFO[0047] (service blah) has started 1 tasks: (task d4d52496-057a-4b24-878a-4cf654085eff). timestamp="2019-05-24 18:15:02 +0000 UTC"
INFO[0125] (service blah) has started 1 tasks: (task 9484000f-4a4d-4b2d-8fdb-a9668012d6ae). timestamp="2019-05-24 18:16:23 +0000 UTC"
INFO[0202] (service blah) has started 1 tasks: (task f13dea68-e996-40ce-a44f-be266219b001). timestamp="2019-05-24 18:17:32 +0000 UTC"
ERRO[0392] Error describing service error="RequestError: send request failed\ncaused by: Post https://ecs.us-west-2.amazonaws.com/: read tcp 192.168.1.231:53094->52.119.169.134:443: read: connection reset by peer" service=blah
FATA[0392] RequestError: send request failed
caused by: Post https://ecs.us-west-2.amazonaws.com/: read tcp 192.168.1.231:53094->52.119.169.134:443: read: connection reset by peer
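For what it's worth, a connection reset on the way to ecs.us-west-2.amazonaws.com usually points at something between the workstation and the endpoint (VPN, proxy, or a flaky network) rather than at the service definition. Two illustrative checks from the same machine, reusing the names from the command above:

# does a plain HTTPS request to the regional ECS endpoint succeed?
curl -v https://ecs.us-west-2.amazonaws.com/

# does a similar API call work through the AWS CLI, bypassing ecs-cli?
aws ecs describe-services --cluster blah --services blah --region us-west-2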
I am trying to run the Cloud Endpoints ESP container locally on my Mac, following the documentation at https://cloud.google.com/endpoints/docs/openapi/running-esp-localdev.
I could deploy the endpoint configuration to GCP using the OpenAPI YAML file, but somehow the ESP container cannot get the service definition from the Service Management API.
I have looked at the ESP container logs; they show the errors below. Am I missing something?
2018/09/13 21:29:38[error]11#11: Failed to download rollouts: UNAVAILABLE: Failed to connect to the service management, Response body:
2018/09/13 21:30:08 [error] 11#11: servicemanagement.googleapis.com could not be resolved (110: Operation timed out)
2018/09/13 21:30:08 [error] 11#11: servicemanagement.googleapis.com could not be resolved (110: Operation timed out)
2018/09/13 21:30:08 [error] 11#11: servicemanagement.googleapis.com could not be resolved (110: Operation timed out)
2018/09/13 21:30:38 [error] 11#11: servicemanagement.googleapis.com could not be resolved (110: Operation timed out)
2018/09/13 21:30:38 [error] 11#11: servicemanagement.googleapis.com could not be resolved (110: Operation timed out)
2018/09/13 21:30:38 [error] 11#11: servicemanagement.googleapis.com could not be resolved (110: Operation timed out)
Operation timed out errors imply that there may be a networking issue with your ESP container. Try running:
$ docker run byrnedo/alpine-curl curl -sI servicemanagement.googleapis.com
If you see HTTP/1.1 404 Not Found, then you can rule out issues with your Docker networking setup; otherwise, try the troubleshooting guide at https://success.docker.com/article/troubleshooting-container-networking.
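If that curl test also fails, the problem is usually the container's DNS rather than the ESP image itself. For illustration (8.8.8.8 is just an example resolver):

# the same image ships curl, so a verbose request shows whether name resolution works inside a container
docker run --rm byrnedo/alpine-curl curl -v https://servicemanagement.googleapis.com

# if resolution only fails inside containers, retry the ESP container with an explicit
# resolver by adding --dns 8.8.8.8 to the docker run command from the guide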
I've followed the instructions on the Spark website and got 1 master and 1 slave running on Amazon EC2. However, I'm not able to connect to the master node using pyspark.
I can connect to the master node using SSH without any problem.
Here's my command
spark-ec2 --key-pair=graph-cluster --identity-file=/Users/.ssh/pem.pem --region=us-east-1 --zone=us-east-1a launch graph-cluster
I can go to http://ec2-54-152-xx-xxx.compute-1.amazonaws.com:8080/
and see that Spark is up and running. I also see the Spark master at
spark://ec2-54-152-xx-xxx.compute-1.amazonaws.com:7077
However, when I run this command
MASTER=spark://ec2-54-152-xx-xx.compute-1.amazonaws.com:7077 pyspark
I get this error
2015-09-16 15:39:31,800 ERROR actor.OneForOneStrategy (Slf4jLogger.scala:apply$mcV$sp(66)) -
java.lang.NullPointerException
at org.apache.spark.deploy.client.AppClient$ClientActor$$anonfun$receiveWithLogging$1.applyOrElse(AppClient.scala:160)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.spark.deploy.client.AppClient$ClientActor.aroundReceive(AppClient.scala:61)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-09-16 15:39:31,804 INFO client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Connecting to master akka.tcp://sparkMaster@ec2-54-152-xx-xxx.compute-1.amazonaws.com:7077/user/Master...
2015-09-16 15:39:31,955 INFO util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52333.
2015-09-16 15:39:31,956 INFO netty.NettyBlockTransferService (Logging.scala:logInfo(59)) - Server created on 52333
2015-09-16 15:39:31,959 INFO storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Trying to register BlockManager
2015-09-16 15:39:31,964 INFO storage.BlockManagerMasterEndpoint (Logging.scala:logInfo(59)) - Registering block manager xxx:52333 with 265.1 MB RAM, BlockManagerId(driver, xxx, 52333)
2015-09-16 15:39:31,969 INFO storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Registered BlockManager
2015-09-16 15:39:32,458 ERROR spark.SparkContext (Logging.scala:logError(96)) - Error initializing SparkContext.
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1503)
at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2007)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:543)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
2015-09-16 15:39:32,460 INFO spark.SparkContext (Logging.scala:logInfo(59)) - SparkContext already stopped.
Traceback (most recent call last):
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/shell.py", line 43, in <module>
sc = SparkContext(appName="PySparkShell", pyFiles=add_files)
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/context.py", line 113, in __init__
conf, jsc, profiler_cls)
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/context.py", line 165, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/pyspark/context.py", line 219, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__
File "/usr/local/Cellar/apache-spark/1.4.1/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1503)
at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2007)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:543)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
spark-ec2 does not open port 7077 on the master node for incoming connections from outside the cluster.
You can verify this in the AWS console under EC2 / Network & Security / Security Groups, on the Inbound tab of the graph-cluster-master security group.
You can add a rule there to open inbound connections to port 7077.
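As an illustration, the same rule can be added from the AWS CLI (the group name is the one spark-ec2 creates by default for this cluster; YOUR_IP is a placeholder for your workstation's public IP):

aws ec2 authorize-security-group-ingress --group-name graph-cluster-master --protocol tcp --port 7077 --cidr YOUR_IP/32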
However, it is recommended to run pyspark (essentially Spark's application driver) from the master machine inside the EC2 cluster and to avoid running the driver outside the network.
The reasons are increased latency and firewall configuration headaches: you would need to open additional ports so that the executors can connect back to the driver on your machine.
So the way to go is to log in to the cluster over SSH with this command:
spark-ec2 --key-pair=graph-cluster --identity-file=/Users/.ssh/pem.pem --region=us-east-1 --zone=us-east-1a login graph-cluster
And run the commands from the master server:
cd spark
bin/pyspark
You'll need to transfer the related files (your script and data) to the master. I usually keep data on S3 and edit script files with vim, or start an IPython notebook.
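For example (the key path and hostname are the ones from the question; my_job.py and my-bucket are made-up placeholders), files can be staged like this:

# copy a script from your laptop to the master (spark-ec2 AMIs log in as root)
scp -i /Users/.ssh/pem.pem my_job.py root@ec2-54-152-xx-xxx.compute-1.amazonaws.com:~/

# or keep the data in S3 and pull it onto the master
aws s3 cp s3://my-bucket/input/data.txt .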
BTW, starting an IPython notebook is very easy: add a rule allowing incoming connections from your computer's IP to port 18888 to the master's security group in the EC2 console, and then run this command on the cluster:
IPYTHON_OPTS="notebook --pylab inline --port=18888 --ip='*'" pyspark
Then you can access it at http://ec2-54-152-xx-xxx.compute-1.amazonaws.com:18888/