Specify kubectl client version installed via gcloud SDK - google-cloud-platform

I probably missed this in the docs somewhere, but since I haven't found it yet, I'll ask: How can I specify the version of kubectl CLI when installing with the gcloud SDK?
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.9-2+4a03651a7e7e04", GitCommit:"4a03651a7e7e04a0021b2ef087963dfb7bd0a17e", GitTreeState:"clean", BuildDate:"2019-08-16T19:08:17Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.7-gke.24", GitCommit:"2ce02ef1754a457ba464ab87dba9090d90cf0468", GitTreeState:"clean", BuildDate:"2019-08-12T22:05:28Z", GoVersion:"go1.11.5b4", Compiler:"gc", Platform:"linux/amd64"}
$ gcloud components update
All components are up to date.
$ which kubectl
/Users/me/Projects/googlecloud/google-cloud-sdk/bin/kubectl
$ which gcloud
/Users/me/Projects/googlecloud/google-cloud-sdk/bin/gcloud
$ ls -nL /Users/me/Projects/googlecloud/google-cloud-sdk/bin | grep kubectl
-rwxr-xr-x 1 501 20 44296840 Aug 16 12:08 kubectl
-rwxr-xr-x 1 501 20 54985744 Apr 30 21:56 kubectl.1.11
-rwxr-xr-x 1 501 20 56860112 Jul 7 21:34 kubectl.1.12
-rwxr-xr-x 1 501 20 44329928 Aug 5 02:52 kubectl.1.13
-rwxr-xr-x 1 501 20 48698616 Aug 5 02:55 kubectl.1.14
-rwxr-xr-x 1 501 20 48591440 Aug 5 02:57 kubectl.1.15
So I'm using the gcloud-installed kubectl, and the version I want is already installed locally. The gcloud components update command I ran earlier indicated that kubectl would be set to the default version of 1.13, but I haven't found any indication of how to change that default.
I imagine I could create a link, or copy the version I want onto /Users/me/Projects/googlecloud/google-cloud-sdk/bin/kubectl, but I'm leery of messing with gcloud's managed environment.

Whelp, I went ahead and ran the following:
KUBE_BIN=$(which kubectl)
rm $KUBE_BIN
ln ~/googlecloud/google-cloud-sdk/bin/kubectl.1.15 $KUBE_BIN
and now I get:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T09:23:26Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
and everything seems to be working just fine...

IIRC you cannot.
But, as you show, you have multiple major-minor versions available and, because kubectl is distributed as a static binary, you can invoke a specific one directly, e.g.
kubectl.1.15 version

GKE only
Do not change the kubectl version if you're working only with GKE, because kubectl supports only one minor version of skew, forward and backward, against the API server.
For example, if you use kubectl 1.16 against GKE 1.14, you may run into bugs such as the --watch flag not working properly.
gcloud provides just the right version for the current version of GKE.
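You can quickly verify the skew; here client and server are both on 1.13, so they match (abbreviated from the version output in the question):
$ kubectl version --short
Client Version: v1.13.9-2+4a03651a7e7e04
Server Version: v1.13.7-gke.24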
Multiple cluster versions
explicit version
If you're working with different Kubernetes clusters, I'd suggest keeping the gcloud version of kubectl as the default. For any specific version, just create a dir ~/bin/kubectl, put kubectl1.15, kubectl1.16, etc. there, and add the dir to your PATH, as sketched below.
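A minimal sketch of that setup (the download URL and version are illustrative; adjust the OS/arch path for your machine):
$ mkdir -p ~/bin/kubectl
$ curl -Lo ~/bin/kubectl/kubectl1.15 https://dl.k8s.io/release/v1.15.2/bin/darwin/amd64/kubectl
$ chmod +x ~/bin/kubectl/kubectl1.15
$ export PATH="${HOME}/bin/kubectl:${PATH}"  # add this line to your shell profile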
With such a setup you can explicitly use the appropriate version:
$ # Working with GKE
$ kubectl ...
$ # Working with K8s 1.15
$ kubectl1.15 ...
implicit version
Using direnv you can make switching between versions transparent.
There are many ways of doing this; here is one example.
Let's say you have a project which requires kubectl 1.15. Inside the project dir, create an env/bin subdir and link all the binaries you need there (kubectl1.15, helm2, etc.), then create a .envrc file with the following content:
export PATH="$(PWD)/env/bin:${PATH}"
Run direnv allow in the project dir (it's needed only once for any new .envrc). After that you'll have all binaries from env/bin in your PATH.
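For instance, with the .envrc above in place, the one-time setup might look like this (the project path is illustrative, reusing the ~/bin/kubectl layout from above):
$ cd ~/projects/myproject
$ mkdir -p env/bin
$ ln -s ~/bin/kubectl/kubectl1.15 env/bin/kubectl  # project-local default kubectl
$ direnv allow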
And then, in the dir and all subdirs:
$ # Invokes kubectl 1.15
$ kubectl ...
$ # Invokes Helm 2
$ helm ...

Related

Jenkins Plugins are not installed : Command Line

I am trying to install Jenkins plugins from an AWS S3 bucket.
Code for installing the Jenkins plugins:
plugin_manager_url="https://github.com/jenkinsci/plugin-installation-manager-tool/releases/download/2.12.3/jenkins-plugin-manager-2.12.3.jar"
jpath="/var/lib/jenkins"
echo "Installing Jenkins Plugin Manager..."
wget -O $${jpath}/jenkins-plugin-manager.jar $${plugin_manager_url}
chown jenkins:jenkins $${jpath}/jenkins-plugin-manager.jar
cd $${jpath}
mkdir pluginsInstalled
aws s3 cp "s3://bucket/folder-with-plugins.zip" .
unzip folder-with-plugins.zip
echo 'Installing Jenkins Plugins...'
cd plugins/
for plugin in *.jpi; do
java -jar $${jpath}/jenkins-plugin-manager.jar --war /usr/share/java/jenkins.war --plugin-download-directory $${jpath}/pluginsInstalled --plugins $(echo $plugin | cut -f 1 -d '.')
done
chown -R jenkins:jenkins $${jpath}/pluginsInstalled
systemctl start jenkins  # Jenkins itself was installed before the plugins and is up and running
In the above code snippet, I unzip the S3 bucket folder; all plugins are inside the "plugins/" folder with the .jpi extension, so I trim that extension while installing. The installed plugins end up in the "pluginsInstalled" folder.
I have DEV and PROD AWS accounts. I build an AMI using EC2 Image Builder in the DEV account and share/use that AMI in PROD for security reasons.
So, the userdata script that installs Jenkins and its plugins is part of building the AMI. When I check EC2 Image Builder's build instance, I can see the userdata ran properly.
But when I check the same AMI used in PROD, the Jenkins plugins are not installed.
Jenkins Version : 2.346.2
And the error log for Jenkins is:
java.lang.IllegalArgumentException: No hudson.security.AuthorizationStrategy implementation found for folderBased
at io.jenkins.plugins.casc.impl.configurators.HeteroDescribableConfigurator.lambda$lookupDescriptor$11(HeteroDescribableConfigurator.java:211)
at io.vavr.control.Option.orElse(Option.java:321)
at io.jenkins.plugins.casc.impl.configurators.HeteroDescribableConfigurator.lookupDescriptor(HeteroDescribableConfigurator.java:210)
at io.jenkins.plugins.casc.impl.configurators.HeteroDescribableConfigurator.lambda$configure$3(HeteroDescribableConfigurator.java:84)
at io.vavr.Tuple2.apply(Tuple2.java:238)
at io.jenkins.plugins.casc.impl.configurators.HeteroDescribableConfigurator.configure(HeteroDescribableConfigurator.java:83)
at io.jenkins.plugins.casc.impl.configurators.HeteroDescribableConfigurator.check(HeteroDescribableConfigurator.java:92)
at io.jenkins.plugins.casc.impl.configurators.HeteroDescribableConfigurator.check(HeteroDescribableConfigurator.java:55)
at io.jenkins.plugins.casc.BaseConfigurator.configure(BaseConfigurator.java:350)
at io.jenkins.plugins.casc.BaseConfigurator.check(BaseConfigurator.java:286)
at io.jenkins.plugins.casc.ConfigurationAsCode.lambda$checkWith$8(ConfigurationAsCode.java:776)
at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:712)
at io.jenkins.plugins.casc.ConfigurationAsCode.checkWith(ConfigurationAsCode.java:776)
at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:761)
at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:637)
at io.jenkins.plugins.casc.ConfigurationAsCode.configure(ConfigurationAsCode.java:306)
at io.jenkins.plugins.casc.ConfigurationAsCode.init(ConfigurationAsCode.java:298)
Caused: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:109)
Caused: java.lang.Error
at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:115)
at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:185)
at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:305)
at jenkins.model.Jenkins$5.runTask(Jenkins.java:1158)
at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:222)
at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:121)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused: org.jvnet.hudson.reactor.ReactorException
at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:291)
at jenkins.InitReactorRunner.run(InitReactorRunner.java:49)
at jenkins.model.Jenkins.executeReactor(Jenkins.java:1193)
at jenkins.model.Jenkins.<init>(Jenkins.java:983)
at hudson.model.Hudson.<init>(Hudson.java:86)
at hudson.model.Hudson.<init>(Hudson.java:82)
at hudson.WebAppMain$3.run(WebAppMain.java:247)
Caused: hudson.util.HudsonFailedToLoad
at hudson.WebAppMain$3.run(WebAppMain.java:264)
When I check the Jenkins status on PROD, where the AMI with the plugins installed is used, Jenkins is somehow not able to restart. jenkins status gives the following:
Aug 18 21:08:40 ip-10-220-74-95.ec2.internal systemd[1]: Starting Jenkins Continuous Integration Server...
Aug 18 21:08:45 ip-10-220-74-95.ec2.internal jenkins[6656]: Exception in thread "Attach Listener" Agent failed to start!
Aug 18 21:08:50 ip-10-220-74-95.ec2.internal jenkins[6656]: WARNING: An illegal reflective access operation has occurred
Aug 18 21:08:50 ip-10-220-74-95.ec2.internal jenkins[6656]: WARNING: Illegal reflective access by org.codehaus.groovy.vmplugin.v7.Java7$...s,int)
Aug 18 21:08:50 ip-10-220-74-95.ec2.internal jenkins[6656]: WARNING: Please consider reporting this to the maintainers of org.codehaus.g...ava7$1
Aug 18 21:08:50 ip-10-220-74-95.ec2.internal jenkins[6656]: WARNING: Use --illegal-access=warn to enable warnings of further illegal ref...ations
Aug 18 21:08:50 ip-10-220-74-95.ec2.internal jenkins[6656]: WARNING: All illegal access operations will be denied in a future release
The issue was that I was installing plugins using:
java -jar ./jenkins-plugin-manager.jar --war ./jenkins.war --plugin-download-directory <dir> --plugins <plugins_list>
Here, it was resolving and installing plugins against the latest Jenkins version.
The fix was to specify the Jenkins version targeted by our project:
sudo java -jar ./jenkins-plugin-manager.jar --jenkins-version <JENNKINS_VERSION> --plugin-download-directory <dir> --plugins <plugins_list>
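For example, with the Jenkins version from this setup (the plugin names are only illustrative):
sudo java -jar ./jenkins-plugin-manager.jar --jenkins-version 2.346.2 --plugin-download-directory ./pluginsInstalled --plugins git workflow-aggregator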

no Istio pods in namespace "istio-system"

I have installed istioctl 1.4.8, but istioctl is not able to talk to my cluster when using the command (istioctl version -c platform).
#kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
# kubectl get pods -A | grep -i istio | grep pilot
istio-platform istio-pilot-7c5adrgcd89-wt9k 2/2 Running 4 1d
#istioctl version
2020-06-14T11:26:13.636825Z warn will use `--remote=false` to retrieve version info due to `no Istio pods in namespace "istio-system"`
1.4.8
# istioctl version -c istio-platform
2020-06-14T11:27:59.121013Z warn will use `--remote=false` to retrieve version info due to `no Istio pods in namespace "istio-system"`
1.4.8
Istio is running in the namespace istio-platform.
What could be the issue here? Any hints?
You have to provide the Istio namespace if it's not istio-system:
istioctl version -i istio-platform
cf. https://istio.io/latest/docs/reference/commands/istioctl/
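If you don't want to pass the flag on every invocation, the linked reference also documents an environment variable for it; a minimal sketch, assuming that variable is supported by your istioctl version:
$ export ISTIOCTL_ISTIONAMESPACE=istio-platform
$ istioctl version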

Issue with elixir-phoenix-on-google-compute-engine

I’m trying to deploy to GCP Compute Engine by following this tutorial
https://cloud.google.com/community/tutorials/elixir-phoenix-on-google-compute-engine
Unable to connect to the provided external IP after creating firewall rules.
There are no errors when following the tutorial, but I cannot connect to http://${external_ip}:8080 after creating the firewall rules.
The release build is already in Google Cloud Storage.
I copied hello
gsutil cp _build/prod/rel/hello/bin/hello \
gs://${BUCKET_NAME}/hello-release
instead of hello.run
gsutil cp _build/prod/rel/hello/bin/hello.run \
gs://${BUCKET_NAME}/hello-release
My instance-startup.sh
#!/bin/sh
set -ex
export HOME=/app
mkdir -p ${HOME}
cd ${HOME}
RELEASE_URL=$(curl \
-s "http://metadata.google.internal/computeMetadata/v1/instance/attributes/release-url" \
-H "Metadata-Flavor: Google")
gsutil cp ${RELEASE_URL} hello-release
chmod 755 hello-release
wget https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 \
-O cloud_sql_proxy
chmod +x cloud_sql_proxy
mkdir /tmp/cloudsql
PROJECT_ID=$(curl \
-s "http://metadata.google.internal/computeMetadata/v1/project/project-id" \
-H "Metadata-Flavor: Google")
./cloud_sql_proxy -projects=${PROJECT_ID} -dir=/tmp/cloudsql &
PORT=8080 ./hello-release start
gcloud compute instances get-serial-port-output shows
...
Feb 23 18:02:35 hello-instance startup-script: INFO startup-script: + PORT=8080 ./hello-release start
Feb 23 18:02:35 hello-instance startup-script: INFO startup-script: + ./cloud_sql_proxy -projects= hello -dir=/tmp/cloudsql
Feb 23 18:02:35 hello-instance startup-script: INFO startup-script: 2019/02/23 18:02:35 Rlimits for file descriptors set to {&{8500 8500}}
Feb 23 18:02:35 hello-instance startup-script: INFO startup-script: ./hello-release: 31: exec: /app/hello_rc_exec.sh: not found
Feb 23 18:02:39 hello-instance startup-script: INFO startup-script: 2019/02/23 18:02:39 Listening on /tmp/cloudsql/hello:asia-east1:hello-db/.s.PGSQL.5432 for hello:asia-east1: hello-db
Feb 23 18:02:39 hello-instance startup-script: INFO startup-script: 2019/02/23 18:02:39 Ready for new connections
Feb 23 18:08:08 hello-instance ntpd[656]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
hello_rc_exec.sh is generated after initializing Distillery. It is stored at _build/prod/rel/hello/bin/hello_rc_exec.sh.
firewall rules
NAME NETWORK DIRECTION PRIORITY ALLOW DENY DISABLED
default-allow-http-8080 default INGRESS 1000 tcp:8080 False
...
I also ran ps aux | grep erl on the instance:
hello_team@hello-instance:~$ ps aux | grep erl
hello_t+ 23166 0.0 0.0 12784 1032 pts/0 S+ 08:04 0:00 grep erl
I'm not sure what information is needed to fix this; please ask and I will provide it.
Thank you
For posterity, here was the solution (worked out in this forum thread).
First, the poster had uploaded the hello file instead of hello.run to cloud storage. The tutorial intentionally specifies uploading hello.run because it is a full executable archive of the entire release, whereas hello is merely a wrapper script and is by itself not capable of executing the app. So this modification to the procedure needed to be reverted.
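Concretely, that means uploading the .run file exactly as the tutorial shows (this repeats the correct command from the question):
gsutil cp _build/prod/rel/hello/bin/hello.run \
gs://${BUCKET_NAME}/hello-release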
Second, the poster's app included the elixir_bcrypt library. This library includes a NIF whose platform-specific binary code is built in the deps directory (instead of the _build directory). The tutorial's procedure doesn't properly clean out binaries in deps prior to cross-compiling for deployment, and so the poster's macOS-built bcrypt library was leaking into the build. When deployed to Compute Engine on Debian, this crashed on initialization. The poster fixed this problem by deleting the deps directory and re-installing dependencies while cross-compiling.
It was also noted during the discussion that the tutorial promoted a poor practice of mounting the user's app in a volume when doing a Docker cross-compilation. Instead, it should simply copy the app into the image, perform the entire build there, and use docker cp to extract the built artifact. This practice would have prevented this issue. A work item was filed to modify the tutorial accordingly.
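A minimal sketch of that copy-based approach, assuming an Elixir base image and the hello release from the tutorial (the image tag is illustrative, and Distillery 2.1+ renamed the release task to mix distillery.release):
# Dockerfile
FROM elixir:1.8
WORKDIR /app
COPY . /app
ENV MIX_ENV=prod
RUN mix local.hex --force && mix local.rebar --force && \
    mix deps.get --only prod && mix distillery.release --executable
Then build and extract the artifact on the host:
$ docker build -t hello-build .
$ docker create --name hello-tmp hello-build
$ docker cp hello-tmp:/app/_build/prod/rel/hello/bin/hello.run .
$ docker rm hello-tmp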
The solution is here.
Thank you for the help everyone!

What is wrong with the setup of Hyperledger Fabric?

Because I want to install a fresh version of Hyperledger Fabric, I deleted the old Hyperledger files from a month ago and ran "vagrant destroy".
I ran "vagrant up" and "vagrant ssh" successfully.
I ran "make peer" successfully, but when I ran "peer", it failed.
When I run "make peer" and "peer" again, the errors pop up as below:
vagrant@ubuntu-1404:/opt/gopath/src/github.com/hyperledger/fabric$ make peer
make: Nothing to be done for `peer'.
vagrant@ubuntu-1404:/opt/gopath/src/github.com/hyperledger/fabric$ peer
No command 'peer' found, did you mean:
Command 'pee' from package 'moreutils' (universe)
Command 'beer' from package 'gerstensaft' (universe)
Command 'peel' from package 'ears' (universe)
Command 'pear' from package 'php-pear' (main)
peer: command not found
vagrant@ubuntu-1404:/opt/gopath/src/github.com/hyperledger/fabric$
vagrant@ubuntu-1404:/opt/gopath/src/github.com/hyperledger/fabric$ cd peer
vagrant@ubuntu-1404:/opt/gopath/src/github.com/hyperledger/fabric/peer$ ls -l
total 60
drwxr-xr-x 1 vagrant vagrant 204 Jun 26 01:16 bin
-rw-r--r-- 1 vagrant vagrant 17342 Jun 25 14:18 core.yaml
-rw-r--r-- 1 vagrant vagrant 35971 Jun 25 14:18 main.go
-rw-r--r-- 1 vagrant vagrant 1137 Jun 23 08:46 main_test.go
The peer binary is located in the ./build/bin/ folder.
For your configuration, the full path is "/opt/gopath/src/github.com/hyperledger/fabric/build/bin/".
One thing I observed when I pulled the code from GitHub last week (Thursday, to be exact): the make command created the executable in "/opt/gopath/src/github.com/hyperledger/fabric/build/bin/", but it had also copied the same binary to "/hyperledger/build/bin", and the $PATH variable then included "/hyperledger/build/bin" as well.
So, to answer your question, you have two options:
1. Retain your current version of the code, navigate into the build/bin folder in the fabric directory, and check whether the peer executable is present there. If yes, execute it from there.
2. Pull the latest copy from gitHub.com and run make peer from the fabric directory as usual; then you can execute peer from anywhere. :)
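If the peer executable is present but not on your PATH, a minimal sketch of making it callable from anywhere (path taken from above):
$ export PATH="/opt/gopath/src/github.com/hyperledger/fabric/build/bin:${PATH}"
$ which peer
/opt/gopath/src/github.com/hyperledger/fabric/build/bin/peer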

Why does spark-shell --master yarn-client fail (yet pyspark --master yarn seems to work)?

I'm trying to run the spark shell on my Hadoop cluster via Yarn.
I use
Hadoop 2.4.1
Spark 1.0.0
My Hadoop cluster already works. In order to use Spark, I built Spark as described here:
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.1 -DskipTests clean package
The compilation works fine, and I can run spark-shell without trouble. However, running it on YARN:
spark-shell --master yarn-client
gets me the following error:
14/07/07 11:30:32 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1404725422955
yarnAppState: ACCEPTED
14/07/07 11:30:33 INFO cluster.YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1404725422955
yarnAppState: FAILED
org.apache.spark.SparkException: Yarn application already ended,might be killed or not able to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApp(YarnClientSchedulerBackend.scala:105)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:82)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:136)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:318)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:957)
at $iwC$$iwC.<init>(<console>:8)
at $iwC.<init>(<console>:14)
at <init>(<console>:16)
at .<init>(<console>:20)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:121)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:120)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:263)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:120)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:56)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:913)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:142)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:56)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:104)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:56)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:930)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Spark manages to communicate with my cluster, but it doesn't work out.
Another interesting thing is that I can access my cluster using pyspark --master yarn. However, I get the following warning
14/07/07 14:10:11 WARN cluster.YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
and the computation never finishes when doing something as simple as
sc.wholeTextFiles('hdfs://vm7x64.fr/').collect()
What may be causing this problem?
Please check whether your Hadoop cluster is running correctly.
On the master node, the following YARN process must be running:
$ jps
24970 ResourceManager
On slave nodes/executors:
$ jps
14389 NodeManager
Also make sure that you created links to (or copied) the Hadoop configuration files in the Spark config directory:
$ ll /spark/conf/ | grep site
lrwxrwxrwx 1 hadoop hadoop 33 Jun 8 18:13 core-site.xml -> /hadoop/etc/hadoop/core-site.xml
lrwxrwxrwx 1 hadoop hadoop 33 Jun 8 18:13 hdfs-site.xml -> /hadoop/etc/hadoop/hdfs-site.xml
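If the links are missing, a minimal sketch of creating them (paths taken from the listing above; depending on your setup you may also need yarn-site.xml):
$ ln -s /hadoop/etc/hadoop/core-site.xml /spark/conf/core-site.xml
$ ln -s /hadoop/etc/hadoop/hdfs-site.xml /spark/conf/hdfs-site.xml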
You can also check the ResourceManager Web UI on port 8088 - http://master:8088/cluster/nodes. It must show the list of available nodes and their resources.
Take a look at your log files using the following command (you can find the application ID in the Web UI):
$ yarn logs -applicationId <yourApplicationId>
Or you can look directly at the full log files on the Master/ResourceManager host:
$ ll /hadoop/logs/ | grep resourcemanager
-rw-rw-r-- 1 hadoop hadoop 368414 Jun 12 18:12 yarn-hadoop-resourcemanager-master.log
-rw-rw-r-- 1 hadoop hadoop 2632 Jun 12 17:52 yarn-hadoop-resourcemanager-master.out
And on Slave/NodeManager hosts:
$ ll /hadoop/logs/ | grep nodemanager
-rw-rw-r-- 1 hadoop hadoop 284134 Jun 12 18:12 yarn-hadoop-nodemanager-slave.log
-rw-rw-r-- 1 hadoop hadoop 702 Jun 9 14:47 yarn-hadoop-nodemanager-slave.out
Also check that all environment variables are set correctly:
HADOOP_CONF_LIB_NATIVE_DIR=/hadoop/lib/native
HADOOP_MAPRED_HOME=/hadoop
HADOOP_COMMON_HOME=/hadoop
HADOOP_HDFS_HOME=/hadoop
YARN_HOME=/hadoop
HADOOP_INSTALL=/hadoop
HADOOP_CONF_DIR=/hadoop/etc/hadoop
YARN_CONF_DIR=/hadoop/etc/hadoop
SPARK_HOME=/spark
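For YARN mode, HADOOP_CONF_DIR and YARN_CONF_DIR are the critical ones; one common place to set them is Spark's environment file, e.g. (a sketch, with paths reused from the list above):
$ cat /spark/conf/spark-env.sh
export HADOOP_CONF_DIR=/hadoop/etc/hadoop
export YARN_CONF_DIR=/hadoop/etc/hadoop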