YARN cannot connect to NameNode cluster - HDFS

2018-03-08 16:36:16,775 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://mycluster/user/abc_user/udf/pig_udf-1.5.7_handle_input_error.jar, 1516336589685, FILE, null }
2018-03-08 16:36:16,775 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download resource { { hdfs://mycluster/user/oozie/share/lib/lib_20171215093741/pig/libgplcompression.so.0.0.0, 1513307849411, FILE, null },pending,[(container_1519371600813_0002_02_000001)],8140205165392614,DOWNLOADING}
java.lang.IllegalArgumentException: java.net.UnknownHostException: mycluster
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:406)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:728)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:671)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:155)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:249)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: mycluster
The YARN NodeManager service and the DataNode service are on the same machine.
The YARN ResourceManager service and the NameNode are on the same machine.
When I run a simple Pig script that loads data and prints it, I hit the above error.
Before adding the standby NameNode, everything worked well.
How can I configure YARN to understand my NameNode cluster?
Thank you.

After checking hdfs-site.xml again on the two DataNodes where the YARN NodeManagers run, I saw that the file was missing the following property compared with the hdfs-site.xml on the NameNode:
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
It is working now.
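For anyone comparing files: the failover proxy provider is only one piece of the client-side HA configuration. Every host that acts as an HDFS client (the NodeManagers included) also needs the logical nameservice defined in its hdfs-site.xml; a minimal sketch, with nn1/nn2 and the host names as placeholders:
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>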

Related

Notary cannot install Corda service

I was trying to configure Business Network Operator services in my solution by adding the toolkit provided by R3 as a CorDapp dependency in my application. I am able to build the application, but when I run the nodes I am getting an error for the Notary.
UPDATE
I am adding the log:
[ERROR] 2020-09-04T14:21:15,399Z [main] internal.Node. - Unable to install Corda service com.r3.businessnetworks.membership.flows.bno.service.BNOConfigurationService - [errorCode=dfc7g6, moreInformationAt=https://errors.corda.net/OS/4.5/dfc7g6]
java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_201]
[...]
at net.corda.node.Corda.main(Corda.kt:13) ~[corda-node-4.5.jar:?]
Caused by: java.lang.NullPointerException
at com.r3.businessnetworks.membership.flows.ConfigUtils.loadConfig(ConfigUtils.kt:16) ~[?:?]
at com.r3.businessnetworks.membership.flows.bno.service.BNOConfigurationService.<init>(BNOConfigurationService.kt:21) ~[?:?]
... 33 more
[ERROR] 2020-09-04T14:21:15,458Z [main] internal.NodeStartupLogging. - Exception during node startup - [errorCode=dfc7g6, moreInformationAt=https://errors.corda.net/OS/4.5/dfc7g6]
java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_201]
[...]
at net.corda.cliutils.CordaCliWrapperKt.start(CordaCliWrapper.kt:89) ~[corda-tools-cliutils-4.5.jar:?]
at net.corda.node.Corda.main(Corda.kt:13) ~[corda-node-4.5.jar:?]
Caused by: java.lang.NullPointerException
at com.r3.businessnetworks.membership.flows.ConfigUtils.loadConfig(ConfigUtils.kt:16) ~[?:?]
at com.r3.businessnetworks.membership.flows.bno.service.BNOConfigurationService.<init>(BNOConfigurationService.kt:21) ~[?:?]
... 33 more
You are missing the configuration file of this CorDapp, as explained here; you must:
1. Create a config folder inside your node's cordapps folder (i.e. node-folder/cordapps/config).
2. Inside that folder, create a membership-service.conf file.
3. Inside that file, add:
// Whitelist of accepted BNOs. Attempting to communicate with a
// non-whitelisted BNO results in an exception.
bnoWhitelist = ["O=BNO,L=New York,C=US", "O=BNO,L=London,C=GB"]
// Name of the notary to use for BNO transactions such as membership approval
notaryName = "O=Notary,L=London,C=GB"
The CorDapp that you're using relies on a configuration file (the above 3 steps create that file), and its absence is what causes the NullPointerException. To understand more about CorDapp configuration files, read my article.
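For illustration only: membership-service.conf is a HOCON file, so a plain sketch with the Typesafe Config library (not the CorDapp's actual code; the path and keys follow the steps above) shows what loading it amounts to:
// If the file is missing, parseFile() yields an empty config and the first
// lookup throws - consistent with the node failing at startup when the
// file is absent.
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;
import java.io.File;
import java.util.List;

public class MembershipConfigSketch {
    public static void main(String[] args) {
        // Path relative to the node folder, matching step 1 above.
        File confFile = new File("cordapps/config/membership-service.conf");
        Config config = ConfigFactory.parseFile(confFile);
        List<String> bnoWhitelist = config.getStringList("bnoWhitelist");
        String notaryName = config.getString("notaryName");
        System.out.println("BNO whitelist: " + bnoWhitelist);
        System.out.println("Notary: " + notaryName);
    }
}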
On a side note, according to this, the CorDapp that you're using will be deprecated on 31 September 2020.

AWS EMR InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist

I am launching an EMR cluster at run time based on a user event, and once the job is done the cluster is terminated.
However, when the cluster is launched and the tasks are being executed, I am getting the error below.
I read some posts suggesting that yarn-site.xml needs to be updated on the NameNode and DataNodes and YARN restarted.
I am not sure how to configure this during the launch of the cluster itself.
org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
Container launch failed for container_1523533251407_0001_01_000002 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:390)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Thanks.
Answer:
Here is what I added in my code to resolve the issue:
// Configuration here is com.amazonaws.services.elasticmapreduce.model.Configuration.
// The classification must be "yarn-site" (not "yarn-env") so the properties
// land in yarn-site.xml on the cluster nodes.
Map<String, String> yarnProperties = new HashMap<String, String>();
yarnProperties.put("yarn.nodemanager.aux-services", "mapreduce_shuffle");
yarnProperties.put("yarn.nodemanager.aux-services.mapreduce_shuffle.class", "org.apache.hadoop.mapred.ShuffleHandler");
Configuration yarnConfig = new Configuration()
    .withClassification("yarn-site")
    .withProperties(yarnProperties);
RunJobFlowRequest request = new RunJobFlowRequest()
    .withConfigurations(yarnConfig);
We were also setting some other properties in yarn-site.xml.
In case you are creating the cluster using the AWS CLI, you can use
--configurations 'json file with the config'
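A sketch of what that JSON file can look like, mirroring the yarn-site snippet above (the file name configurations.json is arbitrary):
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.aux-services": "mapreduce_shuffle",
      "yarn.nodemanager.aux-services.mapreduce_shuffle.class": "org.apache.hadoop.mapred.ShuffleHandler"
    }
  }
]
and the create call would then reference it roughly as:
aws emr create-cluster --release-label emr-5.13.0 --configurations file://./configurations.json ...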
If instead you are creating the cluster through Java, for example:
Application hive = new Application().withName("Hive");
Map<String,String> hiveProperties = new HashMap<String,String>();
hiveProperties.put("hive.join.emit.interval","1000");
hiveProperties.put("hive.merge.mapfiles","true");
Configuration myHiveConfig = new Configuration()
    .withClassification("hive-site")
    .withProperties(hiveProperties);
Then you can reference it as:
RunJobFlowRequest request = new RunJobFlowRequest()
    .withName("Create cluster with ReleaseLabel")
    .withReleaseLabel("emr-5.13.0")
    .withApplications(hive)
    .withConfigurations(myHiveConfig);
For the other problem: you need to add these two properties in the way shown above and then create the cluster:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
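Putting it together: the XML above is what the yarnConfig snippet produces, and withConfigurations accepts multiple Configuration objects in the v1 AWS SDK for Java, so both classifications can go on one request. A short sketch under that assumption:
// Both Configuration objects from the snippets above, passed on one request.
RunJobFlowRequest request = new RunJobFlowRequest()
    .withName("Create cluster with ReleaseLabel")
    .withReleaseLabel("emr-5.13.0")
    .withApplications(hive)
    .withConfigurations(yarnConfig, myHiveConfig);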

AWS EMR Mapreduce failure

We have an installation of AWS EMR in a client environment. Encryption in transit and encryption at rest have been enabled using a security configuration. We continue to get the MapReduce errors below when we execute a simple Hive query.
Diagnostic Messages for this Task:
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError:
error in shuffle in fetcher#1
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:366)
at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:288)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.openShuffleUrl(Fetcher.java:282)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:323)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
Please let me know if anyone has faced this error before.

Deduce the HDFS path at runtime on EMR

I have spawned an EMR cluster with an EMR step to copy a file from S3 to HDFS and vice-versa using s3-dist-cp.
This cluster is an on-demand cluster, so we are not keeping track of the IP.
The first EMR step is:
hadoop fs -mkdir /input - This step completed successfully.
The second EMR step is the following command:
s3-dist-cp --s3Endpoint=s3.amazonaws.com --src=s3://<bucket-name>/<folder-name>/sample.txt --dest=hdfs:///input - This step FAILED
I get the following error:
Error: java.lang.IllegalArgumentException: java.net.UnknownHostException: sample.txt
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2717)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2751)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2733)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:377)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at com.amazon.elasticmapreduce.s3distcp.CopyFilesReducer.reduce(CopyFilesReducer.java:213)
at com.amazon.elasticmapreduce.s3distcp.CopyFilesReducer.reduce(CopyFilesReducer.java:28)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:635)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.net.UnknownHostException: sample.txt
But this file does exist on S3, and I can read it through my Spark application on EMR.
The solution: when using s3-dist-cp, the filename should not be mentioned in the source or the destination; point --src at the containing folder instead.
If you want to filter files in the src directory, you can use the --srcPattern option,
e.g.: s3-dist-cp --s3Endpoint=s3.amazonaws.com --src=s3://<bucket-name>/<folder-name>/ --dest=hdfs:///input/ --srcPattern=.*sample\.txt.*

What is wrong with my configuration? AWS - IoT using MQTT

I am following the developer guide for Amazon Web Services IoT, but I have run into some beginner problems and I don't know how to solve them. I have been stuck for two days. This is what has happened (I am using a Mac):
1st problem:
In this guide: http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
I can see and edit my profile by calling "$ aws configure" in the terminal. If I call "ls ~/.aws" I can see the config and credentials files. But if I call "~/.aws/credentials" or "~/.aws/config", I get the following error: -bash: /Users/christopher/.aws/credentials: Permission denied.
If I search in Finder, no file named aws is found.
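A note on that Permission denied error: typing a bare path asks the shell to execute the file, and these files are not executable. To view them, read them instead; the folder also starts with a dot, which is why Finder hides it by default. A quick sketch:
cat ~/.aws/credentials   # print the file instead of trying to execute it
cat ~/.aws/config        # likewise for the config file
ls -a ~                  # -a also lists hidden entries such as .aws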
2nd problem:
On the next page of the guide, at step 11, I can't get the dot to go green because I get an error when I press connect. The MQTT.fx tool says in its log that it can't find the rootCA.pem file, like this:
2015-11-10 09:55:48,330 INFO --- ScriptsController : Clear console.
2015-11-10 09:55:48,332 INFO --- MqttFX ClientModel : MqttClient with ID fc47354d17e84b6c8507eb1accb61560 assigned.
2015-11-10 09:55:48,340 ERROR --- MqttFX ClientModel : Error when connecting
java.io.FileNotFoundException: rootCA.pem (No such file or directory)
at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_60]
at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_60]
at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_60]
at java.io.FileInputStream.<init>(FileInputStream.java:93) ~[?:1.8.0_60]
at de.jensd.mqttfx.ssl.SSLFellow.loadX509CertificatePem(SSLFellow.java:173) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.ssl.SSLFellow.createSSLSocketFactory(SSLFellow.java:51) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.model.MqttFXClientModel.getMqttConnectOptions(MqttFXClientModel.java:713) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.model.MqttFXClientModel.connect(MqttFXClientModel.java:420) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.services.BrokerConnectService$1.call(BrokerConnectService.java:68) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.services.BrokerConnectService$1.call(BrokerConnectService.java:65) ~[MQTT.fx-jfx.jar:?]
at javafx.concurrent.Task$TaskCallable.call(Task.java:1423) ~[jfxrt.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_60]
at javafx.concurrent.Service.lambda$null$493(Service.java:725) ~[jfxrt.jar:?]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_60]
at javafx.concurrent.Service.lambda$executeTask$494(Service.java:724) ~[jfxrt.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
2015-11-10 09:55:48,563 INFO --- ScriptsController : Clear console.
2015-11-10 09:55:48,563 ERROR --- BrokerConnectService : FileNotFoundException: rootCA.pem (No such file or directory)
When I click the "..." button next to the certificate, I notice two things:
1. The certificates were created in my user folder, not in the aws(?) config folder.
2. No rootCA.pem certificate has been created.
I followed this tutorial to sign my own rootCA.pem certificate: http://datacenteroverlords.com/2012/03/01/creating-your-own-ssl-certificate-authority/
Then in the MQTT.fx tool I press the "..." buttons again next to each certificate to manually select each certificate. The path to each one is displayed, like this: /Users/christopher/cert.pem
When I try to connect again, I get the following error in the MQTT.fx tool:
2015-11-09 17:00:30,634 INFO --- ScriptsController : Clear console.
2015-11-09 17:00:30,635 ERROR --- BrokerConnectService : NullPointerException: null
2015-11-09 17:43:17,544 INFO --- BrokerConnectorController : onConnect
2015-11-09 17:43:17,592 INFO --- ScriptsController : Clear console.
2015-11-09 17:43:17,595 INFO --- MqttFX ClientModel : MqttClient with ID fc47354d17e84b6c8507eb1accb61560 assigned.
2015-11-09 17:43:17,661 ERROR --- MqttFX ClientModel : Error when connecting
java.lang.NullPointerException
at de.jensd.mqttfx.ssl.SSLFellow.loadPem(SSLFellow.java:221) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.ssl.SSLFellow.loadPrivateKeyPem(SSLFellow.java:184) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.ssl.SSLFellow.createSSLSocketFactory(SSLFellow.java:55) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.model.MqttFXClientModel.getMqttConnectOptions(MqttFXClientModel.java:713) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.model.MqttFXClientModel.connect(MqttFXClientModel.java:420) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.services.BrokerConnectService$1.call(BrokerConnectService.java:68) ~[MQTT.fx-jfx.jar:?]
at de.jensd.mqttfx.services.BrokerConnectService$1.call(BrokerConnectService.java:65) ~[MQTT.fx-jfx.jar:?]
at javafx.concurrent.Task$TaskCallable.call(Task.java:1423) ~[jfxrt.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_60]
at javafx.concurrent.Service.lambda$null$493(Service.java:725) ~[jfxrt.jar:?]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_60]
at javafx.concurrent.Service.lambda$executeTask$494(Service.java:724) ~[jfxrt.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
2015-11-09 17:43:18,472 INFO --- ScriptsController : Clear console.
2015-11-09 17:43:18,473 ERROR --- BrokerConnectService : NullPointerException: null
I can see on the AWS IoT site (the console?) that the certificate, the policy, and the lightbulb are all connected, just like in the tutorial. But the rest is a mystery.
I would be so happy for any help I can get. Thank you!
You have to sign your own CA only when you use your own domain name. The location of the original root certificate can be found at http://docs.aws.amazon.com/iot/latest/developerguide/verify-pub-sub.html
The root CA certificate is not created for you; rather, it is linked in the AWS IoT documentation and is the same for everyone.
See the first paragraph here:
http://docs.aws.amazon.com/iot/latest/developerguide/verify-pub-sub.html
This is the root CA:
https://www.symantec.com/content/en/us/enterprise/verisign/roots/VeriSign-Class%203-Public-Primary-Certification-Authority-G5.pem
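A quick sketch of fetching that root CA from the link above and saving it where MQTT.fx can be pointed at it (the destination path is just an example):
# Download the root CA linked above and save it as rootCA.pem in the home folder
curl -o ~/rootCA.pem "https://www.symantec.com/content/en/us/enterprise/verisign/roots/VeriSign-Class%203-Public-Primary-Certification-Authority-G5.pem"
# Then select ~/rootCA.pem via the "..." button in MQTT.fx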