Hadoop web UI localhost:50070 cannot be opened - Hadoop 3

Ubuntu 16.04.1 LTS
Hadoop 3.3.1
I am trying to set up Hadoop in pseudo-distributed mode by following an online tutorial, with the steps below.
Step 1: Set up Hadoop
1. Add the following lines to /etc/profile:
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
2. In $HADOOP_HOME/etc/hadoop/hadoop-env.sh, set
export JAVA_HOME=/opt/jdk1.8.0_261
core-site.xml:
<configuration>
<property>
<name>fs.default.name </name>
<value> hdfs://localhost:9000 </value>
</property>
</configuration>
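One thing worth noting in the core-site.xml above: there is padded whitespace inside the <name> and <value> tags (e.g. `fs.default.name ` with a trailing space). Stray whitespace in these fields has been known to cause hard-to-diagnose errors in some Hadoop versions. As a purely illustrative sketch (a standalone checker, not part of Hadoop), you could flag such entries:

```python
# Illustrative only: flag Hadoop config properties whose <name>/<value>
# text carries leading or trailing whitespace.
import xml.etree.ElementTree as ET

def find_padded_entries(xml_text):
    """Return (name, value) pairs (stripped) whose raw text was padded."""
    root = ET.fromstring(xml_text)
    padded = []
    for prop in root.iter("property"):
        name = prop.findtext("name") or ""
        value = prop.findtext("value") or ""
        if name != name.strip() or value != value.strip():
            padded.append((name.strip(), value.strip()))
    return padded

sample = """<configuration>
<property>
<name>fs.default.name </name>
<value> hdfs://localhost:9000 </value>
</property>
</configuration>"""

print(find_padded_entries(sample))
# [('fs.default.name', 'hdfs://localhost:9000')]
```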
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoop/pseudo/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoop/pseudo/hdfs/datanode</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Step 2: Verify Hadoop
1. $ hdfs namenode -format
2.
sudo apt-get install ssh
ssh-keygen -t rsa
ssh-copy-id hadoop@ubuntu
cd ~/hadoop/sbin
start-dfs.sh
3. start-yarn.sh
4. Open http://localhost:50070/ in Firefox on the local machine.
Unable to connect
Firefox can't establish a connection to the server at localhost:50070.
The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer's network connection.
If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web.
5. Open http://localhost:8088/ in Firefox; it returns the same error as for port 50070.
When I run the jps command, it returns:
hadoop@ubuntu:~/hadoop/sbin$ jps
32501 DataNode
32377 NameNode
32682 SecondaryNameNode
32876 Jps

Since Hadoop 3.0.0-alpha1 there has been a change in the port configuration:
http://localhost:50070
was moved to
http://localhost:9870
see https://issues.apache.org/jira/browse/HDFS-9427
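The answer above covers the NameNode UI; per the HDFS-9427 release notes, the other HDFS web UI defaults moved as well (the values below are taken from those notes; double-check them against your release). A small lookup sketch:

```python
# Default HDFS web UI ports before and after Hadoop 3 (per HDFS-9427
# release notes). The YARN ResourceManager UI (8088) was not moved.
HADOOP2_TO_HADOOP3_UI_PORTS = {
    50070: 9870,  # NameNode HTTP
    50470: 9871,  # NameNode HTTPS
    50090: 9868,  # Secondary NameNode HTTP
    50075: 9864,  # DataNode HTTP
}

def hadoop3_ui_port(hadoop2_port):
    """Map a pre-3.0 default web UI port to its Hadoop 3 default
    (returned unchanged if it was not moved)."""
    return HADOOP2_TO_HADOOP3_UI_PORTS.get(hadoop2_port, hadoop2_port)

print(hadoop3_ui_port(50070))  # 9870
print(hadoop3_ui_port(8088))   # 8088 (unchanged)
```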

Related

oozie error: Accessing local file system is not allowed

A Sqoop import action is giving an error while running as an Oozie job.
I am using a pseudo-distributed Hadoop cluster.
I have followed these steps:
1. Started the Oozie server
2. Edited the job.properties and workflow.xml files
3. Copied workflow.xml into HDFS
4. Ran the Oozie job
My job.properties file:
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/hduser/${examplesRoot}/apps/sqoop
My workflow.xml file:
<action name="sqoop-node">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/hduser/${examplesRoot}/output-data/sqoop"/>
<!--<mkdir path="${nameNode}/user/hduser/${examplesRoot}/output-data"/>-->
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<command>import --connect "jdbc:mysql://localhost/db" --username user --password pass --table "table" --where "Conditions" --driver com.mysql.jdbc.Driver --target-dir ${nameNode}/user/hduser/${examplesRoot}/output-data/sqoop -m 1</command>
<!--<file>db.hsqldb.properties#db.hsqldb.properties</file>
<file>db.hsqldb.script#db.hsqldb.script</file>-->
</sqoop>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
I was expecting the job to run without any errors, but it was killed with the following error:
UnsupportedOperationException: Accessing local file system is not allowed.
I don't understand where I went wrong and why it does not allow the job to complete.
Can anyone help me solve this issue?
The Oozie sharelib (with the Sqoop action's dependencies) is stored on HDFS, and the server needs to know how to communicate with the Hadoop cluster. Access to a sharelib stored on a local filesystem is not allowed; see CVE-2017-15712.
Please review conf/hadoop-conf/core-site.xml and make sure it does not use the local filesystem. For example, if your HDFS namenode listens on port 9000 on localhost, configure fs.defaultFS accordingly:
<configuration>
...
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
...
</configuration>
Alternatively, you can remove the RawLocalFileSystem class (a dummy implementation) and restart the server, but this isn't recommended (it leaves the server vulnerable to CVE-2017-15712).
Hope this helps. Also see this answer.
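As a quick sanity check of the idea above (purely illustrative, not Oozie's actual implementation): the guard rejects a default filesystem that resolves to the local filesystem, which you can approximate by inspecting the URI scheme of fs.defaultFS:

```python
# Illustrative sketch: decide whether an fs.defaultFS value would
# resolve to the local filesystem (which the Oozie guard rejects).
from urllib.parse import urlparse

def uses_local_fs(fs_default_fs):
    """True if the fs.defaultFS URI points at the local filesystem."""
    scheme = urlparse(fs_default_fs).scheme
    return scheme in ("", "file")

print(uses_local_fs("hdfs://localhost:9000"))  # False -> fine for the sharelib
print(uses_local_fs("file:///"))               # True  -> would be rejected
```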

AWS EMR Phoenix Kerberos not working

I am trying to get Phoenix to work with Kerberos.
The Kerberos realm is set up, and I am able to generate tickets from the keytab /etc/hbase.keytab,
but the Phoenix sqlline client is giving the following error:
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the locations (state=,code=0)
I have made the following changes:
# Issue ticket for principal
kinit -t /etc/hbase.keytab hbase/HOSTNAME.RealmName@RealmName
<!-- Configuring Phoenix Kerberos properties in hbase-site.xml -->
<property>
<name>phoenix.queryserver.kerberos.principal</name>
<value>hbase/_HOST@RealmName</value>
</property>
<property>
<name>phoenix.queryserver.kerberos.keytab</name>
<value>/etc/hbase.keytab</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase-secure</value>
</property>
<property>
<name>phoenix.queryserver.keytab.file</name>
<value>/etc/hbase.keytab</value>
</property>
<property>
<name>phoenix.queryserver.http.keytab.file</name>
<value>/etc/hbase.keytab</value>
</property>
<property>
<name>phoenix.queryserver.kerberos.http.principal</name>
<value>hbase/_HOST@RealmName</value>
</property>
<property>
<name>phoenix.queryserver.dns.nameserver</name>
<value>HOSTNAME.RealmName</value>
</property>
<property>
<name>phoenix.queryserver.dns.interface</name>
<value>eth0</value>
</property>
# Set the HBase conf path (this seems to be a bug in EMR; it should be set by default)
sudo vi ~/.bashrc
add "export HBASE_CONF_DIR=/etc/hbase/conf/"
source ~/.bashrc
Starting the Phoenix client:
/usr/lib/phoenix/bin/sqlline.py HOSTNAME.RealmName:2181:/hbase-secure:hbase/HOSTNAME.RealmName@RealmName:/etc/hbase.keytab
This seems to be a bug in EMR 5.11.
The issue was resolved after setting HBASE_CONF_DIR in ~/.bashrc:
sudo vi ~/.bashrc
add "export HBASE_CONF_DIR=/etc/hbase/conf/"
source ~/.bashrc
run phoenix
/usr/lib/phoenix/bin/sqlline.py
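For reference, the long sqlline.py argument used above is just five colon-separated fields: ZooKeeper quorum host, port, znode parent, principal, and keytab. A tiny helper (hypothetical, for illustration only; the host and realm below are placeholders) that assembles it:

```python
# Illustrative helper: build the colon-separated connect string that
# sqlline.py takes (quorum:port:znode:principal:keytab, as used above).
def sqlline_url(zk_host, principal, keytab,
                zk_port=2181, znode="/hbase-secure"):
    """Assemble the sqlline.py Kerberos connect string."""
    return ":".join([zk_host, str(zk_port), znode, principal, keytab])

print(sqlline_url("host.example.com",
                  "hbase/host.example.com@EXAMPLE.COM",
                  "/etc/hbase.keytab"))
# host.example.com:2181:/hbase-secure:hbase/host.example.com@EXAMPLE.COM:/etc/hbase.keytab
```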

Unable to connect Cassandra Cluster in AWS from EC2 instance

I set up a Cassandra cluster in AWS using the DataStax AMI and ran the Cassandra service. I am trying to connect to this Cassandra service from another EC2 instance where Titan is installed. The Titan server version is 0.4.4; I also tried 0.5.3, but got the same error.
Cassandra is the backend storage for Titan.
The error is:
20366 [main] WARN com.tinkerpop.rexster.config.GraphConfigurationContainer - Could not load graph graph. Please check the XML configuration.
20367 [main] WARN com.tinkerpop.rexster.config.GraphConfigurationContainer - GraphConfiguration could not be found or otherwise instantiated: [com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration]. Ensure that it is in Rexster's path.
com.tinkerpop.rexster.config.GraphConfigurationException: GraphConfiguration could not be found or otherwise instantiated: [com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration]. Ensure that it is in Rexster's path.
at com.tinkerpop.rexster.config.GraphConfigurationContainer.getGraphFromConfiguration(GraphConfigurationContainer.java:137)
at com.tinkerpop.rexster.config.GraphConfigurationContainer.<init>(GraphConfigurationContainer.java:54)
at com.tinkerpop.rexster.server.XmlRexsterApplication.reconfigure(XmlRexsterApplication.java:99)
at com.tinkerpop.rexster.server.XmlRexsterApplication.<init>(XmlRexsterApplication.java:47)
at com.tinkerpop.rexster.Application.<init>(Application.java:96)
at com.tinkerpop.rexster.Application.main(Application.java:188)
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager
at com.thinkaurelius.titan.diskstorage.Backend.instantiate(Backend.java:355)
at com.thinkaurelius.titan.diskstorage.Backend.getImplementationClass(Backend.java:367)
at com.thinkaurelius.titan.diskstorage.Backend.getStorageManager(Backend.java:311)
at com.thinkaurelius.titan.diskstorage.Backend.<init>(Backend.java:121)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.getBackend(GraphDatabaseConfiguration.java:1173)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.<init>(StandardTitanGraph.java:75)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:40)
at com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration.configureGraphInstance(TitanGraphConfiguration.java:25)
at com.tinkerpop.rexster.config.GraphConfigurationContainer.getGraphFromConfiguration(GraphConfigurationContainer.java:119)
Configuration file:
<rexster>
<http>
<server-port>7182</server-port>
<server-host>0.0.0.0</server-host>
<base-uri>http://localhost</base-uri>
<web-root>public</web-root>
<character-set>UTF-8</character-set>
<enable-jmx>false</enable-jmx>
<enable-doghouse>true</enable-doghouse>
<max-post-size>2097152</max-post-size>
<max-header-size>8192</max-header-size>
<upload-timeout-millis>30000</upload-timeout-millis>
<thread-pool>
<worker>
<core-size>8</core-size>
<max-size>8</max-size>
</worker>
<kernal>
<core-size>4</core-size>
<max-size>4</max-size>
</kernal>
</thread-pool>
<io-strategy>leader-follower</io-strategy>
</http>
<rexpro>
<server-port>7180</server-port>
<server-host>0.0.0.0</server-host>
<session-max-idle>1790000</session-max-idle>
<session-check-interval>3000000</session-check-interval>
<connection-max-idle>180000</connection-max-idle>
<connection-check-interval>3000000</connection-check-interval>
<enable-jmx>false</enable-jmx>
<thread-pool>
<worker>
<core-size>8</core-size>
<max-size>8</max-size>
</worker>
<kernal>
<core-size>4</core-size>
<max-size>4</max-size>
</kernal>
</thread-pool>
<io-strategy>leader-follower</io-strategy>
</rexpro>
<shutdown-port>7183</shutdown-port>
<shutdown-host>127.0.0.1</shutdown-host>
<graphs>
<graph>
<graph-name>graph</graph-name>
<graph-type>com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration</graph-type>
<graph-location>/tmp/titan</graph-location>
<graph-read-only>false</graph-read-only>
<properties>
<storage.hostname>ec2-52-22-199-210.amazonaws.com</storage.hostname>
<storage.backend>cassandra</storage.backend>
</properties>
<extensions>
<allows>
<allow>tp:gremlin</allow>
</allows>
</extensions>
</graph>
</graphs>
</rexster>

Scalatra app on Openshift - setting Jetty IP

I'm trying to deploy a minimal Scalatra application on Openshift with DIY cartridge. I've managed to get SBT working, but when it comes to container:start, I get the error:
FAILED SelectChannelConnector#0.0.0.0:8080: java.net.SocketException: Permission denied
Apparently, embedded Jetty tries to open a socket at 0.0.0.0, which is prohibited by OpenShift (you can only open ports at $OPENSHIFT_INTERNAL_IP). How can I tell Jetty which IP it should listen on?
Yes you are right about $OPENSHIFT_INTERNAL_IP. So edit ${jetty.home}/etc/jetty.xml and set jetty.host in the connector section as follows:
...
<Set name="connectors">
<Array type="org.mortbay.jetty.Connector">
<Item>
<New class="org.mortbay.jetty.nio.SelectChannelConnector">
<Set name="host"><SystemProperty name="jetty.host" />$OPENSHIFT_INTERNAL_IP</Set>
<Set name="port"><SystemProperty name="jetty.port" default="8080"/></Set>
...
</New>
</Item>
</Array>
</Set>
hth
I've never used Openshift, so I'm groping a bit here.
Do you have a jetty.host set?
You may need to set up a jetty.xml file and set it in there. See http://docs.codehaus.org/display/JETTY/Newbie+Guide+to+Jetty for how to set the host. You can tell the xsbt web plugin about jetty.xml by setting your project up like this:
https://github.com/JamesEarlDouglas/xsbt-web-plugin/wiki/Settings
Alternatively, you may be able to pass the parameter to Jetty during startup. That would look like this: -Djetty.host="yourhostname"
To get running with Jetty 9.2.13.v20150730 on OpenShift with a DIY cartridge, you have to run with Java 8, setting it to bind to $OPENSHIFT_INTERNAL_IP as follows. First, ssh onto the host and download a JDK 8 with:
cd $OPENSHIFT_DATA_DIR
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u5-b13/jdk-8u5-linux-x64.tar.gz
tar -zxf jdk-8u5-linux-x64.tar.gz
export PATH=$OPENSHIFT_DATA_DIR/jdk1.8.0_05/bin:$PATH
export JAVA_HOME="$OPENSHIFT_DATA_DIR/jdk/jdk1.8.0_05"
java -version
Then, in your .openshift/action_hooks/start, ensure you have the same exported variables, with something like:
# see http://stackoverflow.com/a/23895161/329496 to install jdk1.8 no DIY cartridge
export JAVA_HOME="$OPENSHIFT_DATA_DIR/jdk/jdk1.8.0_05"
export PATH=$OPENSHIFT_DATA_DIR/jdk1.8.0_05/bin:$PATH
nohup java -cp ${OPENSHIFT_REPO_DIR}target/dependency/jetty-runner.jar org.eclipse.jetty.runner.Runner --host ${OPENSHIFT_DIY_IP} --port ${OPENSHIFT_DIY_PORT} ${OPENSHIFT_REPO_DIR}/target/thinbus-srp-spring-demo.war > ${OPENSHIFT_LOG_DIR}server.log 2>&1 &
(Note that jdk-8u20-linux-x64.tar.gz has also been reported to work so you may want to check for the latest available.)
That setup does not need a jetty.xml, as it sets --host and --port to bind to the correct interface and runs the built war file. What it does require is that jetty-runner.jar is copied out of the ivy cache into the target folder. With Maven, you do that by adding something like:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>copy</goal>
</goals>
<configuration>
<artifactItems>
<artifactItem>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-runner</artifactId>
<version>${jetty.version}</version>
<destFileName>jetty-runner.jar</destFileName>
</artifactItem>
</artifactItems>
</configuration>
</execution>
</executions>
</plugin>
Google suggests that the SBT equivalent is simply retrieveManaged := true. You can ssh to the host and run find to figure out where the jetty-runner.jar dependency has been copied to, then update the start command appropriately.

How do I start Jetty8 to only serve static content?

For testing purposes I want to use Jetty 8 to serve only static content. I know how to start the webserver from the command line:
java -jar start.jar jetty.port=8082
I would like to be able to use a vanilla Jetty, preferably 8 or 7, and start it using something like:
java -jar start.jar OPTIONS=resources resources.root=../foo jetty.port=8082
The files should then be accessible from the root of the server. A file called ../foo/x.html should be accessible via http://localhost:8082/x.html.
I don't want to create a WAR file or anything fancy. Preferably it shouldn't do any caching on the server side, leaving the files unlocked on Windows machines. Also, I only want to serve files, even located in subdirectories, no fancy file browser or ways to modify them from a client.
Is this possible? If not, what is the minimum configuration needed to accomplish such behavior?
Additional information
I've tried the following command. I expected to be able to browse the javadoc shipped with Jetty 8 using http://localhost:8080/javadoc/, but it always gives me a 404
java -jar start.jar --ini OPTIONS=Server,resources etc/jetty.xml contexts/javadoc.xml
The simplest way to start Jetty and have it serve static content is by using the following xml file:
static-content.xml:
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<Configure id="FileServer" class="org.eclipse.jetty.server.Server">
<Call name="addConnector">
<Arg>
<New class="org.eclipse.jetty.server.nio.SelectChannelConnector">
<Set name="host"><Property name="jetty.host" /></Set>
<Set name="port"><Property name="jetty.port" default="8080"/></Set>
</New>
</Arg>
</Call>
<Set name="handler">
<New class="org.eclipse.jetty.server.handler.ResourceHandler">
<Set name="resourceBase"><Property name="files.base" default="./"/></Set>
</New>
</Set>
</Configure>
Then you can start Jetty using:
java -jar start.jar --ini static-content.xml files.base=./foo jetty.port=8082
If you omit files.base, the current directory will be used; if you omit jetty.port, port 8080 will be used.
The --ini disables the settings from start.ini, which also ensures no other handlers etc. will be activated.
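If it helps to see the semantics of the `<Property>` elements in the XML above: each one resolves to the value passed on the start.jar command line, falling back to its `default` attribute. A sketch of that resolution (illustrative only, not Jetty's implementation):

```python
# Illustrative sketch of Jetty's <Property name=... default=.../> lookup:
# use the command-line value if given, otherwise the default attribute.
def resolve_property(props, name, default=None):
    """Resolve a named property against command-line overrides."""
    return props.get(name, default)

# e.g. started with: java -jar start.jar static-content.xml files.base=./foo
cmdline = {"files.base": "./foo"}
print(resolve_property(cmdline, "files.base", "./"))    # ./foo
print(resolve_property(cmdline, "jetty.port", "8080"))  # 8080 (default)
```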
A bit off-topic, but somebody using Maven may wish to try something like this (supposing that static resources have been copied to target/web):
<plugin>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-maven-plugin</artifactId>
<version>8.1.9.v20130131</version>
<executions>
<execution>
<id>start-jetty</id>
<phase>install</phase>
<goals>
<goal>start</goal>
</goals>
<configuration>
<webAppConfig>
<contextPath>/</contextPath>
<resourceBases>
<resourceBase>${project.build.directory}/web</resourceBase>
</resourceBases>
</webAppConfig>
</configuration>
</execution>
</executions>
</plugin>
In your distribution, under the contexts directory, there is a javadoc.xml that you can use as an example of how to do this easily enough:
http://git.eclipse.org/c/jetty/org.eclipse.jetty.project.git/tree/jetty-distribution/src/main/resources/contexts/javadoc.xml
That is what it actually looks like.
You are looking to change the context path and the resource base.
I would also recommend removing jetty-webapps.xml from the startup in the start.ini file, and removing the context files you don't want to deploy.
You can look at setting some of the other options in the start.ini file as well if you like:
http://wiki.eclipse.org/Jetty/Feature/Start.jar
Go there for information on the start process.
cheers