I am trying to get Phoenix to work with Kerberos.
The Kerberos realm is set up and I am able to generate tickets from the keytab /etc/hbase.keytab,
but the Phoenix sqlline client is giving the following error:
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the locations (state=,code=0)
I have made the following changes.
# Issue a ticket for the principal
kinit -t /etc/hbase.keytab hbase/HOSTNAME.RealmName@RealmName
<!-- Configure Phoenix Kerberos properties in hbase-site.xml -->
<property>
<name>phoenix.queryserver.kerberos.principal</name>
<value>hbase/_HOST@RealmName</value>
</property>
<property>
<name>phoenix.queryserver.kerberos.keytab</name>
<value>/etc/hbase.keytab</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase-secure</value>
</property>
<property>
<name>phoenix.queryserver.keytab.file</name>
<value>/etc/hbase.keytab</value>
</property>
<property>
<name>phoenix.queryserver.http.keytab.file</name>
<value>/etc/hbase.keytab</value>
</property>
<property>
<name>phoenix.queryserver.kerberos.http.principal</name>
<value>hbase/_HOST@RealmName</value>
</property>
<property>
<name>phoenix.queryserver.dns.nameserver</name>
<value>HOSTNAME.RealmName</value>
</property>
<property>
<name>phoenix.queryserver.dns.interface</name>
<value>eth0</value>
</property>
# Set hbase conf path (This seems to be a bug in EMR. This should be set by default.)
sudo vi ~/.bashrc
add "export HBASE_CONF_DIR=/etc/hbase/conf/"
source ~/.bashrc
Start the Phoenix client:
/usr/lib/phoenix/bin/sqlline.py HOSTNAME.RealmName:2181:/hbase-secure:hbase/HOSTNAME.RealmName@RealmName:/etc/hbase.keytab
This seems to be a bug in EMR 5.11
The issue was resolved after setting HBASE_CONF_DIR in ~/.bashrc
sudo vi ~/.bashrc
add "export HBASE_CONF_DIR=/etc/hbase/conf/"
source ~/.bashrc
Run Phoenix:
/usr/lib/phoenix/bin/sqlline.py
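For completeness, a quick sanity check before launching the client (host names and realm are placeholders; adjust the kinit flags to your Kerberos client):

# confirm the environment the client will pick up
echo $HBASE_CONF_DIR        # should print /etc/hbase/conf/
kinit -kt /etc/hbase.keytab hbase/HOSTNAME.RealmName@RealmName
klist                       # should show a valid ticket for the hbase principal
/usr/lib/phoenix/bin/sqlline.py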
Ubuntu 16.04.1 LTS
Hadoop 3.3.1
I am trying to set up Hadoop in pseudo-distributed mode by following an online tutorial, using the steps below.
Step 1: Set up Hadoop
1. Add the following to /etc/profile:
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
2. In $HADOOP_HOME/etc/hadoop/hadoop-env.sh, set:
export JAVA_HOME=/opt/jdk1.8.0_261
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hadoop/pseudo/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hadoop/pseudo/hdfs/datanode</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Step 2: Verify Hadoop
1. $ hdfs namenode -format
2.
sudo apt-get install ssh
ssh-keygen -t rsa
ssh-copy-id hadoop@ubuntu
cd ~/hadoop/sbin
start-dfs.sh
3. start-yarn.sh
4. Open http://localhost:50070/ in Firefox on the local machine.
Unable to connect
Firefox can't establish a connection to the server at localhost:50070.
The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer's network connection.
If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web.
5. Opening http://localhost:8088/ in Firefox returns the same error as for port 50070.
When I run the jps command, it returns:
hadoop#ubuntu:~/hadoop/sbin$ jps
32501 DataNode
32377 NameNode
32682 SecondaryNameNode
32876 Jps
Since Hadoop 3.0.0-alpha1 there has been a change in the port configuration:
http://localhost:50070
was moved to
http://localhost:9870
see https://issues.apache.org/jira/browse/HDFS-9427
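If you are unsure which address the web UI ended up on, you can ask HDFS directly (a quick check, assuming the hdfs command is on your PATH):

hdfs getconf -confKey dfs.namenode.http-address   # defaults to 0.0.0.0:9870 on Hadoop 3.x
curl -I http://localhost:9870/                    # should return HTTP 200 once the NameNode is up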
A Sqoop import action is giving an error while running as an Oozie job.
I am using a pseudo-distributed Hadoop cluster.
I have followed these steps:
1. Started the Oozie server
2. Edited the job.properties and workflow.xml files
3. Copied workflow.xml into HDFS
4. Ran the Oozie job
My job.properties file:
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/hduser/${examplesRoot}/apps/sqoop
My workflow.xml file:
<action name="sqoop-node">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/hduser/${examplesRoot}/output-data/sqoop"/>
<!--<mkdir path="${nameNode}/user/hduser/${examplesRoot}/output-data"/>-->
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<command>import --connect "jdbc:mysql://localhost/db" --username user --password pass --table "table" --where "Conditions" --driver com.mysql.jdbc.Driver --target-dir ${nameNode}/user/hduser/${examplesRoot}/output-data/sqoop -m 1</command>
<!--<file>db.hsqldb.properties#db.hsqldb.properties</file>
<file>db.hsqldb.script#db.hsqldb.script</file>-->
</sqoop>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
I was expecting the job to run without any errors, but it was killed with the following error:
UnsupportedOperationException: Accessing local file system is not allowed.
I don't understand where I am wrong and why it is not allowing the job to complete.
Can anyone help me solve the issue?
The Oozie sharelib (with the Sqoop action's dependencies) is stored on HDFS, and the server needs to know how to communicate with the Hadoop cluster. Accessing a sharelib stored on the local filesystem is not allowed; see CVE-2017-15712.
Please review conf/hadoop-conf/core-site.xml and make sure it does not use the local filesystem. For example, if your HDFS namenode listens on port 9000 on localhost, configure fs.defaultFS accordingly:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
...
</configuration>
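A couple of quick checks, assuming a default Oozie layout and an Oozie server listening on localhost:11000 (adjust the paths and URL to your install):

# from the Oozie server directory: confirm the Hadoop config points at HDFS, not the local filesystem
grep -A1 "fs.defaultFS" conf/hadoop-conf/core-site.xml
hdfs getconf -confKey fs.defaultFS
# confirm the sharelib (including the sqoop jars) is installed on HDFS
oozie admin -oozie http://localhost:11000/oozie -shareliblist sqoop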
Alternatively, you can remove the dummy RawLocalFileSystem implementation and restart the server, but this isn't recommended (the server becomes vulnerable to CVE-2017-15712 again).
Hope this helps. Also see this answer.
I am launching an EMR cluster at run time based on a user event, and once the job is done the cluster is terminated.
However, when the cluster is launched and the tasks are executing, I get the error below.
I read some posts suggesting that we need to update yarn-site.xml on the namenode and datanodes and restart YARN.
I am not sure how to configure this during the launch of the cluster itself.
org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
Container launch failed for container_1523533251407_0001_01_000002 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)
at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:390)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Thanks
Answer:
Here is what I added in my code to resolve the issue:
Map<String,String> yarnProperties = new HashMap<String,String>();
yarnProperties.put("yarn.nodemanager.aux-services","mapreduce_shuffle");
yarnProperties.put("yarn.nodemanager.aux-services.mapreduce_shuffle.class","org.apache.hadoop.mapred.ShuffleHandler");
// "yarn-site" is the EMR classification that maps to yarn-site.xml
Configuration yarnConfig = new Configuration()
    .withClassification("yarn-site")
    .withProperties(yarnProperties);
RunJobFlowRequest request = new RunJobFlowRequest()
    .withConfigurations(yarnConfig);
We were setting some other properties in yarn-site.xml as well.
If you are trying to create the cluster using the AWS CLI, you can pass
--configurations file://<json file with the config>
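For example, a rough sketch of the CLI route (the file name, release label, and instance settings are placeholders, not taken from the question):

# configurations.json is a hypothetical file name; its contents mirror the properties above
cat > configurations.json <<'EOF'
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.aux-services": "mapreduce_shuffle",
      "yarn.nodemanager.aux-services.mapreduce_shuffle.class": "org.apache.hadoop.mapred.ShuffleHandler"
    }
  }
]
EOF
aws emr create-cluster --release-label emr-5.13.0 \
  --applications Name=Hadoop Name=Hive \
  --instance-type m4.large --instance-count 3 --use-default-roles \
  --configurations file://configurations.json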
Otherwise, if you are creating it through Java, for example:
Application hive = new Application().withName("Hive");
Map<String,String> hiveProperties = new HashMap<String,String>();
hiveProperties.put("hive.join.emit.interval","1000");
hiveProperties.put("hive.merge.mapfiles","true");
Configuration myHiveConfig = new Configuration()
.withClassification("hive-site")
.withProperties(hiveProperties);
Then you can reference it as:
RunJobFlowRequest request = new RunJobFlowRequest()
.withName("Create cluster with ReleaseLabel")
.withReleaseLabel("emr-5.13.0")
.withApplications(hive)
.withConfigurations(myHiveConfig);
For the other problem:
You need to add these two properties in the same way and then create the cluster:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
I'm trying to deploy a minimal Scalatra application on Openshift with DIY cartridge. I've managed to get SBT working, but when it comes to container:start, I get the error:
FAILED SelectChannelConnector#0.0.0.0:8080: java.net.SocketException: Permission denied
Apparently, embedded Jetty tries to open a socket at 0.0.0.0, which is prohibited by Openshift (you can only open ports at $OPENSHIFT_INTERNAL_IP). How can I tell Jetty exactly which IP I need it to listen on?
Yes you are right about $OPENSHIFT_INTERNAL_IP. So edit ${jetty.home}/etc/jetty.xml and set jetty.host in the connector section as follows:
...
<Set name="connectors">
<Array type="org.mortbay.jetty.Connector">
<Item>
<New class="org.mortbay.jetty.nio.SelectChannelConnector">
<Set name="host"><SystemProperty name="jetty.host" />$OPENSHIFT_INTERNAL_IP</Set>
<Set name="port"><SystemProperty name="jetty.port" default="8080"/></Set>
...
</New>
</Item>
</Array>
</Set>
hth
I've never used Openshift, so I'm groping a bit here.
Do you have a jetty.host set?
You may need to set up a jetty.xml file and set it in there. See http://docs.codehaus.org/display/JETTY/Newbie+Guide+to+Jetty for how to set the host. You can tell the xsbt web plugin about jetty.xml by setting your project up like this:
https://github.com/JamesEarlDouglas/xsbt-web-plugin/wiki/Settings
Alternately, you may be able to pass the parameter to Jetty during startup. That'd look like this: -Djetty.host="yourhostname"
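For instance, a rough sketch, assuming your sbt launcher script honors SBT_OPTS; whether the embedded Jetty actually picks up jetty.host depends on how the plugin builds its connector:

# pass the bind address to the JVM that runs the embedded Jetty (a sketch, not verified on Openshift)
export SBT_OPTS="-Djetty.host=$OPENSHIFT_INTERNAL_IP -Djetty.port=8080"
sbt container:start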
To get running with Jetty 9.2.13.v20150730 on Openshift with the DIY cartridge, you have to run with Java 8 and bind to $OPENSHIFT_INTERNAL_IP as follows. First, ssh onto the host and download a JDK 8 with:
cd $OPENSHIFT_DATA_DIR
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u5-b13/jdk-8u5-linux-x64.tar.gz
tar -zxf jdk-8u5-linux-x64.tar.gz
export PATH=$OPENSHIFT_DATA_DIR/jdk1.8.0_05/bin:$PATH
export JAVA_HOME="$OPENSHIFT_DATA_DIR/jdk1.8.0_05"
java -version
Then in your .openshift/action_hooks/start ensure you have the same exported variables, with something like:
# see http://stackoverflow.com/a/23895161/329496 to install jdk1.8 no DIY cartridge
export JAVA_HOME="$OPENSHIFT_DATA_DIR/jdk1.8.0_05"
export PATH=$OPENSHIFT_DATA_DIR/jdk1.8.0_05/bin:$PATH
nohup java -cp ${OPENSHIFT_REPO_DIR}target/dependency/jetty-runner.jar org.eclipse.jetty.runner.Runner --host ${OPENSHIFT_DIY_IP} --port ${OPENSHIFT_DIY_PORT} ${OPENSHIFT_REPO_DIR}/target/thinbus-srp-spring-demo.war > ${OPENSHIFT_LOG_DIR}server.log 2>&1 &
(Note that jdk-8u20-linux-x64.tar.gz has also been reported to work so you may want to check for the latest available.)
That setup does not need a jetty.xml, as it sets --host and --port to bind to the correct interface and runs the built war file. What it does require is that jetty-runner.jar is copied out of the ivy cache into the target folder. With Maven, to do that you add something like:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>copy</goal>
</goals>
<configuration>
<artifactItems>
<artifactItem>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-runner</artifactId>
<version>${jetty.version}</version>
<destFileName>jetty-runner.jar</destFileName>
</artifactItem>
</artifactItems>
</configuration>
</execution>
</executions>
</plugin>
Google suggests that the SBT equivalent is simply retrieveManaged := true. You can ssh to the host and run find to figure out where the jetty-runner.jar dependency has been copied to, and update the start command appropriately.
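For example, something along these lines once retrieveManaged := true has copied the dependencies (the exact location varies by layout, so treat the path as a guess):

# locate the copied jetty-runner jar on the gear, then point the start command at it
find $OPENSHIFT_REPO_DIR -name "jetty-runner*.jar"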
For testing purposes I want to use Jetty 8 to serve only static content. I know how to start the webserver from the command line:
java -jar start.jar jetty.port=8082
I would like to be able to use a vanilla Jetty, preferably 8 or 7, and start it using something like:
java -jar start.jar OPTIONS=resources resources.root=../foo jetty.port=8082
The files should then be accessible from the root of the server. A file called ../foo/x.html should be accessible via http://localhost:8082/x.html.
I don't want to create a WAR file or anything fancy. Preferably it shouldn't do any caching on the server side, leaving the files unlocked on Windows machines. Also, I only want to serve files (including those in subdirectories), with no fancy file browser or ways to modify them from the client.
Is this possible? If not, what is the minimum configuration needed to accomplish such behavior?
Additional information
I've tried the following command. I expected to be able to browse the javadoc shipped with Jetty 8 using http://localhost:8080/javadoc/, but it always gives me a 404:
java -jar start.jar --ini OPTIONS=Server,resources etc/jetty.xml contexts/javadoc.xml
The simplest way to start Jetty and have it serve static content is by using the following XML file:
static-content.xml:
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<Configure id="FileServer" class="org.eclipse.jetty.server.Server">
<Call name="addConnector">
<Arg>
<New class="org.eclipse.jetty.server.nio.SelectChannelConnector">
<Set name="host"><Property name="jetty.host" /></Set>
<Set name="port"><Property name="jetty.port" default="8080"/></Set>
</New>
</Arg>
</Call>
<Set name="handler">
<New class="org.eclipse.jetty.server.handler.ResourceHandler">
<Set name="resourceBase"><Property name="files.base" default="./"/></Set>
</New>
</Set>
</Configure>
Then you can start Jetty using:
java -jar start.jar --ini static-content.xml files.base=./foo jetty.port=8082
If you omit files.base, the current directory will be used; if you omit jetty.port, port 8080 will be used.
The --ini option disables the settings from start.ini, which also ensures that no other handlers etc. are activated.
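Applied to the layout from the question, that would look something like this (assuming static-content.xml sits next to start.jar):

java -jar start.jar --ini static-content.xml files.base=../foo jetty.port=8082
curl http://localhost:8082/x.html   # ../foo/x.html is now served from the root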
A bit off topic, but somebody using Maven may wish to do something like this (supposing that static resources have been copied to target/web):
<plugin>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-maven-plugin</artifactId>
<version>8.1.9.v20130131</version>
<executions>
<execution>
<id>start-jetty</id>
<phase>install</phase>
<goals>
<goal>start</goal>
</goals>
<configuration>
<webAppConfig>
<contextPath>/</contextPath>
<resourceBases>
<resourceBase>${project.build.directory}/web</resourceBase>
</resourceBases>
</webAppConfig>
</configuration>
</execution>
</executions>
</plugin>
In your distribution, under the contexts directory, is a javadoc.xml that you can use as an example of how to do this easily enough.
http://git.eclipse.org/c/jetty/org.eclipse.jetty.project.git/tree/jetty-distribution/src/main/resources/contexts/javadoc.xml
That is what it actually looks like.
You are looking to change the context path and the resource base.
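For example, a rough sketch of such a context file, modeled on javadoc.xml (the file name, context path, and resource base are placeholders):

# drop a context file into the contexts/ directory; it will be deployed on startup
cat > contexts/static-files.xml <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<Configure class="org.eclipse.jetty.server.handler.ContextHandler">
  <Set name="contextPath">/files</Set>
  <Set name="resourceBase">../foo/</Set>
  <Set name="handler">
    <New class="org.eclipse.jetty.server.handler.ResourceHandler">
      <Set name="directoriesListed">false</Set>
    </New>
  </Set>
</Configure>
EOF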
I would also recommend removing jetty-webapps.xml from the startup in the start.ini file, and also removing the context files you don't want to deploy.
You can look at setting some of the other options in the start.ini file as well if you like.
http://wiki.eclipse.org/Jetty/Feature/Start.jar
Go there for information on the start process.
Cheers