I have an HBase schema set up on an Amazon EMR cluster running 3 m3.xlarge instances with the Amazon Linux image. When I issue the command 'hbase rest start', it fails to start and I get the following output. What can I do?
Output:
[hadoop@ip-10-81-13-20 ~]$ hbase rest start
2016-08-01 08:29:27,863 INFO [main] util.VersionInfo: HBase 1.2.1
2016-08-01 08:29:27,863 INFO [main] util.VersionInfo: Source code repository file:///workspace/workspace/bigtop.release-rpm-4.7.2/build/hbase/rpm/BUILD/hbase-1.2.1 revision=Unknown
2016-08-01 08:29:27,863 INFO [main] util.VersionInfo: Compiled by ec2-user on Fri Jul 8 02:16:27 UTC 2016
2016-08-01 08:29:27,863 INFO [main] util.VersionInfo: From source with checksum b1b31eefd0314d3ed5fa7036ed0201e9
2016-08-01 08:29:28,870 INFO [main] impl.MetricsConfig: loaded properties from hadoop-metrics2-hbase.properties
2016-08-01 08:29:28,967 INFO [main] impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-08-01 08:29:28,967 INFO [main] impl.MetricsSystemImpl: HBase metrics system started
2016-08-01 08:29:29,034 INFO [main] mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-08-01 08:29:29,081 INFO [main] http.HttpRequestLog: Http request log for http.requests.rest is not defined
2016-08-01 08:29:29,108 INFO [main] http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
2016-08-01 08:29:29,109 INFO [main] http.HttpServer: Added global filter 'clickjackingprevention' (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter)
2016-08-01 08:29:29,114 INFO [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context rest
2016-08-01 08:29:29,114 INFO [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-08-01 08:29:29,114 INFO [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-08-01 08:29:29,129 INFO [main] http.HttpServer: HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: 0.0.0.0:8085
at org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1017)
at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953)
at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91)
at org.apache.hadoop.hbase.rest.RESTServer.main(RESTServer.java:248)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:463)
at sun.nio.ch.Net.bind(Net.java:455)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012)
... 3 more
Exception in thread "main" java.net.BindException: Port in use: 0.0.0.0:8085
at org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1017)
at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953)
at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91)
at org.apache.hadoop.hbase.rest.RESTServer.main(RESTServer.java:248)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:463)
at sun.nio.ch.Net.bind(Net.java:455)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012)
... 3 more
2016-08-01 08:29:29,133 INFO [Shutdown] mortbay.log: Shutdown hook executing
2016-08-01 08:29:29,133 INFO [Shutdown] mortbay.log: Shutdown hook complete
(Answering my own question)
The default HBase ports on AWS EMR differ from those of stock HBase: on EMR the HBase REST port is 8070 and the REST UI port is 8085. One could simply use those.
That said, there's always the -p option. Use hbase rest start -p portnumber (e.g. hbase rest start -p 8070) to start the HBase REST server on a port of your choice.
There's probably another process already using one of the default ports (the log above shows the bind on 8085 failing), which is why you can't start the HBase REST server with a plain hbase rest start.
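For reference, the BindException in the log just means some other process already owns the socket. A minimal sketch of the same wildcard bind the HBase info server attempts (PortCheck is a hypothetical helper class, not part of HBase; pass the port you plan to use):

import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortCheck {
    public static void main(String[] args) throws IOException {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8085;
        // Bind the wildcard address, as the HBase info server does.
        try (ServerSocket s = new ServerSocket()) {
            s.bind(new InetSocketAddress("0.0.0.0", port));
            System.out.println("Port " + port + " is free");
        } catch (BindException e) {
            // This is the same failure RESTServer hit on 8085.
            System.out.println("Port " + port + " is already in use: " + e.getMessage());
        }
    }
}

If the port reports busy, either stop the conflicting process or pick a different port with -p as described above.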
Related
I have a 2-node EMR (version 4.6.0) cluster (1 master (m4.large), 1 core (r4.xlarge)) with HBase installed, using the default EMR configuration. I want to export HBase tables using
hbase org.apache.hadoop.hbase.mapreduce.Export -D hbase.mapreduce.include.deleted.rows=true Table_Name hdfs:/full_backup/Table_Name 1
but I'm getting the following output:
2022-04-04 11:29:20,626 INFO [main] util.RegionSizeCalculator: Calculating region sizes for table "Table_Name".
2022-04-04 11:29:20,900 INFO [main] client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2022-04-04 11:29:20,900 INFO [main] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x17ff27095680070
2022-04-04 11:29:20,903 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x17ff27095680070
2022-04-04 11:29:20,904 INFO [main] zookeeper.ZooKeeper: Session: 0x17ff27095680070 closed
2022-04-04 11:29:20,980 INFO [main] mapreduce.JobSubmitter: number of splits:1
2022-04-04 11:29:20,994 INFO [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2022-04-04 11:29:21,192 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1649071534731_0002
2022-04-04 11:29:21,424 INFO [main] impl.YarnClientImpl: Submitted application application_1649071534731_0002
2022-04-04 11:29:21,454 INFO [main] mapreduce.Job: The url to track the job: http://ip-10-0-2-244.eu-west-1.compute.internal:20888/proxy/application_1649071534731_0002/
2022-04-04 11:29:21,455 INFO [main] mapreduce.Job: Running job: job_1649071534731_0002
2022-04-04 11:29:28,541 INFO [main] mapreduce.Job: Job job_1649071534731_0002 running in uber mode : false
2022-04-04 11:29:28,542 INFO [main] mapreduce.Job: map 0% reduce 0%
The job is stuck at this point and does not progress. However, when I add a task node and rerun the same command, it finishes within seconds.
Based on the documentation (https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-master-core-task-nodes.html), the core node itself should handle tasks as well. What could be going wrong?
I'm running Apache Pig 0.17.0 in MapReduce mode to simply dump a few lines of text data from a file on HDFS (Hadoop 2.7.2).
When executing the dump command, execution goes very slowly, though it does complete. I see some failures during execution, shown below:
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
[main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1589604570386_0002]
[main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
[main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1589604570386_0002]
[main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
[main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
[main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
[main] WARN org.apache.pig.tools.pigstats.mapreduce.MRJobStats - Failed to get map task report
java.io.IOException: java.net.ConnectException: Call From localhost/127.0.0.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:343)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:428)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:572)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:184)
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.getTaskReports(MRJobStats.java:528)
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.addMapReduceStatistics(MRJobStats.java:355)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.addSuccessJobStats(MRPigStatsUtil.java:232)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.accumulateStats(MRPigStatsUtil.java:164)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:379)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
at org.apache.pig.PigServer.storeEx(PigServer.java:1119)
at org.apache.pig.PigServer.store(PigServer.java:1082)
at org.apache.pig.PigServer.openIterator(PigServer.java:995)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:782)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:383)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:564)
at org.apache.pig.Main.main(Main.java:175)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Is there a way to speed up the MapReduce job?
I think I am almost there.
I created an AWS Elastic Beanstalk environment and added an Oracle RDS instance to it.
When I checked the log, I saw that the driver was loaded, but it keeps saying the URL is invalid.
Here are my RDS info and the log message.
[RDS Info]
Endpoint = aa1c9autjaqoufk.c2k1ch01futy.ap-northeast-2.rds.amazonaws.com
Port = 1521
Public Access = yes
[System Log]
25-Jun-2018 02:42:56.759 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
25-Jun-2018 02:42:56.787 INFO [main] org.apache.tomcat.util.net.NioSelectorPool.getSharedSelector Using a shared selector for servlet write/read
25-Jun-2018 02:42:56.796 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
25-Jun-2018 02:42:56.799 INFO [main] org.apache.tomcat.util.net.NioSelectorPool.getSharedSelector Using a shared selector for servlet write/read
25-Jun-2018 02:42:56.800 INFO [main] org.apache.catalina.startup.Catalina.load Initialization processed in 1366 ms
25-Jun-2018 02:42:56.842 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service Catalina
25-Jun-2018 02:42:56.848 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet Engine: Apache Tomcat/8.0.50
25-Jun-2018 02:42:56.872 INFO [localhost-startStop-1] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory /var/lib/tomcat8/webapps/ROOT
25-Jun-2018 02:42:58.613 INFO [localhost-startStop-1] org.apache.jasper.servlet.TldScanner.scanJars At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
25-Jun-2018 02:42:58.689 INFO [localhost-startStop-1] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory /var/lib/tomcat8/webapps/ROOT has finished in 1,817 ms
25-Jun-2018 02:42:58.693 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
25-Jun-2018 02:42:58.720 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
25-Jun-2018 02:42:58.736 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in 1935 ms
Loading driver...
Driver loaded!
jdbc:oracle:oci://aa1c9autjaqoufk.c2k1ch01futy.ap-northeast-2.rds.amazonaws.com:1521/ebdb?user=username&password=password
SQLException: Invalid Oracle URL specified
SQLState: 99999
VendorError: 17067
Closing the connection.
SQLException: Invalid Oracle URL specified
SQLState: 99999
VendorError: 17067
Closing the connection.
I included the ojdbc8 driver in my web project's library and made a build.
Is this a driver issue? What am I doing wrong?
The message clearly says your URL is incorrect.
It should be something like the following:
// Step 1: load the driver class
Class.forName("oracle.jdbc.driver.OracleDriver");

// Step 2: create the connection object
// (thin driver URL form: jdbc:oracle:thin:@host:port:SID)
Connection con = DriverManager.getConnection(
    "jdbc:oracle:thin:@aa1c9autjaqoufk.c2k1ch01futy.ap-northeast-2.rds.amazonaws.com:1521:ebdb",
    "username", "password");
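To rule out the web app itself, it can help to test the URL from a standalone class first. A minimal sketch, assuming the ojdbc8 jar is on the classpath and that the SID is ebdb (the database name Elastic Beanstalk typically creates); the endpoint and credentials are the placeholders from the question:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class OracleRdsTest {
    public static void main(String[] args) {
        // Thin-driver URL form: jdbc:oracle:thin:@host:port:SID
        String url = "jdbc:oracle:thin:@aa1c9autjaqoufk.c2k1ch01futy.ap-northeast-2.rds.amazonaws.com:1521:ebdb";
        try (Connection con = DriverManager.getConnection(url, "username", "password");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT 1 FROM dual")) {
            rs.next();
            System.out.println("Connected, got: " + rs.getInt(1));
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

If this connects from the same host but the deployed app still fails, the problem is the URL string inside the app rather than the RDS security group or network.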
It is so weird that I can connect to AWS RDS with SQL Developer, but I can't with my Java application (Java source code or JSP).
When I try to access RDS, I get errors like:
coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
26-Jun-2018 04:24:33.203 INFO [main] org.apache.tomcat.util.net.NioSelectorPool.getSharedSelector Using a shared selector for servlet write/read
26-Jun-2018 04:24:33.212 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
26-Jun-2018 04:24:33.215 INFO [main] org.apache.tomcat.util.net.NioSelectorPool.getSharedSelector Using a shared selector for servlet write/read
26-Jun-2018 04:24:33.219 INFO [main] org.apache.catalina.startup.Catalina.load Initialization processed in 1387 ms
26-Jun-2018 04:24:33.265 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service Catalina
26-Jun-2018 04:24:33.266 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet Engine: Apache Tomcat/8.0.50
26-Jun-2018 04:24:33.286 INFO [localhost-startStop-1] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory /var/lib/tomcat8/webapps/ROOT
26-Jun-2018 04:24:35.020 INFO [localhost-startStop-1] org.apache.jasper.servlet.TldScanner.scanJars At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
26-Jun-2018 04:24:35.097 INFO [localhost-startStop-1] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory /var/lib/tomcat8/webapps/ROOT has finished in 1,811 ms
26-Jun-2018 04:24:35.100 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
26-Jun-2018 04:24:35.106 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
26-Jun-2018 04:24:35.108 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in 1888 ms
Loading driver...
Driver loaded!
jdbc:oracle:thin://IP:1521/ORCL?user=username&password=password
SQLException: Invalid Oracle URL specified
SQLState: 99999
VendorError: 17067
Closing the connection.
SQLException: Invalid Oracle URL specified
SQLState: 99999
VendorError: 17067
Closing the connection.
But the URL contains just the same values I use with SQL Developer.
Is there anything wrong?
Please enlighten me; I've been suffering with this for about a week! :(
I'm not sure how your application is set up, but I'm using Maven and Spring Boot, and I got it working like this:
I mainly followed this guide, ignoring the .sql files, the Thymeleaf UI, the model.addAttribute("cities", cities); part, and the HTML file:
https://zetcode.com/springboot/postgresql/
My application.properties file looks like this:
postgres.comment.aa=https://zetcode.com/springboot/postgresql/
spring.main.banner-mode=off
logging.level.org.springframework=ERROR
spring.jpa.hibernate.ddl-auto=none
spring.datasource.initialization-mode=always
spring.datasource.platform=postgres
spring.datasource.url=jdbc:postgresql://your-rds-url-here.us-east-1.rds.amazonaws.com:yourDbPortHere/postgres
spring.datasource.username=postgres
spring.datasource.password=<your db password here>
spring.jpa.properties.hibernate.jdbc.lob.non_contextual_creation=true
If you have custom schemas, you can append ?currentSchema=users to the URL:
spring.datasource.url=jdbc:postgresql://your-rds-url-here.us-east-1.rds.amazonaws.com:yourDbPortHere/postgres?currentSchema=users
Thanks to this SO answer for the schema tip:
Is it possible to specify the schema when connecting to postgres with JDBC?
A couple of other links also helped:
https://turreta.com/2015/03/01/how-to-specify-a-default-schema-when-connecting-to-postgresql-using-jdbc/
https://doc.cuba-platform.com/manual-latest/db_schema_connection.html
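If you want to verify the currentSchema part outside of Spring, a plain JDBC check works too. A minimal sketch, assuming the postgresql driver jar is on the classpath; the hostname, port (5432 is the PostgreSQL default), and password are placeholders just like in the properties above:

import java.sql.Connection;
import java.sql.DriverManager;

public class PgSchemaTest {
    public static void main(String[] args) throws Exception {
        // currentSchema sets the connection's search_path up front
        String url = "jdbc:postgresql://your-rds-url-here.us-east-1.rds.amazonaws.com:5432/postgres?currentSchema=users";
        try (Connection con = DriverManager.getConnection(url, "postgres", "yourDbPasswordHere")) {
            // Should print "users" if the parameter was applied
            System.out.println("schema = " + con.getSchema());
        }
    }
}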
I'm deploying Vora 1.3 services on HDP 2.3 using the Manager web UI, with mostly default configuration and node assignments. I've assigned the Vora Thriftserver service to the node that successfully hosted the same service in Vora 1.2 (which I have already removed).
The service doesn't start, though. Here's the relevant part of the log:
17/01/23 10:04:27 INFO Server: jetty-8.y.z-SNAPSHOT
17/01/23 10:04:27 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
17/01/23 10:04:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/01/23 10:04:27 INFO SparkUI: Started SparkUI at http://<jumpbox>:4040
17/01/23 10:04:28 INFO SparkContext: Added JAR file:/var/lib/ambari-agent/cache/stacks/HDP/2.3/services/vora-manager/package/lib/vora-spark/lib/spark-sap-datasources-1.3.102-assembly.jar at http://<jumpbox>:41874/jars/spark-sap-datasources-1.3.102-assembly.jar with timestamp 1485126268263
17/01/23 10:04:28 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
17/01/23 10:04:28 INFO Executor: Starting executor ID driver on host localhost
17/01/23 10:04:28 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37523.
17/01/23 10:04:28 INFO NettyBlockTransferService: Server created on 37523
17/01/23 10:04:28 INFO BlockManagerMaster: Trying to register BlockManager
17/01/23 10:04:28 INFO BlockManagerMasterEndpoint: Registering block manager localhost:37523 with 530.0 MB RAM, BlockManagerId(driver, localhost, 37523)
17/01/23 10:04:28 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/execution/SparkPlanner
at org.apache.spark.sql.hive.sap.thriftserver.SapSQLEnv$.init(SapSQLEnv.scala:39)
at org.apache.spark.sql.hive.thriftserver.SapThriftServer$.main(SapThriftServer.scala:22)
at org.apache.spark.sql.hive.thriftserver.SapThriftServer.main(SapThriftServer.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
(.... goes on...)
The Spark executable and Java executable paths in the Vora Thriftserver configuration tab are correct.
Did I miss something else?
You are running Vora 1.3, which means you must use HDP 2.4.2, which includes the required Spark 1.6.1. See the official Vora product availability matrix (PAM).
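That would also explain the NoClassDefFoundError: as far as I can tell, org.apache.spark.sql.execution.SparkPlanner only exists as a top-level class in Spark 1.6.x (in earlier releases the planner was an inner class of SQLContext), so a Vora 1.3 assembly built against 1.6.1 fails on the older Spark in HDP 2.3. To confirm which Spark the Thriftserver actually picks up, a small probe run with the same classpath can check (a sketch; SparkPlannerProbe is a hypothetical helper, not a Vora tool):

import java.net.URL;

public class SparkPlannerProbe {
    public static void main(String[] args) {
        String cls = "org.apache.spark.sql.execution.SparkPlanner";
        try {
            Class<?> c = Class.forName(cls);
            // Show which jar the class was actually loaded from
            URL src = c.getProtectionDomain().getCodeSource().getLocation();
            System.out.println(cls + " found in " + src);
        } catch (ClassNotFoundException e) {
            System.out.println(cls + " not found: the Spark on this classpath is too old for Vora 1.3");
        }
    }
}

Compile it and run it with the same Spark jars the Thriftserver uses on its classpath (for a typical layout, something like java -cp "$SPARK_HOME/lib/*:." SparkPlannerProbe); if the class is missing, the Thriftserver is picking up a pre-1.6 Spark assembly.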