I haven't been able to find much on the web about this problem but...
I am setting up a fresh Bigtable cluster on Google's Cloud Platform. I've gone through the whole process you do with most Google APIs (creating a service account, knowing your project ID, authenticating with the gcloud tool, setting the Google environment variable, etc.).
I am having a problem, though, after going through the setup. I get this error, which I can't find anything about on the web:
Caused by: com.google.bigtable.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: io.grpc.StatusRuntimeException:
NOT_FOUND: Error listing tables for cluster projects/bigtable-1127/zones/us-central1-c/clusters/bigdatastats : Failed to read Tables in cluster: bigdatastats
Here is the complete output that includes the error. Note that I get the same error when trying to create a table as well:
./bin/hbase com.google.cloud.bigtable.hbase.CheckConfig
User Agent: bigtable-hbase-1.0-0.2.1
Project ID: bigtable-1127
Cluster Id: bigdatastats
ZoneId: us-central1-c
Cluster admin host: bigtableclusteradmin.googleapis.com
Table admin host: bigtabletableadmin.googleapis.com
Data host: bigtable.googleapis.com
Attempting credential refresh...
HBase Connection Class = com.google.cloud.bigtable.hbase1_0.BigtableConnection (OK)
Opening table admin connection...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Michael/bigtable/hbase-1.0.1.1/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2015-11-12 01:30:31,552 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-11-12 01:30:32,619 INFO [main] grpc.BigtableSession: Opening connection for projectId bigtable-1127, zoneId us-central1-c, clusterId bigdatastats, on data host bigtable.googleapis.com, table admin host bigtabletableadmin.googleapis.com.
Tables in cluster bigdatastats:
Exception in thread "main" java.io.IOException: Failed to listTables
at org.apache.hadoop.hbase.client.AbstractBigtableAdmin.requestTableList(AbstractBigtableAdmin.java:221)
at org.apache.hadoop.hbase.client.AbstractBigtableAdmin.listTableNames(AbstractBigtableAdmin.java:208)
at com.google.cloud.bigtable.hbase.CheckConfig.main(CheckConfig.java:99)
Caused by: com.google.bigtable.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: io.grpc.StatusRuntimeException: NOT_FOUND: Error listing tables for cluster projects/bigtable-1127/zones/us-central1-c/clusters/bigdatastats : Failed to read Tables in cluster: bigdatastats
at io.grpc.stub.Calls.getUnchecked(Calls.java:117)
at io.grpc.stub.Calls.blockingUnaryCall(Calls.java:129)
at com.google.bigtable.admin.table.v1.BigtableTableServiceGrpc$BigtableTableServiceBlockingStub.listTables(BigtableTableServiceGrpc.java:338)
at com.google.cloud.bigtable.grpc.BigtableTableAdminGrpcClient.listTables(BigtableTableAdminGrpcClient.java:44)
at org.apache.hadoop.hbase.client.AbstractBigtableAdmin.requestTableList(AbstractBigtableAdmin.java:219)
... 2 more
Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Error listing tables for cluster projects/bigtable-1127/zones/us-central1-c/clusters/bigdatastats : Failed to read Tables in cluster: bigdatastats
at io.grpc.Status.asRuntimeException(Status.java:428)
at io.grpc.stub.Calls$UnaryStreamToFuture.onClose(Calls.java:324)
at io.grpc.ChannelImpl$CallImpl$ClientStreamListenerImpl$3.run(ChannelImpl.java:402)
at io.grpc.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:154)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
It would be amazing if someone could help with this. I am not sure what to do and I can't find anything out there. It's obviously something around authentication: my key file is fresh and in the right place, I've run the gcloud auth, and I'm not sure what else to check.
Please let me know if I can provide anymore information to help answer.
As noted in the comments, this was unlikely to have been an authentication issue.
You would receive NOT_FOUND as an error if the resource you are trying to query does not exist in your project. So it's likely that you want to switch your default project using gcloud config set project, as recommended by Les.
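One way to double-check exactly which project/zone/cluster triple the HBase client ends up querying is to set it explicitly in code instead of relying on hbase-site.xml. This is only a rough sketch for the 0.2.x bigtable-hbase client shown in your output, and it assumes that client's configuration keys (google.bigtable.project.id, google.bigtable.zone.name, google.bigtable.cluster.name); the values below are the ones from your CheckConfig run and must name a cluster that really exists in the active project, otherwise you'll see the same NOT_FOUND:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ListBigtableTables {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // These must match a cluster that actually exists in this project;
        // a mismatch here is exactly what produces the NOT_FOUND above.
        conf.set("google.bigtable.project.id", "bigtable-1127");
        conf.set("google.bigtable.zone.name", "us-central1-c");
        conf.set("google.bigtable.cluster.name", "bigdatastats");
        conf.set("hbase.client.connection.impl",
                "com.google.cloud.bigtable.hbase1_0.BigtableConnection");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Same listTables call that CheckConfig makes internally.
            System.out.println(java.util.Arrays.toString(admin.listTableNames()));
        }
    }
}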
Related
I have the following error for one of my DataFlow Jobs:
2022-06-15T16:12:27.365182607Z Error message from worker: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.google.api.gax.rpc.PermissionDeniedException: io.grpc.StatusRuntimeException: PERMISSION_DENIED: BigQuery Storage API has not been used in project 770406736630 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/bigquerystorage.googleapis.com/overview?project=770406736630 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.
The same code works fine with Apache Beam 2.38.0. I tested multiple times and this is not a temporary issue. The project number mentioned in the error (770406736630) is not mine.
Any idea why I get this error?
I had the same issue. I'm using Spring Cloud GCP and hadn't set the spring.cloud.gcp.project-id property, which I'm guessing makes the SDK or API use some default value.
I don't know how you've set up your environment, because you haven't specified, but look into how you can explicitly set the project ID. You can get it from the project selection dialog in the GCP Console.
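If you'd rather set it explicitly in code instead of (or in addition to) application.properties, a minimal sketch would look like the following. This assumes Spring Cloud GCP 2.x, where the core classes live under com.google.cloud.spring.core (the older 1.x releases used org.springframework.cloud.gcp.core), and "my-project-id" is a placeholder:
import com.google.cloud.spring.core.GcpProjectIdProvider;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GcpProjectConfig {

    // Equivalent to spring.cloud.gcp.project-id=my-project-id in application.properties.
    @Bean
    public GcpProjectIdProvider gcpProjectIdProvider() {
        // Placeholder: use the project id shown in the GCP Console project picker.
        return () -> "my-project-id";
    }
}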
I just ran into this, and simply needed to re-authenticate with the gcloud CLI by running gcloud auth application-default login.
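If you're not sure whether Application Default Credentials are being picked up at all, a quick sketch to check (assuming google-auth-library-oauth2-http is on the classpath) is something like:
import com.google.auth.oauth2.GoogleCredentials;
import java.util.Collections;

public class CheckAdc {
    public static void main(String[] args) throws Exception {
        // getApplicationDefault() throws an IOException if no Application Default
        // Credentials are found, i.e. GOOGLE_APPLICATION_CREDENTIALS is not set and
        // `gcloud auth application-default login` has not been run.
        GoogleCredentials credentials = GoogleCredentials.getApplicationDefault()
                .createScoped(Collections.singletonList(
                        "https://www.googleapis.com/auth/cloud-platform"));
        // Force a token fetch so auth problems surface here rather than deep in a pipeline.
        credentials.refreshIfExpired();
        System.out.println("ADC resolved as: " + credentials.getClass().getSimpleName());
    }
}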
The error happens with the latest Apache Beam SDK (2.41.0) when BigQueryIO.Write.Method.STORAGE_WRITE_API is used and the destination does not specify the project name, for example dataset.table instead of project-id:dataset.table.
This is the solution that worked for me:
BigQueryIO.writeTableRows()
.to("project-id:dataset.table")
.withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
For some reason the Apache Beam implementation of the BigQuery Storage Write API does not handle this situation, even though it works fine with the FILE_LOADS method.
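For reference, here is a minimal batch sketch of that fix (Beam Java SDK with beam-sdks-java-io-google-cloud-platform assumed; "my-project", "dataset", "table" and the single-column schema are placeholders, not anything from the original pipeline):
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Collections;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class StorageWriteApiExample {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        TableSchema schema = new TableSchema().setFields(Collections.singletonList(
                new TableFieldSchema().setName("name").setType("STRING")));

        p.apply(Create.of(new TableRow().set("name", "example"))
                .withCoder(TableRowJsonCoder.of()))
         .apply(BigQueryIO.writeTableRows()
                // Fully qualified destination: project-id:dataset.table, not just dataset.table.
                .to("my-project:dataset.table")
                .withSchema(schema)
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
                .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API));

        p.run().waitUntilFinish();
    }
}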
You may also receive a slightly different error for the latest Beam SDK:
Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.RuntimeException:
java.lang.RuntimeException:
java.lang.RuntimeException: com.google.api.gax.rpc.PermissionDeniedException:
io.grpc.StatusRuntimeException:
PERMISSION_DENIED: Permission denied: Consumer 'project:null' has been suspended.
I'm working on testing Cloud Data Fusion in GCP by executing their quickstart tutorial. The tutorial I am following is here
I configured my environment to have all the appropriate permissions and get to the point where my Dataproc cluster is up and running and the job starts.
After a few minutes, the job fails with the following error:
java.io.IOException: com.jcraft.jsch.JSchException: java.net.ConnectException: Connection timed out (Connection timed out)
And:
io.grpc.netty.shaded.io.netty.channel.ChannelException: eventfd_write(...) failed: Bad file descriptor
For the second error, I manually changed the 'input' format to JSON instead of text (as it comes when you import the pipeline from the Hub), but still no luck. For the first error, I'm not exactly sure what's going wrong.
I have already reviewed the Creating a Cloud Data Fusion instance documentation, but still receive errors.
Any suggestions?
The WSO2 Business Rules Manager fails on deployment. I'm using Docker to connect the WSO2 dashboard and the WSO2 worker.
The error log shows the following:
ERROR {org.wso2.carbon.business.rules.core.services.TemplateManagerService} - Failed to update the deployed artifact for business rule myRule org.wso2.carbon.business.rules.core.exceptions.SiddhiAppsApiHelperException: Failed to update the siddhi app '#App:name('MyApp')
#App:description('MyDescription')
.
.
.
Siddi Template Code
.
.
.'
on node 'wso2sp-worker:9443' due to a validation error occurred when updating the siddhi app
at org.wso2.carbon.business.rules.core.deployer.SiddhiAppApiHelper.update(SiddhiAppApiHelper.java:139)
at org.wso2.carbon.business.rules.core.services.TemplateManagerService.updateDeployedSiddhiApp(TemplateManagerService.java:1400)
at org.wso2.carbon.business.rules.core.services.TemplateManagerService.updateDeployedArtifacts(TemplateManagerService.java:1388)
at org.wso2.carbon.business.rules.core.services.TemplateManagerService.redeployBusinessRule(TemplateManagerService.java:663)
at org.wso2.carbon.business.rules.core.api.impl.BusinessRulesApiServiceImpl.redeployBusinessRule(BusinessRulesApiServiceImpl.java:412)
at org.wso2.carbon.business.rules.core.api.BusinessRulesApi.redeployBusinessRule(BusinessRulesApi.java:235)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.wso2.msf4j.internal.router.HttpMethodInfo.invokeResource(HttpMethodInfo.java:187)
at org.wso2.msf4j.internal.router.HttpMethodInfo.invoke(HttpMethodInfo.java:143)
at org.wso2.msf4j.internal.MSF4JHttpConnectorListener.dispatchMethod(MSF4JHttpConnectorListener.java:218)
at org.wso2.msf4j.internal.MSF4JHttpConnectorListener.lambda$onMessage$57(MSF4JHttpConnectorListener.java:129)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
This can occur if the siddhi app created by the business rules manager is incorrect.
One possible reason for this is using an invalid siddhi app template to create business rules.
Therefore, can you check the following?
Create a siddhi app by filling up the templated fields in your templated siddhi app.
Copy that siddhi file to $SP_HOME/wso2/worker/deployment/siddhi-files directory.
Start the worker runtime.
If there is any issue with the template, the worker will fail to deploy that siddhi app and will log the relevant error.
I am trying to configure YARN 2.2.0 with Whirr on Amazon EC2, but I am having some problems. I have modified the Whirr services to support YARN 2.2.0, and as a result I am able to start jobs and run them successfully. However, I am facing an issue in tracking the job progress.
mapreduce.Job (Job.java:monitorAndPrintJob(1317)) - Running job: job_1397996350238_0001
2014-04-20 21:57:24,544 INFO [main] mapred.ClientServiceDelegate (ClientServiceDelegate.java:getProxy(270)) - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
java.io.IOException: Job status not available
at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:322)
at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:599)
at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1327)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1289)
at com.zetaris.hadoop.seek.preprocess.PreProcessorDriver.executeJobs(PreProcessorDriver.java:112)
at com.zetaris.hadoop.seek.JobToJobMatchingDriver.executePreProcessJob(JobToJobMatchingDriver.java:143)
at com.zetaris.hadoop.seek.JobToJobMatchingDriver.executeJobs(JobToJobMatchingDriver.java:78)
at com.zetaris.hadoop.seek.JobToJobMatchingDriver.executeJobs(JobToJobMatchingDriver.java:43)
at com.zetaris.hadoop.seek.JobToJobMatchingDriver.main(JobToJobMatchingDriver.java:56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212
I tried debugging; the problem is with the ApplicationMaster. It has a hostname and RPC port, where the hostname is the internal hostname, which can only be resolved from within the Amazon network. Ideally it should have been a public Amazon DNS name, but I couldn't set it yet. I tried setting parameters like
yarn.nodemanager.hostname
yarn.nodemanager.address
But I couldn't find any change in the ApplicationMaster's hostname or port; they are still the private Amazon internal hostname. Am I missing anything? Or should I change /etc/hosts on all NodeManager nodes so that the NodeManagers start with the public address?
But that would be overkill, right? Or is there any way I can configure the ApplicationMaster to take the public IP, so that I can track the progress remotely?
I am doing all this because I need to submit the jobs remotely, and I am not willing to compromise on that feature. Can anyone out there guide me?
I was successful in configuring the history server and I am able to access it from the remote client. I used the following configuration to do it:
mapreduce.jobhistory.webapp.address
When I debugged, I found the following (the snippet is from ClientServiceDelegate):
MRClientProtocol MRClientProxy = null;
try {
  MRClientProxy = getProxy();
  return methodOb.invoke(MRClientProxy, args);
} catch (InvocationTargetException e) {
  // Will not throw out YarnException anymore
  LOG.debug("Failed to contact AM/History for job " + jobId +
      " retrying..", e.getTargetException());
  // Force reconnection by setting the proxy to null.
  realProxy = null;
  // ... (the retry loop continues)
The proxy is failing to connect because of the private address.
I was able to work around the issue rather than solve it. The problem is with the resolution of the IP outside the cloud environment.
Initially I tried updating the whirr-yarn source to use the public IP for configurations rather than the private IP, but there were still issues, so I gave up on that.
What I finally did was to start the job from within the cloud environment itself, rather than from a host outside the cloud infrastructure. I hope somebody finds a better way.
I had the same problem. I solved it by adding the following lines to mapred-site.xml. This moves your staging directory from the default tmp directory to your home directory, where you have permission.
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
In addition to this, you need to create a history directory on hdfs:
hdfs dfs -mkdir -p /user/history
hdfs dfs -chmod -R 1777 /user/history
hdfs dfs -chown mapred:hadoop /user/history
I found this link quite useful for configuring a Hadoop cluster.
conf.set("mapreduce.jobhistory.address", "hadoop3.hwdomain:10020");
conf.set("mapreduce.jobhistory.intermediate-done-dir", "/mr-history/tmp");
conf.set("mapreduce.jobhistory.done-dir", "/mr-history/done");
I am trying to integrate an external Cassandra with BAM. I have changed cassandra-component.xml.
1) I want to know how keyspaces are created on the external Cassandra, because when I run BAM
I get the error Unknown keyspace EVENT_KS.
2) I am getting the following error in my WSO2 logs:
TID: [0] [BAM] [2014-02-11 15:28:30,905] WARN {org.apache.hadoop.mapred.JobClient} - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. {org.apache.hadoop.mapred.JobClient}
TID: [0] [BAM] [2014-02-11 15:37:04,393] ERROR {org.apache.hadoop.hive.ql.exec.ExecDriver} - Job Submission failed with exception 'java.lang.RuntimeException(org.apache.thrift.transport.TTransportException)'
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getRangeMap(ColumnFamilyInputFormat.java:297)
at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:105)
at org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat.getSplits(HiveCassandraStandardColumnInputFormat.java:291)
at org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat.getSplits(HiveCassandraStandardColumnInputFormat.java:216)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:302)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:292)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:933)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:925)
at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:839)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:792)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1123)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:792)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:766)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:733)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.apache.thrift.transport.TTransportException
EVENT_KS is created only after the very first event is published to BAM, as far as I remember. If you try to access it before it is created, errors may arise.
In BAM 2.4.0, EVENT_KS is created when you run BAM for the first time (in previous versions, EVENT_KS was created when the very first event was published to BAM). Please make sure your cassandra-component.xml looks similar to the example below. Also tell us which Cassandra version you are using.
<Cassandra>
  <Cluster>
    <Name>Test Cluster</Name>
    <DefaultPort>9160</DefaultPort>
    <Nodes>localhost:9160</Nodes>
    <AutoDiscovery disable="false" delay="1000"/>
  </Cluster>
</Cassandra>
First you need to check the following:
Have you pointed cassandra-component.xml correctly to the external Cassandra? With this, your published data will be stored in the intended external Cassandra database.
Have you installed a toolbox with the intended stream definition inside, or otherwise triggered data to be published to BAM? In both cases the EVENT_KS keyspace will be created, with a column family named after the stream.
Have you modified $BAM_HOME/repository/conf/datasource/master-datasource.xml to point to the external Cassandra database? You need to validate the Cassandra configuration provided in the WSO2BAM_CASSANDRA_DATASOURCE datasource. For the default toolboxes this is the Cassandra datasource being used, and by default it points to localhost. If you are using it in your Hive script, you need to change this configuration.
After putting in a lot of effort, I figured out that after changing the data directory in cassandra.yaml of the external Cassandra to repository/database/cassandra/data, everything works fine with the external Cassandra (version 1.1.3, that is). I want to know whether there is any other workaround for this external Cassandra configuration.