Error in SQL Launcher (java.lang.NullPointerException) in Google Dataflow SQL

I am trying to read data from a Pub/Sub topic using Google Dataflow SQL and am getting a "NullPointerException" error. Could anyone guide me on what I am doing wrong?
Below is the SQL query. I also tried selecting only a few columns; the same error occurs.
SELECT tr.* FROM pubsub.topic.`project-xxx`.county_timeranges as tr
Log from the Logs Explorer:
logger: "com.google.cloud.dataflow.sqllauncher.DataflowSqlLauncher"
message: "Error in SQL launcher"
exception: "java.lang.NullPointerException
at sun.reflect.annotation.TypeAnnotationParser.mapTypeAnnotations(TypeAnnotationParser.java:356)
at sun.reflect.annotation.AnnotatedTypeFactory$AnnotatedTypeBaseImpl.<init>(AnnotatedTypeFactory.java:139)
at sun.reflect.annotation.AnnotatedTypeFactory.buildAnnotatedType(AnnotatedTypeFactory.java:65)
at sun.reflect.annotation.TypeAnnotationParser.buildAnnotatedType(TypeAnnotationParser.java:79)
at java.lang.reflect.Executable.getAnnotatedReturnType0(Executable.java:633)
at java.lang.reflect.Method.getAnnotatedReturnType(Method.java:648)
at org.apache.beam.sdk.schemas.FieldValueTypeInformation.hasNullableReturnType(FieldValueTypeInformation.java:173)
at org.apache.beam.sdk.schemas.FieldValueTypeInformation.forGetter(FieldValueTypeInformation.java:132)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1380)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.apache.beam.sdk.schemas.AutoValueSchema$AbstractGetterTypeSupplier.get(AutoValueSchema.java:62)
at org.apache.beam.sdk.schemas.utils.FieldValueTypeSupplier.get(FieldValueTypeSupplier.java:43)
at org.apache.beam.sdk.schemas.utils.JavaBeanUtils.lambda$getFieldTypes$0(JavaBeanUtils.java:107)
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
at org.apache.beam.sdk.schemas.utils.JavaBeanUtils.getFieldTypes(JavaBeanUtils.java:106)
at org.apache.beam.sdk.schemas.AutoValueSchema.fieldValueTypeInformations(AutoValueSchema.java:78)
at org.apache.beam.sdk.schemas.CachingFactory.create(CachingFactory.java:52)
at org.apache.beam.sdk.schemas.FromRowUsingCreator.fromRow(FromRowUsingCreator.java:78)
at org.apache.beam.sdk.schemas.FromRowUsingCreator.apply(FromRowUsingCreator.java:62)
at org.apache.beam.sdk.schemas.FromRowUsingCreator.apply(FromRowUsingCreator.java:45)
at org.apache.beam.sdk.io.gcp.pubsub.PubsubSchemaIOProvider$PubsubSchemaIO.<init>(PubsubSchemaIOProvider.java:176)
at org.apache.beam.sdk.io.gcp.pubsub.PubsubSchemaIOProvider$PubsubSchemaIO.<init>(PubsubSchemaIOProvider.java:165)
at org.apache.beam.sdk.io.gcp.pubsub.PubsubSchemaIOProvider.from(PubsubSchemaIOProvider.java:124)
at org.apache.beam.sdk.io.gcp.pubsub.PubsubSchemaIOProvider.from(PubsubSchemaIOProvider.java:91)
at org.apache.beam.sdk.extensions.sql.meta.provider.SchemaIOTableProviderWrapper.buildBeamSqlTable(SchemaIOTableProviderWrapper.java:72)
at org.apache.beam.sdk.extensions.sql.meta.provider.datacatalog.DataCatalogTableProvider.buildBeamSqlTable(DataCatalogTableProvider.java:123)
at org.apache.beam.sdk.extensions.sql.meta.provider.FullNameTableProvider$TableNameTrackingProvider.buildBeamSqlTable(FullNameTableProvider.java:163)
at org.apache.beam.sdk.extensions.sql.impl.BeamCalciteSchema.getTable(BeamCalciteSchema.java:110)
at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:83)
at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:289)
at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getTable(CalciteSchema.java:657)
at org.apache.beam.sdk.extensions.sql.zetasql.TableResolution.resolveCalciteTable(TableResolution.java:57)
at org.apache.beam.sdk.extensions.sql.zetasql.SqlAnalyzer.addTableToLeafCatalog(SqlAnalyzer.java:326)
at org.apache.beam.sdk.extensions.sql.zetasql.SqlAnalyzer.lambda$createPopulatedCatalog$1(SqlAnalyzer.java:216)
at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:407)
at org.apache.beam.sdk.extensions.sql.zetasql.SqlAnalyzer.createPopulatedCatalog(SqlAnalyzer.java:216)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.rel(ZetaSQLPlannerImpl.java:98)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal(ZetaSQLQueryPlanner.java:168)
at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel(ZetaSQLQueryPlanner.java:156)
at org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery(BeamSqlEnv.java:110)
at org.apache.beam.sdk.extensions.sql.SqlTransform.expand(SqlTransform.java:135)
at org.apache.beam.sdk.extensions.sql.SqlTransform.expand(SqlTransform.java:86)
at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:542)
at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:493)
at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:186)
at com.google.cloud.dataflow.sqllauncher.DataflowSqlLauncher.buildPipelineOrThrow(DataflowSqlLauncher.java:177)
at com.google.cloud.dataflow.sqllauncher.DataflowSqlLauncher.buildPipeline(DataflowSqlLauncher.java:114)
at com.google.cloud.dataflow.sqllauncher.DataflowSqlLauncher.buildAndRunPipeline(DataflowSqlLauncher.java:105)
at com.google.cloud.dataflow.sqllauncher.DataflowSqlLauncher.main(DataflowSqlLauncher.java:73)
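For context, and not as a confirmed root cause: Dataflow SQL resolves Pub/Sub topics through Data Catalog, and the trace above fails while building the Pub/Sub table from the schema assigned there, so it is worth verifying that the topic has a valid schema assigned and that it includes the event_timestamp column Dataflow SQL requires. A sketch of the assignment step, following the Dataflow SQL documentation (the payload column below is hypothetical, since the real schema of county_timeranges is not shown here):

gcloud data-catalog entries update \
    --lookup-entry='pubsub.topic.`project-xxx`.county_timeranges' \
    --schema-from-file=schema.yaml

where schema.yaml lists one entry per column, for example:

- column: event_timestamp   # required for Pub/Sub sources
  mode: REQUIRED
  type: TIMESTAMP
- column: county            # hypothetical payload column
  mode: NULLABLE
  type: STRING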

Related

Executing queries from Intellij Ultimate DB plugin to AWS Timestream service

I am trying to use IntelliJ IDEA to query AWS Timestream, but I am having some problems with it.
When I execute the query in the AWS console it works just fine; executing the same query in
IntelliJ IDEA returns an error:
Error executing query with id "null": Requested database 'data-test' not found for identifier 'data-test.table-name' at line 1:15 (Service: AmazonTimestreamQuery; Status Code: 400; Error Code: ValidationException; Request ID: req-id; Proxy: null) com.tsshaded.amazonaws.services.timestreamquery.model.ValidationException: Requested database 'data-test' not found for identifier 'data-test.table-name' at line 1:15 (Service: AmazonTimestreamQuery; Status Code: 400; Error Code: ValidationException; Request ID: req-id; Proxy: null)
The query is:
SELECT * FROM "data-test"."table-name" LIMIT 10
I am using the latest driver, amazon-timestream-jdbc-1.0.2-shaded.jar.

Error returned: 'OLE DB or ODBC error: [DataSource.Error] Teradata: [Teradata Database] [3119] Continue request submitted but no response to return

Failed to save modifications to the server. Error returned: 'OLE DB or ODBC error: [DataSource.Error] Teradata: [Teradata Database] [3119] Continue request submitted but no response to return..'.
When trying to connect a view to my Power BI file, I get the above error once around 25M records have been imported. I do not have any issues with smaller tables.

ORA error in wso2 apim-analytics server

1) I'm getting the below error in the wso2carbon logs when I try to configure the WSO2 APIM-Analytics (2.1) server with an Oracle DB (version 12c). I have tried both ojdbc6.jar and ojdbc7.jar in the lib folder, but the error is still there.
error:
Caused by: java.lang.RuntimeException: ORA-28040: No matching authentication protocol
2) Is there any REST API available for WSO2 APIM-Analytics, similar to the DAS server, to extract data?
full error:
ERROR {org.wso2.carbon.analytics.spark.core.AnalyticsTask} - Error while executing the scheduled task for the script: APIM_LAST_ACCESS_TIME_SCRIPT {org.wso2.carbon.analytics.spark.core.AnalyticsTask}
org.wso2.carbon.analytics.spark.core.exception.AnalyticsExecutionException: Exception in executing query create temporary table APILastAccessSummaryData using CarbonJDBC options (dataSource "WSO2AM_STATS_DB", tableName "API_LAST_ACCESS_TIME_SUMMARY", schema "tenantDomain STRING , apiPublisher STRING , api STRING , version STRING , userId STRING , context STRING , max_request_time LONG ", primaryKeys "tenantDomain,apiPublisher,api" )
at org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor.executeQueryLocal(SparkAnalyticsExecutor.java:764)
at org.wso2.carbon.analytics.spark.core.internal.SparkAnalyticsExecutor.executeQuery(SparkAnalyticsExecutor.java:721)
at org.wso2.carbon.analytics.spark.core.CarbonAnalyticsProcessorService.executeQuery(CarbonAnalyticsProcessorService.java:201)
at org.wso2.carbon.analytics.spark.core.CarbonAnalyticsProcessorService.executeScript(CarbonAnalyticsProcessorService.java:151)
at org.wso2.carbon.analytics.spark.core.AnalyticsTask.execute(AnalyticsTask.java:60)
at org.wso2.carbon.ntask.core.impl.TaskQuartzJobAdapter.execute(TaskQuartzJobAdapter.java:67)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: ORA-28040: No matching authentication protocol
thanks,
Santosh
This is a known issue in Oracle, and the workaround is to set SQLNET.ALLOWED_LOGON_VERSION=8 in the $crs_home/network/admin/sqlnet.ora file. [1]
[1] https://community.softwaregrp.com/t5/UCMDB-and-UD-Practitioners-Forum/ORA-28040-No-matching-authentication-protocol/m-p/253403
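For illustration, the workaround is a single line in sqlnet.ora (a minimal sketch; the exact path depends on your installation, and on 12c the server-side equivalent parameter is SQLNET.ALLOWED_LOGON_VERSION_SERVER):

# $crs_home/network/admin/sqlnet.ora
SQLNET.ALLOWED_LOGON_VERSION=8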

Start token not found error while using JsonSerDe

I am trying to import JSON data from S3 and, after running some queries, export the output in JSON format back to S3. However, I get the "org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected" error at the Hive step on the EMR cluster. To understand what the problem is, I simplified the Hive script and JSON data, but it keeps giving the same error. How can I solve this problem?
Cluster configuration:
Release: emr-5.3.1
Hive version: 2.1.1
Hadoop distribution: Amazon 2.7.3
Service Role: EMR_DefaultRole
MasterInstanceType: m4.large
The content of the simplified JSON data:
[{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]
Hive script:
DROP TABLE IF EXISTS SOURCE;
DROP TABLE IF EXISTS DESTINATION;
CREATE EXTERNAL TABLE SOURCE(MyID STRING, MyField STRING)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3://myPath/subPath/';
CREATE EXTERNAL TABLE DESTINATION(MyID STRING, MyField STRING)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3://anotherPath/subPath/';
INSERT OVERWRITE TABLE DESTINATION SELECT MyID, MyField FROM SOURCE;
And here is the stack trace:
Vertex failed, vertexName=Map 4, vertexId=vertex_1278452616863_0001_1_00, diagnostics=[Task failed, taskId=task_1278452616863, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1278452616863:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable [{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable [{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:383)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable [{"MyID":"FOO123","MyField":"FOO"},{"MyID":"BAR123","MyField":"BAR"}]
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:183)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:128)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:92)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:488)
... 18 more
Caused by: java.io.IOException: Start token not found where expected
at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:169)
... 21 more
Thanks.
The JSON should start with { and not with an array ([).
I tried this approach and updated my JSON file with the structure:
{"MyID":"FOO123","MyField":"FOO"},
{"MyID":"BAR123","MyField":"BAR"}
but after doing so, I noticed that only the first object is being inserted into the table.
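For reference, this SerDe is line-oriented: it expects exactly one complete JSON object per line, with no enclosing array and no trailing comma, which is consistent with only the first record above being read. A version of the same two sample records that should load both rows:

{"MyID":"FOO123","MyField":"FOO"}
{"MyID":"BAR123","MyField":"BAR"}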

Tasks fail after BAM admin password change

After changing the default password for the admin user in WSO2 BAM 4.1.0, tasks fail with the following error:
TID: [0] [BAM] [2013-06-20 16:56:15,464] ERROR {org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl} - Error while executing Hive script.
Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask {org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl}
java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:189)
at org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl$ScriptCallable.call(HiveExecutorServiceImpl.java:355)
at org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl$ScriptCallable.call(HiveExecutorServiceImpl.java:250)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
TID: [0] [BAM] [2013-06-20 16:56:15,467] ERROR {org.wso2.carbon.analytics.hive.task.HiveScriptExecutorTask} - Error while executing script : am_stats_analyzer_460 {org.wso2.carbon.analytics.hive.task.HiveScriptExecutorTask}
org.wso2.carbon.analytics.hive.exception.HiveExecutionException: Error while executing Hive script.Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
at org.wso2.carbon.analytics.hive.impl.HiveExecutorServiceImpl.execute(HiveExecutorServiceImpl.java:117)
at org.wso2.carbon.analytics.hive.task.HiveScriptExecutorTask.execute(HiveScriptExecutorTask.java:60)
at org.wso2.carbon.ntask.core.impl.TaskQuartzJobAdapter.execute(TaskQuartzJobAdapter.java:56)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Reverting the password to its original value solves the issue.
How do I change the password for the admin user and keep the tasks working?
Have you changed the username and password in the Hive script am_stats_analyzer? The defaults are admin/admin; check the Hive script and update the password accordingly. The properties are as follows:
"cassandra.ks.username" = "admin",
"cassandra.ks.password" = "xxxxx",
Check if that fixes your issue.
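For orientation, in a BAM Hive script those properties sit inside the Cassandra-backed table definitions; a minimal sketch of the pattern (the table name and columns here are hypothetical, not copied from am_stats_analyzer):

CREATE EXTERNAL TABLE IF NOT EXISTS APIRequestSummaryData
    (key STRING, api STRING, request_count INT)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES (
    "cassandra.host" = "127.0.0.1",
    "cassandra.ks.name" = "EVENT_KS",
    "cassandra.ks.username" = "admin",
    "cassandra.ks.password" = "xxxxx" );

Every such table definition in the script needs its cassandra.ks.password updated to the new admin password.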
In order to solve the issue I had to perform the following steps:
edit the file [BAM_HOME]/repository/conf/etc/cassandra-auth.xml and change the password value to the new password.
edit the file [BAM_HOME]/repository/conf/datasources/master-datasources.xml and change the password value of the WSO2BAM_CASSANDRA_DATASOURCE datasource to the new password.
restart the BAM: the Hive tasks now run without errors.
where the new password is the password I assigned to the admin user.
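As an illustration of the second step (a sketch based on the stock master-datasources.xml layout; the surrounding elements in your file may differ), the change is to the <password> element of that datasource definition:

<datasource>
    <name>WSO2BAM_CASSANDRA_DATASOURCE</name>
    <definition type="RDBMS">
        <configuration>
            <!-- url, driverClassName, etc. stay as in your existing file -->
            <username>admin</username>
            <password>NEW_ADMIN_PASSWORD</password>
        </configuration>
    </definition>
</datasource>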
Moreover, the Main \ Manage \ Cassandra Keyspaces \ List page in the BAM UI, which was raising the following error, is now fixed:
org.wso2.carbon.cassandra.mgt.ui.CassandraAdminClientException: Error retrieving keyspace names !
(...)
Caused by: org.apache.axis2.AxisFault: InvalidRequestException(why:You have not logged in)
(...)
Sorry I couldn't follow up on the question earlier; anyway, glad your problem is sorted now! Keep on trying BAM and don't hesitate to holler if you run into any issues.
Thanks,
Shariq.