We are integrating data into Salesforce from a Hive database source using Informatica Cloud.
But when we use the SFDC standard API, it generates an error file containing all rejected records.
We have to re-process that error file to make sure all source records are loaded into Salesforce. The error file is generated on a Unix box at
path: \apps\Data_Integration_Server\data\error.
The file name will be in the format _timestamp.csv. Example: s_mtt_0103YB0Z000000000012_3_26_2020_12_21_standard_error.csv
Can anyone please help? I didn't find answers in the Informatica Cloud platform community.
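As a starting point for re-processing, the error files first have to be found and ordered. The following is a minimal sketch, not an Informatica feature: the file name pattern (task id, then month, day, year, hour, minute, then `_standard_error.csv`) is an assumption inferred only from the single example name above, and `parse_error_file_name` and `pending_error_files` are hypothetical helper names.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed pattern, inferred only from the one example file name in the post:
# <task id>_<month>_<day>_<year>_<hour>_<minute>_standard_error.csv
ERROR_FILE_RE = re.compile(
    r"^(?P<task>.+)_(?P<month>\d{1,2})_(?P<day>\d{1,2})_(?P<year>\d{4})"
    r"_(?P<hour>\d{1,2})_(?P<minute>\d{1,2})_standard_error\.csv$"
)


def parse_error_file_name(name):
    """Return (task id, timestamp) for a matching file name, else None."""
    match = ERROR_FILE_RE.match(name)
    if match is None:
        return None
    parts = {k: int(match[k]) for k in ("year", "month", "day", "hour", "minute")}
    return match["task"], datetime(**parts)


def pending_error_files(error_dir):
    """List error files in error_dir that match the assumed pattern, oldest first."""
    found = []
    for path in Path(error_dir).iterdir():
        parsed = parse_error_file_name(path.name)
        if parsed is not None:
            found.append((parsed[1], path))
    return [path for _, path in sorted(found, key=lambda item: item[0])]
```

Calling `pending_error_files` with the error directory returns the files in timestamp order, so each one can be fed back into a reload task and then archived. How the reload itself is triggered depends on the Informatica Cloud setup and is not covered here.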
Related
I'm trying to connect to HDFS from ADF (Azure Data Factory). I created a folder and a sample file (ORC format) and put the file in the newly created folder.
Then in ADF I successfully created a linked service for HDFS using my Windows credentials (the same user that was used to create the sample file).
But when I try to browse the data through the dataset, I get an error: The response content from the data store is not expected, and cannot be parsed.
Am I doing something wrong, or is it some kind of permissions issue?
Please advise.
This appears to be a generic issue: you need to point to a file with an appropriate extension rather than to the folder itself. Also make sure you are using a supported data store activity.
You can follow the official Microsoft documentation for using an HDFS server with Azure Data Factory.
I'm trying to load around 1000 files from Google Cloud Storage into BigQuery using the BigQuery transfer service, but it appears I have an error in one of my files:
Job bqts_601e696e-0000-2ef0-812d-f403043921ec (table streams) failed with error INVALID_ARGUMENT: Error while reading data, error message: CSV table references column position 19, but line starting at position:206 contains only 19 columns.; JobID: 931777629779:bqts_601e696e-0000-2ef0-812d-f403043921ec
How can I find which file is causing this error?
I feel like this is in the docs somewhere, but I can't seem to find it.
Thanks!
You can use bq show --format=prettyjson -j job_id_here, which shows a verbose error for the failed job. More information about the command is in the BigQuery managing jobs docs.
I tried this with a failed job of mine in which I was loading CSV files from a Google Cloud Storage bucket in my project.
Command used:
bq show --format=prettyjson -j bqts_xxxx-xxxx-xxxx-xxxx
Here is a snippet of the output. Output is in JSON format:
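To pull the individual errors out of that JSON programmatically, something like the sketch below may help. It assumes the output follows the BigQuery Job resource layout, where `status.errors` is a list of entries with `location` and `message` fields. `load_job_errors` is a hypothetical helper name, and the sample JSON is made up for illustration; it is not the output from the job above. Whether `location` actually names the offending file depends on the job.

```python
import json


def load_job_errors(job_json_text):
    """Return (location, message) pairs from a job's status.errors list."""
    job = json.loads(job_json_text)
    status = job.get("status", {})
    errors = status.get("errors") or []
    return [(err.get("location"), err.get("message")) for err in errors]


# Hypothetical sample shaped like a Job resource; the bucket and file are invented.
sample = json.dumps({
    "status": {
        "state": "DONE",
        "errors": [
            {
                "location": "gs://example-bucket/file_0042.csv",
                "message": "Error while reading data",
                "reason": "invalid",
            },
        ],
    }
})

for location, message in load_job_errors(sample):
    print(location, message)
```

In practice the command output can be redirected to a file and its contents passed to `load_job_errors` instead of the sample string.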
I'm trying to load ORC data files stored in GCS into BigQuery via bq load/bq mk and am getting the error below. The data files were copied via the hadoop distcp command from an on-prem cluster's Hive instance, version 1.2. Most of the ORC files load successfully, but a few do not. There is no problem when I read this data from Hive.
Command I used:
$ bq load --source_format ORC hadoop_migration.pm hive/part-v006-o000-r-00000_a_17
Upload complete.
Waiting on bqjob_r7233761202886bd8_00000175f4b18a74_1 ... (1s) Current status: DONE
BigQuery error in load operation: Error processing job '<project>-af9bd5f6:bqjob_r7233761202886bd8_00000175f4b18a74_1': Error while reading data, error message:
The Apache Orc library failed to parse metadata of stripes with error: failed to open /usr/share/zoneinfo/GMT-00:00 - No such file or directory
Indeed, there is no such file, and I believe there shouldn't be.
A Google search turns up nothing for this error message, but I found a similar problem here: https://issues.apache.org/jira/browse/ARROW-4966. There is a workaround for on-prem servers: creating a symlink for /usr/share/zoneinfo/GMT-00:00. But I'm in the cloud.
Additionally, I found that if I extract the data from the ORC file into JSON format via orc-tools, I'm able to load that JSON file into BigQuery. So I suspect that the problem is not in the data itself.
Has anybody come across this problem?
The official Google support position is below. In short, BigQuery doesn't understand some time zone descriptions, and we were advised to change them in the data. Our workaround was to convert the ORC data to Parquet and then load it into the table.
Indeed this error can happen. Also when you try to execute a query from the BigQuery Cloud Console such as:
select timestamp('2020-01-01 00:00:00 GMT-00:00')
you’ll get the same error. It is not just related to the ORC import, it’s how BigQuery understands timestamps. BigQuery supports a wide range of representations as described in [1]. So:
“2020-01-01 00:00:00 GMT-00:00” -- incorrect timestamp string literal
“2020-01-01 00:00:00 abcdef” -- incorrect timestamp string literal
“2020-01-01 00:00:00-00:00” -- correct timestamp string literal
In your case the problem is with the representation of the time zone within the ORC file. I suppose it was generated that way by some external system. If you were able to get the “GMT-00:00” string with preceding space replaced with just “-00:00” that would be the correct name of the time zone. Can you change the configuration of the system which generated the file into having a proper time zone string?
Creating a symlink is only masking the problem and not solving it properly. In case of BigQuery it is not possible.
Best regards,
Google Cloud Support
[1] https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#time_zones
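The string rewrite that the support reply describes (dropping " GMT" together with its preceding space, so that only the numeric offset remains) can be sketched as below. This only illustrates the rewrite on plain timestamp strings; it does not modify the time zone stored in ORC file metadata. `normalize_gmt_offset` is a hypothetical helper name, and the sketch assumes the sign of the offset carries over unchanged, as the reply implies for -00:00.

```python
import re

# Matches a trailing " GMT+HH:MM" or " GMT-HH:MM", including the preceding whitespace.
_GMT_OFFSET_RE = re.compile(r"\s+GMT([+-]\d{2}:\d{2})$")


def normalize_gmt_offset(timestamp_text):
    """Rewrite '... GMT-00:00' as '...-00:00'; leave other strings unchanged."""
    return _GMT_OFFSET_RE.sub(r"\1", timestamp_text)


print(normalize_gmt_offset("2020-01-01 00:00:00 GMT-00:00"))
```

Strings that do not end in a GMT offset, including ones that are invalid for other reasons, pass through untouched, so the result still needs to be validated by whatever consumes it.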
Unable to view the table contents in Tibco Spotfire Analytics Explorer.
I have copied the Hive EMR jar files to the library folder on the Tibco server. Once the server was up, I tried to configure the data source in Information Designer. I am able to set up the data source and I can see the schema. But when I try to expand the table, I do not get any results.
This is the error message I am getting:
Error message: An issue occurred while creating the default model. It may be partially constructed.
The data source reported a failure.
InformationModelException at Spotfire.Dxp.Data:
Error retrieving metadata: java.lang.NullPointerException (HRESULT: 80131500)
The following is the URL template in Information Designer:
jdbc:hive2://<host>:<port10000>/<database>
which I changed to
jdbc:hive2://sitename:10000/db_name.
Please let me know what needs to be changed in the driver config or anywhere else to see the contents of the table.
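One thing worth ruling out is a malformed connection URL, for example stray punctuation after the database name. The sketch below fills the template shown above and rejects obviously bad parts. `hive_jdbc_url` is a hypothetical helper, and the rule that a database name contains only letters, digits, and underscores is an assumption made for this check. It does not diagnose the NullPointerException itself.

```python
import re


def hive_jdbc_url(host, port, database):
    """Fill the jdbc:hive2://<host>:<port>/<database> template after basic checks."""
    if not host or "/" in host or ":" in host:
        raise ValueError(f"unexpected host: {host!r}")
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Assumption: database names use only letters, digits, and underscores.
    if not re.fullmatch(r"[A-Za-z0-9_]+", database):
        raise ValueError(f"unexpected database name: {database!r}")
    return f"jdbc:hive2://{host}:{port}/{database}"


print(hive_jdbc_url("sitename", 10000, "db_name"))
```

With the values from the post, a database name typed with a trailing period would raise a ValueError instead of producing a URL.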
Hi all,
I have a job scheduled by Tivoli for an Informatica workflow.
I have checked the property to save workflow logs for 5 runs.
The job runs fine through Informatica, but if I try to run it from Tivoli using pmcmd, it fails to rename the workflow log file.
Please help, I am getting this error:
Cannot rename workflow log file [E:\Informatica\etl_d\WorkflowLogs\wf_T.log.bin] to [E:\Informatica\etl_d\WorkflowLogs\wf_T.log.4.bin]. Please check the Integration Service log for more information.
Disconnecting from Integration Service
Check the log file name in the workflow's Edit options. Possibly you have the same workflow log file name for multiple workflows.
HTH
Irfan
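If the workflow names and their configured log file names can be listed somehow, the duplicate check suggested above can be sketched as follows. `find_shared_log_names` is a hypothetical helper, the workflow names other than wf_T are invented for illustration, and the comparison is case-insensitive on the assumption that the logs live on a Windows file system, as the E:\ path suggests.

```python
from collections import defaultdict


def find_shared_log_names(workflow_log_names):
    """Map each log file name used by more than one workflow to those workflows."""
    by_log_name = defaultdict(list)
    for workflow, log_name in workflow_log_names.items():
        # Lower-case the name so that wf_T.log and WF_T.LOG count as the same file.
        by_log_name[log_name.lower()].append(workflow)
    return {
        name: sorted(workflows)
        for name, workflows in by_log_name.items()
        if len(workflows) > 1
    }


# Hypothetical configuration: two workflows write to the same log file name.
configured = {"wf_T": "wf_T.log", "wf_U": "wf_T.log", "wf_V": "wf_V.log"}
print(find_shared_log_names(configured))
```

Any entry in the result points at workflows that could be holding or rotating the same log file at the same time.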