Informatica - Writing JSON target using Data Processor

I have an Informatica Developer (IDQ) mapping that uses a Data Processor transformation to read from a relational source (SQL Server) and write an NDJSON (newline-delimited JSON) file. However, the output file has an extra blank line between each JSON object.
Current Output
{"CustInfo":{"CustName":{"FirstName":"F1","LastName":"L1"}}}
{"CustInfo":{"CustName":{"FirstName":"F1","LastName":"L1"}}}
{"CustInfo":{"CustName":{"FirstName":"F1","LastName":"L1"}}}
Expected Output
{"CustInfo":{"CustName":{"FirstName":"F1","LastName":"L1"}}}
{"CustInfo":{"CustName":{"FirstName":"F1","LastName":"L1"}}}
{"CustInfo":{"CustName":{"FirstName":"F1","LastName":"L1"}}}
I tried different delimiters in the Output file properties, but none of them seems to work. Any suggestions?

REPLACESTR(1, Output_mplt_Output, CHR(10), CHR(13), '')
Use the above expression for the port coming from the Data Processor transformation; it strips the line-feed (CHR(10)) and carriage-return (CHR(13)) characters and fixes the extra-line issue.

Related

How to retrieve multiple NDJSON objects from the same file using ArduinoJson?

I am trying to use ArduinoJson to parse Google's quickdraw dataset, which contains .ndjson files with multiple objects inside. I figured out how to retrieve the first of the objects in the file using the following simple code:
DeserializationError deserialization_error = ArduinoJson::deserializeJson(doc, as_cstr);
if (deserialization_error) {
    printf("deserializeJson() failed: %s\n", deserialization_error.c_str());
}
However, this only parses the first object in the ndjson file.
According to the website, I get the sense that something else should happen automatically:
NDJSON, JSON Lines
When parsing a JSON document from an input stream, ArduinoJson stops reading as soon as the document ends (e.g., at the closing brace).
This feature allows to read JSON documents one after the other; for example, it allows to read line-delimited formats like NDJSON or JSON Lines.
{"event":"add_to_cart"}
{"event":"purchase"}
Is there some way to get the byte length of the parsed object so I can continue using the cstring to parse consecutive objects? I did print out the cstring and it does contain the entirety of the ndjson file.
I found it.
Just call deserializeJson multiple times:
DeserializationError error = deserializeJson(doc, sceneFile);
or:
deserializeJson(docline1, sceneFile);
deserializeJson(docline2, sceneFile);
deserializeJson(docline3, sceneFile);
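For a stream source, repeated calls advance through the file one object at a time, so a simple loop handles any number of lines. Here is a minimal sketch, assuming ArduinoJson 6 and an Arduino-style Stream (such as an open File) named sceneFile; the "word" field and the document capacity are assumptions to adapt to your data:
#include <ArduinoJson.h>

// Read NDJSON objects one at a time from a stream opened elsewhere.
// deserializeJson() stops at the end of each document, so every call
// parses the next line of the NDJSON file.
void readAllObjects(Stream& sceneFile) {
  StaticJsonDocument<512> doc;  // capacity is an assumption; size it to your objects
  while (deserializeJson(doc, sceneFile) == DeserializationError::Ok) {
    const char* word = doc["word"];  // "word" is one field in the quickdraw .ndjson files
    printf("parsed object: %s\n", word ? word : "(no word)");
  }  // the loop ends when the stream runs out of documents
}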

Skip top N lines in Snowflake load

My actual data in the CSV extracts starts at line 10. How can I skip the top few lines in a Snowflake load using COPY or any other utility? Is there anything similar to SKIP_HEADER?
I have the files on S3 and that is my stage. I will be creating a Snowpipe on this data source later.
Yes, there is a SKIP_HEADER option for CSV that lets you skip a specified number of rows when defining a file format. Please have a look here:
https://docs.snowflake.net/manuals/sql-reference/sql/create-file-format.html#type-csv
So you create a file format for the CSV files you have in mind and then reference it when calling the COPY command.
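For example, a minimal sketch assuming your data starts at line 10 (so the first 9 lines are skipped); my_csv_format, my_s3_stage and my_table are placeholder names:
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  SKIP_HEADER = 9;

COPY INTO my_table
  FROM @my_s3_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');
The same named file format can later be referenced in the COPY statement of the Snowpipe you plan to create.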

How can I load CSV file data into BigQuery using the bq load command?

I have created multiple tables in a BigQuery dataset, and I have many CSV files stored in GCS that I now want to load into those tables. I used the command below to load data from a CSV file into a table; the table contains 300+ columns.
bq load ARADMIN.T1851 gs://new007/t1851data.csv C1:STRING,C2:STRING,C3:NUMERIC,C4:STRING,C5:STRING,C6:NUMERIC,C7:NUMERIC,C8:STRING,C112:STRING,C179:STRING,C60900:STRING,C3004100:STRING,C10000001:STRING,C10003001:NUMERIC,C200000003:STRING,C200000004:STRING,C200000005:STRING,C200000006:STRING,C200000007:STRING,C200000012:STRING,C230000009:STRING,C240001002:STRING,C240001003:STRING,C240001005:STRING,C250000023:NUMERIC,C260000001:STRING,C300270800:STRING,C300270900:STRING,C300271000:STRING,C300271200:STRING,C300617700:STRING,C301090500:STRING,C301284400:STRING,C301290300:NUMERIC,C301321300:STRING,C301368700:STRING,C301389272:STRING,C301390090:NUMERIC,C301391782:STRING,C301412900:NUMERIC,C301540300:STRING,C301541000:STRING,C301541600:STRING,C301541700:NUMERIC,C301550900:NUMERIC,C301571900:STRING,C301572000:STRING,C301572100:STRING,C301572200:STRING,C301600300:STRING,C301610100:STRING,C301612200:STRING,C301626500:NUMERIC,C301629100:STRING,C301667500:NUMERIC,C301674600:NUMERIC,C301734000:STRING,C301735100:STRING,C301736700:STRING,C301788500:STRING,C301807600:STRING,C301809900:STRING,C301810000:STRING,C301827100:STRING,C301827300:STRING,C301920700:STRING,C301920800:STRING,C301920900:STRING,C301921000:STRING,C301921100:STRING,C301921200:STRING,C301921300:STRING,C301921400:STRING,C301921500:STRING,C301921600:STRING,C301921700:STRING,C301921800:STRING,C301921900:STRING,C303070100:NUMERIC,C303070200:NUMERIC,C303356300:STRING,C303497300:STRING,C303497400:STRING,C303497500:STRING,C303519300:STRING,C303522900:STRING,C303523900:STRING,C303544200:STRING,C303544300:STRING,C303558600:STRING,C303595900:NUMERIC,C303601600:STRING,C303601700:STRING,C303616500:STRING,C303720800:NUMERIC,C303755200:STRING,C303758300:STRING,C303790700:STRING,C1000000000:STRING,C1000000001:STRING,C1000000002:STRING,C1000000003:STRING,C1000000004:STRING,C1000000010:STRING,C1000000014:STRING,C1000000017:STRING,C1000000018:STRING,C1000000019:STRING,C1000000020:STRING,C1000000022:NUMERIC,C1000000026:NUMERIC,C1000000027:NUMERIC,C1000000028:STRING,C1000000029:STRING,C1000000030:STRING,C1000000031:STRING,C1000000035:STRING,C1000000036:STRING,C1000000037:STRING,C1000000039:STRING,C1000000046:STRING,C1000000048:STRING,C1000000054:STRING,C1000000056:STRING,C1000000063:STRING,C1000000064:STRING,C1000000065:STRING,C1000000069:STRING,C1000000074:STRING,C1000000079:STRING,C1000000080:STRING,C1000000082:STRING,C1000000099:NUMERIC,C1000000109:STRING,C1000000118:NUMERIC,C1000000145:STRING,C1000000150:NUMERIC,C1000000151:STRING,C1000000156:STRING,C1000000161:STRING,C1000000162:NUMERIC,C1000000163:NUMERIC,C1000000164:NUMERIC,C1000000169:NUMERIC,C1000000188:STRING,C1000000215:NUMERIC,C1000000217:STRING,C1000000218:STRING,C1000000239:STRING,C1000000251:STRING,C1000000296:NUMERIC,C1000000298:STRING,C1000000300:STRING,C1000000342:STRING,C1000000396:STRING,C1000000422:STRING,C1000000426:STRING,C1000000427:STRING,C1000000541:STRING,C1000000557:NUMERIC,C1000000558:NUMERIC,C1000000559:NUMERIC,C1000000560:NUMERIC,C1000000561:NUMERIC,C1000000562:NUMERIC,C1000000563:NUMERIC,C1000000564:NUMERIC,C1000000565:NUMERIC,C1000000566:NUMERIC,C1000000567:NUMERIC,C1000000571:NUMERIC,C1000000572:NUMERIC,C1000000631:NUMERIC,C1000000642:NUMERIC,C1000000652:STRING,C1000000715:STRING,C1000000716:STRING,C1000000731:NUMERIC,C1000000744:STRING,C1000000745:STRING,C1000000746:STRING,C1000000854:STRING,C1000000869:NUMERIC,C1000000875:STRING,C1000000878:NUMERIC,C1000000942:STRING,C1000000964:NUMERIC,C1000000984:NUMERIC,C1000000985:NUMERIC,C1000000987:NUMERIC,C1000001025:NUMERIC,C1000001
165:STRING,C1000001259:NUMERIC,C1000001288:NUMERIC,C1000001296:NUMERIC,C1000001317:NUMERIC,C1000001319:NUMERIC,C1000001445:NUMERIC,C1000001446:NUMERIC,C1000001555:NUMERIC,C1000001600:NUMERIC,C1000002488:STRING,C1000002613:NUMERIC,C1000003009:NUMERIC,C1000003302:STRING,C1000003662:STRING,C1000003663:STRING,C1000003664:STRING,C1000003752:STRING,C1000003753:NUMERIC,C1000003754:STRING,C1000003755:STRING,C1000003756:STRING,C1000003757:NUMERIC,C1000003761:NUMERIC,C1000003764:NUMERIC,C1000003765:NUMERIC,C1000003779:STRING,C1000003781:NUMERIC,C1000003888:STRING,C1000003889:STRING,C1000003890:STRING,C1000003891:STRING,C1000003892:STRING,C1000003893:STRING,C1000003894:STRING,C1000003895:STRING,C1000003896:STRING,C1000003897:STRING,C1000003898:STRING,C1000003899:NUMERIC,C1000003988:STRING,C1000005261:NUMERIC,C1000005661:NUMERIC,C1000005735:NUMERIC,C1000005736:NUMERIC,C1000005781:STRING,C1000005782:STRING,C1000005783:STRING,C1000005784:STRING,C1000005785:STRING,C1000005786:STRING,C1000005787:STRING,C1000005788:STRING,C1000005789:STRING,C1000005790:STRING,C1000005791:STRING,C1000005897:STRING,C1000005898:STRING,C1000005899:STRING,C1000005900:STRING,C1000005901:STRING,C1000005902:STRING,C1000005903:STRING,C1000005904:STRING,C1000005905:STRING,C1000005906:STRING,C1000005908:STRING,C1000005909:STRING,C1000005910:STRING,C1000005911:STRING,C303898800:NUMERIC,C303901000:NUMERIC,C303979600:NUMERIC,C1000005970:STRING,C536870913:NUMERIC,C536870916:NUMERIC,C1000005980:NUMERIC,C650000020:STRING,C650000021:STRING,C650000060:STRING,C536870914:NUMERIC,C536870915:STRING,C650000023:NUMERIC,C60903:STRING,C304302260:STRING,C304309530:STRING,C304309540:STRING,C304313170:STRING,C304379051:NUMERIC,C304379701:STRING,C304384051:NUMERIC,C304384081:STRING,C700000019:NUMERIC,C700000023:NUMERIC,C650000024:STRING,C700000988:NUMERIC,C781290302:NUMERIC,C60901:STRING,C60989:STRING,C301743800:STRING,C304384321:NUMERIC,C304401091:NUMERIC,C304404731:NUMERIC,C304405171:NUMERIC,C304405181:NUMERIC,C304409261:NUMERIC,C420050000:NUMERIC,C420050001:STRING,C420050002:STRING,C420050003:NUMERIC,C420050004:NUMERIC,C420050005:NUMERIC,C420050006:STRING,C420050008:NUMERIC,C420050100:STRING,C420050101:STRING,C420050102:NUMERIC,C420050103:STRING,C759110005:STRING,C759110004:NUMERIC,C759000017:NUMERIC,C536870919:STRING,C800001001:STRING,C800001012:STRING,C800001025:NUMERIC,C536870918:NUMERIC,C536870920:NUMERIC
========================================================================
Error:
Waiting on bqjob_r1f519b38e9e242d9_0000016f80889252_1 ... (1s) Current status: DONE
BigQuery error in load operation: Error processing job
'upbeat-repeater-257414:bqjob_r1f519b38e9e242d9_0000016f80889252_1':
Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details. Failure details:
gs://new007/t1851data.csv: Error while reading data, error message: CSV table references column position 317, but line starting at
position:0 contains only 138 columns.
You are loading data without specifying data format, data will be treated as CSV format by default. If this is not what you mean, please
specify data format by --source_format.
How can I load these CSV files without any errors? Please suggest a hassle-free strategy for loading the data.
The issue you are experiencing seems to be with the CSV file itself: perhaps your file contains characters such as a newline \n or a carriage return \r inside field values, so the file is not interpreted correctly. This is consistent with the error you are receiving:
CSV table references column position 317, but line starting at position:0
contains only 138 columns
It is saying that the first line of your CSV has only 138 columns while you are specifying 317. Therefore, you will need to clean the CSV file first, removing the characters that confuse the parser, and then run the command again.
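If the embedded newlines sit inside quoted fields, you may not need to rewrite the file at all: bq load has an --allow_quoted_newlines flag for CSV. A sketch reusing the table and bucket path from the question; t1851_schema.json is a hypothetical local file holding the same column list shown above:
bq load --source_format=CSV --allow_quoted_newlines \
  ARADMIN.T1851 gs://new007/t1851data.csv ./t1851_schema.json
If the file really is malformed (stray unquoted newlines), this flag will not help and the file has to be cleaned first.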

Read data from QlikView into SAS

I have a QlikView data file.
The file includes:
An XML-formatted table header. After loading the XML table header into SAS, I received some SAS datasets that describe the structure of the QlikView data file, and I created an empty SAS dataset that matches that structure.
The actual data looks like this: " à# à# à# # ÁZ¦EÀ ÁZ¦EÀÁZ¦EÀ ", but I don't know what format the data is stored in.
Now I need to load the actual data into this empty SAS dataset, but I don't know how. My leader suggested "reading binary data". I tried reading it with the INFILE and INPUT statements using ENCODING, but my attempts were not successful because it is very hard to determine the binary informat.
Can somebody give me a suggestion? Thanks so much!

Greenplum to SAS Bulkload gpfdist error - line too long in file

I'm currently doing a bulk load from Greenplum to SAS. Initially there was one field with a backslash "\" at the end of a column, which threw an error during loading. To resolve it I changed the format from TEXT to CSV and it worked fine. But when loading more data I encountered this error:
gpfdist error - line too long in file
I've been doing some searching but couldn't determine whether the cause is the max_length value set when starting the gpfdist service. I also saw that there is a 1 MB limit on Windows? I would greatly appreciate your help.
By the way, here is some additional info which might help:
-Greenplum version: 4.2.1.0 build 3
-Gpfdist installed in Windows along with SAS Applications
-Script submitted to Greenplum based on SAS Logs:
CREATE EXTERNAL TABLE ( ) LOCATION ('gpfdist://:8081/fileout.dat')
FORMAT 'CSV' ( DELIMITER '|' NULL '\N') ENCODING 'LATIN1'
Thanks!
"Line too long" sorts of errors usually indicate that you've got extra delimiters buried in VARCHAR/TEXT columns that throw the parsing of the file off.
Another possibility is that you've got hidden control characters, extra line breaks or other nasty stuff hidden in your file that again throws your formatting off. Gpfdist can handle a lot of different data errors and keep going, but extra delimiters throw it for a loop.
Scan your load file looking for extra pipe characters in a line.
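One quick way to do that scan is to count the fields on every line and look for outliers; a sketch, assuming the pipe-delimited export is the fileout.dat file served by gpfdist:
awk -F'|' '{ print NF }' fileout.dat | sort -n | uniq -c
Any line whose field count differs from the majority contains extra (or missing) pipe characters.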
Another option would be to re-export your data, picking a different delimiter.
Please try an alternative solution: set the input format to TEXT and the client encoding to ISO_8859_5 in the session, and see if that helps. In my case it worked.