Redshift copy command failure - amazon-web-services

Redshift copy command failure - amazon-web-services

I am using Amazon Redshift COPY command to insert new rows into a table.
The copy command fails and an error message coming up:
index "pg_toast_16408_index" is not a btree
I have noticed that the problem occurs because of description field that contains long string. When I try to copy without this field it works!
Does someone know why is that? How can I overcome this issue?

Use the TRUNCATECOLUMNS parameter:
Truncates data in columns to the appropriate number of characters so that it fits the column specification. Applies only to columns with a VARCHAR or CHAR data type, and rows 4 MB or less in size.

Related

Create.io Order of column when creating a table

I have CrateDb version 3.2.7 running under Windows Server 2012. I create a table like this:
create table test3 (firstcolumn bigint primary key, secondcolumn int, thirdcolumn timestamp, fourthcolumn double, fifthcolumn double, sixtcolumn smallint, seventhcolumn double, heightcolumn int, ninthcolumn smallint, tenthcolumn smallint) clustered into 12 shards with(number_of_replicas = 0, refresh_interval =0);
So I'm expecting the firstcolumn to be the first, and so on. But after the creation, when I do a SELECT * FROM test3, I get the following result:
It seems that the first column returned is the "fifth" Looks like columns are returned in alphabetical order.
Does it means that CrateDB created the columns in that order? Does it keeps the order somewhere? If columns are in alphabetical order, does that mean that if I want to COPY data from another dbms to CrateDB, then I have to export data based on alphabetical order?

For insert not necessarily, only if they are omitted do they have to be in an alphabetical order see here. Order doesn't seem to be "kept" anywhere per se.
COPY FROM is a different kind of import tactic and not quite what the good old INSERT would do. I would suggest writing a command line app to import data into cratedb. COPY FROM doesn't do any type checking, nor does it cast types and will always import the data as it was in the source file (see here). From your other question I see you may have gps related data (?) you will need to manually map them to a GEO_POINT type just as 1 example.
Crate offers good performance (whatever that means to you or me) with bulk endpoint

Pentaho - Execute every input row, if fail one row continue

I'm creating a ETL but i don't how to do it.
In table input I get my data flow this data can have problems of length, type etc ... Then, I just want to insert correct rows. The uncorrects I just want create error, to be picked up by "Jenkins"
Steps:
Transformation: Obtain rows
Table input
Copy rows to result
Transformation: Load rows (execute every input row)
Get rows from result
Data Validator
Table output (This in reality is another "copy to rows to result", this data is needed by another table input)
How can I fix it?
Thanks a lot!

#guibos You don't need a datavalidator. Do error handling in table output which will redirect the wrong stream of data into error table or erroneous file. OR Do the error handling of Data validator and redirect the data to error table. Please let me know if you need any help.

Error with upgrade codeunit when changing table's PK length

I have table A-Z. Table A has PK of ID, and all other tables has fields that relates to TableA's ID.
I'm being tasked to do code cleanup, and I need to change the TableA's ID from length 30 to 20. I have done for other table B-Z, together with the upgrade codeunit. But when I try to change for TableA, I get this error:
"The are changes related to the following primary key that can cause data loss in the new table. The changes cannot be handled because the TableUpgradeMode of the TableSyncSetup type function for the changed table is set to Copy, which does not copy data to the new table. To fix this issue, you must change the TableUpgradeMode option to Move, then add C/AL code to an Upgrade type function to handle new table data."
What does the error mean? Do I need to change TableA's upgrade codeunit from TableSyncSetup.Mode::Copy to ::Move? Any guidance?
I'm using Dynamics NAV 2016.

Yes, you have to change the mode to Move but you also have to create a new table which holds the data temporarily from the fields where you've reduced the field length. You also have to handle the possible data truncation issue because of the reduced field length.
But I would do this in a different way (the old way from the Upgrade Toolkits):
- Create a new table with the same field length (30), copy the field contents and clear the fields (using a codeunit)
- Change the field lengths and but choose Force when NAV is asking about the Sync Mode (because you know that there is no data in those fields - SQL can drop and recreate the columns)
- Using a second codeunit copy the data back into the reduced fields - handle the truncation
I hope it helps

AWS Aurora ALTER TABLE not working

I'm trying to add a new column to a table that weighs about 20GB using:
ALTER TBLE ... ALGORYTHM = INPLACE
After about one hour of processing, the ALTER command fails and returns the following error without adding the column:
ERROR 1034 (HY000): Incorrect key file for table '[TABLE]'; try to repair it
Any idea why is this happening?

Seems to be an issue related to temporary disk space.
It's a known problem in Aurora: https://forums.aws.amazon.com/message.jspa?messageID=691512

Redshift COPY command delimiter not found

I'm trying to load some text files to Redshift. They are tab delimited, except for after the final row value. That's causing a delimiter not found error. I only see a way to set the field delimiter in the COPY statement, not a way to set a row delimiter. Any ideas that don't involve processing all my files to add a tab to the end of each row?
Thanks

I don't think the problem is with missing <tab> at the end of lines. Are you sure that ALL lines have correct number of fields?
Run the query:
select le.starttime, d.query, d.line_number, d.colname, d.value,
le.raw_line, le.err_reason
from stl_loaderror_detail d, stl_load_errors le
where d.query = le.query
order by le.starttime desc
limit 100
to get the full error report. It will show the filename with errors, incorrect line number, and error details.
This will help to find where the problem lies.

You can get the delimiter not found error if your row has less columns than expected. Some CSV generators may just output a single quote at the end if last columns are null.
To solve this you can use FILLRECORD on Redshift copy options.

From my understanding the error message Delimiter not found may be caused also by not specifying correctly the COPY command, in particular by not specifying the Data format parameters https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
In my case I was trying to load Parquet data with this expression:
COPY my_schema.my_table
FROM 's3://my_bucket/my/folder/'
IAM_ROLE 'arn:aws:iam::my_role:role/my_redshift_role'
REGION 'my-region-1';
and I received the Delimiter not found error message when looking into the system table stl_load_errors. But specifying I'm dealing with Parquet data in the expression in this way:
COPY my_schema.my_table
FROM 's3://my_bucket/my/folder/'
IAM_ROLE 'arn:aws:iam::my_role:role/my_redshift_role'
FORMAT AS PARQUET;
solved my problem and I was able to correctly load the data.

I know this was answered, but I just dealt with the same error and I had a simple solution so i'll share it.
This error can also be solved by stating the specific columns of the table that are copied from the s3 files (if you know what are the columns in the data on s3).
In my case the data had less columns than the number of columns in the table.
Madahava's answer with the 'FILLRECORD' option DID solve the issue for me but then I noticed a column that was supposed to filled up with default values, remained null.
COPY <table> (col1, col2, col3) from 's3://somebucket/file' ...

This may not be directly related to the OP's question but I received the same Delimiter not found error which was caused by newline characters within one of the fields.
For any field that you think may have newline characters you can remove them with:
replace(my_field, chr(10), '')

When you send fewer fields than expected on the destin table, it will also throw this error.

I'm sure there are multiple scenarios that would return this error. I just came across one that I don't see mentioned in the other answers while I was debugging someone else's code. The COPY had the EXPLICIT_IDS option listed, the table it was trying to import into had a column with a data type of identity(1,1), but the file it was trying to import into Redshift did not have an ID field. It made sense for me to add the identity field to the file. But, I imagine removing the EXPLICIT_IDS option would also have fixed the issue.

So recently I came across of this Delimiter not found error in Redshift SQL while loading the data with copy command. In my case, the problem was with column numbers.
I had created a table with 20 columns but I was loading the file with 21 columns.
I corrected it in my table by making 21 columns in the table and then re-loaded the data and boom it worked.
Hope it will be helpful to those who are facing the same kind of problem.
Ta-da

Sometimes this pops up when you dont specify the file type, for example CSV
Ref: https://docs.aws.amazon.com/redshift/latest/dg/tutorial-loading-run-copy.html
copy "dev"."my"."table" from 's3://bucket/myfile_upload.csv' credentials 'aws_iam_role=arn:aws:iam::2112277888:role/RedshiftAccessRole' IGNOREHEADER 1 csv;

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Redshift copy command failure - amazon-web-services

Use the TRUNCATECOLUMNS parameter: Truncates data in columns to the appropriate number of characters so that it fits the column specification. Applies only to columns with a VARCHAR or CHAR data type, and rows 4 MB or less in size.

Related

Create.io Order of column when creating a table

Pentaho - Execute every input row, if fail one row continue

Error with upgrade codeunit when changing table's PK length

AWS Aurora ALTER TABLE not working

Redshift COPY command delimiter not found

Categories

Resources