COPY command: AWS Aurora PostgreSQL, s3 extension

I'm trying to execute a COPY command to import a CSV file from S3 (the result of an UNLOAD command from Redshift) into an Amazon Aurora database using the aws_s3.table_import_from_s3 function (described in "Using the aws_s3.table_import_from_s3 Function to Import Amazon S3 Data"), but I don't know how to indicate the quote character in the command.
SELECT aws_s3.table_import_from_s3(
  'hr.person',
  '',
  '(FORMAT CSV, HEADER true, QUOTES ''"'')',
  aws_commons.create_s3_uri('redshift-unload-tmp', 'resul_file.csv', 'us-east-2')
);
Thanks

I'll explain my use case in more detail:
Source
I used the UNLOAD command to "export" data from a table in Redshift; this is the command:
UNLOAD('SELECT * FROM schema.table')
TO 's3://bucket-name/prefix_'
HEADER
CSV
NULL AS '\000'
IAM_ROLE 'arn:aws:iam::accountNumber:role/aRoleWithRedshiftAndS3Permissions';
Target
I need to load the Redshift data (now files in S3) into an RDS database (Aurora PostgreSQL). Before importing, I renamed the files in S3 to add the .csv extension. I used pgAdmin 4 as the PostgreSQL client and opened a query editor to execute the following commands:
Add the aws_s3 extension to the database:
CREATE EXTENSION aws_s3 CASCADE;
NOTICE: installing required extension "aws_commons"
Execute the function to import the file from S3:
select aws_s3.table_import_from_s3(
  'schema.table_name',
  '',
  '(FORMAT CSV, HEADER true)',
  aws_commons.create_s3_uri('sample_s3_bucket_name', 'source_file_name.csv', 'aws-region')
);
Note: with the CSV format the default quote character is ", so you don't need to pass the quote as an option parameter; you only need to do so if your file uses a different quote character.
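For example, if the file had been written with a non-default quote character such as *, my understanding is that you would pass it through the COPY options string with the QUOTE option; this sketch reuses the placeholder table, bucket, and file names from the question above:
SELECT aws_s3.table_import_from_s3(
  'hr.person',
  '',
  '(FORMAT CSV, HEADER true, QUOTE ''*'')', -- QUOTE '*' is a hypothetical non-default quote character
  aws_commons.create_s3_uri('redshift-unload-tmp', 'resul_file.csv', 'us-east-2')
);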

Related

Use SQL Workbench to import a csv file to an AWS Redshift database

I'm looking for a manual and an automatic way to use SQL Workbench to import/load a LOCAL csv file to an AWS Redshift database.
The manual way could be clicking through a navigation bar and selecting an option.
The automatic way could be some query code that loads the data, which I can just run.
Here's my attempt:
There's an error, "my target table in AWS is not found," but I'm sure the table exists. Does anyone know why?
WbImport -type=text
-file='C:\myfile.csv'
-delimiter=,
-table=public.data_table_in_AWS
-quoteChar=^
-continueOnError=true
-multiLine=true
You can use WbImport in SQL Workbench/J to import data.
For more info: http://www.sql-workbench.net/manual/command-import.html
As mentioned in the comments, the COPY command provided by Redshift is the optimal solution. You can COPY from S3, EC2, etc.
S3 Example:
copy <your_table>
from 's3://<bucket>/<file>'
access_key_id 'XXXX'
secret_access_key 'XXXX'
region '<your_region>'
delimiter '\t';
For more examples:
https://docs.aws.amazon.com/redshift/latest/dg/r_COPY_command_examples.html
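Since the question is about a CSV file, once it has been uploaded to S3 a variant using an IAM role and the CSV keyword might look like the sketch below; the table name is taken from the question's WbImport attempt, while the bucket, file, account, and role names are placeholders (drop ignoreheader if the file has no header row):
copy public.data_table_in_AWS
from 's3://<bucket>/myfile.csv'
iam_role 'arn:aws:iam::<account-id>:role/<role-name>' -- role must have read access to the bucket
csv             -- parse as CSV (default delimiter ',' and quote '"')
ignoreheader 1; -- skip the header row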

How to execute AWS S3 to Redshift Copy command from SQL script?

I am trying to copy some files from S3 to Redshift using the COPY command. I used the following command through SQL Workbench and it worked fine; it copied the data to the Redshift table.
copy <Redshift table name>
from 's3://my-bucket/path/to/directory/part'
iam_role 'arn:aws:iam::<IAM ROLE>'
delimiter '|' dateformat 'auto' IGNOREHEADER AS 1;
But when I copied the same command into a .sql file and tried to execute that file using AWS Data Pipeline, the pipeline just fails without giving any explicit error.
Due to some issues with an internally developed pipeline definition generation tool, I am not able to use the CopyToRedshift type activity.
I would like to know how I can execute this copy command from a file.
Try this out!
COPY Table_Name
FROM 's3://<S3 file path>'
CREDENTIALS '<AWS credentials>'
IGNOREHEADER AS 1
ACCEPTINVCHARS
DELIMITER '|';
This copy command should work from a SQL file. If not, check for errors in the stl_load_errors system table.
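For reference, the load errors end up in Redshift's system tables, so a quick check of recent failures could look like this:
-- Show the most recent COPY/load errors, newest first
select starttime, filename, line_number, colname, err_reason
from stl_load_errors
order by starttime desc
limit 10;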

CSV file in Amazon S3 to Amazon SQL Server RDS

Is there any sample showing how to copy data from a CSV file in Amazon S3 into a Microsoft SQL Server Amazon RDS instance?
The documentation only mentions importing data from a local db into RDS.
The approach would be: spin up an EC2 instance, copy the S3 CSV files onto it, and then from there use the BULK INSERT command. Example:
BULK INSERT SchoolsTemp
FROM 'Schools.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
TABLOCK
)
All this can be stitched together in AWS Data Pipeline.
It looks like they have set up SQL Server RDS integration with S3. I found an AWS docs article which explains it in good detail.
After you've set up the proper credentials, it appears they added specific stored procedures to download (and upload/delete) files to a D:\S3 directory. I haven't personally done this, but I thought I would share it since a comment on the other post mentions that BULK INSERT isn't supported; this would provide a way for BULK INSERT to work using a file from S3.
Copy the file to the RDS instance:
exec msdb.dbo.rds_download_from_s3
@s3_arn_of_file='arn:aws:s3:::bucket_name/bulk_data.csv',
@rds_file_path='D:\S3\seed_data\data.csv',
@overwrite_file=1;
Then run the BULK INSERT:
BULK INSERT MyData
FROM 'D:\S3\seed_data\data.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
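If the S3 integration procedures from that article are enabled on the instance, the staged copy under D:\S3 can apparently be cleaned up afterwards with the companion delete procedure; this is untested and simply mirrors the pattern from the same AWS docs:
-- Remove the staged file from the instance's D:\S3 directory once the load is done
exec msdb.dbo.rds_delete_from_filesystem
@rds_file_path='D:\S3\seed_data\data.csv';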

Why is the DynamoDB default import pipeline not loading any data?

I have successfully exported a table with a single row from DynamoDB to S3. I then cleared the table and tried to import the same file back in but I can't get it to work.
Rephrasing what the "Import Data from Amazon S3 to DynamoDB" paragraph (5.a) says, I should put the file in s3://bucket[/prefix]/tablename/YYYY-MM-DD_HH.MM.
The export generates a different layout of the data, so I moved the file to where the documentation says, i.e. s3://mybucket/dynamodb/mytable/2014-05-29_14.32, and I configured the pipeline to look in s3://mybucket/dynamodb.
I then set up an import job, which ran without returning any error, yet the table was left empty.
The logs generated by the pipeline are unfortunately not clear.
Did anyone manage to import data in DynamoDB format from S3?
When exporting a DynamoDB table, the backup data is put to an S3 path with the format output_s3_path/region/tableName/time,
where
tableName is the DynamoDB table being backed up,
region is the region of the table,
output_s3_path is the "S3 Output Folder" field on the console/UI.
Example: let
tableName = test
region = us-east-1
output_s3_path = s3://test-bucket
The backup is generated in the S3 path s3://test-bucket/us-east-1/test/2014-05-30_06.08.
To import this data, set the value of "S3 Input Folder" to the same generated path, i.e. "s3://test-bucket/us-east-1/test/2014-05-30_06.08". "S3 Input Folder" should be the S3 prefix where the data files exist.

Move Redshift file to S3 as CSV

I'm trying to move a file from Redshift to S3. Is there an option to move this file as a .csv?
Currently I am writing a shell script to get the Redshift data, save it as a .csv, and then upload it to S3. I'm assuming that since this is all on AWS services, there would be an argument or something that lets me do this.
Use the UNLOAD command. It will create at least one file per slice; you will have to merge the files yourself.
unload ('__SQL__')
to 's3://__BUCKET__/__PATH__'
credentials 'aws_access_key_id=__S3_KEY__;aws_secret_access_key=__S3_SECRET__'
delimiter as ','
addquotes
escape;
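If merging the per-slice files is inconvenient, adding PARALLEL OFF should make UNLOAD write a single file as long as the result stays under Redshift's per-file size limit; this is the same sketch with placeholder credentials, plus that option:
unload ('__SQL__')
to 's3://__BUCKET__/__PATH__'
credentials 'aws_access_key_id=__S3_KEY__;aws_secret_access_key=__S3_SECRET__'
delimiter as ','
addquotes
escape
parallel off;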