Upload a local CSV to a remote PostgreSQL database (AWS RDS)

I'm facing the following problem: I have an RDS instance on AWS to store some data, and I want to upload data from my local PC to it. Using pgAdmin it seemed like such an easy task, but the COPY command that every guide on the internet suggests requires superuser privileges.
Sadly, for security reasons AWS doesn't grant you that kind of permission, which is making my task difficult.
I'm hoping someone can come up with a solution, since getting the file onto the same instance the database is running on is impossible for me.
Thank you!

The official AWS RDS documentation covers this. Read the \copy command section at the bottom of the page.
You can run the \copy command from the psql prompt to import data into a table on a PostgreSQL DB instance.
target-db=> \copy source-table from 'source-table.csv' with DELIMITER ',';
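Unlike COPY, \copy runs on the client side and streams the file over your existing connection, so no superuser rights are needed on RDS. A minimal sketch of a full session, assuming placeholder endpoint, user, database, and table names, and a CSV file with a header row:

psql -h mydb.xxxxxxxx.us-east-1.rds.amazonaws.com -U masteruser -d target_db
target_db=> \copy my_table FROM 'C:/data/my_table.csv' WITH (FORMAT csv, HEADER true, DELIMITER ',');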

Related

How to copy an AWS RDS database, NOT the entire instance

I am somewhat new to AWS RDS and their terminology. What they call a "database" seems to me to be an SQL Server instance. I have a database (as defined by SSMS, with tables, data, stored procedures, etc.) on RDS named "prod", and I want to duplicate it for testing purposes as "test", with all the content, leaving "prod" as-is.
All the instructions I've found through many, many searches seem to be about duplicating the entire instance. Can someone give me instructions on how to create a duplicate of just the (SSMS term) database?
Thanks in advance for any help!
P.S. What does AWS/RDS call the object that is equivalent to an SSMS database?
I've found multiple posts here about duplicating an entire instance. It could be that I don't fully understand the terminology, because I know this must be a common task, yet I am not understanding how to do it.
This is a production environment so I am proceeding very cautiously. I do have nightly snapshots made so I know I could recover but would rather do it right the first time.
I usually use a command like this to backup a single database to s3:
exec msdb.dbo.rds_backup_database @source_db_name='<mydatabasename>', @S3_arn_to_backup_to='arn:aws:s3:::<mys3objectname>', @type='FULL'
There is a bit of one-time configuration you need to do first (see the link below), and then it's as simple as executing commands from SSMS to back up a database to S3 and restore it from S3 - maybe not exactly what you are looking for, but it works great.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/SQLServer.Procedural.Importing.html
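To finish the clone, you would restore that backup into a new database on the same instance and poll the task until it completes. A sketch using the matching RDS stored procedures, with placeholder database and bucket names:

exec msdb.dbo.rds_restore_database @restore_db_name='test', @s3_arn_to_restore_from='arn:aws:s3:::<mys3objectname>'
-- check progress of the backup/restore tasks for a given database
exec msdb.dbo.rds_task_status @db_name='test'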
What you are looking for can't really be done via any RDS commands/tools/interface. RDS only concerns itself with the database server itself, and isn't really aware of the different databases, schemas, tables, etc. you may have created on it.
You will need to use the tools for the DBMS you are using, in this case it sounds like you are using Microsoft SQL Server, so you will need to use MS SQL Server tools (perhaps running on an EC2 instance) to dump a single database to a file, and then load it into another database.
P.S. What does AWS/RDS call the object that is equivalent to an SSMS database?
An RDS "database" is really a database server; you might also see it called a "DB instance". The object equivalent to an SSMS database is simply called a database. The terminology is confusing because both the physical server running the database software and the logical grouping of tables, logs, and data inside that software are commonly referred to as a "database".

AWS Glue Development Endpoint Not Working properly

I am trying to use a development endpoint to interactively run and edit ETL scripts, but there seem to be some issues with the development endpoint right after creating it: I am getting errors in the Scala/Python REPLs and am also unable to open an SSH tunnel to the remote interpreter.
Let me explain exactly what I did: I created a development endpoint in the AWS console with all the default configurations. The only things I provided were the development endpoint name, an IAM role, and my public SSH key.
Right after creating the endpoint I connect to the Spark/Python REPLs. I am able to connect successfully, but within a couple of minutes of connecting the REPLs start throwing errors without my writing a single line of code. This happens in every REPL on the development endpoint.
Also, when I try to open an SSH tunnel to the remote interpreter to connect my local Zeppelin notebook, it throws "bind: Cannot assign requested address".
A couple of things are working, though:
I am able to SSH to the endpoint.
I created a SageMaker notebook in AWS Glue attached to this development endpoint, and that notebook seems to work fine, although it surely adds additional cost and I don't want to keep using it.
Can anyone please help with what I am doing wrong? Am I missing any important steps that need to be done on the machine right after creating the development endpoint?
Thanks in advance!
I'm not very sure about this error, but if you are working with smaller datasets then you would probably prefer the local Docker implementation, as it will not add any additional cost and you can get on with your development.
You can refer to this blog post on how to set it up:
https://towardsdatascience.com/develop-glue-jobs-locally-using-docker-containers-bffc9d95bd1
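For reference, the setup in that post boils down to pulling AWS's Glue libs image and launching a Jupyter notebook from it. A rough sketch, assuming the image tag used at the time of the post (check the post or Docker Hub for the current one):

docker pull amazon/aws-glue-libs:glue_libs_1.0.0_image_01
# expose the Jupyter and Spark UI ports and mount your AWS credentials read-only
docker run -itd -p 8888:8888 -p 4040:4040 -v ~/.aws:/root/.aws:ro --name glue_jupyter amazon/aws-glue-libs:glue_libs_1.0.0_image_01 /home/jupyter/jupyter_start.sh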

Move database from local server to remote server

This is quite a general one: I'm trying to move a database from a local psql server on Windows to an AWS RDS server I've set up. The database is small; I'm really just doing it for the sake of learning how.
The problems I'm having are as follows:
- It seems like such a simple thing, yet there are no simple solutions? (Although I know/think that if I had a Linux system I could use pg_dump and cat the dump.)
- I've looked into the AWS documentation on migrating my database using their Database Migration Service, but I've hit problems so early on that I'm close to believing it won't be worth the effort at my level of experience.
It seems like there must be a simple way, and I'm hoping the S/O community can help!
Use some tool like phpMyAdmin and use its import feature to import your database dump to AWS RDS.
If you wish to use the CLI, then use some FTP tool to upload the dump, connect via SSH, and use the CLI to import the dump.
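For a PostgreSQL database specifically, note that pg_dump and psql also ship with the Windows installer, so the dump-and-restore workflow mentioned in the question works there too. A minimal sketch, with placeholder host, user, and database names (the target database must already exist on RDS):

# dump the local database to a plain-SQL file
pg_dump -h localhost -U postgres -d mydb -f mydb.sql
# replay the file against the RDS endpoint
psql -h mydb.xxxxxxxx.us-east-1.rds.amazonaws.com -U masteruser -d mydb -f mydb.sql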

Using impdp/expdp with RDS Oracle on AWS

I'm very new to Amazon Web Services, especially their RDS system. I have set up an Oracle database (11.2), and I now want to import a dump we made locally from our server using expdp. Apparently, the ability to use expdp/impdp on AWS is quite new. From what I understand, when you create an Oracle database on RDS, a DATA_PUMP_DIR is automatically created. What is less obvious is how to access this directory and make our local dump available to RDS. I've tried to read the information at http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Oracle.Procedural.Importing.html on their website, but there are a lot of things I don't understand:
Why do I have to set up an EC2 instance when the dump file is actually on my local computer (and I can access the RDS database remotely using sqlplus or SQL Developer)?
They often use the 'sys' or 'system' user in their examples, but the Oracle security settings say these users are made unavailable on RDS => you cannot connect to a database as SYSDBA.
Could someone please point me to a simple and clear tutorial on how to use impdp on AWS?
Thanks
It is possible to use Data Pump on RDS now.
duduklein's answer was correct when he wrote it, but the RDS docs now have details about using Oracle Data Pump. The doc page URL is unchanged from the link originally posted in the question (nice job, Amazon!), but it now has content on using Data Pump.
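As a rough outline of what those docs describe: you get the dump file into an Oracle database you control that has a database link to the RDS instance (for example, an installation on EC2 or your local server), copy the file into the RDS instance's DATA_PUMP_DIR over that link, and then run the import. A sketch of the file-transfer step, assuming a database link named to_rds and a dump file mydump.dmp already in the source database's DATA_PUMP_DIR:

BEGIN
  DBMS_FILE_TRANSFER.PUT_FILE(
    source_directory_object      => 'DATA_PUMP_DIR',
    source_file_name             => 'mydump.dmp',
    destination_directory_object => 'DATA_PUMP_DIR',
    destination_file_name        => 'mydump.dmp',
    destination_database         => 'to_rds');
END;
/
-- then run the import from any client that can reach the RDS instance, e.g.
-- impdp myuser@rds_tns_alias DIRECTORY=DATA_PUMP_DIR DUMPFILE=mydump.dmp

This also explains the EC2 question: the file has to reach the RDS host's filesystem somehow, and a database link from another Oracle instance is one supported route.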
It's not possible for now. I have just contacted Amazon (through premium support) about the same issue, and they told me that this is a feature request that has already been passed to the RDS team, but there is no estimate of when it will be available.
The only way you can import dump files is by using the "exp" utility instead of "expdp". In that case, you can use the "imp" utility to import the data into RDS.

How to copy a database using RDS

I have a database instance on RDS with 2 databases on it. Is there a good way using the RDS command line tools to copy the one database to the other? If not, what is the recommended way of doing it?
This is not an exact solution to the OP, but if all you need is to clone an existing database for a new purpose, there's an easier way. You can take a snapshot from the original RDS instance, then restore it to a new instance. You can even use the web console.
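If you do want to script the snapshot route, it can be done with a few calls from today's AWS CLI. A sketch with placeholder instance and snapshot identifiers:

aws rds create-db-snapshot --db-instance-identifier mydb --db-snapshot-identifier mydb-clone-snap
# wait until the snapshot is available, then restore it as a new instance
aws rds wait db-snapshot-available --db-snapshot-identifier mydb-clone-snap
aws rds restore-db-instance-from-db-snapshot --db-instance-identifier mydb-copy --db-snapshot-identifier mydb-clone-snap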
I'd use mysqldump to get the tables and then mysql to import them (see the sketch after the links below).
Update 2014/07/08: Depending on what you're planning to do here, another solution today is to set up replication and then promote the slave to be the master, for example if you want to upgrade your database's release/version:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html
If you're looking to backup externally, there's also replication:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Procedural.Exporting.NonRDSRepl.html
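A minimal sketch of the mysqldump approach from the top of this answer, copying db1 into db2 on the same RDS instance (hostname, user, and database names are placeholders, and db2 must already exist):

# dump the source database, including stored routines and triggers
mysqldump -h myinstance.xxxxxxxx.us-east-1.rds.amazonaws.com -u myuser -p --single-transaction --routines --triggers db1 > db1.sql
# load the dump into the target database
mysql -h myinstance.xxxxxxxx.us-east-1.rds.amazonaws.com -u myuser -p db2 < db1.sql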
RDS has come a long way.
It depends on which database engine you are hosting there - for SQL Server I have used the SQL Azure Migration Wizard (free download from CodePlex).
To get full RDBMS functionality, the trick is to enter the DNS name of your SQL Server instance in the wizard, but select 'SQL Server v2008' (or eventually v2012, after AWS RDS makes 2012 instances available) and do NOT select 'SQL Azure' as the target. I did a short screencast on this on my blog as well.