I am trying to use AWS DMS to move data from a source database (AWS RDS MySQL) in the Paris region (eu-west-3) to a target database (AWS Redshift) in the Ireland region (eu-west-1). The goal is to continuously replicate ongoing changes.
I am running into this kind of error:
An error occurred (InvalidResourceStateFault) when calling the CreateEndpoint operation: The redshift cluster datawarehouse needs to be in the same region as the current region. The cluster's region is eu-west-1 and the current region is eu-west-3.
The documentation says:
The only requirement to use AWS DMS is that one of your endpoints must
be on an AWS service.
So what I am trying to do should be possible. In practice, it seems it's not allowed.
How can I use AWS DMS from one region to another?
In which region should my endpoints be?
In which region should my replication task be?
My replication instance has to be in the same region as the RDS MySQL instance, because they need to share a subnet.
AWS provides a whitepaper called "Migrating AWS Resources to a New AWS Region", updated last year. You may want to contact their support, but one idea would be to move your RDS instance to another RDS instance in the proper region before migrating to Redshift. In the whitepaper, they describe an alternative way to migrate RDS (without DMS, if you don't want to use it for some reason):
1. Stop all transactions or take a snapshot (however, changes after this point in time are lost and might need to be reapplied to the target Amazon RDS DB instance).
2. Using a temporary EC2 instance, dump all data from Amazon RDS to a file:
   - For MySQL, make use of the mysqldump tool (see the sketch after this list). You might want to compress this dump (see bzip or gzip).
   - For MS SQL, use the bcp utility to export data from the Amazon RDS SQL DB instance into files. You can use the SQL Server Generate and Publish Scripts Wizard to create scripts for an entire database or for just selected objects. Note: Amazon RDS does not support Microsoft SQL Server backup file restores.
   - For Oracle, use the Oracle Export/Import utility or the Data Pump feature (see http://aws.amazon.com/articles/AmazonRDS/4173109646282306).
   - For PostgreSQL, you can use the pg_dump command to export data.
3. Copy this data to an instance in the target region using standard tools such as cp, FTP, or rsync.
4. Start a new Amazon RDS DB instance in the target region, using the new Amazon RDS security group.
5. Import the saved data.
6. Verify that the database is active and your data is present.
7. Delete the old Amazon RDS DB instance in the source region.
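For the MySQL case, here is a minimal Python sketch of the dump-and-compress step (step 2), run from the temporary EC2 instance. The endpoint is hypothetical, and credentials are assumed to come from an option file such as ~/.my.cnf:

import subprocess

# Hypothetical source endpoint in the source region.
SOURCE_HOST = "source-db.abc123.eu-west-3.rds.amazonaws.com"

# Dump all databases and compress the output in one pass.
dump = subprocess.Popen(
    ["mysqldump", "--host", SOURCE_HOST, "--single-transaction", "--all-databases"],
    stdout=subprocess.PIPE,
)
with open("dump.sql.gz", "wb") as out:
    subprocess.run(["gzip", "-c"], stdin=dump.stdout, stdout=out, check=True)
dump.stdout.close()
dump.wait()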
I found a workaround that I am currently testing.
I declare "Postgres" as the engine type for the Redshift cluster. This tricks AWS DMS into thinking it's an external database, so AWS DMS no longer checks the regions.
I think it will result in degraded performance, because DMS will probably feed data to Redshift using INSERTs instead of the COPY command.
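For reference, a minimal boto3 sketch of this workaround; the identifier, endpoint, and credentials are all hypothetical, and the key part is EngineName="postgres" instead of "redshift":

import boto3

dms = boto3.client("dms", region_name="eu-west-3")

# Declare the Redshift cluster as a plain "postgres" target so that
# DMS skips the same-region check (at the cost of COPY-based loading).
dms.create_endpoint(
    EndpointIdentifier="redshift-as-postgres",
    EndpointType="target",
    EngineName="postgres",
    ServerName="datawarehouse.abc123.eu-west-1.redshift.amazonaws.com",
    Port=5439,
    Username="awsuser",
    Password="<password>",
    DatabaseName="dev",
)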
Currently Redshift has to be in the same region as the replication instance.
The Amazon Redshift cluster must be in the same AWS account and the
same AWS Region as the replication instance.
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Redshift.html
So one should create the replication instance in the Redshift region inside a VPC.
Then use VPC peering to enable the replication instance to connect to the VPC of the MySQL instance in the other region:
https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html
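A minimal boto3 sketch of the cross-region peering setup, assuming hypothetical VPC IDs with non-overlapping CIDR ranges (you still need to add routes and adjust security groups on both sides):

import boto3

# Requester side: the VPC of the replication instance (Redshift region).
ec2_ireland = boto3.client("ec2", region_name="eu-west-1")
resp = ec2_ireland.create_vpc_peering_connection(
    VpcId="vpc-aaaa1111",      # replication instance VPC (eu-west-1)
    PeerVpcId="vpc-bbbb2222",  # RDS MySQL VPC (eu-west-3)
    PeerRegion="eu-west-3",
)
peering_id = resp["VpcPeeringConnection"]["VpcPeeringConnectionId"]

# Accepter side: the peering request must be accepted from the peer region.
ec2_paris = boto3.client("ec2", region_name="eu-west-3")
ec2_paris.accept_vpc_peering_connection(VpcPeeringConnectionId=peering_id)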
Related
Can we consider RDS MySQL/Postgres read replicas as source endpoints for AWS DMS?
I have a requirement wherein I have to use replicas of RDS MySQL and RDS Postgres as source endpoints in AWS DMS, and a Redshift cluster as the target endpoint for AWS DMS (we have to use full load with CDC).
Please suggest if it's possible to use replicas of RDS MySQL and RDS Postgres.
Thanks
On Using a MySQL-compatible database as a source for AWS DMS - AWS Database Migration Service, there is a reference to read replicas:
If you are using an Amazon RDS MySQL or Amazon RDS MariaDB read replica as a source, enable backups on the read replica, and ensure the log_slave_updates parameter is set to TRUE.
Therefore, it seems to be possible with MySQL on Amazon RDS.
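If it helps, here is a minimal boto3 sketch of setting that parameter, assuming a hypothetical custom parameter group attached to the read replica (log_slave_updates is a static parameter, so the replica must be rebooted afterwards):

import boto3

rds = boto3.client("rds")

# Set log_slave_updates on the replica's custom parameter group.
rds.modify_db_parameter_group(
    DBParameterGroupName="my-replica-params",  # hypothetical group name
    Parameters=[{
        "ParameterName": "log_slave_updates",
        "ParameterValue": "1",
        "ApplyMethod": "pending-reboot",  # static parameter: applied at reboot
    }],
)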
However, Using a PostgreSQL database as an AWS DMS source - AWS Database Migration Service says:
You can't use RDS PostgreSQL read replicas for CDC (ongoing replication).
However, it seems possible for the initial (full) load.
I redeployed an Aurora cluster (PostgreSQL 11). I did it by deleting the existing one and re-creating a new one. I have a snapshot backup from the previous DB instance and I'd like to restore the data to the existing instance.
I understand that Aurora doesn't support that. Is there a workaround for me to do it? For example, can I download the snapshot locally in plain SQL script format and then manually restore it to the new instance?
I understand that Aurora doesn't support that. Is there a workaround for me to do it? For example, can I download the snapshot locally in plain SQL script format and then manually restore it to the new instance?
You can certainly accomplish that by doing the following:
Restore the snapshot to a new cluster (Cluster B).
Export the data from that cluster using pg_dump or mysqldump, depending on the Aurora database engine you are using (a sketch follows below). I suggest doing this on an EC2 instance in the same VPC.
Delete Cluster B.
Drop the database in your original cluster (Cluster A).
Load all the data from the export into Cluster A.
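A minimal sketch of steps 2 and 5 for the Aurora PostgreSQL case, run from an EC2 instance in the same VPC; the endpoints are hypothetical and passwords are assumed to come from ~/.pgpass:

import subprocess

CLUSTER_B = "cluster-b.cluster-abc123.eu-west-1.rds.amazonaws.com"  # restored
CLUSTER_A = "cluster-a.cluster-abc123.eu-west-1.rds.amazonaws.com"  # original

# Step 2: dump the data from the cluster restored from the snapshot.
subprocess.run(
    ["pg_dump", "-h", CLUSTER_B, "-U", "postgres", "-d", "mydb", "-f", "mydb.sql"],
    check=True,
)

# Step 5: load the dump into the original cluster (after recreating mydb).
subprocess.run(
    ["psql", "-h", CLUSTER_A, "-U", "postgres", "-d", "mydb", "-f", "mydb.sql"],
    check=True,
)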
However, at that point all you will have accomplished is keeping the RDS cluster's endpoint URL. If you design your system to allow changes to the RDS endpoint URL, it would be much easier to simply restore the snapshot to a new cluster, swap the endpoint your application connects to, and delete the old cluster.
You can use the method suggested by Teddy Aryono to restore the DB cluster.
As for the DB host name: use Secrets Manager to store your credentials, so that even if the DB host name changes after restoring the snapshot to a new cluster, you don't need to update your application/Lambda configurations.
(Of course, you need to update the Secrets Manager details with the new host name after the restore.)
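A minimal boto3 sketch of that lookup, assuming a hypothetical secret that stores the endpoint and credentials as JSON:

import json
import boto3

sm = boto3.client("secretsmanager")

# Resolve the current endpoint at runtime instead of hard-coding it.
secret = json.loads(
    sm.get_secret_value(SecretId="prod/aurora")["SecretString"]
)
host = secret["host"]          # update this field after a restore
user = secret["username"]
password = secret["password"]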
You can restore from a DB cluster snapshot that you have saved. To restore a DB cluster from a DB cluster snapshot, use the AWS CLI command restore-db-cluster-from-snapshot.
In this example, you restore from a previously created DB cluster snapshot named mydbclustersnapshot. You restore to a new DB cluster named mynewdbcluster. You use Aurora PostgreSQL.
Example:
For Linux, macOS, or Unix:
aws rds restore-db-cluster-from-snapshot \
--db-cluster-identifier mynewdbcluster \
--snapshot-identifier mydbclustersnapshot \
--engine aurora-postgresql
For Windows:
aws rds restore-db-cluster-from-snapshot ^
--db-cluster-identifier mynewdbcluster ^
--snapshot-identifier mydbclustersnapshot ^
--engine aurora-postgresql
Furthermore, you can also do it using the AWS Management Console or the RDS API. Read the docs for details.
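The boto3 equivalent would look roughly like this; note that restoring the snapshot creates only the cluster, so you also have to add at least one DB instance before you can connect (the instance class below is just an example):

import boto3

rds = boto3.client("rds")

# Restore the cluster from the snapshot.
rds.restore_db_cluster_from_snapshot(
    DBClusterIdentifier="mynewdbcluster",
    SnapshotIdentifier="mydbclustersnapshot",
    Engine="aurora-postgresql",
)

# The restore creates no instances; add a writer instance to the cluster.
rds.create_db_instance(
    DBInstanceIdentifier="mynewdbcluster-instance-1",
    DBInstanceClass="db.r5.large",
    Engine="aurora-postgresql",
    DBClusterIdentifier="mynewdbcluster",
)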
My DMS replication instance (which is in the same VPC as the Aurora Serverless DB instance) is not able to find the DB while creating an endpoint in DMS.
However, I am able to create a Cloud9 instance in the same VPC as the Aurora Serverless instance and connect to the DB from there.
Am I missing something here, or is it not possible to use AWS DMS for migrating data from Aurora Serverless as a source?
The above issue was resolved by explicitly specifying the connection details for the Aurora Serverless cluster (instead of the dropdown selection). As for the original question of using an Aurora Serverless DB as a source in DMS replication:
- Yes, if only a one-time replication is required.
- No, if ongoing replication is required. For ongoing replication, you need to change the value of the binlog_format parameter for the source database. Although Aurora Serverless allows changing the value of this parameter, the change has no actual effect; only a few parameters are supported for change, which are listed here.
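For reference, a minimal boto3 sketch of creating the source endpoint with explicit connection details rather than the dropdown (all values are hypothetical):

import boto3

dms = boto3.client("dms")

# Point DMS at the serverless cluster endpoint directly.
dms.create_endpoint(
    EndpointIdentifier="aurora-serverless-source",
    EndpointType="source",
    EngineName="aurora",  # Aurora MySQL-compatible source
    ServerName="my-serverless.cluster-abc123.eu-west-1.rds.amazonaws.com",
    Port=3306,
    Username="admin",
    Password="<password>",
    DatabaseName="mydb",
)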
AWS RDS Aurora MySQL: from the console we can create a "cross-region read replica" and it works.
But I don't see any option to do so with:
- AWS CLI
- Boto3
What I found is that with boto3 we can do replication for a cluster, but not for an instance.
Please suggest if I am missing something, as I am working on a Lambda function to do the below operation once any new Aurora RDS instance is created:
- create a cross-region read replica in the "Oregon" region
If you are referring to creating a Cross-Region Read Replica, then the boto3 documentation says this for the create_db_cluster command:
You can use the ReplicationSourceIdentifier parameter to create the DB cluster as a Read Replica of another DB cluster or Amazon RDS MySQL DB instance.
Commands for Aurora always refer to a cluster, whereas commands for non-Aurora Amazon RDS instances refer to an instance.
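Putting that together, a minimal boto3 sketch of creating the cross-region replica cluster in Oregon; the ARN and names are hypothetical, and for an encrypted cluster you would also pass KmsKeyId:

import boto3

# Run against the target (replica) region.
rds = boto3.client("rds", region_name="us-west-2")  # Oregon

rds.create_db_cluster(
    DBClusterIdentifier="my-cluster-replica",
    Engine="aurora-mysql",
    # ARN of the source cluster in the original region:
    ReplicationSourceIdentifier=(
        "arn:aws:rds:eu-west-1:123456789012:cluster:my-source-cluster"
    ),
    SourceRegion="eu-west-1",  # lets boto3 presign the cross-region call
)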
This is very tricky. It seems you should create two or more instances to associate with the cluster you created. The instances should be in different Availability Zones; the cluster will then use the first as the writer and the others as read replicas. The Multi-AZ field will show "2 zones" (depending on how many zones you use to create instances).
Pseudo code like:
import boto3

rds = boto3.client('rds')

# Create the cluster first (note: create_db_cluster, not create_db_instance).
cluster_response = rds.create_db_cluster(....)

# First instance becomes the writer.
instance_response = rds.create_db_instance(
    DBInstanceIdentifier='name1',
    DBInstanceClass='instance_type',
    AvailabilityZone='zone1',
    Engine='aurora-mysql',
    DBClusterIdentifier=cluster_response['DBCluster']['DBClusterIdentifier'],
)

# Second instance in another AZ becomes a read replica.
instance_response = rds.create_db_instance(
    DBInstanceIdentifier='name2',
    DBInstanceClass='instance_type',
    AvailabilityZone='zone2',
    Engine='aurora-mysql',
    DBClusterIdentifier=cluster_response['DBCluster']['DBClusterIdentifier'],
)
When trying to use Amazon Redshift to create a datasource for my Machine Learning model, I encountered the following error when testing the access of my IAM role:
There is no '' cluster, or the cluster is not in the same region as your Amazon ML service. Specify a cluster in the same region as the Amazon ML service.
Is there any way around this? It would be a huge pain, since all of our development team's data is stored in a region that Machine Learning doesn't work in.
That's an interesting situation to be in.
What you can probably do:
1) Wait for Amazon Web Services to support Amazon ML in your preferred region (that's a long wait, though).
2) Or create a backup plan for your Redshift data.
Amazon Redshift provides some default tools to back up your cluster via snapshots to Amazon Simple Storage Service (Amazon S3). These snapshots can be restored in any AZ in that region or transferred automatically to other regions, wherever you want (in your case, the region where your ML is running).
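A minimal boto3 sketch of turning on that cross-region snapshot copy (the cluster name and destination region are hypothetical):

import boto3

redshift = boto3.client("redshift", region_name="eu-west-1")

# Automatically copy new snapshots to the region where Amazon ML runs.
redshift.enable_snapshot_copy(
    ClusterIdentifier="datawarehouse",
    DestinationRegion="us-east-1",
    RetentionPeriod=7,  # days to keep the copied snapshots
)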
There is (probably) no other way to use your ML with Redshift in a different region.
Hope it helps!