I have been working with QLDB for the last 3 months in a single region, using it as a ledger database.
Now the business wants the applications to support multiple regions.
I found that many AWS services, such as DynamoDB and Secrets Manager, support multi-region use,
but QLDB has limitations here.
I read in some AWS articles that QLDB does not support multiple regions because it is not a distributed technology.
To meet the business requirement with minimal code changes, I have two approaches/workarounds for making QLDB support multiple regions:
Do I need to create a region-based ledger with the same functionality? I understand there are major challenges in maintaining geo-based traffic.
I keep the QLDB ledger in a single region and give Lambda functions cross-region access permissions to it. This is the simplest option, but it costs latency.
Which approach is better in the long term and for scalability? Or please suggest a different approach if anyone has one.
Do I need to create a region-based ledger with the same functionality? I understand there are major challenges in maintaining geo-based traffic.
Yes. At the moment, as you said, there is no multi-region (or "global", in AWS jargon) support, so you need to create region-based ledgers on your own.
to meet the business requirement with minimal code changes
You can get a cross-region copy of the journal data by following the workaround mentioned in the docs:
Amazon QLDB does not support cross-region replication as of now. QLDB's export to S3 feature enables customers to export the contents of the QLDB journal to an S3 bucket. The S3 buckets can be configured for cross-region replication.
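As a rough illustration of the export step described in that quote, here is a minimal Python sketch. It assumes boto3 is used and that the caller passes in a client created with `boto3.client("qldb", region_name=...)`. The ledger name, bucket, prefix, and role ARN are hypothetical placeholders, and SSE_S3 encryption is just one choice for the sketch. Cross-region replication itself is configured on the S3 bucket and is not shown.

```python
from datetime import datetime, timezone


def build_export_request(ledger_name, start, end, bucket, prefix, role_arn):
    """Build keyword arguments for QLDB's ExportJournalToS3 call.

    All names passed in are placeholders chosen by the caller; the export
    covers journal blocks in the time range [start, end).
    """
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {
        "Name": ledger_name,
        "InclusiveStartTime": start,
        "ExclusiveEndTime": end,
        "S3ExportConfiguration": {
            "Bucket": bucket,
            "Prefix": prefix,
            # Assumption for this sketch: S3-managed encryption.
            "EncryptionConfiguration": {"ObjectEncryptionType": "SSE_S3"},
        },
        "RoleArn": role_arn,
    }


def export_journal(qldb_client, request):
    """Start the export with a boto3 QLDB client and return the export ID."""
    response = qldb_client.export_journal_to_s3(**request)
    return response["ExportId"]


# Example request with hypothetical names (no AWS call is made here).
example_request = build_export_request(
    "my-ledger",
    datetime(2021, 6, 1, tzinfo=timezone.utc),
    datetime(2021, 6, 2, tzinfo=timezone.utc),
    "my-export-bucket",
    "qldb-exports/",
    "arn:aws:iam::123456789012:role/qldb-export-role",
)
```

The exported objects land in the bucket in the ledger's region. A replication rule on that bucket would then copy them to a bucket in the second region, which gives a read-only copy there, not a second writable ledger.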
Side note:
I will keep QLDB ledger in single region and gives cross region access permissions to Lambda functions to access it. Its a simplest one but eat latency.
If your business wants multi-region support, this option would not satisfy that requirement.
Related
I am new to QLDB and seem to be finding slightly conflicting info on multi-Region architecture. I see that it has high availability in a given Region; however, it is unclear as to what happens when an entire Region goes down, or how I use it in a hot-hot multi-Region application.
Let's assume that an application is in US-East-2 and US-West-2 with latency routing rules. Each of these needs to write and read from the same ledger. Is this possible, or would the ledger need to exist in a single region and only one region can have full-access while the other would only have access to a read-only copy (maybe in S3)?
As of 21/6/2021 QLDB ledgers are in a single region. Cross-region business continuity is a need we have heard from other customers and we take this feedback very seriously. I will come back to this answer in the future when there is an update.
We are creating a federated data warehouse using Snowflake, i.e. I will have a dedicated DWH in each region, say 3 regions. I will also have one global DWH in a separate region, which needs to take table data from the regional DWHs for reporting. What would be the best approach to accomplish this?
I read and understood that you can unload data from a DWH into AWS S3 or Azure Blob storage in the same region. I would have to do this for all 3 regions. I could then enable S3 cross-region replication and load the data into the global DWH.
This was my approach, but it seems a bit long and might cost extra for cross-region data transfer, which is required anyway. Mainly, I will not be able to create an end-to-end flow. Since everything is in a different region, I need to run a separate job in each region to unload to S3, validate the result, and start loading only once all 3 unloads complete. Workflow orchestration is also a problem: I considered AWS Batch and Step Functions, but both are regional services.
I would appreciate it if someone could shed some light on this and suggest options. Thank you!
I wouldn't advise doing it with S3 and loading into each database.
You have two options with Snowflake which are much better suited to your use case. One, which Rich has already mentioned, is Database Replication. You also have the option to use Data Sharing via the Private Data Exchange (not available in all regions yet) or by using Data Sharing
I suggest you review Snowflake's documentation on Database Replication and ask your account executive or sales engineer to discuss it with you; it seems like a perfect fit for your use case.
https://docs.snowflake.com/en/user-guide/database-replication-failover.html
I hope this helps...Rich Murnane
What is the best way to scan data in S3 (for auditing purposes, possibly)? I was asked to do some research on this, and using AWS Athena was the first idea I could think of. If you can share more knowledge or ideas, I'd appreciate it.
Thanks!
You want to use Amazon Macie:
Amazon Macie is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS. Amazon Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property, and provides you with dashboards and alerts that give visibility into how this data is being accessed or moved.
Video: AWS Summit Series 2017 - New York: Introducing Amazon Macie
I am new to cloud computing and am working on a research paper on S3 proactive replica checking. I have a few questions. I have read many forums and research papers, but I couldn't find answers anywhere, or they were too complicated for me to understand.
If I don't enable cross-region replication for S3 storage and just create a new bucket, will AWS automatically create replicas of my data anywhere?
Is there any Java code or tutorial available that I can use to measure the S3 replica checking time?
AWS has great documentation on their services so that's the place to start. This link should help: http://docs.aws.amazon.com/AmazonS3/latest/dev/DataDurability.html
To answer your first question, replication occurs automatically for all S3 objects within a given region and provides eleven 9s (99.999999999%) of durability unless you choose reduced redundancy storage.
Cross-region replication is something you will have to enable; it is not automatic. As for Java code to test replication time, I'm not aware of any. However, it seems you could do it fairly easily using the standard SDK: issue a PUT for an object and then time how long it takes to show up in the bucket in the region to which you have replicated it. I suspect the timing will depend on your origin and destination regions, but from my experience I can tell you that even replicating from a US region to an Asia region is quite fast.
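The PUT-then-poll idea above can be sketched roughly as follows. This is in Python, not Java, and it is only a timing harness: `put_object` and `replica_exists` are hypothetical callables that you would supply. With boto3, for example, they could wrap `put_object` on a client for the source region and `head_object` on a client for the destination region, returning False when the object is not there yet.

```python
import time


def measure_replication_seconds(put_object, replica_exists, poll_interval=0.5,
                                timeout=300.0, clock=time.monotonic,
                                sleep=time.sleep):
    """Upload an object, then poll until it is visible in the replica bucket.

    put_object:     zero-argument callable that uploads to the source bucket.
    replica_exists: zero-argument callable returning True once the object
                    can be seen in the destination bucket.
    Returns the elapsed seconds; raises TimeoutError if the replica does not
    appear within `timeout` seconds.
    """
    start = clock()
    put_object()
    while True:
        if replica_exists():
            return clock() - start
        if clock() - start >= timeout:
            raise TimeoutError("replica did not appear within the timeout")
        sleep(poll_interval)
```

The result is only as precise as `poll_interval`, and it includes the upload time itself, so treat it as an upper bound on replication lag, not an exact figure.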
I haven't been able to find a clear answer on this from the documentation.
Is it discouraged to access DynamoDB from outside the region it is hosted in? For example, I want to do a lot of writes to a DynamoDB table in us-west-2 from a cluster in us-east-1 (or even ap-southeast-1). My writes are batched and non-real-time, so I don't care much about a small increase in latency.
Note that I am not asking about cross-region replication.
DynamoDB is a hosted solution but that doesn't mean you need to be inside AWS to use it.
There are valid cases for querying DynamoDB from outside its AWS region, especially clients that store user information in it.
So to answer your question: you will get the best performance when you minimize the geographic distance to the table, but you can work with any endpoint you'd like from anywhere in the world.
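For the batched, non-real-time writes described in the question, a minimal Python sketch might look like the one below. It assumes the caller passes a low-level boto3 client pinned to the table's region, e.g. `boto3.client("dynamodb", region_name="us-west-2")`, no matter where the code runs. The function names and the backoff values are illustrative choices, not anything prescribed by AWS. The chunk size of 25 reflects the BatchWriteItem limit per request.

```python
import time


def chunked(items, size=25):
    """Yield lists of at most `size` items (BatchWriteItem takes up to 25)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def batch_put(client, table_name, items, max_retries=5, sleep=time.sleep):
    """Write items with BatchWriteItem, resending anything left unprocessed.

    `items` must already be in DynamoDB attribute-value form, for example
    {"pk": {"S": "user#1"}}. Returns the number of put requests that were
    still unprocessed after all retries.
    """
    failed = 0
    for chunk in chunked(items):
        pending = {table_name: [{"PutRequest": {"Item": item}} for item in chunk]}
        for attempt in range(max_retries + 1):
            response = client.batch_write_item(RequestItems=pending)
            pending = response.get("UnprocessedItems") or {}
            if not pending:
                break
            # Illustrative exponential backoff, capped at one second.
            sleep(min(0.05 * 2 ** attempt, 1.0))
        failed += sum(len(requests) for requests in pending.values())
    return failed
```

Each cross-region request pays the extra round trip once per batch instead of once per item, which is why batching suits this kind of latency-tolerant workload.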