Memorystore and instances in different regions (GCP) - google-cloud-platform

I'm building a chat app in React Native, and the backend is in Node.js. I'm using GKE to deploy the server code.
I'm using a Cloud SQL PostgreSQL instance, connecting over internal IP. This works. I also use a Memorystore (Redis) instance. Here is the problem.
For autoscaling, I'm planning to run multiple GKE clusters in different regions (for now, europe-west1 and us-central1). I have configured a load balancer with one backend containing all the instance groups. I don't know if this is the correct/ideal solution, but it works. The problem lies in the fact that you can only connect to a Redis database from an instance within the same region. If I use us-central1 as the region for my Memorystore instance, I cannot connect to it through the VMs in the EU cluster I created.
What is the best solution to overcome this problem? I've created an extra VM in the same region as the Redis instance, with HAProxy configured as a reverse proxy to the Memorystore instance. This way, I can connect to the Redis database through all instances, no matter what region they're in. But I don't know if this is the correct solution.
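For reference, the HAProxy piece is just plain TCP proxying; a minimal sketch of the configuration I mean (the Memorystore IP and ports here are placeholders, not my real values):

    # HAProxy acting as a plain TCP reverse proxy in front of Memorystore.
    # 10.0.0.3:6379 is a placeholder for the Memorystore internal IP.
    defaults
        mode tcp
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend redis_in
        bind *:6379
        default_backend memorystore

    backend memorystore
        server redis1 10.0.0.3:6379 check

The EU instances then point their Redis clients at this proxy VM instead of at Memorystore directly.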
EDIT:
I'm using WebSockets (socket.io) for chat messages. Because I'm planning to use multiple servers, I need a centralized database to store (references to) the socket IDs, so users can send messages to users that are connected to other servers.
I'm thinking Redis is the correct solution, for a number of reasons:
I can use socket.io-redis to store the socket IDs in Redis (a sketch of this is below)
fast response times
I don't know the exact size of the data stored, but it's definitely not megabytes
I'm using a PostgreSQL database to store other information (like usernames and passwords), but it seems to me that Redis is a far better solution for real-time applications.
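For illustration, a minimal sketch of the socket.io-redis wiring I have in mind (the host and port are placeholders for the Memorystore endpoint):

    // Sketch: plug the Redis adapter into socket.io so rooms and
    // broadcasts are shared across all server instances.
    const server = require('http').createServer();
    const io = require('socket.io')(server);
    const redisAdapter = require('socket.io-redis');

    // Placeholder endpoint: point this at the Memorystore (Redis) instance.
    io.adapter(redisAdapter({ host: '10.0.0.3', port: 6379 }));

    io.on('connection', (socket) => {
      // With the adapter in place, emits reach sockets connected
      // to other servers as well, not just this process.
      socket.on('chat', (msg) => io.to(msg.roomId).emit('chat', msg));
    });

    server.listen(3000);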

Related

How can I deploy and connect to a PostgreSQL instance in AlloyDB without utilizing a VM?

Currently, I have followed Google's quickstart docs for deploying a simple Cloud Run web server that is connected to AlloyDB. However, the docs all seem to point toward having to utilize a VM for a PostgreSQL client, which is then connected to my AlloyDB cluster instance. I believe a connection can only be made from within the same VPC and/or via a proxy service on the VM (please correct me if I'm wrong).
I was wondering: if I only want to give access to services within the same VPC, is having a VM a must? Or is there another way?
You're correct. AlloyDB currently only allows connecting via private IP, so the only way to talk directly to the instances is from within the same VPC. The reason all the tutorials (e.g. https://cloud.google.com/alloydb/docs/quickstart/integrate-cloud-run, which is likely the quickstart you mention) talk about a VM is that in order to create the databases themselves within the AlloyDB cluster, set user grants, etc., you need to be able to talk to it from inside the VPC. Another option, for example, would be to set up Cloud VPN to connect your local network to the VPC directly. But that's slow, costly, and kind of a pain.
Cloud Run itself does not require the VM piece. The quickstart I linked above walks through setting up the Serverless VPC Access connector, which is the piece required to connect Cloud Run to AlloyDB. The VM in those instructions is only for configuring the PG database itself. So once you've done all the configuration you need, you can shut the VM down so it isn't costing you anything. If you need to step back in to make configuration changes, you can spin the VM back up, but it's not something that needs to be running for the Cloud Run -> AlloyDB connection.
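To make that concrete: once the connector is attached to the Cloud Run service, the app just treats the AlloyDB private IP as an ordinary Postgres host. A minimal Node.js sketch (not from the quickstart; the IP and credentials are placeholders):

    // Sketch: Cloud Run service querying AlloyDB over its private IP,
    // reachable because the service uses a Serverless VPC Access connector.
    const express = require('express');
    const { Pool } = require('pg');

    const pool = new Pool({
      host: '10.1.2.3',              // placeholder: AlloyDB private IP
      user: 'postgres',              // placeholder user
      password: process.env.DB_PASS, // injected via env var or secret
      database: 'postgres',
    });

    const app = express();
    app.get('/', async (req, res) => {
      const { rows } = await pool.query('SELECT now() AS now');
      res.json(rows[0]);
    });

    app.listen(process.env.PORT || 8080);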
Providing public ip functionality for AlloyDB is on the roadmap, but I don't have any kind of timeframe for when it will be implemented.

Are managed databases (e.g. Amazon RDS) slower to access than databases on the same machine (EC2) as the web server?

Imagine two cases:
I have a web server running in an EC2 instance, and it is connected to the database in RDS, the managed database service.
I have the web server and the database running in the same EC2 instance.
Is my database in RDS going to be slower to access because it's not on the same machine?
How many milliseconds, approximately, does it add to the latency between the two?
Does this become a bottleneck?
What about other managed database services on Azure, GCP, DigitalOcean, etc.?
Do they behave the same?
Yes, it will be slower to reach RDS instances from your web server than a database on the same host, because you need to go over the network, and that adds latency. Within the same availability zone the round trip is typically well under a millisecond, and across availability zones it's usually in the low single-digit milliseconds; whether that becomes a bottleneck depends mostly on how many round trips a single request makes.
The drawback of running the DB on the same server is that you can't use a managed service to take care of your database, and you're mixing largely stateless components (the web server) with stateful components (the database). This solution typically isn't scalable either: if you add more web servers, things get messy, because each one would either need its own copy of the data or still have to reach a single shared database over the network anyway.
I don't know about Azure, GCP, or DigitalOcean, but I'd be very surprised if things were different there. There are good reasons to separate these components.
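If you want a number for your own setup rather than a generalization, it's easy to measure. A rough Node.js sketch using mysql2 (the endpoint and credentials are placeholders):

    // Rough round-trip timer: run a trivial query many times and report
    // the average, which approximates network latency plus query overhead.
    const mysql = require('mysql2/promise');

    async function main() {
      const conn = await mysql.createConnection({
        host: 'mydb.example.us-east-1.rds.amazonaws.com', // placeholder
        user: 'admin',                                    // placeholder
        password: process.env.DB_PASS,
      });

      const runs = 100;
      const start = process.hrtime.bigint();
      for (let i = 0; i < runs; i++) {
        await conn.query('SELECT 1');
      }
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      console.log(`avg round trip: ${(elapsedMs / runs).toFixed(2)} ms`);
      await conn.end();
    }

    main();

Run it once from the web server's EC2 instance against RDS and once against a local MySQL to see the difference directly.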

How do I set up an AWS RDS instance for production so that I can regularly read in custom data from my personal computer

I have built a REST API in Spring that I am ready to deploy as the back end for my company's website. It uses a MySQL RDS instance to store data. I'm going to host it on AWS and am currently in the process of learning how to do that. I connect to my database with Spring's JdbcTemplate and make SQL queries to create and edit tables.
There is a big concern I have that has not been addressed by any of the tutorials I've read: once everything is up and running on AWS, I will not have direct access to the database anymore, as it will only be accessible from behind my REST API, which makes the necessary queries. And the REST API will only be accessible by the front-end server (which is also on AWS). But I will regularly need to read in custom data in different formats.
Currently it is very easy to do that, because I can read in a random Excel file and directly call the methods that actually make the SQL queries on startup of the server. But that is only because my test RDS database is publicly accessible, and I am pretty sure that is terrible practice.
So how can I set things up on AWS so that I can still connect from my laptop and run custom SQL queries against my database?
I am following this tutorial (https://keyholesoftware.com/2017/09/26/using-docker-aws-to-build-deploy-and-scale-your-application/) to get my REST service up and running, and will have to set up the RDS instance separately.
The best choice I know of is to SSH into an EC2 instance and connect to RDS from there. If you're on a Mac, Sequel Pro makes this easy, since you can provide SSH settings along with your MySQL connection settings.
This can also be accomplished with SSH port forwarding, after which you can use your local SQL client. Here's a link to an article that appears to have correct information: MySQL SSH Tunnel.
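The forwarding itself is a one-liner; a sketch with placeholder key, user, and hostnames:

    # Forward local port 3306 to the RDS endpoint via an EC2 host
    # that is allowed to reach RDS. All names here are placeholders.
    ssh -i ~/.ssh/my-key.pem -N \
        -L 3306:mydb.example.us-east-1.rds.amazonaws.com:3306 \
        ec2-user@my-ec2-host.example.com

Your local SQL client then connects to 127.0.0.1:3306 as if the database were local.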
The only other secure option I know of is to allow RDS connections from your IP address. I can't verify that this still works, but as I remember it, I used to run my former company's RDS that way.

Can AWS Elastic Load Balancer be used to only send traffic to a second server if the first fails?

Can an AWS Elastic Load Balancer be set up so it sends all traffic to a main server and, only if that server fails, sends traffic to a second server?
I have an existing web app I picked up that was never built to run on multiple servers, and the client has become worried about redundancy. They don't want to invest enough to make it run well across multiple servers, so I was thinking I could set up a second EC2 server with a MySQL slave and periodically copy files from the primary server to the secondary using rsync, then have an AWS ELB send traffic to the primary server and, only if that fails, send it to the second server.
AWS load balancers don't support "backup" nodes that take traffic only when the primary is down.
Beyond that, you are proposing a complicated scenario.
was thinking I could setup a second EC2 server with a MySQL slave
If you do that, you can only fail over once, and then you can't fail back, because the master database will by then be out of date. For a configuration like this to work and be useful, your two MySQL servers would need to be configured with master/master (circular) replication, so that each is a replica of the other. This is an advanced configuration that requires expertise and caution.
For the MySQL component, an RDS instance with Multi-AZ enabled will provide you with hands-off fault tolerance for the database.
Of course, the client may be unwilling to pay for this as well.
A reasonable shortcut for small systems might be EC2 instance recovery, which brings the site back up if the underlying hardware fails: the feature replaces a failed instance with a new one, reattaches the EBS volumes, and starts it back up. If the system is stable and you have a solid backup strategy for all data, this might be sufficient. Effective redundancy as a retrofit is non-trivial.

Sails app with multiple instances on AWS - Redis/Elasticache/ALB

I'm building a Sails app that uses socket.io, and I see that Sails offers a method for using multiple servers via Redis:
http://sailsjs.org/documentation/concepts/realtime/multi-server-environments
Since I will be placing the app on AWS, preferably behind an ELB (Elastic Load Balancer) with an Auto Scaling group of multiple EC2 instances, I was wondering how I can handle this so it doesn't need a separate Redis instance?
Maybe we can use AWS Elasticache? If so - how would this be done?
Now that AWS has released the new ALB application load balancer which has websockets, could this be used to help simplify things?
Thanks in advance
Updates on the use cases in the application:
Allow end users to update data dynamically from their own dashboard, and display analytics/stats in real time to an administrator.
Application statuses change based on specific timings, e.g. at a given start date/time the app allows users to update data.
Regarding your first question: you don't want to run Redis on the same servers that Sails is running on, especially if you are using Auto Scaling. The Redis server needs to be a separate server that won't disappear if your environment experiences a scale-in event. So Redis is going to have to live on a separate server somewhere.
ElastiCache is just separate EC2 instances running Redis, where AWS handles most of the management for you, to the point that you can't even SSH into the instance. It's similar to how RDS works. ElastiCache will certainly work for your scenario. You might also want to look at the third-party service Redis Labs, which also manages Redis instances on AWS for you.
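To make that concrete: with ElastiCache in place, hooking Sails up is mostly configuration. A sketch of config/sockets.js, assuming the socket.io-redis adapter from the multi-server docs you linked, with a placeholder ElastiCache endpoint:

    // config/sockets.js -- sketch only; the endpoint below is a placeholder.
    // Every instance in the Auto Scaling group ships this same config,
    // so they all share one Redis for socket.io's pub/sub.
    module.exports.sockets = {
      adapter: 'socket.io-redis',
      host: 'my-cluster.abc123.0001.use1.cache.amazonaws.com',
      port: 6379,
    };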
Regarding your second question: an Application Load Balancer will have no bearing on your Redis usage. It will, however, bring actual support for WebSockets, which it sounds like you are using. So yes, you should be using an ALB instead of a classic ELB.