Upgrading PostgreSQL/Redis databases without downtime on GCP - google-cloud-platform

I'm building a web app in React with a Node.js backend, hosted on Google Cloud Platform. I'm using a PostgreSQL database and a Redis database, and because my knowledge of these databases is limited, I'm using the managed options (Cloud SQL and Cloud Memorystore).
These are not the cheapest solutions, but for now, they'll do what I want them to do.
My question is this: I'm using the managed options now, but if my web app succeeds and grows bigger, I'll probably want my own self-managed setup (like a Redis cluster or a PostgreSQL cluster on Compute Engine). Will I be able to migrate my managed databases to that Compute Engine solution without downtime or data loss?
If things grow, I'll probably hire someone with more PostgreSQL/Redis knowledge, so that's not the problem. The only thing I want to know is: is it possible to move from a GCP managed solution to a self-managed solution on Compute Engine without losing data? I don't want any data loss at all; a little downtime would not be a problem.

Using the managed solution is, in fact, a better approach for handling databases. GCP takes over updates, management and maintenance of the database and provides reliable tools for backup and scaling.
But to answer your question: yes, it is possible to migrate with minimal downtime. You would configure primary/replica replication (historically called master/slave, now deprecated terminology), with synchronous replication to a replica running on Compute Engine. Once the replica has caught up, you switch over and promote the Compute Engine instance to be your new primary database. This gives you close to the minimum possible downtime.
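As a rough illustration only (not a Cloud SQL-specific procedure; the host names, credentials, and the psycopg2/redis-py client libraries are assumptions), a small check like this can confirm that the Compute Engine replicas have caught up before you promote them:

```python
# Sketch: verify replication has caught up before promoting the Compute Engine
# replicas. Hosts and credentials below are placeholders, not real endpoints.
import psycopg2          # assumed PostgreSQL client library
import redis             # assumed redis-py client library

# Check PostgreSQL replication lag as seen from the current primary.
pg = psycopg2.connect(host="CLOUD_SQL_PRIMARY_IP", dbname="app",
                      user="replication_check", password="...")
with pg.cursor() as cur:
    cur.execute("SELECT application_name, state, replay_lag "
                "FROM pg_stat_replication;")
    for name, state, lag in cur.fetchall():
        print(f"postgres replica {name}: state={state}, replay_lag={lag}")

# Check that the Redis replica on Compute Engine is connected and in sync.
r = redis.Redis(host="GCE_REDIS_REPLICA_IP", port=6379)
info = r.info("replication")
print("redis role:", info.get("role"),
      "link:", info.get("master_link_status"),
      "offset behind:",
      info.get("master_repl_offset", 0) - info.get("slave_repl_offset", 0))
```

Once both replicas report no meaningful lag, you stop writes briefly, promote the Compute Engine instances, and point your application at them.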

Related

Docker Django NGINX Postgres build with scaling

I have worked mostly with monolithic web applications over the years and have not spent very much time looking at Django high-availability scaling solutions.
There are plenty of NGINX/Postgres/Django/Docker builds on hub.docker.com, but I could not find one that would be a good starting point for a scalable Django web application.
Ideally the Docker project would include:
A web container configuration that easily allows adding new web containers to the web app
A database container configuration that easily allows adding new database shards
Django microservice examples would also be a plus
Unfortunately, I do not have very much experience with Kubernetes and Docker swarms.
If there are none, I may make this my next github.com project.
Thank you in advance.
That is a very broad request. It depends on many factors:
What is the performance requirement?
Make sure you really need that kind of scaling. In most cases, a single good server with docker-compose can handle your needs. If you want to reduce the risk of downtime, create two servers, add a load balancer in front of them, and extract shared services like the DB or caching into a separate service that both server instances use.
How do you want to run your applications?
Docker-Compose:
Check out https://github.com/pydanny/cookiecutter-django for how to do it with docker-compose on a single server.
Kubernetes:
Django runs quite well in Kubernetes; I am running several applications on it. The catch is that you need to set up all the other services properly (Redis, the DB, Elasticsearch, etc.). Each of those services runs independently of your main Django app and needs to be connected through libraries like Haystack or redis-py.
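To make that concrete, here is a minimal settings sketch (the service names, environment variable names, and the Django 4+ Redis cache backend are assumptions) showing how a Django app on Kubernetes typically picks up its database and Redis endpoints from the environment instead of hard-coding them:

```python
# settings.py sketch: point Django at services that run as separate Kubernetes
# Services, with hosts and credentials injected via environment variables.
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("DB_NAME", "app"),
        "USER": os.environ.get("DB_USER", "app"),
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),
        "HOST": os.environ.get("DB_HOST", "postgres"),   # K8s Service name
        "PORT": os.environ.get("DB_PORT", "5432"),
    }
}

CACHES = {
    "default": {
        # Built-in Redis cache backend (Django 4.0+); django-redis is an
        # alternative on older versions.
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": os.environ.get("REDIS_URL", "redis://redis:6379/1"),
    }
}
```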
Heroku & others
There are also providers like Heroku that offer auto-scaling for a price. If price is less of a concern, go with these solutions, because they are very easy to set up and maintain.
How scalable should your DB be?
If you try to set up your own DB cluster across different servers/regions, you will spend a lot of time building and maintaining it. I would recommend using a managed DB service like AWS RDS instead. They do this for you with easy setup and maintenance, and you can configure how scalable it should be. It costs some money, but it's the least-effort solution.

Manually install everything in a GCP VM module

I am new to the cloud and still learning GCP. I exhausted almost all of my free GCP credit within 2 months while learning the different modules.
GCP is great and provides a lot of things to ease the development and maintenance process.
But I realized that using the different modules was costing me a lot.
So I was wondering: if I got one big VM box and installed MySQL, Docker, Java, and the required React components myself, could I achieve pretty much what I want without using the extra modules?
Am I right?
Can I use the same VM to host multiple sites by changing the API ports, or do I need different boxes for that?
Your question is less about GCP and more about IT architecture. You can create one big VM with everything installed on it, but you have to manage it yourself, and scaling it is hard.
You can also have one VM per website, but the management cost is higher (patching and upgrades). However, you can scale at a finer granularity (per website).
The standard pattern today is to break your monolithic server into dedicated services: the database on its own server, Docker and Java on another, and the React front-end served as static content (for example from Google Cloud Storage).
If you want to keep using VMs, you can containerize your application and use GKE. It's far easier to maintain your VMs with an orchestration tool like Kubernetes.
The ultimate step is to go serverless and/or fully managed: Cloud SQL for your database, GCS for your static content, and App Engine or Cloud Run for your backend. This way you pay as you go, and if your website sees little use, you won't be charged much for it (except for the database).
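As a loose sketch of what a serverless-friendly backend looks like (the question mentions Java, but this illustration uses Python/Flask, and all names are placeholders): Cloud Run, for example, just expects a stateless HTTP service that listens on the port passed in the PORT environment variable.

```python
# Minimal Flask sketch of a Cloud Run-friendly backend: stateless, and it
# listens on the port Cloud Run provides via the PORT env var.
import os
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/health")
def health():
    # No local state: anything persistent lives in Cloud SQL / GCS instead.
    return jsonify(status="ok")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```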

Solr or SolrCloud in Aws?

I have to add a Solr search server on an AWS EC2 instance. Right now I have Solr installed on an EC2 instance with 8 GB of RAM and 50 GB of disk space. It's working fine, but I was wondering whether changing to SolrCloud would improve performance. Should I go for normal Solr or for SolrCloud? If SolrCloud, why?
It's impossible to say; both will work. "Regular" Solr lets you scale your infrastructure by adding replicas for your cores, while SolrCloud adds hidden complexity in exchange for easier handling of replication and query distribution.
If everything works now, I wouldn't fret it. Keep track of your query times and re-evaluate if you run into issues where you need to add instances to your cluster quickly. A regular, simple Solr setup with HTTP replication will in almost all cases do just fine.
Both would work, but if you were starting afresh, go with SolrCloud. Here are a few reasons:
There is little or no meaningful application development overhead if you go with SolrCloud. If you are writing your integration / application to talk to Solr, just do so with SolrCloud to begin with.
SolrCloud is what's being adopted more and more widely, and a lot of new development and capabilities land there first.
As you scale, you are already on a path to simple scalability instead of having to figure out high availability and/or data partitioning later. You can use the SolrCloud API to add replicas, etc.
Not much to lose from a feature or functionality perspective.
If you want to try out starting a cluster with high availability or single node deployment of SolrCloud on AWS, you can sign up for a free trial at https://searchstax.measuredsearch.com/freetrial/
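Either way, the application side barely changes. As a hedged sketch with the pysolr client (the URL, core/collection name, and fields are placeholders), the query code looks the same whether it points at a standalone Solr core or a SolrCloud collection behind a load balancer:

```python
# Sketch: querying Solr from Python with pysolr. The URL can point at a single
# standalone Solr core or at a SolrCloud collection behind a load balancer;
# the application code stays the same.
import pysolr

solr = pysolr.Solr("http://solr.example.internal:8983/solr/products", timeout=10)

# Index a couple of documents (committing immediately for the example's sake).
solr.add([
    {"id": "1", "title": "Red running shoes"},
    {"id": "2", "title": "Blue trail shoes"},
], commit=True)

# Search and iterate over results.
results = solr.search("title:shoes", rows=10)
for doc in results:
    print(doc["id"], doc["title"])
```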

What are some of the most appropriate ways for serving a large scale django app on Google Compute Engine?

I am working on a project that will presumably have a lot of user-uploaded content and a fairly large user base. I am now looking at deploying this app to Google Compute Engine.
I have looked at the possible options, and nginx+gunicorn seems like a good choice. In the beginning I am going to use a single n1 instance with a 100 GB persistent disk and Google Cloud SQL for my database.
But I want to make things scalable so that I can add more instances and disk storage without any hassle in the future, and I am very confused about how to do that. So the main concern is:
I want a setup that lets me extend my disk space and the number of Compute Engine instances whenever I want.
In order to have a fully scalable architecture, a good approach is to separate computation/serving from file storage, and both from data storage. Going part by part:
file storage - Google Cloud Storage - by storing common service files in a GCS bucket, you get a central repository that is both highly-redundant, and scalable;
data storage - Google Cloud SQL - gives you a highly reliable, scalable MySQL-like database back-end, which can be resized at will to accommodate increasing database usage;
front-ends - GCE instance group - template-generated web / computation front-ends, setting up a resource pool into which a forwarding rule (load balancer) distributes incoming connections.
In a nutshell, this is one of the most adaptable set-ups I can think of, while you keep control over every aspect of the service and underlying infrastructure.
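For the file-storage piece specifically, here is a hedged sketch of how a Django project commonly offloads uploads to a GCS bucket using the django-storages package (the bucket name and the example model are placeholders, and this is one option among several):

```python
# settings.py sketch: store user-uploaded media in a GCS bucket instead of on
# the instance's persistent disk, so any front-end instance can accept and
# serve uploads. Requires the django-storages and google-cloud-storage packages.
DEFAULT_FILE_STORAGE = "storages.backends.gcloud.GoogleCloudStorage"
GS_BUCKET_NAME = "my-app-user-uploads"   # placeholder bucket name

# Models then use FileField/ImageField as usual; files land in the bucket:
# from django.db import models
# class Upload(models.Model):
#     attachment = models.FileField(upload_to="uploads/")
```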
A simple approach would be to run a Python app on Google App Engine, which will auto-scale your instances (both up and down) and supports Django, as mentioned by #spirulence in the comments.
Here are some starting points:
Django and Cloud SQL support on App Engine
Running Pure Django Projects on Google App Engine
Third-party Libraries in Python 2.7
The last link shows which versions of Django are currently supported.

How to convert a WAMP stacked app running on a VPS to a scalable AWS app?

I have a web app running on PHP, MySQL, and Apache on a virtual Windows server. I want to redesign it so it is scalable (for fun, so I can learn new things) on AWS.
I can see how to setup an EC2 and dump it all in there but I want to make it scalable and take advantage of all the cool features on AWS.
I've tried googling but just can't find a simple guide (note: I have no Linux command-line experience).
Can anyone direct me to detailed resources that can lead me through the steps and teach me? Or alternatively, summarise the steps in an answer so I can research based on what you say.
Thanks
AWS is growing and changing all the time, so there aren't a lot of books to help. Amazon offers training that's excellent. I took their three-day class on Architecting with AWS, which seems to be just what you're looking for.
Of course, not everyone can afford to spend the travel time and money to attend a class. The AWS re:Invent conference in November 2012 had a lot of sessions related to what you want, and most (maybe all) of the sessions have videos available online for free. Building Web Scale Applications With AWS is probably relevant (slides and video available), as is Dissecting an Internet-Scale Application (slides and video available).
A great way to understand these options better is by fiddling with your existing application on AWS. It will be easy to just move it to an EC2 instance in AWS, then start taking more advantage of what's available. The first thing I'd do is get rid of the MySQL server on your own machine and use one offered with RDS. Once that's stable, create one or more read replicas in RDS, and change your application to read from them for most operations, reading from the main (writable) database only when you need completely current results.
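As a hedged sketch of that read/write split in application code (the endpoints, credentials, and the pymysql client are placeholders; the same idea applies in PHP with mysqli or PDO):

```python
# Sketch: send writes to the RDS primary and lag-tolerant reads to a replica.
# Endpoints and credentials below are placeholders.
import pymysql

primary = pymysql.connect(host="myapp-primary.example.rds.amazonaws.com",
                          user="app", password="...", database="myapp")
replica = pymysql.connect(host="myapp-replica.example.rds.amazonaws.com",
                          user="app", password="...", database="myapp")

def create_order(user_id, total):
    # Writes always go to the primary.
    with primary.cursor() as cur:
        cur.execute("INSERT INTO orders (user_id, total) VALUES (%s, %s)",
                    (user_id, total))
    primary.commit()

def list_orders(user_id):
    # Reads that can tolerate slight replication lag use the replica.
    with replica.cursor() as cur:
        cur.execute("SELECT id, total FROM orders WHERE user_id = %s",
                    (user_id,))
        return cur.fetchall()
```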
Does your application keep any data on the web server, other than in the database? If so, get rid of all local storage by moving that data off the EC2 instance. Some of it might go to the database, some (like big files) might be suitable for S3. DynamoDB is a good place for things like session data.
All of the above reduces the load on the web server to just your application code, which helps with scalability. And now that you keep no state on the web server, you can use ELB and Auto-scaling to automatically run multiple web servers (and even automatically launch more as needed) to handle greater load.
Does the application have any long running, intensive operations that you now perform on demand from a web request? Consider not performing the operation when asked, but instead queueing the request using SQS, and just telling the user you'll get to it. Now have long running processes (or cron jobs or scheduled tasks) check the queue regularly, run the requested operation, and email the result (using SES) back to the user. To really scale up, you can move those jobs off your web server to dedicated machines, and again use auto-scaling if needed.
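A hedged sketch of that queueing pattern with boto3 (the queue URL, message fields, and worker wiring are made up for illustration; the AWS SDK for PHP exposes the same operations):

```python
# Sketch: a web request enqueues work; a separate worker process drains the
# queue and emails the result. The queue URL and message fields are placeholders.
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/report-jobs"

def enqueue_report(user_email, report_id):
    """Called from the web request: queue the job and return immediately."""
    sqs.send_message(QueueUrl=QUEUE_URL,
                     MessageBody=json.dumps({"email": user_email,
                                             "report_id": report_id}))

def worker_loop():
    """Runs on a separate worker machine (or from cron), not the web server."""
    while True:
        resp = sqs.receive_message(QueueUrl=QUEUE_URL,
                                   MaxNumberOfMessages=1,
                                   WaitTimeSeconds=20)  # long polling
        for msg in resp.get("Messages", []):
            job = json.loads(msg["Body"])
            # ... run the long job here, then email the result via SES ...
            sqs.delete_message(QueueUrl=QUEUE_URL,
                               ReceiptHandle=msg["ReceiptHandle"])
```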
Do you need bigger machines, or perhaps can you live with smaller ones? CloudWatch metrics can show you how much IO, memory, and CPU are used over time. You can use provisioned IOPS with EC2 or RDS instances to improve performance (at a cost) as needed, and use different-sized instances for more memory or CPU.
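A hedged sketch of pulling one such metric with boto3 (the instance ID and region are placeholders; memory metrics additionally require the CloudWatch agent, so this sticks to CPU):

```python
# Sketch: fetch average CPU utilization for one EC2 instance over the last day.
from datetime import datetime, timedelta
import boto3

cw = boto3.client("cloudwatch", region_name="us-east-1")

stats = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
    Period=3600,                # one datapoint per hour
    Statistics=["Average"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1), "%")
```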
All this AWS setup and configuration can be done with the AWS web console, or command-line tools, or SDKs available in many languages (Python's boto library is great). After learning the basics, look into CloudFormation to automate it better (I've written a couple of posts about that so far).
That's a bit of the 10,000 foot high view of one approach. You'll need to discover the details of each AWS service when you try to use them. AWS has good documentation about all of them.
Depending on how you look at it, this is more of a comment than it is an answer, but it was too long to write as a comment.
What you're asking really can't be answered on SO; it's a huge, complex question. You're basically asking, "How do I design a highly scalable, durable application that can be deployed on a cloud-based platform?" The answer depends largely on:
The specifics of your application--what does it do and how does it work?
Your tolerance for downtime balanced against your budget
Your present development and deployment workflow
The resources/skill sets you have on-staff to support the application
What your launch time frame looks like.
I run a software consulting company that specializes in consulting on Amazon Web Services architecture. About 80% of our business is investigating and answering these questions for our clients. It's a multi-week long project each time.
However, to get you pointed in the right direction, I'd recommend that you look at Elastic Beanstalk. It's a PaaS-like service that abstracts away the underlying AWS resources, making AWS easier to use for developers who don't have a lot of sysadmin experience. Think of it as "training wheels" for designing an autoscaling application on AWS.