Design service on GCP - google-cloud-platform

On Google Cloud Platform I want to write an application that takes an HTTP request, calls several APIs in a chain, and then renders a template chosen according to the API responses and populated with the data they return. There are many templates.
What is the best way to design this on GCP, considering the following?
1. The application will receive huge traffic.
2. Some APIs will return dynamic URLs that the template needs.
I was thinking of writing it in Java and running it on Kubernetes, which would manage the traffic. But which database should I use?
The data is mostly key-value pairs and should be highly available; if the database goes down, there should be a backup.

Yes, Kubernetes is one option. Something else you may want to consider for handling heavy traffic is Google App Engine (GAE). Since you mentioned Java development, you can use the GAE Standard environment, which is fully managed, easy to build and deploy on, and runs reliably even under heavy load.
You may also want to consider Cloud Datastore since, based on your description, it is the best fit for the application's needs: it is a NoSQL database that automatically handles sharding and replication. You can also use the diagram to choose the best storage option.
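The request flow described in the question (call APIs in a chain, then pick and fill a template) does not depend on the hosting choice, so it can be sketched on its own. The following is only an illustration in Python rather than Java; the template names, the `kind` field, and the stand-in functions `lookup_user` and `lookup_profile_url` are all hypothetical and replace real HTTP calls.

```python
from string import Template

# Hypothetical templates, keyed by a "kind" field set by one of the API calls.
TEMPLATES = {
    "profile": Template('<h1>$name</h1><a href="$url">Open profile</a>'),
    "error": Template("<p>Something went wrong: $reason</p>"),
}


def call_chain(request, steps):
    """Run each step in order; every step sees the data gathered so far."""
    data = dict(request)
    for step in steps:
        data.update(step(data))
    return data


def render(data):
    """Pick a template from the accumulated data and fill it in."""
    template = TEMPLATES.get(data.get("kind"), TEMPLATES["error"])
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return template.safe_substitute(data)


# Stand-ins for real API calls (assumptions made for this sketch).
def lookup_user(data):
    return {"name": "Jane Doe", "user_ref": "u-" + str(data["user_id"])}


def lookup_profile_url(data):
    # The second call depends on the first one's result and returns a dynamic URL.
    return {"kind": "profile", "url": "https://example.com/" + data["user_ref"]}


if __name__ == "__main__":
    print(render(call_chain({"user_id": 42}, [lookup_user, lookup_profile_url])))
```

Keeping each API call as a separate step makes it easy to add or reorder calls, and the handler itself stays stateless, which suits both App Engine and Kubernetes.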

Related

How to create a combined response from multiple microservices (Cloud Run containers) in a single API endpoint using Google Cloud Endpoints (gateway)?

I am familiar with the Firebase platform, but I am a relatively new user of Google Cloud Platform as a whole.
I am working on a project built on a microservices architecture, and I have many questions for which I cannot find an answer or, better, an example.
Unfortunately, all the examples I am able to find are way too simple to extrapolate a viable answer to my issues.
I adopted the new Cloud Run offering and decided to try the fully managed version (not Kubernetes). I built a few microservices (each built with Express for Node or Flask for Python, depending on what the service does). Each microservice exposes its own endpoint and has its own API for calling its methods, and I use a service account to allow the application to perform the internal calls.
I now want to expose the application externally (specifically to my client, built with Vue.js), and I was trying to leverage another Google product to create and expose an API: Google Cloud Endpoints.
My question (specifically about the Cloud Run setup) is how to create an API endpoint for the client app that internally calls multiple services and combines their responses into one.
Just to be clear, here is an example:
Cloud run service 1 -> crud user api
Cloud run service 2 -> crud product api
Cloud Endpoints externally visible API -> get the user from service 1, then get the products from service 2, and return the combined response: all green products for user Jane Doe.
How can I aggregate the responses directly in the Endpoints gateway, check for failures, and, if everything goes smoothly, send the aggregated response to the client?
Do I need to build the aggregate endpoint in something else, such as a Cloud Function, or can I do it directly in the Google Endpoints gateway?
Note that for Cloud Run, Google Endpoints is itself another Cloud Run container.
Thanks for any help; I am pretty much running out of options here.
As per my understanding, an API gateway should just work as a proxy, presenting all microservices as a single endpoint. For this scenario I think you have the following two approaches:
1: Implement a new microservice (or extend one of the existing ones) that invokes the other services and aggregates their responses.
2: The client (such as the UI) can invoke the services and do the aggregation on its side.
I feel it is not a good idea to do it at the API gateway.
In my opinion, from an architectural point of view, the best option for you is to create a new microservice that takes the responses from the other two and aggregates them.
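As a rough sketch of what such an aggregating microservice might do internally, assuming the user and product services are reached through two injected callables: every name here (`aggregate_user_products`, `UpstreamError`, the fake fetchers and the shape of their data) is hypothetical, and in a real service the callables would wrap authenticated HTTP calls to the two Cloud Run services.

```python
class UpstreamError(Exception):
    """Raised when one of the backing services fails or returns nothing usable."""


def aggregate_user_products(fetch_user, fetch_products, user_name, color):
    """Fetch the user, then that user's products, and merge both into one response.

    fetch_user and fetch_products are injected callables so that the
    aggregation logic can be tested without any network access.
    """
    try:
        user = fetch_user(user_name)
    except Exception as exc:
        raise UpstreamError("user service failed") from exc
    if user is None:
        raise UpstreamError("user not found: " + user_name)

    try:
        products = fetch_products(user["id"])
    except Exception as exc:
        raise UpstreamError("product service failed") from exc

    # Filter after fetching; a real product API might accept the color as a query.
    matching = [p for p in products if p.get("color") == color]
    return {"user": user, "products": matching}


# In-memory stand-ins for the two services (assumptions made for this sketch).
def fake_fetch_user(name):
    users = {"Jane Doe": {"id": 1, "name": "Jane Doe"}}
    return users.get(name)


def fake_fetch_products(user_id):
    return [
        {"sku": "A1", "color": "green", "owner": user_id},
        {"sku": "B2", "color": "red", "owner": user_id},
    ]


if __name__ == "__main__":
    print(aggregate_user_products(fake_fetch_user, fake_fetch_products, "Jane Doe", "green"))
```

The HTTP handler of the new microservice would translate `UpstreamError` into an error status for the client and return the merged dictionary as JSON otherwise.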
I understand that you want to aggregate the responses in an API gateway and cannot find code examples for it. Here I was able to find a guide on what you want to implement. The full code implementation can be found in this repository.
Keep in mind, though, that this implementation approach is not a best practice.
It is OK only if the two services being combined are independent, meaning there is no functional/business relation between them and no concurrency or inconsistency problems will occur while aggregating.

What are some of the most appropriate ways for serving a large scale django app on Google Compute Engine?

I am working on a project that will presumably have a lot of user-uploaded content and also a fairly large user base. I am now looking to deploy this app to Google Compute Engine.
I have looked into the possible options, and nginx + gunicorn seems to be a good one. In the beginning I am going to use a single ns-1 instance with a 100 GB persistent hard drive and Google Cloud SQL for my database.
But I want to make things scalable so that I can add more instances and disk storage without any hassle in the future, and I am very confused about how to do that. So my main concern is:
I want a setup in which I can extend my disk space and the number of Google Compute Engine instances whenever I want.
In order to have a fully scalable architecture, a good approach is to separate computation / serving, from file storage, and both from data storage. Going part by part:
file storage - Google Cloud Storage - by storing common service files in a GCS bucket, you get a central repository that is both highly-redundant, and scalable;
data storage - Google Cloud SQL - gives you a highly reliable, scalable MySQL-like database back-end, which can be resized at will to accommodate increasing database usage;
front-ends - GCE instance group - template-generated web / computation front-ends, setting up a resource pool into which a forwarding rule (load balancer) distributes incoming connections.
In a nutshell, this is one of the most adaptable set-ups I can think of, while you keep control over every aspect of the service and underlying infrastructure.
A simple approach would be to run a Python app on Google App Engine, which will auto-scale your instances (both up and down) and it supports Django, as mentioned by #spirulence in the comments.
Here are some starting points:
Django and Cloud SQL support on App Engine
Running Pure Django Projects on Google App Engine
Third-party Libraries in Python 2.7
The last link shows which versions of Django are currently supported.

Building Erlang applications for the cloud

I'm working on a socket server that'll be deployed to AWS and so far we have the basic OTP application set up following a structure similar to the sample project in Erlang in Practice, but we wanted to avoid having a global message router because that's not going to scale well.
Having looked through the OTP design guide on Distributed Applications and the corresponding chapters (Distribunomicon and Distributed OTP) in Learn You Some Erlang it seems the built-in distributed application mechanism is geared towards on-premise solutions where you have known hostnames and IPs and the cluster configuration is determined ahead of time, whereas in our intended setup the application will need to scale dynamically up and down and the IP addresses of the nodes will be random.
Sorry that's a bit of a long-winded build up, my question is whether there are design guidelines for distributed Erlang applications that are deployed to the cloud and need to deal with all the dynamic scaling?
Thanks,
There are a few possible approaches:
In Erlang and OTP in Action, one method presented is to use one or two central nodes with known domains or IPs, and have all the other nodes connect to this one to discover each other
Applications like https://github.com/heroku/redgrid/tree/logplex require having a central redis node where all Erlang nodes register themselves instead, and do membership management
Third party services like Zookeeper and whatnot to do something similar
Whatever else people may recommend
Note that you're going to need to protect your communication, either by switching the distribution protocol to use SSL, or by using AWS security groups and whatnot to restrict who can access your network.
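The central-registry approaches above share one idea: every node periodically announces itself to a known place, and membership is whatever has been heard from recently. A minimal sketch of that idea follows, written in Python purely for illustration rather than Erlang; the `NodeRegistry` class, the 30-second TTL, and the node names are assumptions, and the in-memory dictionary stands in for a shared store such as Redis or ZooKeeper.

```python
import time


class NodeRegistry:
    """In-memory stand-in for a central membership store."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._last_seen = {}         # node name -> time of last heartbeat

    def heartbeat(self, node_name):
        """Called by each node on start-up and then at a regular interval."""
        self._last_seen[node_name] = self._clock()

    def live_nodes(self):
        """Return nodes heard from within the TTL, dropping the stale ones."""
        now = self._clock()
        self._last_seen = {
            name: seen
            for name, seen in self._last_seen.items()
            if now - seen <= self._ttl
        }
        return sorted(self._last_seen)


if __name__ == "__main__":
    registry = NodeRegistry()
    registry.heartbeat("app@10.0.0.5")   # hypothetical node names
    registry.heartbeat("app@10.0.0.9")
    print(registry.live_nodes())
```

A freshly started Erlang node would read this list and connect to each peer (for example with `net_kernel:connect_node/1`); nodes that are scaled down simply stop sending heartbeats and drop out once the TTL expires.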
I'm just learning Erlang so can't offer any practical advice of my own, but it sounds like your situation might require a "Resource Discovery" type of approach, as I've read about in Erlang & OTP in Action.
Erlware also have an application to help with this: https://github.com/erlware/resource_discovery
Other stupid answers in addition to Fred's smart answers include:
Using Route53 and targeting a name instead of an IP
Keeping an IP address in AWS KMS or AWS Secrets Manager, and connecting to that (nice thing about this is it's updatable without a rebuild)
Environment variables: scourge or necessary evil?
Stuffing it in a text file in an obscured, password-protected S3 bucket
VPNs
Hardcoding and updating the build in CI/CD
I mostly do #2

Use cases for web application API?

Nowadays a lot of web applications provide an API for other applications to use.
I am new to using APIs, so I want to understand the use cases for them.
Let's take Basecamp as an example.
What are the use cases for using their API in my web application?
For inserting current data in my web application into a newly created Basecamp account instead of inserting everything manually which could take days or weeks if the data is huge?
For updating my application data when the user changes something in Basecamp? If so, how do I know, for example, when a user adds/edits/removes a contact in Basecamp? Do I make a request and check every minute from the backend?
For making backup of the Basecamp data so I can move it to other applications if necessary?
Are all the above examples good use cases for an API?
Are there more use cases?
I want to have a clear picture of why it's good to use another web service API and how I can leverage that on my application.
Thanks.
I've found the biggest reason to use and provide web services is to be able to programmatically drive the application with another process. This allows the coupling of different actions in different applications driven by one event/process/trigger.
For example, I could use web services provided by Basecamp, my bug tracking database and the continuous integration server. I could tie all those things together and kick them off from a commit hook script.
I can have a monitor in production automatically open a ticket in our ticket tracker. This could trigger an autoremediation process from the ticket tracker which logs into the box remotely and restarts the service.
The other major reason I've seen to use and provide web services is to reduce double entry. If you do change management in your production environment, that usually means you create Change tickets. The changes that occur may also need to be reflected in the Change Management Database, which is usually a model of how production is supposed to look. Most of these systems don't automatically drive the update of your configuration item with the data from the change. Using web services you can stitch them together to eliminate the double (manual) entry that would normally occur.
APIs are used any time you want to get data to/from an application without using the default interface.
*I'd bet there's a mobile app that would use the Basecamp API.
*You could use the API to pull information from Basecamp into another application (like project management software or an individual's to-do webpage).
*The geekiest of us may prefer to update Basecamp from a script/command line rather than interrupting our workflow to open a web page and click around.

How to 'web enable' a legacy C++ application

I am working on a system that splits users by organization. Each user belongs to an organization. Each organization stores its data in its own database which resides on a database server machine. A db server may manage databases for 1 or more organizations.
The existing (legacy) system assumes there is only one organization, however I want to 'scale' the application by running an 'instance' of it (tied to one organization), and run several instances on the server machine (i.e. run multiple instances of the 'single organization' application - one instance for each organization).
I will provide a RESTful API for each instance that is running on the server, so that a thin client can be used to access the services provided by the instance running on the server machine.
Here is a simple schematic that demonstrates the relationships:
Server 1 -> N databases (each organization has one database)
Organization 1 -> N users
My question relates to how to 'direct' RESTful requests from a client, to the appropriate instance that is handling requests from users for that organization.
More specifically, when I receive a RESTful request, it will be from a user (who belongs to an organization), how (or indeed, what is the best way) to 'route' the request to the appropriate application instance running on the server?
From what I can gather, this is essentially a sharding problem. Regardless of how you split the instances at a hardware level (using VMs, multiple servers, all on one powerful server, etc), you need a central registry and brokering layer in your overall architecture that maps given users to the correct destination instance per request.
There are many ways to implement this of course, so just choose one that you know and is fast, and will scale, as all requests will come through it. I would suggest a lightweight stateless web application backed by a simple read only database that does the appropriate client identifier -> instance mapping, which you would load into memory/cache. To add flexibility on hardware and instance location, use (assuming Java) JNDI to store the hardware/port/etc information for each instance, and in your identifier mapping map the client identifier to the appropriate JNDI lookup key.
Letting the public API only specify the user sounds a little fragile to me. I would change the public API so that requests specify organization as well as user, and then have something trivial server-side that maps organizations to instances (eg. organization foo -> instance listening on port 7331).
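A minimal sketch of that trivial server-side mapping, assuming the public API carries the organization in the URL path as `/orgs/<organization>/...`: the path layout, the `INSTANCES` table, the host, and the second entry are hypothetical, and only the foo -> 7331 pairing comes from the example above. Python is used for illustration only.

```python
# Organization -> (host, port) of the legacy instance that serves it.
# In practice this table would be loaded from a small read-only store and cached.
INSTANCES = {
    "foo": ("127.0.0.1", 7331),
    "bar": ("127.0.0.1", 7332),
}


def route(path):
    """Translate a public path like '/orgs/foo/users/17' into a backend URL."""
    parts = path.strip("/").split("/", 2)
    if len(parts) < 2 or parts[0] != "orgs":
        raise ValueError("expected a path of the form /orgs/<organization>/...")

    organization = parts[1]
    rest = parts[2] if len(parts) > 2 else ""

    if organization not in INSTANCES:
        raise LookupError("no instance registered for organization: " + organization)

    host, port = INSTANCES[organization]
    return "http://{}:{}/{}".format(host, port, rest)


if __name__ == "__main__":
    print(route("/orgs/foo/users/17"))
```

A reverse proxy or a small stateless front-end application would call something like `route` for every request and forward it to the returned URL, so adding an organization only means adding a row to the table.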
That is a very tough question indeed; simply because there are many possible answers, and which one is the best can only be determined by you and your environment.
I would write an apache module in C++ to do that. Using this book, I managed to start writing very efficient modules.
To be able to give you more solutions (maybe just setting up a Squid proxy?), you'll need to specify how you can determine which server to redirect the client to: by IP, through a GET parameter, through a POST XML parameter (like SOAP), etc.
As the other answer says, there are many ways to approach this issue. Let's assume that you DON'T have access to the legacy software's source code, which means you cannot modify it to listen on different ports for different instances.
Writing an Apache module seems VERY extreme for solving this issue (and as someone who actually just finished writing a production Apache module, I suggest avoiding it unless you are making serious money).
The approach can be as esoteric as you like. For instance, if your legacy software runs on a normal Intel architecture and you have the hardware capacity, there are VM solutions: you should be able to create thin virtual machines, each running a single instance of the software, plus a multiplexer to tie them all together.
If on the other hand you are running something like HPUX well :-) there are other approaches. How about you give a bit more detail?
Ahmed.