Possible to route multiple projects to a Cloud Function endpoint? - google-cloud-platform

I have a SaaS billing model in which each user has their own GCP project. This is similar to this reddit thread, which asks:
I’m thinking about selling a saas service. I’ve decided every customer will get their own gcp project every customer will have a bunch of cloud run services, a cloud sql database and some users in Identity platform. I know the default project limit is around 12 and it can be increased by filling a form.
This works for something like BigQuery, where each user's Dataset or Table will be created within their own GCP project, and thus their billing (and data) will be segmented under their project.
However, I also have some shared endpoints on Google Cloud Functions, for example a general endpoint to "export data". The query that grabs the data will of course hit the correct GCP project, but the export (or some other data processing task) can be very expensive: some exports might take over an hour to write the data when dealing with billions of rows. What would be the suggested way to set this up so that the end user pays for their own computation? I imagine an endpoint such as www.example.com/api/export is just going to live in the main project, and we wouldn't want, for example, 1000 different cloud functions that do the same thing just to have each one under its respective project.
What might be a solution to this? In a way, I suppose I'm looking for something like this, where the requestor pays.

You would probably need to record how long each function call took, and save that data somewhere before exiting the shared function.
The only alternative would be to split the function for each client, and use billing labels to help with allocation.
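As a rough illustration of the first suggestion, here is a minimal Python sketch that times each call to a shared function and accumulates the duration per customer project. The names `metered`, `usage_seconds` and `export_data` are hypothetical, and the in-memory dictionary only stands in for whatever durable store (a database table, a log sink) you would actually write to before the function exits.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-memory store (assumption): a real shared function would
# persist this to a durable store before exiting.
usage_seconds = defaultdict(float)


@contextmanager
def metered(customer_project_id, clock=time.monotonic):
    """Add the wall-clock duration of the wrapped block to the customer's total."""
    start = clock()
    try:
        yield
    finally:
        # Runs even if the wrapped work raises, so failed calls are still recorded.
        usage_seconds[customer_project_id] += clock() - start


def export_data(customer_project_id, rows):
    """Placeholder for the expensive shared export; only the metering matters here."""
    with metered(customer_project_id):
        return [str(row) for row in rows]


if __name__ == "__main__":
    export_data("customer-project-a", range(1000))
    print(dict(usage_seconds))
```

The recorded seconds can then be multiplied by whatever rate you charge, which keeps a single shared endpoint while still attributing compute time to the requesting customer.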

Related

Can we use Google cloud function to convert xls file to csv

I am new to Google Cloud Functions. My requirement is to trigger a cloud function on receiving a Gmail message and convert the xls attachment from the email to csv.
Can we do this using GCP?
Thanks in advance !
In short: that is possible, as far as I know.
But.
You might find that in order to automate this task in a reliable, robust and self-healing way, it may be necessary to use half a dozen cloud functions, Pub/Sub topics, maybe a Cloud Storage bucket, maybe a Firestore collection, Secret Manager, a custom service account with relevant IAM permissions, and so on. Maybe a dozen or two of different GCP resources. And, obviously, the code for those cloud functions has to be developed. All together, that may not be very easy or quick to implement.
At the same time, I have personally seen (and contributed to the development of) a functional component, based on cloud functions, which did exactly what you would like to achieve. And that was in production.

GCP Best way to manage multiple cloud function flow

I'm studying GCP, and after reading about the different ways to communicate between and manage cloud functions, I ended up wondering when to use each of the services that GCP offers.
So, I have been reading about GCP Composer, GCP Workflows and Cloud Pub/Sub, and I don't see clearly when to use each one, or when to use simple HTTP calls.
I understand that it depends a lot on the application you are building. For example, say I'm building a payment gateway and some functions should be fired after the payment is verified, like sending emails, running unrelated business logic, or adding the purchase to a sales platform. Which one should I use to manage this flow, and in which cases would the others be better? Should I use events to create an async flow with Pub/Sub, use more complex solutions like Composer and Workflows, or just make simple HTTP calls?
As always, it depends! Even in your use case, it depends. After a payment you want to send an email, run business logic, add the order to your databases, ...
But can all these actions be done in parallel, or do you need to execute them in a certain order and stop the process if a step fails?
In the first case, you can use Cloud Pub/Sub with one message published (payment OK) and then fan out to several functions in parallel. Otherwise, you can use Workflows to test the response of a function and then call, or not call, the following functions. With Composer you can perform many more checks and actions.
You can also imagine sending another email 24h later to thank the customer for their order, using Cloud Tasks to delay the action.
You talked about Cloud Functions, but you also have other solutions to host code on GCP: App Engine and Cloud Run. A cloud function is, most of the time, single purpose. Sending an email is perfect for a function.
Now, if you have a "set of functions" to browse your stock, view the object details, review the price, and book an object (validating an order "books" the order content in your warehouse), the "functions" are all single purpose but related to the same domain: warehouse management. Thus you can create a webserver that exposes different paths to manage the warehouse (a microservice for the warehouse, if you prefer) and host it on Cloud Run or App Engine.
Each product has its strengths and weaknesses. You will also see this when you learn about storage on GCP. Most of the time, you can achieve things with several products, but if you don't use the right one, it will be slower or cost much more.
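The difference between the two cases (independent fan-out versus ordered steps that stop on failure) can be sketched in plain Python. This is only an in-memory illustration and does not use the Pub/Sub or Workflows client libraries; `Topic`, `run_in_order` and the three handlers are hypothetical names, and the handlers run one after another here, whereas real Pub/Sub subscribers would run in parallel.

```python
from typing import Callable, List

Handler = Callable[[dict], bool]  # returns True on success, False on failure


class Topic:
    """In-memory stand-in for a Pub/Sub topic with several subscribers."""

    def __init__(self) -> None:
        self.subscribers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        self.subscribers.append(handler)

    def publish(self, message: dict) -> List[bool]:
        # Fan-out: every subscriber receives the message, and a failure in
        # one handler does not prevent the others from being called.
        return [handler(message) for handler in self.subscribers]


def run_in_order(steps: List[Handler], message: dict) -> List[str]:
    """Workflow-style execution: stop at the first step that fails."""
    completed = []
    for step in steps:
        if not step(message):
            break
        completed.append(step.__name__)
    return completed


# Hypothetical handlers for the payment example.
def send_email(message: dict) -> bool:
    return True


def add_to_sales_platform(message: dict) -> bool:
    return False  # simulate a failing step


def run_business_logic(message: dict) -> bool:
    return True


if __name__ == "__main__":
    payment_ok = {"order_id": 1, "status": "payment OK"}
    steps = [send_email, add_to_sales_platform, run_business_logic]

    topic = Topic()
    for step in steps:
        topic.subscribe(step)
    print(topic.publish(payment_ok))         # all three handlers are called
    print(run_in_order(steps, payment_ok))   # stops after the failing step
```

If the steps are independent, the fan-out shape is usually enough; the ordered shape is the one that justifies an orchestrator.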

Deploy multiple agents with dialogflow

I'm developing a Dialogflow agent for bookings. My problem is that I need to deploy the agent for multiple clients, each with their own calendar. Unfortunately, on Google Cloud Platform it is only possible to have one agent per project, and at the same time the number of projects is limited. How can I solve this? I have three possible solutions, but I'm open to suggestions.
Ask Google for more projects and associate one project with each of my clients. I would be able to manage the projects with a service account. But how much will it cost? Can I request more than 1000 projects?
Create a new Google Cloud Platform account for every client and create a project for each account (like the Qwiklabs accounts in the Google courses). The problem is that I don't know how to scale this solution, since I'd need to automate the process and I don't want to create an account manually each time.
Use the same GCP account and the same agent for multiple clients. This may require entering a unique code when starting the chat to identify which calendar we are referring to. This way, though, I won't be able to integrate the chat into the client's website or Facebook page unless I give the same credentials to everyone.
What do you think could be the best solution? Do you have any other ideas to solve this problem?
Thank you guys
In terms of the best solution, it would be best to create a project for each client. With Dialogflow, each project can have at most one agent, so you need multiple projects if you need multiple agents either way.
Additionally, when it comes to the number of projects you can have in GCP, the limit for the average user is 30 projects. However, you can always increase the number of projects by requesting a higher limit. You can do so by referencing this document here.

Is there a way to get GCP service cost by access email ID?

I'm looking for help with GCP billing. I know we can get cost info by service and project; however, is it possible to get it by access email ID? I'm planning to give access to my colleagues, and I want to know how much each person's access costs and against which service.
Something like: Date, Email ID, Service, Cost
In another project, how can we find out whose access cost us so much?
We are running ~30 sandbox projects internally, each allocated to a specific person that can test and run his/her stuff on GCP.
I strongly suggest you create isolated workspaces (projects) for your colleagues so they don't accidentally delete/update services of other people. You will get a separate billing report for each project as well.
I am also setting up a billing alert for all my colleagues so they get an early notification if they left something running on their testbench.
I think there are three ways you could do that kind of cost segregation. I will number them in order of complexity.
1.- Cloud Billing export. For this one, the best practice is to segregate your resources and users by "labels". As administrator, you may ask the users to assign labels to any resource they create, e.g. a new VM instance; you will then be able to filter the exported table by that field and build the reports you want. (Your GCP billing dashboard will also show these "label" segregations.)
2.- Use the Billing API to curl the information you need directly. In the request you can include the information you need, like SKU, User, Date and description.
3.- Usage reports. This solution is more in GSuite's scope, and I can't vouch that it will work as the documentation says, but you can take a look at it: there is an option to get "Usage reports", which can be generated from GSuite for any resource below it, GCP included, if you already have an organization.
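To make option 1 more concrete, here is a small Python sketch that totals cost per label value and service from rows shaped roughly like the exported billing table. The row layout (a `labels` list of key/value pairs, a flat `service` name and a `cost` number) and the `owner` label key are simplifying assumptions for illustration, and the figures are made up; check the actual export schema before relying on field names.

```python
from collections import defaultdict


def cost_by_label(rows, label_key):
    """Sum cost per (label value, service) for the given label key."""
    totals = defaultdict(float)
    for row in rows:
        # Labels are assumed to arrive as a list of {"key": ..., "value": ...}.
        labels = {item["key"]: item["value"] for item in row.get("labels", [])}
        owner = labels.get(label_key, "(unlabeled)")
        totals[(owner, row["service"])] += row["cost"]
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample rows; "owner" is a hypothetical label the admin asks users to set.
    rows = [
        {"service": "Compute Engine", "cost": 1.5,
         "labels": [{"key": "owner", "value": "alice"}]},
        {"service": "Compute Engine", "cost": 2.25,
         "labels": [{"key": "owner", "value": "alice"}]},
        {"service": "BigQuery", "cost": 0.5,
         "labels": [{"key": "owner", "value": "bob"}]},
        {"service": "Cloud Storage", "cost": 0.25, "labels": []},
    ]
    for (owner, service), cost in sorted(cost_by_label(rows, "owner").items()):
        print(owner, service, cost)
```

The same grouping can be written as a query against the exported table; the point is that the report is only as good as the labels users actually apply.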

Microservices Architecture: Cross Service data sharing

Consider the following micro services for an online store project:
Users Service keeps account data about the store's users (including first name, last name, email address, etc.)
Purchase Service keeps track of details about users' purchases.
Each service provides a UI for viewing and managing its relevant entities.
The Purchase Service index page lists purchases. Each purchase item should have the following fields:
id, full name of purchasing user, purchased item title and price.
Furthermore, as part of the index page, I'd like to have a search box to let the store manager search purchases by purchasing user name.
It is not clear to me how to get back data which the Purchase Service does not hold - for example: a user's full name.
The problem gets worse when trying to do more complicated things like search purchases by purchasing user name.
I figured that I can obviously solve this by syncing users between the two services by broadcasting some sort of event on user creation (and saving only the relevant user properties on the Purchase Service end). That's far from ideal from my perspective. How do you deal with this when you have millions of users? Would you create millions of records in each service that consumes user data?
Another obvious option is exposing an API at the Users Service end which brings back user details based on given ids. That means that on every page load in the Purchase Service, I'll have to make a call to the Users Service in order to get the right user names. Not ideal, but I can live with it.
What about implementing a purchase search based on user name? Well, I can always expose another API endpoint at the Users Service end which receives the query term, performs a text search over user names in the Users Service, and then returns all user details which match the criteria. At the Purchase Service, I would map the relevant ids back to the right names and show them in the page. This approach is not ideal either.
Am I missing something? Is there another approach for implementing the above? Maybe the fact that I'm facing this issue is a sort of code smell? I would love to hear other solutions.
This seems to be a very common and central question when moving into microservices. I wish there was a good answer for that :-)
About the suggested pattern already mentioned here, I would use the term Data Denormalization rather than Polyglot Persistence, as it doesn't necessarily need to involve different persistence technologies. The point is that each service handles its own data. And yes, you have data duplication, and you usually need some kind of event bus to share data across services.
There's another option, which is a sort of take on the first: making the search itself a separate service.
So in your example, you have the Users service for managing users. The Purchases service manages purchases. Each handles its own data and only the data it needs (so, for instance, the Purchases service doesn't really need the user name, only the ID). And you have a third service, the Search Service, that consumes data produced by the other services and creates a search "view" from the combined data.
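A minimal in-memory sketch of such a Search Service might look like the following. The class, the event field names (`user_id`, `first_name`, `purchase_id`, and so on) and the substring match are all assumptions made for illustration; a real implementation would consume events from a message bus and index them in a search engine.

```python
class SearchService:
    """Builds a denormalized search view from events emitted by other services."""

    def __init__(self):
        self.user_names = {}  # user_id -> full name
        self.purchases = {}   # purchase_id -> purchase event payload

    def on_user_event(self, event):
        # Emitted by the Users service on user creation or update (assumed shape).
        full_name = f'{event["first_name"]} {event["last_name"]}'
        self.user_names[event["user_id"]] = full_name

    def on_purchase_event(self, event):
        # Emitted by the Purchases service; it carries only the user ID.
        self.purchases[event["purchase_id"]] = dict(event)

    def search_by_user_name(self, term):
        """Return purchases whose buyer's full name contains the term."""
        term = term.lower()
        results = []
        for purchase in self.purchases.values():
            name = self.user_names.get(purchase["user_id"], "")
            if term in name.lower():
                results.append({
                    "id": purchase["purchase_id"],
                    "full_name": name,
                    "title": purchase["title"],
                    "price": purchase["price"],
                })
        return results


if __name__ == "__main__":
    search = SearchService()
    search.on_user_event({"user_id": 1, "first_name": "Ada", "last_name": "Lovelace"})
    search.on_user_event({"user_id": 2, "first_name": "Alan", "last_name": "Turing"})
    search.on_purchase_event({"purchase_id": 10, "user_id": 1, "title": "Notebook", "price": 5})
    search.on_purchase_event({"purchase_id": 11, "user_id": 2, "title": "Pencil", "price": 1})
    print(search.search_by_user_name("ada"))
```

Neither the Users nor the Purchases service needs to know about the other here; only the search view holds the combined data.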
It's totally fine to keep appropriate data in different databases; it's called Polyglot Persistence. Yes, you would like to keep user data and data about purchases separately and use a message queue for sync. Millions of users seems fine to me; it's a scalability issue, not a design issue ;-)
In the case of search, you probably want to search more than just the username, right? So, if you use a message queue to update data between services, you can also easily route this data to ElasticSearch, for example. And from ElasticSearch's perspective it doesn't really matter which field you index, username or product title.
I usually use both approaches. Sometimes I have another service which sits on top of x other services and combines the data. I don't really like this approach because it causes dependencies and coupling between services. So in general, in my last projects we tried to stick to polyglot persistence.
Also consider that if you need x sub HTTP requests to combine data in some kind of middleware service, it will lead to higher latency. We always try to cut down the number of requests for one task and handle everything possible through asynchronous queues (especially data sync).
If you conceptualize modules as the owners and controllers of the data they work on, then your model must also communicate that data out of that module to others. In contrast, the modules in a manufacturing process have the access to change data without possessing and controlling it.
Microservices is an architecture for distributed processing, like most code, where modules pass the data around to work on it. From classic articles by Harvard Business Review and McKinsey on the subject of owning members of a supply chain, I identified complexities arising from this model and wrote an article teaching programmers what you need to know: http://www.powersemantics.com/p.html
Manufacturing is an architecture for integrated processing, where modules work on the data without passing it around from point to point. This can be accomplished by having modules configured to access the same memory, files or database tables. My architecture shows how to accomplish this on memory via reference properties.
When you consider "exposing an API at the Users Service end which brings back user details based on given ids", you need to be aware that creates what HBR calls "irreversible" complexity, which I've dubbed centralization complexity. Don't build A->B (distributed) systems, because you can't decentralize them later after failing to separate requirements. Requirements in production processes represent user instructions, and centralized modules only enable you to change the wrong users' processes. In other words, centralized modules don't document user groups or distinguish them from derived-product-users.