Google Cloud Spanner: Want Java API for doing my own retries - google-cloud-platform

This is really a question for the Google Cloud Spanner Java API team...
Looking at the new Google Cloud Spanner service, it appears that the only way to perform read/write transactions is by providing a callback, via the TransactionRunner interface.
I understand that the API is trying to hide the details of the need to automatically retry transactions as a convenience to the programmer, but this limitation is a serious problem, at least for me. I need to be able to manage the transaction lifecycle myself, even if that means I have to perform my own retries (e.g., based on catching some sort of "retryable" exception).
To make this problem more concrete, suppose you wanted to implement Spring's PlatformTransactionManager for Google Cloud Spanner, so as to fit in with your existing code, and use your existing retry logic. It appears impossible to do that with the current Java API.
It seems like it would be easy to augment the API in a backward compatible way, to add a method returning a TransactionContext to the user, and let the user handle the retries.
Am I missing something? Can this alternate (more traditional) transaction API style be added to the Java API?

You are right in that TransactionRunner is the only way to do Read write transactions in the Java Client for Cloud Spanner. We believe that most users would prefer using that vs hand rolling their own retry logic. But we realize that it might not fit the needs of all the users and would love to hear about such use cases. Can you please file a feature request and we can further discuss there.

Related

Pros and Cons of Google Dataflow VS Cloud Run while pulling data from HTTP endpoint

This is a design approach question where we are trying to pick the best option between Apache Beam / Google Dataflow and Cloud Run to pull data from HTTP endpoints (source) and put them down the stream to Google BigQuery (sink).
Traditionally we have implemented similar functionalities using Google Dataflow where the sources are files in the Google Storage bucket or messages in Google PubSub, etc. In those cases, the data arrived in a 'push' fashion so it makes much more sense to use a streaming Dataflow job.
However, in the new requirement, since the data is fetched periodically from an HTTP endpoint, it sounds reasonable to use a Cloud Run spinning up on schedule.
So I want to gather pros and cons of going with either of these approaches, so that we can make a sensible design for this.
I am not sure this question is appropriate for SO, as it opens a big discussion with different opinions, without clear context, scope, functional and non functional requirements, time and finance restrictions including CAPEX/OPEX, who and how is going to support the solution in BAU after commissioning, etc.
In my personal experience - I developed a few dozens of similar pipelines using various combinations of cloud functions, pubsub topics, cloud storage, firestore (for the pipeline process state managemet) and so on. Sometimes with the dataflow as well (embedded into the pipelieines); but never used the cloud run. But my knowledge and experience may be not relevant in your case.
The only thing I might suggest - try to priorities your requirements (in a whole solution lifecycle context) and then design the solution based on those priorities. I know - it is a trivial idea, sorry to disappoint you.

GCP Best way to manage multiple cloud function flow

I'm studying GCP and reading about different ways to communicate and manage cloud functions I end up wondering when to use each of the services that offer GCP.
So, I have been reading about GCP Composer, GCP Workflows, Cloud Pub/Sub and I don't see clearly when to use each one, or use simple HTTP calls.
I understand that it depends a lot on the application that you are building, but for example, If I'm building a payment gateway and some functions should be fired after the payment was verified, like sending emails, making not related business logic, adding the purchase to a sales platform. So which one should be the way I manage this flow and in which case would be better to use the others? Should I use events to create an async flow with Pub/Sub, or use complex solutions like composer and workflows? or just simple HTTP calls?
As always, it depends!! Even in your use case, it depends! Ok, after a payment you want to send an email, make business logic, adding the order to your databases,...
But, is all theses actions can be done in parallel, or you need to execute them in a certain order and if a step fails, you stop the process?
In the first case, you can use Cloud PubSub with 1 message published (payment OK) and then a fan out to several functions in parallel. Else, you can use workflow to test the response of the fonction and then to call, or not the following fonctions. With composer you can perform much more checks and actions.
You can also imagine to send another email 24h after to thank the customer for their order, and use Cloud Task to delayed an action.
You talked about Cloud Functions, but you also have other solutions to host code on GCP: App Engine and Cloud Run. Cloud function is, most of the time, single purpose. Sending an email is perfect for a function.
Now, if you have "set of functions" to browse your stock, view the object details, review the price, and book an object (validate an order "books" the order content in your warehouse), the "functions" are all single purpose but related to the same domain: warehouse management. Thus you can create a webserver that propose different path to manage the warehouse (a microservice for the warehouse if you prefer) and host it on CloudRun or App Engine.
Each product has its strength and weakness. You will also see this when you will learn about the storage on GCP. Most of the time, you can achieve things with several product, but if you don't use the right one, it will be slower, or cost much more.

(GCloud SQL) Is there a way to opting in to maintenance notifications via CLI or API?

I'm looking for a way to toggle this notification via gcloud CLI or API call since I need to automate it.
Is there a way of doing it? If not, is this going to be available in the future?
Having a lot of environment is hard to keep track of all of them via UI.
I have checked the Cloud SQL Admin API and it seems it is not possible yet. The best way to proceed in this cases is to create a Feature Request in the Public Issue Tracker. I searched for an existing one, but I didn't find any. When submitting a Feature Request, the Engineering team has more visibility of your needs and they can prioritize those requests by the number of users affected.

API Throttling Best Practices

I have a SOAP api that I would like to throttle access to on a User basis after "x" many calls have been received in "y" amount of time.
After searching around, the #1 consideration (obviously) is to consider your parameters for when to throttle users. However, I don't see much in the way of best practices/examples for implementing such a solution. I did see the Leaky Bucket Method which makes sense. I have to believe there are more ideas out there though.
Any other takers on how you go about implementing your throttling solution? Questions include:
Do any frameworks provide capabilities (e.g. Spring, etc.) for throttling in web apis?
Seems to me you would need to store access information per user. How do you minimize the database overhead for doing this EVERY call?
Do you even NEED to access a datastore to implement this?
For what its worth, I've sort of answered this question after working on some other production projects.
Home brew: Using Spring AOP to pointcut around the method calls prior to executing API method code is one home-brew way if you have your own algorithm to implement. This ends up being pretty elegant and flexible as you can capture a lot of metadata prior to deciding what to do with the request.
API Management Service: If you're talking about a production system and you have the budget, probably the best way to go is to delegate this to an API Management layer like Apigee or Mashery.
Advantage is that it separates the concerns so its easier to change and allows you to focus just on your API. This is especially helpful if business stakeholders are involved and you need a good UI and dictionary of terms.
Disadvantage, of course is the cost and the vendor lock in.
Hope this helps someone!

SOA - data access for business services as a separate web service or no?

Currently inside my organization we are trying to come up w/ some conventions for a pilot SOA project. At first glance we thought it would be best to force users of the service to use the business service w/out direct access to any data endpoints .. but are there specific scenarios where this is not true or it might be "valid" for developers to have access to specific data endpoints outside of a service?
I always fear that if we open this up, it will actually hurt reuse because everyone will just "re-invent" similar business services using the same back-end data as it's available and would be "in theory .. easier" to just write a new business service rather than ask "what does this other service do that uses my database?"
Because even if the service is almost a direct pass through to the database we would have the ability to apply rules that would save developers time and ultimately the business money.
Thoughts?
Wouldn't your webservice just be a thin wrapper for your business layer anyway? Your service layer probably shouldn't have anything more than your business layer besides some dumbed-down DTOs perhaps. Then noone is asking "what does this other service do that uses my database?" because it isn't using your db, it is using your BL.
You're waving a red flag when you describe "forcing" users to do anything. Maybe you can think about your question and rephrase it in terms something like "enable" users, because this should be the starting point for your decision, and SOA offers advantages and disadvantages for different UI strategies.