I have a FastAPI project in Cloud Run and it has some background jobs inside it. (Not heavy stuff)
However, whenever Cloud Run creates a new instance (because of request volume, etc.), every instance runs the background job concurrently.
For example;
I have a task that creates invoices for customers in the background, and if three instances are created at once, three invoices will be created.
I looked into "FOR UPDATE" usage in PostgreSQL etc. It seems I can solve this by modifying my database, but I wonder whether it can be solved on the Cloud side.
I don't want to limit the max. number of instances to 1.
What would you do in this situation?
Thank you for your time.
If you can potentially have N instances of a job (because you don't want to set the max limit to 1), you need to implement your jobs in an idempotent way. Broadly speaking, you have a few ways to achieve idempotency:
by enforcing a business constraint.
by storing an idempotency key.
by using the ETag HTTP response header.
For example, Stripe lets you define an idempotency key for all of your API requests. Stripe stores this key on its servers, and when you make a POST request with the same key as a previous one, Stripe returns the same result. POST requests are not idempotent by themselves, but with this "trick" they become idempotent.
Stripe's idempotency works by saving the resulting status code and body of the first request made for any given idempotency key, regardless of whether it succeeded or failed. Subsequent requests with the same key return the same result, including 500 errors.
https://stripe.com/docs/api/idempotent_requests
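To make the business-constraint option concrete, here is a minimal sketch. It assumes an invoice is uniquely identified by customer and billing period; the table, columns, and function names are hypothetical, and in-memory SQLite stands in for the real shared database. Every instance attempts the insert, but the unique constraint lets only one succeed:

```python
import sqlite3


def open_db():
    # In-memory SQLite stands in for the shared production database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE invoices (
            customer_id    INTEGER NOT NULL,
            billing_period TEXT    NOT NULL,
            amount_cents   INTEGER NOT NULL,
            -- The business constraint: one invoice per customer per period.
            UNIQUE (customer_id, billing_period)
        )
        """
    )
    return conn


def create_invoice_once(conn, customer_id, billing_period, amount_cents):
    """Insert the invoice unless one already exists; return True if created."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO invoices VALUES (?, ?, ?)",
        (customer_id, billing_period, amount_cents),
    )
    conn.commit()
    return cur.rowcount == 1


conn = open_db()
# Three instances run the same job for the same customer and period.
results = [create_invoice_once(conn, 42, "2024-01", 1999) for _ in range(3)]
invoice_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
```

In PostgreSQL the same idea would typically be written as INSERT ... ON CONFLICT DO NOTHING against a unique index.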
Tip: you could expand your question by clarifying how these background tasks are created, and where they run.
We have a service which inserts certain values into DynamoDB. For the sake of this question, let's say it's a key:value pair, i.e. customer_id:customer_email. The inserts don't happen that frequently, and once an insert is done, that specific key doesn't get updated.
What we have done is create a client library which, given a customer_id, will fetch the customer_email from DynamoDB.
Given that the customer_id data is static, we were thinking of adding a cache in front of the table, but we are not sure what will happen in the following use case:
client_1 uses our library to fetch customer_email for customer_id = 2.
The customer doesn't exist so API Gateway returns not found
APIGateway will cache this response
For any subsequent calls, this cached response will be sent
Now another system inserts customer_id = 2 with its email ID. This system doesn't know whether the response has been cached previously. It doesn't even know that any other system has fetched this specific data. How can we invalidate the cache for this specific customer_id when it gets inserted into DynamoDB?
You can send a request to the API endpoint with a Cache-Control: max-age=0 header which will cause it to refresh.
This could open your application up to attack as a bad actor can simply flood an expensive endpoint with lots of traffic and buckle your servers/database. In order to safeguard against that it's best to use a signed request.
In case it's useful to people, here's .NET code to create the signed request:
https://gist.github.com/secretorange/905b4811300d7c96c71fa9c6d115ee24
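For a rough idea of the unsigned part in Python, here is a small sketch that only builds the request with the Cache-Control header described above. The URL and function name are made up for illustration, and signing is left out because it depends on how your API is secured:

```python
import urllib.request


def build_invalidation_request(url):
    """Build a GET request asking the cache to refresh this entry."""
    request = urllib.request.Request(url, method="GET")
    request.add_header("Cache-Control", "max-age=0")
    # Signing (e.g. AWS Signature Version 4) is deliberately omitted here.
    return request


# Hypothetical endpoint; nothing is sent until urlopen() is called.
req = build_invalidation_request("https://api.example.com/customers/2")
```

The system that inserts the customer could send a request like this right after its write succeeds.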
We've built a Lambda which takes care of re-filling the cache with updated results. It's quite a manual process, with very little reusable code, but it works.
Lambda is triggered by the application itself following application needs. For example, in CRUD operations the Lambda is triggered upon successful execution of POST, PATCH and DELETE on a specific resource, in order to clear the general GET request (i.e. clear GET /books whenever POST /book succeeded).
Unfortunately, if you have a view with a server-side paginated table you are going to face all sorts of issues, because invalidating /books is not enough: you may actually have /books?page=2, /books?page=3 and so on... a nightmare!
I believe APIG should allow for more granular control of cache entries, otherwise many use cases aren't covered. It would be enough if they would allow to choose a root cache group for each request, so that we could manage cache entries by group rather than by single request (which, imho, is also less common).
Did you look at this? https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html
There is a way to invalidate the entire cache or a particular cache entry.
I would like to know how to design a RESTful web service for process methods. For example, I want to make a REST API for ProcessPayroll for a given employee ID. Since ProcessPayroll is a time-consuming job, I don't need any response from the method call; I just want to invoke ProcessPayroll asynchronously and return. I can't use ProcessPayroll in the URL since it is not a resource and it is not a verb. So I thought I could go with one of the approaches below:
Request 1
http://www.example.com/payroll/v1.0/payroll_processor POST
body
{
"employee" : "123"
}
Request 2
http://www.example.com/payroll/v1.0/payroll_processor?employee=123 GET
Which one of the above approaches is the correct one? Are there any RESTful API design guidelines for building a RESTful service for process methods and functions?
Which one of the above approaches is the correct one?
Of the two, POST is closest.
The problem with using GET /mumble is that the specification of the GET method restricts its use to operations that are "safe"; which is to say that they don't change the resource in any way. In other words, GET promises that a resource can be pre-fetched, just in case it is needed, by the user agent and the caches along the way.
Is there any Restful API Design guidelines to make a Restful service for process methods and functions?
Jim Webber has a bunch of articles and talks that discuss this sort of thing. Start with How to GET a cup of coffee.
But the rough plot is that your REST api acts as an integration component between the process and the consumer. The protocol is implemented as the manipulation of one or more resources.
So you have some known bookmark that tells you how to submit a payroll request (think web form), and when you submit that request (typically POST, sometimes PUT, details not immediately important) the resource that handles it as a side effect (1) starts an instance of ProcessPayroll from the data in your message, (2) maps that instance to a new resource in its namespace and (3) redirects you to the resource that tracks your payroll instance.
In a simple web api, you just keep refreshing your copy of this new resource to get updates. In a REST api, that resource will be returning a hypermedia representation of the resource that describes what actions are available.
As Webber says, HTTP is a document transport application. Your web api handles document requests, and as a side effect of that handling interacts with your domain application protocol. In other words, a lot of the resources are just messages....
We came up with a similar solution in my project, so don't blame me if my opinion is wrong; I just want to share our experience.
As for the resource itself, I'd suggest something like
http://www.example.com/payroll/v1.0/payrollRequest POST
As the job is supposed to run in the background, the API call should return the Accepted (202) HTTP status code. That tells the user that the operation will take a long time. However, you should return a unique payrollRequestId (a GUID, for example) so that users can retrieve the posted resource later by calling:
http://www.example.com/payroll/v1.0/payrollRequest/{payrollRequestId} GET
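Roughly, the two calls could behave like the sketch below. This is framework-free Python with hypothetical handler names and an in-memory dict standing in for real storage; each handler returns a (status, headers, body) tuple rather than a real HTTP response:

```python
import uuid

# In-memory store standing in for a database or job queue.
payroll_requests = {}


def post_payroll_request(employee_id):
    """Handle POST /payrollRequest: record the job and return 202 Accepted."""
    request_id = str(uuid.uuid4())
    payroll_requests[request_id] = {"employee": employee_id, "status": "pending"}
    # A real service would hand the job to a background worker here.
    headers = {"Location": f"/payroll/v1.0/payrollRequest/{request_id}"}
    return 202, headers, {"payrollRequestId": request_id}


def get_payroll_request(request_id):
    """Handle GET /payrollRequest/{payrollRequestId}: report the job state."""
    job = payroll_requests.get(request_id)
    if job is None:
        return 404, {}, {"error": "unknown payrollRequestId"}
    return 200, {}, job


status, headers, body = post_payroll_request("123")
lookup_status, _, job = get_payroll_request(body["payrollRequestId"])
```

The caller gets the identifier back immediately and can poll the second endpoint whenever it wants to know how the job is doing.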
Hope this helps
You choose between POST, PUT and GET based on what the API does:
If your REST API creates a new row in the DB (i.e. a new resource), then you have to go for POST. In your case, if your payroll process method creates a resource, then you have to choose POST.
If your REST API both creates and updates resources, i.e. your payroll method processes and updates existing data and also creates new data, then go for PUT.
If your REST API just reads data, go for GET. But judging from your question, your payroll method does not return any data, so GET is not the best fit for your case.
I think your payroll method does both things:
Process the data, which means updating existing data, and
Create new data, which means creating a new row in the DB.
NOTE: One more thing: PUT is idempotent and POST is not. See the linked question "PUT vs POST in REST".
So, you have to go for the PUT method.
When I request a user's tagged places using the Facebook Graph API and receive back 200 places (in groups of 25, more or less) does that count as one API request or 200?
I know if I ask for 5 specific Ids it'll be counted as 5 separate API calls but I'm wondering about when you are asking for everything. You won't know how much data is coming back. If it counts as one API call then great, but if it's 200 then I'll need some mechanism to gather the data over a longer period of time.
This is the API call.
https://developers.facebook.com/docs/graph-api/reference/user/tagged_places
One call to the API is...well, one API call, obviously. If you call the API one time and get 25 items in the result, it will be one API call. You can increase the limit and it will still be one API call.
More information: https://developers.facebook.com/docs/graph-api/advanced/rate-limiting#what-do-we-consider-an-api-call-
The example in the docs shows that one request can also count as several API calls, if you ask for some specific IDs.
If you get 200 results, you either increased the limit or you used paging. Since you mentioned "in groups of 25", I assume you are using paging, which means you would need 8 API calls to get 200 items.
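The arithmetic can be checked with a small simulation. This is a simplified model, not the Graph API itself: make_fake_api and fetch_all are hypothetical helpers, and every page is assumed to be full, whereas real page sizes can vary, as the question notes:

```python
def make_fake_api(total_items):
    """Return a fake paged endpoint that serves total_items items."""
    def fetch_page(offset, limit):
        page = list(range(offset, min(offset + limit, total_items)))
        # Mimic a "next" cursor that disappears on the last page.
        next_offset = offset + limit if offset + limit < total_items else None
        return page, next_offset
    return fetch_page


def fetch_all(fetch_page, limit):
    """Follow the paging cursor to the end; return (items, calls made)."""
    items, calls, offset = [], 0, 0
    while offset is not None:
        page, offset = fetch_page(offset, limit)
        calls += 1
        items.extend(page)
    return items, calls


places, calls = fetch_all(make_fake_api(200), limit=25)
```

With limit=25 the loop makes 8 calls for 200 items; with limit=200 the same helper would make a single call.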
I have a web service that accepts JSON parameters and have specific URLs for methods, e.g.:
http://IP:PORT/API/getAllData?p={JSON}
This is definitely not REST as it is not stateless. It takes cookies into account and has its own session.
Is it RPC? What is the difference between RPC and REST?
Consider the following example of HTTP APIs that model orders being placed in a restaurant.
The RPC API thinks in terms of "verbs", exposing the restaurant functionality as function calls that accept parameters, and invokes these functions via the HTTP verb that seems most appropriate - a 'get' for a query, and so on, but the name of the verb is purely incidental and has no real bearing on the actual functionality, since you're calling a different URL each time. Return codes are hand-coded, and part of the service contract.
The REST API, in contrast, models the various entities within the problem domain as resources, and uses HTTP verbs to represent transactions against these resources - POST to create, PUT to update, and GET to read. All of these verbs, invoked on the same URL, provide different functionality. Common HTTP return codes are used to convey status of the requests.
Placing an Order:
RPC: http://MyRestaurant:8080/Orders/PlaceOrder (POST: {Tacos object})
REST: http://MyRestaurant:8080/Orders/Order?OrderNumber=asdf (POST: {Tacos object})
Retrieving an Order:
RPC: http://MyRestaurant:8080/Orders/GetOrder?OrderNumber=asdf (GET)
REST: http://MyRestaurant:8080/Orders/Order?OrderNumber=asdf (GET)
Updating an Order:
RPC: http://MyRestaurant:8080/Orders/UpdateOrder (PUT: {Pineapple Tacos object})
REST: http://MyRestaurant:8080/Orders/Order?OrderNumber=asdf (PUT: {Pineapple Tacos object})
Example taken from sites.google.com/site/wagingguerillasoftware/rest-series/what-is-restful-rest-vs-rpc
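One way to picture the difference is in how a server would route these requests. The sketch below is only an illustration with plain dicts, not a real framework, and the handler names are made up. RPC keys the route on the URL alone, while REST keys it on the HTTP verb plus a single resource URL:

```python
orders = {}


# RPC style: one URL per operation; the function name carries the meaning.
def place_order(order_number, item):
    orders[order_number] = item
    return "placed"


def get_order(order_number):
    return orders[order_number]


def update_order(order_number, item):
    orders[order_number] = item
    return "updated"


rpc_routes = {
    "/Orders/PlaceOrder": place_order,
    "/Orders/GetOrder": get_order,
    "/Orders/UpdateOrder": update_order,
}

# REST style: one URL for the resource; the HTTP verb selects the operation.
rest_routes = {
    ("POST", "/Orders/Order"): place_order,
    ("GET", "/Orders/Order"): get_order,
    ("PUT", "/Orders/Order"): update_order,
}

rpc_routes["/Orders/PlaceOrder"]("asdf", "Tacos")
rest_routes[("PUT", "/Orders/Order")]("asdf", "Pineapple Tacos")
rpc_result = rpc_routes["/Orders/GetOrder"]("asdf")
rest_result = rest_routes[("GET", "/Orders/Order")]("asdf")
```

Both tables end up calling the same functions; what changes is what the client has to know: three URLs in the RPC case, one URL and three verbs in the REST case.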
You can't make a clear separation between REST or RPC just by looking at what you posted.
One constraint of REST is that it has to be stateless. If you have a session then you have state so you can't call your service RESTful.
The fact that you have an action in your URL (i.e. getAllData) is an indication towards RPC. In REST you exchange representations and the operation you perform is dictated by the HTTP verbs. Also, in REST, Content negotiation isn't performed with a ?p={JSON} parameter.
I don't know if your service is RPC, but it is not RESTful. You can learn about the difference online; here's an article to get you started: Debunking the Myths of RPC & REST. You know better what's inside your service, so compare its functions to what RPC is and draw your own conclusions.
As others have said, a key difference is that REST URLs are noun-centric and RPC URLs are verb-centric. I just wanted to include this clear table of examples demonstrating that:
---------------------------+-------------------------------------+--------------------------
 Operation                 | RPC (operation)                     | REST (resource)
---------------------------+-------------------------------------+--------------------------
 Signup                    | POST /signup                        | POST /persons
---------------------------+-------------------------------------+--------------------------
 Resign                    | POST /resign                        | DELETE /persons/1234
---------------------------+-------------------------------------+--------------------------
 Read person               | GET /readPerson?personid=1234       | GET /persons/1234
---------------------------+-------------------------------------+--------------------------
 Read person's items list  | GET /readUsersItemsList?userid=1234 | GET /persons/1234/items
---------------------------+-------------------------------------+--------------------------
 Add item to person's list | POST /addItemToUsersItemsList       | POST /persons/1234/items
---------------------------+-------------------------------------+--------------------------
 Update item               | POST /modifyItem                    | PUT /items/456
---------------------------+-------------------------------------+--------------------------
 Delete item               | POST /removeItem?itemId=456         | DELETE /items/456
---------------------------+-------------------------------------+--------------------------
Notes
As the table shows, REST tends to use URL path parameters to identify specific resources
(e.g. GET /persons/1234), whereas RPC tends to use query parameters for function inputs
(e.g. GET /readPerson?personid=1234).
Not shown in the table is how a REST API would handle filtering, which would typically involve query parameters (e.g. GET /persons?height=tall).
Also not shown is how with either system, when you do create/update operations, additional data is probably passed in via the message body (e.g. when you do POST /signup or POST /persons, you include data describing the new person).
Of course, none of this is set in stone, but it gives you an idea of what you are likely to encounter and how you might want to organize your own API for consistency. For further discussion of REST URL design, see this question.
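To see the path-parameter versus query-parameter distinction from the notes in code, here is a small sketch using only the standard library. parse_rest_target is a hypothetical helper, not part of any framework:

```python
from urllib.parse import urlsplit, parse_qs


def parse_rest_target(target):
    """Split a request target into path segments and query filters."""
    parts = urlsplit(target)
    segments = [s for s in parts.path.split("/") if s]
    # parse_qs returns a list per key; keep the first value for simplicity.
    filters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return segments, filters


one_person = parse_rest_target("/persons/1234")
tall_people = parse_rest_target("/persons?height=tall")
rpc_read = parse_rest_target("/readPerson?personid=1234")
```

In the REST targets the identity of the resource lives in the path segments and the query string only narrows a collection, while in the RPC target the path names a function and the query string carries its argument.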
It is RPC using HTTP. A correct implementation of REST should be different from RPC. Having logic that processes data, like a method/function, is RPC. getAllData() is an intelligent method. REST cannot have intelligence; it should be dumb data that can be queried by an external intelligence.
Most implementations I have seen these days are RPC, but many mistakenly call them REST. REST with HTTP is the saviour and SOAP with XML the villain. So your confusion is justified and you are right, it is not REST. But keep in mind that REST is not new (2000); even though SOAP/XML is old, JSON-RPC came later (2005).
Using the HTTP protocol does not by itself make an implementation REST. Both REST (GET, POST, PUT, PATCH, DELETE) and RPC (GET + POST) can be developed over HTTP (e.g. through a web API project in Visual Studio).
Fine, but what is REST then?
The Richardson maturity model is summarized below. Only level 3 is RESTful.
Level 0: HTTP POST only
Level 1: each resource/entity has a URI (but still only POST)
Level 2: both POST and GET can be used
Level 3 (RESTful): uses HATEOAS (hypermedia links), in other words self-exploratory links
e.g. level 3 (HATEOAS):
Link states this object can be updated this way, and added this way.
Link states this object can only be read and this is how we do it.
Clearly, sending data is not enough to be REST; how to query the data should be described too. But then again, why the 4 steps? Why can't it be just step 4, and call that REST? Richardson just gave us a step-by-step approach to get there, that is all.
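As a rough illustration of what a level 3 response might carry, here is a sketch in which the representation lists its own links. The link format, paths, and statuses are invented for the example and do not follow any particular hypermedia standard:

```python
def order_representation(order_id, status):
    """Return an order plus links describing what the client may do next."""
    base = f"/orders/{order_id}"
    links = [{"rel": "self", "method": "GET", "href": base}]
    if status == "open":
        # Only an open order advertises how it can be changed.
        links.append({"rel": "update", "method": "PUT", "href": base})
        links.append({"rel": "cancel", "method": "DELETE", "href": base})
    return {"id": order_id, "status": status, "links": links}


open_order = order_representation(7, "open")
served_order = order_representation(7, "served")
```

The open order advertises update and cancel links, while the served order only says how it can be read, which matches the two link examples above.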
You've built web sites that can be used by humans. But can you also
build web sites that are usable by machines? That's where the future
lies, and RESTful Web Services shows you how to do it.
The book RESTful Web Services helps here.
A very interesting read RPC vs REST
REST is best described as working with resources, whereas RPC is more about actions.
REST
stands for Representational State Transfer. It is a simple way to organize interactions between independent systems.
RESTful applications commonly use HTTP requests to post data (create and/or update), read data (e.g., make queries), and delete data. Thus, REST can use HTTP for all four CRUD (Create/Read/Update/Delete) operations.
RPC
is basically used to communicate across different modules to serve user requests,
e.g. in OpenStack, how Nova, Glance and Neutron work together when booting a virtual machine.
The URL shared looks like an RPC endpoint.
Below are examples for both RPC and REST. Hopefully this helps in understanding when they can be used.
Let's consider an endpoint that sends app maintenance outage emails to customers.
This endpoint performs one specific action.
RPC
POST https://localhost:8080/sendOutageEmails
BODY: {"message": "we have a scheduled system downtime today at 1 AM"}
REST
POST https://localhost:8080/emails/outage
BODY: {"message": "we have a scheduled system downtime today at 1 AM"}
An RPC endpoint is more suitable in this case. RPC endpoints are usually used when the API call performs a single task or action. We can obviously use REST as shown, but the endpoint is not very RESTful since we are not performing operations on resources.
Now let's look at an endpoint that stores some data in the database (a typical CRUD operation).
RPC
POST https://localhost:8080/saveBookDetails
BODY: {"id": "123", "name": "book1", "year": "2020"}
REST
POST https://localhost:8080/books
BODY: {"id": "123", "name": "book1", "year": "2020"}
REST is much better for cases like this (CRUD). Here, read (GET), delete (DELETE) or update (PUT) can be done by using the appropriate HTTP methods. The method decides the operation on the resource (in this case 'books').
Using RPC here is not suitable, as we would need a different path for each CRUD operation (/getBookDetails, /deleteBookDetails, /updateBookDetails), and this would have to be done for every resource in the application.
To summarize,
RPC can be used for endpoints that perform a single specific action.
REST is for endpoints where the resources need CRUD operations.
Slack uses this style of HTTP RPC Web APIs - https://api.slack.com/web
There are a bunch of good answers here. I would still refer you to this Google blog post, as it does a really good job of discussing the differences between RPC and REST and captures something that I didn't read in any of the answers here.
I would quote a paragraph from the same link that stood out to me:
REST itself is a description of the design principles that underpin HTTP and the world-wide web. But because HTTP is the only commercially important REST API, we can mostly avoid discussing REST and just focus on HTTP. This substitution is useful because there is a lot of confusion and variability in what people think REST means in the context of APIs, but there is much greater clarity and agreement on what HTTP itself is. The HTTP model is the perfect inverse of the RPC model—in the RPC model, the addressable units are procedures, and the entities of the problem domain are hidden behind the procedures. In the HTTP model, the addressable units are the entities themselves and the behaviors of the system are hidden behind the entities as side-effects of creating, updating, or deleting them.
I would argue thusly:
Does my entity hold/own the data? Then RPC: here is a copy of some of my data, manipulate the data copy I send to you and return to me a copy of your result.
Does the called entity hold/own the data? Then REST: either (1) show me a copy of some of your data or (2) manipulate some of your data.
Ultimately it is about which "side" of the action owns/holds the data. And yes, you can use REST verbiage to talk to an RPC-based system, but you will still be doing RPC activity when doing so.
Example 1: I have an object that is communicating to a relational database store (or any other type of data store) via a DAO. Makes sense to use REST style for that interaction between my object and the data access object which can exist as an API. My entity does not own/hold the data, the relational database (or non-relational data store) does.
Example 2: I need to do a lot of complex math. I don't want to load a bunch of math methods into my object, I just want to pass some values to something else that can do all kinds of math, and get a result. Then RPC style makes sense, because the math object/entity will expose to my object a whole bunch of operations. Note that these methods might all be exposed as individual APIs and I might call any of them with GET. I can even claim this is RESTful because I am calling via HTTP GET but really under the covers it is RPC. My entity owns/holds the data, the remote entity is just performing manipulations on the copies of the data that I sent to it.
This is how I understand and use them in different use cases:
Example: Restaurant Management
use-case for REST: order management
- create order (POST), update order (PATCH), cancel order (DELETE), retrieve order (GET)
- endpoint: /order?orderId=123
For resource management, REST is clean: one endpoint with pre-defined actions. It can be seen as a way to expose a DB (SQL or NoSQL) or class instances to the world.
Implementation Example:
class Order:
    def on_get(self, req, resp): ...    # doThis: retrieve the order
    def on_patch(self, req, resp): ...  # doThat: update the order
Framework Example: Falcon for python.
use-case for RPC: operation management
- prepare ingredients: /operation/clean/kitchen
- cook the order: /operation/cook/123
- serve the order: /operation/serve/123
For analytical, operational, non-responsive, non-representative, action-based jobs, RPC works better and it is very natural to think functional.
Implementation Example:
# assumes: app = Flask(__name__)
@app.route('/operation/cook/<orderId>')
def cook(orderId): ...   # doThis
@app.route('/operation/serve/<orderId>')
def serve(orderId): ...  # doThat
Framework Example: Flask for python
Over HTTP they both end up being just HttpRequest objects and they both expect a HttpResponse object back. I think one can continue coding with that description and worry about something else.