I am maintaining a system which can ADD and UPDATE products in a shopping cart. The only difference between ADD and UPDATE is that during UPDATE a product has an OrderId.
The front-end communicates with the back-end using web services.
Should I write a web service with a common method for ADD and UPDATE?
Or should the web service have two different methods, one for ADD and one for UPDATE?
What are advantages / disadvantages for each pattern?
I do realize that you aren't asking about SQL, but it can be informative to look at comparable problem domains. Combining INSERT and UPDATE operations is common enough to be a standard feature of SQL called MERGE (sometimes, popularly, UPSERT), so by that token, it’s certainly acceptable practice.
Still, it’s good to be aware of possible consequences. One type of problem that can arise from this is when you have concurrent writers.
If you have an explicit distinction between ADD and UPDATE, two concurrent writers both attempting to perform an ADD on the same key should cause a failure in the second writer. Depending on the scenario, such an explicit failure might be a good thing, because it explicitly informs you about a concurrency conflict.
On the other hand, if you invoke a MERGE method, the second write operation (which was originally intended by the writer to be an ADD) will automatically succeed as an UPDATE. Sometimes, this is what you want, while in other scenarios, this may be undesirable.
In general, concurrency conflicts should be addressed with a concurrency token (such as a timestamp).
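To make the difference concrete, here is a minimal in-memory sketch of the three behaviours discussed above. The class and method names (CartStore, add, update, merge) are made up for illustration, and a version counter stands in for the concurrency token; a timestamp would play the same role.

```python
class ConflictError(Exception):
    """Raised when a write collides with another writer's change."""


class CartStore:
    """In-memory stand-in for the back-end; all names are illustrative."""

    def __init__(self):
        self._items = {}  # key -> (version, quantity)

    def add(self, key, quantity):
        # Explicit ADD: a second writer using the same key fails loudly.
        if key in self._items:
            raise ConflictError(f"{key} already exists")
        self._items[key] = (1, quantity)
        return 1

    def update(self, key, quantity, expected_version):
        # Explicit UPDATE guarded by a concurrency token (a version number here).
        if key not in self._items:
            raise KeyError(key)
        version, _ = self._items[key]
        if version != expected_version:
            raise ConflictError(f"{key} changed since version {expected_version}")
        self._items[key] = (version + 1, quantity)
        return version + 1

    def merge(self, key, quantity):
        # MERGE/UPSERT: an existing key is overwritten, so the conflict is silent.
        version = self._items.get(key, (0, None))[0]
        self._items[key] = (version + 1, quantity)
        return version + 1
```

With this store, a second add on the same key raises ConflictError, a merge on the same key quietly succeeds as an update, and an update carrying a stale version is rejected.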
Related
What is the best way to call a SQL function / stored procedure when converting code to use the repository pattern? Specifically, I am interested in read/query capabilities.
Options
1. Add an ExecuteSqlQuery to IRepository
2. Add a new repository interface specific to the context (i.e. ILocationRepository) and add resource specific methods
3. Add a special "repository" for all the random stored procedures until they are all converted
4. Don't. Just convert the stored procedures to code and place the logic in the service layer
Option #4 does seem to be the best long term solution, but it's also going to take a lot more time and I was hoping to push this until a future phase.
Which option (above or otherwise) would be "best"?
NOTE: my architecture is based on ardalis/CleanArchitecture using ardalis/Specification, though I'm open to all suggestions.
https://github.com/ardalis/CleanArchitecture/issues/291
If necessary, or create logically grouped Query services/classes for
that purpose. It depends a bit on the functionality of the SPROC how I
would do it. Repositories should be just simple CRUD, at most with a
specification to help shape the result. More complex operations that
span many entities and/or aggregates should not be added to
repositories but modeled as separate Query objects or services. Makes
it easier to follow SOLID that way, especially SRP and OCP (and ISP)
since you're not constantly adding to your repo
interfaces/implementations.
Don't treat STORED PROCEDURES as second-class citizens. In general, avoid using them, because they very often take away your domain code and hide it inside the database, but sometimes, for performance reasons, they are your only choice. In this case, you should use option 2 and treat them the same as a simple database fetch.
Option 1 is really bad because you will soon have tons of SQL in places you don't want (Application Service) and it will prevent portability to another storage media.
Option 3 is unnecessary; stored procedures are no worse than simple Entity Framework Core database access requests.
Option 4 is the reason why you cannot always avoid stored procedures. Sometimes trying to query stuff in application service/repositories will create very big performance issues. That's when, and only when, you should step in with stored procedures.
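As a rough illustration of option 2, the sketch below hides a stored procedure call behind a context-specific repository method, so callers never see SQL. It is written in Python rather than C# purely for brevity. Everything in it is hypothetical: the find_nearby method, the usp_FindNearbyLocations procedure name, and the executor callable that would wrap the real database driver.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


@dataclass(frozen=True)
class Location:
    id: int
    name: str


class LocationRepository(Protocol):
    """Context-specific interface (option 2); callers never see SQL."""

    def find_nearby(self, latitude: float, longitude: float) -> list[Location]: ...


# Hypothetical low-level executor: takes a procedure name and its parameters
# and returns rows as tuples. A real one would wrap the database driver.
ProcedureExecutor = Callable[[str, Sequence[object]], list[tuple]]


class SqlLocationRepository:
    def __init__(self, execute_procedure: ProcedureExecutor):
        self._execute_procedure = execute_procedure

    def find_nearby(self, latitude: float, longitude: float) -> list[Location]:
        # "usp_FindNearbyLocations" is a made-up stored procedure name.
        rows = self._execute_procedure(
            "usp_FindNearbyLocations", [latitude, longitude]
        )
        # Map raw rows to domain objects, as for any other database fetch.
        return [Location(id=row[0], name=row[1]) for row in rows]
```

If the procedure is later rewritten as application code, only SqlLocationRepository changes; the interface and its callers stay the same.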
By 'functionalities structuring', I mean how we organize and coordinate different API endpoints to offer desired functionalities to clients. The context here is web APIs for consumption by mobile phones with GPS tracking, and I assume either cellular or WiFi connectivity is required for most functionalities.
I personally prefer a more 'modular' approach where each endpoint does mostly one thing and a collection of them fulfill all the requirements. Of course, you may need to combine some subset or sequence of these endpoints to achieve certain functionalities. Overall, I try to minimize the overlapping between endpoints in terms of both computation and functionalities.
On the other hand, I know some other people prefer client-side convenience (or simplicity) over modularity in the following ways:
If the client needs to achieve a functionality, then there should exist a single API endpoint which does exactly that, such that the client needs only a single request to fulfill the functionality with minimal caching/logic in between requests.
For GET endpoints, if there are multiple levels/kinds of data involved for some functionalities, they prefer as much data as possible (often all necessary data) returned by a single endpoint. Ironically, they may also want a dedicated endpoint for retrieving only the "lowest level" data using a corresponding "highest level" ID. For example, if A corresponds to a collection of Bs, and each B corresponds to a collection of Cs, then they will prefer a direct endpoint that retrieves all the relevant Cs given an A.
In some extreme cases, they will ask for a single endpoint with ambiguous naming (e.g. /api/data) that returns related data from different underlying DB tables (in other words, different resources) based on different combinations of query string parameters.
I understand that people preferring such conveniences above aim to: 1. reduce the number of API requests necessary to fulfill functionalities; 2. minimize data caching and data logic on the client side to reduce client complexity, which arguably leads to a 'simple' client with simplified interaction with the server.
However, I also wonder if the cost of doing so is unjustifiable in other aspects in the long run, especially in terms of the performance and the maintenance of the server-side API. Hence my questions:
What are the tried-and-true guidelines for structuring API functionalities?
How do we determine an optimal number of requests necessary for fulfilling a functionality in a mobile app? Of course, all other things being equal, a single request is the best, but achieving such a single-request implementation usually carries a penalty in other aspects.
Given the contention between the number of client requests and the performance and maintainability of server-side API, what are the approaches for striking a balance in order to deliver a sensible design?
What you are asking about breaks into at least three main areas of API design:
Ontology Design (organization)
Request/Response Design (complexity/performance)
Maintenance Considerations
Based on my experience (which is largely from working with very large organizations both on the API producing and consuming side and talking with hundreds of developers on the topic), let's look at each area, addressing the specific points you bring up...
Ontology Design
There are a couple of things to take into consideration in your design that are perhaps implied when you say:
Overall, I try to minimize the overlapping between endpoints in terms of both computation and functionalities.
This approach makes the APIs easily discoverable. When you are in a situation where you are publishing APIs for consumption by other developers who you may or may not know (and may or may not have enough resources to truly support), this kind of modularity - making them easy to find and learn about - creates a different kind of "convenience" leading to easier adoption and reuse of your APIs.
I know some other people much prefer convenience over modularity: 1. if the client needs a functionality, then there should exist a single endpoint in the API which does exactly that...
The best public example that comes to mind for this approach is perhaps the Google Analytics Core Reporting API. They implement a series of querystring parameters to build a call that returns the data requested, ex:
https://www.googleapis.com/analytics/v3/data/ga
?ids=ga:12134
&dimensions=ga:browser
&metrics=ga:pageviews
&filters=ga:browser%3D~%5EFirefox
&start-date=2007-01-01
&end-date=2007-12-31
In that example we are querying Google Analytics Account 12134 for pageviews by browser where browser is Firefox for the given date range.
Given the number of metrics, dimensions, filters, and segments their API exposes, they have a tool called the Dimensions & Metrics Explorer to help developers understand how to use the APIs.
One approach makes the APIs discoverable and more understandable from the outset. The other requires more supporting work to explain the intricacies of consuming the API. One thing that isn't immediately obvious with the Google API above is that certain segments and metrics are incompatible, so if you are making calls passing one key/value pair, you may no longer be able to pass certain other pairs.
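For a sense of what the client side of such a parameter-driven API looks like, here is a small sketch that rebuilds the example URL from its key/value pairs using only the Python standard library. The build_report_url helper is hypothetical; it only assembles the URL and does not call the service.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_report_url(params):
    # Keep ":" readable, as in the example above; "=" and "^" get percent-encoded.
    return BASE_URL + "?" + urlencode(params, safe=":")


url = build_report_url({
    "ids": "ga:12134",
    "dimensions": "ga:browser",
    "metrics": "ga:pageviews",
    "filters": "ga:browser=~^Firefox",
    "start-date": "2007-01-01",
    "end-date": "2007-12-31",
})
```

Nothing in the URL structure tells the client which combinations of parameters are valid, which is why this style needs the extra tooling and documentation mentioned above.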
Request/Response Design
The context here is APIs for mobile applications.
That is still very broad, and better defining (if possible) how you intend for your "mobile applications" to be used can help you design your APIs.
Do you intend for them to be used totally offline? If so, heavy/complete data caching may be desirable.
Do you intend for them to be used in low bandwidth and/or high latency/error-rate connectivity scenarios? If so, heavy/complete data caching may be desirable, but so might small/discrete data requests.
for GET endpoints, they often prefer as much data as possible returned by a single endpoint, especially when there are multiple levels/layers of data involved
This is safe if you know you'll only ever be in good mobile connectivity scenarios, or you can cache the data heavily when you are (and thus access it offline or when things are spotty).
I understand that people preferring convenience aim to reduce the number of API calls necessary to achieve functionalities...
One way to find a happy middle ground is to implement paging in your data-intensive calls. For example, a querystring can be passed in a GET specifying 'pagesize'. Thus 10,000 records could be returned 100 at a time over 100 successive calls, or 1,000 at a time over 10 calls.
With this approach, you can design and publish your API without necessarily knowing what your consuming developer will need. Even though the paging example above uses the Google API referenced earlier, it can still be used in a more semantically designed API. For example, say you have GET /customer/phonecalls; you could still design it to accept a pagesize value and make successive calls to get all the phone calls associated with a customer.
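A minimal sketch of the paging idea, with both sides reduced to plain functions. The page parameter, the response shape (total plus items), and the function names are assumptions made for illustration; a real API might use offsets or cursors instead.

```python
def get_page(records, page, pagesize):
    """Server side: return one page (1-based) plus the total record count."""
    if page < 1 or pagesize < 1:
        raise ValueError("page and pagesize must be positive")
    start = (page - 1) * pagesize
    return {"total": len(records), "items": records[start:start + pagesize]}


def fetch_all(fetch_page, pagesize):
    """Client side: make successive calls until every record is retrieved.

    Returns the collected items and the number of calls that were needed.
    """
    items, page = [], 1
    while True:
        response = fetch_page(page, pagesize)
        items.extend(response["items"])
        if len(items) >= response["total"] or not response["items"]:
            return items, page
        page += 1
```

With 10,000 records, fetch_all needs 100 calls at a pagesize of 100 and 10 calls at a pagesize of 1,000, so the client picks the trade-off that suits its connectivity.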
Maintenance
I also wonder if the cost of doing so [reduce the number of API calls necessary to achieve functionalities and to minimize data caching] is not justifiable in the long run, especially for the performance and the maintenance of an API.
The key guiding principle here is separation of concerns if your collection of APIs is going to grow to any significant level of complexity and scale.
What happens when you have everything bundled together into one big service and a small part of it changes? You are now creating not only a maintenance headache on your side, but also for your API consumer.
Did that "breaking change" really affect the part of the API they were using? It will take time and energy for them to figure that out. Designing API functionality into discrete, semantic services will let you create a roadmap and version them in a more understandable way.
For further reading, I'd suggest checking out Martin Fowler's writings on Microservices Architecture:
In short, the microservice architectural style is an approach to
developing a single application as a suite of small services, each
running in its own process and communicating with lightweight
mechanisms
Although there is a lot of debate about how to design and build for "microservices" in practice, reading up on that should help further shape your thinking on the API design decisions you're facing and prepare you to engage in "current" discussions around the topic.
I have a question that came up while developing a REST service.
As per REST design, GET is for reading, PUT or POST are for creating or updating depending on the scenario, and DELETE is for deleting resources.
But technically, can't we perform a create or delete operation in a GET call?
That is, it is up to the client to hit the exact method in the service class of the REST application by using the specified URL pattern and required response type. So why can't we delete or create some data in the GET service?
So my question is: is DELETE or CREATE technically not possible in a GET service, or is it a rule to adhere to REST principles?
so my question is the DELETE or CREATE technically not possible in GET service or is it a rule to adhere to REST principles.
The latter. It is only a convention to use the DELETE HTTP method for delete operations. However, using the GET HTTP method for delete operations is a bad idea. Below is a quote from "RESTful Java with JAX-RS 2.0, 2nd Edition" that explains why:
It is crucial that we do not assign
functionality to an HTTP method that supersedes the specification-defined boundaries
of that method. For example, an HTTP GET on a particular resource should be readonly.
It should not change the state of the resource it is invoking on. Intermediate services
like a proxy-cache, a CDN (Akamai), or your browser rely on you to follow the semantics
of HTTP strictly so that they can perform built-in tasks like caching effectively. If you
do not follow the definition of each HTTP method strictly, clients and administration
tools cannot make assumptions about your services, and your system becomes more
complex
so my question is the DELETE or CREATE technically not possible in GET
service or is it a rule to adhere to REST principles?
REST relies on standards (the uniform interface constraint). One of these is the HTTP standard, which defines the HTTP methods. According to the HTTP standard, GET is a safe method:
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST, PUT
and DELETE, in a special way, so that the user is made aware of the
fact that a possibly unsafe action is being requested.
According to the RFC 2119:
SHOULD NOT - This phrase, or the phrase "NOT RECOMMENDED" mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.
For example, a write can be a side effect of a GET, such as incrementing a visitor counter on each request.
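The distinction can be sketched with a tiny in-memory dispatcher. Nothing here is a real framework API; NoteService and its status-code tuples are made up for illustration. GET never changes the resource it is asked for, although it does bump a visit counter as an incidental side effect, while removal is only reachable through DELETE.

```python
class NoteService:
    """Hypothetical handler mapping HTTP methods to operations on notes."""

    def __init__(self, notes):
        self._notes = dict(notes)
        self.visits = 0

    def handle(self, method, note_id):
        if method == "GET":
            # Safe: the requested resource is left untouched. Counting the
            # visit is a side effect the client did not ask for.
            self.visits += 1
            if note_id not in self._notes:
                return 404, None
            return 200, self._notes[note_id]
        if method == "DELETE":
            # Unsafe: state changes, so caches and crawlers must not replay it.
            if note_id not in self._notes:
                return 404, None
            del self._notes[note_id]
            return 204, None
        return 405, None
```

Because repeating the GET returns the same note, a proxy or browser can cache or prefetch it freely; the same assumption applied to a GET that deleted data would destroy it.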
How the server software (API) is constructed and what 'rules' are applied is somewhat 'arbitrary'. Developers and their product managers could enforce 'rules' such as 'thou shalt not code or support DELETE operations through the GET operation', but in practice, that is not necessarily the main reason POST is chosen over GET. As others have mentioned, there may be assumptions based on the HTTP protocol that other vendors may rely on, but that is a rather complex and not necessarily relevant reasoning. For instance, your application may be built to connect directly to a server application, and another vendor's rules may not apply.
A simpler example: in practice, browsers and servers limit the length of a URL, and therefore of the query string. Because of this, an operation that needs a lot of data, such as a DELETE in a database that requires a few very long encrypted strings, may not fit in a GET request, so POST may be the only viable option.
Custom-built applications using a cURL library might extend to include other RESTful operations with their intended functionality, but that would be for the benefit of the server API. Coding more operations on the client side doesn't necessarily make things 'easier', 'faster', or 'more secure' from the client perspective, but doing so could help manage resources (a bit) on the server side and help maintain compatibility with third-party software and appliances.
We are currently building a pile of SOAP Web Services to front the access of various backend systems.
While defining our Request/Response message XML, we see multiple services needing the ‘Account’ object with different ‘mandatory/optional’ fields.
How should we define and enforce the validation of these ‘mandatory/optional’ fields on the same Message? I see these options
1) Enforce validation with XSD by creating different 'Account' complex types
Pros : Design time clarity.
Cons : Proliferation of object types, less reuse of objects.
2) Enforce validation with XSD by extending/restricting a single base 'Account' type
Pros : Design time clarity.
Cons : Not sure how well the extension/restriction feature is supported (Java, .NET)
3) Using a single 'Account' type and enforcing validation in runtime (ie in the Code).
Pros: Simple
Cons: No design time validation. Need to communicate field requirements via a specification doc.
What are your thoughts on that?
I would have to assume that: i) some of what you would call optional fields are actually fields that are not applicable (don't make sense) to all accounts and ii) we're not talking about trivial scenarios (such as two types of accounts with two fields each).
Firstly, I would say that unless you're really lucky, from a requirements perspective, then you're going to end up with some sort of "validation in runtime" no matter what option you're going with. XML Schema can't express some common data validation requirements, such as cross field validation; or simply because the data in your XML is not sufficient to feed the rules to validate the integrity of the message (the data in the message being a subset on what's available at the time the XML is being un/marshalled).
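To illustrate the kind of runtime check that ends up being needed regardless of the option chosen, here is a minimal sketch. The service names, field names, and the overdraft rule are all invented for the example; the point is only that per-service mandatory fields and a cross-field rule are easy to state in code.

```python
# Hypothetical per-service requirements for the shared 'Account' message.
REQUIRED_FIELDS = {
    "OpenAccount": {"owner", "currency"},
    "CloseAccount": {"account_id"},
}


def validate_account(service, account):
    """Return a list of validation errors for one request (empty if valid)."""
    present = {name for name, value in account.items() if value is not None}
    missing = REQUIRED_FIELDS[service] - present
    errors = [f"{name} is required for {service}" for name in sorted(missing)]

    # Cross-field rule: one field's validity depends on another field's value.
    if account.get("overdraft_limit") and account.get("type") != "checking":
        errors.append("overdraft_limit is only valid for checking accounts")
    return errors
```

The REQUIRED_FIELDS table doubles as the specification that option 3 would otherwise have to communicate in a separate document.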
Secondly, I would avoid deriving new complex types through restriction; from an authoring perspective you don't achieve much in terms of reuse, and you might end up with problems in how that is interpreted by your XSD-to-code tooling. I like to think that the original intention of deriving through restriction was to provide a tool for people to use in xsd:redefine scenarios; for people that wouldn't want to fiddle with XML Schemas that were authored by someone else. If one owns (authors) the schema, one can work around the need to restrict by defining the "lesser" object first and extending from that.
As to the "proliferation of objects", you are kind of getting that with option #2 as well (when compared with #1). What I mean by that: all the tools I know will create a class for each named (global) complex type you have in your XSD. So if you have to have three types of accounts, you'll have three classes for scenario #1, and four or so if you choose to extend from one or more base classes; a worst-case scenario for the latter would be when you need three specializations (concrete, if you wish). Anyway, from my experience, the difference in real-life scenarios is not something that would really tip the decision one way or the other.
Extending base types in XML Schema is good for reuse; however, reuse brings coupling; if you're analysing this from a forward/backward compatibility point of view, extending something in the base type could mess up some of the unmarshalling (deserialization) of the XML for clients of your service(s) that don't want to change their code base, yet you want to maintain only one Web Service endpoint for all; in this case, a forward-compatibility strategy that relies on an xsd:any at the end of a compositor (xsd:sequence) would be rendered useless in your first release that goes and extends your base type.
There is even more; because of this, I don't think there's a correct answer, just for the criteria you seem to imply by setting your pro/cons.
All of my preferred options below assume that you put high value on the requirement to ensure forward/backward compatibility of your services, and you want to minimize the cost of your clients having to deal with your services (because of XML Schema changes).
I would say that if all your domain (accounts in particular) can be fully modeled (assume no future change basically) and that there is enough commonality to justify reuse, then go with option #2. Otherwise, go with option #1 since I have yet to see things that don't change...
If the modeling of your domain can be done 80% or more (or some number that you think is high) and that there is enough commonality to justify reuse, then I would still go with option #2, with the caveat that any future extensions for common attributes across accounts, must be applied for each individual account (basically turning your option into a hybrid, by doing #1).
For anything else, I would go #1. Whew, I can't believe I wrote all of this...
I'm working on the initial architecture for a solution for which an SOA approach has been recommended by a previous consultant. From reading the Erl book(s) and applying to previous work with services (and good design patterns in general), I can see the benefits of such an approach. However, this particular group does not currently have any traditional needs for implementing web services -- there are no external consumers, and no integration with other applications.
What I'm wondering is, are there any advantages to going with web services strictly to stick to SOA, that we couldn't get from just implementing objects that are "service ready"?
To explain, an example. Let's say you implement the entity "Person" as a service. You have to implement:
1. Business object/logic
2. Translator to service data structure
3. Translator from service data structure
4. WSDL
5. Service data structure (XML/JSON/etc)
6. Assertions
Now, on the other hand, if you don't go with a service, you only have to implement #1, and make sure the other code accesses it through a loose reference (using dependency injection, or a wrapper, etc). Then, if it later becomes apparent that a service is needed, you can just have the reference instead point to #2/#3 logic above in a wrapper object (so all caller objects do not need updating), and implement the same amount of objects without a penalty to the amount of development you have to do -- no extra objects or code have to be created as opposed to doing it all up front.
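A minimal sketch of that "service ready" arrangement, assuming a hypothetical PersonDirectory interface. Callers depend only on the interface; the in-process object is #1, and the remote wrapper with its two translators (#2/#3) is something that can be added later. The JSON message shape and the send transport are invented for the example.

```python
import json
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass
class Person:
    id: int
    name: str


class PersonDirectory(Protocol):
    """The loose reference that all callers depend on."""

    def find(self, person_id: int) -> Person: ...


class LocalPersonDirectory:
    """#1: the plain business object, used in-process."""

    def __init__(self, people):
        self._people = {person.id: person for person in people}

    def find(self, person_id: int) -> Person:
        return self._people[person_id]


class RemotePersonDirectory:
    """Wrapper added only once a service is actually needed (#2/#3)."""

    def __init__(self, send: Callable[[str], str]):
        # Hypothetical transport: request JSON in, response JSON out.
        self._send = send

    def find(self, person_id: int) -> Person:
        request = json.dumps({"person_id": person_id})  # to service structure
        response = json.loads(self._send(request))      # from service structure
        return Person(id=response["id"], name=response["name"])


def greeting(directory: PersonDirectory, person_id: int) -> str:
    # Caller code: unchanged whichever implementation is injected.
    return f"Hello, {directory.find(person_id).name}"
```

Switching from LocalPersonDirectory to RemotePersonDirectory changes only what gets injected; greeting and every other caller stay as they are.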
So, if the amount of work that has to be done is the same whether the service is implemented initially or as-needed, and there is no current need for external access through a service, is there any reason to initially implement it as a service just to stick to SOA?
Generally speaking you'd be better to wait.
You could design and implement a web service which was simply a technical facade that exposes the underlying functionality - the question is would you just do a straight one for one 'reflection' of that underlying functionality? If yes - did you design that underlying thing in such a way that it's fit for external callers? Does the API make sense, does it expose members that should be private, etc.
Another factor to consider is do you really know what the callers of the service want or need? The risk you run with building a service is that (as you're basically only guessing) you might need to re-write it when the first customers / callers come along. This could result in all sorts of work including test cases, backwards compatibility if it drives change down to the lower levels, and so on.
Having said that, the advantage of putting something out there is that it might help spark use of the service and get people thinking, which is a more agile approach.
If your application is an isolated client-type application (a UI that connects to a service just to get data out of the database), implementing an SOA-like architecture is usually overkill.
Nevertheless, there could be security, maintainability, or serviceability aspects where using web services is a must, e.g. some clients need access to the data from outside the firewall, or you prefer to separate your business logic/data access from the UI and put it on one server so that you don’t need to re-deploy the app every time some business rules change.
Enterprise applications require many components interacting with each other and many developers working on them. In this type of scenario, using an SOA-type architecture is the way to go.
The main reason to adopt SOA is to reduce the dependencies.
Enterprise applications usually depend on a lot of external components (logic or data), and you don’t want to integrate these components by sharing assemblies.
Imagine that you share a component that implements some specific calculation, would you deploy this component to all the dependent applications? What will happen if you want to change some calculation logic? Would you ask all teams to upgrade their references and recompile and redeploy their app?
I recently posted on my blog a story where the former Architect had also chosen not to use web services and thought that sharing assemblies was fine. The result was chaos. Read more here.
As I mentioned, it depends on your requirements. If it’s a monolithic application and you’re sure you’ll never integrate this app and that you’ll never reuse the business logic/data access, a 2-tier application (UI/DB) is good enough.
Nevertheless, this is an architectural decision, and like most architectural decisions it’s costly to change. Of course you can still factor in a web service model later on, but it’s not as easy as you might think. Refactoring an existing app to add a service layer is usually a difficult task to accomplish, even when using a good design based on interfaces. Examples of things that could go wrong: data structures that are not serializable, circular references in properties, constructor overloading, dependencies on some internal behaviors…
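The circular-reference problem is easy to reproduce. In this small sketch (the order/line structure is invented for illustration), an in-process object graph with a back-reference works fine until it has to cross a service boundary, at which point a separate wire representation has to be introduced.

```python
import json

# In-process model: each line points back to its parent order.
order = {"id": 1, "lines": []}
line = {"sku": "A-1", "order": order}
order["lines"].append(line)


def try_serialize(value):
    """Attempt to serialize for the wire; report the failure instead of raising."""
    try:
        return json.dumps(value)
    except ValueError as error:
        # json.dumps raises ValueError when it detects a circular reference.
        return f"not serializable: {error}"


def to_wire(order):
    """Hypothetical translator that drops the back-references."""
    return {"id": order["id"], "lines": [{"sku": item["sku"]} for item in order["lines"]]}
```

Here try_serialize(order) reports a failure, while try_serialize(to_wire(order)) succeeds; writing such translators for an existing object model is part of the refactoring cost mentioned above.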