Following is a point mentioned in a presentation slide related to SOA, and it confuses me with the concepts of service orchestration and service choreography. To enable service choreography, shouldn't a web service be able to call another web service?
SOA builds applications out of software services. Services comprise intrinsically
unassociated, loosely coupled units of functionality that have no calls to
each other embedded in them.
In theory, a service can do anything it needs to do to accomplish its job. So there doesn't seem to be a good reason to forbid using a second service to do your work. Why reinvent the wheel?
In practice, the issue is more complicated. If you start calling other services on your own web server, then you'll eventually starve it of resources. At best, "real" clients will have to wait a bit longer for their answers while your web service server plays with itself.
Another issue is recursive loops: Service A calls B calls C calls A calls B ... you get the idea. A small change in one service can introduce such a loop without anyone noticing and it can sit there for a long time until it suddenly kills your server.
That is why you should build micro services in a hierarchy inside the server (i.e. below the web service layer - this is not exposed to clients). Those micro services can use each other in a top-down manner (to avoid the loops). Unit tests then make sure they behave properly.
Lastly, such reuse is very slow. Each HTTP request takes a lot of resources to create, send, parse and process. Calling an internal method directly can be 10 - 10000 times faster.
These are the main reasons why the services exposed by a single server shouldn't reuse each other via the "public client API".
Note: There are web services which build new services by using existing ones. IFTTT - "If This Then That" is one such beast.
You could adopt every concept according to your needs. In my current project we have a separate module that is responsible for the Orchestration. This is required since in real life usage, scenarios can be very complicated. So in order to be close to the actual management of your system, you need to have such one.
Another advantage of this approach is that the Separation_of_concerns is kept. Also aligns the business request with the applications, data, and infrastructure that you have. It defines policies and service levels through automated workflows, provisioning etc.
Orchestration is critical in the delivery of Cloud services too. As they are networked to allow sharing of data-processing tasks, centralized data storage, and online access to services or resources.
Related
I learned about service oriented architecture yesterday, and I have a question about it.
in order to talk to web service provider the initial communication has to be started by a service consumer. Then does it mean that a web service provider cannot talk to another web service directly (because it is not a consumer)?
I do not have enough information to grasp a full scope of what you're getting at exactly. However, I can say that this statement:
Then does it mean that a web service provider cannot talk to another web service directly (because it is not a consumer)
Isn't really true. A program can (programmatically) access data provided by a web service. The web service has no real awareness as to what a 'consumer' is. It only sees (programmatically) the data provided by the client (typically browser data, cookies, cache, etc..). But that doesn't stop anyone from opening a bash shell and curling the website.
This will retrieve any data statically provided by the server. Note that the data may obfuscated using JavaScript as to take measures to prevent any programs outside of a browser environment to access their critical data.
So the answer to that question, is yes and no.
You should ask this question on https://softwareengineering.stackexchange.com/ as it is more relevant to questions regarding programming concepts.
Both from technical and architectural points of view service of course can call another one. Simply, it is changing its role to behave as a consumer for the second service. Just be aware that things may become messy if both services are calling each other both ways to finish their task for a single client request. Though there are often valid scenarios for such behaviour, if both services are managed by the same entity, its worth looking if tasks shouldn't be moved or services merged as this may be a sign of a bad design decisions.
Any piece of software can talk to a web service as long as it can reach it.
I have an application running in a Java EE App Server and it needs to call a web service of a partner company.
Using wsimport.exe from my JDK (1.6) I have generated the client classes. I instantiate the service and get the port in order to call the web service.
I noticed that the first call to the web service is slow, and I am led to believe this is because it is validating the WSDL. Subsequent calls are fast.
I could keep the WSDL locally, and apparently that will speed up the first call.
In order to optimise my app, I was thinking I could create a pool of the clients. This has the added advantage that I have some throttling in the app - lets say I have a pool of 5 clients, then at most I will be using memory for 5 clients. If the load increased suddenly on my server, I don't have to worry that an unlimited number of clients would cause an out of memory error. I am assuming, based on past experience, that the web service clients use a lot of memory...
Would you bother with a pool?
How would you get over the first call to the web service being slow?
What is the best way to create that pool, so that I have to do the least amount of programming (i.e. I'd like to use a library / API / whatever, so that I don't have to reinvent the wheel and code some hairy bugs).
The Apache Commons Pool might be exactly what I am after.
It is configurable and seems to have thought of everything.
A colleague of mine suggested that you can use the #WebServiceRef annotation on a field in an EJB. The idea is that the server would inject a reference to a client, from which one can create a port for each thread that calls the EJB.
I assume that injected references come from a pool, although the specification doesn't appear to talk about this. The Javadoc for the annotation explicitly mentions that:
"the injected references are not thread safe"
AKKA with a master/slave setup as shown in the link could work well, albeit a little more complex than the Apache Commons Pool listed in another answer. AKKA also uses an execution pool, with its own threads, which isn't strictly allowed in the Java EE world, although I'd argue that because a well tested framework is in charge of the threads, there is no danger, and it shouldn't interfere with the app servers control of threads anyway as the number of threads being handled by AKKA is minimal.
I always read that one reason to chose a RESTful architecture is (among others) better scalability for Webapplications with a high load.
Why is that? One reason I can think of is that because of the defined resources which are the same for every client, caching is made easier. After the first request, subsequent requests are served from a memcached instance which also scales well horizontally.
But couldn't you also accomplish this with a traditional approach where actions are encoded in the url, e.g. (booking.php/userid=123&travelid=456&foobar=789).
A part of REST is indeed the URL part (it's the R in REST) but the S is more important for scaling: state.
The server end of REST is stateless, which means that the server doesn't have to store anything across requests. This means that there doesn't have to be (much) communication between servers, making it horizontally scalable.
Of course, there's a small bonus in the R (representational) in that a load balancer can easily route the request to the right server if you have nice URLs, and GET could go to a slave while POSTs go to masters.
I think what Tom said is very accurate, however another problem with scalability is the barrier to change upon scaling. So, one of the biggest tenants of REST as it was intended is HyperMedia. Basically, the server will own the paths and pass them to the client at runtime. This allows you to change your code without breaking existing clients. However, you will find most implementations of REST to simply be RPC hiding behind the guise of REST...which is not scalable.
"Scalable" or "web scale" is one of the most abused terms when it comes to the web, the cloud and REST, and mainly used to convince management to get their support for moving their development team on board the REST train.
It is a buzzword that holds no value. If you search the web for "REST scalability" you'll find a lot of people parroting each other without any concrete evidence.
A REST service is exactly equally scalable as a service exposed over a SOAP interface. Both are just HTTP interfaces to an application service. How well this service actually scales depends entirely on how this service was actually implemented. It's possible to write a service that cannot scale as all in both REST and SOAP.
Yes, you can do things with SOAP that makes it scale worse, like rely on state and sessions. SOAP out of the box does not do this. This requires you to use a smarter load balancer, which you want anyway if you're really concerned with whatever form of scaling.
One thing that REST allows that SOAP doesn't, and that some other answers here address, is caching cacheable responses through an HTTP caching proxy or at the client side. This may make a REST service somewhat more lightly loaded than a SOAP service when a lot of operations' responses are cacheable. All this means is that fewer requests end up in your service.
The main reason behind saying a rest application is scalable is, Its built upon a HTTP protocol. Because HTTP is stateless. Stateless means it wont share anything between other request. So any request can go to any Server in a load balanced cluster. There is nothing forcing this user request go to this server. We can overcome this by using token.
Because of this statelessness,All REST application are very easy to scale. But if you want get high throughput(number of request capable in one second) in each server, then you should optimize blocking things from the application. Follow the following tips
Make each REST resource is a small entity. Don't read data from join of many tables.
Read data from near by databases
Use caches (Redis) instead of databases(You can save DISK I/O)
Always keep data sources as much as near by because these blocks will make server resources (CPU) ideal and it no other request can use that resource while it is ideal.
A reason (perhaps not the reason) is that RESTful services are sessionless. This means you can easily use a load balancer to direct requests to various web servers without having to replicate session state among all of your web servers or making sure all requests from a single session go to the same web server.
What kind of server do you people see in real projects?
1) Web Services MUST be stateless: Basically you must send username/password with every request, every request must use HTTPS and I will authenticate and load the User object everytime if needed.
2) A Session for Web Services: like in a web container so I can at least save the authenticated User object and have something similar to a session ID so I don't need to authenticate, load and check the User on every request.
3) Sticky Service (persistent service across requests): https://jax-ws.dev.java.net/nonav/2.1/docs/statefulWebservice.html
I understand the scalability problems of stateful services (and of web application sessions), but sometimes you must have some kind of state, for example for a shopping cart. But you can also put this state in the database (use the back-end as a kind of session argh) or passing the entire state to the client (the client becomes responsible for resending the entire shopping cart).
The truth is, at least for web applications, the session helps a lot in many situations. Scalability issues can be ignored if your system accepts that "the user must start over doing whatever he is doing if his web server happens to go down" or you can try a session cluster if that's unacceptable.
How it is for web services? I am inclined to conclude that web services are very different than web applications and accept option 1) (always stateless), but it would be nice to hear other opinions based on real project experience.
While it's only a small difference but it should still be mentioned:
It's not state in web services that kill scalability, rather it's state on the App Server that's hosting the web services that will kill scalability. The moment you say that this user needs to access this server (as done in sticky sessions) you are effectively limiting your scalability options. The point you want to get to is that 'Any of your free load-balanced App servers' can handle this web service request and if I add 1 more App Server I should be able to handle % more users.
It's totally fine (and personally recommended) if you want to maintain state to pass in an authentication token and on each request get the service to retrieve your 'state' from a data store (preferably a redundant and partitioned one, e.g. distributed+replicated key/value data store). That's how Amazon does it with SimpleDb and Google with BigTable.
Ebay takes a slightly different approach and stores most of the clients state in a cookie so it gets passed in with every request. Although it generates a lot more traffic, it still scalable as any of their servers can still handle the request.
If you want a scalable data store I would recommend looking at redis it has speed and features that can't be beat in a key/value data store.
You should also check out highscalability.com if you want access to good material on how to build fast and scalable services.
Ideally webservices (and web sites) should be stateless.
Unfortunately this takes very well thought out problem domain, and clear separation of concerns.
I've found that in practice most real-world web sites depend on state even though this limits their scalability.
I've also found that many real-world web-services also rely on state.
Ultimately the 'right' decision is the one that works for the specific problem, so it's probably okay to write a webservice that relies on state, and refactor it later if scalability becomes an issue.
Highly dependent on whether the service is single transaction oriented (say getting stock quotes) or if the output from the service is dependent on a data provided from a particular client across multiple transactions(in that case state must be maintained.)
As far as scalability issues, storing state in a database isn't actually a bad way to go (in fact it's probably the only way to go if you're load balancing your service across a server farm.)
I think with Flex clients the state is moved out of the service and into the client tier. Keep the services stateless and let the clients maintain the state needed. The services stay simple, and the clients are free to mash them together as they wish.
You seem to be equating state and authentication. Perhaps you're accustomed to storing username and password in session state?
This is not necessary, even with old ASMX web services. Simply pass whatever information you need to your "Login" operation. This operation will be defined to return an "Authentication Ticket" header.
All other operations that require authentication will require this "Authentication Ticket" header. They will each check the header to see if it represents a valid, authenticated user. If so, then they will perform their task. If not, then they will return a SOAP Fault indicating that authentication is required.
No state is required. Simply make sure that the authentication ticket can be validated on any server your service runs on (for instance, in a web farm), and you'll be fine.
I'm building a java/spring application, and i may need to incorporate a stateful web service call.
Any opinions if i should totally run away from a stateful services call, or it can be done and is enterprise ready?
Statefulness runs counter to the basic architecture of HTTP (ask Roy Fielding), and reduces scalability.
Stateful web services are a pain to maintain. The mechanism I have seen for them is to have the first call return an id (basically a transaction id) that is used in subsequent calls. A problem with that is that the web service isn't really stateful so it has to load all the information that it needs from some other data store for each call.