Approach to large object transfers in web services

I have to implement an SOA solution with web services. I have to transfer large objects (e.g., invoices with 25-30 MB of XML data) and I wonder what the best approach is...
Should I:
A. Transfer parts of these objects separately (e.g., the header first, then the items one by one, even though there could be 1,000 of them) in several WS calls, and then assemble them on the server side, dealing with retries and errors?
Or...
B. Transfer the entire payload in one single call and try to optimize it (and not "burn" HTTP connections)?
I'm using .NET's WCF to expose the services layer. Recommended reading and other considerations are welcome.

The idea would be to maximize the payload per call and minimize the number of calls. This isn't always simple: in a one-shot call, firewalls or the web service itself could limit the payload size and your message might not make it, while with multiple calls, as you mentioned yourself, you have to deal with errors and retries (essentially reimplementing WS-ReliableMessaging).
So instead of concentrating on the message of a regular call, you might try changing how you perform the call itself: have a look at MTOM (Message Transmission Optimization Mechanism) with WCF, or consider streaming.
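For instance, on a basicHttpBinding both options can be switched on in configuration. This is only a rough sketch (the binding name and size limits are invented starting points, and streamed transfer only pays off if the service contract exposes Stream-based parameters):

    <system.serviceModel>
      <bindings>
        <basicHttpBinding>
          <!-- hypothetical binding; tune the limits to your real invoice sizes -->
          <binding name="largeInvoiceBinding"
                   messageEncoding="Mtom"
                   transferMode="Streamed"
                   maxReceivedMessageSize="67108864">
            <readerQuotas maxArrayLength="67108864" />
          </binding>
        </basicHttpBinding>
      </bindings>
    </system.serviceModel>

MTOM mainly helps when the payload is passed as binary (byte[] or Stream), since it avoids base64 inflation of the SOAP body, while streaming avoids buffering the whole 25-30 MB message in memory on either side.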

Related

Web services implementation design

I am writing a REST web service.
I am not great at designing.
At present I want to know whether the service handler should be a singleton or static.
    @RequestMapping(value = "/{input}", method = RequestMethod.GET)
    public String getOutput(@PathVariable String input) {
        return ResourceRestService.getInstance().outPutService().getOutput(input);
    }
Is using a singleton instance of ResourceRestService or OutputService correct in this case?
Does it cause any performance overhead when the number of requests increases? If yes, what should the solution be?
What you have done is standard; no issues. Since it is a singleton and not thread-safe, the class should not maintain state; all it has to do is take the request and send the response. If the number of requests increases, you may have to use a clustered environment (Apache mod_jk, etc.).
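For illustration (the class names below are made up), the usual Spring way is to let the container manage the singletons and keep them stateless, instead of a hand-rolled getInstance():

    // OutputController.java -- hypothetical names; Spring creates one shared
    // (singleton) instance and injects the service, so no getInstance() is needed.
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class OutputController {

        private final OutputService outputService;   // no mutable state is kept here

        @Autowired
        public OutputController(OutputService outputService) {
            this.outputService = outputService;
        }

        @RequestMapping(value = "/{input}", method = RequestMethod.GET)
        public String getOutput(@PathVariable String input) {
            return outputService.getOutput(input);
        }
    }

    // OutputService.java
    import org.springframework.stereotype.Service;

    @Service   // a Spring-managed singleton by default
    public class OutputService {
        public String getOutput(String input) {
            // purely a function of its input; nothing is stored on the instance
            return input.toUpperCase();
        }
    }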
Your service method should not be a singleton unless there is a specific need for it, such as making it thread-safe; you should rather make it static.
When the number of requests increases, there are quite a few scenarios you may need to consider in order to scale. A non-exhaustive list:
1. Is it the database layer? Number of records, DB performance, etc.
2. Is it the service layer? The amount of processing done at the service layer for various reasons.
3. Is it server capacity? The maximum number of requests the server can handle.
...and so on.

How to monitor communication in a SOA environment with an intermediary?

I'm looking for a way to monitor all messages in an SOA environment with an intermediary, which will be designed to enforce different rule sets over the messages' structure and sequences (e.g., let's say it will check and ensure that Service A has to be consumed before Service B).
Obviously the first idea that came to mind is how WS-Addressing might help here, but I'm not sure it does, as I don't really see any mechanism there to ensure that a message gets delivered via a given intermediary (as there is in WS-Routing, which is an outdated proprietary protocol by Microsoft).
Or maybe there's even a different approach in which the monitor wouldn't be part of the route but would be notified of requests/responses, which might in turn make it harder to actively enforce rules.
I'm looking forward to any suggestions.
You can implement a "service firewall" by intercepting all the calls in each service as part of your basic service host. Alternatively, you can use third-party solutions and route all your service calls through them (they will do the intercepting and then forward the calls to your services).
You can use ESBs to do the routing (and intercepting), or you can use dedicated solutions like IBM's DataPower, the XML firewall from Layer 7, etc.
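If you take the in-process interception route with WCF, a message inspector is the usual hook. Here is a minimal sketch (the class name is invented, and registering it through an endpoint or service behavior is omitted for brevity):

    using System;
    using System.ServiceModel;
    using System.ServiceModel.Channels;
    using System.ServiceModel.Dispatcher;

    // Logs every incoming request and outgoing reply that passes through the service.
    public class AuditMessageInspector : IDispatchMessageInspector
    {
        public object AfterReceiveRequest(ref Message request,
            IClientChannel channel, InstanceContext instanceContext)
        {
            // Inspect (or reject) the incoming message here.
            Console.WriteLine("Incoming action: " + request.Headers.Action);
            return null;   // correlation state handed to BeforeSendReply
        }

        public void BeforeSendReply(ref Message reply, object correlationState)
        {
            if (reply != null)   // reply is null for one-way operations
                Console.WriteLine("Outgoing action: " + reply.Headers.Action);
        }
    }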
For all my (technical) services I use messaging and the command processor pattern, which I describe here, without actually calling the pattern by name though. I send a message and the framework finds the corresponding class that implements the interface matching my message. I can create multiple classes that can handle my message, or a single class that handles a multitude of messages. In the article these are classes implementing the IHandleMessages interface.
Either way, as long as I can create multiple classes implementing this interface, and they are all called, I can easily add auditing without adding this logic to my business logic or anything else. Just add an additional implementation for every single message, or enhance the framework so it also accepts IHandleMessages implementations. That class can then audit every single message and store all of them centrally.
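As a rough illustration of the idea (the interface and message type below are hypothetical stand-ins, not the exact types from the article or from NServiceBus), an extra handler for the same message can do nothing but auditing:

    // Hypothetical handler contract; the real framework's interface may differ.
    public interface IHandleMessages<TMessage>
    {
        void Handle(TMessage message);
    }

    public class SubmitInvoice
    {
        public string InvoiceId { get; set; }
        public decimal Amount { get; set; }
    }

    // Business handler: does the actual work for the message.
    public class SubmitInvoiceHandler : IHandleMessages<SubmitInvoice>
    {
        public void Handle(SubmitInvoice message)
        {
            // ... business logic only ...
        }
    }

    // Audit handler: called for the same message, so auditing never touches business code.
    public class SubmitInvoiceAuditHandler : IHandleMessages<SubmitInvoice>
    {
        public void Handle(SubmitInvoice message)
        {
            System.Console.WriteLine(
                "Audited invoice " + message.InvoiceId + " for " + message.Amount);
        }
    }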
After doing that, you can find out more about the messages and the flow. For example, if you put information into the header of your WCF/MSMQ message about where it came from, and perhaps a unique identifier for that single message, you can track the flow across the various components.
NServiceBus also has this functionality for auditing and the team is working on additional tooling for this, called ServiceInsight.
Hope this helps.

Flex web service huge performance issue

I am pulling a good amount of data from a Java application using a web service. The data is a bit complex in structure, with a lot of hierarchy built from array collections. I am experiencing a huge performance issue of around 15 seconds (in JBoss and WebSphere) to get the data loaded. The time is mostly spent converting the service data into the Flex object structure. The issue gets worse when moving to the WebLogic application server. I am using the Axis2 framework.
Is there any way to optimize this? What alternative technologies can I use instead of web services?
I'm afraid you may not like my answer, because it will involve a lot of refactoring. I can't think of any easy fixes.
What alternative technologies can I use instead of web services?
You'll get the best performance by using AMF remoting instead of web services. Here's an article that explains what it is and contains a benchmark showing that this could easily cut your response time in half: http://www.themidnightcoders.com/products/weborb-for-net/developer-den/technical-articles/amf-vs-webservices.html. And that benchmark uses .NET on the server side; it will work even better with a Java server.
Is there any way to optimize this?
You should consider refactoring the objects you pass to the client into "Data Transfer Objects" (DTOs). These are simple value objects that contain only the data the client needs to display, which means less time spent transferring data from the server to the client and less time spent converting objects into ActionScript classes.
How can you limit the work involved?
You could add a layer on the server side that calls your existing web services, converts the complex data into simple DTOs, and delivers them to the client through AMF services. That way you can leave your existing code intact and still get a significant performance boost.
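For example (the class and fields below are made up for illustration), a deeply nested domain invoice could be flattened into a DTO that carries only what the Flex screen actually displays:

    // InvoiceSummaryDTO.java -- hypothetical flattened DTO: only the fields the
    // screen displays, no nested collections or lazily loaded associations.
    import java.io.Serializable;
    import java.math.BigDecimal;

    public class InvoiceSummaryDTO implements Serializable {

        private String invoiceNumber;
        private String customerName;
        private BigDecimal total;

        public InvoiceSummaryDTO() { }   // no-arg constructor for the serializer

        public String getInvoiceNumber() { return invoiceNumber; }
        public void setInvoiceNumber(String invoiceNumber) { this.invoiceNumber = invoiceNumber; }

        public String getCustomerName() { return customerName; }
        public void setCustomerName(String customerName) { this.customerName = customerName; }

        public BigDecimal getTotal() { return total; }
        public void setTotal(BigDecimal total) { this.total = total; }
    }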

RESTful API - handling large amounts of data

I have written my own RESTful API and am wondering about the best way to deal with large numbers of records returned from the API.
For example, if I use the GET method on myapi.co.uk/messages/ this will bring back the XML for all message records, which in some cases could be thousands. This makes using the API very sluggish.
Can anyone suggest the best way of dealing with this? Is it standard to return results in batches and to specify batch size in the request?
You can change your API to include additional parameters to limit the scope of data returned by your application.
For instance, you could add limit and offset parameters to fetch just a small part. This is how pagination can be done in accordance with REST. A request like the one below would fetch 10 resources from the messages collection, the 21st through the 30th. This way you can ask for a specific portion of a huge data set:
myapi.co.uk/messages?limit=10&offset=20
Another way to decrease the payload is to ask only for certain parts of your resources' representation. Here's how Facebook does it:
/joe.smith/friends?fields=id,name,picture
Remember that while using either of these methods, you have to provide a way for the client to discover each of the resources. You can't assume they'll just look at the parameters and start changing them in search of data. That would be a violation of the REST paradigm. Provide them with the necessary hyperlinks to avoid it.
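For example (the element names are illustrative, not a standard), a paged response can embed those links directly:

    <messages offset="20" limit="10" total="3254">
      <message id="21">...</message>
      <message id="22">...</message>
      <!-- eight more messages -->
      <link rel="self" href="http://myapi.co.uk/messages?limit=10&amp;offset=20"/>
      <link rel="next" href="http://myapi.co.uk/messages?limit=10&amp;offset=30"/>
      <link rel="prev" href="http://myapi.co.uk/messages?limit=10&amp;offset=10"/>
    </messages>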
I strongly recommend viewing this presentation on RESTful API design by Apigee (the screencast is called "Teach a Dog to REST"). Good practices and neat ideas for approaching everyday problems are discussed there.
EDIT: The video has been updated a number of times since I posted this answer; you can check out the 3rd edition from January 2013.
In general, there are different ways to improve API performance, including for large responses. Each of these topics can be explored in depth:
1. Reducing size with pagination
2. Organizing using hypermedia
3. Returning exactly what a user needs with schema filtering
4. Defining specific responses using the Prefer header
5. Using caching to make responses more efficient
6. More efficiency through compression
7. Breaking things down with chunked responses
8. Switching to providing more streaming responses
9. Moving forward with HTTP/2
Source: https://apievangelist.com/2018/04/20/delivering-large-api-responses-as-efficiently-as-possible/
If you are using .NET Core, you should try this package: Microsoft.AspNetCore.ResponseCompression.
Then add this line in ConfigureServices in the Startup file:
    services.AddResponseCompression();
Then, in the Configure method:
    app.UseResponseCompression();

REST vs RPC for a C++ API

I am writing a C++ API which is to be used as a web service. The functions in the API take in images/path_to_images as input parameters, process them, and give a different set of images/paths_to_images as outputs. I was thinking of implementing a REST interface to enable developers to use this API for their projects (independent of whatever language they'd like to work in). But, I understand REST is good only when you have a collection of data that you want to query or manipulate, which is not exactly the case here.
[The collection I have is of different functions that manipulate the supplied data.]
So, is it better for me to implement an RPC interface for this, or can this be done using REST itself?
Like lcfseth, I would also go for REST. REST is indeed resource-based, and in your case you might consider that there's no resource to deal with. However, that's not exactly true: the image converter in your system is the resource. You POST images to it and it returns new images. So I'd simply create a URL such as:
POST http://example.com/image-converter
You POST images to it, and it returns an array with the paths to the new images.
Potentially, you could also have:
GET http://example.com/image-converter
which could tell you about the status of the image conversion (assuming it is a time-consuming process).
The advantage of doing it like that is that you are reusing HTTP verbs that developers are familiar with, and the interface is almost self-documenting (though of course you still need to document the formats accepted and returned by the POST call). With RPC, you would have to define new verbs and document them.
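A hypothetical exchange (the paths and field names are invented for illustration) could look like this:

    POST /image-converter HTTP/1.1
    Host: example.com
    Content-Type: application/json

    {"images": ["/uploads/in/photo1.png", "/uploads/in/photo2.png"], "operation": "resize"}

    HTTP/1.1 200 OK
    Content-Type: application/json

    {"images": ["/uploads/out/photo1.png", "/uploads/out/photo2.png"]}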
REST uses the common operations GET, POST, DELETE, HEAD, and PUT. As you can imagine, this is very data-oriented. However, there is no restriction on the data type and no restriction on the size of the data (none I'm aware of, anyway).
So it's possible to use it in almost every context (including sending binary data). One of the advantages of REST is that web browsers understand it, so your users won't need a dedicated application to send requests.
RPC offers more possibilities and can also be used; you can define custom operations, for example.
Not sure you need that much power given what you intend to do.
Personally I would go with REST.
Here's a link you might want to read:
http://www.sitepen.com/blog/2008/03/25/rest-and-rpc-relationship/
Compared to RPC, a REST-style (JSON) interface is lightweight and easy for API users to consume. RPC (SOAP/XML) seems complex and heavy.
I guess what you want is an HTTP+JSON based API, not a REST API as described by the author of REST:
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven