Best practices for server-side architecture for an XSLT-based client application

I'm considering using Saxon CE for a web application to edit ebooks (metadata and content). It seems like a good match given that important ebook components (such as content.opf) are natively XML. I understand how to grab XML data from the server, transform it, insert the results into the HTML DOM, and handle events to change what is displayed and how.
Where I am getting stuck is how best to sync changes back up to the server. Is it best practice to use an XML database on the server? Is it reasonable to maintain XML on the server as text files and overwrite them with a POST, and could/should this be done through a result-document with a remote URI?
I realize this question may seem a bit open-ended, but I've failed to find any examples of client-side XSLT applications that actually allow modification of data on the server.

Actually, I don't think this question is specific to using Saxon-CE on the client. The issues would be exactly the same if you were using XForms, or indeed if the client-side code were written in JavaScript. And I think the answer depends on volumetrics, availability and concurrency requirements, and so on.
If you're doing a serious level of concurrent update of a shared collection of XML data, then using an XML database is probably a good idea. On the other hand, there might be scenarios where this isn't needed, for example where the XML data is part of the user-specific application context, or where the XML document received from the client simply needs to be saved somewhere "as is", or perhaps where it just needs to be appended to some kind of XML log file.
I think that in nearly all cases, you'll need a server-side component of the application that responds to HTTP PUT/POST requests and decides what to do with them.
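For the simple "save it as is" case, here is a minimal sketch of such a server-side component in C#, using HttpListener. It assumes the client POSTs the edited XML (for example content.opf) to /save/?name=content.opf; the endpoint, port, and name query parameter are illustrative assumptions, not anything prescribed by Saxon-CE:

using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

class XmlSaveEndpoint
{
    static async Task Main()
    {
        Directory.CreateDirectory("ebooks");
        var listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:8080/save/");
        listener.Start();
        while (true)
        {
            var ctx = await listener.GetContextAsync();
            if (ctx.Request.HttpMethod == "POST" || ctx.Request.HttpMethod == "PUT")
            {
                // Decide what to do with the incoming document; here we save it "as is".
                string name = ctx.Request.QueryString["name"] ?? "upload.xml";
                using (var file = File.Create(Path.Combine("ebooks", Path.GetFileName(name))))
                {
                    await ctx.Request.InputStream.CopyToAsync(file);
                }
                ctx.Response.StatusCode = 204; // saved; nothing to send back
            }
            else
            {
                ctx.Response.StatusCode = 405; // only PUT/POST make sense here
            }
            ctx.Response.Close();
        }
    }
}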

Handle Series of Web Requests in a specific way

I am sorry in advance; I am just learning Web development and my knowledge of it is quite limited.
I will describe my problem first.
I have a relatively large amount of data (1.8-2 GB) which should be hidden from public web access. However, a user should be able to request a specific small subset of that data via a URL call and see it on his or her webpage.
Ideally, I would like to write a program on the web server, let's call it ./oracle, which holds the large amount of data in main memory.
Each web user should be able to make specific string calls to oracle and see oracle's response on a web page as HTML elements.
There should be only one instance of oracle, and web users should make asynchronous calls to it.
Can I accomplish the above task with FastCGI or any other protocols?
If yes, could you please explain which tools/protocols I should use/learn?
I would recommend setting up an Apache server because it's very common and you'll be able to find a lot of answers to any specific questions here on StackOverflow already.
You could also look into things like http://Swagger.io which can help you generate your API.
Unfortunately, everything past this really depends on what you use to set up your server. Big picture though:
1. You'll need to open up a port to listen for incoming requests.
2. You'll need requests to include the parameters they want to send to oracle. You could accomplish this via the URI, like localhost/oracle-request?PARAMETER="foo", or alternatively use JSON in the body of the HTTP request. Again, this largely depends on how you set up step 1.
3. You'll need to route those requests to the oracle; this implementation depends entirely on step 1.
4. You'll need to capture the output from the oracle and return it to the user.
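An Apache setup will look different, but as a transport-agnostic illustration of these four steps, here is a minimal self-contained C# sketch using HttpListener; the port, the request URL, and the in-memory dictionary standing in for oracle's data set are all illustrative assumptions:

using System;
using System.Collections.Generic;
using System.Net;
using System.Text;
using System.Threading.Tasks;

class OracleServer
{
    // One shared, in-memory copy of the data, loaded once at startup.
    static readonly Dictionary<string, string> Data = new Dictionary<string, string>
    {
        ["foo"] = "<p>subset of data for foo</p>",
        ["bar"] = "<p>subset of data for bar</p>"
    };

    static async Task Main()
    {
        var listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:8080/");             // step 1: listen on a port
        listener.Start();
        while (true)
        {
            var ctx = await listener.GetContextAsync();              // asynchronous calls, one oracle
            string key = ctx.Request.QueryString["PARAMETER"] ?? ""; // step 2: read the parameters
            string html = Data.TryGetValue(key, out var subset)      // step 3: route to the oracle
                ? subset
                : "<p>no such subset</p>";
            byte[] bytes = Encoding.UTF8.GetBytes(html);             // step 4: return the output
            ctx.Response.ContentType = "text/html";
            await ctx.Response.OutputStream.WriteAsync(bytes, 0, bytes.Length);
            ctx.Response.Close();
        }
    }
}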
Once you decide on how you want to set up your server, feel free to edit your question and we may be able to provide more specific help.

How to ensure that a webservice whose output changes works?

I would like to ensure that our web service works, but I don't know how to do it because the web service's data are controlled by a back office and change multiple times every day.
The data loaded by the web service doesn't come from a database but from JSON files that are dynamically loaded and distributed. I've considered replacing those files for testing the behavior, but bad data are a frequent cause of malfunction, so I would rather test those simultaneously, or at least have some way to ensure that the data are valid for the currently deployed sources.
I would also welcome book suggestions.
This is a big problem and it is difficult to find a single solution. Instead, you should split the task into smaller subtasks:
Does the web service work at all? Connect to it and perform normal operations. If you are using real data, you cannot verify that it is correct; just check that you get a valid-looking reply. You should also have a known set of data on a different server, maybe call it staging; there you can verify that a new version of the web service gives correct output.
How do you check that the files you get from the back office are valid? It is not efficient to test them manually just before deployment, and you mentioned several reasons why this is not possible, so you have to live with it. Because your files are JSON, it should be possible to write a test suite that checks their validity (see the sketch after these steps).
How do you check that the real JSON files produce correct output from the web service? This is your original question. You have a set of JSON files; how easy is it to calculate what the web service should respond based on these files? In some cases you would need to write your own web service engine, which is why testers usually do the first two steps first.
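As a sketch of the step-2 validity suite, assuming NUnit and System.Text.Json; the data directory and the required 'id' field are illustrative assumptions:

using System.IO;
using System.Text.Json;
using NUnit.Framework;

[TestFixture]
public class BackofficeJsonTests
{
    [Test]
    public void AllFilesParseAndHaveRequiredFields()
    {
        foreach (var path in Directory.GetFiles("data", "*.json"))
        {
            // JsonDocument.Parse throws on malformed JSON, failing the test.
            using var doc = JsonDocument.Parse(File.ReadAllText(path));
            Assert.IsTrue(doc.RootElement.TryGetProperty("id", out _),
                          $"{path} is missing the 'id' field");
        }
    }
}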

Cost of serialization in web service

My next project involves the creation of a data API within an enterprise framework. The data will be consumed by several applications running on different software platforms. While my colleagues generally favour SOAP, I would like to use a RESTful architecture.
Most of the applications will only need a few objects at every call. Other applications, however, will sometimes need to make several sequential calls, each involving thousands of records. I'm concerned about performance: serialization/deserialization and network usage are where I fear to find a bottleneck. If each request involves a large delay, all of the enterprise's applications will be sluggish.
Are my fears realistic? Will serialization to a voluminous format like XML or JSON be a problem? Are there alternatives?
In the past, we've had to do these large data transfers using a "flatter"/leaner file format such as CSV for performance. How can I hope to achieve the performance I need using a web service?
While I'd prefer replies specific to REST, I'm interested in hearing how SOAP users might deal with this as well.
One advantage of REST is that you are free to use whatever media type you like. Why not continue to use text/csv? You could also enable HTTP compression to further reduce bandwidth consumption.
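As a minimal client-side sketch of that suggestion (the /records endpoint and host are illustrative assumptions; AutomaticDecompression makes HttpClient send Accept-Encoding: gzip and decompress transparently):

using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class CsvClient
{
    static async Task Main()
    {
        var handler = new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip // HTTP compression
        };
        using var client = new HttpClient(handler);
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("text/csv"));  // any media type you like
        string csv = await client.GetStringAsync("http://example.com/records");
        Console.WriteLine(csv.Split('\n')[0]); // header row
    }
}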
REST services are great for taking advantage of all different kinds of data formats. Whatever format fits your scenario best.
We offer both XML and JSON. The rendering time you mention really can be an issue. On the server side we have JAXB, whose standard Sun implementation is somewhat slow when it comes to marshalling XML. XML has the disadvantage of verbosity, but it is also nice for interoperability and has schemas plus explicit versioning.
We compensated for the verbosity in several ways (especially by limiting the result set):
In case you have a container with items in it, offer paging in your XML response (both page size and page number, e.g. /items?page=0&size=3). The client can reduce the payload by reducing the page size.
Offer collapsing elements: for instance, several clients are only interested in one data field of your whole item. Support this with a parameter (e.g. /items?select=name) so that only the nested element 'name' is included inline in your item element. This dramatically decreases size.
Generally, give the clients the power to limit the result set. They will definitely use it, because it speeds up response time on their side too :)
Also use compression; it shrinks verbose XML dramatically (in our case the payload got 10 times smaller). From the client side you enable it with the header 'Accept-Encoding: gzip'. If you use Apache, the server configuration is also straightforward.
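As a minimal client-side sketch of the paging and collapsing parameters above (the /items URL, page, size, and select mirror the examples; the host is an assumption):

using System;
using System.Net.Http;
using System.Threading.Tasks;

class ItemsClient
{
    static async Task Main()
    {
        using var client = new HttpClient();
        // Page 0, three items per page, and only the nested 'name' element of each item.
        string xml = await client.GetStringAsync(
            "http://example.com/items?page=0&size=3&select=name");
        Console.WriteLine(xml);
    }
}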
I'd like to offer three guidelines:
one is the observation that there are many SOAP Web services out there (especially ones built with .NET 2.0 "ASMX" technology) that send down their data transfer objects serialized in XML. There are of course many RESTful services that send down XML or JSON. XML serialization/deserialization is rarely the constraining factor.
one common cause of bottlenecks in Web services is an interface that encourages client applications to get data by making those thousands of sequential calls (there is a term for it: a chatty interface). This is what you should avoid when you design your Web service's interface, regardless of which four-letter acronym you decide to go ahead with (see the sketch after these guidelines).
one thing to remember about REST is that it (partially) stands for a transfer of state, which may be ill-suited to some operations where you don't want to transfer the state of a business object from the server to a client application. In those cases, a SOAP Web service (as suggested by your colleagues) is more appropriate; or perhaps a combination of SOAP and REST services, where the REST services would take care of operations where the state transfer is appropriate, and the SOAP services would implement the rest (pun unintended :-)) of the operations.
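To make the chatty-interface point concrete, here is a minimal sketch contrasting the two designs; all type and member names are hypothetical:

using System.Collections.Generic;

class Record { public int Id; public string Data; }

// Chatty: the client is forced into one network round trip per record.
interface IChattyRecordService
{
    Record GetRecord(int id);
}

// Coarse-grained: the whole batch travels in a single call.
interface ICoarseGrainedRecordService
{
    IList<Record> GetRecords(IList<int> ids);
}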

Is REST suitable for document-style web services?

RESTful and document/message-style seem to be two trends to implement web services nowadays in general. By this, I mean REST vs SOAP, and document-style vs RPC-style.
My question is how compatible REST is with document-style web services. From my limited knowledge of REST, it utilizes the HTTP GET/POST/PUT/DELETE verbs to perform CRUD-like operations on remote resources denoted by URLs, which lends it to a more "chatty", remote-method-like style, aka RPC style. On the other hand, document-style web services emphasize coarse-grained calls, i.e. sending up a batch-like request document with complex information, and expecting a response document back, also with complex information. I cannot see how this can be accomplished nicely with REST, without declaring only one resource for "Response" and using the POST verb all the time (which would defeat the purpose of REST).
As I am new in both document-style and RESTful web services, please excuse me for, and kindly point out, any ignorance in above assumptions. Thanks!
Your understanding of REST is misguided. This is neither surprising nor your fault. There is far, far more misinformation about REST floating around on the internet than there is valid information.
REST is far more suited to the coarse-grained, document-style type of distributed interface than it is to a data-oriented CRUD interface. Although there are similarities between CRUD operations and the HTTP GET/PUT/POST/DELETE methods, there are subtle differences that are very significant to the architecture of your application.
I don't think you mean REST over SOAP. It is possible to do REST over SOAP, but to my knowledge nobody does it, and I have never seen an article talking about it.
SOAP is usually used for "Web Services" and REST is usually done over HTTP.
REST is really meant to be used with documents as long as you consider your document a resource.
GET allows you to retrieve the document. Obviously.
POST allows you to create a document. No need for your API to require the full content of the document to create it. It is up to you to decide what is required to actually create the document.
PUT allows you to modify the document. Again, no need to force the client to send the whole document each time it wants to save; your API may support delta updates sent through PUT requests (see the sketch after this list).
DELETE obviously deletes the document. Again, you can design your API so that deletes do not actually destroy every bit of the document; you could create a system similar to a recycle bin.
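A minimal sketch of such a delta update over PUT, assuming a hypothetical /documents/{id} resource that accepts partial XML:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class DocumentClient
{
    static async Task Main()
    {
        using var client = new HttpClient();
        // Send only the changed element, not the whole document.
        var delta = new StringContent(
            "<document><title>New title</title></document>",
            Encoding.UTF8, "application/xml");
        HttpResponseMessage response =
            await client.PutAsync("http://example.com/documents/42", delta);
        Console.WriteLine(response.StatusCode); // e.g. OK or NoContent
    }
}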
What is nice about REST when working with documents is that the server response contains all the information needed to understand it. So if a new resource is created, you should send its location; the same if a resource is moved, etc. All you have to document is the data types that will be used (XML formats, JSON, etc.).
Standard HTTP methods are just there because their behaviour is already defined, and they allow clients to easily discover your API as long as they know the URI.

Web Service vs Form posting

I have two websites (www.mysite1.com and myweb2.com; both sites are in ASP.NET with SQL Server as the backend) and I want to pass data from one site to the other. Now I am confused about whether to use a web service or a form post (from mysite1 to a page in myweb2).
Can anyone tell me the pros and cons of both?
By web service I assume you mean a SOAP-based web service?
Anyway, both are comparable, each with a few advantages. Posting is more lightweight, while SOAP is standardized (sort of). I would go with the more RESTful approach, because I think SOAP brings too much overhead for simple tasks while not giving much of an advantage.
Webservices are SOAP messages (the SOAP protocol uses XML to pass messages back and forth), so your server on both ends must understand SOAP and whatever extensions you want to talk about between them, and they probably (but don't have to) need to be able to grok WSDL files (which "explain" the various service endpoints and the remote functionality available). Usually we call this the SOAP / WS-* stack, with emphasis on 'stack', as there are a few bits of software that need to be available, and the more complex the SOAP calls, the more of this stack needs to be available and maintained.
Using POST, on the other hand, is mostly associated with RESTful behaviours, and as an example of such a protocol, look to HTTP. Inside the POST you can of course post complex XML, but people tend to use plain POST to simplify the calling, and use HTTP responses as replies. You don't need any extra software, probably, as most if not all web toolkits have HTTP support. My own bias leans towards REST, in case you wonder. Through using HATEOAS you can create really good infrastructure for self-aware systems that can modify themselves with load and availability in real time, as opposed to the SOAP way, and this lies at the centre of the argument for it; HTTP was designed with large distributed networks in mind, dealing with performance and stability. SOAP tends to be a one-stop if-it-breaks-you're-stuffed kind of thing. (Again, remember my bias. I've written about this a lot on my blog, especially the architecture side and the impact of SOA vs. ROA. :)
There's a great debate as to which is "better", to which I can only say "it depends completely on what you want to do, how you prefer to do it, what you need it to do, your environment, your experience, the position of the sun and the moon(s), and the mood my cat is in." Eh, meaning, a lot.
I'm all for a healthy debate about this, but I tend to think that SOAP is a reinvention; SOAP is an envelope with a header and body, and if that sounds familiar, it is exactly how HTML was designed, a fact very few people tend to see. HTTP, as just a protocol for shifting stuff around, is well understood and extremely well supported, and SOAP uses it to shift its XML envelopes around. Is there a real difference between shifting SOAP and HTML around? Well, yes: the big difference is that SOAP reinvents all the niceties of HTTP (caching, addressability, state, scaling), uses HTTP only for delivering the message and nothing else, and leaves the stack itself to deal with those niceties mentioned earlier. So a lot of the goodness of HTTP is ignored and recreated in another layer (hence you need a SOAP stack to deal with it), which to me seems wasteful, ignorant, and complexity-adding.
Next up is what you want to do. For really complex things, there's a lot in the web services stack of standards (I think it's about 1200 pages combined these days) that could help you out, but if your needs are more modest (i.e. you're not that crazy about seriously complex security, for example), a simple POST (or GET) of a request and an envelope back with results might be good enough. Results in HTTP are, as you probably know, typed by HTTP content-type, so lots is already supported, but you can create your own, for example application/xml+myformat (or more correctly, application/x-xml+myformat, if I remember correctly). Get the response back, check that it's status code 200, and parse.
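As a minimal sketch of that plain-POST style (the URL and envelope format are illustrative assumptions):

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;

class PlainPostClient
{
    static async Task Main()
    {
        using var client = new HttpClient();
        var request = new StringContent(
            "<request><item id=\"42\"/></request>", Encoding.UTF8, "application/xml");
        HttpResponseMessage response =
            await client.PostAsync("http://myweb2.com/api/items", request);
        if (response.IsSuccessStatusCode) // response code 200
        {
            // Parse the reply envelope.
            XDocument doc = XDocument.Parse(await response.Content.ReadAsStringAsync());
            Console.WriteLine(doc.Root.Name);
        }
    }
}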
Both will work. One is heavy (WS-* stack) depending on what your needs are, the other is more lightweight and already supported. The rest is glue, as they say.
I would say the web service is definitely the best choice. A few pros:
If in the future you need to add another website, your infrastructure (the web service) is already there.
Cross-site form posting might give you problems when using cookies or might trigger browser privacy restrictions.
If you use form posting you have to write the same code over and over again, while with the web service you write the code once and then use it at multiple locations. Easier to maintain, less code to write.
Maintainability (this is related to the above point): of course, all the code relevant to exchanging data is in one location (your web service).
There are probably even more, like design-time support and code completion.
From my little experience, I'd say that you'd be best off using a web service, since you can see the methods and structure of the service in your code, once you've created it at the receiving end, that is.
Also, using the form-posting method would mean you have to fake form submissions, which isn't as neat as making a web service call.
Your third way would be to get the databases talking, though I'm guessing they're disparate and can't 'see' each other?
I would suggest a web service (or WCF). As Beanie said, with a service you are able to see the methods and types of the service (that you expose), which would make for much easier and cleaner moving of data.
I agree with AlexanderJohannesen that it is debatable whether SOAP web services or RESTful APIs are better; however, if both sites are under your control and built with ASP.NET, definitely go with SOAP web services. The tools that Visual Studio provides for creating and consuming web services are just great; it won't take you more than a few minutes to create the link between the two sites.
In the site where you want to receive communication, create the web service by selecting Add Item in VS. Select Web Service and name it appropriately. Then just create a method with the logic you want to implement and add the [WebMethod] attribute, e.g.
[WebMethod]
public void AddComment(int userId, string comment)
{
    // Store the comment for the given user, e.g. insert it into the SQL Server backend.
}
Deploy this on your test server, say tst.myweb2.com.
Now on the consuming side (www.mysite1.com), select Add Web Reference, point the URL to the address of the web service we just created, give it a name, and click Add Reference. You now have a proxy class that you can call just like a local class. Easy as pie.
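As a minimal sketch of the consuming side, assuming the web reference was added under the namespace MyWeb2; the generated proxy class name and the .asmx URL are illustrative assumptions:

class Consumer
{
    static void Main()
    {
        var service = new MyWeb2.CommentService();   // proxy class generated by Visual Studio
        service.Url = "http://tst.myweb2.com/CommentService.asmx";
        service.AddComment(42, "Posted from www.mysite1.com"); // remote [WebMethod] call
    }
}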