Architecture: technology questions (web services)

I want to create a web application with the following architecture:
There is some functionality, which is encapsulated in the "Business logic" module (1). It uses MongoDB as a data store (5) and an external (command-line) application (4).
The functionality of the application is brought to the end users via two channels:
the web application itself (2), and
a public API (3), which allows third-party applications and mobile devices to access the business logic functionality.
The web application is written in Java and based on the Vaadin platform. Currently it runs in the Jetty web server.
One important requirement: the web application should be scalable, i.e. it must be possible to increase the number of users/transactions it can service by adding new hardware.
I have the following questions regarding the technical implementation of this architecture:
What technology can be used to implement the business logic part? What are the sensible options for creating a SCALABLE app server?
What web server can I choose for the web interface part (2) to make it scalable? What are the options?
Calculations done in the external system (4) are potentially CPU-intensive. Therefore I want to perform them asynchronously, i.e.
a) the user sends a request for this operation (via the web interface or the public API, 2 and 3 above),
b) the request is put into a queue,
c) the CPU-intensive calculations are performed, and
d) at some point in time the answer is sent back to the user.
What technological options are there to implement this queueing (apart from JMS)?
Thanks in advance
Dmitri

For scaling the interactions, have you looked at Drools Grid, Akka or JPPF?
For making the web application scalable, have you looked at Terracotta or the GlassFish clustering capabilities (Vaadin is a GlassFish partner, if I remember correctly)?

Since nobody answered my question, I'll do it myself.
From other resources I learned that the following technologies can be used to implement this architecture:
1) Spring for Business logic (1)
2) GridGain or Apache Hadoop for scaling the interactions with the external system (4)
3) Hazelcast for making the web application scalable (2; server-side sessions).
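To make the queueing part of question 3 concrete, here is a minimal in-process sketch of the asynchronous pattern (a)-(d), using only the JDK's ExecutorService and CompletableFuture. It is an illustration, not a scalable solution: in production the executor's internal queue would be replaced by a distributed queue or grid (e.g. the technologies above) so that workers can run on separate machines. The details of the external tool invocation are hypothetical.

    import java.util.concurrent.*;

    public class CalculationQueue {
        // Fixed worker pool; its internal queue buffers pending requests (step b).
        private final ExecutorService workers = Executors.newFixedThreadPool(4);

        // Step (a): a request arrives from the web UI or the public API;
        // the caller immediately gets a future instead of blocking.
        public CompletableFuture<String> submit(String request) {
            return CompletableFuture.supplyAsync(() -> runExternalTool(request), workers);
        }

        // Step (c): the CPU-intensive work, standing in for the external
        // command-line application (4), e.g. invoked via ProcessBuilder.
        private String runExternalTool(String request) {
            return "result for " + request;  // placeholder computation
        }

        public static void main(String[] args) throws Exception {
            CalculationQueue q = new CalculationQueue();
            // Step (d): deliver the answer whenever it is ready (push it to
            // the user, or store it so the client can poll for it).
            q.submit("job-42").thenAccept(System.out::println);
            q.workers.shutdown();
            q.workers.awaitTermination(1, TimeUnit.MINUTES);
        }
    }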

Related

Can a service call another service inside its code?

The following point, mentioned in a presentation slide about SOA, confuses me regarding the concepts of service orchestration and service choreography. To enable service choreography, shouldn't a web service be able to call another web service?
SOA builds applications out of software services. Services comprise intrinsically unassociated, loosely coupled units of functionality that have no calls to each other embedded in them.
In theory, a service can do anything it needs to do to accomplish its job. So there doesn't seem to be a good reason to forbid using a second service to do your work. Why reinvent the wheel?
In practice, the issue is more complicated. If you start calling other services on your own web server, you'll eventually starve it of resources. At best, "real" clients will have to wait a bit longer for their answers while your server busies itself with its own internal calls.
Another issue is recursive loops: Service A calls B calls C calls A calls B ... you get the idea. A small change in one service can introduce such a loop without anyone noticing and it can sit there for a long time until it suddenly kills your server.
That is why you should build microservices in a hierarchy inside the server (i.e. below the web service layer, where they are not exposed to clients). Those microservices can use each other in a top-down manner (to avoid the loops). Unit tests then make sure they behave properly.
Lastly, such reuse is very slow. Each HTTP request takes a lot of resources to create, send, parse and process. Calling an internal method directly can be 10 to 10,000 times faster.
These are the main reasons why the services exposed by a single server shouldn't reuse each other via the "public client API".
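A minimal sketch of that top-down hierarchy (all class names hypothetical): the HTTP-facing endpoint is the only piece exposed to clients, and the internal services call each other directly, always one layer down, never back through the public API.

    // Leaf service: depends on nothing below it.
    class PriceService {
        double priceOf(String item) { return 9.99; }
    }

    // Mid layer: may use PriceService, never the other way around,
    // which rules out recursive loops by construction.
    class OrderService {
        private final PriceService prices;
        OrderService(PriceService prices) { this.prices = prices; }
        double totalFor(String item, int qty) { return prices.priceOf(item) * qty; }
    }

    // The only HTTP-facing piece; everything below it is an in-process
    // method call, not an HTTP round-trip.
    class OrderEndpoint {
        private final OrderService orders = new OrderService(new PriceService());
        String handle(String item, int qty) {
            return String.valueOf(orders.totalFor(item, qty));
        }
    }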
Note: There are web services which build new services by using existing ones. IFTTT ("If This Then That") is one such beast.
You can adapt each concept to your needs. In my current project we have a separate module that is responsible for orchestration. This is required because real-life usage scenarios can be very complicated, so in order to stay close to the actual management of your system, you need such a module.
Another advantage of this approach is that separation of concerns is preserved. It also aligns business requests with the applications, data, and infrastructure that you have, and it defines policies and service levels through automated workflows, provisioning, and so on.
Orchestration is also critical in the delivery of cloud services, since those are networked to allow sharing of data-processing tasks, centralized data storage, and online access to services or resources.

Why do common web service clients use a proxy?

I've noticed that most architectures acting as a web service client use a proxy to communicate with the REST server. While it is possible to access a REST service without a proxy, one example I've read uses a proxy server to communicate with its REST server. Are there any advantages to using a proxy to access a REST service?
Using a proxy is usually not necessary for small local application web services. It depends mostly on your server load (number of clients, frequency of requests) and on the network area where your services are accessed: back-office server-to-server, front-office LAN, WAN, or the whole internet.
REST web services are mostly online resources, uniquely identified by a URL and generally served in the classic HTTP way. The client does not know whether the data it gets is static, dynamic, or cached; it simply receives the data as if it were static.
On large-scale applications, as clients, resources, and web service requests increase, you need technical components to handle concerns like load balancing and usage tracking as your application evolves. You'll also want to deliver the best performance you can to the clients. This can be achieved efficiently with a proxy solution.
Advantages of NOT using a proxy:
Simplicity
Advantages of using a proxy-based solution:
Rewrite URLs from a single centralized entry point (instead of configuring them separately on each server/app/web service).
Track the usage of your web services globally.
Enhance performance (caching, balancing across dedicated servers).
Manage API versions (globally switch /myAPI from /myAPI-V1 to /myAPI-V2, and roll back just as easily).
Modify some API calls on the fly (for compatibility between versions, preliminary input validation, or adding technical information to calls).
Manage web service security globally (IP control, per-user quotas, etc.).
Hope this answers your question.
Edit (in answer to comment)
The proxy can act as a cache. For frequently requested resources (REST services), it can serve the same response to several users: your service will be called just once, even if there are 100 requests for that resource.
But this depends on how your services are really used, so you need to track requests to know whether caching would help in your case.
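To make the caching idea concrete, here is a minimal sketch (names hypothetical): the backend function runs at most once per resource, and concurrent requests for the same URL share the result. A real proxy would add expiry, Cache-Control handling, and invalidation.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;

    class CachingProxy {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Function<String, String> backend;  // the real REST call

        CachingProxy(Function<String, String> backend) { this.backend = backend; }

        String get(String url) {
            // computeIfAbsent invokes the backend at most once per key, even
            // when many threads request the same resource at the same time.
            return cache.computeIfAbsent(url, backend);
        }
    }

Whether a cache like this pays off depends on your actual usage: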
How many users do you have?
How many web services?
What kind of data/resources are served?
How fast are your services (individually)?
What is the network performance? (LAN? WAN? Internet? Mobile?)
How many servers and applications serve your users?
Do you encounter any network load problems?
A proxy cannot "accelerate" your existing services, but it can improve the way you serve resources to your clients.
Do not use a proxy if you do not know whether you need it. You must know what your actual system architecture is and where its weaknesses and bottlenecks are.

Database connection pooling for a C++ recommendation engine

I am considering implementing a recommendation engine for a small website.
The website will use the LAMP stack, and for various reasons the recommendation engine must be written in C++. It consists of an On-line Component and an Off-line Component, both of which need to connect to MySQL. The difference is that the On-line Component needs a connection pool, whereas a few persistent connections, or even connecting on demand, would be sufficient for the Off-line Component, since it does not face the real-time, concurrent-request performance requirements of the On-line Component.
The On-line Component is to be wrapped as a web service via Apache Axis2. The PHP frontend app on the Apache HTTP server retrieves recommendation data from this web service module.
There are two DB connection options for the On-line Component that I can think of:
1. Use an ODBC connection pool; unixODBC might be a candidate.
2. Use the connection pool APIs that come as part of the Apache HTTP Server; mod_dbd would be a choice. http://httpd.apache.org/docs/2.2/mod/mod_dbd.html
As for the Off-line Component, a simple option is a direct connection using ODBC.
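As I understand it, all of these options implement the same underlying pattern; here is a rough sketch of it, written in Java only for brevity (a C++ version would wrap ODBC connection handles in the same structure, around a mutex-protected queue):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.function.Supplier;

    class Pool<C> {
        private final BlockingQueue<C> idle;

        // Open all connections up front; 'connector' stands in for
        // whatever actually opens an ODBC connection.
        Pool(int size, Supplier<C> connector) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) idle.add(connector.get());
        }

        // Blocks when the pool is exhausted, which gives the back-pressure
        // behaviour the On-line Component needs under concurrent requests.
        C acquire() throws InterruptedException { return idle.take(); }

        // Return the connection for reuse instead of closing it.
        void release(C conn) { idle.add(conn); }
    }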
Due to my lack of web app design experience, I have the following questions:
Option 1 for the On-line Component is a tightly coupled design that does not take advantage of the pooling APIs in the Apache HTTP Server. But if I choose Option 2 (a 3-tier architecture), how can a standalone component, separate from the Apache HTTP Server, use its connection pool APIs? A Java application can be deployed as a WAR file and hosted in a servlet container such as Tomcat (see Mahout in Action, section 5.5); is there a similar approach for my C++ recommendation engine?
I am not sure whether my prototype design is sound.
Any suggestions will be appreciated:)
Thanks,
Mike

Efficient C++ software stack for web development

What C++ software stack do developers use to create custom, fast, responsive, and not very resource-hungry web services?
I'd recommend taking a look at CppCMS:
http://cppcms.com
It fits exactly the situation you described:
performance-oriented (preferably web service) software stack
for C++ web development.
It should have a low memory footprint
work on UNIX (FreeBSD) and Linux systems
perform well under high server load and be able to handle many requests with great efficiency
[as I plan to use it in a virtual environment] where resources will be to some extent limited.
So far I have only come across the Staff WSF, Boost, and POCO libraries. The latter two could be used to implement a custom web server...
The problem is that the web server is only about 2% of web development; there is a lot more to handle:
web templates
sessions
cache
forms
security, security, security (which is far from trivial)
And much more; that is why you need web frameworks.
You could write an apache module, and put all your processing code in there.
Or there's CppCMS or TreeFrog; for writing web services (rather than web sites), use gSOAP or Apache Axis.
But ultimately, there's no "easy to use" framework, because C++ developers like to build apps from smaller components. There's no Ruby-style framework, but there is all manner of libraries for handling XML or whatever, and Apache offers the HTTP protocol bits in its module spec, so you can build up your app quite happily using whatever pieces make sense to you. Whether there's a market for bundling this up into something easier to use is another matter.
Personally, the best web app system I wrote (for a company) used a very thin web layer in the web server (IIS and ASP, but this applies to any web server; use PHP, for example) that did nothing except act as a gateway, passing the data from requests through to a C++ service. The C++ service could then be written completely as a normal C++ command-line server with well-defined entry points, using as thin an RPC mechanism as possible (we used shared memory, but you may want to check out ZeroMQ). This not only increased security but allowed us to scale easily by shifting the services to app servers and running the web servers on different hardware. It was also really easy to test.
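A sketch of the backend half of that design: a plain command-line server with one well-defined entry point, answering requests that the thin web layer forwards to it. The original service was C++ over shared memory; this version is written in Java for consistency with the other examples here and assumes the JeroMQ binding (org.zeromq:jeromq) and a hypothetical request format.

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class BackendService {
        public static void main(String[] args) {
            try (ZContext ctx = new ZContext()) {
                ZMQ.Socket socket = ctx.createSocket(SocketType.REP);
                socket.bind("tcp://*:5555");            // the web gateway connects here
                while (!Thread.currentThread().isInterrupted()) {
                    String request = socket.recvStr();  // blocking receive from the gateway
                    socket.send(process(request));      // reply goes back to the web layer
                }
            }
        }

        // The well-defined entry point; real business logic goes here.
        private static String process(String request) {
            return "processed: " + request;
        }
    }

Because the web layer only shuttles bytes back and forth, you can move this service to dedicated hardware, or run several instances behind the gateway, without touching the web tier.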

Connecting SAP to remote web services using cURL

I've been doing a bit of research and cannot seem to quite capture the information I need. Our software offers a public API (web service) which our clients can call via HTTPS, for example through cURL. Many of our clients use SAP, which I honestly know next to nothing about (nor does anyone on our team).
I'm trying to put together a big picture of what those clients would have to do to communicate easily with our web services. What requirements would SAP clients have? I've read a bit about the Web Services framework in SAP, but that doesn't quite seem to be what I need.
Is it simple to create new SAP modules, or use existing ones, in any language that can connect to a remote web service through cURL?
Is there any valuable documentation out there that I could/should read?
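For reference, the call our API expects is an ordinary HTTPS request; here is what the cURL invocation corresponds to in Java's built-in HttpClient (the endpoint and payload are placeholders):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ApiCall {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://api.example.com/v1/orders"))  // placeholder endpoint
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString("{\"id\": 42}"))
                    .build();
            // Send the request and print the status code and body.
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }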
I'm not sure if you'll like this answer, but I'll write it anyway. :-)
If "webservice" means SOAP/WSDL for you, then it should be technically possible to generate some proxies to facilitate communication with your application. If you're talking about REST or some home-brewn stuff, it's a bit more work, but it's still possible. There's an example available in the SAP help portal. (And by the way, "some language" means ABAP.)
HOWEVER: You will need someone with SAP experience in the area you're interested in (materials management, sales, whatever). And you'll probably need someone to code some bits and pieces in the SAP system to make the interface work, OR your clients will need some kind of communication server (PI) in between, OR both. Unless you've got a customer who will let you experiment and gain experience in their system, you'll also need an SAP installation to do this.
Unfortunately, the big picture might be even bigger than you imagine...
EDIT: If you want to get an idea of what ABAP is, this answer might be a starting point.
For connecting an SAP system to other systems, consider using SAP NetWeaver Process Integration (SAP PI). It is the part of SAP NetWeaver whose explicit purpose is communication between various SAP systems and other (third-party) systems. It's the core component of any SAP-flavored service-oriented architecture (SOA).
From Wikipedia:
SAP calls PI an integration broker because it mediates between entities with varying requirements in terms of connectivity, format, and protocols. According to SAP, PI reduces the TCO by providing a common repository for interfaces. The central component of SAP PI is the SAP Integration Server, which facilitates interaction between diverse operating systems and applications across internal and external networked computer systems.
PI is built upon the SAP Web Application Server.