protocol buffers and actual transport options - sockets or middleware - c++

I am developing a client/server application for which I am evaluating a few options for the communications layer.
As part of this communication framework, I am contemplating using google's protocol buffer (PB) for representation of the transport data instead of re-inventing my own binary structure.
Now coming on to the actual transport, I am wondering if one should use plain sockets to send/receive these binary messages or use some form of middleware. Using a middleware has certain obvious advantages over sockets. A few that I care about include: communication models - publish/subscribe, request/response and fail over.
On the other hand, using sockets has the advantage of low overhead compared to the middleware approach and will deliver better performance.
One can also think of using the RPC libraries available with protocol buffers (third party add-ons on google's protocol buffer wiki) to communicate between the client and the server. Though it abstracts from the low level socket, it still does not support the middleware features.
At the moment, my client is an Adobe Flex GUI and two server side processes (One java and another C++). In the future, the client and server side can potentially have other services developed in other languages as well such as .NET
What do the experts feel about these choices and from experience what works well without compromising on performance. Are there other alternatives that developers go with?
Thanks
Dece

Unless it's a learning exercise, you absolutely, positively must use some middleware. There are lots of choices: AMQP, ZeroMQ, XMPP, Comet/Bayeux.
For your scenario, you probably want something web-based, so XMPP over HTTP may be a good option. However, I'm partial to Comet (though I have found Bayeux to be too complex for my needs).

Which platform?
If Windows then I have a free, simple, high performance, IO completion based pluggable server platform called WASP which is available from here.
Simply write a DLL or two and plug them in and you're done with the networking.
I don't currently have a protocol buffers based plugin example but it's on my list of things to do...

Related

Understanding the characteristics of Apache Thrift versus TCP socket servers?

I have some experience writing TCP socket servers using OpenSSL and I'm looking to understand more about Apache Thrift. I've looked over some basic examples of Thrift servers, and I understand that Thrift can handle pipes.
Can someone provide a simple explanation of the ways a Thrift server differs from a TCP server (apart from the use of pipes)?
And do Thrift frameworks use a different transport protocol?
I know this is a straightforward question, but I can't seem to find a beginner level explanation.
[...] ways a Thrift server differs from a TCP Server?
Thrift is at least one abstraction layer above raw sockets. It provides you with an abstraction which allows you to send and receive information across any medium, one of which could be TCP sockets.
The underlying transport medium itself is not important, nor is the protocol used (binary, compact, JSON ... you name it). Both is entirely transparent to the rest of your application.
In other words, you develop against a type-safe service API, rather than programming sockets or parse some JSON on your own.
Instead of fiddling around with bytes, encodings and fighting the subtleties of sockets, Thrift allows you to focus on what you want to do with that Service as a client and/or implementing the server-side logic.
Plus you may change transport and/or protocol as needed without affecting the rest of the code.
Thrift protocol is s higher level implementation over the TCP transport layer. It provides important elements:
serialization
RPC protocol
From my personal point of view, among the main benefits is type safety, especially between different languages.
Manually writing TCP (socket) solutions for sophisticated case seems quite hard.
Because it is generated from a formal definition (IDL - *.thrift files, sometimes this form is called schema) it shares similarities with WSDL/SOAP, but boasts higher performance. I use it on mobile clients etc.
The most trendy last years is JSON over REST, as far I know most code should be written manually. JSON doesn't have a schema (in official standards), maybe in the future.
Acronyms I have used are generally known. My answer is very general and simplified, professionals please forgive me.

How to create a cross platform duplex web service communication

I would like to create a web service in .Net that clients of different types (Web, exe, java) can consume despite of the language they are written with.
In addition, it needs to support callbacks and be able to easily pass through firewalls and NATs (knowing a client internal IP might change, or be removed from NAT).
Thirdly, since it is an enterprise product, I want to avoid being dependent on 3rd parties, especially ones that demand a certain environment or that customer will not want.
What kind of technologies or approaches can I use?
I am looking at web sockets, but there also I see a lot of complexities and I am not sure there aren't a lot of topology and interoperability border cases that may make me unreliable.
Thanks
For simple request-response services, you can use REST (over HTTP). Any client technology can access HTTP at this point (even CLI) and REST is a well-known and well-understood distributed mechanism. The issue involves the callbacks. There are frameworks that handle HTTP callbacks (simple google search will give you good answers), but imo, the solutions that I have seen are clumsy.
Unlike normal HTTP, WebSocket is a persistent connection. And like any other IETF and W3C specification (or any other standard for that matter), there are various implementations with various degrees of reliability, performance, etc. There are probably about 100 implementations of WebSocket clients and servers. Some implementations handle real-world issues like reconnections, network intermediaries, high scalability, mobile capabilities, etc... and some implementations just do not. I would suggest you pick an implementation that provides these enterprise-grade features.
Btw, WebSocket is pretty darn simple

When to choose webservices?

I'd like to give external access to a web application. Several applications on many clients will use this service extensively (hopefully), which will always lead to CRUD functions on a database.
Is a webservice always the first choice? Is there any rule of thumb when to choose webservices, sockets, etc?
It really depends on who your clients are, what kind of performance are you looking at, how well your clients know the technologies.
Sockets etc might give you a good performance speed but development time might increase for both, you and your clients.
SOAP web services established a standard quite some time back but now people are using REST web services more because of its simplicity and less overhead.
I am heavily impressed by the RESTful webservices offered by twilio
I am sure that Twilio is receiving hundreds of thousands of calls a day and they are performing just well.
Have a look at the following articles for more understanding about them
http://www.ibm.com/developerworks/webservices/library/ws-restful/
http://grails.org/doc/1.0.x/guide/13.%20Web%20Services.html
The big benefit of web services is the ease of use and the predefined interface, but they are "slower" compared to low level socket communication, because for example the XML Requests/Respnses of a SOAP-Service needs to be created/interpreded.
So I would say if you open the service to "outsider" use web services unless speed is really the biggest concern.
Also because web services are mostly accessable through port 80 you have much less problems with proxy/firewalls than with a socket on a random other port.
If you are having a high work load cahcing is also very important because it can speed up the system dramaticaly.
I would choose a web services (SOAP or REST) whenever I can. It's easier to scale a web service than a home brew socket implementation and it takes less time to build a webservice.
Sockets is usually the preferred choice if you need two-way communication (I know that WCF has callbacks).
Webservice makes it a generic way for different type of clients and applications to exchange data also it is dependent on your architecure and infrastructure.
Socket programming is bit complex and some time may create problem in data exchange. It entirely depends on your fuctionaly requirements and architecure which should you use.
if your clients will consume this data from browsers of application then webservice is better choice.

Socket Server vs. Standard Servers

I'm working on a project of which a large part is server side software. I started programming in C++ using the sockets library. But, one of my partners suggested that we use a standard server like IIS, Apache or nginx.
Which one is better to do, in the long run? When I program it in C++, I have direct access to the raw requests where as in the case of using standard servers I need to use a scripting language to handle the requests. In any case, which one is the better option and why?
Also, when it comes to security for things like DDOS attacks etc., do the standard servers already have protection? If I would want to implement it in my socket server, what is the best way?
"Server side software" could mean lots of different things, for example this could be a trivial app which "echoes" everything back on a specific port, to a telnet/ftp server to a webserver running lots of "services".
So where in this gamut of possibilities does your particular application lie? Without further information, it's difficult to make any suggestions, but let's see..
Web Services, i.e. your "server side" requirement is to handle individual requests and respond having done some set of business logic. Typically communication is via SOAP/XML, and this is ideal if you bave web based clients (though nothing prevents your from accessing these services via standalone clients). Typially you host these on web servers as you mentioned, and often they are easiest written in Java (I've yet to come across one that needed to be written in C++!)
Simple web site - slightly different to the above, respods to HTML get/post requests and serves up static or dymanic content (I'm guessing this is not what you're after!)
Standalone server which responds to something specific, here you'd have to implement your own "messaging"/protocols etc. and the server will carry out a specific function on incoming request and potentially send responses back. Key thing here is that the server does something specific, and is not a generic container (at which point 1 makes more sense!)
So where does your application lie? If 1/2 use Java or some scripting language (such as Perl/ASP/JSP etc.) If 3, you can certainly use C++, and if you do, use a suitable abstraction, such as boost::asio and Google Protocol buffers, save yourself a lot of headache...
With regards to security, ofcourse bugs and security holes are found all the time, however the good thing with some of these OS projects is that the community will tackle and fix them. Let's just say, you'll be safer using them than your own custom handrolled imlpementation, the likelyhood that you'll be able to address all the issues that they would have encountered in the years they've been around is very small (no disrespect to your abilities!)
EDIT: now that there's a little more info, here is one possible approach (this is what I've done in the past, and I've jused Java most of the way..)
The client facing server should be something reliable, esp. if it's over the internet, here I would use a proven product, something like Apache is good or IIS (depends on which technologies you have available). IMHO, I would go for jBoss AS - really powerful and easily customisable piece of kit, and integrates really nicely with lots of different things (all Java ofcourse!) You could then have a simple bit of Java which can then delegate to your actual Server processes that do the work..
For the Server procesess you can use C++ if that's what you are comfortable with
There is one key bit which I left out, and this is how 1 & 2 talk to each other. This is where you should look at an open source messaging product (even more higher level than asio or protocol buffers), and here I would look at something like Zero MQ, or Red Hat Messaging (both are MQ messaging protocols), the great advantage of this type of "messaging bus" is that there is no tight coupling between your servers, with your own handrolled implementation, you'll be doing lots of boilerplate to get the interaction to work just right, with something like MQ, you'll have multiplatform communication without having to get into the details... You wil save yourself a lot of time and bother if you elect to use something like that.. (btw. there are other messaging products out there, and some are easier to use - such as Tibco RV or EMS etc, however they are commercial products and licenses will cost a lot of money!)
With a messaging solution your servers become trivial as they simply handle incoming messagins and send messages back out again, and you can focus on the business logic...
my two pennies... :)
If you opt for 1st solution in Nim's list (web services) I would suggest you to have a look at WSO's web services framework for C++ , Axis CPP and Axis2/C web services framework (if you are not restricted to C++). Web Services might be the best solution for your requirement as you can quickly build them and use either as processing or proxy modules on the server side of your system.

Best messaging medium for real-time SOA applications?

I'm working on a real time application implemented using in a SOA-style (read loosely coupled components connected via some messaging protocol - JMS, MQ or HTTP).
The architect who designed this system opted to use JMS to connect the components. This system is real time so there no need to queue up messages should one component fail (the transaction will simply time out). Further, there is no need for guaranteed delivery or rollback.
In this instance, is there any benefit to using JMS over something like an HTTP web service (speed, resource footprint, etc)?
One thing that I'm thinking is since the JMS approach requires us to set a thread pool size (the number of components listening to a JMS topic/queue), wouldn't a HTTP service be a better fit since this additional configuration is not needed (a new thread is created for each HTTP request making the application scalable to an "unlimited" number of requests until the server runs out of resources).
Am I missing something?
I don't disagree with the points made by S.Lott at all, but here are a couple of points to consider regarding HTTP web services:
Your clients only need to know how to communicate via HTTP - a protocol well supported by just about every modern langauge in one form or another. JMS, though popular, is more specialist than HTTP, and so restricts the languages your interconnected systems can use. Perhaps not an issue for your system at the moment, but will you need to plug in other systems later that might struggle to support JMS connectivity?
Standards like WSDL and SOAP which you could levarage for your services are well supported by many langauges and there are plenty of tools around that will generate code to implement both ends of the pipeline (client and server) for you from a WSDL file, reducing the amount of dev you'll have to do. These standards also make it relatively simple to define and publish the specification of the data you'll be passing between your systems, something you'll presumably have to do by hand using a queueing technology like JMS.
On the downside, as pointed out by S.Lott, JMS gives you functionality that you throw away using the (stateless) HTTP protocol: guaranteed ordering & reliability; monitoring; scalability; etc. Are you sure you don't need these, and won't need these going forward?
Great question, btw.
I think it's really dependent on the situation. Where I work, we support Remoting, JMS, MQ, HTTP, and sFTP. We are implementing a middleware appliance that speaks Remoting, JMS, MQ, and HTTP, and a software middleware component that speaks JMS, MQ, and HTTP.
As sgreeve alluded to above, standards help us become flexible, but proprietary formats allow more functionality.
In a nutshell, I'd say use HTTP for stateless calls (which could end up meeting almost all of your needs), and whatever proprietary formats you need for stateful calls. If you work in a big enterprise, a hardware appliance is usually a great fit as middleware: Lightning fast compression, encryption, transformation, and translation, with very low total cost of ownership.
I don't know enough about your requirements, but you may be overlooking Manageability, Flexibility and Performance.
JMS allows you to monitor and manage the queue. These are features HTTP lacks, and you'd have to build rather than buy from a vendor.
Also, There are queues and topics in JMS, allowing multiple subscribers to a single publisher. Not possible in HTTP.
While you may not need those things in release 1.0, you might want them in the future.
Also, JMS may be able to use other transport mechanisms like named sockets, which reduces the overheads if there isn't all that socket negotiation going on with (almost) every request.
If you go down the HTTP route and you want to support more than one machine or some kind of reliability - you are going to need a load balancer capable of discovering the available web servers and loading requests across them - then failing over to another web server if a particular box/process dies. Clients making HTTP requests are also going to have to deal with servers failing and retrying operations in some loop.
This is one of the main features of a message queue - reliable load balancing with failover and loose coupling among the producers and consumers without them having to include retry logic - so your client or server code doesn't have to worry about this kinda thing. This is totally separate to whether or not you want message persistence or want to use ACID transactions to produce/consume messages (which can be very handy BTW).
If you focus just on the server side using Java - whether Servlets or MessageListener/MDBs they are kinda similar either way really. The difference is the load balancer.
So maybe the question should really be - is a JMS broker easier to setup & work with than setting up your DNS/NAT/IP/HTTP load balancer infrastructure?
I suppose it depends on what you mean by real-time... Neither JMS nor HTTP in my opinion support "real-time" applications well, meaning they cannot offer predictable/deterministic performance nor properly prioritize flows in the presence of contention.
Part of it is that these technologies are built on top of TCP which serializes all traffic into a single FIFO meaning that different traffic flows cannot be easily prioritized. Moreover TCP timers are not easily controlled resulting unpredictable blocking and timeouts... For this reason many streaming applications use UDP instead of TCP as an underlying protocol.
Another problem with JMS is that typical implementations use a broker that centralizes message dispatch. This is not the best architecture to get deterministic performance.
If you are looking for a middleware that can offer you the kind of reliability guarantees and publish-subscribe semantics you get with JMS, but was developed to fit the real-time application domain I recommend you take a look at the OMG Data-Distribution Service (DDS). See dds.omg.org and this article I wrote arguing why DDS is the best middleware to implement a real-time SOA. http://soa.sys-con.com/node/467488