Understanding the characteristics of Apache Thrift versus TCP socket servers? - c++

I have some experience writing TCP socket servers using OpenSSL and I'm looking to understand more about Apache Thrift. I've looked over some basic examples of Thrift servers, and I understand that Thrift can handle pipes.
Can someone provide a simple explanation of the ways a Thrift server differs from a TCP server (apart from the use of pipes)?
And do Thrift frameworks use a different transport protocol?
I know this is a straightforward question, but I can't seem to find a beginner level explanation.

[...] ways a Thrift server differs from a TCP Server?
Thrift is at least one abstraction layer above raw sockets. It provides you with an abstraction which allows you to send and receive information across any medium, one of which could be TCP sockets.
The underlying transport medium itself is not important, nor is the protocol used (binary, compact, JSON ... you name it). Both is entirely transparent to the rest of your application.
In other words, you develop against a type-safe service API, rather than programming sockets or parse some JSON on your own.
Instead of fiddling around with bytes, encodings and fighting the subtleties of sockets, Thrift allows you to focus on what you want to do with that Service as a client and/or implementing the server-side logic.
Plus you may change transport and/or protocol as needed without affecting the rest of the code.

Thrift protocol is s higher level implementation over the TCP transport layer. It provides important elements:
serialization
RPC protocol
From my personal point of view, among the main benefits is type safety, especially between different languages.
Manually writing TCP (socket) solutions for sophisticated case seems quite hard.
Because it is generated from a formal definition (IDL - *.thrift files, sometimes this form is called schema) it shares similarities with WSDL/SOAP, but boasts higher performance. I use it on mobile clients etc.
The most trendy last years is JSON over REST, as far I know most code should be written manually. JSON doesn't have a schema (in official standards), maybe in the future.
Acronyms I have used are generally known. My answer is very general and simplified, professionals please forgive me.

Related

How to create a cross platform duplex web service communication

I would like to create a web service in .Net that clients of different types (Web, exe, java) can consume despite of the language they are written with.
In addition, it needs to support callbacks and be able to easily pass through firewalls and NATs (knowing a client internal IP might change, or be removed from NAT).
Thirdly, since it is an enterprise product, I want to avoid being dependent on 3rd parties, especially ones that demand a certain environment or that customer will not want.
What kind of technologies or approaches can I use?
I am looking at web sockets, but there also I see a lot of complexities and I am not sure there aren't a lot of topology and interoperability border cases that may make me unreliable.
Thanks
For simple request-response services, you can use REST (over HTTP). Any client technology can access HTTP at this point (even CLI) and REST is a well-known and well-understood distributed mechanism. The issue involves the callbacks. There are frameworks that handle HTTP callbacks (simple google search will give you good answers), but imo, the solutions that I have seen are clumsy.
Unlike normal HTTP, WebSocket is a persistent connection. And like any other IETF and W3C specification (or any other standard for that matter), there are various implementations with various degrees of reliability, performance, etc. There are probably about 100 implementations of WebSocket clients and servers. Some implementations handle real-world issues like reconnections, network intermediaries, high scalability, mobile capabilities, etc... and some implementations just do not. I would suggest you pick an implementation that provides these enterprise-grade features.
Btw, WebSocket is pretty darn simple

Transport layer Services and Application Layer services

I am working with web services right now. We have two types of services, one over HTTP and other over TCP. when Trying to understand the difference between these two, as per my understanding, services over TCP work at the transport layer i.e they transmit data over two ends. So in that case services over TCP will directly transfer data between two ends. But i am not so much clear on services over HTTP. I know we have a Client server model, REST, SOAP and HTTP is the protocol that transmits data but i am not able to properly relate the concept of services over HTTP!
Can anyone please help with an analogy which explains the difference between the two ?
As John Saunders is trying to allude to, I would agree that it is more important to understand the abstractions these protocols provide, rather than specific "Layer" they may be called in certain model (OSI). While the general model helps and applies, it doesn't provide specific details for actual protocols.
Having said that, the difference between so called Transport Layer Services using TCP vs Application Layer Services using HTTP, IMHO boils down to the comparisons between TCP and HTTP itself.
I'll start be saying that I hope it is known to anyone even vaguely familiar with these protocols, that HTTP is higher level abstraction than TCP and in fact it relies on TCP/IP itself. Hence HTTP clearly inherits certain feature like reliability from TCP/IP.
Now the contrast -
TCP Service
Design your own application level protocol - You design your own application level protocol.. For example, how will Client request operation to add an employee? How will Client request to find a given employee? etc... How do you indicate the format in which data can be exchanged between client and server? How will you even distinguish metadata (like request information) from data?
Efficiency - Can be efficient and compact in transmission of data. Since you define your own application layer protocol, Can be anything from binary to string to XML to anything else you can dream of.
HTTP for example, is built on top of TCP, in layman terms, mostly using Key Value pair style request headers.. vs SOAP, where much of information is passed as message envelope and message body (Which is why SOAP can be over HTTP as well as other protocols like Message Queues)
Performance - Given the possibility of having very compact application layer protocol, it can be relatively fast as well. For really high throughput, high performance, latency sensitive intranet applications, this can be a deciding factor.
Development Effort - Along with the flexibility, you certainly end up writing more code, as you attempt to define and implement your own application layer protocol.
HTTP Service
Larger parts of application protocol are defined for you - You design your application over well defined HTTP protocol. Typically HTTP Get would mean querying for a resource. Query filters in request url can be used for searches. HTTP POST, PUT and DELETE similarly have specific, well defined semantics.
Error / Fault handling - Even error are indicated using standards defined in HTTP protocol.. Like Status Code 200 (Success) vs 400 (BadRequest).
Efficiency - Can be quite verbose. Protocols defines almost every aspect of how the request must be defined.. and is typically text based..
Development and Tools support - HTTP can make it easier to use existing, vast variety of tools to send, receive and debug requests (Fiddler or Charles Proxy are famous HTTP debugging tools).
Internet / Firewall Friendly - HTTP is typically used at port 80 (although in theory can be other port as well). Which makes it more suitable not only for intranet applications, where you may have more control over firewalls and ports you open.. but also for accessing those services over Internet, because port 80 is typically open on almost every machine in the world...
Co-existence of multiple services - HTTP is so widely used, that it is expected multiple applications / services on a given machine to use it.. OS typically have special support built into the OS to handle this (http.sys on Windows) and you don't have to worry about one application / service stepping on another, by accidentally using the same port (one will fail in such case). Port negotiation between client and server is typically not an issue in this case, because HTTP is expected to be at port 80.
Securing the communication channel - When it comes to securing the communication, again there is well defined way to establish the same.. i.e. HTTPS. Unlike TCP/IP based service, you don't have to invent your own scheme to encrypt the communication between client and server.
Hosting the service - In theory, there are more ways to host an HTTP service, than a TCP service, again due to HTTP web applications already being a common scenario, which web servers like IIS already cater to. Your HTTP service can take advantage of countless out of the box features which web servers like IIS already have.. Recycling, Authentication, Resource Management, Request Filtering, Caching, Dynamic Compression and Logging etc etc etc.. you get for free with HTTP services hosted on any of the mature web server products.
Interoperability Across Platforms / Technology stacks - With HTTP, it would be far easier to use a mix of any technology stack, again because the implementation of the Protocol will be typically supported on various platforms.. from Linux / Unix to Windows.. or from .Net to Java to Ruby.. You'll get benefit from existing tools and technologies present on these platforms which support HTTP.. Hence Http can be the de facto choice, if, for example, you expect server to be in .Net on Windows, but clients to be in Java on Unix.
I could go on.. This is by no means an exhaustive list, and I am sure that many others could add plenty more to this.. But hopefully this gives you a good idea for what you were looking.. One can clearly see, that this can be a very deep topic.. Based on your response and time, I may edit this answer in future.. or encourage others to update it, as they see fit.
Side note - It is interesting to note, that even though HTTP adds plenty over TCP/IP to make it a great and ubiquitous choice for application protocol.. There is always scope for more / higher level abstraction.. So much so that, there are other, newer service protocols, which are built on top of HTTP. For example - Odata. Look at OData if you are curious..
And of course, in todays world of services, the discussion will not be complete without the mention of REST.
EDIT: Another interesting side note - If you are building on Windows platform, and using .Net framework, there are frameworks like Windows Communication Foundation a.k.a. WCF, which try to provide such abstractions, that you can swap out your choice of communication protocol (Client and Server choice must still match), from HTTP to TCP to MSMQ to IPC etc, with mere configuration changes, or host same service over multiple communication protocols by creating multiple endpoints. Refer to Understanding various types of WCF bindings for high level overview and comparison of various, out of the box, options WCF provides.
When working with TCP/IP and protocols layered on top of it, I would take the 7-layer model with a grain of salt. The true number of layers will differ, and will not match up with the classic OSI model.
For instance, HTTP is built on top of the TELNET protocol, which is layered on top of TCP. Does that make TELNET a Presentation-layer protocol? No, it's an Application-layer protocol that happens to have another Application-layer protocol built on top of it.
And then we run SOAP over HTTP. Or, if we want, we can run SOAP over TCP/IP. So what layer is SOAP? Is that layer 8 or is that layer 9?
As You asked, I'll try to explain by analogy, while not repeating previous answers too much.
Let's say we have helpdesk (service) reachable by phone call (TCP) and by SMS (HTTP). From Your (application) point of view You should get the same information independent of which communication method You chose. But there are differencies how this communication will be going, because phone call (TCP) is statefull channel, while SMS (HTTP) is stateless:
once phone call is established, information exchange will continue until hang'up;
SMS message must contain all relevant information to get a usefull response.
To introduce state into SMS channel, additional steps at helpdesk level are required, for example, You'll be assigned ticket number, which You must send with each related SMS (HTTP cookie/session) - this won't be handled authomatically by GSM network. This state is handled by helpdesk's and Your (service and application) logic.
Both service types have advantages and pitfalls. And both should work - preferance depends on actual use-case.
There is no too much difference what means are used to exchange data (You can even exchange mails using post office, if latency is acceptable). In practice it means You can use ping (ICMP) or DNS queries, or emails to exchange data - as long as Your application knows how to use/decode such channel.
I think John Saunders in his answer refered to 7 layer OSI model, an I think his point is correct.
This analogy is not 100% correct, I tried to explain the idea: the difference is how the state is preserved (by protocol itself, or by application/framework).

protocol buffers and actual transport options - sockets or middleware

I am developing a client/server application for which I am evaluating a few options for the communications layer.
As part of this communication framework, I am contemplating using google's protocol buffer (PB) for representation of the transport data instead of re-inventing my own binary structure.
Now coming on to the actual transport, I am wondering if one should use plain sockets to send/receive these binary messages or use some form of middleware. Using a middleware has certain obvious advantages over sockets. A few that I care about include: communication models - publish/subscribe, request/response and fail over.
On the other hand, using sockets has the advantage of low overhead compared to the middleware approach and will deliver better performance.
One can also think of using the RPC libraries available with protocol buffers (third party add-ons on google's protocol buffer wiki) to communicate between the client and the server. Though it abstracts from the low level socket, it still does not support the middleware features.
At the moment, my client is an Adobe Flex GUI and two server side processes (One java and another C++). In the future, the client and server side can potentially have other services developed in other languages as well such as .NET
What do the experts feel about these choices and from experience what works well without compromising on performance. Are there other alternatives that developers go with?
Thanks
Dece
Unless it's a learning exercise, you absolutely, positively must use some middleware. There are lots of choices: AMQP, ZeroMQ, XMPP, Comet/Bayeux.
For your scenario, you probably want something web-based, so XMPP over HTTP may be a good option. However, I'm partial to Comet (though I have found Bayeux to be too complex for my needs).
Which platform?
If Windows then I have a free, simple, high performance, IO completion based pluggable server platform called WASP which is available from here.
Simply write a DLL or two and plug them in and you're done with the networking.
I don't currently have a protocol buffers based plugin example but it's on my list of things to do...

Client and server

I would like to create a connection between two applications. Should I be using Client-Server or is there another way of efficiently communicating between one another? Is there any premade C++ networking client server libraries which are easy to use/reuse and implement?
Application #1 <---> (Client) <---> (Server) <---> Application #2
Thanks!
Client / server is a generic architecture pattern (much like factory, delegation, inheritance, bridge are design patterns). What you probably want is a library to eliminate the tedium of packing and unpacking your data in a format that can be sent over the wire. I strongly recommend you take a look at the protocol buffers library, which is used extensively at Google and released as open source. It will automatically encode / decode data, and it makes it possible for programs written in different languages to send and receive messages of the same type with all the dirty work done for you automatically. Protobuf only deals with encoding, not actually sending and receiving. For that, you can use primitive sockets (strongly recommend against that) or the Boost.Asio asynchronous I/O library.
I should add that you seem to be confused about the meaning of client and server, since in your diagram you have the application talking to a client which talks to a server which talks to another application. This is wrong. Your application is the client (or the server). Client / server is simply a role that your application takes on during the communication. An application is considered to be a client when it initiates a connection or a request, while an application is considered to be a server when it waits for and processes incoming requests. Client / server are simply terms to describe application behavior.
If you know the applications will be running on the same machine, you can use sockets, message queues, pipes, or shared memory. Which option you choose depends on a lot of factors.
There is a ton of example code for any of these strategies as well as libraries that will abstract away a lot of the details.
If they are running on different machines, you will want to communicate through sockets.
There's a tutorial here, with decent code samples.

Advice on web services without HTTP

My company is planning on implementing a remote programming tool to configure embedded devices in the field. I assumed that these devices would have an HTTP client on them, and planned to implement some REST services for them to access. Unfortunately, I found out that they have a TCP stack but no HTTP client. One of my co-workers suggested that we try to send “soap packets” over port 80 without an HTTP client. The devices also don’t have any SOAP client. Is this possible? Would there be implications if there was a web server running on the network the devices are connected to? I’d appreciate any advice or best practices on how to implement something like this.
If your servers are serving simple files, the embedded devices really only need to send an HTTP GET request (possibly with a little extra data identifying the device, so the server can know which firmware version to send).
From there, it's pretty much a simple matter of reading the raw data coming in on the embedded device's socket -- you might need to only disregard the HTTP header on the response, or you could possibly configure your server to not send it for those requests.
you don't really need an HTTP client per-se. HTTP is a very simple text-based protocol that you can implement yourself if you need to.
That said, you probably won't need to implement it yourself. If they have a TCP stack and a standard sockets library, you can probably find a simple C library (such as this one) that wraps up HTTP or SOAP functionality for you. You could then just build that library into your application.
Basic HTTP is not a particularly difficult protocol to implement by hand. It's a text and line based protocol, save for the payload, and the servers work quite well with "primitive, ham fisted" clients, which is all a simple client needs to be.
If you can use just a subset, likely, then simply write it and be done.
You can implement a trivial http client over sockets (here is an example of how to do it in ruby: http://www.tutorialspoint.com/ruby/ruby_socket_programming.htm )
It probably depends what technology you have available on your embedded devices - if you can easily consume JSON or XML then a webservice approach using the above may work for you.