Is or isn't SOAP a protocol? - web-services

According to the Wikipedia page, SOAP is a protocol specification.
What does it mean? Aren't all protocols specifications?
In this answer, the author says that SOAP
is a protocol (or at least tries to be)
Tries to be? In the sense that it's not agreed upon?

SOAP is a protocol according to its specification, and Wikipedia describes it as one too.
From Wikipedia:
SOAP (originally Simple Object Access Protocol) is a messaging protocol specification for exchanging structured information in the implementation of web services in computer networks. Its purpose is to induce extensibility, neutrality and independence. It uses XML Information Set for its message format, and relies on application layer protocols, most often Hypertext Transfer Protocol (HTTP) or Simple Mail Transfer Protocol (SMTP), for message negotiation and transmission.
From an English dictionary, under "protocol":
(specialized, computing) a computer language allowing computers that are connected to each other to communicate
In networking terminology, a protocol is defined as follows:
In information technology, a protocol is the special set of rules that end points in a telecommunication connection use when they communicate. Protocols specify interactions between the communicating entities. It could exist at various levels, e.g. There are protocols for the data interchange at the hardware device level and protocols for data interchange at the application program level.
Taking these three definitions together, SOAP is an application-level communication protocol.
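To make the "application-level protocol" point concrete, here is a minimal sketch of a SOAP 1.1 message being built and read back before it is handed to any transport. The operation name `GetPrice`, its namespace and the `item` parameter are made up for illustration; only the envelope namespace comes from the SOAP 1.1 specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/prices"                    # hypothetical application namespace


def build_soap_request(operation: str, params: dict) -> bytes:
    """Wrap an application-defined call in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_soap_request(payload: bytes):
    """Recover the operation name and parameters from the envelope."""
    envelope = ET.fromstring(payload)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]  # first child of Body is the call
    operation = call.tag.split("}", 1)[1]           # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


payload = build_soap_request("GetPrice", {"item": "apple"})
print(parse_soap_request(payload))
```

The same bytes could be sent in an HTTP POST body or in an SMTP message; the rules for the message format are separate from whichever lower-level protocol carries it.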

Understanding the characteristics of Apache Thrift versus TCP socket servers?

I have some experience writing TCP socket servers using OpenSSL and I'm looking to understand more about Apache Thrift. I've looked over some basic examples of Thrift servers, and I understand that Thrift can handle pipes.
Can someone provide a simple explanation of the ways a Thrift server differs from a TCP server (apart from the use of pipes)?
And do Thrift frameworks use a different transport protocol?
I know this is a straightforward question, but I can't seem to find a beginner level explanation.
[...] ways a Thrift server differs from a TCP Server?
Thrift is at least one abstraction layer above raw sockets. It provides you with an abstraction which allows you to send and receive information across any medium, one of which could be TCP sockets.
The underlying transport medium itself is not important, nor is the protocol used (binary, compact, JSON ... you name it). Both are entirely transparent to the rest of your application.
In other words, you develop against a type-safe service API, rather than programming sockets or parsing JSON on your own.
Instead of fiddling around with bytes, encodings and fighting the subtleties of sockets, Thrift allows you to focus on what you want to do with that Service as a client and/or implementing the server-side logic.
Plus you may change transport and/or protocol as needed without affecting the rest of the code.
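A rough sketch of that idea in plain Python follows. It does not use the Thrift library or its API; `JsonProtocol`, `BinaryProtocol`, `CalculatorHandler` and `call` are hypothetical names that only illustrate how the wire encoding can be swapped while the service logic stays unchanged.

```python
import json
import struct


class JsonProtocol:
    """Encodes a call as JSON text."""

    def encode(self, method, args):
        return json.dumps({"method": method, "args": args}).encode("utf-8")

    def decode(self, data):
        msg = json.loads(data.decode("utf-8"))
        return msg["method"], msg["args"]


class BinaryProtocol:
    """Encodes a call as: 1-byte name length, the name, then two 32-bit ints."""

    def encode(self, method, args):
        name = method.encode("ascii")
        return struct.pack(f"!B{len(name)}sii", len(name), name, *args)

    def decode(self, data):
        name_length = data[0]
        name, a, b = struct.unpack(f"!{name_length}sii", data[1:])
        return name.decode("ascii"), [a, b]


class CalculatorHandler:
    """Server-side logic; it never sees bytes or encodings."""

    def add(self, a, b):
        return a + b


def call(protocol, handler, method, *args):
    """Simulate one round trip: encode on the client, decode and dispatch on the server."""
    wire_bytes = protocol.encode(method, list(args))    # client side
    name, decoded_args = protocol.decode(wire_bytes)    # server side
    return getattr(handler, name)(*decoded_args)


handler = CalculatorHandler()
for protocol in (JsonProtocol(), BinaryProtocol()):
    print(type(protocol).__name__, call(protocol, handler, "add", 2, 3))
```

In real Thrift the equivalent classes are generated from the IDL file, so neither the client code nor the handler has to be written against a particular encoding.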
Thrift is a higher-level implementation on top of the TCP transport layer. It provides two important elements:
serialization
RPC protocol
From my personal point of view, among the main benefits is type safety, especially between different languages.
Manually writing TCP (socket) solutions for sophisticated cases seems quite hard.
Because the code is generated from a formal definition (IDL in *.thrift files, sometimes called a schema), it shares similarities with WSDL/SOAP, but boasts higher performance. I use it on mobile clients etc.
The trendiest option in recent years is JSON over REST; as far as I know, most of that code has to be written by hand. JSON doesn't have a schema in any official standard, though that may change in the future.
Acronyms I have used are generally known. My answer is very general and simplified, professionals please forgive me.

M2M vs Web service

What is the difference between the term Machine to Machine (M2M) communication and a Web service?
The W3C defines a Web service as
a software system designed to support interoperable machine-to-machine interaction over a network.
Wiki defines M2M communication as a
technologies that allow both wireless and wired systems to communicate with other devices of the same type
That sounds to me like different terms for the same thing.
SOAP, REST, etc. can be used to implement both Web services and M2M communication.
But what is the difference between M2M and a Web service? Is it just that M2M is used in the context of an industrial environment, while everything else (consumer or financial applications) is a Web service?
In my opinion, M2M implies a lower level of communication and, if I may, a 'lower level' of data.
I think the formal distinction comes later in the definition:
... It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.
It is a service on the Web.
So a typical web service operates over HTTP, assumes a machine-processable description and implies use of certain technologies.
M2M, on the other hand, operates over a wide range of protocols that are lower-level than HTTP, and is subject to constraints that differ from those of web services, e.g. low energy consumption, a constant data feed (instead of on-demand) etc.
Also, to me web services include a human component: somewhere down the road there is a person consuming the data obtained from a web service, while in the case of M2M a human consumer is less expected. A goal of M2M communication might be to sync up an array of machines or have a machine make a decision based on the data it obtained from another machine.
Most M2M devices rely on pure Java or SIM cards to send and receive data to each other or to backend systems. SIM cards from a mobile network operator (MNO) provide more convenient communication services with a mobile data option. They do not come with much memory, though, and due to the nature of the SIM, the applets running on them must be minimalist.
If you look at the OTA (over-the-air) specifications that define remote management of SIMs and SIM applications, you will see that there is always a message command header and a payload area encoded as TLV byte streams (aka APDU). The receiving peer answers each message with a positive or negative acknowledgement. Since M2M applications are mostly integrated into the mobile network through the SIM, they apply the same policy and define their own raw communication protocol over an SMS or TCP/IP bearer.
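As a rough illustration of the TLV (tag-length-value) idea, here is a deliberately simplified sketch that uses one byte for the tag and one byte for the length. Real SIM/OTA encodings are defined by their own specifications and are more involved; the tag values `0x01` and `0x02` below are made up.

```python
def encode_tlv(tag: int, value: bytes) -> bytes:
    """Encode one item as tag byte, length byte, then the value bytes."""
    if not 0 <= tag <= 0xFF or len(value) > 0xFF:
        raise ValueError("this simplified format uses one byte each for tag and length")
    return bytes([tag, len(value)]) + value


def decode_tlvs(data: bytes):
    """Walk a byte stream and return a list of (tag, value) pairs."""
    items = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tag, length = data[offset], data[offset + 1]
        value = data[offset + 2: offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        items.append((tag, value))
        offset += 2 + length
    return items


# Hypothetical message: tag 0x01 carries a command, tag 0x02 carries its payload.
message = encode_tlv(0x01, b"READ") + encode_tlv(0x02, b"\x10\x20")
print(decode_tlvs(message))
```

Because every field announces its own length, a constrained device can parse the stream in a single pass with very little memory.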

Transport layer Services and Application Layer services

I am working with web services right now. We have two types of services, one over HTTP and the other over TCP. I am trying to understand the difference between the two. As I understand it, services over TCP work at the transport layer, i.e. they transfer data directly between two endpoints. But I am not so clear on services over HTTP. I know we have a client-server model, REST and SOAP, and that HTTP is the protocol that transmits the data, but I am not able to properly relate these to the concept of services over HTTP!
Can anyone please help with an analogy which explains the difference between the two?
As John Saunders is trying to allude to, I would agree that it is more important to understand the abstractions these protocols provide than the specific "layer" they are assigned in a certain model (OSI). While the general model helps and applies, it doesn't provide specific details for actual protocols.
Having said that, the difference between so-called Transport Layer Services using TCP and Application Layer Services using HTTP, IMHO, boils down to a comparison between TCP and HTTP themselves.
I'll start by saying that I hope it is known to anyone even vaguely familiar with these protocols that HTTP is a higher-level abstraction than TCP and in fact relies on TCP/IP itself. Hence HTTP clearly inherits certain features, like reliability, from TCP/IP.
Now the contrast -
TCP Service
Design your own application level protocol - You design your own application-level protocol. For example, how will the client request an operation to add an employee? How will the client request to find a given employee? How do you indicate the format in which data is exchanged between client and server? How will you even distinguish metadata (like request information) from data?
Efficiency - Can be efficient and compact in transmission of data. Since you define your own application layer protocol, the format can be anything from binary to string to XML to anything else you can dream of.
HTTP, for example, is built on top of TCP and, in layman's terms, mostly uses key-value style request headers. SOAP, by contrast, passes much of its information in a message envelope and message body (which is why SOAP can run over HTTP as well as over other transports such as message queues).
Performance - Given the possibility of having very compact application layer protocol, it can be relatively fast as well. For really high throughput, high performance, latency sensitive intranet applications, this can be a deciding factor.
Development Effort - Along with the flexibility, you certainly end up writing more code, as you attempt to define and implement your own application layer protocol.
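To give a feel for what "designing your own application-level protocol" involves, here is a minimal sketch of one possible framing scheme. TCP delivers a byte stream without message boundaries, so the sketch prefixes every message with an opcode and a length. The opcodes and the employee payloads are hypothetical.

```python
import struct

# Hypothetical opcodes for a made-up employee service.
OP_ADD_EMPLOYEE = 1
OP_FIND_EMPLOYEE = 2

# Header: 1-byte opcode, 4-byte payload length, network byte order.
HEADER = struct.Struct("!BI")


def encode_frame(opcode: int, payload: bytes) -> bytes:
    """Prefix the payload with the metadata the receiver needs to parse it."""
    return HEADER.pack(opcode, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split received bytes into (opcode, payload) frames.

    Returns the complete frames plus any leftover bytes that belong to a
    frame which has not fully arrived yet.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        opcode, length = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        if len(buffer) - start < length:
            break  # incomplete frame, wait for more data
        frames.append((opcode, buffer[start:start + length]))
        offset = start + length
    return frames, buffer[offset:]


stream = encode_frame(OP_ADD_EMPLOYEE, b"Alice") + encode_frame(OP_FIND_EMPLOYEE, b"Alice")
frames, leftover = decode_frames(stream + b"\x01\x00")  # two full frames plus a partial header
print(frames, leftover)
```

The header is only five bytes, which shows where the compactness comes from, but error reporting, versioning and security would all still have to be designed on top of it.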
HTTP Service
Larger parts of application protocol are defined for you - You design your application over the well-defined HTTP protocol. Typically, an HTTP GET queries for a resource, and query filters in the request URL can be used for searches. HTTP POST, PUT and DELETE similarly have specific, well-defined semantics.
Error / Fault handling - Even errors are indicated using standards defined in the HTTP protocol, such as status code 200 (OK) vs 400 (Bad Request).
Efficiency - Can be quite verbose. The protocol defines almost every aspect of how the request must be formed, and is typically text-based.
Development and Tools support - HTTP can make it easier to use existing, vast variety of tools to send, receive and debug requests (Fiddler or Charles Proxy are famous HTTP debugging tools).
Internet / Firewall Friendly - HTTP typically runs on port 80 (although it can use other ports as well). That makes it suitable not only for intranet applications, where you may have more control over firewalls and the ports you open, but also for services accessed over the Internet, because most firewalls allow traffic to port 80.
Co-existence of multiple services - HTTP is so widely used that multiple applications / services on a given machine are expected to use it. Operating systems typically have built-in support to handle this (http.sys on Windows), so you don't have to worry about one application / service stepping on another by accidentally using the same port (one would fail in such a case). Port negotiation between client and server is typically not an issue either, because HTTP is expected to be at port 80.
Securing the communication channel - When it comes to securing the communication, again there is well defined way to establish the same.. i.e. HTTPS. Unlike TCP/IP based service, you don't have to invent your own scheme to encrypt the communication between client and server.
Hosting the service - In theory, there are more ways to host an HTTP service, than a TCP service, again due to HTTP web applications already being a common scenario, which web servers like IIS already cater to. Your HTTP service can take advantage of countless out of the box features which web servers like IIS already have.. Recycling, Authentication, Resource Management, Request Filtering, Caching, Dynamic Compression and Logging etc etc etc.. you get for free with HTTP services hosted on any of the mature web server products.
Interoperability Across Platforms / Technology stacks - With HTTP, it would be far easier to use a mix of any technology stack, again because the implementation of the Protocol will be typically supported on various platforms.. from Linux / Unix to Windows.. or from .Net to Java to Ruby.. You'll get benefit from existing tools and technologies present on these platforms which support HTTP.. Hence Http can be the de facto choice, if, for example, you expect server to be in .Net on Windows, but clients to be in Java on Unix.
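For comparison with a hand-rolled TCP protocol, here is a small sketch of an HTTP service built on Python's standard `http.server` module. The `/employees/<id>` URL scheme and the in-memory `EMPLOYEES` data are assumptions made for this example; the request parsing, headers and status codes are all supplied by HTTP itself.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

EMPLOYEES = {"1": {"name": "Alice"}}  # hypothetical in-memory data


class EmployeeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # GET /employees/<id> reads a resource; the status code carries the outcome.
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "employees" and parts[1] in EMPLOYEES:
            self._reply(200, EMPLOYEES[parts[1]])
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status(port, path):
    """Issue one GET request against the local server and return the status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), EmployeeHandler)  # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
found = fetch_status(server.server_port, "/employees/1")
missing = fetch_status(server.server_port, "/employees/99")
server.shutdown()
server.server_close()
print(found, missing)
```

Nothing here defines framing, header syntax or error signalling; any HTTP client, proxy or debugging tool can talk to this service without knowing anything else about it.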
I could go on.. This is by no means an exhaustive list, and I am sure that many others could add plenty more to this.. But hopefully this gives you a good idea for what you were looking.. One can clearly see, that this can be a very deep topic.. Based on your response and time, I may edit this answer in future.. or encourage others to update it, as they see fit.
Side note - It is interesting that even though HTTP adds plenty over TCP/IP to make it a great and ubiquitous choice of application protocol, there is always scope for a higher level of abstraction. So much so that there are other, newer service protocols built on top of HTTP, for example OData. Look at OData if you are curious.
And of course, in today's world of services, the discussion would not be complete without a mention of REST.
EDIT: Another interesting side note - If you are building on Windows platform, and using .Net framework, there are frameworks like Windows Communication Foundation a.k.a. WCF, which try to provide such abstractions, that you can swap out your choice of communication protocol (Client and Server choice must still match), from HTTP to TCP to MSMQ to IPC etc, with mere configuration changes, or host same service over multiple communication protocols by creating multiple endpoints. Refer to Understanding various types of WCF bindings for high level overview and comparison of various, out of the box, options WCF provides.
When working with TCP/IP and protocols layered on top of it, I would take the 7-layer model with a grain of salt. The true number of layers will differ, and will not match up with the classic OSI model.
For instance, the FTP control connection follows the TELNET protocol, which is layered on top of TCP. Does that make TELNET a Presentation-layer protocol? No, it's an Application-layer protocol that happens to have another Application-layer protocol built on top of it.
And then we run SOAP over HTTP. Or, if we want, we can run SOAP over TCP/IP. So what layer is SOAP? Is that layer 8 or is that layer 9?
As you asked, I'll try to explain by analogy, without repeating previous answers too much.
Let's say we have a helpdesk (service) reachable by phone call (TCP) and by SMS (HTTP). From your (application) point of view, you should get the same information regardless of which communication method you choose. But there are differences in how the communication proceeds, because a phone call (TCP) is a stateful channel, while SMS (HTTP) is stateless:
once a phone call is established, the information exchange continues until you hang up;
an SMS message must contain all the relevant information needed to get a useful response.
To introduce state into the SMS channel, additional steps at the helpdesk level are required. For example, you'll be assigned a ticket number, which you must send with each related SMS (like an HTTP cookie/session); the GSM network won't handle this automatically. This state is handled by the helpdesk's and your (service and application) logic.
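The ticket-number idea can be sketched in a few lines. This is only a model of the analogy, not real HTTP session handling; the `Helpdesk` class and the message fields are invented for the example.

```python
import uuid


class Helpdesk:
    """Service behind a stateless channel: every message must carry its own context."""

    def __init__(self):
        self._tickets = {}  # ticket id -> messages; the state lives in the service

    def handle(self, message: dict) -> dict:
        ticket = message.get("ticket")
        if ticket is None:
            ticket = uuid.uuid4().hex  # first contact: issue a ticket number
            self._tickets[ticket] = []
        elif ticket not in self._tickets:
            return {"error": "unknown ticket"}
        self._tickets[ticket].append(message["text"])
        return {"ticket": ticket, "messages_so_far": len(self._tickets[ticket])}


desk = Helpdesk()
first = desk.handle({"text": "My printer is on fire"})
second = desk.handle({"ticket": first["ticket"], "text": "Still burning"})
stranger = desk.handle({"text": "Hello"})  # no ticket, so it is treated as a new conversation
print(second["messages_so_far"], stranger["messages_so_far"])
```

The channel itself remembers nothing between messages; the conversation continues only because the client sends the ticket back each time, which is the role a cookie or session ID plays over HTTP.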
Both service types have advantages and pitfalls. And both should work; the preference depends on the actual use case.
It does not matter much which means are used to exchange data (you can even exchange letters via the post office, if the latency is acceptable). In practice this means you can use ping (ICMP), DNS queries, or emails to exchange data, as long as your application knows how to use and decode such a channel.
I think John Saunders in his answer referred to the 7-layer OSI model, and I think his point is correct.
This analogy is not 100% correct; I tried to explain the idea: the difference is in how state is preserved (by the protocol itself, or by the application/framework).

How can a SOAP-based service be seen as a special case of a REST-style service?

A quote from the book Java Web Services: Up and Running, Second Edition:
"At present, the distinction between the two flavours of web service is
not sharp,
because a SOAP-based service delivered over HTTP can be seen as a special case
of a REST-style service;"
How?
I believe the writer's statement is incorrect.
What is SOAP?
According to wikipedia:
SOAP can form the foundation layer of a web services protocol stack,
providing a basic messaging framework upon which web services can be
built. This XML based protocol consists of three parts: an envelope,
which defines what is in the message and how to process it, a set of
encoding rules for expressing instances of application-defined
datatypes, and a convention for representing procedure calls and
responses. SOAP has three major characteristics: Extensibility
(security and WS-routing are among the extensions under development),
Neutrality (SOAP can be used over any transport protocol such as HTTP,
SMTP, TCP, or JMS) and Independence (SOAP allows for any programming
model).
As you can see, nothing in this description of SOAP takes any ideological stance on how your API calls must be structured (URL-wise). Of course, SOAP uses XML, and XML can have a data structure that essentially works as the rule set of your API call... that's cool.
In contrast, we have REST.
What is REST?
According to wikipedia:
The REST architectural style describes the following six constraints
applied to the architecture, while leaving the implementation of the
individual components free to design:
Client–server: Servers are not concerned with the user interface or user state, so that servers can be simpler and more scalable.
Stateless: The client–server communication is further constrained by no client context being stored on the server between requests.
Cacheable: Responses must, implicitly or explicitly, define themselves as cacheable, or not, to prevent clients reusing stale or inappropriate data in response to further requests.
Layered system: A client cannot ordinarily tell whether it is connected directly to the end server, or to an intermediary along the way. Intermediary servers may improve system scalability by enabling load-balancing and by providing shared caches.
Code on demand (optional): Servers can temporarily extend or customize the functionality of a client by the transfer of executable code.
Uniform interface: The uniform interface between clients and servers, discussed below, simplifies and decouples the architecture, which enables each part to evolve independently. (i.e. HTTP GET, POST, PUT, PATCH, DELETE)
Comparison
In my mind, it shouldn't be described as SOAP vs REST; it should be RPC vs REST. RPC stands for remote procedure call, which basically means that every single function of your API gets its own distinct endpoint. So REST can do with 1 URL what RPC does with 7. SOAP is RPC (right?)
Yes, both are web services.
But saying that an RPC API is RESTful-ish because it's transmitted over HTTP is hardly grounds to say they are similar... from the detailed information above, you can see that REST takes a much more ideological approach to the structure, transfer, purpose, scalability, and state of your service, whereas SOAP doesn't really talk about those things, and presumably the developer can choose to do, or not do, those things.
In conclusion, more context is needed for me to really understand what point the author was trying to make. An RPC API can be similar to REST if you make it do RESTful things.. but that is really circumstantial, isn't it?
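The RPC-versus-REST contrast can be sketched with two tiny dispatchers over the same in-memory data. The endpoint names (`/createUser`, `/users/<id>`) and return conventions are invented for this example and are not taken from the book or from any SOAP toolkit.

```python
users = {}


def rpc_dispatch(endpoint, payload):
    """RPC style: each operation has its own endpoint; the verb lives in the URL."""
    if endpoint == "/createUser":
        users[payload["id"]] = payload["name"]
        return "created"
    if endpoint == "/getUser":
        return users.get(payload["id"])
    if endpoint == "/deleteUser":
        users.pop(payload["id"], None)
        return "deleted"
    raise ValueError(f"unknown endpoint {endpoint}")


def rest_dispatch(method, url, body=None):
    """REST style: one resource URL; the HTTP method selects the operation."""
    prefix = "/users/"
    if not url.startswith(prefix):
        return 404, None
    user_id = url[len(prefix):]
    if method == "PUT":
        status = 200 if user_id in users else 201  # 201 Created for a new resource
        users[user_id] = body["name"]
        return status, None
    if method == "GET":
        if user_id in users:
            return 200, {"name": users[user_id]}
        return 404, None
    if method == "DELETE":
        users.pop(user_id, None)
        return 204, None
    return 405, None  # method not allowed on this resource


rpc_dispatch("/createUser", {"id": "7", "name": "Ann"})
print(rpc_dispatch("/getUser", {"id": "7"}))
print(rest_dispatch("PUT", "/users/8", {"name": "Bob"}))
print(rest_dispatch("GET", "/users/8"))
```

Both dispatchers could sit behind HTTP, which is why the transport alone says little about whether an API is RESTful.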

Messaging middleware that uses HTTP as a transport

I'm looking for options that would allow a client to receive messages (push notifications) from a server. The client is an ARM/Linux embedded device similar in capabilities to a Raspberry Pi.
Because the client could be behind a firewall, I'd like to use message-oriented middleware (MOM) that can transport on top of HTTP. I think that rules out MOMs that are based on AMQP.
The MOM server should support the Linux platform. The MOM should also provide a C or C++ client library that can be compiled on an ARM/Linux platform.
I am aware of the HTTP long polling technique, as well as HTML 5 WebSockets and Server-Sent Events. But I'd prefer a higher-level (yet lightweight) solution that takes care of transporting messages between point A and point B over HTTP. It doesn't matter much if the messages have to be formatted as XML, JSON, plain text, or binary.
Two that I have used successfully are XML-RPC and gSOAP.
XML-RPC:
It's a spec and a set of implementations that allow software running on disparate operating systems, running in different environments to make procedure calls over the Internet.
It's remote procedure calling using HTTP as the transport and XML as the encoding. XML-RPC is designed to be as simple as possible, while allowing complex data structures to be transmitted, processed and returned.
gSOAP:
The gSOAP toolkit is a C and C++ software development toolkit for SOAP/XML Web services and generic (non-SOAP) C/C++ XML data bindings. The toolkit analyzes WSDLs and XML schemas (separately or as a combined set) and maps the XML schema types and the SOAP messaging protocols to easy-to-use and efficient C and C++ code. It also supports exposing (legacy) C and C++ applications as SOAP/XML Web services by auto-generating XML serialization code and WSDL specifications. Or you can simply use it to automatically convert XML to/from C and C++ data. The toolkit supports options to generate pure ANSI C or C++ with or without STL.
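To show what "HTTP as the transport and XML as the encoding" means for XML-RPC, here is a short sketch using Python's standard `xmlrpc.client` module purely for illustration (the question asks for C/C++, and the same wire format applies there). The method name `inventory.count` and its arguments are made up.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote procedure `inventory.count`.
request_xml = xmlrpc.client.dumps(("widgets", 3), methodname="inventory.count")
print(request_xml)

# The receiving side decodes the same XML back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

In real use, this XML document is what gets sent as the body of an HTTP POST; in Python, `xmlrpc.client.ServerProxy` does the encoding and the POST for you.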
According to my research, these are the available MOM technologies that use HTTP as a transport, and which feature a C/C++ client library:
XMPP
Extensible Messaging and Presence Protocol (XMPP) is a communications
protocol for message-oriented middleware based on XML (Extensible
Markup Language). The protocol was originally named Jabber, and
was developed by the Jabber open-source community in 1999 for near
real-time, instant messaging (IM), presence information, and contact
list maintenance. Designed to be extensible, the protocol has also
been used for publish-subscribe systems, signalling for VoIP, video,
file transfer, gaming, Internet of Things applications such as the
smart grid, and social networking services. (from Wikipedia)
ActiveMQ
Apache ActiveMQ is an open source message broker written in Java
together with a full Java Message Service (JMS) client. It provides
"Enterprise Features" which in this case means fostering the
communication from more than one client or server. Supported clients
include the obvious Java via JMS 1.1 as well as several other "cross
language" clients. The communication is managed with features such as
computer clustering and ability to use any database as a JMS
persistence provider besides virtual memory, cache, and journal
persistency. (from Wikipedia)
Zerogw
Zerogw is a http to zeromq gateway. Which means it listens HTTP,
parses request and sends it using zeromq socket (ZMQ_REQ). Then waits
for the reply and responds with data received from zeromq socket.
Starting with v0.3 zerogw also supports WebSockets. Websockets are
implemented by forwarding incoming messages using ZMQ_PUB socket, and
listening commands from ZMQ_SUB socket. Each WebSocket client can be
subscribed to unlimited number of topics. Each zeromq message is
either control message (e.g. subscription) or message to a specified
topic which will be efficiently sent to every WebSocket subscribed to
that particular topic. (from the GitHub zerogw page)
There's also the HyperText InterORB Protocol (HTIOP), but TAO seems to be the only CORBA ORB that supports it. There doesn't seem to be anyone using it (please correct me if I'm wrong).
There is work in progress to make OMG's Data Distribution Service (DDS) web-enabled.
I'm also warming up to the idea of using WebSockets for bidirectional communications, despite their "low level" nature. For those interested, available C/C++ libraries include:
libwebsockets
websocket++
QWebSockets
Wt (C++ web development toolkit which includes support for WebSockets)
There is an open Websocket Application Messaging Protocol (WAMP) that provides asynchronous messaging patterns for remote procedure calls and the publish-subscribe pattern. There are a number of implementations for WAMP, but none of them are written in C/C++.