How do I get through proxy server environments for non-standard services? - c++

I'm not real hip on exactly what role(s) today's proxy servers can play and I'm learning so go easy on me :-) I have a client/server system I have written using a homegrown protocol and need to enhance the client side to negotiate its way out of a proxy environment.
I have an existing client and server system written in C and C++ for the speed and a small amount of MFC in the client to handle the user interface. I have written both the server and client side of the system on Windows (the people I work for are mainly web developers using Windows everything - not a choice) sticking to Berkeley Sockets as it were via wsock32 for efficiency. The clients connect to the server through a nonstandard port (even though using port 80 is an option to get out of some environments but the protocol that goes over it isn't HTTP). The TCP connection(s) stay open for the duration of the client's participation in real time conferences.
Our customer base is expanding to all kinds of networked environments. I have been able to solve a lot of problems by adding the ability to connect securely over port 443 and using secure sockets which allows the protocol to pass through a lot of environments since the internal packets can't be sniffed. But more and more of our customers are behind a proxy server environment and my direct connections don't make it through. My old school understanding of proxy servers is that they act as a proxy for external HTML content over HTTP, possibly locally caching popular material for faster local access, and also allowing their IT staff to blacklist certain destination sites. Customers are complaining that my software doesn't recognize and easily navigate its way through their proxy environments but I'm finding it difficult to decide what my "best fit" solution should be. My software doesn't tear down the connection after each client request, and on top of that packets can come from either side at any time, basically your typical custom client/server system for a specific niche.
My first reaction is "why can't they just add my server's addresses to their white list" but if there is a programmatic way I can get through without requiring their IT staff to help it is politically better and arguably a better solution anyway. Plus maybe I'm still not understanding the role and purpose of what proxy servers and environments have grown to be these days.
My first attempt at a solution was to use WinInet with its various proxy capabilities to establish a connection over port 80 to my non-standard protocol server (which knows enough to recognize and answer a simple HTTP-looking GET request and answer it with a simple HTTP response page to get around some environments that employ initial packet sniffing (DPI)). I retrieved the actual SOCKET handle behind WinInet's HINTERNET request object and had hoped to use that in place of my software's existing SOCKET connection and hopefully not need to change much more on the client side. It initially seemed to be my solution but on further inspection it seems that the OS gets first-chance at the received data on this socket since when I get notified of events via the standard select(...) statement on the socket and query the size of the data available via ioctlsocket the call succeeds but returns 0 bytes available, the reads don't work and it goes downhill from there.
Can someone tell me of a client-side library (commercial is fine) that will let me get past these proxy server environments with as little user and IT staff help as possible? From what I read it has grown past SOCKS and I figure someone has to have solved this problem before me.
Thanks for reading my long-winded question,
Ripred

If your software can make an SSL connection on port 443, then you are 99% of the way there.
Typically HTTP proxies are set up to proxy SSL-on-443 (for the purposes of HTTPS). You just need to teach your software to use the HTTP proxy. Check the HTTP RFCs for the full details, but the Cliffs Notes version is:
Connect to the HTTP proxy on the proxy port;
Send to the proxy:
CONNECT your.real.server:443 HTTP/1.1\r\n
Host: your.real.server:443\r\n
User-Agent: YourSoftware/1.234\r\n
\r\n
Then parse the proxy response, which will start with an HTTP status line, followed by HTTP headers, followed by a blank line. If the status code indicates success (2xx), you are now talking to your destination through the proxy and can start the SSL handshake.
In many corporate environments you'll have to authenticate with the proxy. This is often HTTP Basic Authentication, which is pretty easy (again, see the RFCs), although Windows-centric networks frequently require NTLM or Negotiate instead.
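The CONNECT exchange described above could be sketched roughly as follows. This is only an illustration of the string handling, not a full client: the function names are made up for this example, the socket I/O and proxy authentication are left out, and it assumes the whole status line has already been read into a string.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Build the CONNECT request. `host` and `port` name the real destination
// server, not the proxy. (Hypothetical helper, not part of any library.)
std::string build_connect_request(const std::string& host, int port,
                                  const std::string& user_agent) {
    const std::string target = host + ":" + std::to_string(port);
    return "CONNECT " + target + " HTTP/1.1\r\n"
           "Host: " + target + "\r\n"
           "User-Agent: " + user_agent + "\r\n"
           "\r\n";
}

// Extract the numeric status code from the first line of the proxy reply,
// e.g. "HTTP/1.1 200 Connection established". Returns std::nullopt if the
// text does not start with something that looks like an HTTP status line.
std::optional<int> parse_status_code(const std::string& response) {
    if (response.compare(0, 5, "HTTP/") != 0) return std::nullopt;
    const std::size_t space = response.find(' ');
    if (space == std::string::npos || space + 4 > response.size())
        return std::nullopt;
    int code = 0;
    for (std::size_t i = space + 1; i < space + 4; ++i) {
        const char c = response[i];
        if (c < '0' || c > '9') return std::nullopt;
        code = code * 10 + (c - '0');
    }
    return code;
}

// Any 2xx reply to CONNECT means the tunnel is open.
bool tunnel_established(const std::string& response) {
    const std::optional<int> code = parse_status_code(response);
    return code && *code >= 200 && *code < 300;
}
```

After tunnel_established() returns true, the same socket would be handed to the existing SSL code unchanged. A 407 reply means the proxy wants credentials first.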

Related

Easy way to "nudge" a server to keep a connection open?

Okay, so a little context:
I have an app running on an embedded system that sends a few different requests over HTTP (using libcurl in C++) at the following intervals:
5 minutes
15 minutes
1 hour
24 hours
My goal: Reduce data consumption (runs over cellular)
We have both client and server side TLS authentication, so the handshake is costly. The idea is that we use persistent connections (at least for the shorter interval files) to avoid doing the handshake every time.
Unfortunately, after much tinkering I've figured out that the server is closing the connection before the intervals pass. Maybe this is something we can extend? I'll have to talk to the server side guys.
I was under the impression that was the reason the "TCP keep-alive" packets existed, but supposedly those "check the connection" not "keep it open" like the name suggests.
My idea is this:
Have my app send a packet (as small as possible) every 2 minutes or so (however long the timeout is) to "nudge" the connection into staying open.
My questions are:
Does that make any sense?
I don't suppose there is an easy way to do this in libcurl is there?
If so, how small could we get the request?
Is there an even easier way to do it? My only issue here is that all the connection stuff "lives" in libcurl.
Thanks!
It would be easier to give a more precise answer if you gave a little more detail on your application architecture. For example, is it a RESTful API? Is the use of HTTP absolutely mandatory? If so, what HTTP server are you using (nginx, apache, ...)? Could you consider websockets as an alternative to plain HTTP?
If you are at liberty to use something other than regular HTTP or HTTPS - and to use something other than libcurl on the client side - you would have more options.
If, on the other hand, you are constrained to both
use HTTP (rather than a raw TCP connection or websockets), and
use libcurl
then I think your task is a good bit more difficult - but maybe still possible.
One of your first challenges is that the typical timeouts for an HTTP connection are quite low (as low as a few seconds for Apache 2, whose KeepAliveTimeout defaults to 5 seconds). If you can configure the server you can increase this.
> I was under the impression that was the reason the "TCP keep-alive" packets existed, but supposedly those "check the connection" not "keep it open" like the name suggests.
Your terminology is ambiguous here. Are you referring to TCP keep-alive packets or persistent HTTP connections? These don't necessarily have anything to do with each other. The former is an optional mechanism in TCP (which is disabled by default). The latter is an application-layer concept which is specific to HTTP - and may be used regardless of whether keep-alive packets are being used at the transport layer.
> My only issue here is that all the connection stuff "lives" in libcurl.
The problem with using libcurl is that it is first and foremost a transfer library. I don't think it is tailored for long-running, persistent TCP connections. Nonetheless, according to Daniel Stenberg (the author of libcurl), the library will automatically try to reuse existing connections where possible - as long as you re-use the same easy handle.
> If so, how small could we get the request?
Assuming you are using a 'ping' endpoint on your server - which accepts no data and returns a 204 (success but no content) response, then the overhead - in the application layer - would be the size of the HTTP request headers + the size of the HTTP response headers. Maybe you could get it down to 200-300 bytes, or thereabouts.
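To get a feel for the lower bound, here is a rough sketch that counts the application-layer bytes of the smallest plausible ping exchange. The "/ping" path and the bare 204 reply are assumptions made for the example; a real HTTP library and server will normally add more headers, which is where the 200-300 byte estimate comes from.

```cpp
#include <cstddef>
#include <string>

// Smallest practical HTTP/1.1 request: request line, the mandatory Host
// header, and a blank line. "/ping" is a hypothetical endpoint.
std::string minimal_ping_request(const std::string& host) {
    return "GET /ping HTTP/1.1\r\nHost: " + host + "\r\n\r\n";
}

// A bare 204 reply. Real servers usually add Date, Server and other headers.
std::string minimal_ping_response() {
    return "HTTP/1.1 204 No Content\r\n\r\n";
}

// Application-layer bytes for one ping round trip. TCP/IP headers and TLS
// record overhead are not included.
std::size_t ping_round_trip_bytes(const std::string& host) {
    return minimal_ping_request(host).size() + minimal_ping_response().size();
}
```

Even this stripped-down exchange is carried inside TLS records and TCP segments, so the bytes on the cellular link will be somewhat higher than the figure this returns.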
Alternatives to (plain) HTTP
If you are using a RESTful API, this paradigm sort of goes against the idea of a persistent TCP connection - although I can not think of any reason why it would not work.
You might consider websockets as an alternative, but - again - libcurl is not ideal for this. Although I know very little about websockets, I believe they would offer some advantages.
Compared to plain HTTP, websockets offer:
significantly less overhead than HTTP per message;
the connection is persistent by design: there is no need to send extra HTTP requests to keep it open (the protocol has its own lightweight ping/pong frames for idle periods);
Compared to a raw TCP connection, the benefits of websockets are that:
you don't have to open a custom port on your server;
it automatically handles the TLS/SSL stuff for you.
(Someone who knows more about websockets is welcome to correct me on some of the above points - particularly regarding TLS/SSL and keep alive messages.)
Alternatives to libcurl
An alternative to libcurl which might be useful here is the Mongoose networking library. It would provide you with a few different alternatives:
use a plain TCP connection (and a custom application layer protocol),
use a TCP connection and handle the HTTP requests yourself manually,
use websockets - which it has very good support for (both as server and client).
Mongoose allows you to enable SSL for all of these options also.

Transport layer Services and Application Layer services

I am working with web services right now. We have two types of services, one over HTTP and the other over TCP. I am trying to understand the difference between the two. As I understand it, services over TCP work at the transport layer, i.e. they transfer data directly between two endpoints. But I am not so clear on services over HTTP. I know we have a client/server model, REST and SOAP, and that HTTP is the protocol that transmits the data, but I am not able to properly relate all this to the concept of services over HTTP!
Can anyone please help with an analogy which explains the difference between the two ?
As John Saunders is trying to allude to, I would agree that it is more important to understand the abstractions these protocols provide than the specific "layer" they are assigned in a particular model (OSI). While the general model helps and applies, it doesn't provide specific details for actual protocols.
Having said that, the difference between so called Transport Layer Services using TCP vs Application Layer Services using HTTP, IMHO boils down to the comparisons between TCP and HTTP itself.
I'll start by saying that I hope it is known to anyone even vaguely familiar with these protocols that HTTP is a higher-level abstraction than TCP and in fact relies on TCP/IP itself. Hence HTTP clearly inherits certain features, like reliability, from TCP/IP.
Now the contrast -
TCP Service
Design your own application level protocol - You have to define everything yourself. For example, how will the client request an operation to add an employee? How will the client request to find a given employee? How do you indicate the format in which data is exchanged between client and server? How will you even distinguish metadata (like request information) from data?
Efficiency - Can be efficient and compact in transmission of data. Since you define your own application layer protocol, it can be anything from binary to string to XML to anything else you can dream of.
HTTP, for example, is built on top of TCP and, in layman's terms, mostly uses key-value style request headers, whereas SOAP passes much of its information in a message envelope and message body (which is why SOAP can run over HTTP as well as over other transports like message queues).
Performance - Given the possibility of having very compact application layer protocol, it can be relatively fast as well. For really high throughput, high performance, latency sensitive intranet applications, this can be a deciding factor.
Development Effort - Along with the flexibility, you certainly end up writing more code, as you attempt to define and implement your own application layer protocol.
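To make the "design your own protocol" point concrete, here is a minimal sketch of one possible framing for the employee example. Everything in it is invented for illustration: the opcodes, the 5-byte header layout, and the names encode/decode. It shows one way to separate metadata (opcode and length) from data (payload) on a raw TCP stream.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical operations for the employee example.
enum class Op : std::uint8_t { AddEmployee = 1, FindEmployee = 2 };

struct Message {
    Op op;
    std::string payload;
};

// Frame layout: 1 byte opcode, 4 bytes payload length (big-endian), payload.
std::vector<std::uint8_t> encode(const Message& m) {
    const std::uint32_t n = static_cast<std::uint32_t>(m.payload.size());
    std::vector<std::uint8_t> out;
    out.reserve(5 + n);
    out.push_back(static_cast<std::uint8_t>(m.op));
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>((n >> shift) & 0xFF));
    out.insert(out.end(), m.payload.begin(), m.payload.end());
    return out;
}

// Returns std::nullopt until a complete frame is present in `buf`, which is
// how a reader copes with TCP delivering a byte stream rather than messages.
std::optional<Message> decode(const std::vector<std::uint8_t>& buf) {
    if (buf.size() < 5) return std::nullopt;
    std::uint32_t n = 0;
    for (int i = 1; i <= 4; ++i) n = (n << 8) | buf[i];
    if (buf.size() - 5 < n) return std::nullopt;
    return Message{static_cast<Op>(buf[0]),
                   std::string(buf.begin() + 5, buf.begin() + 5 + n)};
}
```

A "find employee" request with the payload "id=42" costs 10 bytes on the wire with this layout, which illustrates the compactness argument. The price is that versioning, error reporting and every other convention still has to be invented and maintained by you.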
HTTP Service
Larger parts of application protocol are defined for you - You design your application over well defined HTTP protocol. Typically HTTP Get would mean querying for a resource. Query filters in request url can be used for searches. HTTP POST, PUT and DELETE similarly have specific, well defined semantics.
Error / Fault handling - Even errors are indicated using standards defined in the HTTP protocol, like status code 200 (Success) vs 400 (Bad Request).
Efficiency - Can be quite verbose. The protocol defines almost every aspect of how a request must be formed, and is typically text based.
Development and Tools support - HTTP can make it easier to use existing, vast variety of tools to send, receive and debug requests (Fiddler or Charles Proxy are famous HTTP debugging tools).
Internet / Firewall Friendly - HTTP is typically served on port 80 (although in theory it can be any other port as well). This makes it suitable not only for intranet applications, where you may have more control over firewalls and the ports you open, but also for accessing those services over the Internet, because outbound traffic to port 80 is allowed by almost every firewall.
Co-existence of multiple services - HTTP is so widely used that multiple applications / services on a given machine are expected to use it. Operating systems typically have special support built in to handle this (http.sys on Windows), so you don't have to worry about one application / service stepping on another by accidentally using the same port (with raw TCP, one of them would fail in that case). Port negotiation between client and server is typically not an issue either, because HTTP is expected to be at port 80.
Securing the communication channel - When it comes to securing the communication, again there is well defined way to establish the same.. i.e. HTTPS. Unlike TCP/IP based service, you don't have to invent your own scheme to encrypt the communication between client and server.
Hosting the service - In theory, there are more ways to host an HTTP service, than a TCP service, again due to HTTP web applications already being a common scenario, which web servers like IIS already cater to. Your HTTP service can take advantage of countless out of the box features which web servers like IIS already have.. Recycling, Authentication, Resource Management, Request Filtering, Caching, Dynamic Compression and Logging etc etc etc.. you get for free with HTTP services hosted on any of the mature web server products.
Interoperability Across Platforms / Technology stacks - With HTTP, it would be far easier to use a mix of any technology stack, again because the implementation of the Protocol will be typically supported on various platforms.. from Linux / Unix to Windows.. or from .Net to Java to Ruby.. You'll get benefit from existing tools and technologies present on these platforms which support HTTP.. Hence Http can be the de facto choice, if, for example, you expect server to be in .Net on Windows, but clients to be in Java on Unix.
I could go on.. This is by no means an exhaustive list, and I am sure that many others could add plenty more to this.. But hopefully this gives you a good idea for what you were looking.. One can clearly see, that this can be a very deep topic.. Based on your response and time, I may edit this answer in future.. or encourage others to update it, as they see fit.
Side note - It is interesting to note that even though HTTP adds plenty over TCP/IP to make it a great and ubiquitous choice for an application protocol, there is always scope for more, higher-level abstraction. So much so that there are other, newer service protocols which are built on top of HTTP, for example OData. Look at OData if you are curious.
And of course, in today's world of services, the discussion will not be complete without the mention of REST.
EDIT: Another interesting side note - If you are building on Windows platform, and using .Net framework, there are frameworks like Windows Communication Foundation a.k.a. WCF, which try to provide such abstractions, that you can swap out your choice of communication protocol (Client and Server choice must still match), from HTTP to TCP to MSMQ to IPC etc, with mere configuration changes, or host same service over multiple communication protocols by creating multiple endpoints. Refer to Understanding various types of WCF bindings for high level overview and comparison of various, out of the box, options WCF provides.
When working with TCP/IP and protocols layered on top of it, I would take the 7-layer model with a grain of salt. The true number of layers will differ, and will not match up with the classic OSI model.
For instance, the FTP control connection is built on top of the TELNET protocol, which is layered on top of TCP. Does that make TELNET a Presentation-layer protocol? No, it's an Application-layer protocol that happens to have another Application-layer protocol built on top of it.
And then we run SOAP over HTTP. Or, if we want, we can run SOAP over TCP/IP. So what layer is SOAP? Is that layer 8 or is that layer 9?
As you asked, I'll try to explain by analogy, while not repeating previous answers too much.
Let's say we have a helpdesk (service) reachable by phone call (TCP) and by SMS (HTTP). From your (application) point of view you should get the same information regardless of which communication method you choose. But there are differences in how the communication proceeds, because a phone call (TCP) is a stateful channel, while SMS (HTTP) is stateless:
once a phone call is established, the information exchange continues until you hang up;
an SMS message must contain all the relevant information needed to get a useful response.
To introduce state into the SMS channel, additional steps at the helpdesk level are required. For example, you'll be assigned a ticket number, which you must send with each related SMS (HTTP cookie/session); this won't be handled automatically by the GSM network. This state is handled by the helpdesk's and your (service and application) logic.
Both service types have advantages and pitfalls, and both should work; the preference depends on the actual use case.
It doesn't matter much what means are used to exchange data (you could even exchange letters through the post office, if the latency is acceptable). In practice this means you can use ping (ICMP), DNS queries, or emails to exchange data, as long as your application knows how to use and decode such a channel.
I think John Saunders in his answer referred to the 7-layer OSI model, and I think his point is correct.
This analogy is not 100% correct; I tried to explain the idea: the difference is in how the state is preserved (by the protocol itself, or by the application/framework).
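The ticket-number idea from the analogy can be sketched in a few lines. The Helpdesk class below is purely illustrative: it stands in for a service that keeps no per-connection state and recognises a conversation only by the ticket each message carries, much as an HTTP service relies on a cookie or session id.

```cpp
#include <map>
#include <string>

// Stateless-channel helpdesk: it remembers conversations only through
// ticket numbers supplied by the caller with every message.
class Helpdesk {
public:
    // First message: no ticket exists yet, so one is issued.
    int open(const std::string& text) {
        const int ticket = next_ticket_++;
        history_[ticket] = text;
        return ticket;
    }

    // Follow-up message: must carry the ticket, otherwise the helpdesk
    // has no context to attach it to.
    bool follow_up(int ticket, const std::string& text) {
        auto it = history_.find(ticket);
        if (it == history_.end()) return false;  // unknown ticket
        it->second += "\n" + text;
        return true;
    }

    // Everything said so far under one ticket, or empty if unknown.
    std::string transcript(int ticket) const {
        auto it = history_.find(ticket);
        return it == history_.end() ? std::string() : it->second;
    }

private:
    int next_ticket_ = 1;
    std::map<int, std::string> history_;
};
```

With a phone call (TCP) the equivalent of history_ would simply live for as long as the call does, and no ticket would be needed.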

implement restful webservice behind NAT

As I understand it, it is not straightforward to implement a REST webservice on a device which is behind NAT, so I was searching for some solutions.
Is it possible to use long polling in order to implement the webservice? This way, the local device makes a call to the remote client (which is exactly what I want), and the client keeps the connection open (keep-alive?) until it wants to call a webservice method. It can do so because the connection is still open. After the call, the device immediately sends another poll to the client, and so on.
Is it possible to implement it this way?
Other solutions I came across:
ReverseHTTP - I don't know very much about this, but it sounds like I can implement the webservice with it. Right?
There are several other solutions, like TURN or STUN but they seem to be very complicated.
Do you have any suggestions?
I am using c++/linux on my network devices.
EDIT: Port Forwarding is not an option.
You've got a lot of different concepts here in this question. You can certainly implement a RESTful service behind a firewall/NAT... you just need to configure your firewall/NAT to forward connections to your service. There are issues of firewall/NAT devices timing out connections... here again, you can configure your device to not do that, or you can update your communication mechanism with some kind of "keep-alive". "long polling" is somewhat unrelated, and is used as a way of getting an "interactive like response" from a server... basically the server sits on a poll request from a client until it has something to respond with, or the request times out and the client makes another one. STUN and TURN are more voice/video communications-related technologies. I suggest starting with simply having your firewall/NAT device forward web-based requests to your web server.
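The long-polling behaviour described above (the server holds a request until it has something to say or the request times out) can be modelled without any networking. The sketch below is an assumption-laden illustration: LongPollMailbox, post and poll are invented names, and a real service would call something like poll() from inside its HTTP request handler.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Mailbox held by the side that answers the poll. poll() models the
// request that is kept open; post() models a message becoming available.
class LongPollMailbox {
public:
    void post(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push_back(std::move(msg));
        }
        ready_.notify_one();
    }

    // Blocks until a message is available or the timeout expires. An empty
    // result corresponds to the request timing out, after which the
    // polling side simply issues the next poll.
    std::optional<std::string> poll(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!ready_.wait_for(lock, timeout,
                             [this] { return !pending_.empty(); }))
            return std::nullopt;
        std::string msg = std::move(pending_.front());
        pending_.pop_front();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> pending_;
};
```

Because the device behind the NAT is always the one that opens the connection, this pattern needs no port forwarding. The trade-off is that every message costs a fresh request once the previous one has been answered.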
You don't say what transfer protocol you are using, I'm assuming HTTP.
HTTP uses TCP/IP, so your NAT device needs to redirect the connection request to your server.
There are other ways, for example if you have more than one public IP address, requests to one of them could be directed to the server. But that's more complicated than port forwarding, so I don't think it's what you have.
So basically you need to configure port forwarding. Think of it like a PABX: calls from outside lines need to know an extension to reach a phone. It's a loose analogy, but it gives the idea.
And as already said, the techniques you mention are not intended for that. They are mainly for outgoing client connections, and for many NATs they are not even necessary, as the NAT already handles those.

implementing server for licencing management

I would like to implement the server side of a licence management system. I use C++ on Linux.
When the software starts it must connect to a server that checks privileges and allows/disallows running of some features.
My question is about the implementation of the communication between client and server across internet:
The server will have a static IP on internet so is it enough to use a simple TCP/IP socket client that will connect to a TCP/IP socket server ( providing IP/PORT) ?
I am familiar with socket communication , but less with communication across internet so my question is whether this is the right approach or do I need to use a different mechanism like a http client server or other.
Regards
AFG
Here are some benefits to using HTTP as a transport:
easier to get right, more likely to work in production: Yes, you will probably have to add additional dependencies to deal with HTTP (client and server side), but it's still preferable to yet another homegrown protocol, which you have to implement, maintain, care about backwards compatibility, deal with multiplatform issues (eg. endianness), etc. In terms of implementation ease, using an HTTP based solution should be far easier in the common case (especially true if you build a REST style service API for license checking).
More help available: HTTP as the foundation of the web is one of the most widely used technologies today. Most (all?) problems you will run into are probably publicly documented with solutions/workarounds.
Encryption 'for free': Encryption is already a solved problem (HTTPS/SSL), both with regard to transport as well as with regard to what you have to implement on your end, and it's just a matter of setting it up.
Server Authentication 'for free': HTTPS/SSL doesn't only solve encryption but also server authentication, so that the client can verify whether it's actually talking to the right service.
Guaranteed to work on the internet: HTTP/HTTPS traffic is common on the internet, so you won't run into routing problems or firewalls which are hard to traverse. This might be a problem when using your own protocol.
Flexibility out of the box: You also put less constraints on clients communicating with your server, as it's very simple to build a client in many different environments, as long as they can talk HTTP (and maybe SSL), and they know how to issue the request to your server (ie. what your service API looks like).
Easy to integrate with administrative webapp: If you want to allow users to manage their accounts associated with licenses in some way (update contact info etc.), then you might even combine the license server with that application. You can also build the license administration UI part into the same app if that's useful.
And as a last remark (this puts additional constraints on your client side HTTPS/SSL implementation): you can even use client side SSL certificates, which essentially allow authenticating the client to the server. Depending on how you use them, client side certificates are harder to manage, but they can be eg. expired, or revoked, so to some extent they actually are licenses (to connect to the server).
HTTP is not a different mechanism. It is a protocol operated over TCP/IP connections.
Internet uses IP transport exclusively. You can use UDP, TCP or SCTP session (well, UDP is not much of a session) layer on top of it. TCP is the general choice.
Sockets are an operating system interface. They are the only interface to the network in most systems, but some systems have a different interface. They have nothing to do with the transport itself.
IP addresses are in practice tied to network topology, so I strongly discourage hardcoding the server's IP address into the client. If you have to change network provider for any reason, you won't be getting the same IP address. Use DNS, it's just one gethostbyname call.
And don't forget to authenticate the server; even with hardcoded IP it's too easy to redirect it.
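As a sketch of the DNS advice, the lookup might look like this on Linux. It uses getaddrinfo rather than gethostbyname, since the latter is obsolete and IPv4-only; the function name resolve is made up for the example, and error reporting is reduced to returning an empty list.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <string>
#include <vector>

// Resolve `host` to printable IPv4/IPv6 addresses suitable for a TCP
// connection. Returns an empty vector if resolution fails.
std::vector<std::string> resolve(const std::string& host,
                                 const std::string& port) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // accept IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP

    addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), port.c_str(), &hints, &results) != 0)
        return {};

    std::vector<std::string> addresses;
    for (const addrinfo* p = results; p != nullptr; p = p->ai_next) {
        char text[INET6_ADDRSTRLEN] = {};
        const void* raw = nullptr;
        if (p->ai_family == AF_INET)
            raw = &reinterpret_cast<const sockaddr_in*>(p->ai_addr)->sin_addr;
        else if (p->ai_family == AF_INET6)
            raw = &reinterpret_cast<const sockaddr_in6*>(p->ai_addr)->sin6_addr;
        if (raw && inet_ntop(p->ai_family, raw, text, sizeof text))
            addresses.emplace_back(text);
    }
    freeaddrinfo(results);
    return addresses;
}
```

A client would typically try each returned address in turn with connect() until one succeeds. Resolving the name does nothing to authenticate the server, so the advice about verifying its identity still applies.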

Advice on web services without HTTP

My company is planning on implementing a remote programming tool to configure embedded devices in the field. I assumed that these devices would have an HTTP client on them, and planned to implement some REST services for them to access. Unfortunately, I found out that they have a TCP stack but no HTTP client. One of my co-workers suggested that we try to send “soap packets” over port 80 without an HTTP client. The devices also don’t have any SOAP client. Is this possible? Would there be implications if there was a web server running on the network the devices are connected to? I’d appreciate any advice or best practices on how to implement something like this.
If your servers are serving simple files, the embedded devices really only need to send an HTTP GET request (possibly with a little extra data identifying the device, so the server can know which firmware version to send).
From there, it's pretty much a simple matter of reading the raw data coming in on the embedded device's socket -- you might need to only disregard the HTTP header on the response, or you could possibly configure your server to not send it for those requests.
You don't really need an HTTP client per se. HTTP is a very simple text-based protocol that you can implement yourself if you need to.
That said, you probably won't need to implement it yourself. If they have a TCP stack and a standard sockets library, you can probably find a simple C library (such as this one) that wraps up HTTP or SOAP functionality for you. You could then just build that library into your application.
Basic HTTP is not a particularly difficult protocol to implement by hand. It's a text and line based protocol, save for the payload, and the servers work quite well with "primitive, ham fisted" clients, which is all a simple client needs to be.
If you can get by with just a subset, which is likely, then simply write it and be done.
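As an illustration of how small such a subset can be, here is a sketch of the two pieces a hand-written client needs: building a GET request and splitting the reply into status and body. The names are invented for the example. It assumes HTTP/1.0 so that the server closes the connection after the reply and does not use chunked encoding, and it assumes the whole reply has already been read from the socket into one string.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Build a minimal HTTP/1.0 GET request for `path` on `host`.
std::string build_get(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
}

struct Response {
    int status = 0;
    std::string body;
};

// Split a complete raw reply into status code and body, discarding the
// headers. Returns std::nullopt if the text is not a well-formed reply.
std::optional<Response> parse_response(const std::string& raw) {
    const std::size_t space = raw.find(' ');
    const std::size_t header_end = raw.find("\r\n\r\n");
    if (raw.compare(0, 5, "HTTP/") != 0 || space == std::string::npos ||
        header_end == std::string::npos || space + 4 > header_end)
        return std::nullopt;

    Response r;
    for (std::size_t i = space + 1; i < space + 4; ++i) {
        if (raw[i] < '0' || raw[i] > '9') return std::nullopt;
        r.status = r.status * 10 + (raw[i] - '0');
    }
    r.body = raw.substr(header_end + 4);  // everything after the blank line
    return r;
}
```

On the device, the string from build_get() would be written to a TCP socket connected to port 80, the socket read until the server closes it, and the accumulated bytes passed to parse_response().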
You can implement a trivial http client over sockets (here is an example of how to do it in ruby: http://www.tutorialspoint.com/ruby/ruby_socket_programming.htm )
It probably depends what technology you have available on your embedded devices - if you can easily consume JSON or XML then a webservice approach using the above may work for you.