How to determine the site protocol using url - c++

How do I determine (using C++ and Winsock) the protocol of a site from its URL, for example www.google.com, if the protocol is not known in advance?
Or how do I determine the web server's TCP port?
I want to make an HTTP GET request using the part of the link after www., and I need to determine the port or protocol in order to choose between HTTP over TLS and plain HTTP.

You can't. You decide the protocol you're going to use to contact some server. If you haven't decided it, you don't know it. Certainly your computer can't tell you what it will be.
It's like asking a supermarket cashier what you're going to buy today. They don't know that. You are supposed to tell them that.
What you can do is see whether a website on that server automatically redirects HTTP traffic to an HTTPS URI (thus enforcing SSL), or otherwise blocks non-HTTPS traffic. If that's what you want to do, you can achieve it by attempting to make an HTTP connection to that domain and seeing what happens.
Depending on your web browser make/model/version, that may be what it is doing when you enter "www.google.com" without specifying a protocol: assuming http:// then following any remote redirects that take you to https:// instead. Pretty soon, though, or already if you have certain extensions installed, the default is going to be https://. I must stress though, again, that this is still the client (i.e. the browser) making the decision, not the server; if you are writing your own browser then, again, you must choose what that default should be.
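To make the "try HTTP and see what happens" idea concrete, here is a minimal sketch of the check you could run on the response after sending a plain HTTP request. The function names are made up for illustration, and it assumes you have already read the status line and headers from the socket into a string. It only looks at the status code class and the Location header, so treat it as a starting point rather than a full HTTP parser.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Case-insensitive check that `text` contains `prefix` starting at offset `pos`.
static bool starts_with_icase(const std::string& text, std::size_t pos,
                              const std::string& prefix) {
    if (pos > text.size() || text.size() - pos < prefix.size()) return false;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(text[pos + i])) !=
            std::tolower(static_cast<unsigned char>(prefix[i]))) {
            return false;
        }
    }
    return true;
}

// Returns true when `response` (status line plus headers) is a 3xx redirect
// whose Location header points at an https:// URL.
bool redirects_to_https(const std::string& response) {
    // The status line looks like "HTTP/1.1 301 Moved Permanently".
    if (!starts_with_icase(response, 0, "HTTP/")) return false;
    const std::size_t space = response.find(' ');
    if (space == std::string::npos || space + 1 >= response.size()) return false;
    if (response[space + 1] != '3') return false;  // not a 3xx status

    // Walk the header lines looking for Location.
    std::size_t line = response.find("\r\n");
    while (line != std::string::npos) {
        line += 2;  // move to the start of the next line
        if (starts_with_icase(response, line, "\r\n")) break;  // blank line: end of headers
        if (starts_with_icase(response, line, "location:")) {
            std::size_t value = line + 9;  // skip "location:"
            while (value < response.size() &&
                   (response[value] == ' ' || response[value] == '\t')) {
                ++value;
            }
            return starts_with_icase(response, value, "https://");
        }
        line = response.find("\r\n", line);
    }
    return false;
}
```

If this returns true, the client can reconnect with TLS to the URL given in Location. If the plain HTTP connection is refused or times out, that tells you nothing definite, and trying HTTPS is again your decision as the client.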

In your example, www.google.com is a domain name, not a URL.
To get the protocol you need a full URL, such as
https://www.google.com or http://www.google.com
In these URLs, http and https are the protocols (URL schemes).
You can also use nmap to determine which ports are open on the server, along with the service name and protocol used on each.

Related

encrypting form data before submitting to server

I have developed a Django application and now want to make sure the POST data transmitted through the page is safe.
I have a couple of questions about this.
I see SSL certificates being displayed on many webpages. How do I get such a certificate?
Do I need to change anything in my submitted form to encrypt the data, or should I change settings on my webserver?
I know it's a general question, but it would be great if someone could provide a good answer.
First off, POST data is never safe from an application perspective, because you don't control the user of the website. SSL and HTTPS help prevent man-in-the-middle attacks by ensuring that the request from the client (browser) to your server is encrypted. The underlying data can still be malicious, so you should always validate inputs.
Secondly, if you want to use HTTPS and SSL, which I highly recommend, you'll need to obtain a certificate from one of the providers out there and install it on your webserver, which I presume is Apache. Typically your domain provider can help you obtain an SSL certificate for your domain from one of the main certificate authorities. Regarding installation and setup, there is plenty of information online, as it's a common task. I'm not familiar enough with Apache configuration to give specific recommendations. You'll also want rewrite rules so that your site can only be accessed via HTTPS; if someone tries to use HTTP, they are simply redirected to HTTPS.
Lastly, you don't need to do anything in your Django application as your webserver should handle the basic interactions between your server and client to validate the HTTPS requests.

Boost ASIO with OpenSSL Can't Read HTTP Headers

I'm attempting to write a simple HTTP/HTTPS proxy using Boost ASIO. HTTP is working fine, but I'm having some issues with HTTPS. For the record this is a local proxy. Anyway so here is an example of how a transaction works with my setup.
Browser asks for Google.com
I lie to the browser and tell it to go to 127.0.0.1:443
Browser socket connects to my local server on 443
I attempt to read the headers so I can do a real host lookup and open a second upstream socket so I can simply forward out the requests.
This is where things fail immediately. When I try to print out the headers of the incoming socket, it appears that they are already encrypted by the browser making the request. I thought at first that perhaps the jumbled console output was just that the headers were compressed, but after some thorough testing this is not the case.
So I'm wondering if anyone can point me in the right direction, perhaps to some reading material where I can better understand what is happening here. Why are the headers immediately encrypted before the connection to the "server" (my proxy) even completes and has a chance to communicate with the client? Is it a temp key? Do I need to ignore the initial headers and send some command back telling the client what temporary key to use or not to compress/encrypt at all? Thanks so much in advance for any help, I've been stuck on this for a while.
HTTPS passes all HTTP traffic, headers and all, over a secure SSL connection. This is by design to prevent exactly what you're trying to do which is essentially a man-in-the-middle attack. In order to succeed, you'll have to come up with a way to defeat SSL security.
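As an illustration of why the first bytes look jumbled: what the browser sends first on port 443 is a binary TLS handshake record (the ClientHello), not HTTP text, and the HTTP headers only follow once the handshake has completed. A minimal sketch of recognising that first record is below. The function name is made up for this example, and it only inspects the record header and the handshake message type, without parsing the handshake itself.

```cpp
#include <cstddef>

// Returns true when the first bytes read from a freshly accepted socket look
// like a TLS handshake record carrying a ClientHello rather than plaintext HTTP.
bool looks_like_tls_client_hello(const unsigned char* data, std::size_t len) {
    if (len < 6) return false;  // 5-byte record header plus the handshake type byte
    const bool handshake_record = data[0] == 0x16;  // content type 22 = handshake
    const bool version_major_3 = data[1] == 0x03;   // SSL 3.0 and all TLS versions use major 3
    const bool client_hello = data[5] == 0x01;      // handshake message type 1 = ClientHello
    return handshake_record && version_major_3 && client_hello;
}
```

A plaintext request starts with ASCII such as "GET ", so this check separates the two cases, but it does not let the proxy read anything: to see the headers it still has to complete the handshake with a certificate the browser accepts.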
One way to do this is to provide an SSL certificate that the browser will accept. There are a couple common reasons the browser complains about a certificate: (1) the certificate is not signed by an authority that the browser trusts and (2) the certificate common name (CN) does not match the URL host.
As long as you control the browser environment then (1) is easily fixed by creating your own certificate authority (CA) and installing its certificate as trusted in your operating system and/or browser. Then in your proxy you supply a certificate signed by your CA. You're basically telling the browser that it's okay to trust certificates that your proxy provides.
(2) will be more difficult because you have to supply the certificate with the correct CN before you can read the HTTP headers to determine the host the browser was trying to reach. Furthermore, unless you already know the hosts that might be requested you will have to generate (and sign) a matching certificate dynamically. Perhaps you could use a pool of IP addresses for your proxy and coordinate with your spoofing DNS service so that you know which certificate should be presented on which connection.
Generally HTTPS proxies are not a good idea. I would discourage it because you'll really be working against the grain of browser security.
I liked this book as an SSL/TLS reference. You can use a tool like OpenSSL to create and sign your own certificates.

C++ Winsock Determine HTTP or HTTPS

I've just started studying Winsock and I have a simple question for you: how can I determine whether the connection to a server must take place over HTTP or HTTPS?
Let's say I want to connect to randomsite.random. How can I know what kind of connection I need? I know that for HTTP I must connect to port 80, while HTTPS needs port 443, but how can I determine WHEN an HTTPS connection is needed?
Thank you for the attention!
The same way a web browser decides: Based on the URL you are trying to load. In a web browser, the URL begins with http or https, which is used to determine whether an SSL connection should be used. This is also used to determine the port if no port number is specified in the URL.
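A minimal sketch of that decision, assuming the caller supplies a full URL: the scheme selects TLS or plain TCP, and the default port applies only when the URL carries no explicit port. `Endpoint` and `endpoint_from_url` are names made up for this example, not Winsock APIs. The parsing is deliberately simplified: lowercase scheme only, no IPv6 literals, no userinfo.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct Endpoint {
    std::string host;
    unsigned short port = 0;
    bool use_tls = false;
};

// Splits "scheme://host[:port][/path]" and fills in the default port.
// Returns false for a missing or unsupported scheme, or a malformed port.
bool endpoint_from_url(const std::string& url, Endpoint& out) {
    const std::size_t sep = url.find("://");
    if (sep == std::string::npos) return false;  // no scheme: the caller must choose one

    const std::string scheme = url.substr(0, sep);
    Endpoint ep;
    if (scheme == "http") {
        ep.port = 80;
        ep.use_tls = false;
    } else if (scheme == "https") {
        ep.port = 443;
        ep.use_tls = true;
    } else {
        return false;
    }

    // Keep only "host[:port]" by cutting at the first path, query or fragment character.
    std::string authority = url.substr(sep + 3);
    authority = authority.substr(0, authority.find_first_of("/?#"));

    const std::size_t colon = authority.find(':');
    if (colon != std::string::npos) {
        const std::string port_text = authority.substr(colon + 1);
        if (port_text.empty() || port_text.size() > 5) return false;
        unsigned long value = 0;
        for (std::size_t i = 0; i < port_text.size(); ++i) {
            const char c = port_text[i];
            if (c < '0' || c > '9') return false;
            value = value * 10 + static_cast<unsigned long>(c - '0');
        }
        if (value == 0 || value > 65535) return false;
        ep.port = static_cast<unsigned short>(value);  // explicit port overrides the default
        authority = authority.substr(0, colon);
    }
    if (authority.empty()) return false;

    ep.host = authority;
    out = ep;
    return true;
}
```

With the result in hand, `host` and `port` go to name resolution and `connect`, and `use_tls` tells you whether to wrap the socket in a TLS layer before sending the request. A bare www.google.com is rejected on purpose: with no scheme, the program has to pick a default itself.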
Many sites offer both a secure and a non-secure version. Some offer only a secure version, but still run a non-secure server which issues a redirect to the URL of the secure version. If you implement following of redirects, you don't need to worry about which version to use: it will happen automatically.
This is usually a function of the site you are connecting to.
If the site requires an HTTPS connection, then if you connect over HTTP you will get a redirect response code with an HTTPS URL.
Firstly, it's not always port 80 and port 443. Secondly, you won't establish successful communication if you use the wrong protocol for the port you connect to. As said in another answer, a site that requires HTTPS will typically answer a plain HTTP request on its HTTP port with a redirect response code and an HTTPS URL.
Most of the time, you have this information beforehand!

Determining server-type from http request

I have a web server written in C++. I want to determine the server type of a request, i.e. whether the request came from an http or an https URL.
If you have your own web server written in C++, you already know whether a request came over http or https, as the two arrive on different ports and require different handling.
Which port are you listening on?
HTTPS URLs begin with "https://" and use port 443 by default, whereas HTTP URLs begin with "http://" and use port 80 by default.
There are other questions, such as how you're managing certificates to serve secure connections.
This article might be helpful - http://java.sun.com/developer/technicalArticles/Security/secureinternet/
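To illustrate the point that the server already knows the scheme: it is a property of the listening socket that accepted the connection, not something carried in the request. A rough sketch under that assumption follows; `Listener`, `scheme_of` and `request_url` are hypothetical names for this example.

```cpp
#include <string>

// Hypothetical description of one listening socket in the server.
struct Listener {
    unsigned short port;
    bool tls;  // true if connections accepted here go through a TLS handshake
};

// The scheme follows from which listener accepted the connection.
std::string scheme_of(const Listener& listener) {
    return listener.tls ? "https" : "http";
}

// Rebuilds the URL the client most likely requested from the listener, the
// host name taken from the Host header (without port), and the request target.
std::string request_url(const Listener& listener, const std::string& host,
                        const std::string& target) {
    const unsigned short default_port = listener.tls ? 443 : 80;
    std::string url = scheme_of(listener) + "://" + host;
    if (listener.port != default_port) {
        url += ":" + std::to_string(listener.port);  // only non-default ports appear in the URL
    }
    return url + target;
}
```

In practice you would attach the `Listener` (or just the `tls` flag) to each connection object at accept time, so request handlers can query it later.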

How to ignore certain socket requests

I'm currently working on a TCP socket server in C++, and I'm trying to figure out how I can ignore all browser connections made to my server. Any ideas?
Thanks.
Need more details to give good feedback.
Are you going to be listening on port 80 but want to avoid all HTTP traffic? Or will your protocol be HTTP-based? Do you need to listen on 80 or can you pick any port?
If it's your own custom protocol (HTTP or not) you could just look at the first line sent up and if it's not to your liking just close() the socket.
EDIT:
Since you're going to be listening on a custom port, you probably won't get any browser traffic anyhow. Further, since you're going to be writing your own protocol, just require a handshake which establishes your client speaks your custom protocol and then ignore (close()) everything else.
Bonus points: depending on your goal, send back an HTTP error message which can be displayed to the user.
You can't stop a TCP session initiated by a web browser from connecting to your TCP server. You can (as stated above) close the connection once you've detected that the client is trying to talk HTTP to you (or any other unwanted application-layer protocol).
Just look at the differences between valid and invalid connection requests (i.e. dump both request types and examine them). In your specific case, you'll want to look for the HTTP request header and ignore all such requests (assuming that valid requests do not make use of HTTP).
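The answers above can be sketched roughly as follows, assuming a custom protocol whose clients send a fixed greeting first. The greeting "MYP1" and all names here are invented for the example. Anything that looks like an HTTP request line is classified as a browser, and everything else as unknown. The caller then closes the socket for non-custom peers, optionally sending the canned HTTP error first.

```cpp
#include <cstring>
#include <string>

enum class PeerKind { CustomClient, HttpClient, Unknown };

// Hypothetical greeting that a valid client of the custom protocol sends first.
static const std::string kMagic = "MYP1";

// Classifies a peer from the first bytes it sent.
PeerKind classify_first_bytes(const std::string& first) {
    if (first.compare(0, kMagic.size(), kMagic) == 0) return PeerKind::CustomClient;

    // An HTTP request line starts with a method name followed by a space.
    static const char* const methods[] = {"GET ", "POST ", "HEAD ", "PUT ", "DELETE ",
                                          "OPTIONS ", "CONNECT ", "PATCH ", "TRACE "};
    for (const char* method : methods) {
        if (first.compare(0, std::strlen(method), method) == 0) return PeerKind::HttpClient;
    }
    return PeerKind::Unknown;
}

// Builds a small HTTP error that a browser can display before the socket is closed.
std::string http_rejection() {
    const std::string body = "This is not a web server.\n";
    return "HTTP/1.1 400 Bad Request\r\n"
           "Content-Type: text/plain\r\n"
           "Content-Length: " + std::to_string(body.size()) + "\r\n"
           "Connection: close\r\n\r\n" + body;
}
```

A browser that tries https:// against the port will send a TLS handshake rather than a request line, so it falls into `Unknown` and is closed without a reply.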