How to read binary files from HTTP using C/C++ sockets - c++

I'm writing an HTTP client that takes the URL of some file, downloads it, and saves it to disk, like curl does.
I can use only C/C++ with std:: and libc. I have no problems downloading text files such as XML, CSV or txt: they are saved as they should be, and when I open them in an editor they contain the expected text. But when I download a tar or pdf file and try to open it, I'm told the file is corrupted.
Here are the 2 main methods of my class HttpClient. HttpClient::get sends an HTTP request to the host named in the URL and then calls the 2nd main method, HttpClient::receive, which determines whether the data is binary or text and writes the whole HTTP response body to a file in binary or text mode.
I decided not to show the other methods, but I can if someone needs them.
HttpClient::get:
bool HttpClient::get() {
std::string protocol = getProtocol();
if (protocol != "http://") {
std::cerr << "Don't support no HTTP protocol" << std::endl;
return false;
}
std::string host_name = getHost();
std::string request = "GET ";
request += url + " HTTP/" + HTTP_VERSION + "\r\n";
request += "Host: " + host_name + "\r\n";
request += "Accept-Encoding: gzip\r\n";
request += "Connection: close\r\n";
request += "\r\n";
sock = socket(AF_INET, SOCK_STREAM, 0);
if (sock < 0) {
std::cerr << "Can't create socket" << std::endl;
return false;
}
addr.sin_family = AF_INET;
addr.sin_port = htons(HTTP_PORT);
raw_host = gethostbyname(host_name.c_str());
if (raw_host == NULL) {
std::cerr << "No such host: " << host_name << std::endl;
return false;
}
if (!this->connect()) {
std::cerr << "Can't connect" << std::endl;
return false;
} else {
std::cout << "Connection established" << std::endl;
}
if (!sendAll(request)) {
std::cerr << "Error while sending HTTP request" << std::endl;
return false;
}
if (!receive()) {
std::cerr << "Error while receiving HTTP response" << std::endl;
return false;
}
close(sock);
return true;
}
HttpClient::receive:
bool HttpClient::receive() {
char buf[BUF_SIZE];
std::string response = "";
std::ofstream file;
FILE *fd = NULL;
while (1) {
ssize_t bytes_read = recv(sock, buf, BUF_SIZE - 1, 0); // signed, so the < 0 check below can actually fire
if (bytes_read < 0)
return false;
buf[bytes_read] = '\0';
if (!file.is_open())
std::cout << buf;
if (!file.is_open()) {
response += buf;
std::string content = getHeader(response, "Content-Type");
if (!content.empty()) {
std::cout << "Content-Type: " << content << std::endl;
if (content.find("text/") == std::string::npos) {
std::cout << "Binary mode" << std::endl;
file.open(filename, std::ios::binary);
}
else {
std::cout << "Text mode" << std::endl;
file.open(filename);
}
std::string::size_type start_file = response.find("\r\n\r\n");
file << response.substr(start_file + 4);
}
}
else
file << buf;
if (bytes_read == 0) {
file.close();
break;
}
}
return true;
}
I can't find help, but i think that binary data is encoded in some way, but how to decode it?

I can't find help, but i think that binary data is encoded in some way, but how to decode it?
You don't explain why you think this way but the following line from your request might cause some encoding you don't handle:
request += "Accept-Encoding: gzip\r\n";
Here you explicitly say that you are willing to accept content encoded (compressed) with gzip. But looking at your code, you are not even checking whether the content is declared as gzip-encoded by analyzing the Content-Encoding header.
Apart from this the following line might cause a problem too:
request += url + " HTTP/" + HTTP_VERSION + "\r\n";
You don't show what HTTP_VERSION is, but assuming it is 1.1, you also have to deal with Transfer-Encoding: chunked.
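As a rough sketch of the check described above (headerValue and bodyNeedsDecoding are hypothetical helper names, not part of the asker's HttpClient), the client could inspect the response headers before writing the body to disk. It assumes headerBlock holds the status line and headers, i.e. everything before the "\r\n\r\n" separator, and it only detects the encodings; it does not decode them.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lower-cases a copy of s (header names are case-insensitive).
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: returns the trimmed value of header `name` found in
// `headerBlock` (status line plus headers), or "" if the header is absent.
std::string headerValue(const std::string& headerBlock, const std::string& name) {
    const std::string lowered = toLower(headerBlock);
    const std::string key = "\r\n" + toLower(name) + ":";
    std::string::size_type pos = lowered.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    std::string::size_type end = headerBlock.find("\r\n", pos);
    if (end == std::string::npos) end = headerBlock.size();
    const std::string value = headerBlock.substr(pos, end - pos);
    // Strip surrounding spaces and tabs.
    const std::string::size_type first = value.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::string::size_type last = value.find_last_not_of(" \t");
    return value.substr(first, last - first + 1);
}

// True when the body cannot be written to disk as-is: it is either
// gzip-compressed or framed with the chunked transfer coding.
bool bodyNeedsDecoding(const std::string& headerBlock) {
    return toLower(headerValue(headerBlock, "Content-Encoding")) == "gzip" ||
           toLower(headerValue(headerBlock, "Transfer-Encoding")) == "chunked";
}
```

A simpler way out, if decoding is not wanted, is to not send Accept-Encoding: gzip in the first place.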

Thanks everyone.
I solved this problem by changing response += buf; to response.append(buf, bytes_read); and file << buf; to file.write(buf, bytes_read);.
It was a mistake to write binary data as if it were a null-terminated string.
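To see why that change matters, here is a minimal sketch (appendAsCString and appendWithLength are made-up names for illustration): appending a buffer as a C string stops at the first zero byte, while passing the byte count keeps everything recv() delivered.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: treats buf as a C string, so copying stops at the
// first '\0' byte. This is what `response += buf` and `file << buf` do.
std::string appendAsCString(std::string out, const char* buf) {
    out += buf;
    return out;
}

// Hypothetical helper: appends exactly len bytes, zero bytes included.
// This mirrors `response.append(buf, bytes_read)`.
std::string appendWithLength(std::string out, const char* buf, std::size_t len) {
    out.append(buf, len);
    return out;
}
```

For a 6-byte chunk such as {'P','D','F','\0',1,2}, the first helper keeps only 3 bytes while the second keeps all 6, which is why tar and pdf files came out truncated and corrupted.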

Related

Why doesn't this "http responder" respond to browser access outside the same PC?

Premise:
I'm building on newly-learned networking fundamentals learned from these two questions: one, two.
I'll call the code at the bottom of this post my "http responder," and not an "http server," since I recently got an educational/appreciated slap on the wrist for calling it the latter.
The program functions as follows:
it listens at INADDR_ANY port 9018 (a naively/randomly-chosen number)
it dumps (to stdout) the content received at the accepted socket until there's no more content to read
it sends a minimal HTTP response with status OK.
(in case @Remy Lebeau visits this question, item 2, specifically, is why this program is not an HTTP server: it does not parse the incoming HTTP request, it just dumbly dumps it and responds -- even in the case of a closed TCP connection -- but I believe this is not relevant to the question asked here).
From my second link, above, I learned about why a web server would want to listen to a specific port on all interfaces.
My understanding is that the way this is done in C-family languages is by binding to INADDR_ANY (as opposed to a specific IP address, like "127.0.0.13").
Question:
When I run this program, I observe the expected result if I try to connect from a web browser that is running on the same PC as where the executable is run: my browser shows a minimal webpage with content "I'm the content" if I connect to 127.0.0.1:9018, 127.0.0.2:9018, 127.0.0.13:9018, 127.0.0.97:9018, etc.
Most relevant to this question, I also get the same minimal webpage by pointing my browser to 10.0.0.17:9018, which is the IP address assigned to my "wlp1s0" interface:
$ ifconfig
...
wlp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.17 netmask 255.255.255.0 broadcast 10.0.0.255
inet6 fe80::5f8c:c301:a6a3:6e35 prefixlen 64 scopeid 0x20<link>
ether f8:59:71:01:89:cf txqueuelen 1000 (Ethernet)
RX packets 1272659 bytes 1760801882 (1.7 GB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 543118 bytes 74285210 (74.2 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
However, I only observe this desired webpage if the browser that I point to 10.0.0.17:9018 is running on the same PC as where the a.out is running.
From another PC on the same network, if I point its browser to 10.0.0.17:9018, the browser spins without connecting, and eventually says "Hmm...can't reach this page" and "10.0.0.17 took too long to respond".
So my question is: what are reasons why only a browser running on the same PC as the running a.out can connect to the "http responder"? Why do browsers on a different PC in the same network seem unable to connect?
What I have tried:
On the other PC, I am able to ping 10.0.0.17 -- and that just about exhausts my knowledge of how to debug networking issues.
I considered whether the issue at root is more likely to be "networking stuff", which might make this question better asked at Super User, but then I thought to start my inquiry with Stack Overflow, in case the issue is in the C++ code.
The code:
// main.cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <iostream>
#include <netinet/in.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdexcept>
#include <sstream>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#define IP "0.0.0.0"
#define PORT (9018)
/**
* A primitive, POC-level HTTP server that accepts its first incoming connection
* and sends back a minimal HTTP OK response.
*/
class Server {
private:
static const std::string ip_;
static const std::uint16_t port_{PORT};
int listen_sock_;
pthread_t tid_;
public:
Server() { ///< create + bind listen_sock_; start thread for startRoutine().
using namespace std;
int result;
if (! createSocket()) { throw runtime_error("failed creating socket"); }
if (! bindSocket()) { throw runtime_error("failed binding socket"); }
if ((result = pthread_create(&tid_, NULL, startRoutine, this))) {
std::stringstream ss;
ss << "pthread_create() error " << errno << "(" << result << ")";
std::cerr << ss.str() << std::endl;
throw runtime_error("failed spawning Server thread");
}
}
~Server() { ///< wait for the spawned thread and destroy listen_sock_.
pthread_join( tid_, NULL );
destroySocket();
}
private:
bool createSocket() { ///< Creates listen_sock_ as a stream socket.
listen_sock_ = socket(PF_INET, SOCK_STREAM, 0);
if (listen_sock_ < 0) {
std::stringstream ss;
ss << "socket() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
}
return (listen_sock_ >= 0);
}
void destroySocket() { ///< shut down and closes listen_sock_.
if (listen_sock_ >= 0) {
shutdown(listen_sock_, SHUT_RDWR);
close(listen_sock_);
}
}
bool bindSocket() { ///< binds listen_sock_ to ip_ and port_.
int ret;
sockaddr_in me;
me.sin_family = PF_INET;
me.sin_port = htons(port_);
me.sin_addr.s_addr = INADDR_ANY;
int optval = 1;
setsockopt(listen_sock_, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof optval);
if ((ret = bind(listen_sock_, (sockaddr*)&me, sizeof me))) {
std::stringstream ss;
ss << "bind() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
}
return (! ret);
}
/**
* Accept a connection from listen_sock_.
* Caller guarantees listen_sock_ has been listen()ed to already.
* @param tv [in, out] How long to wait to accept a connection.
* @return accepted socket; -1 on any error.
*/
int acceptConnection(timeval& tv) {
int sock = -1;
int ret;
fd_set readfds;
sockaddr_in peer;
socklen_t addrlen = sizeof peer;
FD_ZERO(&readfds);
FD_SET(listen_sock_, &readfds);
ret = select(listen_sock_ + 1, &readfds, NULL, NULL, &tv);
if (ret < 0) {
std::stringstream ss;
ss << "select() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
return sock;
}
else if (! ret) {
std::cout << "no connections within " << tv.tv_sec << " seconds"
<< std::endl;
return sock;
}
if ((sock = accept(listen_sock_, (sockaddr*)&peer, &addrlen)) < 0) {
std::stringstream ss;
ss << "accept() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
}
else {
std::stringstream ss;
ss << "socket " << sock << " accepted connection from "
<< inet_ntoa( peer.sin_addr ) << ":" << ntohs(peer.sin_port);
std::cout << ss.str() << std::endl;
}
return sock;
}
static void dumpReceivedContent(const int& sock) { ///< read & dump from sock.
fd_set readfds;
struct timeval tv = {30, 0};
int ret;
FD_ZERO(&readfds);
FD_SET(sock, &readfds);
ret = select(sock + 1, &readfds, NULL, NULL, &tv);
if (ret < 0) {
std::stringstream ss;
ss << "select() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
return;
}
else if (! ret) {
std::cout << "no content received within " << tv.tv_sec << "seconds"
<< std::endl;
return;
}
if (FD_ISSET(sock, &readfds)) {
ssize_t bytes_read;
char buf[80] = {0};
fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);
std::cout << "received content:" << std::endl;
std::cout << "----" << std::endl;
while ((bytes_read = read(sock, buf, (sizeof buf) - 1)) >= 0) {
buf[bytes_read] = '\0';
std::cout << buf;
}
std::cout << std::endl << "----" << std::endl;
}
}
static void sendMinHttpResponse(const int& sock) { ///< min HTTP OK + content.
static const std::string html =
"<!doctype html>"
"<html lang=en>"
"<head>"
"<meta charset=utf-8>"
"<title>blah</title>"
"</head>"
"<body>"
"<p>I'm the content</p>"
"</body>"
"</html>";
std::stringstream resp;
resp << "HTTP/1.1 200 OK\r\n"
<< "Content-Length: " << html.length() << "\r\n"
<< "Content-Type: text/html\r\n\r\n"
<< html;
write(sock, resp.str().c_str(), resp.str().length());
}
/**
* Thread start routine: listen for, then accept connections; dump received
* content; send a minimal response.
*/
static void* startRoutine(void* arg) {
Server* s;
if (! (s = (Server*)arg)) {
std::cout << "Bad arg" << std::endl;
return NULL;
}
if (listen(s->listen_sock_, 3)) {
std::stringstream ss;
ss << "listen() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
return NULL;
}
std::cout << "Server accepting connections at "
<< s->ip_ << ":" << s->port_ << std::endl;
{
timeval tv = { 30, 0 };
int sock = s->acceptConnection(tv);
if (sock < 0) {
std::cout << "no connections accepted" << std::endl;
return NULL;
}
dumpReceivedContent(sock);
sendMinHttpResponse(sock);
shutdown(sock, SHUT_RDWR);
close(sock);
}
return NULL;
}
};
const std::string Server::ip_{IP};
int main( int argc, char* argv[] ) {
Server s;
return 0;
}
Compilation/execution:
This is a "working" case when the http responder receives a connection from a web browser on the same PC connecting to 10.0.0.17:9018:
$ g++ -g ./main.cpp -lpthread && ./a.out
Server accepting connections at 0.0.0.0:9018
socket 4 accepted connection from 10.0.0.17:56000
received content:
----
GET / HTTP/1.1
Host: 10.0.0.17:9018
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
----
This is the problem/question case when the http responder receives nothing from a web browser on a different PC in the same network connecting to 10.0.0.17:9018:
$ ./a.out
Server accepting connections at 0.0.0.0:9018
no connections within 0 seconds
no connections accepted
** The "no connections within 0 seconds" message is because select() updated the struct timeval.tv_sec field -- the program has actually waited 30 seconds.

Why does only Firefox display the response from this HTTP server?

I'm trying to get straight in my head the relationship between HTTP and TCP.
I tried to resolve (what I perceived as) contradictory answers from a web search of "tcp vs http" by writing a server that listens at a TCP socket bound to some address+port, then typing that address+port into a web browser.
Having done so, I saw that the content received at the accept()ed socket was text with human-readable "HTTP stuff" (my knowledge of HTTP isn't enough to intelligently identify the content).
From Chrome, my server receives:
GET / HTTP/1.1
Host: 127.0.0.23:9018
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
...and from Firefox, my server receives:
GET / HTTP/1.1
Host: 127.0.0.23:9018
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
From the above results, I conjectured that HTTP is sending HTTP-conformant bytes (is it always ASCII?) over a TCP connection to a server's socket that has been accept()ed after listen()ing to a specific address+port.
So I further conjectured that in order to get content to show up in a web browser that connects to the address+port that my server is listen()ing at, my server should write() some kind of HTTP-compliant response to the socket.
This Stack Overflow Q&A gave me a candidate minimal HTTP response.
Putting it all together, my server's MCVE code is:
#include <arpa/inet.h>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <iostream>
#include <netinet/in.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdexcept>
#include <sstream>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#define IP "127.0.0.23"
#define PORT (9018)
/**
* A primitive, POC-level HTTP server that accepts its first incoming connection
* and sends back a minimal HTTP OK response.
*/
class Server {
private:
static const std::string ip_;
static const std::uint16_t port_{PORT};
int listen_sock_;
pthread_t tid_;
public:
/**
* Ctor: create and bind listen_sock_ and start a thread for startRoutine().
*/
Server() {
using namespace std;
int result;
if (! createSocket()) { throw runtime_error("failed creating socket"); }
if (! bindSocket()) { throw runtime_error("failed binding socket"); }
if ((result = pthread_create(&tid_, NULL, startRoutine, this))) {
std::stringstream ss;
ss << "pthread_create() error " << errno << "(" << result << ")";
std::cerr << ss.str() << std::endl;
throw runtime_error("failed spawning Server thread");
}
}
/** Dtor: wait for the spawned thread and destroy listen_sock_. */
~Server() {
pthread_join( tid_, NULL );
destroySocket();
}
private:
/** Creates listen_sock_ as a stream socket. */
bool createSocket() {
listen_sock_ = socket(PF_INET, SOCK_STREAM, 0);
if (listen_sock_ < 0) {
std::stringstream ss;
ss << "socket() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
}
return (listen_sock_ >= 0);
}
/** Shuts down and closes listen_sock_. */
void destroySocket() {
if (listen_sock_ >= 0) {
shutdown(listen_sock_, SHUT_RDWR);
close(listen_sock_);
}
}
/** Binds listen_sock_ to ip_ and port_. */
bool bindSocket() {
int ret;
sockaddr_in me;
me.sin_family = PF_INET;
me.sin_port = htons(port_);
me.sin_addr.s_addr = inet_addr(ip_.c_str());
int optval = 1;
setsockopt(listen_sock_, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof optval);
if ((ret = bind(listen_sock_, (sockaddr*)&me, sizeof me))) {
std::stringstream ss;
ss << "bind() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
}
return (! ret);
}
/**
* Accept a connection from listen_sock_.
* Caller guarantees listen_sock_ has been listen()ed to already.
* @param tv [in, out] How long to wait to accept a connection.
* @return accepted socket; -1 on any error.
*/
int acceptConnection(timeval& tv) {
int sock = -1;
int ret;
fd_set readfds;
sockaddr_in peer;
socklen_t addrlen = sizeof peer;
FD_ZERO(&readfds);
FD_SET(listen_sock_, &readfds);
ret = select(listen_sock_ + 1, &readfds, NULL, NULL, &tv);
if (ret < 0) {
std::stringstream ss;
ss << "select() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
return sock;
}
else if (! ret) {
std::cout << "no connections within " << tv.tv_sec << "seconds"
<< std::endl;
return sock;
}
if ((sock = accept(listen_sock_, (sockaddr*)&peer, &addrlen)) < 0) {
std::stringstream ss;
ss << "accept() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
}
else {
std::stringstream ss;
ss << "socket " << sock << " accepted connection from "
<< inet_ntoa( peer.sin_addr ) << ":" << ntohs(peer.sin_port);
std::cout << ss.str() << std::endl;
}
return sock;
}
/** Read from the specified socket and dump to stdout. */
static void dumpReceivedContent(const int& sock) {
fd_set readfds;
struct timeval tv = {30, 0};
int ret;
FD_ZERO(&readfds);
FD_SET(sock, &readfds);
ret = select(sock + 1, &readfds, NULL, NULL, &tv);
if (ret < 0) {
std::stringstream ss;
ss << "select() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
return;
}
else if (! ret) {
std::cout << "no content received within " << tv.tv_sec << "seconds"
<< std::endl;
return;
}
if (FD_ISSET(sock, &readfds)) {
ssize_t bytes_read;
char buf[80] = {0};
fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);
std::cout << "received content:" << std::endl;
std::cout << "----" << std::endl;
while ((bytes_read = read(sock, buf, (sizeof buf) - 1)) >= 0) {
buf[bytes_read] = '\0';
std::cout << buf;
}
std::cout << std::endl << "----" << std::endl;
}
}
/** Write a minimal HTTP OK response to the specified socket. */
static void sendMinHttpResponse(const int& sock) {
static const std::string resp =
"HTTP/1.1 200 OK\r\n"
"Content-Length: 13\r\n"
"Content-Type: text/plain\r\n\r\nHello World!";
write(sock, resp.c_str(), resp.length());
}
/**
* Thread start routine: listen for, then accept connections; dump received
* content; send a minimal response.
*/
static void* startRoutine(void* arg) {
Server* s;
if (! (s = (Server*)arg)) {
std::cout << "Bad arg" << std::endl;
return NULL;
}
if (listen(s->listen_sock_, 3)) {
std::stringstream ss;
ss << "listen() error " << errno << "(" << strerror(errno) << ")";
std::cerr << ss.str() << std::endl;
return NULL;
}
std::cout << "Server accepting connections at "
<< s->ip_ << ":" << s->port_ << std::endl;
{
timeval tv = { 30, 0 };
int sock = s->acceptConnection(tv);
if (sock < 0) {
std::cout << "no connections accepted" << std::endl;
return NULL;
}
dumpReceivedContent(sock);
sendMinHttpResponse(sock);
shutdown(sock, SHUT_RDWR);
close(sock);
}
return NULL;
}
};
const std::string Server::ip_{IP};
int main( int argc, char* argv[] ) {
Server s;
return 0;
}
When I point Chrome and Chromium browsers to my server (127.0.0.23:9018), I get a blank page with no content, but when I point Firefox to my server, I get the "Hello World!" string that I wanted.
Why does this only work with Firefox, and not with Chrome or Chromium?
Your server responds with an invalid data size Content-Length: 13.
The data is Hello World!, the size is 12.
resp.length() does not count \0, thus the server does not send Hello World!\0.
The header must be Content-Length: 12.
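One way to avoid this kind of mismatch is to derive the header from the body instead of hard-coding the number. A minimal sketch, where makeTextResponse is a hypothetical helper and not part of the question's Server class:

```cpp
#include <string>

// Hypothetical helper: builds a 200 response whose Content-Length is computed
// from the body, so the header cannot drift out of sync with the payload.
std::string makeTextResponse(const std::string& body) {
    return "HTTP/1.1 200 OK\r\n"
           "Content-Length: " + std::to_string(body.size()) + "\r\n"
           "Content-Type: text/plain\r\n"
           "\r\n" + body;
}
```

With this, makeTextResponse("Hello World!") emits Content-Length: 12, and the result can be passed to write() using its data() and size().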
In addition to @273K's answer, another problem I see is that your dumpReceivedContent() method completely ignores the HTTP protocol and just reads and logs everything the client sends until the client disconnects or an error occurs. You then call sendMinHttpResponse() on a now-likely-invalid TCP connection, so the client probably won't be able to receive it.
You can't just blindly read from the TCP connection as you are doing. You MUST parse the client's data as it arrives so you can detect the end of the client's request properly and then leave the TCP connection open until your response has been sent.
Refer to RFC 2616 Section 4.4 and RFC 7230 Section 3.3.3 for the rules you must follow to detect when you have received the client's complete request without over-reading from the TCP connection.
I have a number of previous answers that provide pseudo code for demonstrating how to implement those rules.
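As a very reduced sketch of that idea, assuming a POSIX socket and a request without a body (such as the GET requests shown in the question), the server can stop reading as soon as the blank line that ends the headers has arrived. The names endOfHeaders and readRequestHead are hypothetical, and requests that carry a body would additionally need the Content-Length and chunked rules from the RFC sections cited above.

```cpp
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Returns the offset just past the "\r\n\r\n" that ends the header section,
// or std::string::npos if the terminator has not been received yet.
std::string::size_type endOfHeaders(const std::string& received) {
    const std::string::size_type pos = received.find("\r\n\r\n");
    return pos == std::string::npos ? pos : pos + 4;
}

// Reads from sock until the request line and headers are complete.
// Returns false if the peer closes or an error occurs before that point.
bool readRequestHead(int sock, std::string& received) {
    char buf[512];
    while (endOfHeaders(received) == std::string::npos) {
        const ssize_t n = recv(sock, buf, sizeof buf, 0);
        if (n <= 0) return false;
        received.append(buf, static_cast<std::size_t>(n));  // length-aware append
    }
    return true;
}
```

Once readRequestHead() returns true the connection is still open, so the response can be written before the socket is shut down.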

Winsock2.h can't send http requests

Salutations fellow programmers,
I am trying to write a program that lets you input what you want and then sends your input to the server.
At the moment, my goal is sending HTTP requests to a web page. It connects fine. But when the while loop runs, it immediately sends something through the cin.getline call without me inputting anything. I thought this was weird, but it seemed to work anyway.
Every time I send something like: "GET / HTTP/1.1\r\n\r\n" it will return the correct thing, but anything else I input, like "OPTIONS" returns the source code + "application blocked" (I am at school so it makes sense).
So, I connected to hotspot shield VPN and tested the application, but to my horror when I input something to send it returns nothing.
I searched through stack overflow and google but I haven't been able to find anything so far; probably because I'm searching for the wrong solutions to the problem.
Anyway, if you have time, please scan through the code and send some help. It could just be a VPN and school issue, and I could try at home if the code seems to be working for you, so just let me know.
SPECIFIC OUTLINE OF PROBLEM:
When I use this outside the school network nothing is returned and the while loop doesn't seem to execute. I can connect but the program seems to be in an endless time-out or something.
cout << "Connected to " << hostName << endl;
while (true) {
cout << ">";
cin.getline(sendBuf, sizeof(sendBuf));
string s(sendBuf);
cout << s.c_str() << endl;
send(connectSocket, s.c_str(), sizeof(s.c_str()), 0);
int rec = recv(connectSocket, recvBuf, sizeof(recvBuf), 0);
if (rec > 0) {
cout << recvBuf << endl;
}
else if (rec <= 0) {
cout << "nothing" << endl;
}
}
system("pause");
}
system("pause");
}
my goal is sending HTTP requests to a web page
The code you showed does not attempt to implement any semblance of the HTTP protocol, not even close.
For one thing, if you look at your own example more carefully, you will see that the GET request (which BTW, is missing a required Host header, due to your use of HTTP 1.1) contains 2 line breaks, but cin.getline() (why not std::getline()?) reads only 1 line at a time. So, you read in one line, send it, and wait for a response that doesn't arrive since you didn't finish sending a complete request yet. That would explain why your while loop is hanging.
If you want the user to type in a complete HTTP request and then you send it as-is, you have to read in the ENTIRE request from the user, and then send it entirely to the server, before you can then attempt to receive the server's response. That means you have to handle line breaks between individual message headers, handle the terminating line break that separates the message headers from the message body, and detect the end of the body data.
I would suggest not relying on the user typing in a complete HTTP request as-is. I suggest you prompt the user for relevant pieces and let the user type normal text, and then your code can format that text into a proper HTTP request as needed.
When you are reading the server's response, you can't just blindly read arbitrary chunks of data. You have to process what you read, per the rules of the HTTP protocol. This is particularly important in order to determine when you have reached the end of the response and need to stop reading. The end of the response can be signaled in one of many different ways, as outlined in RFC 2616 Section 4.4 Message Length.
You are also making some common newbie mistakes in your TCP handling in general. TCP is a streaming transport, and you are not taking into account that send() and recv() can send/receive fewer bytes than requested, or that recv() does not return null-terminated data.
With that said, try something like this:
void sendAll(SOCKET sckt, const void *buf, int buflen)
{
// send all bytes until buflen has been sent,
// or an error occurs...
const char *pbuf = static_cast<const char*>(buf);
while (buflen > 0)
{
int numSent = send(sckt, pbuf, buflen, 0);
if (numSent < 0) {
std::ostringstream errMsg;
errMsg << "Error sending to socket: " << WSAGetLastError();
throw std::runtime_error(errMsg.str());
}
pbuf += numSent;
buflen -= numSent;
}
}
int readSome(SOCKET sckt, void *buf, int buflen)
{
// read as many bytes as possible until buflen has been received,
// the socket is disconnected, or an error occurs...
char *pbuf = static_cast<char*>(buf);
int total = 0;
while (buflen > 0)
{
int numRecvd = recv(sckt, pbuf, buflen, 0);
if (numRecvd < 0) {
std::ostringstream errMsg;
errMsg << "Error receiving from socket: " << WSAGetLastError();
throw std::runtime_error(errMsg.str());
}
if (numRecvd == 0) break;
pbuf += numRecvd;
buflen -= numRecvd;
total += numRecvd;
}
return total;
}
void readAll(SOCKET sckt, void *buf, int buflen)
{
// read all bytes until buflen has been received,
// or an error occurs...
if (readSome(sckt, buf, buflen) != buflen)
throw std::runtime_error("Socket disconnected unexpectedly");
}
std::string readLine(SOCKET sckt)
{
// read a line of characters until a line break is received...
std::string line;
char c;
do
{
readAll(sckt, &c, 1);
if (c == '\r')
{
readAll(sckt, &c, 1);
if (c == '\n') break;
line.push_back('\r');
}
else if (c == '\n') {
break;
}
line.push_back(c);
}
while (true);
return line;
}
...
inline void ltrim(std::string &s) {
// erase whitespace on the left side...
s.erase(s.begin(), std::find_if(s.begin(), s.end(), [](int ch) {
return !std::isspace(ch);
}));
}
inline void rtrim(std::string &s) {
// erase whitespace on the right side...
s.erase(std::find_if(s.rbegin(), s.rend(), [](int ch) {
return !std::isspace(ch);
}).base(), s.end());
}
inline void trim(std::string &s) {
// erase whitespace on both sides...
ltrim(s);
rtrim(s);
}
inline void upperCase(std::string &s)
{
// translate all characters to upper-case...
std::transform(s.begin(), s.end(), s.begin(), ::toupper);
}
...
std::string makeRequest(const std::string &host, const std::string &method, const std::string &resource, const std::vector<std::string> &extraHeaders, const void *body, int bodyLength)
{
std::ostringstream oss;
oss << method << " " << resource << " HTTP/1.1\r\n";
oss << "Host: " << host << "\r\n";
oss << "Content-Length: " << bodyLength << "\r\n";
for(auto &hdr : extraHeaders)
{
// TODO: ignore Host and Content-Length...
oss << hdr << "\r\n";
}
oss << "\r\n";
oss.write(static_cast<const char*>(body), bodyLength);
return oss.str();
}
bool getHeaderValue(const std::vector<std::string> &headers, const std::string &headerName, std::string &value)
{
value.clear();
std::string toFind = headerName;
upperCase(toFind);
// find the requested header by name...
for(auto &s : headers)
{
std::string::size_type pos = s.find(':');
if (pos != std::string::npos)
{
std::string name = s.substr(0, pos); // everything before the ':' (pos-1 would drop the last character)
trim(name);
upperCase(name);
if (name == toFind)
{
// now return its value...
value = s.substr(pos+1);
trim(value);
return true;
}
}
}
// name not found
return false;
}
...
std::cout << "Connected to " << hostName << std::endl;
try
{
std::string method, resource, hdr, data;
std::string status, version, reason;
std::vector<std::string> headers;
int statusCode, rec;
do
{
headers.clear();
data.clear();
// get user input
std::cout << "Method > " << std::flush;
if (!std::getline(std::cin, method))
throw std::runtime_error("Error reading from stdin");
upperCase(method);
std::cout << "Resource > " << std::flush;
if (!std::getline(std::cin, resource))
throw std::runtime_error("Error reading from stdin");
std::cout << "Extra Headers > " << std::flush;
while (std::getline(std::cin, hdr) && !hdr.empty())
headers.push_back(hdr);
if (!std::cin)
throw std::runtime_error("Error reading from stdin");
std::cout << "Data > " << std::flush;
// use Ctrl-Z or Ctrl-D to end the data, depending on platform...
std::ios_base::fmtflags flags = std::cin.flags();
std::cin >> std::noskipws;
std::copy(std::istream_iterator<char>(std::cin), std::istream_iterator<char>(), std::back_inserter(data));
if (!std::cin)
throw std::runtime_error("Error reading from stdin");
std::cin.flags(flags);
std::cin.clear();
// send request
std::string request = makeRequest(hostName, method, resource, headers, data.c_str(), data.length());
std::cout << "Sending request:" << std::endl << request << std::endl;
// TODO: reconnect to hostName if previous request disconnected...
sendAll(connectSocket, request.c_str(), request.length());
// receive response
headers.clear();
data.clear();
// read the status line and parse it...
status = readLine(connectSocket);
std::cout << status << std::endl;
std::getline(std::istringstream(status) >> version >> statusCode, reason);
upperCase(version);
// read the headers...
do
{
hdr = readLine(connectSocket);
std::cout << hdr << std::endl;
if (hdr.empty()) break;
headers.push_back(hdr);
}
while (true);
// The transfer-length of a message is the length of the message-body as
// it appears in the message; that is, after any transfer-codings have
// been applied. When a message-body is included with a message, the
// transfer-length of that body is determined by one of the following
// (in order of precedence):
// 1. Any response message which "MUST NOT" include a message-body (such
// as the 1xx, 204, and 304 responses and any response to a HEAD
// request) is always terminated by the first empty line after the
// header fields, regardless of the entity-header fields present in
// the message.
if (((statusCode / 100) != 1) &&
(statusCode != 204) &&
(statusCode != 304) &&
(method != "HEAD"))
{
// 2. If a Transfer-Encoding header field (section 14.41) is present and
// has any value other than "identity", then the transfer-length is
// defined by use of the "chunked" transfer-coding (section 3.6),
// unless the message is terminated by closing the connection.
if (getHeaderValue(headers, "Transfer-Encoding", hdr))
upperCase(hdr);
if (!hdr.empty() && (hdr != "IDENTITY"))
{
std::string chunk;
std::string::size_type oldSize, size;
do
{
chunk = readLine(connectSocket);
std::istringstream(chunk) >> std::hex >> size;
if (size == 0) break;
oldSize = data.size();
data.resize(oldSize + size);
readAll(connectSocket, &data[oldSize], size);
std::cout.write(&data[oldSize], size);
readLine(connectSocket);
}
while (true);
std::cout << std::endl;
do
{
hdr = readLine(connectSocket);
std::cout << hdr << std::endl;
if (hdr.empty()) break;
headers.push_back(hdr);
}
while (true);
}
// 3. If a Content-Length header field (section 14.13) is present, its
// decimal value in OCTETs represents both the entity-length and the
// transfer-length. The Content-Length header field MUST NOT be sent
// if these two lengths are different (i.e., if a Transfer-Encoding
// header field is present). If a message is received with both a
// Transfer-Encoding header field and a Content-Length header field,
// the latter MUST be ignored.
else if (getHeaderValue(headers, "Content-Length", hdr))
{
std::string::size_type size;
if ((std::istringstream(hdr) >> size) && (size > 0))
{
data.resize(size);
readAll(connectSocket, &data[0], size);
std::cout << data;
}
}
// 4. If the message uses the media type "multipart/byteranges", and the
// transfer-length is not otherwise specified, then this self-
// delimiting media type defines the transfer-length. This media type
// MUST NOT be used unless the sender knows that the recipient can parse
// it; the presence in a request of a Range header with multiple byte-
// range specifiers from a 1.1 client implies that the client can parse
// multipart/byteranges responses.
else if (getHeaderValue(headers, "Content-Type", hdr) &&
(hdr.compare(0, 10, "multipart/") == 0))
{
// TODO: extract 'boundary' attribute and read from
// socket until the terminating boundary is reached...
}
// 5. By the server closing the connection.
else
{
do
{
rec = readSome(connectSocket, recvBuf, sizeof(recvBuf));
if (rec == 0) break;
data.append(recvBuf, rec);
std::cout.write(recvBuf, rec);
}
// a short read does not mean the connection is closed; keep reading until readSome() returns 0
while (true);
}
}
std::cout << std::endl;
// use status, headers, and data as needed ...
getHeaderValue(headers, "Connection", hdr);
upperCase(hdr);
if (version == "HTTP/1.0")
{
if (hdr != "KEEP-ALIVE")
break;
}
else
{
if (hdr == "CLOSE")
break;
}
}
while (true);
}
catch (const std::exception &e)
{
std::cerr << e.what() << std::endl;
}
closesocket(connectSocket);
std::cout << "Disconnected from " << hostName << std::endl;
std::system("pause");
Isn't HTTP fun? :-) This is far from a complete HTTP implementation, but it should get you started. As you can see, HTTP is complex to implement from scratch, with many rules and restrictions to follow. You are better off not implementing HTTP manually at all. Plenty of third-party HTTP libraries are available for C++; use one of them and let it handle the hard work, so you can focus on your own business logic.
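To make the chunked branch (rule 2 above) easier to experiment with, here is a minimal sketch that decodes a chunked body that is already fully in memory, so it can be tried without a socket. The function name decodeChunked is hypothetical and not part of the code above; the sketch ignores trailer headers and assumes well-formed CRLF line endings.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the answer's code): decodes a complete
// chunked message body held in memory. Trailers after the last chunk are ignored.
std::string decodeChunked(const std::string &raw)
{
    std::string body;
    std::string::size_type pos = 0;
    for (;;)
    {
        // Each chunk starts with "<hex size>[;extensions]\r\n".
        std::string::size_type eol = raw.find("\r\n", pos);
        if (eol == std::string::npos)
            throw std::runtime_error("missing chunk-size line");

        std::size_t size = 0;
        std::istringstream sizeLine(raw.substr(pos, eol - pos));
        if (!(sizeLine >> std::hex >> size))
            throw std::runtime_error("bad chunk size");
        pos = eol + 2;

        if (size == 0)
            break; // last chunk reached

        if (size > raw.size() - pos)
            throw std::runtime_error("truncated chunk");
        body.append(raw, pos, size);
        pos += size;

        // Every chunk's data is followed by its own CRLF.
        if (raw.compare(pos, 2, "\r\n") != 0)
            throw std::runtime_error("missing CRLF after chunk data");
        pos += 2;
    }
    return body;
}
```

For example, passing "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n" yields "Wikipedia". Note that the chunk size is hexadecimal, which is why the size line is parsed with std::hex.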

Incorporate body of message while redirecting url in fastCGI?

I am implementing FastCGI in C++ with nginx. So far, I have been able to implement basic HTTP request handling and some URL redirection. However, I am not able to send a message body when redirecting from one POST URL to another POST URL. Below is my code:
#include <stdlib.h>
#include <stdio.h>
#include <sstream>
#include <iostream>
#include <string>
#include "string.h"
#include "fcgio.h"
#include <fcgi_stdio.h>
#include <boost/algorithm/string.hpp>
using namespace std;
using namespace boost;
// Maximum bytes
const unsigned long STDIN_MAX = 1000000;
/**
* Note this is not thread safe due to the static allocation of the
* content_buffer.
*/
string get_request_content(const FCGX_Request & request) {
char * content_length_str = FCGX_GetParam("CONTENT_LENGTH", request.envp);
unsigned long content_length = STDIN_MAX;
if (content_length_str) {
content_length = strtol(content_length_str, &content_length_str, 10);
if (*content_length_str) {
cerr << "Can't Parse 'CONTENT_LENGTH='"
<< FCGX_GetParam("CONTENT_LENGTH", request.envp)
<< "'. Consuming stdin up to " << STDIN_MAX << endl;
}
if (content_length > STDIN_MAX) {
content_length = STDIN_MAX;
}
} else {
// Do not read from stdin if CONTENT_LENGTH is missing
content_length = 0;
}
char * content_buffer = new char[content_length];
cin.read(content_buffer, content_length);
content_length = cin.gcount();
// Chew up any remaining stdin - this shouldn't be necessary
// but is because mod_fastcgi doesn't handle it correctly.
// ignore() doesn't set the eof bit in some versions of glibc++
// so use gcount() instead of eof()...
do cin.ignore(1024); while (cin.gcount() == 1024);
string content(content_buffer, content_length);
delete [] content_buffer;
return content;
}
int main(void) {
// Backup the stdio streambufs
streambuf * cin_streambuf = cin.rdbuf();
streambuf * cout_streambuf = cout.rdbuf();
streambuf * cerr_streambuf = cerr.rdbuf();
FCGX_Request request;
FCGX_Init();
FCGX_InitRequest(&request, 0, 0);
while (FCGX_Accept_r(&request) == 0) {
fcgi_streambuf cin_fcgi_streambuf(request.in);
fcgi_streambuf cout_fcgi_streambuf(request.out);
fcgi_streambuf cerr_fcgi_streambuf(request.err);
cin.rdbuf(&cin_fcgi_streambuf);
cout.rdbuf(&cout_fcgi_streambuf);
cerr.rdbuf(&cerr_fcgi_streambuf);
const char * uri = FCGX_GetParam("REQUEST_URI", request.envp);
string content = get_request_content(request);
if (content.length() == 0) {
content = ", something!";
}
const char * mediaType = FCGX_GetParam("REQUEST_METHOD",request.envp);
string value;
if(iequals(mediaType,"POST")&&iequals(uri,"/postmethod")) {
get_request_content(request);
cout << "HTTP/1.1 200 OK\r\nContent-Length: " << content.length() << "\r\n\r\n" << content;
}
if(iequals(mediaType,"GET")&&iequals(uri,"/getmethod")) {
string aalu = "this is the new lenght";
cout << "HTTP/1.1 200 OK\r\nContent-Length: " << aalu.length() << "\r\n\r\n" << aalu;
FCGX_Finish_r(&request);
}
if(iequals(mediaType,"GET")&&iequals(uri,"/redirect")) {
cout << "HTTP/1.1 301\r\nLocation: http://localhost/getmethod\r\n\r\n";
// cout << "Status: 301\r\n"
// << "Location: http://localhost/getmethod\r\n";
// << "\r\n";
// << "<html><body>Not Found</body></html>\n";
}
if(iequals(mediaType,"GET")&&iequals(uri,"/postredirect")) { // problem here
string json = "{\"topic\":\"asdf\",\"message\":\"message\"}";
cout << "HTTP/1.1 308\r\nLocation: http://localhost/postmethod\r\n\r\n";
// cout << "Status: 304\r\n"
// << "Location: http://localhost/postmethod\r\n"
// << "\r\n"
// << "<html><body>json</body></html>\n";
}
if(iequals(mediaType,"POST")&&iequals(uri,"/getredirect")) {
string json = "{\"topic\":\"asdf\",\"message\":\"message\"}";
cout << "HTTP/1.1 303\r\nLocation: http://localhost/getmethod\r\n\r\n";
// cout << "Status: 304\r\n"
// << "Location: http://localhost/postmethod\r\n"
// << "\r\n"
// << "<html><body>json</body></html>\n";
}
if(iequals(mediaType,"POST")&&iequals(uri,"/posttopostredirect")) {
string json = "{\"topic\":\"adf\",\"message\":\"message\"}";
cout << "Status: 307\r\n"
<<"Location: http://localhost/postmethod\r\n"
<<"\r\n"
<<"\n";
// cout << "Status: 305\r\n"
// << "Location: http://localhost/postmethod\r\n"
// << "\r\n"
// << "<html><body>"+json+"</body></html>\n";
}
if(iequals(mediaType,"GET")&&iequals(uri,"/getttogettredirect")) {
string json = "{\"topic\":\"ssdf\",\"message\":\"message\"}";
cout << "HTTP/1.X 301\r\nLocation: http://localhost/getmethod\r\n\r\n";
// cout << "Status: 307\r\n"
// << "Location: http://localhost/postmethod\r\n"
// << "\r\n";
// << "<html><body>json</body></html>\n";
}
}
// restore stdio streambufs
cin.rdbuf(cin_streambuf);
cout.rdbuf(cout_streambuf);
cerr.rdbuf(cerr_streambuf);
return 0;
}
The /posttopostredirect URL redirects to the /postmethod URL. When /posttopostredirect is hit, I want to send the JSON string (above) to the /postmethod URL, but I could not figure out how to do so.
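The body of an HTTP redirect response is ignored by the client. This has nothing to do with FastCGI; it is how HTTP works. A redirect response cannot be used to pass content on to the redirected-to URL. You can put a body in it, but the HTTP client will not forward it; it simply issues a new request to the URL in the Location header: a GET for 303 (and, in practice, for 301 and 302 after a POST), or a repeat of the original request, with the original method and body, for 307 and 308.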
The only thing that can be done is to include any data as parameters encoded in the redirected-to URL. This is done the same way as encoding any parameters in the URL.
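As a rough illustration of passing data through the redirected-to URL, the sketch below percent-encodes a value and appends it as a query parameter to the Location header. The names percentEncode and makeRedirect are hypothetical, and the sketch assumes the target handler reads the value back from its query string (for example via the QUERY_STRING parameter) rather than from a request body.

```cpp
#include <cstddef>
#include <string>

// Percent-encodes every byte except the RFC 3986 "unreserved" characters.
std::string percentEncode(const std::string &in)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        unsigned char c = static_cast<unsigned char>(in[i]);
        bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved)
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hexDigits[c >> 4];
            out += hexDigits[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: builds CGI-style redirect headers whose Location
// carries one query parameter. 303 tells the client to follow up with a GET.
std::string makeRedirect(const std::string &target,
                         const std::string &name,
                         const std::string &value)
{
    return "Status: 303 See Other\r\n"
           "Location: " + target + "?" + percentEncode(name) + "=" + percentEncode(value) + "\r\n"
           "\r\n";
}
```

In the FastCGI loop this could be used as cout << makeRedirect("http://localhost/getmethod", "json", json); with the caveat that URLs have practical length limits, so this only suits small payloads.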

POST Winsock trouble

I have just started studying sockets, but I can't get them to work properly. I have this method for logging in:
int LogIn(const string &name, const string &password)
{
char Buffer[1024];
string data = "login=" + name + "&password=" + password;
sprintf(Buffer, "POST /Test/login_cl.php HTTP/1.1\r"
"\nContent-Length: %d\r"
"\nConnection: Keep-Alive\r"
"\nAccept-Encoding: gzip\r"
"\nAccept-Language: ru-RU,en,*\r"
"\nUser-Agent: Mozilla/5.0\r"
"\nHost: 127.0.0.1\r"
"\nContent-Type: application/x-www-form-urlencoded\r\n\r\n"
"login=%s&"
"password=%s&", data.length(), name.c_str(), password.c_str());
int BytesSent = send(MySocket, Buffer, sizeof(Buffer), 0);
string test;
char ans;
while (recv(MySocket, &ans, 1, 0))
test.push_back(ans);
cout << test << endl;
return 0;
}
The problem is that I can call this method only once. Subsequent calls, or calls to the logout method, do not work. For example:
int result = LogIn(logn, paswd);
int result = LogIn(logn, paswd);
does not work: I receive only one response (the second is empty, and recv returns -1).
Please help, thanks.
int BytesSent = send(MySocket, Buffer, sizeof(Buffer), 0);
This sends too many bytes of data, leaving the other side with a whole bunch of junk at the end of the request that it thinks is part of the next request.
string data = "login=" + name + "&password=" + password;
This has the query with no & on the end.
"password=%s&", data.length(), name.c_str(), password.c_str());
This puts an & on the end of the query. Well, which is it?
You need to fix two things:
Be consistent about the & on the end of the query. (Why do you compose the query string twice anyway?)
Use strlen(Buffer) instead of sizeof(Buffer).
Really though, you should use an implementation of the actual protocol specification rather than code like this. For example, what happens if there's an & in the password? What happens if the server decides to use chunked encoding in the reply? You don't actually implement HTTP 1.1, yet you claim that you do. That will lead to lots and lots of pain, sooner or later.
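To illustrate the sizeof versus strlen point, here is a minimal sketch that formats a request into a fixed buffer and returns the number of bytes that actually make up the request, which is what should be passed to send(). The function name formatRequest is hypothetical, and the body is assumed to be URL-encoded already.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: writes a small POST request into buf and returns the
// number of bytes to send, or 0 if the request did not fit into the buffer.
std::size_t formatRequest(char *buf, std::size_t bufSize, const std::string &body)
{
    int n = std::snprintf(buf, bufSize,
        "POST /Test/login_cl.php HTTP/1.1\r\n"
        "Host: 127.0.0.1\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: %lu\r\n"
        "\r\n"
        "%s",
        static_cast<unsigned long>(body.length()), body.c_str());

    // snprintf returns the length it wanted to write; a value >= bufSize
    // means the output was truncated.
    if (n < 0 || static_cast<std::size_t>(n) >= bufSize)
        return 0;
    return static_cast<std::size_t>(n);
}
```

With a 1024-byte buffer, the returned length equals strlen(buf) and is far smaller than sizeof(buf); sending sizeof(buf) bytes would append whatever happens to follow the terminating null in the buffer.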
You are sending too many bytes, you are not formatting the body of the POST message correctly, and you are not reading the response correctly at all.
If you are going to use C++ for some of your string handling, then you should use C++ for all of it; avoid mixing in C string handling if you can.
Try something more like this instead:
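#include <iomanip> // setw, setfill, hex, uppercase
#include <sstream> // ostringstream
const string SafeChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789*-._";
string Urlencode(const string &str)
{
if (str.length() == 0)
return string();
ostringstream buffer;
for (string::size_type I = 0; I < str.length(); ++I)
{
char ch = str[I];
if (ch == ' ')
buffer << '+';
else if (SafeChars.find(ch) != string::npos)
buffer << ch;
else
// write the byte's numeric value as two hex digits, not the character itself
buffer << '%' << setw(2) << setfill('0') << hex << uppercase << int(static_cast<unsigned char>(ch));
}
return buffer.str();
}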
int LogIn(const string &name, const string &password)
{
string data = "login=" + Urlencode(name) + "&password=" + Urlencode(password);
ostringstream buffer;
buffer << "POST /Test/login_cl.php HTTP/1.1\r\n"
<< "Content-Length: " << data.length() << "\r\n"
<< "Connection: Keep-Alive\r\n"
<< "Accept-Encoding: gzip\r\n"
<< "Accept-Language: ru-RU,en,*\r\n"
<< "User-Agent: Mozilla/5.0\r\n"
<< "Host: 127.0.0.1\r\n"
<< "Content-Type: application/x-www-form-urlencoded\r\n"
<< "\r\n"
<< data;
string request = buffer.str();
const char *req = request.c_str();
int reqlen = request.length();
do
{
// send only the part that has not been sent yet
int BytesSent = send(MySocket, req, reqlen, 0);
if (BytesSent <= 0)
return -1;
req += BytesSent;
reqlen -= BytesSent;
}
while (reqlen > 0);
// you REALLY need to flesh out this reading logic!
// See RFC 2616 Section 4.4 for details
string response;
char ch;
while (recv(MySocket, &ch, 1, 0) > 0)
response += ch;
cout << response << endl;
return 0;
}
I will leave it as an exercise for you to learn the correct way to read an HTTP response. HINT: it is a LOT harder than you think, especially since you are sending the Accept-Encoding: gzip and Connection: Keep-Alive headers, which have a big impact on response handling. Read RFC 2616 Section 4.4 for details on how to determine the response length and format.
With that said, HTTP is not a trivial protocol to implement by hand, so you really should use a premade HTTP library, such as libcurl, or use Microsoft's own WinInet or WinHTTP APIs.
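As a starting point for the reading logic, the sketch below shows only the Content-Length case: given the bytes received so far, it decides whether a complete response has arrived. This is why reading until recv() returns 0 stalls on a keep-alive connection, since the server does not close the socket after the response. The names findHeader and haveFullResponse are hypothetical; chunked encoding, bodiless status codes, and gzip decoding are deliberately not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: finds a header by case-insensitive name in a raw
// header block made of "Name: value\r\n" lines. Returns false if absent.
bool findHeader(const std::string &head, const std::string &name, std::string &value)
{
    std::istringstream lines(head);
    std::string line;
    while (std::getline(lines, line))
    {
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);

        std::string::size_type colon = line.find(':');
        if (colon == std::string::npos || colon != name.size())
            continue;

        bool match = true;
        for (std::size_t i = 0; i < colon && match; ++i)
            match = std::tolower(static_cast<unsigned char>(line[i])) ==
                    std::tolower(static_cast<unsigned char>(name[i]));
        if (!match)
            continue;

        std::string::size_type start = line.find_first_not_of(" \t", colon + 1);
        value = (start == std::string::npos) ? std::string() : line.substr(start);
        return true;
    }
    return false;
}

// Returns true once buf holds the full header block plus Content-Length
// bytes of body. Responses without Content-Length are reported as incomplete.
bool haveFullResponse(const std::string &buf)
{
    std::string::size_type headEnd = buf.find("\r\n\r\n");
    if (headEnd == std::string::npos)
        return false;

    std::string value;
    if (!findHeader(buf.substr(0, headEnd), "Content-Length", value))
        return false;

    std::size_t length = 0;
    std::istringstream parser(value);
    if (!(parser >> length))
        return false;

    return buf.size() - (headEnd + 4) >= length;
}
```

A receive loop could append each recv() result to a std::string and stop as soon as haveFullResponse() returns true, leaving the socket open for the next request.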