Separating HTTP Response Body from Header in C++


I'm currently writing my own C++ HTTP class for a project, and I'm trying to find a way to separate the response body from the header, because the body is the only part I need to return.
Here's a sample of the raw HTTP headers, in case you're not familiar with them:
HTTP/1.1 200 OK
Server: nginx/0.7.65
Date: Wed, 29 Dec 2010 06:13:07 GMT
Content-Type: text
Connection: keep-alive
Vary: Cookie
Content-Length: 82
Below that is the HTML/response body. What would be the best way to do this? I'm only using the Winsock library for the requests, by the way (I don't think this even matters).
Thanks in advance.

HTTP headers are terminated by the sequence \r\n\r\n (a blank line). Just search for that, and return everything after it. (The body may be empty, of course, e.g. if it was in response to a HEAD request.)
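A minimal sketch of that search in standard C++ might look like the following. It assumes the whole response has already been read into a std::string (for example by looping over recv()); the names HttpParts and split_response are made up for this example and are not part of Winsock or any library.

```cpp
#include <cassert>
#include <string>

// Result of splitting a raw HTTP response (hypothetical helper type).
struct HttpParts {
    bool found;          // false if the "\r\n\r\n" terminator was not present
    std::string header;  // status line and header fields, without the blank line
    std::string body;    // everything after the blank line (may be empty)
};

// Splits `raw` at the first "\r\n\r\n".
// Assumption: `raw` already holds the complete response.
HttpParts split_response(const std::string& raw) {
    static const std::string terminator = "\r\n\r\n";
    const std::string::size_type pos = raw.find(terminator);
    if (pos == std::string::npos) {
        // No blank line yet: treat everything as (incomplete) header data.
        return HttpParts{false, raw, std::string()};
    }
    return HttpParts{true, raw.substr(0, pos),
                     raw.substr(pos + terminator.size())};
}

// Usage sketch with a made-up response.
void split_response_example() {
    const HttpParts parts =
        split_response("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi");
    assert(parts.found);
    assert(parts.body == "hi");
}
```

If the terminator is missing, the sketch reports found == false so the caller can keep reading from the socket instead of returning a truncated body.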

Do you need to roll your own? There are C/C++ libraries out there for doing HTTP, e.g. libcurl. If you need to support the full gamut of HTTP, then it's not always a simple delineation. You might also have to cater, for example, for chunked encoding.
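To give an idea of what catering for chunked encoding involves, here is a rough sketch of a decoder for a body sent with "Transfer-Encoding: chunked". It assumes the complete body (everything after the blank line) is already in memory, skips chunk extensions, and ignores trailer fields; decode_chunked is a hypothetical name, and a library such as libcurl handles all of this for you.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Decodes a chunked body from `in` into `out`.
// Returns false if the data is truncated or malformed.
bool decode_chunked(const std::string& in, std::string& out) {
    out.clear();
    std::string::size_type pos = 0;
    for (;;) {
        // Each chunk starts with its size in hexadecimal, ended by CRLF.
        const std::string::size_type eol = in.find("\r\n", pos);
        if (eol == std::string::npos) return false;
        const std::string size_line = in.substr(pos, eol - pos);
        char* end = nullptr;
        // strtoul stops at the first non-hex character, e.g. the ';'
        // that introduces a chunk extension.
        const unsigned long size = std::strtoul(size_line.c_str(), &end, 16);
        if (end == size_line.c_str()) return false;  // no hex digits found
        pos = eol + 2;
        if (size == 0) return true;  // last chunk; trailers are ignored
        // The chunk data must be present and followed by CRLF.
        const std::string::size_type remaining = in.size() - pos;
        if (remaining < size || remaining - size < 2) return false;
        if (in.compare(pos + size, 2, "\r\n") != 0) return false;
        out.append(in, pos, size);
        pos += size + 2;
    }
}

// Usage sketch with a made-up chunked body.
void decode_chunked_example() {
    std::string body;
    const bool ok = decode_chunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n", body);
    assert(ok);
    assert(body == "Wikipedia");
}
```

Note that std::strtoul is more lenient than the HTTP grammar (it accepts leading whitespace and a "0x" prefix), so a strict parser would validate the size line itself.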

DO
  IF Socket.IsServerReady(Sock) THEN
    Text = text + Socket.Read(Sock, 65000) 'print text '' 32000 bytes... whatever they give us
    Bytes = bytes + Socket.Transferred
    StatusBar.Panel(0).Caption = "Bytes Read: " + STR$(Bytes)
  END IF
  'RichEdit.addstrings text
  zzz=Bytes
LOOP UNTIL Socket.Transferred = 0
RichEdit.Clear
RichEdit.Text = text
Socket.Close(Sock)
dim mem as qmemorystream
dim S$ as string
S$ = text
for n=0 to 400
  buff$=mid$(S$,n,5)
  if buff$="alive" then ' found end of headers
    richedit1.addstrings (buff$)
    richedit1.addstrings (mid$(S$,n,9))
    richedit1.addstrings str$(n+9)
    zzz=n+8 ' offset + 8 bit space after headers and before Bof
  end if
next n
Mem.WriteStr(S$, LEN(S$)) ' write entire file to memory
Mem.Position = zzz ' use offset as Start position
S$ = Mem.ReadStr(LEN(S$)) ' read rest of file into string till Eof
Mem.Close ' dont forget to close
'PRINT S$ '' print it
Filex.Open("c:/CAP.AVI", fmCreate) ' create file on system
filex.WriteBinStr(S$,len(S$)-zzz) ' write to it
filex.close ' dont forget to close

Related

Why might mutt email be accepted/rejected by windows recipient as a function of alphabetic string content in the body of html file being sent?

Calling mutt-1.5.24 on linux.
I'm seeing some very odd behavior when emailing an html file from linux to windows/outlook using mutt on linux. Example of the mutt call...
mutt -e 'set content_type=text/html' -s 'yuk, yuk, yuk' 'moe.howard#stooge.com' < a.html
The email does not show up on the windows side. mutt returned no error or warning on the linux side. Now, here's the odd part... If I global/replace the string "pcie" in the body of the html to "pcix", the email appears on the windows/outlook side just fine. OR... if I global/replace "ity" to "..." it also works fine (even if I leave "pcie" alone). But changing "ity" to "xxx" fails. Very odd character sensitivity behavior like this.
In my home dir on the linux side I see a file ~/sent getting created. The header (whether the email made it to the windows/outlook side or not) looks like...
From m.howard#theserver.stooge.com Thu Jan 28 18:49:29 2021
Date: Thu, 28 Jan 2021 18:49:29 -0500
From: Moe Howard <mhoward#theserver.stooge.com>
To: moe.howard#stooge.com
Subject: yuk, yuk, yuk
Message-ID: <20210128234929.GA48266#atletx7-reg062.amd.com>
MIME-Version: 1.0
Content-Type: text/html; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.24 (2015-08-30)
Status: RO
Content-Length: 20537
Lines: 122
<html>
....etc... the rest of the html which firefox reads just fine if I get rid of the header above
I'm grasping at straws here. Looking at the "charset=us-ascii" in the "sent" file, I thought it should perhaps be something else, so I tried providing other options by adding "-e 'set assumed_charset=utf-8:us-ascii'" to the command. No luck.
Any ideas what might be happening and what a solution might be?

Figured it out. All my email actually arrived in Outlook. It's just that it got sent to the junk folder, labeled as spam. So if the body of the html contains "pcie", it's spam. But "pcix" is not. Got to go undo that now.

groovyx.net.http.ResponseParseException while invoking web service from Spock groovy. Works fine from Postman or other clients

I want to consume a webservice and add assertions to its response. I get the following exception:
groovyx.net.http.ResponseParseException:
at groovyx.net.http.HTTPBuilder$1.handleResponse(HTTPBuilder.java:495)
......
at Subscription.Order Products(Subscription.groovy:14)
Caused by: groovy.json.JsonException: Unable to determine the current character, it is not a string, number, array, or object
The current character read is 'I' with an int value of 73
Unable to determine the current character, it is not a string, number, array, or object
line number 1
index number 0
Invalid UTF-8 middle byte 0x52
My operation is as follows:
def setupSpec() {
    client = new RESTClient("http://tsi-services-dev2.canaldigital.com:9080/test/webgw-dealer/v1/")
    client.handler.failure = { resp, data -> return resp }
}
// orderProductPayload is a variable that holds the input payload
def "Order Products"() {
    when:
    def resp = client.post(path: "order/orderProduct", requestContentType: JSON, contentType: JSON, body: orderProductPayload) as HttpResponseDecorator
    then:
    println("response: " + resp.data)
    resp.status == successResponseStatus
}
I have other tests like this that work fine, and this particular test works fine from Postman with the same input payload.
Here are the response headers for the operation that fails in Spock/Groovy:
Content-Length →159
Content-Type →application/json
Date →Wed, 05 Apr 2017 06:19:16 GMT
X-Correlation-ID → xxxxx
Response headers for another operation that works:
Connection →close
Content-Length →273
Content-Type →application/json
Date →Wed, 05 Apr 2017 06:16:41 GMT
X-Correlation-ID → xxxxx

Thanks a lot for the help. It turns out there was a special character such as Ø in my request that the service didn't like, which is why it responded in such an absurd way. It works fine now.

Jmeter-Regular expression extractor

My JMeter response returns 'Location' in the response headers. I want to extract this Location header and use it in my other requests.
Sample Start: 2015-07-24 14:46:38 CEST
Load time: 163
Latency: 163
Size in bytes: 372
Headers size in bytes: 350
Body size in bytes: 22
Sample Count: 1
Error Count: 1
Response code: 201
Response message: Processed
Response headers:
HTTP/1.1 201 Processed
X-Backside-Transport: OK OK,FAIL FAIL
Connection: Keep-Alive
Transfer-Encoding: chunked
****Location: /retail/iows/ie/en/storage/servicedocs/paxplanner/2015-07-24/eCommerce.pdf****
X-Client-IP: 127.0.0.1,10.62.26.150
Content-Type: application/octet-stream
Date: Fri, 24 Jul 2015 12:46:38 GMT
X-Archived-Client-IP: 127.0.0.1
Steps I followed:
I used a Regular Expression Extractor.
I enabled the response headers radio button and used the whole Location header.
Please help me sort it out.

If you want to retrieve the value of the Location field from the response, you might want to try the following pattern: Location:[ \t]*([^\r\n]+). The first capturing group will contain the value of the Location field.
The expression is based on the following rules:
HTTP header fields are colon (":") separated <key, value> pairs.
HTTP header fields are terminated by the EOL character combination (CR followed by LF).
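Outside JMeter, a pattern of this kind can be checked with a few lines of C++ and std::regex. This is only a sketch: extract_location is a hypothetical helper, and the pattern is anchored to the start of a line and made case-insensitive, which are assumptions of this example and not something the JMeter extractor does for you.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the value of the first Location header in `headers`,
// or an empty string if there is none.
std::string extract_location(const std::string& headers) {
    // (?:^|\n) keeps "Content-Location:" from matching; the capture group
    // stops at the first CR or LF, i.e. at the end of the header line.
    static const std::regex pattern(
        "(?:^|\\n)Location:[ \\t]*([^\\r\\n]+)", std::regex::icase);
    std::smatch match;
    if (std::regex_search(headers, match, pattern)) {
        return match[1].str();
    }
    return std::string();
}

// Usage sketch with a shortened, made-up header block.
void extract_location_example() {
    const std::string headers =
        "HTTP/1.1 201 Processed\r\n"
        "Location: /some/path/file.pdf\r\n"
        "Content-Type: application/octet-stream\r\n";
    assert(extract_location(headers) == "/some/path/file.pdf");
}
```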

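Please try this:
Location:([\s\S]*)X-Client
Note that this depends on the X-Client-IP header following Location, as in your sample, and that the captured group also includes the line break before X-Client. If it doesn't work, try putting a \ before the - in X-Client (escaping the hyphen).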

Regex: Matching,parsing an FTP response to a request

Here's what I'm trying to do:
I want to have some FTP functionality in one of my apps (this is just for myself, not a business application or the like), and since I didn't want to write all that FTP request/response code myself, I (being the lazy man I am) searched the internet for an FTP wrapper.
I found this DLL.
It all works like a charm, except for one thing: when I request the LastWriteTime of a specific file on the FTP server, the DLL gives me strange (fictional) dates. I've been able to find the problem. Whenever you send a request to the FTP server, it sends back a one-line response in a very particular format. From what I've been able to gather, this format differs between servers. My wrapper DLL comes with 6 predefined response formats, but my FTP server sends back a 7th one. Here's a response to a request, followed by the regex formats:
-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file
Here are my regex parsing formats:
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{4})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\d+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)", _
"(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})(\s+)(?<size>(\d+))(\s+)(?<ctbit>(\w+\s\w+))(\s+)(?<size2>(\d+))\s+(?<timestamp>\w+\s+\d+\s+\d{2}:\d{2})\s+(?<name>.+)", _
"(?<timestamp>\d{2}\-\d{2}\-\d{2}\s+\d{2}:\d{2}[Aa|Pp][mM])\s+(?<dir>\<\w+\>){0,1}(?<size>\d+){0,1}\s+(?<name>.+)"
None of these seems to be able to parse the datetime correctly, and since I have no idea how to do that, could a regex pro please write me a ParsingFormat that can parse the above FTP response?

Both a hand-check and an irb check of the fourth format show that it does match:
> re=/(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)/
=> /(?<dir>[\-d])(?<permission>([\-r][\-w][\-xs]){3})\s+\d+\s+\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+\d{1,2}:\d{2})\s+(?<name>.+)/
> m=re.match("-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file")
=> #<MatchData "-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file" dir:"-" permission:"rw-r--r--" size:"594" timestamp:"Jun 11 03:44" name:"random_log.file">
> m['dir']
=> "-"
> m['permission']
=> "rw-r--r--"
> m['size']
=> "594"
> m['timestamp']
=> "Jun 11 03:44"
> m['name']
=> "random_log.file"
>
I think the pile of regular expressions is fine. Perhaps you need to look elsewhere for the problem.
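For what it's worth, the same check can be sketched in C++ with std::regex. Its default ECMAScript grammar has no named groups, so this version uses numbered groups instead; FtpEntry and parse_listing_line are hypothetical names for this example and have nothing to do with the DLL in question.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Fields of one Unix-style listing line (hypothetical struct).
struct FtpEntry {
    bool matched;
    bool is_directory;
    std::string permission;
    std::string size;
    std::string timestamp;
    std::string name;
};

FtpEntry parse_listing_line(const std::string& line) {
    // Same shape as the fourth format, with numbered groups:
    // 1 = type flag, 2 = permissions, 3 = size, 4 = timestamp, 5 = name.
    static const std::regex pattern(
        R"(([-d])((?:[-r][-w][-xs]){3})\s+\d+\s+\w+\s+\w+\s+(\d+)\s+)"
        R"((\w+\s+\d+\s+\d{1,2}:\d{2})\s+(.+))");
    FtpEntry entry = {false, false, "", "", "", ""};
    std::smatch m;
    if (std::regex_search(line, m, pattern)) {
        entry.matched = true;
        entry.is_directory = (m[1].str() == "d");
        entry.permission = m[2].str();
        entry.size = m[3].str();
        entry.timestamp = m[4].str();
        entry.name = m[5].str();
    }
    return entry;
}

// Usage sketch with the listing line from the question.
void parse_listing_line_example() {
    const FtpEntry e = parse_listing_line(
        "-rw-r--r-- 1 user user 594 Jun 11 03:44 random_log.file");
    assert(e.matched);
    assert(e.timestamp == "Jun 11 03:44");
}
```

The timestamp comes back as the plain string "Jun 11 03:44" with no year in it, so whatever turns it into a date has to supply the year itself.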

How can we parse HTTP response header fields using Qt/C++?

I am writing a piece of software that uses Qt/KDE libs. The objective is to parse the HTTP header response fields into different fields of a struct. So far the HTTP header response is contained in a QString.
It looks something like this:
"HTTP/1.1 302 Found
date: Tue, 05 Jun 2012 07:40:16 GMT
server: Apache/2.2.22 (Linux/SUSE)
x-prefix: 49.244.80.0/21
x-as: 23752
x-mirrorbrain-mirror: mirror.averse.net
x-mirrorbrain-realm: region
link: <http://download.services.openoffice.org/files/du.list.meta4>; rel=describedby; type="application/metalink4+xml"
link: <http://download.services.openoffice.org/files/du.list.torrent>; rel=describedby; type="application/x-bittorrent"
link: <http://mirror.averse.net/openoffice/du.list>; rel=duplicate; pri=1; geo=sg
link: <http://ftp.isu.edu.tw/pub/OpenOffice/du.list>; rel=duplicate; pri=2; geo=tw
link: <http://ftp.twaren.net/OpenOffice/du.list>; rel=duplicate; pri=3; geo=tw
link: <http://mirror.yongbok.net/openoffice/du.list>; rel=duplicate; pri=4; geo=kr
link: <http://ftp.kaist.ac.kr/openoffice/du.list>; rel=duplicate; pri=5; geo=kr
digest: MD5=b+zfBEizuD8eXZUTWJ47xg==
digest: SHA=A5zw6PkywlhiPlFfjca+gqIGLHA=
digest: SHA-256=HOrd0MMBzS8Ctljpe4PauwStijsnBKaa3gXO4L30eiA=
location: http://mirror.averse.net/openoffice/du.list
content-length: 329
connection: close
content-type: text/html; charset=iso-8859-1"
In addition to the custom fields, there might be a few more fields in the header response.
The only way I could come up with was to manually search for fields like "link", "digest" and others and create a QMap with the fields as keys. However, I guess there must be a better way to do this. I would be thankful if you could help me.

The HTTP header should initially be in a QByteArray (because it is in ASCII, not UTF-16), but the method would be the same with a QString:
split the header line by line,
split each line at the colon character,
trim any white spaces (regular spaces and '\r' characters) around the 2 resulting strings before storing them.
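#include <QByteArray>
#include <QList>
#include <QMultiMap>
QByteArray httpHeaders = ...;
// A multi-map, because fields such as "link" and "digest" occur several times
QMultiMap<QByteArray, QByteArray> headers;
// Discard the first line (the status line)
httpHeaders = httpHeaders.mid(httpHeaders.indexOf('\n') + 1).trimmed();
const QList<QByteArray> lines = httpHeaders.split('\n');
for (const QByteArray &line : lines) {
    int colon = line.indexOf(':');
    if (colon < 0)
        continue; // not a "name: value" line, skip it
    QByteArray headerName = line.left(colon).trimmed();
    QByteArray headerValue = line.mid(colon + 1).trimmed();
    headers.insert(headerName, headerValue);
}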
