HTTP Digest authentication and UTF-8 symbols in request headers - C++

I'm trying to implement HTTP Digest authentication in a server based on cpp-netlib, and I'm not sure how to handle the fact that the username attribute in the Authorization header can contain Unicode symbols; the Digest authentication RFC is not specific on this. In practice, Chrome for example simply sends the username UTF-8 encoded. That would be fine, except that cpp-netlib parses the incoming stream and checks whether the header contents are alphanumeric using Boost and std::isalnum and friends, which of course triggers assertions and the like. (On Linux I could just set the current locale to UTF-8, but I'm on Windows.) So I'm asking for a general opinion, based on the facts given:
1) Should I just drop this (I'm really close to doing so) and use a customized POST/GET for authentication?
2) Can I somehow customize Boost's behavior (the functions that verify alphanumeric values come from boost\algorithm\string\classification) to handle this?
3) Are such issues perhaps handled in POCO or other web server frameworks that could serve as replacements in this situation?

Related

Implementing Telegram bot webhooks in ColdFusion

I am developing an application in ColdFusion (CFML) to create generic, stateful, bots to be run on the Telegram messaging platform. I've found so far plenty of examples in PHP, some in other languages (Ruby,...), none in CFML. So, here I am.
The "getUpdates" (i.e., polling) way runs like a breeze, but it's not feasible polling the Telegram server for new updates at a rate decent for interactive use (some 30 sec). So, I've turned to Webhooks.
I will skip over the webhook setup for a self-signed certificate; it's out of scope here, but I'm happy to explain how I overcame that issue.
My problem is: how do I decode the posts received from the Telegram server when an update occurs?
What my application server (ColdFusion + Tomcat + Apache2) gets from Telegram is an HTTP request with a header like this:
struct
accept-encoding: gzip, deflate
connection: keep-alive
content-length: 344
content-type: application/json
host: demo.bigopen.eu
and a content section like this:
binary
1233411711210097116101951..... (*cut*)
Please note that the data section (ASCII) contains only decimal digits, not hex. I've been struggling to decode it; what I want is a JSON representation of a single message.
I've tried the CFML tools I have, such as BinaryDecode(), CharsetEncode(), Java GZip libraries, etc., with no success so far. I was expecting some serialized JSON in the reply, but it's encoded in a way I cannot decode. I've found no hint in the literature, since only calls to language-specific libraries (such as file_get_contents for PHP) are shown.
I don't expect to be given the actual CFML code, but I hope to learn what kind of encoding the Telegram side performs.
After some effort I was able to solve this issue. The encoding is handled by ColdFusion itself. The data that Telegram sends in a webhook update is binary, and CF treats it as a ByteArray (actually, it is declared as "Array" but is not directly addressable). Nonetheless, applying the ToString() function to it returns a fully valid string.
So, the first thing to do is:
<cfset reply = DeserializeJSON(ToString(StructFind(GetHttpRequestData(), "content"))) >
BTW, StructFind() just extracts the "content" section from the structure returned by GetHttpRequestData().
After that, reply is a structure holding what is needed, such as:
<cfset message_id = reply.message.message_id />
<cfset message_text = reply.message.text />
and so on.
Hoping that it may be useful to anyone.
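For readers outside ColdFusion: the digits in the dump above appear to be the decimal values of the raw body bytes printed back to back, since 123, 34, 117, 112, 100, 97, 116, 101, 95 are the ASCII codes for {"update_. A minimal Python sketch of the same decode step follows, assuming the body is UTF-8 encoded JSON. The function name decode_update and the sample payload are made up for illustration and are not real Telegram output.

```python
import json

def decode_update(body: bytes) -> dict:
    """Decode a raw webhook body (UTF-8 bytes) into a dict."""
    return json.loads(body.decode("utf-8"))

# The first byte values shown in the question's dump, read as decimal numbers.
prefix = bytes([123, 34, 117, 112, 100, 97, 116, 101, 95])
print(prefix.decode("ascii"))  # {"update_

# Hypothetical payload, invented for illustration only.
sample = b'{"update_id": 1, "message": {"message_id": 7, "text": "hello"}}'
reply = decode_update(sample)
print(reply["message"]["message_id"], reply["message"]["text"])
```

This mirrors what ToString() followed by DeserializeJSON() does in the CFML line above: bytes to text first, then text to a structure.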

Addon to modify http headers specified via regex of content

I need a Firefox addon that would enable me to modify outbound http request headers thusly:
"[pseudocode] if any request header contains x in its content then replace x with y"
where x is a regular expression and y preferably can contain substitution patterns referencing x.
I've looked at the addons Tamper Data, Modify Headers and Header Tool, and none of them appears to support the above (am I wrong?). Header Tool has some regex capability, but apparently not as specified above. Would Greasemonkey or the like enable this? The only complication in my case is that the HTTP request is actually sent by an .swf (i.e. Flash), though it is still displayed by, say, Tamper Data.
(Note: if you think this question doesn't belong on Stack Overflow, please point me to the right Stack Exchange site, though who other than programmers would be messing with regular expressions? This also isn't something to google, as the first thing it returns is Header Tool, which doesn't do what I want.)
It should be fairly trivial to write such an addon yourself if you don't need any GUI.
The SDK has a module (system-events) that requires only a few lines of code to hook into any and all HTTP requests or responses
Burp is not a Firefox addon but a Java app; even so, the free version of it easily does what I described.
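The rule from the question ("if any request header contains x, replace x with y", with y allowed to reference groups of x) can be sketched independently of any browser API. The Python below is only an illustration of the rule itself, not addon code; the function name rewrite_headers, the sample headers and the pattern are all hypothetical.

```python
import re

def rewrite_headers(headers: dict, pattern: str, replacement: str) -> dict:
    """Return a copy of headers with every match of pattern in a value replaced."""
    rx = re.compile(pattern)
    return {name: rx.sub(replacement, value) for name, value in headers.items()}

# Hypothetical outbound headers and rule, for illustration only.
outbound = {
    "Host": "example.com",
    "Referer": "http://old.example.com/page",
}
# \1 in the replacement refers to the first capture group of the pattern.
rewritten = rewrite_headers(outbound, r"http://old\.(\w+)\.com", r"https://new.\1.org")
print(rewritten["Referer"])  # https://new.example.org/page
```

An addon or proxy would apply the same substitution inside whatever hook it gets for outgoing requests; headers that do not match are passed through unchanged.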

Language agnostic cookie encoding / decoding standards

I'm having difficulty figuring out what the standard is (or whether there is one) for encoding/decoding cookie values regardless of the backend platform.
According to RFC 2109:
The VALUE is opaque to the user agent and may be anything the origin server chooses to send, possibly in a server-selected printable ASCII encoding. "Opaque" implies that the content is of interest and relevance only to the origin server. The content may, in fact, be readable by anyone that examines the Set-Cookie header.
which sounds like "the server is the boss" and decides which encoding applies. This makes it quite difficult to set a cookie from, say, a PHP backend and read it from Python or Java or whatever, without writing manual encode/decode handling on both sides.
Let's say we have a value that needs to be encoded: the Russian /"печенье (*} значения"/, which means "cookie value", with some additional non-alphanumeric characters in it.
Python:
Almost every WSGI server does the same and uses Python's SimpleCookie class, which encodes to / decodes from octal literals, even though many say that octal literals are deprecated in ECMA-262 strict mode. Wtf?
So, our raw cookie value becomes "/\"\320\277\320\265\321\207\320\265\320\275\321\214\320\265 (*} \320\267\320\275\320\260\321\207\320\265\320\275\320\270\321\217\"/"
Node.js:
Haven't tested at all but I'm just guessing a JavaScript backend would do it with native encodeURIComponent and decodeURIComponent functions that use hexadecimal escaping / unescaping?
PHP:
PHP applies urlencode to cookie values, which is similar to encodeURIComponent but not exactly the same.
So the raw value becomes %2F%22%D0%BF%D0%B5%D1%87%D0%B5%D0%BD%D1%8C%D0%B5+%28%2A%7D+%D0%B7%D0%BD%D0%B0%D1%87%D0%B5%D0%BD%D0%B8%D1%8F%22%2F, which is not even wrapped in double quotes.
However, if the JavaScript variable value holds the PHP-encoded value above, decodeURIComponent(value) gives /"печенье+(*}+значения"/; note the "+" chars instead of spaces.
What is the situation in Java, Ruby, Perl and .NET? Which language follows (or comes closest to) the desired behaviour? And is there any standard for this defined by the W3C?
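The "+" versus space mismatch described above can be reproduced in Python, whose urllib.parse module offers both flavours of decoding. This is only a sketch of the mismatch for this one sample value; it makes no claim that quote_plus and PHP's urlencode agree on every input.

```python
from urllib.parse import quote_plus, unquote, unquote_plus

value = '/"печенье (*} значения"/'

# Form-style encoding: spaces become "+", matching the PHP output quoted above.
encoded = quote_plus(value)
print(encoded)

# Plain percent-decoding (the decodeURIComponent behaviour) leaves "+" alone.
print(unquote(encoded))  # /"печенье+(*}+значения"/

# Form-style decoding turns "+" back into a space before percent-decoding.
print(unquote_plus(encoded) == value)  # True
```

So a consumer has to know which of the two schemes the producer used; percent-decoding alone is not enough for values written with the form-style variant.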
I think you've got things a bit mixed up here. The server's encoding does not matter to the client, and it shouldn't. That is what RFC 2109 is trying to say here.
The concept of cookies in http is similar to this in real life: Upon paying the entrance fee to a club you get an ink stamp on your wrist. This allows you to leave and reenter the club without paying again. All you have to do is show your wrist to the bouncer. In this real life example, you don't care what it looks like, it might even be invisible in normal light - all that is important is that the bouncer recognises the thing. If you were to wash it off, you'll lose the privilege of reentering the club without paying again.
In HTTP the same thing is happening. The server sets a cookie with the browser. When the browser comes back to the server (read: the next HTTP request), it shows the cookie to the server. The server recognises the cookie, and acts accordingly. Such a cookie could be something as simple as a "WasHereBefore" marker. Again, it's not important that the browser understands what it is. If you delete your cookie, the server will just act as if it has never seen you before, just like the bouncer in that club would if you washed off that ink stamp.
Today, a lot of cookies store just one important piece of information: a session identifier. Everything else is stored server-side and associated with that session identifier. The advantage of this system is that the actual data never leaves the server and as such can be trusted. Everything that is stored client-side can be tampered with and shouldn't be trusted.
Edit: After reading your comment and reading your question yet again, I think I finally understood your situation, and why you're interested in the cookie's actual encoding rather than just leaving it to your programming language: If you have two different software environments on the same server (e.g.: Perl and PHP), you may want to decode a cookie that was set by the other language. In the above example, PHP has to decode the Perl cookie or vice versa.
There is no standard in how data is stored in a cookie. The standard only says that a browser will send the cookie back exactly as it was received. The encoding scheme used is whatever your programming language sees fit.
Going back to the real life example, you now have two bouncers, one speaking English, the other speaking Russian. The two will have to agree on one type of ink stamp. More likely than not this will involve at least one of them learning the other's language.
Since the browser behaviour is standardized, you can either imitate one language's encoding scheme in all other languages used on your server, or simply create your own standardized encoding scheme in all languages being used. You may have to use lower level routines, such as PHP's header(), instead of higher level routines, such as session_start(), to achieve this.
BTW: In the same manner, it is the server side programming language that decides how to store server side session data. You cannot access Perl's CGI::Session by using PHP's $_SESSION array.
Regardless of the cookie being opaque to the client, it still needs to conform to the HTTP spec. RFC 2616 restricts header text to ISO-8859-1 (a superset of ASCII) unless it is encoded according to RFC 2047. RFC 5987 adds a way to carry other character sets in header field parameters, but I don't know how widely supported it is.
I prefer to encode into UTF8 and wrap with base64 encoding. It's fast, ubiquitous, and will never mangle your data at either end.
You will need to ensure an explicit conversion into UTF8 even when wrapping it. Other languages & runtimes, while supporting Unicode, may not store strings as UTF8 internally... like many Windows APIs. Python 2.x, in my experience, rarely gets Unicode strings right without explicit conversion.
ENCODE: nativeString -> utfEncode() -> base64Encode()
DECODE: base64Decode() -> utfDecode() -> nativeString
Almost every language I know of, these days, supports this. You can look for a universal single-function encode, but I err on the side of caution and choose the two-step approach... especially with foreign character sets.
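The two-step ENCODE/DECODE chain above could look like this in Python. It is a minimal sketch: the function names are made up for illustration, and it uses the standard Base64 alphabet, so the result may contain "+", "/" and "=".

```python
import base64

def encode_cookie_value(text: str) -> str:
    """nativeString -> UTF-8 bytes -> Base64 (ASCII-only result)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def decode_cookie_value(encoded: str) -> str:
    """Base64 -> UTF-8 bytes -> nativeString."""
    return base64.b64decode(encoded).decode("utf-8")

value = '/"печенье (*} значения"/'
wire = encode_cookie_value(value)
print(wire.isascii())                      # True
print(decode_cookie_value(wire) == value)  # True
```

The explicit .encode("utf-8") is the "explicit conversion" step mentioned above: it fixes the byte representation before Base64 ever sees it, regardless of how the runtime stores strings internally.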

WSDL possible to transfer a FILE type?

A "checkResult" service deployed on a node machine is defined to return the node's result to the cluster controller that sent the request. The result on the node, which takes the form of a file, may vary drastically in length, as is often the case with daily log files.
At first, I thought it might be OK to use a single string to pack the whole content of the file, so I defined
checkResult(inType *in,OutType *out)
where OutType* is char*. Then I realized that the string could be kilobytes long or even more, so I wonder whether it is proper to use a string here.
I googled a lot and could not find the maximum length permitted in WSDL (it may also conflict with the local maximum buffer length), nor any information about transferring a file-type parameter.
Using a struct type may be suggested, but it could be deeply nested for the file and difficult to parse when some of the elements inside may be nil or absent.
What would you do when you need to return a file-type result or a large amount of data in a web service?
P.S. The server and client are both in C.
When transferring a large amount of data in a (SOAP) web service request or response, it is generally better practice to use an attachment mechanism rather than including the data as part of the body. The attachment mechanisms to consider, probably in this order (broadest to narrowest adoption), are:
Message Transmission Optimization Mechanism (MTOM) - The newest of these specifications (http://www.w3.org/TR/soap12-mtom/) which is supported in many of the mainstream languages.
SOAP with Attachments - This specification (http://www.w3.org/TR/SOAP-attachments) has been around for many years and is supported in several languages but notably not by Microsoft.
Direct Internet Message Encapsulation (DIME) - This specification (http://bgp.potaroo.net/ietf/all-ids/draft-nielsen-dime-02.txt) was pushed by Microsoft and support has been provided in multiple languages/frameworks including Java and .NET.
Ideally, you would be able to work with a framework to give you code stub generation directly from a WSDL indicating MTOM-based web service.
The critical parts of such a WSDL document include:
MTOM policy declaration
Policy application in the binding
Placeholder for the reference to the attachment in the types (schema) section
If you are working contract-first and have a WSDL in hand, the example in section 1.2 of this site (http://www.w3.org/Submission/WS-MTOMPolicy/) shows the simple additions to be made to declare and apply the MTOM policy. Appendix I of the same site shows an example of a schema element which allows a web service client or server to identify a reference to the MTOM attachment.
I have not implemented a web service or client in C, but a brief scan of recently-updated packages revealed gSOAP (http://www.cs.fsu.edu/~engelen/soap.html) as a possibility for helping in your endeavors.
Give those documents a look and see if they help to advance your project.

BOM not expected in CF but sent by IIS/SharePoint

I'm trying to consume a SharePoint webservice from ColdFusion via cfinvoke ('cause I don't want to deal with (read: parse) the SOAP response itself).
The SOAP response includes a byte-order-mark character (BOM), which produces the following exception in CF:
"Cannot perform web service invocation GetList.
The fault returned when invoking the web service operation is:
'AxisFault
faultCode: {http://www.w3.org/2003/05/soap-envelope}Server.userException
faultSubcode:
faultString: org.xml.sax.SAXParseException: Content is not allowed in prolog."
The standard for UTF-8 encoding optionally includes the BOM character (http://unicode.org/faq/utf_bom.html#29). Microsoft almost universally includes the BOM character with UTF-8 encoded streams. From what I can tell there’s no way to change that in IIS. The XML parser that JRun (ColdFusion) uses by default doesn’t handle the BOM character for UTF-8 encoded XML streams. So, it appears that the way to fix this is to change the XML parser used by JRun (http://www.bpurcell.org/blog/index.cfm?mode=entry&entry=942).
Adobe says that it doesn't handle the BOM character (see comments from anoynomous and halL on May 2nd and 5th).
http://livedocs.adobe.com/coldfusion/8/htmldocs/Tags_g-h_09.html#comments
I'm going to say that the answer to your question (is it possible?) is no, at least not with cfinvoke. I don't know that definitively, but the poster who commented just above halL (in the comments on this page) gave a work-around for the problem -- so I assume it is possible to deal with it when parsing manually.
You say that you're using CFInvoke because you don't want to deal with the soap response yourself. It looks like you don't have any choice.
As Adam Tuttle already said, the workaround is on the page that you linked to:
<!--- Remove BOM from the start of the string, if it exists --->
<cfif Left(responseText, 1) EQ chr(65279)>
<cfset responseText = mid(responseText, 2, len(responseText) - 1)>
</cfif>
It sounds like ColdFusion is using Apache Axis under the covers.
This doesn't apply exactly to your solution, but I've had to deal with this issue once before when consuming a .NET web service with Apache Axis/Java. The only solution I was able to find (since the owner of the web service was unwilling to change anything on his end) was to write a Handler class that Axis would plug into the pipeline which would delete the BOM from the message if it existed.
So perhaps it's possible to configure Axis through ColdFusion? If so you can add additional Handlers to the message handling flow.
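Both workarounds above come down to dropping the BOM before the XML parser sees the message. As a language-neutral illustration, here is a minimal Python sketch of the same check done at the byte level (EF BB BF is the UTF-8 encoding of the character chr(65279) tested in the CFML snippet). The function name strip_utf8_bom and the sample response are hypothetical; this is not ColdFusion or Axis code.

```python
import codecs
import xml.etree.ElementTree as ET

def strip_utf8_bom(raw: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present; otherwise return raw unchanged."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw

# Hypothetical response body with a BOM in front of the XML prolog.
response = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="utf-8"?><root><item>ok</item></root>'
root = ET.fromstring(strip_utf8_bom(response))
print(root.find("item").text)  # ok
```

An Axis Handler as described above would perform the equivalent step on the incoming message before it reaches the SAX parser.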