Situation: I am trying to call the LinkedIn API from a ColdFusion CFC to get the user's profile and network (connections). The LinkedIn API states that to do this you must call a URL with scope=r_fullprofile+r_network.
Issue: ColdFusion is automatically encoding the URL, so the plus sign is getting encoded, and LinkedIn is rejecting my call. Is there any way around this? I've posted a link below to some code snippets on github which I believe illustrate the issue.
https://gist.github.com/4535364
Any help would be appreciated!
I have searched around on this for a bit and I am seeing lots of examples where ColdFusion is not playing nicely with the LinkedIn API. So I'm afraid if you do get passed this issue (although I have not come up with an alternative yet) another will crop up. While searching I found several suggestions from people to use the linkedin-j, A Java wrapper for LinkedIn APIs instead. Here are some of the references that I found:
Working example Coldfusion and Linkedin API
LinkedIn-J does not return educations
401 Unauthorized response. API people/~ and people/id=; ColdFusion, cfhttp
Problem updating status - 401 unauthorized - ColdFusion
linkedin-j Getting Started
Side Note Your github code example is making a cfhttp call to 'receiver.cfm' but you called the file 'cfhttp_receiver.cfm'. In this line:
<cfhttp url="http://#cgi.http_host#/sandbox/receiver.cfm?scope=#url.scope#" method="post" resolveurl="no">
The scope field is a space delimited list.
The + character is commonly used as a shortcut for space, since it's more readable than %20 (which is what space encodes to).
If using a plus character results in an encoded plus (%2B) being sent, then you are left with two other ways of putting the space into the URL:
using a literal space character, or
using an encoded space %20
Try both of those options, ideally using a network snifer (e.g. WireShark) so that you can see accurately what is being sent.
Update: As per comments below, %20 is correct, but the signature based string needs to be encoded again, so for that the % becomes %25, giving a result of %2520.
Related
I'm using the facebook open graph api to post to a facebook fan page. More info on the method can be found in the answer here.
When one manually posts on facebook they can use # to link a particular person e.g. #Michael Jackson. This auto populates a link to that persons page and shows up on their timeline. The # itself dissapears once the post has been made, leaving only the hyperlinked text i.e. Michael Jackson.
Programatically compiling a post via the api, including the #, results in the text being posted in plain text. i.e. #Michael Jackson shows as #Michael Jackson.
How can I escape, or otherwise parse the anchor through the api so that Facebook recognises it as a link to another user/page?
Edit: I found this reference which describes these links as Actions, specifically in this case a 'Mentioning friends' action. It goes on to explain the syntax of #[USERID] or #[USERNAME] which is promising. But if I compile this encoded it posts the plain encoded text e.g. %40%5BUSERID%5D, when left un-encoded the post fails.
I've found that starting your message with a mention tag (i.e. #[12345678:User Name] ) throws a CurlException error.
To fix this I changed:
'message' => '#[1234567890:User Name]';
To
'message' => ' #[1234567890:User Name]';
Simply adding a space before the mention seems to fix that problem.
I got a request from a customer that he wants to be able to type the query string of my web service with parameters in the IE10 address bar and get the service results. The parameters include string in Hebrew, like:
http://mywebsite.com/service.asmx/foo?param1=123¶m2=מחרוזתבעברית
It seems to me that that IE10 won't encode the query string parameters - every non-ASCII character that goes after the ? mark would be turned to '3f' byte, though it does encode what goes before the ? mark - the url itself.
For example, if i try to reach the url (the parameter is fictional, url is not, and I have no connection with the site)
http://www.shlomo.co.il/pageshe/sales/רכב-למכירה.asp?param=פאראם
and look in wireshark for the bytes I send to the server, it shows me
You can see it does substitute the hebrew part of the URL with urlencoded string, but substitutes the hebrew parameters with ?????, which are '3f's.
The same string in chrome would be encoded in it's entirety:
GET http://www.shlomo.co.il/pageshe/sales/%D7%A8%D7%9B%D7%91-%D7%9C%D7%9E%D7%9B%D7%99%D7%A8%D7%94.asp?param=%D7%A4%D7%90%D7%A8%D7%90%D7%9D HTTP/1.1
I tried it on machines with win7/IE10 and winXPheb/IE8.
My IE settings are (especially checked the "Always show encoded addresses option" to see if it helps and restarted, but made no difference):
I tried to search around for any info about the issue, but didn't find much of it.
My questions are:
Is it indeed like this, or am I missing something?
Is this behavior documented anywhere?
Are there any settings in IE/Win which enable the parameters encoding.
p.s. Sure if I was developing the client/web ui, I would simply urlencode my query, but my request from customer was exactly to paste the query to IE address bar, that's why I'm interested in this specific behavior.
Thanks.
Yes, your observation of the behavior is accurate. Internet Explorer 10 and below follow a complicated algorithm for encoding the URL. This was allegedly updated in Internet Explorer 11, but I've found that the new option doesn't seem to work.
The "Always show encoded addresses option" concerns whether PunyCode is shown for IDN hostnames, and does not impact the query string. Send UTF-8 URLs mostly applies to the encoding of the path, although it can also affect other codepaths
The behavior isn't fully documented anywhere. I'd meant to write a full post on my IEInternals blog about it but ended up moving on from Microsoft before doing so. There's a partial explanation in this blog post.
Yes, there are settings that impact the behavior. The Send UTF-8 URLs checkbox inside Tools > Internet Options > Advanced is one of the variables that determines how URLs are sent, but the option does not blindly do what it implies (it only UTF-8 encodes the path, not the query string). Other variables involved include:
Where the URL was typed (e.g. address bar vs. Start > Run, etc)
What the system's ANSI codepage is (e.g. what locale the OS uses as default)
The charset of the currently loaded page in the browser
As a consequence of these variables, you cannot reliably use URLs which are not properly encoded (e.g. %-escaped UTF8) in Internet Explorer.
Unfortunately this is still true for Internet Explorer 11 (build 11.0.9600.17358, win7-x64)
I saw that you can not unfortunately change the web server. However those who are developing new services may consider changing request parameters into path variables, e.g. from http://myserver.com/page?τεστ into http://myserver.com/τεστ/
If the client is calling the web-service from javascript,
encodeuricomponent can be used. In your case encodeuricomponent("מחרוזתבעברית");
http://www.w3schools.com/jsref/jsref_encodeURIComponent.asp
I am trying to create special characters '•' (alt+7 with numlock on) on a Facebook api feed post using the PHP SDK. - Well, it doesn't work, there is an issue with the character's encoding. Also tried calling: utf8_encode('•') and that also didn't work.
When typing ' <3 ' Facebook parses it automatically and creates an heart sign (♥).
Any idea how should I encode that, or maybe there is a special magic charset like '<3' that creates a bullet?
I'm using lib-cURL as a HTTP client to retrieve various pages (can be any URL for that matter).
Usually the data comes as a UTF-8 string and then I just call "MultiByteToWideChar" and it works well.
However, some web-pages still use code-page encoding and I see gibberish if i try to convert those pages to UTF-8.
Is there an easy way to retrieve the code page from the data? or I'll have to scan it manually (for "encoding=") and then translate it accordingly.
If so, how do i get the code-page id from name (Code Page Identifiers)?
Thanks,
Omer
There are several location where a document can state its encoding:
the Content-Type HTTP header
the (optional) XML declaration
the Content-Type meta tag inside the document header
for HTML5 documents the charset meta tag.
There are probably even more I've forgotten.
In the end, detecting the actual encoding is rather hard. You really shouldn't do this yourself but use high-level libraries for retrieving and parsing HTML content. I'm sure they are available even for C++, even if they have to be thiefed from the a browser environment. :)
I used DetectInputCodepage in IMultiLanguage2 interface and it worked great !
I have been working with Coldfusion 9 lately (background in PHP primarily) and I am scratching my head trying to figure out how to 'clean/sanitize' input / string that is user submitted.
I want to make it HTMLSAFE, eliminate any javascript, or SQL query injection, the usual.
I am hoping I've overlooked some kind of function that already comes with CF9.
Can someone point me in the proper direction?
Well, for SQL injection, you want to use CFQUERYPARAM.
As for sanitizing the input for XSS and the like, you can use the ScriptProtect attribute in CFAPPLICATION, though I've heard that doesn't work flawlessly. You could look at Portcullis or similar 3rd-party CFCs for better script protection if you prefer.
This an addition to Kyle's suggestions not an alternative answer, but the comments panel is a bit rubbish for links.
Take a look a the ColdFusion string functions. You've got HTMLCodeFormat, HTMLEditFormat, JSStringFormat and URLEncodedFormat. All of which can help you with working with content posted from a form.
You can also try to use the regex functions to remove HTML tags, but its never a precise science. This ColdFusion based regex/html question should help there a bit.
You can also try to protect yourself from bots and known spammers using something like cfformprotect, which integrates Project Honeypot and Akismet protection amongst other tools into your forms.
You've got several options:
"Global Script Protection" Administrator setting, which applies a regular expression against post and get (i.e. FORM and URL) variables to strip out <script/>, <img/> and several other tags
Use isValid() to validate variables' data types (see my in depth answer on this one).
<cfqueryparam/>, which serves to create SQL bind parameters and validate the datatype passed to it.
That noted, if you are really trying to sanitize HTML, use Java, which ColdFusion can access natively. In particular use the OWASP AntiSamy Project, which takes an HTML fragment and whitelists what values can be part of it. This is the same approach that sites like SO and slashdot.org use to protect submissions and is a more secure approach to accepting markup content.
Sanitation of strings in coldfusion and in quite any language is very important and depends on what you want to do with the string. most mitigations are for
saving content to database (e.g. <cfqueryparam ...>)
using content to show on next page (e.g. put url-parameter in link or show url-parameter in text)
saving files and using upload filenames and content
There is always a risk if you follow the idea to prevent and reduce a string by allow basically everything in the first step and then sanitize malicious code "away" by deleting or replacing characters (blacklist approach).
The better solution is to replace strings with rereplace(...) agains regular expressions that explicitly allow only the characters needed for the scenario you use it as an easy solution, whenever this is possible. use cases are inputs for numbers, lists, email-addresses, urls, names, zip, cities, etc.
For example if you want to ask for a email-address, you could use
<cfif reFindNoCase("^[A-Z0-9._%+-]+#[A-Z0-9.-]+\.(?:[A-Z]{5})$", stringtosanitize)>...ok, clean...<cfelse>...not ok...</cfif>
(or an own regex).
For HTML-Imput or CSS-Imput I would also recommend OWASP Java HTML Sanitizer Project.