facebook like can't decode %3F to question mark - facebook-like

I'm using the Like button with an href that contains a query string, like this:
"http://www.test.com/index.php?param1=one&param2=two"
When this URL goes into the href, I of course encode it like this:
"http%3A%2F%2Fwww.test.com%2Findex.php%3Fparam1%3Done%26param2%3Dtwo"
(this is actually what the Like button generator gave me)
But something strange happens. When I check it in the Graph API Explorer, it has been changed to:
"http://www.test.com/index.php%3Fparam1=one&param2=two"
So only the "?" (question mark) has not been decoded. Is this a current Facebook bug? (It worked 5 days ago.) I don't want to resort to Apache .htaccess rewrites just to avoid the question mark.
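For anyone who wants to double-check the encoding step independently of Facebook, here is a minimal sketch using Python's standard library. It only confirms that the encoded string in the question is a correct percent-encoding of the original URL and that a full decode restores the "?"; it says nothing about what Facebook does internally.

```python
from urllib.parse import quote, unquote

original = "http://www.test.com/index.php?param1=one&param2=two"

# safe="" forces ':', '/', '?', '=' and '&' to be percent-encoded as well
encoded = quote(original, safe="")
print(encoded)

# A complete decode turns %3F back into '?'
decoded = unquote(encoded)
print(decoded)
```

If the round trip works here, the encoded href itself is fine and the partial decode is happening on the receiving side.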

Queries on "Get Regex matches" in Robocorp

I have a form in MS Word which the user fills in and emails to me. I have to open the form, capture all the details entered by the user, and use them to submit a form in my portal.
I am trying to create a robot in Robocorp to automate this process. Using "Get All texts" from the RPA Word library, I log the contents of the Word document in Robocorp and then try to pull out the required data with regex, but I need some help with the regex itself.
Please find the raw text logged in Robocorp below:
Source Text
Query 1:
I need to extract the Manager name.
In Regex101, I get the name returned as expected when using [^Manager\n].*
In Robocorp, when I use 'Get regex matches' with [^Manager\n].*, I get the entire content of the text file.
Please help me with the regex to use in Robocorp to extract the Manager name.
Query 2:
I need to extract the answers provided by the user for the questions in the above form. (Note: The answers change with every form submitted)
I tried the following.
For example, I first pulled one question from the above form using Get Regex matches (?s)(Lunch).*?(No).
I got the below value returned in Robocorp:
['Lunch account required? \x07☒ Yes\r☐ No']
Then, from this returned value (using it as the string), I tried to get the answer selected by the user with
Get Regex matches (?<=☒)\s\w+
But I get the error "TypeError: expected string or bytes-like object".
I'm not sure whether this flow is right, or whether I can get the answers selected by the user for all questions in a different way.
Sorry if my questions are simple. I am totally new to regex and still learning.
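A minimal Python sketch of both queries, since Robocorp keywords run on Python's re module. Two things to note: [^Manager\n] is a negated character class (any single character that is not M, a, n, g, e, r or a newline), not "text after the word Manager"; and the TypeError is most likely because the first keyword returns a list, which was then passed on as if it were a string. The "Employee name" and "Manager" lines below are hypothetical stand-ins for the real form text, which isn't shown here; only the "Lunch" fragment comes from the logged output above.

```python
import re

# Hypothetical form text; only the "Lunch ..." part is taken from the question.
text = "Employee name: Jane Doe\rManager: John Smith\rLunch account required? \x07☒ Yes\r☐ No"

# Query 1: capture whatever follows "Manager" up to the end of that line.
manager_match = re.search(r"Manager\s*:?\s*([^\r\n\x07]+)", text)
manager = manager_match.group(1) if manager_match else None

# Query 2, step 1: this returns a LIST of matches, not a string.
question_matches = re.findall(r"(?s)Lunch.*?No", text)

# Query 2, step 2: take the string out of the list before matching again.
first_question = question_matches[0]
answers = re.findall(r"(?<=☒)\s*(\w+)", first_question)

print(manager)
print(answers)
```

In Robot Framework terms, the equivalent of `question_matches[0]` would be indexing the returned list (for example `${matches}[0]`) before handing it to the second keyword.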

How to convert MS Word Smart Quotes and em-dashes to simple quotes and dashes in Ckeditor 4

Hi, I really like the new CKEditor 4 Advanced Content Filtering along with the pastefromword plugin. I have read the docs on which HTML tags to allow, and I understand why it kindly converts my client's MS Word crap into HTML entities. However, I'd like to intervene a little and convert the smart quotes to straight quotes, and all em dashes to plain dashes and not allow - before the text gets sent to the CMS database. But I can't find any docs or examples on this.
I can see there were many questions about this on the old CKEditor forum (http://ckeditor.com/forums/CKEditor-3.x/Replacing-smart-quotes-regular-quotes, http://ckeditor.com/forums/CKEditor-3.x/Problem-copyingpasting-MS-Word), but they didn't get answered.
I'm also hoping the ckeditor team reads these forums as this is where they suggest we post questions now.
CKEditor dev here.
If you want the Paste From Word plugin to do this, you could add a rule in the plugin that replaces the contents of text nodes.
To achieve this, add a property named 'text' somewhere around here (on the same level as the 'comment' property):
https://github.com/ckeditor/ckeditor-dev/blob/master/plugins/pastefromword/filter/default.js#L1106
It should be a function that accepts one parameter - the text node content, e.g.:
text: function( content ) {
    return content.replace( /[\u201E\u201C]/g, '"' ); // Unicode for „ and “
}
This way whenever the PFW plugin filter encounters a text node it'll replace its contents with whatever is returned by the above mentioned function.
Caveats: there are quite a few Unicode symbols that represent quotation marks and dashes.
By the way: you may not want to get too attached to the current Paste From Word plugin - we're planning a major refactor of it for v4.6.
I hope this was helpful.
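To make the caveat about "quite a few Unicode symbols" concrete, here is a rough sketch of the code points one would probably want to cover. It is written in Python as a server-side cleanup step before saving to the CMS database, which is an alternative to the plugin filter described above, not part of CKEditor. The mapping is an assumption about what counts as a "smart" character and is not exhaustive; the same code points could go into the character class of the JavaScript replace() call.

```python
# Common "smart" punctuation mapped to plain ASCII (illustrative, not exhaustive).
SMART_TO_PLAIN = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201A": "'",  # single low-9 quotation mark
    "\u201C": '"',  # left double quotation mark
    "\u201D": '"',  # right double quotation mark
    "\u201E": '"',  # double low-9 quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
}
TRANSLATION_TABLE = str.maketrans(SMART_TO_PLAIN)


def plain_punctuation(text: str) -> str:
    """Replace smart quotes and long dashes with their plain equivalents."""
    return text.translate(TRANSLATION_TABLE)


print(plain_punctuation("\u201cHello\u201d \u2014 it\u2019s pasted from Word"))
```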

Regex to parse the "Accept" header

I'm working on a REST API. The client uses the Accept header in their request to send things like
...application/vnd.mywebsite+json; version=1... or
...application/vnd.mywebsite+xml; version=2....
Currently, I parse the headers with string functions and pick out the media type and version to serve:
json and 1
xml and 2
I was wondering if I could do that faster with a regex.
How can I pull out the format and version from an "Accept" header in the request? I suppose I would need to make 2 regex calls to get this done, and that's okay.
Update:
Using the answer below, I tried extracting those values in ColdFusion, but the pattern just matches the whole string.
Ideally, I want an array of 2 elements, i.e. ['json', '1']. Any ideas?
<cfscript>
    arrTitles = reMatch(
        "application/vnd.website\+([A-Za-z]+);\s*version=(\d+)",
        "application/vnd.website+json; version=2"
    );
    writedump(arrTitles);
</cfscript>
Please refer to this runnable example.
You could use something simple like this:
application/vnd.mywebsite\+([A-Za-z]+);\s*version=(\d+)
The type (json or xml) would be in capturing group 1, the version in group 2.
You can see it working here.
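As a sketch of how the two capturing groups turn into the ['json', '1'] array the question asks for, here is the same pattern in Python. The function name parse_accept is made up for the example, and the dot in "vnd.mywebsite" is escaped here so that it only matches a literal dot. The general point carries over to other engines: a "find all matches" style call tends to return whole matches, so you need whichever call exposes the groups.

```python
import re

ACCEPT_PATTERN = re.compile(
    r"application/vnd\.mywebsite\+([A-Za-z]+);\s*version=(\d+)"
)


def parse_accept(header: str):
    """Return [format, version] from an Accept header, or None if absent."""
    match = ACCEPT_PATTERN.search(header)
    if match is None:
        return None
    # group(1) is the format (json/xml), group(2) is the version number
    return [match.group(1), match.group(2)]


print(parse_accept("application/vnd.mywebsite+json; version=1"))
print(parse_accept("text/html, application/vnd.mywebsite+xml;version=2"))
```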

Using Regex to validate the number of words in a text area

I am attempting to write an MVC model validation that verifies that there are 10 or more words in a string. The string is being populated correctly, so I did not include the HTML. I have done a fair bit of research, and it seems that something along the lines of what I have tried should work, but for whatever reason mine always fails. Any ideas as to what I am doing wrong here?
(using System.ComponentModel.DataAnnotations, in an MVC 4 VB.NET environment)
I have tried ([\w]+){10,}, ((\\S+)\s?){10,}, [\b]{20,}, [\w+\w?]{10,}, (\b(\w+?)\b){10,}, ([\w]+?\s){10}, ([\w]+?\s){9}[\w], ([\S]+\s){9}[\S], ([a-zA-Z0-9,.'":;$-]+\s+){10,} and several more variations on the same basic idea.
<Required(ErrorMessage:="The Description of Operations field is required"), RegularExpression("([\w]+){20,}", ErrorMessage:="ERROZ")>
Public Property DescOfOperations As String = String.Empty
The correct solution was ([\S]+\s+){9}[\S\s]+
EDIT: Moved the accepted version to the top and removed the unused versions. Unless I am wrong, the whole sequence needs to match, so you need something like this (which also accounts for double spaces):
([\S]+\s+){9}[\S\s]+
Or:
([\w]+?\s+){9}[\w]+
Give this a try:
([a-zA-Z0-9,.'":;$-]+\s){10,}
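To illustrate why the accepted pattern works when the whole value has to match, here is a small Python sketch. re.fullmatch is used to imitate a validator that requires the pattern to cover the entire string, which as far as I know is how the RegularExpression attribute behaves; the helper names are invented for the example. A plain split-and-count is shown next to it as a sanity check.

```python
import re

# Nine "word + whitespace" groups, then at least one more character
TEN_WORDS = re.compile(r"(\S+\s+){9}[\S\s]+")


def has_ten_words_regex(text: str) -> bool:
    # strip() so leading or trailing whitespace does not skew the result
    return TEN_WORDS.fullmatch(text.strip()) is not None


def has_ten_words_split(text: str) -> bool:
    # Same check without a regex: split on any run of whitespace
    return len(text.split()) >= 10


nine = "one two three four five six seven eight nine"
ten = nine + "  ten"  # double space on purpose

print(has_ten_words_regex(nine), has_ten_words_split(nine))
print(has_ten_words_regex(ten), has_ten_words_split(ten))
```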

How do I imitate twitters url-shortener?

The main question is a bit short, so I'll elaborate.
I'm building an app for Twitter with which you can do the basic actions (get posts, post, reply, etc.).
I figured it would be a good idea to check the 140-character limit in my app.
So far so good; then someone asked if I could also handle the URL shortener.
At the moment I have a regex that picks up most (in fact too many) URLs, takes their length, and either adds or deducts the difference from the 140 maximum.
It's still a bit buggy, but I can manage that.
Now my problem...
It seems Twitter is quite picky about what it considers a URL:
I've got the most basic ones (starting with http(s):// and such), but Twitter also replaces some TLDs very easily ((www.)google.com and [whatever].net/.biz/.info are just a few of them)
but not .nl, .de or .tk.
Now I was wondering whether someone has found out which ones they do and which ones they don't 'shorten'.
And because I'm pretty sure my regex isn't the best either, I'll drop it here as well:
((http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)|([\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?)
http://support.twitter.com/articles/78124-how-to-shorten-links-urls# indicates that all URLs posted to Twitter will be rewritten to be exactly 19 characters long.
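If every URL really is rewritten to a fixed length, the character check gets simpler: count each detected URL as that fixed length instead of its real length. Here is a minimal Python sketch of that idea. The 19 comes from the support article cited above; the URL pattern is a deliberately naive placeholder that only catches links starting with http:// or https://, and the names are made up for the example.

```python
import re

SHORT_URL_LENGTH = 19  # figure quoted from the support article above
TWEET_LIMIT = 140

# Naive placeholder: only URLs with an explicit http(s) scheme
URL_RE = re.compile(r"https?://\S+")


def effective_length(tweet: str) -> int:
    """Length of the tweet once every detected URL counts as 19 characters."""
    length = len(tweet)
    for match in URL_RE.finditer(tweet):
        length += SHORT_URL_LENGTH - len(match.group(0))
    return length


tweet = "see http://example.com/a/very/long/path now"
print(effective_length(tweet), effective_length(tweet) <= TWEET_LIMIT)
```

Note that a URL shorter than 19 characters makes the tweet longer under this rule, so the difference can go either way.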
I am using this: var url_expression = /[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi; Nobody has complained :)
I figured it out. I found a pretty important line on the TLD Wikipedia page: it states that all country TLDs are two characters long and, the other way around, that all two-character TLDs are countries. With that in mind, I started testing a bunch of them with Twitter, and I'm pretty sure I now know which URLs Twitter shortens and which ones it doesn't.
All URLs starting with http:// or https://
All URLs like [something].[non-country TLD] # .com .biz .mobi etc. (except .arpa & .aero)
All URLs like [something].[something].[valid TLD] # including countries
Links like http://[user]:[pass]@[something].[tld] will NOT be shortened
Now to build a regex for it; I'll post it here as soon as I think I have it :D
This is what I have so far:
/(^(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?:(?:[-\w]+\.)+(?:com|asia|cat|coop|edu|int|tel|pro|org|net|gov|mil|biz|info|mobi|name|jobs|museum|travel|([a-z]{2})))(?::[\d]{1,5})?(?:(?:(?:\/(?:[-\w~!$+|.,=\(\)]|%[a-f\d]{2})+)+|\/)+|\?|#)?(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?:#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?)/gim;
One major flaw is still in it: it also accepts [domain].[tld], which Twitter doesn't.
I hope this will help someone in the future. I'm pretty sure there isn't a whole lot of easy-to-find info about this on the web (or at least I couldn't find any).
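For what it's worth, the four observations above can also be written as plain rules instead of one big regex, which is easier to test. This is only a sketch in Python of the behaviour as reported in this answer, not something verified against Twitter; GENERIC_TLDS is a small illustrative subset, and would_be_shortened is a made-up name.

```python
import re

# Illustrative subset; .arpa and .aero are left out on purpose (see the rules above)
GENERIC_TLDS = {"com", "net", "org", "biz", "info", "mobi"}


def would_be_shortened(token: str) -> bool:
    has_scheme = re.match(r"https?://", token, re.IGNORECASE) is not None
    rest = re.sub(r"^https?://", "", token, flags=re.IGNORECASE)
    host = re.split(r"[/?#]", rest, maxsplit=1)[0]

    if "@" in host:          # user:pass@host links are reported as not shortened
        return False
    if has_scheme:           # anything starting with http:// or https://
        return True

    labels = host.lower().split(".")
    if len(labels) < 2 or not all(labels):
        return False
    tld = labels[-1]
    is_country = len(tld) == 2 and tld.isalpha()   # two letters means country TLD
    is_generic = tld in GENERIC_TLDS

    if len(labels) >= 3:     # [something].[something].[valid tld]
        return is_country or is_generic
    return is_generic        # [something].[non-country tld]


for sample in ["google.com", "example.nl", "www.example.nl", "http://example.nl"]:
    print(sample, would_be_shortened(sample))
```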