Vtiger CRM (PHP): encoding special characters with Vtiger_Util_Helper::toSafeHTML(ZEND_JSON::encode)

This is the string I am passing to ZEND_JSON:
{"0":"265","product":"265","1":"059 Königsblau","st_color":"059 Königsblau","2":"","st_material":"","3":"XL","st_size":"XL","4":"287","stockmanagementid":"287"}
I want the output to show 059 Königsblau as written, not an escaped or garbled form of this string.
This is the actual code, in the file modules/Vtiger/PopupContents.tpl:
data-info='{Vtiger_Util_Helper::toSafeHTML(ZEND_JSON::encode($LISTVIEW_ENTRY->getRawData()))}'

I suspect this is a Content-Type header issue. This link should help you resolve it:
How to display special characters in PHP
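To see where the text can change along the way, here is a minimal sketch in Python rather than PHP. It assumes the value is JSON-encoded, HTML-escaped for the attribute, and sent as UTF-8. The row and variable names are hypothetical, and this is not Vtiger's own implementation.

```python
import html
import json

# Hypothetical row, standing in for $LISTVIEW_ENTRY->getRawData().
row = {"st_color": "059 Königsblau", "st_size": "XL"}

# By default json.dumps escapes non-ASCII characters as \uXXXX sequences.
escaped_json = json.dumps(row)
# ensure_ascii=False keeps the characters literal in the JSON text.
literal_json = json.dumps(row, ensure_ascii=False)

# Escape the JSON so it can sit inside a quoted HTML attribute.
attr_value = html.escape(literal_json, quote=True)

# Reversing both steps recovers the original data.
round_trip = json.loads(html.unescape(attr_value))

# If UTF-8 bytes are read with the wrong charset, the text is garbled.
garbled = "Königsblau".encode("utf-8").decode("latin-1")

print(escaped_json)
print(attr_value)
print(garbled)
```

Both JSON forms decode to the same data, so a \uXXXX escape is only a cosmetic difference. The garbled form is the one that points at a charset mismatch between the bytes sent and the declared Content-Type.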

Related

Regex Error - incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)

I'm trying to do something which seems like it should be very simple. I'm trying to see if a specific string, e.g. 'out of stock', is found within a page's source code. However, I don't care about matches inside an HTML comment or JavaScript. So prior to doing my search, I'd like to remove both of these elements using regular expressions. This is the code I'm using:
urls.each do |url|
  response = HTTP.get(url)
  if response.status.success?
    source_code = response.to_s
    # Remove comments
    source_code = source_code.gsub(/<!--(.*?)-->/su, '')
    # Remove scripts
    source_code = source_code.gsub(/<script(.*?)<\/script>/msu, '')
    if source_code.match(/out of stock/i)
      # Flag URL for further processing
    end
  end
end
This works for 99% of all the urls I tried it with, but certain urls have become problematic. When I try to use these regular expressions on the source code returned for the url "https://www.sunski.com" I get the following error message:
Encoding::CompatibilityError (incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string))
The page is definitely UTF-8 encoded, so I don't really understand the error message. A few people on Stack Overflow recommended using the # encoding: UTF-8 comment at the top of the file, but this didn't work.
If anyone could help with this it would be hugely appreciated. Thank you!
The Net::HTTP standard library only returns binary (ASCII-8BIT) strings. See the long-standing feature request: Feature #2567: Net::HTTP does not handle encoding correctly. So if you want UTF-8 strings you have to manually set their encoding to UTF-8 with String#force_encoding:
source_code.force_encoding(Encoding::UTF_8)
If the website's character encoding isn't UTF-8 you have to implement a heuristic based on the Content-Type header or <meta>'s charset attribute but even then it might not be the correct encoding. You can validate a string's encoding with String#valid_encoding? if you need to deal with such cases. Thankfully most websites use UTF-8 nowadays.
Also, as @WiktorStribiżew already wrote in the comments, the regexp encoding modifiers s (Windows-31J) and u (UTF-8) aren't necessary here, and only very rarely are. That applies especially to the latter, since modern Ruby defaults to UTF-8 (or, if sufficient, its subset US-ASCII) anyway. In other programming languages they may have a different meaning, e.g. in Perl s means single line.
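The decode-then-validate idea from this answer can be sketched as follows. This is in Python rather than Ruby, so strict decoding stands in for force_encoding plus valid_encoding?. The function names, the fallback order and the sample page are assumptions made for illustration.

```python
import re
from typing import Optional


def decode_body(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode raw response bytes, trying the declared charset, then UTF-8."""
    for encoding in (declared, "utf-8"):
        if not encoding:
            continue
        try:
            # Strict decoding raises on invalid bytes, which plays the role
            # of a validity check on the chosen encoding.
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
    # Last resort: replace undecodable bytes with U+FFFD and carry on.
    return raw.decode("utf-8", errors="replace")


COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
SCRIPT_RE = re.compile(r"<script.*?</script>", re.DOTALL | re.IGNORECASE)


def visible_source(raw: bytes, declared: Optional[str] = None) -> str:
    """Return the page source with HTML comments and scripts removed."""
    text = decode_body(raw, declared)
    text = COMMENT_RE.sub("", text)
    return SCRIPT_RE.sub("", text)


# Hypothetical page body, as bytes straight off the wire.
page = (
    "<p>Größe XL</p><!-- out of stock -->"
    "<script>var s = 'out of stock';</script>"
).encode("utf-8")
cleaned = visible_source(page)
print(re.search(r"out of stock", cleaned, re.IGNORECASE) is None)
```

The declared charset would normally come from the Content-Type header or a meta tag. As the answer notes, even that can be wrong, which is why the sketch falls back instead of trusting it.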

Escaping and unescaping HTML

In a function I do not control, data is being returned via
return xmlFormat(rc.content)
I later want to do a
<cfoutput>#resultsofreturn#</cfoutput>
The problem is all the HTML tags are escaped.
I have considered
<cfoutput>#DecodeForHTML(resultsofreturn)#</cfoutput>
But I am not sure these are inverses of each other
Like Adrian concluded, the best option is to implement a system to get to the pre-encoded value.
In its current state, the string you're working with is encoded for an XML document. One option is to create an XML document containing the text and parse the text back out of it. I'm not sure how efficient this method is, but it will return the text to its pre-encoded value.
function xmlDecode(text){
    return xmlParse("<t>#text#</t>").t.xmlText;
}
TryCF.com example
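To check whether an escape/unescape pair really are inverses, a round trip on a sample string is a quick test. Here is a minimal sketch in Python rather than CFML. In it, html.escape is only a stand-in for an XmlFormat()-style call, and xml_decode is a hypothetical helper that mirrors the parse-it-back idea above.

```python
import html
import xml.etree.ElementTree as ET

original = '<b>Fish & "Chips"</b>'

# Escape markup characters (stand-in for the function you do not control).
escaped = html.escape(original, quote=True)

# Option 1: reverse the escaping with the matching decoder.
via_unescape = html.unescape(escaped)


# Option 2: wrap the escaped text in an element and let an XML parser
# decode it, as in the xmlDecode() function above.
def xml_decode(text: str) -> str:
    return ET.fromstring(f"<t>{text}</t>").text or ""


via_parser = xml_decode(escaped)

print(escaped)
print(via_unescape == original, via_parser == original)
```

The parser route only works while the escaped text is well-formed XML. An HTML-only named entity in the input would make the parse fail.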
As of CF 10, you should be using the newer encodeFor functions. These functions account for high ASCII characters as well as UTF-8 characters.
Old and Busted
XmlFormat()
HTMLEditFormat()
JSStringFormat()
New Hotness
encodeForXML()
encodeForXMLAttribute()
encodeForHTML()
encodeForHTMLAttribute()
encodeForJavaScript()
encodeForCSS()
The output from these functions differs by context.
Then, if you're only getting escaped HTML, you can convert it back using Jsoup or the Jakarta Commons Lang library. There are some examples in a related SO answer.
Obviously, the best solution would be to update the existing function to return either version of the content. Is there a way to copy that function in order to return the unescaped content? Or can you just call it from a new function that uses the Java solution to convert the HTML?

How to convert MS Word Smart Quotes and em-dashes to simple quotes and dashes in Ckeditor 4

Hi I really like the new Ckeditor 4 Advanced Content Filtering along with the pastefromword plugin - and have read the docs on what html tags to allow and not, and I understand why it kindly converts my client's MS Word crap into htmlentities. However, I'd like to do a little intervention and convert the smart quotes to straight quotes - and all em dashes to plain dashes and not allow - before the text gets sent to the CMS database. But I can't find any docs on this or examples.
I can see there were many questions about this on the old CKEditor forum (http://ckeditor.com/forums/CKEditor-3.x/Replacing-smart-quotes-regular-quotes, http://ckeditor.com/forums/CKEditor-3.x/Problem-copyingpasting-MS-Word), but they didn't get answered.
I'm also hoping the ckeditor team reads these forums as this is where they suggest we post questions now.
CKEditor dev here.
If you want the Paste From Word plugin to do this, you could add a rule in the plugin that replaces the contents of text nodes.
To achieve this, add a property named 'text' somewhere over here (on the same level as the 'comment' property):
https://github.com/ckeditor/ckeditor-dev/blob/master/plugins/pastefromword/filter/default.js#L1106
It should be a function that accepts one parameter - the text node content, e.g.:
text: function( content ) {
    return content.replace( /[\u201E\u201C]/g, '"' ); // Unicode for „ and “
}
This way whenever the PFW plugin filter encounters a text node it'll replace its contents with whatever is returned by the above mentioned function.
Caveats: there are quite a few Unicode symbols that represent quotation marks and dashes.
By the way: you may not want to get too attached to the current Paste From Word plugin - we're planning a major refactor of it for v4.6.
I hope this was helpful.
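Because of the caveat about the many quote and dash code points, a lookup table can be easier to maintain than one regex per character. Below is a minimal sketch in Python rather than JavaScript. The table is a hypothetical starting set, not an exhaustive list, and it is not part of CKEditor.

```python
# Common "smart" punctuation mapped to plain ASCII. Not exhaustive.
SMART_PUNCTUATION = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201A": "'",  # single low-9 quotation mark
    "\u201C": '"',  # left double quotation mark
    "\u201D": '"',  # right double quotation mark
    "\u201E": '"',  # double low-9 quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
}
TRANSLATION_TABLE = str.maketrans(SMART_PUNCTUATION)


def plain_punctuation(text: str) -> str:
    """Replace smart quotes and dashes with straight quotes and hyphens."""
    return text.translate(TRANSLATION_TABLE)


sample = "\u201CIt\u2019s done\u201D \u2014 finally"
print(plain_punctuation(sample))
```

The same table could be used on the server just before the text is stored, in case content reaches the CMS by a route other than the editor.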

Replacing special characters from HTML source

I'm new to HTML coding. I know HTML reserves some characters for its own use, and that it displays some characters by their character code. For example:
&#140; is Œ
&#169; is ©
&#174; is ®
I have the HTML source in a std::string. How can I decode these into their actual characters and replace them in the std::string? Is there any library with source available, or can it be done using macros or the preprocessor?
I would recommend using some HTML/XML parser that can automatically do the conversion for you. Parsing HTML correctly by hand is extremely difficult. If you insist on doing it yourself, Boost String Algorithms library provides useful replacement functions.
&#140; is Œ
No it isn't. &#140; is 'PARTIAL LINE BACKWARD'. The correct numeric entities for Œ are &#338; and &#x152;.
One method for the numeric entities would be to use a regular expression like &#([0-9]+);, grab the numeric value and convert it to the ASCII character (probably with sprintf in C++).
For the named entities you would need to build a mapping. You could probably do a simple string replace to convert to the numbers, then use the method above. W3C has a table here: http://www.w3.org/TR/WD-html40-970708/sgml/entities.html
But if you're trying to read or parse a bunch of HTML in a string, you should use an HTML parser. Search for the many questions on SO.
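The regex-plus-mapping approach described above can be sketched as follows, in Python rather than C++ for brevity. The function name decode_entities is hypothetical. The name table comes from Python's standard html.entities module, standing in for the hand-built mapping the answer suggests.

```python
import html
import re
from html.entities import name2codepoint

# One pattern for both forms, so already-decoded text is never re-scanned.
ENTITY_RE = re.compile(r"&(?:#([0-9]+)|([A-Za-z][A-Za-z0-9]*));")


def decode_entities(text: str) -> str:
    """Replace decimal and named entities with the characters they denote."""

    def replace(match: "re.Match[str]") -> str:
        number, name = match.group(1), match.group(2)
        code = int(number) if number is not None else name2codepoint.get(name)
        if code is None or code > 0x10FFFF:
            # Unknown name or out-of-range number: leave the text untouched.
            return match.group(0)
        return chr(code)

    return ENTITY_RE.sub(replace, text)


sample = "&#338; &copy; &#174; &bogus;"
print(decode_entities(sample))
# The standard library decoder, for comparison.
print(html.unescape("&#338; &copy; &#174;"))
```

This sketch skips hexadecimal references such as &#x152;. A ready-made decoder like html.unescape also handles those, and it follows the HTML5 rule that treats &#140; as the Windows-1252 byte for Œ, which is one more reason to prefer an existing parser.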

Detecting Characters in an XSLT

I have encountered some odd characters that do not display properly in Internet Explorer, such as these: “, –, and ’. I think they're carried over from copy-and-paste Word content.
I am using XSLT to build the page content, and it would be great to detect these characters in the XSLT and replace them with valid HTML codes. I already do string replacement in the style sheet, but I'm not sure how to detect these encoded characters, or whether it's possible.
What about simply changing the encoding of the stylesheet, as well as its output, to UTF-8? The characters you mention are “, – and ’. They are certainly not invalid, given the correct encoding (they are at least perfectly valid in Codepage 1252).
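The encoding point is easy to check outside XSLT. Here is a minimal sketch in Python, with a hypothetical helper name, showing three things for each of the characters: it is a single byte in Codepage 1252, its UTF-8 bytes look garbled when misread as Codepage 1252, and it can be written as a numeric character reference if the output must stay ASCII.

```python
# The three characters from the question, written as escapes.
CHARS = ["\u201C", "\u2013", "\u2019"]


def to_char_refs(text: str) -> str:
    """Rewrite every non-ASCII character as a numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


for ch in CHARS:
    # Each of these characters is a single byte in Codepage 1252.
    cp1252_bytes = ch.encode("cp1252")
    # The UTF-8 bytes, misread as Codepage 1252, come out garbled.
    misread = ch.encode("utf-8").decode("cp1252")
    print(cp1252_bytes, misread, to_char_refs(ch))
```

If the garbled forms match what the browser shows, the fix is to make the declared encoding agree with the bytes, not to replace characters in the stylesheet.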
Using a good XML editor such as XMLSpy should highlight any errors in formatting your XSLT by validating at development time.
Jeni Tennison's Multiple string replacements may be a good starting point.