How to write Russian characters to file with cffile - coldfusion

I have a string that I need to write out as an XML file. The string contains Russian characters, which I can cfoutput to the page with no problem, but when I write the file with cffile, those characters come out as ?. I tried changing the charset to each of the following, with no success:
windows-1252
iso-8859-1
cp1251
cp866
I'm sure the charset is the problem here. Any suggestions?
Here is one of the strings in question: Другие
I'm running ColdFusion 10 on a Windows Server 2008 R2 System.

Untested, but I have had page-encoding problems in the past. Try <cfprocessingdirective pageencoding="utf-8">. Make sure you put it in every template involved in the operation; putting it only in Application.cfm or Application.cfc may not be sufficient.

Set charset="utf-8" on the cffile tag when writing the file.
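To illustrate why the charsets tried in the question can produce ? while UTF-8 does not, here is a minimal sketch in Python rather than CFML, so it only mirrors the behaviour and is not ColdFusion's actual implementation. The file name out.xml and the <name> element are hypothetical.

```python
import tempfile
from pathlib import Path

text = "Другие"  # sample string from the question

# Encodings with no Cyrillic letters substitute "?" for every character.
for name in ("windows-1252", "iso-8859-1"):
    print(name, text.encode(name, errors="replace"))

# UTF-8 can represent every Unicode character, so the file round-trips.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "out.xml"  # hypothetical file name
    path.write_text(f"<name>{text}</name>", encoding="utf-8")
    round_trip = path.read_text(encoding="utf-8")

print(round_trip == f"<name>{text}</name>")
```

The same idea applies on the reading side: whatever opens the file has to read it as UTF-8 too, otherwise correctly written bytes will still look broken.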

Related

NUSOAP with error response 

NuSOAP is producing some strange characters in the response, and when the JSON is larger it responds with an error, as if the code were encrypted. The beginning of the response shows <SOAP-ENV:Envelope... . I encoded the results as UTF-8 and changed the file's encoding to UTF-8, but nothing changed.
Thanks to all. Yes, as Alex K said, it's a BOM character. The problem was not in my file but in an external file: a class file contained the BOM. I found it using a tool called bomremover.exe, which lists the files and batch-removes the character. After that, everything works fine. The tool was made by Maurice Wohlkönig.
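For anyone who wants to check a file for a BOM without a separate tool, a minimal Python sketch could look like this. It is not how bomremover.exe works, just the same idea; the sample bytes and the function name are made up for illustration.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical content of a class file that was saved with a BOM.
sample = codecs.BOM_UTF8 + b"<?php class Foo {}"

print(sample.startswith(codecs.BOM_UTF8))  # BOM present before cleaning
print(strip_utf8_bom(sample))              # same bytes with the BOM removed
```

The BOM is the three bytes EF BB BF. When an included file starts with them, they are sent to the client ahead of the real response, which is enough to break a SOAP or JSON parser.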

wrong text encoding on linux

I downloaded a source code .rar file from the internet to my Linux server, then extracted all the source files into a local directory. When I use the "cat" command to view the content of each file, the text is shown with the wrong encoding in my terminal (there are some Chinese characters in the source files).
I use
file -bi testapi.cpp
then shows:
text/plain; charset=iso-8859-1
I tried to convert that file to UTF-8 with the following command:
iconv -f ISO88591 -t UTF8 testapi.cpp > new.cpp
But it doesn't work.
I set my .vimrc file with following two lines:
set encoding=utf-8
set fileencoding=utf-8
After this, when I open testapi.cpp in vim, the Chinese characters are displayed correctly. But cat testapi.cpp still doesn't work.
When I compile and run the program, the printf statement with Chinese characters prints wrong characters like ????
What should I do to display the Chinese characters correctly when I run the program?
TLDR Quickest Solution: Copy/Paste the Visible Text to a Brand-New, Confirmed UTF-8 File
Your file is marked as latin1, but the data is stored as utf8.
When you use set enc=utf8 or set fileencoding=utf-8 in Vim, you're not changing the data or converting it. You're looking at exactly the same data, but interpreting it as UTF-8. So, good news: your data is good. No conversion or changing is necessary.
You just need to put the same exact data into a file already marked as UTF-8 encoding. That can be done easily by simply making a brand new file in vim, using set enc=utf8, and then copy-pasting your old data into the new file. You can test this out by making a testfile with the only text "汉语" ("chinese language"), set enc, save, close, reopen, and see that the text didn't get corrupted. And you can test with file -bi $pathtofile, though that is not super reliable.
Anyway, TLDR: Make a brand new UTF-8 file, confirm that it's utf-8, make your data visible, and then copy/paste and/or transfer it to the new UTF-8 file, without doing any conversion.
Also, theoretically, I considered that iconv -f utf8 -t utf8 would work, since all I wanted to do was make utf-8-encoded data be marked as utf-8-encoded, without changing it. But this gave me an error that indicated it was still trying to do a data conversion.
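To see why converting from ISO-8859-1 to UTF-8 does not help when the bytes are already UTF-8, here is a small Python sketch. It assumes, as this answer does, that the data on disk is UTF-8; if the file were really in another encoding such as GB18030, the validity check at the end would report that instead.

```python
original = "汉语"                       # test string from the answer
utf8_bytes = original.encode("utf-8")  # what this answer assumes is on disk

# Equivalent of `iconv -f ISO88591 -t UTF8` applied to data that is already
# UTF-8: each byte is read as one Latin-1 character and encoded a second time.
double_encoded = utf8_bytes.decode("iso-8859-1").encode("utf-8")

print(len(utf8_bytes), len(double_encoded))  # 6 12


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(utf8_bytes))
```

Running is_valid_utf8 on the raw bytes of the real file is a more direct check than file -bi, since it tests the data itself rather than guessing from a sample.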

POCO C++ SAX parser: If the xml document encoding is ANSI then next statement is not reading and throwing encoding error exception

Suppose the following is the XML document. The hello tag is not read by the POCO SAX parser because the encoding is ANSI.
<?xml version="1.0" encoding="ANSI"?>
<hello xmlns=" ">
If the encoding is UTF-8, the hello tag is read and everything works fine.
Is there any solution in POCO for this issue?
It's not a POCO problem, fix the producer. There's no such thing as "ANSI" encoding in XML. The producer should generate output in a valid encoding. Whether that's "UTF-8" or "ISO-8859-1" doesn't really matter, as long as it's all consistent.
Encoding problems arise when the document declares one encoding but is actually stored in another. This can happen, for example, if you copy and paste XML source between documents or from web pages, or simply because the producer has a buggy encoder. Try a program that can detect the actual encoding, convert the file to UTF-8, and then replace the ANSI encoding in the XML declaration with UTF-8.
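If the producer cannot be fixed, the conversion described above can be sketched as a preprocessing step. This is Python, not POCO, and it assumes the bytes labelled "ANSI" are really Windows-1252, which is only a guess that has to be verified for the actual producer. The function name and sample document are made up.

```python
import re
import xml.etree.ElementTree as ET

# Matches the encoding="..." part of the XML declaration.
DECL_ENCODING = re.compile(r'(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def to_utf8_xml(raw: bytes, actual_encoding: str = "cp1252") -> bytes:
    """Decode with the assumed real encoding and relabel the document as UTF-8."""
    text = raw.decode(actual_encoding)
    text = DECL_ENCODING.sub(r"\1\2UTF-8\2", text, count=1)
    return text.encode("utf-8")


# Hypothetical producer output: labelled "ANSI", stored as Windows-1252.
raw = '<?xml version="1.0" encoding="ANSI"?><hello>café</hello>'.encode("cp1252")

fixed = to_utf8_xml(raw)
root = ET.fromstring(fixed)
print(root.tag, root.text == "café")
```

The declaration and the bytes are changed together here. Rewriting only the label without re-encoding the content would just move the mismatch somewhere else.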

Does the Wrap function in ColdFusion insert CR/LFs?

I have the need to do some word wrapping with a few considerations:
Source file is MS WORD
Copy and paste the text into a textarea in a cfform.
Use #wrap(theTextVar,80)# to output the text wrapped at 80 characters
The text is uploaded to a legacy system which needs ANSI or ASCII characters.
Everything seems to work okay. I just wanted to confirm whether anyone else has had luck doing this, and whether a CR/LF is inserted after each line in the output text (step 3).
From the docs on wrap():
Uses the operating-system specific line break: newline for UNIX, carriage return and newline on Windows.
So if you are doing this on a Windows box, then the answer is yes.
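If the legacy system needs CR/LF regardless of which server does the wrapping, the line breaks can be made explicit instead of OS-dependent. The sketch below shows that idea in Python using textwrap; it is an analogue and not ColdFusion's wrap(), and the helper name is made up.

```python
import textwrap


def wrap_crlf(text: str, width: int = 80) -> str:
    """Wrap text at `width` columns and join the lines with an explicit CR/LF."""
    lines = textwrap.wrap(text, width=width)
    return "\r\n".join(lines)


sample = "word " * 40  # 200 characters of filler text
wrapped = wrap_crlf(sample)

print(wrapped.count("\r\n"))                        # number of line breaks inserted
print(max(len(line) for line in wrapped.split("\r\n")))  # longest line length
```

In CFML the same effect could probably be had by normalising the line breaks in the output of wrap() before uploading, but that has not been tested here.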
Tried this?
<cffile action="write" file="i_will_show_the_secret_if_you_open_me_in_text_editor.html" output="#wrap(theTextVar,80)#" />

Special characters in CFMail

I'm trying to auto-generate a plain text email with a trademark symbol in it. I've tried everything I can think of but it's still not going through.
<cfmail from="#x#" to="#y#" subject="test" charset="UTF-8">
™
™
#Chr(153)#
</cfmail>
This is an encoding issue.
You state the mail is encoded as UTF-8, but Chr(153) does not return a trademark symbol in Unicode. It does in Windows-1252, but Chr() works with Unicode code points.
Use Chr(8482) to nail it to the Unicode TM symbol.
I've found an info page that outlines the issue nicely.
By the way, writing the literal TM symbol works for me as well. But this assumes your .cfm files are in fact encoded as Windows-1252 and that the ColdFusion runtime is configured to expect this (both of which are the default on Windows systems, where I tested it; analogous rules apply to other systems). ColdFusion converts all strings to Unicode internally, so maybe something is broken in this chain of expectations in your setup.
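The difference between 153 and 8482 can be checked outside ColdFusion. Below is a short Python sketch of the same arithmetic; Python's chr() also takes Unicode code points, so it behaves like the Chr() described above, but this shows only the encoding facts and not CFMail itself.

```python
import unicodedata

TM = "\u2122"  # TRADE MARK SIGN

# In Windows-1252 the trademark sign is stored as the single byte 153 (0x99).
print(bytes([153]).decode("windows-1252") == TM)

# Its Unicode code point is 8482, which is what a code-point based Chr() needs.
print(ord(TM))  # 8482

# Code point 153 is an invisible C1 control character, not a symbol.
print(unicodedata.category(chr(153)))  # Cc

# In a UTF-8 mail body the sign is sent as three bytes.
print(TM.encode("utf-8"))  # b'\xe2\x84\xa2'
```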
I think this is not so much an issue with CFMail as an issue with email clients displaying the character codes in plain-text messages literally rather than converting them to their corresponding characters.
Using CFMail in HTML mode should provide the result you're looking for.