Can I use a "\n" character in strings that are to be translated with Qt Linguist - c++

I'm working on supporting different languages for our GUI. I'm having a problem translating strings that have a '\n' in them. They seem to be ignored.
In Qt Designer I have a QCheckBox with this in the text field:
Here's an \nexample that doesn't work
This appears in English in our French translation.
Having looked at the .ts XML, it seems that the text after the '\n' is omitted (I guess this is why it doesn't get translated?).
Does anyone have a way of including a newline in the original text?
It seems I had carriage returns in my text before the newline (no idea how they got there).
For example:
Here's an [][][][]\nexample that doesn't work
After removing them, the translation worked.
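One way to catch this kind of problem before the string reaches Qt Linguist is to scan the source text for invisible control characters. The following is a minimal Python sketch, not part of Qt; the helper names `find_invisible_chars` and `strip_carriage_returns` are hypothetical, and the sample string assumes the stray characters were real carriage returns (U+000D).

```python
import unicodedata


def find_invisible_chars(text):
    """Return (index, code point) pairs for control characters other than newline and tab."""
    hits = []
    for index, char in enumerate(text):
        # Category "Cc" covers control characters such as carriage return (U+000D).
        if unicodedata.category(char) == "Cc" and char not in "\n\t":
            hits.append((index, f"U+{ord(char):04X}"))
    return hits


def strip_carriage_returns(text):
    """Remove every carriage return while leaving newlines in place."""
    return text.replace("\r", "")


# Assumed sample: two stray carriage returns in front of the newline.
sample = "Here's an \r\r\nexample"
print(find_invisible_chars(sample))
print(repr(strip_carriage_returns(sample)))
```

Running the check on the strings before they go into the .ui file would flag the carriage returns by position, so they can be removed by hand or with `strip_carriage_returns`.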

The "\n" character itself was not my problem.
Some invisible carriage returns in the string were the culprit.
See http://qt.nokia.com/developer/task-tracker/index_html?method=entry&id=81275

Use the HTML subset: "Here's an <br />example that does work".

Related

How to find and replace box character in text file?

I have a large text file that I'm going to be working with programmatically, but I have run into problems with a special character strewn throughout the file. The file is far too large to scan by eye for specific characters. I've been able to get rid of most of the other unwanted special characters using regex patterns. But there is a box character, similar to "□". When I tried to copy the character from the actual text file and paste it here I got "�", so the example of the box is from the Windows character map, which gives the code 'U+25A1'. I'm not sure how to interpret that code or whether it's something I could use for a regex search.
Would anyone know how I could search for the box symbol similar to "□" in a UTF-8 encoded file?
EDIT:
Here is an example from the text file:
"� Prune palms when flower spathes show, or delay pruning until after the palm has finished flowering, to prevent infestation of palm flower caterpillars. Leave the top five rows."
The only problem is that, as mentioned in the original post, the square gets converted into a diamond question mark.
It's unclear where and how you are searching, although you could use the hex equivalent:
\x{25A1}
Example:
https://regex101.com/r/b84oBs/1
The black diamond with a question mark is not a character, per se. It is what a browser spits out at you when you give it unrecognizable bytes.
Find out where that data is coming from.
Determine its encoding. (Usually UTF-8, but might be something else.)
Be sure the browser is configured to display that encoding. Putting <meta charset=UTF-8> in the header of the page is likely to suffice.
I found a workaround using Notepad++ and this website. It's still not clear what encoding system the square is originally from, but when I post it into the query field in the website above or into the Notepad++ Conversion Table (Plugins > Converter > Conversion Table) it gives the hex-character code for the "Replacement Character" which is the diamond with the question mark.
Using this code in a regex, \x{FFFD}, in the Notepad++ search found all the squares, although it recognizes them as the Replacement Character.
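The same search can be done outside Notepad++. Below is a minimal Python sketch of the idea; the helper names `decode_lossy` and `count_suspects` are hypothetical, and the sample byte 0x95 is only an assumption about what the original file might contain (it is a bullet in Windows-1252 and is not valid on its own in UTF-8).

```python
import re

REPLACEMENT = "\ufffd"   # U+FFFD, shown as a diamond with a question mark
WHITE_SQUARE = "\u25a1"  # U+25A1, the box from the Windows character map


def decode_lossy(raw):
    """Decode bytes as UTF-8, turning every undecodable byte into U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def count_suspects(text):
    """Count occurrences of U+FFFD and U+25A1 in already-decoded text."""
    return len(re.findall(f"[{REPLACEMENT}{WHITE_SQUARE}]", text))


# Hypothetical input: a line that starts with the single byte 0x95.
raw = b"\x95 Prune palms when flower spathes show"
text = decode_lossy(raw)
print(count_suspects(text))
# Decoding the same bytes as Windows-1252 shows what the byte may have been.
print(ascii(raw.decode("cp1252")))
```

If the replacement characters turn out to come from a mismatched encoding like this, decoding the file with the right codec recovers the original character instead of deleting it.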

print with GDI+ Graphics.DrawString produces garbage characters

I'm using the GDI+ Graphics.DrawString call to print a document with Chinese characters. All text is in Unicode (WCHAR). The problem is that on some computers (1% of all), all Chinese characters become garbage characters. It seems to interpret the text in a different code page.
I have found that only characters in regular style (FontStyleRegular) have problems. Characters in Bold style are OK.
I also tried to print to the "Microsoft XPS Document Writer" printer. The problem is the same, so it's not a problem with the printer driver.
I have debugged the program and can confirm that the text parameter in the DrawString call is correct.
I have fixed the problem by copying the font file from a good computer to the problematic one.

I see a character called xDB on notepad++. What character is this?

All I really need to know is what is this character. I have not seen anything like this before.
How do I remove this using VB.NET:
data = data.Replace(Chr(???????), "")
Is there a specific decimal control character number, or something similar, for this character that I can use in place of the question marks?
Please help.
I tried looking through HTML, ASCII and regex references to find this character, but I did not find it anywhere.
To prevent possible bugs related to the encoding of your source files, you should use a hex editor (such as this Notepad++ plugin) to find the hexadecimal code of the character, then use that to reference the character in your code:
data = data.Replace(ChrW(&HDB), "")
as opposed to:
data = data.Replace("Û", "")
Note: In this case the hex editor is unnecessary because xDB is already a hex code, but other control characters, such as CR and LF, are not displayed as their hex values [in Notepad++].
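The idea of referring to the character by its numeric code rather than by a pasted literal can be illustrated as follows. This is a minimal sketch in Python rather than VB.NET, with hypothetical helper names; it shows both cases, depending on whether the data has already been decoded to text or is still raw bytes.

```python
def remove_code_point(text, code_point=0xDB):
    """Remove one character from decoded text, identified by its numeric code point."""
    return text.replace(chr(code_point), "")


def remove_byte(raw, byte_value=0xDB):
    """Remove one byte value from data that has not been decoded yet."""
    return raw.replace(bytes([byte_value]), b"")


print(ascii(remove_code_point("abc\u00dbdef")))
print(remove_byte(b"abc\xdbdef"))
```

Because the code never contains the character itself, the result does not depend on the encoding the source file is saved in.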

Can someone explain which characters need to be escaped in TinyXML?

I am using tinyxml to save input from a text ctrl. The user can copy whatever they like into the text box and it gets written to an xml file. I'm finding that the new lines don't get saved and neither do & characters. The weird part is that tinyxml just discards them completely without any warning. If I put a & into the textbox and save, the tag will look like:
<textboxtext></textboxtext>
Newlines completely disappear as well. No characters whatsoever are stored. What's going on? Even if I need to escape them with &amp; or something, why does it just discard everything? Also, I can't find anything on Google regarding this topic. Any help?
EDIT:
I found this topic, which suggests that the discarding of these characters may be a bug:
TinyXML and preserving HTML Entities
It is, apparently, a bug in TinyXml.
The simple workaround is to escape anything that it might not like:
&, ", ', < and > get their regular XML entity encodings
strange characters (read: anything other than alphanumerics and regular punctuation) are best translated to their Unicode code point: &#....;
Remember that TinyXml is above all a lightweight XML library, not a full-fledged beast.
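The escaping rules above can be sketched as a pre-processing step applied to the text before it is handed to the XML library. This is a minimal Python sketch using the standard library's `xml.sax.saxutils.escape`, not TinyXml itself; the helper name `escape_for_xml` is hypothetical, and encoding newlines as `&#10;` is an assumption about what the parser will preserve.

```python
from xml.sax.saxutils import escape

# Extra replacements on top of the &, < and > that escape() always handles.
# Encoding newlines as &#10; is an assumption, not confirmed TinyXml behavior.
EXTRA_ENTITIES = {'"': "&quot;", "'": "&apos;", "\n": "&#10;"}


def escape_for_xml(text):
    """Escape markup characters, then turn non-ASCII characters into numeric references."""
    escaped = escape(text, EXTRA_ENTITIES)
    # xmlcharrefreplace rewrites every non-ASCII character as &#NNNN;
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(escape_for_xml('Tom & "Jerry" <cat>\nline two \u2013 done'))
```

The ampersand is replaced first, so the entities produced by the later replacements are not escaped a second time.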

Detecting Characters in an XSLT

I have encountered some odd characters that do not display properly in Internet Explorer, such as these: “, –, and ’. I think they're carried over from copy-and-paste Word content.
I am using XSLT to build the page content, and it would be great to detect these characters in the XSLT and replace them with valid HTML codes. I already do string replacement in the style sheet, but I'm not sure how to detect these encoded characters or whether it's possible.
What about simply changing the encoding of the stylesheet, as well as its output, to UTF-8? The characters you mention are “, – and ’. They are not invalid, given the correct encoding (they are at least perfectly valid in code page 1252).
Using a good XML editor such as XMLSpy should highlight any errors in formatting your XSLT by validating at development time.
Jeni Tennison's Multiple string replacements may be a good starting point.
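To make the detect-and-replace idea concrete, here is a minimal sketch in Python rather than XSLT. The helper names and the mapping table are hypothetical; the table covers only the three characters mentioned in the question, and a real stylesheet would apply the same mapping with its own string-replacement templates.

```python
# Mapping from code point to HTML entity for the three characters in the question.
WORD_PUNCTUATION = {
    0x201C: "&ldquo;",  # left double quotation mark
    0x2013: "&ndash;",  # en dash
    0x2019: "&rsquo;",  # right single quotation mark
}


def replace_word_punctuation(text):
    """Replace the mapped characters with named HTML entities."""
    return text.translate(WORD_PUNCTUATION)


def find_non_ascii(text):
    """List the distinct non-ASCII characters in the text as U+XXXX strings."""
    return sorted({f"U+{ord(char):04X}" for char in text if ord(char) > 127})


sample = "\u201cQuoted \u2013 it\u2019s"
print(find_non_ascii(sample))
print(replace_word_punctuation(sample))
```

Listing the non-ASCII code points first shows which characters actually occur in the content, so the mapping only needs entries for those.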