C++Builder Rave Reports: encoding problem with Cyrillic

When I try to save a Rave project to a PDF or HTML file, the encoding is incorrect.
When I choose a format and press SAVE, it usually saves in ISO-8859-1.
But I need CP1251 (Cyrillic).
For example, I get "Ïëîùàäü" instead of "Площадь".

I would guess that the best solution to your problem would be to use Unicode, rather than a codepage such as CP1251. Is it possible to use Unicode with Rave Reports?

I have the same problem when I want to save a report in PDF format. I create a TRvRenderPDF and set it as the RenderObject, but the PDF file is not displayed correctly.
The TRvRenderPDF component is not Unicode-compatible (which is very bad), so all text in the report is converted to ANSI using the active code page (for Cyrillic, CP1251). The result is a PDF file whose text is encoded in CP1251.
By default, TRvRenderPDF generates the PDF with the Type 1 font Helvetica (one of the fonts the PDF standard defines as built in). The viewer interprets the text as ISO 8859-1 (or CP1252), but it was actually encoded as CP1251, which is why we get "Ïëîùàäü" or something similar.
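The mismatch described above can be reproduced in a few lines of Python (used here only for illustration; it is not part of Rave Reports). This sketch encodes the text as CP1251, as the renderer is said to do, and then decodes the same bytes as ISO 8859-1, as the viewer does for the built-in Helvetica:

```python
# Text as the report author typed it.
original = "Площадь"

# The renderer writes the text as single bytes in the active ANSI code page.
raw_bytes = original.encode("cp1251")

# The PDF viewer reads those same bytes as ISO 8859-1 (Latin-1).
misread = raw_bytes.decode("iso-8859-1")

print(misread)  # Ïëîùàäü

# Decoding with the matching code page recovers the original text.
recovered = raw_bytes.decode("cp1251")
```

The bytes in the file are the same in both cases; only the table used to turn them into glyphs differs.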
What we can do:
Get a Type 1 font (CP1252) in which the special symbols (those with the same character codes as the Cyrillic letters in CP1251) are replaced with Cyrillic glyphs (for example, a link), and install it.
Then replace the old font name (Helvetica) in the generated PDF document with the new font name (AGHelvetica). You can do this with a text editor or in your own program (read file -> find -> replace -> save file).
That is the whole situation.
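The read -> find -> replace -> save step could look roughly like the Python sketch below. The function names are hypothetical, and it assumes the font name appears uncompressed in the file as the PDF name `/Helvetica`. Note that the replacement changes the file length, so the byte offsets in the PDF cross-reference table no longer match; whether your viewer tolerates that is something to check on your own files.

```python
from pathlib import Path


def replace_font_name(data: bytes,
                      old: bytes = b"/Helvetica",
                      new: bytes = b"/AGHelvetica") -> bytes:
    """Return the PDF bytes with every occurrence of the old font name replaced."""
    # Works on raw bytes so the binary parts of the PDF are left untouched.
    return data.replace(old, new)


def patch_pdf(src: str, dst: str) -> None:
    """Read file -> find -> replace -> save file."""
    Path(dst).write_bytes(replace_font_name(Path(src).read_bytes()))
```

A call such as `patch_pdf("report.pdf", "report_cyr.pdf")` (file names are placeholders) writes the patched copy next to the original. Variants such as `/Helvetica-Bold` are renamed as well, so the replacement font family would need matching names.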
P.S. Sorry for my English.
P.P.S. If you set the PDF renderer's EmbedBaseFonts property to true, the PDF document is saved with TrueType fonts, but the problem remains. You need to look for a Unicode-capable renderer, but not this one.

Related

Substitute Wingding fonts for linux

I am using Aspose.Words for Java and trying to build a PDF from docx/ppt on Linux. The docx document has a list with bullet points. These bullet points use the Symbol font.
When I create the PDF with Aspose, these bullet points are shown in the Webdings font as a clapperboard.
I did not find any free font (for commercial use) that is equivalent to the Symbol font. Does anyone know a good solution to show correct bullet points in a list?
I found the way to substitute fonts, but I don't know which font to use:
TableSubstitutionRule tableSubstitutionRule = fontSettings.getSubstitutionSettings().getTableSubstitution();
tableSubstitutionRule.addSubstitutes("Symbol", "?WHICH_FONT?");
This is a known peculiarity. The Windows “Symbol” font is a symbolic font (like “Webdings”, “Wingdings”, etc.) that uses the Unicode Private Use Area (PUA), so substituting another font for it renders different glyphs. The “Symbol” font provided on Mac/Linux, on the other hand, is a proper Unicode font (for example, Greek characters are in the U+0370…U+03FF Greek and Coptic block). The two fonts are therefore incompatible, and the Mac/Linux “Symbol” font cannot be used in place of the Windows “Symbol” font without additional steps. In this particular case you have to change the bullet code point in the document from PUA U+F0B7 (or U+00B7, which MS Word can also use for symbolic fonts) to U+2022 in order to use the Mac “Symbol” font. See the following code for example:
// Replace the PUA bullet used with the Windows "Symbol" font by the standard Unicode bullet.
Document doc = new Document("/Users/mac1/Downloads/in.docx");
for (com.aspose.words.List lst : doc.getLists())
{
    for (com.aspose.words.ListLevel level : lst.getListLevels())
    {
        if (level.getFont().getName().equals("Symbol") && level.getNumberFormat().equals("\uF0B7"))
        {
            level.setNumberFormat("\u2022");
        }
    }
}
doc.save("/Users/mac1/Downloads/out.pdf");
If this does not help, please post your question in the Aspose.Words support forum and attach your input and output documents there.
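As a side check that is independent of Aspose, the difference between the two code points mentioned above can be inspected with Python's standard `unicodedata` module. This is only an illustrative sketch; the variable names are made up for the example.

```python
import unicodedata

windows_bullet = "\uf0b7"   # code point used with the Windows Symbol font
standard_bullet = "\u2022"  # regular Unicode bullet

# General category "Co" means "private use": the glyph depends entirely on the font.
windows_category = unicodedata.category(windows_bullet)

# A standard code point has a fixed name and meaning in every Unicode font.
standard_name = unicodedata.name(standard_bullet)

print(windows_category, standard_name)
```

Because U+F0B7 has no standard meaning, a substitute font is free to draw anything there, which is consistent with a bullet turning into an unrelated pictogram.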

TinyMCE, Django and python-docx

I'm looking into using a rich text editor in my Django project. TinyMCE looks like the obvious solution; however, I see that the output format is HTML (here). The goal is to store user input and then serve it inside a Word document using python-docx (which does not accept HTML).
Do you know of any solution for this? Either a feature of TinyMCE, an HTML-to-Word converter that keeps styles, or maybe another rich text editor similar to TinyMCE?
UPDATE:
This is another option which I found to be working fine. I am still at the point of trying to convert HTML to Word without losing styles. A solution for this may be pywin32, as stated here, but it doesn't help me that much, and it's Windows only.
UPDATE 2:
After quite some digging I found pandoc and pypandoc, which appear to be able to translate into any of these output formats:
"asciidoc, beamer, commonmark, context, docbook, docbook4, docbook5, docx, dokuwiki, dzslides, epub, epub2, epub3, fb2, gfm, haddock, html, html4, html5, icml, jats, json, latex, man, markdown, markdown_github, markdown_mmd, markdown_phpextra, markdown_strict, mediawiki, ms, muse, native, odt, opendocument, opml, org, plain, pptx, revealjs, rst, rtf, s5, slideous, slidy, tei, texinfo, textile, zimwiki"
I haven't figured out how to integrate such output with python-docx.
I had the same challenge. You'll want to use Python's Beautiful Soup library to iterate through the content from your HTML editor (I use Summernote, but any HTML editor should work) and then parse the HTML tags into a format usable by python-docx. Pandoc and pypandoc will convert files for you (e.g. you start with a LaTeX file and need to convert it to Word), but will not provide the tools you need to convert to and from XML/HTML.
Good luck!
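A minimal sketch of the "walk the HTML and collect formatting" idea follows. To stay dependency-free it uses the standard library's `html.parser` instead of Beautiful Soup, and it handles only bold and italic tags; `RunCollector` and `html_to_runs` are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser


class RunCollector(HTMLParser):
    """Collect (text, bold, italic) tuples from simple editor HTML."""

    BOLD_TAGS = {"b", "strong"}
    ITALIC_TAGS = {"i", "em"}

    def __init__(self):
        super().__init__()
        self.bold_depth = 0
        self.italic_depth = 0
        self.runs = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BOLD_TAGS:
            self.bold_depth += 1
        elif tag in self.ITALIC_TAGS:
            self.italic_depth += 1

    def handle_endtag(self, tag):
        if tag in self.BOLD_TAGS:
            self.bold_depth = max(0, self.bold_depth - 1)
        elif tag in self.ITALIC_TAGS:
            self.italic_depth = max(0, self.italic_depth - 1)

    def handle_data(self, data):
        # Record each text chunk together with the formatting active at that point.
        if data:
            self.runs.append((data, self.bold_depth > 0, self.italic_depth > 0))


def html_to_runs(html):
    """Return the list of (text, bold, italic) runs found in the HTML string."""
    collector = RunCollector()
    collector.feed(html)
    collector.close()
    return collector.runs


runs = html_to_runs("<p>Hello <b>bold</b> and <i>italic</i></p>")
```

With python-docx installed, each tuple could then be written with `run = paragraph.add_run(text)` followed by `run.bold = bold` and `run.italic = italic`. Headings, lists, links and inline styles would need additional handling that this sketch does not attempt.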

Aspose and umlauts (ä, ö, ü)

I use Aspose.Cells for Java to convert Excel documents to HTML, but there is a problem with umlauts.
Here is the code I use to save Excel documents to HTML:
com.aspose.cells.Workbook workbook = new com.aspose.cells.Workbook(stream);
workbook.save(path, com.aspose.cells.SaveFormat.HTML);
Is there some way to resolve this?
This type of issue might occur due to missing fonts. Aspose.Cells needs the underlying fonts (those used in the workbook) to be installed on the system for rendering to PDF, HTML or image, so you have to make sure all of those fonts are present on the machine and that your application has access to their folder. You can find the fonts needed for your workbook using the Workbook.getFonts() method. You may put all the font files (.ttf files) in some folder and set the fonts directory at the start, before running your original code.
e.g
Sample code:
......
String MyFontDir = "your_fonts_folder_path";
// Setting the fonts folder with setFontFolder method
FontConfigs.setFontFolder(MyFontDir, true);
//.......
//Your code goes here.
//......
I am working as Support developer/ Evangelist at Aspose.

MFC: How to Set initial value of CMFCEditBrowseCtrl object?

I have an MFC application to which I want to add a dialog for browsing to a file location, using a CMFCEditBrowseCtrl object. But I have not been able to set the initial path properly, e.g. "C:\Program Files\Path".
When I tried, it showed Chinese characters.
How can I do that? My code is as follows:
m_pathCtrl.EnableFolderBrowseButton();
m_pathCtrl.SetWindowText(_T("C:\\Program Files\\Path"));
But it is showing something like this ->
How do I show the path properly in English? Please guide me.
The issue arises because you are using the ASCII character set, but the control expects Unicode. Microsoft explains how to set up a CMFCEditBrowseCtrl in a dialog when using ASCII here: https://learn.microsoft.com/en-us/cpp/mfc/reference/cmfceditbrowsectrl-class?view=msvc-170. Use the dialog editor to insert an edit control in the dialog, then change its type from CEdit to CMFCEditBrowseCtrl in the header file.
You can also get the proper characters in the window by using SetWindowTextW, e.g. inputFilesCtrl.SetWindowTextW(L"C:\\SomeDirectory");. The Chinese characters you are seeing are what happens when a 1-byte character set is interpreted as a 2-byte one.
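The last point can be illustrated outside MFC. The Python sketch below (for illustration only, not Windows API code) takes the path as one byte per character and then reads the same bytes as UTF-16, two bytes per character, which is the misinterpretation the answer describes:

```python
path = "C:\\Program Files\\Path"

# A narrow (ANSI) string stores one byte per character.
narrow_bytes = path.encode("ascii")

# A reader expecting UTF-16 consumes two bytes per character.
# Pad to an even length so the final byte is not left dangling.
if len(narrow_bytes) % 2:
    narrow_bytes += b"\x00"
misread = narrow_bytes.decode("utf-16-le")

# "C" (0x43) and ":" (0x3A) merge into the single code point U+3A43, a CJK ideograph.
first_code_point = hex(ord(misread[0]))

print(misread)
```

Pairs of ASCII bytes mostly land in the CJK blocks of Unicode, which is why the garbled text looks Chinese.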

DCM4CHE cannot display Japanese characters

I am using dcm4che as my PACS, and I am inserting a DICOM file that contains the patient name in Japanese characters.
But the web interface of dcm4chee does not support Japanese characters and shows the patient name as garbled characters (such as question marks and squares).
For DCM4CHE I am using PostgreSQL as the database. The DB properties show 'Encoding as UTF8', 'Collation as English_India.1252' and 'Character Type as English_India.1252'. Does my DB support Japanese characters?
I am new to databases, and any help will be appreciated.
EDIT:
This issue was not related to the PACS. I obtained a valid DICOM file with Japanese characters (it uses the specific character set \ISO 2022 IR 87) and sent it to the PACS. It displays correctly in the PACS, so the issue is with my DICOM file. I also set the specific character set to '\ISO 2022 IR 87', but I am still getting garbled Japanese characters.
I am using the MergeCom DICOM utility and the 'MC_Set_Value_From_String' API to insert the Japanese string. Am I missing anything? Is it not possible to insert Japanese characters using 'MC_Set_Value_From_String'? I am thinking of using the MC_Set_Value_From_UnicodeString API.
UTF-8 supports all Unicode code points, which includes Japanese. So it is unlikely that the database is the issue.
What is the content of the Specific Character Set (0008,0005) tag? The default character encoding for DICOM is ASCII. There is a section in the DICOM spec providing examples of use with Japanese.
I was able to solve the issue.
The issue was related to the encoding. For the Unicode conversion, I was using the Windows API "WideCharToMultiByte" with the code page set to UTF-8. This did not produce the encoding the DICOM file needed for the Japanese characters; using code page 50222 fixed it.
You can find the full list of code pages at the link below.
https://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx
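The difference between the two target encodings can be seen in a short Python sketch. It is an illustration under an assumption: Python's `iso2022_jp` codec stands in for Windows code page 50222, which is related but not identical, and the sample name is made up for the example.

```python
name = "山田"

# The same text encoded two ways.
utf8_bytes = name.encode("utf-8")
jis_bytes = name.encode("iso2022_jp")

# ISO 2022 switches character sets with escape sequences:
# ESC $ B selects JIS X 0208 (ISO-IR 87), ESC ( B returns to ASCII.
starts_with_jis_escape = jis_bytes.startswith(b"\x1b$B")
ends_in_ascii = jis_bytes.endswith(b"\x1b(B")

print(utf8_bytes)
print(jis_bytes)
```

A reader that expects ISO 2022 IR 87 looks for those escape sequences, so UTF-8 bytes written under that Specific Character Set value come out garbled.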