Embedding XML string resource in C++ project - c++

I've got a C++ Plug-in project that needs to be translatable. As such I've created a Dictionary class that can be used to access the strings. However, I'm not entirely sure what the best way of storing the strings is. I was thinking of creating an XML file and storing all the strings there with their different locales stored under one variable name, e.g.
<sHello>
<en_GB>Hello</en_GB>
<fr_FR>Bonjour</fr_FR>
...
</sHello>
However, I would prefer not to have to distribute the XML file along with the other distributables. The Plug-in needs to be cross-platform, so I'm developing the Windows version in Visual Studio and the Mac OS version in Xcode. Preferably, the C++ code in my Dictionary class for accessing the XML values would be the same on both platforms, meaning I could just include the XML in the respective projects and not have to worry about maintaining two separate code bases.
Is it possible to embed the XML file as a compiled resource in such a way that it doesn't need to be distributed?

What's wrong with distributing the xml file?
If you really don't want to distribute the XML file, you could write a small utility program that converts the XML file to a C++ source file that would look, for example, like this:
// autogenerated by <put your fancy name here>
// Don't hand edit this source file
const char xmlfile[] = "<sHello>\
<en_GB>Hello</en_GB>\
<fr_FR>Bonjour</fr_FR>\
...\
</sHello>";
Instead of XML you could use any other suitable format, such as JSON or whatever best fits your needs.
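A minimal sketch of the core of such a conversion utility, under some assumptions: the names escapeForLiteral and generateSource are hypothetical, the input is treated as plain bytes, and the output splits the literal at line breaks instead of using backslash continuations. The generated definition is marked extern because const objects have internal linkage by default in C++, so without it other source files could not refer to the array.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: escapes text so it can sit inside a C++ string literal.
std::string escapeForLiteral(const std::string& text)
{
    std::string out;
    for (char c : text) {
        switch (c) {
        case '\\': out += "\\\\"; break;            // backslash -> two backslashes
        case '"':  out += "\\\""; break;            // quote -> escaped quote
        case '\n': out += "\\n\"\n    \""; break;   // emit \n, close the literal, reopen on the next line
        case '\r': break;                           // drop carriage returns from Windows line endings
        default:   out += c;
        }
    }
    return out;
}

// Hypothetical helper: builds the text of a generated source file that
// defines `extern const char <name>[]` holding `content`.
std::string generateSource(const std::string& name, const std::string& content)
{
    assert(!name.empty());
    return "// autogenerated, do not edit by hand\n"
           "extern const char " + name + "[] =\n    \"" +
           escapeForLiteral(content) + "\";\n";
}
```

A real tool would read the XML file, pass its contents to generateSource, and write the result to a .cpp file as a pre-build step in both the Visual Studio and Xcode projects. The Dictionary class would then declare `extern const char xmlfile[];` and parse that buffer on either platform.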

Related

Is it a good idea to include a large text variable in compiled code?

I am writing a program that produces a formatted file for the user, but it's not only producing the formatted file, it does more.
I want to distribute a single binary to the end user and when the user runs the program, it will generate the xml file for the user with appropriate data.
In order to achieve this, I want to put the file contents into a char array variable that is compiled into the code. When the user runs the program, I will write out the char array to generate an XML file for the user.
const char* buffers = "a xml format file contents, \
this represent many block text \
from a file,...";
I have two questions.
Q1. Do you have any other ideas for how to compile my file contents into the binary, i.e. distribute everything as one binary file?
Q2. Is this even a good idea, as described above?
What you describe is by far the norm for C/C++. For large amounts of text data, or for arbitrary binary data (indeed any data you can store in a file, e.g. a zip file), you can instead keep the data in a file and link that file into your program directly.
An example may be found on sites like this one
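For text of moderate size, the compiled-in variant from the question can be sketched as follows. This assumes a C++11 compiler (for the raw string literal); the names kXmlTemplate and writeTemplate and the XML content are placeholders, not taken from the question.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Raw string literal (C++11): the text between R"xml( and )xml" is stored
// verbatim, so quotes, backslashes and line breaks need no escaping.
// The XML below is placeholder content.
const char kXmlTemplate[] = R"xml(<?xml version="1.0" encoding="UTF-8"?>
<report>
  <title>placeholder</title>
</report>
)xml";

// Writes the embedded text to `path`; returns false if writing failed.
bool writeTemplate(const std::string& path)
{
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary);          // binary: keep line endings as stored
    out.write(kXmlTemplate, sizeof(kXmlTemplate) - 1);  // -1 skips the terminating '\0'
    out.flush();
    return static_cast<bool>(out);
}
```

Calling `writeTemplate("output.xml")` at run time would then produce the file from the single distributed binary.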
I'd recommend keeping the data in a separate file rather than putting it into the binary, unless you have your own reasons. I don't know of other portable ways to put strings into a binary file, but your solution seems OK.
However, note that when you use \ at the end of a line to form a multi-line string, you need to take care with indentation, because the string continues from the very beginning of the next line:
const char* buffers = "a xml format file contents, \
this represent many block text \
from a file,...";
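Or you can use another form, where adjacent string literals are concatenated by the compiler (note the trailing spaces, since nothing is inserted between the pieces):
const char* buffers =
"a xml format file contents, "
"this represent many block text "
"from a file,...";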
My answer probably contains a lot of information the original poster doesn't need, but here is what I'm aware of:
Embedding in source code: a plain C/C++ solution. It is a bad idea because each time you want to change your content, you will need to:
recompile
relink
It is acceptable only if your content changes very rarely or never, or if build time is not an issue (e.g. if your app is small).
Embedding in binary: a few somewhat more flexible solutions for embedding content in executables exist, but none of them is cross-platform (you have not stated your target platform):
Windows: resource files. With most IDEs it is very simple
Linux: objcopy.
MacOS: Application Bundles. Even simpler than on Windows.
You will not need to recompile the C++ file(s), only re-link.
Application virtualization: there are special utilities that wrap all your application resources into a single executable, which then runs much as it would on a virtual machine.
I'm only aware of such utilities for Windows (ThinApp, BoxedApp), but there are probably such things for other OSes too, or even cross-platform ones.
Consider distributing your application in some form of installer: when the installer starts, it creates all resources and unpacks the executable. This is similar to having the main executable generate everything. It can be a large and complex package or just a simple self-extracting archive.
Of course, the choice depends on what kind of application you are creating, who your target audience is, how you will ship the package to end users, etc. If it is a game and you are targeting children, it's not the same as a Unix console utility for C++ coders =)
It depends. If you are writing a small Unix-style utility with no prospect of internationalization, then it's probably fine. You don't want to bloat a distribution with a file no one would ever touch anyway.
But in general it is bad practice, because eventually someone might want to modify this data, and he or she would have to rebuild the whole thing just to fix a typo.
The decision is really up to you.
If you just want to keep your distribution in one piece, you might also find this thread interesting: Store data in executable
Why don't you distribute your application with an additional configuration file? e.g. package your application executable and config file together.
If you do want to make it into a single file, try embedding your config file into the executable as a resource.
I see it as more of an OS issue than a C/C++ one. You can add the text to the resource part of your binary/program. In Windows programs, HTML, graphics and even movie files are often compiled into resources that form part of the final binary.
That is handy for possible future translation into another language, plus you can modify the resource part of the binary without recompiling the code.

How to embed resources into a single executable?

If you've ever used the tool Game Maker, it's a bit like that. I want to be able to take all my sounds, images, and everything else of the like and embed them into a single C++ executable. Game Maker would have a built-in editor, and would have the images embedded into the .gmk file, and when you'd open it it would read the images, and display them in the game. I'm thinking he had the images saved not as images, but as pure data stored in the .gmk file and interpreted by the editor or by some interpreter written into the .exe. How would I go about making something similar?
The Windows resource system works like this, so if you make a WinAPI or MFC application, you can use it. Qt also provides the same functionality, but in a platform-independent way. They just write the files in raw binary format into a byte array in a normal C++ file, so they get compiled as data into the exe. Then they provide functions for accessing these data blocks like normal files, although I don't know how they really work. Probably a special implementation of their file class which just accesses those byte array variables.
For images only, a very simple approach is to use the XPM format.
This format is a valid C/C++ header, so you can include it directly into a C++ source file and use it directly.
The main issue with this approach is that XPM is not a compressed format, so it uses a lot of storage.
As a consequence, in practice I have only seen this used for icons and small graphical objects, but in principle you could do more.
The other cool thing about XPM is that it's human readable - again great for designing small and simple icons.
To generalize this idea to other formats, what you could do is to create a compile chain that:
Encodes the target file as ASCII (Uuencode or such)
Turns that into a single named C String in a source file.
Creates a header just declaring the name
Defines a function recovering the binary form from the string
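The encode and recover steps above can be sketched as follows. This is a minimal illustration that swaps in plain hexadecimal for Uuencode because it is shorter to show; the names kIconHex, encodeHex and decodeHex are hypothetical, and the four embedded bytes are just sample data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// What the build step would generate: the sample bytes 0x89 'P' 'N' 'G' as hex text.
const char kIconHex[] = "89504e47";

// Build-time side: encodes raw bytes as lowercase hexadecimal ASCII.
std::string encodeHex(const std::vector<unsigned char>& bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char b : bytes) {
        out += digits[b >> 4];
        out += digits[b & 0x0f];
    }
    return out;
}

// Returns the value of one hex digit, or -1 if `c` is not a hex digit.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Run-time side: recovers the binary form from the embedded string.
std::vector<unsigned char> decodeHex(const std::string& hex)
{
    assert(hex.size() % 2 == 0);
    std::vector<unsigned char> bytes;
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        const int hi = hexValue(hex[i]);
        const int lo = hexValue(hex[i + 1]);
        assert(hi >= 0 && lo >= 0);
        bytes.push_back(static_cast<unsigned char>(hi * 16 + lo));
    }
    return bytes;
}
```

Hex doubles the size of the data in the source file; a denser encoding such as Base64 or Uuencode, or emitting a numeric byte array directly, trades that overhead against a slightly more involved generator.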
For the Windows OS I have a solution if you are willing to use another tool and possibly framework. Try the "peresembed" tool. It embeds files into PE image sections so you can have all your images, sounds and configuration files in just one EXE file. Supports compression too, although you do need a ZIP in-memory reading framework then. Can even embed files into the PE resource tree based on their relative file paths.
Example usage:
peresembed -file content.txt _export_to_resolve input.exe output.exe
In your C++ file you have:
struct embedded_data
{
void *dataloc;
size_t datasize;
};
extern "C" __declspec(dllexport) const volatile embedded_data _export_to_resolve = { 0 };
Get peresembed from: https://osdn.net/projects/pefrm-units/releases/70746
Showcase video: https://www.youtube.com/watch?v=1uYdjiZc5XI

Qt translate strings from non-source files

I have a Qt project which uses XML files. Those XML files contain human-readable text and this text should be translated by using the Qt tools (lupdate, lrelease, QtLinguist).
The question is if it is possible to generate entries in .ts file via lupdate without duplicating the strings from the XML files in a source code file by using the QT_TR_NOOP() macro and friends? Or in general, how do you translate strings in non-source files for Qt projects?
We had the same problem: XML files containing human-readable strings.
Our solution was to make sure that the human-readable strings in the XML files were easy to extract (we put them in a LABEL attribute). We then developed a small tool that parses the XML files, extracts the strings, generates a context (by extracting data from the XML file), and generates a C++ header file containing a list of QT_TR_NOOP() entries.
This file was added to our project file (.pro) that was used by lupdate.
This solution was fine for us, but we had to be very careful about two things:
run this tool each time the content of an XML file changed.
make sure the XML files are UTF-8 encoded.
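A rough sketch of what such an extraction tool might look like, under several assumptions: it uses a regular expression instead of a real XML parser, it does not decode XML entities or escape quotes in labels, the context is passed in by the caller, and it emits Qt's QT_TRANSLATE_NOOP(context, text) macro rather than QT_TR_NOOP() so the context is explicit. The function names are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Collects the values of all LABEL="..." attributes found in `xml`.
std::vector<std::string> extractLabels(const std::string& xml)
{
    static const std::regex labelAttr("\\bLABEL\\s*=\\s*\"([^\"]*)\"");
    std::vector<std::string> labels;
    for (std::sregex_iterator it(xml.begin(), xml.end(), labelAttr), end; it != end; ++it)
        labels.push_back((*it)[1].str());
    return labels;
}

// Builds the text of a header that lupdate can scan, one macro call per label.
std::string generateNoopHeader(const std::string& context,
                               const std::vector<std::string>& labels)
{
    assert(!context.empty());
    std::string out = "// autogenerated for lupdate, do not edit\n";
    for (const std::string& label : labels)
        out += "QT_TRANSLATE_NOOP(\"" + context + "\", \"" + label + "\")\n";
    return out;
}
```

At run time the application would look the strings up with the same context, for example through QCoreApplication::translate(), so the context passed to the generator has to match the one used in the lookup.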
You can translate anything you want at runtime by using tr(), as long as the .qm file has a matching translation/context. It shouldn't make any difference whether lupdate extracted it or not.
I don't know how to make lupdate extract strings from arbitrary XML, but that doesn't mean you can't use Linguist.
.ts files are also XML; it should be easy to write an XSLT that transforms your XML into a .ts file. If you want to target something standard instead of just Qt, lupdate (and Linguist) can also process XLIFF files.
You can have multiple .ts files (just call QTranslator::load more than once when setting it up).
If you really want to have it all in one file for the translator, have your XSLT copy the lupdate-generated file into its output.
As long as you use a context name that doesn't duplicate something used in the source code, this shouldn't be any different (from Qt's point of view) from the way many apps load a .qm for each DLL that has GUI.

Storing UTF-8 XML using Word's CustomXMLPart or any other supported way

I am writing a Word add-in which is supposed to store its own XML data per document using the Word object model and its CustomXMLPart. The problem I am now facing is the lack of IStream-like functionality for reading/writing XML to/from a CustomXMLPart. It only provides a BSTR interface, and I am puzzled about how to handle UTF-8 XML with BSTRs. To my understanding, a UTF-8 XML file should never really have to undergo this sort of Unicode conversion. I am not sure what to expect as a result here.
Is there another way of using Word automation interfaces to store arbitrary custom information inside a DOCX file?
The "package" is an OPC document (Open Packaging Convention), which is basically a structured zip folder with a different extension (e.g. .pptx, .docx, .xps, etc.). You can get that file as a stream and manipulate it any way you like, but not arbitrarily. It will not be recognized as a valid docx if you put things in the wrong places (not just XML elements, but also files in the folders inside the zip file). But if by "arbitrary" you just mean a CustomXMLPart, then that's okay.
This is a good starting page to learn more about the Open XML SDK, if you're up to it. The SDK allows somewhat easier access to the file formats than using (.NET) System.IO.Packaging or a third-party zip library. To go deeper, grab the free eBook Open XML Explained.
With the Open XML SDK (again, this can all be done without the SDK) in .NET, this is what you'll want to do: How to: Insert Custom XML to an Office Open XML Package by Using the Open XML API.

How do I deal with "Project Files" in my Qt application?

My Qt application should be able to create/open/save a single "Project" at once. What is the painless way to store project's settings in a file? Should it be XML or something less horrible?
Of course, the data to be stored in the file is subject to change over time.
What I need is something like QSettings, but bound to a project in my application rather than to the whole application.
You can use QSettings to store data in a specific .ini file.
From the docs:
Sometimes you do want to access settings stored in a specific file or registry path. On all platforms, if you want to read an INI file directly, you can use the QSettings constructor that takes a file name as first argument and pass QSettings::IniFormat as second argument. For example:
QSettings settings("/home/petra/misc/myapp.ini",
QSettings::IniFormat);
In order to make it user-editable, I would stick to plain text with one key = value pair per line, as in most Linux apps.
However, this only covers the settings, not the complete project data, which I suppose requires more complex structures.
So maybe JSON?
Pro XML:
You can have a look at it in an editor
You can store any kind of string in any language in it (unicode support)
It's simple to learn
More than one program can read the same XML
It's easy to structure your data with XML. When you use key/value lists, for example, you'll run into problems when you need to save tree-like structures.
Contra XML:
The result is somewhat bloated
Most programming languages (especially old ones like C++) have no good support for XML. The old XML APIs were designed in such a way that they could be implemented in any language (smallest common denominator). 'nuff said.
You need to understand the concept of "encoding" (or "charset"). While this may look trivial at first glance, there are some hidden issues which can bite you. So always test your code with some umlauts and even kanji (Japanese characters) to make sure you got it right.