Application with a patchable embedded configuration - c++

I work on a cross-platform C++ application (Visual C++, GCC, clang++ regarding the target platform). I want to embed a configuration string into my application and have possibility to patch the binary after compilation to change the configuration and make it preconfigured.
Now I only consider declaring a configuration variable:
const char* embeddedConfig = "*magic*random characters filling the maximum configuration size";
Patcher is going to search for the magic in the binary and replace it with the actual configuration.
I am not sure of the stability of the hacky approach. Is there any more reliable way (perhaps compiler-specific)?

That will work if you keep the size of the text unchanged and declare the constant outside of any function. Such constants are simply put into the data section of binary by compiler.
However you will need to re-sign the binary if you are using code signing.

Embed the string as a resource and use the UpdateResourceA function. See here.

Related

How to copy results of a plugin to another project?

In Frama-C, I would like to copy the results of a plugin like Value from one project to another. How exactly do I do this? I'm guessing I have to use Project.copy with the proper State_selection, but what would that be for Value? More generally, how do I determine what the State_selection would be for a given plugin?
Unfortunately, there is no unified mechanism across plug-ins for that. For the EVA1 plug-in, you would probably do something like
let selection = State_selection.with_codependencies Db.Value.self in
Project.copy ~selection ~src dest
in order to capture EVA's state as well as the intermediate states on which it depends.
That said, I'd advise against trying to copy such a substantial part of Frama-C's internal state. It's very error-prone and implies working with arcane API. If you can afford it, two other solutions seem easier:
work in the original project, possibly creating a new project with a new AST as a result, through the File.create_copy_from_visitor.
copy the entire project with Project.copy and work on the new project.
1: Evolved Value Analysis, the new name of Value

Can I read a SMT2 file into a solver through the z3 c++ interface?

I've got a problem where the z3 code embedded in a larger system isn't finding a solution to a certain set of constraints (added through the C++ interface) despite some fairly long timeouts. When I dump the constraints to a file (using the to_smt2() method on the solver, just before the call to check()), and run the file through the standalone z3 executable, it solves the system in about 4 seconds (returning sat). For what it's worth, the file is 476,587 lines long, so a fairly big set of constraints.
Is there a way I can read that file back into the embedded solver using the C++ interface, replacing the existing constraints, to see if the embedded version can solve starting from the exact same starting point as the standalone solver? (Essentially, how could I create a corresponding from_smt2(stream) method on the solver class?)
They should be the same set of constraints as now, of course, but maybe there's some ordering effect going on when they are read from the file, or maybe there are some subtle differences in the solver introduced when we embedded it, or something that didn't get written out with to_smt2(). So I'd like to try reading the file back, if I can, to narrow down the possible sources of the difference. Suggestions on what to look for while debugging the long-running version would also be helpful.
Further note: it looks like another user is having similar issues here. Unlike that user, my problem uses all bit-vectors, and the only unknown result is the one from the embedded code. Is there a way to invoke the (get-info :reason-unknown) from the C++ interface, as suggested there, to find out why the embedded version is having a problem?
You can use the method "solver::reason_unknown()" to retrieve explanations for search failure.
There are methods for parsing files and strings into a single expression.
In case of a set of assertions, the expression is a conjunction.
It is perhaps a good idea to add such a method directly to the solver class for convenience. It would be:
void from_smt2_string(char const* smt2benchmark) {
expr fml = ctx().parse_string(smt2benchmark);
add(fml);
}
So if you were to write it outside of the solver class you need to:
expr fml = solver.ctx().parse_string(smt2benchmark);
solver.add(fml);

Windows DLL initialise array of constant c strings with file

Background
I am currently working on a project for which I have written a DLL as an interface between a Windows driver and MATLAB. All of this is working very well, but one thing up until recently it has been lacking is documentation for certain functionality - essentially it allows command strings to be sent to an FPGA, and all of these commands need documenting.
This could be done using a PDF etc. but I wanted also a way to integrate it the documentation into the DLL itself - so functions like 'lookup command' etc. can be used. At any rate I went ahead and implemented this in a way which I am mostly satisfied.
Essentially I have a structure (see below) to which a pointer can be returned from functions so that the documentation can be accessed. The caller provides the address of a pointer to one of these which is then updated with the address of an entry in a constant global array.
typedef struct {
CONST CHAR * commandString;
ULONG commandStringLen;
CONST CHAR * documentationString;
ULONG documentationStringLen;
CONST CHAR * commandParameters;
ULONG commandParametersLen;
} COMMAND_DOCS;
#define STRING_LEN(a) a,(sizeof(a)-1)
#define NEWGROUP "\n "
#define NEWENTRY "\n "
#define NEWLINE "\n"
#define ENDTITLE "\n----------------------------------------\n"
CONST COMMAND_DOCS CommandDocs[] = {
//-----
#define COMMAND_xyz_GROUP_INDEX_START (0)
{ STRING_LEN("ABCD"),
STRING_LEN("Something Command"
ENDTITLE"Low Queue"
NEWLINE "Description:"
NEWGROUP"The .........."
NEWLINE),
STRING_LEN(NEWGROUP"Type x:"
NEWENTRY"No Payload"
NEWLINE)
},
#define COMMAND_xyz_GROUP_LENGTH (1)
//-----
... And so on
};
This results in a load of constant strings being stored in memory and an array of documentation structures which contain pointers to these constants and their lengths as well for good measure. A pointer to the required element in the array is returned as I say. The caller of the library API is then free to make copies if needed or display the strings, whatever.
This is all working nicely at the moment, except for a minor annoyance. Whenever I need to update the documentation, it required me to recompile the DLL - as clearly all of the strings are compiled into it. For me that is not an issue as I can easily compile it, but as I am working at a university developing a research platform for them to use, I want it to be as simple for people to update in the future as I move on to other work. Ideally if documentation needs updating - say new commands get added to the system, I would like the additions to be possible without having to do a recompile.
Question
So my question really is about what the best way to go about doing this is.
At the moment I am thinking along the lines of loading the documentation from a file, either at the time of loading the DLL, or when the search function is called. Currently there are #defines in the array to separate indexes (identify groups of commands), but these could be replaced by variables that are initialised by data from the file.
I could go for something like XML and parse that to fill up the structures, but part of me thinks it would be easier to understand in the outside world if it was something simpler, but then I suppose I would still need some way of identifying the boundaries between entries, etc.
Thoughts?
Note that the DLL is mostly C - all of the APIs are C interfaces, but internally it is C++ as I've been using classes for other parts. I don't mind which is used as long as it is compatible with C interfaces.
I went away from the problem for a while as it wasn't urgent, but have come back to it in the last few weeks. I've found that I need to add even more settings and configurable stuff to the DLL which makes the approach I was using before definitely not useable.
Basically as #PhilWilliams suggested in the comments, I've gone for using XML files for configuring everything. There are now some APIs that have to be called just after loading the library with which the location of the XML file to load is specified. The DLL will then parse the XML files and populate a load of internal structures. Rather than using #defines and constant strings, I now have an array of structs in which pointers to the parsed strings are located along with the indices which were previously #defines.
After looking on StackOverflow and finding various suggestions on simple XML parsers, I've gone for TinyXML2 as it means that I don't have the headache of allocating and freeing memory for the many documentation strings as it handles that internally.

Localization of string literals

I need to localize error messages from a compiler. As it stands, all error messages are spread throughout the source code as string literals in English. We want to translate these error messages into German. What would be the best way to approach this? Leave the string literals as-is, and map the char* to another language inside the error reporting routine, or replace the english literals with a descriptive macro, eg. ERR_UNDEFINED_LABEL_NAME and have this map to the correct string at compile time?
How is such a thing approached in similar projects?
On Windows, typically this is done by replacing the string with integer constants, and then using LoadString or similar to get them from a resource in a DLL or EXE. Then you can have multiple language DLLs and a single EXE.
On Unixy systems I believe the most typical approach is gettext. The end result is similar, but instead of defining integer constants, you wrap your English string literals in a macro, and it will apply some magic to turn that into a localized string.
The most flexible way would be for the compiler to read the translated messages from message catalogs, with the choice of language being made according to the locale. This would require changing the compiler to use some tool like
gettext.
Just a quick thought...
Could you overload your error reporting routine? Say you are using
printf("MESSAGE")
You could overload it in a way that "MESSAGE" is the input, and you hash it to the corresponding message in German.
Could this work?
You could use my CMsg() and CFMsg() wrappers around the LoadString() API. They make your life easier to load and format the strings pulled out of the resources.
And of course, appTranslator is your best friend to translate your resources ;-)
Disclaimer: I'm the author of appTranslator.
On Windows you can use the resource compiler and the WinAPI load functions to have localized strings and other resources. FindResource() and its specialized derivatives like LoadString() will automatically load language specific resources according to the user's current locale. FindResourceEx() even allows you to manually specify the language version of the resource you wish to retrieve.
In order to enable this in your program you must first change your program to compile the strings in an resource file(.rc) and use LoadString() to fetch the strings at runtime instead of using a literal string. Within the resource file you then setup multiple language versions of the STRINGTABLEs you use, with the LANGUAGE modifier. The multi-lingual resources are then loaded based on the search order described here on MSDN: Multiple-Language Resources
Note: If you have no reason to need a single executable, or are doing something like using a user selected language from within your app, it gives you more control and less confusion to compile each language in a seperate dll and load them dynamically rather than have a large single resource file and trying to dynamically switch locales.
Here is an example of a multiple language StringTable resource file (ie:strings.rc):
#define IDS_HELLOSTR 361
STRINGTABLE
LANGUAGE LANG_ENGLISH, SUBLANG_ENGLISH_CAN
BEGIN
IDS_HELLO, "Hello!"
END
STRINGTABLE
LANG_FRENCH, SUBLANG_NEUTRAL
BEGIN
IDS_HELLO, "Bonjour!"
END

Registering each C/C++ source file to create a runtime list of used sources

For a debugging and logging library, I want to be able to find, at runtime, a list of all of the source files that the project has compiled and linked. I assume I'll be including some kind of header in each source file, and the preprocessor __FILE__ macro can give me a character constant for that file, so I just need to somehow "broadcast" that information from each file to be gathered by a runtime function.
The question is how to elegantly do this, and especially if it can be done from C as opposed to C++. In C++ I'd probably try to make a class with a static storage to hold the list of filenames. Each header file would create a file-local static instance of that class, which on creation would append the FILE pointer or whatever into the class's static data members, perhaps as a linked list.
But I don't think this will work in C, and even in C++ I'm not sure it's guaranteed that each element will be created.
I wouldn't do that sort of thing right in the code. I would write a tool which parsed the project file (vcproj, makefile or even just scan the project directory for *.c* files) and generated an additional C source file which contained the names of all the source files in some kind of pre-initialized data structure.
I would then make that tool part of the build process so that every time you do a build this would all happen automatically. At run time, all you would have to do is read that data structure that was built.
I agree with Ferruccio, the best way to do this is in the build system, not the code itself. As an expansion of his idea, add a target to your build system which dumps a list of the files (which it has to know anyway) to a C file as a string, or array of strings, and compile this file into your source. This avoids a lot of complication in the source, and is expandable, if you want to add additional information, like the version number from your source code control system, who built the executable, etc.
There is a standard way on UNIX and Linux - ident. For every source file you create ID tag - usually it is assigned by you version control system, e.g. SVN keywords.
Then to find out the name and revision of each source file you just use ident command. If you need to do it at runtime check out how ident does it - source for it should be freely available.
Theres no way to do it in C. In C++ you can create a class like this:
struct Reg {
Reg( const char * file ) {
StaticDictionary::Register( file );
};
where StaticDictionary is a singleton container for all your file names. Then in each source file:
static Reg regthisfile( __FILE__ );
You would want to make the dictionary a Meyers singleton to avoid order of creation problems.
I don't think you can do this in the way you outline in a "passive" mode. That is, you are going to somehow run code for each source file to be added to the registry, it's hard to get it to happen automatically.
Of course, it's possible that you can make that code very unobtrusive using macros. It might be problematic for C source files that don't have an "entrypoint", so if your code isn't already organised as "modules", with e.g. an init() function for each module, it might be hard. Static initializing code might be possible, I'm not 100% sure if the order in which things are initialized creates problems here.
Using static storage in the registry module sounds like an excellent idea, a plain linked list or simple hash table should be easy enough to implement, if your project doesn't already include any general-purpose utility library.
In C++ your solution will work. It's guaranteed.
Edit: Just found out a solution in my head: Change a rule in your makefile to add
'-include "cfiles_register.h"' to each 'g++ file.cpp'.
%.o : %.cpp
$(CC) -include 'cfiles_register.h' -o $# $<
put your proposed in the question implemnatation to that 'cfiles_register.h'.
Using static instances in C++ would work fine.
You could do this also in C, but you need to use runtime specific features - for MSVC CRT take a look at http://www.codeguru.com/cpp/misc/misc/threadsprocesses/article.php/c6945/
For C - you could do it with a macro - define a variable named corresponding to your file, and then you could scan the symbols of your executable, just as an idea:
#define TRACK_FILE(name) char _file_tracker_##name;
use it in your my_c_file.c like this:
TRACK_FILE(my_c_file_c)
and than grep all file/variable names from the binary like this
nm my-binary | grep _file_tracker
Not really nice, but...
Horrible idea, I'm sure, but use a singleton. And on each file do something like
Singleton.register(__FILE__);
at global scope. It'll only work on cpp files though.
I did something like this years ago as a novice, and it worked. But I'd cringe to do it now. I'd add a build step now.
I agree with those who say that it is better to avoid doing this at run time, but in C, you can initialize a static variable with a function call, that is, in every file:
static int doesntmatter = register( __FILE__);