Where to store code constants when writing a JIT compiler? [closed] - c++

Closed. This question is opinion-based. It is not currently accepting answers.
I am writing a JIT compiler for x86-64 and I have a question regarding best practice for including constants in the machine code I am generating.
My approach thus far is straightforward:
Allocate a chunk of RW memory with VirtualAlloc or mmap
Load the machine code into said memory region.
Mark the page executable with VirtualProtect or mprotect (and remove the write privilege for security).
Execute.
When generating the code I have to include constants (numbers, strings) and I am not sure of the best way to go about it. I have several approaches in mind:
Store all constants as immediate values inside the instructions themselves. This seems like a bad idea for everything except maybe small scalar values.
Allocate a separate memory region for constants. This seems like the best idea to me, but it slightly complicates memory management and the compilation workflow - I have to know the memory location before I can start emitting the executable code. I am also not sure whether it hurts performance due to worse memory locality.
Store the constants in the same region as the code and access them with RIP-relative addressing. I like this approach since it keeps related parts of the program together, but I feel slightly uneasy about mixing instructions and data.
Something completely different?
What is the preferable way to go about this?

A lot depends on how you are generating your binary code. If you use a JIT assembler that handles labels and figures out offsets for you, things are pretty easy: you can stick the constants in a block after the end of the code, refer to them with pc-relative references to those labels, and end up with a single block of bytes holding both the code and the constants (easy management). If you are generating binary code on the fly, you already have the problem of handling forward pc-relative references (e.g. for forward branches). If you use back-patching, you need to extend it to also support references to your constants block.
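A minimal sketch of that single-block layout, assuming x86-64 and POSIX mmap/mprotect (the VirtualAlloc/VirtualProtect flow on Windows is analogous); the emitted routine, the constant 42 and the buffer size are made up for illustration. The generated code loads a constant placed right after the ret through a RIP-relative mov, and the displacement is back-patched once the constant's offset is known:

#include <sys/mman.h>
#include <cstdint>
#include <cstring>
#include <cstdio>

int main() {
    const std::size_t size = 4096;
    auto* buf = static_cast<std::uint8_t*>(
        mmap(nullptr, size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    if (buf == MAP_FAILED) return 1;

    std::size_t i = 0;
    // mov rax, [rip+disp32]  ->  48 8B 05 <disp32>
    buf[i++] = 0x48; buf[i++] = 0x8B; buf[i++] = 0x05;
    const std::size_t disp_at = i;           // remember where disp32 goes
    i += 4;                                  // leave room for it
    buf[i++] = 0xC3;                         // ret

    const std::size_t const_off = i;         // constant lives right after the code
    const std::int64_t value = 42;
    std::memcpy(buf + const_off, &value, sizeof value);

    // RIP-relative displacements are measured from the end of the instruction,
    // i.e. from the byte right after the disp32 field here.
    const auto disp = static_cast<std::int32_t>(const_off - (disp_at + 4));
    std::memcpy(buf + disp_at, &disp, sizeof disp);

    // W^X: drop the write permission before executing.
    if (mprotect(buf, size, PROT_READ | PROT_EXEC) != 0) return 1;

    auto fn = reinterpret_cast<std::int64_t (*)()>(buf);
    std::printf("%lld\n", static_cast<long long>(fn()));   // prints 42
    munmap(buf, size);
    return 0;
}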
You can avoid the pc-relative offset calculations by putting the constants in a separate block and passing the address of that block as a parameter to your code. This is pretty much the "Allocate a separate region for constants" you propose. You don't need to know the address of the block if you pass it in as an argument.
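A sketch of that variant, again assuming x86-64 and POSIX; under the SysV calling convention the constant-pool pointer arrives in RDI, so the generated code needs no pc-relative offsets at all (the pool contents are made up):

#include <sys/mman.h>
#include <cstdint>
#include <cstring>
#include <cstdio>

int main() {
    // mov rax, [rdi] ; ret   -- returns constants[0]
    const std::uint8_t code[] = { 0x48, 0x8B, 0x07, 0xC3 };

    void* buf = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) return 1;
    std::memcpy(buf, code, sizeof code);
    if (mprotect(buf, 4096, PROT_READ | PROT_EXEC) != 0) return 1;

    const std::int64_t constants[] = { 42, 7 };   // separately managed pool
    auto fn = reinterpret_cast<std::int64_t (*)(const std::int64_t*)>(buf);
    std::printf("%lld\n", static_cast<long long>(fn(constants)));  // prints 42
    munmap(buf, 4096);
    return 0;
}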

Related

Instead of creating smart pointers, why could we not have modified C++ compilers to better catch pointer issues at compile time? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
If we could design smart pointers to know when to destroy/delete heap memory based on scope, why couldn't we have just engineered the compiler to flag when heap memory goes out of scope without being deleted?
Why was it more practical to create smart pointers?
I know that is not the only reason for, or benefit of, smart pointers, but why were these improvements not practically implementable via changes to the compiler?
Thank you
When your code is sufficiently complex, deciding whether a pointer goes out of scope without being freed reduces to some equivalent of the Halting Problem - a problem that is undecidable for a compiler. It's not that the problem is solvable but impractical; rather, a computer program that decides whether an arbitrary program halts literally cannot exist.
A trivial example of this reduction to the Halting Problem is the following pseudocode:
Allocate x;
Do arbitrary tasks using x as storage;
Print x;
Deallocate x;
Do other tasks;
x is deallocated if and only if "Do arbitrary tasks using x as storage" halts, so a compiler that could reliably flag the potential leak would have to decide whether that step halts.
If you throw in additional considerations such as multithreaded/concurrent execution, the problem gets even nastier.
As Nicol Bolas' answer brings up, there are also ways to hide pointers that cannot be easily instrumented by the compiler, e.g. by round-tripping a pointer through a uintptr_t, perhaps with some bijective function obfuscating it.
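A tiny sketch of that kind of pointer hiding (the mask value is arbitrary): once the address only exists as an obfuscated integer, no static analysis of pointer lifetimes can see that the allocation is still reachable.

#include <cstdint>
#include <iostream>

int main() {
    int* p = new int(42);
    // Round-trip the address through an integer with a reversible XOR mask.
    const std::uintptr_t mask = 0xDEADBEEFCAFEBABEull;
    std::uintptr_t hidden = reinterpret_cast<std::uintptr_t>(p) ^ mask;
    p = nullptr;                  // no pointer to the allocation remains visible

    int* q = reinterpret_cast<int*>(hidden ^ mask);   // recover it later
    std::cout << *q << '\n';
    delete q;
    return 0;
}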
On the other hand, this is much easier to do at runtime. Garbage collection is a pretty mature technology, seen in runtimes like the Java Virtual Machine.
Furthermore, there is compiler assistance for detecting leaks and other memory issues in C++ -- clang++ and g++ include a runtime sanitizer known as ASan, which reports illegal accesses at runtime and leaks at shutdown, although it does not warn when an allocation is merely unreachable or no longer used while the program is still running.
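For example, a leak that AddressSanitizer/LeakSanitizer reports only when the process exits (build with something like g++ -fsanitize=address -g leak.cpp; the file name is arbitrary):

#include <numeric>
#include <vector>
#include <iostream>

int main() {
    auto* v = new std::vector<int>(1000, 1);
    std::cout << std::accumulate(v->begin(), v->end(), 0) << '\n';
    // `v` is never deleted; ASan/LSan flags this at process exit,
    // but not while the program is still running.
    return 0;
}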
I'm going to ignore the broader C++ issue that the language has holes in it that let you hide pointers inside of non-pointer-like things. Yes, many of these are UB, but there are APIs that basically require these shenanigans. Such things make automatic GC impossible from a practical perspective. Instead, we'll assume the compiler has a perfect way to instrument pointers to do this. So I'll focus on the more obvious issues:
Backwards compatibility and performance.
Let's assume you can do this while solving the C/C++ interop problem (i.e. your C++ pointers still need to be the same size and store the same information as C pointers). Even so, most people don't write their code expecting garbage collection. You have decades of code out there written to explicitly destroy objects after their creation.
So what would a GC-based C++ do with such code? If it sees a pointer to an object outlive an explicit deallocation of the object, when should it be destroyed? When the user said to do it, or when the last pointer goes away? If you pick the former answer, then you haven't gained anything, since you just broke GC. And if you pick the latter, then you've broken your covenant with the user, since the user explicitly said "destroy this object and free this memory" and you didn't.
So a codebase has to be written expecting GC; you can't just give it to them behind the scenes.
Also, a common philosophy of C++ is "pay only for what you use". Garbage collection is not free. Even lifetime-scope-based GC isn't free, especially of the shared_ptr variety. But you're forcing this cost on everyone even if they didn't ask for it and don't need it.
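A small illustration of that cost difference, with the caveat that the exact sizes are implementation details (the comments describe typical implementations, not guarantees):

#include <memory>
#include <iostream>

int main() {
    std::cout << "raw:    " << sizeof(int*) << " bytes\n";
    std::cout << "unique: " << sizeof(std::unique_ptr<int>) << " bytes\n";  // usually the same as raw
    std::cout << "shared: " << sizeof(std::shared_ptr<int>) << " bytes\n";  // usually twice that

    auto sp = std::make_shared<int>(42);
    auto sp2 = sp;   // copying bumps an atomic reference count - not free
    std::cout << "use_count: " << sp.use_count() << '\n';
    return 0;
}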
Not having automatic memory management is a feature of C++, not a bug. It allows users to have the freedom to decide for themselves what the best form of memory management will be.

Authentication via command line [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
I want to provide a binary-only program written in C or C++ where the user has to pass a valid code via the command line to unlock some extra features of the program. The idea is to implement a verification strategy in the program which compares the passed code against a run-time generated code that uniquely identifies the system or hardware the program is running on.
In other words, if and only if the run-time check:
f(<sysinfo>) == <given code>
is true, then the user is allowed to use the extra features of the program. f is the function generating the code at run time and sysinfo is an appropriate piece of information identifying the current system/hardware (e.g. the MAC address of the first Ethernet card, the processor serial number, etc.).
The aim is to make it as difficult as possible for the user to guess a valid code, or the way to calculate one, without knowing f and sysinfo a priori. More importantly, I want it to be difficult to re-implement f by analyzing the disassembled code of the program.
Assuming the above is a sound strategy, how could I implement f in C or C++, and what could I choose as its argument? Also, what GCC compiler flags could I turn on to obfuscate f specifically? Note that, for example, things like MD5(MAC) or MD5(SHA(MAC)) would be too simple, for obvious reasons.
EDIT: Another interesting point is how to make it difficult for the user to attack the code directly by removing or bypassing the portion of the code doing the check.
If you are on Windows, a standard strategy is to hash the value of the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\MachineGuid
If you're worried that a user might "guess" the hash function, take a standard SHA-256 implementation and do something sneaky like changing the algorithm's initialization values (the round constants are derived from the fractional parts of the cube roots of the first primes - change them to 5th or 7th or whatever roots, or start at the nth bit so that you chop off the "all-zero" parts, etc.).
But really, if someone is going to take the time to RE your code, it's much easier to attack the branch that does the if (codeValid) { allowExtraFeatures(); } than to mess with the hashes... so don't worry too much about it.
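For what it's worth, a sketch of the overall shape of such a check, with FNV-1a standing in for whatever (tweaked) hash you actually choose, and get_machine_id() as a placeholder you would implement per platform (MachineGuid, MAC address, ...):

#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>

std::string get_machine_id() {
    // Placeholder: a real implementation would read MachineGuid on Windows,
    // or a MAC address / DMI UUID on Linux.
    return "00000000-0000-0000-0000-000000000000";
}

std::uint64_t f(const std::string& sysinfo) {
    // FNV-1a, 64-bit. A real scheme would use a keyed or deliberately
    // modified hash, as suggested above.
    std::uint64_t h = 0xcbf29ce484222325ull;
    for (unsigned char c : sysinfo) {
        h ^= c;
        h *= 0x100000001b3ull;
    }
    return h;
}

int main(int argc, char** argv) {
    if (argc < 2) return 1;
    std::uint64_t given = std::strtoull(argv[1], nullptr, 16);
    if (f(get_machine_id()) == given) {
        std::puts("extra features unlocked");
    } else {
        std::puts("standard features only");
    }
    return 0;
}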

Produce Large Object file from Smaller source file [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem.
I need to stress test my program with large object files. I have looked into C++ templates and inline functions/templates but have not been able to get the object/source size ratio I want (about 50). I want a single source file that compiles to a single object file of up to 200 MB. Any high-level ideas would be greatly appreciated.
Additional edit: I have created large, complex and diverse (random) template functions and have started calling them (creating instantiations) with unique types/parameters. This increased the object/source ratio, as expected, up to a certain point (around 12). Then the ratio dropped significantly (to about 1), which I assume is gcc outsmarting me and optimizing away my methods. I have also looked into forcing gcc to inline all functions, but my tests haven't shown improvements from that yet either.
Using the preprocessor to generate code bloat is not a valid technique for what I wish to accomplish.
You could use the preprocessor to generate lots and lots of code, but then the preprocessed source is itself large, which may defeat your purpose.
Generally speaking, the machine code that a C++ compiler produces is relatively small (in bytes) compared to the source that produced it.
One thing that does hog space is string literals. If you use the same string over and over again, the compiler will be smart enough to realise that it only needs to store it once, but if you change the string a little each time then each variation will probably be stored separately. That variation can be generated with the preprocessor.
Another idea is, as you said, using templates to generate lots of functions for many different types.
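For instance, a sketch of that template approach; the instantiation count, the [[gnu::noinline]] attribute (GCC/Clang-specific) and the volatile trick are just one way to keep the compiler from collapsing the generated functions:

#include <cstddef>

// Each N produces a distinct function that cannot be inlined away,
// and the volatile read defeats constant folding across instantiations.
template <std::size_t N>
[[gnu::noinline]] std::size_t bloat() {
    volatile std::size_t v = N;
    return bloat<N - 1>() + v;
}

template <>
std::size_t bloat<0>() { return 0; }

int main() {
    // Instantiates bloat<400> ... bloat<1>, i.e. 400 separate functions.
    return static_cast<int>(bloat<400>() & 0xFF);
}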
What is it that you want to accomplish? There might be better ways.

Is there a great deal of extra overhead when using C++ over C? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
I have been programming microcontrollers for a bit using C, but like the intuition that C++ brings to the table with its object-oriented nature.
What are the major drawbacks of using C++ in general? Aside from class instantiation and deletion, where the associated constructors and destructors are called, is there a significant amount of overhead compared to an equivalent implementation using C?
Specifically, I am concerned about the following areas:
extra memory usage (RAM)
extra instructions required (and consequentially CPU time)
extra memory required to store the C++ program (i.e. result of compilation)
Programming in C++ won't inherently give you a slower/bigger/<insert worst nightmare here> program. However, there are some reasons to prefer C to C++ for microcontrollers:
Writing a C++ compiler is much harder than writing a C compiler. Thus it can be impossible to find a C++ compiler for a small processor, but a C compiler can always be found. This may or may not bother you. Even if it doesn't bother you now, it might in the future if you want to port your code.
C++ can do things behind your back. Vectors are much easier to deal with than arrays because a lot of the work is done for you, but this means that the library is allocating memory for you, and it does so when it wants to. If memory is at a premium, you might want full control. Also, if there is a real-time element to your use case, you probably want to allocate all memory up front so that each call is predictable (an insert into a vector can take a long time if it hits its capacity and needs to grow, which may mean copying the vector to a new location on the heap).
C++ has features that take up more memory and that are very easy to reach for. If you make functions virtual, the compiler needs to emit a virtual function table and store a vtable pointer in each object (more memory, and a slightly slower, indirect function call). This might be exactly what you want, but such costs are easier to introduce in C++ than in C.
Overall, C++ makes it easier to introduce code that is larger and slower than the C equivalent. But if you want those features, doing it in C is a pain (think of function pointers rather than a virtual function call - they are effectively the same thing), and the C version will end up taking the same time and resources, so there is no saving in using C.
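To make that concrete, a sketch showing the two styles side by side; the driver names are made up, and both calls end up as an indirect call through a stored pointer:

#include <cstdio>

// C-style: an explicit function pointer stored per object.
struct CDriver {
    void (*write)(const char* msg);
};
void uart_write(const char* msg) { std::printf("uart: %s\n", msg); }

// C++-style: the compiler stores one vtable pointer per object instead.
struct Driver {
    virtual void write(const char* msg) = 0;
    virtual ~Driver() = default;
};
struct UartDriver : Driver {
    void write(const char* msg) override { std::printf("uart: %s\n", msg); }
};

int main() {
    CDriver c{uart_write};
    c.write("hello");          // indirect call through the stored pointer

    UartDriver u;
    Driver& d = u;
    d.write("hello");          // indirect call through the vtable
    return 0;
}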
Dynamic dispatch (i.e. methods marked virtual) has a slightly higher cost (though a negligible one) than non-virtual calls. The good news is that you don't pay for it unless you mark a method virtual, and when you do use it, it will probably be faster than whatever you would have hand-crafted in C to do the same thing. Exception handling can be slow, but you don't need to use exceptions in your code. Other than that, there is no difference, except that C++ will greatly simplify the code compared to the equivalent C.

Architectural tips on building a shared resource for different processes [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
In the company I work at we're dealing with a big problem: we have a system that consists of several processing units. We made it this way so that each module has a specific responsibility. The integration between these modules is done using a queue system (which is not fast, but we're working on it) and by replicating messages between them. The problem is that this generates a great deal of overhead, because four of these modules require the same kind of data, and keeping them consistent is painful.
Another requirement for the system is redundancy, so I was hoping to solve both problems in one shot.
So I was thinking of using some kind of shared resource. I've looked at shared memory (which is great, but a module crashing while holding a lock could leave the data in an inconsistent state), possibly combined with a "raw copy" of the segment to another machine for redundancy.
So I've begun searching for alternatives and ideas. One option I found is a NoSQL store, but I don't know whether it can deliver the speed I need.
I need something (ideally):
As fast as memory
That provides redundancy (active-passive is OK, active-active is better)
I also think that shared memory is the way to go. To provide redundancy, let every process copy the data it is going to change into local/non-shared memory, and only copy it back to shared memory once the module has done its work. Make sure the "copy to shared memory" step is as small as possible and that nothing can go wrong while doing the copy. Some tricks you could use are:
Prepare all data in local memory and use one memcpy operation to copy it to shared memory
Use a single value to indicate that the written data is valid. This could be a boolean, or a version number identifying which snapshot of the data currently sits in shared memory (a sketch of this follows the list).
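A sketch of that scheme, assuming POSIX shared memory; the Payload layout, the segment name and the version-counter protocol (a simplified seqlock) are all illustrative, and a production version needs more care with memory ordering and torn reads:

#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <atomic>
#include <cstdint>
#include <cstring>
#include <cstdio>

struct Payload {
    char data[256];
};

struct SharedBlock {
    std::atomic<std::uint64_t> version;   // odd = write in progress, even = valid
    Payload payload;
};

const char* kShmName = "/demo_shared_block";   // link with -lrt on older glibc

SharedBlock* open_block() {
    int fd = shm_open(kShmName, O_CREAT | O_RDWR, 0600);
    if (fd < 0) return nullptr;
    if (ftruncate(fd, sizeof(SharedBlock)) != 0) { close(fd); return nullptr; }
    void* p = mmap(nullptr, sizeof(SharedBlock), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? nullptr : static_cast<SharedBlock*>(p);
}

void publish(SharedBlock* blk, const Payload& local) {
    // All real work happened on `local`; only this small copy touches shared memory.
    blk->version.fetch_add(1);                      // now odd: write in progress
    std::memcpy(&blk->payload, &local, sizeof local);
    blk->version.fetch_add(1);                      // even again: data valid
}

bool read_consistent(const SharedBlock* blk, Payload& out) {
    std::uint64_t v1 = blk->version.load();
    if (v1 & 1) return false;                       // writer active, retry later
    std::memcpy(&out, &blk->payload, sizeof out);
    std::uint64_t v2 = blk->version.load();
    return v1 == v2;                                // unchanged -> copy is consistent
}

int main() {
    SharedBlock* blk = open_block();
    if (!blk) return 1;

    Payload local{};
    std::strcpy(local.data, "prepared entirely in private memory");
    publish(blk, local);

    Payload copy{};
    if (read_consistent(blk, copy)) std::puts(copy.data);

    munmap(blk, sizeof(SharedBlock));
    shm_unlink(kShmName);
    return 0;
}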