Recently, for fun, I decided to build a toy programming language, compiler and VM. While starting to implement the virtual machine I got stuck. I implemented the stack, which holds the variables and structs, as separate arrays for each type. The problem is that when I have a reference to a struct, its elements are not contiguous: int struct.x might be at address 2 and float struct.y might be at address 56, so accessing the struct through a reference would be impossible because the indexes are not linear. How could I solve this?
edit:
First of all, by "each type" I mean each primitive type. Second, I know I could implement it with unions, but I want to learn how it is really implemented in Java, C++ or C#. That's kind of the point of making a toy language: to better understand what you are programming.
In this case, you have no real choice but to use a single data type like uint32_t/uint64_t and simply have the compiler break values down into integer-sized words:
int sp = 0;
uint32_t stack[MAX_STACK_SIZE];
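To make the single-word-type idea concrete, here is a minimal sketch of how a struct could live in consecutive slots so that a reference is just a base index plus a field offset. The names vm_stack, alloc_slots, float_to_word, word_to_float and the OFFSET_ constants are made up for illustration, and the sketch assumes a 32-bit float:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

const int MAX_STACK_SIZE = 256;      // assumed size, for illustration only

std::uint32_t vm_stack[MAX_STACK_SIZE];
int sp = 0;                          // index of the next free slot

// Store a float's bit pattern in a 32-bit slot (memcpy avoids aliasing problems).
std::uint32_t float_to_word(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");
    std::uint32_t w;
    std::memcpy(&w, &f, sizeof w);
    return w;
}

// Inverse of float_to_word.
float word_to_float(std::uint32_t w) {
    float f;
    std::memcpy(&f, &w, sizeof f);
    return f;
}

// Reserve `words` consecutive slots and return the index of the first one.
int alloc_slots(int words) {
    assert(sp + words <= MAX_STACK_SIZE);
    int base = sp;
    sp += words;
    return base;
}

// Hypothetical layout a compiler could emit for: struct { int x; float y; }
const int OFFSET_X = 0;
const int OFFSET_Y = 1;
const int POINT_WORDS = 2;
```

With this layout a struct reference is only the base index returned by alloc_slots; field x is at vm_stack[base + OFFSET_X] and field y at vm_stack[base + OFFSET_Y], so the indexes are linear again.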
OR
like the others have said, create a stack that is an array of unions, possibly using a tagged union. One implementation could be...
union values {
int i;
float f;
};
struct Type {
int tag;
union values val;
};
struct Type stack[MAX_STACK_SIZE]; // "struct" is required in C, optional in C++
It's up to you to decide on this but this is usually how it's done.
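As a rough sketch of how the tagged-union stack might be used, here are push and pop helpers. The names Tag, Value, vm_stack, push_int, push_float and pop are my own and only illustrate the idea:

```cpp
#include <cassert>

const int MAX_STACK_SIZE = 256;   // assumed size, for illustration only

enum Tag { TAG_INT, TAG_FLOAT };

union Values {
    int i;
    float f;
};

struct Value {
    Tag tag;        // says which member of `val` is currently valid
    Values val;
};

Value vm_stack[MAX_STACK_SIZE];
int sp = 0;         // index of the next free slot

void push_int(int i) {
    assert(sp < MAX_STACK_SIZE);
    vm_stack[sp].tag = TAG_INT;
    vm_stack[sp].val.i = i;
    ++sp;
}

void push_float(float f) {
    assert(sp < MAX_STACK_SIZE);
    vm_stack[sp].tag = TAG_FLOAT;
    vm_stack[sp].val.f = f;
    ++sp;
}

// Removes and returns the top value; the caller checks the tag before reading.
Value pop() {
    assert(sp > 0);
    return vm_stack[--sp];
}
```

Every slot has the same size here, so struct fields can again sit in consecutive slots, at the cost of storing a tag next to each value.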
Related
I'm implementing a compiler in C++ and am at the AST stage. I now need to add to the symbol_entry a value for that variable (it already has a type). But how do I keep values of different sizes in an attribute of the class when I don't know the type? My current idea is to declare an attribute "val_pointer" of type void* in the symbol_entry, and cast, for example, from int* to void* and back. My understanding is that this can be done because pointers are all of the same size. Will this work? Also, is allocating an int* separately each time efficient? I think it would be better to create these int* values from a contiguous block of memory, but I want to save space too.
One solution is to use tagged unions. For example:
enum Type
{
tInt,
tDouble
};
struct Data
{
Type type;
union
{
int Int; // only valid when type is tInt
double Double; // only valid when type is tDouble
} as;
};
Note that this isn't the best solution available in C++. You may want to look into std::variant, which has some advantages when compared to a raw tagged union, see Where to use std::variant over union?.
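For comparison, a minimal sketch of the same int/double value written with std::variant. This assumes a C++17 compiler; the helper names to_double and describe are invented for the example:

```cpp
#include <cassert>
#include <string>
#include <variant>

// Holds either an int or a double; the variant tracks which one is active.
using Data = std::variant<int, double>;

// Widen whichever alternative is stored to double.
double to_double(const Data& d) {
    if (std::holds_alternative<int>(d)) return std::get<int>(d);
    assert(std::holds_alternative<double>(d));
    return std::get<double>(d);
}

// Describe the active alternative; get_if returns nullptr on a type mismatch.
std::string describe(const Data& d) {
    if (const int* i = std::get_if<int>(&d)) return "int " + std::to_string(*i);
    if (const double* x = std::get_if<double>(&d)) return "double " + std::to_string(*x);
    return "empty";
}
```

Unlike the raw union, reading the wrong alternative with std::get throws std::bad_variant_access instead of silently reinterpreting the bytes.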
Another approach might be to have a class hierarchy, where each of your data types inherits from a basic Object type.
I have briefly looked around various c++ sites and text books. But none of them have had anything related to what I was looking for.
What I want is a list in c++ which can contain int, string and int array variables within it. But before I spend hours playing around with some code, I was wondering if anyone knew if such a thing actually exists? I'm not asking for code to be shown to me, if it is possible, I will attempt it, and then ask about any issues I have with it.
Thanks
Your best bet is boost::variant. Remember - I didn't tell you it will be easy.
Usage will be simple:
typedef boost::variant<...my necessary types...> MyVariant;
std::list<MyVariant> myList;
In case you meant an object that can contain an int, a string and arrays as separate objects, not as one (like a union), I think you should take a look at C++11 tuples and use them in the list.
It might be unsafe to put different types of objects/data in a single list.
But if that is the requirement, then why not derive a new class from std::list and keep track of the types being inserted into the list, using the approach mentioned in the answer above.
You can also create a struct using union or void pointer.
enum varType
{
vt_int,
vt_float,
vt_string
};
class myVariant
{
private:
void* mVariable;
varType mType;
};
or also,
class myVariant2
{
private:
union
{
float fValue;
int iValue;
std::string* sValue;
};
varType mType;
};
It's not nice and would require casting heavily, but if you don't like using other libraries for such small task, this might be of help.
Edit1: You will need getStringValue, getFloatValue and getIntValue functions.
Edit2: You can safely use these classes in std::list.
Edit3: You need to call destructor of std::string (if the variable is a string) yourself.
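To illustrate the three edits above, here is one possible way to flesh out the union-based class with typed getters and manual cleanup of the string. The class name myVariant3, the Storage union and the copy handling are my own additions, a sketch and not the only way to do it:

```cpp
#include <cassert>
#include <string>
#include <utility>

enum varType { vt_int, vt_float, vt_string };

class myVariant3 {   // hypothetical name, not taken from the answer above
public:
    explicit myVariant3(int v) : mType(vt_int) { mValue.i = v; }
    explicit myVariant3(float v) : mType(vt_float) { mValue.f = v; }
    explicit myVariant3(const std::string& v) : mType(vt_string) {
        mValue.s = new std::string(v);
    }
    // Deep copy, so two objects never own the same std::string.
    myVariant3(const myVariant3& other) : mValue(other.mValue), mType(other.mType) {
        if (mType == vt_string) mValue.s = new std::string(*other.mValue.s);
    }
    // Copy-and-swap assignment: `other` is already a private copy.
    myVariant3& operator=(myVariant3 other) {
        std::swap(mValue, other.mValue);
        std::swap(mType, other.mType);
        return *this;
    }
    // Edit3: the string has to be released by hand.
    ~myVariant3() { if (mType == vt_string) delete mValue.s; }

    varType type() const { return mType; }
    int getIntValue() const { assert(mType == vt_int); return mValue.i; }
    float getFloatValue() const { assert(mType == vt_float); return mValue.f; }
    const std::string& getStringValue() const {
        assert(mType == vt_string);
        return *mValue.s;
    }

private:
    union Storage { float f; int i; std::string* s; };
    Storage mValue;
    varType mType;
};
```

The deep-copying copy constructor is what makes it safe to store such objects in a std::list, since the container copies its elements.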
Given a class instance and a pointer to member, we can obtain a regular pointer to that member variable of the instance, as in the last assignment in the following code:
class A
{
public:
int i, j;
};
int main(){
A a;
int A::*p = &A::i;
int* r = &(a.*p); // r now points to a.i;
}
Is it possible to invert this conversion: given a class instance A a; and int* r, obtain int A::* p (or a null pointer if the given pointer does not point into the given instance), as in this code:
class A
{
public:
int i, j;
};
int main(){
A a;
int A::*p = &A::i;
int* r = &(a.*p); // r now points to a.i;
int A::*s = // a--->r -how to extract r back to member pointer?
}
The only way that I can think of doing it would be to write a function that takes every known field of A, calculates its address for the given instance and compares it with the given address. This, however, requires writing custom code for every class and might get difficult to manage. It also has suboptimal performance.
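For reference, the brute-force approach just described might look like the following sketch for class A. The function name to_member_pointer and the hand-written table of known fields are hypothetical:

```cpp
#include <cassert>
#include <cstddef>

class A
{
public:
    int i, j;
};

// Compares `r` with the address of every known int member of `a`.
// Returns a null member pointer when `r` does not point at one of them.
int A::* to_member_pointer(A& a, int* r) {
    assert(r != nullptr);
    static int A::* const known[] = { &A::i, &A::j };   // must be maintained by hand
    for (std::size_t k = 0; k < sizeof known / sizeof known[0]; ++k) {
        if (&(a.*known[k]) == r) return known[k];
    }
    return nullptr;
}
```

The table has to list every field of every class that should be supported, which is exactly the maintenance burden mentioned above.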
I can imagine that such a conversion could be done by the compiler in a few operations under all implementations I know: such a pointer is usually just an offset into the structure, so it would be just a subtraction and a range check to see whether the given pointer is actually inside this object's storage. Virtual base classes add a bit of complexity, but nothing the compiler couldn't handle, I think. However, it seems that since it's not required by the standard (is it?), no compiler vendor cares.
Or am I wrong, and there is some fundamental problem with such conversion?
EDIT:
I see that there is a little misunderstanding about what I am asking about. In short I am asking if either:
There is already some implementation of it (at the compiler level I mean), but since hardly anybody uses it, almost nobody knows about it.
There is no mention of it in the standard and no compiler vendor has thought of it, but in principle it is possible to implement (once again: by the compiler, not compiled code).
There is some deep-reaching problem with such an operation, that I missed.
My question was - which of those is true? And in case of the last - what is underlying problem?
I am not asking for workarounds.
There is no cross platform way to do this. Pointer to member values are commonly implemented as offsets to the start of the object. Leveraging off of that fact I made this (works in VS, haven't tried anything else):
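#include <cstddef>
#include <iostream>
class A
{
public:
int i, j;
};
int main()
{
A a;
int A::*p = &A::i;
int* r = &(a.*p); // r now points to a.i;
union // non-portable: assumes a member pointer is stored as a byte offset
{
std::ptrdiff_t offset;
int A::*s;
};
// byte offset of *r from the start of a (a difference of int* would count ints, not bytes)
offset = reinterpret_cast<char*>(r) - reinterpret_cast<char*>(&a);
a.*s = 7;
std::cout << a.i << '\n';
}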
AFAIK C++ does not provide full reflection, which you would need to do that.
One solution is to provide reflection yourself (the way you describe is one way, it may not be the best one but it would work).
A totally non portable solution would be to locate the executable and use any debug information it may contain. Obviously non portable and requires the debug information to be there to begin with.
There's a decent description of the problem of reflection and different possible approaches to it in the introduction section of http://www.garret.ru/cppreflection/docs/reflect.html
Edit:
As I wrote above, there's no portable and general solution. But there may be a very non portable approach. I'm not giving you an implementation here as I do not have a C++ compiler at the moment to test it, but I'll describe the idea.
The basis is what Dave did in his answer: exploit the fact that a pointer to member is often just an offset. The problem is with base classes (especially virtual and multiple inheritance ones). You can approach it with templates. You can use dynamic casting to get a pointer to the base class. And eventually diff that pointer with the original to find out the offset of the base.
I am trying to create something like a list. However, different instances of the list may have a different number of entries, and the type of entry is based on input given by the user. For example, the user states that they want the structure of each entry in the list to contain an int id, a std::string name, a double metricA, and a long metricB. Based on this input, the following is created:
struct some_struct {
int id;
std::string name;
double metricA;
long metricB;
};
list<some_struct> some_list;
The user input may be read from a file, entered on the screen, etc. Additionally, there is a variable number of entries in some_struct. In other words, it may have the entries listed above, it may have just 2 of them, or it may have 10 completely different ones. Is there some way to create a struct like this?
Additionally, being able to apply comparison operators to each member of some_struct is a must. I could use boost::any to store the data, but that creates issues with comparison operators, and also incurs more overhead than is ideal.
C++ is a strongly-typed language, meaning you have to declare your data structure types. To that end you cannot declare a struct with arbitrary number or type of members, they have to be known upfront.
Now there are ways, of course, to deal with such issues in C++. To name a few:
Use a map (either std::map or std::unordered_map) to create a "table" instead of a structure. Map strings to strings, i.e. names to string representations of the values, and interpret them to your heart's content.
Use pre-canned variant type like boost::any.
Use polymorphism - store pointers to base in the list, and have the virtual mechanism dispatch operations invoked on the values.
Create a type system for your input language. Then have table of values per type, and point into appropriate table from the list.
There are probably as many other ways to do this as there are C++ programmers.
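As a small sketch of the first option (a map used as a "table"), each record maps field names to textual values that are interpreted on demand. The names Row and field_as_long are invented for this example:

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

// One record: field name -> textual value, with fields chosen at run time.
typedef std::map<std::string, std::string> Row;

// Interpret a field as an integer; std::stol throws if the text is not numeric.
long field_as_long(const Row& row, const std::string& name) {
    Row::const_iterator it = row.find(name);
    assert(it != row.end());
    return std::stol(it->second);
}
```

A std::list<Row> then plays the role of list<some_struct>, and comparisons are done on the interpreted values, for example field_as_long(a, "id") < field_as_long(b, "id").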
There are many ways to solve the problem of data structures with varying members and which is best depends a lot on how exactly it is going to be used.
The most obvious is to use inheritance. You derive all your possibilities from a base class:
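struct base_struct {
virtual ~base_struct() {} // makes the base polymorphic: needed for dynamic_cast and for deleting through a base pointer
int id;
std::string name;
};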
list<base_struct*> some_list;
struct some_struct : public base_struct {
double metricA;
};
struct some_other_struct : public base_struct {
int metricB;
};
base_struct *s1 = new some_struct;
s1->id = 1;
// etc
base_struct *s2 = new some_other_struct;
s2->id = 2;
// etc
some_list.push_back(s1);
some_list.push_back(s2);
The tricky bit is that you'll have to make sure that when you get elements back out, you cast appropriately. dynamic_cast can do this in a type-safe manner:
some_struct* ss = dynamic_cast<some_struct*>(some_list.front());
You can query the name before casting using type_info:
typeid(*some_list.front()).name();
Note that both these require building with RTTI, which is usually OK, but not always as RTTI has a performance cost and can bloat your memory footprint, especially if templates are used extensively.
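A short sketch of what the checked cast looks like in practice. Note that dynamic_cast only works on polymorphic types, so the base needs at least one virtual function (a virtual destructor here); the helper metric_a_or is a made-up name:

```cpp
#include <cassert>
#include <string>

struct base_struct {
    virtual ~base_struct() {}   // makes the type polymorphic, which dynamic_cast requires
    int id;
    std::string name;
};

struct some_struct : public base_struct {
    double metricA;
};

struct some_other_struct : public base_struct {
    int metricB;
};

// Returns metricA if `b` really points at a some_struct, otherwise `fallback`.
double metric_a_or(base_struct* b, double fallback) {
    assert(b != nullptr);
    if (some_struct* ss = dynamic_cast<some_struct*>(b)) {
        return ss->metricA;
    }
    return fallback;            // dynamic_cast yielded a null pointer: wrong dynamic type
}
```

For pointers, a failed dynamic_cast yields a null pointer rather than undefined behaviour, which is what makes the check safe.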
In a previous project, we dealt with something similar using boost::any. The advantage of any is that it allows you to mix types that aren't derived from one another. In retrospect, I'm not sure I'd do that again, because it made the code a bit too apt to fail at runtime, since type checking is deferred until then. (This is true of the dynamic_cast approach as well.)
In the bad old C days, we solved this same problem with a union:
struct base_struct {
int id;
std::string name;
union { // metricA and metricB share memory and only one is ever valid
double metricA;
int metricB;
};
};
Again, you have the problem that you have to deal with ensuring that it is the right type yourself.
In the era before the STL, many container systems were written to take a void*, again requiring the user to know when to cast. In theory, you could still do that by saying list<void*> but you'd have no way to query the type.
Edit: Never, ever use the void* method!
I ended up using a list with a boost::variant. The performance was far better than using boost::any. It went something like this:
#include <boost/variant/variant.hpp>
#include <list>
#include <string>
#include <utility>
typedef boost::variant< short, int, long, long long, double, std::string > flex;
typedef std::pair<std::string, flex> flex_pair;
typedef std::list< flex_pair > row_entry;
std::list< row_entry > all_records;
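Since comparison operators were a requirement, here is a rough sketch of how lookups and comparisons on such rows could work. To keep it free of external dependencies it swaps in std::variant (C++17) for boost::variant and uses a shorter type list; find_field and less_than are hypothetical helpers:

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <variant>

using flex = std::variant<int, long, double, std::string>;
using flex_pair = std::pair<std::string, flex>;   // field name and value
using row_entry = std::list<flex_pair>;

// Look up a field by name; returns nullptr when the row has no such field.
const flex* find_field(const row_entry& row, const std::string& name) {
    for (const flex_pair& p : row) {
        if (p.first == name) return &p.second;
    }
    return nullptr;
}

// Compare two values that are expected to hold the same alternative.
// std::variant's operator< orders by alternative index first, then by value.
bool less_than(const flex& a, const flex& b) {
    assert(a.index() == b.index());
    return a < b;
}
```

Because the variant's relational operators compare the alternative index before the value, comparing an int field against a double field would not compare them numerically, hence the assert in less_than.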
Is it possible to write a C++ class or struct that is fully compatible with a C struct? By compatibility I mean the size of the object and the memory locations of the variables. I know that it's evil to use *(point*)&pnt or even (float*)&pnt (in a different case where the variables are floats), but consider that it's really required for the sake of performance. It's not logical to use a regular type-casting operator a million times per second.
Take this example
Class Point {
long x,y;
Point(long x, long y) {
this->x=x;
this->y=y;
}
float Distance(Point &point) {
return ....;
}
};
C version is a POD struct
struct point {
long x,y;
};
The cleanest way to do this is to inherit from the C struct:
struct point
{
long x, y;
};
class Point : public point
{
public:
Point(long x, long y)
{ this->x=x; this->y=y; }
float Distance(Point &point)
{ return ....; }
};
The C++ compiler guarantees the C style struct point has the same layout as with the C compiler. The C++ class Point inherits this layout for its base class portion (and since it adds no data or virtual members, it will have the same layout). A pointer to class Point will be converted to a pointer to struct point without a cast, since conversion to a base class pointer is always supported. So, you can use class Point objects and freely pass pointers to them to C functions expecting a pointer to struct point.
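A compile-time check of these claims can be sketched as follows. The function c_sum stands in for some C library function and is hypothetical, as is the IsOrigin method:

```cpp
#include <cassert>
#include <type_traits>

extern "C" {
    struct point {
        long x, y;
    };
    // Stand-in for a function from a C library (hypothetical).
    long c_sum(const struct point* p) {
        assert(p != nullptr);
        return p->x + p->y;
    }
}

class Point : public point {
public:
    Point(long x_, long y_) { x = x_; y = y_; }
    // Example of a method added on the C++ side only.
    bool IsOrigin() const { return x == 0 && y == 0; }
};

// No data members or virtual functions were added, so the layout is unchanged.
static_assert(sizeof(Point) == sizeof(point), "Point must not add storage");
static_assert(std::is_standard_layout<Point>::value, "Point must stay standard-layout");
```

A Point* converts implicitly to a point*, so c_sum(&p) compiles without a cast for any Point p.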
Of course, if there is already a C header file defining struct point, then you can just include this instead of repeating the definition.
Yes.
Use the same types in the same order in both languages
Make sure the class doesn't have anything virtual in it (so you don't get a vtable pointer stuck on the front)
Depending on the compilers used you may need to adjust the structure packing (usually with pragmas) to ensure compatibility.
(edit)
Also, you must take care to check sizeof() for the types with your compilers. For example, I've encountered a compiler that stored shorts as 32-bit values (when most use 16). A more common case is long: it is usually 32 bits on a 32-bit architecture, 64 bits on most 64-bit Unix-like systems, and still 32 bits on 64-bit Windows, while int stays at 32 bits on nearly all of them.
POD applies to C++. You can have member functions. "A POD type in C++ is an aggregate class that contains only POD types as members, has no user-defined destructor, no user-defined copy assignment operator, and no nonstatic members of pointer-to-member type"
You should design your POD data structures so they have natural alignment, and then they can be passed between programs created by different compilers on different architectures. Natural alignment is where the memory offset of any member is divisible by the size of that member, i.e. a float is located at an address that is divisible by 4 and a double at an address divisible by 8. If you declare a char followed by a float, most architectures will pad 3 bytes, but some could conceivably pad 1 byte. If you declare a float followed by a char, all compilers (I ought to add a source for this claim, sorry) will not pad between the two members, although most will still add trailing padding so that the struct size stays a multiple of 4.
C and C++ are different languages, but it has always been C++'s intention that you can have an implementation that supports both languages in a binary-compatible fashion. Because they are different languages, it is always a compiler implementation detail whether this is actually supported. Typically, vendors who supply both a C and a C++ compiler (or a single compiler with two modes) do support full compatibility for passing POD-structs (and pointers to POD-structs) between C++ code and C code.
Often, merely having a user-defined constructor breaks the guarantee, although sometimes you can pass a pointer to such an object to a C function expecting a pointer to a struct with an identical data structure and it will work.
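The padding behaviour can be inspected with offsetof and alignof. The two struct names and the is_aligned helper below are made up for the sketch, and the exact numbers depend on the platform:

```cpp
#include <cassert>
#include <cstddef>

struct CharThenFloat {
    char c;     // padding is usually inserted after this member
    float f;
};

struct FloatThenChar {
    float f;
    char c;     // no padding needed before this member
};

// True when `offset` is a multiple of `alignment`.
bool is_aligned(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0);
    return offset % alignment == 0;
}
```

For example, is_aligned(offsetof(CharThenFloat, f), alignof(float)) reports whether the compiler placed f on a naturally aligned offset, and offsetof(FloatThenChar, c) shows where the char landed.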
In short: check your compiler documentation.
Use the same "struct" in both C and C++. If you want to add methods in the C++ implementation, you can inherit the struct and the size should be the same as long as you don't add data members or virtual functions.
Be aware that if you have an empty struct or data members that are empty structs, they are different sizes in C and C++. In C, sizeof(empty-struct) == 0 although in C99, empty-structs are not supposed to be allowed (but may be supported anyway as a "compiler extension"). In C++, sizeof(empty-struct) != 0 (typical value is 1).
In addition to the other answers, I would be sure not to put any access specifiers (public:, private: etc.) into your C++ class / struct. IIRC the compiler is allowed to reorder blocks of member variables according to visibility, so that private: int a; public: int b; might get a and b swapped round. See e.g. this link: http://www.embedded.com/design/218600150?printable=true
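Since C++11 this can be checked at compile time: a class whose non-static data members have different access control is not a standard-layout type. A small sketch, with both class names invented for the example:

```cpp
#include <type_traits>

struct AllPublic {
    int a;
    int b;
};

class Mixed {
public:
    Mixed() : a(0), b(0) {}
    int get_a() const { return a; }
private:
    int a;      // private member ...
public:
    int b;      // ... followed by a public one: mixed access control
};

static_assert(std::is_standard_layout<AllPublic>::value,
              "same access for all data members: standard-layout");
static_assert(!std::is_standard_layout<Mixed>::value,
              "mixed access control: not standard-layout");
```

Putting a static_assert like the first one next to any struct shared with C code makes the compiler reject an accidental layout-affecting change.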
I admit to being baffled as to why the definition of POD does not include a prohibition to this effect.
As long as your class doesn't exhibit some advanced traits of its kind, like growing something virtual, it should be pretty much the same struct.
Besides, you can change Class (which is invalid due to capitalization, anyway) to struct without doing any harm, except that the members will become public (they are private now).
But now that I think about your mention of type conversion: there's no way you can turn a float into a long representing the same value, or vice versa, by casting the pointer type. I hope you only want these pointers for the sake of moving stuff around.
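The difference between reinterpreting the bits and converting the value can be seen in a small sketch. The names bits_of and value_of are made up, and the sketch assumes a 32-bit float:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the raw bits of a float into an integer: no numeric conversion happens.
std::uint32_t bits_of(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Numeric conversion: the value is truncated toward zero.
long value_of(float f) {
    // Converting an out-of-range float to an integer type is undefined behaviour.
    assert(f > -2147483648.0f && f < 2147483648.0f);
    return static_cast<long>(f);
}
```

On an IEEE-754 platform bits_of(1.0f) is 0x3F800000, which is what reading the float through an integer pointer would give, whereas value_of(1.0f) is 1.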