I'm trying to write a simple C++ program that creates a linked list. I would like to make this list be able to store any type of data in its container. However, I realised that my main problem is being able to accept any type of input and storing it. For example, if the variable std::cin stores to is of type string, only strings will be accepted, and this variable must be defined before program compilation.
My question is: is there any way to accept any type of input with a std::cin (or any other input method) and then call certain functions depending on the input's type?
Something along the logic of...
cin >> data
if (data.type == string)
{
cout << "data's type: string"
}
if (data.type == int)
{
cout << "data's type: int"
}
Thank you!
C++ is (mostly) statically typed. That is, the types of variables have to be known at compile time and cannot change at runtime, e.g. depending on some user input.
The one exception to this rule are polymorphic classes: When you have a base class with some virtual member function, then there will be a way (most likely a pointer as a member of all instances of that class) to distinguish between sub classes of that class (and itself):
struct Base {
virtual ~Base() {}
};
struct SubA : public Base {};
struct SubB : public Base {};
// ...
Base const & instance = SubA{};
try {
SubA const & as_subA = dynamic_cast<SubA const &>(instance);
// The instance is a SubA, so the code following here will be run.
} catch (std::bad_cast const &) { /* handle that somehow */ }
Using this mechanism, or preferably the virtual function itself, you can have different behavior depending on the dynamic type of an instance, which is only known at run time.
C++ being a flexible language, you can - of course - also implement something similar on your own:
struct Thing {
enum class Type {
Integer, String, Vector
} type;
union {
int integer;
std::string string;
std::vector<int> vector;
} data;
// Plus a lot of work in constructors, destructor and assignment, see rule of 5
};
Using something like this allows you to have objects with a "type" of dynamic nature, and are able to do different things depending on what type the object actually has at run time.
But of course you don't need to write that on your own (though its not that hard), there are a lot of implementations like for example boost::any and boost::variant.
Considering that piece of code:
struct myStruct
{
myStruct *next;
};
Next is a pointer of struct declared in the struct definition, right?
What's the utility of - next - ? How can I use it?
Seems like it's an implementation of a linked-list.
You can use next if you want to chain such structures together to traverse them later. Of course, having other members in myStruct would make more sense.
example:
struct myStruct
{
int data;
myStruct *next;
};
myStruct st_1;
myStruct st_2;
st_1.data = 1;
st_2.data = 2;
st_1.next = &st_2; //st_1.next->data is now 2
The utility of this pointer is whatever you implement in myStruct. You can hold a direct relationship to other myStruct structs (through the pointer) using this pointer and directly manipulate them (i.e. like "knowing" about other objects).
For instance (note that for all intents and purposes, struct's in C++ are public classes),
class Test
{
public:
doSomethingToTheOtherStruct() {
if(t != NULL)
t->touch();
setTouched(bool touch) {
touched = touch;
}
setT(Test* other) {
t = other;
}
bool isTouched() const {
return touched;
}
private:
Test* t;
bool touched;
};
This class has some very simple methods which can demonstrate the power of using pointers. Now an example using it is below.
#include <iostream>
using namespace std;
int main()
{
Test t1;
Test t2;
Test* t3 = new Test;
// Notice that we set the pointers of each struct to point to a different one
// This is not necessary, but is definitely more useful than setting it to itself
// since you already have the "this" pointer in a class.
t1->setT(&t2);
t2->setT(t3);
t3->setT(&t1);
cout<< t1.isTouched() << t2.isTouched() << t3->isTouched() << endl;
t1->doSomethingToTheOtherStruct();
t2.doSomethingToTheOtherStruct();
cout<< t1.isTouched() << t2.isTouched() << t3->isTouched() << endl;
delete t3;
return 0;
}
Take note in the results of this code. t1 is never set to touched, but inadvertently (through the pointers), t2 and t3 become "touched."
The fact it is a pointer to the same class and that the member variable is called "next" suggests it is a linked list, as others have pointed out.
If the variable was a pointer to the same class but called "parent" it would most likely be some kind of parent/child relationship. (For example GUI widgets that have a parent that is also a widget).
What you might question is why you are allowed to do this: the answer is that pointers to data -types are all the same size, so the compiler will already know how many bytes it needs for this pointer.
For the same reason, you can have in your class (or struct) a pointer to a type for which the data type has only been declared and not defined. (Quite common to do).
That is correct. This kind of nested structs are used in linked lists.
the question is how to get pointer to self inside method of a class without touching this:
class Foo
{
int a, b, c;
void Print();
};
This way in common compiler I can do this refering to first data field:
void Foo::Print()
{
cout << &a; // == this
}
But are there any ways to do this without data members when only function exists?
class Foo2
{
void Print();
};
p.s. don't even ask me why do I need this :)
For a POD class with at least one data member, the address of the class-type object is the same as the address of its first data member. This is because there can be no unnamed padding bytes before the first data member of a POD struct type. [In C++11, the rules are a bit different; I believe that this is true for all standard layout class types. I am not entirely familiar with the rules, however.]
For any other class type, there is no way to do this.
class Foo2
{
void Print();
};
#define OBJADDR(x) (th##x##s)
void Foo2::Print()
{
int i; // red herring, but the name MUST be `i`
std::cout << OBJADDR(i);
i; // to get rid of compiler warning about unused local variable
};
If your class has no data members then technically, instances of the class don't even exist. For convenience only, they are required to have size of at least 1 byte and have unique this pointers. If you inherit from a class with no data members, its size collapses to 0 and it disappears.
So no, there isn't any way to get the this pointer without using the this pointer. A less pragmatic version of C++ wouldn't even have this pointers for these objects, as they would have size 0.
You can probably hack it using the offsetof macro
void Foo::Print()
{
cout << static_cast<char *>(&a) - offsetof(Foo, a);
}
if you can't safely assume &a is the start.
So ... why do you need to do this?
What, if any, c++ constructs are there for listing the ancestors of a class at runtime?
Basically, I have a class which stores a pointer to any object, including possibly a primitive type (somewhat like boost::any, which I don't want to use because I need to retain ownership of my objects). Internally, this pointer is a void*, but the goal of this class is to wrap the void* with runtime type-safety. The assignment operator is templated, so at assignment time I take the typeid() of the incoming pointer and store it. Then when I cast back later, I can check the typeid() of the cast type against the stored type_info. If it mismatches, the cast will throw an exception.
But there's a problem: It seems I lose polymorphism. Let's say B is a base of D. If I store a pointer to D in my class, then the stored type_info will also be of D. Then later on, I might want to retrieve a B pointer. If I use my class's method to cast to B*, then typeid(B) == typeid(D) fails, and the cast raises an exception, even though D->B conversion is safe. Dynamic_cast<>() doesn't apply here, since I'm operating on a void* and not an ancestor of B or D.
What I would like to be able to do is check is_ancestor(typeid(B), typeid(D)). Is this possible? (And isn't this what dynamic_cast<> is doing behind the scenes?)
If not, then I am thinking of taking a second approach anyway: implement a a class TypeInfo, whose derived classes are templated singletons. I can then store whatever information I like in these classes, and then keep pointers to them in my AnyPointer class. This would allow me to generate/store the ancestor information at compile time in a more accessible way. So failing option #1 (a built-in way of listing ancestors given only information available at runtime), is there a construct/procedure I can use which will allow the ancestor information to be generated and stored automatically at compile-time, preferably without having to explicitly input that "class A derives from B and C; C derives from D" etc.? Once I have this, is there a safe way to actually perform that cast?
I had a similar problem which I solved through exceptions! I wrote an article about that:
Part 1, Part 2 and code
Ok. Following Peter's advise the outline of the idea follows. It relies on the fact that if D derives from B and a pointer to D is thrown, then a catch clause expecting a pointer to B will be activated.
One can then write a class (in my article I've called it any_ptr) whose template constructor accepts a T* and stores a copy of it as a void*. The class implements a mechanism that statically cast the void* to its original type T* and throws the result. A catch clause expecting U* where U = T or U is a base of T will be activated and this strategy is the key to implementing a test as in the original question.
EDIT: (by Matthieu M. for answers are best self-contained, please refer to Dr Dobbs for the full answer)
class any_ptr {
void* ptr_;
void (*thr_)(void*);
template <typename T>
static void thrower(void* ptr) { throw static_cast<T*>(ptr); }
public:
template <typename T>
any_ptr(T* ptr) : ptr_(ptr), thr_(&thrower<T>) {}
template <typename U>
U* cast() const {
try { thr_(ptr_); }
catch (U* ptr) { return ptr; }
catch (...) {}
return 0;
}
};
The information is (often) there within the implementation. There's no standard C++ way to access it though, it's not exposed. If you're willing to tie yourself to specific implementations or sets of implementations you can play a dirty game to find the information still.
An example for gcc, using the Itanium ABI is:
#include <cassert>
#include <typeinfo>
#include <cxxabi.h>
#include <iostream>
bool is_ancestor(const std::type_info& a, const std::type_info& b);
namespace {
bool walk_tree(const __cxxabiv1::__si_class_type_info *si, const std::type_info& a) {
return si->__base_type == &a ? true : is_ancestor(a, *si->__base_type);
}
bool walk_tree(const __cxxabiv1::__vmi_class_type_info *mi, const std::type_info& a) {
for (unsigned int i = 0; i < mi->__base_count; ++i) {
if (is_ancestor(a, *mi->__base_info[i].__base_type))
return true;
}
return false;
}
}
bool is_ancestor(const std::type_info& a, const std::type_info& b) {
if (a==b)
return true;
const __cxxabiv1::__si_class_type_info *si = dynamic_cast<const __cxxabiv1::__si_class_type_info*>(&b);
if (si)
return walk_tree(si, a);
const __cxxabiv1::__vmi_class_type_info *mi = dynamic_cast<const __cxxabiv1::__vmi_class_type_info*>(&b);
if (mi)
return walk_tree(mi, a);
return false;
}
struct foo {};
struct bar : foo {};
struct baz {};
struct crazy : virtual foo, virtual bar, virtual baz {};
int main() {
std::cout << is_ancestor(typeid(foo), typeid(bar)) << "\n";
std::cout << is_ancestor(typeid(foo), typeid(baz)) << "\n";
std::cout << is_ancestor(typeid(foo), typeid(int)) << "\n";
std::cout << is_ancestor(typeid(foo), typeid(crazy)) << "\n";
}
Where I cast the type_info to the real type that's used internally and then recursively used that to walk the inheritance tree.
I wouldn't recommend doing this in real code, but as an exercise in implementation details it's not impossible.
First, what you are asking for cannot be implemented just on top of type_info.
In C++, for a cast to occur from one object to another, you need more than blindly assuming a type can be used as another, you also need to adjust the pointer, because of multi-inheritance (compile-time offset) and virtual inheritance (runtime offset).
The only way to safely cast a value from a type into another, is to use static_cast (works for single or multi-inheritance) and dynamic_cast (also works for virtual inheritance and actually checks the runtime values).
Unfortunately, this is actually incompatible with type erasure (the old template-virtual incompatibility).
If you limit yourself to non-virtual inheritance, I think it should be possible to achieve this by storing the offsets of conversions to various bases in some Configuration data (the singletons you are talking about).
For virtual inheritance, I can only think of a map of pairs of type_info to a void* (*caster)(void*).
And all this requires enumerating the possible casts manually :(
It is not possible using std::type_info since it does not provide a way to query inheritance information or to convert a std::type_info object to its corresponding type so that you could do the cast.
If you do have a list of all possible types you need to store in your any objects use boost::variant and its visitor.
While I can't think of any way to implement option #1, option #2 should be feasible if you can generate a compile-time list of the classes you would like to use. Filter this type list with boost::MPL and the is_base_of metafunction to get a list of valid-cast typeids, which can be compared to the saved typeid.
Instead of having to remember to initialize a simple 'C' structure, I might derive from it and zero it in the constructor like this:
struct MY_STRUCT
{
int n1;
int n2;
};
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct()
{
memset(this, 0, sizeof(MY_STRUCT));
}
};
This trick is often used to initialize Win32 structures and can sometimes set the ubiquitous cbSize member.
Now, as long as there isn't a virtual function table for the memset call to destroy, is this a safe practice?
You can simply value-initialize the base, and all its members will be zero'ed out. This is guaranteed
struct MY_STRUCT
{
int n1;
int n2;
};
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct():MY_STRUCT() { }
};
For this to work, there should be no user declared constructor in the base class, like in your example.
No nasty memset for that. It's not guaranteed that memset works in your code, even though it should work in practice.
PREAMBLE:
While my answer is still Ok, I find litb's answer quite superior to mine because:
It teaches me a trick that I did not know (litb's answers usually have this effect, but this is the first time I write it down)
It answers exactly the question (that is, initializing the original struct's part to zero)
So please, consider litb's answer before mine. In fact, I suggest the question's author to consider litb's answer as the right one.
Original answer
Putting a true object (i.e. std::string) etc. inside will break, because the true object will be initialized before the memset, and then, overwritten by zeroes.
Using the initialization list doesn't work for g++ (I'm surprised...). Initialize it instead in the CMyStruct constructor body. It will be C++ friendly:
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct() { n1 = 0 ; n2 = 0 ; }
};
P.S.: I assumed you did have no control over MY_STRUCT, of course. With control, you would have added the constructor directly inside MY_STRUCT and forgotten about inheritance. Note that you can add non-virtual methods to a C-like struct, and still have it behave as a struct.
EDIT: Added missing parenthesis, after Lou Franco's comment. Thanks!
EDIT 2 : I tried the code on g++, and for some reason, using the initialization list does not work. I corrected the code using the body constructor. The solution is still valid, though.
Please reevaluate my post, as the original code was changed (see changelog for more info).
EDIT 3 : After reading Rob's comment, I guess he has a point worthy of discussion: "Agreed, but this could be an enormous Win32 structure which may change with a new SDK, so a memset is future proof."
I disagree: Knowing Microsoft, it won't change because of their need for perfect backward compatibility. They will create instead an extended MY_STRUCTEx struct with the same initial layout as MY_STRUCT, with additionnal members at the end, and recognizable through a "size" member variable like the struct used for a RegisterWindow, IIRC.
So the only valid point remaining from Rob's comment is the "enormous" struct. In this case, perhaps a memset is more convenient, but you will have to make MY_STRUCT a variable member of CMyStruct instead of inheriting from it.
I see another hack, but I guess this would break because of possible struct alignment problem.
EDIT 4: Please take a look at Frank Krueger's solution. I can't promise it's portable (I guess it is), but it is still interesting from a technical viewpoint because it shows one case where, in C++, the "this" pointer "address" moves from its base class to its inherited class.
Much better than a memset, you can use this little trick instead:
MY_STRUCT foo = { 0 };
This will initialize all members to 0 (or their default value iirc), no need to specifiy a value for each.
This would make me feel much safer as it should work even if there is a vtable (or the compiler will scream).
memset(static_cast<MY_STRUCT*>(this), 0, sizeof(MY_STRUCT));
I'm sure your solution will work, but I doubt there are any guarantees to be made when mixing memset and classes.
This is a perfect example of porting a C idiom to C++ (and why it might not always work...)
The problem you will have with using memset is that in C++, a struct and a class are exactly the same thing except that by default, a struct has public visibility and a class has private visibility.
Thus, what if later on, some well meaning programmer changes MY_STRUCT like so:
struct MY_STRUCT
{
int n1;
int n2;
// Provide a default implementation...
virtual int add() {return n1 + n2;}
};
By adding that single function, your memset might now cause havoc.
There is a detailed discussion in comp.lang.c+
The examples have "unspecified behaviour".
For a non-POD, the order by which the compiler lays out an object (all bases classes and members) is unspecified (ISO C++ 10/3). Consider the following:
struct A {
int i;
};
class B : public A { // 'B' is not a POD
public:
B ();
private:
int j;
};
This can be laid out as:
[ int i ][ int j ]
Or as:
[ int j ][ int i ]
Therefore, using memset directly on the address of 'this' is very much unspecified behaviour. One of the answers above, at first glance looks to be safer:
memset(static_cast<MY_STRUCT*>(this), 0, sizeof(MY_STRUCT));
I believe, however, that strictly speaking this too results in unspecified behaviour. I cannot find the normative text, however the note in 10/5 says: "A base class subobject may have a layout (3.7) different from the layout of a most derived object of the same type".
As a result, I compiler could perform space optimizations with the different members:
struct A {
char c1;
};
struct B {
char c2;
char c3;
char c4;
int i;
};
class C : public A, public B
{
public:
C ()
: c1 (10);
{
memset(static_cast<B*>(this), 0, sizeof(B));
}
};
Can be laid out as:
[ char c1 ] [ char c2, char c3, char c4, int i ]
On a 32 bit system, due to alighments etc. for 'B', sizeof(B) will most likely be 8 bytes. However, sizeof(C) can also be '8' bytes if the compiler packs the data members. Therefore the call to memset might overwrite the value given to 'c1'.
Precise layout of a class or structure is not guaranteed in C++, which is why you should not make assumptions about the size of it from the outside (that means if you're not a compiler).
Probably it works, until you find a compiler on which it doesn't, or you throw some vtable into the mix.
If you already have a constructor, why not just initialize it there with n1=0; n2=0; -- that's certainly the more normal way.
Edit: Actually, as paercebal has shown, ctor initialization is even better.
My opinion is no. I'm not sure what it gains either.
As your definition of CMyStruct changes and you add/delete members, this can lead to bugs. Easily.
Create a constructor for CMyStruct that takes a MyStruct has a parameter.
CMyStruct::CMyStruct(MyStruct &)
Or something of that sought. You can then initialize a public or private 'MyStruct' member.
From an ISO C++ viewpoint, there are two issues:
(1) Is the object a POD? The acronym stands for Plain Old Data, and the standard enumerates what you can't have in a POD (Wikipedia has a good summary). If it's not a POD, you can't memset it.
(2) Are there members for which all-bits-zero is invalid ? On Windows and Unix, the NULL pointer is all bits zero; it need not be. Floating point 0 has all bits zero in IEEE754, which is quite common, and on x86.
Frank Kruegers tip addresses your concerns by restricting the memset to the POD base of the non-POD class.
Try this - overload new.
EDIT: I should add - This is safe because the memory is zeroed before any constructors are called. Big flaw - only works if object is dynamically allocated.
struct MY_STRUCT
{
int n1;
int n2;
};
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct()
{
// whatever
}
void* new(size_t size)
{
// dangerous
return memset(malloc(size),0,size);
// better
if (void *p = malloc(size))
{
return (memset(p, 0, size));
}
else
{
throw bad_alloc();
}
}
void delete(void *p, size_t size)
{
free(p);
}
};
If MY_STRUCT is your code, and you are happy using a C++ compiler, you can put the constructor there without wrapping in a class:
struct MY_STRUCT
{
int n1;
int n2;
MY_STRUCT(): n1(0), n2(0) {}
};
I'm not sure about efficiency, but I hate doing tricks when you haven't proved efficiency is needed.
Comment on litb's answer (seems I'm not yet allowed to comment directly):
Even with this nice C++-style solution you have to be very careful that you don't apply this naively to a struct containing a non-POD member.
Some compilers then don't initialize correctly anymore.
See this answer to a similar question.
I personally had the bad experience on VC2008 with an additional std::string.
What I do is use aggregate initialization, but only specifying initializers for members I care about, e.g:
STARTUPINFO si = {
sizeof si, /*cb*/
0, /*lpReserved*/
0, /*lpDesktop*/
"my window" /*lpTitle*/
};
The remaining members will be initialized to zeros of the appropriate type (as in Drealmer's post). Here, you are trusting Microsoft not to gratuitously break compatibility by adding new structure members in the middle (a reasonable assumption). This solution strikes me as optimal - one statement, no classes, no memset, no assumptions about the internal representation of floating point zero or null pointers.
I think the hacks involving inheritance are horrible style. Public inheritance means IS-A to most readers. Note also that you're inheriting from a class which isn't designed to be a base. As there's no virtual destructor, clients who delete a derived class instance through a pointer to base will invoke undefined behaviour.
I assume the structure is provided to you and cannot be modified. If you can change the structure, then the obvious solution is adding a constructor.
Don't over engineer your code with C++ wrappers when all you want is a simple macro to initialise your structure.
#include <stdio.h>
#define MY_STRUCT(x) MY_STRUCT x = {0}
struct MY_STRUCT
{
int n1;
int n2;
};
int main(int argc, char *argv[])
{
MY_STRUCT(s);
printf("n1(%d),n2(%d)\n", s.n1, s.n2);
return 0;
}
It's a bit of code, but it's reusable; include it once and it should work for any POD. You can pass an instance of this class to any function expecting a MY_STRUCT, or use the GetPointer function to pass it into a function that will modify the structure.
template <typename STR>
class CStructWrapper
{
private:
STR MyStruct;
public:
CStructWrapper() { STR temp = {}; MyStruct = temp;}
CStructWrapper(const STR &myStruct) : MyStruct(myStruct) {}
operator STR &() { return MyStruct; }
operator const STR &() const { return MyStruct; }
STR *GetPointer() { return &MyStruct; }
};
CStructWrapper<MY_STRUCT> myStruct;
CStructWrapper<ANOTHER_STRUCT> anotherStruct;
This way, you don't have to worry about whether NULLs are all 0, or floating point representations. As long as STR is a simple aggregate type, things will work. When STR is not a simple aggregate type, you'll get a compile-time error, so you won't have to worry about accidentally misusing this. Also, if the type contains something more complex, as long as it has a default constructor, you're ok:
struct MY_STRUCT2
{
int n1;
std::string s1;
};
CStructWrapper<MY_STRUCT2> myStruct2; // n1 is set to 0, s1 is set to "";
On the downside, it's slower since you're making an extra temporary copy, and the compiler will assign each member to 0 individually, instead of one memset.