I am new to C++ and get confused about what goes on under the hood when a class method returns a reference to a member variable that is raw data (rather than a pointer or a reference). Here's an example:
#include <iostream>
using namespace std;
struct Dog {
int age;
};
class Wrapper {
public:
Dog myDog;
Dog& operator*() { return myDog; }
Dog* operator->() { return &myDog; }
};
int main() {
auto w = Wrapper();
// Method 1
w.myDog.age = 1;
cout << w.myDog.age << "\n";
// Method 2
(*w).age = 2;
cout << w.myDog.age << "\n";
// Method 3
w->age = 3;
cout << w.myDog.age << "\n";
}
My question is: what happens at runtime when the code reads (*w) or w-> (as in the main function)? Does it compute the address of the myDog field every time it sees (*w) or w->? Is there overhead to either of these two access methods compared to accessing myDog directly?
Thanks!
Technically, what you are asking is entirely system/compiler-specific. As a practical matter, a pointer and a reference are identical in implementation.
No rational compiler is going to treat
(*x).y
and
x->y
differently. Under the covers both usually appear in assembly language as something like
y(Rn)
Where Rn is a register holding the address of x and y is the offset of y into the structure.
The problem is that C++ is built upon C which in turn is the most f*&*) *p programming language ever devised. The reference construct is a workaround for C's inept method of passing parameters.
Let's look at the following C++ code:
#include <iostream>
int main()
{
int z = 2;
class A {
public:
const int & x;
A(const int & x) : x(x) {}
void show(){
std::cout << "x=" << this->x << std::endl ;
}
} a(z);
a.show();
z = 3;
a.show();
}
The program prints: 2 and 3
It clearly shows that while x can't be modified inside class A, that merely means it is read-only, because I can still change its value from outside.
Of course I could store a copy inside class A, but I'm wondering whether there is a way (or a proposal for one) to tell class A that the member x is truly constant rather than merely read-only, in the sense of a promise that external code won't change it.
To my eyes it looks related to the meaning of the C restrict keyword, but I haven't heard of any such C++ feature yet. Have you?
Constness is an attribute of the actual variable.
The term const int& x simply means "x is a reference to an int which it will not modify" and of course the compiler enforces this.
If you want the actual variable to which x refers to be const, simply declare it so:
#include <iostream>
int main()
{
const int z = 2; // declared const. Nothing may ever modify it
class A {
public:
const int & x;
A(const int & x) : x(x) {}
void show(){
std::cout << "x=" << this->x << std::endl ;
}
} a(z);
a.show();
z = 3; // this is a logic error, caught by the compiler.
a.show();
}
compiling correctly produces the error:
./const.cpp:41:7: error: read-only variable is not assignable
z = 3;
~ ^
1 error generated.
You're looking for D's immutable keyword, which was introduced as a new concept in that language precisely because, unfortunately, the answer is no: it does not exist in C++.
Constness in C++ does not mean immutability, but that the variable in question is read-only. It can still be modified by other parts of the program. I understand your question as to whether it's possible to enforce true immutability in a called function without knowing what the caller is doing.
Of course you can create a template wrapper class which accomplishes the task:
#include <utility> // for std::forward
template <typename T>
class Immutable
{
public:
template <typename ...Args>
Immutable( Args&&...args )
: x( std::forward<Args>(args)... )
{}
operator const T &() const
{
return x;
}
private:
const T x;
};
As long as you do not reinterpret_cast or const_cast you will have truly immutable objects when you wrap them with Immutable<T>.
However, if you have a constant reference to some object, there is no way to tell whether some other part of the program has non-constant access to it. In fact, the underlying object might be a global or static variable to which you have read-only access, but functions you call might still modify it.
This cannot happen with an Immutable<T> object. However, using Immutable<T> might impose an extra copy operation on you. You need to judge for yourself whether you can live with that and whether the cost justifies the gain.
Having a function require a const Immutable<Something> & instead of a const Something & as an argument affects the calling code. A copy operation might be triggered. Alternatively, you can ask for an Immutable<Something> & without the const. Then no accidental copies will be triggered, but the calling code must pass a reference to an Immutable<Something> object. And rightly so, because if the caller received a const & as an argument, then the caller does not know whether the object might get modified by someone else in the program. The caller has to create the object itself or require an immutable object to be passed to it as a reference.
Your original question
Here's your original problem with Immutable<int> & instead of const int &.
#include <iostream>
int main()
{
Immutable<int> z = 2;
class A {
public:
const Immutable<int> & x;
A(Immutable<int> & x) : x(x) {}
void show(){
std::cout << "x=" << this->x << std::endl ;
}
} a(z);
a.show();
//z = 3; // this would fail
a.show();
}
Another example
Here's how it works: If you write
void printAndIncrementAndPrint( int & i1, const int & i2 )
{
std::cout << i2 << std::endl;
++i1;
std::cout << i2 << std::endl;
}
int main()
{
int i = 0;
printAndIncrementAndPrint( i, i );
}
then it will print
0
1
into the console. If you replace the second argument of printAndIncrementAndPrint() with const Immutable<int> & i2 and keep the rest the same, then a copy will be triggered and it will print
0
0
to the console. You cannot pass an Immutable<int> and an int & referring to the same underlying data to the function without breaking the type system using const_cast or reinterpret_cast.
I think this is a design problem for the programmer, not the language. A const variable means that no user of that variable should change its value, and the compiler is smart enough to help us make sure of that. A is a user of z, so if you want A to know that A::x refers to a const variable, then you should make z a const int. The const reference only expresses the contract between the user and the provider.
So I read about Plain Old Data (POD) classes and decided to make my structs POD to hold data. For example, I have
struct MyClass {
int ID;
int age;
double height;
char Name[8];
};
Obviously, to assign values to the struct, I can do this:
MyClass c;
c.ID = 1;
c.age = 20;
...
But is there any way to assign raw data WITHOUT knowing the name of each field?
For example, my program retrieves a field value for each column, and I want to assign that value to the struct without knowing the names of the fields:
MyClass c;
while (MoreColumns()) {
doSomething( c , GetNextColumn() ); //GetNextColumn() returns some value of POD types
}
I'm assuming there's a way to do this using memcpy or something like std::copy, but I'm not sure how to start.
Sorry if the question is a bit unclear.
You can use aggregate initialization:
MyClass c1 = { 1, 20, 6.0, "Bob" };
MyClass c2;
c2 = MyClass{ 2, 22, 5.5, "Alice" };
There is no general way to loop over the members of a struct or class. There are some tricks to add data and functions to emulate that sort of thing, but they all require additional setup work beyond just declaring the type.
Since MyClass is an aggregate, you can use a brace-initializer to initialize all fields in one call, without naming any of them:
MyClass m {
1,
2,
42.0,
{ "Joseph" }
};
However, given your description, maybe a POD is not a good idea, and you might want to design a class with accessors to set internal fields based on (for example) index columns.
Maybe boost::fusion can help you with what you want to achieve.
You can use the adapt macro to iterate over a struct.
From the example of boost:
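#include <iostream>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/for_each.hpp>
struct MyClass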
{
int ID;
int age;
double height;
};
BOOST_FUSION_ADAPT_STRUCT(
MyClass,
(int, ID)
(int, age)
(double, height)
)
void fillData(int& i)
{
i = 0;
}
void fillData(double& d)
{
d = 99;
}
struct MoreColumns
{
template<typename T>
void operator()(T& t) const
{
fillData(t);
}
};
int main()
{
struct MyClass m = { 33, 5, 2.0 };
std::cout << m.ID << std::endl;
std::cout << m.age << std::endl;
std::cout << m.height << std::endl;
MoreColumns c;
boost::fusion::for_each(m, c);
std::cout << m.ID << std::endl;
std::cout << m.age << std::endl;
std::cout << m.height << std::endl;
}
What you are trying to achieve usually leads to hard-to-read or even unreadable code. However, assuming that you have a genuinely good reason to try to assign (as opposed to initialize) raw data to a field without knowing its name, you could use reinterpret_cast as below. I don't recommend it, but just want to point out that you have the option.
#include <cstdio>
#include <cstring>
struct Target { // This is your "target"
char foo[8];
};
struct Trap {
// The "trap" which lets you manipulate your target
// without addressing its internals directly.
// Assuming here that an unsigned occupies 4 bytes (this does not always hold)
unsigned i1, i2;
};
int main() {
Target t;
strcpy(t.foo, "AAAAAAA");
// Ask the compiler to "reinterpret" Target* as Trap*
Trap* tr = reinterpret_cast<Trap*>(&t);
fprintf(stdout, "Before: %s\n", t.foo);
printf("%x %x\n", tr->i1, tr->i2);
// Now manipulate as you please
// Note the byte ordering issue in i2.
// on another architecture, you might have to use 0x42424200
tr->i1 = 0x42424242;
tr->i2 = 0x00424242;
printf("After: %s\n", t.foo);
return 0;
}
This is just a quick example I came up with, you can figure out how to make it "neater". Note that in the above, you could also access target iteratively, by using an array in "Trap" instead of i1, i2 as I have done above.
Let me reiterate, I don't recommend this style, but if you absolutely must do it, this is an option you could explore.
I was wondering whether assert( this != nullptr ); was a good idea in member functions, and someone pointed out that it wouldn’t work if an offset had been added to the value of this. In that case, instead of being 0, it would be something like 40, making the assert useless.
When does this happen though?
Multiple inheritance can cause an offset, skipping the extra v-table pointers in the object. The generic name is "this pointer adjustor thunking".
But you are helping too much. Null pointer dereferences are very common bugs, and the operating system already has an assert built in for you: your program will stop with a segfault or access violation. The diagnostic you get from the debugger is usually good enough to tell you that the object pointer is null, because you'll see a very low address. That works not just for an exactly null this, but for MI cases as well.
this adjustment can happen only in classes that use multiple-inheritance. Here's a program that illustrates this:
#include <iostream>
using namespace std;
struct A {
int n;
void af() { cout << "this=" << this << endl; }
};
struct B {
int m;
void bf() { cout << "this=" << this << endl; }
};
struct C : A,B {
};
int main(int argc, char** argv) {
C* c = NULL;
c->af();
c->bf();
return 0;
}
When I run this program I get this output:
this=0
this=0x4
That is: your assert this != nullptr will not catch the invocation of c->bf() where c is nullptr because the this of the B sub-object inside the C object is shifted by four bytes (due to the A sub-object).
Let's try to illustrate the layout of a C object:
0: | n |
4: | m |
The numbers on the left-hand side are offsets from the object's beginning. So, at offset 0 we have the A sub-object (with its data member n), and at offset 4 we have the B sub-object (with its data member m).
The this of the entire object and the this of the A sub-object both point at offset 0. However, when we want to refer to the B sub-object (when invoking a method defined by B), the this value needs to be adjusted so that it points at the beginning of the B sub-object. Hence the +4.
Note this is UB anyway.
Multiple inheritance can introduce an offset, depending on the implementation:
#include <iostream>
struct wup
{
int i;
void foo()
{
std::cout << (void*)this << std::endl;
}
};
struct dup
{
int j;
void bar()
{
std::cout << (void*)this << std::endl;
}
};
struct s : wup, dup
{
void foobar()
{
foo();
bar();
}
};
int main()
{
s* p = nullptr;
p->foobar();
}
Output on some version of clang++:
0
0x4
Also note, as I pointed out in the comments to the OP, that this assert might not work for virtual function calls, as the vtable isn't initialized (if the compiler does a dynamic dispatch, i.e. doesn't optimize because it knows the dynamic type of *p).
Here is a situation where it might happen:
#include <cassert>
struct A {
void f()
{
// this assert will probably not fail
assert(this!=nullptr);
}
};
struct B {
A a1;
A a2;
};
static void g(B *bp)
{
bp->a2.f(); // undefined behavior at this point, but many compilers will
// treat bp as a pointer to address zero and add sizeof(A) to
// the address and pass it as the this pointer to A::f().
}
int main(int,char**)
{
g(nullptr); // oops passed null!
}
This is undefined behavior for C++ in general, but with some compilers, it might have the
consistent behavior of the this pointer having some small non-zero address inside A::f().
Compilers typically implement multiple inheritance by storing the base objects sequentially in memory. If you had, e.g.:
struct bar {
int x;
int something();
};
struct baz {
int y;
int some_other_thing();
};
struct foo : public bar, public baz {};
The compiler will allocate foo and bar at the same address, and baz will be offset by sizeof(bar). So, under some implementation, it's possible that nullptr -> some_other_thing() results in a non-null this.
This example at Coliru demonstrates the situation (assuming the result you get from the undefined behavior is the same one I got) and shows an assert(this != nullptr) failing to detect the case. (Credit to @DyP, from whom I basically stole the example code.)
I don't think adding the assert is a bad idea. At the very least it catches cases like the one below:
#include <iostream>
class Test {
public:
    void DoSomething() {
        std::cout << "Hello";
    }
};
int main() {
    Test* p = 0;       // null pointer
    p->DoSomething();  // undefined behavior, yet it often appears to work
}
The above example will usually run without any error, because DoSomething() never touches a member. In more complex code, such a bug becomes difficult to debug if the assert is absent.
My point is that a null this pointer can go unnoticed and becomes difficult to debug in complex situations. I have faced this situation myself.
Is there a (more or less, at least) standard int class for C++?
If not, is one planned for, say, C++13, and if not, are there any special reasons?
OOP design would benefit from it I guess, like for example it would be nice to have an assignment operator in a custom class that returns an int:
int i=myclass;
and not
int i=myclass.getInt();
OK, there are a lot of examples where it could be useful, why doesn't it exist (if it doesn't)?
It is for dead reckoning and other lag-compensating schemes and treating those values as 'normal' variables will be nice, hopefully anyway!.
it would be nice to have an assignment operator in a custom class that returns an int
You can do that with a conversion operator:
class myclass {
int i;
public:
myclass() : i(42) {}
// Allows implicit conversion to "int".
operator int() {return i;}
};
myclass m;
int i = m;
You should usually avoid this, as the extra implicit conversions can introduce ambiguities, or hide category errors that would otherwise be caught by the type system. In C++11, you can prevent implicit conversion by declaring the operator explicit; then the class can be used to initialise the target type, but won't be converted implicitly:
int i(m); // OK, explicit conversion
i = m; // Error, implicit conversion
If you want to allow your class to implicitly convert to int, you can use an implicit conversion operator (operator int()), but generally speaking implicit conversions cause more problems and debugging than they solve in ease of use.
If your class models an int, then the conversion operator solution presented by other answers is fine, I guess. However, what does your myclass model?
What does it mean to get an integer out of it?
That's what you should be thinking about, and then you should come to the conclusion that it's most likely meaningless to get an integer without any information what it represents.
Take std::vector<T>::size() as an example. It returns an integer. Should std::vector<T> be convertible to an integer for that reason? I don't think so. Should the method be called getInt()? Again, I don't think so. What do you expect from a method called getInt()? From the name alone, you learn nothing about what it returns. Also, it's not the only method that returns an integer, there's capacity() too.
Implement operator int () for your class
This can be realized by the cast operator. E.g:
class MyClass {
private:
int someint;
public:
operator const int() {
return this->someint;
}
};
No there isn't any standard int class. For things such as BigDecimal you can look at Is there a C++ equivalent to Java's BigDecimal?
As for int, if you really need it, you can create your own. I have never come across an instance where I needed an Integer class.
No, and there won't be any. What you want to do can be done with conversion operator:
#include <iostream>
struct foo {
int x;
foo(int x) : x(x) {}
operator int() { return x; }
};
int main() {
foo x(42);
int y(x);
std::cout << y;
}
No, and there probably won't be.
int i=myclass;
This is covered by conversion operators:
struct MyClass {
operator int() {
return v;
}
int v;
} myclass = {2};
int i = myclass; // i = 2
Not everything has to be 'object oriented'. C++ offers other options.
There are obvious reasons to have a class for int, because int by itself does not allow for the absence of any value. Take for instance a JSON message. It can contain the definition for an object named “foo”, and an integer named “bar”, for example:
{"foo": {"bar": 0}}
Which has the meaning that “bar” is equal to 0 (zero), but if you omit “bar”, like this:
{"foo": {}}
Now it takes on the meaning that “bar” is non-existent, which is a completely different meaning and cannot be represented by int alone. In the old days, if this situation arose, some programmers would use a separate flag, or use a specific integer value to signify that the value was not supplied, or undefined, or non-existent. But whatever you call it, a better way is to have a class for integer which defines the functionality and makes it reusable and consistent.
Another case would be a database table that has an integer column added some time after its creation. Records that were added before the new column existed will return null, meaning no value present, and records added after the column’s creation will return a value. You may need to take a different action for a null value vs. 0 (zero).
So here's the beginnings of what a class for int or string might look like. But before we get to the code, let's look at the usage as that is why you would create the class in the first place, to make your life easier in the long run.
int main(int argc, char **argv) {
xString name;
xInt age;
std::cout<< "before assignment:" << std::endl;
std::cout<< "name is " << name << std::endl;
std::cout<< "age is " << age << std::endl;
// some data collection/transfer occurs
age = 32;
name = "john";
// data validation
if (name.isNull()) {
throw std::runtime_error("name was not supplied");
}
if (age.isNull()) {
throw std::runtime_error("age was not supplied");
}
// data output
std::cout<< std::endl;
std::cout<< "after assignment:" << std::endl;
std::cout<< "name is " << name << std::endl;
std::cout<< "age is " << age << std::endl;
return 0;
}
Here is the sample output from the program:
before assignment:
name is null
age is null
after assignment:
name is john
age is 32
Note that when the instance of the xInt class has not been assigned a value, the << operator automatically prints "null" instead of zero, and the same applies to xString for name. What you do here is totally up to you. For instance, you might decide to print nothing instead of printing “null”. Also, for the sake of brevity, I've hard coded the assignments. In the real world, you would be gathering/parsing data from a file or client connection, where that process would either set (or not set) the data values according to what is found in the input data. And of course, this program won't actually ever throw the runtime exceptions, but I put them there to give you a flavor of how you might throw the errors. So, one might say, well, why don't you just throw the exception in your data collection process? Well, the answer to that is, with the eXtended class variables (xInt & xString), we can write a generic, reusable, data gathering process and then just examine the data that is returned in our business logic where we can then throw appropriate errors based on what we find.
Ok, so here's the class code to go with the above main method:
#include <iostream>
#include <stdexcept>
#include <string>
class xInt {
private:
int _value=0;
bool _isNull=true;
public:
xInt(){}
xInt(int value) {
_value=value;
_isNull=false;
}
bool isNull(){return _isNull;}
int value() {return _value;}
void unset() {
_value=0;
_isNull=true;
}
friend std::ostream& operator<<(std::ostream& os, const xInt& i) {
if (i._isNull) {
os << "null";
} else {
os << i._value;
}
return os;
}
xInt& operator=(int value) {
_value=value;
_isNull=false;
return *this;
}
operator const int() {
return _value;
}
};
class xString {
private:
std::string _value;
bool _isNull=true;
public:
xString(){}
xString(std::string value) {
_value=value;
_isNull=false;
}
bool isNull() {return _isNull;}
std::string value() {return _value;}
void unset() {
_value.clear();
_isNull=true;
}
friend std::ostream& operator<<(std::ostream& os, const xString& str) {
if (str._isNull) {
os << "null";
} else {
os << str._value;
}
return os;
}
xString& operator<<(std::ostream& os) {
os << _value;
return *this;
}
xString& operator=(std::string value) {
_value.assign(value);
_isNull=false;
return *this;
}
operator const std::string() {
return _value;
}
};
Some might say, wow, that's pretty ugly compared to just saying int or string, and yes, I agree that it's pretty wordy, but remember, you only write the base class once, and then from then on, your code that you're reading and writing every day would look more like the main method that we first looked at, and that is very concise and to the point, agreed? Next you'll want to learn how to build shared libraries so you can put all these generic classes and functionality into a re-usable .dll or .so so that you're only compiling the business logic, not the entire universe. :)
There's no reason to have one, and so there won't be any.
A conversion operator should accomplish this.
An example:
class MyClass {
private:
int someint;
public:
operator const int() {
return this->someint;
}
};