Excessive use of `this` in C++ [duplicate]

This question already has answers here:
When should I make explicit use of the `this` pointer?
(12 answers)
Closed 6 years ago.
I'm dealing with a large code base that uses the following construct throughout
class MyClass
{
public:
void f(int x);
private:
int x;
};
void MyClass::f(int x)
{
'
'
this->x = x;
'
'
}
Personally, I'd always used and hence prefer the form
class MyClass
{
public:
void f(int x);
private:
int _x;
};
void MyClass::f(int x)
{
'
'
_x = x;
'
'
}
The reasons I prefer the latter are that it is more succinct (less code = fewer potential bugs), and that I don't like having multiple variables of the same name in scope at the same time where I can avoid it. That said, I am seeing the former usage more and more often these days. Is there any upside to the first approach that I am unaware of (e.g. effect on compile time, use with templated code, etc.)? Are the advantages of either approach significant enough to merit a refactor to the other? The reason I ask is that, while I don't like the first approach present in the code, the amount of effort and the associated risk of introducing further bugs don't quite merit a refactor.

Your version is a bit cleaner, but while you're at it, I would:
Avoid leading underscore: _x is ok until somebody chooses _MyField which is a reserved name. An initial underscore followed by a capital letter is not allowed as a variable name. See: What are the rules about using an underscore in a C++ identifier?
Make the attribute private or protected: the change is safe if it compiles, and you'll ensure your setter will be used.
The this-> story has a use, for example in templated code to make the field name dependent on your type (can solve some lookup issues).
A small example of name resolutions which are fixed by using an explicit this-> (tested with g++ 3.4.3):
#include <cstdlib> // for EXIT_SUCCESS
#include <iostream>
#include <ostream>
class A
{
public:
int g_;
A() : g_(1) {}
const char* f() { return __FUNCTION__; }
};
const char* f() { return __FUNCTION__; }
int g_ = -1;
template < typename Base >
struct Derived : public Base
{
void print_conflicts()
{
std::cout << f() << std::endl; // Calls ::f()
std::cout << this->f() << std::endl; // Calls A::f()
std::cout << g_ << std::endl; // Prints global g_
std::cout << this->g_ << std::endl; // Prints A::g_
}
};
int main(int argc, char* argv[])
{
Derived< A >().print_conflicts();
return EXIT_SUCCESS;
}

Field naming has nothing to do with a code smell. As Neil said, field visibility is the only code smell here.
There are various articles regarding naming conventions in C++:
naming convention for public and private variable?
Private method naming convention
c++ namespace usage and naming rules
etc.

This usage of 'this' is encouraged by Microsoft C# coding standards. It gives good code clarity, and is intended as a standard alternative to using m_, _, or any other prefix on member variables.
Honestly, I really dislike underscores in names anyway. I used to prefix all my members with a single 'm'.

A lot of people use this because in their IDE it will make a list of identifiers of the current class pop up.
I know I do in BCB.
I think the example you provide with the naming conflict is an exception. In Delphi though, style guidelines use a prefix (usually "a") for parameters to avoid exactly this.

My personal feeling is that fighting an existing coding convention is something you should not do. As Sutter and Alexandrescu put it in their book 'C++ Coding Standards': don't sweat the small stuff. Anyone is able to read one or the other, whether there is a leading 'this->' or '_' or whatever.
However, consistency in naming conventions is something you typically do want, so sticking to one convention at some scope (at least file scope, ideally the entire code base, of course) is considered good practice. You mentioned that this style is used throughout a larger code base, so I think retrofitting another convention would be rather a bad idea.
If, after all, you find there is a good reason for changing it, don't do it manually. In the best case, your IDE supports this kind of 'refactoring'. Otherwise, write a script for changing it. Search & replace should be the last option. In any case, you should have a backup (source control) and some kind of automated test facility. Otherwise you won't have fun with it.

Using 'this' in this manner is IMO not a code smell, but is simply a personal preference. It is therefore not as important as consistency with the rest of the code in the system. If this code is inconsistent you could change it to match the other code. If by changing it you will introduce inconsistency with the majority of the rest of the code, that is very bad and I would leave it alone.
You don't want to ever get into a position of playing code tennis where somebody changes something purely to make it look "nice" only for somebody else to come along later with different tastes who then changes it back.

I always use the m_ naming convention. Although I dislike "Hungarian notation" in general, I find it very useful to see very clearly if I'm working with class member data. Also, I found using 2 identical variable names in the same scope too error prone.

I agree. I don't like that naming convention - I prefer one where there is an obvious distinction between member variables and local variables. What happens if you leave off the this?

class MyClass{
public:
int x;
void f(int xval);
};
//
void MyClass::f(int xval){
x = xval;
}

In my opinion this tends to add clutter to the code, so I tend to use different variable names (depending on the convention, it might be an underscore, m_, whatever).

class MyClass
{
public:
int m_x;
void f(int p_x);
};
void MyClass::f(int p_x)
{
m_x = p_x;
}
...is my preferred way using scope prefixes. m_ for member, p_ for parameter (some use a_ for argument instead), g_ for global and sometimes l_ for local if it helps readability.
If you have two variables that deserve the same name then this can help a lot and avoids having to make up some random variation on its meaning just to avoid redefinition. Or even worse, the dreaded 'x2, x3, x4, etc'...

It's more normal in C++ for members to be initialised on construction using an initialiser list.
When doing that, the usual convention is to give the constructor parameter a different name from the member variable.
So even though I'd use Foo(int x) { this.x = x; } in Java, I wouldn't in C++.
The real smell might be the lack of use of initialisers, and methods which do nothing other than mutate member variables, rather than the use of this->x itself.
Does anyone know why it's universal practice in every C++ shop I've been in to use different names for constructor arguments and member variables when using initialisers? Were there some C++ compilers which didn't support it?

I don't like using "this" because it's atavistic. If you're programming in good old C (remember C?), and you want to mimic some of the characteristics of OOP, you create a struct with several members (these are analogous to the properties of your object) and you create a set of functions which all take a pointer to that struct as their first argument (these are analogous to the methods of that object).
(I think this typedef syntax is correct but it's been a while...)
typedef struct _myclass
{
int _x;
} MyClass;
void f(MyClass *this, int x) /* 'this' must be a pointer for this->_x to work */
{
this->_x = x;
}
In fact I believe older C++ compilers would actually compile your code to the above form and then just pass it to the C compiler. In other words -- C++ to some extent was just syntactic sugar. So I'm not sure why anyone would want to program in C++ and go back to explicitly using "this" in code -- maybe it's "syntactic Nutrisweet"

Today, most IDE editors color your variables to indicate whether they are class members or local variables. Thus, IMO, neither prefixes nor 'this->' should be required for readability.

If you have a problem with naming conventions you can try to use something like the following.
class tea
{
public:
int cup;
int spoon;
tea(int cups, int spoons);
};
or
class tea
{
public:
int cup;
int spoon;
tea(int drink, int sugar);
};
I think you got the idea. It's basically naming the variables different but "same" in the logical sense. I hope it helps.

Related

Is There a Benefit or Performance Boost By Using 'this ->' to Reference Members of a Class?

I was wondering if there is an advantage of any kind to using 'this' to reference class members, rather than not using it, in C++?
for example...
class Test
{
public:
Test();
void print_test()
{
std::cout << this->m_x // Using 'this'
<< std::endl;
std::cout << m_x // Rather than referencing 'm_x' this way
<< std::endl;
}
private:
int m_x;
int m_y;
};

No, there is no performance difference. To the compiler, the meanings are identical.
Well, almost... the only time you specifically need to say this is if you have a variable of the same name in an inner scope that shadows the member variable (which is considered bad form anyway), or funny cases where you have a templated base class and you need to tell the compiler that a name refers to a base class member (this is pretty rare though).
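
The templated-base-class case mentioned above can be sketched roughly as follows. Base, Derived, value_ and check_dependent_base are hypothetical names chosen for illustration; the point is only that an unqualified name is not looked up in a base class that depends on a template parameter, so this-> (or a qualified name) is needed.

```cpp
#include <cassert>

// Hypothetical class template used as a dependent base.
template <typename T>
struct Base {
    T value_ = T();
};

template <typename T>
struct Derived : Base<T> {
    T get() const {
        // return value_;     // error under two-phase lookup: value_ is not
        //                    // looked up in the dependent base Base<T>
        return this->value_;  // this-> makes the name dependent, so lookup
                              // is deferred until instantiation
    }
    void set(T v) { this->value_ = v; }
};

// Small self-check exercising the template.
inline void check_dependent_base() {
    Derived<int> d;
    d.set(7);
    assert(d.get() == 7);
}
```

Writing Base<T>::value_ would work as well; this-> is simply the shortest spelling.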

Sometimes it's just a convention.
The basic idea is that we usually need to use this-> to avoid naming conflicts:
class Value
{
public:
Value(): value(0), valueSet(false) {}
void setValue(int value) {
//value = value; ( WRONG : Naming conflict )
this->value = value; // Use this-> to distinguish between them
valueSet = true;
}
private:
int value;
bool valueSet;
};
Now the statement valueSet = true; without this-> looks ugly. So people prefer to prefix this-> to make all things look consistent:
void setValue(int value) {
this->value = value;
this->valueSet = true; // Isn't this consistent and beautiful?
}
But to my knowledge, this pattern is not widely used in C++. If you have ever looked at Java source code, though, prefixing member field accesses with this. is very common there.
PS: Another possible reason is maybe people just want to emphasize it is a member field, not something else. Since most simple editors are not capable of highlighting such fields, it can improve code readability.

If you don't use it it will be implied. So those two writes are equivalent in the compiled code because the compiler will add the this.
The only problem that can occur is an ambiguity on a variable name. But as long as you use the convention of prefix (or postfix) member with m_ or _m you should be safe.
Also you can see When should I make explicit use of the `this` pointer?

No.. Otherwise everyone would've written code like this.
Maybe somebody thought it easy to read. Maybe he was a java developer.

this-> to reference everything

I've recently spent a lot of time with JavaScript and am now coming back to C++. When I'm accessing a class member from a method I feel inclined to prefix it with this->.
class Foo {
int _bar;
public:
/* ... */
void setBar(int bar) {
this->_bar = bar;
// as opposed to
_bar = bar;
}
};
On reading, it saves me a brain cycle when trying to figure out where it's coming from.
Are there any reasons I shouldn't do this?

Using this-> for class variables is perfectly acceptable.
However, don't start identifiers with an underscore, or include any identifiers with double underscore __ anywhere. There are some classes of reserved symbols that are easy to hit if you violate either of these two rules of thumb. (In particular, _IdentifierStartingWithACapital is reserved by the standard for compilers).

In principle, accessing members via this-> is a coding style that can help in making things clearer, but it seems to be a matter of taste.
However, you also seem to use prefixing members with _ (underscore). I would say that is too much, you should go for either of the two styles.

Are there any reasons I shouldn't do this?
Yes, there is a reason why you shouldn't do this.
Referencing a member variable with this-> is strictly required only when a name has been hidden, such as with:
class Foo
{
public:
void bang(int val);
int val;
};
void Foo::bang(int val)
{
val = val;
}
int main()
{
Foo foo;
foo.val = 42;
foo.bang(84);
std::cout << foo.val;
}
The output of this program is 42, not 84, because in bang the member variable has been hidden, and val = val results in a no-op. In this case, this-> is required:
void Foo::bang(int val)
{
this->val = val;
}
In other cases, using this-> has no effect, so it is not needed.
That, in itself, is not a reason not to use this->. The maintenance of such a program is, however, a reason not to use this->.
You are using this-> as a means of documentation to specify that the variable that follows is a member variable. However, to most programmers, that's not what using this-> actually documents. What using this-> documents is:
There is a name that's been hidden here, so I'm using a special
technique to work around that.
Since that's not what you wanted to convey, your documentation is broken.
Instead of using this-> to document that a name is a member variable, use a rational naming scheme consistently where member variables and method parameters can never be the same.
Edit Consider another illustration of the same idea.
Suppose in my codebase, you found this:
int main()
{
int(*fn)(int) = pingpong;
(fn)(42);
}
Quite an unusual construct, but being a skilled C++ programmer, you see what's happening here. fn is a pointer-to-function, and it is being assigned the value of pingpong, whatever that is. Then the function pointed to by pingpong is called with the single int value 42. So, wondering why in the world you need such a gizmo, you go looking for pingpong and find this:
static int(*pingpong)(int) = bangbang;
Ok, so what's bangbang?
int bangbang(int val)
{
std::cout << val;
return val+1;
}
"Now, wait a sec. What in the world is going on here? Why do we need to create a pointer-to-function and then call through that? Why not just call the function? Isn't this the same?"
int main()
{
bangbang(42);
}
Yes, it is the same. The observable effects are the same.
Wondering if that's really all there is to it, you see:
/* IMPLEMENTATION NOTE
*
* I use pointers-to-function to call free functions
* to document the difference between free functions
* and member functions.
*/
So the only reason we're using the pointer-to-function is to show that the function being called is a free function
and not a member function.
Does that seem like just a "matter of style" to you? Because it seems like insanity to me.

Here you will find:
Unless a class member name is hidden, using the class member name is equivalent to using the class member name with the this pointer and the class member access operator (->).

I think you do this backwards. You want the code to assure you that what happens is exactly what is expected.
Why add extra code to point out that nothing special is happening? Accessing class members in the member functions happen all the time. That's what would be expected. It would be much better to add extra info when it is not the normal things that happen.
In code like this
class Foo
{
public:
void setBar(int NewBar)
{ Bar = NewBar; }
you ask yourself - "Where could the Bar come from?".
As this is a setter in a class, what would it set if not a class member variable?! If it wasn't, then there would be a reason to add a lot of info about what's actually going on here!

Since you are already using a convention to signify that an identifer is a data member (although not one I would recommend), adding this-> is simply redundant in almost all cases.

This is obviously a somewhat subjective question. this-> seems much more Python-idiomatic than C++-idiomatic. There are only a handful of cases in C++ where the leading this-> is required, mostly dealing with names inherited from template base classes. In general, if your code is well organized it will be obvious to the reader whether a name is a member or a local variable (globals should just be avoided), and reducing the amount to be read may reduce complexity. Additionally, you can use an optional style (I like a trailing _) to indicate member variables.

It doesn't actually harm anything, but programmers experienced with OO will see it and find it odd. It's similarly surprising to see "yoda conditionals," ie if (0 == x).

Conventions for accessor methods (getters and setters) in C++

Several questions about accessor methods in C++ have been asked on SO, but none was able to satisfy my curiosity on the issue.
I try to avoid accessors whenever possible, because, like Stroustrup and other famous programmers, I consider a class with many of them a sign of bad OO. In C++, I can in most cases add more responsibility to a class or use the friend keyword to avoid them. Yet in some cases, you really need access to specific class members.
There are several possibilities:
1. Don't use accessors at all
We can just make the respective member variables public. This is a no-go in Java, but seems to be OK with the C++ community. However, I'm a bit worried about cases where an explicit copy or a read-only (const) reference to an object should be returned. Is that exaggerated?
2. Use Java-style get/set methods
I'm not sure if it's from Java at all, but I mean this:
int getAmount(); // Returns the amount
void setAmount(int amount); // Sets the amount
3. Use Objective-C-style get/set methods
This is a bit weird, but apparently increasingly common:
int amount(); // Returns the amount
void amount(int amount); // Sets the amount
In order for that to work, you will have to find a different name for your member variable. Some people append an underscore, others prepend "m_". I don't like either.
Which style do you use and why?

Speaking as someone who maintains 4 million lines of C++ code (and that's just one project), I would say:
It's ok to not use getters/setters if members are immutable (i.e. const) or simple with no dependencies (like a point class with members X and Y).
If member is private only it's also ok to skip getters/setters. I also count members of internal pimpl-classes as private if the .cpp unit is smallish.
If member is public or protected (protected is just as bad as public) and non-const, non-simple or has dependencies then use getters/setters.
As a maintenance guy my main reason for wanting to have getters/setters is because then I have a place to put break points / logging / something else.
I prefer the style of alternative 2. as that's more searchable (a key component in writing maintainable code).

2) is the best IMO, because it makes your intentions clearest. set_amount(10) is more meaningful than amount(10), and as a nice side effect allows a member named amount.
Public variables are usually a bad idea, because there's no encapsulation. Suppose you need to update a cache or refresh a window when a variable is updated? Too bad if your variables are public. If you have a set method, you can add it there.
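
As a rough sketch of that last point, here is a setter that also refreshes a cached value. Rect, set_width, area_ and check_rect are hypothetical names, and the cached area is only a stand-in for whatever needs updating (a cache, a window refresh, logging).

```cpp
#include <cassert>

// Hypothetical class: the setter is the single place where dependent
// state (here a cached area) gets refreshed.
class Rect {
public:
    Rect(int width, int height)
        : width_(width), height_(height), area_(width * height) {}

    void set_width(int width) {
        width_ = width;
        area_ = width_ * height_;  // refresh the cache alongside the write
    }
    int width() const { return width_; }
    int area() const { return area_; }  // served from the cache

private:
    int width_;
    int height_;
    int area_;
};

// Small self-check.
inline void check_rect() {
    Rect r(2, 3);
    r.set_width(5);
    assert(r.area() == 15);
}
```

With a public width member there would be no place to hook the refresh, so area_ could silently go stale.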

1. I never use this style, because it can limit the future of your class design, and explicit getters or setters are just as efficient with a good compiler.
Of course, in reality inline explicit getters or setters create just as much underlying dependency on the class implementation. They just reduce semantic dependency. You still have to recompile everything if you change them.
2. This is my default style when I use accessor methods.
3. This style seems too 'clever' to me. I do use it on rare occasions, but only in cases where I really want the accessor to feel as much as possible like a variable.
I do think there is a case for simple bags of variables with possibly a constructor to make sure they're all initialized to something sane. When I do this, I simply make it a struct and leave it all public.

1. That is a good style if we just want to represent pure data.
2. I don't like it :) because get_/set_ is really unnecessary when we can overload them in C++.
3. The STL uses this style, such as std::stringstream::str and std::ios_base::flags, except when it should be avoided. When? When the method's name conflicts with another type's name, the get_/set_ style is used instead, such as std::string::get_allocator because of std::allocator.
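
To make the overloaded-accessor style concrete, here is a small sketch using std::ostringstream, whose str member function has both a getter and a setter overload. The helper name overloaded_accessor_demo is hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper showing the two overloads of str().
inline std::string overloaded_accessor_demo() {
    std::ostringstream out;
    out.str("abc");            // setter overload: replaces the buffer contents
    assert(out.str() == "abc");
    return out.str();          // getter overload: returns a copy of the buffer
}
```

The same name serves both purposes, and the argument list tells the reader which one is meant.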

In general, I feel that it is not a good idea to have too many getters and setters being used by too many entities in the system. It is just an indication of a bad design or wrong encapsulation.
Having said that, if such a design needs to be refactored, and the source code is available, I would prefer to use the Visitor Design pattern. The reason is:
a. It gives a class an opportunity to decide whom to allow access to its private state
b. It gives a class an opportunity to decide what access to allow to each of the entities who are interested in its private state
c. It clearly documents such external access via a clear class interface
Basic idea is:
a) Redesign if possible, else
b) Refactor such that
All access to class state is via a well known individualistic interface
It should be possible to configure some kind of do's and don'ts for each such interface, e.g. all access from external entity GOOD should be allowed, all access from external entity BAD should be disallowed, and external entity OK should be allowed to get but not set (for example)

I would not exclude accessors from use. Maybe for some POD structures, but in general I consider them a good thing (some accessors might have additional logic, too).
The naming convention doesn't really matter, as long as you are consistent in your code. If you are using several third-party libraries, they might use different naming conventions anyway. So it is a matter of taste.

I've seen the idealization of classes instead of integral types to refer to meaningful data.
Something like this below is generally not making good use of C++ properties:
struct particle {
float mass;
float acceleration;
float velocity;
} p;
Why? Because the result of p.mass*p.acceleration is a float and not force as expected.
The definition of classes to designate a purpose (even if it's a value, like amount mentioned earlier) makes more sense, and allow us to do something like:
struct amount
{
int value;
amount() : value( 0 ) {}
amount( int value0 ) : value( value0 ) {}
operator int&() { return value; } // mutable access to the underlying int
operator int() const { return value; } // read-only access
amount& operator = ( int const newvalue )
{
value = newvalue;
return *this;
}
};
You can access the value in amount implicitly by the operator int. Furthermore:
struct wage
{
amount balance;
operator amount&() { return balance; } // lets (amount&)worker refer to balance
operator amount const&() const { return balance; }
wage& operator = ( amount const& newbalance )
{
balance = newbalance;
return *this;
}
};
Getter/Setter usage:
void wage_test()
{
wage worker;
(amount&)worker = 100; // if you like this, can remove = operator
worker = amount(105); // an alternative if the first one is too weird
int value = (amount)worker; // getting amount is more clear
}
This is a different approach, doesn't mean it's good or bad, but different.

An additional possibility could be :
int& amount();
I'm not sure I would recommend it, but it has the advantage that the unusual notation can deter users from modifying data.
str.length() = 5; // Ok string is a very bad example :)
Sometimes it is maybe just the good choice to make:
image(point) = 255;
Another possibility again, use functional notation to modify the object.
edit::change_amount(obj, val)
This way, dangerous/editing functions can be pulled away into a separate namespace with its own documentation. This one seems to come naturally with generic programming.
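
A minimal sketch of the reference-returning accessor, in the spirit of the image(point) = 255 line above. The Image class, its storage layout and check_image are assumptions made for illustration; the (x, y) pair stands in for the point argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical greyscale image; operator() returns a reference so that
// `img(x, y) = 255;` writes straight into the pixel buffer.
class Image {
public:
    Image(std::size_t width, std::size_t height)
        : width_(width), pixels_(width * height, 0) {}

    // Non-const overload: usable on the left-hand side of an assignment.
    unsigned char& operator()(std::size_t x, std::size_t y) {
        return pixels_[y * width_ + x];
    }
    // Const overload: read-only access returns a copy.
    unsigned char operator()(std::size_t x, std::size_t y) const {
        return pixels_[y * width_ + x];
    }

private:
    std::size_t width_;
    std::vector<unsigned char> pixels_;
};

// Small self-check.
inline void check_image() {
    Image img(4, 3);
    img(1, 2) = 255;
    assert(img(1, 2) == 255);
}
```

Note that handing out a reference gives up any chance of validating or reacting to the write, which is the trade-off against a conventional setter.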

Let me tell you about one additional possibility, which seems the most concise.
Need to read & modify
Simply declare that variable public:
class Worker {
public:
int wage = 5000;
};
worker.wage = 8000;
cout << worker.wage << endl;
Need just to read
class Worker {
int _wage = 5000;
public:
int wage() const { // read-only accessor; does not modify the object
return _wage;
}
};
worker.wage = 8000; // error !!
cout << worker.wage() << endl;
The downside of this approach is that you need to change all the calling code (add parentheses, that is) when you want to change the access pattern.

variation on #3, i'm told this could be 'fluent' style
class foo {
private: int bar_; // a data member cannot share its name with a member function
private: int narf_;
public: foo & bar(int);
public: int bar();
public: foo & narf(int);
public: int narf();
};
//multi set (get is as expected)
foo f; f.bar(2).narf(3);
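
A fuller sketch of the fluent variant, with the accessors implemented. Since a data member and a member function cannot share a name within one class, the members are called bar_ and narf_ here; those names, the class name Foo and check_fluent are hypothetical.

```cpp
#include <cassert>

// Hypothetical fluent-style class: each setter returns *this so that
// calls can be chained.
class Foo {
public:
    Foo& bar(int v) { bar_ = v; return *this; }    // setter
    int bar() const { return bar_; }               // getter
    Foo& narf(int v) { narf_ = v; return *this; }  // setter
    int narf() const { return narf_; }             // getter

private:
    int bar_ = 0;
    int narf_ = 0;
};

// Small self-check using a chained call.
inline void check_fluent() {
    Foo f;
    f.bar(2).narf(3);
    assert(f.bar() == 2);
    assert(f.narf() == 3);
}
```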

Is this ambiguous or is it perfectly fine?

Is this code ambiguous or is it perfectly fine (approved by standards/has consistent behavior for any compilers in existence)?
struct SCustomData {
int nCode;
int nSum;
int nIndex;
SCustomData(int nCode, int nSum, int nIndex)
: nCode(nCode)
, nSum(nSum)
, nIndex(nIndex)
{}
};
edit:
yes, I am referring to the fact that the member variables have the same name with formal parameters of the constructor.

No, in this case there is no ambiguity, but consider the following:
struct SCustomData {
//...
void SetCode(int nCode)
{
//OOPS!!! Here we do nothing!
//nCode = nCode;
//This works but still error prone
this->nCode = nCode;
}
};
You should pay attention to one of the existing coding styles, for instance the General Naming Rules in the Google C++ Style Guide, or read the excellent book "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices" by Herb Sutter and Andrei Alexandrescu.

Your example is unambiguous (to me), but it's not good practise, because it can quickly become as ambiguous as hell.
It's a long while since I've written any C++ in anger, so I'm guessing what the following will do.
Do you KNOW what it will do? Are you sure?
class Noddy
{
int* i;
Noddy(int *i)
: i(i)
{
if(i == NULL)
i = new int;
}
};

If you're referring to using the same name for members and constructor arguments, then that's absolutely fine. However, you might find some people who insist that it's bad practice for some reason.
If you need to access the members in the constructor body, then you need to be careful - either give the arguments different names, or use this-> to access members.
If you're referring to using pseudoHungarian warts to remind people that integers are integers, then that is technically allowed, but has absolutely no benefits and makes the code much harder to read. Please don't do it.
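
To illustrate the constructor-body caveat above, here is a small sketch. Counter, count and check_counter are hypothetical names. In the initialiser list, count(count) initialises the member from the parameter; inside the body, a plain count names the parameter only.

```cpp
#include <cassert>

// Hypothetical class whose constructor parameter shares the member's name.
class Counter {
public:
    explicit Counter(int count) : count(count) {  // member(parameter): fine
        if (count < 0) {
            // `count = 0;` here would only change the parameter.
            this->count = 0;  // this-> is needed to reach the member
        }
    }
    int value() const { return count; }

private:
    int count;
};

// Small self-check.
inline void check_counter() {
    assert(Counter(5).value() == 5);
    assert(Counter(-3).value() == 0);
}
```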

In general, I've prefixed instance variables with underscores and named parameters in the constructor without any prefixes. At the very least, this will disambiguate your parameters from your instance variables. It also makes life less hectic if initializing within the body of the constructor.
struct SCustomData {
int _nCode;
int _nSum;
int _nIndex;
SCustomData(int nCode, int nSum, int nIndex)
: _nCode(nCode), _nSum(nSum), _nIndex(nIndex)
{}
};
// Alternatively
struct SCustomData {
int _nCode;
SCustomData(int nCode)
{
this->_nCode = nCode;
}
};
I don't like stacking the variables the way it was written in the question. I like to save vertical space, and it's also easier for me to read left-to-right. (This is a personal preference of mine, not a mandatory rule or anything like that.)

I would say that this is perfectly fine.
It is my preferred style for constructors that use the initialization list and don't have any code. I think that it reduces confusion because it is obvious which constructor parameter goes to which member.

It is perfectly standard compliant, but there are compilers out there that would not accept member variables having the same name as constructor parameters. In fact, I had to change my open source library for that reason. See this patch

Way to link 2 variables in a class in C++

Say I wanted to have one variable in a class always be in some relation to another without changing the "linked" variable explicitly.
For example: int foo is always 10 less than int bar.
Making it so that if I changed bar, foo would be changed as well. Is there a way to do this? (Integer overflow isn't really possible so don't worry about it.)
Example: (Obviously doesn't work, but general code for an understanding)
class A
{
int x;
int y = x - 10; // Whenever x is changed, y will become 10 less than x
};

No, you can't do that. Your best option for doing this is to use accessor and mutator member functions:
int getFoo()
{
return foo_;
}
void setFoo(int newFoo)
{
foo_ = newFoo;
}
int getBar()
{
return foo_ + 10;
}
void setBar(int newBar)
{
foo_ = newBar - 10;
}

This is called an invariant. It is a relationship that shall hold, but cannot be enforced by the means provided by the programming language. Invariants should only be introduced when they are really necessary. In a way they are a relatively "bad" thing, since they are something that can be inadvertently broken. So, the first question you have to ask yourself is whether you really have to introduce that invariant. Maybe you can do without two variables in this case, and can just generate the second value from the first variable on the fly, just like James suggested in his answer.
But if you really need two variables (and very often there's no way around it), you'll end up with an invariant. Of course, it is possible to manually implement something in C++ that would effectively link the variables together and change one when the other changes, but most of the time it is not worth the effort. The best thing you can do, if you really need two variables, again, is to be careful to keep the required relationship manually and use lots of assertions that would verify the invariant whenever it can break (and sometimes even when it can't), like
assert(y == x - 10);
in your case.
Also, I'd expect some advanced third-party C++ libraries (like, Boost, for example) to provide some high level assertion tools that can be custom-programmed to watch over invariants in the code (I can't suggest any though), i.e. you can make the language work for you here, but it has to be a library solution. The core language won't help you here.
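
A minimal sketch of keeping the relationship manually and asserting it after every change. The class name Linked, its setters and the private check helper are hypothetical; only the standard assert macro is used, not any third-party library.

```cpp
#include <cassert>

// Hypothetical class that maintains the invariant y == x - 10 by hand
// and verifies it after every mutation.
class Linked {
public:
    explicit Linked(int x) : x_(x), y_(x - 10) { check(); }

    void set_x(int x) { x_ = x; y_ = x - 10; check(); }
    void set_y(int y) { y_ = y; x_ = y + 10; check(); }

    int x() const { return x_; }
    int y() const { return y_; }

private:
    void check() const { assert(y_ == x_ - 10); }  // the invariant

    int x_;
    int y_;
};
```

Because both members are private and every mutation goes through a setter, the assertion can only fire if a future change to the class itself breaks the relationship.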

You could create a new structure which contains both variables and overload the operators you wish to use. Similar to James McNellis' answer above, but allowing you to have it "automatically" happen whenever you operate on the variable in question.
class DualStateDouble
{
public:
DualStateDouble(double &pv1, double &pv2) : m_pv1(&pv1), m_pv2(&pv2) {}
// overload all operators needed to maintain the relationship
// operations on this double automatically affect both values
private:
double *m_pv1;
double *m_pv2;
};
