The large majority of my programming knowledge is self-taught, so I was never taught proper design patterns, conventions, and so on and so forth.
I've been digging through a lot of my company's software libraries lately, and I notice that a lot of class data members have underscores in their names.
For example:
class Image
{
// various things
// data members
char* _data;
ImageSettings* _imageSettings;
// (and so on...)
};
I see this in a lot of online example code as well. Is there a reason behind this convention? Sorry I couldn't provide better examples, I'm really trying to remember off the top of my head, but I see it a lot.
I am aware of Hungarian notation, but I am trying to get a handle on all of the other conventions used for C++ OOP programming.
It is simply intended to make it clear which variables are members and which are not. There is no deeper meaning. There is less possibility for confusion when implementing member functions, and you are able to choose more succinct names.
void Class::SomeFunction() {
int imageID;
//...
SetID(imageID + baseID); //wait, where did baseID come from?
}
Personally, I put the underscore at the end instead of the beginning [if you accidentally follow a leading underscore with a capital letter, you end up using a name that is reserved for the implementation]. Some people use mBlah or m_Blah. Others do nothing at all, and explicitly write this->blah to indicate their member variables when confusion is possible. There is no global standard; just do what you want for private projects, and adhere to the existing practices elsewhere.
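To make the difference concrete, here is a minimal sketch of two of the styles mentioned above. The classes Counter and Plain and all of their members are hypothetical names invented for this illustration; they do not come from any real library.

```cpp
#include <cassert>

// Hypothetical class using the trailing-underscore style.
class Counter {
public:
    explicit Counter(int base) : base_(base) {}

    // base_ and total_ are visibly members; step and previous are not.
    int advance(int step) {
        int previous = total_;      // plain local variable
        total_ += base_ + step;
        return previous;
    }

    int total() const { return total_; }

private:
    int base_;
    int total_ = 0;
};

// Hypothetical class with undecorated members: this-> disambiguates
// when a parameter has the same name as the member.
class Plain {
public:
    void set_id(int id) { this->id = id; }
    int get_id() const { return id; }

private:
    int id = 0;
};

// Small self-check of the behaviour described in the comments.
inline void naming_demo() {
    Counter c(10);
    assert(c.advance(1) == 0);      // returns the total before the update
    assert(c.total() == 11);

    Plain p;
    p.set_id(42);
    assert(p.get_id() == 42);
}
```

Both variants compile to the same thing; the only difference is how much the name itself tells the reader.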
I have seen _ placed in front of a member name just to tell the reader that it is a class member variable.
A more conventional approach I have seen is the m_ prefix, e.g. m_data.
Usually an underscore before a member is used when the member is private.
It is useful in languages that do not have a built-in way to declare members private, like Python.
Usually an underscore is used in member variables so as to distinguish between member variables, static member variables & local variables.
m_ is used for normal member variables &
s_ for static member variables
This way the scope of the variable is visible in the variable name itself.
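As a rough sketch of the m_ / s_ split described above (the Session class and its members are made-up names for illustration only):

```cpp
#include <cassert>

// Hypothetical class: m_ marks per-object state, s_ marks state shared by all objects.
class Session {
public:
    Session() : m_id(++s_count) {}

    int id() const { return m_id; }
    static int count() { return s_count; }

private:
    int m_id;              // one copy per Session object
    static int s_count;    // one copy for the whole class
};

int Session::s_count = 0;

// Small self-check: each new Session gets the next id.
inline void session_demo() {
    Session first;
    Session second;
    assert(first.id() == 1);
    assert(second.id() == 2);
    assert(Session::count() == 2);
}
```

Inside the constructor, the prefixes make it obvious that the statement touches shared state as well as the object being built.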
Also, sometimes underscores are used in member name so that you can name your get and set methods with the member name itself.
For example:
class Image
{
// various things
// data members
char* _data;
ImageSettings* _imageSettings;
// (and so on...)
public:
ImageSettings* imageSettings()
{
return _imageSettings; // return pointer
}
void imageSettings(ImageSettings *ptr)
{
_imageSettings = ptr; // set member variable value
}
};
However, different organizations adopt different conventions & coding styles and one should stick to them. Follow the principle,
When in Rome think & act like the Romans :)
I don't think there is a universal rule for naming. However, one of the most important rules is to stick to what's already used in your company/project. Don't break it. Otherwise, your tech lead or mentor will most likely challenge you about it.
For your reference, Google C++ style for naming
What I've learned is that having an underscore before or after a variable means that it's a member of that class - private, public, or protected.
The reason for this convention is that member names beginning with underscores show up first in Intellisense. Generally, you have to follow the convention of the project you are contributing to. If you are starting a new project, it is a good idea to follow a commonly accepted convention, such as Sutter's and Alexandrescu's C++ Coding Standards.
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Function_Names#Function_Names
Regular functions have mixed case; accessors and mutators match the
name of the variable: MyExcitingFunction(), MyExcitingMethod(),
my_exciting_member_variable(), set_my_exciting_member_variable().
Isn't it the whole point of encapsulation to hide implementation details from the user so he/she is not aware of whether the accessor/mutator method returns/modifies a member variable or not? What if I change the variable name or change the way it's stored inside the object?
EDIT:
If I have an instance variable int foo_ it seems straightforward
int foo() const { return foo_; }
but if I add another method that returns foo_ + 2, should I name it bar or GetBar?
int bar() const { return foo_ + 2; }
int GetBar() const { return foo_ + 2; }
If I choose GetBar and later decide to cache the returned value in another member variable bar_, will I have to rename the method to bar?
Actually, the point of encapsulation is to hide the inner workings of a class, not necessarily to hide the names of things. The name of the member variable doesn't matter; it's the level of indirection that the accessor or mutator provides.
Having an accessor gives you the ability to change the inner workings of the class (including the names of member variables) without breaking the class's interface with the outside world. The user of the class need not concern himself with implementation details, including what things are named inside the class, but only on the behavior of the class, as seen from the outside.
To put it another way, users of a class should not rely on Google's style guide to determine whether or not they are modifying a member variable.
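A minimal sketch of that point, reusing the foo_ / bar names from the question. WidgetV1, WidgetV2 and sum_of_parts are hypothetical names; the two classes stand for "before" and "after" versions of the same class.

```cpp
#include <cassert>

// Before: bar() is computed on every call.
class WidgetV1 {
public:
    explicit WidgetV1(int foo) : foo_(foo) {}
    int foo() const { return foo_; }
    int bar() const { return foo_ + 2; }

private:
    int foo_;
};

// After: bar is cached in a new member, but the public interface is unchanged.
class WidgetV2 {
public:
    explicit WidgetV2(int foo) : foo_(foo), bar_(foo + 2) {}
    int foo() const { return foo_; }
    int bar() const { return bar_; }

private:
    int foo_;
    int bar_;
};

// Caller code only sees foo() and bar(), so it works with either version.
template <typename Widget>
int sum_of_parts(const Widget& w) {
    return w.foo() + w.bar();
}

inline void widget_demo() {
    assert(sum_of_parts(WidgetV1(3)) == 8);
    assert(sum_of_parts(WidgetV2(3)) == 8);
}
```

The caller cannot tell which version it is talking to, which is the indirection the accessor is there to provide.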
Because the Google style guide is only meant to be followed by Google employees. Besides, it's not that good of a style guide.
Case in point - they explicitly ban passing by non-const reference because it can be "confusing".
So you're right, it defeats the purpose of encapsulation. Don't guide yourself by it.
When considering a class, it may conceptually have visible
state, which can be accessed by the client. How this state is
represented inside the class is another matter, and that's what
accessors (getters and setters) hide. My own naming convention
also makes this distinction: if the function is conceptually
a getter or a setter, it has the name of the attribute, which
would normally be a noun; otherwise, it is a verb. And
I distinguish between cases where the function is getting or
setting something which isn't conceptually part of the class
(e.g. which partially depends on an argument), which have the
verb get or set in their name, and the case where the
function is actually modifying what is conceptually an
attribute, in which case they don't.
For the rest, like most style guides, not everyone is in total
agreement with this one. I'm not sure I like their naming
conventions, for example. They're called naming conventions
because they are just that: arbitrary conventions. The only
real hard rule is that types, macros and other things must be
distinguished, and that names should never start or end with an
underscore. (There are also some softer rules: I'd be very
suspicious of a convention which ended up making the names of
local variables longer than those of globals.)
I may be taking an assumption of common sense too far, but I'm pretty sure that retaining a published interface takes precedence over following the naming guide.
Since your original bar / GetBar function is not an accessor, I presume it should follow the regular name guide and be called GetBar.
If you later introduce bar_ so that in some sense the function becomes an accessor, I'm pretty sure you should not remove GetBar. I suppose you could add a function bar() as well, defined to do the same thing, but I don't think I'd interpret the style guide to require that.
I'm also pretty sure that as soon as your published interface includes functions that you (and callers) think of as "accessors", encapsulation is in any case out the window to some extent, because you're talking about the state of the object instead of its behavior. Just because a function returns the value of a member variable in the current implementation does not mean that it has to be documented as an accessor. But if you do insist on writing functions that are publicly recognized as accessors, Google tells you how to name them. The classic example is that a sufficiently dumb data record object might reasonably have accessors, since the whole class is publicly defined to be a bundle of fields with maybe a little bit of behavior.
I've read that style guide a few times before, but I have never worked for Google so I'm not privy to how their code reviews tend to apply it in practice. I should think that an organization that size cannot be wholly consistent in every detail. So your guess is probably as good as mine.
I have recently read Mike McShaffry's Game Coding Complete and noticed a code style I haven't seen elsewhere yet. The more important things I noticed were the names of base classes defining interfaces starting with an I like IActor, protected member variables' names starting with m_ like m_Type, and names of virtual methods starting with V like VSetID(). To show a bigger, more readable example:
class BaseActor : public IActor
{
friend class BaseGameLogic;
protected:
ActorId m_id;
Mat4x4 m_Mat;
int m_Type;
shared_ptr<ActorParams> m_Params;
virtual void VSetID(ActorId id) { m_id = id; }
virtual void VSetMat(const Mat4x4 &newMat) { m_Mat = newMat; }
public:
BaseActor(Mat4x4 mat, int type, shared_ptr<ActorParams> params)
{ m_Mat=mat; m_Type=type; m_Params=params; }
/* more code here */
};
I pretty much like this style: it seems justified and looks like it helps increase the overall readability of the code. The question is: Is it a more-or-less established standard? Is there any more to it than the things I mentioned?
That's called Hungarian Notation. It's encoding information about the variable into the variable name.
For example, m_params means "a member variable called params". IActor means "a class called Actor intended to be used as an interface". It is a very hot topic. Most people agree Hungarian Notation is a poor choice, but many will defend what they do as not Hungarian.
That looks very similar to Hungarian Notation. It depends on who you ask, but it's a rather, let's say, "aged" style.
All of that seems fairly common. I don't recognize anyone else using the V prefix for virtual methods. But it's more about making the code human-followable than anything else. Sounds like a good use to me.
Most of the coding I do is in C#, and it uses the same conventions for the most part, though it is uncommon to see m_ for member variables. That's more common in C/C++, though I have seen the same convention used in C#, or the variables would start with _ alone, which is also a common convention in Objective-C: something to separate the property from the variable that the property uses as a container.
I've seen that, and I'm no game programmer. The 'I' probably indicates that the class is intended to be an interface - all virtual methods and no data members. Using m_ is quite common, but so are other conventions. I think I first saw m_ naming convention in some Microsoft Windows examples in the late 1980s but that's probably not its origin. There are multiple coding standards around that follow these conventions (but differ in other ways) - I can't name any specific ones at the moment, but look around.
The initial I to denote interfaces (and A for abstract classes) isn't such a bad practice, however the m_ prefix to denote member variables is horrid. I believe the reasoning behind it is that it allows the parameters to be named nicer by preventing shadowing of member variables. However you will mainly work with member variables in your class, and the m_ prefix really clutters code, hindering readability. It is much better to just rename the parameter, for example id -> pId, id_, identifier, id_p, p_id, etc.
The V to denote virtual methods might be useful if you somehow declared methods virtual in the parent class but not in the child class, and you desperately need to know whether it is virtual or not, however this is easily fixed by declaring them virtual in the child class as well. Otherwise I do not see any advantage.
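If a C++11 or later compiler is available (an assumption; the original code may well predate it), the override specifier covers much of what the V prefix is trying to signal, and the compiler checks it. A small sketch with hypothetical Shape and Circle classes:

```cpp
#include <cassert>
#include <string>

// Hypothetical base class with one virtual function.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

// 'override' makes the compiler verify that name() really overrides a
// virtual function in the base class; a mismatch is a compile error.
class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

// Calls through a base reference dispatch to the derived implementation.
inline std::string describe(const Shape& s) {
    return s.name();
}

inline void override_demo() {
    assert(describe(Shape()) == "shape");
    assert(describe(Circle()) == "circle");
}
```

Unlike a naming prefix, this marker cannot silently go out of date when the base class changes.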
class C {
private:
int member_; // here is the underscore I refer to.
};
This underscore is recommended by Google Style Guide and Geosoft's C++ Style Guide.
I understand that there are different opinions and tastes.
I want to ask people who used it or were forced to use it whether they found it beneficial, neutral or harmful for them. And why?
Here is my answer:
I understand the motivation behind it, but it does not convince me.
I tried it and all I got was a little bit of clutter all over the class, but simpler initialization of members in the constructor. I haven't encountered a situation where the underscore helped to distinguish between a private member variable and another variable (except in the mentioned initialization).
In that light I consider this style harmful.
Well since no one mentioned it: adding an underscore to member variable allows you to name your getter and setter with the 'conceptual' name of the variable.
ex:
class MyClass
{
int someMember_;
public:
int someMember() const { return someMember_; }
void someMember( int newValue ) { someMember_ = newValue; }
};
not that I use this style though.
I use "m_" as prefix for normal member variables and "s_" for static member variables.
So the scope is directly visible.
If you need this underscore in order to tell class members from other variables, you probably have too large member functions to instantly see what's a variable/parameter.
I still like it because it often simplifies member function parameter naming:
class person
{
public:
person(const std::string& first_name, const std::string& last_name)
: first_name_(first_name), last_name_(last_name) {}
// .......
private:
std::string first_name_;
std::string last_name_;
};
To me the benefit of this style of decorating member variables is it works well with auto complete functionality of text editors. Having a prefix decoration requires you to type more characters before a solid guess on what you mean can be made.
This is basically a religious argument so you're never going to reach a consensus on this style. FWIW, I use this style for my member variables for reasons already stated by others, e.g.:
class Foo
{
public:
Foo(std::string name, int age) :
name_(name),
age_(age)
{
}
std::string name() const { return name_; }
void name(const std::string& name) { name_ = name; }
int age() const { return age_; }
void age(int age) { age_ = age; }
private:
std::string name_;
int age_;
};
Just adopt something you're happy with and stick with it.
I think it's important to distinguish between class variables and local ones (and global ones if really needed). How you do it, is not important - just be consistent.
class Foo
{
int mMember;
int member_;
int _member;
int m_Member;
};
All styles give you the information you need. As long as you stay with the same style all of the time, no problem. Sometimes other people need to work with your code (e.g. when you create a library, or you work with a community). Then it might be a good idea to stick with the most used style in the C++ community.
Sorry - I can't answer what style that is.
I came up with this style independently early in my C++ coding days (late 80s, early 90s) because I encountered several confusing situations in which I had to keep going back to the class header to figure out which variable was really a member variable.
Once I started seeing other people's C++ code that did the same thing I was rather gratified that I had noticed a problem that other people had and that the solution I adopted for myself was something other people also thought of.
It's not frequently useful, but it's fairly innocuous and when it is useful, it's very useful.
This is also why I really hate the m_ style. It's not innocuous, and I think the added ugliness is not worth the benefit.
I do use an S_ prefix for file scope static variables and class static variables that aren't constants. They are sort of like global variables, and I think their use should be signaled loudly.
This came up in a discussion we had where I work, but with Java programming. I think it applies to C++ as well. My answer is that IDEs have a handy function of coloring class member variables. In Eclipse they turn blue. The underscore is superfluous. As another poster put it, "EW, Hungarian warts!!". 1980 called and wants its Hungarian notation back.
We follow possibility.com's C++ coding standard, which says to prefix member variables with an 'm', but I've also done some work under Google's style guide.
Like you say, it's not strictly necessary, especially if you have an IDE that assigns different syntax highlighting to member variables. However, I think that some kind of consistent naming scheme, to let you tell at a glance whether or not a variable is a member variable, is very worthwhile:
Simplified parameter naming, as in sbi's answer, is one benefit.
A consistent coding style is important, regardless of which style you pick. Ideally, everyone on the team would use the same coding style, so that you can't tell at a glance who wrote a given section of code. This helps when bringing new developers onto the team and with agile practices such as no code ownership, and it is even more important with open source projects that may attract a variety of contributions.
Most importantly, readability can greatly benefit from having all of the code follow a fairly strict style that makes clear the types of identifiers like this. The difference between being able to tell at a glance that a variable is a member and being able to tell from looking at local variables' declarations may be small, but following a good coding standard will make numerous small differences like this throughout a body of code, and it can make a huge difference in how easy the code is to follow and how easy it is to get started in an unfamiliar section of code.
(You mentioned that you gave this style a try, but if it was only for a part of code and only for code that you were already familiar with, then it's harder to see the readability benefit that following a coding style like this for the entire codebase can bring.)
All of this is in my experience, your mileage may vary, etc.
There is one more "style" which suggests declaring class members as below:
class Foo{
int m_value;
public:
//...
};
I found it usable, but that is just my point of view.
I agree with Tobias that there's a benefit to some convention -- whatever it may be -- to highlighting class variables. On the other hand, I invariably find that such conventions make the code "flow" less well. It's just easier to read "totalPrice = productPrice + salesTax" than "m_totalPrice = l_productPrice + l_salesTax" or whatever.
In the end, I prefer to just leave all the field names undecorated, and have few enough class variables that keeping track of them is not a problem. In constructors and setters, I put a prefix or suffix on the parameter, or in Java I typically distinguish the class variable with "this.", like:
public Foo(int bar)
{
this.bar=bar;
}
(Can you do that in C++? It's been so long I don't remember.)
I agree with you that the underscore variable suffix is not the ideal coding style, and that adding a little complexity into the constructor is better than adding more complexity throughout the entire class.
The other day, I took a look at one of my old Java projects where I had applied the underscore suffix to variable names, and I found that it did make reading the code more difficult. It wasn't too hard to get used to, but I found it to be slightly distracting without adding any real benefit.
I always want to distinguish the class members from other variables. I use _ as a prefix for members, and personally speaking this keeps the code clean and readable. Prefixes work fine with the editor's Intellisense. Prefixing with m_ or s_ is useful, but it looks ugly to me.
Is it wrong to use m_varname for public members and _variable for private members in the same class?
Some concerns:
Why do you have public variables?
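Identifiers that contain __ or start with _ followed by an uppercase letter are reserved for the implementation everywhere, and other identifiers starting with _ are reserved in the global namespace. In practice this doesn't matter very often, but it's nice to be aware.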
With those things said, there's nothing wrong with creating a naming convention, regardless of how it looks. Just be consistent.
The same goes for C++ and for Java: you do not need any Hungarian notation nor any prefixes/suffixes. You have the keyword "this"!
class MyClass {
private:
int value;
public:
MyClass(int value) {
this->value = value;
}
};
Of course in this simple example you can (should!) use constructor initialization list ;)
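For completeness, a minimal sketch of the initialization-list version of the class above. The get() accessor is a hypothetical addition so the result can be observed.

```cpp
#include <cassert>

class MyClass {
public:
    // In value(value), the name before the parentheses is looked up as a
    // member, while the name inside is looked up in the constructor's scope
    // and therefore finds the parameter. No prefix or this-> is needed.
    explicit MyClass(int value) : value(value) {}

    // Hypothetical accessor, added only to make the example checkable.
    int get() const { return value; }

private:
    int value;
};

inline void init_list_demo() {
    MyClass m(7);
    assert(m.get() == 7);
}
```

Note that this only works in the initializer list; inside the constructor body, a bare value = value would assign the parameter to itself.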
So, instead of using any awkward notations, just employ the language's possibilities. When you know the name of your member variable, you know that it is perfect. Why would you obfuscate it with "_"?
As for using the same names for public and private members: this is absolutely wrong thinking! Why would one need two things to represent the same thing in the same class? Make it private, name it perfectly and provide public getters and setters.
You should not use names that begin with an underscore or contain a double underscore. Those names are reserved for the compiler and implementation. Besides that restriction, you can use any naming convention you and your team likes. Personally, I hate any form of "Hungarian" notation and dislike the m_something notation as well. It really bothers me that if I need to change the type of a variable I need to go update its name everywhere where it occurs. That's a maintenance headache.
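A short sketch of which of the spellings discussed in this thread run into the reserved-name rules. The Sample class and its member names are hypothetical; the reserved spellings are left commented out on purpose.

```cpp
#include <cassert>

// Hypothetical class showing several member-naming spellings side by side.
class Sample {
public:
    int total() const { return data_ + m_count + _size; }

private:
    int data_ = 1;     // trailing underscore: not reserved
    int m_count = 2;   // m_ prefix: not reserved
    int _size = 3;     // leading underscore + lowercase letter: fine as a class
                       // member, but reserved at global namespace scope
    // int _Size;      // leading underscore + uppercase letter: reserved everywhere
    // int my__size;   // contains a double underscore: reserved everywhere
};

inline void reserved_names_demo() {
    assert(Sample().total() == 6);
}
```

Avoiding leading underscores altogether, as suggested above, is the simplest way to stay clear of the edge cases.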
Everyone has his/her own preferences as far as naming conventions are concerned. I'd say more people would agree on not having any public variables in a class.
Assuming that you're working with C++, my answer is NO. It's perfectly reasonable, but you should really stick to that convention.
However, statically typed languages such as C# assume such naming conventions to be somewhat redundant.
Personally I think it's ugly, but it's not apparent where a variable comes from in C++, so the sugaring might help.
There are many C++ conventions out there. The key is to find one and/or adapt one. Stick with it, and be consistent. If you are working somewhere, try to learn as many of the established conventions as you can. There are so many out there, and each one has good arguments, but they can contradict each other (Joint Strike Fighter, Bell Labs, Mozilla, and so forth).
If there are different conventions between different parts of the project, at least make each file consistent within itself, and the .cpp and .h files should be consistent with each other.
I find it better to be able to understand code written in different conventions, so that you can adapt to a new working environment faster.
I come from a .NET world and I'm new to writing C++. I'm just wondering what the preferred naming conventions are when it comes to naming local variables and struct members.
For example, the legacy code that I've inherited has a lot of these:
struct MyStruct
{
TCHAR szMyChar[STRING_SIZE];
bool bMyBool;
unsigned long ulMyLong;
void* pMyPointer;
MyObject** ppMyObjects;
};
Coming from a C# background I was shocked to see the variables with Hungarian notation (I couldn't stop laughing at the pp prefix the first time I saw it).
I would much rather name my variables this way instead (although I'm not sure if capitalizing the first letter is a good convention. I've seen other ways (see links below)):
struct MyStruct
{
TCHAR MyChar[STRING_SIZE];
bool MyBool;
unsigned long MyLong;
void* MyPointer;
MyObject** MyObjects;
};
My question: Is this (the former way) still a preferred way to name variables in C++?
References:
http://geosoft.no/development/cppstyle.html
http://www.syntext.com/books/syntext-cpp-conventions.htm
http://ootips.org/hungarian-notation.html
Thanks!
That kind of Hungarian Notation is fairly useless, and possibly worse than useless if you have to change the type of something. (The proper kind of Hungarian Notation is a different story.)
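One common reading of "the proper kind" is that the prefix records what a value means rather than its C++ type. The sketch below assumes that reading; cell_at and the row/col prefixes are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <vector>

// Both indices are plain ints, so the type cannot tell them apart.
// The prefixes record the meaning instead: row... is a row index,
// col... is a column index.
inline int cell_at(const std::vector<std::vector<int>>& grid,
                   int rowCur, int colCur) {
    // grid[colCur][rowCur] would look wrong at a glance,
    // even though it has exactly the same types.
    return grid[rowCur][colCur];
}

inline void meaning_prefix_demo() {
    std::vector<std::vector<int>> grid = {{1, 2}, {3, 4}};
    assert(cell_at(grid, 1, 0) == 3);
    assert(cell_at(grid, 0, 1) == 2);
}
```

Changing the declared type of such a variable does not force a rename, which is the maintenance problem with type-based prefixes like ul or sz.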
I suggest you use whatever your group does. If you're the only person working on the program, name them whatever way makes the most sense to you.
The most important thing is to be consistent. If you're working with a legacy code base, name your variables and functions consistently with the naming convention of the legacy code. If you're writing new code that is only interfacing with old code, use your naming convention in the new code, but be consistent with yourself too.
No.
The "wrong hungarian notation" - especially the pp for double indirection - made some sense for early C compilers where you could write
int * i = 17;
int j = ***i;
without even a warning from the compiler (and that might even be valid code on the right hardware...).
The "true hungarian notation" (as linked by head Geek) is IMO still a valid option, but not necessarily preferred. A modern C++ application usually has dozens or hundreds of types, for which you won't find suitable prefixes.
I still use it locally in a few cases where I have to mix e.g. integer and float variables that have very similar or even identical names in the problem domain, e.g.
float fXmin, fXmax, fXpeak; // x values of range and where y=max
int iXmin, iXmax, iXpeak; // respective indices in x axis vector
However, when maintaining legacy code that does follow some conventions consistently (even if loosely), you should stick to the conventions used there - at least in the existing modules / compilation units to be maintained.
My reasoning: The purpose of coding standards is to comply with the principle of least surprise. Using one style consistently is more important than which style you use.
What's to dislike or mock about "ppMyObjects" in this example apart from it being somewhat ugly? I don't have strong opinions either way, but it does communicate useful information at a glance that "MyObjects" does not.
I agree with the other answers here. Either continue using the style that was handed down to you for consistency's sake, or come up with a new convention that works for your team. It's important that the team is in agreement, as it's almost guaranteed that you will be changing the same files. Having said that, some things that I found very intuitive in the past:
Class / struct member variables should stand out - I usually prefix them all with m_
Global variables should stand out - usually prefix with g_
Variables in general should start with lower case
Function names in general should start with upper case
Macros and possibly enums should be all upper case
All names should describe what the function/variable does, and should never describe its type or value.
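Put together, the list above might look like the following sketch. Downloader, Fetch, MAX_RETRIES and g_requestCount are all hypothetical names invented to show one identifier of each kind.

```cpp
#include <cassert>

#define MAX_RETRIES 3        // macro: all upper case

int g_requestCount = 0;      // global variable: g_ prefix

class Downloader {
public:
    // Function names start with upper case.
    // Returns true once an attempt succeeds, false after MAX_RETRIES failures.
    bool Fetch(int failuresBeforeSuccess) {
        for (int attempt = 0; attempt < MAX_RETRIES; ++attempt) {  // local: lower case
            ++g_requestCount;
            ++m_attempts;
            if (attempt >= failuresBeforeSuccess) {
                return true;
            }
        }
        return false;
    }

    int Attempts() const { return m_attempts; }

private:
    int m_attempts = 0;      // member variable: m_ prefix
};

inline void conventions_demo() {
    Downloader d;
    assert(d.Fetch(1));          // fails once, succeeds on the second attempt
    assert(d.Attempts() == 2);
    assert(!d.Fetch(5));         // gives up after MAX_RETRIES more attempts
    assert(d.Attempts() == 5);
}
```

Inside Fetch, the three kinds of variable being updated are distinguishable without looking at any declaration.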
I'm a hungarian notation person myself, because I find that it lends readability to the code, and I much prefer self-documenting code to comments and lookups.
That said, I think you can make a case for sacrificing your preferred style and some additional maintainability for team unity. I don't buy the argument of consistency for the sake of uniform code readability, especially if you're reducing readability for consistency... it just doesn't make sense. Getting along with the people you work with, though, might be worth a bit more confusion about types when looking at variables.
Hungarian notation was common among users of the Win32 and MFC APIs. If your predecessors were using that, you can probably best continue using it (even though it sucks). The rest of the C++ world never had this brain-dead convention, so don't use it if you're using something other than those APIs.
I think that you will still find that most shops that program in Visual C++ stick with hungarian notation or at least a watered down version of it. In our shop, half of our app is legacy C++ with a shiny new C# layer on top (with a managed C++ layer in the middle.) Our C++ code continues to use hungarian notation but our C# code uses notation like you presented. I think it is ugly, but it is consistent.
I say, use whatever your team wants for your project. But if you are working on legacy code or joining a team, stick with the style that is present for consistency.
It's all down to personal preference. I've worked for 2 companies, both with similar schemes, where member vars are named m_varName. I've never seen Hungarian notation in use at work, and really don't like it, but again it's down to preference. My general feeling is that IDEs should take care of telling you what type it is, so as long as the name is descriptive enough of what it does (m_color, m_shouldBeRenamed), then that's OK. The other thing I do like is a difference between member variable, local var and constant naming, so it's easy to see what is happening in a function and where the vars come from.
Member: m_varName
Const: c_varName
local: varName
I also prefer CamelCase; indeed, I've mostly seen people using CamelCase in C++. Personally I don't use any prefixes except for private/protected members and interfaces:
class MyClass : public IMyInterface {
public:
unsigned int PublicMember;
MyClass() : PublicMember(1), _ProtectedMember(2), _PrivateMember(0) {} // listed in declaration order
unsigned int PrivateMember() {
return _PrivateMember * 1234; // some senseless calculation here
}
protected:
unsigned int _ProtectedMember;
private:
unsigned int _PrivateMember;
};
// ...
MyClass My;
My.PublicMember = 12345678;
Why I decided to omit prefixes for public members:
Because public members can be accessed directly, like in structs, and do not clash with private names. Instead of using underscores, I've also seen people use a lower-case first letter for members.
struct IMyInterface {
virtual void MyVirtualMethod() = 0;
};
Interfaces contain, by definition, only pure virtual methods that need to be implemented later. However, in most situations I prefer abstract classes, but this is another story.
struct IMyInterfaceAsAbstract abstract {
virtual void MyVirtualMethod() = 0;
virtual void MyImplementedMethod() {}
unsigned int EvenAnPublicMember;
};
See High Integrity C++ Coding Standard Manual for some more inspiration.
My team follows this Google C++ code convention:
This is a sample of variable name:
string table_name; // OK - uses underscore.
string tablename; // OK - all lowercase.
string tableName; // Bad - mixed case.
If you use CamelCase, the convention is to capitalize the first letter
for classes, structs and non-primitive type names, and to lower-case the first letter for data members.
Capitalization of methods tends to be a mixed bag; my methods tend to be verbs and are already distinguished by parens, so I don't capitalize methods.
Personally I don't like to read CamelCase code and prefer lower case with underscores for
data and method identifiers, capitalizing types and reserving uppercase for acronyms
and the rare case where I use a macro (warning: this is a MACRO).