C++ notation for array names - c++

I have two classes, an appointment class and a calendar class. The calendar class has an array of appointment objects. Should the appointment array name be plural or not? I.e.:
appointment[] appointment;
or
appointment[] appointments;
Is there an agreed syntax for this situation?
Update: Also, do class names start with a lowercase or uppercase letter?

There is no convention, but you'll need to use distinct names for the type identifier and variable identifier. I tend to capitalise type but not variable identifiers, but it's a matter of personal taste. In this specific case of member data, a prefix or suffix also serves.
Also personal taste, I do tend to use the plural for a container, and have a suspicion it's more common. But, countering that, I've often thought it makes sense when talking about the entire container, while from an English reading perspective "appointment[3]" as "appointment three" reads better than "appointments three".

There is no agreed naming convention, but I would make two suggestions:
Use an std::vector<appointment>
Give data members some special pre-fix or suffix, for example appointments_ or m_appointments
Personally, I find the plural form more intuitive when referring to a collection.
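A minimal sketch of how those two suggestions might combine. The names Appointment, Calendar, add, size and at are hypothetical, chosen only for illustration; the capitalised type names are one of the conventions mentioned in this thread, not a requirement.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration only.
struct Appointment {
    std::string title;
};

class Calendar {
public:
    // Append one appointment to the collection.
    void add(Appointment appointment) {
        appointments_.push_back(std::move(appointment));
    }

    // Number of stored appointments.
    std::size_t size() const { return appointments_.size(); }

    // Read-only, bounds-checked access to a single appointment.
    const Appointment& at(std::size_t index) const {
        return appointments_.at(index);
    }

private:
    // Plural name for the container, trailing underscore for the data member.
    std::vector<Appointment> appointments_;
};
```

With distinct spellings for the type (Appointment), the single object (appointment) and the container (appointments_), none of the three can be confused with the others.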

Should the appointment array name be plural or not?
Yes, of course, because it contains multiple appointments.
Is there an agreed syntax for this situation?
That depends on the coding conventions within your organization, team or department. How you are going to name variables, classes, etc depends on this.

There are no real universal conventions. If you're working in a team,
the team should establish conventions, and everyone should adhere to
them. With regards to the conventions: it's important to be able to
recognize type names, since you cannot parse C++ without knowing whether
a symbol is a type name or not. One frequent convention is to have
type names start with an upper case character, and other names
(functions, variables, etc.) to start with lower case. It's not
universal, however. In theory, at least, it should be clear from the
name itself: the usual "rule" is that type names are unqualified nouns,
variable names qualified nouns, and functions verbs. In practice, the
distinction between qualified and unqualified nouns isn't always that
clear, and (in English, at least), it's not always certain whether a
word is a verb or a noun. I find it useful to distinguish 1) types from
other symbols, and 2) members from non members.
With regards to plurals, it depends. I find that the distinction
between e.g. appointment and appointments isn't very visible.
Depending on context, I may feel it's sufficient, and use
appointments anyway, but in many cases, I'll go a step further, and
call it setOfAppointments or listOfAppointments, or something else
which clearly indicates a container. This is, probably, just laziness
on my part; I should really find a better name (a qualified noun,
indicating what kind of appointments: allAppointments, if nothing
else).

There is no particular convention.
I would suggest Hungarian notation.
Please read the following article: http://msdn.microsoft.com/en-us/library/aa260976%28v=vs.60%29.aspx
It will give you a good idea of what to do.

Related

Is there a good/widely adopted c++ template coding convention/standards?

I like coding standards. When writing C++ I love coding standards. A good coding standard adds context to the language, making the hard to parse a little bit easier.
There are a few commonly practiced standards that I think everyone is at least a little bit familiar with:
Member variables prefixed with 'm' or 'm_'
Class prefix (generally project specific, ie in Qt all class names are prefixed with 'Q')
Include guard conventions like "take the filename in all caps, replace '.' with '_' "
The rule of three
There are lots of little C++ rules like this. Unfortunately I've never managed to find guidelines like this relating to templates. I think the most popular name for a template argument is 'T', but it's meaningless and unless the template is obvious it can make the code even trickier to read.
Anyway, the core problem I have is that templates are hard to read and I think some convention could be used to make them easier to read. Does anyone know of a widely applied convention that makes templatized code easier to read?
Just adding my grain of salt. I think the two most important libraries in the world of C++ programming are the Standard Template Libraries and the Boost Libraries. I personally try to mostly conform to the kind of notation that is predominant in these libraries. That is, underscore-separated lower-case names for classes, functions, data members, typedefs, enums, etc., and CamelCase (no underscore separation) for template arguments. Typically, you want to also have sensible names for template arguments. A good practice is to give them the name of the concept they should be implementing (e.g. a template argument which should be an iterator that implements ForwardIteratorConcept should be named ForwardIterator).
The conventions that you mentioned ("m" for members and names starting with a capital letter for classes) are a sort-of pure Object-Oriented Programming convention ("pure" is meant as in: without any other programming paradigms like generic programming or template meta-programming). They are mostly used in Java (or by Java "natives" who are programming C++). I personally don't like them and know few people who do. I'm always a bit annoyed when working within a framework or project that adopts this notation; it clashes with the standard libraries, the Boost libraries, and the overall proper usage of namespaces.
My recommendation is to always look at a language's standard libraries for examples when setting a coding convention. The result is that your code will read more naturally for the language in which it is written. Basically, write C++ that looks like it could be part of the ISO C++ documents.
For C++, the standard containers, iterators and algorithms have many templates you can use as examples.
As a counterexample, using camel case will make your C++ code read like Java. When you end up using things from the standard library alongside your own code, it will look weird.
That said, there are two exceptions to consider. Firstly, if you already have a large code base, follow what's already there: a mix of styles is confusing. Secondly, there are excellent libraries, such as Qt, that do not follow the style of the standard libraries; they are also worthy examples of coding standards.
* Member variables prefixed with 'm' or 'm_'
Questionable.
* Class prefix (generally project specific, ie in Qt all class names are prefixed with 'Q')
Terrible. Was a necessary practice back in the day.
Big three isn't really a standard either and has pretty much been superseded as a good practice by the Big Two (because using RAII for pointers negates the necessity of a destructor even when you need a copy constructor and assignment operator).
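A rough sketch of what the "Big Two" remark can look like in practice, assuming C++11 and std::unique_ptr as the RAII wrapper. The class name Buffer and its members are hypothetical.

```cpp
#include <memory>

// Hypothetical class that owns a heap-allocated int through a smart pointer.
class Buffer {
public:
    explicit Buffer(int value) : data_(new int(value)) {}

    // Copy constructor: deep-copies the owned value.
    Buffer(const Buffer& other) : data_(new int(*other.data_)) {}

    // Copy assignment: deep-copies the owned value, guarding self-assignment.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            data_.reset(new int(*other.data_));
        }
        return *this;
    }

    // No user-written destructor: unique_ptr releases the memory.

    int value() const { return *data_; }
    void set(int value) { *data_ = value; }

private:
    std::unique_ptr<int> data_;
};
```

Only the two copy operations are written by hand; cleanup is left entirely to the member's own destructor.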
At any rate....
You need to differentiate your template parameters from normal code. Thus, you should use a naming convention for template parameters that you are not using in standard code. One good method, used by a good many, is to use CamelCase for template parameters. Another important aspect is, since C++ doesn't enforce concepts at all, to name your parameters after the concept they expect. ForwardIter thus makes a good parameter name for a parameter that should be a forward iterator.
Of course, if you're already using CamelCase for your class names (Java programmers - blech :p) then you should use something else.
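As a small illustration of naming a template parameter after the concept it expects, here is a sketch. The function name count_elements is hypothetical; only the parameter name ForwardIterator reflects the convention described above.

```cpp
#include <cstddef>

// ForwardIterator documents what the argument is expected to model:
// something that can be compared, dereferenced and incremented.
// The CamelCase spelling sets it apart from the lower-case names
// used for ordinary functions and variables.
template <typename ForwardIterator>
std::size_t count_elements(ForwardIterator first, ForwardIterator last) {
    std::size_t count = 0;
    for (; first != last; ++first) {
        ++count;
    }
    return count;
}
```

A reader who sees the signature alone already knows which kinds of argument are acceptable, which a bare T would not convey.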
When you get into complex instantiations and such then you need to use some method of declaring your template instantiations in multiple lines. When metaprogramming you also often need to split things up into multiple lines and/or multiple types/templates. It's one of those learn as you go things. I like the following method:
template < typename MyParams >
struct my_metafunction
  : mpl::if_
    <
      check // probably wouldn't actually split this one since it's trivial...but as example...
      <
        MyParams
      >
    , some_type_expression
    , some_other_type_expression
    >
{};
There are no "common conventions" for names. Even the conventions you mention aren't as common as you might think. I can't think of anyone using m or m_ prefix for class member data other than a subset of Windows developers. Similarly for prefixing class names.
Conventions of this sort are very project-specific. You agree about them in a project and move on. When you start a new project it's perfectly alright to have new conventions if you so desire. If you lack imagination or confidence to pick your own conventions then buy Herb Sutter and Andrei Alexandrescu's C++ Coding Standards book. In fact, you should really read it because it deals with far more effective conventions than naming conventions. With things that actually matter.
If it at all helps I sometimes see people choosing for template parameters short names that start with a capital letter. E.g., template<class Ch, class Tr>. Look in your compiler's standard library for inspiration.
Take a look at Boost if you want to see their coding convention.
Like the others say, it depends on the project coding style. I like using lowercase letters separated with underscores while coding. And for harmony I use lowercase letters for template parameters too. To distinguish them from the others, I start with an underscore and end with "_t".
template<typename _encoder_t>
class compression
{
    typedef typename _encoder_t::settings settings_t;
    ...
};

Which is the best among GetCountOfObjects, GetNumberOfObjects, and GetObjectCount?

I am a C++ programmer from a non-English-speaking country. I am always confused about how to choose one of the following function names:
GetCountOfObjects
GetNumberOfObjects
GetObjectCount
Who can tell me what the subtle differences are between them?
I'm also a programmer from non-English country, but I think the best way to choose the name is
use the name that is the most clear
use the shortest name enough to understand easily
Also, English generally prefers swapping the word order over using 'Of'.
So, IMHO the best variant here is 'GetObjectCount', provided of course that it returns the number of objects.
GetNumberOfObjects probably sounds closest to natural English. GetCountOfObjects sounds slightly awkward. Other than that, there is almost no difference.
My personal style would probably be to use GetNumberOfObjects for a method that just returns a known number, but CountObjects for a method that actually performs the counting.
EDIT: The reason for this difference, at least to me, is that the word 'number' is more commonly used as a noun while 'count' is more commonly used as a verb.
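A sketch of that noun/verb distinction, assuming C++11. The class Inventory and the member CountObjectsHeavierThan are hypothetical names made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical class for illustration.
class Inventory {
public:
    void Add(int weight) { weights_.push_back(weight); }

    // Returns a number that is already known: the name reads like a noun.
    std::size_t GetNumberOfObjects() const { return weights_.size(); }

    // Walks the data and does the counting: the name reads like a verb.
    std::size_t CountObjectsHeavierThan(int limit) const {
        return static_cast<std::size_t>(std::count_if(
            weights_.begin(), weights_.end(),
            [limit](int weight) { return weight > limit; }));
    }

private:
    std::vector<int> weights_;
};
```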
Really, this is a style choice. Use whatever you choose consistently and it will be fine.
Use whatever you want, but use it consistently.
I would go for the shortest simplest: size() if it makes sense. That is, if you are trying to add a member function to a class that somehow resembles a container, using the same names that are used in existing libraries for the same concepts will make code simpler to read.
Even if that does not make sense, while in Java getters and setters are common, in many C++ libraries the same function names will drop the get part and provide a shorter name: GetNumberOfObjects => NumberOfObjects, GetObjectCount => ObjectCount... If you want to make your object different from containers (and thus you explicitly want to avoid size()) I would probably go for objectCount or numObjects. While numObjects is not proper English, it is easy to read and interpret and it is short.
Use whichever you feel comfortable with, but be consistent with it. Avoid very long names, as they invite mistakes. You can also use some kind of distinction in the names to help you figure out the type of a variable, or whether it is static, local, public or private.
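To illustrate why reusing the standard names can pay off, here is a sketch. The class object_pool and the helper has_at_least are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container-like class that borrows the standard names.
class object_pool {
public:
    void add(int object) { objects_.push_back(object); }

    // Same spelling the standard containers use.
    std::size_t size() const { return objects_.size(); }
    bool empty() const { return objects_.empty(); }

private:
    std::vector<int> objects_;
};

// Generic helper: works with anything that exposes size().
template <typename Container>
bool has_at_least(const Container& container, std::size_t n) {
    return container.size() >= n;
}
```

Because object_pool spells the member size(), the same helper accepts it and a std::vector alike; a GetObjectCount member would need its own overload.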

Trailing underscores for member variables in C++

I've seen people use a trailing underscore for member variables in classes, for instance in the renowned C++ FAQ Lite.
I think that its purpose is not to mark variables as members; that's what "m_" is for. Its actual purpose is to make it possible to have an accessor method named like the field, like this:
class Foo {
public:
    bar the_bar() { return the_bar_; }
private:
    bar the_bar_;
};
Having accessors omit the "get_" part is common in the STL and boost, and I'm trying to develop a coding style as close to these as possible, but I can't really see them using the underscore trick. I wasn't able to find an accessor in STL or boost that would just return a private variable.
I have a few questions I'm hoping you will be able to answer:
Where does this convention come from? Smalltalk? Objective-C? Microsoft? I'm wondering.
Would I use the trailing underscore for all private members or just as a workaround in case I want to name a function like a variable?
Can you point me to STL or boost code that demonstrates trailing underscores for member variables?
Does anybody know what Stroustrup's views on the issue are?
Can you point me to further discussion of the issue?
In C++,
identifiers starting with an underscore, followed by a capital character
identifiers having two consecutive underscores anywhere
identifiers in the global namespace starting with an underscore
are reserved to the implementation. (More about this can be found here.) Rather than trying to remember these rules, many simply do not use identifiers starting with an underscore. That's why the trailing underscore was invented.
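The three rules above, restated next to a name that is safe to use. This is only a sketch; the class Counter is hypothetical, and the reserved spellings appear in comments only so that the snippet stays valid.

```cpp
// Reserved to the implementation (do not declare names like these):
//   _Count      underscore followed by a capital letter
//   my__count   two consecutive underscores anywhere
//   _count      leading underscore at global namespace scope

class Counter {
public:
    void increment() { ++count_; }
    int count() const { return count_; }

private:
    // Trailing underscore: falls under none of the rules above.
    int count_ = 0;
};
```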
However, C++ itself is old, and builds on 40 years of C (both of which never had a single company behind them), and has a standard library that has "grown" over several decades, rather than brought into being in a single act of creation. This makes for the existence of a lot of differing naming conventions. Trailing underscore for privates (or only for private data) is but one, many use other ones (not few among them arguing that, if you need underscores to tell private members from local variables, your code isn't clear enough).
As for getters/setters - they are an abomination, and a sure sign of "quasi classes", which I hate.
I've read The C++ Programming Language and Stroustrup doesn't use any kind of convention for naming members. He never needs to; there is not a single simple accessor/mutator, he has a way of creating very fine object-oriented designs so there's no need to have a method of the same name. He uses structs with public members whenever he needs simple data structures. His methods always seem to be operations. I've also read somewhere that he discourages the use of names that differ only by one character.
I am personally a big fan of this guideline: http://geosoft.no/development/cppstyle.html
It includes omitting the m_ prefix, using an underscore suffix to indicate private member variables and dropping the horrid, annoying-to-type habit of using underscores instead of spaces, as well as other, more detailed and specific suggestions, such as naming bools appropriately (isDone instead of just done) and using getVariable() instead of just variable(), to name a few.
Only speaking for myself...
I always use trailing underscore for private data members, regardless if they have accessor functions or not. I don't use m_ mainly because it gets in the way when I mentally spell the variable's name.
As a maintenance developer who likes searchability I'm leaning towards m_ as it's more searchable. When you, like me, are maintaining big projects with large classes (don't ask) you sometimes wonder: "Hmmm, who mutates state?". A quick search for m_ can give a hint.
I've also been known to use l_ to indicate local variables but the current project doesn't use that so I'm "clean" these days.
I'm no fan of hungarian notation. C++ has a strong type system, I use that instead.
I'm guessing that utopia would have been to use a leading underscore - this is quite common in Java and C# for members.
However, for C, leading underscores aren't a good idea, hence I guess the recommendation by the C++ FAQ Lite to go with a trailing underscore:
All identifiers that begin with an
underscore and either an uppercase
letter or another underscore are
always reserved for any use.
All identifiers that begin with an
underscore are always reserved for use
as identifiers with file scope in both
the ordinary and tag name spaces.
(ISO C99 specification, section 7.1.3)
As far as I remember, it's not Microsoft that pushed the trailing underscore code style for members.
I have read that Stroustrup is pro the trailing underscore.

A matter of style

Would you write something like:
enum XYZ_TYPE {X=1, Y=2, Z=3};
I saw it and the suffix _TYPE confuses me in the enum context. There is a strong prospect that it is because I am not bright.
I would not write it just like that, but it's hardly a make-or-break situation. Roll with the punches and save your frustration for things that really deserve it, like for-case loops. :)
I already have a preferred convention to distinguish types from other identifiers, which is that I use CamelCase with an initial capital for types and lower-case for others. Constants can be all-caps, including enum values.
But "XYZ_TYPE" with any capitalisation is kind of a poor name for an enumeration. I'd use enum Color {RED=1, GREEN=2, BLUE=3};, or enum FuzzyBool {yes=1, no=2, filenotfound=3};, or some such. Not REDGREENBLUE_TYPE.
I think in general if your names are well-chosen then you shouldn't need a _TYPE suffix. If your names aren't well chosen, and to be fair it can be difficult, then maybe you need it to distinguish the type from an object of that type. Maybe. But I prefer to use case.
There is nothing wrong with that suffix as enums are types of their own, they simply are not type safe.
XYZ_TYPE myXYZ = X;
if(myXYZ == 1) { } //This is what I meant by not strongly typed.
C++0x fixes enums so they are strongly typed though.
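For comparison, a sketch of the strongly typed form that was standardised in C++11 as a scoped enumeration (enum class). The names Axis and to_int are hypothetical.

```cpp
// Scoped enumeration: enumerators live inside Axis and do not
// convert to int implicitly.
enum class Axis { X = 1, Y = 2, Z = 3 };

// Converting to the underlying integer has to be spelled out.
constexpr int to_int(Axis axis) { return static_cast<int>(axis); }

// Axis a = Axis::X;
// if (a == 1) { }   // would not compile: no implicit conversion to int
```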
Just follow whatever your coding standard says about enum type names. In the end it doesn't matter as long as it is consistent with your coding standard, and it is logically sound.
XYZ_TYPE is just another name that follows the C++ variable-naming conventions, though I would prefer to use all capital names for preprocessor definitions.
We'd just call it an XYZ, making it follow our convention of naming types in CamelCase with a leading capital letter. The enumerated values would be eX, eY, and eZ, following our convention of naming values and variables in CamelCase with a leading lowercase letter, and our convention of all enum values starting with e (constants start with k, and there are no other prefixes in general use. We use a very limited set of Light Side Hungarian.)
As with all conventions, your mileage may vary. But suffixing types with _TYPE seems like a beginner's technique that adds little value for the visual clutter it costs.

Why use prefixes on member variables in C++ classes

A lot of C++ code uses syntactical conventions for marking up member variables. Common examples include
m_memberName for public members (where public members are used at all)
_memberName for private members or all members
Others try to enforce using this->member whenever a member variable is used.
In my experience, most larger code bases fail at applying such rules consistently.
In other languages, these conventions are far less widespread. I see it only occasionally in Java or C# code. I think I have never seen it in Ruby or Python code. Thus, there seems to be a trend with more modern languages to not use special markup for member variables.
Is this convention still useful today in C++, or is it just an anachronism, especially as it is used so inconsistently across libraries? Haven't the other languages shown that one can do without member prefixes?
I'm all in favour of prefixes done well.
I think (System) Hungarian notation is responsible for most of the "bad rap" that prefixes get.
This notation is largely pointless in strongly typed languages e.g. in C++ "lpsz" to tell you that your string is a long pointer to a nul terminated string, when: segmented architecture is ancient history, C++ strings are by common convention pointers to nul-terminated char arrays, and it's not really all that difficult to know that "customerName" is a string!
However, I do use prefixes to specify the usage of a variable (essentially "Apps Hungarian", although I prefer to avoid the term Hungarian due to it having a bad and unfair association with System Hungarian), and this is a very handy timesaving and bug-reducing approach.
I use:
m for members
c for constants/readonlys
p for pointer (and pp for pointer to pointer)
v for volatile
s for static
i for indexes and iterators
e for events
Where I wish to make the type clear, I use standard suffixes (e.g. List, ComboBox, etc).
This makes the programmer aware of the usage of the variable whenever they see/use it. Arguably the most important case is "p" for pointer (because the usage changes from var. to var-> and you have to be much more careful with pointers - NULLs, pointer arithmetic, etc), but all the others are very handy.
For example, you can use the same variable name in multiple ways in a single function: (here a C++ example, but it applies equally to many languages)
MyClass::MyClass(int numItems)
{
    mNumItems = numItems;
    for (int iItem = 0; iItem < mNumItems; iItem++)
    {
        Item *pItem = new Item();
        itemList[iItem] = pItem;
    }
}
You can see here:
No confusion between member and parameter
No confusion between index/iterator and items
Use of a set of clearly related variables (item list, pointer, and index) that avoid the many pitfalls of generic (vague) names like "count", "index".
Prefixes reduce typing (shorter, and work better with auto-completion) than alternatives like "itemIndex" and "itemPtr"
Another great point of "iName" iterators is that I never index an array with the wrong index, and if I copy a loop inside another loop I don't have to refactor one of the loop index variables.
Compare this unrealistically simple example:
for (int i = 0; i < 100; i++)
    for (int j = 0; j < 5; j++)
        list[i].score += other[j].score;
(which is hard to read and often leads to use of "i" where "j" was intended)
with:
for (int iCompany = 0; iCompany < numCompanies; iCompany++)
    for (int iUser = 0; iUser < numUsers; iUser++)
        companyList[iCompany].score += userList[iUser].score;
(which is much more readable, and removes all confusion over indexing. With auto-complete in modern IDEs, this is also quick and easy to type)
The next benefit is that code snippets don't require any context to be understood. I can copy two lines of code into an email or a document, and anyone reading that snippet can tell the difference between all the members, constants, pointers, indexes, etc. I don't have to add "oh, and be careful because 'data' is a pointer to a pointer", because it's called 'ppData'.
And for the same reason, I don't have to move my eyes out of a line of code in order to understand it. I don't have to search through the code to find if 'data' is a local, parameter, member, or constant. I don't have to move my hand to the mouse so I can hover the pointer over 'data' and then wait for a tooltip (that sometimes never appears) to pop up. So programmers can read and understand the code significantly faster, because they don't waste time searching up and down or waiting.
(If you don't think you waste time searching up and down to work stuff out, find some code you wrote a year ago and haven't looked at
since. Open the file and jump about half way down without reading it.
See how far you can read from this point before you don't know if
something is a member, parameter or local. Now jump to another random
location... This is what we all do all day long when we are single
stepping through someone else's code or trying to understand how to
call their function)
The 'm' prefix also avoids the (IMHO) ugly and wordy "this->" notation, and the inconsistency that it guarantees (even if you are careful you'll usually end up with a mixture of 'this->data' and 'data' in the same class, because nothing enforces a consistent spelling of the name).
'this' notation is intended to resolve ambiguity - but why would anyone deliberately write code that can be ambiguous? Ambiguity will lead to a bug sooner or later. And in some languages 'this' can't be used for static members, so you have to introduce 'special cases' in your coding style. I prefer to have a single simple coding rule that applies everywhere - explicit, unambiguous and consistent.
The last major benefit is with Intellisense and auto-completion. Try using Intellisense on a Windows Form to find an event - you have to scroll through hundreds of mysterious base class methods that you will never need to call to find the events. But if every event had an "e" prefix, they would automatically be listed in a group under "e". Thus, prefixing works to group the members, consts, events, etc in the intellisense list, making it much quicker and easier to find the names you want. (Usually, a method might have around 20-50 values (locals, params, members, consts, events) that are accessible in its scope. But after typing the prefix (I want to use an index now, so I type 'i...'), I am presented with only 2-5 auto-complete options. The 'extra typing' people attribute to prefixes and meaningful names drastically reduces the search space and measurably accelerates development speed)
I'm a lazy programmer, and the above convention saves me a lot of work. I can code faster and I make far fewer mistakes because I know how every variable should be used.
Arguments against
So, what are the cons? Typical arguments against prefixes are:
"Prefix schemes are bad/evil". I agree that "m_lpsz" and its ilk are poorly thought out and wholly useless. That's why I'd advise using a well designed notation designed to support your requirements, rather than copying something that is inappropriate for your context. (Use the right tool for the job).
"If I change the usage of something I have to rename it". Yes, of course you do, that's what refactoring is all about, and why IDEs have refactoring tools to do this job quickly and painlessly. Even without prefixes, changing the usage of a variable almost certainly means its name ought to be changed.
"Prefixes just confuse me". As does every tool until you learn how to use it. Once your brain has become used to the naming patterns, it will filter the information out automatically and you won't really mind that the prefixes are there any more. But you have to use a scheme like this solidly for a week or two before you'll really become "fluent". And that's when a lot of people look at old code and start to wonder how they ever managed without a good prefix scheme.
"I can just look at the code to work this stuff out". Yes, but you don't need to waste time looking elsewhere in the code or remembering every little detail of it when the answer is right on the spot your eye is already focussed on.
(Some of) that information can be found by just waiting for a tooltip to pop up on my variable. Yes. Where supported, for some types of prefix, when your code compiles cleanly, after a wait, you can read through a description and find the information the prefix would have conveyed instantly. I feel that the prefix is a simpler, more reliable and more efficient approach.
"It's more typing". Really? One whole character more? Or is it - with IDE auto-completion tools, it will often reduce typing, because each prefix character narrows the search space significantly. Press "e" and the three events in your class pop up in intellisense. Press "c" and the five constants are listed.
"I can use this-> instead of m". Well, yes, you can. But that's just a much uglier and more verbose prefix! Only it carries a far greater risk (especially in teams) because to the compiler it is optional, and therefore its usage is frequently inconsistent. m on the other hand is brief, clear, explicit and not optional, so it's much harder to make mistakes using it.
I generally don't use a prefix for member variables.
I used to use an m prefix, until someone pointed out that "C++ already has a standard prefix for member access: this->".
So that's what I use now. That is, when there is ambiguity, I add the this-> prefix, but usually, no ambiguity exists, and I can just refer directly to the variable name.
To me, that's the best of both worlds. I have a prefix I can use when I need it, and I'm free to leave it out whenever possible.
Of course, the obvious counter to this is "yes, but then you can't see at a glance whether a variable is a class member or not".
To which I say "so what? If you need to know that, your class probably has too much state. Or the function is too big and complicated".
In practice, I've found that this works extremely well. As an added bonus it allows me to promote a local variable to a class member (or the other way around) easily, without having to rename it.
And best of all, it is consistent! I don't have to do anything special or remember any conventions to maintain consistency.
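A sketch of that approach, assuming C++11 for the default member initializers. The class Widget and its members are hypothetical.

```cpp
class Widget {
public:
    // The parameter hides the member here, so this-> is needed.
    void set_width(int width) { this->width = width; }

    // The parameter has a different name: no ambiguity, no prefix.
    void set_height(int new_height) { height = new_height; }

    // No ambiguity here either: members are referred to directly.
    int area() const { return width * height; }

private:
    int width = 0;
    int height = 0;
};
```

The prefix appears only in the one place where a parameter hides a member; everywhere else the plain name is enough.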
By the way, you shouldn't use leading underscores for your class members. You get uncomfortably close to names that are reserved by the implementation.
The standard reserves all names starting with double underscore or underscore followed by capital letter. It also reserves all names starting with a single underscore in the global namespace.
So a class member with a leading underscore followed by a lower-case letter is legal, but sooner or later you're going to do the same to an identifier starting with upper-case, or otherwise break one of the above rules.
So it's easier to just avoid leading underscores. Use a postfix underscore, or a m_ or just m prefix if you want to encode scope in the variable name.
You have to be careful with using a leading underscore. A leading underscore before a capital letter in a word is reserved.
For example:
_Foo
_L
are all reserved words while
_foo
_l
are not. There are other situations where leading underscores before lowercase letters are not allowed. In my specific case, I found the _L happened to be reserved by Visual C++ 2005 and the clash created some unexpected results.
I am on the fence about how useful it is to mark up local variables.
Here is a link about which identifiers are reserved:
What are the rules about using an underscore in a C++ identifier?
I prefer postfix underscores, like such:
class Foo
{
private:
    int bar_;
public:
    int bar() { return bar_; }
};
Lately I have been tending to prefer the m_ prefix over having no prefix at all. The reason isn't so much that it's important to flag member variables, but that it avoids ambiguity. Say you have code like:
void set_foo(int foo) { foo = foo; }
That of course doesn't work: the parameter foo hides the member, so the statement only assigns the parameter to itself. So your options are:
this->foo = foo;
I don't like it, as it causes parameter shadowing, so you can no longer use g++ -Wshadow warnings, and it's also longer to type than m_. You also still run into naming conflicts between variables and functions when you have an int foo; and an int foo();.
foo = foo_; or foo = arg_foo;
Been using that for a while, but it makes the argument lists ugly; documentation shouldn't have to deal with name disambiguation in the implementation. Naming conflicts between variables and functions also exist here.
m_foo = foo;
API documentation stays clean, you don't get ambiguity between member functions and variables, and it's shorter to type than this->. The only disadvantage is that it makes POD structures ugly, but as POD structures don't suffer from the name ambiguity in the first place, one doesn't need to use it with them. Having a unique prefix also makes a few search&replace operations easier.
foo_ = foo;
Most of the advantages of m_ apply, but I reject it for aesthetic reasons, a trailing or leading underscore just makes the variable look incomplete and unbalanced. m_ just looks better. Using m_ is also more extendable, as you can use g_ for globals and s_ for statics.
PS: The reason why you don't see m_ in Python or Ruby is that both languages enforce their own prefix: Ruby uses @ for member variables and Python requires self. in front of them.
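The options above can be compared in a small sketch. The class names (Broken, WithThis, WithPrefix) and the get_foo accessor are made up for illustration; only set_foo, foo and m_foo come from the answer. A compiler may warn about the self-assignment in the first class, which is the point of the example.

```cpp
#include <cassert>

// Hypothetical example classes, one per option discussed above.

// No prefix: both names in the setter refer to the parameter,
// so the member is never written.
class Broken {
public:
    void set_foo(int foo) { foo = foo; }  // assigns the parameter to itself
    int get_foo() const { return foo; }   // getter cannot be called foo()
private:
    int foo = 0;
};

// this->: works, but the parameter still shadows the member.
class WithThis {
public:
    void set_foo(int foo) { this->foo = foo; }
    int get_foo() const { return foo; }
private:
    int foo = 0;
};

// m_ prefix: no shadowing, and the getter can use the bare name.
class WithPrefix {
public:
    void set_foo(int foo) { m_foo = foo; }
    int foo() const { return m_foo; }
private:
    int m_foo = 0;
};
```

With these definitions, calling set_foo(5) leaves the member of Broken at 0, while the other two classes store 5.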
When reading through a member function, knowing who "owns" each variable is absolutely essential to understanding the meaning of the variable. In a function like this:
void Foo::bar( int apples )
{
int bananas = apples + grapes;
melons = grapes * bananas;
spuds += melons;
}
...it's easy enough to see where apples and bananas are coming from, but what about grapes, melons, and spuds? Should we look in the global namespace? In the class declaration? Is the variable a member of this object or a member of this object's class? Without knowing the answer to these questions, you can't understand the code. And in a longer function, even the declarations of local variables like apples and bananas can get lost in the shuffle.
Prepending a consistent label for globals, member variables, and static member variables (perhaps g_, m_, and s_ respectively) instantly clarifies the situation.
void Foo::bar( int apples )
{
int bananas = apples + g_grapes;
m_melons = g_grapes * bananas;
s_spuds += m_melons;
}
These may take some getting used to at first—but then, what in programming doesn't? There was a day when even { and } looked weird to you. And once you get used to them, they help you understand the code much more quickly.
(Using "this->" in place of m_ makes sense, but is even more long-winded and visually disruptive. I don't see it as a good alternative for marking up all uses of member variables.)
A possible objection to the above argument would be to extend the argument to types. It might also be true that knowing the type of a variable "is absolutely essential to understanding the meaning of the variable." If that is so, why not add a prefix to each variable name that identifies its type? With that logic, you end up with Hungarian notation. But many people find Hungarian notation laborious, ugly, and unhelpful.
void Foo::bar( int iApples )
{
int iBananas = iApples + g_fGrapes;
m_fMelons = g_fGrapes * iBananas;
s_dSpuds += m_fMelons;
}
Hungarian does tell us something new about the code. We now understand that there are several implicit casts in the Foo::bar() function. The problem with the code now is that the value of the information added by Hungarian prefixes is small relative to the visual cost. The C++ type system includes many features to help types either work well together or to raise a compiler warning or error. The compiler helps us deal with types—we don't need notation to do so. We can infer easily enough that the variables in Foo::bar() are probably numeric, and if that's all we know, that's good enough for gaining a general understanding of the function. Therefore the value of knowing the precise type of each variable is relatively low. Yet the ugliness of a variable like "s_dSpuds" (or even just "dSpuds") is great. So, a cost-benefit analysis rejects Hungarian notation, whereas the benefit of g_, s_, and m_ overwhelms the cost in the eyes of many programmers.
I can't say how widespread it is, but speaking personally, I always (and have always) prefixed my member variables with 'm'. E.g.:
class Person {
....
private:
std::string mName;
};
It's the only form of prefixing I do use (I'm very anti Hungarian notation) but it has stood me in good stead over the years. As an aside, I generally detest the use of underscores in names (or anywhere else for that matter), but do make an exception for preprocessor macro names, as they are usually all uppercase.
The main reason for a member prefix is to distinguish between a member function and a member variable with the same name. This is useful if you use getters with the name of the thing.
Consider:
class person
{
public:
person(const std::string& full_name)
: full_name_(full_name)
{}
const std::string& full_name() const { return full_name_; }
private:
std::string full_name_;
};
The member variable could not be named full_name in this case. You need to rename the member function to get_full_name() or decorate the member variable somehow.
I don't think one syntax has real value over another. It all boils down, like you mentioned, to uniformity across the source files.
The only point where I find such rules interesting is when I need two things named identically, for example:
void myFunc(int index){
this->index = index;
}
void myFunc(int index){
m_index = index;
}
I use it to differentiate the two. It also helps when I wrap calls: RecvPacket(...) from a Windows DLL might be wrapped in RecvPacket(...) in my code. On these occasions, a prefix like "_" lets the two names look alike and stay easy to tell apart, while remaining different for the compiler.
Some responses focus on refactoring, rather than naming conventions, as the way to improve readability. I don't feel that one can replace the other.
I've known programmers who are uncomfortable with using local declarations; they prefer to place all the declarations at the top of a block (as in C), so they know where to find them. I've found that, where scoping allows for it, declaring variables where they're first used decreases the time that I spend glancing backwards to find the declarations. (This is true for me even for small functions.) That makes it easier for me to understand the code I'm looking at.
I hope it's clear enough how this relates to member naming conventions: When members are uniformly prefixed, I never have to look back at all; I know the declaration won't even be found in the source file.
I'm sure that I didn't start out preferring these styles. Yet over time, working in environments where they were used consistently, I optimized my thinking to take advantage of them. I think it's possible that many folks who currently feel uncomfortable with them would also come to prefer them, given consistent usage.
Those conventions are just that. Most shops use code conventions to ease code readability so anyone can easily look at a piece of code and quickly decipher between things such as public and private members.
Others try to enforce using this->member whenever a member variable is used
That is usually because there is no prefix. The compiler needs enough information to resolve the variable in question, be it a unique name because of the prefix, or via the this keyword.
So, yes, I think prefixes are still useful. I, for one, would prefer to type '_' to access a member rather than 'this->'.
Other languages will use coding conventions, they just tend to be different. C# for example has probably two different styles that people tend to use, either one of the C++ methods (_variable, mVariable or other prefix such as Hungarian notation), or what I refer to as the StyleCop method.
private int privateMember;
public int PublicMember;
public int Function(int parameter)
{
// StyleCop enforces using this. for class members.
this.privateMember = parameter;
}
In the end, it becomes what people know, and what looks best. I personally think code is more readable without Hungarian notation, but it can become easier to find a variable with intellisense for example if the Hungarian notation is attached.
In my example above, you don't need an m prefix for member variables because prefixing your usage with this. indicates the same thing in a compiler-enforced method.
This doesn't necessarily mean the other methods are bad, people stick to what works.
When you have a big method or code block, it's convenient to know immediately whether you're using a local variable or a member. It avoids errors and improves clarity!
IMO, this is personal. I don't use any prefixes at all. That said, if code is meant to be public, I think it had better have some prefixes so that it is more readable.
Large companies often use their own so-called 'developer rules'.
Btw, the funniest yet smartest one I saw was DRY KISS (Don't Repeat Yourself. Keep It Simple, Stupid). :-)
As others have already said, the importance is to be colloquial (adapt naming styles and conventions to the code base in which you're writing) and to be consistent.
For years I have worked on a large code base that uses both the "this->" convention as well as using a postfix underscore notation for member variables. Throughout the years I've also worked on smaller projects, some of which did not have any sort of convention for naming member variables, and other which had differing conventions for naming member variables. Of those smaller projects, I've consistently found those which lacked any convention to be the most difficult to jump into quickly and understand.
I'm very anal-retentive about naming. I will agonize over the name to be ascribed to a class or variable to the point that, if I cannot come up with something that I feel is "good", I will choose to name it something nonsensical and provide a comment describing what it really is. That way, at least the name means exactly what I intend it to mean--nothing more and nothing less. And often, after using it for a little while, I discover what the name should really be and can go back and modify or refactor appropriately.
One last point on the topic of an IDE doing the work: that's all nice and good, but IDEs are often not available in environments where I have to perform the most urgent work. Sometimes the only thing available at that point is a copy of 'vi'. Also, I've seen many cases where IDE code completion has propagated stupidity such as incorrect spelling in names. Thus, I prefer to not have to rely on an IDE crutch.
The original idea for prefixes on C++ member variables was to store additional type information that the compiler didn't know about. So for example, you could have a string that's a fixed length of chars, and another that's variable and terminated by a '\0'. To the compiler they're both char *, but if you try to copy from one to the other you get in huge trouble. So, off the top of my head,
const char *aszFred = "Hi I'm a null-terminated string";
char arrWilma[] = {'O', 'o', 'p', 's'}; // no terminating '\0'; decays to char * when passed to a function
where "asz" means this variable is an "ascii string (zero-terminated)" and "arr" means this variable is a character array.
Then the magic happens. The compiler will be perfectly happy with this statement:
strcpy(arrWilma, aszFred);
But you, as a human, can look at it and say "hey, those variables aren't really the same type, I can't do that".
Unfortunately a lot of places use standards such as "m_" for member variables, "i" for integers no matter how used, "cp" for char pointers. In other words they're duplicating what the compiler knows, and making the code hard to read at the same time. I believe this pernicious practice should be outlawed by statute and subject to harsh penalties.
Finally, there are two points I should mention:
Judicious use of C++ features allows the compiler to know the information you had to encode in raw C-style variables. You can make classes that will only allow valid operations. This should be done as much as practical.
If your code blocks are so long that you forget what type a variable is before you use it, they are way too long. Don't use names, re-organize.
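As a rough sketch of the first point, the two kinds of character data from the example above could be wrapped in distinct types so that mixing them is a compile error rather than something a naming prefix has to warn about. FixedChars, ZString and to_zstring are invented names for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper for a fixed-length buffer with no terminating '\0'.
class FixedChars {
public:
    FixedChars(const char* data, std::size_t length) : bytes(data, length) {}
    const char* data() const { return bytes.data(); }
    std::size_t size() const { return bytes.size(); }
private:
    std::string bytes;
};

// Hypothetical wrapper for a null-terminated string.
class ZString {
public:
    explicit ZString(std::string value) : text(std::move(value)) {}
    const char* c_str() const { return text.c_str(); }
    std::size_t length() const { return text.size(); }
private:
    std::string text;
};

// The only way from one type to the other is this explicit conversion,
// which uses the stored length instead of searching for a terminator.
inline ZString to_zstring(const FixedChars& fixed) {
    return ZString(std::string(fixed.data(), fixed.size()));
}

// The compiler now rejects accidental mixing of the two types.
static_assert(!std::is_convertible<FixedChars, ZString>::value,
              "no implicit conversion between the two string kinds");
static_assert(!std::is_assignable<ZString&, const FixedChars&>::value,
              "a fixed buffer cannot be assigned to a ZString");
```

A strcpy-style mistake between the two now fails to compile, and the intended copy has to go through to_zstring.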
Our project has always used "its" as a prefix for member data, and "the" as a prefix for parameters, with no prefix for locals. It's a little cutesy, but it was adopted by the early developers of our system because they saw it used as a convention by some commercial source libraries we were using at the time (either XVT or RogueWave - maybe both). So you'd get something like this:
void
MyClass::SetName(const RWCString &theName)
{
itsName = theName;
}
The big reason I see for scoping prefixes (and no others - I hate Hungarian notation) is that they prevent you from getting into trouble by writing code where you think you're referring to one variable, but you're really referring to another variable with the same name defined in the local scope. They also avoid the problem of coming up with variable names that represent the same concept in different scopes, like the example above. In that case, you would have to come up with some prefix or different name for the parameter "theName" anyway, so why not make a consistent rule that applies everywhere?
Just using this-> isn't really good enough - we're not as interested in reducing ambiguity as we are in reducing coding errors, and masking names with locally scoped identifiers can be a pain. Granted, some compilers may have the option to raise warnings for cases where you've masked the name in a larger scope, but those warnings may become a nuisance if you're working with a large set of third party libraries that happen to have chosen names for unused variables that occasionally collide with your own.
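A minimal sketch of the kind of masking error described here, using made-up classes (PlainCounter and PrefixedCounter are not from our code base):

```cpp
#include <cassert>

// Hypothetical class without scope prefixes: the local declaration
// masks the member of the same name, so the member never changes.
class PlainCounter {
public:
    void add(int amount) {
        int total = 0;     // intended to be the member; the local wins
        total += amount;   // updates the local only
    }
    int total = 0;
};

// Same logic with the "its"/"the" convention: the local scratch
// variable cannot collide with the member, so the mistake is visible.
class PrefixedCounter {
public:
    void add(int theAmount) {
        int total = 0;     // unprefixed, so clearly a local
        total += theAmount;
        itsTotal += total;
    }
    int itsTotal = 0;
};
```

After add(3), PlainCounter::total is still 0, while PrefixedCounter::itsTotal is 3.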
As for the its/the itself - I honestly find it easier to type than underscores (as a touch typist, I avoid underscores whenever possible - too much stretching off the home rows), and I find it more readable than a mysterious underscore.
I use it because VC++'s Intellisense can't tell when to show private members when accessing from outside the class. The only indication is a little "lock" symbol on the field icon in the Intellisense list. The prefix just makes it easier to identify private members (fields). It's also a habit from C#, to be honest.
class Person {
std::string m_Name;
public:
std::string Name() { return m_Name; }
void SetName(std::string name) { m_Name = name; }
};
int main() {
Person *p = new Person();
p->Name(); // valid
p->m_Name; // invalid, compiler throws error. but intellisense doesn't know this..
return 1;
}
I think that, if you need prefixes to distinguish class members from member function parameters and local variables, either the function is too big or the variables are badly named. If it doesn't fit on the screen so you can easily see what is what, refactor.
Given that they often are declared far from where they are used, I find that naming conventions for global constants (and global variables, although IMO there's rarely ever a need to use those) make sense. But otherwise, I don't see much need.
That said, I used to put an underscore at the end of all private class members. Since all my data is private, this implies members have a trailing underscore. I usually don't do this anymore in new code bases, but since, as a programmer, you mostly work with old code, I still do this a lot. I'm not sure whether my tolerance for this habit comes from the fact that I used to do this always and am still doing it regularly or whether it really makes more sense than the marking of member variables.
In Python, leading double underscores are used to emulate private members. For more details see this answer
I use m_ for member variables just to take advantage of Intellisense and related IDE-functionality. When I'm coding the implementation of a class I can type m_ and see the combobox with all m_ members grouped together.
But I could live without m_ 's without problem, of course. It's just my style of work.
It is useful to differentiate between member variables and local variables due to memory management. Broadly speaking, heap-allocated member variables should be destroyed in the destructor, while heap-allocated local variables should be destroyed within that scope. Applying a naming convention to member variables facilitates correct memory management.
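A small sketch of that distinction, assuming raw new/delete and a made-up Samples class: the m_ name signals that the destructor owns the cleanup, while the unprefixed local must be released before the function returns.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class used only to illustrate the two lifetimes.
class Samples {
public:
    explicit Samples(std::size_t count)
        : m_values(new int[count]()), m_count(count) {}

    // Heap-allocated member: released in the destructor.
    ~Samples() { delete[] m_values; }

    // Copying is disabled so the buffer is never freed twice.
    Samples(const Samples&) = delete;
    Samples& operator=(const Samples&) = delete;

    void set(std::size_t index, int value) { m_values[index] = value; }

    int sum_of_squares() const {
        // Heap-allocated local: released before leaving this scope.
        int* squares = new int[m_count];
        int total = 0;
        for (std::size_t i = 0; i < m_count; ++i) {
            squares[i] = m_values[i] * m_values[i];
            total += squares[i];
        }
        delete[] squares;
        return total;
    }

private:
    int* m_values;
    std::size_t m_count;
};
```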
Code Complete recommends m_varname for member variables.
While I've never thought the m_ notation useful, I would give McConnell's opinion weight in building a standard.
I almost never use prefixes in front of my variable names. If you're using a decent enough IDE you should be able to refactor and find references easily. I use very clear names and am not afraid of having long variable names. I've never had trouble with scope either with this philosophy.
The only time I use a prefix would be on the signature line. I'll prefix parameters to a method with _ so I can program defensively around them.
You should never need such a prefix. If such a prefix offers you any advantage, your coding style in general needs fixing, and it's not the prefix that's keeping your code from being clear. Typical bad variable names include "other" or "2". You do not fix that with requiring it to be mOther, you fix it by getting the developer to think about what that variable is doing there in the context of that function. Perhaps he meant remoteSide, or newValue, or secondTestListener or something in that scope.
It's an effective anachronism that's still propagated too far. Stop prefixing your variables and give them proper names whose clarity reflects how long they're used. Up to 5 lines you could call it "i" without confusion; beyond 50 lines you need a pretty long name.
I like variable names to give only a meaning to the values they contain, and leave how they are declared/implemented out of the name. I want to know what the value means, period. Maybe I've done more than an average amount of refactoring, but I find that embedding how something is implemented in the name makes refactoring more tedious than it needs to be. Prefixes indicating where or how object members are declared are implementation specific.
color = Red;
Most of the time, I don't care if Red is an enum, a struct, or whatever, and if the function is so large that I can't remember if color was declared locally or is a member, it's probably time to break the function into smaller logical units.
If your cyclomatic complexity is so great that you can't keep track of what is going on in the code without implementation-specific clues embedded in the names of things, most likely you need to reduce the complexity of your function/method.
Mostly, I only use 'this' in constructors and initializers.
According to the JOINT STRIKE FIGHTER AIR VEHICLE C++ CODING STANDARDS (December 2005):
AV Rule 67
Public and protected data should only be used in
structs—not classes. Rationale: A class is able to maintain its
invariant by controlling access to its data. However, a class cannot
control access to its members if those members are non-private. Hence all
data in a class should be private.
Thus, the "m" prefix becomes unnecessary, as all data should be private.
But it is a good habit to use the p prefix before a pointer as it is a dangerous variable.
Many of those conventions are from a time without sophisticated editors. I would recommend using a proper IDE that allows you to color every kind of variable. Color is by far easier to spot than any prefix.
If you need to get even more detail on a variable any modern IDE should be able to show it to you by moving the caret or cursor over it. And if you use a variable in a wrong way (for instance a pointer with the . operator) you will get an error, anyway.
Personally I use a relatively "simple" system to denote what variables are.
I combine the different "flags", then an underscore, then the memory type, and finally the name.
I like this because you can narrow down the amount of variables in an IDE's completion as much as possible as quickly as possible.
The stuff I use is:
m for member variable
s for static
c for const/constexpr
then an underscore _
then the variable memory type
p for unowned pointer
v for list
r for reference
nothing for owned value
for example if I had a member variable which is a list of ints I would put
m_vName
and for a static const pointer to a pointer of lists of ints I would put
sc_ppvName
This lets me quickly tell what the variable is used for and how to access it, as well as how to get/drop values.
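Spelled out as declarations, the scheme might look like the sketch below. The names (Defaults, Current, Counts, Shared, Inventory) are placeholders I made up for the example; only the flag and memory-type letters come from the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// s + v: static, owned list.
static std::vector<int> s_vDefaults = {1, 2, 3};
// s + p + v: static, unowned pointer to a list.
static std::vector<int>* s_pvCurrent = &s_vDefaults;
// s + c + p + p + v: static const pointer to a pointer to a list.
static std::vector<int>** const sc_ppvCurrent = &s_pvCurrent;

// Hypothetical class showing the member variants.
class Inventory {
public:
    explicit Inventory(std::vector<int>& shared) : m_rvShared(shared) {}

    void add(int count) {
        m_vCounts.push_back(count);   // owned list: plain member access
        m_rvShared.push_back(count);  // reference: same syntax, not owned
    }

    std::size_t size() const { return m_vCounts.size(); }

private:
    std::vector<int> m_vCounts;    // m + v: member, owned list
    std::vector<int>& m_rvShared;  // m + r + v: member, reference to a list
};
```

Typing m_ or s_ in an editor's completion list then narrows the candidates to one storage class, and the next letter narrows them by how the value is accessed.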