Pointer arithmetic disguised &(array[0]) - c++

Today I browsed some source code (it was an example file explaining the use of a software framework) and discovered a lot of code like this:
int* array = new int[10]; // or malloc, who cares. Please, no language wars. This is applicable to both languages
for ( int* ptr = &(array[0]); ptr <= &(array[9]); ptr++ )
{
...
}
So basically, they've done "take the address of the object that lies at address array + x".
Normally I would say that this is plain stupidity, as writing array + 0 or array + 9 directly does the same. I would even always rewrite such code to a size_t for loop, but that's a matter of style.
But the overuse of this got me thinking: Am I missing something blatantly obvious or something subtly hidden in the dark corners of the language?
For anyone wanting to take a look at the original source code, with all its nasty gotos, mallocs and of course this pointer thing, feel free to look at it online.

Yeah, there's no good reason for the first one. This is exactly the same thing:
int *ptr = array;
I agree on the second also, may as well just write:
ptr < (array + 10)
Of course you could also just make it a for loop from 0-9 and set the temp pointer to point to the beginning of the array.
for(int i = 0, *ptr = array; i < 10; ++i, ++ptr)
/* ... */
That of course assumes that ptr is not being modified within the body of the loop.

You're not missing anything, they do mean the same thing.
However, to try to shed some more light on this, I should say that I also write expressions like that from time to time, for added clarity.
I personally tend to think in terms of object-oriented programming, meaning that I prefer to refer to "the address of the nth element of the array", rather than "the nth offset from the beginning address of the array". Even though those two things are equivalent in C, when I'm writing the code, I have the former in mind - so I express that.
Perhaps that's the reasoning of the person who wrote this as well.

Edit: this is partially incorrect. Read the comments.
The problem with &(array[0]) is that it expands to &(*(array + 0)), which involves a dereference. Now, every compiler will obviously optimize this into the same thing as array + 0, but as far as the language is concerned the dereference can cause UB in places where array + 0 would not.

I think the reason why they wrote it this way was that
&(array[0])
and
&(array[9])
just look similar. Another way would be to write it
array + 0
and
array + 9
respectively. As you already mentioned, they essentially do the same (at least most compilers treat it as the same, I hope).
You could interpret the two types of expression differently: The first one can be read as "the address of element 0 / 9". The second one can be read as "array pointer with an element offset of 0 / 9". The first one sounds more high-level, the second more low-level. Most people tend to use the second form, though.
Now since array + 0 of course is the same as array, you could just write array. I think the point here is that the begin and end of the loop look "analogous" to each other. A question of personal taste.
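To make the equivalence discussed in these answers concrete, here is a minimal sketch that walks the same array with both spellings. The function names are made up for illustration, and the first function assumes n is at least 1 because it names the last element explicitly.

```cpp
#include <cstddef>

// Sums n ints using the "address of element" spelling from the question.
// Assumes n >= 1, because &(array[n - 1]) names the last element.
int sum_address_of_element(int* array, std::size_t n) {
    int total = 0;
    for (int* ptr = &(array[0]); ptr <= &(array[n - 1]); ptr++) {
        total += *ptr;
    }
    return total;
}

// Sums n ints using plain pointer arithmetic and a one-past-the-end bound.
int sum_pointer_offset(int* array, std::size_t n) {
    int total = 0;
    for (int* ptr = array; ptr < array + n; ++ptr) {
        total += *ptr;
    }
    return total;
}
```

Both functions visit exactly the same elements; the second form also works for an empty range, which is one practical reason to prefer the one-past-the-end bound.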

According to classical mathematics:
Array[n]
refers to the nth element in the array.
To "take the address of" the nth element, the & or address of operator is applied:
&Array[n]
To clear out any assumed ambiguities, parentheses are added:
&(Array[n])
To a reader, reading from left to right, this expression has the meaning:
Return the address of the element at position 'n'
The insurance may have developed as a protection against old faulty compilers.
Some people consider it more readable than:
Array + n
Sorry, but I am old school and prefer using the '&' version, paren or without. I'll waste my time making code easier to read than worrying about which version takes longer to compile or which version is more efficient.
A clearly commented section of code has a higher Return On Investment than a section of code that is micro-optimized for efficiency or uses sections of the language that are unfamiliar to non language lawyers.

Related

Converting arrays of one type to another

Basically I have an array of doubles. I want to pass this array to a function (ProcessData) which will treat them as short ints. Is creating a short pointer and pointing it to the array, then passing this pointer to the function ok (code 1) ?
Is this in effect the same as creating a short array, iterating through each element and converting each element of the double array to a short and then passing the short array pointer (code 2) ? Thanks
//code 1
//.....
short* shortPtr = (short*)doubleArr;
ProcessData(shortPtr);
..
//code 2
//...
short shortArr [ARRSIZE];
int i;
for (i = 0; i < ARRSIZE; i++)
{
shortArr[i] = (short)doubleArr[i];
}
ProcessData(shortArr);
You can't just cast, as the various comments have said. But if you use iterators you can get more or less the same effect:
void do_something_with_short(short i) {
/* whatever */
}
template <class Iter>
void do_something(Iter first, Iter last) {
while (first != last)
do_something_with_short(*first++);
}
You can call that template function with iterators into an array of any arithmetic type (in fact, any type that's implicitly convertible to short or, if you add a cast at the point of the call to do_something_with_short, with a type that requires a cast):
double data[10]; // needs to be initialized, of course
do_something(std::begin(data), std::end(data));
No you can't do that. Here's at least one reason why:
An array is a contiguous sequence of several memory allocations accessed by way of an index, like so
[----][----][----]
Note the four dashes inside the square brackets. That is to indicate that in most situations in C/C++, an int is four bytes long. Array cells can be accessed by their index because if we know the memory address of the first cell (m) and we know how big each cell is meant to be (c), in this case four bytes, we can easily find the memory location of any index by doing m + index * c.
[----][----][----]
^ array[0]
[----][----][----]
---- ---- ^ array[2]
Fundamentally, this is why pointers can be treated like arrays in C/C++, because when you are accessing arrays, you are basically doing pointer arithmetic anyway.
In most cases in C/C++, a short is 2 bytes long, so to represent it in the same way
[--][--][--]
If you create a short pointer, and try to use it as an array, it is expected to point to something which is arranged like the above. If you try to index it, it is going to have problems: if you were dealing with an array of shorts, the location of array[2] is the same as m + 2 * index, as shown below
[--][--][--]
-- -- ^ array[2] (note moving along four bytes)
But since we are in reality dealing with an array of integers, the following will happen
[----][----][----]
---- ^ array[2] (again moving along four bytes)
Which is clearly wrong
No, because ++ptr actually does something like ptr = (char*)ptr + sizeof *ptr (with sizeof (char) being 1 by definition). So incrementing a double pointer moves it by (usually) 8 bytes, while incrementing a short pointer moves it by only 2 bytes.
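The stride claim above can be checked directly. This is only a sketch; stride_in_bytes is a hypothetical helper name, and it measures the distance between p and p + 1 through char pointers, which is the same step ++p would take.

```cpp
#include <cstddef>

// Returns how many bytes a pointer of type T* advances when incremented once.
template <class T>
std::ptrdiff_t stride_in_bytes(T* p) {
    const char* before = reinterpret_cast<const char*>(p);
    const char* after = reinterpret_cast<const char*>(p + 1);  // same step as ++p
    return after - before;
}
```

Called with the address of a double it returns sizeof(double), and with the address of a short it returns sizeof(short), so a short* walking over doubles lands in the middle of elements.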
Suppose that your kids study piano and occasionally ask you to scan for them a stack of sheet music given to them by their teacher who was born in the 20th century (just like yourself). You take those sheets to your office and feed them to the photocopier. It creates decent digital scans that your kids can use on their piano equipped with a touch screen. All goes well until one day the child brings to you an old rare set of vinyl records. She despairs of finding those melodies in sheet music form but asks you to at least copy the records. Inexperienced in musical matters, you take those disks to your office, load them in the automatic document feeder of the scanner and realize that you are deep in ... um... crap only as you hear the sounds of the vinyl disks breaking inside the stupid machine. Even if the photocopier were not equipped with an ADF, and you had to place all the originals on its glass flatbed manually, you would hardly receive your fair share of praise when you sent the scans to your daughter.
The scanner doesn't care what you put into it - as long as it fits inside. It does its best, but the result is not up to the expectations. However, had you first taken the vinyl records to an experienced musician who would write them down as musical score, scanning those sheets would result in real delight of your child.
In C++, different types may differ to an extent that a printed sheet of paper differs from a CD. A C++ function expecting to receive an array of shorts will process any sequence of bytes/bits as an array of shorts. It doesn't care that the memory area is actually filled with values of a different type, having a completely different representation, just like the scanner didn't care about the contents of the stack on the ADF. Assuming that a function will internally convert each element of the array from double to short, is the same as believing that a photocopier includes a gramophone and a musician that will automatically transcribe vinyl recordings to sheet form. Note that the latter is a possible design for a real-world photocopier, and some other programming languages work like that. But not existing implementations of C++ [1].
[1] In theory, a standard compliant implementation of C/C++ is possible that would interpret all provisions of UB in the language in favor of the opposite answer to your question, rather than in favor of best performance. But that would make little sense for a language like C/C++.
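As a rough illustration of why "code 1" and "code 2" from the question are not the same thing, the sketch below converts by value and, separately, looks at the raw leading bytes of a double. The function names are hypothetical, the values are assumed to fit in a short, and the byte-level result assumes the usual IEEE 754 representation of double. memcpy is used so that the sketch itself does not read a double through a short*.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// "Code 2" from the question: convert each double to a short by value.
// Assumes every value fits in a short; the fractional part is discarded.
std::vector<short> convert_to_shorts(const std::vector<double>& input) {
    std::vector<short> output;
    output.reserve(input.size());
    for (double value : input) {
        output.push_back(static_cast<short>(value));
    }
    return output;
}

// Roughly what a short* aimed at the doubles would "see": the raw leading
// bytes of the first double, copied out without any numeric conversion.
short leading_bytes_as_short(double value) {
    short result = 0;
    std::memcpy(&result, &value, sizeof result);
    return result;
}
```

Converting 1.0 by value yields 1, while the leading bytes of 1.0 are just a fragment of its bit pattern and do not spell the number 1.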

what's the difference of i++ and ++i in for loop? [duplicate]

Perhaps it doesn't matter to the compiler once it optimizes, but in C/C++, I see most people make a for loop in the form of:
for (i = 0; i < arr.length; i++)
where the incrementing is done with the post fix ++. I get the difference between the two forms. i++ returns the current value of i, but then adds 1 to i on the quiet. ++i first adds 1 to i, and returns the new value (being 1 more than i was).
I would think that i++ takes a little more work, since a previous value needs to be stored in addition to a next value: Push *(&i) to stack (or load to register); increment *(&i). Versus ++i: Increment *(&i); then use *(&i) as needed.
(I get that the "Increment *(&i)" operation may involve a register load, depending on CPU design. In which case, i++ would need either another register or a stack push.)
Anyway, at what point, and why, did i++ become more fashionable?
I'm inclined to believe azheglov: It's a pedagogic thing, and since most of us do C/C++ on a Windows or *nix system where the compilers are of high quality, nobody gets hurt.
If you're using a low quality compiler or an interpreted environment, you may need to be sensitive to this. Certainly, if you're doing advanced C++ or device driver or embedded work, hopefully you're well seasoned enough for this to be not a big deal at all. (Do dogs have Buddha-nature? Who really needs to know?)
It doesn't matter which you use. On some extremely obsolete machines, and in certain instances with C++, ++i is more efficient, but modern compilers don't store the old value if it's not used. As to when it became popular to postincrement in for loops, my copy of K&R 2nd edition uses i++ on page 65 (the first for loop I found while flipping through.)
For some reason, i++ became more idiomatic in C, even though it creates a needless copy. (I thought that was through K&R, but I see this debated in other answers.) But I don't think there's a performance difference in C, where it's only used on built-ins, for which the compiler can optimize away the copy operation.
It does make a difference in C++, however, where i might be a user-defined type for which operator++() is overloaded. The compiler might not be able to assert that the copy operation has no visible side-effects and might thus not be able to eliminate it.
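A small sketch of the user-defined case: the Counter type and the global copy_count are made up for illustration, and the copy constructor is instrumented so the extra work done by the postfix form becomes visible.

```cpp
// Counts copy-constructor calls so the cost of postfix ++ can be observed.
int copy_count = 0;

struct Counter {
    int value = 0;

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++copy_count; }
    Counter& operator=(const Counter&) = default;

    // Prefix: increment in place and return a reference; no copy is needed.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix: must hand back the old value, so it copies before incrementing.
    Counter operator++(int) {
        Counter old(*this);
        ++value;
        return old;
    }
};
```

With this type, ++c makes no copies, while c++ makes at least one even when the result is discarded. Whether an optimizer removes that copy depends on whether it can see that the copy has no observable side effects, which here it does not.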
As for the reason why, here is what K&R had to say on the subject:
Brian Kernighan
you'll have to ask dennis (and it might be in the HOPL paper). i have a
dim memory that it was related to the post-increment operation in the
pdp-11, though beyond that i don't know, so don't quote me.
in c++ the preferred style for iterators is actually ++i for some subtle
implementation reason.
Dennis Ritchie
No particular reason, it just became fashionable. The code produced
is identical on the PDP-11, just an inc instruction, no autoincrement.
HOPL Paper
Thompson went a step further by inventing the ++ and -- operators, which increment or decrement; their prefix or postfix position determines whether the alteration occurs before or after noting the value of the operand. They were not in the earliest versions of B, but appeared along the way. People often guess that they were created to use the auto-increment and auto-decrement address modes provided by the DEC PDP-11 on which C and Unix first became popular. This is historically impossible, since there was no PDP-11 when B was developed. The PDP-7, however, did have a few ‘auto-increment’ memory cells, with the property that an indirect memory reference through them incremented the cell. This feature probably suggested such operators to Thompson; the generalization to make them both prefix and postfix was his own. Indeed, the auto-increment cells were not used directly in implementation of the operators, and a stronger motivation for the innovation was probably his observation that the translation of ++x was smaller than that of x=x+1.
For integer types the two forms should be equivalent when you don't use the value of the expression. This is no longer true in the C++ world with more complicated types, but is preserved in the language name.
I suspect that "i++" became more popular in the early days because that's the style used in the original K&R "The C Programming Language" book. You'd have to ask them why they chose that variant.
Because as soon as you start using "++i" people will be confused and curious. They will halt their everyday work and start googling for explanations. 12 minutes later they will enter Stack Overflow and create a question like this. And voila, your employer just spent yet another $10.
Going a little further back than K&R, I looked at its predecessor: Kernighan's C tutorial (~1975). Here the first few while examples use ++n. But each and every for loop uses i++. So to answer your question: Almost right from the beginning i++ became more fashionable.
My theory (why i++ is more fashionable) is that when people learn C (or C++) they eventually learn to code iterations like this:
while( *p++ ) {
...
}
Note that the postfix form is important here (using the prefix form would create an off-by-one type of bug).
When the time comes to write a for loop where ++i or i++ doesn't really matter, it may feel more natural to use the postfix form.
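To show what goes wrong when the prefix form is used in that idiom, here is a minimal sketch with two hypothetical counting functions. The prefix variant is only safe for non-empty strings, because on an empty string it would step past the terminator.

```cpp
#include <cstddef>

// Postfix: test the current character, then advance. Counts every character.
std::size_t count_with_postfix(const char* p) {
    std::size_t n = 0;
    while (*p++) {
        ++n;
    }
    return n;
}

// Prefix: advance first, then test. The first character is never examined,
// so the count comes out one short. Do not call this with an empty string.
std::size_t count_with_prefix(const char* p) {
    std::size_t n = 0;
    while (*++p) {
        ++n;
    }
    return n;
}
```

For "abc" the postfix version counts 3 characters and the prefix version counts 2.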
ADDED: What I wrote above applies to primitive types, really. When coding something with primitive types, you tend to do things quickly and do what comes naturally. That's the important caveat that I need to attach to my theory.
If ++ is an overloaded operator on a C++ class (the possibility Rich K. suggested in the comments) then of course you need to code loops involving such classes with extreme care as opposed to doing simple things that come naturally.
At some level it's idiomatic C code. It's just the way things are usually done. If that's your big performance bottleneck you're likely working on a unique problem.
However, looking at my K&R The C Programming Language, 1st edition, the first instance I find of i in a loop (p. 38) does use ++i rather than i++.
In my opinion it became more fashionable with the creation of C++ as C++ enables you to call ++ on non-trivial objects.
Ok, I elaborate: If you call i++ and i is a non-trivial object, then storing a copy containing the value of i before the increment will be more expensive than for say a pointer or an integer.
I think my predecessors are right regarding the side effects of choosing postincrement over preincrement.
As for its fashionability, it may be as simple as that you start all three expressions within the for statement the same repetitive way, something the human brain seems to lean towards.
I would add up to what other people told you that the main rule is: be consistent. Pick one, and do not use the other one unless it is a specific case.
If the loop is too long, you need to reload the value in the cache to increment it before the jump to the beginning.
With ++i you don't need that: no cache move.
In C, all operators that result in a variable having a new value besides prefix inc/dec modify the left hand variable (i=2, i+=5, etc). So in situations where ++i and i++ can be interchanged, many people are more comfortable with i++ because the operator is on the right hand side, modifying the left hand variable
Please tell me if that first sentence is incorrect, I'm not an expert with C.

C++ using boolean evaluations for array positions (jump table)

I have a C++ IF statement which looks like (pseudo code- all variables are ints):
if(x < y){
c += d;
}
else{
c += f;
}
and I am thinking of trying to remove the IF statement and instead, load the values d and f into a two-element array:
array[0] = d
array[1] = f
and then I would like to be able to refer to the array elements '0' or '1' based upon the underlying type of boolean (at least in C- 0 or 1). Is there any way to do this? So my code would change to be something like:
c += array[(x<y)]
If this is true, c increments by f; otherwise, if it's false, c increments by d.
Can I do this, using the boolean result to look up the array index?
Of course you can do it. However, chances are that you are only going to make it worse. If you think that you are removing a branch in this case, you are mistaken. Assuming a production quality compiler and x86_64 architecture, your first version will result in a nice conditional move (i.e. cmovge). The second version, however, will result in an extra level of indirection and a memory read (i.e. mov eax,DWORD PTR [rax*4+0x4005d0]).
If you accept suggestions, I have a very bad feeling that you are on a very, very wrong path right now. When you are optimizing your program, you have to first measure/profile to determine a bottleneck. Only when you know what the bottlenecks are can you start optimizing them. When optimizing, you have to measure/profile again to see whether there is an improvement or not. What you seem to be doing is not trusting your compiler, guessing, and doing false optimization. I recommend you stop right there, or else it will go downhill from there, trust me.
You could replace the if statement with the following if you want more compact code.
c += (x < y) ? d : f;
Yes that will work. Although it will make your code harder to understand and modern compilers will eliminate the if statement anyways (when translating to assembler).
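For comparison, here are the three variants side by side as a minimal sketch with made-up function names. Note one assumption: to keep the same behavior as the original if statement (d when x < y, f otherwise), the table is filled as {f, d}, which is the reverse of the order shown in the question.

```cpp
// Original form: an explicit branch.
int add_with_branch(int c, int x, int y, int d, int f) {
    if (x < y) {
        c += d;
    } else {
        c += f;
    }
    return c;
}

// Compact form using the conditional operator.
int add_with_ternary(int c, int x, int y, int d, int f) {
    return c + ((x < y) ? d : f);
}

// Table form: (x < y) is a bool that converts to 0 or 1 when used as an index.
// Slot 1 (true) holds d and slot 0 (false) holds f, matching the if statement.
int add_with_table(int c, int x, int y, int d, int f) {
    const int table[2] = {f, d};
    return c + table[x < y];
}
```

All three return the same result for the same inputs; which one produces the best machine code is something to measure rather than assume.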

Is it a bad idea to use pointers as loop incrementers instead of the usual "int i"?

An example of this would be:
char str[] = "Hello";
int strLength = strlen(str);
for ( char * pc = str;
pc < str + strLength;
pc++)
{
*pc += 2;
}
Edit: Accounted for write-protected memory issue.
My one issue is that you'd have a lot of fun if you leave out the * in *pc in the for loop. Whoops? More generally, it is slightly harder to tell the difference between reassigning the pointer and modifying the value.
However (though I don't have it handy), Stroustrup himself endorses (see edit) pointer iteration in the C++ Programming Language book. Basically, you can have a pretty terse implementation of string comparison between two char arrays using pointer arithmetic.
In short, I would recommend using such pointers in a "read only" fashion. If you need to write to the array, I would use the more traditional i.
This is, of course, all my personal preference.
Edit: Stroustrup doesn't endorse pointer iteration OVER integer -- he simply uses it at one point in the book, so my reasoning is that he doesn't think it's anathema to good practice.
It's ALWAYS a bad idea to use a construct that you don't fully understand. This extends to the people who will have to read your code after you... (I think this is a corollary to the "Don't be a clever programmer" rule)
In this case, if you DO understand, and are fully comfortable with the construct, then there's nothing inherently wrong with it... But usually, if you have to ask if it's a bad idea, then you're not fully comfortable with it...
No, it's not a bad idea, except that you messed it up.
For one, you're writing into a string literal. That's undefined behavior. (It crashes on Windows.) Had you written const char* str = "Hello!" the compiler would have barked at you. Unfortunately there's a (in C++ deprecated, but still allowed) conversion from a string literal to a non-const char* which allows your code to compile. However, what you want is an array which you can write into (and which is pre-initialized). For that use char str[] = "Hello!".
The other, minor, mistake is, that you loop through the string twice: strlen runs along the characters until it finds a '\0', and then you do the same again. It would be better if you checked for that '\0' yourself and avoid the call to strlen altogether.
Here's a fixed version of your loop:
char str[] = "Hello!";
for (char * pc = str; *pc != '\0'; pc++)
{
*pc += 2;
}
It is pretty much the idea behind STL iterators, so no, it's not a bad idea.
A canonical loop working on iterators looks something like this:
for (iter cur = begin(); cur != end(); ++cur)
where iter might be a pointer type, or might be some other iterator. It is basically how all the standard library algorithms are implemented.
However, a better question might be what you're trying to achieve with it. The C++ standard library does it because it enables a similar syntax for iterating over any kind of sequence, not just arrays or other containers which define operator[].
It better expresses your intent, in some cases. Sometimes, you don't care about the loop counter i, so why should it be there?
But in other cases, a plain old for loop, where you have access to the counter variable, makes more sense. Do what best expresses your intent.
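As a sketch of that canonical loop, the template below (add_two_to_each and demo are hypothetical names) runs unchanged over raw pointers and over container iterators. The expected string in demo assumes an ASCII-compatible character set.

```cpp
#include <cstring>
#include <vector>

// The canonical iterator loop: works for raw pointers and for container
// iterators alike, because both support !=, ++ and *.
template <class Iter>
void add_two_to_each(Iter first, Iter last) {
    for (Iter cur = first; cur != last; ++cur) {
        *cur += 2;
    }
}

// Example calls: the same template handles a char array and a std::vector.
bool demo() {
    char str[] = "Hello";
    add_two_to_each(str, str + std::strlen(str));      // pointers as iterators

    std::vector<int> numbers = {1, 2, 3};
    add_two_to_each(numbers.begin(), numbers.end());   // container iterators

    // With ASCII, "Hello" shifted by two becomes "Jgnnq".
    return std::strcmp(str, "Jgnnq") == 0 && numbers[2] == 5;
}
```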
It can be confusing to people not used to working with pointers. But there is simply no point in writing
for (int i=0; a[i]!=NULL; ++i){
a[i] = ...;
}
instead of
for (aptr p=a; *p!=NULL; ++p){
*p = ...;
}
Use the counter when they are equivalent and a pointer when it makes sense.
It's not always a bad idea, but you need to be careful. Take a look at this article.
https://www.securecoding.cert.org/confluence/display/seccode/ARR38-C.+Do+not+add+or+subtract+an+integer+to+a+pointer+if+the+resulting+value+does+not+refer+to+a+valid+array+element
For C++:
It's not a bad idea at all. In C++ you can use pointers similar to the iterators of the standard library. You can even use the standard library algorithms, such as std::copy, with pointers. It's also feasible to implement std::vector using pointers as iterators. Therefore I prefer iterating using pointers instead of indexes.
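For instance, a pair of pointers can be handed straight to the standard algorithms. This is a minimal sketch; the two wrapper functions are made up for illustration and only forward to std::copy and std::find.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Copies n ints from a raw array into a vector using std::copy; the pointer
// pair [source, source + n) plays the role of an iterator range.
std::vector<int> copy_from_array(const int* source, std::size_t n) {
    std::vector<int> destination(n);
    std::copy(source, source + n, destination.begin());
    return destination;
}

// Pointers work with other algorithms too, e.g. std::find. Returns `last`
// when the value is not present, just like any other iterator range.
const int* find_value(const int* first, const int* last, int wanted) {
    return std::find(first, last, wanted);
}
```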
I agree with ralu (and Milewski). Many years ago compilers were dumb and would literal-mindedly recalculate the array offset each time (I'm told), so that it was more efficient to use and bump a ptr yourself. However, they got smarter a few years later (as Milewski says) and could convert the [i] pattern to ptr bumping themselves. In addition, they could use the [i] pattern to unroll the loop a bit, but at that time were not smart enough to see through a programmer's bump-your-own-ptr trick. Now I don't know whether compilers are smart enough nowadays to unroll a loop with hand-rolled simple pointer bumping, possibly so; but I took from that example that the compiler could do cleverer things than had occurred to me, and that the best I could do was make my intent clear and get out of its way. Plus, I think it's easier for another programmer to understand indexes, and that trumps a lot of things.
It's not a bad idea in and of itself, but it is unusual. If you expect other people to be working on your code, I'd use the int i version just to reduce confusion.
Either way you've got to worry about the same problems.
It is faster than using higher level objects but as others warn be careful with it. You might have to do it due to the constraints that you are programming to but it is unconventional.
The scope of the variable being in the loop may not be portable as well.
Have a look at the C bible, a.k.a. K&R C (sanitised Amazon link) as they have a discussion about the advantages of both techniques.
Either way, "there be dragons ahead, arr!" so tread very carefully as the road of good pointer arithmetic intentions is paved with the abundant corpses of buffer overflow exploit victims! (-:
In fact, for an excellent discussion and a "wander out on to the thin ice of advanced pointer manipulation" (his term), have a look at Andy Koenig's excellent book "C Traps and Pitfalls" (sanitised Amazon link)
Edit: One thing I forgot to mention is that I tend to prefer the usual "for (int i = 0; ..)" style purely because it is such an ingrained idiom that anyone can see what you're doing with a quick glance. Using pointer arithmetic requires a bit more of a deeper look.
HTH
Sometimes incrementing a pointer in a loop looks pretty natural. Take a look at the following code that initializes a DirectX texture from a GDI+ bitmap:
boost::uint8_t* pDest = static_cast<boost::uint8_t*>(d3dlr.pBits);
const boost::uint8_t* pSrc = static_cast<const boost::uint8_t*>(bitmap_data.Scan0);
for ( UINT i = 0; i < bmp_height; ++i, pSrc += bitmap_data.Stride, pDest += d3dlr.Pitch )
memcpy(pDest, pSrc, bmp_width * BPP);
Two pointers are used here, and each has its own increment. I believe that using an additional int in this loop would make the code less readable.
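The same pattern can be sketched without the DirectX and GDI+ types. The function below is a hypothetical stand-in, not the original code: it assumes plain byte buffers, non-negative strides, and buffers that are at least height times their stride in size.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `height` rows of `row_bytes` bytes each. Source and destination rows
// may be padded differently, so each pointer advances by its own stride.
void copy_rows(std::uint8_t* dest, std::size_t dest_pitch,
               const std::uint8_t* src, std::size_t src_stride,
               std::size_t row_bytes, std::size_t height) {
    for (std::size_t row = 0; row < height;
         ++row, src += src_stride, dest += dest_pitch) {
        std::memcpy(dest, src, row_bytes);
    }
}
```

Here the row counter only controls how many rows are copied, while the two pointers track where each row starts in its own buffer.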
Generally I stick to this claim by Bartosz Milewski in his great, freely available C++ book C++ In Action.
Don’t use pointers unless there is no other way. I leave this kind of optimisation to compilers. It is such a simple and common usage that it is very unlikely that compilers cannot figure out how to optimise this kind of code.
One last thing from his book:
If your compiler is unable to optimize the human readable, maintainable version of the algorithm, and you have to double as a human compiler-- buy a new compiler! Nobody can afford human compilers any more. So, have mercy on yourself and your fellow programmers who will have to look at your code.
I do not find it problematic to use a pointer as the loop variable. However, I have a few problems with your example:
char str[] = "Hello";
int strLength = strlen(str);
for ( char * pc = str;
pc < str + strLength;
pc++)
{
*pc += 2;
}
strlen iterates over the whole string to figure out the length of the string. Even for a simple example such as this, there is no need for that kind of waste.
This example can be more clearly and succinctly written as:
char str[] = "Hello";
for (char *pc = str; *pc; ++pc) {
*pc += 2;
}
This version is more efficient and easier to understand. It also illustrates there is nothing wrong in principle with using a pointer as the loop variable.

Magic Numbers In Arrays? - C++

I'm a fairly new programmer, and I apologize if this information is easily available out there, I just haven't been able to find it yet.
Here's my question:
Is it considered a magic number when you use a literal number to access a specific element of an array?
For example:
arrayOfNumbers[6] // Is six a magic number in this case?
I ask this question because one of my professors is adamant that all literal numbers in a program are magic numbers. It would be nice for me just to access an element of an array using a real number, instead of using a named constant for each element.
Thanks!
That really depends on the context. If you have code like this:
arr[0] = "Long";
arr[1] = "sentence";
arr[2] = "as";
arr[3] = "array.";
...then 0..3 are not considered magic numbers. However, if you have:
int doStuff()
{
return my_global_array[6];
}
...then 6 is definitively a magic number.
It's pretty magic.
I mean, why are you accessing the 6th element? What are the semantics that should be applied to that number? As it stands all we know is "the 6th (zero-based) number". If we knew the declaration of arrayOfNumbers we would further know its type (e.g. an int or a double).
But if you said:
arrayOfNumbers[kDistanceToSaturn];
...now it has much more meaning to someone reading the code.
In general one iterates over an array, performing some operation on each element, because one doesn't know how long the array is and you can't just access it in a hardcoded manner.
However, sometimes array elements have specific meanings, for example, in graphics programming. Sometimes an array is always the same size because the data demands it (e.g. certain transform matrices). In these cases it may or may not be okay to access the specific element by number: domain experts will know what you're doing, but generalists probably won't. Giving the magic index number a name makes it more obvious to those who have to maintain your code, and helps you to prevent typing the wrong one accidentally.
In my example above I assumed your array holds distances from the sun to a planet. The sun would be the zeroth element, thus arrayOfNumbers[kDistanceToSun] = 0. Then as you increment, each element contains the distance to the next farthest planet: mercury, venus, etc. This is much more readable than just typing the number of the planet you want. In this case the array is of a fixed size because there are a fixed number of planets (well, except the whole Pluto debacle).
The other problem is that "arrayOfNumbers" tells us nothing about the contents of the array. We already know it's an array of numbers because we saw the declaration somewhere where you said int arrayOfNumbers[12345]; or however you declared it. Instead, something like:
int distanceToPlanetsFromSol[kNumberOfPlanets];
...gives us a much better idea of what the data actually is and what its semantics are. One of your goals as a programmer should be to write code that is self-documenting in this manner.
And then we can argue elsewhere if kNumberOfPlanets should be 8 or 9. :)
You should ask yourself why you are accessing that particular position. In this case, I assume that if you are doing arrayOfNumbers[6], the sixth position has some special meaning. If you think about what that meaning is, you will probably realize that the magic number is hiding it.
another way to look at it:
What if, after some change, the program needs to access the 7th element instead of the 6th? How would you or a maintainer know that? For example, if the 6th entry is the count of trees in CA, it would be a good thing to put
#define CA_STATE_ENTRY 6
Then if the table is reordered, somebody can see that they need to change this to 9 (say). BTW I am not saying this is the best way to maintain an array for tree counts by state; it probably isn't.
Likewise, if later people want to change the program to deal with trees in oregon, then they know to replace
trees[CA_STATE_ENTRY]
with
trees[OR_STATE_ENTRY]
The point is
trees[6]
is not self-documenting
Of course for C++ it should be an enum, not a #define.
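A minimal sketch of the enum variant follows. The enumerator order, the STATE_ENTRY_COUNT sentinel, the TreeCounts alias and the trees_in helper are all assumptions made for illustration; only the CA_STATE_ENTRY and OR_STATE_ENTRY names come from the answer above.

```cpp
#include <array>
#include <cstddef>

// Named indices instead of bare literals. The entries and their order are
// hypothetical; STATE_ENTRY_COUNT is a sentinel that gives the array size.
enum StateEntry { CA_STATE_ENTRY, OR_STATE_ENTRY, STATE_ENTRY_COUNT };

typedef std::array<int, STATE_ENTRY_COUNT> TreeCounts;

// Reads the tree count for one state; the call site documents itself.
int trees_in(const TreeCounts& trees, StateEntry state) {
    return trees[state];
}
```

If the table is reordered or extended, only the enum changes, and the array size follows automatically from the sentinel.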
You'd have to provide more context for a meaningful answer. Not all literal numbers are magic, but many are. In a case like that there is no way at all to tell for sure, though most cases I can think of off-hand with an explicit array index >>1 probably qualify as magic.
Not all literals in a program really qualify as "magic numbers" -- but this one certainly seems to. The 6 gives us no clue of why you're accessing that particular element of the array.
For a number not to be magic, the reason that value is being used must be quite clear on first (or at least minimal) examination. Just for example, a lot of code will do things like: &x[0]. In this case, it's typically pretty clear that the '0' really just means "the beginning of the array."
If you need to access a particular element of the array, chances are you're doing it wrong.
You should almost always be iterating over the entire array.
It's only not a magic number if your program is doing something very special involving the number six specifically. Could you provide some context?
That's the problem with professors, they're often too academic. In theory he's right, as usual, but usually magic numbers are used in a stricter context, when the number is embedded in a data stream, allowing you to detect certain properties of the stream (like the signature header of a file type for instance).
See also this Wikipedia entry.
Usually not all constant values in software are called magic numbers.
A Java class file always starts with the hex value 0xcafebabe, and a Windows .exe file starts with MZ (0x4d, 0x5a). This allows you to identify the content of a binary file quickly (but not with certainty).
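In this stricter sense, checking a magic number is just a comparison of the first few bytes. The sketch below uses hypothetical function names and works on a buffer that has already been read from the file; as noted above, a match is a strong hint rather than proof.

```cpp
#include <cstddef>

// Java class files begin with the four bytes CA FE BA BE.
bool starts_with_java_class_magic(const unsigned char* data, std::size_t size) {
    return size >= 4 && data[0] == 0xCA && data[1] == 0xFE &&
           data[2] == 0xBA && data[3] == 0xBE;
}

// DOS/Windows executables begin with the two bytes 'M' 'Z' (0x4D 0x5A).
bool starts_with_mz_magic(const unsigned char* data, std::size_t size) {
    return size >= 2 && data[0] == 0x4D && data[1] == 0x5A;
}
```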
In a MISRA compliant system, all values except 0 and 1 are considered magic numbers. My opinion has always been if the constant value is obvious or likely won't change then leave it as a number. If in doubt create a unique constant since long term maintenance will be easier.