#define an array value with index - c++

For Legacy reasons I need to maintain the old array structure(s), however the compiler tells me that there is a problem with my 'expression syntax'.
Here's what I have:
#define Array[GlobalIndexVariable] Get(GlobalIndexVariable)->SubType;
Get returns a valid pointer to a struct, which contains SubType
Now I imagined this to take every occasion of Array[GlobalIndexVariable] and simply convert it to the function call, but apparently I was mistaken.
I don't even really know what to search for in this context, so a little help would be highly appreciated.
Thank you very much.
Edit:
Since it was asked, here is one of the lines that throw the error:
LocalVariable = Array[GlobalIndexVariable];
I wish to stress, that I'm not looking to make this into a macro, but much rather just want to replace Array[GlobalIndexVariable]; with Get(GlobalIndexVariable), so as soon as anything other than GlobalIndexVariable is used, the replacement would fail and throw a different error.
Edit:
Just to make absolutely sure that I am understood:
Right now I have 4 long arrays of type int. I want to stuff these into a struct to both mark that they belong together (they describe different aspects of the same thing) however I need a bulk solution for the previous code fragments (a few thousand occurrences).
Old:
int Index = 2;
int Array1[ANZ] = { 0, 1, 2 };
int Array2[ANZ] = { 0, 1, 2 };
int Array3[ANZ] = { 0, 1, 2 };
int Array4[ANZ] = { 0, 1, 2 };
if(Array1[Index] == 2) //doSomething
New:
struct {
int Type, Conf, Foo, Bar;
};
if(Array1[Index] == 2) //doSomething
Without touching Array1[Index] == 2, is it possible to use a define or something else to get the exact same behavior as before for the old code?

With a custom type, you may do
struct MyType
{
int& operator[](int index) const { return Get(GlobalIndexVariable)->SubType; }
};
#define Array MyType{}
And then
Array[GlobalIndexVariable]
will be replaced by
MyType{}[GlobalIndexVariable]
which will calls
Get(GlobalIndexVariable)->SubType;

Your problem is in your #define
#define Array[GlobalIndexVariable] Get(GlobalIndexVariable)->SubType;
The [] isn't a valid syntax for macro calls. If you're doing C++ you can write an operator[] for your array type. If you're in C your in a bit of a bind. You'll probably have to replace everything with round brackets rather than square ones.

I'd perform a substitution, writing something like
#ifdef USE_OLD_SYNTAX
# define GET_ARRAY_VALUE(globalIndex) Array[globalIndex]
#else
# define GET_ARRAY_VALUE(globalIndex) Get(globalIndex)->SubType;
#endif
Then use GET_ARRAY_VALUE in your code and check with USE_OLD_SINTAX which definition you want to use by find-and-replace your previous code.
But maybe it can be more simple a find-and-replace inside your code without macro at all.

Related

How can I elide a call if an edge condition is known at compile time?

I have the following situation: there's a huge set of templates like std::vector that will call memmove() to move parts of array. Sometimes they will want to "move" parts of length zero - for example, if the array tail is removed (like std::vector::erase()), they will want to move the remainder of the array which will happen to have length zero and that zero will be known at compile time (I saw the disassembly - the compiler is aware) yet the compiler will still emit a memmove() call.
So basically I could have a wrapper:
inline void callMemmove( void* dest, const void* source, size_t count )
{
if( count > 0 ) {
memmove( dest, source, count );
}
}
but this would introduce an extra runtime check in cases count is not known in compile time that I don't want.
Is it somehow possible to use __assume hint to indicate to the compiler that if it knows for sure that count is zero it should eliminate the memmove()?
The point of the __assume is to tell the compiler to skip portions of code when optimizing. In the link you provided the example is given with the default clause of the switch construct - there the hint tells the compiler that the clause will never be reached even though theoretically it could. You're telling the optimizer, basically, "Hey, I know better, throw this code away".
For default you can't not write it in (unless you cover the whole range in cases, which is sometimes problematic) because it would cause compilation error. So you need the hint to optimize the code you know that is unneeded out.
In your case - the code can be reached, but not always, so the __assume hint won't help you much. You have to check if the count is really 0. Unless you're sure it can never be anything but 0, then just don't write it in.
This solution uses a trick described in C++ compile-time constant detection - the trick uses the fact compile time integer zero can be converted to a pointer, and this can be used together with overloading to check for the "compile time known" property.
struct chkconst {
struct Small {char a;};
struct Big: Small {char b;};
struct Temp { Temp( int x ) {} };
static Small chk2( void* ) { return Small(); }
static Big chk2( Temp ) { return Big(); }
};
#define is_const_0(X) (sizeof(chkconst::chk2(X))<sizeof(chkconst::Big))
#define is_const(X) is_const_0( int(X)-int(X) )
#define memmove_smart(dst,src,n) do { \
if (is_const(n)) {if (n>0) memmove(dst,src,n);} \
else memmove(dst,src,n); \
} while (false)
Or, in your case, as you want to check for zero only anyway, one could use is_const_0 directly for maximum simplicity and portability:
#define memmove_smart(dst,src,n) if (is_const_0(n)) {} else memmove(dst,src,n)
Note: the code here used a version of is_const simpler than in the linked question. This is because Visual Studio is more standard conformant than GCC in this case. If targeting gcc, you could use following is_const variant (adapted to handle all possible integral values, including negative and INT_MAX):
#define is_const_0(X) (sizeof(chkconst::chk2(X))<sizeof(chkconst::Big))
#define is_const_pos(X) is_const_0( int(X)^(int(X)&INT_MAX) )
#define is_const(X) (is_const_pos(X)|is_const_pos(-int(X))|is_const_pos(-(int(X)+1)))
I think that you misunderstood the meaning of __assume. It does not tell the compiler to change its behavior when it knows what the values are, but rather it tells it what the values will be when it cannot infer it by itself.
In your case, if you told it to __assume that count > 0 it will skip the test, as you already told it that the result will always be true, it will remove the condition and will call memmove always, which is exactly what you want to avoid.
I don't know the intrinsics of VS, but in GCC there is a likely/unlikely intrinsic (__builtin_expect((x),1)) that can be used to hint the compiler as to which is the most probable outcome of the test. that will not remove the test, but will layout code so that the most probable (as in by your definition) branch is more efficient (will not branch).
If its possible to rename the memmove, I think something like this
would do - http://codepad.org/s974Fp9k
struct Temp {
int x;
Temp( int y ) { x=y; }
operator int() { return x; };
};
void memmove1( void* dest, const void* source, void* count ) {
printf( "void\n" );
}
void memmove1( void* dest, const void* source, Temp count ) {
memmove( dest, source, count );
printf( "temp\n" );
}
int main( void ) {
int a,b;
memmove1( &a,&b, sizeof(a) );
memmove1( &a,&b, sizeof(a)-4 );
}
I think the same is probably possible without the class - have to look at conversion rules
to confirm it.
Also it should be possible to overload the original memmove(), eg. by passing an
object (like Temp(sizeof(a)) as 3rd argument.
Not sure which way would be more convenient.

Which is the better practice: global constant or #define? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
C++ - enum vs. const vs. #define
Before I used #define I used to create constants in my main function and pass them where they were needed. I found that I passed them very often and it was kind of odd, especially array sizes.
More recently I have been using #define for the reason that I don't have to pass constants in my main to each individual function.
But now that I think of it, I could use global constants as well, but for some reason I have been a little hesitant towards them.
Which is the better practice: global constants or #define?
A side question, also related: Is passing constants from my main as I described a bad practice?
They don't do quite the same thing. #define lets you affect the code at compilation time, while global constants only come into effect at runtime.
Seeing as #define can only give you extra trouble because there's no checking going on with how you use it, you should use global constants when you can and #define when you must. It will be safer and more readable that way.
As for passing constants from main, it's not unreasonable because it makes the called functions more flexible to accept an argument from the caller than to blindly pull it out of some global. Of course it the argument isn't really expected to change for the lifetime of the program you don't have much to gain from that.
Using constants instead of #define is very much to be preferred. #define replaces the token dumbly in every place it appears, and can cause all sorts of unintended consequences.
Passing values instead of using globals is good practice. It makes the code more flexible and modular, and more testable. Try googling for "parameterise from above".
You should never use either #defines or const variables to represent array sizes; it's better to make them explicit.
Instead of:
#define TYPICAL_ARRAY_SIZE 4711
int fill_with_zeroes(char *array)
{
memset(array, 0, TYPICAL_ARRAY_SIZE);
}
int main(void)
{
char *za;
if((za = malloc(TYPICAL_ARRAY_SIZE)) != NULL)
{
fill_with_zeroes(za);
}
}
which uses a (shared, imagine it's in a common header or something) #define to communicate the array size, it's much better to just pass it to the function as a real argument:
void fill_with_zeroes(char *array, size_t num_elements)
{
memset(array, 0, num_elements); /* sizeof (char) == 1. */
}
Then just change the call site:
int main(void)
{
const size_t array_size = 4711;
char *za;
if((za = malloc(array_size)) != NULL)
{
fill_with_zeroes(za, array_size);
}
}
This makes the size local to the place that allocated it, there's no need for the called function to magically "know" something about its arguments that is not communicated through its arguments.
If the array is non-dynamically allocated, we can do even better and remove the repeated symbolic size even locally:
int main(void)
{
char array[42];
fill_with_zeroes(array, sizeof array / sizeof *array);
}
Here, the well-known sizeof x / sizeof *x expression is used to (at compile-time) compute the number of elements in the array.
Constants are better. The only difference between the two is that constants are type-safe.
You shouldn't use values defined with #define like const parameters. Defines are used mostly to prevent the compiler to compile some parts of code depending on your needings at compile time (platform dependent choices, optimization at compile time, ).
So if you are not using define for these reasons avoid that and use costant values.

Large Dynamic MultiDimensional Array Not Working

My code that I have is quite large and complicated so I won't waste your time reading it, but you're going to have to make certain assumtions about variables in it as a result. I will tell you the values of the variables which I have confirmed in the debugger so you know with certainty. Know that I have omitted a lot of unrelated code in here so what you see isn't everything but I have included everything that is relevant.
// This is defined in a class:
char**** m_DataKeys;
// This is in a member function of the same class:
m_DataKeys = new char*** [m_iNumOfHeroes]; // m_iNumOfHeroes = 2
while ( pkvHero )
{
// iHeroNum = 0 and then 1 #define NUM_OF_ABILITIES 4
m_DataKeys[iHeroNum] = new char** [NUM_OF_ABILITIES];
for (int ability = 0; ability < NUM_OF_ABILITIES; ability++)
{
if (pkvExtraData) // only is true when iHeroNum == 1 and ability == 0
{
// iNumOfExtraData == 2
m_DataKeys[iHeroNum][ability] = new char* [iNumOfExtraData];
while ( pkvSubKey )
{
// iCurExtraDataNum increments from 0 to 2
m_DataKeys[iHeroNum][ability][iCurExtraDataNum] = new char [50];
I put a break point on the line
m_DataKeys[iHeroNum] = new char** [NUM_OF_ABILITIES];
Before the line is called and when iHeroNum == 0 the m_DataKeys array looks like:
m_DataKeys | 0x02072a60
pointer | 0xffeeffee
Error : expression cannot be evaluated
Which is expected. After the line gets called it looks like:
m_DataKeys | 0x02072a60
pointer | 0x02496b00
pointer | 0xffeeffee
Error : expression cannot be evaluated
Which seems to look correct. However, since I set a breakpoint there, I hit play and had it hit it on the next loop around, where iHeroNum == 1 now and ran the line and m_DataKeys then looked like this:
m_DataKeys | 0x02072a60
pointer | 0x02496b00
pointer | 0xffeeffee
Error : expression cannot be evaluated
Which is the exact same as before! The line didn't change the array.... At all!
For clarification, m_DataKeys is a 3 dimensional array of character pointers to character arrays of size 50.
I can't figure out why this is happening, it looks like my code is correct. Is it possible that the garbage collector is screwing me over here? Or maybe the new allocator?
Edit: A Symptom of a Larger Problem
Let me elaborate a little more on the structure of my code, because really, this is just a cheap solution to a bigger problem.
I already have structs as one of you wisely suggested:
struct HeroData
{
// Lots o data here
// ...
// .
//
AbilityData* Abilities[NUM_OF_ABILITIES];
}
struct AbilityData
{
// More data here
// ...
// .
CUtlMap<char*,int> ExtraData [MAX_ABILITY_LEVELS];
}
Now when it got complicated and I had to do this DataKeys arrays of pointers to arrays of pointers crap is only when the need arose to be loading in some data to a dynamic structure, where both the keys, the values, and the numbers of data are completely dynamic. So I thought to use a map of char arrays to ints, but the only problem is that I can't store the actual char array in my map, I have to use a char *. I tried defining the map as:
CUtlMap<char[50],int> ExtraData [MAX_ABILITY_LEVELS];
But that really didn't work and it seems sort of strange to me anyway. So, I had to find some place to stick all these ExtraDataKeys and for some reason I thought it cool to do it like this. How can I store char arrays in objects like arrays or maps?
Since you are using pointers as class members, my best guess is that you are violating The Rule Of Three. That is, you did not provide a copy constructor and a copy assignment operator for your class. That usually leads to strange data loss when passing objects of your class around.
Note that no sane C++ programmer would use char****. Here is my best attempt to fix your problem using vectors and strings, but there is probably a much better design for your specific problem:
#include <string>
#include <vector>
class Foo
{
int m_iNumOfHeroes;
std::vector<std::vector<std::vector<std::string> > > m_DataKeys;
enum { NUM_OF_ABILITIES = 4, iNumOfExtraData = 2 };
public:
explicit Foo(int iNumOfHeroes)
: m_iNumOfHeroes(iNumOfHeroes)
, m_DataKeys(m_iNumOfHeroes, std::vector<std::vector<std::string> >
(NUM_OF_ABILITIES, std::vector<std::string>(iNumOfExtraData)))
{
}
};
int main()
{
Foo x(2);
}
In case you have never seen that colon syntax in the constructor before, that is a member initializer list.
I really wish C++ had array bounds checking
std::vector and std::string do have bounds checking if you use the foo.at(i) syntax instead of foo[i]. In Debug mode, even foo[i] has bounds checking enabled in Visual C++, IIRC.
Though the code might be correct, I personally find that working with something like a char **** can get pretty confusing pretty fast.
This is just my personal preference, but I always try to organize things in the most clear and unambiguous way I can, so what I would do in your situation would be something like
struct Ability
{
char extraData[NUM_OF_EXTRA_DATA][50];
};
struct HeroData
{
Ability abilities[NUM_OF_ABILITIES];
};
class Foo
{
// here you can choose a
HeroData *heroArray;
// and then you would alloc it with "heroArray = new HeroData[m_iNumOfHeroes];"
// or you can more simply go with a
std::vector<HeroData> heroVector;
};
I think this makes things more clear, making it easier for you and other programmers working on that code to keep track of what is what inside your arrays.
I think you expect the wrong thing to happen (that the visual display in the debugger would change), even though your code seems correct.
Your debugger displays m_DataKeys, *m_DataKeys and **m_DataKeys, which is the same as m_DataKeys, m_DataKeys[0] and m_DataKeys[0][0]. When you change m_DataKeys[1], you are not going to notice it in your debugger output.
The following might help you: in my debugger (MS Visual Studio 2005), if you enter e.g. m_DataKeys,5 as your watch expression, you will see the first 5 elements of the array, that is, m_DataKeys[0], m_DataKeys[1], ..., m_DataKeys[4] - arranged in a neat table. If this syntax (with the ,5) doesn't work for you, just add m_DataKeys[1] into the debugger's watch window.
Not sure why this didn't occur to me last night, but I was pretty tired. Heres what I decided to do:
struct AbilityData
{
// Stuff
CUtlMap<char*,int> ExtraData [MAX_ABILITY_LEVELS];
char **DataKeys;
}
Thats what my abilityData struct now looks like, and it now works, but now I want to reorganize it to be like:
struct AbilityData
{
// Stuff
CUtlMap<char*,int[MAX_ABILITY_LEVELS]> ExtraData;
char **DataKeys;
}
Because it makes more sense that way, but then I run into the same problem that I had before with the char array. It almost seems like to me it might just be best to ditch the whole map idea and make it like:
struct AbilityData
{
// Stuff
int *ExtraData;
char **DataKeys;
}
Where ExtraData is now also a dynamically allocated array.
The only problem with that is that I now have to get my data via a function which will loop through all the DataKeys, find a matching key for my input string, then return the ExtraData associated with it.

Should I use an enum or multiple const for non-sequential constants in c++?

I'm writing porting file-io set of functions from c into a c++ class. "Magic numbers" (unnamed constants) abound.
The functions read a file header which has a number of specific entries whose locations are currently denoted by magic numbers.
I was taught by a veteran programmer a couple years back that using "magic numbers" is inherently evil, and thus, I have since tried to avoid using unnamed constants in my port. So I want to create some sort of list of constants of where the entries are stored.
So far I've come up with two solutions that seem relatively safe -- use a namespace enclosed set of constants or a namespace enclosed enum.
Can I use either solution safely? Would there be any advantages to one over the other?
e.g.
OPTION 1
namespace hdr_pos {
const unsigned int item_1_pos=4;
const unsigned int item_2_pos=8;
const unsigned int item_3_pos=12;
const unsigned int item_4_pos=24;
const unsigned int item_5_pos=32;
};
OPTION 2
namespace hdr_pos {
enum e {
item_1_pos=4,
item_2_pos=8,
item_3_pos=12,
item_4_pos=24,
item_5_pos=32
};
};
Is there anyway to prevent duplicates, to catch if I change the positions due to a future update to the file header, but forget to change one of them?
Please keep this factual and non-subjective. If there is no advantage you know of, feel free to answer that.
Note: I would use more descriptive names, of course, in my actual implementation; I just called things item_<#>_... for examples sake...
I can see two advantages to using an enum. First, some debuggers can translate constants back into enum variable names (which can make debugging easier in some cases). Also, you can declare a variable of an enumerated type which can only hold a value from that enumeration. This can give you an additional form of type checking that you wouldn't have simply by using constants.
Checking to see if a value is duplicated might depend on your particular compiler. The easiest way to do so would probably be to write an external script that will parse your enum definition and report whether or not a value is duplicated (you can run this as part of your build process if you like).
I've dealt with this situation before, for error codes.
I have seen people using enums for error codes, and this pose some issues:
you can assign an int to the enum that doesn't not correspond to any value (too bad)
the value itself is declared in a header, meaning that error code reassignment (this happens...) breaks code compatibility, you also have to take care when adding elements...
you have to define all codes in the same header, even if often times some code are naturally restricted to a small portion of the application, because enums cannot be "extended"
there is no check that a same code is not assigned twice
you cannot iterate over the various fields of an enum
When designing my error codes solution, I thus chose another road: constants in a namespace, defined in source files, which address points 2 and 3. To gain in type safety though, the constants are not int, but a specific Code class:
namespace error { class Code; }
Then I can define several error files:
// error/common.hpp
namespace error
{
extern Code const Unknown;
extern Code const LostDatabaseConnection;
extern Code const LostNASConnection;
}
// error/service1.hpp
// error/service2.hpp
I didn't solved the arbitrary cast issue though (constructor is explicit, but public), because in my case I was required to forward error codes returned by other servers, and I certainly didn't want to have to know them all (that would have been too brittle)
However I did thought about it, by making the required constructor private and enforcing the use of a builder, we're even going to get 4. and 5. in a swoop:
// error/code.hpp
namespace error
{
class Code;
template <size_t constant> Code const& Make(); // not defined here
class Code: boost::totally_ordered<Code>
{
public:
Code(): m(0) {} // Default Construction is useful, 0 is therefore invalid
bool operator<(Code const& rhs) const { return m < rhs.m; }
bool operator==(Code const& rhs) const { return m == rhs.m; }
private:
template <size_t> friend Code const& Make();
explicit Code(size_t c): m(c) { assert(c && "Code - 0 means invalid"); }
size_t m;
};
std::set<Code> const& Codes();
}
// error/privateheader.hpp (inaccessible to clients)
namespace error
{
std::set<Code>& PrivateCodes() { static std::set<Code> Set; return Set; }
std::set<Code> const& Codes() { return PrivateCodes(); }
template <size_t constant>
Code const& Make()
{
static std::pair< std::set<Code>::iterator, bool > r
= PrivateCodes().insert(Code(constant));
assert(r.second && "Make - same code redeclared");
return *(r.first);
}
}
//
// We use a macro trick to create a function whose name depends
// on the code therefore, if the same value is assigned twice, the
// linker should complain about two functions having the same name
// at the condition that both are located into the same namespace
//
#define MAKE_NEW_ERROR_CODE(name, value) \
Make<value>(); void _make_new_code_##value ();
// error/common.cpp
#include "error/common.hpp"
#include "privateheader.hpp"
namespace error
{
Code const Unkown = MAKE_NEW_ERROR_CODE(1)
/// ....
}
A tad more work (for the framework), and only link-time/run-time check of the same assignment check. Though it's easy to diagnose duplicates simply by scanning for the pattern MAKE_NEW_ERROR_CODE
Have fun!
The title of your question suggests that the main reason you have doubts about using a enum is that your constants are non-iterative. But in C++ enum types are non-iterative already. You have to jump through quite a few hoops to make an iterative enum type.
I'd say that if your constants are related by nature, then enum is a pretty good idea, regardless of whether the constants are iterative or not. The main disadvantage of enums though is total lack of type control. In many cases you might prefer to have strict control over the types of your constant values (like have them unsigned) and that's something enum can't help you with (at least yet).
One thing to keep in mind is that you can't take the address of an enum:
const unsigned* my_arbitrary_item = &item_1_pos;
If they're purely constants and require no run-time stuff (like can't init enum with non-enum value) then they should just be const unsigned ints. Of course, the enum is less typing, but that's besides the point.

In which case is if(a=b) a good idea? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Inadvertent use of = instead of ==
C++ compilers let you know via warnings that you wrote,
if( a = b ) { //...
And that it might be a mistake that you certainly wanted to write:
if( a == b ) { //...
But is there a case where the warning should be ignored, because it's a good way to use this "feature"?
I don't see any code clarity reason possible, so is there a case where it’s useful?
Two possible reasons:
Assign & Check
The = operator (when not overriden) normally returns the value that it assigned. This is to allow statements such as a=b=c=3. In the context of your question, it also allows you to do something like this:
bool global;//a global variable
//a function
int foo(bool x){
//assign the value of x to global
//if x is equal to true, return 4
if (global=x)
return 4;
//otherwise return 3
return 3;
}
...which is equivalent to but shorter than:
bool global;//a global variable
//a function
int foo(bool x){
//assign the value of x to global
global=x;
//if x is equal to true, return 4
if (global==true)
return 4;
//otherwise return 3
return 3;
}
Also, it should be noted (as stated by Billy ONeal in a comment below) that this can also work when the left-hand argument of the = operator is actually a class with a conversion operator specified for a type which can be coerced (implicitly converted) to a bool. In other words, (a=b) will evaulate to true or false if a is of a type which can be coerced to a boolean value.
So the following is a similar situation to the above, except the left-hand argument to = is an object and not a bool:
#include <iostream>
using namespace std;
class Foo {
public:
operator bool (){ return true; }
Foo(){}
};
int main(){
Foo a;
Foo b;
if (a=b)
cout<<"true";
else
cout<<"false";
}
//output: true
Note: At the time of this writing, the code formatting above is bugged. My code (check the source) actually features proper indenting, shift operators and line spacing. The <'s are supposed to be <'s, and there aren't supposed to be enourmous gaps between each line.
Overridden = operator
Since C++ allows the overriding of operators, sometimes = will be overriden to do something other than what it does with primitive types. In these cases, the performing the = operation on an object could return a boolean (if that's how the = operator was overridden for that object type).
So the following code would perform the = operation on a with b as an argument. Then it would conditionally execute some code depending on the return value of that operation:
if (a=b){
//execute some code
}
Here, a would have to be an object and b would be of the correct type as defined by the overriding of the = operator for objects of a's type. To learn more about operator overriding, see this wikipedia article which includes C++ examples: Wikipedia article on operator overriding
while ( (line = readNextLine()) != EOF) {
processLine();
}
You could use to test if a function returned any error:
if (error_no = some_function(...)) {
// Handle error
}
Assuming that some_function returns the error code in case of an error. Or zero otherwise.
This is a consequence of basic feature of the C language:
The value of an assignment operation is the assigned value itself.
The fact that you can use that "return value" as the condition of an if() statement is incidental.
By the way, this is the same trick that allows this crazy conciseness:
void strcpy(char *s, char *t)
{
while( *s++ = *t++ );
}
Of course, the while exits when the nullchar in t is reached, but at the same time it is copied to the destination s string.
Whether it is a good idea, usually not, as it reduce code readability and is prone to errors.
Although the construct is perfectly legal syntax and your intent may truly be as shown below, don't leave the "!= 0" part out.
if( (a = b) != 0 ) {
...
}
The person looking at the code 6 months, 1 year, 5 years from now, at first glance, is simply going to believe the code contains a "classic bug" written by a junior programmer and will try to "fix" it. The construct above clearly indicates your intent and will be optimized out by the compiler. This would be especially embarrassing if you are that person.
Your other option is to heavily load it with comments. But the above is self-documenting code, which is better.
Lastly, my preference is to do this:
a = b;
if( a != 0 ) {
...
}
This is about a clear as the code can get. If there is a performance hit, it is virtually zero.
A common example where it is useful might be:
do {
...
} while (current = current->next);
I know that with this syntax you can avoid putting an extra line in your code, but I think it takes away some readability from the code.
This syntax is very useful for things like the one suggested by Steven Schlansker, but using it directly as a condition isn't a good idea.
This isn't actually a deliberate feature of C, but a consequence of two other features:
Assignment returns the assigned value
This is useful for performing multiple assignments, like a = b = 0, or loops like while ((n = getchar()) != EOF).
Numbers and pointers have truth values
C originally didn't have a bool type until the 1999 standard, so it used int to represent Boolean values. Backwards compatibility requires C and C++ to allow non-bool expressions in if, while, and for.
So, if a = b has a value and if is lenient about what values it accepts, then if (a = b) works. But I'd recommend using if ((a = b) != 0) instead to discourage anyone from "fixing" it.
You should explicitly write the checking statement in a better coding manner, avoiding the assign & check approach. Example:
if ((fp = fopen("filename.txt", "wt")) != NULL) {
// Do something with fp
}
void some( int b ) {
int a = 0;
if( a = b ) {
// or do something with a
// knowing that is not 0
}
// b remains the same
}
But is there a case where the warning
should be ignored because it's a good
way to use this "feature"? I don't see
any code clarity reason possible so is
there a case where its useful?
The warning can be suppressed by placing an extra parentheses around the assignment. That sort of clarifies the programmer's intent. Common cases I've seen that would match the (a = b) case directly would be something like:
if ( (a = expression_with_zero_for_failure) )
{
// do something with 'a' to avoid having to reevaluate
// 'expression_with_zero_for_failure' (might be a function call, e.g.)
}
else if ( (a = expression2_with_zero_for_failure) )
{
// do something with 'a' to avoid having to reevaluate
// 'expression2_with_zero_for_failure'
}
// etc.
As to whether writing this kind of code is useful enough to justify the common mistakes that beginners (and sometimes even professionals in their worst moments) encounter when using C++, it's difficult to say. It's a legacy inherited from C and Stroustrup and others contributing to the design of C++ might have gone a completely different, safer route had they not tried to make C++ backwards compatible with C as much as possible.
Personally I think it's not worth it. I work in a team and I've encountered this bug several times before. I would have been in favor of disallowing it (requiring parentheses or some other explicit syntax at least or else it's considered a build error) in exchange for lifting the burden of ever encountering these bugs.
while( (l = getline()) != EOF){
printf("%s\n", l);
}
This is of course the simplest example, and there are lots of times when this is useful. The primary thing to remember is that (a = true) returns true, just as (a = false) returns false.
Preamble
Note that this answer is about C++ (I started writing this answer before the tag "C" was added).
Still, after reading Jens Gustedt's comment, I realized it was not the first time I wrote this kind of answer. Truth is, this question is a duplicate of another, to which I gave the following answer:
Inadvertent use of = instead of ==
So, I'll shamelessly quote myself here to add an important information: if is not about comparison. It's about evaluation.
This difference is very important, because it means anything can be inside the parentheses of a if as long as it can be evaluated to a Boolean. And this is a good thing.
Now, limiting the language by forbidding =, where all other operators are authorized, is a dangerous exception for the language, an exception whose use would be far from certain, and whose drawbacks would be numerous indeed.
For those who are uneasy with the = typo, then there are solutions (see Alternatives below...).
About the valid uses of if(i = 0) [Quoted from myself]
The problem is that you're taking the problem upside down. The "if" notation is not about comparing two values like in some other languages.
The C/C++ if instruction waits for any expression that will evaluate to either a Boolean, or a null/non-null value. This expression can include two values comparison, and/or can be much more complex.
For example, you can have:
if(i >> 3)
{
std::cout << "i is less than 8" << std::endl
}
Which proves that, in C/C++, the if expression is not limited to == and =. Anything will do, as long as it can be evaluated as true or false (C++), or zero non-zero (C/C++).
About valid uses
Back to the non-quoted answer.
The following notation:
if(MyObject * p = findMyObject())
{
// uses p
}
enables the user to declare and then use p inside the if. It is a syntactic sugar... But an interesting one. For example, imagine the case of an XML DOM-like object whose type is unknown well until runtime, and you need to use RTTI:
void foo(Node * p_p)
{
if(BodyNode * p = dynamic_cast<BodyNode *>(p_p))
{
// this is a <body> node
}
else if(SpanNode * p = dynamic_cast<SpanNode *>(p_p))
{
// this is a <span> node
}
else if(DivNode * p = dynamic_cast<DivNode *>(p_p))
{
// this is a <div> node
}
// etc.
}
RTTI should not be abused, of course, but this is but one example of this syntactic sugar.
Another use would be to use what is called C++ variable injection. In Java, there is this cool keyword:
synchronized(p)
{
// Now, the Java code is synchronized using p as a mutex
}
In C++, you can do it, too. I don't have the exact code in mind (nor the exact Dr. Dobb's Journal's article where I discovered it), but this simple define should be enough for demonstration purposes:
#define synchronized(lock) \
if (auto_lock lock_##__LINE__(lock))
synchronized(p)
{
// Now, the C++ code is synchronized using p as a mutex
}
(Note that this macro is quite primitive, and should not be used as is in production code. The real macro uses a if and a for. See sources below for a more correct implementation).
This is the same way, mixing injection with if and for declaration, you can declare a primitive foreach macro (if you want an industrial-strength foreach, use Boost's).
About your typo problem
Your problem is a typo, and there are multiple ways to limit its frequency in your code. The most important one is to make sure the left-hand-side operand is constant.
For example, this code won't compile for multiple reasons:
if( NULL = b ) // won't compile because it is illegal
// to assign a value to r-values.
Or even better:
const T a ;
// etc.
if( a = b ) // Won't compile because it is illegal
// to modify a constant object
This is why in my code, const is one of the most used keyword you'll find. Unless I really want to modify a variable, it is declared const and thus, the compiler protects me from most errors, including the typo error that motivated you to write this question.
But is there a case where the warning should be ignored because it's a good way to use this "feature"? I don't see any code clarity reason possible so is there a case where its useful?
Conclusion
As shown in the examples above, there are multiple valid uses for the feature you used in your question.
My own code is a magnitude cleaner and clearer since I use the code injection enabled by this feature:
void foo()
{
// some code
LOCK(mutex)
{
// some code protected by a mutex
}
FOREACH(char c, MyVectorOfChar)
{
// using 'c'
}
}
... which makes the rare times I was confronted to this typo a negligible price to pay (and I can't remember the last time I wrote this type without being caught by the compiler).
Interesting sources
I finally found the articles I've had read on variable injection. Here we go!!!
FOR_EACH and LOCK (2003-11-01)
Exception Safety Analysis (2003-12-01)
Concurrent Access Control & C++ (2004-01-01)
Alternatives
If one fears being victim of the =/== typo, then perhaps using a macro could help:
#define EQUALS ==
#define ARE_EQUALS(lhs,rhs) (lhs == rhs)
int main(int argc, char* argv[])
{
int a = 25 ;
double b = 25 ;
if(a EQUALS b)
std::cout << "equals" << std::endl ;
else
std::cout << "NOT equals" << std::endl ;
if(ARE_EQUALS(a, b))
std::cout << "equals" << std::endl ;
else
std::cout << "NOT equals" << std::endl ;
return 0 ;
}
This way, one can protect oneself from the typo error, without needing a language limitation (that would cripple language), for a bug that happens rarely (i.e., almost never, as far as I remember it in my code).
There's an aspect of this that hasn't been mentioned: C doesn't prevent you from doing anything it doesn't have to. It doesn't prevent you from doing it because C's job is to give you enough rope to hang yourself by. To not think that it's smarter than you. And it's good at it.
Never!
The exceptions cited don't generate the compiler warning. In cases where the compiler generates the warning, it is never a good idea.
RegEx sample
RegEx r;
if(((r = new RegEx("\w*)).IsMatch()) {
// ... do something here
}
else if((r = new RegEx("\d*")).IsMatch()) {
// ... do something here
}
Assign a value test
int i = 0;
if((i = 1) == 1) {
// 1 is equal to i that was assigned to a int value 1
}
else {
// ?
}
My favourite is:
if (CComQIPtr<DerivedClassA> a = BaseClassPtr)
{
...
}
else if (CComQIPtr<DerivedClassB> b = BaseClassPtr)
{
...
}