Are Magic Numbers Okay In This Instance? - c++

In this question regarding magic numbers in arrays, paxdiablo says that any number other than -1, 0 and 1 is a magic number. I know that this is only a guideline, but I was wondering whether, when specifying dimensions of a multi-dimensional array, I need to define constants for which dimension is x, y and z. For example, is this okay:
typedef boost::multi_array<TileID, 3> TileArray3D;
TileArray3D::size_type GoRPG::MapData::GetWidth() {
return mData.shape()[0];
}
or is this preferred:
TileArray3D::size_type GoRPG::MapData::GetWidth() {
return mData.shape()[x_dimension];
}
Thanks in advance, ell.
Edit: The reason the lengths are stored in an array is because I am using Boost::multi_array and that is how they are stored. Apologies for the broken code! I can fix that later - at the moment I just would like to know about the magic numbers, although it is still useful, thank you!

No, magic numbers are non-obvious ones. -1, 0, and 1 are generally obvious, but not always.
In your case, neither of your snippets is obvious to me. Why is the width of your map data stored in an array?
Without seeing more code, I would prefer this:
auto GoRPG::MapData::GetWidth() {
return mData.shape()[width_index];
}

I think you should have a constant defined for this case and not use a number. 0 is not always "magic": there are instances where accessing the first element is the intuitive thing to do. (Example: when checking the first byte of a string, there's no reason to define a c_first_byte constant.)
But in your code, it would be even better to add a method to shape or mData named x() that returns the value that you want. There's no need to expose the internal representation of mData or shape to its users.
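A minimal sketch of that idea, assuming a std::array stands in for the shape() of boost::multi_array so that the example has no Boost dependency. The Dimension enum, the y_dimension and z_dimension names, and the GetHeight and GetDepth accessors are hypothetical; only GetWidth and x_dimension appear in the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical names for the three axes; only x_dimension appears in the question.
enum Dimension : std::size_t { x_dimension = 0, y_dimension = 1, z_dimension = 2 };

class MapData {
public:
    MapData(std::size_t width, std::size_t height, std::size_t depth)
        : mShape{{width, height, depth}}
    {
        assert(width > 0 && height > 0 && depth > 0);
    }

    // Callers never see which index holds which extent.
    std::size_t GetWidth() const { return mShape[x_dimension]; }
    std::size_t GetHeight() const { return mShape[y_dimension]; }
    std::size_t GetDepth() const { return mShape[z_dimension]; }

private:
    // Stand-in for mData.shape(), which exposes the extents as an indexable list.
    std::array<std::size_t, 3> mShape;
};
```

With this shape, the index constants live in exactly one place, and code outside MapData only ever calls GetWidth() and friends.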

The code does not compile, though, since you haven't used decltype to deduce the return type. Do all the shapes have the same width?
If yes, then it should be OK. If not, then a better approach could be to provide a default argument.
auto MapData::GetWidth(const size_t index = 0) -> decltype( mData.shape()[0]) {
return mData.shape()[index];
}

Related

C++: Create integer vector of infinities

I'm working on an algorithm and I need to initialize the vector of ints:
std::vector<int> subs(10)
of fixed length with values:
{-inf, +inf, +inf …. }
I read that it is possible to use INT_MAX, but it's not quite correct, because the elements of my vector are supposed to be greater than any possible int value.
I liked the approach of overloading the comparison operators from this answer, but how do you initialize the vector with infinitytype class objects if the elements are supposed to be ints?
Or maybe you know any better solution?
Thank you.
The solution depends on the assumptions your algorithm (or the implementation of your algorithm) has:
You could increase the element size beyond int (e.g. if your sizeof(int) is 4, use int64_t), and initialize to (int64_t) 1 + std::numeric_limits<int>::max() (and similarly for the negative values). But perhaps your algorithm assumes that you can't "exceed infinity" by adding or multiplying by positive numbers?
You could use an std::variant like other answers suggest, selecting between an int and infinity; but perhaps your algorithm assumes your elements behave like numbers?
You could use a ratio-based "number" class, ensuring it will not get non-integral values except infinity.
You could have your algorithm special-case the maximum and minimum integers
You could use floats or doubles which support -/+ infinity, and restrict them to integrality.
etc.
So, again, it really just depends and there's no one-size-fits-all solution.
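As a rough sketch of the first option (widening the element type), assuming int is narrower than 64 bits on your platform. The names pos_inf, neg_inf and make_subs are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

static_assert(sizeof(int) < sizeof(std::int64_t),
              "this sketch needs int to be narrower than 64 bits");

// Sentinels that lie just outside the range of int.
constexpr std::int64_t pos_inf =
    static_cast<std::int64_t>(std::numeric_limits<int>::max()) + 1;
constexpr std::int64_t neg_inf =
    static_cast<std::int64_t>(std::numeric_limits<int>::min()) - 1;

// Builds {-inf, +inf, +inf, ...} with n elements.
std::vector<std::int64_t> make_subs(std::size_t n)
{
    assert(n > 0);
    std::vector<std::int64_t> subs(n, pos_inf);
    subs[0] = neg_inf;
    return subs;
}
```

Every real int then compares strictly between the two sentinels, but note the caveat from the answer: arithmetic on a sentinel gives an ordinary number, not "infinity".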
As already said in the comments, you can't have an infinity value stored in int: all values of this type are well-defined and finite.
If you are ok with a vector of something working as an infinite for ints, then consider using a type like this:
struct infinite
{ };
bool operator < (int, infinite)
{
return true;
}
You can use a variant (for example, boost::variant) which supports double dispatching, which stores either an int or an infinitytype (which should store the sign of the infinity, for example in a bool), then implement the comparison operators through a visitor.
But I think it would be simpler if you simply used a double instead of int, and whenever you take out a value that is not infinity, convert it to int. If performance is not that great of an issue, then it will work fine (probably still faster than a variant). If you need great performance, then just use INT_MAX and be done with it.
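A small sketch of the double-based approach, under the assumption that your ints are at most 32 bits wide (a double can represent every such value exactly). The helper names make_double_subs and to_int are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Builds {-inf, +inf, +inf, ...} with n elements, stored as doubles.
std::vector<double> make_double_subs(std::size_t n)
{
    assert(n > 0);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> subs(n, inf);
    subs[0] = -inf;
    return subs;
}

// Converts a stored element back to int; only valid for finite elements.
int to_int(double value)
{
    assert(std::isfinite(value));
    return static_cast<int>(value);
}
```

Comparisons against ordinary ints then work without any special casing, since the int operand is converted to double first.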
You are already aware of the idea of an "infinite" type, but that implementation could only contain infinite values. There's another related idea:
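#include <cassert>

struct extended_int {
    enum kind {NEGINF, FINITE, POSINF} type;
    int finiteValue; // Only meaningful when type==FINITE
    bool operator<(extended_int rhs) const {
        if (type==POSINF) return false; // nothing is greater than +inf
        if (rhs.type==NEGINF) return false; // nothing is less than -inf
        if (type==NEGINF || rhs.type==POSINF) return true; // -inf < x and x < +inf
        assert(type==FINITE && rhs.type==FINITE);
        return finiteValue < rhs.finiteValue;
    }
    // Implicitly converting ctor
    constexpr extended_int(int value) : type(FINITE), finiteValue(value) { }
    // And the two infinities (defined after the class, once the type is complete)
    static const extended_int posinf;
    static const extended_int neginf;
private:
    // Used only to build the two infinities
    constexpr extended_int(kind k) : type(k), finiteValue(0) { }
};
constexpr extended_int extended_int::posinf = extended_int(POSINF);
constexpr extended_int extended_int::neginf = extended_int(NEGINF);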
You now have extended_int(5) < extended_int(6) but also extended_int(5) < extended_int::posinf

C++ function to tell whether a given function is injective

This might seem like a weird question, but how would I create a C++ function that tells whether a given C++ function that takes as a parameter a variable of type X and returns a variable of type X, is injective in the space of machine representation of those variables, i.e. never returns the same variable for two different variables passed to it?
(For those of you who weren't Math majors, maybe check out this page if you're still confused about the definition of injective: http://en.wikipedia.org/wiki/Injective_function)
For instance, the function
double square(double x) { return x*x; }
is not injective since square(2.0) = square(-2.0),
but the function
double cube(double x) { return x*x*x; }
is, obviously.
The goal is to create a function
template <typename T>
bool is_injective(T(*foo)(T))
{
/* Create a set std::set<T> retVals;
For each element x of type T:
if foo(x) is in retVals, return false;
if foo(x) is not in retVals, add it to retVals;
Return true if we made it through the above loop.
*/
}
I think I can implement that procedure except that I'm not sure how to iterate through every element of type T. How do I accomplish that?
Also, what problems might arise in trying to create such a function?
You need to test every possible bit pattern of length sizeof(T) bytes.
There was a widely circulated blog post about this topic recently: There are Only Four Billion Floats - So Test Them All!
In that post, the author was able to test all 32-bit floats in 90 seconds. Turns out that would take a few centuries for 64-bit values.
So this is only possible with small input types.
Multiple inputs, structs, or anything with pointers are going to get impossible fast.
BTW, even with 32-bit values you will probably exhaust system memory trying to store all the output values in a std::set, because std::set uses a lot of extra memory for pointers. Instead, you should use a bitmap that's big enough to hold one bit for each of the 2^(8 * sizeof(T)) possible output values. The specialized std::vector<bool> should work. That will take 2^(8 * sizeof(T)) / 8 bytes of memory (512 MiB for a 32-bit type).
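Here is a rough sketch of the bitmap idea, deliberately restricted to 8- and 16-bit unsigned types so that it runs instantly; the restriction and the two sample functions add_one and halve are my own assumptions, not part of the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

template <typename T>
bool is_injective(T (*foo)(T))
{
    static_assert(std::is_unsigned<T>::value && sizeof(T) <= 2,
                  "sketch only handles 8- and 16-bit unsigned types");
    assert(foo != nullptr);

    // One bit per possible output value.
    const std::size_t count =
        static_cast<std::size_t>(std::numeric_limits<T>::max()) + 1;
    std::vector<bool> seen(count, false);

    for (std::size_t i = 0; i < count; ++i) {
        const T out = foo(static_cast<T>(i));
        if (seen[out]) {
            return false; // two different inputs produced the same output
        }
        seen[out] = true;
    }
    return true;
}

// Hypothetical sample functions.
std::uint8_t add_one(std::uint8_t x) { return static_cast<std::uint8_t>(x + 1); } // wraps around: injective
std::uint8_t halve(std::uint8_t x) { return static_cast<std::uint8_t>(x / 2); }   // 0 and 1 both give 0
```

The same loop works for 32-bit types if you lift the static_assert and can spare the 512 MiB bitmap, but it is hopeless for 64-bit inputs, as noted above.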
Maybe what you need is std::numeric_limits. To store the results, you may use an unordered_map (from std if you're using C++11, or from boost if you're not).
You can check the limits of the data types, maybe something like this might work (it's a dumb solution, but it may get you started):
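#include <limits>
#include <unordered_map>

template <typename T>
bool is_injective(T(*foo)(T))
{
    // Maps each output seen so far to the input that produced it.
    std::unordered_map<T, T> hash_table;
    // lowest(), not min(): for floating-point types min() is the smallest positive value.
    T it = std::numeric_limits<T>::lowest();
    const T max = std::numeric_limits<T>::max();
    // Note: ++it only visits every value for integer types.
    while(true)
    {
        // emplace() reports false in .second if this output was already stored.
        auto result = hash_table.emplace(foo(it), it);
        if(result.second == false)
        {
            return false;
        }
        if(it == max) break; // stop before ++it would overflow
        ++it;
    }
    return true;
}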
Of course, you may want to restrict a few of the possible data types. Otherwise, if you check for floats, doubles or long integers, it'll get very intensive.
but the function
double cube(double x) { return x*x*x; }
is, obviously.
It is obviously not. There are 2^53 more double values representable in [0..0.5) than in [0..0.125).
As far as I know, you cannot iterate all possible values of a type in C++.
But, even if you could, that approach would get you nowhere. If your type is a 64 bit integer, you might have to iterate through 2^64 values and keep track of the result for all of them, which is not possible.
Like other people said, there is no solution for a generic type X.

Defining a C array where every element is the same memory location

In C (or C++), is it possible to create an array a (or something that "looks like" an array), such that a[0], a[1], etc., all point to the same memory location? So if you do
a[0] = 0.0f;
a[1] += 1.0f;
then a[0] will be equal to 1.0f, because it's the same memory location as a[1].
I do have a reason for wanting to do this. It probably isn't a good reason. Therefore, please treat this question as if it were asked purely out of curiosity.
I should have said: I want to do this without overloading the [] operator. The reason for this has to do with avoiding a dynamic dispatch. (I already told you my reason for wanting to do this is probably not a good one. There's no need to tell me I shouldn't want to do it. I already know this.)
I suppose a class like this is what you need
template <typename T>
struct strange_array
{
T & operator [] (int) { return value; }
private:
T value;
};
You can always define an array of pointers that all point to the same variable:
#include <stdio.h>

typedef unsigned int* special;

int main(void)
{
    unsigned int var = 0xdeadbeef;
    special arr[5];
    for (int i = 0; i < 5; i++)
        arr[i] = &var; /* every slot points at the same variable */
    *(arr[0]) = 0;
    *(arr[3]) += 3;
    printf("%u\n", *(arr[2]));
    /* -> 3 */
    return 0;
}
In C, I don't think so.
The expression a[i] simply means *(a + i), so it's hard to avoid the addition due to the indexing.
You might be able to glue something together by making a (the array name) a macro, but I'm not sure how: you wouldn't have access to the index in order to compensate for it.
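To make the a[i] == *(a + i) point concrete, here is a tiny check (the function name is made up) showing that indexing a built-in array always lands on a distinct address per index, which is why the elements cannot alias each other.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when a[i] and *(a + i) name the same object for every i in [0, n).
bool index_matches_pointer_arithmetic(const float* a, std::size_t n)
{
    assert(a != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        if (&a[i] != a + i) {
            return false;
        }
    }
    return true;
}
```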
Without overloading operator[]?
No, it's not possible.
Fortunately.
From all that conversation here, I now understand the problem as follows:
You want to have the syntax of an array, e.g.
a[n] // only lookup
a[n]++ // lookup and write
but you want to have the semantics changed to all of those map to the same element, like
a[0]
a[0]++
The C++ way to achieve this is IMHO to overload the index access operator [].
But, you don't want it for performance reasons.
I join the opinion of user Lightness Races in Orbit that you cannot do this within C++.
As you don't provide more information about the use case it is hard to come up with a solution.
Best I can imagine is that you have lots of written code which uses array semantics which you can not change.
What is left (wanting to keep performance) are code transformation techniques (CPP, sed, ..), generating a source code from the given source code with the desired behaviour, e.g. by forcing all index values to 0.

Class design: arrays vs multiple variables

I have a bit of a theoretical question, however it is a problem I sometimes face when designing classes and I see it done differently when reading others code. Which of the following would be better and why:
example 1:
class Color
{
public:
Color(float, float, float);
~Color();
friend bool operator==(Color& lhs, Color& rhs);
void multiply(Color);
// ...
float get_r();
float get_g();
float get_b();
private:
float color_values[3];
};
example 2:
class Color
{
public:
// as above
private:
float r;
float g;
float b;
};
Is there a general rule one should follow in cases like this or is it just up to a programmer and what seems to make more sense?
Both!
Use this:
class Color {
// ...
private:
union {
struct {
float r, g, b;
};
float c[3];
};
};
Then c[0] will be equivalent to r, et cetera.
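One caveat: an anonymous struct inside a union is a compiler extension rather than standard C++, and reading a union member other than the one last written is undefined behaviour in C++. A sketch of an alternative that stays within the standard, assuming you are happy with accessor functions instead of plain members (the accessor names are my own):

```cpp
#include <cassert>
#include <cstddef>

class Color {
public:
    Color(float r, float g, float b) : c_{r, g, b} {}

    // Named access for readability.
    float& r() { return c_[0]; }
    float& g() { return c_[1]; }
    float& b() { return c_[2]; }

    // Indexed access for loops.
    float& operator[](std::size_t i)
    {
        assert(i < 3);
        return c_[i];
    }

private:
    float c_[3];
};
```

Here color.r() and color[0] refer to the same float, so you get both the names and the ability to loop.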
It depends, do you intend to iterate over the whole array ?
In that case, I think solution 1 is more appropriate.
It is very useful to have an array like that when you have functions that operate in a loop on the data
e.g.
void BumpColors(float idx)
{
for (int i = 0; i < 3; ++i)
color_values[i] += idx;
}
vs
void BumpColors(float idx)
{
color_values[0] += idx;
color_values[1] += idx;
color_values[2] += idx;
}
Of course this is trivial, and I think it really is a matter of preference. In some rare occasion you might have APIs that take a pointer to the data though, and while you can do
awesomeAPI((float*)&r);
I would much prefer doing
awesomeAPI((float*)&color_values[0]);
because the array will guarantee its contiguity whereas you can mess up with the contiguity by adding by mistake another member variable that is not related after float r.
Performance wise there would be no difference.
I'd say the second one is the best one.
First, the data your variables contain isn't supposed (physically) to be in an array. If you had for example a class with 3 students, not more, not less, you'd put them in an array, because they are an array of students, but here, it's just colors.
Second, someone who reads your code can also understand really fast in the second case what your variables contain (r is red, etc). That isn't the case with an array.
Third, you'll have fewer bugs: you won't have to remember "oh, in my array, red is 0, g is 1, b is 2", and you won't replace by mistake
return color_values[0]
by
return color_values[1]
in your code.
I think that you are right: "It is just up to a programmer and what seems to make more sense." If this were my program, I would choose one form or the other without worrying too much about it, then write some other parts of the program, then revisit the matter later.
One of the benefits of class-oriented design is that it makes internal implementation details of this kind private, which makes it convenient to alter them later.
I think that your question does matter, only I doubt that one can answer it well until one has written more code. In the abstract, there are only three elements, and the three have names -- red, green and blue -- so I think that you could go either way with this. If forced to choose, I choose example 2.
Is there a general rule one should follow in cases like this or is it just up to a programmer and what seems to make more sense?
It's definitely up to the programmer and whatever makes more sense.
In your case, the second option seems more appropriate. After all, logically thinking, your member isn't an array of values, but values for r, g and b.
Advantages of using an array:
Maintainability: You can use the values in the array to loop
Maintainability: When a value should be added (like yellow?) then you don't have to change a lot of code.
Disadvantage:
Readability: Separate variables have clearer names (namely r, g, b in this case).
In your case the r, g, b variables are probably best, since it's unlikely that a color will be added, and looping over 3 elements is probably less important than readability.
Sometimes a programmer will use an array ( or data structure )
in order to save the data faster to disk (or memory) using 1 write operation.
This is especially useful if you are reading and writing a lot of data.

Boost::Tuples vs Structs for return values

I'm trying to get my head around tuples (thanks #litb), and the common suggestion for their use is for functions returning > 1 value.
This is something that I'd normally use a struct for , and I can't understand the advantages to tuples in this case - it seems an error-prone approach for the terminally lazy.
Borrowing an example, I'd use this
struct divide_result {
int quotient;
int remainder;
};
Using a tuple, you'd have
typedef boost::tuple<int, int> divide_result;
But without reading the code of the function you're calling (or the comments, if you're dumb enough to trust them) you have no idea which int is quotient and vice-versa. It seems rather like...
struct divide_result {
int results[2]; // 0 is quotient, 1 is remainder, I think
};
...which wouldn't fill me with confidence.
So, what are the advantages of tuples over structs that compensate for the ambiguity?
tuples
I think i agree with you that the issue with what position corresponds to what variable can introduce confusion. But i think there are two sides. One is the call-side and the other is the callee-side:
int remainder;
int quotient;
tie(quotient, remainder) = div(10, 3);
I think it's crystal clear what we got, but it can become confusing if you have to return more values at once. Once the caller's programmer has looked up the documentation of div, he will know what position is what, and can write effective code. As a rule of thumb, i would say not to return more than 4 values at once. For anything beyond, prefer a struct.
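For reference, a self-contained sketch of the tie version using std::tuple from C++11 instead of Boost. The function is called divide here (a made-up name) to avoid clashing with the standard library's div.

```cpp
#include <cassert>
#include <tuple>

// Returns (quotient, remainder), in that order.
std::tuple<int, int> divide(int dividend, int divisor)
{
    assert(divisor != 0);
    return std::make_tuple(dividend / divisor, dividend % divisor);
}
```

At the call site you would then write std::tie(quotient, remainder) = divide(10, 3); and the names on the left document which position is which.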
output parameters
Output parameters can be used too, of course:
int remainder;
int quotient;
div(10, 3, &quotient, &remainder);
Now i think that illustrates how tuples are better than output parameters. We have mixed the input of div with its output, while not gaining any advantage. Worse, we leave the reader of that code in doubt about what the actual return value of div could be. There are wonderful examples where output parameters are useful. In my opinion, you should use them only when you've got no other way, because the return value is already taken and can't be changed to either a tuple or struct. operator>> is a good example of where you use output parameters, because the return value is already reserved for the stream, so you can chain operator>> calls. If you are not dealing with operators, and the context is not crystal clear, i recommend you use pointers, to signal at the call side that the object is actually used as an output parameter, in addition to comments where appropriate.
returning a struct
The third option is to use a struct:
div_result d = div(10, 3);
I think that definitely wins the award for clearness. But note you have still to access the result within that struct, and the result is not "laid bare" on the table, as it was the case for the output parameters and the tuple used with tie.
I think a major point these days is to make everything as generic as possible. So, say you have got a function that can print out tuples. You can just do
cout << div(10, 3);
And have your result displayed. I think that tuples, on the other side, clearly win for their versatile nature. Doing that with div_result, you need to overload operator<<, or need to output each member separately.
Another option is to use a Boost Fusion map (code untested):
struct quotient;
struct remainder;
using boost::fusion::map;
using boost::fusion::pair;
typedef map<
pair< quotient, int >,
pair< remainder, int >
> div_result;
You can access the results relatively intuitively:
using boost::fusion::at_key;
res = div(x, y);
int q = at_key<quotient>(res);
int r = at_key<remainder>(res);
There are other advantages too, such as the ability to iterate over the fields of the map, etc etc. See the doco for more information.
With tuples, you can use tie, which is sometimes quite useful: std::tr1::tie (quotient, remainder) = do_division ();. This is not so easy with structs. Second, when using template code, it's sometimes easier to rely on pairs than to add yet another typedef for the struct type.
And if the types are different, then a pair/tuple is really no worse than a struct. Think for example pair<int, bool> readFromFile(), where the int is the number of bytes read and bool is whether the eof has been hit. Adding a struct in this case seems like overkill for me, especially as there is no ambiguity here.
Tuples are very useful in languages such as ML or Haskell.
In C++, their syntax makes them less elegant, but can be useful in the following situations:
you have a function that must return more than one argument, but the result is "local" to the caller and the callee; you don't want to define a structure just for this
you can use the tie function to do a very limited form of pattern matching "a la ML", which is more elegant than using a structure for the same purpose.
they come with predefined < operators, which can be a time saver.
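The last point can even be borrowed by a struct: a sketch, assuming C++11's std::tie, of getting a lexicographic operator< for free. The grid_index struct and the check_ordering helper are hypothetical.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical struct with named fields.
struct grid_index {
    int row;
    int column;
};

// std::tie builds tuples of references, and tuples compare lexicographically.
bool operator<(const grid_index& lhs, const grid_index& rhs)
{
    return std::tie(lhs.row, lhs.column) < std::tie(rhs.row, rhs.column);
}

// Small self-check showing the resulting ordering.
void check_ordering()
{
    const grid_index a = {1, 5};
    const grid_index b = {2, 0};
    const grid_index c = {2, 3};
    assert(a < b);    // row decides first
    assert(b < c);    // equal rows, so column decides
    assert(!(c < b));
}
```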
I tend to use tuples in conjunction with typedefs to at least partially alleviate the 'nameless tuple' problem. For instance if I had a grid structure then:
//row is element 0 column is element 1
typedef boost::tuple<int,int> grid_index;
Then I use the named type as :
grid_index find(const grid& g, int value);
This is a somewhat contrived example but I think most of the time it hits a happy medium between readability, explicitness, and ease of use.
Or in your example:
//quotient is element 0 remainder is element 1
typedef boost::tuple<int,int> div_result;
div_result div(int dividend,int divisor);
One feature of tuples that you don't have with structs is in their initialization. Consider something like the following:
struct A
{
int a;
int b;
};
Unless you write a make_tuple equivalent or constructor then to use this structure as an input parameter you first have to create a temporary object:
void foo (A const & a)
{
// ...
}
void bar ()
{
A dummy = { 1, 2 };
foo (dummy);
}
Not too bad, however, take the case where maintenance adds a new member to our struct for whatever reason:
struct A
{
int a;
int b;
int c;
};
The rules of aggregate initialization actually mean that our code will continue to compile without change. We therefore have to search for all usages of this struct and update them, without any help from the compiler.
Contrast this with a tuple:
typedef boost::tuple<int, int, int> Tuple;
enum {
A
, B
, C
};
void foo (Tuple const & p) {
}
void bar ()
{
foo (boost::make_tuple (1, 2)); // Compile error
}
The compiler cannot initialize "Tuple" with the result of make_tuple, and so generates the error that allows you to specify the correct values for the third parameter.
Finally, the other advantage of tuples is that they allow you to write code which iterates over each value. This is simply not possible using a struct.
void incrementValues (boost::tuples::null_type) {}
template <typename Tuple_>
void incrementValues (Tuple_ & tuple) {
// ...
++tuple.get_head ();
incrementValues (tuple.get_tail ());
}
Tuples prevent your code from being littered with many struct definitions. It's easier for the person writing the code, and for others using it, when you just document what each element in the tuple is, rather than writing your own struct and making people look up the struct definition.
Tuples will be easier to write - no need to create a new struct for every function that returns something. Documentation about what goes where will go to the function documentation, which will be needed anyway. To use the function one will need to read the function documentation in any case and the tuple will be explained there.
I agree with you 100% Roddy.
To return multiple values from a method, you have several options other than tuples, which one is best depends on your case:
Creating a new struct. This is good when the multiple values you're returning are related, and it's appropriate to create a new abstraction. For example, I think "divide_result" is a good general abstraction, and passing this entity around makes your code much clearer than just passing a nameless tuple around. You could then create methods that operate on this new type, convert it to other numeric types, etc.
Using "Out" parameters. Pass several parameters by reference, and return multiple values by assigning to each out parameter. This is appropriate when your method returns several unrelated pieces of information. Creating a new struct in this case would be overkill, and with Out parameters you emphasize this point, plus each item gets the name it deserves.
Tuples are Evil.