Is this inefficient? - c++

Extending this question, I wanted to use my enumed val as they're "supposed" to be,
#include <stdio.h>
enum E{ A, B, C } ;
#define inc(enVal) (*((int*)&enVal))++
int main()
{
E t = A ;
inc( t ) ;
printf( "t %d\n", t ) ;
}
Now uh, t is a variable of enum'd type E, and I have a macro inc that increases the value of t by 1,
So is this macro (and presumably other macros like it for flag checking) going to be that much less efficient than just using int t instead?

No, it's not going to be less efficient. It will, however, be incredibly, hideously, wrong. Please, don't ever.
Oh, especially since the backing type of enums is undefined and they might well actually be compiled to less than the size of an int on some compilers.

Come on, it's ok to overload for enums:
E& operator ++ (E& x)
{
x = E((int)x + 1);
return x;
}
See in action.

I'm pretty sure this violates the strict-aliasing rules in the standard, and not only that it won't work right for C at all. What are you really trying to do? Does it actually make SENSE to increment the value?
Say you're trying to implement a state machine, much better is to just have a vector/array lookup table and use that to move to a new state.
Are you sure you shouldn't just be using an int instead, if you want to be able to assume that the enumerated values are consecutive?

Related

Logical negation and assignment operator in C/C++?

Is there a more concise way to write the second line of this function without repeating a reduntantly?
void myfunc()
{
bool a = false; // Assume that this won't alway be hardcoded to false.
a = !a;
}
Is there a more concise way to write the second line of this function without repeating a reduntantly?
! is an unary operator, you cannot associate it with = if this is what you expected.
But you can do something like a ^= true;, this is not a more concise way but this is without repeating a redundantly. Anyway this is not a good way to do contrarily to a = !a; which is much more readable
Well, I don't really see the value of doing so, but you could use xor or a simple decrement. Both of these work.
a ^= 1;
a--;
Then you won't have to repeat a.
But if you want it to be very clear, consider using a function:
void flip(bool *b)
{
(*b)--;
}
bool b = true;
flip(&b);
In C++, you can use references
void flip(bool &b)
{
b--;
}
bool b = true;
flip(b);
Or write a macro. Actually, macros are pretty handy for solving duplication problems. Although, I almost never use them for that, since it's simply rarely worth the effort. I wrote one macro to avoid duplication for malloc calls. Such a call typically look like this:
int *x = malloc(12 * sizeof *x);
You can avoid the duplication with this:
#define ALLOC(p, n) \
((p) = malloc((n) * sizeof *(p)))
But even this is something I hesitate to use.
To be honest, it's not really a problem worth solving. :)
The only possible answer is: unfortunately not.
The unary operator ! is actually concise enough, and any other trick would lead to unreadable code.
Any other shortcut form, for example for binary operators such as + or *:
a += 5;
b *= 7;
mantain the reference to the original meaning of theoperator: multiplication and addition respectively.
In your sentence:
without repeating a reduntantly
there's the wrong assumption that a is redundant in case the compiler could be instructed to negate a without repeating the variable name. No: it tells to people reading the code (even to yourself, for example an year later!) that the variable to be negated is a. And since C grammar doesn't define any syntactic sugar for logical negation operator, a is not redundant at all.

Can I check a small array of bools in one go?

There was a similar question here, but the user in that question seemed to have a much larger array, or vector. If I have:
bool boolArray[4];
And I want to check if all elements are false, I can check [ 0 ], [ 1 ] , [ 2 ] and [ 3 ] either separately, or I can loop through it. Since (as far as I know) false should have value 0 and anything other than 0 is true, I thought about simply doing:
if ( *(int*) boolArray) { }
This works, but I realize that it relies on bool being one byte and int being four bytes. If I cast to (std::uint32_t) would it be OK, or is it still a bad idea? I just happen to have 3 or 4 bools in an array and was wondering if this is safe, and if not if there is a better way to do it.
Also, in the case I end up with more than 4 bools but less than 8 can I do the same thing with a std::uint64_t or unsigned long long or something?
As πάντα ῥεῖ noticed in comments, std::bitset is probably the best way to deal with that in UB-free manner.
std::bitset<4> boolArray {};
if(boolArray.any()) {
//do the thing
}
If you want to stick to arrays, you could use std::any_of, but this requires (possibly peculiar to the readers) usage of functor which just returns its argument:
bool boolArray[4];
if(std::any_of(std::begin(boolArray), std::end(boolArray), [](bool b){return b;}) {
//do the thing
}
Type-punning 4 bools to int might be a bad idea - you cannot be sure of the size of each of the types. It probably will work on most architectures, but std::bitset is guaranteed to work everywhere, under any circumstances.
Several answers have already explained good alternatives, particularly std::bitset and std::any_of(). I am writing separately to point out that, unless you know something we don't, it is not safe to type pun between bool and int in this fashion, for several reasons:
int might not be four bytes, as multiple answers have pointed out.
M.M points out in the comments that bool might not be one byte. I'm not aware of any real-world architectures in which this has ever been the case, but it is nevertheless spec-legal. It (probably) can't be smaller than a byte unless the compiler is doing some very elaborate hide-the-ball chicanery with its memory model, and a multi-byte bool seems rather useless. Note however that a byte need not be 8 bits in the first place.
int can have trap representations. That is, it is legal for certain bit patterns to cause undefined behavior when they are cast to int. This is rare on modern architectures, but might arise on (for example) ia64, or any system with signed zeros.
Regardless of whether you have to worry about any of the above, your code violates the strict aliasing rule, so compilers are free to "optimize" it under the assumption that the bools and the int are entirely separate objects with non-overlapping lifetimes. For example, the compiler might decide that the code which initializes the bool array is a dead store and eliminate it, because the bools "must have" ceased to exist* at some point before you dereferenced the pointer. More complicated situations can also arise relating to register reuse and load/store reordering. All of these infelicities are expressly permitted by the C++ standard, which says the behavior is undefined when you engage in this kind of type punning.
You should use one of the alternative solutions provided by the other answers.
* It is legal (with some qualifications, particularly regarding alignment) to reuse the memory pointed to by boolArray by casting it to int and storing an integer, although if you actually want to do this, you must then pass boolArray through std::launder if you want to read the resulting int later. Regardless, the compiler is entitled to assume that you have done this once it sees the read, even if you don't call launder.
You can use std::bitset<N>::any:
Any returns true if any of the bits are set to true, otherwise false.
#include <iostream>
#include <bitset>
int main ()
{
std::bitset<4> foo;
// modify foo here
if (foo.any())
std::cout << foo << " has " << foo.count() << " bits set.\n";
else
std::cout << foo << " has no bits set.\n";
return 0;
}
Live
If you want to return true if all or none of the bits set to on, you can use std::bitset<N>::all or std::bitset<N>::none respectively.
The standard library has what you need in the form of the std::all_of, std::any_of, std::none_of algorithms.
...And for the obligatory "roll your own" answer, we can provide a simple "or"-like function for any array bool[N], like so:
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
for (bool b : bs) {
if (b) { return b; }
}
return false;
}
Or more concisely,
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
for (bool b : bs) { if (b) { return b; } }
return false;
}
This also has the benefit of both short-circuiting like ||, and being optimised out entirely if calculable at compile time.
Apart from that, if you want to examine the original idea of type-punning bool[N] to some other type to simplify observation, I would very much recommend that you don't do that view it as char[N2] instead, where N2 == (sizeof(bool) * N). This would allow you to provide a simple representation viewer that can automatically scale to the viewed object's actual size, allow iteration over its individual bytes, and allow you to more easily determine whether the representation matches specific values (such as, e.g., zero or non-zero). I'm not entirely sure off the top of my head whether such examination would invoke any UB, but I can say for certain that any such type's construction cannot be a viable constant-expression, due to requiring a reinterpret cast to char* or unsigned char* or similar (either explicitly, or in std::memcpy()), and thus couldn't as easily be optimised out.

Using a size_t as limiter for a "for loop"

I'm using a C++ app on my s6 named CppDroid to make a quick program.
How do you use a size_t as limiter for a "for loop" on it's counter?
int c;
//... more codes here...
for (c=0; c < a.used; ++c)
//... more codes here...
The a.used is a number of used array that came from a solution to make a dynamic sized array
The error is: comparison of integer of different signs: 'int' and 'size_t' (aka unsigned int)
The for loop is one of an internal nested loops of the program so I want to maintain variable c as an "int" as much as possible.
I've seen examples about Comparing int with size_t but I'm not sure how it can help since it is for an "if" condition.
Just use std::size_t c instead of int c.
As long as a.used doesn't change during the iteration, a common idiom is:
for(int c=0, n=a.used; c<n; ++c) {
...
}
In this way, the cast happens implicitly, and you also have the "total number of elements" variable n handy in the loop body. Also, when n comes from a methods call (say, vec.size()) you evaluate it just once, which is slightly more efficient.1
1. In theory the compiler may do this optimization by itself, but with "complicated" stuff like std::vector and a nontrivial loop body it's surprisingly difficult to prove that it's a loop invariant, so often it's just recalculated at each iteration.
Regarding
” comparison of integer of different signs: 'int' and 'size_t' (aka unsigned int)
… this is a warning. It's not an error that prevents creation of an executable, unless you've asked the compiler to treat warnings as errors.
A very direct way to address it is to use a cast, int(a.size).
More generally I recommend defining a common function to do that, e.g. named n_items (C++17 will have a size function, unfortunately with unsigned result and conflating two or more logical functions, so that name's taken):
using My_array = ...; // Whatever
using Size = ptrdiff_t;
auto n_items( My_array const& a )
-> Size
{ return a.used; }
then for your loop:
for( int c = 0; c < n_items( a ); ++c )
By the way it's generally Not A Good Idea™ to reuse a variable, like c here. I'm assuming that that reuse was unintentional. The example above shows how to declare the loop variable in the for loop head.
Also, as Matteo Italia notes in his answer, it can sometimes be a good idea to manually optimize a loop like this, if measuring shows it to be a bottleneck. That's because the compiler can't easily prove that the result of the n_items call, or any other dynamic array size expression, is the same (is “invariant”) in all executions of the loop body.
Thus, if measuring tells you that the possibly repeated size expression evaluations are a bottleneck, you can do e.g.
for( int c = 0, n = n_items( a ); c < n; ++c )
It's worth noting that any manual optimization carries costs, which are not easy to measure, but which are severe enough that the usual advice to is to defer optimization until measurements tell you that it's really needed.

Why are C macros not type-safe?

If have encountered this claim multiple times and can't figure out what it is supposed to mean. Since the resulting code is compiled using a regular C compiler it will end up being type checked just as much (or little) as any other code.
So why are macros not type safe? It seems to be one of the major reasons why they should be considered evil.
Consider the typical "max" macro, versus function:
#define MAX(a,b) a < b ? a : b
int max(int a, int b) {return a < b ? a : b;}
Here's what people mean when they say the macro is not type-safe in the way the function is:
If a caller of the function writes
char *foo = max("abc","def");
the compiler will warn.
Whereas, if a caller of the macro writes:
char *foo = MAX("abc", "def");
the preprocessor will replace that with:
char *foo = "abc" < "def" ? "abc" : "def";
which will compile with no problems, but almost certainly not give the result you wanted.
Additionally of course the side effects are different, consider the function case:
int x = 1, y = 2;
int a = max(x++,y++);
the max() function will operate on the original values of x and y and the post-increments will take effect after the function returns.
In the macro case:
int x = 1, y = 2;
int b = MAX(x++,y++);
that second line is preprocessed to give:
int b = x++ < y++ ? x++ : y++;
Again, no compiler warnings or errors but will not be the behaviour you expected.
Macros aren't type safe because they don't understand types.
You can't tell a macro to only take integers. The preprocessor recognises a macro usage and it replaces one sequence of tokens (the macro with its arguments) with another set of tokens. This is a powerful facility if used correctly, but it's easy to use incorrectly.
With a function you can define a function void f(int, int) and the compiler will flag if you try to use the return value of f or pass it strings.
With a macro - no chance. The only checks that get made are it is given the correct number of arguments. then it replaces the tokens appropriately and passes onto the compiler.
#define F(A, B)
will allow you to call F(1, 2), or F("A", 2) or F(1, (2, 3, 4)) or ...
You might get an error from the compiler, or you might not, if something within the macro requires some sort of type safety. But that's not down to the preprocessor.
You can get some very odd results when passing strings to macros that expect numbers, as the chances are you'll end up using string addresses as numbers without a squeak from the compiler.
Well they're not directly type-safe... I suppose in certain scenarios/usages you could argue they can be indirectly (i.e. resulting code) type-safe. But you could certainly create a macro intended for integers and pass it strings... the pre-processor handling the macros certainly doesn't care. The compiler may choke on it, depending on usage...
Since macros are handled by the preprocessor, and the preprocessor doesn't understand types, it will happily accept variables that are of the wrong type.
This is usually only a concern for function-like macros, and any type errors will often be caught by the compiler even if the preprocessor doesn't, but this isn't guaranteed.
An example
In the Windows API, if you wanted to show a balloon tip on an edit control, you'd use Edit_ShowBalloonTip. Edit_ShowBalloonTip is defined as taking two parameters: the handle to the edit control and a pointer to an EDITBALLOONTIP structure. However, Edit_ShowBalloonTip(hwnd, peditballoontip); is actually a macro that evaluates to
SendMessage(hwnd, EM_SHOWBALLOONTIP, 0, (LPARAM)(peditballoontip));
Since configuring controls is generally done by sending messages to them, Edit_ShowBalloonTip has to do a typecast in its implementation, but since it's a macro rather than an inline function, it can't do any type checking in its peditballoontip parameter.
A digression
Interestingly enough, sometimes C++ inline functions are a bit too type-safe. Consider the standard C MAX macro
#define MAX(a, b) ((a) > (b) ? (a) : (b))
and its C++ inline version
template<typename T>
inline T max(T a, T b) { return a > b ? a : b; }
MAX(1, 2u) will work as expected, but max(1, 2u) will not. (Since 1 and 2u are different types, max can't be instantiated on both of them.)
This isn't really an argument for using macros in most cases (they're still evil), but it's an interesting result of C and C++'s type safety.
There are situations where macros are even less type-safe than functions. E.g.
void printlog(int iter, double obj)
{
printf("%.3f at iteration %d\n", obj, iteration);
}
Calling this with the arguments reversed will cause truncation and erroneous results, but nothing dangerous. By contrast,
#define PRINTLOG(iter, obj) printf("%.3f at iteration %d\n", obj, iter)
causes undefined behavior. To be fair, GCC warns about the latter, but not about the former, but that's because it knows printf -- for other varargs functions, the results are potentially disastrous.
When the macro runs, it just does a text match through your source files. This is before any compilation, so it is not aware of the datatypes of anything it changes.
Macros aren't type safe, because they were never meant to be type safe.
The compiler does the type checking after macros had been expanded.
Macros and there expansion are meant as a helper to the ("lazy") author (in the sense of writer/reader) of C source code. That's all.

Is it good practice to use the comma operator?

I've recently (only on SO actually) run into uses of the C/C++ comma operator. From what I can tell, it creates a sequence point on the line between the left and right hand side operators so that you have a predictable (defined) order of evaluation.
I'm a little confused about why this would be provided in the language as it seems like a patch that can be applied to code that shouldn't work in the first place. I find it hard to imagine a place it could be used that wasn't overly complex (and in need of refactoring).
Can someone explain the purpose of this language feature and where it may be used in real code (within reason), if ever?
It can be useful in the condition of while() loops:
while (update_thing(&foo), foo != 0) {
/* ... */
}
This avoids having to duplicate the update_thing() line while still maintaining the exit condition within the while() controlling expression, where you expect to find it. It also plays nicely with continue;.
It's also useful in writing complex macros that evaluate to a value.
The comma operator just separates expressions, so you can do multiple things instead of just one where only a single expression is required. It lets you do things like
(x) (y)
for (int i = 0, j = 0; ...; ++i, ++j)
Note that x is not the comma operator but y is.
You really don't have to think about it. It has some more arcane uses, but I don't believe they're ever absolutely necessary, so they're just curiosities.
Within for loop constructs it can make sense. Though I generally find them harder to read in this instance.
It's also really handy for angering your coworkers and people on SO.
bool guess() {
return true, false;
}
Playing Devil's Advocate, it might be reasonable to reverse the question:
Is it good practice to always use the semi-colon terminator?
Some points:
Replacing most semi-colons with commas would immediately make the structure of most C and C++ code clearer, and would eliminate some common errors.
This is more in the flavor of functional programming as opposed to imperative.
Javascript's 'automatic semicolon insertion' is one of its controversial syntactic features.
Whether this practice would increase 'common errors' is unknown, because nobody does this.
But of course if you did do this, you would likely annoy your fellow programmers, and become a pariah on SO.
Edit: See AndreyT's excellent 2009 answer to Uses of C comma operator. And Joel 2008 also talks a bit about the two parallel syntactic categories in C#/C/C++.
As a simple example, the structure of while (foo) a, b, c; is clear, but while (foo) a; b; c; is misleading in the absence of indentation or braces, or both.
Edit #2: As AndreyT states:
[The] C language (as well as C++) is historically a mix of two completely different programming styles, which one can refer to as "statement programming" and "expression programming".
But his assertion that "in practice statement programming produces much more readable code" [emphasis added] is patently false. Using his example, in your opinion, which of the following two lines is more readable?
a = rand(), ++a, b = rand(), c = a + b / 2, d = a < c - 5 ? a : b;
a = rand(); ++a; b = rand(); c = a + b / 2; if (a < c - 5) d = a; else d = b;
Answer: They are both unreadable. It is the white space which gives the readability--hurray for Python!. The first is shorter. But the semi-colon version does have more pixels of black space, or green space if you have a Hazeltine terminal--which may be the real issue here?
Everyone is saying that it is often used in a for loop, and that's true. However, I find it's more useful in the condition statement of the for loop. For example:
for (int x; x=get_x(), x!=sentinel; )
{
// use x
}
Rewriting this without the comma operator would require doing at least one of a few things that I'm not entirely comfortable with, such as declaring x outside the scope where it's used, or special casing the first call to get_x().
I'm also plotting ways I can utilize it with C++11 constexpr functions, since I guess they can only consist of single statements.
I think the only common example is the for loop:
for (int i = 0, j = 3; i < 10 ; ++i, ++j)
As mentioned in the c-faq:
Once in a while, you find yourself in a situation in which C expects a
single expression, but you have two things you want to say. The most
common (and in fact the only common) example is in a for loop,
specifically the first and third controlling expressions.
The only reasonable use I can think of is in the for construct
for (int count=0, bit=1; count<10; count=count+1, bit=bit<<1)
{
...
}
as it allows increment of multiple variables at the same time, still keeping the for construct structure (easy to read and understand for a trained eye).
In other cases I agree it's sort of a bad hack...
I also use the comma operator to glue together related operations:
void superclass::insert(item i) {
add(i), numInQ++, numLeft--;
}
The comma operator is useful for putting sequence in places where you can't insert a block of code. As pointed out this is handy in writing compact and readable loops. Additionally, it is useful in macro definitions. The following macro increments the number of warnings and if a boolean variable is set will also show the warning.
#define WARN if (++nwarnings, show_warnings) std::cerr
So that you may write (example 1):
if (warning_condition)
WARN << "some warning message.\n";
The comma operator is effectively a poor mans lambda function.
Though posted a few months after C++11 was ratified, I don't see any answers here pertaining to constexpr functions. This answer to a not-entirely-related question references a discussion on the comma operator and its usefulness in constant expressions, where the new constexpr keyword was mentioned specifically.
While C++14 did relax some of the restrictions on constexpr functions, it's still useful to note that the comma operator can grant you predictably ordered operations within a constexpr function, such as (from the aforementioned discussion):
template<typename T>
constexpr T my_array<T>::at(size_type n)
{
return (n < size() || throw "n too large"), (*this)[n];
}
Or even something like:
constexpr MyConstexprObject& operator+=(int value)
{
return (m_value += value), *this;
}
Whether this is useful is entirely up to the implementation, but these are just two quick examples of how the comma operator might be applied in a constexpr function.