Check if variable is equal to any one of these values [duplicate] - c++

This question already has answers here:
Most efficient way to compare a variable to multiple values?
(7 answers)
Closed 2 years ago.
In c++, is there a way to see if a variable is equal to one of the values?
Right now I have to do
if (fExt == "zip" || fExt == "7z" || fExt == "gz" || fExt == "tar")
{
//do something
}
However, is there a more efficient way?

Example with a set:
if (std::set<std::string>{"zip", "7z", "gz", "tar"}.count(fExt)) {
std::cerr << "yes" << std::endl;
}
Note that std::set::count() returns 0 or 1 and provides effectively the same functionality as std::set::contains(), which was unfortunately only introduced in C++20.
This is not too bad, as it may skip a few comparisons. It will still not be more efficient, given the extra work to set up and tear down the set.
But if your code is called more often, it gets better by re-using the set:
static const std::set<std::string> extensions{"zip", "7z", "gz", "tar"};
if (extensions.count(fExt)) {
std::cerr << "yes" << std::endl;
}
It is debatable whether this will be more efficient than doing the four string comparisons, but it will probably not be worse either, and it might be easier to maintain.
You could also use an unordered_set, but for a low number of elements like in your case, it will do more unnecessary computation.

If you can use C++17, I would suggest a fold-expression:
template<typename T, typename ...Opts>
bool any_of(T val, Opts ...opts)
{
return (... || (val == opts));
}
which you can then use like this:
if (any_of(fExt, "zip", "7z", "gz", "tar"))
{
// ...
}
You should put this function into your own namespace to avoid any possible confusion with std::any_of from the <algorithm> header.
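If C++17 is not available, a similar helper can be sketched with std::initializer_list. This is only a minimal sketch and not part of the answer above: the namespace util and the name is_any_of are made up for illustration, and all candidates in one call must have the same type.

```cpp
#include <initializer_list>
#include <string>

namespace util {  // hypothetical namespace, avoids confusion with std::any_of

// Returns true if val compares equal to at least one of the candidates.
// T and U may differ, e.g. std::string compared against string literals.
template <typename T, typename U>
bool is_any_of(const T& val, std::initializer_list<U> candidates)
{
    for (const U& candidate : candidates) {
        if (val == candidate) return true;
    }
    return false;
}

}  // namespace util
```

It would be called as util::is_any_of(fExt, {"zip", "7z", "gz", "tar"}); the extra braces are the price for not needing a parameter pack.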


Is it defined behaviour to assign to function call in or in if (C++17)

During a codebase refactor I found code like this:
void myFunction (std::map<int, int> my_map)
{
int linked_element;
if (my_map[linked_element = firstIndex] != 0
|| my_map[linked_element = secondIndex] != 0)
{
// do some stuff with linked_element
}
}
Or
void myFunction (std::set<int> my_set)
{
int linked_element;
if (my_set.find(linked_element = firstIndex) != my_set.end()
|| my_set.find(linked_element = secondIndex) != my_set.end())
{
// do some stuff with linked_element
}
}
From what I understood the aim of that was to avoid checking 2 times (first when entering in the if, second when assigning the variable).
I can understand that depending on which side of the || is true linked_element will be assigned to the right value but this still feels kind of bad to me.
Is this kind of behaviour defined?
This behavior is well defined by the order of evaluation.
First, the linked_element = firstIndex assignment happens. This expression returns the value of firstIndex, that is then used as an argument for the subscript operator on my_map (i.e., my_map[linked_element = firstIndex]). The return value from that expression is checked against the != 0 condition. If it's true, the other side of the || operator is not evaluated due to short-circuit logic. If it's false, the same story happens on the other side of the operator.
Whether or not it's a good practice to write code in such a style is a different question though. Personally speaking, I'd prioritize readability and maintainability over this micro-optimization unless it's a super-critical piece of the program, but it's a matter of opinion, I guess.
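The evaluation order described above can be checked with a small sketch. The function name first_present and the -1 "not found" value are assumptions made for this illustration; they do not appear in the question.

```cpp
#include <set>

// Returns whichever of the two indices is found first, or -1 if neither is present.
// Each assignment is evaluated before the find() call that uses its value, and the
// right-hand operand of || is only evaluated when the left-hand one is false.
int first_present(const std::set<int>& s, int firstIndex, int secondIndex)
{
    int linked_element = -1;
    if (s.find(linked_element = firstIndex) != s.end()
        || s.find(linked_element = secondIndex) != s.end())
    {
        return linked_element;
    }
    return -1;
}
```

When the first lookup succeeds, linked_element keeps firstIndex because the second assignment never runs.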
In the original code the behavior is well defined, since operator || evaluates its first argument and evaluates the second argument only if the first evaluates to false.
BUT: the assignment there is confusing, and many (probably all) static analysis tools will complain about it. So I would refactor this code as follows, so that it requires less brain power to read:
void doSomeStuff(const std::set<int>& my_set, int linked_element)
{
.....
}
void myFunction (const std::set<int>& my_set)
{
if (my_set.find(firstIndex) != my_set.end())
{
doSomeStuff(my_set, firstIndex);
} else if (my_set.find(secondIndex) != my_set.end()) {
doSomeStuff(my_set, secondIndex);
}
}
The fact that you had to ask a question about this code proves that the original version is bad from a maintainer's point of view. Code that requires a lot of focus to understand is costly to maintain.
BTW this fragment of code:
if (my_map[linked_element = firstIndex] != 0
looks suspicious. I have even more suspicions after seeing the set version.
It looks like someone did not understand how operator[] works for maps. If no value exists for the key, a default-constructed value is inserted into the map. So checking for the default value 0 seems like an attempt to address this issue. Possibly my_map.count(firstIndex) should be used instead.
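The side effect of operator[] can be made visible with a short sketch. The two helper names below are hypothetical and exist only to compare how the map size changes after each kind of lookup.

```cpp
#include <cstddef>
#include <map>

// Returns how many elements a lookup with operator[] adds to the map.
std::size_t growth_after_subscript(std::map<int, int>& m, int key)
{
    const std::size_t before = m.size();
    if (m[key] != 0) { /* key maps to a non-zero value */ }
    return m.size() - before;
}

// Returns how many elements a lookup with count() adds to the map.
// count() is a const member function, so it cannot insert anything.
std::size_t growth_after_count(const std::map<int, int>& m, int key)
{
    const std::size_t before = m.size();
    if (m.count(key) != 0) { /* key is present */ }
    return m.size() - before;
}
```

Note that operator[] also needs a non-const map, which is one more hint that it is not a pure query.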
An alternate version, assuming firstIndex and secondIndex are literal values (like 2 and 7), or are otherwise known relative to some invalid third index value:
void myFunction (std::set<int> & my_set)
{
int linked_element =
my_set.contains (firstIndex) ? firstIndex :
my_set.contains (secondIndex) ? secondIndex :
thirdIndex;
if (linked_element != thirdIndex)
{
// do some stuff with linked_element
}
}
If the indices are not known then a std::optional<int> can step in here too.
If pre-C++20, replace .contains() with .count().
Bigger concerns with the original code are:
the pass-by-value of a potentially large container (never assume COW)
map[index] silently adds the index to the map if not present
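Coming back to the std::optional<int> suggestion in this answer, one possible shape is sketched below. The function name find_linked is an assumption for illustration, and the sketch needs C++17 for <optional>; it uses count() so that C++20 is not required.

```cpp
#include <optional>
#include <set>

// Returns the first index that is present in the set, or std::nullopt if neither is.
// The set is taken by const reference, so nothing is copied.
std::optional<int> find_linked(const std::set<int>& my_set, int firstIndex, int secondIndex)
{
    if (my_set.count(firstIndex))  return firstIndex;
    if (my_set.count(secondIndex)) return secondIndex;
    return std::nullopt;
}
```

The caller can then write if (auto linked_element = find_linked(my_set, firstIndex, secondIndex)) and use *linked_element inside the branch, with no sentinel third index needed.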

C++ - if statement simplification [duplicate]

Possible Duplicate:
How can I check whether multiple variables are equal to the same value?
Is there a way to write this:
if ((var1==var2) && (var2==var3) && (var3==var4) ...)
into something like this
if (var1==var2==var3==var4 ...)
?
In C++11, you could write a set of functions like this:
template<typename T>
bool all_equal(T const &)
{
return true;
}
template<typename T, typename U, typename... Args>
bool all_equal(T const & a, U const & b, Args const&... c)
{
return a==b && all_equal(b,c...);
}
int main()
{
std::cout << all_equal(1,2,3) << '\n';
std::cout << all_equal(1,1,1) << '\n';
}
Edit: I guess Steve Jessop had this same idea on the linked duplicate here
Not in a way that's clearer than that, no. You can insert the values into a set, for example, and check whether its size == 1, but what you have now is the way to go.
Essentially, no.
If you have a collection, rather than just sporadic variables, it is possible to apply algorithms to check if they are all equal, which are O(N) if they are all indeed equal (as your long statement is) and will break immediately when it finds one that is not.
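For the collection case, one way to do this with a standard algorithm is std::adjacent_find, sketched below for a std::vector. The name all_elements_equal is made up here; other algorithms such as std::all_of would work as well.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// True if no two neighbouring elements differ, i.e. all elements are equal.
// std::adjacent_find stops at the first adjacent pair for which the predicate holds,
// so the scan ends early as soon as a mismatch is found.
template <typename T>
bool all_elements_equal(const std::vector<T>& v)
{
    return std::adjacent_find(v.begin(), v.end(), std::not_equal_to<T>()) == v.end();
}
```

An empty or single-element vector counts as "all equal" in this sketch.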

Memoizing a function with two inputs in C++

I have a function, f(a,b), that accepts two inputs. I do not know ahead of time which values of a and b will be used. I'm okay with being a little wasteful on memory (I care about speed). I want to be able to check if the output of f(a,b) has already been delivered, and if so, deliver that output again without re-running through the f(a,b) process.
Trivially easy to do in Python with decorators, but C++ is way over my head here.
I would use a std::map (or maybe an std::unordered_map) whose key is a std::pair, or perhaps use a map of maps.
C++11 improvements are probably helpful in that case. Or maybe some Boost thing.
The poster asks:
I want to be able to check if the output of f(a,b) has already been delivered, and if so, deliver that output again without re-running through the f(a,b) process.
It's pretty easy in C++ using a std::map. The fact that the function has exactly two parameters means that we can use std::pair to describe them.
#include <cstdint>
#include <iostream>
#include <map>
#include <utility>
uint64_t real_f(int a, int b) {
std::cout << "*";
// Do something tough:
return (uint64_t)a*b;
}
uint64_t memo_f(int a, int b) {
typedef std::pair<int, int> key;
typedef std::map<key, uint64_t> map;
static map m;
key k(a,b);
map::iterator it = m.find(k);
if(it == m.end()) {
return m[k] = real_f(a, b);
}
return it->second;
}
int main () {
std::cout << memo_f(1, 2) << "\n";
std::cout << memo_f(3, 4) << "\n";
std::cout << memo_f(1, 2) << "\n";
std::cout << memo_f(3, 4) << "\n";
std::cout << memo_f(5, 6) << "\n";
}
The output of the above program is:
*2
*12
2
12
*30
The lines without asterisks represent cached results.
With C++11, you could use tasks and futures. Let f be your function:
int f(int a, int b)
{
// Do hard work.
}
Then you would schedule the function execution, which returns you a handle to the return value. This handle is called a future:
template <typename F>
std::future<typename std::result_of<F()>::type>
schedule(F f)
{
typedef typename std::result_of<F()>::type result_type;
// packaged_task takes a function signature, not just the result type.
std::packaged_task<result_type()> task(std::move(f));
auto future = task.get_future();
tasks_.push_back(std::move(task)); // Queue the task, execute later.
return future; // Local variables are moved implicitly on return.
}
Then, you could use this mechanism as follows:
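auto future = schedule(std::bind(&f, 42, 43)); // Via std::bind.
// Lambda alternative (note the return, so the result reaches the future):
// auto future = schedule([] { return f(42, 43); });
if (future.valid()) // std::future has no has_value(); valid() checks for a shared state.
{
auto x = future.get(); // Blocks if the result of f(a,b) is not yet available.
g(x);
}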
Disclaimer: my compiler does not support tasks/futures, so the code may have some rough edges.
The main point of this question is the relative cost in CPU and RAM of calculating f(a,b) versus keeping some sort of lookup table to cache results.
Since an exhaustive table of 128 bits index length is not (yet) feasible, we need to reduce the lookup space to a manageable size - this can't be done without some considerations inside your app:
How big is the really used space of function inputs? Is there a pattern in it?
What about the temporal component? Do you expect repeated calculations to be close to one another or distributed along the timeline?
What about the distribution? Do you assume a tiny part of the index space to consume the majority of function calls?
I would simply start with a fixed-size array of (a,b, f(a,b)) tuples and a linear search. Depending on your pattern as asked above, you might want to
window-slide it (drop oldest on a cache miss): This is good for localized recurrences
have (a,b,f(a,b),count) tuples with the tuple with the smallest count being expelled - this is good for non-localized occurrences
have some key-function determine a position in the cache (this is good for tiny index space usage)
whatever else Knuth or Google might have thought of
You might also want to benchmark repeated calculation against the lookup mechanism, if the latter becomes more and more complex: std::map and friends don't come for free, even if they are high-quality implementations.
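The fixed-size array with linear search and the "drop oldest on a cache miss" policy could be sketched as follows. Everything named here (slow_f, cached_f, Entry, kCacheSize and the size of 4) is an assumption for illustration, and the sketch is not thread-safe.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical expensive function standing in for the real f(a, b).
// calls counts how often it actually runs.
int calls = 0;
std::uint64_t slow_f(int a, int b)
{
    ++calls;
    return static_cast<std::uint64_t>(a) * b;
}

struct Entry { int a; int b; std::uint64_t value; bool used; };

const std::size_t kCacheSize = 4;        // arbitrary; tune it by benchmarking
std::array<Entry, kCacheSize> cache{};   // zero-initialized, so used == false everywhere
std::size_t next_slot = 0;               // slot that will be overwritten next

std::uint64_t cached_f(int a, int b)
{
    // Linear search over the small table.
    for (const Entry& e : cache) {
        if (e.used && e.a == a && e.b == b) return e.value;
    }
    // Cache miss: compute, then overwrite the oldest slot (sliding window).
    const std::uint64_t value = slow_f(a, b);
    cache[next_slot] = Entry{a, b, value, true};
    next_slot = (next_slot + 1) % kCacheSize;
    return value;
}
```

The count-based eviction variant would add a hit counter to Entry and replace the slot with the smallest count instead of next_slot.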
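The easiest way is to use std::map. std::unordered_map does not work out of the box, because the standard library provides no std::hash specialization for std::pair, so you would have to supply your own hash. You can do the following:
#include <map>
#include <utility>
std::map<std::pair<int, int>, int> mp;
int func(int a, int b)
{
auto it = mp.find({a, b});
if (it != mp.end()) return it->second; // already computed
// compute f(a, b)...
int result = /* computed value */;
mp[{a, b}] = result;
return result;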
}
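For reference, std::unordered_map can be keyed on std::pair once a hash functor is supplied as the third template argument. The sketch below assumes a hand-written PairHash with a simple illustrative mixing step (not a tuned hash), and uses a stand-in function sum_f in place of the real f(a, b).

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical hash functor for std::pair<int, int>.
struct PairHash {
    std::size_t operator()(const std::pair<int, int>& p) const
    {
        const std::size_t h1 = std::hash<int>()(p.first);
        const std::size_t h2 = std::hash<int>()(p.second);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

int compute_count = 0;  // counts how often the real computation runs

// Stand-in for the real f(a, b).
int sum_f(int a, int b)
{
    ++compute_count;
    return a + b;
}

int memo_sum(int a, int b)
{
    static std::unordered_map<std::pair<int, int>, int, PairHash> cache;
    const std::pair<int, int> key(a, b);
    auto it = cache.find(key);
    if (it != cache.end()) return it->second;  // cache hit, no recomputation
    const int result = sum_f(a, b);
    cache.emplace(key, result);
    return result;
}
```

Whether this beats std::map depends on the key distribution and table size, so it is worth measuring both.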

Why use std::for_each over a for loop? [duplicate]

Possible Duplicate:
Advantages of std::for_each over for loop
So I was playing around with some C++11 features and I'm curious as to why std::for_each is beneficial. Wouldn't it be easier and look cleaner to do a for loop or is it because I'm so used to doing it this way?
#include <iostream>
#include <tuple>
#include <vector>
#include <algorithm>
typedef std::tuple<int, int> pow_tuple;
pow_tuple pow(int x)
{
return std::make_tuple(x, x*x);
}
void print_values(pow_tuple values)
{
std::cout << std::get<0>(values) << "\t" << std::get<1>(values) << std::endl;
}
int main(int argc, char** argv)
{
std::vector<int> numbers;
for (int i=1; i < 10; i++)
numbers.push_back(i);
std::for_each(numbers.begin(), numbers.end(),
[](int x) { print_values(pow(x)); }
);
std::cout << "Using auto keyword:" << std::endl;
auto values = pow(20);
print_values(values);
return 0;
}
The standard algorithms handle all of the looping issues correctly, reducing the chance of making off-by-one errors and such, allowing you to focus on the calculation and not the looping.
It depends somewhat on the local coding conventions, but there are two potential advantages. The first is that it states clearly that the code iterates over all of the elements in the sequence; unless the local coding conventions say otherwise (and they are enforced), you have to consider that some cowboy programmer might have inserted a break. The second is that it names the operation you are performing on each element; this one can easily be handled by calling a function in the loop, and of course, really trivial operations may not need a name.
There's also the advantage, at least if you aren't yet using C++11, that you don't have to spell out the iterator types; the spelled out iterator types create a lot of verbiage, in which the important logic can get lost or overlooked.
One could say that this form allows you to write this piece of code without an unnecessary index to manipulate and make mistakes with.
It is also an idiom which exists in other languages, and since you are getting anonymous functions, this feature can be a good example of higher-order functions (educational purpose?).
I agree that it does not feel like C++ ...
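To make the comparison in these answers concrete, here is the same summation written three ways. The function names are invented for this sketch; which form reads best is largely a matter of local convention.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Index-based loop: the bounds and the index are written by hand.
int sum_indexed(const std::vector<int>& v)
{
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}

// std::for_each: the algorithm owns the iteration; the lambda only says
// what happens to each element.
int sum_for_each(const std::vector<int>& v)
{
    int total = 0;
    std::for_each(v.begin(), v.end(), [&total](int x) { total += x; });
    return total;
}

// Range-based for (C++11): also no index or iterator type to spell out.
int sum_range_for(const std::vector<int>& v)
{
    int total = 0;
    for (int x : v)
        total += x;
    return total;
}
```

Only the first version gives the author a chance to get the loop bounds wrong; only the loop versions allow a break in the middle.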

Lambda Expression vs Functor in C++

I wonder where we should use a lambda expression over a functor in C++. To me, these two techniques are basically the same, and a functor is even more elegant and cleaner than a lambda. For example, if I want to reuse my predicate, I have to copy the lambda part over and over. So when does a lambda really come into play?
A lambda expression creates a nameless functor; it's syntactic sugar.
So you mainly use it if it makes your code look better. That generally would occur if either (a) you aren't going to reuse the functor, or (b) you are going to reuse it, but from code so totally unrelated to the current code that in order to share it you'd basically end up creating my_favourite_two_line_functors.h, and have disparate files depend on it.
Pretty much the same conditions under which you would type any line(s) of code, and not abstract that code block into a function.
That said, with range-for statements in C++0x, there are some places where you would have used a functor before where it might well make your code look better now to write the code as a loop body, not a functor or a lambda.
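The "syntactic sugar" point can be illustrated by writing the same predicate both ways. This is a rough sketch: GreaterThan and the two counting functions are hypothetical names, and the class only approximates what a compiler generates for the lambda.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hand-written functor: roughly the shape of the closure type behind the lambda below.
struct GreaterThan {
    int limit;  // plays the role of the captured variable
    explicit GreaterThan(int l) : limit(l) {}
    bool operator()(int x) const { return x > limit; }
};

std::ptrdiff_t count_with_functor(const std::vector<int>& v, int limit)
{
    return std::count_if(v.begin(), v.end(), GreaterThan(limit));
}

std::ptrdiff_t count_with_lambda(const std::vector<int>& v, int limit)
{
    // The capture [limit] replaces the member variable and the constructor.
    return std::count_if(v.begin(), v.end(), [limit](int x) { return x > limit; });
}
```

If GreaterThan were needed in several unrelated places, the named functor would start to pay for its extra lines.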
1) It's trivial and trying to share it is more work than benefit.
2) Defining a functor simply adds complexity (due to having to make a bunch of member variables and crap).
If neither of those things is true then maybe you should think about defining a functor.
Edit: it seems to be that you need an example of when it would be nice to use a lambda over a functor. Here you go:
typedef std::vector< std::pair<int,std::string> > whatsit_t;
int find_it(std::string value, whatsit_t const& stuff)
{
auto fit = std::find_if(stuff.begin(), stuff.end(), [value](whatsit_t::value_type const& vt) -> bool { return vt.second == value; });
if (fit == stuff.end()) throw std::wtf_error();
return fit->first;
}
Without lambdas you'd have to use something that similarly constructs a functor on the spot or write an externally linkable functor object for something that's annoyingly trivial.
BTW, I think maybe wtf_error is an extension.
Lambdas are basically just syntactic sugar that implement functors (NB: closures are not simple.) In C++0x, you can use the auto keyword to store lambdas locally, and std::function will enable you to store lambdas, or pass them around in a type-safe manner.
Check out the Wikipedia article on C++0x.
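A small sketch of the two storage options mentioned in this answer: auto keeps the lambda's own closure type, while std::function lets differently typed lambdas share one container or parameter type. The names apply_all and run_pipeline are made up for the example.

```cpp
#include <functional>
#include <vector>

// Applies each stored callable to x in order and returns the final value.
int apply_all(const std::vector<std::function<int(int)>>& steps, int x)
{
    for (const auto& step : steps)
        x = step(x);
    return x;
}

int run_pipeline(int start)
{
    int offset = 3;
    auto add_offset = [offset](int n) { return n + offset; };  // auto: exact closure type
    auto twice = [](int n) { return n * 2; };

    // Each lambda has its own unnamed type; std::function erases that difference.
    std::vector<std::function<int(int)>> steps;
    steps.push_back(add_offset);
    steps.push_back(twice);
    return apply_all(steps, start);
}
```

std::function adds some indirection, so auto or a template parameter is usually preferred when the callable does not need to be stored with a uniform type.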
Small functions that are not repeated.
The main complaint about functors is that they are not in the same place where they are used. So you had to find and read the functor out of context from the place it was being used in (even if it was only being used in one place).
The other problem was that a functor required some wiring to get parameters into the functor object. Not complex, but all basic boilerplate code. And boilerplate is susceptible to cut-and-paste problems.
Lambdas try to fix both of these. But I would use functors if the function is repeated in multiple places or is larger than (can't think up an appropriate term as it will be context sensitive) small.
Lambdas and functors have context. A functor is a class and therefore can be more complex than a lambda. A function has no context.
#include <iostream>
#include <list>
#include <string>
#include <vector>
using namespace std;
//Functions have no context, mod is always 3
bool myFunc(int n) { return n % 3 == 0; }
//Functors have context, e.g. _v
//Functors can be more complex, e.g. additional addNum(...) method
class FunctorV
{
public:
FunctorV(int num ) : _v{num} {}
void addNum(int num) { _v.push_back(num); }
bool operator() (int num)
{
for(int i : _v) {
if( num % i == 0)
return true;
}
return false;
}
private:
vector<int> _v;
};
void print(string prefix,list<int>& l)
{
cout << prefix << "l={ ";
for(int i : l)
cout << i << " ";
cout << "}" << endl;
}
int main()
{
list<int> l={1,2,3,4,5,6,7,8,9};
print("initial for each test: ",l);
cout << endl;
//function, so no context.
l.remove_if(myFunc);
print("function mod 3: ",l);
cout << endl;
//nameless lambda, context is x
l={1,2,3,4,5,6,7,8,9};
int x = 3;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=3: ",l);
x = 4;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=4: ",l);
cout << endl;
//functor has context and can be more complex
l={1,2,3,4,5,6,7,8,9};
FunctorV myFunctor(3);
myFunctor.addNum(4);
l.remove_if(myFunctor);
print("functor mod v={3,4}: ",l);
return 0;
}
Output:
initial for each test: l={ 1 2 3 4 5 6 7 8 9 }
function mod 3: l={ 1 2 4 5 7 8 }
lambda mod x=3: l={ 1 2 4 5 7 8 }
lambda mod x=4: l={ 1 2 5 7 }
functor mod v={3,4}: l={ 1 2 5 7 }
First, I would like to clear up some clutter here.
There are two different things
Lambda function
Lambda expression/functor.
A lambda expression, i.e. [] () -> return-type {}, always evaluates to a closure object (a kind of functor), although the compiler is free to optimize it away. A lambda that captures nothing can additionally be converted to a plain function pointer, and you can force that conversion by putting a + sign before [], as in +[] () -> return-type {}. This will create a function pointer.
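For a lambda that captures nothing, the effect of the leading + can be sketched with a compile-time check. The variable names closure and fn_ptr are chosen for this illustration only; the trick does not apply to lambdas with captures.

```cpp
#include <type_traits>

// Without +, the variable has the lambda's own unnamed closure type.
auto closure = [](int n) -> int { return n + 1; };

// Unary + forces the conversion to an ordinary function pointer.
auto fn_ptr = +[](int n) -> int { return n + 1; };

static_assert(std::is_same<decltype(fn_ptr), int (*)(int)>::value,
              "+lambda yields a plain function pointer");
static_assert(!std::is_same<decltype(closure), int (*)(int)>::value,
              "without +, the variable keeps the closure type");
```

Both variables are callable in the same way; the difference only matters when an API expects a specific function pointer type.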
Now, coming to your question. You can use lambda repeatedly as follows:
#include <iostream>
using namespace std;
int main()
{
// The init-capture [i=0] requires C++14.
auto print = [i=0] () mutable {return i++;};
cout<<print()<<endl;
cout<<print()<<endl;
cout<<print()<<endl;
// Call as many time as you want
return 0;
}
You should use a Lambda wherever it comes to mind as a way to make the code more expressive and easier to maintain; for example, you can use it in custom deleters for smart pointers and with most of the STL algorithms.
If you combine Lambdas with other features like constexpr, variadic template parameter packs, or generic lambdas, you can achieve many things.
As you pointed out, it works best when you need a one-off and the coding overhead of writing it out as a function isn't worth it.
Conceptually, the decision of which to use is driven by the same criterion as using a named variable versus an in-place expression or constant...
size_t length = strlen(x) + sizeof(y) + z++ + strlen('\0');
...
allocate(length);
std::cout << length;
...here, creating a length variable encourages the programmer to consider its correctness and meaning in isolation from its later use. The name hopefully conveys enough that it can be understood intuitively and independently of its initial value. It then allows the value to be used several times without repeating the expression (while handling z being different). While here...
allocate(strlen(x) + sizeof(y) + z++ + strlen('\0'));
...the total code is reduced and the value is localised at the point it's needed. The only thing to "carry forwards" from a reading of this line is the side effects of allocation and increment (z), but there's no extra local variable with scope or later use to consider. The programmer has to mentally juggle less state while continuing their analysis of the code.
The same distinction applies to functions versus inline statements. For the purposes of answering your question, functors versus lambdas can be seen as just a particular case of this function versus inlining decision.
I tend to prefer Functors over Lambdas these days. Although they require more code, Functors yield cleaner algorithms. The comparison below between find_id and find_id2 showcases that result. While both yield sufficiently clean code, find_id2 is slightly easier to read, as the MatchName(name) definition is extracted from (and secondary to) the primary algorithm.
I would argue, however, that the Functor code should be placed inside implementation files right above the function definition where it is used to provide direct access to the function definition. Otherwise a Lambda would be better for code-locality/organization.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
struct Person {
int id;
string name;
};
typedef vector<Person> People;
int find_id(string const& name, People const& people) {
auto MatchName = [name](Person const& p) -> bool
{
return p.name == name;
};
auto found = find_if(people.begin(), people.end(), MatchName);
if (found == people.end()) return -1;
return found->id;
}
struct MatchName {
string const& name;
MatchName(string const& name) : name(name) {}
bool operator() (Person const& person)
{
return person.name == name;
}
};
int find_id2(string const& name, People const& people) {
auto found = find_if(people.begin(), people.end(), MatchName(name));
if (found == people.end()) return -1;
return found->id;
}
int main() {
People people { {0, "Jim"}, {1, "Pam"}, {2, "Dwight"} };
cout << "Pam's ID is " << find_id("Pam", people) << endl;
cout << "Dwight's ID is " << find_id2("Dwight", people) << endl;
}
The Functor is self-documenting by default, but Lambdas need to be stored in variables (to be self-documenting) inside more complex algorithm definitions. Hence, it is preferable not to use Lambdas inline, as many people do (for code readability), in order to gain the self-documenting benefit shown above in the MatchName Lambda.
When a Lambda is stored in a variable at the call-site (or used inline), primary algorithms are slightly more difficult to read. Since Lambdas are secondary in nature to the algorithms where they are used, it is preferable to clean up the primary algorithms by using self-documenting subroutines (e.g. Functors). This might not matter as much in this example, but if one wanted to use more complex algorithms, it can significantly reduce the burden of interpreting the code.
Functors can be as simple (as in the example above) or as complex as they need to be. Sometimes complexity is desirable, as in cases that call for dynamic polymorphism (e.g. for strategy/decorator design patterns, or their template-equivalent policy types). This is a use case Lambdas cannot satisfy.
Functors require explicit declaration of capture variables without polluting primary algorithms. When more and more capture variables are required by Lambdas, the tendency is to use a blanket capture like [=]. But this reduces readability greatly, as one must mentally jump between the Lambda definition and all surrounding local variables, possibly member variables, and more.