std::algorithm to make this more expressive


I'm working on my own blockchain implementation in C++17.
To learn as much as possible, I would like to replace as many hand-written loops as possible with the more expressive algorithms in the <algorithm> header.
In a nutshell, within the blockchain algorithm, each block contains its own hash and the hash of the previous block (except the Genesis-Block, the first in the container, which has no previous block).
I want to implement a Validate function in the Blockchain object that takes pairs of blocks (which are contained in a std::vector), and checks the hashes (and previous hashes) to make sure they haven't been tampered with.
Current code (works):
bool Blockchain::Validate() const
{
    // HASH is in ASCII
    std::array<unsigned char, 64> calculated_hash{};
    // No underflow here since Genesis-Block is created in the CTOR
    for (auto index = this->chain_.size() - 1; index > 0; --index)
    {
        const auto& current_block{this->chain_[index]};
        const auto& previous_block{this->chain_[index - 1]};
        SHA256{previous_block}.FillHash(calculated_hash);
        if (!std::equal(std::begin(calculated_hash), std::end(calculated_hash),
                        std::begin(current_block.GetPreviousHash()), std::end(current_block.GetPreviousHash())))
        {
            return false;
        }
    }
    return true;
}
I would like to know if there's an algorithm that works somewhat like Python's ", ".join(arr) for strings, which puts a comma between each adjacent pair in the array, but instead checks a condition on each adjacent pair and stops as soon as the condition returns false.
TLDR:
If this is my container:
A B C D E F G
I would like to know if there's an algorithm that asserts a condition on adjacent pairs: (A, B), (B, C), (C, D), (D, E), (E, F), (F, G)
And will stop if a condition has failed, for example:
A with B -> True
B with C -> True
C with D -> False
So the algorithm will return false. (Sounds like an adjacent implementation of std::all_of).
Does a std::algorithm like this exist? Thanks!

If you have some range v where you want to check each pair of adjacent elements for some condition, and return early, you can use std::adjacent_find.
First write a lambda that compares adjacent elements:
auto comp = [](const auto& left, const auto& right)
{
    return /* the negation of the actual condition */;
};
Note that the negation is needed, so that you return early when you reach the actual false case. So in your case, A,B and B,C would compare false, and C,D would compare true.
Now you can use the lambda like this:
return std::adjacent_find(std::begin(v), std::end(v), comp) == std::end(v);
In your manual loop, you actually appear to be iterating in reverse, in which case you can write the following (note that comp then receives the later element as its first argument):
return std::adjacent_find(std::rbegin(v), std::rend(v), comp) == std::rend(v);
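
As a rough illustration of the idea above applied to a chain of blocks, here is a minimal sketch. The Block struct with plain string hashes and the validate function are hypothetical stand-ins for the asker's classes; the real code computes a SHA256 over the previous block rather than reading a stored hash.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified block: the real class hashes its contents.
struct Block {
    std::string hash;
    std::string previous_hash;
};

// Returns true when every block points at the hash of its predecessor.
bool validate(const std::vector<Block>& chain) {
    // Negated condition: true means "this link is broken".
    auto broken = [](const Block& previous, const Block& current) {
        return current.previous_hash != previous.hash;
    };
    // adjacent_find stops at the first broken link, or returns end().
    return std::adjacent_find(chain.begin(), chain.end(), broken) == chain.end();
}
```

For a chain A B C D E F G, the lambda is called on (A, B), (B, C), and so on, and the search stops at the first pair for which it returns true.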

Related

How can I apply a condition on a lambda expression in kotlin?

I want to apply a condition to a list of pairs using a higher-order infix extension function.
The result should contain the first number of each pair that satisfies the condition. Specifically, I want the first numbers whose sum with their second number is 5.
This is the code I have written so far:
fun main() {
infix fun List<Pair<Int,Int>>.takeFirstsBy(
condition: // i need help in this line to apply a condition on list of pair
):List<Int>{
}
}
I don't know how to declare the condition parameter so that it can be applied to each pair.
Please help me. Thanks.

There's no reason to make a higher order function infix, because of trailing lambda syntax. All it would do is allow you to replace the dot with a space, and make the code harder to read because it differs from conventional syntax. Also, higher order functions on collections are often chained from line to line, and you can only do this by using dot notation.
It would make sense to make it inline though, so the lambda doesn't have to be allocated each time you call this function.
You can make the receiver type an Iterable to make the function more versatile.
Your conditional has an input of a Pair of Ints, and needs to return a Boolean, so the signature is (Pair<Int, Int>) -> Boolean.
The logic of your function is to filter to only the ones that pass the condition, and then to map that filtered result to show only the first number of the pair.
inline fun Iterable<Pair<Int,Int>>.takeFirstsBy(
    condition: (Pair<Int,Int>)->Boolean
):List<Int>{
    return filter(condition).map { it.first }
}
fun main() {
    val input = listOf(1 to 4, 2 to 3, 5 to 5)
    val result = input.takeFirstsBy { it.first + it.second == 5 }
    println(result)
}
Alternate implementation. I think this is less readable, but it is more efficient because it makes a single pass and avoids building the intermediate filtered list:
inline fun Iterable<Pair<Int,Int>>.takeFirstsBy(
    condition: (Pair<Int,Int>)->Boolean
):List<Int>{
    return mapNotNull { it.takeIf(condition)?.first }
}

You don't write the condition; the caller does.
condition: (Pair<Int, Int>) -> Boolean
The parameter is a Pair<Int, Int> because that is the element type of the list, so you invoke the condition on each element as you iterate.
infix fun List<Pair<Int,Int>>.takeFirstsBy(
    condition: (Pair<Int, Int>) -> Boolean
): List<Int> {
    return this.mapNotNull {
        if (condition.invoke(it)) {
            it.first // the question asks for the first number of each pair
        } else null // null won't be included
    }
}
And then you call it like this.
The function iterates over the list and keeps the elements that satisfy the condition:
list.takeFirstsBy { pair: Pair<Int, Int> ->
    //implement the condition here, returning true/false
}

removing nested paths from vector of strings

I have a std::vector<std::string> paths where each entry is a path, and I want to remove all the paths that are sub-directories of another one.
If for example I have root/dir1/, root/dir1/sub_dir/ and root/dir2/, the result should be root/dir1/, root/dir2/.
The way I've implemented it is by using std::sort + std::unique with a predicate that checks if string2 starts with string1.
std::vector<std::string> paths = getPaths();
std::sort(paths.begin(), paths.end());
const auto to_erase = std::unique(paths.begin(), paths.end(), [](const std::string & s1, const std::string & s2) {
    return (s2.starts_with(s1) || s1.starts_with(s2));
});
paths.erase(to_erase, paths.end());
But since the predicate should be symmetrical, I wonder whether some implementation of std::unique might iterate from end to start, in which case the result would be wrong.

Your predicate is symmetric.
Let p be your predicate (the lambda), and a and b some strings, different from each other, but such that p(a, b) is true. Then either a.starts_with(b) or b.starts_with(a).
If a.starts_with(b), then p(b, a) is true because s2.starts_with(s1) is true in the lambda. Similarly, if b.starts_with(a), then p(b, a) is true because s1.starts_with(s2) is true in the lambda.
So, if p(a, b), then p(b, a) (and vice versa), which is the definition of a symmetric predicate.
It's not transitive, though (p("a/b", "a") and p("a", "a/c") but not p("a/b", "a/c")). Still, I can't see a way this could pose a problem in practice. It could definitely lead to different results if the input isn't sorted, but yours is.
So your implementation is probably fine.
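
For reference, a minimal sketch of the sort + unique approach wrapped in a function. remove_nested and has_prefix are hypothetical names; has_prefix stands in for std::string::starts_with so that the sketch also builds before C++20. It assumes, as the question's example does, that every path ends with a slash, and it relies on std::unique comparing each element against the last one it kept, which as far as I know is how the common standard library implementations behave.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: true when `s` begins with `prefix` (pre-C++20 friendly).
bool has_prefix(const std::string& s, const std::string& prefix) {
    return s.size() >= prefix.size() && s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical wrapper: drops every path that lies below another path.
std::vector<std::string> remove_nested(std::vector<std::string> paths) {
    // After sorting, a parent directory comes right before its children.
    std::sort(paths.begin(), paths.end());
    auto to_erase = std::unique(paths.begin(), paths.end(),
        [](const std::string& s1, const std::string& s2) {
            // Symmetric on purpose, as in the question.
            return has_prefix(s2, s1) || has_prefix(s1, s2);
        });
    paths.erase(to_erase, paths.end());
    return paths;
}
```

Calling remove_nested on root/dir1/sub_dir/, root/dir2/ and root/dir1/ leaves root/dir1/ and root/dir2/.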

How does std::set comparator function work?

I'm currently working on an algorithm problem using set.
set<string> mySet;
mySet.insert("(())()");
mySet.insert("()()()");
//print mySet:
(())()
()()()
Ok great, as expected.
However if I put a comp function that sorts the set by its length, I only get 1 result back.
struct size_comp
{
    bool operator()(const string& a, const string& b) const{
        return a.size()>b.size();
    }
};
set<string, size_comp> mySet;
mySet.insert("(())()");
mySet.insert("()()()");
//print myset
(())()
Can someone explain to me why?
I tried using a multiset, but it's appending duplicates.
multiset<string,size_comp> mSet;
mSet.insert("(())()");
mSet.insert("()()()");
mSet.insert("()()()");
//print mset
"(())()","()()()","()()()"

std::set stores unique values only. Two values a,b are considered equivalent if and only if
!comp(a,b) && !comp(b,a)
or in everyday language, if a is not smaller than b and b is not smaller than a. In particular, only this criterion is used to check for equality, the normal operator== is not considered at all.
So with your comparator, the set can only contain one string of length n for every n.
If you want to allow multiple values that are equivalent under your comparison, use std::multiset. This will of course also allow exact duplicates: under your comparator, "asdf" is just as equivalent to "aaaa" as it is to "asdf".
If that does not make sense for your problem, you need to come up with either a different comparator that induces a proper notion of equality or use another data structure.
A quick fix to get the behavior you probably want (correct me if I'm wrong) would be introducing a secondary comparison criterion like the normal operator>. That way, we sort by length first, but are still able to distinguish between different strings of the same length.
struct size_comp
{
bool operator()(const string& a, const string& b) const{
if (a.size() != b.size())
return a.size() > b.size();
return a > b;
}
};
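
To see the difference between the two comparators side by side, here is a small sketch. The names size_only, size_then_value and distinct_count are made up for this example; distinct_count just reports how many of the given strings survive insertion into a std::set ordered by the chosen comparator.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Compares by length only, as in the question.
struct size_only {
    bool operator()(const std::string& a, const std::string& b) const {
        return a.size() > b.size();
    }
};

// Length first, then the string contents as a tie-breaker.
struct size_then_value {
    bool operator()(const std::string& a, const std::string& b) const {
        if (a.size() != b.size())
            return a.size() > b.size();
        return a > b;
    }
};

// Hypothetical helper: number of elements the set keeps under Compare.
template <class Compare>
std::size_t distinct_count(const std::vector<std::string>& values) {
    std::set<std::string, Compare> s(values.begin(), values.end());
    return s.size();
}
```

With the strings from the question, "(())()" and "()()()" inserted along with a second "()()()", size_only keeps one element while size_then_value keeps two: the different strings of equal length are both kept, and the exact duplicate is still rejected.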

The comparator template argument, which defaults to std::less<T>, must represent a strict weak ordering relation between values in its domain.
This kind of relation has some requirements:
it's irreflexive (x < x yields false)
it's asymmetric (x < y implies that y < x is false)
it's transitive (x < y && y < z implies x < z)
Taking this further, we can define equivalence between values in terms of this relation: if !(x < y) && !(y < x), then x and y are treated as equivalent (which need not mean that x == y).
In your situation, for all x, y such that x.size() == y.size(), both comp(x,y) and comp(y,x) are false, so since neither is ordered before the other, the set treats them as equivalent.
This equivalence is used to determine whether two items are the same element, which is why the second insertion in your example is ignored.
To fix this you must make sure that your comparator never returns false for both comp(x,y) and comp(y,x) if you don't want to consider x equal to y, for example by doing
auto cmp = [](const string& a, const string& b) {
    if (a.size() != b.size())
        return a.size() > b.size();
    else
        return std::less<>()(a, b);
};
So that for inputs of the same length you fall back to the normal lexicographic order.

This is because equality of elements is defined by the comparator. An element is considered equal to another if and only if !comp(a, b) && !comp(b, a).
Since the length of "(())()" is neither greater nor less than the length of "()()()", they are considered equal by your comparator. There can be only unique elements in a std::set, so inserting an equivalent object is ignored and the existing element is kept.
The default comparator uses operator<, which in the case of strings, performs lexicographical ordering.
I tried using a multi set, but its appending duplicates.
Multiset indeed does allow duplicates. Therefore both strings will be contained despite having the same length.

size_comp considers only the length of the strings. The default comparison operator uses lexicographic comparison, which distinguishes based on the content of the string as well as the length.

Is there a standard way to compare two ranges using a predicate?

Given...
string a; // = something.
string b; // = something else. The two strings are of equal length.
string::size_type score = 0;
...what I would like to do is something like...
compare(a.cbegin(), a.cend(), b.cbegin(), b.cend(), [&score](const char c1, const char c2) -> void {
if (c1 == c2) { // actually a bit more complicated in real life
score++;
}
});
...but as far as I can tell there doesn't seem to be a std::compare. The nearest seems to be std::lexicographical_compare but that doesn't quite match. Ditto for std::equal. Is there really nothing appropriate in the standard library? I suppose I could write my own (or use a plain old C style loop which is what I did but how boring :-) but I would think what I'm doing is rather common so that would be a strange omission IMO. So my question is am I missing something?

Is there a standard algorithm to compare two ranges using a predicate? Yes, std::equal, or std::lexicographical_compare.
Is there a standard algorithm to do what your code is doing? std::inner_product can be made to do it:
std::string a = "something";
std::string b = "samething";
auto score = std::inner_product(
a.begin(), a.end(), b.begin(), 0,
[](int x, bool b) { return x + b; },
[](char a, char b) { return a == b; });
but I would think what I'm doing is rather common
No, not really. If you just want to run a general function over corresponding elements in two ranges, the appropriate algorithm would be for_each with a zip iterator. If anything's missing from the standard, it's the zip iterator. We don't need a special algorithm for this purpose.

It looks a bit as if you are looking for std::mismatch() which yields the iterators where the first difference is found (or the end, of course). It doesn't compute the difference, however, because there isn't a subtraction defined for all types. Like the other algorithms std::mismatch() comes in a form with a predicate and one without a predicate.
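
A small sketch of what std::mismatch gives you, in case it helps. first_difference is a hypothetical helper name; it uses the four-iterator overload (C++14) and simply converts the returned iterator into an index.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first position where a and b differ,
// or the length of the shorter string if no difference is found.
std::size_t first_difference(const std::string& a, const std::string& b) {
    auto its = std::mismatch(a.begin(), a.end(), b.begin(), b.end());
    return static_cast<std::size_t>(its.first - a.begin());
}
```

For "something" and "samething" this yields 1. Note that it stops at the first difference, so on its own it does not count all matching positions.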

Thank you to all who answered. What I was trying to do (more for my edification than anything else really) was to replace this...
for (string::const_iterator c1 = a.begin(), c2 = b.begin(); c1 != a.end(); ++c1, ++c2) {
if (*c1 == *c2) {
score++;
}
}
...with snazzy new C++11 stuff :-) I looked at equal, lexicographical_compare etc., but I guess what tripped me up was that they take a boolean predicate, and if it returns false, processing stops, whereas I needed to process the entire ranges each time. Then, after reading the answers you gave me, I had the epiphany that just because there is a return value doesn't mean I can't throw it away if I don't need it. By simply always returning true in my lambda I can use std::equal or std::mismatch, and they will run to the end of the range.
The only thing is that I would be using the algorithms in a different way than their names suggest, which might cause maintenance problems in the future, so I will just stick to my boring old loop for now. But I learned something new, so thanks once again.

C++ Standard Library approach to removing one of a pair of items in a list that satisfy a criterion

Imagine you have an std::list with a set of values in it. For demonstration's sake, we'll say it's just std::list<int>, but in my case they're actually 2D points. Anyway, I want to remove one of a pair of ints (or points) which satisfy some sort of distance criterion. My question is how to approach this as an iteration that doesn't do more than O(N^2) operations.
Example
Source is a list of ints containing:
{ 16, 2, 5, 10, 15, 1, 20 }
If I gave this a distance criterion of 1 (i.e. no item in the list should be within 1 of any other), I'd like to produce the following output:
{ 16, 2, 5, 10, 20 } if I iterated forward or
{ 20, 1, 15, 10, 5 } if I iterated backward
I feel that there must be some awesome way to do this, but I'm stuck with this double loop of iterators and trying to erase items while iterating through the list.

Make a map of "regions", basically, a std::map<coordinates/len, std::vector<point>>.
Add each point to its region and to each of the 8 neighboring regions: O(N*logN). Run the "naive" algorithm on each of these smaller lists (technically O(N^2), but if there's a maximum density it becomes O(N*density)). Finally, on your original list, iterate through each point, and if it has been removed from any of the mini-lists it was put in, remove it from the list: O(N).
With no limit on density, this is O(N^2), and slow. But this gets faster and faster the more spread out the points are. If the points are somewhat evenly distributed in a known boundary, you can switch to a two-dimensional array, making this significantly faster, and if there's a constant limit to the density, that technically makes this an O(N) algorithm.
That is how you sort a list of two variables by the way. The grid/map/2dvector thing.
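
One way the region idea could look in code, as a sketch under stated assumptions rather than the exact procedure described above: instead of copying each point into nine regions and running the naive algorithm per region, this variant processes the points in input order and only checks the 3x3 block of cells around each point. Point, filter_with_grid, and the choice of a cell size equal to min_dist (assumed to be greater than zero) are all hypothetical.

```cpp
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct Point { double x; double y; };  // hypothetical point type

// Keeps a point only if no previously kept point lies within min_dist of it.
std::vector<Point> filter_with_grid(const std::vector<Point>& input, double min_dist) {
    using Cell = std::pair<long, long>;
    std::map<Cell, std::vector<Point>> grid;  // kept points, bucketed by cell
    std::vector<Point> kept;
    for (const Point& p : input) {
        const long cx = static_cast<long>(std::floor(p.x / min_dist));
        const long cy = static_cast<long>(std::floor(p.y / min_dist));
        bool clash = false;
        // Points within min_dist can only be in the same or a neighboring cell.
        for (long dx = -1; dx <= 1 && !clash; ++dx) {
            for (long dy = -1; dy <= 1 && !clash; ++dy) {
                auto it = grid.find(Cell{cx + dx, cy + dy});
                if (it == grid.end()) continue;
                for (const Point& q : it->second) {
                    if (std::hypot(p.x - q.x, p.y - q.y) <= min_dist) {
                        clash = true;
                        break;
                    }
                }
            }
        }
        if (!clash) {
            grid[Cell{cx, cy}].push_back(p);
            kept.push_back(p);
        }
    }
    return kept;
}
```

As with the description above, the cost depends on how many kept points can share a cell; with sparse points each lookup touches only a handful of candidates.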
[EDIT] You mentioned you were having trouble with the "naive" method too, so here's that:
template<class iterator, class criterion>
iterator RemoveCriterion(iterator begin, iterator end, criterion criter) {
    iterator actend = end;
    for (iterator L = begin; L != actend; ++L) {
        iterator R(L);
        for (++R; R != actend;) {
            if (criter(*L, *R)) {
                // move *R behind the kept range, then shrink that range
                iterator N(R);
                std::rotate(R, ++N, actend);
                --actend;
            } else
                ++R;
        }
    }
    return actend;
}
This should work on linked lists, vectors, and similar containers, and works in reverse. Unfortunately, it's kinda slow due to not taking into account the properties of linked lists. It's possible to make much faster versions that only work on linked lists in a specific direction. Note that the return value is important, like with the other mutating algorithms. It can only alter contents of the container, not the container itself, so you'll have to erase all elements after the return value when it finishes.

Cubbi had the best answer, though he deleted it for some reason:
Sounds like it's a sorted list, in which case std::unique will do the job of removing the second element of each pair:
#include <list>
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <iterator>
int main()
{
    std::list<int> data = {1,2,5,10,15,16,20};
    std::unique_copy(data.begin(), data.end(),
                     std::ostream_iterator<int>(std::cout, " "),
                     [](int n, int m){return std::abs(n-m)<=1;});
    std::cout << '\n';
}
demo: https://ideone.com/OnGxk
That trivially extends to other types -- either by changing int to something else, or by defining a template:
template<typename T> void remove_close(std::list<T> &data, int distance)
{
    // erase the second element of every adjacent pair that is too close
    data.erase(std::unique(data.begin(), data.end(),
                           [distance](const T& n, const T& m){
                               using std::abs; // still lets a user-defined abs be found
                               return abs(n-m)<=distance;}),
               data.end());
}
Which will work for any type that defines operator - and abs to allow finding a distance between two objects.

As a mathematician I am pretty sure there is no 'awesome' way of approaching this problem for an unsorted list. It seems to me that it is a logical necessity to check the criterion for any one element against all previously selected elements in order to determine whether insertion is viable or not. There may be a number of ways to optimize this, depending on the size of the list and the criterion.
Perhaps you could maintain a bitset based on the criterion. E.g. suppose abs(n-m) <= 1 is the criterion. Suppose the first element is of size 5. This is carried over into the new list. So flip bitset[5] to 1. Then, when you encounter an element of size 6, say, you need only test
!( bitset[5] | bitset[6] | bitset[7])
This would ensure no element in the resulting list is within 1 of another. This idea may be difficult to extend to more complicated (non-discrete) criteria, however.
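
A minimal sketch of that idea for ints. To avoid fixing a bitset size up front, it swaps in a std::unordered_set of the kept values instead of a bitset; the lookup pattern is the same. remove_close is a hypothetical name, and distance is assumed to be small and non-negative.

```cpp
#include <list>
#include <unordered_set>

// Keeps a value only if no already kept value lies within `distance` of it.
std::list<int> remove_close(const std::list<int>& input, int distance) {
    std::unordered_set<int> kept;  // plays the role of the bitset
    std::list<int> result;
    for (int value : input) {
        bool clash = false;
        // Probe value - distance .. value + distance among the kept values.
        for (int d = -distance; d <= distance && !clash; ++d) {
            clash = kept.count(value + d) > 0;
        }
        if (!clash) {
            kept.insert(value);
            result.push_back(value);
        }
    }
    return result;
}
```

For the list { 16, 2, 5, 10, 15, 1, 20 } and a distance of 1 this gives { 16, 2, 5, 10, 20 }, the forward-iteration result from the question, in roughly O(N * distance) lookups on average.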

What about:
// std::binary_function was deprecated in C++11 and removed in C++17, so no base class is used
struct IsNeighbour
{
    explicit IsNeighbour(int dist)
        : distance(dist) {}
    bool operator()(int a, int b) const
        { return std::abs(a-b) <= distance; }
    int distance;
};
std::list<int>::iterator iter = lst.begin();
while(iter != lst.end())
{
    iter = std::adjacent_find(iter, lst.end(), IsNeighbour(some_distance));
    if(iter != lst.end())
        iter = lst.erase(iter);
}
This should run in O(n). It searches for the first pair of neighbours (which are at most some_distance away from each other) and removes the first of this pair. This is repeated (starting from the found item and not from the beginning, of course) until no more pairs are found.
EDIT: Oh sorry, you said any other and not just its next element. In this case the above algorithm only works for a sorted list. So you should sort it first, if necessary.
You can also use std::unique instead of the custom loop above:
lst.erase(std::unique(lst.begin(), lst.end(), IsNeighbour(some_distance)), lst.end());
but this removes the second item of each equal pair, and not the first, so you may have to reverse the iteration direction if this matters.
For 2D points instead of ints (1D points) it is not that easy, as you cannot just sort them by their euclidean distance. So if your real problem is to do it on 2D points, you might rephrase the question to point that out more clearly and remove the oversimplified int example.

I think this will work, as long as you don't mind making copies of the data, but if it's just a pair of integers/floats, that should be pretty low-cost. You're making n^2 comparisons, but you're using the <algorithm> header and can declare the input vector const.
//calculates the distance between two points and returns true if said distance is
//under its threshold
bool isTooClose(const Point& lhs, const Point& rhs, int threshold = 1);
const vector<Point>& vec; //the original vector, passed in
vector<Point>& out; //the output vector, returned however you like
for(auto b = vec.begin(), e = vec.end(); b != e; ++b) {
    const Point& candidate = *b;
    //a lambda is used here: bind1st needs an adaptable function object,
    //not a plain function, and was removed in C++17
    if(find_if(out.begin(),
               out.end(),
               [&candidate](const Point& kept) { return isTooClose(candidate, kept); }) == out.end())
    {//we didn't find anyone too close to us in the output vector. Let's add!
        out.push_back(candidate);
    }
}
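
Put together as something that compiles, the same approach might look like the sketch below. Point, is_too_close (using Euclidean distance via std::hypot) and filter_close are assumptions made for this example, not the asker's actual types.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Point { double x; double y; };  // hypothetical 2D point

// True when the Euclidean distance between a and b is at most threshold.
bool is_too_close(const Point& a, const Point& b, double threshold) {
    return std::hypot(a.x - b.x, a.y - b.y) <= threshold;
}

// Copies a point to the output only if no earlier kept point is too close.
std::vector<Point> filter_close(const std::vector<Point>& input, double threshold) {
    std::vector<Point> kept;
    for (const Point& candidate : input) {
        const bool clash = std::any_of(kept.begin(), kept.end(),
            [&](const Point& p) { return is_too_close(p, candidate, threshold); });
        if (!clash) {
            kept.push_back(candidate);
        }
    }
    return kept;
}
```

std::any_of stops at the first kept point that is too close, so the worst case is still n^2 comparisons but crowded inputs finish earlier.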

std::list<>.erase(remove_if(...)) using functors
http://en.wikipedia.org/wiki/Erase-remove_idiom
Update(added code):
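struct IsNeighbour
{
    explicit IsNeighbour(int dist)
        : m_distance(dist), m_has_old(false), m_old_value(0) {}
    bool operator()(int a)
    {
        // the first element has no predecessor, so it is always kept
        bool result = m_has_old && std::abs(a - m_old_value) <= m_distance;
        m_has_old = true;
        m_old_value = a;
        return result;
    }
    int m_distance;
    bool m_has_old;
    int m_old_value;
};
main function...
std::list<int> data = {1,2,5,10,15,16,20};
// std::remove_if may copy its predicate, so the stateful functor is passed
// through std::ref (from <functional>) to keep a single shared state
IsNeighbour is_neighbour(1);
data.erase(std::remove_if(data.begin(), data.end(), std::ref(is_neighbour)), data.end());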
