Efficient shifting of vectors - c++

What is the best way to shift a vector linearly, keeping its length the same and setting the vacated slots to 0, similar to what valarray.shift(int n) does?
I can think of a naive way, but I am wondering whether there is a better one:
int shift = 2;
std::vector<int> v = {1,2,3,4,5};
std::rotate(v.begin(), v.end() - shift, v.end());
std::fill(v.begin(), v.begin() + shift, 0);
// Input: 1,2,3,4,5
// Output: 0,0,1,2,3

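You could use std::move_backward instead, as it should be a little more "efficient" than std::rotate. The plain std::move algorithm is not suitable here: the destination begins inside the source range, so it would overwrite elements before they have been read. You still need the std::fill call though.
Use it like
std::move_backward(begin(v), end(v) - shift, end(v));
std::fill(begin(v), begin(v) + shift, 0);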
Also if the shift or size of the vector is input from outside the program, then don't forget to add some safety checks (as in the answer by Paolo).
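A minimal sketch of how the safety check and the backward move could be combined in one function. The name shift_right and the choice to clamp an oversized shift (rather than reject it) are assumptions made for illustration, not something taken from the answers here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: shifts the elements of v toward higher indices by
// `shift` positions and zero-fills the vacated slots at the front.
void shift_right(std::vector<int>& v, std::size_t shift)
{
    // Safety check: clamp so that the shift never exceeds the vector length.
    shift = std::min(shift, v.size());
    const auto n = static_cast<std::vector<int>::difference_type>(shift);

    // The destination overlaps the source and lies to its right, so the
    // elements are moved starting from the back.
    std::move_backward(v.begin(), v.end() - n, v.end());
    std::fill(v.begin(), v.begin() + n, 0);
}
```

Called as shift_right(v, 2) on {1,2,3,4,5}, this leaves 0,0,1,2,3 in v. A shift larger than the size simply zeroes the whole vector.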

I think this can be reduced to a single call to std::copy, as follows:
#include <algorithm>
#include <iostream>
#include <vector>
int main()
{
const size_t shift {2};
const std::vector<int> inVec = {1,2,3,4,5};
std::vector<int> outVec(inVec.size());
// Compare directly: inVec.size() - shift is unsigned and wraps around when shift > size
if(shift < inVec.size())
{
const size_t start {inVec.size() - shift};
std::copy(inVec.begin(), inVec.begin() + start, outVec.begin() + shift);
}
for(const auto& val : inVec)
{
std::cout << val << " ";
}
std::cout << std::endl;
for(const auto& val : outVec)
{
std::cout << val << " ";
}
std::cout << std::endl;
}

Related

Compare two unsorted std::vector [closed]

What is the best way to compare two unsorted std::vectors?
std::vector<int> v1 = {1, 2, 3, 4, 5};
std::vector<int> v2 = {2, 3, 4, 5, 1};
What I am currently doing is
const auto is_similar = v1.size() == v2.size() && std::is_permutation(v1.begin(), v1.end(), v2.begin());
Here two vectors are similar only when they have the same size and contain the same elements.
What would be a better approach for
two small std::vectors (size well under 50 elements)
two large std::vectors
std::is_permutation appears to be very slow for large arrays. For two similar arrays of 64 K elements it already takes around 5 seconds to give an answer, while regular sorting takes 0.007 seconds for arrays of this size. Timings are provided with my code below.
I suggest the following: compute a simple (and fast) hash function of the elements that is independent of element order. If the hashes of the two arrays differ, the arrays are not similar; in other words, they are not equal as multisets. Only if the hashes are the same, do a regular sort and compare the arrays for equality.
The things suggested in this answer are meant for large arrays, to make the computation fast; for tiny arrays std::is_permutation is probably enough. That said, everything in this answer applies equally well to small arrays.
The following code implements three functions, SimilarProd(), SimilarSort() and SimilarIsPermutation(). The first of them uses my suggestion of computing a hash function first and sorting afterwards.
As a position-independent hash function I took the regular product (multiplication) of all elements, each shifted by (added to) some fixed random 64-bit value. Applied to integer arrays, this kind of computation is very fast thanks to the good auto-vectorization capabilities of modern compilers (like Clang and GCC), which use SIMD instructions to boost the computation.
In the code below I timed all three implementations of the similarity functions. For similar arrays (same set of numbers) of 64 K elements, std::is_permutation() takes around 5 seconds, while both the hash approach and the sort approach take 0.007 seconds. For dissimilar arrays std::is_permutation is very fast, below 0.00005 seconds, while sort still takes 0.007 seconds and hash is about 100 times faster than sort, at 0.00008 seconds.
So the conclusion is that std::is_permutation is very slow for large similar arrays and very fast for dissimilar ones. The sort approach runs at the same speed for similar and dissimilar arrays. The hash approach is fast for similar arrays and blazingly fast for dissimilar ones: it is about as fast as std::is_permutation for dissimilar arrays, and a clear win for similar arrays.
So out of the three approaches, the hash approach is a clear win.
See the timings after the code.
Update. For comparison I have added one more method, SimilarMap(), which counts the number of occurrences of each integer in the arrays using std::unordered_map. It turned out to be a bit slower than sorting, so the Hash+Sort method is still the fastest. For very large arrays, though, this map-counting method should outperform sorting.
#include <cstdint>
#include <array>
#include <vector>
#include <algorithm>
#include <random>
#include <chrono>
#include <iomanip>
#include <iostream>
#include <unordered_map>
bool SimilarProd(std::vector<int> const & a, std::vector<int> const & b) {
using std::size_t;
using u64 = uint64_t;
if (a.size() != b.size())
return false;
u64 constexpr randv = 0x6A7BE8CD0708EC4CULL;
size_t constexpr num_words = 8;
std::array<u64, num_words> prodA = {}, prodB = {};
std::fill(prodA.begin(), prodA.end(), 1);
std::fill(prodB.begin(), prodB.end(), 1);
for (size_t i = 0; i < a.size() - a.size() % num_words; i += num_words)
for (size_t j = 0; j < num_words; ++j) {
prodA[j] *= (randv + u64(a[i + j])) | 1;
prodB[j] *= (randv + u64(b[i + j])) | 1;
}
for (size_t i = a.size() - a.size() % num_words; i < a.size(); ++i) {
prodA[0] *= (randv + u64(a[i])) | 1;
prodB[0] *= (randv + u64(b[i])) | 1;
}
for (size_t i = 1; i < num_words; ++i) {
prodA[0] *= prodA[i];
prodB[0] *= prodB[i];
}
if (prodA[0] != prodB[0])
return false;
auto a2 = a, b2 = b;
std::sort(a2.begin(), a2.end());
std::sort(b2.begin(), b2.end());
return a2 == b2;
}
bool SimilarSort(std::vector<int> a, std::vector<int> b) {
if (a.size() != b.size())
return false;
std::sort(a.begin(), a.end());
std::sort(b.begin(), b.end());
return a == b;
}
bool SimilarIsPermutation(std::vector<int> const & a, std::vector<int> const & b) {
return a.size() == b.size() && std::is_permutation(a.begin(), a.end(), b.begin());
}
bool SimilarMap(std::vector<int> const & a, std::vector<int> const & b) {
if (a.size() != b.size())
return false;
std::unordered_map<int, int> m;
for (auto x: a)
++m[x];
for (auto x: b)
--m[x];
for (auto const & p: m)
if (p.second != 0)
return false;
return true;
}
void Test() {
using std::size_t;
auto TimeCur = []{ return std::chrono::high_resolution_clock::now(); };
auto const gtb = TimeCur();
auto Time = [&]{ return std::chrono::duration_cast<
std::chrono::microseconds>(TimeCur() - gtb).count() / 1000000.0; };
std::mt19937_64 rng{123};
auto RandV = [&](size_t n) {
std::vector<int> v(n);
for (size_t i = 0; i < v.size(); ++i)
v[i] = rng() % (1 << 30);
return v;
};
size_t constexpr n = 1 << 16;
auto a = RandV(n), b = a, c = RandV(n);
std::shuffle(b.begin(), b.end(), rng);
std::cout << std::boolalpha << std::fixed;
double tb = 0;
tb = Time(); std::cout << "Prod "
<< SimilarProd(a, b) << " time " << (Time() - tb) << std::endl;
tb = Time(); std::cout << "Sort "
<< SimilarSort(a, b) << " time " << (Time() - tb) << std::endl;
tb = Time(); std::cout << "IsPermutation "
<< SimilarIsPermutation(a, b) << " time " << (Time() - tb) << std::endl;
tb = Time(); std::cout << "Map "
<< SimilarMap(a, b) << " time " << (Time() - tb) << std::endl;
tb = Time(); std::cout << "Prod "
<< SimilarProd(a, c) << " time " << (Time() - tb) << std::endl;
tb = Time(); std::cout << "Sort "
<< SimilarSort(a, c) << " time " << (Time() - tb) << std::endl;
tb = Time(); std::cout << "IsPermutation "
<< SimilarIsPermutation(a, c) << " time " << (Time() - tb) << std::endl;
tb = Time(); std::cout << "Map "
<< SimilarMap(a, c) << " time " << (Time() - tb) << std::endl;
}
int main() {
Test();
}
Output:
Prod true time 0.009208
Sort true time 0.008080
IsPermutation true time 4.436632
Map true time 0.010382
Prod false time 0.000082
Sort false time 0.008750
IsPermutation false time 0.000036
Map false time 0.016390
What would be a better approach
Remove the v1.size() == v2.size() && expression and instead pass the end iterator of the second range to std::is_permutation (this four-iterator overload is available since C++14).
You tagged C++11, but to those who can use C++20, I recommend the following:
std::ranges::is_permutation(v1, v2)
If you can modify the vectors, then it will be asymptotically faster to sort them and compare equality. If you cannot modify, then you could create a sorted copy if you can afford the storage cost.
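As a minimal sketch of the suggestion to pass the second end iterator, assuming a C++14 or later compiler. The function name same_elements is made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: true when v2 is a reordering of v1.
// With both end iterators supplied, std::is_permutation (C++14 overload)
// handles ranges of different length itself, so no separate size check
// is needed.
bool same_elements(const std::vector<int>& v1, const std::vector<int>& v2)
{
    return std::is_permutation(v1.begin(), v1.end(), v2.begin(), v2.end());
}
```

This only tidies the call; the worst-case cost of std::is_permutation is still quadratic, so for large inputs the sorting approach remains the faster option.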

Adding elements to std::vector in a repeated way

I want to copy values from one vector to another one, where they will be stored in a specific order; the second vector will contain more elements than the first one.
For example:
vector<int> temp;
temp.push_back(2);
temp.push_back(0);
temp.push_back(1);
int size1 = temp.size();
int size2 = 4;
vector<int> temp2(size1 * size2);
And now I would like to fill temp2 like that: {2, 2, 2, 2, 0, 0, 0, 0, 1, 1, 1, 1}.
Is it possible to do this using only algorithms (e.g. fill)?
Yes, it is possible using the std::generate_n algorithm:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
int main() {
std::vector<int> base{1, 0, 2};
const int factor = 4;
std::vector<int> out{};
std::generate_n(std::back_inserter(out), base.size() * factor,
[&base, counter=0]() mutable {
return base[counter++ / factor];
});
for(const auto i : out) {
std::cout << i << ' ';
}
}
This code prints: 1 1 1 1 0 0 0 0 2 2 2 2
The key is the lambda used in std::generate_n. It keeps an internal counter, and each call returns the element of the base vector at index counter / factor, so every base element is emitted factor times in a row.
No, this is quite a specific use case, but you can trivially implement it yourself.
#include <vector>
#include <iostream>
std::vector<int> Elongate(const std::vector<int>& src, const size_t factor)
{
std::vector<int> result;
result.reserve(src.size() * factor);
for (const auto& el : src)
result.insert(result.end(), factor, el);
return result;
}
int main()
{
std::vector<int> temp{2, 0, 1};
std::vector<int> real = Elongate(temp, 4);
for (const auto& el : real)
std::cerr << el << ' ';
std::cerr << '\n';
}

Finding the median value of a vector using C++

I'm a programming student, and for a project I'm working on, one of the things I have to do is compute the median value of a vector of int values, and it must be done by passing the vector through functions. The vector is initially generated randomly using the C++ random generator mt19937, which I have already written in my code. I'm to do this using the sort function and vector member functions such as .begin(), .end(), and .size().
I'm supposed to make sure I find the median value of the vector and then output it.
I'm stuck; below I have included my attempt. So where am I going wrong? I would appreciate it if you would be willing to give me some pointers or resources to get going in the right direction.
Code:
#include<iostream>
#include<vector>
#include<cstdlib>
#include<ctime>
#include<random>
using namespace std;
double find_median(vector<double>);
double find_median(vector<double> len)
{
{
int i;
double temp;
int n=len.size();
int mid;
double median;
bool swap;
do
{
swap = false;
for (i = 0; i< len.size()-1; i++)
{
if (len[i] > len[i + 1])
{
temp = len[i];
len[i] = len[i + 1];
len[i + 1] = temp;
swap = true;
}
}
}
while (swap);
for (i=0; i<len.size(); i++)
{
if (len[i]>len[i+1])
{
temp=len[i];
len[i]=len[i+1];
len[i+1]=temp;
}
mid=len.size()/2;
if (mid%2==0)
{
median= len[i]+len[i+1];
}
else
{
median= (len[i]+0.5);
}
}
return median;
}
}
int main()
{
int n,i;
cout<<"Input the vector size: "<<endl;
cin>>n;
vector <double> foo(n);
mt19937 rand_generator;
rand_generator.seed(time(0));
uniform_real_distribution<double> rand_distribution(0,0.8);
cout<<"original vector: "<<" ";
for (i=0; i<n; i++)
{
double rand_num=rand_distribution(rand_generator);
foo[i]=rand_num;
cout<<foo[i]<<" ";
}
double median;
median=find_median(foo);
cout<<endl;
cout<<"The median of the vector is: "<<" ";
cout<<median<<endl;
}
The median is given by
const auto median_it = len.begin() + len.size() / 2;
std::nth_element(len.begin(), median_it , len.end());
auto median = *median_it;
For even numbers (size of vector) you need to be a bit more precise. E.g., you can use
assert(!len.empty());
if (len.size() % 2 == 0) {
const auto median_it1 = len.begin() + len.size() / 2 - 1;
const auto median_it2 = len.begin() + len.size() / 2;
std::nth_element(len.begin(), median_it1 , len.end());
const auto e1 = *median_it1;
std::nth_element(len.begin(), median_it2 , len.end());
const auto e2 = *median_it2;
return (e1 + e2) / 2;
} else {
const auto median_it = len.begin() + len.size() / 2;
std::nth_element(len.begin(), median_it , len.end());
return *median_it;
}
There are of course many different ways to get element e1. We could also call nth_element once for the upper middle element and then take the maximum of the elements before it. Either way this step is needed, because nth_element only places the nth element correctly; the remaining elements are merely partitioned before or after it, depending on whether they are smaller or larger, and those two ranges are unsorted.
This code is guaranteed to have linear complexity on average, i.e., O(N), therefore it is asymptotically better than sort, which is O(N log N).
Regarding your code:
for (i=0; i<len.size(); i++){
if (len[i]>len[i+1])
This will not work, as you access len[len.size()] in the last iteration which does not exist.
std::sort(len.begin(), len.end());
double median = len[len.size() / 2];
will do it. You might need to take the average of the middle two elements if size() is even, depending on your requirements:
0.5 * (len[len.size() / 2 - 1] + len[len.size() / 2]);
Instead of trying to do everything at once, you should start with simple test cases and work upwards:
#include<vector>
double find_median(std::vector<double> len);
// Return the number of failures - shell interprets 0 as 'success',
// which suits us perfectly.
int main()
{
return find_median({0, 1, 1, 2}) != 1;
}
This already fails with your code (even after fixing i to be an unsigned type), so you could start debugging (even 'dry' debugging, where you trace the code through on paper; that's probably enough here).
I do note that with a smaller test case, such as {0, 1, 2}, I get a crash rather than merely failing the test, so there's something that really needs to be fixed.
Let's replace the implementation with one based on overseas's answer:
#include <algorithm>
#include <limits>
#include <vector>
double find_median(std::vector<double> len)
{
if (len.size() < 1)
return std::numeric_limits<double>::signaling_NaN();
const auto alpha = len.begin();
const auto omega = len.end();
// Find the two middle positions (they will be the same if size is odd)
const auto i1 = alpha + (len.size()-1) / 2;
const auto i2 = alpha + len.size() / 2;
// Partial sort to place the correct elements at those indexes (it's okay to modify the vector,
// as we've been given a copy; otherwise, we could use std::partial_sort_copy to populate a
// temporary vector).
std::nth_element(alpha, i1, omega);
std::nth_element(i1, i2, omega);
return 0.5 * (*i1 + *i2);
}
Now, our test passes. We can write a helper method to allow us to create more tests:
#include <cmath>
#include <iostream>
bool test_median(const std::vector<double>& v, double expected)
{
auto actual = find_median(v);
// std::abs from <cmath>: an unqualified abs may resolve to the int overload
if (std::abs(expected - actual) > 0.01) {
std::cerr << actual << " - expected " << expected << std::endl;
return true;
} else {
std::cout << actual << std::endl;
return false;
}
}
int main()
{
return test_median({0, 1, 1, 2}, 1)
+ test_median({5}, 5)
+ test_median({5, 5, 5, 0, 0, 0, 1, 2}, 1.5);
}
Once you have the simple test cases working, you can manage more complex ones. Only then is it time to create a large array of random values to see how well it scales:
#include <ctime>
#include <functional>
#include <iterator>
#include <random>
#include <string>
int main(int argc, char **argv)
{
std::vector<double> foo;
const int n = argc > 1 ? std::stoi(argv[1]) : 10;
foo.reserve(n);
std::mt19937 rand_generator(std::time(0));
std::uniform_real_distribution<double> rand_distribution(0,0.8);
std::generate_n(std::back_inserter(foo), n, std::bind(rand_distribution, rand_generator));
std::cout << "Vector:";
for (auto v: foo)
std::cout << ' ' << v;
std::cout << "\nMedian = " << find_median(foo) << std::endl;
}
(I've taken the number of elements as a command-line argument; that's more convenient in my build than reading it from cin). Notice that instead of allocating n doubles in the vector, we simply reserve capacity for them, but don't create any until needed.
For fun and kicks, we can now make find_median() generic. I'll leave that as an exercise; I suggest you start with:
template<class Iterator>
auto find_median(Iterator alpha, Iterator omega)
{
using value_type = typename Iterator::value_type;
if (alpha == omega)
return std::numeric_limits<value_type>::signaling_NaN();
}

Using `transform` to create an increasing vector

I am trying to make an increasing vector using transform and must not be doing it correctly. I want to use transform. What am I doing wrong?
PS - I will be using the c++ 11 standard and g++.
#include <iostream>
#include <algorithm>
#include <vector>
int main()
{
std::vector<double> x(10);
x.front() = 0.0;
double h = 0.1;
std::transform(x.begin(), x.end() - 1, x.begin() + 1, [h](unsigned int xn) {return xn + h;});
std::cout << x.at(3) << " " << x.at(9) << std::endl;
}
The conversion to unsigned int truncates each value when it is used to calculate the next one.
std::transform - Using a unary operator
std::transform applies the given function to a range and stores the
result in another range, beginning at d_first.
Via std::transform and a closure you can initialize your std::vector:
#include <algorithm>
#include <iostream>
#include <vector>
int main() {
std::vector<double> v(10);
const double step = 0.1;
std::transform(begin(v), end(v), begin(v),
[step](const double value) { return value + step; });
for (const auto value : v) {
std::cout << value << ' ';
}
}
std::generate - Increment via a callable
Assigns each element in range [first, last) a value generated by the
given function object
If you want a custom increment, you can use std::generate:
#include <algorithm>
#include <iostream>
#include <vector>
int main() {
std::vector<double> v(10);
double seed = 0.0;
std::generate(begin(v), end(v), [&seed]() {
const auto ret = seed;
seed += 0.1;
return ret;
});
for (const auto value : v) {
std::cout << value << ' ';
} // outputs: 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
}
std::iota - Increment via ++value
Slightly off topic. You can provide a type with an operator++ that increments by 0.1, but it is not intuitive for the reader.
You can use std::iota which relies on operator++.
Fills the range [first, last) with sequentially increasing values, starting with value and repetitively evaluating ++value.
The code in your case will be:
#include <numeric>
#include <iostream>
#include <vector>
int main() {
std::vector<double> v(10);
std::iota(begin(v), end(v), 0.0);
for (const auto value : v) {
std::cout << value << ' ';
} // outputs: 0 1 2 3 4 5 6 7 8 9
}
The lambda declares a wrong type of the parameter
[h](unsigned int xn) {return xn + h;});
^^^^^^^^^^^^^^^
There should be
[h]( double xn ) {return xn + h;});
^^^^^^^^^^^
Here are some other ways to write this. You may find them more expressive.
#include <vector>
#include <algorithm>
#include <numeric>
std::vector<double> create1(double i, double h)
{
std::vector<double> v(10);
std::generate(std::begin(v), std::end(v),
[&]() mutable
{
auto result = i;
i += h;
return result; // return the value from before the increment, so the sequence starts at i
});
return v;
}
std::vector<double> create2(double i, double h)
{
std::vector<double> v(10);
for (std::size_t x = 0 ; x < v.size() ; ++x) {
v[x] = i + h * x;
}
return v;
}
std::vector<double> create3(double i, double h)
{
struct emitter
{
emitter& operator++() {
i += h;
return *this; // a function with a non-void return type must return a value
}
operator double() const { return i; }
double i, h;
};
std::vector<double> v(10);
std::iota(v.begin(), v.end(), emitter { i, h });
return v;
}
int main()
{
auto v1 = create1(0, 0.1);
auto v2 = create2(0, 0.1);
auto v3 = create3(0, 0.1);
}
Regardless of any other problems it might have, your implementation has a subtle flaw: it relies on each preceding value in the vector having been already set.
This is not guaranteed to work, because std::transform() does not guarantee in-order application of the operator.

Convert vector<bool> to binary

I have a vector<bool> that contains 10 elements. How can I convert it to a binary type?
vector<bool> a = {0,1,1,1,1,0,1,1,1,0};
I want to get binary values, something like this:
long long int x = convert2bin(a);
cout << "x = " << x << endl;
x = 0b0111101110
Note: the size of the vector will change at run time, max size = 400.
0b is important, I want to use the gcc extension, or some literal type.
As I understand from the comment
Yes it can even hold 400 values
and from the question
0b is important
you need a string, not an int.
std::string convert2bin(const std::vector<bool>& v)
{
std::string out("0b");
out.reserve(v.size() + 2);
for (bool b : v)
{
out += b ? '1' : '0';
}
return out;
}
std::vector<bool> a = { 0, 1, 1, 1, 1, 0, 1, 1, 1, 0 };
std::string s = "";
for (bool b : a)
{
s += std::to_string(b);
}
int result = std::stoi(s, nullptr, 2); // parse the digit string as base 2
If you really want to do this, you start from the end. That said, I agree with Marius Bancila and advise using a bitset instead.
int myValue = 0;
for(int i=a.size()-1, pos=0; i>=0; i--, pos++)
{
// Here we create the bitmask for this value
if(a[i])
{
int mask = 1 << pos;
myValue |= mask;
}
}
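The bitset suggestion could be sketched as follows for the stated maximum of 400 bits. The names kMaxBits and to_bitset are made up for illustration, and treating the first vector element as the most significant bit is an assumption based on the example in the question.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <vector>

// Assumed upper bound, taken from the "max size = 400" note in the question.
constexpr std::size_t kMaxBits = 400;

// Hypothetical helper: copies the vector into a fixed-size bitset, with the
// first element as the most significant of the bits that are used.
// Elements beyond kMaxBits are ignored.
std::bitset<kMaxBits> to_bitset(const std::vector<bool>& a)
{
    std::bitset<kMaxBits> bits;
    const std::size_t n = std::min<std::size_t>(a.size(), kMaxBits);
    for (std::size_t i = 0; i < n; ++i)
        bits[n - 1 - i] = a[i];
    return bits;
}
```

For the 10-element example, to_bitset(a).to_ullong() yields 494. For longer inputs to_string() gives the full 400-character representation, padded with leading zeros.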
Your x is just the integer form of a, so you can use std::accumulate like the following
// 0LL makes the accumulator a long long; with a plain 0 the sum would be kept in an int
long long x = std::accumulate(a.begin(), a.end(), 0LL,
[](long long p, long long q)
{ return (p << 1) + q; }
);
For a size of 400, you need a std::string though
First of all the result of the conversion is not a literal. So you may not use prefix 0b applied to variable x.
Here is an example
#include <iostream>
#include <iomanip>
#include <algorithm>
#include <numeric>
#include <vector>
#include <iterator>
#include <limits>
int main()
{
std::vector<bool> v = { 0, 1, 1, 1, 1, 0, 1, 1, 1, 0 };
typedef std::vector<bool>::size_type size_type;
size_type n = std::min<size_type>( v.size(),
std::numeric_limits<long long>::digits + 1 );
long long x = std::accumulate( v.begin(), std::next( v.begin(), n ), 0ll,
[]( long long acc, int value )
{
return acc << 1 | value;
} );
for ( int i : v ) std::cout << i;
std::cout << std::endl;
std::cout << std::hex << x << std::endl;
return 0;
}
The output is
0111101110
1ee
vector<bool> is already a "binary" type.
Converting to an int is not possible for more bits than available in an int. However if you want to be able to print in that format, you can use a facet and attach it to the locale then imbue() before you print your vector<bool>. Ideally you will "store" the locale once.
I don't know the GNU extension for printing an int with 0b prefix but you can get your print facet to do that.
A simpler way is to create a "wrapper" for your vector<bool> and print that.
Although vector<bool> is typically implemented internally as a packed array of bits, there is no public method to extract the raw data, nor necessarily a standard representation for it.
You can of course convert it to a different type by iterating through it, although I guess you may have been looking for something else?
If the number of bits is known in advance and by some reason you need to start from an std::array rather than from an std::bitset directly, consider this option (inspired by this book):
#include <algorithm>
#include <sstream>
#include <iostream>
#include <bitset>
#include <array>
#include <iterator>
/**
* @brief Converts an array of bools to a bitset
* @tparam nBits the size of the array
* @param bits the array of bools
* @return a bitset with size nBits
* @see https://www.linuxtopia.org/online_books/programming_books/c++_practical_programming/c++_practical_programming_192.html
*/
template <size_t nBits>
std::bitset<nBits> BitsToBitset(const std::array<bool, nBits> bits)
{
std::ostringstream oss;
std::copy(std::begin(bits), std::end(bits), std::ostream_iterator<bool>(oss, ""));
return std::bitset<nBits>(oss.str());
}
int main()
{
std::array<bool, 10> a = { 0, 1, 1, 1, 1, 0, 1, 1, 1, 0 };
unsigned long int x = BitsToBitset(a).to_ulong();
std::cout << x << std::endl;
return x;
}