C++ | STL Sort() Function Third Argument Complexity confusions - c++

Suppose, we have a 2D vector<vector<int>> vec;
We need to sort this 2D vector. I tried with two methods below:
Method 1: Using a comparator function
static bool cmp(vector<int> a, vector<int> b) {
return a[1] < b[1];
}
...
sort(vec.begin(),vec.end(),cmp);
Method 2: Using a lambda
sort(vec.begin(), vec.end(), [](const vector<int>& a, const vector<int>& b) {
return a[1] < b[1];
});
For a problem from leetcode, Method 1 caused a "Time Limit Exceeded" verdict, while Method 2 was accepted.
Can there be that much contrast between these two methods in terms of time complexity?

vector<int> a makes a copy of the vector while const vector<int>& a just passes the address. Huge difference.

Your comparator takes its parameters by value, which means every vector that sort() passes to cmp() has to be copied. Each comparison then costs time and memory proportional to the length of the inner vectors instead of constant time, and sort() performs O(n log n) comparisons.
Your lambda, on the other hand, takes its parameters by reference, so sort() passes only the address of each vector and no copies are made. The comparisons add no extra cost.
Simply update your comparator to take reference parameters, and then the two methods will have similar complexity:
static bool cmp(const vector<int> &a, const vector<int> &b) {
return a[1] < b[1];
}
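To see the difference rather than take it on faith, here is a minimal sketch that counts copies. Row, byValue, byRef and copiesDuringSort are hypothetical names made up for this illustration, and the sketch assumes the usual standard library behaviour where std::sort moves elements instead of copying them.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper around a vector that counts how often it is copied.
struct Row {
    std::vector<int> data;
    static std::size_t copies;

    explicit Row(std::vector<int> d) : data(std::move(d)) {}
    Row(const Row& other) : data(other.data) { ++copies; }
    Row& operator=(const Row& other) { data = other.data; ++copies; return *this; }
    Row(Row&&) = default;             // moves are not counted
    Row& operator=(Row&&) = default;
};
std::size_t Row::copies = 0;

// Same comparison logic, different parameter passing.
bool byValue(Row a, Row b) { return a.data[1] < b.data[1]; }
bool byRef(const Row& a, const Row& b) { return a.data[1] < b.data[1]; }

// Sorts its own copy of rows with cmp and returns how many Row copies
// were made while sorting.
template <typename Cmp>
std::size_t copiesDuringSort(std::vector<Row> rows, Cmp cmp) {
    Row::copies = 0;
    std::sort(rows.begin(), rows.end(), cmp);
    return Row::copies;
}
```

Calling copiesDuringSort(rows, byValue) should report two copies per comparison, while copiesDuringSort(rows, byRef) should report none from the comparator.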

Related

Lambda function, arguments and logic in c++ [duplicate]

I am new to using lambda functions in C++. I have been researching the web and found several articles explaining the syntax and purpose of lambda functions, but I have not found any that clearly explain how to write the inner logic of a lambda function.
For example
During sorting a vector in c++ in decreasing order:
sort(v1.begin(), v1.end(), [](const int &a, const int &b){return b < a;});
I write the above code. Here, I have a few questions:
Why do I only provide two parameters in the lambda function? Why not three? Or why can't I take all n parameters (n being the size of the vector) and apply some logic? I am not trying to find the maximum of two elements, I am trying to sort a vector, so why should I only consider two values?
Why does a > b give descending order? Why not b > a? Is there any kind of ordering inside the lambda function?
The return value of the above lambda function is either false (0) or true (1). Why do I only have to return false (0) or true (1) to sort? Why can't I return a character, say 'a' for ascending and 'd' for descending?
Again
During finding the max even element
itr = max_element(v1.begin(), v1.end(), [](const int &a, const int &b){
if (isEven(a) && isEven(b)) {
return (a < b);
} else
return false;
}
);
Here I am returning b > a rather than a > b. Why?
Any suggestion would be greatly appreciated.
Your question has nothing to do with lambdas, but with the std::sort function.
Indeed, if you read the documentation about the third parameter (the comparison function, the lambda in your case), it says:
comparison function object which returns true if the first argument is
less than (i.e. is ordered before) the second.
The signature of the comparison function should be equivalent to the
following:
bool cmp(const Type1 &a, const Type2 &b);
Indeed, there is no need to create a lambda to pass as the third parameter. You can pass any function object that takes two arguments of type T (the type of your container's elements) and returns bool.
For example, you can do something like this:
#include <vector>
#include <algorithm>
#include <iostream>
struct {
bool operator () (int a, int b){
return a > b;
}
}my_comp;
int main(){
std::vector<int> v={1,2,3};
std::sort(v.begin(), v.end(), my_comp);
for(auto e:v) std::cout << e << " ";
std::cout << std::endl;
}
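The same "is the first argument ordered before the second" rule applies to max_element. The lambda in the question returns false whenever either value is odd, so if the range starts with an odd number the algorithm never moves past it. Below is a sketch of a comparator that avoids this by ordering every odd value before every even one. isEven is assumed to behave like the asker's helper, and evenLess and maxEven are names invented here.

```cpp
#include <algorithm>
#include <vector>

// Assumed to match the asker's helper: true when x is even.
bool isEven(int x) { return x % 2 == 0; }

// "a is ordered before b": any odd value comes before any even value,
// two even values are ordered by size, two odd values are equivalent.
bool evenLess(int a, int b) {
    if (isEven(a) != isEven(b)) return isEven(b);
    return isEven(a) && a < b;
}

// Returns an iterator to the largest even element, or v.end() if there is none.
std::vector<int>::const_iterator maxEven(const std::vector<int>& v) {
    auto it = std::max_element(v.begin(), v.end(), evenLess);
    return (it != v.end() && isEven(*it)) ? it : v.end();
}
```

The final check is needed because max_element still returns some element when every value is odd.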

how to sum up a vector of vector int in C++ without loops

I am trying to sum all elements of a vector<vector<int>> without writing loops.
I have checked some relevant questions, such as How to sum up elements of a C++ vector?.
So I tried to use std::accumulate, but I find it hard to write the binary operator that std::accumulate needs.
How can I implement this with std::accumulate, or is there a better way?
Could anyone help me?
Thanks in advance.
You need to use std::accumulate twice, once for the outer vector with a binary operator that knows how to sum the inner vector using an additional call to std::accumulate:
int sum = std::accumulate(
vec.begin(), vec.end(), // iterators for the outer vector
0, // initial value for summation - 0
[](int init, const std::vector<int>& intvec){ // binaryOp that sums a single vector<int>
return std::accumulate(
intvec.begin(), intvec.end(), // iterators for the inner vector
init); // current sum
// use the default binaryOp here
}
);
In this case, I do not suggest using std::accumulate, as it would greatly impair readability. Moreover, this function uses loops internally, so you would not save anything. Just compare the following loop-based solution with the other answers that use std::accumulate:
int result = 0 ;
for (auto const & subvector : your_vector)
for (int element : subvector)
result += element;
Does using a combination of iterators, STL functions, and lambda functions make your code easier to understand and faster? For me, the answer is clear. Loops are not evil, especially for such a simple application.
According to https://en.cppreference.com/w/cpp/algorithm/accumulate, BinaryOp receives the current sum as its left-hand argument and the next range element as its right-hand argument. So you should run std::accumulate on the right-hand argument, add the result to the left-hand argument, and return the sum. If you use C++14 or later:
auto binary_op = [&](auto cur_sum, const auto& el){
auto rhs_sum = std::accumulate(el.begin(), el.end(), 0);
return cur_sum + rhs_sum;
};
I didn't try to compile the code though :). If I messed up the order of the arguments, just swap them.
Edit: wrong terminology - you don't overload BinaryOp, you just pass it.
Signature of std::accumulate is:
T accumulate( InputIt first, InputIt last, T init,
BinaryOperation op );
Note that the return value is deduced from the init parameter (it is not necessarily the value_type of InputIt).
The binary operation is:
Ret binary_op(const Type1 &a, const Type2 &b);
where... (from cppreference)...
The type Type1 must be such that an object of type T can be implicitly converted to Type1. The type Type2 must be such that an object of type InputIt can be dereferenced and then implicitly converted to Type2. The type Ret must be such that an object of type T can be assigned a value of type Ret.
However, when T is the value_type of InputIt, the above is simpler and you have:
using value_type = typename std::iterator_traits<InputIt>::value_type;
T binary_op(T,value_type&).
Your final result is supposed to be an int, hence T is int. You need two calls to std::accumulate, one for the outer vector (where value_type == std::vector<int>) and one for the inner vectors (where value_type == int):
#include <iostream>
#include <numeric>
#include <iterator>
#include <vector>
template <typename IT, typename T>
T accumulate2d(IT outer_begin, IT outer_end,const T& init){
using value_type = typename std::iterator_traits<IT>::value_type;
return std::accumulate( outer_begin,outer_end,init,
[](T accu,const value_type& inner){
return std::accumulate( inner.begin(),inner.end(),accu);
});
}
int main() {
std::vector<std::vector<int>> x{ {1,2} , {1,2,3} };
std::cout << accumulate2d(x.begin(),x.end(),0);
}
Solutions based on nesting std::accumulate may be difficult to understand.
By using a 1D array of intermediate sums, the solution can be more straightforward (but possibly less efficient).
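#include <algorithm>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>
using namespace std;
int main()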
{
// create a unary operator for 'std::transform'
auto accumulate = []( vector<int> const & v ) -> int
{
return std::accumulate(v.begin(),v.end(),int{});
};
vector<vector<int>> data = {{1,2,3},{4,5},{6,7,8,9}}; // 2D array
vector<int> temp; // 1D array of intermediate sums
transform( data.begin(), data.end(), back_inserter(temp), accumulate );
int result = accumulate(temp);
cerr<<"result="<<result<<"\n";
}
The call to transform accumulates each of the inner arrays to initialize the 1D temp array.
To avoid loops, you'll have to specifically add each element:
std::vector<int> database = {1, 2, 3, 4};
int sum = 0;
int index = 0;
// Start the accumulation
sum += database[index++];
sum += database[index++];
sum += database[index++];
sum += database[index++];
There is no guarantee that std::accumulate will be non-loop (no loops). If you need to avoid loops, then don't use it.
IMHO, there is nothing wrong with using loops: for, while or do-while. Processors that have specialized instructions for summing arrays use loops. Loops are a convenient method for conserving code space. However, there may be times when loops want to be unrolled (for performance reasons). You can have a loop with expanded or unrolled content in it.
With range-v3 (and soon with C++20), you might do
const std::vector<std::vector<int>> v{{1, 2}, {3, 4, 5, 6}};
auto flat = v | ranges::view::join;
std::cout << std::accumulate(begin(flat), end(flat), 0);

Comparing a vector with an array assuming the elements are in different order

I would like to compare a vector with an array assuming that elements are in different order.
I have got a struct like below:
struct A
{
int index;
A(int i = 0) : index(i) {} // takes an index so that A(1), A(2), ... below compile
};
The size of the vector and the array is the same:
std::vector<A> l_v = {A(1), A(2), A(3)};
A l_a[3] = {A(3), A(1), A(2)};
The function to compare elements is:
bool isTheSame()
{
return std::equal(l_v.begin(), l_v.end(), l_a,
[](const A& lhs, const A& rhs){
return lhs.index == rhs.index;
});
}
The problem is that my function will return false, because the elements are the same, but not in the same order.
A solution is to sort the elements in the vector and array before "std::equal", but is there any better solution?
Using sort would be the way to go. Sorting in general is a good idea, and as far as I know it would result in the best performance.
Note: I would recommend passing the vectors as arguments rather than using member variables. The function would then be well suited to inlining. You might also want to consider taking it out of the class and/or making it static.
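A rough sketch of that suggestion, as a free function taking both sequences as arguments. sameElements is a made-up name, and A is given an int constructor here so the asker's initializers compile. The parameters are taken by value on purpose, so the sort happens on copies and the caller's data keeps its order.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

struct A {
    int index;
    explicit A(int i = 0) : index(i) {}
};

// True when both sequences hold the same index values, in any order.
// Both arguments are copies, so the originals are left untouched.
bool sameElements(std::vector<A> lhs, std::vector<A> rhs) {
    if (lhs.size() != rhs.size()) return false;
    auto byIndex = [](const A& a, const A& b) { return a.index < b.index; };
    std::sort(lhs.begin(), lhs.end(), byIndex);
    std::sort(rhs.begin(), rhs.end(), byIndex);
    return std::equal(lhs.begin(), lhs.end(), rhs.begin(),
                      [](const A& a, const A& b) { return a.index == b.index; });
}
```

The array from the question can be passed as std::vector<A>(std::begin(l_a), std::end(l_a)).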

Using Vectors with functions, Pointer Issues

I have gotten myself completely muddled up and confused when it comes to pointers and vectors.
I have two functions
vector<int>& add(vector<int>& num1, vector<int>& num2){
vector<int> ansVec;
....
return ansVec;
}
and
vector<int>& multiply(vector<int>& num1, vector<int>& num2){
...
return add( multiply(num1, num2), multiply(num1, num2));
}
The issue seems to be that I'm returning a reference to a local variable.
How could I pass the entire vector, and not just the reference?
You cannot always return by reference.
Here are some solutions to your problem:
1) Put the result in a container parameter:
void add(vector<int>& num1, vector<int>& num2 ,vector<int>* result)
and fill the result vector with the result of the computation.
2) Return by value:
vector<int> add(vector<int>& num1, vector<int>& num2);
Note that before C++11, and when the compiler does not elide the copy, this copies each element from the temporary vector used for the computation into the returned vector.
3) Return a pointer (or some kind of smart pointer) to a vector allocated on the heap:
vector<int>* add(vector<int>& num1, vector<int>& num2)
{
vector<int>* res = new vector<int>();
...
return res ;
}
Note that the clients of this code must not forget to delete the vector.
Or, for example, with boost::shared_ptr, or std::shared_ptr if you have C++11:
shared_ptr<vector<int>> add(vector<int>& num1, vector<int>& num2)
{
shared_ptr<vector<int>> res (new vector<int>());
...
return res ;
}
I think you'll be better off using std algorithms for carrying out addition/multiplication of vectors, e.g. use std::transform for adding or multiplying two vectors:
//Add vec1 and vec2 and store the results in result vec
//Note: take care of the ranges of two sources/destination
transform(vec1.begin(), vec1.end(), vec2.begin(), result.begin(), std::plus<int>());
Issues with your code:
returning reference of the local variable
multiply doesn't have a terminating condition.
In C++11, life is easier. Just return by value. It will automatically apply move semantics as opposed to copy semantics. Copying vectors is expensive, but moving them is cheap. Do not bother with pointers or references unless you are willing to profile your code and prove that it is a bottleneck. Using a pointer or reference here obfuscates code, and is a premature optimization.
See this talk: https://www.youtube.com/watch?v=xnqTKD8uD64. He takes up this exact issue, and recommends return by value for any type that is cheap to move.
Edit: when I say don't bother with references/pointers, I mean for returning. Passing in a const ref to a vector for input arguments, and a non-const ref to a vector for an in/out argument, is standard and good practice.
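Putting the advice from these answers together, a return-by-value version of add might look like the sketch below. The body of the asker's add is elided in the question, so element-wise addition over the common length is purely an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Assumption: "add" means element-wise addition over the shorter of the two
// inputs. Inputs are taken by const reference, the result is returned by value.
std::vector<int> add(const std::vector<int>& num1, const std::vector<int>& num2) {
    const std::size_t n = std::min(num1.size(), num2.size());
    std::vector<int> ansVec;
    ansVec.reserve(n);
    std::transform(num1.begin(), num1.begin() + static_cast<std::ptrdiff_t>(n),
                   num2.begin(), std::back_inserter(ansVec), std::plus<int>());
    return ansVec;  // moved or elided, and never a dangling reference
}
```

Because the parameters are const references, temporaries can be passed directly, so an expression such as add(add(a, b), c) compiles, which the non-const reference version would reject.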

qsort comparison compilation error

My medianfilter.cpp class invokes qsort as seen below.
vector<float> medianfilter::computeMedian(vector<float> v) {
float arr[100];
std::copy(v.begin(), v.end(), arr);
unsigned int i;
qsort(arr, v.size(), sizeof(float), compare);
for (i = 0; i < v.size(); i++) {
printf("%f ", arr[i]);
}
printf("median=%d ", arr[v.size() / 2]);
return v;
}
The implementation of my comparison is:
int medianfilter::compare(const void * a, const void * b) {
float fa = *(const float*) a;
float fb = *(const float*) b;
return (fa > fb) - (fa < fb);
}
while the declaration in mediafilter.hpp is private and looks like this:
int compare (const void*, const void*);
A compilation error occurs: cannot convert ‘mediafilter::compare’ from type ‘int (mediafilter::)(const void*, const void*)’ to type ‘__compar_fn_t {aka int (*)(const void*, const void*)}’
I don't understand this error completely. How do I correctly declare and implement this comparison method?
Thanks!
Compare is a non-static member function whereas qsort expects a non-member function (or a static member function). As your compare function doesn't seem to use any non-static members of the class, you could just declare it static. In fact I'm not sure what your median filter class does at all. Perhaps you just need a namespace.
Why not sort the vector directly instead of copying it into a second array? Furthermore your code will break if the vector has more than 100 elements.
The default behavior of sort does just what you need, but for completeness I show how to use a compare function.
I also changed the return type of your function because I don't understand why a function called computeMedian wouldn't return the median.
namespace medianfilter
{
bool compare(float fa, float fb)
{
return fa < fb;
}
float computeMedian(vector<float> v)
{
std::sort(v.begin(), v.end(), compare);
// or simply: std::sort(v.begin(), v.end());
for (size_t i = 0; i < v.size(); i++) {
printf("%f ", v[i]);
}
if (v.empty())
{
// what do you want to happen here?
}
else
{
float median = v[v.size() / 2]; // what should happen if size is even?
printf("median=%f ", median); // it was %d before
return median;
}
}
}
You can't call compare as it is because it is a member function and requires a this pointer (i.e. it needs to be called on an object). However, as your compare function doesn't need a this pointer, simply make it a static function and your code will compile.
Declare it like this in your class:
static int compare(const void * a, const void * b);
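For illustration, a stripped-down sketch of the class with a static compare passed to qsort. sortAscending is a hypothetical member added only to show the call, and the rest of the asker's class is left out.

```cpp
#include <cstdlib>
#include <vector>

class medianfilter {
public:
    // Hypothetical helper: sorts v in place using qsort.
    static void sortAscending(std::vector<float>& v) {
        if (v.empty()) return;  // nothing to sort
        std::qsort(v.data(), v.size(), sizeof(float), compare);
    }

private:
    // static: there is no this pointer, so the name converts to the
    // plain function pointer type that qsort expects.
    static int compare(const void* a, const void* b) {
        const float fa = *static_cast<const float*>(a);
        const float fb = *static_cast<const float*>(b);
        return (fa > fb) - (fa < fb);
    }
};
```

Sorting v.data() directly also removes the fixed 100-element array from the original code.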
Not directly related to your question (for which you already have the answer) but some observations:
Your calculation of median is wrong. If the number of elements is even you should return the average of the two center values not the value of lower one.
The copy to the array with a set size screams buffer overflow. Copy to another vector and std::sort it or (as suggested by @NeilKirk) just sort the original one unless you have cause not to modify it.
There is no guard against empty input. The median is undefined in this case, but your implementation would just return whatever happens to be in arr[0].
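The three observations above could be addressed along these lines. This is only a sketch: median is a made-up free function, and throwing on empty input is one possible policy, not something the question asks for.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Takes the vector by value, so the caller's data is not reordered.
float median(std::vector<float> v) {
    if (v.empty()) {
        // Assumed policy: the median of no values is an error.
        throw std::invalid_argument("median of empty input");
    }
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    if (v.size() % 2 == 1) return v[mid];   // odd count: the middle value
    return (v[mid - 1] + v[mid]) / 2.0f;    // even count: mean of the two middle values
}
```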
OK, this is more of an appendix to Eli Algranti's (excellent) answer than an answer to the original question.
Here is generic code to compute the quantile quant of a vector of double called x (which the code below preserves).
First things first: there are many definitions of quantiles (R alone lists 9). The code below corresponds to definition #5 (which is also the default quantile function in MATLAB and generally the one statisticians think of when they think quantile).
The key idea here is that when the quantile does not fall on a precise observation (e.g. when you want the 15% quantile of an array of length 10) the implementation below performs the (correct) interpolation (in this case between the 10% and 20% quantiles) between adjacent quantiles. This is important so that when you increase the number of observations (I'm hinting at the name medianfilter here) the value of the quantile does not jump about abruptly but converges smoothly instead (which is one reason why this is the statistician's preferred definition).
The code assumes that x has at least one element (the code below is part of a longer one and I feel this point has been made already).
Unfortunately it is written using many functions from the (excellent!) C++ Eigen library, and it is too late at night for me to translate the Eigen functions or sanitize the variable names, but the key ideas should be readable.
#include <algorithm> // std::nth_element
#include <cmath> // floor, ceil
#include <Eigen/Dense>
#include <Eigen/QR>
using namespace std;
using namespace Eigen;
using Eigen::MatrixXd;
using Eigen::VectorXd;
using Eigen::VectorXi;
double quantiles(const Ref<const VectorXd>& x,const double quant){
//computes the quantile 'quant' of x.
const int n=x.size();
double lq,uq,fq;
const double q1=n*(double)quant+0.5;
const int index1=floor(q1);
const int index2=ceil(q1);
const double index3=(double)index2-q1;
VectorXd x1=x;
std::nth_element(x1.data(),x1.data()+index1-1,x1.data()+x1.size());
lq=x1(index1-1);
if(index1==index2){
fq=lq;
} else {
uq=x1.segment(index1,x1.size()-index1).minCoeff(); // minimum over every element after position index1-1
fq=lq*index3+uq*(1.0-index3);
}
return(fq);
}
So the code uses one call to nth_element, which has average complexity O(n) [sorry for sloppily using big O for average] and (when n is even) one extra call to min() [which in Eigen dialect is written .minCoeff()] on at most n/2 elements of the vector, which is O(n/2).
This is much better than using partial sort (which would cost O(n log(n/2)), worst case) or sort (which would cost O(n log n)).
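For readers without Eigen, here is a rough translation of the same idea to std::vector. It is a sketch, not the author's code: quantile is a name chosen here, and the clamping of the position to the range [1, n] is an added assumption so that very small or very large values of quant do not index out of range.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Quantile following the same scheme as the Eigen code: the fractional
// 1-based position is n*quant + 0.5, with linear interpolation between the
// two neighbouring order statistics. Assumes x has at least one element.
double quantile(std::vector<double> x, double quant) {
    const double n = static_cast<double>(x.size());
    // Assumption added here: clamp the position into [1, n].
    const double pos = std::min(std::max(n * quant + 0.5, 1.0), n);
    const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
    const std::size_t hi = static_cast<std::size_t>(std::ceil(pos));

    // Put the lo-th smallest value at index lo-1; everything after it is >= it.
    std::nth_element(x.begin(), x.begin() + static_cast<std::ptrdiff_t>(lo - 1), x.end());
    const double lq = x[lo - 1];
    if (lo == hi) return lq;

    // The next order statistic is the minimum of the remaining tail.
    const double uq = *std::min_element(x.begin() + static_cast<std::ptrdiff_t>(lo), x.end());
    const double w = static_cast<double>(hi) - pos;  // weight of the lower value
    return lq * w + uq * (1.0 - w);
}
```

As in the original, the input is taken by value so the caller's vector is preserved, and the work is one nth_element call plus at most one linear scan.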