What I am trying to accomplish is to store a polynomial of unknown size using arrays.
What I have seen over the internet is using an array where each cell contains the coefficient and the cell index is the degree, but that is not efficient, because what if we have a polynomial like 6x^14 + x + 5? That would mean zeros in all the cells from 1 to 13. I've already looked at some solutions with vectors and linked lists, but is there any other way to effectively tackle this problem, without the use of std::vector or std::list?
Unless there is a compelling reason to act otherwise (e.g. this is a programming assignment where you are required to use C-style arrays), you should use a std::vector from the standard library. Libraries are there for a reason: to make your life easier. The overhead is probably insignificant in the context of your program.
You mention that storing a polynomial (such as 4*x^5 + x - 1) in an std::vector with the indices representing the power (such as [-1, 1, 0, 0, 0, 4]) is inefficient. This is true, but unless you are storing polynomials of degree greater than 1000, this waste is entirely insignificant. For "sparse" polynomials, of high degree but with few coefficients, you could consider using a vector of pairs, with the first value of each pair storing the power and the second value storing the coefficient.
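A minimal sketch of that vector-of-pairs representation (the evaluate helper and names are purely illustrative):
#include <utility>
#include <vector>

// 6x^14 + x + 5 stored as (power, coefficient) pairs: only non-zero terms.
std::vector<std::pair<int, int>> poly = { {0, 5}, {1, 1}, {14, 6} };

// Illustrative helper: evaluate the polynomial at x by walking the stored terms.
long long evaluate(const std::vector<std::pair<int, int>>& p, long long x) {
    long long sum = 0;
    for (const auto& term : p) {
        long long pw = 1;
        for (int k = 0; k < term.first; ++k) pw *= x;
        sum += term.second * pw;
    }
    return sum;
}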
A sparse polynomial can be represented with a map, where a zero coefficient is represented by a nonexistent key. Here is an example of such a class:
#include <map>

// Example of a sparse integer polynomial: only non-zero terms are stored.
class SparsePolynomial {
    std::map<int, int> coeff;   // degree -> coefficient
public:
    // int& operator[](int degree);   // possible, but see the discussion below
    int get(int degree);              // read a coefficient (0 if the term is absent)
    void update(int degree, int val); // write a coefficient (erases the term if val == 0)
};
Whenever you try to get or update the coefficient of a term, its existence in the map is checked. Every time the coefficient of a term is updated, it is checked whether the value is zero. Hence, the size of the map can always be kept minimal.
We can replace these two methods with operator[]. However, in that case, we would not be able to check for zero during an update operation, thus the storage would not be as efficient as using two separate methods for access and update.
int SparsePolynomial::get(int degree) {
    std::map<int, int>::iterator it = coeff.find(degree);
    if (it == coeff.end()) {
        return 0;          // absent term: the coefficient is zero
    } else {
        return it->second;
    }
}

void SparsePolynomial::update(int degree, int val) {
    if (val == 0) {
        // Setting a coefficient to zero removes the term entirely,
        // keeping the map as small as possible.
        std::map<int, int>::iterator it = coeff.find(degree);
        if (it != coeff.end()) {
            coeff.erase(it);
        }
    } else {
        coeff[degree] = val;
    }
}
While this method gives us a more efficient storage, it requires more time for access and update than vector does. However, in the case of a sparse polynomial, the difference can be small. Given a std::map of size N, the average search complexity of an element is O(log N). Suppose you have a sparse polynomial with degree d and number of non-zero coefficients N. If N is much smaller than d, then the access and update time would be small enough not to notice.
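As a rough usage sketch (assuming the class above with public get and update), the 6x^14 + x + 5 polynomial from the question occupies only three map entries:
SparsePolynomial p;
p.update(14, 6);   // 6x^14
p.update(1, 1);    // x
p.update(0, 5);    // 5

int c = p.get(14); // 6
int z = p.get(7);  // 0: degree 7 is simply not stored
p.update(14, 0);   // erases the x^14 term; the map shrinks to two entries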
I have a matrix stored as a vector of vectors: each element of the outer vector is a row, and each inner vector holds that row's columns. I would like to sort the rows according to the first column.
Each element inside this matrix is a double, although the first column contains a number that serves as an identifier (but is not unique).
My goal is to have something like the aggregate functions available in SQL, such as count() and sum() when I group by the first column.
For instance, if I have:
ID VALUE
1 10
2 20
1 30
2 40
3 60
I would like to get:
ID COUNT MEAN
1 2 20
2 2 30
3 1 60
However, I am stuck in the very first step: how do I sort the rows according to the value of the first element of each row?
I found a clue on this topic, and adapted the comparator to:
bool compareFunction (double i,double j)
{
return (i<j);
}
But the compiler was not very happy about that (making a reference to the stl_algo.h file):
error: cannot convert 'std::vector<double>' to 'double' in argument passing
I was therefore wondering if there is a way to sort such a vector of vectors when it contains doubles.
Answer (IMHO): use a different data structure. What you are trying to do is set up a multimap. Oh hey, look:
http://www.cplusplus.com/reference/map/multimap/
stl::multimap - how do i get groups of data?
It'll be faster for large numbers of elements. And it is actually a map rather than a vector of vectors of double.
Either that, or skip the sorting altogether and count by key using std::map, std::unordered_map, or (if you know the number of keys and/or the keys are offset by 1 with no breaks) std::vector.
To expand: sorting your list to get the means will be slow. Sorting (using std::sort) is O(n log n), and it will be O(n log n) every time you compute the means. And it is an unnecessary step: your data is grouped by key regardless of order. std::map and std::multimap will "sort as you go", which is only a little faster than sorting every time, but you won't have to sort the whole thing to get the list. Then you can just iterate the multimap to get the means, O(n) for the whole pass. (It is still O(n log n) to add all the elements to the multimap.)
But if you know the key output is going to be 1, 2, 3, ..., n-1, n, then sorting is a complete waste of time. Just make a counter for each key (since you know what the keys can be) and add to it while iterating the array.
BUT WAIT THERE IS MORE
If the keys are actually set up the way you are thinking, then the best way from the get-go is to forget the table structure and build it like this:
Index VALUE
0 10,30
1 20,40
2 60
Count is now constant time for each row. Mean for each row is O(n). Getting a list is constant time for each row. EVERYBODY WINS.
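A minimal sketch of the group-by-key idea (container choice and names are illustrative, not the asker's actual code):
#include <iostream>
#include <map>
#include <vector>

int main() {
    // Each row is {id, value}, as in the question's example table.
    std::vector<std::vector<double>> rows = {
        {1, 10}, {2, 20}, {1, 30}, {2, 40}, {3, 60}
    };

    // Group the values by id; no sorting of the original rows is needed.
    std::map<int, std::vector<double>> groups;
    for (const auto& row : rows)
        groups[static_cast<int>(row[0])].push_back(row[1]);

    // Print ID, COUNT, MEAN per group.
    for (const auto& g : groups) {
        double sum = 0;
        for (double v : g.second) sum += v;
        std::cout << g.first << " " << g.second.size() << " "
                  << sum / g.second.size() << "\n";
    }
}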
You need to create a comparator function comparing vector<double>:
struct VecComp {
    bool operator()(const vector<double>& _a, const vector<double>& _b) const {
        // compare the first elements of each row
        return _a[0] < _b[0];
    }
};
Then you can use std::sort on your structure with the new comparator function:
std::sort(myMat.begin(), myMat.end(), VecComp());
If you are using C++11 features you can also utilize lambda functions here:
std::sort(myMat.begin(), myMat.end(),
          [](const vector<double>& a, const vector<double>& b) {
              // compare the first elements of each row
              return a[0] < b[0];
          });
You need to write your own comparator functor to pass to std::sort:
struct comp {
    bool operator() (const std::vector<double>& i,
                     const std::vector<double>& j) const {
        return i[0] < j[0];
    }
};
Have you tried just this?:
std::sort(vecOfVecs.begin(), vecOfVecs.end());
That should work as std::vector has operator< which provides lexicographical sorting, which is (a little more specific than) what you want.
I have a data structure like this:
struct X {
float value;
int id;
};
a vector of those (size N, think 100000), sorted by value, which stays constant during the execution of the program:
std::vector<X> values;
Now, I want to write a function
void subvector(std::vector<X> const& values,
std::vector<int> const& ids,
std::vector<X>& out /*,
helper data here */);
that fills the out parameter with a sorted subset of values, given by the passed ids (size M < N, about 0.8 times N), fast (memory is not an issue, and this will be done repeatedly, so building lookup tables (the helper data from the function parameters) or something else that is done only once is entirely ok).
My solution so far:
Build lookuptable lut containing id -> offset in values (preparation, so constant runtime)
create std::vector<X> tmp, size N, filled with invalid ids (linear in N)
for each id, copy values[lut[id]] to tmp[lut[id]] (linear in M)
loop over tmp, copying items to out (linear in N)
this is linear in N (as it's bigger than M), but the temporary variable and the repeated copying bug me. Is there a way to do it quicker than this? Note that M will be close to N, so approaches that are O(M log N) are unfavourable.
Edit: http://ideone.com/xR8Vp is a sample implementation of the algorithm described above, to make the desired output clear and prove that it's doable in linear time; the question is about the possibility of avoiding the temporary variable or speeding it up in some other way, and something that is not linear is not faster :).
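For readers who don't want to follow the link, here is a rough sketch of the approach described above (this is not the linked ideone code; the sentinel handling and names are illustrative):
#include <vector>

struct X { float value; int id; };

// Preparation, done once: id -> offset in the sorted 'values' vector.
std::vector<int> build_lut(const std::vector<X>& values, int max_id) {
    std::vector<int> lut(max_id + 1, -1);
    for (int i = 0; i < (int)values.size(); ++i)
        lut[values[i].id] = i;
    return lut;
}

void subvector(const std::vector<X>& values, const std::vector<int>& ids,
               std::vector<X>& out, const std::vector<int>& lut) {
    const int INVALID = -1;
    std::vector<X> tmp(values.size(), X{0.0f, INVALID});  // sentinel entries
    for (int id : ids)                                     // linear in M
        tmp[lut[id]] = values[lut[id]];
    out.clear();
    for (const X& x : tmp)                                 // linear in N
        if (x.id != INVALID)
            out.push_back(x);
}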
An alternative approach you could try is to use a hash table instead of a vector to look up ids in:
#include <unordered_set>
#include <vector>

void subvector(std::vector<X> const& values,
               std::unordered_set<int> const& ids,
               std::vector<X>& out) {
    out.clear();
    out.reserve(ids.size());
    // 'values' is already sorted by value, so copying in order keeps 'out' sorted too.
    for (std::vector<X>::const_iterator i = values.begin(); i != values.end(); ++i) {
        if (ids.find(i->id) != ids.end()) {
            out.push_back(*i);
        }
    }
}
This runs in linear time since unordered_set::find is constant expected time (assuming that we have no problems hashing ints). However, I suspect it might not be as fast in practice as the approach you described initially using vectors.
Since your vector is sorted, and you want a subset of it sorted the same way, I assume we can just slice out the chunk you want without rearranging it.
Why not just use find_if() twice? Once to find the start of the range you want and once to find the end of the range. This will give you the start and end iterators of the sub-vector. Construct a new vector using those iterators; one of the vector constructor overloads takes two iterators.
That or the partition algorithm should work.
If I understood your problem correctly, you are actually trying to create a linear-time sorting algorithm (subject to the input size of numbers, M).
That is NOT possible.
Your current approach is to have a sorted list of possible values.
This takes time linear in the number of possible values, N (theoretically, given that the map lookup takes O(1) time).
The best you could do is to sort the values you found in the map with a fast sorting method (O(M log M), e.g. quicksort, mergesort, etc.) for small values of M, and maybe do that linear scan for bigger values of M.
For example, if N is 100000 and M is 100 it is much faster to just use a sorting algorithm.
I hope you can understand what I say. If you still have questions I will try to answer them :)
edit: (comment)
I will further explain what I mean.
Say you know that your numbers will range from 1 to 100.
You have them sorted somewhere (actually they are "naturally" sorted) and you want to get a subset of them in sorted form.
If it were possible to do it faster than O(N) or O(M log M), sorting algorithms would just use this method to sort.
E.g. given the set of numbers {5,10,3,8,9,1,7}, knowing that they are a subset of the sorted set of numbers {1,2,3,4,5,6,7,8,9,10}, you still can't sort them faster than O(N) (N = 10) or O(M log M) (M = 7).
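A rough, self-contained sketch of that gather-then-sort idea for small M (the id -> offset table plays the role of the map mentioned above; all names are illustrative):
#include <algorithm>
#include <unordered_map>
#include <vector>

struct X { float value; int id; };

void subvector_sorted(const std::vector<X>& values,
                      const std::vector<int>& ids,
                      const std::unordered_map<int, std::size_t>& lut,
                      std::vector<X>& out) {
    out.clear();
    out.reserve(ids.size());
    for (int id : ids)                      // O(M) expected-time lookups
        out.push_back(values[lut.at(id)]);
    // O(M log M): cheaper than an O(N) pass only when M is much smaller than N.
    std::sort(out.begin(), out.end(),
              [](const X& a, const X& b) { return a.value < b.value; });
}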
I have an unsorted vector of eigenvalues and a related matrix of eigenvectors. I'd like to sort the columns of the matrix with respect to the sorted set of eigenvalues. (e.g., if eigenvalue[3] moves to eigenvalue[2], I want column 3 of the eigenvector matrix to move over to column 2.)
I know I can sort the eigenvalues in O(N log N) via std::sort. Without rolling my own sorting algorithm, how do I make sure the matrix's columns (the associated eigenvectors) follow along with their eigenvalues as the latter are sorted?
Typically just create a structure something like this:
struct eigen {
    double value;     // the eigenvalue
    double *vector;   // pointer to the associated eigenvector (column)
    bool operator<(eigen const &other) const {
        return value < other.value;
    }
};
Alternatively, just put the eigenvalue/eigenvector into an std::pair -- though I'd prefer eigen.value and eigen.vector over something.first and something.second.
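A minimal usage sketch for the struct approach (the eigen struct is the one defined above; the eigenvalues and eigenvectors here are purely illustrative):
#include <algorithm>
#include <vector>

// Dummy data: three eigenvalues with their (length-2) eigenvectors.
double v0[] = {0.1, 0.2}, v1[] = {0.3, 0.4}, v2[] = {0.5, 0.6};

int main() {
    std::vector<eigen> pairs = { {3.0, v0}, {1.0, v1}, {2.0, v2} };
    std::sort(pairs.begin(), pairs.end());   // uses eigen::operator<
    // pairs[0] now holds eigenvalue 1.0 together with its eigenvector v1;
    // each eigenvector travels with its eigenvalue automatically.
}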
I've done this a number of times in different situations. Rather than sorting the array, just create a new array that has the sorted indices in it.
For example, you have a length-n array (vector) evals, and a 2D n x n array evects. Create a new array index that contains the values [0, n-1].
Then rather than accessing evals as evals[i], you access it as evals[index[i]], and instead of evects[i][j], you access it as evects[index[i]][j].
Now you write your sort routine to sort the index array rather than the evals array, so instead of index looking like {0, 1, 2, ... , n-1}, the value in the index array will be in increasing order of the values in the evals array.
So after sorting, if you do this:
for (int i = 0; i < n; ++i)
{
    cout << evals[index[i]] << endl;
}
you'll get a sorted list of evals.
This way you can sort anything that's associated with that evals array without actually moving memory around. This is important when n gets large; you don't want to be moving around the columns of the evects matrix.
Basically, the i-th smallest eval will be located at position index[i], and that corresponds to the index[i]-th evect.
Edited to add. Here's a sort function that I've written to work with std::sort to do what I just said:
template <class DataType, class IndexType>
class SortIndicesInc
{
protected:
    DataType* mData;
public:
    SortIndicesInc(DataType* Data) : mData(Data) {}
    bool operator()(const IndexType& i, const IndexType& j) const
    {
        return mData[i] < mData[j];
    }
};
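A rough usage sketch for the functor above (the data is only illustrative):
#include <algorithm>
#include <vector>

int main() {
    double evals[] = {4.0, 1.0, 3.0, 2.0};
    const int n = 4;

    std::vector<int> index(n);
    for (int i = 0; i < n; ++i) index[i] = i;

    // Sort the indices, not the eigenvalues themselves.
    std::sort(index.begin(), index.end(), SortIndicesInc<double, int>(evals));

    // index[0] == 1, so evals[index[0]] == 1.0 is the smallest eigenvalue,
    // and column index[0] of the evects matrix would be its eigenvector.
}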
The solution purely relies on the way you store your eigenvector matrix.
The best performance while sorting will be achieved if you can implement swap(evector1, evector2) so that it only rebinds the pointers and the real data is left unchanged.
This could be done using something like double* or probably something more complicated, depends on your matrix implementation.
If done this way, swap(...) wouldn't affect your sorting operation performance.
The idea of conglomerating your vector and matrix is probably the best way to do it in C++. I am thinking about how I would do it in R and seeing if that can be translated to C++. In R it's very easy, simply evec<-evec[,order(eval)]. Unfortunately, I don't know of any built-in way to perform the order() operation in C++. Perhaps someone else does, in which case this could be done in a similar way.
I have an integral position-based algorithm. (That is, the output of the algorithm is based on a curvilinear position, and each result is influenced by the values of the previous results).
To avoid recalculating each time, I would like to pre-calculate at a given sample rate, and subsequently perform a lookup and either return a pre-calculated result (if I land directly on one), or interpolate between two adjacent results.
This would be trivial for me in F# or C#, but my C++ is very rusty, (and wasn't even ever that good).
Is map the right construct to use? And could you be so kind as to give me an example of how I'd perform the lookup? (I'm thinking of precalculating in millimetres, which means the key could be an int and the value would be a double.)
UPDATE OK, maybe what I need is a sorted dictionary. (Rolls up sleeves), pseudocode:
//Initialisation
fun MyFunction(int position, double previousresult) returns double {/*etc*/};
double lastresult = 0.0;
for(int s = startposition to endposition by sampledist)
{
lastresult = MyFunction(s, lastresult);
MapOrWhatever.Add(s, lastresult);
}
//Using for lookup
fun GetValueAtPosition(int position) returns double
{
CheckPositionIsInRangeElseException(position);
if(MapOrWhatever.ContainsKey(position))
return MapOrWhatever[position];
else
{
int i = 0;
//or possibly something clever with position % sampledist...
while(MapOrWhatever.Keys[i] < position) i+=sampledist;
return Interpolate(MapOrWhatever, i, i+sampledist, position);
}
}
Thinks... maybe if I keep a constant sampledist, I could just use an array and index it...
A std::map sounds reasonable for memoization here, provided your keys are not guaranteed to be contiguous.
#include <map>
// ...
std::map<int, double> memo;
memo.insert(std::make_pair(5, 0.5));
double x = memo[5]; // x == 0.5
If you consider a map, always consider a vector, too. For values that aren't changed much (or even not at all) while the application is running, a pre-sorted std::vector<std::pair<Key, Value>> (with O(N) lookup) more often than not performs faster for lookups than a std::map<Key, Value> (with O(log N) lookup), despite all the theory.
You need to try and measure.
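A quick sketch of that flat vector-of-pairs alternative (the data is purely illustrative; this is one of the things you would measure against std::map):
#include <utility>
#include <vector>

// Pre-sorted (by key) flat storage: key = position, value = precalculated result.
std::vector<std::pair<int, double>> table = { {0, 0.0}, {10, 0.5}, {20, 0.9} };

double lookup(int key) {
    // Simple linear scan (the O(N) lookup mentioned above); since the vector is
    // sorted by key, std::lower_bound could be used instead for O(log N).
    for (std::size_t i = 0; i < table.size(); ++i)
        if (table[i].first == key)
            return table[i].second;
    return 0.0;  // key not stored; real code would interpolate or signal an error
}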
std::map is probably fine as long as speed is not too critical. If the speed of the lookup is critical you could try a vector as mentioned above where you go straight to the element you need (don't use a binary search since you can compute the index from the position). Something like:
std::vector<double> stored;

// store the values in the vector
double lastresult = 0.0;
for (int s = startposition; s <= endposition; s += sampledist)
{
    lastresult = MyFunction(s, lastresult);
    stored.push_back(lastresult);
}

// then to look up
double GetValueAtPosition(int position)
{
    int index = (position - startposition) / sampledist;
    double lower = stored[index];
    double upper = stored[index + 1];   // assumes position < endposition
    return interpolate(lower, upper, position);
}
Please see my comment, but here is the map documentation:
http://www.cplusplus.com/reference/stl/map/
An important note that another poster did not mention: if you use [] to search for a key that doesn't exist in the map, the map will create an object so that there's something there.
edit: see docs here for this info http://msdn.microsoft.com/en-us/library/fe72hft9%28VS.80%29.aspx
Instead, use find(), which returns an iterator. Then test this iterator against map.end(); if they are equal, there was no match.
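A tiny sketch of the difference (illustrative only):
#include <map>

std::map<int, double> memo;

void demo() {
    // operator[] inserts a default-constructed 0.0 for key 3 if it is absent.
    double a = memo[3];                            // memo now contains {3, 0.0}

    // find() does not modify the map.
    std::map<int, double>::iterator it = memo.find(7);
    if (it == memo.end()) {
        // key 7 is not present; memo is unchanged
    } else {
        double b = it->second;
    }
}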
Refer : http://www.cplusplus.com/reference/stl/map/
You can use std::map:
typedef std::map<int,const double> mapType;
Performance-wise, map::find has logarithmic complexity in the size of the container.
Beware of operator[] in map:
If x matches the key of an element in the container, the function returns a reference to its mapped value.
If x does not match the key of any element in the container, the function inserts a new element with that key and returns a reference to its mapped value. Notice that this always increases the map size by one, even if no mapped value is assigned to the element (the element is constructed using its default constructor).
hash_map (a non-standard predecessor of std::unordered_map) offers faster lookup than any of the other containers mentioned. But filling it takes a little more time than a map or vector, and it is not sorted. Lookup of any key takes constant (expected) time.
std::hash_map<int, double> memo;
memo.insert(std::make_pair(5, 0.5));
memo.insert(std::make_pair(7,0.8));
// ...
hash_map<int,double>::iterator cur = memo.find(5);
hash_map<int,double>::iterator prev = cur;
hash_map<int,double>::iterator next = cur;
++next;
--prev;
Interpolate the current value using the (*next).second and (*prev).second values.