Sorting dates (string) stored in vector - c++

I have a vector that contains dates in this format, DDMMMYY, e.g 12Jan14
vector<string> tempDate
When I use the STL sort, basically it will work when all of the dates are in the same year/month. 01Jan14, 05Jan14, 11Jan14, etc. But when I enter 02Feb14, it will all be messed up.
sort (tempDate.begin(), tempDate.end())
Can anyone tell me what went wrong with it?
Edit: I tried to use templates, but it doesn't work as all the string got 'split' up.
I did some troubleshooting inside the sortDayMonthYear
template <class T>
bool sortDayMonthYear(T a, T b)
{
for (int i =0; i < a.size(); i++)
cout << a.at(i)<< endl;
}
What i get is something like this
0
1
J
A
N
1
4
0
2
J
A
N
1
4
Actually what I intend to do is to pass the two strings into the template function, using substr to get the day, month, year, compare and return true or false.

Your predicate should take two constant string references as argument, and then use those to decide which one is "lesser" than the other.
So:
bool sortDayMonthYear(const std::string& first, const std::string& second)
{
// Algorithm to find out which string is "lesser" than the other
}
Then you use that comparator function when sorting:
std::sort(std::begin(tempDate), std::end(tempDate), sortDayMonthYear);

The standard way of sorting strings is lexicographical (by alphabet, for each letter, just like in a dictionary).
Therefore, what you need is your own comparator. There is a version of sort which takes one as third parameter. See e.g. this question for starting ideas: Sorting a vector of custom objects

Related

Iterating through one variable in a vector of struct with lower/upper bound

I have 2 structs, one simply has 2 values:
struct combo {
int output;
int input;
};
And another that sorts the input element based on the index of the output element:
struct organize {
bool operator()(combo const &a, combo const &b)
{
return a.input < b.input;
}
};
Using this:
sort(myVector.begin(), myVector.end(), organize());
What I'm trying to do with this, is iterate through the input varlable, and check if each element is equal to another input 'in'.
If it is equal, I want to insert the value at the same index it was found to be equal at for input, but from output into another temp vector.
I originally went with a more simple solution (when I wasn't using a structs and simply had 2 vectors, one input and one output) and had this in a function called copy:
for(int i = 0; i < input.size(); ++i){
if(input == in){
temp.push_back(output[i]);
}
}
Now this code did work exactly how I needed it, the only issue is it is simply too slow. It can handle 10 integer inputs, or 100 inputs but around 1000 it begins to slow down taking an extra 5 seconds or so, then at 10,000 it takes minutes, and you can forget about 100,000 or 1,000,000+ inputs.
So, I asked how to speed it up on here (just the function iterator) and somebody suggested sorting the input vector which I did, implemented their suggestion of using upper/lower bound, changing my iterator to this:
std::vector<int>::iterator it = input.begin();
auto lowerIt = std::lower_bound(input.begin(), input.end(), in);
auto upperIt = std::upper_bound(input.begin(), input.end(), in);
for (auto it = lowerIt; it != upperIt; ++it)
{
temp.push_back(output[it - input.begin()]);
}
And it worked, it made it much faster, I still would like it to be able to handle 1,000,000+ inputs in seconds but I'm not sure how to do that yet.
I then realized that I can't have the input vector sorted, what if the inputs are something like:
input.push_back(10);
input.push_back(-1);
output.push_back(1);
output.push_back(2);
Well then we have 10 in input corresponding to 1 in output, and -1 corresponding to 2. Obviously 10 doesn't come before -1 so sorting it smallest to largest doesn't really work here.
So I found a way to sort the input based on the output. So no matter how you organize input, the indexes match each other based on what order they were added.
My issue is, I have no clue how to iterate through just input with the same upper/lower bound iterator above. I can't seem to call upon just the input variable of myVector, I've tried something like:
std::vector<combo>::iterator it = myVector.input.begin();
But I get an error saying there is no member 'input'.
How can I iterate through just input so I can apply the upper/lower bound iterator to this new way with the structs?
Also I explained everything so everyone could get the best idea of what I have and what I'm trying to do, also maybe somebody could point me in a completely different direction that is fast enough to handle those millions of inputs. Keep in mind I'd prefer to stick with vectors because not doing so would involve me changing 2 other files to work with things that aren't vectors or lists.
Thank you!
I think that if you sort it in smallest to largest (x is an integer after all) that you should be able to use std::adjacent_find to find duplicates in the array, and process them properly. For the performance issues, you might consider using reserve to preallocate space for your large vector, so that your push back operations don't have to reallocate memory as often.

Sort vector of vectors with doubles according to a column in C++

I have a matrix consisting of a vector of which each element representing the rows is composed of a vector representing the columns of the matrix. I would like to sort the rows according to the 1st column.
Each element inside this matrix is a double, although the first column contains a number that serves as an identifier (but is not unique).
My goal is to have something like the aggregate functions available in SQL, such as count() and sum() when I group by the first column.
For instance, if I have:
ID VALUE
1 10
2 20
1 30
2 40
3 60
I would like to get:
ID COUNT MEAN
1 2 20
2 2 30
3 1 60
However, I am stuck in the very first step: how do I sort the rows according to the value of the first element of each row?
I found a clue on this topic, and changed adapted the comparator to:
bool compareFunction (double i,double j)
{
return (i<j);
}
But the compiler was not very happy about that (making a reference to the stl_algo.h file):
error: cannot convert 'std::vector<double>' to 'double' in argument passing
I was therefore wondering if there is a way to sort such a vector of vectors when it contains doubles.
Answer (imho): use a different datastructure. What you are trying to do is setup a multimap. Oh hey look:
http://www.cplusplus.com/reference/map/multimap/
stl::multimap - how do i get groups of data?
It'll be faster for large numbers of elements. And is actually a map rather than a vector of vector of double.
Either that, or skip the sorting all together, and count by key using std::map, std::unordered_map, or (if you know the number of keys and/or the keys are offset by 1 with no breaks) std::vector.
To expand, sorting your list to get means will be slow. Sorting (using std::sort) is O(nlogn), and will be O(nlogn) every time you compute this mean. And it is an unessisary step: your stuff is grouped by key reguardless of order. std::map and std::multimap will "sort as you go" which will be just a little faster than sorting every time, but you won't have to sort the whole thing to get the list. Then you can just iterate the multimap to get the mean, O(n) each mean calculation. (It is still O(nlg(n)) to add all the elements to the multimap)
But if you know the key output is going to be 1,2,3...n-1,n, than sorting is a complete waste of time. Just make a counter for each key (since you know what the keys can be) and add to the key while iterating the array.
BUT WAIT THERE IS MORE
If the keys are actually setup the way you are thinking, than the best way from the get go is to forget the table structure, and make build it like this:
Index VALUE
0 10,30
1 20,40
2 60
Count is now constant time for each row. Mean for each row is O(n). Getting a list is constant time for each row. EVERYBODY WINS.
You need to create a comparator function comparing vector<double>:
struct VecComp {
bool operator()(const vector<double>& _a, const vector<double>& _b) {
//compare first elements
}
}
Then you can use std::sort on your structure with the new comparator function:
std::sort(myMat.begin(), myMat.end(), VecComp());
If you are using c++11 features you can also utilize lambda functions here:
std::sort(myMat.begin(), myMat.end(), [](const vector<double>& a, const vector<double>& b) {
//compare the first elements
}
);
You need to write your own comparator functor to pass into your vector declaration:
struct comp {
bool operator() (const std::vector<double>& i,
const std::vector<double>& j) {
return i[0] < j[0];
}
Have you tried just this?:
std::sort(vecOfVecs.begin(), vecOfVecs.end());
That should work as std::vector has operator< which provides lexicographical sorting, which is (a little more specific than) what you want.

C++: Time complexity of using STL's sort in order to sort a 2d array of integers on different columns

let's say we have the following 2d array of integers:
1 3 3 1
1 0 2 2
2 0 3 1
1 1 1 0
2 1 1 3
I was trying to create an implementation where the user could give as input the array itself and a string. An example of a string in the above example would be 03 which would mean that the user wants to sort the array based on the first and the fourth column.
So in this case the result of the sorting would be the following:
1 1 1 0
1 3 3 1
1 0 2 2
2 0 3 1
2 1 1 3
I didn't know a lot about the compare functions that are being used inside the STL's sort function, however after searching I created the following simple implementation:
I created a class called Comparator.h
class Comparator{
private:
std::string attr;
public:
Comparator(std::string attr) { this->attr = attr; }
bool operator()(const int* first, const int* second){
std::vector<int> left;
std::vector<int> right;
size_t i;
for(i=0;i<attr.size();i++){
left.push_back(first[attr.at(i) - '0']);
right.push_back(second[attr.at(i) - '0']);
}
for(i=0;i<left.size();i++){
if(left[i] < right[i]) return true;
else if(left[i] > right[i]) return false;
}
return false;
}
};
I need to know the information inside the string so I need to have a class where this string is a private variable. Inside the operator I would have two parameters first and second, each of which will refer to a row. Now having this information I create a left and a right vector where in the left vector I have only the numbers of the first row that are important to the sorting and are specified by the string variable and in the right vector I have only the numbers of the second row that are important to the sorting and are specified by the string variable.
Then I do the needed comparisons and return true or false. The user can use this class by calling this function inside the Sorting.cpp class:
void Sorting::applySort(int **data, std::string attr, int amountOfRows){
std::sort(data, data+amountOfRows, Comparator(attr));
}
Here is an example use:
int main(void){
//create a data[][] variable and fill it with integers
Sorting sort;
sort.applySort(data, "03", number_of_rows);
}
I have two questions:
First question
Can my implementation get better? I use extra variables like the left and right vectors, and then I have some for loops which brings some extra costing to the sorting operation.
Second question
Due to the extra cost, how much worse does the time complexity of the sorting become? I know that STL's sort is O(n*logn) where n is the number of integers that you want to sort. Here n has a different meaning, n is the number of rows and each row can have up to m integers which in turn can be found inside the Comparator class by overriding the operator function and using extra variables(the vectors) and for loops.
Because I'm not sure how exactly is STL's sort implemented I can only make some estimates.
My initial estimate would be O(n*m*log(n)) where m is the number of columns that are important to the sorting however I'm not 100% certain about it.
Thank you in advance
You can certainly improve your comparator. There's no need to copy the columns and then compare them. Instead of the two push_back calls, just compare the values and either return true, return false, or continue the loop according to whether they're less, greater, or equal.
The relevant part of the complexity of sort is O(n * log n) comparisons (in C++11. C++03 doesn't give quite such a good guarantee), where n is the number of elements being sorted. So provided your comparator is O(m), your estimate is OK to sort the n rows. Since attr.size() <= m, you're right.
First question: you don't need left and rigth - you add elements one by one and then iterate over the vectors in the same order. So instead of pushing values to vectors and then iterating over them, simply use the values as you generate them in the first cycle like so:
for(i=0;i<attr.size();i++){
int left = first[attr.at(i) - '0'];
int right = second[attr.at(i) - '0'];
if(left < right) return true;
else if(left > right) return false;
}
Second question: can the time complexity be improved? Not with sorting algorithm that uses direct comparison. On the other had the problem you solve here is somewhat similar to radix sort. And so I believe you should be able to do the sorting in O(n*m) where m is the number of sorting criteria.
1) Firstly to start off you should convert the string into an integer array in the constructor. With validation of values being less than the number of columns.
(You could also have another constructor that takes an integer array as a parameter.
A slight enhancement is to allow negative values to indicate that the order of the sort is reversed for that column. In this case the values would be -N..-1 , 1..N)
2) There is no need for the intermediate left, right arrays.

Sorting a two-dimensional array of characters? C++

I'm trying to sort a 10X15 array of characters, where each row is a word. My goal is to sort it in a descending order, from the largest value word at the top, at array[row 0][column 0 through 14] position, and the smallest value word at the bottom array[row 9][column 0 through 14]. Each row is a word (yeah, they don't look as words, but it's to test the sorting capability of the program).
To clarify: What I need to do is this... considering that EACH row is a whole word, I need to sort the rows from the highest value word being at the top, and the lowest value word being at the bottom.
Edit:
Everything works now. For anyone who has a similar question, look to the comments below, there are several fantastic solutions, I just went with the one where I create my own sort function to learn more about sorting. And thanks to all of you for helping me! :)
You are using c++ so quit using arrays and begin with stl types:
Convert each row into a string:
string tempString
for (int i = 0; i < rowSize; ++i) {
tempString.pushBack(array[foreachrow][i])
}
add them to a vector
std::vector<std::string> sorter;
sorter.push_back(tempString);
Do that for each row.
std::vector<std::string> sorter;
for each row {
for each coloumn {
the string thing
}
push back the string
}
Then sort the vector with std::sort and write the vector back into the array (if you have to but don't because arrays suck)
As usual, you need qsort:
void qsort( const void *ptr, size_t count, size_t size,
int (*comp)(const void *, const void *) );
That takes a void pointer to your starting address, the number of elements to sort, the size of each element, and a comparison function.
You would call it like this:
qsort( array, ROWS, COLS, compare_word );
Where you define compare_word to sort in reverse:
int compare_word( const void* a, const void* b )
{
return strncmp( b, a, COLS );
}
Now, given that each word is 15 characters long, there may be padding to deal with. I don't have absolute knowledge that the array will be packed as 10 by 15 instead of 10 by 16. But if you suspect so, you could pass (&array[1][0] - &array[0][0]) as the element size instead of COLS.
If you are not allowed to use qsort and instead must write your own sorting algorithm, do something simple like selection sort. You can use strncmp to test the strings. Look up the function (google makes it easy, or if you use Linux, man 3 strncmp). To swap the characters, you could use a temporary char array of length COLS and then 3 calls to memcpy to swap the words.
The problem with your new code using string and vector is a simple typo:
sorter[count] = array[count+1]; should be sorter[count] = sorter[count+1];

Fastest way to take 2D input and sort it simultaneously Row wise

I need a way in Which I can take the Input in a 2d array and sort it row wise in one of the fastest way . I tried taking Input and Sort it simultaneously using Insertion Sort. The Second thing I used is i took a multimap individually for a row and inserted with key value as the value i want and mapped value relates to that key as some Dummy value . Since map sorts key while Inserting It could be the one way I thought .
The below code is for making sure that 1 row in my 2D has its element sorted in
multimap. Basically you can say that I dont want to use a 2D structure at all as I
will use these rows individually one by one and hence can be considered as 1D array.
I also want they they gets rearranged While reading the Input , so i dont have to
extra opeartions for doing them.
for(long int j=1;j<=number_in_group;j++)
{
cin >> arrival_time;
arrival_map.insert(pair<long int, long int>(arrival_time,1));
}
Try an STL std::priority_queue? The output is guaranteed to be sorted, and if you polarize the inputs to be 2-D objects (that contain a row number for example) you're queue will build literally perfectly. At that point simply slurp the number off the queue in batches of 'n' where 'n' is your row size and each one will be sorted correctly. You will need a element type that encodes both the value AND the row in your priority queue, and sorts biased to the row # first, then then value. Your example uses long int as the data type for your values. Assuming your rows are no larger than the size of a system unsigned int:
class Element
{
public:
Element(unsigned int row, long int val)
: myrow(row), myval(val)
{};
bool operator <(const Element& elem)
{
return (myrow < elem.myrow ||
(myrow == elem.myrow && myval < elem.myvel);
}
unsigned int myrow;
long int myval;
};
typedef std::priority_queue<Element> MyQueue;
Note: this takes advantage of the priority queue's default comparison operator invoking std::less<>, which simply compares the items using the item-defined operator <(). Once you have this simply push your matrix into the queue, incrementing the row index as you switch to the next row.
MyQueue mq;
mq.push_back(Element(1,100));
mq.push_back(Element(1,99));
mq.push_back(Element(2,100));
mq.push_back(Element(2,101));
Popping the queue when finished will result in the following sequence:
99
100
100
101
Which I hope is what you want. Finally, please forgive the syntax errors and/or missing junk, as I just blasted this on the fly and have no compiler to check it against. Gotta love web cafes.