Why isn't my std::set sorted? - c++

I have a class to store data that looks like this:
class DataLine
{
public:
std::string name;
boost::posix_time::time_duration time;
double x, y, z;
DataLine(std::string _name, boost::posix_time::time_duration _time, double _x,
double _y, double _z); //assign all these, not going to do it here
bool operator < (DataLine* dataLine) { return time < dataLine->time; }
};
Then I read in a bunch of data and .insert it into a std::set of the objects:
std::set<DataLine*> data;
data.insert( new DataLine(newname, newtime, newx, newy, newz) );
//...insert all data - IS OUT OF ORDER HERE
Then I run through my data and do stuff with it while appending new elements to the set.
boost::posix_time::time_duration machineTime(0,0,0);
for(std::set<DataLine*>::reverse_iterator it = data.rbegin(); it != data.rend(); ++it)
{
if(machineTime < (*it)->time)
{
machineTime = (*it)->time;
}
machineTime += processDataLine(*it); //do stuff with data, might add to append list below
for(std::vector<AppendList*>::iterator iter = appendList.begin(); iter != appendList.end(); ++iter)
{
data.insert( new DataLine( (*iter)->name, machineTime,
(*iter)->x, (*iter)->y, (*iter)->z) );
}
}
When I try to loop through the set of data both before and after inserting the elements all my data is out of order! Here are some times outputted when looped using
for(std::set<DataLine*>::iterator it = data.begin(); it != data.end(); ++it)
{
std::cout << std::endl << (*it)->time;
}
14:39:55.003001
14:39:55.003002
14:39:55.001000
14:39:59.122000
14:39:58.697000
14:39:57.576000
14:39:56.980000
Why aren't these times sorted in order?

It is sorted. It's sorted based on the data type you're storing in the set, which is a pointer to a DataLine. In other words, the set orders its elements by the addresses of your objects, which often matches creation order (but may not, depending on how the memory allocation functions work in your implementation). Your operator< is never called, because the set compares DataLine* values, not DataLine objects.
If you want to sort based on the DataLine type itself, don't use a pointer. Store the objects themselves.
You can see a similar effect from the following code which creates two sets. The first is a set of integer pointers, the second a set of actual integers:
#include <iostream>
#include <iomanip>
#include <set>
using namespace std;
int main (void) {
set<int*> ipset;
set<int> iset;
cout << "inserting: ";
for (int i = 0; i < 10; i++) {
int val = (i * 7) % 13;
cout << ' ' << setw(2) << val;
ipset.insert (new int (val));
iset.insert (val);
}
cout << '\n';
cout << "integer pointer set:";
for (set<int*>::iterator it = ipset.begin(); it != ipset.end(); ++it)
cout << ' ' << setw(2) << **it;
cout << '\n';
cout << "integer set: ";
for (set<int>::iterator it = iset.begin(); it != iset.end(); ++it)
cout << ' ' << setw(2) << *it;
cout << '\n';
cout << "integer pointer set pointers:\n";
for (set<int*>::iterator it = ipset.begin(); it != ipset.end(); ++it)
cout << " " << *it << '\n';
cout << '\n';
return 0;
}
When you run that code, you see something like:
inserting: 0 7 1 8 2 9 3 10 4 11
integer pointer set: 0 7 1 8 2 9 3 10 4 11
integer set: 0 1 2 3 4 7 8 9 10 11
integer pointer set pointers:
0x907c020
0x907c060
0x907c0a0
0x907c0e0
0x907c120
0x907c160
0x907c1a0
0x907c1e0
0x907c220
0x907c260
You can see the unordered way in which values are added to the two sets (first line) and the way the pointer set in this case matches the order of input (second line). That's because the addresses are what's being used for ordering as you can see by the fact that the final section shows the ordered addresses.
Although, as mentioned, it may not necessarily match the input order, since the memory arena may be somewhat fragmented (as one example).
The set containing the actual integers (as opposed to pointers to integers) is clearly sorted by the integer value itself (third line).

You need to define the member operator< as shown below and store objects in the std::set instead of raw pointers, because for raw pointers the default comparison criterion is the pointer value itself.
bool operator < (const DataLine &dataLine) const
{
return time < dataLine.time;
}
...
std::set<DataLine> data;
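If the container has to keep holding pointers, a comparator can be passed as the second template argument instead. This is only a minimal sketch: DataLine is simplified so that time is a plain int standing in for boost::posix_time::time_duration, and ByTime, DataSet and timesInOrder are hypothetical names. A std::multiset is used because a std::set ordered only by time would reject a second entry with an equal time.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for the question's class: time is an int here
// (an assumption) instead of boost::posix_time::time_duration.
struct DataLine {
    std::string name;
    int time;
};

// Hypothetical comparator: orders DataLine pointers by the time they
// point to, not by their addresses.
struct ByTime {
    bool operator()(const DataLine* a, const DataLine* b) const {
        return a->time < b->time;
    }
};

// multiset keeps entries that share the same time; a set would drop them.
typedef std::multiset<const DataLine*, ByTime> DataSet;

// Collects the times in iteration order so the ordering can be inspected.
std::vector<int> timesInOrder(const DataSet& data) {
    std::vector<int> out;
    for (DataSet::const_iterator it = data.begin(); it != data.end(); ++it)
        out.push_back((*it)->time);
    return out;
}
```

Inserting pointers to lines with times 30, 10, 20 and 10 and then calling timesInOrder yields 10, 10, 20, 30.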

Related

C++ finding uint8_t in vector<uint8_t>

I have the following simple code. I declare a vector and initialize it with one value, 21 in this case, and then try to find that value in the vector using find. I can see that the element 21 is in the vector, since I print it in the for loop. So why does the check on the iterator returned by find not seem to succeed?
vector<uint8_t> v = { 21 };
uint8_t valueToSearch = 21;
for (vector<uint8_t>::const_iterator i = v.begin(); i != v.end(); ++i){
cout << unsigned(*i) << ' ' << endl;
}
auto it = find(v.begin(), v.end(), valueToSearch);
if ( it != v.end() )
{
string m = "valueToSearch was found in the vector " + valueToSearch;
cout << m << endl;
}
Are you sure it doesn't work?
I just tried it:
#include<iostream> // std::cout
#include<vector>
#include <algorithm>
using namespace std;
int main()
{
vector<uint8_t> v = { 21 };
uint8_t valueToSearch = 21;
for (vector<uint8_t>::const_iterator i = v.begin(); i != v.end(); ++i){
cout << unsigned(*i) << ' ' << endl;
}
auto it = find(v.begin(), v.end(), valueToSearch);
if ( it != v.end() )
{// if we hit this condition, we found the element
string error = "valueToSearch was found in the vector ";
cout << error << int(valueToSearch) << endl;
}
return 0;
}
There are two small modifications, both in the last lines inside the "if".
First, you cannot append a number to a string literal with +:
string m = "valueToSearch was found in the vector " + valueToSearch;
This compiles, but it is pointer arithmetic: the literal decays to a const char*, and adding 21 moves that pointer 21 characters forward, so m starts in the middle of the message. That is why it looks as if the value was not found.
Second, cout does have an insertion operator (<<) for uint8_t, but uint8_t is normally an alias for unsigned char, so the value would be printed as a character rather than as a number. Convert it to an integer type first:
cout << error << int(valueToSearch) << endl;
With both changes the program prints:
21
valueToSearch was found in the vector 21
This shows that find is working correctly: it tells you that it found the number in the first position, which is why it != v.end() holds (end() is not a valid element, but it is a valid iterator that marks the end of your container).
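The difference between the two ways of building the message can be sketched as follows. The function names brokenMessage and fixedMessage are hypothetical, and std::to_string assumes C++11 or later.

```cpp
#include <cstdint>
#include <string>

// Pointer arithmetic: the literal decays to const char*, and adding the
// value moves that pointer forward before the std::string is built.
// Only safe while value does not exceed the literal's length (38 here).
std::string brokenMessage(std::uint8_t value) {
    const char* text = "valueToSearch was found in the vector ";
    return std::string(text + value);
}

// Converting the number to text first gives the intended message.
std::string fixedMessage(std::uint8_t value) {
    return "valueToSearch was found in the vector "
           + std::to_string(static_cast<unsigned>(value));
}
```

For the value 21, brokenMessage returns "nd in the vector " (the literal with its first 21 characters skipped), while fixedMessage returns the full sentence followed by 21.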

storing addresses of std::list elements; memory

I am creating a std::list of struct elements. Based on a certain criterion, I want to store the addresses of a few elements from the list (because those addresses don't change(?)) in a std::vector for quick access elsewhere. An example is given below.
#include <iostream>
#include <vector>
#include <list>
struct Astruct{
double x[2];
int rank;
};
int main(int argc, char *argv[]) {
std::list<Astruct> ants;
std::vector< Astruct* > ptr;
for (auto i = 0; i != 20; ++i) {
Astruct local;
local.x[0] = 1.1;
local.x[1] = 1.2;
local.rank = i;
// put in list
ants.push_back(local);
// store address of selected elements (even ranks here)
// rather than a temporary address, the permanent address from the list is needed
if(local.rank %2 == 0) ptr.push_back(&local);
}
// print the selected elements using addresses from the list
for(int num = 0; num != ptr.size(); num++){
Astruct *local;
local = ptr.at(num);
std::cout << " rank " << local->rank << "\n";
}
/*
// quick way to check whether certain address (eg 3rd element) exists in the std::vector
std::list<Astruct>::iterator it = ants.begin();
std::advance(it , 2);
for(int num = 0; num != ptr.size(); num++){
if(it == ptr.at(num)) std::cout << " exists in vector \n " ;
}
*/
// print memory in bytes for all variables
std::cout << " sizeof Astruct " << sizeof(Astruct) << "\n";
std::cout << " sizeof ants " << sizeof(ants) << "\n";
std::cout << " sizeof ptr " << sizeof(ptr) << "\n";
}
What's the way to access an address of a particular element from the list?
Is it efficient method to add elements to list? (in first for loop)
What is the quickest way to check whether certain address exists in the vector? (shown in comment block)
How to determine the memory size in bytes for different variables here? (end of the code)
Thanks.
What's the way to access an address of a particular element from the list?
address = &(*iterator);
Is it efficient method to add elements to list? (in first for loop)
The loop does add to the list (after the edit to the question), but all the addresses stored in the vector refer to the local variable, which disappears at the end of each iteration. The stored pointers therefore dangle, and dereferencing them later is undefined behaviour (very probably, though nothing is certain, all these addresses are the same). Take the address of the element inside the list instead, for example &ants.back() right after the push_back.
What is the quickest way to check whether certain address exists in the vector? (shown in comment block)
Usually std::find() from <algorithm> is suitable. Note that the commented-out code compares an iterator with a pointer; compare addresses instead, i.e. &(*it) == ptr.at(num).
How to determine the memory size in bytes for different variables here? (end of the code)
std::cout << " sizeof Astruct " << sizeof(Astruct) << "\n"; is OK
std::cout << " sizeof ants " << ants.size()*sizeof(Astruct) << "\n"; is an approximation since we don't know the overhead of the list and its nodes
std::cout << " sizeof ptr " << ptr.size()*sizeof(Astruct *) << "\n"; is an approximation since we don't know the overhead of the vector
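The points above can be put together in a short sketch. The helper names fillAndCollect and isStored are hypothetical. It relies on the fact that the address of a std::list element stays valid until that element is erased.

```cpp
#include <algorithm>
#include <list>
#include <vector>

struct Astruct {
    double x[2];
    int rank;
};

// Fills the list with `count` elements and records the addresses of the
// even-ranked ones. The address is taken from the list node (ants.back()),
// not from the loop-local temporary.
std::vector<Astruct*> fillAndCollect(std::list<Astruct>& ants, int count) {
    std::vector<Astruct*> ptr;
    for (int i = 0; i != count; ++i) {
        Astruct local;
        local.x[0] = 1.1;
        local.x[1] = 1.2;
        local.rank = i;
        ants.push_back(local);           // the list stores its own copy
        if (i % 2 == 0)
            ptr.push_back(&ants.back()); // address of the copy inside the list
    }
    return ptr;
}

// Linear search for an element's address among the stored pointers.
bool isStored(const std::vector<Astruct*>& ptr, const Astruct* address) {
    return std::find(ptr.begin(), ptr.end(), address) != ptr.end();
}
```

With an iterator it into ants, the check from the commented-out block becomes isStored(ptr, &*it). If the vector grows large and is searched often, keeping it sorted and using std::binary_search may be worth considering instead of the linear std::find.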

C++ Usage of set, iterator, find line where duplicate was found

The program adds different strings to a set. The iterator checks the set for a certain string; what I want to achieve is to get the line where the iterator finds that string. Is this possible with a set, or do I have to create a vector? The reason I use a set is that I also don't want duplicates in the end. It is a bit confusing, I know; I hope you'll understand.
Edit: if a duplicate is found, I want to get the line number of the original element already existing in the set.
#include <iostream>
#include <set>
#include <string>
#include <vector>
#include <atlstr.h>
#include <sstream>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
set<string> test;
set<string>::iterator it;
vector<int> crossproduct(9, 0);
for (int i = 0; i < 6; i++)
{
crossproduct[i] = i+1;
}
crossproduct[6] = 1;
crossproduct[7] = 2;
crossproduct[8] = 3;
for (int i = 0; i < 3; i++)
{
ostringstream cp; cp.precision(1); cp << fixed;
ostringstream cp1; cp1.precision(1); cp1 << fixed;
ostringstream cp2; cp2.precision(1); cp2 << fixed;
cp << crossproduct[i*3];
cp1 << crossproduct[i*3+1];
cp2 << crossproduct[i*3+2];
string cps(cp.str());
string cps1(cp1.str());
string cps2(cp2.str());
string cpstot = cps + " " + cps1 + " " + cps2;
cout << "cpstot: " << cpstot << endl;
it = test.find(cpstot);
if (it != test.end())
{
//Display here the line where "1 2 3" was found
cout << "i: " << i << endl;
}
test.insert(cpstot);
}
set<string>::iterator it2;
for (it2 = test.begin(); it2 != test.end(); ++it2)
{
cout << *it2 << endl;
}
cin.get();
return 0;
}
"Line number" is not very meaningful to a std::set<string>,
because as you add more strings to the set you may change the
order in which the existing strings are iterated through
(which is about as much of a "line number" as the std::set template
itself will give you).
Here's an alternative that may work better:
std::map<std::string, int> test;
The way you use this is you keep a "line counter" n somewhere.
Each time you need to put a new string cpstot in your set,
you have code like this:
std::map<std::string, int>::iterator it = test.find(cpstot);
if (it == test.end())
{
test[cpstot] = n;
// alternatively, test.insert(std::pair<std::string, int>(cpstot, n))
++n;
}
else
{
// this prints out the integer that was associated with cpstot in the map
std::cout << "i: " << it->second;
// Notice that we don't try to insert cpstot into the map in this case.
// It's already there, and we don't want to change its "line number",
// so there is nothing good we can accomplish by an insertion.
// It's a waste of effort to even try.
}
If you set n = 0 before you start putting any strings in test
(and don't mess with the value of n in any other way),
then you will end up with strings at "line numbers" 0, 1, 2, etc.
in test, and n will be the number of strings stored in test.
By the way, neither std::map<std::string, int>::iterator nor
std::set<std::string>::iterator is guaranteed to iterate through
the strings in the sequence in which they were first inserted.
Instead, what you'll get is the strings in whatever order the
template's comparison object puts the string values.
(By default you get them back in lexicographic order by character value,
that is, roughly "alphabetized".)
But when you store the original "line number" of each string in
std::map<std::string, int> test, when you are ready to
print out the list of strings you can copy the string-integer pairs
from test to a new object, std::map<int, std::string> output_sequence,
and now (assuming you do not override the default comparison object)
when you iterate through output_sequence you will get its
contents sorted by line number.
(You will then probably want to get the string
from the second field of the iterator.)
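The approach described above can be sketched as a small class. LineIndex, addOrFind and inInsertionOrder are hypothetical names, and returning -1 for a new string is an arbitrary convention chosen for this example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: records the first "line" on which each string was seen.
class LineIndex {
public:
    LineIndex() : next_(0) {}

    // Returns -1 if `s` is new (and stores it under the next line number),
    // otherwise the line number at which `s` was first added.
    int addOrFind(const std::string& s) {
        std::map<std::string, int>::iterator it = lines_.find(s);
        if (it != lines_.end())
            return it->second;
        lines_[s] = next_++;
        return -1;
    }

    // Strings in first-insertion order, rebuilt through a map keyed by line.
    std::vector<std::string> inInsertionOrder() const {
        std::map<int, std::string> bySequence;
        for (std::map<std::string, int>::const_iterator it = lines_.begin();
             it != lines_.end(); ++it)
            bySequence[it->second] = it->first;
        std::vector<std::string> out;
        for (std::map<int, std::string>::const_iterator it = bySequence.begin();
             it != bySequence.end(); ++it)
            out.push_back(it->second);
        return out;
    }

private:
    std::map<std::string, int> lines_;
    int next_;
};
```

In the question's loop, the find/insert pair would be replaced by a single call such as int line = index.addOrFind(cpstot); and a result other than -1 is the line of the earlier duplicate.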

A c++ program that stores the positions of each bit 1 in a binary sequence

I have written this code to store the position of each 1 bit in a binary sequence that the user enters. The output of the program is not what I want: for 10100 I get 0x7fff9109be00. Here is the code:
#include <iostream>
#include <bitset>
using namespace std;
int main()
{
bitset <5> inpSeq;
int x = 0;
int xorArray[x];
unsigned int i;
cout << "Enter a 5-bit sequence: \n";
cin >> inpSeq;
for ( i = 0; i < inpSeq.size(); i++)
{
if ( inpSeq[i] == 1 )
{
x = x+1;
xorArray[x] = i;
}
}
cout << xorArray << "\n";
}
Update for clarity: What I had in mind was that 'cout << xorArray' will print bit 1's positions.
cout << xorArray << "\n";
This does not print the elements of xorArray; it prints its address.
You must iterate ("loop over") it:
for (auto x : xorArray)
cout << x << ' ';
cout << '\n';
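Your other problem is that you're trying to use a variable-length array, which does not exist in standard C++ (some compilers accept it as an extension). Here its length is zero as well, so every write to it is out of bounds. Use a vector instead.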
Now it gives you your desired output:
#include <iostream>
#include <bitset>
#include <vector>
using namespace std;
int main()
{
bitset<5> inpSeq("10111");
std::vector<int> xorArray;
for (unsigned int i = 0; i < inpSeq.size(); i++) {
if (inpSeq[i] == 1)
xorArray.push_back(i);
}
for (auto x : xorArray)
cout << x << ' ';
cout << '\n';
}
If you're not using C++11 for whatever reason, you can perform that final loop the traditional way:
for (std::vector<int>::const_iterator it = xorArray.begin(),
end = xorArray.end();
it != end; ++it) {
cout << *it << ' ';
}
Or the naive way:
for (unsigned int i = 0; i < xorArray.size(); i++)
cout << xorArray[i] << ' ';
I am a little unclear on exactly what you are trying to achieve, but I think the following might help.
#include <iostream>
#include <bitset>
#include <list>
using namespace std;
int main() {
bitset<5> inpSeq;
unsigned int i;
list<int> xorList;
cout << "Enter a 5-bit sequence: \n";
cin >> inpSeq;
for (i = 0; i < inpSeq.size(); ++i) {
if (inpSeq[i] == 1) {
xorList.push_back(i);
}
}
for (list<int>::iterator list_iter = xorList.begin();
list_iter != xorList.end(); list_iter++)
{
cout << *list_iter << endl;
}
return 0;
}
The reason why I am using a list is because you mentioned wanting to store the positions of the 1 bit. The list is being used as the container for those positions, in case you need them in another point in the program.
One of the problems with the original code was that you assigned variable 'x' the value 0. When you declared xorArray[x], that meant you were essentially creating an array of length 0. Standard C++ does not allow this: an array's size must be a compile-time constant greater than zero. It looks like you actually were trying to size the array at runtime. That requires a different syntax and usage of pointers (or a container). The list allows you to grow the data structure for each 1 bit that you encounter.
Also, you cannot print an array's values by using
cout << xorArray << endl
That will print the memory address of the first element in the array, that is, &xorArray[0]. Whenever you want to print the values of a data structure such as a list or array, you need to iterate across the structure and print the values one by one. That is the purpose of the second for() loop in the above code.
Lastly, the values stored are in accordance with the 0 index. If you want positions that start with 1, you'll have to use
xorList.push_back(i+1);
Hope this helps!
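One detail worth keeping in mind with either answer: std::bitset numbers its bits from the right-hand (least significant) end, so position 0 is the last character of the text that was entered. A minimal sketch, with onePositions as a hypothetical helper name:

```cpp
#include <bitset>
#include <cstddef>
#include <string>
#include <vector>

// Returns the positions of the 1 bits, counted from the right-hand
// (least significant) end, which is how std::bitset numbers its bits.
std::vector<std::size_t> onePositions(const std::string& bits) {
    // Throws std::invalid_argument if a character is neither '0' nor '1'.
    std::bitset<5> seq(bits);
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < seq.size(); ++i)
        if (seq.test(i))
            positions.push_back(i);
    return positions;
}
```

For the input 10100 from the question this gives the positions 2 and 4. If positions counted from the left are wanted instead, seq.size() - 1 - i can be stored.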

Is there a limit on the size of std::set::iterator?

I have a std::set of strings and I want to iterate over them, but the iterator is behaving differently for different sizes of set. Given below is the code snippet that I'm working on:
int test(set<string> &KeywordsDictionary){
int keyword_len = 0;
string word;
set<string>::iterator iter;
cout << "total words in the database : " << KeywordsDictionary.size() << endl;
for(iter=KeywordsDictionary.begin();iter != KeywordsDictionary.end();iter++) {
cout << *iter;
word = *iter;
keyword_len = word.size();
if(keyword_len>0)
Dosomething();
else
cout << "Length of keyword is <= 0" << endl;
}
cout << "exiting test program" << endl;
}
The code works properly, and *iter is dereferenced and assigned to word, as long as the size of KeywordsDictionary is below roughly 15000. However, when the size of KeywordsDictionary grows beyond 15000:
the print statement cout << *iter; still prints all the contents of KeywordsDictionary correctly,
but the value of *iter apparently is not assigned to word; word just ends up as an empty string.
EDIT: And the output of the program is :
total words in the database : 22771
�z���AAAADAAIIABABBABLEABNABOUTACACCEPTEDACCESSACCOUNT...
Length of keyword is <= 0
exiting test program
So basically, I'm guessing the loop is executing only once.
Try declaring keyword_len as
std::string::size_type keyword_len = 0;
instead of
int keyword_len = 0;
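The snippet alone does not show what actually goes wrong, so the following is only a sketch of the loop with the size_type change applied. countNonEmpty is a hypothetical stand-in for the question's test(), with Dosomething() replaced by a counter. It also returns a value on every path: the question's test() is declared to return int but never does, and flowing off the end of a non-void function is undefined behaviour.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Counts the non-empty keywords in the set.
std::size_t countNonEmpty(const std::set<std::string>& keywords) {
    std::size_t count = 0;
    for (std::set<std::string>::const_iterator it = keywords.begin();
         it != keywords.end(); ++it) {
        const std::string& word = *it;             // reference, no copy needed
        std::string::size_type len = word.size();  // unsigned size type
        if (len > 0)
            ++count;
    }
    return count;  // a non-void function must return a value
}
```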