Clearing Object Duplicates In a vector produces infinite loop - c++

I have a vector called:
vector<MiniPair> miniPairVector;
A MiniPair object has two members: an integer docNumber and a string word.
I am trying to remove duplicates from this vector: if another object in the vector has the same docNumber and word, the duplicate should be removed.
This is what I have tried, but it produces an infinite loop:
for (int i = 0; i < miniPairVector.size(); i++) {
for (int k = i + 1; k < miniPairVector.size(); k++) {
if (miniPairVector[i].getDocNumber() == miniPairVector[k].getDocNumber() && miniPairVector[i].getWord() == miniPairVector[k].getWord()) {
cout << "i am erasing" << endl;
miniPairVector.erase(miniPairVector.begin() + k);
}
}
}
This is the MiniPair class:
#pragma once
// classes example
#ifndef MINIPAIR_H
#define MINIPAIR_H
#include <iostream>
using namespace std;
class MiniPair {
friend bool operator<(MiniPair const &a, MiniPair const &b) {
return a.docNumber < b.docNumber || a.docNumber == b.docNumber && a.word < b.word;
}
friend bool operator==(MiniPair const &a, MiniPair const &b) {
return a.docNumber == b.docNumber && a.word == b.word;
}
private:
string word;
int docNumber;
public:
MiniPair();
MiniPair(string word, int docNumber);
string getWord();
int getDocNumber();
};
#endif

My presumption is that you are doing this for a class.
First, while this may not be relevant for the problem you're solving right now because of class-imposed constraints, this is a poor way of implementing this. When implemented correctly, the number of comparisons will be something like miniPairVector.size() * miniPairVector.size(). That's a lot of comparisons, and way more than you actually need.
If I were trying to do this in a non-toy (or non-assignment) program, I would use the <algorithm> section of the standard library. I would use ::std::sort and then ::std::unique.
Here's how I would do it using those two:
#include <algorithm>
#include <vector>
// Note: getDocNumber() and getWord() must be declared const in MiniPair,
// because the lambdas below call them through const references.
void remove_dupes(::std::vector<MiniPair> &minipair_vec)
{
    ::std::sort(minipair_vec.begin(), minipair_vec.end(),
        [](MiniPair const &a, MiniPair const &b) -> bool {
            return (a.getDocNumber() < b.getDocNumber())
                || ((a.getDocNumber() == b.getDocNumber())
                    && (a.getWord() < b.getWord()));
        }); // End lambda and sort.
    auto newend = ::std::unique(minipair_vec.begin(), minipair_vec.end(),
        [](MiniPair const &a, MiniPair const &b) -> bool {
            return a.getDocNumber() == b.getDocNumber()
                && a.getWord() == b.getWord();
        }); // End lambda and unique.
    // Drop the leftover tail that unique moved past newend.
    minipair_vec.resize(newend - minipair_vec.begin());
}
I have tested it, so it should work just fine.
The general lesson is that if you find yourself looping, go through this set of questions:
Am I indexing into a linear data structure? If so, why am I using indexes instead of iterators?
Is there an algorithm that already does what I need, or can a couple of algorithms be easily composed to do what I need?
The code I presented should run in a time that's proportional to minipair_vec.size() * ::std::log2(minipair_vec.size()). The code you wrote would run in a time proportional to minipair_vec.size() * minipair_vec.size() (once you got it to work), which is a lot longer for a large list.
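If the assignment does require the nested index loop, one likely problem in the question's version is that k is incremented even after an erase, so the element that slides into position k is never compared. A minimal sketch of a corrected quadratic loop follows; the simplified MiniPair with public members and the name remove_dupes_quadratic are assumptions made for illustration, not the asker's actual class.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the question's MiniPair (public members for brevity).
struct MiniPair {
    int docNumber;
    std::string word;
};

// Removes later duplicates in place, keeping the first occurrence of each pair.
// Hypothetical helper name; still O(n^2) comparisons in the worst case.
void remove_dupes_quadratic(std::vector<MiniPair> &v)
{
    for (std::size_t i = 0; i < v.size(); ++i) {
        for (std::size_t k = i + 1; k < v.size(); /* advanced below */) {
            if (v[i].docNumber == v[k].docNumber && v[i].word == v[k].word) {
                // erase shifts the next element into position k,
                // so k must NOT be advanced here.
                v.erase(v.begin() + static_cast<std::ptrdiff_t>(k));
            } else {
                ++k;
            }
        }
    }
}
```

Unlike sort followed by unique, this keeps the surviving elements in their original order, which may matter for some assignments.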

A C++98 solution:
#include <algorithm>
#include <string>
#include <vector>
struct MiniPair {
int docNumber;
std::string word;
friend bool operator<(MiniPair const &a, MiniPair const &b) {
return a.docNumber < b.docNumber || a.docNumber == b.docNumber && a.word < b.word;
}
friend bool operator==(MiniPair const &a, MiniPair const &b) {
return a.docNumber == b.docNumber && a.word == b.word;
}
};
int main() {
std::vector<MiniPair> miniPairVector;
// fill miniPairVector with data
std::sort(miniPairVector.begin(), miniPairVector.end());
miniPairVector.erase(std::unique(miniPairVector.begin(), miniPairVector.end()), miniPairVector.end());
}

Related

C++11 Check if at least one element in vector is not in another vector

I wrote the following code to check whether at least one element of a vector is not in another vector.
The vectors contain no duplicates, only unique elements.
Is there a more elegant way to do this using the STL?
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
bool areVectorsDifferent(vector<int> &a, vector<int> &b){
if(a.size() != b.size()){
return true;
}
std::sort(a.begin(),a.end());
std::sort(b.begin(),b.end());
for(int i = 0; i < a.size();i++){
if(a[i] != b[i]) {
return true;
}
}
return false;
}
int main() {
bool isDifferent = false;
vector<int> a = {1,2,3,5};
vector<int> b = {4,3,2,1};
std::cout << areVectorsDifferent(a,b) << std::endl;
return 0;
}
It depends on your definition of "different", but:
bool areVectorsDifferent(const vector<int> &a, const vector<int> &b){
return a.size() != b.size()
|| std::set<int>{a.cbegin(), a.cend()} != std::set<int>{b.cbegin(), b.cend()};
}
I hope this solves your problem:
using namespace std;
template <typename T>
auto are_different(const vector<T>& lhs, const vector<T>& rhs) -> bool {
return lhs.size() != rhs.size() ||
any_of(cbegin(lhs), cend(lhs), [&rhs](const auto& item) {
return find(cbegin(rhs), cend(rhs), item) == cend(rhs);
});
}
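Another option, sketched here under the assumption that "different" means "not the same elements regardless of order", is std::is_permutation from <algorithm> (available since C++11). It needs neither sorting nor temporary sets and leaves both inputs untouched, though it can take quadratic time in the worst case.

```cpp
#include <algorithm>
#include <vector>

// True when a and b do not hold the same elements (order is ignored).
bool areVectorsDifferent(const std::vector<int> &a, const std::vector<int> &b)
{
    // The size check comes first: the three-iterator overload of
    // std::is_permutation assumes the second range is at least as long
    // as the first.
    return a.size() != b.size()
        || !std::is_permutation(a.begin(), a.end(), b.begin());
}
```

For large vectors the sort-based or set-based versions above are likely to scale better.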

search a structure element in map

I am new to C++ (and also to English).
I want to search for {1,2,3} in the map and, if it exists, print TRUE on the screen,
but I cannot get it to work.
My code is below.
Can you help me?
#include <iostream>
#include <map>
#include <iterator>
#define PKT_UNIT_MAX_LEN 10000
using namespace std;
struct PKT_UNIT
{
int len;
unsigned int checksum;
unsigned char data[PKT_UNIT_MAX_LEN];
};
int main()
{
map<int,PKT_UNIT> maharan;
maharan.insert(pair<int,PKT_UNIT>(1,{1,2,3}));
map<int,PKT_UNIT> ::iterator it;
it=maharan.begin();
for(it=maharan.begin(); it != maharan.end(); it++ )
{
if (maharan.find(it)!=maharan.end())
{
if (it->second.len==1 && it->second.checksum==2 && it->second.data==3)
cout<<"TRUE"<<endl;
}
return 0;
}
map::find takes something comparable to the key_type, not the mapped_type, and certainly not an iterator. Searching for the value is not what std::map is designed to support. You can instead use the generic searching algorithms.
bool operator==(const PKT_UNIT& lhs, const PKT_UNIT& rhs)
{
return (lhs.len == rhs.len)
&& (lhs.checksum == rhs.checksum)
&& std::equal(lhs.data, lhs.data + lhs.len, rhs.data);
}
int main()
{
PKT_UNIT needle{1,2,3};
std::map<int,PKT_UNIT> maharan;
maharan.insert(pair<int,PKT_UNIT>(1,needle));
auto it = std::find_if(maharan.begin(), maharan.end(), [&needle](auto & item){ return item.second == needle; });
if (it != maharan.end())
{
std::cout << "TRUE";
}
return 0;
}
You should overload the equality operator for PKT_UNIT. std::map::find will not help here, because it looks up keys rather than values, so you have to scan the map for a matching value.
bool operator==(const PKT_UNIT& lhs, const PKT_UNIT& rhs)
{
return (lhs.len == rhs.len) &&
(lhs.checksum == rhs.checksum) &&
std::equal(lhs.data,lhs.data + PKT_UNIT_MAX_LEN,rhs.data);
}
Then you can do something like this:
PKT_UNIT goal {1,2,3};
for (const auto& e : maharan)
{
if (e.second == goal)
{
std::cout << "TRUE\n";
break;
}
}
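Also, as it seems you don't need the key, maybe you want a std::set (which needs an ordering such as operator< for PKT_UNIT) or a std::unordered_set (which needs a hash function and operator==). Both provide a find member function, so you wouldn't need a separate search algorithm at all.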
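A minimal sketch of the std::set variant might look like this. The operator< shown here and the helper name contains are assumptions for illustration; the ordering compares len, then checksum, then the whole data buffer.

```cpp
#include <algorithm>
#include <set>
#include <tuple>

constexpr int PKT_UNIT_MAX_LEN = 10000;

struct PKT_UNIT {
    int len;
    unsigned int checksum;
    unsigned char data[PKT_UNIT_MAX_LEN];
};

// Orders by len, then checksum, then the full data buffer.
bool operator<(const PKT_UNIT &lhs, const PKT_UNIT &rhs)
{
    if (std::tie(lhs.len, lhs.checksum) != std::tie(rhs.len, rhs.checksum))
        return std::tie(lhs.len, lhs.checksum) < std::tie(rhs.len, rhs.checksum);
    return std::lexicographical_compare(lhs.data, lhs.data + PKT_UNIT_MAX_LEN,
                                        rhs.data, rhs.data + PKT_UNIT_MAX_LEN);
}

// Hypothetical helper: logarithmic lookup through the set's own find.
bool contains(const std::set<PKT_UNIT> &packets, const PKT_UNIT &needle)
{
    return packets.find(needle) != packets.end();
}
```

Each comparison may walk the whole 10000-byte buffer, so comparing only the first len bytes could be worth considering if the rest of the buffer is not meaningful.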

std::map's find fails to find, but element in map with manual scan

I am using std::map and a list to keep track of windowing over elements and associated scores. When a window is full, I want to pop an element off the windows queue and remove it from the map. Because there can be duplicates, the map keeps track of how many times each element in the window was encountered. I'm also using an ordered map so that I can keep getting the minimum values in a given window.
My problem is that find() is returning end() when it is not expected to.
And when I iterate through the map, I find the element to be present. I don't want to sacrifice the logarithmic complexity of using map.
tl;dr: std::map says an element isn't in the map. A manual scan says it is.
[Edit: Bryan Chen's suggestion fixed the map. Thank you!]
#include <cstdint>
#include <cstdio>
#include <cinttypes>
#include <map>
#include <list>
#include <vector>
#include "util.h"
#include "kmerutil.h"
namespace kpg {
struct elscore_t {
uint64_t el_, score_;
INLINE elscore_t(uint64_t el, uint64_t score): el_(el), score_(score) {
LOG_ASSERT(el == el_);
LOG_ASSERT(score == score_);
}
INLINE elscore_t(): el_(0), score_(0) {}
inline bool operator <(const elscore_t &other) const {
return score_ < other.score_ || el_ < other.el_; // Lexicographic is tie-breaker.
}
inline bool operator ==(const elscore_t &other) const {
return score_ == other.score_ && el_ == other.el_; // Lexicographic is tie-breaker.
}
std::string to_string() const {
return std::to_string(el_) + "," + std::to_string(score_);
}
};
struct esq_t: public std::list<elscore_t> {
};
typedef std::map<elscore_t, unsigned> esmap_t;
class qmap_t {
// I could make this more efficient by using pointers instead of
// elscore_t structs.
// *maybe* TODO
// Could also easily templatify this module for other windowing tasks.
esq_t list_;
#if !NDEBUG
public:
esmap_t map_;
private:
#else
esmap_t map_;
#endif
const size_t wsz_; // window size to keep
public:
void add(const elscore_t &el) {
auto it(map_.upper_bound(el));
if(it->first == el) ++it->second;
else map_.emplace(el, 1);
}
void del(const elscore_t &el) {
auto f(map_.find(el));
if(f == map_.end()) {
LOG_DEBUG("map failed :(\n");
for(f = map_.begin(); f != map_.end(); ++f)
if(f->first == el)
break;
}
LOG_ASSERT(f != map_.end());
if(--f->second <= 0)
map_.erase(f);
}
uint64_t next_value(const uint64_t el, const uint64_t score) {
list_.emplace_back(el, score);
LOG_ASSERT(list_.back().el_ == el);
LOG_ASSERT(list_.back().score_ == score);
add(list_.back());
if(list_.size() > wsz_) {
//fprintf(stderr, "list size: %zu. wsz: %zu\n", list_.size(), wsz_);
//map_.del(list_.front());
del(list_.front());
list_.pop_front();
}
LOG_ASSERT(list_.size() <= wsz_);
return list_.size() == wsz_ ? map_.begin()->first.el_: BF;
// Signal a window that is not filled by 0xFFFFFFFFFFFFFFFF
}
qmap_t(size_t wsz): wsz_(wsz) {
}
void reset() {
list_.clear();
map_.clear();
}
};
}
This is not a valid strict weak ordering:
return score_ < other.score_ || el_ < other.el_;
You have elscore_t(0, 1) < elscore_t(1, 0) and elscore_t(1, 0) < elscore_t(0, 1).
As T.C. pointed out in his answer, your operator< is not correct.
You can use std::tie (from <tuple>) to do a lexicographical comparison:
return std::tie(score_, el_) < std::tie(other.score_, other.el_);
Otherwise you can do
if (score_ == other.score_) {
return el_ < other.el_; // use el_ to compare only if score_ are same
}
return score_ < other.score_;
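To make the difference concrete, here is a small sketch that keeps the question's comparison as a free function next to the std::tie version. The stripped-down elscore_t and the names broken_less and has_key are assumptions for illustration only.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// Stripped-down version of the question's struct (no logging macros).
struct elscore_t {
    std::uint64_t el_, score_;
    bool operator<(const elscore_t &other) const {
        // Compare score_ first; el_ only breaks ties.
        return std::tie(score_, el_) < std::tie(other.score_, other.el_);
    }
};

// The comparison from the question, kept only to show the contradiction:
// it can report a < b and b < a at the same time.
bool broken_less(const elscore_t &a, const elscore_t &b)
{
    return a.score_ < b.score_ || a.el_ < b.el_;
}

// Hypothetical helper: true when key is present in the map.
bool has_key(const std::map<elscore_t, unsigned> &m, const elscore_t &key)
{
    return m.find(key) != m.end();
}
```

With el_ = 0, score_ = 1 and el_ = 1, score_ = 0, broken_less is true in both directions, while the std::tie version orders the two elements one way only, which is what std::map needs for find to behave.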

Boost disjoint set

I need to make a disjoint set of the type dataum.
I have all the data in the vector as follows
vector<dataum> S;
S.push_back( dataum(0,0) );
S.push_back( dataum(0,1) );
S.push_back( dataum(0,2) );
.
.
Then I create the disjoint_set
std::vector<int> rank (100);
std::vector<dataum> parent (100);
boost::disjoint_sets<int*,dataum*> ds(&rank[0], &parent[0]);
for( int i=0 ; i<S.size() ; i++ )
{
ds.make_set( S[i] );
}
This does not seem to work. What am I missing?
I want to create a disjoint set with custom datatype. In this case dataum. Initially each of my dataums should be in different sets.
The documentation states that
Rank must be a model of ReadWritePropertyMap with an integer value type and a key type equal to the set's element type.
Parent must be a model of ReadWritePropertyMap and the key and value type the same as the set's element type.
At your previous question I posted the following sample code in a comment:
After looking at the (new for me) disjoint_set_* classes, I don't think that they afford iterating members of sets. They act like unidirectional mapping from element to set representative. In case it helps you: http://paste.ubuntu.com/8881626 – sehe 9 hours ago
Here it is, reworked for an imagined dataum type:
struct dataum {
int x,y;
bool operator< (const dataum& o) const { return tie(x,y) < tie(o.x,o.y); }
bool operator==(const dataum& o) const { return tie(x,y) == tie(o.x,o.y); }
bool operator!=(const dataum& o) const { return tie(x,y) != tie(o.x,o.y); }
};
Here's how I can see a disjoint_set declaration for it:
std::map<dataum,int> rank;
std::map<dataum,dataum> parent;
boost::disjoint_sets<
associative_property_map<std::map<dataum,int>>,
associative_property_map<std::map<dataum,dataum>> > ds(
make_assoc_property_map(rank),
make_assoc_property_map(parent));
The mechanics of this are to be found in the documentation for Boost PropertyMap, which is a very powerful generic data structure abstraction layer, mostly used with Boost Graph Library. It's wildly powerful, but I can't say it's user friendly.
Here's the full demo Live On Coliru¹
#include <boost/pending/disjoint_sets.hpp>
#include <boost/property_map/property_map.hpp>
#include <boost/tuple/tuple_comparison.hpp>
#include <iostream>
#include <map>
#include <vector>
#include <cassert>
using namespace boost;
struct dataum {
int x,y;
bool operator< (const dataum& o) const { return tie(x,y) < tie(o.x,o.y); }
bool operator==(const dataum& o) const { return tie(x,y) == tie(o.x,o.y); }
bool operator!=(const dataum& o) const { return tie(x,y) != tie(o.x,o.y); }
};
int main() {
std::vector<dataum> S { {0,0}, {0,1}, {0,2} };
std::map<dataum,int> rank;
std::map<dataum,dataum> parent;
boost::disjoint_sets<
associative_property_map<std::map<dataum,int>>,
associative_property_map<std::map<dataum,dataum>> > ds(
make_assoc_property_map(rank),
make_assoc_property_map(parent));
for(auto i=0ul; i<S.size(); i++)
ds.make_set(S[i]);
assert((ds.count_sets(S.begin(), S.end()) == 3));
assert((ds.find_set(dataum{0,2}) == dataum{0,2}));
assert((ds.find_set(dataum{0,1}) == dataum{0,1}));
ds.union_set(dataum{0,2},dataum{0,1});
assert((ds.count_sets(S.begin(), S.end()) == 2));
assert((ds.find_set(dataum{0,2}) == dataum{0,1}));
assert((ds.find_set(dataum{0,1}) == dataum{0,1}));
std::cout << "done";
}
¹ Coliru still not cooperating

Find function is not working

I want to insert objects of struct one as unique keys in a map. So I have written a comparator with an operator() function, but find does not work even though the element exists in the map.
#include <iostream>
#include<map>
#include <stdio.h>
#include <string.h>
#include <math.h>
using namespace std;
struct one
{
char* name_;
double accuracy_;
one(char* name, double accuracy)
{
name_ = name;
accuracy_ = accuracy;
}
};
const float Precision = 0.000001;
struct CompLess:public std::binary_function<const one, const one, bool>{
bool operator()(const one p1, const one p2) const
{
if (strcmp(p1.name_, p2.name_)<0)
{
return true;
}
if(((p1.accuracy_) - (p2.accuracy_)) < Precision and
fabs((p1.accuracy_) - (p2.accuracy_))> Precision)
{
return true;
}
return false;
}
};
typedef map<const one,int,CompLess> Map;
int main( )
{
one first("box",30.97);
one first1("war",20.97);
Map a;
a.insert(pair<one,int>(first,1));
a.insert(pair<one,int>(first1,11));
if(a.find(first1) == a.end())
{
cout<<"Not found"<<endl;
}
else
{
cout<<"Found"<<endl;
}
return 0;
}
Your comparison class doesn't induce a strict weak ordering. You should change it to this:
bool operator()(const one p1, const one p2) const
{
    const int cmp = strcmp(p1.name_, p2.name_);
    if (cmp != 0)
    {
        return cmp < 0; // names differ: the name alone decides the order
    }
    // names are equal: only now compare the accuracy
    return ((p1.accuracy_) - (p2.accuracy_)) < Precision and
        fabs((p1.accuracy_) - (p2.accuracy_)) > Precision;
}
In your version first1 was less than first because strcmp("war", "box") > 0 (first condition is false) and 20.97 < 30.97 (second condition is true), but at the same time first was less than first1, because strcmp("box", "war") < 0 (first condition is true). You should compare the second dimension only if the first one is equal; that's a good rule of thumb for less-than comparisons.
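As a rough sketch of that rule of thumb applied to this map, the comparator below decides by name first and looks at the accuracy only when the names are equal. The simplified struct one (aggregate, const char* name) and the helper found are assumptions for illustration; the accuracy test is written as a single difference check that is intended to match the question's two-part condition.

```cpp
#include <cstring>
#include <map>
#include <utility>

// Simplified stand-in for the question's struct.
struct one {
    const char *name_;
    double accuracy_;
};

const double Precision = 0.000001;

struct CompLess {
    bool operator()(const one &p1, const one &p2) const
    {
        const int cmp = std::strcmp(p1.name_, p2.name_);
        if (cmp != 0)
            return cmp < 0;  // names differ: the name decides
        // Names are equal: p1 is less only if its accuracy is smaller
        // by more than Precision.
        return (p1.accuracy_ - p2.accuracy_) < -Precision;
    }
};

typedef std::map<one, int, CompLess> Map;

// Hypothetical helper: true when key is present in the map.
bool found(const Map &m, const one &key)
{
    return m.find(key) != m.end();
}
```

With this comparator, "box" and "war" are ordered by name alone, so both keys can be inserted and found again.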