I'm trying to create an iterator on a library that allows reading a specific file format.
From the docs, to read the file content you need to do something like this:
CKMCFile database;
if (!database.OpenForListing(path)) {
std::cerr << "ERROR: unable to open " << path << std::endl;
}
CKMCFileInfo info;
database.Info(info);
CKmerAPI kmer(info.kmer_length);
uint32 cnt;
std::vector<uint64_t> data;
std::vector<uint64> ulong_kmer;
data.reserve(info.total_kmers);
while (database.ReadNextKmer(kmer, cnt)) {
kmer.to_long(ulong_kmer);
data.push_back(ulong_kmer[0]);
}
Now, I started with this class wrapper:
class FileWrapper {
CKMCFile database;
CKMCFileInfo info;
Iterator _end;
public:
explicit FileWrapper(const std::string &path) {
if (!database.OpenForListing(path)) {
std::cout << "ERROR: unable to open " << path << std::endl;
}
database.Info(info);
}
Iterator begin() {
Iterator it;
it.database = &database;
it.total = 0;
uint32_t cnt;
std::vector<uint64_t> ulong_kmer;
CKmerAPI tmp(info.kmer_length);
database.ReadNextKmer(tmp, cnt);
tmp.to_long(ulong_kmer);
return it;
}
Iterator end() const { return _end; }
uint64_t size() { return info.total_kmers; }
};
And then, this is the Iterator class:
class Iterator {
friend class FileWrapper;
CKMCFileInfo info;
CKMCFile *database;
uint64_t kmer, total;
public:
Iterator &operator++() {
++total;
uint32_t cnt;
std::vector<uint64_t> ulong_kmer;
CKmerAPI tmp(info.kmer_length);
database->ReadNextKmer(tmp, cnt);
tmp.to_long(ulong_kmer);
return *this;
}
bool operator<(const Iterator &rhs) const { return total < rhs.total; }
uint64_t operator*() const { return kmer; }
};
But during some tests I found that I can't use it in a for loop such as for (auto it = begin(); it != end(); ++it) { ... }, or write begin() + size(). How can I correctly overload these two operators, operator!= and operator+?
You'll have to think about three major things first:
Ownership. Currently, you have to make sure your FileWrapper survives at least as long as any Iterator returned from it by calling its begin() (since your Iterators store pointers to data owned by the FileWrapper object). If you cannot guarantee that, consider using unique_ptrs or shared_ptrs.
Iterator Category. As discussed in the comments, it appears that your database requires you to use "input iterators". They can only be incremented by one (do not provide operator+(int)) and dereferenced. Indeed, what would the iterator begin() + 10 look like? If this should advance your file-pointer, then you cannot define the end as begin() + size() as that would just skip through the file.
Representation. What should an end-iterator look like? A simple choice might be to indicate the end with database == nullptr. In this case, an operator!= might look like this:
bool is_end() const { return database == nullptr; }
bool operator==(const Iterator& other) const {
if(is_end()) return other.is_end();
if(other.is_end()) return false;
return (database == other.database) && (total == other.total);
}
bool operator!=(const Iterator& other) const { return !operator==(other); }
Now, you'll need code that ensures that all end-iterators have database == nullptr and, whenever a non-end iterator becomes an end-iterator by application of operator++(), you'll need to set database = nullptr and total = 0 (or something).
A note at the end: your Iterators may be in an inconsistent state after construction and before assignment of their database member. It is prudent to declare a proper constructor for Iterator that initializes its members.
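As a rough illustration of the points above, here is a minimal sketch of an input iterator whose end state is database == nullptr. MockDatabase and its ReadNext function are hypothetical stand-ins for CKMCFile and ReadNextKmer, introduced only so the example compiles without the library; the real calls would replace them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-in for CKMCFile so the sketch compiles without the library.
class MockDatabase {
    std::vector<std::uint64_t> values_;
    std::size_t pos_ = 0;
public:
    explicit MockDatabase(std::vector<std::uint64_t> v) : values_(std::move(v)) {}
    // Same shape as ReadNextKmer: returns false once the data is exhausted.
    bool ReadNext(std::uint64_t &out) {
        if (pos_ >= values_.size()) return false;
        out = values_[pos_++];
        return true;
    }
};

class Iterator {
    MockDatabase *database = nullptr;  // nullptr marks the end iterator
    std::uint64_t kmer = 0;
    std::uint64_t total = 0;

    // Read the next value; become the end iterator when nothing is left.
    void advance() {
        if (database && !database->ReadNext(kmer)) {
            database = nullptr;
            total = 0;
        }
    }
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = std::uint64_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const std::uint64_t *;
    using reference = std::uint64_t;

    Iterator() = default;  // end iterator
    explicit Iterator(MockDatabase &db) : database(&db) { advance(); }  // loads the first element

    Iterator &operator++() { ++total; advance(); return *this; }
    std::uint64_t operator*() const { return kmer; }

    bool is_end() const { return database == nullptr; }
    bool operator==(const Iterator &other) const {
        if (is_end() || other.is_end()) return is_end() == other.is_end();
        return database == other.database && total == other.total;
    }
    bool operator!=(const Iterator &other) const { return !(*this == other); }
};

// Drain a database through the iterator pair, the way a for loop would.
std::vector<std::uint64_t> read_all(MockDatabase &db) {
    std::vector<std::uint64_t> out;
    for (Iterator it(db); it != Iterator(); ++it) out.push_back(*it);
    return out;
}
```

Because the constructor already reads the first element, an empty file yields an iterator that compares equal to the end iterator immediately. There is deliberately no operator+, in line with the input-iterator category.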
EDIT: here's a suggestion for an integration
Hi, I am reading C++ Primer, 5th edition, and have some doubts about the section on weak_ptr. It is written that
By using a weak_ptr, we don’t affect the lifetime of the vector to which a given StrBlob points. However, we can prevent the user from attempting to access a vector that no longer exists.
Then they have given the following code as an example:
#include<iostream>
#include<string>
#include<vector>
#include<memory>
#include<initializer_list>
#include<stdexcept>
using namespace std;
class StrBlobPtr;
class StrBlob {
friend class StrBlobPtr;
public:
typedef std::vector<std::string>::size_type size_type;
StrBlob():data(std::make_shared<std::vector<std::string>>()){
}
StrBlob(std::initializer_list<std::string> il):data(make_shared<vector<std::string>>(il)){
}
size_type size() const {
return data->size();
}
bool empty() const {
return data->empty();
}
void push_back(const std::string &t){
data->push_back(t);
}
std::string& front(){
check(0,"front on empty StrBlob");
return data->front();
}
std::string& front() const{
check(0,"front on const empty StrBlob");
return data->front();
}
std::string& back(){
check(0,"back on empty StrBlob");
return data->back();
}
std::string& back() const {
check(0,"back on const empty StrBlob");
return data->back();
}
void pop_back(){
check(0,"pop_back on empty StrBlob");
data->pop_back();
}
private:
std::shared_ptr<std::vector<std::string>> data;
void check(size_type i, const std::string &msg) const{
if(i >= data->size()){
throw out_of_range(msg);
}
}
StrBlobPtr begin();
StrBlobPtr end();
};
class StrBlobPtr {
public:
typedef std::vector<std::string>::size_type size_type;
StrBlobPtr():curr(0){
}
StrBlobPtr(StrBlob &a, size_type sz = 0):wptr(a.data), curr(sz){
}
std::string& deref() const {
auto p = check(curr, "dereference past end");
return (*p)[curr];
}
StrBlobPtr& incr(){
check(curr, "increment past end of StrBlobPtr");
++curr;
return *this;
}
std::shared_ptr<std::vector<std::string>> check(std::size_t i, const std::string &msg) const{
auto ret = wptr.lock();
if(!ret){
throw std::runtime_error("unbound StrBlobPtr");
}
if(i>= ret->size()){
throw std::out_of_range(msg);
}
return ret;
}
private:
std::weak_ptr<std::vector<std::string>> wptr;
size_type curr;
};
StrBlobPtr StrBlob::begin() {
return StrBlobPtr(*this);
}
StrBlobPtr StrBlob::end() {
auto ret = StrBlobPtr(*this, data->size());
return ret;
}
int main(){
return 0;
}
My questions are as follows:
How can we prevent the user from attempting to access a vector that no longer exists? I can't come up with a use case. How does the statement quoted above apply to this example?
How does this example show or verify that we can prevent the user from attempting to access a vector that no longer exists? If this example does not show what they have written, then why is it in the book? Note that I have written "if".
1. How can we prevent the user from attempting to access a vector that no longer exists?
We can prevent it by exchanging a weak_ptr for a shared_ptr. weak_ptr::lock() does that. It atomically checks if the pointed-to object still exists and increments the corresponding shared_ptr ref count, thus "blocking" any possible deletion from that point on.
So after this line:
auto ret = wptr.lock();
ret will be a shared_ptr that either owns the object or doesn't, and that fact will not change for as long as ret exists.
Then with a simple test you can safely check if there is an object or not:
if(!ret){
/* no object anymore */
}
At the end the function does return ret;, which returns a copy of it, thus still preventing an object from being deleted (ref count is again incremented and then decremented). So as long as you own an instance of shared_ptr, you can rest assured the object will continue to exist.
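The behaviour of lock() described above can be seen in isolation with a small sketch. The names StringVec, first_or_expired and demo are made up for this illustration and are not part of the book's code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

using StringVec = std::vector<std::string>;

// Returns the first element if the vector is still alive, "<expired>" otherwise.
std::string first_or_expired(const std::weak_ptr<StringVec> &wptr) {
    // lock() yields a shared_ptr; while `ret` lives, the vector cannot be destroyed.
    std::shared_ptr<StringVec> ret = wptr.lock();
    if (!ret) return "<expired>";  // the last owner is already gone
    // The string is returned by value, so nothing refers into the vector afterwards.
    return ret->empty() ? std::string() : ret->front();
}

// Observes a vector before and after its only owner releases it.
std::vector<std::string> demo() {
    std::vector<std::string> seen;
    auto owner = std::make_shared<StringVec>(StringVec{"a", "b"});
    std::weak_ptr<StringVec> observer = owner;
    seen.push_back(first_or_expired(observer));  // owner is alive
    owner.reset();                               // destroys the vector
    seen.push_back(first_or_expired(observer));  // lock() now gives an empty shared_ptr
    return seen;
}
```

The weak_ptr never keeps the vector alive on its own; it only lets the caller find out, safely, whether the vector is still there.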
However, here we have a problem:
std::string& deref() const {
auto p = check(curr, "dereference past end");
return (*p)[curr];
}
This returns a reference to a std::string inside a vector which, after p goes out of scope, is held only by the weak_ptr, i.e. a potentially dangling reference (which is no different from a dangling pointer).
2. How does this example show/verify that we can prevent the user from attempting to access a vector that no longer exists?
Apparently it doesn't. Just ignore it.
I have a somewhat simple text file parser. The text I parse is split into blocks denoted by { block data }.
My parser has a string read() function, which gets tokens back, such that in the example above the first token is { followed by block followed by data followed by }.
To make things less repetitive, I want to write a generator-like iterator that will allow me to write something similar to this JavaScript code:
* readBlock() {
this.read(); // {
let token = this.read();
while (token !== '}') {
yield token;
token = this.read();
}
}
which in turn allows me to use simple for-of syntax:
for (let token of parser.readBlock()) {
// block
// data
}
For C++ I would like something similar:
for (string token : reader.read_block())
{
// block
// data
}
I googled around to see if this can be done with an iterator, but I couldn't figure if I can have a lazy iterator like this which has no defined beginning or end. That is, its beginning is the current position of the reader (an integer offset into a vector of characters), and its end is when the token } is found.
I don't need to construct arbitrary iterators, or to iterate in reverse, or to see if two iterators are equal, since it's purely to make linear iteration less repetitive.
Currently every time I want to read a block, I need to re-write the following:
stream.skip(); // {
while ((token = stream.read()) != "}")
{
// block
// data
}
This becomes very messy, especially when I have blocks inside blocks. To support blocks inside blocks, the iterators would have to all reference the same reader's offset, such that an inner block will advance the offset, and the outer block will re-start iterating (after the inner is finished) from that advanced offset.
Is this possible to achieve in C++?
In order to be usable in a range-based for loop, a class has to have member functions begin() and end() which return iterators.
What is an iterator? Any object fulfilling a set of requirements. There are several kinds of iterators, depending on which operations they allow. I suggest implementing an input iterator, which is the simplest: https://en.cppreference.com/w/cpp/named_req/InputIterator
class Stream
{
public:
std::string read() { /**/ }
bool valid() const { /* return true while more tokens are available */ }
};
class FileParser
{
std::string current_;
Stream* stream_;
public:
class iterator
{
FileParser* obj_;
public:
using value_type = std::string;
using reference = const std::string&;
using pointer = const std::string*;
using iterator_category = std::input_iterator_tag;
iterator(FileParser* obj=nullptr): obj_ {obj} {}
reference operator*() const { return obj_->current_; }
iterator& operator++() { increment(); return *this; }
iterator operator++(int) { increment(); return *this; }
bool operator==(iterator rhs) const { return obj_ == rhs.obj_; }
bool operator!=(iterator rhs) const { return !(rhs==*this); }
protected:
void increment()
{
obj_->next();
if (!obj_->valid())
obj_ = nullptr;
}
};
FileParser(Stream& stream): stream_ {&stream} {};
iterator begin() { return iterator{this}; }
iterator end() { return iterator{}; }
void next() { current_ = stream_->read(); }
bool valid() const { return stream_->valid(); }
};
So your end-of-file iterator is represented by an iterator pointing to no object.
Then you can use it like this:
int main()
{
Stream s; // Initialize it as needed
FileParser parser {s};
for (const std::string& token: parser)
{
std::cout << token << std::endl;
}
}
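The same nullptr-as-end idea can be adapted to the block-reading case from the question. The following is only a sketch under assumptions: TokenReader, BlockRange and the "sub" keyword that introduces a nested block are hypothetical, and the reader works on pre-split tokens rather than on real text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class BlockRange;

// Hypothetical reader over pre-split tokens; a real read() would tokenize text.
class TokenReader {
    std::vector<std::string> tokens_;
    std::size_t offset_ = 0;
public:
    explicit TokenReader(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}
    // Returns "}" at end of input so that an unterminated block still stops.
    std::string read() { return offset_ < tokens_.size() ? tokens_[offset_++] : std::string("}"); }
    BlockRange read_block();  // defined after BlockRange
};

class BlockRange {
    TokenReader *reader_;
public:
    class iterator {
        TokenReader *reader_ = nullptr;  // nullptr marks the end of the block
        std::string token_;
        void advance() {
            token_ = reader_->read();
            if (token_ == "}") reader_ = nullptr;
        }
    public:
        iterator() = default;
        explicit iterator(TokenReader *r) : reader_(r) { advance(); }
        const std::string &operator*() const { return token_; }
        iterator &operator++() { advance(); return *this; }
        bool operator!=(const iterator &rhs) const { return reader_ != rhs.reader_; }
    };
    explicit BlockRange(TokenReader *r) : reader_(r) {}
    // begin() consumes the opening "{"; a range-based for loop calls it exactly once.
    iterator begin() { reader_->read(); return iterator(reader_); }
    iterator end() { return iterator(); }
};

inline BlockRange TokenReader::read_block() { return BlockRange(this); }

// Collects the tokens of a block, descending into blocks introduced by "sub".
std::vector<std::string> collect(TokenReader &reader) {
    std::vector<std::string> out;
    for (const std::string &token : reader.read_block()) {
        if (token == "sub") {
            for (const std::string &inner : reader.read_block()) out.push_back("sub." + inner);
        } else {
            out.push_back(token);
        }
    }
    return out;
}
```

Since every iterator goes through the same TokenReader, the inner loop advances the shared offset and the outer loop simply resumes from wherever the inner one stopped.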
Apologies if this is a trivial problem.
I'm trying to pass a multimap that has been put together with one class in a library to another class in that library in order to further manipulate the data there.
The code relates to a GUI written by other people and the classes here relate to two different tools in the GUI.
Very roughly speaking, my code and what I'm after look like this:
class A
{
private:
std::multimap<int, double> mMap;
int anInt;
double aDouble;
***some more definitions***
public:
void aFunction(***openscenegraph node, a string, and a parser function***)
{
***a few definitions are declared and initialised here
during calculations***
***some code calculating data stuff that
passes bits of that data to mMap (including information
initialised within the function)***
}
}
class B
{
public:
void bFunction(***openscenegraph node and some other data***)
{
***I want to be able to access all the data in mMap here***
}
}
Can anyone make it clear to me how I can do this, please?
Edit: Added to clarify what I'm aiming for
//Edit by Monkone
//section below is akin to what I'm trying to do
class B
{
private:
std::multimap<int, double> mMapb;
public:
std::multimap<int,double> bFunction2(A::MultiMapDataType data)
{
return mMap;
}
void bFunctionOriginal()
{
***I want to be able to access all the data in mMap here***
***i.e. mMapb.bFunction2(mMap);***
***do stuff with mMapb***
}
}
However, I can't get anything like this to actually work.
I won't be needing to work on the map, only get information from it.
You could then add a function to return a const reference to the map and functions for returning const iterators to A:
class A {
public:
typedef std::multimap<int, double> intdoublemap_t;
typedef intdoublemap_t::const_iterator const_iterator;
// typedef intdoublemap_t::iterator iterator;
private:
intdoublemap_t mMap;
public:
// direct access to the whole map
const intdoublemap_t& getMap() const { return mMap; }
// iterators
const_iterator cbegin() const { return mMap.begin(); }
const_iterator cend() const { return mMap.end(); }
const_iterator begin() const { return cbegin(); }
const_iterator end() const { return cend(); }
/*
iterator begin() { return mMap.begin(); }
iterator end() { return mMap.end(); }
*/
};
Now you can iterate over the map from the outside (from B):
void bFunction(const A& a) {
for(A::const_iterator it = a.begin(); it!=a.end(); ++it) {
std::cout << it->first << " " << it->second << "\n";
}
}
Or access the map directly:
void bFunction(const A& a) {
const A::intdoublemap_t& mref = a.getMap();
//...
}
Private members in C++ cannot be accessed by other (non-friend) classes.
The first solution would be to keep mMap private (it is not polite to work directly on another class's members) and offer accessors to it.
class A
{
public:
typedef std::multimap<int, double> MultiMapDataType;
private:
MultiMapDataType mMap;
***some more definitions***
public:
const MultiMapDataType& getConstMMap() const;
MultiMapDataType getMMap();
};
class B
{
public:
void bFunction(A::MultiMapDataType data)
{
***I want to be able to access all the data in mMap here***
}
void bFunction2(const A::MultiMapDataType& data)
{
***I want to be able to access all the data in mMap here***
}
};
A a;
B b;
b.bFunction(a.getMMap());
b.bFunction2(a.getConstMMap());
However, from an architectural point of view, if you have multiple members that you need to share, you should move them all into a new structure/class that encapsulates that functionality.
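One possible shape for such a structure is sketched below. SharedData and the data() accessor are names invented for this example, and the OpenSceneGraph parameters from the question are left out so the code stays self-contained.

```cpp
#include <cassert>
#include <map>

// Hypothetical container for the data that both tools need.
struct SharedData {
    std::multimap<int, double> values;
    int anInt = 0;
    double aDouble = 0.0;
};

class A {
    SharedData data_;
public:
    // Stands in for the calculation that fills the map in the question.
    void aFunction() {
        data_.values.insert({1, 2.5});
        data_.values.insert({1, 4.0});
        data_.values.insert({2, 8.0});
        data_.anInt = 1;
    }
    // Read-only access for other classes.
    const SharedData &data() const { return data_; }
};

class B {
public:
    // Sums every value stored under the key data.anInt.
    double bFunction(const SharedData &data) const {
        double sum = 0.0;
        auto range = data.values.equal_range(data.anInt);
        for (auto it = range.first; it != range.second; ++it) sum += it->second;
        return sum;
    }
};
```

B only ever sees a const reference, so it can read everything A has collected without being able to modify it.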
I am trying to implement insertion of a word into a chained hashtable.
The problem is that I am trying to insert an object that has 2 fields, and I need access to one of them through an iterator. The problem seems to be with the iterator it, as the code doesn't work from the for loop onwards. I also overloaded operator== in Vocabolo.cpp to make it work for my case.
I also have a problem with the size of the vector: can I use a #define for it? It seems not. Any advice, please?
I declared my vector of list + iterator in the header file as :
vector<list<Vocabolo>> hash;
list<Vocabolo>::iterator it;
this is part of the class Vocabolo :
class Vocabolo {
public:
Vocabolo();
~Vocabolo();
void setVocabolo(Vocabolo);
string getVocabolo();
bool operator== (Vocabolo);
string termine;
string tipo;
};
this is the overloaded method operator==:
bool Vocabolo::operator== (Vocabolo x) {
return getVocabolo() == x.termine;
}
the method that is not working!
bool HashV::Insert(Vocabolo nuovo) {
key = this->HashUniversale(nuovo.termine);
for (it = this->hash[key].begin(); it != this->hash[key].end(); it++)
if (it->termine == nuovo.termine)
return false;
else {
hash[key].push_back(nuovo);
return true;
}
}
Consider using std::find_if (from <algorithm>) instead:
auto itVoca = std::find_if(this->hash[key].begin(), this->hash[key].end(), [&nuovo](const Vocabolo& v)
{
return v.termine == nuovo.termine; // true for an entry with the same word
});
bool found = itVoca != this->hash[key].end();
if (!found) hash[key].push_back(nuovo); // insert only when the word is not already in the bucket
return !found;
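For context, the loop in the question returns during its first iteration and never inserts anything into an empty bucket, because its body is never entered. Below is a minimal self-contained sketch of the whole insertion with the duplicate check done before the push_back. The std::hash call is an assumption standing in for HashUniversale, which is not shown, and the bucket count is passed to the constructor (it must be greater than zero) rather than fixed with a #define.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <vector>

struct Vocabolo {
    std::string termine;
    std::string tipo;
};

class HashV {
    std::vector<std::list<Vocabolo>> hash;
public:
    // The number of buckets is chosen at construction time.
    explicit HashV(std::size_t buckets) : hash(buckets) {}

    // Returns true if the word was inserted, false if it was already present.
    bool Insert(const Vocabolo &nuovo) {
        // std::hash stands in for the question's HashUniversale.
        std::size_t key = std::hash<std::string>{}(nuovo.termine) % hash.size();
        std::list<Vocabolo> &bucket = hash[key];
        auto it = std::find_if(bucket.begin(), bucket.end(),
                               [&nuovo](const Vocabolo &v) { return v.termine == nuovo.termine; });
        if (it != bucket.end()) return false;  // duplicate found
        bucket.push_back(nuovo);               // reached only after the whole bucket was scanned
        return true;
    }
};
```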
I'm doing this:
template<typename T> class var_accessor {
public:
std::set<std::shared_ptr<T>> varset;
std::map<std::string,std::shared_ptr<T>> vars_by_name;
std::map<uint32,std::shared_ptr<T>> vars_by_id;
std::shared_ptr<T> operator[](const uint32& index) { return vars_by_id[index]; }
std::shared_ptr<T> operator[](const std::string& index) { return vars_by_name[index]; }
bool is_in_set(std::shared_ptr<T> what) { auto it = varset.find(what); if (it == varset.end()) return false; return true; }
bool is_in_set(uint32 what) { auto it = vars_by_id.find(what); if (it == vars_by_id.end()) return false; return true; }
bool is_in_set(std::string& what) { auto it = vars_by_name.find(what); if (it == vars_by_name.end()) return false; return true; }
bool place(std::shared_ptr<T> what, const uint32 whatid, const std::string& whatstring) {
if (is_in_set(what)) return false;
varset.emplace(what);
vars_by_name.emplace(whatstring,what);
vars_by_id.emplace(whatid,what);
return true;
}
};
Then...
class whatever {
public:
std::string name;
std::function<int32()> exec;
};
And:
class foo {
public:
var_accessor<whatever> stuff;
};
This works:
std::shared_ptr<whatever> thing(new whatever);
thing->name = "Anne";
thing->exec = []() { return 1; };
foo person;
person.stuff.place(thing, 1, thing->name);
Getting the name crashes it:
std::cout << person.stuff[1]->name;
But if I change the operator[]'s to return references, it works fine.
I don't want to be able to accidentally add new elements without adding to all 3 structures, so that's why I made
std::shared_ptr<T> operator[]
instead of
std::shared_ptr<T>& operator[]
Is there any way to prevent assignment by subscript but keep the subscript operator working?
To be clear I want to be able to keep doing
std::cout << person.stuff[4];
But NOT be able to do
std::shared_ptr<whatever> bob(new whatever);
bob->name = "bob";
person.stuff[2] = bob;
The error is a EXC_BAD_ACCESS inside the std::string class madness
Everything I read says simply "don't return references if you want to prevent assignment" but it also prevents using it for me.
Yes I know some things should be made private but I just want to get it working first.
Using Clang/LLVM in XCode 5.1
Thanks!
You should return a const reference. See this question
A const reference means the caller is not allowed to change the value, only look at it. So assignment will be a compile-time error. But using it will work (and be efficient).
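A reduced sketch of that idea, keeping only the id map: Accessor and Item are simplified, hypothetical versions of var_accessor and whatever. It uses map::at for the lookup, because map::operator[] would silently insert an empty pointer for an unknown id and is not usable on a const map anyway.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

struct Item {
    std::string name;
};

template <typename T>
class Accessor {
    std::map<std::uint32_t, std::shared_ptr<T>> vars_by_id;
public:
    // The only way to add an element; returns false if the id is already taken.
    bool place(std::shared_ptr<T> what, std::uint32_t id) {
        return vars_by_id.emplace(id, std::move(what)).second;
    }
    // Read-only lookup: throws std::out_of_range for a missing id.
    const std::shared_ptr<T> &operator[](std::uint32_t id) const { return vars_by_id.at(id); }
};

// Assigning through the subscript, as in stuff[2] = bob, does not compile.
static_assert(!std::is_assignable<decltype(std::declval<Accessor<Item> &>()[0u]),
                                  std::shared_ptr<Item>>::value,
              "subscript result must not be assignable");
```

Note that the const applies to the shared_ptr, not to the object it points to, so stuff[1]->name = "x" would still compile. Returning std::shared_ptr<const T> instead would be one way to prevent that as well.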