Segmentation fault in std::transform - c++

I'm trying to transfer the file names parsed out of regex matches into a list of filesystem::path objects.
I believe that matches are valid because for_each for the same iterators and print to console work perfectly. However, I'm getting a segmentation fault running this code. What am I doing wrong? Is there a mistake in my lambda?
namespace fs = boost::filesystem;
std::forward_list<fs::path> results;
std::transform(std::sregex_iterator(file_data.begin(), file_data.end(), re),
std::sregex_iterator(), results.begin(),
[&](const std::smatch& m)->fs::path{
return root / fs::path(m[1].str());
});
GDB shows me this line as a place of error:
path& operator=(const path& p)
{
m_pathname = p.m_pathname;
return *this;
}
UPDATE: found the solution - use back_inserter(results) instead of results.begin(). However, why is that?

The std::transform algorithm's third parameter should be an iterator to the start of the range where the values should be written. Specifically, it works by overwriting the values in the range pointed at by the iterator with the transformed values. This means that there actually have to be values there to overwrite in the first place. In your case, you're writing to an empty forward_list, so there's nothing to write to, hence the crash.
To fix this, consider replacing the last argument with a back_inserter, which will automatically create the space that's needed as the values are produced:
std::transform(std::sregex_iterator(file_data.begin(), file_data.end(), re),
std::sregex_iterator(),
back_inserter(results), // <--- This is new
[&](const std::smatch& m)->fs::path{
return root / fs::path(m[1].str());
});
More generally, to the best of my knowledge, all of the algorithms in <algorithm> that write to output ranges will assume that there are values available to overwrite in that range. If that isn't the case, consider using a back_inserter or other type of insert iterator, which will automatically create the space that's needed for you.
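A minimal sketch of the insert-iterator approach described above. The helper name prefix_all and the use of plain strings instead of fs::path are assumptions made to keep the example free of Boost. Note that std::back_inserter calls push_back, which std::forward_list does not provide, so the sketch uses std::vector as the destination.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: prefixes every name with a root directory.
// The destination starts out empty, so std::back_inserter is used to
// append each transformed value instead of overwriting existing ones.
std::vector<std::string> prefix_all(const std::vector<std::string>& names,
                                    const std::string& root) {
    std::vector<std::string> results;           // empty: begin() == end()
    std::transform(names.begin(), names.end(),
                   std::back_inserter(results),  // calls results.push_back(...)
                   [&root](const std::string& name) {
                       return root + "/" + name;
                   });
    return results;
}
```

Passing results.begin() in place of the back_inserter here would write through an end iterator, which is the same mistake as in the question.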
Hope this helps!

Your output iterator is a simple results.begin(), which is probably == results.end(). The clue here is that the failure comes when trying to assign the result.
You either need a back_inserter as you found, or to use some container with enough space already allocated (which can only work if you know how many items you're transforming in advance).
Specifically, consider the sample implementation of the first overload here.
The line
*d_first++ = op(*first1++);
requires the destination iterator already to be valid. If it's == end() as suggested, the whole operation is illegal.
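For the other option mentioned above (a destination that already has enough elements), a rough sketch could look like this. The function name doubled and the use of int values are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: doubles every value. The destination is sized up
// front, so writing through out.begin() overwrites existing elements.
std::vector<int> doubled(const std::vector<int>& in) {
    std::vector<int> out(in.size());   // holds in.size() zero-initialized ints
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int x) { return 2 * x; });
    return out;
}
```

This only works when the number of results is known before the call, which is not the case for a regex iterator range.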

Related

Returning nullptr iterators, how to cast them

I'm having some trouble solving an issue in my program. So currently each chunk will return an iterator, but the iterator depends on two cases:
the desired element is found in the chunk: return resultIter;
the desired element is not found in the chunk: return nullptr;
The first case is simple enough and easy to solve, but the second case is where I am running into trouble. Given a template argument InIter, how can I convert a nullptr into the InIter category?
template< typename InIter, ...>
InIter func(...) {
InIter res = //returns iter to found element if found
loop(...) //if so, a token will be changed to signify a cancellation
if(token.was_cancelled())
return res; //easy enough
return nullptr; //doesn't work
}
which gives me this error:
'nullptr': all return expressions in a lambda must have the same type:
previously it was 'test::test_iterator'
It makes sense, I can't suddenly switch return types in the middle of a lambda function, but I don't know how to solve this. Note that the code above is a very simplified version of the issue at hand; in its actual implementation it is inside a lambda and part of a much bigger function call. However, this is the only relevant portion.
I've also tried:
return InIter(nullptr);
return (InIter)(nullptr);
return NULL;
return InIter(NULL);
...
Of course none of these work. There has to be an easy way to do this that I'm just not seeing?
The expected pattern for using iterators, is that if you want to report that you found no match, you would return the iterator that points to the end of your sequence.
So if you called:
InIter res = find_an_iterator_meeting_an_interesting_condition(begin, end);
and it found no match, you would return end. The caller would be responsible for checking that condition.
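The end-as-sentinel pattern described above can be sketched roughly as follows. The function find_first_even and its "even value" condition are made up for illustration; the point is only that both return statements have the same iterator type.

```cpp
#include <cassert>
#include <vector>

// Hypothetical search helper: returns an iterator to the first even value,
// or `last` when no element matches (the conventional "not found" result).
template <typename InIter>
InIter find_first_even(InIter first, InIter last) {
    for (; first != last; ++first) {
        if (*first % 2 == 0) {
            return first;   // found: iterator to the match
        }
    }
    return last;            // not found: same type as the success case
}
```

The caller then compares the result against the end iterator it passed in before dereferencing it.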
There are two approaches.
First, the standard approach, is that when working with iterators, you are actually working with a range of iterators (from a begin, to an end).
In that case, failure to find something would consist of returning end.
In some extreme corner cases this isn't the right thing to do (imagine you ask "where is the right place to insert Y?" and the answer isn't "at the end of the sequence" but rather "somewhere completely different").
In that case, something like boost::optional is the right answer -- your function returns an optional<Iterator>. Then you can return a nullopt to mean "no answer is valid", and an iterator if an answer is valid.
An optional type was proposed for C++14 and was eventually standardized as std::optional in C++17.
A "poor man's optional" is a std::pair<bool, Iterator>, where you ignore the .second's value if the .first is false. If you have no access to boost, I'd advise reimplementing optional rather than using this technique.

efficient way to remove a list of string from a big vector

I am using Visual Studio 2012 (Windows) and I am trying to write an efficient C++ function to remove some words from a big vector of strings.
I am using STL algorithms. I am a C++ beginner, so I am not sure that this is the best way to proceed. This is what I have done:
#include <algorithm>
#include <unordered_set>
using std::vector;
vector<std::string> stripWords(vector<std::string>& input,
std::tr1::unordered_set<std::string>& toRemove){
input.erase(
remove_if(input.begin(), input.end(),
[&toRemove](std::string x) -> bool {
return toRemove.find(x) != toRemove.end();
}));
return input;
}
But this doesn't work: it doesn't process the whole input vector.
This is how I test my code:
vector<std::string> in_tokens;
in_tokens.push_back("removeme");
in_tokens.push_back("keep");
in_tokens.push_back("removeme1");
in_tokens.push_back("removeme1");
std::tr1::unordered_set<std::string> words;
words.insert("removeme");
words.insert("removeme1");
stripWords(in_tokens,words);
You need the two-argument form of erase. Don't outsmart yourself and write it on separate lines:
auto it = std::remove_if(input.begin(), input.end(),
[&toRemove](std::string x) -> bool
{ return toRemove.find(x) != toRemove.end(); });
input.erase(it, input.end()); // erases an entire range
Your approach using std::remove_if() is nearly the correct approach but it erases just one element. You need to use the two argument version of erase():
input.erase(
remove_if(input.begin(), input.end(),
[&toRemove](std::string x) -> bool {
return toRemove.find(x) != toRemove.end();
}), input.end());
std::remove_if() reorders the elements such that the kept elements are in the front of the sequence. It returns an iterator it to the first position which is to be considered the new end of the sequence, i.e., you need to erase the range [it, input.end()).
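Putting the pieces together, a self-contained sketch of the erase-remove idiom might look like this. It assumes a C++11 compiler and therefore uses std::unordered_set rather than the std::tr1 version from the question; it also modifies the vector in place instead of returning a copy.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Removes every string in `input` that appears in `toRemove`, in place.
void stripWords(std::vector<std::string>& input,
                const std::unordered_set<std::string>& toRemove) {
    // remove_if shifts the kept elements to the front and returns the new
    // logical end; the elements after it are left in an unspecified state.
    auto newEnd = std::remove_if(input.begin(), input.end(),
        [&toRemove](const std::string& word) {
            return toRemove.count(word) != 0;
        });
    // The two-argument erase drops the whole tail [newEnd, end()).
    input.erase(newEnd, input.end());
}
```

Taking the lambda parameter by const reference also avoids copying each string during the scan.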
You've already gotten a couple of answers about how to do this correctly.
Now, the question is whether you can make it substantially more efficient. The answer to that will depend on another question: do you care about the order of the strings in the vector?
If you can rearrange the strings in the vector without causing a problem, then you can make the removal substantially more efficient.
Instead of removing strings from the middle of the vector (which requires moving all the other strings over to fill in the hole) you can swap all the unwanted strings to the end of the vector, then remove them.
Especially if you're only removing a few strings from near the beginning of a large vector, this can improve efficiency a lot. Just for example, let's assume a string you want to remove is followed by 1000 other strings. With this, you end up swapping only two strings, then erasing the last one (which is fast). With your current method, you end up moving 1000 strings just to remove one.
Better still, even with fairly old compilers, you can expect swapping strings to be quite fast as a rule--typically faster than moving them would be (unless your compiler is new enough to support move assignment).
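One way to sketch the swap-to-the-end idea is with std::partition, which groups the kept strings at the front without preserving their relative order. This is an assumption about how the idea could be implemented, not necessarily what the answer had in mind, and whether it actually beats remove_if for a given workload is worth measuring.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Removes unwanted strings without preserving the order of the rest.
void stripWordsUnordered(std::vector<std::string>& input,
                         const std::unordered_set<std::string>& toRemove) {
    // partition swaps elements so that everything satisfying the predicate
    // (the strings to keep) comes first; it returns the start of the rest.
    auto firstUnwanted = std::partition(input.begin(), input.end(),
        [&toRemove](const std::string& word) {
            return toRemove.count(word) == 0;   // true means "keep"
        });
    input.erase(firstUnwanted, input.end());
}
```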

Compare the Current And Next Element of A set

I want to compare the current and next element of a set of addresses. I tried the following code:
struct Address{
string state;
string city;
};
if((*it).state == (*(it+1)).state){
}
But the compiler reported no match for operator+ in "it+1". On cplusplus.com I found that the + operator is not supported for set iterators. So I am unable to figure out a way to access both the current and the next element of a set in the same if statement.
But ++ is provided, so you can write:
?::iterator next = it;
next++;
Just create a copy of the iterator, advance it (++), then compare. Or, if your standard library has it, you can use the C++11 next function from the <iterator> header.
if(it->state == std::next(it)->state)
As you already found out the operator + is not supported for std::set iterators, since those are only bidirectional iterators and not random access iterators. So if you want to access the next element at the same time as the current one you have to make a copy and increment that one:
std::set<Address>::iterator next_it = it;
++next_it;
if(it->state == (next_it)->state)
If you are using C++11, this code can be simplified using the std::next function found in <iterator> (which basically does the same thing):
if(it->state == std::next(it)->state)
Of course, writing that function is pretty trivial, so you could always write your own next when coding pre-C++11.
Also: Remember to make sure that the next iterator isn't equal to set.end()
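A small sketch that combines std::next with the end check. The operator< for Address and the helper countSameStateNeighbours are assumptions added so the example compiles on its own; the question does not show how the set is ordered.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <tuple>

struct Address {
    std::string state;
    std::string city;
};

// Hypothetical ordering so Address can live in a std::set:
// sort by state, then by city.
bool operator<(const Address& a, const Address& b) {
    return std::tie(a.state, a.city) < std::tie(b.state, b.city);
}

// Counts neighbouring pairs in the set that share a state.
std::size_t countSameStateNeighbours(const std::set<Address>& addresses) {
    std::size_t count = 0;
    if (addresses.empty()) return count;   // begin() == end(): nothing to compare
    // Stop as soon as the "next" iterator would be end().
    for (auto it = addresses.begin(); std::next(it) != addresses.end(); ++it) {
        if (it->state == std::next(it)->state) ++count;
    }
    return count;
}
```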

Erase final member of std::set

How can I delete the last member from a set?
For example:
set<int> setInt;
setInt.insert(1);
setInt.insert(4);
setInt.insert(3);
setInt.insert(2);
How can I delete 4 from setInt? I tried something like:
setInt.erase(setInt.rbegin());
but I received an error.
In C++11:
setInt.erase(std::prev(setInt.end()));
You can decide how you want to handle cases where the set is empty.
if (!setInt.empty()) {
std::set<int>::iterator it = setInt.end();
--it;
setInt.erase(it);
}
By the way, if you're doing this a lot (adding things to a set in arbitrary order and then removing the top element), you could also take a look at std::priority_queue, see whether that suits your usage.
Edit: You should use std::prev as shown in Benjamin's better answer instead of the older style suggested in this answer.
I'd propose using a different name for rbegin which has a proper type:
setInt.erase(--setInt.end());
Assuming you checked that setInt is not empty!
Btw. this works because you can call the mutating decrement operator on a temporary (of type std::set<int>::iterator). This temporary will then be passed to the erase function.
A bit less performant, but an alternative option:
setInt.erase(*setInt.rbegin());
If you want to delete 4 instead of the last you should use the find method.
Depending on the use case 4 might not be the last.
std::set<int>::iterator it = setInt.find(4);
if(it != setInt.end()) {
setInt.erase(it);
}
If you want to delete the last element use:
if (!setInt.empty()) {
setInt.erase(--setInt.rbegin().base());
// line above is equal to
// setInt.erase(--setInt.end());
}
While I was not sure if --*.end(); is O.K. I did some reading.
So the -- on rbegin().base() leads to the same result as -- on end().
And both should work.
Check whether the set is empty. If it isn't, take the end() iterator, decrement it so that it points at the last element, and erase that element.
if (!setInt.empty())
{
std::set<int>::iterator it = setInt.end();
--it;
if(it != setInt.end()) {
setInt.erase(it);
}
}
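For reference, the std::prev variant with the empty check wrapped into a function might look like this. The helper name eraseLast and its bool return value are assumptions for the sake of the example.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Erases the largest element of the set, if there is one.
// Returns true when an element was removed.
bool eraseLast(std::set<int>& s) {
    if (s.empty()) return false;    // std::prev(end()) is invalid on an empty set
    s.erase(std::prev(s.end()));    // end() - 1 is the last (largest) element
    return true;
}
```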

c++ insert into vector at known position

I wish to insert into a c++ vector at a known position. I know the c++ library has an insert() function that takes a position and the object to insert but the position type is an iterator. I wish to insert into the vector like I would insert into an array, using a specific index.
This should do what you want.
vector<int>myVec(3);
myVec.insert(myVec.begin() + INTEGER_OFFSET, DATA);
Please be aware that iterators may get invalidated when the vector gets reallocated. Please see this site.
EDIT: I'm not sure why the other answer disappeared...but another person mentioned something along the lines of:
myVec.insert(INDEX, DATA);
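As far as I know, though, std::vector::insert only accepts an iterator for the position, so this form will not compile; stick with myVec.begin() + INDEX.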
It's always nice to wrap these things up:
template <typename T>
T& insert_at(T& pContainer, size_t pIndex, const typename T::value_type& pValue)
{
pContainer.insert(pContainer.begin() + pIndex, pValue);
return pContainer;
}
That should do it. There is a now-deleted answer claiming that you can construct an iterator from an index, but I've never seen that before. If that's true, that's definitely the way to go; I'm looking for it now.
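A slightly more defensive sketch of such a wrapper, assuming a container with random-access iterators such as std::vector. The assert on the index and the explicit cast to difference_type are additions for illustration, not part of the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts `value` so that it ends up at position `index`.
// Requires random-access iterators (e.g. std::vector, std::deque).
template <typename Container>
Container& insert_at(Container& container, std::size_t index,
                     const typename Container::value_type& value) {
    assert(index <= container.size());   // index == size() appends at the end
    typedef typename Container::difference_type diff_t;
    container.insert(container.begin() + static_cast<diff_t>(index), value);
    return container;
}
```

An index past size() would produce an iterator outside the container, so it is checked before the insert.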
Look at that debugging trace. The last thing that's executed is std::copy(__first=0x90c6fa8, __last=0x90c63bc, __result=0x90c6878). Looking back at what caused it, you called insert giving the position to insert at as 0x90c63bc. std::copy copies the range [first, last) to result, which must have room for last - first elements. This call has last < first, which is illegal (!), so I'm guessing that the position you're giving to insert at is wrong. Are you sure vnum hasn't underflowed somewhere along the line? In GDB with that trace showing, you should run
frame 10
print vnum
to check. In fact, if you haven't just abbreviated in your question, I've just found your bug. Your second line is:
new_mesh->Face(face_loc)->vertices.insert(vertices.begin()+vnum+1, new_vertices[j]);
It should have been:
new_mesh->Face(face_loc)->vertices.insert(new_mesh->Face(face_loc)->vertices.begin()+vnum+1, new_vertices[j]);
The first line gives the insertion point relative to the start of some other variable called vertices, not the one you want to insert into.