I've discovered a strange behaviour and I don't know why it happens. Look at this code:
std::map<size_t, std::vector<Resource*>> bla;
for(int i = 0; i<100000; i++) {
std::vector<Resource*> blup;
blup.push_back(new Resource());
bla[i] = blup;
}
for (auto& resources : bla) {
for (auto resource : resources.second) {
delete resource; // <---- this delete here
}
resources.second.clear();
}
bla.clear(); // <---- this clear here
If I run this program in the Eclipse debugger, the clear() on the last line takes several seconds (too long, in my opinion). For bigger maps (>10M elements) it takes up to several minutes(!).
But if I comment out the delete statement in the inner loop, clear() becomes very fast (as fast as I expect it to be).
Without the debugger, the code is fast in both cases (with and without delete).
The class "Resource" is a small container class containing two uint64_t members and a std::string (and, of course, some methods).
Why is clear() slow, and why does it speed up if I don't delete the pointers? I don't understand it, because I even clear the vectors in the map. So the map should not see any pointers; it is a map from size_t to std::vector.
I am using MinGW-w64 on Windows (msys64, msys2-x86_64-20210419) and Eclipse.
I have now switched to a plain map without vectors, and everything is fine: clear() is super fast even if I delete the pointers. But without vectors I have to rethink my algorithm a little.
I have a data structure like this:
struct My_data
{
MyArray<float> points;
MyArray<float> normals;
MyArray<float> uvCoords;
};
This function can be used to free them:
void ClearAlembicData(My_data* myData)
{
myData->points.clear();
myData->normals.clear();
myData->uvCoords.clear();
}
I want to clean myData asynchronously so that the program does not wait until all the xxx.clear() calls are done. Here is my actual code:
My_data myData;
myData.points.push_back(point);
myData.normals.push_back(normal);
myData.uvCoords.push_back(uvCoord);
ClearAlembicData(&myData);
myData.points.push_back(point);
myData.normals.push_back(normal);
myData.uvCoords.push_back(uvCoord);
Could you show me how to do this in C++? Thanks.
Depending on the definition of MyArray, this will either cause undefined behavior or be completely pointless.
If the container is thread-safe, it will block your push_back while the clear is being executed in the other thread, which makes clearing it asynchronously completely pointless. If it is not, you are introducing a race condition that might even crash your program, because you would be concurrently manipulating a shared resource (not a good idea).
If you still want to do it, here is a way that might work:
Make 'myData' into a pointer and operate on that.
When you want to clear, store that pointer in another pointer variable and replace the original pointer with a new My_data.
Pass the stored pointer to your asynchronous 'free' function.
Continue to work with your new data structure in the original thread.
This way, you are not working on a shared resource and asynchronous freeing becomes feasible.
As for 'how to', if you've got C++11, something like this would work (Pseudo-Code).
My_data *myData = new My_data();
myData->points.push_back(point);
myData->normals.push_back(normal);
myData->uvCoords.push_back(uvCoord);
// [=] copies the current pointer value, so the thread owns the old object
std::thread([=](){ ClearAlembicData(myData); delete myData; }).detach();
myData = new My_data();
myData->points.push_back(point);
myData->normals.push_back(normal);
myData->uvCoords.push_back(uvCoord);
delete myData;
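The pointer-replacement steps above can also be written without raw new/delete. This is only a sketch under assumptions: MyArray is taken to be an alias for std::vector (the real type is not shown in the question), and releaseAsync is a made-up helper name.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <utility>
#include <vector>

// Assumption: MyArray behaves like std::vector; the original type is not shown.
template <typename T>
using MyArray = std::vector<T>;

struct My_data {
    MyArray<float> points;
    MyArray<float> normals;
    MyArray<float> uvCoords;
};

// Hypothetical helper: moves the old object to a background task and
// leaves a fresh, empty My_data in its place for the calling thread.
std::future<void> releaseAsync(std::unique_ptr<My_data>& data) {
    std::unique_ptr<My_data> old = std::move(data);
    data.reset(new My_data());  // the caller keeps working on this one
    // std::async moves 'old' into the task, so the worker is the sole owner
    // and the expensive destruction happens on the worker thread.
    return std::async(std::launch::async,
                      [](std::unique_ptr<My_data> doomed) { doomed.reset(); },
                      std::move(old));
}
```

Keep the returned future alive for as long as you want the cleanup to run in the background: the destructor of a future obtained from std::async blocks until the task has finished.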
One solution is to implement MyArray::swap(MyArray&). Then:
void ClearAlembicData(My_data* myData)
{
MyArray<float> old_points;
MyArray<float> old_normals;
MyArray<float> old_coords;
// Fast swap, myData arrays become empty
myData->points.swap(old_points);
myData->normals.swap(old_normals);
myData->uvCoords.swap(old_coords);
// Assumed to be passed to async function
old_points.clear();
old_normals.clear();
old_coords.clear();
}
There is not enough information about MyArray's capabilities, such as support for move semantics or what clear() really does, so this is just an idea.
If MyArray is a wrapper around std::vector, note that clear() is effectively O(1) here: float is trivially destructible, so the compiler can eliminate the loop that destroys the elements individually.
If it is not, consider implementing MyArray::clear() so that it marks the container as having zero elements while keeping a non-zero capacity. As stated above, std::vector accomplishes this by simply resetting its size (conceptually, slots_available = capacity) and leaving the allocated storage in place.
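To illustrate the point about clear(), here is a small sketch using plain std::vector<float> (an assumption, since MyArray itself is not shown). The helper names are made up. On the mainstream standard library implementations, clear() leaves the capacity untouched, while swapping with an empty temporary is the classic way to actually give the buffer back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: clears the vector and reports the capacity it still holds.
std::size_t capacityAfterClear(std::vector<float>& v) {
    v.clear();            // size becomes 0; no per-element work is needed for float
    return v.capacity();  // the storage is still owned by the vector
}

// Hypothetical helper: actually releases the storage.
void releaseStorage(std::vector<float>& v) {
    std::vector<float>().swap(v);  // the temporary takes the buffer and frees it
}
```

So if clear() on a float array is slow, the cost is more likely in the MyArray implementation than in the element destruction itself.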
I am processing a really large text file in the following way:
class Loader{
template<class READER>
bool loadFile(READER &reader){
/* for each line of the input file */ {
processLine_(line);
}
}
bool processLine_(std::string_view line){
std::vector<std::string> set; // <-- here
std::string buffer; // <-- here
// I can not do set.reserve(),
// because I have no idea how much items I will put.
// do something...
}
void printResult(){
// print aggregated result
}
};
Processing 143,000,000 records takes around 68 minutes.
So I decided to try some very tricky optimizations with several std::array buffers. The result was about 62 minutes.
However, the code became very unreadable, so I decided not to use them in production.
Then I decided to do a partial optimization, e.g.:
class Loader{
template<class READER>
bool loadFile(READER &reader);
std::vector<std::string> set; // <-- here
std::string buffer; // <-- here
bool processLine_(std::string_view line){
set.clear();
// do something...
}
void printResult();
};
I was hoping this would reduce the malloc / free (new[] / delete[]) operations for buffer and for the set vector. I realize the strings inside the set vector still allocate memory dynamically.
However, the run time went up to 83 minutes.
Note that I did not change anything except moving set and buffer to class level. I use them only inside the processLine_ method.
Why is that?
Locality of reference?
The only explanation I can think of is that some strings are small enough to fit in SSO, but this sounds unlikely.
Using clang with -O3
I profiled the code and found that most of the time is spent in a third-party C library.
I had assumed this library was very fast, but that is not the case.
I am still puzzled by the slowdown, but even if I optimize it, it won't make a big difference.
In my code, I have something like this:
vector<SuperHeavyObject> objects; // objects in this vector are extremely slow to copy!
for (auto &objectGroup : objectGroups) {
vector<SuperHeavyObject> objectsInThisGroup;
for (size_t index : objectGroup) {
objectsInThisGroup.push_back(objects[index]); // slow as copying is needed!
}
doSomething(objectsInThisGroup.begin(), objectsInThisGroup.end());
}
What I'd really want is something like this:
vector<SuperHeavyObject> objects; // objects in this vector are extremely slow to copy!
for (auto &objectGroup : objectGroups) {
vector<SuperHeavyObject*> objectsInThisGroup; // pointers!
for (size_t index : objectGroup) {
objectsInThisGroup.push_back(&objects[index]); // not slow anymore
}
doSomething(magicIterator(objectsInThisGroup.begin()),
magicIterator(objectsInThisGroup.end()));
}
doSomething is allowed to copy the objects, so there's no scope problem. Inside doSomething is the only place where I'd like copying to take place, because these objects really are very slow to copy (I've profiled and it's a chokepoint).
At the same time, I don't want to change the signature of doSomething to accept iterators that dereference SuperHeavyObject*, because that would require a lot of changes; dereferencing to SuperHeavyObject would be ideal, as it would only happen at one place (where copying happens).
My question is: I could write an iterator like this myself, but it feels like I'm reinventing the wheel. Does C++ (11) have facilities to do this? I also have Boost available, if anyone knows of something like this there.
This seems like a legitimate use case for std::reference_wrapper [1]:
vector<SuperHeavyObject> objects;
for (auto &objectGroup : objectGroups) {
vector<std::reference_wrapper<SuperHeavyObject>> objectsInThisGroup;
for (size_t index : objectGroup) {
// fast, we are only storing reference-like objects
objectsInThisGroup.push_back(objects[index]);
}
doSomething(objectsInThisGroup.begin(), objectsInThisGroup.end());
}
[1] C++11 required.
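For anyone who wants to try this out, here is a self-contained sketch. The SuperHeavyObject stand-in (with a copy counter), doSomething and sumGroup are all made up for illustration; only std::reference_wrapper comes from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the expensive type; counts copies for illustration.
struct SuperHeavyObject {
    int id;
    static int copies;
    explicit SuperHeavyObject(int i) : id(i) {}
    SuperHeavyObject(const SuperHeavyObject& o) : id(o.id) { ++copies; }
};
int SuperHeavyObject::copies = 0;

// Hypothetical doSomething: the only place where a copy is made.
template <typename It>
int doSomething(It first, It last) {
    int sum = 0;
    for (; first != last; ++first) {
        SuperHeavyObject copy = *first;  // reference_wrapper converts to SuperHeavyObject&
        sum += copy.id;
    }
    return sum;
}

// Builds the group from reference-like objects, so nothing is copied here.
int sumGroup(std::vector<SuperHeavyObject>& objects,
             const std::vector<std::size_t>& group) {
    std::vector<std::reference_wrapper<SuperHeavyObject>> refs;
    for (std::size_t index : group) {
        refs.push_back(objects[index]);
    }
    return doSomething(refs.begin(), refs.end());
}
```

One caveat: the implicit conversion covers copying and passing to functions that take SuperHeavyObject&, but member access through the iterator (first->id) does not compile; you would need (*first).get().id there.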
Thanks @matteo-italia for your helpful answer! I used it for a while, then decided to look more closely at Boost's iterators and found indirect_iterator, which is also a good way to do what I want.
"indirect_iterator adapts an iterator by applying an extra dereference inside of operator*()"
http://www.boost.org/doc/libs/1_59_0/libs/iterator/doc/indirect_iterator.html
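To show what "an extra dereference inside of operator*()" means, here is a hand-rolled sketch of such an adaptor. It is not Boost's implementation, just a minimal forward iterator under the made-up name DerefIterator; magicIterator is the name used in the question. With Boost available, boost::make_indirect_iterator from boost/iterator/indirect_iterator.hpp would be the thing to use instead.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical minimal adaptor: wraps an iterator over pointers and
// dereferences twice, in the spirit of boost::indirect_iterator.
template <typename PtrIt>
class DerefIterator {
public:
    typedef typename std::iterator_traits<PtrIt>::value_type pointer_type;  // T*
    typedef typename std::remove_pointer<pointer_type>::type value_type;    // T
    typedef value_type& reference;
    typedef value_type* pointer;
    typedef std::ptrdiff_t difference_type;
    typedef std::forward_iterator_tag iterator_category;

    explicit DerefIterator(PtrIt it) : it_(it) {}

    reference operator*() const { return **it_; }  // the extra dereference
    pointer operator->() const { return *it_; }
    DerefIterator& operator++() { ++it_; return *this; }
    DerefIterator operator++(int) { DerefIterator tmp(*this); ++it_; return tmp; }
    bool operator==(const DerefIterator& o) const { return it_ == o.it_; }
    bool operator!=(const DerefIterator& o) const { return it_ != o.it_; }

private:
    PtrIt it_;
};

// Convenience factory so the iterator type is deduced at the call site.
template <typename PtrIt>
DerefIterator<PtrIt> magicIterator(PtrIt it) {
    return DerefIterator<PtrIt>(it);
}
```

With this, doSomething(magicIterator(v.begin()), magicIterator(v.end())) sees SuperHeavyObject& when it dereferences, while v only stores pointers.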
Let Action be a class with an is_finished method and a numeric tag property.
Let this->vactions be a std::vector<Action>.
The intent is to iterate over the vector, identify the Actions that are finished, store their tags in a std::vector<unsigned int>, and delete those actions.
I played with lambdas a little and came up with some code that reads nicely but causes memory corruption. The "extended" version, on the other hand, works as expected.
I suspect foul play in the remove_if part, but for the life of me I can't figure out what's wrong.
Here's the example code.
This causes memory corruption:
std::vector<unsigned int> tags;
auto is_finished=[p_delta](Action& action) -> bool {return action.is_finished();};
//This is supposed to put the finished actions at the end of the vector and return
//an iterator to the first element that is finished.
std::vector<Action>::iterator nend=remove_if(this->vactions.begin(), this->vactions.end(), is_finished);
auto store_tag=[&tags](Action& action)
{
if(action.has_tag())
{
tags.push_back(action.get_tag());
}
};
//Store the tags...
for_each(nend, this->vactions.end(), store_tag);
//Erase the finished ones, they're supposed to be at the end.
this->vactions.erase(nend, this->vactions.end());
if(tags.size())
{
auto do_something=[this](unsigned int tag){this->do_something_with_tag(tag);};
for_each(tags.begin(), tags.end(), do_something);
}
This, on the other hand, works as expected:
std::vector<Action>::iterator ini=this->vactions.begin(),
end=this->vactions.end();
std::vector<unsigned int> tags;
while(ini < end)
{
if( (*ini).is_finished())
{
if((*ini).has_tag())
{
tags.push_back((*ini).get_tag());
}
ini=this->vactions.erase(ini);
end=this->vactions.end();
}
else
{
++ini;
}
}
if(tags.size())
{
auto do_something=[this](unsigned int tag){this->do_something_with_tag(tag);};
for_each(tags.begin(), tags.end(), do_something);
}
I am sure there's some rookie mistake here. Can you help me spot it?
I thought that for_each could be updating my nend iterator, but I found no information about that. What if it did? Could the vector try to erase beyond the "end" point?
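std::remove_if does not preserve the values of the elements that are to be removed (see cppreference): the elements in the range from nend to end() are left in a valid but unspecified state, so reading their tags afterwards is not reliable. Either collect the tag values before calling remove_if, as you effectively do in the second case, or use std::partition instead.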
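A sketch of the std::partition variant, assuming a minimal Action class that I made up to match the interface described in the question (is_finished, has_tag, get_tag); remove_finished is a hypothetical free function standing in for the member code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal Action matching the interface described in the question.
class Action {
public:
    Action(unsigned int tag, bool finished) : tag_(tag), finished_(finished) {}
    bool is_finished() const { return finished_; }
    bool has_tag() const { return tag_ != 0; }
    unsigned int get_tag() const { return tag_; }
private:
    unsigned int tag_;
    bool finished_;
};

// Removes the finished actions and returns the tags they carried.
std::vector<unsigned int> remove_finished(std::vector<Action>& actions) {
    // partition swaps elements, so every element keeps a meaningful value:
    // unfinished actions come first, finished ones last.
    auto first_finished = std::partition(
        actions.begin(), actions.end(),
        [](const Action& a) { return !a.is_finished(); });

    std::vector<unsigned int> tags;
    for (auto it = first_finished; it != actions.end(); ++it) {
        if (it->has_tag()) {
            tags.push_back(it->get_tag());
        }
    }
    actions.erase(first_finished, actions.end());
    return tags;
}
```

Note that std::partition does not keep the relative order of the elements; if the order of the remaining actions matters, std::stable_partition is the drop-in alternative.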
I've stumbled across this great post about validating parameters in C#, and now I wonder how to implement something similar in C++. The main thing I like about this approach is that it does not cost anything until the first validation fails, as the Begin() function returns null and the other functions check for this.
Obviously, I can achieve something similar in C++ using Validate* v = 0; IsNotNull(v, ...).IsInRange(v, ...) and have each of them pass on the v pointer, plus return a proxy object for which I duplicate all functions.
Now I wonder whether there is a similar way to achieve this without temporary objects, until the first validation fails. Though I'd guess that allocating something like a std::vector on the stack should be for free (is this actually true? I'd suspect an empty vector does no allocations on the heap, right?)
Other than the fact that C++ does not have extension methods (which makes it harder to add new validations), it shouldn't be too hard.
class Validation
{
vector<string> *errors;
void AddError(const string &error)
{
if (errors == NULL) errors = new vector<string>();
errors->push_back(error);
}
public:
Validation() : errors(NULL) {}
~Validation() { delete errors; }
const Validation &operator=(const Validation &rhs)
{
if (errors == NULL && rhs.errors == NULL) return *this;
if (rhs.errors == NULL)
{
delete errors;
errors = NULL;
return *this;
}
vector<string> *temp = new vector<string>(*rhs.errors);
std::swap(temp, errors);
delete temp; // temp now points to the old vector (or NULL); free it
return *this;
}
void Check()
{
if (errors)
throw exception();
}
template <typename T>
Validation &IsNotNull(T *value)
{
if (value == NULL) AddError("Cannot be null!");
return *this;
}
template <typename T, typename S>
Validation &IsLessThan(T valueToCheck, S maxValue)
{
if (!(valueToCheck < maxValue)) AddError("Value is too big!");
return *this;
}
// etc..
};
class Validate
{
public:
static Validation Begin() { return Validation(); }
};
Usage:
Validate::Begin().IsNotNull(somePointer).IsLessThan(4, 30).Check();
Can't say much to the rest of the question, but I did want to point out this:
Though I'd guess that allocating
something like a std::vector on the
stack should be for free (is this
actually true? I'd suspect an empty
vector does no allocations on the
heap, right?)
No. You still have to allocate the vector's own member variables (such as storage for the length), and I believe it is up to the implementation whether it pre-allocates any room for elements upon construction. Either way, you are allocating SOMETHING, and while it may not be much, allocation is never "free", regardless of whether it takes place on the stack or the heap.
That being said, I would imagine that the time taken is so minimal that it will only really matter if you are doing it many, many times in quick succession.
I recommend taking a look at Boost.Exception, which provides basically the same functionality (adding arbitrarily detailed exception information to a single exception object).
Of course, you'll need to write some utility methods to get the interface you want. But beware: dereferencing a null pointer in C++ results in undefined behavior, and null references must not even exist. So you cannot return a null pointer the way your linked example uses null references in C# extension methods.
As for the zero-cost aspect: a simple stack allocation is quite cheap, and a boost::exception object does not do any heap allocation itself; it allocates only when you attach error_info<> objects to it. So it is not exactly zero cost, but nearly as cheap as it can get (one vtable pointer for the exception object, plus sizeof(intrusive_ptr<>)).
Therefore this should be the last part where one tries to optimize further...
Re the linked article: apparently, the overhead of creating objects in C# is so great that function calls are free in comparison.
I'd personally propose a syntax like
Validate().ISNOTNULL(src).ISNOTNULL(dst);
Validate() constructs a temporary object which is basically just a std::list of problems. Empty lists are quite cheap (no nodes, size=0). ~Validate will throw if the list is not empty. If profiling shows even this is too expensive, then you just change the std::list to a hand-rolled list. Remember, a pointer is an object too. You're not saving an object just by sticking to the unfortunate syntax of a raw pointer. Conversely, the overhead of wrapping a raw pointer with a nice syntax is purely a compile-time price.
PS. ISNOTNULL(x) would be a #define for IsNotNull(x,#x) - similar to how assert() prints out the failed condition, without having to repeat it.
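Roughly, the proposal above could look like the sketch below. The message format and the choice of std::invalid_argument are my assumptions, not part of the answer. Also note one addition: since C++11 destructors are implicitly noexcept, so a throwing ~Validate has to be declared noexcept(false), otherwise the program terminates.

```cpp
#include <cassert>
#include <list>
#include <stdexcept>
#include <string>

class Validate {
public:
    Validate() {}

    // Must be noexcept(false) to be allowed to throw (C++11 and later).
    // Caveat: if this runs while another exception is already unwinding
    // the stack, throwing here terminates the program.
    ~Validate() noexcept(false) {
        if (!problems_.empty()) {
            std::string message;
            for (const std::string& p : problems_) {
                message += p + "; ";
            }
            throw std::invalid_argument(message);
        }
    }

    // Records a problem instead of failing immediately, so calls can be chained.
    template <typename T>
    Validate& IsNotNull(const T* value, const char* name) {
        if (value == nullptr) {
            problems_.push_back(std::string(name) + " is null");
        }
        return *this;
    }

private:
    std::list<std::string> problems_;  // empty in the common (valid) case
};

// Stringizes the argument so the failing expression shows up in the message.
#define ISNOTNULL(x) IsNotNull(x, #x)
```

Usage is then Validate().ISNOTNULL(src).ISNOTNULL(dst); and the temporary's destructor reports all collected problems at the end of the statement.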