How to find the memory used by any object - C++

#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

class Help
{
public:
    Help();
    ~Help();

    typedef std::set<std::string> Terms;
    typedef std::map<std::string, std::pair<int, Terms> > TermMap;
    typedef std::multimap<int, std::string, std::greater<int> > TermsMap;

private:
    TermMap terms;
    TermsMap termsMap;
};
How can we find the memory used (in bytes) by the objects terms and termsMap? Is there a library for this?

If you are looking for the full memory usage of an object, this can't be solved in general in C++ - while we can get the size of an instance itself via sizeof(), the object can always allocate memory dynamically as needed.
If you can find out how big the individual elements in a container are, you can get a lower bound:
size = sizeof(map<type>) + sum_of_element_sizes;
Keep in mind though that the containers can still allocate additional memory as an implementation detail, and that for containers like vector and string you have to account for the allocated capacity, not just the size.
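For the TermMap from the question, such a lower-bound estimate might look like the sketch below (the helper name is mine; per-node overhead of the tree-based containers and allocator bookkeeping are deliberately ignored, so the real figure will be larger):
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

typedef std::set<std::string> Terms;
typedef std::map<std::string, std::pair<int, Terms> > TermMap;

// Lower-bound estimate: the container object itself plus the payload of every
// element. Per-node overhead (tree pointers, colour bits) is NOT included.
std::size_t estimateTermMapBytes(const TermMap& terms)
{
    std::size_t bytes = sizeof(terms);
    for (TermMap::const_iterator it = terms.begin(); it != terms.end(); ++it)
    {
        bytes += sizeof(*it);            // the key/value pair stored in each node
        bytes += it->first.capacity();   // heap buffer behind the key string
        const Terms& t = it->second.second;
        for (Terms::const_iterator s = t.begin(); s != t.end(); ++s)
            bytes += sizeof(*s) + s->capacity();   // each term string
    }
    return bytes;
}
With small-string optimization the capacity() bytes may live inside the string object itself, so this can over-count slightly; it is only meant as a rough estimate.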

How can we find the memory used (in bytes) by the objects terms and termsMap? Is there a library for this?
You should use your own allocator type.
typedef std::set<string, std::less<string>,
        your_allocator_1_that_can_count_memory_consumption_t> Terms;
typedef std::map<string, std::pair<int, Terms>, std::less<string>,
        your_allocator_2_that_can_count_memory_consumption_t> TermMap;
typedef std::multimap<int, string, greater<int>,
        your_allocator_3_that_can_count_memory_consumption_t> TermsMap;
I have not yet checked this idea for std::string, so if it turns out to be difficult there, just use your own class fixed_string which simply wraps a char s[max_string_length] buffer.
And when your program needs to find out its memory consumption, just query your_allocator_1_that_can_count_memory_consumption_t, your_allocator_2_that_can_count_memory_consumption_t and your_allocator_3_that_can_count_memory_consumption_t.
Edited
For UncleBens I want to clarify my point.
As far as I understand the question, it is necessary to know how much memory is allocated for the std::set and std::map, including all memory allocated for the elements of the set and the map. So it is not just sizeof(terms).
So I just suggested a very simple allocator. Without going into too much details it might look like this:
template <class T>
class your_allocator_1_that_can_count_memory_consumption_t {
public:
    // ...the interface required by the standard (value_type, allocate,
    // deallocate, rebind, and so on) goes here...
private:
    std::allocator<T> std_allocator_;   // does the real allocation work
    std::size_t counted_bytes_;         // your counter of allocated bytes
};
This allocator just counts the number of allocated and deallocated bytes and, for the real allocation and deallocation, uses its member std_allocator_. I might need to debug it under gdb and set breakpoints on malloc() and free() to make sure that every allocation and deallocation actually goes through my allocator.
I would be grateful if you pointed out any problems with this idea, since I have already implemented it in a program that runs on Windows, Linux and HP-UX, and I simply ask my allocators to find out how much memory each of my containers uses.
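For completeness, here is a slightly fuller sketch of such a counting allocator, written against C++11's minimal allocator requirements (the counting_allocator name and the counter variable are mine, purely for illustration). It counts the bytes requested through the allocator, node overhead included; heap buffers owned by the std::string elements are not counted, because those go through std::string's own allocator:
#include <cstddef>
#include <functional>
#include <memory>
#include <set>
#include <string>

template <class T>
class counting_allocator {
public:
    typedef T value_type;

    explicit counting_allocator(std::size_t* counter) : counter_(counter) {}
    template <class U>
    counting_allocator(const counting_allocator<U>& other) : counter_(other.counter()) {}

    T* allocate(std::size_t n)
    {
        *counter_ += n * sizeof(T);
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n)
    {
        *counter_ -= n * sizeof(T);
        std::allocator<T>().deallocate(p, n);
    }

    std::size_t* counter() const { return counter_; }

private:
    std::size_t* counter_;   // bytes currently held by the container using this allocator
};

template <class T, class U>
bool operator==(const counting_allocator<T>& a, const counting_allocator<U>& b)
{ return a.counter() == b.counter(); }

template <class T, class U>
bool operator!=(const counting_allocator<T>& a, const counting_allocator<U>& b)
{ return !(a == b); }

int main()
{
    std::size_t terms_bytes = 0;
    typedef std::set<std::string, std::less<std::string>,
                     counting_allocator<std::string> > Terms;

    counting_allocator<std::string> alloc(&terms_bytes);
    Terms terms(alloc);                  // set(const Allocator&) is C++11
    terms.insert("hello");
    terms.insert("world");
    // terms_bytes now reflects the node memory requested by this particular set
}
Because the counter is a pointer that is copied along when the container rebinds the allocator, each container gets its own per-container total, which is what this answer is after.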

Short Answer: No
Long Answer:
-> The basic object itself: yes, via sizeof(TYPE), but this is only useful for limited things.
-> A container and its contained members: NO
If you make assumptions about the structures used to implement these objects you can estimate it. But even that is not really useful (apart from the very specific case of the vector).
The designers of the STL deliberately did not define the data structures that should be used by these containers. There are several reasons for this, but one of them (in my opinion) is to stop people making assumptions about the internals and thus try and do silly things that are not encapsulated by the interface.
So the question then comes down to why do you need to know the size?
Do you really need to know the size (unlikely, but possible)?
Or is there a task you are trying to achieve where you think you need the size?

If you're looking for the actual block of memory, the numerical value of a pointer to the object gives you its start. (Then just add the number of bytes, and you have the end of the block.)

The sizeof() operator ought to do it:
size_t bytes = sizeof(Help::TermMap);


Allocator-aware `std::array`-style container?

I'm writing some code that handles cryptographic secrets, and I've created a custom ZeroedMemory implementation of std::pmr::memory_resource which sanitizes memory on deallocation and encapsulates the magic you have to use to prevent optimizing compilers from eliding the operation. The idea was to avoid specializing std::array, because the lack of a virtual destructor means that destruction after type erasure would cause memory to be freed without being sanitized.
Unfortunately, I came to realize afterwards that std::array isn't an AllocatorAwareContainer. My std::pmr::polymorphic_allocator approach was a bit misguided, since obviously there's no room in an std::array to store a pointer to a specific allocator instance. Still, I can't fathom why allocators for which std::allocator_traits<A>::is_always_equal::value == true wouldn't be allowed, and I could easily re-implement my solution as a generic Allocator instead of the easier-to-use std::pmr::memory_resource...
Now, I could normally just use an std::pmr::vector instead, but one of the nice features of std::array is that the length of the array is part of the type. If I'm dealing with a 32-byte key, for example, I don't have to do runtime checks to be sure that the std::array<uint8_t, 32> parameter someone passed to my function is, in fact, the right length. In fact, those cast down nicely to a const std::span<uint8_t, 32>, which vastly simplifies writing functions that need to interoperate with C code because they enable me to handle arbitrary memory blocks from any source basically for free.
Ironically, std::tuple takes allocators... but I shudder to imagine the typedef needed to handle a 32-byte std::tuple<uint8_t, uint8_t, uint8_t, uint8_t, ...>.
So: is there any standard-ish type that holds a fixed number of homogeneously typed items, a la std::array, but is allocator aware (and preferably stores the items in a contiguous region, so it can be down-cast to a std::span)?
You need cooperation from both the compiler and the OS in order for such a scheme to work. P1315 is a proposal to address the compiler/language side of things. As for the OS, you have to make sure that the memory was never paged out to disk, etc. in order for this to truly zero memory.
This sounds like an XY problem. You seem to be misusing allocators. Allocators are used to handle runtime memory allocation and deallocation, not to hook stack memory. What you are trying to do, zeroing the memory after use, should really be done in a destructor. You may want to write a class Key for this:
class Key {
public:
    // ...
    ~Key()
    {
        secure_clear(*this); // for illustration
    }
    // ...
private:
    std::array<std::uint8_t, 32> key;
};
You can easily implement iterator and span support. And you don't need to play with allocators.
If you want to reduce boilerplate code and make the new class automatically iterator / span friendly, use inheritance:
class Key : public std::array<std::uint8_t, 32> {
public:
    // ...
    ~Key()
    {
        secure_clear(*this); // for illustration
    }
    // ...
};
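For reference, one common way to implement the secure_clear used above (the exact signature is up to you; this free-function form is just an illustration) is to write through a volatile pointer so the compiler cannot prove the stores are dead. Platform-specific alternatives include explicit_bzero on glibc/BSD, SecureZeroMemory on Windows, and memset_s where C11 Annex K is available.
#include <cstddef>

// Writes zeros through a volatile pointer so the compiler is not allowed to
// elide the stores as "dead" writes before the storage disappears.
// Not bulletproof, but a widely used portable trick.
inline void secure_clear(void* p, std::size_t n)
{
    volatile unsigned char* vp = static_cast<volatile unsigned char*>(p);
    while (n--)
        *vp++ = 0;
}

// In the Key class above, the destructor could then call
//     secure_clear(key.data(), key.size());
// or an overload taking Key& could simply forward to this function.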

Is there a way to distinguish what type of memory used by the object instance?

If I have this code:
#include <assert.h>

class Foo {
public:
    bool is_static();
    bool is_stack();
    bool is_dynamic();
};

Foo a;

int main()
{
    Foo b;
    Foo* c = new Foo;

    assert( a.is_static() && !a.is_stack() && !a.is_dynamic());
    assert(!b.is_static() &&  b.is_stack() && !b.is_dynamic());
    assert(!c->is_static() && !c->is_stack() &&  c->is_dynamic());

    delete c;
}
Is it possible to implement the is_stack, is_static and is_dynamic methods so that the assertions are fulfilled?
Example of use: counting the amount of memory that particular objects of type Foo use on the stack, but not counting static or dynamic memory.
This cannot be done using standard C++ facilities, which take pains to ensure that objects work the same way no matter how they are allocated.
You can do it, however, by asking the OS about your process memory map, and figuring out what address range a given object falls into. (Be sure to use uintptr_t for arithmetic while doing this.)
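As a concrete illustration of the "ask the OS" approach, on Linux you could walk /proc/self/maps and see which mapping an address falls into. This is entirely non-standard and the labels are best-effort: the main thread's stack shows up as [stack], the classic heap as [heap], but anonymous mmap regions and other threads' stacks carry different or empty labels.
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>

// Returns the label of the /proc/self/maps entry containing the address,
// e.g. "[stack]", "[heap]" or the path of the binary for static data.
std::string region_of(const void* obj)
{
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(obj);
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line))
    {
        std::istringstream in(line);
        std::uintptr_t lo = 0, hi = 0;
        char dash = 0;
        std::string perms, offset, dev, inode, label;
        in >> std::hex >> lo >> dash >> hi >> perms >> offset >> dev >> inode;
        std::getline(in, label);                    // rest of the line: path or [label]
        const std::string::size_type first = label.find_first_not_of(" \t");
        if (first != std::string::npos)
            label.erase(0, first);                  // trim leading whitespace
        if (addr >= lo && addr < hi)
            return label;
    }
    return std::string();                           // not found
}
An is_stack() for the main thread could then be approximated as region_of(this) == "[stack]", with all the caveats above.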
Scroll down to the second answer that gives a wide array of available options depending on the Operating System:
How to determine CPU and memory consumption from inside a process?
I would also recommend reading this article on Tracking Memory Allocations in C++:
http://www.almostinfinite.com/memtrack.html
Just be aware that it's a ton of work.
While the intention is good here, the approach is not the best.
Consider a few things:
- On the stack you allocate temporary variables for your methods. You don't always have to worry about how much stack you use because the lifetime of the temp variables is short.
- Related to the stack, what you usually care about is not corrupting it, which can happen if your program uses pointers and accesses data outside the intended bounds. For this type of problem an isStatic function will not help.
- For dynamic memory allocation you usually override the new/delete operators and keep a counter to track the amount of memory used (see the sketch at the end of this answer), so again, an isDynamic function might not do the trick.
- In the case of global variables (you said static, but I extended the scope a bit), which are allocated in a separate data section (neither stack nor heap), you don't always care about them because they are statically allocated and the linker will tell you at link time if you don't have enough space. Plus, you can check the map file if you really want to know the address ranges.
So most of your concerns are solved at compile time and, to be honest, you rarely care about them. The rest (dynamic memory allocation) is treated differently.
But if you insist on having those methods, you can tell the linker to generate a map file, which will give you the address ranges of all data sections, and you can use those for your purposes.
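A minimal sketch of the new/delete counting mentioned in the list above (the counter name is mine; a real tracker would also have to cover operator new[], the nothrow and aligned overloads, and make the counter thread-safe):
#include <cstddef>
#include <cstdlib>
#include <new>

static std::size_t g_dynamic_bytes = 0;   // bytes currently allocated with new

void* operator new(std::size_t size)
{
    // reserve one max-aligned slot in front of the block to remember its size
    void* p = std::malloc(size + alignof(std::max_align_t));
    if (!p)
        throw std::bad_alloc();
    *static_cast<std::size_t*>(p) = size;
    g_dynamic_bytes += size;
    return static_cast<char*>(p) + alignof(std::max_align_t);
}

void operator delete(void* p) noexcept
{
    if (!p)
        return;
    char* base = static_cast<char*>(p) - alignof(std::max_align_t);
    g_dynamic_bytes -= *reinterpret_cast<std::size_t*>(base);
    std::free(base);
}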

c++ vector construct with given memory

I'd like to use a std::vector to control a given piece of memory. First of all I'm pretty sure this isn't good practice, but curiosity has the better of me and I'd like to know how to do this anyway.
The problem I have is a method like this:
vector<float> getRow(unsigned long rowIndex)
{
    float* row = _m->getRow(rowIndex);                // row is now a piece of memory (of a known size) that I control
    vector<float> returnValue(row, row + _m->cols()); // construct a new vec from this data
    delete [] row;                                    // delete the original memory
    return returnValue;                               // return the new vector
}
_m is a DLL interface class which returns an array of float that it is the caller's responsibility to delete. So I'd like to wrap this in a vector and return that to the user... but this implementation allocates new memory for the vector, copies the data into it, deletes the original memory, and then returns the vector.
What I'd like to do is to straight up tell the new vector that it has full control over this block of memory so when it gets deleted that memory gets cleaned up.
UPDATE: The original motivation for this (memory returned from a DLL) has been fairly firmly squashed by a number of responders :) However, I'd love to know the answer to the question anyway... Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?
The obvious answer is to use a custom allocator, however you might find that is really quite a heavyweight solution for what you need. If you want to do it, the simplest way is to take the allocator defined by the implementation (as the default second template argument to vector<>), copy that, and make it work as required.
Another solution might be to define a template specialisation of vector, define as much of the interface as you actually need and implement the memory customisation.
Finally, how about defining your own container with a conforming STL interface, defining random access iterators etc. This might be quite easy given that underlying array will map nicely to vector<>, and pointers into it will map to iterators.
Comment on UPDATE: "Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?"
Surely the simple answer here is "No". Provided you want the result to be a vector<>, then it has to support growing as required, such as through the reserve() method, and that will not be possible for a given fixed allocation. So the real question is really: what exactly do you want to achieve? Something that can be used like vector<>, or something that really does have to in some sense be a vector, and if so, what is that sense?
Vector's default allocator doesn't provide this type of access to its internals. You could do it with your own allocator (vector's second template parameter), but that would change the type of the vector.
It would be much easier if you could write directly into the vector:
vector<float> getRow(unsigned long rowIndex) {
    vector<float> row(_m->cols());
    _m->getRow(rowIndex, &row[0]); // writes _m->cols() values into &row[0]
    return row;
}
Note that &row[0] is a float* and it is guaranteed for vector to store items contiguously.
The most important thing to know here is that different DLLs/modules have different heaps. This means that any memory that is allocated from a DLL needs to be deleted from that DLL (it's not just a matter of compiler version or delete vs delete[] or whatever). DO NOT PASS MEMORY MANAGEMENT RESPONSIBILITY ACROSS A DLL BOUNDARY. This includes creating a std::vector in a DLL and returning it. But it also includes passing a std::vector to the DLL to be filled by the DLL; such an operation is unsafe, since you don't know for sure that the std::vector will not try a resize of some kind while it is being filled with values.
There are two options:
1. Define your own allocator for the std::vector class that uses an allocation function guaranteed to reside in the DLL/module from which the vector was created. This can easily be done with dynamic binding (that is, make the allocator class call some virtual function). Since dynamic binding will look up the function call in the vtable, it is guaranteed to land in the code of the DLL/module that originally created it.
2. Don't pass the vector object to or from the DLL. You can use, for example, functions getRowBegin() and getRowEnd() that return iterators (i.e. pointers) into the row array (if it is contiguous), and let the user std::copy that into its own, local std::vector object (sketched below). You could also do it the other way around: pass the iterators begin() and end() to a function like fillRowInto(begin, end).
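A sketch of the second option, assuming the DLL exposes the getRowBegin()/getRowEnd() functions named above (the Matrix interface here is hypothetical):
#include <vector>

// Hypothetical DLL-side interface: only raw pointers cross the boundary.
struct Matrix {
    virtual const float* getRowBegin(unsigned long row) const = 0;
    virtual const float* getRowEnd(unsigned long row) const = 0;
    virtual ~Matrix() {}
};

std::vector<float> copyRow(const Matrix& m, unsigned long rowIndex)
{
    const float* first = m.getRowBegin(rowIndex);   // memory owned by the DLL
    const float* last  = m.getRowEnd(rowIndex);

    // The copy (and therefore the allocation) happens on the caller's side,
    // so the vector's memory never crosses the DLL boundary.
    return std::vector<float>(first, last);
}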
This problem is very real, although many people neglect it without knowing. Don't underestimate it. I have personally suffered silent bugs related to this issue and it wasn't pretty! It took me months to resolve it.
I have checked the source code, and boost::shared_ptr and boost::shared_array use dynamic binding (the first option above) to deal with this. However, they are not guaranteed to be binary compatible. Still, this could be a slightly better option (usually binary compatibility is a much smaller problem than memory management across modules).
Your best bet is probably a std::vector<shared_ptr<MatrixCelType>>.
Lots more details in this thread.
If you're trying to change where/how the vector allocates/reallocates/deallocates memory, the allocator template parameter of the vector class is what you're looking for.
If you're simply trying to avoid the overhead of construction, copy construction, assignment, and destruction, then allow the user to instantiate the vector, then pass it to your function by reference. The user is then responsible for construction and destruction.
It sounds like what you're looking for is a form of smart pointer. One that deletes what it points to when it's destroyed. Look into the Boost libraries or roll your own in that case.
The Boost.SmartPtr library contains a whole lot of interesting classes, some of which are dedicated to handle arrays.
For example, behold scoped_array:
int main(int argc, char* argv[])
{
    boost::scoped_array<float> array(_m->getRow(atoi(argv[1])));
    return 0;
}
The issue, of course, is that scoped_array cannot be copied, so if you really want a std::vector<float>, @Fred Nurk's answer is probably the best you can get.
In the ideal case you'd want the equivalent of unique_ptr but in array form; however, I don't think that's part of the standard.

Locating objects (structs) in memory - how to?

How would you locate an object in memory? Let's say that you have a struct defined as:
struct POINT {
    int x;
    int y;
};
How would I scan the memory region of my app to find instances of this struct so that I can read them out?
Thanks R.
You can't without adding type information to the struct. In memory a struct like that is nothing more than 2 integers, so you can't recognize them any better than you could recognize any other object.
You can't. Structs don't store any type information (unless they have virtual member functions), so you can't distinguish them from any other block of sizeof(POINT) bytes.
Why don't you store your points in a vector or something?
You can't. You have to know the layout to know which section of memory represents a variable. That's a kind of protocol, and that's why we use text-based languages instead of raw values.
You don't - how would you distinguish two arbitrary integers from random noise?
(But given a Point p; in your source code, you can obtain its address using the address-of operator: Point* pp = &p;)
Short answer: you can't. Any (appropriately aligned) sequence of 8 bytes could potentially represent a POINT. In fact, an array of ints will be indistinguishable from an array of POINTS. In some cases, you could take advantage of knowledge of the compiler implementation to do better. For instance, if the struct had virtual functions, you could look for the correct vtable pointer - but there could also be false positives.
If you want to keep track of objects, you need to register them in their constructor and unregister them in their destructor (and pay the performance penalty), or give them their own allocator.
There's no way to identify that struct. You need to put the struct somewhere it can be found, on the stack or on the heap.
Sometimes data structures are tagged with identifying information to assist with debugging or memory management. As a means of data organization, it is among the worst possible approaches.
You probably need to do a lot of general reading on memory management.
There is no standard way of doing this. The platform may specify some APIs which allow you to access the stack and the free store. Further, even if you did, without any additional information how would you be sure that you are reading a POINT object and not a couple of ints? The compiler/linker can read this because it deals with (albeit virtual) addresses and has some more information (and control) than you do.
You can't. Something like that would probably be possible on some "tagged" architecture that also supported tagging objects of user-defined types. But on a traditional architecture it is absolutely impossible to say for sure what is stored in memory simply by looking at the raw memory content.
You can come closer to achieving what you want by introducing a unique signature into the type, like
struct POINT {
    char signature[8];
    int x;
    int y;
};
and carefully setting it to some fixed and "unique" pattern in each object of POINT type, and then looking for that pattern in memory. If it is your application, you can be sure with a good degree of certainty that each instance of the pattern is your POINT object. But in general, of course, there will never be any guarantee that the pattern you found belongs to your object, as opposed to being there purely accidentally.
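For illustration, scanning a region of memory that you own for such a signature could look roughly like this (the signature bytes and the helper name are made up; treating a hit as a POINT is only safe if a POINT really was constructed there, and a real scanner would also check alignment):
#include <algorithm>
#include <vector>

struct POINT {
    char signature[8];
    int x;
    int y;
};

// Scan the byte range [begin, end) that you own (a buffer, a pool, ...) for
// the signature pattern and collect candidate POINT locations.
std::vector<POINT*> find_points(unsigned char* begin, unsigned char* end)
{
    static const unsigned char sig[8] = { 'P', 'O', 'I', 'N', 'T', '0', '0', '1' };
    std::vector<POINT*> hits;
    unsigned char* p = begin;
    while ((p = std::search(p, end, sig, sig + sizeof(sig))) != end)
    {
        // the signature is the first member, so a match marks a candidate POINT
        hits.push_back(reinterpret_cast<POINT*>(p));
        p += sizeof(sig);
    }
    return hits;
}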
What everyone else has said is true. In memory, your struct is just a few bytes, there's nothing in particular to distinguish it.
However, if you feel like a little hacking, you can look up the internals of your C library and figure out where memory is stored on the heap and how it appears. For example, this link shows how stuff gets allocated in one particular system.
Armed with this knowledge, you could scan your heap to find allocated blocks that were sizeof(POINT), which would narrow down the search considerably. If you look at the table you'll notice that the file name and line number of the malloc() call are being recorded - if you know where in your source code you're allocating POINTs, you could use this as a reference too.
However, if your struct was allocated on the stack, you're out of luck.

How do I allocate a std::string on the stack using glibc's string implementation?

#include <string>

int main(void)
{
    std::string foo("foo");
}
My understanding is that the above code uses the default allocator to call new. So even though the std::string foo is allocated on the stack, the internal buffer inside foo is allocated on the heap.
How can I create a string that is allocated entirely on the stack?
I wanted to do just this myself recently and found the following code illuminating:
Chromium's stack_container.h
It defines a new std::allocator which can provide stack-based allocation for the initial allocation of storage for STL containers. I wound up finding a different way to solve my particular problem, so I didn't actually use the code myself, but perhaps it will be useful to you. Do be sure to read the comments in the code regarding usage and caveats.
To those who have questioned the utility and sanity of doing this, consider:
Oftentimes you know a priori that your string has a reasonable maximum size. For example, if the string is going to store a decimal-formatted 32-bit integer, you know that you do not need more than 11 characters to do so. There is no need for a string that can dynamically grow to unlimited size in that case.
Allocating from the stack is faster in many cases than allocating from the heap.
If the string is created and destroyed frequently (suppose it is a local variable in a commonly used utility function), allocating from the stack instead of the heap will avoid fragmentation-inducing churn in the heap allocator. For applications that use a lot of memory, this could be a game changer.
Some people have commented that a string that uses stack-based allocation will not be a std::string as if this somehow diminishes its utility. True, you can't use the two interchangeably, so you won't be able to pass your stackstring to functions expecting a std::string. But (if you do it right), you will be able to use all the same member functions on your stackstring that you use now on std::string, like find_first_of(), append(), etc. begin() and end() will still work fine, so you'll be able to use many of the STL algorithms. Sure, it won't be std::string in the strictest sense, but it will still be a "string" in the practical sense, and it will still be quite useful.
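If you want to experiment with the idea without pulling in stack_container.h, a minimal sketch in the spirit of Chromium's StackAllocator (and Howard Hinnant's short_alloc) looks roughly like this, assuming C++11. The arena and stack_alloc names are mine, and a production version would need more careful alignment and reclamation logic:
#include <cstddef>
#include <new>
#include <string>

// A small buffer that lives wherever you put it (here: on the stack) and is
// handed out sequentially; falls back to the heap when exhausted.
template <std::size_t N>
class arena {
public:
    arena() : ptr_(buf_) {}

    char* allocate(std::size_t n)
    {
        n = align_up(n);
        if (static_cast<std::size_t>(buf_ + N - ptr_) >= n) {
            char* r = ptr_;
            ptr_ += n;
            return r;
        }
        return static_cast<char*>(::operator new(n));   // fall back to the heap
    }
    void deallocate(char* p, std::size_t n)
    {
        if (p >= buf_ && p < buf_ + N) {
            if (p + align_up(n) == ptr_)
                ptr_ = p;                // reclaim only the most recent block
        } else {
            ::operator delete(p);
        }
    }

private:
    static std::size_t align_up(std::size_t n)
    {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
    alignas(std::max_align_t) char buf_[N];
    char* ptr_;
};

// Minimal allocator drawing from an arena; copies and rebinds share the arena.
template <class T, std::size_t N>
class stack_alloc {
public:
    typedef T value_type;
    template <class U> struct rebind { typedef stack_alloc<U, N> other; };

    explicit stack_alloc(arena<N>& a) : a_(&a) {}
    template <class U> stack_alloc(const stack_alloc<U, N>& other) : a_(other.arena_ptr()) {}

    T* allocate(std::size_t n) { return reinterpret_cast<T*>(a_->allocate(n * sizeof(T))); }
    void deallocate(T* p, std::size_t n) { a_->deallocate(reinterpret_cast<char*>(p), n * sizeof(T)); }

    arena<N>* arena_ptr() const { return a_; }

private:
    arena<N>* a_;
};

template <class T, class U, std::size_t N>
bool operator==(const stack_alloc<T, N>& a, const stack_alloc<U, N>& b)
{ return a.arena_ptr() == b.arena_ptr(); }

template <class T, class U, std::size_t N>
bool operator!=(const stack_alloc<T, N>& a, const stack_alloc<U, N>& b)
{ return !(a == b); }

int main()
{
    arena<256> a;                                   // the buffer itself is on the stack
    typedef std::basic_string<char, std::char_traits<char>,
                              stack_alloc<char, 256> > stack_string;

    stack_alloc<char, 256> alloc(a);
    stack_string s(alloc);
    s = "short strings now live in the stack arena"; // note: this is NOT a std::string
}
The resulting stack_string is still not a std::string, which is exactly the trade-off discussed above.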
The problem is that std::basic_string has a template parameter for the allocator. But std::string is not a template and has no parameters.
So, you could in principle use an instantiation of std::basic_string with an allocator that uses memory on the stack, but it wouldn't be a std::string. In particular, you wouldn't get runtime polymorphism, and you couldn't pass the resulting objects into functions expecting a std::string.
You can't. Except...
std::string is an instantiation of
std::basic_string<class CharType,
                  class Traits = char_traits<CharType>,
                  class Allocator = allocator<CharType> >
You could conceivably define an Allocator class that uses alloca for memory management. This would only work if the Allocator itself, and the basic_string methods that invoke it directly or indirectly, are all inline. A basic_string object created with this allocator would not be a std::string, but it would behave (mostly) like it. However, this would be a fair amount of work for limited gains. Specifically, using this class to return values from a function would be a career-limiting move.
I have no idea why you or anyone else would want to do this.
I suspect that doing such a thing would be difficult, and I wonder why you want to do it. To allocate something entirely on the stack, the compiler needs to know at compile time exactly how big it is - in your example it would need to know not only the size of the std::string metadata but also the size of the string data itself. That isn't very flexible; you would probably need different string types depending on the size of the string data you want them to hold - not that it would be impossible, it would just tend to complicate things a bit.
std::string will always manage its internal storage with new/delete.
Not sure why your question mentions glibc's string implementation. The string implementation of the C++ standard library has nothing to do with glibc.
The only way to store a string on the stack is to use a C char array on the stack (like what Shhnap outlined). But that's probably not what you want anyway :-)