When instantiating a class with new, instead of deleting the memory, what kinds of benefits would we gain from reusing the objects?
What is the process of new? Does a context switch occur? When new memory is allocated, who does the allocation? The OS?
You've asked a few questions here...
Instead of deleting the memory what kinds of benefits would we gain based on the reuse of the objects?
That depends entirely on your application. Even supposing I knew what the application is, you've left another detail unspecified -- what is the strategy behind your re-use? But even knowing that, it's very hard to predict or answer generically. Try some things and measure them.
As a rule of thumb I like to minimize the most gratuitous of allocations. This is mostly premature optimization, though. It'd only make a difference over thousands of calls.
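For example, one common reuse strategy is a simple object pool: instead of deleting finished objects, you put them on a list and hand them out again on the next request. Whether this beats plain new/delete depends on your allocator and usage pattern, so measure. A minimal sketch (Widget and WidgetPool are made-up names):
#include <vector>

// Hypothetical object worth pooling.
struct Widget { int data[16]; };

// Minimal free-list pool: instead of deleting finished objects, keep them
// on a list and hand them out again on the next acquire().
class WidgetPool {
    std::vector<Widget*> free_;   // recycled objects
public:
    Widget* acquire() {
        if (!free_.empty()) {     // reuse path: no allocator call at all
            Widget* w = free_.back();
            free_.pop_back();
            return w;
        }
        return new Widget();      // cold path: a real allocation
    }
    void release(Widget* w) { free_.push_back(w); }   // keep it for later reuse
    ~WidgetPool() { for (Widget* w : free_) delete w; }
};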
What is the process of new?
Entirely implementation dependent. But the general strategy allocators use is to keep a free list, that is, a list of blocks which have been freed in the process. When the free list is empty or contains insufficient contiguous free space, the allocator must ask the kernel for memory, which the kernel can only hand out in multiples of a constant page size (4096 bytes on x86). An allocator also has to decide when to chop up, pad, or coalesce blocks. Multi-threading also puts pressure on allocators, because they must synchronize access to their free lists.
Generally it's a pretty expensive operation. Maybe not so much relative to what else you're doing. But it ain't cheap.
Does a context switch occur?
Entirely possible. It's also possible that it won't. Your OS is free to do a context switch any time it gets an interrupt or a syscall, so, uh... that can happen at a lot of times; I don't see any special relationship between this and your allocator.
New memory is allocated, who is doing the allocation? The OS?
It might come from a free list, in which case there is no system call involved, hence no help from the OS. But it might come from the OS if the free list can't satisfy the request. Also, even if it comes from the free list, your kernel might have paged out that data, so you could get a page fault on access and the kernel's allocator would kick in. So I guess it'd be a mixed bag. Of course, you can have a conforming implementation that does all kinds of crazy things.
new allocates memory for the class on the heap, and calls the constructor.
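Conceptually (a rough sketch that ignores exception handling and alignment details; Obj is a made-up name), that breaks down into two steps, and delete into the reverse two:
#include <new>   // placement new, ::operator new / ::operator delete

struct Obj {
    Obj()  {}
    ~Obj() {}
};

void sketch() {
    // Roughly what "Obj* p = new Obj;" does:
    void* raw = ::operator new(sizeof(Obj));   // 1. allocate raw memory
    Obj*  p   = new (raw) Obj();               // 2. run the constructor in place

    // Roughly what "delete p;" does:
    p->~Obj();                                 // 1. run the destructor
    ::operator delete(raw);                    // 2. release the raw memory
}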
Context switches do not have to occur.
The C++ runtime allocates the memory from its free store using whatever mechanism it deems fit.
Usually the C++ runtime allocates large blocks of memory using OS memory-management functions, and then subdivides those using its own heap implementation. The Microsoft C++ runtime mostly uses the Win32 heap functions, which are implemented in user mode and divide up OS memory allocated using the virtual memory APIs. There are thus no context switches until and unless its current allocation of virtual memory is exhausted and it needs to go to the OS to allocate more.
There is a theoretical problem with heap allocation: there is no upper bound on how long a traversal of the heap might take to find a free block. In practice, though, heap allocations are usually fast.
The exception is threaded applications. Because most C++ runtimes share a single heap between multiple threads, access to the heap needs to be serialized. This can severely degrade the performance of certain classes of applications that rely on multiple threads being able to new and delete many objects.
If you new or delete an address, it's marked as occupied or unassigned. The implementations do not talk to the kernel all the time: bigger chunks of memory are reserved and divided into smaller chunks in user space within your application.
Because the default new and delete are re-entrant (or thread-safe, depending on the implementation), a context switch may occur, but your code stays safe while using them.
In C++ you are able to overload the new and delete operators, e.g. to plug in your own memory management:
#include <cstdlib>  // declarations of malloc and free
#include <new>
#include <iostream>
using namespace std;

class C {
public:
    C() {}
    void* operator new (size_t size);  // implicitly declared as a static member function
    void operator delete (void *p);    // implicitly declared as a static member function
};

void* C::operator new (size_t size) {
    void *p = malloc(size);
    if (p == 0) throw "allocation failure"; // instead of std::bad_alloc
    return p;
}

void C::operator delete (void *p) {
    free(p);
}

int main() {
    C *p = new C;  // calls C::operator new, then the constructor
    delete p;      // calls the destructor, then C::operator delete
}
Related
I have created a sample application like the one below. I have a requirement to create 1024*1024 structs. Before calling the new operator my application is consuming some amount of memory (say 0.3 MB). After calling the new operator the memory increases (say to 175 MB). After calling the delete operator the memory decreases (say to 15 MB). So finally there is a difference in the memory. I observed all these memory figures in Task Manager. I am confused whether this should be considered a memory leak, or whether that memory will be released slowly. If not, how can I free that remaining memory?
#include <string>

struct testSt
{
    bool check;
    std::string testString;
};

int main()
{
    testSt *testObj = new testSt[1024 * 1024];
    delete[] testObj;
    return 0;
}
There is definitely no memory leak in your application. The reason the numbers before and after the allocation do not appear to match is that the Task Manager tool is too coarse for the purposes of detecting memory leaks in a C++ program. Rather than recording the memory usage of only your code, it records all memory usage of the process that executes your code, including any memory used by the standard C++ library that supports your code's operations.
Use a memory profiler, such as valgrind, to test your code for memory leaks.
In addition, consider switching away from raw pointers for making containers. The best way by far to reduce the possibility of having a memory leak is to automate memory management using containers from the standard C++ library. In your case, defining a vector
std::vector<testSt> testObj(1024*1024);
would let you avoid allocation and deallocation altogether.
There is no memory leak in the posted code. The reason memory usage reported by task manager doesn’t go back to what it was is that the process’s runtime is keeping some of the allocated pages for later reuse, so it (hopefully) won’t have to bother the OS for more RAM the next time it wants to allocate an object. This is a normal optimization and nothing to be too concerned about. The real test for a leak would be to run your code in a loop for many iterations; if during that test you see your process’s memory usage increasing without bound, that would suggest there is a memory leak. If it levels off and then remains constant, on the other hand, that suggests there isn’t one.
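A minimal version of that loop test, reusing the struct from the question (the iteration count of 1000 is arbitrary), could look like this; watch the process's memory while it runs:
#include <string>

struct testSt
{
    bool check;
    std::string testString;
};

int main()
{
    // Allocate and free repeatedly; if the process's memory usage climbs
    // without bound across iterations, something is leaking.
    for (int i = 0; i < 1000; ++i)
    {
        testSt *testObj = new testSt[1024 * 1024];
        delete[] testObj;
    }
    return 0;
}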
Your code is correct, the array will be deleted.
You can test this with the following:
#include <iostream>  // std::cout
#include <string>

struct testSt
{
    bool check;
    std::string testString;

    ~testSt()
    {
        std::cout << "Destroyed!" << std::endl;
    }
};
Are you running from the debugger? The additional memory may be held by the IDE.
If I allocate memory with malloc (or new/new[]) within a class constructor, is that bit of memory safe from being overwritten?
#include <cstdlib>  // malloc

class stack {
private:
    int *stackPointer;
public:
    stack (int size) {
        stackPointer = (int *) malloc (sizeof(int) * size);
    }
    int peek (int pos) {
        return *(stackPointer + pos); // pos < size
    }
};
malloc/new within a constructor is safe, provided you follow the rule of three. With malloc/new you now have a resource that you have to explicitly take care to release at the right times.
Therefore: you must define a copy constructor, an assignment operator, and a destructor that will free the memory. If you don't, the class can be misused and cause you a lot of problems.
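For the stack class from the question, a rule-of-three sketch might look like the following (hedged: the stackSize member and the memcpy-based copying are additions for illustration, and error handling is omitted):
#include <cstdlib>   // malloc, free
#include <cstring>   // memcpy

class stack {
private:
    int *stackPointer;
    int  stackSize;   // remember the size so copies know how much to duplicate
public:
    stack(int size) : stackSize(size) {
        stackPointer = (int *) malloc(sizeof(int) * size);
    }
    // Copy constructor: give the copy its own buffer.
    stack(const stack &other) : stackSize(other.stackSize) {
        stackPointer = (int *) malloc(sizeof(int) * stackSize);
        memcpy(stackPointer, other.stackPointer, sizeof(int) * stackSize);
    }
    // Assignment operator: release the old buffer, then copy the other one.
    stack &operator=(const stack &other) {
        if (this != &other) {
            free(stackPointer);
            stackSize = other.stackSize;
            stackPointer = (int *) malloc(sizeof(int) * stackSize);
            memcpy(stackPointer, other.stackPointer, sizeof(int) * stackSize);
        }
        return *this;
    }
    // Destructor: release the resource exactly once.
    ~stack() { free(stackPointer); }

    int peek(int pos) { return stackPointer[pos]; }   // pos must be < stackSize
};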
If you want to avoid having to define these extra functions, use std::vector instead, which handles them for you.
Yes, any memory that you allocate using malloc() is safely yours. And it will not be overwritten except by your code (whether intentionally or from a bug).
Technically it's safe from being overwritten by others as long as you don't pass a handle to that memory to the outside world in any way. That way you can localize manipulation of that memory to the class members only.
However, you can't be 100% sure of that, as another programmer could write code in a way that corrupts your memory, e.g. by passing an out-of-bounds index to an array.
No writable memory is safe from being overwritten within a C or C++ program. The allocation functions establish a claim over a memory and it is the program's responsibility to respect these claims.
The language protections in C++ are just that, language protections.
If you are doing some fancy C pointer games, you can eventually find and overwrite the allocated memory. It's considered the exact opposite of best practice, but it can happen.
As such, the "protection" is much like "hiding". Malloc within a constructor will return pointers that are "hidden" based on the exposure the surrounding class decides to allow, but they are not protected in the sense of "memory fencing" or the other more expensive operations that an operating system / hardware platform might impose between programs.
As far as it being "safe", I wouldn't recommend the practice, mostly because there is a chance you don't exit the constructor normally. If the constructor fails partway through, doing proper cleanup of any mallocs that had already succeeded is a very hard bit of programming to get right and to verify. Use new instead, and put your memory in an object; that way, at least in failure conditions, you will have only one memory-allocation technique to worry about.
Malloc with C++ means you have two memory allocation techniques, and two different ways they can cross-interact. That's four scenarios to deal with, and odds are you'll never get around to testing them all sufficiently.
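A hedged sketch of that advice: let a member that manages its own memory do the cleanup (std::vector here; std::unique_ptr would also work), so a constructor that fails partway through leaks nothing:
#include <cstddef>   // std::size_t
#include <vector>

class stack {
    std::vector<int> storage;   // owns its memory and frees it automatically
public:
    explicit stack(std::size_t size) : storage(size) {
        // If anything in this constructor throws, 'storage' is destroyed
        // automatically, so there is no raw malloc/new to clean up by hand.
    }
    int peek(std::size_t pos) const { return storage.at(pos); }  // bounds-checked
};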
Suppose I have allocated memory for two arrays, one using the new operator and the other using the malloc function. As far as I know, both are allocated in the heap segment. My question is: how is the compiler going to know which memory was allocated using which operator or function? Or is there some other concept behind this?
The compiler doesn't have to know how the memory behind a pointer was allocated; it's the responsibility of the programmer. You should always use matching allocate/deallocate functions and operators. For example, the operator new can be overloaded. In that case, if you allocate an object with new and release it with free(), you're in trouble, because free() has no idea what kind of book-keeping you have there. Here's a simplified example of this situation:
#include <iostream>
#include <stdlib.h>

struct MyClass
{
    // Really dumb allocator.
    static void* operator new(size_t s)
    {
        std::cout << "Allocating MyClass " << s << " bytes.\n";
        void* res = Pool + N * sizeof(MyClass);
        ++N;
        return res;
    }
    // Matching operator delete not implemented on purpose.
    static char Pool[];   // take memory from this statically allocated array
    static unsigned N;    // keep track of allocated objects
};

char MyClass::Pool[10 * sizeof(MyClass)];
unsigned MyClass::N = 0;

int main(int argc, char** argv)
{
    MyClass* p = new MyClass();
    if (argc == 1)
    {
        std::cout << "Trying to delete\n";
        delete p;   // boom - non-matching deallocator used.
    }
    else
    {
        std::cout << "Trying to free\n";
        free(p);    // also boom - non-matching deallocator used.
    }
}
If you mix and match the allocators and deallocators you will run into similar problems.
Internally, both allocation mechanisms may or may not finally use the same mechanism, but pairing new and free or malloc and delete would mix conceptually different things and cause undefined behaviour.
You must not use delete for malloc or free for new. Although for basic data types you might get away with it on most compilers, it is still wrong. It is not guaranteed to work. malloc and new could deal with different heaps and not the same one. Furthermore, delete will call destructors of objects whereas free will not.
Compilers don't have to keep track of which memory blocks are allocated by malloc or new. They might as a debug help, or they might not. Don't rely on that.
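A small illustration of the point that delete runs the destructor while free does not (a sketch; Tracer is a made-up name, and placement new is used only so an object can live in malloc'd memory):
#include <cstdlib>   // malloc, free
#include <iostream>
#include <new>       // placement new

struct Tracer {
    ~Tracer() { std::cout << "destructor ran\n"; }
};

int main() {
    Tracer* a = new Tracer;
    delete a;                            // runs ~Tracer(), then frees the memory

    void* raw = std::malloc(sizeof(Tracer));
    Tracer* b = new (raw) Tracer;        // construct the object in malloc'd memory
    // std::free(raw) alone would release the memory WITHOUT running ~Tracer(),
    // so the destructor has to be called explicitly first:
    b->~Tracer();
    std::free(raw);
}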
It does not know. It just calls a function that returns a pointer, and pointers do not carry the information of how they got to be or what kind of memory they point to. It just passes along that pointer and does not care about it any further.
However, the function you use to deallocate the memory (i.e. free/delete) might depend on information that got stored somewhere hidden by malloc/new. So if you allocate memory by malloc and try to deallocate it by using delete (or new and free), it might not work (apart from the obvious problems with constructors/destructors).
"Might not work" means that what happens in this case is undefined. This is a huge bonus for compiler developers and for performance, because they simply don't have to care. On the other hand, the effort is pushed onto the developers, who have to keep track of how each piece of memory was allocated. The easiest way to do that is to use just one of the two methods.
new/delete is the C++ way to allocate memory and deallocate memory from the heap,
whereas
malloc/free and family are the C way to allocate and free memory from the heap.
I don't know why you would want the compiler to know who allocated the heap memory, but if you want to track it yourself there is a way:
new initializes the allocated memory by calling a constructor, so you can instrument that constructor to record who allocated the object on the heap.
As far as I know, both are allocated in the heap segment, so my question is how the compiler is going to know which memory was allocated using which operator or function?
What is this thing you call the "heap segment"?
There is no such thing as far as the C and C++ standards are concerned. The "heap" and "stack" are implementation-specific concepts. They are very widely used concepts, but neither standard mandates a "heap" or a "stack".
How the implementation (not the compiler!) knows where things are allocated is up to the implementation. Your best bet, and the only safe bet, is to follow what the standards say to do:
If you allocate memory using new[] you must deallocate it with delete[] (or leave it undeleted).
Any other deallocation is undefined behavior.
If you allocate memory using new you must deallocate it with delete (or leave it undeleted).
Any other deallocation is undefined behavior.
If you allocate memory using malloc or its kin you must deallocate it with free (or leave it undeleted).
Any other deallocation is undefined behavior.
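In code, those safe pairings look like this (a sketch; T stands in for any type):
#include <cstdlib>   // malloc, free

struct T { int x; };

int main() {
    T* a = new T[4];                     // new[] ...
    delete[] a;                          // ... pairs with delete[]

    T* b = new T;                        // new ...
    delete b;                            // ... pairs with delete

    T* c = (T*) std::malloc(sizeof(T));  // malloc ...
    std::free(c);                        // ... pairs with free
}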
Not freeing allocated memory can sometimes be a serious problem. If you continuously allocate big chunks of memory and never free a single one, you will run into problems. Other times, it's not a problem at all: allocating one chunk of memory at program start and forgetting to free it is often not a problem, because that memory is released when the program terminates anyway. It's up to you to determine whether those memory leaks truly are a problem.
The easiest way to avoid these larger issues is to have the program properly free every single byte of allocated memory before the program exits.
Note well: Doing that doesn't guarantee that you don't have a memory problem. Just because your program eventually should free every single one of the multiple terabytes allocated over the course of the program's execution doesn't necessarily mean that the program is okay memory-wise.
I need some clarification about C++ memory management and the MISRA guidelines.
I have to implement a program that is MISRA compliant, so I have to respect an important rule: it is not possible to use the new operator (dynamic heap memory).
In this case, for any custom object, I must use static allocation.
For example:
I have my class Student with a constructor Student(int age).
Whenever I have to instantiate a Student object I must do it this way:
int theAge = 18;
Student exampleOfStudent(theAge);
This creates a Student object exampleOfStudent.
This way I do not have to worry about destructors.
Is all this correct?
Are there other ways to use static memory management?
Can I use std::vector or other data structures in the same way?
Can I add, for example, a Student instance (that I created as Student exampleOfStudent(theAge)) into a std::vector?
Student exampleOfStudent(theAge); is an automatic variable, not static.
As far as I remember, MISRA rules disallow all forms of dynamic memory. This includes both malloc and new and std::vector (with the default allocator).
You are left with only automatic variables and static variables.
If your system has a limited amount of RAM you don't want to use dynamic memory, because of the risk that you will ask for more memory than is available. Heap fragmentation is also an issue, and dynamic allocation makes it hard to write provably correct code. If you use variables with automatic or static storage, a static analysis tool can, for instance, output the maximum amount of memory your application will use, and you can check that number against your system's RAM.
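A hedged sketch of what that leaves you with, using a Student class like the one in the question (headStudent, enrol, and the sizes are made-up; note that std::array, unlike std::vector, performs no dynamic allocation):
#include <array>

class Student {
    int age;
public:
    explicit Student(int age) : age(age) {}
};

// Static storage: lives for the whole program; no heap involved.
static Student headStudent(21);

void enrol() {
    // Automatic storage: created on entry, destroyed when the scope ends.
    Student exampleOfStudent(18);

    // std::array is a fixed-size aggregate with no dynamic allocation,
    // unlike std::vector, whose default allocator uses the heap.
    std::array<Student, 3> group = { Student(18), Student(19), Student(20) };

    (void)exampleOfStudent;   // silence unused-variable warnings in this sketch
    (void)group;
}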
The idea behind the rule is not that malloc and new, specifically, are unsafe, but that memory allocation is (usually) a lazy workaround for not understanding, or managing, the memory requirements of your program.
A few, of many, strategies that avoid the assumption that you have infinite memory (or even much memory on that inexpensive, embedded system) and force you to deal with the faults that might be important in your application:
pre-allocating your calculated maximum input, and trapping overruns
providing a packet, stream, or other line-oriented means of managing input
use of an alternative pre-allocated data structure to manage non-uniform elements
Particularly in the context of a small, non-MMU, embedded system, that lack of design depth frequently leads to an unstable system that crashes outright in those odd "corner case" exceptions. Small memory, short stack, is a system killer.
Don't write your own malloc.
For MISRA compliance, placement-new is not a problem, as there is no dynamic allocation happening.
A library could be written (like an STL allocator) in such a way as to reference a statically allocated memory region as its memory pool for such a purpose.
Advantages: deterministic, fast.
Disadvantages: memory inefficient.
A favorable trade off for deterministic real-time systems.
All needed RAM has to be there at program startup, or the program won't run.
If the program starts, it's unaffected by available heap size, fragmentation, etc.
Writing one's own allocator can be complex, and out-of-memory conditions (the static memory pool size is fixed, after all) still have to be dealt with.
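A minimal sketch of that idea, assuming a fixed-capacity pool over statically allocated storage (StudentPool, the capacity of 32, and the Student type are illustrative, not from any particular library):
#include <new>       // placement new
#include <cstddef>   // std::size_t

class Student {
    int age;
public:
    explicit Student(int age) : age(age) {}
};

// Fixed-capacity pool over statically sized storage; no heap is ever touched.
// "Out of memory" here means "pool exhausted", which still has to be handled.
class StudentPool {
    alignas(Student) unsigned char storage[32 * sizeof(Student)];
    std::size_t used = 0;
public:
    Student* create(int age) {
        if (used >= 32)
            return nullptr;                          // pool exhausted
        void* slot = storage + used * sizeof(Student);
        ++used;
        return new (slot) Student(age);              // placement new: no allocation
    }
    // A real pool would also provide a destroy() that calls ~Student()
    // and recycles the slot; omitted to keep the sketch short.
};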
I once wrote a library that had to comply to the MISRA rules. I needed dynamic memory as well, so I came up with a trick:
My lib was written in C, but my trick may work for you.
Part of the header-file looked like this:
/* declare two function pointers compatible to malloc and free: */
typedef void * (*allocatorFunc)(size_t size);
typedef void (*freeFunc) (void * data);
/* and let the library user pass them during lib-init: */
int library_init (allocatorFunc allocator, freeFunc deallocator);
Inside the library I never called malloc/free directly. I always used the supplied function pointers. So I delegated the problem of how dynamic memory allocation should look to someone else.
The customer actually liked this solution. He was aware of the fact that my library would not work without dynamic memory allocation and it gave him freedom to implement his own memory scheme using preallocated pools or whatnot.
In C++ you can do the same, just use the malloc function and do the object creation using placement new.
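A hedged C++ sketch of the same trick, mirroring the C header above (library_make_student and the Student type are made-up illustrations):
#include <cstddef>   // std::size_t
#include <new>       // placement new

typedef void* (*allocatorFunc)(std::size_t size);
typedef void  (*freeFunc)(void* data);

// The library stores the user-supplied hooks and never calls malloc/free itself.
static allocatorFunc g_alloc = nullptr;
static freeFunc      g_free  = nullptr;

int library_init(allocatorFunc allocator, freeFunc deallocator) {
    g_alloc = allocator;
    g_free  = deallocator;
    return (g_alloc && g_free) ? 0 : -1;
}

struct Student {
    explicit Student(int age) : age(age) {}
    int age;
};

// Object creation via the injected allocator plus placement new.
Student* library_make_student(int age) {
    void* mem = g_alloc(sizeof(Student));
    if (!mem) return nullptr;
    return new (mem) Student(age);
}

void library_destroy_student(Student* s) {
    if (!s) return;
    s->~Student();   // run the destructor by hand...
    g_free(s);       // ...then return the memory through the injected hook
}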
I'm overloading new and delete to implement my own small-objects/thread-safe allocator.
The problem is that when I am overloading new, I cannot use new without breaking universal causality, or at least the compiler. Most examples I found where new is overloaded use malloc() to do the actual allocation. But from what I understood of C++, there is no use case for malloc() at all.
Multiple answers similar to this one, some with less tort outside of SO: In what cases do I use malloc vs new?
My question is: how do I allocate the actual memory when overloading operator new, without using malloc()?
(This is out of curiosity more than anything; try not to take the reasoning behind the overload too seriously. I have a separate question out on that anywho!)
Short answer: if you don't want to use the existing malloc, you need to implement your own heap manager.
A heap manager, for example malloc in glibc on Linux or HeapAlloc on Windows, is a user-level algorithm. First, keep in mind that a heap is optimized for allocating small objects, roughly 4 to 512 bytes.
How do you implement your own heap manager? At the very least, you must call a system API that allocates a memory chunk in your process: VirtualAlloc on Windows, or sbrk on Linux. These APIs allocate a large chunk of memory, but the size must be a multiple of the page size. The page size on x86 under Windows and Linux is typically 4 KB.
After obtaining a chunk of page, you need to implement your own algorithms how to chop down this big memory into smaller requests. A classic (still very practical) implementation and algorithm is dlmalloc: http://g.oswego.edu/dl/html/malloc.html
To implement it, you need several data structures for book-keeping and a number of policies for optimization. For example, for small objects of 16, 20, 36, or 256 bytes, a heap manager maintains a list of blocks of each size, so there is a list of lists. If the requested size is bigger than a page, it just calls VirtualAlloc or sbrk directly. However, an efficient implementation is very challenging: you must consider not only speed and space overhead, but also cache locality and fragmentation.
If you are interested in heap managers optimized for multithreaded environment, take a look a tcmalloc: http://goog-perftools.sourceforge.net/doc/tcmalloc.html
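For a feel of the mechanics, here is a deliberately tiny fixed-size-block sketch. It carves blocks out of one statically allocated "page" standing in for memory that a real allocator would get from VirtualAlloc or sbrk, and keeps a free list of returned blocks. Everything here (names, sizes) is illustrative and nowhere near dlmalloc or tcmalloc; a real allocator would also handle variable sizes, coalescing, over-aligned types, and locking for multiple threads.
#include <cstddef>   // std::size_t, std::max_align_t

namespace toyheap {

const std::size_t BLOCK_SIZE = 64;     // every allocation gets one 64-byte block
const std::size_t PAGE_SIZE  = 4096;   // stand-in for one OS page

// In a real allocator this chunk would come from VirtualAlloc or sbrk.
alignas(std::max_align_t) static unsigned char page[PAGE_SIZE];
static std::size_t bump = 0;           // offset of the next never-used block

// Free list threaded through the freed blocks themselves.
struct FreeNode { FreeNode* next; };
static FreeNode* freeList = nullptr;

void* allocate() {
    if (freeList) {                              // fast path: reuse a freed block
        FreeNode* n = freeList;
        freeList = n->next;
        return n;
    }
    if (bump + BLOCK_SIZE <= PAGE_SIZE) {        // carve a fresh block from the page
        void* p = page + bump;
        bump += BLOCK_SIZE;
        return p;
    }
    return nullptr;                              // a real allocator would ask the OS for more
}

void deallocate(void* p) {
    FreeNode* n = static_cast<FreeNode*>(p);     // push the block onto the free list
    n->next = freeList;
    freeList = n;
}

} // namespace toyheap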
I see no problem in calling malloc() inside a new overload, just make sure you overload delete so it calls free(). But if you really don't want to call malloc(), one way is to just allocate enough memory another way:
class A {
public:
    /* ... */
    static void* operator new (size_t size) {
        return (void *) new unsigned char[size];
    }
    static void operator delete (void *p) {
        delete[] ((unsigned char *) p);
    }
    /* ... */
};