Let's say I have a pointer allocated to hold 4096 bytes. How would one deallocate the last 1024 bytes in C? What about in C++? What if, instead, I wanted to deallocate the first 1024 bytes, and keep the rest (in both languages)? What about deallocating from the middle (it seems to me that this would require splitting it into two pointers, before and after the deallocated region).

Don't try and second-guess memory management. It's usually cleverer than you ;-)
The only scenario you can achieve is the first one, 'deallocating' the last 1K:
char *foo = malloc(4096);
char *tmp = realloc(foo, 4096 - 1024); /* may return NULL or a new address */
if (tmp != NULL) foo = tmp; /* on failure, foo is still valid */
However, even in this case, there is NO GUARANTEE that the address in "foo" will be unchanged. realloc() may allocate a smaller block elsewhere, copy your data, and free the entire original 4K, thus invalidating any other pointers into it that you may hold.
This is valid for both C and C++ (where the malloc() result needs a cast). However, use of malloc() in C++ is a code smell, and most folk would expect you to use new to allocate storage. Memory allocated with new cannot be realloc()ed, at least not in any portable way. std::vector would be a much better approach in C++.

If you have n bytes of malloc'd memory, you can realloc to m bytes (where m < n) and thus throw away the last n-m bytes.
To throw away bytes from the beginning, you can malloc a new, smaller buffer, memcpy the bytes you want to keep, and then free the original.
The latter option is also available in C++ using new[] and delete[]. It can also emulate the first (realloc) case.
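A rough sketch of the "copy what you want, free the original" approach for dropping bytes from the front. The helper name drop_front is made up for illustration, and the block is assumed to come from malloc:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a new block holding old[skip .. old_size),
// frees the old block on success, and leaves it untouched on failure.
char* drop_front(char* old, std::size_t old_size, std::size_t skip) {
    assert(skip < old_size);
    std::size_t kept = old_size - skip;
    char* fresh = static_cast<char*>(std::malloc(kept));
    if (fresh == nullptr) return nullptr;   // old block is still valid
    std::memcpy(fresh, old + skip, kept);   // keep only the tail
    std::free(old);
    return fresh;
}
```

Note that the caller gets a different address back, so any other pointers into the old block are invalid afterwards.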

You don't have "a pointer allocated to hold 4096 bytes", you have a pointer to an allocated block of 4096 bytes.
If your block was allocated with malloc(), realloc() will allow you to reduce or increase the size of the block. The start address of the block won't necessarily stay the same, though.
You can't change the start address of a malloc'd memory block, which is really what your second scenario is asking. There's also no way to split a malloc'd block.
This is a limitation of the malloc/calloc/realloc/free API -- and implementations may rely on these limitations (for example, keeping bookkeeping information about the allocation immediately before the start address, which would make moving the start address difficult.)
Now, malloc isn't the only allocator out there -- your platform or libraries might provide other ones, or you could write your own (which gets memory from the system via malloc, mmap, VirtualAlloc or some other mechanism) and then hands it out to your program in whatever fashion you desire.
For C++, if you allocate memory with std::malloc, the information above applies. If you're using new and delete, you're allocating storage for and constructing objects, and so changing the size of an allocated block doesn't make sense -- objects in C++ are a fixed size.

You can make it shorter with realloc(). I don't think the rest is possible.

You can use realloc() to apparently make the memory shorter. Note that for some implementations such a call will actually do nothing. You can't free the first bit of the block and retain the last bit.
If you find yourself needing this kind of functionality, you should consider using a more complex data structure. An array is not the correct answer to every programming problem.

SUMMARY: In contrast to C's realloc, it is not possible to directly reallocate memory allocated with new[]. To extend or reduce the size of a block, one must allocate a new block of adequate size, copy over the old memory, and delete the old block. The C++ standard library provides a dynamic array that can be extended or reduced in its std::vector template.
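As a sketch of what that looks like with std::vector for the three cases in the question (the helper names are made up for illustration):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helpers: drop n bytes from the end, the front, or the middle.
void erase_back(std::vector<char>& v, std::size_t n) {
    v.resize(v.size() - n);
}
void erase_front(std::vector<char>& v, std::size_t n) {
    v.erase(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(n));
}
void erase_middle(std::vector<char>& v, std::size_t pos, std::size_t n) {
    auto first = v.begin() + static_cast<std::ptrdiff_t>(pos);
    v.erase(first, first + static_cast<std::ptrdiff_t>(n));
}
```

Erasing from the front or middle shifts the remaining elements down, and none of these calls is required to give memory back: the capacity typically stays the same, and shrink_to_fit() is only a non-binding request.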


Sequential memory allocation

I'm planning an application that allocates a lot of variables in memory. Unlike a "regular" application, I want this memory to be allocated in specific memory blocks of 4096 bytes. My allocated variables must be placed in memory sequentially, one after another, in order to fill the whole allocated memory.
For example, I allocate a region (4096 bytes) in memory and this region is ready for my further use. From then on, each time my application creates a new variable in memory (which a "regular" application would probably do with malloc), this variable will be placed in the free space of my memory region.
This sequential memory allocation is similar to how array allocation works. But in my case, I need an array that is able to contain many types of data (string, byte, int, ...).
One possible way to achieve this is pointer arithmetic. I want to avoid this method, as it may introduce a lot of bugs in my application.
Maybe someone solved this problem before?
Thank you!
malloc() by no means guarantees that subsequently allocated blocks are at sequential memory addresses. Even worse, most implementations use a small number of bytes before and/or after the allocated block for 'housekeeping', so the actual allocated blocks are slightly bigger than requested. This means that, even if you're lucky and the addresses are sequential, there will be small gaps in between the blocks.
As you suggest, you'll need to write some code yourself: a few functions built on malloc(), realloc(), ... You can hide all the logic in these functions, so your application code using them should be no more complex than it would be using malloc(), had malloc() done what you wanted.
Important questions: Why do you need to have these blocks adjacent to each other? What about freeing blocks?
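One way such wrapper code could look, as a minimal sketch: a 4096-byte region that places objects one after another and hides the pointer arithmetic in a single function. Region, create and kRegionSize are made-up names. The sketch assumes trivially destructible types only and never frees individual objects:

```cpp
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

constexpr std::size_t kRegionSize = 4096;

// Hypothetical fixed-size region: objects are placed sequentially.
class Region {
public:
    // Constructs a T in the next free spot; returns nullptr when it does not fit.
    template <class T, class... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "this sketch never runs destructors");
        // Round the offset up so the object is correctly aligned.
        std::size_t start = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + sizeof(T) > kRegionSize) return nullptr;
        used_ = start + sizeof(T);
        return new (data_ + start) T(std::forward<Args>(args)...);
    }
    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char data_[kRegionSize];
    std::size_t used_ = 0;
};
```

Even here the objects are not always strictly adjacent: alignment can force a few padding bytes between, say, a char and the int that follows it.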

`std::string` allocations are my current bottleneck - how can I optimize with a custom allocator?

I'm writing a C++14 JSON library as an exercise and to use it in my personal projects.
By using callgrind I've discovered that the current bottleneck during a continuous value creation from string stress test is an std::string dynamic memory allocation. Precisely, the bottleneck is the call to malloc(...) made from std::string::reserve.
I've read that many existing JSON libraries such as rapidjson use custom allocators to avoid malloc(...) calls during string memory allocations.
I tried to analyze rapidjson's source code but the large amount of additional code and comments, plus the fact that I'm not really sure what I'm looking for, didn't help me much.
How do custom allocators help in this situation?
Is a memory buffer preallocated somewhere (where? statically?) and std::strings take available memory from it?
Are strings using custom allocators "compatible" with normal strings?
They have different types. Do they have to be "converted"? (And does that result in a performance hit?)
Code notes:
Str is an alias for std::string.
By default, std::string allocates memory as needed from the same heap as anything that you allocate with malloc or new. To get a performance gain from providing your own custom allocator, you will need to be managing your own "chunk" of memory in such a way that your allocator can deal out the amounts of memory that your strings ask for faster than malloc does. Your memory manager will make relatively few calls to malloc, (or new, depending on your approach) under the hood, requesting "large" amounts of memory at once, then deal out sections of this (these) memory block(s) through the custom allocator. To actually achieve better performance than malloc, your memory manager will usually have to be tuned based on known allocation patterns of your use cases.
This kind of thing often comes down to the age-old trade-off of memory use versus execution speed. For example: if you have a known upper bound on your string sizes in practice, you can pull tricks with over-allocating to always accommodate the largest case. While this is wasteful of your memory resources, it can alleviate the performance overhead that more generalized allocation runs into with memory fragmentation. It also makes any calls to realloc essentially constant time for your purposes.
@sehe is exactly right. There are many ways.
To finally address your second question: strings can only be used interchangeably when their allocator type is the same. The following compiles, but only because nothing really changes:
#include <iostream>
#include <memory>
#include <string>
class myalloc : public std::allocator<char> {};
myalloc customAllocator;
int main()
{
    // customAllocator is converted (sliced) to std::allocator<char> here
    std::string mystring(customAllocator);
    std::string regularString = "test string";
    mystring = regularString;
    std::cout << mystring;
    return 0;
}
This is a fairly silly example and, of course, uses the same workhorse code under the hood: myalloc is passed as a constructor argument, where it is converted to its base class std::allocator<char>, so mystring is an ordinary std::string. That is why the assignment compiles. Implementing a useful allocator that supplies the full interface required by the STL without just disguising the default std::allocator is not as trivial. This seems to be a decent write up covering the concepts involved. Be aware that a genuinely different allocator type has to be given as the third template parameter of std::basic_string, and that does produce a different string type: such a string cannot be assigned to or from a std::string directly, and converting between the two means copying the characters. Within the library, the STL still does fun things with templates (such as rebind and allocator traits) to homogenize allocator interfaces.
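For a string whose allocator really is a different type, a minimal sketch could look like the following. MallocAllocator, MyString and the two conversion helpers are made-up names, and the allocator just forwards to malloc/free, so it shows the plumbing only, not a performance win:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical minimal C++11-style allocator that forwards to malloc/free.
template <class T>
struct MallocAllocator {
    using value_type = T;
    MallocAllocator() = default;
    template <class U>
    MallocAllocator(const MallocAllocator<U>&) {}
    T* allocate(std::size_t n) {
        void* p = std::malloc(n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};
template <class T, class U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) { return false; }

// The allocator is a template parameter, so this is a different type from std::string.
using MyString = std::basic_string<char, std::char_traits<char>, MallocAllocator<char>>;
static_assert(!std::is_same<MyString, std::string>::value, "distinct string types");

// Converting in either direction copies the characters.
MyString to_my_string(const std::string& s) { return MyString(s.data(), s.size()); }
std::string to_std_string(const MyString& s) { return std::string(s.data(), s.size()); }
```

So the "conversion" cost is one copy of the characters (plus an allocation for strings too long for the small-string buffer), which is worth keeping out of hot paths.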
What often helps is the creation of a GlobalStringTable.
See if you can find portions of the old NiMain library from the now defunct NetImmerse software stack. It contains an example implementation.
What is important to note is that this string table needs to be accessible between different DLL spaces, and that it is not a static object. R. Martinho Fernandes already warned that the object needs to be created when the application or DLL thread is created / attached, and disposed of when the thread is destroyed or the DLL is detached, and preferably before any string object is actually used. This sounds easier than it actually is.
Memory allocation
Once you have a single point of access that exports correctly, you can have it allocate a memory buffer up-front. If the memory is not enough, you have to resize it and move the existing strings over. Strings essentially become handles to regions of memory in this buffer.
Placement new
Something that often works well is the placement new operator, where you specify where in memory your new string object is to be constructed. Instead of allocating, the operator simply takes the memory location that is passed in as an argument and returns it (your own overload could also zero the memory at that location first). You can also keep track of the allocation, the actual size of the string, etc. in the GlobalStringTable object.
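A tiny illustration of standard placement new, assuming a local buffer stands in for a slot in the string table (placement_demo is a made-up name):

```cpp
#include <cstddef>
#include <new>
#include <string>

// Constructs a std::string inside caller-provided storage and returns its length.
std::size_t placement_demo() {
    alignas(std::string) unsigned char storage[sizeof(std::string)];
    std::string* s = new (storage) std::string("hello");  // no allocation for the object itself
    std::size_t n = s->size();
    s->~basic_string();  // placement new needs an explicit destructor call
    return n;
}
```

Keep in mind that this only places the string object. The characters of a long string still come from the string's allocator, so the table would also have to supply that memory.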
Handling the actual memory scheduling is something that is up to you, but there are many possible ways to approach this. Often, the allocated space is partitioned in several regions so that you have several blocks per possible string size. A block for strings <= 4 bytes, one for <= 8 bytes, and so on. This is called a Small Object Allocator, and can be implemented for any type and buffer.
If you expect many string operations where small strings are appended to repeatedly, you may change your strategy and allocate larger buffers from the start, so that the number of memmove operations is reduced. Or you can opt for a different approach and use string streams for those.
String operations
It is not a bad idea to derive from std::basic_string, so that most of the operations still work but the internal storage is actually in the GlobalStringTable, and you can keep using the same STL conventions. This way, you also make sure that all the allocations are within a single DLL, so that there can be no heap corruption by linking different kinds of strings between different libraries, since all the allocation operations are essentially in your DLL (and are rerouted to the GlobalStringTable object).
Custom allocators can help because most malloc()/new implementations are designed for maximum flexibility, thread-safety and bullet-proof workings. For instance, they must gracefully handle the case that one thread keeps allocating memory, sending the pointers to another thread that deallocates them. Things like these are difficult to handle in a performant way and drive the cost of malloc() calls.
However, if you know that some things cannot happen in your application (like one thread deallocating stuff another thread allocated, etc.), you can optimize your allocator further than the standard implementation. This can yield significant results, especially when you don't need thread safety.
Also, the standard implementation is not necessarily well optimized: implementing void* operator new(size_t size) and void operator delete(void* pointer) by simply calling through to malloc() and free() gives an average performance gain of 100 CPU cycles on my machine, which suggests that the default implementation there is suboptimal.
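For reference, a call-through replacement of the global operators might look roughly like this. It is a simplified sketch (it skips the new-handler loop a complete replacement would have), and whether it is faster depends entirely on your platform:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Replaces the global allocation function: forward straight to malloc.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;  // malloc(0) may return a null pointer
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();
}

// Replaces the global deallocation functions: forward straight to free.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }  // C++14 sized form
```

The default array forms (operator new[] and operator delete[]) call these single-object versions, so they are covered as well.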
I think you'd be best served by reading up on the EASTL
It has a section on allocators and you might find fixed_string useful.
The best way to avoid a memory allocation is don't do it!
BUT if I remember JSON correctly, all the readStr values get used either as keys or as identifiers, so you will have to allocate them eventually. std::string's move semantics should ensure that the allocated arrays are not copied around but reused until their final use. The default NRVO/RVO/move should avoid copying the data, if not the string header itself.
Method 1:
Pass result as a reference from the caller, which has reserved SomeReasonableLargeValue chars, then clear it at the start of readStr. This is only usable if the caller can actually reuse the string.
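Method 1 could be sketched like this. readStrInto is a made-up stand-in for your readStr, taking the raw character range directly:

```cpp
#include <string>

// Hypothetical stand-in for readStr: fills a caller-owned string instead of returning a new one.
void readStrInto(std::string& result, const char* begin, const char* end) {
    result.clear();             // common implementations keep the capacity here
    result.append(begin, end);  // no allocation while the text fits the reserved capacity
}
```

The caller does result.reserve(SomeReasonableLargeValue) once and then passes the same string in for every value it parses.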
Method 2:
Use the stack.
// Reserve memory for the string (BOTTLENECK)
if (end - idx < SomeReasonableValue) { // 32?
    char result[SomeReasonableValue] = {0}; // feel free to use std::array if you want bounds checking, but the preceding "if" should ensure it's not a problem.
    int ridx = 0;
    for (; idx < end; ++idx) {
        // Not an escape sequence
        if (!isC('\\')) { result[ridx++] = getC(); continue; }
        // Escape sequence: skip '\'
        // Convert escape sequence
        result[ridx++] = getEscapeSequence(getC());
    }
    // Skip closing '"'
    result[ridx] = 0; // 0-terminated.
    // optional assert here to ensure nothing went wrong.
    return result; // the bottleneck might now move here as the data is copied to the receiving string.
}
// fallback code only if the string is long.
// Your original code here
Method 3:
If your string implementation allocates some minimum size by default to fill its 32/64-byte boundary, you might want to use that. Construct result like this instead, in case the constructor can optimize it.
Str result(end - idx, 0);
Method 4:
Most systems already have an optimized allocator that prefers specific block sizes: 16, 32, 64, etc.
auto siz = ((end - idx) & ~0xf) + 16; // smallest multiple of 16 greater than the length, if the allocator has chunks of 16 bytes already.
Str result;
result.reserve(siz);
Method 5:
Use one of the allocators made by Google or Facebook as a global new/delete replacement.
To understand how a custom allocator can help you, you need to understand what malloc and the heap does and why it is quite slow in comparison to the stack.
The Stack
The stack is a large block of memory allocated for your current scope. You can think of it like this:
[P][][][][][][][][][]
([] means a byte of memory)
(P is a pointer that points to a specific byte of memory, in this case it's pointing at the first byte)
So the stack is a block with only 1 pointer. When you allocate memory, all it does is perform pointer arithmetic on P, which takes constant time.
So declaring int i = 0; would mean this,
P + sizeof(int).
[i][i][i][i][P][][][][][]
(i in [] is a byte of memory occupied by the integer)
This is blazing fast, and as soon as you go out of scope, the entire chunk of memory is emptied simply by moving P back to the first position.
The Heap
The heap allocates memory from a pool of bytes that the C++ runtime obtains from the operating system. When you call malloc, the heap finds a length of contiguous memory that fits your request, marks it as used so nothing else can use it, and returns it to you as a void*.
So, a theoretical heap with little optimization, when you call malloc(sizeof(int)), would do this.
Heap chunk
At first : [][][][][][][][][][][][][][][][][][][][][][][][][]
Allocate 4 bytes (sizeof(int)):
A pointer goes through the memory, finds a free run of the correct length, and returns a pointer to it.
After : [i][i][i][i][][][][][][][][][][][][][][][][][][][][][]
This is not an accurate representation of the heap, but from this you can already see numerous reasons for being slow relative to the stack.
The heap is required to keep track of all already allocated memory and the respective lengths. In our test case above, the heap was empty and this did not require much, but in worst case scenarios the heap will be populated with multiple objects with gaps in between (heap fragmentation), and this will be much slower.
The heap is required to search through its memory to find a free run that fits your length.
The heap can suffer from fragmentation, since it will never completely clean itself up unless you tell it to. So if you allocated an int, a char, and another int, your heap would look like this:
[i][i][i][i][c][i][i][i][i]
(i stands for bytes occupied by an int and c stands for bytes occupied by a char.) When you de-allocate the char, it will look like this:
[i][i][i][i][][i][i][i][i]
So when you want to allocate another object on the heap, unless that object is the size of 1 char, it cannot use the gap, and the heap space usable for that allocation is reduced by 1 byte. In more complex programs with millions of allocations and deallocations, the fragmentation issue becomes severe: allocations get slower and large requests can fail even though enough memory is free in total.
The heap has to worry about cases like thread safety (someone else said this already).
Custom Heap/Allocator
So, a custom allocator usually needs to address these problems while providing the benefits of the heap, such as personalized memory management and object permanence.
These are usually accomplished with specialized allocators. If you know you don't need to worry about thread safety, or you know exactly how long your strings will be, or you have a predictable usage pattern, you can make your allocator faster than malloc and new by quite a lot.
For example, if your program requires a lot of allocations as fast as possible without lots of deallocations, you could implement a stack allocator, in which you allocate a huge chunk of memory with malloc at startup,
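typedef char* buffer;
// Super simple example: no alignment handling and no bounds checking.
struct StackAllocator {
    buffer stack;   // start of the chunk
    char* pointer;  // next free byte
    StackAllocator(int expectedSize) { stack = new char[expectedSize]; pointer = stack; }
    StackAllocator(const StackAllocator&) = delete;
    StackAllocator& operator=(const StackAllocator&) = delete;
    ~StackAllocator() { delete[] stack; }
    char* allocate(int size) { char* returnedPointer = pointer; pointer += size; return returnedPointer; }
    void empty() { pointer = stack; }
};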
Get expected size, get a chunk of memory from the heap.
Assign a pointer to the beginning.
[P][][][][][][][][][] ..... [].
then have one pointer that moves for each allocation. When you no longer need the memory, you simply move the pointer back to the beginning of your buffer. This gives you the advantage of O(1) allocations and deallocations as well as object permanence, at the cost of flexible deallocation and a large initial memory requirement.
For strings, you could try a chunk allocator. For every allocation, the allocator gives a set chunk of memory.
Compatibility with other strings is almost guaranteed. As long as you are allocating a contiguous chunk of memory and preventing anything else from using that block of memory, it will work.
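A chunk allocator along those lines might be sketched as follows. ChunkPool is a made-up name, and the sketch assumes single-threaded use and a fixed number of chunks decided up front:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size chunk pool: every allocation gets one chunk of chunkSize bytes.
class ChunkPool {
public:
    ChunkPool(std::size_t chunkSize, std::size_t chunkCount)
        : chunkSize_(chunkSize), storage_(chunkSize * chunkCount) {
        for (std::size_t i = 0; i < chunkCount; ++i)
            free_.push_back(storage_.data() + i * chunkSize);
    }
    ChunkPool(const ChunkPool&) = delete;
    ChunkPool& operator=(const ChunkPool&) = delete;

    // Returns nullptr when the request is too large or the pool is exhausted.
    char* allocate(std::size_t size) {
        if (size > chunkSize_ || free_.empty()) return nullptr;
        char* chunk = free_.back();
        free_.pop_back();
        return chunk;
    }
    // Hands a chunk back to the pool; it becomes the next one to be reused.
    void deallocate(char* chunk) { free_.push_back(chunk); }

private:
    std::size_t chunkSize_;
    std::vector<char> storage_;  // one big block obtained up front
    std::vector<char*> free_;    // chunks that are currently unused
};
```

Both operations are O(1) and there is no fragmentation inside the pool, at the price of wasting the unused tail of every chunk.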

Should I also check realloc() when shrinking the allocated size of memory?

When you call realloc() you should check whether the function failed before assigning the returned pointer to the pointer passed as a parameter to the function...
I've always followed this rule.
Now is it necessary to follow this rule when you know for sure the memory will be truncated and not increased?
I've never ever seen it fail. Just wondered if I could save a couple instructions.
realloc may, at its discretion, copy the block to a new address regardless of whether the new size is larger or smaller. This may be necessary if the malloc implementation requires a new allocation to "shrink" a memory block (e.g. if the new size requires placing the memory block in a different allocation pool). This is noted in the glibc documentation:
In several allocation implementations, making a block smaller sometimes necessitates copying it, so it can fail if no other space is available.
Therefore, you must always check the result of realloc, even when shrinking. It is possible that realloc has failed to shrink the block because it cannot simultaneously allocate a new, smaller block.
Even if you realloc to a smaller size (please read realloc(3) and the POSIX realloc specification carefully), the underlying implementation may be doing the equivalent of a malloc (of the new, smaller size), followed by a memcpy (from the old zone to the new one), then a free (of the old zone). Or it may do nothing (e.g. because some crude malloc implementations maintain a limited set of sizes, such as powers of two or 3 times powers of two, and the old and new size requirements fit in the same size class).
That malloc can fail. So realloc can still fail.
Actually, I usually don't recommend using realloc for that reason: just do the malloc, memcpy, free yourself.
Indeed, dynamic heap memory functions like malloc rarely fail. But when they do, chaos may happen if you don't handle that. On Linux and some other Posix systems you could setrlimit(2) with RLIMIT_AS -e.g. using bash ulimit builtin- to lower the limits for testing purposes.
You might want to study the source code implementations of C memory management. For example MUSL libc (for Linux) is very readable code. On Linux, malloc is often built above mmap(2) (the C library may allocate a large chunk of memory using mmap then managing smaller used and freed memory zones inside it).
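The checking pattern discussed above, as a small sketch. resize_block is a made-up helper name, and new_size is assumed to be greater than zero:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: tries to resize *block to new_size bytes (new_size > 0).
// Returns true on success; on failure *block is unchanged and still owned by the caller.
bool resize_block(void** block, std::size_t new_size) {
    void* moved = std::realloc(*block, new_size);
    if (moved == nullptr) return false;  // original allocation is still valid
    *block = moved;                      // may differ from the old address, even when shrinking
    return true;
}
```

The extra comparison costs next to nothing compared with the realloc call itself, so there is little to gain by skipping it.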

Memory reallocation - keep my data in place

Assume I have an array on the heap; it doesn't matter whether it was created with malloc or new. I need the most efficient way to enlarge it. That is, if there is enough free space directly after the already allocated data, can my data stay where it is, untouched? Is this possible in C++?
Does realloc work in such manner?
Yes, realloc is what you are looking for. Note that it won't work with new, you will have to use malloc (or, say, calloc). Also, sometimes it is just impossible to extend memory, so realloc will try to do it for you, but if it couldn't — it will resort to allocating new memory, copying your contents to a new place and freeing the old memory.
Yes, realloc works like that, though the link says that it is not guaranteed. I think this is for cases where memory is fragmented and there is not enough room to expand the memory block in situ.
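If you want to see which of the two happened, something like this sketch works. GrowResult and grow are made-up names, and the block must have come from malloc/calloc/realloc:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct GrowResult {
    void* block;  // nullptr on failure (the old block is then untouched)
    bool moved;   // true if the data now lives at a different address
};

// Hypothetical helper: grows a malloc'd block and reports whether it was moved.
GrowResult grow(void* old_block, std::size_t new_size) {
    // Save the old address as an integer before the call; the old pointer
    // must not be used once realloc has succeeded.
    std::uintptr_t old_addr = reinterpret_cast<std::uintptr_t>(old_block);
    void* fresh = std::realloc(old_block, new_size);
    if (fresh == nullptr) return {nullptr, false};
    return {fresh, reinterpret_cast<std::uintptr_t>(fresh) != old_addr};
}
```

Either way the old contents are preserved. The only thing that differs is whether a copy was needed.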

delete & new in c++

This may be a very simple question, but please help me.
I wanted to know what exactly happens when I call new and delete. For example, in the code below:
char * ptr=new char [10];
delete [] ptr;
The call to new returns a memory address. Does it allocate exactly 10 bytes on the heap? Where is the information about the size stored? When I call delete on the same pointer, I see in the debugger that a lot of bytes change before and after the 10 bytes.
Is there a header for each new that contains information about the number of bytes allocated by new?
Thanks a lot
Does it allocate exactly 10 bytes?
That's implementation dependent. The guarantee is "at least 10 chars".
Where is the information about size stored?
That's implementation dependent.
Is there a header for each new that contains information about the number of bytes allocated by new?
That's implementation dependent.
By "that's implementation dependent" I mean it's not defined in the standard.
That's all up to the compiler and your runtime library. Only the effects that new and delete have on your program are exactly defined; how these are achieved is not specified.
In your case it seems like a little more memory than requested is allocated and it will probably store management information like the size of the current chunk of memory, information about adjacent areas of free space or information to help the debugger try to detect buffer overflows and similar problems.
It is completely implementation-dependent. In the general case you have to store the number of elements elsewhere yourself. The implementation must allocate enough space for at least the number of elements specified, but it can allocate more.
Is there any header for each new which contain information about number of byte allocated by new.
That's platform dependent but yes, on many platforms there are.
Precisely, according to the standard, new char[10] will alloc at least 10 bytes in the heap.
The internals of new and delete are implementation dependent. So it will vary from compiler to compiler, and platform to platform. Additionally, you can find a variety of allocator algorithms (e.g: TCMalloc).
I'll give you an overview of how it could work internally, but don't take it as absolute truth. It's written solely for the purpose of this explanation.
In short, in many implementations the new operator internally invokes malloc. malloc keeps a long linked list of available memory blocks, aka the free chain. When malloc is invoked, it searches this list for the first block that's big enough to hold the requested size. After that, it splits the block in two parts, one with the size you requested, and the other with the rest, which is then added back to the free chain. Finally, it returns the block with the requested size.
The inverse occurs in a free call, which is invoked by delete/delete[]. In short, it puts the provided block back to the free chain.
There could be fancy tricks during the processes I described above, like sorting the free chain, rounding the requested size to the next power of two to reduce memory fragmentation, and so on.
char * ptr=new char [10];
You are creating an array of 10 characters on the heap and storing the address of the 0th element in a pointer. This is similar to doing a malloc in C.
delete [] ptr;
You are deleting (freeing) the heap memory which was allocated by the earlier statement. This is similar to doing a free in C.
It is implementation dependent, but mostly the metadata for a block of memory is usually stored in the area before the memory address returned. The change that you observed before the 10 bytes was likely metadata being updated for this block (likely the size of the block being written into the meta data), and after the 10 bytes were metadata being updated for the next block (still unallocated, likely the pointer to the next chunk on the free list).
It is not a good idea to mess with the heap as it is not portable. However, if you want to do such heap magic, I suggest you implement your own memory pools (just get a large chunk of memory from the heap and manage it yourself). A possible place to start would be to look at libmm.
While the specifics are implementation dependent, one piece of information the implementation will often need to store is the number of elements in the array. Or if it does not store it directly, it will need to accurately derive it from the block size allocated.
The reason for this is that when an array of objects allocated with new[] is deleted with delete[], the destructor of each object in the array needs to be called, so delete[] needs to know how many objects to destroy (for types with trivial destructors, such as char, no count is needed). This is why it is necessary to match new with delete and new[] with delete[].
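You can watch delete[] use that element count with a small experiment. Tracked and destroy_array are made-up names for illustration:

```cpp
#include <cstddef>

struct Tracked {
    static int destroyed;  // counts destructor calls
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocates n objects with new[] and releases them with delete[];
// returns how many destructors ran.
int destroy_array(std::size_t n) {
    Tracked::destroyed = 0;
    Tracked* items = new Tracked[n];
    delete[] items;  // runs ~Tracked() once per element
    return Tracked::destroyed;
}
```

Nothing in the call to delete[] passes the count, so the runtime has to recover it from its own bookkeeping. Releasing such an array with plain delete instead is undefined behaviour.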