C++ Struct to Byte* throwing error - c++

I have attached my code below. I do not see what I am doing wrong. I have a struct that I am trying to serialize into a byte array. I have written some simple code to test it. It all appears to work at runtime when I print out the values of the objects, but once I hit return 0 it throws the error:
Run-Time Check Failure #2 - Stack around the variable 'command' was corrupted.
I do not see the issue. I appreciate all help.
namespace CommIO
{
enum Direction {READ, WRITE};
struct CommCommand
{
int command;
Direction dir;
int rwSize;
BYTE* wData;
CommCommand(BYTE* bytes)
{
int offset = 0;
int intsize = sizeof(int);
command = 0;
dir = READ;
rwSize = 0;
memcpy(&command, bytes + offset, intsize);
offset += intsize;
memcpy(&dir, bytes + offset, intsize);
offset += intsize;
memcpy(&rwSize, bytes + offset, intsize);
offset += intsize;
wData = new BYTE[rwSize];
if (dir == WRITE)
{
memcpy(&wData, bytes + offset, rwSize);
}
}
CommCommand() {}
};
}
int main()
{
CommIO::CommCommand command;
command.command = 0x6AEA6BEB;
command.dir = CommIO::WRITE;
command.rwSize = 128;
command.wData = new BYTE[command.rwSize];
for (int i = 0; i < command.rwSize; i++)
{
command.wData[i] = i;
}
command.print();
CommIO::CommCommand command2(reinterpret_cast<BYTE*>(&command));
command2.print();
cin.get();
return 0;
}

The following points mentioned in comments are most likely the causes of your problem.
You seem to be assuming that the size of Direction is the same as the size of an int. That may indeed be the case, but C++ does not guarantee it.
You also seem to be assuming that the members of CommIO::CommCommand will be laid out in memory without any padding between, which again may happen to be the case, but is not guaranteed.
There are a couple of ways to fix that:
(1) Make sure that you fill up the BYTE array in the calling function with matching objects, or
(2) Simply cast the BYTE* to CommCommand* and access the members directly.
For (1), you can use:
int command = 0x6AEA6BEB;
int dir = CommIO::WRITE;
int rwSize = 128;
int totalSize = rwSize + 3*sizeof(int);
BYTE* data = new BYTE[totalSize];
int offset = 0;
memcpy(data + offset, &command, sizeof(int));
offset += sizeof(int);
memcpy(data + offset, &dir, sizeof(int));
offset += sizeof(int);
memcpy(data + offset, &rwSize, sizeof(int));
offset += sizeof(int);
for (int i = 0; i < rwSize; i++)
{
data[i + offset] = i;
}
CommIO::CommCommand command2(data);
For (2), you can use:
CommCommand(BYTE* bytes)
{
CommCommand* in = reinterpret_cast<CommCommand*>(bytes);
command = in->command;
dir = in->dir;
rwSize = in->rwSize;
wData = new BYTE[rwSize];
if (dir == WRITE)
{
memcpy(wData, in->wData, rwSize);
}
}
The other error is that you are using
memcpy(&wData, bytes + offset, rwSize);
That is incorrect since you are treating the address of the variable as though it can hold the data. It cannot.
You need to use:
memcpy(wData, bytes + offset, rwSize);

The members of your struct may be laid out with padding between them. This can be avoided by placing #pragma pack(push, 1) before the struct and #pragma pack(pop) after it. These pragmas are compiler extensions (MSVC, GCC and Clang accept this form), so check your compiler's documentation for the exact syntax.
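As a rough illustration of what packing changes, the sketch below compares a default-layout struct with a packed one. Both struct names and the helper function are invented for this example and do not come from the question; the sizes assume 8-bit bytes, and #pragma pack is a compiler extension, not standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical structs, used only to make padding visible.
struct DefaultLayout
{
    std::uint8_t tag;
    std::uint32_t value; // the compiler may insert padding before this member
};

#pragma pack(push, 1) // compiler extension: align members on 1-byte boundaries
struct PackedLayout
{
    std::uint8_t tag;
    std::uint32_t value; // no padding is inserted while pack(1) is active
};
#pragma pack(pop) // restore the previous packing setting

// Returns how many padding bytes the compiler added to DefaultLayout.
inline std::size_t paddingInDefaultLayout()
{
    assert(sizeof(DefaultLayout) >= sizeof(PackedLayout));
    return sizeof(DefaultLayout) - sizeof(PackedLayout);
}
```

On many mainstream compilers paddingInDefaultLayout() is 3, but that number is implementation-specific, which is exactly why the byte-for-byte assumptions in the question are fragile.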
For your struct to byte conversion, I would use something simple as:
template<typename T, typename IteratorForBytes>
void ConvertToBytes(const T& t, IteratorForBytes bytes, std::size_t pos = 0)
{
std::advance(bytes, pos);
const std::size_t length = sizeof(t);
const uint8_t* temp = reinterpret_cast<const uint8_t*>(&t);
for (std::size_t i = 0; i < length; ++i)
{
(*bytes) = (*temp);
++temp;
++bytes;
}
}
Here T is the struct (in your case, your CommCommand struct) and bytes is the output array.
CommIO::CommCommand command;
BYTE* buffer = new BYTE[sizeof(command)];
ConvertToBytes(command, buffer);
The resulting array would contain the bytes of the struct as it is laid out in memory. Note that this copies the value of the wData pointer itself, not the data it points to. You can also pass an offset as an extra parameter if you want to start filling your byte array from a particular location.

The main problem is here:
memcpy(&wData, bytes + offset, rwSize);
Member wData is a BYTE *, and you seem to mean to copy bytes into the space to which it points. Instead, you are copying data into the memory where the pointer value itself is stored. Therefore, if you copy more bytes than the size of the pointer then you will overrun its bounds and produce undefined behavior. In any case, you are trashing the original pointer value. You probably want this, instead:
memcpy(wData, bytes + offset, rwSize);
Additionally, although the rest of the deserialization code may be right for your actual serialization format, it is not safe to assume that it is right for the byte sequence you present to it in your test program via
CommIO::CommCommand command2(reinterpret_cast<BYTE*>(&command));
As detailed in comments, you are making assumptions about the layout in memory of a CommIO::CommCommand that C++ does not guarantee will hold.
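One way to avoid depending on the in-memory layout is to define the byte format explicitly and write each field in turn. The following is only a sketch under assumptions: the Command struct, the helper names, the use of std::vector for the payload, and the choice of 32-bit fields in host byte order are all mine, not part of the question's actual serialization format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

typedef unsigned char BYTE; // assumption: BYTE is an unsigned 8-bit type

enum Direction { READ, WRITE };

// Hypothetical stand-in for CommCommand that owns its payload.
struct Command
{
    std::int32_t command = 0;
    Direction dir = READ;
    std::vector<BYTE> payload;
};

// Appends the raw bytes of a 32-bit value (host byte order).
inline void appendInt32(std::vector<BYTE>& out, std::int32_t value)
{
    BYTE raw[sizeof(value)];
    std::memcpy(raw, &value, sizeof(value));
    out.insert(out.end(), raw, raw + sizeof(value));
}

// Reads a 32-bit value at offset and advances offset past it.
inline std::int32_t readInt32(const std::vector<BYTE>& in, std::size_t& offset)
{
    assert(offset + sizeof(std::int32_t) <= in.size());
    std::int32_t value = 0;
    std::memcpy(&value, in.data() + offset, sizeof(value));
    offset += sizeof(value);
    return value;
}

// Wire format: command, direction, payload size, then the payload bytes.
inline std::vector<BYTE> serialize(const Command& c)
{
    std::vector<BYTE> out;
    appendInt32(out, c.command);
    appendInt32(out, static_cast<std::int32_t>(c.dir));
    appendInt32(out, static_cast<std::int32_t>(c.payload.size()));
    out.insert(out.end(), c.payload.begin(), c.payload.end());
    return out;
}

inline Command deserialize(const std::vector<BYTE>& bytes)
{
    std::size_t offset = 0;
    Command c;
    c.command = readInt32(bytes, offset);
    c.dir = static_cast<Direction>(readInt32(bytes, offset));
    const std::int32_t size = readInt32(bytes, offset);
    assert(size >= 0 && offset + static_cast<std::size_t>(size) <= bytes.size());
    const BYTE* first = bytes.data() + offset;
    c.payload.assign(first, first + size);
    return c;
}
```

Because the payload bytes themselves are written to the buffer, no pointer value ever crosses the serialization boundary, and the test no longer needs reinterpret_cast on the struct.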

At
memcpy(&wData, bytes + offset, rwSize);
you copy from the location of the wData pointer in the source object to the location of the wData pointer of the new CommCommand. But you want to copy from and to the locations that those pointers point to. You need to dereference. You corrupt the memory around command2, because the wData member occupies only sizeof(BYTE*) bytes, but you copy rwSize bytes, which is 128 bytes. What you probably meant to write is:
memcpy(wData, *(BYTE**)(bytes + offset), rwSize);
which would use the pointer stored at bytes + offset, rather than the value of bytes + offset itself.
You also assume that your struct is tightly packed. However, C++ does not guarantee that. Is there a reason why you do not override the default copy constructor rather than write this function?
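If the goal of the test is simply to get an independent copy of a command, a deep-copying copy constructor may be enough. The sketch below uses a hypothetical OwnedBuffer type with the same rwSize and wData members as the question; the at() accessor and the copy-and-swap assignment are my additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

typedef unsigned char BYTE; // assumption: BYTE is an unsigned 8-bit type

// Hypothetical type that owns a heap buffer, like CommCommand owns wData.
struct OwnedBuffer
{
    int rwSize;
    BYTE* wData;

    // Allocates size zero-initialized bytes.
    explicit OwnedBuffer(int size)
        : rwSize(size), wData(new BYTE[size]()) {}

    // Deep copy: allocate a new buffer and copy the pointed-to bytes.
    OwnedBuffer(const OwnedBuffer& other)
        : rwSize(other.rwSize), wData(new BYTE[other.rwSize])
    {
        std::memcpy(wData, other.wData, static_cast<std::size_t>(rwSize));
    }

    // Copy-and-swap: the by-value parameter is already a deep copy.
    OwnedBuffer& operator=(OwnedBuffer other)
    {
        std::swap(rwSize, other.rwSize);
        std::swap(wData, other.wData);
        return *this;
    }

    ~OwnedBuffer() { delete[] wData; }

    // Bounds-checked element access (checked only in debug builds).
    BYTE& at(int i)
    {
        assert(i >= 0 && i < rwSize);
        return wData[i];
    }
};
```

With this in place, OwnedBuffer b(a); gives b its own buffer, so writing through b.wData does not affect a.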

Related

Run-Time Check Failure #2 - Stack around the variable 'newRow' was corrupted

I'm still getting an error that the stack around newRow was corrupted. I tried using strncat() so that I can say how many new characters are added to the string, but in the end I still have a corruption around newRow.
In terms of the variables being passed into this function, I think they are pretty straightforward. I also use sizeOfString as a custom made function because I'm not allowed to use the standard sizeof function.
char* makeRow(char elementOne[20], int elementNumber, int numCycles, int orginalData[40], float ctValues[7]){
char newRow[] = "";
int lookingAt;
int dataPoint;
char* elementPtr;
int charArrSize;
elementNumber = elementNumber--;
elementPtr = elementOne;
int lenOfElemnt = *(&elementOne + 1) - elementOne;
//charArrSize = sizeOfString(elementPtr);
charArrSize = sizeOfString(elementOne);
strncat(newRow, elementOne, charArrSize);
//strcpy(csvThirdRow, (",%s", elementOne));
for (int i = 1; i <= 5; i++)
{
lookingAt = (((i - 1) * 5) + 1 - 1);
int maxLookingAt = numCycles * 5;
dataPoint = orginalData[lookingAt];
char dataPointBuffer[100];
if (lookingAt < maxLookingAt)
{
sprintf(dataPointBuffer, ",%d", dataPoint);
charArrSize = sizeOfString(dataPointBuffer);
strncat(newRow, dataPointBuffer, charArrSize);
}
else
{
strncat(newRow, ",",1);
}
}
char ctBuffer[20];
float ctNumber = ctValues[elementNumber];
sprintf(ctBuffer, ",%.2f\n", ctNumber);
charArrSize = sizeOfString(ctBuffer);
strncat(newRow, ctBuffer, charArrSize);
return newRow;
}
If we omit the array dimension, compiler computes it for us based on the size of initialiser.
So, this
char newRow[] = "";
is same as this
char newRow[1] = "";
The size of newRow array is 1.
You are trying to copy more than 1 character to newRow array which is leading to undefined behaviour and resulting in corruption.
From strncat():
The behavior is undefined if the destination array does not have enough space for the contents of both dest and the first count characters of src, plus the terminating null character....
Maybe you should try giving the newRow array enough space, like this
char newRow[1024] = {0};
There is another problem in your code -
You are returning the address of the local variable newRow [1] from the makeRow() function. Note that the lifetime of a local (automatic) non-static variable is limited to its scope, i.e. the block in which it has been declared. Any attempt to access it outside of its lifetime leads to undefined behaviour.
A couple of things that you can do to fix it:
Either make the newRow array static, or declare newRow as a char * and allocate memory for it dynamically; in the latter case, make sure to free it once you are done with it.
[1] An expression that has type array of type is converted to an expression with type pointer to type that points to the initial element of the array object [there are a few exceptions to this rule].
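If C++ is an option, both problems (the one-byte destination and the returned local array) go away when the row is built in a std::string and returned by value. This is a hypothetical rewrite, not the asker's code: the parameter meanings are guessed from the question, and I assume elementNumber is 1-based, since the original decrements it before indexing ctValues.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical C++ rewrite of makeRow. The string grows as needed, and
// returning it by value avoids handing out a pointer to a dead local.
inline std::string makeRow(const char* element, int elementNumber, int numCycles,
                           const int* originalData, const float* ctValues)
{
    assert(elementNumber >= 1); // assumption: elementNumber is 1-based
    std::string row = element;
    const int maxLookingAt = numCycles * 5;
    for (int i = 0; i < 5; ++i)
    {
        const int lookingAt = i * 5; // same indices as the original: 0, 5, 10, 15, 20
        row += ',';
        if (lookingAt < maxLookingAt)
        {
            row += std::to_string(originalData[lookingAt]);
        }
    }
    char ctBuffer[32];
    // snprintf never writes past the end of ctBuffer
    std::snprintf(ctBuffer, sizeof(ctBuffer), ",%.2f\n", ctValues[elementNumber - 1]);
    row += ctBuffer;
    return row;
}
```

A caller that still needs a char* can use row.c_str() for as long as the returned string is alive.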

how do I solve this C++ access violation problem?

I'm getting an error in the following code. Visual Studio throws an access violation error when writing to _buf. How can I fix this?
The Sendn function is a socket sending function. It's not the problem, you can ignore it.
It looks like _buf points at 0x00000000
The error message I'm seeing is
0xC0000005: 0x00000000 : access violation
void ?????::?????(int number, string title)
{
int titlesize = sizeof(title);
int bufsize = 4 + 4 + 4 + titlesize;
char *_buf = new char[bufsize];
_buf = { 0 };
// char _buf[bufsize] = { 0 }; (edited)
int commands = 3;
int index = 0;
memcpy(_buf, &commands, sizeof(int));
index += sizeof(int);
memcpy(_buf + index, &number, sizeof(int));
index += sizeof(int);
memcpy(_buf + index, &titlesize, sizeof(int));
index += sizeof(int);
for (int i = 0; i < titlesize; i++)
{
memcpy(_buf + index, &title[i], sizeof(char));
index += sizeof(char);
}
Sendn(_buf, bufsize);
delete[] _buf;
return;
}
char *_buf = new char[bufsize];
_buf = { 0 };
This does not zero-fill the dynamically-allocated array pointed to by _buf. It sets the pointer _buf to be a null pointer. Since _buf is a null pointer, later attempts to dereference it lead to undefined behavior.
There's no need to zero-fill the array pointed to by _buf in this case, so you can simply remove the _buf = { 0 }; line.
Once you've fixed that problem, you also aren't allocating the right amount of memory. sizeof(title) will not give you the number of characters that title holds. It just gives you the static size of a std::string object, which is usually only a pointer and two integers. Use title.size() instead.
You're trying to copy the content of title together with 3 other integer numbers into _buf right? The problem is that sizeof(title) is not the length of the string stored in title. In order to get the length of title, you need to call the member function length on type std::string like this:
auto titlesize = title.length();
The sizeof operator only gives you the size of the std::string object itself (the actual characters are typically stored elsewhere, usually on the heap), and sizeof expressions are always constant expressions. On my computer, sizeof(std::string) is 24 regardless of what the actual string is.
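Putting both fixes together, a sketch of the buffer construction could look like the following. The buildPacket name and the use of std::vector<char> in place of new[]/delete[] are my choices for illustration; the field order (command, number, title size, title bytes) follows the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: packs three 32-bit integers followed by the title's
// characters. The vector frees its memory automatically.
inline std::vector<char> buildPacket(std::int32_t number, const std::string& title)
{
    const std::int32_t command = 3;
    // size() is the number of characters; sizeof(title) would not be.
    const std::int32_t titleSize = static_cast<std::int32_t>(title.size());

    std::vector<char> buf(3 * sizeof(std::int32_t) + title.size());
    std::size_t index = 0;

    std::memcpy(buf.data() + index, &command, sizeof(command));
    index += sizeof(command);
    std::memcpy(buf.data() + index, &number, sizeof(number));
    index += sizeof(number);
    std::memcpy(buf.data() + index, &titleSize, sizeof(titleSize));
    index += sizeof(titleSize);
    std::memcpy(buf.data() + index, title.data(), title.size());
    index += title.size();

    assert(index == buf.size()); // every byte of the buffer was written
    return buf;
}
```

The result could then be passed on as Sendn(packet.data(), packet.size()), assuming Sendn accepts a pointer and a length as in the question.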

memcpy not copying into buffer

I have a class with a std::vector<unsigned char> mPacket as a packet buffer (for sending UDP strings). There is a corresponding member variable mPacketNumber that keeps track of how many packets have been sent so far.
The first thing I do in the class is reserve space:
mPacket.reserve(400);
and then later, in a loop that runs while I want packets to get sent:
mPacket.clear(); //empty out the vector
long packetLength = 0; //keep track of packetLength for sending udp strings
memcpy(&mPacket[0], &mPacketNumber, 4); //4 bytes because it's a long
packetLength += 4; //add 4 bytes to the packet length
memcpy(&mPacket[packetLength], &data, dataLength);
packetLength += dataLength;
udp.send(mPacket.data(), packetLength);
Except I realized that nothing was getting sent! How peculiar.
So I dug a bit deeper, and found that mPacket.size() returns zero, while packetLength returns the size I think the packet should be.
I can't think of a reason for mPacket to have zero length -- even if I'm mishandling the data, the header with mPacketNumber should have been written just fine.
Can anyone suggest why I'm running into this problem?
thanks!
The elements you reserve are not for normal use. The elements are created only if you resize the vector. While it might look like it works, it would be a different situation with types that have constructors: you could see that the constructors were not called. This is undefined behaviour, because you're accessing elements that do not exist yet.
The .reserve() operation is normally used together with .push_back() to avoid reallocations, but this is not the case here.
The .size() is not modified if you use .reserve(). You should use .resize() instead.
Alternatively, you can use your copy operation together with .push_back() and .reserve(), but you need to drop the usage of memcpy, and instead use the std::copy together with std::back_inserter, which uses .push_back() to push the elements to the other container:
std::copy(reinterpret_cast<unsigned char*>(&mPacketNumber), reinterpret_cast<unsigned char*>(&mPacketNumber) + sizeof(mPacketNumber), std::back_inserter(mPacket));
std::copy(reinterpret_cast<unsigned char*>(&data), reinterpret_cast<unsigned char*>(&data) + dataLength, std::back_inserter(mPacket));
These reinterpret_casts are vexing, but the code still has one advantage - you won't get buffer overrun in case your estimate was too low.
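For comparison, the resize() route can be sketched as follows. The function name and signature are hypothetical; it uses sizeof(packetNumber) rather than a hard-coded 4, since the size of long differs between platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: builds a packet from a counter and a payload.
inline std::vector<unsigned char> buildUdpPacket(long packetNumber,
                                                 const void* data,
                                                 std::size_t dataLength)
{
    assert(data != nullptr || dataLength == 0);

    std::vector<unsigned char> packet;
    packet.reserve(400); // capacity only: size() is still 0 here

    // resize() creates the elements, so writing into them is valid
    // and size() reports the real packet length.
    packet.resize(sizeof(packetNumber) + dataLength);

    std::memcpy(packet.data(), &packetNumber, sizeof(packetNumber));
    if (dataLength > 0)
    {
        std::memcpy(packet.data() + sizeof(packetNumber), data, dataLength);
    }
    return packet;
}
```

After this, packet.size() can replace the separate packetLength counter when calling the send function.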
vector, apparently, doesn't count the elements when you call size(). There's a counter variable inside the vector that holds that information, because vector has plenty of memory allocated and can't really know where the end of your data is. It changes counter variable as you add/remove elements using methods of vector object, because they are programmed to do so.
You added data directly to its array pointer, which awakens no reaction of your vector object because it does not use any of its methods. Data is there, but vector doesn't acknowledge it, so counter remains at 0 and size() returns 0.
You should either replace all size() calls with packetLength, or use methods inside your vector to add/remove/read data, or use a dynamically allocated array instead of a vector, or create your own class for containing the array and managing it the way you like. To be honest, using a vector in a situation like this doesn't really make sense.
Vector is a conventional high-level object-oriented component and in most cases it should be used that way.
Example of one's own Array class:
If you used your own dynamically allocated array, you'd have to remember its length all the time in order to use it. So let's create a class that will cut us some slack in that. This example has element transfer based on memcpy, and the [] notation works perfectly. It has an original max length, but extends itself when necessary.
Also, this is an in-line class. Certain IDEs may ask you to separate it into a header and a source file, so you may have to do that yourself.
Add more methods yourself if necessary. When applying this, do not use memcpy unless you're going to change arraySize attribute manually. You've got integrated addFrom and addBytesFrom methods that use memcpy inside (assuming calling array being the destination) and separately increase arraySize. If you do want to use memcpy, setSize method can be used for forcing new array size without modifying the array.
#include <cstring>
//this way you can easily change types during coding in case you change your mind
//more conventional object-oriented method would use templates and generic programming, but lets not complicate too much now
typedef unsigned char type;
class Array {
private:
type *array;
long arraySize;
long allocAmount; //number of allocated bytes
long currentMaxSize; //number of allocated elements
//private call that extends memory taken by the array
bool reallocMore()
{
//preserve old data
type *temp = new type[currentMaxSize];
memcpy(temp, array, allocAmount);
long oldAmount = allocAmount;
//calculate new max size and number of allocation bytes
currentMaxSize *= 16;
allocAmount = currentMaxSize * sizeof(type);
//reallocate array and copy its elements back into it
delete[] array;
array = new type[currentMaxSize];
memcpy(array, temp, oldAmount);
//we no longer need temp to take space in out heap
delete[] temp;
//check if space was successfully allocated
if(array) return true;
else return false;
}
public:
//constructor
Array(bool huge)
{
if(huge) currentMaxSize = 1024 * 1024;
else currentMaxSize = 1024;
allocAmount = currentMaxSize * sizeof(type);
array = new type[currentMaxSize];
arraySize = 0;
}
//copy elements from another array and add to this one, updating arraySize
bool addFrom(void *src, long howMany)
{
//predict new array size and extend if larger than currentMaxSize
long newSize = howMany + arraySize;
while(true)
{
if(newSize > currentMaxSize)
{
bool result = reallocMore();
if(!result) return false;
}
else break;
}
//add new elements
memcpy(&array[arraySize], src, howMany * sizeof(type));
arraySize = newSize;
return true;
}
//copy BYTES from another array and add to this one, updating arraySize
bool addBytesFrom(void *src, long byteNumber)
{
//predict new array size and extend if larger than currentMaxSize
int typeSize = sizeof(type);
long howMany = byteNumber / typeSize;
if(byteNumber % typeSize != 0) howMany++;
long newSize = howMany + arraySize;
while(true)
{
if(newSize > currentMaxSize)
{
bool result = reallocMore();
if(!result) return false;
}
else break;
}
//add new elements
memcpy(&array[arraySize], src, byteNumber);
arraySize = newSize;
return true;
}
//clear the array as if it's just been made
bool clear(bool huge)
{
//huge >>> 1MB, not huge >>> 1KB
if(huge) currentMaxSize = 1024 * 1024;
else currentMaxSize = 1024;
allocAmount = currentMaxSize * sizeof(type);
delete[] array;
array = new type[currentMaxSize];
arraySize = 0;
return true;
}
//if you modify this array out of class, you must manually set the correct size
bool setSize(long newSize) {
while(true)
{
if(newSize > currentMaxSize)
{
bool result = reallocMore();
if(!result) return false;
}
else break;
}
arraySize = newSize;
return true;
}
//current number of elements
long size() {
return arraySize;
}
//current size of the stored elements in bytes
long sizeInBytes() {
return arraySize * sizeof(type);
}
//this enables the usage of [] as in yourArray[i]
type& operator[](long i)
{
return array[i];
}
};
mPacket.resize(4 + dataLength); //call this first, then copy into the vector; resize() changes size(), reserve() does not
long packetLength = 0; //keep track of packetLength for sending udp strings
memcpy(&mPacket[0], &mPacketNumber, 4); //4 bytes because it's a long
packetLength += 4; //add 4 bytes to the packet length
memcpy(&mPacket[packetLength], &data, dataLength);
packetLength += dataLength;
udp.send(mPacket.data(), packetLength);

Allocating an Array in Memory Manager

I want to successfully allocate an Array in my Memory Manager. I am having a hard time getting the data setup successfully in my Heap. I don't know how to instantiate the elements of the array, and then set the pointer that is passed in to that Array. Any help would be greatly appreciated. =)
Basically to sum it up, I want to write my own new[#] function using my own Heap block instead of the normal heap. Don't even want to think about what would be required for a dynamic array. o.O
// Parameter 1: Pointer that you want to point to the Array.
// Parameter 2: Amount of Array Elements requested.
// Return: true if Allocation was successful, false if it failed.
template <typename T>
bool AllocateArray(T*& data, unsigned int count)
{
if((m_Heap.m_Pool == nullptr) || count <= 0)
return false;
unsigned int allocSize = sizeof(T)*count;
// If we have an array, pad an extra 16 bytes so that it will start the data on a 16 byte boundary and have room to store
// the number of items allocated within this pad space, and the size of the original data type so in a delete call we can move
// the pointer by the appropriate size and call a destructor(potentially a base class destructor) on each element in the array
allocSize += 16;
unsigned int* mem = (unsigned int*)(m_Heap.Allocate(allocSize));
if(!mem)
{
return false;
}
mem[2] = count;
mem[3] = sizeof(T);
T* iter = (T*)(&(mem[4]));
data = iter;
iter++;
for(unsigned int i = 0; i < count; ++i,++iter)
{
// I have tried a bunch of stuff, not sure what to do. :(
}
return true;
}
Heap Allocate function:
void* Heap::Allocate(unsigned int allocSize)
{
Header* HeadPtr = FindBlock(allocSize);
Footer* FootPtr = (Footer*)HeadPtr;
FootPtr = (Footer*)((char*)FootPtr + (HeadPtr->size + sizeof(Header)));
// Right Split Free Memory if there is enough to make another block.
if((HeadPtr->size - allocSize) >= MINBLOCKSIZE)
{
// Create the Header for the Allocated Block and Update it's Footer
Header* NewHead = (Header*)FootPtr;
NewHead = (Header*)((char*)NewHead - (allocSize + sizeof(Header)));
NewHead->size = allocSize;
NewHead->next = NewHead;
NewHead->prev = NewHead;
FootPtr->size = NewHead->size;
// Create the Footer for the remaining Free Block and update it's size
Footer* NewFoot = (Footer*)NewHead;
NewFoot = (Footer*)((char*)NewFoot - sizeof(Footer));
HeadPtr->size -= (allocSize + HEADANDFOOTSIZE);
NewFoot->size = HeadPtr->size;
// Turn new Header and Old Footer High Bits On
(NewHead->size |= (1 << 31));
(FootPtr->size |= (1 << 31));
// Return actual allocated memory's location
void* MemAddress = NewHead;
MemAddress = ((char*)MemAddress + sizeof(Header));
m_PoolSizeTotal = HeadPtr->size;
return MemAddress;
}
else
{
// Updating descriptors
HeadPtr->prev->next = HeadPtr->next;
HeadPtr->next->prev = HeadPtr->prev;
HeadPtr->next = NULL;
HeadPtr->prev = NULL;
// Turning Header and Footer High Bits On
(HeadPtr->size |= (1 << 31));
(FootPtr->size |= (1 << 31));
// Return actual allocated memory's location
void* MemAddress = HeadPtr;
MemAddress = ((char*)MemAddress + sizeof(Header));
m_PoolSizeTotal = HeadPtr->size;
return MemAddress;
}
}
Main.cpp
int* TestArray;
MemoryManager::GetInstance()->CreateHeap(1); // Allocates 1MB
MemoryManager::GetInstance()->AllocateArray(TestArray, 3);
MemoryManager::GetInstance()->DeallocateArray(TestArray);
MemoryManager::GetInstance()->DestroyHeap();
As far as these two specific points:
Instantiate the elements of the array
Set the pointer that is passed in to that Array.
For (1): there is no definitive notion of "initializing" the elements of the array in C++. There are at least two reasonable behaviors, this depends on the semantics you want. The first is to simply zero the array (see memset). The other would be to call the default constructor for each element of the array -- I would not recommend this option as the default (zero argument) constructor may not exist.
EDIT: Example initialization using placement new
for (unsigned int i = 0; i < len; i++)
new (&arr[i]) T();
For (2): It is not exactly clear what you mean by "and then set the pointer that is passed in to that Array." You could "set" the memory returned as data = reinterpret_cast<T*>(&mem[4]), which you already do.
A few other words of caution (having written my own memory managers): be very careful about byte alignment (reinterpret_cast<uintptr_t>(mem) % 16); you'll want to ensure you are returning pointers that are word (or even 16 byte) aligned. Also, I would recommend using inttypes.h to explicitly use uint64_t to be explicit about sizing -- currently it looks like this allocator will break for >4GB allocations.
EDIT:
Speaking from experience -- writing a memory allocator is a very difficult thing to do, and it is even more painful to debug. As commenters have stated, a memory allocator is specific to the kernel -- so information about your platform would be very helpful.
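To make the placement-new suggestion concrete, here is a minimal sketch of constructing and destroying an array inside raw storage. The function names are hypothetical, the storage is assumed to be large enough and suitably aligned for T, and the sketch does not roll back already-built elements if a constructor throws.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Constructs count value-initialized objects of type T in raw storage and
// returns a pointer to the first one (nullptr if count is 0).
template <typename T>
T* constructArray(void* storage, std::size_t count)
{
    // The caller must hand in storage that is aligned for T.
    assert(reinterpret_cast<std::uintptr_t>(storage) % alignof(T) == 0);

    unsigned char* bytes = static_cast<unsigned char*>(storage);
    T* first = nullptr;
    for (std::size_t i = 0; i < count; ++i)
    {
        // Placement new runs the constructor without allocating memory.
        T* element = new (bytes + i * sizeof(T)) T();
        if (i == 0)
        {
            first = element;
        }
    }
    return first;
}

// Runs the destructors in reverse order; the storage itself is not freed.
template <typename T>
void destroyArray(T* first, std::size_t count)
{
    for (std::size_t i = count; i > 0; --i)
    {
        first[i - 1].~T();
    }
}
```

In an allocator like the one in the question, storage would be the address just past the 16-byte bookkeeping area, and a matching DeallocateArray would call something like destroyArray before returning the block to the heap.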

Allocate chunk of memory for array of structs

I need an array of this struct allocated in one solid chunk of memory. The length of "char *extension" and "char *type" are not known at compile time.
struct MIMETYPE
{
char *extension;
char *type;
};
If I used the "new" operator to initialize each element by itself, the memory may be scattered. This is how I tried to allocate a single contiguous block of memory for it:
//numTypes = total elements of array
//maxExtension and maxType are the needed lengths for the (char*) in the struct
//std::string ext, type;
unsigned int size = (maxExtension+1 + maxType+1) * numTypes;
mimeTypes = (MIMETYPE*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size);
But, when I try to load the data in like this, the data is all out of order and scattered when I try to access it later.
for(unsigned int i = 0; i < numTypes; i++)
{
//get data from file
getline(fin, line);
stringstream parser(line);
parser >> ext >> type;
//point the pointers at a spot in the memory that I allocated
mimeTypes[i].extension = (char*)(&mimeTypes[i]);
mimeTypes[i].type = (char*)((&mimeTypes[i]) + maxExtension);
//copy the data into the elements
strcpy(mimeTypes[i].extension, ext.c_str());
strcpy(mimeTypes[i].type, type.c_str());
}
can anyone help me out?
EDIT:
unsigned int size = (maxExtension+1 + maxType+1);
mimeTypes = (MIMETYPE*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size * numTypes);
for(unsigned int i = 0; i < numTypes; i++)
strcpy((char*)(mimeTypes + (i*size)), ext.c_str());
strcpy((char*)(mimeTypes + (i*size) + (maxExtension+1)), type.c_str());
You are mixing two allocations:
1) managing the array of MIMETYPE and
2) managing the arrays of characters
Maybe (I don't really understand your objectives):
struct MIMETYPE
{
char extension[maxExtension]; // maxExtension and maxType must be compile-time constants here
char type[maxType];
};
would be better, so that you can allocate the items linearly in the form:
new MIMETYPE[numTypes];
I'll put aside the point that this is premature optimization (and that you ought to just use std::string, std::vector, etc), since others have already stated that.
The fundamental problem I'm seeing is that you're using the same memory for both the MIMETYPE structs and the strings that they'll point to. No matter how you allocate it, a pointer itself and the data it points to cannot occupy the exact same place in memory.
Let's say you needed an array of 3 types and had MIMETYPE* mimeTypes pointing to the memory you allocated for them.
That means you're treating that memory as if it contains:
8 bytes: mime type 0
8 bytes: mime type 1
8 bytes: mime type 2
Now, consider what you're doing in this next line of code:
mimeTypes[i].extension = (char*)(&mimeTypes[i]);
extension is being set to point to the same location in memory as the MIMETYPE struct itself. That is not going to work. When subsequent code writes to the location that extension points to, it overwrites the MIMETYPE structs.
Similarly, this code:
strcpy((char*)(mimeTypes + (i*size)), ext.c_str());
is writing the string data in the same memory that you otherwise want the MIMETYPE structs to occupy.
If you really want to store all the necessary memory in one contiguous space, then doing so is a bit more complicated. You would need to allocate a block of memory to contain the MIMETYPE array at the start of it, and then the string data afterwards.
As an example, let's say you need 3 types. Let's also say the max length for an extension string (maxExtension) is 3 and the max length for a type string (maxType) is 10. In this case, your block of memory needs to be laid out as:
8 bytes: mime type 0
8 bytes: mime type 1
8 bytes: mime type 2
4 bytes: extension string 0
11 bytes: type string 0
4 bytes: extension string 1
11 bytes: type string 1
4 bytes: extension string 2
11 bytes: type string 2
So to allocate, setup, and fill it all correctly you would want to do something like:
unsigned int mimeTypeStringsSize = (maxExtension+1 + maxType+1);
unsigned int totalSize = (sizeof(MIMETYPE) + mimeTypeStringsSize) * numTypes;
char* data = (char*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, totalSize);
MIMETYPE* mimeTypes = (MIMETYPE*)data;
char* stringData = data + (sizeof(MIMETYPE) * numTypes);
for(unsigned int i = 0; i < numTypes; i++)
{
//get data from file
getline(fin, line);
stringstream parser(line);
parser >> ext >> type;
// set pointers to proper locations
mimeTypes[i].extension = stringData + (mimeTypeStringsSize * i);
mimeTypes[i].type = stringData + (mimeTypeStringsSize * i) + maxExtension+1;
//copy the data into the elements
strcpy(mimeTypes[i].extension, ext.c_str());
strcpy(mimeTypes[i].type, type.c_str());
}
(Note: I've based my byte layout explanations on typical behavior of 32-bit code. 64-bit code would have more space used for the pointers, but the principle is the same. Furthermore, the actual code I've written here should work regardless of 32/64-bit differences.)
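The same layout can be sketched portably with std::calloc in place of HeapAlloc, which makes it easier to try outside Windows. The buildMimeTable helper and its parameter list are hypothetical; the caller is assumed to release the block with std::free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

struct MIMETYPE
{
    char* extension;
    char* type;
};

// Hypothetical helper: puts the MIMETYPE array and all string data into one
// zero-filled block. Returns nullptr if allocation fails (and possibly for an
// empty input). Release the block with std::free.
inline MIMETYPE* buildMimeTable(
    const std::vector<std::pair<std::string, std::string>>& entries,
    std::size_t maxExtension, std::size_t maxType)
{
    const std::size_t stringsSize = (maxExtension + 1) + (maxType + 1);
    const std::size_t count = entries.size();

    void* block = std::calloc(count, sizeof(MIMETYPE) + stringsSize);
    if (block == nullptr)
    {
        return nullptr;
    }

    // The struct array sits at the start, the string area right after it.
    MIMETYPE* table = static_cast<MIMETYPE*>(block);
    char* stringData = static_cast<char*>(block) + sizeof(MIMETYPE) * count;

    for (std::size_t i = 0; i < count; ++i)
    {
        assert(entries[i].first.size() <= maxExtension);
        assert(entries[i].second.size() <= maxType);
        table[i].extension = stringData + stringsSize * i;
        table[i].type = table[i].extension + maxExtension + 1;
        std::strcpy(table[i].extension, entries[i].first.c_str());
        std::strcpy(table[i].type, entries[i].second.c_str());
    }
    return table;
}
```

The two assert checks matter: strcpy does no bounds checking, so an over-long string would otherwise spill into the next entry's slot.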
What you need to do is get a garbage collector and manage the heap. A simple collector using RAII for object destruction is not that difficult to write. That way, you can simply allocate off the collector and know that it's going to be contiguous. However, you should really, REALLY profile before determining that this is a serious problem for you. When that happens, you can typedef many std types like string and stringstream to use your custom allocator, meaning that you can go back to just std::string instead of the C-style string horrors you have there.
You really have to know the length of extension and type in order to allocate MIMETYPEs contiguously (if "contiguously" means that extension and type are actually allocated within the object). Since you say that the length of extension and type are not known at compile time, you cannot do this in an array or a vector (the overall length of a vector can be set and changed at runtime, but the size of the individual elements must be known at compile time, and you can't know that size without knowing the length of extension and type).
I would personally recommend using a vector of MIMETYPEs, and making the extension and type fields both strings. Your requirements sound suspiciously like premature optimization guided by a gut feeling that dereferencing pointers is slow, especially if the pointers cause cache misses. I wouldn't worry about that until you have actual data that reading these fields is an actual bottleneck.
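The vector-of-strings approach might look roughly like this. The MimeType struct and the parseMimeTypes function are names I made up for the sketch; the input format (one "extension type" pair per line) is taken from the question's parsing loop.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical replacement for MIMETYPE that owns its strings.
struct MimeType
{
    std::string extension;
    std::string type;
};

// Reads one "extension type" pair per line; lines that do not contain
// two tokens are skipped.
inline std::vector<MimeType> parseMimeTypes(std::istream& in)
{
    std::vector<MimeType> result;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream parser(line);
        MimeType entry;
        if (parser >> entry.extension >> entry.type)
        {
            // A successful extraction always yields non-empty tokens.
            assert(!entry.extension.empty() && !entry.type.empty());
            result.push_back(entry);
        }
    }
    return result;
}
```

The vector keeps the MimeType objects themselves contiguous, and no manual size bookkeeping or strcpy is involved.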
However, I can think of a possible "solution": you can allocate the extension and type strings inside the MIMETYPE object when they are shorter than a particular threshold and allocate them dynamically otherwise:
#include <algorithm>
#include <cstring>
#include <new>
template<size_t Threshold> class Kinda_contig_string {
char contiguous_buffer[Threshold];
char* value;
public:
Kinda_contig_string() : value(NULL) { }
Kinda_contig_string(const char* s)
{
size_t length = std::strlen(s);
if (length < Threshold) {
value = contiguous_buffer;
}
else {
value = new char[length + 1]; // +1 for the terminating '\0'
}
std::strcpy(value, s);
}
void set(const char* s)
{
size_t length = std::strlen(s);
if (length < Threshold && value == contiguous_buffer) {
// simple case, both old and new string fit in contiguous_buffer
// and value points to contiguous_buffer
std::strcpy(contiguous_buffer, s);
return;
}
if (length >= Threshold && value == contiguous_buffer) {
// old string fit in contiguous_buffer, new string does not
value = new char[length + 1]; // +1 for the terminating '\0'
std::strcpy(value, s);
return;
}
if (length < Threshold && value != contiguous_buffer) {
// old string did not fit in contiguous_buffer, but new string does
std::strcpy(contiguous_buffer, s);
delete[] value;
value = contiguous_buffer;
return;
}
// old and new strings both too long to fit in contiguous_buffer
// provide strong exception guarantee
char* temp_buffer = new char[length + 1]; // +1 for the terminating '\0'
std::strcpy(temp_buffer, s);
std::swap(temp_buffer, value);
delete[] temp_buffer;
return;
}
const char* get() const
{
return value;
}
};
class MIMETYPE {
Kinda_contig_string<16> extension;
Kinda_contig_string<64> type;
public:
const char* get_extension() const
{
return extension.get();
}
const char* get_type() const
{
return type.get();
}
void set_extension(const char* e)
{
extension.set(e);
}
// t must be NULL terminated
void set_type(const char* t)
{
type.set(t);
}
MIMETYPE() : extension(), type() { }
MIMETYPE(const char* e, const char* t) : extension(e), type(t) { }
};
I really can't endorse this without feeling guilty.
Add one byte in between the strings... extension and type are not \0-terminated the way you do it.
Here you allocate allowing for an extra \0 - OK
unsigned int size = (maxExtension+1 + maxType+1) * numTypes;
mimeTypes = (MIMETYPE*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size);
Here you don't leave any room for extension's ending \0 (if string len == maxExtension)
//point the pointers at a spot in the memory that I allocated
mimeTypes[i].extension = (char*)(&mimeTypes[i]);
mimeTypes[i].type = (char*)((&mimeTypes[i]) + maxExtension);
Instead, I think it should be
mimeTypes[i].type = (char*)((&mimeTypes[i]) + maxExtension + 1);