Testing constructor initialization list - c++

I am working on a test which checks if all class attributes are initialized in a constructor.
My current solution works for non pointer attributes:
void CSplitVectorTest::TestConstructorInitialization()
{
const size_t memorySize = sizeof(CSplitVector);
char* pBuffer1 = (char*) malloc(memorySize);
char* pBuffer2 = (char*) malloc(memorySize);
memset(pBuffer1,'?',memorySize);
memset(pBuffer2,'-',memorySize);
new(pBuffer1) CSplitVector;
new(pBuffer2) CSplitVector;
const bool bObjectsAreEqual = memcmp(pBuffer1,pBuffer2,memorySize)==0;
if (!TEST(bObjectsAreEqual))
{
COMMENT("Constructor initialization list not complete!");
}
free(pBuffer1);
free(pBuffer2);
}
Do you have an idea how it could be improved to test whether pointers are initialized?

Your test checks whether every byte of the object has been written over by the constructor. As a straight memory check it looks OK, although if the class contains other objects which don't necessarily initialise themselves fully, you may be in trouble.
That said, my main question would be: Is it really an effective test? For example, is it critical that every attribute in the CSplitVector class is initialised by the initialisation list? Do you perhaps have some which may not need to be initialised at this point? Also, how about checking whether the attributes are set to values that you'd expect?
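The last suggestion, checking that attributes hold the values you expect, can be sketched as follows. The members of CSplitVector are not shown in the question, so SplitVectorLike and its three fields are hypothetical stand-ins, and the expected initial state (null pointer, zero sizes) is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CSplitVector; the real members are not shown in the question.
class SplitVectorLike {
public:
    SplitVectorLike() : m_pData(nullptr), m_size(0), m_gapStart(0) {}
    const int* data() const { return m_pData; }
    std::size_t size() const { return m_size; }
    std::size_t gapStart() const { return m_gapStart; }
private:
    int* m_pData;
    std::size_t m_size;
    std::size_t m_gapStart;
};

// Returns true when a default-constructed object is in the state we assume it documents.
bool defaultStateIsAsExpected() {
    const SplitVectorLike v;
    return v.data() == nullptr && v.size() == 0 && v.gapStart() == 0;
}
```

Unlike the memory comparison, a check like this does not depend on padding or on what the fill byte happens to be, but it has to be updated by hand whenever a member is added.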

Instead of comparing byte by byte, you should probably use the padding or word size and test whether any byte of each word was initialized. That way you work around the compiler inserting padding and the constructor leaving those bytes between shorter-than-word fields uninitialized.
To find the real padding size, shooting from the hip, the following code should do it pretty reliably:
struct PaddingTest {
volatile char c; // volatile probably not needed, but will not hurt either
volatile int i;
static int getCharPadding() {
PaddingTest *t = new PaddingTest;
int diff = (int)((char*)&(t->i) - (char*)&(t->c)); // subtract as char*; casting a pointer to int truncates on 64-bit
delete t;
return diff;
}
};
Edit: You still need the two objects, but you no longer compare them to each other. You compare each word of each object to its memset value, and if it changed in either object, the word was touched (in the other object too; it is just chance that it was initialized to the same value you memset).
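The padding measurement above can also be done without allocating anything, using offsetof. This is a minimal sketch; PaddingProbe, charPadding and wordSize are names made up for illustration, and using alignof(int) as the "word" is an assumption that only fits classes whose widest scalar is an int.

```cpp
#include <cassert>
#include <cstddef>

// Same shape as PaddingTest above: one char followed by one int.
struct PaddingProbe {
    char c;
    int i;
};

// Number of padding bytes the compiler inserted between c and i.
constexpr std::size_t charPadding() {
    return offsetof(PaddingProbe, i) - sizeof(char);
}

// Assumed scan granularity: the alignment of the widest scalar in the probe.
constexpr std::size_t wordSize() {
    return alignof(int);
}
```

A word-wise scan would then step through the buffer in chunks of wordSize() and treat a chunk as initialized as soon as any byte in it differs from the fill value.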

I found a solution to the problems mentioned and tested it with initialized and uninitialized pointers and with types of different lengths.
In the test header I added #pragma pack(1) (I am working with gcc):
#pragma pack(1)
#include <CSplitVector>
The test got a little bit complicated:
void CSplitVectorTest::TestConstructorInitialization()
{
const size_t memorySize = sizeof(CSplitVector);
char* pBuffer = (char*) malloc(memorySize);
memset(pBuffer,'?',memorySize);
CSplitVector* pSplitVector = new(pBuffer) CSplitVector;
// find pointers for all '?'
QList<char*> aFound;
char* pFoundChar = (char*) memchr(pBuffer,'?',memorySize);
while (pFoundChar)
{
aFound.append(pFoundChar);
char* pStartFrom = pFoundChar+1;
pFoundChar = (char*) memchr(pStartFrom,'?',memorySize-(int)(pStartFrom-pBuffer));
}
// if there are any '?'....
if (aFound.count())
{
// allocate the same area with '-'...
pSplitVector->~CSplitVector();
memset(pBuffer,'-',memorySize);
pSplitVector = new(pBuffer) CSplitVector;
// and check if places found before contain '-'
while (aFound.count())
{
pFoundChar = aFound.takeFirst();
if (*pFoundChar=='-')
{
// if yes then class has uninitialized attribute
TEST_FAILED("Constructor initialization list not complete!");
pSplitVector->~CSplitVector();
free(pBuffer);
return;
}
}
}
// if no then all attributes are initialized
pSplitVector->~CSplitVector();
free(pBuffer);
TEST(true);
}
Feel free to point out any flaws in this solution.

Related

Coalescing two memory chunks in C++?

I'm trying to make my own memory allocator in C++ for educational purposes, and I have a code like this:
class IntObj
{
public:
IntObj(): var_int(6) {}
void setVar(int var)
{
var_int = var;
}
int getVar()
{
return var_int;
}
virtual size_t getMemorySize()
{
return sizeof(*this);
}
int a = 8;
~IntObj()
{}
private:
int var_int;
};
And I'm stuck with how to have unused memory chunks merge. I'm trying to test it like this:
char *pz = new char[sizeof(IntObj) * 2]; //In MacOS, IntObj takes 16 bytes
char *pz2 = &pz[sizeof(IntObj)]; // Take address of 16-th cell
char *pz3 = new char[sizeof(IntObj) / 2]; //Array of 8 bytes
char **pzz = &pz2;
pzz[sizeof(IntObj)] = pz3; // Set address of cell 16 to the pz3 array
new (&pzz) IntObj; //placement new
IntObj *ss = reinterpret_cast<IntObj *>(&pzz);
cout << ss->a;
The output is 8 as expected. My questions:
Why is the output correct?
Is code like this correct? If not, are there other ways to implement coalescing of two memory chunks?
UPDATE: All methods work correctly.
e.g this would work:
ss->setVar(54);
cout << ss->getVar();
The output is 54.
UPDATE 2: First of all, my task is not to request a new memory block from OS for instantiating an object, but to give it from a linked list of free blocks(that were allocated when starting a program). My problem is that I can have polymorphic objects with different sizes, and don't know how to split memory blocks, or merge (that is what I understand by merging or coalescing chunks) them (if allocation is requested) effectively.
There are a number of misunderstandings apparent here:
char *pz = new char[sizeof(IntObj) * 2]; // fine
char *pz2 = &pz[sizeof(IntObj)]; // fine
char *pz3 = new char[sizeof(IntObj) / 2]; // fine
char **pzz = &pz2; // fine
pzz[sizeof(IntObj)] = pz3; // bad
pzz is a pointer that is pointing to only a single char*, which is the variable pz2. Meaning that any access or modification past pzz[0] is undefined behavior (very bad). You're likely modifying the contents of some other variable.
new (&pzz) IntObj; // questionable
This is constructing an IntObj in the space of the variable pzz, not where pzz is pointing to. The constructor of course sets a to 8 thereby stomping on the contents of pzz (it won't be pointing to pz2 anymore). I'm uncertain if this in-and-of-itself is undefined behavior (since there would be room for a whole IntObj), but using it certainly is:
IntObj *ss = reinterpret_cast<IntObj *>(&pzz); // bad
This violates the strict-aliasing rule. While the standard is generous for char* aliases, it does not allow char** to IntObj* aliases. This exhibits more undefined behavior.
If your question comes down to whether or not you can use two independent and contiguous blocks of memory as a single block then no, you cannot.
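What does work, and seems to match the setup described in UPDATE 2, is carving every block out of one large allocation and merging neighbours inside it. The following is only a sketch under that assumption: Arena, BlockHeader and the 1024-byte size are made up for illustration, the search is a plain first-fit walk, and a strictly conforming version would also need std::launder in places.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical block header; every block in the arena starts with one.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t size;  // payload bytes that follow this header
    bool isFree;
};

class Arena {
public:
    Arena() {
        BlockHeader* h = ::new (static_cast<void*>(m_buf)) BlockHeader;
        h->size = sizeof(m_buf) - sizeof(BlockHeader);
        h->isFree = true;
    }

    // First-fit allocation; splits a block when the remainder is still usable.
    void* allocate(std::size_t n) {
        n = (n + kAlign - 1) / kAlign * kAlign;  // keep payloads aligned
        for (BlockHeader* h = first(); h != nullptr; h = next(h)) {
            if (!h->isFree || h->size < n) continue;
            if (h->size >= n + sizeof(BlockHeader) + kAlign) {
                BlockHeader* rest = ::new (static_cast<void*>(payload(h) + n)) BlockHeader;
                rest->size = h->size - n - sizeof(BlockHeader);
                rest->isFree = true;
                h->size = n;
            }
            h->isFree = false;
            return payload(h);
        }
        return nullptr;  // no free block is large enough
    }

    // Marks the block free, then merges every run of adjacent free blocks.
    void deallocate(void* p) {
        if (p == nullptr) return;
        BlockHeader* h = reinterpret_cast<BlockHeader*>(
            static_cast<unsigned char*>(p) - sizeof(BlockHeader));
        h->isFree = true;
        for (BlockHeader* b = first(); b != nullptr;) {
            BlockHeader* n = next(b);
            if (b->isFree && n != nullptr && n->isFree) {
                b->size += sizeof(BlockHeader) + n->size;  // absorb the neighbour
            } else {
                b = n;
            }
        }
    }

private:
    static const std::size_t kAlign = alignof(std::max_align_t);
    static const std::size_t kSize = 1024;

    BlockHeader* first() { return reinterpret_cast<BlockHeader*>(m_buf); }
    unsigned char* payload(BlockHeader* h) {
        return reinterpret_cast<unsigned char*>(h) + sizeof(BlockHeader);
    }
    // The block that physically follows h, or nullptr at the end of the arena.
    BlockHeader* next(BlockHeader* h) {
        unsigned char* end = payload(h) + h->size;
        return end == m_buf + sizeof(m_buf) ? nullptr : reinterpret_cast<BlockHeader*>(end);
    }

    alignas(std::max_align_t) unsigned char m_buf[kSize];
};
```

Because the blocks are physically adjacent inside one buffer, merging is just adding the neighbour's header and payload size to the current block. An object is then constructed in the returned memory with placement new, passing the pointer itself rather than the address of the pointer variable.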

How to avoid dynamic allocation of memory C++

[edit] Outside of this get method (see below), i'd like to have a pointer double * result; and then call the get method, i.e.
// Pull results out
int story = 3;
double * data;
int len;
m_Scene->GetSectionStoryGrid_m(story, data, len);
With that said, I want a get method that simply sets the result (*&data) by reference and does not dynamically allocate memory.
The results I am looking for already exist in memory, but they are within C-structs and are not in one continuous block of memory. Fyi, &len is just the length of the array. I want one big array that holds all of the results.
Since the actual results that I am looking for are stored within the native C-struct pointer story_ptr->int_hv[i].ab.center.x;. How would I avoid dynamically allocating memory like I am doing above? I’d like to point the data* to the results, but I just don’t know how to do it. It’s probably something simple I am overlooking… The code is below.
Is this even possible? From what I've read, it is not, but as my username implies, I'm not a software developer. Thanks to all who have replied so far by the way!
Here is a snippet of code:
void GetSectionStoryGrid_m( int story_number, double *&data, int &len )
{
std::stringstream LogMessage;
if (!ValidateStoryNumber(story_number))
{
data = NULL;
len = -1;
}
else
{
// Check to see if we already retrieved this result
if ( m_dStoryNum_To_GridMap_m.find(story_number) == m_dStoryNum_To_GridMap_m.end() )
{
data = new double[GetSectionNumInternalHazardVolumes()*3];
len = GetSectionNumInternalHazardVolumes()*3;
Story * story_ptr = m_StoriesInSection.at(story_number-1);
int counter = 0; // counts the current int hv number we are on
for ( int i = 0; i < GetSectionNumInternalHazardVolumes() && story_ptr->int_hv != NULL; i++ )
{
data[0 + counter] = story_ptr->int_hv[i].ab.center.x;
data[1 + counter] = story_ptr->int_hv[i].ab.center.y;
data[2 + counter] = story_ptr->int_hv[i].ab.center.z;
m_dStoryNum_To_GridMap_m.insert( std::pair<int, double*>(story_number,data));
counter += 3;
}
}
else
{
data = m_dStoryNum_To_GridMap_m.find(story_number)->second;
len = GetSectionNumInternalHazardVolumes()*3;
}
}
}
Consider returning a custom accessor class instead of the "double *&data". Depending on your needs that class would look something like this:
class StoryGrid {
public:
StoryGrid(int story_index):m_storyIndex(story_index) {
m_storyPtr = m_StoriesInSection.at(story_index-1);
}
inline int length() { return GetSectionNumInternalHazardVolumes()*3; }
double &operator[](int index) {
int i = index / 3;
int axis = index % 3;
switch(axis){
case 0: return m_storyPtr->int_hv[i].ab.center.x;
case 1: return m_storyPtr->int_hv[i].ab.center.y;
case 2: return m_storyPtr->int_hv[i].ab.center.z;
}
}
};
Sorry for any syntax problems, but you get the idea. Return a reference to this and record this in your map. If done correctly, the map will then manage all of the dynamic allocation required.
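A compilable variant of the accessor idea might look like the sketch below. Point3 and HazardVolume are hypothetical stand-ins for the C structs in the question (the real path is int_hv[i].ab.center), and CenterView is a made-up name; the view owns nothing and only maps a flat index onto the existing fields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the C structs in the question.
struct Point3 { double x, y, z; };
struct HazardVolume { Point3 center; };

// Non-owning view that exposes the centers as one flat sequence x0,y0,z0,x1,...
class CenterView {
public:
    CenterView(HazardVolume* volumes, std::size_t count)
        : m_volumes(volumes), m_count(count) {}

    std::size_t length() const { return m_count * 3; }

    // Maps a flat index onto the x, y or z field of the matching volume.
    double& operator[](std::size_t index) {
        assert(index < length());
        Point3& c = m_volumes[index / 3].center;
        switch (index % 3) {
            case 0: return c.x;
            case 1: return c.y;
            default: return c.z;
        }
    }

private:
    HazardVolume* m_volumes;  // not owned; must outlive the view
    std::size_t m_count;
};
```

Nothing is copied or allocated here, so the caller no longer gets a contiguous double*; code that needs one contiguous array still has to copy.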
So you want the allocated array to go "down" in the call stack. You can only achieve this by allocating it on the heap, using dynamic allocation, or by creating a static variable, since the lifetime of static variables is not controlled by the call stack.
void GetSectionStoryGrid_m( int story_number, double *&data, int &len )
{
static double g_data[DATA_SIZE]; // DATA_SIZE must be a compile-time constant large enough for the biggest story
data = g_data;
// continues ...
If you want to "avoid any allocation", the solution by @Speed8ump is your first choice! But then you will not have your double * result; anymore. You will be turning your "offline" solution (calculate the whole array first, then use the array elsewhere) into an "online" solution (calculate values as they are needed). This is a good refactoring to avoid memory allocation.
The answer to this question depends on the lifetime of the doubles you want pointers to. Consider:
// "pointless" because it takes no input and throws away all its work
void pointless_function()
{
double foo = 3.14159;
int j = 0;
for (int i = 0; i < 10; ++i) {
j += i;
}
}
foo exists and has a value inside pointless_function, but ceases to exist as soon as the function exits. Even if you could get a pointer to it, that pointer would be useless outside of pointless_function. It would be a dangling pointer, and dereferencing it would trigger undefined behavior.
On the other hand, you are correct that if you have data in memory (and you can guarantee it will live long enough for whatever you want to do with it), it can be a great idea to get pointers to that data instead of paying the cost to copy it. However, the main way for data to outlive the function that creates it is to call new, new[], or malloc. You really can't get out of that.
Looking at the code you posted, I don't see how you can avoid new[]-ing up the doubles when you create story. But you can then get pointers to those doubles later without needing to call new or new[] again.
I should mention that pointers to data can be used to modify the original data. Often that can lead to hard-to-track-down bugs. So there are times that it's better to pay the price of copying the data (which you're then free to muck with however you want), or to get a pointer-to-const (in this case const double* or double const*, they are equivalent; a pointer-to-const will give you a compiler error if you try to change the data being pointed to). In fact, that's so often the case that the advice should be inverted: "there are a few times when you don't want to copy or get a pointer-to-const; in those cases you must be very careful."
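Putting the last few paragraphs together: allocate once per story, keep the storage alive in a container, and hand out pointer-to-const afterwards. The sketch below assumes this shape; GridCache and build() are hypothetical, and build() only fabricates placeholder values where the real code would copy x/y/z out of the C structs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical cache: owns one flat array per story and hands out read-only pointers.
class GridCache {
public:
    // Returns a pointer to the cached array for `story`, building it on first use.
    // `len` receives the number of doubles. The pointer stays valid as long as the cache lives.
    const double* get(int story, std::size_t& len) {
        auto it = m_grids.find(story);
        if (it == m_grids.end()) {
            it = m_grids.emplace(story, build(story)).first;  // built once per story
        }
        len = it->second.size();
        return it->second.data();
    }

private:
    // Placeholder for copying the coordinates out of the C structs.
    static std::vector<double> build(int story) {
        std::vector<double> flat;
        for (int i = 0; i < 3; ++i) flat.push_back(story * 10.0 + i);
        return flat;
    }

    std::map<int, std::vector<double> > m_grids;
};
```

The vectors release their memory when the cache is destroyed, so there is no new[] to pair with a delete[], and returning const double* makes accidental writes to the cached data a compile error.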

How to make char array and std::string "in a relationship"?

I'm looking for a way to associate a char array with a string so that whenever the char array changes, the string also changes. I tried to put both the char array and the string variable in a union, but that didn't work as the compiler complained...
Any ideas are welcome...
class Observable_CharArray
{
char* arr;
std::function<void(char*)> change_callback;
public:
Observable_CharArray(int size, std::function<void(char*)> callback)
: arr(new char[size]), change_callback(callback){}
~Observable_CharArray()/*as mentioned by Hulk*/
{
delete[] arr;
}
void SetCallback(std::function<void(char*)> callback)
{
change_callback = callback;
}
/*other member function to give access to array*/
void change_function()
{
//change the array here
change_callback(arr);
}
};
class Observer_String
{
std::string rep;
void callback(char* cc)
{
rep = std::string(cc);
}
public:
Observer_String(Observable_CharArray* och)
{
och->SetCallback(std::bind(&Observer_String::callback, this, std::placeholders::_1)); // needs <functional>
}
/*other member functions to access rep*/
};
The design can definitely be improved.
There can be other ways to solve your actual problem rather than observing char arrays.
The problem is that the std::string may change the string array inside (especially when it resizes). For instance, c_str returns the address of the current string - documentation says that "The pointer returned may be invalidated by further calls to other member functions that modify the object.".
If you're sure you won't call string methods (hence the string will stay at the same memory location), you could try accessing the c_str pointer (your char array) directly and modify its content.
std::string str = "test";
char* arr = (char*)str.c_str();
arr[3] = 'a';
NOTE: I strongly advise against this except in a testing context.
In other words, the string class doesn't guarantee it's going to stay in the same place in memory - meaning trying to access it through a char array is impossible.
The best is to create another string class that enforces the char array to always stay the same size (and so can stay in the same memory position all the time). You could also create a bigger array (max size string for instance) to cope with any string size changes - but that should be enforced in your wrapper class.
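A wrapper of the kind described in the last paragraph could be sketched like this. FixedString is a hypothetical name; the storage is a member array of fixed capacity, so the char* handed out stays valid for the lifetime of the object, and longer input is truncated rather than reallocated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical wrapper: the storage is a member array, so its address never
// changes while the object lives.
template <std::size_t Capacity>
class FixedString {
public:
    FixedString() { m_buf[0] = '\0'; }

    // Copies at most Capacity characters; returns false if the text was truncated.
    bool assign(const char* text) {
        const std::size_t n = std::strlen(text);
        const std::size_t kept = n < Capacity ? n : Capacity;
        std::memcpy(m_buf, text, kept);
        m_buf[kept] = '\0';
        return kept == n;
    }

    char* data() { return m_buf; }               // writable view for C-style code
    const char* c_str() const { return m_buf; }
    std::string str() const { return std::string(m_buf); }  // snapshot copy

private:
    char m_buf[Capacity + 1];  // +1 for the terminator
};
```

Code that edits the array through data() and code that reads it through str() always see the same bytes, which is the "relationship" the question asks for, at the price of a fixed maximum length.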
Well you can do this, but you shouldn't
#include <iostream>
#include <string>
int main()
{
std::string test("123456789");
std::cout << test << "\n";
char* data = &test.front(); // use &(*test.begin()) for pre-C++11 code
for ( size_t i(0); i < test.size(); ++i )
{
data[i] = 57 - i;
}
std::cout << test << "\n";
}
Output will be
123456789
987654321
This, however, goes against everything std::string is trying to facilitate for you. If you use data, you risk causing UB, and changes to test may make data point to garbage.
You should not do this!
However, there are many (dangerous) ways to achieve it:
char* cStr = const_cast<char*>(cppStr.c_str());
or
char* cStr = const_cast<char*>(cppStr.data());
or
char* cStr = &cppStr[0];
But beware that the cppStr might be reallocated whenever you touch it, hence invalidating your cStr. That would crash at some point in time, although maybe not immediately (which is even worse).
Therefore, if you are going to do this anyway, make sure to call cppStr.reserve(SOMETHING) *before* you get the cStr out of it. This way, you will at least stabilise the pointer for a while.
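The effect of reserve can be observed with a small helper. This is a sketch only: appendKeepsBuffer is a made-up name, and the result reflects how mainstream standard library implementations behave when the new size fits in the reserved capacity; the standard's wording on pointer invalidation for std::string is looser than that.

```cpp
#include <cassert>
#include <string>

// Returns true if appending `extra` to `s` left its character buffer where it was.
// Assumes `s` is non-empty and the caller reserved enough capacity beforehand.
bool appendKeepsBuffer(std::string& s, const std::string& extra) {
    const char* before = &s[0];
    s += extra;
    return &s[0] == before;
}
```

Once an append pushes the size past the reserved capacity, the buffer may move and any char* taken earlier is dangling.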

How to initialize an array that is part of a struct typedef?

If I have a typedef of a struct
typedef struct
{
char SmType;
char SRes;
float SParm;
float EParm;
WORD Count;
char Flags;
char unused;
GPOINT2 Nodes[];
} GPATH2;
and it contains an uninitialized array, how can I create an instance of this type so that it will hold, say, 4 values in Nodes[]?
Edit: This belongs to an API for a program written in Assembler. I guess as long as the underlying data in memory is the same, an answer changing the struct definition would work, but not if the underlying memory is different. The Assembly Language application is not using this definition .... but .... a C program using it can create GPATH2 elements that the Assembly Language application can "read".
Can I ever resize Nodes[] once I have created an instance of GPATH2?
Note: I would have placed this with a straight C tag, but there is only a C++ tag.
You could use a bastard mix of C and C++ if you really want to:
#include <new>
#include <cstdlib>
#include "definition_of_GPATH2.h"
using namespace std;
int main(void)
{
int i;
/* Allocate raw memory buffer */
void * raw_buffer = calloc(1, sizeof(GPATH2) + 4 * sizeof(GPOINT2));
/* Initialize struct with placement-new */
GPATH2 * path = new (raw_buffer) GPATH2;
path->Count = 4;
for ( i = 0 ; i < 4 ; i++ )
{
path->Nodes[i].x = rand();
path->Nodes[i].y = rand();
}
/* Resize raw buffer */
raw_buffer = realloc(raw_buffer, sizeof(GPATH2) + 8 * sizeof(GPOINT2));
/* 'path' still points to the old buffer that might have been free'd
* by realloc, so it has to be re-initialized
* realloc copies old memory contents, so I am not certain this would
* work with a proper object that actually does something in the
* constructor
*/
path = new (raw_buffer) GPATH2;
/* now we can write more elements of array */
path->Count = 5;
path->Nodes[4].x = rand();
path->Nodes[4].y = rand();
/* Because this is allocated with malloc/realloc, free it with free
* rather than delete.
* If 'path' was a proper object rather than a struct, you should
* call the destructor manually first.
*/
free(raw_buffer);
return 0;
}
Granted, it's not idiomatic C++ as others have observed, but if the struct is part of legacy code it might be the most straightforward option.
Correctness of the above sample program has only been checked with valgrind using dummy definitions of the structs, your mileage may vary.
If it is fixed size write:
typedef struct
{
char SmType;
char SRes;
float SParm;
float EParm;
WORD Count;
char Flags;
char unused;
GPOINT2 Nodes[4];
} GPATH2;
If it is not fixed, change the declaration to
GPOINT2* Nodes;
and after creation or in the constructor do
Nodes = new GPOINT2[size];
If you want to resize it, you should use vector<GPOINT2>, because you can't resize an array, only create a new one. If you decide to do that by hand, don't forget to delete[] the previous one.
Also, typedef is not needed in C++; you can write
struct GPATH2
{
char SmType;
char SRes;
float SParm;
float EParm;
WORD Count;
char Flags;
char unused;
GPOINT2 Nodes[4];
};
This appears to be the C99 flexible array member, the standardised form of the idiom known as the "struct hack". You cannot (in standard C99; some compilers have an extension that allows it) declare a variable with this type, but you can declare pointers to it. You have to allocate objects of this type with malloc, providing extra space for the appropriate number of array elements. If nothing holds a pointer to an array element, you can resize the array with realloc.
Code that needs to be backward compatible with C89 needs to use
GPOINT2 Nodes[1];
as the last member, and take note of this when allocating.
This is very much not idiomatic C++ -- note for instance that you would have to jump through several extra hoops to make new and delete usable -- although I have seen it done. Idiomatic C++ would use vector<GPOINT2> as the last member of the struct.
Arrays of unknown size are not valid as C++ data members. They are valid in C99, and your compiler may be mixing C99 support with C++.
What you can do in C++ is 1) give it a size, 2) use a vector or another container, or 3) ditch both automatic (local variable) and normal dynamic storage in order to control allocation explicitly. The third is particularly cumbersome in C++, especially with non-POD, but possible; example:
struct A {
int const size;
char data[1];
~A() {
// if data was of non-POD type, we'd destruct data[1] to data[size-1] here
}
static auto_ptr<A> create(int size) {
// because new is used, auto_ptr's use of delete is fine
// consider another smart pointer type that allows specifying a deleter
void *raw = ::operator new(sizeof(A) + (size - 1) * sizeof(char));
A *p;
try { // not necessary in our case, but is if A's ctor can throw
p = ::new(raw) A(size); // ::new skips the private member operator new declared below
}
catch (...) {
::operator delete(raw);
throw;
}
return auto_ptr<A>(p);
}
private:
A(int size) : size (size) {
// if data was of non-POD type, we'd construct here, being very careful
// of exception safety
}
A(A const &other); // be careful if you define these,
A& operator=(A const &other); // but it likely makes sense to forbid them
void* operator new(size_t size); // doesn't prevent all erroneous uses,
void* operator new[](size_t size); // but this is a start
};
Note you cannot trust sizeof(A) anywhere else in the code, and using an array of size 1 guarantees alignment (matters when the type isn't char).
This type of structure is not trivially usable on the stack; you'll have to malloc it. The significant thing to know is that sizeof(GPATH2) doesn't include the trailing array. So to create one, you'd do something like this:
GPATH2 *somePath;
size_t numPoints;
numPoints = 4;
somePath = malloc(sizeof(GPATH2) + numPoints*sizeof(GPOINT2));
I'm guessing GPATH2.Count is the number of elements in the Nodes array, so if it's up to you to initialize that, be sure and set somePath->Count = numPoints; at some point. If I'm mistaken, and the convention used is to null terminate the array, then you would do things just a little different:
somePath = malloc(sizeof(GPATH2) + (numPoints+1)*sizeof(GPOINT2));
somePath->Nodes[numPoints] = Some_Sentinel_Value;
Make darn sure you know which convention the library uses.
As other folks have mentioned, realloc() can be used to resize the struct, but it will invalidate old pointers to the struct, so make sure you aren't keeping extra copies of it (like passing it to the library).
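The allocate-then-realloc pattern can be sketched as below. Since a flexible array member is not standard C++, this version keeps a separate header struct and computes the address of the trailing array. PathHeader and Point2 are hypothetical stand-ins for GPATH2 and GPOINT2, and the layout (nodes directly behind a header with no extra gap) is an assumption that should be checked against the real API headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins; the real GPOINT2 and WORD come from the API headers.
struct Point2 { float x, y; };
struct PathHeader {
    char smType, sRes;
    float sParm, eParm;
    unsigned short count;  // number of Point2 entries that follow the header
    char flags, unused;
};
static_assert(sizeof(PathHeader) % alignof(Point2) == 0, "nodes must start aligned");

// The nodes live directly behind the header in the same malloc block.
inline Point2* nodes(PathHeader* p) { return reinterpret_cast<Point2*>(p + 1); }

// Allocates a zeroed header plus room for `count` nodes; returns nullptr on failure.
PathHeader* createPath(unsigned short count) {
    void* raw = std::calloc(1, sizeof(PathHeader) + count * sizeof(Point2));
    if (raw == nullptr) return nullptr;
    PathHeader* p = static_cast<PathHeader*>(raw);
    p->count = count;
    return p;
}

// Grows or shrinks the node array. On success the old pointer is dead; use the
// returned one. On failure returns nullptr and leaves `p` untouched.
// Nodes added by growing are uninitialized.
PathHeader* resizePath(PathHeader* p, unsigned short newCount) {
    void* raw = std::realloc(p, sizeof(PathHeader) + newCount * sizeof(Point2));
    if (raw == nullptr) return nullptr;
    PathHeader* q = static_cast<PathHeader*>(raw);
    q->count = newCount;
    return q;
}
```

Returning the new pointer from resizePath, instead of overwriting the old one in place, keeps the original block reachable when realloc fails, so it can still be freed.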

Variable sized packet structs with vectors

Lately I've been diving into network programming, and I'm having some difficulty constructing a packet with a variable "data" property. Several prior questions have helped tremendously, but I'm still lacking some implementation details. I'm trying to avoid using variable sized arrays, and just use a vector. But I can't get it to be transmitted correctly, and I believe it's somewhere during serialization.
Now for some code.
Packet Header
class Packet {
public:
void* Serialize();
bool Deserialize(void *message);
unsigned int sender_id;
unsigned int sequence_number;
std::vector<char> data;
};
Packet Impl
typedef struct {
unsigned int sender_id;
unsigned int sequence_number;
std::vector<char> data;
} Packet;
void* Packet::Serialize(int size) {
Packet* p = (Packet *) malloc(8 + 30);
p->sender_id = htonl(this->sender_id);
p->sequence_number = htonl(this->sequence_number);
p->data.assign(size,'&'); //just for testing purposes
}
bool Packet::Deserialize(void *message) {
Packet *s = (Packet*)message;
this->sender_id = ntohl(s->sender_id);
this->sequence_number = ntohl(s->sequence_number);
this->data = s->data;
}
During execution, I simply create a packet, assign its members, and send/receive accordingly. The above methods are only responsible for serialization. Unfortunately, the data never gets transferred.
Couple of things to point out here. I'm guessing the malloc is wrong, but I'm not sure how else to compute it (i.e. what other value it would be). Other than that, I'm unsure of the proper way to use a vector in this fashion, and would love for someone to show me how (code examples please!) :)
Edit: I've awarded the question to the most comprehensive answer regarding the implementation with a vector data property. Appreciate all the responses!
This trick works with a C-style array at the end of the struct, but not with a C++ vector. There is no guarantee that the C++ vector class will (and it most likely won't) put its contained data in the "header object" that is present in the Packet struct. Instead, that object will contain a pointer to somewhere else, where the actual data is stored.
i think you might want to do like this:
struct PacketHeader
{
unsigned int senderId;
unsigned int sequenceNum;
};
class Packet
{
protected:
PacketHeader header;
std::vector<char> data;
public:
char* serialize(int& packetSize);
void deserialize(const char* data,int dataSize);
};
char* Packet::serialize(int& packetSize)
{
packetSize = this->data.size()+sizeof(PacketHeader);
char* packetData = new char[packetSize];
PacketHeader* packetHeader = (PacketHeader*)packetData;
packetHeader->senderId = htonl(this->header.senderId);
packetHeader->sequenceNum = htonl(this->header.sequenceNum);
// the body starts after the header struct, so use sizeof(PacketHeader), not the size of a pointer
char* packetBody = (packetData + sizeof(PacketHeader));
for(size_t i=0 ; i<this->data.size() ; i++)
{
packetBody[i] = this->data.at(i);
}
return packetData;
}
void Packet::deserialize(const char* data,int dataSize)
{
const PacketHeader* packetHeader = (const PacketHeader*)data;
this->header.senderId = ntohl(packetHeader->senderId);
this->header.sequenceNum = ntohl(packetHeader->sequenceNum);
this->data.clear();
for(int i=sizeof(PacketHeader) ; i<dataSize ; i++)
{
this->data.push_back(data[i]);
}
}
This code does no bounds checking and never frees the allocated buffer, so don't forget to delete[] the buffer returned from serialize(). You can also use memcpy instead of the byte-by-byte loops to copy into or out of the std::vector.
Most compilers add padding inside a structure when alignment requires it. That causes trouble if you send the struct's bytes as they are without disabling padding; in Visual Studio you can disable it with #pragma pack(1).
Disclaimer: I haven't actually compiled this code, so you may want to recheck it.
I think the problem centres around you trying to 'serialise' the vector that way, and you're probably assuming that the vector's state information gets transmitted. As you've found, it doesn't really work that way: you're trying to move an object across the network, and things like pointers don't mean anything on the other machine.
I think the easiest way to handle this would be to change Packet to the following structure:
struct Packet {
unsigned int sender_id;
unsigned int sequence_number;
unsigned int vector_size;
char data[1];
};
The data[1] bit is an old C trick for variable length array - it has to be the last element in the struct as you're essentially writing past the size of the struct. You have to get the allocation for the data structure right for this, otherwise you'll be in a world of hurt.
Your serialisation function then looks something like this:
void* Packet::Serialize(std::vector<char> &data) {
Packet* p = (Packet *) malloc(sizeof(Packet) + data.size());
p->sender_id = htonl(this->sender_id);
p->sequence_number = htonl(this->sequence_number);
p->vector_size = htonl(data.size());
if (!data.empty()) ::memcpy(p->data, &data[0], data.size());
return p;
}
As you can see, we'll transmit the data size and the contents of the vector, copied into a plain C array which transmits easily. Keep in mind that your network sending routine has to calculate the size of the structure properly: you have to send sizeof(Packet) + data.size() bytes, otherwise the vector contents get cut off and you are back in nice buffer overflow territory.
Disclaimer - I haven't tested the code above, it's just written from memory so you might have to fix the odd compilation error.
I think you need to work directly with byte arrays returned by the socket functions.
For these purposes it's good to have two distinct parts of a message in your protocol. The first part is a fixed-size "header". This will include the size in bytes of what follows: the "payload", or data in your example.
So, to borrow some of your snippets and expand on them, maybe you'll have something like this:
typedef struct {
unsigned int sender_id;
unsigned int sequence_number;
unsigned int data_length; // this is new
} PacketHeader;
So then when you get a buffer in, you'll treat it as a PacketHeader*, and check data_length to know how many bytes will appear in the byte vector that follows.
I would also add a few points...
Making these fields unsigned int is not wise. The standards for C and C++ don't specify how big int is, and you want something that will be predictable on all compilers. I suggest the C99 type uint32_t defined in <stdint.h>
Note that when you get bytes from the socket... It is in no way guaranteed to be the same size as what the other end wrote to send() or write(). You might get incomplete messages ("packets" in your terminology), or you might get multiple ones in a single read() or recv() call. It's your responsibility to buffer these if they are short of a single request, or loop through them if you get multiple requests in the same pass.
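The buffering described in the last point can be sketched as a small reassembly class. FrameBuffer and its two methods are hypothetical names, and the framing is simplified to a bare 4-byte big-endian length in front of each payload rather than the full PacketHeader above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reassembly buffer for messages framed as:
// 4-byte big-endian length, then that many payload bytes.
class FrameBuffer {
public:
    // Call with whatever recv()/read() returned, however much or little that is.
    void append(const char* bytes, std::size_t n) {
        m_pending.insert(m_pending.end(), bytes, bytes + n);
    }

    // Extracts the next complete payload, or returns false if more bytes are needed.
    bool next(std::vector<char>& payload) {
        if (m_pending.size() < 4) return false;  // length prefix incomplete
        std::uint32_t len = 0;
        for (int i = 0; i < 4; ++i) {
            len = (len << 8) | static_cast<unsigned char>(m_pending[i]);
        }
        if (m_pending.size() - 4 < len) return false;  // payload incomplete
        payload.assign(m_pending.begin() + 4, m_pending.begin() + 4 + len);
        m_pending.erase(m_pending.begin(), m_pending.begin() + 4 + len);
        return true;
    }

private:
    std::vector<char> m_pending;  // bytes received but not yet consumed
};
```

The receive loop would call append after every read and then call next repeatedly until it returns false, which covers both short reads and several messages arriving in one read. A real implementation would also reject implausibly large lengths.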
This cast is very dangerous as you have allocated some raw memory and then treated it as an initialized object of a non-POD class type. This is likely to cause a crash at some point.
Packet* p = (Packet *) malloc(8 + 30);
Looking at your code, I assume that you want to write out a sequence of bytes from the Packet object that the Serialize function is called on. In this case you have no need of a second packet object. You can create a buffer of bytes of the appropriate size and then copy the data across.
e.g.
void* Packet::Serialize(int size)
{
char* raw_data = new char[sizeof sender_id + sizeof sequence_number + data.size()];
char* p = raw_data;
unsigned int tmp;
tmp = htonl(sender_id);
std::memcpy(p, &tmp, sizeof tmp);
p += sizeof tmp;
tmp = htonl(sequence_number);
std::memcpy(p, &tmp, sizeof tmp);
p += sizeof tmp;
std::copy(data.begin(), data.end(), p);
return raw_data;
}
This may not be exactly what you intended, as I'm not sure what the purpose of your size parameter is, and your interface is potentially unsafe because you return a pointer to raw data that I assume is supposed to be dynamically allocated. It is much safer to use an object that manages the lifetime of dynamically allocated memory, so that the caller doesn't have to guess whether and how to deallocate it.
Also the caller has no way of knowing how much memory was allocated. This may not matter for deallocation but presumably if this buffer is to be copied or streamed then this information is needed.
It may be better to return a std::vector<char> or to take one by reference, or even make the function a template and use an output iterator.
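Returning a std::vector<char> could look like the sketch below. It is one possible shape, not the asker's interface: the wire format (three big-endian 32-bit fields followed by the data bytes) is an assumption, and manual shifts stand in for htonl/ntohl only so that the example needs no platform headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Packet {
    std::uint32_t sender_id;
    std::uint32_t sequence_number;
    std::vector<char> data;

    // Assumed wire format: sender_id, sequence_number, data length
    // (each 4 bytes, big-endian), then the data bytes.
    std::vector<char> Serialize() const {
        std::vector<char> out;
        out.reserve(12 + data.size());
        putU32(out, sender_id);
        putU32(out, sequence_number);
        putU32(out, static_cast<std::uint32_t>(data.size()));
        out.insert(out.end(), data.begin(), data.end());
        return out;
    }

    // Returns false if the buffer is too short for the header or the declared length.
    bool Deserialize(const char* bytes, std::size_t n) {
        if (n < 12) return false;
        const std::uint32_t len = getU32(bytes + 8);
        if (n - 12 < len) return false;
        sender_id = getU32(bytes);
        sequence_number = getU32(bytes + 4);
        data.assign(bytes + 12, bytes + 12 + len);
        return true;
    }

private:
    // Appends v in big-endian (network) byte order.
    static void putU32(std::vector<char>& out, std::uint32_t v) {
        for (int shift = 24; shift >= 0; shift -= 8) {
            out.push_back(static_cast<char>((v >> shift) & 0xFF));
        }
    }
    // Reads a big-endian 32-bit value from p.
    static std::uint32_t getU32(const char* p) {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) v = (v << 8) | static_cast<unsigned char>(p[i]);
        return v;
    }
};
```

The returned vector carries its own size and frees itself, which addresses both concerns above: the caller knows how many bytes to send and never has to decide between delete[] and free.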