variable len string - c++

I have a packet struct that contains a variable-length string, for example:
BYTE StringLen;
String MyString; // String is not a real type; it just represents a string of unknown size
My question is how I can implement this packet as a struct without knowing the size of its members (in this case, strings). Here is an example of how I want it to look:
void ProcessPacket (PacketStruct* packet)
{
pointer = &packet.MyString;
}
I think it isn't possible, since the compiler doesn't know the size of the string until run time. So how can I make it look high-level and comprehensible?
The reason I need structs is to document every packet, so that the user doesn't have to read any of the functions that analyze the packet.
So the question boils down to: is there a way to declare a struct with members of undefined size, or something close to a struct?

I would recommend a shell class that just interprets the packet data.
struct StringPacket {
char *data_;
StringPacket (char *data) : data_(data) {}
unsigned char len () const { return *data_; }
std::string str () const { return std::string(data_+1, len()); }
};
As mentioned in comments, you wanted a way to treat a variable-sized packet like a struct. The old C way to do that was to create a struct that looked like this:
struct StringPacketC {
unsigned char len_;
char str_[1]; /* Modern C allows char str_[]; but C++ doesn't */
};
And then, cast the data (remember, this is C code):
struct StringPacketC *strpack = (struct StringPacketC *)packet;
But, you are entering undefined behavior, since to access the full range of data in strpack, you would have to read beyond the 1 byte array boundary defined in the struct. But, this is a commonly used technique in C.
But, in C++, you don't have to resort to such a hack, because you can define accessor methods to treat the variable length data appropriately.
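A minimal sketch of such an accessor class, assuming a wire format of one length byte followed by that many characters. The name StringPacketView, the explicit buffer size parameter, and the wire_size() helper are illustrative additions, not part of the original answer:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Assumed wire format: [1-byte length][length bytes of text].
// The view does not own the buffer; it only interprets it.
class StringPacketView {
public:
    StringPacketView(const std::uint8_t* data, std::size_t size)
        : data_(data) {
        // Reject buffers too short to hold the length byte plus the text.
        if (data == nullptr || size < 1 || size < 1u + data[0]) {
            throw std::invalid_argument("truncated string packet");
        }
    }

    // Length of the text, as stored in the first byte.
    std::size_t len() const { return data_[0]; }

    // Copies the text out of the packet into an owning std::string.
    std::string str() const {
        return std::string(reinterpret_cast<const char*>(data_ + 1), len());
    }

    // Total number of bytes this packet occupies in the buffer.
    std::size_t wire_size() const { return 1 + len(); }

private:
    const std::uint8_t* data_;
};
```

Checking the buffer size in the constructor means the accessors never read past the received data, and wire_size() tells the caller where the next field would start.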

You can copy the string into a high-level std::string (at least, if my guess that String is a typedef for const char* is correct):
void ProcessPacket( const PacketStruct& packet )
{
std::string highLevelString( packet.MyString,
static_cast< size_t >( packet.StringLen ) );
...
}

A simple variant according to your posting would be:
struct PacketStruct {
std::string MyString;
size_t length () const { return MyString.length(); }
const char* operator & () const { return MyString.c_str(); }
};
This can be used (almost) as you desired above:
void ProcessPacket (const PacketStruct& packet)
{
const char * pointer = &packet;
size_t length = packet.length();
std::cout << pointer << '\t' << length << std::endl;
}
and should be invoked like:
int main()
{
PacketStruct p;
p.MyString ="Hello";
ProcessPacket(p);
}

Related

Implementing a String class with implicit conversion to char* (C++)

It might not be advisable according to what I have read at a couple of places (and that's probably the reason std::string doesn't do it already), but in a controlled environment and with careful usage, I think it might be ok to write a string class which can be implicitly converted to a proper writable char buffer when needed by third party library methods (which take only char* as an argument), and still behave like a modern string having methods like Find(), Split(), SubString() etc. While I can try to implement the usual other string manipulation methods later, I first wanted to ask about the efficient and safe way to do this main task. Currently, we have to allocate a char array of roughly the maximum size of the char* output that is expected from the third party method, pass it there, then convert the return char* to a std::string to be able to use the convenient methods it allows, then again pass its (const char*) result to another method using string.c_str(). This is both lengthy and makes the code look a little messy.
Here is my very initial implementation so far:
MyString.h
#pragma once
#include<string>
using namespace std;
class MyString
{
private:
bool mBufferInitialized;
size_t mAllocSize;
string mString;
char *mBuffer;
public:
MyString(size_t size);
MyString(const char* cstr);
MyString();
~MyString();
operator char*() { return GetBuffer(); }
operator const char*() { return GetAsConstChar(); }
const char* GetAsConstChar() { InvalidateBuffer(); return mString.c_str(); }
private:
char* GetBuffer();
void InvalidateBuffer();
};
MyString.cpp
#include "MyString.h"
#include <cstring> // memcpy, strlen
MyString::MyString(size_t size)
:mAllocSize(size)
,mBufferInitialized(false)
,mBuffer(nullptr)
{
mString.reserve(size);
}
MyString::MyString(const char * cstr)
:MyString()
{
mString.assign(cstr);
}
MyString::MyString()
:MyString((size_t)1024)
{
}
MyString::~MyString()
{
if (mBufferInitialized)
delete[] mBuffer;
}
char * MyString::GetBuffer()
{
if (!mBufferInitialized)
{
mBuffer = new char[mAllocSize]{ '\0' };
mBufferInitialized = true;
}
if (mString.length() > 0)
memcpy(mBuffer, mString.c_str(), mString.length());
return mBuffer;
}
void MyString::InvalidateBuffer()
{
if (mBufferInitialized && mBuffer && strlen(mBuffer) > 0)
{
mString.assign(mBuffer);
mBuffer[0] = '\0';
}
}
Sample usage (main.cpp)
#include "MyString.h"
#include <iostream>
void testSetChars(char * name)
{
if (!name)
return;
//This length is not known to us, but the maximum
//return length is known for each function.
char str[] = "random random name";
strcpy_s(name, strlen(str) + 1, str);
}
int main(int, char**)
{
MyString cs("test initializer");
cout << cs.GetAsConstChar() << '\n';
testSetChars(cs);
cout << cs.GetAsConstChar() << '\n';
getchar();
return 0;
}
Now, I plan to call the InvalidateBuffer() in almost all the methods before doing anything else. Now some of my questions are :
Is there a better way to do it in terms of memory/performance and/or safety, especially in C++ 11 (apart from the usual move constructor/assignment operators which I plan to add to it soon)?
I had initially implemented the 'buffer' using a std::vector of chars, which was easier to implement and more C++ like, but I was concerned about performance. So the GetBuffer() method would just return the beginning pointer of the resized vector of chars. Do you think there are any major pros/cons of using a vector instead of char* here?
I plan to add wide char support to it later. Do you think a union of two structs : {char,string} and {wchar_t, wstring} would be the way to go for that purpose (it will be only one of these two at a time)?
Is it too much overkill rather than just doing the usual way of passing char array pointer, converting to a std::string and doing our work with it. The third party function calls expecting char* arguments are used heavily in the code and I plan to completely replace both char* and std::string with this new string if it works.
Thank you for your patience and help!
If I understood you correctly, you want this to work:
mystring foo;
c_function(foo);
// use the filled foo
with a c_function like ...
void c_function(char * dest) {
strcpy(dest, "FOOOOO");
}
Instead, I propose this (ideone example):
template<std::size_t max>
struct string_filler {
char data[max+1];
std::string & destination;
string_filler(std::string & d) : destination(d) {
data[0] = '\0'; // paranoia
}
~string_filler() {
destination = data;
}
operator char *() {
return data;
}
};
and using it like:
std::string foo;
c_function(string_filler<80>{foo});
This way you provide a "normal" buffer to the C function with a maximum that you specify (which you should know either way ... otherwise calling the function would be unsafe). On destruction of the temporary (which, according to the standard, must happen after that expression with the function call) the string is copied (using std::string assignment operator) into a buffer managed by the std::string.
Addressing your questions:
Do you think there are any major pros/cons of using a vector instead of char* here?
Yes: Using a vector frees you from manual memory management. This is a huge pro.
I plan to add wide char support to it later. Do you think a union of two structs : {char,string} and {wchar_t, wstring} would be the way to go for that purpose (it will be only one of these two at a time)?
A union is a bad idea. How do you know which member is currently active? You need a flag outside of the union. Do you really want every string to carry that around? Instead look what the standard library is doing: It's using templates to provide this abstraction.
Is it too much overkill [..]
Writing a string class? Yes, way too much.
What you want to do already exists. For example with this plain old C function:
/**
* Write n characters into buffer.
* n cannot be more than size
* Return number of written characters
*/
ssize_t fillString(char * buffer, ssize_t size);
Since C++11:
std::string str;
// Resize string to be sure to have memory
str.resize(80);
auto newSize = fillString(&str[0], str.size());
str.resize(newSize);
or without first resizing:
std::string str;
if (!str.empty()) // To avoid UB
{
auto newSize = fillString(&str[0], str.size());
str.resize(newSize);
}
But before C++11, std::string isn't guaranteed to be stored in a single chunk of contiguous memory. So you have to go through a std::vector<char> first:
std::vector<char> v;
// Resize the vector to be sure to have memory
v.resize(80);
ssize_t newSize = fillString(&v[0], v.size());
std::string str(v.begin(), v.begin() + newSize);
You can use it easily with something like Daniel's proposition
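As a rough, self-contained sketch of the C++11 approach above: fill_buffer is a hypothetical stand-in for the third-party C function, and read_name is an illustrative helper name. Both are assumptions made only so the example can run on its own.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a third-party C function: writes at most
// `size` characters into `buffer` and returns how many it wrote.
std::size_t fill_buffer(char* buffer, std::size_t size) {
    static const char text[] = "random random name";
    const std::size_t n = std::strlen(text) < size ? std::strlen(text) : size;
    std::memcpy(buffer, text, n);
    return n;
}

// Lets the C function write straight into the string's storage
// (contiguous since C++11), then trims the string to what was written.
std::string read_name(std::size_t max_len) {
    std::string result(max_len, '\0');
    if (max_len == 0) {
        return result;  // nothing to fill; avoids touching &result[0]
    }
    const std::size_t written = fill_buffer(&result[0], result.size());
    result.resize(written);
    return result;
}
```

The caller still has to know the maximum length the C function may write, but no separate char array and no extra copy into a std::string are needed.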

C++, array of objects, customize where they are stored in memory

Currently I am working on an existing project (a DLL) which I have to extend.
For the transport through the DLL I have a struct, for example 'ExternEntry',
and a struct which passes an array of it.
struct ExternEntry
{
unsigned int MyInt;
const wchar_t* Text;
};
struct ExternEntries
{
const ExternEntry* Data;
const unsigned int Length;
ExternEntries(const ExternEntry* ptr, const unsigned int size)
: Data(ptr)
, Length(size)
{
}
};
In the existing project architecture, this will be the first time that an array is passed to the DLL callers. So the existing architecture doesn't allow for arrays, and if a struct is passed to a caller, there is normally a wrapper struct for it (because of its string pointers).
Inside the DLL I need to wrap the ExternEntry so that it has a valid Text pointer.
struct InternEntry
{
ExternEntry Data;
std::wstring Text;
inline const ExternEntry* operator&() const { return &Data; }
void UpdateText() { Data.Text = Text.c_str(); }
};
struct InternEntries
{
std::vector<InternEntry> Data;
operator ExternEntries() const
{
return ExternEntries(Data.data()->operator&(), Data.size());
}
};
So the problem is that when the caller receives the ExternEntries and creates a vector from it again:
auto container = DllFuncReturnInternEntries(); // returns ExternEntries
std::vector<ExternEntry> v(container.Data, container.Data + container.Length);
The first element is valid. All other elements point to the wrong memory, because in memory each InternEntry's wstring Text sits between its Data and the Data of the next InternEntry.
Maybe I'm wrong with the reason why this can't work.
[Data][std::wstring][Data][std::wstring][Data][std::wstring]
Caller knows just about the size of the [Data]
So the vector is doing the following:
[Data][std::wstring][Data][std::wstring][Data][std::wstring] 
  |       |       |
 Get     Get     Get
instead of
[Data][std::wstring][Data][std::wstring][Data][std::wstring]
  |                   |                   |
 Get                 Get                 Get
Do I have any possibilities to customize how the vector stores InternEntry objects in memory?
like Data,Data,Data ..anywhere else wstring,wstring,wstring
I hope I have explained my problem well
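One possible way to get the "Data,Data,Data ... wstring,wstring,wstring" layout described above is to keep the strings and the packed entries in two separate vectors. The sketch below is only an illustration under that assumption: the add() and view() members and the simplified ExternEntries aggregate are hypothetical, not part of the existing project.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct ExternEntry {
    unsigned int MyInt;
    const wchar_t* Text;
};

struct ExternEntries {
    const ExternEntry* Data;
    unsigned int Length;
};

// Owns the strings in one vector and keeps a densely packed
// ExternEntry array in another, so callers can walk the array
// with plain pointer arithmetic.
class InternEntries {
public:
    void add(unsigned int value, std::wstring text) {
        values_.push_back(value);
        texts_.push_back(std::move(text));
    }

    // Rebuilds the packed array so every Text pointer refers to the
    // current string storage. The result stays valid until the next
    // add() call or until this object is destroyed.
    ExternEntries view() {
        packed_.clear();
        packed_.reserve(texts_.size());
        for (std::size_t i = 0; i < texts_.size(); ++i) {
            packed_.push_back(ExternEntry{values_[i], texts_[i].c_str()});
        }
        return ExternEntries{packed_.data(),
                             static_cast<unsigned int>(packed_.size())};
    }

private:
    std::vector<unsigned int> values_;
    std::vector<std::wstring> texts_;
    std::vector<ExternEntry> packed_;
};
```

The Text pointers are refreshed in view() rather than in add(), because growing the string vector can move the strings and invalidate pointers taken earlier.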

2d array as a default constructor argument c++

I can't find the answer anywhere.
I wrote this class:
class Message {
private:
char senderName[32];
char* namesOfRecipients[];
int numOfContacts;
char subject[129];
char body[10001];
};
And I'm trying to write a constructor with default arguments like this:
Message(char senderName[32]="EVA",
char* Recipents[]={"glados","edi"},
int numOfRec=3,
char subject[129]="None",
char content[10001]="None");
However, it won't accept the recipients default argument no matter how I write it.
Is it even possible to pass a 2D array as a default argument for a constructor?
So many pointers and arrays... If it is C++, why bother? Just write:
class Message {
private:
std::string senderName;
std::vector<std::string> namesOfRecipients;
int numOfContacts;
std::string subject;
std::string body;
};
And:
Message("EVA", {"glados","edi"}, 3, "None", "None");
And everybody is happy...
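A sketch of how the default arguments could look with these types. The constructor and the small accessors are illustrative assumptions, and the numOfContacts member is dropped here because the vector already tracks its own size:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Message {
public:
    // A braced list works as a default argument here because the
    // parameter type is a std::vector, which can be built from it.
    explicit Message(std::string sender = "EVA",
                     std::vector<std::string> recipients = {"glados", "edi"},
                     std::string subj = "None",
                     std::string text = "None")
        : senderName(std::move(sender)),
          namesOfRecipients(std::move(recipients)),
          subject(std::move(subj)),
          body(std::move(text)) {}

    const std::string& sender() const { return senderName; }
    std::size_t recipientCount() const { return namesOfRecipients.size(); }
    const std::string& recipient(std::size_t i) const {
        return namesOfRecipients.at(i);  // throws std::out_of_range if i is invalid
    }

private:
    std::string senderName;
    std::vector<std::string> namesOfRecipients;
    std::string subject;
    std::string body;
};
```

With this, both `Message m;` and `Message m("Wheatley", {"chell"});` compile, and the recipient count can never disagree with the actual list.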
As Paul mentioned, you should change the declaration of namesOfRecipients to
char **namesOfRecipients;
Then you can have a private const static array of default names in the class and initialize namesOfRecipients with a pointer to its first element. The code is below.
Edit: It's important to understand what the data semantics are here, for example compared to Jarod's solution. The default ctor stores the address of an array of constant pointers to constant character strings. It's not at all possible to copy different characters into a name or to let one of the pointers in the array point to a new name, or to append a name. The only legal thing here is to replace the value of namesOfRecipients with a pointer to a new array of pointers to char.
class Message {
private:
char senderName[32];
char** namesOfRecipients;
int numOfContacts;
char subject[129];
char body[10001];
static const char* defaultNames[];
public:
Message(const char senderName[32]="EVA",
const char** Recipents = defaultNames,
int numOfRec=3,
const char subject[129]="None",
const char content[10001]="None");
};
const char *Message::defaultNames[] = {"Jim", "Joe"};
You can do something like:
namespace
{
char (&defaultSenderName())[32]
{
static char s[32] = "EVA";
return s;
}
const char* (&defaultNamesOfRecipients())[2]
{
static const char* namesOfRecipients[2]={"glados", "edi"};
return namesOfRecipients;
}
}
class Message {
private:
char senderName[32];
const char* namesOfRecipients[2];
public:
Message(char (&senderName)[32] = defaultSenderName(),
const char* (&namesOfRecipients)[2] = defaultNamesOfRecipients())
{
std::copy(std::begin(senderName), std::end(senderName), std::begin(this->senderName));
std::copy(std::begin(namesOfRecipients), std::end(namesOfRecipients), std::begin(this->namesOfRecipients));
}
};
but using std::string/std::vector would be simpler.
Use a separate array of pointers (it's not a 2-D array, though it may look like it) as a default argument:
char* defaultRecipents[] = {"glados","edi"};
class Message {
public:
Message(char senderName[32]="EVA",
char* Recipents[]=defaultRecipents){}
};
Specifying the default array "inline" doesn't work because the compiler "thinks" about it in terms of std::initializer_list, which is only suitable in initialization, not in declaration. Sorry if this sounds vague; I don't have enough experience with this matter.
Note: you might want to use const to declare your strings, to make it clear to the compiler (and your future self) whether the class is or is not going to alter the strings:
const char* const defaultRecipents[] = {"glados","edi"};
class Message {
public:
Message(char senderName[32]="EVA",
const char* const Recipents[]=defaultRecipents){}
};
Here it says const twice to declare that it's not going to:
Change the array elements (e.g. replace one array element, which is a string, by another string or nullptr); and
Change the contents of the strings (e.g. cut a string in the middle, or edit it)

reinterpret_cast for 'serializing' data, byte order and alignment on receiving end

If we have a POD struct, say A, and I do this:
char* ptr = reinterpret_cast<char*>(&a); // a is an instance of A
char buf[20];
for (int i =0;i<20; ++i)
buf[i] = ptr[i];
network_send(buf,..);
If the receiving end, a remote box, does not necessarily have the same hardware or OS, can I safely do this to 'unserialize':
void onRecieve(..char* buf,..) {
A* result = reinterpret_cast<A*>(buf); // given same bytes in same order from the sending end
Will the 'result' always be valid? The C++ standard states that with POD structures, the result of reinterpret_cast should point to the first member, but does that mean the actual byte order will also be correct, even if the receiving end is a different platform?
No, you cannot. You can only ever cast "down" to char*, never back to an object pointer:
   Source                 Destination
      \                       /
       \                     /
        V                   V
  read as char* ---> write as if to char*
In code:
Foo Source;
Foo Destination;
char buf[sizeof(Foo)];
// Serialize:
char const * ps = reinterpret_cast<char const *>(&Source);
std::copy(ps, ps + sizeof(Foo), buf);
// Deserialize:
char * pd = reinterpret_cast<char *>(&Destination);
std::copy(buf, buf + sizeof(Foo), pd);
In a nutshell: If you want an object, you have to have an object. You cannot just pretend a random memory location is an object if it really isn't (i.e. if it isn't the address of an actual object of the desired type).
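The same idea can be written as a small round trip with std::memcpy. The members of Foo and the helper names to_bytes and from_bytes are made up for illustration. Note that this only shows the "have an object first" pattern; it does not by itself make the bytes portable between platforms with different byte order, padding, or type sizes.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Illustrative payload type; the members are assumptions for this sketch.
struct Foo {
    std::uint32_t id;
    std::int16_t  level;
    char          tag[4];
};

static_assert(std::is_trivially_copyable<Foo>::value,
              "byte-wise copying requires a trivially copyable type");

// Serialize: read the object's bytes into a plain buffer.
void to_bytes(const Foo& source, unsigned char (&buf)[sizeof(Foo)]) {
    std::memcpy(buf, &source, sizeof(Foo));
}

// Deserialize: create a real Foo first, then write the bytes into it.
Foo from_bytes(const unsigned char (&buf)[sizeof(Foo)]) {
    Foo destination;
    std::memcpy(&destination, buf, sizeof(Foo));
    return destination;
}
```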
You may consider using a template for this and letting the compiler handle it for you:
template<typename T>
struct base_type {
union {
T scalar;
char bytes[sizeof(T)];
};
void serialize(T val, byte* dest) {
scalar = val;
if is_big_endian { /* swap bytes and write */ }
else { /* just write */ }
}
};
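A different way to sidestep the endianness check entirely, shown here as a sketch rather than as the approach of the answer above, is to build the bytes with shifts so that the wire order is fixed regardless of the host. The function names are illustrative, and 8-bit bytes are assumed:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Writes an unsigned integer in big-endian (network) order.
// Shifts operate on values, not memory, so the result does not
// depend on the byte order of the machine running the code.
template <typename T>
void write_big_endian(T value, unsigned char* dest) {
    static_assert(std::is_unsigned<T>::value, "unsigned integer types only");
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const std::size_t shift = 8 * (sizeof(T) - 1 - i);
        dest[i] = static_cast<unsigned char>((value >> shift) & 0xFFu);
    }
}

// Reads back an unsigned integer written by write_big_endian.
template <typename T>
T read_big_endian(const unsigned char* src) {
    static_assert(std::is_unsigned<T>::value, "unsigned integer types only");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        value = static_cast<T>((value << 8) | src[i]);
    }
    return value;
}
```

Each field of a struct would be written and read individually this way, which also avoids depending on padding and alignment on either end.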

C++ advice on writing code

I am having difficulty writing my code in the way it should be written. This is my default constructor:
Address::Address() : m_city(NULL), m_street(NULL), m_buildingNumber(0), m_apartmentNumber(0)
{}
...and this is my other constructor:
Address::Address(const char* city, const char* street, const int buildingNumber,const int apartmentNumber) : m_city(NULL), m_street(NULL)
{
SetAddress(city,street,buildingNumber,apartmentNumber);
}
I have to initialize my city and street fields as they contain char * and my setter uses remove to set a new city for example. I would very much like to hear your opinion on how to write it in the right way without repeating code.
this is my SetAddress code :
bool Address::SetAddress(const char* city, const char* street, const int buildingNumber, const int apartmentNumber)
{
if (SetCity(city) == false || SetStreet(street) == false || SetBuildingNumber(buildingNumber) == false || SetApartmentNumber(apartmentNumber) == false)
return false;
return true;
}
and this is my SetCity:
bool Address::SetCity(const char* city)
{
if(city == NULL)
return false;
delete[] m_city;
m_city = new char[strlen(city)+1];
strcpy(m_city, city);
return true;
}
One more question: if I do change char* to string, how can I check that the string city is not NULL? As far as I know, string does not have the "==" operator, and a string is an object and cannot be equal to null.
How can I check that the string I get is indeed legal?
You should use std::string instead of C strings (const char*). Then you don't have to worry about having a "remove" function because std::string will manage the memory for you.
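On the question of validating the input: a std::string can never be null, so the closest equivalent check is whether it is empty (and std::string does provide operator== for comparisons). A minimal sketch, assuming an empty city should be rejected the way NULL was; the City() accessor is an illustrative addition:

```cpp
#include <string>

class Address {
public:
    // Returns false and leaves the stored city unchanged
    // when the input is empty.
    bool SetCity(const std::string& city) {
        if (city.empty()) {
            return false;
        }
        m_city = city;  // std::string releases the old text itself
        return true;
    }

    const std::string& City() const { return m_city; }

private:
    std::string m_city;
};
```

One caveat: constructing a std::string from a null const char* is undefined behavior, so a caller that may hold a null pointer still has to check it before the conversion.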
The only repeating code I see is the initializers. Since you should both be using initializers and cannot share initializers, some code redundancy is required here. I wouldn't worry about it.
When the new C++ comes out you'll be able to call the former constructor during initialization of the latter. Until then, you'll just have to live with this minor smell.
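For reference, that feature arrived in C++11 as delegating constructors. A minimal sketch of how it could look here, assuming std::string members instead of char* and with the default constructor delegating to the four-argument one; the accessor names are illustrative:

```cpp
#include <string>
#include <utility>

class Address {
public:
    // Delegating constructor (C++11): reuses the four-argument
    // constructor instead of repeating the initializer list.
    Address() : Address("", "", 0, 0) {}

    Address(std::string city, std::string street,
            int buildingNumber, int apartmentNumber)
        : m_city(std::move(city)),
          m_street(std::move(street)),
          m_buildingNumber(buildingNumber),
          m_apartmentNumber(apartmentNumber) {}

    const std::string& city() const { return m_city; }
    const std::string& street() const { return m_street; }
    int buildingNumber() const { return m_buildingNumber; }
    int apartmentNumber() const { return m_apartmentNumber; }

private:
    std::string m_city;
    std::string m_street;
    int m_buildingNumber;
    int m_apartmentNumber;
};
```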
You can combine the two ctors:
Address::Address(const char* city=NULL,
const char* street=NULL,
int buildingNumber=0,
int apartmentNumber=0)
: m_city(city),
m_street(street),
m_buildingNumber(buildingNumber),
m_apartmentNumber(apartmentNumber)
{}
[The top-level const on buildingNumber and apartmentNumber accomplished nothing and attempt to move implementation information into the interface, so I remove them.]
Or, if you really prefer:
Address::Address(const char* city=NULL,
const char* street=NULL,
int buildingNumber=0,
int apartmentNumber=0)
{
SetAddress(city,street,buildingNumber,apartmentNumber);
}
I generally prefer the former, but if SetAddress qualifies its inputs, it may be worthwhile. Of course, the suggestion to use std::string instead of pointers to char is a good one as well, but that's a more or less separate subject.
One other minor note: this does differ in one fundamental way from your original code. Your code required either 0 or 4 arguments to the ctor. This will accept anywhere from 0 to 4 arguments, so a person could specify (for example) a city and street, but not a building number or apartment number. If it's really important to you that attempts at using 1, 2 or 3 arguments be rejected, this approach won't be useful to you. In this case, the extra flexibility looks like an improvement to me though -- for example, if somebody lives in a single-family dwelling, it's quite reasonable to omit an apartment number.
As answered by others (James McNellis' answer comes to mind), you should switch to std::string instead of char *.
Your problem is that repetition can't be avoided (both non default constructor and the setAddress method set the data), and having one calling the other could be less effective.
Now, the real problem, I guess, is that your code is doing a lot, which means that repetition of delicate code could be dangerous and buggy, hence your need to have one function call the other. This need can be removed by using std::string, as it will remove the delicate code from your code altogether.
As it was not shown, let's re-imagine your class:
class Address
{
public :
Address() ;
Address(const std::string & p_city
, const std::string & p_street
, int p_buildingNumber
, int p_apartmentNumber) ;
// Etc.
private :
std::string m_city ;
std::string m_street ;
int m_buildingNumber ;
int m_apartmentNumber ;
} ;
Using the std::string instead of the const char * will make the std::string object responsible for handling the resource (the string itself).
For example, you'll see I wrote no destructor in the class above. This is not an error, as without a destructor, the compiler will generate its own default one, which will handle the destructor of each member variable as needed. The remove you use for resource disposal (freeing the unused char *) is useless, too, so it won't be written. This means a lot of delicate code that won't be written, and thus, won't produce bugs.
And it simplifies greatly the implementation of the constructors, or even the setAddress method :
Address::Address()
// std::string are initialized by default to an empty string ""
// so no need to mention them in the initializer list
: m_buildingNumber(0)
, m_apartmentNumber(0)
{
}
Address::Address(const std::string & p_city
, const std::string & p_street
, int p_buildingNumber
, int p_apartmentNumber)
: m_city(p_city)
, m_street(p_street)
, m_buildingNumber(p_buildingNumber)
, m_apartmentNumber(p_apartmentNumber)
{
}
void Address::setAddress(const std::string & p_city
, const std::string & p_street
, int p_buildingNumber
, int p_apartmentNumber)
{
m_city = p_city ;
m_street = p_street ;
m_buildingNumber = p_buildingNumber ;
m_apartmentNumber = p_apartmentNumber ;
}
Still, there is repetition in this code, and indeed, we'll have to wait for C++0x to have less repetition. But at least the repetition is trivial and easy to follow: no dangerous and delicate code, everything is simple to write and read. This makes your code more robust than the char * version.
Your code looks good - it might be worthy to see the contents of SetAddress. I would highly recommend using std::string over char *s, if city and street aren't hard-coded into the program, which I doubt. You'll find std::string will save you headaches with memory-management and bugs, and will generally make dealing with strings much easier.
I might rewrite the setAddress() method as follows:
bool Address::setAddress(const char* city, const char* street, const int buildingNumber, const int apartmentNumber)
{
return (setCity(city)
&& setStreet(street)
&& setBuildingNumber(buildingNumber)
&& setApartmentNumber(apartmentNumber));
}
which will achieve the same short-circuiting and returning semantics, with a bit less code.
If you must use char * rather than std::string you need to manage the memory for the strings yourself. This includes copy on write when sharing the text or complete copy of the text.
Here is an example:
class Address
{
public:
Address(); // Empty constructor.
Address(const char * city,
const char * street,
const char * apt); // Full constructor.
Address(const Address& addr); // Copy constructor
virtual ~Address(); // Destructor
void set_city(const char * new_city);
void set_street(const char * new_street);
void set_apt(const char * new_apt);
private:
char * m_city; // non-const: the setters write into these buffers
char * m_street;
char * m_apt;
};
Address::Address()
: m_city(0), m_street(0), m_apt(0)
{ ; }
Address::Address(const char * city,
const char * street,
const char * apt)
: m_city(0), m_street(0), m_apt(0)
{
set_city(city);
set_street(street);
set_apt(apt);
}
Address::Address(const Address& addr)
: m_city(0), m_street(0), m_apt(0)
{
set_city(addr.m_city);
set_street(addr.m_street);
set_apt(addr.m_apt);
}
Address::~Address()
{
delete [] m_city;
delete [] m_street;
delete [] m_apt;
}
void Address::set_city(const char * new_city)
{
delete [] m_city;
m_city = NULL;
if (new_city)
{
const size_t length = strlen(new_city);
m_city = new char [length + 1]; // +1 for the '\0' terminator.
strcpy(m_city, new_city);
m_city[length] = '\0';
}
return;
}
void Address::set_street(const char * new_street)
{
delete [] m_street;
m_street = NULL;
if (new_street)
{
const size_t length = strlen(new_street);
m_street = new char [length + 1]; // +1 for the '\0' terminator.
strcpy(m_street, new_street);
m_street[length] = '\0';
}
return;
}
void Address::set_apt(const char * new_apt)
{
delete [] m_apt;
m_apt = NULL;
if (new_apt)
{
const size_t length = strlen(new_apt);
m_apt = new char [length + 1]; // +1 for the '\0' terminator.
strcpy(m_apt, new_apt);
m_apt[length] = '\0';
}
return;
}
In the above example, the Address instance holds copies of the given text. This prevents problems when another entity points to the same text, and modifies the text. Another common issue is when the other entity deletes the memory area. The instance still holds the pointer, but the target area is invalid.
These issues are avoided by using the std::string class. The code is much smaller and easier to maintain. Look at the above code versus some of the other answers using std::string.