C++ My String class: Pointer doesn't work

I'm building my own string class in C++11 and I have a memory problem.
In main:
MyString str1; //Works ok, constructor creates empty char array.
const char* pointer1 = str1.c_str(); //Return the pointer to the array.
str1.Reserve(5);
// Now, after I call the Reserve method on str1, pointer1 is
// still pointing to the old memory address.
How do I change the array data in str1 but keep the same memory address?
In other words, how do I fix this so that:
pointer1 == str1.c_str();
Reserve method:
void reserve(int res)
{
capacity = NewSize(size + res,0 , capacity); //Method to find the best cap.
char* oldData = data;
data = new char[capacity];
memcpy(data, oldData, capacity);
oldData = data;
//delete[] data;
data[(size)] = '\0';
}
This returns all the right data, but when I do "oldData = data", the memory address is lost.
I appreciate all help, thanks!

I think what you are asking is if there is a way to get a return value from your string class which will always point to the current string array. There are a number of ways to do this but generally this indicates bad design/implementation.
The more normal way to do this would be to advise API users that the result of c_str() is invalidated by any subsequent modifications to the object: don't keep the pointer, just call c_str() again.
Two obvious options are: a) a pointer to the pointer, very dangerous because now someone outside your class can tweak it, b) provide a wrapper class which encapsulates a pointer-to-pointer without allowing modifications.
template<typename T>
struct ReadOnlyPointer {
T* m_ptr;
... operator * ...
... operator -> ...
... operator T ...
};
ReadOnlyPointer<const char*> pointer = str1.pointer();
There also appear to be at least a couple of issues with your "reserve" function.
You write a '\0' to data[size] even though the new capacity might be zero.
MyString a;
a.reserve(0); // crash? this may write to the first byte of a zero-length array.
After copying the data from oldData to data, for some reason you assign the value of 'data' to 'oldData' and then never use 'oldData' again - this is a memory leak.
Your memcpy uses 'capacity' instead of 'size' so it may be over-copying.
Consider instead:
// ensure we have an additional 'res' bytes.
// caution: unlike stl and boost reserve, these are
// additional bytes, not total bytes.
void reserve(int res)
{
int newCapacity = NewSize(m_size + res, 0, m_capacity); //Method to find the best cap.
if(newCapacity <= m_capacity)
return;
char* newData = new char[newCapacity];
memcpy(newData, m_data, m_size);
delete[] m_data; // release the old allocation
m_data = newData;
m_capacity = newCapacity;
}
The extra data[(size)] = '\0'; could be the cause of your string becoming truncated if you are not changing the value of size elsewhere in your code.
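To make the invalidation rule concrete, here is a minimal sketch of such a class. The question does not show MyString's members, so m_data, m_size, m_capacity, the const char* constructor and the capacity() accessor are all assumed names, and the growth policy (exactly what is needed) stands in for the unseen NewSize helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Sketch only: member names and growth policy are assumptions, not the asker's code.
class MyString {
public:
    MyString() : m_data(new char[1]), m_size(0), m_capacity(1) { m_data[0] = '\0'; }

    explicit MyString(const char* s)
        : m_data(nullptr), m_size(std::strlen(s)), m_capacity(m_size + 1) {
        m_data = new char[m_capacity];
        std::memcpy(m_data, s, m_size + 1); // copy the terminator too
    }

    ~MyString() { delete[] m_data; }

    // Copying is disabled to keep the sketch short (no rule-of-three code).
    MyString(const MyString&) = delete;
    MyString& operator=(const MyString&) = delete;

    const char* c_str() const { return m_data; }
    std::size_t capacity() const { return m_capacity; }

    // Ensure room for 'res' additional characters plus the terminator.
    // Any pointer previously returned by c_str() may dangle afterwards.
    void reserve(std::size_t res) {
        const std::size_t needed = m_size + res + 1;
        if (needed <= m_capacity)
            return;                               // no reallocation, pointers stay valid
        char* newData = new char[needed];
        std::memcpy(newData, m_data, m_size + 1); // size, not capacity
        delete[] m_data;                          // release the old block
        m_data = newData;
        m_capacity = needed;
    }

private:
    char* m_data;
    std::size_t m_size;
    std::size_t m_capacity;
};

// Usage: re-fetch the pointer after any call that may reallocate.
void demo_reserve() {
    MyString str("abc");
    str.reserve(5);
    const char* fresh = str.c_str();
    assert(std::strcmp(fresh, "abc") == 0);
    assert(str.capacity() >= 9);
}
```

The caller never caches the result of c_str() across a modifying call; it simply asks again, which is the same contract std::string uses.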

Related

Reallocate array with memcpy and memset

I've taken over some code, and came across a weird reallocation of an array. This is a function from within an Array class (used by the JsonValue)
void reserve( uint32_t newCapacity ) {
if ( newCapacity > length + additionalCapacity ) {
newCapacity = std::min( newCapacity, length + std::numeric_limits<decltype( additionalCapacity )>::max() );
JsonValue *newPtr = new JsonValue[newCapacity];
if ( length > 0 ) {
memcpy( newPtr, values, length * sizeof( JsonValue ) );
memset( values, 0, length * sizeof( JsonValue ) );
}
delete[] values;
values = newPtr;
additionalCapacity = uint16_t( newCapacity - length );
}
}
I get the point of this; it is just allocating a new array, and doing a copy of the memory contents from the old array into the new array, then zero-ing out the old array's contents. I also know this was done in order to prevent calling destructors, and moves.
The JsonValue is a class with functions, and some data which is stored in a union (string, array, number, etc.).
My concern is whether this is actually defined behaviour or not. I know it works, and we have not had a problem since we began using it a few months ago; but if it's undefined, then there is no guarantee it will keep working.
EDIT:
JsonValue looks something like this:
struct JsonValue {
// …
~JsonValue() {
switch ( details.type ) {
case Type::Array:
case Type::Object:
array.destroy();
break;
case Type::String:
delete[] string.buffer;
break;
default: break;
}
}
private:
struct Details {
Key key = Key::Unknown;
Type type = Type::Null; // (0)
};
union {
Array array;
String string;
EmbedString embedString;
Number number;
Details details;
};
};
Where Array is a wrapper around an array of JsonValues, String is a char*, EmbedString is char[14], Number is a union of int, unsigned int, and double, Details contains the type of value it holds. All values have 16-bits of unused data at the beginning, which is used for Details. Example:
struct EmbedString {
uint16_t : 16;
char buffer[14] = { 0 };
};

Whether this code has well-defined behavior basically depends on two things: 1) is JsonValue trivially-copyable and, 2) if so, are a bunch of all-zero Bytes a valid object representation for a JsonValue.
If JsonValue is trivially-copyable, then the memcpy from one array of JsonValues to another will indeed be equivalent to copying all the elements over [basic.types]/3. If all-zeroes is a valid object representation for a JsonValue, then the memset should be ok (I believe this actually falls into a bit of a grey-area with the current wording of the standard, but I believe at least the intention would be that this is fine).
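The first condition can be checked at compile time. The sketch below uses two hypothetical stand-in types (PlainValue and OwningValue, not the real JsonValue) and an illustrative helper, relocate_bytes, that refuses to compile for types where memcpy is not a valid copy. Note that a user-provided destructor, like the one shown in the question, is already enough to make a type not trivially copyable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins; the real JsonValue is only partly shown in the question.
struct PlainValue {
    int type;
    double number;
};

struct OwningValue {
    char* buffer;
    ~OwningValue() { delete[] buffer; } // user-provided destructor
};

static_assert(std::is_trivially_copyable<PlainValue>::value,
              "PlainValue may be copied with memcpy");
static_assert(!std::is_trivially_copyable<OwningValue>::value,
              "a user-provided destructor rules out trivial copyability");

// Illustrative helper: byte-wise copy that only compiles for trivially copyable T.
template <typename T>
void relocate_bytes(T* dst, const T* src, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only a valid copy for trivially copyable types");
    std::memcpy(dst, src, count * sizeof(T));
}
```

Putting the static_assert next to the memcpy means a later change to the element type fails the build instead of silently introducing undefined behaviour.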
I'm not sure why you'd need to "prevent calling destructors and moves", but overwriting objects with zeroes does not prevent destructors from running. delete[] values will call the destructors of the array members. And moving the elements of an array of trivially-copyable type should compile down to just copying over the bytes anyway.
Furthermore, I would suggest to get rid of these String and EmbedString classes and simply use std::string. At least, it would seem to me that the sole purpose of EmbedString is to manually perform small string optimization. Any std::string implementation worth its salt is already going to do exactly that under the hood. Note that std::string is not guaranteed (and will often not be) trivially-copyable. Thus, you cannot simply replace String and EmbedString with std::string while keeping the rest of this current implementation.
If you can use C++17, I would suggest to simply use std::variant instead of or at least inside this custom JsonValue implementation as that seems to be exactly what it's trying to do. If you need some common information stored in front of whatever the variant value may be, just have a suitable member holding that information in front of the member that holds the variant value rather than relying on every member of the union starting with the same couple of members (which would only be well-defined if all union members are standard-layout types that keep this information in their common initial sequence [class.mem]/23).
The sole purpose of Array would seem to be to serve as a vector that zeroes memory before deallocating it for security reasons. If this is the case, I would suggest to just use an std::vector with an allocator that zeros memory before deallocating instead. For example:
#include <algorithm> // std::fill
#include <cstddef>   // std::size_t
template <typename T>
struct ZeroingAllocator
{
using value_type = T;
T* allocate(std::size_t N)
{
return reinterpret_cast<T*>(new unsigned char[N * sizeof(T)]);
}
void deallocate(T* buffer, std::size_t N) noexcept
{
auto ptr = reinterpret_cast<volatile unsigned char*>(buffer);
// N counts elements, so zero N * sizeof(T) bytes, not just N bytes
std::fill(ptr, ptr + N * sizeof(T), 0);
delete[] reinterpret_cast<unsigned char*>(buffer);
}
};
template <typename A, typename B>
bool operator ==(const ZeroingAllocator<A>&, const ZeroingAllocator<B>&) noexcept { return true; }
template <typename A, typename B>
bool operator !=(const ZeroingAllocator<A>&, const ZeroingAllocator<B>&) noexcept { return false; }
and then
using Array = std::vector<JsonValue, ZeroingAllocator<JsonValue>>;
Note: I fill the memory via volatile unsigned char* to prevent the compiler from optimizing away the zeroing. If you need to support overaligned types, you can replace the new[] and delete[] with direct calls to ::operator new and ::operator delete (doing this will prevent the compiler from optimizing away allocations). Pre C++17, you will have to allocate a sufficiently large buffer and then manually align the pointer, e.g., using std::align…

Allocating an array of aligned struct

I'm trying to allocate an array of struct and I want each struct to be aligned to 64 bytes.
I tried this (it's for Windows only for now), but it doesn't work (I tried with VS2012 and VS2013):
struct __declspec(align(64)) A
{
std::vector<int> v;
A()
{
assert(sizeof(A) == 64);
assert((size_t)this % 64 == 0);
}
void* operator new[] (size_t size)
{
void* ptr = _aligned_malloc(size, 64);
assert((size_t)ptr % 64 == 0);
return ptr;
}
void operator delete[] (void* p)
{
_aligned_free(p);
}
};
int main(int argc, char* argv[])
{
A* arr = new A[200];
return 0;
}
The assert ((size_t)this % 64 == 0) breaks (the modulo returns 16). It looks like it works if the struct only contains simple types though, but breaks when it contains an std container (or some other std classes).
Am I doing something wrong? Is there a way of doing this properly? (Preferably c++03 compatible, but any solution that works in VS2012 is fine).
Edit:
As hinted by Shokwav, this works:
A* arr = (A*)new std::aligned_storage<sizeof(A), 64>::type[200];
// this works too actually:
//A* arr = (A*)_aligned_malloc(sizeof(A) * 200, 64);
for (int i=0; i<200; ++i)
new (&arr[i]) A();
So it looks like it's related to the use of new[]... I'm very curious if anybody has an explanation.

I wonder why you need such a large alignment requirement, especially for a struct that only holds a handle to heap-allocated data. But you can do this:
struct __declspec(align(64)) A
{
unsigned char ___padding[64 - sizeof(std::vector<int>)];
std::vector<int> v;
void* operator new[] (size_t size)
{
// Make sure the buffer will fit even in the worst case
unsigned char* ptr = (unsigned char*)malloc(size + 63);
// Find out the next aligned position in the buffer
unsigned char* endptr = (unsigned char*)(((intptr_t)ptr + 63) & ~63ULL);
// Also store the misalignment in the first padding of the structure
unsigned char misalign = (unsigned char)(endptr - ptr);
*endptr = misalign;
return endptr;
}
void operator delete[] (void* p)
{
unsigned char * ptr = (unsigned char*)p;
// It's required to call back with the original pointer, so subtract the misalignment offset
ptr -= *ptr;
free(ptr);
}
};
int main()
{
A * a = new A[2];
printf("%p - %p = %d\n", &a[1], &a[0], int((char*)&a[1] - (char*)&a[0]));
return 0;
}
I did not have your align_malloc and free function, so the implementation I'm providing is doing this:
It allocates larger to make sure it will fit in 64-bytes boundaries
It computes the offset from the allocation to the closest 64-bytes boundary
It stores the "offset" in the padding of the first structure (else I would have required a larger allocation space each time)
This is used to compute back the original pointer to the free()
Outputs:
0x7fff57b1ca40 - 0x7fff57b1ca00 = 64
Warning: If there is no padding in your structure, then the scheme above will corrupt data, since I'll be storing the misalignment offset in a place that will be overwritten by the constructor of the internal members.
Remember that when you do "new X[n]", "n" has to be stored "somewhere" so that, when delete[] is called, "n" destructor calls can be made. Usually, it's stored before the returned memory buffer (new will likely allocate the required size + 4 for storing the number of elements). The scheme here avoids this.
Another warning: Because C++ calls this operator with some additional padding included in the size for storing the array's number of elements, you might still get a "shift" in the returned pointer address for your objects. You might need to account for it. This is what std::align does: it takes the extra space, computes the alignment like I did, and returns the aligned pointer. However, you cannot get both done in the new[] overload, because of the "count storage" shift that happens after returning from new(). However, you can figure out the "count storage" space once with a single allocation, and adjust the offset accordingly in the new[] implementation.
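One way to sidestep the new[] bookkeeping entirely is the route from the question's edit: allocate raw memory, align it, and construct each element with placement new. The sketch below assumes a C++11 compiler (alignas and std::align), which is newer than the VS2012 toolchain mentioned in the question. AlignedBlock, make_aligned and destroy_aligned are illustrative names, not an existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>
#include <vector>

struct alignas(64) A {
    std::vector<int> v;
};

// Keeps the original malloc pointer so it can be passed back to free().
struct AlignedBlock {
    void* raw;
    A* items;
    std::size_t count;
};

AlignedBlock make_aligned(std::size_t count) {
    const std::size_t bytes = count * sizeof(A);
    std::size_t space = bytes + alignof(A); // worst-case slack for alignment
    void* raw = std::malloc(space);
    if (!raw)
        throw std::bad_alloc();
    void* cursor = raw;
    // std::align bumps 'cursor' to the next 64-byte boundary inside the buffer.
    void* aligned = std::align(alignof(A), bytes, cursor, space);
    assert(aligned != nullptr); // cannot fail: we over-allocated by alignof(A)
    A* items = static_cast<A*>(aligned);
    for (std::size_t i = 0; i < count; ++i)
        new (items + i) A(); // no array cookie is stored, so no shift occurs
    return AlignedBlock{raw, items, count};
}

void destroy_aligned(AlignedBlock& block) {
    for (std::size_t i = block.count; i > 0; --i)
        block.items[i - 1].~A(); // destroy in reverse order of construction
    std::free(block.raw);
    block = AlignedBlock{nullptr, nullptr, 0};
}
```

For brevity the sketch does not roll back already-constructed elements if a constructor throws; production code would need that.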

How to initialize an array whose size is initially unknown?

Say I have this:
int x;
int x = (State Determined By Program);
const char * pArray[(const int)x]; // ??
How would I initialize pArray before using it?
Because the initial size of the Array is determined by user input
Thanks!

Size of dynamically created array on the stack must be known at compile time.
You can either use new:
const char* pArray = new char[x];
...
delete[] pArray;
or better to use std::vector instead (no need to do memory management manually):
vector<char> pArray;
...
pArray.resize(x);

You cannot initialize an array at compile-time if you are determining the size at run-time.
But depending on what you are trying to do, a non-const pointer to const data may provide you with what you're going for.
const char * pArray = new const char[determine_size()](); // const elements must be initialized; () zero-initializes them
A more complete example:
int determine_size()
{
return 5;
}
const char * const allocate_a( int size )
{
char * data = new char[size];
for( int i=0; i<size; ++i )
data[i] = 'a';
return data;
}
int main()
{
const char * const pArray = allocate_a(determine_size());
//const char * const pArray = new char[determine_size()];
pArray[0] = 'b'; // compile error: read-only variable is not assignable
pArray = 0 ; // compile error: read-only variable is not assignable
delete[] pArray;
return 0;
}
I do agree with others that a std::vector is probably more what you're looking for. If you want it to behave more like your const array, you can assign it to a const reference.
#include <vector>
int main()
{
std::vector<char> data;
data.resize(5);
const std::vector<char> & pArray = data;
pArray[0] = 'b'; // compile error: read-only variable is not assignable
}

The example you provided attempts to build the array on the stack.
const char pArray[x];
However, you cannot dynamically create objects on the stack. These types of items must be known at compile time. If this is a variable based on user input then you must create the array in heap memory with the new keyword.
const char* pArray = new char[x];
However, not all items need to be created on the heap. Heap allocation is normally a lot slower than stack allocation. If you want to keep your array on the stack you could always use block-based initialization.
#define MAX_ITEMS 100
const char pArray[MAX_ITEMS];
It should be noted that the second option is wasteful. Because you cannot dynamically resize this array, you must allocate a large enough chunk to hold the maximum number of items your program could create.
Finally, you can always use the data structures provided by C++. std::vector is such a class. It provides you a good level of abstraction, and items are stored in contiguous memory like an array. As noted by one of the other answers, you should use the resize option once you know the final size of your vector.
std::vector<char> pArray;
pArray.resize(x);
The reason for this is every time you add an element to a vector, if it no longer has enough room to grow, it has to relocate all items so they can exist next to one another. Using the resize method helps prevent vector from having to grow as you add items.
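Applied to the question's array of const char*, that could look like the sketch below. The function name make_pointer_table and the use of std::string as the owner of the characters are assumptions made for illustration; reserve() is used here because the elements are appended one by one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds a table of C-string pointers whose length is only known at run time.
// 'names' owns the characters, so the returned pointers are valid only while
// 'names' is alive and unmodified.
std::vector<const char*> make_pointer_table(const std::vector<std::string>& names) {
    std::vector<const char*> table;
    table.reserve(names.size()); // one allocation up front, no regrowth in the loop
    for (const std::string& name : names)
        table.push_back(name.c_str());
    return table;
}
```

If an API needs a plain const char** the caller can pass table.data(), since the vector's elements are stored contiguously.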

Trouble assigning a string to dynamic array location

I am getting an "EXC_BAD_ACCESS" error when trying to add a string to a dynamic array. Am I doing something wrong? Here are some snippets of code:
typedef unsigned short ushort_t;
typedef string* stringPtr_t;
class Doctor {
private:
string doctorName;
stringPtr_t patientArray;
ushort_t patientArraySize;
ushort_t numOfPatient;
bool Doctor::addPatient(string patientName)
{
patientArray[numOfPatient].assign(patientName);
numOfPatient++;
return true;
}
Doctor& Doctor::operator =(const Doctor& docSource)
{
for (int i = 0; i < docSource.patientArraySize; i++) {
patientArray[i].assign(docSource.patientArray[i]);
}
return *this;
}
};
int main()
{
Doctor testDoc5(2);
cout.clear();
assert(testDoc5.addPatient("Bob Smith")==true);
}
Doctor::Doctor(ushort_t patientArrayCapacity)
: doctorName("need a name.")
, patientArraySize(patientArrayCapacity)
, numOfPatient(0)
{
patientArray = *new stringPtr_t[patientArraySize];
}

The suspect line is:
patientArray = *new stringPtr_t[patientArraySize];
Let's examine this in a little more detail.
Expanding (replacing typedefs) results in
patientArray = * new string * [patientArraySize];
Looking at the allocation part:
new string * [patientArraySize];
Allocates an array of pointers to strings. This may not be what you want.
The next part:
* (new string * [patientArraySize]);
dereferences the pointer to the array of string pointers, thus referring to the first element of the array.
And lastly, the assignment:
patientArray = * (new string * [patientArraySize]);
Assigns the contents of array location zero to your variable patientArray. This is legal since you told the compiler you are allocating an array of pointers to strings.
Side effects:
1. You have lost the location of the start of array. Also known as a memory leak.
2. The content of the patientArray pointer is undefined since you didn't initialize the pointer value in the first location of the array.
Maybe you want:
patientArray = new string [patientArraySize];
which allocates an array of strings and assigns to your pointer patientArray.
This whole issue would go away if you used std::vector<string>(patientArraySize).
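A minimal sketch of that vector-based variant follows. It keeps the question's class and method names where they exist; the capacity check that returns false, and the name() accessor, are assumed additions, since the original code does not say what should happen when the array is full.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Doctor {
public:
    explicit Doctor(std::size_t patientArrayCapacity)
        : doctorName("need a name."), capacity(patientArrayCapacity) {
        patients.reserve(capacity); // storage is owned and freed by the vector
    }

    // Returns false instead of writing past the end when the list is full
    // (an assumed policy; the question does not specify one).
    bool addPatient(const std::string& patientName) {
        if (patients.size() >= capacity)
            return false;
        patients.push_back(patientName);
        return true;
    }

    std::size_t numOfPatient() const { return patients.size(); }
    const std::string& name() const { return doctorName; }

private:
    std::string doctorName;
    std::size_t capacity;
    std::vector<std::string> patients;
};
```

With the vector as a member, the compiler-generated copy constructor and assignment operator already copy the patients correctly, so the hand-written operator= from the question is no longer needed.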

Dynamically allocate C struct?

I want to dynamically allocate a C struct:
typedef struct {
short *offset;
char *values;
} swc;
Both 'offset' and 'values' are supposed to be arrays, but their size is unknown until runtime.
How can I dynamically allocate memory for my struct and the struct's arrays?

swc *a = (swc*)malloc(sizeof(swc));
a->offset = (short*)malloc(sizeof(short)*n);
a->values = (char*)malloc(sizeof(char)*n);
Where n = the number of items in each array and a is the address of the newly allocated data structure. Don't forget to free() offset and values before free()'ing a.

In C:
swc *s = malloc(sizeof *s); // assuming you're creating a single instance of swc
if (s)
{
s->offset = malloc(sizeof *(s->offset) * number_of_offset_elements);
s->values = malloc(sizeof *(s->values) * number_of_value_elements);
}
In C++:
try
{
swc *s = new swc;
s->offset = new short[number_of_offset_elements];
s->values = new char[number_of_value_elements];
}
catch(...)
{
...
}
Note that in C++, you might be better off using vectors as opposed to dynamically allocated buffers:
struct swc
{
std::vector<short> offset;
std::vector<char> values;
};
swc *a = new swc;
Question: is values supposed to be an array of individual characters or an array of strings? That would change things a bit.
EDIT
The more I think about it, the less satisfied I am with the C++ answer; the right way to do this sort of thing in C++ (assuming you need dynamically allocated buffers as opposed to vectors, which you probably don't) is to perform the memory allocation for offset and values as part of a constructor within the struct type, and have a destructor deallocate those elements when the struct instance is destroyed (either by a delete or by going out of scope).
struct swc
{
swc(size_t numOffset = SOME_DEFAULT_VALUE,
size_t numValues = SOME_OTHER_DEFAULT_VALUE)
{
m_offset = new short[numOffset];
m_values = new char[numValues];
}
~swc()
{
delete[] m_offset;
delete[] m_values;
}
short *m_offset;
char *m_values;
};
void foo(void)
{
swc *a = new swc(10,20); // m_offset and m_values allocated as
// part of the constructor
swc b; // uses default sizes for m_offset and m_values
...
a->m_offset[0] = 1;
a->m_values[0] = 'a';
b.m_offset[0] = 2;
b.m_values[0] = 'b';
...
delete a; // handles freeing m_offset and m_values
// b's members are deallocated when it goes out of scope
}

You have to do it separately. First allocate the struct, then the memory for the arrays.
In C:
swc *pSwc = malloc(sizeof(swc));
pSwc->offset = malloc(sizeof(short)*offsetArrayLength);
pSwc->values = malloc(valuesArrayLength);
In C++, you shouldn't be doing anything like that.

In C:
typedef struct
{
short *offset;
char *values;
} swc;
void destroy_swc(swc* data); // forward declaration, used by create_swc()
/// Pre-Condition: None
/// Post-Condition: On failure will return NULL.
/// On Success a valid pointer is returned where
/// offset[0-size) and values[0-size) are legally dereferenceable.
/// Ownership of this memory is returned to the caller who
/// is responsible for destroying it via destroy_swc()
swc *create_swc(unsigned int size)
{
swc *data = (swc*) malloc(sizeof(swc));
if (data)
{
data->offset = (short*)malloc(sizeof(short)*size);
data->values = (char*) malloc(sizeof(char) *size);
}
if ((data != NULL) && (size != 0) && ((data->offset == NULL) || (data->values == NULL)))
{
// Partially created object is dangerous and of no use.
destroy_swc(data);
data = NULL;
}
return data;
}
void destroy_swc(swc* data)
{
free(data->offset);
free(data->values);
free(data);
}
In C++
struct swc
{
std::vector<short> offset;
std::vector<char> values;
swc(unsigned int size)
:offset(size)
,values(size)
{}
};

You will need a function to do this.
Something like (my C/C++ is rusty)
swc* makeStruct(int offsetCount, int valuesCount) {
swc *ans = new swc();
ans->offset = new short[offsetCount];
ans->values = new char[valuesCount];
return ans;
}
myNewStruct = makeStruct(4, 20);
Syntax may be a bit off but that is generally what you are going to need. If you're using C++ then you probably want a class with a constructor taking the 2 args instead of the makeStruct but doing something very similar.

One thing to add to the many correct answers here: you can malloc an over-sized structure to accommodate a variable sized array in the last member.
struct foo {
short* offset;
char values[]; // C99 flexible array member, must be last (values[0] is a GNU extension)
};
and later
struct foo *foo1 = malloc(sizeof(struct foo)+30); // takes advantage of sizeof(char)==1
to get room for 30 objects in the values array. You would still need to do
foo1->offset = malloc(30*sizeof(short));
if you want them to use the same size arrays.
I generally wouldn't actually do this (maintenance nightmare if the structure ever needs to expand), but it is a tool in the kit.
[code here in c. You'll need to cast the malloc's (or better use new and RAII idioms) in c++]

swc* a = malloc(sizeof(*a));
a->offset = calloc(n, sizeof(*(a->offset)));
a->values = calloc(n, sizeof(*(a->values)));
You should not cast void* in c... in c++ you must!

Use the malloc or calloc function to allocate memory dynamically; examples are easy to find online.
The calloc function initializes the allocated memory to zero.

Since nobody has mentioned it yet, sometimes it is nice to grab this chunk of memory in one allocation so you only have to call free() on one thing:
swc* AllocSWC(int items)
{
int size = sizeof(swc); // for the struct itself
size += (items * sizeof(short)); // for the array of shorts
size += (items * sizeof(char)); // for the array of chars
swc* p = (swc*)malloc(size);
if (p == NULL) return NULL; // allocation failed, nothing to set up
memset(p, 0, size);
p->offset = (short*)((char*)p + sizeof(swc)); // array of shorts begins immediately after the struct
p->values = (char*)((char*)p + sizeof(swc) + items * sizeof(short)); // array of chars begins immediately after the array of shorts
return p;
}
Of course this is a bit more difficult to read and maintain (especially if you dynamically resize the arrays after it is first allocated). Just an alternative method I've seen used in a number of places.

Most of the answers are correct. I would like to add something that you haven't explicitly asked but might also be important.
C / C++ arrays don't store their own size in memory. Thus, unless you want offset and values to have compile-time defined values (and, in that case, it's better to use fixed-size arrays), you might want to store the sizes of both arrays in the struct.
typedef struct tagswc {
short *offset;
char *values;
// EDIT: Changed int to size_t, thanks Chris Lutz!
size_t offset_count;
size_t values_count; // You don't need this one if values is a C string.
} swc;
DISCLAIMER: I might be wrong. For example, if all offsets of all swc instances have the same size, it would be better to store offset_count as a global member, not as a member of the struct. The same can be said about values and values_count. Also, if values is a C string, you don't need to store its size, but beware of Schlemiel the painter-like problems.

You want to use malloc to allocate the memory, and probably also sizeof() to allocate the correct amount of space.
Something like:
structVariable = (swc*) malloc(sizeof(swc));
Should do the trick.

In addition to the above, I would like to add freeing the allocated memory, as shown below:
typedef struct {
short *offset;
char *values;
} swc;
swc* createStructure(int Count1, int Count2) {
swc *s1 = new swc();
s1->offset = new short[Count1];
s1->values = new char[Count2];
return s1;
}
int _tmain(int argc, _TCHAR* argv[])
{
swc *mystruct;
mystruct = createStructure(11, 11);
delete[] mystruct->offset;
delete[] mystruct->values;
delete mystruct;
return 0;
}

**If** you will not be resizing the arrays, then you can get away with a single call to malloc().
swc *new_swc (int m, int n) {
swc *p;
p = malloc (sizeof (*p) + m * sizeof (p->offset[0]) + n * sizeof (p->values[0]));
p->offset = (short *) &p[1];
p->values = (char *) &p->offset[m];
return p;
}
You can then free it with a single call to free().
(In general, there are alignment considerations to take into account, but for an array of shorts followed by an array of chars, you will be fine.)
