Best way to wrap a char* in C++?

In my code I use buffers currently allocated this way:
char* buf1 = (char*)malloc(size);
However at some points in the code I want to reassign the pointer to some place else in memory. The problem is that there are other places in the code that still need to be able to access the pointer buf1.
What's the best way to do this in C++? Right now I am considering writing a struct with a single char* in it, then allocating an object of this struct type and passing it to the places where I need to, and referring to wrapped pointer to get the current value of buf1.
However it seems that this is similar to what unique_ptr does. If I use unique_ptr how can I wrap a char* with it? I had some trouble with testing this and I'm not sure it's supported.
To clarify: these buffers are bytes of varying sizes.

In general, this question cannot be answered. There are simply too many things you might want to do with an array of char. Without knowing what you actually want to do, it's impossible to say which abstractions would be good to use…
If you want to do stuff with strings, just use std::string. If you want a dynamically-sized buffer that can grow and shrink, use std::vector.
If you just need a byte buffer the size of which is determined at runtime or which you'd just generally want to live in dynamic storage, I'd go with std::unique_ptr. While std::unique_ptr<T> is just for single objects, the partial specialization std::unique_ptr<T[]> can be used for dealing with dynamically allocated arrays. For example:
auto buffer = std::unique_ptr<char[]> { new char[size] };
Typically, the recommended way to create an object via new and get a std::unique_ptr to it would be to use std::make_unique. And if you want your buffer zero-initialized, you should indeed use std::make_unique<char[]>(size), because std::make_unique<T[]>() will value-initialize the elements of the array it creates. In the case of a char array, that effectively means that your array will be zero-initialized. In my experience, compilers are, unfortunately, unable to optimize away the zero-initialization, even if the entire buffer would be overwritten first thing right after being created. So if you want an uninitialized buffer for the sake of avoiding the overhead of initialization, you can't use std::make_unique. Ideally, you'd just define your own function to create a default-initialized array via new and get a std::unique_ptr to it, for example:
template <typename T>
inline std::enable_if_t<std::is_array_v<T> && (std::extent_v<T> == 0), std::unique_ptr<T>> make_unique_default(std::size_t size)
{
return std::unique_ptr<T> { new std::remove_extent_t<T>[size] };
}
and then
auto buffer = make_unique_default<char[]>(size);
C++20 includes this functionality in the form of std::make_unique_for_overwrite (it was proposed under the name std::make_unique_default_init). So that is the preferred method when C++20 is available.
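As a rough sketch of the difference between the two initialization modes, the helpers below (make_zeroed_buffer, make_uninitialized_buffer and make_filled_buffer are hypothetical names, not standard functions) use std::make_unique_for_overwrite when the library advertises it through the __cpp_lib_smart_ptr_for_overwrite feature-test macro, and fall back to a plain new[] otherwise:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Zero-initialized buffer: std::make_unique<T[]> value-initializes every element.
inline std::unique_ptr<char[]> make_zeroed_buffer(std::size_t size)
{
    return std::make_unique<char[]>(size);
}

// Buffer whose contents are indeterminate until they are written.
inline std::unique_ptr<char[]> make_uninitialized_buffer(std::size_t size)
{
#if defined(__cpp_lib_smart_ptr_for_overwrite)
    return std::make_unique_for_overwrite<char[]>(size);  // C++20 library
#else
    return std::unique_ptr<char[]>(new char[size]);       // same effect before C++20
#endif
}

// Typical use: allocate without initialization, then overwrite everything.
inline std::unique_ptr<char[]> make_filled_buffer(std::size_t size, char fill)
{
    auto buffer = make_uninitialized_buffer(size);
    std::memset(buffer.get(), fill, size);
    return buffer;
}
```

Reading from the result of make_uninitialized_buffer before writing to it is a bug, so it only pays off when the whole buffer is overwritten right away.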
Note that, if you're dealing with plain std::unique_ptr, you will still have to pass around the size of the buffer separately. You may want to bundle up a std::unique_ptr and a std::size_t if you're planning to pass around the buffer:
template <typename T>
struct buffer_t
{
std::unique_ptr<T[]> data;
std::size_t size;
};
Note that something like the struct above represents ownership of the buffer. So you'd want to use this, e.g., when returning a new buffer from a factory function, e.g.,
buffer_t<char> makeMeABuffer();
or handing off ownership of the buffer to someone else, e.g.,
DataSink(buffer_t<char>&& buffer)
You would not want to use it just to point some function to the buffer data and size so that it can do some processing without a transfer of ownership. For that, you'd just pass a pointer and size, or, e.g., use a span (starting, again, with C++20; also available as part of GSL)…
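To illustrate the split between owning and non-owning interfaces, here is a minimal sketch. The names byte_buffer, make_buffer, count_zero_bytes and data_sink are made up for this example; only the factory and the sink deal with ownership, while the processing function just receives a pointer and a size:

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Owning bundle of buffer and size (hypothetical type for this sketch).
struct byte_buffer
{
    std::unique_ptr<char[]> data;
    std::size_t size;
};

// Factory: hands a freshly allocated, zero-initialized buffer to the caller.
inline byte_buffer make_buffer(std::size_t size)
{
    return { std::make_unique<char[]>(size), size };
}

// Non-owning access: the function only looks at the bytes.
inline std::size_t count_zero_bytes(const char* data, std::size_t size)
{
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < size; ++i)
        if (data[i] == 0) ++zeros;
    return zeros;
}

// Sink: takes over ownership of the buffer it is given.
class data_sink
{
public:
    explicit data_sink(byte_buffer&& buffer) : buffer_(std::move(buffer)) {}
    std::size_t size() const { return buffer_.size; }

private:
    byte_buffer buffer_;
};
```

A caller would write count_zero_bytes(b.data.get(), b.size) to process a buffer b it keeps, and data_sink sink(std::move(b)) to give it away, after which b.data is null.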

Related

Idiomatic way to handle T[]-like objects in C++

I am using some C-library in my C++ code. The library wants me to allocate some amount of the memory and pass the pointer to the library. Unfortunately, the exact required memory size is not known in advance, so the library also requires me to provide C-callback with the following signature:
void* callback_realloc(void* ptr, size_t new_size);
where ptr is the previously passed memory and new_size is the required size; the callback must return a pointer to the newly allocated memory. There is no direct way to store the allocator state. Instead, I need to rely on pointer arithmetic, something like the following:
template<class T>
struct o_s {
std::aligned_storage_t<sizeof(T), alignof(T)> data;
};
template<class Alloc>
struct o_i: private Alloc {
std::size_t allocated_size;
const Alloc& get_allocator() const { return *this; }
};
template<class T>
struct o: public o_i, public o_s<T> {
void* ptr() {
return &data;
}
// Additionally, override class operator new and operator delete...
};
void* my_callback(void* ptr, size_t new_size) {
auto meta = static_cast<o*>(reinterpret_cast<o_s<char>*>(ptr));
// access to the allocator state ...
}
Then sizeof(o_i) + initial_size memory is allocated and ptr() is passed to the C library.
At this point, I understand that I am not the first person in the world who needs this pattern. Unfortunately (and surprisingly), I have not found anything suitable for this in Boost or the STL. I would like to use an existing implementation to avoid possible pitfalls.
The simplest solution is to allocate the memory using std::malloc, and use std::realloc as the callback. C APIs such as this one are a case where using those makes sense in C++.
You don't necessarily have to use std::malloc however. You can implement a custom allocator if you want to. Using some of the allocated storage is one way of storing allocation metadata, and it's an efficient way. That's not necessary either though, since you can also store the metadata separately in a map-like structure.
I don't think you'd find an idiomatic way to do this in C++, as this is a C idiom.
Conceptually, you have two ways of going about this:
Store the metadata alongside the buffer you've allocated (as you seem to be doing). I feel using double inheritance etc. is a bit overkill here, when you could just allocate a char buffer of sizeof(size_t)+allocation_size and use the first part for your metadata.
Allocate and return to the API a raw buffer as needed, and use a separate static data structure to manage this with a map of ptr->allocation size. I suppose this is what malloc/realloc is doing behind the scenes anyway.
Both are valid, and the one you chose depends on the specific details of your application. When dealing with memory it's ok to actually deal with memory, be it pointer arithmetic or what not.
The important thing is probably to keep the ugly bit to a single location, and provide a C++ style API to this on the C++ side of things, so that the client code isn't exposed to implementation details.
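The first option (metadata stored in front of the buffer) could look roughly like the sketch below. It assumes the library accepts any callback with the signature shown in the question; block_header, allocated_size and callback_free are hypothetical names. The header is padded to the alignment of std::max_align_t so that the bytes handed to the library stay as aligned as a plain malloc result would be:

```cpp
#include <cstddef>
#include <cstdlib>

// Metadata stored directly in front of the user-visible bytes.
struct alignas(std::max_align_t) block_header
{
    std::size_t size;  // number of bytes requested by the library
};

// Callback for the C library: grows or shrinks the block and keeps the header.
extern "C" void* callback_realloc(void* ptr, std::size_t new_size)
{
    // Step back from the user pointer to the header (nullptr means "no block yet").
    block_header* old_header =
        ptr ? static_cast<block_header*>(ptr) - 1 : nullptr;

    void* raw = std::realloc(old_header, sizeof(block_header) + new_size);
    if (!raw) return nullptr;  // the old block is still valid in this case

    block_header* header = static_cast<block_header*>(raw);
    header->size = new_size;
    return header + 1;         // user data starts right after the header
}

// C++ side: recover the metadata from a pointer previously returned above.
inline std::size_t allocated_size(const void* ptr)
{
    return (static_cast<const block_header*>(ptr) - 1)->size;
}

// Release a block obtained through callback_realloc.
inline void callback_free(void* ptr)
{
    if (ptr) std::free(static_cast<block_header*>(ptr) - 1);
}
```

Only these three functions know about the header, which keeps the pointer arithmetic in one place as suggested above.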

How can I pass and store an array of variable size containing pointers to objects?

For my project I need to store pointers to objects of type ComplicatedClass in an array. This array is stored in a class Storage along with other information I have omitted here.
Here's what I would like to do (which obviously doesn't work, but hopefully explains what I'm trying to achieve):
class ComplicatedClass
{
...
}
class Storage
{
public:
Storage(const size_t& numberOfObjects, const std::array<ComplicatedClass *, numberOfObjects>& objectArray)
: size(numberOfObjects),
objectArray(objectArray)
{}
...
public:
size_t size;
std::array<ComplicatedClass *, size> objectArray;
...
}
int main()
{
ComplicatedClass * object1 = new ComplicatedClass(...);
ComplicatedClass * object2 = new ComplicatedClass(...);
Storage myStorage(2, {object1, object2});
...
return 0;
}
What I am considering is:
Using std::vector instead of std::array. I would like to avoid this because there are parts of my program that are not allowed to allocate memory on the free-store. As far as I know, std::vector would have to do that. As a plus I would be able to ditch size.
Changing Storage to a class template. I would like to avoid this because then I have templates all over my code. This is not terrible but it would make classes that use Storage much less readable, because they would also have to have templated functions.
Are there any other options that I am missing?
How can I pass and store an array of variable size containing pointers to objects?
By creating the array dynamically. The most convenient solution is to use std::vector.
size_t size;
std::array<ComplicatedClass *, size> objectArray;
This cannot work. Template arguments must be compile time constant. Non-static member variables are not compile time constant.
I would like to avoid this because there are parts of my program that are not allowed to allocate memory on the free-store. As far as I know, std::vector would have to do that.
std::vector would not necessarily require the use of free-store. Like all standard containers (besides std::array), std::vector accepts an allocator. If you implement a custom allocator that doesn't use free-store, then your requirement can be satisfied.
Alternatively, even if you do use the default allocator, you could write your program in such way that elements are inserted into the vector only in parts of your program that are allowed to allocate from the free-store.
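One possible way to get a vector that never touches the free store, assuming C++17 and its <memory_resource> header are available, is std::pmr::vector on top of a std::pmr::monotonic_buffer_resource that draws from a fixed member array. This is only a sketch: the 256-byte capacity is an arbitrary choice, and ComplicatedClass is reduced to a stand-in.

```cpp
#include <cstddef>
#include <initializer_list>
#include <memory_resource>
#include <vector>

struct ComplicatedClass { int id; };  // stand-in for the real class

class Storage
{
public:
    Storage(std::initializer_list<ComplicatedClass*> objects)
        : objects_(objects, &arena_)
    {}

    std::size_t size() const { return objects_.size(); }
    ComplicatedClass* operator[](std::size_t i) const { return objects_[i]; }

private:
    // Fixed-size backing store that lives inside the Storage object itself.
    alignas(std::max_align_t) std::byte bytes_[256];
    // Hands out memory from bytes_; the null upstream resource makes it throw
    // std::bad_alloc instead of falling back to the free store when full.
    std::pmr::monotonic_buffer_resource arena_{
        bytes_, sizeof(bytes_), std::pmr::null_memory_resource() };
    // Declared last so that arena_ is already constructed when it is used.
    std::pmr::vector<ComplicatedClass*> objects_;
};
```

Because the memory resource is neither copyable nor movable, neither is this Storage, and the maximum number of pointers is bounded by the size of bytes_.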
I thought C++ had "free-store" instead of heap, does it not?
Those are, for most practical purposes, different words for the same thing. "Free store" is the term the C++ standard uses for memory obtained with new. It's often informally called "heap memory"; that name is historical and does not imply that a heap data structure is used to implement it.
Beginning with C++11 std::vector has the data() method to access the underlying array the vector is using for storage.
In most cases a std::vector can be used much like an array, so you can take advantage of its resizing when you need it and treat it as a plain array otherwise. See https://stackoverflow.com/a/261607/1466970
Finally, you are aware that you can use vectors in place of arrays, right? Even when a function expects C-style arrays you can use vectors:
vector<char> v(50); // Ensure there's enough space
strcpy(&v[0], "prefer vectors to c arrays");
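With data() the same idea can be written without &v[0], which also stays valid for an empty vector. In this sketch fill_with_pattern is a made-up stand-in for any function that expects a raw char buffer and its length:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for a C-style API that writes into a caller-provided buffer.
inline void fill_with_pattern(char* out, std::size_t size)
{
    std::memset(out, 'x', size);
}

// The vector owns the memory; the C-style function only sees pointer + size.
inline std::vector<char> read_pattern(std::size_t size)
{
    std::vector<char> buffer(size);
    fill_with_pattern(buffer.data(), buffer.size());
    return buffer;
}
```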

How do you declare a pointer to a C++11 std::array?

Depending on a variable, I need to select the SeedPositions32 or SeedPositions16 array for further use. I thought a pointer would allow this but I can't seem to make it work. How do you declare a pointer to a C++11 std::array? I tried the below.
array<int>* ArrayPointer;
//array<typedef T, size_t Size>* ArrayPointer;
array<int,32> SeedPositions32 = {0,127,95,32,64,96,31,63,16,112,79,48,15,111,80,
47,41,72,8,119,23,104,55,87,71,39,24,7,56,88,103,120};
array<int,16> SeedPositions16 = {...}
std::array has a template parameter for size. Two std::array template instantiations with different sizes are different types. So you cannot have a pointer that can point to arrays of different sizes (barring void* trickery, which opens its own can of worms.)
You could use templates for the client code, or use std::vector<int> instead.
For example:
template <std::size_t N>
void do_stuff_with_array(std::array<int, N> the_array)
{
// do stuff with the_array.
}
do_stuff_with_array(SeedPositions32);
do_stuff_with_array(SeedPositions16);
Note that you can also get a pointer to the data:
int* ArrayPtr = SeedPositions32.data();
but here, you lose the size information. You will have to keep track of it independently.
You can simply access the content of the std::array as a raw C-like array pointer using the std::array::data() member function:
int* arrayPointer = useSeedPositions32 ? SeedPositions32.data() : SeedPositions16.data();
In his answer juanchopanza explained very well why what you want cannot work.
The question is why would you want to do that? There is no way where you could use a (pointer to) std::array<int,32> in place of std::array<int,16>.
The point of std::array<> is to keep track of the number of elements at compile time (and also to avoid memory allocation for small fixed-sized arrays). If you instead want the number of elements to be managed at run time, you should presumably not use std::array<>, but std::vector.
The alternative of obtaining a pointer to the underlying data (using std::array::data() as proposed in other answers) and keeping track of the number of elements by yourself is somewhat dangerous and not really recommendable. The problem is that you must ensure that the pointer is never dangling.
Finally, I cannot find any possible use case. In order to use your pointer, you must declare both an array<int,32> and an array<int,16> object, yet use only one of them.
Why don't you simply declare only an array<int,32> and use just its first 16 elements if not all 32 are needed?
You could do something like this:
int * myArray = use32 ? &SeedPositions32[0] : &SeedPositions16[0];
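If the pointer route is taken, keeping the pointer and the element count together avoids losing the size. The sketch below uses a small hand-written view type (int_view, view_of and select_seeds are hypothetical names, and the arrays are shortened placeholders rather than the real seed tables); with C++20, std::span<const int> could play the same role.

```cpp
#include <array>
#include <cstddef>

// Minimal non-owning view: pointer plus element count.
struct int_view
{
    const int* data;
    std::size_t size;

    const int* begin() const { return data; }
    const int* end() const { return data + size; }
};

// Works for std::array of any size; the size is captured at the call site.
template <std::size_t N>
int_view view_of(const std::array<int, N>& a)
{
    return { a.data(), N };
}

// Shortened placeholder tables for the sake of the example.
const std::array<int, 4> seeds_small = {0, 127, 95, 32};
const std::array<int, 8> seeds_large = {0, 127, 95, 32, 64, 96, 31, 63};

// Run-time selection between arrays of different types.
inline int_view select_seeds(bool use_large)
{
    return use_large ? view_of(seeds_large) : view_of(seeds_small);
}
```

The view is only valid while the array it refers to is alive, which is the dangling-pointer concern raised above.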

is there any way to avoid the copy from and to between the valarray and array?

I have a lot of data in a list, say several kilobytes in each element, and I would like to extract the elements one by one to do some numeric processing. The data are originally stored as float[]. Since the processing involves a lot of indexing and global calculation, I think valarray might make it easy to program. But if I use valarray, I may have to copy from the array to the valarray first, and then copy back to the array. Is there any way to avoid this, so that I can work directly on the arrays? Or do you have better ways to solve similar problems?
The valarray type does not provide any way to use an existing array for its data store; it always makes a copy for itself. Instead of storing your data in an ordinary array, store the values directly in the valarray from the start. Call v.resize to set the size, and either assign values into it with the [] operator, or use &v[0] to get a pointer to the first value and use it as you would an iterator or buffer pointer — elements of a valarray are stored contiguously in memory.
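A small sketch of that approach: the valarray is created first and whatever code used to fill a float[] writes straight into it through &v[0]. Here load_samples is a made-up stand-in for that producer code.

```cpp
#include <cstddef>
#include <valarray>

// Stand-in for existing code that fills a plain float buffer.
inline void load_samples(float* out, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        out[i] = static_cast<float>(i);
}

// Data goes directly into the valarray, so no copy in or out is needed.
inline std::valarray<float> load_and_scale(std::size_t count, float factor)
{
    std::valarray<float> v(count);
    if (count > 0)
        load_samples(&v[0], count);  // elements are contiguous in memory
    v *= factor;                     // whole-array arithmetic
    return v;
}
```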
Warning: ugly hack.
On my system (MS Visual Studio) the valarray class is defined like this:
template<class _Ty>
class valarray
{
...
private:
_Ty *_Myptr; // current storage reserved for array
size_t _Mysize; // current length of sequence
size_t _Myres; // length of array
};
So I can build my own class that has the same layout (with a good level of confidence):
struct my_valarray_hack
{
void *_Myptr;
size_t num_of_elements;
size_t size_of_buffer;
};
Then create an empty valarray and overwrite its internal variables so it points to your data.
void do_stuff(float my_data[], size_t size)
{
valarray<float> my_valarray;
my_valarray_hack hack = {my_data, size, size};
my_valarray_hack cleanup;
assert(sizeof(my_valarray) == sizeof(hack));
// Save the contents of the object that we are fiddling with
memcpy(&cleanup, &my_valarray, sizeof(cleanup));
// Overwrite the object so it points to our array
memcpy(&my_valarray, &hack, sizeof(hack));
// Do calculations
...
// Do cleanup (otherwise, it will crash)
memcpy(&my_valarray, &cleanup, sizeof(cleanup));
// Destructor is silently invoked here
}
This is not a recommended way of doing things; you should consider it only if you have no other way to implement what you want (maybe not even then). Possible reasons why it could fail:
Layout of valarray may be different in another mode of compilation (examples of modes: debug/release; different platforms; different versions of Standard Library)
If your calculations resize the valarray in any manner, it will try to reallocate your buffer and crash
If the implementation of valarray assumes its buffer has e.g. 16-byte alignment, it may crash, do wrong calculations or just work slowly (depending on your platform)
(I am sure there are some more reasons for it not to work)
Anyway, it's described as "undefined behavior" by the Standard, so strictly speaking anything may happen if you use this solution.

Prototype for function that allocates memory on the heap (C/C++)

I'm fairly new to C++ so this is probably somewhat of a beginner question. It regards the "proper" style for doing something I suspect to be rather common.
I'm writing a function that, in performing its duties, allocates memory on the heap for use by the caller. I'm curious about what a good prototype for this function should look like. Right now I've got:
int f(char** buffer);
To use it, I would write:
char* data;
int data_length = f(&data);
// ...
delete[] data;
However, the fact that I'm passing a pointer to a pointer tips me off that I'm probably doing this the wrong way.
Anyone care to enlighten me?
In C, that would have been more or less legal.
In C++, functions typically shouldn't do that. You should try to use RAII to guarantee memory doesn't get leaked.
And now you might say "how would it leak memory, I call delete[] just there!", but what if an exception is thrown at the // ... lines?
Depending on what exactly the functions are meant to do, you have several options to consider. One obvious one is to replace the array with a vector:
std::vector<char> f();
std::vector<char> data = f();
int data_length = data.size();
// ...
//delete[] data;
and now we no longer need to explicitly delete, because the vector is allocated on the stack, and its destructor is called when it goes out of scope.
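To make the exception argument concrete, here is a minimal sketch. The names make_data and use_data are invented for the example, and the thrown exception stands in for whatever might go wrong at the // ... lines:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Plays the role of f(): returns the data in a container that owns it.
inline std::vector<char> make_data()
{
    return std::vector<char>(64, 'a');
}

// Plays the role of the calling code.
inline std::size_t use_data(bool fail)
{
    std::vector<char> data = make_data();
    if (fail)
        throw std::runtime_error("processing failed");
    // Whether we return normally or leave via the exception,
    // the vector's destructor releases the memory.
    return data.size();
}
```

With the char* version, the throw would skip the delete[] and leak the buffer.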
I should mention, in response to comments, that the above implies a copy of the vector, which could potentially be expensive. Most compilers will, if the f function is not too complex, optimize that copy away so this will be fine (and since C++11 the returned vector is moved rather than copied even when the copy is not elided). And if the function isn't called too often, the overhead won't matter anyway. But if that doesn't happen, you could instead pass an empty vector to the f function by reference, and have f store its data in that instead of returning a new vector.
If the performance of returning a copy is unacceptable, another alternative would be to decouple the choice of container entirely, and use iterators instead:
// definition of f
template <typename iter>
void f(iter out);
// use of f
std::vector<char> vec;
f(std::back_inserter(vec));
Now the usual iterator operations can be used (*out to reference or write to the current element, and ++out to move the iterator forward to the next element) -- and more importantly, all the standard algorithms will now work. You could use std::copy to copy the data to the iterator, for example. This is the approach usually chosen by the standard library (ie. it is a good idea;)) when a function has to return a sequence of data.
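A possible definition along those lines is sketched below. write_payload plays the role of f, and the "hello" payload is just placeholder data; returning the advanced iterator follows the convention of std::copy.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Writes the generated bytes through any output iterator.
template <typename OutputIt>
OutputIt write_payload(OutputIt out)
{
    const std::string payload = "hello";  // placeholder for real data
    return std::copy(payload.begin(), payload.end(), out);
}
```

The caller picks the container: write_payload(std::back_inserter(vec)) appends to a std::vector<char>, and the same call works with a std::string or any other container that supports push_back.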
Another option would be to make your own object taking responsibility for the allocation/deallocation:
struct f { // simplified for the sake of example. In the real world, it should be given a proper copy constructor + assignment operator, or they should be made inaccessible to avoid copying the object
f(){
// do whatever the f function was originally meant to do here
size = ???
data = new char[size];
}
~f() { delete[] data; }
int size;
char* data;
};
f data;
int data_length = data.size;
// ...
//delete[] data;
And again we no longer need to explicitly delete because the allocation is managed by an object on the stack. The latter is obviously more work, and there's more room for errors, so if the standard vector class (or other standard library components) do the job, prefer them. This example is only if you need something customized to your situation.
The general rule of thumb in C++ is: "If you're writing a delete or delete[] outside a RAII object, you're doing it wrong. If you're writing a new or new[] outside a RAII object, you're doing it wrong, unless the result is immediately passed to a smart pointer."
In 'proper' C++ you would return an object that contains the memory allocation somewhere inside of it. Something like a std::vector.
Your function should not return a naked pointer to some memory. The pointer, after all, can be copied. Then you have the ownership problem: Who actually owns the memory and should delete it? You also have the problem that a naked pointer might point to a single object on the stack, on the heap, or to a static object. It could also point to an array at these places. Given that all you return is a pointer, how are users supposed to know?
What you should do instead is to return an object that manages its resource in an appropriate way. (Look up RAII.) Given the fact that the resource in this case is an array of char, either a std::string or a std::vector seems to be best:
int f(std::vector<char>& buffer);
std::vector<char> buffer;
int result = f(buffer);
Why not do the same way as malloc() - void* malloc( size_t numberOfBytes )? This way the number of bytes is the input parameter and the allocated block address is the return value.
UPD:
In comments you say that f() basically performs some action besides allocating memory. In this case using std::vector is a much better way.
void f( std::vector<char>& buffer )
{
buffer.clear();
// generate data and add it to the vector
}
The caller will just pass a vector:
std::vector<char> buffer;
f( buffer );
// buffer.size() now will return the number of elements to work with
Pass the pointer by reference...
int f(char* &buffer)
However you may wish to consider using reference counted pointers such as boost::shared_array to manage the memory if you are just starting this out.
e.g.
int f(boost::shared_array<char> &buffer)
Use RAII (Resource Acquisition Is Initialization) design pattern.
http://en.wikipedia.org/wiki/RAII
Understanding the meaning of the term and the concept - RAII (Resource Acquisition is Initialization)
Just return the pointer:
char * f() {
return new char[100];
}
Having said that, you probably do not need to mess with explicit allocation like this - instead of arrays of char, use std::string or std::vector<char> instead.
If all f() does with the buffer is to return it (and its length), let it just return the length, and have the caller new it. If f() also does something with the buffer, then do as polyglot suggested.
Of course, there may be a better design for the problem you want to solve, but for us to suggest anything would require that you provide more context.
The proper style is probably not to use a char* but a std::vector or a std::string depending on what you are using char* for.
About the problem of passing a parameter to be modified, instead of passing a pointer, pass a reference. In your case:
int f(char*&);
and if you follow the first advice:
int f(std::string&);
or
int f(std::vector<char>&);
Actually, the smart thing to do would be to put that pointer in a class. That way you have better control over its destruction, and the interface is much less confusing to the user.
class Cookie {
public:
Cookie () : pointer (new char[100]) {};
~Cookie () {
delete[] pointer;
}
private:
char * pointer;
// Prevent copying. Otherwise we have to make these "smart" to prevent
// destruction issues.
Cookie(const Cookie&);
Cookie& operator=(const Cookie&);
};
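With C++11 or later, the same class could be sketched with deleted copy operations instead of private declarations, plus move operations so that ownership can still be transferred. The size parameter and the data() and size() accessors are additions made for this example:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

class Cookie
{
public:
    explicit Cookie(std::size_t size)
        : pointer_(new char[size]()), size_(size) {}  // () zero-initializes
    ~Cookie() { delete[] pointer_; }

    // Copying would lead to a double delete, so it is disabled outright.
    Cookie(const Cookie&) = delete;
    Cookie& operator=(const Cookie&) = delete;

    // Moving transfers the buffer and leaves the source empty.
    Cookie(Cookie&& other) noexcept
        : pointer_(std::exchange(other.pointer_, nullptr)),
          size_(std::exchange(other.size_, 0)) {}

    Cookie& operator=(Cookie&& other) noexcept
    {
        if (this != &other) {
            delete[] pointer_;
            pointer_ = std::exchange(other.pointer_, nullptr);
            size_ = std::exchange(other.size_, 0);
        }
        return *this;
    }

    char* data() { return pointer_; }
    std::size_t size() const { return size_; }

private:
    char* pointer_;
    std::size_t size_;
};

static_assert(!std::is_copy_constructible<Cookie>::value,
              "Cookie must not be copyable");
```

In practice a std::unique_ptr<char[]> member would give the same behavior with less hand-written code.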
Provided that f does a new[] to match, it will work, but it's not very idiomatic.
Assuming that f fills in the data and is not just a malloc()-alike you would be better wrapping the allocation up as a std::vector<char>
void f(std::vector<char> &buffer)
{
// compute length
int len = ...
std::vector<char> data(len);
// fill in data
...
buffer.swap(data);
}
EDIT -- remove the spurious * from the signature
I guess you are trying to allocate a one dimensional array. If so, you don't need to pass a pointer to pointer.
int f(char* &buffer)
should be sufficient. And the usage scenario would be:
char* data;
int data_length = f(data);
// ...
delete[] data;