Splitting dynamically allocated array without linear time copy [closed] - c++

I am working with a list of arrays in C++, each held in an object, and I want to split some of them.
The arrays are allocated dynamically.
I want to do the split in constant time, since that is theoretically possible:
from
[ pointer, size1 ]
to
[ pointer, size2 ]; [ other array ]; [ pointer + size2, size1-size2 ]
(+ other data each time)
I tried using malloc and simply creating a new pointer offset by the size.
As could be expected, I got an error when the memory was freed automatically.
I also tried a realloc starting at the second address, but as "what is the difference between malloc and calloc" on this site already told me, that is not possible.
Is there a way to avoid copying the second part and still define the pointer correctly?
Paying a linear cost where I know constant time is possible is frustrating.
class TableA
{
public:
// (constructor)
void divide(int size); // the one I am trying to implement
// (others, getters, setters)
private:
Evenement* _el;
vector<bool>** _old; // said arrays
int _size;
};
nothing really complicated

Basically, the malloc library can't cope with mallocing a chunk of memory and then freeing it in slices.
You can do what you want, but you must only free the memory all at once right at the end using the original pointer that malloc handed you.
e.g.
int* p = malloc(9 * sizeof(int));
int* q = p + 3;
int* r = p + 6;
// Now we have three pointers to three arrays of three integers.
// Do stuff with p, q, r
free(p); // p is the only pointer it is valid to free.
By the way, if this is really about C++, there are probably standard C++ data structures you can use.
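In C++ the same idea can be sketched with one owning object and non-owning views into it. This is only a minimal sketch, not the asker's TableA: the names IntView, split_at and demo_split are hypothetical, and it assumes the owner outlives every view.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical non-owning view: a pointer into a buffer plus a length.
struct IntView {
    int* data;
    std::size_t size;
};

// Split a view at `pos` in constant time; no element is copied.
// Both halves alias the buffer the original view pointed into.
std::pair<IntView, IntView> split_at(IntView v, std::size_t pos) {
    assert(pos <= v.size);
    return { IntView{ v.data, pos }, IntView{ v.data + pos, v.size - pos } };
}

// Demo: one owner for the whole allocation, two views after the split.
// Returns the first element of the second half.
int demo_split() {
    std::unique_ptr<int[]> owner(new int[9]);  // released exactly once, by the owner
    for (int i = 0; i < 9; ++i) owner[i] = i;

    IntView whole{ owner.get(), 9 };
    std::pair<IntView, IntView> halves = split_at(whole, 3);
    assert(halves.first.size == 3 && halves.second.size == 6);
    return halves.second.data[0];              // element 3 of the original buffer
}
```

The views never free anything, which is what keeps the split cheap; the price is that nothing stops a view from outliving the owner.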

I don't think you can have only part of a dynamic array freed by keeping track of pointers and length. However, you can fake this by making a new class to manage the starting array, allocate it however you want and have it managed by a std::shared_ptr.
You simply return a Class containing a shared_ptr to memory, plain pointer to first element, and array size. When the current Array class goes out of scope, the shared_ptr gets decremented, and when no slices of the memory are used anymore, the memory gets freed.
You need to be careful with this though, as there may be multiple objects referencing the same memory, but there are ways around that (for example, marking the original invalid with a bool after splitting).
[Edit] Below is a very basic implementation of this idea. A split() operation can very easily be implemented in terms of two slice() operations. I'm not sure how you would apply this to your example above, as I'm not sure how you manage your vector<bool> **, but if you want to be able to split a vector<bool>, you could instantiate a SharedVector<bool>, or if you have an array of vector<bool>, you could make a SharedVector<vector<bool>> instead.
#ifndef SHARED_VECTOR_H
#define SHARED_VECTOR_H
#include <memory>
#include <cassert>
#include <cstddef>
template <typename T>
class SharedVector {
std::shared_ptr<T> _data;
T *_begin;
size_t _size;
// perhaps add size_t capacity if need for limited resizing arises.
public:
SharedVector(size_t const size)
: _data(std::shared_ptr<T>(new T[size], []( T *p ) { delete[] p; })), _begin(_data.get()), _size(size)
{}
// standard copy and move constructors work fine
// pass shared_ptr by reference to avoid unnecessary refcount changes
SharedVector(std::shared_ptr<T> const &data, T *begin, size_t size)
: _data(data), _begin(begin), _size(size)
{}
T& operator[] (const size_t nIndex) {
assert(nIndex < _size);
return _begin[nIndex];
}
T const & operator[] (const size_t nIndex) const {
assert(nIndex < _size);
return _begin[nIndex];
}
size_t size() const {
return _size;
}
// returns the half-open range [begin, end) of this vector
SharedVector<T> slice(size_t const begin, size_t const end) {
assert(begin <= end && end <= _size);
return SharedVector<T>(_data, _begin + begin, end - begin);
}
T *begin() {
return _begin;
}
T *end() {
return _begin + _size;
}
};
#endif
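The remark that split() can be built from two slice() operations can be sketched as follows. To keep the example short it uses a hypothetical, int-only stand-in (IntSlice, make_slice and split are made-up names), not the SharedVector template itself; treat it as one possible shape under those assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical minimal stand-in for the SharedVector idea: every slice
// holds a shared_ptr to the whole allocation plus its own window into it.
struct IntSlice {
    std::shared_ptr<int> owner;  // keeps the whole buffer alive
    int* begin;
    std::size_t size;

    // Half-open window [first, last) of this slice; constant time.
    IntSlice slice(std::size_t first, std::size_t last) const {
        assert(first <= last && last <= size);
        return IntSlice{ owner, begin + first, last - first };
    }
};

// Allocate n zero-initialized ints, released with delete[] when the
// last slice referring to them goes away.
IntSlice make_slice(std::size_t n) {
    std::shared_ptr<int> buf(new int[n](), [](int* p) { delete[] p; });
    int* first = buf.get();
    return IntSlice{ buf, first, n };
}

// split() expressed as two slice() calls; no element is copied.
std::pair<IntSlice, IntSlice> split(const IntSlice& s, std::size_t pos) {
    return { s.slice(0, pos), s.slice(pos, s.size) };
}
```

Each split costs two reference-count increments regardless of the array length, and the buffer is freed only after the last slice is destroyed.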

I think the idea of using malloc and creating a new pointer is good, but I think you need to free the memory manually.

Maybe you can use std::copy.
int* p = (int*)malloc(sizeof(int) * 5);
for (int i = 0; i < 5; i++)
p[i] = i;
for (int i = 0; i < 5; i++)
std::cout << p[i];
// tmp will hold 3 values
// p2 will hold 2 values
// so we want to copy first 3 into tmp
// and last 2 into p2
int tmp[3];
std::copy(p, p+3, tmp);
for (int i = 0; i < 3; i++)
std::cout << tmp[i];
int p2[2];
std::copy(p+3, p+5, p2);
for (int i = 0; i < 2; i++)
std::cout << p2[i];
// get rid of original when done
free(p);
Output:
0123401234

The question is relatively unclear, but going by your title,
splitting dynamically allocated array without linear time copy
I would suggest using a linked list instead of an array. You don't need to copy anything (and therefore don't need to free anything unless you want to delete one of the items); manipulating the pointers is sufficient to split a linked list.
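A minimal sketch of that pointer manipulation, assuming a hand-written singly linked list (Node, split_after and make_list are hypothetical names). Note the caveat: the split itself is constant time only if you already hold a pointer to the node where the list should be cut; walking to that node is still linear.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical singly linked node; each node owns its successor.
struct Node {
    int value;
    std::unique_ptr<Node> next;
};

// Detach everything after `node` and return it as a separate list.
// Only one pointer changes hands, so the cost does not depend on length.
std::unique_ptr<Node> split_after(Node& node) {
    return std::move(node.next);
}

// Build the list 0 -> 1 -> ... -> n-1 (helper for trying it out).
std::unique_ptr<Node> make_list(int n) {
    assert(n >= 0);
    std::unique_ptr<Node> head;
    for (int i = n - 1; i >= 0; --i) {
        std::unique_ptr<Node> node(new Node{ i, std::move(head) });
        head = std::move(node);
    }
    return head;
}
```

The trade-off against an array is the usual one: no random access, and worse memory locality.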

Related

C++ : Using a int pointer to point to a vector created inside a function

I am still perfecting the art of posting here so bear with me, I will edit and fix anything suggested!
I have a homework that requires me to create functions that manipulate vectors. The "catch" is that all the data passed to the function is passed by reference to this struct:
struct Vector { // this struct must stay as is
int sze = 0; // "size" with the i taken out for compatibility
int capacity = 0;
int * data = nullptr ;
}a,b,c;
i.e.
void construct_vector ( Vector& v, int size= 0, int initVal= 0);
The problem I am having is in the function construct_vector(): I have to, you guessed it, construct a vector and use int* data to point to the vector (on the heap? I am not positive about that last part). I just know I have to use the int pointer to point to the vector created within the construct function, and I cannot for the life of me figure out how to do that.
An example of what I am trying:
void construct_vector ( Vector &v, int size, int initVal){
std::vector<int> t(size,initVal);
*v.data = &t ; // ERROR: Assigning to 'int' from incompatible type 'std::vector<int> *'
v.capacity = size; //
v.sze = size;
for (int i=0; i < t.size(); i++){
/* I originally tried to implement a
dynamic int pointer here but I cannot change int* data
to int*data[sze] within the struct*/
}
}
The reason int * data must point to the vector is because the data is passed to the subsequent functions by reference to struct member v:
void destroy_vector ( Vector & v );
void copy_data ( Vector & v );
Edit: My problem was that I misunderstood the objective of my assignment but I think the answers I received can really help people understand dynamic memory and how it should be used within functions. So I am going to leave everything as is!
You have two problems here:
std::vector<int> t(size,initVal);
*v.data = &t ; // ERROR: Assigning to 'int' from
First, *v.data and &t are different types, one is int, the other is a pointer to a vector of ints.
You can get it to compile with (but you SHOULD NOT; see the second problem)
v.data = t.data();
The other problem is the vector is local to the function. As soon as the function returns, your pointer will dangle.
So the right solution for your problem is using a dynamic array:
v.data = new int[size];
Don't forget to delete[] it when you are done using it. The struct has to stay as is, so it cannot get a destructor; do it in destroy_vector instead:
delete [] v.data;
Instead of
std::vector<int> t(size,initVal);
*v.data = &t ;
You need
v.data = new int[size];
To fill up the object with the input value, use
for ( int i = 0; i < size; ++i )
{
v.data[i] = initVal;
}
You can use std::fill to make your code a bit simpler.
std::fill(v.data, v.data+size, initVal);
Make sure to follow The Rule of Three when you manage dynamic memory yourself.
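Putting the pieces from both answers together, the pair of functions could look roughly like this. It is a sketch under assumptions: the struct is copied from the question without the global instances a, b, c, and the body of destroy_vector is a guess at what the assignment expects.

```cpp
#include <algorithm>
#include <cassert>

struct Vector {          // the struct from the question, fields unchanged
    int sze = 0;
    int capacity = 0;
    int* data = nullptr;
};

// Allocate `size` ints on the heap and fill them with `initVal`.
// Assumes `v` does not already own a buffer (otherwise it would leak).
void construct_vector(Vector& v, int size = 0, int initVal = 0) {
    assert(size >= 0);
    v.data = new int[size];
    std::fill(v.data, v.data + size, initVal);
    v.sze = size;
    v.capacity = size;
}

// Release the buffer and reset the struct to its empty state.
void destroy_vector(Vector& v) {
    delete[] v.data;
    v.data = nullptr;
    v.sze = 0;
    v.capacity = 0;
}
```

Because the struct has no destructor, every construct_vector call needs a matching destroy_vector call, and copying a Vector copies only the pointer, not the buffer.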

Is it possible that a pointer gets owned by a vector without any data copy?

If I have a C type raw pointer, is it possible to create a std::vector from the same type that owns the pointer's data without any data copy (only moving)? What motivates me for asking this question is the existence of data() member function for std::vector which means vector's elements are residing somewhere in the memory consecutively.
Edit: I have to add that the hope I had was also intensified by the existence of functions like std::make_shared.
I don't think that this is directly possible, although you're not the first one to miss this feature. It is even more painful with std::string which doesn't have a non-const data member. Hopefully, this will change in C++17.
If you are allocating the buffer yourself, there is a solution, though. Just use a std::vector up-front. For example, assume you have the following C-style function,
extern void
fill_in_the_numbers(double * buffer, std::size_t count);
then you can do the following.
std::vector<double>
get_the_numbers_1st(const std::size_t n)
{
auto numbers = std::vector<double> (n);
fill_in_the_numbers(numbers.data(), numbers.size());
return numbers;
}
Alternatively, if you're not so lucky and your C-style function insists on allocating the memory itself,
extern double *
allocate_the_buffer_and_fill_in_the_numbers(std::size_t n);
you could resort to a std::unique_ptr, which is sadly inferior.
std::unique_ptr<double[], void (*)(void *)>
get_the_numbers_2nd(const std::size_t n)
{
return {
allocate_the_buffer_and_fill_in_the_numbers(n),
&std::free
};
}
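Here is a self-contained variant of the unique_ptr approach that can be compiled on its own. Everything in it is an assumption made for illustration: make_numbers is a hypothetical stand-in for the C function (it allocates with malloc), and FreeDeleter, NumberBuffer and get_numbers are made-up names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for a C function that allocates with malloc
// and expects the caller to release the buffer with free.
double* make_numbers(std::size_t n) {
    double* p = static_cast<double*>(std::malloc(n * sizeof(double)));
    if (p == nullptr) return nullptr;
    for (std::size_t i = 0; i < n; ++i) p[i] = 0.5 * static_cast<double>(i);
    return p;
}

// Stateless deleter that hands the buffer back to free().
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};

using NumberBuffer = std::unique_ptr<double[], FreeDeleter>;

// Take ownership of the malloc'ed buffer without copying any element.
NumberBuffer get_numbers(std::size_t n) {
    return NumberBuffer(make_numbers(n));
}
```

A functor deleter is used here instead of a function pointer so the smart pointer stays the size of a raw pointer; the function-pointer form in the answer works as well.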
No, std::vector is not designed to be able to assume/utilize a pre-existing array for its internal storage.
Yes, provided that you've created and populated the vector before getting the pointers, and that you will not:
erase any element
add new elements when vec.size() == vec.capacity(), since doing so reallocates the storage and changes the address of the elements
Example
#include <iostream>
#include <vector>
void fill_my_vector(std::vector<double>& vec){
for(int i=0; i<300; i++){
vec.push_back(i);
}
}
void do_something(double* d, int size)
{ /* ..... */ }
int main(){
std::vector<double> vec;
fill_my_vector(vec);
//You hereby promise to follow the contract conditions, then you are safe doing this
double* ptr;
int ptr_len = vec.size();
ptr = &vec[0];
//call do_something
do_something(ptr, ptr_len);
//ptr will be alive until this function scope exits
}
EDIT
If you mean managing the data from an already created array, you can't... vector manages its own array... It cannot take ownership of an array that wasn't created by its class (vector).
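The capacity condition can be illustrated with a small sketch. The function name fill_without_reallocation is hypothetical; the only library behaviour it relies on is that push_back does not reallocate while size() is below capacity().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill `vec` with exactly `count` elements after reserving room for them,
// and report whether element 0 stayed at the same address throughout.
bool fill_without_reallocation(std::vector<double>& vec, std::size_t count) {
    assert(count > 0);
    vec.clear();
    vec.reserve(count);                 // capacity() >= count from here on
    vec.push_back(0.0);
    const double* first = &vec[0];      // pointer taken early on purpose
    for (std::size_t i = 1; i < count; ++i) {
        assert(vec.size() < vec.capacity());  // room left: no reallocation
        vec.push_back(static_cast<double>(i));
    }
    return first == vec.data();
}
```

Without the reserve call, any of the push_back calls could reallocate, and a pointer taken earlier would dangle.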

find value in dynamic created varchar array C++

So I'm banging my head against a wall at the moment. I want to create a dynamic array which can contain numbers or text, but most likely numbers.
At the moment I'm using:
string test[];
for that purpose.
Okay, now the thing is: I want to fill the array dynamically, adding an element only if it is not already in the array. I've tried googling for a solution, but most results suggested vector, which I thought wouldn't work in this case because the array can be any size.
Again:
Check if the element is in the array, which may or may not be empty.
If the element is not in the array, put it in.
Has anybody got a solution for this, please? I would be very thankful!
Thank you very much for all the comments and answers. I just noticed one major difference between what I found on the net and what you posted: if I use test[] it won't work, but if I use test{} everything is fine. Can somebody explain why that is?
So if I understood correctly, you're trying to get a dynamic array which changes its size depending on how many elements are in it? Well, then use vector. It does exactly this! It's pretty fast, and it is the C++ standard container for tasks like that.
Using a vector also allows you to use std::find. So to fulfill your task, do something like this:
std::vector<std::string> test{"Hello"};
if (std::find(test.begin(), test.end(), "World!") == test.end())
test.push_back("World!");
Or even better, use std::set, which only allows unique elements:
std::set<std::string> test{"Hello"};
test.insert("World"); // works
test.insert("Hello"); // won't work
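Both variants can be wrapped in a small helper that reports whether the element was actually added. This is a sketch; add_unique is a hypothetical name, not a standard function.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Append `value` only if it is not already present; returns true if added.
// Linear search, but the insertion order is kept.
bool add_unique(std::vector<std::string>& items, const std::string& value) {
    if (std::find(items.begin(), items.end(), value) != items.end())
        return false;
    items.push_back(value);
    return true;
}

// Same contract for std::set: insert() already refuses duplicates and
// says so in the bool of the pair it returns. Elements are kept sorted.
bool add_unique(std::set<std::string>& items, const std::string& value) {
    return items.insert(value).second;
}
```

The vector version is the one to pick if the order of insertion matters; the set version scales better when there are many elements.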
If you really want to use an array, then I would recommend writing a template class that manages the array and allocates it on the heap.
template<typename T>
class Array
{
private:
T *data;
unsigned int size;
unsigned int index;
public:
Array()
{
size = 100;
index = 0;
data = new T[size];
}
~Array()
{
delete[] data;
}
void push_back(const T& val)
{
if (index == size)
{ // reallocate data if there is no memory left to store val
std::vector<T> tmp(data, data + size);
delete[] data;
size += 100;
data = new T[size];
for (size_t i = 0; i < tmp.size(); i++)
data[i] = tmp[i];
data[index++] = val;
}
else
{
data[index++] = val;
}
}
T& operator[](const unsigned int& i)
{
if (i >= size)
throw std::runtime_error("i out of bounds");
return data[i];
}
};
You then need to search in the array for an existing value, and if you couldn't find it use push_back to push the value into the array. Or use the subscript operator [] like you're used to.

Is new int[][] a valid thing to do in C++?

I have come across some code which allocates a 2d array with following approach:
auto a = new int[10][10];
Is this a valid thing to do in C++? I have searched through several C++ reference books, and none of them mentions such an approach.
Normally I would have done the allocation manually as follows:
int **a = new int *[10];
for (int i = 0; i < 10; i++) {
a[i] = new int[10];
}
If the first approach is valid, then which one is preferred?
The first example:
auto a = new int[10][10];
That allocates a multidimensional array or array of arrays as a contiguous block of memory.
The second example:
int** a = new int*[10];
for (int i = 0; i < 10; i++) {
a[i] = new int[10];
}
That is not a true multidimensional array. It is, in fact, an array of pointers and requires two indirections to access each element.
The expression new int[10][10] means to allocate an array of ten elements of type int[10], so yes, it is a valid thing to do.
The type of the pointer returned by new is int(*)[10]; one could declare a variable of such a type via int (*ptr)[10];.
For the sake of legibility, one probably shouldn't use that syntax, and should prefer to use auto as in your example, or use a typedef to simplify as in
using int10 = int[10]; // typedef int int10[10];
int10 *ptr;
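The type and the contiguity claim can both be checked in code. The sketch below uses hypothetical names (Row, make_grid, rows_are_adjacent) and fixes the inner dimension at 10, since only the first dimension of a new-expression may be a runtime value.

```cpp
#include <cstddef>
#include <type_traits>

using Row = int[10];

// The new-expression yields a pointer to the first row, i.e. int (*)[10].
static_assert(std::is_same<decltype(new int[10][10]), int (*)[10]>::value,
              "new int[10][10] has type int (*)[10]");

// Allocate rows x 10 zero-initialized ints as one block.
// The caller must release the result with delete[].
Row* make_grid(std::size_t rows) {
    return new int[rows][10]();
}

// True if every row starts exactly 10 ints after the previous one,
// that is, the rows sit back to back with no gaps.
bool rows_are_adjacent(Row* grid, std::size_t rows) {
    for (std::size_t r = 0; r + 1 < rows; ++r)
        if (&grid[r + 1][0] != &grid[r][0] + 10) return false;
    return true;
}
```

With the int** version from the question, the same check would generally fail, because each row comes from its own allocation.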
In this case, for small arrays, it is more efficient to allocate them on the stack. Perhaps even using a convenience wrapper such as std::array<std::array<int, 10>, 10>. However, in general, it is valid to do something like the following:
auto arr = new int[a][b];
Where a is a std::size_t and b is a constexpr std::size_t. This results in more efficient allocation, as there should be only one call to operator new[] with sizeof(int) * a * b as the argument, instead of a calls to operator new[] with sizeof(int) * b as the argument. As stated by Galik in his answer, there is also the potential for faster access times, due to better cache locality (the entire array is contiguous in memory).
However, the only reason I can imagine one using something like this would be with a compile-time-sized matrix/tensor, where all of the dimensions are known at compile time, but it allocates on the heap if it exceeds the stack size.
In general, it is probably best to write your own RAII wrapper class as follows (you would also need to add various accessors for height/width, along with implementing copy/move constructors and assignment, but the general idea is here):
template <typename T>
class Matrix {
public:
Matrix( std::size_t height, std::size_t width ) : m_height( height ), m_width( width )
{
m_data = new T[height * width]();
}
~Matrix() { delete[] m_data; m_data = nullptr; }
public:
T& operator()( std::size_t x, std::size_t y )
{
// Add bounds-checking here depending on your use-case
// by throwing a std::out_of_range if x/y are outside
// of the valid domain.
return m_data[x + y * m_width];
}
const T& operator()( std::size_t x, std::size_t y ) const
{
return m_data[x + y * m_width];
}
private:
std::size_t m_height;
std::size_t m_width;
T* m_data;
};
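If writing the copy/move members by hand is not appealing, one alternative is to let a std::vector own the storage. This swaps the raw T* of the class above for a different technique, so take it as a sketch rather than a drop-in replacement; VecMatrix is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class VecMatrix {
public:
    VecMatrix(std::size_t height, std::size_t width)
        : m_height(height), m_width(width), m_data(height * width) {}

    // x is the column, y is the row, matching the indexing used above.
    T& operator()(std::size_t x, std::size_t y) {
        assert(x < m_width && y < m_height);
        return m_data[x + y * m_width];
    }
    const T& operator()(std::size_t x, std::size_t y) const {
        assert(x < m_width && y < m_height);
        return m_data[x + y * m_width];
    }

    std::size_t height() const { return m_height; }
    std::size_t width() const { return m_width; }

private:
    std::size_t m_height;
    std::size_t m_width;
    std::vector<T> m_data;  // owns the storage; copy, move and cleanup are generated
};
```

The storage is still one contiguous block, and the compiler-generated copy constructor makes a deep copy.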

C++: joining array together - is it possible with pointers WITHOUT copying?

As in the title: is it possible to join a number of arrays together without copying, using only pointers? I'm spending a significant amount of computation time copying smaller arrays into larger ones.
Note that I can't use vectors, since umfpack (a matrix solving library) does not allow me to, or I don't know how.
As an example:
int n = 5;
// dynamically allocate array with use of pointer
int *a = new int[n];
// define array pointed by *a as [1 2 3 4 5]
for(int i=0;i<n;i++) {
a[i]=i+1;
}
// pointer to array of pointers ??? --> this does not work
int *large_a = new int[4];
for(int i=0;i<4;i++) {
large_a[i] = a;
}
Note: there is already a simple solution I know, which is to iteratively copy them into a new large array, but it would be nice to know whether I can avoid copying repeated blocks that are stored for the duration of the program. I'm on a learning curve at the moment.
Thanks for reading, everyone.
as in the title is it possible to join a number of arrays together without copying and only using pointers?
In short, no.
A pointer is simply an address into memory - like a street address. You can't move two houses next to each other, just by copying their addresses around. Nor can you move two houses together by changing their addresses. Changing the address doesn't move the house, it points to a new house.
note I can't used vectors since umfpack (some matrix solving library) does not allow me to or i don't know how.
In most cases, you can pass the address of the first element of a std::vector when an array is expected.
std::vector<int> a = {0, 1, 2}; // C++0x initialization
void c_fn_call(int*);
c_fn_call(&a[0]);
This works because vector guarantees that the storage for its contents is always contiguous.
However, when you insert or erase an element from a vector, it invalidates pointers and iterators that came from it. Any pointers you might have gotten from taking an element's address no longer point to the vector, if the storage that it has allocated must change size.
No. The memory of two arrays is not necessarily contiguous, so there is no way to join them without copying. And array elements must be in contiguous memory, or pointer access would not be possible.
I'd probably use memcpy/memmove, which is still going to be copying the memory around, but at least it's been optimized and tested by your compiler vendor.
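For comparison, here is what the copying join looks like when the destination is a std::vector. It uses the vector's range insert rather than memcpy directly (for plain ints, implementations typically reduce this to a block copy anyway); join_copy is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Concatenate two C arrays into one contiguous vector. Every element is
// copied once; the result can be handed to a C API via result.data().
std::vector<int> join_copy(const int* a, std::size_t na,
                           const int* b, std::size_t nb) {
    std::vector<int> result;
    result.reserve(na + nb);                 // one allocation, no regrowth
    result.insert(result.end(), a, a + na);
    result.insert(result.end(), b, b + nb);
    return result;
}
```

If the same blocks are joined repeatedly, it may be cheaper to allocate the large array once and have the smaller arrays written into it from the start.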
Of course, the "real" C++ way of doing it would be to use standard containers and iterators. If you've got memory scattered all over the place like this, it sounds like a better idea to me to use a linked list, unless you are going to do a lot of random access operations.
Also, keep in mind that if you use pointers and dynamically allocated arrays instead of standard containers, it's a lot easier to cause memory leaks and other problems. I know sometimes you don't have a choice, but just saying.
If you want to join arrays without copying the elements and at the same time you want to access the elements using subscript operator i.e [], then that isn't possible without writing a class which encapsulates all such functionalities.
I wrote the following class with minimal consideration, but it demonstrates the basic idea, and you can extend it if you want functionality it does not currently have. It should also have some error handling, which I left out to keep it short, but I believe you will understand the code and handle the error cases accordingly.
template<typename T>
class joinable_array
{
std::vector<T*> m_data;
std::vector<size_t> m_size;
size_t m_allsize;
public:
joinable_array() : m_allsize() { }
joinable_array(T *a, size_t len) : m_allsize() { join(a,len);}
void join(T *a, size_t len)
{
m_data.push_back(a);
m_size.push_back(len);
m_allsize += len;
}
T & operator[](size_t i)
{
index ix = get_index(i);
return m_data[ix.v][ix.i];
}
const T & operator[](size_t i) const
{
index ix = get_index(i);
return m_data[ix.v][ix.i];
}
size_t size() const { return m_allsize; }
private:
struct index
{
size_t v;
size_t i;
};
index get_index(size_t i) const
{
index ix = { 0, i};
for(auto it = m_size.begin(); it != m_size.end(); it++)
{
if ( ix.i >= *it ) { ix.i -= *it; ix.v++; }
else break;
}
return ix;
}
};
And here is one test code:
#define alen(a) (sizeof(a)/sizeof(*(a)))
int main() {
int a[] = {1,2,3,4,5,6};
int b[] = {11,12,13,14,15,16,17,18};
joinable_array<int> arr(a,alen(a));
arr.join(b, alen(b));
arr.join(a, alen(a)); //join it again!
for(size_t i = 0 ; i < arr.size() ; i++ )
std::cout << arr[i] << " ";
}
Output:
1 2 3 4 5 6 11 12 13 14 15 16 17 18 1 2 3 4 5 6
Online demo : http://ideone.com/VRSJI
Here's how to do it properly:
template<class T, class K1, class K2>
class JoinArray {
public:
JoinArray(K1 &k1, K2 &k2) : k1(k1), k2(k2) { }
T operator[](int i) const { int s = k1.size(); if (i < s) return k1[i]; else return k2[i - s]; }
int size() const { return k1.size() + k2.size(); }
private:
K1 &k1;
K2 &k2;
};
template<class T, class K1, class K2>
JoinArray<T,K1,K2> join(K1 &k1, K2 &k2) { return JoinArray<T,K1,K2>(k1,k2); }
template<class T>
class NativeArray
{
public:
NativeArray(T *ptr, int size) : ptr(ptr), m_size(size) { }
T operator[](int i) const { return ptr[i]; }
int size() const { return m_size; }
private:
T *ptr;
int m_size; // renamed so it does not clash with size()
};
int main() {
int array[2] = { 0,1 };
int array2[2] = { 2,3 };
NativeArray<int> na(array, 2);
NativeArray<int> na2(array2, 2);
auto joinarray = join<int>(na,na2); // T cannot be deduced, so it is given explicitly
}
A variable that is a pointer to a pointer must be declared as such.
This is done by placing an additional asterisk in front of its name.
Hence, int **large_a = new int*[4]; Your large_a is assigned pointers, while you've declared it as a pointer to int. It should be declared as a pointer to a pointer; even just int **large_a; would be enough.
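With that declaration, the loop from the question works. A minimal sketch, with make_pointer_table as a hypothetical helper name:

```cpp
#include <cstddef>

// Build an array of `count` pointers that all refer to the same block.
// No element of `block` is copied. The caller must delete[] the returned
// pointer array (only the table, not the block it refers to).
int** make_pointer_table(int* block, std::size_t count) {
    int** table = new int*[count];
    for (std::size_t i = 0; i < count; ++i) table[i] = block;
    return table;
}
```

Keep in mind that this is a table of pointers, not one contiguous array of ints, so it cannot be passed to a library that expects a flat int*; table[i][j] works, but the rows are not laid out one after another.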