Arrays and pointers in a template - c++

I am attempting to write a template/class that has a few functions, but I'm running into what seems like a rather newbie problem. I have a simple insert function and a display values function, however whenever I attempt to display the value, I always receive what looks like a memory address(but I have no idea), but I would like to receive the value stored (in this particular example, the int 2). I'm not sure how to dereference that to a value, or if I'm just completely messing up. I know that vectors are a better alternative, however I need to use an array in this implementation - and honestly I would like to gain a more thorough understanding of the code and what's going on. Any help as to how to accomplish this task would be greatly appreciated.
Example Output (running the program in the same way every time):
003358C0
001A58C0
007158C0
Code:
#include <iostream>
using namespace std;
template <typename Comparable>
class Collection
{
public: Collection() {
currentSize = 0;
count = 0;
}
Comparable * values;
int currentSize; // internal counter for the number of elements stored
void insert(Comparable value) {
currentSize++;
// temparray below is used as a way to increase the size of the
// values array each time the insert function is called
Comparable * temparray = new Comparable[currentSize];
memcpy(temparray,values,sizeof values);
// Not sure if the commented section below is necessary,
// but either way it doesn't run the way I intended
temparray[currentSize/* * (sizeof Comparable) */] = value;
values = temparray;
}
void displayValues() {
for (int i = 0; i < currentSize; i++) {
cout << values[i] << endl;
}
}
};
int main()
{
Collection<int> test;
int inserter = 2;
test.insert(inserter);
test.displayValues();
cin.get();
return 0;
}

Well, if you insist, you can write and debug your own limited version of std::vector.
First, don't memcpy from an uninitialized pointer. Set values to new Comparable[0] in the constructor.
Second, memcpy the right number of bytes: (currentSize-1)*sizeof(Comparable).
Third, don't memcpy at all. That assumes that Comparable types can all be copied byte-by-byte, which is a severe limitation in C++. Instead:
EDIT: changed uninitialized_copy to copy:
std::copy(values, values + currentSize - 1, temparray);
Fourth, delete the old array when it's no longer in use:
delete [] values;
Fifth, unless the code is going to make very few insertions, expand the array by more than one. std::vector typically grows its capacity by a constant factor (1.5 in MSVC's implementation, 2 in libstdc++ and libc++).
Sixth, don't increment currentSize until the size changes. That will change all those currentSize-1s into currentSize, which is much less annoying. <g>
Seventh, an array of size N has indices from 0 to N-1, so the top element of the new array is at currentSize - 1, not currentSize.
Eighth, did I mention, you really should use std::vector.
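Putting the points above together, a minimal sketch of the class might look like the following. It assumes Comparable is default-constructible and copy-assignable. The capacity member, the size() and get() accessors, the growth factor of 2 and the deleted copy operations are additions for illustration and are not part of the original code.

```cpp
#include <algorithm>  // std::copy
#include <cstddef>    // std::size_t
#include <iostream>

template <typename Comparable>
class Collection {
public:
    // Start with a valid (empty) allocation so values is never uninitialized.
    Collection() : values(new Comparable[0]), currentSize(0), capacity(0) {}
    ~Collection() { delete[] values; }

    // Sketch only: copying is disabled rather than implemented.
    Collection(const Collection&) = delete;
    Collection& operator=(const Collection&) = delete;

    void insert(const Comparable& value) {
        if (currentSize == capacity) {
            // Grow geometrically; the factor 2 is an arbitrary choice here.
            std::size_t newCapacity = capacity == 0 ? 1 : capacity * 2;
            Comparable* temparray = new Comparable[newCapacity];
            std::copy(values, values + currentSize, temparray);  // element-wise copy, not memcpy
            delete[] values;                                     // release the old array
            values = temparray;
            capacity = newCapacity;
        }
        values[currentSize] = value;  // first free slot; incremented only afterwards
        ++currentSize;
    }

    std::size_t size() const { return currentSize; }
    const Comparable& get(std::size_t i) const { return values[i]; }  // no bounds check

    void displayValues() const {
        for (std::size_t i = 0; i < currentSize; ++i) {
            std::cout << values[i] << '\n';
        }
    }

private:
    Comparable* values;
    std::size_t currentSize;  // number of elements stored
    std::size_t capacity;     // number of elements allocated
};
```

With this version, inserting the int 2 and calling displayValues() prints the stored value rather than whatever happened to be in unallocated memory.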

This line is wrong:
memcpy(temparray,values,sizeof values);
The first time this line is run, the values pointer is uninitialized, so it will cause undefined behavior. Additionally, using sizeof values is wrong since that will always give the size of a pointer.
Another issue:
temparray[currentSize] = value;
This will also cause undefined behavior because you have only allocated currentSize items in temparray, so you can only access indices 0 to currentSize-1.

There is also an error in your array access.
temparray[currentSize/* * (sizeof Comparable) */] = value;
Remember that arrays start at index zero. So for an array of length 1, you would set temparray[0] = value. Since you increment currentSize at the top of the insert function, you will need to do this instead:
temparray[currentSize-1] = value;

Related

Dynamic array of Linear search function implementation

Need to implement a function
int* linearSearch(int* array, int num);
It takes a fixed-size array of integers and a number, and returns an array holding the indices at which the searched number occurs.
For example array={3,4,5,3,6,8,7,8,3,5} & num=5 will return occArray={2,9}.
I've implemented it in c++ with a main function to check the output
#include <iostream>
using namespace std;
int* linearSearch(int* array, int num);
int main()
{
int array[] = {3,4,5,3,6,8,7,8,3,5}, num=5;
int* occArray = linearSearch(array, num);
int i = sizeof(occArray)/sizeof(occArray[0]);
while (i>0) {
std::cout<<occArray[i]<<" ";
i--;
}
}
int* linearSearch(int* array, int num)
{
int *occArray= new int[];
for (int i = 0,j = 0; i < sizeof(array) / sizeof(array[0]); i++) {
if (array[i] == num) {
occArray[j] = i;
j++;
}
}
return occArray;
}
I think the logic is fine but I have a syntax problem with creating a dynamic cell for occArray
Also a neater implementation with std::vector will be welcomed
Thank You
First of all, I join in the std::vector recommendation from the question's comments (pass it as a const reference to avoid an unnecessary copy!); that solves all of your issues:
std::vector<size_t> linearSearch(std::vector<int> const& array, int value)
{
std::vector<size_t> occurrences;
// to prevent unnecessary re-allocations, which are expensive,
// one should reserve sufficient space in advance
occurrences.reserve(array.size());
// if you expect only few occurrences you might reserve a bit less,
// maybe half or quarter of array's size, then in general you use
// less memory but in few cases you still re-allocate
for(auto i = array.begin(); i != array.end(); ++i)
{
if(*i == value)
{
// as using iterators, need to calculate the distance:
occurrences.push_back(i - array.begin());
}
}
return occurrences;
}
Alternatively you could iterate with a size_t i variable from 0 to array.size(), compare array[i] == value and push_back(i); – that's equivalent, so select whichever you like better...
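The index-based alternative mentioned above could be sketched as follows. The name linearSearchByIndex is only there to distinguish it from the iterator version, and the reserve call is left out for brevity.

```cpp
#include <cstddef>
#include <vector>

std::vector<std::size_t> linearSearchByIndex(std::vector<int> const& array, int value)
{
    std::vector<std::size_t> occurrences;
    for (std::size_t i = 0; i < array.size(); ++i)
    {
        if (array[i] == value)
        {
            // i already is the position, no distance calculation needed
            occurrences.push_back(i);
        }
    }
    return occurrences;
}
```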
If you cannot use std::vector for whatever reason you need to be aware of a few issues:
You indeed can get the length of an array by sizeof(array)/sizeof(*array) – but that only works as long as you have direct access to that array. In most other cases (including passing them to functions) arrays decay to pointers and these do not retain any size information, thus this trick won't work any more, you'd always get sizeOfPointer/sizeOfUnderlyingType, on typical modern 64-bit hardware that would be 8/4 = 2 for int* – no matter how long the array originally was.
So you need to pass the size of the array in an additional parameter, e.g.:
size_t* linearSearch
(
int* array,
size_t number, // of elements in the array
int value // to search for
);
Similarly you need to return the number of occurrences of the searched value by some means. There are several options:
1. Turn number into a reference (size_t& number); then you can modify it inside the function and the change becomes visible outside. Usage of the function gets a bit inconvenient, though, as you need to explicitly define a variable for it:
size_t num = sizeof(array)/sizeof(*array);
auto occurrences = linearSearch(array, num, 7);
2. Append a sentinel value to the array, which might be the array size or probably better the maximum value for size_t – with all the disadvantages you have with C strings as well (mainly having to iterate over the result array to detect the number of occurrences).
3. Prepend the number of occurrences to the array – somewhat ugly as well, as you mix different kinds of information into one and the same array.
4. Return the result pointer and size in a custom struct of yours or in e.g. a std::pair<size_t, size_t*>. You could even use that in a structured binding expression when calling the function:
auto [num, occurences] = linearSearch(array, sizeof(array)/sizeof(*array), 7);
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// here the trick yet works provided the array is declared above the call, too,
// as is in your example
Option 4 would be my personal recommendation out of these.
Side note: I switched to size_t for return values as negative indices into an array are meaningless anyway (unless you intend to use these as sentinel values, e. g. -1 for end of occurrences in option 2).
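For the option that returns the size together with the pointer, a rough sketch without std::vector might look like this. It assumes the caller takes ownership of the returned array and releases it with delete[]; the input is taken as int const* here (the prototype above uses int*), and allocating room for the worst case is a simplification.

```cpp
#include <cstddef>
#include <utility>

// Returns {number of occurrences, array of indices}.
// The caller owns the returned array and must delete[] it.
std::pair<std::size_t, std::size_t*> linearSearch(int const* array, std::size_t number, int value)
{
    // Worst case: every element matches, so reserve room for all of them.
    std::size_t* occurrences = new std::size_t[number];
    std::size_t count = 0;
    for (std::size_t i = 0; i < number; ++i)
    {
        if (array[i] == value)
        {
            occurrences[count++] = i;
        }
    }
    return {count, occurrences};
}
```

The first member of the pair tells the caller how many entries of the index array are valid, so no sentinel is needed.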

How to create an auxiliary data structure to keep track of heap indices in a minheap for the decrease_key operation in c++

I think this is probably a trivial problem to solve but I have been struggling with this for past few days.
I have the following vector: v = [7,3,16,4,2,1]. I was able to implement with some help from google simple minheap algorithm to get the smallest element in each iteration. After extraction of the minimum element, I need to decrease the values of some of the elements and then bubble them up.
The issue I am having is that I want find the elements whose value has to be reduced in the heap in constant time, then reduce that value and then bubble it up.
After the heapify operation, the heap_vector v_h looks like this: v_h = [1,2,7,4,3,16]. When I remove the min element 1, then the heap vector becomes [2,3,7,4,16]. But before we do the swap and bubble up, say I want to change the values of 7 to 4, 16 to 4 and 4 to 3.5. But I am not sure where they will be in the heap. The indices of the elements that have to be decreased will be given with respect to the original vector v. I figured out that I need to have an auxiliary data structure that can keep track of the heap indices in relation to the original order of the elements (the heap index vector should look like h_iv = [2,4,5,3,1,0] after all the elements have been inserted into the minheap). And whenever an element is deleted from the minheap, its heap_index entry should be -1. I created a vector to try to update the heap indices whenever there is a change but I am unable to do it.
I am pasting my work here and also at https://onlinegdb.com/SJR4LqQO4
Some of the work I had tried is commented out. I am unable to map the heap indices when there is a swap in the bubble up or bubble down operations. I will be very grateful to anyone who can lead me in a direction to solve my problem. Please also let me know if I have to rethink some of my logic.
The .hpp file
#ifndef minheap_hpp
#define minheap_hpp
#include <stdio.h>
// #include "helper.h"
#include <vector>
class minheap
{
public:
std::vector<int> vect;
std::vector<int> heap_index;
void bubble_down(int index);
void bubble_up(int index);
void Heapify();
public:
minheap(const std::vector<int>& input_vector);
minheap();
void insert(int value);
int get_min();
void delete_min();
void print_heap_vector();
};
#endif /* minheap_hpp */
The .cpp file
#include "minheap.hpp"
minheap::minheap(const std::vector<int>& input_vector) : vect(input_vector)
{
Heapify();
}
void minheap::Heapify()
{
int length = static_cast<int>(vect.size());
// auto start = 0;
// for (auto i = 0; i < vect.size(); i++){
// heap_index.push_back(start);
// start++;
// }
for(int i=length/2-1; i>=0; --i)
{
bubble_down(i);
}
}
void minheap::bubble_down(int index)
{
int length = static_cast<int>(vect.size());
int leftChildIndex = 2*index + 1;
int rightChildIndex = 2*index + 2;
if(leftChildIndex >= length){
return;
}
int minIndex = index;
if(vect[index] > vect[leftChildIndex])
{
minIndex = leftChildIndex;
}
if((rightChildIndex < length) && (vect[minIndex] > vect[rightChildIndex]))
{
minIndex = rightChildIndex;
}
if(minIndex != index)
{
std::swap(vect[index], vect[minIndex]);
// std::cout << "swap " << index << " - " << minIndex << "\n";
// auto a = heap_index[heap_index[index]];
// auto b = heap_index[heap_index[minIndex]];
// heap_index[a] = b;
// heap_index[b] = a;
// print_vector(heap_index);
bubble_down(minIndex);
}
}
void minheap::bubble_up(int index)
{
if(index == 0)
return;
int par_index = (index-1)/2;
if(vect[par_index] > vect[index])
{
std::swap(vect[index], vect[par_index]);
bubble_up(par_index);
}
}
void minheap::insert(int value)
{
int length = static_cast<int>(vect.size());
vect.push_back(value);
bubble_up(length);
}
int minheap::get_min()
{
return vect[0];
}
void minheap::delete_min()
{
int length = static_cast<int>(vect.size());
if(length == 0)
{
return;
}
vect[0] = vect[length-1];
vect.pop_back();
bubble_down(0);
}
void minheap::print_heap_vector(){
// print_vector(vect);
}
and the main file
#include <iostream>
#include "minheap.hpp"
int main(int argc, const char * argv[]) {
std::vector<int> vec {7, 3, 16, 4, 2, 1};
minheap mh(vec);
// mh.print_heap_vector();
for(int i=0; i<3; ++i)
{
auto a = mh.get_min();
mh.delete_min();
// mh.print_heap_vector();
std::cout << a << "\n";
}
// std::cout << "\n";
return 0;
}
"I want to change the values of 7 to 4, 16 to 4 and 4 to 3.5 . But I am not sure where they will be in the heap. The indices of values of the elements that have to be decreased will be given with respect to the original vector v. ... Please also let me know if I have to rethink some of my logic."
Rather than manipulate the values inside the heap, I would suggest keeping the values that need changing inside a vector (possibly v itself). The heap could be based on elements that are a struct (or class) that holds an index into the corresponding position in the vector with the values, rather than hold the (changing) value itself.
The struct (or class) would implement an operator< function that compares the values retrieved from the two vector locations for the respective index values. So, instead of storing the comparison value in the heap elements and comparing a < b, you would store index positions i and j and so on and compare v[i] < v[j] for the purpose of heap ordering.
In this way, the positions of the numerical values you need to update will never change from their original positions. The position information will never go stale (as I understand it from your description).
Of course, when you make changes to those stored values in the vector, that could easily invalidate any ordering that might have existed in the heap itself. As I understand your description, that much was necessarily true in any case. Therefore, depending on how you change the values, you might need to do a fresh make_heap to restore proper heap ordering. (That isn't clear, since it depends on whether your intended changes violate heap assumptions, but it would be a safe thing to assume unless there are strong assurances otherwise.)
I think the rest is pretty straightforward. You can still operate the heap as you intended before. For ease you might even give the struct (or class) a lookup function to return the current value at its corresponding position in the vector, if you need that (rather than the index) as you pop out minimum values.
p.s. Here is a variation on the same idea.
In the original version above, each element would likely also need a pointer to the vector that holds the values, possibly as a shared static pointer of that struct (or class), so that every element could dereference that pointer and combine it with its own index to look up its value.
If you prefer, instead of storing that shared vector pointer and an index in each member, each struct (or class) instance could more simply store a pointer (or iterator) directly to the corresponding value's location. If the values are integers, the heap element struct's member value could be int pointer. While each pointer might be larger than an index value, this does have the advantage that it eliminates any assumption about the data structure that holds the compared values and it is even simpler/faster to dereference vs. lookup with an index into the vector. (Both are constant time.)
One caution: In this alternate approach, the pointer values would be invalidated if you were to cause the vector's storage positions to change, e.g. by pushing in new values and expanding it in a way that forces it to reallocate its space. I'm assuming you only need to change values, not expand the number of values after you've begun to use the heap. But if you did need to do that, that would be one reason to prefer index values, since they remain valid after expanding the vector (unlike pointers).
p.p.s. This technique is also valuable when the objects that you want to compare in the heap are large. Rather than have the heap perform many copy operations on large objects as it reorders the positions of the heap elements, by storing only pointers (or index values) the copying is much more efficient. In fact, this makes it possible to use heaps on objects that you might not want to copy at all.
Here is a quick idea of one version of the comparison function (with some class context now added).
class YourHeapElementClassName
{
public:
// constructor
explicit YourHeapElementClassName(theTypeOfYourComparableValueOrObject & val)
: m_valPointer(&val)
{
}
bool operator<(const YourHeapElementClassName & other) const
{
return *m_valPointer < *(other.m_valPointer);
}
...
private:
theTypeOfYourComparableValueOrObject * m_valPointer;
}; // YourHeapElementClassName
// and later instead of making a heap of int or double,
// you make a heap of YourHeapElementClassName objects
// that you initialize so each points to a value in v
// by using the constructor above with each v member.
// If you (probably) don't need to change the v values
// through these heap objects, the member value could be
// a pointer to a const value and the constructor could
// have a const reference argument for the original value.
If you had need to do this with different types of values or objects, the pointer approach could be implemented with a template that generalizes on the type of value or object and holds a pointer to that general type.
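As a rough illustration of the index-based variant described earlier in this answer, the sketch below keeps a heap of indices into the value vector and restores the ordering with std::make_heap after values change. The class name IndexHeap and its members are made up for this example, and double is used as the value type only because the question mentions changing a value to 3.5.

```cpp
#include <algorithm>  // std::make_heap, std::pop_heap
#include <cstddef>
#include <vector>

// Heap of indices into a separately stored vector of values.
// The referenced vector must outlive the heap and must not change size.
class IndexHeap {
    // Comparing with "greater" makes the std heap functions behave as a min-heap.
    struct ByValueGreater {
        const std::vector<double>* values;
        bool operator()(std::size_t a, std::size_t b) const {
            return (*values)[a] > (*values)[b];
        }
    };

public:
    explicit IndexHeap(const std::vector<double>& values) : m_values(values) {
        for (std::size_t i = 0; i < values.size(); ++i) {
            m_heap.push_back(i);
        }
        rebuild();
    }

    // Call this after modifying any of the values the heap refers to.
    void rebuild() {
        std::make_heap(m_heap.begin(), m_heap.end(), ByValueGreater{&m_values});
    }

    bool empty() const { return m_heap.empty(); }

    // Removes the entry with the smallest value and returns its index
    // in the original vector. Precondition: the heap is not empty.
    std::size_t pop_min() {
        std::pop_heap(m_heap.begin(), m_heap.end(), ByValueGreater{&m_values});
        std::size_t index = m_heap.back();
        m_heap.pop_back();
        return index;
    }

private:
    const std::vector<double>& m_values;
    std::vector<std::size_t> m_heap;
};
```

Because pop_min() returns a position in the original vector, the caller can change v[0], v[2] or v[3] directly and then call rebuild(). Rebuilding is linear in the number of remaining elements, so this trades the constant-time lookup for a simpler implementation than a true decrease_key.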

Dynamic array crashing at constructor

I'm trying to implement a dynamic array of strings for educational purpose. The problem that I ran into is the program crashing whenever I try to add strings to the empty array in my constructor.
Array::Array(string dir, string dim)
{
size = 0;
ptr = new string[size + 1];
ptr[size] = dir;
size++;
ptr[size] = dim;
size++;
}
I have int size and string *ptr declared in my header file. I originally thought this to be an out-of-bounds problem, but after looking at this post, I fixed the initial allocation to size + 1, but the persisting problem seems to prove otherwise.
Changing the value of size does not change the size of the array.
You allocate an array of size 1.
Then you assign something to the first (only) element of that array.
Then you assign something to the second element of that array - but the array only has one element.
Also note that new[] does not give you an array that can grow. Once allocated, its size can't change.
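A minimal sketch of what this means for the constructor, assuming the class looks roughly like the one in the question: allocate room for both strings up front, and grow later by allocating a new block and copying. The add() function, the destructor and the deleted copy operations are illustrative additions, not part of the original class.

```cpp
#include <string>

class Array {
public:
    Array(const std::string& dir, const std::string& dim)
        : size(2), ptr(new std::string[2])  // room for both strings from the start
    {
        ptr[0] = dir;
        ptr[1] = dim;
    }
    ~Array() { delete[] ptr; }

    // Sketch only: copying is disabled rather than implemented.
    Array(const Array&) = delete;
    Array& operator=(const Array&) = delete;

    // Growing means: allocate a bigger block, copy, release the old block.
    void add(const std::string& s) {
        std::string* bigger = new std::string[size + 1];
        for (int i = 0; i < size; ++i) {
            bigger[i] = ptr[i];
        }
        bigger[size] = s;
        delete[] ptr;
        ptr = bigger;
        ++size;
    }

    int size;
    std::string* ptr;
};
```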
As mentioned by Sid S, "changing the value of size does not change the size of the array."
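As for your "inefficient" concern, a common trick, reflecting PaulMcKenzie's and Daniel H's idea, is to use the doubling strategy. See the following C code for a simple illustration:
#include <stdlib.h>
struct MyArray {
    int capacity;
    int size;
    int *data;
};
/* any other functions you would use, eg: create, destroy of MyArray */
/* takes a pointer so that the caller's struct is the one being updated */
void push(struct MyArray *myarray, int n) {
    if (myarray->size == myarray->capacity) {
        /* double the capacity; start at 1 if the array is still empty */
        int newCapacity = myarray->capacity > 0 ? myarray->capacity * 2 : 1;
        int *p = realloc(myarray->data, newCapacity * sizeof(int));
        if (p == NULL) {
            return; /* allocation failed; the old block is still valid */
        }
        myarray->data = p;
        myarray->capacity = newCapacity;
    }
    /* add the element to the end of data and increase size */
    myarray->data[myarray->size++] = n;
}
In this way, instead of calling realloc every time an element is added, you get a lower runtime on average.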
A detailed amortized analysis about doubling strategy can be found here.
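Instead of string, use char pointers and allocate exactly as much memory as each string needs up front, rather than starting from 0 and incrementing.
#include <cstring> // strlen, strcpy
Array :: Array(const char *dir, const char *dim)
{
size_t l1,l2;
l1=strlen(dir);
l2=strlen(dim);
/**assume n1 & n2 are char* data members of "Array" class.**/
n1=new char[l1+1];// allocating memory dynamically (+1 for the terminating '\0')
n2=new char[l2+1];
strcpy(n1,dir);// copy the contents into the new buffers
strcpy(n2,dim);
}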
I hope it helps.

Size of dynamic array or loop through it without knowing size

Like in the title, can I somehow get the size of a dynamically allocated array (I can't keep it separately), or somehow loop through this array without using its size?
int *ar=new int[x]; //x-size of array, I don't know it in the beginning
P.S. If I wanted to use std::vector, I wouldn't ask about it, so don't tell me to use it :)
A std::vector is designed for this.
If you can't use a std::vector I can see a couple of options.
1) Use an array terminator.
If your array should only contain positive numbers (for example) or numbers within a given range then you can use an illegal value (eg -1) as an array terminator.
for(int* i = arr; *i != -1; ++i)
{
// do something with *i
}
2) Embed the length in the array.
For a numeric array you could, by convention, store its length in the first element.
for(int i = 0; i < arr[0]; ++i)
{
// do something with arr[i + 1]
}
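To make the two conventions concrete, here is a small sketch of how such arrays could be created and consumed. The helper names are invented for this example, and -1 as terminator is only valid under the assumption stated above that it never appears as a real value.

```cpp
// Option 1: n payload values followed by the terminator -1.
int* makeTerminated(const int* src, int n) {
    int* arr = new int[n + 1];
    for (int i = 0; i < n; ++i) {
        arr[i] = src[i];
    }
    arr[n] = -1;
    return arr;
}

// Option 2: the length is stored in element 0, the payload follows.
int* makeLengthPrefixed(const int* src, int n) {
    int* arr = new int[n + 1];
    arr[0] = n;
    for (int i = 0; i < n; ++i) {
        arr[i + 1] = src[i];
    }
    return arr;
}

long sumTerminated(const int* arr) {
    long total = 0;
    for (const int* i = arr; *i != -1; ++i) {
        total += *i;
    }
    return total;
}

long sumLengthPrefixed(const int* arr) {
    long total = 0;
    for (int i = 0; i < arr[0]; ++i) {
        total += arr[i + 1];
    }
    return total;
}
```

In both cases the allocation is one element larger than the payload, and the caller still has to delete[] the array.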
If you want to store the size of your dynamic array inside it, well, just do so:
#include <iostream>
#include <cstddef>
#include <new>
using std::size_t;
// Note: the flexible array member (int data[]) is not standard C++;
// GCC and Clang accept it as an extension.
struct head_t { size_t size; int data[]; };
// Step back from the data pointer to the header that precedes it.
// Pointer arithmetic has to go through char*, not void*.
head_t* header_of(int* data) {
    return reinterpret_cast<head_t*>(reinterpret_cast<char*>(data) - offsetof(head_t, data));
}
int main() {
    head_t* h = static_cast<head_t*>(::operator new(sizeof(head_t) + 10 * sizeof(int)));
    h->size = 10;
    int* my_10_ints = h->data;
    // Oh noez! I forgot 10!
    size_t what_was_10_again = header_of(my_10_ints)->size;
    ::std::cout << what_was_10_again << "\n";
    ::operator delete(header_of(my_10_ints));
}
You can even put that functionality in a libraryesque set of functions! Oh, and once you do that you realize you could just have an unordered_map that maps pointers to sizes. But that would be like using vector: Totally boring.
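For completeness, the unordered_map idea could be sketched roughly like this. The namespace sized and its functions are hypothetical, the registry is not thread-safe, and it only knows about arrays that were allocated through allocate().

```cpp
#include <cstddef>
#include <unordered_map>

namespace sized {

// Maps each allocated array to its element count.
inline std::unordered_map<const void*, std::size_t>& registry() {
    static std::unordered_map<const void*, std::size_t> r;
    return r;
}

inline int* allocate(std::size_t n) {
    int* p = new int[n];
    registry()[p] = n;
    return p;
}

// Throws std::out_of_range if p was not obtained from allocate().
inline std::size_t size_of(const int* p) {
    return registry().at(p);
}

inline void release(int* p) {
    registry().erase(p);
    delete[] p;
}

}  // namespace sized
```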
No. That's one reason everybody uses containers. If std::vector doesn't please you, you can make a container of your own.
Edit: Since the size of a dynamic array is determined at runtime, someone must store the size somewhere (unless you're willing to use a sentinel value). Not even the compiler can help, because the size is determined at runtime.

How can I make my dynamic array or vector operate at a similar speed to a standard array? C++

I'm still quite inexperienced in C++ and I'm trying to write some code to add numbers precisely. This is a DLL plugin for some finite difference software and the code is called several million times during a run. I want to write a function where any number of arguments can be passed in and the sum will be returned. My code looks like:
#include <cstdarg>
double SumFunction(int numArgs, ...){ // this allows me to pass any number
// of arguments to my function.
va_list args;
va_start(args,numArgs); //necessary prerequisites for using cstdarg
double myarray[10];
for (int i = 0; i < numArgs; i++) {
myarray[i] = va_arg(args,double);
} // I imagine this is sloppy code; however I cannot create
// myarray[numArgs] because numArgs is not a const int.
sum(myarray); // The actual method of addition is not relevant here, but
//for more complicated methods, I need to put the summation
// terms in a list.
vector<double> vec(numArgs); // instead, place all values in a vector
for (int i = 0; i < numArgs; i++) {
vec.at(i) = va_arg(args,double);
}
sum(vec); //This would be passed by reference, of course. The function sum
// doesn't actually exist, it would all be contained within the
// current function. This method is twice as slow as placing
//all the values in the static array.
double *vec;
vec = new double[numArgs];
for (int i = 0; i < (numArgs); i++) {
vec[i] = va_arg(args,double);
}
sum(vec); // Again half of the speed of using a standard array and
// increasing in magnitude for every extra dynamic array!
delete[] vec;
va_end(args);
}
So the problem I have is that using an oversized static array is sloppy programming, but using either a vector or a dynamic array slows the program down considerably. So I really don't know what to do. Can anyone help, please?
One way to speed the code up (at the cost of making it more complicated) is to reuse a dynamic array or vector between calls, then you will avoid incurring the overhead of memory allocation and deallocation each time you call the function.
For example declare these variables outside your function either as global variables or as member variables inside some class. I'll just make them globals for ease of explanation:
double* sumArray = NULL;
int sumArraySize = 0;
In your SumFunction, check if the array exists and if not allocate it, and resize if necessary:
double SumFunction(int numArgs, ...){ // this allows me to pass any number
// of arguments to my function.
va_list args;
va_start(args,numArgs); //necessary prerequisites for using cstdarg
// if the array has already been allocated, check if it is large enough and delete if not:
if((sumArray != NULL) && (numArgs > sumArraySize))
{
delete[] sumArray;
sumArray = NULL;
}
// allocate the array, but only if necessary:
if(sumArray == NULL)
{
sumArray = new double[numArgs];
sumArraySize = numArgs;
}
double *vec = sumArray; // set to your array, reusable between calls
for (int i = 0; i < (numArgs); i++) {
vec[i] = va_arg(args,double);
}
sum(vec, numArgs); // you will need to pass the array size
va_end(args);
// note no array deallocation
}
The catch is that you need to remember to deallocate the array at some point by calling a function similar to this (like I said, you pay for speed with extra complexity):
void freeSumArray()
{
if(sumArray != NULL)
{
delete[] sumArray;
sumArray = NULL;
sumArraySize = 0;
}
}
You can take a similar (and simpler/cleaner) approach with a vector, allocate it the first time if it doesn't already exist, or call resize() on it with numArgs if it does.
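A sketch of the vector variant, assuming a function-local static buffer is acceptable (it makes the function non-reentrant and unsafe to call from several threads at once). The plain loop at the end is only a placeholder for the real summation method.

```cpp
#include <cstdarg>
#include <cstddef>
#include <vector>

double SumFunction(int numArgs, ...) {
    // Reused between calls. resize() never reduces the capacity, so memory
    // is only allocated when numArgs exceeds every previous call.
    static std::vector<double> buffer;
    buffer.resize(static_cast<std::size_t>(numArgs));

    va_list args;
    va_start(args, numArgs);
    for (int i = 0; i < numArgs; ++i) {
        buffer[i] = va_arg(args, double);
    }
    va_end(args);

    // Placeholder for the actual summation method.
    double total = 0.0;
    for (double x : buffer) {
        total += x;
    }
    return total;
}
```

No explicit cleanup function is needed here because the vector releases its memory when the program (or the DLL) unloads.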
When using a std::vector the optimizer must consider that relocation is possible and this introduces an extra indirection.
In other words the code for
v[index] += value;
where v is for example a std::vector<int> is expanded to
int *p = v._begin + index;
*p += value;
i.e. from vector you need first to get the field _begin (that contains where the content starts in memory), then apply the index, and then dereference to get the value and mutate it.
If the code performing the computation on the elements of the vector in a loop calls any unknown non-inlined code, the optimizer is forced to assume that unknown code may mutate the _begin field of the vector and this will require doing the two-steps indirection for each element.
(NOTE: that the vector is passed with a const std::vector<T>& reference is totally irrelevant: a const reference doesn't mean that the vector is const but simply puts a limitation on what operations are permitted using that reference; external code could have a non-const reference to access the vector and constness can also be legally cast away... constness of references is basically ignored by the optimizer).
One way to remove this extra lookup (if you know that the vector is not being resized during the computation) is to cache this address in a local and use that instead of the vector operator [] to access the element:
int *p = &v[0];
for (int i=0,n=v.size(); i<n; i++) {
/// use p[i] instead of v[i]
}
This will generate code that is almost as efficient as a static array because, given that the address of p is not published, nothing in the body of the loop can change it and the value p can be assumed constant (something that cannot be done for v._begin as the optimizer cannot know if someone else knows the address of _begin).
I'm saying "almost" because a static array only requires indexing, while using a dynamically allocated area requires "base + indexing" access; most CPUs however provide this kind of memory access at no extra cost. Moreover if you're processing elements in sequence the indexing addressing becomes just a sequential memory access but only if you can assume the start address constant (i.e. not in the case of std::vector<T>::operator[]).
Assuming that the "max storage ever needed" is in the order of 10-50, I'd say using a local array is perfectly fine.
Using vector<T> will use 3 * sizeof(T*) (at least) to track the contents of the vector. So if we compare that to an array of double arr[10];, then the array takes 7 more double-sized elements on the stack (or 8.5 in a 32-bit build). But you also need a call to new, which takes a size argument. So that takes up AT LEAST one, more likely 2-3 elements of stack space, and the implementation of new is quite possibly not straightforward, so further calls are needed, which take up further stack space.
If you "don't know" the number of elements, and need to cope with quite large numbers of elements, then use a hybrid solution: keep a small stack-based local array, and only if numargs > small_size use a vector and pass vec.data() to the function sum.
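The hybrid approach might be sketched as follows. The threshold of 16, the name SumHybrid and the stand-in sum function are assumptions for illustration; the real summation method would take its place.

```cpp
#include <cstdarg>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real summation routine.
double sum(const double* values, std::size_t count) {
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

double SumHybrid(int numArgs, ...) {
    constexpr std::size_t small_size = 16;  // assumed threshold, tune as needed
    double stackBuffer[small_size];
    std::vector<double> heapBuffer;         // stays empty in the common case

    const std::size_t count = static_cast<std::size_t>(numArgs);
    double* data = stackBuffer;
    if (count > small_size) {
        // Rare case: too many values for the stack buffer.
        heapBuffer.resize(count);
        data = heapBuffer.data();
    }

    va_list args;
    va_start(args, numArgs);
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = va_arg(args, double);
    }
    va_end(args);

    return sum(data, count);
}
```

Calls with at most small_size arguments only touch the stack array; larger calls pay for one allocation.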