c++ understanding size_t behaviour for vector creation - c++

This is a follow-up to this https://softwareengineering.stackexchange.com/questions/256241/c-coding-practice-class-vs-free-functions question I posted a few days ago. In short, the idea is to create a custom vector class for statistical data analysis.
I got a great response that made me realise I need to understand: why use size_t in the constructor of a container class, and why use it at all?
Here is a part of the proposed solution:
template<class T>
class vect
{
std::vector<T> m;
public:
vect(size_t n) :m(n) {}
void addTo(T a){ m.push_back(a); }
std::vector<T> get() const { return m;}
... more functions and overloaded operators
};
I understand that size_t is an unsigned integer type and should be used to indicate that its value represents the size of an object.
In order to understand the behaviour of size_t I did the following:
int main() {
vect<int> m(0);
vect<int> n(100);
std::cout << sizeof(n) << std::endl;
std::cout << sizeof(m) << std::endl;
std::cout << sizeof(m.get()) << std::endl;
for (int i = 0 ; i < 100; i++) {
m.addTo(i);
}
std::cout << sizeof(m) << std::endl;
std::cout << sizeof(m.get()) << std::endl;
}
All of these print "24". (I expected the size of the object to change after adding elements to it.) However:
for(int i = 0; i<100;i++)
std::cout << m[i] << std::endl;
nicely prints out all values from 0 to 99. Thus I know that there are 100 integers stored in the vector, but then why is its size 24 and not 100?
Obviously I am new to c++ programming and to make things worse this is my first template class.
Thank you for your time and patience, I really appreciate it.

sizeof is concerned with the size of a type in memory. Your vect class is a type that takes up 24 bytes on your platform. The size of a type is fixed at compile time and never changes while the program runs.
But how can your vector store so much information without changing its size? Because it contains pointers to other things that can take up much more space (indeed, they can take up as much space as you need - that's why they're called dynamic data structures). Your vect instance contains a std::vector, which in turn typically holds a few pointers into a dynamically allocated section of memory that holds the actual data.
You cannot query the size of those private, indirectly referenced bits of memory with sizeof, because you don't know their name or their type. To find out how many elements are stored, ask the container itself, e.g. m.get().size(). In fact, a major reason for creating such container classes is so that you don't have to know how much memory to allocate - you just stuff things into them, and they silently allocate as much memory as necessary.

Related

Incorrect values for overloading multiplication by scalar for a simple Polynomial class in C++

Simplifying: I am trying to write a simple Polynomial class and I get incorrect values when I try to overload the multiplication operator:
#include <iostream>
class Polynomial {
public:
unsigned int degree;
int *coefficients;
Polynomial(unsigned int deg, int* arr) {
degree = deg;
coefficients = arr;
}
~Polynomial() {}
int& operator[](int index) const {
return coefficients[index];
}
};
std::ostream& operator<<(std::ostream &os, const Polynomial& P){
for (unsigned int i=0; i < P.degree; i++) os << P[i] << ",";
os << P.coefficients[P.degree];
return os;
}
Polynomial operator*(const Polynomial &P, const int &x) {
int arr[P.degree];
for (unsigned int i=0; i <= P.degree; i++) arr[i] = P[i];
Polynomial p(P.degree, arr);
std::cout << p << std:: endl; // just for debugging
return p;
}
I am testing the code in my main:
int g[] = {-1, 0, 1, 1, 0, 1, 0, 0, -1, 0, -1};
Polynomial P_g = Polynomial(10, g);
std::cout << P_g << std::endl;
Polynomial P_g_3 = P_g*3;
std::cout << P_g_3[0] << std::endl;
std::cout << P_g_3[1] << std::endl;
std::cout << P_g_3[2] << std::endl;
std::cout << P_g_3[3] << std::endl;
std::cout << P_g_3[4] << std::endl;
std::cout << P_g_3[5] << std::endl;
std::cout << P_g_3[6] << std::endl;
std::cout << P_g_3[7] << std::endl;
std::cout << P_g_3[8] << std::endl;
std::cout << P_g_3[9] << std::endl;
std::cout << P_g_3[10] << std::endl;
std::cout << P_g_3 << std::endl;
But the output in the console is something totally different than what I expect:
-1,0,1,1,0,1,0,0,-1,0,-1
-1,0,1,1,0,1,0,0,-1,0,-1
-1
0
0
0
0
0
-2002329776
32767
-516177072
32766
539561192
0,32766,539581697,32767,-2002329776,32767,1,0,243560063,1,-2002329776
Although notice that the inner cout statement from within the overloaded operator returns a correct polynomial. Only when the program exits that function the coefficients get screwed... Moreover the two printing strategies are not even consistent with themselves. What is going on?
Analysis
Your Polynomial class is rather unusual in that it does not own its own data. While this is not inherently wrong, it is almost always an unwise approach.
As an illustration, look at the variables in your main function. The data for the polynomial P_g is stored in the array g. The variable g serves no purpose other than storing that data, so its name is clutter. There is also a consistency concern here, because if someone were to change an element of g, then P_g would also change. Even worse, if g were to cease to exist while P_g was still around, you would have a polynomial without access to its data!
Fortunately, local variables are destroyed in the reverse order of creation, so P_g will be destroyed before g. However, no such luck intervenes when you invoke operator*. In that operator, the polynomial p stores its data in the local variable arr. So far, the situation is the same as in main(). Until you return p. At that point, p is copied/moved into the calling scope, and that copy uses the same data as the original. The array arr gets destroyed, yet the returned object still tries to access arr for its data. The returned object, P_g_3, has a dangling pointer.
When you try to access the data of P_g_3, undefined behavior occurs. In some cases, you might see the behavior you expect. In this case, garbage values were produced. From one perspective, your result is the more desirable one since you were able to detect that a problem exists, which allowed you to attempt to fix it. Far more insidious is the undefined behavior that performs as you expect when you run the program, but not when someone else does.
Solution
The usual approach is to have objects own their own data. Often, this is exclusive ownership, so that the data cannot change without going through the object.
A first refinement would be to move the data-holding arrays into the objects. The main obstacle is that you do not know in advance how large the array is. This calls for dynamic memory allocation. You could start down this road without changing the data members of your class; a first step might be to initialize coefficients to memory allocated with new rather than assigning it arr. This leads to the need to follow the Rule of Three. Unfortunately, this is a bit of work to do correctly, especially since you have chosen to track the degree of the polynomial instead of the size of the array.
Fortunately, storing data of unknown length is a common concern, common enough that the standard library provides tools to automate most of the details. One such tool is std::vector. If you change the type of coefficients from int* to std::vector<int>, then Polynomial objects will be able to own their own data, plus the Rule of Three becomes the Rule of Zero. (That is, the compiler-generated destructor and copy methods will suffice.) Your operator* could simply make a copy of the incoming Polynomial, then iterate over the copy's vector to make changes.
As an added benefit, you no longer need to track degree manually, as the degree of non-zero polynomials will be coefficients.size() - 1. (Well, there is a complication if the leading coefficient is zero, but that's also an unhandled concern for your original implementation.) This demonstrates one reason a class will often keep its data members private. If you had made the data member private and instead defined a degree() method, you could change how the degree is determined without modifying any code that uses Polynomial.
Note:
If you use range-based for loops with your vectors, the example code would have no need to look at the degree of a polynomial. You would be providing the member function for the benefit of code external to the class.

C array of objects that allow for dynamic allocation

If I have a class with dynamic allocation, like std::vector, how much sense does it make to create an array whose elements are objects of that class?
I tested the following:
std::vector<int> array[2];
array[0].push_back(10);
array[0].push_back(20);
array[0].push_back(30);
array[1].push_back(40);
array[1].push_back(50);
array[1].push_back(60);
for (auto x : array[0]) {
std::cout << x << std::endl;
}
for (auto x : array[1]) {
std::cout << x << std::endl;
}
Which outputs the values correctly. However is this actually undefined behavior? Array elements are located contiguously in memory. When a std::vector object is initialized without specifying size, it doesn't allocate any memory. So when we create an array with two elements, is any memory allocated at all? When we then add new elements to std::vector are we writing outside of the array bounds?
There is no undefined behavior here.
How much sense does it make to create an array with elements of that object?
This is entirely up to you, and your design considerations. It's perfectly fine to have an array of vectors.
So when we create an array with two elements, is any memory allocated at all?
Yes, memory for the array is allocated.
When we then add new elements to std::vector are we writing outside of the array bounds?
std::vector allocates memory dynamically. A default-constructed vector may or may not allocate memory. This is implementation-dependent. Internally, the vector is probably just holding pointers.
Some containers are smarter than this: they hold a fixed number of elements without any dynamic allocation and switch to dynamic allocation only if the requirements exceed that. std::string implementations commonly do this, by repurposing the internal pointer storage to hold string data instead.
Note that if you are pushing some number of elements onto a new vector, it's good practice to call reserve on that vector to avoid unnecessary reallocations as it grows.
An object of the type std::vector<int> has a fixed size that depends on its implementation.
Consider the following demonstrative program.
#include <iostream>
#include <vector>
int main()
{
std::vector<int> array[2];
std::cout << "sizeof( std::vector<int> ) = "
<< sizeof( std::vector<int> )
<< '\n';
std::cout << "sizeof( array ) = "
<< sizeof( array )
<< '\n';
std::cout << '\n';
array[0].push_back(10);
array[0].push_back(20);
array[0].push_back(30);
array[1].push_back(40);
array[1].push_back(50);
array[1].push_back(60);
std::cout << "sizeof( std::vector<int> ) = "
<< sizeof( std::vector<int> )
<< '\n';
std::cout << "sizeof( array ) = "
<< sizeof( array )
<< '\n';
}
Its output might look like
sizeof( std::vector<int> ) = 24
sizeof( array ) = 48
sizeof( std::vector<int> ) = 24
sizeof( array ) = 48
Adding new elements to the vectors does not change the size of the vector objects, and so the size of the array does not change either.
The program is well-formed.
The class template std::vector holds, as a data member, a pointer to the memory where it places its elements. If there is not enough memory, the vector reallocates it. But the size of the vector object itself does not depend on the allocated memory.

Pass array to function so it would not change original array no matter what

I have a function that performs some magic on the array that I am passing. But the original array should stay intact. Unfortunately, its content changes depending on what happens in the function.
Can you help me, please?
Function:
void test(int* array) {
array[0] = 1; // EDIT: Added missing line
std::cout << "Inside: " << array[0] << endl;
}
int main() {
int *testArray = new int[1];
testArray[0] = 0;
std::cout<<testArray[0]<<endl;
test(testArray);
std::cout << "Outside: " << testArray[0] << endl;
}
Current result is:
0
Inside: 1
Outside: 1
Result I would want to have:
0
Inside: 1
Outside: 0
Is this possible?
It sounds like you want to pass the array by value, not by reference. Here you are passing a pointer to the first element, so any change you make through that pointer inside the function is reflected in the original array.
The other problem is that you haven't posted much code for the problem you want to solve, so I am assuming you want the function to be prevented from modifying the array. Declaring the parameter as const int* does that: the assignment in the code below is rejected at compile time.
#include <iostream>
void test(const int* array) {
array[0]=1;
std::cout << "Inside: " << array[0] << std::endl;
}
int main() {
int *testArray = new int[1];
testArray[0] = 0;
std::cout<<testArray[0]<<std::endl;
test(testArray);
std::cout << "Outside: " << testArray[0] << std::endl;
delete[] testArray;
}
The compiler will give you the following error:
source_file.cpp:4:13: error: read-only variable is not assignable
array[0]=1;
~~~~~~~~^
1 error generated.
You should not use new[] to allocate dynamic arrays in C++. 99% of the time, if you want a dynamic array in C++, you should be using std::vector.
Avoid using C compatibility features...
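#include <array>
#include <iostream>
// std::array is passed by value, so the function works on its own copy.
void test( std::array<int, 1> a )
{
    a[0] = 1; // fine, only the copy changes
    std::cout << "Inside: " << a[0] << std::endl;
}
int main()
{
    std::array<int, 1> testArray;
    testArray[0] = 0;
    std::cout << testArray[0] << std::endl;
    test(testArray);
    std::cout << "Outside: " << testArray[0] << std::endl;
}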
If you need the size determined at runtime, use std::vector instead of std::array.
EDIT: As others have pointed out, it seems like you want to either pass the array by value instead of by reference, thus copying the elements of the array and modifying only the copy, or you want to avoid modifying any part of the array altogether. I'll elaborate on both parts a bit more:
In C++, arrays and pointers are closely related but not the same thing: an array converts ("decays") to a pointer to its first element in most expressions, including when it is passed to a function. Note that both your variable testArray and your parameter array are pointers to the beginning of an array. If you use array to modify any part of the underlying array, what you actually modify is the memory area that both testArray and array point to. If you don't want to modify any part of the array at all, it would be helpful to use the const qualifier, as Destructor already wrote in his answer. If, however, you want to keep the original array but still make some modifications inside the function, the following applies:
To keep the array from being modified, the only general way that works is to copy it: create a new array of the same size, copy all elements from the input array into it, and then work only on the copy, which should be deleted once the function has finished.
My personal answer:
I would recommend that you look into some of C++'s standard containers, especially std::vector. If you pass it by value (not by reference), vector takes care of all the copy operations I just described, and in most cases you can use it in the same way as an array, while it provides lots of additional features (e.g. dynamic size, deletion and insertion of elements, simplified iteration, ...).

What's the correct way to work with bounded arrays in C++?

I'm trying to understand how bounded arrays work in C++. I need to have a quick length method to return the size of my bounded array. In Java I would do something like that:
int[] array = new int[10];
System.out.println(array.length); // 10
System.out.println(array[5]); // 0
array[5] = 11;
System.out.println(array[5]); // 11
If possible I would like to use an object array (i.e. a class implementing an array functionality) instead of pointers. Would I be correct to say that it feels much more natural to use an object array instead of a memory pointer to work with arrays?
C++ has a class template std::array<type, size>, which is basically just a wrapper around a fixed-size array stored inside the object itself (on the stack, for a local variable). C++ also has a class template std::vector<type>, which is a wrapper for heap-allocated arrays (like what you're used to in Java) but which also has ArrayList-like functionality.
In your case, the logical and semantic equivalent of your code is:
std::vector<int> array(10, 0); // The second argument is optional: std::vector<int> array(10) also value-initializes every element to 0. It is spelled out here to mirror Java's zeroed array.
std::cout << array.size() << std::endl;
std::cout << array[5] << std::endl;
array[5] = 11;
std::cout << array[5] << std::endl;
Though, I wouldn't name the variable array, since that could be confusing.

Can I determine the size/length of an array in C++ without having to hardcode it?

I am basically looking for some sort of "dynamic" way of passing the size/length of an array to a function.
I have tried:
void printArray(int arrayName[])
{
for(int i = 0 ; i < sizeof(arrayName); ++i)
{
cout << arrayName[i] << ' ';
}
}
But I realized it only considers its bytesize and not how many elements are on the array.
And also:
void printArray(int *arrayName)
{
while (*arrayName)
{
cout << *arrayName << ' ';
*arrayName++;
}
}
This has at least printed me everything but more than what I expected, so it doesn't actually work how I want it to.
I reckon it is because I don't exactly tell it how big I need it to be so it plays it "safe" and throws me some big size and eventually starts printing me very odd integers after my last element in the array.
So I finally got this work around, yet I believe there is something better out there!:
void printArray(int *arrayName)
{
while (*arrayName)
{
if (*arrayName == -858993460)
{
break;
}
cout << *arrayName << ' ';
*arrayName++;
}
cout << '\n';
}
After running the program a few times I realized the value after the last element of the array that I have input is always: -858993460, so I made it break the while loop once this value is encountered.
#include <iostream>
#include <conio.h>
using namespace std;
// functions prototypes
void printArray (int arrayName[], int lengthArray);
// global variables
//main
int main ()
{
int firstArray[] = {5, 10, 15};
int secondArray[] = {2, 4, 6, 8, 10};
printArray (firstArray,3);
printArray (secondArray,5);
// end of program
_getch();
return 0;
}
// functions definitions
void printArray(int arrayName[], int lengthArray)
{
for (int i=0; i<lengthArray; i++)
{
cout << arrayName[i] << " ";
}
cout << "\n";
}
Thank you very much.
TL;DR answer: use std::vector.
But I realized it [sizeof()] only considers its bytesize and not how many elements are on the array.
That wouldn't be a problem in itself: you could still get the size of the array using sizeof(array) / sizeof(array[0]), but the problem is that when passed to a function, arrays decay into a pointer to their first element, so all you can get is sizeof(T *) (T being the type of an element in the array).
About *arrayName++:
This has at least printed me everything but more than what I expected
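I don't really understand what inspired you to look for the end of the array in this way. Note that postfix ++ binds more tightly than unary *, so *arrayName++ is parsed as *(arrayName++): the pointer is advanced and the dereferenced value is discarded. The loop therefore walks through memory until it happens to find an element equal to zero, which may be well past the end of the array.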
After running the program a few times I realized the value after the last element of the array that I have input is always: -858993460
That's a terrible assumption and it also relies on undefined behavior. You can't really be sure what's in the memory after the last element of your array; you should not even be accessing it.
Basically, in C++, if you want to know the size of a raw array from within a function, then you have to keep track of it manually (e.g. by adding an extra size_t size argument), because of the way arrays are passed to functions (remember, they "decay into" a pointer). If you want something more flexible, consider using std::vector<int> (or whatever type of objects you want to store) from the C++ standard library -- it has a size() method, which does exactly what you want.
1st try
When arrays are passed into functions they decay to pointers. Normally, using sizeof on an array would give you its size in bytes which you could then divide by the size in bytes of each element and get the number of elements. But now, since you have a pointer instead of an array, calling sizeof just gives you the size of the pointer (usually 4 or 8 bytes), not the array itself and that's why this fails.
2nd try
The while loop in this example assumes that your array ends with a zero and that's very bad (unless you really did use a zero as a terminator like null-terminated strings for example do). If your array doesn't end with a zero you might be accessing memory that isn't yours and therefore invoking undefined behavior. Another thing that could happen is that your array has a zero element in the middle which would then only print the first few elements.
3rd try
This special value you found lurking after the end of your array can change at any time. It just happened to be there on this run (it looks like 0xCCCCCCCC, the pattern Visual C++ debug builds use to fill uninitialized stack memory) and it might be different another time, so hardcoding it like this is very dangerous because, again, you could end up accessing memory that isn't yours.
Your final code
This code is correct, and passing the length of the array along with the array itself is commonly done (especially in APIs written in C). It shouldn't cause any problems as long as you don't pass a length that's bigger than the real length of the array. That can happen, though, so this approach is also error-prone.
Another solution
Another solution would be to use std::vector, a container which, along with keeping track of its size, also allows you to add as many elements as you want, i.e. the size doesn't need to be known at compile time. So you could do something like this:
#include <iostream>
#include <vector>
#include <cstddef>
void print_vec(const std::vector<int>& v)
{
std::size_t len = v.size();
for (std::size_t i = 0; i < len; ++i)
{
std::cout << v[i] << std::endl;
}
}
int main()
{
std::vector<int> elements;
elements.push_back(5);
elements.push_back(4);
elements.push_back(3);
elements.push_back(2);
elements.push_back(1);
print_vec(elements);
return 0;
}
Useful links worth checking out
Undefined behavior: Undefined, unspecified and implementation-defined behavior
Array decay: What is array decaying?
std::vector: http://en.cppreference.com/w/cpp/container/vector
As all the other answers say, you should use std::vector or, as you already did, pass the number of elements of the array to the printing function.
Another way to do it is to put a sentinel element (a value you are sure won't appear inside the array) at the end of the array. In the printing function you then cycle through the elements and stop when you find the sentinel.
A possible solution: you can use a template to deduce the array length:
template <typename T, int N>
int array_length(T (&array)[N]) {
return N;
}
Note that you have to do this before the array decays to a pointer, but you can use the technique directly or in a wrapper.
For example, if you don't mind rolling your own array wrapper:
template <typename T>
struct array {
T *a_;
int n_;
template <int N> array(T (&a)[N]) : a_(a), n_(N) {}
};
You can do this:
void printArray(array<int> a)
{
for (int i = 0 ; i < a.n_; ++i)
cout << a.a_[i] << ' ';
}
and call it like
int firstArray[] = {5, 10, 15};
int secondArray[] = {2, 4, 6, 8, 10};
printArray (firstArray);
printArray (secondArray);
The key is that the templated constructor isn't explicit so your array can be converted to an instance, capturing the size, before decaying to a pointer.
NB. The wrapper shown isn't suitable for owning dynamically-sized arrays, only for handling statically-sized arrays conveniently. It's also missing various operators and a default constructor, for brevity. In general, prefer std::vector or std::array instead for general use.
... OP's own attempts are completely addressed elsewhere ...
Using the -858993460 value is highly unreliable and, in fact, incorrect.
You can pass the length of an array in two ways: pass an additional parameter (say size_t length) to your function, or put a special value at the end of the array. The first way is preferred, but the second is used, for example, for passing strings by char*.
In C/C++ it's not possible to find out the size of an array at runtime from a pointer to it. You might consider using the std::vector class if you need that, and it has other advantages as well.
When you pass the length of the array to printArray, you can compute it at the call site, where the array has not yet decayed to a pointer, with sizeof(array) / sizeof(array[0]): the size in bytes of the whole array divided by the size in bytes of a single element gives the number of elements in the array.
More to the point, in C++ you may find it to your advantage to learn about std::vector and std::array and prefer these over raw arrays—unless of course you’re doing a homework assignment that requires you to learn about raw arrays. The size() member function will give you the number of elements in a vector.
In C/C++, native arrays decay to pointers as soon as they are passed to functions. As such, the length has to be passed to the function as a separate parameter.
C++ offers the std::vector container class. When you pass it to a function, make sure you pass it by reference or by pointer (to avoid making a copy of the vector as it's passed).
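#include <cstddef>
#include <iostream>
#include <vector>
#include <string>
// Taken by const reference: no copy is made and the function cannot modify the vector.
void printArray(const std::vector<std::string> &arrayName)
{
std::size_t length = arrayName.size();
for(std::size_t i = 0 ; i < length; ++i)
{
std::cout << arrayName[i] << ' ';
}
}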
int main()
{
std::vector<std::string> arrayOfNames;
arrayOfNames.push_back(std::string("Stack"));
arrayOfNames.push_back(std::string("Overflow"));
printArray(arrayOfNames);
...
}