Why does my array say write access violation partially through? - c++

Updated to be copy/pasted and run. My bad.
I know I'm probably going to get the whole "this question was asked already" response, but I spent some time looking and couldn't find a matching problem. It's very possible I just don't know enough to look in the right place.
When I call InitSortedArray() it runs through a seemingly random number of elements before throwing an exception: write access violation. Every time I run it, it stops at a different element number. Any ideas?
#include <array>
#include <iostream>
using namespace std;
int * toSort;
const int SIZE = 100000;
void InitSortedArray()
{
srand(0);
toSort[0] = rand() % 5;
cout << toSort[0];
for (int i = 1; i < SIZE - 1; i++)
{
srand(0);
toSort[i] = toSort[i - 1] + rand() % 5;
cout << toSort[i] << endl;
}
}
void Search()
{
toSort[SIZE];
InitSortedArray();
}
int main()
{
Search();
}

int * toSort;
allocates a pointer to some data yet to be assigned to it. No data is ever assigned. You could
int * toSort = new int[100000];
but that picks up some memory management work you don't need. Any time you use new[] sooner or later you must delete[]. Instead use
const int SIZE = 100000; // place first so we can use it below
int toSort[SIZE];
or the more modern
const int SIZE = 100000; // place first so we can use it below
std::array<int, SIZE> toSort;
to declare an array.
toSort[100000];
in Search does nothing helpful (and is in fact harmful as it invokes Undefined Behaviour by accessing outside the bounds of toSort) and should be removed.
Extra stuff:
srand reseeds and restarts the random number generator. It is only in truly rare circumstances that you want to call it more than once, and in those cases there are many better options than srand and rand.
Place a single call to srand at the top of main and make absolutely certain you want srand(0) as this will always generate the exact same numbers on a given computer. It's great for testing, but not so good if you want a different sequence every time. Typical use is srand(time(NULL)) to seed the generator based on the ever-changing flow of time. That's still not all that good, but good enough for most cases where rand is in use.

It looks like you're using a pointer that was never pointed at any valid storage, and then trying to store and read elements through it. (As a global, toSort is zero-initialized, so it is a null pointer rather than a random address; writing through it is undefined behaviour either way.) Also, including <array> accomplishes nothing here, because the code never uses std::array. I believe what you want is to make toSort an actual array, so that it has memory of its own:
int toSort[SIZE];
instead of
int * toSort;
If you want to use the STL array (which I would recommend), then you need to declare it explicitly:
std::array<int, SIZE> toSort;
The nice thing about using the STL is it takes care of a lot of the memory access issues you can run into like memory access violation. Another helpful thing from the STL would be vector:
#include <vector>
std::vector<int> toSort;
then later: (this adds an item to the back of the vector)
toSort.push_back(<some number>);
and to access:
int somethingElse = toSort[<index number>];
Arrays: http://en.cppreference.com/w/cpp/container/array
Vectors: http://en.cppreference.com/w/cpp/container/vector

Related

How can I define large multidimensional arrays in c++ without getting this error?

My programming knowledge is very basic but usually enough to get by for what I need it for.
I'm using visual studio and trying to define large arrays of 4 dimensions (maybe more) of size up to [20][20][20][10000].
At first I was defining an array as int array[5][5][5][900], which was working fine. I then tried defining a new array, exactly the same size but with a different name, and got an unhandled exception error in chkstk.asm at "Find next lower page and probe":
cs20:
sub eax, PAGESIZE ; decrease by PAGESIZE
test dword ptr [eax],eax ; probe page.
jmp short cs10
I tried defining it as double and long double, and using a vector, but it seems to make no difference. I'd also like to increase the size and possibly add further dimensions.
Can someone please explain a simple way to make an array like this without this happening?
The array elements only need to contain 0 or 1
The issue is that the array you've specified is very large (20 * 20 * 20 * 900 = 7.2 million elements, which is roughly 29 MB for int). Since a local array is stored on the stack, and the default stack size in Visual Studio is only 1 MB, you're almost certainly seeing a stack overflow.
You'll probably want to allocate something that large with new, like:
auto test = new int[20][20][20][900];
test[0][1][0][0] = 0; // all four indices are needed to reach a single int
// when you're done with it, you'll need to delete it though
delete[] test;
which will put it in the (much larger) heap.
You can consider this approach:
#include <vector>
int main()
{
std::vector<std::vector<std::vector<std::vector<bool>>>> tab4;
int n1 = 10; // 1st dimension
int n2 = 20; // 2nd dimension
int n3 = 30; // 3rd dimension
int n4 = 10000; // 4th dimension
tab4.resize(n1);
for (auto& v : tab4)
{
v.resize(n2);
for (auto& w : v)
{
w.resize(n3);
for (auto& u : w)
{
u.resize(n4);
}
}
}
tab4[1][12][23][4000] = true; // the elements are bool, so store true/false rather than 9999
}
Pros:
This code is exception-safe and leak-free
The vector can be easily resized, should the need appear.
The size of each of the vector dimensions need not be compile constants
Beginners should not be exposed to bare pointers where more robust, safe and easy to use alternatives exist.
Cons:
More code is needed (this is not a serious drawback)
The allocated memory is not contiguous (this may be of some importance to advanced users only, can be circumvented, but I'm not aware of a solution that could be advised to beginners)
Alternative solution
This can be used if the sizes of the vector are fixed at compile-time. One vector is used to allocate the memory on the heap automatically, without resorting to bare pointers, operator new, etc.
#include <array>
#include <vector>
int main()
{
// vector [10][20][30][10000]
std::vector<std::array<std::array<std::array<bool, 10000>, 30>, 20>> tab4 (10);
tab4[1][12][23][4000] = true; // the elements are bool, so store true/false rather than 9999
}
I personally like std::, but sometimes there's too many of them. This can be achieved like this:
using std::array;
std::vector<array<array<array<bool, 10000>, 30>, 20>> tab4 (10);

Specifying the size of a vector in declaration vs using reserve [duplicate]

I know the size of a vector, which is the best way to initialize it?
Option 1:
vector<int> vec(3); //in .h
vec.at(0)=var1; //in .cpp
vec.at(1)=var2; //in .cpp
vec.at(2)=var3; //in .cpp
Option 2:
vector<int> vec; //in .h
vec.reserve(3); //in .cpp
vec.push_back(var1); //in .cpp
vec.push_back(var2); //in .cpp
vec.push_back(var3); //in .cpp
I guess, Option2 is better than Option1. Is it? Any other options?
Somehow, a non-answer answer that is completely wrong has remained accepted and most upvoted for ~7 years. This is not an apples and oranges question. This is not a question to be answered with vague cliches.
For a simple rule to follow:
Option #1 is faster...
...but this probably shouldn't be your biggest concern.
Firstly, the difference is pretty minor. Secondly, as we crank up the compiler optimization, the difference becomes even smaller. For example, on my gcc-5.4.0, the difference is arguably trivial when running level 3 compiler optimization (-O3).
So in general, I would recommend using method #1 whenever you encounter this situation. However, if you can't remember which one is optimal, it's probably not worth the effort to find out. Just pick either one and move on, because this is unlikely to ever cause a noticeable slowdown in your program as a whole.
These tests were run by sampling random vector sizes from a normal distribution, and then timing the initialization of vectors of these sizes using the two methods. We keep a dummy sum variable to ensure the vector initialization is not optimized out, and we randomize vector sizes and values to make an effort to avoid any errors due to branch prediction, caching, and other such tricks.
main.cpp:
/*
* Test constructing and filling a vector in two ways: construction with size
* then assignment versus construction of empty vector followed by push_back
* We collect dummy sums to prevent the compiler from optimizing out computation
*/
#include <iostream>
#include <vector>
#include "rng.hpp"
#include "timer.hpp"
const size_t kMinSize = 1000;
const size_t kMaxSize = 100000;
const double kSizeIncrementFactor = 1.2;
const int kNumVecs = 10000;
int main() {
for (size_t mean_size = kMinSize; mean_size <= kMaxSize;
mean_size = static_cast<size_t>(mean_size * kSizeIncrementFactor)) {
// Generate sizes from normal distribution
std::vector<size_t> sizes_vec;
NormalIntRng<size_t> sizes_rng(mean_size, mean_size / 10.0);
for (int i = 0; i < kNumVecs; ++i) {
sizes_vec.push_back(sizes_rng.GenerateValue());
}
Timer timer;
UniformIntRng<int> values_rng(0, 5);
// Method 1: construct with size, then assign
timer.Reset();
int method_1_sum = 0;
for (size_t num_els : sizes_vec) {
std::vector<int> vec(num_els);
for (size_t i = 0; i < num_els; ++i) {
vec[i] = values_rng.GenerateValue();
}
// Compute sum - this part identical for two methods
for (size_t i = 0; i < num_els; ++i) {
method_1_sum += vec[i];
}
}
double method_1_seconds = timer.GetSeconds();
// Method 2: reserve then push_back
timer.Reset();
int method_2_sum = 0;
for (size_t num_els : sizes_vec) {
std::vector<int> vec;
vec.reserve(num_els);
for (size_t i = 0; i < num_els; ++i) {
vec.push_back(values_rng.GenerateValue());
}
// Compute sum - this part identical for two methods
for (size_t i = 0; i < num_els; ++i) {
method_2_sum += vec[i];
}
}
double method_2_seconds = timer.GetSeconds();
// Report results as mean_size, method_1_seconds, method_2_seconds
std::cout << mean_size << ", " << method_1_seconds << ", " << method_2_seconds;
// Do something with the dummy sums that cannot be optimized out
std::cout << ((method_1_sum > method_2_sum) ? "" : " ") << std::endl;
}
return 0;
}
The header files I used are located here:
rng.hpp
timer.hpp
Both variants have different semantics, i.e. you are comparing apples and oranges.
The first gives you a vector of n value-initialized elements (zeros for int); the second only reserves the memory and does not create any elements.
Choose what better fits your needs, i.e. what is "better" in a certain situation.
The "best" way would be:
vector<int> vec = {var1, var2, var3};
available with a C++11 capable compiler.
Not sure exactly what you mean by doing things in a header or implementation files. A mutable global is a no-no for me. If it is a class member, then it can be initialized in the constructor initialization list.
Otherwise, option 1 would be generally used if you know how many items you are going to use and the default values (0 for int) would be useful.
Using at here means that you can't guarantee the index is valid. A situation like that is alarming itself. Even though you will be able to reliably detect problems, it's definitely simpler to use push_back and stop worrying about getting the indexes right.
In case of option 2, generally it makes zero performance difference whether you reserve memory or not, so it's simpler not to reserve*. Unless perhaps if the vector contains types that are very expensive to copy (and don't provide fast moving in C++11), or the size of the vector is going to be enormous.
* From Stroustrup's C++ Style and Technique FAQ:
People sometimes worry about the cost of std::vector growing
incrementally. I used to worry about that and used reserve() to
optimize the growth. After measuring my code and repeatedly having
trouble finding the performance benefits of reserve() in real
programs, I stopped using it except where it is needed to avoid
iterator invalidation (a rare case in my code). Again: measure before
you optimize.
While your examples are essentially the same, it may be that when the type used is not an int the choice is taken from you. If your type doesn't have a default constructor, or if you'll have to re-construct each element later anyway, I would use reserve. Just don't fall into the trap I did and use reserve and then the operator[] for initialisation!
Constructor
std::vector<MyType> myVec(numberOfElementsToStart);
int size = myVec.size();
int capacity = myVec.capacity();
In this first case, using the constructor, size and numberOfElementsToStart will be equal and capacity will be greater than or equal to them.
Think of myVec as a vector containing a number of items of MyType which can be accessed and modified; push_back(anotherInstanceOfMyType) will append it to the end of the vector.
Reserve
std::vector<MyType> myVec;
myVec.reserve(numberOfElementsToStart);
int size = myVec.size();
int capacity = myVec.capacity();
When using the reserve function, size will be 0 until you add an element to the vector, and capacity will be equal to or greater than numberOfElementsToStart.
Think of myVec as an empty vector which can have new items appended to it using push_back with no memory allocation for at least the first numberOfElementsToStart elements.
Note that push_back() still requires an internal check to ensure that size < capacity and to increment size, so you may want to weigh this against the cost of default construction.
List initialisation
std::vector<MyType> myVec{ var1, var2, var3 };
This is an additional option for initialising your vector, and while it is only feasible for very small vectors, it is a clear way to initialise a small vector with known values. size will be equal to the number of elements you initialised it with, and capacity will be equal to or greater than size. Modern compilers may optimise away the creation of temporary objects and prevent unnecessary copying.
Option 2 is better, as reserve only needs to reserve memory (3 * sizeof(T)), while the first option calls the constructor of the base type for each cell inside the container.
For C-like types it will probably be the same.
How it Works
This is implementation-specific; in general, however, the vector data structure internally holds a pointer to the memory block where the elements actually reside. Both GCC and VC++ allocate for 0 elements by default, so you can think of the vector's internal memory pointer as being nullptr by default.
When you call vector<int> vec(N); as in your Option 1, N objects are created using the default constructor. This is called the fill constructor.
When you call vec.reserve(N); after the default constructor, as in Option 2, you get a data block large enough to hold N elements, but no objects are created, unlike in Option 1.
Why to Select Option 1
If you know the number of elements the vector will hold and you might leave most of the elements at their default values, then you might want to use this option.
Why to Select Option 2
This option is generally the better of the two, as it only allocates the data block for future use and does not fill it with objects created by the default constructor.
Since it seems 5 years have passed and a wrong answer is still the accepted one, and the most-upvoted answer is completely useless (missed the forest for the trees), I will add a real response.
Method #1: we pass an initial size parameter into the vector, let's call it n. That means the vector is filled with n elements, which will be initialized to their default value. For example, if the vector holds ints, it will be filled with n zeros.
Method #2: we first create an empty vector. Then we reserve space for n elements. In this case, we never create the n elements and thus we never perform any initialization of the elements in the vector. Since we plan to overwrite the values of every element immediately, the lack of initialization will do us no harm. On the other hand, since we have done less overall, this would be the better* option.
* better - real definition: never worse. It's always possible a smart compiler will figure out what you're trying to do and optimize it for you.
Conclusion: use method #2.
In the long run, it depends on the usage and numbers of the elements.
Run the program below to understand how the compiler reserves space:
vector<int> vec;
for(int i=0; i<50; i++)
{
cout << "size=" << vec.size() << "capacity=" << vec.capacity() << endl;
vec.push_back(i);
}
size is the number of actual elements and capacity is the actual size of the array used to implement the vector.
On my computer, both are the same up to 10. But when size is 43, the capacity is 63. Depending on the number of elements, either may be better. For example, increasing the capacity may be expensive.
Another option is to Trust Your Compiler(tm) and do the push_backs without calling reserve first. It has to allocate some space when you start adding elements. Perhaps it does that just as well as you would?
It is "better" to have simpler code that does the same job.
I think answer may depend on situation. For instance:
Let's try to copy a simple vector to another vector. The vector holds an example class which contains only an integer. In the first example, let's use reserve.
#include <iostream>
#include <vector>
#include <algorithm>
class example
{
public:
// Copy constructor
example(const example& p1)
{
std::cout<<"copy"<<std::endl;
this->a = p1.a;
}
example(example&& o) noexcept
{
std::cout<<"move"<<std::endl;
this->a = o.a; // plain copy; swapping would read the still-uninitialized this->a
}
example(int a_)
{
std::cout<<"const"<<std::endl;
a = a_;
}
example()
{
std::cout<<"Def const"<<std::endl;
}
int a;
};
int main()
{
auto vec = std::vector<example>{1,2,3};
auto vec2 = std::vector<example>{};
vec2.reserve(vec.size());
auto dst_vec2 = std::back_inserter(vec2);
std::cout<<"transform"<<std::endl;
std::transform(vec.begin(), vec.end(),
dst_vec2, [](const example& ex){ return ex; });
}
For this case, transform will call copy and move constructors.
The output of the transform part:
copy
move
copy
move
copy
move
Now lets remove the reserve and use the constructor.
#include <iostream>
#include <vector>
#include <algorithm>
class example
{
public:
// Copy constructor
example(const example& p1)
{
std::cout<<"copy"<<std::endl;
this->a = p1.a;
}
example(example&& o) noexcept
{
std::cout<<"move"<<std::endl;
this->a = o.a; // plain copy; swapping would read the still-uninitialized this->a
}
example(int a_)
{
std::cout<<"const"<<std::endl;
a = a_;
}
example()
{
std::cout<<"Def const"<<std::endl;
}
int a;
};
int main()
{
auto vec = std::vector<example>{1,2,3};
std::vector<example> vec2(vec.size());
auto dst_vec2 = std::back_inserter(vec2);
std::cout<<"transform"<<std::endl;
std::transform(vec.begin(), vec.end(),
dst_vec2, [](const example& ex){ return ex; });
}
And in this case transform part produces:
copy
move
move
move
move
copy
move
copy
move
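As can be seen, for this specific case reserve prevents the extra move operations, because there are no existing elements to relocate when the vector grows. Note also that the two programs are not equivalent: in the second one, back_inserter appends after the three default-constructed elements, so vec2 ends up holding six elements instead of three.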

Create a list of pointers to random generated int

I'm having some difficulty in generating a random int* and store it into a list<int*>.
I have tried the following:
std::list<int*> generateInt(){
std::list<int*> randomInt;
int i = 0;
// initialize random seed
srand (time(NULL));
while (i < 5){
int* random = (int*)std::rand();
std::cout << "Random int generated: " << random << std::endl;
randomInt.push_back(random);
i++;
}
return randomInt;
}
But I get compiler issue as following
error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
int* random = (int*)std::rand();
^
I'm not sure if i'm missing something important here?
Any help or advice would be very appreciated. Thanks!
The easiest way I can think of is:
std::vector<int> generateInt(){
std::vector<int> randomInt;
int i = 0;
// initialize random seed
// srand (time(NULL)); Let that be the first line in main()
while (i < 5){
int random = std::rand();
std::cout << "Random int generated: " << random << std::endl;
randomInt.push_back(random);
i++;
}
return randomInt;
}
There's no need to introduce pointers or std::list at all.
Don't use raw pointers for this task; std::list manages its own allocation :) If you do need pointers, use std::unique_ptr or std::shared_ptr.
If you need an int pointer with a random value, use
int* p = new int{rand()};
Use the Mersenne Twister (mt19937) to generate the random values (http://www.cplusplus.com/reference/random/mt19937/) and see the comments in the code block:
#include <iostream>
#include <list>
#include <random>
#include <functional>
namespace test {
std::list<int*> __attribute__((noinline))
foo() noexcept{
// Mersenne Twister generator seeded from std::random_device,
// which avoids seeding from the clock
std::mt19937_64 generator(std::random_device{}());
// distribution that maps the generator output to an int in [0, 256]
std::uniform_int_distribution<> uid(0, 256);
// bind distribution and generator into a single callable
auto rand_gen = std::bind(uid, generator);
std::list<int*> lst;
for (int i = 0; i < 5; ++i) {
// emplace_back constructs the list element in place
try {
lst.emplace_back(new int{rand_gen()});
} catch (std::bad_alloc& ba) {
std::cerr << "bad_alloc caught: " << ba.what() << std::endl;
}
}
return lst; // a local is moved automatically; std::move here would only inhibit copy elision
}
}
int main() {
std::list<int*> lst{test::foo()};
for(const auto& val : lst) {
std::cout << *val << std::endl;
delete val;
}
return 0;
}
Or use /dev/urandom if you are a Linux user :)
A note: You would not do this in real life. You would use a container of int, possibly std::list but more likely std::vector, not pointers to int. This is probably an assignment to make the learner struggle for a while with pointers.
So let's deal in pointers and do this the hard way.
int* random = (int*)std::rand();
means Get me a random number and store the random number as a memory address. What does this address point to? Anybody's guess. It's random. That makes it a bad idea.
A pointer must point to a valid object of the same type (or has an is-a relationship with the pointer's type) or it should be pointed at a "safe" parking location like nullptr until it can be pointed at a valid object. There are exceptions to this, but you'll learn those later.
Given
I'm implementing an assignment question that requires to store a list of pointers to random integers.
std::list<int*> randomInt;
Is probably correct. But... You still need an int to point at, so
Get a valid integer,
store the random number as an integer in that integer, and
store a pointer to that integer in the list.
So how do you get a valid integer? Obviously you need more than one so,
int number; // easy, but not enough numbers.
is out. All of randomInt would point at the same place and only the last value generated would be stored.
If you don't know how many numbers you're getting you need dynamic storage and should use new
int * random = new int; // get an int from dynamic memory
*random = std::rand(); // generate random number and assign to int pointed at by random
(or a smart pointer if they are available to you). Remember everything you new you will have to delete
for (int * p:randomInt) // for all pointers in randomInt
{
delete p; // free the memory (and other stuff you'll cover later)
}
after you are finished using it to avoid memory leaks. This is disturbingly error-prone, it is sickeningly easy to have a path that misses the delete or deletes while the allocation is still needed, so avoid this in real life. Use Automatic allocation, containers and smart pointers. If you can't, embrace the power of RAII.
Where possible avoid having to mess around with managing dynamic memory, smart pointer managed or otherwise, because dynamic memory always has a cost.
In this case it looks like a maximum of five, so you can make an array
int numbers[5];
(but NOT a local variable
std::list<int*> generateInt(){
int numbers[5]; // do not use!!! FATAL!!!
as it would go out of scope and vanish at the end of the function, leaving the program with a list full of pointers to invalid objects) and point at the elements of the array. For a simple program, you could get away with a static local variable
std::list<int*> generateInt(){
static int numbers[5];
or a global variable
int numbers[5];
std::list<int*> generateInt(){
but what if the function is called more than once? The second call would destroy the results of the first call. This may be tolerable, but the responsibility for making this call and guaranteeing the program works as expected falls on the programmer.
My suspicion is the Asker is intended to use new. Check with whomever assigned the problem to see what other options they will accept.

How can I make my dynamic array or vector operate at a similar speed to a standard array? C++

I'm still quite inexperienced in C++ and I'm trying to write sum code to add numbers precisely. This is a DLL plugin for some finite difference software, and the code is called several million times during a run. I want to write a function where any number of arguments can be passed in and the sum will be returned. My code looks like:
#include <cstdarg>
double SumFunction(int numArgs, ...){ // this allows me to pass any number
// of arguments to my function.
va_list args;
va_start(args,numArgs); //necessary prerequisites for using cstdarg
double myarray[10];
for (int i = 0; i < numArgs; i++) {
myarray[i] = va_arg(args,double);
} // I imagine this is sloppy code; however I cannot create
// myarray[numArgs] because numArgs is not a const int.
sum(myarray); // The actual method of addition is not relevant here, but
//for more complicated methods, I need to put the summation
// terms in a list.
vector<double> vec(numArgs); // instead, place all values in a vector
for (int i = 0; i < numArgs; i++) {
vec.at(i) = va_arg(args,double);
}
sum(vec); //This would be passed by reference, of course. The function sum
// doesn't actually exist, it would all be contained within the
// current function. This is method is twice as slow as placing
//all the values in the static array.
double *vec;
vec = new double[numArgs];
for (int i = 0; i < (numArgs); i++) {
vec[i] = va_arg(args,double);
}
sum(vec); // Again half of the speed of using a standard array and
// increasing in magnitude for every extra dynamic array!
delete[] vec;
va_end(args);
}
So the problem I have is that using an oversized static array is sloppy programming, but using either a vector or a dynamic array slows the program down considerably. So I really don't know what to do. Can anyone help, please?
One way to speed the code up (at the cost of making it more complicated) is to reuse a dynamic array or vector between calls, then you will avoid incurring the overhead of memory allocation and deallocation each time you call the function.
For example declare these variables outside your function either as global variables or as member variables inside some class. I'll just make them globals for ease of explanation:
double* sumArray = NULL;
int sumArraySize = 0;
In your SumFunction, check if the array exists and if not allocate it, and resize if necessary:
double SumFunction(int numArgs, ...){ // this allows me to pass any number
// of arguments to my function.
va_list args;
va_start(args,numArgs); //necessary prerequisites for using cstdarg
// if the array has already been allocated, check if it is large enough and delete if not:
if((sumArray != NULL) && (numArgs > sumArraySize))
{
delete[] sumArray;
sumArray = NULL;
}
// allocate the array, but only if necessary:
if(sumArray == NULL)
{
sumArray = new double[numArgs];
sumArraySize = numArgs;
}
double *vec = sumArray; // set to your array, reusable between calls
for (int i = 0; i < (numArgs); i++) {
vec[i] = va_arg(args,double);
}
double result = sum(vec, numArgs); // you will need to pass the array size
va_end(args);
// note no array deallocation
return result; // the function is declared to return double
}
The catch is that you need to remember to deallocate the array at some point by calling a function similar to this (like I said, you pay for speed with extra complexity):
void freeSumArray()
{
if(sumArray != NULL)
{
delete[] sumArray;
sumArray = NULL;
sumArraySize = 0;
}
}
You can take a similar (and simpler/cleaner) approach with a vector, allocate it the first time if it doesn't already exist, or call resize() on it with numArgs if it does.
When using a std::vector the optimizer must consider that relocation is possible and this introduces an extra indirection.
In other words the code for
v[index] += value;
where v is for example a std::vector<int> is expanded to
int *p = v._begin + index;
*p += value;
i.e. from vector you need first to get the field _begin (that contains where the content starts in memory), then apply the index, and then dereference to get the value and mutate it.
If the code performing the computation on the elements of the vector in a loop calls any unknown non-inlined code, the optimizer is forced to assume that unknown code may mutate the _begin field of the vector and this will require doing the two-steps indirection for each element.
(NOTE: that the vector is passed with a const std::vector<T>& reference is totally irrelevant: a const reference doesn't mean that the vector is const, but simply puts a limitation on what operations are permitted using that reference; external code could have a non-const reference to access the vector, and constness can also be legally cast away... constness of references is basically ignored by the optimizer).
One way to remove this extra lookup (if you know that the vector is not being resized during the computation) is to cache this address in a local and use that instead of the vector operator [] to access the element:
int *p = &v[0];
for (int i=0,n=v.size(); i<n; i++) {
/// use p[i] instead of v[i]
}
This will generate code that is almost as efficient as a static array because, given that the address of p is not published, nothing in the body of the loop can change it and the value p can be assumed constant (something that cannot be done for v._begin as the optimizer cannot know if someone else knows the address of _begin).
I'm saying "almost" because a static array only requires indexing, while using a dynamically allocated area requires "base + indexing" access; most CPUs however provide this kind of memory access at no extra cost. Moreover if you're processing elements in sequence the indexing addressing becomes just a sequential memory access but only if you can assume the start address constant (i.e. not in the case of std::vector<T>::operator[]).
Assuming that the "max storage ever needed" is in the order of 10-50, I'd say using a local array is perfectly fine.
Using vector<T> will use 3 * sizeof(T*) (at least) to track the contents of the vector. Compared with an array double arr[10];, the vector object therefore occupies about 7 fewer doubles' worth of stack (or 8.5 fewer in a 32-bit build). But you also need a call to new, which takes a size argument. So that takes up AT LEAST one, more likely 2-3 elements of stack space, and the implementation of new is quite possibly not straightforward, so further calls are needed, which take up further stack space.
If you "don't know" the number of elements, and need to cope with quite large numbers of elements, then use a hybrid solution: keep a small stack-based local array, and if numArgs > small_size fall back to a vector, passing vec.data() to the function sum.

Why might std::vector be faster than a raw dynamically allocated array?

As the result of a discussion with a colleague, I ended up writing benchmarks to test std::vector vs raw dynamically allocated arrays, and ended up with a surprise.
My tests are as follows:
#include "testconsts.h" // defines NUM_INTS across all tests
#include <vector>
int main()
{
const int numInts = NUM_INTS;
std::vector<int> intVector( numInts );
int * const intArray = new int[ numInts ];
++intVector[0]; // force access to affect optimization
++intArray[0]; // force access to affect optimization
for( int i = 0; i < numInts; ++i )
{
++intArray[i];
}
delete[] intArray;
return 0;
}
and:
#include "testconsts.h" // defines NUM_INTS across all tests
#include <vector>
int main()
{
const int numInts = NUM_INTS;
std::vector<int> intVector( numInts );
int * intArray = new int[ numInts ];
++intArray[0]; // force access to affect optimization
++intVector[0]; // force access to affect optimization
for( int i = 0; i < numInts; ++i )
{
++intVector[i];
}
delete[] intArray;
return 0;
}
They are compiled with g++ -O3 with gcc 4.4.3
The results of multiple runs of benchmarking using time are similar to:
Array:
real 0m0.757s
user 0m0.176s
sys 0m0.588s
Vector:
real 0m0.572s
user 0m0.268s
sys 0m0.304s
Three things are clear:
Array is faster in user time
Vector uses less system time
Overall, vector won this fight
The question is "why?".
The system time issue I'm guessing must have to do with page faults, but I can't describe for myself exactly why one would have significantly more page faults.
As for the user time issue, it's less interesting to me, but I'm still curious of opinions on that as well. I had imagined it had something to do with initialization, though I'm not passing an initialization value to the vector constructor so I don't know.
The difference is not in the performance of the vector compared to the dynamic array, but in the number of accesses to memory that you perform.
Effectively, in the vector test you are re-accessing cached memory, while in the array version you don't. You pay the price of caching the vector version in either case.
In the vector test, you allocate the dynamic memory for the array but leave it untouched, with the memory never being touched there are no page faults due to that operation. The vector is created, initialized and then the second pass will be accessing already cached data (if the size fits the cache, if it does not, it will not be in cache, but the same cost will be incurred in both versions).
On the other hand, when testing the array, the vector constructor still initializes the vector's elements. That means that in the case where you are trying to profile the behavior of the array, both the vector contents and the array elements are walked over: double the number of memory accesses, page faults and memory used by the application.
You can try modifying the code so that the dynamic allocation is performed like this:
int * intArray = new int[ numInts ](); // note extra ()
This will value-initialize the whole array; alternatively, initialize the array contents some other way. The results of running that modified version of the test should then be similar.
Have you run the test more than once? Benchmarking is a hard process, and has to rely on averages to get any kind of meaningful result; it's possible that at the time you were running your array benchmark a few CPU cycles were dedicated to something else, slowing it down. I would expect that given enough results they would be similar, as std::vector is written with a C-style array at its core.