Newbie in vectors in C++

I am trying to understand how vectors work. From what I've read, they are a class that can be used like an array, with many helpful member functions for handling its elements. So I've tried creating a vector of a class A which itself contains a vector of a class B.
Here is the code:
#include <iostream>
#include <vector>
using namespace std;

class B
{
public:
    B()
    {}
    void print()
    {
        cout << "The mighty ";
    }
    ~B()
    {}
};

class A
{
    B b;
    vector<B> Blist;
public:
    A()
    {
        cout << "An A!" << endl;
    }
    void pushb()
    {
        Blist.push_back(b);
    }
    void printb()
    {
        Blist[7].print();
    }
    void print()
    {
        cout << "Kass Company" << endl;
    }
    ~A()
    {}
};

int main(void)
{
    vector<A> Alist;
    A a, b, c;
    Alist.push_back(a);
    Alist[1].printb();
    Alist[1].print();
    return 0;
}
Well, my problem is that... it works fine. If vectors work like arrays, shouldn't the first object that gets pushed back go to position 0 of the vector? As a result, shouldn't the program fail to run, since there is no object at Alist[1] or Blist[7]?
Thanks in advance!

Well, my problem is that... it works fine
Well, in fact it shouldn't, since you're accessing both Alist and A's Blist out of bounds.
If vectors work like arrays, shouldn't the first object that gets pushed back go to position 0 of the vector?
The std::vector<T>::push_back function appends an element to the end of the vector, so the pushed-back element is given the index size() - 1 after the push, i.e. the old size().
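For instance, a minimal sketch (the values here are made up for illustration):
#include <cassert>
#include <vector>

int main()
{
    std::vector<int> v;                // size() == 0
    v.push_back(42);                   // lands at index 0, the old size()
    assert(v.size() == 1);
    assert(v[v.size() - 1] == 42);     // size() - 1 is the new element's index
    return 0;
}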
Check your bounds
When using std::vector, you are responsible for checking the bounds you're trying to access. You can use std::vector<T>::size() for this check, or the function std::vector<T>::at(size_t), as said by Jarod42.
See the STL documentation for more information: http://www.cplusplus.com/reference/vector/.
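As a hedged sketch of such a size() check (not part of the original program):
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v{10, 20, 30};
    std::size_t i = 7;                          // deliberately out of range
    if (i < v.size())
        std::cout << v[i] << "\n";
    else
        std::cout << "index " << i << " is out of bounds\n";
    return 0;
}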
Why it seems to work anyway
You're stumbling across undefined behavior, but still, it seems to work fine. Why?
Well, internally the vector holds a pointer to dynamically allocated memory, holding the vector contents. The class encapsulates all the nasty memory management (calling new, delete, resizing the array, etc.).
When you're calling std::vector<T>::operator[](size_t), for example by writing Alist[1], it simply boils down to dereferencing the internal array at the given index (without bounds checking).
Using a bad index, you end up reading memory past the end of the allocated region that does not contain any meaningful data and is probably either uninitialized or zeroed out. In conclusion, when you're doing Alist[1], you're getting some garbage memory interpreted as an A instance.
Now why doesn't Alist[1].print() crash? Because the function A::print() does not use any of the class members, so calling a->print() never actually touches a's contents.
You can verify this using this program (please don't actually use this, it is just intended for this demonstration) :
unsigned int foo = 0xDEADBEEF;
A& z = *reinterpret_cast<A*>(&foo);
z.print();
This code simply uses the memory occupied by the integer value foo as an A instance (much like you're using uninitialized memory when accessing the vector out of bounds), and calls the A::print() function.
You can try this for yourself; it works as expected! This is because this member function does not need the actual memory contents of the instance, and will run whether or not z refers to garbage.
How to debug and check this program
Use valgrind (http://valgrind.org/). Definitely.
Using valgrind's memcheck, you can track down invalid reads and writes (as well as other memory related stuff) :
you$ valgrind --tool=memcheck ./a.out
==1785== Memcheck, a memory error detector
==1785== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1785== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==1785== Command: ./a.out
==1785==
An A!
An A!
An A!
==1785== Invalid read of size 8
==1785== at 0x400F14: std::vector<B, std::allocator<B> >::operator[](unsigned long) (stl_vector.h:771)
==1785== by 0x400E02: A::printb() (main.c:34)
==1785== by 0x400C0D: main (main.c:51)
==1785== Address 0x5a12068 is 8 bytes after a block of size 32 alloc'd
==1785== at 0x4C28965: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1785== by 0x4022E5: __gnu_cxx::new_allocator<A>::allocate(unsigned long, void const*) (new_allocator.h:104)
==1785== by 0x401D20: std::_Vector_base<A, std::allocator<A> >::_M_allocate(unsigned long) (in /home/amonti/.local/share/people/temp/a.out)
==1785== by 0x4013F8: std::vector<A, std::allocator<A> >::_M_insert_aux(__gnu_cxx::__normal_iterator<A*, std::vector<A, std::allocator<A> > >, A const&) (vector.tcc:345)
==1785== by 0x401017: std::vector<A, std::allocator<A> >::push_back(A const&) (stl_vector.h:913)
==1785== by 0x400BF4: main (main.c:50)
==1785==
The mighty Kass Company
==1785==
==1785== HEAP SUMMARY:
==1785== in use at exit: 0 bytes in 0 blocks
==1785== total heap usage: 1 allocs, 1 frees, 32 bytes allocated
==1785==
==1785== All heap blocks were freed -- no leaks are possible
==1785==
==1785== For counts of detected and suppressed errors, rerun with: -v
==1785== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 3 from 3)
In this trace valgrind detects an invalid read (of size 8, because you're reading a pointer on a 64-bit platform) at main.c:34:
Blist[7].print();
So you can verify that you're doing something wrong.

In your case the output may be garbage because a vector is a dynamic array that expands itself (by an implementation-defined growth policy) when it runs out of free space.
For example, a vector might start with room for 10 elements; when the 10th slot is filled it grows to 20, and at that stage vec[11] holds a garbage value.
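You can watch this growth yourself; a small sketch (the printed capacities are implementation-defined, so the exact numbers are only an assumption):
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    for (int i = 0; i < 20; ++i)
    {
        v.push_back(i);
        // capacity() is how many elements fit before the next reallocation;
        // only indices in [0, size()) hold valid data
        std::cout << "size=" << v.size()
                  << " capacity=" << v.capacity() << "\n";
    }
    return 0;
}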

This is exactly why you're supposed to use vector::at() instead of vector::operator[] when you're testing/writing your program for the first time.
You can use macros and preprocessor defines to declare that you're compiling for debug, such as:
#ifdef THISISDEBUG
    return myvec.at(5);
#else
    return myvec[5];
#endif
Then you tell your Makefile to define THISISDEBUG when you're building for debugging/testing.
The difference between at() and operator[] is that at() throws an exception if the index is out of range, while operator[] accesses memory directly.
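A minimal sketch of that difference; at() throws std::out_of_range, while the commented-out operator[] line would be undefined behavior:
#include <iostream>
#include <stdexcept>
#include <vector>

int main()
{
    std::vector<int> v(3);
    try
    {
        v.at(5) = 1;                   // bounds-checked: throws
    }
    catch (const std::out_of_range& e)
    {
        std::cout << "caught: " << e.what() << "\n";
    }
    // v[5] = 1;                       // unchecked: UB, may silently "work"
    return 0;
}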
In C++ you can often get away with reading memory that is mapped into your process even if your program doesn't own it (at least on Windows and Linux), but you're only allowed to write into places that belong to your program. Your operating system protects you! Imagine you did what you did up there and tried to modify something that doesn't belong to your program. Back in the 80s and 90s this would have been accepted and could have led to a blue screen. Now, your operating system raises a SEGFAULT.
On the other hand, the reason why you're seeing a result there is that freeing an object doesn't necessarily mean resetting the values in memory. It just means your program tells the allocator: "look, I don't need this region of memory anymore." That region can then be handed out again, so if you read it you'll get whatever stale bytes were left behind: garbage, which is exactly what this is technically called. Like when you do:
double x;
std::cout << x << std::endl;
What is the value that will be printed? It's garbage: the remnant of whatever previously used that memory.

Basically, vector is an array-like class.
vector<string> arr;       // declares a dynamic array of strings
vector<MyClass*> arrw;    // declares a dynamic array of pointers to my class
A vector is useful when you don't know in advance how many elements you will need, for example when reading lines from a file. To add a new element you can use arr.insert(arr.end(), ""); and arrw.insert(arrw.end(), new MyClass); (I like this better than push_back, because you can insert at any position in the vector.)
You can access an element the same way as with an array:
arr[2];
It's also useful to know some tricks, like accessing the last element: arr[arr.size() - 1]. (arr.size() returns the number of elements in the array, and subtracting 1 turns that count into the last valid index; otherwise you would read out of bounds.)
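A minimal sketch putting those operations together (the data here is made up for illustration):
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> arr;
    arr.insert(arr.end(), "first");            // same effect as push_back
    arr.insert(arr.begin(), "zeroth");         // insert works at any position
    std::cout << arr[arr.size() - 1] << "\n";  // last element by index
    std::cout << arr.back() << "\n";           // same element, more directly
    return 0;
}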
P.S. There is no real difference between the vector class and a plain array, except these methods that let you add new elements when you don't know how big your array will be.

Related

Memory leaked when globally declared?

Is this code still going to leak if, instead of declaring the pointers inside main, I declare them globally?
I tested with valgrind memcheck and it doesn't report a leak.
class Test1 {
public:
    Test1() { std::cout << "Constructor of Test " << std::endl; }
    ~Test1() { std::cout << "Destructor of Test " << std::endl; }
};

// Memory leaked or not when globally declared?
// Test1 *t1;
// Test1 *t2;
// Test1 *t;

int main()
{
    // mem will leak if not deallocated later
    Test1 *t1;
    Test1 *t2;
    Test1 *t;
    try {
        t1 = new Test1[100];
        t2 = new Test1;
        t  = new Test1;
        throw 10;
    }
    catch (int i)
    {
        std::cout << "Caught " << i << std::endl;
        // delete []t1;
        // delete t;
        // delete t2;
    }
    return 0;
}
Declaring the variable global will make the pointer variable global, not what the pointer points to (which is already global as it is located on the heap).
Therefore, your current implementation also has a leak.
Local variables get destroyed when they go out of scope, but what they point to is not automatically destroyed. Suggestion: forget the new and delete operators completely and use STL containers or smart pointers.
Edit: You are asking why valgrind does not detect it; this is a different question than the original (I edited to add a tag).
Right now you're always leaking memory, regardless of whether you declare the pointers in main or globally.
Whenever you use new in your code, you need to use a delete or delete[].
In modern C++, using raw new is considered bad practice: you should use std::vector if you want an array, or std::unique_ptr if you're managing a pointer to a single object.
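For illustration, a sketch of the same program rewritten with RAII containers (this is an assumption about the intent, since the original only demonstrates the leak); nothing leaks even when the exception is thrown, because destructors run during stack unwinding:
#include <iostream>
#include <memory>
#include <vector>

class Test1 {
public:
    Test1() { std::cout << "Constructor of Test " << std::endl; }
    ~Test1() { std::cout << "Destructor of Test " << std::endl; }
};

int main()
{
    try
    {
        std::vector<Test1> t1(100);           // replaces new Test1[100]
        auto t2 = std::make_unique<Test1>();  // replaces new Test1
        auto t  = std::make_unique<Test1>();  // (requires C++14)
        throw 10;                             // destructors still run
    }
    catch (int i)
    {
        std::cout << "Caught " << i << std::endl;
    }
    return 0;
}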
As was mentioned already in other answers, the allocated objects' destructors will not be called in either variant of your program; the scope and lifetime of the pointers do not influence what happens to the pointees. But you showed that already by printing in the destructor.
Valgrind will report this slightly differently however.
I ran it with a shorter array of 2 elements to reduce the amount of output.
The heap summary, which tells you what data remains on the heap at the end of the run, is the same for both programs:
==397== HEAP SUMMARY:
==397== in use at exit: 12 bytes in 3 blocks
==397== total heap usage: 6 allocs, 3 frees, 76,944 bytes allocated
That means both programs never deallocated the objects.
Valgrind does, however, distinguish between "definitely lost" allocations, where no reference to the memory blocks remains in any variable, and "still reachable" allocations, where a reference remains.
The leak summary with local pointers
==397== LEAK SUMMARY:
==397== definitely lost: 12 bytes in 3 blocks
==397== indirectly lost: 0 bytes in 0 blocks
==397== possibly lost: 0 bytes in 0 blocks
==397== still reachable: 0 bytes in 0 blocks
==397== suppressed: 0 bytes in 0 blocks
The leak summary with global pointers
==385== LEAK SUMMARY:
==385== definitely lost: 0 bytes in 0 blocks
==385== indirectly lost: 0 bytes in 0 blocks
==385== possibly lost: 0 bytes in 0 blocks
==385== still reachable: 12 bytes in 3 blocks
==385== of which reachable via heuristic:
==385== length64 : 10 bytes in 1 blocks
If the pointers are local, valgrind can be sure that no reference remains, because after main returns, the stack locations are no longer valid.
If the pointers are global, they remain valid and could thus still be used or deallocated.
Why does valgrind make this distinction?
Especially in historic C programs it may be considered legitimate to allocate some memory once and use it throughout the execution, without bothering to later free the memory. The operating system will clean up the whole virtual memory space of the program anyway once the program exits. So while this could be a bug, it could also be intentional.
If you are interested in such leaks, valgrind itself tells you how it must be called to see them:
==405== Reachable blocks (those to which a pointer was found) are not shown.
==405== To see them, rerun with: --leak-check=full --show-leak-kinds=all
"Definitely lost" memory is always suspicious, however, and that is why valgrind distinguished the cases. The value of a tool like valgrind lies in its precision. It is not sufficient to report many actual errors, in order to be useful it must also strive to produce a low number of false positives, otherwise looking at the reports would too often be a waste of developer time.
In modern C++ there are not many excuses for leaking memory: std::unique_ptr should be the way to allocate dynamic objects, std::vector should be used for dynamic arrays, and local objects should be used wherever possible, as the compiler never forgets a deallocation. Even for singletons, the noise in the output of tools like valgrind and address sanitizer usually outweighs the minuscule benefit of saving one destructor call or deallocation.

Why do valgrind and gdb point to different lines of code? Or: How to malloc() and free() pointer of pointer in loop?

The loop in the following code runs a few times but then it crashes.
#include <cstdlib>
#include <iomanip>
#include <string>   // for stoi
using namespace std;

int main(int argc, char *argv[])
{
    // not needed, but the program will not crash if I remove it
    int blocksize = stoi(argv[1]);
    // typical value 70-100
    int min_length = stoi(argv[2]);
    for (int i = 0; i < 250000; i++)
    {
        // Allocate memory for new integer array[row][col].
        // First allocate the memory for the top-level array (rows).
        int **output_std = (int**) malloc(20*sizeof(int*));
        // Allocate a contiguous chunk of memory for the array data values.
        output_std[0] = (int*) malloc(min_length*20*sizeof(int));
        // Set the pointers in the top-level (row) array to the correct
        // memory locations in the data value chunk.
        for (int k = 1; k < 20; k++)
        {
            output_std[k] = output_std[0] + k*min_length;
        }
        // do something with output_std
        // free malloc'd space
        free(output_std[0]);
        for (int k = 0; k < 20; k++)
        {
            output_std[i] = NULL;
        }
        free(output_std);
        output_std = NULL;
    }
    return 0;
}
Debugging with GDB points to line 36: free(output_std);.
Debugging with valgrind yields the following error:
==32161== Invalid write of size 8
==32161== at 0x4031A0: main (test.cpp:31)
==32161== Address 0x82f2620 is 0 bytes after a block of size 160 alloc'd
==32161== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==32161== by 0x403159: main (test.cpp:16)
Line 16 is: int **output_std = (int**) malloc(20*sizeof(int*));
Line 31 is: free(output_std[0]);
Why do I get different positions for the error in my code?
How to proceed in such a situation?
(How can I fix my code?)
Edit: The lines are correct. I need such an object for a third party library.
Valgrind can often detect the problem earlier. That's the point of using it. Valgrind often catches the origin of the problem (or gets closer to the origin), while in GDB you can only see the consequences.
In your case the origin of the problem is heap memory corruption caused by an out-of-bounds write into an array. The consequence is a crash inside free caused by that heap corruption. Valgrind catches the former; when you run your program (e.g. under GDB) you can only see the latter.
In your code
for (int k = 0; k < 20; k++)
{
    output_std[i] = NULL;
}
the intended iteration variable is k. But you are accessing your array at i. At this point i is apparently 20, which results in out-of-bounds access caught by valgrind.
I'd say this loop is rather pointless anyway: you are trying to zero out memory that you are about to deallocate immediately afterwards. One can make some arguments as to why it might make sense... but things like this are more appropriate inside the memory deallocation function itself, in a debug version of the library. In user-level code it just clutters the program with unnecessary noise.
P.S. In any case, you apparently posted invalid line numbers. If free(output_std) is line 36, then the offending line should be seen by valgrind as 34, not 31. Please, next time post accurate code and strive to accurately identify the offending lines.
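As an aside, if the third-party library only needs to see an int**, the same row-pointer layout can be built on std::vector so that nothing has to be freed by hand; a sketch under that assumption:
#include <vector>

int main()
{
    const int rows = 20;
    const int cols = 100;                   // plays the role of min_length
    std::vector<int> data(rows * cols);     // one contiguous chunk of values
    std::vector<int*> rowptrs(rows);        // top-level row pointers
    for (int k = 0; k < rows; ++k)
        rowptrs[k] = data.data() + k * cols;
    int** output_std = rowptrs.data();      // hand this to the library
    (void)output_std;                       // ...do something with it here
    return 0;                               // both vectors free themselves
}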
valgrind replaces memory allocation and deallocation functions with its own instrumented versions. You can see that in the output:
at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
This is why, under valgrind, the application may crash in a different place.

Memory leak on deallocating char * set by strcpy?

I have a memory leak detection tool which tells me the code below is leaking 100 bytes:
#include <cstring>   // for strcpy
#include <string>
#include <iostream>

void setStr(char ** strToSet)
{
    strcpy(*strToSet, "something!");
}

void str(std::string& s)
{
    char* a = new char[100]();
    setStr(&a);
    s = a;
    delete[] a;
}

int main()
{
    std::string s1;
    str(s1);
    std::cout << s1 << "\n";
    return 0;
}
According to this (point number 3), it is leaking the amount I allocated (100) minus the length of "something!" (10), so I should be leaking 90 bytes.
Am I missing something here or it is safe to assume the tool is reporting wrong?
EDIT: setStr() is in a library and I cannot see the code, so I guessed what it is doing. It could be that it is allocating "something!" on the heap; what about that scenario? Would we have a 90-byte leak or 100?
This code does not leak, and it is not the same as point number 3, since you never overwrite the variable storing the pointer to the allocated memory. The potential problems with this code are that it is vulnerable to buffer overflow if setStr writes more than 99 characters, and it is not exception-safe: if s = a; throws, then delete[] a; won't be called and the memory will leak.
Updated: If setStr allocates a new string and overwrites the initial pointer value, then the pointer to the 100-byte buffer you allocated is lost and those 100 bytes leak. You should initialize a with nullptr prior to passing it to setStr, and check that it is not null after setStr returns, so the assignment s = a; won't cause a null pointer dereference.
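A sketch of that defensive pattern; the setStr stub below is hypothetical (the real one is hidden in the library), and which deallocation is correct depends on the library's documentation:
#include <cstring>
#include <iostream>
#include <string>

// Hypothetical stand-in for the library function: this version allocates
// a new buffer and overwrites the caller's pointer.
void setStr(char** strToSet)
{
    *strToSet = new char[sizeof("something!")];
    std::strcpy(*strToSet, "something!");
}

void str(std::string& s)
{
    char* a = nullptr;       // don't pre-allocate a buffer the library may drop
    setStr(&a);
    if (a != nullptr)        // guard against a null result
    {
        s = a;
        delete[] a;          // correct only if the library really used new[]
    }
}

int main()
{
    std::string s1;
    str(s1);
    std::cout << s1 << "\n";
    return 0;
}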
Summing up all the comments, it is clear what the problem is. The library you are using is requesting a char **. This is a common interface pattern for C functions that allocate memory and return a pointer to that memory, or that return a pointer to memory they own.
The memory you are leaking is allocated in the line char* a = new char[100]();. Because setStr is changing the value of a, you can no longer deallocate that memory.
Unfortunately, without the documentation, we cannot deduce what you are supposed to do with the pointer.
If it is from a call to new[] you need to call delete[].
If it is from a call to malloc you need to call std::free.
If it is a pointer to memory owned by the library, you should do nothing.
You need to find the documentation for this. If it is not available, you can try running your memory leak detection tool after removing the new statement and see if it still detects a leak. I'm not sure whether it will be reliable with memory allocated inside a library function, but it is worth a try.
Finally, regarding the question in your edit: if you leak memory, you leak the whole allocation, unless you do something that is undefined behavior, which is pointless to discuss anyway. If you new 100 chars and then write some data into them, that doesn't change the amount of memory leaked. It will still be 100 * sizeof(char).

delete[] operator causes segmentation fault in very simple case

I have a very strange segmentation fault that occurs when I call delete[] on a dynamic array (created with the new keyword). At first it occurred when I deleted a global pointer, but it also happens in the following very simple case, where I delete[] arr:
int main(int argc, char * argv [])
{
    double * arr = new double [5];
    delete[] arr;
}
I get the following message:
*** Error in `./energy_out': free(): invalid next size (fast): 0x0000000001741470 ***
Aborted (core dumped)
Apart from the main function, I define some fairly standard functions, as well as the following (defined before the main function):
vector<double> cos_vector()
{
    vector<double> cos_vec_temp = vector<double>(int(2*pi()/trig_incr));
    double curr_val = 0;
    int curr_idx = 0;
    while (curr_val < 2*pi())
    {
        cos_vec_temp[curr_idx] = cos(curr_val);
        curr_idx++;
        curr_val += trig_incr;
    }
    return cos_vec_temp;
}
const vector<double> cos_vec = cos_vector();
Note that the return value of cos_vector, cos_vec_temp, gets assigned to the global variable cos_vec before the main function is called.
The thing is, I know what causes the error: cos_vec_temp should be one element bigger, as cos_vec_temp[curr_idx] ends up accessing one element past the end of the vector cos_vec_temp. When I make cos_vec_temp one element larger at its creation, the error does not occur. But I do not understand why it occurs at the delete[] of arr. When I run gdb and set a breakpoint in the main function just after the creation of arr, I get the following output when examining the variables:
(gdb) p &cos_vec[6283]
$11 = (__gnu_cxx::__alloc_traits<std::allocator<double> >::value_type *) 0x610468
(gdb) p arr
$12 = (double *) 0x610470
In the first gdb command, I show the memory location of the element just past the end of the cos_vec vector, which is 0x610468. The second gdb command shows the memory location of the arr pointer, which is 0x610470. Since I assigned a double to the invalid memory location 0x610468, I understand it must have written partly over the region that starts at 0x610470, but this was done before arr was even created (the function is called before main). So why does this affect arr? I would have thought that when arr is created, it does not "care" what was previously done to that memory location, since it is not registered as being in use.
Any clarification would be appreciated.
NOTE:
cos_vec_temp was previously declared as a dynamic double array of size int(2*pi()/trig_incr) (the same size as in the code above, but created with new). In that case I also had the same invalid access, and it likewise gave no error when I accessed that element. But when I tried to call delete[] on the cos_vec global variable (which was of type double * then), it also gave a segmentation fault, though without the message I got in the case above.
NOTE 2:
Before you downvote me for using a dynamic array, I am just curious as to why this occurs. I normally use STL containers and all their conveniences (I almost NEVER use dynamic arrays).
Many heap allocators store meta-data next to the memory they allocate for you, before or after (or both) the block handed to you. If you write out of bounds of some heap-allocated memory (and remember that std::vector allocates its storage on the heap), you might overwrite some of this meta-data, corrupting the heap.
None of this is actually specified in the C++ standard. All it says is that going out of bounds leads to undefined behavior. What the allocators do or store, and where they possibly keep meta-data, is up to the implementation.
As for a solution, most people will tell you to use push_back instead of direct indexing, and that will solve the problem. Unfortunately it also means that the vector may need to be reallocated and copied a few times. That can be solved by reserving an approximate amount of memory beforehand and then letting the occasional stray element trigger a reallocation and copy.
Or, of course, make a better prediction of the actual number of elements the vector will contain.
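A sketch of the loop rewritten that way, assuming pi() and trig_incr are defined as in the question:
vector<double> cos_vector()
{
    vector<double> cos_vec_temp;
    // reserve an upper bound so push_back doesn't reallocate in the loop
    cos_vec_temp.reserve(static_cast<size_t>(2*pi()/trig_incr) + 1);
    for (double curr_val = 0; curr_val < 2*pi(); curr_val += trig_incr)
        cos_vec_temp.push_back(cos(curr_val));  // size() tracks the real count
    return cos_vec_temp;
}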
It looks like you are writing past the end of the vector allocated in the function executing before main, causing undefined behavior later on.
You should be able to fix the problem by rounding the number up when allocating the vector (casting to int rounds the number down), or using push_back instead of indexing:
cos_vec_temp.push_back(cos(curr_val));

Possible memory leak using C++ string

Consider the following C++ program:
#include <cstdlib> // for exit(3)
#include <string>
#include <iostream>
using namespace std;

void die()
{
    exit(0);
}

int main()
{
    string s("Hello, World!");
    cout << s << endl;
    die();
}
Running this through valgrind shows this (some output trimmed for brevity):
==1643== HEAP SUMMARY:
==1643== in use at exit: 26 bytes in 1 blocks
==1643== total heap usage: 1 allocs, 0 frees, 26 bytes allocated
==1643==
==1643== LEAK SUMMARY:
==1643== definitely lost: 0 bytes in 0 blocks
==1643== indirectly lost: 0 bytes in 0 blocks
==1643== possibly lost: 26 bytes in 1 blocks
==1643== still reachable: 0 bytes in 0 blocks
==1643== suppressed: 0 bytes in 0 blocks
As you can see, there's a possibility that 26 bytes allocated on the heap were lost. I know that the std::string class is a 12-byte struct (at least on my 32-bit x86 arch with GNU compiler 4.2.4), and "Hello, World!" with a null terminator takes 14 bytes. If I understand correctly, the 12-byte structure contains a pointer to the character data, the allocated size, and the reference count (someone correct me if I'm wrong here).
Now my questions: How are C++ strings stored with regard to the stack/heap? Does a stack object exist for a std::string (or other STL containers) when declared?
P.S. I've read somewhere that valgrind may report a false positive of a memory leak in some C++ programs that use STL containers (and "almost-containers" such as std::string). I'm not too worried about this leak, but it does pique my curiosity regarding STL containers and memory management.
Calling exit "terminates the program without leaving the current block and hence without destroying any objects with automatic storage duration".
In other words, leak or not, you shouldn't really care. When you call exit, you're saying "close this program, I no longer care about anything in it." So stop caring. :)
Obviously it's going to leak resources because you never let the destructor of the string run, absolutely regardless of how it manages those resources.
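One way to see this, as a minimal sketch: close the string's scope before exit is called, and the "possibly lost" bytes disappear from the valgrind report:
#include <cstdlib>
#include <iostream>
#include <string>

int main()
{
    {
        std::string s("Hello, World!");
        std::cout << s << std::endl;
    }                // s's destructor runs here, freeing its heap block
    std::exit(0);    // nothing with automatic storage is left to leak
}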
Others are correct: you are leaking because you are calling exit. To be clear, the leak isn't the string object on the stack; it is the memory allocated on the heap by the string. For example:
struct Foo { };

int main()
{
    Foo f;
    die();
}
will not cause valgrind to report a leak.
The leak is probable (instead of definite) because you have an interior pointer to memory allocated on the heap. basic_string is responsible for this. From the header on my machine:
* A string looks like this:
*
* @code
*                                  [_Rep]
*                                  _M_length
*  [basic_string<char_type>]       _M_capacity
*  _M_dataplus                     _M_refcount
*  _M_p ---------------->          unnamed array of char_type
* @endcode
*
* Where the _M_p points to the first character in the string, and
* you cast it to a pointer-to-_Rep and subtract 1 to get a
* pointer to the header.
The key is that _M_p doesn't point to the start of the memory allocated on the heap; it points to the first character in the string. Here is a simple example:
struct Foo
{
    Foo()
    {
        // Allocate 4 ints.
        m_data = new int[4];
        // Move the pointer.
        ++m_data;
        // Null the pointer
        //m_data = 0;
    }
    ~Foo()
    {
        // Put the pointer back, then delete it.
        --m_data;
        delete [] m_data;
    }
    int* m_data;
};

int main()
{
    Foo f;
    die();
}
This will report a probable leak in valgrind. If you comment out the lines where I move m_data valgrind will report 'still reachable'. If you uncomment the line where I set m_data to 0 you'll get a definite leak.
The valgrind documentation has more information on probable leaks and interior pointers.
Of course this "leaks": by exiting before s's stack frame is left, you don't give s's destructor a chance to execute.
As for your question about std::string storage: different implementations do different things. Some keep a buffer of about 12 bytes inside the string object itself, which is used if the string is 12 bytes or shorter; longer strings go to the heap. Other implementations always go to the heap. Some are reference-counted with copy-on-write semantics, some are not. Please turn to Scott Meyers' Effective STL, Item 15.
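A hedged way to observe which strategy your implementation uses is to check whether the string's buffer lives inside the string object itself (the small-string case) or elsewhere (the heap):
#include <iostream>
#include <string>

// True if the character buffer sits inside the string object itself,
// i.e. the implementation applied its small-string optimization.
bool stored_inline(const std::string& s)
{
    const char* buf = s.data();
    const char* obj = reinterpret_cast<const char*>(&s);
    return buf >= obj && buf < obj + sizeof(s);
}

int main()
{
    std::string short_str("hi");
    std::string long_str(100, 'x');
    std::cout << "short string inline: " << stored_inline(short_str) << "\n";
    std::cout << "long string inline:  " << stored_inline(long_str) << "\n";
    return 0;
}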
The GCC STL has a private memory pool for containers and strings. You can turn this off; see the valgrind FAQ:
http://valgrind.org/docs/manual/faq.html#faq.reports
I would avoid using exit(); I see no real reason for that call here. I'm not sure whether it makes the process stop instantly without cleaning up memory first, although valgrind does still appear to run.