Seg Fault resulting from push_back call on vector (threads linux) - c++

I'm trying to write a program that creates a series of child threads with pthread_create; each thread uses the parameter passed in to do further processing. The parameter I'm trying to pass in comes from a vector called reduce_args_. This is the header information for the struct ReduceArg and the ReduceVector typedef:
typedef vector<string> StringVector;
// a data structure to maintain info for the reduce task
struct ReduceArg
{
ReduceArg (void); // constructor
~ReduceArg (void); // destructor
pthread_t tid; // thread id of the reduce thread
StringVector files_to_reduce; // set of files for reduce task
};
// more typedefs
typedef vector<ReduceArg *> ReduceVector;
Now the issue comes when I call push_back here:
for(int i = 0; i < num_reduce_threads_ ; i++){
reduce_args_.push_back(phold);
int count = 0;
for(ShuffleSet::iterator it = shuffle_set_.begin(); it!=shuffle_set_.end(); ++it){
string line = *it;
string space = " ";
string file = line.substr(0, line.find(space)) + ".txt";
if (count < num_reduce_threads_){
cout << reduce_args_[i+1];
(reduce_args_[i+1] -> files_to_reduce)[count] = file;
//(reduce_args_[i+1] -> files_to_reduce).push_back(file);
}
count++;
//cout << ((reduce_args_.back())->files_to_reduce).back()<< endl;
}
}
Both of those push_back calls cause a seg fault. The shuffle set is just a set that yields strings and, as noted in the .h file, files_to_reduce is a string vector. What I'm trying to do is access files_to_reduce and push_back a string onto it, but each time I get a seg fault. The reduce_args_ object is set up as below:
ReduceArg* plhold;
reduce_args_.push_back(plhold);
((reduce_args_.back()) -> files_to_reduce).push_back("hello");
for (int i = 0; i < this->num_reduce_threads_; ++i) {
// create a placeholder reduce argument and store it in our vector
(reduce_args_.push_back(plhold));
}
thanks for the help!!

This:
ReduceArg* plhold;
reduce_args_.push_back(plhold);
Unless you've hidden some important code, you're pushing an uninitialised pointer, so the next line will cause chaos.
Possibly you meant this?
ReduceArg* plhold(new ReduceArg);
..but I suspect you haven't properly thought about the object lifetimes and ownership of the object whose address you are storing in the vector.
In general, avoid pointers unless you know exactly what you're doing, and why. The code as posted doesn't need them, and I would recommend you just use something like this:
typedef vector<ReduceArg> ReduceVector;
....
reduce_args_.push_back(ReduceArg());
reduce_args_.back().files_to_reduce.push_back("hello");
for (int i = 0; i < num_reduce_threads_; ++i) {
// create a placeholder reduce argument and store it in our vector
reduce_args_.push_back(ReduceArg());
}
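To tie the by-value suggestion back to the original goal of handing each argument to pthread_create, here is a minimal sketch. The names reduce_worker, run_reduce_threads and the files_seen member are hypothetical and not from the question; the sketch also assumes the vector is fully built before any thread starts, so that the addresses of its elements stay valid.

```cpp
#include <pthread.h>

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's ReduceArg, stored by value.
struct ReduceArg {
    pthread_t tid{};                           // thread id of the reduce thread
    std::vector<std::string> files_to_reduce;  // files assigned to this task
    std::size_t files_seen = 0;                // filled in by the worker (illustrative)
};

// Thread entry point: receives the address of one ReduceArg.
extern "C" void* reduce_worker(void* raw) {
    ReduceArg* arg = static_cast<ReduceArg*>(raw);
    arg->files_seen = arg->files_to_reduce.size();
    return nullptr;
}

// Distributes files round-robin, runs one thread per ReduceArg, and
// returns how many files the workers saw in total.
std::size_t run_reduce_threads(std::size_t num_threads,
                               const std::vector<std::string>& files) {
    assert(num_threads > 0);
    std::vector<ReduceArg> args(num_threads);  // real objects, no raw pointers
    for (std::size_t i = 0; i < files.size(); ++i)
        args[i % num_threads].files_to_reduce.push_back(files[i]);

    // args is never resized from here on, so &args[i] stays valid.
    std::size_t started = 0;
    for (; started < args.size(); ++started) {
        ReduceArg& a = args[started];
        if (pthread_create(&a.tid, nullptr, reduce_worker, &a) != 0) break;
    }

    std::size_t total = 0;
    for (std::size_t i = 0; i < started; ++i) {
        pthread_join(args[i].tid, nullptr);  // wait before reading the result
        total += args[i].files_seen;
    }
    return total;
}
```

Because the vector owns the ReduceArg objects, there is nothing to delete afterwards, and joining before the vector goes out of scope keeps the workers from touching destroyed objects.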

Related

For loop in C++ stops after a single iteration w/ pointer variable

So first of all, I'll preface this with: I just started using C++.
I have a structure whose pointer I store in an unordered_map, setting the struct's members through that pointer as I get the values during processing. Once I no longer need them in a map, I transfer them to a vector and loop through them.
On the second iteration, though, it outputs my index (1), but the next statement, which creates a local pointer variable for the struct at that index, breaks it and the program terminates without any errors. Since there are no errors, a try/catch doesn't give me anything either.
// Wanted to create a structure to handle the objects easier instead
// of multiple vectors for each property
struct appData {
std::string appid = "";
std::string name = "";
std::string vdf_file = "";
std::string vdf_path = "";
};
// Relevant parts of my main()
int main() {
// Map that stores all the struct pointers
std::unordered_map<std::string, appData*> appDatas;
char memory[sizeof(appData)];
void* p = memory;
// New instance of appData
appData *tempAppData = new(p) appData();
tempAppData->appid = "86901";
// Add tempAppData to map with string key
appDatas["86901"] = tempAppData;
...
std::vector<appData*> unhashed_appDatas;
for (auto const& pair: appDatas) {
unhashed_appDatas.push_back(pair.second);
}
...
for (unsigned int x = 0; x < unhashed_appDatas.size(); x++) {
// Output index to see where it was messing up
std::cout << x << std::endl;
// !! This is where the issue happens on the second loop (see output)
appData *thisAppData = unhashed_appDatas[x];
std::string id = thisAppData->appid;
std::cout << id << std::endl;
/* ...
Do more stuff below
*/
}
...
return 0;
}
Terminal Output:
0 // Initial index of x
86901 // Id of first item
1 // New index of x on second loop before pointer var is created
// Nothing more is printed and execution terminates with no errors
My knowledge of C++ is pretty lacking (I started a couple of days ago), so I've only tried the few things within my knowledge: moving the *thisAppData variable outside of the loop, using a for(var: vector) { ... } loop, and using a while loop. I assume the issue lies with the pointer and the local variable inside the loop.
Any help/input about how I could better approach this, or whether there's an issue with my code, would be appreciated :)
Edit: Changed the code to use .size() instead of sizeof() per @Jarod42's answer, though the main issue persists.
Edit2: Turns out it was my own mess-up, imagine that. My 4 AM brain wasn't working too well. I've posted an answer explaining what I did incorrectly. Thanks to everyone who helped me.
sizeof is the wrong tool here:
for (unsigned int x = 0; x < sizeof(unhashed_appDatas); x++) {
// ^^ wrong: sizeof gives the static size of the vector object itself,
// typically 3 pointer-sized members (data, capacity, size), so something like `3*sizeof(void*)`
it should be
for (unsigned int x = 0; x < unhashed_appDatas.size(); x++) {
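The difference between the two can be checked directly. The following is a small sketch with hypothetical helper names (sum_elements, same_object_size): sizeof is evaluated at compile time on the vector object and does not change with the number of elements, while size() reports how many elements are currently stored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the elements, using size() as the loop bound.
int sum_elements(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) total += values[i];
    return total;
}

// sizeof describes the vector object itself and is fixed at compile time,
// so it is the same for any two vectors of the same type.
bool same_object_size(const std::vector<int>& a, const std::vector<int>& b) {
    return sizeof(a) == sizeof(b);
}
```

A vector holding 2 elements and one holding 1000 give the same sizeof, which is why using it as a loop bound reads past the end of short vectors and stops early on long ones.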
After many hours of trial and error I have determined the issue (aside from doing things in a way I shouldn't have, which I've since corrected): it was something I messed up myself.
TLDR:
Items that I assumed existed didn't, so I tried to read files with a blank path and parse contents that didn't exist.
Explanation:
In the first loop, I was getting a list of files from a directory and then parsing a JSON-like file that contained these file names and the properties associated with them. However, the file list contained entries that weren't in this other data file, and since I had no check for whether they existed, it would break there.
Additionally, in the last loop I would read a struct member holding the path of a file to read, but it would be blank (unset) because the entry didn't exist in the data file, so std::ifstream file(path); would break.
I've since implemented checks for each key and value to ensure it will no longer break because of that.
Fixes:
Here are some fixes that were suggested and that I added to the code. They did help it work correctly in the end, even if they weren't the main issue, which I caused myself:
// Thanks to @EOF:
// No longer "using placement new on a buffer with automatic storage duration"
// (whatever that means haha) and was changed from:
char memory[sizeof(appData)];
void* p = memory;
appData *tempAppData = new(p) appData();
// To:
appData *tempAppData = new appData();
// Thanks to @Jarod42:
// Last for loop limit expression was corrected from:
for (unsigned int x = 0; x < sizeof(unhashed_appDatas); x++) {
}
// To:
for (unsigned int x = 0; x < unhashed_appDatas.size(); x++) {
}
// I am still using a map, despite comment noting to just use vectors
// (which I could have, but just would prefer using maps):
std::unordered_map<std::string, appData*> appDatas;
// Instead of doing something like this (which would arguably have been easier):
std::vector<std::string> dataKeys = { "1234" };
std::vector<appData*> appDatas = { ... };
auto it = std::find(dataKeys.begin(), dataKeys.end(), "1234");
// std::find returns an iterator; end() means "not found"
if (it == dataKeys.end()) continue;
auto dataItem = appDatas[it - dataKeys.begin()];
//
I appreciate everyone's assistance with my code
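For readers wondering how the key checks and the move away from raw pointers might look together, here is a rough sketch. It assumes the map stores the structs by value rather than by pointer, and the names AppData, AppMap, set_path and path_or_empty are hypothetical, not from the original code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical struct mirroring part of the question's appData.
struct AppData {
    std::string appid;
    std::string vdf_path;
};

using AppMap = std::unordered_map<std::string, AppData>;

// Inserts or updates an entry; the map owns the AppData object itself,
// so no new, delete, or placement new is involved.
void set_path(AppMap& apps, const std::string& id, const std::string& path) {
    AppData& entry = apps[id];  // operator[] default-constructs a missing entry
    entry.appid = id;
    entry.vdf_path = path;
}

// Returns the stored path, or an empty string when the key is absent.
// find() never inserts, unlike operator[].
std::string path_or_empty(const AppMap& apps, const std::string& id) {
    auto it = apps.find(id);
    return it == apps.end() ? std::string() : it->second.vdf_path;
}
```

A caller can then skip entries whose path comes back empty instead of handing a blank path to std::ifstream.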

How to pass a vector element of type string as argument to pthread_create()?

I am trying to pass a string-type vector element to the pthread_create() function. The message is not getting printed in the output. Where am I going wrong?
#include <iostream>
#include <pthread.h>
#include <cstdlib>
#include <string>
#include <vector>
using namespace std;
#define NUM_THREADS 5
void *print_thread_details(void *thread_no){
std::string str = *reinterpret_cast<std::string*>(thread_no);
cout<<"\n Running thread = "<<str<<endl;
pthread_exit(NULL);
}
int main(){
/*initialize an array of pthreads*/
pthread_t threads[NUM_THREADS];
int rc;
vector<string> v(NUM_THREADS);
for(int i=0;i<NUM_THREADS;i++){
string s = "Thread No = ";
char temp = i+'0';
s=s+temp;
v.push_back(s);
rc = pthread_create(&threads[i], NULL, print_thread_details,&v[i] );
if (rc){
cout << "Error:unable to create thread," << rc << endl;
exit(-1);
}
}
pthread_exit(NULL);
return 0;
}
Output:
Running thread =
Running thread =
Running thread =
Running thread =
Running thread =
How to pass a vector element of type string as argument to pthread_create()?
Exactly how you pass them now.
However, you must take care to prevent the strings from being destroyed or moved while the thread is alive.
The message is not getting printed in the output.
All of the strings that you pass to the threads are empty.
I suspect that you are confused about how vectors work:
vector<string> v(NUM_THREADS);
This constructs a vector of 5 elements. Each of the 5 strings is empty.
v.push_back(s);
Over the course of the loop, this adds the 6th through 10th elements to the vector. These strings are not empty, but they aren't passed to the threads either, because you used the indices 0...4, which contain the empty strings.
Furthermore, these push backs may cause the vector to reallocate, in which case the pointers passed to the threads created earlier would become invalid, resulting in undefined behaviour.
You should probably replace this with:
v[i] = s;
Another approach is to start with an empty vector, and push the generated strings in the loop. But in that case you must pre-reserve the memory to avoid pointer invalidation due to reallocation. Or fill the vector in a separate loop, before starting any threads.
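PS. print_thread_details returns void* but has no return statement. It only avoids undefined behaviour because pthread_exit() never returns, so control never reaches the end of the function; a plain return NULL; at the end would make that explicit.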
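The size-constructor versus push_back behaviour described in this answer can be reproduced without any threads. This is a small sketch; the helper names prefill_then_push and prefill_then_assign are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reproduces the question's pattern: n default elements, then n push_backs.
std::vector<std::string> prefill_then_push(std::size_t n) {
    std::vector<std::string> v(n);  // n empty strings at indices 0..n-1
    for (std::size_t i = 0; i < n; ++i)
        v.push_back("Thread No = " + std::to_string(i));  // lands at index n + i
    return v;
}

// The suggested fix: assign into the elements that already exist.
std::vector<std::string> prefill_then_assign(std::size_t n) {
    std::vector<std::string> v(n);
    for (std::size_t i = 0; i < n; ++i)
        v[i] = "Thread No = " + std::to_string(i);
    return v;
}
```

With n = 5 the first function yields 10 elements whose first five are empty, which matches the blank output in the question; the second yields exactly 5 filled elements.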
The problem is that you are calling the vector constructor that pre-fills the vector with blank strings, and then afterwards you are pushing additional non-blank strings onto the end of the vector. The vector will end up with 10 strings, not 5. But the threads will only see the blank strings.
Remove the value you are passing to the vector constructor, then the vector will be empty initially. Call the vector's reserve() method instead to preallocate the vector without actually adding items to it:
vector<string> v;
v.reserve(NUM_THREADS);
Otherwise, without the reserve(), each call to push_back() will potentially reallocate the vector's internal area, invalidating any existing string pointers, which would be bad when you populate the vector and create the threads at the same time. The safer approach is to push all of the strings into the vector before then creating the threads:
vector<string> v;
for(int i=0;i<NUM_THREADS;i++){
// consider using std::ostringstream instead...
string s = "Thread No = ";
char temp = i+'0';
s=s+temp;
v.push_back(s);
}
for(int i=0;i<NUM_THREADS;i++){
rc = pthread_create(&threads[i], NULL, print_thread_details,&v[i] );
...
}
On a side note, once you start the threads, you need to wait for them to terminate before allowing main() to exit, otherwise the vector can be destroyed while the threads are still using the string values.
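Putting the pieces of this answer together (fill the vector first, then create the threads, then wait for them), a sketch might look like the following. The names mark_done and run_labelled_threads are hypothetical, and instead of printing, each worker appends to its own string so that the effect can be inspected after the join.

```cpp
#include <pthread.h>

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Thread entry point: each thread receives a pointer to its own string.
extern "C" void* mark_done(void* arg) {
    std::string* label = static_cast<std::string*>(arg);
    label->append(" done");  // touches only this thread's element
    return nullptr;
}

// Builds all labels, starts one thread per label, joins them, and
// returns the labels as modified by the threads.
std::vector<std::string> run_labelled_threads(std::size_t n) {
    std::vector<std::string> labels;
    labels.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        labels.push_back("Thread No = " + std::to_string(i));

    // No push_back happens after this point, so &labels[i] stays valid.
    std::vector<pthread_t> threads(n);
    std::size_t started = 0;
    for (; started < n; ++started)
        if (pthread_create(&threads[started], nullptr, mark_done,
                           &labels[started]) != 0)
            break;

    // Join before labels can go out of scope.
    for (std::size_t i = 0; i < started; ++i)
        pthread_join(threads[i], nullptr);
    return labels;
}
```

Using std::to_string also avoids the i+'0' trick from the question, which only works for single-digit thread numbers.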

std::map of int and struct running out of memory (std::bad_alloc) c++

I have the following in my program
class memModel
{
public:
struct Addrlist
{
vector<string> data;
vector<int> timestamp;
vector<string> client;
};
map<int, Addrlist> AddrMap ; //store base address and list of all accesses
};
In main() I read from a few files and store millions of entries into this struct
int main()
{
memModel newObj ;
ifstream file1("dataStream");
ifstream file2("timeStampSteam");
ifstream file3("clientStream");
ifstream file4("addrStream") ;
string dataSTR,clientSTR;
int time = 0 ;
int addr;
for(int i=0; i<10000000/*10mil*/ ; i++)
{
getline(file1,dataSTR);
getline(file3,clientSTR);
file2 >> time ;
file4 >> hex >> addr ;
newObj.AddrMap[addr].data.push_back(dataSTR) ;
newObj.AddrMap[addr].timestamp.push_back( time) ;
newObj.AddrMap[addr].client.push_back(clientSTR) ;
}
}
So the problem is that I am running out of memory and get the std::bad_alloc exception. This code works with smaller data sizes.
I am trying to understand where the struct and the map are being stored. Is everything going on the stack?
The vectors are dynamically allocated, right? Do those go on the heap?
This is my first time working with large data sets, so I would like to understand the concepts better. How can I change this to make sure I am using the heap and do not run out of memory?
newObj.AddrMap[addr].data[i] = dataSTR ;
newObj.AddrMap[addr].timestamp[i] = time ;
newObj.AddrMap[addr].client[i] = clientSTR ;
These lines assign three values into three vectors by index.
Unfortunately, all of these vectors are empty: they contain no elements, so indexing into them is undefined behavior.
You have to either use push_back(), or resize() the vectors in advance so that they are large enough to hold the items you're storing.
A std::vector's operator[] does not create elements or resize the vector. It merely accesses an existing element.
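As a sketch of the push_back route, the snippet below appends records through a helper and reads them back with the bounds-checked at(). The helper names record_access and try_get_data are hypothetical. On the stack-versus-heap part of the question: with the default allocator, both the map's nodes and each vector's elements live in dynamically allocated memory, so only the small container objects themselves sit wherever newObj is declared.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Addrlist {
    std::vector<std::string> data;
    std::vector<int> timestamp;
    std::vector<std::string> client;
};

using AddrMap = std::map<int, Addrlist>;

// Appends one access record. map::operator[] creates the entry on first
// use, and push_back grows each vector by one element.
void record_access(AddrMap& m, int addr, const std::string& data, int time,
                   const std::string& client) {
    Addrlist& entry = m[addr];
    entry.data.push_back(data);
    entry.timestamp.push_back(time);
    entry.client.push_back(client);
}

// Bounds-checked read: vector::at() throws std::out_of_range for a bad
// index, whereas operator[] would be undefined behavior.
bool try_get_data(const AddrMap& m, int addr, std::size_t i, std::string& out) {
    auto it = m.find(addr);
    if (it == m.end()) return false;
    try {
        out = it->second.data.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

Looking up the entry once per record, as record_access does, also avoids repeating the map search three times per iteration.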

C++ - Delete std::string*; heap corruption

I'm relatively new to C++ memory management, and I'm getting this weird error of heap corruption (plus an automatic breakpoint in Visual Studio before it). Here is the offending code:
z_world::z_world(char* name)
{
unsigned int i, skip;
char tmp;
//Load data from file
std::string* data = loadString(name);
//Base case if there is no world data
tiles = NULL;
w = 0;
h = 0;
if(data->length() > 0) {
//Set up the 'tiles' array
for(i = 0; i < data->length(); i++) {
if(data->at(i) == '\n')
h++;
if(h == 0)
w++;
}
tiles = new int[data->length()-h];
//Load Data
skip = 0;
for(i = 0; i < data->length(); i++) {
if(data->at(i) == '\n') {
skip++;
printf("\n");
continue;
}
tmp = data->at(i);
tiles[i+skip] = atoi(&tmp);
printf("%i ",tiles[i+skip]);
}
}
delete data;
}
Here's where I load in the string:
std::string* loadString(char* name)
{
ifstream in(name);
std::string* input = new string();
while(in) {
std::string line;
getline(in,line);
input->append(line);
input->append("\n");
}
in.close();
return input;
}
I get the breakpoint and error inside of "delete data;", which makes me think that "data" gets deleted somewhere before that, but I can't find where it would. For reference, this method is to create an object that contains world data for a game in the form of a virtual 2D integer array (for the ID's of the tiles).
Your problem is probably here:
tiles[i+skip] = atoi(&tmp);
Problem 1:
It should be -skip
tiles[i - skip] =
Problem 2:
The atoi() function is being used incorrectly (tmp does not contain a string). But I also don't think atoi() is the appropriate tool. I think what you are looking for is simple assignment. The conversion from char to int is automatic (note that this stores the character code, so for digit characters you would subtract '0' to get the numeric value):
tiles[i - skip] = tmp;
Problem 3:
You are not using objects correctly. In this situation there is no need to generate dynamic objects and create a mess with dynamic memory management. It would be simpler just to create automatic objects and pass those back normally:
std::string* loadString(char* name)
// ^ Don't do this.
std::string loadString(std::string const& name)
// ^^^^^^^ return a string by value.
// The compiler will handle memory management very well.
In general you should not be passing pointers around. In the few situations where you do need pointers they should be held within a smart pointer object or containers (for multiple objects) so that their lifespan is correctly controlled.
atoi(&tmp);
atoi expects a pointer to a null terminated string - not a pointer to a char
There's no need to dynamically allocate the string in the code you've shown. Change the loadString function to
std::string loadString(char* name)
{
ifstream in(name);
std::string input;
// ...
return input;
}
In the caller
std::string data = loadString( name );
Now there's no need to delete the string after you're done.
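Filling in the elided body, a by-value loadString could be sketched as follows. Taking a std::istream& is an assumption made here so the function can be exercised with an in-memory stream; the overload taking a file name simply forwards to it. Note that looping on std::getline itself, rather than on while(in), stops at the end of input without appending an extra trailing line as the question's version does.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this out
#include <string>

// Reads every line from the stream and joins them with '\n'.
std::string loadString(std::istream& in) {
    std::string input;
    std::string line;
    while (std::getline(in, line)) {  // loop ends cleanly at end of input
        input += line;
        input += '\n';
    }
    return input;  // returned by value; no new or delete needed
}

// Convenience overload that opens a file by name. An unreadable file
// yields an empty string.
std::string loadString(const std::string& name) {
    std::ifstream in(name);
    return loadString(in);
}
```

For example, feeding it std::istringstream in("12\n34") gives back "12\n34\n".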
Instead of
int *tiles = NULL;
tiles = new int[data->length()-h];
use
std::vector<int> tiles;
tiles.resize(data.length() - h);
Also, if you do need to dynamically allocate objects you should be using smart pointers (std::unique_ptr and std::shared_ptr) instead of raw pointers.
There is a bug in
tiles[i+skip] = atoi(&tmp);
For example, for a string
Hello\n
World\n
and for the loop iteration at the point of i == 10, skip is already 1 (since we have encountered the first \n before) and you are writing to tiles[10 + 1], but tiles only has been allocated as an array with 10 elements.
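One way to sidestep the index arithmetic altogether is to append to a std::vector instead of writing to computed positions. The sketch below assumes each tile is a single digit character and uses a hypothetical function name, parseTiles; non-digit characters are stored as 0 here, which is a choice made for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Converts world text such as "12\n34\n" into tile ids, skipping newlines.
std::vector<int> parseTiles(const std::string& data) {
    std::vector<int> tiles;
    tiles.reserve(data.size());  // upper bound; newlines are not stored
    for (char ch : data) {
        if (ch == '\n') continue;  // row separator, not a tile
        if (std::isdigit(static_cast<unsigned char>(ch)))
            tiles.push_back(ch - '0');  // digit character to its numeric value
        else
            tiles.push_back(0);  // assumption: unknown characters become tile 0
    }
    return tiles;
}
```

Since push_back always writes to the next free slot, there is no i+skip or i-skip index to get wrong, and the vector frees its own memory.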
Maybe the variable input is local to this function, so the memory is freed after returning from it. Calling delete on this string later would then try to free already-freed memory.

Segmentation fault trying to dereference a pointer from a vector of pointers

I have a vector of pointers to objects that I am iterating through using std::vector::iterator. Since the element returned is itself a pointer I dereference the iterator twice, once to return the pointer and once to resolve the pointer to the actual object.
I'm trying to invoke a member function (getClass) that returns an std::string and I have tried both (**it).getClass() and (*it)->getClass() but both give me a segmentation fault. I keep feeling like I'm missing something obvious.
partial function code:
void dataSet::createFolds()
{
// Shuffle the data vector
std::random_shuffle( m_records.begin(), m_records.end());
std::cout << "STARTING MAIN LOOP. THERE ARE " << m_records.size() << " RECORDS\n";
// iterate through the data vector and assign each to a fold
std::vector<dataRecord *>::iterator it = m_records.begin();
while (it != m_records.end())
{
std::string currentClass = (*it)->getClass(); // SEG FAULT HERE
.
.
.
}
.
.
.
}
The vector is m_records ... code
dataRecord is defined like this ... code
In response to questions about filling the vector:
The data is read from a text file and I really don't want to post the entire thing unless I have to (212 lines) but the pertinent code for populating the vector is below. The constructor for the dataRecord object takes a vector of field objects. I use a temporary pointer, use new to create the object then push_back the pointer.
while ...
{
std::vector<field> fields;
// build the fields vector
for (unsigned int i = 0; i < numAttribs; ++i)
fields.push_back(field(data.at(i), attribTypes[i]));
// create the new dataRecord
dataRecord * newRecord = new dataRecord(fields);
// add the record to the set
m_records.push_back(newRecord);
++recordNum;
std::cout << "read record " << recordNum << std::endl;
}
In my opinion the vector elements are badly initialized. Perhaps you should test the code that fills the vector independently before testing the code that reads from it. Sorry for my English ;)
Either the pointers in your containers are null, or they are dangling pointers to free'd memory.
Double check the code that fills m_records.
In
std::string dataRecord::getClass() {return m_data.at(m_data.size() - 1).getTextData();}
You must check m_data.size(), because it could be 0, in which case at() will throw an out-of-range exception.
// create the new dataRecord
dataRecord * newRecord = new dataRecord(fields);
I'm guessing the bug is in dataRecord's constructor. Are you sure it's doing its job properly?
This doesn't necessarily apply to the OP's question but for those arriving here from googling...
If you're receiving segfaults on dereferencing vector iterators and are working in multi-threaded applications, remember that vectors aren't necessarily thread-safe!
The following example is unsafe without mutex locks.
Thread 1
myVector.push_back(new Object());
Thread 2
std::vector<Object*>::iterator it = myVector.begin();
for (; it != myVector.end(); it++) {
Object* obj = *it;
}
Instead, something like this should be done:
Thread 1
myMutex.lock();
myVector.push_back(new Object());
myMutex.unlock();
Thread 2
myMutex.lock();
std::vector<Object*>::iterator it = myVector.begin();
for (; it != myVector.end(); it++) {
Object* obj = *it;
}
myMutex.unlock();
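With C++11 and later, the same idea is often written with std::mutex and std::lock_guard, which releases the lock automatically even if an exception is thrown. The class and function names below (SharedInts, fill_from_two_threads) are hypothetical, and the sketch stores plain ints rather than Object pointers to stay self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// A vector whose every access goes through one mutex.
class SharedInts {
public:
    void add(int value) {
        std::lock_guard<std::mutex> lock(mutex_);  // unlocks on scope exit
        items_.push_back(value);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;  // mutable so const readers can lock it
    std::vector<int> items_;
};

// Two threads append concurrently; returns the final element count.
std::size_t fill_from_two_threads(std::size_t per_thread) {
    SharedInts shared;
    auto writer = [&shared, per_thread] {
        for (std::size_t i = 0; i < per_thread; ++i) shared.add(1);
    };
    std::thread a(writer);
    std::thread b(writer);
    a.join();
    b.join();
    return shared.size();
}
```

Keeping the mutex private next to the data it protects makes it harder to touch the vector without holding the lock.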