c++ seg fault issue - c++

I am working on a C++ program that uses some external C libraries. As far as I can tell though that is not the cause of the problem, and the issue is with my C++ code. The program runs fine with no errors or anything on my test datasets, but after going through nearly the entire full dataset, I get a segfault. Running GDB gives me this segfault:
(gdb) run -speciesMain=allMis1 -speciesOther=anoCar2 -speciesMain=allMis1 -speciesOther=anoCar2 /hive/data/genomes/allMis1/bed/lastz.anoCar2/mafRBestNet/*.maf.gz
Starting program: /cluster/home/jstjohn/bin/mafPairwiseSyntenyDecay -speciesMain=allMis1 -speciesOther=anoCar2 -speciesMain=allMis1 -speciesOther=anoCar2 /hive/data/genomes/allMis1/bed/lastz.anoCar2/mafRBestNet/*.maf.gz
Detaching after fork from child process 3718.
Program received signal SIGSEGV, Segmentation fault.
0x0000003009cb7672 in __gnu_cxx::__exchange_and_add(int volatile*, int) () from /usr/lib64/libstdc++.so.6
(gdb) up
#1 0x0000003009c9db59 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() () from /usr/lib64/libstdc++.so.6
(gdb) up
#2 0x00000000004051e7 in PairAlnInfo::~PairAlnInfo (this=0x7fffffffcd70, __in_chrg=) at mafPairwiseSyntenyDecay.cpp:37
(gdb) up
#3 0x0000000000404eb0 in main (argc=2, argv=0x7fffffffcf78) at mafPairwiseSyntenyDecay.cpp:260
It looks like something is going on with a double free of my PairAlnInfo class. The weird thing is that I don't define a destructor, and I am not allocating anything with new. I have tried this both with g++44 and g++4.1.2 on the linux machine and have had the same results.
To make things even weirder, on my linux box (with more available RAM and everything, not that RAM is an issue with this program, but it is a beefy system) the seg fault happens as described above before the program reaches the loop to print output. On my much smaller macbook air using either g++ or clang++, the program still segfaults, but it doesn't do that until after the results are printed, right before the final return(0) out of the main function. Here is what the GDB trace looks like on my mac running on the same file after compiling with Mac's default g++4.2:
(more results)...
98000 27527 162181 0.83027
99000 27457 161467 0.829953
100000 27411 160794 0.829527
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00004a2c00106077
0x00007fff9365a6e5 in std::string::_Rep::_M_dispose ()
(gdb) up
#1 0x00007fff9365a740 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string ()
(gdb) up
#2 0x0000000100003938 in main (argc=1261, argv=0x851d5fbff533) at mafPairwiseSyntenyDecay.cpp:301
(gdb)
Just in case you didn't notice the time of my posting, it's about 2:30AM now... I have been hacking away at this problem for about 10 hours now. Thanks so much for taking the time to look at this and help me out! The code and some instructions for replicating my situation follow.
If you are interested in downloading and installing the whole thing with dependencies, download my KentLib repository, run make in the base directory, then go to examples/mafPairwiseSyntenyDecay and run make there. An example (rather large) that causes the bug I am discussing is the gzipped file available here: 100Mb file that the program crashes on. Then execute the program with these arguments: -speciesMain=allMis1 -speciesOther=anoCar2 anoCar2.allMis1.rbest.maf.gz.
/**
* mafPairwiseSyntenyDecay
* Author: John St. John
* Date: 4/26/2012
*
* calculates the mean synteny decay in different range bins
*
*
*/
//Kent source C imports
extern "C" {
#include "common.h"
#include "options.h"
#include "maf.h"
}
#include <map>
#include <string>
#include <set>
#include <vector>
#include <sstream>
#include <iostream>
//#define NDEBUG
#include <assert.h>
using namespace std;
/*
Global variables
*/
class PairAlnInfo {
public:
string oname;
int sstart;
int send;
int ostart;
int oend;
char strand;
PairAlnInfo(string _oname,
int _sstart, int _send,
int _ostart, int _oend,
char _strand):
oname(_oname),
sstart(_sstart),
send(_send),
ostart(_ostart),
oend(_oend),
strand(_strand){}
PairAlnInfo():
oname("DUMMY"),
sstart(-1),
send(-1),
ostart(-1),
oend(-1),
strand(-1){}
};
vector<string> &split(const string &s, char delim, vector<string> &elems) {
stringstream ss(s);
string item;
while(getline(ss, item, delim)) {
elems.push_back(item);
}
return(elems);
}
vector<string> split(const string &s, char delim) {
vector<string> elems;
return(split(s, delim, elems));
}
#define DEF_MIN_LEN (200)
#define DEF_MIN_SCORE (200)
typedef map<int,PairAlnInfo> PairAlnInfoByPos;
typedef map<string, PairAlnInfoByPos > ChromToPairAlnInfoByPos;
ChromToPairAlnInfoByPos pairAlnInfoByPosByChrom;
void usage()
/* Explain usage and exit. */
{
errAbort(
(char*)"mafPairwiseSyntenyDecay -- Calculates pairwise syntenic decay from maf alignment containing at least the two specified species.\n"
"usage:\n"
"\tmafPairwiseSyntenyDecay [options] [*required options] file1.maf[.gz] ... \n"
"Options:\n"
"\t-help\tPrints this message.\n"
"\t-minScore=NUM\tMinimum MAF alignment score to consider (default 200)\n"
"\t-minAlnLen=NUM\tMinimum MAF alignment block length to consider (default 200)\n"
"\t-speciesMain=NAME\t*Name of the main species (exactly as it appears before the '.') in the maf file (REQUIRED)\n"
"\t-speciesOther=NAME\t*Name of the other species (exactly as it appears before the '.') in the maf file (REQUIRED)\n"
);
}//end usage()
static struct optionSpec options[] = {
/* Structure holding command line options */
{(char*)"help",OPTION_STRING},
{(char*)"minScore",OPTION_INT},
{(char*)"minAlnLen",OPTION_INT},
{(char*)"speciesMain",OPTION_STRING},
{(char*)"speciesOther",OPTION_STRING},
{NULL, 0}
}; //end options()
/**
* Main function, takes filenames for paired qseq reads
* and outputs three files.
*/
int iterateOverAlignmentBlocksAndStorePairInfo(char *fileName, const int minScore, const int minAlnLen, const string speciesMain, const string speciesOther){
struct mafFile * mFile = mafOpen(fileName);
struct mafAli * mAli;
//loop over alignment blocks
while((mAli = mafNext(mFile)) != NULL){
struct mafComp *first = mAli->components;
int seqlen = mAli->textSize;
//First find and store set of duplicates in this block
set<string> seen;
set<string> dups;
if(mAli->score < minScore || seqlen < minAlnLen){
//free here and pre-maturely end
mafAliFree(&mAli);
continue;
}
for(struct mafComp *item = first; item != NULL; item = item->next){
string tmp(item->src);
string tname = split(tmp,'.')[0];
if(seen.count(tname)){
//seen this item
dups.insert(tname);
}else{
seen.insert(tname);
}
}
for(struct mafComp *item1 = first; item1->next != NULL; item1 = item1->next){
//stop one before the end
string tmp1(item1->src);
vector<string> nameSplit1(split(tmp1,'.'));
string name1(nameSplit1[0]);
if(dups.count(name1) || (name1 != speciesMain && name1 != speciesOther)){
continue;
}
for(struct mafComp *item2 = item1->next; item2 != NULL; item2 = item2->next){
string tmp2(item2->src);
vector<string> nameSplit2(split(tmp2,'.'));
string name2 = nameSplit2[0];
if(dups.count(name2) || (name2 != speciesMain && name2 != speciesOther)){
continue;
}
string chr1(nameSplit1[1]);
string chr2(nameSplit2[1]);
char strand;
if(item1->strand == item2->strand)
strand = '+';
else
strand = '-';
int start1,end1,start2,end2;
if(item1->strand == '+'){
start1 = item1->start;
end1 = start1 + item1->size;
}else{
end1 = item1->start;
start1 = end1 - item1->size;
}
if(item2->strand == '+'){
start2 = item2->start;
end2 = start2+ item2->size;
}else{
end2 = item2->start;
start2 = end2 - item2->size;
}
if(name1 == speciesMain){
PairAlnInfo aln(chr2,start1,end1,start2,end2,strand);
pairAlnInfoByPosByChrom[chr1][start1] = aln;
}else{
PairAlnInfo aln(chr1,start2,end2,start1,end1,strand);
pairAlnInfoByPosByChrom[chr2][start2] = aln;
}
} //end loop over item2
} //end loop over item1
mafAliFree(&mAli);
}//end loop over alignment blocks
mafFileFree(&mFile);
return(0);
}
int main(int argc, char *argv[])
/* Process command line. */
{
optionInit(&argc, argv, options);
if(optionExists((char*)"help") || argc <= 1){
usage();
}
int minAlnScore = optionInt((char*)"minScore",DEF_MIN_SCORE);
int minAlnLen = optionInt((char*)"minAlnLen",DEF_MIN_LEN);
string speciesMain(optionVal((char*)"speciesMain",NULL));
string speciesOther(optionVal((char*)"speciesOther",NULL));
if(speciesMain.empty() || speciesOther.empty())
usage();
//load the relevant alignment info from the maf(s)
for(int i = 1; i<argc; i++){
iterateOverAlignmentBlocksAndStorePairInfo(argv[i], minAlnScore, minAlnLen, speciesMain, speciesOther);
}
const int blockSize = 1000;
const int blockCount = 100;
int totalWindows[blockCount] = {0};
int containBreak[blockCount] = {0};
//we want the fraction of windows of each size that contain a break
//
for(ChromToPairAlnInfoByPos::iterator mainChromItter = pairAlnInfoByPosByChrom.begin();
mainChromItter != pairAlnInfoByPosByChrom.end();
mainChromItter++){
//process the alignments shared by this chromosome
//note that map stores them sorted by begin position
vector<int> keys;
for(PairAlnInfoByPos::iterator posIter = mainChromItter->second.begin();
posIter != mainChromItter->second.end();
posIter++){
keys.push_back(posIter->first);
}
for(int i = 0; i < keys.size(); i++){
//first check for trivial window (ie our block)
PairAlnInfo pi1 = mainChromItter->second[keys[i]];
assert(pi1.send > pi1.sstart);
assert(pi1.sstart == keys[i]);
int numBucketsThisWindow = (pi1.send - pi1.sstart) / blockSize;
for(int k = 0; k < numBucketsThisWindow && k < blockCount; k++)
totalWindows[k]++;
for(int j = i+1; j < keys.size(); j++){
PairAlnInfo pi2 = mainChromItter->second[keys[j]];
assert(pi2.sstart == keys[j]);
assert(pi2.send > pi2.sstart);
assert(pi2.sstart > pi1.sstart);
if(pi2.oname == pi1.oname){
int moreToInc = (pi2.send - pi1.sstart) / blockSize;
for(int k = numBucketsThisWindow; k < moreToInc && k < blockCount; k++)
totalWindows[k]++;
numBucketsThisWindow = moreToInc; //so we don't double count
}else{
int numDiscontigBuckets = (pi2.send - pi1.sstart) / blockSize;
for(int k = numBucketsThisWindow; k < numDiscontigBuckets && k < blockSize; k++){
containBreak[k]++;
totalWindows[k]++;
}
numBucketsThisWindow = numDiscontigBuckets;
}
if((keys[j] - keys[i]) >= (blockSize * blockCount)){
//i = j;
break;
}
}
}
}
cout << "#WindowSize\tNumContainBreak\tNumTotal\t1-(NumContainBreak/NumTotal)" << endl;
for(int i = 0; i < blockCount; i++){
cout << (i+1)*blockSize << '\t';
cout << containBreak[i] << '\t';
cout << totalWindows[i] << '\t';
cout << (totalWindows[i] > 0? 1.0 - (double(containBreak[i])/double(totalWindows[i])): 0) << endl;
}
return(0);
} //end main()

Try running your program under valgrind. It reports memory that is possibly or definitely lost, uses of uninitialised values, invalid reads and writes, and so on.

Your issues are probably due to memory corruption occurring at some point in the program, some time before the errors you are actually seeing.
One potential issue in the code you posted is the loop:
for(int k = numBucketsThisWindow; k<numDiscontigBuckets && k < blockSize; k++){
which uses blockSize instead of the correct blockCount, so it can run past the end of both the totalWindows[] and containBreak[] arrays. This would overwrite the speciesMain and speciesOther strings, along with anything else on the stack, which might very well result in the errors you are seeing.
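To make the bound explicit, here is a minimal sketch of the counting step with the loop clamped to the array size. The names bumpBuckets, bucketsFor and Buckets are hypothetical (they are not in the posted program), and std::array::at() is used only so that a bad index throws instead of silently corrupting the stack.

```cpp
#include <array>
#include <cassert>

const int blockSize = 1000;  // bases per bucket, as in the question
const int blockCount = 100;  // number of buckets, as in the question

typedef std::array<int, blockCount> Buckets;

// Hypothetical helper: how many whole buckets a span of bases covers.
int bucketsFor(int span) {
    return span / blockSize;
}

// Hypothetical helper: increments buckets [from, to), never past the array.
void bumpBuckets(Buckets& counts, int from, int to) {
    assert(from >= 0);
    // The upper bound is blockCount (array length), not blockSize.
    for (int k = from; k < to && k < blockCount; k++) {
        counts.at(k)++;  // at() throws std::out_of_range on a bad index
    }
}
```

With this shape, a span of 500000 bases asks for 500 buckets but only indices up to 99 are ever touched.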

Related

Heap corruption detected in C++ after removing strings

When running this code I get an error as shown in the image below.
I've tried running it on GCC compiler and it worked fine. But when running it on Visual Studio on Windows this error appeared:
Debug Error!
Program: C:\Users\yudab\source\repos\Project2\Debug\Project2.exe
HEAP CORRUPTION DETECTED: after Normal block (#153) at 0x014FD2E0.
CRT detected that the application wrote to memory after end of heap buffer.
After some testing it seems as the error only appears after trying to delete the second word.
#include <cstring>
#include <string>
#pragma warning(disable : 4996)
#include <iostream>
using namespace std;
void delStr(char**& lexicon, int& lexSize, char word[]);
void printAll(char** lexicon, int lexSize);
void retract2dArr(char**& arr, int& size);
int main() {
char** lexicon = new char* [3];
lexicon[0] = new char[6]{ "hello" };
lexicon[1] = new char[5]{ "test" };
lexicon[2] = new char[6]{ "world" };
int size = 3;
char removeTest[5] = { "test" }; //The first word I want to remove from the list
char removeWorld[6] = { "world" }; //The second word I want to remove from the list
printAll(lexicon, size); //First prints the entire list
delStr(lexicon, size, removeTest); //Removes the first word
delStr(lexicon, size, removeWorld); //Removes the second word
printAll(lexicon, size); //Prints the list after deleting the words
return 0;
}
void delStr(char**& lexicon, int& lexSize, char word[]) {
bool toDelete = false;
for (int i = 0; i < lexSize; i++) {
if (strcmp(lexicon[i], word) == 0) {
toDelete = true;
for (; i < lexSize - 1; i++) {
strcpy(lexicon[i], lexicon[i + 1]);
}
}
}
if (toDelete == true) {
delete[] lexicon[lexSize - 1];
retract2dArr(lexicon, lexSize);
}
return;
}
void printAll(char** lexicon, int lexSize) {
for (int i = 0; i < lexSize; i++) {
cout << lexicon[i];
if (i != lexSize - 1) {
cout << " ";
}
}
cout << endl;
return;
}
void retract2dArr(char**& arr, int& size) {
size--;
char** newArr = new char* [size];
for (int i = 0; i < size; i++) {
*(newArr + i) = *(arr + i);
}
printAll(newArr, size);
delete[] arr;
arr = newArr;
return;
}
You can't safely strcpy one string over another here:
if (strcmp(lexicon[i], word) == 0) {
toDelete = true;
for (; i < lexSize - 1; i++) {
strcpy(lexicon[i], lexicon[i + 1]);
}
}
The buffers were allocated with different lengths, so a longer word may not fit in its neighbour's buffer.
Example:
lexicon[0] = new char[6]{ "hello" };
lexicon[1] = new char[5]{ "test" }; // length is 4
lexicon[2] = new char[6]{ "world" }; // length is 5
The 3rd string won't fit in the 2nd string's buffer, so the copy writes out of bounds.
As kiran Biradar pointed out, the strcpy is to blame here. Instead of copying each word in the lexicon into the memory allocated for the previous word, it would probably be better to simply move the pointers back within the lexicon array.
Try something like this for your delStr function:
void delStr(char**& lexicon, int& lexSize, char word[]) {
for (int i = 0; i < lexSize; i++) {
if (strcmp(lexicon[i], word) == 0) {
delete[] lexicon[i];
for (; i < lexSize - 1; i++) {
lexicon[i] = lexicon[i + 1];
}
retract2dArr(lexicon, lexSize);
}
}
}
P.S. You didn't need to use a toDelete flag; you could call the retract2dArr function within the first if.
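If raw char arrays are not a requirement, the same operation can be sketched with std::vector<std::string>, which removes the manual new/delete bookkeeping entirely. This is an alternative to the pointer-shifting version, not what the question asked for, and note that it removes every matching word rather than only the first.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Removes every entry equal to word; returns true if anything was removed.
bool delStr(std::vector<std::string>& lexicon, const std::string& word) {
    // std::remove shifts the kept elements forward and returns the new logical end.
    std::vector<std::string>::iterator newEnd =
        std::remove(lexicon.begin(), lexicon.end(), word);
    bool removed = (newEnd != lexicon.end());
    lexicon.erase(newEnd, lexicon.end());
    // Postcondition: the word is no longer present.
    assert(std::find(lexicon.begin(), lexicon.end(), word) == lexicon.end());
    return removed;
}
```

Starting from {"hello", "test", "world"}, removing "test" and then "world" leaves a single element, "hello".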

Modifying integer value from another process using process_vm_readv

I am using Ubuntu Linux to write two programs. I am attempting to change the value of an integer from another process. My first process (A) is a simple program that loops forever and displays the value to the screen. This program works as intended and simply displays the value -1430532899 (0xAABBCCDD) to the screen.
#include <stdio.h>
int main()
{
//The needle that I am looking for to change from another process
int x = 0xAABBCCDD;
//Loop forever printing out the value of x
int counter = 0;
while(1==1)
{
while(counter<100000000)
{
counter++;
}
counter = 0;
printf("%d",x);
fflush(stdout);
}
return 0;
}
In a separate terminal, I use the ps -e command to list the processes and note the process ID of process (A). Next, as root (using sudo), I run this next program (B) and enter the process ID that I noted for process (A).
The program basically searches for the needle, which is stored backwards in memory (DD CC BB AA), finds it, and takes note of the address. It then tries to write the hex value (0xEEEEEEEE) to that same location, but the write fails and errno is set to 14 (bad address). The strange thing is that a little later in the address space I am able to write the values successfully, to the address 0x601000, but I cannot write to 0x6005DF, where the needle (0xAABBCCDD) is. (I can obviously read there, because that is where I found the needle.)
#include <stdio.h>
#include <iostream>
#include <sys/uio.h>
#include <string>
#include <errno.h>
#include <vector>
using namespace std;
char getHex(char value);
string printHex(unsigned char* buffer, int length);
int getProcessId();
int main()
{
//Get the process ID of the process we want to read and write
int pid = getProcessId();
//Lists of addresses where we find our needle 0xAABBCCDD and the addresses where we simply cannot read
vector<long> needleAddresses;
vector<long> unableToReadAddresses;
unsigned char buf1[1000]; //buffer used to store memory values read from other process
//Number of bytes read, also is -1 if an error has occurred
ssize_t nread;
//Structures used in the process_vm_readv system call
struct iovec local[1];
struct iovec remote[1];
local[0].iov_base = buf1;
local[0].iov_len = 1000;
remote[0].iov_base = (void * ) 0x00000; //start at address 0 and work up
remote[0].iov_len = 1000;
for(int i=0;i<10000;i++)
{
nread = process_vm_readv(pid, local, 1, remote, 1 ,0);
if(nread == -1)
{
//errno is 14 then the problem is "bad address"
if(errno == 14)
unableToReadAddresses.push_back((long)remote[0].iov_base);
}
else
{
cout<<printHex(buf1,local[0].iov_len);
for(int j=0;j<1000-3;j++)
{
if(buf1[j] == 0xDD && buf1[j+1] == 0xCC && buf1[j+2] == 0xBB && buf1[j+3] == 0xAA)
{
needleAddresses.push_back((long)(remote[0].iov_base+j));
}
}
}
remote[0].iov_base += 1000;
}
cout<<"Addresses found at...";
for(int i=0;i<needleAddresses.size();i++)
{
cout<<needleAddresses[i]<<endl;
}
//How many bytes written
int nwrite = 0;
struct iovec local2[1];
struct iovec remote2[1];
unsigned char data[] = {0xEE,0xEE,0xEE,0xEE};
local2[0].iov_base = data;
local2[0].iov_len = 4;
remote2[0].iov_base = (void*)0x601000;
remote2[0].iov_len = 4;
for(int i=0;i<needleAddresses.size();i++)
{
cout<<"Attempting to write "<<printHex(data,4)<<" to address "<<needleAddresses[i]<<endl;
remote2[0].iov_base = (void*)needleAddresses[i];
nwrite = process_vm_writev(pid,local2,1,remote2,1,0);
if(nwrite == -1)
{
cout<<"Error writing to "<<needleAddresses[i]<<endl;
}
else
{
cout<<"Successfully wrote data";
}
}
//For some reason THIS will work
remote2[0].iov_base = (void*)0x601000;
nwrite = process_vm_writev(pid,local2,1,remote2,1,0);
cout<<"Wrote "<<nwrite<<" Bytes to the address "<<0x601000 <<" "<<errno;
return 0;
}
string printHex(unsigned char* buffer, int length)
{
string retval;
char temp;
for(int i=0;i<length;i++)
{
temp = buffer[i];
temp = temp>>4;
temp = temp & 0x0F;
retval += getHex(temp);
temp = buffer[i];
temp = temp & 0x0F;
retval += getHex(temp);
retval += ' ';
}
return retval;
}
char getHex(char value)
{
if(value < 10)
{
return value+'0';
}
else
{
value = value - 10;
return value+'A';
}
}
int getProcessId()
{
int data = 0;
printf("Please enter the process id...");
scanf("%d",&data);
return data;
}
Bottom line is that I cannot modify the repeating printed integer from another process.
I can see at least these problems.
No one guarantees there's 0xAABBCCDD anywhere in the writable memory of the process. The compiler can optimize it away entirely, or put it in a register. One way to ensure a variable will be placed in main memory is to declare it volatile.
volatile int x = 0xAABBCCDD;
No one guarantees there's no 0xAABBCCDD somewhere in the read-only memory of the process. On the contrary, one could be quite certain there is in fact such a value there. Where else could the program possibly obtain it to initialise the variable? The initialisation probably translates to an assembly instruction similar to this
mov eax, 0xAABBCCDD
which, unsurprisingly, contains a bit pattern that matches 0xAABBCCDD. The address 0x6005DF could well be in the .text section. It is extremely unlikely it is on the stack, because stack addresses are typically close to the top of the address space.
The address space of a 64-bit process is huge. There is no hope to traverse it all in a reasonable amount of time. One needs to limit the range of addresses somehow.
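One way to limit the range on Linux, sketched here as an assumption rather than something the question's program does, is to read /proc/<pid>/maps and scan only the regions whose permission field contains 'w'. Each line of that file starts with "start-end perms ...", with the addresses in hex. Region, parseMapsLine and writableRegions are hypothetical names.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Region {
    unsigned long long start;
    unsigned long long end;
    std::string perms;  // e.g. "rw-p"
};

// Parses the leading "start-end perms" fields of one maps line.
bool parseMapsLine(const std::string& line, Region& out) {
    std::istringstream in(line);
    char dash = 0;
    if (!(in >> std::hex >> out.start >> dash >> out.end >> out.perms)) {
        return false;
    }
    return dash == '-' && out.start < out.end && out.perms.size() == 4;
}

// Collects the regions that are mapped writable (second permission char is 'w').
std::vector<Region> writableRegions(std::istream& maps) {
    std::vector<Region> result;
    std::string line;
    while (std::getline(maps, line)) {
        Region r;
        if (parseMapsLine(line, r) && r.perms[1] == 'w') {
            assert(r.start < r.end);  // guaranteed by parseMapsLine
            result.push_back(r);
        }
    }
    return result;
}
```

In real use the stream would be a std::ifstream opened on "/proc/" + pid + "/maps"; the search loop would then walk only the returned [start, end) ranges instead of counting up from address 0.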

CUDA, a new initializer may not be specified for an array

I currently have this code, but I get an error when compiling: "a new initializer may not be specified for an array". Nowhere in my code am I declaring a NEW array, so I'm unsure where and how this problem is occurring. Here's the section of code where I think the problem is:
struct indvPasswords
{
int blockId;
int threadId;
list<char[1024]> passwords;
};
//22,500 individual blocks
int gridSize = 150;
//1024 threads contained within each block
int blockSize = 32;
int totalNumThreads = (gridSize*gridSize)*(blockSize*blockSize);
indvPasswords* pointerToPass = new indvPasswords[totalNumThreads];
Here is all the source code. It currently uses cuda for dictionary password look ups...
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#include <iostream>
#include <fstream>
#include <string>
#include <list>
using namespace std;
//////////////////////////////
// Constants //
//////////////////////////////
string password = "C:\\Users\\james\\Desktop\\password.txt";
string passwordListFile = "C:\\Users\\james\\Desktop\\10millionpasswordlist.txt";
struct indvPasswords
{
int blockId;
int threadId;
list<char[1024]> passwords;
};
//22,500 individual blocks
int gridSize = 150;
//1024 threads contained within each block
int blockSize = 32;
int totalNumThreads = (gridSize*gridSize)*(blockSize*blockSize);
indvPasswords* pointerToPass = new indvPasswords[totalNumThreads];
//Some serial setup first
string passwordFile()
{
string pwd = "";
ifstream fileStream(password);
if (fileStream.is_open()) {
getline(fileStream, pwd);
if (pwd != "") {
//Found a password
fileStream.close();
}
else {
cout << "No password found in file" << '\n';
}
}
else {
cout << "Cannot open password file" << '\n';
}
return pwd;
}
list<string> readPasswordList()
{
//open password list
string line = "";
ifstream fileStream(passwordListFile);
list<string> passwordList;
if (fileStream.is_open()) {
while (getline(fileStream, line)) {
passwordList.push_back(line);
}
}
else {
cout << "Cannot open password file" << '\n';
}
return passwordList;
}
void indexing()
{
list<string> passwords = readPasswordList();
int sizeOfList = passwords.size();
int modulus = sizeOfList%gridSize;
int runFlag = 0;
if (modulus != 0) {
//not evenly divided, pass overflow to first block
//Take the modulus off the size of password list, i.e.
//take a number of passwords from the top of the list.
//Temporarily store the passwords removed and give them
//as additional work to the first block.
list<string> temp;
for (int i = 0; i < modulus; i++) {
temp.push_back(passwords.front());
passwords.pop_front();
}
//Now evenly divide up the passwords amoungst the blocks
int numEach = passwords.size();
//The first for loop, iterates through each BLOCK within the GRID
for (int i = 0; i < (gridSize*gridSize); i++) {
//Set flag, for single run of first block list
if (i == 0) {
runFlag = 1;
}
//The second loop, iterates through each THREAD within the current BLOCK
for (int j = 0; j < (blockSize*blockSize); j++){
//setup the indexes
pointerToPass[i].blockId = i;
pointerToPass[i].threadId = j;
//The third loop, copies over a collection of passwords for each THREAD
//within the current BLOCK
for (int l = 0; l < numEach; l++) {
//Add the temporary passwords to the first block
if (runFlag == 1) {
for (int x = 0; x < temp.size(); x++){
//Convert
string tmp = temp.front();
char charTmp[1024];
strcpy(charTmp, tmp.c_str());
pointerToPass[i].passwords.push_back(charTmp);
temp.pop_front();
}
//Now never block run again
runFlag = 0;
}
//convert the passwords from string to char[] before adding
string tmp = passwords.front();
char charTmp[1024];
strcpy(charTmp, tmp.c_str());
pointerToPass[i].passwords.push_back(charTmp);
//now delete the item from passwords that has just been transfered
//over to the other list
passwords.pop_front();
}
}
}
}
else {
//Start to transfer passwords
//First calculate how many passwords each thread will be given to check
int numEach = sizeOfList / totalNumThreads;
//The first for loop, iterates through each BLOCK within the GRID
for (int i = 0; i < (gridSize*gridSize); i++) {
//The second loop, iterates through each THREAD within the current BLOCK
for (int j = 0; j < (blockSize*blockSize); j++){
//setup the indexes
pointerToPass[i].blockId = i;
pointerToPass[i].threadId = j;
//The third loop, copies over a collection of passwords for each THREAD
//within the current BLOCK
for (int l = 0; l < numEach; l++) {
//convert the passwords from string to char[] before adding
string tmp = passwords.front();
char charTmp[1024];
strcpy(charTmp, tmp.c_str());
pointerToPass[i].passwords.push_back(charTmp);
//now delete the item from passwords that has just been transfered
//over to the other list
passwords.pop_front();
}
}
}
}
}
//Now onto the parallel
__device__ bool comparison(indvPasswords *collection, char *password, int *totalThreads)
{
//return value
bool val = false;
//first calculate the block ID
int currentBlockId = blockIdx.y * gridDim.x + blockIdx.x;
//calculate the thread ID
int currentThreadId = currentBlockId * blockDim.y * blockDim.x + threadIdx.y * blockDim.x + threadIdx.x;
//Now go and find the data required for this thread
//Get the block
for (int i = 0; i < (int)totalThreads; i++) {
if (collection[i].blockId == currentBlockId) {
//Now check the thread id
if (collection[i].threadId == currentThreadId) {
//Now begin comparison
for (int j = 0; j < collection[i].passwords.size(); j++)
{
if (password == collection[i].passwords.front()) {
//password found!
val = true;
break;
}
//delete the recently compared password
collection[i].passwords.pop_front();
}
}
}
}
return val;
}
__global__ void kernelComp(indvPasswords *collection, char *password, int *totalThreads)
{
comparison(collection, password, totalThreads);
}
int main()
{
int size = totalNumThreads;
//grid and block sizes
dim3 gridDim(gridSize, gridSize, 1);
dim3 blockDim(blockSize, blockSize, 1);
//Get the password
string tmp = passwordFile();
//convert
char pwd[1024];
strcpy(pwd, tmp.c_str());
indexing();
//device memory pointers
indvPasswords* array_d;
char *pwd_d;
int *threadSize_d;
int *threads = &totalNumThreads;
//allocate host memory
pwd_d = (char*)malloc(1024);
//allocate memory
cudaMalloc((void**)&array_d, size);
cudaMalloc((void**)&pwd_d, 1024);
cudaMalloc((void**)&threadSize_d, sizeof(int));
//copy the data to the device
cudaMemcpy(array_d, pointerToPass, size, cudaMemcpyHostToDevice);
cudaMemcpy(pwd_d, pwd, 1024, cudaMemcpyHostToDevice);
cudaMemcpy(threadSize_d, threads, sizeof(int), cudaMemcpyHostToDevice);
//call the kernel
kernelComp << <gridDim, blockDim>> >(array_d,pwd_d,threadSize_d);
//copy results over
return 0;
}
When I compile your code on Linux with the -std=c++11 compiler switch for nvcc, I get a warning on this line:
for (int i = 0; i < (int)totalThreads; i++) {
in your comparison function. Given that totalThreads is a pointer passed to the function:
__device__ bool comparison(indvPasswords *collection, char *password, int *totalThreads)
that usage seems almost certainly broken. I'm pretty sure that warning should not be ignored and what you really want is:
for (int i = 0; i < *totalThreads; i++) {
The only error I get is on this line:
pointerToPass[i].passwords.push_back(charTmp);
but it doesn't seem to be exactly the error you are reporting:
$ nvcc -std=c++11 -o t1016 t1016.cu
/usr/include/c++/4.8.3/bits/stl_list.h(114): error: a value of type "const char *" cannot be used to initialize an entity of type "char [1024]"
detected during:
instantiation of "std::_List_node<_Tp>::_List_node(_Args &&...) [with _Tp=char [1024], _Args=<const std::list<char [1024], std::allocator<char [1024]>>::value_type &>]"
/usr/include/c++/4.8.3/ext/new_allocator.h(120): here
instantiation of "void __gnu_cxx::new_allocator<_Tp>::construct(_Up *, _Args &&...) [with _Tp=std::_List_node<char [1024]>, _Up=std::_List_node<char [1024]>, _Args=<const std::list<char [1024], std::allocator<char [1024]>>::value_type &>]"
(506): here
instantiation of "std::list<_Tp, _Alloc>::_Node *std::list<_Tp, _Alloc>::_M_create_node(_Args &&...) [with _Tp=char [1024], _Alloc=std::allocator<char [1024]>, _Args=<const std::list<char [1024], std::allocator<char [1024]>>::value_type &>]"
(1561): here
instantiation of "void std::list<_Tp, _Alloc>::_M_insert(std::list<_Tp, _Alloc>::iterator, _Args &&...) [with _Tp=char [1024], _Alloc=std::allocator<char [1024]>, _Args=<const std::list<char [1024], std::allocator<char [1024]>>::value_type &>]"
(1016): here
instantiation of "void std::list<_Tp, _Alloc>::push_back(const std::list<_Tp, _Alloc>::value_type &) [with _Tp=char [1024], _Alloc=std::allocator<char [1024]>]"
t1016.cu(108): here
1 error detected in the compilation of "/tmp/tmpxft_0000182c_00000000-8_t1016.cpp1.ii".
The solution to that issue is probably to not use an array as the list element type, as @ManosNikolaidis has already suggested.
However a bigger issue I see in your code is that you are attempting to use items from the c++ standard library in __device__ functions, and that won't work:
struct indvPasswords
{
int blockId;
int threadId;
list<char[1024]> passwords; // using std::list
};
...
__global__ void kernelComp(indvPasswords *collection, char *password, int *totalThreads)
^^^^^^^^^^^^^
// can't use std::list in device code
You should re-write your data structures used by the device to only depend on built-in types, or types derived from those built-in types.
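As a rough host-side sketch of what "built-in types only" could look like, the passwords can be packed into one contiguous char buffer with fixed-size slots, plus an index array saying where each thread's share begins. FlatPasswords, flatten and kSlotSize are hypothetical names, the slot size of 64 is an arbitrary assumption, and longer passwords are truncated in this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

const std::size_t kSlotSize = 64;  // assumed fixed bytes per password, NUL included

struct FlatPasswords {
    std::vector<char> chars;      // passwords.size() * kSlotSize bytes, NUL-padded
    std::vector<int> firstIndex;  // numThreads + 1 entries; thread t owns
                                  // passwords [firstIndex[t], firstIndex[t + 1])
};

FlatPasswords flatten(const std::vector<std::string>& passwords, int numThreads) {
    assert(numThreads > 0);
    FlatPasswords flat;
    flat.chars.assign(passwords.size() * kSlotSize, '\0');
    for (std::size_t i = 0; i < passwords.size(); i++) {
        // Copy at most kSlotSize - 1 bytes so every slot stays NUL-terminated.
        std::strncpy(&flat.chars[i * kSlotSize], passwords[i].c_str(), kSlotSize - 1);
    }
    // Split the password indices across threads as evenly as possible.
    flat.firstIndex.resize(numThreads + 1);
    for (int t = 0; t <= numThreads; t++) {
        flat.firstIndex[t] = static_cast<int>(passwords.size() * t / numThreads);
    }
    return flat;
}
```

Both vectors hold plain chars and ints in contiguous memory, so their data() pointers and byte sizes are the kind of thing that can be handed to cudaMalloc and cudaMemcpy, and the kernel would then index into them with its own thread ID instead of searching a list.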
There is a problem in this line :
list<char[1024]> passwords;
Raw arrays cannot be used as the element type of STL containers, because the containers require the element type to be copy constructible and assignable. More details here. Try using a std::string for each password, or even a std::array<char, 1024>.
You are actually using new to create an array here
indvPasswords* pointerToPass = new indvPasswords[totalNumThreads];
but I don't see anything wrong with that
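For the host-side list, a minimal sketch of the std::array variant might look like the following. PasswordBuf and toBuffer are hypothetical names; the assert documents the assumption that no password exceeds 1023 characters.

```cpp
#include <array>
#include <cassert>
#include <cstring>
#include <list>
#include <string>

typedef std::array<char, 1024> PasswordBuf;

// Copies s into a zero-filled fixed-size buffer that a std::list can store.
PasswordBuf toBuffer(const std::string& s) {
    PasswordBuf buf{};  // value-initialised: all bytes are '\0'
    assert(s.size() < buf.size());
    // Copy at most size - 1 bytes so the buffer stays NUL-terminated.
    std::strncpy(buf.data(), s.c_str(), buf.size() - 1);
    return buf;
}
```

Unlike char[1024], std::array<char, 1024> is copyable, so std::list<PasswordBuf>::push_back(toBuffer(tmp)) compiles where push_back(charTmp) did not.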

How to stop looping through array when index is empty?

If I have an array of size MAX_SIZE and only 20 indices are occupied, how do I make it stop printing 0s after itemList[19]? (I am reading in from a text file.)
const int MAX_SIZE = 1000;
item itemList[MAX_SIZE];
for(int i= 0; i<MAX_SIZE;i++)
{
itemList[i].Print(); //prints members in item
if(i==19) // I used this just to see what I was printing properly
{ //I know it is bad practice so I would like an alternative.
break; //Also, it is only possible if you have access to the text file.
}
}
You can perform a basic check:
if(itemList[i].Function() == 0) break;
As an alternative to break, you can use a while loop:
int i = 0;
while (i < MAX_SIZE && itemList[i] != 0) // check the bound first, so itemList[MAX_SIZE] is never read
{
itemList[i].Print();
i++;
}
Replace itemList[i] != 0 with whatever expression you're using to determine whether the element is occupied or not.
Alternatively, keep track of how many elements there are as you build up the array from the file, and only loop that many times.
Better still, remember that you're using C++, not C. Add elements from the file to a container such as std::vector instead of a raw array, then just loop through the whole thing. This also fixes a serious bug in your code; namely, that you will have undefined behaviour when there are more than 1000 entries in the file.
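A minimal sketch of the vector approach, assuming a hypothetical Item record with a name and a quantity per line (the question does not show what item actually contains):

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type standing in for the question's item class.
struct Item {
    std::string name;
    int quantity;
};

// Reads "name quantity" records until end of input or the first malformed record.
std::vector<Item> readItems(std::istream& in) {
    std::vector<Item> items;
    Item it;
    while (in >> it.name >> it.quantity) {
        items.push_back(it);
    }
    return items;
}

// Prints exactly the items that were read; there are no empty slots to skip.
void printAll(const std::vector<Item>& items, std::ostream& out) {
    for (std::size_t i = 0; i < items.size(); i++) {
        out << items[i].name << ' ' << items[i].quantity << '\n';
    }
}
```

The vector's size() is the number of occupied entries, so the loop needs neither a sentinel value nor a break, and a file with more than 1000 records simply produces a longer vector.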
const int MAX_SIZE = 1000;
item itemList[MAX_SIZE];
for (size_t i = 0; i < sizeof(itemList) / sizeof(*itemList); i++)
{
if (itemList[i] != 0)
itemList[i].Print();
}
or use a vector
std::vector<int> itemList;
for (int i = 0; i < itemList.size(); i++)
{
if (itemList[i] != 0)
{
// do stuff
}
}
A short lesson in:
correctly iterating through a null-terminated pointer array
cleaning up arrays of pointers in an exception safe way
solving your problem correctly
#include <iostream>
#include <algorithm>
#include <memory>
using namespace std;
struct Oper {
void exec() const { cout << "Hello\n"; }
~Oper() { cout << "deleted\n"; }
};
int main()
{
Oper* myArray[1000];
fill(begin(myArray), end(myArray), nullptr);
// make a sentinel to ensure that the array is cleaned up if there is an exception
std::shared_ptr<void> sentinel(&myArray, [&](void*) {
// clean up array
for (auto p = begin(myArray) ; p != end(myArray) ; ++p) {
delete *p;
}
});
myArray[0] = new Oper;
myArray[1] = new Oper;
myArray[2] = new Oper;
myArray[4] = new Oper; // note: missed 3
for(auto p = begin(myArray) ; p != end(myArray) && *p ; ++p) {
(*p)->exec();
}
return 0;
}
Output:
Compiling the source code....
$g++ -std=c++11 main.cpp -o demo -lm -pthread -lgmpxx -lgmp -lreadline 2>&1
Executing the program....
$demo
Hello
Hello
Hello
deleted
deleted
deleted
deleted
for(int i= 0; i<MAX_SIZE && i<20;i++)
This looks better, because using break here is a bad pattern.

Double free or corruption (fasttop) while executing Parallel version

I am trying to implement a serial algorithm in a parallel manner using pthreads.
The following is the code that I wrote:
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <string>
#include <cmath>
#include <sys/time.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <ctime>
#include <vector>
#include <map>
#include <sstream>
using namespace std;
double similarity_score(char a,char b);
double find_array_max(double array[],int length);
void insert_at(char arr[], int n, int idx, char val);
void checkfile(int open, char filename[]);
void *main_algo(void* rank);
double valueChanger(int p1 , int p2);
string read_sequence(ifstream& f);
int ind;
double mu,delta;
double H_max = 0.;
int N_a; // get the actual lengths of the sequences
int N_b;
int** H;
int count_gap = 6;
int matrix_i = 1;
double temp[4];
char *nameof_seq_a;
char *nameof_seq_b;
string seq_a,seq_b;
int bigCount;
int start_i;
int stop_i;
int start_j;
int stop_j;
int stopSituation;
int k;
map<int,int> mymap;
std::string strValue;
std::stringstream out;
std::stringstream out2;
double value;
int main(int argc, char** argv){
// read info from arguments
if(argc!=5){
cout<<"Give me the proper number of input arguments:"<<endl<<"1 : mu"<<endl;
cout<<"2 : delta"<<endl<<"3 : filename sequence A"<<endl<<"4 : filename sequence B"<<endl;
cout<<"5 : maximal length N of sequences"<<endl;exit(1);
}
mu = atof(argv[1]);
delta = atof(argv[2]);
/////////////////////////////////
// give it the filenames
nameof_seq_a = argv[3];
nameof_seq_b = argv[4];
//int N_max = atoi(argv[5]);
// read the sequences into two vectors:
ifstream stream_seq_b; // first define the input-streams for seq_a and seq_b
stream_seq_b.open(nameof_seq_b); // the same for seq_b
checkfile(! stream_seq_b,nameof_seq_b);
seq_b = read_sequence(stream_seq_b);
ifstream stream_seq_a;
stream_seq_a.open(nameof_seq_a); // open the file for input
checkfile(! stream_seq_a,nameof_seq_a); // check, whether the file was opened successfully
seq_a = read_sequence(stream_seq_a);
// string s_a=seq_a,s_b=seq_b;
N_a = seq_a.length(); // get the actual lengths of the sequences
N_b = seq_b.length();
cout<<"N_a length "<<N_a<<endl;
cout<<"N_b length "<<N_b<<endl;
////////////////////////////////////////////////
// initialize H
int temp1 = N_a+1;
int temp2 = N_b+1;
// call to main algorithm
bigCount = N_a;
stopSituation = (N_a + N_b -1) / count_gap;
cout<<"--------Big Count----"<<bigCount<<endl;
cout<<"--------stopSituation----"<<stopSituation<<endl;
//cout<<"--------Matrix_i-------"<<matrix_i<<endl;
while(matrix_i <= stopSituation+1){
//cout<<"--------MATRIX_I----"<<matrix_i<<endl;
pthread_t myThreads[matrix_i];
for(k = 1; k < matrix_i +1 ; k++){
if(k == 1){
start_i = 1;
stop_j = (count_gap * (matrix_i))+1;
}
else if(k == matrix_i){
start_i = (count_gap * (k-1)) + 1;
stop_j = count_gap +1;
}
else{
start_i = (count_gap * (k-1)) + 1;
stop_j = (count_gap * ((matrix_i - k)+1))+1;
}
start_j = (count_gap * (matrix_i - k))+1;
stop_i = (count_gap * k)+1;
//main_algo();
pthread_create(&myThreads[k], NULL,main_algo, (void*)k);
}
//wait until all threads finish
for(k = 1; k<matrix_i+1; k++){
pthread_join(myThreads[k],NULL);
}
//sleep(1);
if(matrix_i > 1)
{
}
matrix_i++;
}
// Print the matrix H to the console
cout<<"**************** Max is = "<<H_max<<endl;
mymap.clear();
} // END of main
/////////////////////////////////////////////////////////////////////////////
// auxiliary functions used by main:
/////////////////////////////////////////////////////////////////////////////
void checkfile(int open, char filename[]){
if (open){cout << "Error: Can't open the file "<<filename<<endl;exit(1);}
else cout<<"Opened file "<<filename<<endl;
}
/////////////////////////////////////////////////////////////////////////////
double valueChanger(int p1 , int p2){
out.str(std::string());
out2.str(std::string());
out << p1;
out2 << p2;
strValue = out.str()+out2.str();
double value_new = atoi(strValue.c_str());
return value_new;
}
void *main_algo(void* rank){
//void main_algo(){
// algorithm
for(int i=start_i;i<stop_i;i++){
if(i > N_a) {
break;
}
for(int j=start_j;j<stop_j;j++){
if(j > N_b){
break;
}
int iCheck = i-1;
int jCheck = j-1;
value = valueChanger(iCheck , jCheck);
//double myVal = H[iCheck][jCheck];
double myVal;
if ( mymap.find(value) == mymap.end() ) {
// not found
myVal = 0;
} else {
// found
myVal = mymap.find(value)->second;
}
if(seq_a[i-1]==seq_b[j-1]){
temp[0]= myVal+1.0;
}
else{
temp[0]= myVal-mu;
}
value = valueChanger(i-1 , j-1);
if ( mymap.find(value) == mymap.end() ) {
// not found
temp[1] = 0;
} else {
// found
temp[1] = mymap.find(value)->second-delta;
}
value = valueChanger(i , j-1);
if ( mymap.find(value) == mymap.end() ) {
// not found
temp[2] = 0;
} else {
// found
temp[2] = mymap.find(value)->second-delta;
}
temp[3] = 0.;
value = valueChanger(i , j);
mymap[value] = find_array_max(temp,4);
if( H_max < mymap[value] ){
H_max = mymap[value];
}
}
}
cout<<"Done"<< my_rank<<endl;
}
double similarity_score(char a,char b){
double result;
if(a==b){
result=1.;
}
else{
result=-mu;
}
return result;
}
/////////////////////////////////////////////////////////////////////////////
double find_array_max(double array[],int length){
double max = array[0]; // start with max = first element
for(int i = 1; i<length; i++){
if(array[i] > max){
max = array[i];
}
}
return max; // return highest value in array
}
/////////////////////////////////////////////////////////////////////////////
string read_sequence(ifstream& f)
{
// overflows.
string seq;
char line[5000];
while( f.good() )
{
f.getline(line,5000);
// cout << "Line:" << line << endl;
if( line[0] == 0 || line[0]=='#' )
continue;
for(int i = 0; line[i] != 0; ++i)
{
int c = toupper(line[i]);
// cout << char(c);
if( c != 'A' && c != 'G' && c != 'C' && c != 'T' )
continue;
seq.push_back(char(c));
}
}
return seq;
}
When I execute this program, I get the following error:
*** Error in `./myProg': double free or corruption (fasttop): 0xb5300480 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x767e2)[0xb74e67e2]
/lib/i386-linux-gnu/libc.so.6(+0x77530)[0xb74e7530]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZdlPv+0x1f)[0xb7689aaf]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSs4_Rep10_M_destroyERKSaIcE+0x1b)[0xb76ee5ab]
/usr/lib/i386-linux-gnu/libstdc++.so.6(+0xad5f0)[0xb76ee5f0]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSs6assignERKSs+0x82)[0xb76efde2]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSsaSERKSs+0x23)[0xb76efe33]
./myProg[0x80499dc]
./myProg[0x8049c2f]
/lib/i386-linux-gnu/libpthread.so.0(+0x6d78)[0xb7730d78]
/lib/i386-linux-gnu/libc.so.6(clone+0x5e)[0xb75613de]
======= Memory map: ========
*** Error in `./myProg': double free or corruption (fasttop): 0x0829c8f8 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x767e2)[0xb74e67e2]
/lib/i386-linux-gnu/libc.so.6(+0x77530)[0xb74e7530]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZdlPv+0x1f)[0xb7689aaf]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSs4_Rep10_M_destroyERKSaIcE+0x1b)[0xb76ee5ab]
/usr/lib/i386-linux-gnu/libstdc++.so.6(+0xad5f0)[0xb76ee5f0]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSsaSERKSs+0x23)[0xb76efe33]
./myProg[0x80499dc]
./myProg[0x8049aee]
/lib/i386-linux-gnu/libpthread.so.0(+0x6d78)[0xb7730d78]
Aborted (core dumped)
Please feel free to copy the code and run it with the following two commands:
1) g++ [name].cpp -o [name] -lpthread
2) ./[name] 1 1 filetoread.txt filetoread2.txt
Please point out any mistakes I am making in my code.
Thank you!
You need to protect any non-thread-safe variables (such as std::map) with a mutex (or other locking mechanism). Since you're using pthread, take a look at pthread_mutex_lock, pthread_mutex_unlock, and pthread_mutex_destroy.
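A minimal sketch of that advice, assuming a hypothetical SharedTable wrapper in place of the question's global mymap: every read and write goes through pthread_mutex_lock and pthread_mutex_unlock. The names SharedTable, WorkerArgs, table_put, table_get and run_workers are illustrative and not taken from the question. Each thread also receives its own argument struct, so no per-thread range has to live in a shared global.

```cpp
#include <cstddef>
#include <map>
#include <pthread.h>
#include <vector>

// Hypothetical wrapper: a map plus the mutex that guards it.
struct SharedTable {
    std::map<int, int> cells;
    pthread_mutex_t lock;

    SharedTable() { pthread_mutex_init(&lock, nullptr); }
    ~SharedTable() { pthread_mutex_destroy(&lock); }
    SharedTable(const SharedTable&) = delete;
    SharedTable& operator=(const SharedTable&) = delete;
};

// Insert or overwrite one entry while holding the lock.
void table_put(SharedTable& t, int key, int value) {
    pthread_mutex_lock(&t.lock);
    t.cells[key] = value;
    pthread_mutex_unlock(&t.lock);
}

// Look up one entry while holding the lock; a missing key reads as 0.
int table_get(SharedTable& t, int key) {
    pthread_mutex_lock(&t.lock);
    auto it = t.cells.find(key);
    int value = (it == t.cells.end()) ? 0 : it->second;
    pthread_mutex_unlock(&t.lock);
    return value;
}

// Per-thread work description, passed by pointer instead of through globals.
struct WorkerArgs {
    SharedTable* table;
    int first_key;
    int count;
};

// Thread entry point: stores key * 2 for each key in this thread's range.
void* worker(void* raw) {
    WorkerArgs* args = static_cast<WorkerArgs*>(raw);
    for (int i = 0; i < args->count; ++i) {
        int key = args->first_key + i;
        table_put(*args->table, key, key * 2);
    }
    return nullptr;
}

// Start thread_count workers on disjoint key ranges, join them all,
// and return the number of entries stored. Error checking of
// pthread_create is omitted to keep the sketch short.
std::size_t run_workers(SharedTable& table, int thread_count, int keys_per_thread) {
    std::vector<pthread_t> threads(thread_count);
    std::vector<WorkerArgs> args(thread_count);
    for (int i = 0; i < thread_count; ++i) {
        args[i].table = &table;
        args[i].first_key = i * keys_per_thread;
        args[i].count = keys_per_thread;
        pthread_create(&threads[i], nullptr, worker, &args[i]);
    }
    for (int i = 0; i < thread_count; ++i) {
        pthread_join(threads[i], nullptr);
    }
    return table.cells.size();
}
```

Note that the thread array is indexed from 0 to thread_count - 1, which keeps every pthread_t slot inside the array.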
Here are some notes if you want to go multithreading:
don't (unless you really need to) ... (multithreading brings a new world of complexity and problems)
.... If you really need to, don't use more threads than processors (this is becoming muddled in recent years with CPUs/cores/hyperthreads... but what I mean by processor is what the hardware or virtualization allows you to run in parallel) (otherwise you will have scheduling contention at the OS/hardware level)
lock (mutexes) all common data access.... (otherwise you will corrupt your common data)
.... but try to lock the least possible time. (because every time you lock it means that all your threads need to wait and they will be doing nothing).
and don't use locks unless you need to (because locking/unlocking is expensive)
Avoid locking by using processing queues (e.g. to deliver commands or deliver results across threads). Instead of locking while processing the command/result, you can lock only while inserting/removing from the queue.
Avoid locking by subdividing a single map-like structure (with a single mutex) into an array of map-like structures (with an array of mutexes) indexed by a hash of the key of the original map. Instead of locking all threads, you will be locking only some of them.
If there's an imbalance between readers and writers, try a read/write mutex to minimize contention. This will allow multiple simultaneous readers or one writer.
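The subdivision idea from the notes above can be sketched roughly as follows. This is an illustration under assumptions, not a tuned implementation: the class name ShardedMap, the shard count of 16 and the int key/value types are all made up, and std::mutex with std::lock_guard is used in place of raw pthread mutexes so that each lock is released automatically.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>

// Hypothetical striped map: kShards independent maps, each guarded by its own mutex.
// Threads touching keys in different shards do not block each other.
class ShardedMap {
public:
    // Insert or overwrite one entry, locking only the shard that owns the key.
    void put(int key, int value) {
        Shard& shard = shard_for(key);
        std::lock_guard<std::mutex> guard(shard.lock);
        shard.cells[key] = value;
    }

    // Returns true and writes *value if the key is present, false otherwise.
    bool get(int key, int* value) {
        Shard& shard = shard_for(key);
        std::lock_guard<std::mutex> guard(shard.lock);
        auto it = shard.cells.find(key);
        if (it == shard.cells.end()) {
            return false;
        }
        *value = it->second;
        return true;
    }

    // Which shard a key maps to: a hash of the key modulo the shard count.
    std::size_t shard_index(int key) const {
        return std::hash<int>()(key) % kShards;
    }

private:
    static constexpr std::size_t kShards = 16;  // arbitrary choice for the sketch

    struct Shard {
        std::mutex lock;
        std::map<int, int> cells;
    };

    Shard& shard_for(int key) { return shards_[shard_index(key)]; }

    std::array<Shard, kShards> shards_;
};
```

The trade-off is that an operation spanning several keys (for example, reading three neighbouring cells and writing a fourth) is no longer atomic unless it takes all the relevant shard locks in a fixed order.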