C++ Time Library and Octave .oct files - c++

I am trying to write an Octave C++ .oct function that uses the linasm-1.13 library but I cannot seem to get even basic loading of tzdata from /usr/share/zoneinfo/ to work. My simple test function so far is
#include <octave/oct.h>
#include <Time.h> // the linasm-1.13 library
DEFUN_DLD ( tz, args, nargout,
"-*- texinfo -*-\n\
@deftypefn {Function File} {} tz (@var{YYYYMMDDHHMMSS})\n\
\n\
@end deftypefn" )
{
octave_value_list retval_list ;
unsigned int tz ;
const char *ny_time = "/usr/share/zoneinfo/America/New_York" ;
tz = Time::LoadTimeZone( ny_time ) ;
return retval_list ;
}
which, on compiling with mkoctfile, gives this error
>> mkoctfile tz.cc
tz.cc: In function ‘octave_value_list Ftz(const octave_value_list&, int)’:
tz.cc:24:34: error: cannot call member function ‘unsigned int Time::LoadTimeZone(const char*)’ without object
tz = Time::LoadTimeZone( ny_time ) ;
^
warning: mkoctfile: building exited with failure status
My understanding of this is that ny_time is not an object that is recognised, but I have tried casting ny_time as a string literal as detailed in this accepted SO answer.
I am doing things this way because the input for LoadTimeZone according to the linasm page should be a "path to tzfile, which describes required time zone." Where am I going wrong?

I think you also have to #include the source (.cc) files, not just the header (.h) files. In your case, I guess you should add #include "Time.cc" or something similar. I don't know why, but this worked for me with Rafat Hussain's wavemin library. I had only four files, though; it must be incredibly tedious with many files.
This is what I did (a modified version of the test code Rafat provides with his library).
#include "wavemin.h"
#include "waveaux.h"
#include "wavemin.cc"
#include "waveaux.cc"
#include <octave/oct.h>
double ensayo();
double absmax(double *array, int N);
DEFUN_DLD(helloctave2, argv, , "Usage: hello()"){
wave_object obj;
wt_object wt;
double *inp, *out, *diff;
int N, i, J;
char *name = "db4";
obj = wave_init(name);// Initialize the wavelet
N = 14; //Length of Signal
inp = (double*)malloc(sizeof(double)* N); //Input signal
out = (double*)malloc(sizeof(double)* N);
diff = (double*)malloc(sizeof(double)* N);
//wmean = mean(temp, N);
for (i = 0; i < N; ++i) {
inp[i] = i;
}
J = 1; //Decomposition Levels
wt = wt_init(obj, "dwt", N, J);// Initialize the wavelet transform object
setDWTExtension(wt, "sym");// Options are "per" and "sym". Symmetric is the default option
setWTConv(wt, "direct");
dwt(wt, inp);// Perform DWT
//DWT output can be accessed using wt->output vector. Use wt_summary to find out how to extract appx and detail coefficients
for (i = 0; i < wt->outlength; ++i) {
octave_stdout << wt->output[i];
octave_stdout << "\n";
}
idwt(wt, out);// Perform IDWT (if needed)
// Test Reconstruction
for (i = 0; i < wt->siglength; ++i) {
diff[i] = out[i] - inp[i];
}
octave_stdout << absmax(diff, wt->siglength);
octave_stdout << "\n";
// Release the buffers allocated with malloc above
free(inp);
free(out);
free(diff);
octave_value_list retval;
return retval;
}
double
absmax(double *array, int N) {
double max;
int i;
max = 0.0;
for (i = 0; i < N; ++i) {
if (fabs(array[i]) >= max) {
max = fabs(array[i]);
}
}
return max;
}
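Separately from how the library sources get compiled, the compiler message in the question ("cannot call member function ... without object") is what GCC reports when a non-static member function is called through the class name alone. It is about the missing object on the left of the call, not about the ny_time argument. A minimal sketch of that situation follows. ZoneLoader and load_zone are hypothetical stand-ins, not the linasm API, and whether linasm's Time class can be default-constructed like this is an assumption to check against its header.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a library class; NOT the linasm API.
class ZoneLoader {
public:
    // Non-static member function: it can only be called on an object.
    unsigned int load_zone(const char *path) {
        assert(path != nullptr);
        last_length_ = static_cast<unsigned int>(std::strlen(path));
        return last_length_;   // pretend this is a time zone handle
    }
private:
    unsigned int last_length_ = 0;
};

unsigned int load_new_york() {
    const char *ny_time = "/usr/share/zoneinfo/America/New_York";
    // ZoneLoader::load_zone(ny_time);  // error: cannot call member function without object
    ZoneLoader loader;                  // create an instance first
    return loader.load_zone(ny_time);   // then call the member function on it
}
```

In the .oct function this would amount to declaring an object of the library's class and calling LoadTimeZone on that object, assuming the class can be constructed that way.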

Related

How do I resolve this error: expected unqualified-id before '-' token

Context: Preparing for the Fall semester, I whipped up a quick code file to check whether you can use a function call as a parameter of another function. However, when I tried to compile the code to check, this error appeared:
C:\mingw64\bin\g++.exe -fdiagnostics-color=always -g
\wsl$\kali-linux\home\tyrael\Foundry\morga.cpp -o
\wsl$\kali-linux\home\tyrael\Foundry\morga.exe
'\wsl$\kali-linux\home\tyrael\Foundry'
CMD.EXE was started with the above path as the current directory.
UNC paths are not supported. Defaulting to Windows directory.
In file included from
C:/mingw64/lib/gcc/x86_64-w64-mingw32/8.1.0/include/c++/iostream:39,
from \wsl$\kali-linux\home\tyrael\Foundry\morga.cpp:1:
C:/mingw64/lib/gcc/x86_64-w64-mingw32/8.1.0/include/c++/ostream:681:49:
error: expected unqualified-id before '-' token
__rvalue_ostream_type<_Ostream>>::type - operator<<(_Ostream&& __os, const _Tp& __x)
^
Build finished with error(s).
The terminal process failed to launch (exit code: -1).
Terminal will be reused by tasks, press any key to close it.
I tried moving the #include statement around, but that didn't work.
I tried to isolate the cause of the error by commenting out huge swaths of the code, but the error is still there.
My only guess as to what the issue could be is that the compiler is very angry that I am trying to call a function as a parameter for another function, but I can't verify that.
I'm truly at a loss; I just don't know what it could be. Any help would be much appreciated!
This is the code:
#include <iostream>
using std::cout;
int combiner();
int multiplier(int target_sum);
// Forward declaration of my functions
int combiner() {
int input1 = 5;
int input2 = 10;
int input3 = 15;
int sum = input1 + input2 + input3;
return sum;
}
// simple function that takes numbers and combines them into one total sum
int multiplier (int target_sum) {
int big_sum = target_sum * 5;
return big_sum;
}
// simple function that takes a number and multiples by 5
int main (int argc, char** argv) {
combiner();
int final = multiplier(combiner());
cout << "This is our final number: " << final;
return 0;
}
// putting it all together, using a function as a parameter for another function
I remembered that you can't pass a function as a parameter to another function in C, but you can in other languages. However, even when I get rid of that, I still hit the error. Here's the new code:
#include <iostream>
using std::cout;
int combiner();
int multiplier(int target_sum);
int combiner() {
int input1 = 5;
int input2 = 10;
int input3 = 15;
int sum = input1 + input2 + input3;
return sum;
}
int multiplier (int target_sum) {
int big_sum = target_sum * 5;
return big_sum;
}
int main (int argc, char** argv) {
int tiago = combiner();
int final = multiplier(tiago);
// Storing return value of combiner() into a int variable, using that var as
// parameter instead for 2nd function - multiplier
cout << "This is our final number: " << final;
return 0;
}
EDIT: In the vein of investigating my compiler, here are some screenshots to show what my compilation process looks like on VS Code.
Selecting the compiler type
Select the actual compiler, the one I want is g++
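On the guess about function calls: multiplier(combiner()) does not pass a function. It passes the int that combiner() returns, which is legal in both C and C++. Passing the function itself is also possible, through a function pointer. A small sketch reusing the question's two functions follows; apply_then_multiply, final_from_value, and final_from_pointer are names made up for illustration.

```cpp
#include <cassert>

int combiner() { return 5 + 10 + 15; }
int multiplier(int target_sum) { return target_sum * 5; }

// Takes a function itself (as a pointer) rather than its result.
int apply_then_multiply(int (*producer)()) {
    assert(producer != nullptr);
    return multiplier(producer());
}

// Passes the int that combiner() returns.
int final_from_value() { return multiplier(combiner()); }

// Passes combiner itself; it is called inside apply_then_multiply.
int final_from_pointer() { return apply_then_multiply(&combiner); }
```

Both forms compile as ordinary C++. Since the diagnostic in the log points inside the compiler's own ostream header (line 681) rather than at morga.cpp, the call syntax is unlikely to be the cause.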

C++: Issues with Circular Buffer

I'm having some trouble writing a circular buffer in C++. Here is my code base at the moment:
circ_buf.h:
#ifndef __CIRC_BUF_H__
#define __CIRC_BUF_H__
#define MAX_DATA (25) // Arbitrary size limit
// The Circular Buffer itself
struct circ_buf {
int s; // Index of oldest reading
int e; // Index of most recent reading
int data[MAX_DATA]; // The data
};
/*** Function Declarations ***/
void empty(circ_buf*);
bool is_empty(circ_buf*);
bool is_full(circ_buf*);
void read(circ_buf*, int);
int overwrite(circ_buf*);
#endif // __CIRC_BUF_H__
circ_buf.cpp:
#include "circ_buf.h"
/*** Function Definitions ***/
// Empty the buffer
void empty(circ_buf* cb) {
cb->s = 0; cb->e = 0;
}
// Is the buffer empty?
bool is_empty(circ_buf* cb) {
// By common convention, if the start index is equal to the end
// index, our buffer is considered empty.
return cb->s == cb->e;
}
// Is the buffer full?
bool is_full(circ_buf* cb) {
// By common convention, if the start index is one greater than
// the end index, our buffer is considered full.
// REMEMBER: we still need to account for wrapping around!
return cb->s == ((cb->e + 1) % MAX_DATA);
}
// Read data into the buffer
void read(circ_buf* cb, int k) {
int i = cb->e;
cb->data[i] = k;
cb->e = (i + 1) % MAX_DATA;
}
// Overwrite data in the buffer
int overwrite(circ_buf* cb) {
int i = cb->s;
int k = cb->data[i];
cb->s = (i + 1) % MAX_DATA;
}
circ_buf_test.cpp:
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include "circ_buf.h"
int main(int argc, char** argv) {
// Our data source
std::string file = "million_numbers.txt";
std::fstream in(file, std::ios_base::in);
// The buffer
circ_buf buffer = { .s = 0, .e = 0, .data = {} };
for (int i = 0; i < MAX_DATA; ++i) {
int k = 0; in >> k; // Get next int from in
read(&buffer, k);
}
for (int i = 0; i < MAX_DATA; ++i)
std::cout << overwrite(&buffer) << std::endl;
}
The main issue I'm having is getting the buffer to write integers to its array. When I compile and run the main program (circ_buf_test), it just prints the same number 25 times, instead of what I expect it to print (the numbers 1 through 25 - "million_numbers.txt" is literally just the numbers 1 through 1000000). The number is 2292656, in case this may be important.
Does anyone have an idea about what might be going wrong here?
Your function overwrite(circ_buf* cb) returns nothing (there is no return statement in its body), so the code that prints the values can print anything (see "undefined behavior"):
for (int i = 0; i < MAX_DATA; ++i)
std::cout << overwrite(&buffer) << std::endl;
I expect the compilation log already points at the cause of this "main issue" (look for lines starting with "warning", for example GCC's "no return statement in function returning non-void"). You can fix it this way:
int overwrite(circ_buf* cb) {
int i = cb->s;
int k = cb->data[i];
cb->s = (i + 1) % MAX_DATA;
return k;
}
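For reference, here is a minimal self-contained sketch of the same buffer with every function returning a value on all paths. The push and pop helpers and fill_count are hypothetical names, not part of the question's code. The sketch also shows a side effect of the "s == e means empty" convention: only MAX_DATA - 1 values fit, so writing 25 values, as the test program does, wraps e back around to s.

```cpp
#include <cassert>

const int MAX_DATA = 25;  // same arbitrary size as in the question

struct circ_buf {
    int s = 0;            // index of the oldest reading
    int e = 0;            // index one past the most recent reading
    int data[MAX_DATA] = {};
};

bool is_empty(const circ_buf &cb) { return cb.s == cb.e; }
bool is_full(const circ_buf &cb) { return cb.s == (cb.e + 1) % MAX_DATA; }

// Hypothetical helper: store k, refusing when the buffer is full.
bool push(circ_buf &cb, int k) {
    if (is_full(cb)) return false;
    cb.data[cb.e] = k;
    cb.e = (cb.e + 1) % MAX_DATA;
    return true;
}

// Hypothetical helper: remove the oldest value into out, refusing when empty.
bool pop(circ_buf &cb, int &out) {
    if (is_empty(cb)) return false;
    out = cb.data[cb.s];
    cb.s = (cb.s + 1) % MAX_DATA;
    return true;   // every path returns a value
}

// Fills the buffer with 1, 2, 3, ... and reports how many values fit.
int fill_count(circ_buf &cb) {
    int stored = 0;
    while (push(cb, stored + 1)) ++stored;
    assert(stored == MAX_DATA - 1);  // one slot stays unused by this convention
    return stored;
}
```

Returning a bool and writing the value through a reference is one design choice among several; it keeps "buffer was empty" distinguishable from a legitimate stored value.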

'future' has been explicitly marked deleted here

I am trying to build an async application to process large lists in parallel, and after two days of learning C++ through googling I have ended up with the error in the title, from the following code:
//
// main.cpp
// ThreadedLearning
//
// Created by Andy Kirk on 19/01/2016.
// Copyright © 2016 Andy Kirk. All rights reserved.
//
#include <algorithm> // std::for_each
#include <chrono>
#include <cstdio>    // std::snprintf
#include <future>
#include <iostream>
#include <thread>
#include <vector>
typedef struct {
long mailing_id;
char emailAddress[100];
} emailStruct ;
typedef struct {
long mailing_id = 0;
int result = 0;
} returnValues;
returnValues work(emailStruct eMail) {
returnValues result;
std::this_thread::sleep_for(std::chrono::seconds(2));
result.mailing_id = eMail.mailing_id;
return result;
}
int main(int argc, const char * argv[]) {
std::vector<emailStruct> Emails;
emailStruct eMail;
// Create a Dummy Structure Vector
for (int i = 0 ; i < 100 ; ++i) {
std::snprintf(eMail.emailAddress,sizeof(eMail.emailAddress),"user-%d@email_domain.tld",i);
eMail.mailing_id = i;
Emails.push_back(eMail);
}
std::vector<std::future<returnValues>> workers;
int worker_count = 0;
int max_workers = 11;
for ( ; worker_count < Emails.size(); worker_count += max_workers ){
workers.clear();
for (int inner_count = 0 ; inner_count < max_workers ; ++inner_count) {
int entry = worker_count + inner_count;
if(entry < Emails.size()) {
emailStruct workItem = Emails[entry];
auto fut = std::async(&work, workItem);
workers.push_back(fut);
}
}
std::for_each(workers.begin(), workers.end(), [](std::future<returnValues> & res) {
res.get();
});
}
return 0;
}
I'm really not sure what I am doing wrong, and I have found few answers by searching. It's on OS X 10 with Xcode 7, if that is relevant.
The future class has its copy constructor deleted, because you really don't want to have multiple copies of it.
To add it to the vector, you have to move it instead of copying it:
workers.push_back(std::move(fut));
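A minimal sketch of the launch-and-collect loop with that change applied. Here work is a simplified, hypothetical stand-in for the question's function (it just doubles its input), and run_batch is a name made up for illustration.

```cpp
#include <cassert>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's work(): doubles its input.
long work(long mailing_id) { return mailing_id * 2; }

// Launches one task per id, keeps the futures in a vector, and sums the results.
long run_batch(const std::vector<long> &ids) {
    std::vector<std::future<long>> workers;
    workers.reserve(ids.size());
    for (long id : ids) {
        auto fut = std::async(std::launch::async, &work, id);
        workers.push_back(std::move(fut));   // move: std::future cannot be copied
        // Equivalent without the named variable:
        // workers.push_back(std::async(std::launch::async, &work, id));
    }
    long total = 0;
    for (auto &w : workers) {
        assert(w.valid());
        total += w.get();   // get() blocks until that task has finished
    }
    return total;
}
```

After the move, fut no longer refers to a shared state, so only the vector element should be used from then on.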
This error can also be raised if you are passing a future object (within a thread) to a function which expects a pass by value.
For example, this would raise an error when you pass the future:
void multiplyForever(int x, int y, std::future<void> exit_future);
multiplyForever(3, 5, fut);
You can fix it by passing the future by reference:
void multiplyForever(int x, int y, std::future<void>& exit_future);
multiplyForever(3, 5, fut);
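Both ways of handing an existing future to a function can be sketched as follows: borrowing it by reference, or transferring ownership into a by-value parameter with std::move. The function names here are hypothetical, and a std::promise is used only to have a ready future without starting a thread. (If the function is launched through std::thread, a by-reference argument additionally has to be wrapped in std::ref.)

```cpp
#include <cassert>
#include <future>
#include <utility>

// Borrows the caller's future; the caller keeps ownership.
int wait_by_reference(std::future<int> &f) {
    assert(f.valid());
    return f.get();
}

// Takes ownership; the caller must hand the future over with std::move.
int wait_by_value(std::future<int> f) {
    assert(f.valid());
    return f.get();
}

int demo_reference() {
    std::promise<int> p;
    std::future<int> fut = p.get_future();
    p.set_value(15);
    return wait_by_reference(fut);          // fut is still the caller's object
}

int demo_move() {
    std::promise<int> p;
    std::future<int> fut = p.get_future();
    p.set_value(15);
    return wait_by_value(std::move(fut));   // fut is left invalid afterwards
}
```

Which one fits depends on who should own the future afterwards: by reference if the caller still needs it, by move if the callee is its final consumer.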

Double free or corruption (fasttop) while executing Parallel version

I am trying to implement a serial algorithm in a parallel manner using pthreads.
The following is the code that I wrote:
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <string>
#include <cmath>
#include <sys/time.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <ctime>
#include <vector>
#include <map>
#include <sstream>
using namespace std;
double similarity_score(char a,char b);
double find_array_max(double array[],int length);
void insert_at(char arr[], int n, int idx, char val);
void checkfile(int open, char filename[]);
void *main_algo(void* rank);
double valueChanger(int p1 , int p2);
string read_sequence(ifstream& f);
int ind;
double mu,delta;
double H_max = 0.;
int N_a; // get the actual lengths of the sequences
int N_b;
int** H;
int count_gap = 6;
int matrix_i = 1;
double temp[4];
char *nameof_seq_a;
char *nameof_seq_b;
string seq_a,seq_b;
int bigCount;
int start_i;
int stop_i;
int start_j;
int stop_j;
int stopSituation;
int k;
map<int,int> mymap;
std::string strValue;
std::stringstream out;
std::stringstream out2;
double value;
int main(int argc, char** argv){
// read info from arguments
if(argc!=5){
cout<<"Give me the proper number of input arguments:"<<endl<<"1 : mu"<<endl;
cout<<"2 : delta"<<endl<<"3 : filename sequence A"<<endl<<"4 : filename sequence B"<<endl;
cout<<"5 : maximal length N of sequences"<<endl;exit(1);
}
mu = atof(argv[1]);
delta = atof(argv[2]);
/////////////////////////////////
// give it the filenames
nameof_seq_a = argv[3];
nameof_seq_b = argv[4];
//int N_max = atoi(argv[5]);
// read the sequences into two vectors:
ifstream stream_seq_b; // first define the input-streams for seq_a and seq_b
stream_seq_b.open(nameof_seq_b); // the same for seq_b
checkfile(! stream_seq_b,nameof_seq_b);
seq_b = read_sequence(stream_seq_b);
ifstream stream_seq_a;
stream_seq_a.open(nameof_seq_a); // open the file for input
checkfile(! stream_seq_a,nameof_seq_a); // check, whether the file was opened successfully
seq_a = read_sequence(stream_seq_a);
// string s_a=seq_a,s_b=seq_b;
N_a = seq_a.length(); // get the actual lengths of the sequences
N_b = seq_b.length();
cout<<"N_a length "<<N_a<<endl;
cout<<"N_b length "<<N_b<<endl;
////////////////////////////////////////////////
// initialize H
int temp1 = N_a+1;
int temp2 = N_b+1;
// call to main algorithm
bigCount = N_a;
stopSituation = (N_a + N_b -1) / count_gap;
cout<<"--------Big Count----"<<bigCount<<endl;
cout<<"--------stopSituation----"<<stopSituation<<endl;
//cout<<"--------Matrix_i-------"<<matrix_i<<endl;
while(matrix_i <= stopSituation+1){
//cout<<"--------MATRIX_I----"<<matrix_i<<endl;
pthread_t myThreads[matrix_i + 1]; // k runs from 1 to matrix_i below, so one extra slot is needed
for(k = 1; k < matrix_i +1 ; k++){
if(k == 1){
start_i = 1;
stop_j = (count_gap * (matrix_i))+1;
}
else if(k == matrix_i){
start_i = (count_gap * (k-1)) + 1;
stop_j = count_gap +1;
}
else{
start_i = (count_gap * (k-1)) + 1;
stop_j = (count_gap * ((matrix_i - k)+1))+1;
}
start_j = (count_gap * (matrix_i - k))+1;
stop_i = (count_gap * k)+1;
//main_algo();
pthread_create(&myThreads[k], NULL,main_algo, (void*)k);
}
//wait until all threads finish
for(k = 1; k<matrix_i+1; k++){
pthread_join(myThreads[k],NULL);
}
//sleep(1);
if(matrix_i > 1)
{
}
matrix_i++;
}
// Print the matrix H to the console
cout<<"**************** Max is = "<<H_max<<endl;
mymap.clear();
} // END of main
/////////////////////////////////////////////////////////////////////////////
// auxiliary functions used by main:
/////////////////////////////////////////////////////////////////////////////
void checkfile(int open, char filename[]){
if (open){cout << "Error: Can't open the file "<<filename<<endl;exit(1);}
else cout<<"Opened file "<<filename<<endl;
}
/////////////////////////////////////////////////////////////////////////////
double valueChanger(int p1 , int p2){
out.str(std::string());
out2.str(std::string());
out << p1;
out2 << p2;
strValue = out.str()+out2.str();
double value_new = atoi(strValue.c_str());
return value_new;
}
void *main_algo(void* rank){
//void main_algo(){
// algorithm
for(int i=start_i;i<stop_i;i++){
if(i > N_a) {
break;
}
for(int j=start_j;j<stop_j;j++){
if(j > N_b){
break;
}
int iCheck = i-1;
int jCheck = j-1;
value = valueChanger(iCheck , jCheck);
//double myVal = H[iCheck][jCheck];
double myVal;
if ( mymap.find(value) == mymap.end() ) {
// not found
myVal = 0;
} else {
// found
myVal = mymap.find(value)->second;
}
if(seq_a[i-1]==seq_b[j-1]){
temp[0]= myVal+1.0;
}
else{
temp[0]= myVal-mu;
}
value = valueChanger(i-1 , j-1);
if ( mymap.find(value) == mymap.end() ) {
// not found
temp[1] = 0;
} else {
// found
temp[1] = mymap.find(value)->second-delta;
}
value = valueChanger(i , j-1);
if ( mymap.find(value) == mymap.end() ) {
// not found
temp[2] = 0;
} else {
// found
temp[2] = mymap.find(value)->second-delta;
}
temp[3] = 0.;
value = valueChanger(i , j);
mymap[value] = find_array_max(temp,4);
if( H_max < mymap[value] ){
H_max = mymap[value];
}
}
}
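cout<<"Done"<< (long)rank <<endl; // my_rank was never declared; print the thread argument instead
return NULL; // a thread function declared void* must return a value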
}
double similarity_score(char a,char b){
double result;
if(a==b){
result=1.;
}
else{
result=-mu;
}
return result;
}
/////////////////////////////////////////////////////////////////////////////
double find_array_max(double array[],int length){
double max = array[0]; // start with max = first element
for(int i = 1; i<length; i++){
if(array[i] > max){
max = array[i];
}
}
return max; // return highest value in array
}
/////////////////////////////////////////////////////////////////////////////
string read_sequence(ifstream& f)
{
// overflows.
string seq;
char line[5000];
while( f.good() )
{
f.getline(line,5000);
// cout << "Line:" << line << endl;
if( line[0] == 0 || line[0]=='#' )
continue;
for(int i = 0; line[i] != 0; ++i)
{
int c = toupper(line[i]);
// cout << char(c);
if( c != 'A' && c != 'G' && c != 'C' && c != 'T' )
continue;
seq.push_back(char(c));
}
}
return seq;
}
When I execute this program, I get the following error:
*** Error in `./myProg': double free or corruption (fasttop): 0xb5300480 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x767e2)[0xb74e67e2]
/lib/i386-linux-gnu/libc.so.6(+0x77530)[0xb74e7530]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZdlPv+0x1f)[0xb7689aaf]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSs4_Rep10_M_destroyERKSaIcE+0x1b)[0xb76ee5ab]
/usr/lib/i386-linux-gnu/libstdc++.so.6(+0xad5f0)[0xb76ee5f0]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSs6assignERKSs+0x82)[0xb76efde2]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSsaSERKSs+0x23)[0xb76efe33]
./myProg[0x80499dc]
./myProg[0x8049c2f]
/lib/i386-linux-gnu/libpthread.so.0(+0x6d78)[0xb7730d78]
/lib/i386-linux-gnu/libc.so.6(clone+0x5e)[0xb75613de]
======= Memory map: ========
*** Error in `./myProg': double free or corruption (fasttop): 0x0829c8f8 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x767e2)[0xb74e67e2]
/lib/i386-linux-gnu/libc.so.6(+0x77530)[0xb74e7530]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZdlPv+0x1f)[0xb7689aaf]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSs4_Rep10_M_destroyERKSaIcE+0x1b)[0xb76ee5ab]
/usr/lib/i386-linux-gnu/libstdc++.so.6(+0xad5f0)[0xb76ee5f0]
/usr/lib/i386-linux-gnu/libstdc++.so.6(_ZNSsaSERKSs+0x23)[0xb76efe33]
./myProg[0x80499dc]
./myProg[0x8049aee]
/lib/i386-linux-gnu/libpthread.so.0(+0x6d78)[0xb7730d78]
Aborted (core dumped)
Please feel free to copy and run it with the following two commands:
1) g++ [name].cpp -o [name] -lpthread
2) ./[name] 1 1 filetoread.txt filetoread2.txt
Please point out any mistakes I am making in my code.
Thank you!
You need to protect any non-thread-safe variables (such as std::map) with a mutex (or other locking mechanism). Since you're using pthread, take a look at pthread_mutex_lock, pthread_mutex_unlock, and pthread_mutex_destroy.
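The backtrace in the question ends in a std::string assignment (_ZNSsaSERKSs is std::string::operator=), which is consistent with several threads writing the global strValue at once, so the same protection applies to the other shared globals, not only to the map. A minimal sketch of the pattern, under simplified assumptions: the map contents, the Range struct, and the function names are illustrative and not taken from the question's algorithm.

```cpp
#include <cassert>
#include <map>
#include <pthread.h>

// Shared state guarded by one mutex (names are illustrative).
static std::map<int, int> shared_map;
static pthread_mutex_t map_mutex = PTHREAD_MUTEX_INITIALIZER;

// Per-thread arguments live in their own struct instead of in globals
// that the main thread keeps changing while workers read them.
struct Range { int begin; int end; };

void *fill_range(void *arg) {
    const Range *r = static_cast<const Range *>(arg);
    for (int i = r->begin; i < r->end; ++i) {
        pthread_mutex_lock(&map_mutex);     // only one thread touches the map at a time
        shared_map[i] = i * i;
        pthread_mutex_unlock(&map_mutex);
    }
    return NULL;
}

// Starts two threads over disjoint ranges and returns the final map size.
int run_two_threads() {
    pthread_t threads[2];
    Range ranges[2] = { {0, 500}, {500, 1000} };
    for (int k = 0; k < 2; ++k) {
        int rc = pthread_create(&threads[k], NULL, fill_range, &ranges[k]);
        assert(rc == 0);
        (void)rc;   // silences the unused-variable warning when NDEBUG is defined
    }
    for (int k = 0; k < 2; ++k) pthread_join(threads[k], NULL);
    return static_cast<int>(shared_map.size());
}
```

A mutex set up with PTHREAD_MUTEX_INITIALIZER needs no pthread_mutex_init call; one created with pthread_mutex_init should be released with pthread_mutex_destroy.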
Here are some notes if you want to go multithreaded:
Don't, unless you really need to. Multithreading brings a whole new world of complexity and problems.
If you really need to, don't use more threads than processors; otherwise you will have scheduling contention at the OS/hardware level. (The term has become muddled in recent years with CPUs, cores, and hyperthreads; by "processor" I mean whatever the hardware or virtualization allows you to run in parallel.)
Lock (with mutexes) all access to shared data; otherwise you will corrupt it.
But hold each lock for the shortest possible time, because while one thread holds a lock, every other thread that needs it waits and does nothing.
And don't use locks unless you need to, because locking and unlocking is expensive.
Avoid locking by using processing queues (e.g. to deliver commands or results across threads). Instead of locking while processing the command or result, you lock only while inserting into or removing from the queue.
Avoid locking by subdividing a single map-like structure (with a single mutex) into an array of map-like structures (with an array of mutexes), indexed by a hash of the original map's key. Instead of blocking all threads, you will block only some of them.
If there is an imbalance between readers and writers, try a read/write mutex to minimize contention. This allows multiple simultaneous readers or one writer.
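The "array of maps with an array of mutexes" idea from the notes above can be sketched like this. It uses std::mutex and std::thread rather than raw pthreads for brevity; ShardedMap, its shard count of 8, and fill_concurrently are all illustrative choices, not something from the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// A map split into shards, each with its own mutex (illustrative design).
class ShardedMap {
public:
    void put(int key, double value) {
        Shard &s = shard_for(key);
        std::lock_guard<std::mutex> lock(s.mutex);   // locks one shard only
        s.data[key] = value;
    }
    // Returns fallback when the key is absent.
    double get(int key, double fallback) {
        Shard &s = shard_for(key);
        std::lock_guard<std::mutex> lock(s.mutex);
        auto it = s.data.find(key);
        return it == s.data.end() ? fallback : it->second;
    }
private:
    struct Shard { std::mutex mutex; std::map<int, double> data; };
    static constexpr std::size_t kShards = 8;        // arbitrary shard count
    Shard &shard_for(int key) {
        const std::size_t idx = std::hash<int>()(key) % kShards;
        assert(idx < kShards);
        return shards_[idx];
    }
    std::array<Shard, kShards> shards_;
};

// Four threads write disjoint keys; returns how many of the 400 keys are readable.
int fill_concurrently(ShardedMap &m) {
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t)
        threads.emplace_back([&m, t] {
            for (int i = 0; i < 100; ++i) m.put(t * 100 + i, 1.0);
        });
    for (auto &th : threads) th.join();
    int found = 0;
    for (int key = 0; key < 400; ++key)
        if (m.get(key, 0.0) == 1.0) ++found;
    return found;
}
```

Threads that hit different shards never wait for each other; only threads whose keys hash to the same shard contend for a lock.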

c++ seg fault issue

I am working on a C++ program that uses some external C libraries. As far as I can tell though that is not the cause of the problem, and the issue is with my C++ code. The program runs fine with no errors or anything on my test datasets, but after going through nearly the entire full dataset, I get a segfault. Running GDB gives me this segfault:
(gdb) run -speciesMain=allMis1 -speciesOther=anoCar2 -speciesMain=allMis1 -speciesOther=anoCar2 /hive/data/genomes/allMis1/bed/lastz.anoCar2/mafRBestNet/*.maf.gz
Starting program: /cluster/home/jstjohn/bin/mafPairwiseSyntenyDecay -speciesMain=allMis1 -speciesOther=anoCar2 -speciesMain=allMis1 -speciesOther=anoCar2 /hive/data/genomes/allMis1/bed/lastz.anoCar2/mafRBestNet/*.maf.gz
Detaching after fork from child process 3718.
Program received signal SIGSEGV, Segmentation fault.
0x0000003009cb7672 in __gnu_cxx::__exchange_and_add(int volatile*, int) () from /usr/lib64/libstdc++.so.6
(gdb) up
#1 0x0000003009c9db59 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() () from /usr/lib64/libstdc++.so.6
(gdb) up
#2 0x00000000004051e7 in PairAlnInfo::~PairAlnInfo (this=0x7fffffffcd70, __in_chrg=) at mafPairwiseSyntenyDecay.cpp:37
(gdb) up
#3 0x0000000000404eb0 in main (argc=2, argv=0x7fffffffcf78) at mafPairwiseSyntenyDecay.cpp:260
It looks like something is going on with a double free of my PairAlnInfo class. The weird thing is that I don't define a destructor, and I am not allocating anything with new. I have tried this both with g++44 and g++4.1.2 on the linux machine and have had the same results.
To make things even weirder, on my linux box (with more available RAM and everything, not that RAM is an issue with this program, but it is a beefy system) the seg fault happens as described above before the program reaches the loop to print output. On my much smaller macbook air using either g++ or clang++, the program still segfaults, but it doesn't do that until after the results are printed, right before the final return(0) out of the main function. Here is what the GDB trace looks like on my mac running on the same file after compiling with Mac's default g++4.2:
(more results)...
98000 27527 162181 0.83027
99000 27457 161467 0.829953
100000 27411 160794 0.829527
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00004a2c00106077
0x00007fff9365a6e5 in std::string::_Rep::_M_dispose ()
(gdb) up
#1 0x00007fff9365a740 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string ()
(gdb) up
#2 0x0000000100003938 in main (argc=1261, argv=0x851d5fbff533) at mafPairwiseSyntenyDecay.cpp:301
(gdb)
Just in case you didn't notice the time of my posting, it's about 2:30AM now... I have been hacking away at this problem for about 10 hours now. Thanks so much for taking the time to look at this and help me out! The code and some instructions for replicating my situation follow.
If you are interested in downloading and installing the whole thing with dependencies then download my KentLib repository, make in the base directory, and then go to examples/mafPairwiseSyntenyDecay and run make there. An example (rather large) that causes the bug I am discussing is the gziped file available here: 100Mb file that the program crashes on. Then execute the program with these arguments -speciesMain=allMis1 -speciesOther=anoCar2 anoCar2.allMis1.rbest.maf.gz.
/**
* mafPairwiseSyntenyDecay
* Author: John St. John
* Date: 4/26/2012
*
* calculates the mean synteny decay in different range bins
*
*
*/
//Kent source C imports
extern "C" {
#include "common.h"
#include "options.h"
#include "maf.h"
}
#include <map>
#include <string>
#include <set>
#include <vector>
#include <sstream>
#include <iostream>
//#define NDEBUG
#include <assert.h>
using namespace std;
/*
Global variables
*/
class PairAlnInfo {
public:
string oname;
int sstart;
int send;
int ostart;
int oend;
char strand;
PairAlnInfo(string _oname,
int _sstart, int _send,
int _ostart, int _oend,
char _strand):
oname(_oname),
sstart(_sstart),
send(_send),
ostart(_ostart),
oend(_oend),
strand(_strand){}
PairAlnInfo():
oname("DUMMY"),
sstart(-1),
send(-1),
ostart(-1),
oend(-1),
strand(-1){}
};
vector<string> &split(const string &s, char delim, vector<string> &elems) {
stringstream ss(s);
string item;
while(getline(ss, item, delim)) {
elems.push_back(item);
}
return(elems);
}
vector<string> split(const string &s, char delim) {
vector<string> elems;
return(split(s, delim, elems));
}
#define DEF_MIN_LEN (200)
#define DEF_MIN_SCORE (200)
typedef map<int,PairAlnInfo> PairAlnInfoByPos;
typedef map<string, PairAlnInfoByPos > ChromToPairAlnInfoByPos;
ChromToPairAlnInfoByPos pairAlnInfoByPosByChrom;
void usage()
/* Explain usage and exit. */
{
errAbort(
(char*)"mafPairwiseSyntenyDecay -- Calculates pairwise syntenic decay from maf alignment containing at least the two specified species.\n"
"usage:\n"
"\tmafPairwiseSyntenyDecay [options] [*required options] file1.maf[.gz] ... \n"
"Options:\n"
"\t-help\tPrints this message.\n"
"\t-minScore=NUM\tMinimum MAF alignment score to consider (default 200)\n"
"\t-minAlnLen=NUM\tMinimum MAF alignment block length to consider (default 200)\n"
"\t-speciesMain=NAME\t*Name of the main species (exactly as it appears before the '.') in the maf file (REQUIRED)\n"
"\t-speciesOther=NAME\t*Name of the other species (exactly as it appears before the '.') in the maf file (REQUIRED)\n"
);
}//end usage()
static struct optionSpec options[] = {
/* Structure holding command line options */
{(char*)"help",OPTION_STRING},
{(char*)"minScore",OPTION_INT},
{(char*)"minAlnLen",OPTION_INT},
{(char*)"speciesMain",OPTION_STRING},
{(char*)"speciesOther",OPTION_STRING},
{NULL, 0}
}; //end options()
/**
* Main function, takes filenames for paired qseq reads
* and outputs three files.
*/
int iterateOverAlignmentBlocksAndStorePairInfo(char *fileName, const int minScore, const int minAlnLen, const string speciesMain, const string speciesOther){
struct mafFile * mFile = mafOpen(fileName);
struct mafAli * mAli;
//loop over alignment blocks
while((mAli = mafNext(mFile)) != NULL){
struct mafComp *first = mAli->components;
int seqlen = mAli->textSize;
//First find and store set of duplicates in this block
set<string> seen;
set<string> dups;
if(mAli->score < minScore || seqlen < minAlnLen){
//free here and pre-maturely end
mafAliFree(&mAli);
continue;
}
for(struct mafComp *item = first; item != NULL; item = item->next){
string tmp(item->src);
string tname = split(tmp,'.')[0];
if(seen.count(tname)){
//seen this item
dups.insert(tname);
}else{
seen.insert(tname);
}
}
for(struct mafComp *item1 = first; item1->next != NULL; item1 = item1->next){
//stop one before the end
string tmp1(item1->src);
vector<string> nameSplit1(split(tmp1,'.'));
string name1(nameSplit1[0]);
if(dups.count(name1) || (name1 != speciesMain && name1 != speciesOther)){
continue;
}
for(struct mafComp *item2 = item1->next; item2 != NULL; item2 = item2->next){
string tmp2(item2->src);
vector<string> nameSplit2(split(tmp2,'.'));
string name2 = nameSplit2[0];
if(dups.count(name2) || (name2 != speciesMain && name2 != speciesOther)){
continue;
}
string chr1(nameSplit1[1]);
string chr2(nameSplit2[1]);
char strand;
if(item1->strand == item2->strand)
strand = '+';
else
strand = '-';
int start1,end1,start2,end2;
if(item1->strand == '+'){
start1 = item1->start;
end1 = start1 + item1->size;
}else{
end1 = item1->start;
start1 = end1 - item1->size;
}
if(item2->strand == '+'){
start2 = item2->start;
end2 = start2+ item2->size;
}else{
end2 = item2->start;
start2 = end2 - item2->size;
}
if(name1 == speciesMain){
PairAlnInfo aln(chr2,start1,end1,start2,end2,strand);
pairAlnInfoByPosByChrom[chr1][start1] = aln;
}else{
PairAlnInfo aln(chr1,start2,end2,start1,end1,strand);
pairAlnInfoByPosByChrom[chr2][start2] = aln;
}
} //end loop over item2
} //end loop over item1
mafAliFree(&mAli);
}//end loop over alignment blocks
mafFileFree(&mFile);
return(0);
}
int main(int argc, char *argv[])
/* Process command line. */
{
optionInit(&argc, argv, options);
if(optionExists((char*)"help") || argc <= 1){
usage();
}
int minAlnScore = optionInt((char*)"minScore",DEF_MIN_SCORE);
int minAlnLen = optionInt((char*)"minAlnLen",DEF_MIN_LEN);
string speciesMain(optionVal((char*)"speciesMain",NULL));
string speciesOther(optionVal((char*)"speciesOther",NULL));
if(speciesMain.empty() || speciesOther.empty())
usage();
//load the relevant alignment info from the maf(s)
for(int i = 1; i<argc; i++){
iterateOverAlignmentBlocksAndStorePairInfo(argv[i], minAlnScore, minAlnLen, speciesMain, speciesOther);
}
const int blockSize = 1000;
const int blockCount = 100;
int totalWindows[blockCount] = {0};
int containBreak[blockCount] = {0};
//we want the fraction of windows of each size that contain a break
//
for(ChromToPairAlnInfoByPos::iterator mainChromItter = pairAlnInfoByPosByChrom.begin();
mainChromItter != pairAlnInfoByPosByChrom.end();
mainChromItter++){
//process the alignments shared by this chromosome
//note that map stores them sorted by begin position
vector<int> keys;
for(PairAlnInfoByPos::iterator posIter = mainChromItter->second.begin();
posIter != mainChromItter->second.end();
posIter++){
keys.push_back(posIter->first);
}
for(int i = 0; i < keys.size(); i++){
//first check for trivial window (ie our block)
PairAlnInfo pi1 = mainChromItter->second[keys[i]];
assert(pi1.send > pi1.sstart);
assert(pi1.sstart == keys[i]);
int numBucketsThisWindow = (pi1.send - pi1.sstart) / blockSize;
for(int k = 0; k < numBucketsThisWindow && k < blockCount; k++)
totalWindows[k]++;
for(int j = i+1; j < keys.size(); j++){
PairAlnInfo pi2 = mainChromItter->second[keys[j]];
assert(pi2.sstart == keys[j]);
assert(pi2.send > pi2.sstart);
assert(pi2.sstart > pi1.sstart);
if(pi2.oname == pi1.oname){
int moreToInc = (pi2.send - pi1.sstart) / blockSize;
for(int k = numBucketsThisWindow; k < moreToInc && k < blockCount; k++)
totalWindows[k]++;
numBucketsThisWindow = moreToInc; //so we don't double count
}else{
int numDiscontigBuckets = (pi2.send - pi1.sstart) / blockSize;
for(int k = numBucketsThisWindow; k < numDiscontigBuckets && k < blockSize; k++){
containBreak[k]++;
totalWindows[k]++;
}
numBucketsThisWindow = numDiscontigBuckets;
}
if((keys[j] - keys[i]) >= (blockSize * blockCount)){
//i = j;
break;
}
}
}
}
cout << "#WindowSize\tNumContainBreak\tNumTotal\t1-(NumContainBreak/NumTotal)" << endl;
for(int i = 0; i < blockCount; i++){
cout << (i+1)*blockSize << '\t';
cout << containBreak[i] << '\t';
cout << totalWindows[i] << '\t';
cout << (totalWindows[i] > 0? 1.0 - (double(containBreak[i])/double(totalWindows[i])): 0) << endl;
}
return(0);
} //end main()
Try running your program under valgrind. It will report lost or possibly lost memory, use of uninitialised values, invalid reads and writes, and so on.
Your issues are probably due to memory corruption occurring at some point in the program prior to the actual errors you are seeing.
One potential issue in the code you posted is the loop:
for(int k = numBucketsThisWindow; k<numDiscontigBuckets && k < blockSize; k++){
which uses blockSize (1000) as the bound instead of the correct blockCount (100). That can run past the end of both the totalWindows[] and containBreak[] arrays, which have only blockCount elements. Such writes would overwrite the speciesMain and speciesOther strings, along with anything else nearby on the stack, which might very well result in the errors you are seeing.
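One way to make this kind of slip harder to repeat is to put the bound in one place and use a checked container. The sketch below assumes the same two constants as the question; Buckets and bump_range are hypothetical names, and the helper covers only the counter update, not the rest of the loop.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

const int blockSize = 1000;   // bases per bucket, as in the question
const int blockCount = 100;   // number of buckets, i.e. the array length

typedef std::array<int, blockCount> Buckets;

// Increments counts[k] for k in [from, to), clamped to the array length.
// Returns the number of buckets actually touched.
int bump_range(Buckets &counts, int from, int to) {
    assert(from >= 0);
    const int limit = std::min(to, blockCount);   // bound by blockCount, not blockSize
    int touched = 0;
    for (int k = from; k < limit; ++k) {
        counts.at(k)++;    // at() throws std::out_of_range instead of corrupting the stack
        ++touched;
    }
    return touched;
}
```

With a plain int[100] the out-of-range write is silent; with at() the same mistake would surface as an exception at the offending line.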