I am trying to access an environment variable from a C++ program. So I made a test program, which works fine:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    printf("MANIFOLD : %s\n", getenv("MANIFOLD_DIRECTORY"));
    return 0;
}
Output: MANIFOLD : /home/n1603031f/Desktop/manifold-0.12.1/kitfox_configuration/input.config
Note: the signature of getenv is:
char *getenv(const char *name);
But when I use this as part of a bigger program with many linked files:
energy_introspector->configure (getenv("MANIFOLD_DIRECTORY"));
Above does not work.
char *a = new char [1000];
a = getenv("MANIFOLD_DIRECTORY");
energy_introspector->configure (a);
Above also does not work.
Note: the signature of the configure function is:
void configure(const char *ConfigFile);
Error message:
Number of LPs = 1
[Ubuntu10:18455] *** Process received signal ***
[Ubuntu10:18455] Signal: Segmentation fault (11)
[Ubuntu10:18455] Signal code: Address not mapped (1)
[Ubuntu10:18455] Failing at address: (nil)
[Ubuntu10:18455] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330) [0x7f9a38149330]
[Ubuntu10:18455] [ 1] /lib/x86_64-linux-gnu/libc.so.6(strlen+0x2a) [0x7f9a37dfc9da]
[Ubuntu10:18455] [ 2] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x5bf8c4]
[Ubuntu10:18455] [ 3] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x5a4ac6]
[Ubuntu10:18455] [ 4] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x5a4df8]
[Ubuntu10:18455] [ 5] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x4283b6]
[Ubuntu10:18455] [ 6] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x41e197]
[Ubuntu10:18455] [ 7] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x41de7a]
[Ubuntu10:18455] [ 8] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x41d906]
[Ubuntu10:18455] [ 9] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x41710b]
[Ubuntu10:18455] [10] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f9a37d95f45]
[Ubuntu10:18455] [11] /home/n1603031f/Desktop/manifold-0.12.1/simulator/smp/QsimLib/smp_llp() [0x41697f]
[Ubuntu10:18455] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 18455 on node Ubuntu10 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
But this works:
energy_introspector->configure ("/home/n1603031f/Desktop/manifold-0.12.1/kitfox_configuration/input.config");
getenv returns a pointer to library-allocated memory that is not owned by your program. Your
a = new char [1000]
line suggests you did not realize this and assumed you have to supply the memory yourself. That is not the case; in particular, you must never free the memory returned by getenv.
(Even if that were necessary, the plain pointer assignment
a = getenv...
would still be wrong, as it only reassigns the pointer and does not copy the memory. That line is also a memory leak, because you lose the pointer to the 1000 chars you allocated.)
If you want your program to own that memory so you can later free it, you need to copy it into your own memory:
char *a = new char[1000];
const char *e = getenv(<whatever>);  // may return NULL if the variable is not set
if (e != NULL)
    strcpy(a, e);
Unfortunately, I cannot see what you do with the pointer later on in your other examples, in particular whether you try to free or delete it; either would lead to an error.
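For completeness, here is a minimal sketch of the copy-and-own approach, assuming the MANIFOLD_DIRECTORY variable from the question; the buffer is sized from the string itself instead of a fixed 1000 characters, and the configure call from the question appears only as a comment:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
    const char *e = getenv("MANIFOLD_DIRECTORY");  /* points into memory owned by the C library */
    if (e == NULL) {
        fprintf(stderr, "MANIFOLD_DIRECTORY is not set\n");
        return 1;
    }
    char *a = new char[strlen(e) + 1];             /* memory owned by the program */
    strcpy(a, e);
    /* ... use the copy, e.g. energy_introspector->configure(a) in the question ... */
    delete[] a;                                    /* delete the copy, never the pointer from getenv */
    return 0;
}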
The first explicit error in your code is allocating a char array and then assigning the result of getenv to the same pointer, which leaks the allocated memory. In your case, use:
std::string a = getenv("MANIFOLD_DIRECTORY");
This copies the result into the variable a, so your code is not affected if the environment variable is later changed or unset.
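One caveat: constructing a std::string from a null pointer is undefined behaviour, so if the variable might be missing it is safer to check the result of getenv first. A minimal sketch (the empty-string fallback is just one possible policy):
#include <stdlib.h>
#include <string>
const char *env = getenv("MANIFOLD_DIRECTORY");
// Fall back to an empty string (or report an error) when the variable is not set.
std::string a = (env != NULL) ? std::string(env) : std::string();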
If getenv returns NULL, then a variable with the specified name is not present in the environment passed to your application. You can list all available environment variables with code like the snippet below.
extern char **environ;
for (int i = 0; environ[i] != NULL; ++i) {
    std::cout << environ[i] << std::endl;
}
If your variable is not listed, the problem is most probably in how you launch your application. The other possibility is that the environment variable has been unset.
I have to work on code written a few years ago that uses MPI and PETSc.
When I try to run it, I get an error in the function MPI_Comm_rank().
Here is the beginning of the code:
int main(int argc, char **argv)
{
    double mesure_tps2, mesure_tps1;
    struct timeval tv;
    time_t curtime2, curtime1;
    char help[] = "Solves linear system with KSP.\n\n"; // NB: Petsc is defined in "fafemo_Constant_Globales.h"
    std::cout << "début PetscInitialize" << std::endl;
    (void*) PetscInitialize(&argc, &argv, (char *)0, help);
    std::cout << "début PetscInitialize fait" << std::endl;
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    PetscFinalize();
}
Obviously, there is more code between MPI_Comm_rank() and PetscFinalize().
PetscInitialize and PetscFinalize call MPI_Init and MPI_Finalize, respectively.
In my makefile I have:
PETSC_DIR=/home/thib/Documents/bibliotheques/petsc-3.13.2
PETSC_ARCH=arch-linux-c-debug
include ${PETSC_DIR}/lib/petsc/conf/variables
include ${PETSC_DIR}/lib/petsc/conf/rules
PETSC36 = -I/home/thib/Documents/bibliotheques/petsc-3.13.2/include -I/home/thib/Documents/bibliotheques/petsc-3.13.2/arch-linux-c-debug/include
Mpi_include=-I/usr/lib/x86_64-linux-gnu/openmpi
# a variable with some file names
fafemo_files = fafemo_CI_CL-def.cc fafemo_Flux.cc fafemo_initialisation_probleme.cc fafemo_FEM_setup.cc fafemo_sorties.cc fafemo_richards_solve.cc element_read_split.cpp point_read_split.cpp read_split_mesh.cpp
PETSC_KSP_LIB_VSOIL=-L/home/thib/Documents/bibliotheques/petsc-3.13.2/ -lpetsc_real -lmpi -lmpi++
fafemo: ${fafemo_files} fafemo_Richards_Main.o
g++ ${CXXFLAGS} -g -o fafemo_CD ${fafemo_files} fafemo_Richards_Main.cc ${PETSC_KSP_LIB_VSOIL} $(PETSC36) ${Mpi_include}
Using g++ or mpic++ doesn't seem to change anything.
It compiles, but when I try to execute it I get:
[thib-X540UP:03696] Signal: Segmentation fault (11)
[thib-X540UP:03696] Signal code: Address not mapped (1)
[thib-X540UP:03696] Failing at address: 0x44000098
[thib-X540UP:03696] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3efd0)[0x7fbfa87e4fd0]
[thib-X540UP:03696] [ 1] /usr/lib/x86_64-linux-gnu/libmpi.so.20(MPI_Comm_rank+0x42)[0x7fbfa9533c42]
[thib-X540UP:03696] [ 2] ./fafemo_CD(+0x230c8)[0x561caa6920c8]
[thib-X540UP:03696] [ 3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7fbfa87c7b97]
[thib-X540UP:03696] [ 4] ./fafemo_CD(+0x346a)[0x561caa67246a]
[thib-X540UP:03696] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node thib-X540UP exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Also, I have other MPI programs on my computer and I have never had such a problem.
Does anyone know why I get this?
In case someone has the same issue: when I installed PETSc, I ran ./configure with --download-mpich even though I already had MPI installed on my computer, which presumably mixed two MPI implementations.
To solve the problem I did "rm -rf ${PETSC_ARCH}" and ran ./configure again.
I have a problem with the MPI library.
I have to read text from a file and send it to the other processes, for example as a vector.
I've written the following code:
#include "mpi.h"
#include <stdio.h>
#include<string.h>
#include<stdlib.h>
#include<string>
#include <fstream>
#include <cstring>
#include <vector>
class PatternAndText
{
public:
static std::string textPreparaation()
{
std::ifstream t("file.txt");
std::string str((std::istreambuf_iterator<char>(t)), std::istreambuf_iterator<char>());
std::string text = str;
return text;
}
};
int main(int argc, char* argv[])
{
int size, rank ;
std::string text;
std::vector<char> cstr;
MPI_Init(&argc, &argv);
MPI_Status status;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
if (rank == 0)
{
text = PatternAndText::textPreparaation();
std::vector<char> cstr(text.c_str(), text.c_str() + text.size() + 1);
}
MPI_Bcast(cstr.data(), cstr.size(), MPI_CHAR,0,MPI_COMM_WORLD);
if (rank != 0 )
{
std::cout<<"\n";
std::cout<<cstr[1]<<" "<<rank;
std::cout<<"\n";
}
MPI_Finalize();
return 0;
}
I want to read the text from the file in the main process and broadcast it to the others.
When I try to run it, it gives me:
[alek:26408] *** Process received signal ***
[alek:26408] Signal: Segmentation fault (11)
[alek:26408] Signal code: Address not mapped (1)
[alek:26408] Failing at address: 0x1
[alek:26408] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7fc7c1a8af20]
[alek:26408] [ 1] spli(+0xc63d)[0x55b0104bb63d]
[alek:26408] [ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7fc7c1a6db97]
[alek:26408] [ 3] spli(+0xc3ba)[0x55b0104bb3ba]
[alek:26408] *** End of error message ***
[alek:26406] *** Process received signal ***
[alek:26406] Signal: Segmentation fault (11)
[alek:26406] Signal code: Address not mapped (1)
[alek:26406] Failing at address: 0x1
[alek:26406] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f01ef5f5f20]
[alek:26406] [ 1] spli(+0xc63d)[0x5579714df63d]
[alek:26406] [ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f01ef5d8b97]
[alek:26406] [ 3] spli(+0xc3ba)[0x5579714df3ba]
[alek:26406] *** End of error message ***
[alek:26414] *** Process received signal ***
[alek:26414] Signal: Segmentation fault (11)
[alek:26414] Signal code: Address not mapped (1)
[alek:26414] Failing at address: 0x1
[alek:26413] *** Process received signal ***
[alek:26422] *** Process received signal ***
[alek:26417] *** Process received signal ***
[alek:26417] Signal: Segmentation fault (11)
[alek:26417] Signal code: Address not mapped (1)
[alek:26417] Failing at address: 0x1
[alek:26413] Signal: Segmentation fault (11)
[alek:26413] Signal code: Address not mapped (1)
[alek:26413] Failing at address: 0x1
[alek:26422] Signal: Segmentation fault (11)
[alek:26422] Signal code: Address not mapped (1)
[alek:26422] Failing at address: 0x1
[alek:26413] [alek:26425] *** Process received signal ***
[alek:26425] Signal: Segmentation fault (11)
[alek:26425] Signal code: Address not mapped (1)
[alek:26425] Failing at address: 0x1
[alek:26414] [ 0] [alek:26422] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7ff9c3740f20]
[alek:26414] [ 1] spli(+0xc63d)[0x563e8a58563d]
[alek:26417] [ 0] [alek:26425] [ 0] [ 0] [alek:26414] [ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7/lib/x86_64-linux-gnu/libc.so.6/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f5a4dd75f20]
[alek:26417] /lib/x86_64-linux-gnu/libc.so.6(/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f90009f0f20]
[alek:26425] [ 1] spli(+0xc63d))[0x7ff9c3723b97]
[alek:26414] [ 3] spli+0x3ef20)[0x7f2a6faf6f20]
[alek:26413] [ 1] spli(+0xc63d)[0x5557de07763d]
[0x557dee98063d]
[alek:26425] [ 2] (+0xc3ba)[0x563e8a5853ba]
[alek:26414] *** End of error message ***
(+0x3ef20)[0x7f8c41861f20]
[alek:26422] [ 1] spli(+0xc63d)[ 1] spli(+0xc63d)[0x5650e93dc63d]
[alek:26417] [alek:26413] [ 2] [0x561eb2de463d]
[alek:26422] [ 2] [ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f2a6fad9b97]
[alek:26413] [ 3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f5a4dd58b97]
[alek:26417] [ 3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f8c41844b97]
[alek:26422] [ 3] spli(+0xc3ba)[0x5557de0773ba]
[alek:26413] *** End of error message ***
spli(+0xc3ba)[0x5650e93dc3ba]
[alek:26417] *** End of error message ***
spli(+0xc3ba)[0x561eb2de43ba]
[alek:26422] *** End of error message ***
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f90009d3b97]
[alek:26425] [ 3] spli(+0xc3ba)[0x557dee9803ba]
[alek:26425] *** End of error message ***
[alek:26411] *** Process received signal ***
[alek:26411] Signal: Segmentation fault (11)
[alek:26411] Signal code: Address not mapped (1)
[alek:26411] Failing at address: 0x1
[alek:26411] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f1a339adf20]
[alek:26411] [ 1] spli(+0xc63d)[0x555737c1263d]
[alek:26411] [ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f1a33990b97]
[alek:26411] [ 3] spli(+0xc3ba)[0x555737c123ba]
[alek:26411] *** End of error message ***
[warn] Epoll ADD(4) on fd 88 failed. Old events were 0; read change was 0 (none); write change was 1 (add): Bad file descriptor
--------------------------------------------------------------------------
mpirun noticed that process rank 5 with PID 0 on node alek exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
When I check the size instead of [0], the processes print 0.
What should I change to make it work?
The local variable std::vector<char> cstr inside if (rank == 0) {...} shadows the one declared in main, so the cstr in main is never filled and stays empty.
To assign data to cstr, use cstr.assign(...):
if (rank == 0) {
    const std::string text = PatternAndText::textPreparaation();
    cstr.assign(text.c_str(), text.c_str() + text.size() + 1);
}
Other processes should first allocate storage in cstr by calling cstr.resize(...). To do that, they need to know its size, so you can broadcast the size first and then resize cstr:
unsigned long long size = cstr.size();
MPI_Bcast(&size, 1, MPI_UNSIGNED_LONG_LONG, 0, MPI_COMM_WORLD);
if (rank != 0)
    cstr.resize(size);
before broadcasting the vector itself:
MPI_Bcast(cstr.data(), size, MPI_CHAR, 0, MPI_COMM_WORLD);
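Putting the pieces together, a minimal sketch of the corrected sequence might look like this; the name length is used instead of size to avoid clashing with the communicator-size variable in the question's main:
// Rank 0 fills the cstr vector declared in main().
if (rank == 0) {
    const std::string text = PatternAndText::textPreparaation();
    cstr.assign(text.c_str(), text.c_str() + text.size() + 1);
}
// All ranks agree on the length first, then the non-root ranks allocate storage.
unsigned long long length = cstr.size();
MPI_Bcast(&length, 1, MPI_UNSIGNED_LONG_LONG, 0, MPI_COMM_WORLD);
if (rank != 0)
    cstr.resize(length);
// Now every rank has a buffer of the right size and the data itself can be broadcast.
MPI_Bcast(cstr.data(), static_cast<int>(length), MPI_CHAR, 0, MPI_COMM_WORLD);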
I have the following vectors:
std::vector<double**> blocks(L);
std::vector<double**> localblocks(blocks_local);
I then use send calls to send the data that resides on rank 0 to the other ranks (I think this is the correct terminology):
for(i=1;i<numnodes-1;i++)
{
for(j=0;j<blocks_local;j++)
{
MPI_Send(blocks[i*blocks_local+j],N*N,MPI_DOUBLE,i,j,MPI_COMM_WORLD);
}
}
Up until this point the code runs perfectly fine: no errors. Then, on the remaining ranks, the following code runs:
for(i=0;i<blocks_local;i++)
{
MPI_Recv(&localblocks[i],N*N,MPI_DOUBLE,0,i,MPI_COMM_WORLD,&status);
}
It is at this point that I get an invalid pointer error.
The total output is:
6.8297e-05
3.6895e-05
4.3906e-05
4.4463e-05 << these just show the time it takes for a process to complete, which shows that the program has exited successfully.
free(): invalid pointer
[localhost:16841] *** Process received signal ***
[localhost:16841] Signal: Aborted (6)
[localhost:16841] Signal code: (-6)
free(): invalid pointer
free(): invalid pointer
[localhost:16842] *** Process received signal ***
[localhost:16842] Signal: Aborted (6)
[localhost:16842] Signal code: (-6)
[localhost:16840] *** Process received signal ***
[localhost:16840] Signal: Aborted (6)
[localhost:16840] Signal code: (-6)
[localhost:16841] [ 0] /lib64/libpthread.so.0(+0x11fc0)[0x7fb761c10fc0]
[localhost:16841] [ 1] [localhost:16840] [ 0] [localhost:16842] [ 0] /lib64/libc.so.6(gsignal+0x10b)[0x7fb761876f2b]
[localhost:16841] [ 2] /lib64/libpthread.so.0(+0x11fc0)[0x7fb0f377cfc0]
[localhost:16840] [ 1] /lib64/libpthread.so.0(+0x11fc0)[0x7f0ec17e9fc0]
[localhost:16842] [ 1] /lib64/libc.so.6(gsignal+0x10b)[0x7fb0f33e2f2b]
[localhost:16840] /lib64/libc.so.6(abort+0x12b)[0x7fb761861561]
/lib64/libc.so.6(gsignal+0x10b)[0x7f0ec144ff2b]
[localhost:16842] [ 2] [localhost:16841] [ 3] [ 2] /lib64/libc.so.6(abort+0x12b)/lib64/libc.so.6(abort+0x12b)[0x7fb0f33cd561]
[localhost:16840] [ 3] [0x7f0ec143a561]
[localhost:16842] [ 3] /lib64/libc.so.6(+0x79917)[0x7fb7618b9917]
[localhost:16841] /lib64/libc.so.6(+0x79917)[0x7fb0f3425917]
[ 4] [localhost:16840] [ 4] /lib64/libc.so.6(+0x79917)[0x7f0ec1492917]
/lib64/libc.so.6(+0x7fdec)[0x7fb7618bfdec]
[localhost:16841] [ 5] [localhost:16842] [ 4] /lib64/libc.so.6(+0x7fdec)[0x7fb0f342bdec]
[localhost:16840] [ 5] /lib64/libc.so.6(+0x8157c)[0x7fb7618c157c]
[localhost:16841] [ 6] /lib64/libc.so.6(+0x7fdec)[0x7f0ec1498dec]
[localhost:16842] [ 5] /lib64/libc.so.6(+0x8157c)[0x7fb0f342d57c]
[localhost:16840] [ 6] /usr/lib64/openmpi/lib/libopen-pal.so.20(+0x4ffb2)[0x7fb76134dfb2]
[localhost:16841] [ 7] /lib64/libc.so.6(+0x8157c)[0x7f0ec149a57c]
[localhost:16842] [ 6] /usr/lib64/openmpi/lib/libopen-pal.so.20(+0x4ffb2)[0x7fb0f2eb9fb2]
[localhost:16840] /usr/lib64/openmpi/lib/libmpi.so.20(ompi_win_finalize+0x1c1)[0x7fb7627a9881]
[localhost:16841] [ 8] /usr/lib64/openmpi/lib/libopen-pal.so.20(+0x4ffb2)[0x7f0ec0f26fb2]
[localhost:16842] [ 7] [ 7] /usr/lib64/openmpi/lib/libmpi.so.20(ompi_win_finalize+0x1c1)[0x7fb0f4315881]
[localhost:16840] [ 8] /usr/lib64/openmpi/lib/libmpi.so.20(ompi_win_finalize+0x1c1)[0x7f0ec2382881]
[localhost:16842] [ 8] /usr/lib64/openmpi/lib/libmpi.so.20(ompi_mpi_finalize+0x2f1)[0x7fb7627a7711]
[localhost:16841] [ 9] ./a.out[0x408c75]
/usr/lib64/openmpi/lib/libmpi.so.20(ompi_mpi_finalize+0x2f1)[0x7fb0f4313711]
[localhost:16840] [ 9] ./a.out[0x408c75]
/usr/lib64/openmpi/lib/libmpi.so.20(ompi_mpi_finalize+0x2f1)[0x7f0ec2380711]
[localhost:16842] [ 9] [localhost:16841] [10] /lib64/libc.so.6(__libc_start_main+0xeb)./a.out[0x408c75]
[localhost:16842] [localhost:16840] [10] [0x7fb76186318b]
[localhost:16841] [11] ./a.out[0x40896a]
[localhost:16841] *** End of error message ***
/lib64/libc.so.6(__libc_start_main+0xeb)[0x7fb0f33cf18b]
[localhost:16840] [11] ./a.out[0x40896a]
[localhost:16840] *** End of error message ***
[10] /lib64/libc.so.6(__libc_start_main+0xeb)[0x7f0ec143c18b]
[localhost:16842] [11] ./a.out[0x40896a]
[localhost:16842] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 0 on node localhost exited on signal 6 (Aborted).
--------------------------------------------------------------------------
I have deleted my clean-up code, so this cleanup has to be coming from inside MPI; I am unsure how to resolve this.
void BlockMatVecMultiplication(int mynode, int numnodes,int N,int L, std::vector<double **> blocks,double *x,double* y)
{
int i,j;
int local_offset,blocks_local,last_blocks_local;
int *count;
int *displacements;
// The number of rows each processor is dealt
blocks_local = L/numnodes;
double *temp = new double[N*blocks_local];
std::vector<double**> localblocks(blocks_local);
// the offset
local_offset = mynode*blocks_local;
MPI_Status status;
if(mynode == (numnodes-1))
{
blocks_local = L-blocks_local*(numnodes-1);
}
/* Distribute the blocks across the processes */
// At this point node 0 has the matrix. So we only need
// to distribute among the remaining nodes, using the
// last node as a cleanup.
if(mynode ==0)
{
// This deals the matrix between processes 1 to numnodes -2
for(i=1;i<numnodes-1;i++)
{
for(j=0;j<blocks_local;j++)
{
MPI_Send(blocks[i*blocks_local+j],N*N,MPI_DOUBLE,i,j,MPI_COMM_WORLD);
}
}
// Here we use the last process to "clean up". For small N
// the load is poorly balanced.
last_blocks_local = L- blocks_local*(numnodes-1);
for(j=0;j<last_blocks_local;j++)
{
MPI_Send(blocks[(numnodes-1)*blocks_local+j],N*N,MPI_DOUBLE,numnodes-1,j,MPI_COMM_WORLD);
}
}
else
{
/* This code allows other processes to obtain the chunks of data
   sent by process 0 */
/* rows_local has a different value on the last processor, remember */
for(i=0;i<blocks_local;i++)
{
MPI_Recv(&localblocks[i],N*N,MPI_DOUBLE,0,i,MPI_COMM_WORLD,&status);
}
}
}
The above function is called from the code below.
#include <iostream>
#include <iomanip>
#include <mpi.h>
#include "SCmathlib.h"
#include "SCchapter7.h"
using namespace std;
int main(int argc, char * argv[])
{
int i,j, N = 10,L=10;
double **A,*x,*y;
int totalnodes,mynode;
std::vector<double**> blocks;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD, &totalnodes);
MPI_Comm_rank(MPI_COMM_WORLD, &mynode);
//output variable
y = new double[N];
//input variable
x = new double[N];
for(i=0;i<N;i++)
{
x[i] = 1.0;
}
// forms identity matrix on node 0
if(mynode==0)
{
for(j=0;j<L;j++)
{
A = CreateMatrix(N,N);
// fills the block
for(i=0;i<N;i++)
{
A[i][i] = 1.0;
}
blocks.push_back(A);
}
}
double start = MPI_Wtime();
BlockMatVecMultiplication(mynode,totalnodes,N,L,blocks,x,y);
double end = MPI_Wtime();
if(mynode==0)
{
for(i=0;i<L;i++)
{
//DestroyMatrix(blocks[i],N,N);
}
//delete[] x;
//delete[] y;
}
std::cout << end- start << std::endl;
MPI_Finalize();
}
The "includes" just provide basic matrix functionality. The following function creates a matrix
double ** CreateMatrix(int m, int n){
double ** mat;
mat = new double*[m];
for(int i=0;i<m;i++){
mat[i] = new double[n];
for(int j=0;j<m;j++)
mat[i][j] = 0.0;
}
return mat;
}
During my work writing a C++ wrapper for MPI I ran into a segmentation fault in MPI_Test(), the cause of which I can't figure out.
The following code is a minimal crashing example, to be compiled and run with mpic++ -std=c++11 -g -o test test.cpp && ./test:
#include <stdlib.h>
#include <stdio.h>
#include <memory>
#include <mpi.h>
class Environment {
public:
static Environment &getInstance() {
static Environment instance;
return instance;
}
static bool initialized() {
int ini;
MPI_Initialized(&ini);
return ini != 0;
}
static bool finalized() {
int fin;
MPI_Finalized(&fin);
return fin != 0;
}
private:
Environment() {
if(!initialized()) {
MPI_Init(NULL, NULL);
_initialized = true;
}
}
~Environment() {
if(!_initialized)
return;
if(finalized())
return;
MPI_Finalize();
}
bool _initialized{false};
public:
Environment(Environment const &) = delete;
void operator=(Environment const &) = delete;
};
class Status {
private:
std::shared_ptr<MPI_Status> _mpi_status;
MPI_Datatype _mpi_type;
};
class Request {
private:
std::shared_ptr<MPI_Request> _request;
int _flag;
Status _status;
};
int main() {
auto &m = Environment::getInstance();
MPI_Request r;
MPI_Status s;
int a;
MPI_Test(&r, &a, &s);
Request r2;
printf("b\n");
}
Basically, the Environment class is a singleton wrapper around MPI_Init and MPI_Finalize. The first time the class is instantiated, MPI_Init is called, and when the program exits, MPI is finalized.
The code above crashes (on my machine: OpenMPI and Linux). However, it works when I
comment out any of the private members of Request or Status (even int _flag;)
comment out the last line, printf("b\n");
replace auto &m = Environment::getInstance(); with MPI_Init().
There doesn't seem to be a connection between these points and I have no clue where to look for the segmentation fault.
The stack trace is:
[pc13090:05978] *** Process received signal ***
[pc13090:05978] Signal: Segmentation fault (11)
[pc13090:05978] Signal code: Address not mapped (1)
[pc13090:05978] Failing at address: 0x61
[pc13090:05978] [ 0] /usr/lib/libpthread.so.0(+0x11dd0)[0x7fa9cf818dd0]
[pc13090:05978] [ 1] /usr/lib/openmpi/libmpi.so.40(ompi_request_default_test+0x16)[0x7fa9d0357326]
[pc13090:05978] [ 2] /usr/lib/openmpi/libmpi.so.40(MPI_Test+0x31)[0x7fa9d03970b1]
[pc13090:05978] [ 3] ./test(+0xb7ae)[0x55713d1aa7ae]
[pc13090:05978] [ 4] /usr/lib/libc.so.6(__libc_start_main+0xea)[0x7fa9cf470f4a]
[pc13090:05978] [ 5] ./test(+0xb5ea)[0x55713d1aa5ea]
[pc13090:05978] *** End of error message ***
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node pc13090 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
I have written a simple MPI program which sends and receives messages between the processes, but it fails with a segmentation fault.
Here's my entire code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <strings.h>
#include <sstream>
#include<mpi.h>
using namespace std;
class Case {
public:
int value;
std::stringstream sta;
};
int main(int argc, char **argv) {
int rank,size;
MPI::Init(argc,argv);
rank=MPI::COMM_WORLD.Get_rank();
size=MPI::COMM_WORLD.Get_size();
if(rank==0){
Case *s=new Case();
s->value=1;
s->sta<<"test";
cout<<"\nInside send before copy value :"<<s->value;
fflush(stdout);
cout<<"\nInside send before copy data :"<<s->sta.str();
fflush(stdout);
Case scpy;
scpy.value=s->value;
scpy.sta<<(s->sta).rdbuf();
cout<<"\nInside send after copy value :"<<scpy.value;
cout<<"\nInside send after copy value :"<<scpy.sta.str();
MPI::COMM_WORLD.Send(&scpy,sizeof(Case),MPI::BYTE,1,23);
}
MPI::COMM_WORLD.Barrier();
if(rank==1){
Case r;
MPI::COMM_WORLD.Recv(&r,sizeof(Case),MPI::BYTE,0,23);
cout<<"\nRecieve value"<<r.value;
fflush(stdout);
cout<<"\nRecieve data"<<r.sta;
fflush(stdout);
}
MPI::Finalize();
return 0;
}
I get the error message below and I'm not able to figure out what is wrong in this program. Can anyone please explain?
Inside send before copy value :1
Inside send before copy data :test
Inside send after copy value :1
Recieve value1
Recieve data0xbfa5d6b4[localhost:03706] *** Process received signal ***
[localhost:03706] Signal: Segmentation fault (11)
[localhost:03706] Signal code: Address not mapped (1)
[localhost:03706] Failing at address: 0x8e1a210
[localhost:03706] [ 0] [0xe6940c]
[localhost:03706] [ 1] /usr/lib/libstdc++.so.6(_ZNSt18basic_stringstreamIcSt11char_traitsIcESaIcEED1Ev+0xc6) [0x6a425f6]
[localhost:03706] [ 2] ./a.out(_ZN4CaseD1Ev+0x14) [0x8052d8e]
[localhost:03706] [ 3] ./a.out(main+0x2f9) [0x804f90d]
[localhost:03706] [ 4] /lib/libc.so.6(__libc_start_main+0xe6) [0x897e36]
[localhost:03706] [ 5] ./a.out() [0x804f581]
[localhost:03706] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 3706 on node localhost.localdomain exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Problem
I think the problem is that the line:
MPI::COMM_WORLD.Send(&scpy,sizeof(Case),MPI::BYTE,1,23);
sends a copy of the Case structure to the receiver, but it is a raw copy of the bytes, which is not very useful here. The std::stringstream class contains a pointer to the memory that actually stores your string, so this code will:
Send a pointer to the receiver (containing an address that will be meaningless to the receiver)
Not send the actual contents of the string.
The receiver will seg fault when it attempts to dereference the invalid pointer.
Fix 1
One approach to fix this is to send the character data yourself.
In this approach you would send a message whose buffer is std::stringstream::str().c_str() and whose length is std::stringstream::str().size() * sizeof(char).
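As an illustration only (this is not code from the question), a sketch of this approach using the plain C MPI calls might look like the following; the tags are arbitrary and s is the Case object built on rank 0 in the question:
if (rank == 0) {
    const std::string payload = s->sta.str();   // copy the stringstream contents into a string
    int value = s->value;
    int len = static_cast<int>(payload.size());
    MPI_Send(&value, 1, MPI_INT, 1, 21, MPI_COMM_WORLD);
    MPI_Send(&len, 1, MPI_INT, 1, 22, MPI_COMM_WORLD);
    // The cast is only needed for pre-MPI-3 Send signatures that take a non-const buffer.
    MPI_Send(const_cast<char *>(payload.c_str()), len, MPI_CHAR, 1, 23, MPI_COMM_WORLD);
}
if (rank == 1) {
    int value = 0, len = 0;
    MPI_Recv(&value, 1, MPI_INT, 0, 21, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Recv(&len, 1, MPI_INT, 0, 22, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    std::string payload(len, '\0');
    MPI_Recv(&payload[0], len, MPI_CHAR, 0, 23, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    Case r;
    r.value = value;
    r.sta << payload;                           // rebuild the stringstream on the receiving side
}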
Fix 2
An alternative approach that seems to fit better with the way you are attempting to use MPI and strings is to use the Boost libraries. Boost contains functions for MPI that automatically serialize the data for you.
A useful tutorial on Boost and MPI is available on the Boost website.
Here is example code from that tutorial that does a similar task:
#include <boost/mpi.hpp>
#include <iostream>
#include <string>
#include <boost/serialization/string.hpp>
namespace mpi = boost::mpi;
int main(int argc, char* argv[])
{
mpi::environment env(argc, argv);
mpi::communicator world;
if (world.rank() == 0) {
world.send(1, 0, std::string("Hello"));
std::string msg;
world.recv(1, 1, msg);
std::cout << msg << "!" << std::endl;
} else {
std::string msg;
world.recv(0, 0, msg);
std::cout << msg << ", ";
std::cout.flush();
world.send(0, 1, std::string("world"));
}
return 0;
}