I have written a simple MPI program that sends and receives messages between processes, but it crashes with a segmentation fault.
Here's my entire code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <strings.h>
#include <sstream>
#include<mpi.h>
using namespace std;
class Case {
public:
int value;
std::stringstream sta;
};
int main(int argc, char **argv) {
int rank,size;
MPI::Init(argc,argv);
rank=MPI::COMM_WORLD.Get_rank();
size=MPI::COMM_WORLD.Get_size();
if(rank==0){
Case *s=new Case();
s->value=1;
s->sta<<"test";
cout<<"\nInside send before copy value :"<<s->value;
fflush(stdout);
cout<<"\nInside send before copy data :"<<s->sta.str();
fflush(stdout);
Case scpy;
scpy.value=s->value;
scpy.sta<<(s->sta).rdbuf();
cout<<"\nInside send after copy value :"<<scpy.value;
cout<<"\nInside send after copy value :"<<scpy.sta.str();
MPI::COMM_WORLD.Send(&scpy,sizeof(Case),MPI::BYTE,1,23);
}
MPI::COMM_WORLD.Barrier();
if(rank==1){
Case r;
MPI::COMM_WORLD.Recv(&r,sizeof(Case),MPI::BYTE,0,23);
cout<<"\nRecieve value"<<r.value;
fflush(stdout);
cout<<"\nRecieve data"<<r.sta;
fflush(stdout);
}
MPI::Finalize();
return 0;
}
I got the below error message and I'm not able to figure out what is wrong in this program. Can anyone please explain?
Inside send before copy value :1
Inside send before copy data :test
Inside send after copy value :1
Recieve value1
Recieve data0xbfa5d6b4[localhost:03706] *** Process received signal ***
[localhost:03706] Signal: Segmentation fault (11)
[localhost:03706] Signal code: Address not mapped (1)
[localhost:03706] Failing at address: 0x8e1a210
[localhost:03706] [ 0] [0xe6940c]
[localhost:03706] [ 1] /usr/lib/libstdc++.so.6(_ZNSt18basic_stringstreamIcSt11char_traitsIcESaIcEED1Ev+0xc6) [0x6a425f6]
[localhost:03706] [ 2] ./a.out(_ZN4CaseD1Ev+0x14) [0x8052d8e]
[localhost:03706] [ 3] ./a.out(main+0x2f9) [0x804f90d]
[localhost:03706] [ 4] /lib/libc.so.6(__libc_start_main+0xe6) [0x897e36]
[localhost:03706] [ 5] ./a.out() [0x804f581]
[localhost:03706] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 3706 on node localhost.localdomain exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Problem
I think the problem is that the line:
MPI::COMM_WORLD.Send(&scpy,sizeof(Case),MPI::BYTE,1,23);
sends the raw bytes of the Case object to the receiver, which is not useful here. The std::stringstream member holds pointers to the heap memory where the string is actually stored, so this code will:
Send a pointer to the receiver (containing an address that will be meaningless to the receiver)
Not send the actual contents of the string.
The receiver will segfault when it dereferences the invalid pointer. In the trace above this happens in the std::stringstream destructor, when r goes out of scope.
Fix 1
One approach to fix this is to send the character data yourself.
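In this approach you would send a message whose buffer is the pointer returned by str().c_str() on the stringstream and whose length is str().size()*sizeof(char) bytes. Store the std::string returned by str() in a local variable first, because str() returns a temporary copy.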
Fix 2
An alternative approach that seems to fit better with the way you are attempting to use MPI and strings is to use the Boost libraries. Boost contains functions for MPI that automatically serialize the data for you.
A useful tutorial on Boost and MPI is available on the boost website.
Here is example code from that tutorial that does a similar task:
#include <boost/mpi.hpp>
#include <iostream>
#include <string>
#include <boost/serialization/string.hpp>
namespace mpi = boost::mpi;
int main(int argc, char* argv[])
{
mpi::environment env(argc, argv);
mpi::communicator world;
if (world.rank() == 0) {
world.send(1, 0, std::string("Hello"));
std::string msg;
world.recv(1, 1, msg);
std::cout << msg << "!" << std::endl;
} else {
std::string msg;
world.recv(0, 0, msg);
std::cout << msg << ", ";
std::cout.flush();
world.send(0, 1, std::string("world"));
}
return 0;
}
Related
I use chdir() to switch the directory, and then use execvp() to execute "java Main". I'm sure there is Main.class, but something went wrong. I want to know why.
#include <cstdio>
#include <unistd.h>
using namespace std;
int main(){
char buf[80];
getcwd(buf,sizeof(buf));
printf("current working directory: %s\n", buf);
chdir("/home/keane/Judge/temp");
getcwd(buf,sizeof(buf));
printf("current working directory: %s\n", buf);
char *array[3];
array[0] = "java";
array[1] = "Main";
array[2] = NULL;
execvp("java", array);
return 0;
}
The error is that java could not find the main class, although I can run java Main from a shell in that directory.
What drives me crazy is that system("java Main") does not work either: it fails with Error: Could not find or load main class Main, and that is exactly what happens on my computer.
update:
#include <unistd.h>
#include <cstdlib>
int main(){
chdir("/home/keane/Judge/temp");
system("pwd");
system("ls");
system("java Main");
return 0;
}
the output on console is:
/home/keane/Judge/temp
1.out 3.out 5.out Main.class stdout_spj.txt
2.out 4.out ce.txt Main.java
Error: Could not find or load the main class Main
My final solution was to reboot the computer and add -cp . to the java command,
although I don't know why that is necessary.
Thanks, everyone!
This works as intended on my system, maybe you need to add -cp . to your java call.
EDIT: to elaborate: -cp (for classpath) tells java where to look for user provided .class files. This does not necessarily include the current working directory by default.
execvp() replaces the calling process image and does not return on success, so nothing after the call runs and your own program never gets control back. To solve this I use fork(): the child calls execvp() and the parent waits for it. The wait() is just to avoid using sleep(), as I did at the beginning. It's all in C.
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/wait.h>
int main(int argc, char** argv){
char buf[80];
getcwd(buf,sizeof(buf));
printf("current working directory: %s\n", buf);
chdir("/home/");
getcwd(buf,sizeof(buf));
printf("current working directory: %s\n", buf);
char *array[3] = {"java", "Main", NULL};
if(fork() == 0) {
if(execvp("java", array) < 0) {
fprintf(stderr, "Error spawning command: %s\n", strerror(errno));
_exit(127); // exec failed: end the child here instead of running the parent's code
}
} else {
printf("Command spawned\n");
wait(NULL); // Wait to the forked process to end (avoid using sleep)
}
return 0;
}
During my work writing a C++ wrapper for MPI I ran into a segmentation fault in MPI_Test(), the reason of which I can't figure out.
The following code is a minimal crashing example, to be compiled and run with mpic++ -std=c++11 -g -o test test.cpp && ./test:
#include <stdlib.h>
#include <stdio.h>
#include <memory>
#include <mpi.h>
class Environment {
public:
static Environment &getInstance() {
static Environment instance;
return instance;
}
static bool initialized() {
int ini;
MPI_Initialized(&ini);
return ini != 0;
}
static bool finalized() {
int fin;
MPI_Finalized(&fin);
return fin != 0;
}
private:
Environment() {
if(!initialized()) {
MPI_Init(NULL, NULL);
_initialized = true;
}
}
~Environment() {
if(!_initialized)
return;
if(finalized())
return;
MPI_Finalize();
}
bool _initialized{false};
public:
Environment(Environment const &) = delete;
void operator=(Environment const &) = delete;
};
class Status {
private:
std::shared_ptr<MPI_Status> _mpi_status;
MPI_Datatype _mpi_type;
};
class Request {
private:
std::shared_ptr<MPI_Request> _request;
int _flag;
Status _status;
};
int main() {
auto &m = Environment::getInstance();
MPI_Request r;
MPI_Status s;
int a;
MPI_Test(&r, &a, &s);
Request r2;
printf("b\n");
}
Basically, the Environment class is a singleton wrapper around MPI_Init and MPI_Finalize. When the program exits, MPI will be finalized and the first time the class is instantiated, MPI_Init is called. Then I do some MPI stuff in the main() function, involving some other simple wrapper objects.
The code above crashes (on my machine, OpenMPI & Linux). However, it works when I
comment out any of the private members of Request or Status (even int _flag;)
comment out the last line, printf("b\n");
replace auto &m = Environment::getInstance(); with MPI_Init().
There doesn't seem to be a connection between these points and I have no clue where to look for the segmentation fault.
The stack trace is:
[pc13090:05978] *** Process received signal ***
[pc13090:05978] Signal: Segmentation fault (11)
[pc13090:05978] Signal code: Address not mapped (1)
[pc13090:05978] Failing at address: 0x61
[pc13090:05978] [ 0] /usr/lib/libpthread.so.0(+0x11dd0)[0x7fa9cf818dd0]
[pc13090:05978] [ 1] /usr/lib/openmpi/libmpi.so.40(ompi_request_default_test+0x16)[0x7fa9d0357326]
[pc13090:05978] [ 2] /usr/lib/openmpi/libmpi.so.40(MPI_Test+0x31)[0x7fa9d03970b1]
[pc13090:05978] [ 3] ./test(+0xb7ae)[0x55713d1aa7ae]
[pc13090:05978] [ 4] /usr/lib/libc.so.6(__libc_start_main+0xea)[0x7fa9cf470f4a]
[pc13090:05978] [ 5] ./test(+0xb5ea)[0x55713d1aa5ea]
[pc13090:05978] *** End of error message ***
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node pc13090 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
This is a follow-on question to How do I free a boost::mpi::request? . I'm noting odd behavior when listening for lists rather than individual items. Is this my error or an error in boost? I'm using MSVC and MSMPI, Boost 1.62. I'm pretty sure that it's not behaving properly on the wait for a cancelled job.
If you try version B with mpiexec -n 2 then you get a clean exit - if you try version A, it hangs indefinitely. Do you all see this as well? Is this a bug?
#include "boost/mpi.hpp"
#include "mpi.h"
#include <list>
#include "boost/serialization/list.hpp"
int main()
{
MPI_Init(NULL, NULL);
MPI_Comm regional;
MPI_Comm_dup(MPI_COMM_WORLD, &regional);
boost::mpi::communicator comm = boost::mpi::communicator(regional, boost::mpi::comm_attach);
if (comm.rank() == 1)
{
//VERSION A:
std::list<int> q;
boost::mpi::request z = comm.irecv<std::list<int>>(1, 0, q);
z.cancel();
z.wait();
//VERSION B:
// int q;
// boost::mpi::request z = comm.irecv<int>(1, 0, q);
// z.cancel();
// z.wait();
}
MPI_Comm_disconnect(&regional);
MPI_Finalize();
return 0;
}
This is clearly a bug in Boost.MPI.
For serialized types, like std::list, the cancel is forwarded from request::cancel() to request::handle_serialized_irecv, which does not specify a proper handling for ra_cancel.
I am unable to find the correct way to run threads asynchronously in C++11. I want to spawn several threads that all run simultaneously, without one waiting for another as happens with thread.join(), which blocks until the joined thread is done. Is there a library facility in C++ that lets threads do their work in parallel without each having to wait for the others to complete?
Thanks,
Kushal
EDIT: I am posting the code below.
#include <signal.h>
#include <thread>
#include <algorithm>
#include <cstring>
#include <csignal>
#include "paho_client.h"
using namespace std;
vector<string> topic_container{"rpi2/temp","sense /bannana","sense/util","mqtt/temp","sense/temp","sense/pine","sense/fortis/udap"};
vector<paho_client> publisher;
vector<paho_client> subscriber;
int finish_thread=1;
void Onfinish(int signum){
finish_thread=0;
exit(EXIT_FAILURE);
}
int main(int argc, char** argv) {
signal(SIGINT, Onfinish);
int topic_index;
if(argc<3){
cout<<"the format of starting commandline argument is"<<endl;
exit(1);
}
while(finish_thread!=0){
//paho_client::get_library_handle();
if(strcmp(argv[1],"create_publisher")){
for(topic_index=0;topic_index<atoi(argv[2]);topic_index++){
thread pub_th;
pub_th = thread([ = ]() {
paho_client client("publisher", "192.168.0.102", "9876",
topic_container[topic_index].c_str());
client.paho_connect_withpub();
publisher.push_back(client);
});
pub_th.join();
}
vector<paho_client>::iterator it;
int publisher_traverse=0;
for(it=publisher.begin();it<publisher.end();publisher_traverse++){
publisher[publisher_traverse].increment_count();
publisher[publisher_traverse].get_count();
}
}
}
return 0;
}
After using async with a future I am getting the same behaviour as above. Please point out where I am going wrong.
#include <signal.h>
#include <thread>
#include <algorithm>
#include <cstring>
#include <csignal>
#include <future>
#include "paho_client.h"
using namespace std;
vector<string> topic_container{"rpi2/temp","sense/apple","sense/bannana","sense/util","mqtt/temp","sense/temp","sense/pine","sense/fortis/udap"};
vector<paho_client> publisher;
vector<paho_client> subscriber;
int finish_thread=1;
void Onfinish(int signum){
finish_thread=0;
exit(EXIT_FAILURE);
}
int accumulate_block_worker_ret(int topic_index) {
//int topic_index=0;
paho_client client("publisher", "192.168.0.102", "9876",
topic_container[topic_index].c_str());
client.paho_connect_withpub();
publisher.push_back(client);
client.increment_count();
return client.get_count();
}
int main(int argc, char** argv) {
signal(SIGINT, Onfinish);
if(argc<3){
cout<<"the format of starting commandline argument is . /paho_client_emulate <create_publisher><count of publisher client to spawn>" <<endl;
exit(1);
}
while(finish_thread!=0){
// paho_client::get_library_handle();
int topic_index;
if(strcmp(argv[1],"create_publisher")){
for(topic_index=0;topic_index<atoi(argv[2]);topic_index++){
// thread pub_th;
// pub_th = thread([ = ]() {
future<int> f = async(std::launch::async,accumulate_block_worker_ret,topic_index);
// });
// pub_th.join();
cout<<"the returned value from future is"<<f.get()<<endl;
}
vector<paho_client>::iterator it;
int publisher_traverse=0;
for(it=publisher.begin();it<=publisher.end();publisher_traverse++){
cout<<"came here"<<endl;
publisher[publisher_traverse].increment_count();
publisher[publisher_traverse].get_count();
}
}
}
return 0;
}
I want to launch all the publisher clients first (as threads) and
later publish messages from each thread
The pub_th.join() is misplaced: it sits inside the loop that starts the threads, so the program waits for each thread to terminate before starting the next one. To let the threads run in parallel, just move the .join() outside that loop. Of course, to access the threads after the loop body they have to be stored somewhere, e.g. in a vector. For this, change the first for loop to
vector <thread> pub_threads;
for (topic_index=0; topic_index<atoi(argv[2]); topic_index++)
{
pub_threads.push_back(thread([ = ]() { /* whatever */ }));
}
and later when done:
for (auto &th: pub_threads) th.join();
Actually I am running an infinite while loop inside every instance of
paho_client, so the first thread never completes …
that thread runs continuously
Of course, if a thread never finishes, there's no point in calling .join() on it.
I'm trying to follow the code from this post to have signal handlers print a backtrace on errors such as floating point exceptions and segmentation faults. I'm using segfault signals as a starting point. Here is the code:
#include <cstdlib> //for exit()
#include <signal.h> //signal handling
#include <execinfo.h> //backtrace, backtrace_symbols and backtrace_fd
#include <iostream>
#include <string.h>
#include <stdio.h>
#define TRACE_MSG fprintf(stderr, "TRACE at: %s() [%s:%d]\n", \
__FUNCTION__, __FILE__, __LINE__)
void show_stackframe()
{
void *trace[1024];
char **messages = (char **) NULL;
int i, trace_size = 0;
TRACE_MSG;
trace_size = backtrace(trace, 1024); // segfault here???
// More code here to print backtrace, but not needed at the moment..
TRACE_MSG;
}
void sigSegvHandler( int signum, siginfo_t* info, void* arg )
{
TRACE_MSG;
show_stackframe();
return;
}
double func_b()
{
show_stackframe(); // Show that backtrace works without being
// called inside sighandler.
TRACE_MSG;
int int_a[5];
int_a[0] = 4;
int_a[11] = 10; // cause a segfault on purpose to see
// how the signal handling performs.
return 1.1;
}
int main()
{
// Examine and change the seg fault signal
struct sigaction segvAction; // File: /usr/include/bits/sigaction.h
// Initialize the segvAction struct to all zeros
memset( &segvAction, 0, sizeof( segvAction ) );
segvAction.sa_sigaction = sigSegvHandler;
segvAction.sa_flags = SA_SIGINFO; //Invoke signal catching function with 3 arguments instead of 1
// Set the action for the SIGSEGV signal
sigaction( SIGSEGV, &segvAction, NULL );
func_b(); // Produce a SIGSEGV error
}
I am compiling using:
g++ -rdynamic testprogram.cpp -o testprogram
I receive the following output from the program:
TRACE at: show_stackframe() [stackoverflow.cpp:15]
TRACE at: show_stackframe() [stackoverflow.cpp:17]
TRACE at: func_b() [stackoverflow.cpp:33]
TRACE at: sigSegvHandler() [stackoverflow.cpp:22]
TRACE at: show_stackframe() [stackoverflow.cpp:15]
Segmentation fault
My question is: why does show_stackframe() cause a segmentation fault inside the sigaction handler but work fine when called outside of it? I obviously seem to be setting up the signal handler/action incorrectly, but I haven't been able to find the mistake all day. GDB doesn't seem to be any help in this case.
As stated here, the backtrace function is AS-Unsafe, which means it is unsafe to call from an asynchronous signal handler. Doing so invokes undefined behavior.