My code converts a JSON document into a string:
#include <string>
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"

class JMSG_C : public rapidjson::Document
{
public:
    // Serialize this JSON document to a string.
    std::string get_msg_str()
    {
        rapidjson::StringBuffer buffer;
        rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
        Accept(writer);
        return buffer.GetString();
    }
};
The function get_msg_str() fails with a segmentation fault; the error output is shown below. The error does not happen every time, only at random. I suspect a memory allocation failure, but when I monitor system RAM consumption while the program runs, I don't see RAM being exhausted. Does anybody have any idea?
*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x96a5e64d
[ 0] [0xb770d40c]
[ 1] /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x7129e) [0xb732b29e]
[ 2] /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x73565) [0xb732d565]
[ 3] /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_malloc+0x5c) [0xb732fa6c]
[ 4] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(_ZN9rapidjson12CrtAllocator6MallocEj+0x11) [0x809a7e5]
[ 5] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(_ZN9rapidjson19MemoryPoolAllocatorINS_12CrtAllocatorEE8AddChunkEj+0x1e) [0x80a4bf6]
[ 6] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(_ZN9rapidjson19MemoryPoolAllocatorINS_12CrtAllocatorEEC2EjPS1_+0x74) [0x80a2c6a]
[ 7] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(_ZN9rapidjson8internal5StackINS_19MemoryPoolAllocatorINS_12CrtAllocatorEEEEC2EPS4_j+0x9e)
[ 8] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(_ZN9rapidjson6WriterINS_19GenericStringBufferINS_4UTF8IcEENS_12CrtAllocatorEEES3_NS_19MemoryPoolAllocatorIS4_EEEC2ERS5_PS7_j+0x2d) [0x809f255]
[ 9] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(_ZN6JMSG_C11get_msg_strEb+0x64) [0x809ad50]
[10] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(_Z9cell_mainR6COMM_C+0x7da) [0x80aba23]
[11] /home/logia/work/projects/dev_Tao/products/mpitao/mpitao(main+0xef)
[12] /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_start_main+0xe6)
Related
I have the following C code. A large vector is scattered across the different processes and then gathered again. I use a custom MPI type that was previously tested in another program:
pixel *src = (pixel*) malloc(sizeof(pixel) * MAX_PIXELS);
const int ROOT = 0;
int pid, n_processors;
pixel *receive_buffer;
MPI_Init(NULL, NULL);
MPI_Comm_rank(MPI_COMM_WORLD, &pid);
MPI_Comm_size(MPI_COMM_WORLD, &n_processors);
[...code...]
receive_buffer = (pixel*)malloc(send_count * sizeof(pixel));
//send_count is a value calculated on root and then broadcasted
MPI_Scatter(src, send_count, mpi_pixel_type, receive_buffer, send_count, mpi_pixel_type, ROOT, MPI_COMM_WORLD);
pixel* dst = (pixel*)malloc(sizeof(pixel)*xsize*ysize_per_proccesor);
[...operations with dst and receive buffer...]
MPI_Gather(&receive_buffer, send_count, mpi_pixel_type, src, send_count, mpi_pixel_type, ROOT, MPI_COMM_WORLD);
But this last line is giving me the following error in execution:
[user] Read -1, expected 32769, errno = 14
[user] *** Process received signal ***
[user] Signal: Segmentation fault (11)
[user] Signal code: Address not mapped (1)
[user] Failing at address: 0x7ffe2dd15000
[user] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7f04af898210]
[user] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x18e533)[0x7f04af9e0533]
[user] [ 2] /home/yunhi/.openmpi/lib/openmpi/mca_btl_vader.so(+0x3284)[0x7f04ac24c284]
[user] [ 3] /home/yunhi/.openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send_request_schedule_once+0x1c6)[0x7f04ac0add46]
[user] [ 4] /home/yunhi/.openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_ack+0x1a9)[0x7f04ac0a6849]
[user] [ 5] /home/yunhi/.openmpi/lib/openmpi/mca_btl_vader.so(mca_btl_vader_poll_handle_frag+0x95)[0x7f04ac24df95]
[user] [ 6] /home/yunhi/.openmpi/lib/openmpi/mca_btl_vader.so(+0x52d7)[0x7f04ac24e2d7]
[user] [ 7] /home/yunhi/.openmpi/lib/libopen-pal.so.40(opal_progress+0x34)[0x7f04af6be0b4]
[user] [ 8] /home/yunhi/.openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x7c5)[0x7f04ac0a1a15]
[user] [ 9] /home/yunhi/.openmpi/lib/libmpi.so.40(ompi_coll_base_gather_intra_linear_sync+0xdf)[0x7f04afc5845f]
[user] [10] /home/yunhi/.openmpi/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_gather_intra_dec_fixed+0xb7)[0x7f04ac04da27]
[user] [11] /home/yunhi/.openmpi/lib/libmpi.so.40(PMPI_Gather+0x15a)[0x7f04afc221ea]
[user] [12] ./blurc(+0x2d73)[0x5576edabed73]
[user] [13] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f04af8790b3]
[user] [14] ./blurc(+0x13ce)[0x5576edabd3ce]
[user] *** End of error message ***
I have reviewed the memory allocations and I can't find why receive_buffer or src is causing "Address not mapped".
Any suggestions will help, thanks :)
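I think you did not mean to pass the address of the receive_buffer pointer into the function, did you? &receive_buffer is the address of the pointer variable itself, not of the allocated pixels, so MPI_Gather reads send_count elements starting from the wrong location. You want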
MPI_Gather(receive_buffer, send_count, mpi_pixel_type, src, send_count, mpi_pixel_type, ROOT, MPI_COMM_WORLD);
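The difference between the pointer and its address can be shown without MPI. The following is a minimal sketch, assuming a hypothetical three-byte pixel struct and a stand-in function copy_out that, like MPI_Gather, accepts its send buffer as an untyped const void*. Neither name comes from the question or from MPI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the question's pixel type.
struct pixel { unsigned char r, g, b; };

// Stand-in for a routine that, like MPI_Gather, takes the send buffer as a
// plain const void* and reads `count` elements starting at that address.
void copy_out(const void* sendbuf, std::size_t count, pixel* dst) {
    assert(sendbuf != nullptr && dst != nullptr);
    std::memcpy(dst, sendbuf, count * sizeof(pixel));
}

// Copies `count` pixels out of the buffer that receive_buffer points to.
std::vector<pixel> gather_like(const pixel* receive_buffer, std::size_t count) {
    std::vector<pixel> out(count);
    // Correct: pass the pointer value, which is the address of the pixel data.
    copy_out(receive_buffer, count, out.data());
    // Passing &receive_buffer would also compile, because any object pointer
    // converts to const void*, but the copy would then start at the pointer
    // variable itself and run past it into unrelated memory.
    return out;
}
```

Because the parameter is untyped, the compiler cannot flag the extra &, which is why this kind of mistake only shows up at run time.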
I am trying to seek to the end of a file using fseek, in steps of 2560 * 5 bytes at a time. I have no idea why I am getting a segmentation fault.
I am running this code on a TDA2x board. Is there a problem with this particular environment that I am unaware of?
The starting value of uNumFrames is 1000.
This code reads data from an externally connected SSD drive, reading the file byte by byte and skipping certain containers in the file using fseek. While reading files like this, the code crashes, as shown in the output further down.
while(uNumFrames > 500)
{
logger::addLog(logger::LOGGER_INFO,"Inside while loop, uNumFrames = %d", uNumFrames);
/*Skip packet Header*/
ui_skip_count = fseek(fp, HeaderSize * 5, SEEK_CUR);
fsize = ftell(fp);
logger::addLog(logger::LOGGER_INFO,"fsize = %ld\n", fsize);
if (ferror(fp))
{
logger::addLog(logger::LOGGER_INFO,"fseek Error");
fclose (fp);
break;
}
if(0 != ui_skip_count)
{
logger::addLog(logger::LOGGER_INFO,"fseek for fp failed");
}
else
{
logger::addLog(logger::LOGGER_INFO,"Inside else for read and scale");
}
uNumFrames--;
}
Sample output looks something like this:
[HOST ] [INFO] 86.234680 s: Inside while loop, uNumFrames = 978
[HOST ] [INFO] 86.234680 s: fsize = 3368961
[HOST ] [INFO] 86.234710 s: Inside else for read and scale
[HOST ] [INFO] 86.234710 s: Inside while loop, uNumFrames = 977
[HOST ] [INFO] 86.234710 s: fsize = 3381761
[HOST ] [INFO] 86.234710 s: Inside else for read and scale
[HOST ] [INFO] 86.234710 s: Inside while loop, uNumFrames = 976
[HOST ] [INFO] 86.234741 s: fsize = 3394561
[HOST ] [INFO] 86.234741 s: Inside else for read and scale
[HOST ] [INFO] 86.234741 s: Inside while loop, uNumFrames = 975
[HOST ] [INFO] 86.234741 s: fsize = 3407361
[HOST ] [INFO] 86.234741 s: Inside else for read and scale
[HOST ] [INFO] 86.234741 s: Inside while loop, uNumFrames = 974
[HOST ] [INFO] 86.234771 s: fsize = 3420161
[HOST ] [INFO] 86.234771 s: Inside else for read and scale
[HOST ] [INFO] 86.234771 s: Inside while loop, uNumFrames = 973
[HOST ] [INFO] 86.234771 s: fsize = 3432961
[HOST ] [INFO] 86.234771 s: Inside else for read and scale
[HOST ] [INFO] 86.234771 s: Inside while loop, uNumFrames = 972
[HOST ] [INFO] 86.234802 s: fsize = 3445761
[HOST ] [INFO] 86.234802 s: Inside else for read and scale
[HOST ] [INFO] 86.234802 s: Inside while loop, uNumFrames = 971
[HOST ] [INFO] 86.234802 s: fsize = 3458561
[HOST ] [INFO] 86.234802 s: Inside else for read and scale
****** Segmentation fault caught ....
Faulty address is 0xa6499020, called from 0x77ddb
Totally Obtained 0 stack frames. signal number =11
Signal number = 11, Signal errno = 0
SI code = 1 (Address not mapped to object)
Fault addr = 0xa6499020
[bt] Execution path:
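For comparison, here is a minimal sketch of a stepped seek that checks the return value of fseek before calling ftell and never steps past a known file size. It is not a diagnosis of the crash above. The helper names file_size_of and skip_in_steps are hypothetical, and the sketch assumes a regular file whose size fits in a long.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: returns the size of the file in bytes and restores
// the current position, or returns -1 if any call fails.
long file_size_of(std::FILE* fp) {
    long here = std::ftell(fp);
    if (here < 0 || std::fseek(fp, 0, SEEK_END) != 0) return -1;
    long size = std::ftell(fp);
    if (std::fseek(fp, here, SEEK_SET) != 0) return -1;
    return size;
}

// Hypothetical helper: advances fp in steps of `step` bytes without going
// past file_size. Returns the number of steps taken, or -1 on error.
long skip_in_steps(std::FILE* fp, long step, long file_size) {
    assert(fp != nullptr && step > 0);
    long steps = 0;
    for (;;) {
        long pos = std::ftell(fp);
        if (pos < 0) return -1;                  // ftell failed
        if (pos + step > file_size) break;       // next step would pass the end
        // fseek reports failure through its return value, so check it here.
        if (std::fseek(fp, step, SEEK_CUR) != 0) return -1;
        ++steps;
    }
    return steps;
}
```

With the question's numbers, the call would look like skip_in_steps(fp, HeaderSize * 5, file_size_of(fp)), and the loop would end on its own at the end of the file instead of depending on a frame counter.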
References for this post:
[1] "Making C++ objects persistent between mex calls, and robust." http://www.mathworks.com/matlabcentral/newsreader/view_thread/278243
[2] "MATLAB parfor and C++ class mex wrappers (copy constructor required?)"
I successfully implemented a MATLAB/C++ interface based on the method proposed in [1].
However, I'm having trouble when trying to use the system with MATLAB Parallel Computing.
What happens is a segmentation fault in the MEX interface when converting a MATLAB handle to a C++ pointer.
To be clearer, I will recap the structure proposed in [1].
There are three files in the system, with this communication scheme:
[myInterface.m] <--> [myMexInterface.cpp] <--> [myClass.cpp]
where
myInterface.m is a matlab class
myMexInterface.cpp is a C++ (mex) function
myClass.cpp is a C++ (mex) class
The use of this system is divided into two phases:
Construction:
A MATLAB object myInterface is created. This causes a call to myMexInterface.mexa64, which creates a C++ myClass object. The C++ pointer to the myClass object is sent back through myMexInterface.mexa64 to myInterface, which stores it for later use. In particular, myMexInterface.mexa64 converts the C++ pointer to a MATLAB handle.
Use of the C++ class from MATLAB:
myInterface offers methods to clients that, passing through myMexInterface.mexa64, invoke functions on the myClass object. In this phase, myMexInterface.mexa64 needs the handle stored by myInterface during construction in order to invoke the functions on the correct C++ object. Of course, in this phase myMexInterface.mexa64 performs the reverse conversion, from MATLAB handle to C++ pointer.
My implementation works in a single thread, but under parfor a segmentation fault occurs during the conversion from handle to pointer.
In particular, I would like to focus on myMexInterface.cpp.
The command to be executed is passed as a string in the first argument; for second-phase operations, the second argument is the MATLAB handle of the C++ object associated with that interface.
#include "mex.h"
#include "class_handle.hpp"
CLASS void myMexInterface(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if (!strcmp("new", cmd)) {
// ...
plhs[0] = convertPtr2Mat<myClass>(new myClass());
}
if (!strcmp("delete", cmd)) {
// ...
destroyObject<myClass>(prhs[1]);
}
// Get the class instance pointer from the second input
cout << " trying to convert handle...";
myClass *myClass_instanceAddress = convertMat2Ptr<myClass>(prhs[1]); // SEGMENTATION FAULT ON MULTI CORE!!!
cout << " success in handle conversion. \n";
if (!strcmp("aFunction", cmd)) {
myClass_instanceAddress->aFunction();
}
// .. other functions
}
The function convertMat2Ptr, which triggers the segfault, comes from class_handle.hpp, which is part of the solution proposed in [1].
In particular, the function inside class_handle.hpp in which the segfault occurs is convertMat2HandlePtr:
template<class base> class class_handle
// ...
template<class base> inline class_handle<base> *convertMat2HandlePtr(const mxArray *in)
{
if (mxGetNumberOfElements(in) != 1 || mxGetClassID(in) != mxUINT64_CLASS || mxIsComplex(in))
mexErrMsgTxt("Input must be a real uint64 scalar.");
std::cout << "class_handle: trying to cast \n";
class_handle<base> *ptr = reinterpret_cast<class_handle<base> *>(*((uint64_t *)mxGetData(in))); // SEGMENTATION FAULT ON MULTI CORE!!!
if (!ptr->isValid())
mexErrMsgTxt("Handle not valid.");
return ptr;
}
template<class base> inline base *convertMat2Ptr(const mxArray *in)
{
return convertMat2HandlePtr<base>(in)->ptr();
}
// ...
It is not actually clear to me what happens in that cast, so I cannot analyze it further.
My guess is that, for some reason, MATLAB Parallel Computing creates an inconsistency with the previously created C++ object.
The MATLAB client code that triggers the segfault is the following:
myInterface = myMexInterface();
matlabpool open local 1
out = myModelInterf.aFunction()
disp(' Now starting parfor ***');
parfor i = 1:1
out = myModelInterf.aFunction()
end
Note that, to simplify the situation, I open only one worker in matlabpool and the parfor executes only one iteration; I still get the error. Without the parfor block there is no error, even with multiple calls to aFunction().
What I get in the Command Window is:
Create interface with CPP handle: 139698584223104
Starting matlabpool using the 'local' profile ... connected to 1 workers.
trying to convert handle...class_handle: trying to cast
success in handle conversion.
out =
3
Now starting parfor ***
Save interface with CPP handle: 139698584223104
Save interface with CPP handle: 139698584223104
Create interface with CPP handle: 0
Load interface with CPP handle: 139698584223104
trying to convert handle...class_handle: trying to cast
------------------------------------------------------------------------
Segmentation violation detected at Wed Jan 30 15:00:47 2013
------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled
Current Visual : None
Default Encoding: UTF-8
GNU C Library : 2.15 stable
MATLAB Root : /usr/local/MATLAB/R2012b
MATLAB Version : 8.0.0.783 (R2012b)
Operating System: Linux 3.2.0-31-generic #50-Ubuntu SMP Fri Sep 7 16:16:45 UTC 2012 x86_64
Processor ID : x86 Family 6 Model 42 Stepping 7, GenuineIntel
Virtual Machine : Java 1.6.0_17-b04 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode
Window System : No active display
Fault Count: 1
Abnormal termination:
Segmentation violation
Register State (from fault):
RAX = 00007fedbcba74a0 RBX = 00007fee2afa0fe0
RCX = 0000000000000006 RDX = 0000000000000060
RSP = 00007fee2afa0520 RBP = 00007fee2afa09f0
RSI = 0000000000000000 RDI = 00007fee2b817a50
R8 = 00007fee2afa09af R9 = 00007fedbcdb9208
R10 = 00007fee2afa0200 R11 = 00007fee3d57ba00
R12 = 0000000000000002 R13 = 00007f0e1c7cfd80
R14 = 00007fee2afa0dd0 R15 = 00007fee2afa0f20
RIP = 00007fedfc00a8e4 EFL = 0000000000010206
CS = 0033 FS = 0000 GS = 0000
Stack Trace (from fault):
[ 0] 0x00007fee3f5e31de /usr/local/MATLAB/R2012b/bin/glnxa64/libmwfl.so+00516574 _ZN2fl4diag15stacktrace_base7captureERKNS0_14thread_contextEm+000158
[ 1] 0x00007fee3f5e44b2 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwfl.so+00521394
[ 2] 0x00007fee3f5e5ffe /usr/local/MATLAB/R2012b/bin/glnxa64/libmwfl.so+00528382 _ZN2fl4diag13terminate_logEPKcRKNS0_14thread_contextE+000174
[ 3] 0x00007fee3e8d2093 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00557203 _ZN2fl4diag13terminate_logEPKcPK8ucontext+000067
[ 4] 0x00007fee3e8ceb9d /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00543645
[ 5] 0x00007fee3e8d0835 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00550965
[ 6] 0x00007fee3e8d0a55 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00551509
[ 7] 0x00007fee3e8d10fe /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00553214
[ 8] 0x00007fee3e8d1295 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00553621
[ 9] 0x00007fee3cdc8cb0 /lib/x86_64-linux-gnu/libpthread.so.0+00064688
[ 10] 0x00007fedfc00a8e4 /home/gwala/Documents/mexInitConfig/mex_torque_profile.mexa64+00010468 mexFunction+002116
[ 11] 0x00007fee355b269a /usr/local/MATLAB/R2012b/bin/glnxa64/libmex.so+00112282 mexRunMexFile+000090
[ 12] 0x00007fee355ae4e9 /usr/local/MATLAB/R2012b/bin/glnxa64/libmex.so+00095465
[ 13] 0x00007fee355af33c /usr/local/MATLAB/R2012b/bin/glnxa64/libmex.so+00099132
[ 14] 0x00007fee3e620a4b /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00596555 _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+000539
[ 15] 0x00007fee3deb4e56 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02264662
[ 16] 0x00007fee3de651c6 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01937862
[ 17] 0x00007fee3de69ab4 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01956532
[ 18] 0x00007fee3de660d3 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01941715
[ 19] 0x00007fee3de66ed7 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01945303
[ 20] 0x00007fee3ded2760 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02385760
[ 21] 0x00007fee3e620a4b /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00596555 _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+000539
[ 22] 0x00007fee35b73538 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcos.so+01574200
[ 23] 0x00007fee35b15232 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcos.so+01188402
[ 24] 0x00007fee35b154ce /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcos.so+01189070
[ 25] 0x00007fee35b1723c /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcos.so+01196604
[ 26] 0x00007fee35bfc9c7 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcos.so+02136519
[ 27] 0x00007fee3e5d6431 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00291889 _ZN13Mfh_MATLAB_fn11dispatch_fhEiPP11mxArray_tagiS2_+000529
[ 28] 0x00007fee3deb4933 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02263347
[ 29] 0x00007fee3dec40d8 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02326744
[ 30] 0x00007fee3dec7038 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02338872
[ 31] 0x00007fee3de6ab18 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01960728
[ 32] 0x00007fee3de660d3 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01941715
[ 33] 0x00007fee3de66ed7 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01945303
[ 34] 0x00007fee3ded2760 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02385760
[ 35] 0x00007fee3e62053b /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00595259 _ZN8Mfh_file11dispatch_fhEP20_mdUnknown_workspaceiPP11mxArray_tagiS4_+000555
[ 36] 0x00007fee3e5e3057 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00344151 _Z30callViamdMxarrayFunctionHandlePviPP11mxArray_tagiS2_+000039
[ 37] 0x00007fee3de3e143 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01777987
[ 38] 0x00007fee3de3f07d /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01781885
[ 39] 0x00007fee3de3f78b /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01783691
[ 40] 0x00007fee3de42759 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01795929
[ 41] 0x00007fee3de6922b /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01954347
[ 42] 0x00007fee3de660d3 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01941715
[ 43] 0x00007fee3de66ed7 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01945303
[ 44] 0x00007fee3ded2760 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02385760
[ 45] 0x00007fee3e62053b /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00595259 _ZN8Mfh_file11dispatch_fhEP20_mdUnknown_workspaceiPP11mxArray_tagiS4_+000555
[ 46] 0x00007fee3e5e3057 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00344151 _Z30callViamdMxarrayFunctionHandlePviPP11mxArray_tagiS2_+000039
[ 47] 0x00007fee3ddf38b5 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01472693 inFullFevalFcn+001045
[ 48] 0x00007fee3e5e66ba /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00358074 _ZN11Mfh_builtin11dispatch_mfEiPP11mxArray_tagiS2_+000074
[ 49] 0x00007fee3e5d6431 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00291889 _ZN13Mfh_MATLAB_fn11dispatch_fhEiPP11mxArray_tagiS2_+000529
[ 50] 0x00007fee3e0a1140 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+04280640
[ 51] 0x00007fee3e0a197a /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+04282746
[ 52] 0x00007fee3e0a24ea /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+04285674
[ 53] 0x00007fee3df054cd /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02593997
[ 54] 0x00007fee3df30d22 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02772258
[ 55] 0x00007fee3df30e4f /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02772559
[ 56] 0x00007fee3e04db30 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+03939120
[ 57] 0x00007fee3de69fec /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01957868
[ 58] 0x00007fee3de660d3 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01941715
[ 59] 0x00007fee3de66ed7 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01945303
[ 60] 0x00007fee3ded2760 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02385760
[ 61] 0x00007fee3e620a4b /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_dispatcher.so+00596555 _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+000539
[ 62] 0x00007fee3de9388f /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02128015
[ 63] 0x00007fee3de92c69 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+02124905
[ 64] 0x00007fee3ddf129c /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01462940 inCallFcnWithTrap+000092
[ 65] 0x00007fee3de57bfb /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01883131
[ 66] 0x00007fee3ddf0168 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwm_interpreter.so+01458536 _Z28inCallFcnWithTrapInDesiredWSiPP11mxArray_tagiS1_PKcbP15inWorkSpace_tag+000104
[ 67] 0x00007fee36365b09 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwiqm.so+02878217 _ZN3iqm15BaseFEvalPlugin7executeEP15inWorkSpace_tagRN5boost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+000457
[ 68] 0x00007fedff8a992d /usr/local/MATLAB/R2012b/bin/glnxa64/libnativejmi.so+00674093 _ZN9nativejmi14JmiFEvalPlugin7executeEP15inWorkSpace_tagRN5boost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+000173
[ 69] 0x00007fedff8d6c45 /usr/local/MATLAB/R2012b/bin/glnxa64/libnativejmi.so+00859205 _ZN3mcr3mvm27McrSwappingIqmPluginAdapterIN9nativejmi14JmiFEvalPluginEE7executeEP15inWorkSpace_tagRN5boost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+000629
[ 70] 0x00007fee3633bbfa /usr/local/MATLAB/R2012b/bin/glnxa64/libmwiqm.so+02706426
[ 71] 0x00007fee3632d594 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwiqm.so+02647444
[ 72] 0x00007fee357d7ccd /usr/local/MATLAB/R2012b/bin/glnxa64/libmwbridge.so+00122061 _Z10ioReadLinebP8_IO_FILERKN5boost8optionalIKP15inWorkSpace_tagEEb+000429
[ 73] 0x00007fee357d8354 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwbridge.so+00123732
[ 74] 0x00007fee357dd71d /usr/local/MATLAB/R2012b/bin/glnxa64/libmwbridge.so+00145181
[ 75] 0x00007fee357dd81e /usr/local/MATLAB/R2012b/bin/glnxa64/libmwbridge.so+00145438
[ 76] 0x00007fee357ddf07 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwbridge.so+00147207 _Z8mnParserv+000631
[ 77] 0x00007fee3e8b7472 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00447602 _ZN11mcrInstance30mnParser_on_interpreter_threadEv+000034
[ 78] 0x00007fee3e895b69 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00310121
[ 79] 0x00007fee3e895d48 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00310600
[ 80] 0x00007fee3eeabf73 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+00999283 _ZN10eventqueue18UserEventQueueImpl5flushEv+000371
[ 81] 0x00007fee3eeac695 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+01001109 _ZN10eventqueue8ReadPipeEib+000053
[ 82] 0x00007fee3eeab321 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+00996129 _ZN10eventqueue18UserEventQueueImpl9selectFcnEb+000353
[ 83] 0x00007fee3284fa65 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwuix.so+00518757
[ 84] 0x00007fee3ef45a11 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+01628689 _ZSt8for_eachIN9__gnu_cxx17__normal_iteratorIPN5boost8weak_ptrIN4sysq10ws_ppeHookEEESt6vectorIS6_SaIS6_EEEENS4_8during_FIS6_NS2_10shared_ptrIS5_EEEEET0_T_SH_SG_+000081
[ 85] 0x00007fee3ef46aeb /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+01633003 _ZN4sysq12ppe_for_eachINS_8during_FIN5boost8weak_ptrINS_10ws_ppeHookEEENS2_10shared_ptrIS4_EEEEEET_RKS9_+000251
[ 86] 0x00007fee3ef445a2 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+01623458 _ZN4sysq19ppePollingDuringFcnEb+000114
[ 87] 0x00007fee3ef44969 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+01624425 _ZN4sysq11ppeMainLoopEiib+000121
[ 88] 0x00007fee3ef44b08 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+01624840 _ZN4sysq11ppeLoopIfOKEiib+000152
[ 89] 0x00007fee3ef44c63 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwservices.so+01625187 _ZN4sysq20processPendingEventsEiib+000147
[ 90] 0x00007fee3e896664 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00312932
[ 91] 0x00007fee3e896b3c /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00314172
[ 92] 0x00007fee3e890592 /usr/local/MATLAB/R2012b/bin/glnxa64/libmwmcr.so+00288146
[ 93] 0x00007fee3cdc0e9a /lib/x86_64-linux-gnu/libpthread.so.0+00032410
[ 94] 0x00007fee3caedcbd /lib/x86_64-linux-gnu/libc.so.6+00998589 clone+000109
This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
If this problem is reproducible, please submit a Service Request via:
http://www.mathworks.com/support/contact_us/
A technical support engineer might contact you with further information.
Thank you for your help.** This crash report has been saved to disk as /home/gwala/matlab_crash_dump.11705-1 **
class_handle: trying to cast
Destroy object 0x7f0e1e45f090
class_handle: trying to cast
Deleted myclass with handle: 139698584223104
Error using parallel_function (line 589)
The session that parfor is using has shut down.
As can be seen, the interface is created correctly, and aFunction then correctly returns the value 3 for 'out' (this works even for multiple calls).
Then the parfor loop starts and, as expected, the MATLAB object is saved (it is not clear to me why it is saved twice, but this also happens from the Command Window when I save a myInterface object).
Finally, a new myInterface is created with handle 0, and the handle is restored to the previous value. However, the call to aFunction fails.
Lastly, here is the myInterface MATLAB class:
classdef myInterface < handle
properties (SetAccess = private, Transient=true)
objectHandle; % Handle to the underlying C++ class instance
end
methods(Static=true)
function obj = loadobj(this)
obj = myInterface();
obj.objectHandle = this.objectHandle;
disp(['Load interface with CPP handle: ' num2str(this.objectHandle)]);
end
end
methods
function obj = saveobj(this)
obj.objectHandle = this.objectHandle;
disp(['Save interface with CPP handle: ' num2str(this.objectHandle)]);
end
%% Constructor - Create a new C++ class instance
function this = myInterface(varargin)
if (size(varargin) == 0) % constructor with no arguments. Used in load/save operations
this.objectHandle = 0;
else % constructor with normal arguments
this.objectHandle = myMexInterface('new', varargin{:});
end
disp(['Create interface with CPP handle: ' num2str(this.objectHandle)]);
end
%% Destructor - Destroy the C++ class instance
function delete(this)
myMexInterface('delete', this.objectHandle);
disp(['Deleted myclass with handle: ' num2str(this.objectHandle)]);
end
%% aFunction
function varargout = aFunction(this, varargin)
[varargout{1:nargout}] = myMexInterface('aFunction', this.objectHandle,varargin{:});
end
end
end
Please note that, as suggested in [2], I included the loadobj and saveobj functions, and the class has the "Transient" property in order to support load/save operations.
I hope someone can help, and that this post can help others.
Regards,
Gabriele Gualandi
Parallel computing toolbox workers are separate MATLAB processes - they might even be running on separate machines, and as such they have separate address spaces. If I've understood correctly, you are passing a pointer to an object which is valid on your MATLAB client to a worker, and trying to dereference that pointer there.
This can possibly be made to work if you explicitly use shared memory regions to communicate between client and workers. For example, see this file exchange contribution: http://www.mathworks.co.uk/matlabcentral/fileexchange/28572-sharedmatrix .
The alternative, as I proposed in the solution to one of the questions you reference above, is to support saving to disk more fully - in this case, this probably means pulling the contents out of your C++ object into a MATLAB structure, and supporting recreating your C++ object from that MATLAB structure.
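The save-and-recreate idea can be sketched in plain C++. This is an illustration under assumptions, not the asker's code: the members gain_ and calls_ and the functions toState and fromState are hypothetical, and a std::map stands in for the MATLAB structure that a real MEX file would fill in.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the question's myClass, with some internal state.
class myClass {
public:
    explicit myClass(double gain = 1.0, int calls = 0)
        : gain_(gain), calls_(calls) {}

    double aFunction() { ++calls_; return 3.0 * gain_; }
    int calls() const { return calls_; }

    // Export the full state as plain values. This plays the role of filling
    // a MATLAB struct in saveobj: only data is stored, never a pointer.
    std::map<std::string, double> toState() const {
        return {{"gain", gain_}, {"calls", static_cast<double>(calls_)}};
    }

    // Rebuild an equivalent object from exported state. This plays the role
    // of loadobj on the worker, which must construct its own C++ object.
    static myClass fromState(const std::map<std::string, double>& s) {
        assert(s.count("gain") == 1 && s.count("calls") == 1);
        return myClass(s.at("gain"), static_cast<int>(s.at("calls")));
    }

private:
    double gain_;
    int calls_;
};
```

The point of the design is that what crosses the process boundary is a description of the object, so each worker ends up with a pointer that is valid in its own address space.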
I get a segfault when calling "new" to create an object and pushing the resulting pointer into a vector. The code that pushes the element into the vector is:
queue->push_back(new Box(gen_id, Interval(x_mid, x_end), Interval(y_mid-y_halfwidth, y_mid+y_halfwidth)));
Basically, Box is a class whose constructor takes 3 arguments: a generation id and 2 Intervals. I printed the contents of the vector before and after this push:
before:
[ -0.30908203125, -0.3087158203125 ] , [ -0.951416015625, -0.9510498046875 ]
[ -0.3087158203125, -0.308349609375 ] , [ -0.951416015625, -0.9510498046875 ]
[ -0.30908203125, -0.3087158203125 ] , [ -0.9510498046875, -0.95068359375 ]
[ -0.3087158203125, -0.308349609375 ] , [ -0.9510498046875, -0.95068359375 ]
after:
[ -0.30908203125, -0.3087158203125 ] , [ -0.951416015625, -0.9510498046875 ]
[ -0.3087158203125, -0.308349609375 ] , [ -0.951416015625, -0.9510498046875 ]
[ 8.9039208750109844342e-243, 6.7903818933216500424e-173 ] , [ -0.9510498046875, -0.95068359375 ]
[ -0.3087158203125, -0.308349609375 ] , [ -0.9510498046875, -0.95068359375 ]
[ -0.3087158203125, -0.308349609375 ] , [ -0.95123291015625, -0.95086669921875 ]
I have no clue why this happens, but apparently one element got corrupted. There is no other code between these two sets of output except that push, and I used gdb to confirm that. I also checked the two Interval variables, and both hold values that make sense.
My question is: in what situation does "new" segfault? Or is my problem caused by something else? Thanks.
Assuming it really is new generating the segfault, the most common cause would be a corrupted heap, typically a result of overwriting memory you don't own and/or a double delete.
Valgrind will be your friend if you can run on a Linux system.
I doubt that new itself is giving you the segfault; the problem is probably in one of the constructors. Try splitting up that giant line, and put in some print statements to see exactly where the problem is.
printf("Creating the first interval...\n");
Interval a(x_mid, x_end);
printf("Creating the second interval...\n");
Interval b(y_mid-y_halfwidth, y_mid + y_halfwidth);
printf("Creating the box...\n");
Box* box_to_enqueue = new Box(gen_id, a, b);
printf("Enqueueing the box...\n");
// Do you really want to enqueue a pointer instead of a Box?
queue->push_back(box_to_enqueue);
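On the question raised in the comment above, one option is to store Box objects by value so that no new or delete is involved. The following is a minimal sketch, assuming simplified, hypothetical definitions of Interval and Box, since the question does not show the real ones.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal versions of the question's types.
struct Interval {
    double lo, hi;
    Interval(double l, double h) : lo(l), hi(h) { assert(l <= h); }
};

struct Box {
    int gen_id;
    Interval x, y;
    Box(int id, const Interval& xi, const Interval& yi)
        : gen_id(id), x(xi), y(yi) {}
};

// Construct the Box directly inside the container, so the vector owns it
// and there is nothing to delete by hand.
void enqueue_box(std::vector<Box>& queue, int gen_id,
                 double x_mid, double x_end,
                 double y_mid, double y_halfwidth) {
    queue.emplace_back(gen_id,
                       Interval(x_mid, x_end),
                       Interval(y_mid - y_halfwidth, y_mid + y_halfwidth));
}
```

One caveat with this layout: a pointer or reference to an element becomes invalid when the vector grows, so elements should be accessed by index or looked up again after each push.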
I have a piece of code like this, running on 4 MPI processes.
for (i=0;i<niter;i++){
//.. do something with temprs
memcpy(rs, temprs,..) // copy temprs content to rs
MPI_Gather(rs,...0...); //gather result to 0
if (mpiRank == 0) writeToDisk(rs);
}
I want to put the last 2 lines of code into a function comm_and_save and run it in a thread, so that it can run in parallel with the remaining code, something like below:
boost::thread t1;
for (i=0;i<niter;i++){
//.. do something with temprs
t1.join(); // make sure previous comm_and_save done
memcpy(rs, temprs,..) // copy temprs content to rs
t1 = boost::thread( comm_and_save, rs );
}
However, the code sometimes runs, sometimes hangs, and sometimes throws an error:
local QP operation err (QPN 5a0407, WQE # 00000f02, CQN 280084, index 100677)
[ 0] 005a0407
[ 4] 00000000
[ 8] 00000000
[ c] 00000000
[10] 0270c84f
[14] 00000000
[18] 00000f02
[1c] ff100000
Please enlighten me as to which part I'm doing incorrectly.
Thank you
Use MPI_Init_thread instead of MPI_Init:
http://www.mpi-forum.org/docs/mpi-20-html/node165.htm
and check the level of thread support it reports as available (the provided argument).
Cheerz.