LinearHashTable iter not dereferencable and iter not - c++

Hi all :)
I am using 1.5.4-all (2014-10-22) in my VC++ project (Microsoft Visual C++ Compiler 18.00.21005.1 for x86 platform).
My problem is that I get the following error message after some time. The time after which the error occurs differs quite a lot: sometimes it happens after 30 seconds and sometimes after 5 minutes.
I could locate the source for the error in the LinearHashTable.h file at line 214:
I have the following method where a Shot (struct) is added to the table:
void ShotSimulationService::SimulateShot(Shot shot) {
MutexThreadLock.lock();
shots.insert(ShotsSetType::ValueType(SimulationShot(shot)));
errorCount = 0;
MutexThreadLock.unlock();
}
SimulateShot is called from a different thread than the one that runs the following code:
void ShotSimulationService::Update(WebcamService* observable) {
if (shots.empty()) {
return;
}
try {
Mat frame = observable->GetLastImage().clone();
ShotsSetType::Iterator iter = shots.begin();
vector<Shot> deleteShots;
errorCount++;
while (iter != shots.end()){
if (iter->SimulateStartExplosion()) {
//simulate gun explosion
OverlayImage(frame, gunShotImg, iter->startPoint);
}
//simulate explosion
SimulationShot::SimulationHitStatus status = iter->status;
if (status == SimulationShot::SimulationHitStatus::UNKNOWN) {
if (detectionService.HasShotHitPlayer(frame, *iter)) {
iter->status = SimulationShot::HIT_PLAYER;
iter->SetCurrentPointAsEndoint();
//Notify that player was hit
playerHitQueue.enqueueNotification(new PlayerHitNotification(iter->hitPlayer));
}
}
if (iter->SimulateEndExplosion()) {
if (status == SimulationShot::HIT_PLAYER) {
int explosionx = iter->endPoint.x - robotExplosionHalfXSize > 0 ? iter->endPoint.x - robotExplosionHalfXSize : 0;
int explosionY = iter->endPoint.y - robotExplosionHalfYSize > 0 ? iter->endPoint.y - robotExplosionHalfYSize : 0;
OverlayImage(frame, robotExplosionImg, Point2i(explosionx, explosionY));
}
else {
// status == SimulationShot::HIT_WALL or UNKNOWN
int explosionx = iter->endPoint.x - wallExplosionHalfXSize > 0 ? iter->endPoint.x - wallExplosionHalfXSize : 0;
int explosionY = iter->endPoint.y - wallExplosionHalfYSize > 0 ? iter->endPoint.y - wallExplosionHalfYSize : 0;
OverlayImage(frame, robotExplosionImg, Point2i(explosionx, explosionY));
if (status != SimulationShot::HIT_WALL) {
iter->status = SimulationShot::HIT_WALL;
}
}
if (iter->IsSimulationFinished()) {
deleteShots.push_back(*iter);
}
}
else {
//simulate bullet
OverlayImage(frame, cheeseImg, iter->GetNextShotPoint());
}
++iter;
}
//delete finished simulations
MutexThreadLock.lock();
for each (Shot shot in deleteShots)
{
shots.erase(shot);
}
MutexThreadLock.unlock();
}
catch (cv::Exception& e) {
Logger& logger = Logger::get("Test");
logger.error(e.what());
}
}
The Update method is called quite often: whenever a new webcam frame is available.
The callstack of the error starts in the following line:
if (iter->SimulateEndExplosion()) {
In the method SimulateEndExplosion, only members of the struct are used:
bool SimulateEndExplosion() {
if (status == HIT_PLAYER) {
currPercentage = 1.0;
return true;
}
if (currPercentage < 1.0) {
return false;
}
++endExplosionCtr;
return endExplosionCtr <= maxEndExplosions;
}
Does anybody have an idea why this problem occurs?
Any help and any feedback is welcome! I have absolutely no idea what is going wrong here :(
Thanks!

Iterating in one thread while inserting in another, without protecting both operations with the same mutex, causes this problem: an insert invalidates the iterator, and the next use of it triggers the assertion failure. In your code only the insert and the erase are locked; the iteration in Update is not. Protect both the insertion and the iteration with the mutex.
Also, the way you are using the mutex is not safe, because it will not be unlocked if an exception is thrown between lock() and unlock(). Use ScopedLock instead, and RAII will unlock automatically and safely in all cases:
void ShotSimulationService::SimulateShot(Shot shot) {
Mutex::ScopedLock lock(MutexThreadLock);
shots.insert(ShotsSetType::ValueType(SimulationShot(shot)));
errorCount = 0;
// unlock will be called by ScopedLock destructor
}
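To make the first point concrete, here is a minimal sketch of taking the same lock around the iteration as well as the insert. It uses std::mutex, std::lock_guard and std::set as stand-ins for the mutex and the LinearHashTable in the question; ShotStore, its member names, the int payload and runLockDemo are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for the service in the question: an int plays the
// role of a shot, std::set the role of the hash table.
class ShotStore {
public:
    // Called from the producer thread.
    void add(int shot) {
        std::lock_guard<std::mutex> lock(mutex_);  // released on every exit path
        shots_.insert(shot);
    }

    // Called from the consumer thread. The lock is held for the whole
    // iteration and the erase phase, so no insert can invalidate `it`.
    std::size_t removeEven() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<int> finished;
        for (auto it = shots_.begin(); it != shots_.end(); ++it) {
            if (*it % 2 == 0) finished.push_back(*it);
        }
        for (int shot : finished) shots_.erase(shot);
        return finished.size();
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return shots_.size();
    }

private:
    std::mutex mutex_;
    std::set<int> shots_;
};

// Runs one inserting thread against one iterating thread and returns the
// number of elements left after both have finished and a final sweep has run.
std::size_t runLockDemo(int count) {
    assert(count >= 0);
    ShotStore store;
    std::thread producer([&] { for (int i = 0; i < count; ++i) store.add(i); });
    std::thread consumer([&] { for (int i = 0; i < 100; ++i) store.removeEven(); });
    producer.join();
    consumer.join();
    store.removeEven();   // final sweep after all inserts
    return store.size();  // only the odd values remain
}
```

Holding the lock for the whole loop keeps the sketch simple. If the per-element work is slow (image overlays, for example), copying the elements out under the lock and processing the copy afterwards is a common alternative.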

Related

Future task just vanishes

I ran into a rather strange situation when using std::future and ThreadPool (I'm using https://github.com/bandi13/ThreadPool/blob/master/example.cpp). I do not think ThreadPool itself is at fault, since I've tried multiple forks of it and, after some debugging, I do not see how it would be related to the issue.
The issue is that in certain situations my doProcess method just vanishes: it does not return. It disappears in the middle of a long-running loop.
Therefore I think I must be doing something wrong, but can't figure out what.
Here's the code:
ThreadPool pool(numThreads);
std::vector< std::future<bool> > futures;
int count = 0;
string orgOut = outFile;
for (auto fileToProcess : filesToProcess) {
count++;
outFile = orgOut + std::to_string(count);
// enque processing in the thread pool
futures.emplace_back(
pool.enqueue([count, fileToProcess, outFile, filteredKeys, sql] {
return doProcess(fileToProcess, outFile, filteredKeys, sql);
})
);
}
Then I wait for all processings to be done (I think this could be done in a more elegant way also):
bool done = false;
while (!done) {
done = true;
for (auto && futr : futures) {
auto status = futr.wait_for(std::chrono::milliseconds(1));
if (status != std::future_status::ready) {
done = false;
break;
}
}
}
Edit: At first I also tried the obvious wait(), with the same result however:
bool done = false;
while (!done) {
done = true;
for (auto && futr : futures) {
futr.wait();
}
}
Edit: The doProcess() method. The behavior is this: The loopcnt variable is just a counter to debug how often the method was entered and the loop started. As you can see, there is no return from this loop, but the thread just vanishes when inside this loop with no error whatsoever and wasHereCnt is reached only occasionally (like 1 of 100 times the method is run). I'm really puzzled.
bool doProcess([...]) {
// ....
vector<vector<KVO*>*>& features = filter.result();
vector<vector<KVO*>*> filteredFeatures;
static int loopcnt = 0;
std::cout << "loops " << loopcnt << endl;
loopcnt++;
for (vector<KVO*>* feature : features) {
for (KVO *kv : *feature) {
switch (kv->value.type()) {
case Variant::JNULL:
sqlFilter.setNullValue(kv->key);
break;
case Variant::INT:
sqlFilter.setValue(static_cast<int64_t>(kv->value), kv->key);
break;
case Variant::UINT:
sqlFilter.setValue(static_cast<int64_t>(kv->value), kv->key);
break;
case Variant::DOUBLE:
sqlFilter.setValue(static_cast<double>(kv->value), kv->key);
break;
case Variant::STRING:
sqlFilter.setValue(static_cast<string>(kv->value), kv->key);
break;
default:
assert(false);
break;
}
}
int filterResult = sqlFilter.exec();
if (filterResult > 0) {
filteredFeatures.push_back(feature);
}
sqlFilter.reset();
}
static int wasHereCnt = 0;
std::cout << "was here: " << wasHereCnt << endl;
wasHereCnt++;
JsonWriter<Writer<FileWriteStream>> geojsonWriter(writer, filteredFeatures);
bool res = geojsonWriter.write();
os.Flush();
fclose(fp);
return res;
}
The doProcess method does work when it's taking less time. It breaks and disappears when it takes somewhat more time. The difference being just the complexity of an SQL query I run in the method. So I don't post the code for doProcess().
What causes the thread of the thread pool to be interrupted, and how to fix it?
UPDATE
Well, I found it out. After several hours I decided to remove the future tasks and just ran the task on the main thread. The issue was that an exception was thrown via:
throw std::runtime_error("bad cast");
... some time down the code flow after this:
case Variant::UINT:
sqlFilter.setValue(static_cast<int64_t>(kv->value), kv->key);
break;
This error was thrown as expected when running on the main thread. But it's never raised when run as future task. This is really odd and seems like a compiler or debugger issue.
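The behavior in the update is consistent with how std::future works rather than with a compiler or debugger bug. An exception thrown inside a task that runs via std::async or std::packaged_task is captured in the future's shared state; wait() and wait_for() only report that the state is ready, and the exception is rethrown when get() is called. The waiting loops above never call get(), which would explain why nothing was raised. A minimal sketch, assuming the pool wraps each task in a std::packaged_task (std::async stands in for the pool here, and failingTask and runAndReport are hypothetical names):

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for doProcess(): fails the way the question describes.
bool failingTask() {
    throw std::runtime_error("bad cast");
}

// Launches the task, waits for it, and reports what get() surfaces.
std::string runAndReport() {
    std::future<bool> fut = std::async(std::launch::async, failingTask);

    // wait()/wait_for() only block until the shared state is ready; a stored
    // exception is NOT rethrown here, so the task appears to "vanish".
    fut.wait();
    assert(fut.wait_for(std::chrono::seconds(0)) == std::future_status::ready);

    try {
        fut.get();  // get() rethrows the exception captured from the task
        return "no exception";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

Calling get() on every future, inside a try/catch, both waits for the task and surfaces any error, so it can replace the polling loops entirely.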

C++ Visual Studio 2013 Unwinding object

Well, I'm trying to build this code, but I get a compiler error. I've tried to build without the compiler, but that didn't work either. It's about the __try and __except. Someone told me to move the code in the try block to another function, but I don't understand this:
Error 12 error C2712: Cannot use __try in functions that require object unwinding
Error 437 error LNK1181: cannot open input file
void MSocketThread::Run()
{
__try{
//throw(pThread);
while (true) { // Waiting for SafeUDP Settting...
DWORD dwVal = WaitForSingleObject(m_KillEvent.GetEvent(), 100);
if (dwVal == WAIT_OBJECT_0) {
return;
}
else if (dwVal == WAIT_TIMEOUT) {
if (m_pSafeUDP)
break;
}
}
WSAEVENT EventArray[WSA_MAXIMUM_WAIT_EVENTS];
WORD wEventIndex = 0;
bool bSendable = false;
WSANETWORKEVENTS NetEvent;
WSAEVENT hFDEvent = WSACreateEvent();
EventArray[wEventIndex++] = hFDEvent;
EventArray[wEventIndex++] = m_ACKEvent.GetEvent();
EventArray[wEventIndex++] = m_SendEvent.GetEvent();
EventArray[wEventIndex++] = m_KillEvent.GetEvent();
WSAEventSelect(m_pSafeUDP->GetLocalSocket(), hFDEvent, FD_READ | FD_WRITE);
while (TRUE) {
DWORD dwReturn = WSAWaitForMultipleEvents(wEventIndex, EventArray, FALSE, SAFEUDP_SAFE_MANAGE_TIME, FALSE);
if (dwReturn == WSA_WAIT_TIMEOUT) { // Time
m_pSafeUDP->LockNetLink();
SafeSendManage();
m_pSafeUDP->UnlockNetLink();
}
else if (dwReturn == WSA_WAIT_EVENT_0) { // Socket Event
WSAEnumNetworkEvents(m_pSafeUDP->GetLocalSocket(), hFDEvent, &NetEvent);
if ((NetEvent.lNetworkEvents & FD_READ) == FD_READ) {
// OutputDebugString("SUDP> FD_READ \n");
m_pSafeUDP->LockNetLink();
Recv();
m_pSafeUDP->UnlockNetLink();
}
if ((NetEvent.lNetworkEvents & FD_WRITE) == FD_WRITE) {
bSendable = true;
// OutputDebugString("SUDP> FD_WRITE \n");
}
}
else if (dwReturn == WSA_WAIT_EVENT_0 + 1) { // ACK Send Event
// OutputDebugString("SUDP> ACK_EVENT \n");
FlushACK();
}
else if (dwReturn == WSA_WAIT_EVENT_0 + 2) { // Packet Send Event
// OutputDebugString("SUDP> SEND_EVENT \n");
if (bSendable == true)
FlushSend();
}
else if (dwReturn == WSA_WAIT_EVENT_0 + 3) { // Kill the Thread
break; // Stop Thread
}
}
WSACloseEvent(hFDEvent);
// Clear Queues
LockSend();
{
for (SendListItor itor = m_SendList.begin(); itor != m_SendList.end();) {
delete (*itor);
itor = m_SendList.erase(itor);
}
}
{
for (SendListItor itor = m_TempSendList.begin(); itor != m_TempSendList.end();) {
delete (*itor);
itor = m_TempSendList.erase(itor);
}
}
UnlockSend();
LockACK();
{
for (ACKSendListItor itor = m_ACKSendList.begin(); itor != m_ACKSendList.end();) {
delete (*itor);
itor = m_ACKSendList.erase(itor);
}
}
{
for (ACKSendListItor itor = m_TempACKSendList.begin(); itor != m_TempACKSendList.end();) {
delete (*itor);
itor = m_TempACKSendList.erase(itor);
}
}
UnlockACK();
}
__except (this->CrashDump(GetExceptionInformation()))
{
}
}
Basically, SEH and C++ unwinding aren't exactly compatible. Both require the compiler to modify the function at the machine-code level, and MS apparently decided not to support both sets of modifications at the same time, so any one function can support either SEH unwind actions (__except or __finally) or C++ unwind actions (catch, or objects with destructors), but not both.
I suspect the problem in your case is the iterators in your loops; they might have destructors. Although iterators are usually simple and don't need destructors, this is not the case for debug iterators, which often register their existence on construction and deregister it on destruction, in order to detect invalidated iterators and other invalid usage.
The usual workaround is to split the function. Make your run function contain just this:
void MSocketThread::Run()
{
__try {
RunNoSeh();
} __except (this->CrashDump(GetExceptionInformation())) {
}
}
void MSocketThread::RunNoSeh()
{
// Code that was inside the __try goes here.
}

Pops / clicks when stopping and starting DirectX sound synth in C++ / MFC

I have made a soft synthesizer in Visual Studio 2012 with C++, MFC and DirectX. Despite having added code to rapidly fade out the sound I am experiencing popping / clicking when stopping playback (also when starting).
I copied the DirectX code from this project: http://www.codeproject.com/Articles/7474/Sound-Generator-How-to-create-alien-sounds-using-m
I'm not sure if I'm allowed to cut and paste all the code from the Code Project. Basically I use the Player class from that project as is, the instance of this class is called m_player in my code. The Stop member function in that class calls the Stop function of LPDIRECTSOUNDBUFFER:
void Player::Stop()
{
DWORD status;
if (m_lpDSBuffer == NULL)
return;
HRESULT hres = m_lpDSBuffer->GetStatus(&status);
if (FAILED(hres))
EXCEP(DirectSoundErr::GetErrDesc(hres), "Player::Stop GetStatus");
if ((status & DSBSTATUS_PLAYING) == DSBSTATUS_PLAYING)
{
hres = m_lpDSBuffer->Stop();
if (FAILED(hres))
EXCEP(DirectSoundErr::GetErrDesc(hres), "Player::Stop Stop");
}
}
Here is the notification code (with some supporting code) in my project that fills the sound buffer. Note that the rend function always returns a double between -1 and 1, m_ev_smps = 441, m_n_evs = 3 and m_ev_sz = 882. subInit is called from OnInitDialog:
#define FD_STEP 0.0005
#define SC_NOT_PLYD 0
#define SC_PLYNG 1
#define SC_FD_OUT 2
#define SC_FD_IN 3
#define SC_STPNG 4
#define SC_STPD 5
bool CMainDlg::subInit()
// initialises various variables and the sound player
{
Player *pPlayer;
SOUNDFORMAT format;
std::vector<DWORD> events;
int t, buf_sz;
try
{
pPlayer = new Player();
pPlayer->SetHWnd(m_hWnd);
m_player = pPlayer;
m_player->Init();
format.NbBitsPerSample = 16;
format.NbChannels = 1;
format.SamplingRate = 44100;
m_ev_smps = 441;
m_n_evs = 3;
m_smps = new short[m_ev_smps];
m_smp_scale = (int)pow(2, format.NbBitsPerSample - 1);
m_max_tm = (int)((double)m_ev_smps / (double)(format.SamplingRate * 1000));
m_ev_sz = m_ev_smps * format.NbBitsPerSample/8;
buf_sz = m_ev_sz * m_n_evs;
m_player->CreateSoundBuffer(format, buf_sz, 0);
m_player->SetSoundEventListener(this);
for(t = 0; t < m_n_evs; t++)
events.push_back((int)((t + 1)*m_ev_sz - m_ev_sz * 0.95));
m_player->CreateEventReadNotification(events);
m_status = SC_NOT_PLYD;
}
catch(MATExceptions &e)
{
MessageBox(e.getAllExceptionStr().c_str(), "Error initializing the sound player");
EndDialog(IDCANCEL);
return FALSE;
}
return TRUE;
}
void CMainDlg::Stop()
// stop playing
{
m_player->Stop();
m_status = SC_STPD;
}
void CMainDlg::OnBnClickedStop()
// causes fade out
{
m_status = SC_FD_OUT;
}
void CMainDlg::OnSoundPlayerNotify(int ev_num)
// render some sound samples and check for errors
{
ScopeGuardMutex guard(&m_mutex);
int s, end, begin, elapsed;
if (m_status != SC_STPNG)
{
begin = GetTickCount();
try
{
for(s = 0; s < m_ev_smps; s++)
{
m_smps[s] = (int)(m_synth->rend() * 32768 * m_fade);
if (m_status == SC_FD_IN)
{
m_fade += FD_STEP;
if (m_fade > 1)
{
m_fade = 1;
m_status = SC_PLYNG;
}
}
else if (m_status == SC_FD_OUT)
{
m_fade -= FD_STEP;
if (m_fade < 0)
{
m_fade = 0;
m_status = SC_STPNG;
}
}
}
}
catch(MATExceptions &e)
{
OutputDebugString(e.getAllExceptionStr().c_str());
}
try
{
m_player->Write(((ev_num + 1) % m_n_evs)*m_ev_sz, (unsigned char*)m_smps, m_ev_sz);
}
catch(MATExceptions &e)
{
OutputDebugString(e.getAllExceptionStr().c_str());
}
end = GetTickCount();
elapsed = end - begin;
if(elapsed > m_max_tm)
m_warn_msg.Format(_T("Warning! compute time: %dms"), elapsed);
else
m_warn_msg.Format(_T("compute time: %dms"), elapsed);
}
if (m_status == SC_STPNG)
Stop();
}
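The sizes quoted before the listing follow from the format set up in subInit, and can be checked with a little arithmetic: 441 mono samples at 44100 Hz and 16 bits per sample last 10 ms and occupy 882 bytes. The sketch below also reproduces the m_max_tm expression as written in subInit; it divides by SamplingRate * 1000 and therefore truncates to 0, whereas a 10 ms budget was presumably intended. chunkMillis, chunkBytes and maxTmAsWritten are hypothetical helper names, not part of the Player class.

```cpp
#include <cassert>

// Duration in milliseconds of `samples` samples at `rate` Hz.
int chunkMillis(int samples, int rate) {
    assert(rate > 0);
    return static_cast<int>(samples * 1000.0 / rate);
}

// Size in bytes of `samples` mono samples at `bitsPerSample` bits each.
int chunkBytes(int samples, int bitsPerSample) {
    return samples * bitsPerSample / 8;
}

// The expression used for m_max_tm in subInit, reproduced for comparison.
int maxTmAsWritten(int samples, int rate) {
    return static_cast<int>(static_cast<double>(samples) /
                            static_cast<double>(rate * 1000));
}
```

With m_max_tm equal to 0, the "Warning!" branch in OnSoundPlayerNotify fires whenever the measured time is at least 1 ms, so the warning text by itself does not indicate a missed deadline.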
It seems like the buffer is not always sounding out when the stop button is clicked. I don't have any specific code for waiting for the sound buffer to finish playing before the DirectX Stop is called. Other than that the sound playback is working just fine, so at least I am initialising the player correctly and notification code is working in that respect.
Try replacing 32768 with 32767. Not by any means sure this is your issue, but it could overflow the positive short int range (assuming your audio is 16-bit) and cause a "pop".
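To illustrate the overflow: with a sample of exactly 1.0 and a fade of 1.0, the product 1.0 * 32768 is 32768, which does not fit in a 16-bit signed sample and typically wraps to -32768. A minimal sketch of a conversion that scales by 32767 and clamps (toPcm16 is a hypothetical helper, not something from the question's code):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Converts a sample in [-1, 1], scaled by a fade factor in [0, 1], to
// 16-bit PCM. Scaling by 32767 and clamping keeps +1.0 from wrapping
// around to a large negative value.
std::int16_t toPcm16(double sample, double fade) {
    assert(fade >= 0.0 && fade <= 1.0);
    double scaled = sample * fade * 32767.0;
    scaled = std::max(-32768.0, std::min(32767.0, scaled));  // clamp to the int16 range
    return static_cast<std::int16_t>(std::lround(scaled));
}
```

The clamp also protects against a synth that occasionally overshoots the [-1, 1] range, which would otherwise produce the same kind of wrap-around.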
I got rid of the pops / clicks when stopping playback, by filling the buffer with zeros after the fade out. However I still get pops when re-starting playback, despite filling with zeros and then fading back in (it is frustrating).

Execute a piece of code in a function from the second invocation onwards

I want to run a piece of code in a function only from the second invocation of the function onwards.
Questions:
Is there anything wrong with doing that?
How can I achieve this? Is using a static variable a good idea?
There are two answers to this question, depending on whether you have to deal with multi-threaded serialization or not.
No threading:
void doSomething() {
static bool firstTime = true;
if (firstTime) {
// do code specific to first pass
firstTime = false;
} else {
// do code specific to 2nd+ pass
}
// do any code that is common
}
With threading:
I'll write the generic boilerplate. It needs some variant of an atomic compare-and-set; this version uses std::atomic from C++11 (it requires <atomic>, <chrono> and <thread>), so substitute your platform's equivalent on older toolchains.
void doSomethingThreadSafe() {
    // 0 = not started, 1 = first pass running, 2 = first pass finished
    static std::atomic<int> passState{0};
    for (;;) {
        if (passState.load() == 2) {
            // perform pass 2+ code
            break;
        }
        int expected = 0;
        // if passState == 0, set passState = 1 and return true; else return false
        if (passState.compare_exchange_strong(expected, 1)) {
            // perform pass 1 initialization code
            passState.store(2);
            break;
        }
        // loser in setup collision: delay (wait for init code to finish), then retry
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    // perform code common to all passes
}
Multi-threading will be a problem. To prevent this, if required, you'll probably need something like a mutex.
Like this:
void someFunction()
{
static bool firstRun = true;
if (!firstRun)
{
// code to execute from the second time onwards
}
else
{
firstRun = false;
}
// other code
}
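If the function can be called from several threads, a plain static bool is a data race. The answer above suggests a mutex; as an alternative technique, a std::atomic<bool> with exchange() is enough when callers do not need to wait for the first call to finish. A minimal sketch (someFunctionThreadSafe and g_laterCalls are hypothetical names, and the counter exists only to make the behavior observable):

```cpp
#include <atomic>
#include <cassert>

// Counts how often the "second call onwards" branch ran (demo only).
std::atomic<int> g_laterCalls{0};

// Returns true when the "second call onwards" code ran.
bool someFunctionThreadSafe() {
    static std::atomic<bool> firstRun{true};
    // exchange() stores false and returns the previous value atomically,
    // so exactly one caller ever observes `true`.
    if (!firstRun.exchange(false)) {
        ++g_laterCalls;  // code to execute from the second time onwards
        return true;
    }
    return false;
}
```

Unlike the compare-and-set loop in the threaded answer, this does not make later callers wait until the first call has completed its work; if that ordering matters, a mutex or the three-state loop is the safer choice.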
Add a global counter, e.g.:
static int counter = 0;
void testFunc() {
    if (counter >= 1) {
        // execute the functionality (runs from the second call onwards)
    }
    counter++;
}

strange segmentation fault during function return

I am running a program on 2 different machines. On one it works fine without issue. On the other it results in a segmentation fault. Through debugging, I have figured out where the fault occurs, but I can't figure out a logical reason for it to happen.
In one function I have the following code:
pass_particles(particle_grid, particle_properties, input_data, coll_eros_track, collision_number_part, world, grid_rank_lookup, grid_locations);
cout<<"done passing particles"<<endl;
The function pass_particles looks like:
void pass_particles(map<int,map<int,Particle> > & particle_grid, std::vector<Particle_props> & particle_properties, User_input& input_data, data_tracking & coll_eros_track, vector<int> & collision_number_part, mpi::communicator & world, std::map<int,int> & grid_rank_lookup, map<int,std::vector<double> > & grid_locations)
{
//cout<<"east-west"<<endl;
//east-west exchange (x direction)
map<int, vector<Particle> > particles_to_be_sent_east;
map<int, vector<Particle> > particles_to_be_sent_west;
vector<Particle> particles_received_east;
vector<Particle> particles_received_west;
int counter_x_sent=0;
int counter_x_received=0;
for(grid_iter=particle_grid.begin();grid_iter!=particle_grid.end();grid_iter++)
{
map<int,Particle>::iterator part_iter;
for (part_iter=grid_iter->second.begin();part_iter!=grid_iter->second.end();)
{
if (particle_properties[part_iter->second.global_part_num()].particle_in_box()[grid_iter->first])
{
//decide if a particle has left the box...need to consider whether particle was already outside the box
if ((part_iter->second.position().x()<(grid_locations[grid_iter->first][0]) && part_iter->second.position().x()>(grid_locations[grid_iter->first-input_data.z_numboxes()][0]))
|| (input_data.periodic_walls_x() && (grid_iter->first-floor(grid_iter->first/(input_data.xz_numboxes()))*input_data.xz_numboxes()<input_data.z_numboxes()) && (part_iter->second.position().x()>(grid_locations[input_data.total_boxes()-1][0]))))
{
particles_to_be_sent_west[grid_iter->first].push_back(part_iter->second);
particle_properties[particle_grid[grid_iter->first][part_iter->first].global_part_num()].particle_in_box()[grid_iter->first]=false;
counter_sent++;
counter_x_sent++;
}
else if ((part_iter->second.position().x()>(grid_locations[grid_iter->first][1]) && part_iter->second.position().x()<(grid_locations[grid_iter->first+input_data.z_numboxes()][1]))
|| (input_data.periodic_walls_x() && (grid_iter->first-floor(grid_iter->first/(input_data.xz_numboxes()))*input_data.xz_numboxes())>input_data.xz_numboxes()-input_data.z_numboxes()-1) && (part_iter->second.position().x()<(grid_locations[0][1])))
{
particles_to_be_sent_east[grid_iter->first].push_back(part_iter->second);
particle_properties[particle_grid[grid_iter->first][part_iter->first].global_part_num()].particle_in_box()[grid_iter->first]=false;
counter_sent++;
counter_x_sent++;
}
//select particles in overlap areas to send to neighboring cells
else if ((part_iter->second.position().x()>(grid_locations[grid_iter->first][0]) && part_iter->second.position().x()<(grid_locations[grid_iter->first][0]+input_data.diam_large())))
{
particles_to_be_sent_west[grid_iter->first].push_back(part_iter->second);
counter_sent++;
counter_x_sent++;
}
else if ((part_iter->second.position().x()<(grid_locations[grid_iter->first][1]) && part_iter->second.position().x()>(grid_locations[grid_iter->first][1]-input_data.diam_large())))
{
particles_to_be_sent_east[grid_iter->first].push_back(part_iter->second);
counter_sent++;
counter_x_sent++;
}
++part_iter;
}
else if (particles_received_current[grid_iter->first].find(part_iter->first)!=particles_received_current[grid_iter->first].end())
{
if ((part_iter->second.position().x()>(grid_locations[grid_iter->first][0]) && part_iter->second.position().x()<(grid_locations[grid_iter->first][0]+input_data.diam_large())))
{
particles_to_be_sent_west[grid_iter->first].push_back(part_iter->second);
counter_sent++;
counter_x_sent++;
}
else if ((part_iter->second.position().x()<(grid_locations[grid_iter->first][1]) && part_iter->second.position().x()>(grid_locations[grid_iter->first][1]-input_data.diam_large())))
{
particles_to_be_sent_east[grid_iter->first].push_back(part_iter->second);
counter_sent++;
counter_x_sent++;
}
part_iter++;
}
else
{
particle_grid[grid_iter->first].erase(part_iter++);
counter_removed++;
}
}
}
world.barrier();
mpi::request reqs_x_send[particles_to_be_sent_west.size()+particles_to_be_sent_east.size()];
vector<multimap<int,int> > box_sent_x_info;
box_sent_x_info.resize(world.size());
vector<multimap<int,int> > box_received_x_info;
box_received_x_info.resize(world.size());
int counter_x_reqs=0;
//send particles
for(grid_iter_vec=particles_to_be_sent_west.begin();grid_iter_vec!=particles_to_be_sent_west.end();grid_iter_vec++)
{
if (grid_iter_vec->second.size()!=0)
{
//send a particle. 50 will be "west" tag
if (input_data.periodic_walls_x() && (grid_iter_vec->first-floor(grid_iter_vec->first/(input_data.xz_numboxes()))*input_data.xz_numboxes()<input_data.z_numboxes()))
{
reqs_x_send[counter_x_reqs++]=world.isend(grid_rank_lookup[grid_iter_vec->first + input_data.z_numboxes()*(input_data.x_numboxes()-1)], grid_iter_vec->first + input_data.z_numboxes()*(input_data.x_numboxes()-1), particles_to_be_sent_west[grid_iter_vec->first]);
box_sent_x_info[grid_rank_lookup[grid_iter_vec->first + input_data.z_numboxes()*(input_data.x_numboxes()-1)]].insert(pair<int,int>(world.rank(), grid_iter_vec->first + input_data.z_numboxes()*(input_data.x_numboxes()-1)));
}
else if (!(grid_iter_vec->first-floor(grid_iter_vec->first/(input_data.xz_numboxes()))*input_data.xz_numboxes()<input_data.z_numboxes()))
{
reqs_x_send[counter_x_reqs++]=world.isend(grid_rank_lookup[grid_iter_vec->first - input_data.z_numboxes()], grid_iter_vec->first - input_data.z_numboxes(), particles_to_be_sent_west[grid_iter_vec->first]);
box_sent_x_info[grid_rank_lookup[grid_iter_vec->first - input_data.z_numboxes()]].insert(pair<int,int>(world.rank(),grid_iter_vec->first - input_data.z_numboxes()));
}
}
}
for(grid_iter_vec=particles_to_be_sent_east.begin();grid_iter_vec!=particles_to_be_sent_east.end();grid_iter_vec++)
{
if (grid_iter_vec->second.size()!=0)
{
//send a particle. 60 will be "east" tag
if (input_data.periodic_walls_x() && (grid_iter_vec->first-floor(grid_iter_vec->first/(input_data.xz_numboxes())*input_data.xz_numboxes())>input_data.xz_numboxes()-input_data.z_numboxes()-1))
{
reqs_x_send[counter_x_reqs++]=world.isend(grid_rank_lookup[grid_iter_vec->first - input_data.z_numboxes()*(input_data.x_numboxes()-1)], 2000000000-(grid_iter_vec->first - input_data.z_numboxes()*(input_data.x_numboxes()-1)), particles_to_be_sent_east[grid_iter_vec->first]);
box_sent_x_info[grid_rank_lookup[grid_iter_vec->first - input_data.z_numboxes()*(input_data.x_numboxes()-1)]].insert(pair<int,int>(world.rank(),2000000000-(grid_iter_vec->first - input_data.z_numboxes()*(input_data.x_numboxes()-1))));
}
else if (!(grid_iter_vec->first-floor(grid_iter_vec->first/(input_data.xz_numboxes())*input_data.xz_numboxes())>input_data.xz_numboxes()-input_data.z_numboxes()-1))
{
reqs_x_send[counter_x_reqs++]=world.isend(grid_rank_lookup[grid_iter_vec->first + input_data.z_numboxes()], 2000000000-(grid_iter_vec->first + input_data.z_numboxes()), particles_to_be_sent_east[grid_iter_vec->first]);
box_sent_x_info[grid_rank_lookup[grid_iter_vec->first + input_data.z_numboxes()]].insert(pair<int,int>(world.rank(), 2000000000-(grid_iter_vec->first + input_data.z_numboxes())));
}
}
}
counter=0;
for (int i=0;i<world.size();i++)
{
//if (world.rank()!=i)
//{
reqs[counter++]=world.isend(i,1000000000,box_sent_x_info[i]);
reqs[counter++]=world.irecv(i,1000000000,box_received_x_info[i]);
//}
}
mpi::wait_all(reqs, reqs + world.size()*2);
//receive particles
//receive west particles
for (int j=0;j<world.size();j++)
{
multimap<int,int>::iterator received_info_iter;
for (received_info_iter=box_received_x_info[j].begin();received_info_iter!=box_received_x_info[j].end();received_info_iter++)
{
//receive the message
if (received_info_iter->second<1000000000)
{
//receive the message
world.recv(received_info_iter->first,received_info_iter->second,particles_received_west);
//loop through all the received particles and add them to the particle_grid for this processor
for (unsigned int i=0;i<particles_received_west.size();i++)
{
particle_grid[received_info_iter->second].insert(pair<int,Particle>(particles_received_west[i].global_part_num(),particles_received_west[i]));
if(particles_received_west[i].position().x()>grid_locations[received_info_iter->second][0] && particles_received_west[i].position().x()<grid_locations[received_info_iter->second][1])
{
particle_properties[particles_received_west[i].global_part_num()].particle_in_box()[received_info_iter->second]=true;
}
counter_received++;
counter_x_received++;
}
}
else
{
//receive the message
world.recv(received_info_iter->first,received_info_iter->second,particles_received_east);
//loop through all the received particles and add them to the particle_grid for this processor
for (unsigned int i=0;i<particles_received_east.size();i++)
{
particle_grid[2000000000-received_info_iter->second].insert(pair<int,Particle>(particles_received_east[i].global_part_num(),particles_received_east[i]));
if(particles_received_east[i].position().x()>grid_locations[2000000000-received_info_iter->second][0] && particles_received_east[i].position().x()<grid_locations[2000000000-received_info_iter->second][1])
{
particle_properties[particles_received_east[i].global_part_num()].particle_in_box()[2000000000-received_info_iter->second]=true;
}
counter_received++;
counter_x_received++;
}
}
}
}
mpi::wait_all(reqs_y_send, reqs_y_send + particles_to_be_sent_bottom.size()+particles_to_be_sent_top.size());
mpi::wait_all(reqs_z_send, reqs_z_send + particles_to_be_sent_south.size()+particles_to_be_sent_north.size());
mpi::wait_all(reqs_x_send, reqs_x_send + particles_to_be_sent_west.size()+particles_to_be_sent_east.size());
cout<<"x sent "<<counter_x_sent<<" and received "<<counter_x_received<<" from rank "<<world.rank()<<endl;
cout<<"rank "<<world.rank()<<" sent "<<counter_sent<<" and received "<<counter_received<<" and removed "<<counter_removed<<endl;
cout<<"done passing"<<endl;
}
I only posted some of the code, so ignore the fact that some variables may appear to be undefined; they are defined in a portion of the code I didn't post.
When I run the code on the machine on which it fails, I get "done passing" but not "done passing particles".
I am lost as to what could possibly cause a segmentation fault between the end of the called function and the next line in the calling function and why it would happen on one machine and not another.
If you're crashing between the end of a function and the subsequent line in the caller, you're probably crashing in the destructor of a local variable. You need to run the program in a debugger to find out which object's destructor is crashing.
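The reason a crash can land "between" the two output lines is that destructors of local objects run after the last statement of the function but before control returns to the caller. A minimal sketch of that ordering (Tracer, callee, g_events and runDestructorDemo are hypothetical names used only for this demonstration):

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> g_events;  // hypothetical event log for the demo

// A local object whose destructor does observable work.
struct Tracer {
    ~Tracer() { g_events.push_back("destructor of local"); }
};

void callee() {
    Tracer local;
    g_events.push_back("done passing");  // last statement of the function
}   // `local` is destroyed here, before control returns to the caller

std::vector<std::string> runDestructorDemo() {
    g_events.clear();
    callee();
    g_events.push_back("done passing particles");  // next line in the caller
    return g_events;
}
```

In pass_particles the locals destroyed at that point include the maps and vectors of particles and the array of mpi::request objects, so those are the first candidates to inspect in the debugger.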
There are a couple of possibilities:
You actually are returning, but cout is buffered by the OS so you don't see "done passing particles" because the application crashes first.
You have some local class that has a destructor that is seg faulting.
Try running it in a debugger to find out where it is actually crashing.
Edit:
Since you've mentioned you're using gcc, add the -g flag and run it with gdb. Gdb will then tell you exactly where it's going wrong (probably a null dereference).
Just in case anyone comes back to this later: I updated to the newest version of Boost.MPI at the time, 1.50, and this issue went away. Not much of a solution, but it worked.