C++: both classes do not run concurrently

It's my first time here. My code is supposed to make two ultrasonic sensors work at the same time using an mbed. However, I can't seem to make both classes void us_right() and void us_left() in the code run concurrently. Help please :(
#include "mbed.h"
DigitalOut triggerRight(p9);
DigitalIn echoRight(p10);
DigitalOut triggerLeft(p13);
DigitalIn echoLeft(p14);
//DigitalOut myled(LED1); //monitor trigger
//DigitalOut myled2(LED2); //monitor echo
PwmOut steering(p21);
PwmOut velocity(p22);
int distanceRight = 0, distanceLeft = 0;
int correctionRight = 0, correctionLeft = 0;
Timer sonarRight, sonarLeft;
float vo=0;
// Velocity expects -1 (reverse) to +1 (forward)
void Velocity(float v) {
v=v+1;
if (v>=0 && v<=2) {
if (vo>=1 && v<1) { //
velocity.pulsewidth(0.0014); // this is required to
wait(0.1); //
velocity.pulsewidth(0.0015); // move into reverse
wait(0.1); //
} //
velocity.pulsewidth(v/2000+0.001);
vo=v;
}
}
// Steering expects -1 (left) to +1 (right)
void Steering(float s) {
s=s+1;
if (s>=0 && s<=2) {
steering.pulsewidth(s/2000+0.001);
}
}
void us_right() {
sonarRight.reset();
sonarRight.start();
while (echoRight==2) {};
sonarRight.stop();
correctionRight = sonarLeft.read_us();
triggerRight = 1;
sonarRight.reset();
wait_us(10.0);
triggerRight = 0;
while (echoRight==0) {};
// myled2=echoRight;
sonarRight.start();
while (echoRight==1) {};
sonarRight.stop();
distanceRight = ((sonarRight.read_us()-correctionRight)/58.0);
printf("Distance from Right is: %d cm \n\r",distanceRight);
}
void us_left() {
sonarLeft.reset();
sonarLeft.start();
while (echoLeft==2) {};
sonarLeft.stop();
correctionLeft = sonarLeft.read_us();
triggerLeft = 1;
sonarLeft.reset();
wait_us(10.0);
triggerLeft = 0;
while (echoLeft==0) {};
// myled2=echoLeft;
sonarLeft.start();
while (echoLeft==1) {};
sonarLeft.stop();
distanceLeft = (sonarLeft.read_us()-correctionLeft)/58.0;
printf("Distance from Left is: %d cm \n\r",distanceLeft);
}
int main() {
while(true) {
us_right();
us_left();
}
if (distanceLeft < 10 || distanceRight < 10) {
if (distanceLeft < distanceRight) {
for(int i=0; i>-100; i--) { // Go left
Steering(i/100.0);
wait(0.1);
}
}
if (distanceLeft > distanceRight) {
for(int i=0; i>100; i++) { // Go Right
Steering(i/100.0);
wait(0.1);
}
}
}
wait(0.2);
}

You need some mechanism to create new threads or processes. Your implementation is sequential; nothing in it tells the code to run concurrently.
Take a look at a threading library (pthreads, for example, or the thread support in the standard library if you have access to C++11), or at how to create new processes together with some kind of message-passing interface between them.

Create two threads, one for each ultrasonic sensor:
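#include "mbed.h"
void read_left_sensor() {
while (1) {
// do the reading
wait(0.5f);
}
}
// Same pattern for the second sensor (it was referenced but not defined)
void read_right_sensor() {
while (1) {
// do the reading
wait(0.5f);
}
}
int main() {
Thread left_thread;
left_thread.start(&read_left_sensor);
Thread right_thread;
right_thread.start(&read_right_sensor);
while (1) {
// put your control code for the vehicle here
wait(0.1f);
}
}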
You can use global variables to write to when reading the sensor, and read them in your main loop. The memory is shared.

Your first problem is that you have placed code outside of your infinite while(true) loop. This later code will never run. But maybe you know this.
int main() {
while(true) {
us_right();
us_left();
} // <- Loops back to the start of while()
// You Never pass this point!!!
if (distanceLeft < 10 || distanceRight < 10) {
// Do stuff etc.
}
wait(0.2);
}
But, I think you are expecting us_right() and us_left() to happen at exactly the same time. You cannot do that in a sequential environment.
Jan Jongboom is correct in suggesting you could use threads. This allows the OS to allocate time for each piece of code to run. But it is still not truly parallel. Each function (classes are a different thing) will get a chance to run: one will run, and when it is finished (or during a wait) another function will get its turn.
As you are using an mbed, I'd suggest that you make your project an Mbed OS 5 project (you select this when you start a new project). Otherwise you'll need to use an RTOS library. There is a blinky example using threads that should sum it up well. Here is more info.
Threading can be dangerous for someone without experience. So stick to a simple implementation to start with. Make sure you understand what/why/how you are doing it.
Aside: From a hardware perspective, running ultrasonic sensors in parallel is actually not ideal. They both broadcast on the same frequency and can hear each other. If you trigger them at the same time, they interfere with each other.
Imagine two people shouting words in a closed room. If they take turns, it will be obvious what they are saying. If they both shout at the same time, it will be very hard!
So actually, not being able to run in parallel is probably a good thing.

C++ referencing instances created within a function's scope

Context
The context of the problem is that I am currently writing a small library for use with the Arduino in order to act as a game controller. The problem I am encountering has more to do with C++ than anything Arduino specific however.
I've included the library's header and source code below, followed by the Arduino code. I've truncated it where possible.
Problem
In short, only the last switch / action I define actually gets handled properly.
These actions get defined in the Arduino setup function. For example:
controller.addSwitchContinuous(10, 0); // Pin 10; btn index 0
means that pin 10 gets mapped to button 0. When pin 10 is switched closed this is treated as the button being pressed. This works fine for a single action but when I start adding more only the last action actually works. So in the following example only pin 9 is recognized:
controller.addSwitchContinuous(10, 0); // <-- Doesn't work
controller.addSwitchContinuous(9, 1); // <-- Works
This goes for any arbitrary number of actions:
controller.addSwitchContinuous(10, 0); // <-- Doesn't work
controller.addSwitchContinuous(9, 1); // <-- Doesn't work
controller.addSwitchContinuous(8, 2); // <-- Doesn't work
controller.addSwitchContinuous(7, 3); // <-- Works
Potential causes
I am fairly new to C++, so I suspect I'm doing something wrong with pointers. More specifically, something seems wrong with how the Joystick_ instance gets passed around.
I have been fiddling with the constructor and trying to use references instead of pointers but I couldn't get it to work properly.
I can confirm the iteration in JFSF::loop does iterate over all actions, if I modify it with:
void JFSF::loop()
{
for (int n = 0; n < _nextActionIndex; n++)
{
if (_actions[n])
{
_actions[n]->loop();
_joystick->setButton(n, PRESSED); // Debug: Set button pressed, regardless of switch.
}
}
if (_doSendState)
{
_joystick->sendState();
}
}
then buttons 0 through n get pressed as expected. It is possible that loop() isn't properly being called, but I would expect it to fail for the N = 1 case as well in that case. Furthermore the fact the last action always succeeds would suggest the iteration is ok.
Full code
// JFSF.h
#ifndef JFSF_h
#define JFSF_h
// ... include for Arduino.h and Joystick.h; bunch of defines
namespace JFSF_PRIV
{
class AbstractAction
{
public:
virtual void loop();
};
/* A Switch that essentially acts as a push button. */
class SwitchContinuousAction : public AbstractAction
{
public:
SwitchContinuousAction(Joystick_ *joystick, int pin, int btnIndex);
void loop();
private:
Joystick_ *_joystick;
int _pin;
int _btnIndex;
};
} // namespace JFSF_PRIV
class JFSF
{
public:
JFSF(Joystick_ *joystick, bool doSendState); // doSendState should be true if Joystick_ does not auto send state.
void loop();
void addSwitchContinuous(int inputPin, int btnIndex);
private:
Joystick_ *_joystick;
JFSF_PRIV::AbstractAction *_actions[MAX_ACTIONS];
int _nextActionIndex;
bool _doSendState;
};
#endif
Source file (trimmed):
// JFSF.cpp
#include "Arduino.h"
#include "Joystick.h"
#include "JFSF.h"
#define PRESSED 1
#define RELEASED 0
// Private classes
namespace JFSF_PRIV
{
SwitchContinuousAction::SwitchContinuousAction(Joystick_ *joystick, int pin, int btnIndex)
{
_joystick = joystick;
_pin = pin;
_btnIndex = btnIndex;
pinMode(_pin, INPUT_PULLUP);
}
void SwitchContinuousAction::loop()
{
int _state = digitalRead(_pin) == LOW ? PRESSED : RELEASED;
_joystick->setButton(_btnIndex, _state);
}
} // namespace JFSF_PRIV
JFSF::JFSF(Joystick_ *joystick, bool doSendState)
{
_joystick = joystick;
_nextActionIndex = 0;
_doSendState = doSendState;
}
void JFSF::addSwitchContinuous(int inputPin, int btnIndex)
{
JFSF_PRIV::SwitchContinuousAction newBtnAction(_joystick, inputPin, btnIndex);
_actions[_nextActionIndex++] = &newBtnAction;
}
void JFSF::loop()
{
for (int n = 0; n < _nextActionIndex; n++)
{
if (_actions[n])
{
_actions[n]->loop();
}
}
if (_doSendState)
{
_joystick->sendState();
}
}
For completeness sake, this is the code for the Arduino, but it is pretty much just declarations:
#include <JFSF.h>
// ... A bunch of const declarations used below. These are pretty self explanatory.
// See: https://github.com/MHeironimus/ArduinoJoystickLibrary#joystick-library-api
Joystick_ joystick(HID_REPORT_ID,
JOYSTICK_TYPE_JOYSTICK, // _JOYSTICK, _GAMEPAD or _MULTI_AXIS
BTN_COUNT, HAT_SWITCH_COUNT,
INCLUDE_X_AXIS, INCLUDE_Y_AXIS, INCLUDE_Z_AXIS,
INCLUDE_RX_AXIS, INCLUDE_RY_AXIS, INCLUDE_RZ_AXIS,
INCLUDE_RUDDER, INCLUDE_THROTTLE,
INCLUDE_ACCELERATOR, INCLUDE_BRAKE, INCLUDE_STEERING);
JFSF controller(&joystick, !DO_AUTO_SEND_STATE);
void setup() {
joystick.begin(DO_AUTO_SEND_STATE);
controller.addSwitchContinuous(10, 0); // <-- Doesn't work
controller.addSwitchContinuous(9, 1); // <-- Works
}
void loop() {
controller.loop();
}
References
ArduinoJoystickLibrary (Source for Joystick_) can be found here: https://github.com/MHeironimus/ArduinoJoystickLibrary#joystick-library-api
I don't really understand your code. Please read How to create a Minimal, Complete and Verifiable example. Anyhow, the following is certainly wrong and likely the cause of your problem:
void JFSF::addSwitchContinuous(int inputPin, int btnIndex)
{
JFSF_PRIV::SwitchContinuousAction newBtnAction(_joystick, inputPin, btnIndex);
_actions[_nextActionIndex++] = &newBtnAction;
}
Let's rewrite it a bit for clarity:
void foo(){
T bar;
container[index] = &bar;
}
What happens here is that bar gets destroyed when it goes out of scope, so the pointer you put into the container points to garbage. Presumably somewhere else in your code you dereference those pointers, which is undefined behaviour (i.e. anything can happen).
Long story short: it is a common pattern among C++ beginners to overuse pointers. Most likely you should make container a container of objects rather than pointers and make use of automatic memory management instead of trying to fight it.
Thanks to @user463035818 and @drescherjm for identifying the actual problem.
So in the end I fixed it by simply moving the Action object creation up to the Arduino code (where it's essentially global) and passing pointers to those objects to the controller.
In code this translates to:
JFSF.cpp
void JFSF::addAction(JFSF_PRIV::AbstractAction *action){
_actions[_nextActionIndex++] = action;
}
Arduino code (ino)
// See code in original post
JFSF controller(&joystick, !DO_AUTO_SEND_STATE);
JFSF_PRIV::SwitchContinuousAction btnOne(&joystick, 10, 0);
JFSF_PRIV::SwitchContinuousAction btnTwo(&joystick, 9, 1);
void setup() {
joystick.begin(DO_AUTO_SEND_STATE);
// controller.addSwitchContinuous(10, 0); // Pin 10; btn index 0
// controller.addSwitchContinuous(9, 1); // Pin 9 ; btn index 1
controller.addAction(&btnOne);
controller.addAction(&btnTwo);
}
// loop() is unchanged

c/c++ get large size data like 180 array from another class in stm32

I have a 32-bit ARM Cortex-M4 (the processor in the Pixhawk) and am writing two classes, each of which runs as one thread in the Pixhawk codebase.
The first one is LidarScanner, which deals with incoming serial data and generates the "obstacle situation". The second one is Algorithm, which handles the "obstacle situation" and applies a planning strategy. Here is my current solution: it uses the function LidarScanner::updateObstacle(uint8_t (&array)[181]), which takes the array by reference, to update the "obstacle situation", a 181-element array.
LidarScanner.cpp:
class LidarScanner{
private:
struct{
bool available = false;
int AngleArr[181];
int RangeArr[181];
bool isObstacle[181] = {}; //1: unsafe; 0:safe;
}scan;
......
public:
LidarScanner();
//main function
void update()
{
while(hal.uartE->available()) //incoming serial data is available
{
decode_data(); //decode serial data into three kind data: Range, Angle and Period_flag
if(complete_scan()) //determine if the lidarscanner one period is completed
{
scan.available = false;
checkObstacle(); //check obstacle situation and store safety in isObstacle[181]
scan.available = true;
}
}
}
//for another API recall
void updateObstacle(uint8_t (&array)[181])
{
for(int i=0; i<181; i++) // valid indices are 0..180; i<=181 would write past the end
{
array[i]=scan.isObstacle[i];
}
}
//for another API recall
bool ScanAvailable() const { return scan.available; }
......
}
Algorithm.cpp:
class Algorithm{
private:
uint8_t Obatcle_Value[181] = {};
class LidarScanner& _lidarscanner;
......
public:
Algorithm(class LidarScanner& _lidarscanner);
//main funcation
void update()
{
if (hal.uartE->available() && _lidarscanner.ScanAvailable())
{
//Update obstacle situation into Algorithm phase and do more planning strategy
_lidarscanner.updateObstacle(Obatcle_Value);
}
}
......
}
Usually, it works fine. But I want to improve the performance, so I'd like to know the most efficient way to do this. Thanks!
The most efficient way to copy data is to use the DMA.
DMAx_Channelx->CNDTR = size;
DMAx_Channelx->CPAR = (uint32_t)&source;
DMAx_Channelx->CMAR = (uint32_t)&destination;
DMAx_Channelx->CCR = (0<<DMA_CCR_MSIZE_Pos) | (0<<DMA_CCR_PSIZE_Pos)
| DMA_CCR_MINC | DMA_CCR_PINC | DMA_CCR_MEM2MEM ;
while(!(DMAx->ISR & DMA_ISR_TCIFx ));
AN4031 Using the DMA controller.

Threads C++, Access Violation reading location x error

Platform : Windows 7
I'm developing a project for known text cipher attack in which;
Main process creates n child processes
Child processes decrypt an encrypted string, key subspace is partitioned according to number of child processes
Communication between child processes are by a static variable
for(int i = 0; i<info.totalNumberOfChildren; i++)
{
startChild( &info.childInfoList[i]);
//_beginthread(startChild, 0, &info.childInfoList[i]);
}
Above code works fine since:
First child starts execution, the key is set as a number such as 8 for testing purposes which is within the first child's partition, so first child finds the key, reports and sets true the killSwitch.
All the other children that are created are closed even before checking the first key as the killSwitch is true.
When I however do this :
for(int i = 0; i<info.totalNumberOfChildren; i++)
{
//startChild( &info.childInfoList[i]);
_beginthread(startChild, 0, &info.childInfoList[i]);
}
I get an access violation error. What could possibly be my source of error ?
Edit: I will try to share as relevant code as I can
startChild does the following:
void startChild( void* pParams)
{
ChildInfo *ci = (ChildInfo*)pParams;
// cout<<"buraya geldi"<<endl;
ChildProcess cp(*ci);
// write to log
cp.completeNextJob();
}
childInfo holds the following :
// header file
class ChildInfo
{
public:
ChildInfo();
ChildInfo(char * encrypted, char * original, static bool killSwitch, int totalNumOfChildren, int idNum, int orjLen);
void getNextJob();
bool keyIsFound();
Des des;
void printTest();
bool stopExecution;
bool allIsChecked;
char * encyptedString;
char * originalString;
int id;
int orjStrLen;
private:
int lastJobCompleted;
int totalNumberOfChildren;
int jobDistBits;
};
completeNextJob() does the following :
void ChildProcess::completeNextJob()
{
cout<<"Child Id : "<<info.id<<endl;
// cout<<"Trying : "<<info.encyptedString<<endl; // here I got an error
char * newtrial = info.encyptedString;
char * cand = info.des.Decrypt(newtrial); // here I also get an error if I comment out
/*
cout<<"Resultant : "<<cand<<endl;
cout<<"Comparing with : "<<info.originalString<<endl;
*/
bool match = true;
for(int i = 0; i<info.orjStrLen; i++)
{
if(!(cand[i] == info.originalString[i]))
match = false;
}
if(match)
{
cout<<"It has been acknowledged "<<endl;
info.stopExecution = true;
return;
}
else
{
if(!info.keyIsFound())
{
if(!info.allIsChecked)
{
info.getNextJob();
completeNextJob();
}
else
{
}
}
else
{
}
}
}
decrypt() method does the following :
char * Des::Decrypt(char *Text1)
{
int i,a1,j,nB,m,iB,k,K,B[8],n,t,d,round;
char *Text=new char[1000];
unsigned char ch;
strcpy(Text,Text1); // this is where I get the error
i=strlen(Text);
keygen();
int mc=0;
for(iB=0,nB=0,m=0;m<(strlen(Text)/8);m++) //Repeat for TextLenth/8 times.
{
for(iB=0,i=0;i<8;i++,nB++)
{
ch=Text[nB];
n=(int)ch;//(int)Text[nB];
for(K=7;n>=1;K--)
{
B[K]=n%2; //Converting 8-Bytes to 64-bit Binary Format
n/=2;
} for(;K>=0;K--) B[K]=0;
for(K=0;K<8;K++,iB++) total[iB]=B[K]; //Now `total' contains the 64-Bit binary format of 8-Bytes
}
IP(); //Performing initial permutation on `total[64]'
for(i=0;i<64;i++) total[i]=ip[i]; //Store values of ip[64] into total[64]
for(i=0;i<32;i++) left[i]=total[i]; // +--> left[32]
// total[64]--|
for(;i<64;i++) right[i-32]=total[i];// +--> right[32]
for(round=1;round<=16;round++)
{
Expansion(); //Performing expansion on `right[32]' to get `expansion[48]'
xor_oneD(round);
substitution();//Perform substitution on xor1[48] to get sub[32]
permutation(); //Performing Permutation on sub[32] to get p[32]
xor_two(); //Performing XOR operation on left[32],p[32] to get xor2[32]
for(i=0;i<32;i++) left[i]=right[i]; //Dumping right[32] into left[32]
for(i=0;i<32;i++) right[i]=xor2[i]; //Dumping xor2[32] into right[32]
} //rounds end here
for(i=0;i<32;i++) temp[i]=right[i]; // Dumping -->[ swap32bit ]
for(;i<64;i++) temp[i]=left[i-32]; // left[32],right[32] into temp[64]
inverse(); //Inversing the bits of temp[64] to get inv[8][8]
/* Obtaining the Cypher-Text into final[1000]*/
k=128; d=0;
for(i=0;i<8;i++)
{
for(j=0;j<8;j++)
{
d=d+inv[i][j]*k;
k=k/2;
}
final[mc++]=(char)d;
k=128; d=0;
}
} //for loop ends here
final[mc]='\0';
char *final1=new char[1000];
for(i=0,j=strlen(Text);i<strlen(Text);i++,j++)
final1[i]=final[j]; final1[i]='\0';
return(final);
}
Windows is trying to tell you why your program crashed. Please use a debugger to see what Windows is talking about. Location X is important: it should tell you whether your program is dereferencing NULL, overflowing a buffer, or doing something else. The call stack at the time of the crash is also very important.
Debugger is your best friend, try to use it and check step by step what could cause this access violation.
I think that info.encyptedString is not initialized correctly and points to unallocated memory, but I can't be sure because you didn't show that part of the code.
And of course you must protect your shared resources (info) with a synchronization object such as a critical section, mutex, or semaphore.
The basic issue seems pretty straightforward to me. You have multiple threads executing simultaneously that access the same information via *pParams, which is presumably of type ChildInfo, since that's what you cast it to. That info must be getting accessed elsewhere in the program, perhaps in the main thread. This is corrupting something, which may or may not have to do with Text1 or info.id; these errors are often 'non-local' and hard to debug for that reason. So start by mutex-protecting the entire thread body (within your initial loop), and then zero in on the critical sections by trial and error, i.e. mutex-protect as small a region of code as you can get away with without producing errors.

Buffer communication speed nightmare

I'm trying to use buffers to communicate between several 'layers' (threads) in my program and now that I have visual output of what's going on inside, I realize there's a devastating amount of time being eaten up in the process of using these buffers.
Here's some notes about what's going on in my code.
when the rendering mode is triggered in this thread, it begins sending as many points as it can to the layer (thread) below it
the points from the lower thread are then processed and returned to this thread via the output buffer of the lower thread
points received back are mapped (for now) as white pixels in the D3D surface
if I bypass the buffer and put the points directly into the surface pixels, it only takes about 3 seconds to do the whole job
if I hand the point down and then have the lower layer pass it right back up, skipping any actual number-crunching, the whole job takes about 30 minutes (which makes the whole program useless)
changing the size of my buffers has no noticeable effect on the speed
I was originally using mutexes in my buffers but have eliminated them in an attempt to fix the problem
Is there something I can do differently to fix this speed problem I'm having?
...something to do with the way I'm handling these messages???
Here's my code
I'm very sorry that it's such a mess. I'm having to move way too fast on this project and I've left a lot of pieces laying around in comments where I've been experimenting.
DWORD WINAPI CONTROLSUBSYSTEM::InternalExProcedure(__in LPVOID lpSelf)
{
XMSG xmsg;
LPCONTROLSUBSYSTEM lpThis = ((LPCONTROLSUBSYSTEM)lpSelf);
BOOL bStall;
BOOL bRendering = FALSE;
UINT64 iOutstandingPoints = 0; // points that are out being tested
UINT64 iPointsDone = 0;
UINT64 iPointsTotal = 0;
BOOL bAssigning;
DOUBLE dNextX;
DOUBLE dNextY;
while(1)
{
if( lpThis->hwTargetWindow!=NULL && lpThis->d3ddev!=NULL )
{
lpThis->d3ddev->Clear(0,NULL,D3DCLEAR_TARGET,D3DCOLOR_XRGB(0,0,0),1.0f,0);
if(lpThis->d3ddev->BeginScene())
{
lpThis->d3ddev->StretchRect(lpThis->sfRenderingCanvas,NULL,lpThis->sfBackBuffer,NULL,D3DTEXF_NONE);
lpThis->d3ddev->EndScene();
}
lpThis->d3ddev->Present(NULL,NULL,NULL,NULL);
}
//bStall = TRUE;
// read input buffer
if(lpThis->bfInBuffer.PeekMessage(&xmsg))
{
bStall = FALSE;
if( HIBYTE(xmsg.wType)==HIBYTE(CONT_MSG) )
{
// take message off
lpThis->bfInBuffer.GetMessage(&xmsg);
// double check consistency
if( HIBYTE(xmsg.wType)==HIBYTE(CONT_MSG) )
{
switch(LOBYTE(xmsg.wType))
{
case SETRESOLUTION_MSG:
lpThis->iAreaWidth = (UINT)xmsg.dptPoint.X;
lpThis->iAreaHeight = (UINT)xmsg.dptPoint.Y;
lpThis->sfRenderingCanvas->Release();
if(lpThis->d3ddev->CreateOffscreenPlainSurface(
(UINT)xmsg.dptPoint.X,(UINT)xmsg.dptPoint.Y,
D3DFMT_X8R8G8B8,
D3DPOOL_DEFAULT,
&(lpThis->sfRenderingCanvas),
NULL)!=D3D_OK)
{
MessageBox(NULL,"Error resizing surface.","ERROR",MB_ICONERROR);
}
else
{
D3DLOCKED_RECT lrt;
if(D3D_OK == lpThis->sfRenderingCanvas->LockRect(&lrt,NULL,0))
{
lpThis->iPitch = lrt.Pitch;
VOID *data;
data = lrt.pBits;
ZeroMemory(data,lpThis->iPitch*lpThis->iAreaHeight);
lpThis->sfRenderingCanvas->UnlockRect();
MessageBox(NULL,"Surface Resized","yay",0);
}
else
{
MessageBox(NULL,"Error resizing surface.","ERROR",MB_ICONERROR);
}
}
break;
case SETCOLORMETHOD_MSG:
break;
case SAVESNAPSHOT_MSG:
lpThis->SaveSnapshot();
break;
case FORCERENDER_MSG:
bRendering = TRUE;
iPointsTotal = lpThis->iAreaHeight*lpThis->iPitch;
iPointsDone = 0;
MessageBox(NULL,"yay, render something!",":o",0);
break;
default:
break;
}
}// else, lost this message
}
else
{
if( HIBYTE(xmsg.wType)==HIBYTE(MATH_MSG) )
{
XMSG xmsg2;
switch(LOBYTE(xmsg.wType))
{
case RESETFRAME_MSG:
case ZOOMIN_MSG:
case ZOOMOUT_MSG:
case PANUP_MSG:
case PANDOWN_MSG:
case PANLEFT_MSG:
case PANRIGHT_MSG:
// tell self to start a render
xmsg2.wType = CONT_MSG|FORCERENDER_MSG;
if(lpThis->bfInBuffer.PutMessage(&xmsg2))
{
// pass it down
while(!lpThis->lplrSubordinate->PutMessage(&xmsg));
// message passed so pull it from buffer
lpThis->bfInBuffer.GetMessage(&xmsg);
}
break;
default:
// pass it down
if(lpThis->lplrSubordinate->PutMessage(&xmsg))
{
// message passed so pull it from buffer
lpThis->bfInBuffer.GetMessage(&xmsg);
}
break;
}
}
else if( lpThis->lplrSubordinate!=NULL )
// pass message down
{
if(lpThis->lplrSubordinate->PutMessage(&xmsg))
{
// message passed so pull it from buffer
lpThis->bfInBuffer.GetMessage(&xmsg);
}
}
}
}
// read output buffer from subordinate
if( lpThis->lplrSubordinate!=NULL && lpThis->lplrSubordinate->PeekMessage(&xmsg) )
{
bStall = FALSE;
if( xmsg.wType==(REPLY_MSG|TESTPOINT_MSG) )
{
// got point test back
D3DLOCKED_RECT lrt;
if(D3D_OK == lpThis->sfRenderingCanvas->LockRect(&lrt,NULL,0))
{
INT pitch = lrt.Pitch;
VOID *data;
data = lrt.pBits;
INT Y=dRound((xmsg.dptPoint.Y/(DOUBLE)100)*((DOUBLE)lpThis->iAreaHeight));
INT X=dRound((xmsg.dptPoint.X/(DOUBLE)100)*((DOUBLE)pitch));
// decide color
if( xmsg.iNum==0 )
((WORD *)data)[X+Y*pitch] = 0xFFFFFFFF;
else
((WORD *)data)[X+Y*pitch] = 0xFFFFFFFF;
// message handled so remove from buffer
lpThis->lplrSubordinate->GetMessage(&xmsg);
lpThis->sfRenderingCanvas->UnlockRect();
}
}
else if(lpThis->bfOutBuffer.PutMessage(&xmsg))
{
// message sent so pull the real one off the buffer
lpThis->lplrSubordinate->GetMessage(&xmsg);
}
}
if( bRendering && lpThis->lplrSubordinate!=NULL )
{
bAssigning = TRUE;
while(bAssigning)
{
dNextX = 100*((DOUBLE)(iPointsDone%lpThis->iPitch))/((DOUBLE)lpThis->iPitch);
dNextY = 100*(DOUBLE)((INT)(iPointsDone/lpThis->iPitch))/(DOUBLE)(lpThis->iAreaHeight);
xmsg.dptPoint.X = dNextX;
xmsg.dptPoint.Y = dNextY;
//
//xmsg.iNum = 0;
//xmsg.wType = REPLY_MSG|TESTPOINT_MSG;
//
xmsg.wType = MATH_MSG|TESTPOINT_MSG;
/*D3DLOCKED_RECT lrt;
if(D3D_OK == lpThis->sfRenderingCanvas->LockRect(&lrt,NULL,0))
{
INT pitch = lrt.Pitch;
VOID *data;
data = lrt.pBits;
INT Y=dRound((dNextY/(DOUBLE)100)*((DOUBLE)lpThis->iAreaHeight));
INT X=dRound((dNextX/(DOUBLE)100)*((DOUBLE)pitch));
((WORD *)data)[X+Y*pitch] = 0xFFFFFFFF;
lpThis->sfRenderingCanvas->UnlockRect();
}
iPointsDone++;
if( iPointsDone>=iPointsTotal )
{
MessageBox(NULL,"done rendering","",0);
bRendering = FALSE;
bAssigning = FALSE;
}
*/
if( lpThis->lplrSubordinate->PutMessage(&xmsg) )
{
bStall = FALSE;
iPointsDone++;
if( iPointsDone>=iPointsTotal )
{
MessageBox(NULL,"done rendering","",0);
bRendering = FALSE;
bAssigning = FALSE;
}
}
else
{
bAssigning = FALSE;
}
}
}
//if( bStall )
//Sleep(10);
}
return 0;
}
}
(still getting used to this forum's code block stuff)
Edit:
Here's an example that I perceive to be similar in concept, although this example consumes the messages it produces in the same thread.
#include <Windows.h>
#include "BUFFER.h"
int main()
{
BUFFER myBuffer;
INT jobsTotal = 1024*768;
INT currentJob = 0;
INT jobsOut = 0;
XMSG xmsg;
while(1)
{
if(myBuffer.PeekMessage(&xmsg))
{
// do something with message
// ...
// if successful, remove message
myBuffer.GetMessage(&xmsg);
jobsOut--;
}
while( currentJob<jobsTotal )
{
if( myBuffer.PutMessage(&xmsg) )
{
currentJob++;
jobsOut++;
}
else
{
// buffer is full at the moment
// stop for now and put more on later
break;
}
}
if( currentJob==jobsTotal && jobsOut==0 )
{
MessageBox(NULL,"done","",0);
break;
}
}
return 0;
}
This example also runs in about 3 seconds, as opposed to 30 minutes.
Btw, if anybody knows why visual studio keeps trying to make me say PeekMessageA and GetMessageA instead of the actual names I defined, that would be nice to know as well.
Locking and Unlocking an entire rect to change a single point is probably not very efficient, you might be better off generating a list of points you intend to modify and then locking the rect once, iterating over that list and modifying all the points, and then unlocking the rect.
When you lock the rect you are effectively stalling concurrent access to it, so its like a mutex for the GPU in that respect - then you only modify a single pixel. Doing this repeatedly for each pixel will constantly stall the GPU. You could use D3DLOCK_NOSYSLOCK to avoid this to some extent, but I'm not sure if it will play nicely in the larger context of your program.
I'm obviously not entirely sure what the goal of your algorithm is, but if you are trying to parallel process pixels on a d3d surface, then i think the best approach would be via a shader on the GPU.
Where you basically generate an array in system memory, populate it with "input" values on a per point/pixel basis, then generate a texture on a GPU from the array. Next you paint the texture to a full screen quad, and then render it with a pixel shader to some render target. The shader can be coded to process each point in whatever way you like, the GPU will take care of optimizing parallelization. Then you generate a new texture from that render target and then you copy that texture into a system memory array. And then you can extract all your outputs from that array. You can also apply multiple shaders to the render target result back into the render target to pipeline multiple transformations if needed.
A couple notes:
Don't write your own message-passing code. It may be correct and slow, or fast and buggy. It takes a lot of experience to design code that's fast, and getting it bug-free is really hard, because debugging threaded code is hard. Win32 provides a couple of efficient thread-safe queues: SList and the window message queue.
Your design splits up work in the worst possible way. Passing information between threads is expensive even under the best circumstances, because it causes cache contention, both on the data and on the synchronization objects. It's MUCH better to split your work into distinct non-interacting (or minimize interaction) datasets and give each to a separate thread, that is then responsible for all stages of processing that dataset.
Don't poll.
That's likely to be the heart of the problem. You have a task continually calling PeekMessage and probably finding nothing there. This will just eat all available CPU. Any task that wants to post messages is unlikely to receive any CPU time to achieve this.
I can't remember how you'd achieve this with the windows message queue (probably WaitMessage or some variant) but typically you might implement this with a counting semaphore. When the consumer wants data, it waits for the semaphore to be signalled. When the producer has data, it signals the semaphore.
I managed to resolve it by redesigning the whole thing
It now passes huge payloads instead of individual tasks
(I'm the poster)

How can I synchronize three threads?

My app consists of the main process and two threads, all running concurrently and making use of three FIFO queues:
The FIFO queues are Qmain, Q1 and Q2. Internally, each queue uses a counter that is incremented when an item is put into the queue and decremented when an item is taken from the queue.
The processing involves two threads,
QMaster, which gets from Q1 and Q2 and puts into Qmain,
Monitor, which puts into Q2,
and the main process, which gets from Qmain and puts into Q1.
The QMaster thread loop repeatedly checks the counts of Q1 and Q2, and if any items are in the queues, it gets them and puts them into Qmain.
The Monitor thread loop obtains data from external sources, packages it and puts it into Q2.
The main process of the app also runs a loop checking the count of Qmain; if there are any items, it gets one item from Qmain at each iteration of the loop and processes it further. During this processing it occasionally puts an item into Q1 to be processed later (when it is in turn taken from Qmain).
The problem:
I've implemented all as described above, and it works for a randomly (short) time and then hangs.
I've managed to identify the source of the crashing to happen in the increment/decrement of the
count of a fifo-q (it may happen in any of them).
What I've tried:
Using three mutex's: QMAIN_LOCK, Q1_LOCK and Q2_LOCK, which I lock whenever any get/put operation
is done on a relevant fifo-q. Result: the app doesn't get going, just hangs.
The main-process must continue running all the time, must not be blocked on a 'read' (named-pipes fail, socketpair fail).
Any advice?
I think I'm not using the mutexes properly; how should it be done?
(Any comments on improving the above design are also welcome.)
[edit] Below are the processes and the fifo_q template:
Where and how should I place the mutexes in this to avoid the problems described above?
main-process:
...
start thread QMaster
start thread Monitor
...
while (!quit)
{
...
if (Qmain.count() > 0)
{
X = Qmain.get();
process(X)
delete X;
}
...
//at some random time (the description above says the main process puts into Q1):
Q1.put(Y);
...
}
Monitor:
{
while (1)
{
//obtain & package data
Q2.put(data)
}
}
QMaster:
{
while(1)
{
if (Q1.count() > 0)
Qmain.put(Q1.get());
if (Q2.count() > 0)
Qmain.put(Q2.get());
}
}
fifo_q:
template < class X > class fifo_q
{
struct item
{
X* data;
item *next;
item() { data=NULL; next=NULL; }
};
item *head, *tail;
int count;
public:
fifo_q() { head=tail=NULL; count=0; }
~fifo_q() { clear(); /*deletes all items*/ }
void put(X *x) { item *i=new item(); (... adds to tail...); count++; }
X* get() { X *d = head->data; (...deletes head ...); count--; return d; }
void clear() {...}
};
Here is an example of how I would adapt the design and lock the queue access the POSIX way.
Note that I would normally wrap the mutex in an RAII class or use Boost.Thread, and that I would use std::deque or std::queue as the queue, but this stays as close as possible to your code:
main-process:
...
start thread Monitor
...
while (!quit)
{
...
if (Qmain.count() > 0)
{
X = Qmain.get();
process(X)
delete X;
}
...
//at some random time:
QMain.put(Y);
...
}
Monitor:
{
while (1)
{
//obtain & package data
QMain.put(data)
}
}
fifo_q:
template < class X > class fifo_q
{
struct item
{
X* data;
item *next;
item() { data=NULL; next=NULL; }
};
item *head, *tail;
int count;
pthread_mutex_t m;
public:
fifo_q() { head=tail=NULL; count=0; }
~fifo_q() { clear(); /*deletes all items*/ }
void put(X *x)
{
pthread_mutex_lock(&m);
item *i=new item();
(... adds to tail...);
count++;
pthread_mutex_unlock(&m);
}
X* get()
{
pthread_mutex_lock(&m);
X *d = head->data;
(...deletes head ...);
count--;
pthread_mutex_unlock(&m);
return d;
}
void clear() {...}
};
Note too that the mutex still needs to be initialized (with pthread_mutex_init or PTHREAD_MUTEX_INITIALIZER) and that count() should also take the mutex.
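The RAII wrapping and mutex initialization mentioned in this answer could look roughly like the sketch below. `ScopedLock` and `locked_fifo_q` are hypothetical names, std::queue stands in for the hand-written linked list, and returning a null pointer from an empty queue is an assumption of this sketch, not something the original code does.

```cpp
#include <cassert>
#include <pthread.h>
#include <queue>

// Hypothetical RAII guard: locks in the constructor and unlocks in the
// destructor, so every return path (and every exception) releases the mutex.
class ScopedLock {
    pthread_mutex_t &m_;
public:
    explicit ScopedLock(pthread_mutex_t &m) : m_(m) { pthread_mutex_lock(&m_); }
    ~ScopedLock() { pthread_mutex_unlock(&m_); }
    ScopedLock(const ScopedLock &) = delete;
    ScopedLock &operator=(const ScopedLock &) = delete;
};

// Hypothetical queue of pointers with every operation taken under the mutex.
template <class X> class locked_fifo_q {
    std::queue<X *> items_;
    pthread_mutex_t m_;
public:
    locked_fifo_q() { pthread_mutex_init(&m_, nullptr); }   // the missing initialization
    ~locked_fifo_q() { pthread_mutex_destroy(&m_); }
    locked_fifo_q(const locked_fifo_q &) = delete;
    locked_fifo_q &operator=(const locked_fifo_q &) = delete;

    void put(X *x) {
        assert(x != nullptr);             // null is reserved for "queue was empty"
        ScopedLock lock(m_);
        items_.push(x);
    }
    X *get() {                            // returns nullptr when the queue is empty
        ScopedLock lock(m_);
        if (items_.empty()) return nullptr;
        X *d = items_.front();
        items_.pop();
        return d;
    }
    int count() {                         // count() takes the mutex too
        ScopedLock lock(m_);
        return static_cast<int>(items_.size());
    }
};
```

Because the guard unlocks in its destructor, `get()` can return early on an empty queue without a matching manual unlock call.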
Use the debugger. When your solution with mutexes hangs look at what the threads are doing and you will get a good idea about the cause of the problem.
What is your platform? In Unix/Linux you can use POSIX message queues (you can also use System V message queues, sockets, FIFOs, ...) so you don't need mutexes.
Learn about condition variables. From your description it looks like your QMaster thread is busy-looping, burning your CPU.
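As a rough illustration of the condition-variable suggestion, here is a sketch using the C++11 standard library instead of raw POSIX calls. `blocking_queue` and `forward_demo` are hypothetical names, and the value -1 is used as a stop marker purely for this demo.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical queue whose get() sleeps until an item arrives.
template <class T> class blocking_queue {
    std::queue<T> items_;
    std::mutex m_;
    std::condition_variable not_empty_;
public:
    void put(T value) {
        { std::lock_guard<std::mutex> lock(m_); items_.push(std::move(value)); }
        not_empty_.notify_one();          // wake one waiting consumer
    }
    T get() {
        std::unique_lock<std::mutex> lock(m_);
        // wait() releases the mutex while sleeping and re-checks the predicate
        // on every wake-up, so spurious wake-ups are harmless.
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T v = std::move(items_.front());
        items_.pop();
        return v;
    }
};

// A forwarder thread (playing the QMaster role) sleeps in get() until data
// arrives and passes it on. Returns the sum of the values 1..n it forwarded.
int forward_demo(int n) {
    assert(n >= 0);
    blocking_queue<int> q2, qmain;
    std::thread forwarder([&] {
        for (;;) {
            int v = q2.get();
            qmain.put(v);
            if (v == -1) break;           // stop marker, forwarded to the reader
        }
    });
    for (int i = 1; i <= n; ++i) q2.put(i);
    q2.put(-1);
    int sum = 0;
    for (int v = qmain.get(); v != -1; v = qmain.get()) sum += v;
    forwarder.join();
    return sum;
}
```

A blocking `get()` suits a worker such as QMaster; a main loop that must never block would need a non-blocking variant instead.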
One of your responses suggests you are doing something like:
Q2_mutex.lock()
Qmain_mutex.lock()
Qmain.put(Q2.get())
Qmain_mutex.unlock()
Q2_mutex.unlock()
but you probably want to do it like:
Q2_mutex.lock()
X = Q2.get()
Q2_mutex.unlock()
Qmain_mutex.lock()
Qmain.put(X)
Qmain_mutex.unlock()
and as Gregory suggested above, encapsulate the logic into the get/put.
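The second ordering can be sketched in C++11 as follows. `guarded_queue` and `transfer_one` are hypothetical names; the point is only that the source lock is released before the destination lock is taken.

```cpp
#include <cassert>
#include <mutex>
#include <queue>

// Hypothetical pairing of a queue with the mutex that protects it.
struct guarded_queue {
    std::queue<int> items;
    std::mutex m;
};

// Moves one item from src to dst while holding at most one lock at a time.
// Returns false if src was empty.
bool transfer_one(guarded_queue &src, guarded_queue &dst) {
    assert(&src != &dst);                 // a queue feeding itself is almost certainly a bug
    int x;
    {
        std::lock_guard<std::mutex> lock(src.m);
        if (src.items.empty()) return false;
        x = src.items.front();
        src.items.pop();
    }                                     // src.m is released here, before dst.m is touched
    std::lock_guard<std::mutex> lock(dst.m);
    dst.items.push(x);
    return true;
}
```

Since no thread ever waits for one lock while holding another, this function cannot take part in a lock-order deadlock.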
EDIT: Now that you posted your code I wonder, is this a learning exercise?
Because I see that you are coding your own FIFO queue class instead of using the C++ standard std::queue. I suppose you have tested your class really well and the problem is not there.
Also, I don't understand why you need three different queues. It seems that the Qmain queue would be enough, and then you will not need the Qmaster thread that is indeed busy waiting.
About the encapsulation, you can create a synch_fifo_q class that encapsulates the fifo_q class. Add a private mutex variable and then the public methods (put, get, clear, count,...) should be like put(X) { lock m_mutex; m_fifo_q.put(X); unlock m_mutex; }
Question: what would happen if you had more than one reader on the queue? Is it guaranteed that after a "count() > 0" check you can do a "get()" and actually get an element?
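One way to sidestep that question is to fold the emptiness check and the removal into a single locked call. The sketch below follows the synch_fifo_q idea from this answer, with std::queue standing in for fifo_q; the `try_get` method is an addition assumed for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <queue>

template <class X> class synch_fifo_q {
    std::queue<X *> m_fifo_q;             // stand-in for the hand-written fifo_q
    mutable std::mutex m_mutex;
public:
    void put(X *x) {
        assert(x != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_fifo_q.push(x);
    }
    // Check and removal happen in one critical section, so no other reader
    // can empty the queue between the two steps. Never blocks.
    bool try_get(X *&out) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_fifo_q.empty()) return false;
        out = m_fifo_q.front();
        m_fifo_q.pop();
        return true;
    }
    int count() const {                   // a snapshot only; may be stale on return
        std::lock_guard<std::mutex> lock(m_mutex);
        return static_cast<int>(m_fifo_q.size());
    }
};
```

A main loop would then write `if (Qmain.try_get(X)) { ... }` instead of testing `count()` first, and it still never blocks waiting for data.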
I wrote a simple application below:
#include <queue>
#include <cstdlib> // rand
#include <tchar.h> // _tmain, _TCHAR
#include <windows.h>
#include <process.h>
using namespace std;
queue<int> QMain, Q1, Q2;
CRITICAL_SECTION csMain, cs1, cs2;
unsigned __stdcall TMaster(void*)
{
while(1)
{
// Take the locks before testing the queue: an unlocked size() call races with push/pop in the other threads.
::EnterCriticalSection(&cs1);
::EnterCriticalSection(&csMain);
if (!Q1.empty())
{
int i1 = Q1.front();
Q1.pop();
//use i1;
i1 = 2 * i1;
//end use;
QMain.push(i1);
}
::LeaveCriticalSection(&csMain);
::LeaveCriticalSection(&cs1);
::EnterCriticalSection(&cs2);
::EnterCriticalSection(&csMain);
if (!Q2.empty())
{
int i1 = Q2.front();
Q2.pop();
//use i1;
i1 = 3 * i1;
//end use;
QMain.push(i1);
}
::LeaveCriticalSection(&csMain);
::LeaveCriticalSection(&cs2);
}
return 0;
}
unsigned __stdcall TMoniter(void*)
{
while(1)
{
int irand = ::rand();
if ( irand % 6 >= 3)
{
::EnterCriticalSection(&cs2);
Q2.push(irand % 6);
::LeaveCriticalSection(&cs2);
}
}
return 0;
}
unsigned __stdcall TMain(void)
{
while(1)
{
// Take the locks before testing the queue, in the same order as TMaster (cs1, then csMain).
::EnterCriticalSection(&cs1);
::EnterCriticalSection(&csMain);
if (!QMain.empty())
{
int i = QMain.front();
QMain.pop();
i = 4 * i;
Q1.push(i);
}
::LeaveCriticalSection(&csMain);
::LeaveCriticalSection(&cs1);
}
return 0;
}
int _tmain(int argc, _TCHAR* argv[])
{
::InitializeCriticalSection(&cs1);
::InitializeCriticalSection(&cs2);
::InitializeCriticalSection(&csMain);
unsigned threadID;
::_beginthreadex(NULL, 0, &TMaster, NULL, 0, &threadID);
::_beginthreadex(NULL, 0, &TMoniter, NULL, 0, &threadID);
TMain();
return 0;
}
You should not lock a second mutex while you already hold one.
Since the question is tagged with C++, I suggest implementing the locking inside the get/add logic of the queue class (e.g. using Boost locks), or writing a wrapper if your queue is not a class.
This allows you to simplify the locking logic.
Regarding the sources you have added: the queue size check and the following put/get should be done in one transaction, otherwise another thread can modify the queue in between.
Are you acquiring multiple locks simultaneously? This is generally something you want to avoid. If you must, ensure you are always acquiring the locks in the same order in each thread (this is more restrictive to your concurrency and why you generally want to avoid it).
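If two locks really must be held together, one option in C++11 is to let std::lock acquire them; it uses a deadlock-avoidance algorithm, so it replaces the hand-maintained fixed order described above rather than implementing it. In this sketch `guarded_queue`, `move_all` and `opposite_order_demo` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical pairing of a queue with the mutex that protects it.
struct guarded_queue {
    std::queue<int> items;
    std::mutex m;
};

// Moves every item from src to dst as one atomic step, which needs both locks.
// std::lock takes them without risk of deadlock, even when another thread
// names the same two queues in the opposite order.
void move_all(guarded_queue &src, guarded_queue &dst) {
    assert(&src != &dst);                 // locking the same mutex twice is undefined
    std::lock(src.m, dst.m);
    std::lock_guard<std::mutex> l1(src.m, std::adopt_lock);
    std::lock_guard<std::mutex> l2(dst.m, std::adopt_lock);
    while (!src.items.empty()) {
        dst.items.push(src.items.front());
        src.items.pop();
    }
}

// Two threads shuffle items in opposite directions. Returns the total number
// of items left in both queues, which should equal the number inserted.
int opposite_order_demo(int rounds) {
    guarded_queue a, b;
    for (int i = 0; i < 10; ++i) a.items.push(i);
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) move_all(a, b); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) move_all(b, a); });
    t1.join();
    t2.join();
    return static_cast<int>(a.items.size() + b.items.size());
}
```

With plain nested lock() calls in opposite orders, the same demo could deadlock; holding only one lock at a time remains the simpler choice when the operation allows it.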
Other concurrency advice: Are you acquiring the lock prior to reading the queue sizes? If you're using a mutex to protect the queues, then your queue implementation isn't concurrent and you probably need to acquire the lock before reading the queue size.
One problem may come from this rule: "The main-process must continue running all the time, must not be blocked on a 'read'". How did you implement it? What is the difference between 'get' and 'read'?
The problem seems to be in your implementation, not in the logic. And as you stated, you should not end up in a deadlock, because you are not acquiring another lock while already holding one.