D parallel loop

First, how does D implement parallel foreach (what is the underlying logic)?
import std.parallelism;

int main(string[] args)
{
    int[] arr;
    arr.length = 100000000;
    /* Why does this work? It's a plain foreach iterating by reference
       over the ints of arr. parallel returns a ParallelForeach!R
       (here ParallelForeach!(int[])), but I don't know what that is.
       parallel is part of the Phobos library, not a D built-in, so what
       kind of magic is used for this? */
    foreach (ref e; parallel(arr))
    {
        e = 100;
    }
    foreach (ref e; parallel(arr))
    {
        e *= e;
    }
    return 0;
}
And second, why is it slower than a plain foreach?
Finally, if I create my own TaskPool (instead of using the global taskPool object), the program never ends. Why?

parallel returns a struct (of type ParallelForeach) that implements the opApply(int delegate(...)) foreach overload.
When called, the struct submits a parallel function to the private submitAndExecute, which submits the same task to all threads in the pool.
Each thread then runs:
scope(failure)
{
    // If an exception is thrown, all threads should bail.
    atomicStore(shouldContinue, false);
}

while (atomicLoad(shouldContinue))
{
    immutable myUnitIndex = atomicOp!"+="(workUnitIndex, 1);
    immutable start = workUnitSize * myUnitIndex;
    if (start >= len)
    {
        atomicStore(shouldContinue, false);
        break;
    }

    immutable end = min(len, start + workUnitSize);

    foreach (i; start..end)
    {
        static if (withIndex)
        {
            if (dg(i, range[i])) foreachErr();
        }
        else
        {
            if (dg(range[i])) foreachErr();
        }
    }
}
where workUnitIndex and shouldContinue are shared variables, and dg is the foreach body as a delegate.
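For readers who don't know D, here is a minimal C++ sketch of the same work-distribution pattern (the names mirror the Phobos snippet above, but the code is mine, not Phobos's): every worker claims the next work unit by bumping a shared atomic counter until the range is exhausted.

#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

// Each worker claims the next work unit by atomically incrementing a shared
// counter, mirroring the atomicOp!"+=" loop in submitAndExecute above.
void parallel_each(std::vector<int>& range, std::size_t workUnitSize,
                   void (*dg)(int&), unsigned threadCount)
{
    std::atomic<std::size_t> workUnitIndex{0};
    auto worker = [&]() {
        for (;;) {
            const std::size_t myUnit = workUnitIndex.fetch_add(1);
            const std::size_t start = myUnit * workUnitSize;
            if (start >= range.size()) break; // no work units left
            const std::size_t end = std::min(range.size(), start + workUnitSize);
            for (std::size_t i = start; i < end; ++i) dg(range[i]);
        }
    };
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < threadCount; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
}

Called as parallel_each(arr, 1000, [](int& e) { e = 100; }, 4);, this does (without Phobos's optimizations) what foreach (ref e; parallel(arr)) does in the question.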
The reason it is slower is simply the overhead of handing the function to the threads in the pool and of atomically accessing the shared variables.
The reason your custom pool never shuts down is most likely that you never stop it with its finish() method: the worker threads of the global taskPool are daemon threads that don't keep the program alive, whereas a TaskPool you create yourself must be shut down explicitly.

"Reimplementing" Connections to catch all custom types within specified signals in a QML hook

We are developing a test tool for Qt/QML apps based on Petri nets and are looking for a legal hack to reimplement Connections a bit differently, but without any private API.
Consider a Petri net transition declared in a QML component like:
Transition {
    act: {
        obj1.callAsyncOperation1();
        obj2.callAsyncOperation2();
        //...
    }
    TimeWaiter {
        timeout: 100
    }
    ObjectWaiter {
        target: obj1
        function onAsyncOperation1Done(arg1: int, arg2: CustomAnonymousType)
        {
            console.log("...");
        }
        function onAsyncOperation1Failed(message: string, fatal: bool)
        {
            console.error("...");
        }
    }
}
The intended flow is:
act is called from the C++ logic;
if some *Waiters are declared in a Transition, we wait for any of the specified signals;
then we go to the next step (according to the Petri net idea, after an Action phase in a Transition we switch to an Assertion stage declared in some Place, but this is out of scope and just FYI).
The only problem we have in the current implementation is accessing custom types within signals, due to the nature of binding with dynamically declared slots. What we need are connections between the signals of a target and all declared slots, for test & debug purposes, plus a bypass signal from the target to a Transition as a permit to move on.
What we observe is:
a) values are undefined if the types are not declared in the slots at all;
b) some kind of UB happens and the app crashes from time to time if the types are declared as variants;
c) it works as expected only if the types are fully and properly defined (matching the signal).
The corner case of this story is custom types which have no type names in the QML scope and are registered via qmlRegisterAnonymousType() or qRegisterMetaType().
Is there any method to make a connection between signals and slots with such arguments?
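For illustration, this is the kind of registration we mean (CustomAnonymousType here is a placeholder standing in for our real types):

#include <QMetaType>

// Placeholder type, for illustration only; the real types are more complex.
struct CustomAnonymousType
{
    int payload = 0;
};
Q_DECLARE_METATYPE(CustomAnonymousType)

void registerTypes()
{
    // This gives the type a meta-type ID so it can travel through signals,
    // but it does not give it a type name that a QML slot signature can spell.
    qRegisterMetaType<CustomAnonymousType>("CustomAnonymousType");
}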
Now we are using QMetaObject::connect, and ultimately, in qqmlvmemetaobject.cpp, it boils down to:
for (uint ii = 0; ii < parameterCount; ++ii) {
    jsCallData->args[ii] = scope.engine->metaTypeToJS(arguments->arguments[ii + 1], a[ii + 1]);
}
const int returnType = methodData->propType();
QV4::ScopedValue result(scope, function->call(jsCallData));
metaTypeToJS(...) converts all unknown arguments to QV4::Value(Null) through an intermediate QV4::Encode::undefined():
QV4::ReturnedValue QV4::ExecutionEngine::fromVariant(const QVariant &variant)
{
    // ...
    if (type < QMetaType::User) {
        switch (QMetaType::Type(type)) {
        case QMetaType::UnknownType:
        case QMetaType::Void:
            return QV4::Encode::undefined();
Here is the real code where we remember the declared slots:
void ObjectWaiter::componentComplete()
{
    const QMetaObject* meta_object(metaObject());
    if (!meta_object) {
        return;
    }
    for (int m = meta_object->methodOffset(), count = meta_object->methodOffset() + meta_object->methodCount(); m < count; ++m) {
        const QMetaMethod meta_method(meta_object->method(m));
        if (meta_method.methodType() == QMetaMethod::MethodType::Slot) {
            QByteArray slot_name(meta_method.name().remove(0, 2)); // strip the "on" prefix
            if (!slot_name.isEmpty()) {
                slot_name.front() = QChar(slot_name.front()).toLower().toLatin1();
                _slot_map.insert(slot_name, meta_method);
            }
        }
    }
    bind();
}
...and here is where we connect signals to slots whenever the target changes:
void ObjectWaiter::bind()
{
    if (!_target || _slot_map.isEmpty()) {
        return;
    }
    const QMetaObject* meta_object(_target->metaObject());
    if (!meta_object) {
        return;
    }
    for (int m = meta_object->methodOffset(), count = meta_object->methodOffset() + meta_object->methodCount(); m < count; ++m) {
        const QMetaMethod meta_method(meta_object->method(m));
        if (meta_method.methodType() == QMetaMethod::MethodType::Signal) {
            QMap<QByteArray, QMetaMethod>::ConstIterator it(_slot_map.find(meta_method.name()));
            if (_slot_map.constEnd() != it) {
                std::unique_ptr<int[]> argv(new int[meta_method.parameterCount()]{});
                for (int p = 0; p < meta_method.parameterCount(); ++p) {
                    int parameter_type(meta_method.parameterType(p));
                    if (QMetaType::UnknownType != parameter_type) {
                        argv[p] = parameter_type;
                    } else {
                        void* arg[] = { &parameter_type, &p };
                        QMetaObject::metacall(_target, QMetaObject::RegisterMethodArgumentMetaType, meta_method.methodIndex(), arg);
                        if (parameter_type == -1) {
                            argv[p] = QMetaType::UnknownType;
                            qWarning(
                                "ObjectWaiter: Unable to handle parameter '%s' of type '%s' of method '%s', use qRegisterMetaType to register it.",
                                meta_method.parameterNames().at(p).constData(), meta_method.parameterTypes().at(p).constData(), meta_method.name().constData()
                            );
                        } else {
                            argv[p] = parameter_type;
                        }
                    }
                }
                if (!QMetaObject::connect(_target, meta_method.methodIndex(), this, it.value().methodIndex(), Qt::DirectConnection, argv.release())) {
                    qWarning("ObjectWaiter: Unable to make a dynamic connect");
                }
                if (!QObject::connect(_target, meta_method, this, QMetaMethod::fromSignal(&AbstractWaiter::done))) {
                    qWarning("ObjectWaiter: Unable to make the connection and won't be able to emit a stop signal at the end");
                }
            }
        }
    }
}
I was digging around the Connections and QSignalSpy implementations but could not find any other public API with which to reimplement ObjectWaiter. Qt is huge, though, and hopefully there is something I have missed.

physx multithreading copy transform data

I have a scene with tons of similar objects moved by PhysX, and I want to draw all of them using OpenGL instancing. So I need to build an array with the transform data of each object and pass it to an OpenGL shader. Currently, filling that array is the bottleneck of my app: the PhysX simulation uses 16 threads, but building the array uses just one.
So I created a data_transfer_task class, which contains two indexes, start and stop, and moves the transform data of the PhysX objects between those indexes into the array.
class start_transfer_task; // defined below

class data_transfer_task : public physx::PxTask {
public:
    int start;
    int stop;
    start_transfer_task* base_task;

    data_transfer_task(int start, int stop, start_transfer_task* task, physx::PxTaskManager* mtm) : physx::PxTask() {
        this->start = start;
        this->stop = stop;
        this->mTm = mtm;
        base_task = task;
    }

    void update_transforms();
    virtual const char* getName() const { return "data_transfer_task"; }
    virtual void run();
};

void data_transfer_task::update_transforms() {
    for (int i = start; i < stop; i++) {
        auto obj = base_task->objects->at(i);
        auto transform = obj->getGlobalPose();
        DrawableObject* dr = (DrawableObject*)obj->userData;
        auto pos = transform.p;
        auto rot = transform.q;
        dr->set_position(glm::vec3(pos.x, pos.y, pos.z));
        dr->set_rotation(glm::quat(rot.w, rot.x, rot.y, rot.z));
    }
}

void data_transfer_task::run() { update_transforms(); }
I created another class, start_transfer_task, which creates and schedules the tasks according to the thread count.
class start_transfer_task : public physx::PxLightCpuTask {
public:
    start_transfer_task(physx::PxCpuDispatcher* disp, std::vector<physx::PxRigidDynamic*>* obj, physx::PxTaskManager* mtm) : physx::PxLightCpuTask() {
        this->mTm = mtm;
        this->dispatcher = disp;
        this->objects = obj;
    }

    physx::PxCpuDispatcher* dispatcher;
    std::vector<physx::PxRigidDynamic*>* objects;

    void start();
    virtual const char* getName() const { return "start_transfer_task"; }
    virtual void run();
};

void start_transfer_task::start() {
    int thread_count = dispatcher->getWorkerCount();
    int obj_count = objects->size();
    int batch_size = obj_count / thread_count;
    int first_size = batch_size + obj_count % thread_count;
    auto task = new data_transfer_task(0, first_size, this, this->mTm);
    this->mTm->submitUnnamedTask(*task, physx::PxTaskType::TT_CPU);
    task->removeReference();
    if (batch_size > 0) {
        for (int i = 1; i < thread_count; i++) {
            task = new data_transfer_task(first_size + batch_size * (i - 1), first_size + batch_size * i, this, this->mTm);
            this->mTm->submitUnnamedTask(*task, physx::PxTaskType::TT_CPU);
            task->removeReference();
        }
    }
}
I create a start_transfer_task instance before calling simulate, pass it to simulate, and I expect it to run after all PhysX tasks have done their own job, so that write and read API calls don't overlap, and so that fetchResults(block=true) only continues execution once all of my tasks have finished copying transform data.
while (is_simulate) {
    auto transfer_task = new start_transfer_task(gScene->getCpuDispatcher(), &objects, gScene->getTaskManager());
    gScene->simulate(1.0f / 60.0f, transfer_task);
    gScene->fetchResults(true);
    // some other logic to call the graphics API, and sleep to sustain 30 updates per second
}
But I get many warnings about read and write API calls overlapping, like this:
\physx\source\physx\src\NpWriteCheck.cpp (53) : invalid operation : Concurrent API write call or overlapping API read and write call detected during physx::NpScene::simulateOrCollide from thread 8492! Note that write operations to the SDK must be sequential, i.e., no overlap with other write or read calls, else the resulting behavior is undefined. Also note that API writes during a callback function are not permitted.
And sometimes, after starting my app, I get a strange assert message:
physx\source\task\src\TaskManager.cpp(195) : Assertion failed: !mPendingTasks"
So, what am I doing wrong?
The concurrent API call warning is essentially telling you that you are calling, from multiple threads, PhysX API functions that are supposed to be single-threaded.
You have to be very careful with the PhysX API because it is not thread-safe, and thread safety is left to the user.
Read this for more information.
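In practice that means your data_transfer_task objects must not call getGlobalPose() while simulate() is still in flight. A minimal sketch of one safe pattern, reusing the question's objects, DrawableObject and glm names: do the copy between fetchResults(true) and the next simulate() call, where no PhysX worker is writing, and parallelize it yourself with plain std::thread (concurrent read calls are permitted as long as nothing writes).

#include <algorithm>
#include <thread>
#include <vector>

// Call this after fetchResults(true) and before the next simulate(): in that
// window the scene is quiescent, so several threads may safely issue
// read-only calls such as getGlobalPose() at the same time.
void copy_transforms(std::vector<physx::PxRigidDynamic*>& objects, unsigned thread_count)
{
    thread_count = std::max(1u, thread_count);
    const std::size_t n = objects.size();
    const std::size_t batch = (n + thread_count - 1) / thread_count;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t start = t * batch;
        const std::size_t stop = std::min(n, start + batch);
        if (start >= stop) break;
        workers.emplace_back([&objects, start, stop] {
            for (std::size_t i = start; i < stop; ++i) {
                const physx::PxTransform pose = objects[i]->getGlobalPose();
                auto* dr = static_cast<DrawableObject*>(objects[i]->userData);
                dr->set_position(glm::vec3(pose.p.x, pose.p.y, pose.p.z));
                dr->set_rotation(glm::quat(pose.q.w, pose.q.x, pose.q.y, pose.q.z));
            }
        });
    }
    for (auto& w : workers) w.join();
}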

What are alternatives to functional programming for handling shared mutability?

After watching some videos on the Rust language, I'm increasingly interested in examining my coding decisions based on mitigating the complexity of shared mutable state. Functional programming/Lambda Calculus seems to be the most popular standard to overcome the problem of shared mutable state. Are there alternatives though? Is there a consensus now that functional programming is a reasonable default approach to solve the problem?
Disclaimer:
I am aware that this post might not directly answer your question.
However, many programmers still overlook that they can sometimes avoid shared mutability altogether. I want to show you how with an example, and I hope it helps you.
TL;DR: Ask yourself whether unshared mutability or shared immutability can also be options.
What about doubting whether you really need shared mutability?
If you turn either of those two terms into its opposite, you gain two useful alternatives:
unshared mutability
shared immutability
Let's have an example in Java 8 to illustrate what I mean.
This example of shared mutability uses synchronized to avoid visibility issues and race conditions:
public class MutablePoint {
    private int x, y;

    void move(int dx, int dy) {
        x += dx;
        y += dy;
    }

    @Override
    public String toString() {
        return "MutablePoint{x=" + x + ", y=" + y + '}';
    }
}
public class SharedMutability {
    public static void main(String[] args) {
        final MutablePoint mutablePoint = new MutablePoint();
        final Thread moveRightThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                synchronized (mutablePoint) {
                    mutablePoint.move(1, 0);
                }
                Thread.yield();
            }
        }, "moveRight");
        final Thread moveDownThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                synchronized (mutablePoint) {
                    mutablePoint.move(0, 1);
                }
                Thread.yield();
            }
        }, "moveDown");
        final Thread displayThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                synchronized (mutablePoint) {
                    System.out.println(mutablePoint);
                }
                Thread.yield();
            }
        }, "display");
        moveRightThread.start();
        moveDownThread.start();
        displayThread.start();
    }
}
Explanation:
We have got 3 threads. While the two threads moveRight and moveDown write to the mutable point, the display thread reads from it. All 3 threads must synchronize on the mutable point to avoid visibility issues and race conditions.
How can you apply unshared mutability?
Unshared means "only one thread reads and writes a given mutable object".
You don't need much for that. It's quite easy: you only ever access a mutable object from the same ONE thread. Therefore you need neither the synchronized keyword, nor any locks, nor the volatile keyword. Moreover, that one thread can be very fast, without locks or memory barriers, if it focuses on nothing but reading and writing values in the mutable object.
However, you are limited to that one thread. That's usually no problem, unless you block that one thread with tasks like I/O (don't do that!). Furthermore, you must ensure that the mutable object doesn't "escape" by being assigned to a variable or field outside the one thread and accessed from there.
If you apply unshared mutability to the example, it could look like this:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UnsharedMutability {
    private static final ExecutorService accessorService = Executors.newSingleThreadExecutor(); // only ONE thread!
    private static final MutablePoint mutablePoint = new MutablePoint();

    public static void main(String[] args) throws InterruptedException {
        final Thread moveRightThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                accessorService.submit(() -> {
                    mutablePoint.move(1, 0);
                });
                Thread.yield();
            }
        }, "moveRight");
        final Thread moveDownThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                accessorService.submit(() -> {
                    mutablePoint.move(0, 1);
                });
                Thread.yield();
            }
        }, "moveDown");
        final Thread displayThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                accessorService.submit(() -> {
                    System.out.println(mutablePoint);
                });
                Thread.yield();
            }
        }, "display");
        moveRightThread.start();
        moveDownThread.start();
        displayThread.start();
        // Shut the accessor thread down once all producers are done;
        // otherwise its non-daemon thread keeps the JVM alive forever.
        moveRightThread.join();
        moveDownThread.join();
        displayThread.join();
        accessorService.shutdown();
    }
}
Explanation:
We have got all 3 threads again. However, none of them needs to synchronize on the mutable point, because they all access it from the same one thread: the single thread running inside the single-threaded ExecutorService accessorService.
How can you apply shared immutability?
Immutability means "no ability to change the state of an object after its creation". Immutable objects always have only one state, therefore they are always thread-safe. An immutable object can produce a new immutable object when you want to "change" it, though.
However, creating too many objects too fast can cause high memory consumption and lead to higher GC activity. Sometimes you can deduplicate immutable objects if you have many duplicates of them.
If you apply shared immutability to the example, it could look like this:
import java.util.Objects;

public class ImmutablePoint {
    private final int x;
    private final int y;

    public ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    ImmutablePoint move(int dx, int dy) {
        return new ImmutablePoint(x + dx, y + dy);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ImmutablePoint that = (ImmutablePoint) o;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    @Override
    public String toString() {
        return "ImmutablePoint{x=" + x + ", y=" + y + '}';
    }
}
import java.util.concurrent.atomic.AtomicReference;

public class SharedImmutability {
    private static AtomicReference<ImmutablePoint> pointReference = new AtomicReference<>(new ImmutablePoint(0, 0));

    public static void main(String[] args) {
        final Thread moveRightThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                pointReference.updateAndGet(point -> point.move(1, 0));
                Thread.yield();
            }
        }, "moveRight");
        final Thread moveDownThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                pointReference.updateAndGet(point -> point.move(0, 1));
                Thread.yield();
            }
        }, "moveDown");
        final Thread displayThread = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                System.out.println(pointReference.get());
                Thread.yield();
            }
        }, "display");
        moveRightThread.start();
        moveDownThread.start();
        displayThread.start();
    }
}
Explanation:
We have got all 3 threads again. However, we use an immutable point instead of a mutable one. While the two threads moveRight and moveDown replace the older instance of the immutable point with a newer one in the atomic reference pointReference, the display thread can get the current instance from pointReference and display it whenever it wants, because that instance is independent of the older and newer ones.
Remark:
The calls to yield() should force thread switches, because a loop with only 1000 iterations is just too small: most CPUs execute such a loop within one time slice.

Understanding an example about resumable functions in proposal N3650 for C++1y

Consider the following example taken from N3650:
int cnt = 0;
do {
    cnt = await streamR.read(512, buf);
    if (cnt == 0)
        break;
    cnt = await streamW.write(cnt, buf);
} while (cnt > 0);
I am probably missing something, but if I have understood async and await correctly, what is the point of showing the usefulness of the two constructs with the above example, when the effects are equivalent to writing:
int cnt = 0;
do {
    cnt = streamR.read(512, buf).get();
    if (cnt == 0)
        break;
    cnt = streamW.write(cnt, buf).get();
} while (cnt > 0);
where both the read().get() and write().get() calls are synchronous?
The await keyword is not equivalent to calling get on a future. You might look at it more like this. Suppose you start from:
future<T> complex_function()
{
    do_some_stuff();
    future<Result> x = await some_async_operation();
    return do_some_other_stuff(x);
}
This is functionally more or less the same as
future<T> complex_function()
{
    do_some_stuff();
    return some_async_operation().then([=](future<Result> x) {
        return do_some_other_stuff(x);
    });
}
Note the "more or less": there are some resource management implications; variables created in do_some_stuff shouldn't be copied in order to execute do_some_other_stuff, the way the lambda version will do.
The second variant makes it clearer what will happen upon invocation:
1. do_some_stuff() is invoked synchronously when you call complex_function.
2. some_async_operation is called asynchronously and results in a future. The exact moment when this operation is executed depends on your actual asynchronous calling implementation: it might be immediate when you use threads, or it might be whenever .get() is called when you use deferred execution.
3. We don't execute do_some_other_stuff immediately, but rather chain it to the future obtained in step 2. This means it can be executed as soon as the result from some_async_operation is ready, but not before. Aside from that, its moment of execution is determined by the runtime. If the implementation just wraps the then proposal, it would inherit the parent future's executor/launch policy (as per N3558).
4. The function returns the last future, which represents the eventual result. Note this NEEDS to be a future, as part of the function body is executed asynchronously.
A more complete example (hopefully correct):
future<void> forwardMsgs(istream& streamR, ostream& streamW) async
{
    char buf[512];
    int cnt = 0;
    do {
        cnt = await streamR.read(512, buf);
        if (cnt == 0)
            break;
        cnt = await streamW.write(cnt, buf);
    } while (cnt > 0);
}

future<void> fut = forwardMsgs(myStreamR, myStreamW);
/* do something */
fut.get();
The important point is (quoting from the draft):
After suspending, a resumable function may be resumed by the scheduling logic of the runtime and will eventually complete its logic, at which point it executes a return statement (explicit or implicit) and sets the function’s result value in the placeholder.
and:
A resumable function may continue execution on another thread after resuming following a suspension of its execution.
That is, the thread that originally called forwardMsgs can return at any of the suspension points. If it does, then during the /* do something */ line the code inside forwardMsgs can be executed by another thread, even though the function was called "synchronously".
This example is very similar to
future<void> fut = std::async(forwardMsgs, myStreamR, myStreamW);
/* do something */
fut.get();
The difference is the resumable function can be executed by different threads: a different thread can resume execution (of the resumable function) after each resumption/suspension point.
I think the idea is that the streamR.read() and streamW.write() calls are asynchronous I/O operations and return futures, which are automatically waited on by the await expressions.
So the equivalent synchronous version would have to call future::get() to obtain the results, e.g.:
int cnt = 0;
do {
    cnt = streamR.read(512, buf).get();
    if (cnt == 0)
        break;
    cnt = streamW.write(cnt, buf).get();
} while (cnt > 0);
You're correct to point out that there is no concurrency here. However, in the context of a resumable function, await makes the behaviour different from the snippet above. When the await is reached, the function returns a future, so the caller of the function can proceed without blocking, even if the resumable function is blocked at an await while waiting for some other result (in this case, for the read() or write() call to finish). The resumable function might resume running asynchronously, so the result becomes available in the background while the caller is doing something else.
Here's the correct translation of the example function to not use await:
struct Copy$StackFrame {
    promise<void> $result;
    input_stream& streamR;
    output_stream& streamW;
    int cnt;
    char buf[512];
};
using Copy$StackPtr = std::shared_ptr<Copy$StackFrame>;

future<void> Copy(input_stream& streamR, output_stream& streamW) {
    Copy$StackPtr $stack{ new Copy$StackFrame{ {}, streamR, streamW, 0 } };
    future<int> f$1 = $stack->streamR.read(512, $stack->buf);
    f$1.then([$stack](future<int> f) { Copy$Cont1($stack, std::move(f)); });
    return $stack->$result.get_future();
}

void Copy$Cont1(Copy$StackPtr $stack, future<int> f$1) {
    try {
        $stack->cnt = f$1.get();
        if ($stack->cnt == 0) {
            // break;
            $stack->$result.set_value();
            return;
        }
        future<int> f$2 = $stack->streamW.write($stack->cnt, $stack->buf);
        f$2.then([$stack](future<int> f) { Copy$Cont2($stack, std::move(f)); });
    } catch (...) {
        $stack->$result.set_exception(std::current_exception());
    }
}

void Copy$Cont2(Copy$StackPtr $stack, future<int> f$2) {
    try {
        $stack->cnt = f$2.get();
        // while (cnt > 0)
        if ($stack->cnt <= 0) {
            $stack->$result.set_value();
            return;
        }
        future<int> f$1 = $stack->streamR.read(512, $stack->buf);
        f$1.then([$stack](future<int> f) { Copy$Cont1($stack, std::move(f)); });
    } catch (...) {
        $stack->$result.set_exception(std::current_exception());
    }
}
As you can see, the compiler transformation here is quite complex. The key point is that, unlike the get() version, the original Copy returns its future as soon as the first async call has been made.
I have the same issue with the meaning of the difference between these two code samples. Let's rewrite them a little to be more complete.
// Having two functions
future<void> f(istream& streamR, ostream& streamW) async
{
    int cnt = 0;
    do {
        cnt = await streamR.read(512, buf);
        if (cnt == 0)
            break;
        cnt = await streamW.write(cnt, buf);
    } while (cnt > 0);
}

void g(istream& streamR, ostream& streamW)
{
    int cnt = 0;
    do {
        cnt = streamR.read(512, buf).get();
        if (cnt == 0)
            break;
        cnt = streamW.write(cnt, buf).get();
    } while (cnt > 0);
}

// what is the difference between
auto a = f(streamR, streamW);
// and
auto b = async(g, streamR, streamW);
You still need at least three stacks. In both cases the main thread is not blocked. Is the assumption that await would be implemented by the compiler more efficiently than future<>::get()? Well, the version without await can be used now.

How can I find the depth of a recursive function in C++

How can I find the current depth inside a recursive function in C++ without passing in the previous level? i.e. is it possible to know how many times the function was called without using a parameter to keep track of the level and passing that number in as a parameter each time the function is called?
For example my recursive function looks like this:
#include <iostream>

void DoSomething(int level)
{
    std::cout << level << std::endl;
    if (level > 10)
        return;
    DoSomething(++level);
}

int main()
{
    DoSomething(0);
}
Building on the answer already given by JoshD:
void recursive()
{
    static int calls = 0;
    static int max_calls = 0;
    calls++;
    if (calls > max_calls)
        max_calls = calls;

    recursive(); // recurse here, guarded by whatever base case your function has

    calls--;
}
This resets the counter after the recursive function is complete, but still tracks the maximum depth of the recursion.
I wouldn't use static variables like this for anything but a quick test, to be deleted soon after. If you really need to track this on an ongoing basis there are better methods.
You could use a static variable in the function...
void recursive()
{
    static int calls = 0;
    calls++;
    recursive(); // recurse here, guarded by your base case
}
Of course, this will keep counting when you start a new originating call....
If you want it to be re-entrant and thread-safe, why not:
#include <iostream>

void rec(int& level) // reference to your level var
{
    // do work (base case omitted in this sketch)
    rec(++level); // go down one level
}

int main()
{
    // and you call it like
    int level = 0;
    rec(level);
    std::cout << level << " levels." << std::endl;
}
No static/global variables to mess up threading and you can use different variables for different recursive chains for re-entrancy issues.
You can use a local static variable, if you don't care about thread-safety. However, this will only give you a proper count the first time you run your recursive routine. A better technique would be a RAII guard-type class which contains an internal static variable. At the start of the recursive routine, construct the guard class. The constructor would increment the internal static variable, and the destructor would decrement it. This way, when you create a new stack frame the counter increments by one, and when you return from each stack frame the counter decrements by one.
struct recursion_guard
{
    recursion_guard() { ++counter; }
    ~recursion_guard() { --counter; }
    static int counter;
};
int recursion_guard::counter = 0;

void recurse(int x)
{
    recursion_guard rg; // increments on entry, decrements on every exit path
    if (x > 10) return;
    recurse(x + 1);
}

int main()
{
    recurse(0);
    recurse(0);
}
Note however, that this is still not thread-safe. If you need thread-safety, you can replace the static-storage variable with a thread-local-storage variable, either using boost::thread_specific_ptr or the C++0x thread local facilities.
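For instance, with the thread_local keyword that eventually shipped in C++11, the guard needs only a one-word change (a sketch building on the code above):

struct recursion_guard
{
    recursion_guard() { ++counter; }
    ~recursion_guard() { --counter; }
    static thread_local int counter; // one independent counter per thread
};
thread_local int recursion_guard::counter = 0;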
You could also pass in the level as a template parameter, if it can be determined at compile time (see the sketch after the next code block). You could also use a function object. This is by far and away the best option: less hassle, and static variables should be avoided wherever possible.
#include <iostream>

struct DoSomething {
    DoSomething() {
        calls = 0;
    }

    void operator()() {
        std::cout << calls;
        calls++;
        if (calls < 10)
            return operator()();
        return;
    }

    int calls;
};

int main() {
    DoSomething()(); // note the double ().
    std::cin.get();
}
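And here is the compile-time variant mentioned above, as a sketch: the depth is a non-type template parameter, so each level is a distinct instantiation (written with C++17's if constexpr for brevity, though the idea predates it).

#include <iostream>

// The recursion depth is baked into the type at compile time.
template <int Level>
void do_something()
{
    std::cout << Level << '\n';
    if constexpr (Level < 10) // stops generating deeper instantiations
        do_something<Level + 1>();
}

int main()
{
    do_something<0>();
}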
Convert level to an instance variable of a new object (typically a template) capable of containing the arguments and (possibly) the function. Then you can reuse the recursion accumulator interface.
You can also try using a global variable to log the depth.
#include <iostream>

int depth = 0; // global depth counter

void DoSomething()
{
    std::cout << ++depth << std::endl;
    if (depth > 10)
        return;
    DoSomething();
}

int main()
{
    DoSomething();
}
I came here when I sensed that some recursion was required, because I was implementing a function that can validate the chain of trust in a certificate chain. This is not X.509; it is just the basics, wherein the issuer key of a certificate must match the public key of the signer.
bool verify_chain(std::vector<Cert>& chain,
                  Cert* certificate,
                  unsigned char* pOrigin = nullptr, int depth = 0)
{
    bool flag = false;
    if (certificate == nullptr) {
        // use first element in case parameter is null
        certificate = &chain[0];
    }
    if (pOrigin == nullptr) {
        pOrigin = certificate->pubkey;
    } else {
        if (std::memcmp(pOrigin, certificate->pubkey, 32) == 0) {
            return false; // detected circular chain
        }
    }
    if (certificate->hasValidSignature()) {
        if (!certificate->isRootCA()) {
            Cert* issuerCert = certificate->getIssuer(chain);
            if (issuerCert) {
                flag = verify_chain(chain, issuerCert, pOrigin, depth + 1);
            }
        } else {
            flag = true;
        }
    }
    if (pOrigin && depth == 1) {
        pOrigin = nullptr;
    }
    return flag;
}
I needed to know the recursion depth so that I could correctly clean up pOrigin at the right stack frame during the unwinding of the call stack.
I used pOrigin to detect a circular chain, without which the recursive call could go on forever. For example:
cert0 signs cert1
cert1 signs cert2
cert2 signs cert0
I later realized that a simple loop can do it for simple cases, when there is only one common chain.
bool verify_chain2(std::vector<Cert>& chain, Cert& cert)
{
    Cert* pCert = &cert;
    unsigned char* startkey = cert.pubkey;
    while (pCert != nullptr) {
        if (pCert->hasValidSignature()) {
            if (!pCert->isRootCA()) {
                pCert = pCert->getIssuer(chain);
                if (pCert == nullptr
                    || std::memcmp(pCert->pubkey, startkey, 32) == 0) {
                    return false;
                }
                continue;
            } else {
                return true;
            }
        } else {
            return false;
        }
    }
    return false;
}
But recursion is a must when there is not one common chain and the chain is instead within each certificate. I welcome any comments. Thank you.