I've tried to figure out whether it is faster to pass a value to a function as an argument or simply to read a member variable. I have the following code:
#include <cstdio>
#include <ctime>

class Variable
{
public:
    Variable() {}
    ~Variable() {}
    struct TestVar // named, since unnamed structs may not contain static data members
    {
        static const int test = 3;
    } testVar;
};

class VariableTransmit
{
private:
    Variable var;
public:
    VariableTransmit() {}
    ~VariableTransmit() {}
    void testFunc1(int test)
    {
        int foo = 2;
        foo = test;
    }
    void testFunc2()
    {
        int foo = 2;
        foo = var.testVar.test;
    }
};

struct ExtVar
{
    static const int test = 3;
} extVar;
int main(void)
{
    VariableTransmit transmit;
    clock_t prgstart, prgend;

    prgstart = clock();
    for (int i = 0; i <= 10000000; i++)
    {
        transmit.testFunc1(extVar.test);
    }
    prgend = clock();
    printf("delivered: %.5f seconds\n\n", (float)(prgend - prgstart) / CLOCKS_PER_SEC);

    prgstart = clock();
    for (int i = 0; i <= 10000000; i++)
    {
        transmit.testFunc2();
    }
    prgend = clock();
    printf("member: %.5f seconds\n\n", (float)(prgend - prgstart) / CLOCKS_PER_SEC);
    return 0;
}
I tested this code and, to my surprise, testFunc1 and testFunc2 run at identical speed. I had expected testFunc1 to be faster, since it receives the value from the struct as an argument and only has to assign it, while testFunc2 has to access the var object and then fetch the value out of the struct inside that object. Is this compiler-specific optimization (I'm using VS2010, by the way), or did I just overlook something?
Edit: Removed the second question for being too opinion-based.
Your example will be heavily optimized by the Visual C++ compiler. testFunc1 might even be slower, depending on the register used to pass test, while testFunc2 always copies from the same virtual address.
Why is that?
Your Variable var lives on the stack. It is created when the enclosing object is created, so the compiler can predict its virtual address. (You could create the Variable object on the heap instead, which would give a slightly longer execution time.)
[About the public/private question: usually, you can do as you please. Personally, I believe that in multithreaded environments, making the internal structures of classes public poses a higher risk of accidental race conditions.]
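If you want the loops to measure anything at all, the assignments also need an observable side effect so the optimizer cannot discard them entirely. A minimal sketch of one way to do that (my own addition, reusing testFunc1 from the question):

volatile int sink; // writes to a volatile object cannot be optimized away

void testFunc1(int test)
{
    int foo = 2;
    foo = test;
    sink = foo; // the call now has an observable effect and must be kept
}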
I need a consistent way of resetting all thread-local variables my program creates. The problem is that the thread-local data is created in different places from where it is used.
My program outline is the following:
struct data_t { /* ... */ };

// 1. Function that fetches the "global" thread-local data
data_t& GetData()
{
    static data_t *d = NULL;
    #pragma omp threadprivate(d) // !!!
    if (!d) { d = new data_t(); }
    return *d;
}

// 2. Example function that uses the data
void user(int *elements, int num, int *output)
{
    #pragma omp parallel for shared(elements, output) if (num > 1000)
    for (int i = 0; i < num; ++i)
    {
        // computation is a heavy calculation on memoized data
        computation(GetData());
    }
}
Now, my problem is that I need a function that resets the data, i.e. every thread-local object created must be accounted for.
For now, my solution is to use a parallel region that hopefully uses at least as many threads as the "parallel for", so that every object is "iterated" through:
void ClearThreadLocalData()
{
    #pragma omp parallel
    {
        // assuming data_t has a "clear()" method
        GetData().clear();
    }
}
Is there a more idiomatic / safe way to implement ClearThreadLocalData() ?
You can create and use a global version number for your data. Increment it every time you need to clear the existing caches. Then modify GetData so that, whenever there is an existing data object, it checks the version number, discarding the existing object and creating a new one if it is out of date. (The version number for the allocated data_t object can be stored within data_t if you can modify the class, or in a second thread-local variable if not.) You'd end up with something like
static int dataVersion;

data_t& GetData()
{
    static data_t *d = NULL;
    #pragma omp threadprivate(d) // !!!
    if (d && d->myDataVersion != dataVersion) {
        delete d;
        d = nullptr;
    }
    if (!d) {
        d = new data_t();
        d->myDataVersion = dataVersion;
    }
    return *d;
}
This doesn't depend on the existence of a Clear method in data_t, but if you have one, replace the delete-and-reset with a call to Clear. I'm using d = nullptr to avoid duplicating the call to new data_t().
The global dataVersion could be a static member of data_t if you want to avoid the global variable, and it can be made atomic if necessary, although GetData would need changes to handle that.
When it comes time to reset the data, just change the global version number:
++dataVersion;
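If you do make the counter atomic (a sketch of my own, assuming data_t keeps the myDataVersion member shown above), the reset function stays a one-liner:

#include <atomic>

static std::atomic<int> dataVersion{0};

void ClearThreadLocalData()
{
    // Stale thread-local objects are discarded lazily: each thread
    // notices the new version on its next call to GetData().
    ++dataVersion;
}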
This question already has answers here: Is local static variable initialization thread-safe in C++11? [duplicate]
Suppose I have a class with three static functions like this:
#include <vector>
#include <iostream>
#include <thread>
#include <cstdlib>
using namespace std;

class Employee
{
};

class client
{
public:
    void Doprocessing()
    {
        // is this thread safe in C++11/14?
        static int i = CreateEmployee();
        // does it look good to use a static variable like this to make it thread safe?
        static int k = ProcessEmploye();
    }
private:
    static int CreateEmployee()
    {
        static Employee * e = new Employee();
        InsertEmployee(e);
        return 1;
    }
    static int InsertEmployee(Employee *e)
    {
        vec.push_back(e);
        return 1;
    }
    static int ProcessEmploye()
    {
        Employee* e = vec[0];
        // do something with e
        // ...
        // (suppose 10-20 lines)
        return 1;
    }
    static std::vector<Employee*> vec;
};

std::vector<Employee*> client::vec;

void func()
{
    client cobj;
    cobj.Doprocessing();
}

const int No_Of_Threads = 10;

int main() {
    std::thread * threadpointer = new std::thread[No_Of_Threads];
    std::srand(11);
    for (int i = 0; i < No_Of_Threads; i++)
    {
        threadpointer[i] = std::thread(func);
    }
    for (int i = 0; i < No_Of_Threads; i++)
    {
        threadpointer[i].join();
    }
    delete[] threadpointer;
    std::cout << " Good" << std::endl;
    return 0;
}
My questions are:
1) If I use static int i = SomeFunc(), no matter how bulky SomeFunc is, will it be called only once, and is that thread safe?
2) If the answer to 1) is yes, does it look good to programmers' eyes to use static int i = SomeFunc() for the above purpose?
Yes, it will be thread safe, but only since C++11. Static local variables are initialized in a thread-safe way; they are often called "magic statics" for this reason.
For more see here: http://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables
If multiple threads attempt to initialize the same static local variable concurrently, the initialization occurs exactly once (similar behavior can be obtained for arbitrary functions with std::call_once).
Note: usual implementations of this feature use variants of the double-checked locking pattern, which reduces runtime overhead for already-initialized local statics to a single non-atomic boolean comparison.
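To illustrate the std::call_once alternative mentioned in that quote (a sketch of my own, with a hypothetical once_flag name):

#include <mutex>

static std::once_flag employeeOnce;

void DoprocessingOnce()
{
    // Same execute-exactly-once guarantee as a magic static,
    // but for arbitrary code rather than a variable's initializer.
    std::call_once(employeeOnce, [] {
        // ... create and insert the Employee here ...
    });
}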
Also, in your code you call CreateEmployee() during the initialization of static i, and CreateEmployee() itself initializes another static variable. This should also be OK; you can find the following note in the standard:
The implementation must not introduce any deadlock around execution of the initializer.
As to your second question: from the code you have shown, I don't see that it is a good idea to use a static variable as a way to gain thread safety.
Are you aware that declaring a variable static inside a function body means its initializer runs only once? This means your CreateEmployee() will always use the same single Employee instance.
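A tiny sketch of that behaviour (my own illustration): the initializer of a function-local static runs on the first call only, so every caller sees the same object.

#include <cassert>

int* sameObjectEveryTime()
{
    static int* p = new int(42); // initializer runs exactly once
    return p;                    // later calls return the same pointer
}

// assert(sameObjectEveryTime() == sameObjectEveryTime()) always holds.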
I have a class Itch that is a member of the class Scratch. I want to do some computations in the Scratch constructor and pass their result to the Itch object when it is instantiated. My best guess is below, but it prints garbage:
#include <iostream>

class Itch {
public:
    int N;
    Itch(int n) { N = n; }
};

class Scratch {
private:
    int N;
public:
    Itch it;
    Scratch(int n);
};

Scratch::Scratch(int n) : it(N) // here is where I want to use new data
{
    // do some silly things
    int temp = 5;
    temp += n + 45;
    N = temp - 1;
}

int main() {
    int n = 1;
    Scratch sc(n);
    std::cout << sc.it.N << "\n";
}
Is there a standard way to do this?
The things in the initializer list happen before the code in the constructor body runs. Therefore, the constructor body cannot affect anything in the initializer list. You have a few options.
A reasonable approach would be to have an Itch * member rather than an Itch, and initialize it when it's ready, e.g.:
class Scratch {
    ...
    Itch *it;
    ...
};

Scratch::Scratch(int n) : it(NULL)
{
    // do some silly things
    int temp = 5;
    temp += n + 45;
    N = temp - 1;
    it = new Itch(N); // <- now you have enough info to instantiate an Itch
}
And you'll have to remember to clean up in the destructor, unless you use a smart pointer (e.g. std::unique_ptr):
Scratch::~Scratch () {
    delete it;
}
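For completeness, here is how the same idea looks with std::unique_ptr (my addition, assuming C++11 or later); no destructor needs to be written at all:

#include <memory>

class Scratch {
    // ...
    std::unique_ptr<Itch> it; // destroyed automatically with Scratch
    // ...
};

Scratch::Scratch(int n)
{
    int temp = 5;
    temp += n + 45;
    N = temp - 1;
    it = std::make_unique<Itch>(N); // C++14; in C++11 use it.reset(new Itch(N))
}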
Another reasonable approach would be to pass n to the Itch constructor and have it do the calculations there instead of in Scratch, perhaps even allowing Itch to determine N, e.g.:
class Itch {
private:
    int N;
public:
    Itch (int n);
    int getN () const { return N; }
};

Itch::Itch (int n) {
    // do some silly things
    int temp = 5;
    temp += n + 45;
    N = temp - 1;
}
Scratch::Scratch (int n) : it(n) {
    // you can either remove Scratch::N altogether, or I suppose do:
    N = it.getN();
    // ... do whatever makes sense; try not to have redundant data.
    // (Also ask yourself if Scratch even *needs* to know N, or at
    // least if it can just use it.getN() everywhere instead of
    // keeping its own copy.)
}
Another approach, which IMO is a bit odd but still possible in some situations, is to have e.g. a static function (member or not) that computes N from n, which you can use in the initializer list:
static int doSillyThings (int n) {
    int temp = 5;
    temp += n + 45;
    return temp - 1;
}

Scratch::Scratch(int n) : N(doSillyThings(n)), it(N)
{
}
Choose whichever leads to the cleanest, most maintainable and easy-to-read code. Personally I'd prefer the first, Itch * option, since it makes logical sense and is very clear: You do the calculations necessary to initialize the Itch, then you initialize it.
You should think about your code a bit. If Scratch's N is always equal to it.N, do you really need both Ns?
There are other options too (including restructuring your code completely so you don't need an Itch member of Scratch, or so that it doesn't depend on extra calculations done on Scratch's constructor parameters; that really depends on the situation), but hopefully this inspires you a little.
The reason your code prints garbage, by the way, is that N is garbage at the point you pass it to the Itch constructor. It's uninitialized until you assign it in the constructor body, and at the point where it(N) runs you haven't initialized N yet.
I'm considering a type erasure setup that uses typeid to resolve the type like so...
#include <iostream>
#include <typeinfo>

struct BaseThing
{
    virtual ~BaseThing() = 0; // "= 0 {}" inline is an MSVC extension; define out of line
};
BaseThing::~BaseThing() {}

template<typename T>
struct Thing : public BaseThing
{
    T x;
};

struct A{};
struct B{};

int main()
{
    BaseThing* pThing = new Thing<B>();
    const std::type_info& x = typeid(*pThing);
    if (x == typeid(Thing<B>))
    {
        std::cout << "pThing is a Thing<B>!\n";
        Thing<B>* pB = static_cast<Thing<B>*>(pThing);
    }
    else if (x == typeid(Thing<A>))
    {
        std::cout << "pThing is a Thing<A>!\n";
        Thing<A>* pA = static_cast<Thing<A>*>(pThing);
    }
}
I've never seen anyone else do this. The alternative would be for BaseThing to have a pure virtual GetID() which would be used to deduce the type instead of using typeid. In this situation, with only 1 level of inheritance, what's the cost of typeid vs the cost of a virtual function call? I know typeid uses the vtable somehow, but how exactly does it work?
This would be desirable instead of GetID() because it takes quite a bit of hackery to try to make sure the IDs are unique and deterministic.
The alternative would be for BaseThing to have a pure virtual GetID() which would be used to deduce the type instead of using typeid. In this situation, with only 1 level of inheritance, what's the cost of typeid vs the cost of a virtual function call? I know typeid uses the vtable somehow, but how exactly does it work?
On Linux and Mac, or anything else using the Itanium C++ ABI, typeid(x) compiles into two load instructions — it simply loads the vptr (that is, the address of some vtable) from the first 8 bytes of object x, and then loads the -1th pointer from that vtable. That pointer is &typeid(x). This is one function call less expensive than calling a virtual method.
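Conceptually (my own sketch of the ABI layout, not normative code):

// Rough equivalent of typeid(*p) under the Itanium C++ ABI,
// where p points to a polymorphic object:
void** vptr = *reinterpret_cast<void***>(p);      // load 1: the object's vptr
const std::type_info* ti =
    static_cast<const std::type_info*>(vptr[-1]); // load 2: the type_info* stored
                                                  // just before the vtable's address point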
On Windows, it involves on the order of four load instructions and a couple of (negligible) ALU ops, because the Microsoft C++ ABI is a bit more enterprisey. (source) This might end up being on par with a virtual method call, honestly. But that's still dirt cheap compared to a dynamic_cast.
A dynamic_cast involves a function call into the C++ runtime, which has a lot of loads and conditional branches and such.
So yes, exploiting typeid will be much, much faster than dynamic_cast. Will it be correct for your use case? That's questionable. (See the other answers about Liskov substitutability and such.) But will it be fast? Yes.
Here, I took the toy benchmark code from Vaughn's highly-rated answer and made it into an actual benchmark, avoiding the obvious loop-hoisting optimization that borked all his timings. Result, for libc++abi on my Macbook:
$ g++ test.cc -lbenchmark -std=c++14; ./a.out
Run on (4 X 2400 MHz CPU s)
2017-06-27 20:44:12
Benchmark Time CPU Iterations
---------------------------------------------------------
bench_dynamic_cast 70407 ns 70355 ns 9712
bench_typeid 31205 ns 31185 ns 21877
bench_id_method 30453 ns 29956 ns 25039
$ g++ test.cc -lbenchmark -std=c++14 -O3; ./a.out
Run on (4 X 2400 MHz CPU s)
2017-06-27 20:44:27
Benchmark Time CPU Iterations
---------------------------------------------------------
bench_dynamic_cast 57613 ns 57591 ns 11441
bench_typeid 12930 ns 12844 ns 56370
bench_id_method 20942 ns 20585 ns 33965
(Lower ns is better. You can ignore the latter two columns: "CPU" just shows that it's spending all its time running and no time waiting, and "Iterations" is just the number of runs it took to get a good margin of error.)
You can see that typeid thrashes dynamic_cast even at -O0, but when you turn on optimizations, it does even better — because the compiler can optimize any code that you write. All that ugly code hidden inside libc++abi's __dynamic_cast function can't be optimized by the compiler any more than it already has been, so turning on -O3 didn't help much.
Typically, you don't just want to know the type, but also do something with the object as that type. In that case, dynamic_cast is more useful:
int main()
{
    BaseThing* pThing = new Thing<B>();
    if (Thing<B>* pThingB = dynamic_cast<Thing<B>*>(pThing))
    {
        // Do something with pThingB
    }
    else if (Thing<A>* pThingA = dynamic_cast<Thing<A>*>(pThing))
    {
        // Do something with pThingA
    }
}
I think this is why you rarely see typeid used in practice.
Update:
Since this question concerns performance, I ran some benchmarks on g++ 4.5.1, with this code:
#include <typeinfo>

struct Base {
    virtual ~Base() { }
    virtual int id() const = 0;
};

template <class T> struct Id;
template<> struct Id<int>           { static const int value = 1; };
template<> struct Id<float>         { static const int value = 2; };
template<> struct Id<char>          { static const int value = 3; };
template<> struct Id<unsigned long> { static const int value = 4; };

template <class T>
struct Derived : Base {
    virtual int id() const { return Id<T>::value; }
};

static const int count = 100000000;

static int test1(Base *bp)
{
    int total = 0;
    for (int iter = 0; iter != count; ++iter) {
        if (Derived<int>* dp = dynamic_cast<Derived<int>*>(bp)) {
            total += 5;
        }
        else if (Derived<float> *dp = dynamic_cast<Derived<float>*>(bp)) {
            total += 7;
        }
        else if (Derived<char> *dp = dynamic_cast<Derived<char>*>(bp)) {
            total += 2;
        }
        else if (
            Derived<unsigned long> *dp = dynamic_cast<Derived<unsigned long>*>(bp)
        ) {
            total += 9;
        }
    }
    return total;
}

static int test2(Base *bp)
{
    int total = 0;
    for (int iter = 0; iter != count; ++iter) {
        const std::type_info& type = typeid(*bp);
        if (type == typeid(Derived<int>)) {
            total += 5;
        }
        else if (type == typeid(Derived<float>)) {
            total += 7;
        }
        else if (type == typeid(Derived<char>)) {
            total += 2;
        }
        else if (type == typeid(Derived<unsigned long>)) {
            total += 9;
        }
    }
    return total;
}

static int test3(Base *bp)
{
    int total = 0;
    for (int iter = 0; iter != count; ++iter) {
        int id = bp->id();
        switch (id) {
            case 1: total += 5; break;
            case 2: total += 7; break;
            case 3: total += 2; break;
            case 4: total += 9; break;
        }
    }
    return total;
}
Without optimization, I got these runtimes:
test1: 2.277s
test2: 0.629s
test3: 0.469s
With optimization -O2, I got these runtimes:
test1: 0.118s
test2: 0.220s
test3: 0.290s
So it appears that dynamic_cast is the fastest method when using optimization with this compiler.
In almost all cases you don't want the exact type; rather, you want to make sure that the object is of the given type or any type derived from it. If an object of a type derived from it cannot be substituted for an object of the type in question, then you are violating the Liskov Substitution Principle, which is one of the most fundamental rules of proper OO design.
I want to access the variables v1 and v2 from Func() while in main():
void Func(); // forward declaration so main() compiles

int main(void)
{
    Func();
    int k = ? // How to access variable 'v1' which is in Func()?
    int j = ? // How to access variable 'v2' which is in Func()?
}

void Func()
{
    int v1 = 10;
    int v2 = 20;
}
I have heard that we can access them from the stack, but how?
Thank you.
You can't legally do that. Automatic variables disappear once execution leaves the scope they're declared in.
I'm sure there are tricks, like inspecting the stack and going "backwards" in time, but all such tricks are platform-dependent, and might break if you, for instance, cause the stack to be overwritten in main().
Why do you want to do that? Do you want those values as return values? I would introduce a struct for that; according to the meaning of the values, the struct would get a suitable name:
struct DivideResult {
    int div;
    int rem;
};

DivideResult Func() {
    DivideResult r = { 10, 20 };
    return r;
}

int main() {
    DivideResult r = Func();
}
Otherwise, such variables are for managing local state in the function while it is executing. They have no meaning or lifetime after the function has terminated.
Some ways you can do this are:
Declare the variables in main() and pass them by pointer or reference into Func() (see the sketch after this list)
Return the variables to main(), e.g. in a std::vector<int> or a struct you define
Dynamically allocate the variables in Func() and return a pointer to them. You would then have to remember to delete the allocated memory later as well.
But there is no standard way to access the stack of Func() from main().
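For instance, a minimal sketch of the first option (my own illustration):

// Func fills in the caller's variables through references.
void Func(int& v1, int& v2) {
    v1 = 10;
    v2 = 20;
}

int main() {
    int k = 0, j = 0;
    Func(k, j); // k == 10, j == 20 afterwards
}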
You can't do that portably. When Func()'s stack frame disappears, there's no reliable way to access it; it's free to be trampled. However, on x86-64 there is something known as the red zone, a 128-byte area below the stack pointer that is safe from trampling, and theoretically you might still be able to access it, but this is neither portable, easy, nor correct. Simply put, don't do it.
Here's how I would do it:
void Func(int *a, int *b); // declare before use

int main(void)
{
    int k, j;
    Func(&k, &j);
}

void Func(int *a, int *b)
{
    *a = 10;
    *b = 20;
}
You're in C/C++ land. There is little you cannot do.
If this is your own code, you shouldn't even try to do that. As others suggested: pass an output parameter by reference (or by pointer in C), or return the values in a struct.
However, since you asked the question, I assume you are attempting to look into something you have only binary access to. If it is just a one-time thing, using a debugger will be easier.
Anyway, to answer your original question, try the following code. You have to compile it for an x86 CPU, with optimizations and any stack-debugging flags turned off.
#include <iostream>

void f() {
    int i = 12345;
    int j = 54321;
}

int main()
{
    int* pa = 0;
    int buf[16] = {0};
    f();
    // get the stack pointer (MSVC inline assembly, x86 only)
    __asm {
        mov dword ptr [pa], ESP
    }
    // copy the stack; try not to do anything that "uses" the stack
    // before here
    for (int i = 0; i < 16; ++i, --pa) {
        buf[i] = *pa;
    }
    // print out the stack, assuming what you want to see
    // is aligned at sizeof(int)
    for (int i = 0; i < 16; ++i) {
        std::cout << i << ":" << buf[i] << std::endl;
    }
    return 0;
}