iOS code has become very slow because of objc_msgSend

I have rewritten part of my code from very simple C arrays to objects (or tried to) in order to give it more structure. Instead of passing the arrays through the function signature, I now use global arrays owned by a singleton. You can see an example of a function in my code below:
it was:
void calcdiv(int nx,int ny,float **u,float **v,
float **divu,float dx,float dy,float **p,
float dt,float rho, float **bp,float **lapp)
{
int i,j;
for (i=2;i<=nx-3;++i){
for (j=2;j<=ny-3;++j){
divu[i][j] = (u[i+1][j]-u[i-1][j])*facu +
(v[i][j+1]-v[i][j-1])*facv;
}
}
...
now it is:
void calcdiv()
{
int i,j;
SingletonClass* gV = [SingletonClass sharedInstance];
for (i=2;i<=gV.nx-3;++i){
for (j=2;j<=gV.ny-3;++j){
gV.divu[i][j] = (gV.u[i+1][j]-gV.u[i-1][j])*facu +
(gV.v[i][j+1]-gV.v[i][j-1])*facv;
}
}
...
Before the restructuring I used the function as shown first, passing the pointers to the arrays directly. Now I access the arrays through the singleton ("SingletonClass* gV..."). It works fine, except that it is much slower than before. The profiler tells me that my program spends 41% of its time in objc_msgSend, which did not show up before.
From reading other posts I understand that this can happen when objc_msgSend is called very often. That is most likely the case here, because my program does a lot of number crunching in order to display an animated flow with OpenGL.
This leads me to my question: what would you suggest? Should I stay with my simple C implementation, or is there a reasonably simple way to speed up the Objective-C version? Please be patient with me, since I am new to Objective-C programming.
Any hints and recommendations are greatly appreciated! Thanks in advance.

If your straight C method works fine, and your Objective C method puts you at a disadvantage due to method calling, and you need the performance, then there's no reason not to use straight C. From looking at your code, I don't see any advantage to whatever "structure" you're adding, because the working code looks almost precisely the same. In other words, Obj-C doesn't buy you anything here, but straight C does, so go with what's best for your user, because in terms of maintainability and readability, there's no difference in the two implementations.
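If the singleton is kept, a common middle ground is to read each accessor once, outside the loops, and work on plain local variables inside them. In Objective-C every gV.u or gV.nx is a property access that goes through objc_msgSend, so the inner loop in the question sends several messages per grid cell. The sketch below shows the hoisting pattern in C++ as an analogue only; the Grid class, its accessors and the call counter are hypothetical stand-ins for SingletonClass, and a flat array replaces the float** arrays for brevity.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SingletonClass: every accessor call is counted,
// mimicking the per-access cost of a message send.
class Grid {
public:
    Grid(int nx, int ny)
        : nx_(nx), ny_(ny), u_(cells(nx, ny), 0.0f), divu_(cells(nx, ny), 0.0f), calls_(0) {}

    int nx() const { ++calls_; return nx_; }
    int ny() const { ++calls_; return ny_; }
    float* u() { ++calls_; return &u_[0]; }
    float* divu() { ++calls_; return &divu_[0]; }
    long calls() const { return calls_; }

private:
    static std::size_t cells(int nx, int ny) { return static_cast<std::size_t>(nx) * ny; }
    int nx_, ny_;
    std::vector<float> u_, divu_;
    mutable long calls_;
};

// Slow pattern: the accessors are called again on every iteration.
void calcdiv_slow(Grid& g, float facu) {
    for (int i = 2; i <= g.nx() - 3; ++i)
        for (int j = 2; j <= g.ny() - 3; ++j)
            g.divu()[i * g.ny() + j] =
                (g.u()[(i + 1) * g.ny() + j] - g.u()[(i - 1) * g.ny() + j]) * facu;
}

// Hoisted pattern: each accessor is called once, the loops use plain locals.
void calcdiv_fast(Grid& g, float facu) {
    const int nx = g.nx();
    const int ny = g.ny();
    float* u = g.u();
    float* divu = g.divu();
    for (int i = 2; i <= nx - 3; ++i)
        for (int j = 2; j <= ny - 3; ++j)
            divu[i * ny + j] = (u[(i + 1) * ny + j] - u[(i - 1) * ny + j]) * facu;
}
```

Both functions compute the same values; the second one makes four accessor calls in total instead of several per cell, which is the part that matters when each call is a message send.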

Related

C++ Calling different functions by string name

I am relatively new to C++ - I learned it some 6+ years ago, but never really used it until a few months ago.
What is the scenario:
Considerably large system with a lot of modules.
Desired Output:
Modules (call one X) "expose" certain functions to be called over the network and have the result sent back to the caller (call it Y).
The caller Y doesn't know anything about X beyond what the library exposes (function names and parameters).
Functions in X will have to be called from the library based on a string received from Y - or a set of strings, as there will be parameters as well.
Ideally I want something as generic as possible, with variable return/parameter types or some kind of type erasure, because I don't know which functions each module will want to expose. I reckon it is quite utopian to get something like that running in C++, but hopefully it is feasible with a predetermined set of possible return/parameter types. The communication is not a problem for now; what matters is what should be done on the module side.
Question:
Would it be possible to accomplish this using C++ and Boost? I would be really grateful if someone could give me some guidelines: literature, tutorials, (pseudo)code examples and so on. I am of course not expecting a full solution here.
Possible solution:
I am a little lost as to which language features I can/should use, mainly because of the restrictions in my project.
I thought about using variadic templates and found the question below, which really helps; the only problem is that variadic templates are not supported in VS2010.
Generic functor for functions with any argument list
After some extensive research in the Web, the closest answer I got was this:
map of pointers to functions of different return types and signatures
The scenario is pretty much the same. The difference, it seems to me, is that the OP there already knows beforehand the return/parameter types of the functions he will be using. Due to my low reputation (I just joined) I unfortunately cannot ask or comment there.
To be honest, I did not fully understand how to do what the selected answer explains.
Using maps is one way, but I would have to store objects that contain function pointers (as also answered in that question), and as the code provided by that user shows, it has some hard-coded parts which I would like to avoid.
Further clarifications:
Yes, I am restricted to C++ and VS2010 SP1.
No, apart from Boost, I cannot use any other third-party library - it would be great to be able to use a reflection library such as CPGF http://www.cpgf.org/ (even though I am not 100% sure that's what I really need).
Minor Edit:
- Scripting language bindings (such as Lua) are indeed a way to go, yet I would rather not include one in the project.
I hope someone can shed light on this problem!
Thanking in advance for any input!

It looks like you need a small reflection module. For example, we have structs describing a method, such as:
struct argument_info {
std::string name;
std::string type;
std::string value;
};
struct method_info {
std::string method_name;
std::string return_type;
std::vector<argument_info> arguments; // vector, so that arguments can be indexed
};
Then compile a DLL with all the exported functions:
extern "C" __declspec(dllexport) void f1(int a, int b){/*...*/}
// extern "C" functions cannot be overloaded, so the second one needs its own name
extern "C" __declspec(dllexport) int f2(std::string a, int b, char* c){ /*...*/ return 0; }
And in the interpreter's code:
void call_function(const method_info& mi, argument_info& t_return)
{
/* let g_mi be a std::map, where the key is the std::string name of the method and the
value is a method_info struct */
std::map<std::string, method_info>::const_iterator it = g_mi.find(mi.method_name);
if(it == g_mi.end())
throw MethodNotFindException();
const method_info& expected = it->second;
if(expected.arguments.size() != mi.arguments.size())
throw InvalidArgumentsCountException();
for(std::size_t i = 0; i < expected.arguments.size(); i++)
{
if(expected.arguments[i].type != mi.arguments[i].type)
throw InvalidArgumentException();
}
t_return = module->call(mi.arguments); // module: wrapper around the loaded DLL, not shown
}
I hope this helps.
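The lookup-by-name idea can also be sketched without a DLL, using a map from names to type-erased handlers. This is only a minimal sketch under stated assumptions: Registry, Handler, from_string, to_text and add_ints are hypothetical names, all arguments and results travel as strings, and each exposed function gets a small hand-written adapter instead of variadic templates (std::function is available in VS2010, and boost::function could be used in the same way).

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<std::string> Args;
typedef std::function<std::string(const Args&)> Handler;

// Converts a string argument to T; throws if the text cannot be parsed.
template <typename T>
T from_string(const std::string& text) {
    std::istringstream in(text);
    T value;
    if (!(in >> value)) throw std::invalid_argument("bad argument: " + text);
    return value;
}

// Converts a result back to text so it can be sent to the caller.
template <typename T>
std::string to_text(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Hypothetical registry: maps an exposed name to its arity and handler.
class Registry {
    struct Entry {
        Entry() : arity(0) {}
        Entry(std::size_t a, Handler h) : arity(a), handler(h) {}
        std::size_t arity;
        Handler handler;
    };
    std::map<std::string, Entry> entries_;

public:
    void expose(const std::string& name, std::size_t arity, Handler handler) {
        entries_[name] = Entry(arity, handler);
    }

    std::string call(const std::string& name, const Args& args) const {
        std::map<std::string, Entry>::const_iterator it = entries_.find(name);
        if (it == entries_.end()) throw std::out_of_range("unknown function: " + name);
        if (it->second.arity != args.size()) throw std::invalid_argument("wrong argument count");
        return it->second.handler(args);
    }
};

// A function a module might expose, plus its string-based adapter.
int add_ints(int a, int b) { return a + b; }

std::string add_ints_adapter(const Args& args) {
    return to_text(add_ints(from_string<int>(args[0]), from_string<int>(args[1])));
}
```

A module would call something like registry.expose("add", 2, add_ints_adapter) at start-up, and the network layer would then only need registry.call(name, args) with the strings it received.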

Communication between R and C++

I have a program written in C++ which calculates values of a likelihood function that relies on a lot of data. I want to be able to call the function from R to request function values (the calculations would take too much time in R, and the C++ program is already too long to change; it's approximately 150K lines of code).
I can do this to request one value, but then the C++ application terminates and I have to restart it and load all the data again (I did this with .C()). The loading takes 10-30 seconds, depending on the model for the likelihood function and the data, and I was wondering whether there is a way to keep the C++ application alive, waiting for requests for function values, so I don't have to read all the data back into memory. Calculating a single function value in the C++ application already takes around half a second, which is very long for C++.
I was thinking about using pipe() to do this. Is that a feasible option, or should I use some other method? Is it possible to do this with Rcpp?
I'm doing this to test minimization algorithms in R on this function.

Forget about .C. That is clunky. Perhaps using .C over .Call or .External made sense before Rcpp. But now with the work we've put in Rcpp, I really don't see the point of using .C anymore. Just use .Call.
Better still, with attributes (sourceCpp and compileAttributes), you don't even have to see the .Call anymore; it just feels like you are using a C++ function.
Now, if I wanted to do something that preserves state, I'd use a module. For example, your application is this Test class. It has methods do_something and do_something_else and it counts the number of times these methods are used:
#include <Rcpp.h>
using namespace Rcpp ;
class Test {
public:
Test(): count(0){}
void do_something(){
// do whatever
count++ ;
}
void do_something_else(){
// do whatever
count++ ;
}
int get_count(){
return count ;
}
private:
int count ;
} ;
This is pretty standard C++ so far. Now, to make this available to R, you create a module like this :
RCPP_MODULE(test){
class_<Test>( "Test" )
.constructor()
.method( "do_something", &Test::do_something )
.method( "do_something_else", &Test::do_something_else )
.property( "count", &Test::get_count )
;
}
And then you can just use it :
app <- new( Test )
app$count
app$do_something()
app$do_something()
app$do_something_else()
app$count

There are several questions here.
What is the best way to call C++ code from R?
As other commenters have pointed out, the Rcpp package provides the nicest interface. Using the .Call function from base R directly is also possible, but it is not as nice as Rcpp.
How do I stop repeatedly passing data back and forth between R and C++?
You'll just need to restructure your code a little bit. Write a wrapper routine in C++ that calls all the existing C++ routines, and call that from R.
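A wrapper of that kind could look roughly like the plain C++ sketch below: the expensive state is created on the first call and kept alive for later calls. Everything here is a hypothetical placeholder (LikelihoodSession, the likelihood function, the toy data and the toy formula); in a real package the wrapper would be the function exported to R, for example through Rcpp, which is left out so the sketch stays self-contained.

```cpp
#include <cstddef>
#include <vector>

// Counts how often the (pretend) expensive load has run.
static int g_loads = 0;

// Hypothetical stand-in for the application state that is slow to build.
class LikelihoodSession {
public:
    explicit LikelihoodSession(std::size_t n) {
        ++g_loads;  // the real code would read the data files here
        data_.reserve(n);
        for (std::size_t i = 0; i < n; ++i) data_.push_back(static_cast<double>(i));
    }

    // Placeholder "likelihood": negative sum of squared residuals around theta.
    double evaluate(double theta) const {
        double sum = 0.0;
        for (std::size_t i = 0; i < data_.size(); ++i) {
            const double r = data_[i] - theta;
            sum -= r * r;
        }
        return sum;
    }

private:
    std::vector<double> data_;
};

// Wrapper entry point: the session is created on first use and then reused,
// so the load cost is paid once rather than once per function value.
double likelihood(double theta) {
    static LikelihoodSession session(3);  // toy data: 0, 1, 2
    return session.evaluate(theta);
}
```

Because the compiled code stays loaded for the lifetime of the R session, the static object survives between calls, which is what removes the repeated 10-30 second load.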

C and C++ Code Interoperability - Data Passing Issues

The situation is as follows. There is a system which is written entirely in C. This C program spawns a new thread to start a data processing engine written in C++. Hence, the system runs two threads (the main thread and the data processing engine thread). I have written a function in C which takes a C struct and passes it to the data processing thread so that a C++ function can access it. While doing so, I observe that the values of certain fields (like an unsigned int) in the C struct change when accessed on the C++ side, and I am not sure why. At the same time, if I pass around a primitive data type like an int, the value does not change. It would be great if someone could explain to me why it behaves like this. The following is the code that I wrote.
/* C++ function */
void DataProcessor::HandleDataRecv(custom_struct* cs)
{
/* Accesses the fields in the structure cs - an unsigned int field. The value of the
field here is different from the value when accessed through the C function below.
*/
}
/* C function */
void forwardData(custom_struct* cs)
{
dataProcessor->HandleDataRecv(cs); // Here dataProcessor is a reference to the object
// of the C++ class.
}
Also, both these functions are in different source files(one with .c ext and other with .cc ext)

I'd check that both sides lay out the struct in the same way:
Print sizeof(custom_struct) in both languages.
Create an instance of custom_struct in both languages and print the offset of
each member variable.
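A minimal sketch of such a check is below. The members of custom_struct are hypothetical, since the real declaration is not shown; the point is to compile the same function once as C and once as C++ with the project's actual flags and compare the two outputs. The code is written so that it is valid in both languages.

```cpp
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout; replace with the real custom_struct declaration. */
typedef struct custom_struct {
    char tag;
    unsigned int value;
    double weight;
} custom_struct;

/* Prints the total size and the byte offset of every member. */
void print_layout(void) {
    printf("sizeof(custom_struct) = %lu\n", (unsigned long)sizeof(custom_struct));
    printf("offsetof(tag)         = %lu\n", (unsigned long)offsetof(custom_struct, tag));
    printf("offsetof(value)       = %lu\n", (unsigned long)offsetof(custom_struct, value));
    printf("offsetof(weight)      = %lu\n", (unsigned long)offsetof(custom_struct, weight));
}
```

If any line differs between the C build and the C++ build, the two sides disagree about the layout, and the differing compiler flags or pack directives are the place to look.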

My wild guess would be that Michael Andresson is right: structure alignment might be the issue.
Try to compile both the C and C++ files with
-fpack-struct=4
(or some other number in place of 4). This way, the struct is aligned the same way in both cases.
If we could see the struct declaration, it would probably be clearer. The struct does not contain any #ifdef with C++-specific code like a constructor, does it? Also, check for #pragma pack directives, which manipulate data alignment.

Maybe on one side the struct has 'empty bytes' added to make the variables align on 32 bit boundaries for speed (so a CPU register can point to the variable directly).
And on the other side the struct may be packed to conserve space.
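The effect can be reproduced in a few lines. The sketch below is an illustration under assumptions, not the asker's code: Padded and Packed are hypothetical structs with the same members, one with the default layout and one forced to 1-byte packing with #pragma pack, and the helper reinterprets the bytes of one as the other.

```cpp
#include <cstddef>
#include <cstring>
#include <stdint.h>

// Default layout: the compiler may insert padding bytes after `flag`.
struct Padded {
    uint8_t flag;
    uint32_t count;
};

// Same members, packed to 1-byte alignment, so no padding is inserted.
#pragma pack(push, 1)
struct Packed {
    uint8_t flag;
    uint32_t count;
};
#pragma pack(pop)

// Interprets the bytes of a Padded object as if they were a Packed one,
// which is effectively what happens when the two sides disagree on layout.
uint32_t count_seen_by_packed_side(uint8_t flag, uint32_t count) {
    Padded p;
    std::memset(&p, 0, sizeof(p));
    p.flag = flag;
    p.count = count;

    Packed q;
    std::memcpy(&q, &p, sizeof(q));  // sizeof(Packed) <= sizeof(Padded)
    return q.count;
}
```

On a typical platform count sits at offset 4 in Padded and at offset 1 in Packed, so the value read on the "packed side" is not the value that was written, which matches the symptom of a field changing between the C and C++ code.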

(CORRECTION) With minor exceptions, C++ is a superset of C (meaning C89), so I'm confused about what is going on. I can only assume it has something to do with how you are passing or typing your variables, and/or the systems they are running on. Technically speaking, unless I am very mistaken, it should have nothing to do with C/C++ interoperability.
Some more details would help.

Instrumenting C/C++ codes using LLVM

I just read about the LLVM project and that it can be used to do static analysis on C/C++ code using the Clang analyzer, Clang being the front end of LLVM. I wanted to know if it is possible to extract all the accesses to memory (variables, local as well as global) in the source code using LLVM.
Is there any built-in library in LLVM which I could use to extract this information?
If not, please suggest how to write functions to do the same (existing source code, reference, tutorial, example...).
What I have thought of so far is to first convert the source code into LLVM bitcode and then instrument it to do the analysis, but I don't know exactly how to do it.
I tried to figure out myself which IR I should use for my purpose (Clang's Abstract Syntax Tree (AST) or LLVM's SSA Intermediate Representation (IR)), but couldn't really figure out which one to use.
Here is what I'm trying to do.
Given any C/C++ program (like the one given below), I am trying to insert calls to some function before and after every instruction that reads from or writes to memory. For example, consider the C++ program below (Account.cpp):
#include <stdio.h>
class Account {
int balance;
public:
Account(int b) {
balance = b;
}
int read() {
int r;
r = balance;
return r;
}
void deposit(int n) {
balance = balance + n;
}
void withdraw(int n) {
int r = read();
balance = r - n;
}
};
int main () {
Account* a = new Account(10);
a->deposit(1);
a->withdraw(2);
delete a;
}
So after the instrumentation my program should look like:
#include <stdio.h>
class Account {
int balance;
public:
Account(int b) {
balance = b;
}
int read() {
int r;
foo();
r = balance;
foo();
return r;
}
void deposit(int n) {
foo();
balance = balance + n;
foo();
}
void withdraw(int n) {
foo();
int r = read();
foo();
foo();
balance = r - n;
foo();
}
};
int main () {
Account* a = new Account(10);
a->deposit(1);
a->withdraw(2);
delete a;
}
where foo() may be any function, for example one that gets the current system time or increments a counter. I understand that to insert functions like the above I will have to first get the IR and then run an instrumentation pass on the IR which inserts such calls, but I don't really know how to achieve it. Please suggest, with examples, how to go about it.
Also, I understand that once I compile the program into IR, it would be really difficult to get a 1:1 mapping between my original program and the instrumented IR. So, is it possible to reflect the changes made in the IR (because of the instrumentation) back into the original program?
In order to get started with LLVM pass and how to make one on my own, I looked at an example of a pass that adds run-time checks to LLVM IR loads and stores, the SAFECode's load/store instrumentation pass (http://llvm.org/viewvc/llvm-project/safecode/trunk/include/safecode/LoadStoreChecks.h?view=markup and http://llvm.org/viewvc/llvm-project/safecode/trunk/lib/InsertPoolChecks/LoadStoreChecks.cpp?view=markup). But I couldn't figure out how to run this pass. Please give me steps how to run this pass on some program say the above Account.cpp.

First off, you have to decide whether you want to work with clang or LLVM. They both operate on very different data structures which have advantages and disadvantages.
From your sparse description of your problem, I'll recommend going for optimization passes in LLVM. Working with the IR will make it much easier to sanitize, analyze and inject code because that's what it was designed to do. The downside is that your project will be dependent on LLVM which may or may not be a problem for you. You could output the result using the C backend but that won't be usable by a human.
Another important downside when working with optimization passes is that you also lose all symbols from the original source code. Even if the Value class (more on that later) has a getName method, you should never rely on it to contain anything meaningful. It's meant to help you debug your passes and nothing else.
You will also have to have a basic understanding of compilers. For example, it's a bit of a requirement to know about basic blocks and static single assignment form. Fortunately they're not very difficult concepts to learn or understand (the Wikipedia articles should be adequate).
Before you can start coding, you first have to do some reading so here's a few links to get you started:
Architecture Overview: A quick architectural overview of LLVM. Will give you a good idea of what you're working with and whether LLVM is the right tool for you.
Documentation Head: Where you can find all the links below and more. Refer to this if I missed anything.
LLVM's IR reference: This is the full description of the LLVM IR which is what you'll be manipulating. The language is relatively simple so there isn't too much to learn.
Programmer's manual: A quick overview of basic stuff you'll need to know when working with LLVM.
Writing Passes: Everything you need to know to write transformation or analysis passes.
LLVM Passes: A comprehensive list of all the passes provided by LLVM that you can and should use. These can really help clean up the code and make it easier to analyze. For example, when working with loops, the lcssa, loop-simplify and indvars passes will save your life.
Value Inheritance Tree: This is the doxygen page for the Value class. The important bit here is the inheritance tree that you can follow to get the documentation for all the instructions defined in the IR reference page. Just ignore the ungodly monstrosity that they call the collaboration diagram.
Type Inheritance Tree: Same as above but for types.
Once you understand all that then it's cake. To find memory accesses? Search for store and load instructions. To instrument? Just create what you need using the proper subclass of the Value class and insert it before or after the store and load instruction. Because your question is a bit too broad, I can't really help you more than this. (See correction below)
By the way, I had to do something similar a few weeks ago. In about 2-3 weeks I was able to learn all I needed about LLVM, create an analysis pass to find memory accesses (and more) within a loop and instrument them with a transformation pass I created. There was no fancy algorithms involved (except the ones provided by LLVM) and everything was pretty straightforward. Moral of the story is that LLVM is easy to learn and work with.
Correction: I made an error when I said that all you have to do is search for load and store instructions.
The load and store instruction will only give accesses that are made to the heap using pointers. In order to get all memory accesses you also have to look at the values which can represent a memory location on the stack. Whether the value is written to the stack or stored in a register is determined during the register allocation phase which occurs in an optimization pass of the backend. Meaning that it's platform dependent and shouldn't be relied on.
Now unless you provide more information about what kind of memory accesses you're looking for, in what context and how you intend to instrument them, I can't help you much more than this.

Since there is no answer to your question after two days, I will offer this one, which is slightly but not completely off-topic.
As an alternative to LLVM, for static analysis of C programs, you may consider writing a Frama-C plug-in.
The existing plug-in that computes a list of inputs for a C function needs to visit every lvalue in the function's body. This is implemented in file src/inout/inputs.ml. The implementation is short (the complexity is in other plug-ins that provide their results to this one, e.g. resolving pointers) and can be used as a skeleton for your own plug-in.
A visitor for the Abstract Syntax Tree is provided by the framework. In order to do something special for lvalues, you simply define the corresponding method. The heart of the inputs plug-in is the method definition:
method vlval lv = ...
Here is an example of what the inputs plug-in does:
int a, b, c, d, *p;
main(){
p = &a;
b = c + *p;
}
The inputs of main() are computed thus:
$ frama-c -input t.c
...
[inout] Inputs for function main:
a; c; p;
More information about writing Frama-C plug-ins in general can be found here.

Execution time differences, are there any?

Consider this piece of code:
class A {
public:
void methodX() {
// snip (1 liner function)
}
};
class B {
public:
void methodX() {
// same code
}
};
The other way I could go: I have a class (AppManager), most of whose members are static (from legacy code, so please don't suggest a singleton ;)):
class AppManager {
public:
static void methodX(){
// same code
}
};
Which one should be preferred?
As both are inlined, there shouldn't be a runtime difference, right?
Which form is cleaner?

Now first of all, this is a concern so minuscule that you would never have to worry about it unless the functions are called thousands of times per frame (and you're doing something where "frames" matter).
Second, IF they are inlined, the code will be (hopefully) optimized so much that there is no sign whatsoever of the function being non-static. It would be identical.
Even if they were not inlined, the difference would be minor. The ABI would put the "this" pointer into a register (or the stack), which it wouldn't do in a static function, but again, the net result would be almost not measurable.
Bottom line - write your code in the cleanest possible way. Performance is not a concern at this point.

In my opinion the inline way would be faster, because inline functions are substituted into the calling code at compile time, and therefore there is no need to save registers, make a function call and then return. When you call a static function that is not inlined, it is a real function call, which has much more overhead than the inlined one.

I think this is the most common optimisation problem. On a first pass, when you are writing code, you try every single trick that might help the compiler, so that if the compiler cannot optimise the code well, you already have. This is wrong. What you should be looking for at the first stage, while writing the code, is just clean and understandable code, design and structure. That will produce far better code than code "optimised" by hand.
The rule is:
If you do not have the resources to benchmark the code, rewrite it and spend a lot of time on optimisation, then you do not need optimised code. In most cases it is hard to gain any speed boost with any kind of optimisation if you have structured your code well.
