Advice on my graphing project - c++

I'm working on a program that updates a list of objects every 0.1 seconds. After each update, the program knows whether any object is within a certain distance of any other object. Every object has an X,Y position on a graph, and every object has a value known as 'Range'. Every tick (0.1 s), the program uses the distance formula to check whether any other object is closer than or equal to the range of the object being processed.
For instance, if point A has a range of 4 and is at (1,1) and point B is at (1,2), the distance formula will return 1, meaning point B is within range of point A. The calculation will look similar to this:
objects = {
    A = {X = 1, Y = 1, Range = 4},
    B = {X = 1, Y = 2, Range = 3},
    C = {X = 4, Y = 7, Range = 9},
}

while true do
    for i, v in pairs(objects) do
        v:CheckDistance()
    end
    wait()
end
-- Point:CheckDistance() calculates the distance of all other points from Point "self".
-- Returns true if a point is within range of the Point "self", otherwise false.
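The per-tick check described above can be sketched in Python as follows (a language-neutral illustration; `check_distance` mirrors what CheckDistance would do, and the coordinates are changed so that one point has nothing in range):

```python
import math

def check_distance(self_pt, objects):
    """Return True if any other point lies within self_pt's Range."""
    for other in objects.values():
        if other is self_pt:
            continue
        d = math.hypot(other["X"] - self_pt["X"], other["Y"] - self_pt["Y"])
        if d <= self_pt["Range"]:
            return True
    return False

# illustrative data: C is far away, so nothing is within its range
objects = {
    "A": {"X": 1, "Y": 1, "Range": 4},
    "B": {"X": 1, "Y": 2, "Range": 3},
    "C": {"X": 40, "Y": 70, "Range": 9},
}
```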
The Problem:
The graph may contain over 200 points, and every point must be checked against every other point, roughly 200 × 200 = 40,000 distance calculations, every 0.1 s. I imagine this may slow down or create lag in the 3D environment I am using.
Question:
Does this sound like the optimal way to do this?
What are your ideas on how this should be done more efficiently/quickly?

As Alex Feinman said: it seems you are making your own collision detector, albeit a primitive one.
I'm not sure if you have points on a 2D or 3D plane, however. You say every object "has an X,Y position on a graph" and further on talk about "lag in the 3D environment I am using."
Well, both 2D and 3D physics, as well as Lua, are well-developed fields, so there is no shortage of optimisations.
Spatial Trees
A quadtree (or octree for 3D) is a data structure that represents your entire 2D world as a square divided into four squares, each of which is divided into four squares, and so on.
You can experiment with an interactive example yourself at this handy site.
Spatial trees in general provide very fast access for localised points.
The circles represent the interaction radius of a particular particle. As you can see, it is easy to find exactly which branches need to be traversed.
When dealing with point clouds, you need to ensure that two points do not share the same location, or that there is a maximum division depth for your tree; otherwise, it will attempt to divide branches infinitely.
I don't know of any octree implementations in Lua, but it would be pretty easy to make one. If you need examples, look for a Python or C implementation; do not look for one in C++, unless you can handle the template-madness.
Alternatively, you can use a C or C++ implementation via Lua API bindings or an FFI library (recommended; see the Lua API vs FFI section below).
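To make the idea concrete, here is a minimal point quadtree sketched in Python (all names are invented for illustration; porting it to Lua tables would be straightforward). It splits a node once it holds too many points, caps the depth to avoid the infinite-division problem mentioned above, and prunes whole branches during a circular range query:

```python
class Quadtree:
    """Hypothetical point quadtree: a square node splits into four child
    squares once it holds more than MAX_POINTS, up to MAX_DEPTH."""
    MAX_POINTS = 4
    MAX_DEPTH = 8    # guards against infinite division of coincident points

    def __init__(self, x, y, half, depth=0):
        self.x, self.y, self.half = x, y, half   # centre and half-width
        self.depth = depth
        self.points = []
        self.children = None                     # None while this is a leaf

    def insert(self, px, py):
        if self.children is not None:
            self._child_for(px, py).insert(px, py)
            return
        self.points.append((px, py))
        if len(self.points) > self.MAX_POINTS and self.depth < self.MAX_DEPTH:
            self._split()

    def _split(self):
        h, d = self.half / 2, self.depth + 1
        self.children = [Quadtree(self.x - h, self.y - h, h, d),
                         Quadtree(self.x + h, self.y - h, h, d),
                         Quadtree(self.x - h, self.y + h, h, d),
                         Quadtree(self.x + h, self.y + h, h, d)]
        pts, self.points = self.points, []
        for px, py in pts:
            self._child_for(px, py).insert(px, py)

    def _child_for(self, px, py):
        return self.children[(2 if py >= self.y else 0) +
                             (1 if px >= self.x else 0)]

    def query_circle(self, cx, cy, r, out):
        """Collect all points within distance r of (cx, cy) into out."""
        # prune: skip this square if it cannot intersect the circle
        if abs(cx - self.x) > self.half + r or abs(cy - self.y) > self.half + r:
            return out
        if self.children is not None:
            for c in self.children:
                c.query_circle(cx, cy, r, out)
        else:
            for px, py in self.points:
                if (px - cx) ** 2 + (py - cy) ** 2 <= r * r:
                    out.append((px, py))
        return out
```

With points spread across the world, each range query only visits the handful of branches near the query circle instead of all 200 points.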
LuaJIT
LuaJIT is a custom Lua 5.1 interpreter and just-in-time compiler that provides significant speed and storage optimisations as well as an FFI library that allows for easy and efficient use of C functions and types, such as integers.
Using C types to represent your points and spatial tree will significantly improve performance.
local ffi = require("ffi")

ffi.cdef[[
// gp = graphing project
struct gp_point_s {
    double x, y;
    double range;
};
struct gp_quadtree_root_s {
    // This would be extensive
};
struct gp_quadtree_node_s {
    //
};
]]
gp_point_mt = {
    __add = function(a, b)
        return gp_point(a.x + b.x, a.y + b.y)
    end,
    __tostring = function(self)
        return self.x..", "..self.y
    end,
    __index = {
        -- I couldn't think of anything you might need here!
        something = function(self) return self.range^27 end,
    },
}
gp_point = ffi.metatype("struct gp_point_s", gp_point_mt)

-- Now use gp_point at will
local p = gp_point(22.5, 5.4, 6)
print(p)
print(p + gp_point(1, 1, 0))
print(p:something())
LuaJIT will compile any runtime usage of gp_point to native assembly, meaning C-like speeds in some cases.
Lua API vs FFI
This is a tricky one...
Calls via the Lua API cannot be properly optimised, because they have full authority over the Lua state.
Raw calls to C functions via LuaJIT's FFI, on the other hand, can be fully optimised.
It's up to you to decide how your code should interoperate:
Directly within the scripts (Lua, limiting factor: dynamic languages can only be optimised to a certain extent)
Scripts -> Application bindings (Lua -> C/C++, limiting factor: Lua API)
Scripts -> External libraries (Lua -> C, limiting factor: none, FFI calls are JIT compiled)
Delta time
Not really optimisation, but it's important.
If you're making an application designed for user interaction, then you should not fix your time step; that is, you cannot assume that every iteration takes exactly 0.1 seconds. Instead, you must multiply all time-dependent operations by the elapsed time (the delta):
pos = pos+vel*delta
vel = vel+accel*delta
accel = accel+jerk*delta
-- and so on!
However, this is a physics simulation; there are distinct issues with both fixed and variable time steps for physics, as discussed by Glenn Fiedler:
Fix your timestep or explode
... If you have a series of really stiff spring constraints for shock absorbers in a car simulation then tiny changes in dt can actually make the simulation explode. ...
If you use a fixed time step, the simulation should theoretically run identically every time. If you use a variable time step, it will be very smooth but unpredictable. I'd suggest asking your professor. (This is a university project, right?)
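Fiedler's article resolves the dilemma with an accumulator: consume variable frame times, but advance the physics only in fixed dt slices. A small sketch of that pattern (Python, illustrative names):

```python
def run(frame_times, dt=0.1):
    """Advance a simulation in fixed dt steps while consuming variable
    real-frame durations; returns the number of physics steps taken."""
    accumulator = 0.0
    steps = 0
    for frame in frame_times:        # elapsed wall-clock time per render frame
        accumulator += frame
        while accumulator >= dt:     # physics always advances by exactly dt
            steps += 1               # physics_step(dt) would go here
            accumulator -= dt
    return steps
```

Leftover time smaller than dt stays in the accumulator for the next frame, so the physics remains deterministic while rendering runs at whatever rate it can.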

I don't know whether it's possible within your given circumstances, but I'd definitely use events rather than looping: track when a point changes its position and react to that. This is much more efficient, since it needs less processing and refreshes the positions faster than polling every 0.1 seconds. You should probably add some cap on calls per unit of time if your points float around, because then these events would fire very often.
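The event-plus-cap idea can be sketched like this (Python; `Point`, `on_move`, and `min_interval` are all invented names):

```python
import time

class Point:
    """Fire a callback only when the point actually moves, and rate-limit
    how often the (expensive) range check is allowed to run."""
    def __init__(self, x, y, on_move, min_interval=0.1):
        self.x, self.y = x, y
        self.on_move = on_move
        self.min_interval = min_interval
        self._last_fired = float("-inf")

    def move_to(self, x, y, now=None):
        if (x, y) == (self.x, self.y):
            return False                     # no change: no event, no work
        self.x, self.y = x, y
        now = time.monotonic() if now is None else now
        if now - self._last_fired >= self.min_interval:
            self._last_fired = now
            self.on_move(self)               # run the range check here
            return True
        return False                         # moved, but the event was capped
```

Idle points then cost nothing at all, and fast-moving points cannot trigger the range check more often than the cap allows.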


Steps for creating an optimizer on TensorFlow

I'm trying to implement a new optimizer that consists largely of the Gradient Descent method (meaning I want to perform a few Gradient Descent steps, then do different operations on the output, and then repeat). Unfortunately, I found two pieces of information:
You can't perform a given number of steps with the optimizers. Am I wrong about that? It would seem a logical option to add.
Given that 1 is true, you need to code the optimizer as a C++ kernel, thus losing the powerful possibilities of TensorFlow (like computing gradients).
If both of these are true, then 2 makes no sense to me, and I'm trying to figure out the correct way to build a new optimizer (the algorithm and everything else are crystal clear).
Thanks a lot
I am not 100% sure about that, but I think you are right. However, I don't see the benefit of adding such an option to TensorFlow. The GD-based optimizers I know usually work like this:
for i in range(num_of_epochs):
    g = gradient_of_loss()
    some_storage = f(previous_storage, func(g))
    params = func2(previous_params, some_storage)
If you need to perform a couple of optimization steps, you can simply do it in a loop:
train_op = optimizer.minimize(loss)
for i in range(10):
    sess.run(train_op)
I don't think a parameter like multitrain_op = optimizer.minimize(loss, steps) was needed in the implementation of the current optimizers, and the final user can easily simulate it with a loop, so that is probably the reason it was not added.
Let's take a look at a TF implementation of an example optimizer, Adam: python code, c++ code.
The "gradient handling" part is processed entirely by inheriting optimizer.Optimizer in python code. The python code only define types of storage to hold the moving window averages, square of gradients, etc, and executes c++ code passing to it the already calculated gradient.
The c++ code has 4 lines, updating the stored averages and parameters.
So to your question "how to build an optimizer":
1. define what you need to store between the calculations of the gradient
2. inherit from optimizer.Optimizer
3. implement updating the variables in C++
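The storage-then-update pattern from the pseudocode above can be illustrated framework-free. Here is a plain-Python momentum-style sketch (not TensorFlow API; `grad_fn`, `storage`, and all constants are invented), showing what state an optimizer keeps between gradient evaluations:

```python
def minimize(grad_fn, params, lr=0.1, momentum=0.5, steps=100):
    """Gradient descent with momentum: 'storage' is the state the optimizer
    keeps between gradient evaluations (step 1 of the recipe above)."""
    storage = [0.0] * len(params)
    for _ in range(steps):
        grads = grad_fn(params)                      # the framework supplies this
        for i, g in enumerate(grads):
            storage[i] = momentum * storage[i] + g   # update the stored state
            params[i] -= lr * storage[i]             # update the parameters
    return params

# minimizing f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
result = minimize(lambda p: [2.0 * (p[0] - 3.0)], [0.0])
```

In the real TensorFlow split, the loop body is what the C++ kernel implements, while the Python side only declares the storage and wires in the gradients.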

Minimization of anonymous function in C++

I have a cyclic program in C++ which involves composing a function (a different one every time) and then minimizing it. The composition of the function is implemented with the GiNaC package (symbolic expressions).
I tried to minimize the functions using Matlab's fmincon, but it ate all the memory while converting a string to a lambda function (the functions are rather complicated), and I couldn't manage to export a function from C++ to Matlab in any way other than as a string.
Is there any way to compose a complicated function (3 variables, sin-cos-square-root, etc.) and minimize it without determining the gradient myself, given that I don't know what the functions look like before running the program?
I also looked at NLopt, and as I understood it, it requires gradients to be written by the programmer.
Most optimization algorithms do require the gradient. However, if it's impossible to know it analytically, you may evaluate it numerically using a small increment in every coordinate. If your function F depends on the coordinates of the vector x, you may approximate the i-th component of your gradient vector G as
x1 = x;
x1[i] += dx;
G[i] = (F(x1) - F(x))/dx;
where dx is some small increment. Although such a calculation is approximate, it's usually perfectly adequate for finding a minimum, provided that dx is small enough.
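A full-gradient version of this scheme (Python sketch with invented names; it uses the central difference (F(x+dx) - F(x-dx))/(2*dx), which is typically more accurate than the one-sided formula for the same dx):

```python
def numerical_gradient(F, x, dx=1e-6):
    """Approximate the gradient of F at x, one coordinate at a time."""
    G = [0.0] * len(x)
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += dx
        x_minus[i] -= dx
        G[i] = (F(x_plus) - F(x_minus)) / (2 * dx)   # central difference
    return G
```

Such a routine can be plugged directly into gradient-based solvers (NLopt's gradient-requiring algorithms, for instance) when no analytic derivative is available.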

Formula for PI-regulation Proportional Integral algorithm

I've been reading this website: http://www.csimn.com/CSI_pages/PIDforDummies.html and I'm confused about the proportional integral part. Here's what it says.
Proportional control
Here’s a diagram of the controller when we have enabled only P control:
In Proportional Only mode, the controller simply multiplies the Error by the Proportional Gain (Kp) to get the controller output.
The Proportional Gain is the setting that we tune to get our desired performance from a “P only” controller.
A match made in heaven: The P + I Controller
If we put Proportional and Integral Action together, we get the humble PI controller. The Diagram below shows how the algorithm in a PI controller is calculated.
The tricky thing about Integral Action is that it will really screw up your process unless you know exactly how much Integral action to apply.
A good PID Tuning technique will calculate exactly how much Integral to apply for your specific process - but how is the Integral Action adjusted in the first place?
As you can see, the proportional part is easy to understand: it says that you multiply the error by a tuning variable. The part that I don't get is where you get the P and I from in the second part, and what mathematical operation you do with them. I don't have a degree in mathematics or advanced calculus knowledge, so I would appreciate it if you would try to keep it at algebra level.
There is a big part missing from the text, the actual physical system that turns the control into a process and the actual physical variable.
Think of the integral as some kind of averaging operation that filters out small oscillations in the PV input. It also represents some kind of memory of the immediate past of the process.
A moving exponential average, for instance, can be thought of being a mix of integral and proportional action.
Staying with the car-driving example: if you come to a curve where you need the steering wheel in a certain position to go in a circle, you don't just yank the wheel to that position; you move it there gradually (most of the time). Exactly such ramp-up and ramp-down actions are effects of using the integral action part.
The integral part is just a summation, also multiplied by some constant.
Analogue integration is done by a nonlinear gain and an amplifier.
Digital integration of first order is just:
output += input*dt;
Second order is:
temp += input*dt;
output += temp*dt;
where dt is the duration of one iteration of the loop (a timer tick or whatever).
Do not forget that a PI regulator can have a more complicated response:
i1 += input*dt;
i2 += i1*dt;
i3 += i2*dt;
output = a0*input + a1*i1 + a2*i2 + a3*i3 ...;
where a0 is the P part.
Now, the I regulator adds more and more control value until the controlled value matches the preset value; the longer it takes to match, the faster it regulates. This creates fast oscillations around the preset value in comparison to a P regulator with the same gain, but on average the control time is smaller than with a pure P regulator. Therefore the I gain is usually much, much smaller, which creates the memory and smoothing effect LutzL mentioned (while the regulation time is similar to, or smaller than, that of pure P regulation).
The controlled device has its own response, which can be represented as a differential function. There is a lot of theory in cybernetics about obtaining the right regulator response to match your process needs, such as:
quality of control
reaction times
maximum oscillation amplitude
stability
But for all of that you need differential math, like solving systems of differential equations of any order. I strongly recommend the use of the Laplace transform; many people use the Z transform instead.
So the I regulator adds speed to the regulation, but it also creates bigger oscillations, and when it does not match the regulated system properly it also creates instability. Integration adds overflow risks to the regulation (analogue integration is very sensitive to it). Also bear in mind that you can subtract the I part from the control value instead, which does the exact opposite. Sometimes combinations of several I parts are used to match the desired regulation response shape.
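Putting the P part and the first-order I part together gives a runnable sketch (Python; the first-order plant and all constants are invented for illustration):

```python
def simulate_pi(kp, ki, setpoint, steps=5000, dt=0.01):
    """PI controller driving an invented first-order plant (tau = 0.1 s)."""
    pv = 0.0         # process variable: what we measure
    integral = 0.0   # the I part's memory: accumulated error
    for _ in range(steps):
        error = setpoint - pv
        integral += error * dt                  # I: sum of error over time
        output = kp * error + ki * integral     # control value = P + I
        pv += (output - pv) * dt / 0.1          # plant response, Euler step
    return pv
```

With ki = 0, this plant settles at kp/(kp+1) of the setpoint, the steady-state error of a pure P controller; any nonzero ki slowly integrates that residual error away, which is exactly the behaviour described above.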

odeint and ad hoc change of state variable

I just implemented the numerical integration for a set of coupled ODEs from a discretized PDE using the odeint C++ library. It works nicely and is lightning fast, but there is one issue:
My system of ODEs has so-called absorbing boundary conditions: the time derivatives of my state variable n, which is a vector of N doubles (a population density), get calculated in the system function, but before that happens (or after the time integration) I would like to set:
n[N] = n[N-2];
n[N-1] = n[N-2];
However, of course this doesn't work, because the state variable in the system function is declared as const, and it looks as if it could not be changed other than by meddling with the library... is there any way around this?
I should mention that setting dndt[N] and dndt[N-1] to zero might look like a solution, but it doesn't really help, as it defies the concept of absorbing boundary conditions (n[N] and n[N-1] would then always keep the values they had at t=0, rather than the value of n[N-2] at any point in time), so I'd really prefer to change n.
Thanks for any help!
Regards,
Michael
Usually an absorbing boundary condition manifests itself in the equations of motion. Since n[N] = n[N-1] = n[N-2], you can insert n[N] = n[N-2] and n[N-1] = n[N-2] into the equation for dndt[N-2].
For example, the discrete Laplacian Lx[i] = x[i+1] - 2*x[i] + x[i-1] with absorbing boundaries x[n] = x[n-1] can be written as Lx[n-1] = x[n-2] - x[n-1]. The equation for x[n] can then be omitted.
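The Laplacian example above, written out as a right-hand-side function (a Python sketch with invented names, not the odeint API; a C++ system functor would have the same structure):

```python
def dndt(n, D=1.0, dx=1.0):
    """Right-hand side of dn/dt = D * Laplacian(n), with the boundary
    relations n[N] = n[N-1] = n[N-2] folded into the edge equations,
    so the (const) state vector never needs to be modified."""
    N = len(n)
    out = [0.0] * N
    for i in range(1, N - 1):                      # interior points
        out[i] = D * (n[i+1] - 2*n[i] + n[i-1]) / dx**2
    out[0] = D * (n[1] - n[0]) / dx**2             # left:  ghost x[-1] = x[0]
    out[N-1] = D * (n[N-2] - n[N-1]) / dx**2       # right: L x[N-1] = x[N-2] - x[N-1]
    return out
```

The ghost values never appear explicitly; the reduced stencils at the two ends encode them, which is exactly what the answer suggests doing inside the odeint system function.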

How do I most effectively prevent my normally-distributed random variable from being zero?

I'm writing a Monte Carlo algorithm in which at one point I need to divide by a random variable. More precisely: the random variable is used as the step width for a difference quotient, so I actually first multiply something by the variable and then divide it out of some locally linear function of this expression. Like this:
double f(double);
std::tr1::variate_generator<std::tr1::mt19937, std::tr1::normal_distribution<> >
r( std::tr1::mt19937(time(NULL)),
std::tr1::normal_distribution<>(0) );
double h = r();
double a = ( f(x+h) - f(x) ) / h;
This works fine most of the time, but fails when h=0. Mathematically, this is not a concern, because for any finite (or, indeed, countable) selection of normally-distributed random variables, all of them will be nonzero with probability 1. But in the digital implementation I will encounter an h==0 roughly every ≈2³² function calls (regardless of the Mersenne twister having a period longer than the age of the universe, it still outputs ordinary longs!).
It's pretty simple to avoid this trouble, at the moment I'm doing
double h = r();
while (h==0) h=r();
but I don't consider this particularly elegant. Is there any better way?
The function I'm evaluating is actually not just a simple ℝ→ℝ like f; it is an ℝᵐ×ℝⁿ→ℝ in which I calculate the gradient in the ℝᵐ variables while numerically integrating over the ℝⁿ variables. The whole function is superimposed with unpredictable (but "coherent") noise, sometimes with specific (but unknown) prominent frequencies; that's what gets me into trouble when I try fixed values for h.
Your way seems elegant enough; maybe a little different:

do {
    h = r();
} while (h == 0.0);
The ratio of two normally-distributed random variables is the Cauchy distribution. The Cauchy distribution is one of those nasty distributions with an infinite variance. Very nasty indeed. A Cauchy distribution will make a mess of your Monte Carlo experiment.
In many cases where the ratio of two random variables is computed, the denominator is not normal. People often use a normal distribution to approximate this non-normally distributed random variable because
normal distributions are usually so easy to work with,
usually have such nice mathematical properties,
the normal assumption appears to be more or less correct, and
the real distribution is a bear.
Suppose you are dividing by distance. Distance is non-negative by definition, and is often strictly positive as a random variable. So right off the bat, distance can never be normally distributed. Nonetheless, people often assume a normal distribution for distance in cases where the mean is much, much larger than the standard deviation. When this normal assumption is made, you need to protect against those non-physical values. One simple solution is a truncated normal.
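A truncated normal is easy to sample by rejection (a Python sketch; the cutoff `lo` is whatever bound your application needs, and the names are invented):

```python
import random

def truncated_normal(mu, sigma, lo, rng=random):
    """Sample from N(mu, sigma) conditioned on the value being >= lo."""
    while True:
        x = rng.gauss(mu, sigma)
        if x >= lo:        # reject anything below the cutoff
            return x
```

As long as the cutoff is well below the mean, the rejection rate is negligible; this is the same loop as the h==0 guard above, just with a physically motivated bound instead of a single excluded value.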
If you want to preserve the normal distribution, you have to either exclude 0 or map 0 to a new, previously non-occurring value. Since the second is most likely not possible within the finite ranges of computer arithmetic, the first is your only option.
The function (f(x+h)-f(x))/h has a limit as h->0, so if you encounter h==0 you should use that limit. The limit is f'(x), so if you know the derivative you can use it.
If what you are actually doing is creating a number of discrete points that approximate a normal distribution, and this is good enough for your purposes, create them in a way that guarantees none of them actually has the value 0.
Depending on what you're trying to compute, perhaps something like this would work:
double h = r();
double a;
if (h != 0)
    a = ( f(x+h) - f(x) ) / h;
else
    a = 0;
If f is a linear function, this should (I think?) remain continuous at h = 0.
You might also want to instead consider trapping division-by-zero exceptions to avoid the cost of the branch. Note that this may or may not have a detrimental effect on performance - benchmark both ways!
On Linux, you will need to build the file that contains your potential division by zero with -fnon-call-exceptions, and install a SIGFPE handler:

struct fp_exception { };

void sigfpe(int) {
    signal(SIGFPE, sigfpe);
    throw fp_exception();
}

void setup() {
    signal(SIGFPE, sigfpe);
}

// Later...
try {
    run_one_monte_carlo_trial();
} catch (fp_exception &) {
    // skip this trial
}
On Windows, use SEH:
__try
{
    run_one_monte_carlo_trial();
}
__except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO ?
         EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{
    // skip this trial
}
This has the advantage of potentially having less effect on the fast path: there is no branch, although there may be some adjustment of exception-handler records. On Linux, there may be a small performance hit due to the compiler generating more conservative code for -fnon-call-exceptions. This is less likely to be a problem if the code compiled under -fnon-call-exceptions does not allocate any automatic (stack) C++ objects. It's also worth noting that this makes the case in which division by zero does happen VERY expensive.