Is it possible to create constraints in parallel? - pyomo

I have a relatively simple Pyomo model, where I wish to create the constraints in parallel. I have created the following functions:
def constraintA():
    constraintA = pyo.ConstraintList()
    for y, h in model.YH:
        constraintA.add(model.varA[y, h] <= data[y, h])
    return constraintA

def constraintB():
    constraintB = pyo.ConstraintList()
    for y, h in model.YH:
        constraintB.add(model.varB[y, h] >= other_data[y, h])
    return constraintB

def constraintC():
    constraintC = pyo.ConstraintList()
    for y, h in model.YH:
        constraintC.add(model.varC[y, h] <= data[y, h] * 0.5)
    return constraintC
Note that model, data and other_data are 'global' and can therefore be used within these constraint functions. What I would like to do is attach the resulting constraints to the Pyomo model as attributes, but build them in parallel.
So I start with:
constraints = [constraintA, constraintB, constraintC]
with ProcessPoolExecutor() as pool:
    for work_item in enumerate(constraints):
        fs = [pool.submit(c) for c in constraints]
However, I get "finished raised AttributeError". Does this mean that, even though Python only has to read and copy the model, it is locking the object, so the attribute cannot be accessed and the constraints cannot be created?
In order to resolve this issue, I was wondering whether I need to explicitly copy the model and the data and other_data to the constraint functions themselves. Or is there another way to achieve this, without "passing variables around"?

For all intents and purposes, Pyomo model construction is a serial exercise.
This is primarily due to the CPython GIL (Global Interpreter Lock), which effectively serializes everything in Python (multithreading in CPython is actually serial, and relies on things like I/O waits to provide "parallelism"). In theory, you can construct model fragments in distributed processes (either through forked processes or something like mpi4py). The problem is that the process of serializing the model fragments and collecting them back into a single process is actually more costly than just generating the model serially in the single process.
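The round trip described above can be made concrete with a standard-library-only sketch. It does not use Pyomo: the list of tuples built by the hypothetical build_fragment is only a stand-in for the constraint data one worker would produce, and pickle stands in for whatever serialization the process pool would use. Real Pyomo components carry more state than these tuples, so treat the printed timings as an illustration of what gets measured, not as a benchmark.

```python
import pickle
import time


def build_fragment(n_years):
    """Hypothetical stand-in for one worker's output: (index, coefficient, bound) rows."""
    return [((y, h), 1.0, float(y * h)) for y in range(n_years) for h in range(24)]


# Cost of simply building the fragment in the current process.
start = time.perf_counter()
fragment = build_fragment(2000)
build_seconds = time.perf_counter() - start

# Extra cost a worker process would add: serialize the result, ship it,
# and rebuild it in the parent process.
start = time.perf_counter()
payload = pickle.dumps(fragment)
restored = pickle.loads(payload)
roundtrip_seconds = time.perf_counter() - start

print(f"build: {build_seconds:.4f}s, pickle round trip: {roundtrip_seconds:.4f}s, "
      f"payload: {len(payload)} bytes")
```

Running the same measurement on your own model fragments is a quick way to see whether distributing the construction could pay off at all.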
There are some Pyomo-based ideas that may enable parallel model construction in the future: one is possibly supporting distributed model construction as well as LP / NL file creation, then merging the LP / NL file fragments instead of bringing back the original modeling objects, and the other would rely on a compiled C extension that would accept a Constraint "template" and construct objects on the C side (where we can avoid the GIL). Both of these are very experimental ideas at the moment.
There is also always the possibility that CPython will remove the GIL (in which case Pyomo will almost certainly immediately add multithreading to constraint construction). There have been many efforts over the years to remove the GIL from CPython, and maybe one will finally succeed and be adopted by CPython.
The last thing to point to is @AirSquid's comment that model construction is usually not the bottleneck. If it is, then some time spent looking at how you are defining your model / constraints would be warranted (Pyomo is very forgiving in what it will allow you to do, but not all approaches are equally performant).

Related

How to ensure Eigen isometry stays isometric?

I am currently looking into Eigen::Isometry3f, defined as
typedef Transform<float,3,Isometry> Isometry3f;.
As a result, I cannot, for example, assign an Affine3f to that Isometry3f, which is good for keeping the isometry intact. (The reason is that Mode is checked in the assignment operator of Transform.)
I can however - via the Transform::operator(...), which shortcuts to Transform::m_matrix(...) - do
Eigen::Isometry3f iso;
iso.setIdentity();
iso(1, 1) = 2; //works (but should not ?!)
and thus destroy the isometry.
Q1:
Shouldn't Transform::operator(...) be disallowed or at least issue a warning? If you really want to mess up you could still use Transform.matrix()(1,1) = 2 ...
Q2:
Are there other pitfalls where I could accidentally destroy my isometry?
Q3:
If there are other pitfalls: what is the intention of Mode==Isometry? Is it not to ensure closedness/safety?
The main purpose of Mode==Isometry is to improve the speed of some operations, like inversion or extraction of the rotation part. It essentially says "I, the user, guarantee to Eigen that the underlying matrix represents an isometry". So it is the responsibility of the user not to shoot themselves in the foot. You can also break an initial isometry by replacing the linear part with a bad matrix:
iso.linear() = Matrix3f::Random();
Checking for isometry is not cheap at all, so adding checks everywhere would defeat the initial purpose. Perhaps adding a bool Transform::checkIsometry() would help track down issues in user code, but this is outside the scope of SO.
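To show what such a check would have to compute, here is a minimal sketch in plain Python rather than Eigen. The function is_isometry is hypothetical and not part of any library; it assumes the transform is given as a 4x4 row-major list of lists, and tests that the linear part is orthonormal and that the last row is [0, 0, 0, 1].

```python
def is_isometry(m, tol=1e-6):
    """Return True if the 4x4 matrix m has an orthonormal 3x3 linear part
    and the homogeneous last row [0, 0, 0, 1]."""
    # The last row of a rigid transform must be exactly [0, 0, 0, 1].
    if any(abs(a - b) > tol for a, b in zip(m[3], (0.0, 0.0, 0.0, 1.0))):
        return False
    r = [row[:3] for row in m[:3]]
    # Check R^T R == I column by column: 9 dot products of length 3.
    for i in range(3):
        for j in range(3):
            dot = sum(r[k][i] * r[k][j] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return True


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
broken = [row[:] for row in identity]
broken[1][1] = 2.0  # same edit as iso(1, 1) = 2 in the question

print(is_isometry(identity), is_isometry(broken))
```

Even this small check costs a few dozen multiplications per call, which is why running it on every element access would undo the speed benefit of the Isometry mode.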

Steps for creating an optimizer on TensorFlow

I'm trying to implement a new optimizer that consists in large part of the Gradient Descent method (which means I want to perform a few Gradient Descent steps, then do different operations on the output, and then repeat). Unfortunately, I found 2 pieces of information:
1. You can't perform a given number of steps with the optimizers. Am I wrong about that? Because it would seem a logical option to add.
2. Given that 1 is true, you need to code the optimizer using C++ as a kernel, thus losing the powerful possibilities of TensorFlow (like computing gradients).
If both of them are true then 2 makes no sense to me, and I'm trying to figure out what the correct way to build a new optimizer is (the algorithm and everything else are crystal clear).
Thanks a lot
I am not 100% sure about that, but I think you are right. But I don't see the benefit of adding such an option to TensorFlow. The optimizers based on GD that I know usually work like this:
for i in range(num_of_epochs):
    g = gradient_of_loss()
    some_storage = f(previous_storage, func(g))
    params = func2(previous_params, some_storage)
If you need to perform a couple of optimization steps, you can simply do it in a loop:
train_op = optimizer.minimize(loss)
for i in range(10):
    sess.run(train_op)
I don't think a parameter such as multitrain_op = optimizer.minimize(loss, steps) was needed to implement the current optimizers, and the end user can easily simulate it with the loop above, which is probably why it was not added.
Let's take a look at a TF implementation of an example optimizer, Adam: python code, c++ code.
The "gradient handling" part is processed entirely by inheriting optimizer.Optimizer in the python code. The python code only defines the types of storage that hold the moving window averages, squares of gradients, etc., and executes the c++ code, passing it the already calculated gradient.
The c++ code has 4 lines, updating the stored averages and parameters.
So to your question "how to build an optimizer":
1. Define what you need to store between the calculations of the gradient.
2. Inherit optimizer.Optimizer.
3. Implement updating the variables in c++.
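The split between "state kept between gradient evaluations" and "update rule" can be sketched without TensorFlow at all. The class below is a toy written for illustration only; MomentumSketch, apply_gradients and loss_grad are hypothetical names, not TensorFlow API, and the gradient is written by hand for f(x) = (x - 3)^2 instead of being computed automatically.

```python
class MomentumSketch:
    """Toy optimizer: keeps one stored value (a velocity) per parameter between steps."""

    def __init__(self, learning_rate=0.1, momentum=0.5):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.velocity = {}  # step 1: what is stored between gradient calculations

    def apply_gradients(self, params, grads):
        # step 3: the update rule, which a real optimizer would run in a compiled kernel
        for name, g in grads.items():
            v = self.momentum * self.velocity.get(name, 0.0) + g
            self.velocity[name] = v
            params[name] -= self.learning_rate * v


def loss_grad(params):
    """Hand-written gradient of f(x) = (x - 3)^2, standing in for automatic differentiation."""
    return {"x": 2.0 * (params["x"] - 3.0)}


params = {"x": 0.0}
opt = MomentumSketch()
for _ in range(100):
    opt.apply_gradients(params, loss_grad(params))

print(params["x"])
```

Because the update is an ordinary method call, interleaving "a few steps, then some other operation, then more steps" is just a matter of structuring the outer loop.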

Efficient evaluation of arbitrary functions given as data, in C++

Consider the following goal:
Create a program that solves: minimize f(x) for an arbitrary f and x supplied as input.
How could one design a C++ program that could receive a description of f and x and process it efficiently?
If the program was actually a C++ library then one could explicitly write the code for f and x (probably inheriting from some base function class for f and state class for x).
However, what should one do if the program is for example a service, and the user is sending the description of f and x in some high level representation, e.g. a JSON object?
Ideas that come to mind
1- Convert f into an internal function representation (e.g. a list of basic operations). Apply those whenever f is evaluated.
Problems: inefficient unless each operation is a batch operation (e.g. if we are doing vector or matrix operations with large vectors / matrices).
2- Somehow generate C++ code and compile the code for representing x and computing f. Is there a way to restrict compilation so that only that code needs to be compiled, but the rest of the code is 'pre-compiled' already?
The usual approach used by the mp library and others is to create an expression tree (or DAG) and use some kind of a nonlinear optimization method that normally relies on derivative information which can be computed using automatic or numeric differentiation.
An expression tree can be traversed efficiently for evaluation using a generic visitor pattern. Using a JIT might be overkill unless evaluating the function takes a substantial fraction of the optimization time.
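As a rough illustration of the expression-tree idea (in Python for brevity, not the mp library's actual design), the sketch below parses a JSON description of f into nested lists and evaluates it recursively. The JSON layout, the OPS table and the evaluate function are all assumptions made up for this example.

```python
import json
import math

# Hypothetical operator table: each entry takes the already-evaluated arguments.
OPS = {
    "add": lambda *args: sum(args),
    "mul": lambda *args: math.prod(args),
    "sub": lambda a, b: a - b,
    "pow": lambda a, b: a ** b,
}


def evaluate(node, env):
    """Evaluate a tree whose nodes are numbers, variable names, or [op, arg, ...] lists."""
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        return env[node]  # variable lookup
    op, *args = node
    return OPS[op](*(evaluate(arg, env) for arg in args))


# f(x) = x*x - 4x + 4, as it might arrive from a client.
spec = json.loads('["add", ["mul", "x", "x"], ["mul", -4, "x"], 4]')

print(evaluate(spec, {"x": 2.0}))
print(evaluate(spec, {"x": 0.0}))
```

The tree is parsed once and then evaluated many times with different values of x, which is the part an optimizer calls in its inner loop.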

LPSolve - specify constant coefficients

I'm using LPSolve IDE to solve a LP problem. I have to test the model against about 10 or 20 sets of different parameters and compare them.
Is there any way for me to keep the general model, but to specify the constants as I wish? For example, if I have the following constraint:
A >= [c]*B
I want to test how the model behaves when [c] = 10, [c] = 20, and so on. For now, I'm simply preparing different .lp files via search&replace, but:
a) it doesn't seem too efficient
b) at some point, I need to consider a constraint of the form A >= B/[c] // =(1/[c]*B). It seems, however, that LPSolve doesn't recognize the division operator. Is specifying 1/[c] directly each time the only option?
It is not completely clear what format you use with lp_solve. With the cplex lp format, for example, there is no better way: you cannot use division for the coefficient (or even multiplication, for that matter), and there is no function to 'include' another file or introduce a symbolic name for a parameter. It is a very simple language, and not suitable for any complex task.
There are several solutions for your problem; it depends if you are interested in something fast to implement, or 'clean', reusable and with a short runtime (of course this is a compromise).
You can generate your lp files from another language, e.g. python, bash, etc. This is a 'quick and dirty' solution: very slow at runtime, but probably the fastest to implement.
Like every lp solver I know, lp_solve comes with several modelling interfaces: you can, for example, use the GNU MathProg format instead of the current one. It recognizes multiplication, division, conditionals, etc. (everything you are looking for; see section 3.1, 'numeric expressions').
Finally, you have the possibility to use directly the lp_solve interface from another programming language (e.g. C) which will be the most flexible option, but it may require a little bit more work.
See the lp_solve documentation for more details on the supported input formats and the API reference.
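For the first option, a small generator script might look like the sketch below. The model text in MODEL_TEMPLATE is a made-up placeholder that mirrors the A >= [c]*B constraint from the question; it is an assumption, so replace it with the contents of your own .lp file and keep whatever syntax that file already uses. The reciprocal 1/[c] is computed in Python, so the generated file only ever contains plain numeric coefficients.

```python
from string import Template

# Hypothetical model text; $c and $inv_c are the only parts that vary between runs.
MODEL_TEMPLATE = Template(
    "min: A + B;\n"
    "c1: A >= $c B;\n"
    "c2: A >= $inv_c B;\n"
    "c3: B >= 1;\n"
)


def render_model(c):
    """Return the model text for one value of the constant c."""
    return MODEL_TEMPLATE.substitute(c=c, inv_c=repr(1.0 / c))


# One model per parameter set, keyed by the file name it could be saved under.
models = {f"model_c{c}.lp": render_model(c) for c in (10, 20)}

for name, text in models.items():
    print(f"--- {name} ---")
    print(text)
```

Writing each entry of models to disk and running the solver on it in a loop replaces the manual search and replace.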

Advice on my graphing project

I'm working on a program that will update a list of objects every (.1) seconds. After the program finishes updating the list, the program will be aware if any object is within a certain distance of any other object. Every object has an X,Y position on a graph. Every object has a value known as 'Range'. Every tick (.1s) the program will use the distance formula to calculate if any other objects are less than or equal to the range of the object being processed.
For instance, if point A has a range of 4 and is at (1,1) and point B is at (1,2), the distance formula will return ~1, meaning point B is within range of point A. The calculation will look similar to this:
objects = { A = {X = 1,Y = 1,Range = 4}, B = {X = 1,Y = 2,Range = 3}, C = {X = 4,Y = 7,Range = 9} }
while(true) do
    for i,v in pairs(objects) do
        v:CheckDistance()
    end
    wait()
end
-- Point:CheckDistance() calculates the distance of all other points from Point "self".
-- Returns true if a point is within range of the Point "self", otherwise false.
--
The Problem:
The graph may contain over 200 points, each point would have math applied to it for every other point that exists. This will occur for every point every .1s. I imagine this may slow down or create lag in the 3D environment I am using.
Question:
Does this sound like the optimal way to do this?
What are your ideas on how this should be done more efficiently/quickly?
As Alex Feinamn said: it seems you are making your own collision detector, albeit a primitive one.
I'm not sure if you have points on a 2D or 3D plane, however. You say every object "has an X,Y position on a graph" and further on talk about "lag in the 3D environment I am using."
Well, both 2D and 3D physics, as well as Lua, are well-developed fields, so there is no shortage of optimisations.
Spatial Trees
A quadtree (or octree for 3D) is a data structure that represents your entire 2D world as a square divided into four squares, which are each divided into four squares, and so on.
You can experiment with an interactive example yourself at this handy site.
Spatial trees in general provide very fast access for localised points.
The circles represent the interaction radius of a particular particle. As you can see, it is easy to find exactly which branches need to be traversed.
When dealing with point clouds, you need to ensure that no two points share the same location, or that there is a maximum division depth to your tree; otherwise, it will attempt to infinitely divide branches.
I don't know of any octree implementations in Lua, but it would be pretty easy to make one. If you need examples, look for a Python or C implementation; do not look for one in C++, unless you can handle the template-madness.
Alternatively, you can use a C or C++ implementation via Lua API bindings or a FFI library (recommended, see binding section).
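To make the quadtree idea concrete, here is a minimal sketch in Python (chosen for readability; porting it to Lua is mechanical). All names are made up for this example. It assumes every point lies inside the root square, stores points as (x, y) tuples, and uses a maximum depth so that coincident points cannot cause endless subdivision.

```python
import math


class QuadTree:
    def __init__(self, cx, cy, half, capacity=4, depth=0, max_depth=8):
        self.cx, self.cy, self.half = cx, cy, half  # square centre and half-width
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points = []
        self.children = None

    def insert(self, p):
        if self.children is not None:
            self._child_for(p).insert(p)
            return
        self.points.append(p)
        # Split only while below max_depth, so duplicates cannot recurse forever.
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        h = self.half / 2
        self.children = [
            QuadTree(self.cx + dx * h, self.cy + dy * h, h,
                     self.capacity, self.depth + 1, self.max_depth)
            for dx in (-1, 1) for dy in (-1, 1)
        ]
        pts, self.points = self.points, []
        for p in pts:
            self._child_for(p).insert(p)

    def _child_for(self, p):
        ix = 0 if p[0] < self.cx else 1
        iy = 0 if p[1] < self.cy else 1
        return self.children[ix * 2 + iy]

    def query(self, x, y, r, found=None):
        """Collect all stored points within distance r of (x, y)."""
        if found is None:
            found = []
        # Prune: the circle's bounding box does not touch this square.
        if abs(x - self.cx) > self.half + r or abs(y - self.cy) > self.half + r:
            return found
        for p in self.points:
            if math.hypot(p[0] - x, p[1] - y) <= r:
                found.append(p)
        if self.children is not None:
            for child in self.children:
                child.query(x, y, r, found)
        return found


# The three points from the question, in a world covering [0, 100] x [0, 100].
tree = QuadTree(50.0, 50.0, 50.0)
for point in [(1, 1), (1, 2), (4, 7)]:
    tree.insert(point)

near_a = sorted(tree.query(1, 1, 4))  # points within A's range of 4, A included
print(near_a)
```

With only a handful of points the tree never splits and behaves like the brute-force loop; the pruning in query starts to pay off once whole branches lie outside a point's range.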
LuaJIT
LuaJIT is a custom Lua 5.1 interpreter and just-in-time compiler that provides significant speed and storage optimisations as well as an FFI library that allows for easy and efficient use of C functions and types, such as integers.
Using C types to represent your points and spatial tree will significantly improve performance.
local ffi = require"ffi"
ffi.cdef[[
    // gp = graphing project
    struct gp_point_s {
        double x, y;
        double range;
    };
    struct gp_quadtree_root_s {
        // This would be extensive
    };
    struct gp_quadtree_node_s {
        //
    };
]]
gp_point_mt = {
    __add = function(a, b)
        return gp_point(a.x+b.x, a.y+b.y)
    end,
    __tostring = function(self)
        return self.x..", "..self.y
    end,
    __index = {
        -- I couldn't think of anything you might need here!
        something = function(self) return self.range^27 end,
    },
}
gp_point = ffi.metatype("struct gp_point_s", gp_point_mt)
-- Now use gp_point at will
local p = gp_point(22.5, 5.4, 6)
print(p)
print(p+gp_point(1, 1, 0))
print(p:something())
LuaJIT will compile any runtime usage of gp_point to native assembly, meaning C-like speeds in some cases.
Lua API vs FFI
This is a tricky one...
Calls via the Lua API cannot be properly optimised, as they have authority over the Lua state.
Whereas raw calls to C functions via LuaJIT's FFI can be fully optimised.
It's up to you to decide how your code should interoperate:
Directly within the scripts (Lua, limiting factor: dynamic languages can only be optimised to a certain extent)
Scripts -> Application bindings (Lua -> C/C++, limiting factor: Lua API)
Scripts -> External libraries (Lua -> C, limiting factor: none, FFI calls are JIT compiled)
Delta time
Not really optimisation, but it's important.
If you're making an application designed for user interaction, then you should not fix your time step; that is, you cannot assume that every iteration takes exactly 0.1 seconds. Instead, you must multiply all time-dependent operations by the elapsed time (delta).
pos = pos+vel*delta
vel = vel+accel*delta
accel = accel+jerk*delta
-- and so on!
However, this is a physics simulation; there are distinct issues with both fixed and variable time steps for physics, as discussed by Glenn Fiedler:
Fix your timestep or explode
... If you have a series of really stiff spring constraints for shock absorbers in a car simulation then tiny changes in dt can actually make the simulation explode. ...
If you use a fixed time step, then the simulation should theoretically run identically every time. If you use variable time step, it will be very smooth but unpredictable. I'd suggest asking your professor. (This is a university project, right?)
I don't know whether it's possible within your given circumstances, but I'd definitely use events rather than looping. That means tracking when a point changes its position and reacting to it. This is much more efficient, as it needs less processing and can update positions sooner than a fixed polling interval would. You should probably put in some function-call-per-time cap if your points float around, because then these events would be called very often.
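The event-driven idea could look roughly like this sketch (in Python; the World and Point classes are invented for illustration, and the per-time call cap mentioned above is left out). Distances are recomputed only for the point that actually moved, instead of for every pair on every tick.

```python
import math


class Point:
    def __init__(self, name, x, y, rng):
        self.name, self.x, self.y, self.range = name, x, y, rng


class World:
    def __init__(self):
        self.points = []
        self.events = []  # (observer, observed) pairs recorded on each move

    def add(self, point):
        self.points.append(point)

    def move(self, point, x, y):
        """Update a position and react immediately, instead of polling on a timer."""
        point.x, point.y = x, y
        self._on_moved(point)

    def _on_moved(self, moved):
        # Only the moved point is compared against the others: O(n) per move.
        for other in self.points:
            if other is moved:
                continue
            d = math.hypot(other.x - moved.x, other.y - moved.y)
            if d <= moved.range:
                self.events.append((moved.name, other.name))
            if d <= other.range:
                self.events.append((other.name, moved.name))


world = World()
a = Point("A", 1, 1, 4)
b = Point("B", 20, 20, 3)
c = Point("C", 4, 7, 9)
for p in (a, b, c):
    world.add(p)

world.move(b, 1, 2)  # B arrives next to A
print(world.events)
```

If many points move on every frame, this degrades back toward the all-pairs cost, which is where a spatial tree or a rate cap becomes useful.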