How to convert a multi-dimensional array model to a tuple structure in CPLEX?

I would like to use a large dataset (3100 demand locations) for my facility location problem.
One of the constraints is the size of the distance matrix. If I use a 2D array for the distances between locations, I store a very large amount of unnecessary data (for example, long distances that I will never use in my model, so I would have to add another constraint such as <= maxdist).
Instead of a 2D array I am trying to use the following tuple. However, if I don't use the complete distance matrix (converted to a tuple set), I do not get a solution.
Thanks for your suggestions.
{string} Supply = ...; // Supply locations
{string} DC = ...; // Candidate facility locations
{string} Demand = ...; // Demand locations
tuple Dist_Tup {
  string FROM;
  string TO;
  float MILES;
}
setof(Dist_Tup) DistanceTmp=...;
setof(Dist_Tup) Distance = { <FROM,TO,MILES> | <FROM,TO,MILES> in DistanceTmp : FROM in Supply || FROM in DC};
dexpr float TransportCost1 = sum(i in Supply , a in Alt , j in DC , p in Period, BB in Distance : BB.TO==j && BB.FROM==i) X[i][a][j][p]*Dist[BB]*C[i][j];
//dexpr float TransportCost1 = sum(i in Supply , a in Alt , j in DC , p in Period) X[i][a][j][p]*G[i][a][j]*C[i][j];

Moved solution from question to answer:
My new code, as a solution:
tuple Arc {
  string FROM;
  string TO;
}
tuple Dist_Tup {
  string FROM;
  string TO;
  float MILES;
}
setof(Dist_Tup) Distance=...;
setof(Arc) SDC_Arcs = { <FROM,TO> | <FROM,TO,MILES> in Distance : FROM in Supply && TO in DC};
setof(Arc) SD_Arcs = { <FROM,TO> | <FROM,TO,MILES> in Distance : FROM in Supply && TO in Demand && MILES<=maxDist};
setof(Arc) DCD_Arcs = { <FROM,TO> | <FROM,TO,MILES> in Distance : FROM in DC && TO in Demand && MILES<=maxDist};
dexpr float TransportCost1 = sum(i in Supply , a in Alt , j in DC , p in Period, DIST in Distance : DIST.FROM==i && DIST.TO==j, ARC in SDC_Arcs : ARC.FROM==i && ARC.TO==j) X[ARC][a][p]*G2[DIST]*C[i][j];


Compute normal based on Voronoi pattern

I am applying a 3D Voronoi pattern on a mesh. Using those loops, I am able to compute the cell position, an id and the distance.
But I would like to compute a normal based on the generated pattern.
How can I generate a normal or reorient the current normal based on this pattern and associated cells ?
The aim is to provide a faced look for the mesh. Each cell's normal should point in the same direction and adjacent cells point in different directions. Those directions should be based on the original mesh normals, I don't want to totally break mesh normals and have those points in random directions.
Here's how I generate the Voronoi pattern.
float3 p = floor(position);
float3 f = frac(position);
float id = 0.0;
float distance = 10.0;
for (int k = -1; k <= 1; k++)
{
    for (int j = -1; j <= 1; j++)
    {
        for (int i = -1; i <= 1; i++)
        {
            float3 cell = float3(float(i), float(j), float(k));
            float3 random = hash3(p + cell);
            float3 r = cell - f + random * angleOffset;
            float d = dot(r, r);
            if (d < distance)
            {
                id = random;
                distance = d;
                cellPosition = cell + p;
                normal = ?
            }
        }
    }
}
And here's the hash function :
float3 hash3(float3 x)
{
    x = float3(dot(x, float3(127.1, 311.7, 74.7)),
               dot(x, float3(269.5, 183.3, 246.1)),
               dot(x, float3(113.5, 271.9, 124.6)));
    return frac(sin(x)*43758.5453123);
}
This looks like a fairly expensive fragment shader; it might make more sense to bake out a normal map than to try to do this in real time.
It's hard to tell what your shader is doing, but I think it's checking every pixel against a 3x3x3 block of Voronoi cells. One weird thing is that random is a three-component vector that somehow gets assigned to id, which is just a scalar.
Anyway, it sounds like you would like to perturb the mesh-supplied normal by a random vector, but you'd like all pixels corresponding to a particular voronoi cell to be perturbed in the same way.
Since you already have a variable called random which presumably represents a random value generated deterministically as a function of the voronoi cell, you could just use that. For example, the following would perturb the normal by a small amount:
normal = normalize(meshNormal + 0.2 * normalize(random));
If you want to give more weight to the random component, just increase the 0.2 constant.

Computing Rand error efficiently

I'm trying to compare two image segmentations to one another.
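In order to do so, I transform each image into a vector of unsigned short values and calculate the Rand error as (a + b) / NPairs, where a counts the pixel pairs that share a label in both segmentations and b counts the pairs that have different labels in both.
Here is my code (the Rand error calculation part):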
cv::Mat im1,im2;
//code for acquiring data for im1, im2
//code for copying im1(:)->v1, im2(:)->v2
int N = v1.size();
double a = 0;
double b = 0;
for (int i = 0; i < N; i++)
{
    for (int j = 0; j < i; j++)
    {
        unsigned short l1 = v1[i];
        unsigned short l2 = v1[j];
        unsigned short gt1 = v2[i];
        unsigned short gt2 = v2[j];
        if (l1 == l2 && gt1 == gt2)
            a++; // same label in both segmentations
        else if (l1 != l2 && gt1 != gt2)
            b++; // different labels in both segmentations
    }
}
double NPairs = (double)N * (N - 1) / 2; // number of unordered pairs, computed in double to avoid int overflow
double res = (a + b) / NPairs;
My problem is that the length of each vector is 307,200.
Therefore the inner loop body runs roughly 47 billion times (about N*N/2).
This makes the entire process very slow (a few minutes to compute).
Do you have any idea how I can improve it?
Let's assume that we have P distinct labels in the first image and Q distinct labels in the second image. The key observation for efficient computation of Rand error, also called Rand index, is that the number of distinct labels is usually much smaller than the number of pixels (i.e. P, Q << n).
Step 1
First, pre-compute the following auxiliary data:
the vector s1, with size P, such that s1[p] is the number of pixel positions i with v1[i] = p.
the vector s2, with size Q, such that s2[q] is the number of pixel positions i with v2[i] = q.
the matrix M, with size P x Q, such that M[p][q] is the number of pixel positions i with v1[i] = p and v2[i] = q.
The vectors s1, s2 and the matrix M can be computed by passing once through the input images, i.e. in O(n).
Step 2
Once s1, s2 and M are available, a and b can be computed efficiently:
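Written out from the explanation that follows (the notation is an assumption: M_{pq} stands for M[p][q]), a sums the number of pixel pairs inside each bin:

```latex
a \;=\; \sum_{p=1}^{P} \sum_{q=1}^{Q} \binom{M_{pq}}{2}
  \;=\; \sum_{p=1}^{P} \sum_{q=1}^{Q} \frac{M_{pq}\,(M_{pq}-1)}{2}
```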
This holds because each pair of pixels (i, j) that we are interested in has the property that both its pixels have the same label in image 1, i.e. v1[i] = v1[j] = p, and the same label in image 2, i.e. v2[i] = v2[j] = q. Since v1[i] = p and v2[i] = q, pixel i contributes to the bin M[p][q], and so does pixel j. Therefore, for each combination of labels p and q we count the pairs of pixels that fall into the M[p][q] bin, and then sum these counts over all possible labels p and q.
Similarly, for b we have:
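Following the description below, with n the total number of pixels and, as an assumed notation, s_1(p) = s1[p], s_2(q) = s2[q] and M_{pq} = M[p][q]:

```latex
b \;=\; \frac{1}{2} \sum_{p=1}^{P} \sum_{q=1}^{Q}
        M_{pq}\,\bigl(n - s_1(p) - s_2(q) + M_{pq}\bigr)
```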
Here, we are counting how many pairs are formed with one of the pixels falling into the bin M[p][q]. Such a pixel can form a good pair with each pixel that is falling into a bin M[p'][q'], with the condition that p != p' and q != q'. Summing over all such M[p'][q'] is equivalent to subtracting from the sum over the entire matrix M (this sum is n) the sum on row p (i.e. s1[p]) and the sum on the column q (i.e. s2[q]). However, after subtracting the row and column sums, we have subtracted M[p][q] twice, and this is why it is added at the end of the expression above. Finally, this is divided by 2 because each pair was counted twice (once for each of its two constituent pixels as being part of a bin M[p][q] in the argument above).
The Rand error (Rand index) can now be computed as:
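Assuming the usual normalization by the number of unordered pixel pairs (the same NPairs = n(n-1)/2 that the brute-force loop over j < i visits):

```latex
R \;=\; \frac{a + b}{\binom{n}{2}} \;=\; \frac{2\,(a + b)}{n\,(n-1)}
```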
The overall complexity of this method is O(n) + O(PQ), with the first term usually being the dominant one.
After reading your comments, I tried the following approach:
calculate the intersections for each possible pair of values.
use the intersection results to calculate the error.
I performed the calculation straight on the cv::Mat objects, without converting them into std::vector objects. That gave me the ability to use opencv functions and achieve a faster runtime.
double a = 0, b = 0; //init variables
//unique function finds all the unique values of a matrix, with an optional input mask
std::set<unsigned short> m1Vals = unique(mat1);
for (unsigned short s1 : m1Vals)
{
    cv::Mat mask1 = (mat1 == s1);
    std::set<unsigned short> m2ValsInRoi = unique(mat2, mat1==s1);
    for (unsigned short s2 : m2ValsInRoi)
    {
        cv::Mat mask2 = mat2 == s2;
        cv::Mat andMask = mask1 & mask2;
        double andVal = cv::countNonZero(andMask);
        a += (andVal*(andVal - 1)) / 2;
        b += (andVal * (double)cv::countNonZero(~mask1 & ~mask2)) / 2;
    }
}
double NPairs = (double)N * (N - 1) / 2; // computed in double to avoid int overflow
double res = (a + b) / NPairs;
The runtime is now reasonable (only a few milliseconds vs a few minutes), and the output is the same as the code above.
I ran the code on the following matrices:
//mat1 = [1 1 2]
cv::Mat mat1 = cv::Mat::ones(cv::Size(3, 1), CV_16U);
mat1.at<ushort>(cv::Point(2, 0)) = 2;
//mat2 = [1 2 1]
cv::Mat mat2 = cv::Mat::ones(cv::Size(3, 1), CV_16U);
mat2.at<ushort>(cv::Point(1, 0)) = 2;
In this case a = 0 (no pair of pixels shares a label in both matrices) and b = 1 (one pair, i=2, j=3, has different labels in both). The algorithm's result:
a = 0
b = 1
NPairs = 3
result = 0.3333333
Thank you all for your help!

How to do logarithmic binning on a histogram?

I'm looking for a technique to logarithmically bin some data sets. We've got data with values ranging from _min to _max (floats >= 0) and the user needs to be able to specify a varying number of bins _num_bins (some int n).
I've implemented a solution taken from this question and some help on scaling here but my solution stops working when my data values lie below 1.0.
class Histogram {
    double _min, _max;
    int _num_bins;
    double logarithmicValueOfBin(double in) const;
};
double Histogram::logarithmicValueOfBin(double in) const {
    if (in == 0.0)
        return _min;
    double b = std::log(_max / _min) / (_max - _min);
    double a = _max / std::exp(b * _max);
    double in_unscaled = in * (_max - _min) / _num_bins + _min;
    return a * std::exp(b * in_unscaled);
}
When the data values are all greater than 1 I get nicely sized bins and can plot properly. When the values are less than 1 the bins come out more or less the same size and we get way too many of them.
I found a solution by reimplementing an open-source version of Matlab's logspace function.
Given a range and a number of bins, you first need to create an evenly spaced numerical sequence:
module.exports = function linspace(a,b,n) {
  var every = (b-a)/(n-1),
      ranged = integers(a,b,every);
  return ranged.length == n ? ranged : ranged.concat(b);
};
After that, loop over each value and raise your base (most likely e, 2 or 10) to that power; the results are your bin edges.
module.exports.logspace = function logspace(a,b,n) {
  return linspace(a,b,n).map(function(x) { return Math.pow(10,x); });
};
I rewrote this in C++ and it's able to support ranges > 0.
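The C++ version is not shown in the answer, so the following is only a sketch of what such a rewrite might look like. The name logBinEdges and its signature are assumptions. It spaces the exponents linearly between log10(lo) and log10(hi), which is why it behaves the same for values below and above 1, as long as lo > 0.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: returns numBins + 1 edges for numBins logarithmically
// spaced bins covering [lo, hi]. Requires 0 < lo < hi and numBins >= 1.
std::vector<double> logBinEdges(double lo, double hi, int numBins) {
    std::vector<double> edges(numBins + 1);
    const double logLo = std::log10(lo);
    const double step = (std::log10(hi) - logLo) / numBins;  // width of one bin in decades
    for (int i = 0; i <= numBins; ++i)
        edges[i] = std::pow(10.0, logLo + step * i);
    return edges;
}
```

For example, logBinEdges(0.001, 1.0, 3) yields edges at approximately 0.001, 0.01, 0.1 and 1. A minimum of exactly 0 cannot be represented on a log scale and would need a separate underflow bin.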
You can do something like the following:
// Create isolethargic binning
int T_MIN = 0; //The lower limit i.e. 1.e0
int T_MAX = 8; //The upper limit i.e. 1.e8
int ndec = T_MAX - T_MIN; //Number of decades
int N_BPDEC = 1000; //Number of bins per decade
int nbins = (int) ndec*N_BPDEC; //Total number of bins
double step = (double) ndec / nbins;//The increment
double tbins[nbins+1]; //The array to store the bins
for(int i=0; i <= nbins; ++i)
tbins[i] = (float) pow(10., step * (double) i + T_MIN);

Using Dijkstra's algorithm with an unordered_map graph

This is my current code; the header declarations are posted below.
// Using Dijkstra's
int Graph::closeness(string v1, string v2){
    int edgesTaken = 0;
    unordered_map<string, bool> visited;
    unordered_map<string, int> distances;
    string source = v1; // Starting node
    while(source != v2 && !visited[source]){
        // The node has been visited
        visited[source] = 1;
        // Set all initial distances to infinity
        for(auto i = vertices.begin(); i != vertices.end(); i++){
            distances[i->first] = INT_MAX;
        }
        // Consider all neighbors and calculate distances from the current node
        // & store them in the distances map
        for(int i = 0; i < vertices[source].edges.size(); i++){
            string neighbor = vertices[source].edges[i].name;
            distances[neighbor] = vertices[source].edges[i].weight;
        }
        // Find the neighbor with the least distance
        int minDistance = INT_MAX;
        string nodeWithMin;
        for(auto i = distances.begin(); i != distances.end(); i++){
            int currDistance = i->second;
            if(currDistance < minDistance){
                minDistance = currDistance;
                nodeWithMin = i->first;
            }
        }
        // There are no neighbors and the node hasn't been found yet
        // then terminate the function and return -1. The nodes aren't connected
        if(minDistance == INT_MAX)
            return -1;
        // Set source to the neighbor that has the shortest distance
        source = nodeWithMin;
        // Increment edgesTaken
        edgesTaken++;
        // clear the distances map to prepare for the next iteration
        distances.clear();
    }
    return edgesTaken;
}
Declarations (This is an undirected graph) :
class Graph{
// This holds the connected name and the corresponding we
struct EdgeInfo{
std::string name;
int weight;
EdgeInfo() { }
EdgeInfo(std::string n, int w) : name(n), weight(
// This will hold the data members of the vertices, inclu
struct VertexInfo{
float value;
std::vector<EdgeInfo> edges;
VertexInfo() { }
VertexInfo(float v) : value(v) { }
// A map is used so that the name is used as the index
std::unordered_map<std::string, VertexInfo> vertices;
NOTE: Please do not suggest that I change the header declarations. I am contributing to a project in which 8 other functions have already been written, and it's too late to go back and change anything, since every other function would then have to be rewritten.
I'm currently getting incorrect output. The function does handle the zero-distance case correctly, however. If two vertices aren't connected, the function should return -1. If the two nodes are the same vertex, e.g. closeness("Boston", "Boston"), the function should return 0.
Example graph
The expected closeness of each pair of vertices on the left is shown on the right:
Trenton -> Philadelphia: 2
Binghamton -> San Francisco: -1
Boston -> Boston: 0
Palo Alto -> Boston: -1
Output of my function:
Trenton -> Philadelphia: 3
Binghamton -> San Francisco: -1
Boston -> Boston: 0
Palo Alto -> Boston: 3
I've tried to implement Dijkstra's algorithm exactly as it is described, but I'm getting incorrect results. I've been trying to figure this out for a while now. Can anyone point me in the right direction?
This is most certainly not a real answer to the question (since I'm not pointing you in a direction regarding your implementation), but did you think about just using the Boost Graph library?
It boils down to writing a short Traits class for your graph structure (so it is not necessary to alter your graph definition/header), and the library is, at least for these fundamental algorithms, proven to be stable and correct.
I'd always suggest not to reinvent the wheel especially when it comes to graphs and numerics...
Your implementation is wrong, and it is only by chance you get "correct" results.
Let's do one example by hand, from Trenton to Philadelphia. I use the first letter of each city as its label.
First iteration
visited = [(T, 1), (N, 0), (W, 0), (P, 0), (B, 0)]
minDistance = 3;
nodeWithMin = N;
edgesTaken = 1
second iteration
visited = [(T, 1), (N, 1), (W, 0), (P, 0), (B, 0)]
minDistance = 2;
nodeWithMin = W;
edgesTaken = 2
third iteration
visited = [(T, 1), (N, 1), (W, 1), (P, 0), (B, 0)]
minDistance = 2;
nodeWithMin = N;
edgesTaken = 3;
fourth iteration
N is already 1 so we stop. Can you see the errors?
Traditionally, Dijkstra's shortest path algorithm is implemented with a priority queue:
dijkstra(graph, source)
    weights is a map indexed by nodes with all weights = infinity
    predecessors is a map indexed by nodes with each predecessor set to the node itself
    unvisited is a priority queue containing all nodes, ordered by weight
    weights[source] = 0
    while unvisited is not empty
        current = unvisited.pop()
        for each neighbour of current
            if weights[current] + edge_weight(current, neighbour) < weights[neighbour]
                weights[neighbour] = weights[current] + edge_weight(current, neighbour)
                predecessors[neighbour] = current
    return (weights, predecessors)
And you can get the path length by following the predecessors.
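As a rough C++ sketch of the priority-queue approach on a name-indexed unordered_map: the EdgeInfo and VertexInfo structs below are simplified stand-ins for the question's declarations, not the project's actual header, and the free function shortestDistance is a hypothetical name. The question does not say whether closeness should be the total weight or the number of edges, so this sketch assumes total weight with non-negative weights. Instead of updating priorities in place, it pushes a new entry on every improvement and skips stale entries when they are popped.

```cpp
#include <climits>
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-ins for the question's structs (assumed shapes).
struct EdgeInfo { std::string name; int weight; };
struct VertexInfo { std::vector<EdgeInfo> edges; };
using GraphMap = std::unordered_map<std::string, VertexInfo>;

// Total weight of the cheapest path from src to dst, or -1 if unreachable.
int shortestDistance(const GraphMap& vertices,
                     const std::string& src, const std::string& dst) {
    if (!vertices.count(src) || !vertices.count(dst)) return -1;

    std::unordered_map<std::string, int> dist;
    for (const auto& kv : vertices) dist[kv.first] = INT_MAX;  // "infinity"
    dist[src] = 0;

    using Item = std::pair<int, std::string>;  // (distance so far, vertex name)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    pq.push(Item(0, src));

    while (!pq.empty()) {
        const Item top = pq.top();
        pq.pop();
        const int d = top.first;
        const std::string& u = top.second;
        if (d > dist[u]) continue;   // stale entry: a shorter path was found already
        if (u == dst) return d;      // first time dst is popped, d is final
        for (const EdgeInfo& e : vertices.at(u).edges) {
            auto it = dist.find(e.name);
            if (it == dist.end()) continue;  // edge to a vertex not in the map
            if (d + e.weight < it->second) {
                it->second = d + e.weight;
                pq.push(Item(it->second, e.name));
            }
        }
    }
    return -1;  // queue exhausted without reaching dst
}
```

Unlike the loop in the question, the distances persist across iterations and accumulate along the path, and a vertex can be reached again through a cheaper route instead of ending the search.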
The problem with Palo Alto -> Boston seems to be that the algorithm takes the route Palo Alto -> San Francisco -> Los Angeles -> San Francisco (edgesTaken = 3) and then fails the while condition because San Francisco has been visited already.

Improve minimum distance filter for pointset

I created a minimum distance filter for points.
The function takes a flat stream of points (x1,y1,x2,y2...) and removes the points that lie closer than the minimum distance to another point.
void minDistanceFilter(vector<float> &points, float distance = 0.0)
{
    float p0x, p0y;
    float dx, dy, dsq;
    float mdsq = distance*distance; // minimum distance square
    unsigned i, j, n = points.size();
    for(i=0; i<n; ++i)
    {
        p0x = points[i];
        p0y = points[i+1];
        for(j=0; j<n; j+=2)
        {
            //if (i == j) continue; // discard itself (seems like it slows down the algorithm)
            dx = p0x - points[j]; // delta x (p0x - p1x)
            dy = p0y - points[j+1]; // delta y (p0y - p1y)
            dsq = dx*dx + dy*dy; // distance square
            if (dsq < mdsq)
            {
                auto del = points.begin() + j;
                points.erase(del, del + 2); // remove the (x, y) pair at j
                n = points.size(); // update n
                j -= 2; // decrement j
            }
        }
    }
}
The only problem is that it is very slow, because it tests every point against every other point (O(n^2)).
How could it be improved?
kd-trees or range trees could be used for your problem. However, if you want to code from scratch and want something simpler, then you can use a hash table structure. For each point (a,b), hash using the key (round(a/d),round(b/d)) and store all the points that have the same key in a list. Then, for each key (m,n) in your hash table, compare all points in the list to the list of points that have key (m',n') for all 9 choices of (m',n') where m' = m + (-1 or 0 or 1) and n' = n + (-1 or 0 or 1). These are the only points that can be within distance d of your points that have key (m,n). The downside compared to a kd-tree or range tree is that for a given point, you are effectively searching within a square of side length 3*d for points that might have distance d or less, instead of searching within a square of side length 2*d which is what you would get if you used a kd-tree or range tree. But if you are coding from scratch, this is easier to code; also kd-trees and range trees are kinda overkill if you only have one universal distance d that you care about for all points.
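A minimal C++ sketch of the grid idea above, under a few assumptions that are not in the answer: the function name minDistanceFilterGrid is made up, cells are keyed with floor(v / d) rather than round (either works, as long as cells have side d), the grid is a std::map keyed by the cell pair instead of a hand-written hash table, and the filter is greedy, keeping a point only if no previously kept point is closer than d.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Greedy minimum-distance filter on a flat (x1, y1, x2, y2, ...) array.
// Returns the kept points in their original order.
std::vector<float> minDistanceFilterGrid(const std::vector<float>& points, float d) {
    if (d <= 0.0f) return points;  // nothing can be closer than zero

    std::vector<float> kept;
    // Cell (cx, cy) -> offsets into `kept` of the points stored in that cell.
    std::map<std::pair<long, long>, std::vector<std::size_t>> grid;
    const float d2 = d * d;

    for (std::size_t i = 0; i + 1 < points.size(); i += 2) {
        const float x = points[i], y = points[i + 1];
        const long cx = static_cast<long>(std::floor(x / d));
        const long cy = static_cast<long>(std::floor(y / d));

        // Any kept point closer than d must lie in one of the 9 surrounding cells.
        bool tooClose = false;
        for (long nx = cx - 1; nx <= cx + 1 && !tooClose; ++nx) {
            for (long ny = cy - 1; ny <= cy + 1 && !tooClose; ++ny) {
                auto it = grid.find(std::make_pair(nx, ny));
                if (it == grid.end()) continue;
                for (std::size_t k : it->second) {
                    const float dx = x - kept[k], dy = y - kept[k + 1];
                    if (dx * dx + dy * dy < d2) { tooClose = true; break; }
                }
            }
        }
        if (!tooClose) {
            grid[std::make_pair(cx, cy)].push_back(kept.size());
            kept.push_back(x);
            kept.push_back(y);
        }
    }
    return kept;
}
```

Because kept points are at least d apart, each cell of side d can hold only a handful of them, so every lookup touches a bounded number of candidates and the whole pass is close to linear in the number of points (plus the logarithmic cost of the map).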
Look up range trees. You can use this structure to store 2-dimensional points and very quickly find all the points that lie inside a query rectangle. Since you want to find points within a certain distance d of a point (a,b), your query rectangle will need to be [a-d,a+d]x[b-d,b+d], and then you test any points found inside the rectangle to make sure they are actually within distance d of (a,b). A range tree can be built in O(n log n) time and space, and range queries take O(log n + k) time (with fractional cascading), where k is the number of points found in the rectangle. This seems optimal for your problem.