Use Pyomo to minimize the sum of the k largest values of a vector with weights - pyomo

I am thinking about a Conditional Value-at-Risk (CVaR) optimization problem in spectral risk measure form (https://en.wikipedia.org/wiki/Spectral_risk_measure), for portfolio construction.
Three formulations were given as images: a. the original problem, b. a second problem (with $R_2 = -R_1$), and c. a bilevel optimization.
$R$ is a tall random matrix of shape around $100000\times1000$.
$\mathbf{p}$ is a univariate probability mass function of an arbitrary distribution on $(0,1)$. Each entry of $\mathbf{p}$ is matched with a row of the sorted $R\mathbf{w}$. All elements of $\mathbf{p}$ are positive and ideally sum to $1$. $\mathbf{p}$ can be sorted as well.
Here I rewrite the 'Sort' part using a permutation matrix $M$. It is okay to take 'Sort' as either increasing or decreasing (i.e., $\text{Sort}\left[R\mathbf{w}\right]^T \mathbf{p}$ pairs the largest element with either the largest or the smallest weight).
The 'Sort' part can be relaxed; a weak approximation is acceptable.
I noticed that 'the sum of the top k values (with non-increasing weights)' is LP representable, but please let me know whether this problem can be solved with Pyomo.
https://yalmip.github.io/command/sumk/
https://www.sciencedirect.com/science/article/abs/pii/S030505481930019X
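For reference, here is the standard LP representation behind YALMIP's sumk, stated as a sketch; the weighted extension assumes non-increasing weights, as in the question. The sum of the $k$ largest entries of $x \in \mathbb{R}^m$ is

$$\operatorname{sumk}(x,k) \;=\; \min_{t \in \mathbb{R},\, s \in \mathbb{R}^m}\; k\,t + \mathbf{1}^T s \quad\text{s.t.}\quad s \ge x - t\,\mathbf{1},\; s \ge 0,$$

and with sorted weights $p_1 \ge p_2 \ge \dots \ge p_m \ge 0$ (setting $p_{m+1} := 0$) the spectral objective telescopes into a nonnegative combination of such terms:

$$\sum_{k=1}^{m} p_k\, x_{[k]} \;=\; \sum_{k=1}^{m} (p_k - p_{k+1})\, \operatorname{sumk}(x,k),$$

where $x_{[k]}$ denotes the $k$-th largest entry of $x$. Each term is an LP, so the combination can in principle be written down in any algebraic modeling language, Pyomo included; whether that is practical at $100000 \times 1000$ scale is a separate question.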
2. I am also wondering whether the Value-at-Risk (VaR) optimization problem can be solved with Pyomo.

Find all pairwise differences in an array of distinct integers less than 1e5

Given an array of distinct positive integers ≤ 10^5, I need to find the differences of all pairs.
I don't really need to count frequency of every difference, just unique differences.
Using brute force, this can be approached by checking all possible pairs. However, this would not be efficient enough given the size of the array (since all elements are distinct, it can have up to 10^5 elements). This leads to O(n^2) complexity.
I need to exploit the property of this array that the differences are ≤ 10^5. So my alternative approach:
The array elements can be represented using another hash array, where the indices corresponding to array elements are 1 and the rest are 0. This hash array is interpreted as a polynomial whose exponents are the array values, each with coefficient 1.
Now clone this polynomial and make another polynomial with the exponents negated. If these two polynomials are multiplied, the positive exponents in the result correspond to the required differences. However, I am not certain how to implement this multiplication efficiently. I think the FFT can be used, as it multiplies two polynomials in O(n log n) time, but it requires non-negative exponents.
Please provide suggestions on how to proceed.
I also came across this algorithm, which uses the FFT to find pairwise differences in O(n log n), but I can't understand how it works. It seems to be finding all possible sums. A proof of this algorithm would be appreciated.
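For what it's worth, here is a rough C++ sketch of one way to handle the negative exponents: multiply through by x^maxVal, which turns the negated-exponent polynomial into the reversed coefficient array, so a single ordinary convolution suffices and difference d shows up at index maxVal - d. The function names and the 0.5 rounding threshold are my own choices, not from any particular library:

#include <algorithm>
#include <cmath>
#include <complex>
#include <vector>

// In-place iterative radix-2 FFT; invert = true computes the inverse transform.
void fft(std::vector<std::complex<double>>& a, bool invert) {
    const int n = (int)a.size();
    const double PI = std::acos(-1.0);
    for (int i = 1, j = 0; i < n; ++i) {           // bit-reversal permutation
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    for (int len = 2; len <= n; len <<= 1) {
        double ang = 2 * PI / len * (invert ? -1 : 1);
        std::complex<double> wlen(std::cos(ang), std::sin(ang));
        for (int i = 0; i < n; i += len) {
            std::complex<double> w(1.0);
            for (int k = 0; k < len / 2; ++k) {
                std::complex<double> u = a[i + k], v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
    if (invert)
        for (std::complex<double>& x : a) x /= n;
}

// All distinct pairwise differences of the distinct values in vals, assumed
// to lie in [0, maxVal]. Convolving the indicator array with its reversal
// puts the pair count for difference d at index maxVal - d.
std::vector<int> uniqueDifferences(const std::vector<int>& vals, int maxVal) {
    std::vector<std::complex<double>> fa(maxVal + 1), fb(maxVal + 1);
    for (int v : vals) fa[v] = 1.0;                // indicator polynomial A(x)
    for (int j = 0; j <= maxVal; ++j) fb[j] = fa[maxVal - j]; // x^maxVal * A(1/x)
    size_t n = 1;
    while (n < fa.size() + fb.size()) n <<= 1;
    fa.resize(n); fb.resize(n);
    fft(fa, false); fft(fb, false);
    for (size_t i = 0; i < n; ++i) fa[i] *= fb[i];
    fft(fa, true);
    std::vector<int> diffs;
    for (int d = 1; d <= maxVal; ++d)
        if (fa[maxVal - d].real() > 0.5)           // pair count at distance d >= 1
            diffs.push_back(d);
    return diffs;
}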

Understanding Curve Global Approximation algorithm

Problem description
I am trying to understand and implement the Curve Global Approximation, as proposed here:
https://pages.mtu.edu/~shene/COURSES/cs3621/NOTES/INT-APP/CURVE-APP-global.html
To implement the algorithm it is necessary to calculate base function coefficients, as described here:
https://pages.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-curve-coef.html
I have trouble wrapping my head around some of the details.
First, there is some trouble with the variable nomenclature. Specifically, I am tripped up by the fact that there is u as a function parameter as well as the knots ui as input. Currently I assume that I first decide how many knots I want for my approximation. Let us say I want 10. So then my parameters are the knots u0, ..., u9. I assume this is what the input parameter u is in the coefficient calculation algorithm?
The reason this tripped me up is because of the sentence:
Let u be in knot span [ui, ui+1)
If the input parameter u were one of the elements of the knot vector, there would be no need for an interval. So I assume u is actually one of the parameters (tk?) defined earlier.
Is that assumption correct?
Most important question: I am trying to get my N to work with the first of the two links, i.e., the implementation of the Global Curve Approximation. As I look at the matrix dimensions (where the P, Q, N dimensions are mentioned), it seems that N is supposed to have n rows and h-1 columns. That means N has rows equal to the number of data points and columns equal to the curve degree minus one. However, when I look at the implementation details of N in the second link, an N row is initialized with n elements. I refer to this:
Initialize N[0..n] to 0; // initialization
But I also need to calculate N for all parameters tk, which in turn correspond to the data points. So the resulting matrix is of dimension (n x n). This does not correspond to the previously mentioned (n x (h - 1)).
To go further: in the link describing the approximation algorithm, N is used to calculate Q. However, directly after that I am asked to calculate N, which I supposedly already had; how else would I have calculated Q? Is this even the same N? Do I have to calculate a new N for the desired number of control points?
Conclusion
If somebody has any helpful insight on this, please do share. I aim to implement this in C++ with Eigen, for its usefulness w.r.t. solving M * P = Q and matrix calculations. Currently I am at a loss, though. Everything seems more or less clear except for N: its dimensions, and whether it needs to be calculated multiple times or not.
The 2nd link is telling you how to compute the basis functions of a B-spline curve at parameter u, where the B-spline curve is defined by its degree, knot vector [u0, ..., um], and control points. So, for your first question: if you want to have 10 knots in your knot vector, then a typical knot vector will look like:
[0, 0, 0, 0, 0.3, 0.7, 1, 1, 1, 1]
This will be a B-spline curve of degree 3 with 6 control points.
For your 2nd question: the input parameter u is generally not one of the knots [u0, u1, ..., um]. The input parameter u is simply the parameter at which we would like to evaluate the B-spline curve. The value of u varies from 0 to 1 (assuming the knot vector also ranges from 0 to 1).
For your 3rd question: N (in the first link) represents a matrix where each element is Ni,p(tj). So, basically, the N[] array computed in the 2nd link is a row vector of the matrix N in the first link.
I hope my answers have cleared out some of your confusions.
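Since the question mentions C++ with Eigen, here is a minimal sketch of how the matrix N from the first link could be assembled from the basis-function idea of the second link. It uses the naive Cox-de Boor recursion rather than the optimized triangular scheme of the link; the function names, the clamped knot vector, and one row per data-point parameter are my assumptions:

#include <Eigen/Dense>
#include <vector>

// Cox-de Boor recursion: value of the i-th B-spline basis function of
// degree p at parameter u, for the given knot vector. Naive recursive form;
// u exactly equal to the last knot would need a special case in real code.
double basis(int i, int p, double u, const std::vector<double>& knots) {
    if (p == 0)
        return (knots[i] <= u && u < knots[i + 1]) ? 1.0 : 0.0;
    double left = 0.0, right = 0.0;
    double d1 = knots[i + p] - knots[i];
    if (d1 > 0.0)
        left = (u - knots[i]) / d1 * basis(i, p - 1, u, knots);
    double d2 = knots[i + p + 1] - knots[i + 1];
    if (d2 > 0.0)
        right = (knots[i + p + 1] - u) / d2 * basis(i + 1, p - 1, u, knots);
    return left + right;
}

// The matrix N of the first link: one row per data-point parameter t[j],
// one column per basis function, i.e., per control point (typically
// numCtrl = knots.size() - p - 1).
Eigen::MatrixXd assembleN(const std::vector<double>& t, int numCtrl, int p,
                          const std::vector<double>& knots) {
    Eigen::MatrixXd N(t.size(), numCtrl);
    for (int j = 0; j < (int)t.size(); ++j)
        for (int i = 0; i < numCtrl; ++i)
            N(j, i) = basis(i, p, t[j], knots);
    return N;
}

In this reading, the N[] row of the second link is computed once per parameter tj, so N is filled row by row rather than recomputed per set of control points.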

What is the cheapest way to sort a permutation in C++?

The problem is:
You have to sort an array in ascending order (a permutation: the numbers from 1 to N in random order) using a series of swaps. Every swap has a price, and there are 5 types of prices. Write a program that sorts the given array for the smallest total price.
There are two kinds of prices: priceByValue and priceByIndex. All prices of a kind are given in two N*N two-dimensional arrays. An example of how to access prices:
You want to swap the 2nd and the 5th elements of the permutation, with values 4 and 7. The price for this swap will be priceByValue[4][7] + priceByIndex[2][5].
Indexes of all arrays are counted from 1 (not from 0) in order to have access to all of the prices (the permutation elements' values start from 1): priceByIndex[2][5] would actually be priceByIndex[1][4] in code. Moreover, the order of the indexes by which you access prices from the two-dimensional arrays doesn't matter: priceByIndex[i][j] = priceByIndex[j][i], and priceByIndex[i][i] is always 0. (priceByValue behaves the same.)
Types of prices:
Price[i][j] = 0;
Price[i][j] = random number between 1 and 4*N;
Price[i][j] = |i-j|*6;
Price[i][j] = sqrt(|i-j|) *sqrt(N)*15/4;
Price[i][j] = max(i,j)*3;
When you access prices by index, i and j are the indexes of the elements you want to swap in the original array; when you access prices by value, i and j are the values of the elements you want to swap in the original array. (And they are always counted from 1.)
Things given:
N, an integer from 1 to 400; the mixed array; the type of priceByIndex; the priceByIndex matrix; the type of priceByValue; the priceByValue matrix (all elements of a matrix are of the given type).
Things that should 'appear on the screen': the number of swaps, all swaps (only by index; '2 5' means that you have swapped the 2nd and 5th elements), and the total price.
As I am still learning C++, I was wondering what the most effective way to sort the array is, in order to find the sort with the smallest cost. There might be a way to enumerate series of swaps that result in a sorted array and see which one has the smallest price; I suspect I need to sort the array by swapping elements which are close by both value and index, but I don't know how to do this. I would be very grateful if someone could show how to find the cheapest sort in code. Thank you in advance!
Note: this problem might have no exact solution; I am just trying to get a result close to the ideal.
Dynamic Programming!
Think of the problem as a graph. Each of the N-factorial permutations is a graph vertex, and the allowed swaps are arcs between vertices. The price of a swap is the weight on the arc.
When you look at the problem this way, it can be solved with Dijkstra's algorithm for finding the lowest-cost path through a graph from one vertex to another.
This is also called Single Pair Shortest Path
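To make that concrete, here is a sketch of Dijkstra over permutation states. CostFn is a hypothetical callback standing in for the priceByValue/priceByIndex lookups, and since the state space has N! vertices this is only workable for very small N; for N up to 400 you would need a heuristic instead:

#include <algorithm>
#include <map>
#include <queue>
#include <vector>

// Hypothetical cost callback: price of swapping positions i and j (0-based)
// holding values perm[i] and perm[j]; it would wrap the
// priceByValue[...] + priceByIndex[...] lookups from the problem statement.
using CostFn = long long (*)(const std::vector<int>& perm, int i, int j);

// Dijkstra over permutation states: each vertex is a permutation, each edge
// a single swap. Returns the cheapest total price to reach sorted order.
long long cheapestSort(std::vector<int> start, CostFn cost) {
    std::vector<int> goal = start;
    std::sort(goal.begin(), goal.end());

    std::map<std::vector<int>, long long> dist;
    using State = std::pair<long long, std::vector<int>>;
    std::priority_queue<State, std::vector<State>, std::greater<State>> pq;

    dist[start] = 0;
    pq.push({0, start});
    while (!pq.empty()) {
        auto [d, perm] = pq.top();
        pq.pop();
        if (perm == goal) return d;
        if (d > dist[perm]) continue;           // stale queue entry
        int n = (int)perm.size();
        for (int i = 0; i < n; ++i)
            for (int j = i + 1; j < n; ++j) {
                long long nd = d + cost(perm, i, j);
                std::vector<int> next = perm;
                std::swap(next[i], next[j]);
                auto it = dist.find(next);
                if (it == dist.end() || nd < it->second) {
                    dist[next] = nd;
                    pq.push({nd, next});
                }
            }
    }
    return -1;                                  // unreachable for a permutation
}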
You can use an algorithm for sorting an array in lexicographical order and modify it so that it fits your needs (you did not mention the sorting criteria, i.e., the desired result, such as least value first). There are multiple algorithms available for this, e.g., quicksort.
A code example is at https://www.geeksforgeeks.org/lexicographic-permutations-of-string/

Maximum contiguous subsequence -- dynamic programming or greedy algorithm?

Given an array vector<int> arr with positive and negative entries, the maximum contiguous subsequence problem asks for a (contiguous) segment of arr with maximum sum; the sum of the empty segment is zero. The C++ code of the algorithm I'm using is as follows:
int MaxContSum(const vector<int>& arr) {
    int sum = 0, max = 0;
    for (size_t i = 0; i < arr.size(); i++) {
        if (arr[i] >= 0) {
            if (sum < 0) sum = 0;     // drop a negative running prefix
        } else {
            if (sum > max) max = sum; // record the peak before it drops
        }
        sum += arr[i];
    }
    if (sum > max) max = sum;
    return max;
}
Is this algorithm a greedy algorithm or dynamic programming? It looks like it just scans the entries one by one and applies different strategies based on whether arr[i] is positive or negative, a locally checkable condition. Why does this problem appear in the dynamic programming chapter, then?
This is Kadane's algorithm for the maximum subarray problem. It scans through the sequence and keeps track of two quantities: the maximum subarray sum found so far overall, and the maximum subarray sum ending exactly at the current position. How does it know the starting position of the subarray yielding the best sum ending exactly at this point? Whenever (1) the previous sum is negative and (2) a positive element is encountered, it pays to start at the positive element and continue from there. The proof that it works is by simple induction.
This algorithm is not greedy, but it can be viewed as dynamic programming.
A greedy algorithm makes a locally optimal guess and sticks with it (just continuing it further and further). Here, by contrast, the algorithm may tentatively start a subsequence at some point (where the running sum ending at a positive element is negative), and later discard it and try a subsequence starting at some other point (again, because the sum has become negative and a positive element is encountered).
On the other hand, it can be viewed as a dynamic programming problem. As the Wikipedia entry puts it:
Because of the way this algorithm uses optimal substructures (the maximum subarray ending at each position is calculated in a simple way from a related but smaller and overlapping subproblem: the maximum subarray ending at the previous position) this algorithm can be viewed as a simple example of dynamic programming.
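To make the DP reading concrete, here is a sketch of the same computation with the recurrence written out explicitly (equivalent to the question's code, with the empty segment contributing 0):

#include <algorithm>
#include <vector>

// DP view of Kadane: endingHere is the maximum sum of a segment ending
// exactly at the current position, computed from the same quantity at the
// previous position (the smaller, overlapping subproblem); the answer is
// the maximum over all positions, and 0 for the empty segment.
int MaxContSumDP(const std::vector<int>& arr) {
    int endingHere = 0, best = 0;
    for (int x : arr) {
        endingHere = std::max(x, endingHere + x); // extend or restart at x
        best = std::max(best, endingHere);
    }
    return best;
}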
Two main properties that a problem should have in order to be eligible for solving with DP are:
Overlapping subproblems
Optimal substructure
From what you presented, the first property is definitely missing, and therefore I wouldn't classify this algorithm as DP. On the other hand, the result of the calculation for the smaller problem is used to get the final result, so we do have optimal substructure, and that is probably why you found this algorithm in the dynamic programming chapter, even though by this criterion it does not belong there.

Checking efficiently if three binary vectors are linearly independent over finite field

I am given three binary vectors v1, v2, v3, represented by unsigned int in my program, and a finite field F, which is also a set of binary vectors. I need to check whether the vectors are linearly independent, that is, that there are no f1, f2 in F such that f1*v1 + f2*v2 = v3.
The immediate brute force solution is to iterate over the field and check all possible linear combinations.
Does there exist a more efficient algorithm?
I'd like to emphasize two points:
The field elements are vectors, not scalars. Therefore, a product of a field element f1 and a given vector vi is a dot product, so Gaussian elimination does not work (if I am not missing something).
The field is finite, so if I find that f1*v1 + f2*v2 = v3 for some f1, f2, it does not mean that f1, f2 belong to F.
If the vectors are in R^2, then they are automatically dependent, because when we make a matrix of them and reduce it to echelon form, there will be at least one free variable (in this case only one).
If the vectors are in R^3, then you can make a matrix from them, i.e., a 2D array, and take the determinant of that matrix. If the determinant is equal to 0, the vectors are linearly dependent; otherwise not.
If the vectors are in R^4, R^5, and so on, then the appropriate way is to reduce the matrix to echelon form.
For any finite set of M vectors in a space of dimension N, the vectors are linearly independent iff the MxN matrix constructed by stacking them row by row has rank equal to M.
Regarding numerically stable computation in linear algebra, the singular value decomposition is usually the way to go, and there are plenty of implementations available. The key point in this context is that the rank of a matrix equals the number of its nonzero singular values. One must note, however, that due to floating-point approximations, a finite precision must be chosen to decide whether a value is effectively zero.
Your question mentions that your vectors are defined over the integers, and that certainly can be taken advantage of to overcome the finite precision of floating-point computations, but I would not know how. Maybe somebody out there could help us out?
Gaussian elimination does work if you do it inside the finite field.
For the binary case it should be quite simple, because the inverse element is trivial.
For larger finite fields you will need to somehow find inverse elements, which may turn into a separate problem of its own.
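For the binary case this answer mentions, here is a sketch of elimination over bit-packed rows. It assumes the scalars really are just the GF(2) elements 0 and 1 (which is narrower than the asker's setting, where the 'scalars' come from a set F of vectors); all names are mine:

#include <cstdint>
#include <utility>

// Rank of m GF(2) row vectors packed into 32-bit words: Gaussian
// elimination where row addition is XOR and the only nonzero scalar is 1,
// so no inverses are needed.
int rankGF2(uint32_t rows[], int m) {
    int rank = 0;
    for (int bit = 31; bit >= 0 && rank < m; --bit) {
        int pivot = -1;
        for (int r = rank; r < m; ++r)
            if ((rows[r] >> bit) & 1u) { pivot = r; break; }
        if (pivot < 0) continue;                // no row has this bit set
        std::swap(rows[pivot], rows[rank]);
        for (int r = 0; r < m; ++r)             // clear the pivot bit elsewhere
            if (r != rank && ((rows[r] >> bit) & 1u))
                rows[r] ^= rows[rank];
        ++rank;
    }
    return rank;
}

// v1, v2, v3 are linearly independent over GF(2) iff the rank is 3.
bool independentGF2(uint32_t v1, uint32_t v2, uint32_t v3) {
    uint32_t rows[3] = {v1, v2, v3};
    return rankGF2(rows, 3) == 3;
}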