Optimization run assistance

Optimization run assistance - c++

I am running the optimisation of two sets of data against each other and am after some assistance as to looking up settings of the run based on the calculated results. I'll explain....
I run 2 data lines against each other (think graph lines) - Line A and Line B. These lines have crossing points - upward and downward based on the direction of each line.e.g. Line A is going up and Line B is going down is an 'Upwards cross' and Line A going down and Line B going up is a 'Downward cross'.The program calculates financial analysis.
I analyze the crossing points and gain a resultant 'Rank' from the analysis based on a set of rules. The rank is a single integer.
Line A has a number of settings for the optimisation run e.g. Window 1 from a value of 10 to 20 and window 2 at a value of 30 to 40. Line B also has settings.
When I run the optimisation I iterate through the parameters available for each line and calculate the rank. The result of the optimisation run is a list of the ranks which is the size of the number of permutations avaliable.
So my question is this:
What is the best way to look up the line settings from the calculated rank using a position (index) in the rank list. The optimisation settings used to create the run will be stored for that rank run and can be used for the look-up.
I also will be adding additional parameters in the future to the system for the line so I want the program to take into account additional future line settings without affecting any rank files created previous to adding the new parameter.
In addition to that I want to be able to find out an index based on a particular setting included in the optimisation run (the reverse look-up of the previous method).
I want to avoid versioning for backward compatability if at all possible so that the lookup algorithm will be self-sufficient.
Is a hash table suitable for this purpose or do you have any implementation techniques that would fit better? Do you have any examples of this type of operation in action in C++?
Thanks,
Chris.

If I understand correctly, you have a bunch of associated data (settings + rank), on which you would like to be able to perform lookups with different key types. If so, then Boost.MultiIndex sounds like what you're looking for.

Related

How to automatically feed a cell value from a range of values, based on its matching condition with other cell value

I'm making a time-spending tracker based on the work I do every hour of the day.
Now, suppose I have 28 types of work listed in my tracker (which I also have to increase from time to time), and I have about 8 significance values that I have decided to relate to these 28 types of work, predefined.
I want that, as soon as I enter a type of work in cell 1 - I want the adjacent cell 2 to get automatically populated with a significance value (from a range of 8 values) that is pre-definitely set by me.
Every time I input a new or old occurrence of a type of work, the adjacent cell should automatically get matched with its relevant significance value & automatically get populated in real-time.
I know how to do it using IF, IFS, and IF_OR conditions, but I feel that based on the ever-expanding types of work & significance values, the above formulas will be very big, complicated, and repetitive in the future. I feel there's a more efficient way to achieve it. Also, I don't want it to be selected from a drop-down list.
Guys, please help me out with the most efficient way to handle this. TUIA :)
Also, I've added a snapshot and a sample sheet describing the problem.
Sample sheet

XLOOKUP() may work. Try-
=XLOOKUP(D2,A2:A,B2:B)
Or FILTER() function like-
=FILTER(B2:B,A2:A=D2)

You can use this formula for a whole column:
=INDEX(IFERROR(VLOOKUP(C14:C,A2:B9,2,0)))
Adapt the ranges to your actual tables in order to include in the second argument all the potential values and their significances

This is the formula, that worked for me (for anybody's reference):
I created another reference sheet, stating the types of work & their significance. From that sheet, I'm using either vlookup, filter, xlookup.Using gforms for inputting my data.
=ARRAYFORMULA(IFS(ROW(D:D)=1,"Significance",A:A="","",TRUE,VLOOKUP(D:D,Reference!$A:$B,2,0)))

Check reduced cost of new column in column generation

I'm working on a decomposition model using column generation. When generate a new column, I'd like to check it's reduced cost using CPLEX function. I did read a post here, however the sign of the reduced cost obtain by this method is always negative (in my case), since the new columns are stated as nonbasis (according to the accepted answer).
Edit 1: So the purpose of checking these values is that I have trouble running my code. My master problem is to maximize an objective function. The pricing problems keep generating new column with positive reduce cost, however, after added new columns to the master and reoptimize, the objective function didn't improve at all, and this cycle keep going on for 30 minutes. So I decide to look into each detail to see what is the problem.

Use the function "mod" in the instructions "if" and "select case"

I wrote a little code in Fortran. But the code doesn't behave as I thought, and I can figure out where is the problem.
I will not put the code here because it has 1200 lines but here its philosophy:
I create a 3D grid represented by a four dimensional table (I stock a vector of 2 elements on each point of the grid, corresponding at the nature of the site and who is occupying the site). This grid represents what we call a crystal (where atoms can be found periodically)
When this grid is constructed, the code scans each point of this grid and it looks to the neighboring sites to count the different type of atoms or the vacancies.
For this last point, I use a triple imbricated loop which permit to explore the different sites and I check the different neighboring site using either the if or the select case instructions. As I want my grid to be periodic, I have the function mod in the argument of the if or the select case.
The problem is sometimes, It found a different element in a neighboring site that the actual element in this specific neighboring site. As an example:
In the two ouput files where all the coordinates are written with the
element type I have grid(0,0,1)=-1 (which correspond to a empty site).
But while the code is looking to the neighboring sites of grdi(0,0,1) It tells that there is actually an element indexed 2 in grid(0,0,1).
I look carefully to the block in the triple implemented loop, but it seems fine.
I would like to know if anyone has already meet this kind of problem, or know if there is some problems using mod in a if or select case argument ?
If some of you want to look closer, I can send you the code, with some explanations.

Arrays are usually dimensioned as:
REAL(KIND=8),DIMENSION(0:N) ::A
or
REAL(KIND=8),DIMENSION(N) :: A
In the later example, they are assumed to start at 1.
You could also go (-N:N) or (10:191)
If you use the compiler switch '-check bounds' or ;-check all' you will see if you are going outside the array/etc. This is not an uncommon thing to get hosed up, but the compiler will abort quickly when the dimension is outside.
Once it works then removed the -check bounds and/or -check all.

Thanks for your consideration francescalus and haraldkl.
It was not related to the dimension of arrays Holmz, but thank you to try to help
It seems I finally succeed to fix it. I will post an over answer If I fully understand why it was not working properly.
Apparently, it was related to the combination of a different argument order in a call procedure and the subroutine header + a declaration in the subroutine with intent(inout).
It was like the intent(inout) was masking the problem. But It a bit strange for me.
Some explanations about the code :
As I said, the code create a 3D grid where each intersection of the 3D grid correspond to a crystallographic site. I attribute a value at each site -1 for an empty site, 1 for a crystal atom (0 if there is a vacancy instead of a crystal atom), 2,3,4,5 for different impurities. Actually, the empty sites and the sites which received crystal atoms are not of the same type, that's why an empty site and a vacancy are distinguished. The impurities can only occupied the empty site and are forbidden to occupied a crystal site.
The aim of the code is to explore the configurational space of the system, in other words all the possible distribution we can obtained with the different elements. To do so I start from a initial configuration and I choose randomly to site (respecting the rules of occupation) and I virtually switch them. I calculate the energy of the old an new configurations, if the new has a lower energy I keep it, if not, i keep the old one. The calculus of the energy is based on the knowledge of the environment of each vacancies and impurities, so we need to know their neighbors. And I repeat the all procedure again and again to converge to the most stable (so the most probable) configuration.
The next step is to include the temperature effect, and to add the second type of empty sites.
Have a nice day,
M.

How to report a list in Behaviorspace NetLogo?

I am running a NetLogo model in BehaviorSpace each time varying number of runs. I have turtle-breed pigs, and they accumulate a table with patch-types as keys and number of visits to each patch-type as values.
In the end I calculate a list of mean number of visits from all pigs. The list has the same length as long as the original table has the same number of keys (number of patch-types). I would like to export this mean number of visits to each patch-type with BehaviorSpace.
Perhaps I could write a separate csv file (tried - creates many files, so lots of work later on putting them together). But I would rather have everything in the same file output after a run.
I could make a global variable for each patch-type but this seems crude and wrong. Especially if I upload a different patch configuration.
I tried just exporting the list, but then in Excel I see it with brackets e.g. [49 0 31.5 76 7 0].
So my question Q1: is there a proper way to export a list of values so that in BehaviorSpace table output csv there is a column for each value?
Q2: Or perhaps there is an example of how to output a single csv that looks exactly as I want it from BehaviorSpace?
PS: In my case the patch types are costs. And I might change those in the future and rerun everything. Ideally I would like to have as output: a graph of costs vs frequency of visits.
Thanks

If the lists are a fixed length that doesn't vary from run to run, you can get the items into separate columns by using one metric for each item. So in your BehaviorSpace experiment definition, instead of putting mylist, put item 0 mylist and item 1 mylist and so on.
If the lists aren't always the same length, you're out of luck. BehaviorSpace isn't flexible that way. You would have to write a separate program (in the programming language of your choice, perhaps NetLogo itself, perhaps an Excel macro, perhaps something else) to postprocess the BehaviorSpace output and make it look how you want.

How to normalize sequence of numbers?

I am working user behavior project. Based on user interaction I have got some data. There is nice sequence which smoothly increases and decreases over the time. But there are little discrepancies, which are very bad. Please refer to graph below:
You can also find data here:
2.0789 2.09604 2.11472 2.13414 2.15609 2.17776 2.2021 2.22722 2.25019 2.27304 2.29724 2.31991 2.34285 2.36569 2.38682 2.40634 2.42068 2.43947 2.45099 2.46564 2.48385 2.49747 2.49031 2.51458 2.5149 2.52632 2.54689 2.56077 2.57821 2.57877 2.59104 2.57625 2.55987 2.5694 2.56244 2.56599 2.54696 2.52479 2.50345 2.48306 2.50934 2.4512 2.43586 2.40664 2.38721 2.3816 2.36415 2.33408 2.31225 2.28801 2.26583 2.24054 2.2135 2.19678 2.16366 2.13945 2.11102 2.08389 2.05533 2.02899 2.00373 1.9752 1.94862 1.91982 1.89125 1.86307 1.83539 1.80641 1.77946 1.75333 1.72765 1.70417 1.68106 1.65971 1.64032 1.62386 1.6034 1.5829 1.56022 1.54167 1.53141 1.52329 1.51128 1.52125 1.51127 1.50753 1.51494 1.51777 1.55563 1.56948 1.57866 1.60095 1.61939 1.64399 1.67643 1.70784 1.74259 1.7815 1.81939 1.84942 1.87731
1.89895 1.91676 1.92987
I would want to smooth out this sequence. The technique should be able to eliminate numbers with characteristic of X and Y, i.e. error in mono-increasing or mono-decreasing.
If not eliminate, technique should be able to shift them so that series is not affected by errors.
What I have tried and failed:
I tried to test difference between values. In some special cases it works, but for sequence as presented in this the distance between numbers is not such that I can cut out errors
I tried applying a counter, which is some X, then only change is accepted otherwise point is mapped to previous point only. Here I have great trouble deciding on value of X, because this is based on user-interaction, I am not really controller of it. If user interaction is such that its plot would be a zigzag pattern, I am ending up with 'no user movement data detected at all' situation.
Please share the techniques that you are aware of.
PS: Data made available in this example is a particular case. There is no typical pattern in which numbers are going to occure, but we expect some range to be continuous with all the examples. Solution I am seeking is generic.

I do not know how much effort you want to involve in this problem but if you want theoretical guaranties,
topological persistence seems well adapted to your problem imho.
Basically with that method, you can filtrate local maximum/minimum by fixing a scale
and there are theoritical proofs that says that if you sampling is
close from your function, then you extracts correct number of maximums with persistence.
You can see these slides (mainly pages 7-9 to get the idea) to get an idea of the method.
Basically, if you take your points as a landscape and imagine a watershed starting from maximum height and decreasing, you have some picks.
Every pick has a time where it is born which is the time where it becomes emerged and a time where it dies which is when it merges with an higher pick. Now a persistence diagram pictures a point for every pick where its x/y coordinates are its time of birth/death (by assumption the first pick does not die and is not shown).
If a pick is a global maximal, then it will be further from the diagonal in the persistence diagram than a local maximum pick. To remove local maximums you have to remove picks close to the diagonal. There are fours local maximums in your example as you can see with the persistence diagram of your data (thanks for providing the data btw) and two global ones (the first pick is not pictured in a persistence diagram):
If you noise your data like that :
You will still get a very decent persistence diagram that will allow you to filter local maximum as you want :
Please ask if you want more details or references.

Since you can not decide on a cut off frequency, and not even on the filter you want to use, I would implement several, and let the user set the parameters.
The first thing that I thought of is running average, and you can see that there are so many things to set, to get different outputs.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js