CPLEX CP Scheduling problems: Float Times in Interval Variables - scheduling

I have been carrying out experiments with IBM ILOG CPLEX CP Optimizer using **docplex for Python** in the field of scheduling. However, according to the CPLEX documentation, interval variables must be defined by integer values (start, duration, end).
Thus, my question is whether it is possible to use float values for times in docplex, since in my case the activities' average durations are floating-point numbers.
So far I have not found any source that explains how to work around this.
Thanks in advance.

With CP Optimizer you can emulate decimal decision variables by scaling integer variables, as a workaround:
from docplex.cp.model import CpoModel
mdl = CpoModel(name='buses')
#now suppose we can book a % of buses not only complete buses
scale=100
scalenbbus40 = mdl.integer_var(0,1000,name='scalenbBus40')
scalenbbus30 = mdl.integer_var(0,1000,name='scalenbBus30')
nbbus40= scalenbbus40 / scale
nbbus30= scalenbbus30 / scale
mdl.add(nbbus40*40 + nbbus30*30 >= 310)
mdl.minimize(nbbus40*500 + nbbus30*400)
msol=mdl.solve()
print(msol[scalenbbus40]/scale," buses 40 seats")
print(msol[scalenbbus30]/scale," buses 30 seats")
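The same scaling idea can be applied to the durations in the question. The following is a minimal sketch, assuming a resolution of 0.01 time units is acceptable. The names `SCALE`, `to_ticks`, `from_ticks` and the activity names are hypothetical and not part of docplex. The resulting integers are what one would pass to an interval variable (for example as the `size` argument of `interval_var`), and times read from the solution would be divided by the scale again.

```python
SCALE = 100  # assumption: 1 tick = 0.01 time units


def to_ticks(value, scale=SCALE):
    """Convert a float time to an integer number of ticks (nearest tick)."""
    return int(round(value * scale))


def from_ticks(ticks, scale=SCALE):
    """Convert an integer tick count back to a float time."""
    return ticks / scale


# Hypothetical average durations given as floats
durations = {"cut": 2.35, "weld": 1.5, "paint": 0.75}

# Integer sizes that an interval variable can accept
ticks = {name: to_ticks(d) for name, d in durations.items()}

for name, t in ticks.items():
    print(name, t, from_ticks(t))
```

A larger scale gives finer resolution at the cost of larger integer domains, so the scale is a trade-off rather than a fixed choice.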

Related

Why is there a difference between the sum (stime + utime) of all processes' CPU usage, compared to the overall CPU usage from /proc/stat in Linux?

I need to calculate the overall CPU usage of my Linux device over some time (1-5 seconds) and a list of processes with their respective CPU usage times. The program should be designed and implemented in C++. My assumption would be that the sum of all process CPU times equals the total value for the whole CPU. For now, the CPU I am using is multi-core (2 cores).
According to How to determine CPU and memory consumption from inside a process? it is possible to calculate all "jiffies" available in the system since startup using the values for "cpu" in /proc/stat. If you now sample the values at two points in time and compare the values for user, nice, system and idle at the two time points, you can calculate the average CPU usage in this interval. The formula would be
totalCPUUsage = ((user_aft - user_bef) + (nice_aft - nice_bef) + (system_aft - system_bef)) /
((user_aft - user_bef) + (nice_aft - nice_bef) + (system_aft - system_bef) + (idle_aft - idle_bef)) * 100 %
According to How to calculate the CPU usage of a process by PID in Linux from C? the used jiffies for a single process can be calculated by adding utime and stime from /proc/${PID}/stat (column 14 and 15 in this file). When I now calculate this sum and divide it by the total amount of jiffies in the analyzed interval, I would assume the formula for one process to be
processCPUUsage = ((process_utime_aft - process_utime_bef) + (process_stime_aft - process_stime_bef)) /
((user_aft - user_bef) + (nice_aft - nice_bef) + (system_aft - system_bef) + (idle_aft - idle_bef)) * 100 %
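As a minimal sketch, the two formulas above can be written as follows. It assumes the jiffy counters have already been read into dictionaries. The function names and the sample numbers are hypothetical and only illustrate the arithmetic, not how the values are obtained from /proc.

```python
def total_cpu_usage(before, after):
    """Overall CPU usage in percent between two /proc/stat samples."""
    busy = sum(after[k] - before[k] for k in ("user", "nice", "system"))
    total = busy + (after["idle"] - before["idle"])
    return 100.0 * busy / total if total else 0.0


def process_cpu_usage(proc_before, proc_after, before, after):
    """CPU usage in percent of one process (utime + stime) over the interval."""
    used = ((proc_after["utime"] - proc_before["utime"])
            + (proc_after["stime"] - proc_before["stime"]))
    total = sum(after[k] - before[k] for k in ("user", "nice", "system", "idle"))
    return 100.0 * used / total if total else 0.0


# Made-up sample values (in jiffies) for illustration only
cpu_bef = {"user": 100, "nice": 0, "system": 50, "idle": 850}
cpu_aft = {"user": 160, "nice": 0, "system": 70, "idle": 970}
proc_bef = {"utime": 10, "stime": 5}
proc_aft = {"utime": 40, "stime": 15}

print(total_cpu_usage(cpu_bef, cpu_aft))
print(process_cpu_usage(proc_bef, proc_aft, cpu_bef, cpu_aft))
```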
When I now sum up the values for all processes and compare it to the overall calculated CPU usage, I receive a slightly higher value for the aggregated value most of the time (although the values are quite close for all different CPU loads).
Can anyone explain the reason for that? Are there any CPU resources that are used by more than one process and thus counted twice or more in my accumulation? Or am I simply missing something here? I cannot find any further hint in the Linux man page for the proc file system (https://linux.die.net/man/5/proc) either.
Thanks in advance!

Using an if-statement for div by 0 protection in Modelica

I made a simple model of a heat pump which uses sensor data to calculate its COP.
where COP = heat / power.
Sometimes there is no power, so the simulation fails with a "cannot divide by zero" error. I would like these values to just be zero, so I tried an if-statement: if power (u) = 0 then COP (y) = 0. Somehow this does not work (see time 8 in the plot of COP output and data). Can anyone spot the problem?
Edit: there are still problems at time 8.1.
Edit: heat and power (plot).
To make the computation a bit more generally applicable (e.g. the sign of power can change), take a look at the code below. It could also be a good idea to build a function from it (for the function the noEvent()-statements can be left out)...
model DivNoZeroExample
  parameter Real eps = 1e-6 "Smallest number to be used as divisor";
  Real power = 0.5-time "Some artificial value for power";
  Real heat = 1 "Some artificial value for heat";
  Real COP "To be computed";
equation
  if noEvent(abs(power) < abs(eps)) then
    COP = if noEvent(power>= 0) then heat/eps else heat/(-eps);
  else
    COP = heat/power;
  end if;
end DivNoZeroExample;
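For readers who want to check the logic outside Modelica, here is a minimal Python sketch of the same guarded division. The function name `safe_div` is hypothetical. Like the model above, it clamps the divisor to ±eps rather than returning zero, so the result near zero power is large, not 0.

```python
def safe_div(numerator, denominator, eps=1e-6):
    """Divide, but never by a value smaller in magnitude than eps."""
    if abs(denominator) < abs(eps):
        # Keep the sign of the (tiny) denominator; treat 0 as positive
        return numerator / eps if denominator >= 0 else numerator / (-eps)
    return numerator / denominator


print(safe_div(1.0, 0.5))    # ordinary division
print(safe_div(1.0, 0.0))    # divisor clamped to +eps
print(safe_div(1.0, -1e-9))  # divisor clamped to -eps
```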
Relational operations work a bit differently in Modelica.
If you replace if u>0 by if noEvent(u>0) it should work as you expected.
For details see section 8.5 Events and Synchronization in the Modelica specification https://modelica.org/documents/ModelicaSpec34.pdf

How would you implement this adaptive 'fudge factor' in a scheduler?

I have a scheduler, endlessly executing n actions. Each action is scheduled for x seconds into the future. When an action completes, it is re-scheduled for another x seconds into the future after its previously scheduled time. Every 1s, the scheduler "ticks", executing at most 25 actions which are due to fire. Actions may take a second or so to complete (though this value should be considered variable and unpredictable).
Say that x is 60 seconds. Due to the throttling of at most 25 actions being executed simultaneously, when n grows large, it is conceivable that the scheduler won't have time to execute all n actions within a 60 second window, and actions will be executed later and later as time goes on. This is undesirable, as it'll become true that there are actions to execute on every single tick and this increases load on my system. It's less important to me to keep x exactly constant than it is to keep load down.
So I wish to implement an adaptive "handicap", an automatically-applied fudge factor h, increasing it when a majority of actions are executed "late", and decreasing it (edging it back to its default of zero) when they're all seemingly and consistently on time. The scheduler would then be made to schedule actions for x+h seconds' time, rather than x.
At a high level, how would you approach this? How would you define "a majority of actions are executed 'late'" and how would you represent/detect it in C++03 code?
Better yet, is there an existing well-known approach that objectively "works" here?
To be clear, you are aiming to avoid sustained high load where there are tasks
every tick, rather than aiming to minimise the scheduling delay.
Correspondingly, the metric you should be looking at when considering the fudge
factor is the load, not the lateness.
If you have full knowledge of the system — the number of tasks, their
rescheduling intervals, the distribution of their execution time —
you could in principle exactly solve for a handicap value that would give you
a mean target load when busy, or would, say, only exceed the target load
10% of the time when busy, and so on.
On the other hand, if this information is not available or predictable,
you will need an adaptive approach.
The general theory for this sort of thing is control theory, which can get
quite involved. Broadly though the heuristic is: if the load is less than the
threshold, and we have a positive handicap, reduce the handicap; if the load is
over the threshold, increase the handicap.
The handicap should be proportional, rather than additive: if, for example,
we knew we were consistently 10% overloaded, then we'd be right on target if we
applied a proportional delay of 10% on the scheduling of jobs. That is, we're
looking to apply a handicap factor h such that jobs are scheduled at x*h
seconds' time instead of x. A factor of 1 would correspond to no handicap.
When we're overloaded, but not maximally overloaded, the response then is linear
in the log: log(h) = log(load) - log(load_target). So the simplest method
would be:
load = get_current_load();
if (load>load_target) h = load/load_target;
else h = 1.0;
Unfortunately, there is a maximum measured load, and linearity breaks down
here. The linear model can be extended to incorporate the accumulated
deviation from the target load, and the rate of change of the load.
This corresponds to the proportional-integral-derivative controller.
As this is a noisy environment (there is variation in the action
execution times), it might be wise to shy away from the derivative bit
of this model, and stick with the proportional-integral (PI) part.
When this model is discretized, we get an expression for log(h)
that is proportional to the current (log) overload, plus a term that
captures how badly we've been doing:
load = get_current_load();
deviation = load > load_target ? log(load/load_target) : 0;
accum += p1 * deviation;
log_h = p2 * deviation + accum;
h = log_h < 0 ? 1.0 : exp(log_h);
Except that the problem isn't symmetric: when we're below
the load target, the accumulated error term stays high.
We could work around it by accumulating negative deviations
as well, but limiting the accumulated error to be at least
non-negative, so that a period of legitimately low load
doesn't give us a free pass for later:
load = get_current_load();
if (load > 0) {
    deviation = log(load/load_target);
    accum += p1 * deviation;
    if (accum < 0) accum = 0;
    if (deviation < 0) deviation = 0;
}
else {
    accum = 0;
    deviation = 0;
}
log_h = p2 * deviation + accum;
h = log_h < 0 ? 1.0 : exp(log_h);
The value for p2 will be somewhere (roughly) between 0.5 and 0.9,
to leave some room for the influence of the accumulated error.
A good value for p1 will probably be around 0.3 to 0.5 times
the reciprocal of the lag time, the number of steps it takes for a change
in h to present itself as a change in load. This can be estimated
by the mean rescheduling time of the actions.
You can play around with these parameters to get the sort of
response you'd like, or you can make a more faithful mathematical
model of your scheduling problem and then do maths to it!
The parameters themselves can also be modified adaptively over
time, based on the observed response to changes in load.
(Warning, I haven't actually tried these fragments in a mock scheduler!)
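To make the last fragment easier to experiment with, here is a minimal Python sketch that wraps it in a small class. The class name `HandicapController` and the parameter values in the usage lines are hypothetical and chosen only for illustration. The update logic follows the fragment: accumulate the log deviation, clamp the accumulator at zero, and reset everything when the load is zero.

```python
import math


class HandicapController:
    """PI-style controller returning a proportional handicap factor h >= 1."""

    def __init__(self, load_target, p1, p2):
        self.load_target = load_target
        self.p1 = p1  # weight of the accumulated (integral) term
        self.p2 = p2  # weight of the current (proportional) term
        self.accum = 0.0

    def update(self, load):
        if load > 0:
            deviation = math.log(load / self.load_target)
            # Accumulate both signs, but never let the sum go negative
            self.accum = max(0.0, self.accum + self.p1 * deviation)
            deviation = max(0.0, deviation)
        else:
            self.accum = 0.0
            deviation = 0.0
        log_h = self.p2 * deviation + self.accum
        return 1.0 if log_h < 0 else math.exp(log_h)


# Illustrative values only: target of 20 actions per tick
ctrl = HandicapController(load_target=20.0, p1=0.1, p2=0.7)
for load in (20.0, 40.0, 40.0, 10.0):
    print(load, ctrl.update(load))
```

The scheduler would then reschedule each action x*h seconds ahead, calling `update` once per tick with the measured load.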

Want to change a value continuously from min to max to min in a loop as a Sine curve

I am working on a game where I need an algorithm to vary a value in a loop. I have implemented the algorithm, but it's not working the way I want. Here's what I want and what I have already implemented:
Given :
a commodity whose price I want to circulate (from min to max to min again and continuously in a loop)
I am using cocos2d-x (C++) where I have a scheduler which runs a function at a given interval say SCHEDULE_INTERVAL
MIN_PRICE and MAX_PRICE of the commodity
currentPrice
Time duration which it will take to complete one cycle (min-max-min)
Current Implementation :
SCHEDULE_INTERVAL = 0.3 (sec) (so the function is running every 0.3 secs)
counter = 0;
timeDuration = time to complete one cycle
function
{
counter++;
_amplitude = (maxPrice - minPrice)/2;
_midValue = (maxPrice + minPrice)/2;
currentPrice = _midValue + _amplitude * sin (2*PI*counter/timeDuration)
}
Why I am using a sine wave: because at the peaks I want the transitions to be slow.
Problem: for some reason it's not behaving the way I want.
I want to continuously change currentPrice from minPrice to maxPrice to minPrice within timeDuration, with the loop running at SCHEDULE_INTERVAL.
Please suggest any solutions.
Thanks :)
EDIT :
what's not working in the above implementation is that the values are not changing according to the 'timeDuration' variable
If the pseudocode you posted accurately mirrors the expressions you use in real code, you probably want to change the argument of sin to this:
2 * PI * (counter * SCHEDULE_INTERVAL) / timeDuration
counter is the number of executions, while timeDuration is (I presume) the desired length in seconds.
In other words, your units don't match - it's always worthwhile to perform a dimensional analysis when formulae don't work.
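A minimal Python sketch of the corrected formula, assuming the counter counts scheduler calls and `time_duration` is in seconds. The function name `current_price` and the sample prices are hypothetical. Note that a plain sine starts at the mid value, not at the minimum.

```python
import math

SCHEDULE_INTERVAL = 0.3  # seconds between scheduler calls


def current_price(counter, min_price, max_price, time_duration,
                  interval=SCHEDULE_INTERVAL):
    """Price after `counter` scheduler calls, one full cycle per time_duration."""
    amplitude = (max_price - min_price) / 2
    mid_value = (max_price + min_price) / 2
    elapsed = counter * interval  # convert call count to seconds
    return mid_value + amplitude * math.sin(2 * math.pi * elapsed / time_duration)


# With a 6 second cycle, 20 calls at 0.3 s complete one period
for counter in (0, 5, 10, 15, 20):
    print(counter, current_price(counter, 10.0, 20.0, 6.0))
```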

Fast percentile in C++ - speed more important than precision

This is a follow-up to Fast percentile in C++
I have a sorted array of 365 daily cashflows (xDailyCashflowsDistro) which I randomly sample 365 times to get a generated yearly cashflow. Generating is carried out by
1/ picking a random probability in the [0,1] interval
2/ converting this probability to an index in the [0,364] interval
3/ determining what daily cashflow corresponds to this probability by using the index and some linear approximation.
and summing 365 generated daily cashflows. Following the previously mentioned thread, my code precalculates the differences of sorted daily cashflows (xDailyCashflowDiffs) where
xDailyCashflowDiffs[i] = xDailyCashflowsDistro[i+1] - xDailyCashflowsDistro[i]
and thus the whole code looks like
double _dIdxConverter = ((double)(365 - 1)) / (double)(RAND_MAX - 1);
for ( unsigned int xIdx = 0; xIdx < _xCount; xIdx++ )
{
    double generatedVal = 0.0;
    for ( unsigned int xDayIdx = 0; xDayIdx < 365; xDayIdx ++ )
    {
        double dIdx = (double)fastRand()* _dIdxConverter;
        long iIdx1 = (unsigned long)dIdx;
        double dFloor = (double)iIdx1;
        generatedVal += xDailyCashflowsDistro[iIdx1] + xDailyCashflowDiffs[iIdx1] *(dIdx - dFloor);
    }
    results.push_back(generatedVal) ;
}
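For reference, the interpolation step can be sketched in Python as below. This is a minimal illustration of the lookup only, not a faster implementation. The function names and the three-element sample array are hypothetical, and the index clamp is an added assumption so that a probability of exactly 1.0 does not read past the end of the differences array.

```python
def build_diffs(distro):
    """Differences between consecutive sorted values."""
    return [distro[i + 1] - distro[i] for i in range(len(distro) - 1)]


def interpolate(u, distro, diffs):
    """Map a probability u in [0, 1] to a value by linear interpolation."""
    d_idx = u * (len(distro) - 1)
    i = min(int(d_idx), len(diffs) - 1)  # clamp so u == 1.0 stays in range
    return distro[i] + diffs[i] * (d_idx - i)


# Tiny sorted sample standing in for the 365 daily cashflows
distro = [0.0, 10.0, 30.0]
diffs = build_diffs(distro)

for u in (0.0, 0.25, 0.75, 1.0):
    print(u, interpolate(u, distro, diffs))
```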
_xCount (the number of simulations) is 1K+, usually 10K.
The problem:
This simulation is being carried out 15M times (compared to 100K when the first thread was written) at the moment, and it takes ~10 minutes on a 3.4GHz machine. Due to the nature of problem, this 15M is unlikely to be significantly lowered in the future, only increased. Having used VTune Analyzer, I am being told that the last but one line (generatedVal += ...) generates 80% of runtime. And my question is why and how I can work with that.
Things I have tried:
1/ getting rid of the (dIdx - dFloor) part to see whether double difference and multiplication is the main culprit - runtime dropped by a couple of percent
2/ declaring xDailyCashflowsDistro and xDailyCashflowDiffs as __restrict so as to prevent the compiler from thinking they depend on each other - no change
3/ tried using 16 days (as opposed to 365) to see whether it is cache misses that drag my performance - not a slight change
4/ tried using floats as opposed to doubles - no change
5/ compiling with different /fp: - no change
6/ compiling as x64 - has effect on the double <-> ulong conversions, but the line in question is unaffected
What I am willing to sacrifice is resolution - I do not care whether the generatedVal is 100010.1 or 100020.0 at the end if the speed gain is substantial.
EDIT:
The daily/yearly cashflows are related to the whole portfolio. I could divide all daily cashflows by the portfolio size and would thus (at a 99.99% confidence level) ensure that daily cashflows/pflio_size will not fall outside the [-1000,+1000] interval. In this case, though, I would need precision to the hundredths.
Perhaps you could turn your piecewise linear function into a piecewise-linear "histogram" of its values. The number you're sampling appears to be the sum of 365 samples from that histogram. What you're doing is a not-particularly-fast way to sample from the sum of 365 samples from that histogram.
You might try computing a Fourier (or wavelet or similar) transform, keeping only the first few terms, raising it to the 365th power, and computing the inverse transform. You won't get a probability distribution in the end, but there shouldn't be "too much" mass below 0 or above 1 and the total mass shouldn't be "too different" from 1 with this technique. (I don't know what your data looks like; this technique may well be unworkable for good mathematical reasons.)