Constraint Programming: Scheduling with multiple workers - scheduling

I'm new to constraint programming. I imagine this is an easy problem but I can't wrap my head around it. Here's the problem:
We have multiple machines (N), each with a limited resource (let's say memory, and it can be the same for all machines.)
We have T tasks, each with a duration and each requiring some amount of the resource.
A machine can work on multiple tasks at the same time as long as its resource isn't exceeded.
A task cannot be split among machines and it has to be done in one shot (i.e. no pausing).
How do we assign the tasks to the machines to minimize the end-time or the number of machines used?
It seems like I should be able to achieve this with the cumulative predicate but it seems like it's limited to scheduling one set of tasks to one worker with a limited global resource rather than a variable number of workers.
I'm just learning CP & MiniZinc. Any ideas on how to generalize cumulative? Alternatively, is there an existing MiniZinc model I can understand that does something like this (or close enough?)
Thanks,
PS: I don't have any concrete data since this is a hypothetical/learning exercise for the most part. Imagine you have 10 machines and 10 tasks with various durations (in hours): 2,4,6,5,2,1,4,6,3,2,12 with memory requirements (GBs): 1,2,4,2,1,8,12,4,1,10. Each machine has 32 GBs of ram.

Here's a model that seems to be correct. However, it don't use "cumulative" at all since I wanted to visualize as much as possible (see below).
The main idea is that - for each time step, 1..max_step - each machine must have only tasks <= 32 GB. The foreach loop checks - for each machine - that the sum of memory of each task that is active at that time on that machine is below 32GB.
The output section shows the solution in different ways. See comments below.
The model is a slightly edited version of http://hakank.org/minizinc/scheduling_with_multiple_workers.mzn
Update: I should also mention that this model allows for different size of RAM on the machines, e.g. some machines have 64GB and some 32GB. This is demonstrated - but commented - in the model on my site. Since the model use value_precede_chain/2 - which ensures that the machines are used in order - it's recommended that the machines are ordered of decreasing size of RAM (so the bigger machines are used first).
(Also, I've modeled the problem in Picat: http://hakank.org/picat/scheduling_with_multiple_workers.pi )
include "globals.mzn";
int: num_tasks = 10;
int: num_machines = 10;
array[1..num_tasks] of int: duration = [2,4,6,5,2,1,4,6,3,12]; % duration of tasks
array[1..num_tasks] of int: memory = [1,2,4,2,1,8,12,4,1,10]; % RAM requirements (GB)
int: max_time = 30; % max allowed time
% RAM for each machine (GB)
array[1..num_machines] of int: machines_memory = [32 | i in 1..num_machines];
% decision variables
array[1..num_tasks] of var 1..max_time: start_time; % start time for each task
array[1..num_tasks] of var 1..max_time: end_time; % end time for each task
array[1..num_tasks] of var 1..num_machines: machine; % which machine to use
array[1..num_machines,1..max_time] of var 0..max(machines_memory): machine_used_ram;
var 1..num_machines: machines_used = max(machine);
var 1..max_time: last_time = max(end_time);
% solve :: int_search(start_time ++ machine ++ array1d(machine_used_ram), first_fail, indomain_split, complete) minimize last_time;
solve :: int_search(start_time ++ machine ++ array1d(machine_used_ram), first_fail, indomain_split, complete) minimize machines_used;
constraint
forall(t in 1..num_tasks) (
end_time[t] = start_time[t] + duration[t] -1
)
% /\ cumulative(start_time,duration,[1 | i in 1..num_tasks],machines_used)
/\
forall(m in 1..num_machines) (
% check all the times when a machine is used
forall(tt in 1..max_time) (
machine_used_ram[m,tt] = sum([memory[t]*(machine[t]=m)*(tt in start_time[t]..end_time[t]) | t in 1..num_tasks]) /\
machine_used_ram[m,tt] <= machines_memory[m]
% sum([memory[t]*(machine[t]=m)*(tt in start_time[t]..end_time[t]) | t in 1..num_tasks]) <= machines_memory[m]
)
)
% ensure that machine m is used before machine m+1 (for machine_used)
/\ value_precede_chain([i | i in 1..num_machines],machine)
;
output [
"start_time: \(start_time)\n",
"durations : \(duration)\n",
"end_time : \(end_time)\n",
"memory : \(memory)\n",
"last_time : \(last_time)\n",
"machine : \(machine)\n",
"machines_used: \(machines_used)\n",
]
++
[ "Machine memory per time:\n "]
++
[ show_int(3,tt) | tt in 1..max_time ]
++
[
if tt = 1 then "\n" ++ "M" ++ show_int(2, m) ++ ": " else " " endif ++
show_int(2,machine_used_ram[m,tt])
| m in 1..num_machines, tt in 1..max_time
]
++ ["\n\nTime / task: machine(task's memory)\n Task "] ++
[
show_int(7,t)
| t in 1..num_tasks
]
++
[
if t = 1 then "\nTime " ++ show_int(2,tt) ++ " " else " " endif ++
if tt in fix(start_time[t])..fix(end_time[t]) then
show_int(2,fix(machine[t])) ++ "(" ++ show_int(2,memory[t]) ++ ")"
else
" "
endif
| tt in 1..fix(last_time), t in 1..num_tasks
]
;
The model has two "modes": one to minimize the time ("minimize last_time") and one to minimize the number of machine used ("minimize machines_used").
The result of minimizing the time is:
start_time: [11, 8, 3, 8, 11, 8, 9, 7, 8, 1]
durations : [2, 4, 6, 5, 2, 1, 4, 6, 3, 12]
end_time : [12, 11, 8, 12, 12, 8, 12, 12, 10, 12]
memory : [1, 2, 4, 2, 1, 8, 12, 4, 1, 10]
last_time : 12
machine : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
machines_used: 1
Machine memory per time:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
M 1: 10 10 14 14 14 14 18 31 31 31 32 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 3: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 4: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 5: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 6: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 7: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 8: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M 9: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
M10: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Time / task: machine(task's memory)
Task 1 2 3 4 5 6 7 8 9 10
Time 1 1(10)
Time 2 1(10)
Time 3 1( 4) 1(10)
Time 4 1( 4) 1(10)
Time 5 1( 4) 1(10)
Time 6 1( 4) 1(10)
Time 7 1( 4) 1( 4) 1(10)
Time 8 1( 2) 1( 4) 1( 2) 1( 8) 1( 4) 1( 1) 1(10)
Time 9 1( 2) 1( 2) 1(12) 1( 4) 1( 1) 1(10)
Time 10 1( 2) 1( 2) 1(12) 1( 4) 1( 1) 1(10)
Time 11 1( 1) 1( 2) 1( 2) 1( 1) 1(12) 1( 4) 1(10)
Time 12 1( 1) 1( 2) 1( 1) 1(12) 1( 4) 1(10)
----------
==========
The first part "Machine memory per time" shows how loaded each machine (1..10) is per time step (1..30).
The second part "Time / task: machine(task's memory)" shows for each time step (rows) and tasks (columns) which machine that is used and the memory of the task in the form "machine(memory of the machine)".
The second way of using the model, to minimize the number of used machines, give this result (edited to save space). I.e. one machine is enough for handling all the tasks during the allowed time (1..22 time steps).
start_time: [19, 11, 3, 9, 20, 22, 13, 7, 17, 1]
durations : [2, 4, 6, 5, 2, 1, 4, 6, 3, 12]
end_time : [20, 14, 8, 13, 21, 22, 16, 12, 19, 12]
memory : [1, 2, 4, 2, 1, 8, 12, 4, 1, 10]
last_time : 22
machine : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
machines_used: 1
Machine memory per time:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
M 1: 10 10 14 14 14 14 18 18 16 16 18 18 16 14 12 12 1 1 2 2 1 8 0 0 0 0 0 0 0 0
M 2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
....
Time / task: machine(task's memory)
Task 1 2 3 4 5 6 7 8 9 10
Time 1 1(10)
Time 2 1(10)
Time 3 1( 4) 1(10)
Time 4 1( 4) 1(10)
.....
----------
==========

This is an old question, but here is a CP Optimizer model for this problem (in Python).
In this version, I minimize a lexicographical objective : first minimize the makespan (optimal value is 12), then given this makespan, minimize the number of used machines (here, one can execute all the tasks on a single machine and still finish at 12).
DUR = [2,4,6,5,2,1,4,6,3,12]
MEM = [1,2,4,2,1,8,12,4,1,10]
CAP = 32
TASKS = range(len(DUR))
MACHINES = range(10)
from docplex.cp.model import *
model = CpoModel()
# Decision variables: tasks and alloc
task = [interval_var(size=DUR[i]) for i in TASKS]
alloc = [ [interval_var(optional=True) for j in MACHINES] for i in TASKS]
# Objective terms
makespan = max(end_of(task[i]) for i in TASKS)
nmachines = sum(max(presence_of(alloc[i][j]) for i in TASKS) for j in MACHINES)
# Objective: minimize makespan, then number of machine used
model.add(minimize_static_lex([makespan, nmachines]))
# Allocation of tasks to machines
model.add([alternative(task[i], [alloc[i][j] for j in MACHINES]) for i in TASKS])
# Machine capacity
model.add([sum(pulse(alloc[i][j],MEM[i]) for i in TASKS) <= CAP for j in MACHINES])
# Resolution
sol = model.solve(trace_log=True)
# Display solution
for i in TASKS:
for j in MACHINES:
s = sol.get_var_solution(alloc[i][j])
if s.is_present():
print('Task ' + str(i) + ' scheduled on machine ' + str(j) + ' on [' + str(s.get_start()) + ',' + str(s.get_end()) + ')')
And the result is:
! ----------------------------------------------------------------------------
! Minimization problem - 110 variables, 20 constraints
! Initial process time : 0.00s (0.00s extraction + 0.00s propagation)
! . Log search space : 66.4 (before), 66.4 (after)
! . Memory usage : 897.0 kB (before), 897.0 kB (after)
! Using parallel search with 8 workers.
! ----------------------------------------------------------------------------
! Best Branches Non-fixed W Branch decision
0 110 -
+ New bound is 12; 0
* 12 111 0.01s 1 (gap is 100.0% # crit. 2 of 2)
New objective is 12; 7
* 12 131 0.01s 1 (gap is 100.0% # crit. 2 of 2)
New objective is 12; 6
* 12 151 0.01s 1 (gap is 100.0% # crit. 2 of 2)
New objective is 12; 5
* 12 171 0.01s 1 (gap is 100.0% # crit. 2 of 2)
New objective is 12; 4
* 12 191 0.01s 1 (gap is 100.0% # crit. 2 of 2)
New objective is 12; 3
* 12 211 0.01s 1 (gap is 100.0% # crit. 2 of 2)
New objective is 12; 2
* 12 231 0.01s 1 (gap is 100.0% # crit. 2 of 2)
New objective is 12; 1
! ----------------------------------------------------------------------------
! Search completed, 7 solutions found.
! Best objective : 12; 1 (optimal)
! Best bound : 12; 1
! ----------------------------------------------------------------------------
! Number of branches : 1318
! Number of fails : 40
! Total memory usage : 6.7 MB (6.6 MB CP Optimizer + 0.1 MB Concert)
! Time spent in solve : 0.00s (0.00s engine + 0.00s extraction)
! Search speed (br. / s) : 131800.0
! ----------------------------------------------------------------------------
Task 0 scheduled on machine 4 on [4,6)
Task 1 scheduled on machine 4 on [4,8)
Task 2 scheduled on machine 4 on [1,7)
Task 3 scheduled on machine 4 on [0,5)
Task 4 scheduled on machine 4 on [4,6)
Task 5 scheduled on machine 4 on [0,1)
Task 6 scheduled on machine 4 on [0,4)
Task 7 scheduled on machine 4 on [1,7)
Task 8 scheduled on machine 4 on [4,7)
Task 9 scheduled on machine 4 on [0,12)

Related

How to store a complete path that a priority queue follows while performing A* search

I have been give a problem in which I am provided with user-entered matrix (rows and columns). User will also provide Start State (row and column) and the Goal State.
The job is to use A* search to find the path from the start node to the goal node.
A sample matrix is provided below,
0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 0 0 0 0 G
0 0 0 1 1 0 0 0 1 1
0 0 0 0 0 0 0 0 0 0
0 1 1 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 1 1 0 0 0
0 0 0 0 0 1 1 0 0 0
0 1 0 1 0 1 1 0 0 0
0 1 0 1 0 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0
S 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 1 0 0 0
0 0 0 0 0 1 1 0 0 0
where "S" is the start state, and "G" is the goal state. 0 are the states, in which you can move to and 1 are the obstacles in the grid, you can't move to them.
There are 3 actions allowed.
Up one cell (cost is 1)
right one cell (cost is 3)
diagonally up towards the right (cost is 2)
To solve this problem, I used Manhattan's Distance as my heuristic function and calculated the heuristic values for all my states.... They looked something like this (for the grid specified above)
10 9 8 7 6 5 4 3 2 1
9 8 7 6 5 4 3 2 1 0
10 9 8 7 6 5 4 3 2 1
11 10 9 8 7 6 5 4 3 2
12 11 10 9 8 7 6 5 4 3
13 12 11 10 9 8 7 6 5 4
14 13 12 11 10 9 8 7 6 5
15 14 13 12 11 10 9 8 7 6
16 15 14 13 12 11 10 9 8 7
17 16 15 14 13 12 11 10 9 8
18 17 16 15 14 13 12 11 10 9
19 18 17 16 15 14 13 12 11 10
20 19 18 17 16 15 14 13 12 11
21 20 19 18 17 16 15 14 13 12
Now, this is my code for A* search
void A_search()
{
priority_queue<node, vector<node>, CompareCost>q; // Priority Queue is used.
// node contains 4 elements... 1. "Heuristic" value, 2. Index row, 3. Index Col, 4. Actual Cost until this point
q.push(node(heuristic[srow][scol], srow, scol, 0)); // srow, scol is start state. 0 is Actual Cost
while (!q.empty())
{
node temp = q.top();
path_cost = temp.cost; // path_cost is global variable, which stores actual cost until this point
path[temp.i][temp.j] = true; // Boolean array, which tells the path followed so far.
q.pop();
if (temp.i == grow && temp.j == gcol) // If goal state is found, we break out of the loop
break;
if (temp.i > 0) // Checking for rows above the current state.
{
if (arr[temp.i - 1][temp.j] != 1) // Checking if index above current state is obstacle or not
{
q.push(node(heuristic[temp.i - 1][temp.j] + (temp.cost+1), temp.i - 1, temp.j, temp.cost + 1)); // pushing the above index into queue
}
if (temp.j - 1 < cols)
{
if (arr[temp.i - 1][temp.j + 1] != 1) // Diagonal Index checking
{
q.push(node(heuristic[temp.i - 1][temp.j + 1] + (temp.cost + 2), temp.i - 1, temp.j + 1, temp.cost + 2));
}
}
}
if (temp.j - 1 < cols) // Horizontal Index... Checking if column has exceeded the total cols or not
{
if (arr[temp.i][temp.j + 1] != 1) // Obstacle check for horizontal index
{
q.push(node(heuristic[temp.i][temp.j + 1] + (temp.cost + 3), temp.i, temp.j + 1, temp.cost + 3));
}
}
}
}
And this is the result I get after running this algorithm (Please note that # represents the path taken by the program... I am simply using a boolean 2D array to check which nodes are being visited by Priority Queue. For those indexes only, I am printing # and rest of the grid remains the same)
0 0 0 0 0 # # # # #
# # # 1 1 # # # # G
# # # 1 1 # # # 1 1
# # 0 # # # # # 0 0
# 1 1 # # # # 0 0 1
# # # # # # # 0 1 0
# # # # # 1 1 0 0 0
# # # # 0 1 1 0 0 0
# 1 # 1 0 1 1 0 0 0
# 1 # 1 0 1 1 0 0 0
# # 0 0 0 0 0 0 0 0
S 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 1 0 0 0
0 0 0 0 0 1 1 0 0 0
Path Cost: 21
Now, the problem, as evident from the output, is that it is storing every index that gets visited (because heuristic values have very low difference for all the indexes, that is why, almost every node is being visited.. However, ultimately, A* search finds the best path, and that can be seen from "Path Cost: 21" which is the actual cost of the optimal path)
I believe that my algorithm is correct, considering the path cost but what I want now is store also the path of the optimal path.
For this, I want to keep a record of all the indexes (row and column) that are visited by one path.
For example, my path starts from
Row 11, Col 0
Then "optimal paths" goes to,
Row 10, Col 1 -> When I push these nodes into queue, I want to store "11, 0" as well. So that, I can know what path this node has taken previously to reach this state.
Following the same, then it will go to,
Row 9, Col 2 -> So, this node should also store both "11, 0" and "10, 1" in it, hence keeping record of the path it has taken so far.
And this goes on, until the "goal" node.
But I can't seem to find a way to implement this thing, something that keeps track of all the path every node has taken. In this way, I can easily avoid the problem I am facing (I will simply print the path the "goal node" took to reach that point, ignoring all the other nodes which were visited unnecessarily)
Can anyone help me in trying to find a logic for this?
Also, just to be clear, my class node and CompareCost have this implementation,
class node
{
public:
int i, j, heuristic, cost;
node() { i = j = heuristic = cost = 0; }
node(int heuristic, int i, int j, int cost) :i(i), j(j), heuristic(heuristic), cost(cost) {}
};
struct CompareCost {
bool operator()(node const& p1, node const& p2)
{
return p1.heuristic > p2.heuristic;
}
};
I am guessing that I need to store something extra in my "class node" but I can't seem to figure out the exact thing.
Construct your node like a linked list:
class node
{
public:
int i, j, cost;
node* next;
}
Add a method to the class to display the full path:
void ShowPath()
{
Node* temp = this;
do
{
if (temp != NULL)
{
std::cout << "(" << temp->i << ", " << temp->j << ")";
temp = temp->next;
}
} while (temp != NULL);
}
Last, modify A_search() so that it returns the new node definition. You can then call ShowPath() on the return value.

time series sliding window with occurrence counts

I am trying to get a count between two timestamped values:
for example:
time letter
1 A
4 B
5 C
9 C
18 B
30 A
30 B
I am dividing time to time windows: 1+ 30 / 30
then I want to know how many A B C in each time window of size 1
timeseries A B C
1 1 0 0
2 0 0 0
...
30 1 1 0
this shoud give me a table of 30 rows and 3 columns: A B C of ocurancess
The problem is the data is taking to long to be break down because it iterates through all master table every time to slice the data eventhough thd data is already sorted
master = mytable
minimum = master.timestamp.min()
maximum = master.timestamp.max()
window = (minimum + maximum) / maximum
wstart = minimum
wend = minimum + window
concurrent_tasks = []
while ( wstart <= maximum ):
As = 0
Bs = 0
Cs = 0
for d, row in master.iterrows():
ttime = row.timestamp
if ((ttime >= wstart) & (ttime < wend)):
#print (row.channel)
if (row.channel == 'A'):
As = As + 1
elif (row.channel == 'B'):
Bs = Bs + 1
elif (row.channel == 'C'):
Cs = Cs + 1
concurrent_tasks.append([m_id, As, Bs, Cs])
wstart = wstart + window
wend = wend + window
Could you help me in making this perform better ? i want to use map function and i want to prevent python from looping through all the loop every time.
This is part of big data and it taking days to finish ?
thank you
There is a faster approach - pd.get_dummies():
In [116]: pd.get_dummies(df.set_index('time')['letter'])
Out[116]:
A B C
time
1 1 0 0
4 0 1 0
5 0 0 1
9 0 0 1
18 0 1 0
30 1 0 0
30 0 1 0
If you want to "compress" (group) it by time:
In [146]: pd.get_dummies(df.set_index('time')['letter']).groupby(level=0).sum()
Out[146]:
A B C
time
1 1 0 0
4 0 1 0
5 0 0 1
9 0 0 1
18 0 1 0
30 1 1 0
or using sklearn.feature_extraction.text.CountVectorizer:
from sklearn.feature_extraction.text import CountVectorizer
cv = CountVectorizer(token_pattern=r"\b\w+\b", stop_words=None)
r = pd.SparseDataFrame(cv.fit_transform(df.groupby('time')['letter'].agg(' '.join)),
index=df['time'].unique(),
columns=df['letter'].unique(),
default_fill_value=0)
Result:
In [143]: r
Out[143]:
A B C
1 1 0 0
4 0 1 0
5 0 0 1
9 0 0 1
18 0 1 0
30 1 1 0
If we want to list all times from 1 to 30:
In [153]: r.reindex(np.arange(r.index.min(), r.index.max()+1)).fillna(0).astype(np.int8)
Out[153]:
A B C
1 1 0 0
2 0 0 0
3 0 0 0
4 0 1 0
5 0 0 1
6 0 0 0
7 0 0 0
8 0 0 0
9 0 0 1
10 0 0 0
11 0 0 0
12 0 0 0
13 0 0 0
14 0 0 0
15 0 0 0
16 0 0 0
17 0 0 0
18 0 1 0
19 0 0 0
20 0 0 0
21 0 0 0
22 0 0 0
23 0 0 0
24 0 0 0
25 0 0 0
26 0 0 0
27 0 0 0
28 0 0 0
29 0 0 0
30 1 1 0
or using Pandas approach:
In [159]: pd.get_dummies(df.set_index('time')['letter']) \
...: .groupby(level=0) \
...: .sum() \
...: .reindex(np.arange(r.index.min(), r.index.max()+1), fill_value=0)
...:
Out[159]:
A B C
time
1 1 0 0
2 0 0 0
3 0 0 0
4 0 1 0
5 0 0 1
6 0 0 0
7 0 0 0
8 0 0 0
9 0 0 1
10 0 0 0
... .. .. ..
21 0 0 0
22 0 0 0
23 0 0 0
24 0 0 0
25 0 0 0
26 0 0 0
27 0 0 0
28 0 0 0
29 0 0 0
30 1 1 0
[30 rows x 3 columns]
UPDATE:
Timing:
In [163]: df = pd.concat([df] * 10**4, ignore_index=True)
In [164]: %timeit pd.get_dummies(df.set_index('time')['letter'])
100 loops, best of 3: 10.9 ms per loop
In [165]: %timeit df.set_index('time').letter.str.get_dummies()
1 loop, best of 3: 914 ms per loop

Distribution of M objects in N container

Given a N size array whose elements denotes the capacity of containers ...In how many ways M similar objects can be distributed so that each containers is filled at the end.
for example
for arr={2,1,2,1} N=4 and M=10 there comes out be 35 ways.
Please help me out with this question.
First calculate the sum of the container sizes. I your case 2+1+2+1 = 6 let this be P. Find the number of ways of choosing P objects from M. There are M choices for the first object, M-1 for the second, M-2 for the third etc. This gives use M * (M-1) * ... (M-p+1) or M! / (M-P)!. This will give us more states than you want for example
1 2 | 3 | 4 5 | 6
2 1 | 3 | 4 5 | 6
There is q! ways of arranging q object in q slots so we need to divide by factorial(arr[0]) and factorial(arr[1]) etc. In this case divide by 2! * 1! * 2! * 1! = 4.
I'm getting a very much larger number than 35. 10! / 4! = 151200 divide that by 4 gives 37800, so I'm not sure if I have understood your question correctly.
Ah so looking at the problem you need to find N integers n1, n2, ... ,nN so that n1+n2+...+nN = M and n1>= arr[1], n2>=arr[2].
Looks quite simple let P be as above. Take the first P pills and give the students their minimum number, arr[1], arr[2] etc. You will have M-P pills left, let this be R.
Essentially the problem simplifies to finding N number >=0 which sum to R. This is a classic problem. As its a challenges I won't do the answer for you but if we break the N=4, R=4 answer down you may see the pattern
4 0 0 0 - 1 case starting with 4
3 1 0 0 - 3 cases starting with 3
3 0 1 0
3 0 0 1
2 2 0 0 - 6 cases
2 1 1 0
2 1 0 1
2 0 2 0
2 0 1 1
2 0 0 2
1 3 0 0 - 10 cases
1 2 1 0
1 2 0 1
1 1 2 0
1 1 1 1
1 1 0 2
1 0 3 0
1 0 2 1
1 0 1 2
1 0 0 3
0 4 0 0 - 15 cases
0 3 1 0
0 3 0 1
0 2 2 0
0 2 1 1
0 2 0 2
0 1 3 0
0 1 2 1
0 1 1 2
0 1 0 3
0 0 4 0
0 0 3 1
0 0 2 2
0 0 1 3
0 0 0 4
You should recognise the numbers 1, 3, 6, 10, 15.

Fortran 77 Real to Int rounding Direction?

Porting a bit of Fortran 77 code. It appears that REAL variables are being assigned to INTEGER variables. I do not have a method to run this code and wonder what the behavior is in the following case:
REAL*4 A
A = 123.25
B = INT(A)
B = 123 or B = 124?
How about at the 0.5 mark?
REAL*4 C
C = 123.5
D = INT(C)
D = 123 or D = 123.5?
INT is always rounding down:
From the GCC documentation:
These functions return a INTEGER variable or array under the following rules:
(A) If A is of type INTEGER, INT(A) = A
(B) If A is of type REAL and
|A| < 1, INT(A) equals 0. If |A| \geq 1, then INT(A) equals the
largest integer that does not exceed the range of A and whose sign is
the same as the sign of A.
(C) If A is of type COMPLEX, rule B is
applied to the real part of A.
If you want to round to the nearest integer, use NINT.
So, in your case B and D are always 123 (if they are declared as integer).
Here is one example of code and the output, it's an extension of previous answer:
PROGRAM test
implicit none
integer :: i=0
real :: dummy = 0.
do i = 0,30
dummy = -1.0 + (i*0.1)
write(*,*) i, dummy , int(dummy) , nint(dummy) ,floor(dummy)
enddo
stop
end PROGRAM test
This is the output:
$ ./test
0 -1.000000 -1 -1 -1
1 -0.9000000 0 -1 -1
2 -0.8000000 0 -1 -1
3 -0.7000000 0 -1 -1
4 -0.6000000 0 -1 -1
5 -0.5000000 0 -1 -1
6 -0.4000000 0 0 -1
7 -0.3000000 0 0 -1
8 -0.2000000 0 0 -1
9 -9.9999964E-02 0 0 -1
10 0.0000000E+00 0 0 0
11 0.1000000 0 0 0
12 0.2000000 0 0 0
13 0.3000001 0 0 0
14 0.4000000 0 0 0
15 0.5000000 0 1 0
16 0.6000000 0 1 0
17 0.7000000 0 1 0
18 0.8000001 0 1 0
19 0.9000000 0 1 0
20 1.000000 1 1 1
21 1.100000 1 1 1
22 1.200000 1 1 1
23 1.300000 1 1 1
24 1.400000 1 1 1
25 1.500000 1 2 1
26 1.600000 1 2 1
27 1.700000 1 2 1
28 1.800000 1 2 1
29 1.900000 1 2 1
30 2.000000 2 2 2
I hope that this can better clarify the question
EDIT: Compiled with ifort 2013 on xeon
Only as completion to the existing answers I want to add an example how the commercial rounds can be realized without using NINT by
L = INT(F + 0.5)
where L is INTEGER and F is a positive REAL number. I've found this in FORTRAN 77 code samples from the last century.
Extending this to negative REAL numbers by
L = SIGN(1.0,F)*INT(ABS(F) + 0.5)
and going back to the 80th of last century, the minimal code example looks like this
PROGRAM ROUNDTEST
DO 12345 I=0,30
F = -1.0 + I * 0.1
J = INT(F)
K = NINT(F)
L = SIGN(1.0,F)*INT(ABS(F) + 0.5)
PRINT *, I, F, J, K, L
12345 CONTINUE
END
which creates the output
$ ./ROUNDTEST
0 -1.00000000 -1 -1 -1
1 -0.899999976 0 -1 -1
2 -0.800000012 0 -1 -1
3 -0.699999988 0 -1 -1
4 -0.600000024 0 -1 -1
5 -0.500000000 0 -1 -1
6 -0.399999976 0 0 0
7 -0.300000012 0 0 0
8 -0.199999988 0 0 0
9 -9.99999642E-02 0 0 0
10 0.00000000 0 0 0
11 0.100000024 0 0 0
12 0.200000048 0 0 0
13 0.300000072 0 0 0
14 0.399999976 0 0 0
15 0.500000000 0 1 1
16 0.600000024 0 1 1
17 0.700000048 0 1 1
18 0.800000072 0 1 1
19 0.899999976 0 1 1
20 1.00000000 1 1 1
21 1.10000014 1 1 1
22 1.20000005 1 1 1
23 1.29999995 1 1 1
24 1.40000010 1 1 1
25 1.50000000 1 2 2
26 1.60000014 1 2 2
27 1.70000005 1 2 2
28 1.79999995 1 2 2
29 1.90000010 1 2 2
30 2.00000000 2 2 2
ROUNDTEST is compiled and linked by gfortran version 7.4.0 by
$ gfortran.exe ROUNDTEST.FOR -o ROUNDTEST
Hope this helps you if you have to deal with old FORTRAN code.

J (Tacit) Sieve Of Eratosthenes

I'm looking for a J code to do the following.
Suppose I have a list of random integers (sorted),
2 3 4 5 7 21 45 49 61
I want to start with the first element and remove any multiples of the element in the list then move on to the next element cancel out its multiples, so on and so forth.
Thus the output
I'm looking at is 2 3 5 7 61. Basically a Sieve Of Eratosthenes. Would appreciate if someone could explain the code as well, since I'm learning J and find it difficult to get most codes :(
Regards,
babsdoc
It's not exactly what you ask but here is a more idiomatic (and much faster) version of the Sieve.
Basically, what you need is to check which number is a multiple of which. You can get this from the table of modulos: |/~
l =: 2 3 4 5 7 21 45 49 61
|/~ l
0 1 0 1 1 1 1 1 1
2 0 1 2 1 0 0 1 1
2 3 0 1 3 1 1 1 1
2 3 4 0 2 1 0 4 1
2 3 4 5 0 0 3 0 5
2 3 4 5 7 0 3 7 19
2 3 4 5 7 21 0 4 16
2 3 4 5 7 21 45 0 12
2 3 4 5 7 21 45 49 0
Every pair of multiples gives a 0 on the table. Now, we are not interested in the 0s that correspond to self-modulos (2 mod 2, 3 mod 3, etc; the 0s on the diagonal) so we have to remove them. One way to do this is to add 1s on their place, like so:
=/~ l
1 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0
0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1
(=/~l) + (|/~l)
1 1 0 1 1 1 1 1 1
2 1 1 2 1 0 0 1 1
2 3 1 1 3 1 1 1 1
2 3 4 1 2 1 0 4 1
2 3 4 5 1 0 3 0 5
2 3 4 5 7 1 3 7 19
2 3 4 5 7 21 1 4 16
2 3 4 5 7 21 45 1 12
2 3 4 5 7 21 45 49 1
This can be also written as (=/~ + |/~) l.
From this table we get the final list of numbers: every number whose column contains a 0, is excluded.
We build this list of exclusions simply by multiplying by column. If a column contains a 0, its product is 0 otherwise it's a positive number:
*/ (=/~ + |/~) l
256 2187 0 6250 14406 0 0 0 18240
Before doing the last step, we'll have to improve this a little. There is no reason to perform long multiplications since we are only interested in 0s and not-0s. So, when building the table, we'll keep only 0s and 1s by taking the "sign" of each number (this is the signum:*):
* (=/~ + |/~) l
1 1 0 1 1 1 1 1 1
1 1 1 1 1 0 0 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 0 1 1
1 1 1 1 1 0 1 0 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
so,
*/ * (=/~ + |/~) l
1 1 0 1 1 0 0 0 1
From the list of exclusion, you just copy:# the numbers to your final list:
l #~ */ * (=/~ + |/~) l
2 3 5 7 61
or,
(]#~[:*/[:*=/~+|/~) l
2 3 5 7 61
Tacit iteration is usually done with the conjunction Power. When the test for completion needs to be something other than hitting a fixpoint, the Do While construction works well.
In this solution filterMultiplesOfHead is applied repeatedly until there are no more numbers not either applied or filtered. Numbers already applied are accumulated in a partial answer. When the list to be processed is empty the partial answer is the result, after stripping off the boxing used to segregate processed from unprocessed data.
filterMultiplesOfHead=: {. (((~: >.)# %~) # ]) }.
appendHead=: (>#[ , {.#>#])/
pass=: appendHead ; filterMultiplesOfHead#>#{:
prep=: a: , <
unfinished=: [: -. a: -: {:
sieve=: [: ; [: pass^:unfinished^:_ prep
sieve 2 3 4 5 7 21 45 49 61
2 3 5 7 61
prep 2 3 4 7 9 10
┌┬────────────┐
││2 3 4 7 9 10│
└┴────────────┘
appendHead prep 2 3 4 7 9 10
2
filterMultiplesOfHead 2 3 4 7 9 10
3 7 9
pass^:2 prep 2 3 4 7 9 10
┌───┬─┐
│2 3│7│
└───┴─┘
sieve 1-.~/:~~.>:?.$~100
2 3 7 11 29 31 41 53 67 73 83 95 97