How would I implement a SJF and Round Robin scheduling simulator? - c++

I have a vector of structs, with the structs looking like this:
struct myData {
    int ID;
    int arrivalTime;
    int burstTime;
};
After populating my vector with this data:
1 1 5
2 3 2
3 5 10
where each row is an individual struct's ID, arrivalTime, and burstTime, how would I use "for" or "while" loops to step through my vector's indices and calculate the data in a way that I could print something like this out:
Time 0 Processor is Idle
Time 1 Process 1 is running
Time 3 Process 2 is running
Time 5 Process 1 is running
Time 8 Process 3 is running
I know that SJF and RR scheduling are pretty similar with the exception that RR has the time quantum, so that no process can last longer than an arbitrary time limit before being pre-empted by another process. With that in mind, I think that after I implement SJF, RR will come easily with just a few modifications of the SJF algorithm.
The way I thought about implementing SJF is to sort the vector based on arrival times first, then if two or more vector indices have the same arrival time, sort it based on shortest burstTime first. After that, using
int currentTime = 0;
to keep track of how much time has passed, and
int i = 0;
to use as the index of my vector and to control a "while" loop, how would I implement an algorithm that allows me to print out my desired output shown above? I have a general idea of what needs to happen, but I can't seem to lay it all out in code in a way that works.
I know that whenever currentTime is less than the next soonest arrivalTime, the processor is idle and currentTime needs to be set to that arrivalTime.
If vector[i+1].arrivalTime < currentTime + vector[i].burstTime, I need to set vector[i].burstTime to vector[i+1].arrivalTime - currentTime, then set currentTime to vector[i+1].arrivalTime, then print out currentTime and the process ID.
I know that these are simple mathematical operations to implement, but I can't think of how to lay it all out in a way that works the way I want it to. The way it loops around, and how sometimes a few processes have the same arrival times, throws me off. Do I need more variables to keep track of what is going on? Should I shift the arrival times of all the items in the vector every time a process is pre-empted and interrupted by a newer process with a shorter burst time? Any help in C++ code or even pseudo-code would be greatly appreciated. I feel like I am pretty solid on the concept of how SJF works but I'm just having trouble translating what I understand into code.
Thanks!

I know that SJF and RR scheduling are pretty similar with the exception that RR has the time quantum, so that no process can last longer than an arbitrary time limit before being pre-empted by another process.
I don't think that's right. At least that's not how I learned it. RR is closer to FCFS (first come, first served) than it is to SJF.
One way to implement SJF is to insert incoming jobs into the pending list based on the running time. The insert position is at the end if the new job's running time is longer than that of the job at the end; otherwise it's before the first job with a running time longer than the incoming job. Scheduling is easy: Remove the job at the head of the pending list and run that job to completion. A job with a long running time might not ever be run if short jobs keep coming in and getting processed ahead of that job with a long running time.
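If it helps, here is a minimal sketch of that insertion scheme. The Job record and the submit()/runNext() names are just placeholders made up for this example, and the actual work is left as a comment:

#include <list>

struct Job {
    int id;
    int runTime;   // remaining running time
};

std::list<Job> pending;   // kept sorted by runTime, shortest first

// Insert before the first pending job with a longer running time,
// or at the end if no pending job runs longer.
void submit(const Job& job) {
    auto it = pending.begin();
    while (it != pending.end() && it->runTime <= job.runTime)
        ++it;
    pending.insert(it, job);
}

// Scheduling: pop the head and run that job to completion.
bool runNext() {
    if (pending.empty()) return false;   // nothing to do: processor is idle
    Job job = pending.front();
    pending.pop_front();
    // ... run job to completion here ...
    return true;
}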
One way to implement round robin is to use a FIFO, just like with FCFS. New jobs are added to the end of the queue. Scheduling is once again easy: Remove the job at the head of the queue and process it. So far, this is exactly what FCFS does. The two differ in that RR has a limit on how long a job can be run. If the job takes longer than some time quantum to finish, the job is run for only that amount of time and then it is added back to the end of the queue. Note that with this formulation, RR is equivalent to FCFS if the time quantum is longer than the running time of the longest running job.
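And a similar sketch of the FIFO-plus-quantum idea; again the Job record and the two-unit quantum are made up for the example, and "running" a job is reduced to decrementing a remaining-time counter:

#include <queue>

struct Job {
    int id;
    int remainingTime;
};

std::queue<Job> readyQueue;   // a plain FIFO, exactly as in FCFS
const int quantum = 2;        // arbitrary time quantum

// New jobs go to the back of the queue.
void submit(const Job& job) { readyQueue.push(job); }

// Run the head for at most one quantum; requeue it if it isn't finished.
void runNext() {
    if (readyQueue.empty()) return;   // idle
    Job job = readyQueue.front();
    readyQueue.pop();
    int slice = (job.remainingTime < quantum) ? job.remainingTime : quantum;
    job.remainingTime -= slice;       // "run" the job for this slice
    if (job.remainingTime > 0)
        readyQueue.push(job);         // preempted: back to the end of the line
}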
I suppose you could insert those incomplete jobs back into the middle of the process list as SJF does, but that doesn't seem very round-robinish to me, and the scheduling would be a good deal hairier. You couldn't use the "always run the job at the head" scheduling rule because then all you would have is SJF, just made more complex.

Related

Delay job in AWS

I have messages inside Amazon SQS. For some of the messages I need to wait six hours before I can start working on them (the delay is a given).
One solution would be to do Thread.Sleep(6h).
I don't like this solution because I'm afraid something will happen to the thread and I'll lose the data. Another solution would be to read the message, see if 6 hours have passed, and if not return the message to the queue. Again, I don't like it because the procedure would happen a lot.
Is there any better solution?
You could create multiple individual queues and put the queue items into them separately.
Example 1:
You can have 6 queues like Queue0, Queue1, Queue2, Queue3, Queue4, Queue5 and use a hash function like hash(x) = current_hour % 6. This function will return values from 0 to 5, so you can put the items in Queue_hash(x) and read the queues individually based on the current time.
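In code, picking the bucket could be as simple as this (only the naming scheme from above is shown; no SQS calls):

#include <ctime>
#include <string>

// Queue the current message should go into, based on the hour it is produced.
std::string queueForNow() {
    std::time_t now = std::time(nullptr);
    int hour = std::localtime(&now)->tm_hour;   // current hour, 0-23
    return "Queue" + std::to_string(hour % 6);  // Queue0 .. Queue5
}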
Example 2:
If the current time is 01:00 hours you can create a separate queue like Queue0700Hours; if the current time is 02:00 hours you can create another new queue, Queue0800Hours, and so on.
This way you decouple the need to wait or stop processing, and the producers and consumers can be picked up independently based on the current timestamp.

Unbalanced load (v2.0) using MPI

(the problem is embarrassingly parallel)
Consider an array of 12 cells:
|__|__|__|__|__|__|__|__|__|__|__|__|
and four (4) CPUs.
Naively, I would run 4 parallel jobs and feed 3 cells to each CPU.
|__|__|__|__|__|__|__|__|__|__|__|__|
=========|========|========|========|
1 CPU 2 CPU 3 CPU 4 CPU
BUT it appears that each cell has a different evaluation time: some cells are evaluated very quickly, and some are not.
So, instead of wasting a "relaxed" CPU, I am thinking of feeding ONE cell at a time to EACH CPU and continuing until the entire job is done.
Namely:
at the beginning:
|____|____|____|____|____|____|____|____|____|____|____|____|
1cpu 2cpu 3cpu 4cpu
if 2cpu finishes its job at cell "2", it can jump to the first empty cell "5" and continue working:
|____|done|____|____|____|____|____|____|____|____|____|____|
1cpu 3cpu 4cpu 2cpu
|-------------->
if 1cpu finishes, it can take the sixth cell:
|done|done|____|____|____|____|____|____|____|____|____|____|
3cpu 4cpu 2cpu 1cpu
|------------------------>
and so on, until the full array is done.
QUESTION:
I do not know a priori which cell is "quick" and which cell is "slow", so I cannot spread the CPUs according to the load (more CPUs to slow cells, fewer to quick ones).
How can one implement such an algorithm for dynamic evaluation with MPI?
Thanks!!!!!
UPDATE
I use a very simple approach to divide the entire job into chunks, with MPI-IO:
given: array[NNN] and nprocs - number of available working units:
for (int i = 0; i < NNN/nprocs; ++i)
{
    do_what_I_need(start + i);
}
MPI_File_write(...);
where "start" corresponds to particular rank number. In simple words, I divide the entire NNN array into fixed size chunk according to the number of available CPU and each CPU performs its chunk, writes the result to (common) output and relaxes.
IS IT POSSIBLE to change the code (Not to completely re-write in terms of Master/Slave paradigm) in such a way, that each CPU will get only ONE iteration (and not NNN/nprocs) and after it completes its job and writes its part to the file, will Continue to the next cell and not to relax.
Thanks!
There is a well known parallel programming pattern, known under many names, some of which are: bag of tasks, master / worker, task farm, work pool, etc. The idea is to have a single master process, which distributes cells to the other processes (workers). Each worker runs an infinite loop in which it waits for a message from the master, computes something and then returns the result. The loop is terminated by having the master send a message with a special tag. The wildcard tag value MPI_ANY_TAG can be used by the worker to receive messages with different tags.
The master is more complex. It also runs a loop, but only until all cells have been processed. Initially it sends each worker a cell and then starts the loop. In this loop it receives a message from any worker, using the wildcard source value MPI_ANY_SOURCE, and if there are more cells to be processed, sends one of them to the same worker that returned the result. Otherwise it sends a message with a tag set to the termination value.
There are many many many readily available implementations of this model on the Internet and even some on Stack Overflow (for example this one). Mind that this scheme requires one additional MPI process that often does very little work. If this is unacceptable, one can run a worker loop in a separate thread.
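For illustration, here is a minimal sketch of that pattern. Rank 0 is the master, everyone else is a worker, and process_cell() is a made-up stand-in for the real per-cell computation:

#include <mpi.h>
#include <cstdio>
#include <vector>

const int TAG_WORK = 1;   // message carries a cell index (or a result)
const int TAG_STOP = 2;   // no more work, worker should exit its loop

// Hypothetical per-cell computation standing in for do_what_I_need().
double process_cell(int cell) { return cell * 2.0; }

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int NNN = 12;                               // total number of cells

    if (rank == 0) {                                  // master
        std::vector<double> results(NNN);
        std::vector<int> assigned(nprocs, -1);        // cell each worker holds
        int next = 0, active = 0;

        // Seed every worker with one cell (or stop it if there is no work).
        for (int w = 1; w < nprocs; ++w) {
            if (next < NNN) {
                MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                assigned[w] = next++;
                ++active;
            } else {
                MPI_Send(&next, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
            }
        }

        // Collect results; hand the now-idle worker the next cell or stop it.
        while (active > 0) {
            double res;
            MPI_Status st;
            MPI_Recv(&res, 1, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_WORK,
                     MPI_COMM_WORLD, &st);
            results[assigned[st.MPI_SOURCE]] = res;
            if (next < NNN) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                assigned[st.MPI_SOURCE] = next++;
            } else {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD);
                --active;
            }
        }
        for (int i = 0; i < NNN; ++i)
            std::printf("cell %d -> %g\n", i, results[i]);
    } else {                                          // worker
        for (;;) {
            int cell;
            MPI_Status st;
            MPI_Recv(&cell, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP) break;
            double res = process_cell(cell);
            MPI_Send(&res, 1, MPI_DOUBLE, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}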
You want to implement a kind of client-server architecture where you have workers asking the server for work whenever they are out of work.
Depending on the size of the chunks and the speed of your communication between workers and server, you may want to adjust the size of the chunks sent to workers.
To answer your updated question:
Under the master/slave (or worker pool if that's how you prefer it to be labelled) model, you will basically need a task scheduler. The master should have information about what work has been done and what still needs to be done. The master will give each process some work to be done, then sit and wait until a process completes (using nonblocking receives and a wait_all). Once a process completes, have it send the data to the master then wait for the master to respond with more work. Continue this until the work is done.

how to detect if a thread or process is getting starved due to OS scheduling

This is on Linux OS. App is written in C++ with ACE library.
I suspect that one of the threads in the process is sometimes getting blocked for an unusually long time (5 to 40 seconds). The app runs fine most of the time, except a couple of times a day it has this issue. There are five other similar apps running on the box which are also I/O bound due to heavy incoming socket data.
I would like to know if there is anything I can do programmatically to see if the thread/process is getting its time slice.
If a process is being starved out, self-monitoring by that process would not be that productive. But if you just want that process to notice it hasn't been run in a while, it can call times() periodically and compare the relative difference in elapsed time with the relative difference in scheduled CPU time (sum the tms_utime and tms_cutime fields if you want to count waiting for children as productive time, and sum in the tms_stime and tms_cstime fields if you count kernel time spent on your behalf as productive time). For thread times, the only way I know of is to consult the /proc filesystem.
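A minimal self-monitoring sketch along those lines (the 10-second interval and the decision to fold in system time are just choices made for the example):

#include <sys/times.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const long ticksPerSec = sysconf(_SC_CLK_TCK);

    struct tms prevCpu;
    clock_t prevWall = times(&prevCpu);

    for (;;) {
        sleep(10);                               // monitoring interval

        struct tms nowCpu;
        clock_t nowWall = times(&nowCpu);

        // User + system time; add tms_cutime/tms_cstime here if time spent
        // waiting for children should also count as productive time.
        clock_t cpuDelta = (nowCpu.tms_utime + nowCpu.tms_stime) -
                           (prevCpu.tms_utime + prevCpu.tms_stime);
        clock_t wallDelta = nowWall - prevWall;

        double share = wallDelta ? double(cpuDelta) / double(wallDelta) : 0.0;
        std::printf("got %.1f%% of a CPU over the last %.1f s\n",
                    100.0 * share, double(wallDelta) / ticksPerSec);

        prevCpu = nowCpu;
        prevWall = nowWall;
    }
}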
A high priority external process or high priority thread could externally monitor processes (and threads) of interest by reading the appropriate /proc/<pid>/stat entries for the process (and /proc/<pid>/task/<tid>/stat for the threads). The user times are found in the 14th and 16th fields of the stat file. The system times are found in the 15th and 17th fields. (The field positions are accurate for my Linux 2.6 kernel.)
Between two time points, you determine the amount of elapsed time that has passed (a monitor process or thread would usually wake up at regular intervals). Then the difference between the cumulative processing times at each of those time points represents how much time the thread of interest got to run during that time. The ratio of processing time to elapsed time would represent the time slice.
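A rough sketch of pulling one thread's cumulative CPU time out of /proc, using the field positions mentioned above (remember they can differ between kernel versions):

#include <fstream>
#include <sstream>
#include <string>

// Returns user + system time for the given thread, in clock ticks
// (divide by sysconf(_SC_CLK_TCK) to get seconds), or -1 on failure.
long threadCpuTicks(int pid, int tid) {
    std::ostringstream path;
    path << "/proc/" << pid << "/task/" << tid << "/stat";
    std::ifstream in(path.str());
    std::string line;
    if (!std::getline(in, line)) return -1;

    // Field 2 (comm) is parenthesised and may contain spaces, so skip past
    // the last ')' before splitting the rest on whitespace.
    std::size_t pos = line.rfind(')');
    std::istringstream rest(line.substr(pos + 2));

    std::string field;
    long utime = 0, stime = 0;
    // After comm we are at field 3; utime is field 14, stime is field 15.
    for (int n = 3; n <= 15 && (rest >> field); ++n) {
        if (n == 14) utime = std::stol(field);
        if (n == 15) stime = std::stol(field);
    }
    return utime + stime;
}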
One last bit of info: On Linux, I use the following to obtain the tid of the current thread for examining the right task in the /proc/<pid>/task/ directory:
tid = syscall(__NR_gettid);
I do this, because I could not find the gettid system call actually exported by any library on my system, even though it was documented. But, it might be available on yours.

Time based event handling

I am currently working on writing a simple game in order to learn how to use SFML in C++. So far things have been going smoothly and I have a basic understanding of most things to do with SFML. The problem I have run into is finding an efficient way to handle time-based events.
The game I'm working on is a very simple Space Invaders-esque game that has waves of enemies come in at predefined times during the level. As of now I have an event class that holds and controls what is supposed to happen with each called event, in order to allow me to reuse simple events multiple times. I trigger these events by using an SFML clock and, with each iteration of the game loop, running through a vector of all of the events for the current level and comparing the elapsed time on the clock with the specified time for the event to be called. The problem is that I have to check an entire vector of events every game loop iteration, and if the event list gets long enough I am worried it will begin to have an impact on the performance of the game.
I had an idea to give each event a simple numerical ID for its position in the timeline and simply store which event is supposed to run next; that way I would only have to check the time of one event per loop iteration. While I think this will work fairly well, I was curious whether a more efficient method that didn't require a check every loop iteration was possible. I looked a bit into event-driven programming but could not find much specifically related to time-based events.
Thus I was wondering if anyone has had any experience with timelines or time-based event triggering and would have any tips or resources that I could look at to figure out a more efficient approach? Any help would be great, thanks!
The method you suggest is just one comparison each time through the game loop. It doesn't get any faster than that.
To allow multiple events to fire at the same instant, use a multimap, where keys are event times, and values are the events themselves. The multimap will then be sorted by time.
Each time through the game loop, do something like this (pseudocode):
now = getCurrentTime()
while not events.isEmpty() and events.firstElement().key() < now:
    e = events.firstElement().value
    e.execute()
    events.removeFirst()
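In C++ that could look roughly like this, assuming a bare-bones Event type with an execute() member and plain float seconds instead of SFML's time types:

#include <map>

struct Event {
    void execute() { /* spawn the wave, etc. */ }
};

std::multimap<float, Event> events;   // key: trigger time in seconds, sorted ascending

void fireDueEvents(float now) {
    // Pop and run everything whose trigger time has passed.
    while (!events.empty() && events.begin()->first <= now) {
        events.begin()->second.execute();
        events.erase(events.begin());
    }
}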
Sort your vector based on the events' times and then just store how far you've gotten through the vector so far. Each loop then you'll advance that position until the next event shouldn't occur yet, and fire off the events you just iterated over.
#include <vector>

// Event, Time and fire_events() are whatever your game already uses.
std::vector<Event> time_line;    // sorted by each event's time
size_t time_line_position = 0;   // first event that has not fired yet

void fire_new_events(Time t) {
    size_t new_time_line_position = time_line_position;
    // Advance past every event whose time has come.
    while (new_time_line_position < time_line.size()
           && time_line[new_time_line_position].time <= t)
        ++new_time_line_position;
    fire_events(time_line.begin() + time_line_position,
                time_line.begin() + new_time_line_position);
    time_line_position = new_time_line_position;
}

Modify Time for simulation in c++

I am writing a program which simulates an activity. I am wondering how to speed up time for the simulation; let's say 1 hour in the real world is equal to 1 month in the program.
Thank you.
The program is actually similar to a restaurant simulation where you don't really know when customers come. Let's say we pick a random number (2-10) of customers every hour.
It depends on how the program currently gets the time.
For example, if it calls the Linux system time(), just replace that with your own function (like mytime) which returns speedier times. Perhaps mytime calls time() and multiplies the returned time by whatever factor makes sense; 1 hr = 1 month is a factor of 720. The origin, i.e. the moment the program begins, should be accounted for:
#include <time.h>

time_t t0;
time_t mytime (void *);

int main ()
{
    t0 = time(NULL); // at program initialization
    ....
    for (;;)
    {
        time_t sim_time = mytime (NULL);
        // yada yada yada
        ...
    }
}

time_t mytime (void *)
{
    return 720 * (time (NULL) - t0); // account for time since program started
                                     // and magnify by 720, so one hour is one month
}
You just do it. You decide how many events take place in an hour of simulation time (e.g., if an event takes place once a second, then after 3600 simulated events you've simulated an hour of time). There's no need for your simulation to run in real time; you can run it as fast as you can calculate the relevant numbers.
It sounds like you are implementing a Discrete Event Simulation. You don't even need to have a free-running timer (no matter what scaling you may use) in such a situation. It's all driven by the events. You have a priority queue containing events, ordered by the event time. You have a processing loop which takes the event at the head of the queue, and advances the simulation time to the event time. You process the event, which may involve scheduling more events. (For example, the customerArrived event may cause a customerOrdersDinner event to be generated 2 minutes later.) You can easily simulate customers arriving using random().
The other answers I've read thus far are still assuming you need a continuous timer, which is usually not the most efficient way of simulating an event-driven system. You don't need to scale real time to simulation time, or have ticks. Let the events drive time!
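Here is a minimal sketch of such an event-driven loop. The customer events and the 2-minute gap are only illustrative; the point is the priority queue ordered by event time and jumping straight from one event time to the next:

#include <cstdio>
#include <cstdlib>
#include <functional>
#include <queue>
#include <vector>

struct Event {
    double time;                      // simulated minutes
    std::function<void()> action;
};

struct Later {                        // orders the queue so the earliest event is on top
    bool operator()(const Event& a, const Event& b) const { return a.time > b.time; }
};

std::priority_queue<Event, std::vector<Event>, Later> agenda;
double simTime = 0.0;

void schedule(double at, std::function<void()> action) {
    agenda.push({at, std::move(action)});
}

void customerArrives(int id) {
    std::printf("t=%6.1f  customer %d arrives\n", simTime, id);
    // Ordering dinner happens two simulated minutes later.
    schedule(simTime + 2.0, [id] {
        std::printf("t=%6.1f  customer %d orders dinner\n", simTime, id);
    });
}

int main() {
    // Seed the agenda: 2-10 random arrivals over one simulated hour,
    // as in the question.
    int customers = 2 + std::rand() % 9;
    for (int i = 0; i < customers; ++i)
        schedule(std::rand() % 60, [i] { customerArrives(i); });

    // Event loop: jump straight to each event's time; no real-time waiting.
    while (!agenda.empty()) {
        Event e = agenda.top();
        agenda.pop();
        simTime = e.time;
        e.action();
    }
}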
If the simulation is data dependent (like a stock market program), just speed up the rate at which the data is pumped. If it is something that depends on time() calls, you will have to do something like wallyk's answer (assuming you have the source code).
If time in your simulation is discrete, one option is to structure your program so that something happens "every tick".
Once you do that, time in your program is arbitrarily fast.
Is there really a reason for having a month of simulation time correspond exactly to an hour of time in the real world? If yes, you can always process the number of ticks that correspond to a month, and then pause the appropriate amount of time to let an hour of "real time" finish.
Of course, a key variable here is the granularity of your simulation, i.e. how many ticks correspond to a second of simulated time.
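A tiny sketch of that tick-then-pause idea, assuming one tick equals one simulated second and a hypothetical advance_one_tick() doing the per-tick work:

#include <chrono>
#include <thread>

void advance_one_tick() { /* per-tick simulation update */ }

void simulate_one_month_per_hour() {
    using namespace std::chrono;
    const long ticksPerMonth = 30L * 24 * 3600;   // one tick = one simulated second

    auto wallStart = steady_clock::now();
    for (long tick = 0; tick < ticksPerMonth; ++tick)
        advance_one_tick();                       // runs as fast as it can

    // Pause so the simulated month lines up with one real-world hour.
    auto target = wallStart + hours(1);
    if (steady_clock::now() < target)
        std::this_thread::sleep_until(target);
}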