Is it possible to run through several states in one statechart at the same time?

My simulation model is agent-based.
A) At the moment, I treat my process as a continuous chain for simplicity. This means the process can only restart once the product has been ejected from the machine. The individual stations of the machine are represented as states.
B) Now I would like to represent the following: the machine should be able to occupy several states simultaneously in one run. Example: while the finished product is being ejected from the machine, there is raw material in the filling station and in the pressing station at the same time. This means more product is produced per unit of time than with the approach in A.
I would be glad about any help. :)

Three axioms are always true, and your logic must follow them:
1. An agent can only ever be in one state per state chart.
2. While in one state, it can be part of a larger "composite state" (see the help).
3. An agent can have several state charts running in parallel, for example one for "machine states" and one for "failure states".
Be careful with point 3, though. If you have several state charts in one agent type, they should be 100% mutually exclusive, i.e. represent very different things.
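As a rough illustration of the third axiom (plain C++ sketch, not AnyLogic code - statecharts there are built graphically): two parallel state charts behave like two independent state variables on the same agent, each always in exactly one state. All names below are illustrative only.

// Conceptual sketch only (plain C++, not AnyLogic): two parallel "state charts"
// amount to two independent state variables on the same agent. Each chart is
// always in exactly one state, and the two cover mutually exclusive concerns.
enum class MachineState { Filling, Pressing, Ejecting };
enum class FailureState { Ok, Failed };

struct MachineAgent {
    MachineState machine = MachineState::Filling;  // chart 1: production flow
    FailureState failure = FailureState::Ok;       // chart 2: failure handling

    // A transition in one chart never changes the state of the other.
    void onPressingFinished() { machine = MachineState::Ejecting; }
    void onBreakdown()        { failure = FailureState::Failed; }
};

int main() {
    MachineAgent agent;
    agent.onBreakdown();         // failure chart moves to Failed...
    agent.onPressingFinished();  // ...while the machine chart advances independently
}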

Related

DISPLAYCONFIG_TARGET_DEVICE_NAME.connectorInstance is zero when multiple monitors are connected

The Microsoft documentation for the DISPLAYCONFIG_TARGET_DEVICE_NAME struct has this to say about its connectorInstance member:
The one-based instance number of this particular target only when the adapter has multiple targets of this type. The connector instance is a consecutive one-based number that is unique within each adapter. If this is the only target of this type on the adapter, this value is zero.
When I have a single monitor connected to a video card, I get back a 1 in this field. However, when I have multiple monitors connected to a video card, I get a 0 in this field for each of them.
I was expecting this to be 1... 2... 3..., etc. What's more, the code based upon this assumption has been in production for over a year with no reported issues. Then within the last month or two we suddenly got a flood of user support issues that boil down to our getting zero back for all connected monitors (for users with multiple monitors). Maybe that's coincidence, but it makes me wonder if some behavior changed on the Windows side of things...
...only when the adapter has multiple targets of this type.
...
If this is the only target of this type on the adapter, this value is zero.
Maybe I'm misunderstanding what's meant here by 'type'. It's kind of vague, so it could be referring to something in the EDID, or some combination of attributes... bleh.
Anyone familiar enough with the DisplayConfigGetDeviceInfo(...) API to provide any insight on this?
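For reference, a minimal sketch of the usual call sequence for reading connectorInstance (simplified, error handling mostly omitted; this is not the production code in question):

#include <windows.h>   // link with user32.lib
#include <cstdio>
#include <vector>

int main() {
    UINT32 pathCount = 0, modeCount = 0;
    if (GetDisplayConfigBufferSizes(QDC_ONLY_ACTIVE_PATHS, &pathCount, &modeCount) != ERROR_SUCCESS)
        return 1;

    std::vector<DISPLAYCONFIG_PATH_INFO> paths(pathCount);
    std::vector<DISPLAYCONFIG_MODE_INFO> modes(modeCount);
    if (QueryDisplayConfig(QDC_ONLY_ACTIVE_PATHS, &pathCount, paths.data(),
                           &modeCount, modes.data(), nullptr) != ERROR_SUCCESS)
        return 1;

    for (UINT32 i = 0; i < pathCount; ++i) {
        DISPLAYCONFIG_TARGET_DEVICE_NAME name = {};
        name.header.type = DISPLAYCONFIG_DEVICE_INFO_GET_TARGET_NAME;
        name.header.size = sizeof(name);
        name.header.adapterId = paths[i].targetInfo.adapterId;
        name.header.id = paths[i].targetInfo.id;
        if (DisplayConfigGetDeviceInfo(&name.header) == ERROR_SUCCESS) {
            // Per the docs, zero when the target type is unique on the adapter,
            // one-based otherwise - the behavior in question.
            std::printf("target %u: connectorInstance = %u\n",
                        (unsigned)paths[i].targetInfo.id, (unsigned)name.connectorInstance);
        }
    }
    return 0;
}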

Map-Reduce with a wait

The concept of map-reduce is very familiar. It seems like a great fit for a problem I'm trying to solve, but either it's missing something or I lack enough understanding of the concept.
I have a stream of items, structured as follows:
{
    "jobId": 777,
    "numberOfParts": 5,
    "data": "some data..."
}
I want to do a map-reduce on many such items.
My mapping operation is straightforward - take the jobId.
My reduce operation is irrelevant for this phase, but all we know is that it takes multiple strings (the "some data..." part) and somehow reduces them to a single object.
The only problem is - I need all five parts of this job to complete before I can reduce all the strings into a single object. Every item has a "numberOfParts" property which indicates the number of items I must have before I apply the reduce operation. The items are not ordered, therefore I don't have a "partId" field.
Long story short - I need some kind of waiting mechanism that waits for all parts of the job to arrive before initiating the reduce operation, and I need this waiting mechanism to rely on a value that exists within the payload (therefore solutions like Kafka wouldn't work).
Is there a way to do that, hopefully using a single tool/framework?
I only want to write the map/reduce part and the "waiting" logic, the rest I believe should come out of the box.
**** EDIT ****
I'm currently in the design phase of the project and therefore not using any framework yet (such as Spark, Hadoop, etc.).
I asked this because I wanted to find out the best way to tackle this problem.
"Waiting" is not the correct approach.
Assuming your jobId is the key and data contains some number of parts (zero or more), you need multiple reducers: one that gathers all parts of the same job, and another that processes jobs whose collection of parts is greater than or equal to numberOfParts, ignoring the others.
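A rough single-process sketch of that two-stage idea (C++, illustrative names only - in a real streaming framework each stage would be a separate reducer keyed by jobId):

#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct Part {
    int jobId;
    int numberOfParts;
    std::string data;
};

// Hypothetical final reduce: collapse all parts of one job into a single object.
std::string reduceJob(const std::vector<std::string>& parts) {
    std::string out;
    for (const auto& p : parts) out += p;       // placeholder for the real logic
    return out;
}

int main() {
    // Stage 1 ("gather" reducer): group incoming parts by jobId.
    std::map<int, std::vector<std::string>> gathered;

    // Simulated, unordered incoming stream.
    std::vector<Part> stream = {
        {777, 3, "a"}, {42, 2, "x"}, {777, 3, "b"}, {42, 2, "y"}, {777, 3, "c"},
    };

    for (const auto& item : stream) {
        auto& parts = gathered[item.jobId];
        parts.push_back(item.data);
        // Stage 2: only hand the job to the final reduce once all declared parts
        // have arrived; jobs with fewer parts are simply left waiting.
        if ((int)parts.size() >= item.numberOfParts) {
            std::printf("job %d -> %s\n", item.jobId, reduceJob(parts).c_str());
            gathered.erase(item.jobId);
        }
    }
}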

tensorflow repeated running of fully connected model

Question:
How can I "rerun" tensorflow code that depends on queues? Is the best way really to close the session, build the model again, load variables and run?
Motivation:
In a still-unanswered question, I asked how, in a fully connected model, one could interleave actions (such as generating cumulative summaries or calculating AUC on test data) with training that reads data from TensorFlow TFRecords files and tf.Queues.
For example, tf.train.string_input_producer returns a filename_queue. As part of its constructor it takes a "num_epochs" arg. Instead of setting "num_epochs" to 100, I'm thinking of just setting it to 2 to generate summaries every other epoch. This requires running the same code 50 times, hence the need for an efficient answer to the above.

What is the easiest way to achieve parallelization of gridded/looped processes in C++

Everything I describe is currently occurring in a hydrologic model I am building.
I have some for loops that control the reading of input data across gridded data sets. The initial inputs can be anywhere from 100x100 to 3000x3000 cells. After reading in these inputs, I perform some initial calculations (5-10) across the grid. (See my question here for issues related to reading in the inputs: http://bit.ly/1AkyzWy). After the initial calculations, I enter a mode where I step "into" each cell and run 4-15 processes. Each cell has a different subset of roughly 15 processes - some cells are identical to others in terms of the processes that are run, and no cell runs a subset that doesn't exist elsewhere. A time step consists of one complete loop through all of the cells. I run anywhere from 30 to 15,000 time steps.
And now here's the important part, I think: each cell depends on the results of the processes run in the neighboring cells, but not within the same time step. Within a time step, the processes running in a cell reference the results of the processes run in the neighboring cells during the previous time step. Nothing within a cell depends on the processes run in a neighboring cell during the same time step.
So I think my program, which can take an hour or so to run 1500 time steps on 1000x10000 cells, is ripe for parallelization. I've done some initial research into this, but I'm worried about solutions affecting portability and performance on different end users' machines.
Does an easy-to-implement solution exist that doesn't affect portability and adapts to the number of cores on different users' machines?
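The dependence on only the previous time step suggests double buffering plus a parallel loop over cells. A minimal OpenMP sketch under that assumption (Grid, Cell and runProcesses are hypothetical stand-ins for the real model; the pragma is ignored on compilers without OpenMP, so the code stays portable and simply runs serially there):

#include <cstddef>
#include <utility>
#include <vector>

struct Cell { double value = 0.0; };            // stand-in for real per-cell state
using Grid = std::vector<std::vector<Cell>>;

// Placeholder for the 4-15 real processes: reads only `prev`, writes only cur[r][c],
// so no two iterations touch the same output cell and there are no data races.
void runProcesses(const Grid& prev, Grid& cur, std::size_t r, std::size_t c) {
    double sum = prev[r][c].value;
    int n = 1;
    if (r > 0)                  { sum += prev[r - 1][c].value; ++n; }
    if (r + 1 < prev.size())    { sum += prev[r + 1][c].value; ++n; }
    if (c > 0)                  { sum += prev[r][c - 1].value; ++n; }
    if (c + 1 < prev[r].size()) { sum += prev[r][c + 1].value; ++n; }
    cur[r][c].value = sum / n;
}

void simulate(Grid& grid, int timeSteps) {
    Grid prev = grid, cur = grid;               // double buffer: last step vs. current step
    for (int t = 0; t < timeSteps; ++t) {
        // Cells are independent within a time step, so the row loop can be split
        // across cores. Compile with -fopenmp (GCC/Clang) or /openmp (MSVC); without
        // OpenMP the pragma is ignored and the loop simply runs serially.
        #pragma omp parallel for schedule(static)
        for (long long r = 0; r < (long long)prev.size(); ++r)
            for (std::size_t c = 0; c < prev[r].size(); ++c)
                runProcesses(prev, cur, (std::size_t)r, c);
        std::swap(prev, cur);                   // current results become "previous"
    }
    grid = prev;
}

int main() {
    Grid grid(100, std::vector<Cell>(100));
    simulate(grid, 10);
}

OpenMP adapts to the available core count at runtime, which is one way to meet the "different users' machines" requirement without hard-coding a thread count.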

clustered key/value database: most recent record

Imagine the following situation:
There is a distributed key/value database stored on a computer network: one central "main" computer that handles requests, and multiple child machines that store portions of the data.
I.e. something like this:
main computer
|
+--child A
+--child B
+--child C
.....
I.e. "star" topology.
Additional description:
Portions of the database overlap, and several different versions of a record with the same "key" can be stored on several machines at the same time.
A key is not guaranteed to exist on all machines, or on any specific machine.
"Children" do not synchronize data with each other.
Data is requested/read only via the main computer, which must return the most recent version of the data for the requested key.
Data is written only through children - they receive new values from several sources.
Data is never deleted.
Now the main problem:
With such a structure, how do I determine which version is the most recent?
I can think of two ways to deal with the problem:
1. Add a timestamp to every record when it is written into the database via a child machine, and use the timestamp to determine the version.
2. Use a "revision number" or "write operation index" (issued by the main computer, incremented by one for every write operation) instead of timestamps.
However, neither approach is perfect:
The first approach requires perfect clock synchronization across all machines; otherwise the system will fail to deliver the most recent record value.
The second approach makes every child ask the main machine for a timestamp over the network, which introduces write delays; in addition, the main machine has to be locked by a mutex, so multithreading performance will suffer.
What is the better way to deal with this situation?
How do real clustered databases deal with this situation (most recent record version in cluster)?
Your statement that the first approach requires perfect clock synchronization is not correct.
You do not care about the absolute timestamps issued by a child, only the relative timestamps. So as long as the clocks advance at the same rate, they need not be synchronized; you can correct for the known offsets.
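A toy sketch of that correction (C++, illustrative names; a real system would measure and refresh the offsets): the main node keeps a known clock offset per child and compares corrected timestamps when choosing the most recent version.

#include <cstdint>
#include <cstdio>
#include <map>
#include <optional>
#include <string>
#include <vector>

struct Version {
    std::string childId;
    int64_t childTimestampMs;   // timestamp as recorded by the child's own clock
    std::string value;
};

class MainNode {
public:
    // Known offset per child: child clock + offset ~= common reference clock.
    void setClockOffset(const std::string& childId, int64_t offsetMs) {
        offsetMs_[childId] = offsetMs;
    }

    // Pick the most recent of the versions the children returned for one key.
    std::optional<Version> latest(const std::vector<Version>& versions) const {
        std::optional<Version> best;
        int64_t bestCorrected = INT64_MIN;
        for (const auto& v : versions) {
            auto it = offsetMs_.find(v.childId);
            int64_t corrected = v.childTimestampMs + (it != offsetMs_.end() ? it->second : 0);
            if (corrected > bestCorrected) { bestCorrected = corrected; best = v; }
        }
        return best;
    }

private:
    std::map<std::string, int64_t> offsetMs_;
};

int main() {
    MainNode node;
    node.setClockOffset("A", 120);   // child A's clock runs 120 ms behind the reference
    node.setClockOffset("B", 0);
    auto winner = node.latest({{"A", 1000, "v-from-A"}, {"B", 1050, "v-from-B"}});
    if (winner) std::printf("latest = %s\n", winner->value.c_str());   // A wins: 1000+120 > 1050
}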
If the clocks on the children advance at different rates, then you must use a method which involves coordination (writing cannot be lock-free in the slow path). This is provable by contradiction, since obviously two children independently writing a value with time-records that cannot be related to each other will not let an outside observer determine which was written later.
However, you can do the coordination in parallel with the actual write: write to the child and, simultaneously, to an ordered log which allows a determination of which write happened first (you don't need a ticket-type system like you seem to suggest if you've got a write log). So it doesn't necessarily delay the process of writing at all!
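A very rough sketch of that idea (C++, single process standing in for the network, all names illustrative): the log append runs concurrently with the data write, and the log position, not a clock, decides which write is newer.

#include <cstdint>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

struct LogEntry { uint64_t seq; std::string key; std::string childId; };

// Single append-only log: position in the log defines which write is newer.
class WriteLog {
public:
    uint64_t append(const std::string& key, const std::string& childId) {
        std::lock_guard<std::mutex> lk(m_);
        uint64_t seq = next_++;
        entries_.push_back({seq, key, childId});
        return seq;
    }
private:
    std::mutex m_;
    uint64_t next_ = 1;
    std::vector<LogEntry> entries_;
};

struct Child {                                   // stand-in for a real child node
    std::unordered_map<std::string, std::string> store;
    void write(const std::string& k, const std::string& v) { store[k] = v; }
};

void coordinatedWrite(Child& child, WriteLog& log, const std::string& key,
                      const std::string& value, const std::string& childId) {
    uint64_t seq = 0;
    std::thread logAppend([&] { seq = log.append(key, childId); });  // coordination...
    child.write(key, value);                                         // ...overlaps the data write
    logAppend.join();
    // seq would be stored alongside the value so the main node can compare versions.
    std::printf("wrote %s=%s on child %s with log position %llu\n",
                key.c_str(), value.c_str(), childId.c_str(), (unsigned long long)seq);
}

int main() {
    Child a; WriteLog log;
    coordinatedWrite(a, log, "k", "v1", "A");
    coordinatedWrite(a, log, "k", "v2", "A");    // higher log position -> more recent
}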
Take a look at logical-timestamp key-value systems like Accumulo, an HBase alternative (currently in Apache-project incubation) - this is a real-world clustered database doing exactly what you're asking for.