This question is a refinement of my question Different ways of observing data changes.
I still have a lot of classes in my C++ application, which are updated (or could be updated) frequently in complex mathematical routines and in complex pieces of business logic.
If I go for the 'observer' approach, and send out notifications every time a value of an instance is changed, I have 2 big risks:
sending out the notifications itself may slow down the applications seriously
if user interface elements need to be updated by the change, they are updated with every change, resulting in e.g. screens being updated thousands of times while some piece of business logic is executing
Some problems may be solved by adding buffering mechanisms (where you send out notifications when you are about to start an algorithm, and again when the algorithm is finished), but since the business logic may be executed in many places in the software, we end up adding buffering almost everywhere, after every possible action chosen in the menu.
Instead of the 'observer' approach, I could also use the 'mark-dirty' approach, only marking the instances that have been altered, and at the end of the action telling the user interface that it should update itself.
Again, business logic may be executed from everywhere within the application, so in practice we may have to add an extra call (telling all windows they should update themselves) after almost every action executed by the user.
Both approaches seem to have similar, but opposite disadvantages:
With the 'observer' approach we have the risk of updating the user-interface too many times
With the 'mark-dirty' approach we have the risk of not updating the user-interface at all
Both disadvantages could be solved by embedding every application action within additional logic (for observers: sending out start-end notifications, for mark-dirty: sending out update-yourself notifications).
Notice that in non-windowing applications this is probably not a problem. You could e.g. use the mark-dirty approach and only if some calculation needs the data, it may need to do some extra processing in case the data is dirty (this is a kind of caching approach).
However, for windowing applications, there is no signal that the user is 'looking at your screen' and that the windows should be updated. So there is no real good moment where you have to look at the dirty-data (although you could do some tricks with focus-events).
What is a good solution to solve this problem? And how have you solved problems like this in your application?
Notice that I don't want to introduce windowing techniques in the calculation/datamodel part of my application. If windowing techniques are needed to solve this problem, they must only be used in the user-interface part of my application.
Any idea?
An approach I used with a large Windows app a few years back was to use WM_KICKIDLE. All things that are updatable derive from an abstract base class called IdleTarget. An IdleTargetManager then intercepts the KICKIDLE messages and calls update on a list of registered clients. In your instance you could create a list of specific targets to update, but I found the list of registered clients enough.
The only gotcha I hit was with a realtime graph. Using just the kick idle message, it would spike the CPU to 100% due to constant updating of the graph. Using a timer to sleep until the next refresh solved that problem.
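A minimal, framework-neutral sketch of this idea follows. Only the names IdleTarget and IdleTargetManager come from the description above; onIdle, registerTarget, unregisterTarget and the minimum-interval parameter are assumptions made for illustration. The WM_KICKIDLE hookup is not shown: the UI layer would call onIdle() from its idle handler. A simple minimum-interval check stands in here for the timer mentioned above, so a continuously changing view cannot refresh on every idle message.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Anything that can refresh itself during idle time derives from this.
class IdleTarget {
public:
    virtual ~IdleTarget() = default;
    virtual void update() = 0;
};

// Hypothetical manager: the UI layer calls onIdle() from its idle handler
// (for example on WM_KICKIDLE); the data model never sees this class.
class IdleTargetManager {
public:
    using Clock = std::chrono::steady_clock;

    explicit IdleTargetManager(std::chrono::milliseconds minInterval)
        : minInterval_(minInterval) {}

    void registerTarget(IdleTarget* target) { targets_.push_back(target); }

    void unregisterTarget(IdleTarget* target) {
        targets_.erase(std::remove(targets_.begin(), targets_.end(), target),
                       targets_.end());
    }

    // Returns true if the targets were updated, false if the call was
    // skipped because the previous refresh was too recent.
    bool onIdle(Clock::time_point now = Clock::now()) {
        if (hasRun_ && now - lastRun_ < minInterval_) {
            return false;
        }
        hasRun_ = true;
        lastRun_ = now;
        for (IdleTarget* target : targets_) {
            target->update();
        }
        return true;
    }

private:
    std::chrono::milliseconds minInterval_;
    std::vector<IdleTarget*> targets_;
    Clock::time_point lastRun_{};
    bool hasRun_ = false;
};
```

Each window would register an IdleTarget whose update() checks the model's dirty state and repaints only if needed, which keeps all windowing code on the user-interface side.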
If you need more assistance - I am available at reasonable rates...:-)
Another point I was thinking about.
If you are overwhelmed by the number of events generated, and possibly the extra-work it is causing, you may have a two phases approach:
Do the work
Commit
where notifications are only sent on commit.
It does have the disadvantage of forcing you to rewrite some code...
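As a rough sketch of the two phases, assuming a hypothetical Model class with a single value (none of these names come from the question): setters only record that something changed, and commit() is the one place where listeners are told.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model used only to illustrate "do the work, then commit".
class Model {
public:
    using Listener = std::function<void()>;

    void addListener(Listener listener) {
        listeners_.push_back(std::move(listener));
    }

    // Phase 1: do the work. No notification is sent here.
    void setValue(double value) {
        if (value != value_) {
            value_ = value;
            dirty_ = true;
        }
    }

    double value() const { return value_; }

    // Phase 2: commit. Listeners hear about the change once, and only
    // if something actually changed since the last commit.
    void commit() {
        if (!dirty_) {
            return;
        }
        dirty_ = false;
        for (const Listener& listener : listeners_) {
            listener();
        }
    }

private:
    double value_ = 0.0;
    bool dirty_ = false;
    std::vector<Listener> listeners_;
};
```

A routine that calls setValue() a thousand times and then commit() once triggers a single notification; a commit() with nothing changed triggers none.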
You could use the observer pattern with coalescing. It might be a little ugly to implement in C++, though. It would look something like this:
m_observerList.beginCoalescing();
m_observerList.notify();
m_observerList.notify();
m_observerList.notify();
m_observerList.endCoalescing(); //observers are notified here, only once
So even though you call notify three times, the observers aren't actually notified until endCoalescing when the observers are only notified once.
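One possible shape for such an observer list is sketched below. The method names beginCoalescing, endCoalescing and notify follow the snippet above; the nesting counter, the add() method and the CoalescingScope guard are assumptions added for illustration.

```cpp
#include <functional>
#include <utility>
#include <vector>

class ObserverList {
public:
    using Observer = std::function<void()>;

    void add(Observer observer) { observers_.push_back(std::move(observer)); }

    // Coalescing sections may nest; only the outermost end delivers.
    void beginCoalescing() { ++depth_; }

    void endCoalescing() {
        if (depth_ > 0 && --depth_ == 0 && pending_) {
            pending_ = false;
            deliver();
        }
    }

    void notify() {
        if (depth_ > 0) {
            pending_ = true;  // remember the request, deliver it later
        } else {
            deliver();
        }
    }

private:
    void deliver() {
        for (const Observer& observer : observers_) {
            observer();
        }
    }

    std::vector<Observer> observers_;
    int depth_ = 0;
    bool pending_ = false;
};

// Hypothetical RAII guard so endCoalescing() also runs on early return.
class CoalescingScope {
public:
    explicit CoalescingScope(ObserverList& list) : list_(list) {
        list_.beginCoalescing();
    }
    ~CoalescingScope() { list_.endCoalescing(); }
    CoalescingScope(const CoalescingScope&) = delete;
    CoalescingScope& operator=(const CoalescingScope&) = delete;

private:
    ObserverList& list_;
};
```

The guard addresses part of the "ugly in C++" concern: a piece of business logic declares one CoalescingScope at the top of the function, and nested calls into other routines that do the same still produce a single notification at the outermost scope exit.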
I would like to use separate databases for runtime and history data without implementing a custom HistoryEventHandler. Does someone know how this is possible?
I read the Camunda user guides, but this did not help much because they only hint at the custom implementation route.
Currently, every time I query history data (about 2 million activity entries) the performance of the system drops, as it more or less blocks the runtime too. I'd like to avoid this without losing the ability to query historic data.
That would be a really cool feature, but it is currently not supported. You will have to disable the default history and implement a custom handler.
Camunda BPM offers Optimize, which pulls the history data from the Engine to an Elastic Search database. If you are using the Enterprise version, it may be a way to solve it.
(Based on your comments to other answers, it appears that you're interested in learning more about custom HistoryEventHandler implementations. Thus, I'm adding this answer in the hope that it will help.)
Implementing a custom History Event Handler isn't difficult, but there are a few important points to keep in mind:
Unless you want to skip the storage of history information in the standard Camunda history tables, you'll want to use their CompositeHistoryEventHandler. This simply gives you the ability to use multiple HistoryEventHandler implementations.
Any HistoryEventHandler implementations will complete in the same threads as the ones executing process instances; thus, you will want to be cognizant of the performance impacts your custom HistoryEventHandler will have.
You may want to consider publishing your history events through a message bus or messaging system to allow for reliable delivery without impacting Camunda workflow instance performance.
Finally, it may make sense to use your custom HistoryEventHandler along with Camunda's default HistoryEventHandler and their functionality for deleting process instances after a period of time. This would allow you to use their querying capabilities for some period of time without having the history stack up (and thus slowing down your system).
I've been diving into ReST lately, and a few things still bug me:
1) Since there are only resources and no services to call, how can I provide operations to the client that only do stuff and don't change any data?
For example, in my application it is possible to trigger a service that connects to a remote server and executes a shell script. I don't know how this scenario would map to a resource.
2) Another thing I'm not sure about is side effects: Let's say I have a resource that can be in certain states. When transitioning into another state, a lot of things might happen (e-mails might be sent). The transition is triggered by the client. Should I handle this transition merely by letting the resource be updated via PUT? This feels a bit odd.
For the client this means that updating an attribute of this resource might only change the attribute, or it might also do a lot of other things. So PUT =/= PUT, kind of.
And implementation-wise, I have to check what exactly the PUT request changed, and trigger the side effects accordingly. So there would be a lot of checks like if(old_attribute != new_attribute) {side_effects}
Is this how it's supposed to be?
BR,
Philipp
Since there are only resources and no services to call, how can I provide operations to the client that only do stuff and don't change any data?
HTTP is a document transport application. Send documents (ie: messages) that trigger the behaviors that you want.
In other words, you can think about the message you are sending as a description of a task, or as an entry being added to a task queue. "I'm creating a task resource that describes some work I want done."
Jim Webber covers this pretty well.
Another thing I'm not sure about is side effects: Let's say I have a resource that can be in certain states. When transitioning into another state, a lot of things might happen (e-mails might be sent). The transition is triggered by the client. Should I handle this transition merely by letting the resource be updated via PUT?
Maybe, but that's not your only choice -- you could handle the transition by having the client put some other resource (ie, a message describing the change to be made). That affords having a number of messages (commands) that describe very specific modifications to the domain entity.
In other words, you can work around PUT =/= PUT by putting more specific things.
(In HTTP, the semantics of PUT are effectively create or replace. That is great for dumb documents, or CRUD, but needs a bit of design help when applied to an entity with its own agency.)
And implementation-wise, I have to check what exactly the PUT request changed, and trigger the side effects accordingly.
Is this how it's supposed to be?
Sort of. Review Udi Dahan's talk on reliable messaging; it's not REST specific, but it may help clarify the separation of responsibilities here.
I'm writing PHP for a fairly simple workflow for Amazon SWF. I've found myself starting to write a library to check whether certain actions have been started or completed: essentially looping over the event list to check how things have progressed, and then starting an appropriate activity if it's needed. This can be a bit faffy at times, as the activity type and input information isn't in every event; it seems to be only in the ActivityTaskScheduled event. I've discovered this sort of thing along the way, and I'm concerned that I could be missing subtle things about event lists.
It makes me suspect that someone must have already written some sort of generic library for finding the current state of various activities. Maybe even some sort of more declarative way of coding up the flowcharts that are associated with SWF. Does anything like this exist for PHP?
(Googling hasn't come up with anything)
I'm not aware of anything out there that does what you want, but you are doing it right. What you're talking about is coding up the decider, which necessarily has to look at the entire execution state (basically loop through the event list) and decide what to do next.
Here's an example written in python
( Using Amazon SWF To communicate between servers )
that looks for events of type 'ActivityTaskCompleted' to then decide what to do next, and then, yes, looks at the previous 'ActivityTaskScheduled' entry to figure out what the attributes for the previous task were.
If you write a PHP framework that specifies the workflow in a declarative way, and then a generic decider that implements it, please consider sharing it :)
I've since found https://github.com/cbalan/aws-swf-fluent-php which looks promising, but I've not really used it, so I can't speak to whether it works or not.
I've forked it and started a bit of very light refactoring to allow some testing, available at https://github.com/michalc/aws-swf-fluent-php
Since there's no complete BPM framework/solution in ColdFusion as of yet, how would you model a workflow into a ColdFusion app that can be easily extensible and maintainable?
A business workflow is more than a flowchart that maps nicely onto a programming language. For example:
How do you model a task X that is followed by multiple tasks Y0, Y1, Y2 that happen in parallel, where Y0 is a human process (need to wait for input), Y1 is a web service that might go wrong and might need auto retry, and Y2 is an automated process; followed by a task Z that should only be carried out when all Y's are completed?
My thoughts...
Seems like I need to do a whole lot of storing / managing / keeping
track of states, and frequent checking with cfschedule.
cfthread isn't going to help much since some tasks can take days
(e.g. wait for user's confirmation).
I can already imagine the flow is going to be spread around in multiple UDFs,
DB, and CFCs.
Is there any open-source workflow engine in another language that we could port over to CF?
Thank you for your brain power. :)
Study the Java Process Definition Language specification, for which JBoss has an execution engine. Using this Java-based engine may be your easiest solution, and it solves many of the problems you've outlined.
If you intend to write your own, you will probably end up modelling states and transitions as vertices and edges in a directed graph. These, as Ciaran Archer wrote, are the components of a state machine. The best persistence approach IMO is capturing versions of whatever data is being sent through the workflow via serialization, capturing the current state, and keeping a history of transitions between states and changes to that data. The mechanism probably also needs a way to keep track of who or what has responsibility for taking the next action against that workflow.
Based on your question, one thing to consider is whether or not you really need to represent parallel tasks in your solution. It might instead be possible to enqueue a set of messages and then specify a wait state for all of those to complete. Representing actual parallelism implies you are moving data simultaneously through several different processes, in which case, when they join again, you need an algorithm to resolve deltas, which is very much a nontrivial task.
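The "wait state" bookkeeping can be quite small. The sketch below is in C++ purely for illustration (the same logic ports to a CFC); JoinState and its methods are hypothetical names, and in a real system the pending set would be persisted to the database rather than held in memory.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical wait state: task Z becomes runnable once every task
// that was enqueued after X has reported completion.
class JoinState {
public:
    explicit JoinState(std::set<std::string> pending)
        : pending_(std::move(pending)) {}

    // Record that one task finished. Returns true when nothing is left
    // to wait for, i.e. the workflow may move on to the next task.
    // Reporting the same task twice (e.g. after a retry) is harmless.
    bool complete(const std::string& taskId) {
        pending_.erase(taskId);
        return pending_.empty();
    }

    bool ready() const { return pending_.empty(); }

    const std::set<std::string>& pending() const { return pending_; }

private:
    std::set<std::string> pending_;
};
```

With Y0, Y1 and Y2 in the pending set, it does not matter whether a user confirms Y0 days later or the web service behind Y1 succeeds on its third retry: each completion is just a message that removes one entry, and Z is started by whichever completion empties the set.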
In the context of ColdFusion and what you're trying to accomplish, a scheduled task may be necessary if the system you're writing needs to poll other systems. Consider WDDX as a serialization format. JSON, while seductively simple, has (as I recall) some edge cases around numbers and dates that can cause you grief.
Finally see my answer to this question for some additional thoughts.
Off the top of my head I'm thinking about the State design pattern with state persisted to a database. Check out the Gumball Machine example in Head First Design Patterns.
Generally this will work if you have something (like a client / order / etc.) going through a number of changes of state.
Different things will happen to your object depending on what state you are in, and that might mean sitting in a database table waiting for a flag to be updated by a user manually.
In terms of other languages I know Grails has a workflow module available. I don't know if you would be better off porting to CF or jumping ship to Grails (right tool for the job and all that).
It's just a thought, hope it helps.
I currently have a GUI single-threaded application in C++ and Qt. It takes a good 1 minute to load (read from disk) and ~5 seconds to close (saving settings, finalize connections, ...).
What can I do to make my application appear to be faster?
My first thought was to have a server component of the app that does all the work while the GUI component is only for displaying. The communication would be done via socket, pipe or memory map. That seems like overkill (in terms of development effort) since my application is only used by a handful of people.
The first step is to start profiling. Use an actual, low-overhead profiling tool (eg, on Linux, you could use oprofile), not guesswork. What is your app doing in that one minute it takes to start up? Can any of that work be deferred until later, or perhaps skipped entirely?
For example, if you're loading, say, a list of document templates, you could defer that until the user tells you to create a new document. If you're scanning the system for a list of fonts, load a cached list from last startup and use that until you finish updating the font list in a separate thread. These are just examples - use a profiler to figure out where the time's actually going, and then attack the code starting with the largest time figures.
In any case, some of the more effective approaches to keep in mind:
Skip work until needed. If you're doing initialization for some feature that's used infrequently, skip it until that feature is actually used.
Defer work until after startup. You can take care of a lot of things on a separate thread while the UI is responsive. If you are collecting information that changes infrequently but is needed immediately, consider caching the value from a previous run, then updating it in the background.
For your shutdown time, hide your GUI instantly, and then spend those five seconds shutting down in the background. As long as the user doesn't notice the work, it might as well be instantaneous.
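The "skip work until needed" advice above can be sketched with a small lazy wrapper. Lazy is a hypothetical helper written for this example, not a Qt or standard library class; it requires C++17 for std::optional and is not thread-safe, so it is meant to be used from the GUI thread only.

```cpp
#include <functional>
#include <optional>
#include <utility>

// Hypothetical helper: runs the loader the first time the value is
// requested instead of at startup, then keeps the result.
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<T()> loader) : loader_(std::move(loader)) {}

    // Loads on first call; later calls return the stored value.
    const T& get() {
        if (!value_) {
            value_ = loader_();
        }
        return *value_;
    }

    bool loaded() const { return value_.has_value(); }

private:
    std::function<T()> loader_;
    std::optional<T> value_;
};
```

Wrapping something like the document template list in a Lazy means startup pays nothing for it, and the cost is moved to the first time the user actually opens the "new document" dialog.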
You could employ the standard trick of showing something interesting while you load.
Many games nowadays, for example, show a tip or two while they are loading.
It looks to me like you're only guessing at where all this time is being burned. "Read from disk" would not be high on my list of candidates. Learn more about what's really going on.
Use a decent profiler.
Profiling is a given, of course.
Most likely, you will find that I/O is substantial: reading in your startup files. As bdonlan notes, deferring work is a standard technique. Google 'lazy evaluation'.
You can also consider caching data that does not change. Save a cache in a faster format, such as binary. This is most useful if you happen to have a large static data set read into something like an array.