Is it possible to use policy based design together with automated testing? - templates

I am developing a numerical simulations library which is centred around a single collection of data operated on by different computational algorithms. The algorithms are complex, they have different states involving multiple parameters, and are interchangeable (under some semantic restrictions).
To avoid bloated interface of the collection and to enable different implementations etc, I'm thinking about using policy based design. This gives the collection a wide combination of choices between storage structures, algorithms, parameters, internal stuff.
If I imagine that I redesigned my generic / object oriented existing design usign policies, how can I choose the optimal algorithms and data structures? Conceptually I need to define the set of policies and a set of verification test cases and execute a parametric study.
This is easy when object oriented programming is used since I can determine all necessary types and their parameters during run-time using e.g. a string-based Abstract Factory with type names stored in the input file, that is then changed by an external script that executes the client application on a family of test cases.
How do I do that with policies, where a combination of N policies ends up in being N different client applications?
How is automated testing done together with policy based design in a professional way?

If you're representing algorithms as policies, you /should/ have a pretty uniform interface already thought up. You could imagine an "AlgorithmPolicy" processing some data from your data store and returning some representation of the results.
"If I imagine that I redesigned my generic / object oriented existing design usign policies, how can I choose the optimal algorithms and data structures?"
If your object oriented design currently makes use of the strategy pattern (see also: the Gang of Four book), your policies will simply replace every place that you've used a strategy. Choosing "optimal algorithms" for the different policies you design will simply be a matter of nailing the right conceptual structure / interface for those policies. (If you're going to use many different data stores, make sure that the interface for adding / removing / getting data from them is uniform, for example. Here, it can be helpful to think of three examples and find commonalities... then think of another exmaple and make sure it fits the schema. Iterate until things feel correct.)
You'll still have adequate type checking, it'll just feel a bit different (and you may run into some nasty compile errors occaisionally. ;)
Testing will simply be a matter of writing some unit tests for each of the configurations / policy combinations you'd like to cover. You probably should already be writing these tests anyways; the primary difference is that you'll want to try to hit the interfaces you designate rather than targetting specifics.
You can validate different storage methods based on validations of your algorithm policies. (So, if I have some algorithm that can be stored in different ways, I can run the algorithm on some test data for ecah storage mechanism and expect the same results.) Assuming that you've spec'd out the inteface correclty, you should only need to write a single test for each additional storage mechanism you add.
Again: It'd be nice to have more details about the structure of the program, what different parameters and such you'd need to pass in. (Is any of this code open source / going to be open sourced?)
From what you've said, in my mind, your complicated-policy process may have an interface like so:
FancyDataStore.Process()
For testing it, I'd write:
MockAlgorithmPolicy - A very simple algorithm that's trivial to validate.
MockInternalStuffPolicy - A very simple internal stuff policy that causes no integrations / reports nothing new.
MockStoragePolicy - A very simple storage policy that meets your interface for storage / doesn't cause many issues.
Write a test that validates the mocks put together...
For each StoragePolicy you create, write an automated test to validate it:
testSomeStoragePolicy{
// has a call to:
FancyDataStore.Process<MockAlgorithmPolicy, SomeStoragePolicy, MockInternalStuff>()
// validate...
}
That should prove that the SomeStoragePolicy works as expected.
Then, for your algorithms, you could write:
testSomeAlgorithmPolicy{
FancyDataStore.process<SomeAlgorithmPolicy, MockStoragePolicy, MockInternalStuff>();
///Validate.
}
etc.
This way, you write basically 1 test per each policy you end up writing (which seems feasible and not too ridiculous) Additionally, you can always add additional unit tests to cover other subtle integrations that may spin up over time.
If you're looking for good books on this subject, I'd suggest reading "Modern C++ Programming"; it provides a great primer on policy-driven design in C++.

Related

How to call SQL functions / stored procedure when using the Repository pattern

What is the best way to call a SQL function / stored procedure when converting code to use the repository pattern? Specifically, I am interested in read/query capabilities.
Options
Add an ExecuteSqlQuery to IRepository
Add a new repository interface specific to the context (i.e. ILocationRepository) and add resource specific methods
Add a special "repository" for all the random stored procedures until they are all converted
Don't. Just convert the stored procedures to code and place the logic in the service layer
Option #4 does seem to be the best long term solution, but it's also going to take a lot more time and I was hoping to push this until a future phase.
Which option (above or otherwise) would be "best"?
NOTE: my architecture is based on ardalis/CleanArchitecture using ardalis/Specification, though I'm open to all suggestions.
https://github.com/ardalis/CleanArchitecture/issues/291
If necessary, or create logically grouped Query services/classes for
that purpose. It depends a bit on the functionality of the SPROC how I
would do it. Repositories should be just simple CRUD, at most with a
specification to help shape the result. More complex operations that
span many entities and/or aggregates should not be added to
repositories but modeled as separate Query objects or services. Makes
it easier to follow SOLID that way, especially SRP and OCP (and ISP)
since you're not constantly adding to your repo
interfaces/implementations.
Don't treat STORED PROCEDURES as 2nd order citizens. In general, avoid using them because they very often take away your domain code and hide it inside database, but sometimes due to performance reasons, they are your only choice. In this case, you should use option 2 and treat them same as some simple database fetch.
Option 1 is really bad because you will soon have tons of SQL in places you don't want (Application Service) and it will prevent portability to another storage media.
Option 3 is unnecessary, stored procedures are no worse than simple Entity Framework Core database access requests.
Option 4 is the reason why you cannot always avoid stored procedures. Sometimes trying to query stuff in application service/repositories will create very big performance issues. That's when, and only when, you should step in with stored procedures.

How to make proper design/architecture of partially reusable algorithm? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
I am very sorry for the long explanation, but it is required for proper understanding.
I am working on computer vision algorithms for industrial tasks. Computer vision algorithms tend to be very complicate. Usually they involve calls for dozens (at the very least) of simpler algorithms (that are not simple either). Those calls form certain hierarchy: bigger tasks call some smaller ones, which in turn call even smaller ones, and so on.
Let’s take for example typical computer vision task: find object in image under certain conditions. This is a task that should be performed in dozens of different applications. Each application has its own set of conditions and thus it is impossible to create single algorithm that works for all of them. But they are pretty similar. Usually it is enough to replace one or two lower level functions. For example: use different method for detection of points of interest in image.
And here comes a problem: for each new application I had to copy whole code from one of the existing applications and adapt relevant parts, which is a bad practice. I am trying to eliminate those duplications by creating system of algorithms that can be used in all application without changing the code itself. Here is the list of issues system had to deal with (at least the ones I identified so far):
1) Arguments provided to main algorithm should be able to set the 'algorithmic flow' inside the system, i.e. they determine what lower level algorithms are used and how
2) Different sub-algorithms that perform same task may require different inputs. One may need an array of ints, another requires pair of double, and so on... Algorithms on the higher level should be oblivious to replacement of one sub-algorithm with another. That means they should not be aware of what arguments they receive and pass down to sub-algorithms. Same true for output of sub-algorithm. It may vary if different combination of sub-algorithms is used
3) The system must be extendable. If new sub-algorithm became available (for example: yet another way to find points of interest) the system should be able to call it. I understand that changes might be unavoidable at this point, but I would like to keep them at minimum. And in any case the system should be able to work at the same way with previous sets of arguments.
4) System must be debuggable. End user of the system should have reasonable way to dump debug information about the 'algorithmic flow' in his system, so that algorithm developer will be able to recreate the situation. It is not that trivial considering requirement (3).
5) There should be reasonable way to make sanity check for the flow of algorithms.
6) I am not going to throw exceptions but there should be reasonable way to return success / fail status of each algorithm. Again it is not easy because of requirement (3).
7) This one is more 'good to have' rather than 'must have', but it may be important. Some calculations may be performed by multiple sub-algorithms. For example calculation of gradients in image may (or may not) be required for multiple different tasks. It is good to have an option to store results of those calculations in order to reuse them later.
I created some kind of solution to this but it is far from being good. Do you have any recommendations about how this should be done?
Used language: C++
Thanks you
I'd just use some tried and true design patterns.
Use a strategy pattern to represent an algorithm that you may wish to swap out for alternatives.
Use a factory to instantiate different algorithm (strategy) instances based on some input parameter or runtime context - I'm a fan of the prototype factory where you have "inert" instances of each object in some lookup table, and based on a key you pass in you can request a clone of the one needed. I like it mainly because it's easiest to extend - you can even add new configured prototype instances to such a factory at runtime.
Note that the same "strategy" model does not have to serve for everything - it sounds like you might have some higher-level/fuzzy operations which then assemble or chain together low-level/detailed operations. The high level operations could be one type of abstract object while the detailed algorithms are the more concrete strategy instances.
As far as the inputs to the various algorithms, if it varies a lot from algorithm to algorithm you could use an extensible object like a dictionary for parameters so that each algorithm can use just the parameters it needs and ignore the others for an operation. If the dictionary is modifiable during the operation this would also permit upstream algorithms to add parameters for downstream algorithms. Key-value pairs are pretty easy to dump to a log or view in a debugger.
If each strategy instance has a unique semantic identifier you could easily debug the algorithms that get instantiated and chained together. (I use an audio DSP library that has a function to dump a description of the whole chain of configured audio processors, it's very handy).
If you use a system with strategy patterns and extensible parameters you should also be able to segregate shared algorithms from application-specific algorithms, but still have the same basic framework for instantiating and running them.
hth
I'm going to assume that you are a competent OO programmer with good domain knowledge, and your problem is more about a higher level of organisation of software components (implementing algorithms) than OO generally provides.
The patterns mentioned by #orpheist make perfect sense. Consider them. They will not solve all the problems you list. You should also consider the following.
In what form will the data be for algorithms to access?
Will you need adapters to connect one component to another?
Do you pass the data to the component or the component to the data?
Do you want to assemble a pipeline or group of components to build higher ones, which can then be applied to the data?
Do you need a language (XML, DSL) to express connections and to allow for easy experimentation?
Is performance a dominant issue already, or can you afford more interpretive techniques at this stage?
It think you need to refine some of your questions and provide some more concrete specifics. I also think your questions would be a better fit on programmers.stackexchange than here.

Web Service ‘mandatory/optional’ fields: XSD Design time vs Runtime

We are currently building a pile of SOAP Web Service to front the access of various backend systems.
While defining our Request/Response message XML, we see multiple services needing the ‘Account’ object with different ‘mandatory/optional’ fields.
How should we define and enforce the validation of these ‘mandatory/optional’ fields on the same Message? I see these options
1) Enforce validation with XSD by creating different 'Account' Complexe Type
Pros : Design time clarity.
Cons : proliferation of Object Type, Less reuse of Object,
2) Enforce validation with XSD by Extending+Restriction a single base 'Account' type
Pros : Design time clarity.
Cons : Not sure of the support of the Extend+Restriction feature (java, .Net)
3) Using a single 'Account' type and enforcing validation in runtime (ie in the Code).
Pros: Simple
Cons: No design time validation. Need to communicate field requirements via a specification doc.
What are you’re thoughts on that?
I would have to assume that: i) some of what you would call optional fields are actually fields that are not applicable (don't make sense) to all accounts and ii) we're not talking trivial scenarios (like two type of accounts with 2 fields each-kind of thing).
Firstly, I would say that unless you're really lucky, from a requirements perspective, then you're going to end up with some sort of "validation in runtime" no matter what option you're going with. XML Schema can't express some common data validation requirements, such as cross field validation; or simply because the data in your XML is not sufficient to feed the rules to validate the integrity of the message (the data in the message being a subset on what's available at the time the XML is being un/marshalled).
Secondly, I would avoid deriving new complex types through restricton; from an authoring perspective you don't achieve much in terms of reuse, and you might end up with problems in how that is interpreted by your XSD to code tooling. I like to think that the original intention of deriving through restriction was to provide a tool for people to use in xsd:redefine scenarios; for people that wouldn't want to fiddle with XML Schemas that were authored by someone else. If one owns (authors) the schema, one can work around the need to restrict by defining the "lesser" object first and extend from that.
As to the "proliferation of objects", you are kind of getting that with option #2 as well (when compared with #1); what I mean by that, all the tools I know will create a class for each named (global) complex type you have in your XSD; so if you have to have three type of accounts, you'll have three for scenario #1, and four, or so, if you choose to extend from one, or so, base classes; a worst case scenario for the later would be when you need three specializations (concrete if you wish); anyway, from my experience, the difference in real life scenarios is not something that would really tip the decision one way or the other.
Extending base types in XML Schema is good for reuse; however, reuse brings coupling; if you're analysing this from a forward/backward compatibility point of view, extending something in the base type could mess up some of the unmarshalling (deserialization) of the XML for clients of your service(s) that don't want to change their code base, yet you want to maintain only one Web Service endpoint for all; in this case, a forward-compatibility strategy that relies on an xsd:any at the end of a compositor (xsd:sequence) would be rendered useless in your first release that goes and extends your base type.
There is even more; because of this, I don't think there's a correct answer, just for the criteria you seem to imply by setting your pro/cons.
All of my preferred options below assume that you put high value on the requirement to ensure forward/backward compatibility of your services, and you want to minimize the cost of your clients having to deal with your services (because of XML Schema changes).
I would say that if all your domain (accounts in particular) can be fully modeled (assume no future change basically) and that there is enough commonality to justify reuse, then go with option #2. Otherwise, go with option #1 since I have yet to see things that don't change...
If the modeling of your domain can be done 80% or more (or some number that you think is high) and that there is enough commonality to justify reuse, then I would still go with option #2, with the caveat that any future extensions for common attributes across accounts, must be applied for each individual account (basically turning your option into a hybrid, by doing #1).
For anything else, I would go #1. Whew, I can't believe I wrote all of this...

How should I design a mechanism in C++ to manage relatively generic entities within a simulation?

I would like to start my question by stating that this is a C++ design question, more then anything, limiting the scope of the discussion to what is accomplishable in that language.
Let us pretend that I am working on a vehicle simulator that is intended to model modern highway systems. As part of this simulation, entities will be interacting with each other to avoid accidents, stop at stop lights and perhaps eventually even model traffic enforcement with radar guns and subsequent exciting high speed chases.
Being a spatial simulation written in C++, it seems like it would be ideal to start with some kind of Vehicle hierarchy, with cars and trucks deriving from some common base class. However, a common problem I have run in to is that such a hierarchy is usually very rigidly defined, and introducing unexpected changes - modeling a boat for instance - tends to introduce unexpected complexity that tends to grow over time into something quite unwieldy.
This simple aproach seems to suffer from a combinatoric explosion of classes. Imagine if I created a MoveOnWater interface and a MoveOnGround interface, and used them to define Car and Boat. Then lets say I add RadarEquipment. Now I have to do something like add the classes RadarBoat and RadarCar. Adding more capabilities using this approach and the whole thing rapidly becomes quite unreasonable.
One approach I have been investigating to address this inflexibility issue is to do away with the inheritance hierarchy all together. Instead of trying to come up with a type safe way to define everything that could ever be in this simulation, I defined one class - I will call it 'Entity' - and the capabilities that make up an entity - can it drive, can it fly, can it use radar - are all created as interfaces and added to a kind of capability list that the Entity class contains. At runtime, the proper capabilities are created and attached to the entity and functions that want to use these interfaced must first query the entity object and check for there existence. This approach seems to be the most obvious alternative, and is working well for the time being. I, however, worry about the maintenance issues that this approach will have. Effectively any arbitrary thing can be added, and there is no single location in which all possible capabilities are defined. Its not a problem currently, when the total number of things is quite small, but I worry that it might be a problem when someone else starts trying to use and modify the code.
As one potential alternative, I pondered using the template system to achieve type safe while keeping the same kind of flexibility. I imagine I could create entities that inherited whatever combination of interfaces I wanted. Using these objects would entail creating a template class or function that used any combination of the interfaces. One example might be the simple move on road using just the MoveOnRoad interface, whereas more complex logic, like a "high speed freeway chase", could use methods from both MoveOnRoad and Radar interfaces.
Of course making this approach usable mandates the use of boost concept check just to make debugging feasible. Also, this approach has the unfortunate side effect of making "optional" interfaces all but impossible. It is not simple to write a function that can have logic to do one thing if the entity has a RadarEquipment interface, and do something else if it doesn't. In this regard, type safety is somewhat of a curse. I think some trickery with boost any may be able to pull it off, but I haven't figured out how to make that work and it seems like way to much complexity for what I am trying to achieve.
Thus, we are left with the dynamic "list of capabilities" and achieving the goal of having decision logic that drives behavior based on what the entity is capable of becomes trivial.
Now, with that background in mind, I am open to any design gurus telling me where I err'd in my reasoning. I am eager to learn of a design pattern or idiom that is commonly used to address this issue, and the sort of tradeoffs I will have to make.
I also want to mention that I have been contemplating perhaps an even more random design. Even though I my gut tells me that this should be designed as a high performance C++ simulation, a part of me wants to do away with the Entity class and object-orientated foo all together and uses a relational model to define all of these entity states. My initial thought is to treat entities as an in memory database and use procedural query logic to read and write the various state information, with the necessary behavior logic that drives these queries written in C++. I am somewhat concerned about performance, although it would not surprise me if that was a non-issue. I am perhaps more concerned about what maintenance issues and additional complexity this would introduce, as opposed to the relatively simple list-of-capabilities approach.
Encapsulate what varies and Prefer object composition to inheritance, are the two OOAD principles at work here.
Check out the Bridge Design pattern. I visualize Vehicle abstraction as one thing that varies, and the other aspect that varies is the "Medium". Boat/Bus/Car are all Vehicle abstractions, while Water/Road/Rail are all Mediums.
I believe that in such a mechanism, there may be no need to maintain any capability. For example, if a Bus cannot move on Water, such a behavior can be modelled by a NOP behavior in the Vehicle Abstraction.
Use the Bridge pattern when
you want to avoid a permanent binding
between an abstraction and its
implementation. This might be the
case, for example, when the
implementation must be selected or
switched at run-time.
both the abstractions and their
implementations should be extensible
by subclassing. In this case, the
Bridge pattern lets you combine the
different abstractions and
implementations and extend them
independently.
changes in the implementation of an
abstraction should have no impact on
clients; that is, their code should
not have to be recompiled.
Now, with that background in mind, I am open to any design gurus telling me where I err'd in my reasoning.
You may be erring in using C++ to define a system for which you as yet have no need/no requirements:
This approach seems to be the most
obvious alternative, and is working
well for the time being. I, however,
worry about the maintenance issues
that this approach will have.
Effectively any arbitrary thing can be
added, and there is no single location
in which all possible capabilities are
defined. Its not a problem currently,
when the total number of things is
quite small, but I worry that it might
be a problem when someone else starts
trying to use and modify the code.
Maybe you should be considering principles like YAGNI as opposed to BDUF.
Some of my personal favourites are from Systemantics:
"15. A complex system that works is invariably found to have evolved from a simple system that works"
"16. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."
You're also worring about performance, when you have no defined performance requirements, and no problems with performance:
I am somewhat concerned about
performance, although it would not
surprise me if that was a non-issue.
Also, I hope you know about double-dispatch, which might be useful for implementing anything-to-anything interactions (it's described in some detail in More Effective C++ by Scott Meyers).

How can I decrease complexity in library without increasing complexity elsewhere?

I am tasked to maintain and update a library which allows a computer to send commands at a hardware device and then receive its response. Currently the code is setup in such a way that every single possible command the device can receive is sent via its own function. Code repetition is everywhere; a DRY advocate's worst nightmare.
Obviously there is much opportunity for improvement. The problem is each command has a different payload. Currently the data that is to be the payload is passed to each command function in the form of arguments. It's difficult to consolidate functionality without pushing the complexity to a level that calls the library.
When a response is received from the device its data is put into an object of a class solely responsible for holding this data, they do nothing else. There are hundreds of classes which do this. These objects are then used to access the returned data by the app layer.
My objectives:
Throughly reduce code repetition
Maintain similiar level of complexity at application layer
Make it easier to add new commands
My idea:
Have one function to send a command and one to receive (the receiving function is automatically called when a response from the device is detected). Have a struct holding all command/response data which will be passed to sending function and returned by receiving function. Since each command has a corresponding enum value, have a switch statement which sets up any command specific data for sending.
Is my idea the best way to do it? Is there a design pattern I could use here? I've looked and looked but nothing seems to fit my needs.
Thanks in advance! (Please let me know if clarification is necessary)
This reminds me of the REST vs. SOA debate, albeit on a smaller physical scale.
If I understand you correctly, right now you have calls like
device->DoThing();
device->DoOtherThing();
and then sometimes I get a callback like
callback->DoneThing(ThingResult&);
callback->DoneOtherTHing(OtherThingResult&)
I suggest that the user is the key component here. Do the current library users like the interface at the level it is designed? Is the interface consistent, even if it is large?
You seem to want to propose
device->Do(ThingAndOtherThingParameters&)
callback->Done(ThingAndOtherThingResult&)
so to have a single entry point with more complex data.
The downside from a library user perspective may that now I have to use a manual switch() or other type statement to tell what really happened. While the dispatching to the appropriate result callback used to be done for me, now you have made it a burden upon the library user.
Unless this bought me as a user some level of flexibility, that I as as user wanted I would consider this a step backwards.
For your part as an implementor, one suggestion would be to go to the generic form internally, and then offer both interfaces externally. Perhaps the old specific interface could even be auto-generated somehow.
Good Luck.
Well, your question implies that there is a balance between the library's complexity and the client's. When those are the only two choices, one almost always goes with making the client's life easier. However, those are rarely really the only two choices.
Now in the text you talk about a command processing architecture where each command has a different set of data associated with it. In the olden days, this would typically be implemented with a big honking case statement in a loop, where each case called a different routine with different parameters and perhaps some setup code. Grisly. McCabe complexity analysers hate this.
These days what you can do with an OO language is use dynamic dispatch. Create a base abstract "command" class with a standard "handle()" method, and have each different command inherit from it to add their own members (to represent the different "arguments" to the different commands). Then you create a big honking array of these at startup, usually indexed by the command ID. For languages like C++ or Ada it has to be an array of pointers to "command" objects, for the dynamic dispatch to work. Then you can just call the appropriate command object for the command ID you read from the client. The big honking case statement is now handled implicitly by the dynamic dispatch.
Where you can get the big savings in this scenario is in subclassing. Do you have several commands that use the exact same parameters? Make a subclass for them, and then derive all of those commands from that subclass. Do you have several commands that have to perform the same operation on one of the parameters? Make a subclass for them with that one method implemented for that operation, and then derive all those commands from that subclass.
Your first objective should be to produce a library that decouples higher software layers from the hardware. Users of your library shouldn't care that you have a hardware device that can execute a number of functions with a different payload. They should only care what the device does in a higher level. In this sense, it is in my opinion a good thing that every command is mapped to each one function.
My plan will be:
Identify the objects the higher data layers need to get the job done. Model the objects in C++ classes from their perspective, not from the perspective of the hardware
Define the interface of the library using the above objects
Start the implementation of the library. Perhaps an intermediate layer that maps software objects to hardware objects is necessary
There are many things you can do to reduce code repetition. You can use polymorphism. Define a class with the base functionality and extend it. You can also use utility classes, that implement functions needed for many commands.