Suppose I have a program that does three things in succession:
Task 1
Task 2
Task 3
Now suppose I want to write BDDs for Task 2, say for when it fails. Task 2 is performed only after Task 1 succeeds, but Task 1 itself can succeed in many ways (for instance, my program retries Task 1 a fixed number of times if the downstream system responds with an error). My question is: which behaviours of Task 1 should I consider when writing tests for Task 2?
Given_Task2Fails_Task1IsRetried_Expect_SomeBehaviour
Given_Task2Fails_Task1IsNotRetried_Expect_SomeBehaviour
If this is the case, then I need to create all the permutations and combinations of each of the tasks, keeping Task 2 constant. This blows up the number of scenarios and produces a large amount of duplicate code. This is what I mean:
Given_Task2Fails_Task1IsRetried_Task3IsNotRetried_Expect_SomeBehaviour
Given_Task2Fails_Task1IsNotRetried_Task3IsNotRetried_Expect_SomeBehaviour
Given_Task2Fails_Task1IsRetried_Task3IsRetried_Expect_SomeBehaviour
Given_Task2Fails_Task1IsNotRetried_Task3IsRetried_Expect_SomeBehaviour
How do I write solid scenarios in such cases? Ideally, I would want to vary every parameter of my system while keeping Task 2 constant, but that is a brute-force approach and I am pretty sure there are better ones out there.
You should assume Task 1 has succeeded when writing tests for Task 2. The various ways Task 1 can fail and recover are not something you need to capture in tests verifying the behavior of Task 2.
Given Task 1 succeeded
When Task 2 is performed
Then Outcome 2 should have happened
What happens when Task 1 fails is not even a concern for tests that assume Task 1 succeeded. In fact, the various ways Task 1 can fail could be (and possibly should be) captured as their own scenarios, unit tests or integration tests.
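To make that concrete, here is a minimal sketch in C++ (all names, such as StubTask1 and perform_task2, are hypothetical): the test for Task 2 starts from a stub that unconditionally reports Task 1 as succeeded, so none of Task 1's retry paths ever appear in it.

#include <cassert>

// Hypothetical result type: all the Task 2 test cares about is success.
struct Task1Result { bool succeeded; };

// Stub standing in for Task 1: it always succeeds, with or without retries.
struct StubTask1 {
    Task1Result run() { return Task1Result{true}; }
};

template <typename Task1>
bool perform_task2(Task1& task1) {
    if (!task1.run().succeeded) return false;  // Task 2 runs only after success
    // ... the real Task 2 work would go here ...
    return true;
}

// Given_Task1Succeeded_When_Task2Performed_Expect_Outcome2
int main() {
    StubTask1 task1;
    assert(perform_task2(task1));  // Outcome 2 should have happened
}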
I'm looking to parallelize bucket sort as an exercise. I am sorting integers lexicographically.
The motivation is, this is my first time doing any form of parallel programming. I'm trying to learn about the different ways of doing it, advantages/disadvantages.
E.g. {1,6,2,6778,8,2,43,52,23} -> [1 2 2 23 43 52 6 6778 8]
There are 3 steps to the task:
Setup: initialize 9 vectors, 1 for each final bucket.
1) Sort a chunk into buckets. This step is parallelized by giving each thread a portion of the data.
2) Sort each bucket into lexicographic order.
3) Concatenate all buckets.
Option 1: Threadpool
I'm considering dividing all those tasks into jobs for 2 different functions, a bucketize function and a sort_bucket function, then feeding them into a thread pool.
Option 2: Futures
Alternatively, create futures of the functions and wait at the end of each step: wait for all futures to return at the end of step 1, then create futures of sort_bucket in step 2 and join them.
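To make Option 2 concrete, this is roughly the shape I have in mind (a sketch using std::async; the function names are mine, and it assumes positive integers so every number falls into buckets 1-9):

#include <algorithm>
#include <array>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Bucket index = first decimal digit (assumes positive integers).
int first_digit(int n) {
    while (n >= 10) n /= 10;
    return n;
}

std::vector<int> bucket_sort(const std::vector<int>& data, int num_chunks) {
    // Step 1: one future per chunk, each bucketing its portion of the data.
    std::size_t chunk = (data.size() + num_chunks - 1) / num_chunks;
    std::vector<std::future<std::array<std::vector<int>, 9>>> phase1;
    for (int c = 0; c < num_chunks; ++c) {
        std::size_t lo = c * chunk, hi = std::min(lo + chunk, data.size());
        phase1.push_back(std::async(std::launch::async, [&data, lo, hi] {
            std::array<std::vector<int>, 9> b;
            for (std::size_t i = lo; i < hi; ++i)
                b[first_digit(data[i]) - 1].push_back(data[i]);
            return b;
        }));
    }
    // Wait for all futures at the end of step 1, merging the partial buckets.
    std::array<std::vector<int>, 9> buckets;
    for (auto& f : phase1) {
        auto part = f.get();
        for (int k = 0; k < 9; ++k)
            buckets[k].insert(buckets[k].end(), part[k].begin(), part[k].end());
    }
    // Step 2: one future per bucket, sorting it lexicographically.
    std::vector<std::future<void>> phase2;
    for (auto& bucket : buckets)
        phase2.push_back(std::async(std::launch::async, [&bucket] {
            std::sort(bucket.begin(), bucket.end(), [](int a, int b) {
                return std::to_string(a) < std::to_string(b);
            });
        }));
    for (auto& f : phase2) f.get();
    // Step 3: concatenate the buckets in order.
    std::vector<int> out;
    for (auto& bucket : buckets)
        out.insert(out.end(), bucket.begin(), bucket.end());
    return out;
}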
Can anyone provide any opinions on these methods?
CPU utilization:
In the thread pool version I can be sure that I am using an appropriate number of threads relative to the available processors. With futures, would the threads be scheduled appropriately by the OS?
Are there other ways I've missed out on? I'm trying to learn so would like to compare all the possible methods of doing this.
Thanks!
You could sort subsequences of the initial array (in parallel, so in different threads) then merge them.
BTW the overhead is not negligible. You probably need an initial array of many tens of thousands of elements to observe a gain from parallelisation, and you are likely to sometimes observe a loss (e.g. with a too-small initial array).
And for a first multi-threaded project, I'd rather suggest having a (nearly) fixed small set of threads (at most a dozen of them, assuming your computer has 8 cores). So both thread pools and futures are IMHO too complex for that.
Threads are heavy and expensive. Each needs at least a call stack (typically a megabyte), and in practice much more.
Don't forget synchronization (e.g. with mutexes).
This is a Pthread tutorial that you could adapt to C++ threads.
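As a minimal sketch of that sort-then-merge idea (my own illustration, using just two plain std::threads to stay simple): each thread sorts a disjoint half, so no mutex is needed for the sorting itself, and the merge happens after both joins.

#include <algorithm>
#include <thread>
#include <vector>

void parallel_sort(std::vector<int>& v) {
    auto mid = v.begin() + v.size() / 2;
    // The halves are disjoint, so the two threads never touch the same data.
    std::thread left([&v, mid] { std::sort(v.begin(), mid); });
    std::thread right([&v, mid] { std::sort(mid, v.end()); });
    left.join();
    right.join();
    std::inplace_merge(v.begin(), mid, v.end());  // merge the two sorted halves
}

For a small input the two extra threads cost more than they save, which is exactly the caveat above.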
See my other question, "How can I modify my Akka streams Prime sieve to exclude modulo checks for known primes?", for an example of the kind of problem I have in mind.
In studying Akka streams so far, I have been surprised by how little discussion is devoted to the specifics of flow control within the pipe, especially in terms of 1) timing/waiting, 2) futures within GraphDSL.create, and 3) making use of existing, already processed values.
Certain flows, for instance, have a "costly calculation step" that is limiting. For primes, that's checking all p up to Math.floor(Math.sqrt(whatever)). It would be best if I only checked modulo the known primes, but I end up checking ALL numbers less than the root because I don't know how to use the data I am gathering. With a cheap lookup step in place, I could save a lot of time.
Similarly, maybe I want to work on implementing a flow at such a low level it might as well be item by item. I was expecting to find examples where I could do something like `Item(n = 3) => add three seconds of runtime, Item(n = -1) => hold for 1 second`... whatever. Is there a way to work at that level and remain within Akka?
I need your help to solve this problem. I have a set of tasks, each with its own execution time, and two types of constraints. The first type is precedence relationships between tasks. The second type allows certain sets of tasks to be in execution at the same time. For example: I have a graph G with 6 tasks and the following edges: (T1,T2), (T2,T3), (T4,T3), (T4,T5) and (T6,T5). Suppose that T1 and T4 are able to execute together, and also T1 and T6, but not T4 and T6. Taking into account the execution time of each task, how do I find a schedule that satisfies the precedence relationships between tasks and minimizes the length of the schedule, taking into consideration the parallel execution of some tasks?
If the exclusion constraints ("T4 and T6 cannot execute together") weren't there (and no other constraints were added), you could just start every task at the maximum finish time of all of its preceding tasks. That would be optimal and would scale well: you would automatically get the shortest makespan. It would not be NP-complete/hard, and not job shop scheduling.
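A sketch of that simple case (my own illustration; the encoding with successor lists and one duration per task is an assumption): process tasks in topological order and start each one at the maximum finish time of its predecessors.

#include <algorithm>
#include <queue>
#include <vector>

struct Graph {
    std::vector<std::vector<int>> successors;  // edge (u, v): u precedes v
    std::vector<int> duration;                 // execution time of each task
};

// Earliest start time of each task, ignoring exclusion constraints.
std::vector<int> earliest_starts(const Graph& g) {
    int n = (int)g.successors.size();
    std::vector<int> indegree(n, 0), start(n, 0);
    for (int u = 0; u < n; ++u)
        for (int v : g.successors[u]) ++indegree[v];
    std::queue<int> ready;
    for (int u = 0; u < n; ++u)
        if (indegree[u] == 0) ready.push(u);
    while (!ready.empty()) {
        int u = ready.front(); ready.pop();
        int finish = start[u] + g.duration[u];
        for (int v : g.successors[u]) {
            start[v] = std::max(start[v], finish);  // wait for the latest predecessor
            if (--indegree[v] == 0) ready.push(v);
        }
    }
    return start;
}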
Unfortunately, the exclusion constraint (and potentially any other you add in the future) turns it into job shop scheduling (as mentioned by Lars), which is NP-complete/hard. See this video of a job shop scheduling variant in an open source Java implementation, which demonstrates why some tasks start later than when their preceding tasks finish. To solve that, look into heuristics, metaheuristics (Tabu Search, ...) or other related techniques.
To keep it simple, you can use a constructive heuristic based on priority rules along with a schedule generation scheme, also called an SGS; see this for further reference. The heuristic generates an ordered list of activities according to some criterion, and the SGS takes this list as input and generates the schedule. In your implementation of the SGS, you decide whether two tasks may or may not be executed in parallel, based on your second constraint.
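For illustration, a minimal serial SGS could look like this sketch (the names, e.g. can_run_together, are mine, and it assumes the priority list is consistent with the precedence relation, i.e. every task appears after its predecessors):

#include <algorithm>
#include <vector>

struct Task {
    int duration;
    std::vector<int> predecessors;
};

std::vector<int> serial_sgs(const std::vector<Task>& tasks,
                            const std::vector<int>& priority_order,
                            const std::vector<std::vector<bool>>& can_run_together) {
    int n = (int)tasks.size();
    std::vector<int> start(n, -1);  // -1 = not yet scheduled
    for (int t : priority_order) {
        // Precedence: cannot start before every predecessor has finished.
        int earliest = 0;
        for (int p : tasks[t].predecessors)
            earliest = std::max(earliest, start[p] + tasks[p].duration);
        // Exclusion: push the start past any already-scheduled conflicting
        // task that would overlap; repeat until no overlap remains.
        bool moved = true;
        while (moved) {
            moved = false;
            for (int u = 0; u < n; ++u) {
                if (u == t || start[u] < 0 || can_run_together[t][u]) continue;
                bool overlap = earliest < start[u] + tasks[u].duration &&
                               start[u] < earliest + tasks[t].duration;
                if (overlap) { earliest = start[u] + tasks[u].duration; moved = true; }
            }
        }
        start[t] = earliest;
    }
    return start;
}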
If you want something more robust, you can use a metaheuristic: you generate a solution (a list of tasks) and modify it using local search techniques, exploring the search space of solutions. You could generate solutions based on priority rules and evaluate them with your SGS implementation. This is just a simplified explanation of how a metaheuristic works; there are several variants. A good example of a metaheuristic is Simulated Annealing, applied to the RCPSP: http://www.sciencedirect.com/science/article/pii/S0377221702007610.
What is code branching? I've seen it mentioned in various places, especially with bit twiddling, but I've never really thought about it.
How does it slow a program down and what should I be thinking about while coding?
I see mention of if statements, but I really don't understand how such code can be slow. If the condition is true, execute the following instructions; otherwise jump to another set of instructions? I see the other thread mentioning "branch prediction", and maybe this is where I'm really lost. What is there to predict? The condition is right there and it can only be true or false.
I don't believe this to be a duplicate of this related question. The linked thread is talking about "Branch prediction" in reference to an unsorted array. I'm asking what is branching and why prediction is required.
The most simple example of a branch is an if statement:
if (condition)
doSomething();
Now if condition is true then doSomething() is executed. If not then the execution branches, by jumping to the statement that follows the end of the if.
In very simple machine pseudo code this might be compiled to something along these lines:
TEST condition
JZ label1 ; jump over the CALL if condition is 0
CALL doSomething
label1:
The branch point is the JZ instruction. The subsequent execution point depends on the outcome of the test of condition.
Branching affects performance because modern processors predict the outcome of branches and perform speculative execution, ahead of time. If the prediction turns out to be wrong then the speculative execution has to be unwound.
If you can arrange the code so that prediction success rates are higher, then performance is increased. That's because the speculatively executed code is now less of an overhead since it has already been executed before it was even needed. That this is possible is down to the fact that modern processors are highly parallel. Spare execution units can be put to use performing this speculative execution.
Now, there's one sort of code that never has branch prediction misses. And that is code with no branches. For branch free code, the results of speculative execution are always useful. So, all other things being equal, code without branches executes faster than code with branches.
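As a small illustration (mine, not from the question; real compilers often emit a conditional move for both versions anyway), compare a branchy and a branch-free way to compute the maximum of two ints:

int max_branchy(int a, int b) {
    if (a > b)        // the branch: the CPU must guess which way this goes
        return a;
    return b;
}

int max_branchless(int a, int b) {
    int mask = -(a > b);              // all ones if a > b, else all zeros (two's complement)
    return (a & mask) | (b & ~mask);  // selects a or b with no jump at all
}

The second version can never be mispredicted, because there is nothing to predict.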
Essentially imagine an assembly line in a factory. Imagine that, as each item passes through the assembly line, it will go to employee 1, then employee 2, on up to employee 5. After employee 5 is done with it, the item is finished and is ready to be packaged. Thus all five employees can be working on different items at the same time and not having to just wait around on each other. Unlike most assembly lines though, every single time employee 1 starts working on a new item, it's potentially a new type of item - not just the same type over and over.
Well, for whatever weird and imaginative reason, imagine the manager is standing at the very end of the assembly line. And he has a list saying, "Make this item first. Then make that type of item. Then that type of item." And so on. As he sees employee 5 finish each item and move on to the next, the manager then tells employee 1 which type of item to start working on, looking at where they are in the list at that time.
Now let's say there's a point in that list - that "sequence of computer instructions" - where it says, "Now start making a coffee cup. If it's nighttime when you finish making the cup, then start making a frozen dinner. If it's daytime, then start making a bag of coffee grounds." This is your if statement. Since the manager, in this kind of fake example, doesn't really know what time of day it's going to be until he actually sees the cup after it's finished, he could just wait until that time to call out the next item to make - either a frozen dinner or some coffee grounds.
The problem there is that if he waits until the very last second like that - which he has to do to be absolutely sure what time of day it'll be when the cup is finished, and thus what the next item's going to be - then workers 1-4 are not going to be working on anything at all until worker 5 is finished. That completely defeats the purpose of an assembly line! So the manager takes a guess. The factory is open 7 hours in the day and only 1 hour at night. So it is much more likely that the cup will be finished in the daytime, thus warranting the coffee grounds.
So as soon as employee 2 starts working on the coffee cup, the manager calls out the coffee grounds to employee 1. Then the assembly line just keeps moving along like it had been, until employee 5 is finished with the cup. At that time the manager finally sees what time of day it is. If it's daytime, that's great! If it's nighttime, everything started after that coffee cup must be thrown away, and the frozen dinner must be started. ...So essentially branch prediction is where the manager temporarily ventures a guess like that, and the line moves along faster when he's right.
Pseudo-Edit:
It is largely hardware-related. The main search phrase would probably be "computer pipeline cpu". The list of instructions is already made up - it's just that the list has branches within it; it's not always 1, 2, 3, etc. But as stage 5 of the pipeline is finishing up instruction 10, stage 1 can already be working on instruction 14. Usually computer instructions can be broken up like that and worked on in segments. If stages 1-n are all working on something at the same time, and nothing gets trashed later, that's just faster than finishing one instruction before starting another.
Any jump in your code is a branch. This happens with if statements, function calls and loops.
Modern CPUs have long pipelines, which means the CPU processes various parts of multiple instructions at the same time. The problem with branches is that the pipeline might not have started processing the correct instructions. In that case the speculative instructions need to be thrown out and the processor has to start processing the correct instructions from scratch.
When a branch is encountered, the CPU tries to predict which branch is going to be used. This is called branch prediction.
Most of the optimizations for branch prediction will be done by your compiler, so you do not really need to worry about branching.
This probably falls into the category of: worry about branch optimizations only if you have profiled the code and can see that this is a problem.
A branch is a deviation from normal control flow. Processors execute instructions sequentially, but at a branch the program counter is moved to another place in memory (for example, on a conditional branch or a procedure call).
What kind of execution rate do you aim for with your unit tests (# test per second)? How long is too long for an individual unit test?
I'd be interested in knowing if people have any specific thresholds for determining whether their tests are too slow, or is it just when the friction of a long running test suite gets the better of you?
Finally, when you do decide the tests need to run faster, what techniques do you use to speed up your tests?
Note: integration tests are obviously a different matter again. We are strictly talking unit tests that need to be run as frequently as possible.
Response roundup: Thanks for the great responses so far. Most advice seems to be: don't worry about speed -- concentrate on quality and just run tests selectively if they are too slow. Answers with specific numbers have included aiming for anywhere from <10ms up to 0.5 or 1 second per test, or just keeping the entire suite of commonly run tests under 10 seconds.
Not sure whether it's right to mark one as an "accepted answer" when they're all helpful :)
All unit tests should run in under a second (that is, all unit tests combined should run in under 1 second). Now I'm sure this has practical limits, but I've had a project with 1,000 tests that ran this fast on a laptop. You really want this speed, so your developers don't dread refactoring some core part of the model (i.e., "Lemme go get some coffee while I run these tests"... 10 minutes later he comes back).
This requirement also forces you to design your application correctly. It means that your domain model is pure and contains zero references to any type of persistence (file I/O, database, etc.). Unit tests are all about testing those business relationships.
Now that doesn't mean you ignore testing your database or persistence. But those concerns are now isolated behind repositories that can be separately tested with integration tests located in a separate project. You run your unit tests constantly while writing domain code, and run your integration tests once on check-in.
The goal is 100s of tests per second. The way you get there is by following Michael Feathers's rules of unit tests.
An important point that came up in a past CITCON discussion is that if your tests aren't this fast it is quite likely that you aren't getting the design benefits of unit testing.
If we're talking strictly unit tests, I'd aim more for completeness than speed. If the run time starts to cause friction, separate the tests into different projects/classes etc., and only run the tests related to what you're working on. Let the integration server run all the tests on check-in.
I tend to focus more on readability of my tests than speed. However, I still try to make them reasonably fast. I think if they run on the order of milliseconds, you are fine. If they run a second or more per test... then you might be doing something that should be optimized.
Slow tests only become a problem as the system matures and the build starts to take hours. At that point you are more likely dealing with a lot of somewhat-slow tests rather than one or two you can optimize easily. So you should probably pay attention RIGHT AWAY if you see lots of tests running hundreds of milliseconds each (or worse, seconds each), rather than wait until hundreds of tests take that long (at which point it is going to be really hard to solve the problem).
Even then, a slow suite mostly just delays the moment when your automated build reports errors, which is OK if that is an hour later (or even a few hours later), I think. The real problem is running the tests before you check in, but this can be avoided by selecting a small subset of tests related to what you are working on. Just make sure to fix the build if you check in code that breaks tests you didn't run!
We're currently at 270 tests in around 3.something seconds. There are probably around 8 tests that perform file I/O.
These are run automatically upon a successful build of our libraries on every engineer's machine. We have more extensive (and time-consuming) smoke testing that is done by the build machine every night, or that can be started manually on an engineer's machine.
As you can see we haven't yet reached the point of tests being too time-consuming. 10 seconds for me is the point where they start to become intrusive; when we start to approach that, it'll be something we take a look at. We'll likely move the lower-level libraries, which are more robust since they change infrequently and have few dependencies, into the nightly builds, or into a configuration where they're only executed by the build machine.
If you find it's taking more than a few seconds to run a hundred or so tests you may need to examine what you are classifying as a unit test and whether it would be better treated as a smoke test.
Your mileage will obviously be highly variable depending on your area of development.
Data Point -- Python Regression Tests
Here are the numbers on my laptop for running "make test" for Python 2.5.2:
number of tests: 3851 (approx)
execution time: 9 min, 6 sec
execution rate: 7 tests / sec
One of the most important rules about unit tests is they should run fast.
How long is too long for an individual unit test?
Developers should be able to run the whole suite of unit tests in seconds, and definitely not in minutes and minutes. Developers should be able to run them quickly after changing the code in any way. If it takes too long, they won't bother running them and you lose one of the main benefits of the tests.
What kind of execution rate do you aim for with your unit tests (# test per second)?
You should aim for each test to run in an order of milliseconds, anything over 1 second is probably testing too much.
We currently have about 800 tests that run in under 30 seconds, about 27 tests per second. This includes the time to launch the mobile emulator needed to run them. Most of them each take 0-5ms (if I remember correctly).
We have one or two that take about 3 seconds, which are probably candidates for checking, but the important thing is the whole test suite doesn't take so long that it puts off developers running it, and doesn't significantly slow down our continuous integration build.
We also have a configurable timeout limit set to 5 seconds -- anything taking longer will fail.
I judge my unit tests on a per-test basis, not by # of tests per second. The rate I aim for is 500ms or less. If a test is above that, I will look into it to find out why it is taking so long.
When I think a test is too slow, it usually means it is doing too much, so just refactoring the test by splitting it into more tests usually does the trick. The other time I have noticed tests running slow is when a test exposes a bottleneck in my code; then a refactoring of the code is in order.
How long is too long for an individual unit test?
I'd say it depends on the compile speed. One usually executes the tests at every compile. The objective of unit testing is not to slow you down, but to bring the message "nothing broke, go on" (or "something broke, STOP").
I do not bother about test execution speed until this is something that starts to get annoying.
The danger is to stop running the tests because they're too slow.
Finally, when you do decide the tests need to run faster, what techniques do you use to speed up your tests?
The first thing to do is to find out why they are too slow, and whether the issue is in the unit tests or in the code under test.
I'd try to break the test suite into several logical parts, running at every compile only the part that is supposedly affected by the code I changed. I'd run the other suites less often, perhaps once a day, or whenever I suspect I could have broken something, and at least before integrating.
Some frameworks provide automatic execution of specific unit tests based on heuristics such as last-modified time. For Ruby and Rails, AutoTest provides much faster and responsive execution of the tests -- when I save a Rails model app/models/foo.rb, the corresponding unit tests in test/unit/foo_test.rb get run.
I don't know if anything similar exists for other platforms, but it would make sense.