Twisted Django Comet (Orbited): the interaction of the upper and middle levels - django

I'm developing a monitoring system (something like a real-time web app), and my question is about system architecture.
A device connects to the server and sends information about the state of the controlled parameters. The server should save the information to the database and notify the Comet server. The Comet server sends a message to the user saying that new data is available. The user then gets the new information.
What's the best way to analyze and save the information about device state (creating alarm messages if needed)?
The Twisted app itself analyzes the data and interacts with the DB (adbapi) and the Comet server (Orbited).
Twisted pushes the received data to Django (how would it push it?), and Django analyzes the data, saves it to the DB, and sends a "NEW" flag to Orbited.
Any suggestions are welcome if there is a better way.
You can find more information in the pictures below:

This question is fairly open ended. Someone could probably write a dozen pages on each of the options you described, and that much again on a handful of other approaches as a bonus.
Instead of doing that, I'll take an alternate route.
Make sure you have a good understanding of your requirements. Think about which approach is going to be easiest for you (or for the developers on your team) to satisfy those requirements. Take that approach, documenting the overall idea and unit testing everything you write (preferably using TDD).
When you're done, you might not have the optimal solution, but you'll have a solution, and 99 times out of 100 that's indistinguishable from being optimal.
If I do think about your proposed approaches a little bit, then what mostly occurs to me is that they don't differ from each other very much. Your analysis is just some Python code somewhere that you're going to invoke. Whether you invoke it closer to some Twisted-using code or closer to some Django-using code doesn't seem to make a huge difference to the outcome. Perhaps some part of your requirements would make one approach better than the other. However, if you have unit tests and understand your requirements, then I expect you'll actually find it quite easy to switch between those two approaches.
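As a rough illustration of that point, here is a minimal sketch of keeping the analysis as plain Python so that either a Twisted protocol or a Django view can call it. The function name, message fields, and threshold are invented for illustration, not taken from your system:

    # analysis.py - plain Python, no Twisted or Django imports.
    # The device message format and the alarm threshold are hypothetical.

    def analyze_reading(reading):
        """Return a list of alarm messages for one device reading (a dict)."""
        alarms = []
        if reading.get("temperature", 0) > 90:        # hypothetical threshold
            alarms.append("temperature too high: %(temperature)s" % reading)
        if not reading.get("online", True):
            alarms.append("device went offline")
        return alarms

A Twisted protocol's dataReceived handler and a Django view can both import analyze_reading and wrap their own plumbing (adbapi writes, ORM saves, the Orbited notification) around the same call.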
After you've implemented something, you'll have a much better understanding of the trade-offs involved and you'll be in a better position to decide if one implementation is going to work better or worse than another.
Note that unit tests are a pretty essential part of this idea. Without them, you won't really know if you've implemented your requirements, you won't know if your functionality still works after any particular refactoring, and refactoring itself will be harder because your units will not be as well-defined and isolated as they would be if you were doing test-driven development.
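For example, a unit test for the hypothetical analyze_reading sketched above needs nothing from Twisted or Django, which is exactly what makes switching the surrounding plumbing cheap:

    import unittest

    from analysis import analyze_reading  # the plain-Python function sketched above


    class AnalyzeReadingTests(unittest.TestCase):
        def test_high_temperature_raises_alarm(self):
            alarms = analyze_reading({"temperature": 95, "online": True})
            self.assertEqual(alarms, ["temperature too high: 95"])

        def test_normal_reading_has_no_alarms(self):
            self.assertEqual(analyze_reading({"temperature": 20, "online": True}), [])


    if __name__ == "__main__":
        unittest.main()

If you later swap your first option for the second, this test does not need to change at all.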

Related

Refactoring legacy C code into MVC design

I am working on some OLD (as in older than me) C code that needs to be cleaned up and brought up to date so that (amongst other things), it is easier to maintain and integrates more cleanly with current code.
The existing code is quite messy, and freely intersperses GUI logic with business logic and data access logic. The only saving grace is that it is NOT spaghetti code, and that it is modular (as most code from the seventies tends to be).
My question is this: can anyone provide me with a guideline on how to go about refactoring the code into MVC? (BTW, I am also moving the code from C to C++ whilst undertaking this task - but that is the least of my concerns, as I am quite au fait with both languages.)
BTW, I am fully aware that this is not a trivial task. I just want to know what the steps are for going from modular code that mixes DBAL/BL/GUI to a cleaner MVC implementation.
I'm not convinced that there can be a definitive set of steps; what you do will vary with the structure of the existing code.
I agree with @Jesus Ramos that figuring out a test strategy is key here. The problem for you is likely to be that the code is currently not unit-testable, because there are effectively no "units"; we can't test the business logic, say, without testing the UI.
I would give very serious consideration to rewriting the thing rather than refactoring it.
If you are going to refactor, then my guess is that you'll take a kind of "Swiss Cheese" approach. Drill out pieces, leaving a central mass with lots of holes. So pull out database access code, focusing on providing a clear API and set of Data Objects - these become the basis of your Model. Pull out the GUI code into a view layer. What's left is the Controller logic, which you can then refactor.
I would build the business logic layer first (along with unit tests to make sure it behaves like the original), and then work on the data layer (again with unit tests). Once you have these two, it's probably best to create an interface that exposes the required business-logic and data functionality to the GUI without coupling them tightly; personally, I think the GUI should only submit the data it has to the business logic layer, which then submits to the data layer. The key here is unit testing (where possible), as it makes it much easier to verify that your code and the original are functionally the same.
Again, you don't have to follow this step by step; it's just my preference to leave the GUI until the end, as it is (most of the time) less complicated than the business logic layer.
The most difficult task is figuring out the decoupling itself; some code has all three layers in one single function, and ripping that apart can be a hassle.
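The question is about C and C++, but the shape of the separation is language-neutral; here is a minimal Python sketch (all class and function names are invented for illustration) of the layering described above, with the GUI only submitting data to the business layer, which in turn talks to the data layer:

    # data layer (sketch): only storage concerns live here.
    class RecordStore:
        def __init__(self):
            self._records = {}

        def save(self, key, record):
            self._records[key] = record

        def load(self, key):
            return self._records[key]


    # business layer (sketch): rules only, no GUI calls and no SQL.
    class RecordService:
        def __init__(self, store):
            self._store = store

        def register(self, key, fields):
            if not fields.get("name"):
                raise ValueError("a record needs a name")
            self._store.save(key, dict(fields))


    # GUI layer (sketch): collects input and only talks to the business layer.
    def on_save_clicked(service, form_values):
        service.register(form_values["key"], form_values)

Because RecordService only sees the store's interface, a unit test can hand it an in-memory store and compare its results against what the legacy code produces for the same inputs.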

Unit testing handling of degraded network stack, file corruption, and other imperfections

I'm primarily a C++ coder, and thus far, have managed without really writing tests for all of my code. I've decided this is a Bad Idea(tm), after adding new features that subtly broke old features, or, depending on how you wish to look at it, introduced some new "features" of their own.
But unit testing seems to be an extremely brittle mechanism. You can test for something in "perfect" conditions, but you don't get to see how your code performs when stuff breaks. For instance, take a crawler: say it crawls a few specific sites for data X. Do you simply save sample pages, test against those, and hope that the sites never change? That would work fine as regression testing, but what sort of tests would you write to constantly check those sites live and let you know when the application isn't doing its job because the site changed something that now causes your application to crash? Wouldn't you want your test suite to monitor the intent of the code?
The above example is a bit contrived, and something I haven't run into (in case you haven't guessed). Let me pick something I have, though. How do you test that an application will do its job in the face of a degraded network stack? That is, say you have a moderate amount of packet loss, for one reason or another, and you have a function DoSomethingOverTheNetwork() which is supposed to degrade gracefully when the stack isn't performing as it should; but does it? The developer tests it personally by purposely setting up a gateway that drops packets to simulate a bad network when he first writes it. A few months later, someone checks in some code that modifies something subtly, so the degradation isn't detected in time, or the application doesn't even recognize the degradation. This is never caught, because you can't run real-world tests like this using unit tests, can you?
Further, how about file corruption? Let's say you're storing a list of servers in a file, and the checksum looks okay, but the data isn't really. You want the code to handle that, you write some code that you think does that. How do you test that it does exactly that for the life of the application? Can you?
Hence, brittleness. Unit tests seem to test the code only in perfect conditions (and this is promoted, with mock objects and such), not what it will face in the wild. Don't get me wrong, I think unit tests are great, but a test suite composed only of them seems to be a smart way to introduce subtle bugs in your code while feeling overconfident about its reliability.
How do I address the above situations? If unit tests aren't the answer, what is?
Edit: I see a lot of answers that say "just mock it". Well, you can't "just mock it", here's why:
Taking my example of the degrading network stack, let's assume your function has a well-defined NetworkInterface, which we'll mock. The application sends out packets over both TCP and UDP. Now let's say, hey, let's simulate 10% loss on the interface using a mock object and see what happens. Your TCP connections increase their retry attempts as well as increasing their back-off, all good practice. You decide to change X% of your UDP packets to actually make a TCP connection instead; on a lossy interface we want to be able to guarantee delivery of some packets, and the others shouldn't lose too much. Works great. Meanwhile, in the real world... when you increase the number of TCP connections (or data over TCP) on a connection that's lossy enough, you'll end up increasing your UDP packet loss, as your TCP connections will end up re-sending their data more and more and/or reducing their window, causing your 10% packet loss to actually be more like 90% UDP packet loss now. Whoopsie.
No biggie, let's break that up into UDPInterface and TCPInterface. Wait a minute... those are interdependent; testing 10% UDP loss and 10% TCP loss is no different from the above.
So, the issue is now you're not simply unit testing your code, you're introducing your assumptions into the way the operating system's TCP stack works. And, that's a Bad Idea(tm). A much worse idea than just avoiding this entire fiasco.
At some point, you're going to have to create a Mock OS, which behaves exactly like your real OS, except, is testable. That doesn't seem like a nice way forward.
This is stuff we've experienced, I'm sure others can add their experiences too.
I hope someone will tell me I'm very wrong, and point out why!
Thanks!
You start by talking about unit tests, then talk about entire applications; it seems you are a little confused about what unit testing is. Unit testing by definition is about testing at the most fine grained level, when each "unit" of the software is being tested. In common use, a "unit" is an individual function, not an entire application. Contemporary programming style has short functions, each of which does one well defined thing, which is therefore easy to unit test.
what sort of tests would you write to constantly check those sites live?
UnitTests target small sections of code you write. UnitTests do not confirm that things are ok in the world. You should instead define application behavior for those imperfect scenarios. Then you can UnitTest your application in those imperfect scenarios.
for instance a crawler
A crawler is a large body of code you might write. It has some different parts, one part might fetch a webpage. Another part might analyze html. Even these parts may be too large to write a unit test against.
How do you test an application will do its job in the face of a degraded network stack?
The developer tests it personally by purposely setting up a gateway that drops packets to simulate a bad network when he first writes it.
If a test uses the network, it's not a UnitTest.
A UnitTest (which must target your code) cannot call the network. You didn't write the network. The UnitTest should involve a mock network with simulated (but consistent each time) packet loss.
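As a sketch of what a mock network with consistent, simulated packet loss can look like (Python used here purely for brevity; the interface and the retry helper are invented for illustration, not taken from your code):

    import unittest


    class FlakyNetwork:
        """Test double: fails the first `fail_first` sends, then succeeds.

        Deterministic, so every test run sees the same 'packet loss'.
        """

        def __init__(self, fail_first):
            self.fail_first = fail_first
            self.sent = []
            self._attempts = 0

        def send(self, packet):
            self._attempts += 1
            if self._attempts <= self.fail_first:
                return False                  # simulated drop
            self.sent.append(packet)
            return True


    def send_with_retry(network, packet, attempts=3):
        """Code under test (sketch): retry until the network accepts the packet."""
        return any(network.send(packet) for _ in range(attempts))


    class RetryTests(unittest.TestCase):
        def test_survives_two_drops(self):
            net = FlakyNetwork(fail_first=2)
            self.assertTrue(send_with_retry(net, b"ping"))
            self.assertEqual(net.sent, [b"ping"])

        def test_gives_up_after_max_attempts(self):
            net = FlakyNetwork(fail_first=5)
            self.assertFalse(send_with_retry(net, b"ping"))


    if __name__ == "__main__":
        unittest.main()

The point is that the "loss" is defined by the test, so a behaviour change in the retry code shows up as a deterministic failure rather than an occasional one.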
Unit tests seem to test the code only in perfect conditions
UnitTests test your code in defined conditions. If you're only capable of defining perfect conditions, your statement is true. If you're capable of defining imperfect conditions, your statement is false.
Work through any decent book on unit testing - you'll find that it's normal practise to write tests that do indeed cover edge cases where the input is not ideal or is plain wrong.
The most common approach in languages with exception handling is a "should throw" specification, where a certain test is expected to cause a specific exception type to be thrown. If it doesn't throw an exception, the test fails.
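For example, in Python's unittest (the same pattern exists in most xUnit frameworks), a hypothetical parser is expected to raise on bad input, and the test fails if it silently accepts it:

    import unittest


    def parse_port(text):
        """Code under test (sketch): reject garbage instead of guessing."""
        port = int(text)              # raises ValueError for non-numeric input
        if not 0 < port < 65536:
            raise ValueError("port out of range: %d" % port)
        return port


    class ParsePortTests(unittest.TestCase):
        def test_rejects_garbage(self):
            with self.assertRaises(ValueError):
                parse_port("not-a-port")

        def test_rejects_out_of_range(self):
            with self.assertRaises(ValueError):
                parse_port("99999")


    if __name__ == "__main__":
        unittest.main()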
Update
In your update you describe complex timing-sensitive interactions. Unit testing simply doesn't help at all there. No need to introduce networking: just think of trying to write a simple thread-safe queue class, perhaps on a platform with some new concurrency primitives. Test it on an 8-core system... does it work? You simply can't know that for sure by testing it. There are just too many different ways that the timing can cause operations to overlap between the cores. Depending on luck, it could take weeks of continuous execution before some really unlikely coincidence occurs. The only way to get such things right is through careful analysis (static checking tools can help). It's likely that most concurrent software has some rarely occurring bugs in it, including all operating systems.
Returning to the cases that can actually be tested, I've found integration tests to be often just as useful as unit tests. This can be as elaborate as automating the installation of your product, adding configurations to it (such as your users might create) and then "poking" it from the outside, e.g. automating your UI. This finds a whole other class of issue separate from unit testing.
It sounds as if you answered your own question.
Mocks/stubs are the key to testing difficult-to-test areas. For all of your examples, the approach of, say, creating a website with dodgy data or causing a network failure could be done manually. However, it would be very difficult and tedious to do so, not something anyone would recommend. In fact, doing so would mean you are not actually unit testing.
Instead, you'd use mocks/stubs to pretend such scenarios have happened, allowing you to test them. The benefit of using mocks is that, unlike the manual approach, you can guarantee that each time you run your tests the same procedure will be carried out. The tests in turn will be much faster and more stable because of this.
Edit - with regard to the updated question.
Just as a disclaimer, my networking experience is very limited, so I can't comment on the technical side of your issues. However, I can comment on the fact that you sound as if you are testing too much. In other words, your tests cover too wide a scope. I don't know what your code base is like, but given the functions/objects within it, you should still be able to provide fake input that allows you to test that your objects/functions do the right thing in isolation.
So let's imagine your isolated areas work fine given the requirements. Just because your unit tests pass does not mean you've tested your application. You'll still need to manually test the scenarios you describe; it sounds as if stress testing - limiting network resources and so on - is required there. If your application works as expected, great. If not, you've got missing tests. Unit testing (more in line with TDD/BDD) is about ensuring small, isolated areas of your application work. You still need integration/manual/regression etc. testing afterwards. Therefore you should use mocks/stubs to test that your small, isolated areas function. Unit testing is more akin to a design process than anything else, in my opinion.
Integration Testing vs Unit Testing
I should preface this answer by saying I am biased towards integration tests over unit tests as the primary type of test used in TDD. At work we also have some unit tests mixed in, but only as necessary. The primary reason why we start with an integration test is that we care more about what the application is doing than about what a particular function does. We also get integration coverage, which has been, in my experience, a huge gap for automated testing.
To Mock or Not, Why Not Do Both
Our integration tests can run either fully wired (to unmanaged resources) or with mocks. We have found that this helps to cover the gap between the real world and mocks. It also provides us with the option to decide NOT to have a mocked version because the ROI for implementing the mock isn't worth it. You may ask why use mocks at all:
the test suite runs faster
guaranteed same response every time (no timeouts, unforeseen degraded network, etc)
fine-grained control over behavior
Sometimes you shouldn't write a test
Testing - any kind of testing - has trade-offs. You look at the cost to implement the test, the mock, variant tests, etc., weigh that against the benefits, and sometimes it doesn't make sense to write the test, the mock, or the variant. This decision is also made within the context of the kind of software you're building, which really is one of the major factors in deciding how deep and broad your test suite needs to be. To put it another way, I'll write a few tests for the social bacon meetup feature, but I'm not going to write the formal verification test for the bacon-friend algorithm.
Do you simply save sample pages, test against those, and hope that the sites never change?
Testing is not a panacea
Yes, you save samples (as fixtures). You don't hope the page doesn't change, but you can't know how and when it will change. If you have ideas or parameters of how it may change then you can create variants to make sure your code will handle those variants. When and if it does change, and it breaks, you add new samples, fix the problems and move on.
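Concretely, "saving samples as fixtures" might look like the sketch below; the extract_titles function and the page markup are hypothetical, and in a real suite the saved page would live in a fixture file rather than inline:

    import re
    import unittest

    # Contents of a saved copy of a real page (would normally be a fixture file).
    FIXTURE_HTML = """
    <html><body>
      <h2 class="title">First article</h2>
      <h2 class="title">Second article</h2>
    </body></html>
    """


    def extract_titles(html):
        """Code under test (sketch): pull article titles out of a page."""
        return re.findall(r'<h2 class="title">(.*?)</h2>', html)


    class CrawlerFixtureTests(unittest.TestCase):
        def test_titles_from_saved_page(self):
            self.assertEqual(extract_titles(FIXTURE_HTML),
                             ["First article", "Second article"])


    if __name__ == "__main__":
        unittest.main()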
what sort of tests would you write to constantly check those sites live and let you know when the application isn't doing its job because the site changed something that now causes your application to crash?
Testing != Monitoring
Tests are tests and part of development (and QA), not for production. MONITORING is what you use in production to make sure your application is working properly. You can write monitors which should alert you when something is broken. That's a whole other topic.
How do you test an application will do its job in the face of a degraded network stack?
Bacon
If it were me, I would have a wired and a mocked mode for the test (assuming the mock was good enough to be useful). If the mock is difficult to get right, or if it's not worth it, then I would just have the wired test. However, I have found that there is almost always a way to split the variables in play into different tests. Then each of those tests is targeted at testing that vector of change, while minimizing all the other variability in play. The trick is to write the important variants, not every possible variant.
Further, how about file corruption?
How Much Testing
You mention the checksum being correct but the file actually being corrupt. The question here is what class of software I'm writing: do I need to be super paranoid about the possibility of a statistically small false positive, or not? If I do, then we work out how deep and broad to test.
I think you can't and shouldn't make a unit test for all possible errors you might face (what if a meteorite hits the DB server?) - you should make an effort to test errors with reasonable probability and/or rely on other services.
For example, if your application requires the correct arrival of network packets, you should use the TCP transport layer: it guarantees the correctness of the received packets transparently, so you only have to concentrate on, e.g., what happens if the network connection is dropped.
Checksums are designed to detect or correct a reasonable number of errors - if you expect 10 errors per file, you would use a different checksum than if you expect 100 errors. If the chosen checksum indicates that the file is correct, then you have no reason to think it's broken (the probability that it is broken is negligible).
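A small illustration of that "trust the checksum" point, using Python's zlib.crc32 (the payload format is made up for the example):

    import zlib


    def is_intact(payload, expected_crc):
        """Return True if the payload still matches the CRC stored alongside it."""
        return zlib.crc32(payload) & 0xFFFFFFFF == expected_crc


    # Writing: store the checksum next to the data.
    data = b"server1,server2,server3"
    stored_crc = zlib.crc32(data) & 0xFFFFFFFF

    # Reading: if the check passes, treat the data as good instead of
    # inventing further corruption scenarios to handle.
    assert is_intact(data, stored_crc)
    assert not is_intact(b"server1,serverX,server3", stored_crc)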
Because you don't have infinite resources (e.g. time), you have to make compromises when you write your tests, and choosing these compromises is a tough question.
Although not a complete answer to the massive dilemma you face, you can reduce the number of tests by using a technique called Equivalence Partitioning.
In my organization, we perform many levels of coverage, regression, positive, negative, scenario based, UI in automated and manual tests, all starting from a 'clean environment', but even that isn't perfect.
As for one of the cases you mention, where a programmer comes in and changes some sensitive detection code and no one notices, we would have had a snapshot of data that is 'behaviourally dodgy', which fails consistently with a specific test to test the detection routine - and we would run all tests regularly (and not just at the last minute).
Sometimes I'll create two (or more) test suites. One suite uses mocks/stubs and only tests the code I'm writing. The other tests test the database, web sites, network devices, other servers, and whatever else is outside of my control.
Those other tests are really tests of my assumptions about the systems my code interacts with. So if they fail, I know my requirements have changed. I can then update my internal tests to reflect whatever new behavior my code needs to have.
The internal tests include tests that simulate various failures of the external systems. Whenever I observe a new kind of failure, either through my other tests or as a result of a bug report, I have a new internal test to write.
Writing tests that model all the bizarre things that happen in the real world can be challenging, but the result is that you really think about all those cases, and produce robust code.
The proper use of Unit Testing starts from the ground up. That is, you write your unit tests BEFORE you write your production code. The unit tests are then forced to consider error conditions, pre-conditions, post-conditions, etc. Once you write your production code (and the unit tests are able to compile and run successfully), if someone makes a change to the code that changes any of its conditions (even subtly), the unit test will fail and you will learn about it very quickly (either via compiler error or via a failed unit test).
EDIT: Regarding the updated question
What you are trying to test is not really well suited to unit testing. Networking and database connections test better in a simulated integration test. There are far too many things that can break during the initialization of a remote connection to create a useful unit test for it (I'm sure there are some unit-tests-fix-all people who will disagree with me there, but in my experience, trying to unit test network traffic and/or remote database functionality is worse than shoving a square peg through a round hole).
You are talking about library or application testing, which is not the same as unit testing. You can use unit testing libraries such as CppUnit/NUnit/JUnit for library and regression testing purposes, but as others have said, unit testing is about testing your lowest level functions, which are supposed to be very well defined and easily separated from the rest of the code. Sure, you could pass all low-level unit tests, and still have a network failure in the full system.
Library testing can be very difficult, because sometimes only a human can evaluate the output for correctness. Consider a vector graphics or font rendering library; there's no single perfect output, and you may get a completely different result based on the video card in your machine.
Or testing a PDF parser or a C++ compiler is dauntingly difficult, due to the enormous number of possible inputs. This is when owning 10 years of customer samples and defect history is way more valuable than the source code itself. Almost anyone can sit down and code it, but initially you won't have a way of validating your program for correctness.
The beauty of mock objects is that you can have more than one. Assume that you are programming against a well-defined interface for a network stack. Then you can have a mock object WellBehavingNetworkStack to test the normal case and another mock object OddlyBehavingNetworkStack that simulates some of the network failures that you expect.
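A sketch of what such a pair of mocks could look like in Python (the interface, the failure modes, and the transmit helper are invented for illustration):

    class WellBehavingNetworkStack:
        """Mock for the normal case: every send succeeds immediately."""

        def send(self, data):
            return len(data)


    class OddlyBehavingNetworkStack:
        """Mock for expected failures: times out once, then short-writes."""

        def __init__(self):
            self.calls = 0

        def send(self, data):
            self.calls += 1
            if self.calls == 1:
                raise TimeoutError("simulated network timeout")
            return max(0, len(data) - 1)     # simulated partial write


    def transmit(stack, data, retries=2):
        """Code under test (sketch): retry on timeout, report short writes."""
        for _ in range(retries):
            try:
                return stack.send(data) == len(data)
            except TimeoutError:
                continue
        return False


    assert transmit(WellBehavingNetworkStack(), b"abc")
    assert not transmit(OddlyBehavingNetworkStack(), b"abc")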
Using unit tests I usually also test argument validation (like ensuring that my code throws NullPointerExceptions), and this is easy in Java, but difficult in C++, since in the latter language you can hit undefined behavior quite easily, and then all bets are off. Therefore you cannot be strictly sure that your unit tests work, even if they seem to. But still you can test for odd situations that do not invoke undefined behavior, which should be quite a lot in well-written code.
What you are talking about is making applications more robust. That is, you want them to handle failures elegantly. However, testing every possible real-world failure scenario would be difficult if not impossible. The key to making applications robust is to assume that failure is normal and should be expected at some point in the future. How an application handles failure really depends on the situation. There are a number of different ways to detect and handle failure (maybe a good question to ask the group). Trying to rely on unit testing alone will only get you part of the way. Anticipating failure (even in some simple operations) will get you even closer to a more robust application. Amazon built their entire system to anticipate all types of failures (hardware, software, memory and file corruption). Take a look at their Dynamo for an example of real-world error handling.

Django Project Done and Working. Now What?

I just finished what I would call a small Django project, and pretty soon it's going live. It's only 6 models, but a fairly complex view layer and a lot of record saving and retrieving.
Of course, setting aside the obviously huge number of bugs that will probably fill my inbox to the top, what would be the next step towards a website with the best performance? What could be tweaked?
I've been using jmeter a lot recently and feel confident that I have a good baseline for future performance comparisons, but the thing is: I'm not sure what the best starting point is, since I'm a greedy bastard who wants to work the least possible and gather the best results.
For instance, should I take an approach towards infrastructure, like a distributed database, or should I go with the code itself, and in that case, is there something that specifically results in better performance? In your experience, what pays off more?
As a personal contribution: I sometimes have the impression that some operations, when done through Django's signals, are faster than the usual view way. But hey, I'm biased. I freaking loooove signals. :)
Personal anecdotes like mine, are welcome as a way to stimulate some research, but some fact based opinions are much more appreciated. :)
Thanks very much.
here is what we did...
used django-debug-toolbar to analyze performance of each page (# of queries and response times)
used Django cache framework...most importantly memcache (a minimal caching sketch follows this list)
used Firebug's pagespeed to optimize HTTP page loads
used Google Analytics for general site usage stats (find out what's being used)
used Apache HTTP server benchmarking tool for quick performance stats
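For the caching item above, here is a minimal sketch of Django's cache framework pointed at memcached. The backend path, timeouts, and the compute() helper are illustrative only, and the exact backend class name depends on your Django version:

    # settings.py (sketch) - point Django's cache framework at memcached.
    CACHES = {
        "default": {
            "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
            "LOCATION": "127.0.0.1:11211",
        }
    }

    # views.py (sketch) - cache a whole view, or individual expensive lookups.
    from django.core.cache import cache
    from django.http import HttpResponse
    from django.views.decorators.cache import cache_page


    @cache_page(60)                     # cache the rendered response for 60 seconds
    def dashboard(request):
        return HttpResponse("dashboard goes here")


    def expensive_lookup(key):
        value = cache.get(key)
        if value is None:
            value = compute(key)        # compute() is a stand-in for your slow query
            cache.set(key, value, 300)  # keep the result for five minutes
        return value

The django-debug-toolbar numbers from the first item are what tell you which views or lookups are worth wrapping like this.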
In general, don't try to optimize performance up front. First, collect usage/performance stats, then pick off the most rewarding changes (effort vs. benefit) until you get decent results. The goal should be to avoid unnecessary complexity (distributed databases, etc)
Then, if you still aren't happy, consider these (in order): more RAM (goes a long way), a dedicated database server, load balancing multiple app servers (using perlbal, etc), a dedicated media server, etc...see these for more details (deployment guide, performance tips)
good luck...
Now what?
Deploy. If you have an MVP that is.
Other thoughts:
You didn't mention anything about testing. Do you have unit tests? Do you feel that the test coverage is adequate? I'd recommend reading Karen M. Tracey's book Django 1.1 Testing and Debugging.
Have you watched Jacob Kaplan-Moss's Deployment Workshop?
Have you done any usability testing? You can check out Joel Test article by Joel Spolsky, or you can read Rocket Surgery Made Easy or Don't Make Me Think both by Steve Krug.
Speaking of Spolsky, how does your process rank on the Joel Test?
I know that your question was slanted toward performance, and it may seem that my thoughts aren't performance related. However, thinking about some of these seemingly unrelated items may lead you in a direction that will impact performance. For instance, usability testing may reveal that a certain feature could be reduced in scope yielding better performance due to less data being delivered to the end-user.

What is test-driven development (TDD)? Is an initial design required?

I am very new to test-driven development (TDD), not yet started using it.
But I know that we have to write tests first and then the actual code to pass the test and refactor it till the design is good.
My concern over TDD is where it fits in our systems development life cycle (SDLC).
Suppose I get a requirement of making an order processing system.
Now, without having any model or design for this system, how can I start writing tests?
Don't we need to define the entities and their attributes before proceeding?
If not, is it possible to develop a big system without any design?
There are two levels of TDD: ATDD, or acceptance-test-driven development, and normal TDD, which is driven by unit tests.
I guess the relationship between TDD and design is influenced by the somewhat "agile" concept that source code IS the design of a software product. A lot of people reinforce this by translating TDD as Test Driven Design rather than development. This makes a lot of sense as TDD should be seen as having a lot more to do with driving the design than testing. Having acceptance and unit tests at the end of it is a nice side effect.
I cannot really say too much about where it fits into your SDLC without knowing more about it, but one nice workflow is:
For every user story:
Write acceptance tests using a tool like FitNesse or Cucumber; these specify what the desired outputs are for the given inputs, from a perspective that the user understands. This level automates the specifications, or can even replace specification documentation in ideal situations.
Now you will probably have a vague idea of the sort of software design you might need as far as classes / behaviour etc goes.
For each behaviour:
Write a failing test that shows how calling code would like to use the class.
Implement the behaviour that makes the test pass
Refactor both the test and actual code to reflect good design.
Go onto the next behaviour.
Go onto the next user story.
Of course the whole time you will be thinking of the evolving high level design of the system. Ideally TDD will lead to a flexible design at the lower levels that permits the appropriate high design to evolve as you go rather than trying to guess it at the beginning.
It should be called Test Driven Design, because that is what it is.
There is no practical reason to separate the design into a specific phase of the project. Design happens all the time. From the initial discussion with the stakeholder, through user story creation, estimation, and then of course during your TDD sessions.
If you want to formalize the design using UML or whatever, that is fine, just keep in mind that the code is the design. Everything else is just an approximation.
And remember that You Aren't Gonna Need It (YAGNI) applies to everything, including design documents.
Writing tests first forces you to think about the problem domain first, and acts as a kind of specification. Then, in a second step, you move to the solution domain and implement the functionality.
TDD works well iteratively:
Define your initial problem domain (can be small, evolutionary prototype)
Implement it
Grow the problem domain (add features, grow the prototype)
Refactor and implement it
Repeat steps 3 and 4.
Of course you need to have a vague architectural vision upfront (technologies, layers, non-functional requirements, etc.). But the features that bring added value to your application can be introduced nicely with TDD.
See related question TDD: good for a starter?
With TDD, you don't care much about design. The idea is that you must first learn what you need before you can start with a useful design. The tests make sure that you can easily and reliably change your application when the time comes that you need to decide on your design.
Without TDD, this happens: you make a design (which is probably too complex in some areas, plus you forgot to take some important facts into account since you didn't know about them). Then you start implementing the design. With time, you realize all the shortcomings of your design, so you change it. But changing the design doesn't change your program. Now you try to change your code to fit the new design. Since the code wasn't written to be changed easily, this will eventually fail, leaving you with two designs (one broken and the other in an unknown state) and code which doesn't fit either.
To start with TDD, turn your requirements into tests. To do this, ask "How would I know that this requirement is fulfilled?" When you can answer this question, write a test that implements the answer to it. This gives you the API which your (to-be-written) code must adhere to. It's a very simple design, but one that a) always works and b) is flexible (because you can't test inflexible code).
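Taking the order processing system from the question as a hypothetical example: the requirement "an order's total is the sum of its line items" becomes a test, and the test dictates the API before any design document exists:

    import unittest


    class OrderTotalTest(unittest.TestCase):
        def test_total_is_sum_of_line_items(self):
            # "How would I know this requirement is fulfilled?" -> this assertion.
            order = Order()
            order.add_item("widget", price=3, quantity=2)
            order.add_item("gadget", price=5, quantity=1)
            self.assertEqual(order.total(), 11)


    # The simplest code that satisfies the test; the design can evolve later
    # because the test pins the behaviour down.
    class Order:
        def __init__(self):
            self._items = []

        def add_item(self, name, price, quantity):
            self._items.append((name, price, quantity))

        def total(self):
            return sum(price * quantity for _, price, quantity in self._items)


    if __name__ == "__main__":
        unittest.main()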
Also starting with the test will turn you into your own customer. Since you try hard to make the test as simple as possible, you will create a simple API that makes the test work.
And over time, you'll learn enough about your problem domain to be able to make a real design. Since you have plenty of tests, you can then change your code to fit the design. Without terminally breaking anything on the way.
That's the theory :-) In practice, you will encounter a couple of problems but it works pretty well. Or rather, it works better than anything else I've encountered so far.
Well, of course you need a solid functional analysis first, including a domain model; without knowing what you'll have to create in the first place, it's impossible to write your unit tests.
I use test-driven development to program, and I can say from experience it helps create more robust, focused and simpler code. My recipe for TDD goes something like this:
Using a unit-test framework (I've written my own) write code as you wish to use it and tests to ensure return values etc. are correct. This ensures you only write the code you're actually going to use. I also add a few more tests to check for edge cases.
Compile - you will get compiler errors!!!
For each error add declarations until you get no compiler errors. This ensures you have the minimum declarations for your code.
Link - you will get linker errors!!!
Write enough implementation code to remove the linker errors.
Run - your unit tests will fail. Write enough code to make the tests succeed.
You've finished at this point. You have written the minimum code you need to implement your feature, and you know it is robust because of your tests. You will also be able to detect if you break things in the future. If you find any bugs, add a unit test for that bug (you may not have thought of an edge case, for example). And you know that if you add more features to your code, you won't make it incompatible with existing code that uses your feature.
I love this method. Makes me feel warm and fuzzy inside.
TDD implies that there is some existing design (external interface) to start with. You have to have some kind of design in mind in order to start writing a test. Some people will say that TDD itself requires less detailed design, since the act of writing tests provides feedback to the design process, but these concepts are generally orthogonal.
You need some form of specification, rather than a form of design -- design is about how you go about implementing something, specification is about what you're going to implement.
The most common form of specs I've seen used with TDD (and other agile processes) is user stories - an informal kind of "use case" which tends to be expressed in somewhat stereotyped English sentences like "As a <kind of user>, I can <do something>" (the form of user stories is more or less rigid depending on the exact style/process in use).
For example, "As a customer, I can start a new order", "As a customer, I can add an entry to an existing order of mine", and so forth, might be typical if that's what your "order entry" system is about (the user stories would be pretty different if the system wasn't "self-service" for users but rather intended to be used by sales reps entering orders on behalf of users, of course -- without knowing what kind of order-entry system is meant, it's impossible to proceed sensibly, which is why I say you do need some kind of specification about what the system's going to do, though typically not yet a complete idea about how it's going to do it).
Let me share my view:
If you want to build an application, along the way you need to test it: e.g. check the values of variables you create by code inspection, or quickly drop in a button that you can click to execute a part of the code and pop up a dialog showing the result of the operation, and so on. TDD, on the other hand, changes your mindset.
Commonly, you just rely on the development environment, like Visual Studio, to detect errors as you code and compile; somewhere in your head you know the requirement, and you just code and test via buttons, pop-ups, or code inspection. That is "syntax-debugging-driven development". When you are doing TDD, it is "semantic-debugging-driven development", because you write down your thoughts and goals for your application first using tests (which are a more dynamic and repeatable version of a whiteboard). Those tests exercise the logic (the "semantics") of your application and fail whenever you have a semantic error, even if your application compiles without syntax errors.
In practice you may not know or have all the information required to build the application. Since TDD forces you to write tests first, you are compelled to ask more questions about the functioning of the application at a very early stage of development, rather than building a lot only to find out that much of what you have written is not required (or at least not at the moment). You can really avoid wasting your precious time with TDD (even though it may not feel like that initially).

Relational databases application [closed]

When developing an application which mostly interacts with a database, what is a good way to start? The application requires a lot of filtering based on user input, sorting and structuring.
The best way to start is by figuring out "user stories" (or "use cases" - but the "story" approach tends to really work great and starts dragging stakeholders into the shared storytelling!); on top of that, design the database schema as the best-normalized design you can find that satisfies all the data-layer needs of the user stories.
Thirdly, you may sketch layers such as views on top of the schema; fourthly, and optionally, triggers and stored procedures that might live in the DB to ensure consistency and ease of use for higher layers (but, no matter how strongly DBAs will push you towards those, don't accept their assurances that they're a MUST: they aren't -- if your storage layer is well designed in terms of normalization and maybe useful views on top, non-storage-layer functionality CAN always reside elsewhere, it's an issue of convenience and performance, NOT logical consistency, completeness, correctness).
I think the business layer and user-experience layers should come after. I realize that's a controversial position, but my point is that the user stories (and implied business-rules that come with them) have ALREADY told you a LOT about the business and user layers -- so, "nailing down" (relatively speaking -- agility and "embrace change!" should always rule;-) the data storage layer is the next order of business, and refining ("drilling down") the higher layers can and should come after.
When you get to the database layer you'll want to handle the database access via stored procedures. This will help give you additional protection against SQL Injection attacks, and make it much easier to push logic changes to the database layer.
If it's mostly users interacting with data, you can design using a form perspective.
What forms are needed for user input?
What forms are needed for output reports?
Once you've determined that, the use of the forms will dictate the business logic needed to be coded behind the scenes. You'll take the inputs, create the set of procedures or methods to deal with them, and output what is necessary. Once you know the inputs and outputs, you will be able to easily design the necessary functions.
The scope of the question is very broad. You are expecting me to tell you what to do, and I can only do a good job of telling you how to do things. Do investigate using Hibernate/Spring. Since most of your operations look like querying the DB, Hibernate should help. Make sure the tables are sufficiently indexed so your queries can run faster when filtered on indexed fields. The challenging task is designing your DB layer, which will be the glue between your application and the DB. Design your DB layer to be generic enough that it can build queries based on the params you pass to it. Then move on to developing the presentation layer above it. Developing your application layer by layer helps, since it forces you to decouple the DB logic from the presentation logic. When you develop the DB layer, assume that not just your presentation layer but any client can call it. This will help you design applications that are scalable and adaptable to new requirements.
So, bottom line: start with the DB, then the DB integration layer, the controller, and last the presentation layer.
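To make that "generic DB layer" idea concrete, here is a minimal Python sketch (sqlite3 and the devices table are placeholders) that builds a parameterized query from whatever filters the caller passes in. Note that the column names must come from trusted code; only the values are user input:

    import sqlite3


    def find_devices(conn, filters):
        """Build a parameterized query from a dict of column -> value filters."""
        sql = "SELECT id, name, status FROM devices"
        params = []
        if filters:
            # Column names come from a whitelist in real code; values are bound.
            clauses = ["%s = ?" % column for column in sorted(filters)]
            sql += " WHERE " + " AND ".join(clauses)
            params = [filters[column] for column in sorted(filters)]
        return conn.execute(sql, params).fetchall()


    # Example usage against an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE devices (id INTEGER, name TEXT, status TEXT)")
    conn.execute("INSERT INTO devices VALUES (1, 'sensor-a', 'ok')")
    conn.execute("INSERT INTO devices VALUES (2, 'sensor-b', 'alarm')")

    print(find_devices(conn, {"status": "alarm"}))   # [(2, 'sensor-b', 'alarm')]

Keeping query construction in one place like this is also what lets the presentation layer stay ignorant of SQL, which is the decoupling described above.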
For the purpose of discussion, I'm going to assume that you are working with a starting application that doesn't have a pre-existing database. If this is false, I'd probably move the order of steps around quite a bit.
1 - Understand the Universe
First, you've got to get a sense of what's around you so you can really understand the problem that you are trying to solve.
User stories or use cases are often a good starting point. Starting with what tasks the user will try to do, and evaluating how frequently they are likely to do them, is a great starting point. I like to start with screen mockups as well; with or without lots of hands-on time with users, I find that having a screen gives our team something really finite to argue about.
What other tools exist in this sphere? These days, it seems to me that users never use just one tool; they swap around a lot. You need to know two main things about the other tools your users use:
(1) What will they be using as part of the process, alongside your tool? Consider direct input/output needs - what might they want to cut/copy/paste from or to? What tools might you want to offer file upload/download for, with specific formats? What tools are they using alongside your tool that you might want to share terminology, layout, color coding, icons or other GUI elements with? Focus especially on the edges of the tools - a real gotcha I hit in a recent project was emulating the databases of previous tools. It turned out that we had a massive database shift, and we would likely have been better off starting fresh.
(2) What (if anything) are you replacing or competing with? Steal the good stuff; dump and improve the bad stuff. Asking users is always best. If you can't, at least understanding the management initiative is important - is this tool replacing a horrible legacy tool? It may be legacy, but there may be the One True Feature that has kept the tool in business all these years...
At this stage, I find that things are really mushy - there are some screenshots, some writing, some schemas or ICDs - but not a really gelled clue.
2 - Logical Entities
Or at least that's what the OO books call it.
I don't care much for all the writing I see on this task - but I find that in any given system, I have one true diagram that I draw over and over. It's usually about 3-10 boxes, and hopefully less than an exponentially large number of lines connecting them.
The earlier you can get that diagram the better.
It doesn't matter to me if it's in UML, a database logical model, something older, or on the back of a napkin (as long as the napkin is shrouded in plastic and hung where everyone can see it).
The earlier you can make this diagram correctly, the better.
After the diagram is made, you can start working on the follow on work that may be more official.
I think it's a chicken-and-egg question whether you start with your data or you start with your screens and business logic. I know that you certainly want to optimize for database sizing and searchability... but how do you know exactly what your database needs are without screens and interfaces giving you a sense for the data?
In practice, I think this is an ever-churning cycle. You do a little bit everywhere, and then you change it all.
Even if you don't get to do a formal agile lifecycle, I think your best bet is to view design as agile - it will take many repetitions and arguments before you really feel it's "right".
The most important thing to keep in mind is that your first, and most likely your 2nd and 3rd, attempts at designing the database will be wrong in some way. That might sound negative, maybe even a little rash (it's certainly more towards the 'agile' software design philosophy), but it's an important thing to keep in mind.
You still need to do your analysis thoroughly, of course; try to implement one feature at a time, but try to get all layers working first. That way you won't have to do too much rework when the specs change and you understand the issues better. Once you have a lot of data loaded into a system, changing things becomes increasingly difficult.
The main benefit of this approach is that you find out quickly where your design is broken, where you haven't separated your design layers correctly. One trick I find extremely useful is to do both a SQLite and a MySQL version, so seamless switching between the two is possible. Because the two use slightly different dialects of SQL, it highlights where you have too tight a coupling between the layers.
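One way to get that seamless switching is to keep every piece of SQL behind one small class and hand it a DB-API connection. A rough Python sketch (table and class names are illustrative), where only the connection factory and the placeholder style change between SQLite and MySQL:

    import sqlite3


    class DeviceStore:
        """All SQL lives behind this class; the rest of the app never sees a cursor."""

        def __init__(self, conn, placeholder="?"):
            # sqlite3 uses '?' placeholders while MySQLdb uses '%s' -- one of the
            # dialect differences that running both backends flushes out early.
            self._conn = conn
            self._placeholder = placeholder

        def add(self, name):
            cur = self._conn.cursor()
            cur.execute("INSERT INTO devices (name) VALUES (%s)" % self._placeholder,
                        (name,))
            self._conn.commit()

        def names(self):
            cur = self._conn.cursor()
            cur.execute("SELECT name FROM devices")
            return [row[0] for row in cur.fetchall()]


    # SQLite flavour (in-memory, handy for tests):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE devices (name TEXT)")
    store = DeviceStore(conn)
    store.add("sensor-a")
    print(store.names())                 # ['sensor-a']

    # The MySQL flavour would only change the connection factory and placeholder:
    #   import MySQLdb
    #   store = DeviceStore(MySQLdb.connect(db="monitoring"), placeholder="%s")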
A good start would be to get familiar with Multitier architecture
Then you design your presentation layer.
In your business logic layer implement all logic
And finally you implement your data access layer.
Try to set up a prototype with something that is more productive than C++, for example Ruby, Python, or maybe even PHP.
When the prototype works and you see that your data model is okay but your queries are too slow, then you can start using C++.
But as your question suggests, the work is mostly about the data, and in that case the speed of a scripting language should be enough.