Saturday, August 17, 2013

Test Driven Development

Test Driven Development (TDD) is a well-known software development process which relies on the developer writing an automated test case before writing any piece of functional code. It emphasizes a series of unit tests and refactoring to arrive at a simple design.

   Everyone is accustomed to the general practice of software development, which looks as follows:
  • Design: Figure out how you're going to accomplish all the functionality.
  • Code: Type in the code that implements the design.
  • Test: Run the code a couple of times to see if it works, then hand it over to QA.

On the other hand, Test Driven Development modifies this approach as follows:
  • Test: Figure out what the next chunk of function is all about.
  • Code: Make it do that.
  • Design: Make it do that excellently.

As described above, TDD completely inverts the accepted ordering of 'design-code-test'. So, from one view, TDD just puts the design after the test and the code. Refactoring is considered pure design in TDD.

   In the TDD world, we are not allowed to work out a complete or excellent design for getting our test (and all existing tests) to pass before we start coding. There is, however, sometimes a debate on whether there should be some kind of initial design phase where the interfaces (along with method signatures) for the future classes are defined. Further, it is not allowed to reduce or skip the "refactor" step during TDD. Hence, after each iteration of getting a test to pass, the code should be refactored, which indirectly contributes to the design. Also, once a test is written, TDD allows us to do any of the following during implementation to pass the test:
  1. Reuse some existing code
  2. Introduce meaningful new class(es) and method(s)
  3. Copy existing method(s) and change the copies
TDD helps in certain aspects of integration, as the entire process is divided into a series of small steps. The more often we check code into the version control system, and the smaller our changes are, the lower the likelihood of 'merge conflicts' with others. Also, every commit is a guaranteed fallback position, a piton in the rock that we can easily go back to if we slip and fall.

Below is the Red-Green-Refactor Rule for Test Driven Development:

RED: When you write the test, you are designing the behavior you expect the code-under-test to perform.
GREEN: When you write the code to pass the test, you are designing the internal implementation of that behavior.
REFACTOR: Your micro-focus on getting to green probably 'un-designed' the code. When you refactor you are re-designing.
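
As a rough illustration of one pass through the cycle, here is a minimal sketch. The ShoppingCart class and its methods are hypothetical names invented for this example, and plain Java checks are used instead of JUnit so that the sketch runs without any dependencies. The comments mark which part belongs to which phase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical production class, shown as it looks after the REFACTOR step.
class ShoppingCart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) {
        pricesInCents.add(priceInCents);
    }

    // GREEN: a first version might have been a plain loop written just to pass.
    // REFACTOR: with the tests green, the loop was replaced by a stream sum.
    int totalInCents() {
        return pricesInCents.stream().mapToInt(Integer::intValue).sum();
    }
}

class ShoppingCartTest {
    // RED: written before ShoppingCart existed, so it failed first
    // (not compiling counts as failing).
    static void totalIsSumOfItemPrices() {
        ShoppingCart cart = new ShoppingCart();
        cart.add(250);
        cart.add(199);
        check(cart.totalInCents() == 449, "total should be 449");
    }

    static void emptyCartTotalsZero() {
        check(new ShoppingCart().totalInCents() == 0, "empty cart should total 0");
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        totalIsSumOfItemPrices();
        emptyCartTotalsZero();
        System.out.println("all tests passed");
    }
}
```

The refactoring here is deliberately small; the point is only that it happens while the tests are green, so any slip shows up immediately.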




The Stepwise Premise for TDD goes as below:
   -  Can gigantic complex architectures really be created using nothing other than red-green-refactor?
   -  Consider these issues:
  • Large solutions don't just materialize out of nowhere; they are ultimately created in modest steps anyway.
  • Even if we have analysis and design phases for large-scale architectural features, we can still develop using TDD.
  • Considerable data is available to support the idea that complex global design processes frequently don't work.
  • TDD has a serious track record: it is being used all over the world to create complex systems.
Below are the commonly used TDD patterns:

Specify It
  • Essence First: What is the most basic functionality needed, not including anything fancy
  • Test First:       What exactly will we be testing? Capture that in the test method name.
  • Assert First:    What behavior would you like to check?  Writing the assert statement first leads us to build the structure backwards, "backfilling the method" by declaring the objects and methods we need to create as well as the expected result of calling the new code.
Frame It
  • Frame First: Create whatever class(es), constructor(s) and method(s) are needed by our assert statement.
Evolve It
  • Do The Simplest Thing That Could Possibly Work: Focus on minimalism by asking oneself to program only what is absolutely necessary to pass a test.
  • Break It To Make It: Write new test code that we know will fail because our production code isn't yet capable of handling it.
  • Refactor Mercilessly: Make design improvements continuously, aggressively and mercilessly, so that really bad code never accumulates.
  • Test Driving:  In TDD, we don't want to stray too far from the Green Bar.
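
The patterns above can be sketched with a small example. The LeapYears class and the test names are hypothetical, chosen only for illustration, and plain Java checks stand in for a test framework. The comments describe the order in which the code would have grown, one failing test at a time.

```java
// Frame First: this hypothetical class was created only because the
// assert in the first test needed something to call.
class LeapYears {
    // Growth of this method, one failing test at a time:
    //   test 1 passed with "return true" (the simplest thing that could work),
    //   test 2 forced "year % 4 == 0",
    //   tests 3 and 4 forced the century rules.
    static boolean isLeap(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }
}

class LeapYearsTest {
    // Test First: the method name states what is being tested.
    static void yearDivisibleByFourIsLeap() {
        // Assert First: this line was written before LeapYears existed.
        check(LeapYears.isLeap(2012));
    }

    // Break It To Make It: each test below failed against the previous version.
    static void yearNotDivisibleByFourIsNotLeap() {
        check(!LeapYears.isLeap(2013));
    }

    static void centuryIsNotLeap() {
        check(!LeapYears.isLeap(1900));
    }

    static void centuryDivisibleBy400IsLeap() {
        check(LeapYears.isLeap(2000));
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("test failed");
        }
    }

    public static void main(String[] args) {
        yearDivisibleByFourIsLeap();
        yearNotDivisibleByFourIsNotLeap();
        centuryIsNotLeap();
        centuryDivisibleBy400IsLeap();
        System.out.println("all tests passed");
    }
}
```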

Finally, Robert Martin, one of the most devoted advocates of Test Driven Development, gives the three laws of TDD in his book Clean Code as follows:
  • First Law: You may not write production code until you have written a failing unit test.
  • Second Law: You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
  • Third Law: You may not write more production code than is sufficient to pass the currently failing test.

Refactoring generally involves taking an existing class that's too complex and breaking it into smaller classes, each of which takes on part of the old class's responsibility, and all of which work together. Refactoring classes into smaller ones improves the design in numerous ways, some of which are listed below:

   1)  By making classes smaller, thus easier to grasp at one time.
   2)  By aligning the smaller classes with a well-understood functional breakdown of the underlying problem.
   3)  By making the couplings between classes mirror the couplings between functionality.
   4)  By (ultimately) allowing complex systems to be built by composing many simpler objects.
   5)  By making each smaller class easier to test.
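
A minimal sketch of such a split is shown below. All class names are hypothetical: the assumption is that an OrderProcessor originally computed tax and formatted the receipt itself, and that refactoring moved each responsibility into its own small class.

```java
// Hypothetical small class: only knows how to compute tax.
class TaxCalculator {
    private final int ratePercent;

    TaxCalculator(int ratePercent) {
        this.ratePercent = ratePercent;
    }

    int taxInCents(int netInCents) {
        return netInCents * ratePercent / 100;
    }
}

// Hypothetical small class: only knows how to lay out a receipt line.
class ReceiptFormatter {
    String format(int netInCents, int taxInCents) {
        return "net=" + netInCents + " tax=" + taxInCents
                + " total=" + (netInCents + taxInCents);
    }
}

// The original class now just composes the two smaller ones.
class OrderProcessor {
    private final TaxCalculator taxCalculator;
    private final ReceiptFormatter formatter;

    OrderProcessor(TaxCalculator taxCalculator, ReceiptFormatter formatter) {
        this.taxCalculator = taxCalculator;
        this.formatter = formatter;
    }

    String receiptFor(int netInCents) {
        return formatter.format(netInCents, taxCalculator.taxInCents(netInCents));
    }
}

class OrderProcessorDemo {
    public static void main(String[] args) {
        OrderProcessor processor =
                new OrderProcessor(new TaxCalculator(20), new ReceiptFormatter());
        System.out.println(processor.receiptFor(1000));
    }
}
```

Each small class can now be tested on its own with one or two direct calls, and the composed class needs only a test that the pieces are wired together.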

Refactoring also involves Decremental Development, which means finding ways to shrink the code even as we continue to add new features. Common functionality is moved into a library, and pre-existing libraries (core as well as external) that already provide the required implementation are sought out instead of re-inventing the wheel.


GUI Applications

In order to apply TDD to GUI applications, they need a clear separation between user interface and operational logic, most commonly achieved with the MVC pattern. Although the model/view split isn't the only technique for TDD'ing GUIs, it does represent the meta-pattern for all of them.
The following can be achieved by splitting responsibilities:
  • We can test the Model by having our TestCase pretend to be the View.
  • The most important interactions are on the Model, which makes it possible to test the core functionality.
  • We can use fake domain objects for testing, which are in turn used by the Model.
  • We can test the View by creating a fake Model and driving it that way.
  • The View can be tested by driving the windows programmatically.
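
The first point, a test case pretending to be the View, might look roughly like the sketch below. CounterView, CounterModel and RecordingView are hypothetical names; no real GUI toolkit is involved, which is the whole idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical view contract: the model only knows this interface.
interface CounterView {
    void showCount(int count);
}

// Hypothetical model holding the operational logic, with no GUI code in it.
class CounterModel {
    private final CounterView view;
    private int count;

    CounterModel(CounterView view) {
        this.view = view;
    }

    void increment() {
        count++;
        view.showCount(count);
    }

    void reset() {
        count = 0;
        view.showCount(count);
    }
}

// The test "pretends to be the View" by recording what it is asked to show.
class RecordingView implements CounterView {
    final List<Integer> shown = new ArrayList<>();

    @Override
    public void showCount(int count) {
        shown.add(count);
    }
}

class CounterModelTest {
    static void modelPushesEveryChangeToTheView() {
        RecordingView view = new RecordingView();
        CounterModel model = new CounterModel(view);
        model.increment();
        model.increment();
        model.reset();
        if (!view.shown.equals(Arrays.asList(1, 2, 0))) {
            throw new AssertionError("unexpected updates: " + view.shown);
        }
    }

    public static void main(String[] args) {
        modelPushesEveryChangeToTheView();
        System.out.println("all tests passed");
    }
}
```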

Several further enhancements can be applied to the Model-View split, such as the following:
 - Add Publisher-Subscriber to allow multiple Views on the same Model.
 - Add a Controller class to translate View-gestures into Model-commands.
 - Add a Command system to isolate and manipulate individual commands.
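
The three enhancements can be combined in a small sketch. Everything here (VolumeModel, VolumeCommand, VolumeController and the "UP"/"DOWN" gesture strings) is a hypothetical illustration, not a prescribed design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Publisher-Subscriber: any number of views can subscribe to the same model.
class VolumeModel {
    private final List<IntConsumer> subscribers = new ArrayList<>();
    private int volume;

    void subscribe(IntConsumer subscriber) {
        subscribers.add(subscriber);
    }

    void setVolume(int newVolume) {
        volume = Math.max(0, Math.min(10, newVolume)); // clamp to 0..10
        for (IntConsumer subscriber : subscribers) {
            subscriber.accept(volume);
        }
    }

    int volume() {
        return volume;
    }
}

// Command: one user action captured as an object that can be tested alone.
interface VolumeCommand {
    void execute(VolumeModel model);
}

// Controller: translates a view gesture into a model command.
class VolumeController {
    private final VolumeModel model;

    VolumeController(VolumeModel model) {
        this.model = model;
    }

    void onGesture(String gesture) {
        VolumeCommand command;
        if ("UP".equals(gesture)) {
            command = m -> m.setVolume(m.volume() + 1);
        } else if ("DOWN".equals(gesture)) {
            command = m -> m.setVolume(m.volume() - 1);
        } else {
            return; // unknown gestures are ignored in this sketch
        }
        command.execute(model);
    }
}

class VolumeDemo {
    public static void main(String[] args) {
        VolumeModel model = new VolumeModel();
        model.subscribe(v -> System.out.println("slider shows " + v));
        model.subscribe(v -> System.out.println("label shows " + v));
        new VolumeController(model).onGesture("UP");
    }
}
```

In a test, the two subscribers would simply be lists that record the values they receive, so neither view needs a real window.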


Test Driven Development Shortcomings

TDD is a development process which assures quality by enforcing unit tests. However, the quality of the code depends mainly on the quality of the tests, not on when the tests are written during development or how many lines are covered. The essential purpose of writing unit tests is to reduce the possibility of defects in the development phase itself and to provide a set of automated tests that validate future changes so that they do not introduce new defects. Although such an approach is greatly beneficial, the questions often raised are: To what extent should the tests be written? When does this approach lose efficiency relative to the value of auto-tested code? Does it provide an optimal solution to the complex process of software development and unforeseen defects? Is the time and effort spent writing unit tests to prevent and decrease defects the best approach?

Most unit testing tutorials, TDD books and sites describe the approach with basic examples such as processing student grades, calculating wages, etc. Although this does give us a perspective and seems to make the approach by far the best one, when applied in the corporate world it has some inherent issues, listed below:

1) Testing a piece of code completely may involve a huge number of scenarios. Even selecting a subset of critical cases and writing test cases for them involves almost as much effort as writing the original functional code. And even after selecting a subset of critical cases, we remain open to defects arising from the ignored scenarios. How do we decide which cases are critical and which can be ignored? Some cases may be ignored at first but, considering the entire system, could lead to vital failures. Hypothetically, even if we painstakingly compiled all the critical cases and wrote unit tests for the entire application, we could not be sure that no defects would come up from the unit-tested code. Oftentimes the unit tests validate obvious scenarios (mostly by replicating the code/object in the unit test or verifying that a method does get called), thus providing a false sense of security. This is mostly caused when the same person writes both the test and the code.

2) Compared to most of the unit testing examples in tutorials, books and articles, professional code is not that simple or straightforward to isolate. Many real-world systems involve file handling, calls to external services, databases, external processes and multi-threaded operations. The outcome of these operations is hard to predict. We cannot anticipate all the possible values returned by external services or by the database. Some scenarios, such as concurrent operations, server timeouts, etc., are difficult to recreate in a unit test environment. Even if a unit test could be written to check the handling of possible service failures, it would require a substantial amount of effort compared to manual or integration testing.
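
To give a feel for the effort involved, here is a minimal sketch of what testing just one such failure might require. RateService and PriceConverter are hypothetical names, and the timeout is only simulated by throwing an exception; real network timing, partial responses and concurrency are not reproduced at all.

```java
import java.util.concurrent.TimeoutException;

// Hypothetical boundary that has to be introduced around the external service.
interface RateService {
    double currentRate(String currency) throws TimeoutException;
}

// Hypothetical code under test: falls back to a fixed rate on timeout.
class PriceConverter {
    private final RateService rates;
    private final double fallbackRate;

    PriceConverter(RateService rates, double fallbackRate) {
        this.rates = rates;
        this.fallbackRate = fallbackRate;
    }

    double convert(double amount, String currency) {
        try {
            return amount * rates.currentRate(currency);
        } catch (TimeoutException e) {
            return amount * fallbackRate;
        }
    }
}

class PriceConverterTest {
    static void usesServiceRateWhenAvailable() {
        RateService healthy = currency -> 2.0;
        check(new PriceConverter(healthy, 1.5).convert(10.0, "EUR") == 20.0);
    }

    static void usesFallbackRateOnTimeout() {
        // The fake only throws; it does not behave like a slow server.
        RateService timingOut = currency -> {
            throw new TimeoutException("simulated timeout");
        };
        check(new PriceConverter(timingOut, 1.5).convert(10.0, "EUR") == 15.0);
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("test failed");
        }
    }

    public static void main(String[] args) {
        usesServiceRateWhenAvailable();
        usesFallbackRateOnTimeout();
        System.out.println("all tests passed");
    }
}
```

Even this tiny case needed an extra interface and a hand-written fake, and it still says nothing about how the real service behaves.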

3) The basic premise of TDD is that the test drives the system design and implementation. Hence, if a line of code cannot be tested, it shouldn't have been written at all. Sometimes, due to the limitations of unit testing tools such as JUnit, Mockito and others, a unit test cannot test a certain piece of code in isolation. Static methods are one such case where, despite using PowerMock, many questions are raised over the effectiveness of those tests. Also, private class fields/methods tend to be changed to less restrictive access modifiers to facilitate unit testing with JUnit. Concerns are also raised about the use of Mockito's @InjectMocks in unit tests, and constructor-based auto-wiring is recommended instead of setter- or field-based auto-wiring. This ultimately restricts the usage of some features of the programming language or of the frameworks to within the boundaries of testability, with anything outside them often tagged as bad design.
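
The constructor-based style mentioned above can be sketched as follows. MailSender and WelcomeService are hypothetical names, and no Mockito or dependency injection framework is used; the test simply passes a hand-written fake through the constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical collaborator that would normally be injected by a framework.
interface MailSender {
    void send(String recipient, String body);
}

// The dependency arrives through the constructor, so the field can stay
// private and final and no reflection is needed to set it in a test.
class WelcomeService {
    private final MailSender mailSender;

    WelcomeService(MailSender mailSender) {
        this.mailSender = mailSender;
    }

    void welcome(String userName) {
        mailSender.send(userName, "Welcome, " + userName + "!");
    }
}

class WelcomeServiceTest {
    static void sendsWelcomeMailToUser() {
        List<String> sent = new ArrayList<>();
        // Hand-written fake: records each mail instead of sending it.
        MailSender recordingSender = (recipient, body) -> sent.add(recipient + ": " + body);

        new WelcomeService(recordingSender).welcome("alice");

        if (!sent.equals(Arrays.asList("alice: Welcome, alice!"))) {
            throw new AssertionError("unexpected mails: " + sent);
        }
    }

    public static void main(String[] args) {
        sendsWelcomeMailToUser();
        System.out.println("all tests passed");
    }
}
```

The trade-off described in the paragraph is visible here too: the class is shaped by what is convenient to test rather than by what the framework would otherwise allow.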

4) As mentioned previously, Robert Martin holds that no production code should be written without a corresponding failing test. This totally ignores the question of whether the unit test is effective, productive and valuable in catching issues. Further, it blurs the line between writing a unit test for the behavior/functionality of the code and mapping each line of production code to a corresponding unit test. For example, creating a new object, setting values on an object, non-conditional calls to a library's void methods, logging, etc. certainly add up to numerous lines of production code, but they hardly articulate any logic or behavior. Consider the following code:

Properties properties = new Properties();
properties.setProperty("key", "value");
// try-with-resources closes the stream once the file has been written
try (FileOutputStream out = new FileOutputStream("C:/test.properties")) {
    properties.store(out, null);
}

The above code creates a Properties object and uses the API's built-in store method to create a properties file, without any conditional logic. Many "what if" arguments could be made, such as what if the store method is not called, or the file path is incorrect, or the properties are not set or are set incorrectly, etc., which is often a slippery slope. But the purpose of a unit test is not to mandate the existence of lines of code or their order; it is to make sure an independent chunk of code behaves as intended. Any piece of code which has only a single logical flow and returns the same or similar results no matter the input has no concrete behavior. Further, if the code does not provide any behavior by itself, or relies on external library methods for its behavior, then unit testing such code not only adds overhead and maintenance but also fails to provide any productive feedback for detecting real problems.
    Further, mandating TDD during a proof of concept, or during trial and error to fix a known problem, not only increases the development overhead significantly but also distracts the developer from the core task/problem.


5) Someone has said, "the line of code that is fastest to write, that never breaks, that doesn't need maintenance, is the line you never have to write". In Test Driven Development, as the unit test drives the development (rather than us choosing the critical methods to unit test), a lot more test code is involved. Multiple scenarios for a given piece of code may encourage duplicate code unless only a single person works on it. In corporate projects, such big chunks of test code add to the maintenance burden of the system. Badly written unit tests, which often involve hardcoded error strings, consume further time/effort to maintain. Fragile tests which generate false failures tend to be ignored even when they report valid errors. Modifying existing functionality using TDD becomes quite challenging, as we need to deal with a mesh of interconnected mock objects and a series of test cases.
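
The point about hardcoded error strings can be illustrated with a small sketch. AgeValidator and both checks are hypothetical; the contrast is between a test tied to the exact wording of a message and one tied only to the behavior.

```java
// Hypothetical production code that rejects negative ages.
class AgeValidator {
    static void validate(int age) {
        if (age < 0) {
            throw new IllegalArgumentException("Age must not be negative: " + age);
        }
    }
}

class AgeValidatorTest {
    // Fragile: fails as soon as someone rewords the message,
    // even though the validation itself still works.
    static boolean fragileCheck() {
        try {
            AgeValidator.validate(-1);
            return false;
        } catch (IllegalArgumentException e) {
            return "Age must not be negative: -1".equals(e.getMessage());
        }
    }

    // Less fragile: only asserts that a negative age is rejected.
    static boolean behaviorCheck() {
        try {
            AgeValidator.validate(-1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fragile check passes: " + fragileCheck());
        System.out.println("behavior check passes: " + behaviorCheck());
    }
}
```

Both checks pass today; only the first one becomes a maintenance cost the next time the message text is edited.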

 Finally, the root issue with TDD is not the effort or time required to write the tests, but their value compared to that effort, i.e. developer productivity. TDD is much easier to apply when the design documents dictate the classes/methods and their functionality beforehand. It would also help if all the possible test cases were listed (usually by testers) for the pre-designed classes.
