Emprovise Blog: Test Driven Development

Test Driven Development is famous software development process which relies on the developer to write an automated test case before writing any piece of functional code. It emphasizes series of unit tests and re-factoring to provide a simple design.

Everyone is accustomed to the general practice of software development which looks as below:

Design: Figure out how you're going to accomplish all the functionality.
Code: Type in the code that implements the design.
Test: Run the code a couple of times to see if it works, then hand it over to QA.

On the other hand Test Driven Development modifies this approach as below:

Test: Figure out what the next chunk of function is all about.
Code: Make it do that.
Design: Make it do that excellently.

As described above TDD completely inverts the accepted ordering of 'design-code-test'. So, from one view, TDD just puts the design after the test and the code. Refactoring is considered as pure design in TDD.

In TDD world we are not allowed to figure out a complete or excellent design to get our test (and all existing tests) to pass, before we start coding it. Although there is sometimes a debate on whether there should be some kind of initial design phase were interfaces (along with methods signature) for the future classes needs to be defined. Further it is not allowed to reduce or skip the "refactor" step during the TDD development. Hence after each iteration of passing test, there should be refactoring done on the code which indirectly contributes to the design. Also once a test is written, TDD allows us to do either of the following during implementation to pass the test:

Reuse some existing code
Introduce meaningful new class(es) and method(s)
Copy existing method(s) and change the copies

TDD helps in certain aspects of the integration, as the entire process a divided into a series of small steps. The more often we check in the code in version control system, and the smaller our changes are, the less likelihood of getting any 'merge conflicts' with others. Also every commit is a guaranteed fallback position, a piton in the rock that we can easily go back to if we slip and fall.

Below is the Red-Green-Refactor Rule for Test Driven Development:

RED	When you write the test, you are designing the behavior you expect the code-under-test to perform.
GREEN	When you write the code to pass the test, you are designing the internal implementation of that behavior.
REFACTOR	Your micro-focus on getting to green probably 'un-designed' the code. When you refactor you are re-designing.

The Stepwise Premise for TDD goes as below:
- Can gigantic complex architectures really be created using nothing other than red-green-refactor?
- Consider these issues:

All large solutions don't just materialize out of nowhere; they are ultimately created in modest steps anyway.
Even if we have analysis and design phases for large-scale architectural features, we can still develop using TDD.
Considerable data is available to support the idea that complex global design processes frequently don't work.
TDD has a serious track record: it is being used all over the world to create complex systems.

Below are the commonly used TDD patterns:

Specify It

Essence First: What is the most basic functionality needed, not including anything fancy
Test First: What exactly will we be testing? Capture that in the test method name.
Assert First: What behavior would you like to check? Writing the assert statement will lead us to produce the structure backwards by "backfilling the method" by declaring the objects and methods we need to create as well as the expected result of calling the new code.

Frame It

Frame First: Create whatever class(es), constructor(s) and method(s) are needed by our assert statement.

Evolve It

Do The Simplest Thing That Could Possibly Work: Focus on minimalism by asking oneself to program only what is absolutely necessary to pass a test.
Break It To Make It: Write a new test code that we know will fail because as our production code isn't capable of handling the new test.
Refactor Mercilessly: Make design improvements continuously, aggressively, mercilessly avoiding really bad code.
Test Driving: In TDD, we don't want to stray too far from the Green Bar.

Finally, Robert Martin, one of the fanatic devotee of Test Driven Development provides the three laws of TDD in his book Clean Code as below:

First Law: You may not write production code until you have written a failing unit test.
Second Law: You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
Third Law: You may not write more production code than is sufficient to pass the currently failing test.

Refactoring generally involves by taking an existing class that's too complex, and break it into smaller classes, each of which takes part of the old class's responsibility, and both of which work together. There are numerous advantages of refactoring the classes to smaller ones, some listed as as follows:

1) By making classes smaller, thus easier to grasp at one time.
2) By aligning the smaller classes with a well-understood functional breakdown of the underlying problem.
3) By making the couplings between classes mirror the couplings between functionality.
4) By (ultimately) allowing complex systems to be built by composing many simpler objects.
5) By making each smaller class easier to test.

Refactoring also involves Decremental Development, which means finding ways to shrink the code even as we continue to add new features. All the common functionality are moved as a part of library, while pre-existing libraries (core as well as external) with required implementation is searched for instead of re-inventing the wheel.

GUI Applications

In order to apply TDD on GUI applications, they need to have clear separation between user interface and operational logic most commonly achieved by MVC pattern. Although the model/view split isn't the only technique for TDD'ing GUI's, but it does represent the meta-pattern for all of them.
Following can be achieved by splitting responsibilities:

We can test the Model by having our TestCase pretend to be the View.
The most important interactions are on the Model, enabling to test core functionality.
We can use fake domain objects for testing which are in turn are used by the Model.
We can test the View by creating a fake Model and driving it that way.
The View can be tested by driving the window's programmatically.

A lot of enhancements can be applied to the Model-View split further such as follows:
- Add Publisher-Subscriber to allow multiple Views on the same Model.
- Add a Controller class to translate View-gestures into Model-commands.
- Add a Command system to isolate and manipulate individual commands.

Test Driven Development Shortcomings

TDD is a development process which assures quality by enforcing unit tests. Although the quality of the code mainly depends on the quality of tests, not when the tests are written during development or how many lines are covered. The essential purpose for writing unit tests is to reduce the possibly of defects in the development phase itself and provide a set of automated tests to validate future changes without introducing new defects. Although such approach is greatly beneficial, the question raised often is to what extent should the tests be written ? When does this approach looses efficiency over the value of auto-tested code ? Does this provide optimal solution to the complex process of software development and unforeseen defects. Is the time and effort spent in writing unit tests to prevent and decrease defects the best approach ?

Most of the Unit Testing tutorials, TDD books and sites describe the approach with basic examples such as processing students grades, calculating wages etc. Although it does gives us a perspective and seems to make the approach by far the best one, but when applied in the co-operate world, such approach has some inherent issues listed as below:

1) Testing a piece of code completely, may involve huge number of scenarios to be considered. Even to select the subset of critical cases and write the test cases for them, it involves almost similar effort as writing the original functional code. But even after selecting a subset of critical cases, we still open ourselves to the possible defects occurring from the ignored scenarios. How to decide which cases are critical and which should be ignored. Some cases may be ignored before, but considering the entire system, such cases could lead to vital failures. Hypothetically, even if we painstakingly compile all the critical cases and wrote unit tests for the entire application, we cannot be sure that there wouldn't be any defects coming up from the unit tested code. Often times, the unit tests validate obvious scenarios (mostly by replicating the code/object in unit test or verifying if the method does get called) thus providing us with a false sense of security. This mostly is caused when the same person writes both the test and the code.

2) Compared to most of the unit testing examples in tutorials, books and articles, the professional code is not that simple or straight forward to isolate. Many real world systems involves, file handling, calling external services, databases, invoking external processes and multi-threading operations. The outcome of these operations is hard to predict. We cannot comprehend the possible values returned by the external services, or by the database all the times. Some of the scenarios such as concurrent operations, server timeout, etc are difficult to recreate in unit test environment. Even if a unit test could be written to check the handling of possible service failures, it would require a substantial amount of efforts compared to manual or integration testing.

3) The basic premise of TDD is that the test drives the system design and implementation. Hence if the line of code cannot be tested then it shouldn't have be written at all. Sometimes due to the limitations of Unit Testing tools such as Junit, Mockito and others the unit test cannot isolately test a certain piece of code. Static methods is one of such cases were despite using Powermock there are many questions raised over the effectiveness of those tests. Also private class fields/methods mostly tend to be changed to lower access modifiers to facilitate unit testing as far as Junit is concerned. Concerns are also raised about the use of Mockito's InjectMocks in unit tests and recommended to use constructor based auto-wiring instead of setter or field based auto-wiring. This ultimately restricts the usage of some features of the programming language or the frameworks inside the boundaries of testability often tagged as bad design.

5) As mentioned previously by Robert Martin, no production code should be written without the corresponding failing test. This totally ignores the fact that whether the unit test is effective, productive and valuable in catching issues. Further it blurs the line between writing a unit test on the behavior/functionality of the code rather than mapping each line of production code with the corresponding unit test. For example creating a new object, setting values to an object, non-conditional calls to library's void methods, logging etc sure compounds to numerous lines of production code, but they hardly articulate any logic or behavior. Consider the following code below:

Properties properties = new Properties();
properties.setProperty("key", "value");
properties.store(new FileOutputStream("C:/test.properties"), null);

The above code creates a Properties object and uses built-in store method of API to create properties file without any conditional logic. There could be many what if arguments made such as what if the store method is not called or file path is incorrect, or properties are not set or incorrectly set etc which often is a slippery slope. But mandating the existence of a line of code or their order is not the purpose of unit test, but is to make sure an independent chunk of code behaves as intended. Any piece of code which only has a single logical flow and returns same or similar results no matter the input has no concrete behavior. Further, if the code does not provide any behavior by itself or relies on external library methods for its behavior then unit testing such code not only adds to overhead and maintenance but fails to provide any productive feedback to detect real problems.
Further, mandating TDD during a proof of concept or trial and error to fix a known problem not only increases the development overhead exponentially but also distracts the developer from the core task/problem.

4) Someone has said "the line of code that is fastest to write, that never breaks, that doesn't need maintenance is the line you never have to write". In Test Driven Development, as the unit test drives the development (rather than us choosing the critical methods to unit test), there is a lot more test code involved. Multiple scenarios for the given piece of code may encourage duplicate code unless only a single person works on it. In the co-operate projects such big chunks of test code adds up to the maintenance of the system. Badly written unit tests which often involves hardcoded error strings further consume time/effort to maintain. Fragile tests which generate false failures mostly tend to be ignored even in case of valid errors. Modifying the existing functionality using TDD becomes quite challenging as we need to deal with a mesh of interconnected mock objects and a series of test cases.

Finally the root issue with TDD is not the effort or time required to write them, but their value compared to the effort i.e. Developer Productivity. TDD is much easier to be applied when the design documents dictates the classes/methods and their functionality beforehand. It also would help if all the possible test cases are listed (usually by testers) for the pre-designed classes.

Was it really Behavior Driven Development ?

Since writing this 2013 blog post, many others have joined to question the effectiveness of TDD. David Heinemeier Hansson, the creator of Ruby on Rails has described TDD as "Test-first fundamentalism is like abstinence-only sex ed: An unrealistic, ineffective morality campaign for self-loathing and shaming". After the blog post Kent Beck put forward his sarcastic defense on TDD which later was followed by conversation with Martin Fowler on whether TDD is Dead. Though the conclusion of the conversation was that TDD is valuable in some contexts, but much disagreement prevailed over the number and type of contexts in which it should be applied. Then in the DevTernity 2017 conference Ian Cooper gave a talk on "TDD, Where Did It All Go Wrong" which was promoted by Uncle Bob Martin. In the talk Cooper pointed out that TDD is being practiced incorrectly since we are focused on testing the implementation details instead of testing the system behavior. Due to this we often write more test code than implementation code. Such implementation driven tests with spaghetti of mocks makes refactoring painful, maintenance a nightmare and decreases the overall development productivity. Developers too often don't understand the intent of such tests and are unable to deduce the system behavior by reading them. Enhancements and re-designs becomes difficult as changing the implementation also requires to change the tests which is long haul process.

TDD is mainly practiced by using 'adding a new method to a class' as trigger to write a test. Such test-case per class approach fails to capture the true ethos for TDD. Adding a new class or method is not the trigger for writing tests. The trigger is implementing a requirement. Write tests to cover the use cases or user stories, not the implementation classes or methods. The system under test is not a class but the exports from a module or its facade. The 'unit' of 'unit testing' here really means module, not a class. A class by itself can be the facade, but many classes are implementation details of the module. Do not write tests for implementation details, these change. Write tests only against the stable contract of the (public) API (which can be within a module). Ian Cooper referenced the first book on TDD, "Test-driven Development: By Example" by Kent Beck and pointed out that Kent has explicitly stated that we need to be testing behavior not the implementation. On page 4 of the book Kent writes "What behavior will we need to produce the revised report? Put another way, what set of tests, when passed, will demonstrate the presence of code we are confident will compute the report correctly ?", which clearly refers to test over behavior not implementation. Kent further states that "When we write a test, we imagine the perfect interface for our operation. We are telling ourselves a story about how the operation will look from the outside. Our story won't always come true, but its better to start from the best-possible application program interface (API) and work backward than to make things complicated, ugly, and 'realistic' from the get-go", which affirms testing API's not implementation methods. The tests should run in isolation from other tests, but not the system under test. The unit of isolation is not the class under test, but the tests themselves. Although tests can and should test several classes working together if that is what is needed to test the behavior. We avoid file system, database, simply because these shared fixture elements prevent us from running in isolation from other tests, or the tests become slow. But if there is no shared fixture problem (one test does not affect another) then its perfectly fine to talk to database (though in-memory) or file system in unit tests. Focusing on methods for testing creates tests which are hard to maintain and code which is difficult to refactor because implementation details are exposed to the tests. Such tests do not capture the behavior we want to preserve and becomes difficult to understand. Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. It is the step were we improve our design/implementation, produce clean code, remove duplication, sanitize code smells and apply design patterns. During refactoring to clean code we should not write new unit tests since we are not introducing new public APIs / classes. Dependency is the key problem in software development at all scales. Dependency between the tests and the code should be eliminated by avoiding mocking. Tests should not depend on implementation details by using Mocks because changing the implementation breaks such tests. Hence mocks should be avoided at all costs except to isolate the tests on the module boundaries (databases, external services, file systems).

Saturday, August 17, 2013

Test Driven Development

1 comment: