At the functional test level, our FitNesse test fixtures build data in-memory wherever possible. These tests run quickly, and the results appear almost instantaneously. When we need to test the database layer, or if we need to test legacy code that’s not accessible independently of the database layer, we usually write FitNesse tests that set up and tear down their own data using a home-grown data fixture. These tests are necessary, but they run slowly and are expensive to maintain, so we keep them to the absolute minimum needed to give us confidence. We want our build that runs business-facing tests to provide feedback within a couple of hours in order to keep us productive.
—Lisa
Comprehensive explanations and examples of the various types of test doubles can be found in the bibliography.
Because it’s so difficult to get traction on test automation, it would be easy to say “OK, we’ve got some tests, and they do take hours to run, but it’s better than no tests.” Database access is a major contributor to slow tests. Keep taking small steps to fake the database where you can, and test as much logic as possible without involving the database. If this is difficult, reevaluate your system architecture and see if it can be organized better for testing.
Tools such as DbFit and NdbUnit can simplify database testing and enable test-driven database development; see the bibliography for more resources.
If you’re testing business logic, algorithms, or calculations in code, you’re interested in the behavior of the code itself given certain inputs; you don’t care where the data comes from as long as it accurately represents real data. If this is the case, build test data that is part of the test and can be accessed in memory, and let the production code operate from that. Simulate database access and objects, and focus on the purpose of the test. Not only will the tests run faster, but they’ll be easier to write and maintain.
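One way to do this is to replace the database-backed data access object with a hand-rolled in-memory fake, so the test exercises only the logic. A minimal sketch, assuming a hypothetical fund repository (the class, method, and fund names here are invented for illustration, not from any particular framework):

```python
# Sketch: testing business logic against an in-memory fake instead of a
# real database. InMemoryFundRepository stands in for the persistence layer.

class InMemoryFundRepository:
    """Test double holding fund balances in a plain dictionary."""
    def __init__(self, funds):
        self._funds = dict(funds)  # fund code -> balance

    def balance_for(self, code):
        return self._funds[code]

def total_balance(repo, codes):
    """Business logic under test: sum the balances for the given fund codes."""
    return sum(repo.balance_for(c) for c in codes)

# The test builds its data in memory; no database connection is involved.
repo = InMemoryFundRepository({"BONDX": 1000.0, "EQTY1": 2500.0})
assert total_balance(repo, ["BONDX", "EQTY1"]) == 3500.0
```

Because the fake is just an ordinary object, the test states its inputs right next to its assertions, which is what makes it fast to run and cheap to maintain.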
When generating data for a test, use values that reflect the intent of the test, where possible. Unless you’re completely confident that each test is independent, generate unique test values for each test. For example, use timestamps as part of the field values. Unique data is another safety net to keep tests from infecting each other with stray data. When you need large amounts of data, try generating the data randomly, but always clean it up at the end of the test so that it doesn’t bleed into the next test. We recognize that sometimes you need to test very specific types of data. In these cases, randomly generated data would defeat the purpose of the test. But you may be able to use enough randomization to ensure that each test has unique inputs.
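A small helper can make uniqueness the default rather than something each test remembers to do. This is a sketch under assumed naming conventions; the `unique_value` helper and its prefix format are illustrative, not a standard API:

```python
# Sketch: generating unique values per test so stray rows left by one test
# cannot accidentally satisfy assertions in another.
import time
import uuid

def unique_value(prefix):
    """Embed a timestamp and a random suffix in a test data value."""
    return f"{prefix}-{int(time.time())}-{uuid.uuid4().hex[:8]}"

# Two tests asking for "the same" value still get distinct data.
name_a = unique_value("fund")
name_b = unique_value("fund")
assert name_a != name_b
```

When a test needs a specific, meaningful value rather than a random one, the randomization can be confined to the parts of the record the test doesn't assert on.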
When Database Access Is Unavoidable or Even Desirable
If the system under test relies heavily on the database, that reliance itself has to be tested. If the code you’re testing reads from or writes to the database, at some point you need to verify that behavior, and you’ll probably want at least some regression tests for the database access layer.
Our preferred approach is to have every test add the data it needs to a test schema, operate on the data, verify the results in the database, and then delete all of that test data so the test can be rerun without affecting subsequent tests. This keeps the tests independent of one another.
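The add/operate/verify/delete cycle can be sketched as follows, using an in-memory SQLite database as a stand-in test schema; the table, column, and helper names are invented for illustration:

```python
# Sketch: each test adds its own data, operates on it, verifies the result
# in the database, then deletes the data so the test can be rerun.
import sqlite3

def set_up(db):
    # Add exactly the data this test needs.
    db.execute("CREATE TABLE all_fund (code TEXT PRIMARY KEY, balance REAL)")
    db.execute("INSERT INTO all_fund VALUES ('BONDX', 1000.0)")

def tear_down(db):
    # Delete the test data so nothing bleeds into the next test.
    db.execute("DELETE FROM all_fund")

db = sqlite3.connect(":memory:")
set_up(db)
# Operate on the data...
db.execute("UPDATE all_fund SET balance = balance + 500 WHERE code = 'BONDX'")
# ...and verify the result in the database itself.
(balance,) = db.execute(
    "SELECT balance FROM all_fund WHERE code = 'BONDX'").fetchone()
assert balance == 1500.0
tear_down(db)
```

The same shape works with a real test schema; the teardown step is what makes the test rerunnable and independent.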
Lisa’s Story
We use a generic data fixture that lets the person writing the test specify the database table, its columns, and the values for those columns in order to add data. Another generic lookup fixture lets us enter a table name and a SQL WHERE clause to verify the actual persisted data, and we can use the generic data fixture to delete data given the table name and a key value. Figure 14-3 shows an example of a table that uses a data fixture to build test data in the database. It populates the table “all fund” with the specified columns and values. It’s easy for those of us writing test cases to populate the tables with all of the data we need.
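The core of such a generic fixture is small: it turns a table name, column list, and value list into an INSERT, with companion helpers for lookup and keyed deletion. This is a rough Python sketch of the idea, not the actual FitNesse fixture; the function names and the SQLite backing are assumptions, and identifiers are supplied by the test author, so no quoting of table or column names is attempted here:

```python
# Sketch of a generic data fixture: the test author supplies the table,
# columns, and values; the fixture builds the SQL.
import sqlite3

def insert_row(db, table, columns, values):
    placeholders = ", ".join("?" for _ in values)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    db.execute(sql, values)

def lookup(db, table, where_clause, params=()):
    """Verify persisted data via a table name and a WHERE clause."""
    return db.execute(
        f"SELECT * FROM {table} WHERE {where_clause}", params).fetchall()

def delete_by_key(db, table, key_column, key_value):
    db.execute(f"DELETE FROM {table} WHERE {key_column} = ?", (key_value,))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE all_fund (code TEXT, name TEXT)")
insert_row(db, "all_fund", ["code", "name"], ["BONDX", "Bond Fund"])
assert lookup(db, "all_fund", "code = ?", ("BONDX",)) == [("BONDX", "Bond Fund")]
delete_by_key(db, "all_fund", "code", "BONDX")
assert lookup(db, "all_fund", "code = ?", ("BONDX",)) == []
```

Because one fixture serves every table, the test tables stay declarative: they list only data, not SQL.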
Figure 14-3 Example of a table using a data fixture to build test data in the database
Note that the schemas we use for these tests have most of their constraints removed, so we only have to populate the tables and columns pertinent to the functionality being tested. This makes maintenance a little easier, too. The downside is that the test is a bit less realistic, but tests using other tools verify the functionality with a realistic environment.