Believing in Numbers

What follows is, in my view, a typical scenario of one of the common curses of the performance tester. A new version of a test object finds its way into my hands. After some twiddling to get it running, I do a first quick test run. Somebody – development, architecture, project management, whoever – gets wind of it and is curious about the results. Defensively, I try a delaying tactic: “It’s the very first run. The test environment wasn’t completely built up. We have to do more tests. We must first confirm the numbers before it makes any sense to discuss them.” My counterpart becomes more inquisitive. He insists that he is aware of all my objections and just wants to satisfy his curiosity. I finally give in and tell him the numbers. Shortly thereafter, I regret it deeply. Like an avalanche, my little piece of information gathers a gigantic mass in no time, and like a boomerang it comes full circle, heading straight back at me.

What happens in cases such as this is often the result of an excessive belief in numbers. When something has been tested and quantified, there is a tendency to trust and rely on the results regardless of the circumstances. A first quick measurement point is frequently discussed and given weight as if a complete and elaborate test series had been conducted. If the numbers are bad, this sometimes leads to hectic and strange activities like escalation, crisis meetings and maybe even management action. If they are unexpectedly good, a premature all-clear may be given.

Even the performance tester himself may be prone to this type of mistake. I speak from my own experience here… Whether because he wishes for or expects a certain result, or simply out of routine, he might be tempted to accept his own numbers as valid too quickly. Application systems and their configurations are often very complex, so it is easy to overlook or forget small changes that can have large effects on the performance test’s results.

The unshakable belief in the validity of numeric data can lead to many other inappropriate follow-ups. There is often a tendency to overanalyze a single data point, e.g. by projecting possible future performance improvements or extrapolating the effects to a system n times the size of the test system. If later tests don’t confirm these projections, a frantic search for an assumed hidden cause of the “wrong” numbers is sometimes started, instead of realizing that incomparable things were compared in the first place. Sometimes earlier numbers, correctly reported and documented, are rediscovered after their context has been forgotten, which leads to similarly wrong interpretations, conclusions and activities.

Most of the time it is very hard to argue against this firm belief in numbers, to state that further tests are necessary to confirm them and that rather large variations are typical. Performance measurements as part of QA in software development are definitely not an exact science. As “getting things done” is normally the topmost priority, the preconditions for a controlled experiment as in the academic world are rarely fulfilled. Yet the results of such experiments are often handled as if those preconditions were fulfilled.
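To make the point about variation concrete, here is a minimal Python sketch that summarizes repeated runs of the same test setup instead of quoting a single value. The function name `summarize_runs` and the `min_runs=3` threshold are my own illustrative assumptions, not an established rule; pick a threshold that fits your project.

```python
import statistics


def summarize_runs(samples, min_runs=3):
    """Summarize repeated measurements taken on the same test setup.

    `min_runs` is an arbitrary, hypothetical threshold for calling a
    number "confirmed"; it is not a statistical guarantee.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    # The sample standard deviation needs at least two data points.
    stdev = statistics.stdev(samples) if n >= 2 else None
    # Coefficient of variation: run-to-run spread relative to the mean.
    cv = stdev / mean if stdev is not None and mean != 0 else None
    return {
        "runs": n,
        "mean": mean,
        "stdev": stdev,
        "cv": cv,
        "confirmed": n >= min_runs,
    }
```

Called with a single sample, the summary has no spread at all and `confirmed` is false, which is exactly the situation of the first quick test run described above.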

As a performance tester, one has to be aware of the possible ramifications of the reported test results. Testing – especially performance testing – is done to replace uncertainty with knowledge, but careless handling of test results can easily lead to the opposite outcome. In my opinion, the foremost rule should be to report only data that one understands, can stand by and can defend. One should always be aware of the reliability, scope, relevance and comparability of the numbers reported, document these characteristics and attach them to the report. It is often a good idea to state explicitly when a measurement is just a statement about one particular test setup and not suitable for further projections.
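One way to keep a number from being separated from its context is to make the context part of the reported value itself. The following Python sketch illustrates the idea; the class `ReportedMeasurement` and all of its field names are hypothetical and would need to be adapted to whatever reporting format a project actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class ReportedMeasurement:
    """A measured value that always travels with its context (illustrative)."""
    metric: str
    value: float
    unit: str
    setup: str                              # description of the test setup
    runs: int                               # number of runs behind the value
    suitable_for_projection: bool = False   # conservative default
    caveats: list = field(default_factory=list)

    def render(self):
        """Return the value together with its context as plain text."""
        lines = [
            f"{self.metric}: {self.value} {self.unit}",
            f"  setup: {self.setup}",
            f"  runs: {self.runs}",
        ]
        if not self.suitable_for_projection:
            lines.append(
                "  note: valid for this test setup only; "
                "not suitable for projections"
            )
        for caveat in self.caveats:
            lines.append(f"  caveat: {caveat}")
        return "\n".join(lines)
```

Because `suitable_for_projection` defaults to false, the warning appears in every rendered report unless the tester deliberately removes it.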

If a misinterpretation of test results occurs, one should act swiftly, clearly and forcefully to present the tester’s view of the numbers and their context in order to contain the possible damage. Numbers, once reported, tend to acquire a life of their own. Like a genie, they are practically impossible to get back into the bottle. Therefore, one should be very careful about what to report and when to report it.

One final piece of advice: As a performance tester, always remain skeptical about your own work and your own instincts. Other people should not blindly believe in test results and numbers, and neither should the tester.

Post by Dr. Raymond Georg Snatzke
