Next: Experiments cost too much Up: Why should we experiment? Previous: Traditional scientific method isn't

The current level of experimentation is good enough

The suggestion that the current level of experimentation doesn't need to change rests on the assumption that computer scientists, as a group, know what they are doing. This argument maintains that if we need more experiments, we'll simply do them.

But this argument is tenuous; let's look at the data. In [15], 400 papers were classified. Only papers whose claims required empirical evaluation were considered further. For example, papers that proved theorems were excluded, because mathematical theory needs no experiment. In a random sample of all papers ACM published in 1993, the study found that of the papers with claims that would need empirical backup, 40% had none at all. In journals related to software, this fraction was 50%. The same study also analyzed a non-CS journal, Optical Engineering, and found that the fraction of its papers lacking quantitative evaluation was merely 15%.

The study by Zelkowitz and Wallace [17] found similar results. When consistent classification schemes are applied, both studies report between 40% and 50% unvalidated papers in software engineering. Zelkowitz and Wallace also surveyed journals in physics, psychology, and anthropology and again found much smaller percentages of unvalidated papers there than in computer science.

Relative to other sciences, the data shows that computer scientists validate a smaller percentage of their claims. One could argue that computer science at age 50 is still young and hence a comparison with other sciences is of limited value. I disagree, because 50 years seems plenty of time for two to three generations of scientists to establish solid principles. But even on an absolute scale, I think that it is scary when half of the non-mathematical papers make unvalidated claims. Assume that each idea published without validation would have to be followed up by at least two validation studies (that's a very mild requirement). Then every unvalidated paper accounts for at least three papers in total, so it follows trivially that no more than one third of the papers published could contain unvalidated claims. The data suggests that computer scientists either publish a lot of untested ideas or publish ideas that are not worth testing.
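
The one-third bound can be checked with a few lines of arithmetic. The sketch below is only an illustration of that counting argument; the function name and its parameter are hypothetical and do not come from the cited studies. It assumes that every paper with unvalidated claims must be matched by a fixed number of separate validation papers.

```python
from fractions import Fraction


def max_unvalidated_fraction(followups_per_idea: int) -> Fraction:
    """Upper bound on the share of papers that may carry unvalidated claims.

    Hypothetical helper: if each unvalidated paper needs k validation
    papers, then u + k*u <= total, hence u/total <= 1/(1 + k).
    """
    if followups_per_idea < 0:
        raise ValueError("followups_per_idea must be non-negative")
    return Fraction(1, 1 + followups_per_idea)


# Two follow-up validation studies per unvalidated idea, as in the text.
bound = max_unvalidated_fraction(2)
print(bound)

# The reported shares of unvalidated papers (40% and 50%) versus the bound.
for reported in (Fraction(40, 100), Fraction(50, 100)):
    print(reported, "exceeds bound:", reported > bound)
```

Under this assumption, both reported shares of unvalidated papers, 40% and 50%, lie above the one-third bound.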

I'm not advocating replacing theory and engineering by experiment, but I am advocating a better balance. I advocate balance not because it would be desirable for computer science to appear more scientific, but because of the following principal benefits:

Conversely, when we ignore experimentation and avoid contact with reality, we hamper progress.



Walter Tichy
Mon May 4 16:58:54 MET DST 1998