by Michel Machado - michel at digirati dot com dot br
Allston, October 13th, 2007.
Even though testing is essential to developing any product, and especially software, which is sold “as is,” testing is too expensive. Hand waving a bit, I dare say that testing is analogous to spending money on machines to run an exponential-time algorithm. In other words, we should test software, but we have to be selective about what gets tested; otherwise all the resources allotted to a project may be spent on testing alone.
On the one hand, testing can only be done once a piece of software somehow runs (e.g., a unit test); that is, it happens closer to the end of the process of creating software. On the other hand, bugs are born in every phase of software development; they come along the moment a project starts. A fairly natural idea is to improve the whole development process to avoid mistakes and catch them sooner; a notable instance of this is CMMI (Capability Maturity Model Integration). Applying CMMI may be less expensive than overloading testing, but it is still costly. There is an important lesson here: testing consumes money and time, so one should reduce it by doing something else.
Programmers are famous for not liking to document and test code. However, we love coding. So why not make testing part of the coding process? This is impossible in many current programming languages, but it is not out of reach. The missing piece is a bridge between two worlds that do not often meet: software engineering and programming languages.
The software-engineering community has spent most of its effort on processes and better documentation, which does help to some degree, but these efforts have very limited results at the code level. The programming-language community has made significant progress on static analysis of code, but these results are not well known outside the community, nor does the community sell its work well.
I am not claiming that all problems are already solved, but that far more is already solved than the uninitiated usually assume. To be concrete, I will mention two languages that are ready to be used and one that is not yet mature but takes static analysis to a level never seen in practice.
OCaml and Haskell are two languages regarded as “functional languages,” and they do not receive much attention from industry in general because they are taken as “academic work.” I am not going to spend words trying to change that view here; all I want to point out is that these languages have sound type systems that allow one to find many more bugs before the code runs for the first time than languages like Java, Perl, or C do. There is no need for downcasts and no place for null pointers in OCaml or Haskell.
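To make the point about downcasts and null pointers concrete, here is a minimal OCaml sketch. The names (shape, area, find_age, describe_age) and the data are hypothetical, made up only for illustration. A variant type stands in for the class hierarchy one would downcast through in Java, and an option type stands in for a possibly null result.

```ocaml
(* Hypothetical example: all names and values are invented for illustration. *)

(* A closed set of cases replaces a class hierarchy plus downcasts. *)
type shape =
  | Square of float              (* side *)
  | Rectangle of float * float   (* width, height *)

(* The compiler warns if a constructor is left unhandled, so a new
   case cannot be silently forgotten here. *)
let area (s : shape) : float =
  match s with
  | Square side -> side *. side
  | Rectangle (w, h) -> w *. h

(* Absence is part of the type: the result is an int option, not an int
   that might secretly be null. *)
let find_age (name : string) (table : (string * int) list) : int option =
  List.assoc_opt name table

(* The caller has to match on the option before using the value. *)
let describe_age (name : string) (table : (string * int) list) : string =
  match find_age name table with
  | Some age -> Printf.sprintf "%s is %d" name age
  | None -> Printf.sprintf "%s is unknown" name
```

Using the result of find_age directly as an int is rejected at compile time, which is exactly the kind of mistake that otherwise has to be caught by a test.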
Although the type systems of OCaml and Haskell already impress new users, they are not all one can get from type systems; for example, they cannot statically check array bounds, track locks, or ensure that a given condition never happens. A language that is trying to bring all the known results into reality is ATS. One should not think that these languages are slow just because they take high-level abstractions into account; some experiments coded in ATS have beaten C code's performance!
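The array-bounds limitation is easy to see in OCaml itself. In the sketch below, the helper name safe_get is hypothetical. An out-of-range index type-checks without complaint and, assuming the program is compiled with the default bounds checking on, only fails at run time with an Invalid_argument exception.

```ocaml
(* Hypothetical helper, for illustration only. The type of an array does
   not mention its length, so the type checker cannot reject a bad index. *)
let safe_get (a : 'a array) (i : int) : 'a option =
  (* With default bounds checking, a.(i) raises Invalid_argument when i is
     out of range; the failure is caught here and turned into None. *)
  try Some a.(i) with Invalid_argument _ -> None

let numbers = [| 10; 20; 30 |]

(* Both calls compile; only the first index is actually valid. *)
let second = safe_get numbers 1
let missing = safe_get numbers 5
```

The check still happens while the program runs, so a test is what finds the bad index. A type system in the style of ATS aims to move that check to compile time.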
If all I am stating is true, why is it still a “secret”? Education. To cut a long list of arguments short, I will describe a real experience I had last year. I was looking for a book to learn the more advanced features of OCaml when I found a recently released book with an inspiring name: “Practical OCaml” (it was inspiring because I was tired of reading “theoretical OCaml”). One would be better off without that book; the author demonstrates over and over again that he does not understand the language at all. If there are no good books and courses about these languages, how can someone outside the programming-language community code effectively in them?
To conclude, we should test less because testing is expensive, and we should embrace a language that offers a better type system than the languages that are currently popular. To get there, we need to build a bridge between software engineering and programming languages, translating the heavy wording around these new languages into something that most programmers can read and understand correctly.