Software Evolution: An Introduction

16 Mar

I. Introduction

Software evolution has been studied since the 1970s. The fundamental characteristics of software design methodology, the nature of uncertainty in software applications, and software change were all studied as a consequence of that research. These efforts continued with software development process modeling for process control and process improvement, while laws of software evolution were contemplated [1]. In the paper “The Role and Impact of Assumptions in Software Development, Maintenance and Evolution” [2], M.M. Lehman summarized his work on software evolution over 35 years. As the title of the article states explicitly, “assumptions” are the key in the life of software:

“The presence and impact of assumptions on systems in general, computing systems in particular and software most especially has gone largely unrecognized over the past fifty or so years. It is now apparent, with hindsight self-evident, that assumptions, explicit or implicit, that have become invalid as a consequence of changes in an application or its operational domains, and the properties of those assumptions previously ignored that have now become relevant, are in large measure responsible for the need to continually upgrade and evolve software. [2]”

In Lehman’s view, software evolution implies “to widen the scope of the software, adapting or extending the domains over which it can be expected to execute satisfactorily [2].”

Recognizing that traditional engineering methodology alone was inadequate for software engineering, researchers in the field began to explore the study of complex systems and to extend software evolution studies by analogy with evolutionary theories in biology. Holding that software is diverse in nature, ranging from abstract machine code and assembly language to more formal programming languages, applications, user-created macros, and scripts, Kitchin and Dodge describe a set of hierarchically organized entities of increasing complexity that parallels that of organic entities, as shown in figure 1 [3]. This figure portrays a conceptualized view of the different scales of software.


Figure 1. A conceptualization of software compared with biological entities/populations at programming language level [3]

Another view, portrayed in figure 2 and courtesy of the blog host at [4], shows a software evolution cycle at the process level, in terms of software development life cycle management.

Figure 2. A conceptualization of software evolution in terms of software life cycle management [4]

Comparing software evolution with “generic theories of evolution, particularly with Dawkins’ concept of replicator” [5] has blazed a trail toward a richer study of software evolution. Similarly, Nehaniv et al. [6] have been “studying identified properties that give biological and artifact evolution the capacity to produce complex adaptive variation.” The study attempts to shed light on “how to enhance the evolvability of software systems in general and of evolutionary computation in particular. [6]”

Drawing on both Stephen Jay Gould’s The Structure of Evolutionary Theory [7] and Richard Dawkins’ The Selfish Gene [8], this blog intends to explore how Darwinian evolutionary theory and its newer, enhanced views could benefit the continued study of software evolution. Viewing software as complex adaptive systems may shed light on how software resembles biological systems, as well as on how we can take advantage of Darwinian evolutionary theory and make use of it in software engineering.

II. Evolution Approach to Software Development

The evolution approach described here is directly inspired by reading S.J. Gould’s The Structure of Evolutionary Theory [7]. As Gould emphasizes in this massive volume, Darwin’s theory of evolution rests on three principles of central logic that define “the themes of the deepest and most persistent debate – as, in a sense, they must because they constitute the most interesting intellectual questions that any theory for causes of descent with modification must address.”

A. Darwinian evolutionary theory according to S.J. Gould

Darwin’s three principles are best exemplified under three general categories: agency, efficacy, and scope.

Agency: Organisms act as the locus of selection, with all higher order emerging, by analogy with Adam Smith’s invisible hand, from the unconscious “struggles” of organisms for their own personal advantage, as expressed in differential reproductive success.

Efficacy: Natural selection, as a true cause, is admittedly a weak and negative force, yet under certain assumptions about the nature of variation it can act as the positive mechanism of evolutionary novelty; that is, it can “create the fit” as well as eliminate the unfit, by slowly accumulating the positive effects of favorable variations through innumerable generations.

Scope: Extrapolation is fully sufficient. Small changes built up by microevolutionary processes, extended through the immensity of geological time, are capable of generating the entire pageant of life’s history, in both anatomical complexity and taxonomic diversity, without further causal principles required.

In The Structure of Evolutionary Theory, Gould made his unique contribution to evolutionary theory “by advocating an independent set of macroevolutionary principles that expand, reformulate, operate in harmony with, or work orthogonally as additions to, the extrapolated, and persistently relevant (but not exclusive, or even dominant) forces of Darwinian microevolution.” It is Gould’s macroevolutionary thinking that has inspired this research to propose and apply the evolution approach to software development.

B. Evolution approach to software development

“Evolution” in the evolution approach to software development carries two kinds of meaning. Literally, as defined in the online resource [9], it has two slightly different connotations: one is any process of formation, growth, or development, as in the evolution of a language; the other is a process of gradual, peaceful, progressive change or development, as in social or economic structures or institutions. Both are closely related to the meaning of evolution in Lehman’s software evolution research [1,2].

On the other hand, biologically, “evolution” means change in the gene pool of a population from generation to generation by such processes as mutation, natural selection, and genetic drift. Clearly, this has been the popular interpretation of Darwinian evolutionary theory. Gould has keenly observed [7] that “no evolutionary assertion has been more commonly advanced in textbooks, or more superficially (and almost nonchalantly) proclaimed by fiat, than the claim that adaptation by natural selection must be fully sufficient to render life’s entire history.”

Questioning whether even a well-formulated theory has the power to explain the origin of multicellularity, the rise of mammals, and the eventual emergence of human intelligence, Gould suggested using a grand analogy as a speciational basis for macroevolution.

Gould then turns to punctuated equilibrium, which is essential to validating the macroevolutionary theory and for which the empirical pattern of stasis and abrupt geological appearance remains the standard testimony. By individuating species (and thereby establishing the basis for an independent theoretical domain of macroevolution), punctuated equilibrium ratifies an effective realm of macroevolutionary mechanics based on recognizing species as Darwinian individuals.

Following the speciational reformulation of macroevolution, Gould used “life’s little joke” as the counterexample, in which “man [all of us] is the measure of all things.” The cardinal example is human evolution, whose concepts long labored under the restrictive purview (now known to be empirically false) of the so-called “single species hypothesis.”

Gould summarized the speciational reformulation with the analogy of “the species as the macroevolutionary analog of the organism as the atom of microevolution” (p. 893 in [7]). Acknowledging Ernst Mayr (1992, p. 48) as his source of inspiration, Gould quoted: “speciational evolution is Darwinian evolution at a higher hierarchical level. The importance of this insight can hardly be exaggerated.” (p. 895 in [7])

Lastly, as in the title of chapter 11 of his book, The Integration of Constraint and Adaptation (Structure and Function) in Ontogeny and Phylogeny: Structural Constraints, Spandrels, and the Centrality of Exaptation in Macroevolution, Gould argued that species must exapt the rich potentials supplied by the structural and historical constraints of spandrels and other Miltonic “things” emplaced into the exaptive pool against (or orthogonally to) the restricting tendencies of natural selection. This is where terms such as “evolvability”, “higher Darwinism”, and “this view of life” come into being. It is Gould’s resolution of the macroevolutionary paradox, i.e., “constraint ensures flexibility whereas selection crafts restriction.”

Mapping Gould’s macroevolutionary theory onto software development, we get the following three correspondences:
1. Punctuated equilibrium describes generational changes in software architecture,
2. Speciational reformulation describes changes in a software product in response to changes in its environment, which may be considered generational changes of the software product,
3. Exaptation describes minor changes in architecture.

On the other hand, mapping Darwinian evolutionary theory at the microevolutionary scale, we also get three correspondences:
1. Design choices (including development tools) are considered natural selection,
2. Different team solutions (including development methodologies) are considered mutations/variations,
3. Problems in the same domain may share optimally designed algorithms, since these algorithms may be eternally useful and heritable as long as problems addressable by them exist.
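
The micro-level mapping above can be made concrete with a toy sketch. This is only an illustration of the analogy, not a method taken from Gould or Lehman: candidate designs are plain numbers, `mutate` stands in for a different team solution (variation), and keeping the best-scoring candidates stands in for a design choice (selection). All names here (`evolve`, `fitness`, `mutate`, the target value 42) are hypothetical and chosen just for this example.

```python
import random


def evolve(population, fitness, mutate, generations, keep, rng):
    """Toy variation-and-selection loop (illustrative assumption only).

    population:  list of candidate designs (here, plain integers)
    fitness:     scores a candidate, higher is better (a stand-in for design review)
    mutate:      returns a variant of a candidate (a stand-in for a team solution)
    keep:        number of candidates that survive each generation (the design choice)
    """
    for _ in range(generations):
        # Variation: every surviving design spawns one variant.
        variants = [mutate(candidate, rng) for candidate in population]
        # Selection: parents and variants compete; only the best `keep` survive.
        pool = population + variants
        pool.sort(key=fitness, reverse=True)
        population = pool[:keep]
    return population


def fitness(design):
    # Hypothetical requirement: the closer a design is to 42, the better.
    return -abs(design - 42)


def mutate(design, rng):
    # Small random change to an existing design.
    return design + rng.choice([-3, -1, 1, 3])


rng = random.Random(0)  # fixed seed so repeated runs behave the same
start = [0, 10, 20, 30]
final = evolve(start, fitness, mutate, generations=50, keep=4, rng=rng)

best_start = max(fitness(c) for c in start)
best_final = max(fitness(c) for c in final)
print("best fitness at start:", best_start)
print("best fitness at end:  ", best_final)
```

Because parents stay in the pool alongside their variants, the best design is never lost, so the best fitness can only stay the same or improve from one generation to the next. That is the sense in which accumulated small choices could, under the analogy, “create the fit” as well as eliminate the unfit.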

C. Discussion

A couple of points will be addressed here. First, the division in analytical thinking between macro-level and micro-level analyses has been ongoing in almost all scientific endeavors. An example can be found in my recent software architecture paper [10], which enhances the software evolution study presented here.

If we compare design choices in software development to natural selection in Darwinian evolutionary theory, the comparison must hold in the sense that only masterpiece designs will be remembered (appreciated, enjoyed, passed on) for generations to come. The view that software creation is an art more than a science [11] seems even truer now than when Donald Knuth expressed it in his Turing Award speech 37 years ago.

Obviously, it is much easier to invalidate a theory than to build one. One counterexample will do to expose the weakness of a theory. When one is trying to build a theory, many examples must be presented in its support, even though the process itself is evolutionary. Its plausibility rests with other software practitioners who share similar experience with the originator and who resonate with the essence of the theory through an understanding built up by persuasion. That is what I am trying to do in this writing: I am distilling a theory from my years of working as a software developer.

To end this writing, I quote Richard Dawkins’ words from The Selfish Gene:

“I am an enthusiastic Darwinian, but I think Darwinism is too big a theory to be confined to the narrow context of the gene. The gene will enter my thesis as an analogy, nothing more.” (P191 in [8])

My macro-level thinking, which applies Stephen Jay Gould’s macroevolutionary theory within Darwinian evolutionary theory to software evolution, does no more than use analogy to bring about a better understanding of software development, by understanding software as the inevitable artifact created by its originators and imposed upon the world we live in.


[1] M. M. Lehman, D. E. Perry, and J. F. Ramil, “Implications of evolution metrics on software maintenance,” in Proc. of the 1998 IEEE Intl. Conference on Software Maintenance, Bethesda, Maryland, November 1998.
[2] M. M. Lehman, “The Role and Impact of Assumptions in Software Development, Maintenance and Evolution,” in Proc. Intl. IEEE Workshop on Software Evolvability 2006, 3-14, IEEE Computer Society Press.
[3] M. Dodge and R. Kitchin. Code/space: software and everyday life. Cambridge, Mass.: MIT Press, c2011.

[5] S. Cook, R. Harrison, M.M. Lehman, and P. Wernick, “Evolution in Software Systems: Foundations of the SPE Classification Scheme,” J. Softw. Maint. Evol.: Res. Pract. 2006, Vol. 18, p1-35.
[6] C.L. Nehaniv, J. Hewitt, B. Christiansen, and P. Wernick, “What Software Evolution and Biological Evolution Don’t Have in Common,” Second International IEEE Workshop on Software Evolvability, 24-24 Sept. 2006, p58-65, doi: 10.1109/SOFTWARE-EVOLVABILITY.2006.18.
[7] Gould, Stephen Jay. The Structure of Evolutionary Theory. Belknap Press of Harvard University Press; 1st edition, March 21, 2002.
[8] Dawkins, Richard. The Selfish Gene: 30th Anniversary Edition. Oxford University Press, USA; 30th Anniversary edition, May 25, 2006.

[10] B.H. Wu, “Let’s Enforce a Simple Visualization Rule in Software Architecture,” Nanjing, China, ICIST 2011, an IEEE conference, March 26-27, 2011, p427-433, doi: 10.1109/ICIST.2011.5765283.
[11] Knuth, Donald E. Literate Programming (Center for the Study of Language and Information, Lecture Notes). Center for the Study of Language and Information; 1st edition, June 1, 1992.

