Recent Data For Planning Mathematical Software Projects

This article is a follow-up to the author’s previous article, Estimating the Cost and Schedule of Mathematical Software. That article advocated using software engineering expert Barry Boehm’s Basic COCOMO Embedded Mode cost model to estimate the cost and schedule of mathematical software projects, with the important qualification that there are substantial variations between actual effort and the effort estimated by this model. By Boehm’s own account, Basic COCOMO estimates are within a factor of two of actual effort only 60 percent of the time.

The formula for Basic COCOMO Embedded is:

[tex]SM = 3.6(KSLOC)^{1.2}[/tex]

where SM is Staff Months, the politically correct term for what was formerly known as the Mythical Man-Month, and KSLOC is thousands (kilo) of source lines of code.
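
As a quick illustration, the following is a minimal sketch in Python of this formula; the 28 KSLOC project size is a made-up example value:

    # Basic COCOMO Embedded Mode: Staff Months = 3.6 * KSLOC ** 1.2
    def embedded_effort_sm(ksloc):
        """Estimated effort in Staff Months for ksloc thousand source lines."""
        return 3.6 * ksloc ** 1.2

    # Hypothetical example: a 28 KSLOC project
    print(round(embedded_effort_sm(28.0), 1))  # about 196.3 Staff Months

Dividing 28,000 lines by roughly 196 Staff Months implies about 142 lines of code per Staff Month, a figure that reappears in the productivity data discussed below.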

Basic COCOMO is based on a database of sixty-three software projects at TRW, Boehm’s then employer, during the 1970s. The Embedded Mode model is based on the twenty-eight (28) of these projects that Boehm classified as Embedded projects. The sixty-three projects were written in FORTRAN (24), COBOL (5), Jovial (5), PL/I (4), Pascal (2), Assembly (20), and miscellaneous other languages (3). None of these languages is commonly used today. Nonetheless, in the author’s experience, Basic COCOMO Embedded gives a rough order of magnitude (ROM) estimate of the effort for mathematical software projects such as implementing a video codec in C/C++ today (2012).

The Measurement-Free Zone

Remarkably, despite the growing cost and importance of software, it is difficult to find publicly available information on the cost, schedule, and effort of software projects. There are a number of consulting firms and proprietary cost and schedule estimation tools, but these do not disclose their databases of historical data. Indeed, many organizations, including many commercial businesses, do not seem to use historical data on the cost and schedule of software development to plan projects!

Donald Reifer’s 2004 Software Productivity Data

In 2004, software engineering expert Donald J. Reifer of Reifer Consultants, a colleague of Barry Boehm, published an article in The DoD SoftwareTech News (now The Journal of Software Technology), “Industry Software Cost, Quality and Productivity Benchmarks,” giving software productivity numbers, broken down by categories such as “Scientific” and “Web Business,” for the most recent 600 of the 1,800 projects in his database. These were projects from the seven years prior to publication (roughly 1997 to 2004).

Table 1 below is a subset of Reifer’s data from Table 1 in his paper. It includes the categories that are similar to mathematical software (Command and Control; Military – Airborne, Ground, Missile, and Space) or the same (Scientific). The category “Web Business” is included as a point of reference.

Reifer measures size in equivalent source lines of code (ESLOC), as defined by the Software Engineering Institute. For new code, one ESLOC is simply one line of code. For “legacy” code that is modified or reused, a weighting factor such as 0.4 is applied to each line. In this way, data on the maintenance or modification of existing software can be combined with data on writing new software.
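
A rough sketch of the idea in Python; the 0.4 default below is only the example weighting factor mentioned above, not a standard constant (the SEI definition involves more detailed adjustments), and the line counts are made up for illustration:

    # Equivalent source lines of code (ESLOC): new lines count fully,
    # modified or reused legacy lines are down-weighted.
    def esloc(new_lines, reused_lines, reuse_weight=0.4):
        return new_lines + reuse_weight * reused_lines

    # Hypothetical example: 10,000 new lines plus 50,000 reused lines
    print(esloc(10000, 50000))  # 30000.0 ESLOC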

Application Domain     Projects   Size Range (KESLOC)   Avg. Productivity (ESLOC/SM)   Productivity Range (ESLOC/SM)   Example Application
Command & Control      45         35 to 4,500           225                            95 to 350                       Command centers
Military – All         125        15 to 2,125           145                            45 to 300                       See subcategories
  – Airborne           40         20 to 1,350           105                            65 to 250                       Embedded sensors
  – Ground             52         25 to 2,125           195                            80 to 300                       Combat center
  – Missile            15         22 to 125             85                             52 to 175                       GNC system
  – Space              18         15 to 465             90                             45 to 175                       Attitude control system
Scientific             35         28 to 790             195                            130 to 360                      Seismic processing
Web Business           65         10 to 270             275                            190 to 985                      Client/server sites
Totals (all domains)   600        10 to 4,500           –                              45 to 985                       –

Table 1 (Abridged): Software Productivity (ESLOC/SM) by Application Domain

Note that productivity (ESLOC per Staff Month) is significantly higher for the Web Business category. This actually understates the difference because the Web Business projects, as indicated elsewhere in Reifer’s article, are usually written in so-called Fourth Generation Languages (4GLs) and scripting languages such as Python, Perl, and PHP, whereas the other categories are typically written in lower-level languages such as C/C++. A single line of a language such as Python often corresponds to several lines of a language such as C/C++.

Scientific software has an average productivity of 195 ESLOC per Staff Month (SM), with a wide range of variation: 130 to 360 ESLOC per SM. This is for fairly large projects, ranging from 28,000 to 790,000 equivalent lines of code.

Basic COCOMO Embedded predicts a productivity of 142 lines of code per Staff Month for a project with 28,000 lines of code. It predicts a productivity of 73 lines of code per Staff Month for a project with 790,000 lines of code. It predicts a productivity of about 280 lines of code per Staff Month for a project with 1,000 lines of code.
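
These numbers follow directly from the formula above; a minimal sketch of the implied productivity as a function of project size:

    # Productivity implied by Basic COCOMO Embedded:
    # SLOC/SM = 1000 * KSLOC / (3.6 * KSLOC ** 1.2) = (1000 / 3.6) * KSLOC ** -0.2
    def embedded_productivity(ksloc):
        return 1000.0 * ksloc / (3.6 * ksloc ** 1.2)

    for size in (1.0, 28.0, 790.0):
        print(size, "KSLOC:", round(embedded_productivity(size)), "SLOC/SM")
    # Prints roughly 278, 143, and 73 SLOC per Staff Month

Because the exponent 1.2 is greater than one, the model builds in a diseconomy of scale: implied productivity falls as project size grows.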

The productivity implied by Basic COCOMO Embedded is quite similar to the numbers for the Military Airborne, Missile, and Space categories.

Software productivity numbers are close to meaningless without an associated measure of the quality of the software. Reifer uses the number of errors (bugs, defects) per thousand equivalent source lines of code (KESLOC). The error rates upon delivery to the customer show the difference between Web Business and the other categories. When the quality must be high, ideally no errors for mission-critical, life-and-death software such as airplane control software (avionics), the number of lines of code per Staff Month is correspondingly lower. For example, at the Scientific norm of 2 errors per KESLOC, a 100 KESLOC project would still contain roughly 200 defects at delivery.

Application Domain     Projects   Error Range (Errors/KESLOC)   Normative Error Rate (Errors/KESLOC)   Notes
Command & Control      45         0.5 to 5                      1                                      Command centers
Military – All         125        0.2 to 3                      < 1.0                                  See subcategories
  – Airborne           40         0.2 to 1.3                    0.5                                    Embedded sensors
  – Ground             52         0.5 to 4                      0.8                                    Combat center
  – Missile            15         0.3 to 1.5                    0.5                                    GNC system
  – Space              18         0.2 to 0.8                    0.4                                    Attitude control system
Scientific             35         0.9 to 5                      2                                      Seismic processing
Web Business           65         4 to 18                       11                                     Client/server sites

Table 8 (Abridged): Error Rates upon Delivery by Application Domain

Quality Requirements for Mathematical Software

The required quality for many types of mathematical software is often very high, meaning less than one error per thousand lines of code. For example, a video codec, such as those used by YouTube or Skype, generates the output, the video, seen and used by customers. Almost any bug in a video codec will produce visible artifacts at best and often destroy the video completely. Many readers have probably noticed occasional blurriness or other anomalies in YouTube or other Web video; these are the problems that remain after extensive debugging of the video software.

Many video, image, and audio processing applications have quality requirements similar to those of video codecs. Similarly, encryption and decryption, such as the Advanced Encryption Standard (AES), usually requires extremely high quality, since even a single bit error will turn the output into gibberish. Many other types of mathematical software require similarly high levels of quality, in practice comparable to avionics and the other demanding applications modeled by Basic COCOMO Embedded.
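
To illustrate the single-bit-error point, here is a minimal sketch assuming the third-party pycryptodome package (pip install pycryptodome); the all-zero key and sample message are placeholders:

    from Crypto.Cipher import AES  # provided by pycryptodome

    key = bytes(16)                  # placeholder 128-bit key (all zeros)
    plaintext = b"Sixteen byte msg"  # exactly one 16-byte AES block
    ciphertext = AES.new(key, AES.MODE_ECB).encrypt(plaintext)

    corrupted = bytearray(ciphertext)
    corrupted[0] ^= 0x01             # flip a single bit

    # The corrupted block decrypts to 16 essentially random bytes,
    # not to the original message with a one-bit change.
    print(AES.new(key, AES.MODE_ECB).decrypt(bytes(corrupted)))

The same sensitivity applies to bugs in the implementation itself: an error that flips or drops even one bit anywhere in the pipeline destroys the output rather than subtly degrading it.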

Where Are All The Super Programmers?

It is not uncommon, in conversation or in comments on Web blogs, to encounter programmers who claim to routinely write five to ten thousand lines of code per month. Nonetheless, Reifer’s data shows little evidence of this performance level. With some exceptions, studies of software productivity usually report much smaller numbers.

There is tremendous variation in software projects. The author once implemented the Advanced Encryption Standard (AES), about 1,500 lines of code, in about one week. Naively extrapolated, this would translate to 6,000 lines of code per month. However, this was clearly unusual and stands out in the author’s memory precisely because the project went so quickly and smoothly.

It is probably possible to write many lines of working, usable code per month for certain kinds of simple, straightforward business and user interface software. For example, the top productivity for the Web Business category in Reifer’s published data is 985 lines of code per Staff Month.

It is clear, though, that if one requires reasonable quality, the average performance of the vast majority of software engineers, including most exceptional software engineers, is much less than 5,000 lines of code per month for most categories of software projects, with the possible exception of some types of business and user interface software.

Conclusion

In the author’s experience, it is common to encounter extremely optimistic ideas about the size, scope, and difficulty of mathematical software projects. Many people appear to be genuinely unaware of how complex, and how large in lines of code, commonly used examples of mathematical software such as video codecs actually are. Similarly, many people seem to be unaware of the quality level needed to produce an acceptable end-user/customer experience such as enjoyable streaming video. Many people, even technical people who should know better, consciously or unconsciously use software productivity numbers like 5,000 to 10,000 lines of code per Staff Month, even though these are not supported by most historical experience.

How should one use models like Basic COCOMO Embedded that are based on historical data, or historical software productivity numbers like Donald Reifer’s data? They are good for rough order of magnitude (ROM) estimates, including basic sanity checks. If one only has resources for a two-week project and Basic COCOMO says the project will take six months, one should probably reevaluate one’s plans. On the other hand, if one has the resources for a six-month project and Basic COCOMO says seven months, the difference is probably not meaningful given the large variation between actual and estimated effort. The same applies to blindly plugging in numbers like Reifer’s average of 195 lines of code per Staff Month for Scientific software.
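
A minimal sketch of this kind of sanity check, combining the COCOMO formula above with Reifer’s Scientific average; the 50 KSLOC size and the six available Staff Months are made-up example inputs:

    def cocomo_embedded_sm(ksloc):
        """Basic COCOMO Embedded effort estimate in Staff Months."""
        return 3.6 * ksloc ** 1.2

    def reifer_scientific_sm(ksloc, esloc_per_sm=195):
        """Effort implied by Reifer's average Scientific productivity."""
        return 1000.0 * ksloc / esloc_per_sm

    # Hypothetical example: a 50 KSLOC project with 6 Staff Months available
    ksloc, available_sm = 50.0, 6.0
    print("COCOMO Embedded:", round(cocomo_embedded_sm(ksloc)), "SM")      # ~394 SM
    print("Reifer Scientific:", round(reifer_scientific_sm(ksloc)), "SM")  # ~256 SM
    # Both estimates dwarf the 6 available Staff Months: a clear signal
    # to reevaluate the plan, not a precise schedule.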

These models and data are not good for precise scheduling. There is substantial variation between actual and estimated effort. Software seems to inherently involve large variations in effort that are difficult or impossible to predict in advance.

Suggested Reading/References

Barry Boehm, Software Engineering Economics, Prentice-Hall, Englewood Cliffs, NJ, 1981

Donald J. Reifer, “Industry Software Cost, Quality and Productivity Benchmarks,” The DoD SoftwareTech News, 2004

© 2012 John F. McGowan

About the Author

John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing video compression and speech recognition technologies. He has extensive experience developing software in C, C++, Visual Basic, Mathematica, MATLAB, and many other programming languages. He is probably best known for his AVI Overview, an Internet FAQ (Frequently Asked Questions) on the Microsoft AVI (Audio Video Interleave) file format. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech). He can be reached at [email protected].
