Note: this content is reposted from my old Google Plus blog, which disappeared when Google took Plus down. It was originally published on 2016-05-18. My views and the way I express them may have evolved in the meantime. If you like this gist, though, take a look at Leprechauns of Software Engineering. (I have edited minor parts of this post for accuracy after having a few mistakes pointed out in the comments.)
Degrees of intellectual dishonesty
In the previous post, I said something along the lines of wanting to crawl into a hole when I encounter bullshit masquerading as empirical support for a claim, such as "defects cost more to fix the later you fix them".
It's fair to wonder why I should feel shame for my profession, and fair to ask whom exactly I feel ashamed for. So let's drill a little deeper, and dig into cases.
Before we do that, a disclaimer: I am not in the habit of judging people. In what follows, I only mean to condemn behaviours. Also, I gathered most of the examples by random selection from the larger results of a Google search. I'm not picking on anyone in particular.
The originator of this most recent Leprechaun is Roger S Pressman, author of the 1982 book "Software Engineering: a Practitioner's Approach", now in its 8th edition and being sold as "the world's leading textbook in software engineering".
Here is, in extenso, the relevant passage. (I quote from the 5th edition; the first edition, to which I do not have access, reportedly stated "67 units", a figure that later became "between 60 and 100 units". The rationale for this change is unclear.)
To illustrate the cost impact of early error detection, we consider a series of relative costs that are based on actual cost data collected for large software projects [IBM81]. Assume that an error uncovered during design will cost 1.0 monetary unit to correct. Relative to this cost, the same error uncovered just before testing commences will cost 6.5 units; during testing, 15 units; and after release, between 60 and 100 units.
This [IBM81] is expanded, in the References section of the book, into a citation: "Implementing Software Inspections", course notes, IBM Systems Sciences Institute, IBM Corporation, 1981.
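For concreteness, here is the arithmetic the passage implies, as a minimal sketch: the phase multipliers are the ones quoted above, while the defect counts are invented purely for illustration.

```python
# Relative cost-to-fix multipliers as quoted above from Pressman (5th ed.).
# The defect counts are hypothetical, purely for illustration.
multipliers = {
    "design": 1.0,
    "just before testing": 6.5,
    "during testing": 15.0,
    "after release": 60.0,  # quoted as "between 60 and 100"
}
defects_found = {  # invented numbers
    "design": 40,
    "just before testing": 30,
    "during testing": 20,
    "after release": 10,
}
total = sum(n * multipliers[phase] for phase, n in defects_found.items())
print(total)  # 40*1.0 + 30*6.5 + 20*15 + 10*60 = 1135.0
```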
Am I embarrassed for Pressman, that is, do I think he's being intellectually dishonest? Yes, but at worst mildly so.
It's bothersome that for the first edition Pressman had no better source to point to than "course notes" - that is, material presented in a commercial training course, and as such not part of the "constitutive forum" of the software engineering discipline.
We can't be very harsh on 1982-Pressman, as software engineering was back then a discipline in its infancy; but it becomes increasingly problematic as edition after edition of this "bible" lets the claim stand without increasing the quality of the backing.
Moving on, consider this 1995 article:
"Costs and benefits of early defect detection: experiences from developing client server and host applications", Van Megen et al.
This article doesn't refer to the cost increase factors. It says only this:
"To analyse the costs of early and late defect removal one has to consider the meaning and effect of late detection. IBM developed a defect amplification model (IBM, 1981)."
The citation is as follows:
"IBM (1981) Implementing Software Inspections, course notes (IBM Systems Sciences Institute, IBM Corporation) (summarised in Pressman 1992.)"
This is the exact same citation as Pressman's, with the added "back link" to the intermediate source. The "chain of data custody" is intact. I give Van Megen et al. a complete pass as far as their use of Pressman is concerned.
Let's look at a blog post by my colleague Johanna Rothman: http://www.jrothman.com/articles/2000/10/what-does-it-cost-you-to-fix-a-defect-and-why-should-you-care/
Johanna refers, quite honestly, to "hypothetical examples". This means "I made up this data", and she's being up front about it. She says:
"According to Pressman, the expected cost to fix defects increases during the product's lifecycle. [...] even though the cost ratios don't match the generally accepted ratios according to Pressman, one trend is clear: The later in the project you fix the defects, the more it costs to fix the defects."
I'm almost totally OK with that. It bothers me a bit that one would say "one trend is clear" about data that was just made up; we could have made the trend go the other way, too. But the article is fairly clear that we are looking at a hypothetical example based on data that only has a "theoretical" basis.
The citation:
Pressman, Roger S., Software Engineering, A Practitioner's Approach, 3rd Edition, McGraw Hill, New York, 1992. p.559.
This is fine. It's a complete citation with page number, still rather easy to check.
I am starting to feel queasy with this 2007 StickyMinds article by Joe Marasco:
https://www.stickyminds.com/article/what-cost-requirement-error
"The cost to fix a software defect varies according to how far along you are in the cycle, according to authors Roger S. Pressman and Robert B. Grady. These costs are presented in a relative manner, as shown in figure 1."
What Grady? Who's that? Exactly what work is being cited here? There's no way to tell, because no citation is given. Also, the data is presented as fact, and a chart, "Figure 1", is provided that was not present in the original.
This is shady. Not quite outright dishonest, but I'd be hard pressed to describe it more generously than as "inaccurate and misleading".
A different kind of shady is this paper by April Ritscher at Microsoft.
http://www.uploads.pnsqc.org/2010/papers/Ritscher_Incorporating_User_Scenarios_in_Test_Design.pdf
The problem here is a (relatively mild) case of plagiarism. The words "the cost to fix software defects varies according to how far along you are in the cycle" are lifted straight from the Marasco article, with the "according to" clause in a different order. But the article doesn't give Marasco credit for those words.
There's also the distinct possibility that Ritscher never actually read "Pressman and Grady". Do I have proof of that? No, but it is a theorem of sorts that you can figure out the lineage of texts by "commonality of error". If you copy an accurate citation without having read the original, nobody's the wiser. But why would you go to the trouble of reproducing the same mistake that some random person made if you had actually read the original source?
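This kind of detective work can even be mechanized, crudely. Below is a toy sketch: the citation strings are abridged from the examples in this post, and the grouping logic, not the data, is the point. Texts that share the same idiosyncratic variant of a citation (say, "Science" where Pressman wrote "Sciences") cluster together, hinting at who copied from whom.

```python
from collections import defaultdict

# Toy illustration of "commonality of error": group texts by the exact
# citation string they carry. A shared idiosyncratic variant suggests a
# shared lineage. Strings abridged from the examples quoted in this post.
citations = {
    "Pressman, 5th ed.": "IBM Systems Sciences Institute, 1981",
    "Van Megen et al., 1995": "IBM Systems Sciences Institute, 1981",
    "Ardi, 2008": "IBM Systems Science Institute",
    "ZDLC paper, 2014": "IBM Systems Science Institute",
}
lineages = defaultdict(list)
for source, cited in citations.items():
    lineages[cited].append(source)
for variant, sources in sorted(lineages.items()):
    print(f"{variant!r}: {sources}")
```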
So we're entering the domain of intellectual laziness here. (Again, to stave off the Fundamental Attribution Error: I am not calling the person intellectually lazy; I am judging the behaviour. The most industrious among us get intellectually lazy on occasion, that's why the profession of tester exists.)
Next is this 2008 article by Mukesh Soni:
"The Systems Sciences Institute at IBM has reported that the cost to fix an error found after product release was four to five times as much as one uncovered during design, and up to 100 times more than one identified in the maintenance phase (Figure 1)."
We find the same level of deceit in a 2008 thesis, "A Model and Implementation of a Security Plug-in for the Software Life Cycle" by Shanai Ardi.
http://www.diva-portal.org/smash/get/diva2:17553/FULLTEXT01.pdf
"According to IBM Systems Science Institute, fixing software defects in the testing and maintenance phases of software development increases the cost by factors of 15 and 60, respectively, compared to the cost of fixing them during design phase [50]."
The citation is missing, but that's not really what's important here. We've crossed over into the land of bullshit. Both authors presumably found the claim in the same place everyone else found it: Pressman. (If you're tempted to argue "they might have found it somewhere else", you're forgetting my earlier point about "commonality of error". The only thing the "IBM Systems Science Institute" is known for is Pressman quoting them; it was a training outfit that stopped doing business under that name in the late 1970's.)
But instead of attributing the claim to "IBM, as summarized by Pressman", which is only drawing attention to the weakness of the chain of data custody in the first place, it sounds a lot more authoritative to delete the middle link.
I could go on and on, so instead I'll stop at one which I think takes the cake: "ZDLC for the Early Stages of the Software Development Life Cycle", 2014:
"In 2001, Boehm and Basili claimed that the cost of fixing a software defect in a production environment can be as high as 100 times the cost of fixing the same defect in the requirements phase. In 2009, researchers at the IBM Systems Science Institute state that the ratio is more likely to be 200 to 1 [7], as shown in Figure 2".
The entire sentence starting "In 2009" is a layer cake of fabrication upon mendacity upon affabulation, but it gets worse with the citation.
Citation [7] is this: "Reducing rework through effective requirements management", a 2009 white paper from IBM Rational.
Yes, on the scale of a century IBM Rational is a contemporary of the defunct IBM Systems Science Institute, but that's a little like attributing a Victor Hugo quote to Napoleon.
While Figure 2 comes straight out of the IBM paper, the reference to "IBM Systems Science Institute" comes out of thin air. And in any case the data does not come from "researchers at IBM", since the IBM paper attributes the data to Boehm and Papaccio's classic paper "Understanding and Controlling Software Costs", which was published not in 2009 but in 1988. (Both of them worked at Defense consultancy TRW.)
We've left mere "bullshit" some miles behind here. This isn't a blog post; this is a paper at an official peer-reviewed conference, with proceedings published by the IEEE, and yet right on the first page we run into stuff that a competent reviewer would have red-flagged several times. (I'm glad I let my IEEE membership lapse a while ago.)
Garden-variety plagiarism and bullshit (of which there is no short supply) make me feel icky about being associated with "software engineering", but I want to distance myself from that last kind of stuff as strongly as I possibly can. I cannot be content to merely ignore academic software engineering, as most software developers do anyway; I believe I have an active duty to disavow it.
Hi - I have a few serious problems with what you write here (I was pointed to this circuitously from a Register article). I think it may well be you who is spouting bullshit, rather than Pressman ... do read on ...
First, Pressman's book was first published in 1982, not 1987. The source of information was cited as being from the IBM course notes. These notes were contemporary at the time of publication, being from the previous year (1981).
The first edition copy, which is on my desk right now, actually gives the after-release figure as "67 units", where the later edition you quote has "between 60 and 100 units". So, the text did change between the first edition and the later revision you cited. I am not sure why the numbers moved up - perhaps there was later data available?
Next, in 1982, Software Engineering was a discipline that was very much in vogue. It was a mandatory module in the second year of my undergraduate degree course (1983-1984). I graduated in 1986 and Pressman's book was seen as very much state of the art, based as it was and is on actual experience gained to date in real and significant software projects. Most of these projects were large even by today's standards, and many were dealing with significant complexity. For example, IBM's OS/360 (much covered by F. P. Brooks) and ICL's VME/B - both large projects spanning from the early 1960's into the 1970's and beyond. In the case of ICL's VME/B, version control and what we today would call normal software engineering practice were already in place. I can point you at the relevant reports from the 1970's if you like. What did happen in the meantime was that the industry was filled out with amateurs with no formal training in computing science, and this has led to "inventions" that predated the birth of the wannabe "inventors" by decades. Moving on ...
You omit to mention that Pressman also develops his thinking to consider "defect amplification", which was even then a well-known problem. Pressman was simply reflecting real-world data that he had available. A long time ago I had a series of reports, including the IBM reports, where defect cost was measured. These reports aligned, at least in order-of-magnitude terms, with what Pressman's book was pointing out. Contrary to your thinking that all this is "bullshit", Pressman's words were based on real data. You should realise that even in the 1970s, senior managers, even in competing firms, would collaborate to find better ways to engineer software. I know that these discussions happened between the major mainframe companies - including IBM and ICL (who were dominant or large in the UK and Commonwealth). I know this because I knew some of the ICL people involved.
As far as my own experience goes, in my first post-graduate job (at ICL), we used the same inspection methodology as IBM to reduce bug counts, leading to many very high-quality outcomes. This was done both with designs (yes, we actually wrote them in those days) and with code. We didn't use the "click compile and see if the unit test catches anything" methods of today. Oh, and the first time I had to write unit tests was 1986, not "after 2000" when you think this technique was invented.
My own experience since then has been that many engineers of today do not understand the impact of bugs, the need to design them out, nor much else about how to actually engineer good software. They are uninterested in others' experience and somehow think their "agile" methods (we used to call this "incremental development" in the 1980's) are somehow new and good. What I have come to know is that bugs that get to production are way more expensive to fix than those found ahead of deployment. Why is this? Just add up the costs to users of dealing with the bug, plus the overheads of a support team, plus the eventual fixing and (not that this happens much in the new click-happy world) validation and verification of the fix. All of this extra work costs more than ensuring that the bug was removed ahead of release - this is obvious by inspection. Real-world data has told the truth for decades.
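To put rough numbers on that sum (every figure below is invented; only the cost categories follow the argument above):

```python
# Back-of-envelope comparison; all figures are hypothetical.
cost_fix_before_release = 1_000   # find and fix during development

cost_fix_in_production = (
    5_000    # users' lost time dealing with the bug
    + 2_000  # support team overhead (triage, workarounds)
    + 1_500  # the eventual fix itself
    + 1_500  # validation and verification of the fix
)
print(cost_fix_in_production / cost_fix_before_release)  # 10.0 with these figures
```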
Last, my son just graduated from a Master's course in Computing from Imperial College (one of the best in the world). Imagine my surprise and delight that Pressman is still required reading. Long may it continue.
To close, sorry for my venting, but hey, you got a counter view from someone with around 35 years of experience!