@dmckeone
Created April 30, 2014 17:12
What we know in Computer Science
Inspired by this post (http://carmandrew.com/home/2013/9/3/can-we-please-put-the-science-back-into-computer-science), here is a list of papers with a summary of what each one says:
- Boehm (1975): Most errors are introduced during requirements analysis and design, not during writing and debugging. The later a bug is removed, the more it costs to remove.
- Woodfield (1979): For every 25% increase in problem complexity, there is a 100% increase in solution complexity. It's non-linear because of interaction effects.
- van Genuchten (1991): The two biggest causes of project failure are poor estimation and unstable requirements, neither of which seems to be improving in the industry as a whole.
- Thomas et al. (1997, but it hasn't been replicated yet): If more than 20-25% of a component has to be revised, it's better to rewrite it from scratch. (This study was done on software for flight avionics, so it may or may not generalize.)
- Fagan (1975 at IBM): Hour for hour, sitting down and reading the code is the most effective way to remove bugs. It's better than running the code and better than writing unit tests. 60-90% of all errors can be removed before the very first run of the program. But all of that value comes from the first hour by the first reviewer. You can only read code for an hour before your brain is full, which is only a couple hundred lines of code depending on skill and practice. This means we should have small patches and changes.
- Herbsleb (1999): The architecture of the system reflects the organizational structure that built it.
- Nagappan (2007) & Bird (2009): Physical distance between the developers on a team doesn't affect the post-release fault rate (for Windows Vista). Distance in the org chart does. Thus, it's fine to have people remote as long as they are on the same team; just don't put developers working under different people on the same project.
- El Emam (2001): No code (or other) metrics that were published before 2001 have any correlation with post-release error rate beyond that predicted by source lines of code (SLOC).
- Aranda & Easterbrook (2005): When estimating the time it will take to complete a project, the only thing in the spec that matters is how long the writer of the spec thinks it will take (i.e. the anchoring effect from psychology). "All work done to date on software estimation is pretty much pointless. All the engineers are going to give us back is what we want to hear."
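
To make the Woodfield figure above concrete, here is a rough sketch of what "25% more problem, 100% more solution" would imply if the effect compounds, i.e. if every further 25% step doubles solution complexity again. The compounding is an assumption made for illustration, not something the summary states, and the function name `solution_growth` is made up for this example.

```python
import math

# Assumption (not from the paper summary): the 25% -> 100% relationship
# compounds, so solution complexity follows a power law in problem complexity.
PROBLEM_STEP = 1.25   # a 25% increase in problem complexity...
SOLUTION_STEP = 2.0   # ...gives a 100% increase in solution complexity

# Exponent k such that PROBLEM_STEP ** k == SOLUTION_STEP (roughly 3.1).
EXPONENT = math.log(SOLUTION_STEP) / math.log(PROBLEM_STEP)


def solution_growth(problem_growth: float) -> float:
    """Return the factor by which solution complexity grows when problem
    complexity grows by the factor `problem_growth` (1.25 means +25%)."""
    if problem_growth <= 0:
        raise ValueError("problem_growth must be positive")
    return problem_growth ** EXPONENT


if __name__ == "__main__":
    for factor in (1.25, 1.5, 2.0):
        print(f"problem x{factor:.2f} -> solution x{solution_growth(factor):.2f}")
```

Under that assumption the exponent is about 3, so solution complexity grows roughly with the cube of problem complexity, which is one way to read "it's non-linear".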