Future and Cosmos: Best Practices Software and Cowboy Coding

Note: I have revised this post to remove its original references to the programming sins of one particular programmer. I have decided to take mercy on this person, and remove all references to his coding sins.

Let's look at the difference between two very different types of programming: best practices software and cowboy coding.

Best-practices Software

Best-practices software is software developed according to software industry guidelines for quality. Examples of these best practices include the items below. There is not always time to follow all of them, but the overall quality and maintainability of the code depend on how many of these standards are followed:

Each source file contains a comment specifying the type of code in that file.
Each method, subroutine or function contains a comment explaining what is done by that method or subroutine. The only exception to this rule is when the name of the method, subroutine, or function leaves no doubt as to exactly what is being done.
There is a short description of each argument to any method that takes arguments, except in the case when the name of an argument leaves no doubt as to what that argument is.
There are comments explaining the logic in any particularly complicated or hard-to-understand parts of the code.
Variables are given names that help to document what they stand for.
Good coding practices are followed by each developer.
Once the code is finished, it is placed in a version control system. Whenever a source code file is changed, the new version is checked into the version control system, with a comment discussing what changes were made.
The code is developed by a team of developers, who can cross-check each other's work.
Once the code is written, documentation is written explaining how the code works and how it can be modified.
A team of quality assurance experts (known as the QA staff) is finally brought in to rigorously test the code to find any bugs in it.
Once the code has been released, a meticulous record is kept of all changes in the code and all reported bugs, along with which of the bugs were fixed.
Any known defects or limitations of the code are clearly documented.
Each subsequent release of the code is given a new version number, with a description of exactly how the code changed during the latest release.
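
To make the commenting and naming items in the list above more concrete, here is a minimal sketch in Python. It is only an illustration: the file's purpose, the function name standard_error_of_mean, and its argument are hypothetical and are not taken from any real project. It shows a file-level comment, a function comment, a short description of the argument, a comment on the one subtle step, and variable names that say what they stand for.

```python
"""Utility functions for summarizing repeated measurements.

(Hypothetical example file: this top comment states the type of code
that the file contains.)
"""

import math


def standard_error_of_mean(measurements):
    """Return the standard error of the mean of a set of measurements.

    measurements: a sequence of at least two numbers, all in the same unit.
    """
    sample_count = len(measurements)
    if sample_count < 2:
        raise ValueError("at least two measurements are required")

    sample_mean = sum(measurements) / sample_count

    # Divide by (sample_count - 1) rather than sample_count, because the
    # mean was estimated from the same data (the usual sample variance).
    squared_deviations = [(value - sample_mean) ** 2 for value in measurements]
    sample_variance = sum(squared_deviations) / (sample_count - 1)

    return math.sqrt(sample_variance / sample_count)
```

A cowboy-coded version of the same function might use names like x, n and s and carry no comments at all. It would compute the same number, but a reader coming back to it later would have to work out the intent from scratch.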

Practices such as these are followed by the developers of mission-critical software, or software on which great amounts of money are riding, or software on which lives depend. For example, if a company were writing software for a nuclear reactor, or software for an expensive space mission, or software for guiding a jetliner, it would tend to follow most or all of these best practices.

However, there is a totally different way of programming that is often used, a quick-and-dirty way of programming. This way of programming is sometimes called cowboy coding.

Cowboy Coding

Cowboy coding is what happens when a single developer produces some code, typically in a quick-and-dirty manner. The cowboy coder isn't interested in any quality guidelines that will slow him down. He typically grinds out some software without doing much to document it. He may make no use of version control. He may then release his work without having had anyone check it other than himself. Typically the cowboy coder just kind of says, “It seems to work well when I try it. Let me know if you find anything wrong with it.” A typical cowboy coder makes little or no attempt to produce written documentation for his software, and may take no care to document different versions or to document exactly which bugs were fixed.

Now cowboy coding certainly has its place. Lots of programs are not mission critical, and need not be developed using best practices. It would be overkill to follow the best practices listed above when creating some little graphic utility for doing something like allowing a user to add text to an image.

However, it must be noted that cowboy coding is a severe danger if it is used for some critical part of a hugely important scientific study. This is because cowboy coding isn't very reliable. Maybe it does the right thing, and maybe it doesn't. It can be hard for anyone to tell except the original cowboy coder, and probably he doesn't even know. This is no exaggeration. Poorly documented software code is very hard to read, even if you are the original developer. Countless cowboy coders simply don't know whether their cowboy-coded projects work correctly. I've cowboy-coded quite a few little projects, and then when I went back to them much later, I could often hardly figure out any more what exactly they were doing.

The Huge Problem of Cowboy Coding in Scientific Studies

A very big problem is that modern scientific studies often rely on dubious software solutions that have been cowboy-coded. Modern science involves enormous amounts of specialized data processing. Scientists cannot buy off-the-shelf software to handle these specialized needs. This is because each scientific specialty requires its own specific type of software, and the market for such software is so small that few software publishers will cater to it.

What very often happens is that scientists write their own software programs to handle their own specialized needs. Such efforts are often one-man cowboy-coded efforts that do not come anywhere close to meeting the best practices of modern software development. We have many scientists writing amateurish code that any full-time software developer would be ashamed to put his name on. But such code might become a critical linchpin in some scientific study that uses up millions of federal dollars.

We need new standards to minimize this problem. One possibility is to include software professionals as part of the peer review process for scientific studies. There are major scientific studies that are 30% science and 70% data processing. But in the peer review process, only scientists review the study. This makes no sense. Software professionals should be included in the process, to a degree that depends on how much data processing was done in the study.

Wednesday, March 19, 2014
