
Our future, our universe, and other weighty topics


Monday, May 16, 2022

Yes, You Can Do Math Clarifying the Chance of Accidental Biological Innovations

At the Skeptical Inquirer we have a recent article by Jason Rosenhouse attempting but failing to effectively rebut mathematical arguments against the credibility of Darwinist origins claims. The article begins by citing an imaginary example of a mathematical argument against the accidental origin of a gene. Apparently trying to make the mathematical improbabilities look vastly smaller than they are, the author gives us a faulty example that no competent critic of Darwinism would actually use, because it does not involve an accurate idea of the number of base pairs in human genes. Rosenhouse imagines someone arguing from a gene that is only 100 base pairs long, with such a person saying that such a sequence would have a random likelihood of 1 in 4 to the 100th power, too unlikely to have occurred by chance.

But according to the site here, "The typical confirmed human gene has 12 exons of an average length of 236 base pairs each, separated by introns of an average length of 5,478 base pairs." Counting only the exons, this gives us an average number of base pairs in a human gene of 12 times 236, which is 2832.  No one would argue from a gene of only 100 base pairs in length when the average human gene has something like 2832 base pairs. A base pair can have any of four values. The number of ways you can arrange the base pairs in a gene of only 100 base pairs is 4 to the 100th power, which is about 10 to the 60th power. The number of ways you can arrange the base pairs in an average human gene of about 2832 base pairs is a vastly larger number, which is 4 to the power of 2832, which is about 10 to the 1705th power, or 10 followed by 1704 zeros. 
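The arithmetic in the paragraph above can be checked with logarithms. The following is a minimal sketch, not anything taken from Rosenhouse's article or the quoted site; the function and variable names are made up for illustration, and the only inputs are the figures quoted above (four base values, 12 exons, 236 base pairs per exon).

```python
import math


def log10_arrangements(alphabet_size: int, length: int) -> float:
    """Return log10(alphabet_size ** length) without building the huge number."""
    return length * math.log10(alphabet_size)


BASE_TYPES = 4         # a base pair can take any of four values
EXONS_PER_GENE = 12    # figure quoted in the post
AVG_EXON_LENGTH = 236  # figure quoted in the post

gene_length = EXONS_PER_GENE * AVG_EXON_LENGTH
short_gene_exponent = log10_arrangements(BASE_TYPES, 100)
average_gene_exponent = log10_arrangements(BASE_TYPES, gene_length)

print(f"exon base pairs per gene: {gene_length}")
print(f"4^100 is about 10^{short_gene_exponent:.1f}")
print(f"4^{gene_length} is about 10^{average_gene_exponent:.1f}")
```

Working in exponents keeps the numbers readable: the 100-base-pair example and the 2832-base-pair average differ by more than 1600 orders of magnitude.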

Funny how this kind of convenient error of gigantic complexity understatement keeps cropping up in the work of writers trying to assure us about the feasibility of a Darwinian evolution of protein molecules. Similar goofs happened in a paper by professor Luca Peliti (discussed here), where Peliti incorrectly spoke as if the typical number of amino acids in an enzyme protein molecule is 100 (it's more like 400), and also made a convenient math error claiming that 20 to the hundredth power is about ten to the thirtieth power (20 to the hundredth power is actually roughly ten to the 130th power).
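The size of the error attributed to Peliti's paper is easy to verify. This short sketch (variable names are illustrative only) compares the claimed exponent with the actual one:

```python
import math

claimed_exponent = 30                    # "about ten to the thirtieth power"
actual_exponent = 100 * math.log10(20)   # log10 of 20^100
shortfall = actual_exponent - claimed_exponent

print(f"20^100 is about 10^{actual_exponent:.1f}")
print(f"the claimed figure is too small by about {shortfall:.0f} orders of magnitude")
```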

Such numbers come up when considering genes and proteins. A gene specifies the amino acid sequence of a particular type of protein molecule. The human genome includes roughly 20,000 to 25,000 different protein-coding genes, each of which specifies what amino acids make up a particular type of protein molecule. Each protein molecule is highly specialized to accomplish a particular task in the body.  Just like a computer subroutine of about 400 characters has to be done in a very particular way for the subroutine to work, a protein molecule has to have a very specific arrangement of amino acids to perform its task.

It is known that protein molecules are highly sensitive to small changes. Experiments have repeatedly shown that protein molecules are fragile, and become nonfunctional when only a small fraction of their amino acids are removed. Referring to protein molecules that have an average of about 400 amino acids each, a biology textbook tells us, "Proteins are so precisely built that the change of even a few atoms in one amino acid can sometimes disrupt the structure of the whole molecule so severely that all function is lost." And we read on a science site, "Folded proteins are actually fragile structures, which can easily denature, or unfold." Another science site tells us, "Proteins are fragile molecules that are remarkably sensitive to changes in structure." Protein molecules are not functional if only a half or a third of their amino acids exist. Typically we have no credible explanation for why the first half or the first third of any protein molecule would have ever originated.

After giving his ridiculously unrealistic or atypical example of a gene with only 100 base pairs, Rosenhouse attacks those who point out the vast improbability of getting a gene from accidental processes. He accuses such people of ignoring natural selection. He says, "Natural selection is a non-random process, and this fundamentally affects the probability of evolving a particular gene."

Misleading language is being used here in two different ways. First, it is misleading (as always) to use the term "natural selection," which refers to a survival-of-the-fittest effect that is not actually selection. Selection means a choice made by an agent, and when biologists describe natural selection, they are not describing any such choice. As Charles Darwin wrote, "In the literal sense of the word, no doubt, natural selection is a false term." Second, it is extremely misleading to claim that natural selection is a non-random process. 

The first definition of "random" given by the Cambridge Dictionary is "happening, done, or chosen by chance rather than according to a plan." Natural selection meets that definition of random.  Natural selection can be described like this: random changes occurring in organisms, with a preservation of lucky changes causing increases in survival or reproduction. It is extremely misleading to describe such a thing as a "non-random" process, since it is something centered around mere chance and not at all following any plan or design. 

Does natural selection (or evolutionary ideas in general) help explain the origin of genes, and the origin of protein molecules corresponding to them? No, such ideas are of very little help in explaining the origin of genes or protein molecules.  The problem is that genes and protein molecules are usually not useful if you only have half or a third of the gene or the protein molecule.  This is because of the extreme sensitivity of protein molecules to small changes (mentioned by the quotes above). Change at random 10% of the amino acids in a protein molecule, and the protein will no longer be able to fold, and will be functionally useless. Protein molecules require a very special kind of three-dimensional structuring called folding, and very small changes in molecules ruin their ability to do such folding, making them useless. 

Accordingly, we cannot explain the origin of genes through some gradualism approach that imagines that first there was one third of the gene that was useful for one purpose, and then there was two thirds of the gene that were useful for some other purpose, and then finally we got the version of the gene that humans now have.  Human genes with only half of their base pairs or a third of their base pairs are not useful, and their corresponding protein molecules are not useful with half of their amino acids. 

There are two other reasons why some "natural selection/gradualism" approach does not actually reduce the fantastically slim likelihoods that arise when discussing the probability of the accidental origin of genes and functional protein molecules:

(1) Anyone arguing for the impossibility of novel genes arising through any known process can refer to additional improbability factors which more than make up for any improbability reduction achieved by invoking "natural selection" or gradualism or more primitive antecedents. For example, suppose you try to reduce the improbability of a gene of 1000 base pairs appearing by appealing to some possibility that an antecedent of that gene would have required only 500 base pairs. I can then counter by pointing out that a large fraction of all proteins (partially specified by genes) are useless unless they are part of what are called protein complexes (involving two or more genes working in a team), or unless they are associated with "chaperone proteins" required for their folding. These factors increase by very many times the improbability of a gene and its co-dependent genes arising, an increase that more than makes up for any improbability reduction achieved by speculating about simpler gene antecedents.

(2) Speculations about natural selection and evolution are of no value in explaining the origin of hundreds of genes and protein types necessary for the origin of life (because you cannot have natural selection unless life already exists). So there's no way to escape the prohibitive math arguing against the possibility of a natural origin of life. Natural selection does not fix the impossible odds prohibiting the natural origin of hundreds of genes and protein types needed at the very beginning, at the origin of life, before Darwinian evolution has started. A team of 9 scientists wrote a scientific paper entitled, “Essential genes of a minimal bacterium.” It analyzed a type of bacterium (Mycoplasma genitalium) that has “the smallest genome of any organism that can be grown in pure culture.” According to Wikipedia's article, this bacterium has 525 genes consisting of 580,070 base pairs. The paper concluded that 382 of this bacterium's protein-coding genes (72 percent) are essential.

Rosenhouse misinforms us greatly when he states, "The set of all possible gene sequences is incredibly vast, but this is irrelevant because natural selection shifts the probability distribution dramatically toward the functional sequences and away from the nonfunctional sequences."  Of course, the size of the set of all possible arrangements of the building blocks of a gene or protein molecule is something of the most basic and fundamental importance in realistically estimating the chance of accidental gene innovations. To call such a thing irrelevant is every bit as untrue and misleading as saying that the number of digits you have to match to get a winning lottery ticket is irrelevant, or that the number of parts needed to make something is irrelevant.  The very accomplished biologist Hugo de Vries told us the truth when he stated this:

"Natural selection is a sieve. It creates nothing, as is so often assumed; it only sifts." 

A simple linear increase in the number of amino acids produces an exponential increase in the number of ways in which such amino acids can be arranged, resulting in a vast combinatorial explosion. When you double the number of amino acids in a useful protein molecule, you do not just double the improbability of such a sequence accidentally appearing; instead, you increase such an improbability by very many times.  Below we see some of the numbers involved in this combinatorial explosion.  

[Image: protein molecule mathematics]
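The combinatorial explosion can also be shown with exact integer arithmetic. This is a small sketch with illustrative names, assuming only that each position can hold any of the twenty amino acid types. It shows that doubling the length squares the number of possible sequences rather than doubling it.

```python
AMINO_ACID_TYPES = 20


def sequence_count(length: int) -> int:
    """Exact number of distinct amino acid sequences of the given length."""
    return AMINO_ACID_TYPES ** length


def exponent_of_ten(n: int) -> int:
    """Return floor(log10(n)) for a positive integer, via its digit count."""
    return len(str(n)) - 1


for length in (100, 200, 400, 800):
    print(f"length {length}: about 10^{exponent_of_ten(sequence_count(length))} sequences")

# Doubling the length squares the count.
print(sequence_count(200) == sequence_count(100) ** 2)
```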

Functional protein sequences are so rare in the vast combinatorial space of all possible amino acid sequences with a length less than 1000 that such functional protein sequences are as rare as combinations of 1000 random characters that make long, useful, grammatical and correctly spelled paragraphs. The math here involves the kind of prohibitive odds that blow up into a million pieces the kind of explanatory boasts Rosenhouse wishes to make. So it is no surprise that he tries to discourage us from doing the math.  He states this:

"Establishing complexity requires carrying out a probability calculation, but we have no means for carrying out such a computation in this context. The evolutionary process is affected by so many variables that there is no hope of quantifying them for the purposes of evaluating such a probability....There is no way to carry out a meaningful calculation, and adding 'specificity' to the mix does nothing to improve the argument."  

[Image: Darwinist math]

To the contrary, we do have everything we need to do probability calculations in this context.  Specifically:

(1) We know how many base pairs would have to be arranged correctly to get a particular human gene corresponding to a functional protein (an average of roughly 1200, with many genes requiring a special arrangement of more than 2000 base pairs). 

(2) We know how many amino acids would have to be arranged correctly to get a particular type of protein (an average of about 400, with many proteins having more than 1000 such specially arranged amino acids).

(3) We know how many protein coding genes there are in the human genome (roughly 20,000 to 25,000 or more). 

(4) We know that there are four types of gene base pairs and twenty types of amino acids used by living things. 

(5) As suggested by the quotes on protein sensitivity and protein fragility made above, we know that each gene and its corresponding protein molecule are highly sensitive to small changes, with small changes breaking their functionality, partially because of the very sensitive and special requirements for the very hard-to-achieve feat of successful protein folding. A very relevant scientific paper is the paper "Protein tolerance to random amino acid change." The authors describe an "x factor" which they define as "the probability that a random amino acid change will lead to a protein's inactivation." Based on their data and experimental work, they estimate this "x factor" to be 34%. It would be a big mistake to confuse this "x factor" with what percentage of a protein's amino acids could be changed without making the protein non-functional.  An "x factor" of 34% actually suggests that almost all of a protein's amino acid sequence (an average of roughly 400 amino acids) must exist in its current form for the protein to be functional.  

Further evidence for such claims can be found in this paper, which discusses very many ways in which a random mutation in a gene for a protein molecule can destroy or damage the function or stability of the protein.  An "active site" of an enzyme protein is a region of the protein molecule (about 10% to 20% of the volume of the molecule) which binds and undergoes a chemical reaction with some other molecule.  The paper states, "If a mutation occurs in an active site, then it should be considered lethal since such substitution will affect critical components of the biological reaction, which, in turn, will alter the normal protein function." The paper follows that sentence with a mention of quite a few other ways in which random mutations can break protein molecules, making them nonfunctional. For example, we read that "an amino acid substitution at a critical folding position can prevent the forming of the folding nucleus, which makes the remainder of the structure rapidly condense," which is a description of how a single amino acid change (less than a 1% change in the amino acids in a protein molecule) can cause a protein molecule to no longer have the 3D shape needed for its function. Referring to random tiny changes in the amino acids in a protein (mutations), a scientific paper stated, "We predict 27–29% of amino acid changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30–42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%).”  This amounts to an estimate that a random change to the amino acid sequence of a protein has about a 30% chance of breaking the protein's functionality. As a biology textbook tells us, "Proteins are fragile, are often only on the brink of stability."

(6) Using such numbers, we can calculate that the chance of each novel functional protein molecule or each novel functional gene arising during a random combination of its chemical components is no greater than about 1 in 10 to the 500th power. 

(7) Considering the possibility of fractions of such protein molecules being useful, we can calculate that even under the extremely generous assumption that halves of protein molecules are useful (very probably untrue because of the protein fragility considerations listed above), this would still leave you with prohibitive numbers such as a particular combination of the chemical subunits succeeding to make functional protein molecules only about once in about 10 to the 250th power combinations of the components. 

(8) Considering that a large fraction of all protein molecules are functional only as parts of  protein complexes requiring multiple coordinated protein molecules or with the help of additional chaperone protein molecules that aid in the protein folding process, we can calculate that the improbability increase from such factors is roughly the same or greater than any improbability decrease produced by imagining fractions of such protein molecules being useful.

(9) Knowing roughly the number of molecules on the surface of a planet such as ours and the length of time that planets have existed (a few billions of years), we can calculate the total number of chemical combinations that would have occurred in the history of a planet such as ours, which is some number much less than 10 to the seventieth power. 

(10) Judging from the total number of chemical component combinations that would have occurred in the history of planet Earth, the improbabilities discussed above, and the estimated number of planets in our galaxy (roughly a trillion), we can reasonably calculate that we would never expect any novel average-sized functional protein molecule or any novel average-sized functional gene to have ever accidentally appeared by any known natural processes, either in the history of planet Earth or on any planet in our galaxy. Such an accidental appearance would have been not merely unlikely, but enormously unlikely. It is possible that our galaxy is filled with large organisms, but only if something vastly greater than Darwinian evolution is occurring to allow that.
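The steps in the list above can be sketched as a short calculation. This is only an illustration under the list's own stated assumptions: a uniformly random arrangement of components, a single target sequence of average length, at most 10 to the 70th power combinations per planet, and a trillion planets. The names are made up for the sketch. Treating successive random amino acid changes as independent, each with the 34% "x factor" chance of inactivation, is an added simplifying assumption.

```python
import math

AMINO_ACID_TYPES = 20          # item (4)
AVG_PROTEIN_LENGTH = 400       # item (2)
LOG10_TRIALS_PER_PLANET = 70   # item (9), an upper bound stated in the post
LOG10_PLANETS = 12             # item (10), roughly a trillion planets
X_FACTOR = 0.34                # item (5), chance one random change inactivates


def log10_sequences(length: int) -> float:
    """log10 of the number of amino acid sequences of the given length."""
    return length * math.log10(AMINO_ACID_TYPES)


# Items (6) and (7): full-length target and the generous half-length case.
full_exponent = log10_sequences(AVG_PROTEIN_LENGTH)
half_exponent = log10_sequences(AVG_PROTEIN_LENGTH // 2)

# Items (9) and (10): total trials across the galaxy, as a power of ten.
log10_total_trials = LOG10_TRIALS_PER_PLANET + LOG10_PLANETS

# Expected number of successes = trials / sequence space (in log10 terms).
log10_expected_full = log10_total_trials - full_exponent
log10_expected_half = log10_total_trials - half_exponent

# Assumption: independent changes. Chance a protein still works after
# random changes to 10% of its positions (40 of 400).
survival_after_40_changes = (1 - X_FACTOR) ** 40

print(f"full protein: 1 in 10^{full_exponent:.0f}")
print(f"half protein: 1 in 10^{half_exponent:.0f}")
print(f"expected successes, full: 10^{log10_expected_full:.0f}")
print(f"expected successes, half: 10^{log10_expected_half:.0f}")
print(f"survival after 40 random changes: {survival_after_40_changes:.1e}")
```

The exponents come out slightly larger than the rounded figures of 500 and 250 used in items (6) and (7), so the rounded figures understate rather than overstate the improbability under these assumptions.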

Rosenhouse's claim that you cannot do the math here is clearly incorrect.  We have numbers that allow us to calculate that the odds against the accidental appearance of novel functional genes and novel functional protein molecules are utterly prohibitive.  Reasonable calculations from such numbers indicate that Darwinist claims to have explained biological origins are unfounded boasts. If you cannot credibly explain the origin of genes and protein molecules, you have no business claiming that you understand the origin of a species. Besides failing to credibly explain the origin of our genes and protein molecules, Darwinists fail to credibly explain the origin of the visible anatomy and vastly organized hierarchical structure of large organisms, because (contrary to frequent misstatements on this topic) neither DNA nor its genes specify any kind of blueprint for anatomy or even instructions on how to make cells, but merely low-level chemical information such as which amino acids make up a protein. Such theorists also fail to explain the origin of human minds (which are not credibly explained by brains, as discussed in the posts here). 

Calculations by Darwinism skeptics about the improbability of natural gene origination are very similar to calculations about the improbability of natural abiogenesis, life originating from non-life. Very similar calculations are done by mainstream scientists, and their work appears in mainstream journals and is sometimes favorably discussed in mainstream publications.  

In their excellent Journal of Theoretical Biology paper "Using statistical methods to model the fine-tuning of molecular machines and systems," which discusses quite a few things relevant to the discussion of this post, Steinar Thorvaldsen and Ola Hossjer state the following about one scientist's calculation of the probability of a transition from the RNA World scenario to a  "proteins and cells" level of life:


"Eugene Koonin...has made a theoretical study of the path from a putative RNA world to an explicit translation system (like a 'DNA-protein world'). He found this path to be incredibly steep (Koonin 2012, p. 376), even under the best-case scenario."

We are told in Thorvaldsen and Hossjer's paper that Koonin calculated that the chance of such a transition occurring would be less than 1 in 10 to the thousandth  power.  That's less than the chance of you correctly guessing the telephone numbers of 100 consecutive strangers. 
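The telephone comparison can be checked directly. This sketch assumes ten-digit telephone numbers with every digit guessed independently (an assumption, since the post does not give the length of the numbers).

```python
DIGITS_PER_NUMBER = 10   # assumption: ten-digit telephone numbers
STRANGERS = 100

# Number of equally likely ways to guess every digit of every number.
guess_space = 10 ** (DIGITS_PER_NUMBER * STRANGERS)
koonin_bound = 10 ** 1000

print(guess_space == koonin_bound)
```

Under that assumption the guessing odds are exactly 1 in 10 to the thousandth power, the same figure as the bound attributed to Koonin.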
In the journal Scientific Reports (part of the Nature Portfolio) there was published a paper by Totani entitled “Emergence of Life in an Inflationary Universe.” We read the following in the LiveScience.com article discussing Totani's paper:

“But researchers have found that the random formation of RNA with a length greater than 40 is incredibly unlikely given the number of stars — with habitable planets — in our cosmic neighborhood. There are too few stars with habitable planets in the observable universe for abiogenesis to occur within the timeframe of life emerging on Earth.”

While writers such as Rosenhouse attempt to stigmatize writers making probability arguments based on biological complexity, we find in mainstream science papers (like those I just discussed) the appearance of similar-approach arguments reaching similar conclusions, showing that math along these lines is quite possible and quite legitimate. And occasionally mainstream scientists will confess that biologists don't really understand the things biologists so often brag about understanding. For example, in Scientific American a biologist confessed, "While scientists are still working out the details of how the eye evolved, we are also still stuck on the question of how intelligence emerges in biology." Note the "we are also still stuck" phrase, which has a floundering sound to it. A paper co-authored by a Caltech scientist involved in biological engineering confesses, "Biological systems have evolved to amazingly complex states, yet we do not understand in general how evolution operates to generate increasing genetic and functional complexity." A Harvard scientist confesses, "A wide variety of protein structures exist in nature, however the evolutionary origins of this panoply of proteins remain unknown." Referring to the origin of species (speciation), Cambridge University biology professor K. D. Bennett says this on page 175 of his book Evolution and Ecology: The Pace of Life: "Natural selection has been shown to have occurred (for example, among populations of Darwin's finches), but there is no evidence that it accumulates over longer periods of time to produce speciation in the Darwinian sense." Philip Ball (for 20 years a physical science editor at the leading mainstream journal Nature) stated the following in a publication of a leading science organization:

"It is not obvious a priori that small mutational steps should permit adaptation rather than simply inevitable loss of function. Nor is it clear why such a mechanism should permit genuine evolutionary innovation rather than being confined to a sort of timid tinkering with existing functionality."

The misunderstanding of writers such as Rosenhouse about natural selection is very great. Something (so-called natural selection) that is at best a propagation effect or preservation effect is spoken of by such writers rather as if it were some magic talisman that provided infinite luck, allowing unlimited miracles of accidental construction.  Such are the very false ideas that can arise in one century partially because people started using language incorrectly centuries earlier, such as using the word "selection" for something that involves no real selection, no real choice.  

As a quick-and-dirty analogy, you can think of natural selection as a mere sieve or filter that preserves lucky results. But perhaps a better analogy is if we think of natural selection as being like a computer printer.  Darwinists believe that a novel gene originates when some incredibly lucky random change occurs in a single organism, and that natural selection causes such a new gene to slowly spread across the gene pool of a species during multiple generations (because the gene produces  a survival benefit or reproduction benefit, causing an organism that has it to be more likely to spread its genes).  According to such a description, natural selection is acting like a computer printer that can make unlimited copies of some page or pages.  But it is a gigantic mistake to think that we can explain the origin of the gene by appealing to natural selection. At best natural selection is like a computer printer, and computer printers don't author things.

[Image: natural selection problem]

Within the context of explaining the origin of novel genes and novel proteins, there is actually every reason to believe that the idea of natural selection is a very misleading one (beyond the mere fact that no real selection is occurring because no agent is choosing). Why is that? Natural selection is basically the idea that nature preserves some great miracle of biological luck when it occurs. But let us imagine that random mutations were to produce a novel innovation by accidentally making a new type of functional protein molecule. With 99.99% likelihood such a thing would not be preserved in a gene pool for many generations, for the simple reason that it would only be one element when many other miracles of protein innovation or phenotypic innovation would be needed to actually produce a survival benefit or a reproduction benefit. This is because the requirements for improvements in survival or reproduction are usually incredibly complicated, typically involving a requirement for quite a few coordinated and very complicated changes in different places. Such requirements are vastly underestimated by Darwinism enthusiasts who fail to study the gigantically diverse and complex requirements for successful biological improvements, which often involve multiple very complex "chicken or the egg" cross-dependencies. Just as inventing a CPU chip in 17th century France would not have got you anywhere (because countless other not-yet-invented things would also be needed for a computer), in general some accidental miracle of luck producing a functional new type of protein molecule would almost certainly be futile, because many other simultaneous (or nearly simultaneous) miracles of luck would be needed to produce a benefit in survival or reproduction.

[Image: interlocking biological dependencies]

Reading Rosenhouse's "you can't do the math here" argument, I'm reminded of that old saying about lawyers. They say that if a lawyer has the law or the facts on his side, then he argues the law or the facts. But if he doesn't have the law or the facts on his side, then the lawyer shouts and beats his fists on the table. Similarly, when biological theorists have the math on their side, then they make probability arguments using math. But when they don't have the math on their side, then they just try ignoring mathematics (as Darwin did) or make very lame claims that you can't do the math (as Rosenhouse has done). 

Postscript: In the original post I failed to mention a bit of utterly fallacious sophistry by Rosenhouse in which he compares the origin of a protein molecule to getting a run of 100 heads when flipping coins. It's another case of vastly underestimating the improbability. Flipping 100 coins and getting all heads has a likelihood of about 1 in 10 to the 30th power, but getting a random arrangement of 400 amino acids to make a functional protein molecule has a likelihood of about 1 in 10 to the 520th power. Rosenhouse tries to suggest that natural selection can easily do something like getting 100 coins that are all heads, on the grounds that it saves the successful fragments on each try, like someone who tosses 100 coins and then saves each "heads" coin, then flipping again only the coins that landed "tails." The insinuation is false. Neither evolution nor natural selection would have any way of knowing that scattered fragments of a random arrangement of amino acids are fragments of successful protein molecules. Having no knowledge of the target molecule (an extremely rare successful arrangement of 400 amino acids producing a biological benefit), neither evolution nor natural selection would ever know that some amino acid in a particular position in the linear series was a fragment (about 1/400th) of a successful solution (a functional protein molecule of about 400 amino acids). Even an intelligent human could never build a new not-yet-invented successful subroutine of 400 characters by generating 400 random characters and then saving the random characters that are fragments of a successful solution, continuing the process through various iterations. If you don't know the exact solution, there would be no way to tell whether there is a match to the solution at a particular position in the linear sequence.
Rosenhouse's fallacy here is the same as in Dawkins' famous "Methinks it is like a weasel" sophistry, in which an appeal is made to the ease of positional matching to an exact target, of a type which would be impossible within nature because the target would be unknown.
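The coin procedure described above can be written out to show where the target knowledge enters. This is a minimal sketch with made-up names, not code from Rosenhouse or Dawkins. Note that the "keep" step only works because the routine checks every position against a known answer (heads), which is exactly the step that, as argued above, a blind process could not perform for an unknown protein sequence.

```python
import math
import random


def rounds_until_all_heads(num_coins: int, rng: random.Random) -> int:
    """Flip all coins, keep every head, reflip only the tails; count the rounds."""
    tails_remaining = num_coins
    rounds = 0
    while tails_remaining > 0:
        rounds += 1
        # Target knowledge is used here: each coin is compared with "heads",
        # and only the coins that miss the target are flipped again.
        tails_remaining = sum(
            1 for _ in range(tails_remaining) if rng.random() < 0.5
        )
    return rounds


# Single-try odds, expressed as powers of ten.
log10_all_heads = 100 * math.log10(2)        # 100 heads in one toss of 100 coins
log10_protein = 400 * math.log10(20)         # one specific 400-residue sequence

rounds = rounds_until_all_heads(100, random.Random(0))  # seed fixed for repeatability

print(f"100 heads in one try: 1 in 10^{log10_all_heads:.0f}")
print(f"400 amino acids in one try: 1 in 10^{log10_protein:.0f}")
print(f"rounds needed when heads are saved: {rounds}")
```

With saving, the run finishes in a handful of rounds, which is why the analogy looks persuasive. The whole speed-up comes from the per-position comparison with the target.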

We see the same faulty type of reasoning endlessly repeating in the literature of Darwinism apologetics: analogical comparisons in which a blind, mindless process (called either evolution or natural selection) is compared to some type of intelligent agent, in an attempt to assure us that evolution or natural selection has adequate inventive powers. So, for example, evolution or natural selection may be compared to a tinkering inventor, or an engineer, or a person making wise selections or a person matching scattered fragments of a linear sequence to a target, or some mountain climber choosing the best paths to the mountain top, or a blind watchmaker, or some other kind of human. All such comparisons are very fallacious, because they involve comparing willful and intelligent agents to some natural process that has no will, no ideas and no intelligence. As a general rule of thumb, you should be extremely suspicious the instant you read anyone making an analogy about evolution or natural selection, and remember that Darwinism apologists have a long history of making fallacious analogies.
