In order to help perpetuate their dogma that humans are accidents of nature owing our existence to lucky random mutations in the distant past, our biologists are very fond of using what I call shrink-speaking language: language that makes the endless towering cathedrals of biological complexity look like mere crumbs. This involves many tricks of diminutive representation, such as:
- Referring to gigantically organized human bodies as "bags of chemicals" or "carbon stuff" or "star stuff."
- Referring to stratospheric leaps of biological organization and functional complexity as mere "variants."
- Referring to enormous new bonanzas of unprecedented biological engineering (such as the Cambrian Explosion) as mere "diversification."
Organisms such as ourselves involve hierarchically structured and enormously organized complexity that cannot be credibly explained by appealing to random mutations. What we have in a human body is enormously organized and fine-tuned complexity so immense that it can be called an enormous engineering effect. In his interesting book Cosmological Koans, the physicist Anthony Aguirre tells us about just how complex biological life is. He states the following on page 338:
"On the physical level, biological creatures are so much more complex in a functional way than current artifacts of our technology that there's almost no comparison. The most elaborate and sophisticated human-designed machines, while quite impressive, are utter child's play compared with the workings of a cell: a cell contains on the order of 100 trillion atoms, and probably billions of quite complex molecules working with amazing precision. The most complex engineered machines -- modern jet aircraft, for example -- have several million parts. Thus, perhaps all the jetliners in the world (without people in them, of course) could compete in functional complexity with a lowly bacterium."
So if a lowly bacterium has a functional complexity comparable to a jetliner, what kind of functional complexity does a human body have? Functional complexity so great it can be called an enormously strong engineering effect. The human body includes many types of molecular machines that are best classified as accidentally unachievable. Something is accidentally unachievable if there are no imaginable unguided accidental events that could cause its origin.
I can explain the idea of something being accidentally unachievable with a simple example that is easy to understand. A bridge across a shallow stream is something that is accidentally achievable. We can imagine some accidental arrangement of rocks that might make a kind of bridge across a shallow stream, and we can imagine some lightning bolt causing a tree to fall, making a bridge across a shallow stream. But a bridge across a very wide and deep river is something that is accidentally unachievable. There is no conceivable series of accidents or unwilled natural events that could create a bridge over an average section of a wide river such as the Mississippi, which has an average width of one mile.
Let us look at some of the most impressive cases of molecular machinery in your body, systems requiring so many thousands of well-arranged parts that they are reasonably called accidentally unachievable. Pay attention to the numbers given in the tables below, numbers which tell how many amino acid parts have to be well-arranged to get the protein mentioned in each row. If the numbers were very low numbers such as 3 or 5 or 7, it might be excessive to state that the relevant molecular machines are accidentally unachievable. Instead, the actual numbers will be in the hundreds or thousands, meaning that we will be constantly finding that individual protein components of these molecular machines each required hundreds or thousands of well-arranged amino acid parts (with a consequence that the total system required thousands of well-arranged amino acids, equivalent to tens of thousands of well-arranged atoms).
Example #1: The Apoptosome
(Image credit: Wikimedia Commons, derived from Yuan et al. 2010, Structure of an apoptosome-procaspase-9 CARD complex)
Shown above is the apoptosome protein complex involved in programmed cell death. Note the references in the chart to propellers, which remind us how much the complex resembles a product of engineering. Humans have more than 20,000 types of protein molecules, and the average protein molecule is a very special arrangement of more than 400 different amino acid parts. The arrangement of amino acids in each protein is as hard-to-achieve by chance as 400 accidentally typed characters making a paragraph of grammatical and functional prose. Extremely complex engineering arises in the form of protein complexes, in which different proteins (often useless by themselves) work together as team members to achieve some dramatic functional result. We see that in the visual above, where multiple instances of several different types of protein molecules come together to form an extremely complex structure consisting of thousands of well-arranged amino acid parts, and consisting of a total of tens of thousands of well-arranged atoms. A page describes the action of these individually useless proteins coming together to form a functional protein complex:
"The process of programmed cell death, also known as apoptosis, is highly regulated, and the decision to die is made through the coordinated action of many molecules. The apoptosome plays the role of gatekeeper in one of the major processes, termed the intrinsic pathway. It lies between the molecules that sense a problem and the molecules that disassemble the cell once the choice is made. Normally, the many subunits of the apoptosome are separated and inactive, circulating harmlessly through the cell. When trouble occurs, they assemble into a star-shaped complex, which activates protein-cutting caspases that get apoptosis started."
Another site that includes a 3D rotating animation of the structure shown above says this:
"The apoptosome is revealed as a wheel-like complex with seven spokes. On top of the wheel is a spiral-shaped disk that allows for docking and subsequent activation of proteases, which then target cellular components. When active, the apoptosome is revealed to be a dynamic machine with three to five protease molecules tethered to the wheel at any given time."
The "Apaf 1" part of this complex (APAF_HUMAN) involves 1248 amino acids.
Example # 2: The Spliceosome
At the site here, we read this about the human spliceosome:
"The spliceosome is a complicated and formidable example of a multi-subunit molecular machine, with the pre-catalytic form being the largest spliceosomal complex, containing 5 RNA molecules and 65 proteins, in addition to a substrate mRNA precursor. The arrangement and activities of all of these has to be intricately coordinated, paradoxically to catalyse a rather simple chemical reaction."
The paper here describes the spliceosome as a highly dynamic machine, like some race car that has its parts changed or replaced by a pit crew as the race car stops for pit stops:
"Indeed, ∼45 proteins are recruited to the human spliceosome as part of the spliceosomal snRNPs, whereas non-snRNP proteins comprise the remainder. The composition of the spliceosome is highly dynamic with a remarkable exchange of proteins from one stage of splicing to the next. These changes are also accompanied by extensive remodeling of the snRNPs within the spliceosome."
Below are the numbers of amino acids involved in these parts, which I looked up using the UniProt online database (you can use the links to check the numbers I have given):
Protein | Number of amino acids | Comment |
793 | On Chromosome 22 | |
462 | On Chromosome 19 | |
501 | On Chromosome 1 | |
683 | On Chromosome 1 | |
522 | On Chromosome 9 | |
2335 | On Chromosome 17 | |
972 | On Chromosome 17 | |
499 | On Chromosome 19 | |
522 | On Chromosome 9 | |
800 | On Chromosome 11 | |
586 | On Chromosome 5 | |
420 | On Chromosome 5 | |
112 | On Chromosome 19 | |
855 | On Chromosome 19 | |
758 | On Chromosome 19 | |
514 | On Chromosome 4 |
Altogether the structure shown above requires more than 9000 amino acids that have to be arranged in just the right way. The structure shown above is not specified in DNA, which merely specifies which amino acids make up each of the protein parts. The amino acid information needed to make the structure above (insufficient to specify the physical arrangement of the structure) is not at all contiguous in DNA. To assemble the structure above, among other wonders of construction a human body must magically gather genetic information scattered across many different chromosomes in the nucleus, like someone quickly finding just the right 45 loose pages hidden in random books of 46 tall, long bookcases in a public library. The table above shows that at least eight of the 23 human chromosome pairs would need to be accessed: Chromosome 1, Chromosome 4, Chromosome 5, Chromosome 9, Chromosome 11, Chromosome 17, Chromosome 19 and Chromosome 22.
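The totals stated for the table above can be checked with a short tally. This is only a sketch: the rows are copied from the table at face value (including the two identical rows of 522 amino acids on Chromosome 9), and the `tally` helper and the `SPLICEOSOME_ROWS` name are hypothetical names introduced for illustration.

```python
# (amino acid count, chromosome) pairs copied from the table above.
SPLICEOSOME_ROWS = [
    (793, "22"), (462, "19"), (501, "1"), (683, "1"),
    (522, "9"), (2335, "17"), (972, "17"), (499, "19"),
    (522, "9"), (800, "11"), (586, "5"), (420, "5"),
    (112, "19"), (855, "19"), (758, "19"), (514, "4"),
]


def tally(rows):
    """Return (total amino acids, sorted list of distinct chromosomes)."""
    total = sum(count for count, _ in rows)
    chromosomes = sorted({chrom for _, chrom in rows}, key=int)
    return total, chromosomes


total_amino_acids, chromosomes = tally(SPLICEOSOME_ROWS)
print(total_amino_acids)  # 11334
print(chromosomes)        # ['1', '4', '5', '9', '11', '17', '19', '22']
```

The sum of the listed rows comes to 11,334, which is consistent with the "more than 9000" figure, and the eight distinct chromosomes match the list given in the text.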
The CORUM database page here gives details on 39 different proteins involved in just one part of the spliceosome, a part called the spliceosome C complex. The CORUM database page here gives details on 38 different proteins involved in another part of the spliceosome, a part called the spliceosome pre-B complex. The CORUM database page here gives details on 43 different proteins involved in another part of the spliceosome, a part called the spliceosome B complex. The CORUM database page here gives details on 113 different proteins involved in another part of the spliceosome, a part called the spliceosome A complex. The CORUM database page here gives details on 139 different proteins involved in another part of the spliceosome, a part called the spliceosome E complex. The paper here says, "The spliceosome is composed of as many as 300 distinct proteins and five RNAs, making it among the most complex macromolecular machines known."
Example # 3: RNA Polymerase II
The page here discusses RNA polymerase:
"RNA is a versatile molecule. In its most familiar role, RNA acts as an intermediary, carrying genetic information from the DNA to the machinery of protein synthesis. RNA also plays more active roles, performing many of the catalytic and recognition functions normally reserved for proteins. In fact, most of the RNA in cells is found in ribosomes--our protein-synthesizing machines--and the transfer RNA molecules used to add each new amino acid to growing proteins. In addition, countless small RNA molecules are involved in regulating, processing and disposing of the constant traffic of messenger RNA. The enzyme RNA polymerase carries the weighty responsibility of creating all of these different RNA molecules...RNA polymerase is a huge factory with many moving parts".
There are three different versions of RNA Polymerase: RNA Polymerase I, RNA Polymerase II, and RNA Polymerase III. In a previous post I discussed the proteins that make up the RNA Polymerase III complex: more than a dozen proteins with amino acid sequences specified across ten or more different chromosomes, with the complex requiring thousands of well-arranged amino acids. Below are the numbers of amino acids involved in RNA Polymerase II, which I looked up using the UniProt online database (you can use the links to check the numbers I have given):
Protein | Number of amino acids | Comment |
1970 | On Chromosome 17 | |
142 | On Chromosome 2 | |
172 | On Chromosome 11 | |
1174 | On Chromosome 4 | |
144 | On Chromosome 12 | |
125 | On Chromosome 19 | |
275 | On Chromosome 16 | |
117 | On Chromosome 7 | |
665 | On Chromosome 12 | |
210 | On Chromosome 19 | |
1393 | On Chromosome 15 | |
67 | On Chromosome 11 | |
212 | On Chromosome 6 | |
150 | On Chromosome 3 | |
(count not given) | On Chromosome 12 | |
2210 | On Chromosome 12 | |
2177 | On Chromosome X | |
2174 | On Chromosome 17 | |
2145 | On Chromosome 3 |
Requiring many more well-arranged amino acids than RNA Polymerase III, the RNA Polymerase II protein complex clearly requires more than 10,000 amino acids that have to be arranged in just the right way. The RNA Polymerase II structure is not specified in DNA, which merely specifies which amino acids make up each of the protein parts. The amino acid information needed to make RNA Polymerase II is not at all contiguous in DNA. To assemble RNA Polymerase II, among other wonders of construction a human body must magically gather genetic information scattered across many different chromosomes in the nucleus, like someone quickly finding just the right 15+ loose pages hidden in random books of 46 tall, long bookcases in a library. The table above shows that at least twelve of the 23 human chromosome pairs would need to be accessed: Chromosome 2, Chromosome 3, Chromosome 4, Chromosome 6, Chromosome 7, Chromosome 11, Chromosome 12, Chromosome 15, Chromosome 16, Chromosome 17, Chromosome 19, and Chromosome X.
Example # 4: Proteasomes
The wikipedia.org article on proteasomes tells us this:
"Proteasomes are protein complexes which degrade unneeded or damaged proteins by proteolysis, a chemical reaction that breaks peptide bonds...In structure, the proteasome is a cylindrical complex containing a 'core' of four stacked rings forming a central pore. Each ring is composed of seven individual proteins."
A paper on this topic is entitled "Gates, channels, and switches: elements of the proteasome machine." We read this:
"The proteasome has emerged as an intricate machine that has dynamic mechanisms to regulate the timing of its activity, its selection of substrates, and its processivity. The 19-subunit regulatory particle (RP) recognizes ubiquitinated proteins, removes ubiquitin, and injects the target protein into the proteolytic chamber of the core particle (CP) via a narrow channel."
Another paper is entitled "The 26S Proteasome: A Molecular Machine Designed for Controlled Proteolysis." A page on the site of the Theoretical and Computational Biophysics Group tells us this:
"Recycling of unneeded protein molecules in cells is performed by a molecular machine called 26S proteasome (Figure 1), which cuts these proteins into smaller pieces for reuse as building blocks for new proteins. Proteins that need to be recycled are labeled by tags made of poly-ubiquitin protein chains. The 26S proteasome machine recognizes and binds to these tags, pulls the tagged protein close, then unwinds it, and finally cuts it into pieces. As the cell's recycling machinery, the 26S proteasome is vital for a variety of essential cellular processes, including protein quality control, cell cycle regulation, adaptive immune response, and apoptosis....The 26S proteasome recruits, unfolds, and degrades poly-ubiquitin tagged proteins through a complex interaction clockwork of over 60 known protein subunits that is driven through ATP hydrolysis."
A scientific paper tells us this:
"The 26S proteasome is a multisubunit complex that catalyzes the degradation of ubiquitinated proteins. The proteasome comprises 33 distinct subunits, all of which are essential for its function and structure."
Below is a depiction of the human 26S proteasome structure, one that labels some of its protein parts. We see three different views of the same protein complex, with different protein parts labeled (the Greek letters used stand for alpha and beta parts mentioned in the table below):
Below are the numbers of amino acids involved in these parts, which I looked up using the UniProt online database (you can use the links to check the numbers I have given):
Protein | Number of amino acids | Comment |
241 | On Chromosome 6 | |
201 | On Chromosome 1 | |
205 | On Chromosome 17 | |
264 | On Chromosome 1 | |
263 | On Chromosome 14 | |
239 | On Chromosome 17 | |
248 | On Chromosome 20 | |
263 | On Chromosome 11 | |
234 | On Chromosome 7 | |
255 | On Chromosome 14 | |
261 | On Chromosome 15 | |
241 | On Chromosome 1 | |
243 | On Chromosome 14 | |
248 | On Chromosome 20 | |
The structure shown above clearly requires several thousand amino acids that have to be arranged in just the right way. The structure shown above is not specified in DNA, which merely specifies which amino acids make up each of the protein parts. The amino acid information needed to make the structure above (insufficient to specify the total structure) is not at all contiguous in DNA. To assemble the structure above, among other wonders of construction a human body must magically gather genetic information scattered across many different chromosomes in the nucleus, like someone quickly finding just the right 60 loose pages hidden in random books of 46 tall, long bookcases in a library. The table above shows that at least eight of the 23 human chromosome pairs would need to be accessed: Chromosome 1, Chromosome 6, Chromosome 7, Chromosome 11, Chromosome 14, Chromosome 15, Chromosome 17, and Chromosome 20.
Example # 5: The ATP Synthase Complex
Another example of accidentally unachievable molecular machinery in the human body is the ATP synthase protein complex. It's a very complex molecular motor system described in a paper entitled "ATP Synthase: Motoring to the Finish Line." The paper refers to this complex as a "sophisticated molecular machine." We read this: "ATP synthase is an unusually efficient rotary motor that synthesizes ATP at rates exceeding 100 molecules per second." Another scientific page tells us this:
"ATP synthase is one of the wonders of the molecular world. ATP synthase is an enzyme, a molecular motor, an ion pump, and another molecular motor all wrapped together in one amazing nanoscale machine. It plays an indispensable role in our cells, building most of the ATP that powers our cellular processes....Why have two motors connected together? The trick is that one motor can force the other motor to turn, and in this way, change the motor into a generator. "
Below are some of the components of ATP synthase, as listed in the UniProt database.
Protein | Number of amino acids | Comment |
51 | On Chromosome 20 | |
68 | From Mitochondrion | |
226 | From Mitochondrion | |
215 | On Chromosome 14 | |
108 | On Chromosome 21 | |
58 | On Chromosome 14 | |
246 | On Chromosome 12 | |
69 | On Chromosome 4 | |
161 | On Chromosome 17 | |
94 | On Chromosome 7 | |
529 | On Chromosome 12 | |
168 | On Chromosome 19 | |
553 | On Chromosome 18 | |
136 | On Chromosome 17 | |
298 | On Chromosome 10 | |
141 | On Chromosome 12 | |
213 | On Chromosome 21 | |
142 | On Chromosome 2 | |
256 | On Chromosome 1 | |
ATP Synthase seems to require thousands of amino acid parts arranged in just the right way, which amounts to a special arrangement of tens of thousands of atoms. The arrangement of the parts of the ATP Synthase complex is not specified in DNA, which does not specify which proteins are parts of particular protein complexes. To assemble the structure above, among other wonders of construction a human body must magically gather genetic information scattered across many different chromosomes in the nucleus, like someone quickly finding just the right 19 loose pages hidden in random books of 46 tall, long bookcases in a public library. The table above shows that at least 12 of the 23 human chromosome pairs would need to be accessed: Chromosome 1, Chromosome 2, Chromosome 4, Chromosome 7, Chromosome 10, Chromosome 12, Chromosome 14, Chromosome 17, Chromosome 18, Chromosome 19, Chromosome 20 and Chromosome 21.
Example # 6: The Origin Recognition Complex/Replicative Helicase Complex
Mammalian cells are so complicated they have been compared to factories or jet aircraft. The reproduction of most cells in the human body is a miracle of replication beyond the understanding of today's science. Scientists have confessed that they do not know what causes the fantastically complex process of cell reproduction. Scientists merely understand phases of such a process, and what components play a role in the process.
One of those components is called the origin recognition complex. The wikipedia.org article on this protein complex says this: "The origin recognition complex (ORC) is a highly conserved six subunits protein complex essential for the initiation of the DNA replication in eukaryotic cells." The ORC complex works as a team with a "replicative helicase" complex consisting of the six bottom rows on the table below. So the wikipedia.org article on the ORC complex lists all of the items in the table below as the items in the complete complex. Below are some details of these subunits:
Protein | Number of amino acids | Comment |
861 | On Chromosome 1 | |
577 | On Chromosome 2 | |
711 | On Chromosome 6 | |
436 | On Chromosome 2 | |
435 | On Chromosome 7 | |
252 | On Chromosome 16 | |
560 | On Chromosome 17 | |
904 | On Chromosome 3 | |
808 | On Chromosome 6 | |
863 | On Chromosome 8 | |
734 | On Chromosome 22 | |
821 | On Chromosome 2 | |
719 | On Chromosome 7 | |
Science writer Amber Dance skillfully describes the operations of the unit above:
"The average dividing cell must copy—perfectly—3.2 billion base pairs of DNA, about once every 24 hours. The cell’s replication machinery does an amazing job of this, copying genetic material at a lickety-split pace of some 50 base pairs per second. Still, that’s much too slow to duplicate the entirety of the human genome. If the cell’s copying machinery started at the tip of each of the 46 chromosomes at the same time, it would finish the longest chromosome—No. 1, at 249 million base pairs—in about two months. 'The way cells get around this, of course, is that they start replication in multiple spots,' says James Berger, a structural biologist...'But that poses its own challenge,' says Berger, 'which is, how do you know where to start, and how do you time everything?' Without precision control, some DNA might get copied twice, causing cellular pandemonium... It takes a tightly coordinated dance involving dozens of proteins for the DNA-copying machinery to start replication at the right point in the cell’s life cycle...Kicking off the process is a cluster of six proteins that sit down at the origins. Called ORC, this cluster is shaped like a double-layer ring with a handy notch that allows it to slide onto the DNA strands, Berger’s team has found...Once ORC has settled onto the DNA, it attracts a second protein complex: one that includes the helicase that will eventually unwind the DNA. Costa and colleagues used electron microscopy to work out how ORC lures in first one helicase, and then another. The helicases are also ring-shaped, and each one opens up to wrap around the double-stranded DNA. Then the two helicases close up again, facing toward each other on the DNA strands, like two beads on a string."
The molecular machinery shown above clearly requires more than five thousand amino acids that have to be arranged in just the right way. The structure of the molecular machinery described above is not specified in DNA, which merely specifies which amino acids make up each of the protein parts. The amino acid information needed to make the structure above (insufficient to make the 3D structure) is not at all contiguous in DNA. To assemble the structure above, among other wonders of construction a human body must magically gather genetic information scattered across many different chromosomes in the nucleus, like someone quickly finding just the right 14 loose pages hidden in random books of 46 tall, long bookcases in a library. The table above shows that at least nine of the 23 human chromosome pairs would need to be accessed: Chromosome 1, Chromosome 2, Chromosome 3, Chromosome 6, Chromosome 7, Chromosome 8, Chromosome 16, Chromosome 17 and Chromosome 22.
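The timing figures in the Amber Dance quotation above can be checked with a little arithmetic. The sketch below uses only the numbers she gives (50 base pairs per second, 249 million base pairs for chromosome 1, 3.2 billion base pairs in total, a 24-hour cycle). The function names are hypothetical, and the fork estimate rests on a loudly simplifying assumption: every copying site runs nonstop at the quoted rate for the whole 24 hours, so it is a rough lower bound, not a description of how replication is actually scheduled.

```python
SECONDS_PER_DAY = 24 * 60 * 60
COPY_RATE_BP_PER_SECOND = 50      # rate quoted by Dance
CHROMOSOME_1_BP = 249_000_000     # chromosome 1 length quoted by Dance
GENOME_BP = 3_200_000_000         # genome size quoted by Dance


def days_to_copy(base_pairs, rate=COPY_RATE_BP_PER_SECOND):
    """Days needed for one copying site working end to end."""
    return base_pairs / rate / SECONDS_PER_DAY


def min_sites_for_one_day(base_pairs, rate=COPY_RATE_BP_PER_SECOND):
    """Fewest copying sites needed to finish in 24 hours (assumes nonstop work)."""
    per_site = rate * SECONDS_PER_DAY
    return -(-base_pairs // per_site)  # ceiling division on integers


chr1_days = days_to_copy(CHROMOSOME_1_BP)
sites_needed = min_sites_for_one_day(GENOME_BP)
print(f"{chr1_days:.1f} days")  # 57.6 days
print(sites_needed)             # 741
```

The 57.6-day result agrees with the "about two months" in the quotation, and the second figure shows why copying has to begin at many spots at once.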
Example # 7: The Nuclear Pore Complex
The nuclear pore complex is a large protein complex found in the "nuclear envelope" that is the outer boundary of the nucleus inside human cells. The wikipedia.org article on this complex states that it consists of "456 individual protein molecules, and 34 distinct nucleoporin proteins." So the complex apparently requires 34 types of protein molecules. The article tells us that the "principal function of nuclear pore complexes is to facilitate selective membrane transportation of various molecules across the nuclear envelope." This means that nuclear pore complexes have the extremely complex job of acting like gatekeepers, letting the right kind of molecules get into the nucleus of the cell, and keeping out the wrong type of molecules. The article tells us that there are typically about 1000 of the nuclear pore complexes in every cell. We read of some impressive functionality of these nuclear pore complexes:
"Notably, the nuclear pore complex (NPC) can actively mediate up to 1000 translocations per complex per second. While smaller molecules can passively diffuse through the pores, larger molecules are often identified by specific signal sequences and are facilitated by nucleoporins to traverse the nuclear envelope."
The article tells us that a nuclear pore complex has a molecular weight of about 110 megadaltons. A dalton is a mass equal to one twelfth of the mass of a carbon-12 atom. A protein complex of 110 megadaltons would have the mass of about 9 million carbon atoms. Apparently the proteins that make up this complex are particularly complex proteins. Below are the exact numbers (we may assume that there are multiple instances of such proteins in a nuclear pore complex).
Protein | Number of amino acids | Comment |
1817 | On Chromosome 11 | |
1475 | On Chromosome 6 | |
819 | On Chromosome 16 | |
925 | On Chromosome 12 | |
2012 | On Chromosome 7 | |
1436 | On Chromosome 11 | |
2090 | On Chromosome 9 | |
656 | On Chromosome 17 | |
468 | On Chromosome 22 | |
741 | On Chromosome 17 | |
1156 | On Chromosome 1 | |
1391 | On Chromosome 5 | |
The molecular machinery shown above clearly requires more than 12,000 amino acids that have to be arranged in just the right way, which amounts to a special arrangement of more than 100,000 atoms. The structure of the molecular machinery described above is not specified in DNA, which merely specifies which amino acids make up each of the protein parts. The amino acid information needed to make the structure above is not at all contiguous in DNA. To assemble the structure above, among other wonders of construction a human body must magically gather genetic information scattered across many different chromosomes in the nucleus, like someone quickly finding just the right 34 loose pages hidden in random books of 46 tall, long bookcases in a library. The table above shows that at least ten of the 23 human chromosome pairs would need to be accessed: Chromosome 1, Chromosome 5, Chromosome 6, Chromosome 7, Chromosome 9, Chromosome 11, Chromosome 12, Chromosome 16, Chromosome 17 and Chromosome 22.
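The mass conversion given before the table (110 megadaltons expressed as a number of carbon atoms) is simple to verify. This minimal sketch assumes the standard definition of the dalton as one twelfth of the mass of a carbon-12 atom; the variable names are made up for illustration.

```python
DALTONS_PER_CARBON_ATOM = 12      # by definition, a carbon-12 atom is 12 daltons
NPC_MASS_DALTONS = 110_000_000    # about 110 megadaltons, as stated in the article

# How many carbon atoms would have the same total mass as one complex.
carbon_atom_equivalents = NPC_MASS_DALTONS / DALTONS_PER_CARBON_ATOM
print(f"{carbon_atom_equivalents / 1e6:.2f} million")  # 9.17 million
```

The result, roughly 9.17 million, agrees with the "about 9 million carbon atoms" figure.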
Six Reasons These Molecular Machines And Their Behavior Are Accidentally Unachievable
There are six main reasons why we must regard the molecular machines described above as accidentally unachievable.
Reason #1: Chance processes such as Darwinian evolution could never produce the genes needed to make the proteins that make up such molecular machines (the gene origination problem). To perform the task a particular protein molecule performs, a type of protein molecule typically requires some specific fine-tuned gene, an amino acid sequence with most or nearly all of the protein's actual amino acid sequence, a chain of hundreds or thousands of amino acids specially arranged to produce a functional effect. Evolutionary biologist Richard Lewontin stated, "It seems clear that even the smallest change in the sequence of amino acids of proteins usually has a deleterious effect on the physiology and metabolism of organisms." A biology textbook tells us, "Proteins are so precisely built that the change of even a few atoms in one amino acid can sometimes disrupt the structure of the whole molecule so severely that all function is lost." And we read on a science site, "Folded proteins are actually fragile structures, which can easily denature, or unfold." Another science site tells us, "Proteins are fragile molecules that are remarkably sensitive to changes in structure." A paper describing a database of protein mutations tells us that "two thirds of mutations within the database are destabilising." Those who think that functional folded protein molecules could gradually arise (getting longer and longer from a small size) will be dismayed to read this statement in a 900+ page textbook on protein chemistry: "Polypeptides less than about 70 amino acids in length should not fold because they should not be able to bury a large enough number of hydrophobic amino acids to overcome the configurational entropy of their random coils." Folding is required for most functional protein molecules.
Accordingly, we cannot explain the origin of genes through some gradualism approach that imagines that first there was one tenth of the gene that was useful for one purpose, and then there was two tenths of the gene that were useful for some other purpose, and then finally we got the version of the gene that humans now have. Human genes with only half of their base pairs or a third of their base pairs are not useful, and their corresponding protein molecules are not useful with half of their amino acids.
But how hard would it be to get by chance or random mutations an amino acid sequence that would be the core of a useful protein molecule? That depends on the number of amino acids in the protein. Here we run into a simple principle that is the bane of all theories of accidental biological origins: the principle that a simple linear increase in the number of parts that must be well-arranged results in an exponential or geometric increase in the unlikelihood of such an arrangement occurring by chance. A small increase in the number of parts quickly results in what is called a combinatorial explosion, in which the number of possible combinations skyrockets. This is why computer security experts often tell you to use at least 14 characters for the password of any financial account. If you change your password from 7 characters to 14 characters, that doesn't make it merely twice as hard for a hacker trying all combinations to break into your account; instead it is roughly 10,000,000,000 times harder (for a password drawn from the 26 letters of the alphabet).
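The password comparison can be worked through directly. The sketch below assumes a password made only of the 26 lowercase letters, which is the assumption under which the "roughly 10,000,000,000" figure holds; a larger character set would make the ratio bigger still. The `search_space` helper is a hypothetical name.

```python
def search_space(alphabet_size, length):
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length


ALPHABET_SIZE = 26  # assumption: lowercase letters only

# How many times larger the 14-character space is than the 7-character space.
ratio = search_space(ALPHABET_SIZE, 14) // search_space(ALPHABET_SIZE, 7)
print(ratio)  # 8031810176
```

Doubling the length multiplies the number of possibilities by 26 to the 7th power, about 8 billion, rather than by two.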
The chart below shows some of the relevant mathematics. If you doubt these numbers, you can verify them using the Large Exponents Calculator here. Since there are 20 different amino acids used in proteins, you use 20 in the first row of such a calculator. Numbers such as E+6 refer to powers of ten. So 3.2 E+6 means 3,200,000; 1.024 E+13 means 10,240,000,000,000; and E+26 means 1 followed by 26 zeros. The bottom of the chart is a number of combinations equal to about 1 followed by more than 2600 zeros.
Number of amino acids in a molecule | Number of possible combinations of the molecule's amino acids |
5 | 3.2 E+6 |
10 | 1.024 E+13 |
20 | 1.048576 E+26 |
40 | 1.099511627 E+52 |
80 | 1.208925819 E+104 |
160 | 1.461501637 E+208 |
320 | 2.135987035 E+416 |
640 | 4.562440617 E+832 |
1280 | 2.081586438 E+1665 |
2000 | 1.148130695 E+2602 |
We can see from the chart above that the odds become utterly prohibitive once you start to get amino acid lengths much longer than about 160. Even if you very generously assume that a particular protein molecule only needs to have half of its amino acid sequence matching its actual sequence (an assumption too generous because of what we know about the sensitivity of protein molecules to small changes), you still have a case where we should never expect chance processes to produce successful amino acid sequences (corresponding to functional protein molecules) as long as 320 amino acids.
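For readers who would rather check the chart without an online calculator, here is a minimal sketch that raises 20 to each length using exact integer arithmetic. The function name `combinations_sci` is hypothetical, and the mantissa is truncated (not rounded) to nine decimal places, which appears to be how the chart's figures were produced.

```python
def combinations_sci(n_amino_acids, alphabet=20, decimals=9):
    """Format alphabet**n_amino_acids as 'd.ddd E+x', truncating the mantissa."""
    digits = str(alphabet ** n_amino_acids)   # exact big-integer value
    exponent = len(digits) - 1                # power of ten
    fraction = digits[1:1 + decimals].rstrip("0")
    mantissa = digits[0] + ("." + fraction if fraction else "")
    return f"{mantissa} E+{exponent}"


for length in (5, 10, 20, 40, 80, 160, 320, 640, 1280, 2000):
    print(length, combinations_sci(length))
```

Each added amino acid multiplies the count by 20, so the exponent grows by about 1.3 per position.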
In most of the protein complexes described above, we have some very complex proteins consisting of very long amino acid chains that we should never expect to have arisen by chance or Darwinian processes, never in the entire visible universe even given billions of years. Specifically:
- One of the complexes (the spliceosome) had a protein consisting of 2335 well-arranged amino acids.
- Another of the complexes (the apoptosome) had a protein consisting of 1248 well-arranged amino acids.
- The nuclear pore protein complex had one protein requiring 2090 well-arranged amino acids, and another protein requiring 2012 well-arranged amino acids, along with three other types of proteins each requiring more than 1000 well-arranged amino acids.
- The origin recognition complex/replicative helicase complex required 7 types of proteins that each required more than 700 well-arranged amino acids.
- The RNA polymerase II protein complex described above had five types of proteins each requiring more than 2000 well-arranged amino acids, and three other types of proteins each requiring more than 1000 well-arranged amino acids.
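The same counting can be applied to the specific protein lengths listed above. The sketch below works with base-10 logarithms so the numbers stay manageable. It assumes, as the chart does, that each position is drawn independently and uniformly from 20 amino acids; the `required_fraction` parameter is a hypothetical knob standing in for the "only half of the sequence must match" allowance discussed earlier, and the labels in the dictionary are illustrative names only.

```python
import math

AMINO_ACID_TYPES = 20  # assumption: uniform choice among 20 amino acids per position


def log10_sequence_space(length: int, required_fraction: float = 1.0) -> float:
    """Base-10 log of the number of sequences over the constrained positions."""
    constrained_positions = length * required_fraction
    return constrained_positions * math.log10(AMINO_ACID_TYPES)


# Lengths taken from the list above; the labels are illustrative only.
protein_lengths = {
    "spliceosome protein": 2335,
    "apoptosome protein": 1248,
    "nuclear pore protein (first)": 2090,
    "nuclear pore protein (second)": 2012,
}

for name, length in protein_lengths.items():
    full = log10_sequence_space(length)
    half = log10_sequence_space(length, required_fraction=0.5)
    print(f"{name}: {length} amino acids, about 10^{full:.0f} sequences "
          f"(about 10^{half:.0f} if only half the positions must match)")
```

With `required_fraction=0.5`, a 320-amino-acid protein gives the same exponent as the 160 row of the chart, which is the case the text describes.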
Gene transcription in a cell occurs quickly. The source here lists a time of ten minutes for a gene to be transcribed in a mammalian cell, but another source lists a time of only about a minute. The great majority of that time is used up by the reading of base pairs from the gene, with typically more than 1000 base pairs being read each time a gene is transcribed. The finding of the correct gene to read in DNA seems to take only seconds, or at most a few minutes.
Descriptions of DNA transcription fail to explain a huge issue: how does a cell find the right gene in DNA so quickly? Human DNA contains more than 20,000 genes, each of which is just a section of the DNA. The DNA is like an extremely long necklace of many thousands of beads, and a typical gene is like a group of several hundred of those beads. We should actually imagine multiple such necklaces, because DNA is scattered across 23 different chromosome pairs. Now if genes had gene numbers, and DNA was a set of numbered genes in numerical order, it might be easy to find a particular gene. So if a cell knew that it was trying to find gene number 4,233, it could use a binary search method that would allow it to find that gene pretty quickly.
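The difference between an indexed lookup and an unindexed one can be sketched in code. The example below is purely an illustration of the hypothetical scenario in the paragraph above: it pretends genes carry sorted numbers from 1 to 20,000 and counts how many items each search strategy inspects before finding gene number 4,233. The function names and the probe counting are assumptions of this sketch, not a model of any cellular mechanism.

```python
from typing import List, Tuple


def linear_search(items: List[int], target: int) -> Tuple[int, int]:
    """Scan from the start; return (index, number of items inspected)."""
    probes = 0
    for index, value in enumerate(items):
        probes += 1
        if value == target:
            return index, probes
    return -1, probes


def binary_search(sorted_items: List[int], target: int) -> Tuple[int, int]:
    """Halve the sorted range each step; return (index, number of items inspected)."""
    low, high = 0, len(sorted_items) - 1
    probes = 0
    while low <= high:
        middle = (low + high) // 2
        probes += 1
        if sorted_items[middle] == target:
            return middle, probes
        if sorted_items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return -1, probes


# Hypothetical numbered genes, sorted in numerical order.
gene_numbers = list(range(1, 20001))

index_linear, probes_linear = linear_search(gene_numbers, 4233)
index_binary, probes_binary = binary_search(gene_numbers, 4233)
print(f"linear scan inspected {probes_linear} items")
print(f"binary search inspected {probes_binary} items")
```

Binary search needs at most 15 inspections for 20,000 sorted items, but it only works when the data is sorted and addressable by position.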
But no such method can be used within the human body. Genes do not have gene numbers that can be accessed within the human body, and DNA is not numerically sorted. DNA has no indexes that might allow a cell to find some particular gene that it was trying to find within DNA. So we have an explanatory "needle in a haystack" problem. Or we might call it a "needle in the haystacks" problem, because human DNA is scattered across 23 different chromosome pairs, as shown in the diagram below:
A scientific text tells us some information that makes this explanatory problem seem more pressing:
"One might have predicted that the information present in genomes would be arranged in an orderly fashion, resembling a dictionary or a telephone directory. Although the genomes of some bacteria seem fairly well organized, the genomes of most multicellular organisms, such as our Drosophila example, are surprisingly disorderly. Small bits of coding DNA (that is, DNA that codes for protein) are interspersed with large blocks of seemingly meaningless DNA. Some sections of the genome contain many genes and others lack genes altogether. Proteins that work closely with one another in the cell often have their genes located on different chromosomes, and adjacent genes typically encode proteins that have little to do with each other in the cell. Decoding genomes is therefore no simple matter. Even with the aid of powerful computers, it is still difficult for researchers to locate definitively the beginning and end of genes in the DNA sequences of complex genomes, much less to predict when each gene is expressed in the life of the organism. Although the DNA sequence of the human genome is known, it will probably take at least a decade for humans to identify every gene and determine the precise amino acid sequence of the protein it produces. Yet the cells in our body do this thousands of times a second."
We have here a very severe navigation problem. A cell is somehow able to find the right gene in only seconds or a few minutes when a new protein is made, even though DNA and chromosomes seem to have no physical organization that could allow for such blazingly fast access to the right information. In an article on Chemistry World, we read this:
"How does the machinery that turns genes into proteins know which part of the genome to read in any given cell type? ‘To me that is one of the most fundamental questions in biology,’ says biochemist Robert Tjian of the University of California at Berkeley in the US: ‘How does a cell know what it is supposed to be?’"
Biochemist Tjian has spoken just as if he had no idea how it is that a cell is able to navigate to the right place to read a particular gene in DNA. Later in the article we read this:
"For one thing, the regulatory machinery ‘is unbelievably complex’, says Tjian, comprising perhaps 60–100 proteins – mostly of a class called transcription factors (TFs) – that have to interact before anything happens. ....As well as promoters, mammalian genes are controlled by DNA segments called enhancers. Some proteins bind to the promoter site, others bind to the enhancer, and they have to communicate. ‘This is where things get bizarre, because the enhancer can sit miles away from the promoter,’ says Tjian – meaning, perhaps, millions of base pairs away, maybe with a whole gene or two in between. And the transcription machinery can’t just track along the DNA until it hits the enhancer, because the track is blocked. In eukaryotes, almost all of the genome is, at any given moment, packaged away by being wrapped around disk-shaped proteins called histones. These, says Tjian, ‘are like big boulders on the track’: you can’t get past them easily.... ‘Even after 40 years of studying this stuff, I don’t think we have a clear idea of how that looping happens,’ says Tjian. Until recently, the general idea was that the TFs and other components all fit together into a kind of jigsaw, via molecular recognition, that will bridge and bind a loop in place while transcription happens. ‘We molecular biologists love to draw nice model schemes of how TFs find their target genes and how enhancers can regulate promoters located millions of base pairs away,’ says Ralph Stadhouders of the Erasmus University Medical Centre in Rotterdam, the Netherlands. ‘But exactly how this is achieved in a timely and highly specific manner is still very much a mystery.’ "
Later in the article Tjian says he was shocked by the speed at which some of the process occurs. He expected it would take hours, but found something quite different:
"‘The residence times of these proteins in vivo was not minutes or hours, but about six seconds!’, he says. ‘I was so shocked that it took me months to come to grips with my own data. How could a low-concentration protein ever get together with all its partners to trigger expression of a gene, when everything is moving at this unbelievably rapid pace?’ "
The rest of the article is just some speculation, which Tjian mostly knocks down, and the article itself calls "hand-wavy." We are left with the impression that no one understands how cells are able to instantly find the right gene.
- The one-dimensional organization of amino acids found in the sequence of amino acids that makes up a protein;
- the three-dimensional organization of such a sequence to make a complex folded three-dimensional shape needed for a particular protein molecule to function properly;
- the entirely different three-dimensional organization needed for the proteins of a protein complex to fit together in the right way to make a physical arrangement so complex that it may be called a "molecular machine."
- "The majority of cellular proteins function as subunits in larger protein complexes. However, very little is known about how protein complexes form in vivo." Duncan and Mata, "Widespread Cotranslational Formation of Protein Complexes," 2011.
- "While the occurrence of multiprotein assemblies is ubiquitous, the understanding of pathways that dictate the formation of quaternary structure remains enigmatic." -- Two scientists (link).
- "A general theoretical framework to understand protein complex formation and usage is still lacking." -- Two scientists, 2019 (link).
- "Protein assemblies are at the basis of numerous biological machines by performing actions that none of the individual proteins would be able to do. There are thousands, perhaps millions of different types and states of proteins in a living organism, and the number of possible interactions between them is enormous...The strong synergy within the protein complex makes it irreducible to an incremental process. They are rather to be acknowledged as fine-tuned initial conditions of the constituting protein sequences. These structures are biological examples of nano-engineering that surpass anything human engineers have created. Such systems pose a serious challenge to a Darwinian account of evolution, since irreducibly complex systems have no direct series of selectable intermediates, and in addition, as we saw in Section 4.1, each module (protein) is of low probability by itself." -- Steinar Thorvaldsen and Ola Hössjer, "Using statistical methods to model the fine-tuning of molecular machines and systems," Journal of Theoretical Biology.