Future and Cosmos: Does Some Mysterious Life Force Cause Protein Folding and Morphogenesis?

One of my more popular recent posts was entitled "A Table Listing 50 Things Science Cannot Explain." In it I discussed fifty puzzling mysteries modern science has not been able to solve. But my table failed to mention one of the biggest unsolved mysteries in science: the mystery of protein folding.

Proteins are the basic molecular building blocks of the body. Each protein is specified in DNA by a gene: a one-dimensional series of nucleotide base pairs that stands for a series of amino acids (the building blocks of proteins). In the visual below, we see a simple gene that specifies the contents of a protein, and (in colors of orange, blue, yellow, and green) some of the nucleotide base pairs that make up the gene. Genes reside on chromosomes, which are found in nearly every cell.

Certain three-letter combinations of the nucleotide bases (guanine, cytosine, adenine, and thymine) represent particular amino acids, according to the genetic code that cells use. A protein is composed of these amino acids. Below is a schematic diagram of a small hypothetical protein. Other proteins might consist of not just 13 amino acids, but hundreds of them.

Protein 343: Lysine, Valine, Glycine, Leucine, Threonine, Leucine, Tyrosine, Serine, Valine, Leucine, Lysine, Serine, Threonine
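To make the idea of a gene "standing for" a series of amino acids concrete, here is a minimal Python sketch. The codon assignments are taken from the standard genetic code, but the table is deliberately partial (only the codons this toy example needs), and the gene sequence and all names are hypothetical, invented so that the result matches the 13 amino acids of Protein 343. Real genes also involve start and stop codons, which this sketch ignores.

```python
# Partial codon table (standard genetic code, DNA coding-strand letters).
# Only the codons used by the toy gene below are listed.
CODON_TO_AMINO_ACID = {
    "AAA": "Lysine", "AAG": "Lysine",
    "GTT": "Valine", "GTG": "Valine",
    "GGT": "Glycine",
    "CTT": "Leucine", "CTG": "Leucine", "TTA": "Leucine",
    "ACT": "Threonine", "ACC": "Threonine",
    "TAT": "Tyrosine",
    "TCT": "Serine", "AGC": "Serine",
}

# Hypothetical gene, written codon by codon for readability (an assumption,
# not a sequence from any real organism).
HYPOTHETICAL_GENE = (
    "AAA GTT GGT CTT ACT CTG TAT TCT GTG TTA AAG AGC ACC".replace(" ", "")
)


def translate(dna: str) -> list[str]:
    """Read the DNA three letters at a time and look up each amino acid."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [CODON_TO_AMINO_ACID[dna[i:i + 3]] for i in range(0, len(dna), 3)]


if __name__ == "__main__":
    print(" - ".join(translate(HYPOTHETICAL_GENE)))
```

Note that the output is nothing more than an ordered list of names: the lookup says which amino acids appear and in what order, and nothing about the shape the chain will take.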

Now, based on what you have just been told, you might think that proteins are long string-like molecules, like the DNA molecule. In other words, you might think that a protein looks like the chain we see in the visual below. A series of amino acids such as this, existing merely as a wire-like length, is sometimes called a polypeptide chain.

But protein molecules instead typically have intricate three-dimensional shapes. So a protein molecule isn't shaped like a simple length of copper wire – it looks more like some intricate copper wire sculpture that some artisan might make.

Below are two examples of the 3D shapes that protein molecules can take. There are countless different variations.

The phenomenon of a protein molecule forming into a 3D shape is called protein folding. How would you make an intricate 3D sculpture from a long length of copper wire? You would do a lot of folding and bending of the wire. Something similar seems to go on with protein folding, causing the one-dimensional series of amino acids in a protein to end up as a complex three-dimensional shape.

The question is: how does this happen? This is the protein folding problem. Biochemists have been knocking their heads against this problem for 50 years, and have made very little progress in solving it.

An idea considered very early in the investigation of the protein folding problem is that after a protein is created, it reaches a particular shape through a kind of trial and error chemical “search.” It was thought that maybe newly created protein molecules cycle through different structural arrangements until one stable arrangement is found, and that the protein shapes we observe are just the results of such a search.

But this idea was ruled out pretty quickly. In 1969 scientist Cyrus Levinthal calculated that a protein with about 100 amino acids could be folded into about 3 to the 198th power shapes. Trying so many possibilities would take very many years – eons actually. But instead a particular protein will form into a characteristic three-dimensional shape in a very short time – seconds for small proteins, and minutes for large proteins. This discrepancy between the calculated time protein folding should take and the actual time it does take is known as Levinthal's paradox.
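The scale of Levinthal's number can be checked with a few lines of Python. This is only a rough sketch: the count of shapes is the figure quoted above, while the sampling rate (ten trillion shapes tried per second) and every name in the code are assumptions chosen for illustration, not values taken from Levinthal.

```python
# Number of shapes quoted above for a protein of about 100 amino acids.
CONFORMATIONS = 3 ** 198

# Assumption: the chain tries one new shape every tenth of a picosecond.
SAMPLES_PER_SECOND = 10 ** 13

# 365.25 days expressed in seconds.
SECONDS_PER_YEAR = 31_557_600


def exhaustive_search_years(
    conformations: int = CONFORMATIONS,
    rate: int = SAMPLES_PER_SECOND,
) -> float:
    """Years needed to try every shape once at the assumed sampling rate."""
    return conformations / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"shapes to try: a number with {len(str(CONFORMATIONS))} digits")
    print(f"years needed:  {exhaustive_search_years():.1e}")
```

Even with so generous a sampling rate, the search would last more than 10 to the 73rd power years, which is why a blind trial and error search cannot account for folding that finishes in seconds or minutes.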

Levinthal's paradox bothered those working on the problem of protein folding, and they developed an idea to soothe their pain in this regard. The idea was called Anfinsen's dogma. The idea behind Anfinsen's dogma is that the three-dimensional shape of a protein is determined solely by the sequence of amino acids in it.

We can represent Anfinsen's dogma with the visual below. The idea is that given some particular chain of amino acids (also called a polypeptide chain), and given the laws of chemistry, you will automatically get some particular complicated 3D shape like the shape below.

Anfinsen's Dogma

Scientists have had more than 40 years to prove Anfinsen's dogma. But they still haven't proven it. We still have essentially no understanding of how laws of chemistry could make it so that some particular chain of amino acids would have to take the complicated 3D shapes we see in proteins.

It seems likely that if Anfinsen's dogma were true, scientists would have proven it by now, and would have solved the protein folding problem. For long ago scientists were able to come up with two sets of data: the three-dimensional shapes of many proteins, and the chain of amino acids that corresponds to each such shape. Now given these two sets of data, it would not take very long to unravel some set of chemical rules forcing the chain of amino acids to have that particular shape, if such rules existed.

If it were true that 3D protein shapes are determined solely by a string of amino acids and a set of chemical laws, this would be something resembling encoding. One of the principles of cryptography is that if you have a large number of samples showing information before it is encoded, and the corresponding information after it is encoded, then it is usually fairly easy to figure out what rules are being used for the encoding.

Here is an example. Imagine we have a table with many cases like those below. Given such data, it is quite easy to figure out what the unstated rules are that are causing these transformations. In this case the rules are (1) shift each letter forward by one position in the alphabet; (2) then reverse the order of the resulting letters.

Before Rules Are Applied	After Rules Are Applied
cat	ubd
frog	hpsg
desk	ltfe
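The two rules can be written out as a short Python sketch, which also shows that they reproduce every row of the table. The function names are hypothetical, and two details are assumptions the table does not settle: the input is lowercase a to z only, and the letter z wraps around to a.

```python
from string import ascii_lowercase


def encode(word: str) -> str:
    """Apply rule 1 (shift each letter forward by one), then rule 2 (reverse)."""
    # Assumption: lowercase a-z only, with 'z' wrapping around to 'a'.
    shifted = "".join(
        ascii_lowercase[(ascii_lowercase.index(ch) + 1) % 26] for ch in word
    )
    return shifted[::-1]


def decode(coded: str) -> str:
    """Undo the rules in the opposite order: reverse, then shift back by one."""
    unreversed = coded[::-1]
    return "".join(
        ascii_lowercase[(ascii_lowercase.index(ch) - 1) % 26] for ch in unreversed
    )


if __name__ == "__main__":
    for plain in ("cat", "frog", "desk"):
        print(plain, "->", encode(plain))
```

Because each rule is simple and reversible, three before-and-after pairs are already enough to guess the rules and confirm them.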

Now, in the case of protein folding, we have something similar to the table below. For quite a few years we have known the following things about many proteins: (1) the one-dimensional sequence of amino acids that make up the protein (the polypeptide chain); (2) the three-dimensional shape that the protein takes. Given such data, and given modern computerized technology, it should be fairly easy to figure out what rules of chemistry might cause a sequence of amino acids to become a particular three-dimensional shape – if indeed the three-dimensional shape is actually determined merely by a combination of that sequence of amino acids and some rules of chemistry.

For decades many scientists have been trying to solve this problem, and they have even used super-fast supercomputers to try to solve it. But they still have not come up with the answer to the protein folding problem. We know exactly what the benchmark for success would be: if you had the problem solved, then given a sequence of amino acids that make up a protein, you would be able to correctly predict the three-dimensional shape of the protein (without having seen it). No scientist has done anything close to achieving such a benchmark.

But given the amount of effort that has been applied to the protein folding problem, it seems that a solution to the problem should have been discovered if Anfinsen's dogma is correct.

As one scientist states:

Anfinsen’s dogma ...says that the amino acid sequence of the peptide dictates the folding. Were that true the “protein folding problem” would have been understood by now. In fact, predicting the folded structure is still an unsolved problem...

Page 23 of this document tells us this: “Small changes (e.g. replacement of one amino acid by another) can sometimes cause large changes in structure (this is not always true).” This is not what we would expect if the three-dimensional structure of a protein was being determined solely by its sequence of amino acids. Speaking of 3D protein shapes, page 45 of the same document tells us, “There are many cases of nearly identical structures sharing no sequence similarity.” This is also not what we would expect if the three-dimensional structure of a protein was being determined solely by its sequence of amino acids.

Then there is the existence of what are called metamorphic proteins. These are cases in which a protein can have different three-dimensional shapes or folds, despite there being a single underlying sequence of amino acids. This paper estimates that as many as 4% of proteins are metamorphic or "fold-switching" proteins.

Therefore, it is likely that Anfinsen's dogma is not correct. The structure of a three-dimensional protein is probably not determined merely by the combination of the sequence of amino acids in it and some laws of chemistry.

What is astonishing is that both in the case of protein folding and in the case of morphogenesis (the progression from a fertilized egg to a human baby), we seem to have a crucially important structural progression that is almost completely unexplained by modern science. Morphogenesis cannot occur merely by reading instructions from DNA, because contrary to popular misconceptions, DNA cannot store any blueprint or sequential list of assembly instructions for making complex organs. DNA is basically just a long sequential list of amino acids, and there seems to be no way of stating in DNA any such thing as a 3D blueprint. See here for several reasons why the whole idea of a body plan stored in DNA is erroneous. DNA is like a big stack of cards in which each card simply has printed on it the name of an amino acid. You can't express body plans or 3D blueprints with such a thing. Expressing a body plan would require some language vastly more expressive than the “bare bones,” minimalist, “amino acid” language in which DNA is written.

What goes on in morphogenesis (the progression from a fertilized egg to a baby) is something currently inexplicable and profoundly mysterious. Somehow a barely visible speck (a fertilized egg) progresses to become a full baby, but we don't even know from where it gets the body plan for a human being, which does not seem to be stored in DNA (something that seems to be mainly an ingredient list, not a structural blueprint).

It is only reasonable to wonder whether such structural progressions occur from some vitally important component of nature that is completely unknown to us. The attempts of scientists to try to explain these mysterious progressions in terms of what is known (rather than admitting that there must be some great fundamental unknown) are clumsy and unconvincing. When they find that what we know is insufficient to explain what we observe, our physicists and cosmologists seem to have no problem introducing hypothetical ideas such as dark matter and dark energy, in an effort to fill the gap. But our biologists seem to be lacking in theoretical imagination, and seem to follow the rule: always try to explain observed biological effects by using only what has been discovered already, no matter how far-fetched such explanations may be.

Imagine you are an astronaut who travels to some strange planet revolving around another star. You notice an astonishing thing. Whenever it snows heavily on this planet, the snow forms into snowmen. So when it snows heavily, a large field might fill up with 10 snowmen, even though no visible person is building such snowmen. Faced with such a reality, you could mess around with bizarre theories trying to explain such an outcome naturally (such as far-fetched “special wind pattern” type of nonsense). Or you could be candid and admit the strange reality: that there is some mysterious force on the alien planet that likes to build snowmen out of snow. Similarly, given our inability to explain protein folding and morphogenesis, we might be wise to admit that there is some mysterious life force on our planet that wants to turn polypeptide chains into the 3D protein molecules needed for life, and that wants to turn a newly fertilized egg into a baby.

Below is a relevant quotation from a medical doctor (the late Ian Stevenson, MD):

Genes provide the instructions for proteins. From them alone, however, we have no understanding of how proteins develop their complicated three-dimensional structure. Even less does our knowledge of genes explain how proteins and other metabolites become organized into cells and then into highly differentiated tissues and the complicated organs that comprise our bodies. Present knowledge of genes tells us almost nothing about embryology and morphology, which is the science of the forms that organisms have. Some geneticists are not modest in assuring us that they will in time supply all the information we need to understand embryology and morphology. This amounts to a promissory note with no immediate cash value, and in the meantime we are free to consider the possibility of other contributory factors.

The statement was made in 1997, but is just as applicable today. And since the decoding of the human genome was completed in 2003, it is hard to believe claims that the answers to such questions will be found in DNA. We've already analyzed human DNA and its gene information completely, and the answers to the riddles of human structure and 3D protein structure aren't there.

Here is an interesting analogy. Let us imagine some person who has come to America from some primitive land. The person lives next to a car factory, and he is curious about how cars are made, something he knows nothing about. But he is too embarrassed to reveal his ignorance by asking. So he decides to break into the car factory in the middle of the night, on several different nights. One night he sees a bunch of parts lying around on the factory floor. The next night he sees a partially assembled chassis on the assembly line. The next night he sees a half-assembled car on the assembly line. The last night he sees a nearly completed car.

If this person had no knowledge of how factories worked, and was lacking in imagination, he might hazard a guess that would be very wrong. He might think somehow the parts knew how to assemble themselves into the car. Our scientists may be making a very similar mistake. The structural progression known as protein folding may occur only because some mysterious agency or some mysterious force acts to form polypeptide chains into complicated 3D shapes. The structural progression known as morphogenesis may occur only because some mysterious agency or some mysterious force acts to gradually change a newly fertilized egg into a full-size human baby over the course of nine months. But knowing nothing about such an agency or force, our scientists try fruitlessly to explain these structural progressions based on only what they do know.

Almost as inexplicable is the progression from the densely packed and disorganized “quark soup” of the Big Bang to the universe we see today filled with beautiful orderly spiral galaxies. The diagram below compares these three structural progressions, and asks whether all three of them involve some mysterious structural forces completely beyond our current understanding.

A new scientific study tells us that protein folding is "surprisingly more complex than previously known." We are told, "The JILA team identified 14 intermediate states—seven times as many as previously observed—in just one part of bacteriorhodopsin, a protein in microbes that converts light to chemical energy and is widely studied in research." It's as if some invisible factory worker (highly trained and very knowledgeable) was running through some very complex series of assembly instructions (not stored in DNA) to make the 3D shape of a protein out of the linear sequence of amino acids listed in DNA. We find again and again that our existence or our health depends on the final output being just right. How improbable is it that blind chemistry is all that is involved here?

Postscript: In yet another case of the premature triumphalism so common in science journalism, the New York Times gives us this wildly inaccurate claim:

They've been stumped by one great mystery: how the building blocks in proteins take their final shape. David Baker...has been investigating that enigma for a quarter of a century. Now, it looks as if he and his colleagues have cracked it.

No, not at all, for two reasons:

(1) The article later tells us that the reported success is only for "short-chained proteins." The median length of a protein in the human body is 431 amino acids. The heart of the problem is how proteins with hundreds of amino acids are able to form their 3D shapes. A solution working only for short-chained proteins isn't any real solution of the problem.
(2) The reported technique uses database techniques in which statistical data on a vast number of proteins is collected and used to predict a 3D shape. But from the standpoint of answering how nature could do protein folding, such a technique is kind of cheating. A polypeptide sequence has no database of thousands of protein shapes it can use when a 3D protein shape appears from a linear polypeptide chain of amino acids. The protein folding problem has always been: how could a complex shape appear from only the data in a linear polypeptide chain of amino acids?

Friday, April 14, 2017
