Some readers may be thinking along these lines: That's not such a mystery. Given a primordial soup and millions of years of time, there developed some self-replicating molecule. Once you had that, the development of everything else was just a case of things evolving from the simple to the more complex.
But such a glib explanation glosses over the huge difficulties involved in explaining how life began on the early Earth billions of years ago. In recent decades scientists have made relatively little progress in solving this problem.
Consider the progress of astronomy during the past 50 years. Since 1963 we have seen the discovery of the cosmic background radiation that confirmed the Big Bang, the discovery that the expansion of the universe is accelerating, and the discovery of more than 1000 extrasolar planets. But without doing a Google search, can you name one bit of progress that has been made in the past 50 years regarding the origin of life? You probably can't. When most of us think of scientific work on the origin of life, we think back to the Miller experiments involving amino acids, but they were done in the 1950s.
We can divide up the problem of the origin of life into three different problems: a necessary components problem, a combinatorial problem, and a computational problem.
The Necessary Components Problem
The basic units of life (below the cellular level) are things such as RNA, DNA, and proteins. Proteins are made of building blocks called amino acids. Some proteins are extremely complicated molecules built from very many amino acids. It was calculated long ago that the chance of some of these proteins forming from random combinations of amino acids is incredibly low, even given billions of years. But that's not necessarily a problem, because proteins are formed using the instructions in DNA. A DNA molecule is like a library of recipe books, with each of the recipes being a recipe for making a particular type of protein.
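To get a feel for the size of the numbers behind that kind of calculation, here is a minimal Python sketch. It rests on simplifying assumptions that are mine and not taken from any particular published estimate: each position in the chain is drawn uniformly from the 20 standard amino acids, and only one exact target sequence counts as a success. The function names are purely illustrative.

```python
import math

AMINO_ACIDS = 20  # number of standard amino acids used in proteins


def sequence_count(length: int) -> int:
    """Number of distinct amino acid chains of the given length."""
    return AMINO_ACIDS ** length


def log10_chance(length: int) -> float:
    """log10 of the chance that one uniformly random chain matches
    one specific target sequence (assumption: uniform, independent draws)."""
    return -length * math.log10(AMINO_ACIDS)


if __name__ == "__main__":
    for n in (10, 50, 100, 300):
        print(f"{n:>3} residues: about 1 chance in 10^{-log10_chance(n):.0f}")
```

Under these assumptions a chain of 100 amino acids has roughly 10^130 possible sequences. A real estimate would also have to consider how many different sequences could do the same job, which this sketch ignores.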
So if there is a mechanism for producing DNA from a chance combination of chemicals, we might have a way of explaining how all those complicated proteins came into existence. Unfortunately, DNA molecules appear to be far too complicated to have arisen from a chance combination of their constituent nucleotides (which consist of sugars, phosphates, and nitrogenous bases), without assistance from something more complicated than nucleotides.
So the current leading hypothesis is that the first self-replicating molecule was not DNA but something simpler, presumably some version of RNA. This idea is called the RNA World hypothesis. The idea is that first there was RNA, and that DNA evolved later. However, the RNA World hypothesis is on shaky ground.
One problem is the difficulty of explaining the origin of all the necessary building blocks. The table below shows the various types of building blocks. As indicated below, there are reasons for doubting that the ribose sugars, purines, and nucleotides would have existed in sufficient quantity for DNA or RNA to originate.
The Combinatorial Problem
The combinatorial problem is the problem of getting anything like RNA or DNA to appear from the building blocks listed above. A scientific paper by Joyce and Orgel refers to the difficulty of joining together nucleosides (a combination of ribose sugar and pyrimidines or purines) and nucleotides (a nucleoside plus a phosphate). The Wikipedia article on the RNA World hypothesis notes that “Joyce and Orgel further argued that nucleotides cannot link unless there is some activation of the phosphate group, whereas the only effective activating groups for this are 'totally implausible in any prebiotic scenario', particularly adenosine triphosphate.”
Well-known scientist Freeman Dyson has stated, “The results of thirty years of intensive chemical experimentation has shown that prebiotic synthesis of amino acids is easy to simulate in a reducing environment, but prebiotic synthesis of nucleotides is difficult in all environments...If it happened, it happened by some process that none of our chemists have been clever enough to reproduce.”
RNA is made of nucleotides, which are made of ribose sugar, phosphates, pyrimidines, and purines. Scientists have not been able to synthesize RNA through a simulation of the early Earth, and in such simulations have not been able to make the simpler nucleotides either. As discussed in the table above, there are difficulties in assuming the availability of even some of the components that make up the building blocks of RNA.
The Computational Problem
Perhaps the biggest problem involving the origin of life is the problem of accounting for the origin of the genetic code. The genetic code is a symbolic representation system used by all earthly life. It has been called a kind of miniature programming language.
The Genetic Code
It is fairly easy to explain the basics of how the code works. In the spiral staircase structure of the DNA molecule, the “steps” of the staircase are chemicals called nitrogenous bases: either purines (adenine or guanine) or pyrimidines (cytosine or thymine; RNA uses uracil in place of thymine). Various combinations of three of these chemicals stand for different amino acids (the building blocks of proteins). For example, if there are three consecutive “steps” in the spiral staircase, and the first is cytosine, the second adenine, and the third guanine, that stands for the amino acid glutamine. Of the 64 possible sequences of three nitrogenous bases, 61 stand for a particular amino acid, and the remaining 3 are “stop” signals that mark the end of a protein. (In the diagram above, the chemicals around the four edges of the square are the amino acids.)
Imagine that you liked to write down recipes, but needed to fit many of them on a single piece of paper. You might invent a little “recipe language” in which MK1 stands for half a cup of milk, MK2 stands for a full cup of milk, FL1 stands for half a cup of flour, and so forth, with a total of 64 different three-character symbols (and some other characters standing for “end of recipe”). You might then write out recipes very concisely using this little language. That's quite similar to what the genetic code does, except the recipes are stored in the DNA molecule, and the recipes are instructions for making proteins from the building blocks of amino acids.
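The lookup idea can be sketched in a few lines of Python. This is only an illustration of the mapping, not a model of how a cell actually carries out translation. The table is the standard genetic code written with DNA letters (T where RNA would have U), using the usual one-letter amino acid abbreviations, with `*` marking a stop signal. The names `CODON_TABLE` and `translate` are my own, chosen for this example.

```python
from itertools import product

BASES = "TCAG"  # DNA alphabet; RNA uses U in place of T

# One-letter amino acid codes for all 64 triplets, listed in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop signal.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-letter sequence (codon) to its amino acid or stop signal.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Read a DNA string three letters at a time and return the amino acids
    as one-letter codes, stopping at the first stop signal."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":  # "end of recipe"
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(CODON_TABLE["CAG"])        # Q, the one-letter code for glutamine
    print(translate("ATGCAGTAA"))    # methionine, glutamine, then stop
```

In this table 61 of the 64 triplets stand for one of 20 amino acids and 3 are stop signals, so most amino acids are represented by more than one triplet.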
The big question is: how did this genetic code ever originate? It's hard to imagine it arising through anything like Darwinian evolution, as the genetic code seems to be required from the very beginning of biological evolution.
The genetic code can be considered an example of code, the term software developers use for the symbolic instructions they create. The baffling question is: how did nature go from chemicals to code? Code seems like something fundamentally different from chemicals, and the two seem as unrelated as an apple is to a bicycle.
The issue was highlighted by a paper by biologists J.T. Trevors and D.L. Abel:
"Peer-reviewed life-origin literature presupposes that, given enough time, genetic instructions arose via natural events. Thus far, no paper has provided a plausible mechanism for natural-process algorithm-writing...There is an immense gap from prebiotic chemistry and the lifeless Earth to a complex DNA instruction set, code encryption into codonic sequences, and decryption (translation) into amino acid sequences...How did inanimate nature write
(1) the conceptual instructions needed to organize
(2) a language/operating system needed to symbolically represent, record and replicate those instructions?
(3) a bijective coding scheme (a one-to-one correspondence of symbol meaning) with planned redundancy so as to reduce noise pollution between triplet codon ‘block code’ symbols (‘bytes’) and amino acid
We could even add a fourth question. How did inanimate nature design and engineer
(4) a cell [Turing machine? (Turing, 1936)] capable of implementing those coded instructions?" -- Trevors and Abel
In an article, the widely read physics professor Paul Davies has discussed other difficulties in the “code from chemicals” scenario, the assumption that the genetic code arose from some kind of chemical evolution:
"The language of genes is digital, consisting of discrete bits, cast in the language of a four-letter alphabet. By contrast, chemical processes are continuous. Continuous variables can also process information – so-called analogue computers work that way – but less reliably than digital. Whatever chemical system spawned life, it had to feature a transition from analogue to digital. The way life manages information involves a logical structure that differs fundamentally from mere complex chemistry. Therefore chemistry alone will not explain life's origin, any more than a study of silicon, copper and plastic will explain how a computer can execute a program." -- Davies
The problem of the origin of the genetic code got even more difficult recently, when scientists announced the discovery of a second genetic code buried in DNA. Apparently many of the three-base sequences have a double meaning. Explaining one genetic code was a nightmare -- how can we explain two of them?
A New Approach to the Origin of Life
We might get around these difficulties by imagining that the origin of life on Earth required external intervention by a divine agent or perhaps extraterrestrials. But that would raise the question: why should our ordinary little rock have deserved such a special blessing? After all, modern astronomy tells us that planets are as common as apples in an apple orchard.
A more intellectually attractive idea is the daring concept that the origin of life was programmatically predestined. We can boldly postulate that long, long before there arose the programming in the genetic code, there was a more general programming woven into the fabric of the universe, a programming that drives the evolution of the universe, causing the frequent occurrence of things that might otherwise have very little or no chance of occurring. Under such a scenario, we can think that life is appearing throughout the universe, because that is the way the universe is programmed to behave. Under such a concept, we no longer have to imagine the origin of the genetic code by supposing a farfetched case of “code from chemicals.” We can instead plausibly imagine the origin of the genetic code as a case of “code from code” – the genetic code being a product of a more general cosmic software that is influencing cosmic destiny, propelling the universe forward towards desirable outcomes.
There are actually many reasons for adopting such a theory, and most of them come not from the world of biology, but from the worlds of physics and cosmology. I will present some of these reasons in a lengthy post on this blog on Sunday, January 5th.