
Our future, our universe, and other weighty topics


Monday, December 28, 2020

The Two Biggest Brain Projects Have Failed to Bolster the Main Dogmas About Brains

In recent years the two largest brain research projects have been a big US project launched in 2013 called the BRAIN Initiative, and a big European Union project launched in 2013 called the Human Brain Project. In July 2018 I wrote a post describing how the BRAIN Initiative had failed to substantiate claims that the human brain is a storage place for memories and that the human brain is the source of our thinking, consciousness and imagination.  Looking at an article on the BRAIN Initiative's web site recapping what the big project did in 2019, I see no reason for thinking that the situation has changed very much. 

It's rather a bad sign when this "2019 Highlights" article starts off by mentioning some silly experiment in which signs of activity were looked for in the brains of dead pigs a few hours after they died. After some discussion of some research that merely classified cell types and mapped brain circuits, there is mention of a study indicating that the human mind can perform well when one half of the brain is removed. But that isn't a discovery of the BRAIN Initiative, and was proven by hemispherectomy operations that occurred long before the BRAIN Initiative started. Moreover, the finding that the human mind can perform well when one half of the brain is removed is one that is diametrically opposed to the dogmas that the BRAIN Initiative has been trying to prove, claims that the brain is the source of your mind and the storage place of your memories.

Next in the "2019 Highlights" article we have a huge visual of a Science  cover talking about the neurobiology of singing mice, along with a claim that some scientist "measured brain activity in musical mice while they sang duets."  This is not something that should inspire our confidence, since mice can't really sing.  There is no further discussion in the "2019 Highlights" article of anything that  backs up the main dogmatic claims that neuroscientists keep making about brains.  Judging from the article, the BRAIN Initiative is not making very dramatic progress. 

I looked at a News page of the BRAIN Initiative site, to see signs of any recent progress it may have made in trying to prove the things it is trying to prove. I get some links to unimpressive research papers such as this one, "The Anterior Cingulate Cortex Predicts Future States to Mediate Model-Based Action Selection." The paper does not actually provide any good evidence that some brain region is predicting anything, because the study suffered from the usual methodological defects of neuroscience experimental studies. One of the study groups consisted of only 4 animals, another study group consisted of only 2 animals, and three other study groups consisted of only 8 animals. The chance of a false alarm is too high with such tiny study groups. We should ignore most experimental studies that fail to use at least 15 animals for each study group. Moreover, the study makes no mention of any blinding protocol, something important to have for a reliable experimental study; and the study was also not a pre-registered study that committed itself to testing a particular hypothesis with a particular methodology. With so many shortcomings in the study, the BRAIN Initiative should not have had a headline of "Brain Region Implicated in Predicting the Consequences of Actions" to describe this study, since the study did not provide robust evidence of such a thing.

Looking back through all the articles listed on the News page, and going back all the way to July 2019, I can find no sign of any research that substantiates in any robust way any of the dogmas that the BRAIN Initiative has been trying to prove, such as the very dubious claim that the "brain records, processes, uses, stores, and retrieves vast quantities of information." While we know that humans can acquire memories and retrieve memories, and we know that brain cells (like all cells) store genetic information, there is no robust evidence that the brain stores or retrieves memory information, and no credible detailed theory of how any neural storage or instant retrieval of human episodic memory information could occur. The proteins in synapses and brain tissue are so short-lived (having an average lifetime of less than two weeks) that the brain cannot be a place where memories could be stored for 50 years or more. 

On one page of the BRAIN Initiative site, we have a long discussion of some year 2020 symposium featuring speakers funded by the BRAIN Initiative, something called the 6th Annual BRAIN Initiative Investigators Meeting.  There is lots of talk about neuroscience research, but nothing substantially supporting claims that brains produce thinking and store memories. I find no use of the words "thought," "thinking," "consciousness", "imagination," "cognition," "reasoning" or "mind." Here are the only references to memory in the long symposium recap:

"Dr. Nanthia Suthana explained how stimulating and recording deep brain activity could help us understand the neurophysiology of hypervigilance and emotional memory in patients with post-traumatic stress disorder....Dr. Kareem Zhangloul explained the relationship between cortical spiking sequences and memory retrieval in humans."

There's no link to any work by these two, and no one has actually established any relationship between brain spiking sequences and memory retrieval. Searching for a paper by Kareem Zhangloul I find a paper that makes these not very exciting claims:

"Bursts of spikes organized into sequences during memory formation. These sequences were replayed during successful memory retrieval. The extent of sequence replay during correct recall was related to the extent to which cortical spiking activity was coupled with ripples in the medial temporal lobe."

Given the fact that the brain is a constant source of electrical activity, with most of its billions of neurons firing more than once per second, we should expect to be able to find by chance some sequences of spikes that occurred both during memory formation and memory retrieval, regardless of whether memories are stored in brains.  So such research does not qualify as evidence that memories are retrieved from brains. The type of pareidolia going on in such analysis is rather like what would be going on if you had random fluctuation seismograph readings from hundreds of worldwide sites, and found (upon diligent searching) similar patterns during several different Sunday games when the Pittsburgh Steelers played football. 

The BRAIN Initiative page here is entitled "Key Moments in Brain Research." The subtitle is "Explore major milestones in the history of the field, including those stemming from BRAIN-related research programs." But while there's lots of discussion of administrative milestones and funding milestones, there's no mention of any research accomplishments of the BRAIN Initiative other than a mention of a classification of brain cell types. There is a mention of a Nobel prize, but that was for research done before the BRAIN Initiative started.

Like the BRAIN Initiative, the EU's Human Brain Project has announced goals of proving conventional dogmas about the brain. At the page here we read that "the HBP is conducting a coordinated series of experiments to identify the neuronal mechanisms behind episodic memory, and validate them by computational models and robotic systems."  This is an assertion of the unproven dogma that episodic memory can be explained by brain processes; and it is a strange statement, given how silly it is to think that such a dogma could be validated by doing computer models or research into robots.  One of the main tabs of the Human Brain Project has the silly title of "Silicon Brains." No such things exist; brains are brains, and computers are computers. The brain bears no resemblance to a digital computer, and has none of the seven things that a computer uses to store and retrieve information.  Another page of the Human Brain Project has a title of "Understanding Cognition," but makes no mention of any study or experiment backing up the claim that cognition is produced by brains. 

The page here on the Human Brain Project site is entitled "Highlights and Achievements."  But while the page refers to many different scientific studies between 2017 and 2020, it provides no good evidence that the Human Brain Project has done anything to substantiate claims that the brain stores memories or that the brain produces consciousness, selfhood, thinking, creativity or imagination.  Below are some of the studies mentioned.

  • There is a link to a page entitled "Dendrite Activity May Boost Brain Processing Power." But the page confesses, "Neurologically speaking, the physiology that makes the human brain so particularly special and capable remains poorly understood," which makes it sound as if neuroscientists have no factual claims backing up their dogmas about the brain. 
  • There is a link to a page entitled "The Way of Making Memories." But the page does not discuss any substantial progress in understanding memory, but merely mentions some hardly-worth-mentioning paper entitled, "Regulation of adenylyl cyclase 5 in striatal neurons confers the ability to detect coincident neuromodulatory signals." 
  • There is a link to a page entitled "Brains of smarter people have bigger and faster neurons." The page merely refers to a scientific study that fails to establish such a claim. The study only provided data on brain characteristics and IQ for about 25 subjects, and merely found weak correlations such as r = .37, r = .46 and r = .51. The site here says, "The relationship between two variables is generally considered strong when their r value is larger than 0.7." Having such a small study group and such not-very-strong correlations, the study does not justify the claim that brains of smarter people have bigger and faster neurons. It is very easy to get by chance a not-very-strong correlation such as .5 between two unrelated things such as hair length and intelligence, from a check of only a small number of subjects (the likelihood of getting a correlation between unrelated things decreases as the number of subjects rises). The study was not a pre-registered study, so we have no idea whether the authors were checking 50 different things, and reporting on a few cases where a not-very-strong correlation was found by chance variation. To have confidence in a study like this (which could so easily go wrong through subjective analysis), the study would have to have a detailed discussion of how a full-fledged blinding protocol was followed. Instead there is merely a one-sentence mention of some half measures to produce a blinding effect. The study lists six patients who scored above 100 in IQ tests just before surgery for brain tumors, which is not what we would expect if brains were producing human intelligence.
  • There is a link to a page entitled "How brain cells work together to remember and imagine places." But the page does not discuss any evidence for a brain storage of memories or a brain explanation for imagination. It merely discusses a "computational model."
  • There is a link to a page entitled, "Individual Brain Charting: A high-resolution brain map of cognitive functions." But the title is misleading, because the page merely discusses some brain scans taken when 12 people were doing particular things. The page has the typical misleading language about such scans, saying, "The images obtained make it possible to specify which regions of the brain are activated during a given task." All regions of the brain are active at all times, and brain scans merely show tiny variations such as half of one percent from one region to another, which could easily be chance fluctuations. It is misleading to say that a region showing less than 1% greater activity is "activated during a given task."
  • There is a link to a page entitled, "A First Principles Approach to Memory Recall." But we get no evidence that neuroscientists understand memory recall, something that has never been credibly explained as a brain process. On the page a neuroscientist states this:
"In Neuroscience, there is nothing you can really predict. We do not know how things really work, and the brain is so complex. Both these things mean you cannot make quantitative predictions."

This creates no impression at all that neuroscientists have facts that prove the dogmas they keep spouting about the brain. 
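The point made above about the study with about 25 subjects (that small groups and many unreported checks can yield not-very-strong correlations by chance) can be explored with a small simulation. What follows is only a sketch under stated assumptions, not an analysis of the study's actual data: the function names are hypothetical, the "IQ" and the other measurements are simply independent random numbers, and the figure of 50 checks is the hypothetical number mentioned in the text.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def chance_of_spurious_r(n_subjects, n_checks, threshold, trials=1000, seed=0):
    """Fraction of simulated studies in which at least one of n_checks
    unrelated variables correlates with 'IQ' at |r| >= threshold.

    Assumption: every variable is independent random noise, so any
    correlation that shows up is due to chance alone.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        iq = [rng.gauss(0, 1) for _ in range(n_subjects)]
        for _ in range(n_checks):
            other = [rng.gauss(0, 1) for _ in range(n_subjects)]
            if abs(pearson_r(iq, other)) >= threshold:
                hits += 1
                break  # one chance "finding" is enough for this study
    return hits / trials


if __name__ == "__main__":
    for n_subjects in (25, 100):
        for n_checks in (1, 50):
            p = chance_of_spurious_r(n_subjects, n_checks, threshold=0.5)
            print(f"{n_subjects} subjects, {n_checks} checks: "
                  f"chance of |r| >= 0.5 from noise alone = {p:.3f}")
```

Running it lets you compare a single pre-specified check against 50 unreported checks, and 25 subjects against 100. The chance of a spurious correlation rises with the number of checks and falls as the number of subjects rises, which is why pre-registration and larger study groups matter.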

We should not be impressed by occasional studies that may create some superficial impression that the main assumptions of neuroscientists are correct. Given a huge army of experimental neuroscientists funded each year with so many millions of dollars, it is inevitable that now and then a few weak signals might come forth suggesting some reality behind their assumptions, no matter how wrong they are. Similarly, if you recruit some huge army of people who believe that some clouds are the ghosts of dead animals, and you fund such people with many millions of dollars of research money each year, you might occasionally get photos of clouds that might make you think, "Wow, that really looks like the ghost of a dead animal."

The Human Brain Project and the BRAIN Initiative continue to get very many millions of dollars of funding every year. But the web site of the BRAIN Initiative and the web site of the Human Brain Project very much suggest that these lavishly funded projects are failing to substantiate the dogmatic claims about the brain that they are attempting to prove. Such a failure should surprise no one, because these dogmatic claims (such as the claim that brains store memories and the claim that brains produce minds) are implausible, and are contradicted by many neuroscience facts that have already been established, such as:
  • the very short lifetime of brain proteins, only a thousandth of the longest length of time that humans can reliably remember things (60 years);
  • the lack of any indexing system or position notation system in a brain that might make possible instant memory recall;
  • the failure to discover any proteins or brain mechanisms capable of translating human learned knowledge or episodic memories into synapse states or neuron states;
  • the ability of minds to function and remember very well when half of brains are removed;
  • the ability of minds to function very well during near-death experiences occurring during cardiac arrest when brains are shut down;
  • the very high levels of noise (and very low levels of synaptic signal transmission reliability) in brains, which should preclude a brain from being able to achieve accurate recall of any detailed memory information;
  • the lack of any mechanism in the brain for reading or writing memories, and the lack of anything analogous to the read/write head of a computer hard disk;
  • the slow speed at which brain signals travel across dendrites and synapses, which should prevent any instant recall of memories;
  • the failure to find any permanent encoded information in brains other than the genetic DNA information in all cells.
noisy brain
The physical reality of your brain

The paper here discusses some big Chinese multi-year brain project. There is no mention of any research strategy that offers any real hope of backing up standard dogmas about the brain. In a section entitled "Neural Circuit Mechanisms of Cognition," which tries to sell the groundless idea that cognition might be understood through the study of neural circuits, the author states, "Optimists among us may expect within the next two decades the completion of mesoscopic mapping of neural circuits and their activity patterns, and perhaps even the underlying logic and mechanisms, of cognitive processes in animal models such as Drosophila, zebrafish, and rodents." Clearly our neuroscientists have no understanding of how neural circuits can explain human cognition. Such scientists merely have the hope that two decades of additional study of neural circuits might throw some light on cognition in animals like rats. There is no reason to suspect that studying the exact way electricity moves around in the brain (the study of neural circuits) will ever explain human mental phenomena such as thought, memory, insight, selfhood and imagination.

Thursday, December 24, 2020

His Lame Game Was "Shame and Defame"

"The Conversation" web site bills itself as a site with "academic rigor, journalistic flair." But we saw no academic rigor in Chris Impey's recent article on UFO sightings.   It was just a crummy bit of  "stigmatize the witnesses" mudslinging.  

Impey insinuates that UFO sightings are mainly just an American phenomenon.  Very strangely, in the same sentence that he links to an online map indicating many UFO sightings in Canada, Impey says it is "even more surprising that the sightings stop at the Canadian and Mexican borders,"  insinuating that there are no UFO sightings in Canada. This is not at all correct. The link here discusses quite a few UFO sightings in Canada.  A Canadian news site has a story with this  headline: "Canadians report seeing UFOs in the sky at a rate of 3 times a day." Another Canadian news site lists 1267 UFO sightings in Canada in the year 2015. 

The first of Impey's clumsy attempts to besmirch UFO observers is this very crude insinuation: "Sightings concentrate in evening hours, particularly on Fridays, when many people are relaxing with one or more drinks."  This is obviously an attempt to suggest UFO observers tend to be more drunk than the average observer. There is no evidence that such a thing is true. We read here some specific numerical evidence: "After analyzing perhaps about 800 UFO sighting reports from 1966 to 1968, Keel made the determination that the 'greatest number of UFO sightings are reported on Wednesday, and then they slowly taper off through the rest of the week.' ” Getting drunk doesn't cause people to report UFO sightings, and I have never heard of a drunk person reporting a UFO. 

Impey speaks as if he suspects there are many extraterrestrials out there. He approvingly quotes someone saying, "The universe is apparently bulging at the seams with the ingredients of biology.”  This phantasmagoric claim is not at all correct. The ingredients of biology are things such as cells and functional protein molecules, and no such things have ever been discovered in outer space (except in spaceships or space stations built by humans). If Impey (an astronomy professor) suspects there are many extraterrestrials out there, why is he so hostile to UFO observers?  Perhaps it's the same old snobbish elitism we see among professors, who may rather seem to  think that only a tiny priesthood of professors are entitled to make observations of scientifically interesting novel phenomena, not the common masses.  

In the second part of his article, Impey tries some more tarnishing  attempts.  First, he tries to insinuate UFO observers are conspiracy theorists, saying, "UFOs are part of the landscape of conspiracy theories." No, a UFO report is not a conspiracy theory. A conspiracy theory is a claim that a certain number of human beings are secretly plotting to achieve some nefarious end.  Making a UFO report or believing that some UFOs are extraterrestrial visitors is not a case of believing in a conspiracy theory.  

Impey then proceeds to confuse UFO sightings with crop circle observations, so that he can deliver the line, "I remain skeptical that intelligent beings with vastly superior technology would travel trillions of miles just to press down our wheat." This is "straw man" arguing, because no one maintains that extraterrestrials have come here only to make crop circles. And crop circle reports are an entirely different thing from UFO reports.

Impey then states, "To my mind, UFOs have become a kind of new American religion," and he mentions Diana Pasulka. In my post "Belief in UFOs Does Not Qualify as a Religion," I rebut Pasulka's claim that UFO belief is a religion. Definitions of religion vary, but I think a good definition of religion (applying to every recognized religion) is the following: a religion is a set of beliefs about the fundamental nature of reality and life, or a recommended way of living, typically stemming from the teachings of an authority, along with norms, taboos, ethics, rituals, roles or social organizations that may arise from such beliefs. A belief in UFOs does not qualify as such a thing. For one thing, such a belief does not stem from the teachings of an authority. Also, a vague belief that we are being visited by beings from another planet (or that UFO sightings are some mysterious unexplained reality) does not qualify as "a set of beliefs about the fundamental nature of reality and life, or a recommended way of living." Moreover, the title of Impey's article refers to "UFO sightings." UFO sightings are a different thing from beliefs about UFOs. You cannot discredit UFO sightings by referring to beliefs about UFOs.

Near the end of his article, Impey gives us one more crude piece of gaslighting. He says, "A study of young adults did find that UFO belief is associated with schizotypal personality, a tendency toward social anxiety, paranoid ideas and transient psychosis."  He refers us to a paper that questioned only 276 subjects, a sample group much smaller than other surveys on the same topic. The abstract of the paper says nothing about paranoid ideas or transient psychosis, but merely claims to find "UFO-related beliefs are associated with higher schizotypy scores." A good rule of thumb is that when the abstract of a paper claiming some association between two things does not report a specific numerical association, we can assume the association is probably a very weak one.  So we can assume that the claimed association is no greater than about 10% or 20%. 

What is this concept of "schizotypy"? It's the dubious speculative concept that certain beliefs or habits might be statistically associated with a somewhat higher chance of getting schizophrenia. It was originally claimed that 10% of people with "schizotypy" (the same as "schizotypal personality") might become schizophrenic (as opposed to 1% in the general population). But a 10-year followup study of 182 subjects found a much lower effect: only about 2% of those who scored higher on "schizotypy questionnaires" developed schizophrenia (versus 1% in the general population). So this schizotypy or "schizotypal personality" means only a very slightly greater chance of developing hallucinations.

So there is no substance in Impey's attempt to besmirch UFO believers as being "associated with schizotypal personality, a tendency toward social anxiety, paranoid ideas and transient psychosis." The dubious and arbitrary "schizotypy score" questionnaires used by studies such as the one he cites have no value in determining any substantially greater likelihood of someone having hallucinations of things in the sky. When there is only a 2% tendency for those with "schizotypy" to become schizophrenic, and a weak "association with greater schizotypy" were to be established for those with UFO beliefs (something like maybe a 10% or 20% greater chance or incidence of schizotypy), that would suggest at worst maybe something like a 1 in 200 greater chance (less than 1% greater chance) of hallucinations in UFO believers. That's trivial in this context, a negligible reason for dismissing UFO sightings. Schizophrenics virtually never hallucinate about UFOs.

I would estimate that at least 98% of professor skeptics of paranormal phenomena show no signs of being serious scholars of the evidence for such phenomena, and Impey shows no sign of being a serious scholar of UFO observations. He merely seems to have collected a few links he has used for cheap shots and tarnishing attempts, the type of "slur and shame the witnesses" thing done by the lawyers of rapists to besmirch witnesses of rape. His shoddy attempt to insinuate that UFO observers may be rather prone to be psychotic reminds me of what was done in the Soviet Union. In the Soviet Union if someone merely reported seeing regrettable things in the "workers' paradise" of Communist society, or merely reported being dissatisfied with such a society, he might be put in a mental hospital, and diagnosed with "sluggish schizophrenia."

shaming the witness
The tactics of lawyers of rapists and skeptics of the paranormal

Here is a bullet list concisely setting the record straight and correcting the many incorrect ideas some may get after reading Impey's article:
  • There is no evidence that alcohol consumption plays any appreciable role in UFO sightings.
  • It is false that UFO sightings stop at the border of the US; such sightings often occur in Canada.
  • Reporting a UFO sighting or believing that UFOs are unexplained or of extraterrestrial origin does not make you a conspiracy theorist.
  • Reporting a UFO sighting or believing that UFOs are unexplained or from other planets is not an example of joining a religion or believing in a religion. 
  • Crop circles are anomalies found on the ground, and are not examples of UFO sightings, since UFO stands for unidentified flying object. 
  • It is not true that so-called "schizotypal personality" is a tendency to have hallucinations, and at worst such a personality means no more than a very slightly greater chance of becoming schizophrenic.
  • There is no evidence that people reporting UFO sightings have any appreciable tendency to have psychiatric hallucinations more than ordinary people.
  • The more impressive cases of UFOs involve multiple witnesses or compelling photographic evidence, meaning they cannot be explained away as a hallucination of a single witness.

Sunday, December 20, 2020

Some COVID-19 Vaccine Rollout Plans Do Not Follow a Rule of "Prioritize the Most Endangered"

The COVID-19 pandemic is raging in the US, with almost 3000 deaths per day.  But both the Pfizer COVID-19 vaccine and the Moderna COVID-19 vaccine have been approved in the US. I congratulate the scientists who worked on this impressive achievement. It reminds me of those old Western movies in which the cavalry would arrive "just in the nick of time" to "save the day."

Millions of doses of the vaccines are being distributed, but not on a "first come, first served" basis.  There is only a limited supply of vaccine doses at this time, with many more doses to be available in 2021. The vaccine shots are being distributed in accordance with vaccine distribution policies that have been decided on by various expert committees.  Below is the plan for New York state as declared on page 29 of this document. 

NY State Vaccine Plan

Does this plan make sense? A recent article by a Dr. Buzz Hollander is entitled "The Current COVID-19 Vaccine Roll-Out Doesn't Make Sense." The author invokes various principles such as "Give Life-Saving Vaccines to the People Most Likely to Die." The doctor claims that the vaccine distribution plans fail to follow such a principle.

The author points out the plan is to give 4 million vaccine doses of the first 20 million vaccine doses to nursing home residents and nursing home personnel, and then states this:

"Giving the next 15 million doses to health care workers does not compute by any calculus valuing lives saved. I know the arguments. They put themselves at risk. They are needed, healthy, for hospitals to function smoothly. They tend to  roll up their sleeves more willingly than average Americans. The problem: they don’t die very  often. Approximately 0.5% of US deaths from Covid-19 have been health care workers. Approximately 80% of US deaths from Covid-19 have been from those over 65. 80%. Versus 0.5%. How is this even a question?"

The author seems to have adopted a simple "save the most lives" principle. But is that the best principle to be following? There is an alternate moral calculation: one that uses a principle of "save the most life years." 

Let's imagine how such a principle might be used. Imagine you are the captain of a small boat, and you come across the aftermath of a shipwreck. Looking through your binoculars, you see two small groups of frigid passengers treading water.  Over to the east is a group of three very old people treading water. Over to the west is a group of two young adults treading water.  Which do you save first?

According to a "save the most lives" principle, the answer is easy: go to the east, because there are three people over there, not just two. But imagine if the captain uses a different principle, a "save the most life years" principle. He then may try to calculate the number of life years saved by going east and going west.  The three very old people have a rather small number of life years ahead of them. Each one will be expected to live an average of only about 10 years. So if the captain goes east, he will be saving a total of about 30 years of human life. But the two young adults to the west will have many years ahead of them if they are saved. Each will live perhaps 50 additional years.  So the captain can save a total of about 100 years of life if he goes to the west. 

In this case, the "save the most life years" principle results in a different decision than the "save the most lives" principle. If the sea captain acted according to the "save the most lives" principle, he would go to the east to save the three very old people. But if the captain acted according to the "save the most life years" principle, he would go to the west, thinking that it is more important to save about 100 life-years than to save only about 30 life-years.
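The captain's two calculations can be written out as a short sketch. The numbers are the rough ones used above (three very old people with about 10 years each, two young adults with perhaps 50 years each); the `Group` class and the variable names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    people: int
    expected_years_each: float  # rough remaining life expectancy per person

    @property
    def life_years(self) -> float:
        # Total years of life saved if this whole group is rescued.
        return self.people * self.expected_years_each


east = Group("east (three very old people)", people=3, expected_years_each=10)
west = Group("west (two young adults)", people=2, expected_years_each=50)
groups = [east, west]

# "Save the most lives": pick the group with the most people.
by_lives = max(groups, key=lambda g: g.people)

# "Save the most life years": pick the group with the most total years.
by_life_years = max(groups, key=lambda g: g.life_years)

if __name__ == "__main__":
    print("Save the most lives      ->", by_lives.name)
    print("Save the most life years ->", by_life_years.name,
          f"({by_life_years.life_years:.0f} vs {by_lives.life_years:.0f} life-years)")
```

The two rules disagree here only because the difference in remaining life expectancy (about 5 to 1) outweighs the difference in head count (3 to 2).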

Similarly, for certain types of viruses, a "save the most life years" principle might result in different vaccine distribution plans than a "save the most lives" principle. Let us imagine a Virus X that kills both young adults and old people, but causes deaths at a rate 200% greater in people over 70, compared to young adults. If we followed a "save the most lives" policy, we might recommend that people over 70 get the first doses of a newly available vaccine for Virus X.  But since a young adult has maybe a 400% greater life expectancy than a 70-year-old,  if we were following a "save the most life years" policy, it seems we would give the first vaccine doses to the young adults. 

However, in the case of COVID-19, a different mathematics is involved. According to the chart below (made using the data on this CDC page), the death rate for people with an age between 65 and 74 is not just 200% greater than for young adults, but actually more than 50 times greater.

US COVID-19 Deaths by Age

So it turns out even if we use a "save the most life years" principle rather than a "save the most lives" principle,  we come to the same recommendation, that an average person over 65 should get the vaccine before an average young adult should get the vaccine (not taking into consideration occupations).  Such a policy will result in both more lives saved, and also more years-of-life saved. 
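The comparison between the hypothetical Virus X and COVID-19 comes down to one multiplication per group: relative death rate times relative remaining years of life. Below is a minimal sketch of that arithmetic, assuming (as the argument above implicitly does) that exposure and vaccine effectiveness are the same for both groups. The function names are hypothetical, and "200% greater" is read as 3 times as large, "400% greater" as 5 times as large.

```python
def pct_greater_to_multiple(pct_greater):
    """Convert 'X% greater' into a multiple, e.g. 200% greater -> 3 times."""
    return 1 + pct_greater / 100


def who_goes_first(old_death_rate_multiple, young_life_expectancy_multiple):
    """Return which group each principle would vaccinate first.

    Relative units: the young adult death rate is 1 and the older person's
    remaining life expectancy is 1. Assumes equal exposure and equal
    vaccine effectiveness in both groups (an assumption of this sketch).
    """
    old_life_years_lost = old_death_rate_multiple * 1.0
    young_life_years_lost = 1.0 * young_life_expectancy_multiple
    return {
        "save the most lives":
            "old" if old_death_rate_multiple > 1 else "young",
        "save the most life years":
            "old" if old_life_years_lost > young_life_years_lost else "young",
    }


# Hypothetical Virus X: death rate 200% greater in the old,
# life expectancy 400% greater in the young.
virus_x = who_goes_first(pct_greater_to_multiple(200), pct_greater_to_multiple(400))

# COVID-19 as described above: death rate more than 50 times greater in
# people aged 65 to 74; 50 is used here as the lower bound.
covid = who_goes_first(50, pct_greater_to_multiple(400))

if __name__ == "__main__":
    print("Virus X :", virus_x)
    print("COVID-19:", covid)
```

With Virus X the two principles point in opposite directions, while with a death-rate gap of 50 times or more they agree that the older group should go first.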

Now let's consider a very questionable aspect of the New York state plan for a COVID-19 vaccine rollout.  The plan is that in a Phase 2 "other essential frontline workers" (including "grocery store workers" and "transit workers" and "teacher/school staff") will get the vaccine before people 65 and older, who will not get the vaccine until a Phase 3. Does this make sense, under either a "save the most lives" principle or a "save the most life years" principle? 

To address that question intelligently, we must get some statistics regarding the incidence of COVID-19 deaths by occupation. A British study released in May 2020 claimed the following:

"Healthcare workers, including doctors and nurses, were not found to have statistically higher rates of death involving covid-19 when compared with the rate among those of the same age and sex in the general population....Male security guards had one of the highest death rates at 45.7 deaths per 100 000 (63 deaths). Other jobs with raised rates of covid-19 death included taxi drivers and chauffeurs (36.4 deaths per 100 000), bus and coach drivers (26.4 deaths per 100 000), chefs (35.9 deaths per 100 000), and sales and retail staff (19.8 per 100 000). Men and women working in social care both had significantly raised rates of death involving covid-19 with rates of 23.4 deaths per 100 000 in men (45 deaths) and 9.6 deaths per 100 000 women (86 deaths)...The rate of death among healthcare workers was 10.2 deaths per 100 000 males and 4.8 deaths (43 deaths) per 100 000 females (63 deaths). The category included doctors, nurses, midwives, nurse assistants, paramedics and ambulance staff, and hospital porters."

The chart here compares COVID-19 death rates for various occupations, and tells us that "teaching professionals" and "healthcare workers" have a rate of COVID-19 death less than 10 per 100,000, while people such as police, firemen, shop workers and printers have a rate of COVID-19 death of about 20 per 100,000 or as high as about 30 per 100,000. 

We can compare these relatively low death rates to the vastly higher death rates of people over 65. Below is a chart from a Connecticut state web site, showing COVID-19 death rates per 100,000 in that state, by age (choose "Total Number of Deaths" in the "Age Group Chart" to see the latest version of such a graph). 

COVID-19 death rates per 100,000 in Connecticut, by age

This graph shows a death rate of 594 per 100,000 for people in their sixties, and 1018 per 100,000 for people in their seventies. Even the lower of these rates is many times higher than the death rate for all of the workers who will get earlier vaccine doses in Phase 2 of the New York state COVID-19 vaccine rollout.  Under the New York state COVID-19 vaccine rollout plan, many workers with relatively low death rates will get COVID-19 vaccine doses in a Phase 1 or Phase 2 of the rollout, before people 65 and older (with many times higher COVID-19 death rates) get their vaccine doses in a Phase 3 of the rollout.  
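The size of the gap can be checked with a few divisions. The sketch below uses only the rates quoted above; note that the occupational figures come from a British study released in May 2020 while the age figures come from a Connecticut chart, so the two sets of numbers cover different places and time periods, and the ratios are only a rough guide.

```python
# Death rates per 100,000, as quoted in the text above.
age_rates = {"people in their sixties": 594, "people in their seventies": 1018}
occupation_rates = {
    "male security guards": 45.7,
    "taxi drivers and chauffeurs": 36.4,
    "chefs": 35.9,
    "bus and coach drivers": 26.4,
    "sales and retail staff": 19.8,
}

# Use the lower of the two age-group rates, as the text does.
lowest_age_rate = min(age_rates.values())

ratios = {job: lowest_age_rate / rate for job, rate in occupation_rates.items()}
for job, ratio in ratios.items():
    print(f"{job}: sixties rate is about {ratio:.1f} times higher")

smallest_ratio = min(ratios.values())
```

Even against the highest occupational rate in the quoted study, the rate for people in their sixties is more than twelve times higher.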

If we limit ourselves to a "save the most lives" principle we should complain about this, just as did the doctor I earlier mentioned, who invoked such a "save the most lives" principle to complain about the COVID-19 vaccine rollout plans.  But if we use a "save the most life years" principle, there will be less to complain about.  Let's imagine you are a health care worker who is a young adult. Even though your COVID-19 death risk is much lower than that of a 65-year-old, the number of life years that will be saved if your death is prevented is maybe four times greater than the number saved if the death of a 65-year-old is prevented. 

Did those who created the COVID-19 vaccine rollout plans use a "save the most life years" principle rather than a "save the most lives" principle? We don't know. But it's rather clear the rollout plans are not right under a "save the most lives" principle. The plans are less objectionable under a "save the most life years" principle.  But such a principle by itself fails to justify the COVID-19 vaccine rollout plans. 

Let's consider 40-year-old healthcare workers. According to the statistics above, their risk of dying from COVID-19 is many times lower than that of someone over 65 (which is about 600 per 100,000, according to the graph above). Even if we take into account that the "years lost" for the person in his sixties would be only about 30% to 50% of the "years lost" for someone dying at the age of 40, it still seems that a mere "save the most life years" policy would lead us to give the vaccine first to people over 65 before it is given to most healthcare workers. Instead, the healthcare worker gets the vaccine in Phase 1 of the New York state rollout, while a 65-year-old gets the vaccine only in Phase 3 of the rollout. 
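This comparison can be checked across the whole range of "years lost" discounts. The sketch below makes two assumptions worth stating plainly: it uses the 10.2 per 100,000 rate quoted above for male healthcare workers of all ages as a stand-in for a 40-year-old healthcare worker, and it treats a death at 40 as the reference with a weight of 1.

```python
SENIOR_RATE = 600.0   # deaths per 100,000 for people in their sixties (graph above)
WORKER_RATE = 10.2    # deaths per 100,000, male healthcare workers (British study);
                      # used here as an assumed stand-in for a 40-year-old worker

def weighted_ratio(years_lost_fraction):
    """Life-years-weighted risk of the senior relative to the worker.

    years_lost_fraction: years lost by a death in one's sixties, as a
    fraction of the years lost by a death at age 40.
    """
    return (SENIOR_RATE * years_lost_fraction) / WORKER_RATE

for fraction in (0.3, 0.4, 0.5):
    print(f"years-lost fraction {fraction:.1f}: "
          f"senior has about {weighted_ratio(fraction):.0f} times more life years at stake")
```

Across the whole 30% to 50% range, the senior's life-years-weighted risk stays well above the worker's, which is why the discount alone cannot reverse the ordering.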

Clearly even if we use the less common "save the most life years" principle, it fails to justify the COVID-19 rollout plans such as the New York state plan. Are there other principles that can be appealed to? There are. There is a "keep society running smoothly" principle or "cultural convenience" principle, under which "essential workers" are valued far more highly than retired people. And there is a "reward the recent achievers" principle under which people who previously saved lives or did hard work recently may be given the vaccine earlier, as a kind of bonus.  

So after appealing to three different principles ("save the most life years," "avoid cultural inconveniences and keep society running smoothly" and "reward the recent achievers"), we might be able to end up justifying the current COVID-19 vaccine rollout plans. But they sure cannot be justified by appealing to a single simple principle such as "save the most lives" or "save the most life years" or "prioritize the most endangered."

On page 10 of the document for the New York State vaccine rollout plan, there is this claim: "New York State’s COVID-19 vaccine distribution approach will be based solely on clinical and equitable standards that prioritize access to persons at higher risk of exposure, illness and/or poor outcome, regardless of other unrelated factors, such as wealth or social status, that might confer unwarranted preferential treatment."  This statement is very false. If the plan had been "based solely on clinical and equitable standards that prioritize access to persons at higher risk of exposure, illness and/or poor outcome,"  the people 65 or older would not be put in a phase 3 of the rollout, behind teachers, transit employees and grocery workers put in a phase 2 of the rollout.  The COVID-19 death rate of people over 65 is more than 20 times greater than the COVID-19 death rate of teachers, transit employees and grocery workers.

I can imagine the thoughts of someone creating a COVID-19 rollout plan like the New York state plan. He might be someone who put a high value on convenience. So he might be thinking like this:

"Look, we've got to keep the trains running on time. No one likes late mail, and when you order from Amazon, you don't want to wait too  long for your package. And everyone likes clean streets. Think of all those empty office buildings. They have to be protected. We can't let homeless people camp out inside them. And we have to keep all our grocery workers at their stations. No one likes having to walk too far to buy food, and no one likes having to wait too long in food store lines. When you go buy food, you want everything neatly organized on the shelves, so you can find things very fast. And we can't let teachers be out under quarantine, for then we might have to teach many of our kids how to learn on their own."

It is reasonable to ask whether the authors of such a COVID-19 rollout plan have valued "cultural convenience factors" such as "maintain critical infrastructure" far above saving the lives of people over 65.  It is somewhat as if the authors of such a plan tended to regard senior citizens as being rather expendable. Similarly, I can imagine a rescue boat approaching shipwreck survivors treading in icy water, and the captain shouting out to the swimmers, "Raise your hand if you're usefully employed."

Don't get me wrong. I'm a fan of the COVID-19 vaccines and the brilliant minds who developed them. I'm just not a fan of most of the rollout plans for such vaccines. 

Postscript: An expert panel of the CDC (Centers for Disease Control and Prevention) has released today a recommendation regarding COVID-19 vaccine distribution.  It partially corrects the appalling shortcomings of the New York state plan discussed above, but still jeopardizes those in the very high-risk age group of 65-74. 

Don't be fooled by its designations of 1a, 1b and 1c.  "1a" is phase 1, "1b" is phase 2, and "1c" is phase 3.  The panel recommends that phase 2 of the COVID-19 vaccine distribution (euphemistically called "phase 1b") should be a group of 49 million people including about 14 million who are 75 and older and some huge group of about 35 million called (not really accurately) "frontline essential workers," a group that includes basically any worker at all who cannot work from home. The vast majority of these workers will have a COVID-19 death risk many times smaller than those age 65-74, but such workers will get their vaccines ahead of such seniors, who are consigned to a phase 3 (euphemistically called "1c"). 

The article here quotes one of the panel members who mentions some reasons for his vote, none of which is "save the most lives." 

Anyone impressed by the CDC imprimatur here should remember that both the CDC and the World Health Organization made gigantic COVID-19 blunders earlier this year, by telling people during the critical first months of the pandemic that they did not need to wear face masks, before recanting their foolish advice on this topic (as discussed here and here).  

Post-postscript: On television on December 21, Professor Anne Rimoin commented on the new CDC guideline. I can't remember her saying anything about any "save the most lives" rationale or any "prioritize the most endangered" rationale. Instead, she repeatedly stated that the specially favored workers (given vaccines before those at vastly higher risk) are "very important people." I guess we can call this guideline a "VIPs first" policy. 

COVID-19 vaccine rollout problem

Below is a table from page 8 of the Canadian plan on COVID-19 vaccine distribution. Except for putting "frontline health and social care workers" near the top, it is based entirely on the risk of someone dying from COVID-19, which rises directly with age. The document says, "As the risk of mortality from COVID-19 increases with age, prioritisation is primarily based on age." 

Canadian COVID-19 Vaccine Rollout Plan

Post-post-postscript: An excellent article at www.vox.com is one of the very few in the mainstream press to give us a balanced discussion of the latest CDC recommendation on COVID-19 vaccine distribution, quoting some who criticize that recommendation. The article quotes a director of a Yale Institute for Global Health as saying "even at ages 65 to 74, they have a 90 times higher risk of death" (referring to COVID-19 risk of death). After telling us that the COVID-19 vaccine distribution plan in the United Kingdom is the same as the Canadian plan quoted above, the article states the following, where we read the exact phrase of "life years" that I used above before ever hearing about anyone else using that phrase:

“ 'I will be eternally perplexed if the US doesn’t choose to vaccinate the elderly first and foremost, along with those who take care of them directly,' wrote Zeynep Tufekci, a University of North Carolina professor of sociology who has emerged as one of the country’s sharpest coronavirus policy commentators. 'Everyone deserves protection, but if we do not prioritize vaccination by actual risk, which basically means prioritizing by age and vaccinating the elderly first, it may well be the greatest, most consequential mistake [the] United States does in a year full of very very bad ones.' She cited a preprint of a paper on vaccine prioritization, which makes the case that vaccinating older adults first will save the most lives. It would also save the most 'life-years' — a measure of lives saved that considers how many more years of life that person has (and so values saving a 20-year-old much more than saving an 80-year-old)."

Besides the paper mentioned above, there is the paper here, which finds that "for a range of assumptions on the action and efficacy of the vaccine, targeting older age groups first is optimal and can avoid a second wave if the vaccine prevents transmission as well as disease." Then there is the paper here which says, "Our model suggests a vaccine distribution that emphasizes age-based mortality risk more than occupation-based exposure risk." Then there is the paper here which says, "Vaccinating 60+ year-olds first prevents more deaths (up to 8% more) than transmission-interrupting strategies for January [2021] vaccine availability across most parameter regimes."

A COVID-19 vaccine strategy of "prioritize the most endangered" would also distribute first to neighborhoods with the highest infection rate.  If this were done in New York City, priority would be given to neighborhoods with the highest infection rates, as listed on the current map of New York City zip codes and their infection rates.  Under such a policy, some minority-rich neighborhoods would get vaccines first, not because ethnicity was considered, but simply because the COVID-19 infection rate happened to be higher in such neighborhoods.  

Post-post-post-postscript: On January 12, 2021, the CDC announced that it was changing its guidelines, recommending the immediate COVID-19 vaccination of anyone 65 or older. So the policy I complained about was changed in the way I recommended much earlier (around December 20, 2020). 

Wednesday, December 16, 2020

The COVID-20 Express: A Science Fiction Story

Nine years after COVID-19 was finally defeated, an even worse blow came from the biosphere. The scientists  called the disease COVID-20. It spread like wildfire across the world.  The virus could exist in someone for two months before symptoms appeared. This made it much easier for the disease to spread.  By the time the symptoms first appeared in people, the virus had already infected millions of people.  

They called it "the coughing death."  First a person would start coughing occasionally. Then that person would cough more and more. By the time the person got hospitalized, it was usually too late. There would be a very high fever, usually followed by a brief coma and then death. 

Gene and his wife Carla were both hospitalized on the same day.  When Gene was put in a hospital room, he thought he might never see his wife again.  As he began to fade out of consciousness, he saw his electronic life signs monitor display a temperature of 106 degrees.  Then after what seemed like a deep sleep, he awoke. He was very surprised to find that he was no longer in a hospital. 

Gene looked around and saw himself seated in a train traveling at high speeds. Next to him was his wife, her eyes closed. The train car was filled with passengers. Gene looked out the windows of the train, but could see hardly anything. It was as if the train was traveling through a dark tunnel.  

Soon Gene's wife woke up. "How did we get here?" asked Carla.

"Your guess is as good as mine," said Gene. "The last thing I remember, I was very sick in a hospital bed. But I feel fine now."

"Me too," said Carla. "Never felt better." 

Before long a man in a black uniform came into the car, demanding to see tickets. The man looked like one of the conductors on an Amtrak train. Soon the conductor came to Gene and Carla.

"Tickets, please," said the conductor.

"I don't remember buying a ticket," said Gene.

"Check your left breast pocket," said the conductor, as if he had said it a thousand times before.

Looking down at his body, Gene saw a slip of paper in his left pocket.  It read the following: "GENE BOONE, COVID-20, JULY 13, 2030."  The letters were in green.  After seeing this, the conductor stuck a little piece of paper above Gene's seat,  which had on it a green arrow pointing upward. He put another such slip above Carla's seat. 

Gene watched the conductor move to the seat in front of him. Reminded to check his front pocket, the man in the seat in front of Gene produced a slip saying in red letters, "DOUG GRADISON, MURDER-SUICIDE, JULY 13, 2030."  The conductor stuck a slip of paper above that person's seat. But instead of having a green arrow pointing upward, it had a red arrow pointing downward. 

The train continued to hurtle forward, seemingly passing through a long dark tunnel. Gene got out of his seat, and started moving around, asking questions of other people in the train car. He soon found that all of them were as baffled to be on the train as he was. Most of the people had a story to tell just like that of Gene. 

Most of the passengers said the last thing they remembered was being very sick, with symptoms like those of COVID-20.  One person said that the last thing he remembered was his car accidentally smashing into another car. Another person said the last thing he remembered was toppling from a ladder while trying to clean out the gutter on his roof.  One of the passengers said he thought he was still back in his hospital bed, and that he was just having a delirious hallucination of being on a train. 

Looking around in the train car, Gene could see that above almost every seat was a little slip of paper with a green arrow pointing upward. But above two seats the slip of paper had a red arrow pointing down. Soon the train started to slow.  Not much could be seen in the darkness outside the train. The conductor entered the train car. 

"This is Stop #1", shouted the conductor. "Please move over here to the exit door if you have a red arrow in the slip of paper above your seat."  Two people got out of their seats, and came toward the conductor. 

In the dim light some faces could be seen outside of the window. They looked rather like the audience of a strip club, or like some mob cheering on the fighters in a street brawl. "What kind of stop is this?" asked one of the two people about to exit. 

"Don't worry, you'll meet your kind of people," said the conductor.  After the train slowed to a halt, an exit door opened, and the two passengers left.  Looking out the window, Gene and Carla could dimly see some faces and their unholy expressions. 

"This is not the kind of stop I would ever want to get off at," said Carla. 

"Damn right," said Gene. 

The train then accelerated, and started moving faster than ever before. 

"I remember reading about people who had close brushes with death," said Carla. "They often reported traveling through a mysterious tunnel. Do you think there's any connection with what's happening now?"

"I don't know," said Gene.

Waiting for the train to reach its destination, Gene's mind became filled with vivid recollections of all that had happened in the life he had led. With unusual clarity and speed, he seemed to remember the whole course of his life. The scenes passed through his mind almost as clearly as if he were watching some movie displayed on a wide-screen TV in front of him. 

Carla closed her eyes and leaned back in her seat. She began to have a strange vision in her mind's eye, a vision of unusual clarity. She first vividly visualized a building she recognized as the local hospital, as if she were floating above it. Then she intensely visualized different floors on the building, filled with doctors and nurses wearing face masks. It was like her mind was passing through each floor. Finally she visualized a basement area where there were many cloth-covered bodies on shelves and tables. Her mind seemed to hover above one cold cloth-covered body. That was me, she thought. Like someone awaking from a bad dream, she snapped out of the strange vision. 

Gene kept looking out a window, eager to see some sign of light ahead of the train. Finally he saw a glimpse of some light ahead of the train.  "We're heading towards some light!" he exclaimed. 

Finally the train came out of the tunnel. Suddenly the train car was flooded with light. Looking out the windows, Gene and Carla could see a stunning landscape. Bathed in some unearthly light, it looked more beautiful than any landscape they had ever seen. 

Eventually the train approached what looked like a town, one with buildings of exquisite elegance. The train started to slow down. Looking out the window, Gene and Carla could see that a big crowd  had come to see the train's arrival.  The people looked very happy. The conductor came into the train car. 

"Last stop!" yelled the conductor. "All passengers must exit."

Gene and Carla got in line to exit. The train came to a halt. They could hear outside of the train a boisterous sound of cheering, trumpets blaring, and church bells ringing.  The door was opened, and Gene and Carla could smell some wonderful scent unlike anything they had smelled before. 

"Don't be too surprised if you see there's no track underneath the train," said Gene. 

As the first passenger exited, a wreath of flowers was put around his neck by a smiling woman. People outside the train were throwing colorful confetti, laughing and cheering. 

"Mother!" yelled the first passenger to exit, recognizing someone in the crowd outside of the train. He ran to greet her. 

"Where are we?" asked Gene. 

"Somehow I don't think we've left our home," said Carla. "I think we've finally arrived at our real home." 

Saturday, December 12, 2020

No, DeepMind's AlphaFold2 Did Not Solve the Protein Folding Problem

One of the principal unsolved problems of science is the problem of protein folding, the problem of how simple strings of amino acids (called polypeptide chains) are able to form very rapidly into the intricate three-dimensional shapes that are functional protein molecules. Scientists have been struggling with this problem for more than 50 years. Protein folding is constantly going on inside the cells of your body, which are constantly synthesizing new proteins. The correct function of proteins depends on them having specific three-dimensional shapes.

In DNA, proteins are represented simply as a sequence of nucleotide base pairs that represents a linear sequence of amino acids. A series of amino acids such as this, existing merely as a wire-like length, is sometimes called a polypeptide chain.  

polypeptide chain

But  a protein molecule isn't shaped like a simple length of copper wire – it looks more like some intricate copper wire sculpture that some artisan might make. Below is one of the 3D shapes that protein molecules can take. There are countless different variations. Each type of protein has its own distinctive 3D shape (and some types, called metamorphic proteins, can have different 3D shapes).

The phenomenon of a polypeptide chain (a string of amino acids) forming into a functional 3D-shape protein molecule is called protein folding. How would you make an intricate 3D sculpture from a long length of copper wire? You would do a lot of folding and bending of the wire. Something similar seems to go on with protein folding, causing the one-dimensional series of amino acids in a protein to end up as a complex three-dimensional shape. In the body this happens very rapidly, in a few minutes or less. It has been estimated that it would take 10^42 years for a protein to form into a shape as functional as the shape it takes, if mere trial and error were involved.
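To get a feel for why mere trial and error is hopeless, here is a back-of-the-envelope count in the style of what is usually called Levinthal's paradox. It is only a sketch: the chain length, the number of conformations per amino acid, and the sampling rate are all illustrative assumptions, and it is not the calculation behind the estimate quoted above, so the number it produces is different.

```python
SECONDS_PER_YEAR = 3.156e7

def brute_force_years(residues, conformations_per_residue, samples_per_second):
    """Years needed to try every conformation of a chain, one at a time."""
    total_conformations = conformations_per_residue ** residues
    return total_conformations / samples_per_second / SECONDS_PER_YEAR

# Assumed inputs: a short chain of 100 amino acids, only 3 possible
# conformations per amino acid, and 10^13 conformations tried per second.
years = brute_force_years(100, 3, 1e13)
print(f"{years:.1e} years")
```

Even with these modest assumptions, an exhaustive search would take vastly longer than the age of the universe, and the count grows exponentially with the length of the chain.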

The question is: how does this happen? This is the protein folding problem that biochemists have been struggling with for decades. Recent reports on this topic in the science press were doing what people in the science field do like crazy: making unfounded achievement boasts and parroting in a credulous fashion the boasts churned out by the PR desks of vested interests.  The reports were saying that some software called AlphaFold2 (made by a company called DeepMind) had "solved" or "essentially solved" the protein folding problem. This is not at all correct. 

The AlphaFold2 software gets its results by a kind of black-box "blind solution" prediction that doesn't involve  real understanding of how protein folding could occur.  As discussed on page 22 of this technical document, this "deep learning" occurs when the software trains on huge databases of millions of polypeptide chains (sequences of amino acids) and 3D protein shapes that arise from such sequences, databases derived from very many different organisms.  One of these databases has 200 million entries, about a thousand times greater than the number of proteins in the human genome. 

This is an example of what is called frequentist inference. Frequentist inference involves making guesses based on previously observed correlations. I can give an example of how such frequentist inference may involve no real understanding at all. Imagine if I had a library of vehicle images, each of which had a corresponding description such as "green van" or "red sports-car convertible" or "white tractor-trailer." I could have some fancy "deep learning" software train on many such images.  The software might then be able to make a sketch that predicted pretty well what type of image would show up given a particular description such as "brown four-door station wagon."  But this would be a kind of "blind solution."  The software would not really know anything about what vehicles do, or how vehicles are built. 

The situation I have described with this software is similar to the cases in which the AlphaFold2 software performs well.  In such cases the software uses vast external databases created from analyzing countless thousands of polypeptide sequences in countless different organisms, and 3D shapes derived from them.  But it's a black box "blind solution." The software has no actual understanding of how polypeptide chains are able to form into functional 3D shapes. It cannot be true that something similar is going on inside the body. The body has no similar "deep learning" database created from crunching data on polypeptide sequences and 3D shapes derived from them, data derived from countless different organisms. 

When predictions of such a type are made by deep learning software, this activity is not something corresponding to what goes on when scientific theory predictions are tested against data. In the latter case what goes on is this:

(1) A theory is created postulating some way in which nature works (for example, it might be postulated that there is a universal force of attraction between bodies that acts with a certain strength, and according to a certain "inverse square" equation). 

(2) Predictions are derived using the assumptions of such a theory. 

(3) Such predictions are tested against reality. 

But that isn't what is going on in deep-learning software. With such software, there is no theory of nature from which predictions are derived.  So successes with deep-learning software aren't really explanatory science. 

The type of analysis that the AlphaFold2 software does fairly well on is called template-based modeling.  Template-based modeling involves frequentist inferences based on some huge database of polypeptide chains and 3D shapes that correspond to them. There's another way for computer software to try to predict protein folds: what is called template-free modeling or FM.  Software using template-free modeling or FM would use only the information in the polypeptide chain and the known laws of chemistry and physics, rather than relying on some huge 3D shape database (derived from many organisms) that isn't available in the human body. 

How well does the AlphaFold2 software do when it uses the targets that were chosen to test template-free modeling or FM? Not very well. We can find the exact data on the CASP web site.  

What is called the Critical Assessment of Protein Structure Prediction (CASP) is a competition to assess the progress being made on the protein folding problem. They have been running the competition every two years since 1994. You can read about the competition and see its results at this site. The first competition in 1994 was called CASP1, and the latest competition in 2020 was called CASP14. Particular prediction programs such as AlphaFold2 are used to make predictions about the 3D shape of a protein, given a particular polypeptide chain (a sequence of amino acids specified by a gene). The competitors supposedly don't know the 3D shape, but only are given the amino acid sequence (the polypeptide sequence). The competing computer programs make their best guess about the 3D shape.

To check how well the AlphaFold2 software does using the targets that are supposed to test template-free modeling, you can go to the page here, click AlphaFold2 in the left box, click FM (which stands for template-Free Modeling), and then click on the Show Results button:


You will then get the results screen below:

AlphaFold2 Template Free Results

These results are not terribly impressive. The number on the right is a number that will be close to 100 for a good prediction, and close to 0 for a very bad prediction. 19 out of 23 predictions have scored less than 90, meaning they are substantially off. 

A crucial consideration in judging these results is: how hard were these prediction targets? The complexity of proteins vastly differs, depending on the number of amino acids in the protein (which is sometimes referred to as the number of residues).  A protein can have between 50 and 2000 amino acids. The average number of amino acids in a human protein is about 480.  If the results shown above were achieved mainly by trying to predict the 3D shapes of proteins with a below-average number of amino acids, what we can call relatively easy targets, then we should be much less impressed by the results. 

What I would like to have is a column in the table above showing the number of amino acids in each prediction. The web site does not give me that. But by using the data on column 3 of this page, I can add a new column to the table above, showing the number of amino acids in the proteins that had their 3D shape predicted. Below is the result, with the new column I added on the right:

AlphaFold prediction results


We see here that most of the predictions above were done for proteins with a smaller-than-average number of amino acids, because the numbers in the right column are mostly much smaller than 480, the average number of amino acids in a human protein. A really impressive result would be if all of the numbers in the GDT_TS column were above 90, and if most of the numbers in the last column were much larger than 480.  Instead, we have a result in which AlphaFold2 often is way off (scoring much less than 90) even in simpler-than-average proteins with fewer than 300 amino acids. 

Based on these results with the FM prediction targets, it is not at all true that AlphaFold2 has solved the protein folding problem. Trying to solve such targets, the software often is way off in predicting the shape of protein molecules of below-average complexity (which are easier prediction targets).  

There is also a reason for thinking the not-so-great results shown in the table above would be much worse if a more reliable method was used to calculate the degree of accuracy between the prediction and the actual protein. The table above shows us numbers using a method called GDT_TS. But there's a more accurate method to calculate the degree of accuracy between the prediction and the actual protein. That method is called GDT_HA, HA standing for "high accuracy." According to the paper here, "Of the two GDT scores under consideration, GDT_HA is generally 10–20 less than the GDT_TS scores computed from the same models (Fig 4A and 4B), reflecting its higher stringency." Figure 4 of the paper here has a graph showing that GDT_HA scores tend to be 20 or 25 points lower than GDT_TS scores. The site here also has two graphs indicating that the GDT_HA scores tend to be about 20 points lower than the GDT_TS scores. So if the more accurate GDT_HA method had been used to assess accuracy, the scores produced by AlphaFold2 for the targets picked to test template-free modeling would apparently have been only between about 24 and 73, not very impressive at all. 
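The reason a GDT_HA score comes out lower than a GDT_TS score for the same model can be seen from how the two scores are commonly defined: each averages the percentage of residues lying within a set of distance cutoffs from their true positions, and GDT_HA uses cutoffs half as large. The sketch below is a simplification under stated assumptions: the real scores are computed after searching for the best superposition of the predicted and actual structures, which is skipped here, and the per-residue distances are made-up toy values, not CASP data.

```python
def gdt_score(distances, cutoffs):
    """Average, over the cutoffs, of the percentage of residues whose
    predicted position lies within that cutoff (in angstroms) of the
    true position."""
    percentages = [
        100.0 * sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return sum(percentages) / len(cutoffs)

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)   # "total score" cutoffs
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)   # "high accuracy" cutoffs, half as large

# Hypothetical per-residue errors (angstroms) for a 10-residue toy model.
toy_distances = [0.3, 0.4, 0.7, 0.9, 1.5, 1.8, 2.5, 3.5, 5.0, 9.0]

gdt_ts = gdt_score(toy_distances, GDT_TS_CUTOFFS)
gdt_ha = gdt_score(toy_distances, GDT_HA_CUTOFFS)
print(f"GDT_TS = {gdt_ts:.1f}, GDT_HA = {gdt_ha:.1f}")
```

Because every GDT_HA cutoff is tighter than the matching GDT_TS cutoff, the GDT_HA score can never exceed the GDT_TS score for the same set of distances; how much lower it falls depends on how the errors are distributed.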

When we look for results using the easier targets designed to test template-based modeling (derived by using a massive "deep learning" database not available in the human body), we get results averaging about 90 for the AlphaFold2 software. But such results are using the GDT_TS scores. Based on the comment above, the more accurate GDT_HA scores would be about 20 points lower. So if the more accurate GDT_HA scores were shown we would see scores of around 70, which would not seem very impressive.  Overall, judging from this page, about 75% of the CASP14 prediction targets had a number of amino acids (residues) much smaller than 480, which means that the great majority of the CASP14 prediction targets were "low-hanging fruit" that were relatively easy to solve, not a set of prediction targets that you would get from just randomly picking proteins. 

Did the AlphaFold2 software stick to template-free modeling for the targets that were picked to test template-free modeling? We don't know, because the software is secret.  Science is supposed to be open, available to inspection by anyone -- not some procedures hidden in secret software. 

I am unable to find any claims by the DeepMind company that the AlphaFold2 software ever used any real template-free modeling based solely on the amino acids in a gene corresponding to a particular protein. On page 22 of the document here, DeepMind describes its technique, making it sound as if they used deep-learning protein database analysis for all of their predictions. So it seems as if 100% of AlphaFold2's predictions were made through what some might call the "cheat" of inappropriate information inputs, the trick of trying to predict protein folding in an organism by making massive use of information available only outside of such an organism, and derived from the analysis of proteins in very many other organisms. The fault may lie mainly with those running the CASP competitions, who have failed to enforce a sensible set of rules, preventing competitors from using inappropriate information inputs.  On the current "ab initio" tab of the CASP competition, there is no mention of any requirement that there ever be used any real template-free modeling based solely on the amino acids in a gene corresponding to a particular protein.  

Another thing we don't know is this: how novel and secret were the prediction targets selected for the latest CASP14 competition? The targets were manually selected, and supposedly some of the prediction targets were selected in an attempt to get some novel types of prediction targets the competitors had not seen before. Were they really that, or did most of the targets tend to resemble previous years' targets, making it relatively easy for some database-based prediction program to succeed? And what type of security measures were taken to make sure that the supposedly secret prediction targets really were secret, and completely unknown to the CASP14 competitors? We don't know. Were any of the CASP14 competitors able to just find 3D shapes corresponding to some of the "secret targets," shapes that someone else had determined through molecular analysis, perhaps through some backdoor "grapevine"? We don't know. A few naughty "insider information" emails might have been sufficient for a CASP14 competitor to get information allowing it to score much higher.  It's hard to predict how unlikely such a thing would be, because while we know that the military has a very good tradition of keeping secrets, we don't know how good biologists are at keeping secrets.  Given very many millions of items in one of the protein databases, it would have been all-but-impossible for anyone submitting a prediction target to have known that it was something novel, not something very much like some protein already in the databases that AlphaFold2 had scanned.  

It is believed that 20 to 30 percent of proteins have shapes that are not predictable from any amino acid sequence, because such proteins achieve their 3D shapes with the aid of other molecules called chaperone molecules.  On such proteins a program such as AlphaFold2 would presumably perform very poorly at its predictions. Are proteins that require chaperone molecules for their folding excluded as prediction targets for the CASP competitions, for the sake of higher prediction scores? 

At this site we read "Business Insider reports that many experts in the field remain unimpressed, instead calling DeepMind’s announcement hype."  There is no basis for claiming that the AlphaFold2 software has solved or "essentially solved" the protein folding problem.  An example of the credulous press coverage is a news article in the journal Nature. We see a graph that is based on the template-based modeling not very relevant to human cells, since such modeling uses a massive "deep learning" database not in human cells.  The graph uses the GDT_TS score, showing an average score of almost 90. We are not told that the score would have been about 20 points lower if the more accurate GDT_HA measure had been used.  And we are not told that the average score using targets designed to test template-free modeling (the more relevant targets) and also using a GDT_HA measure would have been some unimpressive number such as only about 60.  The graph has a misleading phrase saying, "A score above 90 is considered roughly equivalent to the experimentally determined structure."  That is not correct. Since the more accurate GDT_HA measure tends to be about 20 points lower than the GDT_TS score, a prediction of a 3D protein shape can win a GDT_TS score of 90 even though it is far off the mark.  

In a boastful DeepMind press release, we hear the co-founder of the CASP competition (John Moult) say this: "To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts wondering if we’d ever get there, is a very special moment."  But in this article we read this: "AlphaFold’s predictions were poor matches to experimental structures determined by a technique called nuclear magnetic resonance spectroscopy, but this could be down to how the raw data is converted into a model, says Moult."  So Moult apparently got evidence from nuclear magnetic resonance spectroscopy that the AlphaFold2 software was not predicting very well, but he seems to have disregarded that evidence, by declaring the software a monumental success.  This may be the bias that can arise when someone is yearning to have a "very special moment" of triumphal celebration. 

The Nature article makes it rather clear that the CASP14 competition did not use an effective blinding protocol. Quoting a person (Andrei Lupas) who is listed as the person responsible for assessing high accuracy modeling for the CASP14 competition, the article states this:

"AlphaFold’s predictions arrived under the name 'group 427', but the startling accuracy of many of its entries made them stand out, says Lupas. 'I had guessed it was AlphaFold. Most people had,' he says."

This confession makes it sound like a simplistic blinding protocol was used, allowing the judges to guess that "group 427" was the AlphaFold2 software of the prestigious DeepMind corporation, the company that won the previous CASP competition.  So the clumsy attempt at blinding didn't work well. We can only guess how much the "prediction success" analysis was biased once people started thinking that the "group 427" (i.e. AlphaFold2) was supposed to win higher scores.  Any moderately skilled computer programmer could have easily figured out a double-level blinding system that would have avoided such a problem, one in which a unique name was used for the source of each prediction.  Since the CASP14 competition apparently didn't do blinding effectively, what confidence can we have that it did security correctly? If the security wasn't done correctly, that would be a reason for lacking confidence in the predictive results; for one or more competitors might not have been blind about the 3D shapes they were trying to predict. 
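A blinding scheme of that kind might look something like the following Python sketch. It is only an illustration under my own assumptions, not a description of anything CASP actually does: the function name, the group names and the target labels are all hypothetical. Each individual prediction gets its own random label, so an assessor cannot link two predictions to the same source, and only the organizers hold the key that maps labels back to groups.

```python
import random
import secrets


def blind_predictions(submissions):
    """Give every prediction its own random label.

    submissions: list of (group_name, target_id) pairs.
    Returns (blinded, key), where blinded is a shuffled list of
    (label, target_id) pairs for the assessors, and key maps each
    label back to its group_name (kept by the organizers only).
    """
    key = {}
    blinded = []
    for group_name, target_id in submissions:
        label = secrets.token_hex(8)
        while label in key:  # guard against an (extremely unlikely) repeat
            label = secrets.token_hex(8)
        key[label] = group_name
        blinded.append((label, target_id))
    # Shuffle so that submission order does not hint at the source.
    random.SystemRandom().shuffle(blinded)
    return blinded, key


# Hypothetical example: two groups, three predictions.
submissions = [("group A", "T1"), ("group A", "T2"), ("group B", "T1")]
blinded, key = blind_predictions(submissions)
for label, target_id in blinded:
    print(label, target_id)
```

With a single label per group, one run of striking results exposes every entry from that group. With one label per prediction, an assessor sees no such pattern of repeated names.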

AlphaFold2 used frequentist inference when it used template-based modeling to get its best results. I can give another example to explain how such frequentist inference can give "blind predictions" that involve no real understanding. I might be given some data on some death cause described only as "Death Cause #523."  The data might include how many people of different ages in different locations died with this cause. Using frequentist inference and some "deep learning" computer program,  I might be able to predict pretty well how many people of different ages in different places will die next year from this cause. But I would have no understanding of what this death cause was, or how it killed people. That's how it so often is with frequentist inference or "deep learning." You can get it to work fairly well without understanding anything about causes. 
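A toy sketch can make this concrete. In the Python below, a plain least-squares trend line stands in for the "deep learning" program (a much simpler technique, chosen only for brevity), and the yearly death counts are invented for illustration. The code predicts next year's count from past counts alone, and nothing in it contains any information about what the death cause is.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next_year(counts_by_year):
    """Extrapolate the fitted trend one year past the last year of data."""
    years = sorted(counts_by_year)
    counts = [counts_by_year[year] for year in years]
    slope, intercept = fit_line(years, counts)
    return slope * (years[-1] + 1) + intercept


# Invented yearly counts for "Death Cause #523" in one place and age group.
past_counts = {2016: 100, 2017: 110, 2018: 120, 2019: 130}
print(predict_next_year(past_counts))
```

For these made-up numbers the prediction comes out at about 140 deaths for 2020. The program would give exactly the same answer whether "Death Cause #523" were a virus, a poison or a kind of accident.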

I can sketch out what it might actually look like if humans were to solve the protein folding problem:

(1) There would be software for protein shape prediction, and that software would have all of its code published online. 
(2) The software would not use any "deep learning" database or "template library" database unavailable to an organism, some database derived from studying countless different organisms. When doing a prediction for a protein in one particular organism, the software would only use the information in the organism's genome.
(3) The software would make predictions based only on the information in an organism and known principles of chemistry, physics and biology. 
(4) The software would be very well-documented internally, so that each step of its logic would be justified, and we could tell that it was using only principles of nature that have been discovered, or information existing in the organism having a particular protein.  
(5) A user of the software could select any gene in the human genome (or some other genome) to test the software. 
(6) The software would then generate a 3D model of a protein using the selected gene, without doing any very lengthy number crunching unlike anything that could take place in an organism.  
(7) A user could then compare the accuracy of the resulting model, by comparing a visualization of the software's 3D model with a visualization previously produced by some independent party and stored in one of the protein databases. 
(8) If automated techniques were used to judge accuracy, only the more accurate GDT_HA method would be used, rather than the less accurate GDT_TS method. 
(9) You would be able to test the software using genes of greater-than-average-complexity, and on proteins that require chaperone molecules for their folding, and the predictive results would be very accurate. 

The situation with the AlphaFold2 software is light-years short of the type of situation I have described.  The protein folding problem has not at all been solved, because it is still the case that no one can explain how it is that mere polypeptide chains (sequences of amino acids) tend with such reliable regularity and amazing speed to form into functionally useful 3D protein shapes.  The article here discusses why protein folding involves enormous fine-tuned complexity and interlocking dependencies of a type that would seem to make it forever impossible to very reliably predict the more complex protein shapes from the mere amino acid sequences in a single gene.  We read the following:

"Protein folding is a constantly ongoing, complicated biological opera itself, with a huge cast of performers, an intricate plot, and dramatic denouements when things go awry. In the packed, busy confines of a living cell, hundreds of chaperone proteins vigilantly monitor and control protein folding. From the moment proteins are generated in and then exit the ribosome until their demise by degradation, chaperones act like helicopter parents, jumping in at the first signs of bad behavior to nip misfolding in the bud or to sequester problematically folded proteins before their aggregation causes disease." 

Lior Pachter (a Caltech professor of computational biology) states that while the AlphaFold2 software has made significant progress, "protein folding is not a solved problem." He scolds his colleagues who have got sucked in by the hype on this matter. 

In an article in Chemistry World, we read the following:

"Others take issue with the notion that the method ‘solves the protein folding problem’ at all. Since the pioneering work of Christian Anfinsen in the 1950s, it has been known that unravelled (denatured) protein molecules may regain their ‘native’ conformation spontaneously, implying that the peptide sequence alone encodes the rules for correct folding. The challenge was to find those rules and predict the folding path. AlphaFold has not done this. It says nothing about the mechanism of folding, but just predicts the structure using standard machine learning. It finds correlations between sequence and structure by being trained on the 170,000 or so known structures in the Protein Data Base: the algorithm doesn’t so much solve the protein-folding problem as evade it. How it ‘reasons’ from sequence to structure remains a black box. If some see this as cheating, that doesn’t much matter for practical purposes."

No doubt, the fine work of the DeepMind corporation in improving the performance of their AlphaFold2 software will prove useful somehow, in areas such as the pharmaceutical industry.  But when a modest success was achieved, that should not be inaccurately described as a solution to one of nature's great mysteries.