Integral World: Exploring Theories of Everything

An independent forum for a critical discussion of the integral philosophy of Ken Wilber

Frank Visser, graduated as a psychologist of culture and religion, founded IntegralWorld in 1997. He worked as production manager for various publishing houses and as service manager for various internet companies and lives in Amsterdam. Books: Ken Wilber: Thought as Passion (SUNY, 2003), and The Corona Conspiracy: Combatting Disinformation about the Coronavirus (Kindle, 2020).

NOTE: This essay contains AI-generated content

Digital Genomes and the Denial of Viruses

Why the Data Still Holds

Frank Visser / ChatGPT

Context: In 2020 and 2021 I wrote two online books on virus deniers and PCR skeptics.

The emergence of digital virology has given rise not only to astonishing scientific advances but also to a backlash of skepticism.[1] From fringe forums to public court cases, a number of critics — most prominently German biologist Stefan Lanka — have argued that viruses like SARS-CoV-2 were never actually isolated, and that their genomes are constructed through dubious digital processes that could, in theory, generate any virus from the same raw data. These concerns, though rooted in misunderstandings, tap into deeper anxieties about the opacity of computational biology and the ease with which digital artifacts can be misinterpreted as biological reality.

While it is important to engage these concerns in good faith, the conclusions drawn by virus skeptics do not hold up to scrutiny. The methods used to detect, sequence, and assemble viral genomes — especially in the case of SARS-CoV-2 — are robust, reproducible, and constrained by both statistical and biochemical reality. Far from being arbitrary constructs, modern viral genomes are the outcome of overlapping, independently validated reads supported by stringent bioinformatic controls.

The Isolation Debate: A Red Herring?

Virus skeptics often begin by pointing out that viruses are not isolated in the classical Pasteurian sense: as purified, visible particles separated from all cellular material. Instead, modern virology relies on molecular methods such as reverse transcription, PCR amplification, and genome sequencing — none of which resemble the test tube “capture” metaphor that the term “isolation” historically evokes.

But this shift is not a conspiracy or a sign of fraud — it reflects the evolution of technology. Viruses are submicroscopic and obligate intracellular entities. Their identification depends on:

Cytopathic effects in cultured cells
Electron microscopy
Specific antibodies
PCR amplification of unique genetic sequences
De novo genome assembly from RNA extracts

The criticism that “viruses aren't isolated” confuses methodological change with epistemic failure. In truth, modern viral detection provides far more detail and reproducibility than classical methods ever could.

Constructing a Genome: Reads, Overlaps, and Contigs

The core of Lanka's claim is that viral genomes are not discovered but invented — assembled from short snippets of genetic data (reads) using reference-based software that allegedly tells the computer what to find. He even claimed that one could construct “any genome” from the same set of data, simply by choosing a different reference.

Here, the science parts ways with the ideology.

Modern viral genome assembly — particularly for SARS-CoV-2 — follows two main approaches:

Reference-guided assembly: aligning sequencing reads to a known genome.
De novo assembly: building a genome purely from overlaps between reads, without a reference.

The latter, de novo assembly, offers the strongest rebuttal to virus denialism. It depends on statistical overlap of sequences — not wishful alignment. For example:

Read 1: ATGCGTACGT-------
Read 2: ---CGTACGTGAA----
Read 3: --------GTGAACCAT

These overlapping reads form a contig (continuous sequence):

Contig: ATGCGTACGTGAACCAT

You cannot, from these same reads, construct a genome like “TTGACCCGGGATTA.” There is no statistical overlap, no pathway to assemble such a sequence, and any attempt to do so results in fragmentation, low coverage, or outright failure in standard software.
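The overlap logic above can be sketched in a few lines of Python. This is a toy illustration only, not how production assemblers work: the function names (`overlap`, `assemble`, `kmers`), the minimum overlap of 3 bases, and the k-mer size of 5 are all assumptions chosen to fit the tiny example. It merges the three reads by their suffix-prefix overlaps and then checks whether the unrelated sequence shares any 5-base stretch with the reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(reads: list[str], min_len: int = 3) -> str:
    """Extend a contig read by read; fail if a read does not overlap it."""
    contig = reads[0]
    for read in reads[1:]:
        n = overlap(contig, read, min_len)
        if n == 0:
            raise ValueError(f"read {read!r} does not overlap the contig")
        contig += read[n:]
    return contig


def kmers(seq: str, k: int) -> set[str]:
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


reads = ["ATGCGTACGT", "CGTACGTGAA", "GTGAACCAT"]
contig = assemble(reads)
print(contig)  # ATGCGTACGTGAACCAT

# Does the unrelated sequence share any 5-base stretch with the reads?
target = "TTGACCCGGGATTA"
read_kmers = set().union(*(kmers(r, 5) for r in reads))
shared = kmers(target, 5) & read_kmers
print(shared)  # set(): nothing in the reads supports the target
```

The reads determine the contig; the unrelated sequence finds no support in them at all, so there is nothing for an assembler to build it from.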

SARS-CoV-2: A Case Study in Genomic Rigor

When the first reports of a novel respiratory virus surfaced in Wuhan in December 2019, researchers extracted RNA from patient lung fluid, sequenced it, and independently assembled a viral genome. This genome, belonging to the virus now known as SARS-CoV-2, was:

~30,000 nucleotides in length
Rich in coronavirus-specific open reading frames (ORFs)
Supported by deep read coverage and long overlaps
Validated by labs around the world within weeks

The original assembly was no one-off construct: millions of SARS-CoV-2 genomes were later sequenced and uploaded to open databases (e.g., GISAID), differing from one another by only minor mutations, a strong sign of biological consistency rather than digital fabrication.

No lab, including Lanka's, has ever shown that a radically different genome (e.g., HIV or measles) could be assembled from SARS-CoV-2 data. Such an achievement would require long, consistent overlaps and high coverage across a completely unrelated viral architecture — a statistical impossibility.
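The same point can be made for reference-guided assembly with a toy sketch. The helper `coverage_fraction` is a hypothetical name, and it places reads by exact string matching only; real read mappers tolerate mismatches and gaps, so treat this as an illustration of the principle rather than of any actual tool. It reuses the toy reads from the overlap example earlier in this essay and asks how much of each candidate reference they cover.

```python
def coverage_fraction(reference: str, reads: list[str]) -> float:
    """Fraction of reference positions covered by at least one exactly matching read."""
    covered = [False] * len(reference)
    for read in reads:
        start = reference.find(read)
        while start != -1:
            for i in range(start, start + len(read)):
                covered[i] = True
            # Look for further occurrences of the same read.
            start = reference.find(read, start + 1)
    return sum(covered) / len(reference)


# Toy reads from the overlap example earlier in this essay.
reads = ["ATGCGTACGT", "CGTACGTGAA", "GTGAACCAT"]

matching_reference = "ATGCGTACGTGAACCAT"  # the contig the reads came from
unrelated_reference = "TTGACCCGGGATTA"    # the unrelated sequence from the same example

print(coverage_fraction(matching_reference, reads))   # 1.0
print(coverage_fraction(unrelated_reference, reads))  # 0.0
```

Choosing a different reference does not conjure a different genome out of the data. It only leaves the reads with nowhere to go, which shows up as empty coverage.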

Control Studies: The Missing Evidence in Lanka's Case

A central pillar of Lanka's argument is that virologists fail to perform proper negative control experiments — such as sequencing RNA from uninfected cells under the same conditions, or running the assembly pipeline on random data to see if viral genomes emerge.

This accusation is simply false.

Virology labs regularly run such controls:

Mock-infected cells are used to detect background cytopathic effects.
Control sequencing libraries ensure no cross-contamination.
BLAST searches against control datasets confirm viral sequences are absent in uninfected samples.
Statistical validation tools compare real samples to background noise.

More importantly, no false viral genome has ever been assembled from uninfected samples using standard pipelines. If Lanka's claim were true, we would routinely see viral genomes emerging from blank runs — yet this has never happened.

Ironically, Lanka has often claimed to have performed such control studies himself — assembling “virus-like genomes” from healthy tissues or random data — but he has never published these results in peer-reviewed form. No raw data, no methods, no reproducible protocol. His critiques demand rigor from others while offering none of his own.

Read Length and the Collapse of Artefactual Assemblies

Virus skeptics often point to the danger of digital artifacts, and they're not wrong in principle. With very short reads (e.g., 20-30 base pairs), accidental overlaps are more likely, and with them misassemblies and undetected contamination.

However, modern sequencing methods use far longer reads, and quality control is built into every stage of the pipeline:

Illumina: 150-300 bp reads
PacBio / Nanopore: up to 100,000 bp
Assemblers like SPAdes and Velvet use k-mer (de Bruijn) graphs, error correction, and read-back validation
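To give a feel for what a k-mer graph is, here is a minimal sketch using the three toy reads from the overlap example earlier in this essay. It is an assumption-laden simplification: the function names and the choice of k = 5 are illustrative, and real assemblers such as SPAdes and Velvet add error correction, coverage weighting, and graph simplification that are not shown here. Each read is cut into 5-base words, consecutive words are linked, and a contig is read off only if the links form a single unbranched path.

```python
from collections import defaultdict


def de_bruijn_edges(reads: list[str], k: int) -> dict[str, set[str]]:
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    edges: dict[str, set[str]] = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[kmer[:-1]].add(kmer[1:])
    return edges


def walk_single_path(edges: dict[str, set[str]]) -> str:
    """Spell out the contig if the graph is one unbranched path; otherwise raise."""
    successors = {t for targets in edges.values() for t in targets}
    starts = [node for node in edges if node not in successors]
    if len(starts) != 1:
        raise ValueError("expected exactly one start node")
    node = starts[0]
    contig, seen = node, {node}
    while node in edges:
        targets = edges[node]
        if len(targets) != 1:
            raise ValueError(f"branch at {node!r}: the reads disagree")
        node = next(iter(targets))
        if node in seen:
            raise ValueError(f"cycle at {node!r}: repeated sequence")
        seen.add(node)
        contig += node[-1]  # each step adds exactly one new base
    return contig


reads = ["ATGCGTACGT", "CGTACGTGAA", "GTGAACCAT"]
graph = de_bruijn_edges(reads, k=5)
print(walk_single_path(graph))  # ATGCGTACGTGAACCAT
```

The path through the graph is dictated entirely by which 5-base words occur in the reads. Where reads conflict, the graph branches and the sketch refuses to guess.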

As read length increases, the chance of assembling a false genome drops exponentially. For example, the probability that a specific 30-base sequence matches a random stretch of the same length by chance is 1 in 4³⁰, or roughly 1 in 10¹⁸. With 150-base reads, that probability is virtually zero. Lanka's examples using extremely short reads amount to contrived demonstrations that exploit edge cases, not evidence of systemic fraud.

Read Length	Probability of Random Match	Implication
10 bp	1 in 1,048,576	High chance of random hits in large genomes
15 bp	~1 in 1.07 billion	Still some random matches in viral-scale data
30 bp	1 in 1.15 x 10¹⁸	Essentially zero chance of false match
150 bp	~1 in 2 x 10⁹⁰ (effectively zero)	Used in modern Illumina sequencing
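The figures in this table follow from simple arithmetic, sketched below under a deliberately crude assumption: all four bases are equally likely and independent, which real genomes only approximate. The function names and the 30,000-base genome length (the approximate SARS-CoV-2 length cited in this essay) are illustrative choices.

```python
def random_match_odds(read_length: int) -> int:
    """Return N such that a specific sequence of this length matches a
    random one with probability 1 in N (four equally likely bases)."""
    return 4 ** read_length


def expected_chance_hits(read_length: int, genome_length: int) -> float:
    """Expected number of positions in a random genome that match one
    specific read exactly, purely by chance."""
    positions = max(genome_length - read_length + 1, 0)
    return positions / random_match_odds(read_length)


GENOME_LENGTH = 30_000  # approximate SARS-CoV-2 genome length cited in this essay

for n in (10, 15, 30, 150):
    odds = random_match_odds(n)
    hits = expected_chance_hits(n, GENOME_LENGTH)
    print(f"{n:>3} bp: 1 in {odds:.3g}, expected chance hits: {hits:.3g}")
```

Under the same assumption, a 10-base sequence would be expected to turn up by chance roughly three thousand times in a genome of about 3 billion bases, such as the human genome, which is why reads that short say very little on their own.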

Fair Questions, False Conclusions

It's fair to ask:

How can we trust digital reconstructions of invisible entities?
Are genome assemblies vulnerable to bias or error?
Can reference genomes influence what we think we've “found”?

These are valid epistemological concerns. But virus skeptics leap from these questions to a radical, unwarranted conclusion: that viruses do not exist or are digital fabrications.

In reality:

Viral genomes are reproducible across labs and continents
Genome assembly is statistically constrained
SARS-CoV-2 was detected, sequenced, cultured, and imaged using multiple independent techniques

What virus denialists highlight is the need for scientific transparency, not the abandonment of virology. Their skepticism points to a communication gap, not a crisis of empirical reality.

Conclusion: Complexity Is Not Conspiracy

Science today operates at a level of complexity that makes room for misunderstanding. That doesn't mean it's invalid — only that it must be explained clearly, with humility and openness. Denying the existence of viruses like SARS-CoV-2 because they're digitally assembled is akin to denying the reality of gravitational waves because we see them as wiggly lines on a computer.

The question is not whether we build models — we always have. The question is how tightly constrained those models are by the evidence. In the case of SARS-CoV-2 and modern virology, the constraints are high, the evidence is deep, and the genomes are real.

NOTES

[1] See also my two volumes on virus deniers and PCR skeptics, "The Corona Conspiracy".
