Asimov Blog

The Biotech Digest No. 5

Written by Niko McCarty | Jul 14, 2024 9:21:27 PM

This weekly digest highlights recent papers and news in biotechnology. Please send feedback to niko@asimov.com.

Gene-Editing Torrent

The “programmable DNA scissors” known as CRISPR-Cas9 debuted just 12 years ago. It’s startling to consider how much progress the gene-editing field has made since then.

Cas proteins have been unearthed, engineered, and adapted into a veritable Swiss Army knife for molecular biology. It is now relatively straightforward to swap out individual bases in the genome, paste in massive chunks of DNA, or even cut RNA and protein molecules. The tools have gotten smaller, and thus easier to package into viruses for delivery. In some cases, they have even been open-sourced.

This progress shows no signs of slowing. Just this week, a slew of papers reported:

  1. A new technology called "multitrons" that can make insertions, deletions, and gene replacements in up to five genes at once, in a wide range of organisms. Nature Chemical Biology
  2. Phages and base editors can be used to genetically modify specific bacteria inside the mouse gut. Using a phage particle that naturally infects E. coli, the researchers delivered base editors that edited a target gene in the microbe with 93% efficiency. Nature
  3. Tiny base editors, called IminiBEs, that are just 496 amino acids in length, or about one-third the size of Cas9. In vitro experiments showed that these mini base editors could perform C-to-T swaps at 16 genomic sites with 67% average efficiency. Nature Chemical Biology
  4. A new gene-editing tool based on retrotransposons, or “jumping genes,” can insert genes into specific sites in the genome using only RNA molecules. In mouse embryos, it had 60% efficiency and inserted the gene at the right place 99% of the time. Cell
  5. Prime editors that can be used to correct the most common genetic mutation underlying cystic fibrosis. In cells taken from afflicted patients, prime editors repaired the CFTR gene in about 25% of cells. Nature Biomedical Engineering

This is all great news! But it’s also important, I think, to occasionally take a step back and think about the “big problems” that bottleneck a field. The whole point of making better gene-editing tools is to speed up basic science or, for most companies, to cure diseases in people. And if you go to a big conference like the annual meeting of the American Society of Gene & Cell Therapy, it seems like most scientists are quite happy with basic CRISPR gene-editing tools, and are much more focused on solving problems with their delivery, immunogenicity, and manufacturing.

Manufacturing and delivery remain major challenges. It is still difficult to make long strands of RNA or single-stranded DNA, for example. It is simpler to make double-stranded DNA, but double-stranded DNA is often more immunogenic when delivered to cells. It is still difficult to deliver gene editors to solid tissues, such as muscles, and much easier to deliver therapies to the eyes or blood. Some gene editors are still too large to fit inside delivery vehicles, such as adeno-associated viruses or AAVs. And if it’s too difficult to manufacture a gene-editing therapy, then that therapy can probably only be used (at least initially) to target ultra-rare diseases.

In other words, it seems like the greatest leaps in progress tomorrow will come not from the tools themselves, but from everything that follows downstream. Just something to think about.

Thanks to Arturo Casini for help writing this section.

Multimodal LLMs for Science

OpenAI and Los Alamos National Laboratory are collaborating to study how AI tools can bolster bioscience research, according to a press release. This is the crux of the announcement:

“Our upcoming evaluation with Los Alamos will be the first experiment to test multimodal frontier models in a lab setting by assessing the abilities of both experts and novices to perform and troubleshoot…standard laboratory experimental tasks…Tasks may include transformation (e.g., introducing foreign genetic material into a host organism), cell culture…and cell separation…By examining the uplift in task completion and accuracy enabled by GPT-4o, we aim to quantify and assess how frontier models can upskill both existing professionals / PhDs as well as novices in real-world biological tasks.”

A few weeks ago, OpenAI released a multimodal model called GPT-4o that can take text, audio, images, and videos as inputs. The model generates responses much faster than prior versions: just 320 milliseconds for audio, on average. This low latency makes it much more likely that LLMs can be used to query and navigate in the real world. And it seems like OpenAI is hinting that GPT-4o and future multimodal models could be used to “make scientists better.”

When I was working on education initiatives at MIT, I went around and spoke to more than 100 CEOs and executives at biotechnology companies. I asked all of them the same question: “What would it take for someone with only a Bachelor’s degree to become a scientist at your company?” About 80% of people said that this was not possible, and the other 20% (mostly startups) said that it was possible, but only in rare cases.

High-level scientists and executives at biotechnology companies often have PhDs, and it's difficult for those with community college or Bachelor's diplomas to move up the career ladder. This isn’t necessarily the fault of companies, either. Biotechnology is such a sprawling field—encompassing molecular and cell biology, genetics, genetic engineering, biochemistry and, increasingly, computational skills—that it can take many years to become an expert researcher. It would be difficult to teach all these skills, in addition to "critical reasoning," at sufficient depth during an undergraduate program. But perhaps multimodal LLMs could fill the gap.

I’m somewhat skeptical that using an AI assistant to guide a human scientist on a wide range of tasks will work in the near term. Many experiments in biology today rely upon tacit knowledge. Information is not always written down. Methods are often described poorly, or omit important details that are necessary to execute an experiment.

But still, if the bioeconomy is ever to reach $4 trillion in value in the next one or two decades, as Schmidt Futures has suggested, then we will need to train millions more people to join the workforce. Expanding PhD programs cannot be the answer to this problem. We will also need better undergraduate programs, more community college efforts and, perhaps, some help from AI.

Plasmid Errors

There was an interesting preprint that came out last month. A Chicago-based company, called VectorBuilder, studied thousands of plasmids—the loops of DNA that scientists often use to express genes in cells—to figure out how many had design or sequence errors in them. A single mutation in a plasmid’s DNA can throw off the results of an entire experiment, and scientists don’t always sequence an entire plasmid to ensure it’s correct.

The gist of the preprint is this: Out of 2,521 plasmids that were received and studied “from academia and industry around the world,” about 15% had “significant design errors that could impact function.” Scientists often used the wrong promoters, or placed genetic components in the wrong orientations. Many plasmids also encoded toxic genes, which could kill E. coli cells during cloning.

The VectorBuilder team additionally checked 852 plasmids for possible sequence errors, and found “inconsistent” patterns in 15% of them. They then sequenced 259 plasmids in their entirety, and found that 35% of them had “sequence variations” from the reference sequences provided by the scientists. The conclusion is sobering:

“In total, we estimate that 45-50% of lab-made plasmids have undetected design and/or sequence errors that could potentially compromise the intended applications. Indeed, we suspect that this figure may underestimate the true scale of quality issues in lab-made plasmids because we had asked our clients to check the designs and sequences of their plasmids before submission to us, and also because they were paying for our services utilizing their plasmids.”
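As a rough sanity check on that 45-50% figure, here is a minimal Python sketch of one way the two headline rates could combine. This is not the preprint's method, which isn't described in this digest. The function name `combined_error_rate` is hypothetical, and treating design errors and sequence errors as independent is an assumption, especially since the two rates come from samples of different sizes (2,521 and 259 plasmids).

```python
# Rates quoted in the summary above.
design_error_rate = 0.15    # share of 2,521 plasmids with significant design errors
sequence_error_rate = 0.35  # share of 259 fully sequenced plasmids with sequence variations


def combined_error_rate(p_design: float, p_sequence: float) -> float:
    """Chance that a plasmid has at least one kind of error.

    ASSUMPTION: design errors and sequence errors occur independently.
    """
    return 1 - (1 - p_design) * (1 - p_sequence)


# If the two error types are independent, some plasmids carry both.
independent_estimate = combined_error_rate(design_error_rate, sequence_error_rate)

# If no plasmid ever carries both error types, the rates simply add.
no_overlap_estimate = design_error_rate + sequence_error_rate

print(f"Independent errors: {independent_estimate:.2%}")
print(f"No overlap:         {no_overlap_estimate:.2%}")
```

Under these assumptions, the two scenarios come out to roughly 45% and 50%, which is at least consistent with the range quoted above.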

An easy way to avoid many of the errors flagged in this study is to sequence plasmids in their entirety. Plasmidsaurus and other startups offer full plasmid sequencing for $15.

Papers You Might Have Missed

(* = Recommended)

*Three-dimensional genome architecture persists in a 52,000-year-old woolly mammoth skin sample. Cell

*Targeted genome editing restores auditory function in adult mice with progressive hearing loss caused by a human microRNA mutation. Science Translational Medicine

*Integrated translation and metabolism in a partially self-synthesizing biochemical network. Science

*Carefully controlling the expression of multiple genes at once. Cell Systems

Template-independent enzymatic synthesis of RNA oligonucleotides. Nature Biotechnology

Repurposing Type I-A CRISPR-Cas3 for a robust diagnosis of human papillomavirus (HPV). Communications Biology

DropBlot: single-cell western blotting of chemically fixed cancer cells. Nature Communications

Machine learning to identify DNA sequences that might encode nanostructures that give bacteria color. PNAS

Electroporation works better in H. influenzae cells lacking the restriction enzymes HindII and HindIII. bioRxiv

CD22-directed CAR T-cell therapy for large B-cell lymphomas progressing after CD19-directed CAR T-cell therapy: a dose-finding phase 1 study. The Lancet

GeneMAP enables discovery of metabolic gene function. Nature Genetics

Two-dimensional high-throughput on-cell screening of immunoglobulins against broad antigen repertoires. Communications Biology

*High throughput platform technology for rapid target identification in personalized phage therapy. Nature Communications

12-month neurological and psychiatric outcomes of semaglutide use for type 2 diabetes: a propensity-score matched cohort study. eClinicalMedicine

Temperature change elicits lipidome adaptation in the simple organisms Mycoplasma mycoides and JCVI-syn3B. Cell Reports

SHIELD: Skull-shaped hemispheric implants enabling large-scale electrophysiology datasets in the mouse brain. Neuron

*Modified nucleotides for self-amplifying RNA boost potency. Nature Biotechnology

Guiding cell-state transitions with microRNA sensors. Nature Biomedical Engineering

*Miniature base editors with broad targeting range. Nature Chemical Biology

Design rules for retron gene editors. bioRxiv

Phase 3 trial of an antisense oligonucleotide for hereditary angioedema. NEJM

*Multi-site genome editing with retron arrays. Nature Chemical Biology

A fast way to make lipids for in vivo mRNA delivery. Nature Chemistry

Stable AAV expression in mice using Sleeping Beauty and lipid nanoparticles. Molecular Therapy

Base editing of bacteria inside the mouse gut. Nature

*Optimized prime editors to correct a cystic fibrosis-causing mutation in human cells. Nature Biomedical Engineering

TCRen predicts whether T cell receptors will recognize epitopes. Nature Computational Science

*All-RNA-mediated targeted gene integration in mammalian cells with engineered retrotransposons. Cell

Recombinant human enamelin produced in E. coli. BMC Biotechnology

In Other News…

Elon Musk’s Neuralink Is Ready to Implant a Second Volunteer. WIRED

Bridge RNA: A new gene editing technique that could overcome the limitations of CRISPR. Labiotech

Woolly Mammoth Skin "Freeze-Dried" For 52,000 Years Delivers First-Ever 3D Chromosomes. IFL Science

About half of analyzed plasmids had “errors in sequences crucial to expressing a therapeutic gene.” Nature

There were 38 biotech layoff announcements in Q2 of 2024. Fierce Biotech

Flagship Pioneering raises $3.6 billion in new funds. STAT

First fossil chromosomes discovered in freeze-dried mammoth skin. Nature

Study suggests that people with type 2 diabetes who take GLP-1 drugs have lower risks of developing 10 of 13 obesity-associated cancers than those who take insulin. Ars Technica

Total larynx transplant restores a patient’s voice. IFL Science