By Jonathan D. Grinstein, PhD
Patricia Brennan talks about the science and technology mission at the Chan Zuckerberg Initiative (CZI) with the same level of conviction that marked President John F. Kennedy’s legendary May 1961 proposal that the United States commit to “landing a man on the Moon and returning him safely to the Earth” before the end of the decade.
“Our mission is to cure, prevent and manage all diseases by the end of the century,” Brennan, CZI vice president for science and technology, told GENEdge.
This phrase is not really new: it is an iteration of a similar slogan found on the CZI website and in its literature: “Is it possible to cure, prevent or manage all diseases by the end of this century? We think so.”
On September 19, 2023, CZI announced the funding and creation of one of the largest non-profit life sciences research computing systems in the world, which is expected to include more than 1,000 GPUs and will enable the use of artificial intelligence (AI) and large language models (LLMs) for large-scale biomedicine.
The CZI announcement did not reach as many people as another historic quote relating to the Moon landing, Neil Armstrong’s “One small step for man, one giant leap for mankind” in July 1969, which millions around the world watched with their faces glued to televisions. Its potential successes, however, could reach many more.
The AI triad
Machine learning (ML) tools, such as generative AI, large language models (LLMs), and foundation models, have taken the world of science, technology, and even popular culture by storm, as evidenced by ChatGPT from OpenAI. In biology and medicine, these machine learning tools are already accomplishing important tasks for scientists, such as identifying complex patterns and important variables within large amounts of information, to support significant advances in drug discovery, genetics, and precision medicine.
“Some say that biology is, in many ways, a computational challenge or endeavor,” Brennan said. This phrase echoes what American biophysicist and expert on the origins of life Harold Morowitz, PhD, said years ago: “Computer science is to biology what calculus is to physics. It is the natural mathematical technique that best reflects the character of the subject.”
Generally speaking, all AI applications, whether biomedical, robotic or economic, rely on the same triad: data, algorithms and computing power. In recent years, CZI has made advances in algorithms and data, as evidenced by its work in imaging and single-cell biology.
With the CELL by GENE (CELLxGENE) platform, CZI has worked with grantees and the broader scientific community to aggregate, standardize, integrate, curate and update single-cell data. The result gives researchers a comprehensive and growing aggregated dataset of more than 50 million unique cells, so they do not have to devote huge resources to creating their own datasets, which may not even be comparable to others.
“What we found is that not only this notion of aggregating data and standardizing it, but also making it available, making the whole corpus available in an easy-to-use, searchable mode, has really boosted the development of new models and research in these different areas,” Brennan said. “We’re seeing researchers move from what we call data-level analysis to atlas-level analysis, where they look at tissue atlases or other aggregated data sets.”
Other data sources include resources generated by CZ Science research institutes. The Chan Zuckerberg Biohub San Francisco created the OpenCell Protein Localization and Interaction Atlas as well as the Tabula Sapiens Cell Atlas. Meanwhile, the Chan Zuckerberg Institute for Advanced Biological Imaging (CZ Imaging Institute) will create large datasets of cells at molecular resolution.
The science and technology team at CZI has been and will continue to be engaged in the development of AI software. One such tool is CellGuide, a free interactive encyclopedia that provides researchers with crucial information on more than 700 distinct cell types and subtypes. This information includes definitions produced by ChatGPT, relevant datasets, an expandable ontology tree visualization of a cell’s lineage, and computational and canonical marker genes.
Additionally, in collaboration with the CZI science and technology team, the CZ Imaging Institute is developing an open source, cloud-based portal for querying curated data from cryo-electron tomography (cryoET) experiments.
“Over the past five years, I have seen tremendous progress in predicting the properties of individual molecules,” said Nicholas Sofroniew, PhD, director of product technology at CZI. “We can fold a single protein, but how do these proteins assemble in cells? That is still much less well known. So measurements with cryoET, where you see native proteins in their native environment, could be part of the next wave of AI and ML algorithms that the whole community can then work on and develop as we make these elements available.”
But according to Sofroniew, there is a very large gap between the computing resources available to individual academics and those available to a very small number of technology research labs. Sofroniew said CZI can fill this gap and help researchers solve problems they might not be able to tackle at the moment.
“There’s a really unique place for us to position ourselves that covers the types of problems we can solve and the ways we might want to solve them, and the types and amounts of computation we can bring to bear on these kinds of problems, which is increasingly becoming a constraint for computational biology,” Sofroniew said.
Strengthening fundamental science
In some ways, CZI has been building toward this moment, from both a grantmaking perspective and a technology perspective, as it has already gained ground in processing and understanding complex imaging and single-cell data. Last year, it recruited Stanford University biophysicist Stephen Quake, PhD, to become CZI’s new chief scientist, following the decision of his predecessor, Cori Bargmann, PhD, to return to her lab at Rockefeller University.
Although CZI focuses primarily on basic research, it has projects that include collaboration with researchers who study applications and understand disease mechanisms. However, CZI’s contribution to the cure, prevention and management of disease by the end of the century will not be directly through the discovery or development of therapies.
“Many pharmaceutical companies are very interested in AI and biology right now, and a lot of investment is being made in drug design,” Sofroniew said. “We’re not going to work on the same types of problems. CZI focuses on enabling tools, technologies, and data that will advance and enable additional scientific discoveries. We are not in the pharmaceutical business and we are not entering this business either.”
CZI’s goal is to approach this mission with a long-term view and to develop technology, from data to models to applications, that allows science to move faster and scientists to generate more science.
“[Biology’s] challenges take time and are best addressed by integrating a mix of software development, basic science and scientific research alongside grantmaking and identifying specialized funding opportunities,” said Brennan. “With all the developments, whether in large language models or just computing and computing power over the last few months or years, we see that there is an opportunity to bring it all together.”