GPT-Rosalind
GPT-Rosalind is a purpose-built frontier reasoning model developed by OpenAI to accelerate scientific research across biology, drug discovery, and translational medicine. It is designed specifically for life sciences workflows, where researchers must navigate large volumes of literature, experimental data, and specialized databases to generate and validate new ideas. It combines deep domain understanding in areas such as chemistry, genomics, protein engineering, and disease biology with advanced tool-use capabilities, allowing it to interact with scientific databases, analyze experimental outputs, and support complex, multi-step reasoning tasks. It can assist with evidence synthesis, hypothesis generation, literature review, sequence interpretation, and experimental planning, helping scientists move faster from raw data to actionable insights. GPT-Rosalind transforms complex, time-intensive research processes into more efficient AI-assisted workflows.
Evo Designer
Evo Designer is a tool developed by the Arc Institute that uses the Evo 2 genomic foundation model to generate and analyze DNA sequences. Users supply a nucleotide sequence or specify an organism as a prompt, and the model generates a corresponding DNA sequence. The tool annotates coding regions in the output and, for prokaryotic sequences, renders 3D protein visualizations with ESMFold. Evo Designer also scores sequences by perplexity and per-nucleotide entropy, helping researchers assess sequence complexity and variability. The underlying Evo 2 model is trained on over 9 trillion nucleotides from a diverse array of prokaryotic and eukaryotic genomes, and its deep learning architecture models biological sequences at single-nucleotide resolution with a context window of up to 1 million tokens.
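As a rough illustration of the two scores mentioned above, the sketch below computes perplexity and per-nucleotide entropy from a model's per-position probability distributions. It uses the standard textbook definitions; the listing does not state how Evo Designer computes these values internally, so the function names and the example probabilities here are purely hypothetical.

```python
import math

def per_nucleotide_entropy(distributions):
    """Shannon entropy (in bits) of the predicted distribution at each position.

    `distributions` is a list of dicts mapping each nucleotide to its
    predicted probability (hypothetical model output, not Evo 2's format).
    """
    return [
        -sum(p * math.log2(p) for p in dist.values() if p > 0)
        for dist in distributions
    ]

def perplexity(sequence, distributions):
    """Exponential of the mean negative log-likelihood of the observed bases."""
    nll = -sum(math.log(dist[base]) for base, dist in zip(sequence, distributions))
    return math.exp(nll / len(sequence))

# Hypothetical example: the model is maximally uncertain at every position.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
seq = "ACGT"
dists = [uniform] * len(seq)

entropies = per_nucleotide_entropy(dists)
ppl = perplexity(seq, dists)
```

With a uniform distribution over four nucleotides, each position carries 2 bits of entropy and the perplexity is 4, the highest possible for this alphabet. Lower values indicate that the model finds the sequence more predictable.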
ESMC
ESMC is the latest in the ESM family of protein language models, establishing a new frontier in representation learning for protein biology. Trained on billions of evolutionary sequences, it learns representations that reflect a mechanistic reduction of protein structure and function. The model is built on a transformer architecture, uses sequence as its core modality, and is trained on up to 6 billion proteins. ESMC is designed for protein science research, including structure prediction, function annotation, protein design, and understanding evolutionary relationships between proteins. It can generate novel proteins from partial sequence, structure, or functional constraints, helping researchers explore new possibilities in protein design and biological discovery. The Biohub Platform provides access to ESMC through the API and the ESM Python package, with quickstart resources for installing the package, creating an API key, and connecting to the platform.
NVIDIA BioNeMo
BioNeMo is an AI-powered drug discovery cloud service and framework built on NVIDIA NeMo Megatron for training and deploying large biomolecular transformer models at supercomputing scale. The service includes pre-trained large language models (LLMs) and native support for common file formats for proteins, DNA, RNA, and chemistry, with data loaders for SMILES (molecular structures) and FASTA (amino acid and nucleotide sequences). The BioNeMo framework will also be available for download to run on your own infrastructure. ESM-1, based on Meta AI’s state-of-the-art ESM-1b, and ProtT5 are transformer-based protein language models that can be used to generate learned embeddings for tasks such as protein structure and property prediction. OpenFold, a deep learning model for 3D structure prediction of novel protein sequences, will be available in the BioNeMo service.
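For readers unfamiliar with the FASTA format mentioned above, the following is a minimal sketch of what a FASTA reader does: it groups sequence lines under the `>` header that precedes them. This is plain Python written for illustration only; it is not BioNeMo's data loader, and the function name `parse_fasta` and the sample records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record_id: sequence}.

    The record id is taken as the first whitespace-delimited token
    after '>' on each header line. Illustrative only, not BioNeMo code.
    """
    records = {}
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequences may wrap across lines
    if header is not None:
        records[header] = "".join(chunks)
    return records

# Hypothetical input: one protein and one nucleotide record.
sample = """>prot1 example protein
MKTAYIAK
QRQISFVK
>dna1 example nucleotide sequence
ACGTACGT
"""
parsed = parse_fasta(sample)
```

Here `parsed` maps `prot1` to the joined sequence `MKTAYIAKQRQISFVK` and `dna1` to `ACGTACGT`. A production loader would also validate the alphabet and handle very large files in a streaming fashion.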