Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

269 of 5,923 resources

Showing 51-100

Unified Python framework for bulk, single-cell, and spatial RNA-seq multi-omics analysis with deep learning deconvolution (VAE) and graph neural networks, bridging Bindea, scanpy and squidpy ecosystems (Nature Communications 2024)

Active · 1K stars · 2 weeks ago
Python
GPL-3.0

Automates and standardizes ligand preparation for AutoDock Vina.

Active · 185 stars · 2 weeks ago
Python
Apache-2.0

Simulation of large-scale brain models

Active · 929 stars · 2 weeks ago
Python
NOASSERTION

Cross-platform library for differentiable programming of quantum computers with automatic differentiation, enabling hybrid quantum-classical machine learning for quantum chemistry, quantum physics, and NISQ algorithm research (Xanadu, 3k+ stars)

Active · 3.2K stars · 3 weeks ago
Python
Apache-2.0

Open-source framework for building physics-ML models at scale (renamed from Modulus, 2025)

Active · 2.8K stars · 3 weeks ago
Python
Apache-2.0

Automated cell type annotation tool for single-cell transcriptomics using gradient boosting and logistic regression with reference atlases, enabling standardized classification across datasets (Wellcome Sanger Institute, Nature Biotechnology 2022)

Active · 486 stars · 3 weeks ago
Python
MIT

Next-generation benchmark for data-driven global weather models with standardized evaluation framework and curated datasets for ML forecasting (Google Research, 2024)

Active · 614 stars · 3 weeks ago
Python
Apache-2.0

Transformer encoder-decoder for de novo peptide sequencing from tandem mass spectrometry, translating MS/MS spectra directly to peptide sequences without reference databases, enabling identification of novel peptides for immunopeptidomics, antibody repertoires, and metaproteomes (Noble Lab UW, Nature Communications 2024)

Active · 187 stars · 3 weeks ago
Python
Apache-2.0

Interactive and hardware-agnostic SDK for laboratory automation, enabling programmatic control of liquid handlers, plate readers, and other lab instruments across multiple vendors; foundational infrastructure for self-driving laboratories and AI-driven experimental execution (447+ stars)

Active · 450 stars · 3 weeks ago
Python
MIT

GPU-accelerated differentiable physics simulation engine built on NVIDIA Warp, supporting rigid/soft body, cloth, and gradient-based optimization for scientific ML, initiated by Disney Research, DeepMind, and NVIDIA (Linux Foundation, Apache 2.0, 2025)

Active · 5K stars · 3 weeks ago
Python
Apache-2.0

AlphaFold 3 inference pipeline for unified biomolecular structure prediction of proteins, nucleic acids, small molecules, ions, and post-translational modifications (Google DeepMind, Nature 2024)

Active · 8.1K stars · 3 weeks ago
Python
NOASSERTION

Automate downloading, opening, and parsing DrugBank.

Active · 65 stars · 3 weeks ago
Python
MIT

Deep learning-based bioacoustic monitoring framework for automated bird species identification from audio recordings, supporting 6,000+ species globally with real-time analysis, batch processing, and API deployment; foundational tool in biodiversity research, conservation biology, and ecological acoustic monitoring (Cornell Lab of Ornithology, 1.5K+ stars, MIT License)

Active · 1.6K stars · 3 weeks ago
Python
MIT

LLM agents for working with the SRA (Sequence Read Archive) and associated bioinformatics databases, enabling natural language querying of high-throughput sequencing data and metadata across genomic repositories (Arc Institute, 169+ stars, 2024-2026)

Active · 170 stars · 3 weeks ago
Python
MIT

Modern LLM-native agent simulation platform for social science research and experimental design, providing a flexible framework for creating and managing intelligent agents in simulated environments (Tsinghua FIB Lab, 984+ stars, 2025)

Active · 1K stars · 3 weeks ago
Python
Apache-2.0

DeepMind's neural network for ab-initio quantum chemistry, directly solving the many-electron Schrödinger equation via variational Monte Carlo with antisymmetric wavefunctions, extended to excited states (Phys. Rev. Research 2020, Science 2024)

Active · 844 stars · 3 weeks ago
Python
Apache-2.0

Biological vision foundation model trained on TreeOfLife-200M, yielding extraordinary accuracy on diverse biological visual tasks including habitat classification and trait prediction despite a narrow training objective (Ohio State University Imageomics Institute)

Active · 68 stars · 3 weeks ago
Python
NOASSERTION

A package for accessing data from the NIST webbook...

Active · 56 stars · 3 weeks ago
Python
MIT

E(3)-equivariant neural network interatomic potentials achieving DFT accuracy with up to 1000× less training data than invariant models, foundational architecture behind MACE and Allegro (Harvard, MIT, Nature Communications 2022)

Active · 914 stars · 3 weeks ago
Python
MIT

Pretrained time series foundation model for long-horizon forecasting across diverse scientific domains including climate variables, biomedical signals, and physical observations; decoder-only Transformer architecture with strong zero-shot generalization (19.8K+ stars, Apache 2.0, 2024-2025)

Active · 20.1K stars · 3 weeks ago
Python
Apache-2.0

Interaction Fingerprints for protein-ligand complexes and more.

Active · 501 stars · 3 weeks ago
Python
Apache-2.0

Utilities for working with CSV/Tab-delimited files.

Active · 6.4K stars · 3 weeks ago
Python
MIT

Automatic atomic model building program for cryo-EM maps using deep learning, enabling rapid de novo protein structure determination from electron density with high accuracy (3DEM/EMBL, 169+ stars)

Active · 169 stars · 3 weeks ago
Python
MIT

A swiss army knife for manipulating and editing PDB files.

Active · 454 stars · 4 weeks ago
Python
Apache-2.0

First fully customizable open-source multi-agent framework automating the complete research lifecycle, from idea conception to LaTeX papers, with dynamic workflows

Active · 560 stars · 4 weeks ago
Python
MIT

General multimodal protein design framework enabling DNA-encoding of chemistry for programmable enzyme design and diverse protein generation through diffusion-based generative modeling (190+ stars, Apache 2.0, 2026)

Active · 190 stars · 1 month ago
Python
Apache-2.0

Numerical differential equation solving in JAX

Active · 2K stars · 1 month ago
Python
Apache-2.0

Pythonic Access to the Ensembl database.

Active · 400 stars · 1 month ago
Python
Apache-2.0

Deep learning with spiking neural networks in Python, providing gradient-based training of SNNs via PyTorch autodifferentiation for brain-inspired computing and neuromorphic research, with online learning capabilities and extensive tutorials (1.9K+ stars, actively maintained)

Active · 2K stars · 1 month ago
Python
MIT

Learning operators in Fourier space

Active · 3.7K stars · 1 month ago
Python
MIT

A Python package useful for chemistry (mainly physical/inorganic/analytical chemistry)

Active · 646 stars · 1 month ago
Python
BSD-2-Clause

Multi-LLM consensus framework for automated cell type annotation in single-cell transcriptomics, integrating predictions from 10+ large language models with iterative discussion and uncertainty quantification to reduce single-model biases, achieving up to 95% accuracy without reference datasets; available as CRAN R package and PyPI Python package with Scanpy/Seurat integration (2025)

Active · 641 stars · 1 month ago
Python
MIT

Open-source self-supervised vision foundation model for Earth observation by Clay Foundation (non-profit), a Masked Autoencoder ViT pretrained on multimodal satellite imagery (Sentinel-1/2, Landsat 8-9, NAIP, MODIS, LINZ DEM) with location/time embeddings, supporting classification, segmentation, change detection, similarity search, and few-shot downstream geospatial tasks (Apache 2.0, v1.5 2024-2025)

Active · 579 stars · 1 month ago
Python
Apache-2.0

Machine learning interatomic potentials

Active · 1.2K stars · 1 month ago
Python
NOASSERTION

Composite-objective protein design framework integrating Boltz, AlphaFold2, OpenFold3, ProteinMPNN, and ESM via JAX-based gradient optimization over continuous relaxed sequence space for multi-property binder design (319+ stars, MIT License, 2025)

Active · 323 stars · 1 month ago
Python
MIT

Co-create PowerPoint presentations with Generative AI from documents or topics

Active · 358 stars · 1 month ago
Python
MIT

Medical large vision-language model unifying comprehension and generation via heterogeneous knowledge adaptation, enabling holistic medical image understanding, visual question answering, and clinical report generation across diverse modalities (ZJU4HealthCare, 1.6K+ stars)

Active · 1.6K stars · 1 month ago
Python
Apache-2.0

Closed-loop multi-agent system from hypothesis to verification across 12 scientific tasks, #1 on MLE-Bench (36.44%)

Active · 1.3K stars · 1 month ago
Python
NOASSERTION

Google DeepMind's unified DNA sequence foundation model predicting molecular consequences of genetic variants from single-base resolution up to 1 megabase context, jointly outputting thousands of regulatory tracks (RNA expression, splicing, chromatin accessibility, TF binding, contact maps) for human and mouse genomes via a Python client and non-commercial API (2025)

Active · 1.9K stars · 1 month ago
Python
Apache-2.0

Benchmark quantifying end-to-end autonomous AI research abilities of LLM agents across 20 tasks from SOTA machine learning papers spanning NLP, code, math, biochemical modelling, and time series forecasting, with scores normalized against human SOTA and a HuggingFace dataset

Active · 94 stars · 1 month ago
Python
NOASSERTION

Machine learning model predicting cellular perturbation response across diverse contexts with State Transition (ST) and State Embedding (SE) variants, featuring CLI tooling, PyPI distribution, and Virtual Cell Challenge integration (575+ stars)

Active · 587 stars · 1 month ago
Python
NOASSERTION

Probabilistic framework for inferring cell fate decisions and trajectory dynamics from multi-view single-cell data using Markov chains and machine learning, integrating RNA velocity, pseudotime, and metabolic labeling to predict differentiation paths and terminal states (scverse/Theis Lab, 449+ stars, BSD 3-Clause)

Active · 450 stars · 1 month ago
Python
BSD-3-Clause

General-purpose RNA language model with 650M parameters pretrained on 36M non-coding RNA sequences, achieving strong generalization on structure prediction tasks including secondary structure prediction, splice-site prediction, mean ribosome loading, and ncRNA classification (lbcb-sci, 165+ stars, Apache-2.0)

Active · 165 stars · 1 month ago
Python
Apache-2.0

First physics-aligned interactive benchmark for LLM agents in engineering construction, designing rockets, cars, and bridges in a physics simulator with a 3D spatial geometry library

Active · 92 stars · 1 month ago
Python
NOASSERTION

LLM-driven machine learning engineering agent using agentic tree search to autonomously draft, debug and benchmark ML code; wins 4× more medals than the best linear agent on OpenAI's MLE-Bench (75 Kaggle competitions) (1.3K+ stars, MIT License)

Active · 1.3K stars · 1 month ago
Python
MIT

JCVI is a versatile toolkit for comparative genomics analysis. It is a collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.

Active · 916 stars · 1 month ago
Python
BSD-2-Clause

Incremental knowledge graph construction using LLMs with entity extraction and Neo4j visualization

Active · 947 stars · 1 month ago
Python
Apache-2.0

Unified framework for state-of-the-art pre-trained bio foundation models across genomics and transcriptomics, providing standardized interfaces and pipelines for DNA, RNA, and single-cell models including Evo 2, Geneformer, scGPT, and UCE with streamlined inference, benchmarking, and fine-tuning workflows (213+ stars, 2024-2025)

Active · 215 stars · 1 month ago
Python
AGPL-3.0

Fast spike sorting with drift correction for extracellular electrophysiology, enabling universal neural spike sorting via deep learning on high-density neural probe recordings (MouseLand, 609+ stars)

Active · 616 stars · 1 month ago
Python
GPL-3.0

Directed message passing neural networks for property prediction of molecules and reactions with uncertainty and interpretation.

Active · 2.4K stars · 1 month ago
Python
NOASSERTION