Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

403 of 5,893 resources

Showing 51-100

Simulations of spiking neural networks.

Active · 1.2K · 1 week ago
Python
NOASSERTION

SOTA multimodal document parsing with 1.2B parameters outperforming GPT-4o, converts PDFs to LLM-ready Markdown/JSON

Active · 65.9K · 1 week ago
Python
NOASSERTION

World's first fully open, accelerated weather AI software stack with Medium Range forecasting and Nowcasting models using generative AI (January 2026)

Active · 959 · 2 weeks ago
Python
Apache-2.0

Fast, interactive, multi-dimensional image viewer for Python, foundational platform for scientific imaging AI with a rich plugin ecosystem integrating deep learning segmentation, object tracking, and microscopy analysis workflows (2.6K+ stars)

Active · 2.7K · 2 weeks ago
Python
BSD-3-Clause

Toolkit for large-scale whole-slide image processing supporting 22+ patch encoders (UNI, CONCH, Virchow, H-Optimus-0, etc.), slide encoders (TITAN, GigaPath, PRISM, CHIEF, Madeleine, Feather), tissue segmentation, and multi-GPU inference with end-to-end pipeline and smart resume for standardized deployment of computational pathology foundation models (Mahmood Lab, Harvard Medical School, 553+ stars)

Active · 567 · 2 weeks ago
Python
NOASSERTION

Python package for simulation-based inference enabling likelihood-free Bayesian parameter estimation from scientific simulators, with flexible interfaces for neural posterior estimation, sequential methods, and MCMC/variational backends (Mackelab, 825+ stars)

Active · 828 · 2 weeks ago
Python
Apache-2.0

First fully open-source model achieving AlphaFold3-level accuracy with 1000x faster binding affinity prediction (MIT)

Active · 4K · 2 weeks ago
Python
MIT

Vision foundation model for the tree of life, pretrained on diverse biological imagery across taxa for zero-shot species identification, trait extraction, and biodiversity research (Ohio State University Imageomics Institute)

Active · 259 · 2 weeks ago
Python
NOASSERTION

Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

Active · 1.1K · 2 weeks ago
Python

Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

Active · 183 · 2 weeks ago
Python

Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

Active · 64 · 2 weeks ago
Python

Highly scalable equivariant deep learning interatomic potentials enabling million-atom molecular dynamics simulations with ab initio accuracy, building on E(3)-equivariant architectures for large-scale atomistic modeling (mir-group, MIT License, 480+ stars)

Active · 482 · 2 weeks ago
Python
MIT

Turn any AI agent into an AI Scientist. The #1 Agent Skills library for science with 140+ ready-to-use skills and 100+ scientific databases covering biology, chemistry, medicine, and drug discovery. Compatible with Cursor, Claude Code, Codex, Antigravity, and the open Agent Skills standard (K-Dense-AI, 26K+ stars, 2025)

Active · 26.5K · 2 weeks ago
Python
MIT

Biological simulation tools

Active · 15 · 2 weeks ago
Python
MIT

SSSOM is a Simple Standard for Sharing Ontological Mappings, providing a TSV-based representation for ontology term mappings, a comprehensive set of standard metadata elements to describe mappings, and a standard translation between the TSV and the Web Ontology Language (OWL). Most metadata elements, such as "sssom:mapping_justification", are defined in the sssom namespace.

Active · 201 · 2 weeks ago
Python
BSD-3-Clause

197 bioinformatics and life science skills for Claude Code and AI agents, achieving 92.0% accuracy on BixBench. Covers RNA-seq, single-cell analysis, drug discovery, proteomics, and more. Powers OmicsHorizon (195+ stars, 2026)

Active · 195 · 2 weeks ago
Python
NOASSERTION

A quantum chemistry package written in Python.

Active · 77 · 2 weeks ago
Python
Apache-2.0

ESMC is a state-of-the-art protein language model that has learned the rules of protein biology from training on billions of protein sequences. ESMC provides representations of proteins enabling novel AI applications from therapeutic protein engineering to unlocking basic insights into protein…

Active · 2.8K · 2 weeks ago
Python

Generalist deep learning algorithm for cell and nucleus segmentation across diverse image types, with human-in-the-loop training (2.0) and one-click image restoration (3.0), 70K+ training objects (Nature Methods 2021/2022/2025)

Active · 2.2K · 2 weeks ago
Python
BSD-3-Clause

This set of model weights was released with the GitHub-compatible esm package format. The models here are kept for backwards compatibility, but we recommend you use the HuggingFace-compatible model weights at biohub/ESMC-6B (or biohub/ESMC-300M / biohub/ESMC-600M) instead.

Active · 2.5K · 2 weeks ago
Python

This set of model weights was released with the GitHub-compatible esm package format. The models here are kept for backwards compatibility, but we recommend you use the HuggingFace-compatible model weights at biohub/ESMC-6B (or biohub/ESMC-300M / biohub/ESMC-600M) instead.

Active · 6.2K · 2 weeks ago
Python

Machine learning and statistical learning for neuroimaging in Python, providing easy-to-use tools for fMRI and MRI analysis including decoding, connectivity estimation, and parcellation with seamless scikit-learn integration (INRIA Parietal team, 1.4K+ stars)

Active · 1.4K · 2 weeks ago
Python
BSD-3-Clause

Geneformer is a foundational transformer model pretrained on a large-scale corpus of human single-cell transcriptomes to enable context-aware predictions in network biology settings with limited data.

Active · 3.2K · 2 weeks ago
Python

MEG and EEG.

Active · 3.4K · 2 weeks ago
Python
BSD-3-Clause

Active · 73 · 2 weeks ago
Python

Modular toolchain for an extensible and customizable ETL pipeline that extracts, transforms, and loads clinical data and medical imaging metadata, applying dataset-specific mappings to generate outputs compatible with the EUCAIM Common Data Model (CDM). Its design aims to minimize manual data preparation efforts and facilitate customization and integration with other components, such as data quality assurance tools. Containerized, currently supports input datasets in CSV, JSON, XLSX.

Active · 0 · 2 weeks ago
Python

Unified Python framework for bulk, single-cell, and spatial RNA-seq multi-omics analysis with deep learning deconvolution (VAE) and graph neural networks, bridging Bindea, scanpy, and squidpy ecosystems (Nature Communications 2024)

Active · 1K · 2 weeks ago
Python
GPL-3.0

Automates and standardizes ligand preparation for AutoDock Vina.

Active · 185 · 2 weeks ago
Python
Apache-2.0

Simulation of large-scale brain models

Active · 929 · 2 weeks ago
Python
NOASSERTION

Cross-platform library for differentiable programming of quantum computers with automatic differentiation, enabling hybrid quantum-classical machine learning for quantum chemistry, quantum physics, and NISQ algorithm research (Xanadu, 3k+ stars)

Active · 3.2K · 3 weeks ago
Python
Apache-2.0

Open-source framework for building physics-ML models at scale (renamed from Modulus, 2025)

Active · 2.8K · 3 weeks ago
Python
Apache-2.0

Automated cell type annotation tool for single-cell transcriptomics using gradient boosting and logistic regression with reference atlases, enabling standardized classification across datasets (Wellcome Sanger Institute, Nature Biotechnology 2022)

Active · 486 · 3 weeks ago
Python
MIT

Next-generation benchmark for data-driven global weather models with standardized evaluation framework and curated datasets for ML forecasting (Google Research, 2024)

Active · 614 · 3 weeks ago
Python
Apache-2.0

A data model for managing information about chemical entities, ranging from atoms through molecules to complex mixtures.

Active · 23 · 3 weeks ago
Python
CC0-1.0

Active · 82 · 3 weeks ago
Python

Transformer encoder-decoder for de novo peptide sequencing from tandem mass spectrometry, translating MS/MS spectra directly to peptide sequences without reference databases, enabling identification of novel peptides for immunopeptidomics, antibody repertoires, and metaproteomes (Noble Lab UW, Nature Communications 2024)

Active · 187 · 3 weeks ago
Python
Apache-2.0

Interactive and hardware-agnostic SDK for laboratory automation, enabling programmatic control of liquid handlers, plate readers, and other lab instruments across multiple vendors; foundational infrastructure for self-driving laboratories and AI-driven experimental execution (447+ stars)

Active · 450 · 3 weeks ago
Python
MIT

GPU-accelerated differentiable physics simulation engine built on NVIDIA Warp, supporting rigid/soft body, cloth, and gradient-based optimization for scientific ML, initiated by Disney Research, DeepMind, and NVIDIA (Linux Foundation, Apache 2.0, 2025)

Active · 5K · 3 weeks ago
Python
Apache-2.0

AlphaFold 3 inference pipeline for unified biomolecular structure prediction of proteins, nucleic acids, small molecules, ions, and post-translational modifications (Google DeepMind, Nature 2024)

Active · 8.1K · 3 weeks ago
Python
NOASSERTION

Automate downloading, opening, and parsing DrugBank.

Active · 65 · 3 weeks ago
Python
MIT

Deep learning-based bioacoustic monitoring framework for automated bird species identification from audio recordings, supporting 6,000+ species globally with real-time analysis, batch processing, and API deployment; foundational tool in biodiversity research, conservation biology, and ecological acoustic monitoring (Cornell Lab of Ornithology, 1.5K+ stars, MIT License)

Active · 1.6K · 3 weeks ago
Python
MIT

Active · 375 · 3 weeks ago
Python

LLM agents for working with the SRA (Sequence Read Archive) and associated bioinformatics databases, enabling natural language querying of high-throughput sequencing data and metadata across genomic repositories (Arc Institute, 169+ stars, 2024-2026)

Active · 170 · 3 weeks ago
Python
MIT

Modern LLM-native agent simulation platform for social science research and experimental design, providing a flexible framework for creating and managing intelligent agents in simulated environments (Tsinghua FIB Lab, 984+ stars, 2025)

Active · 1K · 3 weeks ago
Python
Apache-2.0

DeepMind's neural network for ab-initio quantum chemistry, directly solving the many-electron Schrödinger equation via variational Monte Carlo with antisymmetric wavefunctions, extended to excited states (Phys. Rev. Research 2020, Science 2024)

Active · 844 · 3 weeks ago
Python
Apache-2.0

Biological vision foundation model trained on TreeOfLife-200M, yielding extraordinary accuracy on diverse biological visual tasks including habitat classification and trait prediction despite a narrow training objective (Ohio State University Imageomics Institute)

Active · 68 · 3 weeks ago
Python
NOASSERTION

A package for accessing data from the NIST webbook...

Active · 56 · 3 weeks ago
Python
MIT

E(3)-equivariant neural network interatomic potentials achieving DFT accuracy with up to 1000× less training data than invariant models, foundational architecture behind MACE and Allegro (Harvard, MIT, Nature Communications 2022)

Active · 914 · 3 weeks ago
Python
MIT

Target-Conditioned Molecular Ideation Model for Drug Discovery Research

Active · 0 · 3 weeks ago
Python