Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

58 of 5,893 resources

Showing 1–50

Democratizing AI scientists by transforming any LLM into research systems with 600+ scientific tools (Harvard MIMS)

Active · 1.4K stars · 6 hours ago
Python
Apache-2.0

NVIDIA and King's College London's open-source AI toolkit for healthcare imaging, providing foundational frameworks for medical image annotation (MONAI Label), training (MONAI Core), and deployment (MONAI Deploy) across radiology, pathology, and endoscopy (8K+ stars, Apache 2.0)

Active · 8.3K stars · 2 days ago
Python
Apache-2.0

High-performance ML research

Active · 35.8K stars · 2 days ago
Python
Apache-2.0

High-accuracy RAG for scientific PDFs with citation support, agentic RAG, and contradiction detection

Active · 8.6K stars · 4 days ago
Python
Apache-2.0

Robust, lightweight infrastructure for multi-agent autonomous self-evolution, built for autoresearch; agents run in isolated git worktrees, share knowledge through a common state directory, and are scored by a grader daemon; natively integrated with Claude Code, Codex, Cursor Agent, OpenCode, and Kiro (672+ stars, Apache 2.0)

Active · 712 stars · 4 days ago
Python
Apache-2.0

Advanced OCR with PP-StructureV3 document parsing, a reported 13% accuracy improvement, and support for 80+ languages

Active · 81.3K stars · 5 days ago
Python
Apache-2.0

Self-evolving AI scientist with 6 specialized sub-agents (plan/research/code/debug/analyze/write) and persistent memory, #1 on DeepResearch Bench II and AstaBench, supporting multi-provider LLMs and multi-channel deployment (Apache 2.0, 2026)

Active · 3.3K stars · 1 week ago
Python
Apache-2.0

General-purpose biomedical AI agent integrating LLM reasoning with retrieval-augmented planning and code-based execution to autonomously execute diverse biomedical research tasks and generate testable hypotheses (Stanford SNAP, bioRxiv 2025)

Active · 3.2K stars · 1 week ago
Python
Apache-2.0

World's first fully open, accelerated weather AI software stack with Medium Range forecasting and Nowcasting models using generative AI (January 2026)

Active · 959 stars · 1 week ago
Python
Apache-2.0

Python package for simulation-based inference enabling likelihood-free Bayesian parameter estimation from scientific simulators, with flexible interfaces for neural posterior estimation, sequential methods, and MCMC/variational backends (Mackelab, 825+ stars)

Active · 828 stars · 1 week ago
Python
Apache-2.0

A quantum chemistry package written in Python.

Active · 77 stars · 1 week ago
Python
Apache-2.0

Cross-platform library for differentiable programming of quantum computers with automatic differentiation, enabling hybrid quantum-classical machine learning for quantum chemistry, quantum physics, and NISQ algorithm research (Xanadu, 3k+ stars)

Active · 3.2K stars · 2 weeks ago
Python
Apache-2.0

Open-source framework for building physics-ML models at scale (renamed from Modulus, 2025)

Active · 2.8K stars · 2 weeks ago
Python
Apache-2.0

Next-generation benchmark for data-driven global weather models with standardized evaluation framework and curated datasets for ML forecasting (Google Research, 2024)

Active · 614 stars · 2 weeks ago
Python
Apache-2.0

Transformer encoder-decoder for de novo peptide sequencing from tandem mass spectrometry, translating MS/MS spectra directly to peptide sequences without reference databases, enabling identification of novel peptides for immunopeptidomics, antibody repertoires, and metaproteomes (Noble Lab UW, Nature Communications 2024)

Active · 187 stars · 2 weeks ago
Python
Apache-2.0

GPU-accelerated differentiable physics simulation engine built on NVIDIA Warp, supporting rigid/soft body, cloth, and gradient-based optimization for scientific ML, initiated by Disney Research, DeepMind, and NVIDIA (Linux Foundation, Apache 2.0, 2025)

Active · 5K stars · 2 weeks ago
Python
Apache-2.0

Modern LLM-native agent simulation platform for social science research and experimental design, providing a flexible framework for creating and managing intelligent agents in simulated environments (Tsinghua FIB Lab, 984+ stars, 2025)

Active · 1K stars · 2 weeks ago
Python
Apache-2.0

DeepMind's neural network for ab-initio quantum chemistry, directly solving the many-electron Schrödinger equation via variational Monte Carlo with antisymmetric wavefunctions, extended to excited states (Phys. Rev. Research 2020, Science 2024)

Active · 844 stars · 2 weeks ago
Python
Apache-2.0

Pretrained time series foundation model for long-horizon forecasting across diverse scientific domains including climate variables, biomedical signals, and physical observations; decoder-only Transformer architecture with strong zero-shot generalization (19.8K+ stars, Apache 2.0, 2024-2025)

Active · 20.1K stars · 3 weeks ago
Python
Apache-2.0

Interaction Fingerprints for protein-ligand complexes and more.

Active · 501 stars · 3 weeks ago
Python
Apache-2.0

A Swiss Army knife for manipulating and editing PDB files.

Active · 454 stars · 3 weeks ago
Python
Apache-2.0

General multimodal protein design framework enabling DNA-encoding of chemistry for programmable enzyme design and diverse protein generation through diffusion-based generative modeling (190+ stars, Apache 2.0, 2026)

Active · 190 stars · 3 weeks ago
Python
Apache-2.0

Numerical differential equation solving in JAX

Active · 2K stars · 3 weeks ago
Python
Apache-2.0

Pythonic access to the Ensembl database.

Active · 400 stars · 3 weeks ago
Python
Apache-2.0

Open-source self-supervised vision foundation model for Earth observation by Clay Foundation (non-profit), a Masked Autoencoder ViT pretrained on multimodal satellite imagery (Sentinel-1/2, Landsat 8-9, NAIP, MODIS, LINZ DEM) with location/time embeddings, supporting classification, segmentation, change detection, similarity search, and few-shot downstream geospatial tasks (Apache 2.0, v1.5 2024-2025)

Active · 579 stars · 1 month ago
Python
Apache-2.0

Medical large vision-language model unifying comprehension and generation via heterogeneous knowledge adaptation, enabling holistic medical image understanding, visual question answering, and clinical report generation across diverse modalities (ZJU4HealthCare, 1.6K+ stars)

Active · 1.6K stars · 1 month ago
Python
Apache-2.0

Google DeepMind's unified DNA sequence foundation model predicting molecular consequences of genetic variants from single-base resolution up to 1 megabase context, jointly outputting thousands of regulatory tracks (RNA expression, splicing, chromatin accessibility, TF binding, contact maps) for human and mouse genomes via a Python client and non-commercial API (2025)

Active · 1.9K stars · 1 month ago
Python
Apache-2.0

Incremental knowledge graph construction using LLMs with entity extraction and Neo4j visualization

Active · 947 stars · 1 month ago
Python
Apache-2.0

The submission-centric metadata schema for the German Human Genome-Phenome Archive (GHGA).

Active · 16 stars · 1 month ago
Python
Apache-2.0

An extension of Schema.org to annotate metadata on software projects

Active · 348 stars · 1 month ago
Python
Apache-2.0

FutureHouse's end-to-end multi-agent system for scientific discovery, orchestrating literature search (Crow/Falcon) and data analysis (Finch) agents; reported as the first AI-generated drug discovery, identifying ripasudil as a novel therapeutic for dry AMD (2025)

Active · 441 stars · 1 month ago
Python
Apache-2.0

Pretrained time series foundation model for zero-shot forecasting across diverse scientific and real-world domains; tokenizes continuous time series into discrete bins to train transformer language models on large-scale corpora, achieving strong zero-shot generalization and competitive performance with task-specific supervised models on climate, energy, and health benchmarks (5.3K+ stars, Apache 2.0, 2024-2026)

Active · 5.4K stars · 1 month ago
Python
Apache-2.0

A Python-based workflow manager.

Active · 590 stars · 1 month ago
Python
Apache-2.0

Fully autonomous medical image segmentation research system that generates complete manuscripts end-to-end from datasets with zero human intervention, beating strongest baselines on 24 of 31 datasets and achieving T1-T2 tier manuscript quality in double-blind evaluations (USTC & Shanghai AI Lab, 2026)

Active · 350 stars · 1 month ago
Python
Apache-2.0

Multi-modal foundation model for biomolecular structure prediction (proteins, small molecules, DNA, RNA, glycans) achieving SOTA across benchmarks, with optional MSA/template support (Chai Discovery, 2024)

Active · 1.9K stars · 2 months ago
Python
Apache-2.0

Programmatic data labeling and weak supervision

Active · 6K stars · 2 months ago
Python
Apache-2.0

Google DeepMind's diffusion-based ensemble weather forecasting model at 0.25° resolution, outperforming ECMWF ENS on 97.2% of targets up to 15 days ahead, with open-source code and weights (Nature 2024)

Active · 6.7K stars · 2 months ago
Python
Apache-2.0

End-to-end semi-automated scientific discovery system that designs, iterates, and analyzes code-based experiments via LLM-as-a-mutator over scientific articles and code examples; auto-creates, runs, and debugs experiment code in containers and writes meta-analysis reports (339+ stars, Apache 2.0)

Active · 339 stars · 2 months ago
Python
Apache-2.0

Free-text promptable universal 3D medical image segmentation foundation model enabling zero-shot segmentation of diverse anatomical structures and pathologies via natural language prompts across CT, MRI, and other volumetric imaging modalities (DKFZ, 195+ stars, Apache 2.0)

Active · 197 stars · 2 months ago
Python
Apache-2.0

Open-source implementation of AlphaEvolve's evolutionary coding agent paradigm, enabling LLMs to autonomously discover and optimize algorithms through iterative evolution, matching the approach behind DeepMind's breakthrough matrix multiplication discovery (6.2K+ stars, 2025)

Active · 6.4K stars · 2 months ago
Python
Apache-2.0

Apache 2.0 single-cell foundation model family scaling to 3B parameters, pretrained on 266M cell profiles including perturbation data and released with training, embedding, and downstream benchmarking workflows for disease-relevant single-cell tasks (2025)

Active · 156 stars · 4 months ago
Python
Apache-2.0

Foundation model for joint segmentation, detection, and recognition of biomedical objects across nine imaging modalities, with v2 introducing BoltzFormer architecture for end-to-end 3D inference (Microsoft, Nature Methods 2025)

Active · 668 stars · 4 months ago
Python
Apache-2.0

DeepMind's Olympiad-level geometry theorem prover combining a neural language model with a symbolic deduction engine; AlphaGeometry2 solves 84% of IMO geometry problems (42/50) at gold-medalist level (Nature 2024)

Active · 4.8K stars · 4 months ago
Python
Apache-2.0

Fast, modular, and accurate de novo design of protein binders based on the Protenix foundation model, achieving 17-82% nanomolar hit rates across diverse targets with 2-6× improvement over prior methods like AlphaProteo and RFdiffusion (229+ stars, Apache 2.0)

Active · 229 stars · 5 months ago
Python
Apache-2.0

ECMWF's unified framework and command-line tool to run AI-based weather forecasting models (GraphCast, Aurora, Pangu, NeuralGCM, FourCastNet) with operational ECMWF data infrastructure, enabling standardized inference and benchmarking across state-of-the-art meteorological AI systems (ECMWF, 576+ stars)

Active · 579 stars · 5 months ago
Python
Apache-2.0

Trainable, memory-efficient PyTorch reproduction and retraining of AlphaFold2 providing new insights into its learning dynamics and out-of-distribution generalization; widely used as the open-source AlphaFold2 backbone underpinning many downstream protein structure prediction and design pipelines (Columbia AlQuraishi Lab & OpenFold Consortium, Nature Methods 2024)

Active · 3.4K stars · 5 months ago
Python
Apache-2.0

A library for computational chemistry (DFT) for input file generation, data extraction, method screening and analysis.

Idle · 22 stars · 7 months ago
Python
Apache-2.0

Generalist foundation model and database for open-world medical image segmentation, enabling universal segmentation of diverse anatomical structures and pathologies with zero-shot generalization to unseen tasks and modalities (Nature Biomedical Engineering 2025)

Idle · 86 stars · 8 months ago
Python
Apache-2.0

Automated and rigorous experiments using AI agents for scientific discovery

Idle · 360 stars · 8 months ago
Python
Apache-2.0

Family of diffusion protein language models demonstrating versatile generative and predictive capabilities for protein sequences and structures, including multimodal co-generation, conditional folding, inverse folding, motif scaffolding, and representation learning, with open pretrained weights and training scripts (327+ stars, ICML 2024, ICLR 2025, ICML 2025 Spotlight)

Idle · 335 stars · 10 months ago
Python
Apache-2.0