Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

107 of 5,923 resources

Developer toolkit for accelerating training and inference for AI in chemistry and materials science, providing optimized GPU-accelerated workflows for molecular and materials machine learning (NVIDIA, 2026)

Active · 92 stars · 1 day ago
Python
Apache-2.0

Democratizing AI scientists by transforming any LLM into a research system with 600+ scientific tools (Harvard MIMS)

Active · 1.4K stars · 3 days ago
Python
Apache-2.0

Auto-generates clean, customizable academic CVs from open research data (OpenAlex, ORCID, Crossref, DataCite, Open Editors Plus). A single canonical CV object drives every output format (HTML, PDF, DOCX, LaTeX, Markdown); citations render through CSL; and the account holder is matched by persistent identifier (ORCID / OpenAlex ID) rather than name string. Free for individuals, open-source, and FAIR by design.

Active · 0 stars · 3 days ago
JavaScript
Apache-2.0

High-performance symbolic regression for discovering interpretable scientific equations from data, using multi-population evolutionary search with a Python/Julia backend; widely used in physics and astronomy (Cambridge, NeurIPS 2023)

Active · 3.6K stars · 4 days ago
Python
Apache-2.0

NVIDIA and King's College London's open-source AI toolkit for healthcare imaging, providing foundational frameworks for medical image annotation (MONAI Label), training (MONAI Core), and deployment (MONAI Deploy) across radiology, pathology, and endoscopy (8K+ stars, Apache 2.0)

Active · 8.3K stars · 5 days ago
Python
Apache-2.0

High-performance ML research

Active · 35.8K stars · 5 days ago
Python
Apache-2.0

High-accuracy RAG for scientific PDFs with citation support, agentic RAG, and contradiction detection

Active · 8.6K stars · 1 week ago
Python
Apache-2.0

Language agent gymnasium for challenging scientific tasks including DNA manipulation, literature search, and protein engineering

Active · 271 stars · 1 week ago
Python
Apache-2.0

Robust, lightweight infrastructure for multi-agent autonomous self-evolution, built for autoresearch; agents run in isolated git worktrees, share knowledge through a common state directory, and are scored by a grader daemon; natively integrated with Claude Code, Codex, Cursor Agent, OpenCode, and Kiro (672+ stars, Apache 2.0)

Active · 712 stars · 1 week ago
Python
Apache-2.0

Advanced OCR with PP-StructureV3 document parsing, 13% accuracy improvement, supports 80+ languages

Active · 81.3K stars · 1 week ago
Python
Apache-2.0

Molecular dynamics in JAX

Active · 1.4K stars · 1 week ago
Jupyter Notebook
Apache-2.0

xgt is a command-line tool for programmatic access to the GTDB REST API. It provides four subcommands: search (genome queries with pagination), genome (cards, metadata, taxonomic history), taxon (lineage and genome set retrieval), and diff (per-rank taxonomic comparison between any two GTDB releases). All subcommands support batch input, JSON/CSV/TSV output, file splitting, and automatic retry. Implemented in Rust as a self-contained binary with no runtime dependencies.

Active · 30 stars · 1 week ago
Rust
Apache-2.0

Self-evolving AI scientist with 6 specialized sub-agents (plan/research/code/debug/analyze/write) and persistent memory, #1 on DeepResearch Bench II and AstaBench, supporting multi-provider LLMs and multi-channel deployment (Apache 2.0, 2026)

Active · 3.3K stars · 1 week ago
Python
Apache-2.0

General-purpose biomedical AI agent integrating LLM reasoning with retrieval-augmented planning and code-based execution to autonomously execute diverse biomedical research tasks and generate testable hypotheses (Stanford SNAP, bioRxiv 2025)

Active · 3.2K stars · 1 week ago
Python
Apache-2.0

World's first fully open, accelerated weather AI software stack with Medium Range forecasting and Nowcasting models using generative AI (January 2026)

Active · 959 stars · 2 weeks ago
Python
Apache-2.0

Python package for simulation-based inference enabling likelihood-free Bayesian parameter estimation from scientific simulators, with flexible interfaces for neural posterior estimation, sequential methods, and MCMC/variational backends (Mackelab, 825+ stars)

Active · 828 stars · 2 weeks ago
Python
Apache-2.0

A quantum chemistry package written in Python.

Active · 77 stars · 2 weeks ago
Python
Apache-2.0

Automates and standardizes ligand preparation for AutoDock Vina.

Active · 185 stars · 2 weeks ago
Python
Apache-2.0

Cross-platform library for differentiable programming of quantum computers with automatic differentiation, enabling hybrid quantum-classical machine learning for quantum chemistry, quantum physics, and NISQ algorithm research (Xanadu, 3k+ stars)

Active · 3.2K stars · 3 weeks ago
Python
Apache-2.0

Open-source framework for building physics-ML models at scale (renamed from Modulus, 2025)

Active · 2.8K stars · 3 weeks ago
Python
Apache-2.0

Next-generation benchmark for data-driven global weather models with standardized evaluation framework and curated datasets for ML forecasting (Google Research, 2024)

Active · 614 stars · 3 weeks ago
Python
Apache-2.0

Transformer encoder-decoder for de novo peptide sequencing from tandem mass spectrometry, translating MS/MS spectra directly to peptide sequences without reference databases, enabling identification of novel peptides for immunopeptidomics, antibody repertoires, and metaproteomes (Noble Lab UW, Nature Communications 2024)

Active · 187 stars · 3 weeks ago
Python
Apache-2.0

GPU-accelerated differentiable physics simulation engine built on NVIDIA Warp, supporting rigid/soft body, cloth, and gradient-based optimization for scientific ML, initiated by Disney Research, DeepMind, and NVIDIA (Linux Foundation, Apache 2.0, 2025)

Active · 5K stars · 3 weeks ago
Python
Apache-2.0

Modern LLM-native agent simulation platform for social science research and experimental design, providing a flexible framework for creating and managing intelligent agents in simulated environments (Tsinghua FIB Lab, 984+ stars, 2025)

Active · 1K stars · 3 weeks ago
Python
Apache-2.0

DeepMind's neural network for ab-initio quantum chemistry, directly solving the many-electron Schrödinger equation via variational Monte Carlo with antisymmetric wavefunctions, extended to excited states (Phys. Rev. Research 2020, Science 2024)

Active · 844 stars · 3 weeks ago
Python
Apache-2.0

Pretrained time series foundation model for long-horizon forecasting across diverse scientific domains including climate variables, biomedical signals, and physical observations; decoder-only Transformer architecture with strong zero-shot generalization (19.8K+ stars, Apache 2.0, 2024-2025)

Active · 20.1K stars · 3 weeks ago
Python
Apache-2.0

Interaction fingerprints for protein-ligand complexes and more.

Active · 501 stars · 3 weeks ago
Python
Apache-2.0

A Swiss Army knife for manipulating and editing PDB files.

Active · 454 stars · 4 weeks ago
Python
Apache-2.0

General multimodal protein design framework enabling DNA-encoding of chemistry for programmable enzyme design and diverse protein generation through diffusion-based generative modeling (190+ stars, Apache 2.0, 2026)

Active · 190 stars · 1 month ago
Python
Apache-2.0

Numerical differential equation solving in JAX

Active · 2K stars · 1 month ago
Python
Apache-2.0

Pythonic access to the Ensembl database.

Active · 400 stars · 1 month ago
Python
Apache-2.0

Open-source self-supervised vision foundation model for Earth observation by Clay Foundation (non-profit), a Masked Autoencoder ViT pretrained on multimodal satellite imagery (Sentinel-1/2, Landsat 8-9, NAIP, MODIS, LINZ DEM) with location/time embeddings, supporting classification, segmentation, change detection, similarity search, and few-shot downstream geospatial tasks (Apache 2.0, v1.5 2024-2025)

Active · 579 stars · 1 month ago
Python
Apache-2.0

Medical large vision-language model unifying comprehension and generation via heterogeneous knowledge adaptation, enabling holistic medical image understanding, visual question answering, and clinical report generation across diverse modalities (ZJU4HealthCare, 1.6K+ stars)

Active · 1.6K stars · 1 month ago
Python
Apache-2.0

A client to simplify fetching predictions from the Koina web service. Koina is a model repository enabling the remote execution of models. Predictions are generated as a response to HTTP/S requests, the standard protocol used for nearly all web traffic.

Active · 53 stars · 1 month ago
R
Apache-2.0

Google DeepMind's unified DNA sequence foundation model predicting molecular consequences of genetic variants from single-base resolution up to 1 megabase context, jointly outputting thousands of regulatory tracks (RNA expression, splicing, chromatin accessibility, TF binding, contact maps) for human and mouse genomes via a Python client and non-commercial API (2025)

Active · 1.9K stars · 1 month ago
Python
Apache-2.0

General-purpose RNA language model with 650M parameters pretrained on 36M non-coding RNA sequences, achieving strong generalization on structure prediction tasks including secondary structure prediction, splice-site prediction, mean ribosome loading, and ncRNA classification (lbcb-sci, 165+ stars, Apache-2.0)

Active · 165 stars · 1 month ago
Python
Apache-2.0

Incremental knowledge graph construction using LLMs with entity extraction and Neo4j visualization

Active · 947 stars · 1 month ago
Python
Apache-2.0

The submission-centric metadata schema for the German Human Genome-Phenome Archive (GHGA).

Active · 16 stars · 1 month ago
Python
Apache-2.0

An extension of Schema.org to annotate metadata on software projects

Active · 348 stars · 1 month ago
Python
Apache-2.0

FutureHouse's end-to-end scientific discovery multi-agent system orchestrating literature search (Crow/Falcon) and data analysis (Finch) agents, with the first AI-generated drug discovery identifying ripasudil as a novel dry AMD therapeutic (2025)

Active · 441 stars · 1 month ago
Python
Apache-2.0

Pretrained time series foundation model for zero-shot forecasting across diverse scientific and real-world domains; tokenizes continuous time series into discrete bins to train transformer language models on large-scale corpora, achieving strong zero-shot generalization and competitive performance with task-specific supervised models on climate, energy, and health benchmarks (5.3K+ stars, Apache 2.0, 2024-2026)

Active · 5.4K stars · 1 month ago
Python
Apache-2.0

Descriptor library containing a variety of fingerprinting techniques, including the Smooth Overlap of Atomic Positions (SOAP).

Active · 466 stars · 1 month ago
C++
Apache-2.0

A Python-based workflow manager.

Active · 590 stars · 1 month ago
Python
Apache-2.0

Fully autonomous medical image segmentation research system that generates complete manuscripts end-to-end from datasets with zero human intervention, beating strongest baselines on 24 of 31 datasets and achieving T1-T2 tier manuscript quality in double-blind evaluations (USTC & Shanghai AI Lab, 2026)

Active · 350 stars · 2 months ago
Python
Apache-2.0

Multi-modal foundation model for biomolecular structure prediction (proteins, small molecules, DNA, RNA, glycans) achieving SOTA across benchmarks, with optional MSA/template support (Chai Discovery, 2024)

Active · 1.9K stars · 2 months ago
Python
Apache-2.0

Programmatic data labeling and weak supervision

Active · 6K stars · 2 months ago
Python
Apache-2.0

Google DeepMind's diffusion-based ensemble weather forecasting model at 0.25° resolution, outperforming ECMWF ENS on 97.2% of targets up to 15 days ahead, with open-source code and weights (Nature 2024)

Active · 6.7K stars · 2 months ago
Python
Apache-2.0

First architecture deeply integrating a DNA foundation model with an LLM for multimodal biological reasoning, achieving 98% accuracy on KEGG disease pathway prediction and 15%+ average gains on variant effect prediction with interpretable step-by-step reasoning traces (bowang-lab, 390+ stars)

Active · 390 stars · 2 months ago
Jupyter Notebook
Apache-2.0

End-to-end semi-automated scientific discovery system that designs, iterates, and analyzes code-based experiments via LLM-as-a-mutator over scientific articles and code examples; auto-creates, runs, and debugs experiment code in containers and writes meta-analysis reports (339+ stars, Apache 2.0)

Active · 339 stars · 2 months ago
Python
Apache-2.0

Free-text promptable universal 3D medical image segmentation foundation model enabling zero-shot segmentation of diverse anatomical structures and pathologies via natural language prompts across CT, MRI, and other volumetric imaging modalities (DKFZ, 195+ stars, Apache 2.0)

Active · 197 stars · 2 months ago
Python
Apache-2.0