Open Science Index

Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

Filters

Health

Active: 503
Idle: 199
Stale: 110
Archived: 8
(None): 25

Domain

text-generation: 98
Protein & Drug Discovery: 57
fill-mask: 32
Autonomous Research Systems (2023-2025 Breakthroughs): 30
Genomics & Bioinformatics: 28
image-text-to-text: 24
Simulations: 22
Medical AI & Clinical Applications: 21
Climate Modeling: 16
Materials Discovery: 16
feature-extraction: 15
Domain-Specific Research Agents: 13
(None): 70

Language (1)

R: 2432
Python: 845
Jupyter Notebook: 89
HTML: 49
Makefile: 35
C++: 34
C: 29
JavaScript: 29
Java: 24
Shell: 24
TypeScript: 15
Perl: 9
(None): 2529

License

MIT: 192
Apache-2.0: 151
NOASSERTION: 76
BSD-3-Clause: 33
GPL-3.0: 24
CC-BY-NC-ND-4.0: 12
CC-BY-4.0: 8
CC0-1.0: 7
LGPL-3.0: 7
Other: 5
AGPL-3.0: 4
BSD-2-Clause: 4
(None): 311

Source

github: 573
awesome-ai-for-science: 335
huggingface: 249
bio.tools: 95
awesome-python-chemistry: 62
bioregistry: 50
awesome-bioinformatics: 31
awesome-cheminformatics: 25
awesome-scientific-python: 13
1

Type

Software tool: 546
AI model: 249
Database: 50

845 of 6,223 resources

Showing 1–50

alimotahharynia/DrugGen-2

by alimotahharynia

text-generation

DrugGen-2 is a disease-aware language model for drug discovery, specialized for generating drug-like SMILES structures from both disease pathways and protein sequences.

Active | ↓182 | 1 day ago

figtracer

Plain-text, git-tracked electronic lab notebook (ELN) for reproducible bioinformatics — threads your R & Python figures into living lab notes with full provenance. Built for single-cell / CyTOF / flow cytometry; works with Obsidian, Quarto & Jupyter.

Active | ★0 | 1 day ago

NVIDIA Earth-2

Climate Modeling

World's first fully open, accelerated weather AI software stack with Medium Range forecasting and Nowcasting models using generative AI (January 2026)

Active | ★1K | 1 day ago

napari

Medical AI & Clinical Applications

Fast, interactive, multi-dimensional image viewer for Python, foundational platform for scientific imaging AI with a rich plugin ecosystem integrating deep learning segmentation, object tracking, and microscopy analysis workflows (2.6K+ stars)

Active | ★2.7K | 1 day ago

OpenDDE (Aureka Research, 2026)

Protein & Drug Discovery

Open-source, all-atom biomolecular foundation model that turns co-folding into a scalable engine for structure prediction, design, and optimization across proteins, nucleic acids, and small molecules in drug discovery; ranked first on PXMeter-AB, FoldBench-AB, and 2026ARK-AB antibody-antigen benchmarks (263+ stars, Apache 2.0)

Active | ★271 | 2 days ago

UniParser/MolParser-Mobile

by UniParser

Active | ↓51 | 2 days ago

Jackrong/Qwopus3.5-27B-v3.5-GGUF

by Jackrong

image-text-to-text

Active | ↓2.8K | 2 days ago

Scientific Agent Skills

Research Workbench & Plugins

Turn any AI agent into an AI Scientist. The #1 Agent Skills library for science with 140+ ready-to-use skills and 100+ scientific databases covering biology, chemistry, medicine, and drug discovery. Compatible with Cursor, Claude Code, Codex, Antigravity, and the open Agent Skills standard (K-Dense-AI, 26K+ stars, 2025)

Active | ★30.6K | 2 days ago

reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B

by reaperdoesntknow

text-generation

A 1.7B-parameter causal language model distilled from Qwen3-30B-A3B on 6,122 STEM chain-of-thought samples using discrepancy-informed knowledge distillation. The training objective emphasizes proof structure, detects reasoning pivot tokens through token-level divergence dynamics, smooths…

Active | ↓1.4K | 2 days ago

WeatherBench2

Climate Modeling

Next-generation benchmark for data-driven global weather models with standardized evaluation framework and curated datasets for ML forecasting (Google Research, 2024)

Active | ★623 | 3 days ago

nilearn

Neuroscience & Behavioral Analysis

Machine learning and statistical learning for neuroimaging in Python, providing easy-to-use tools for fMRI and MRI analysis including decoding, connectivity estimation, and parcellation with seamless scikit-learn integration (INRIA Parietal team, 1.4K+ stars)

Active | ★1.4K | 3 days ago

fableforge-ai/NEXUS-Medical

by fableforge-ai

text-generation

NEXUS domain specialist for medical Q&A and clinical reasoning; lightweight and uncensored.

Active | ↓1.5K | 3 days ago

MACE

Materials Discovery

Machine learning interatomic potentials

Active | ★1.3K | 3 days ago

overreact

A library and command-line tool for building and analyzing complex homogeneous microkinetic models from quantum chemistry calculations, with support for quasi-harmonic thermochemistry, quantum tunnelling corrections, molecular symmetries and more.

Active | ★64 | 3 days ago

MNE

MEG and EEG.

Active | ★3.5K | 4 days ago

PySCF

A quantum chemistry package written in Python.

Active | ★77 | 4 days ago

BioNeMo Framework

Domain-Specific Models

NVIDIA's open-source platform for building and adapting biological AI models at scale, bundling ESM-2, Geneformer, MolMIM and DNA embedding models with recipes for single-GPU to multi-node training (2025)

Active | ★818 | 4 days ago

CellRank

Genomics & Bioinformatics

Probabilistic framework for inferring cell fate decisions and trajectory dynamics from multi-view single-cell data using Markov chains and machine learning, integrating RNA velocity, pseudotime, and metabolic labeling to predict differentiation paths and terminal states (scverse/Theis Lab, 449+ stars, BSD 3-Clause)

Active | ★454 | 4 days ago

Tesseract Core (Pasteur Labs, SciPy 2025 / JOSS)

Scientific Machine Learning Frameworks

Universal components for differentiable scientific computing, packaging heterogeneous scientific tools into self-contained, portable, gradient-propagating components with auto-generated schemas, CLI/REST API/Python SDK interfaces, and reproducible deployment across local, cloud, and HPC environments (105+ stars, Apache 2.0)

Active | ★105 | 4 days ago

NeuroAI (Meta FAIR)

Neuroscience & Behavioral Analysis

Modular Python suite for Neuro-AI research across all modalities, providing efficient data loaders (NeuralSet), curated datasets (NeuralFetch), scalable training (NeuralTrain), and unified benchmarking (NeuralBench) for building and evaluating neuroscience foundation models (Meta FAIR, 270+ stars, MIT License, 2026)

Active | ★272 | 5 days ago

OpenProteo

OpenProteo is the open-source Rust stack for proteomics raw-file access. It reads Thermo, Bruker, and Waters acquisitions through a single API (via the sibling OpenTFRaw, OpenTimsTDF, and OpenWRaw readers), converts them to PSI-MS mzML 1.1.0 with a canonical writer, and provides a zero-copy read_arrow() API (enabled by default) that loads directly into Polars or Pandas via PyArrow. No vendor SDKs, no Windows-only DLLs, no binary blobs in the release pipeline. Includes a one-shot vendor2mzml CLI.

Active | ★4 | 5 days ago

OpenWRaw

OpenWRaw is a standalone, cross-platform reader for Waters MassLynx .raw acquisition directories, implemented in pure Rust with no dependency on vendor DLLs. Python bindings built on PyO3 expose functions, scans, and ion-mobility data as native Python objects from Waters QTof and SYNAPT instrument families, ready to be assembled into a Pandas or Polars DataFrame.

Active | ★3 | 5 days ago

OpenTimsTDF

OpenTimsTDF is a standalone, cross-platform reader for Bruker timsTOF .tdf and .tdf_bin acquisition files, implemented in pure Rust with no dependency on vendor SDKs. Python bindings built on PyO3 expose frame, scan, and peak data as native Python objects, providing ion-mobility-aware access that can be assembled into a Pandas or Polars DataFrame.

Active | ★2 | 5 days ago

OpenTFRaw

OpenTFRaw is a standalone, cross-platform reader for Thermo Fisher Scientific .raw mass-spectrometry files, implemented in pure Rust with no dependency on vendor DLLs or .NET. Python bindings built on PyO3 return NumPy arrays for spectral data, straightforward to load into Pandas or Polars. Covers format versions 8 through 66 (LCQ Classic through Orbitrap Astral and modern TSQ instruments), supporting both centroid and profile spectra.

Active | ★10 | 5 days ago

Zaynoid/Qwen3.5-80B-A10B-medical-reap-v2

by Zaynoid

image-text-to-text

Active | ↓139 | 5 days ago

OpenEvolve

Autonomous Research Systems (2023-2025 Breakthroughs)

Open-source implementation of AlphaEvolve's evolutionary coding agent paradigm, enabling LLMs to autonomously discover and optimize algorithms through iterative evolution, matching the approach behind DeepMind's breakthrough matrix multiplication discovery (6.2K+ stars, 2025)

Active | ★6.7K | 6 days ago

AgentSociety

Social Science Research & Simulation

Modern LLM-native agent simulation platform for social science research and experimental design, providing a flexible framework for creating and managing intelligent agents in simulated environments (Tsinghua FIB Lab, 984+ stars, 2025)

Active | ★1.1K | 6 days ago

gevaertlab/diffusiongemma-radiology-vqa

by gevaertlab

image-text-to-text

This repository contains LoRA finetunes of DiffusionGemma (image-conditioned discrete-diffusion LLM) for radiology visual question answering, each paired with an autoregressive Gemma-4 finetune as a controlled baseline. It corresponds to the paper Discrete Diffusion Language Models for Interactive…

Active | ↓0 | 1 week ago

NVIDIA PhysicsNeMo

Physics-Informed Neural Networks

Open-source framework for building physics-ML models at scale (renamed from Modulus, 2025)

Active | ★3K | 1 week ago

Newton

Specialized Frameworks

GPU-accelerated differentiable physics simulation engine built on NVIDIA Warp, supporting rigid/soft body, cloth, and gradient-based optimization for scientific ML, initiated by Disney Research, DeepMind, and NVIDIA (Linux Foundation, Apache 2.0, 2025)

Active | ★5.2K | 1 week ago

mosaic

Protein & Drug Discovery

Composite-objective protein design framework integrating Boltz, AlphaFold2, OpenFold3, ProteinMPNN, and ESM via JAX-based gradient optimization over continuous relaxed sequence space for multi-property binder design (319+ stars, MIT License, 2025)

Active | ★345 | 1 week ago

TRIDENT (2025)

Computational Pathology & Digital Pathology

Toolkit for large-scale whole-slide image processing supporting 22+ patch encoders (UNI, CONCH, Virchow, H-Optimus-0, etc.), slide encoders (TITAN, GigaPath, PRISM, CHIEF, Madeleine, Feather), tissue segmentation, and multi-GPU inference with end-to-end pipeline and smart resume for standardized deployment of computational pathology foundation models (Mahmood Lab, Harvard Medical School, 553+ stars)

Active | ★600 | 1 week ago

ai4s-skills

Autonomous Research Systems (2023-2025 Breakthroughs)

Agent skills (SKILL.md + deterministic tools) for the AI4S workflow — topic exploration, literature survey, runnable experiments, publication-grade papers, and integrity audit, with every citation and number traceable to its source (by ai4s-research, maintainers of this list; MIT, 2026)

Active | ★90 | 1 week ago

SQUARNA

Computational biology

SQUARNA is a tool for RNA secondary structure prediction. It can take a single RNA sequence or an alignment of sequences as input. SQUARNA handles pseudoknots and can predict alternative structures. SQUARNA allows structural restraints and chemical probing data as additional input and is available at https://github.com/febos/SQUARNA and https://larnal.imol.institute/.

Active | ★20 | 1 week ago

bioSkills

Research Workbench & Plugins

Collection of SKILLS.md guiding AI coding agents (Claude Code, OpenAI Codex, Google Gemini, OpenCode, OpenClaw) through common bioinformatics workflows from basic sequence manipulation to advanced analyses such as single-cell RNA-seq and population genetics; evaluated on the Bio-Task Bench dataset (GPTomics, 969+ stars, MIT License, 2026)

Active | ★974 | 1 week ago

PennyLane

Specialized Frameworks

Cross-platform library for differentiable programming of quantum computers with automatic differentiation, enabling hybrid quantum-classical machine learning for quantum chemistry, quantum physics, and NISQ algorithm research (Xanadu, 3k+ stars)

Active | ★3.3K | 1 week ago

PyLabRobot

Lab Automation & Robotics

Interactive and hardware-agnostic SDK for laboratory automation, enabling programmatic control of liquid handlers, plate readers, and other lab instruments across multiple vendors; foundational infrastructure for self-driving laboratories and AI-driven experimental execution (447+ stars)

Active | ★475 | 1 week ago

TimesFM (Google Research)

General Science Models

Pretrained time series foundation model for long-horizon forecasting across diverse scientific domains including climate variables, biomedical signals, and physical observations; decoder-only Transformer architecture with strong zero-shot generalization (19.8K+ stars, Apache 2.0, 2024-2025)

Active | ★26.4K | 1 week ago

AlphaFold3

Protein & Drug Discovery

AlphaFold 3 inference pipeline for unified biomolecular structure prediction of proteins, nucleic acids, small molecules, ions, and post-translational modifications (Google DeepMind, Nature 2024)

Active | ★8.3K | 1 week ago

doctolib-lab/doctobert-fr-base

by doctolib-lab

Active | ↓460 | 1 week ago

MIRA (NeurIPS 2025)

Medical AI & Clinical Applications

Medical time series foundation model pretrained on 454B time points from heterogeneous clinical corpora spanning ICU physiological signals and hospital EHR, with continuous-time rotary positional encoding, frequency-specialized Mixture-of-Experts, and neural ODE extrapolation for zero-shot forecasting across irregular and multimodal temporal health data (Microsoft, 399+ stars, MIT License)

Active | ★408 | 1 week ago

Claude Scholar

Interactive Research Environments

Semi-automated research assistant for academic research and software development, supporting Claude Code, Codex CLI, Kimi Code CLI, and OpenCode across ideation, coding, experiments, writing, and publication (Galaxy-Dawn, 4.5K+ stars, MIT License, 2026)

Active | ★4.6K | 1 week ago

ParmEd

Parameter/topology editor and molecular simulator with visualization capability.

Active | ★452 | 1 week ago

Chemprop

Machine Learning

Directed message passing neural networks for property prediction of molecules and reactions with uncertainty and interpretation.

Active | ★2.4K | 1 week ago

efo

Active | ★64 | 1 week ago

Pippinlitli/evolva-qwen-0.5b-heretic

by Pippinlitli

text-generation

Heretic-abliterated version of Qwen/Qwen2.5-0.5B-Instruct for the Evolva drug discovery pipeline.

Active | ↓235 | 1 week ago

OmicVerse

Genomics & Bioinformatics

Unified Python framework for bulk, single-cell, and spatial RNA-seq multi-omics analysis with deep learning deconvolution (VAE) and graph neural networks, bridging Bindea, Bindea, scanpy and squidpy ecosystems (Nature Communications 2024)

Active | ★1.1K | 1 week ago

DeepAnalyze

Data Analysis & Visualization

First agentic LLM for autonomous data science with end-to-end pipeline from data to analyst-grade reports

Active | ★4.3K | 1 week ago

Chemical Entity Materials and Reactions Ontological Framework

A data model for managing information about chemical entities, ranging from atoms through molecules to complex mixtures.

Active | ★23 | 1 week ago

nnU-Net

Medical AI & Clinical Applications

Self-configuring deep learning framework for semantic segmentation of biomedical images requiring no manual hyperparameter tuning; automatically adapts preprocessing, network topology, and training parameters to achieve state-of-the-art results across 120+ international competitions and benchmarks out-of-the-box (DKFZ, Nature Methods 2021, 8.3k+ stars)

Active | ★8.6K | 1 week ago
