Find open-source science resources

Plain-text, git-tracked electronic lab notebook (ELN) for reproducible bioinformatics — threads your R & Python figures into living lab notes with full provenance. Built for single-cell / CyTOF / flow cytometry; works with Obsidian, Quarto & Jupyter.

Active · 0 stars · 1 day ago

Research Workbench & Plugins

Scientific Agent Skills

Turn any AI agent into an AI Scientist. The #1 Agent Skills library for science with 140+ ready-to-use skills and 100+ scientific databases covering biology, chemistry, medicine, and drug discovery. Compatible with Cursor, Claude Code, Codex, Antigravity, and the open Agent Skills standard (K-Dense-AI, 26K+ stars, 2025)

Active · 30.6K stars · 2 days ago

overreact

Simulations

A library and command-line tool for building and analyzing complex homogeneous microkinetic models from quantum chemistry calculations, with support for quasi-harmonic thermochemistry, quantum tunnelling corrections, molecular symmetries and more.

Active · 64 stars · 4 days ago

Neuroscience & Behavioral Analysis

NeuroAI (Meta FAIR)

Modular Python suite for Neuro-AI research across all modalities, providing efficient data loaders (NeuralSet), curated datasets (NeuralFetch), scalable training (NeuralTrain), and unified benchmarking (NeuralBench) for building and evaluating neuroscience foundation models (Meta FAIR, 270+ stars, MIT License, 2026)

Active · 272 stars · 5 days ago

Scientific Writing & Collaboration

Claude Prism

Offline-first scientific writing workspace powered by Claude, integrating LaTeX, Python, and 100+ scientific skills with local execution, Zotero integration, and privacy-focused design (2026)

Active · 1.7K stars · 1 week ago

TypeScript

Rarr

DataImport

The Zarr specification defines a format for chunked, compressed, N-dimensional arrays. Its design allows efficient access to subsets of the stored array and supports both local and cloud storage systems. Rarr aims to implement this specification in R with minimal reliance on external tools or libraries.

Active · 54 stars · 1 week ago
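
The chunked layout mentioned in the Rarr entry is what makes partial reads cheap: a reader only has to fetch the chunks that overlap the requested region. The following is a minimal sketch of that idea in plain Python. It is not Rarr's R API and not part of the Zarr specification text; `chunks_for_selection` is a hypothetical helper, and the array and chunk sizes are made up for illustration.

```python
from itertools import product


def chunks_for_selection(selection, chunk_shape):
    """Return the grid indices of every chunk that overlaps a selection.

    selection   -- one (start, stop) pair per dimension, stop exclusive
    chunk_shape -- chunk length along each dimension
    """
    per_dim = []
    for (start, stop), size in zip(selection, chunk_shape):
        if stop <= start:
            return []  # an empty selection touches no chunks
        first = start // size       # chunk holding the first selected element
        last = (stop - 1) // size   # chunk holding the last selected element
        per_dim.append(range(first, last + 1))
    return list(product(*per_dim))


# Hypothetical 1000 x 1000 array stored as 100 x 100 chunks (100 chunks total).
# Reading rows 150..249 and columns 0..49 overlaps only two of them.
needed = chunks_for_selection([(150, 250), (0, 50)], (100, 100))
print(needed)
```

A real reader would then fetch and decompress only those chunks, which is why the same layout works for both local files and object storage.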

Autonomous Research Systems (2023-2025 Breakthroughs)

ARA (Agent-Native Research Artifact)

Research ecosystem for rigorous and trustworthy AI scientists — a protocol and skill bundle that makes autonomous research verifiable, crystallized, and observable through structured, machine-executable research artifacts and five agent skills for research management, compilation, verification, visualization, and publication (ARA-Labs, 447+ stars, MIT License, 2026)

Active · 450 stars · 1 week ago

HTML

mosaic

Protein & Drug Discovery

Composite-objective protein design framework integrating Boltz, AlphaFold2, OpenFold3, ProteinMPNN, and ESM via JAX-based gradient optimization over continuous relaxed sequence space for multi-property binder design (319+ stars, MIT License, 2025)

Active · 345 stars · 1 week ago

📋 Paper Collections & Repositories

Awesome LLM Scientific Discovery

LLM papers for scientific discovery

Active · 397 stars · 1 week ago

Autonomous Research Systems (2023-2025 Breakthroughs)

ai4s-skills

Agent skills (SKILL.md + deterministic tools) for the AI4S workflow — topic exploration, literature survey, runnable experiments, publication-grade papers, and integrity audit, with every citation and number traceable to its source (by ai4s-research, maintainers of this list; MIT, 2026)

Active · 90 stars · 1 week ago

Research Workbench & Plugins

bioSkills

Collection of SKILLS.md guiding AI coding agents (Claude Code, OpenAI Codex, Google Gemini, OpenCode, OpenClaw) through common bioinformatics workflows from basic sequence manipulation to advanced analyses such as single-cell RNA-seq and population genetics; evaluated on the Bio-Task Bench dataset (GPTomics, 969+ stars, MIT License, 2026)

Active · 974 stars · 1 week ago

Neural Differential Equations

DiffEqFlux.jl

Neural differential equations in Julia

Active · 917 stars · 1 week ago

Julia

Lab Automation & Robotics

PyLabRobot

Interactive and hardware-agnostic SDK for laboratory automation, enabling programmatic control of liquid handlers, plate readers, and other lab instruments across multiple vendors; foundational infrastructure for self-driving laboratories and AI-driven experimental execution (447+ stars)

Active · 475 stars · 1 week ago

Medical AI & Clinical Applications

MIRA (NeurIPS 2025)

Medical time series foundation model pretrained on 454B time points from heterogeneous clinical corpora spanning ICU physiological signals and hospital EHR, with continuous-time rotary positional encoding, frequency-specialized Mixture-of-Experts, and neural ODE extrapolation for zero-shot forecasting across irregular and multimodal temporal health data (Microsoft, 399+ stars, MIT License)

Active · 408 stars · 1 week ago

Interactive Research Environments

Claude Scholar

Semi-automated research assistant for academic research and software development, supporting Claude Code, Codex CLI, Kimi Code CLI, and OpenCode across ideation, coding, experiments, writing, and publication (Galaxy-Dawn, 4.5K+ stars, MIT License, 2026)

Active · 4.6K stars · 1 week ago

epiregulon

SingleCell

Gene regulatory networks model the underlying gene regulation hierarchies that drive gene expression and observed phenotypes. Epiregulon infers TF activity in single cells by constructing a gene regulatory network (regulons). This is achieved by integrating scATAC-seq and scRNA-seq data and incorporating public bulk TF ChIP-seq data. Links between regulatory elements and their target genes are established by computing correlations between chromatin accessibility and gene expression.

Active · 28 stars · 1 week ago
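
The last step in the epiregulon description, linking regulatory elements to genes by correlation, can be illustrated with a small sketch. This is plain Python rather than the package's R interface, and it assumes a Pearson correlation with a fixed cutoff; the package's actual statistic and thresholds may differ. The peak names, gene name, per-cell values, and the `min_r` cutoff are all invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (sd_x * sd_y)


# Toy per-cell measurements (five cells); values are made up.
accessibility = {
    "peak_A": [0, 1, 2, 3, 4],
    "peak_B": [4, 0, 3, 1, 2],
}
expression = {"gene_1": [1, 3, 5, 7, 9]}

min_r = 0.5  # hypothetical cutoff for keeping a peak-gene link
links = []
for peak, acc in accessibility.items():
    for gene, expr in expression.items():
        r = pearson(acc, expr)
        if r >= min_r:
            links.append((peak, gene, r))

print(links)
```

Here only `peak_A` tracks the expression of `gene_1` across cells, so it is the only link retained.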

graphein

Machine Learning

Provides functionality for producing geometric representations of protein and RNA structures, and biological interaction networks.

Active · 1.2K stars · 1 week ago

Jupyter Notebook

igvShiny

Software

This package is a wrapper for the Integrative Genomics Viewer (IGV). It provides an htmlwidget version of IGV that can be used as a module in Shiny apps.

Active · 38 stars · 1 week ago

Data Analysis & Visualization

DeepAnalyze

First agentic LLM for autonomous data science with end-to-end pipeline from data to analyst-grade reports

Active · 4.3K stars · 1 week ago

scToppR

Pathways

scToppR provides an easy-to-use API wrapper for the ToppGene web platform, used for gene ontology and functional enrichment research. The package also integrates visualization tools, making it a convenient tool directly connecting ToppGene to code-based workflows in R. The tool can also easily save results into different formats.

Active · 7 stars · 1 week ago

Babel

Hand-curated Snakemake pipelines to combine identifier cross-references from multiple sources across dozens of biomedical types, including anatomical entities, diseases and phenotypes, genes and proteins, and many others.

Active · 14 stars · 1 week ago

Domain-Specific Research Agents

ClawBio

First bioinformatics-native AI agent skill library enabling local-first, reproducible genomic and population-genetics research workflows built on OpenClaw (871+ stars, MIT License, 2026)

Active · 1K stars · 1 week ago

Slides & Presentation Generation

PPTAgent

Beyond text-to-slides generation with PPTEval multi-dimensional evaluation (EMNLP 2025)

Active · 4.7K stars · 1 week ago

matbench-discovery

Force Fields

A benchmark for ML-guided high-throughput materials discovery.

Active · 237 stars · 1 week ago

Autonomous Research Systems (2023-2025 Breakthroughs)

ARIS (Auto-Research-In-Sleep)

Lightweight Markdown-only skills for autonomous ML research with cross-model review loops, idea discovery, and experiment automation; no framework lock-in, works with Claude Code, Codex, OpenClaw, or any LLM agent (12.8K+ stars, MIT License, 2026)

Active · 12.8K stars · 1 week ago

Genomics & Bioinformatics

mLLMCelltype

Multi-LLM consensus framework for automated cell type annotation in single-cell transcriptomics, integrating predictions from 10+ large language models with iterative discussion and uncertainty quantification to reduce single-model biases, achieving up to 95% accuracy without reference datasets; available as CRAN R package and PyPI Python package with Scanpy/Seurat integration (2025)

Active · 649 stars · 2 weeks ago

ROBERT

Machine Learning

Ensemble of automated machine learning protocols that can be run sequentially through a single command line. The program works for regression and classification problems.

Active · 55 stars · 2 weeks ago

Interactive Research Environments

BioMCP

Biomedical Model Context Protocol (MCP) server unifying literature search across PubMed/Europe PMC, entity pivoting across genes/variants/drugs/diseases/pathways/proteins, local study analytics, and Claude Code/Codex integration for agentic biomedical research (531+ stars, MIT License, 2025-2026)

Active · 532 stars · 2 weeks ago

Rust

SevenNet (JCTC 2024)

Materials Discovery

Graph neural network interatomic potential package supporting efficient multi-GPU parallel molecular dynamics simulations, enabling large-scale atomistic modeling with machine learning potentials (MDIL-SNU, MIT License)

Active · 252 stars · 2 weeks ago

NequIP

Materials Discovery

E(3)-equivariant neural network interatomic potentials achieving DFT accuracy with up to 1000× less training data than invariant models, foundational architecture behind MACE and Allegro (Harvard, MIT, Nature Communications 2022)

Active · 933 stars · 2 weeks ago

GlycoDash

Biochemistry

GlycoDash is an R Shiny dashboard for processing glycomics data obtained from LaCyTools, SweetSuite and Skyline.

Active · 2 stars · 2 weeks ago

RiboCrypt

Software

R package for interactive visualization and browsing of NGS data. It contains a browser with both transcript and genomic coordinate views. QC and general metaplots are also included, among them differential translation plots and gene expression plots. The package is still under development.

Active · 6 stars · 2 weeks ago

ORFik

ImmunoOncology

R package for analysis of transcript and translation features through manipulation of sequence data and NGS data such as Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed; as the name hints, its primary use case is the investigation of ribosomal patterns over Open Reading Frames (ORFs). ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package supports reassigning transcript starts using CAGE-Seq data, automatic shifting of Ribo-Seq reads, finding Open Reading Frames across whole genomes, and much more.

Active · 38 stars · 2 weeks ago

RBPBench

RNA

RBPBench is a multi-function tool to evaluate CLIP-seq and other related genomic region data using a comprehensive collection of known RNA-binding protein (RBP) binding motifs. Its uses range from RBP motif search (database or user-supplied RBP motifs) in genomic regions, through motif enrichment and co-occurrence analysis and in-depth comparisons across multiple datasets via sequence and genomic annotation statistics, to benchmarking CLIP-seq peak caller methods and comparing cell types and CLIP-seq protocols. RBPBench supports both sequence and structure motifs, as well as regular expressions (sequence and structure patterns). Users can also provide their own motif collections.

Active · 7 stars · 2 weeks ago

Remote Sensing & Geospatial AI

TESSERA (CVPR 2026)

University of Cambridge's foundation model for time-series satellite imagery, enabling efficient extraction of temporal patterns from Earth observation for land classification, canopy height prediction, and other remote sensing tasks

Active · 622 stars · 2 weeks ago

miRNAProtPred

Biosciences

A powerful, high-performance bioinformatics framework for discovering, evaluating, and verifying microRNA (miRNA) interactions across DNA, RNA, and protein target sequences. The mirnaprotpred package provides two core modules: SeqFinder, a discovery engine that finds all potential miRNA interactions across a genome or target sequence, and Validator, a targeted verification engine that tests specific, user-provided miRNAs against a target sequence. Both modules are powered by a shared, rigorous biological engine that evaluates exact seed matching, wobble pairing, AU-rich context, and RNAduplex thermodynamic stability.

Active · 0 stars · 2 weeks ago
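
Of the criteria listed in the miRNAProtPred entry, exact seed matching is the simplest to illustrate. The sketch below is not the package's API: `seed_sites` is a hypothetical helper, and it assumes the common convention that the seed is nucleotides 2 to 8 of the miRNA and that a site is the reverse complement of that seed in an RNA target. The package may define the seed differently, and wobble pairing, AU context, and duplex stability are not modelled here. Both sequences are illustrative.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_sites(mirna, target, seed_start=1, seed_end=8):
    """Return 0-based target positions that pair exactly with the miRNA seed.

    The seed is mirna[seed_start:seed_end] (nucleotides 2-8 by default).
    Because the miRNA binds antiparallel to the target, the sequence to
    look for in the target is the reverse complement of the seed.
    """
    seed = mirna[seed_start:seed_end]
    site = "".join(COMPLEMENT[base] for base in reversed(seed))
    width = len(site)
    return [
        i
        for i in range(len(target) - width + 1)
        if target[i:i + width] == site
    ]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # example miRNA, 5' to 3'
target = "AAACUACCUCAAA"           # toy target, 5' to 3'
print(seed_sites(mirna, target))
```

The seed here is `GAGGUAG`, so the function searches the target for `CUACCUC` and reports where it starts.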

SeqKit

Sequence Processing

A cross-platform and ultrafast toolkit for FASTA/Q file manipulation in Golang.

Active · 1.6K stars · 2 weeks ago

DeepTaxa

Metagenomics

DeepTaxa is a hybrid CNN-BERT deep learning framework for multi-rank taxonomic classification of 16S rRNA gene sequences. It predicts all seven Linnaean ranks from domain to species in a single forward pass and provides pre-trained checkpoints for full-length 16S and V3-V4 amplicons.

Active · 4 stars · 2 weeks ago

Machine Learning for Physics

Walrus (arXiv 2025)

Cross-domain foundation model for continuum dynamics trained on 19 physical scenarios spanning 63 variables, featuring adaptive compute via stride modulation and patch jittering for long-run stability (Polymathic AI, 293+ stars, MIT License)

Active · 293 stars · 2 weeks ago

anndataR

SingleCell

Bring the power and flexibility of AnnData to the R ecosystem, allowing you to effortlessly manipulate and analyse your single-cell data. This package lets you work with backed h5ad and zarr files, directly access various slots (e.g. X, obs, var), or convert the data into SingleCellExperiment and Seurat objects.

Active · 188 stars · 2 weeks ago

Evaluation & Benchmarking

ResearchClawBench (InternScience, arXiv 2026)

Benchmark evaluating AI agents for end-to-end automated research from re-discovery to new-discovery, with 40 real-science tasks across 10 disciplines, curated datasets from published papers, and expert-curated multimodal rubrics (170+ stars, MIT License)

Active · 172 stars · 2 weeks ago

Jupyter Notebook

Literature & Knowledge Management

paper-search-mcp

MCP server, CLI, and agent skills for searching and downloading academic papers from multiple open sources (arXiv, PubMed, bioRxiv, Semantic Scholar, OpenAlex, CORE, Europe PMC, etc.) with unified, deduplicated, LLM-friendly retrieval and an OA-first download fallback chain (OpenAGS, 1.9K+ stars, MIT License, 2025)

Active · 2K stars · 2 weeks ago

isolib

Metabolomics

Create MSP files containing the isotopic patterns for given molecules with given adducts. The tool is based on enviPat and the RforMassSpectrometry toolbox.

Active · 15 stars · 2 weeks ago

Medical AI & Clinical Applications

BioImage.IO

Community-driven model zoo and deployment infrastructure for AI-powered bioimage analysis, enabling standardized sharing, validation, and cross-platform execution of deep learning models across Fiji, Ilastik, napari, and other scientific imaging tools (EPFL, EMBL, and global collaborators, actively maintained)

Active · 38 stars · 2 weeks ago

Jupyter Notebook

Research Workbench & Plugins

Medical Research Skills

Curated library of 550+ medical research agent skills spanning evidence insights, protocol design, omics/clinical data analysis, and academic writing; each skill is reviewed through MedSkillAudit and compatible with Claude Code, Codex, Open Code, OpenClaw, and SKILL.md-compatible agents (AIPOCH, 1.2K+ stars, MIT License, 2026)

Active · 1.3K stars · 2 weeks ago

Scientific Machine Learning Frameworks

Optimization.jl

Unified interface for local, global, gradient-based and derivative-free optimization (800+ stars)

Active · 827 stars · 2 weeks ago

Julia

STADyUM

StatisticalMethod

STADyUM is a package with functionality for analyzing nascent RNA read counts to infer transcription rates. This includes utilities for processing experimental nascent RNA read counts as well as for simulating PRO-seq data. Rates such as initiation, pause release and landing pad occupancy are estimated from either synthetic or experimental data. There are also options for varying pause sites and including steric hindrance of initiation in the model.

Active · 1 star · 2 weeks ago

DeepChem

Machine Learning

Deep learning library for chemistry based on TensorFlow

Active · 6.8K stars · 2 weeks ago

Neuroscience & Behavioral Analysis

SpikeInterface

Unified Python framework for extracellular electrophysiology, standardizing interfaces to 10+ ML-based spike sorting algorithms including Kilosort for reproducible neural spike sorting workflows (792+ stars, actively maintained)

Active · 806 stars · 2 weeks ago