Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

5,961 resources indexed

Showing 3,951-4,000

Cheminformatics extension for the SQLAlchemy database toolkit.

Analysis of molecular dynamics trajectories.

Parsers and algorithms for computational chemistry logfiles.

An automated workflow for generating and storing DFT calculations for organic molecules.

Wrapper for RDKit's RunReactants to improve stereochemistry handling

Webapp for generating conformers

A list of papers, data sets, and other resources for machine learning for small-molecule drug discovery.

Secure text-to-visualization through standardized chart specifications

Multi-type data labeling and annotation tool

Multi-agent system with a Parser-Planner-Painter architecture that converts `paper.pdf` into an editable `poster.pptx`; reported to outperform GPT-4o while using 87% fewer tokens

Multimodal LLM for understanding and generating scientific charts and diagrams

Beyond text-to-slides generation with PPTEval multi-dimensional evaluation (EMNLP 2025)

Transform arXiv papers into Beamer slides using LLMs

Convert PDF files into editable slides with three lines of code

First benchmark for automatic video generation from scientific papers (NeurIPS 2025)

Transform arXiv research papers into engaging presentations and YouTube-ready videos

Automated academic illustration generation for AI scientists, converting research papers into publication-ready figures using VLMs and diffusion models with iterative refinement (PKU & Google Research, 6.2K+ stars, 2026)

Comprehensive toolkit for high-quality PDF content extraction with layout detection, formula recognition, and OCR

Neural optical understanding for academic documents, transforms scientific PDFs to Markdown with mathematical formula support

Production-grade ETL for transforming complex documents into structured formats, with open-source API

High-accuracy PDF→Markdown/JSON/HTML conversion, specialized for tables/formulas/code blocks with benchmark scripts

Large-scale PDF/LaTeX/JATS parsing to standardized JSON for millions of papers

Machine learning software for extracting structured metadata from scholarly documents

Extract figures, tables, captions, and section titles from scholarly PDFs

Large-scale table detection and recognition dataset with pre-trained models

AI coding assistant for JupyterLab with agent mode, supporting arbitrary LLM providers (2025+)

Human-centered research OS with terminal-first harness and local browser Studio, turning research work into reproducible artifact-backed runs through a 9-stage workflow with human approval gates, resume/rollback controls, and venue-aware manuscript packaging (1K+ stars, 2026)

Research agent system deeply integrated with Zotero supporting Agent Mode, skills, multi-model backends (OpenAI-compatible, Claude Code, WebChat, Codex), and MinerU PDF parsing for literature Q&A, summarization, figure inspection, and source comparison (1.3K+ stars, 2026)

AI-powered note linking and research graph navigation

Structure-aware prefix adaptation for integrating LLMs with knowledge graphs (ACM MM 2024)

First system progressively surpassing human SOTA on frontier AI tasks (183.7%, 1.9%, 7.9% improvements), month-long autonomous discovery with 20,000+ GPU hours

Extended autonomy AI scientist with 200 parallel agent rollouts, 42K lines of code execution, 1.5K papers analyzed per run, achieving 79.4% accuracy and 7 scientific discoveries (Edison Scientific)

Autonomous algorithm discovery combining evolutionary search with peer-review reward models, achieving best-known performance on circle packing problems

Fully autonomous research from idea to paper with multi-agent debate, citation verification, and OpenClaw integration (11K+ stars, 2026)

Autonomous pipeline from literature review→hypothesis→algorithm implementation→publication-level writing with Scientist-Bench evaluation

Andrej Karpathy's autonomous LLM research framework: AI agent runs overnight experiments on a real training setup, auto-editing code→5min training→evaluation in a loop, ~100 experiments per night on a single GPU

Universal scientific research intelligence covering 50+ disciplines, repositioning LLMs as cross-disciplinary generators with human experts as verifiers; 30B model outperforms Claude Opus and GPT on 5 research benchmarks

102 executable tasks from 44 peer-reviewed papers across 4 disciplines with containerized evaluation

Research coding benchmark curated by scientists with 338 subproblems across 16 subdomains (physics, math, materials, biology, chemistry), evaluating LLMs on realistic scientific programming tasks with gold-standard solutions (NeurIPS 2024)

Web application for LLM-assisted manuscript review and annotation

AI agent for biological discovery and research automation

Multimodal LLM-based AI agent enabling deep research in spatial transcriptomics, automating analysis and interpretation of spatial gene expression data (Harvard LiuLab, bioRxiv 2025)

Large Language Models for automated open-domain scientific hypotheses discovery (ACL 2024, ICML Best Poster)

Bioinspired multi-agent intelligent graph reasoning system that autonomously traverses ontological knowledge graphs to generate, critique, and refine novel research hypotheses, demonstrated on bio-inspired materials discovery with cross-disciplinary connection mining (MIT Lamm Group, 2024)

Neural differential equations in PyTorch

Sparse identification of nonlinear dynamics

Efficient foundation models for PDEs with pretrained transformer-based neural operators and downstream task fine-tuning pipelines, HuggingFace integration for models and datasets (ETH Zurich CAMLab, arXiv 2024)

Geometry Aware Operator Transformer serving as an efficient and accurate neural surrogate for PDEs on arbitrary domains, combining geometric priors with transformer architectures for scientific computing (ETH Zurich CAMLab, 92+ stars)

Differentiable PDE solving framework for machine learning with built-in fluid simulation, supporting PyTorch/JAX/TensorFlow backends and enabling neural network training within physical simulations (TUM, MIT License)

Efficient differentiable n-dimensional PDE solvers built on JAX and Equinox, shipping 46+ built-in equations with Fourier spectral methods, exponential time differencing, and full auto-differentiation for physics-based deep learning workflows (MIT, 200+ stars, 2024)