Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

418 of 5,923 resources

Showing 401418

End-to-end composable multi-agent framework for automating OpenFOAM-based CFD simulations from natural language prompts, managing meshing, case setup, execution, error correction, and post-processing; achieves 100% success rate on 110 FoamBench tasks with Claude Opus 4.6 through Architect-Input Writer-Runner-Reviewer agent collaboration with RAG-enhanced generation and MCP tool integration (RPI CSML, 242+ stars, MIT License)

AI scientist framework for autonomous deep research in biological sciences, combining literature analysis agents with data scientist agents to enable iterative scientific discovery through user feedback integration; achieves state-of-the-art performance on BixBench benchmark (48.78% open-answer, 64.39% multiple-choice) outperforming Kepler and GPT-5 (bio-xyz, arXiv 2601.12542, 160+ stars, 2025-2026)

Full spaCy pipeline and models for scientific/biomedical documents, enabling named entity recognition, abbreviation resolution, and UMLS linking for scientific literature mining (1.9K+ stars, Apache 2.0)

Unified Python framework for extracellular electrophysiology, standardizing interfaces to 10+ ML-based spike sorting algorithms including Kilosort for reproducible neural spike sorting workflows (792+ stars, actively maintained)

Universal foundation model for grounded biomedical image interpretation, enabling comprehensive visual understanding, reasoning, and grounding across diverse biomedical imaging modalities with strong zero-shot generalization (55+ stars, Apache 2.0, 2025-2026)

Community-driven model zoo and deployment infrastructure for AI-powered bioimage analysis, enabling standardized sharing, validation, and cross-platform execution of deep learning models across Fiji, Ilastik, napari, and other scientific imaging tools (EPFL, EMBL, and global collaborators, actively maintained)

Open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives, enabling quantum algorithm development for quantum chemistry, materials science, and optimization research (IBM, 7.4K+ stars, Apache 2.0)

Decentralized self-organizing teams of AI agents for long-running computational scientific experimentation; agents critique each other's proposals before spending compute and share successes/failures to avoid redundant exploration, achieving +8.33% on BioML-Bench, 1.9× faster nanoGPT optimization, and +12.5% on ProteinGym ACE2-Spike (425+ stars, 2026)

First real quadrotor robot trained end-to-end with differentiable physics for vision-based agile flight, bridging simulation-based learning and real-world deployment with physics-informed neural network controllers (558+ stars)

State-of-the-art RNA 3D folding model developed with Stanford Das Lab and Kaggle competition winners, featuring a 488M-parameter AF3-like architecture with MSA and template-based modeling, enabling structure-driven drug discovery and RNA therapeutics design (NVIDIA-Digital-Bio, Apache 2.0)

103B-parameter open-source medical language model with 1/32 Mixture-of-Experts architecture, achieving HealthBench-leading performance among open-source models with only 6.1B active parameters; jointly developed by Ant Group and Zhejiang Province Health Information Center (MIT License)

Large-scale flow-based protein backbone generator utilizing hierarchical fold class labels for conditioning with a tailored scalable transformer architecture, enabling controllable de novo protein design (264+ stars)

Large transformer-based single-cell foundation model pretrained on 50 million cells for robust gene network inference, expression denoising, cell embedding, and zero-shot label prediction, leveraging ESM2 protein embeddings and bidirectional transformer architecture (Cantini Lab, 148+ stars, GPL-3.0)

Multimodal AI bridging transcriptomics data and natural language, enabling intuitive chat-based exploration and analysis of single-cell RNA-seq datasets through conversational interaction without coding; fine-tuned Mistral 7B LLaVA model emulating biologist-bioinformatician discussions (207+ stars, GPL-3.0)

Controllable foundation model for general and specialized biomolecular structure prediction across proteins, nucleic acids, and complexes, featuring a public web server for interactive prediction workflows (IntelliGen AI, 223+ stars, Apache 2.0, 2025)

PyTorch-native atomistic simulation engine for the machine-learned interatomic potential (MLIP) era, enabling batched molecular dynamics and structural relaxation with automatic GPU memory management; supports MACE, Fairchem, SevenNet, ORB, MatterSim and other popular MLIPs with up to 100x speedup over ASE (Radical AI, AI for Science 2026, 468+ stars, MIT License)

Local-first, open-source healthcare AI toolkit for clinical NLP and PHI/PII de-identification across 12 languages, running entirely on-device with 1,000+ specialized medical models; provides Python SDK, REST API, Docker deployment, and native Swift apps via OpenMedKit with Apple MLX/CoreML acceleration, supporting HIPAA-aware de-identification with 247 PII checkpoints (3K+ stars, Apache 2.0, arXiv 2508.01630)

Pretrained machine-learned force field for (bio)molecular simulations combining the fast SO3krates neural network for semi-local interactions with universal pairwise force fields for short-range repulsion, long-range electrostatics, and dispersion interactions; supports geometry optimization, NVT/NPT/NVE MD, fine-tuning, ASE calculator, and JAX-MD integration (JACS 2025, 218+ stars, MIT License)