Find open-source science resources
A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.
Filters
Health
Domain
Language
License
Source(1)
Type
418 of 5,923 resources
Showing 351–400
Python package for segmenting geospatial data with the Segment Anything Model (SAM), enabling zero-shot object segmentation in satellite and aerial imagery for remote sensing and Earth observation (MIT, 4k+ stars)
Curated list of large weather models for AI Earth science
Python toolkit for fine-tuning geospatial foundation models
PyTorch domain library for geospatial deep learning providing standardized datasets, samplers, transforms, and pre-trained models for remote sensing, land cover mapping, and environmental monitoring (Microsoft, 4K+ stars)
Allen Institute for AI's global geospatial foundation model for satellite imagery analysis, enabling large-scale mapping of buildings, wind turbines, trees, and land cover from Sentinel-2 data with open-source weights and inference tools (2024)
Semantic-enhanced multi-modal remote sensing foundation model for Earth observation (Nature Machine Intelligence 2025), enabling universal interpretation across diverse satellite imagery modalities with open-source weights and benchmarks
University of Cambridge's foundation model for time-series satellite imagery, enabling efficient extraction of temporal patterns from Earth observation for land classification, canopy height prediction, and other remote sensing tasks
First any-to-any generative foundation model for Earth Observation, enabling unified multimodal understanding and generation across diverse satellite sensors and geospatial tasks through a single architecture (258+ stars)
Agricultural machine learning platform
Ecological modeling and conservation AI
Microsoft AI for Good Lab's open-source biodiversity research hub providing AI models, edge devices, and tools for wildlife monitoring and conservation, including MegaDetector (camera trap animal detection), SPARROW (species recognition), PytorchWildlife (conservation AI toolkit), and bioacoustics analysis pipelines (1K+ stars)
Curated collection of 23,000+ agent skills for empirical research across 8 social science disciplines, enabling reproducible social science research with AI agents (Stanford REAP & CoPaper.AI, 1.1K+ stars, 2026)
Large language model for science
Open-source scientific multimodal foundation model built on a 235B MoE LLM and 6B vision encoder, continually pretrained on 5T tokens including 2.5T scientific-domain tokens, with strong results across chemistry, materials, life science, and earth science benchmarks (2025)
Open language model for mathematics (7B/34B) trained on Proof-Pile-2, outperforming Minerva at equal scale on MATH benchmark, with tool use and formal theorem proving in Lean without finetuning (EleutherAI, ICLR 2024)
Mathematical reasoning
IBM's open foundation model family for materials and chemistry, covering SMILES, SELFIES, molecular graphs, 3D atom positions, and electron density grids, with a unified toolkit for representation learning and downstream prediction/generation (Apache 2.0, 2024-2025)
15TB collection of 16 large-scale numerical simulation datasets spanning fluid dynamics, MHD, astrophysics, biological systems, and acoustic scattering, with unified PyTorch dataloaders and benchmarks for training foundation models on physical sciences (Polymathic AI, NeurIPS 2024)
Acausal modeling framework for automatically parallelized scientific machine learning (1.5k+ stars)
Scientific machine learning benchmarks & differential equation solvers
Unified interface for local, global, gradient-based and derivative-free optimization (800+ stars)
SDK & library for AI-driven scientific computing applications
Machine learning in Julia
Molecular dynamics analysis
Euclidean neural networks for arbitrary point transformations enabling E(3)-equivariant deep learning, foundational library for building geometry-aware neural networks in molecular dynamics, materials science, and physics
Probabilistic programming
High-performance molecular simulation toolkit
End-to-end molecular dynamics engine built on PyTorch, enabling differentiable simulations with neural network potentials and GPU acceleration for machine learning-accelerated molecular dynamics (MIT License, 707+ stars)
Graph neural network library for PyTorch enabling molecular modeling, materials discovery, protein interaction networks, and scientific knowledge graph learning (23.7k+ stars)
LLM for scientific research papers
PINN research collection
LLM agents across scientific domains
200+ AI for Science papers with Chinese interpretations
Physics-informed ML and SciML
Biomedical AI agents
Google DeepMind's official collection of agentic science skills accelerating scientific workflows with better grounding and higher token efficiency, integrating insights from AlphaGenome, AFDB, UniProt and 30+ other databases and tools (2026)
Structure prediction and design of proteins with noncanonical amino acids, enabling AI-powered modeling of synthetic biology constructs and expanded genetic code systems (133+ stars, 2025)
Multimodal AI system generating virtual populations for tumor microenvironment modeling from H&E and multiplex immunofluorescence pathology images, enabling large-scale spatial analysis of cancer biology and therapeutic response prediction (Microsoft Research & Providence, 370+ stars)
Open-source medical large language model for complex clinical reasoning, extending the o1 long-chain-of-thought paradigm to biomedical question answering and diagnostic inference (FreedomIntelligence, 1.3K+ stars)
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs (460+ stars, 2024)
First bioinformatics-native AI agent skill library enabling local-first, reproducible genomic and population-genetics research workflows built on OpenClaw (871+ stars, MIT License, 2026)
PyTorch-based embedding instance segmentation algorithm optimized for accurate, efficient, and portable cell and nucleus segmentation across fluorescence and brightfield microscopy images, achieving state-of-the-art speed and accuracy with lightweight model sizes suitable for edge deployment (224+ stars, Apache 2.0)
Deep Graph Library for scalable deep learning on graphs, powering molecular modeling, materials discovery, protein interaction networks, and scientific knowledge graph learning across PyTorch, TensorFlow, and MXNet backends (14K+ stars)
Advanced paper search agent powered by large language models, autonomously invoking search tools, reading papers, and selecting references to deliver comprehensive and accurate results for complex scholarly queries (1.5K+ stars, Apache 2.0, 2024)
Python library to train, interpret, and apply deep learning models to DNA sequences, providing a unified framework for regulatory genomics with support for CNN and transformer architectures, variant effect prediction, and attribution analysis (325+ stars)
Open-source bioimage analysis platform for digital pathology and research, featuring AI-powered cell detection, tissue classification, and whole-slide image analysis with extensible scripting and plugin architecture (1.3K+ stars, actively maintained)
Open-source image analysis toolkit for high-throughput plant phenotyping, extracting morphological, color, and texture traits from RGB, hyperspectral, and thermal imagery with modular Python workflows for crop improvement, stress detection, and plant biology research (Donald Danforth Plant Science Center, 795+ stars, MPL-2.0)
Standard data-centric AI package for data quality and machine learning, automatically detecting label errors, outliers, and dataset issues to improve scientific dataset reliability and model performance (11K+ stars, MIT License)
First versatile medical reasoning agent for chest X-ray interpretation, dynamically integrating state-of-the-art CXR analysis tools and multimodal LLMs into a unified framework; introduces ChestAgentBench with 2,500 complex medical queries across 7 categories (bowang-lab, 1.1K+ stars)
First open-source agentic AI physicist turning research questions into structured workflows with rigorous verification and multi-step analytical work for long-horizon physics projects; integrates with Claude Code, Codex, Gemini CLI, and OpenCode (804+ stars, Apache 2.0, 2026)