Foam-Agent (NeurIPS 2025)

DeepMind's Olympiad-level geometry theorem prover combining neural language model with symbolic deduction engine, AlphaGeometry2 solves 84% of IMO geometry problems (42/50) at gold-medalist level (Nature 2024)

Active4.8K5 months ago

Apache-2.0

Biomni

General-purpose biomedical AI agent integrating LLM reasoning with retrieval-augmented planning and code-based execution to autonomously execute diverse biomedical research tasks and generate testable hypotheses (Stanford SNAP, bioRxiv 2025)

Active3.2K1 week ago

Apache-2.0

DeepSeek-Prover-V2

DeepSeek's open-source large language model for formal theorem proving in Lean 4, integrating informal and formal mathematical reasoning through recursive subgoal decomposition and reinforcement learning powered by DeepSeek-V3, with open weights and ProverBench evaluation (2025)

Idle1.3K11 months ago

NOASSERTION

LeanDojo

Open-source toolkit and benchmark for learning-based theorem proving in Lean, providing programmatic Lean interaction, a 98K+ theorem dataset extracted from 217 Lean projects, and ReProver—the first retrieval-augmented LLM-based theorem prover for Lean—with reproducible training pipelines underpinning much subsequent Lean prover research (Caltech & NVIDIA, NeurIPS 2023 Outstanding Paper, Datasets & Benchmarks)

Active8034 months ago

MIT

TxAgent

AI agent for therapeutic reasoning across a universe of tools, achieving 92.1% accuracy in drug reasoning and outperforming GPT-4o by 25.8% (Harvard MIMS, 2025)

Idle63410 months ago

MIT

Camyla

Fully autonomous medical image segmentation research system that generates complete manuscripts end-to-end from datasets with zero human intervention, beating strongest baselines on 24 of 31 datasets and achieving T1-T2 tier manuscript quality in double-blind evaluations (USTC & Shanghai AI Lab, 2026)

Active3502 months ago