Find open-source science resources
A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.
Filters
Health
Domain(1)
Language(1)
License
Source(1)
Type
40 of 5,893 resources
HuggingFaceBio/Carbon-3B
by HuggingFaceBioTechnical Report 🧬
gemma4-12b-bioinfo is a fine-tuned Gemma 4 12B instruction model for bioinformatics, genomics, and computational biology question answering.
FINAL-Bench/Darwin-218B-Delphi
by FINAL-Bench> VIDRAFT FINAL-Bench — chemistry-specialized 218B MoE, served via the DELPHI 5-Phase inference cascade.
HealthJudge is a domain-adapted helpfulness evaluator for health-related Community Notes. It is designed to judge whether a note provides helpful context for a potentially misleading social-media post, following the Community Notes helpfulness criteria.
ProtGPT3-MSA is a multiple-sequence, homolog-conditioned autoregressive protein language model. It is part of the ProtGPT3 family, an open-source suite of promptable and aligned protein language models for protein sequence generation.
Hamdan003/inventmol-r1
by Hamdan003Target-Conditioned Molecular Ideation Model for Drug Discovery Research
Junhauwong/Surge-Cognition-4x8B
by JunhauwongBGI-HangzhouAI/Genos-m
by BGI-HangzhouAIGenos-m is a foundation model for human-associated microbial genomes. It is trained to model microbial DNA sequences at single-nucleotide resolution and supports ultra-long genomic contexts up to one million tokens.
Qwen3-8B-syco_med-gated-attention-FT is a plug-and-play gated attention weight released for AI safety research.
Apertus-8B-MeditronFO is a 8B-parameter medical specialist LLM, produced by supervised fine-tuning of Apertus-8B-Instruct on the Fully Open Meditron Corpus.
Apertus-70B-MeditronFO is a 70B-parameter medical specialist LLM, produced by supervised fine-tuning of Apertus-70B-Instruct on the Fully Open Meditron Corpus.
Gemma 4 E2B fine-tuned on 225K drug–target pairs for novel small-molecule generation.
MedPsy-4B is a state-of-the-art, text-only medical and healthcare language model purpose-built for edge deployment. Built on top of Qwen3-4B-Thinking-2507 and post-trained with a multi-stage pipeline (supervised fine-tuning + reinforcement learning) on curated medical data, it surpasses models…
Duchifat-2.3-Instruct is a state-of-the-art, instruction-tuned Large Language Model developed by TopAI. As the flagship of the Duchifat series, this model represents a fundamental breakthrough in how Hebrew is processed, reasoned, and generated in the LLM era.
Fine-tuned version of google/gemma-4-E4B-it across three professional domains — Medical, Legal, and Finance — using QLoRA (4-bit NF4) with Optuna-tuned hyperparameters, trained on Kaggle T4 GPU.
Verdugie/STEM-Oracle-27B
by Verdugie# or·a·cle /ˈôrəkəl/ — a source of wise counsel; one who provides authoritative knowledge. From Latin ōrāculum, meaning divine announcement. In computer science, an oracle is a black box that always returns the correct answer — you don't ask it how it knows, you ask and it answers.
A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 31% better perplexity than standard knowledge distillation at 3.8x compression.
littleworth/protgpt2-distilled-small
by littleworthA compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 54% better perplexity than standard knowledge distillation at 9.4x compression.
littleworth/protgpt2-distilled-tiny
by littleworthA compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 87% better perplexity than standard knowledge distillation at 20x compression.
From Inquiry to Decision: Building Trustworthy Medical AI
microsoft/MediPhi-Instruct
by microsoftThe MediPhi Model Collection comprises 7 small language models of 3.8B parameters from the base model Phi-3.5-mini-instruct specialized in the medical and clinical domains. The collection is designed in a modular fashion. Five MediPhi experts are fine-tuned on various medical corpora (i.e.
vandijklab/C2S-Scale-Gemma-2-27B
by vandijklabGitHub homepage: Cell2Sentence GitHub
Neeto-1.0-8b is an openly released biomedical large language model (LLM) created by BYOL Academy to assist learners and practitioners with medical exam study, literature understanding, and structured clinical reasoning.
darkknight25/deepseek-16b-medical-GPT
by darkknight25darkknight25/deepseek-16b-medical-GPT is a fine-tuned version of deepseek-ai/deepseek-l6b-moe-chat, optimized for medical question answering, reasoning, and clinical summarization using QLoRA and open-access healthcare datasets.
Dans-PersonalityEngine-V1.3.0-24b Dans-PersonalityEngine-V1.3.0-24b ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠄⠀⡂⠀⠁⡄⢀⠁⢀⣈⡄⠌⠐⠠⠤⠄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⡄⠆⠀⢠⠀⠛⣸⣄⣶⣾⡷⡾⠘⠃⢀⠀⣴⠀⡄⠰⢆⣠⠘⠰⠀⡀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠃⠀⡋⢀⣤⡿⠟⠋⠁⠀⡠⠤⢇⠋⠀⠈⠃⢀⠀⠈⡡⠤⠀⠀⠁⢄⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠁⡂⠀⠀⣀⣔⣧⠟⠋⠀⢀⡄⠀⠪⣀⡂⢁⠛⢆⠀⠀⠀⢎⢀⠄⢡⠢⠛⠠⡀⠀⠄⠀⠀ ⠀⠀⡀⠡⢑⠌⠈⣧⣮⢾⢏⠁⠀⠀⡀⠠⠦⠈⠀⠞⠑⠁⠀⠀⢧⡄⠈⡜⠷⠒⢸⡇⠐⠇⠿⠈⣖⠂⠀…
Dans-PersonalityEngine-V1.2.0-24b ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠄⠀⡂⠀⠁⡄⢀⠁⢀⣈⡄⠌⠐⠠⠤⠄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⡄⠆⠀⢠⠀⠛⣸⣄⣶⣾⡷⡾⠘⠃⢀⠀⣴⠀⡄⠰⢆⣠⠘⠰⠀⡀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠃⠀⡋⢀⣤⡿⠟⠋⠁⠀⡠⠤⢇⠋⠀⠈⠃⢀⠀⠈⡡⠤⠀⠀⠁⢄⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠁⡂⠀⠀⣀⣔⣧⠟⠋⠀⢀⡄⠀⠪⣀⡂⢁⠛⢆⠀⠀⠀⢎⢀⠄⢡⠢⠛⠠⡀⠀⠄⠀⠀ ⠀⠀⡀⠡⢑⠌⠈⣧⣮⢾⢏⠁⠀⠀⡀⠠⠦⠈⠀⠞⠑⠁⠀⠀⢧⡄⠈⡜⠷⠒⢸⡇⠐⠇⠿⠈⣖⠂⠀ ⠀⢌⠀⠤⠀⢠⣞⣾⡗⠁⠀⠈⠁⢨⡼⠀⠀⠀⢀⠀⣀⡤⣄⠄⠈⢻⡇⠀⠐⣠⠜⠑⠁⠀⣀⡔⡿⠨⡄…
ibm-research/GP-MoLFormer-Uniq
by ibm-researchGP-MoLFormer is a class of models pretrained on SMILES string representations of 0.65-1.1B molecules from ZINC and PubChem. This repository is for the model pretrained on all the unique molecules from both datasets.
Henrychur/MMedS-Llama-3-8B
by Henrychur# MMedS-Llama3 💻Github Repo 🖨️arXiv Paper
johnsnowlabs/JSL-MedLlama-3-8B-v2.0
by johnsnowlabs# JSL-MedLlama-3-8B-v2.0
knowledgator/SMILES2IUPAC-canonical-base
by knowledgatorSMILES2IUPAC-canonical-base was designed to accurately translate SMILES chemical names to IUPAC standards.
TachyHealth/Thealth_Mixtral-8x7B
by TachyHealth# ChemGPT 1.2B ChemGPT is based on the GPT-Neo model and was introduced in the paper Neural Scaling of Deep Chemical Models.
# ChemGPT 19M ChemGPT is based on the GPT-Neo model and was introduced in the paper Neural Scaling of Deep Chemical Models.
# ChemGPT 4.7M ChemGPT is based on the GPT-Neo model and was introduced in the paper Neural Scaling of Deep Chemical Models.