Find open-source science resources

Technical Report 🧬

Active6.1K4 days ago

medicalai/ClinicalGPT-R1-Qwen-7B-EN-preview

by medicalai

# Instruction For more information, visit our GitHub repository: https://github.com/medfound/medfound

Active4.6K5 days ago

yashm/gemma4-12b-bioinfo-GGUF

by yashm

This repository contains GGUF files for gemma4-12b-bioinfo, a fine-tuned Gemma 4 12B model for bioinformatics and computational biology.

Active6595 days ago

yashm/gemma4-12b-bioinfo

by yashm

gemma4-12b-bioinfo is a fine-tuned Gemma 4 12B instruction model for bioinformatics, genomics, and computational biology question answering.

Active1505 days ago

pfnet/GenerRNA

by pfnet

*GenerRNA is a generative pre-trained language model for de novo RNA sequence design. It is a Transformer (decoder-only, GPT-style) model that learns the "language" of RNA from millions of natural sequences and can generate novel, realistic RNA sequences without any structural input, functional…

Active01 week ago

PhysicsWallahAI/Aryabhata-2.0

by PhysicsWallahAI

Aryabhata 2 is a reasoning-focused language model developed by PhysicsWallah for competitive STEM examinations (JEE, NEET). It is obtained by post-training GPT-OSS-20B via reinforcement learning on a curated curriculum of Physics, Chemistry, Mathematics, and General Reasoning questions — achieving…

Active2921 week ago

FINAL-Bench/Darwin-218B-Delphi

by FINAL-Bench

> VIDRAFT FINAL-Bench — chemistry-specialized 218B MoE, served via the DELPHI 5-Phase inference cascade.

Active181 week ago

Eculid/HealthJudge

by Eculid

HealthJudge is a domain-adapted helpfulness evaluator for health-related Community Notes. It is designed to judge whether a note provides helpful context for a potentially misleading social-media post, following the Community Notes helpfulness criteria.

Active311 week ago

AI4PD/ProtGPT3-MSA

by AI4PD

ProtGPT3-MSA is a multiple-sequence, homolog-conditioned autoregressive protein language model. It is part of the ProtGPT3 family, an open-source suite of promptable and aligned protein language models for protein sequence generation.

Active1751 week ago

pankajpandey-dev/Carbon-3B-GGUF

by pankajpandey-dev

GGUF quantizations of HuggingFaceBio/Carbon-3B — a generative DNA foundation model — for efficient inference with llama.cpp.

Active6202 weeks ago

Hamdan003/inventmol-r1

by Hamdan003

Target-Conditioned Molecular Ideation Model for Drug Discovery Research

Active03 weeks ago

Junhauwong/Surge-Cognition-4x8B

by Junhauwong

Active323 weeks ago

BGI-HangzhouAI/Genos-m

by BGI-HangzhouAI

Genos-m is a foundation model for human-associated microbial genomes. It is trained to model microbial DNA sequences at single-nucleotide resolution and supports ultra-long genomic contexts up to one million tokens.

Active313 weeks ago

sichengwang04/Qwen3-8B-syco_med-gated-attention-FT

by sichengwang04

Qwen3-8B-syco_med-gated-attention-FT is a plug-and-play gated attention weight released for AI safety research.

Active03 weeks ago

EPFLiGHT/Apertus-8B-MeditronFO

by EPFLiGHT

Apertus-8B-MeditronFO is a 8B-parameter medical specialist LLM, produced by supervised fine-tuning of Apertus-8B-Instruct on the Fully Open Meditron Corpus.

Active3813 weeks ago

EPFLiGHT/Apertus-70B-MeditronFO

by EPFLiGHT

Apertus-70B-MeditronFO is a 70B-parameter medical specialist LLM, produced by supervised fine-tuning of Apertus-70B-Instruct on the Fully Open Meditron Corpus.

Active3973 weeks ago

vadimbelsky/qwen3.5-medical-ft

by vadimbelsky

LoRA fine-tune of Qwen3.5-9B on synthetic clinical triage Q&A pairs generated from PubMed Central open-access papers. The model is specialized for emergency-medicine decision-making: triaging patients, applying clinical decision rules, and generating protocol-grounded triage recommendations.

Active03 weeks ago

JThomas-CoE/coe-gemma4-biology-mmlu_pro-14b-a4b-q4

by JThomas-CoE

Base model: google/gemma-4-26b-it Architecture: MoE — 26B total / ≈4B active parameters (1 shared expert + 8 routed from a pool of 128 per MoE layer, 30 MoE layers) Method: Activation-directed expert surgery — 128 → 64 experts per layer (50% reduction) Quantization: Q4KM (≈9.7 GB on disk) Tags:…

Active903 weeks ago

JThomas-CoE/coe-gemma4-math-mmlu_pro-14b-a4b-q4

by JThomas-CoE

Active3063 weeks ago

dlyog/gemma-cure

by dlyog

Gemma 4 E2B fine-tuned on 225K drug–target pairs for novel small-molecule generation.

Active253 weeks ago

GenerTeam/GENERator-v2-eukaryote-1.2b-base

by GenerTeam

## Important Notice If you are using GENERator for sequence generation, please ensure that the length of each input sequence is a multiple of 6. This can be achieved by either: 1. Padding the sequence on the left with 'A' (left padding); 2. Truncating the sequence from the left (left truncation).

Active4.9K4 weeks ago

qvac/MedPsy-4B

by qvac

MedPsy-4B is a state-of-the-art, text-only medical and healthcare language model purpose-built for edge deployment. Built on top of Qwen3-4B-Thinking-2507 and post-trained with a multi-stage pipeline (supervised fine-tuning + reinforcement learning) on curated medical data, it surpasses models…

Active1.2K1 month ago

InstaDeepAI/instanovo-phospho-v1.0.0

by InstaDeepAI

InstaNovo-P is a specialized transformer-based model for de novo peptide sequencing from phosphoproteomics mass spectrometry data. This model is specifically trained and optimized for identifying phosphorylated peptides and their modification sites.

Active71 month ago

InstaDeepAI/instanovo-v1.0.0

by InstaDeepAI

# InstaNovo: De novo Peptide Sequencing Model ## Model Description

Active111 month ago

InstaDeepAI/instanovo-v1.1.0

by InstaDeepAI

# InstaNovo: De novo Peptide Sequencing Model ## Model Description

Active161 month ago

empirischtech/DeepSeek-R1-Distill-Qwen-32B-gptq-4bit

by empirischtech

A domain-optimized reasoning model built on DeepSeek-R1-Distill-Qwen-32B, refined through a multi-stage pipeline of GPTQ quantization-aware training and QLoRA fine-tuning. Achieves 84% on MedQA — within 4 points of GPT-4o — in a ~20GB package that fits on a single L40/L40s GPU.

Active3471 month ago

razielAI/Duchifat-2.3-Instruct

by razielAI

Duchifat-2.3-Instruct is a state-of-the-art, instruction-tuned Large Language Model developed by TopAI. As the flagship of the Duchifat series, this model represents a fundamental breakthrough in how Hebrew is processed, reasoned, and generated in the LLM era.

Active1541 month ago

Acryl-aLLM/ALLM.H-Bv4-Gemma4-31B-BF16

by Acryl-aLLM

Active132 months ago

rajveer43/gemma-4-E4B-medical-legal-finance-qa

by rajveer43

Fine-tuned version of google/gemma-4-E4B-it across three professional domains — Medical, Legal, and Finance — using QLoRA (4-bit NF4) with Optuna-tuned hyperparameters, trained on Kaggle T4 GPU.

Active1K2 months ago

learning-unit/L1-16B-A3B

by learning-unit

L1 (Learning Unit 1) is the first language model from Lunit and Lunit Consortium, purpose-built for the medical domain. Derived from Gravity-16B-A3B-Base, L1 is designed for clinical reasoning and decision support.

Active2492 months ago

Verdugie/STEM-Oracle-27B

by Verdugie

# or·a·cle /ˈôrəkəl/ — a source of wise counsel; one who provides authoritative knowledge. From Latin ōrāculum, meaning divine announcement. In computer science, an oracle is a black box that always returns the correct answer — you don't ask it how it knows, you ask and it answers.

Active1422 months ago

littleworth/protgpt2-distilled-medium

by littleworth

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 31% better perplexity than standard knowledge distillation at 3.8x compression.

Active703 months ago

littleworth/protgpt2-distilled-small

by littleworth

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 54% better perplexity than standard knowledge distillation at 9.4x compression.

Active53 months ago

littleworth/protgpt2-distilled-tiny

by littleworth

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 87% better perplexity than standard knowledge distillation at 20x compression.

Active143 months ago

baichuan-inc/Baichuan-M3-235B-Q4_K_M-GGUF

by baichuan-inc

From Inquiry to Decision: Building Trustworthy Medical AI

Active204 months ago

EQUES/JPharmatron-7B

by EQUES

Active454 months ago

OpenDFM/ChemDFM-R-14B

by OpenDFM

While large language models (LLMs) have achieved impressive progress, their application in scientific domains such as chemistry remains hindered by shallow domain understanding and limited reasoning capabilities. In this work, we focus on the specific field of chemistry and develop a Chemical…

Active735 months ago

OpenDFM/ChemDFM-v2.0-14B

by OpenDFM

ChemDFM-v2.0 is the latest non-thinking model of ChemDFM, the pioneering open-sourced dialogue foundation model for Chemistry and molecule science.

Active9645 months ago

microsoft/MediPhi-Instruct

by microsoft

The MediPhi Model Collection comprises 7 small language models of 3.8B parameters from the base model Phi-3.5-mini-instruct specialized in the medical and clinical domains. The collection is designed in a modular fashion. Five MediPhi experts are fine-tuned on various medical corpora (i.e.

Active2K5 months ago

vandijklab/C2S-Scale-Gemma-2-27B

by vandijklab

GitHub homepage: Cell2Sentence GitHub

Idle9607 months ago

InstaDeepAI/instanovoplus-v1.1.0

by InstaDeepAI

InstaNovoPlus is a diffusion-based model for de novo peptide sequencing from mass spectrometry data. This model leverages multinomial diffusion for accurate, database-free peptide identification for large-scale proteomics experiments.

Idle58 months ago

google/medgemma-27b-text-it

by google

Idle26K8 months ago

lastmass/Qwen3_Medical_GRPO

by lastmass

中文版说明

Idle779 months ago

S4nfs/Neeto-1.0-8b

by S4nfs

Neeto-1.0-8b is an openly released biomedical large language model (LLM) created by BYOL Academy to assist learners and practitioners with medical exam study, literature understanding, and structured clinical reasoning.

Idle7.7K9 months ago

darkknight25/deepseek-16b-medical-GPT

by darkknight25

darkknight25/deepseek-16b-medical-GPT is a fine-tuned version of deepseek-ai/deepseek-l6b-moe-chat, optimized for medical question answering, reasoning, and clinical summarization using QLoRA and open-access healthcare datasets.

Idle011 months ago

futurehouse/ether0

by futurehouse

!ether0 logo

Idle25611 months ago

bartowski/PocketDoc_Dans-PersonalityEngine-V1.3.0-24b-GGUF

by bartowski

Using llama.cpp release b5466 for quantization.

Idle10.9K1 year ago

PocketDoc/Dans-PersonalityEngine-V1.3.0-24b

by PocketDoc

Dans-PersonalityEngine-V1.3.0-24b Dans-PersonalityEngine-V1.3.0-24b ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠄⠀⡂⠀⠁⡄⢀⠁⢀⣈⡄⠌⠐⠠⠤⠄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⡄⠆⠀⢠⠀⠛⣸⣄⣶⣾⡷⡾⠘⠃⢀⠀⣴⠀⡄⠰⢆⣠⠘⠰⠀⡀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠃⠀⡋⢀⣤⡿⠟⠋⠁⠀⡠⠤⢇⠋⠀⠈⠃⢀⠀⠈⡡⠤⠀⠀⠁⢄⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠁⡂⠀⠀⣀⣔⣧⠟⠋⠀⢀⡄⠀⠪⣀⡂⢁⠛⢆⠀⠀⠀⢎⢀⠄⢡⠢⠛⠠⡀⠀⠄⠀⠀ ⠀⠀⡀⠡⢑⠌⠈⣧⣮⢾⢏⠁⠀⠀⡀⠠⠦⠈⠀⠞⠑⠁⠀⠀⢧⡄⠈⡜⠷⠒⢸⡇⠐⠇⠿⠈⣖⠂⠀…

Idle1651 year ago

PocketDoc/Dans-PersonalityEngine-V1.2.0-24b

by PocketDoc

Dans-PersonalityEngine-V1.2.0-24b ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠄⠀⡂⠀⠁⡄⢀⠁⢀⣈⡄⠌⠐⠠⠤⠄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⡄⠆⠀⢠⠀⠛⣸⣄⣶⣾⡷⡾⠘⠃⢀⠀⣴⠀⡄⠰⢆⣠⠘⠰⠀⡀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠃⠀⡋⢀⣤⡿⠟⠋⠁⠀⡠⠤⢇⠋⠀⠈⠃⢀⠀⠈⡡⠤⠀⠀⠁⢄⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠁⡂⠀⠀⣀⣔⣧⠟⠋⠀⢀⡄⠀⠪⣀⡂⢁⠛⢆⠀⠀⠀⢎⢀⠄⢡⠢⠛⠠⡀⠀⠄⠀⠀ ⠀⠀⡀⠡⢑⠌⠈⣧⣮⢾⢏⠁⠀⠀⡀⠠⠦⠈⠀⠞⠑⠁⠀⠀⢧⡄⠈⡜⠷⠒⢸⡇⠐⠇⠿⠈⣖⠂⠀ ⠀⢌⠀⠤⠀⢠⣞⣾⡗⠁⠀⠈⠁⢨⡼⠀⠀⠀⢀⠀⣀⡤⣄⠄⠈⢻⡇⠀⠐⣠⠜⠑⠁⠀⣀⡔⡿⠨⡄…

Idle561 year ago

ibm-research/GP-MoLFormer-Uniq

by ibm-research

GP-MoLFormer is a class of models pretrained on SMILES string representations of 0.65-1.1B molecules from ZINC and PubChem. This repository is for the model pretrained on all the unique molecules from both datasets.

Idle1.5K1 year ago