Find open-source science resources

Active · 45 · 4 months ago

Raziel1234/OSTLM

by Raziel1234

translation

A Neural Machine Translation (NMT) model based on a custom Transformer (Encoder-Decoder) architecture, trained from scratch. This model is designed to translate English sentences into Hebrew using multilingual encoding and specialized layer configurations.

Active · 18 · 4 months ago

google/alphagenome-all-folds

by google

Active · 0 · 4 months ago

OpenDFM/ChemDFM-R-14B

by OpenDFM

While large language models (LLMs) have achieved impressive progress, their application in scientific domains such as chemistry remains hindered by shallow domain understanding and limited reasoning capabilities. In this work, we focus on the specific field of chemistry and develop a Chemical…

Active · 73 · 5 months ago

OpenDFM/ChemDFM-v2.0-14B

by OpenDFM

ChemDFM-v2.0 is the latest non-thinking model of ChemDFM, the pioneering open-sourced dialogue foundation model for Chemistry and molecule science.

Active · 964 · 5 months ago

unsloth/medgemma-1.5-4b-it-GGUF

by unsloth

Unsloth Dynamic 2.0 achieves superior accuracy & outperforms other leading quants.

Active · 7.8K · 5 months ago

OpenMed/OpenMed-PII-SuperClinical-Small-44M-v1

by OpenMed

token-classification

PII Detection Model | 44M Parameters | Open Source

Active · 27K · 5 months ago

microsoft/MediPhi-Instruct

by microsoft

The MediPhi Model Collection comprises 7 small language models of 3.8B parameters from the base model Phi-3.5-mini-instruct specialized in the medical and clinical domains. The collection is designed in a modular fashion. Five MediPhi experts are fine-tuned on various medical corpora (i.e.

Active · 2K · 5 months ago

nvidia/geneformer_V2_316M

by nvidia

Geneformer is a foundational transformer model pretrained on a large-scale corpus of single-cell transcriptomes to enable context-specific predictions in settings with limited data in network biology.

Active · 32 · 5 months ago

nvidia/geneformer_V2_104M_CLcancer

by nvidia

Active · 16 · 5 months ago

nvidia/geneformer_V2_104M

by nvidia

Active · 31 · 5 months ago

nvidia/geneformer_V1_10M

by nvidia

Idle · 17 · 6 months ago

plant-llms/PlantBiMoE

by plant-llms

PlantBiMoE is a DNA language model trained on 42 representative plant species genomes. More specifically, PlantBiMoE uses the BiMamba and SparseMoE architecture with a masked language modeling objective to leverage highly available genotype data from 42 different plant species to…

Idle · 10 · 6 months ago
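
The PlantBiMoE entry above mentions a masked language modeling objective. As a rough illustration of what that objective looks like on a DNA string, here is a minimal sketch. It assumes single-nucleotide tokens, a hypothetical `[MASK]` symbol, and a hypothetical `mask_sequence` helper; it is not PlantBiMoE's actual tokenizer or masking scheme, and it omits refinements such as random-token replacement.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def mask_sequence(sequence, mask_prob=0.15, seed=0):
    """Hide each nucleotide with probability mask_prob.

    Returns (inputs, labels): inputs has masked positions replaced by MASK,
    labels holds the original nucleotide at masked positions and None elsewhere.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    inputs, labels = [], []
    for nucleotide in sequence:
        if rng.random() < mask_prob:
            inputs.append(MASK)        # the model sees the mask symbol...
            labels.append(nucleotide)  # ...and must predict the hidden base
        else:
            inputs.append(nucleotide)
            labels.append(None)        # unmasked positions carry no loss
    return inputs, labels


inputs, labels = mask_sequence("ACGTACGTACGT", mask_prob=0.3, seed=42)
print(inputs)
print(labels)
```

During training, a model would be scored only on the positions where `labels` is not `None`.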

prov-gigatime/GigaTIME

by prov-gigatime

image-to-image

Idle · 284 · 6 months ago

juppy44/plant-identification-2m-vit-b

by juppy44

image-classification

Idle · 320 · 6 months ago

MedSwin/MedSwin-DaRE-TIES-KD-0.7

by MedSwin

question-answering

This is a merge of pre-trained language models created using mergekit.

Idle · 41 · 6 months ago

mace-foundations/mace-mh-1

by mace-foundations

MACE-MH-1 is a foundation machine-learning interatomic potential (MLIP) that bridges molecular, surface, and materials chemistry through cross-domain learning:

Idle · 0 · 6 months ago

ZJU-AI4H/Hulu-Med-4B

by ZJU-AI4H

Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

Idle · 18.6K · 6 months ago

microsoft/llava-med-v1.5-mistral-7b

by microsoft

Large Language and Vision Assistant for bioMedicine (i.e., “LLaVA-Med”) is a large language and vision model trained using a curriculum learning method for adapting LLaVA to the biomedical domain. It is an open-source release intended for research use only to facilitate reproducibility of the…

Idle · 21.4K · 6 months ago

gbyuvd/chemembed-chemselfies

by gbyuvd

sentence-similarity

ChemFIE-BED is a sentence-transformers model based on gbyuvd/chemselfies-base-bertmlm, fine-tuned on around 2 million pairs (for now) of valid molecules' SELFIES (Krenn et al. 2020) taken from COCONUTDB (Sorokina et al. 2021) and ChEMBL34 (Zdrazil et al. 2023).

Idle · 117 · 7 months ago

vandijklab/C2S-Scale-Gemma-2-27B

by vandijklab

GitHub homepage: Cell2Sentence GitHub

Idle · 960 · 7 months ago

google/medgemma-4b-it

by google

Idle · 226.2K · 7 months ago

tahoebio/Tahoe-x1

by tahoebio

Tahoe-x1 is a family of perturbation-trained single-cell foundation models with up to 3 billion parameters, developed by Tahoe Therapeutics. Pretrained on 266 million single-cell transcriptomic profiles including the Tahoe-100M perturbation compendium, Tahoe-x1 achieves state-of-the-art performance…

Idle · 40 · 7 months ago

elonlit/GeneJEPA

by elonlit

feature-extraction

GeneJEPA is a Joint-Embedding Predictive Architecture (JEPA) trained for self-supervised representation learning on scRNA-seq. It uses a Perceiver-style encoder to handle sparse, high-dimensional gene count vectors and a Fourier-feature tokenizer for numerical tokenization.

Idle · 0 · 7 months ago
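
The GeneJEPA entry above mentions a Fourier-feature tokenizer for numerical values. The general idea of Fourier features can be sketched as follows: a scalar is mapped to sines and cosines at several frequencies so that a network can resolve both coarse and fine differences in magnitude. The function names, the power-of-two frequency schedule, and the `log1p` pre-transform are assumptions made for illustration, not details taken from GeneJEPA.

```python
import math


def fourier_features(value, num_frequencies=4):
    """Map a scalar to [sin(pi * 2^k * v), cos(pi * 2^k * v)] for k = 0..num_frequencies-1."""
    features = []
    for k in range(num_frequencies):
        angle = math.pi * (2 ** k) * value  # assumed power-of-two frequency schedule
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def encode_count(count, num_frequencies=4):
    """Encode a raw gene count; log1p compresses the heavy-tailed range (an assumption)."""
    return fourier_features(math.log1p(count), num_frequencies)


print(encode_count(0))   # a zero count maps to alternating 0.0 and 1.0
print(encode_count(37))
```

Each count becomes a fixed-length vector of `2 * num_frequencies` values in the range [-1, 1], regardless of how large the raw count is.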

biomni/Biomni-R0-32B-Preview

by biomni

This repo contains the weights of Biomni-R0-32B-Preview, a research preview of the series of biomedical AI agents trained by the Biomni team.

Idle · 382 · 8 months ago

InstaDeepAI/instanovoplus-v1.1.0

by InstaDeepAI

InstaNovoPlus is a diffusion-based model for de novo peptide sequencing from mass spectrometry data. This model leverages multinomial diffusion for accurate, database-free peptide identification for large-scale proteomics experiments.

Idle · 5 · 8 months ago

gbyuvd/chemselfies-base-bertmlm

by gbyuvd

This is a lightweight model pre-trained on SELFIES (Self-Referencing Embedded Strings) representations of molecules. It is trained on 2.7M unique and valid molecules taken from COCONUTDB and ChEMBL34, with 7.3M total generated masked examples.

Idle · 5 · 8 months ago

nvidia/AMPLIFY_350M

by nvidia

Note: This model has been optimized using NVIDIA's TransformerEngine library. Slight numerical differences may be observed between the original model and the optimized model. For instructions on how to install TransformerEngine, please refer to the official documentation.

Idle · 34 · 8 months ago

nvidia/AMPLIFY_120M

by nvidia

Idle · 583 · 8 months ago

lingshu-medical-mllm/Lingshu-7B

by lingshu-medical-mllm

Idle · 4.1K · 8 months ago

google/medgemma-27b-text-it

by google

Idle · 26K · 8 months ago

evo-design/evo-2-7b-8k-microviridae

by evo-design

Evo 2 is a state-of-the-art DNA language model for long-context modeling and design. Evo 2 models DNA sequences at single-nucleotide resolution at up to 1 million base pair context length using the StripedHyena 2 architecture, and was pretrained using Savanna.

Idle · 0 · 9 months ago

lastmass/Qwen3_Medical_GRPO

by lastmass

Chinese version of the documentation

Idle · 77 · 9 months ago

S4nfs/Neeto-1.0-8b

by S4nfs

Neeto-1.0-8b is an openly released biomedical large language model (LLM) created by BYOL Academy to assist learners and practitioners with medical exam study, literature understanding, and structured clinical reasoning.

Idle · 7.7K · 9 months ago

Zaixi/RNAGenesis

by Zaixi

feature-extraction

Idle · 51 · 9 months ago

ByteDance-Seed/bamboo_mixer

by ByteDance-Seed

This repository contains the official model of the paper A Unified Predictive and Generative Solution for Liquid Electrolyte Formulation.

Idle · 0 · 9 months ago

sagawa/ReactionT5v2-forward

by sagawa

This is a ReactionT5 model pre-trained to predict the products of reactions. You can use the demo here.

Idle · 2K · 9 months ago

AdaptLLM/biomed-Qwen2.5-VL-3B-Instruct

by AdaptLLM

This repo contains the biomedicine MLLM developed from Qwen2.5-VL-3B-Instruct in our paper: On Domain-Adaptive Post-Training for Multimodal Large Language Models. The corresponding training dataset is in biomed-visual-instructions.

Idle · 157 · 9 months ago

OpenMed/OpenMed-NER-ChemicalDetect-ElectraMed-33M

by OpenMed

token-classification

Specialized model for Chemical Entity Recognition - Identifies chemical compounds and substances in biomedical literature

Idle · 71 · 10 months ago

ameya98/JAMUN

by ameya98

other

JAMUN is a novel approach for generating conformational ensembles of protein structures, presented in the paper JAMUN: Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensembles.

Idle · 0 · 10 months ago

darkknight25/deepseek-16b-medical-GPT

by darkknight25

darkknight25/deepseek-16b-medical-GPT is a fine-tuned version of deepseek-ai/deepseek-l6b-moe-chat, optimized for medical question answering, reasoning, and clinical summarization using QLoRA and open-access healthcare datasets.

Idle · 0 · 11 months ago