DOEJGI/GenomeOcean-4B

https://huggingface.co/DOEJGI/GenomeOcean-4B
Idle · by DOEJGI · 4.4K · 10 · updated 1 year ago

This is the base model of GenomeOcean-4B. It is trained with a causal language modeling (CLM) objective and uses a byte-pair encoding (BPE) tokenizer with a 4,096-token vocabulary. It supports a maximum sequence length of 10,240 tokens (~50 kbp).
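
The two figures above (10,240 tokens, roughly 50 kbp) imply an average of about 4.9 base pairs per token. The sketch below uses that ratio to estimate whether a DNA sequence fits in the context window and to split longer sequences into chunks. It is an illustration only: the helper names (`estimate_tokens`, `split_for_context`) and the `safety` margin are hypothetical, and the real token count depends on the model's own tokenizer and on the sequence content.

```python
import math

MAX_TOKENS = 10240   # maximum sequence length stated in the description
APPROX_BP = 50_000   # "~50 kbp" stated in the description
# Average base pairs per token implied by those two figures (~4.88).
# This is a rough average, not a guarantee for any particular sequence.
BP_PER_TOKEN = APPROX_BP / MAX_TOKENS


def estimate_tokens(seq: str) -> int:
    """Estimate the token count of a DNA sequence from its length alone."""
    return math.ceil(len(seq) / BP_PER_TOKEN)


def split_for_context(seq: str, max_tokens: int = MAX_TOKENS,
                      safety: float = 0.9) -> list[str]:
    """Split seq into consecutive chunks expected to fit in max_tokens.

    `safety` (hypothetical parameter) shrinks each chunk to leave headroom,
    because the actual tokens-per-base ratio varies between sequences.
    """
    chunk_bp = max(1, int(max_tokens * BP_PER_TOKEN * safety))
    return [seq[i:i + chunk_bp] for i in range(0, len(seq), chunk_bp)]


if __name__ == "__main__":
    genome = "ACGT" * 30_000  # 120 kbp toy sequence
    chunks = split_for_context(genome)
    print(len(chunks), [estimate_tokens(c) for c in chunks])
```

For exact budgeting, tokenize each chunk with the model's tokenizer and check the resulting length instead of relying on the length-based estimate.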

Sourced from

  • HuggingFace: DOEJGI/GenomeOcean-4B

Related resources

gemma4-12b-bioinfo is a fine-tuned Gemma 4 12B instruction model for bioinformatics, genomics, and computational biology question answering.

A PyTorch port of AlphaGenome, the DNA sequence model from Google DeepMind that predicts hundreds of genomic tracks at single base-pair resolution from sequences up to 1M bp.

GitHub homepage: Cell2Sentence GitHub
