macwiatrak/bacformer-large-masked-MAG

Name: macwiatrak/bacformer-large-masked-MAG
Author: macwiatrak

Actively maintained by macwiatrak · 2131 · updated 1 week ago

- 2025-05-15: We identified a bug in the Bacformer Large code on HuggingFace which resulted in a significant drop in the quality of the output embeddings. This is now fixed, but if you downloaded or cached the model before this date, re-download and use the latest model revision before running…

README

library_name: transformers
tags: bacteria, bacformer, prokaryotes, biology, protein, genomic, genome
license: apache-2.0

bacformer-large-masked-MAG

2025-05-15: We identified a bug in the Bacformer Large code on HuggingFace which resulted in a significant drop in the quality of the output embeddings. This is now fixed, but if you downloaded or cached the model before this date, re-download and use the latest model revision before running evaluations.

KEY CHANGES from Bacformer base (26M): Number of…

HuggingFace: https://huggingface.co/macwiatrak/bacformer-large-masked-MAG
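The 2025-05-15 notice asks anyone with an older cached copy to re-download the model. A minimal sketch of one way to do that with the `transformers` library is shown here. The helper names `fresh_load_kwargs` and `load_fresh_model` are hypothetical, and the use of `AutoModelForMaskedLM` and `trust_remote_code=True` is an assumption based on the model name and the mention of model code hosted on HuggingFace; check the full README for the loading call the author recommends.

```python
# Sketch: force a fresh download so a copy cached before 2025-05-15 is not reused.
MODEL_ID = "macwiatrak/bacformer-large-masked-MAG"


def fresh_load_kwargs(revision: str = "main") -> dict:
    """Keyword arguments for from_pretrained that bypass the local cache."""
    return {
        "revision": revision,       # branch, tag, or commit hash to load
        "force_download": True,     # re-fetch files even if they are already cached
        "trust_remote_code": True,  # assumption: the repo ships custom model code
    }


def load_fresh_model(model_id: str = MODEL_ID, revision: str = "main"):
    """Download and load the model, ignoring any previously cached files."""
    # Imported lazily so fresh_load_kwargs works without transformers installed.
    from transformers import AutoModelForMaskedLM  # assumption: masked-LM head

    return AutoModelForMaskedLM.from_pretrained(model_id, **fresh_load_kwargs(revision))
```

Calling `load_fresh_model()` once should be enough to refresh the cache; later loads can drop `force_download` to avoid fetching the weights again.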

Source attribution

HuggingFace — macwiatrak/bacformer-large-masked-MAG

Related resources

macwiatrak/bacformer-large-masked-complete-genomes

by macwiatrak

Model

fill-mask

353 · 1 week ago

Python

nvidia/AMPLIFY_350M

by nvidia

Model

fill-mask

> [!NOTE]
> This model has been optimized using NVIDIA's TransformerEngine library. Slight numerical differences may be observed between the original model and the optimized model. For instructions on how to install TransformerEngine, please refer to the official documentation.

27 · 8 months ago

Python

nvidia/AMPLIFY_120M

by nvidia

Model

fill-mask

654 · 8 months ago

Python

littleworth/protgpt2-distilled-small

by littleworth

Model

text-generation

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation, a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 54% better perplexity than standard knowledge distillation at 9.4x compression.

5 · 3 months ago

Python

littleworth/protgpt2-distilled-medium

by littleworth

Model

text-generation

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation, a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 31% better perplexity than standard knowledge distillation at 3.8x compression.

66 · 3 months ago

Python

littleworth/protgpt2-distilled-tiny

by littleworth

Model

text-generation

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation, a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 87% better perplexity than standard knowledge distillation at 20x compression.

20 · 3 months ago

Python