aasatorres/esm2-sae-topk-16384-k512

https://huggingface.co/aasatorres/esm2-sae-topk-16384-k512
Active · by aasatorres · 182 · updated 2 weeks ago

Sparse Autoencoder (SAE) trained on residue-level embeddings from ESM-2 (650M, layer 33) for interpretability research on protein language models.
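
For readers unfamiliar with the "topk" in the repository name, the following is a minimal sketch of a TopK sparse autoencoder forward pass on a single residue embedding. It is not the repository's code. The sizes are read off the repository name and the model description (16384 latents, k = 512, and the 1280-dimensional hidden state of ESM-2 650M). The weight names, the subtraction of the decoder bias before encoding, and the ReLU applied to the kept latents are assumptions borrowed from common TopK SAE conventions. The demo uses tiny hand-written weights so it runs instantly in pure Python.

```python
# Sizes implied by the repo name and description (assumptions, not read from the weights).
D_MODEL = 1280     # hidden size of ESM-2 650M
N_LATENTS = 16384  # "16384" in the repo name
K = 512            # "k512" in the repo name


def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """Encode one embedding, keep the k largest pre-activations, decode.

    x:     list of d floats (one residue embedding)
    w_enc: n rows of d floats, b_enc: n floats   (hypothetical names)
    w_dec: n rows of d floats, b_dec: d floats   (hypothetical names)
    Returns (latents, reconstruction).
    """
    # Assumed convention: center the input with the decoder bias first.
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [
        sum(c * w for c, w in zip(centered, row)) + b
        for row, b in zip(w_enc, b_enc)
    ]
    # TopK sparsity: only the k largest pre-activations survive.
    top = sorted(range(len(pre)), key=lambda j: pre[j], reverse=True)[:k]
    latents = [0.0] * len(pre)
    for j in top:
        latents[j] = max(pre[j], 0.0)  # assumed ReLU on the kept values
    # Linear decoder: weighted sum of the active latents' decoder rows.
    recon = list(b_dec)
    for j in top:
        for i, w in enumerate(w_dec[j]):
            recon[i] += latents[j] * w
    return latents, recon


if __name__ == "__main__":
    # Toy sizes (d = 3, n = 4, k = 2) in place of 1280 / 16384 / 512.
    w = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
    latents, recon = topk_sae_forward([1, 2, 3], w, [0] * 4, w, [0] * 3, k=2)
    print(sum(1 for a in latents if a != 0), "active latents")
```

With the real model, `x` would be one row of the layer-33 hidden states for a protein, and at most 512 of the 16384 latents would be nonzero for each residue.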

Sourced from

  • HuggingFace: aasatorres/esm2-sae-topk-16384-k512

Related resources

This model was fine-tuned on concatenated pairs of interacting proteins, in much the same way as PepMLM. It is meant to generate interaction partners for proteins using the masked language modeling capabilities of ESM-2. The model is not well tested, so use it with caution.

Stale · 3 · 2 years ago
Python

This model card provides an overview of the intended use of the ESMC SAE models and examples of how to access them, but it does not itself contain a model or model weights. To access each SAE model collection, use the links below:

Active · 0 · 6 days ago
Python

ESMC is a state-of-the-art protein language model that has learned the rules of protein biology from training on billions of protein sequences. ESMC provides representations of proteins enabling novel AI applications from therapeutic protein engineering to unlocking basic insights into protein…

Active · 614.4K · 6 days ago
Python

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation, a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 87% better perplexity than standard knowledge distillation at 20x compression.

Active · 14 · 3 months ago
Python

This set of model weights was released with the GitHub-compatible esm package format. The models here are kept for backwards compatibility, but we recommend you use the HuggingFace-compatible model weights at biohub/ESMC-6B (or biohub/ESMC-300M / biohub/ESMC-600M) instead.

Active · 2.5K · 2 weeks ago
Python

ESMC is a state-of-the-art protein language model that has learned the rules of protein biology from training on billions of protein sequences. ESMC provides representations of proteins enabling novel AI applications from therapeutic protein engineering to unlocking basic insights into protein…

Active · 3.5K · 6 days ago
Python