hussenmi/scimilarity_expanded_model

https://huggingface.co/hussenmi/scimilarity_expanded_model
Activeby hussenmi01updated 1 month ago

An extended version of SCimilarity, a metric-learning model for single-cell RNA-seq that maps cells to a unified 128-dimensional embedding space. The original model and method are described in:

Sourced from

  • HuggingFacehussenmi/scimilarity_expanded_model

Related resources

GeneJEPA is a Joint-Embedding Predictive Architecture (JEPA) trained for self-supervised representation learning on scRNA-seq. It uses a Perceiver-style encoder to handle sparse, high-dimensional gene count vectors and a Fourier-feature tokenizer for numerical tokenization.

Idle07 months ago

A diffusion language model for genome-scale perturbation prediction across diverse cellular contexts.

Active03 months ago

Automated cell type annotation tool for single-cell transcriptomics using gradient boosting and logistic regression with reference atlases, enabling standardized classification across datasets (Wellcome Sanger Institute, Nature Biotechnology 2022)

Active4863 weeks ago
Python
MIT

General-purpose RNA language model with 650M parameters pretrained on 36M non-coding RNA sequences, achieving strong generalization on structure prediction tasks including secondary structure prediction, splice-site prediction, mean ribosome loading, and ncRNA classification (lbcb-sci, 165+ stars, Apache-2.0)

Active1651 month ago
Python
Apache-2.0

Methods to infer clonal tree configuration for a population of cells using single-cell RNA-seq data (scRNA-seq), and possibly other data modalities. Methods are also provided to assign cells to inferred clones and explore differences in gene expression between clones. These methods can flexibly integrate information from imperfect clonal trees inferred based on bulk exome-seq data, and sparse variant alleles expressed in scRNA-seq data. A flexible beta-binomial error model that accounts for stochastic dropout events as well as systematic allelic imbalance is used.

Stale653 years ago
R
GPL-3.0

This package fits a model to the pattern of dropouts in single-cell RNASeq data. This model is used as a null to identify significantly variable (i.e. differentially expressed) genes for use in downstream analysis, such as clustering cells. Also includes an method for calculating exact Pearson residuals in UMI-tagged data using a library-size aware negative binomial model.

Stale332 years ago
R
GPL-2.0+