Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

288 of 5,893 resources

Showing 251288

# ibm/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-MUV-101 biomed.sm.mv-te-84m is a multimodal biomedical foundation model for small molecules created using MMELON (Multi-view Molecular Embedding with Late Fusion), a flexible approach to aggregate multiple views (sequence, image, graph) of…

Idle121 year ago

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.

Idle3811 year ago
Python

ChemFIE-SA is a BERT-like sequence classifier for predicting synthesis accessibility given a SELFIES string of a compound, fine-tuned from gbyuvd/chemselfies-base-bertmlm on DeepSA's expanded dataset from Wang et al. 2023.

Idle71 year ago
Python

This model is a BERT-like sequence classifier for 221 human protein drug targets, fine-tuned from gbyuvd/chemselfies-base-bertmlm on a dataset derived ChemBL34 (Zdrazil et al. 2023). It predicts potential drug targets using chemical structures represented as SELFIES (Self-Referencing Embedded…

Idle71 year ago
Python

This is a ReactionT5 pre-trained to predict the products of reactions.

Idle371 year ago
Python

Universal Cell Embeddings (UCE) is a foundation model designed for single-cell RNA sequencing data analysis. UCE generates a universal representation of cells that captures the molecular diversity across different cell types, tissues, and species.

Idle141 year ago

# JSL-MedLlama-3-8B-v2.0

Stale6032 years ago
Python

This model is a fine-tuned version of DeBERTa on the PubMED Dataset.

Stale45.1K2 years ago
Python

Abstract:

Stale103.9K2 years ago
Python

Abstract:

Stale8752 years ago
Python

SMILES2IUPAC-canonical-base was designed to accurately translate SMILES chemical names to IUPAC standards.

Stale5.1K2 years ago
Python
Stale02 years ago

### Model Description A machine learning model for waste classification

Stale02 years ago

This model was finetuned on concatenated pairs of interacting proteins in much the same way as PepMLM. It is meant to generate interaction partners for proteins using the masked language modeling capabilities of ESM-2. The model is not well tested, so use with caution.

Stale32 years ago
Python

ProstT5 is a protein language model (pLM) which can translate between protein sequence and structure. !ProstT5 pre-training and inference

Stale7.8K2 years ago
Python

Question Answering Model for the PathoTHREAT Project

Stale42 years ago
Python
Stale02 years ago

This is a Japanese RoBERTa base model pre-trained on academic articles in medical sciences collected by Japan Science and Technology Agency (JST).

Stale1462 years ago
Python

I present a demo showcasing retinal vessel segmentation using the U-Net model, which is a well-known and widely used model in medical image segmentation. The model was trained on the DRIVE dataset, and the training process was conducted on Google Colab.

Stale02 years ago
Python

datasets: - UMLS

Stale1.8M2 years ago
Python

In recent years, pre-trained language models (PLMs) achieve the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general domain data, specialized ones have emerged to more effectively treat specific domains.

Stale1703 years ago
Python

In recent years, pre-trained language models (PLMs) achieve the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general domain data, specialized ones have emerged to more effectively treat specific domains.

Stale03 years ago
Python

In recent years, pre-trained language models (PLMs) achieve the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general domain data, specialized ones have emerged to more effectively treat specific domains.

Stale3213 years ago
Python

In recent years, pre-trained language models (PLMs) achieve the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general domain data, specialized ones have emerged to more effectively treat specific domains.

Stale1.5K3 years ago
Python
Stale03 years ago
Python
Stale03 years ago
Stale03 years ago
Python

This modelcard aims to be a base template for new models. It has been generated using this raw template.

Stale03 years ago
Python
Stale03 years ago
Stale03 years ago
Stale03 years ago

# ChemGPT 1.2B ChemGPT is based on the GPT-Neo model and was introduced in the paper Neural Scaling of Deep Chemical Models.

Stale3.2K3 years ago
Python

# ChemGPT 19M ChemGPT is based on the GPT-Neo model and was introduced in the paper Neural Scaling of Deep Chemical Models.

Stale2K3 years ago
Python

# ChemGPT 4.7M ChemGPT is based on the GPT-Neo model and was introduced in the paper Neural Scaling of Deep Chemical Models.

Stale3.3K3 years ago
Python

Deep learning for chemistry and materials science remains a novel field with lots of potiential. However, the popularity of transfer learning based methods in areas such as NLP and computer vision have not yet been effectively developed in computational chemistry + machine learning.

Stale276.9K5 years ago
Python