Find open-source science resources
A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.
Filters
Health
Domain
Language
License
Source
Type(1)
313 of 5,940 resources
Showing 201–250
makiyeah/CMRCLIP
by makiyeah> A CMR-report contrastive model combining Vision Transformers and pretrained text encoders.
Unsloth Dynamic 2.0 achieves superior accuracy & outperforms other leading quants.
helical-ai/helix-mRNA
by helical-aiLab-Rasool/RadImageNet
by Lab-RasoolThis repository contains pre-trained models from RadImageNet, a large-scale radiologic image dataset designed to facilitate transfer learning for medical imaging applications.
ibm-research/materials.smi-ted
by ibm-researchWelcome to IBM's series of large foundation models for sustainable materials. Our models span a variety of representations and modalities, including SMILES, SELFIES, 3D atom positions, 3D density grids, molecular graphs, and other formats.
zhihan1996/DNA_bert_3
by zhihan1996zhihan1996/DNA_bert_4
by zhihan1996zhihan1996/DNA_bert_5
by zhihan1996zhihan1996/DNA_bert_6
by zhihan1996SaltySander/MOSAIC
by SaltySander> [!IMPORTANT] > 🎉 Check out the latest version of Phikon here: Phikon-v2 > > Phikon is a self-supervised learning model for histopathology trained with iBOT.
microsoft/NatureLM-8x7B
by microsoft# Model details ## Model description Nature Language Model (NatureLM) is a sequence-based science foundation model designed for scientific discovery. Pre-trained with data from multiple scientific domains, NatureLM offers a unified, versatile model that enables various applications including…
microsoft/NatureLM-8x7B-Inst
by microsoft# Model details ## Model description Nature Language Model (NatureLM) is a sequence-based science foundation model designed for scientific discovery. Pre-trained with data from multiple scientific domains, NatureLM offers a unified, versatile model that enables various applications including…
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence -…
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence -…
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence -…
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence -…
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence -…
This model had been created as part of joint research of HUMADEX research group (https://www.linkedin.com/company/101563689/) and has received funding by the European Union Horizon Europe Research and Innovation Program project SMILE (grant number 101080923) and Marie Skłodowska-Curie Actions…
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence -…
mathpluscode/CineMA
by mathpluscodeCineMA is a foundation model for Cine cardiac magnetic resonance (CMR) imaging based on Masked-Autoencoder. CineMA has been pre-trained on UK Biobank data and fine-tuned on multiple clinically relevant tasks such as ventricle and myocaridum segmentation, ejection fraction (EF) regression,…
Sisigoks/FloraSense
by SisigoksFloraSense is a fine-tuned Vision Transformer (ViT) model designed for accurate classification of plant species and flora-related imagery. It builds on top of the powerful google/vit-base-patch16-224 base model and is fine-tuned on the PlanterGARDENEDITION dataset curated by Sisigoks, which…
Using llama.cpp release b5466 for quantization.
Dans-PersonalityEngine-V1.3.0-24b Dans-PersonalityEngine-V1.3.0-24b ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠄⠀⡂⠀⠁⡄⢀⠁⢀⣈⡄⠌⠐⠠⠤⠄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⡄⠆⠀⢠⠀⠛⣸⣄⣶⣾⡷⡾⠘⠃⢀⠀⣴⠀⡄⠰⢆⣠⠘⠰⠀⡀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠃⠀⡋⢀⣤⡿⠟⠋⠁⠀⡠⠤⢇⠋⠀⠈⠃⢀⠀⠈⡡⠤⠀⠀⠁⢄⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠁⡂⠀⠀⣀⣔⣧⠟⠋⠀⢀⡄⠀⠪⣀⡂⢁⠛⢆⠀⠀⠀⢎⢀⠄⢡⠢⠛⠠⡀⠀⠄⠀⠀ ⠀⠀⡀⠡⢑⠌⠈⣧⣮⢾⢏⠁⠀⠀⡀⠠⠦⠈⠀⠞⠑⠁⠀⠀⢧⡄⠈⡜⠷⠒⢸⡇⠐⠇⠿⠈⣖⠂⠀…
Dans-PersonalityEngine-V1.3.0-12b Dans-PersonalityEngine-V1.3.0-12b ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠄⠀⡂⠀⠁⡄⢀⠁⢀⣈⡄⠌⠐⠠⠤⠄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⡄⠆⠀⢠⠀⠛⣸⣄⣶⣾⡷⡾⠘⠃⢀⠀⣴⠀⡄⠰⢆⣠⠘⠰⠀⡀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠃⠀⡋⢀⣤⡿⠟⠋⠁⠀⡠⠤⢇⠋⠀⠈⠃⢀⠀⠈⡡⠤⠀⠀⠁⢄⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠁⡂⠀⠀⣀⣔⣧⠟⠋⠀⢀⡄⠀⠪⣀⡂⢁⠛⢆⠀⠀⠀⢎⢀⠄⢡⠢⠛⠠⡀⠀⠄⠀⠀ ⠀⠀⡀⠡⢑⠌⠈⣧⣮⢾⢏⠁⠀⠀⡀⠠⠦⠈⠀⠞⠑⠁⠀⠀⢧⡄⠈⡜⠷⠒⢸⡇⠐⠇⠿⠈⣖⠂⠀…
Dans-PersonalityEngine-V1.2.0-24b ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⠄⠀⡂⠀⠁⡄⢀⠁⢀⣈⡄⠌⠐⠠⠤⠄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⡄⠆⠀⢠⠀⠛⣸⣄⣶⣾⡷⡾⠘⠃⢀⠀⣴⠀⡄⠰⢆⣠⠘⠰⠀⡀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠃⠀⡋⢀⣤⡿⠟⠋⠁⠀⡠⠤⢇⠋⠀⠈⠃⢀⠀⠈⡡⠤⠀⠀⠁⢄⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠁⡂⠀⠀⣀⣔⣧⠟⠋⠀⢀⡄⠀⠪⣀⡂⢁⠛⢆⠀⠀⠀⢎⢀⠄⢡⠢⠛⠠⡀⠀⠄⠀⠀ ⠀⠀⡀⠡⢑⠌⠈⣧⣮⢾⢏⠁⠀⠀⡀⠠⠦⠈⠀⠞⠑⠁⠀⠀⢧⡄⠈⡜⠷⠒⢸⡇⠐⠇⠿⠈⣖⠂⠀ ⠀⢌⠀⠤⠀⢠⣞⣾⡗⠁⠀⠈⠁⢨⡼⠀⠀⠀⢀⠀⣀⡤⣄⠄⠈⢻⡇⠀⠐⣠⠜⠑⠁⠀⣀⡔⡿⠨⡄…
Unsloth Dynamic 2.0 achieves superior accuracy & outperforms other leading quants.
ibm-research/GP-MoLFormer-Uniq
by ibm-researchGP-MoLFormer is a class of models pretrained on SMILES string representations of 0.65-1.1B molecules from ZINC and PubChem. This repository is for the model pretrained on all the unique molecules from both datasets.
XformAI-india/qwen-0.6b-mentalhealth-support
by XformAI-indiaModel Repo: xformai/qwen-0.6b-mentalhealth-support Base Model: Qwen/Qwen-0.5B Task: Empathetic Conversational AI for mental health & emotional support Fine-Tuned By: XformAI
QIAIUNCC/EYE-Llama_gqa
by QIAIUNCC## Model Description EYE-Llama_gqa is a large language model specifically designed for ophthalmic question-answering (QA). It is built upon the Llama 2 architecture and fine-tuned on a the EYE-lit and EYE-QA+ dataset.
## Overview This project focuses on curating and modeling bioactivity data of small molecules targeting immune receptors. Using datasets from ImmtorLig_DB, we applied machine learning techniques to predict interactions between small molecules and immune receptors or cytokines, aiding drug discovery…
prov-gigapath/prov-gigapath
by prov-gigapathmedicalai/ClinicalBERT
by medicalaiThis model card describes the ClinicalBERT model, which was trained on a large multicenter dataset with a large corpus of 1.2B words of diverse diseases we constructed. We then utilized a large-scale corpus of EHRs from over 3 million patient records to fine tune the base language model.
This is the full precision (f16) GGUF version of a model trained for medical chatbot and dental implant assistant tasks. It combines general doctor–patient dialogue understanding with domain-specific Q&A derived from Straumann® dental implant system manuals.
Protein solubility is a critical factor in both pharmaceutical research and production processes, as it can significantly impact the quality and function of a protein. This is an example for finetuning ibm/biomed.omics.bl.sm-ted-458m for protein solubility prediction (binary classification) based…
prithivMLmods/Indian-Western-Food-34
by prithivMLmods!fffffff.png
This deep learning model is designed for ECG image classification, fine-tuned using ResNet-50. It can classify ECG images into different categories to assist in heart disease detection.
PurvaTijare/PPTStab
by PurvaTijarePPTStab: Prediction and Designing of thermostable proteins with a desired melting temperature
mradermacher/Dans-PersonalityEngine-V1.2.0-24b-i1-GGUF
by mradermacherIf you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including on how to concatenate multi-part files.
한국어 모델을 이용한 SapBERT(Self-alignment pretraining for BERT)입니다. 한·영 의료 용어 사전인 KOSTOM을 사용해 한국어 용어와 영어 용어를 정렬했습니다. 참고: SapBERT, Original Code
DOEJGI/GenomeOcean-4B
by DOEJGIThis is the base model of GenomeOcean-4B. It is trained with Causal Language Modeling (CLM) and uses a BPE tokenizer with 4096 tokens. It supports a maximum sequence length of 10240 tokens (~50kbp).
StanfordShahLab/llama-base-4096-clmbr
by StanfordShahLabsonglab/gpn-brassicales
by songlab# GPN trained on Arabidopsis thaliana and 7 other Brassicales See https://github.com/songlab-cal/gpn for more details.
BiomedCLIP is a biomedical vision-language foundation model that is pretrained on PMC-15M, a dataset of 15 million figure-caption pairs extracted from biomedical research articles in PubMed Central, using contrastive learning.
FremyCompany/BioLORD-2023
by FremyCompany# FremyCompany/BioLORD-2023 This model was trained using BioLORD, a new pre-training strategy for producing meaningful representations for clinical sentences and biomedical concepts.