Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

16 of 5,923 resources

Provides functionality for producing geometric representations of protein and RNA structures, and biological interaction networks.

Active · 1.2K · 3 weeks ago
Jupyter Notebook
MIT

Directed message passing neural networks for property prediction of molecules and reactions with uncertainty and interpretation.

Active · 2.4K · 1 month ago
Python
NOASSERTION

Descriptor library containing a variety of fingerprinting techniques, including the Smooth Overlap of Atomic Positions (SOAP).

Active · 466 · 1 month ago
C++
Apache-2.0

A deep learning library for compound and protein modeling: DTI, drug property, PPI, DDI, and protein function prediction.

Stale · 1.2K · 2 years ago
Jupyter Notebook
BSD-3-Clause

Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals.

Archived · 556 · 3 years ago
Jupyter Notebook
BSD-3-Clause

A deep learning framework (based on Chainer) with applications in Biology and Chemistry.

Stale · 700 · 3 years ago
Python
MIT

Enables machine learning on three-dimensional molecular structure.

Stale · 319 · 3 years ago
Python
MIT

A molecular representation learning framework that is robust to distribution shifts.

Stale · 61 · 3 years ago
Python
MIT

Molecular property prediction with a unified API for diverse models and representations.

A Python package for optimizing chemical reactions using machine learning (contains 10 algorithms and several benchmarks).

Library of descriptors to aid in the data-mining of materials properties, created by the Lawrence Berkeley National Laboratory.

Aims to provide useful high-level interfaces that make ML for materials science as easy as possible.

Library for fast calculations of **mo**lecula**r** **fe**at**u**re**s** from 3D structures for machine learning with a focus on steric descriptors.

Ensemble of automated machine learning protocols that can be run sequentially through a single command line. The program works for regression and classification problems.

Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation.

Library with several compositional and structural material descriptors, along with a few pre-trained neural network models of material properties.