Find open-source science resources
A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.
Filters
Health
Domain
Language
License(1)
Source(1)
Type
31 of 5,893 resources
Official Jupyter extension with `%%ai` magic commands and sidebar chat assistant, connecting multiple model providers and local inference
A Workflow Management System geared towards scientific workflows.
Deep learning software to decode EEG, ECG or MEG signals, providing standardized neural network models, preprocessing pipelines, and evaluation workflows for brain-computer interfaces and cognitive neuroscience research (1.2K+ stars, BSD 3-Clause, actively maintained)
Fast, interactive, multi-dimensional image viewer for Python, foundational platform for scientific imaging AI with a rich plugin ecosystem integrating deep learning segmentation, object tracking, and microscopy analysis workflows (2.6K+ stars)
GO Rules are a way of documenting the set of filters and reports that should apply to GAF annotation data. Some rules are expressed as SPARQL on a triplestore, some are code in the GAF parsing software, ontobio.
SSSOM is a Simple Standard for Sharing Ontological Mappings, providing - a TSV-based representation for ontology term mappings - a comprehensive set of standard metadata elements to describe mappings and - a standard translation between the TSV and the Web Ontology Language (OWL). Most metadata elements, such as "sssom:mapping_justification" are defined in the sssom namespace.
Generalist deep learning algorithm for cell and nucleus segmentation across diverse image types, with human-in-the-loop training (2.0) and one-click image restoration (3.0), 70K+ training objects (Nature Methods 2021/2022/2025)
Structural variant discovery by integrated paired-end and split-read analysis.
Machine learning and statistical learning for neuroimaging in Python, providing easy-to-use tools for fMRI and MRI analysis including decoding, connectivity estimation, and parcellation with seamless scikit-learn integration (INRIA Parietal team, 1.4K+ stars)
The Chromosome Ontology is an automatically derived ontology of chromosomes and chromosome parts.
Vitro is a full stack framework for building semantic web applications. It is not domain specific.
This package provides an R wrapper for the popular Bowtie2 sequencing read aligner, optimized to run on NVIDIA graphics cards. It includes wrapper functions that enable both genome indexing and alignment to the generated indexes, ensuring high performance and ease of use within the R environment.
Probabilistic framework for inferring cell fate decisions and trajectory dynamics from multi-view single-cell data using Markov chains and machine learning, integrating RNA velocity, pseudotime, and metabolic labeling to predict differentiation paths and terminal states (scverse/Theis Lab, 449+ stars, BSD 3-Clause)
Manipulation and analysis of geometric objects.
Deep learning-based variant caller
A library containing basis sets for use in quantum chemistry calculations. In addition, this library has functionality for manipulation of basis set data.
A package for working with nuclear magnetic resonance (NMR) data including functions for reading common binary file formats and processing NMR data.
flowcatchR is a set of tools to analyze in vivo microscopy imaging data, focused on tracking flowing blood cells. It guides the steps from segmentation to calculation of features, filtering out particles not of interest, providing also a set of utilities to help checking the quality of the performed operations (e.g. how good the segmentation was). It allows investigating the issue of tracking flowing cells such as in blood vessels, to categorize the particles in flowing, rolling and adherent. This classification is applied in the study of phenomena such as hemostasis and study of thrombosis development. Moreover, flowcatchR presents an integrated workflow solution, based on the integration with a Shiny App and Jupyter notebooks, which is delivered alongside the package, and can enable fully reproducible bioimage analysis in the R environment.
Using single-cell RNA-Seq expression to visualize CNV in cells.
State-specific protein-ligand complex structure prediction with a multi-scale deep generative model, enabling conformational state-aware modeling of molecular interactions (329+ stars, 2024)
A module for solving and visualizing the Schrödinger equation.
zitools allows for zero inflated count data analysis by either using down-weighting of excess zeros or by replacing an appropriate proportion of excess zeros with NA. Through overloading frequently used statistical functions (such as mean, median, standard deviation), plotting functions (such as boxplots or heatmap) or differential abundance tests, it allows a wide range of downstream analyses for zero-inflated data in a less biased manner. This becomes applicable in the context of microbiome analyses, where the data is often overdispersed and zero-inflated, therefore making data analysis extremly challenging.
A Deep Learning Library for Compound and Protein Modeling DTI, Drug Property, PPI, DDI, Protein Function Prediction.
Psi4-based reference implementations and Jupyter notebook-based tutorials for foundational quantum chemistry methods.
Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals.
This package offers an interface to NDEx servers, e.g. the public server at http://ndexbio.org/. It can retrieve and save networks via the API. Networks are offered as RCX object and as igraph representation.
Open Drug Discovery Toolkit, a modular and comprehensive toolkit for use in cheminformatics, molecular modeling etc.
Jupyter Widgets to interact with molecular datasets.
Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains. Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.
An ontology that represents the basic knowledge of physical, chemical and functional characteristics of nanotechnology as used in cancer diagnosis and therapy.