Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

27 of 5,940 resources

Nallo is a bioinformatics analysis pipeline for long reads from both PacBio and (targeted) ONT data, focused on rare disease. The pipeline detects a wide range of genetic variants, performs genome assembly, and reports CpG methylation. It also enables annotation and ranking of variants based on their predicted functional consequences.

Active · 67 · 4 days ago
Groovy
MIT

A static web application presents an interactive knowledge graph of single-cell long-read RNA sequencing literature synthesized from seven source papers. Users navigate mind-tree, network graph, guided learning-path, and Sankey views linking platforms, protocols, methods, and software. A benchmark tab provides 34 question-answer pairs with category and difficulty filters, exportable as JSON or CSV for LLM and agent evaluation.

Active · 0 · 5 days ago
JavaScript
MIT

Phylo-Movies is an open-source React and Flask web application, also available as a desktop app, for inspecting ordered phylogenetic tree series. It computes and visualizes subtree-prune-and-regraft transition frames between consecutive trees, helping users see which taxa or subtrees move across sliding-window analyses, bootstrap replicates, and curated tree-series comparisons. The viewer includes timeline playback, tree comparison, MSA context, coloring, analytics, image export, and recording tools.

Active · 1 · 5 days ago
HTML
MIT

Derives cells per well and suspension pipette volumes for standard 6-, 12-, 24-, 48-, 96-, and 384-well plates from a hemocytometer stock count, trypan blue viability, and target seeding confluency, with QC flags for low viability and impractical transfers. A browser calculator supports interactive planning with cell-line presets; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured plate tables and shareable run identifiers.

Active · 1 · 1 week ago
Python
MIT
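The core seeding arithmetic can be sketched as follows. This is a minimal local illustration, not the Pepkio Tools API or its client: the function name, parameter names, and the choice to take a target cell number per well directly (rather than deriving it from confluency) are all assumptions made for the example.

```python
def seeding_volumes(stock_cells_per_ml, viability, cells_per_well, well_volume_ul):
    """Return (suspension_ul, medium_ul) to pipette into one well.

    Hypothetical helper: viability is a fraction in (0, 1], e.g. 0.9 for 90 %.
    """
    if not 0 < viability <= 1:
        raise ValueError("viability must be a fraction in (0, 1]")
    # Only viable (trypan-blue-excluding) cells count toward the target.
    viable_per_ml = stock_cells_per_ml * viability
    # Volume of stock suspension that contains the target number of viable cells.
    suspension_ul = cells_per_well / viable_per_ml * 1000.0
    if suspension_ul > well_volume_ul:
        raise ValueError("stock is too dilute for this well volume")
    # Top up with fresh medium to reach the working volume.
    return suspension_ul, well_volume_ul - suspension_ul
```

For instance, `seeding_volumes(1e6, 0.9, 10_000, 100)` asks for 10,000 viable cells in a 100 µL well from a stock counted at 1e6 cells/mL with 90 % viability.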

Computes weighed laboratory buffer recipes from target pH, concentration, and volume, accounting for separate preparation and working temperatures when pKa shifts with temperature. Supports calculator mode from dry reagents and stock dilution mode, returning acid and base masses, ionic strength estimates, optional NaCl adjustment, gravimetric and titration routes, and stepwise protocols. A browser calculator supports interactive recipe entry with shareable links; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured recipe tables, compatibility warnings, and shareable run identifiers.

Active · 1 · 1 week ago
Python
MIT

Galaxy workflow for BlockClust pipeline.

Active · 123 · 1 week ago
R
MIT

Plans geometric serial dilution series for molecular biology and biochemistry workflows, rounding transfer volumes to declared pipette ranges and optional 96- or 384-well plate layouts. A browser calculator supports interactive protocol design; a Python client and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured step tables and shareable run identifiers.

Active · 1 · 1 week ago
Python
MIT
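A geometric serial dilution of the kind described above can be sketched locally as follows. This is an illustration only, not the Pepkio Tools API: the function and field names are hypothetical, and rounding to pipette ranges and plate layouts is left out.

```python
def serial_dilution(start_conc, factor, steps, mix_volume_ul):
    """Plan a geometric dilution series with a constant dilution factor.

    Hypothetical helper: each tube holds mix_volume_ul after the transfer
    and before the next aliquot is removed.
    """
    if factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    # The same transfer and diluent volumes apply at every step.
    transfer_ul = mix_volume_ul / factor
    diluent_ul = mix_volume_ul - transfer_ul
    return [
        {
            "step": i,
            "concentration": start_conc / factor ** i,
            "transfer_ul": transfer_ul,
            "diluent_ul": diluent_ul,
        }
        for i in range(1, steps + 1)
    ]
```

Calling `serial_dilution(100.0, 10, 3, 1000.0)` plans three tenfold steps, each made by moving 100 µL into 900 µL of diluent.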

Tool for converting raw DNA data files between 23andMe, AncestryDNA, MyHeritage, and FamilyTreeDNA formats.

Active · 1 · 2 weeks ago
PHP
MIT

Standalone browser-based Gene Ontology network viewer for exploring, filtering, searching, and exporting GO term and gene annotation neighborhoods from locally preprocessed GO OBO and GAF data.

Active · 0 · 3 weeks ago
TypeScript
MIT

Pathogensurveillance is a population genomics pipeline for pathogen identification, variant detection, and biosurveillance. The pipeline accepts paths to raw reads for one or more organisms and creates reports in the form of an interactive HTML document. Significant features include the ability to analyze unidentified eukaryotic and prokaryotic samples, creation of reports for multiple user-defined groupings of samples, automated discovery and downloading of reference assemblies from NCBI RefSeq, and rapid initial identification based on k-mer sketches followed by more robust multigene and SNP-based phylogenies.

Active · 60 · 1 month ago
Groovy
MIT

A Python script that converts positional information from a SAM dataset into interval format with a 0-based start and 1-based end. The CIGAR string of each SAM record is used to compute the end coordinate.

Active · 37 · 3 months ago
Python
MIT
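The coordinate conversion described for this tool can be sketched as below. This is an independent illustration, not the script's actual code; the function name is hypothetical. It relies on the SAM convention that only the CIGAR operations M, D, N, = and X consume reference bases.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def sam_to_interval(pos, cigar):
    """Convert a 1-based SAM POS and CIGAR string to (start, end).

    start is 0-based; end equals the 1-based position of the last
    aligned reference base.
    """
    start = pos - 1
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return start, start + ref_len
```

For a read at POS 100 with CIGAR `10M2I5D3S`, the insertion and soft clip are skipped, so 15 reference bases are covered and the interval is `(99, 114)`.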

A Python extension, written in C, for quick access to bigBed files and for access to and creation of bigWig files.

Active · 244 · 5 months ago
C
MIT

Minigraph is a sequence-to-graph mapper and graph constructor. For graph generation, it aligns a query sequence against a sequence graph and incrementally augments an existing graph with long query subsequences diverged from the graph.

Idle · 481 · 10 months ago
C
MIT

In silico derivatization for GC. The GC-derivatization tool converts carbonyl groups to C═N-OCH3 (MeOX) and transforms acidic protons into -Si(CH3)3 (TMS). Key functionalities include checking for specific groups, removing derivatization groups, and adding derivatization groups to molecules.

Stale · 1 · 2 years ago
Jupyter Notebook
MIT

A cookiecutter template for bioinformatics projects, with a focus on building bioinformatics workflows that can run on the MPI-IE cluster according to FAIR principles.

Stale · 14 · 3 years ago
Python
MIT

MITObim - mitochondrial baiting and iterative mapping

Stale · 116 · 5 years ago
Perl
MIT

NanoSV is a software package that can be used to identify structural genomic variations in long-read sequencing data, such as data produced by Oxford Nanopore Technologies’ MinION, GridION or PromethION instruments, or Pacific Biosciences RSII or Sequel sequencers.

Stale · 92 · 6 years ago
Python
MIT

RBPBench is a multi-function tool to evaluate CLIP-seq and other related genomic region data using a comprehensive collection of known RNA-binding protein (RBP) binding motifs. RBPBench can be used for a variety of purposes, from RBP motif search (database or user-supplied RBP motifs) in genomic regions, over motif enrichment and co-occurrence analysis, in-depth comparisons over multiple datasets via sequence and genomic annotation statistics, to benchmarking CLIP-seq peak caller methods as well as comparisons across cell types and CLIP-seq protocols. RBPBench supports both sequence and structure motifs, as well as regular expressions (sequence and structure patterns). Moreover, users can easily provide their own motif collections.

Tool to generate a count matrix for expression data in Galaxy. generate_count_matrix reads in one or more input text files with expression counts and produces a single combined file. Each input will have a column in the matrix containing expression values. The column containing gene (or feature) names should be identical for all input count files.
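The merge described above can be illustrated with a short sketch. This is not the Galaxy tool's implementation; the function name and the two-column, tab-separated input layout (feature name, then count) are assumptions for the example.

```python
def combine_counts(inputs):
    """Combine per-sample count tables into one matrix (list of rows).

    inputs: dict mapping a sample name to an iterable of
    'feature<TAB>count' lines. Hypothetical layout for illustration.
    """
    genes = None
    columns = {}
    for sample, lines in inputs.items():
        rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
        names = [row[0] for row in rows]
        if genes is None:
            genes = names
        elif names != genes:
            # The feature column must be identical across all inputs.
            raise ValueError(f"feature column of {sample!r} differs from the first input")
        columns[sample] = [row[1] for row in rows]
    header = ["gene"] + list(columns)
    body = [[g] + [columns[s][i] for s in columns] for i, g in enumerate(genes or [])]
    return [header] + body
```

Each input contributes one column, and a mismatch in the feature column is rejected rather than silently reordered.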

Phantasus is a web application for visual and interactive gene expression analysis. It is based on Morpheus, a web-based tool for heatmap visualisation and analysis, which was integrated with an R environment via the OpenCPU API. Aside from basic visualization and filtering methods, R-based methods such as k-means clustering, principal component analysis, and differential expression analysis with the limma package are supported.

Membrane Protein-Lipid Interaction Database. A large-scale experimentally validated dataset of 80685 residue-level lipid contact annotations across 4712 membrane proteins derived from PDB crystal and cryo-EM structures. Provides pre-computed binary contact labels, continuous distance values, sequence-identity-based cluster assignments, and ready-made train-validation-test splits for machine learning.

Pipeline that screens bacterial assemblies (contigs/CDS or proteins) for the presence of genes of interest (GOIs), given as nucleotide or protein sequences. It generates multiple CSVs and plots that describe which genes are present and how variable their sequences are. Query sequences (GOIs) can be DNA or protein, and the database (db) to search in can be DNA contigs/FASTA files or protein FASTA files.

maeparser is a parser for Schrodinger Maestro files.

Processes 96-well plate absorbance data through blank subtraction, regression fitting, and dilution correction to report sample concentrations with QC flags for BCA, Bradford, and ELISA workflows. A browser calculator supports interactive grid entry with CSV and PDF export; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits plate layout and absorbance values and returns model comparison, per-sample concentrations, and shareable run identifiers.
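The three processing steps named above (blank subtraction, regression fitting, dilution correction) can be sketched locally as follows. This is an illustration under simplifying assumptions, not the Pepkio Tools API: it fits only a straight line by ordinary least squares, and all function and parameter names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sample_concentration(std_conc, std_abs, blank_abs, sample_abs, dilution_factor=1.0):
    """Read a sample concentration off a linear standard curve."""
    # 1. Blank subtraction on standards and sample alike.
    corrected = [a - blank_abs for a in std_abs]
    # 2. Regression of corrected absorbance on known concentration.
    slope, intercept = fit_line(std_conc, corrected)
    # 3. Invert the curve, then correct for the sample's dilution.
    return ((sample_abs - blank_abs) - intercept) / slope * dilution_factor
```

Real BCA, Bradford, and ELISA curves are often nonlinear at the high end, so a linear fit is only reasonable within the assay's linear range.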

Estimates PCR primer melting temperatures and polymerase-specific annealing temperatures from sequence and buffer inputs, with per-pair QC for hairpins, dimers, and Tm balance. A browser calculator supports interactive single-pair and batch entry (up to 200 pairs) with method comparison and export; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic for the API client is hosted remotely; sequences are transmitted for programmatic runs while the web interface performs calculations in the browser.

Plans PCR and qPCR master-mix reagent volumes from stock and final concentrations, reaction counts, and pipetting overage, with consolidated totals when several assays are prepared together. A browser calculator supports interactive recipe entry with printable bench sheets; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured volume tables, dilution warnings, and shareable run identifiers.
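The underlying master-mix arithmetic follows from C1·V1 = C2·V2 and can be sketched as below. This is a local illustration, not the Pepkio Tools API; the function name, the `(stock, final)` tuple layout, and the decision to fill the remaining volume with water are assumptions for the example.

```python
def master_mix(reagents, reaction_volume_ul, n_reactions, overage=0.1):
    """Total volume of each component for a batch of reactions.

    reagents: dict mapping a name to (stock_conc, final_conc) in the
    same unit. overage is the fractional pipetting excess (0.1 = 10 %).
    """
    scale = n_reactions * (1 + overage)
    totals = {}
    per_reaction_sum = 0.0
    for name, (stock, final) in reagents.items():
        # C1 * V1 = C2 * V2, solved for V1 per reaction.
        per_reaction = final / stock * reaction_volume_ul
        per_reaction_sum += per_reaction
        totals[name] = per_reaction * scale
    if per_reaction_sum > reaction_volume_ul:
        raise ValueError("reagent volumes exceed the reaction volume")
    # Whatever volume is left is made up with water.
    totals["water"] = (reaction_volume_ul - per_reaction_sum) * scale
    return totals
```

For ten 20 µL reactions with 10 % overage, a 10x buffer used at 1x contributes 2 µL per reaction, or 22 µL in total. Template volume is not modeled here and would reduce the water accordingly.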

Computes laboratory solution preparation parameters—powder mass to weigh, stock and diluent volumes for single dilutions, and multi-step serial concentration tables—with correction for hydrated salts and supplier purity. A browser calculator supports interactive prep planning with saved recipes and shareable links; a Python client and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured protocol steps and shareable run identifiers.
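The powder-mass calculation with purity correction can be sketched as follows. This is a local illustration, not the Pepkio Tools API; the function and parameter names are hypothetical. For a hydrated salt, the formula weight passed in should be that of the hydrate actually being weighed.

```python
def mass_to_weigh(molarity, volume_l, formula_weight, purity=1.0):
    """Grams of powder for a solution of the given molarity and volume.

    formula_weight is in g/mol (use the hydrate's value for hydrated
    salts); purity is the supplier-stated fraction, e.g. 0.99.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be a fraction in (0, 1]")
    # moles needed * g/mol, scaled up to compensate for impurities.
    return molarity * volume_l * formula_weight / purity
```

For example, 1 L of 0.5 M NaCl (58.44 g/mol) needs 29.22 g of pure salt, and slightly more if the powder is only 99 % pure.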