Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

39 of 6,007 resources

Module for single-cell data extraction given a segmentation mask and multi-channel image.

Active · 154 · 2 days ago
Nextflow
MIT

Performs laboratory unit conversions across molarity, OD600 cell density, C₁V₁ dilution, and related dimensional pairs from mass, volume, molecular weight, and organism-specific OD factors. A browser calculator combines four modes in one tabbed workspace with compound MW lookup, species-aware OD uncertainty ranges, cross-tab chaining, and shareable links; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted use. Calculator arithmetic for the API client is hosted remotely; the client transmits conversion inputs and returns structured results and shareable run identifiers.

Active · 1 · 3 days ago
Python
MIT
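
The C₁V₁ dilution and molarity conversions named in this entry reduce to two short formulas. The following is a minimal sketch of that arithmetic only; the function names are hypothetical, and it does not call or reproduce the Pepkio Tools API.

```python
def stock_volume(c1: float, c2: float, v2: float) -> float:
    """Solve C1*V1 = C2*V2 for V1, the stock volume to pipette.

    c1 and c2 must share a concentration unit; the result is in v2's unit.
    """
    if c1 <= 0 or c2 < 0 or v2 < 0:
        raise ValueError("c1 must be positive; c2 and v2 must be non-negative")
    if c2 > c1:
        raise ValueError("target concentration cannot exceed the stock concentration")
    return c2 * v2 / c1


def mass_for_molarity(molarity_mol_per_l: float, volume_l: float,
                      mw_g_per_mol: float) -> float:
    """Mass in grams for a solution of given molarity: m = M * V * MW."""
    return molarity_mol_per_l * volume_l * mw_g_per_mol


if __name__ == "__main__":
    final_ml = 50.0
    v1 = stock_volume(c1=10.0, c2=1.0, v2=final_ml)
    print(f"stock: {v1} mL, diluent: {final_ml - v1} mL")
    # NaCl molecular weight taken as 58.44 g/mol
    print(f"NaCl for 1 L of 0.5 M: {mass_for_molarity(0.5, 1.0, 58.44):.2f} g")
```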

Calculates sequence-derived molecular properties and related laboratory planning outputs from FASTA and assay setup inputs. The tool supports sequence analysis for DNA, RNA, and protein entries, plus dilution and ligation calculation modes through one API-backed workflow. Programmatic use is available through a Python library and command-line interface that submit run payloads and return structured result objects.

Active · 1 · 3 days ago
Python
MIT

Translates between centrifuge RPM and relative centrifugal force using rotor geometry, reporting g-force or speed at rmin, ravg, and rmax. Convert mode handles rpm_to_rcf and rcf_to_rpm with rotor presets or manual radii in mm; transfer mode maps a source RPM on one rotor to an equivalent target RPM at matched rmax RCF; batch mode processes multiple spin steps from CSV or row arrays. A browser calculator and a Python library with command-line interface submit the same parameters to the Pepkio Tools API and return structured results with optional methods text and safety warnings.

Active · 1 · 3 days ago
Python
MIT
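
The RPM and RCF conversion in this entry follows from the centripetal acceleration ω²r divided by standard gravity. The sketch below illustrates that relationship and the matched-RCF rotor transfer under stated assumptions: the function names are hypothetical, radii are in millimetres, and nothing here reflects the tool's actual client interface.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def rpm_to_rcf(rpm: float, radius_mm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given radius."""
    omega = 2.0 * math.pi * rpm / 60.0          # angular speed in rad/s
    return omega ** 2 * (radius_mm / 1000.0) / STANDARD_GRAVITY


def rcf_to_rpm(rcf: float, radius_mm: float) -> float:
    """Rotor speed (RPM) needed to reach a given RCF at a given radius."""
    omega = math.sqrt(rcf * STANDARD_GRAVITY / (radius_mm / 1000.0))
    return omega * 60.0 / (2.0 * math.pi)


def transfer_rpm(source_rpm: float, source_rmax_mm: float,
                 target_rmax_mm: float) -> float:
    """RPM on the target rotor that matches the source rotor's RCF at rmax."""
    return rcf_to_rpm(rpm_to_rcf(source_rpm, source_rmax_mm), target_rmax_mm)


if __name__ == "__main__":
    print(f"{rpm_to_rcf(10_000, 100):.0f} x g at 100 mm and 10,000 RPM")
    print(f"{transfer_rpm(10_000, 100, 400):.0f} RPM on a 400 mm rotor")
```

Because RCF scales with the radius and with the square of the speed, quadrupling rmax halves the RPM needed for the same force.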

Performs batch four-parameter and five-parameter logistic regression on multi-compound concentration–response screens to estimate IC50, EC50, pIC50, Hill slope, and related potency metrics with per-compound QC grades. A browser calculator supports CSV upload, curve review, and figure export; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits concentration–response data and returns structured fit results and shareable run identifiers.

Active · 2 · 3 days ago
Python
MIT
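
For orientation, the four-parameter logistic model referred to in this entry can be written in a few lines. This sketch only evaluates the curve and converts IC50 to pIC50; it does not perform the regression or QC grading, the parameterization (response falling from `top` to `bottom` as concentration rises) is one common convention assumed here, and the function names are hypothetical.

```python
import math


def four_pl(x: float, bottom: float, top: float, ic50: float, hill: float) -> float:
    """Four-parameter logistic response at concentration x (x >= 0).

    At x == ic50 the response is midway between top and bottom.
    """
    return bottom + (top - bottom) / (1.0 + (x / ic50) ** hill)


def pic50(ic50_molar: float) -> float:
    """pIC50 is the negative base-10 logarithm of IC50 expressed in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


if __name__ == "__main__":
    for conc in (1e-8, 1e-7, 1e-6, 1e-5, 1e-4):
        response = four_pl(conc, bottom=0.0, top=100.0, ic50=1e-6, hill=1.0)
        print(f"{conc:.0e} M -> {response:.1f} %")
    print(f"pIC50 = {pic50(1e-6):.2f}")
```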

Nallo is a bioinformatics analysis pipeline for long reads from both PacBio and (targeted) ONT data, focused on rare disease. The pipeline detects a wide range of genetic variants, performs genome assembly, and reports CpG methylation. It also annotates and ranks variants by their predicted functional consequences.

Active · 67 · 1 week ago
Groovy
MIT

A static web application presents an interactive knowledge graph of single-cell long-read RNA sequencing literature synthesized from seven source papers. Users navigate mind-tree, network graph, guided learning-path, and Sankey views linking platforms, protocols, methods, and software. A benchmark tab provides 34 question-answer pairs with category and difficulty filters, exportable as JSON or CSV for LLM and agent evaluation.

Active · 0 · 1 week ago
JavaScript
MIT

Phylo-Movies is an open-source React and Flask web application, also available as a desktop app, for inspecting ordered phylogenetic tree series. It computes and visualizes subtree-prune-and-regraft transition frames between consecutive trees, helping users see which taxa or subtrees move across sliding-window analyses, bootstrap replicates, and curated tree-series comparisons. The viewer includes timeline playback, tree comparison, MSA context, coloring, analytics, image export, and recording tools.

Active · 1 · 1 week ago
Python
MIT

Derives cells per well and suspension pipette volumes for standard 6-, 12-, 24-, 48-, 96-, and 384-well plates from a hemocytometer stock count, trypan blue viability, and target seeding confluency, with QC flags for low viability and impractical transfers. A browser calculator supports interactive planning with cell-line presets; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured plate tables and shareable run identifiers.

Active · 1 · 1 week ago
Python
MIT

Computes weighed laboratory buffer recipes from target pH, concentration, and volume, accounting for separate preparation and working temperatures when pKa shifts with temperature. Supports calculator mode from dry reagents and stock dilution mode, returning acid and base masses, ionic strength estimates, optional NaCl adjustment, gravimetric and titration routes, and stepwise protocols. A browser calculator supports interactive recipe entry with shareable links; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured recipe tables, compatibility warnings, and shareable run identifiers.

Active · 1 · 1 week ago
Python
MIT

Galaxy workflow for BlockClust pipeline.

Active · 123 · 1 week ago
R
MIT

Plans geometric serial dilution series for molecular biology and biochemistry workflows, rounding transfer volumes to declared pipette ranges and optional 96- or 384-well plate layouts. A browser calculator supports interactive protocol design; a Python client and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured step tables and shareable run identifiers.

Active · 1 · 1 week ago
Python
MIT
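
A geometric serial dilution with rounded transfer volumes, as described in this entry, might be planned along these lines. This is a sketch under simplifying assumptions: every tube is mixed to the same volume, rounding uses a single pipette increment, and the function and field names are hypothetical rather than taken from the tool.

```python
def plan_serial_dilution(start_conc: float, factor: float, steps: int,
                         mix_volume_ul: float, increment_ul: float = 0.1):
    """Return one row per dilution step.

    Each tube receives `transfer_ul` from the previous tube plus `diluent_ul`
    of fresh diluent, so the mixed volume is `mix_volume_ul`.
    """
    if factor <= 1 or steps < 1 or mix_volume_ul <= 0 or increment_ul <= 0:
        raise ValueError("need factor > 1, steps >= 1, and positive volumes")
    # Round the ideal transfer volume to the nearest pipette increment.
    transfer = round(mix_volume_ul / factor / increment_ul) * increment_ul
    if transfer <= 0:
        raise ValueError("transfer volume rounds to zero; use a larger mix volume")
    diluent = mix_volume_ul - transfer
    # Rounding can shift the achieved factor away from the requested one.
    achieved_factor = mix_volume_ul / transfer
    return [
        {
            "step": k,
            "transfer_ul": transfer,
            "diluent_ul": diluent,
            "concentration": start_conc / achieved_factor ** k,
        }
        for k in range(1, steps + 1)
    ]


if __name__ == "__main__":
    for row in plan_serial_dilution(100.0, factor=10, steps=3, mix_volume_ul=1000.0):
        print(row)
```

Reporting the concentration from the achieved factor rather than the requested one keeps the table honest when rounding changes the transfer volume.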

Tool for converting raw DNA data files between 23andMe, AncestryDNA, MyHeritage, and FamilyTreeDNA formats.

Active · 1 · 2 weeks ago
PHP
MIT

Standalone browser-based Gene Ontology network viewer for exploring, filtering, searching, and exporting GO term and gene annotation neighborhoods from locally preprocessed GO OBO and GAF data.

Active · 0 · 4 weeks ago
TypeScript
MIT

Pathogensurveillance is a population genomics pipeline for pathogen identification, variant detection, and biosurveillance. The pipeline accepts paths to raw reads for one or more organisms and creates reports in the form of an interactive HTML document. Notable features include the ability to analyze unidentified eukaryotic and prokaryotic samples, creation of reports for multiple user-defined groupings of samples, automated discovery and downloading of reference assemblies from NCBI RefSeq, and rapid initial identification based on k-mer sketches followed by more robust multigene and SNP-based phylogenies.

Active · 60 · 1 month ago
Groovy
MIT

RFantibody is a pipeline for structure-based de novo antibody and nanobody design, integrating backbone design with RFdiffusion, sequence design with ProteinMPNN, and structure prediction with RoseTTAFold2. It provides a comprehensive toolset for generating and filtering high-quality antibody designs.

Active · 503 · 3 months ago
Shell
MIT

A Python script that converts positional information from a SAM dataset into interval format with a 0-based start and a 1-based end. The CIGAR string of each SAM record is used to compute the end coordinate.

Active · 37 · 3 months ago
Python
MIT
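
The coordinate conversion described in this entry can be illustrated as follows. This is an independent sketch, not the listed script: the function name is hypothetical, and it relies only on the SAM convention that POS is 1-based and that the CIGAR operations M, D, N, = and X consume reference bases.

```python
import re

# CIGAR operations that advance along the reference sequence.
REF_CONSUMING = frozenset("MDN=X")
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP=X])+")


def sam_to_interval(pos: int, cigar: str) -> tuple[int, int]:
    """Convert a SAM POS/CIGAR pair to (start, end).

    start is 0-based; end equals the 1-based position of the last aligned
    reference base, which is the same number as a 0-based exclusive end.
    """
    if pos < 1:
        raise ValueError("POS must be 1-based and positive (read may be unmapped)")
    if not _CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"unsupported or missing CIGAR string: {cigar!r}")
    ref_length = sum(int(length) for length, op in _CIGAR_OP.findall(cigar)
                     if op in REF_CONSUMING)
    start = pos - 1
    return start, start + ref_length


if __name__ == "__main__":
    # 10M + 3D + 20M consume 33 reference bases; soft clips and insertions do not.
    print(sam_to_interval(100, "5S10M2I3D20M"))
```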

A Python extension, written in C, for quick access to bigBed files and for access to and creation of bigWig files.

Active · 244 · 5 months ago
C
MIT

Minigraph is a sequence-to-graph mapper and graph constructor. For graph generation, it aligns a query sequence against a sequence graph and incrementally augments an existing graph with long query subsequences diverged from the graph.

Idle · 481 · 10 months ago
C
MIT

In silico derivatization for GC. The GC-derivatization tool converts carbonyl groups to C═N-OCH3 (MeOX) and transforms acidic protons into -Si(CH3)3 (TMS). Key functionalities include checking for specific groups, removing derivatization groups, and adding derivatization groups to molecules.

Stale · 1 · 2 years ago
Jupyter Notebook
MIT

A cookiecutter template for bioinformatics projects, with a focus on building bioinformatics workflows that can run on the MPI-IE cluster according to FAIR principles.

Stale · 14 · 3 years ago
Python
MIT

MITObim - mitochondrial baiting and iterative mapping

Stale · 116 · 5 years ago
Perl
MIT

NanoSV is a software package that can be used to identify structural genomic variations in long-read sequencing data, such as data produced by Oxford Nanopore Technologies’ MinION, GridION or PromethION instruments, or Pacific Biosciences RSII or Sequel sequencers.

Stale · 92 · 6 years ago
Python
MIT

RBPBench is a multi-function tool to evaluate CLIP-seq and other related genomic region data using a comprehensive collection of known RNA-binding protein (RBP) binding motifs. RBPBench can be used for a variety of purposes: RBP motif search (database or user-supplied RBP motifs) in genomic regions, motif enrichment and co-occurrence analysis, in-depth comparisons across multiple datasets via sequence and genomic annotation statistics, benchmarking of CLIP-seq peak callers, and comparisons across cell types and CLIP-seq protocols. RBPBench supports both sequence and structure motifs, as well as regular expressions (sequence and structure patterns). Users can also provide their own motif collections.

Tool to generate a count matrix for expression data in Galaxy. generate_count_matrix reads in one or more input text files with expression counts and produces a single combined file. Each input will have a column in the matrix containing expression values. The column containing gene (or feature) names should be identical for all input count files.

Phantasus is a web application for visual and interactive gene expression analysis. It is based on Morpheus, a web-based tool for heatmap visualization and analysis, which was integrated with an R environment via the OpenCPU API. Aside from basic visualization and filtering methods, R-based methods such as k-means clustering, principal component analysis, and differential expression analysis with the limma package are supported.

Membrane Protein-Lipid Interaction Database. A large-scale experimentally validated dataset of 80685 residue-level lipid contact annotations across 4712 membrane proteins derived from PDB crystal and cryo-EM structures. Provides pre-computed binary contact labels, continuous distance values, sequence-identity-based cluster assignments, and ready-made train-validation-test splits for machine learning.

Screen a bacterial assembly (contigs/CDS or proteins) for nucleotide or protein sequences. Pipeline that screens for presence of genes of interest (GOI) in bacterial assemblies. Generates multiple CSVs and plots that describe which genes are present and how variable their sequence is. Can use DNA or protein query sequences (GOIs) and DNA contigs/fastas or protein fastas as database (db) to search in.

maeparser is a parser for Schrödinger Maestro files.

Estimates PCR primer melting temperatures and polymerase-specific annealing temperatures from sequence and buffer inputs, with per-pair QC for hairpins, dimers, and Tm balance. A browser calculator supports interactive single-pair and batch entry (up to 200 pairs) with method comparison and export; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic for the API client is hosted remotely; sequences are transmitted for programmatic runs while the web interface performs calculations in the browser.

Plans PCR and qPCR master-mix reagent volumes from stock and final concentrations, reaction counts, and pipetting overage, with consolidated totals when several assays are prepared together. A browser calculator supports interactive recipe entry with printable bench sheets; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted and pipeline use. Calculator arithmetic is hosted remotely; the client transmits parameters and returns structured volume tables, dilution warnings, and shareable run identifiers.
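
The master-mix arithmetic described in this entry amounts to a C₁V₁ calculation per reagent, scaled by the reaction count and overage. The sketch below shows one way to lay that out; the function name, the `components` mapping, and the default 10% overage are illustrative assumptions, not the tool's interface or defaults.

```python
def master_mix(reaction_volume_ul: float, n_reactions: int,
               components: dict[str, tuple[float, float]],
               overage: float = 0.10, template_ul: float = 0.0) -> dict[str, float]:
    """Total volume (uL) of each master-mix component.

    components maps a reagent name to (stock_conc, final_conc) in the same unit.
    Template is added per well, so it is excluded from the mix but reserves space.
    """
    per_reaction = {
        name: final / stock * reaction_volume_ul
        for name, (stock, final) in components.items()
    }
    water = reaction_volume_ul - template_ul - sum(per_reaction.values())
    if water < 0:
        raise ValueError("reagent volumes exceed the reaction volume")
    per_reaction["water"] = water
    scale = n_reactions * (1.0 + overage)   # extra volume covers pipetting loss
    return {name: volume * scale for name, volume in per_reaction.items()}


if __name__ == "__main__":
    totals = master_mix(
        20.0, 10,
        {"buffer": (10.0, 1.0), "primer_mix": (10.0, 0.5)},
        overage=0.10, template_ul=2.0,
    )
    for name, volume in totals.items():
        print(f"{name}: {volume:.1f} uL")
```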

Bias factorized, base-resolution deep learning models of chromatin accessibility (chromBPNet).

Translates spectrophotometer and NanoDrop readings into mass and molar concentrations for dsDNA, ssDNA, ssRNA, and protein from a single anchor input, with optional sequence-specific nearest-neighbor extinction coefficients. A browser calculator supports bidirectional unit conversion, batch processing of up to ninety-six NanoDrop export rows, and A260/A280 purity interpretation with plain-language warnings; a REST API exposes converter, batch, and purity modes for scripted use. Calculator arithmetic is hosted remotely; API clients transmit parameters and return structured result fields and shareable run identifiers.

Constructs Punnett squares and offspring genotype and phenotype ratios for complete, incomplete, codominant, ABO multiple-allele, and sex-linked Mendelian crosses from parent genotypes, with step-by-step walkthroughs and reduced ratio output. A browser calculator provides live grids, textbook presets, PNG and SVG export, and shareable links; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted use. Calculator arithmetic for the API client is hosted remotely; the client transmits cross inputs and returns structured grids, ratios, walkthroughs, and shareable run identifiers.
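
As a small illustration of the simplest case this entry covers, a single-gene Punnett square can be enumerated by pairing every gamete from one parent with every gamete from the other. The sketch assumes two-character genotypes such as `Aa` with the dominant allele in upper case; the function name is hypothetical, and sex-linked or multiple-allele crosses are out of scope.

```python
from collections import Counter
from functools import reduce
from itertools import product
from math import gcd


def monohybrid_cross(parent1: str, parent2: str) -> dict[str, int]:
    """Reduced offspring genotype ratio for a single-gene cross."""
    if len(parent1) != 2 or len(parent2) != 2:
        raise ValueError("genotypes must have exactly two alleles, e.g. 'Aa'")

    def normalize(a: str, b: str) -> str:
        # Write the upper-case (dominant) allele first so 'aA' and 'Aa' match.
        return "".join(sorted((a, b), key=lambda x: (x.lower(), x.islower())))

    # Each cell of the Punnett square pairs one gamete from each parent.
    counts = Counter(normalize(a, b) for a, b in product(parent1, parent2))
    divisor = reduce(gcd, counts.values())
    return {genotype: n // divisor for genotype, n in counts.items()}


if __name__ == "__main__":
    print(monohybrid_cross("Aa", "Aa"))
    print(monohybrid_cross("AA", "aa"))
```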

Evaluates Hardy-Weinberg equilibrium for diploid loci with two to six alleles using chi-square and Guo-Thompson exact tests, inbreeding coefficient F, and plain-language verdicts from observed genotype counts, allele frequencies, or biallelic disease incidence. A browser calculator provides De Finetti plots, export, and Wright-Fisher simulation under selection, drift, mutation, and migration; a Python library and command-line tool submit the same parameters to the Pepkio Tools API for scripted use. Calculator arithmetic for the API client is hosted remotely; the client transmits genotype or simulation inputs and returns structured results and shareable run identifiers.
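
For the biallelic case, the chi-square test and inbreeding coefficient mentioned in this entry can be sketched with the standard library alone. The function name and return fields are hypothetical; the sketch uses one degree of freedom, for which the chi-square survival function equals erfc(√(χ²/2)), and it omits the exact test and multi-allele handling.

```python
import math


def hwe_chi_square(n_hom1: int, n_het: int, n_hom2: int) -> dict[str, float]:
    """Chi-square test of Hardy-Weinberg proportions for a biallelic locus."""
    n = n_hom1 + n_het + n_hom2
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_hom1 + n_het) / (2 * n)      # frequency of the first allele
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_hom1, n_het, n_hom2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return {
        "p": p,
        "chi2": chi2,
        "p_value": math.erfc(math.sqrt(chi2 / 2.0)),   # 1 degree of freedom
        "F": 1.0 - n_het / expected[1],                # heterozygote deficit
    }


if __name__ == "__main__":
    print(hwe_chi_square(298, 489, 213))
```

A positive F indicates fewer heterozygotes than expected under equilibrium, and a negative F indicates an excess.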

Generates pre-miRNA and mature miRNA count tables from read alignments to pre-miRNA sequences and a GFF file, both downloaded from miRBase. Also produces read coverage plots of pre-miRNAs.

Create MSP files containing the isotopic patterns for given molecules with given adducts. The tool is based on enviPat and the RforMassSpectrometry toolbox.

Automated strain separation of low-complexity metagenomes

Standalone C library for assembling Illumina short reads in small regions