Find open-source science resources
A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.
Filters
Health
Domain
Language(1)
License
Source
Type(1)
16 of 5,893 resources
A compressor of common genomic file formats (BAM, CRAM, FASTQ, VCF etc).
samtools/bcftools are a suite of tools for manipulating NGS data and can be used to call variants.
Fast and accurate protein structure search using a learned 3Di structural alphabet (VQ-VAE) that discretizes tertiary interactions into structural tokens, enabling protein-universe-scale structural alignment at sequence-search speeds (4-5 orders of magnitude faster than DALI/TM-align) and underpinning many AI4S tools such as SaProt, ESMAtlas search, and AFDB clustering pipelines (Steinegger Lab, Nature Biotechnology 2023)
Genome mapping and spliced alignment of cDNA or amino acid sequences
BWA-MEM drop-in replacement: 2-3x faster, 2-5x cheaper, 100% identical output on standard CPUs.
lumpy: a general probabilistic framework for structural variant discovery.
SIMD C library for global, semi-global, and local pairwise sequence alignments
A fuzzy Bruijn graph approach to long noisy reads assembly
VerityMap is a tool for mapping long reads to assemblies of extra-long tandem repeats, producing SAM files and identifying potential heterozygous sites and assembly errors through analysis of rare k-mers. It supports PacBio HiFi and ONT reads and generates interactive HTML plots for variant analysis.
FASTQ/A short-reads pre-processing tools: Demultiplexing, trimming, clipping, quality filtering, and masking utilities.
Minigraph is a sequence-to-graph mapper and graph constructor. For graph generation, it aligns a query sequence against a sequence graph and incrementally augments an existing graph with long query subsequences diverged from the graph.
Comprehensive set of programs for phylogenetic analyses; available for PC and Mac; source code available for easy compiling in UNIX.